Tasks such as transcribing interviews, deciphering handwriting, and summarising documents devour research time. AI can handle them well, but technical barriers often get in the way. Many social scientists do not see themselves as "power users" and are put off by the prospect of managing Python environments and dependencies.

These Google Colab notebooks remove that barrier. If you can click "Play", you can run them. There is nothing to install or set up: everything runs in your browser. The code stays fully visible for anyone who wants to inspect, adapt, or extend it.

Three Core Tools

  • Audio & Video Transcription: Convert interviews, focus groups, and lectures into text with speaker identification. Supports MP3, WAV, MP4, and more.
  • OCR/HTR: Extract text from printed and handwritten documents. Specialised modes for French and Arabic handwriting, plus multilingual support.
  • Text Summarisation: Generate summaries and extract keywords from large documents. Batch-process hundreds of files via Excel integration.

Beyond Western Languages

Most AI tools perform well on English and major European languages. These pipelines tackle more demanding material:

  • Handwritten text: Arabic manuscripts
  • Printed documents: Hindi, Old Tatar
  • Audio transcription: Hausa, Arabic, Swahili, Kurmanji

The notebooks handle long archival files page by page and automatically chunk lengthy audio recordings.
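
For readers curious what automatic chunking involves, here is a minimal sketch of the general idea, not the notebooks' actual code. It computes start and end times (in seconds) for fixed-length chunks with a small overlap, so that words falling on a boundary are not lost. The function name chunk_spans, the 10-minute chunk length, and the 5-second overlap are all illustrative assumptions.

```python
from typing import List, Tuple


def chunk_spans(
    duration_s: float,
    chunk_s: float = 600.0,   # assumed chunk length: 10 minutes
    overlap_s: float = 5.0,   # assumed overlap between neighbouring chunks
) -> List[Tuple[float, float]]:
    """Return (start, end) spans in seconds that cover the whole recording."""
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("chunk_s must be positive and 0 <= overlap_s < chunk_s")

    spans: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        if end >= duration_s:
            break  # the last chunk reaches the end of the recording
        # Step back by the overlap so boundary words appear in both chunks.
        start = end - overlap_s
    return spans


# Example: a 25-minute interview is split into three overlapping spans.
for start, end in chunk_spans(25 * 60):
    print(f"{start:.0f}s to {end:.0f}s")
```

Each span could then be cut from the audio file and transcribed separately, with the overlapping seconds used to stitch the transcripts back together. Page-by-page handling of long archival files follows the same pattern, with page numbers in place of timestamps.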

Get Started

All tools are available on GitHub. You need a Gemini API key and can optionally connect Google Drive to save your results.
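
If you want to see what this setup step can look like in code, the sketch below shows one common Colab pattern. It is an illustration under stated assumptions, not necessarily how these notebooks do it: the secret name GEMINI_API_KEY and the helper names load_gemini_key and mount_drive are hypothetical. The google.colab modules userdata and drive exist only inside Colab, so the helpers fall back gracefully elsewhere.

```python
import os
from typing import Optional


def load_gemini_key(secret_name: str = "GEMINI_API_KEY") -> Optional[str]:
    """Read the key from Colab's Secrets panel, else from the environment.

    The secret name is an assumption; use whatever name you stored it under.
    """
    try:
        from google.colab import userdata  # importable only inside Colab
        key = userdata.get(secret_name)
        if key:
            return key
    except Exception:
        # Not running in Colab, or the secret is missing or not shared
        # with this notebook: fall through to the environment variable.
        pass
    return os.environ.get(secret_name)


def mount_drive(mount_point: str = "/content/drive") -> bool:
    """Optionally connect Google Drive; returns False outside Colab."""
    try:
        from google.colab import drive
    except ImportError:
        return False
    drive.mount(mount_point)  # Colab asks for permission in the browser
    return True


api_key = load_gemini_key()
if api_key is None:
    print("No Gemini API key found; add one before running the tools.")
```

Storing the key as a secret rather than pasting it into a cell keeps it out of any copy of the notebook you share with colleagues.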