KavinMK05/coursera-transcript-generator

📚 Coursera Transcript Generator

A beautiful CLI tool to bulk-download transcripts and subtitles from any Coursera course you're enrolled in.

✨ Features

  • Interactive prompts: guided step-by-step experience, no need to memorize flags
  • Bulk download: grabs every lecture transcript in a course at once
  • Organized output: files are sorted into module folders
  • Progress tracking: real-time progress bar with download status
  • Retry logic: automatic retries with exponential backoff on failures
  • Multiple formats: supports both .txt (plain text) and .srt (subtitle) formats
  • Multi-language: download transcripts in any available language
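
As an illustration of the retry behavior listed above, here is a minimal sketch of retrying with exponential backoff. It is not the tool's actual implementation: the function name `retry_with_backoff`, the default of 4 attempts, the 1-second base delay, and the 10% jitter are all assumptions made for the example.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure, wait and retry with a doubling delay.

    Hypothetical helper: the attempt count, base delay, and jitter are
    illustrative defaults, not values taken from the tool.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * (2 ** attempt)
            # Small random jitter (up to 10%) spreads out simultaneous retries
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the waiting can be replaced in tests; in normal use it defaults to `time.sleep`.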

📦 Installation

# Clone the repo
git clone https://github.com/KavinMK05/coursera-transcript-generator.git
cd coursera-transcript-generator

# Install in editable mode
pip install -e .

🚀 Usage

Interactive Mode (recommended)

Run the command with no arguments and it will guide you through everything:

coursera-transcripts

You'll be prompted for:

  1. CAUTH cookie: your Coursera authentication token
  2. Course slug: the identifier from the course URL
  3. Options: language, format, and output directory

CLI Mode

Pass everything as flags for scripting / automation:

coursera-transcripts \
  --cookie "YOUR_CAUTH_VALUE" \
  --slug "machine-learning" \
  --language en \
  --format txt \
  --output ./transcripts

All Options

Flag        Short   Default      Description
--cookie    -c      (prompted)   CAUTH cookie value
--slug      -s      (prompted)   Course slug from URL
--language  -l      en           Subtitle language code
--format    (none)  txt          Output format (txt or srt)
--output    -o      ./output     Parent output directory
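
The options table can be mirrored in code. The sketch below uses the standard library's `argparse` only to show how the flags and defaults fit together; the tool may be built on a different CLI library, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="coursera-transcripts")
    # A value of None stands in for "(prompted)": the tool asks interactively
    parser.add_argument("-c", "--cookie", default=None, help="CAUTH cookie value")
    parser.add_argument("-s", "--slug", default=None, help="Course slug from URL")
    parser.add_argument("-l", "--language", default="en", help="Subtitle language code")
    # --format has no short flag in the table above
    parser.add_argument("--format", choices=["txt", "srt"], default="txt",
                        help="Output format (txt or srt)")
    parser.add_argument("-o", "--output", default="./output",
                        help="Parent output directory")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```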

🔑 Getting Your CAUTH Cookie

  1. Open coursera.org and log in
  2. Open DevTools (F12 or Ctrl+Shift+I)
  3. Go to Application → Cookies → https://www.coursera.org
  4. Find the cookie named CAUTH
  5. Copy its Value
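
For readers scripting their own requests, the copied value would typically be sent back as an HTTP `Cookie` header. The sketch below shows only that general mechanism; how the tool itself builds its requests is not documented here, and the function name `auth_headers` is hypothetical.

```python
def auth_headers(cauth_value):
    """Return request headers carrying the CAUTH cookie (illustrative only)."""
    value = cauth_value.strip()  # drop whitespace picked up while copying
    if not value:
        raise ValueError("CAUTH cookie value is empty")
    # Standard HTTP cookie header format: name=value
    return {"Cookie": f"CAUTH={value}"}
```

Treat the cookie like a password: anyone who has it can act as your logged-in account.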

Important

You must be enrolled in the course to download its transcripts.


📁 Output Structure

Transcripts are organized by module:

output/
└── machine-learning/
    ├── introduction-to-ml/
    │   ├── Welcome to Machine Learning.txt
    │   ├── What is Machine Learning.txt
    │   └── Supervised Learning.txt
    ├── linear-regression/
    │   ├── Model Representation.txt
    │   └── Cost Function.txt
    └── ...
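
The layout above can be expressed as a small path-building helper. This is a sketch under assumptions: the function name `transcript_path` and the rule for replacing characters that are unsafe in file names are illustrative, not taken from the tool.

```python
import re
from pathlib import Path


def transcript_path(output_dir, course_slug, module_slug, lecture_title, fmt="txt"):
    """Build output/<course>/<module>/<lecture title>.<fmt> (illustrative only)."""
    # Assumption: characters that are invalid in file names on common
    # platforms are replaced with underscores.
    safe_title = re.sub(r'[\\/:*?"<>|]', "_", lecture_title).strip()
    return Path(output_dir) / course_slug / module_slug / f"{safe_title}.{fmt}"


if __name__ == "__main__":
    path = transcript_path("output", "machine-learning",
                           "linear-regression", "Cost Function")
    # Create the module folder before writing the transcript into it
    path.parent.mkdir(parents=True, exist_ok=True)
    print(path)
```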

🔧 Finding the Course Slug

The slug is the part of the URL after /learn/:

https://www.coursera.org/learn/machine-learning
                                └── this is the slug
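
If you are scripting around the tool, the slug can be pulled out of a course URL programmatically. A minimal sketch using only the standard library follows; the function name `course_slug` is hypothetical.

```python
from urllib.parse import urlparse


def course_slug(url):
    """Return the path segment that follows /learn/ in a course URL."""
    # Split the URL path into non-empty segments, ignoring query strings
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "learn" in parts:
        idx = parts.index("learn")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    raise ValueError(f"No /learn/<slug> segment found in: {url}")


if __name__ == "__main__":
    print(course_slug("https://www.coursera.org/learn/machine-learning"))
```

Deeper URLs under the same course, such as a specific week or lecture page, yield the same slug because only the segment directly after `learn` is returned.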

📋 Requirements

  • Python 3.10+
  • A Coursera account with enrollment in the target course

📄 License

MIT
