A Streamlit-based application for recording, managing, transcribing, and converting meeting audio into structured AI-powered meeting notes using OpenAI's Whisper and GPT-5 APIs.
- ποΈ Audio Recording & Upload: Record from microphone (multi-channel) or upload audio files (WAV, MP3, M4A, FLAC, OGG, WebM)
- π AI Transcription: Multiple OpenAI models (GPT-4o Mini, GPT-4o, Whisper-1) with automatic language detection
- π€ AI Meeting Notes: Transform transcriptions into structured meeting notes with GPT-5 models
- π¦ Smart Compression: Automatic compression for large files (>25MB) or long audio (>20min)
- β‘ Long Audio Support: Automatic chunking and parallel processing for files over 20 minutes
- π Secure Storage: Store any content in your desktop
- Python 3.12+
- FFmpeg:
brew install ffmpeg(macOS) orsudo apt-get install ffmpeg(Ubuntu) - OpenAI API Key: Get from OpenAI Platform
- uv Package Manager: Install uv
# Clone repository
git clone https://github.com/DevSlem/ai-meeting-notes.git
cd ai-meeting-notes
# Install dependencies and start the app
uv run streamlit run main.pyThe app opens at http://localhost:8501.
Note
Docker is not recommended for this application as it requires direct microphone access which is difficult to configure in containers.
- Setup API Key: Click βοΈ API Key Settings β Enter OpenAI API key β Save
- Record/Upload: Navigate to "Record & Upload" tab
- Manage: Go to "Recordings" tab to view all files
- Transcribe: Click ποΈ button next to any recording
- Generate Notes: Click π button to create AI meeting notes
- View Full Page: Click π View Full Page for distraction-free reading
- Select microphone and sample rate (16kHz recommended)
- Click π΄ Start Recording β Speak β βΉοΈ Stop Recording
- File saved automatically to
recordings/directory
Note
We provide multi-channel recording support. If you want to record both your microphone and system audio (e.g., Zoom calls), use a virtual audio device like BlackHole (macOS) or VB-Audio Virtual Cable (Windows).
- Click ποΈ button in Recordings tab
- Select model (GPT-4o Mini recommended for most cases)
- Configure advanced options if needed:
- Compression: Auto-enabled for large/long files
- Chunk Overlap: 30s default (adjustable 15-120s)
- Language: Auto-detect or specify (en, ko, ja, etc.)
- Click Start Transcription
- View results in scrollable text area
- Transcribe audio first (required)
- Click π button next to the recording
- Select model:
- GPT-5: Best quality, complex meetings ($1.25/$10 per 1M tokens)
- GPT-5 Mini: Balanced (recommended) ($0.25/$2 per 1M tokens)
- GPT-5 Nano: Fast & affordable ($0.05/$0.40 per 1M tokens)
- Choose output language (auto-detect recommended)
- Configure advanced options:
- Prompt Template: Select or create custom prompts
- Reasoning Effort: minimal/low/medium/high
- Max Output Tokens: 1000-8000
- Review cost estimate
- Click π Generate Meeting Notes
- In Details: Toggle between π AI Meeting Notes and π Transcription
- Full Page: Click π View Full Page for better reading experience
- Shows metadata (model, generation time, tokens used)
- Distraction-free markdown rendering
- π Regenerate option
- β Back to return to recordings
- Click π Prompt Settings in sidebar
- Create New Prompt:
- Click β Create New Prompt
- Enter name (e.g., "technical-meeting")
- Write prompt content (use
{LANGUAGE_INSTRUCTION}placeholder) - Click πΎ Create Prompt
- Edit Prompt:
- Select prompt from dropdown
- Click βοΈ Edit This Prompt
- Modify content
- Click πΎ Save Changes
- Delete Prompt:
- Select prompt (except default)
- Click ποΈ Delete This Prompt
- Click βοΈ button next to recording
- Enter meaningful name (e.g., "Weekly Team Meeting")
- Original filename preserved, display name stored in
<filename>.json
| Method | Ratio | Speed | Use Case |
|---|---|---|---|
| Recommended | 75-85% | Medium | Meetings with silence removal |
| Fast (MP3) | 60-70% | Fast | Quick compression |
| Balanced (Opus) | 65-75% | Medium | Efficient for speech |
| Custom | Varies | Varies | Specify your own FFmpeg options |
Compression auto-enabled when file >25MB or duration >20min.
Warning
The ratio is arbitrary and depends on the audio content.
| Model | Price/hour | Best For |
|---|---|---|
gpt-4o-mini-transcribe |
$0.18 | Most meetings (recommended) |
gpt-4o-transcribe |
$0.36 | Complex audio, heavy accents |
whisper-1 |
$0.36 | When timestamps needed |
| Model | Price (Input/Output per 1M tokens) | Best For | Speed |
|---|---|---|---|
gpt-5 |
$1.25 / $10 | Complex meetings, high quality | Slower |
gpt-5-mini |
$0.25 / $2 | Most meetings (recommended) | Balanced |
gpt-5-nano |
$0.05 / $0.40 | Simple summaries, quick notes | Fastest |
Reasoning Effort Levels:
- minimal: Fastest, least reasoning tokens
- low: Quick processing (default)
- medium: Balanced quality/speed
- high: Best quality, more expensive
ai-meeting-notes/
βββ recordings/
β βββ recording_20251013_145550.wav # Audio file
β βββ recording_20251013_145550.json # Metadata (name, transcription, meeting notes)
β βββ recording_20251013_145550.txt # Legacy transcription (migrated to JSON)
βββ prompts/
β βββ meeting-notes/
β βββ default.txt # Default prompt template
β βββ technical-meeting.txt # Custom prompt example
β βββ standup.txt # Custom prompt example
βββ .config/
β βββ api_key.txt # OpenAI API key (auto-created)
βββ src/
β βββ audio.py # Audio recording
β βββ transcription.py # Whisper transcription
β βββ meeting_notes.py # GPT-5 meeting notes generation
β βββ file_manager.py # File & metadata management
β βββ audio_processor.py # Compression & chunking
β βββ config.py # API key management
β βββ streamlit_ui.py # UI implementation
βββ main.py # Application entry point
Each audio file has a companion .json file storing:
{
"display_name": "Weekly Team Meeting",
"transcription": "Meeting transcription text...",
"transcribed_at": "2025-10-15T14:30:00",
"meeting_notes": "# Meeting Summary\n...",
"meeting_notes_model": "gpt-5-mini",
"meeting_notes_generated_at": "2025-10-15T14:35:00",
"meeting_notes_usage": {
"prompt_tokens": 1500,
"completion_tokens": 800,
"total_tokens": 2300,
"reasoning_tokens": 50
}
}FFmpeg not found: Install with brew install ffmpeg (macOS) or sudo apt-get install ffmpeg (Ubuntu)
API key issues: Click βοΈ API Key Settings and verify your key at OpenAI Platform
Dialog not closing: Use Close/Done buttons instead of X button
Long files: App automatically chunks files >23min with intelligent merging
Empty meeting notes: Check transcription exists and try regenerating with different model
Streamlit media errors in logs: Harmless internal caching issue, does not affect functionality
- Let app auto-decide compression (enabled only when beneficial)
- Rename recordings immediately for easy identification
- Monitor API usage at OpenAI Usage Dashboard
- Use GPT-4o Mini for most meetings
- Specify language code for better accuracy if auto-detect fails
- For long meetings, increase chunk overlap if transcription seems disjointed
- Start with GPT-5 Mini (best balance of quality/cost)
- Use GPT-5 Nano for quick summaries or budget constraints
- Use GPT-5 only for complex technical meetings
- Low reasoning effort is sufficient for most meetings
- Create custom prompts for recurring meeting types (standups, sales calls, etc.)
- Use auto-detect language unless you need specific output language
- Full Page View for comfortable reading and reviewing
- Regenerate if first result isn't satisfactory (try different model or reasoning effort)
- Include
{LANGUAGE_INSTRUCTION}placeholder for language flexibility - Test prompts with different meeting types before committing
- Name prompts descriptively (e.g., "sales-call", "technical-review")
- Keep default prompt as fallback reference
The application follows a modular architecture:
- Frontend (Streamlit): Multi-page interface with dialogs
- Audio Layer: Recording and compression
- Transcription Layer: OpenAI Whisper integration
- AI Notes Layer: GPT-5 meeting notes generation
- Storage Layer: Local file system with JSON metadata
Built with Streamlit, OpenAI Whisper API, OpenAI GPT-5 API, and FFmpeg
License: MIT
