
Added example Podcast_and_Audio_Transcription #665


Open: wants to merge 7 commits into base: main
Conversation

SonjeVilas

Adds automated audio transcription using Gemini 2.0 with:

✅ Speaker identification (labeled or as Speaker A/B)
✅ Precision timestamps ([HH:MM:SS])
✅ Music/sound effect detection (e.g., [Jingle] or [Song Name])
✅ Clean text output with [END] marker

  • Testing: Verified with podcasts & call recordings.
  • Deps: jinja2, Gemini API client.

Useful for podcasts, interviews, and call analysis.
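As an illustration of how the output described above might be consumed downstream, here is a minimal sketch in plain Python. It assumes each transcript line looks like `[HH:MM:SS] Speaker A: text`, that sound events appear as bracketed tags such as `[Jingle]`, and that the transcript ends with an `[END]` line. The exact line layout, the `Segment` type, and `parse_transcript` are hypothetical names chosen for this example, not taken from the notebook in this PR.

```python
import re
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    """One parsed transcript line (hypothetical structure for this sketch)."""
    seconds: int            # timestamp converted to seconds from the start
    speaker: Optional[str]  # None for sound events such as [Jingle]
    text: str


# Assumed layout: "[HH:MM:SS] Speaker: text" or "[HH:MM:SS] [Sound event]".
LINE_RE = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(?:([^:\[\]]+):\s*)?(.*)$"
)


def parse_transcript(raw: str) -> List[Segment]:
    """Parse transcript text into segments, stopping at the [END] marker."""
    segments: List[Segment] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "[END]":
            break  # everything after the marker is ignored
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        hours, minutes, secs, speaker, text = match.groups()
        total = int(hours) * 3600 + int(minutes) * 60 + int(secs)
        segments.append(
            Segment(total, speaker.strip() if speaker else None, text.strip())
        )
    return segments


if __name__ == "__main__":
    sample = (
        "[00:00:00] [Jingle]\n"
        "[00:00:07] Speaker A: Welcome to the show.\n"
        "[00:01:02] Speaker B: Thanks for having me.\n"
        "[END]\n"
    )
    for segment in parse_transcript(sample):
        print(segment)
```

One known limitation of this sketch: a line with no speaker label whose text contains a colon (for example `[00:00:12] Note: ...`) would be read as having a speaker named `Note`, so a real pipeline would likely need a stricter speaker pattern.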


Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@github-actions github-actions bot added the status:awaiting review PR awaiting review from a maintainer label Apr 4, 2025
@Giom-V
Collaborator

Giom-V commented Apr 4, 2025

Thanks @SonjeVilas, that's an interesting example. I won't have time to review it today but I'll try to do it next week.

Collaborator

@Giom-V Giom-V left a comment


Hello @SonjeVilas,

That's a nice example. On top of what @andycandy already reported, and the minor stuff I pointed out, can you:

Thanks again!

@Giom-V Giom-V self-assigned this Apr 7, 2025
@SonjeVilas
Author

@Giom-V Thanks for the review :)

@SonjeVilas
Author

@nikitamaia Thanks for the review :)

@SonjeVilas SonjeVilas requested a review from Giom-V May 2, 2025 16:01
@@ -0,0 +1,531 @@
{
Collaborator

@Giom-V Giom-V May 5, 2025


Line #4.    file_path = "https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3"

I think we need to find a better example with 2 speakers to showcase the diarization. What about something like https://archive.org/details/Apollo11Audio (not the whole recording but a specific part). They also have some open-sourced podcasts I think.

In any case, whatever the source of your audio file, don't forget to cite where it comes from.



@Giom-V
Collaborator

Giom-V commented Jun 4, 2025

Hello @SonjeVilas, do you still want to push that example?

@Giom-V Giom-V added the component:examples Issues/PR referencing examples folder label Jun 4, 2025
@SonjeVilas
Author

Thanks for the reminder! I will complete this PR this weekend.

4 participants