@EzraEllette
Contributor

…ription workflow

  • Integrate whisper-rs library for improved audio transcription
  • Add WhisperLargeV3TurboQuantized transcription engine
  • Modify STT processing to use whisper-rs context and state
  • Update Cargo.toml to include whisper-rs with GPU support
  • Refactor transcription methods to work with new whisper-rs workflow
  • Add download function for quantized Whisper model
  • Update CLI and core audio transcription engine to support new quantized model
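
As a rough illustration of the last bullet, the sketch below shows one way a CLI value could map onto an engine variant. Only the variant name `WhisperLargeV3TurboQuantized` and the value `whisper-large-v3-turbo-quantized` appear in this PR; the enum name `AudioTranscriptionEngine`, the `Deepgram` variant, and its `deepgram` spelling are assumptions made for the example and may not match the real code.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical engine list. Only `WhisperLargeV3TurboQuantized` is named in
/// this PR; the enum name and the `Deepgram` variant are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AudioTranscriptionEngine {
    Deepgram,
    WhisperLargeV3TurboQuantized,
}

impl FromStr for AudioTranscriptionEngine {
    type Err = String;

    /// Parse the value passed to `-a` into an engine variant.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            // Assumed spelling for the existing cloud engine.
            "deepgram" => Ok(Self::Deepgram),
            // CLI value quoted in the test steps of this PR.
            "whisper-large-v3-turbo-quantized" => Ok(Self::WhisperLargeV3TurboQuantized),
            other => Err(format!("unknown audio transcription engine: {other}")),
        }
    }
}

impl fmt::Display for AudioTranscriptionEngine {
    /// Print the same string that `from_str` accepts, so values round-trip.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Self::Deepgram => "deepgram",
            Self::WhisperLargeV3TurboQuantized => "whisper-large-v3-turbo-quantized",
        };
        f.write_str(name)
    }
}

fn main() {
    let engine: AudioTranscriptionEngine = "whisper-large-v3-turbo-quantized"
        .parse()
        .expect("value should be recognised");
    println!("selected engine: {engine}");
}
```

Keeping `FromStr` and `Display` symmetric means the value a user types is the same one that shows up in logs, which makes mismatches easier to spot.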

description

brief description of the changes in this pr.

related issue: #587

how to test

add a few steps to test the pr in the most time efficient way.

  1. run accuracy example
  2. run screenpipe with -a whisper-large-v3-turbo-quantized

if relevant add screenshots or screen captures to prove that this PR works to save us time (check Cap).

if you are not the author of this PR and you see it and you think it can take more than 30 mins for maintainers to review, we will tip you between $20 and $200 for you to review and test it for us.

@vercel

vercel bot commented Feb 28, 2025

@EzraEllette is attempting to deploy a commit to the louis030195's projects Team on Vercel.

A member of the Team first needs to authorize it.

whisper_model.pcm_to_mel(audio, 2)?;
let (_, lang_tokens) = whisper_model.lang_detect(0, 4)?;
let lang_token = get_lang_token(lang_tokens, languages)?;
params.set_language(get_lang_str(lang_token));
Collaborator

oh
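
The hunk above calls `get_lang_token(lang_tokens, languages)`, whose body is not shown in this conversation. The sketch below is only a guess at what such a helper might do, assuming `lang_tokens` is a list of per-language probabilities indexed by language id and `languages` is the set of ids the user allows. The function name `pick_language` and its signature are hypothetical and use no whisper-rs types.

```rust
/// Hypothetical stand-in for a helper like `get_lang_token`.
/// `probs[i]` is the detected probability for language id `i`.
/// `allowed` lists the permitted ids; an empty slice means "any language".
/// Returns `None` when there is no usable candidate.
fn pick_language(probs: &[f32], allowed: &[usize]) -> Option<usize> {
    let candidates: Vec<usize> = if allowed.is_empty() {
        // No restriction: every detected language is a candidate.
        (0..probs.len()).collect()
    } else {
        // Ignore ids that fall outside the probability table.
        allowed
            .iter()
            .copied()
            .filter(|&id| id < probs.len())
            .collect()
    };

    // Choose the candidate with the highest probability.
    candidates
        .into_iter()
        .max_by(|&a, &b| probs[a].total_cmp(&probs[b]))
}

fn main() {
    let probs = [0.05_f32, 0.60, 0.30, 0.05];

    // Language 1 is most likely overall, but only 0 and 2 are allowed.
    println!("{:?}", pick_language(&probs, &[0, 2]));

    // With no restriction the overall best candidate is chosen.
    println!("{:?}", pick_language(&probs, &[]));
}
```

Restricting the choice to an allowed set before taking the maximum keeps a confident but unwanted detection from overriding the user's language settings.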

@louis030195
Collaborator

cool

@AntonIXO

AntonIXO commented Feb 28, 2025

whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
Waiting for merge!

@EzraEllette
Contributor Author

> whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
> Waiting for merge!

Yes, will you collaborate with me to test on Linux?

@AntonIXO

AntonIXO commented Feb 28, 2025

> > whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
> > Waiting for merge!
>
> Yes, will you collaborate with me to test on Linux?

Of course!
Which Linux distribution are you running? Do you use X.org or Wayland, and do all screenpipe features work fine for you?
I am on Arch with KDE on Wayland, and I could only get screen recording working (no app recognition), using this patch: #1496

@AntonIXO

AntonIXO commented Mar 2, 2025

I've built with the Vulkan flag, and HW acceleration also works.
But with --enable-realtime-audio-transcription enabled, it falls back to Deepgram, even when -a whisper-large-v3-turbo-quantized is set.
