feat: Add support for quantized Whisper model and update audio transc… #1508

EzraEllette · 2025-02-28T03:18:30Z

…ription workflow

Integrate whisper-rs library for improved audio transcription
Add WhisperLargeV3TurboQuantized transcription engine
Modify STT processing to use whisper-rs context and state
Update Cargo.toml to include whisper-rs with GPU support
Refactor transcription methods to work with new whisper-rs workflow
Add download function for quantized Whisper model
Update CLI and core audio transcription engine to support new quantized model
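
A minimal sketch of how a CLI value could map onto the new engine variant. Only the variant name `WhisperLargeV3TurboQuantized`, the value `whisper-large-v3-turbo-quantized`, and the existence of a Deepgram engine appear in this PR thread. The enum name, the other variant, the remaining string values, and the `is_local` helper are assumptions for illustration, not the project's actual code.

```rust
use std::str::FromStr;

/// Hypothetical engine selector. Only `WhisperLargeV3TurboQuantized` and its
/// CLI value are taken from this PR; the rest is assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AudioTranscriptionEngine {
    Deepgram,
    WhisperLargeV3Turbo, // assumed pre-existing, non-quantized variant
    WhisperLargeV3TurboQuantized,
}

impl FromStr for AudioTranscriptionEngine {
    type Err = String;

    /// Parse the value passed to `-a` into an engine variant.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "deepgram" => Ok(Self::Deepgram),
            "whisper-large-v3-turbo" => Ok(Self::WhisperLargeV3Turbo),
            "whisper-large-v3-turbo-quantized" => Ok(Self::WhisperLargeV3TurboQuantized),
            other => Err(format!("unknown audio transcription engine: {other}")),
        }
    }
}

impl AudioTranscriptionEngine {
    /// True for engines that run on the local machine (hypothetical helper).
    fn is_local(self) -> bool {
        !matches!(self, Self::Deepgram)
    }
}
```

Returning an error for unknown values, rather than silently defaulting, makes it obvious when a requested engine is not the one actually in use.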

description

brief description of the changes in this pr.

related issue: #587

how to test

add a few steps to test the pr in the most time efficient way.

run accuracy example
run screenpipe with -a whisper-large-v3-turbo-quantized

vercel · 2025-02-28T03:18:33Z

@EzraEllette is attempting to deploy a commit to the louis030195's projects Team on Vercel.

A member of the Team first needs to authorize it.

louis030195 · 2025-02-28T03:24:50Z

screenpipe-audio/src/whisper/process_chunk.rs

+    whisper_model.pcm_to_mel(audio, 2)?;
+    let (_, lang_tokens) = whisper_model.lang_detect(0, 4)?;
+    let lang_token = get_lang_token(lang_tokens, languages)?;
+    params.set_language(get_lang_str(lang_token));
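
The body of `get_lang_token` is not shown in this diff. As a rough sketch of what such a step might do, assume `lang_detect` yields one probability per language id and that the configured languages are given as a list of allowed ids. The function name `pick_language` and its signature below are hypothetical and are not taken from the PR or from whisper-rs.

```rust
/// Pick the most probable language id among the allowed ones.
///
/// Assumptions (not confirmed by the PR): `lang_probs[i]` is the probability
/// of language id `i`, and an empty `allowed` slice means "any language".
fn pick_language(lang_probs: &[f32], allowed: &[usize]) -> Option<usize> {
    lang_probs
        .iter()
        .copied()
        .enumerate()
        // Skip NaN/infinite scores and ids outside the allowed set.
        .filter(|(id, p)| p.is_finite() && (allowed.is_empty() || allowed.contains(id)))
        // Keep the candidate with the highest probability.
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(id, _)| id)
}
```

Returning `Option` lets the caller decide what to do when none of the configured languages is a candidate, for example falling back to automatic detection.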


louis030195 · 2025-02-28T03:25:32Z

cool

AntonIXO · 2025-02-28T17:43:52Z

whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
Waiting for merge!

EzraEllette · 2025-02-28T17:55:37Z

> whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
>
> Waiting for merge!

Yes, will you collaborate with me to test on Linux?

AntonIXO · 2025-02-28T19:15:37Z

>> whisper-rs should support Vulkan hardware acceleration? Would you include Vulkan, hipblas and other features?
>> Waiting for merge!
>
> Yes, will you collaborate with me to test on Linux?

Of course!
What Linux are you running? Do you use X.org or Wayland, and do all screenpipe features work fine for you?
I am on Arch with KDE on Wayland, and I could only use screen recording (no app recognition) with this patch: #1496

AntonIXO · 2025-03-02T12:49:32Z

I've built with the Vulkan flag, and HW acceleration also works.
However, with --enable-realtime-audio-transcription enabled, it falls back to Deepgram even when -a whisper-large-v3-turbo-quantized is specified.

louis030195 closed this Mar 10, 2025
