Feat: Add Audio Input Support #34

Open
Aditya062003 opened this issue Mar 28, 2024 · 12 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@Aditya062003
Contributor

Issue Title:

Enhancement: Add Audio Input Support for Generating Q&A

Issue Description:

Problem:

Currently, the software supports only text and PDF input for generating Q&A. However, there is a growing need to include audio input capabilities, allowing users to provide recorded lectures as input data.

Proposed Solution:

Integrate audio input functionality into the software to enable users to upload recorded lectures. The system should then process these audio files, extract relevant information, and generate Q&A accordingly.

Expected Behavior:

  1. Users should be able to upload audio files in commonly used formats (e.g., MP3, WAV).
  2. The software should process audio input, transcribe the content, and extract key points.
  3. Generated Q&A should reflect the information presented in the audio lecture accurately.
  4. Audio processing should be compatible and efficient across different platforms and environments.
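
The expected behavior above could be sketched roughly as follows. This is only an illustrative outline, not the project's actual code: `is_supported_audio`, `generate_qa_from_audio`, and the `transcribe` / `generate_qa` callables are hypothetical names, and the allowed extensions are just the two formats mentioned in this issue.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# Assumption: only the formats named in the issue; extend as needed.
ALLOWED_AUDIO_EXTENSIONS = {".mp3", ".wav"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file name ends with an allowed audio extension."""
    return Path(filename).suffix.lower() in ALLOWED_AUDIO_EXTENSIONS


def generate_qa_from_audio(
    audio_path: str,
    transcribe: Callable[[str], str],
    generate_qa: Callable[[str], List[Tuple[str, str]]],
) -> List[Tuple[str, str]]:
    """Validate the upload, transcribe it, then reuse the text Q&A step.

    Both callables are placeholders (hypothetical): `transcribe` stands in
    for any speech-to-text backend, `generate_qa` for the existing
    text-based Q&A generation logic.
    """
    if not is_supported_audio(audio_path):
        raise ValueError(f"Unsupported audio format: {audio_path}")
    transcript = transcribe(audio_path).strip()
    if not transcript:
        raise ValueError("Transcription produced no text")
    return generate_qa(transcript)
```

Keeping the transcription backend behind a plain callable means the text and PDF paths can share the same Q&A step, and the speech-to-text engine can be swapped per platform.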

Labels:

  • enhancement
  • feature-request
@charanbhatia

@Aditya062003 please assign this issue to me. It's been a while now, and I can create a PR soon.

@samruddhi-Rahegaonkar

@Aditya062003 Please assign this issue to me; I am currently working on it.

@Roaster05 added the good first issue and enhancement labels on Dec 17, 2024
@Aditya062003
Contributor Author

Hey, I have already worked on a PR regarding this: #35. However, it was for the old UI, and I think we need to integrate this into the newer version.

@samruddhi-Rahegaonkar

Okay, let's work on it.

@samruddhi-Rahegaonkar

Is the code from #35 merged into the repo?

@red-panda3

Hi @Aditya062003,
Please assign this issue to me; I am currently working on it.

@IronJam11

Please assign this issue to me; I am working on it :)

@IronJam11

I have made a PR for this. Please check it if needed. Thank you.

@waqar2403

This seems interesting. I will be happy to work on it.

@sakina1303

Please assign this issue to me; I would love to work on it.

@KeshavSingh2703

Hi, I’m interested in contributing to this issue! I have experience with machine learning and natural language processing, and I’ve worked on integrating AI models with different input modalities. I believe I can help implement the audio input functionality by leveraging tools like OpenAI Whisper or Google Speech-to-Text for transcription, combined with existing Q&A generation logic.
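
For illustration, the Whisper route mentioned here might look something like the sketch below. It assumes the open-source `openai-whisper` package (which also needs `ffmpeg` installed) and uses its `load_model` and `transcribe` calls; the wrapper name `transcribe_with_whisper` and the default `"base"` model size are hypothetical choices, not anything decided in this thread.

```python
from pathlib import Path


def transcribe_with_whisper(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file to plain text with openai-whisper.

    Hypothetical wrapper: the returned text would then be passed to the
    existing text-based Q&A generation logic.
    """
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {audio_path}")

    # Imported lazily so this module still loads when openai-whisper
    # is not installed (pip install openai-whisper; requires ffmpeg).
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(str(path))
    return result["text"].strip()
```

Loading the model on every call is slow; a real integration would probably load it once at startup and reuse it across requests.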

@AvikRay1001

@Aditya062003 Is this PR still open, or has the issue been solved?

10 participants