Update wav2vec2-bert model card #38957

Open. Wants to merge 1 commit into base: main.

AshAnand34
Contributor

What does this PR do?

This pull request improves the Wav2Vec2-BERT model card: it restructures the page, adds usage examples, and documents the model's features and configuration options in more detail. The goal is a card that is easier to use and more complete for developers working with this model.

Documentation Enhancements:

  • Introduction and Overview:

    • Added a detailed introduction to Wav2Vec2-BERT, highlighting its multilingual capabilities, pre-training dataset, and downstream use cases like ASR and audio classification.
  • Usage Examples:

    • Included Python code snippets demonstrating how to use the model with the Pipeline and AutoModel classes for tasks like speech recognition and audio classification.
    • Added an example of using 8-bit quantization to reduce memory usage with the bitsandbytes library.
  • Model Architecture and Features:

    • Documented key architectural details, including the use of causal depthwise convolutional layers, mel-spectrogram inputs, and Conformer-based adapter networks.
    • Explained supported position embedding types and their configuration options.
  • API Reference Updates:

    • Reorganized the API reference section for readability, converting headings to a consistent format (e.g., ### Wav2Vec2BertConfig) and keeping the model components in a logical order.
  • Additional Notes:

    • Added sections on training data, fine-tuning requirements, and links to official performance results for further exploration.

Fixes #36979

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@stevhliu

Successfully merging this pull request may close these issues.

[Community contributions] Model cards