ElevenTools is a comprehensive toolbox for ElevenLabs, providing a user-friendly interface for text-to-speech generation with advanced features and bulk processing capabilities.
- Phonetics processing moved to Ollama: Phonetics processing now runs through Ollama, a tool for running language models locally, for improved performance and privacy.
- Bug fix: Resolved an issue with the speaker boost parameter to ensure it functions correctly.
- Dynamic voice and model selection from the ElevenLabs library
- Text variable support for personalized audio generation
- Random and fixed seed options for reproducible results
- Customizable voice settings (stability, similarity, style, speaker boost)
- Single and bulk audio generation
- CSV support for batch processing
- Review and playback of generated audio
- Ollama integration for local language model processing, including phonetics
- Ensure you have Python 3.12 installed.
- Install uv (a fast Python package/dependency manager):
  curl -Ls https://astral.sh/uv/install.sh | sh # or use Homebrew: brew install astral-sh/uv/uv
- Sync your environment and install dependencies:
  uv sync
  To include development dependencies:
  uv sync --extra dev
- Create a .streamlit/secrets.toml file in the root directory with your API key:
  ELEVENLABS_API_KEY = "your_elevenlabs_api_key"
- (Optional) Create a .streamlit/config.toml file to customize Streamlit's appearance and behavior.
ElevenTools integrates with Ollama for local language model processing, including phonetics. To use this feature, you need to install Ollama and download the appropriate model:
- Install Ollama:
  - For macOS and Linux:
    curl https://ollama.ai/install.sh | sh
  - For Windows: Download and install from Ollama's official website
- Download the required model: After installing Ollama, open a terminal and run:
  ollama pull llama3.2:3b
  This will download the small and efficient Llama 3.2 3B model, which is currently used by ElevenTools.
- Ensure Ollama is running: Ollama should start automatically after installation. If it's not running, you can start it manually:
  - On macOS/Linux:
    ollama serve
  - On Windows: Run the Ollama application
For more information on Ollama, visit ollama.ai.
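To confirm that Ollama is reachable and that the model has been pulled, a small check like the following may help. It is a sketch that assumes Ollama is listening on its default address (`http://localhost:11434`) and uses the `/api/tags` endpoint, which lists locally available models. The function names are illustrative and are not taken from ollama_functions.py.

```python
import json
import urllib.request
from typing import List, Optional

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def model_names(tags_payload: dict) -> List[str]:
    """Extract model names from the JSON returned by /api/tags."""
    return [model.get("name", "") for model in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str = "llama3.2:3b") -> bool:
    """Return True if `wanted` appears among the locally available models."""
    return wanted in model_names(tags_payload)


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> Optional[dict]:
    """Return the parsed /api/tags response, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers malformed JSON.
        return None


if __name__ == "__main__":
    tags = fetch_tags()
    if tags is None:
        print("Ollama does not appear to be running; try `ollama serve`.")
    elif not has_model(tags):
        print("Model missing; run `ollama pull llama3.2:3b`.")
    else:
        print("Ollama is running and the model is available.")
```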
Run the Streamlit app:
uv run -- streamlit run app.py
Navigate to the provided local URL to access the ElevenTools interface.
- Prepare a CSV file with columns: 'text', 'filename' (optional), and any variables used in the text.
- Use the Bulk Generation page to upload your CSV and generate multiple audio files.
- Choose random seeds for varied output, or a fixed seed for reproducible results.
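The CSV layout described above can be illustrated with a short sketch. It assumes that variables are written in the text as `{column_name}` placeholders and that rows without a filename fall back to `audio_<row number>`. Both conventions are assumptions made for this example and may differ from what the Bulk Generation page actually does.

```python
import csv
import io
import re
from typing import List, Tuple

# Assumed placeholder syntax: {column_name}
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_rows(csv_text: str) -> List[Tuple[str, str]]:
    """Return (filename, rendered_text) pairs, one per CSV row."""
    rendered = []
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):

        def substitute(match: "re.Match[str]") -> str:
            # Use the value from the matching column; leave the placeholder
            # untouched when no such column exists.
            value = row.get(match.group(1))
            return match.group(0) if value is None else value

        text = PLACEHOLDER.sub(substitute, row["text"])
        # Hypothetical fallback name when the optional column is empty or absent.
        filename = (row.get("filename") or "").strip() or f"audio_{index}"
        rendered.append((filename, text))
    return rendered


if __name__ == "__main__":
    sample = (
        "text,filename,name\n"
        '"Hello {name}, welcome!",greeting_alice,Alice\n'
        "Goodbye {name}.,,Bob\n"
    )
    for filename, text in render_rows(sample):
        print(filename, "->", text)
```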
ElevenTools uses pytest for testing. The test suite is organized into separate files for each main component of the application:
- test_functions.py: Tests for utility functions in functions.py
- test_elevenlabs_functions.py: Tests for ElevenLabs API interactions in Elevenlabs_functions.py
- test_ollama_functions.py: Tests for Ollama integration in ollama_functions.py
- test_streamlit_pages.py: Tests for Streamlit pages (Home.py and Bulk_Generation.py)
To run the tests:
- Ensure dev dependencies are installed:
  uv sync --extra dev
- Run all tests:
  uv run -- pytest
- To run tests for a specific file:
  uv run -- pytest test_functions.py
- To run tests with more detailed output:
  uv run -- pytest -v
- To run tests and see print statements:
  uv run -- pytest -s
When contributing new features or making changes, please add or update the relevant tests to ensure code quality and prevent regressions.
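For contributors unfamiliar with pytest, a new test might look roughly like the sketch below. The helper `safe_filename` is hypothetical and defined inline so that the example is self-contained. In the real suite the function under test would be imported from the relevant module (for example functions.py) instead.

```python
import re


def safe_filename(name: str) -> str:
    """Hypothetical helper: replace characters that are unsafe in file names."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", name.strip())
    return cleaned.strip("_") or "untitled"


# pytest collects functions whose names start with `test_` and treats a
# failing `assert` as a test failure.
def test_safe_filename_replaces_unsafe_characters():
    assert safe_filename("hello world!.mp3") == "hello_world_.mp3"


def test_safe_filename_falls_back_when_empty():
    assert safe_filename("   ") == "untitled"
    assert safe_filename("???") == "untitled"
```

Saved under a name such as test_example.py, the file can be run with `uv run -- pytest test_example.py`.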
- Implement automated testing
  - Unit tests for core functions
  - Integration tests for API interactions
  - End-to-end tests for user workflows
- Integrate Ollama
  - Research Ollama API and integration requirements
  - Design integration architecture
  - Implement Ollama integration in the codebase
  - Create tests for Ollama integration
  - Test and improve the enhancing process
- Enhance UI/UX
  - Implement progress bars for audio generation
  - Improve error messaging and user feedback
  - Create a more intuitive layout for voice settings
  - Implement a dark mode option
- Optimize performance
  - Implement caching for frequently used data
  - Optimize bulk generation for large datasets
- Security enhancements
  - Implement proper API key management
  - Add user authentication for multi-user support
- Improve data management
  - Implement a database for storing generation history
  - Create export options for generated audio metadata
  - Develop a pronunciation memory system
- Expand features
  - Add search functionality for voice IDs
- Enhance Voice-to-Voice functionality
  - Add voice cleanup features
- Documentation
  - Create comprehensive API documentation
  - Develop a user guide with examples and best practices
ElevenTools is open-source software released under a custom license.
- Free for individual use and for companies with less than $10 million in annual revenue and fewer than 50 employees.
- Commercial licensing required for larger companies.
- Use for training AI models is prohibited without explicit permission.
Please see the full license for all terms and conditions.
For commercial licensing inquiries, please contact [your contact information].
Contributions are welcome! Please feel free to submit a Pull Request.
If you encounter any problems or have any questions, please open an issue in this repository.
The following voice settings can be adjusted:
- Stability (0.0-1.0): Controls the stability of the voice. Higher values make the voice more consistent.
- Similarity Boost (0.0-1.0): Controls how closely the voice matches the reference audio.
- Style (0.0-1.0): Controls the expressiveness of the voice.
- Speaker Boost: Enhances the clarity and presence of the speaker's voice.
- Speed (0.5-2.0): Controls the speaking speed (only available with the Multilingual v2 model).
  - 0.5: Half speed
  - 1.0: Normal speed (default)
  - 2.0: Double speed
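The ranges listed above can be captured in a small validation sketch. The class `VoiceSettingsSketch` is hypothetical: it is neither the ElevenLabs SDK's settings type nor necessarily how ElevenTools stores these values. Apart from the documented speed default of 1.0, the default values are placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettingsSketch:
    """Illustrative container that enforces the documented ranges."""

    stability: float = 0.5          # placeholder default
    similarity_boost: float = 0.75  # placeholder default
    style: float = 0.0              # placeholder default
    use_speaker_boost: bool = True  # placeholder default
    speed: float = 1.0              # documented default: normal speed

    def __post_init__(self) -> None:
        # Stability, similarity boost, and style all share the 0.0-1.0 range.
        for field_name in ("stability", "similarity_boost", "style"):
            value = getattr(self, field_name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field_name} must be between 0.0 and 1.0, got {value}")
        # Speed uses the 0.5-2.0 range.
        if not 0.5 <= self.speed <= 2.0:
            raise ValueError(f"speed must be between 0.5 and 2.0, got {self.speed}")


if __name__ == "__main__":
    print(VoiceSettingsSketch(stability=0.8, speed=1.5))
```

Validating at construction time means an out-of-range slider or CSV value fails early, before any API request is made.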
The following models are available:
- Monolingual v1: English-only model optimized for speed
- Multilingual v2: Advanced model supporting multiple languages and speed control