A web-based text generation application powered by the LLaMA-Mesh model, built with Flask and Transformers.
This project provides a clean, user-friendly web interface for generating text using the LLaMA-Mesh language model. Users can input prompts and control the length of generated text through an intuitive web form.
## Features

- Web Interface: Clean, responsive design using Tailwind CSS
- Text Generation: Powered by the LLaMA-Mesh model via Transformers
- Customizable Output: Adjustable text length (10-1000 characters)
- Real-time Processing: Asynchronous text generation with loading states
- Error Handling: Graceful error display and recovery
## Tech Stack

- Backend: Python Flask
- ML Framework: PyTorch + Transformers
- Model: LLaMA-Mesh (Zhengyi/LLaMA-Mesh)
- Frontend: HTML, JavaScript, Tailwind CSS
- Package Management: pip + virtual environment
## System Requirements

- RAM: 32GB+ recommended (the model is ~15GB)
- Storage: 20GB+ free space
- GPU: CUDA-compatible GPU recommended (optional)
- Python: 3.8 or higher
## Dependencies

- Flask >= 3.1.0
- torch >= 2.2.0
- transformers >= 4.30.0
- Additional dependencies are listed in `requirements.txt`
## Installation

1. Clone the repository:

   ```bash
   git clone <your-repository-url>
   cd capstone
   ```

2. Create a virtual environment:

   ```bash
   python -m venv .venv
   ```

3. Activate it.

   Windows:

   ```bash
   .venv\Scripts\activate
   ```

   macOS/Linux:

   ```bash
   source .venv/bin/activate
   ```

4. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

5. Run the application:

   ```bash
   python app.py
   ```

The application will start on http://localhost:5000.
## Usage

1. Open your browser and navigate to http://localhost:5000
2. Enter your text prompt in the input field
3. Adjust the maximum length slider (10-1000 characters)
4. Click "Generate" to create text
5. View the generated output below
You can also use the API directly:
```bash
curl -X POST http://localhost:5000/generate \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "input_text=Your prompt here&max_length=100"
```
## Project Structure

```
capstone/
├── app.py              # Main Flask application
├── index.html          # Web interface template
├── requirements.txt    # Python dependencies
├── .gitignore          # Git ignore rules
├── .venv/              # Virtual environment (ignored)
└── README.md           # This file
```
## Configuration

You can modify these generation parameters in `app.py`:

- `temperature`: Controls randomness (0.1-2.0)
- `top_p`: Nucleus sampling parameter (0.1-1.0)
- `max_length`: Maximum output length
- `num_return_sequences`: Number of sequences to generate
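As a sketch of how these parameters plug into the Transformers `generate()` call (the surrounding loading code is an assumption, not a copy of `app.py`):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Zhengyi/LLaMA-Mesh")
model = AutoModelForCausalLM.from_pretrained("Zhengyi/LLaMA-Mesh")

inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,          # maximum output length
    temperature=0.7,         # controls randomness (0.1-2.0)
    top_p=0.9,               # nucleus sampling (0.1-1.0)
    num_return_sequences=1,  # number of sequences to generate
    do_sample=True,          # sampling must be on for temperature/top_p to apply
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```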
The Flask server settings are:

- Debug Mode: Enabled by default (`debug=True`)
- Host: localhost (127.0.0.1)
- Port: 5000
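These map onto the standard `app.run()` arguments; a minimal sketch matching the defaults above:

```python
from flask import Flask

app = Flask(__name__)

if __name__ == "__main__":
    # Debug mode on, bound to localhost:5000, matching the settings above.
    # Set debug=False for production use.
    app.run(host="127.0.0.1", port=5000, debug=True)
```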
## Troubleshooting

Problem: Model too large for available RAM

Solutions:
- Use a machine with more RAM (32GB+)
- Use a smaller model variant
- Enable model offloading to disk (see the sketch below)
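One way to enable disk offloading is Hugging Face Accelerate's automatic device map; a hedged sketch (requires `pip install accelerate`; the folder name is arbitrary):

```python
from transformers import AutoModelForCausalLM

# Layers that do not fit in GPU/CPU memory are spilled to disk.
model = AutoModelForCausalLM.from_pretrained(
    "Zhengyi/LLaMA-Mesh",
    device_map="auto",         # let accelerate place layers automatically
    offload_folder="offload",  # hypothetical on-disk spill directory
)
```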
Problem: Flask/Werkzeug version compatibility issues

Solution:

```bash
pip install --upgrade flask werkzeug
```

Problem: Insufficient GPU memory

Solutions:

- Use CPU-only inference with `torch.device('cpu')` (see the sketch after this list)
- Reduce batch size
- Use gradient checkpointing
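A minimal sketch of CPU-only inference, assuming `model` and `tokenizer` are loaded as in `app.py`:

```python
import torch

device = torch.device("cpu")  # avoid GPU memory limits entirely; slower
model = model.to(device)
inputs = tokenizer("Your prompt here", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_length=100)
```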
Problem: Slow download/loading of the large model

Solutions:
- Models are cached after first download
- Use faster internet connection for initial download
- Consider using model quantization (see the sketch below)
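This README only names quantization; one concrete option is 8-bit loading via bitsandbytes, sketched here under the assumption that the `bitsandbytes` and `accelerate` packages are installed:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit weights roughly halve memory use versus fp16, at some quality cost.
model = AutoModelForCausalLM.from_pretrained(
    "Zhengyi/LLaMA-Mesh",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```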
## Performance Tips

- GPU Acceleration: Ensure CUDA is properly installed
- Model Caching: Models are cached in `~/.cache/huggingface/` after the first download
- Memory Management: Use `torch.no_grad()` for inference (see the sketch below)
- Batch Processing: Batch multiple requests together where possible
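A short sketch of the `torch.no_grad()` tip, assuming `model` and `inputs` from the earlier examples:

```python
import torch

# Skipping gradient tracking reduces memory use during generation.
with torch.no_grad():
    outputs = model.generate(**inputs, max_length=100)
```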
## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- LLaMA-Mesh Model by Zhengyi
- Transformers by Hugging Face
- Flask web framework
- Tailwind CSS for styling
## Support

If you encounter any issues or have questions:
- Check the Issues page
- Review the troubleshooting section above
- Create a new issue with detailed information
## Future Enhancements

- Model selection dropdown
- Batch text generation
- Export/save functionality
- User authentication
- API rate limiting
- Docker containerization
- Cloud deployment guides
Note: This application requires significant computational resources due to the large language model. Ensure your system meets the minimum requirements before installation.