Open-source language models often underperform on African languages while demanding heavy computational resources, two barriers to real-world use in the African context. To make language AI truly inclusive, we need models that are smaller, smarter, and optimized for resource-constrained environments.
The Lelapa AI Buzuzu-Mavi Challenge tasked participants with compressing Lelapa AI's InkubaLM, an open-source small language model (SLM), while maintaining or improving performance for two key African languages: Swahili and Hausa.
This repository presents our Bronze Medal-winning solution. 🥉
The challenge goals were to:
✅ Compress InkubaLM to reduce size and inference cost
✅ Retain or improve model accuracy on core NLP tasks
✅ Ensure usability on low-resource devices and CPUs
✅ Focus on Swahili and Hausa performance
The model was evaluated across three NLP tasks:
- 🗣️ Sentiment Analysis
- 🧠 Natural Language Inference (AfriXNLI – true/false reasoning)
- 🌍 Machine Translation (English → Swahili & Hausa)
Performance could be improved by increasing task accuracy, reducing model size, or both.
Our compression strategy combined three techniques:
🔧 Quantization – Reduced precision (8-bit & 4-bit) for faster, leaner models
✂️ Pruning – Removed redundant parameters
🌐 Language-Specific Fine-tuning – Custom fine-tuning on Swahili and Hausa datasets
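To make the quantization idea concrete, here is a minimal NumPy sketch of symmetric per-tensor 8-bit weight quantization. This is illustrative only: the actual submission used library-level 8-bit and 4-bit quantization, and the function names and toy weight matrix below are our own.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: map floats to int8 with one scale."""
    scale = np.abs(w).max() / 127.0  # largest weight maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for computation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256)).astype(np.float32)  # toy weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(q.nbytes / w.nbytes)                     # 0.25 -> 4x smaller storage
print(float(np.abs(w - w_hat).max()) <= scale)  # True: error bounded by one step
```

Storage drops 4x (int8 vs float32) at the cost of a small, bounded rounding error per weight; 4-bit schemes push the same trade-off further.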
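The pruning step can likewise be sketched as unstructured magnitude pruning: drop the smallest-magnitude weights, keeping the rest. Again this is a hedged illustration of the general technique, not the repo's exact procedure; the 30% sparsity target is an arbitrary example.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(w.size * sparsity)                    # how many weights to remove
    threshold = np.sort(np.abs(w), axis=None)[k]  # magnitude cut-off
    mask = np.abs(w) >= threshold                 # keep only weights above it
    return w * mask

rng = np.random.default_rng(1)
w = rng.normal(0, 0.02, size=(128, 128)).astype(np.float32)  # toy weight matrix
w_pruned = magnitude_prune(w, sparsity=0.3)

print(float((w_pruned == 0).mean()))  # ~0.3 of the weights are now zero
```

Zeroed weights can be stored and served sparsely, cutting memory and compute; in practice pruning is usually followed by a short recovery fine-tune to restore accuracy.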
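The README does not say whether the Swahili/Hausa fine-tuning updated all weights or used parameter-efficient adapters; as one common low-cost approach, here is a NumPy sketch of the low-rank adapter (LoRA-style) idea, where a frozen weight W is adapted by a small trained delta B @ A. All dimensions and names below are illustrative assumptions.

```python
import numpy as np

# Low-rank adapter: instead of retraining the full matrix W per language,
# train only a small delta B @ A and serve W + B @ A at inference time.
d_out, d_in, rank = 512, 512, 8  # rank is illustrative, not from the repo

rng = np.random.default_rng(2)
W = rng.normal(0, 0.02, size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(0, 0.01, size=(rank, d_in))   # trainable down-projection
B = np.zeros((d_out, rank))                  # trainable up-projection, starts at 0

W_eff = W + B @ A  # effective weight; equals W before any training since B is 0

full_params = W.size
adapter_params = A.size + B.size
print(adapter_params / full_params)  # 0.03125: ~3% of the parameters to train
```

Because only A and B are trained, a separate lightweight adapter can be kept per language (Swahili, Hausa) on top of one shared compressed base model.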
This work moves us closer to a future where African languages have equal representation in the AI ecosystem. Smaller, smarter models enable:
- ✅ Faster NLP on standard CPUs
- ✅ Offline language tools
- ✅ Scalable deployment in education, agriculture, health, and customer service