KnoLo Core is a local-first knowledge base engine built for small language models (SLMs). It packages your documents into a compact .knolo file and enables fully deterministic querying, with no embeddings, vector databases, or cloud services required. Designed for on-device and edge LLM deployments.
Agentic Android Open Source Project (AAOSP) — Android fork with native LLM system service, MCP-aware apps, and an agent-driven launcher. On-device Qwen 2.5 via llama.cpp. Apps declare tools in their manifest. The OS runs the model.
High-performance Android SDK for on-device LLM inference (GGUF). Privacy-focused, offline-first, and powered by llama.cpp with a clean Kotlin Coroutines API.
Apple FoundationModels API on iOS 18+. Same call site, native passthrough on iOS 26 (Apple Intelligence), CoreML / MLX backends on older OSes. Drop-in source compatible.
iOS app that runs a local LLM on-device to transcribe meetings and generate structured notes — action items, decisions, and summaries. No cloud, no API keys, no data leaves the phone.
Ash — offline survival assistant for iOS. Gemma 4 E2B/E4B fully on-device (text · image · voice) with RAG-grounded answers over 56 emergency-response packs. Built for the Kaggle Gemma 4 Good Hackathon.
On-device text-to-SQL distilled from GPT-4o-mini into Qwen2.5 (0.5B → 3B locally on M1 via mlx-lm LoRA, 7B+ on cloud A100). 847 MB at 62.5% on Spider dev; 3B variant hits 72.6%, 7B variant hits 75.0%.
Production KMP framework for Google LiteRT-LM. Run Gemma on-device with OEM-aware RAM fixes, resilient Ktor chunked downloads, and schema-driven function calling. Plain Android support. AGPL-3.0 / Commercial dual-licensed.
Hands-free voice assistant for Android, fully on-device. Gemma 4 multimodal LLM via LiteRT-LM with optional ElevenLabs cloud TTS, smart-turn-classified barge-in, and a vision channel.
Documentation for MobileTransformers - a lightweight, modular framework based on ONNX Runtime for running and adapting large language models (LLMs) directly on mobile and edge devices. It supports on-device fine-tuning (PEFT), efficient inference, quantization, weight merging, and direct inference from merged models.