A premium, high-performance offline Android client for running Large Language Models (LLMs) on-device with multimodal OCR.
Warning
Local LLM/AI executes AI models entirely on your physical mobile device. Running large models is highly resource-intensive and requires a modern processor and sufficient RAM (6 GB+). System stability, inference speeds, and output quality depend entirely on your hardware capability. Model weights (such as Qwen, DeepSeek, or Gemma) are not packaged inside the APK and must be downloaded or transferred manually due to their size (1.5 GB+).
Additionally, this application runs all inference offline. No internet connection is required once the models are downloaded, and no conversational data ever leaves your device.
Local LLM/AI is a modern Android client designed to provide a completely private, offline, and secure conversational AI experience. By integrating Google's optimized MediaPipe Tasks GenAI engine, the app loads and runs lightweight LLMs (like Qwen 2.5, DeepSeek-R1, Phi-2, and Gemma 2B) natively on mobile hardware.
The app targets GPU acceleration (Vulkan) for responsive streaming generation with graceful CPU fallback.
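The GPU-first, CPU-fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `Backend`, `Engine`, and `loadWithFallback` are hypothetical names, and the `load` callback stands in for whatever engine initialization the app performs.

```kotlin
// Hypothetical names throughout: Backend, Engine, and loadWithFallback are
// illustrative and are not taken from the app's source or from MediaPipe.
enum class Backend { GPU, CPU }

// Minimal stand-in for a loaded inference engine.
class Engine(val backend: Backend)

// Try each backend in order of preference and return the first engine that
// loads. `load` is expected to throw if the backend cannot be initialized.
fun loadWithFallback(
    preferred: List<Backend> = listOf(Backend.GPU, Backend.CPU),
    load: (Backend) -> Engine
): Engine {
    var lastError: Exception? = null
    for (backend in preferred) {
        try {
            return load(backend)
        } catch (e: Exception) {
            lastError = e // remember the failure and try the next backend
        }
    }
    throw IllegalStateException("No backend could be initialized", lastError)
}
```

If the GPU loader throws (for example because Vulkan is unavailable), the caller receives a CPU-backed engine. Only when every backend fails is an exception raised, with the last failure attached as its cause.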
The app wraps this powerful local engine in a premium, fluid Jetpack Compose (Material 3) user interface featuring offline OCR document parsing, video/file media integration, and background download handling.
The app includes built-in presets for several highly-capable, lightweight models optimized for mobile execution. Below are their approximate download sizes and memory requirements:
| Model | Developer | Parameters | Approx. Size | Min. RAM Requirement |
|---|---|---|---|---|
| Qwen 2.5 1.5B Instruct | Alibaba | 1.5B | ~1.6 GB | 6 GB+ |
| DeepSeek-R1 Distill Qwen 1.5B | DeepSeek | 1.5B | ~1.6 GB | 6 GB+ |
| Gemma 1.1 2B IT | Google | 2B | ~1.4 GB | 8 GB+ |
| Phi-2 2.7B | Microsoft | 2.7B | ~1.6 GB | 8 GB+ |
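
As a rough sketch of how the preset table could drive an in-app compatibility check, the snippet below models each preset and filters by device RAM and free storage. The `ModelPreset` type and `runnablePresets` function are hypothetical; only the names, sizes, and RAM figures come from the table.

```kotlin
// Hypothetical data model; the values mirror the preset table above.
data class ModelPreset(
    val name: String,
    val developer: String,
    val approxSizeGb: Double, // approximate download size
    val minRamGb: Int         // minimum recommended device RAM
)

val presets = listOf(
    ModelPreset("Qwen 2.5 1.5B Instruct", "Alibaba", 1.6, 6),
    ModelPreset("DeepSeek-R1 Distill Qwen 1.5B", "DeepSeek", 1.6, 6),
    ModelPreset("Gemma 1.1 2B IT", "Google", 1.4, 8),
    ModelPreset("Phi-2 2.7B", "Microsoft", 1.6, 8)
)

// Keep only the presets whose RAM requirement and download size fit the device.
fun runnablePresets(deviceRamGb: Int, freeStorageGb: Double): List<ModelPreset> =
    presets.filter { it.minRamGb <= deviceRamGb && it.approxSizeGb <= freeStorageGb }
```

Under these assumptions, a 6 GB device with ample storage would be offered only the two 1.5B models, while an 8 GB device would see all four presets.
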
| Inference | Multimodal & OCR (100% Offline) |
|---|---|
| High-performance offline LLM execution | Attach Images, Videos & Documents (PDF, Code, Text) |
| Vulkan GPU acceleration with graceful CPU fallback | Offline image OCR text extraction using Google ML Kit |
| Streaming word-by-word responses | Offline page-by-page PDF rendering and text recognition |
| | Native playback of attached videos; documents open via Intent |
| UI / Experience | Core Features |
|---|---|
| Premium Material 3 dynamic styling | Complete offline privacy (no logs or tracking) |
| Custom system instructions prompt | In-app badges showing model size and RAM requirements |
| Interactive file attachments preview drawer | Multi-turn chat context memory (6-turn history) |
| Collapsible OCR logs under bubble cards | Quantized weight optimizations |
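
The 6-turn chat history and custom system instructions listed above could be combined into a prompt along these lines. This is a sketch under stated assumptions: the `Turn` type, the `buildPrompt` function, and the `System:` / `User:` / `Assistant:` labels are hypothetical, since the prompt format the app actually uses is not documented here.

```kotlin
// Hypothetical types; the app's real prompt format is not shown in this README.
data class Turn(val user: String, val assistant: String)

// Build a prompt from the system instruction, the most recent `maxTurns`
// exchanges, and the new user message. Older turns are dropped so the
// prompt stays within a small, predictable size.
fun buildPrompt(
    systemInstruction: String,
    history: List<Turn>,
    newMessage: String,
    maxTurns: Int = 6
): String {
    val recent = history.takeLast(maxTurns)
    return buildString {
        if (systemInstruction.isNotBlank()) appendLine("System: $systemInstruction")
        for (turn in recent) {
            appendLine("User: ${turn.user}")
            appendLine("Assistant: ${turn.assistant}")
        }
        appendLine("User: $newMessage")
        append("Assistant:")
    }
}
```

Capping the history trades long-range recall for bounded memory use and latency, which matters on devices near the minimum RAM requirement.
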
- Consolidated Top Header: Reduced vertical height to 56.dp, center-aligned settings and delete icons, added system status bar padding, and cleaned up loaded model titles.
- Refined Chat Bubbles: Applied uniform 16.dp corner radius with a sharp 4.dp anchor corner on the sender's edge, expanded text margins, and integrated high-contrast borders for light and dark themes.
- Sleek Input & Attachments: Replaced individual file buttons with a single "+" dropdown trigger, introduced a compact text input area using custom BasicTextField, and implemented state-based styling for the send controls.
- Seamless Keyboard Handling: Redesigned layout flow to use sequential vertical Column with .imePadding(), automatically shifting the input bar and triggering auto-scroll to the bottom upon soft keyboard popups without leaving blank gaps.
Grab the latest compiled APKs from the GitHub releases page.
The release APK (app-release.apk) is optimized for mobile hardware using on-device GPU (Vulkan) or CPU.
To build the application yourself, make sure Java 17 and the Android SDK are installed. Point JAVA_HOME at your JDK (PowerShell syntax shown; adjust the path for your machine) and run the build:
$env:JAVA_HOME = "C:\Users\Badsiwal\.gradle\jdks\eclipse_adoptium-17-amd64-windows.2"
./gradlew assembleRelease
Local LLM/AI is built on top of state-of-the-art on-device intelligence libraries and modern Android components.
Special thanks to:
- Google MediaPipe Tasks GenAI
- Google ML Kit Text Recognition
- Jetpack Compose & Material 3
- Coil Image Loading Library
- OkHttp
- Kotlin Coroutines Flow
Local LLM/AI is licensed under the MIT License. See LICENSE for details.
Local LLM/AI is an independent, unofficial project. It is not affiliated with, funded by, authorized by, endorsed by, or otherwise associated with Google LLC, MediaPipe, Gemma, or any of their affiliates.
All trademarks, service marks, catalogs, artwork, metadata, and model weights remain the property of their respective owners. Users are responsible for procuring and loading model files in compliance with the respective model's terms of use, license agreements, and regional requirements.

