Beta is an early Android prototype exploring what a voice-first assistant for personal commerce could feel like. The current cart-only grocery flow supports Blinkit, Swiggy Instamart, and Zepto, with the app stopping before checkout and payment so the user can review the cart.
This project is an early prototype. It is not production-ready, and parts of the experience are still experimental and likely to change.
betaapp.live is reserved for this project, but it is not live yet.
- Voice and text input for intent capture
- Screen understanding to interpret what is currently on the device
- OCR for extracting visible text from the UI
- Accessibility tree inspection for structured UI context
- Assisted ordering flows that guide the user step by step
- Cart-building support for Blinkit, Swiggy Instamart, and Zepto
Grocery MCP (Model Context Protocol) integrations are not live yet.
The plan is to replace brittle screen-based automation with reliable, explicit APIs for commerce actions where supported by providers such as Blinkit, Swiggy Instamart, and Zepto. Once integrated, MCP or provider APIs will be used for:
- Search and discovery
- Cart creation and updates
- Checkout validation (address, timing, fees, availability)
- Order placement only after explicit user confirmation
Until MCP is integrated, any ordering assistance is best-effort and should be treated as prototype behavior.
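One way to picture the transition described above is a small selector that prefers the explicit API path for a provider when one exists and otherwise falls back to the screen-driven prototype path. This is only a sketch: `Provider`, `Backend`, and `BackendSelector` are hypothetical names, not classes from this repository, and the set of API-enabled providers is assumed to be empty today because no integration is live.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical provider list, mirroring the three apps named in this README.
enum Provider { BLINKIT, SWIGGY_INSTAMART, ZEPTO }

// Hypothetical backends: explicit API calls vs. the current screen-driven path.
enum Backend { MCP_API, SCREEN_AUTOMATION }

final class BackendSelector {
    // Providers with a working MCP/API integration (assumed empty for now).
    private final Set<Provider> apiEnabled;

    BackendSelector(Set<Provider> apiEnabled) {
        // EnumSet.copyOf rejects an empty non-EnumSet collection, so handle that case.
        this.apiEnabled = apiEnabled.isEmpty()
                ? EnumSet.noneOf(Provider.class)
                : EnumSet.copyOf(apiEnabled);
    }

    // Prefer the explicit API when available; otherwise use screen automation.
    Backend backendFor(Provider provider) {
        return apiEnabled.contains(provider) ? Backend.MCP_API : Backend.SCREEN_AUTOMATION;
    }

    // Screen-driven assistance is best-effort and should be labeled as such.
    boolean isBestEffort(Provider provider) {
        return backendFor(provider) == Backend.SCREEN_AUTOMATION;
    }
}
```

With an empty set, every provider resolves to `SCREEN_AUTOMATION`, which matches the current state of the prototype.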
Current prototype architecture:
- Android app UI: captures voice/text input and presents guided steps
- Perception layer: combines screenshot-based OCR and accessibility tree inspection
- Intent and flow logic: maps user intent to a sequence of assisted steps
- Action layer (prototype): interacts with on-screen UI elements when needed, with fallbacks and human-in-the-loop prompts
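To illustrate how the perception layer might combine its two sources, the sketch below merges accessibility labels with OCR lines, keeping the accessibility entry when both report the same text. It is plain Java with hypothetical types (`ScreenElement`, `PerceptionMerger`); in the app, the inputs would presumably come from Android's accessibility node tree and an OCR engine, neither of which is modeled here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical merged view of one piece of on-screen text.
final class ScreenElement {
    final String text;
    final String source;     // "accessibility" or "ocr"
    final boolean clickable; // only known for accessibility nodes

    ScreenElement(String text, String source, boolean clickable) {
        this.text = text;
        this.source = source;
        this.clickable = clickable;
    }
}

final class PerceptionMerger {
    // Normalize so that "ADD TO CART " and "Add to cart" count as the same element.
    static String key(String text) {
        return text.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // accessibilityNodes maps a visible label to its clickable flag (non-null values);
    // ocrLines is raw OCR output, one string per recognized line.
    static List<ScreenElement> merge(Map<String, Boolean> accessibilityNodes,
                                     List<String> ocrLines) {
        Map<String, ScreenElement> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> node : accessibilityNodes.entrySet()) {
            String k = key(node.getKey());
            if (k.isEmpty()) continue; // skip unlabeled nodes
            merged.put(k, new ScreenElement(node.getKey().trim(), "accessibility", node.getValue()));
        }
        for (String line : ocrLines) {
            String k = key(line);
            // OCR only fills in text the accessibility tree did not expose.
            if (!k.isEmpty() && !merged.containsKey(k)) {
                merged.put(k, new ScreenElement(line.trim(), "ocr", false));
            }
        }
        return new ArrayList<>(merged.values());
    }
}
```

Preferring the accessibility entry is a design assumption: it carries structure such as clickability, while OCR text is treated as a fallback for content the tree does not expose.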
Target architecture (once MCP or provider APIs are integrated):
- Android app UI: voice-first experience, confirmations, and review screens
- Intent and flow logic: determines the next best action and required confirmations
- MCP/API client: performs commerce actions through supported grocery-provider tools
- Validation and guardrails: checks totals, address, delivery slot, and constraints before showing a final confirmation
- Explicit confirmation gate: order placement happens only after the user approves the final summary
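The validation and confirmation-gate components above could look roughly like the following. This is a minimal sketch under stated assumptions: `OrderSummary` and `CheckoutGuard` are hypothetical names, amounts are held in minor currency units to avoid floating-point rounding, and the checks shown (non-empty cart, address, delivery slot, optional budget) are illustrative rather than a complete list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical summary shown on the final review screen.
final class OrderSummary {
    final long itemsTotalMinor; // item total in minor currency units
    final long feesMinor;       // delivery and handling fees
    final long budgetMinor;     // 0 means the user gave no budget
    final String address;
    final String deliverySlot;

    OrderSummary(long itemsTotalMinor, long feesMinor, long budgetMinor,
                 String address, String deliverySlot) {
        this.itemsTotalMinor = itemsTotalMinor;
        this.feesMinor = feesMinor;
        this.budgetMinor = budgetMinor;
        this.address = address;
        this.deliverySlot = deliverySlot;
    }

    long grandTotalMinor() {
        return itemsTotalMinor + feesMinor;
    }
}

final class CheckoutGuard {
    private static boolean blank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Returns human-readable problems; an empty list means the summary can be shown.
    static List<String> validate(OrderSummary s) {
        List<String> problems = new ArrayList<>();
        if (s.itemsTotalMinor <= 0) problems.add("cart is empty");
        if (blank(s.address)) problems.add("missing delivery address");
        if (blank(s.deliverySlot)) problems.add("missing delivery slot");
        if (s.budgetMinor > 0 && s.grandTotalMinor() > s.budgetMinor) {
            problems.add("total exceeds budget");
        }
        return problems;
    }

    // The gate: placement requires both a clean validation and explicit user approval.
    static boolean mayPlaceOrder(OrderSummary s, boolean userConfirmed) {
        return userConfirmed && validate(s).isEmpty();
    }
}
```

Keeping the gate as a single function that takes the user's confirmation as an argument makes it hard for any code path to place an order without passing through it.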
- User speaks or types an intent, for example "Order a spicy paneer bowl under 250".
- The assistant clarifies constraints if needed, for example location, budget, or dietary preferences.
- The assistant gathers context from the current screen (prototype) or via MCP (target).
- The assistant proposes a short list or a recommended choice.
- The assistant builds the cart and validates checkout details.
- The assistant presents a final review with total cost and key details.
- User explicitly confirms, then the order is placed.
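The flow above can be sketched as a small state machine in which the review stage can only be left through an explicit confirm or cancel. `FlowStage` and `OrderFlow` are hypothetical names, and the sketch simplifies the real flow by always passing through the clarification stage, which the text describes as optional.

```java
// Hypothetical stages, roughly one per step in the flow described above.
enum FlowStage {
    CAPTURE_INTENT, CLARIFY, GATHER_CONTEXT, PROPOSE, BUILD_CART, REVIEW, PLACED, CANCELLED
}

final class OrderFlow {
    private FlowStage stage = FlowStage.CAPTURE_INTENT;

    FlowStage stage() {
        return stage;
    }

    // Advances through the automatic stages. It never leaves REVIEW:
    // that requires confirm() or cancel().
    void next() {
        switch (stage) {
            case CAPTURE_INTENT: stage = FlowStage.CLARIFY; break;
            case CLARIFY: stage = FlowStage.GATHER_CONTEXT; break;
            case GATHER_CONTEXT: stage = FlowStage.PROPOSE; break;
            case PROPOSE: stage = FlowStage.BUILD_CART; break;
            case BUILD_CART: stage = FlowStage.REVIEW; break;
            default:
                throw new IllegalStateException("next() is not allowed in stage " + stage);
        }
    }

    // Explicit user confirmation is the only transition into PLACED.
    void confirm() {
        if (stage != FlowStage.REVIEW) {
            throw new IllegalStateException("confirm() is only allowed in REVIEW");
        }
        stage = FlowStage.PLACED;
    }

    // The user can back out at any point before the order is placed.
    void cancel() {
        if (stage == FlowStage.PLACED) {
            throw new IllegalStateException("order already placed");
        }
        stage = FlowStage.CANCELLED;
    }
}
```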
- User control first: no order is placed without explicit confirmation.
- Data minimization: collect only what is needed to complete the task.
- Transparency: clearly indicate when screen data (OCR or accessibility tree) is being used.
- Local-first where possible: prefer on-device processing when feasible.
- Sensitive data handling: avoid storing screenshots, extracted text, or identifiers unless required for debugging, and make retention short and opt-in.
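As one possible reading of the "short and opt-in" retention principle, the sketch below keeps extracted screen text only when the user has opted in and drops anything older than a fixed window. `DebugCaptureStore` is a hypothetical in-memory class; the retention window and the caller-supplied timestamps are assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class DebugCaptureStore {
    private static final class Entry {
        final String text;
        final long storedAtMillis;

        Entry(String text, long storedAtMillis) {
            this.text = text;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final boolean userOptedIn;
    private final long retentionMillis;
    private final List<Entry> entries = new ArrayList<>();

    DebugCaptureStore(boolean userOptedIn, long retentionMillis) {
        this.userOptedIn = userOptedIn;
        this.retentionMillis = retentionMillis;
    }

    // Returns true only if the text was actually kept.
    boolean store(String extractedText, long nowMillis) {
        if (!userOptedIn) return false; // default: nothing is retained
        entries.add(new Entry(extractedText, nowMillis));
        return true;
    }

    // Removes entries older than the retention window and returns how many were dropped.
    int purgeExpired(long nowMillis) {
        int removed = 0;
        for (Iterator<Entry> it = entries.iterator(); it.hasNext();) {
            if (nowMillis - it.next().storedAtMillis > retentionMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return entries.size();
    }
}
```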
This repository contains an early Android prototype.
- Install Android Studio (latest stable recommended).
- Open the project in Android Studio.
- Sync Gradle.
- Run the app on an emulator or a connected Android device.
If the project uses local keys or environment configuration, keep them out of git and follow any existing sample configuration files in the repo.
- Stabilize the voice and text input experience
- Improve screen understanding quality (OCR and accessibility parsing)
- Add robust guided flows with better error handling
- Introduce a review-and-confirmation summary screen for all ordering actions
- Integrate supported grocery MCP/provider APIs for search, cart, checkout validation, and order placement
- Add privacy controls and clear consent UX for any captured screen context