[Feat]: Cloud inference alongside on-device models #663

@slimhk45

Description

With Gemma 4 E2B/E4B just dropping, on-device inference on Android is genuinely good now. But serious tasks still warrant a larger cloud model.

Would love the ability to configure cloud API keys (OpenAI, Anthropic, etc.) and switch between local and cloud backends per message — same chat UI, shared conversation history. The goal is using local inference for most things to cut API costs, and reaching for cloud only when needed.

Essentially the same model-switching UX that Open WebUI offers, but built into PocketPal.
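A rough sketch of the shape this could take: a common backend interface so the chat UI and conversation history stay shared, with the backend chosen per message. All names here (`InferenceBackend`, `LocalBackend`, `CloudBackend`, `Chat`) are illustrative, not existing PocketPal APIs.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// One interface the chat UI talks to, regardless of where inference runs.
interface InferenceBackend {
  readonly id: string;
  complete(history: ChatMessage[]): Promise<string>;
}

// Stand-in for on-device inference (e.g. a local Gemma model).
class LocalBackend implements InferenceBackend {
  readonly id = "local:gemma";
  async complete(history: ChatMessage[]): Promise<string> {
    return `[local] reply to: ${history[history.length - 1].content}`;
  }
}

// Stand-in for a cloud provider; a real one would call the provider's
// API with a user-configured key instead of returning a stub.
class CloudBackend implements InferenceBackend {
  readonly id: string;
  constructor(provider: string, private apiKey: string) {
    this.id = `cloud:${provider}`;
  }
  async complete(history: ChatMessage[]): Promise<string> {
    return `[${this.id}] reply to: ${history[history.length - 1].content}`;
  }
}

// One shared history; the caller picks the backend per message.
class Chat {
  history: ChatMessage[] = [];
  async send(text: string, backend: InferenceBackend): Promise<string> {
    this.history.push({ role: "user", content: text });
    const reply = await backend.complete(this.history);
    this.history.push({ role: "assistant", content: reply });
    return reply;
  }
}

async function demo(): Promise<string[]> {
  const chat = new Chat();
  const local = new LocalBackend();
  const cloud = new CloudBackend("openai", "sk-...");
  await chat.send("quick question", local); // cheap, on-device
  await chat.send("hard reasoning task", cloud); // escalate to cloud
  return chat.history.map((m) => m.content);
}
```

The point of the interface is that switching backends mid-conversation costs nothing: both sides see the same `history`, so the cloud model gets full context from earlier local turns.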

Labels: enhancement (New feature or request)