fix: skip local llama server in cloud mode #1363
Summary
This fixes cloud/API mode installs where DreamServer is configured to use an external OpenAI-compatible endpoint, such as LM Studio hosted on the LAN. In cloud mode, the installer skips local GGUF model setup, but compose resolution could still include the local This PR:
Validation
What I've added now: WEBUI_PORT and other ports now respect overrides on Linux/macOS/Windows. Linux/Windows now check more real ports before composing, including Hermes, LiteLLM, Dashboard, etc. In cloud mode, Hermes no longer points to llama-server; it points to LiteLLM/cloud. Open WebUI/LiteLLM/Hermes no longer declare llama-server as a fixed dependency in the metadata, because in cloud mode that dependency does not exist.
I found one more real problem: cloud mode was not correctly configuring the user's "private cloud" case, e.g. DreamServer in a VM/WSL and LM Studio on the host/LAN. The PR now adds CLOUD_LLM_BASE_URL, CLOUD_LLM_MODEL, and CLOUD_LLM_API_KEY, makes LiteLLM use them, keeps Hermes/Open WebUI pointing at LiteLLM internally, and stops preflight/validate from requiring llama-server in cloud mode. I also validated a custom port such as WEBUI_PORT=3007.
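
To illustrate the port-override behavior described in the two comments above, here is a minimal sketch of how an installer might resolve a port from an optional override. The helper name `resolve_port` is hypothetical and not taken from the PR; the default of 3000 for `WEBUI_PORT` is assumed from the port 3000 conflict mentioned in the PR description.

```shell
# Hypothetical helper (not from the PR): prefer an override, fall back to a default.
resolve_port() {
  local override="$1" default_port="$2"
  local port="${override:-$default_port}"
  # Reject empty values, non-digits, and anything longer than five digits.
  case "$port" in
    ''|*[!0-9]*|??????*)
      echo "invalid port: $port" >&2
      return 1
      ;;
  esac
  # Enforce the valid TCP port range.
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  printf '%s\n' "$port"
}
```

With this sketch, `resolve_port "${WEBUI_PORT:-}" 3000` yields the override when `WEBUI_PORT=3007` is exported and 3000 otherwise. Checking whether the resolved port is already in use would be a separate step.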
Lightheartdevs
left a comment
Review: request changes
I do not think this PR is safe to merge as-is. It overlaps heavily with #1364, is currently CONFLICTING against main, and the fleet cloud-mode contract fails on the PR head.
Findings
- [P1] Cloud installs still wait on local `llama-server` in Phase 12
  `dream-server/installers/phases/12-health.sh:108-146`
  This file still unconditionally runs the local `llama-server` health check and pre-warm path. In `DREAM_MODE=cloud`, `llama-server` is intentionally profiled out, so Phase 12 will wait/prewarm a service that should not exist. That preserves the original failure class this PR is meant to fix.
  Repro from fleet contract on `tower2`: `/tmp/dream-cloud-mode-run-1363-audit`
  The `phase12-cloud-health-target` step failed with: `phase 12 still appears to wait on llama-server unconditionally`
- [P1] Hermes SOUL/config generation is still inside the local-model-only block
  `dream-server/installers/phases/11-services.sh:455-620`
  The Hermes config and `data/persona/SOUL.md` render still sit under `if [[ "${DREAM_MODE:-local}" != "cloud" ]]; then`. That means cloud installs skip the generated SOUL.md file entirely. This can recreate the file-vs-directory bind-mount failure users saw, and the cloud-specific Hermes routing code inside that block is unreachable in cloud mode.
  Same fleet run failed `hermes-soul-renders-in-cloud` with: `SOUL.md render is still inside local-model-only flow`
- [P2] The default LiteLLM cloud route now assumes a private OpenAI-compatible LAN endpoint
  `dream-server/config/litellm/cloud.yaml:2-6`
  This changes `default` from the hosted cloud route to `openai/${CLOUD_LLM_MODEL}` backed by `CLOUD_LLM_BASE_URL`, which defaults elsewhere to `http://host.docker.internal:1234/v1`. That is good as a private-cloud/LM Studio route, but it changes normal hosted `--cloud` behavior: a user with only `ANTHROPIC_API_KEY`/`OPENAI_API_KEY` can now land on a default route pointing at a local endpoint they never started. I would keep the private-cloud route as an explicit alias, or only make it default when the installer confirms a private-cloud endpoint.
- [P2] The interactive installer prints the default API key in the prompt
  `dream-server/install-core.sh:266-275`
  If `CLOUD_LLM_API_KEY` or `OPENAI_API_KEY` is set to a real value, the prompt renders it as `[${CLOUD_LLM_API_KEY}]` and reads the replacement visibly. That exposes secrets in terminal scrollback/recordings. Please mask the default and use a silent prompt for key entry.
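
As a rough illustration of the first finding, a mode-aware Phase 12 could choose its health target from `DREAM_MODE` instead of hard-coding the local service. The helper names `health_target` and `should_prewarm` are hypothetical, and waiting on `litellm` in cloud mode is an assumption made for this sketch, not something confirmed by `12-health.sh`.

```shell
# Hypothetical helper: pick the service a health phase should wait on.
# Assumption: cloud installs front their upstream with the litellm service.
health_target() {
  case "${DREAM_MODE:-local}" in
    cloud) echo "litellm" ;;
    *)     echo "llama-server" ;;
  esac
}

# Pre-warming only makes sense when a local model is actually served.
should_prewarm() {
  [ "${DREAM_MODE:-local}" != "cloud" ]
}
```

A health phase built this way would call `health_target` once and guard the pre-warm path with `should_prewarm`, so a cloud install never waits on a service that was profiled out.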
Additional notes
- The PR is currently not mergeable: GitHub reports `mergeable=CONFLICTING`, `mergeStateStatus=DIRTY` after #1364 merged.
- Relative to current `main`, this branch would also drop `dream-server/tests/test-linux-cloud-mode.sh`; please rebase on `main` and preserve that contract test.
- CI is green, but it is not catching these cloud-mode contract regressions yet. The fleet run is the stronger signal here.
Validation I ran
- Checked GitHub PR state and current CI: all CI jobs are green, but branch is conflicting.
- Ran the fleet cloud-mode contract on `tower2` against PR head `526d6377f969154fe3f41d35f53f9c5386868992`:
  - `cloud-compose-services-no-llama`: pass
  - `dream-cli-cloud-cache-invalidation`: pass
  - `product-linux-cloud-mode-test`: fail, missing test from current `main`
  - `phase12-cloud-health-target`: fail
  - `hermes-soul-renders-in-cloud`: fail
Recommended path: rebase on current main, keep #1364's cloud-mode contract/test changes, then add the private-cloud CLOUD_LLM_* work as a narrower follow-up with tests for both hosted-cloud and LM Studio/private-cloud defaults.
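
For the fourth finding, a masked, silent key prompt could look roughly like the sketch below. It assumes bash, since `read -s` is a bash feature, and the function names `mask_secret` and `prompt_api_key` are hypothetical rather than taken from `install-core.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: indicate whether a value exists without revealing it.
mask_secret() {
  if [ -n "${1:-}" ]; then
    echo "<set, hidden>"
  else
    echo "<unset>"
  fi
}

# Hypothetical prompt: keep the current key on empty input and never echo
# either the existing value or the typed replacement.
prompt_api_key() {
  local current="${1:-}" reply=""
  # -s suppresses terminal echo, -r keeps backslashes literal.
  # bash only shows the -p prompt when stdin is a terminal.
  read -rs -p "API key [$(mask_secret "$current")]: " reply || true
  echo >&2   # move to a new line after the silent read
  printf '%s\n' "${reply:-$current}"
}
```

Called as `CLOUD_LLM_API_KEY="$(prompt_api_key "${CLOUD_LLM_API_KEY:-}")"`, this keeps the secret out of scrollback while still letting the user accept the existing value by pressing Enter.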
526d637 to
ed0c3c7
I rebased the PR on current main and addressed the requested changes. Cloud mode now preserves the #1364 contract: it does not wait on or prewarm local llama-server, and Hermes SOUL/config rendering is outside the local-model-only flow. I also restored the hosted-cloud default route and made LM Studio/private LAN explicit through the local-lan route using CLOUD_LLM_*. The interactive private-cloud API key prompt is now silent and no longer prints existing/default secrets. I also added Linux cloud-mode contract checks for the hosted vs private-cloud route split and the silent key prompt. Validation passed locally, including:
Lightheartdevs
left a comment
Re-review after rebase
Thanks for rebasing this on current main. The original blockers from my prior review look fixed now:
- GitHub reports the branch mergeable against `main`.
- `bash -n` and `git diff --check` pass.
- `bash dream-server/tests/test-linux-cloud-mode.sh` passes on tower2.
- Fleet cloud-mode contract passes on tower2: `/tmp/dream-cloud-mode-run-1363-rerun/REPORT.md` with `OVERALL: PASS`.
- I did not find private hostnames, usernames, LAN IPs, or concrete secrets in the added diff; the secret-bearing values I saw are env-var references or dummy local keys.
I still cannot approve this head yet because the new private-cloud path appears only partially wired.
Findings
- [P1] Private-cloud `local-lan` route is added, but Hermes is not routed to it
  `dream-server/config/litellm/cloud.yaml:12-16` adds an explicit `local-lan` model route backed by `CLOUD_LLM_*`, which is the right direction for LM Studio/private LAN. But the install still keeps `CLOUD_LLM_MODEL` separate from `LLM_MODEL` (`dream-server/installers/phases/06-directories.sh:471`, `dream-server/installers/phases/06-directories.sh:536`, `dream-server/installers/phases/06-directories.sh:544`), and Phase 11 configures Hermes in cloud mode to request `${LLM_MODEL:-default}` (`dream-server/installers/phases/11-services.sh:542-543`).
  That means a private-cloud install can successfully avoid local `llama-server`, but Hermes still asks LiteLLM for the cloud-tier `LLM_MODEL` or the user's upstream model id, not for the `local-lan` route alias. `cloud.yaml` has named routes for `default`, `cloud`, `local-lan`, `gpt4o`, `fast`, `minimax`, and `minimax-fast`; it does not define a wildcard route equivalent to `local.yaml`. So the LM Studio endpoint can be configured and still not be the model route Hermes actually calls.
  This is important because it is the next likely user-facing failure after the compose fix: the stack starts, but the private-cloud path still 404s or routes to hosted defaults instead of the LAN OpenAI-compatible server. Please either set cloud/private-cloud consumers to the `local-lan` route when `CLOUD_LLM_BASE_URL` is present, add a deliberate wildcard/pass-through route for cloud config, or patch Hermes/Open WebUI config so the route alias and upstream model id are distinct and both used in the right place.
- [P2] `scripts/dream-preflight.sh` still reports cloud installs as missing `llama-server`
  This PR updates `scripts/dream-preflight.sh` to pass `--dream-mode`, but the script still resolves `LLM_PORT`, `LLM_HEALTH`, and `LLM_CONTAINER` from `llama-server` (`dream-server/scripts/dream-preflight.sh:41-44`), requires that container in the core container check (`dream-server/scripts/dream-preflight.sh:58-65`), labels/checks the API as `llama-server` (`dream-server/scripts/dream-preflight.sh:68-77`), and runs `docker exec` against the llama container for GPU detection (`dream-server/scripts/dream-preflight.sh:88-90`).
  In `DREAM_MODE=cloud`, `llama-server` is intentionally absent. So a user running `./scripts/dream-preflight.sh` after a cloud install gets a false failure even when the fixed compose stack is correct. The top-level `dream-preflight.sh` was made cloud-aware; this companion script should either mirror that behavior or avoid accepting/resolving cloud compose flags.
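
The first option in the P1 finding above, routing consumers to `local-lan` whenever `CLOUD_LLM_BASE_URL` is present, could be sketched as follows. The helper name `cloud_route_alias` is hypothetical, and the sketch assumes it is only consulted for cloud-mode installs; the upstream model id would stay in `CLOUD_LLM_MODEL` and be used only inside the LiteLLM route definition.

```shell
# Hypothetical helper: choose the LiteLLM route alias that cloud-mode
# consumers (e.g. Hermes, Open WebUI) should request.
cloud_route_alias() {
  if [ -n "${CLOUD_LLM_BASE_URL:-}" ]; then
    echo "local-lan"   # private-cloud upstream such as LM Studio
  else
    echo "default"     # hosted provider route
  fi
}
```

Keeping the alias and the upstream model id in separate variables is what lets the consumer request a route name that actually exists in `cloud.yaml`.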
Validation I ran
- `bash -n` on changed shell entrypoints and installer phases.
- `git diff --check origin/main..HEAD`.
- `DREAM_PYTHON_CMD=python3 bash dream-server/tests/test-linux-cloud-mode.sh` on tower2.
- Fleet cloud-mode contract on tower2 against head `ed0c3c73661216867a58c71edd50e2b4b53d2969`: `/tmp/dream-cloud-mode-run-1363-rerun`, `OVERALL: PASS`.
- Manual review of `cloud.yaml`, Phase 06 env generation, Phase 11 Hermes config rendering, and both preflight scripts.
The old cloud-compose/SOUL/Phase-12 problems are fixed. I would keep this PR open, tighten the two follow-through pieces above, and then it should be in much better shape to approve.
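
For context on the SOUL item mentioned above: Docker typically creates a directory at a bind-mount source path that does not exist yet, which is one way a file-vs-directory mismatch can arise. A minimal guard, run regardless of `DREAM_MODE`, might look like the sketch below. The helper name `ensure_soul_file` and the placeholder content are assumptions for illustration; the real installer renders the file from its own template.

```shell
# Hypothetical helper: make sure the bind-mount source is a regular file
# before any container that mounts it is started.
ensure_soul_file() {
  local soul_path="$1"
  if [ -d "$soul_path" ]; then
    echo "error: $soul_path is a directory, expected a file" >&2
    return 1
  fi
  mkdir -p "$(dirname "$soul_path")"
  # Placeholder content (assumption); only written if nothing was rendered yet.
  [ -f "$soul_path" ] || printf '# SOUL\n' > "$soul_path"
}
```

Calling `ensure_soul_file data/persona/SOUL.md` outside any mode-specific branch means both local and cloud installs end up with a regular file at the mount source.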
ca9af95 to
dc5bec5
Thanks for the re-review. I pushed a new head that addresses both follow-through issues. Changes:
Validation passed:
I could not run |
dc5bec5 to
546301b
Rebased this on current What changed in this update:
Local validation run:
The dashboard-api pytest subset was not run locally because this machine's current Python environment does not have |
Codex re-review notes
I re-reviewed the latest head after the rebase and checked it against current What I checked:
Local environment caveats:
Assessment of the earlier requested-change items:
Risk/readiness: no new blocker found, but this is materially higher blast radius than a docs/doctor PR. It touches Linux, macOS, Windows installer/env generation, LiteLLM startup config, service manifests, preflight, and dashboard-api helper behavior. I would not batch it with unrelated changes. Recommendation is to merge only after a fresh CI run on the rebased head or one targeted cloud/private-cloud validation pass; with that, I think the previous changes-requested concerns can be considered addressed.
546301b to
8049c61
Updated this PR after the latest review and current What changed in this update:
Local validation run:
Local caveats:
Remote CI after this push is green, including shell lint, Python lint, mypy, Docker Compose validation, API/frontend, integration smoke, macOS smoke, Linux distro smoke, and PowerShell lint on both Ubuntu and Windows.
Summary
Fixes cloud/private-cloud installs so Dream Server does not start or depend on the local `llama-server` stack when `DREAM_MODE=cloud`, while keeping normal hosted cloud installs on the hosted provider route.
This branch was rebased on current `main` after #1364 and preserves the upstream cloud-mode contract/test changes.
Changes include:
- `llama-server` and local dependency overlays excluded
- `DREAM_MODE` through compose resolution paths so restart/update/preflight use the same stack as install
- `WEBUI_PORT` for port 3000 conflicts and Hermes proxy ports
- `litellm` instead of the local llama service
- `data/persona/SOUL.md` generation outside the local-model-only flow, so cloud installs still render the bind-mounted file
- `default` for hosted cloud, `local-lan` for private-cloud/LM Studio) instead of sending the raw model id to LiteLLM
- `default`/`cloud` routes on hosted providers
- `local-lan` route backed by `CLOUD_LLM_BASE_URL`, `CLOUD_LLM_MODEL`, and `CLOUD_LLM_API_KEY` for private-cloud OpenAI-compatible upstreams like LM Studio (`http://<host-lan-ip>:1234/v1`)
- `litellm` rather than reporting `llama-server` as missing in cloud mode
- `scripts/dream-preflight.sh` cloud-aware so it checks `litellm`, logs `litellm`, and skips local GPU probing instead of failing on an intentionally absent `llama-server`
Review follow-up
Addresses the requested changes from review/re-review:
- `llama-server` in cloud mode.
- `SOUL.md`/config rendering is not trapped inside the local-model-only block.
- `local-lan`.
- `local-lan` LiteLLM alias when `CLOUD_LLM_BASE_URL` is present.
- `llama-server`.
- `scripts/dream-preflight.sh` no longer reports cloud installs as missing `llama-server`.
- `main` and keeps `dream-server/tests/test-linux-cloud-mode.sh`.
Validation
- `bash dream-server/tests/test-linux-cloud-mode.sh`
- `bash -n dream-server/scripts/dream-preflight.sh dream-server/tests/test-linux-cloud-mode.sh`
- `bash -n dream-server/installers/phases/11-services.sh dream-server/installers/macos/install-macos.sh dream-server/tests/test-linux-cloud-mode.sh`
- `python -m py_compile dream-server/extensions/services/dashboard-api/config.py dream-server/extensions/services/dashboard-api/helpers.py dream-server/extensions/services/dashboard-api/main.py dream-server/extensions/services/dashboard-api/settings.py`
- `PYTHONPATH=. python -m pytest tests/test_main.py tests/test_helpers.py tests/test_settings_env.py -q` in `extensions/services/dashboard-api`
- `PYTHONPATH=. python -m pytest tests/test_features.py tests/test_config.py tests/test_extensions.py tests/test_templates.py -q` in `extensions/services/dashboard-api`
- `bash dream-server/tests/test-network-timeouts.sh`
- `python dream-server/tests/test-render-runtime-configs.py`
- `bash dream-server/tests/test-service-registry.sh`
- `python -m py_compile dream-server/bin/dream-host-agent.py`
- `git diff --check`
Note: `bash dream-server/tests/contracts/test-installer-contracts.sh` could not run in this local Windows shell because `jq` is not installed here; it exits immediately with `[FAIL] jq is required` before exercising the PR changes.