feat(models): support bundled image-model extension packs #114

@DrHepa

Summary

I would like to propose support for GitHub repositories that ship a bundled image-model extension pack, for example a repository with multiple child extensions under extensions/.

The goal is to make these packs installable and runnable in Modly without breaking existing single-extension repositories.

Problem

Today, Modly's GitHub extension install and model-weight flows both assume a single extension whose manifest.json sits at the repository root.

That works for legacy model/process extensions, but it is limiting for image-model packs that need to provide several capabilities from one repository, such as:

  • one or more text-to-image/image model extensions,
  • optional process extensions shipped alongside them,
  • shared model-weight ownership between capability aliases,
  • runtime readiness/setup checks before the model can run,
  • image output previews returned from the FastAPI workspace.

Without host-side support for those concepts, bundled image-model repositories either cannot install cleanly, or they install but then fail later at the download, readiness, generation, or preview step.

Proposed solution

Add host support in small, separable pieces:

1. Bundled GitHub extension install

Allow Install GitHub to accept either:

  • the existing legacy shape: manifest.json at repository root, or
  • a bundle shape: child extension folders under extensions/<extension-id>/.

Expected behavior:

  • discover and validate each child extension,
  • reject duplicate extension IDs in the same bundle,
  • flatten valid children into the user's extensions directory,
  • preserve legacy single-extension installs,
  • report partial child setup failures without discarding successfully installed children,
  • reload extensions once after installation.
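
As a rough illustration of the discovery and validation rules above, here is a minimal sketch in Python. The fork implements this in TypeScript, so this only shows the logic. The names scan_repository, read_extension_id and BundleScan are hypothetical, the manifest field "id" is an assumption, and giving the legacy root manifest precedence over extensions/ is one possible policy rather than a settled decision. Whether a duplicate ID should fail the whole bundle or only the later child is also left open; this sketch keeps the first child and records an error for the second.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class BundleScan:
    shape: str                                    # "legacy" or "bundle"
    children: dict = field(default_factory=dict)  # extension id -> source folder
    errors: list = field(default_factory=list)    # human-readable rejection reasons


def read_extension_id(folder: Path):
    """Return the manifest's "id" (assumed field name), or None if unusable."""
    try:
        manifest = json.loads((folder / "manifest.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    ext_id = manifest.get("id") if isinstance(manifest, dict) else None
    return ext_id if isinstance(ext_id, str) and ext_id else None


def scan_repository(repo_root: Path) -> BundleScan:
    # Assumed policy: a root manifest means the legacy shape, so old repos behave as before.
    if (repo_root / "manifest.json").is_file():
        ext_id = read_extension_id(repo_root)
        if ext_id is None:
            return BundleScan("legacy", errors=["root manifest.json is invalid"])
        return BundleScan("legacy", children={ext_id: repo_root})

    scan = BundleScan("bundle")
    extensions_dir = repo_root / "extensions"
    if not extensions_dir.is_dir():
        scan.errors.append("no manifest.json at root and no extensions/ folder")
        return scan

    # Sorted iteration keeps the "first child wins" rule deterministic.
    for child in sorted(p for p in extensions_dir.iterdir() if p.is_dir()):
        ext_id = read_extension_id(child)
        if ext_id is None:
            scan.errors.append(f"{child.name}: missing or invalid manifest.json")
        elif ext_id in scan.children:
            scan.errors.append(f"{child.name}: duplicate extension id {ext_id!r}")
        else:
            scan.children[ext_id] = child
    return scan
```

A caller would copy each entry of children into the user's extensions directory and surface errors as a partial result, which matches the "report partial failures without discarding successful children" behavior.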

Candidate implementation areas from my fork:

  • electron/main/github-extension-install.ts
  • electron/main/ipc-handlers.ts
  • electron/preload/index.ts
  • src/shared/stores/extensionsStore.ts
  • src/areas/models/ModelsPage.tsx
  • tests/fixtures/github-install/*

2. Owner-scoped image model weights

Separate capability identity from weight ownership so bundled image-model capabilities can share a physical weight directory safely.

Expected behavior:

  • keep capability IDs stable as <extension-id>/<node-id>,
  • allow manifests/nodes to declare owner metadata such as weight_owner_id,
  • download weights into a canonical owner directory,
  • keep read-through compatibility with legacy <extension-id>/<node-id> paths,
  • avoid deleting shared weights while another capability still references the same owner,
  • keep existing 3D model behavior unchanged.
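
To make the ownership rules concrete, here is a minimal sketch, assuming a mapping from capability ID to a declared weight_owner_id and a single models root directory. The function names, the owners mapping, and the on-disk layout are illustrative assumptions, not the fork's actual API.

```python
from pathlib import Path


def owner_id_for(capability_id: str, owners: dict) -> str:
    """Capabilities without a declared weight_owner_id own their own weights."""
    return owners.get(capability_id) or capability_id


def resolve_weight_dir(models_root: Path, capability_id: str, owners: dict) -> Path:
    canonical = models_root / owner_id_for(capability_id, owners)
    legacy = models_root / capability_id  # legacy <extension-id>/<node-id> path
    # Read-through: prefer the canonical owner directory, but fall back to a
    # legacy download when that is the only copy on disk.
    if not canonical.is_dir() and legacy.is_dir():
        return legacy
    return canonical


def can_delete_weights(capability_id: str, owners: dict, installed: set) -> bool:
    """True only if no other installed capability shares this owner."""
    owner = owner_id_for(capability_id, owners)
    return not any(
        other != capability_id and owner_id_for(other, owners) == owner
        for other in installed
    )
```

Because capabilities without an owner resolve to their own ID, existing 3D models keep their current paths and delete behavior under this scheme.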

Candidate implementation areas from my fork:

  • api/services/generator_registry.py
  • api/routers/model.py
  • api/runner.py
  • api/services/extension_process.py
  • electron/main/model-ownership.ts
  • electron/main/model-downloader.ts
  • src/areas/models/modelOwnershipState.ts
  • src/areas/models/components/ExtensionCard.tsx
  • src/shared/types/electron.d.ts
  • tests/fixtures/bundled-image-models/*

3. Text-to-image generation and workflow defaults

Support image-model nodes whose input is text rather than an uploaded image, and hydrate manifest parameter defaults before dispatch.

Expected behavior:

  • text-to-image nodes do not require a fake image path,
  • default params from params_schema are materialized consistently,
  • legacy image-to-3D generation still works,
  • process nodes continue to dispatch through the process runner, not the model API.
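
A small sketch of the default hydration and input check described above. It assumes params_schema is a list of entries with an "id" and an optional "default", and that a node declares an input type of "text" or "image". Both shapes, and the function names, are assumptions made for illustration.

```python
import copy


def hydrate_params(params_schema: list, user_params: dict) -> dict:
    """Fill in manifest defaults, then let explicit user values override them."""
    hydrated = {}
    for entry in params_schema:
        if "default" in entry:
            # Deep-copy so a mutable default is never shared between jobs.
            hydrated[entry["id"]] = copy.deepcopy(entry["default"])
    # Explicit values win, including falsy ones such as 0 or "".
    hydrated.update(user_params)
    return hydrated


def check_inputs(input_type: str, prompt=None, image_path=None) -> None:
    """Validate only the input the node actually consumes."""
    if input_type == "text":
        if not prompt:
            raise ValueError("text-to-image nodes need a prompt")
    elif not image_path:
        raise ValueError("image-input nodes need an image path")
```

Running the same hydration step on every dispatch path is what keeps defaults consistent between the workflow UI and the generation API.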

Candidate implementation areas:

  • api/routers/generation.py
  • api/schemas/generation.py
  • api/services/generation_jobs.py
  • src/shared/hooks/useGeneration.ts
  • src/areas/generate/components/WorkflowPanel.tsx
  • src/areas/workflows/workflowNodeFactory.ts
  • src/areas/workflows/workflowNodeParams.ts
  • src/areas/workflows/workflowDispatch.ts
  • src/areas/workflows/processPorts.ts
  • src/areas/workflows/processConnectionRules.ts

4. Image preview output support

Allow generated images returned as workspace paths to render in the workflow UI.

Expected behavior:

  • add/use a single-image preview node for image outputs,
  • resolve /workspace/... URLs against the FastAPI API base URL,
  • allow FastAPI preview images through the app CSP,
  • make preview images fill resized nodes without distorting aspect ratio.
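
The URL resolution and aspect-preserving fit can be sketched as follows. The fork does this in the renderer (TypeScript), so this Python version only shows the arithmetic. It assumes /workspace/... is served from the origin of the API base URL rather than under its path prefix; the function names are hypothetical.

```python
from urllib.parse import urlsplit


def resolve_preview_url(output: str, api_base_url: str) -> str:
    """Turn a /workspace/... path into an absolute URL on the API origin."""
    if urlsplit(output).scheme:
        return output  # already absolute (http:, data:, ...), leave untouched
    if output.startswith("/workspace/"):
        base = urlsplit(api_base_url)
        return f"{base.scheme}://{base.netloc}{output}"
    return output


def contain_size(image_w: float, image_h: float, box_w: float, box_h: float):
    """Largest size that fits inside the box while keeping the aspect ratio."""
    scale = min(box_w / image_w, box_h / image_h)
    return image_w * scale, image_h * scale
```

In the UI the second function corresponds to CSS object-fit: contain. The CSP change is separate: the API origin has to be allowed as an image source for the resolved URL to load at all.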

Candidate implementation areas:

  • src/areas/workflows/nodes/PreviewImageNode.tsx
  • src/areas/workflows/nodes/PreviewViewsNode.tsx
  • src/areas/workflows/nodes/previewNodeShared.ts
  • src/index.html
  • tests/index-html-csp.test.mjs

5. Runtime readiness / setup surface

For model extensions that depend on external runtimes or large weights, expose a bounded readiness surface so the UI can show actionable status without silently running setup.

Expected behavior:

  • host can query model runtime readiness from FastAPI,
  • UI can show status such as ready / setup required / update required,
  • only safe, explicit actions are exposed to the UI,
  • refresh actions require explicit user dispatch,
  • diagnostic details are sanitized and bounded.
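
One possible shape for a bounded readiness payload, as a sketch. The status strings, the single "refresh" action, the 500-character cap, and the function names are all assumptions chosen for illustration; the proposal only fixes the intent (a small set of states, an allow-list of actions, sanitized and bounded details).

```python
from enum import Enum


class Readiness(str, Enum):
    READY = "ready"
    SETUP_REQUIRED = "setup_required"
    UPDATE_REQUIRED = "update_required"


SAFE_ACTIONS = {"refresh"}   # allow-list; anything else is dropped
MAX_DETAIL_CHARS = 500       # upper bound on diagnostic text sent to the UI


def sanitize_details(raw: str, home: str) -> str:
    # Hide the user's home directory, strip control characters, cap the length.
    text = raw.replace(home, "~") if home else raw
    text = "".join(ch for ch in text if ch.isprintable() or ch == "\n")
    if len(text) > MAX_DETAIL_CHARS:
        text = text[:MAX_DETAIL_CHARS] + "..."
    return text


def build_readiness(status: str, details: str, actions: list, home: str) -> dict:
    return {
        "status": Readiness(status).value,  # raises ValueError for unknown states
        "details": sanitize_details(details, home),
        "actions": sorted(a for a in actions if a in SAFE_ACTIONS),
    }
```

The payload only describes state. Running a listed action would still require an explicit request from the user, in line with the "no silent setup" goal.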

Candidate implementation areas:

  • api/services/generator_registry.py
  • api/routers/model.py
  • electron/main/model-runtime-readiness.ts
  • electron/preload/electron-api.ts
  • src/shared/stores/extensionsStore.ts
  • src/areas/models/components/ExtensionCard.tsx

Candidate PR split

To keep review manageable, I would suggest splitting this into multiple PRs if you are open to the direction:

  1. Bundle install support — install GitHub repositories with extensions/* children while preserving legacy root installs.
  2. Owner-scoped image model weights — canonical owner paths, legacy read-through, safe delete/download state.
  3. Text-to-image generation + workflow defaults — text input dispatch and default param hydration.
  4. Image preview fixes — workspace URL resolution, CSP, resizable preview rendering.
  5. Runtime readiness — optional/status surface for models that need external runtime setup.

Validation plan

Suggested coverage:

  • bundle install accepts a valid extensions/* repo,
  • bundle install rejects duplicate child extension IDs,
  • valid children survive when another child has a setup failure and the result is reported as partial,
  • legacy single-extension GitHub install still works,
  • shared image-model weight owner downloads once and is reflected across aliases,
  • deleting one capability does not remove weights still referenced by another capability,
  • text-to-image generation dispatch does not require image input,
  • workflow params hydrate defaults from manifests,
  • /workspace/... image outputs render in the preview node,
  • existing 3D model generation behavior remains unchanged.

Related context

This is separate from the broader headless workflow API proposal in #94. The scope here is the desktop host support needed for bundled image-model extension repositories to install, manage weights, run, and preview correctly.

I have a working branch in a fork with these changes split into commits and can prepare one or more PRs if this direction is acceptable.
