feat(scanner): derive skill/plugin source from install manifests by SherlockSalvatore · Pull Request #90 · RealZST/HarnessKit

SherlockSalvatore · 2026-06-24T04:47:56Z

Closes #89.

Problem

scanner::detect_source infers an extension's source by walking up to the nearest enclosing .git. When an agent home (e.g. ~/.claude) is kept under the user's own dotfiles repo, everything beneath it — plugins cached in ~/.claude/plugins/cache/<marketplace>/..., skills, etc. — is mis-attributed to that backup remote, collapsing many unrelated sources into one bogus group. PR #88 fixed only the symlinked-skill slice; this addresses the root cause for skills and plugins.

Fix

Read each tool's own install manifest as the authoritative source; fall back to detect_source only when there is no manifest entry.

Skills — override with <root>/.skill-lock.json (the skills CLI lockfile), matched by the on-disk folder name (how the CLI keys it), parsed once per lockfile and cached. A real git checkout's commit hash is preserved.
Plugins — PluginEntry gains source_url; the Claude adapter fills it from plugins/known_marketplaces.json (github marketplaces → owner/repo); scan_plugins prefers it over the .git walk. Non-github marketplaces are skipped so we never fabricate a wrong URL. Other adapters pass None (behaviour unchanged).

Testing

cargo test -p hk-core — 531 passed, incl. two new regression tests:
- test_skill_lock_overrides_enclosing_git_source — the lockfile beats a populated enclosing repo; a sibling skill absent from the lock still falls back; the override matches by folder name even when frontmatter name differs.
- test_scan_plugins_attributes_to_marketplace_repo — a plugin is attributed to its marketplace's upstream repo.
cargo clippy -p hk-core — clean for the touched files.

Notes

Stacked on fix(skills): attribute symlinked skills to real source, not enclosing repo #88 (builds on its canonicalize in scan_skill_dir). Until fix(skills): attribute symlinked skills to real source, not enclosing repo #88 merges this PR's range includes fix(skills): attribute symlinked skills to real source, not enclosing repo #88's commit; I'll rebase onto main once fix(skills): attribute symlinked skills to real source, not enclosing repo #88 lands.
Deferred (kept surgical): Codex/Copilot also parse marketplace plugins and can adopt the same source_url resolution later; once manifest resolution is in, the store's git-source backfill can be narrowed.

Store self-heal (existing DBs)

The scanner change fixes attribution going forward, but on an existing DB the git-source backfill had already stamped install_url/pack from the old (wrong) source, and the backfill only runs on install_type IS NULL so it never refreshes them — and deriveExtensionUrl prefers install_url, so the corrected source_json.url stays shadowed.

refresh_stale_git_install_meta (parallel to #88's self-heal) realigns install_type='git' skill/plugin rows whose recorded owner/repo differs from the now-authoritative source_json.url owner/repo, clearing the stale branch/subpath + pack (re-derived by backfill_packs). It compares by pack, not raw URL string, so a legitimate install recorded as …/repo is not churned against the scanner's …/repo.git remote every sync (which would wipe its pinned revision/check state). Verified on a real polluted DB: 13 plugins re-attributed to their marketplaces, same-repo and self-authored skills left untouched.

Known properties / limitations

Self-heal is a one-way DB write. refresh_stale_git_install_meta persists the realignment, so reverting the code does not roll the data back (the old backfill only fires on install_type IS NULL). In practice the only values it overwrites are ones whose owner/repo genuinely changed — i.e. stale pollution — and pack-gating leaves same-repo rows (incl. .git-suffix and case-only variants) untouched, so a legitimately pinned revision is not lost.
Comparison is by owner/repo, host-insensitive. A repo that migrated host but kept the same owner/repo (e.g. gitlab→github) is treated as the same repo and not realigned; the stored URL keeps the old host. This is intentional (it's what avoids churning .git-suffix/case/scheme variants) and harmless for grouping, which keys on owner/repo.
Non-github marketplaces are left to detect_source (no fabricated URL); plugins from such markets under a git-managed home stay mis-attributed. Tracked for follow-up in Derive extension source from the tool's install manifest, not the enclosing .git #89, alongside Codex's per-plugin manifest.

Known interactions

Updating a manifest-sourced skill deploys over its on-disk copies. Once a skills-CLI-installed skill resolves to its real upstream, it becomes update-eligible, and the update path clones that repo and deploy_skills it over every same-named installed copy — including the canonical ~/.agents/skills/<name> that the external skills CLI manages (and any agent home symlinked to it). The files get updated but the CLI's .skill-lock.json hash/commit is not, so the two can drift. This is pre-existing behaviour (the marketplace-attributed sibling copies already triggered the same clone+deploy), widened here because the symlinked copy is now also an entry point. Flagged for maintainer awareness; not changed in this PR.

Decision for maintainer

This PR's blast radius is intentionally bounded: it changes attribution (and a one-time, idempotent, pack-gated DB realignment), and the one potentially destructive interaction — updating an externally-managed skill — is user-triggered (nothing happens until someone clicks Update) and pre-existing (marketplace-attributed copies already cloned + deployed over the same files). So it is safe to ship as-is.

What's left is a policy choice that should be yours, not silently encoded here: should HarnessKit offer Update/Delete on skills that an external tool (the skills CLI, tracked in ~/.agents/.skill-lock.json) manages? Now that those skills resolve to a real upstream, they become update-eligible, and deploy_skill writes through the agent-home symlink onto the CLI-managed ~/.agents/skills/<name>, leaving the CLI's lockfile hash stale.

Two directions, for you to pick before any further change:

A — treat externally-managed (lockfile-present / symlinked) skills as read-only in HarnessKit: show source + Check-Updates info, but gate the destructive Update/Delete with a "managed by the skills CLI" notice. Minimal, safe, hands-off. (My recommendation.)
B — full integration: HarnessKit writes back .skill-lock.json (or shells out to the skills CLI) after an update. Complete, but couples HarnessKit to the external tool's format and lifecycle.

This PR deliberately stays at attribution only. If you favour a direction, I'll do it as a separate follow-up (alongside a small, safe hardening: matching find_skill_in_repo by folder as well as name, so a skill whose frontmatter name differs from its repo folder still updates). Tracked with the manifest follow-ups in #89.

HarnessKit inferred an extension's source by walking up to the nearest enclosing .git, mis-attributing everything under a dotfiles-managed agent home (e.g. ~/.claude) to that backup repo. Read each tool's own install manifest instead, falling back to .git detection only when absent. - Skills: override detect_source with <root>/.skill-lock.json (the skills CLI lockfile), matched by on-disk folder name, cached per lockfile. - Plugins: PluginEntry gains source_url; the Claude adapter fills it from plugins/known_marketplaces.json (github marketplaces -> owner/repo); scan_plugins prefers it over the .git walk. Other adapters unchanged. Regression tests: lockfile beats a populated enclosing repo (and a sibling skill absent from the lock falls back); a plugin is attributed to its marketplace repo. Refs RealZST#89. Builds on RealZST#88 canonicalize in scan_skill_dir.

The scanner now resolves the real source from install manifests, but the git-source backfill only writes install_meta when install_type IS NULL, so a row stamped in an earlier sync (e.g. a plugin first attributed to the enclosing dotfiles repo) keeps its stale install_url. deriveExtensionUrl prefers install_url, so the corrected source_json.url stayed shadowed and the extension lingered in the wrong group. Add refresh_stale_git_install_meta: for install_type='git' skill/plugin rows whose install_url owner/repo differs from the authoritative source_json.url owner/repo, realign install_url/revision and clear the now-stale branch/subpath + pack (re-derived by backfill_packs). Compare by pack, not raw URL string, so a legitimate install recorded as ".../repo" is not churned against the scanner's ".../repo.git" remote every sync (which would wipe its pinned revision and check state). Runs in both sync paths after the symlink heal, before backfill_packs. This is the store half of the manifest source-resolution fix (parallel to the RealZST#88 self-heal).

….md skills Two review findings on the manifest source-resolution change: - refresh_stale_git_install_meta compared pack owner/repo case-sensitively, so a stored `Owner/Repo` vs a scanned `owner/repo` (GitHub is case-insensitive) would churn the row every sync, wiping its pinned revision/check state. Compare lowercased. Add a same-repo case-only regression row that must stay intact. - scan_skill_dir's lockfile key used path.file_name() unconditionally, so a standalone `<name>.md` skill would key on `name.md` and miss its lock entry. Key dir skills by folder name (file_name, dot-safe) and standalone .md by stem.

The web and desktop `update_extension` endpoints validated install_meta and install_type but not kind. The update path clones a repo and deploys it as a skill, so passing a non-skill id ran the skill-update logic on it. This became reachable once manifest source-resolution stamps plugins with install_type='git'. Add the same `is_update_eligible` gate the bulk update path already uses (rejects non-skill kinds), so a plugin id can no longer enter the skill clone+deploy flow.

refresh_stale_git_install_meta realigned a row's install_url to its scanned source whenever the owner/repo packs differed. But a `.git`-inferred source (e.g. an HK-git-installed skill that merely sits under a personal dotfiles repo) is not authoritative — trusting it clobbers the real install_url and wipes the pinned revision, re-attributing the skill to the dotfiles repo. Add `Source::from_manifest`, set only where the source is read from an authoritative install manifest (the skills CLI `.skill-lock.json`, or a plugin marketplace map), and gate the realignment on it so only a manifest-derived source may correct an install record. Inferred sources are left alone.

… CLI Skills installed by the external `skills` CLI live in ~/.agents/skills/<name> and are tracked in ~/.agents/.skill-lock.json (detected via Source::from_manifest). Updating them with HK's own clone+deploy overwrites the CLI-managed copy and leaves the CLI's lockfile hash stale, drifting the two tools apart. Route such a skill's update to `skills update <name> -g -y` (preferring a `skills` binary, falling back to `npx --yes skills`, and falling back to HK's own update when neither is available), so the CLI updates the files AND its own lockfile in one step. Delete stays native — HK's delete is agent/path-granular whereas `skills remove` removes from every agent, and removing a per-agent symlink doesn't touch the canonical copy (no lockfile drift). The CLI syncs the skill to its upstream HEAD, but the deployed files aren't a git checkout, so a rescan would leave install_revision NULL and the row stuck on "update available". After a delegated update, record the upstream HEAD across the skill's same-scope copies so the git-based update check clears the indicator — mirroring how the native path records its captured revision.

RealZST · 2026-06-25T20:21:00Z

Rebased onto main (resolves the #88-squash conflict). Two commits:

fix(store): only manifest-derived sources may correct an install record (432970e)

refresh_stale_git_install_meta realigned install_url whenever the scanned source's pack differed — including a .git-inferred dotfiles repo (an HK-git-installed skill sitting under one), which clobbered the authoritative install_url and wiped its pinned revision. Added Source::from_manifest (true only for lockfile/marketplace sources) and gated the realignment on it.
feat(skills): delegate externally-managed skill updates to the skills CLI (099ad62)
A/B decision → B, via shell-out (HK never touches .skill-lock.json).
- A from_manifest skill update runs skills update <name> -g -y (→ npx --yes skills → HK's own clone+deploy as fallbacks). The CLI updates files and lockfile together, no drift.
- Delete stays native: HK's delete is agent/path-granular; skills remove removes from every agent, and a per-agent symlink isn't the canonical copy.
- HK records the upstream HEAD after a delegated update, else install_revision stays NULL (deployed files aren't a git checkout) and the row sticks on "update available".

Tested web + desktop.

Known limitation

Monorepo sources still update-check by repo HEAD, so every skill in a repo flags "update available" when any one of them changes. The CLI checks per-skill folder hash but has no read-only check command, so delegating the check would couple HK to its registry. The repo-level check over-reports but never hides a real update.

SherlockSalvatore mentioned this pull request Jun 24, 2026

Derive extension source from the tool's install manifest, not the enclosing .git #89

Closed

SherlockSalvatore force-pushed the feat/manifest-source-resolution branch from 61c6665 to d6d3b8d Compare June 24, 2026 09:00

SherlockSalvatore and others added 6 commits June 25, 2026 14:26

RealZST force-pushed the feat/manifest-source-resolution branch from 0b14bca to 099ad62 Compare June 25, 2026 20:02

RealZST merged commit d66a8c8 into RealZST:main Jun 25, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(scanner): derive skill/plugin source from install manifests#90

feat(scanner): derive skill/plugin source from install manifests#90
RealZST merged 6 commits into
RealZST:mainfrom
SherlockSalvatore:feat/manifest-source-resolution

SherlockSalvatore commented Jun 24, 2026 •

edited

Loading

Uh oh!

RealZST commented Jun 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

SherlockSalvatore commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Testing

Notes

Store self-heal (existing DBs)

Known properties / limitations

Known interactions

Decision for maintainer

Uh oh!

RealZST commented Jun 25, 2026

Known limitation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

SherlockSalvatore commented Jun 24, 2026 •

edited

Loading