refactor(sources): add provider-neutral source substrate#757
Conversation
|
Hi @NicholaiVogel - I'm taking a look at the feature work in This comment is updated in place by pr-reviewer. |
PR-Reviewer-Ant
left a comment
There was a problem hiding this comment.
Review metadata
- Reviewer: pr-reviewer
- Model:
gpt-5.5 - Commit:
1961f273
I did not find a concrete correctness, security, or data-integrity issue in the changed surface. The implementation matches the stated stack step: it adds the generic source substrate, keeps Obsidian compatibility paths, and adds cancellation checks around source indexing.
Confidence: Medium [sufficient_diff_evidence, targeted_context_included, missing_runtime_repro] - The diff includes the relevant core migration, source config validation, daemon provider registry, Obsidian indexing path, recall fallback, purge/stat queries, and tests covering generic chunk recall plus legacy chunk cleanup. I did not run the listed Bun test plan or reproduce a live source disconnect/index cancellation path, so runtime cancellation behavior is assessed from code only.
## Summary Reimplements the GitHub source work from #749 on top of the provider-neutral source substrate from #757. GitHub now plugs into the shared Sources provider/job/purge path instead of adding a parallel daemon bridge and poller, and it is exposed from the dashboard Sources setup flow. ## Changes - Adds `github` source config support in `@signet/core` using `providerSettings`, with validation for repo patterns, resource types, token/discussion requirements, Markdown doc paths, labels, state, and per-repo bounds. - Adds a GitHub fetcher for repo expansion, issues, pull requests, discussions, selected Markdown docs, and comments with the #749 hardening carried forward. - Adds a `githubSourceProvider` that writes source-owned `memory_artifacts` with provenance and failure artifacts, purges through shared source-owned purge, and records unmatched wildcard repo patterns as source failures instead of silently indexing nothing. - Adds `POST /api/sources/github`, `signet sources add github`, source listing support, and API/Sources docs. - Surfaces GitHub in the dashboard Sources tab with a setup form for repositories, optional secret reference, resource toggles, state, labels, doc paths, comments, and item cap. - Tightens shared provider source jobs so provider-reported failures mark the job failed instead of silently completing. ## Notes Supersedes #749, which was closed because it conflicted with #757/#759 and used a bespoke GitHub bridge/poller. This PR keeps the useful review hardening from #749 while adopting the shared provider substrate. * feat(sources): add GitHub source provider * fix(sources): harden GitHub source comments * fix(sources): reject raw GitHub tokens * fix(sources): paginate GitHub fetches * fix(sources): fail discussion comment GraphQL errors * fix(sources): preserve GitHub doc path separators * fix(sources): constrain GitHub doc globs * feat(sources): surface GitHub setup in dashboard * fix(sources): bound GitHub discussion scans * fix(sources): surface GitHub source setup honestly * fix(sources): keep GitHub repo purge scoped * fix(sources): accept GitHub pull responses without labels * fix(sources): clear GitHub request timeouts * fix(sources): clear recovered GitHub failures * fix(sources): scan filtered GitHub discussions safely * fix(sources): constrain GitHub repo purge prefix * fix(sources): hydrate labeled GitHub pulls * fix(sources): keep GitHub failure artifacts distinct * fix(sources): enforce GitHub source item cap * fix(sources): track GitHub comment purge paths * fix(sources): paginate GitHub discussion comments
Summary
memory_artifactsforsource_id,source_root, external ids, parent paths, and provider metadata.source_chunkrows while preserving legacysource_obsidian_chunkrecall and purge compatibility.Stack Context
This is PR 1 of the replacement stack after closing #750. Follow-up PRs should migrate more Obsidian behavior behind this provider contract, then add Discord through the shared source path instead of a bespoke Discord pipeline.
Test Plan
bun test platform/core/src/sources-config.test.ts platform/core/src/migrations/migrations.test.tsbun run --filter '@signet/core' buildbun test platform/daemon/src/obsidian-source-embeddings.test.ts platform/daemon/src/native-memory-sources.test.ts platform/daemon/src/routes/sources-routes.test.ts platform/daemon/src/memory-search.test.tsbun run --filter '@signet/daemon' buildgit diff --checkPR Readiness (MANDATORY)
INDEX.md+dependencies.yaml)Migration Notes (if applicable)
Migration 075 only adds nullable provenance columns and indexes to
memory_artifacts. Rollback compatibility is additive: older rows remain readable, and source stats/purge fall back to the existing Obsidian root/path matching whensource_idis absent.