Merged
Changes from 17 commits
21 commits
49d55d9
feat(sources): add GitHub source provider
NicholaiVogel May 24, 2026
f0d379b
fix(sources): harden GitHub source comments
NicholaiVogel May 24, 2026
6b1a42f
fix(sources): reject raw GitHub tokens
NicholaiVogel May 24, 2026
6156c65
fix(sources): paginate GitHub fetches
NicholaiVogel May 24, 2026
feee6ff
fix(sources): fail discussion comment GraphQL errors
NicholaiVogel May 24, 2026
ede330e
fix(sources): preserve GitHub doc path separators
NicholaiVogel May 24, 2026
f3a2d52
fix(sources): constrain GitHub doc globs
NicholaiVogel May 24, 2026
625d33b
feat(sources): surface GitHub setup in dashboard
NicholaiVogel May 24, 2026
dd60749
fix(sources): bound GitHub discussion scans
NicholaiVogel May 24, 2026
1fb8a81
fix(sources): surface GitHub source setup honestly
NicholaiVogel May 24, 2026
c877ce1
fix(sources): keep GitHub repo purge scoped
NicholaiVogel May 24, 2026
385db08
fix(sources): accept GitHub pull responses without labels
NicholaiVogel May 24, 2026
c59885c
fix(sources): clear GitHub request timeouts
NicholaiVogel May 24, 2026
e0d13b0
fix(sources): clear recovered GitHub failures
NicholaiVogel May 24, 2026
40c6275
fix(sources): scan filtered GitHub discussions safely
NicholaiVogel May 24, 2026
4fec2f9
fix(sources): constrain GitHub repo purge prefix
NicholaiVogel May 24, 2026
274df5d
fix(sources): hydrate labeled GitHub pulls
NicholaiVogel May 24, 2026
3aa115d
fix(sources): keep GitHub failure artifacts distinct
NicholaiVogel May 24, 2026
1f58121
fix(sources): enforce GitHub source item cap
NicholaiVogel May 24, 2026
6ab5626
fix(sources): track GitHub comment purge paths
NicholaiVogel May 24, 2026
fc39c33
fix(sources): paginate GitHub discussion comments
NicholaiVogel May 24, 2026
31 changes: 30 additions & 1 deletion docs/SOURCES.md
@@ -10,7 +10,7 @@ Sources

Sources are external knowledge bases that Signet can read, index, and recall from without turning them into ordinary saved memories.

Sources currently support **Obsidian** vaults and **Discord** guilds. Point Signet at an Obsidian vault and the daemon mounts that vault as a read-only knowledge base: Markdown files become searchable artifacts, the vault structure becomes graph topology, and heading-aware chunks participate in semantic recall. Add Discord with a bot-token secret reference and Signet indexes guild topology, channels, threads, members, message windows, and Discord metadata through the same source-owned artifact lifecycle.
Sources currently support **Obsidian** vaults, **Discord** guilds, and **GitHub** repositories. Point Signet at an Obsidian vault and the daemon mounts that vault as a read-only knowledge base: Markdown files become searchable artifacts, the vault structure becomes graph topology, and heading-aware chunks participate in semantic recall. Add Discord with a bot-token secret reference and Signet indexes guild topology, channels, threads, members, message windows, and Discord metadata through the same source-owned artifact lifecycle. Add GitHub repositories to index issues, pull requests, discussions, selected Markdown docs, comments, and source failure artifacts through the shared source provider pipeline.

The important rule is simple: **the source stays canonical**. Signet reads from the vault. It does not edit notes, rewrite frontmatter, create files, or move anything inside the source directory.

@@ -94,6 +94,34 @@ artifacts under the synthetic `@me` guild by default; use
`--include-local-discord` only when intentionally moving that private local
cache data.

GitHub v1
---------

GitHub Sources v1 indexes configured repositories through the shared Sources job pipeline:

```bash
signet sources add github --repo Signet-AI/signetai --name "Signet GitHub"
signet sources add github --repo Signet-AI/signetai --token-ref GITHUB_TOKEN --resource-type issues --resource-type discussions
signet sources add github --repo Signet-AI/* --resource-type docs --doc-path "docs/**/*.md" --max-items 50
signet sources list
signet sources remove github:...
```

Without `--token-ref`, GitHub sources default to REST-fetchable resources:
issues, pull requests, and selected Markdown docs. Discussions use the GitHub
GraphQL API and require a token reference. Tokens must be stored in Signet
Secrets or an external secret reference; Signet does not store raw GitHub
tokens in source config.
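
The raw-token rejection described above can be sketched as a simple pattern check. This is an illustration only, not Signet's actual validator: `looksLikeRawGitHubToken` is a hypothetical name, and the pattern is an assumption built from GitHub's documented token prefixes (`ghp_`, `gho_`, `ghu_`, `ghs_`, `ghr_`, `github_pat_`).

```typescript
// Hypothetical helper, not part of Signet's API.
// Matches values that contain something shaped like a GitHub token,
// including tokens wrapped in "Bearer ..." or "Authorization: token ...".
const RAW_GITHUB_TOKEN_PATTERN =
  /\b(?:gh[pousr]_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,})/;

function looksLikeRawGitHubToken(value: string): boolean {
  return RAW_GITHUB_TOKEN_PATTERN.test(value);
}

// A secret name such as "GITHUB_TOKEN" does not match and would be treated
// as a reference; a pasted token does match and would be rejected.
console.log(looksLikeRawGitHubToken("GITHUB_TOKEN"));
console.log(looksLikeRawGitHubToken(`ghp_${"a".repeat(36)}`));
```

Checking for a token shape anywhere in the string, rather than only at the start, is what lets a sketch like this catch header-style values as well as bare tokens.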

GitHub source config is bounded by `maxItemsPerRepo`. Repo globs, issue/PR
fetches, discussion fetches, and wildcard docs paths all honor configured caps.
Direct docs paths are limited to Markdown paths or Markdown globs, so GitHub v1
does not become arbitrary source-code indexing by accident.
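
As a rough sketch of the two bounds above, the checks could look like the following. The helper names are hypothetical, the 1 to 10000 range is taken from the error message in this PR's tests, and treating only `.md` as Markdown is an assumption; the real validator may accept other extensions or apply further rules.

```typescript
// Hypothetical helpers, not Signet's actual implementation.
const MAX_ITEMS_UPPER_BOUND = 10000; // from the test error message in this PR

function isValidMaxItemsPerRepo(value: number): boolean {
  return Number.isInteger(value) && value >= 1 && value <= MAX_ITEMS_UPPER_BOUND;
}

function isMarkdownDocPath(path: string): boolean {
  // Reject absolute paths and parent-directory traversal (an assumption).
  if (path.startsWith("/") || path.split("/").indexOf("..") !== -1) return false;
  // Accept direct Markdown files and globs that end in ".md".
  return /\.md$/i.test(path);
}

console.log(isMarkdownDocPath("docs/**/*.md")); // Markdown glob
console.log(isMarkdownDocPath("src/daemon.ts")); // source file, not Markdown
```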

Partial GitHub failures are written as source-owned failure artifacts and cause
the shared source job to report failure instead of silently marking incomplete
data as fully indexed.
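
One way to picture the partial-failure rule: successful fetches still count toward indexed rows, but any failure flips the job status. The types, status strings, and function name below are hypothetical and only illustrate the behavior described here.

```typescript
// Hypothetical shapes, not Signet's actual job types.
type FetchOutcome =
  | { ok: true; resource: string; indexed: number }
  | { ok: false; resource: string; error: string };

interface JobSummary {
  status: "completed" | "failed";
  indexed: number;
  failures: string[];
}

function summarizeJob(outcomes: FetchOutcome[]): JobSummary {
  let indexed = 0;
  const failures: string[] = [];
  for (const outcome of outcomes) {
    if (outcome.ok) {
      indexed += outcome.indexed; // successfully indexed rows are kept
    } else {
      failures.push(`${outcome.resource}: ${outcome.error}`); // one entry per failure artifact
    }
  }
  // Any failure makes the whole job report failure.
  return { status: failures.length > 0 ? "failed" : "completed", indexed, failures };
}
```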

Obsidian v1
-----------

@@ -228,6 +256,7 @@ The daemon exposes the Sources lifecycle under `/api/sources`:
| `GET` | `/api/sources` | List configured sources. |
| `POST` | `/api/sources/obsidian` | Add/update an Obsidian vault source and index it. |
| `POST` | `/api/sources/discord` | Add/update a Discord source and queue a shared source index job. |
| `POST` | `/api/sources/github` | Add/update a GitHub source and queue a shared source index job. |
| `DELETE` | `/api/sources/:sourceId` | Remove a source config and purge Signet-owned source rows. |
| `POST` | `/api/sources/pick-directory` | Development/browser fallback for choosing a local directory. |

47 changes: 46 additions & 1 deletion docs/api/documents-sources.md
@@ -135,7 +135,7 @@ the document are soft-deleted one at a time with audit history.

Sources connect read-only external knowledge bases to Signet recall without
turning them into ordinary saved memories. Supported source kinds are
`obsidian` and `discord`.
`obsidian`, `discord`, and `github`.

### GET /api/sources

@@ -267,6 +267,51 @@ windows, attachments, mentions, embeds, polls, checkpoints, and import stats.
Cache imports are observational and never reconcile deletes from missing or
evicted local cache files.

### POST /api/sources/github

Add or update a GitHub source and queue a shared source index job. Without a
token reference, GitHub sources default to issues, pull requests, and selected
Markdown docs. Discussions require `tokenRef` because they use the GitHub
GraphQL API. Raw GitHub tokens are rejected; pass a Signet secret name or
external secret reference instead.

**Request body**

```json
{
  "repos": ["Signet-AI/signetai"],
  "tokenRef": "GITHUB_TOKEN",
  "name": "Signet GitHub",
  "resourceTypes": ["issues", "pulls", "discussions", "docs"],
  "state": "all",
  "includeComments": true,
  "labels": ["bug", "needs review"],
  "docPaths": ["README.md", "docs/**/*.md"],
  "maxItemsPerRepo": 500
}
```

`repo` is accepted as a single-repository alias. `docPaths` are limited to
Markdown files or Markdown globs so GitHub source indexing stays focused on
chosen docs instead of broad source-code ingestion.

**Response**

```json
{
  "source": { "id": "github:abc123", "kind": "github" },
  "created": true,
  "indexed": 0,
  "queued": true,
  "job": { "status": "queued", "sourceId": "github:abc123" }
}
```

The sync path indexes source-owned artifacts for issues, pull requests,
discussions, selected Markdown docs, comments, and partial-failure artifacts.
Partial GitHub failures cause the shared source job to report failure while
preserving source-owned rows that were indexed successfully.
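
For callers, the request documented above could be assembled along these lines. This is a sketch under assumptions: the interface mirrors the example request body, `buildAddGitHubSourceRequest` is a hypothetical helper, and the daemon base URL is passed in by the caller because its address is not stated here.

```typescript
// Field names mirror the documented request body; the types are assumptions.
interface GitHubSourceRequestBody {
  repos: string[];
  tokenRef?: string;
  name?: string;
  resourceTypes?: Array<"issues" | "pulls" | "discussions" | "docs">;
  state?: string;
  includeComments?: boolean;
  labels?: string[];
  docPaths?: string[];
  maxItemsPerRepo?: number;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildAddGitHubSourceRequest(baseUrl: string, body: GitHubSourceRequestBody): PreparedRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/sources/github`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

The prepared `url` and `init` can then be handed to `fetch(url, init)`. Keeping construction separate from sending makes the payload easy to inspect before it reaches the daemon.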

### DELETE /api/sources/:sourceId

Remove a source config and purge Signet-owned source artifacts, graph rows,
11 changes: 11 additions & 0 deletions platform/core/src/index.ts
@@ -224,25 +224,36 @@ export type {
} from "./workspace-source-repo";
export {
  addDiscordSource,
  addGitHubSource,
  addObsidianSource,
  DEFAULT_DISCORD_DESKTOP_CACHE_PATH,
  DEFAULT_DISCORD_MAX_MESSAGES_PER_CHANNEL,
  DEFAULT_GITHUB_DOC_PATHS,
  DEFAULT_GITHUB_MAX_ITEMS_PER_REPO,
  DEFAULT_GITHUB_RESOURCE_TYPES,
  DEFAULT_GITHUB_RESOURCE_TYPES_NO_TOKEN,
  DEFAULT_OBSIDIAN_EXCLUDE_GLOBS,
  MAX_DISCORD_MAX_MESSAGES_PER_CHANNEL,
  MAX_GITHUB_MAX_ITEMS_PER_REPO,
  getAgentsDir,
  getSourcesConfigPath,
  loadSourcesConfig,
  markSourceIndexed,
  parseDiscordSettings,
  parseGitHubSettings,
  removeSource,
  saveSourcesConfig,
} from "./sources-config";
export type {
  AddDiscordSourceInput,
  AddGitHubSourceInput,
  AddObsidianSourceInput,
  AddSourceResult,
  DiscordSourceSettings,
  DiscordSourceSyncMode,
  GitHubSourceResourceType,
  GitHubSourceSettings,
  GitHubSourceState,
  RemoveSourceResult,
  SignetSourceEntry,
  SignetSourceKind,
117 changes: 117 additions & 0 deletions platform/core/src/sources-config.test.ts
@@ -4,13 +4,16 @@ import { tmpdir } from "node:os";
import { join } from "node:path";
import {
  DEFAULT_DISCORD_MAX_MESSAGES_PER_CHANNEL,
  DEFAULT_GITHUB_RESOURCE_TYPES_NO_TOKEN,
  DEFAULT_OBSIDIAN_EXCLUDE_GLOBS,
  addDiscordSource,
  addGitHubSource,
  addObsidianSource,
  getSourcesConfigPath,
  loadSourcesConfig,
  markSourceIndexed,
  parseDiscordSettings,
  parseGitHubSettings,
  removeSource,
} from "./sources-config";

@@ -278,6 +281,120 @@ describe("sources-config", () => {
});
});

  it("adds a GitHub source with validated provider settings", () => {
    const agentsDir = tmp();

    const result = addGitHubSource(
      {
        repos: ["Signet-AI/signetai", "Signet-AI/signetai"],
        tokenRef: "GITHUB_TOKEN",
        name: "Signet GitHub",
        resourceTypes: ["issues", "pulls", "discussions", "docs"],
        state: "open",
        labels: ["bug", "needs review", "bug"],
        docPaths: ["README.md", "docs/**/*.md"],
        maxItemsPerRepo: 25,
        now: "2026-01-02T00:00:00.000Z",
      },
      agentsDir,
    );

    expect(result.ok).toBe(true);
    if (result.ok === false) throw new Error(result.error);
    expect(result.source.kind).toBe("github");
    expect(result.source.root).toBe("github://repos/Signet-AI/signetai");
    expect(result.source.providerSettings).toEqual({
      repos: ["Signet-AI/signetai"],
      tokenRef: "GITHUB_TOKEN",
      resourceTypes: ["issues", "pulls", "discussions", "docs"],
      state: "open",
      includeComments: true,
      labels: ["bug", "needs review"],
      docPaths: ["README.md", "docs/**/*.md"],
      maxItemsPerRepo: 25,
    });
  });
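
The test above expects duplicate repos and labels to collapse and repo patterns to follow `owner/repo` or `owner/*`. A minimal sketch of that normalization is shown here; `dedupe` and `isValidRepoPattern` are hypothetical helpers, and the allowed character set in the pattern is an assumption rather than the rule used by `addGitHubSource`.

```typescript
// Hypothetical helpers illustrating the normalization the test expects.
const REPO_PATTERN = /^[A-Za-z0-9_.-]+\/(?:\*|[A-Za-z0-9_.-]+)$/;

function isValidRepoPattern(pattern: string): boolean {
  return REPO_PATTERN.test(pattern);
}

function dedupe(values: string[]): string[] {
  // Trim, drop empty entries, and keep the first occurrence of each value.
  const cleaned = values.map((value) => value.trim()).filter((value) => value.length > 0);
  return Array.from(new Set(cleaned));
}

console.log(dedupe(["bug", "needs review", "bug"]));
console.log(isValidRepoPattern("Signet-AI/*"));
```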

  it("defaults GitHub sources without tokenRef to REST-fetchable resources", () => {
    const result = addGitHubSource({ repos: ["Signet-AI/signetai"] }, tmp());

    expect(result.ok).toBe(true);
    if (result.ok === false) throw new Error(result.error);
    expect(parseGitHubSettings(result.source.providerSettings).resourceTypes).toEqual([
      ...DEFAULT_GITHUB_RESOURCE_TYPES_NO_TOKEN,
    ]);
  });
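
The defaulting rule this test exercises can be sketched as follows. The function and constant names are hypothetical; the no-token default follows the docs in this PR (issues, pull requests, docs), while the with-token default of all four types is an assumption, since the value of `DEFAULT_GITHUB_RESOURCE_TYPES` is not shown in the diff.

```typescript
type GitHubResourceType = "issues" | "pulls" | "discussions" | "docs";

// Assumed values; the real constants live in sources-config.
const ASSUMED_DEFAULT_WITH_TOKEN: readonly GitHubResourceType[] = ["issues", "pulls", "discussions", "docs"];
const ASSUMED_DEFAULT_NO_TOKEN: readonly GitHubResourceType[] = ["issues", "pulls", "docs"];

type ResolvedTypes =
  | { ok: true; resourceTypes: GitHubResourceType[] }
  | { ok: false; error: string };

// Hypothetical helper: picks defaults and enforces the discussions rule.
function resolveResourceTypes(
  requested: GitHubResourceType[] | undefined,
  tokenRef: string | undefined,
): ResolvedTypes {
  const hasToken = tokenRef !== undefined && tokenRef.trim().length > 0;
  const resourceTypes = requested ?? [...(hasToken ? ASSUMED_DEFAULT_WITH_TOKEN : ASSUMED_DEFAULT_NO_TOKEN)];
  if (!hasToken && resourceTypes.indexOf("discussions") !== -1) {
    return {
      ok: false,
      error: "GitHub discussions require tokenRef because they use the GitHub GraphQL API",
    };
  }
  return { ok: true, resourceTypes: [...resourceTypes] };
}
```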

  it("preserves GitHub settings on partial update", () => {
    const agentsDir = tmp();
    const first = addGitHubSource(
      {
        repos: ["Signet-AI/signetai"],
        tokenRef: "GITHUB_TOKEN",
        resourceTypes: ["issues", "discussions"],
        labels: ["reviewed"],
        docPaths: ["docs/API.md"],
        maxItemsPerRepo: 12,
        now: "2026-01-01T00:00:00.000Z",
      },
      agentsDir,
    );
    const second = addGitHubSource(
      { repos: ["Signet-AI/signetai"], name: "Renamed", now: "2026-01-02T00:00:00.000Z" },
      agentsDir,
    );

    expect(first.ok).toBe(true);
    expect(second.ok).toBe(true);
    if (second.ok === false) throw new Error(second.error);
    expect(second.created).toBe(false);
    expect(second.source.name).toBe("Renamed");
    expect(parseGitHubSettings(second.source.providerSettings)).toMatchObject({
      tokenRef: "GITHUB_TOKEN",
      resourceTypes: ["issues", "discussions"],
      labels: ["reviewed"],
      docPaths: ["docs/API.md"],
      maxItemsPerRepo: 12,
    });
    expect(loadSourcesConfig(agentsDir).sources).toHaveLength(1);
  });
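
The partial-update behavior checked above amounts to "fields omitted from the update keep their stored value". A generic sketch of such a merge is shown below; `mergeSettings` and `ExampleSettings` are hypothetical and are not how `addGitHubSource` is necessarily implemented.

```typescript
// Hypothetical merge: only keys with a defined value in `update` overwrite `existing`.
function mergeSettings<T extends object>(existing: T, update: Partial<T>): T {
  const merged: T = { ...existing };
  for (const key of Object.keys(update) as Array<keyof T>) {
    const value = update[key];
    if (value !== undefined) {
      merged[key] = value as T[keyof T];
    }
  }
  return merged;
}

// Example shape for illustration only.
interface ExampleSettings {
  name: string;
  tokenRef: string;
  labels: string[];
  maxItemsPerRepo: number;
}

const stored: ExampleSettings = {
  name: "Signet GitHub",
  tokenRef: "GITHUB_TOKEN",
  labels: ["reviewed"],
  maxItemsPerRepo: 12,
};

// Only `name` changes; every other field keeps its stored value.
const renamed = mergeSettings(stored, { name: "Renamed" });
console.log(renamed.name, renamed.maxItemsPerRepo);
```

Skipping `undefined` values matters here: a plain object spread would let an explicitly undefined field erase a stored setting.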

  it("rejects invalid GitHub source boundaries", () => {
    const agentsDir = tmp();

    expect(addGitHubSource({ repos: [] }, agentsDir)).toEqual({
      ok: false,
      error: "At least one GitHub repo pattern is required",
    });
    expect(addGitHubSource({ repos: ["not-a-repo"] }, agentsDir)).toEqual({
      ok: false,
      error: "Invalid GitHub repo pattern: not-a-repo. Expected owner/repo or owner/*",
    });
    expect(addGitHubSource({ repos: ["Signet-AI/signetai"], resourceTypes: ["discussions"] }, agentsDir)).toEqual({
      ok: false,
      error: "GitHub discussions require tokenRef because they use the GitHub GraphQL API",
    });
    for (const tokenRef of [
      `ghp_${"a".repeat(36)}`,
      `github_pat_${"b".repeat(60)}`,
      `Bearer ghp_${"c".repeat(36)}`,
      `Authorization: token ghp_${"d".repeat(36)}`,
    ]) {
      expect(addGitHubSource({ repos: ["Signet-AI/signetai"], tokenRef }, agentsDir)).toEqual({
        ok: false,
        error: "GitHub tokenRef must be a secret reference, not a raw token",
      });
    }
    expect(addGitHubSource({ repos: ["Signet-AI/signetai"], docPaths: ["src/daemon.ts"] }, agentsDir)).toEqual({
      ok: false,
      error: "Invalid GitHub docPaths: src/daemon.ts",
    });
    expect(addGitHubSource({ repos: ["Signet-AI/signetai"], maxItemsPerRepo: 0 }, agentsDir)).toEqual({
      ok: false,
      error: "GitHub maxItemsPerRepo must be an integer between 1 and 10000",
    });
  });

it("round-trips provider-neutral source settings for future adapters", () => {
const agentsDir = tmp();
const source = {