feat(core): implement parseMemory with structured section extraction#747

Merged
NicholaiVogel merged 4 commits into
Signet-AI:mainfrom
Alexi5000:pr/implement-parse-memory
May 23, 2026

Conversation

@Alexi5000
Contributor

@Alexi5000 Alexi5000 commented May 21, 2026

Summary

  • Implement parseMemory() in platform/core/src/memory.ts, which was previously a TODO stub returning { raw: markdown }
  • Add a typed ParsedMemory interface with fields: userProfile, keyFacts, ongoingContext, manualNotes, raw
  • Extract content from ## headings matching the template produced by generateMemory()
  • Extract manual notes from the <!-- MANUAL:START --> / <!-- MANUAL:END --> block without including that block in ongoingContext
  • Ignore markdown headings inside manual notes so user-authored notes cannot overwrite structured sections
  • Treat generated template placeholders and the default manual-note placeholder as empty structured values
  • Export ParsedMemory type from the core package index
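For readers who want a picture of the result shape, here is a minimal sketch of what the ParsedMemory interface might look like. The field names come from the summary above; the plain string types and the two helpers (emptyParsedMemory, hasStructuredContent) are assumptions made for illustration and are not taken from the diff.

```typescript
// Field names follow the PR summary; typing every field as `string` is an assumption.
interface ParsedMemory {
  userProfile: string;
  keyFacts: string;
  ongoingContext: string;
  manualNotes: string;
  raw: string; // the original markdown, kept unchanged
}

// Hypothetical helper: the value a consumer might expect for a fresh template,
// where every structured field is empty and only `raw` carries content.
function emptyParsedMemory(raw: string): ParsedMemory {
  return { userProfile: "", keyFacts: "", ongoingContext: "", manualNotes: "", raw };
}

// Hypothetical consumer: decide whether any generated section holds real data.
function hasStructuredContent(memory: ParsedMemory): boolean {
  return [memory.userProfile, memory.keyFacts, memory.ongoingContext].some(
    (section) => section.trim().length > 0,
  );
}
```

With a shape like this, a consumer can branch on individual sections instead of re-parsing raw.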

Motivation

The parseMemory function was marked as TODO and returned only the raw markdown string. That made it impossible for consumers to access individual memory sections programmatically without re-parsing the markdown themselves. The implementation follows the exact markdown structure defined by generateMemory().

Test Plan

  • bun test platform/core/src/__tests__/memory.test.ts
  • bunx biome check platform/core/src/memory.ts platform/core/src/__tests__/memory.test.ts
  • bun run --filter '@signet/core' typecheck
  • bun run --filter '@signet/core' build
  • Verify parseMemory(generateMemory()) returns empty-string fields for each section (no data stored yet)
  • Verify round-trip: parseMemory(md).raw === md
  • Verify parsing a memory file with populated sections returns the correct content
  • Verify manual notes between MANUAL:START and MANUAL:END are extracted
  • Verify markdown headings inside manual notes are not parsed as structured memory sections

PR Readiness (MANDATORY)

  • Spec alignment validated (INDEX.md + dependencies.yaml)
  • Agent scoping verified on all new/changed data queries
  • Input/config validation and bounds checks added
  • Error handling and fallback paths tested (no silent swallow)
  • Security checks applied to admin/mutation endpoints
  • Docs updated for API/spec/status changes
  • Regression tests added for each bug fix
  • Lint/typecheck/tests pass locally

Migration Notes (if applicable)

  • No migrations changed.

Replace the stub parseMemory that returned `{ raw: markdown }` with
a real parser that extracts User Profile, Key Facts, Ongoing Context,
and MANUAL notes from the memory markdown format generated by
generateMemory(). Adds a typed ParsedMemory interface and exports it
from the core package.
@PR-Reviewer-Ant
Collaborator

Hi @Alexi5000 - I'm taking a look at the feature work in feat(core): implement parseMemory with structured section extraction (commit d92a828f) and will follow up shortly.

This comment is updated in place by pr-reviewer.

Collaborator

@PR-Reviewer-Ant PR-Reviewer-Ant left a comment

Review metadata
  • Reviewer: pr-reviewer
  • Model: gpt-5.5
  • Commit: d92a828f

The implementation does not match the PR's stated parsing behavior for the generated memory template. In particular, parseMemory(generateMemory()) returns placeholder/comment content instead of empty section fields, and ongoingContext absorbs the manual-note block.

Confidence: High [sufficient_diff_evidence, targeted_context_included] - The changed parser reads every non-heading line after ## Ongoing Context until EOF, while generateMemory() places the manual block after that heading with no following ## heading. The PR test plan also explicitly expects parseMemory(generateMemory()) to return empty-string fields, which this implementation cannot do because it preserves the default placeholder lines as section content.

Comment thread platform/core/src/memory.ts Outdated
const headingMatch = line.match(/^##\s+(.+)/);
if (headingMatch) {
  // Flush previous section
  if (currentSection !== null) {
Collaborator

This keeps collecting lines after ## Ongoing Context until another ## heading appears, but generateMemory() puts the <!-- MANUAL:START --> ... <!-- MANUAL:END --> block after Ongoing Context without a new heading. As a result, parseMemory(generateMemory()).ongoingContext includes the manual-marker comments, while manualNotes separately extracts the placeholder comment. That diverges from the PR's claim that the generated template parses into empty fields and gives consumers polluted section content.
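The behavior described in this comment can be reproduced with a small sketch. It assumes the tail of the generated template looks like the templateTail string below; the placeholder comment inside the manual block is hypothetical, and collectSectionsNaively is an illustrative stand-in rather than the code under review.

```typescript
// Assumed tail of the generated template: the manual block follows
// "## Ongoing Context" with no "##" heading in between.
const templateTail: string = [
  "## Ongoing Context",
  "",
  "*No ongoing context.*",
  "",
  "<!-- MANUAL:START -->",
  "<!-- Add your notes here -->", // hypothetical placeholder comment
  "<!-- MANUAL:END -->",
].join("\n");

// Naive collector: a section only ends when the next "##" heading appears.
function collectSectionsNaively(markdown: string): Map<string, string> {
  const sections = new Map<string, string>();
  let current: string | null = null;
  let lines: string[] = [];

  const flush = (): void => {
    if (current !== null) sections.set(current, lines.join("\n").trim());
  };

  for (const line of markdown.split("\n")) {
    const heading = line.match(/^##\s+(.+)/);
    if (heading) {
      flush();
      current = (heading[1] ?? "").trim();
      lines = [];
    } else if (current !== null) {
      lines.push(line);
    }
  }
  flush(); // the last section runs to end of input
  return sections;
}

const naiveOngoing: string = collectSectionsNaively(templateTail).get("Ongoing Context") ?? "";
console.log(naiveOngoing.includes("MANUAL:START")); // true: the markers leak into the section
```

Because nothing but a heading ends a section here, everything up to end of input is attributed to Ongoing Context, which is the pollution the comment points out.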

Comment thread platform/core/src/memory.ts Outdated

// Extract manual notes block
const manualMatch = markdown.match(
  /<!--\s*MANUAL:START\s*-->([\s\S]*?)<!--\s*MANUAL:END\s*-->/,
Collaborator

The parser returns the generated placeholder text (*No user profile configured yet.*, *No facts stored yet.*, *No ongoing context.*) as real structured data. The PR test plan says parseMemory(generateMemory()) should return empty-string fields when no data is stored yet, so consumers will incorrectly treat the default template as populated memory.
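One way to get the empty-string behavior the test plan asks for is to normalize section content against the known placeholders. The sketch below uses the three placeholder strings quoted in this comment; the helper name normalizeSection and the exact-match strategy are assumptions, not necessarily what the PR ended up doing.

```typescript
// Placeholder strings as quoted in the review comment.
const GENERATED_PLACEHOLDERS: ReadonlySet<string> = new Set([
  "*No user profile configured yet.*",
  "*No facts stored yet.*",
  "*No ongoing context.*",
]);

// Hypothetical helper: treat a section that contains only a generated
// placeholder as empty, and otherwise return the trimmed content.
function normalizeSection(content: string): string {
  const trimmed = content.trim();
  return GENERATED_PLACEHOLDERS.has(trimmed) ? "" : trimmed;
}

console.log(JSON.stringify(normalizeSection("\n*No facts stored yet.*\n"))); // ""
console.log(JSON.stringify(normalizeSection("Prefers TypeScript"))); // "Prefers TypeScript"
```

Matching the whole trimmed section, rather than searching for the placeholder as a substring, keeps real content intact even when a user happens to quote one of the placeholder sentences.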

Collaborator

@PR-Reviewer-Ant PR-Reviewer-Ant left a comment

Review metadata
  • Reviewer: pr-reviewer
  • Model: gpt-5.5
  • Commit: 4368531a

The incremental changes address the two earlier parser issues: generated placeholders now normalize to empty strings, and the manual notes block no longer bleeds into Ongoing Context. I did not find a correctness, security, or data-integrity issue worth flagging in the current diff.

Confidence: High [sufficient_diff_evidence, targeted_context_included] - The changed code in platform/core/src/memory.ts directly implements the PR-described structured extraction, placeholder normalization, manual block extraction, and raw preservation. The added tests cover the prior open concerns and the stated test-plan cases for generated templates, populated sections, manual notes, and raw round-trip preservation.

Since my last review:

  • Fixed or likely addressed: 2
  • Still blocking: 0
  • Newly found: 0
  • Suppressed due to human rebuttal/out-of-scope: 0

Collaborator

@PR-Reviewer-Ant PR-Reviewer-Ant left a comment

Review metadata
  • Reviewer: pr-reviewer
  • Model: gpt-5.5
  • Commit: 68806732

I found a correctness bug in the parser: manual-note markdown can still be interpreted as generated memory sections, which means user-authored notes can overwrite structured fields. That directly undercuts the PR's claim that manual notes are extracted separately and kept out of the structured sections.

Confidence: High [sufficient_diff_evidence, targeted_context_included] - The parser resets state at MANUAL:START but does not track that it is inside the manual block, so later lines inside that block still run through the heading parser in platform/core/src/memory.ts. The changed tests cover a plain manual note but not manual notes containing markdown headings.

Since my last review:

  • Fixed or likely addressed: 0
  • Still blocking: 0
  • Newly found: 1
  • Suppressed due to human rebuttal/out-of-scope: 0

if (/^<!--\s*MANUAL:START\s*-->$/.test(line)) {
  flushSection();
  currentSection = null;
  sectionLines.length = 0;
Collaborator

Manual-note content is still parsed for ## headings after MANUAL:START. Because currentSection is set to null but there is no inManualBlock guard until MANUAL:END, a user note like `## User Profile\nmanual text` inside the manual block will be captured into sections["User Profile"] and can overwrite the real generated section. That violates the stated separation between manualNotes and the structured fields, and it can corrupt parsed memory data for perfectly reasonable markdown notes. The parser should skip all section parsing while inside the manual block.
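The suggested guard could look roughly like the sketch below. Only the names flushSection, currentSection, sectionLines and inManualBlock appear in this thread; the function name, the return shape and the surrounding loop are assumptions made for illustration.

```typescript
// Hypothetical line-by-line parser that ignores headings inside the manual block.
function parseSectionsWithManualGuard(markdown: string): {
  sections: Record<string, string>;
  manualNotes: string;
} {
  const sections: Record<string, string> = {};
  const manualLines: string[] = [];
  let currentSection: string | null = null;
  let sectionLines: string[] = [];
  let inManualBlock = false;

  const flushSection = (): void => {
    if (currentSection !== null) {
      sections[currentSection] = sectionLines.join("\n").trim();
    }
    currentSection = null;
    sectionLines = [];
  };

  for (const line of markdown.split("\n")) {
    const trimmed = line.trim();
    if (/^<!--\s*MANUAL:START\s*-->$/.test(trimmed)) {
      flushSection(); // close the section that precedes the manual block
      inManualBlock = true;
      continue;
    }
    if (/^<!--\s*MANUAL:END\s*-->$/.test(trimmed)) {
      inManualBlock = false;
      continue;
    }
    if (inManualBlock) {
      manualLines.push(line); // headings in here are plain note text
      continue;
    }
    const heading = line.match(/^##\s+(.+)/);
    if (heading) {
      flushSection();
      currentSection = (heading[1] ?? "").trim();
    } else if (currentSection !== null) {
      sectionLines.push(line);
    }
  }
  flushSection();
  return { sections, manualNotes: manualLines.join("\n").trim() };
}

const guardedSample: string = [
  "## User Profile",
  "Real profile",
  "",
  "## Ongoing Context",
  "Working on the parser",
  "<!-- MANUAL:START -->",
  "## User Profile",
  "manual text",
  "<!-- MANUAL:END -->",
].join("\n");

const guardedResult = parseSectionsWithManualGuard(guardedSample);
console.log(guardedResult.sections["User Profile"]); // Real profile
```

Checking inManualBlock before the heading match is the important ordering: the heading inside the note is appended to manualNotes and never reaches the section map.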

Collaborator

@PR-Reviewer-Ant PR-Reviewer-Ant left a comment

Review metadata
  • Reviewer: pr-reviewer
  • Model: gpt-5.5
  • Commit: d11eb172

The new parser matches the stated generateMemory() template contract, preserves raw markdown, strips the generated placeholders, and now correctly ignores headings inside the manual block. I did not find a correctness, security, data integrity, or compatibility issue worth flagging in this diff.

Confidence: High [sufficient_diff_evidence, targeted_context_included] - The changed implementation and tests are fully included. The prior manual-block parsing concern is addressed by the new inManualBlock guard in platform/core/src/memory.ts, and the added regression test covers manual notes containing ## User Profile and ## Key Facts headings.

Since my last review:

  • Fixed or likely addressed: 1
  • Still blocking: 0
  • Newly found: 0
  • Suppressed due to human rebuttal/out-of-scope: 0

@NicholaiVogel
Collaborator

@Alexi5000 nice job here

@NicholaiVogel NicholaiVogel merged commit 02e143b into Signet-AI:main May 23, 2026
13 checks passed

3 participants