feat(core): implement parseMemory with structured section extraction#747
Conversation
Replace the stub parseMemory that returned `{ raw: markdown }` with
a real parser that extracts User Profile, Key Facts, Ongoing Context,
and MANUAL notes from the memory markdown format generated by
generateMemory(). Adds a typed ParsedMemory interface and exports it
from the core package.
|
Hi @Alexi5000 - I'm taking a look at the feature work in This comment is updated in place by pr-reviewer. |
PR-Reviewer-Ant
left a comment
There was a problem hiding this comment.
Review metadata
- Reviewer: pr-reviewer
- Model:
gpt-5.5 - Commit:
d92a828f
The implementation does not match the PR's stated parsing behavior for the generated memory template. In particular, parseMemory(generateMemory()) returns placeholder/comment content instead of empty section fields, and ongoingContext absorbs the manual-note block.
Confidence: High [sufficient_diff_evidence, targeted_context_included] - The changed parser reads every non-heading line after ## Ongoing Context until EOF, while generateMemory() places the manual block after that heading with no following ## heading. The PR test plan also explicitly expects parseMemory(generateMemory()) to return empty-string fields, which this implementation cannot do because it preserves the default placeholder lines as section content.
| const headingMatch = line.match(/^##\s+(.+)/); | ||
| if (headingMatch) { | ||
| // Flush previous section | ||
| if (currentSection !== null) { |
There was a problem hiding this comment.
This keeps collecting lines after ## Ongoing Context until another ## heading appears, but generateMemory() puts the <!-- MANUAL:START --> ... <!-- MANUAL:END --> block after Ongoing Context without a new heading. As a result, parseMemory(generateMemory()).ongoingContext includes the manual-marker comments, while manualNotes separately extracts the placeholder comment. That diverges from the PR's claim that the generated template parses into empty fields and gives consumers polluted section content.
|
|
||
| // Extract manual notes block | ||
| const manualMatch = markdown.match( | ||
| /<!--\s*MANUAL:START\s*-->([\s\S]*?)<!--\s*MANUAL:END\s*-->/, |
There was a problem hiding this comment.
The parser returns the generated placeholder text (*No user profile configured yet.*, *No facts stored yet.*, *No ongoing context.*) as real structured data. The PR test plan says parseMemory(generateMemory()) should return empty-string fields when no data is stored yet, so consumers will incorrectly treat the default template as populated memory.
PR-Reviewer-Ant
left a comment
There was a problem hiding this comment.
Review metadata
- Reviewer: pr-reviewer
- Model:
gpt-5.5 - Commit:
4368531a
The incremental changes address the two earlier parser issues: generated placeholders now normalize to empty strings, and the manual notes block no longer bleeds into Ongoing Context. I did not find a correctness, security, or data-integrity issue worth flagging in the current diff.
Confidence: High [sufficient_diff_evidence, targeted_context_included] - The changed code in platform/core/src/memory.ts directly implements the PR-described structured extraction, placeholder normalization, manual block extraction, and raw preservation. The added tests cover the prior open concerns and the stated test-plan cases for generated templates, populated sections, manual notes, and raw round-trip preservation.
Since my last review:
- Fixed or likely addressed: 2
- Still blocking: 0
- Newly found: 0
- Suppressed due to human rebuttal/out-of-scope: 0
PR-Reviewer-Ant
left a comment
There was a problem hiding this comment.
Review metadata
- Reviewer: pr-reviewer
- Model:
gpt-5.5 - Commit:
68806732
I found a correctness bug in the parser: manual-note markdown can still be interpreted as generated memory sections, which means user-authored notes can overwrite structured fields. That directly undercuts the PR's claim that manual notes are extracted separately and kept out of the structured sections.
Confidence: High [sufficient_diff_evidence, targeted_context_included] - The parser resets state at MANUAL:START but does not track that it is inside the manual block, so later lines inside that block still run through the heading parser in platform/core/src/memory.ts. The changed tests cover a plain manual note but not manual notes containing markdown headings.
Since my last review:
- Fixed or likely addressed: 0
- Still blocking: 0
- Newly found: 1
- Suppressed due to human rebuttal/out-of-scope: 0
| if (/^<!--\s*MANUAL:START\s*-->$/.test(line)) { | ||
| flushSection(); | ||
| currentSection = null; | ||
| sectionLines.length = 0; |
There was a problem hiding this comment.
Manual-note content is still parsed for ## headings after MANUAL:START. Because currentSection is set to null but there is no inManualBlock guard until MANUAL:END, a user note like ## User Profile\nmanual text inside the manual block will be captured into sections["User Profile"] and can overwrite the real generated section. That violates the stated separation between manualNotes and the structured fields, and it can corrupt parsed memory data for perfectly reasonable markdown notes. The parser should skip all section parsing while inside the manual block.
PR-Reviewer-Ant
left a comment
There was a problem hiding this comment.
Review metadata
- Reviewer: pr-reviewer
- Model:
gpt-5.5 - Commit:
d11eb172
The new parser matches the stated generateMemory() template contract, preserves raw markdown, strips the generated placeholders, and now correctly ignores headings inside the manual block. I did not find a correctness, security, data integrity, or compatibility issue worth flagging in this diff.
Confidence: High [sufficient_diff_evidence, targeted_context_included] - The changed implementation and tests are fully included. The prior manual-block parsing concern is addressed by the new inManualBlock guard in platform/core/src/memory.ts, and the added regression test covers manual notes containing ## User Profile and ## Key Facts headings.
Since my last review:
- Fixed or likely addressed: 1
- Still blocking: 0
- Newly found: 0
- Suppressed due to human rebuttal/out-of-scope: 0
|
@Alexi5000 nice job here |
Summary
parseMemory()inplatform/core/src/memory.tswhich was a TODO stub returning{ raw: markdown }ParsedMemoryinterface with fields:userProfile,keyFacts,ongoingContext,manualNotes,raw##headings matching the template produced bygenerateMemory()<!-- MANUAL:START -->/<!-- MANUAL:END -->block without including that block inongoingContextParsedMemorytype from the core package indexMotivation
The
parseMemoryfunction was marked as TODO and returned only the raw markdown string. This makes it impossible for consumers to programmatically access individual memory sections without re-parsing. The implementation follows the exact markdown structure defined bygenerateMemory().Test Plan
bun test platform/core/src/__tests__/memory.test.tsbunx biome check platform/core/src/memory.ts platform/core/src/__tests__/memory.test.tsbun run --filter '@signet/core' typecheckbun run --filter '@signet/core' buildparseMemory(generateMemory())returns empty-string fields for each section (no data stored yet)parseMemory(md).raw === mdMANUAL:STARTandMANUAL:ENDare extractedPR Readiness (MANDATORY)
INDEX.md+dependencies.yaml)Migration Notes (if applicable)