Align extraction agent with structured Modelfile JSON contract by Copilot · Pull Request #5 · nmdra/AgentHire

Copilot · 2026-04-21T15:24:21Z

Review requested PR thread feedback and inspect target files
Verify current implementation in extraction schema and retry logic
Enforce strict extraction schema by forbidding extra keys (top-level and nested)
Align extraction prompt contract with nullable nested fields
Remove fragile stop token behavior and safely parse fenced JSON if present
Run targeted extraction tests and full validation (ruff, mypy, pytest)
Commit and reply to the PR comment with commit hash
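
The retry logic mentioned in the checklist is not shown in this thread. As a rough sketch of the general pattern (retry, feeding the previous error back into the prompt), assuming a hypothetical `generate` callable that wraps the model call and a hypothetical `extract_with_retries` helper:

```python
import json
from typing import Callable


def extract_with_retries(
    raw_text: str,
    generate: Callable[[str], str],  # hypothetical wrapper around the model call
    max_attempts: int = 3,
) -> dict:
    """Call the model until it returns a JSON object, feeding back the last error."""
    error = None
    for _ in range(max_attempts):
        if error is None:
            prompt = raw_text
        else:
            # Tell the model what was wrong with its previous answer.
            prompt = (
                f"{raw_text}\n\nYour previous answer was rejected: {error}\n"
                "Return only a corrected JSON object."
            )
        response = generate(prompt)
        try:
            data = json.loads(response)
        except json.JSONDecodeError as exc:
            error = f"invalid JSON: {exc.msg}"
            continue
        if isinstance(data, dict):
            return data
        error = "top-level JSON value must be an object"
    raise ValueError(f"extraction failed after {max_attempts} attempts: {error}")
```

Every failed parse consumes one attempt, which is why a response that can never parse (for example, one truncated by a stop token) exhausts the retry budget without making progress.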

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/3d45db37-3bcc-450a-a084-49377e4795c8 Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/88460f99-0014-43cc-bdb8-7d47c16a02db Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/09a8ad7f-3dd2-49a1-8c0d-d0de3f8c4748 Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Copilot

Pull request overview

Aligns the extraction agent’s output with a more structured JSON contract (nested experience/education, added website/other_details), while keeping downstream agents mocked for the Phase 1 scaffold.

Changes:

Updates extraction validation schema to structured experience/education entries plus website and other_details.
Refactors extraction prompting to use a persona-driven, sectioned prompt layout.
Adjusts extraction + workflow tests to assert the new extraction JSON shape.
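
The persona helper in `app/agents/personas.py` is not reproduced in this thread. A minimal sketch of what a persona-driven, sectioned prompt could look like, where `PersonaSpec`, `build_prompt`, the field names, and the section titles are all illustrative assumptions rather than the PR's actual code:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PersonaSpec:
    """Hypothetical persona description; field names are illustrative."""

    role: str
    goal: str
    rules: list[str] = field(default_factory=list)


def build_prompt(persona: PersonaSpec, schema_hint: str, raw_text: str) -> str:
    """Assemble a prompt from fixed, titled sections in a stable order."""
    rules = "\n".join(f"- {rule}" for rule in persona.rules)
    sections = [
        ("ROLE", persona.role),
        ("GOAL", persona.goal),
        ("RULES", rules),
        ("OUTPUT SCHEMA", schema_hint),
        ("INPUT", raw_text),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections)
```

Keeping the sections in a fixed order makes the prompt easy to assert on in tests and keeps the schema description separate from the candidate text.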

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

File	Description
tests/test_workflow.py	Updates stubbed extraction payload in workflow test to match the new extraction schema.
tests/test_extraction_agent.py	Updates extraction agent tests to expect structured experience/education and new optional fields.
app/tools/validate_extraction.py	Introduces structured Pydantic models for experience/education and extends extraction schema.
app/tools/ollama.py	Adds optional sampling/stop controls to JSON generation requests via Ollama options.
app/agents/personas.py	Adds persona spec + helper to build structured prompts for the extraction agent.
app/agents/extraction_agent.py	Refactors prompt generation to persona-based format and updates Ollama generation parameters.
README.md	Updates workflow documentation to remove human review and model settings that no longer exist.

Copilot · 2026-04-21T16:45:10Z

 class CandidateExtraction(BaseModel):
    """Structured extraction output from extraction agent."""

    name: str | None = Field(default=None)
    email: str | None = Field(default=None)
    phone: str | None = Field(default=None)
+    website: str | None = Field(default=None)
    skills: list[str] = Field(default_factory=list)
-    experience: str | None = Field(default=None)
-    education: str | None = Field(default=None)
+    experience: list[ExperienceEntry] = Field(default_factory=list)
+    education: list[EducationEntry] = Field(default_factory=list)
+    other_details: list[str] = Field(default_factory=list)


CandidateExtraction currently accepts and silently ignores unexpected/hallucinated keys because no Pydantic extra policy is set. Since the prompt/persona contract says the JSON must match the schema exactly, consider forbidding extra keys (and similarly for nested ExperienceEntry/EducationEntry) so validation fails when the model returns additional fields instead of dropping them unnoticed.

Copilot · 2026-04-21T16:45:11Z

+            prompt=_build_extraction_prompt(raw_text, correction_error=error),
            temperature=0.0,
+            top_p=0.1,
+            stop=["```"],
            timeout_seconds=timeout_seconds,


Passing stop=["```"] can truncate responses that contain fenced JSON (e.g., starting with ```json), producing an empty/partial string that will always fail json.loads and consume retries. Consider removing the stop token and instead stripping markdown fences in post-processing, or choosing stop tokens that only match trailing fences so the JSON body is preserved.
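
A minimal sketch of the post-processing alternative: strip an optional markdown fence around the response, then parse it. `parse_json_response` and `FENCE` are hypothetical names, and the pattern only handles a single fence wrapping the whole response:

```python
import json
import re

FENCE = "`" * 3  # the markdown code-fence marker

# Optional language tag after the opening fence, body captured lazily.
_FENCE_RE = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z0-9_-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def parse_json_response(text: str) -> dict:
    """Parse a model response as a JSON object, unwrapping one markdown fence if present."""
    match = _FENCE_RE.match(text)
    body = match.group(1) if match else text
    data = json.loads(body.strip())
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

Unfenced responses go straight to `json.loads`, so the helper is safe to call on every response and no stop token is needed.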

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/9aa47908-a01a-4b1e-a3f4-ec74a9327676 Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

nmdra · 2026-04-21T17:15:17Z

@copilot apply changes based on the comments in this thread

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/db48781c-bc90-4b92-85c8-17294dc3126b Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

feat(extraction): align extracted schema with structured json format

fb428b1

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/3d45db37-3bcc-450a-a084-49377e4795c8 Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Copilot AI assigned Copilot and nmdra Apr 21, 2026

Copilot created this pull request from a session on behalf of nmdra April 21, 2026 15:24

nmdra approved these changes Apr 21, 2026

nmdra marked this pull request as ready for review April 21, 2026 15:25

Copilot started work on behalf of nmdra April 21, 2026 16:00

feat(agents): add persona-driven prompts and non-mocked agent logic

07b1241

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/88460f99-0014-43cc-bdb8-7d47c16a02db Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Copilot finished work on behalf of nmdra April 21, 2026 16:13

Copilot AI requested a review from nmdra April 21, 2026 16:13

Copilot started work on behalf of nmdra April 21, 2026 16:16

refactor: keep extraction persona changes and restore mocked downstream agents

e36c798

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/09a8ad7f-3dd2-49a1-8c0d-d0de3f8c4748 Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Copilot finished work on behalf of nmdra April 21, 2026 16:21

nmdra requested a review from Copilot April 21, 2026 16:40

Copilot started reviewing on behalf of nmdra April 21, 2026 16:41

Copilot started work on behalf of nmdra April 21, 2026 16:42

Copilot finished work on behalf of nmdra April 21, 2026 16:43

Copilot AI reviewed Apr 21, 2026

Copilot started work on behalf of nmdra April 21, 2026 17:07

feat(extraction): add pymupdf4llm for PDF parsing

6ba7a00

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/9aa47908-a01a-4b1e-a3f4-ec74a9327676 Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Copilot finished work on behalf of nmdra April 21, 2026 17:12

Copilot started work on behalf of nmdra April 21, 2026 17:15

Copilot AI and others added 3 commits April 21, 2026 17:17

fix(extraction): apply PR review thread feedback

c688745

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/db48781c-bc90-4b92-85c8-17294dc3126b Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

chore(extraction): add helper docstring and test naming cleanup

9bef4c9

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/db48781c-bc90-4b92-85c8-17294dc3126b Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Changes before error encountered

4b740c1

Agent-Logs-Url: https://github.com/nmdra/AgentHire/sessions/db48781c-bc90-4b92-85c8-17294dc3126b Co-authored-by: nmdra <73674803+nmdra@users.noreply.github.com>

Copilot started work on behalf of nmdra April 21, 2026 17:35

Copilot started work on behalf of nmdra April 21, 2026 17:39

Copilot stopped work on behalf of nmdra due to an error April 21, 2026 17:40
The session was cancelled by the user.

nmdra approved these changes Apr 21, 2026

nmdra merged commit c3d660e into main Apr 21, 2026
0 of 2 checks passed

nmdra merged 7 commits into main from copilot/init-chat-ollama-model
