test: add extraction regression workflow #1029
|
Good idea for a regression workflow: having CI catch silent extraction breakage is useful. A few fixes needed:
Missing trailing newline
The file ends without a newline after the last
Add a
Without a
on:
  pull_request:
    paths:
      - 'graphify/**'
      - 'tests/**'
      - 'pyproject.toml'
  workflow_dispatch:
The graphify CLI expects graphify extract fixture
(Visualization is skipped automatically when there's no display.)
Consistency with the uv migration in #885
If #885 lands first, the CI stack will be
|
Ensure the graph contains nodes in the regression test.
|
Thanks for the detailed feedback, I’ve updated the workflow accordingly:
I left the install step unchanged for now since the repo still appears to be in transition toward the |
|
Thanks for thinking about end-to-end regression coverage; it's a real gap in current CI. However, the workflow as written will fail on every run for two separate reasons, and a few things need alignment with existing CI conventions before we can merge.
Blocking issues
1.
| | ci.yml (existing) | extraction-regression.yml (this PR) |
|---|---|---|
| Toolchain | uv via astral-sh/setup-uv@v8.1.0 | bare pip install -e |
| Python | matrix 3.10 + 3.12 | single 3.12 only |
| Checkout | actions/checkout@v6 | actions/checkout@v4 |
Please align with the existing conventions: use uv, add Python 3.10 to the matrix (the most common failure surface), and update the checkout action.
Suggestion: fold into ci.yml instead of a new file
Rather than a separate workflow, consider adding one extra step to the existing CI job:
- name: End-to-end extraction smoke test
  run: |
    uv run graphify update tests/fixtures
    python -c "
    import json
    data = json.load(open('tests/fixtures/graphify-out/graph.json'))
    nodes = data.get('nodes', [])
    assert len(nodes) > 0, 'graph has no nodes'
    assert any(n.get('kind') == 'function' for n in nodes), 'no function nodes'
    assert len(data.get('edges', [])) > 0, 'graph has no edges'
    print(f'OK: {len(nodes)} nodes, {len(data[\"edges\"])} edges')
    "
This reuses the existing toolchain, runs on both Python versions, uses the already-maintained tests/fixtures/ corpus (which has Go, Rust, C#, Python etc.), and catches real regressions rather than just checking that graph.json exists.
Strengthening the assertions
len(nodes) > 0 will pass on a structurally broken graph. Better checks:
- At least one node with kind == "function" or "class"
- At least one edge (catches regressions that drop all edges)
- Required schema keys present (id, label, kind, source_file)
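The three checks above could be collected into one small helper, roughly as sketched below. This is only an illustration: the function names `check_graph` and `check_graph_file` are hypothetical, and the sketch assumes graph.json is a JSON object with top-level `nodes` and `edges` lists whose nodes are plain objects, which this thread implies but does not confirm.

```python
import json
from pathlib import Path

# Keys named in the review; treating them as mandatory on every node is an assumption.
REQUIRED_NODE_KEYS = {"id", "label", "kind", "source_file"}


def check_graph(data: dict) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    nodes = data.get("nodes", [])
    edges = data.get("edges", [])

    if not nodes:
        problems.append("graph has no nodes")
    if not edges:
        problems.append("graph has no edges")
    # At least one node must be a function or a class.
    if not any(n.get("kind") in ("function", "class") for n in nodes):
        problems.append("no function or class nodes")

    # Every node must carry the required schema keys.
    for i, node in enumerate(nodes):
        missing = REQUIRED_NODE_KEYS - node.keys()
        if missing:
            problems.append(f"node {i} is missing keys: {sorted(missing)}")
    return problems


def check_graph_file(path: str) -> list[str]:
    """Load a graph.json file (hypothetical helper) and run the checks on it."""
    return check_graph(json.loads(Path(path).read_text(encoding="utf-8")))
```

In a CI step this might be called as `check_graph_file('tests/fixtures/graphify-out/graph.json')`, failing the job when the returned list is non-empty. Collecting all problems before failing, instead of stopping at the first assert, would make a broken run easier to diagnose from a single log.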
Please fix the two blocking issues (wrong command + wrong output path) and align with CI conventions. If you go the folded-into-ci.yml route we can merge that quickly. Happy to re-review once updated — the intent here is good and worth getting right. Thanks!
Summary
Adds a lightweight end-to-end extraction regression workflow that:
Why
Current CI validates installation and tests, but does not explicitly verify minimal graph extraction behavior end-to-end.
This workflow helps catch silent extraction regressions while remaining lightweight and fully local-first.
Scope
Intentionally small and low-risk: