Skip to content

feat: erdos672 assembly, Codex enrichment, sweep & cleanup#75

Open
kavanaghpatrick wants to merge 7 commits intomainfrom
feat/math-forge
Open

feat: erdos672 assembly, Codex enrichment, sweep & cleanup#75
kavanaghpatrick wants to merge 7 commits intomainfrom
feat/math-forge

Conversation

@kavanaghpatrick
Copy link
Copy Markdown
Owner

Summary

  • Sweep 15 new open problems with parallel Codex orchestrator and audit backlog
  • Auto-enrich Codex prompts with formal statements, dead ends, and context summaries
  • Erdos672 sub1 compiled (SeqA∩SeqB ⊂ {v≡1 mod 3080}), full assembly submitted to Aristotle (slot734)
  • Gitignore Rust target/ and remove accidentally tracked build artifacts

Test plan

  • Submission log updated with new entries
  • Rust build artifacts removed from tracking
  • **/target/ added to .gitignore
  • All commits pushed to remote

🤖 Generated with Claude Code

claude added 7 commits March 15, 2026 07:08
…backlog

- Sweep: scanned formal-conjectures, identified 15 fresh open problems
  (Erdos 208/233/236/247/258/263/283/375/680/826/942/1004/1073/274,
  Zariski cancellation), all verified OPEN via erdosproblems.com
- Submitted all 15 bare-gap sketches to Aristotle (slots 717-731)
- Downloaded and audited 22 unprocessed Aristotle results:
  7 PARTIAL_PROGRESS, 15 INFRASTRUCTURE_ONLY, 0 gaps resolved
- Orchestrator: parallel Codex workers (--max-codex-parallel N),
  no more false RESOLVED on wrong-theorem proofs,
  smarter crash recovery (compiled → Aristotle, not re-run)
- Enqueued all problems to both Codex and Aristotle pipelines
- Near-misses (2 sorry) prioritized with Aristotle results as context

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d context summaries

- resolve_formal_statement(): auto-lookup theorem from formal-conjectures repo
- get_failed_approaches(): check DB for known dead ends before each attempt
- _summarize_lean_context(): truncate large proof bodies, keep signatures + sorry locations
- Codex timeout increased from 5min to 1hr for deep reasoning on hard problems
- Fix IndexError when first Codex iteration produces empty output
- Fix sorry counter using count_sorries() instead of raw string count
- Re-read DB row in parallel worker reaper to avoid stale data
- Codex-only loop: re-queue with own results as context instead of Aristotle handoff

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…maining

Codex (50 min reasoning, xhigh effort) produced 231-line proof that:
- Defines SeqA (u²=2v²-1) and SeqB (d²=3v²-2) Pell sequences
- Proves periodicity of both sequences mod 3080 (= 5*7*8*11)
- Shows SeqA residues mod 3080 ∩ SeqB residues mod 3080 = {1}
- Compiled clean against Mathlib (22-minute build, exit code 0)

Remaining sorry: strengthen v≡1 mod 3080 to v=1 (growth rate argument).
Submitted to Aristotle as slot 732 with the compiled proof as context.

Also: fix Codex subprocess invocation (direct list, no shell quoting)
Also: increase build timeout awareness (300s was too short for this proof)

Sub-problem decomposition results:
- erdos672_sub1: COMPILED, 1 sorry (this commit)
- erdos672_sub2: RESOLVED, 0 sorry (consecutive squares impossible)
- erdos672_sub3: RESOLVED, 0 sorry (assembly with axioms)
- erdos283_squares: RESOLVED, 0 sorry (Egyptian fractions p(x)=x²)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ontext

Bundled all 4 proof components as context:
- slot703 (1148 lines Pell infrastructure, 0 sorry)
- sub1 candidate (231 lines, v≡1 mod 3080, 1 sorry)
- sub2 (54 lines, consecutive squares impossible, 0 sorry)
- sub3 (assembly with axioms, 0 sorry)

Aristotle's task: close the one remaining sorry (v≡1 mod 3080 → v=1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rust build artifacts (binaries, rlibs, fingerprints) were accidentally
tracked. Add **/target/ to .gitignore and remove from history.
Also includes pending a375071 rust refactor and submission log updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Autonomous Proof Loop section (Codex engine + proof orchestrator
with state machine diagram), stats banner, /codex-prove skill, 8 missing
mk CLI commands, codex_proofs/proof_queue tables, Hooks & Guardrails
section. Fix phantom scripts/math_forge.py reference, correct Codex role
from screening tool to proof generation engine, update repo structure to
reflect actual layout (Rust projects, Lean 4 build infra, plugin hooks).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
30-agent audit found systemic mislabeling in tracking.db:
- arithmetic_sums target_resolved=1 corrected to 0 (trivial falsification)
- iata_rtw deleted (airline ticketing rules, not math)
- 14 Leinster entries: gap_targeting → formalization (all known results)
- erdos350/370/427: sketches literally say "SOLVED" yet were gap_targeting
- erdos36/137/erdos_straus: known bounds/partial results → formalization
- wolstenholme/odd_perfect/regular_primes/pell_primes: known theorems
- agoh_giuga 1183,1187: known equivalences with proof strategies
- sierpinski_5n 1125: Selfridge 1962 result
- hardy_littlewood_2: → likely_false (Richards 1974 conditional disproof)
- Remove erdos396_rust/target/ build artifacts from tracking

Honest scoreboard: 0 open problems resolved in 1,157 submissions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants