35 changes: 33 additions & 2 deletions CHANGELOG.md
@@ -9,6 +9,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Recent Releases

**v0.1.87 (May 15, 2026)** - Documentation: Framework Comparisons & `llms.txt`
Documentation release adding three "MassGen vs ..." comparison pages (CrewAI, LangGraph, AutoGen/AG2), a curated `llms.txt` index plus full-corpus `llms-full.txt` dump (per [llmstxt.org](https://llmstxt.org) spec), and small README/landing-page pointers so AI agents and crawlers can discover the docs. Also ships a one-line `refine=False` fix for the `bootstrap_subagent` discriminator that was being shadowed by the orchestrator's default `max_new_answers_per_agent`.

**v0.1.86 (May 13, 2026)** - `bootstrap_subagent` Discriminator + Codex MCP Approval Fix
Variant B (`criteria_mode: bootstrap_subagent`) is now functional: the orchestrator runs an in-process critic between rounds, merges critic-proposed criteria into the accumulator, and augments the next round's checklist. This release also fixes Codex MCP tool calls under `codex exec` by writing the approval bypasses needed for non-interactive runs.

@@ -18,8 +21,36 @@ New `orchestrator.coordination.criteria_mode` option lets evaluation criteria em
**v0.1.84 (May 8, 2026)** - TUI Consensus Map
A compact visual map below the agent status ribbon during multi-agent runs. Shows agent nodes with latest answer labels, vote arrows, current vote leader, winner state, and waiting/working indicators — driven by existing coordination events without backend schema changes. Hidden on welcome and single-agent runs.

**v0.1.83 (May 1, 2026)** - In-Session Standalone Checkpoint MCP Integration
The standalone checkpoint MCP server can now be exposed *inside* a normal MassGen run via a new `coordination.standalone_checkpoint` config block, giving single-agent sessions access to the richer `init` + `checkpoint` tools backed by their own reviewer team. Enhanced checkpoint tool card visualization separates primary operations from system tasks.
---

## [0.1.87] - 2026-05-15

### Added
- **Framework Comparison Pages** ([#1094](https://github.com/massgen/MassGen/pull/1094)): Three new "MassGen vs ..." pages under `docs/source/reference/comparisons/` — `crewai.rst`, `langgraph.rst`, `autogen.rst`. Each page positions MassGen's parallel-refinement-with-voting model against the target framework's coordination shape and lists when to reach for one versus the other
- **`llms.txt` Index** ([#1094](https://github.com/massgen/MassGen/pull/1094)): Curated [llmstxt.org](https://llmstxt.org)-spec index published at the docs site root via Sphinx `html_extra_path` (`docs/source/_extra/llms.txt`) — gives AI agents a small, hand-picked map of the docs
- **`llms-full.txt` Corpus** ([#1094](https://github.com/massgen/MassGen/pull/1094)): Concatenated full-docs dump (~1 MB across 59 files), generated by a Sphinx `build-finished` hook in `docs/source/conf.py` and shipped alongside `llms.txt` for crawlers that want the complete corpus
- **Docs Landing Page Update** ([#1094](https://github.com/massgen/MassGen/pull/1094)): "How Does MassGen Compare?" section on `docs/source/index.rst` now lists all four comparisons (LLM Council + the three new ones), with the parent `docs/source/reference/comparisons.rst` losing its "coming soon" note and gaining a toctree
- **README Pointers** ([#1094](https://github.com/massgen/MassGen/pull/1094)): One-line pointers in `README.md` (and synced `README_PYPI.md`) directing AI agents to `llms.txt` / `llms-full.txt`

### Fixed
- **`bootstrap_subagent` Discriminator Single-Shot** ([#1094](https://github.com/massgen/MassGen/pull/1094)): `Orchestrator._run_bootstrap_discriminator_step` now passes `refine=False` to `SubagentManager.spawn_subagent`. This is the canonical single-shot knob that `SubagentManager` actually respects at the orchestrator level — without it, the orchestrator's `max_new_answers_per_agent: 3` default shadowed the coordination-dict overrides, letting the discriminator refine instead of single-shot. Found via live log inspection (`log_20260513_095921_816676`)
- `massgen/orchestrator.py:1298` — `refine=False` added to `spawn_subagent` call
- `massgen/tests/test_bootstrap_criteria.py` — new assertion that `discriminator must pass refine=False to spawn_subagent for single-shot`
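
The new assertion can be pictured with a small mock-based check. This is a sketch under stated assumptions, not the test in `massgen/tests/test_bootstrap_criteria.py`: the `Orchestrator` class below is a hypothetical stand-in that models only the discriminator step, the `task` keyword is invented for illustration, and the call is shown synchronously for brevity.

```python
from unittest.mock import MagicMock


class Orchestrator:
    """Hypothetical stand-in; only the discriminator step is modelled."""

    def __init__(self, subagent_manager):
        self.subagent_manager = subagent_manager

    def _run_bootstrap_discriminator_step(self, task):
        # refine=False asks the subagent manager for a single-shot critic run.
        return self.subagent_manager.spawn_subagent(task=task, refine=False)


# The mock records how spawn_subagent was called without running a real subagent.
manager = MagicMock()
Orchestrator(manager)._run_bootstrap_discriminator_step("critique the current answers")

_, kwargs = manager.spawn_subagent.call_args
assert kwargs.get("refine") is False, (
    "discriminator must pass refine=False to spawn_subagent for single-shot"
)
```

Asserting on the keyword argument, rather than on downstream behaviour, pins the fix to the call site named in the changelog entry.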

### Documentations, Configurations and Resources
- **Comparison pages**: `docs/source/reference/comparisons/{crewai,langgraph,autogen}.rst`
- **Sphinx `build-finished` hook**: `docs/source/conf.py` — generates `llms-full.txt` from the source tree at build time
- **README pointers**: `README.md`, `README_PYPI.md` — AI agents are directed to `llms.txt` / `llms-full.txt`
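
For readers unfamiliar with the format, the llmstxt.org layout is a Markdown file with an H1 title, a blockquote summary, and H2 sections of annotated links. The sketch below renders that shape. The helper `render_llms_txt`, the section name, and the link descriptions are illustrative assumptions, and the project's `llms.txt` is described as hand-curated rather than generated this way.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt-style index: H1, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Illustrative content only; the section name and notes are made up.
example = render_llms_txt(
    "MassGen",
    "Multi-agent system built around parallel refinement with voting.",
    {
        "Reference": [
            (
                "Framework comparisons",
                "https://docs.massgen.ai/en/latest/reference/comparisons.html",
                "MassGen vs CrewAI, LangGraph, AutoGen/AG2",
            ),
        ],
    },
)
```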

### Notes
- Originally-planned Image/Video Edit Capabilities ([#959](https://github.com/massgen/MassGen/issues/959)) and Discriminative Criteria Refinements deferred to v0.1.88.
- Closes [#1082](https://github.com/massgen/MassGen/issues/1082) (publish `llms.txt` + `llms-full.txt`) and [#1083](https://github.com/massgen/MassGen/issues/1083) (CrewAI / LangGraph / AutoGen comparison pages).

### Technical Details
- **Major Focus**: Make MassGen discoverable to AI agents and crawlers, and give human readers structured "MassGen vs ..." comparisons against the three frameworks most often asked about
- **PRs Merged**: [#1094](https://github.com/massgen/MassGen/pull/1094)
- **Issues Closed**: [#1082](https://github.com/massgen/MassGen/issues/1082), [#1083](https://github.com/massgen/MassGen/issues/1083)
- **Contributors**: @ncrispino, @HenryQi and the MassGen team

---

8 changes: 4 additions & 4 deletions CONTRIBUTING.md
@@ -359,7 +359,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README.

## 🔧 Development Workflow

> **Important**: Our next version is v0.1.87. If you want to contribute, please contribute to the `dev/v0.1.87` branch (or `main` if dev/v0.1.87 doesn't exist yet).
> **Important**: Our next version is v0.1.88. If you want to contribute, please contribute to the `dev/v0.1.88` branch (or `main` if dev/v0.1.88 doesn't exist yet).

### 1. Create Feature Branch

@@ -368,7 +368,7 @@ Create a `.env` file in the `massgen` directory as described in [README](README.
git fetch upstream

# Create feature branch from the current dev branch (or main if the dev branch doesn't exist yet)
git checkout -b feature/your-feature-name upstream/dev/v0.1.87
git checkout -b feature/your-feature-name upstream/dev/v0.1.88
```

### 2. Make Your Changes
@@ -507,7 +507,7 @@ git push origin feature/your-feature-name
```

Then create a pull request on GitHub:
- Base branch: `dev/v0.1.87` (or `main` if dev branch doesn't exist yet)
- Base branch: `dev/v0.1.88` (or `main` if dev branch doesn't exist yet)
- Compare branch: `feature/your-feature-name`
- Add clear description of changes
- Link any related issues
@@ -617,7 +617,7 @@ Have a significant feature idea not covered by existing tracks?
- [ ] Tests pass locally
- [ ] Documentation is updated if needed
- [ ] Commit messages follow convention
- [ ] PR targets `dev/v0.1.87` branch (or `main` if dev branch doesn't exist yet)
- [ ] PR targets `dev/v0.1.88` branch (or `main` if dev branch doesn't exist yet)

### PR Description Should Include

53 changes: 29 additions & 24 deletions README.md
@@ -122,15 +122,15 @@ This project started with the "threads of thought" and "iterative refinement" id
<details open>
<summary><h3>🗺️ Roadmap</h3></summary>

- [Recent Achievements (v0.1.86)](#recent-achievements-v0186)
- [Previous Achievements (v0.0.3 - v0.1.85)](#previous-achievements-v003---v0185)
- [Recent Achievements (v0.1.87)](#recent-achievements-v0187)
- [Previous Achievements (v0.0.3 - v0.1.86)](#previous-achievements-v003---v0186)
- [Key Future Enhancements](#key-future-enhancements)
- Bug Fixes & Backend Improvements
- Advanced Agent Collaboration
- Expanded Model, Tool & Agent Integrations
- Improved Performance & Scalability
- Enhanced Developer Experience
- [v0.1.87 Roadmap](#v0187-roadmap)
- [v0.1.88 Roadmap](#v0188-roadmap)
</details>

<details open>
@@ -155,19 +155,20 @@ This project started with the "threads of thought" and "iterative refinement" id

---

## 🆕 Latest Features (v0.1.86)
## 🆕 Latest Features (v0.1.87)

**🎉 Released: May 13, 2026**
**🎉 Released: May 15, 2026**

**What's New in v0.1.86:**
- **🧠 `bootstrap_subagent` Discriminator** - `orchestrator.coordination.criteria_mode: bootstrap_subagent` now runs a dedicated between-rounds LLM critic that proposes criteria from the current answers, merges them into the accumulator, and augments the next round's checklist automatically.
- **🧹 Session-End Criteria Drain** - Late stdio JSONL criteria emissions are drained before final presentation so they are not stranded after the last checklist resolution pass.
- **🛠️ Codex MCP Approval Fix** - Codex workspaces now include the non-interactive approval bypasses needed for external MCP tools such as `submit_checklist`, `create_task_plan`, `new_answer`, and `read_media`.
**What's New in v0.1.87:**
- **📚 Framework Comparison Pages** - Three new "MassGen vs ..." pages — CrewAI, LangGraph, AutoGen/AG2 — under `docs/source/reference/comparisons/`, positioning MassGen's parallel-refinement-with-voting model against each framework's coordination shape.
- **🤖 `llms.txt` for AI Agents** - A curated [`llms.txt`](https://docs.massgen.ai/en/latest/llms.txt) index plus a full-corpus [`llms-full.txt`](https://docs.massgen.ai/en/latest/llms-full.txt) dump (per [llmstxt.org spec](https://llmstxt.org)), so AI agents and crawlers can discover MassGen's docs cleanly.
- **🔧 `bootstrap_subagent` Single-Shot Fix** - `Orchestrator._run_bootstrap_discriminator_step` now passes `refine=False` to `spawn_subagent` — the canonical knob `SubagentManager` actually respects at the orchestrator level.

**Try v0.1.86 Features:**
**Try v0.1.87 Features:**
```bash
pip install massgen==0.1.86
uv run massgen --config massgen/configs/coordination/bootstrap_subagent_criteria.yaml "Create an SVG of an AI agent coding."
pip install massgen==0.1.87
# Read the framework comparisons:
# https://docs.massgen.ai/en/latest/reference/comparisons.html
```

→ [See full release history and examples](massgen/configs/README.md#release-history--examples)
@@ -218,6 +219,8 @@ This collaborative approach ensures that the final output leverages collective i
---

> 📖 **Complete Documentation:** For comprehensive guides, API reference, and detailed examples, visit **[MassGen Official Documentation](https://docs.massgen.ai/)**
>
> 🤖 **For AI agents:** A curated [`llms.txt`](https://docs.massgen.ai/en/latest/llms.txt) index and full [`llms-full.txt`](https://docs.massgen.ai/en/latest/llms-full.txt) dump are published with the docs ([llmstxt.org spec](https://llmstxt.org)).

---

@@ -1239,19 +1242,21 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

⚠️ **Early Stage Notice:** As MassGen is in active development, please expect upcoming breaking architecture changes as we continue to refine and improve the system.

### Recent Achievements (v0.1.86)
### Recent Achievements (v0.1.87)

**🎉 Released: May 15, 2026**

**🎉 Released: May 13, 2026**
#### Documentation: Framework Comparisons & `llms.txt`
- **Framework Comparison Pages** ([#1094](https://github.com/massgen/MassGen/pull/1094)): Three new "MassGen vs ..." pages — `crewai.rst`, `langgraph.rst`, `autogen.rst` — under `docs/source/reference/comparisons/`, positioning MassGen against each framework's coordination shape and listing when to reach for one versus the other
- **`llms.txt` Index** ([#1094](https://github.com/massgen/MassGen/pull/1094)): Curated [llmstxt.org](https://llmstxt.org)-spec index published at the docs root via Sphinx `html_extra_path`
- **`llms-full.txt` Corpus** ([#1094](https://github.com/massgen/MassGen/pull/1094)): Concatenated full-docs dump (~1 MB, 59 files), generated by a Sphinx `build-finished` hook in `conf.py`
- **Landing Page Update** ([#1094](https://github.com/massgen/MassGen/pull/1094)): "How Does MassGen Compare?" now lists all four comparisons; parent `comparisons.rst` drops "coming soon" and gains a toctree
- **README Pointer**: One-line pointer to `llms.txt` / `llms-full.txt` for AI agents/crawlers
- **`bootstrap_subagent` Single-Shot Fix** ([#1094](https://github.com/massgen/MassGen/pull/1094)): `_run_bootstrap_discriminator_step` now passes `refine=False` to `spawn_subagent` — without it the orchestrator's `max_new_answers_per_agent: 3` default shadowed the coordination-dict overrides and let the discriminator refine instead of running single-shot

#### `bootstrap_subagent` Discriminator + Codex MCP Approval Fix
- **`bootstrap_subagent` Variant (fully functional)**: A dedicated between-rounds LLM critic now reads the task and each agent's latest answer, emits `proposed_criteria` as JSON, and merges them into `bootstrap_criteria_accumulator.json` for the next round's checklist
- **Answer-Snapshot Gate**: The discriminator runs once per unique answer snapshot, avoiding repeated critiques when the answer set has not changed
- **Session-End Drain**: Late stdio criteria emissions are captured before final presentation
- **Codex MCP Approval Fix**: Non-interactive Codex workspaces now write both `approval_policy = "never"` and per-MCP-server `default_tools_approval_mode = "approve"`, preventing external MCP tools from being cancelled immediately under `codex exec`
- **Example Configs**: `massgen/configs/coordination/bootstrap_subagent_criteria.yaml` for the critic-driven path and `bootstrap_inline_criteria.yaml` for agent-proposed criteria
- **Tests**: Bootstrap criteria coverage expanded to 35 tests, plus Codex workspace approval policy coverage across approval modes
### Previous Achievements (v0.0.3 - v0.1.86)

### Previous Achievements (v0.0.3 - v0.1.85)
✅ **`bootstrap_subagent` Discriminator + Codex MCP Approval Fix (v0.1.86)**: Variant B is now functional — the orchestrator runs an in-process LLM critic between rounds, merges critic-proposed criteria into the accumulator, and augments the next round's checklist. Codex MCP tool calls under `codex exec` now write both approval bypasses needed for non-interactive runs.

✅ **Discriminative Criteria Emergence (`criteria_mode`) (v0.1.85)**: New `orchestrator.coordination.criteria_mode` lets evaluation criteria emerge from observed gaps across rounds. `bootstrap_inline` is fully functional on all backends with checklist tool support, with `proposed_criteria` persisted, deduped, capped, and merged into the next round's effective checklist.

@@ -1568,9 +1573,9 @@ MassGen is currently in its foundational stage, with a focus on parallel, asynch

We welcome community contributions to achieve these goals.

### v0.1.87 Roadmap
### v0.1.88 Roadmap

Version 0.1.87 picks up the multimodal work deferred from v0.1.86 and continues refinement of the discriminative criteria pipeline:
Version 0.1.88 picks up the multimodal work deferred from v0.1.86/v0.1.87 and continues refinement of the discriminative criteria pipeline:

#### Planned Features
- **Image/Video Edit Capabilities** ([#959](https://github.com/massgen/MassGen/issues/959)): Image and video editing across providers with multi-turn editing workflows via continuation IDs