📊 Mind2Report: A Cognitive Deep Research Agent
for Expert-Level Commercial Report Synthesis

📖 Paper | 📊 Datasets | 👀 Code

📖 Abstract

Synthesizing informative commercial reports from massive and noisy web sources is critical for high-stakes business decisions. Although current deep research agents have made notable progress, their reports remain limited in quality, reliability, and coverage. In this work, we propose Mind2Report, a cognitive deep research agent that emulates a commercial analyst to synthesize expert-level reports. Specifically, it first probes fine-grained intent, then searches web sources and records distilled information on the fly, and finally synthesizes the report iteratively. To rigorously evaluate Mind2Report, we further construct QRC-Eval, which comprises 200 real-world commercial tasks, and establish a holistic evaluation strategy to assess report quality, reliability, and coverage.

This repository contains the official code for our paper:

Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis

Full Abstract

Synthesizing informative commercial reports from massive and noisy web sources is critical for high-stakes business decisions. Although current deep research agents have made notable progress, their reports remain limited in quality, reliability, and coverage. In this work, we propose Mind2Report, a cognitive deep research agent that emulates a commercial analyst to synthesize expert-level reports. Specifically, it first probes fine-grained intent, then searches web sources and records distilled information on the fly, and finally synthesizes the report iteratively. We design Mind2Report as a training-free agentic workflow that augments general large language models (LLMs) with dynamic memory to support these long-form cognitive processes. To rigorously evaluate Mind2Report, we further construct QRC-Eval, which comprises 200 real-world commercial tasks, and establish a holistic evaluation strategy to assess report quality, reliability, and coverage. Experiments demonstrate that Mind2Report outperforms leading baselines, including the OpenAI and Gemini deep research agents. Although this is a preliminary study, we expect it to serve as a foundation for the future design of commercial deep research agents.

🎁 Updates/News:

🚩 News (Jan. 2026): Mind2Report initialized and open-sourced.

✨ Motivation

Mind2Report emulates how a commercial analyst performs cognitive deep research. Both follow the same pattern:

🎯 Intent clarification & fine-grained probing
📝 Note-taking & memory recording
🔄 Iterative report synthesis

The challenge lies in navigating the Massive and Noisy Web filled with AI-generated content, advertisements, fake news, and scattered information.

Mind2Report transforms this chaotic landscape into Expert-level Commercial Reports that are ⭐ High-quality, ✅ Reliable, 🔍 Comprehensive, and 🎯 Decision-ready.

🌟 Framework

Figure 2: the proposed Mind2Report.

The figure illustrates the full pipeline of the Mind2Report framework, which consists of three core modules:

Intent-Driven Outline Formulation: On the left, the system first clarifies the user's intent, then runs an outline search to gather domain knowledge. It then generates a structured chapter tree whose nodes carry both a broad summary (e.g., "H100 market, AI infra. 2025") and concrete thinking directions (e.g., "Llama training, BF16 TFLOPs").
Memory-Augmented Adaptive Search: In the center, the core research loop runs recursively. The system distills information from web sources and, when a search fails, retries with an expanded query. Successfully extracted knowledge is recorded in a knowledge-enriched chapter tree that serves as dynamic memory. The coherence-preserving iterative synthesis phase then performs knowledge merging, iterative synthesis, and reference matching.
Multi-dimensional Reflection: The system evaluates research quality along four dimensions (search-step efficiency, information integrity, source freshness, and plurality of perspectives). This reflection mechanism helps ensure that the final commercial report meets expert-level standards, with proper citations and structured content.
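
To make the chapter tree and the fail-retry search loop more concrete, here is a minimal sketch in Python. It is not the repository's implementation: `ChapterNode`, `expand_query`, `research_node`, and the stubbed `fake_search` are hypothetical names introduced only for illustration, and the query-expansion rule is a toy stand-in for whatever the agent actually does.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ChapterNode:
    """One node of a chapter tree (hypothetical structure, not the repo's class)."""
    summary: str                                      # broad summary of the chapter
    directions: list[str]                             # concrete thinking directions
    notes: list[str] = field(default_factory=list)    # distilled knowledge, i.e. the memory
    children: list["ChapterNode"] = field(default_factory=list)


def expand_query(query: str, attempt: int) -> str:
    """Toy query expansion: append a broader suffix on each retry."""
    return f"{query} overview" if attempt == 1 else f"{query} analysis report"


def research_node(
    node: ChapterNode,
    search: Callable[[str], Optional[str]],
    max_retries: int = 2,
) -> None:
    """Search every thinking direction; on failure, retry with an expanded query."""
    for direction in node.directions:
        query = direction
        for attempt in range(max_retries + 1):
            distilled = search(query)
            if distilled:
                node.notes.append(distilled)  # record the note in the node's memory
                break
            query = expand_query(direction, attempt + 1)
    for child in node.children:
        research_node(child, search, max_retries)


def fake_search(query: str) -> Optional[str]:
    """Stub search tool: succeeds only for one expanded query."""
    corpus = {"BF16 TFLOPs overview": "placeholder note about BF16 throughput"}
    return corpus.get(query)


if __name__ == "__main__":
    root = ChapterNode(summary="H100 market, AI infra. 2025", directions=["BF16 TFLOPs"])
    research_node(root, fake_search)
    print(root.notes)
```

In this sketch the first query fails, the expanded query succeeds, and the distilled note ends up attached to the chapter it belongs to, which is the property a later synthesis step would rely on.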

🚀 Quick Start

This guide provides step-by-step instructions to set up the environment, configure APIs, and run Mind2Report.

1. Environment Setup

We recommend using Conda to manage dependencies. Ensure you have Python 3.10+ installed.

# Clone the Mind2Report repository from GitHub.
git clone https://github.com/Melmaphother/Mind2Report.git

# Navigate into the cloned repository directory.
cd Mind2Report

# Create a new conda environment.
conda create -n mind2report python=3.10

# Activate the environment.
conda activate mind2report

# Install all dependencies.
pip install -r requirements.txt
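
As an optional sanity check after activating the environment, the short script below confirms that the interpreter meets the Python 3.10+ requirement stated above. The helper name `python_is_supported` is made up for this example and is not part of the repository.

```python
import sys

MIN_VERSION = (3, 10)  # minimum version stated in this guide


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version is at least 3.10."""
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```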

2. Configuration

Mind2Report requires configuration for LLM APIs and search tools. Edit the configuration files in src/config/:

LLM Configuration (src/config/llms.toml):

[planner]
api_base = "your-api-base"
api_key = "your-api-key"
model = "DeepSeek-R1"  # Recommended: a reasoning LLM

[basic]
api_base = "your-api-base"
api_key = "your-api-key"
model = "DeepSeek-V3.1"

Search Configuration (src/config/search.toml):

[search]
jina_api_key = "your-jina-api-key"  # Get from https://jina.ai/
# or
tavily_api_key = "your-tavily-api-key"  # Get from https://tavily.com/
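
Before launching the agent, it can help to confirm that the placeholders above have been replaced. The following is a minimal sketch of such a check, not something shipped with the repository: the function names are hypothetical, the rule that one real search key is enough is an assumption drawn from the "# or" comment, and on Python 3.10 it assumes the third-party `tomli` backport is installed.

```python
try:
    import tomllib  # standard library from Python 3.11 onward
except ModuleNotFoundError:  # Python 3.10
    import tomli as tomllib  # third-party backport, assumed to be installed

REQUIRED_LLM_KEYS = ("api_base", "api_key", "model")
PLACEHOLDER_PREFIX = "your-"  # prefix used by the sample values above


def check_llm_config(text: str) -> list[str]:
    """Return a list of problems found in the contents of llms.toml."""
    config = tomllib.loads(text)
    problems = []
    for section in ("planner", "basic"):
        table = config.get(section)
        if table is None:
            problems.append(f"missing [{section}] section")
            continue
        for key in REQUIRED_LLM_KEYS:
            value = str(table.get(key, ""))
            if not value:
                problems.append(f"[{section}] {key} is not set")
            elif key != "model" and value.startswith(PLACEHOLDER_PREFIX):
                problems.append(f"[{section}] {key} still holds a placeholder")
    return problems


def check_search_config(text: str) -> list[str]:
    """Assume one real search key (Jina or Tavily) is sufficient."""
    search = tomllib.loads(text).get("search", {})
    usable = [
        key
        for key in ("jina_api_key", "tavily_api_key")
        if search.get(key) and not str(search[key]).startswith(PLACEHOLDER_PREFIX)
    ]
    return [] if usable else ["[search] needs a real jina_api_key or tavily_api_key"]


if __name__ == "__main__":
    # Paths follow the layout described in this section.
    with open("src/config/llms.toml", encoding="utf-8") as handle:
        issues = check_llm_config(handle.read())
    with open("src/config/search.toml", encoding="utf-8") as handle:
        issues += check_search_config(handle.read())
    print("\n".join(issues) if issues else "Configuration looks complete.")
```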

3. Run Mind2Report

Start the interactive deep research agent:

python -m src.run

The system will guide you through:

Enter your research query
(Optional) Answer clarifying questions for better results
Wait for the agent to perform deep research
Receive your comprehensive report in Markdown and HTML formats

💪 Performance

Table 1: Performance Comparison on the QRC-Eval Benchmark

We evaluate Mind2Report against baselines across three dimensions: Quality (relevance and structure), Reliability (hallucination, temporality, consistency), and Coverage (breadth and depth). The table compares Mind2Report with:

Proprietary DRAs: o3 Deep Research, o4-mini Deep Research, Gemini Deep Research, Grok Deep Search, Perplexity Deep Research
Open-source Training-based DRAs: WebThinker, MiroThinker, Tongyi-DeepResearch
Open-source Workflow-based DRAs: MiroFlow, OpenManus, OWL

Mind2Report achieves the best average rank (1.00), with 75.42% relevance, an 85.24% structure score, and a hallucination rate of only 6.12%, outperforming all baselines, including commercial systems such as o3 Deep Research and Gemini Deep Research.
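
For readers unfamiliar with rank-based summaries, the sketch below shows one common way an average rank can be computed across metrics. It is an assumption about how such a score is typically derived, not the paper's evaluation code. The systems and scores are invented toy values, and `average_ranks` is a hypothetical helper. Metrics such as hallucination, where lower is better, are ranked in ascending order.

```python
def average_ranks(scores, lower_is_better=()):
    """Map each system to its mean rank across metrics (1 = best).

    scores: {system: {metric: value}}. Ties are not handled specially.
    """
    systems = list(scores)
    metrics = list(next(iter(scores.values())))
    totals = {system: 0.0 for system in systems}
    for metric in metrics:
        # Higher is better unless the metric is listed in lower_is_better.
        descending = metric not in lower_is_better
        ordered = sorted(systems, key=lambda s: scores[s][metric], reverse=descending)
        for rank, system in enumerate(ordered, start=1):
            totals[system] += rank
    return {system: totals[system] / len(metrics) for system in systems}


# Invented toy scores, used only to exercise the function.
toy_scores = {
    "A": {"relevance": 0.90, "structure": 0.95, "hallucination": 0.10},
    "B": {"relevance": 0.70, "structure": 0.90, "hallucination": 0.20},
    "C": {"relevance": 0.50, "structure": 0.92, "hallucination": 0.30},
}

if __name__ == "__main__":
    print(average_ranks(toy_scores, lower_is_better=("hallucination",)))
```

Under this definition, a system reaches an average rank of 1.00 only if it is ranked first on every metric, as system "A" is in the toy data.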

📋 Example Output

Figure 3: Example of Generated Commercial Report

The figure shows a sample commercial report generated by Mind2Report on the topic "NVIDIA H100 vs. AMD MI300X: Comparative Analysis for Large-Scale Foundational Model Training in 2025". The report features:

📋 Structured Sections: Organized chapters including Industry Overview, Market Landscape, Leading Players Analysis, and Strategic Implementation Guidance
📊 Data Tables: SWOT analysis comparing strengths, weaknesses, opportunities, and threats of both platforms
📈 Visualizations: Charts comparing real-world computational efficiency (FP8/BF16 TFLOPs) and memory architecture
🔗 Proper Citations: All claims are backed by numbered references with source URLs
📄 Decision-Ready Content: Actionable insights for infrastructure decision-makers

🙏 Acknowledgement

This repository builds on pioneering work. We thank the following GitHub repositories and resources:

LangGraph - State machine framework for LLM applications
Jina AI - Web reading and search APIs
Tavily - AI-powered search API

🔖 Citation

🙋 Please let us know if you find a mistake or have any suggestions!

🌟 If you find our work helpful, please consider starring this repository and citing our research.

@misc{cheng2026mind2report,
      title={Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis}, 
      author={Mingyue Cheng and Daoyu Wang and Qi Liu and Shuo Yu and Xiaoyu Tao and Yuqian Wang and Chengzhong Chu and Yu Duan and Mingkang Long and Enhong Chen},
      year={2026},
      eprint={2601.04879},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.04879}, 
}
