Synthesizing informative commercial reports from massive and noisy web sources is critical for high-stakes business decisions. Although current deep research agents achieve notable progress, their reports still remain limited in terms of quality, reliability, and coverage. In this work, we propose Mind2Report, a cognitive deep research agent that emulates the commercial analyst to synthesize expert-level reports. Specifically, it first probes fine-grained intent, then searches web sources and records distilled information on the fly, and subsequently iteratively synthesizes the report. To rigorously evaluate Mind2Report, we further construct QRC-Eval comprising 200 real-world commercial tasks and establish a holistic evaluation strategy to assess report quality, reliability, and coverage.
This repository contains the official code for our paper:
Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis
Full Abstract
Synthesizing informative commercial reports from massive and noisy web sources is critical for high-stakes business decisions. Although current deep research agents achieve notable progress, their reports still remain limited in terms of quality, reliability, and coverage. In this work, we propose Mind2Report, a cognitive deep research agent that emulates the commercial analyst to synthesize expert-level reports. Specifically, it first probes fine-grained intent, then searches web sources and records distilled information on the fly, and subsequently iteratively synthesizes the report. We design Mind2Report as a training-free agentic workflow that augments general large language models (LLMs) with dynamic memory to support these long-form cognitive processes. To rigorously evaluate Mind2Report, we further construct QRC-Eval comprising 200 real-world commercial tasks and establish a holistic evaluation strategy to assess report quality, reliability, and coverage. Experiments demonstrate that Mind2Report outperforms leading baselines, including OpenAI and Gemini deep research agents. Although this is a preliminary study, we expect it to serve as a foundation for advancing the future design of commercial deep research agents.
🚩 News (Jan. 2026): Mind2Report initialized and open-sourced.
Mind2Report simulates how a Commercial Analyst performs cognitive deep research—both follow similar patterns:
- 🎯 Intent clarification & fine-grained probing
- 📝 Note-taking & memory recording
- 🔄 Iterative report synthesis
The challenge lies in navigating the Massive and Noisy Web filled with AI-generated content, advertisements, fake news, and scattered information.
Mind2Report transforms this chaotic landscape into Expert-level Commercial Reports that are ⭐ High-quality, ✅ Reliable, 🔍 Comprehensive, and 🎯 Decision-ready.
The figure illustrates the comprehensive pipeline of the Mind2Report framework, consisting of three core modules:
-
Intent-Driven Outline Formulation — On the left, the system begins with intent clarification to understand user queries, followed by outline search to gather domain knowledge. It then generates a structured chapter tree with both broad summaries (e.g., "H100 market, AI infra. 2025") and concrete thinking directions (e.g., "Llama training, BF16 TFLOPs").
-
Memory-Augmented Adaptive Search — In the center, the core research loop operates recursively. The system performs information distilling from web sources, with a fail-retry mechanism for query expanding when needed. Successfully extracted knowledge is recorded into a knowledge-enriched chapter tree with dynamic memory. The coherent-preserved iterative synthesis phase includes knowledge merging, iterative synthesis, and reference matching.
-
Multi-dimensional Reflection — The system evaluates research quality across four dimensions: search steps efficiency, integrity of information, freshness of sources, and plurality of perspectives. This reflection mechanism ensures the final commercial report meets expert-level standards with proper citations and structured content.
This guide provides step-by-step instructions to set up the environment, configure APIs, and run Mind2Report.
We recommend using Conda to manage dependencies. Ensure you have Python 3.10+ installed.
# Clone the Mind2Report repository from GitHub.
git clone https://github.com/Melmaphother/Mind2Report.git
# Navigate into the cloned repository directory.
cd Mind2Report
# Create a new conda environment.
conda create -n mind2report python=3.10
# Activate the environment.
conda activate mind2report
# Install all dependencies.
pip install -r requirements.txtMind2Report requires configuration for LLM APIs and search tools. Edit the configuration files in src/config/:
LLM Configuration (src/config/llms.toml):
[planner]
api_base = "your-api-base"
api_key = "your-api-key"
model = "Deepseek-R1" # Recommended: reasoning LLM
[basic]
api_base = "your-api-base"
api_key = "your-api-key"
model = "DeepSeek-V3.1"Search Configuration (src/config/search.toml):
[search]
jina_api_key = "your-jina-api-key" # Get from https://jina.ai/
# or
tavily_api_key = "your-tavily-api-key" # Get from https://tavily.com/Start the interactive deep research agent:
python -m src.runThe system will guide you through:
- Enter your research query
- (Optional) Answer clarifying questions for better results
- Wait for the agent to perform deep research
- Receive your comprehensive report in Markdown and HTML formats
We evaluate Mind2Report against baselines across three dimensions: Quality (relevance and structure), Reliability (hallucination, temporality, consistency), and Coverage (breadth and depth). The table compares Mind2Report with:
- Proprietary DRAs: o3 Deep Research, o4-mini Deep Research, Gemini Deep Research, Grok Deep Search, Perplexity Deep Research
- Open-source Training-based DRAs: WebThinker, MiroThinker, Tongyi-DeepResearch
- Open-source Workflow-based DRAs: MiroFlow, OpenManus, OWL
Mind2Report achieves Rank 1.00 (best average rank), with 75.42% relevance, 85.24% structure score, and only 6.12% hallucination rate—significantly outperforming all baselines including commercial systems like o3 and Gemini Deep Research.
The figure shows a sample commercial report generated by Mind2Report on the topic "NVIDIA H100 vs. AMD MI300X: Comparative Analysis for Large-Scale Foundational Model Training in 2025". The report features:
- 📋 Structured Sections: Organized chapters including Industry Overview, Market Landscape, Leading Players Analysis, and Strategic Implementation Guidance
- 📊 Data Tables: SWOT analysis comparing strengths, weaknesses, opportunities, and threats of both platforms
- 📈 Visualizations: Charts comparing real-world computational efficiency (FP8/BF16 TFLOPs) and memory architecture
- 🔗 Proper Citations: All claims are backed by numbered references with source URLs
- 📄 Decision-Ready Content: Actionable insights for infrastructure decision-makers
This repo is built on pioneer works. We appreciate the following GitHub repos and resources:
-
LangGraph - State machine framework for LLM applications
-
Jina AI - Web reading and search APIs
-
Tavily - AI-powered search API
🙋 Please let us know if you find out a mistake or have any suggestions!
🌟 If you find our work helpful, please consider to star this repository and cite our research.
@misc{cheng2026mind2report,
title={Mind2Report: A Cognitive Deep Research Agent for Expert-Level Commercial Report Synthesis},
author={Mingyue Cheng and Daoyu Wang and Qi Liu and Shuo Yu and Xiaoyu Tao and Yuqian Wang and Chengzhong Chu and Yu Duan and Mingkang Long and Enhong Chen},
year={2026},
eprint={2601.04879},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.04879},
}


