DSA4265 Group 10 — Financial Research Multi-Agent System

A multi-agent research assistant for S&P 500 equities. Given a ticker and a research question, the system runs five specialist agents — news, SEC 10-K, peer comparison, market context, and sentiment — through a LangGraph orchestrator, then writes a cited Q&A brief or a full bull/bear investment memo.

Every claim is grounded in a retrieved source (news URL or SEC filing locator). The LLM stack is fully local via Ollama — no external API keys are required for inference.
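The grounding rule can be illustrated with a minimal sketch that separates claims whose cited locator was actually retrieved from those that were not. The `Claim` shape, the `filter_grounded` helper, and the locator strings are hypothetical and used only for illustration; the repository's own citation verifier may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim shape: a statement plus the locator it cites."""
    text: str
    source: str  # news URL or SEC filing locator


def filter_grounded(claims, retrieved_sources):
    """Split claims into (grounded, ungrounded) by exact locator match."""
    known = set(retrieved_sources)
    grounded = [c for c in claims if c.source in known]
    ungrounded = [c for c in claims if c.source not in known]
    return grounded, ungrounded


# Made-up locators, for illustration only.
claims = [
    Claim("Example statement A.", "10-K:AAPL:2024:Item7"),
    Claim("Example statement B.", "https://example.com/unretrieved"),
]
grounded, ungrounded = filter_grounded(claims, ["10-K:AAPL:2024:Item7"])
```

Exact matching is the strictest possible rule; a real verifier might also normalise URLs or check that the quoted passage appears in the source text.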

For interactive architecture diagrams (system overview, LangGraph flow, RAG pipeline, specialist agent details, data schemas, and evaluation harness), open architecture_diagrams.html in a browser.

Architecture Overview

The central LangGraph orchestrator coordinates five specialist agents through a plan → execute → synthesise → output pipeline.

High-Level Flow

A user query enters via the CLI, gets parsed into a ResearchRequest, and flows through intent classification, agent dispatch, sentiment reconciliation, citation verification, and finally answer generation.
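To make the parsing step concrete, here is a minimal sketch of what a ResearchRequest could look like. The field names mirror the CLI flags shown under Main Entrypoints; the validation rules and the `window_start` property are assumptions. The repository defines its contracts with Pydantic, while this sketch uses a standard-library dataclass so it runs without extra dependencies.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ResearchRequest:
    ticker: str
    question: str
    date_anchor: date
    lookback_days: int
    mode: str

    def __post_init__(self):
        if self.mode not in ("qa", "memo"):
            raise ValueError(f"mode must be 'qa' or 'memo', got {self.mode!r}")
        if self.lookback_days <= 0:
            raise ValueError("lookback_days must be positive")
        # Normalise the ticker; frozen dataclasses need object.__setattr__.
        object.__setattr__(self, "ticker", self.ticker.strip().upper())

    @property
    def window_start(self) -> date:
        """Assumed semantics: the window ends at date_anchor."""
        return self.date_anchor - timedelta(days=self.lookback_days)


request = ResearchRequest(
    ticker="aapl",
    question="What recent news has affected Apple's outlook?",
    date_anchor=date(2026, 4, 8),
    lookback_days=30,
    mode="memo",
)
```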

Orchestrator — LangGraph State Machine

The orchestrator classifies the user's question into intents using a deterministic keyword backbone augmented by zero-shot embedding similarity, then routes to the appropriate agents. In QA mode, the router iteratively picks the highest-impact agent per round. In memo mode, all four data-gathering agents are dispatched in parallel before converging at the sentiment and synthesis nodes.
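The two-stage classification described above can be sketched as follows. Every name here is hypothetical: the intent labels, keyword lists, descriptions, and threshold are invented for illustration, and the token-overlap `jaccard` function is only a stand-in for the embedding similarity the orchestrator uses.

```python
import re

# Hypothetical intent names and keyword lists, chosen to mirror the agents.
KEYWORDS = {
    "news": {"news", "headline", "announcement"},
    "sec_10k": {"10-k", "filing", "risk"},
    "peers": {"peer", "peers", "competitor", "competitors"},
    "market": {"market", "macro", "sector"},
}

# One short description per intent, used by the similarity fallback.
DESCRIPTIONS = {
    "news": "recent news events affecting the company",
    "sec_10k": "disclosures in the annual sec filing",
    "peers": "comparison with rival companies",
    "market": "broad market and sector conditions",
}


def tokenize(text):
    return set(re.findall(r"[a-z0-9\-]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for embedding cosine similarity."""
    return len(a & b) / len(a | b) if a | b else 0.0


def classify_intents(question, similarity=jaccard, threshold=0.2):
    tokens = tokenize(question)
    # Deterministic backbone: any keyword hit selects the intent.
    intents = {name for name, words in KEYWORDS.items() if tokens & words}
    # Augmentation: add intents whose description is similar enough.
    for name, description in DESCRIPTIONS.items():
        if similarity(tokens, tokenize(description)) >= threshold:
            intents.add(name)
    return sorted(intents)
```

Keeping the keyword pass separate means a question with an explicit keyword always routes the same way, while the similarity pass only ever adds intents and never removes them.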

Specialist Agents

Each agent runs its own plan-route-execute-evaluate loop over its own data sources and retrieval strategy.
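A minimal sketch of such a loop is shown below. The `run_agent` function, its parameters, and the tool names are hypothetical; the routing rule (take the first untried tool) is a deliberate simplification of whatever ranking the real agents apply.

```python
def run_agent(question, tools, evaluate, max_rounds=3):
    """Hypothetical plan-route-execute-evaluate loop.

    tools: mapping of tool name -> callable(question) returning a list of findings.
    evaluate: callable(findings) -> True when the evidence is sufficient.
    """
    findings, used = [], []
    for _ in range(max_rounds):
        # Plan: list the tools that have not been tried yet.
        remaining = [name for name in tools if name not in used]
        if not remaining:
            break
        # Route: take the first remaining tool (a real router would rank them).
        choice = remaining[0]
        # Execute: call the tool and record its output.
        findings.extend(tools[choice](question))
        used.append(choice)
        # Evaluate: stop early once the evidence is judged sufficient.
        if evaluate(findings):
            break
    return findings, used


tools = {
    "primary": lambda q: ["finding A"],
    "fallback": lambda q: ["finding B", "finding C"],
}
findings, used = run_agent("example question", tools, lambda f: len(f) >= 2)
```

The `max_rounds` cap bounds the cost of a query even when the evaluator is never satisfied.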

Output Modes

QA mode — a 4-section research brief answering the user's question.
Memo mode — a full 11-section investment memo including executive summary, bull/bear debate, risk factors, peer comparison, and recommendation.

Setup

# 1. Editable install (re-run whenever pyproject.toml changes)
pip install -e .

# 2. Copy env template and fill in credentials
cp .env.example .env
# Required: USER_AGENT (SEC EDGAR fair-use), LOCAL_LLM_MODEL
# Optional: FINNHUB_API_KEY, NEWSAPI_API_KEY (news falls back to GDELT otherwise)

# 3. Pull the Ollama models referenced by .env
ollama pull qwen3:14b       # primary agent model
ollama pull deepseek-r1:7b  # evaluator / LLM-as-judge

The SEC Chroma vector store at src/data_sources/regtech_db_v1/ is checked into the repo (indexed with BAAI/bge-m3), so no build step is required before the first run.
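The optional news keys listed in step 2 suggest a simple source-selection rule, sketched below. The `select_news_sources` helper and the source labels are hypothetical, and the sketch assumes GDELT is used only when neither keyed source is configured; the repository may instead fall back per source or per request.

```python
import os


def select_news_sources(env=None):
    """Return the news sources to query, based on which API keys are set."""
    env = os.environ if env is None else env
    sources = []
    if env.get("FINNHUB_API_KEY"):
        sources.append("finnhub")
    if env.get("NEWSAPI_API_KEY"):
        sources.append("newsapi")
    # GDELT needs no key, so it is the fallback when no keyed source is set.
    return sources or ["gdelt"]
```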

Main Entrypoints

1. `ask` — full orchestrator (the main entrypoint)

Runs the five specialists, verifies citations, and writes a QA brief (--mode qa) or a full bull/bear investment memo (--mode memo) to outputs/report/<TICKER>_<hash>_<timestamp>/report.md.

python -m src.app.cli ask \
  --ticker AAPL \
  --question "What recent news has affected Apple's outlook?" \
  --date-anchor 2026-04-08 \
  --lookback-days 30 \
  --mode memo

2. `news` — news specialist in isolation

Prints the raw AgentResult JSON to stdout. Useful for debugging the retrieval and synthesis pipeline without running the full orchestrator.

python -m src.app.cli news \
  --ticker TSLA \
  --question "Why is Tesla under pressure recently?" \
  --date-anchor 2026-04-08 \
  --lookback-days 30

3. `scripts.inspect_last_run` — post-mortem state inspection

Replays the checkpointer for a previous run and dumps every specialist's state, claims, tool calls, and warnings — without re-running the orchestrator.

python -m scripts.inspect_last_run \
  --ticker AAPL \
  --question "What recent news has affected Apple's outlook?" \
  --mode memo
# optional: --history (verbose steps), --json (raw output)

4. `scripts.demo` — canned end-to-end demos

Runs pre-scripted research queries and writes reports. Used in presentations.

python -m scripts.demo                       # both QA + memo cases
python -m scripts.demo --mode memo           # force memo mode
python -m scripts.demo --ticker AAPL         # override ticker

5. `scripts.eval_run` — evaluation harness

Runs the sample queries in config/eval/queries.yaml through the orchestrator and scores each with retrieval metrics and an LLM-as-judge rubric. Writes reports/eval_<stamp>.json and reports/eval_<stamp>.md.

python -m scripts.eval_run
python -m scripts.eval_run --only qa
python -m scripts.eval_run --only memo
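For readers unfamiliar with retrieval metrics, the sketch below shows two common ones, recall@k and reciprocal rank. Which metrics the harness actually computes is not stated here, so treat these functions and their names as illustrative assumptions rather than the harness's implementation.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents found in the top-k retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = set(retrieved[:k]) & relevant
    return len(hits) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Made-up document ids, for illustration only.
retrieved = ["d3", "d1", "d7"]
relevant = ["d1", "d2"]
```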

6. `scripts.ingest_sec10k` — fetch + parse 10-K (vector-store rebuild)

Only needed if you change the embedding model or add a new ticker.

python -m scripts.ingest_sec10k --ticker AAPL              # fetch latest
python -m scripts.ingest_sec10k --ticker AAPL --year 2024  # specific year

7. `scripts.build_vectorstore` — rebuild Chroma

Chunks and embeds the ingested 10-Ks into src/data_sources/vectorstore/. Only needed after running ingest_sec10k or changing the embedding model.

python -m scripts.build_vectorstore --reset

Testing

All tests exercise live infrastructure — real Finnhub, real NewsAPI, real yfinance, real Ollama, real SEC EDGAR. Set all credentials in .env before running the full suite.

pytest                                        # all tests (~13 min)
pytest tests/test_news_agent.py               # news only
pytest tests/test_sec10k_agent.py             # SEC 10-K only
pytest tests/test_market_context_agent.py     # market context only
pytest tests/test_orchestrator.py             # full pipeline
pytest -k classify_intents                    # by keyword

Project Layout

src/
├── agents/              # One file per specialist + orchestrator.py
├── prompts/             # All LLM prompts (no inline prompt strings elsewhere)
├── schemas/             # Pydantic contracts: ResearchRequest, Claim, Citation,
│                        #   AgentResult, FinalOutput
├── retrieval/           # SEC Chroma wrapper, boilerplate detector, delta analysis
├── data_sources/        # HTTP clients: Finnhub, NewsAPI, GDELT, SEC EDGAR, yfinance
├── eval/                # Retrieval metrics, LLM-as-judge, evaluation runner
├── tools/               # Citation verifier, financial table parser
├── utils/               # Config loader, caching, SQLite checkpointer, embeddings
└── app/cli.py           # CLI entry point (ask, news)

config/
├── default.yaml         # Thresholds, rate limits, source classifications
└── eval/queries.yaml    # Sample evaluation queries (QA + memo)

scripts/                 # eval_run, demo, inspect_last_run, ingest_sec10k,
                         #   build_vectorstore

data/cache/              # External API call cache (per-namespace TTLs)
outputs/                 # Generated reports (qa/ and report/)
reports/                 # Evaluation harness outputs

Tech Stack

Component	Technology
Orchestration	LangGraph (StateGraph with SQLite checkpointing)
LLM inference	Ollama — Qwen 3 14B (agents), DeepSeek-R1 7B (judge)
Embeddings	BAAI/bge-m3 via sentence-transformers
Vector store	Chroma (prebuilt SEC 10-K collection)
Market data	yfinance
News sources	Finnhub, NewsAPI, GDELT (fallback, no key required)
SEC filings	SEC EDGAR
Validation	Pydantic v2
Caching	File-based with per-namespace TTLs (news 24h, market 1h, SEC permanent)
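The caching row above can be sketched as a small file-based cache with one TTL per namespace. The `FileCache` class, its method names, and the on-disk JSON layout are hypothetical; only the TTL values (news 24h, market 1h, SEC permanent) come from the table. The sketch also assumes that keys are already safe to use as file names.

```python
import json
import time
from pathlib import Path

# TTLs in seconds per namespace; None means the entry never expires.
TTL_SECONDS = {"news": 24 * 3600, "market": 3600, "sec": None}


class FileCache:
    def __init__(self, root):
        self.root = Path(root)

    def _path(self, namespace, key):
        return self.root / namespace / f"{key}.json"

    def set(self, namespace, key, value, now=None):
        path = self._path(namespace, key)
        path.parent.mkdir(parents=True, exist_ok=True)
        stored_at = time.time() if now is None else now
        path.write_text(json.dumps({"stored_at": stored_at, "value": value}))

    def get(self, namespace, key, now=None):
        path = self._path(namespace, key)
        if not path.exists():
            return None
        entry = json.loads(path.read_text())
        ttl = TTL_SECONDS[namespace]
        now = time.time() if now is None else now
        if ttl is not None and now - entry["stored_at"] > ttl:
            return None  # expired
        return entry["value"]
```

The optional `now` argument exists only so expiry can be exercised deterministically, without waiting for real time to pass.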

Disclaimer

This repository is a course project for DSA4265. Nothing it produces constitutes investment advice.
