
ashleytoh/AlphaAgents

 
 


DSA4265 Group 10 — Financial Research Multi-Agent System

A multi-agent research assistant for S&P 500 equities. Given a ticker and a research question, the system runs five specialist agents — news, SEC 10-K, peer comparison, market context, and sentiment — through a LangGraph orchestrator, then writes a cited Q&A brief or a full bull/bear investment memo.

Every claim is grounded in a retrieved source (news URL or SEC filing locator). The LLM stack is fully local via Ollama — no external API keys are required for inference.
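The grounding rule above can be pictured as a small data contract plus a check. What follows is only a sketch under stated assumptions: the project's real contracts are Pydantic models whose fields are not shown in this README, so the field names, the "exactly one locator" rule, and the helper `ungrounded_claims` are all hypothetical, and plain dataclasses are used to keep the example dependency-free.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """Hypothetical citation record: exactly one of the two locators is set."""
    source_id: str
    url: Optional[str] = None          # news article URL
    sec_locator: Optional[str] = None  # SEC filing locator (format assumed)

    def __post_init__(self) -> None:
        # Assumed rule: a citation points at a news URL or a filing, never both or neither.
        if bool(self.url) == bool(self.sec_locator):
            raise ValueError("a citation needs exactly one of url or sec_locator")


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record: a sentence plus the ids of its supporting citations."""
    text: str
    citation_ids: tuple[str, ...]


def ungrounded_claims(claims: list[Claim], citations: list[Citation]) -> list[Claim]:
    """Return claims that cite nothing, or cite an id that was never retrieved."""
    known = {c.source_id for c in citations}
    return [
        claim
        for claim in claims
        if not claim.citation_ids or any(cid not in known for cid in claim.citation_ids)
    ]
```

A verifier along these lines would reject or flag any claim returned by `ungrounded_claims` before the brief is written.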

For interactive architecture diagrams (system overview, LangGraph flow, RAG pipeline, specialist agent details, data schemas, and evaluation harness), open architecture_diagrams.html in a browser.


Architecture Overview

The central LangGraph orchestrator coordinates five specialist agents through a plan → execute → synthesise → output pipeline.

[Diagram: System Overview]

High-Level Flow

A user query enters via the CLI, gets parsed into a ResearchRequest, and flows through intent classification, agent dispatch, sentiment reconciliation, citation verification, and finally answer generation.
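As an illustration of the first step in that flow, the sketch below parses the CLI flags used in this README into a request object. The `ResearchRequest` name comes from the project, but its fields, the defaults, and the helper `parse_request` are assumptions made for this example, not the repository's actual code.

```python
import argparse
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ResearchRequest:
    """Assumed shape of the request; the real Pydantic model may differ."""
    ticker: str
    question: str
    date_anchor: date
    lookback_days: int
    mode: str  # "qa" or "memo"


def parse_request(argv: list[str]) -> ResearchRequest:
    """Turn CLI arguments into a normalised ResearchRequest."""
    parser = argparse.ArgumentParser(prog="ask")
    parser.add_argument("--ticker", required=True)
    parser.add_argument("--question", required=True)
    # date.fromisoformat raises ValueError on bad input, which argparse reports as a usage error.
    parser.add_argument("--date-anchor", type=date.fromisoformat, default=None)
    parser.add_argument("--lookback-days", type=int, default=30)           # default assumed
    parser.add_argument("--mode", choices=["qa", "memo"], default="qa")    # default assumed
    args = parser.parse_args(argv)
    return ResearchRequest(
        ticker=args.ticker.upper(),
        question=args.question.strip(),
        date_anchor=args.date_anchor or date.today(),
        lookback_days=args.lookback_days,
        mode=args.mode,
    )
```

Normalising the ticker and question at the boundary keeps every downstream stage working from one validated object.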

[Diagram: High-Level Architecture]

Orchestrator — LangGraph State Machine

The orchestrator classifies the user's question into intents using a deterministic keyword backbone augmented by zero-shot embedding similarity, then routes to the appropriate agents. In QA mode, the router iteratively picks the highest-impact agent per round. In memo mode, all four data-gathering agents are dispatched in parallel before converging at the sentiment and synthesis nodes.
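One way to read "keyword backbone augmented by embedding similarity" is sketched below. The keyword table, the intent descriptions, the 0.2 threshold, and the bag-of-words `embed` function are all stand-ins invented for this example; the real system would use its own keyword lists and an embedding model.

```python
import math
from collections import Counter

# Hypothetical intent -> keyword table; the project's real lists are not shown in this README.
KEYWORDS = {
    "news": {"news", "headline", "announcement"},
    "sec_10k": {"10-k", "filing", "risk"},
    "peers": {"peer", "competitor", "versus"},
    "market": {"price", "volatility", "market"},
}

# Hypothetical one-line intent descriptions used by the similarity fallback.
DESCRIPTIONS = {
    "news": "recent news events and press coverage",
    "sec_10k": "annual report disclosures and risk factors",
    "peers": "comparison against rival companies",
    "market": "share price moves and market conditions",
}


def embed(text: str) -> Counter:
    """Toy bag-of-words vector standing in for a real embedding model."""
    return Counter(text.lower().replace("?", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify_intents(question: str, threshold: float = 0.2) -> list[str]:
    """Deterministic keyword hits first, then similarity-based additions."""
    q_vec = embed(question)
    tokens = set(q_vec)
    hits = [intent for intent, words in KEYWORDS.items() if tokens & words]
    for intent, desc in DESCRIPTIONS.items():
        if intent not in hits and cosine(q_vec, embed(desc)) >= threshold:
            hits.append(intent)
    return hits
```

The keyword pass keeps routing reproducible for common phrasings, while the similarity pass catches questions that use none of the listed keywords.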

[Diagram: Orchestrator LangGraph Flow]

Specialist Agents

Each agent runs a plan → route → execute → evaluate loop over its own data sources and retrieval strategy.
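The loop shape can be sketched in a few lines. This is a generic illustration rather than the project's agent code: `LoopState`, `run_agent_loop`, the tool callables, and the "enough evidence" stopping rule are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopState:
    """Hypothetical per-agent state carried between rounds."""
    question: str
    evidence: list[str] = field(default_factory=list)
    rounds: int = 0


def run_agent_loop(
    question: str,
    tools: dict[str, Callable[[str], list[str]]],
    min_evidence: int = 2,
    max_rounds: int = 3,
) -> LoopState:
    """Plan a tool order, then route, execute, and evaluate until satisfied."""
    state = LoopState(question)
    plan = list(tools)                                  # plan: candidate tools in priority order
    while plan and state.rounds < max_rounds:
        tool_name = plan.pop(0)                         # route: pick the next tool
        state.evidence += tools[tool_name](question)    # execute: gather evidence
        state.rounds += 1
        if len(state.evidence) >= min_evidence:         # evaluate: stop once there is enough
            break
    return state
```

Bounding the loop with `max_rounds` keeps a specialist from retrying indefinitely when its sources return nothing useful.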

[Diagram: Specialist Agent Architecture]

Output Modes

  • QA mode — a 4-section research brief answering the user's question.
  • Memo mode — a full 11-section investment memo including executive summary, bull/bear debate, risk factors, peer comparison, and recommendation.

Setup

# 1. Editable install (re-run whenever pyproject.toml changes)
pip install -e .

# 2. Copy env template and fill in credentials
cp .env.example .env
# Required: USER_AGENT (SEC EDGAR fair-use), LOCAL_LLM_MODEL
# Optional: FINNHUB_API_KEY, NEWSAPI_API_KEY (news falls back to GDELT otherwise)

# 3. Pull the Ollama models referenced by .env
ollama pull qwen3:14b       # primary agent model
ollama pull deepseek-r1:7b  # evaluator / LLM-as-judge
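Before the first run it can help to confirm that the variables from step 2 are actually visible to the process. The helper below is a hypothetical pre-flight check, not part of the repository; it only reuses the variable names listed above and assumes they have already been exported or loaded from .env.

```python
from __future__ import annotations

import os
from typing import Mapping, Optional

REQUIRED = ("USER_AGENT", "LOCAL_LLM_MODEL")
OPTIONAL = ("FINNHUB_API_KEY", "NEWSAPI_API_KEY")


def check_env(env: Optional[Mapping[str, str]] = None) -> dict[str, list[str]]:
    """Report which required and optional variables are unset or empty."""
    env = os.environ if env is None else env
    return {
        "missing_required": [name for name in REQUIRED if not env.get(name)],
        "missing_optional": [name for name in OPTIONAL if not env.get(name)],
    }


if __name__ == "__main__":
    report = check_env()
    if report["missing_required"]:
        raise SystemExit(f"Missing required settings: {report['missing_required']}")
    if report["missing_optional"]:
        print(f"Optional keys not set, news will fall back to GDELT: {report['missing_optional']}")
```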

The SEC Chroma vector store at src/data_sources/regtech_db_v1/ is checked into the repo (indexed with BAAI/bge-m3), so no build step is required before the first run.


Main Entrypoints

1. ask — full orchestrator (the main entrypoint)

Dispatches the specialist agents (all of them in memo mode, a router-selected subset in QA mode), verifies citations, and writes a QA brief (--mode qa) or a full bull/bear investment memo (--mode memo) to outputs/report/<TICKER>_<hash>_<timestamp>/report.md.

python -m src.app.cli ask \
  --ticker AAPL \
  --question "What recent news has affected Apple's outlook?" \
  --date-anchor 2026-04-08 \
  --lookback-days 30 \
  --mode memo
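The output directory pattern <TICKER>_<hash>_<timestamp> could be produced along the lines below. This README does not say what the hash covers or how the timestamp is formatted, so hashing the question and mode with SHA-256, the 8-character digest, and the timestamp format are all assumptions, and `report_path` is a hypothetical helper.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def report_path(
    ticker: str,
    question: str,
    mode: str,
    now: datetime,
    root: str = "outputs/report",
) -> Path:
    """Build <root>/<TICKER>_<hash>_<timestamp>/report.md for one run."""
    # Assumed hash input: the normalised question plus the mode.
    key = f"{question.strip().lower()}|{mode}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:8]
    stamp = now.strftime("%Y%m%dT%H%M%S")  # timestamp format assumed
    return Path(root) / f"{ticker.upper()}_{digest}_{stamp}" / "report.md"
```

Deriving the hash from the request keeps repeated runs of the same question grouped by prefix, while the timestamp keeps each run's folder distinct.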

2. news — news specialist in isolation

Returns the raw AgentResult JSON to stdout. Useful for debugging the retrieval + synthesis pipeline without the full orchestrator.

python -m src.app.cli news \
  --ticker TSLA \
  --question "Why is Tesla under pressure recently?" \
  --date-anchor 2026-04-08 \
  --lookback-days 30

3. scripts.inspect_last_run — post-mortem state inspection

Replays the checkpointer for a previous run and dumps every specialist's state, claims, tool calls, and warnings — without re-running the orchestrator.

python -m scripts.inspect_last_run \
  --ticker AAPL \
  --question "What recent news has affected Apple's outlook?" \
  --mode memo
# optional: --history (verbose steps), --json (raw output)

4. scripts.demo — canned end-to-end demos

Runs pre-scripted research queries and writes reports. Used in presentations.

python -m scripts.demo                       # both QA + memo cases
python -m scripts.demo --mode memo           # force memo mode
python -m scripts.demo --ticker AAPL         # override ticker

5. scripts.eval_run — evaluation harness

Runs the sample queries in config/eval/queries.yaml through the orchestrator and scores each with retrieval metrics and an LLM-as-judge rubric. Writes reports/eval_<stamp>.json and reports/eval_<stamp>.md.

python -m scripts.eval_run
python -m scripts.eval_run --only qa
python -m scripts.eval_run --only memo
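The README does not list which retrieval metrics the harness computes. As a point of reference, three common ones are sketched below; treat the choice of precision@k, recall@k, and reciprocal rank as an assumption rather than a description of src/eval.

```python
from __future__ import annotations


def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved ids that are relevant."""
    if k <= 0:
        return 0.0
    return sum(doc in relevant for doc in retrieved[:k]) / k


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant ids that appear in the top-k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0
```

Averaging `reciprocal_rank` over all evaluation queries gives mean reciprocal rank (MRR).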

6. scripts.ingest_sec10k — fetch + parse 10-K (vector-store rebuild)

Only needed if you change the embedding model or add a new ticker.

python -m scripts.ingest_sec10k --ticker AAPL              # fetch latest
python -m scripts.ingest_sec10k --ticker AAPL --year 2024  # specific year

7. scripts.build_vectorstore — rebuild Chroma

Chunks and embeds the ingested 10-Ks into src/data_sources/vectorstore/. Only needed after ingest_sec10k or an embedding-model change.

python -m scripts.build_vectorstore --reset
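The chunking half of that step can be illustrated with a simple overlapping word window. The project's actual chunking strategy, sizes, and any section-aware splitting are not described here, so `chunk_text` and its defaults are purely illustrative; the embedding call is omitted.

```python
from __future__ import annotations


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word windows of `size`, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # the last window already reached the end
            break
    return chunks
```

Overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary, at the cost of storing some text twice.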

Testing

All tests exercise live infrastructure — real Finnhub, real NewsAPI, real yfinance, real Ollama, real SEC EDGAR. Set all credentials in .env before running the full suite.

pytest                                        # all tests (~13 min)
pytest tests/test_news_agent.py               # news only
pytest tests/test_sec10k_agent.py             # SEC 10-K only
pytest tests/test_market_context_agent.py     # market context only
pytest tests/test_orchestrator.py             # full pipeline
pytest -k classify_intents                    # by keyword

Project Layout

src/
├── agents/              # One file per specialist + orchestrator.py
├── prompts/             # All LLM prompts (no inline prompt strings elsewhere)
├── schemas/             # Pydantic contracts: ResearchRequest, Claim, Citation,
│                        #   AgentResult, FinalOutput
├── retrieval/           # SEC Chroma wrapper, boilerplate detector, delta analysis
├── data_sources/        # HTTP clients: Finnhub, NewsAPI, GDELT, SEC EDGAR, yfinance
├── eval/                # Retrieval metrics, LLM-as-judge, evaluation runner
├── tools/               # Citation verifier, financial table parser
├── utils/               # Config loader, caching, SQLite checkpointer, embeddings
└── app/cli.py           # CLI entry point (ask, news)

config/
├── default.yaml         # Thresholds, rate limits, source classifications
└── eval/queries.yaml    # Sample evaluation queries (QA + memo)

scripts/                 # eval_run, demo, inspect_last_run, ingest_sec10k,
                         #   build_vectorstore

data/cache/              # External API call cache (per-namespace TTLs)
outputs/                 # Generated reports (qa/ and report/)
reports/                 # Evaluation harness outputs

Tech Stack

| Component | Technology |
| --- | --- |
| Orchestration | LangGraph (StateGraph with SQLite checkpointing) |
| LLM inference | Ollama: Qwen 3 14B (agents), DeepSeek-R1 7B (judge) |
| Embeddings | BAAI/bge-m3 via sentence-transformers |
| Vector store | Chroma (prebuilt SEC 10-K collection) |
| Market data | yfinance |
| News sources | Finnhub, NewsAPI, GDELT (fallback, no key required) |
| SEC filings | SEC EDGAR |
| Validation | Pydantic v2 |
| Caching | File-based with per-namespace TTLs (news 24h, market 1h, SEC permanent) |
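To make the caching row concrete, here is a minimal file cache with per-namespace TTLs. Only the three TTL values come from this README; the `FileCache` class, the JSON-per-key layout, and the namespace names are assumptions, and keys are assumed to be filesystem-safe strings.

```python
from __future__ import annotations

import json
import time
from pathlib import Path
from typing import Any, Optional

# TTLs from the table above; None means the entry never expires.
TTL_SECONDS: dict[str, Optional[int]] = {"news": 24 * 3600, "market": 3600, "sec": None}


class FileCache:
    """Hypothetical cache: one JSON file per (namespace, key)."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def _path(self, namespace: str, key: str) -> Path:
        return self.root / namespace / f"{key}.json"

    def set(self, namespace: str, key: str, value: Any, now: Optional[float] = None) -> None:
        """Store a JSON-serialisable value together with its write time."""
        path = self._path(namespace, key)
        path.parent.mkdir(parents=True, exist_ok=True)
        stored_at = time.time() if now is None else now
        path.write_text(json.dumps({"stored_at": stored_at, "value": value}))

    def get(self, namespace: str, key: str, now: Optional[float] = None) -> Any:
        """Return the cached value, or None if it is missing or older than the namespace TTL."""
        path = self._path(namespace, key)
        if not path.exists():
            return None
        entry = json.loads(path.read_text())
        ttl = TTL_SECONDS[namespace]
        current = time.time() if now is None else now
        if ttl is not None and current - entry["stored_at"] > ttl:
            return None
        return entry["value"]
```

The optional `now` argument exists only so expiry can be exercised deterministically in tests.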

Disclaimer

This repository is a course project for DSA4265. Nothing it produces constitutes investment advice.
