Project B: Dashboards Are Dead
This project explores whether a traditional analytics dashboard can be replaced by a conversational experience powered by an LLM. Instead of requiring users to navigate dashboards with limited information, the project processes Overture Maps' monthly analytics data into an LLM-readable text file so users can ask natural-language questions about the data. The goal is to understand which data-formatting, prompt-engineering, and context-optimization choices make these answers reliable and useful, especially under real constraints like limited context windows and the risk of hallucination.
SQL Tooling + Metrics QA Agent

A tool-calling LLM agent that answers plain-English questions about Overture-style metrics by generating guardrailed SQL, executing it in DuckDB, and returning a grounded natural-language response with citations to the executed queries/results.

What this project does

- Cleans raw metrics CSVs → Parquet
- Builds a DuckDB database with convenient views:
  - row_counts
  - row_counts_accurate
  - theme_column_summary_stats
  - theme_column_release_to_release_comparisons
  - changelog_stats
- Provides a SQL tool that only allows SELECT / CTE queries
- Runs an agent loop: question → SQL tool calls → final grounded answer
- Includes an evaluation harness for hallucination/grounding checks on a test set in sql_tooling/test/questions.txt
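The read-only guardrail can be sketched as a simple allow-list check applied before any query reaches DuckDB. This is a minimal illustration, not the project's actual sql_tool.py; the function name and keyword list are assumptions:

```python
import re

# Keywords that indicate a state-mutating statement (illustrative, not exhaustive)
FORBIDDEN = {"insert", "update", "delete", "drop", "create",
             "alter", "attach", "copy", "pragma"}

def is_safe_sql(query: str) -> bool:
    """Allow only SELECT or WITH (CTE) statements; reject anything that
    could mutate the database."""
    # Strip line comments so a leading '--' comment can't defeat the prefix check
    text = re.sub(r"--[^\n]*", " ", query).strip().lower()
    if not text.startswith(("select", "with")):
        return False
    # Reject the query if any forbidden keyword appears as a standalone token
    tokens = set(re.findall(r"[a-z_]+", text))
    return not tokens & FORBIDDEN
```

A token-based check like this is deliberately conservative: a SELECT that merely mentions a forbidden word as a bare identifier would also be rejected, which is usually the right trade-off for an agent-facing tool.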
Repository layout

LLM-Data-Insights/
  data/
    cleaned/
    metrics/
    clean_metrics.py
  sql_tooling/
    test/
      evaluate.py
      agent_test.py
      questions.txt
    init_duckdb.py
    sql_tool.py
    run_agent.py
    sql_context.md
- Configure environment variables: create a .env file that exports OPEN_API_KEY
- Add the metrics folder to data/. From the data/ folder, run python clean_metrics.py to produce the cleaned/ folder
- From the root directory, run python sql_tooling/init_duckdb.py to create the DuckDB database
- Run python sql_tooling/run_agent.py
What is cleaned

We:
- Read metrics CSVs from release/theme/type folders
- Normalize filename/path metadata into stable columns (e.g., release_id, theme, type)
- Ensure consistent column presence/order across files
- Output Parquet partitions for fast querying
- Generate:
  - a manifest (what was ingested)
  - an ingest report (row counts, missing columns, anomalies)
  - a schema report (observed columns + types)
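The path-metadata normalization step above can be sketched like this. The exact folder layout and field names are assumptions based on the release/theme/type convention described, not the actual clean_metrics.py:

```python
from pathlib import Path

def path_metadata(csv_path: str) -> dict:
    """Derive stable columns from a metrics CSV path, assuming files
    are laid out as <release_id>/<theme>/<type>/<file>.csv."""
    parts = Path(csv_path).parts
    # Take the last four path components: release, theme, type, filename
    release_id, theme, type_, filename = parts[-4:]
    return {
        "release_id": release_id,
        "theme": theme,
        "type": type_,
        "source_file": filename,
    }
```

Columns derived this way stay stable even if the CSVs themselves omit or reorder fields, which is what makes cross-release comparisons possible later.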
Running the agent (interactive)

From the repo root:

python3 -m sql_tooling.run_agent

You’ll be prompted:

Ask a metrics question:

Example questions:
- “What is the total number of address records?”
- “What percentage of address records include postcode/ZIP values?”
- “Which datasets contribute the most address records?”
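The agent loop (question → SQL tool calls → final grounded answer) can be sketched with the model call and SQL executor injected as plain callables. The message protocol here is hypothetical and simplified, not the project's actual run_agent.py or the OpenAI tool-calling API:

```python
def run_agent(question, call_llm, run_sql, max_steps=5):
    """Minimal agent loop. `call_llm` returns either {"sql": "..."}
    to request a query or {"answer": "..."} to finish (assumed protocol)."""
    transcript = [{"role": "user", "content": question}]
    executed = []  # (sql, rows) pairs the final answer can cite
    for _ in range(max_steps):
        reply = call_llm(transcript)
        if "sql" in reply:
            # Execute the requested query and feed results back to the model
            rows = run_sql(reply["sql"])
            executed.append((reply["sql"], rows))
            transcript.append({"role": "tool", "content": str(rows)})
        else:
            return reply["answer"], executed
    return "No answer within step budget.", executed
```

Returning the executed (sql, rows) pairs alongside the answer is what makes citation and downstream grounding checks possible.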
Run python3 -m sql_tooling.test.evaluate
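A grounding check of the kind the evaluation harness performs can be approximated by verifying that every number cited in the model's answer appears in the executed query results. This is a simplified sketch; the actual logic in evaluate.py may differ:

```python
import re

def numbers_grounded(answer: str, result_rows) -> bool:
    """Return True iff every numeric value cited in the answer appears
    somewhere in the executed SQL results (simple hallucination check)."""
    grounded = {str(value) for row in result_rows for value in row}
    # Drop thousands separators so "1,234" matches the raw value 1234
    cited = re.findall(r"\d+(?:\.\d+)?", answer.replace(",", ""))
    return all(number in grounded for number in cited)
```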