31 changes: 31 additions & 0 deletions .claude/CLAUDE.md
@@ -33,6 +33,37 @@ Welcome to the Tenant First Aid repository. This file contains the main points f

All python commands should be run via `uv run python ...`

## LangChain Agent Architecture

The backend uses LangChain 1.0.8+ for agent-based conversation management with Vertex AI integration.

### Key Components
- **LangChainChatManager**: Main agent orchestration class (`backend/tenantfirstaid/langchain_chat.py`)
- **retrieve_city_law**: Tool for city-specific legal retrieval
- **retrieve_state_law**: Tool for state-wide legal retrieval
- **ChatVertexAI**: LangChain wrapper for Google Gemini 2.5 Pro

### Environment Variables
```bash
MODEL_NAME=gemini-2.5-pro                        # LLM model name
VERTEX_AI_DATASTORE=projects/.../datastores/...  # RAG corpus ID
SHOW_MODEL_THINKING=false                        # Toggle Gemini thinking-mode output
LANGSMITH_API_KEY=...                            # Optional: enable tracing
LANGSMITH_PROJECT=tenant-first-aid               # Optional: LangSmith project name
```

> **Review comment — Good addition: clear LangChain architecture documentation**
>
> Excellent documentation of the new architecture! The environment variables section is particularly helpful.
>
> Minor suggestion: consider adding a section about running the evaluation suite locally, since it is a key part of the quality assurance process:
>
> ### Running Evaluations
> ```bash
> # Run LangSmith evaluations (requires LANGSMITH_API_KEY)
> uv run python scripts/run_langsmith_evaluation.py --num-samples 20
> ```
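
As a rough illustration of how the backend might consume the environment variables above, here is a minimal, framework-free sketch. The variable names come from the table; the `load_settings` helper and its defaults are hypothetical, not the project's actual code:

```python
import os


def load_settings() -> dict:
    """Read the chatbot's configuration from environment variables.

    MODEL_NAME and VERTEX_AI_DATASTORE are treated as required; the
    LangSmith variables are optional and default to disabled/empty.
    """
    return {
        "model_name": os.environ["MODEL_NAME"],
        "datastore": os.environ["VERTEX_AI_DATASTORE"],
        # Interpret the usual truthy strings for the boolean flag.
        "show_model_thinking": os.environ.get("SHOW_MODEL_THINKING", "false").lower()
        in ("1", "true", "yes"),
        "langsmith_api_key": os.environ.get("LANGSMITH_API_KEY"),
        "langsmith_project": os.environ.get("LANGSMITH_PROJECT", "tenant-first-aid"),
    }
```

A missing required variable fails fast with a `KeyError`, which is usually preferable to a late failure deep inside the Vertex AI client.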

### Testing LangChain Components
```bash
# Run LangChain-specific tests
uv run pytest -k langchain

# Run with LangSmith tracing (requires API key)
LANGSMITH_TRACING=true uv run pytest -k langchain

# Run evaluations (see docs/EVALUATION.md)
uv run python scripts/run_langsmith_evaluation.py --num-samples 20
```

## Local `./frontend` workflow

1. Format, lint and type‑check your changes:
4 changes: 2 additions & 2 deletions .claude/settings.json
@@ -1,7 +1,7 @@
```diff
 {
   "permissions": {
     "allow": [
-      "WebFetch(https://docs.langchain.com/*)",
-    ],
+      "WebFetch(https://docs.langchain.com/*)"
+    ]
   }
 }
```
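
The two-character change above matters because strict JSON forbids trailing commas. A quick stdlib check (not part of the PR, just illustration) shows the old form would fail to parse:

```python
import json

# The pre-change settings body, with trailing commas after the list
# item and after the closing bracket -- invalid under strict JSON.
old_settings = '{"permissions": {"allow": ["WebFetch(https://docs.langchain.com/*)",],}}'

# The corrected body from the diff parses cleanly.
new_settings = '{"permissions": {"allow": ["WebFetch(https://docs.langchain.com/*)"]}}'

try:
    json.loads(old_settings)
    old_is_valid = True
except json.JSONDecodeError:
    old_is_valid = False

parsed = json.loads(new_settings)
```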
20 changes: 14 additions & 6 deletions Architecture.md
@@ -83,14 +83,22 @@ backend/

### RAG (Retrieval-Augmented Generation)

```diff
-The system uses **Vertex AI RAG (Retrieval-Augmented Generation)**, which combines Google's Vertex AI vector search capabilities with the Gemini 2.5 Pro language model. This is specifically a **grounded generation** approach where the LLM has access to a tool-based retrieval system that searches through a curated corpus of Oregon housing law documents.
+The system uses **LangChain agents** with **Vertex AI RAG** tools for document retrieval. This combines LangChain's agent orchestration with Google's Vertex AI vector search capabilities and the Gemini 2.5 Pro language model.
 
-**RAG Type and Category:**
-- **Architecture Type**: Tool-augmented RAG with function calling
-- **Implementation**: Vertex AI managed RAG service
-- **Retrieval Method**: Dense vector similarity search with semantic matching
-- **Grounding**: Tool-based retrieval integrated directly into Gemini's generation process
+**Architecture Type**: Agent-based RAG with tool calling
+- **Framework**: LangChain 1.0.8+ (monolithic package)
+- **LLM Integration**: ChatVertexAI (langchain-google-vertexai 3.0.3+)
+- **Agent Pattern**: `create_tool_calling_agent()` with custom RAG tools
+- **Retrieval Method**: Dense vector similarity search with metadata filtering
```

> **Review comment — Documentation issue: incorrect agent function name**
>
> The documentation mentions `create_tool_calling_agent()` but the actual code uses `create_agent()` from `langchain.agents` (see `langchain_chat.py:153`). Please update to match the actual implementation.

#### Tool-Based Retrieval

The agent has access to two retrieval tools:

1. **City-Specific Law Retrieval**: Searches documents filtered by city and state
2. **State-Wide Law Retrieval**: Searches general Oregon laws

The LLM decides which tool(s) to use based on the user's query and location context.

#### Data Ingestion Pipeline

9 changes: 8 additions & 1 deletion backend/pyproject.toml
```diff
@@ -1,7 +1,7 @@
 [project]
 name = "tenant-first-aid"
-version = "0.2.0"
+version = "0.3.0"
 requires-python = ">=3.12"
 dependencies = [
     "flask>=3.1.1",
```
```diff
@@ -19,6 +19,11 @@ dependencies = [
     "python-dotenv",
     "pandas>=2.3.0",
     "vertexai>=1.43.0",
+    "langchain>=1.1.0",
+    "langchain-google-vertexai>=3.1.0",
+    "langsmith>=0.4.47",
+    "langchain-core>=1.1.0",
+    "openevals>=0.1.2",
 ]
 
 [tool.setuptools.packages.find]
```

> **Review comment — Dependency management: consider version pinning**
>
> The new LangChain dependencies use minimum version constraints (`>=`), which could lead to:
>
> 1. Breaking changes in future updates
> 2. Inconsistent behavior across environments
> 3. Difficult-to-reproduce bugs
>
> Recommendation: consider more restrictive version constraints for critical dependencies:
>
> ```toml
> "langchain>=1.1.0,<2.0.0",
> "langchain-google-vertexai>=3.1.0,<4.0.0",
> "langsmith>=0.4.47,<0.5.0",
> ```
>
> Or rely on lock files to ensure reproducibility (which you already have with `uv.lock`).
```diff
@@ -36,12 +41,14 @@ dev = [
     "pandas-stubs>=2.2.3.250527",
     "pyrefly>=0.21.0",
     "pytest>=8.4.0",
+    "pytest-asyncio>=0.23.0",
     "pytest-cov>=6.1.1",
     "pytest-mock>=3.14.1",
     "ruff>=0.12.0",
     "ty>=0.0.1a11",
     "types-Flask>=1.1.6",
     "types-simplejson>=3.20.0.20250326",
+    "httpx>=0.27.0",
 ]
 
 gen_convo = [
```
66 changes: 66 additions & 0 deletions backend/scripts/create_langsmith_dataset.py
@@ -0,0 +1,66 @@

```python
"""Convert tenant_questions_facts_full.csv to a LangSmith evaluation dataset.

This script uploads test scenarios from the manual evaluation CSV to LangSmith
for automated evaluation.
"""

import ast
from pathlib import Path

import pandas as pd
from langsmith import Client


def create_langsmith_dataset():
    """Upload test scenarios to LangSmith for automated evaluation."""
    client = Client()

    # Read existing test scenarios.
    csv_path = (
        Path(__file__).parent
        / "generate_conversation"
        / "tenant_questions_facts_full.csv"
    )
    df = pd.read_csv(csv_path, encoding="cp1252")

    # Create dataset in LangSmith.
    dataset = client.create_dataset(
        dataset_name="tenant-legal-qa-scenarios",
        description="Test scenarios for Oregon tenant legal advice chatbot",
    )

    # Convert each row to a LangSmith example.
    for idx, row in df.iterrows():
        facts = (
            ast.literal_eval(row["facts"])
            if isinstance(row["facts"], str)
            else row["facts"]
        )
        city = row["city"] if not pd.isna(row["city"]) else "null"

        # Each example has inputs and expected metadata.
        client.create_example(
            dataset_id=dataset.id,
            inputs={
                "first_question": row["first_question"],
                "city": city,
                "state": row["state"],
                "facts": facts,
            },
            metadata={
                "scenario_id": idx,
                "city": city,
                "state": row["state"],
                # Tag scenarios for filtering.
                "tags": ["tenant-rights", f"city-{city}", f"state-{row['state']}"],
            },
            # Optionally include reference conversation for comparison.
            outputs={"reference_conversation": row.get("Original conversation", None)},
        )

    print(f"Created dataset '{dataset.name}' with {len(df)} scenarios")
    return dataset


if __name__ == "__main__":
    create_langsmith_dataset()
```

> **Review comment — Encoding issue: hardcoded cp1252 encoding**
>
> The CSV is read with `encoding="cp1252"` (Windows-1252), which is unusual for modern datasets and may cause issues with:
>
> 1. Cross-platform compatibility (Mac/Linux developers)
> 2. Special characters in legal text
> 3. Future data updates
>
> Recommendation: use UTF-8 encoding (standard) or auto-detect:
>
> ```python
> # Try UTF-8 first, fall back to cp1252 if needed
> try:
>     df = pd.read_csv(csv_path, encoding="utf-8")
> except UnicodeDecodeError:
>     df = pd.read_csv(csv_path, encoding="cp1252")
> ```
>
> Consider converting the source CSV to UTF-8 for consistency.
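
One detail worth noting in the script: the `facts` column holds Python-style list literals as strings, which is why it parses them with `ast.literal_eval` rather than `json.loads`. A tiny standalone illustration (the sample string below is invented, not from the real CSV):

```python
import ast

# When a list is written to CSV and read back, the cell is a string:
raw_facts = "['Tenant received a 72-hour notice', 'Rent was paid on the 5th']"

# ast.literal_eval safely parses Python literals (lists, dicts, strings,
# numbers) without executing arbitrary code, unlike eval().
facts = ast.literal_eval(raw_facts)

# json.loads would reject this form: JSON requires double-quoted strings.
```

This is also why the script guards with `isinstance(row["facts"], str)` — a cell that is already a list (or NaN) must not be passed to `literal_eval`.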