68 changes: 68 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,68 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Running Scripts

All scripts must be run through the `run.py` wrapper — never call scripts directly:

```bash
# Authentication
python scripts/run.py auth_manager.py setup # First-time Google login (browser opens)
python scripts/run.py auth_manager.py status # Check auth state
python scripts/run.py auth_manager.py reauth # Re-authenticate
python scripts/run.py auth_manager.py clear # Clear auth data

# Notebook library
python scripts/run.py notebook_manager.py list
python scripts/run.py notebook_manager.py add --url URL --name NAME --description DESC --topics TOPICS
python scripts/run.py notebook_manager.py activate --id ID
python scripts/run.py notebook_manager.py remove --id ID

# Query
python scripts/run.py ask_question.py --question "..." --notebook-id ID
python scripts/run.py ask_question.py --question "..." --notebook-url URL
python scripts/run.py ask_question.py --question "..." --show-browser # debug mode

# Cleanup
python scripts/run.py cleanup_manager.py # preview
python scripts/run.py cleanup_manager.py --confirm # execute
```

The `run.py` wrapper auto-creates `.venv` and installs dependencies (including Chrome via `patchright install chrome`) on first run.

## Architecture

```
scripts/
run.py — venv bootstrap + subprocess launcher
config.py — all paths, CSS selectors, and timeouts (single source of truth)
auth_manager.py — Google login, browser state persistence (7-day TTL)
browser_utils.py — BrowserFactory (persistent Chrome context), StealthUtils
browser_session.py — low-level session wrapper
notebook_manager.py — NotebookLibrary class: JSON-backed notebook registry
ask_question.py — main query flow: auth check → browser → NotebookLM → answer
cleanup_manager.py — clears browser state / auth / library selectively
setup_environment.py — first-run venv + Chrome install
```

**Auth model**: Hybrid. A persistent browser profile (`data/browser_state/browser_profile/`) keeps the browser fingerprint consistent, and cookies from `data/browser_state/state.json` are injected manually to work around a Playwright bug. Saved state expires after 7 days.
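
The 7-day expiry could be checked along these lines. This is a minimal sketch, not the code in `auth_manager.py`: the helper name `is_state_fresh` and the choice to measure age from the state file's modification time are assumptions made for illustration.

```python
import time
from pathlib import Path
from typing import Optional

# 7-day TTL, as stated above
STATE_TTL_SECONDS = 7 * 24 * 60 * 60


def is_state_fresh(state_file: Path, now: Optional[float] = None) -> bool:
    """Hypothetical helper: True if the saved state exists and is under 7 days old.

    Assumption: age is measured from the file's modification time.
    """
    if not state_file.exists():
        return False
    current = time.time() if now is None else now
    age_seconds = current - state_file.stat().st_mtime
    return age_seconds < STATE_TTL_SECONDS
```

A caller would typically trigger re-authentication when this returns `False`.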

**Session model**: Stateless. Each `ask_question.py` call opens a fresh browser, sends the question, waits for a stable (non-streaming) response, then closes. No cross-question context.

**Answer stability detection**: Waits for `div.thinking-message` to disappear, then requires the response text to stay unchanged for 3 consecutive polls before returning it. This prevents truncated streaming responses.
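
The polling idea can be sketched independently of the browser. In this minimal sketch, `read_text` and `is_thinking` are hypothetical callables standing in for the page queries, and the function name and parameters are illustrative rather than taken from `ask_question.py`.

```python
import time
from typing import Callable, Optional


def wait_for_stable_text(
    read_text: Callable[[], str],
    is_thinking: Callable[[], bool],
    required_stable: int = 3,
    timeout_seconds: float = 120.0,
    poll_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the text once it has stayed unchanged for `required_stable` polls."""
    deadline = time.monotonic() + timeout_seconds
    last_text: Optional[str] = None
    stable_count = 0

    while time.monotonic() < deadline:
        if is_thinking():
            # Still generating: discard any stability observed so far.
            last_text = None
            stable_count = 0
        else:
            text = read_text().strip()
            if text and text == last_text:
                stable_count += 1
                if stable_count >= required_stable:
                    return text
            else:
                # Text changed or is empty: restart the count.
                stable_count = 0
                last_text = text or None
        sleep(poll_interval)

    return None  # timed out before the text settled
```

Resetting the counter while the thinking indicator is visible matters: without it, a pause in streaming could be mistaken for a finished answer.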

## Key Configuration (`scripts/config.py`)

- `QUERY_TIMEOUT_SECONDS = 120` — increase if long queries time out
- `QUERY_INPUT_SELECTORS` / `RESPONSE_SELECTORS` — update here if NotebookLM's UI changes
- `LIBRARY_FILE` = `data/library.json` — notebook registry
- `STATE_FILE` = `data/browser_state/state.json` — auth cookies
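
The selector lists are tried in order, and the first one that matches is used, so new selectors can be put ahead of old ones when the UI changes. A minimal sketch of that fallback pattern follows; `first_match` and the `find` callable are hypothetical names for illustration, not part of `config.py`.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_match(
    selectors: Sequence[str],
    find: Callable[[str], Optional[T]],
) -> Optional[Tuple[str, T]]:
    """Try each selector in order; return the first (selector, element) found."""
    for selector in selectors:
        try:
            element = find(selector)
        except Exception:
            # A selector that errors (e.g. times out) is skipped, not fatal.
            continue
        if element is not None:
            return selector, element
    return None
```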

## Data Directory

`data/` is gitignored and contains sensitive auth state. Never commit it.

## Dependencies

Only two: `patchright==1.55.2` (Playwright-based stealth automation) and `python-dotenv==1.0.0`, both pinned in `requirements.txt`. After a manual `pip install`, Chrome must be installed separately with `patchright install chrome`; the `run.py` wrapper handles both steps on first run.
97 changes: 64 additions & 33 deletions scripts/ask_question.py
@@ -25,7 +25,6 @@
from config import QUERY_INPUT_SELECTORS, RESPONSE_SELECTORS
from browser_utils import BrowserFactory, StealthUtils


# Follow-up reminder (adapted from MCP server for stateless operation)
# Since we don't have persistent sessions, we encourage comprehensive questions
FOLLOW_UP_REMINDER = (
@@ -67,8 +66,7 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s

# Launch persistent browser context using factory
context = BrowserFactory.launch_persistent_context(
playwright,
headless=headless
playwright, headless=headless
)

# Navigate to notebook
@@ -77,7 +75,9 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s
page.goto(notebook_url, wait_until="domcontentloaded")

# Wait for NotebookLM
page.wait_for_url(re.compile(r"^https://notebooklm\.google\.com/"), timeout=10000)
page.wait_for_url(
re.compile(r"^https://notebooklm\.google\.com/"), timeout=10000
)

# Wait for query input (MCP approach)
print(" ⏳ Waiting for query input...")
@@ -88,7 +88,7 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s
query_element = page.wait_for_selector(
selector,
timeout=10000,
state="visible" # Only check visibility, not disabled!
state="visible", # Only check visibility, not disabled!
)
if query_element:
print(f" ✓ Found input: {selector}")
@@ -102,17 +102,31 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s

# Type question (human-like, fast)
print(" ⏳ Typing question...")

# Use primary selector for typing
input_selector = QUERY_INPUT_SELECTORS[0]
StealthUtils.human_type(page, input_selector, question)

# Snapshot pre-existing response count before submission
# (persistent browser profile retains previous conversation DOM)
pre_count = 0
pre_selector = None
for selector in RESPONSE_SELECTORS:
try:
existing = page.query_selector_all(selector)
if existing:  # query_selector_all returns a list (possibly empty), never None
pre_count = len(existing)
pre_selector = selector
break
except Exception:
continue

# Submit
print(" 📤 Submitting...")
page.keyboard.press("Enter")

# Small pause
StealthUtils.random_delay(500, 1500)
# Wait for UI to begin rendering new response before polling
StealthUtils.random_delay(800, 1200)

# Wait for response (MCP approach: poll for stable text)
print(" ⏳ Waiting for answer...")
@@ -125,8 +139,10 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s
while time.time() < deadline:
# Check if NotebookLM is still thinking (most reliable indicator)
try:
thinking_element = page.query_selector('div.thinking-message')
thinking_element = page.query_selector("div.thinking-message")
if thinking_element and thinking_element.is_visible():
stable_count = 0 # Reset stability while still thinking
last_text = None
time.sleep(1)
continue
except:
@@ -136,20 +152,28 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s
for selector in RESPONSE_SELECTORS:
try:
elements = page.query_selector_all(selector)
if elements:
# Get last (newest) response
latest = elements[-1]
text = latest.inner_text().strip()

if text:
if text == last_text:
stable_count += 1
if stable_count >= 3: # Stable for 3 polls
answer = text
break
else:
stable_count = 0
last_text = text
if not elements:
continue

# Only read the newest element if a new one has appeared
# This prevents returning stale DOM from a previous session
current_count = len(elements)
effective_pre = pre_count if (pre_selector == selector) else 0
if current_count <= effective_pre:
continue # No new response element yet

latest = elements[-1]
text = latest.inner_text().strip()

if text:
if text == last_text:
stable_count += 1
if stable_count >= 3: # Stable for 3 polls
answer = text
break
else:
stable_count = 0
last_text = text
except:
continue

@@ -169,6 +193,7 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s
except Exception as e:
print(f" ❌ Error: {e}")
import traceback

traceback.print_exc()
return None

@@ -188,12 +213,12 @@ def ask_notebooklm(question: str, notebook_url: str, headless: bool = True) -> s


def main():
parser = argparse.ArgumentParser(description='Ask NotebookLM a question')
parser = argparse.ArgumentParser(description="Ask NotebookLM a question")

parser.add_argument('--question', required=True, help='Question to ask')
parser.add_argument('--notebook-url', help='NotebookLM notebook URL')
parser.add_argument('--notebook-id', help='Notebook ID from library')
parser.add_argument('--show-browser', action='store_true', help='Show browser')
parser.add_argument("--question", required=True, help="Question to ask")
parser.add_argument("--notebook-url", help="NotebookLM notebook URL")
parser.add_argument("--notebook-id", help="Notebook ID from library")
parser.add_argument("--show-browser", action="store_true", help="Show browser")

args = parser.parse_args()

@@ -204,7 +229,7 @@ def main():
library = NotebookLibrary()
notebook = library.get_notebook(args.notebook_id)
if notebook:
notebook_url = notebook['url']
notebook_url = notebook["url"]
else:
print(f"❌ Notebook '{args.notebook_id}' not found")
return 1
@@ -214,28 +239,34 @@ def main():
library = NotebookLibrary()
active = library.get_active_notebook()
if active:
notebook_url = active['url']
notebook_url = active["url"]
print(f"📚 Using active notebook: {active['name']}")
else:
# Show available notebooks
notebooks = library.list_notebooks()
if notebooks:
print("\n📚 Available notebooks:")
for nb in notebooks:
mark = " [ACTIVE]" if nb.get('id') == library.active_notebook_id else ""
mark = (
" [ACTIVE]"
if nb.get("id") == library.active_notebook_id
else ""
)
print(f" {nb['id']}: {nb['name']}{mark}")
print("\nSpecify with --notebook-id or set active:")
print("python scripts/run.py notebook_manager.py activate --id ID")
else:
print("❌ No notebooks in library. Add one first:")
print("python scripts/run.py notebook_manager.py add --url URL --name NAME --description DESC --topics TOPICS")
print(
"python scripts/run.py notebook_manager.py add --url URL --name NAME --description DESC --topics TOPICS"
)
return 1

# Ask the question
answer = ask_notebooklm(
question=args.question,
notebook_url=notebook_url,
headless=not args.show_browser
headless=not args.show_browser,
)

if answer: