feat: Deep Research crawl + selective source import & HTML preview #178

Open
PengyiZhang wants to merge 2 commits into nashsu:main from PengyiZhang:feat/deep-research-crawl-import

@PengyiZhang

Summary

  • Auto-crawl full page content for all Deep Research search results in parallel with LLM synthesis
  • Allow users to selectively import crawled pages as source files into raw/sources/ for standard LLM ingest
  • Add HTML file preview via sandboxed iframe for .html source files
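As a rough illustration of the selective import described above, the sketch below tracks which crawled URLs are checked and turns the selection into a list of files to write under raw/sources/. The names `CrawledPage`, `toggleSelected`, `sourceFileName`, and `planImport` are hypothetical and not taken from this PR, and the file-naming scheme is an assumption. Writing the files and enqueueing ingest are left out.

```typescript
// Hypothetical shapes and helpers for illustration; the PR's store and
// importSelectedSources() may be organized differently.
interface CrawledPage {
  url: string;
  html: string;
}

interface PlannedFile {
  path: string;
  html: string;
}

/** Return a new selection with `url` added if absent, removed if present. */
function toggleSelected(selected: ReadonlySet<string>, url: string): Set<string> {
  const next = new Set(selected);
  if (!next.delete(url)) next.add(url);
  return next;
}

/** Derive a filesystem-friendly .html file name from a URL (assumed scheme). */
function sourceFileName(url: string): string {
  const { hostname, pathname } = new URL(url);
  const slug = `${hostname}${pathname}`
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into "-"
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${slug || "page"}.html`;
}

/** Keep only the selected pages and compute their target paths. */
function planImport(
  pages: readonly CrawledPage[],
  selected: ReadonlySet<string>,
  dir = "raw/sources",
): PlannedFile[] {
  return pages
    .filter((page) => selected.has(page.url))
    .map((page) => ({ path: `${dir}/${sourceFileName(page.url)}`, html: page.html }));
}
```

Returning a fresh `Set` from `toggleSelected` keeps the update immutable, which tends to fit store-based state management. Two URLs that differ only in their query string would map to the same file name here, so a real implementation would likely need a collision check.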

Changes

  • web-crawler.ts: URL crawler with concurrency=4, 15s timeout, regex-based article/main/body extraction
  • research-store: Add crawledPages, crawlProgress, selectedUrls state and actions
  • deep-research.ts: Auto-crawl after search + importSelectedSources() for selective import
  • research-panel.tsx: Add checkboxes, crawl progress bar, select all/import button to Sources section
  • file-preview.tsx: Add HtmlPreview component using sandboxed iframe
  • file-types.ts: Add html file category
  • i18n: Add crawl/import translations in en.json and zh.json
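The crawler settings listed above (concurrency of 4, 15 s timeout, regex-based article/main/body extraction) could look roughly like the sketch below. This is a sketch under assumptions, not the code in web-crawler.ts: all function names are hypothetical, and the page fetcher is passed in as a parameter (`fetchHtml`) rather than calling Tauri HTTP directly.

```typescript
// Hypothetical helper names; the PR's web-crawler.ts may differ.
interface CrawlResult {
  url: string;
  ok: boolean;
  content: string;
  error?: string;
}

/** Inner HTML of the first <article>, else <main>, else <body>, else the input. */
function extractMainContent(html: string): string {
  for (const tag of ["article", "main", "body"]) {
    // Non-greedy, case-insensitive match of the first <tag ...>...</tag> pair.
    const re = new RegExp(`<${tag}\\b[^>]*>([\\s\\S]*?)</${tag}>`, "i");
    const match = re.exec(html);
    if (match) return match[1].trim();
  }
  return html.trim();
}

/** Reject if `promise` does not settle within `ms`. Does not cancel the work. */
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

/** Run `worker` over `items` with at most `limit` calls in flight; keeps input order. */
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two runners share an index
      results[index] = await worker(items[index]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, () => run()));
  return results;
}

/** Crawl every URL; a failure or timeout is recorded per URL instead of aborting the batch. */
async function crawlAll(
  urls: readonly string[],
  fetchHtml: (url: string) => Promise<string>, // assumption: injected fetcher
  limit = 4,
  timeoutMs = 15_000,
): Promise<CrawlResult[]> {
  return mapWithConcurrency(urls, limit, async (url): Promise<CrawlResult> => {
    try {
      const html = await withTimeout(fetchHtml(url), timeoutMs);
      return { url, ok: true, content: extractMainContent(html) };
    } catch (error) {
      return { url, ok: false, content: "", error: String(error) };
    }
  });
}
```

Regex extraction is cheap but approximate: with nested tags of the same name, the non-greedy match stops at the first closing tag, so a DOM parser would be more robust if that case matters.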

Test plan

  • Start a Deep Research task, verify crawl runs automatically after search with progress indicator
  • After crawl completes, select sources and click Import Selected, verify files are written to raw/sources/ and enqueued for ingest
  • Click an imported .html source file, verify rendered preview instead of raw code
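To make the "rendered preview instead of raw code" step concrete, here is a framework-free sketch of the sandboxed iframe idea. The PR's HtmlPreview is a React component in file-preview.tsx; the functions below (`escapeAttribute`, `buildSandboxedPreview`) are hypothetical stand-ins that only build the equivalent markup. An empty `sandbox` attribute applies all sandbox restrictions, so scripts in the imported page do not run.

```typescript
// Hypothetical helpers; not the PR's HtmlPreview component.

/** Escape a string for use inside a double-quoted HTML attribute. */
function escapeAttribute(value: string): string {
  // "&" must be escaped first so the "&quot;" entities are not double-escaped.
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

/** Build an <iframe> that renders `html` via srcdoc with every sandbox restriction on. */
function buildSandboxedPreview(html: string, title = "HTML preview"): string {
  return `<iframe sandbox="" title="${escapeAttribute(title)}" srcdoc="${escapeAttribute(html)}"></iframe>`;
}
```

In a React component the same idea would presumably be expressed with the `sandbox` and `srcDoc` props on an `<iframe>` element, where React handles the attribute escaping.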

🤖 Generated with Claude Code

PyZhangBIT and others added 2 commits May 14, 2026 11:23
- Add web-crawler.ts: fetch URLs via Tauri HTTP, extract article/main/body content
- Extend research-store with crawledPages, crawlProgress, selectedUrls
- Auto-crawl all search results in parallel with LLM synthesis
- Add checkboxes and Import Selected button in research panel
- importSelectedSources writes HTML to raw/sources/ and enqueues ingest
- Fix: pass real project object to enqueueSourceIngest

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2 participants