[Feature Request] enable crawling when indexing a URL #474

@Mikethebot44

Summary

Add a toggle to enable crawling and indexing subpages when scraping a URL, with tabbed display for each subpage's content. Useful for comprehensive indexing of documentation sites like vercel.com/docs.

Problem

Currently, indexing a URL scrapes only the specific page's content, excluding subpages. This limits utility for sites with distributed content, such as documentation or blogs.

Proposed Solution

Add a boolean toggle in the URL input dialog to enable subpage crawling.
Update the scraping logic to crawl and index subpages when the toggle is enabled.
Modify the content display UI to use tabs for each subpage, replacing the single content container.
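
To make the crawling step concrete, here is a minimal sketch of how the toggle could gate a same-site crawl. Everything in it is an assumption for discussion, not the project's existing scraper: `index_url`, `is_subpage`, `crawl_subpages`, `max_pages`, and the injected `fetch` callable are hypothetical names, and "subpage" is taken to mean "same host, path under the starting URL".

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def _normalize(url: str) -> str:
    # Drop the #fragment and any trailing slash so duplicates collapse.
    return urldefrag(url).url.rstrip("/")


def is_subpage(root: str, candidate: str) -> bool:
    """True if candidate is on root's host and at or under root's path."""
    r, c = urlparse(root), urlparse(candidate)
    root_path = r.path.rstrip("/")
    return c.netloc == r.netloc and (
        c.path == root_path or c.path.startswith(root_path + "/")
    )


def index_url(
    root: str,
    fetch: Callable[[str], str],  # hypothetical: returns the HTML for a URL
    crawl_subpages: bool = False,  # the proposed toggle
    max_pages: int = 50,           # assumed safety cap, not from the issue
) -> Dict[str, str]:
    """Return {url: html}. With the toggle off, only the root is fetched."""
    root = _normalize(root)
    pages: Dict[str, str] = {}
    queue = deque([root])
    seen = {root}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        if not crawl_subpages:
            break  # current behavior: index the single page only
        collector = _LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            link = _normalize(urljoin(url, href))
            if link not in seen and is_subpage(root, link):
                seen.add(link)
                queue.append(link)
    return pages
```

The crawl is breadth-first and capped so that a large documentation site cannot run unbounded. A real implementation would probably also need rate limiting and robots.txt handling, which this sketch leaves out.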

Alternatives Considered

Implement a separate "crawl site" feature instead of integrating into URL indexing.
Use a third-party crawling service for subpage discovery.
Limit to manual subpage selection rather than automatic crawling.

Additional Context

UI could feature tabs labeled by subpage path (e.g., /docs/api, /docs/guides) for easy navigation of indexed content.
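
As a rough illustration of the tab labeling, the snippet below derives a label from each indexed URL's path. `tab_label` and `group_into_tabs` are hypothetical helper names, and the input is assumed to be a plain mapping of URL to scraped content.

```python
from typing import Dict
from urllib.parse import urlparse


def tab_label(url: str) -> str:
    """Use the URL path as the tab label; the site root becomes "/"."""
    path = urlparse(url).path.rstrip("/")
    return path or "/"


def group_into_tabs(pages: Dict[str, str]) -> Dict[str, str]:
    """Map tab label -> content, preserving the order pages were indexed in."""
    return {tab_label(url): content for url, content in pages.items()}
```

With this, `https://vercel.com/docs/api` would be shown under a tab labeled `/docs/api`.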

Happy to start working on this myself, but I'd like to hear thoughts first.
