feat: add screenshot OCR URL detection using Tesseract.js #165
Open
shresthbhargava wants to merge 1 commit into
Closes #147
What's Changed
index.html
- `aria-live="polite"` on the result area so screen readers announce extracted URLs

script.js
Added full screenshot OCR URL detection module:
- `handleDragOver` / `handleDragLeave` / `handleDrop`: drag and drop handlers
- `handleScreenshotUpload`: file input change handler
- `processScreenshot(file)`: reads the image via FileReader, draws it to a canvas, and passes it to Tesseract.js OCR with live progress %
- `extractUrlsFromText(text)`: regex extracts URLs from raw OCR text, deduplicates, strips trailing punctuation, filters noise (version numbers, too-short strings), caps at 10 results
- `showScreenshotUrls(urls)`: renders extracted URLs as a list with individual Scan buttons
- `scanExtractedUrl(url)`: fills the URL input with the extracted URL and triggers `checkSecurity()` automatically

style.css
- `.screenshot-dropzone`: dashed border drop zone with hover/dragover highlight
- `.screenshot-divider`: "or" divider between URL input and screenshot upload
- `.screenshot-urls-found`: card showing extracted URLs
- `.screenshot-url-item`: individual URL row with scan button
- `.screenshot-scan-btn`: styled scan button per URL
- `.screenshot-processing` / `.screenshot-error` / `.screenshot-no-urls`: state styles

Behavior
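For reviewers, the extraction step described for `extractUrlsFromText(text)` could look roughly like the sketch below. This is not the code from the PR: the regex, the `MIN_LENGTH` threshold, and the case-insensitive deduplication are assumptions made for illustration. Only the cap of 10 results and the listed filtering steps come from the description above.

```javascript
// Hypothetical sketch of the described extraction step; not the PR's actual code.
const MAX_RESULTS = 10; // cap stated in the PR description
const MIN_LENGTH = 6;   // assumed cutoff for "too-short strings"

function extractUrlsFromText(text) {
  // Assumed pattern: scheme- or www-prefixed URLs, or bare dotted tokens
  // (e.g. "example.com/path"). Bare tokens are noisy, hence the filters below.
  const pattern =
    /(?:https?:\/\/|www\.)[^\s<>"']+|\b[a-z0-9-]+(?:\.[a-z0-9-]+)+(?:\/[^\s<>"']*)?/gi;

  const seen = new Set();
  const urls = [];

  for (const match of text.match(pattern) || []) {
    // Strip punctuation that OCR or sentence structure leaves at the end.
    const url = match.replace(/[.,;:!?)\]}'"]+$/, "");

    // Filter noise: too-short strings and version numbers such as "v2.1.0".
    if (url.length < MIN_LENGTH) continue;
    if (/^v?\d+(?:\.\d+)+$/i.test(url)) continue;

    // Deduplicate, ignoring case (an assumption; OCR casing is unreliable).
    const key = url.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);

    urls.push(url);
    if (urls.length === MAX_RESULTS) break;
  }
  return urls;
}
```

Keeping this step a pure function of the OCR text means it can be unit-tested without loading Tesseract.js or touching the DOM.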
Note
Website Preview Sandbox (headless browser screenshot) is not included in this PR because it requires Puppeteer (a ~300 MB server-side dependency). It can be implemented as a follow-up if the maintainer wants it.