Missing text with OCR although detected #335

Open
EHadoux opened this issue Mar 11, 2025 · 1 comment

Comments

@EHadoux

EHadoux commented Mar 11, 2025

Hey, I love the model, it's a game changer.

I'm trying to extract this document. All is well except on page 6. As you can see in the screenshots (from surya_gui, but the result is the same when calling it directly from Python), the bounding boxes are correct with "Run Text Detection", but the right-hand side of the first line is missing with "Run OCR".

This is the code I'm using directly in Python, which returns the same result:

recognition_predictor(images, [["en"]] * len(images), detection_predictor, highres_images=images_high)
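To narrow down which detected regions end up without any recognized text, one option is to compare the two sets of boxes directly. The sketch below is plain Python and does not call surya; it assumes the detection boxes and the OCR line boxes have already been pulled out of the respective results as `(x0, y0, x1, y1)` tuples in the same coordinate scale. The names `overlap_fraction` and `unmatched_detections`, the `min_overlap` threshold, and the sample coordinates are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Axis-aligned box as (x0, y0, x1, y1); assumed format, not a surya type.
Box = Tuple[float, float, float, float]


def overlap_fraction(det: Box, line: Box) -> float:
    """Return the fraction of the detected box's area covered by an OCR line box."""
    ix0, iy0 = max(det[0], line[0]), max(det[1], line[1])
    ix1, iy1 = min(det[2], line[2]), min(det[3], line[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = (det[2] - det[0]) * (det[3] - det[1])
    return inter / area if area > 0 else 0.0


def unmatched_detections(
    det_boxes: Sequence[Box],
    ocr_boxes: Sequence[Box],
    min_overlap: float = 0.5,
) -> List[Box]:
    """List detected boxes that no OCR line covers by at least min_overlap."""
    return [
        det
        for det in det_boxes
        if all(overlap_fraction(det, line) < min_overlap for line in ocr_boxes)
    ]


# Made-up coordinates: the first line is detected as two boxes (left and right),
# but only the left one has a matching OCR line.
det_boxes = [(50, 40, 300, 60), (320, 40, 560, 60), (50, 80, 560, 100)]
ocr_boxes = [(50, 40, 300, 60), (50, 80, 560, 100)]

print(unmatched_detections(det_boxes, ocr_boxes))  # [(320, 40, 560, 60)]
```

If the detection pass and the OCR pass ran on images of different resolutions, the boxes would need to be rescaled to a common size before comparing them.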

Any idea? Thanks!

12102682_2022-07-06.pdf

[Two screenshots attached: text detection output and OCR output]
@kaiwang13

Is there any solution for this case? I've run into the same problem.
