Progress reporting during parsing #49

@hybridherbst

Is your feature request related to a problem? Please describe.
For large PDFs (e.g. 100 pages with images on each page), extraction can take a long time (many minutes). During this time, there is no way to show progress to a user (e.g. which page we're at, how long it will likely take, etc.).

Describe the solution you'd like
Some kind of progress reporting would be great.
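As a rough sketch of the kind of callback I have in mind: the extractor could report after each page, which is enough to show the current page and a simple time estimate. Everything named here (`PageProgress`, `ProgressCallback`, `extractWithProgress`, the `extractPage` parameter) is hypothetical and only illustrates the shape, not an existing API of this library.

```typescript
// Hypothetical progress event, emitted once per finished page.
interface PageProgress {
  page: number;       // 1-based index of the page that just finished
  totalPages: number; // total number of pages in the document
  fraction: number;   // page / totalPages, in the range (0, 1]
  etaMs: number;      // estimated remaining time, by linear extrapolation
}

type ProgressCallback = (progress: PageProgress) => void;

// Hypothetical driver: extracts pages one at a time and reports after each.
// `extractPage` stands in for whatever per-page work the library does.
// `now` is injectable so the time estimate can be tested deterministically.
async function extractWithProgress<T>(
  totalPages: number,
  extractPage: (page: number) => Promise<T>,
  onProgress?: ProgressCallback,
  now: () => number = Date.now,
): Promise<T[]> {
  const results: T[] = [];
  const start = now();
  for (let page = 1; page <= totalPages; page++) {
    results.push(await extractPage(page));
    // Average time per finished page, multiplied by the pages still left.
    const elapsed = now() - start;
    const etaMs = (elapsed / page) * (totalPages - page);
    onProgress?.({ page, totalPages, fraction: page / totalPages, etaMs });
  }
  return results;
}
```

A caller could then pass something like `(p) => console.log(p.page + "/" + p.totalPages)` as `onProgress`. The linear estimate is crude (pages with large images take longer), but even just the page counter would already help.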

Describe alternatives you've considered
The API Extractor docs mention an undocumented "progress" object. It seems to only report the bytes loaded (so it is basically always at 100%).
A proper progress callback of some sort would be better.

Additional context
To test, you can use scanned books (where you might have hundreds of pages of images).

Labels: enhancement (New feature or request)
