@Blecoeur
Collaborator

  • Implemented CloudPDFProcessor class to handle PDF file processing.
  • Added methods for encoding PDFs to base64, decoding images from base64, and extracting results from OCR responses.
  • Added a method that maximises request throughput without triggering "429 - Too Many Requests" errors, since the Mistral API has a fairly low limit on requests per second.
  • Added tests for CloudPDFProcessor to validate file acceptance and processing functionality.
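
For reviewers who want a picture of the base64 helpers, here is a minimal sketch. The function names `encode_pdf_to_base64` and `decode_image_from_base64` are hypothetical and may not match the methods in this PR. The data-URI prefix handling is also an assumption about what the OCR response might contain.

```python
import base64
from pathlib import Path


def encode_pdf_to_base64(pdf_path: str) -> str:
    """Read a PDF from disk and return its bytes as a base64 ASCII string."""
    return base64.b64encode(Path(pdf_path).read_bytes()).decode("ascii")


def decode_image_from_base64(image_b64: str) -> bytes:
    """Decode a base64 image payload into raw bytes."""
    # Assumption: the payload may carry a data-URI prefix such as
    # "data:image/png;base64,<data>"; strip it before decoding.
    if image_b64.startswith("data:") and "," in image_b64:
        image_b64 = image_b64.split(",", 1)[1]
    return base64.b64decode(image_b64)
```

Both helpers are pure standard library, so they can be unit tested without any network access.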

Linked to issue #15

Blecoeur and others added 2 commits September 12, 2025 23:08
- Implemented CloudPDFProcessor class to handle PDF file processing.
- Added methods for encoding PDFs to base64, decoding images from base64, and extracting results from OCR responses.
- Included synchronous and asynchronous processing methods, along with batch processing capabilities (which take into account the maximum number of requests per second).
- Added tests for CloudPDFProcessor to validate file acceptance and processing functionality.
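
One way rate-aware batch processing could look is sketched below. This is not the code in this PR: `RateLimiter`, `process_batch`, and the `worker` callback are hypothetical names, and `max_per_second` is a placeholder because the actual Mistral limit is not stated here. The sketch spaces out request start times evenly instead of retrying after a 429.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class RateLimiter:
    """Space out calls so that at most `max_per_second` start each second."""

    def __init__(self, max_per_second: float) -> None:
        self._interval = 1.0 / max_per_second
        self._lock = asyncio.Lock()
        self._next_slot = 0.0  # earliest monotonic time the next call may start

    async def wait(self) -> None:
        # The lock serialises callers, so each one claims the next free slot.
        async with self._lock:
            now = time.monotonic()
            delay = self._next_slot - now
            if delay > 0:
                await asyncio.sleep(delay)
            self._next_slot = max(now, self._next_slot) + self._interval


async def process_batch(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    max_per_second: float = 1.0,  # placeholder value, not the real API limit
) -> List[R]:
    """Run `worker` on every item, throttling how fast the calls start."""
    limiter = RateLimiter(max_per_second)

    async def run_one(item: T) -> R:
        await limiter.wait()
        return await worker(item)

    # gather preserves input order in its result list.
    return list(await asyncio.gather(*(run_one(item) for item in items)))
```

A caller would pass a coroutine that performs a single OCR request as `worker`, for example `asyncio.run(process_batch(paths, ocr_one_file, max_per_second=1.0))`, where `ocr_one_file` is again a hypothetical name.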