Pulling in docling-core, docling, pymupdf fixes #1313
Merged
jamesbraza merged 5 commits into main on Mar 11, 2026
Pull request overview
This PR updates the repo’s PDF parsing stack by upgrading docling/docling-core (including the backend rename) and updating PyMuPDF to pick up upstream fixes, with corresponding dependency/lockfile adjustments.
Changes:
- Bump `docling`, `docling-core`, `docling-ibm-models`, and `docling-parse` in `uv.lock`, and align `paper-qa-docling` dependencies accordingly.
- Update the Docling reader to use `DoclingParseDocumentBackend` (rename from the v4 backend).
- Update the PyMuPDF reader to call `pymupdf.table.find_tables(page)` and remove the previous dev-only PyMuPDF downpin.
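For code that has to import cleanly on both sides of the backend rename, one option is to resolve the class by name at import time. The sketch below is only an illustration and is not what this PR does (the PR switches the import outright). `resolve_first` is a hypothetical helper, and the module paths in `DOCLING_BACKEND_CANDIDATES` are assumptions about where each class lives, so verify them against the installed docling.

```python
import importlib
from typing import Any, Iterable, Optional, Tuple

# (module path, class name) pairs, preferred name first. The class names come
# from this PR; the module paths are ASSUMPTIONS and should be verified.
DOCLING_BACKEND_CANDIDATES = [
    ("docling.backend.docling_parse_backend", "DoclingParseDocumentBackend"),
    ("docling.backend.docling_parse_v4_backend", "DoclingParseV4DocumentBackend"),
]


def resolve_first(candidates: Iterable[Tuple[str, str]]) -> Optional[Any]:
    """Return the first attribute that can be imported, or None if none can."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module is missing in this environment, try the next one
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return attr
    return None


if __name__ == "__main__":
    backend_cls = resolve_first(DOCLING_BACKEND_CANDIDATES)
    print(backend_cls)  # None when docling is not installed
```

On older docling releases the preferred name may resolve to a different, older backend, so a shim like this is best paired with a version floor rather than used as a substitute for one.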
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `uv.lock` | Updates locked versions for docling-related packages and PyMuPDF; adjusts editable package dependency metadata accordingly. |
| `packages/paper-qa-pymupdf/src/paperqa_pymupdf/reader.py` | Switches table detection callsite to `find_tables(page)` to match upstream typing/API changes. |
| `packages/paper-qa-pymupdf/pyproject.toml` | Removes the dev-only PyMuPDF `<1.27` downpin now that the upstream typing issue is fixed. |
| `packages/paper-qa-docling/src/paperqa_docling/reader.py` | Updates the default backend import/class to the renamed Docling backend. |
| `packages/paper-qa-docling/pyproject.toml` | Removes the explicit `docling-parse<5` downpin and tightens `docling` minimum version to `>=2.74`. |
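To confirm locally that an environment satisfies the new `docling>=2.74` floor, a quick check along these lines may help. It is a stdlib-only approximation that compares just the leading numeric components and ignores pre-release ordering; `release_tuple`, `meets_floor`, and `installed_version` are hypothetical helper names, and `packaging.version` is the more rigorous tool if it is available.

```python
from importlib import metadata
from typing import Optional, Tuple


def release_tuple(version: str, width: int = 2) -> Tuple[int, ...]:
    """Parse the first `width` numeric components of a version like '2.74.0'."""
    parts = []
    for piece in version.split(".")[:width]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < width:
        parts.append(0)  # pad short versions such as '2'
    return tuple(parts)


def meets_floor(installed: str, floor: Tuple[int, ...]) -> bool:
    """Compare numerically, so '2.100' correctly ranks above '2.74'."""
    return release_tuple(installed, len(floor)) >= floor


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("docling")
    if found is None:
        print("docling is not installed")
    else:
        print(found, "meets >=2.74:", meets_floor(found, (2, 74)))
```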
c87ea4e to 8ea2d84
whitead approved these changes on Mar 11, 2026
Pulling in:
- `docling-core` fix: fix: accept relative URIs in PdfHyperlink without validation failure (docling-project/docling-core#520)
- `docling` rename: feat: Introduce docling-parse v5 and deprecate old docling-parse backends (docling-project/docling#2872)
- `pymupdf` typing fix: Typing broken because of `*_forward_decl` (pymupdf/PyMuPDF#4903)

Note
Medium Risk
Dependency upgrades and backend/API swaps may subtly change PDF parsing output/behavior and add heavier transitive requirements (e.g., `torch`), so parsing regression testing is important.

Overview
Updates `paper-qa-docling` to require `docling>=2.74`, drops the `docling-parse<5` downpin, and switches the default PDF backend from `DoclingParseV4DocumentBackend` to the renamed `DoclingParseDocumentBackend`.

Updates `paper-qa-pymupdf` to remove the dev-only PyMuPDF downpin and adjusts table extraction to use `pymupdf.table.find_tables(page)` instead of `page.find_tables()`. The lockfile is refreshed to newer `docling`/`docling-core`/`docling-parse` and `pymupdf` versions (including new transitive deps).

Written by Cursor Bugbot for commit 32ee967.
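The table-extraction call-site change can be sketched as a small wrapper that prefers the module-level function and falls back to the page method. This is an illustration rather than the code in this PR: it assumes `pymupdf.table.find_tables` takes the page as its first argument, as the description above states, and both `find_tables_compat` and its `table_module` parameter are hypothetical, added so the sketch can be exercised without PyMuPDF installed.

```python
from types import SimpleNamespace
from typing import Any, Optional


def find_tables_compat(
    page: Any, table_module: Optional[Any] = None, **kwargs: Any
) -> Any:
    """Call the module-level find_tables(page) if present, else page.find_tables()."""
    if table_module is None:
        try:
            # Assumed import path, taken from the PR description.
            import pymupdf.table as table_module
        except ImportError:
            table_module = None
    finder = getattr(table_module, "find_tables", None)
    if callable(finder):
        return finder(page, **kwargs)  # new style: function taking the page
    return page.find_tables(**kwargs)  # old style: method on the page


if __name__ == "__main__":
    # Stand-ins for a PyMuPDF page and table module, for demonstration only.
    fake_page = SimpleNamespace(find_tables=lambda **kw: "from-method")
    fake_module = SimpleNamespace(find_tables=lambda page, **kw: "from-module")
    print(find_tables_compat(fake_page, table_module=fake_module))
```

Injecting `table_module` keeps the dispatch logic testable with plain stand-in objects, which is useful for the parsing regression tests the risk note calls for.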