
Pulling in docling-core, docling, pymupdf fixes #1313

Merged
jamesbraza merged 5 commits into main from modernizing-reader-deps on Mar 11, 2026

Conversation

@jamesbraza
Collaborator

@jamesbraza jamesbraza commented Mar 10, 2026

Pulling in:


Note

Medium Risk
Dependency upgrades and backend/API swaps may subtly change PDF parsing output/behavior and add heavier transitive requirements (e.g., torch), so parsing regression testing is important.

Overview
Updates paper-qa-docling to require docling>=2.74, drops the docling-parse<5 downpin, and switches the default PDF backend from DoclingParseV4DocumentBackend to the renamed DoclingParseDocumentBackend.
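As a rough illustration of the backend rename, the sketch below resolves the first importable class from a list of candidates. The module paths in `CANDIDATES` are assumptions, not taken from this PR, and `first_importable` and `default_backend` are hypothetical helper names. With the `docling>=2.74` floor in place, the PR itself simply imports the renamed class directly, so a fallback like this would only matter for code that has to span docling versions.

```python
from importlib import import_module

# Assumed module paths (not confirmed by this PR); newest name first.
CANDIDATES = [
    ("docling.backend.docling_parse_backend", "DoclingParseDocumentBackend"),
    ("docling.backend.docling_parse_v4_backend", "DoclingParseV4DocumentBackend"),
]


def first_importable(candidates):
    """Return the first (module, attribute) pair that resolves, else None."""
    for module_name, attr_name in candidates:
        try:
            module = import_module(module_name)
        except ImportError:
            # Module is missing in this environment; try the next candidate.
            continue
        attr = getattr(module, attr_name, None)
        if attr is not None:
            return attr
    return None


def default_backend():
    """Hypothetical helper: pick whichever Docling backend class is available."""
    return first_importable(CANDIDATES)
```

Note that a class with the new name may already exist in older docling releases with different behavior, so a name-based fallback is not a substitute for the version floor.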

Updates paper-qa-pymupdf to remove the dev-only PyMuPDF downpin and adjusts table extraction to use pymupdf.table.find_tables(page) instead of page.find_tables(). The lockfile is refreshed to newer docling/docling-core/docling-parse and pymupdf versions (including new transitive deps).
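A minimal sketch of the new table-detection call, assuming the result of `pymupdf.table.find_tables(page)` exposes a `.tables` list whose items have an `.extract()` method. `extract_tables` and its `finder` parameter are hypothetical names introduced here for illustration; this is not the code in `paperqa_pymupdf/reader.py`.

```python
def extract_tables(page, finder=None):
    """Return each table detected on `page` as extracted rows.

    `finder` defaults to pymupdf.table.find_tables; it can be overridden,
    for example to exercise the function without PyMuPDF installed.
    """
    if finder is None:
        # Imported lazily so this module loads even without PyMuPDF.
        from pymupdf.table import find_tables as finder
    # Module-level function call, replacing the older page.find_tables().
    return [table.extract() for table in finder(page).tables]
```

With PyMuPDF installed, this could be applied to each page of a document opened with `pymupdf.open(path)`.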

Written by Cursor Bugbot for commit 32ee967.

@jamesbraza jamesbraza self-assigned this Mar 10, 2026
@jamesbraza jamesbraza added the enhancement New feature or request label Mar 10, 2026
Copilot AI review requested due to automatic review settings March 10, 2026 19:39
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Mar 10, 2026
@dosubot dosubot bot added the bug Something isn't working label Mar 10, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR updates the repo’s PDF parsing stack by upgrading docling/docling-core (including the backend rename) and updating PyMuPDF to pick up upstream fixes, with corresponding dependency/lockfile adjustments.

Changes:

  • Bump docling, docling-core, docling-ibm-models, and docling-parse in uv.lock, and align paper-qa-docling dependencies accordingly.
  • Update the Docling reader to use DoclingParseDocumentBackend (rename from the v4 backend).
  • Update the PyMuPDF reader to call pymupdf.table.find_tables(page) and remove the previous dev-only PyMuPDF downpin.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| uv.lock | Updates locked versions for docling-related packages and PyMuPDF; adjusts editable package dependency metadata accordingly. |
| packages/paper-qa-pymupdf/src/paperqa_pymupdf/reader.py | Switches table detection callsite to find_tables(page) to match upstream typing/API changes. |
| packages/paper-qa-pymupdf/pyproject.toml | Removes the dev-only PyMuPDF <1.27 downpin now that the upstream typing issue is fixed. |
| packages/paper-qa-docling/src/paperqa_docling/reader.py | Updates the default backend import/class to the renamed Docling backend. |
| packages/paper-qa-docling/pyproject.toml | Removes the explicit docling-parse<5 downpin and tightens docling minimum version to >=2.74. |
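For a quick local check that an environment satisfies the new `docling>=2.74` floor, something like the following stdlib-only sketch could be used. `release_tuple` and `meets_floor` are hypothetical helpers, and the parsing is deliberately simplified: it ignores pre-release ordering, which the `packaging` library handles properly.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Parse the leading numeric components, e.g. '2.74.0rc1' -> (2, 74, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"no numeric release in {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_floor(package, floor):
    """Return True if `package` is installed at or above `floor` (a tuple)."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # Package is not installed in this environment.
        return False
    return release_tuple(installed) >= floor
```

For example, `meets_floor("docling", (2, 74))` would report whether the installed docling is recent enough for the renamed backend.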


@jamesbraza jamesbraza force-pushed the modernizing-reader-deps branch from c87ea4e to 8ea2d84 March 10, 2026 22:21
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 11, 2026
@jamesbraza jamesbraza merged commit 4c9050f into main Mar 11, 2026
11 of 14 checks passed
@jamesbraza jamesbraza deleted the modernizing-reader-deps branch March 11, 2026 00:13

Labels

bug: Something isn't working
enhancement: New feature or request
lgtm: This PR has been approved by a maintainer
size:S: This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Typing broken because of *_forward_decl

3 participants