Releases: pymupdf/PyMuPDF
PyMuPDF-1.20.1
PyMuPDF-1.20.0
This release integrates the recently-released MuPDF-1.20.0, and has fixes for #1733 and #1738. The latter also contains an additional fix for occasional SEGVs when freeing documents.
Building from source works slightly differently from before:
- We now automatically download the required MuPDF source and build it into PyMuPDF.
- Python sdists (source distributions) already contain the required MuPDF source and build without downloading.
- One can override the default build behaviour by setting environment variables, for example to build with a system-installed MuPDF. See the doc-comment at the start of setup.py for details.
Bug fixes and minor enhancements
Enhancements:
- new method Page.load_widget() to load a widget from its xref
- new dictionary pdfcolor which contains 500 predefined PDF colors
- Quad class supports operator algebra
- text search and extraction default flags now accessible as predefined constants
- iterators Page.annots() and Page.widgets() now prohibit reloading the page within their scope
- removed multiple utility functions from the Tools class and redefined them as standalone
- Parameter "new" in Document.update_stream() is now obsolete.
Bug fixes and minor enhancements
Fixes: #1583, #1552, #1550, #1521, #1518, #1513, #1510, #1417.
Also fixed some undocumented errors that caused the span["origin"] to be incorrectly set in corner cases.
Added new items "orientation" and the associated transformation matrix to the output of fitz.image_properties(), which contains EXIF data of supporting image files.
A new method Document.xref_copy() allows making xref objects duplicates of each other.
Minor bug fixes and enhancements
Fixes: #1505, #1484, #1479, #1474.
Changes:
- Full support of PDF page rectangles like /ArtBox etc.
- New global variable TESSDATA_PREFIX for comfortably checking presence of OCR support
- Changed Document.xref_set_key() such that dictionary keys will physically be removed if set to value "null".
- Changed Document.extract_font() to optionally return a dictionary (instead of a tuple).
New features for class Pixmap and several fixes
Fixes: #1351, #1417, #1418, #1430, #1433
- New or changed Pixmap methods color_topusage(), color_count(), warp(). Some of them solve #1397.
- New Annot method and property irt_xref, set_irt_xref(). Implements #1450.
- New Rect/IRect method torect() which creates a matrix to transform between given rectangles.
- Page.get_texttrace() now also supports non-horizontal text.
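The idea behind a matrix that transforms one rectangle into another can be sketched in plain Python. This illustrates the underlying arithmetic only and is not PyMuPDF's implementation; the helper names rect_to_rect_matrix and apply_matrix are hypothetical, rectangles are assumed to be (x0, y0, x1, y1) tuples, and the matrix uses the PDF convention (a, b, c, d, e, f).

```python
def rect_to_rect_matrix(src, dst):
    """Return (a, b, c, d, e, f) that maps rectangle src onto rectangle dst."""
    sx0, sy0, sx1, sy1 = src
    dx0, dy0, dx1, dy1 = dst
    src_w, src_h = sx1 - sx0, sy1 - sy0
    if src_w == 0 or src_h == 0:
        raise ValueError("source rectangle must have non-zero width and height")
    a = (dx1 - dx0) / src_w      # horizontal scale
    d = (dy1 - dy0) / src_h      # vertical scale
    e = dx0 - sx0 * a            # horizontal shift
    f = dy0 - sy0 * d            # vertical shift
    return (a, 0.0, 0.0, d, e, f)


def apply_matrix(matrix, point):
    """Apply a PDF-style matrix to a point (x, y)."""
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


src = (0, 0, 100, 200)
dst = (50, 50, 100, 150)
m = rect_to_rect_matrix(src, dst)
top_left = apply_matrix(m, (src[0], src[1]))
bottom_right = apply_matrix(m, (src[2], src[3]))
print(m, top_left, bottom_right)
```

Only scaling and translation are needed here because both rectangles are axis-aligned, so the b and c components stay zero.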
Improvements for drawings extraction and bug fixes
Important improvements for OCR support
OCR of a document page has been improved a lot compared to v1.19.0.
Text extractions now also come with an integrated sort.
Fixes: #1328
First version to support MuPDF v1.19.*
Introduces major new features like PDF journalling and OCR support by directly invoking Tesseract-OCR.
In addition, it is possible to detect whether objects are covered (hidden) by other objects.
As part of the new version, the following issues have been resolved:
#1313, #1311, #1290, #1286, #1287, #1284.