docs/page.rst
Lines changed: 32 additions & 29 deletions
@@ -306,7 +306,7 @@ In a nutshell, this is what you can do with PyMuPDF:
306
306
307
307
:arg int align: the horizontal alignment for the replacing text. See :meth:`insert_textbox` for available values. The vertical alignment is (approximately) centered if a PDF built-in font is used (CJK or :ref:`Base-14-Fonts`). (New in v1.16.12)
308
308
309
-
:arg sequence fill: the fill color of the rectangle **after applying** the redaction. The default is *white = (1, 1, 1)*, which is also taken if *None* is specified. To suppress a fill color altogether, specify *False*. In this case the rectangle remains transparent. (New in v1.16.12)
309
+
:arg sequence fill: the fill color of the rectangle **after applying** the redaction. The default is *white = (1, 1, 1)*, which is also taken if ``None`` is specified. To suppress a fill color altogether, specify ``False``. In this case the rectangle remains transparent. (New in v1.16.12)
310
310
311
311
:arg sequence text_color: the color of the replacing text. Default is *black = (0, 0, 0)*. (New in v1.16.12)
312
312
@@ -349,7 +349,7 @@ In a nutshell, this is what you can do with PyMuPDF:
349
349
350
350
* For option `images=PDF_REDACT_IMAGE_PIXELS` a new image of format PNG is created, which the page will use in place of the original one. The original image is not deleted or replaced as part of this process, so other pages may still show the original. In addition, the new, modified PNG image currently is **stored uncompressed**. Do keep these aspects in mind when choosing the right garbage collection method and compression options during save.
351
351
352
-
* **Text removal** is done by character: A character is removed if its bbox has a **non-empty overlap** with a redaction rectangle (changed in MuPDF v1.17). Depending on the font properties and / or the chosen line height, deletion may occur for undesired text parts. Using :meth:`Tools.set_small_glyph_heights` with a *True* argument before text search may help to prevent this.
352
+
* **Text removal** is done by character: A character is removed if its bbox has a **non-empty overlap** with a redaction rectangle (changed in MuPDF v1.17). Depending on the font properties and / or the chosen line height, deletion may occur for undesired text parts. Using :meth:`Tools.set_small_glyph_heights` with a ``True`` argument before text search may help to prevent this.
353
353
354
354
* Redactions are a simple way to replace single words in a PDF, or to just physically remove them. Locate the word "secret" using some text extraction or search method and insert a redaction using "xxxxxx" as replacement text for each occurrence.
355
355
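The character-wise removal rule described in this hunk can be sketched without any PDF library. The following is a minimal illustration only: ``Box``, ``has_nonempty_overlap`` and ``surviving_chars`` are hypothetical names, not PyMuPDF API, and it assumes that "non-empty overlap" means an intersection with positive width and height, so boxes that merely touch are kept.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle: (x0, y0) is top-left, (x1, y1) is bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float


def has_nonempty_overlap(a: Box, b: Box) -> bool:
    """Return True if the intersection of a and b has positive width and height."""
    width = min(a.x1, b.x1) - max(a.x0, b.x0)
    height = min(a.y1, b.y1) - max(a.y0, b.y0)
    return width > 0 and height > 0


def surviving_chars(chars, redaction: Box):
    """Keep only (char, bbox) pairs whose bbox does not overlap the redaction."""
    return [(c, box) for c, box in chars if not has_nonempty_overlap(box, redaction)]


# Hypothetical character boxes for the word "secret", each 10 units wide.
chars = [(c, Box(10 * i, 0, 10 * (i + 1), 12)) for i, c in enumerate("secret")]

# The redaction rectangle starts in the middle of the first "e" and ends
# exactly on the left edge of the second "e".
redaction = Box(15, 2, 40, 10)

remaining = "".join(c for c, _ in surviving_chars(chars, redaction))
print(remaining)
```

Note that the first "e" is removed although the rectangle covers only half of it, which is the kind of unintended deletion the text warns about.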
@@ -414,14 +414,14 @@ In a nutshell, this is what you can do with PyMuPDF:
414
414
the location(s) -- rectangle(s) or quad(s) -- to be marked. (Changed in v1.14.20)
415
415
A list or tuple must consist of :data:`rect_like` or :data:`quad_like` items (or even a mixture of either).
416
416
Every item must be finite, convex and not empty (as applicable).
417
-
**Set this parameter to** *None* if you want to use the following arguments (Changed in v1.16.14).
418
-
And vice versa: if not *None*, the remaining parameters must be *None*.
417
+
**Set this parameter to** ``None`` if you want to use the following arguments (Changed in v1.16.14).
418
+
And vice versa: if not ``None``, the remaining parameters must be ``None``.
419
419
420
-
:arg point_like start: start text marking at this point. Defaults to the top-left point of *clip*. Must be provided if `quads` is *None*. (New in v1.16.14)
421
-
:arg point_like stop: stop text marking at this point. Defaults to the bottom-right point of *clip*. Must be used if `quads` is *None*. (New in v1.16.14)
420
+
:arg point_like start: start text marking at this point. Defaults to the top-left point of *clip*. Must be provided if `quads` is ``None``. (New in v1.16.14)
421
+
:arg point_like stop: stop text marking at this point. Defaults to the bottom-right point of *clip*. Must be used if `quads` is ``None``. (New in v1.16.14)
422
422
:arg rect_like clip: only consider text lines intersecting this area. Defaults to the page rectangle. Only use if `start` and `stop` are provided. (New in v1.16.14)
423
423
424
-
:rtype: :ref:`Annot` or *None* (changed in v1.16.14).
424
+
:rtype: :ref:`Annot` or ``None`` (changed in v1.16.14).
425
425
:returns: the created annotation. If *quads* is an empty list, **no annotation** is created (changed in v1.16.14).
426
426
427
427
.. note::
@@ -1544,8 +1544,8 @@ In a nutshell, this is what you can do with PyMuPDF:
1544
1544
1545
1545
For paths other than groups or clips, key `"type"` takes one of the following values:
1546
1546
1547
-
* **"f"** -- this is a *fill-only* path. Only key-values relevant for this operation have a meaning, not applicable ones are present with a value of *None*: `"color"`, `"lineCap"`, `"lineJoin"`, `"width"`, `"closePath"`, `"dashes"` and should be ignored.
1548
-
* **"s"** -- this is a *stroke-only* path. Similar to previous, key `"fill"` is present with value *None*.
1547
+
* **"f"** -- this is a *fill-only* path. Only key-values relevant for this operation have a meaning, not applicable ones are present with a value of ``None``: `"color"`, `"lineCap"`, `"lineJoin"`, `"width"`, `"closePath"`, `"dashes"` and should be ignored.
1548
+
* **"s"** -- this is a *stroke-only* path. Similar to previous, key `"fill"` is present with value ``None``.
1549
1549
* **"fs"** -- this is a path performing combined *fill* and *stroke* operations.
1550
1550
1551
1551
Each item in `path["items"]` is one of the following:
@@ -1670,24 +1670,27 @@ In a nutshell, this is what you can do with PyMuPDF:
1670
1670
:arg bool xrefs: **PDF only.** Try to find the :data:`xref` for each image. Implies `hashes=True`. Adds the `"xref"` key to the dictionary. If not found, the value is 0, which means, the image is either "inline" or its xref is undetectable for some reason. Please note that this option has an extended response time, because the MD5 hashcode will be computed at least two times for each image with an xref. (New in v1.18.13)
1671
1671
1672
1672
:rtype: list[dict]
1673
-
:returns: A list of dictionaries. This includes information for **exactly those** images, that are shown on the page -- including *"inline images"*. In contrast to images included in :meth:`Page.get_text`, image **binary content** is not loaded, which drastically reduces memory usage. The dictionary layout is similar to that of image blocks in `page.get_text("dict")`.
1673
+
:returns: A list of dictionaries. This includes information for **exactly those** images, that are shown on the page -- including *"inline images"*. The dictionary layout is similar to that of image blocks in `page.get_text("dict")`.
1674
+
1675
+
In contrast to images included in :meth:`Page.get_text`, image **binary content** is not loaded by this method, which drastically reduces memory usage. Another difference is that image detection is not restricted to the visible part of the page or any ``clip`` parameter: method :meth:`Page.get_text` will only extract images **fully contained** in the provided ``clip``.
Multiple occurrences of the same image are always reported. You can detect duplicates by comparing their `digest` values.
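Duplicate detection by ``digest`` can be sketched as a simple grouping step. The list below is hypothetical stand-in data shaped like the dictionaries described above (only the keys ``number``, ``bbox`` and ``digest`` are used); ``group_by_digest`` and ``find_duplicates`` are illustrative helper names, not part of PyMuPDF.

```python
import hashlib
from collections import defaultdict


def group_by_digest(image_infos):
    """Map each digest to the list of image entries that share it."""
    groups = defaultdict(list)
    for info in image_infos:
        groups[info["digest"]].append(info)
    return groups


def find_duplicates(image_infos):
    """Return only those digest groups that contain more than one entry."""
    groups = group_by_digest(image_infos)
    return {d: items for d, items in groups.items() if len(items) > 1}


# Hypothetical stand-in for the returned list: the same logo appears twice
# on the page, a photo appears once. Digests are MD5 hashes of made-up bytes.
logo = hashlib.md5(b"logo bytes").digest()
photo = hashlib.md5(b"photo bytes").digest()
image_infos = [
    {"number": 0, "bbox": (36, 36, 136, 86), "digest": logo},
    {"number": 1, "bbox": (36, 700, 136, 750), "digest": logo},
    {"number": 2, "bbox": (200, 200, 400, 350), "digest": photo},
]

duplicates = find_duplicates(image_infos)
for digest, items in duplicates.items():
    print(digest.hex(), [item["number"] for item in items])
```

Grouping in a dictionary keeps the check linear in the number of images, instead of comparing every pair.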
@@ -1771,7 +1774,7 @@ In a nutshell, this is what you can do with PyMuPDF:
1771
1774
Create an SVG image from the page. Only full page images are currently supported.
1772
1775
1773
1776
:arg matrix_like matrix: a matrix, default is :ref:`Identity`.
1774
-
:arg bool text_as_path: -- controls how text is represented. *True* outputs each character as a series of elementary draw commands, which leads to a more precise text display in browsers, but a **very much larger** output for text-oriented pages. Display quality for *False* relies on the presence of the referenced fonts on the current system. For missing fonts, the internet browser will fall back to some default -- leading to unpleasant appearances. Choose *False* if you want to parse the text of the SVG. (New in v1.17.5)
1777
+
:arg bool text_as_path: -- controls how text is represented. ``True`` outputs each character as a series of elementary draw commands, which leads to a more precise text display in browsers, but a **very much larger** output for text-oriented pages. Display quality for ``False`` relies on the presence of the referenced fonts on the current system. For missing fonts, the internet browser will fall back to some default -- leading to unpleasant appearances. Choose ``False`` if you want to parse the text of the SVG. (New in v1.17.5)
1775
1778
1776
1779
:returns: a UTF-8 encoded string that contains the image. Because SVG has XML syntax it can be saved in a text file, the standard extension is `.svg`.
1777
1780
@@ -1796,12 +1799,12 @@ In a nutshell, this is what you can do with PyMuPDF:
1796
1799
:arg colorspace: The desired colorspace, one of "GRAY", "RGB" or "CMYK" (case insensitive). Or specify a :ref:`Colorspace`, i.e. one of the predefined ones: :data:`csGRAY`, :data:`csRGB` or :data:`csCMYK`.
1797
1800
:type colorspace: str or :ref:`Colorspace`
1798
1801
:arg irect_like clip: restrict rendering to the intersection of this area with the page's rectangle.
1799
-
:arg bool alpha: whether to add an alpha channel. Always accept the default *False* if you do not really need transparency. This will save a lot of memory (25% in case of RGB ... and pixmaps are typically **large**!), and also processing time. Also note an **important difference** in how the image will be rendered: with *True* the pixmap's samples area will be pre-cleared with *0x00*. This results in **transparent** areas where the page is empty. With *False* the pixmap's samples will be pre-cleared with *0xff*. This results in **white** where the page has nothing to show.
1802
+
:arg bool alpha: whether to add an alpha channel. Always accept the default ``False`` if you do not really need transparency. This will save a lot of memory (25% in case of RGB ... and pixmaps are typically **large**!), and also processing time. Also note an **important difference** in how the image will be rendered: with ``True`` the pixmap's samples area will be pre-cleared with *0x00*. This results in **transparent** areas where the page is empty. With ``False`` the pixmap's samples will be pre-cleared with *0xff*. This results in **white** where the page has nothing to show.
1800
1803
1801
1804
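The 25% figure for RGB follows from the per-pixel byte count: three color bytes without alpha versus four with it. A small arithmetic sketch, assuming one byte per component and ignoring any bookkeeping overhead (``pixmap_sample_bytes`` is a hypothetical helper, and the page size is only an example):

```python
def pixmap_sample_bytes(width: int, height: int, color_components: int, alpha: bool) -> int:
    """Size of the samples area, assuming one byte per component per pixel."""
    bytes_per_pixel = color_components + (1 if alpha else 0)
    return width * height * bytes_per_pixel


# Example: an A4 page (595 x 842 points) rendered at a zoom factor of 2.
width, height = 1190, 1684

rgb = pixmap_sample_bytes(width, height, 3, alpha=False)
rgba = pixmap_sample_bytes(width, height, 3, alpha=True)
saving = 1 - rgb / rgba  # fraction of the RGBA size that is saved

print(f"RGB: {rgb} bytes, RGBA: {rgba} bytes, saving: {saving:.0%}")
```

The same calculation gives a 50% saving for grayscale (1 versus 2 bytes per pixel) and 20% for CMYK (4 versus 5).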
|history_begin|
1802
1805
1803
1806
Changed in v1.14.17
1804
-
The default alpha value is now *False*.
1807
+
The default alpha value is now ``False``.
1805
1808
1806
1809
* Generated with *alpha=True*
1807
1810
@@ -1881,7 +1884,7 @@ In a nutshell, this is what you can do with PyMuPDF:
1881
1884
:arg str,int ident: the annotation name or xref.
1882
1885
1883
1886
:rtype: :ref:`Annot`
1884
-
:returns: the annotation or *None*.
1887
+
:returns: the annotation or ``None``.
1885
1888
1886
1889
.. note:: Methods :meth:`Page.annot_names`, :meth:`Page.annot_xrefs` provide lists of names or xrefs, respectively, from where an item may be picked and loaded via this method.
1887
1890
@@ -1898,7 +1901,7 @@ In a nutshell, this is what you can do with PyMuPDF:
1898
1901
:arg int xref: the field's xref.
1899
1902
1900
1903
:rtype: :ref:`Widget`
1901
-
:returns: the field or *None*.
1904
+
:returns: the field or ``None``.
1902
1905
1903
1906
.. note:: This is similar to the analogous method :meth:`Page.load_annot` -- except that here only the xref is supported as identifier.
1904
1907
@@ -1913,7 +1916,7 @@ In a nutshell, this is what you can do with PyMuPDF:
1913
1916
Return the first link on a page. Synonym of property :attr:`first_link`.
1914
1917
1915
1918
:rtype: :ref:`Link`
1916
-
:returns: first link on the page (or *None*).
1919
+
:returns: first link on the page (or ``None``).
1917
1920
1918
1921
.. index::
1919
1922
pair: rotate; set_rotation
@@ -2187,19 +2190,19 @@ In a nutshell, this is what you can do with PyMuPDF:
2187
2190
2188
2191
.. attribute:: first_link
2189
2192
2190
-
Contains the first :ref:`Link` of a page (or *None*).
2193
+
Contains the first :ref:`Link` of a page (or ``None``).
2191
2194
2192
2195
:type: :ref:`Link`
2193
2196
2194
2197
.. attribute:: first_annot
2195
2198
2196
-
Contains the first :ref:`Annot` of a page (or *None*).
2199
+
Contains the first :ref:`Annot` of a page (or ``None``).
2197
2200
2198
2201
:type: :ref:`Annot`
2199
2202
2200
2203
.. attribute:: first_widget
2201
2204
2202
-
Contains the first :ref:`Widget` of a page (or *None*).
2205
+
Contains the first :ref:`Widget` of a page (or ``None``).
Possible values of the "ext" key are "bmp", "gif", "jpeg", "jpx" (JPEG 2000), "jxr" (JPEG XR), "png", "pnm", and "tiff".
@@ -241,6 +238,12 @@ Possible values of the "ext" key are "bmp", "gif", "jpeg", "jpx" (JPEG 2000), "j
241
238
242
239
3. The image's "transformation matrix" is defined as the matrix, for which the expression `bbox / transform == pymupdf.Rect(0, 0, 1, 1)` is true, lookup details here: :ref:`ImageTransformation`.
243
240
241
+
4. A transparent image may be accompanied by a mask image. This is stored under key `"mask"` and has the format of a `DeviceGray` PNG image. Otherwise the value of this key is ``None``. If present, you may be able to recover (an equivalent of) the original image -- i.e. with transparency -- by creating :ref:`Pixmap` objects from the "image" and "mask" values, respectively, and overlaying them. This is not guaranteed to always work because mask images come in multiple formats, not all of which meet the conditions under which overlaying Pixmaps is supported. Here is a code snippet:
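As a library-independent illustration of what overlaying means per pixel, the sketch below combines already decoded 8-bit RGB samples with a gray mask of the same dimensions into RGBA samples. It is a sketch under assumptions, not PyMuPDF code: ``add_alpha`` is a hypothetical helper, the sample data is made up, and details such as premultiplied alpha are ignored. With PyMuPDF, decoding the "image" and "mask" values and combining them would be done through :ref:`Pixmap` objects instead.

```python
def add_alpha(rgb_samples: bytes, mask_samples: bytes) -> bytes:
    """Append one mask byte after every RGB triple, giving RGBA samples."""
    if len(rgb_samples) != 3 * len(mask_samples):
        raise ValueError("image and mask must have the same pixel count")
    out = bytearray()
    for i, alpha in enumerate(mask_samples):
        out += rgb_samples[3 * i : 3 * i + 3]  # copy R, G, B unchanged
        out.append(alpha)                      # mask value becomes alpha
    return bytes(out)


# Hypothetical 2 x 1 image: one red and one blue pixel.
rgb = bytes([255, 0, 0, 0, 0, 255])
# Matching gray mask: first pixel opaque (255), second transparent (0).
mask = bytes([255, 0])

rgba = add_alpha(rgb, mask)
print(list(rgba))
```

The length check mirrors the caveat above: the combination only makes sense when image and mask agree in size and the mask is single-channel.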