Bug: Missing "image" data in DoclingServe v1.15.0 JSON response (returns null) #576

@NicolasVidalDuque

Description

Basically...

I’m running into a persistent issue with DoclingServe v1.15.0 (docker pull ghcr.io/docling-project/docling-serve:v1.15.0) where the API response fails to include image data. Even when I explicitly configure the service with include_images: true and image_export_mode: "embedded", the image field in the pictures array is always returned as null.

I want to clarify that this isn't a client-side conversion or "helper" issue. I’ve seen previous discussions (like Issue #71) suggesting that docling-core should be used to generate images on the fly, but in this case, the raw JSON response from the server is simply empty for that field. Because the server isn't sending the actual image object data, there’s nothing for the client-side libraries to work with.


Environment

  • Service: DoclingServe v1.15.0
  • Deployment: Official Docker Image
  • Architecture: ARM64
  • Config: image_export_mode: "embedded", include_images: true

The Problem

When I send a document to the /v1/convert/source endpoint, the service successfully identifies the document layout and reports the bounding boxes for pictures. However, it never populates the image field of those pictures.

Actual Behavior

The "pictures" section of the raw JSON response looks like this:

"pictures": [
    {
      "self_ref": "#/pictures/0",
      "label": "picture",
      "prov": [
        {
          "page_no": 3,
          "bbox": {
            "l": 195.3986,
            "t": 719.1842,
            "r": 416.5754,
            "b": 398.1719,
            "coord_origin": "BOTTOMLEFT"
          }
        }
      ],
      "image": null, 
      "annotations": []
    }
]
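
The same check can be run on the raw payload before docling-core is involved, by walking the pictures array and listing every entry whose image is null. This is a minimal sketch, assuming the dict passed in is the document found under document.json_content in the server response (as read in document_converter.py). The helper name find_pictures_without_image is hypothetical.

```python
from __future__ import annotations


def find_pictures_without_image(json_content: dict) -> list[str]:
    """Return the self_ref of every picture whose "image" field is missing or null."""
    missing = []
    for picture in json_content.get("pictures") or []:
        if picture.get("image") is None:
            missing.append(picture.get("self_ref", "<no self_ref>"))
    return missing


# Synthetic input with the same shape as the response above (fields shortened).
sample = {
    "pictures": [
        {"self_ref": "#/pictures/0", "label": "picture", "image": None},
    ]
}
print(find_pictures_without_image(sample))
```

An empty list would mean every picture came back with an image object. In my case every picture is listed.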

Expected Behavior

The image field should not be null. It should contain the full image object (including the base64-encoded image data and relevant metadata), as expected for the "embedded" export mode.
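
For reference, this is roughly the check I would expect to pass once the field is populated. It is only a sketch under assumptions: it assumes the image object carries a uri field holding a base64 data: URI in embedded mode (my reading of docling-core, not something confirmed by the server output here), and decode_embedded_image is a hypothetical helper.

```python
from __future__ import annotations

import base64


def decode_embedded_image(image: dict | None) -> tuple[str, bytes] | None:
    """Return (mimetype, raw bytes) for an embedded image, or None if absent.

    Assumes image["uri"] looks like "data:image/png;base64,<payload>".
    """
    if not image:
        return None
    uri = str(image.get("uri", ""))
    if not uri.startswith("data:") or ";base64," not in uri:
        return None  # not an embedded data URI (for example, a file reference)
    header, payload = uri.split(";base64,", 1)
    mimetype = header[len("data:"):]
    return mimetype, base64.b64decode(payload)


# Synthetic example, not real server output.
fake = {"uri": "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode()}
print(decode_embedded_image(fake))
print(decode_embedded_image(None))
```

With the current response the helper always returns None, because image itself is null.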


Why this is not Issue #71

In Issue #71, the suggestion was to use docling-core to create images from the document. I am already using docling-core to parse the response (see the document_converter.py logic below), but because the server returns null for the image property in the raw payload, the pictures in the resulting DoclingDocument have no image data either. This points to a failure in the server's extraction or serialization pipeline rather than a client-side implementation detail.


Reproduction Context

I am using a Python client to call the service. Below is the configuration being sent to the endpoint:

Request Configuration (docling_service.py)

payload = {
    "options": {
        "to_formats": ["json"],
        "include_images": True,
        "image_export_mode": 'embedded',
        "do_ocr": True,
        "do_table_structure": True,
        "table_mode": "fast",
    },
    "sources": [
        {
            "base64_string": base64_encoded_file,
            "filename": filename,
            "kind": "file",
        }
    ],
    "target": {"kind": "inbody"},
}
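
To narrow down whether the option value matters, the same request body can be built once per export mode and the responses compared. This is a sketch under assumptions: the three mode strings are the image reference modes from docling-core as I understand them (placeholder, embedded, referenced), and build_payload is a hypothetical helper that mirrors the payload above. The actual POST is left out.

```python
import base64

# Assumed mode values; adjust if the server rejects one of them.
IMAGE_EXPORT_MODES = ("placeholder", "embedded", "referenced")


def build_payload(file_bytes: bytes, filename: str, image_export_mode: str) -> dict:
    """Build a /v1/convert/source request body for one image export mode."""
    return {
        "options": {
            "to_formats": ["json"],
            "include_images": True,
            "image_export_mode": image_export_mode,
            "do_ocr": True,
            "do_table_structure": True,
            "table_mode": "fast",
        },
        "sources": [
            {
                "base64_string": base64.b64encode(file_bytes).decode(),
                "filename": filename,
                "kind": "file",
            }
        ],
        "target": {"kind": "inbody"},
    }


# Placeholder bytes stand in for a real PDF here.
payloads = {
    mode: build_payload(b"%PDF-1.4", "sample.pdf", mode)
    for mode in IMAGE_EXPORT_MODES
}
for mode, payload in payloads.items():
    print(mode, payload["options"]["image_export_mode"])
```

If image stays null for all three modes, the option value is probably not what decides the outcome.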

Server Logs

The Docker console indicates that the pipeline stages complete successfully, including the layout and assembly:

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: ... duration=2.687s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: ... duration=0.000s
INFO:docling.document_converter:Finished converting document 1706.03762v7.pdf in 21.45 sec.
INFO: 192.168.65.1 - "POST /v1/convert/source HTTP/1.1" 200 OK

Questions

  • Is the embedded image export mode currently supported/functional in the ARM64 Docker build for v1.15.0?
  • Are there any additional backend dependencies or environment variables required to trigger the actual image cropping and serialization?
  • Since the prov (provenance) data is correctly identifying the bounding boxes, what would cause the service to skip the population of the image field entirely?

Any help or insight into why the server is returning null here would be greatly appreciated!

Docling-serve response (Pictures element only)

"pictures": [
    {
      "self_ref": "#/pictures/0",
      "parent": {
        "cref": "#/body"
      },
      "children": [
        // truncated
        {}
      ],
      "content_layer": "body",
      "meta": null,
      "label": "picture",
      "prov": [
        {
          "page_no": 3,
          "bbox": {
            "l": 195.3986053466797,
            "t": 719.1842193603516,
            "r": 416.5754089355469,
            "b": 398.1719970703125,
            "coord_origin": "BOTTOMLEFT"
          },
          "charspan": [
            0,
            0
          ]
        }
      ],
      "captions": [
        {
          "cref": "#/texts/31"
        }
      ],
      "references": [],
      "footnotes": [],
      "image": null,
      "annotations": []
    },

Files

send_file.py

#!/usr/bin/env python3
"""
send_file.py

Process a PDF file using DoclingService and DocumentConverter,
then save the resulting JSON document to a specified output directory.

Usage:
    python send_file.py

The input PDF path and output directory are hardcoded as per requirements.
"""

import os
import sys
from pathlib import Path

# Import the required modules from the existing codebase
from docling_service import DoclingService
from document_converter import DocumentConverter

# ----------------------------------------------------------------------
# Configuration (adjust if needed)
# ----------------------------------------------------------------------
INPUT_PDF = "docling-serve/files/1706.03762v7.pdf"
OUTPUT_DIR = "docling/docling-serve/files/output"
DOCLING_URL = os.getenv("DOCLING_URL", "http://localhost:5002")

# Conversion parameters (same defaults as in the Streamlit app)
CONVERSION_PARAMS = {
    "to_formats": ["json"],                 # We only need JSON output
    "do_ocr": True,
    "force_ocr": False,
    "include_images": True,
    "do_table_structure": True,
    "table_mode": "fast",                   # "fast" or "accurate"
    "do_picture_description": False,
    "picture_area_threshold": 0.05,
    "do_code_enrichment": False,
    "do_formula_enrichment": False,
}

# ----------------------------------------------------------------------
def main():
    # Check if input file exists
    input_path = Path(INPUT_PDF)
    if not input_path.is_file():
        print(f"❌ Input file not found: {INPUT_PDF}")
        sys.exit(1)

    # Create output directory if it doesn't exist
    output_dir = Path(OUTPUT_DIR)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Prepare output filename (stem + .json)
    output_json = output_dir / f"{input_path.stem}.json"

    print(f"📄 Processing: {input_path.name}")
    print(f"🔗 Docling service: {DOCLING_URL}")

    # Initialize services
    svc = DoclingService(base_url=DOCLING_URL)
    converter = DocumentConverter()

    # Health check
    if not svc.health():
        print("❌ Docling service is not healthy. Please ensure the server is running.")
        sys.exit(1)

    # Read file bytes
    try:
        with open(input_path, "rb") as f:
            file_bytes = f.read()
    except Exception as e:
        print(f"❌ Failed to read input file: {e}")
        sys.exit(1)

    # Perform conversion
    print("🔄 Converting...")
    try:
        raw_response = svc.convert_file(
            file_bytes,
            input_path.name,
            **CONVERSION_PARAMS
        )
    except Exception as e:
        print(f"❌ Conversion request failed: {e}")
        sys.exit(1)

    # Parse the raw response into a usable result object
    try:
        result = converter.convert(raw_response)
    except Exception as e:
        print(f"❌ Failed to parse conversion response: {e}")
        sys.exit(1)

    # Extract the validated DoclingDocument as JSON
    doc_json = result.document.model_dump_json(indent=2)

    # Write JSON to file
    try:
        with open(output_json, "w", encoding="utf-8") as f:
            f.write(doc_json)
        print(f"✅ JSON saved to: {output_json}")
    except Exception as e:
        print(f"❌ Failed to write output file: {e}")
        sys.exit(1)

    # Optional: print a short summary
    stats = result.stats
    print("\n📊 Document profile:")
    print(f"   Pages:     {stats.num_pages}")
    print(f"   Tables:    {stats.num_tables}")
    print(f"   Pictures:  {stats.num_pictures}")
    print(f"   Text items:{stats.num_texts}")
    print(f"   Total items:{stats.total_items}")

# ----------------------------------------------------------------------
if __name__ == "__main__":
    main()

docling_service.py

"""DoclingService — thin HTTP client for the Docling /v1/convert/source endpoint."""

import base64
import json
import os

import requests


class DoclingService:
    """Makes conversion requests to a running Docling server."""

    def __init__(self, base_url: str | None = None, timeout: int = 600):
        self.base_url = (base_url or os.getenv("DOCLING_URL", "http://localhost:5002")).rstrip("/")
        self.timeout = timeout

    # ── public ────────────────────────────────────────────────────────────

    def convert_file(
        self,
        file_bytes: bytes,
        filename: str,
        *,
        to_formats: list[str] | None = None,
        do_ocr: bool = True,
        force_ocr: bool = False,
        include_images: bool = True,
        do_table_structure: bool = True,
        table_mode: str = "fast",
        do_picture_description: bool = False,
        picture_area_threshold: float = 0.05,
        do_code_enrichment: bool = False,
        do_formula_enrichment: bool = False,
    ) -> dict:
        """
        Send *file_bytes* to Docling and return the raw response dict.

        ``to_formats`` always includes ``"json"`` (required for DoclingDocument).
        """
        formats = list(set(to_formats or ["json"]) | {"json"})
        payload = {
            "options": {
                "to_formats": formats,
                "do_ocr": do_ocr,
                "force_ocr": force_ocr,
                "include_images": include_images,
                "image_export_mode": 'embedded',
                "do_table_structure": do_table_structure,
                "table_mode": table_mode,
                "do_picture_description": do_picture_description,
                "picture_description_area_threshold": picture_area_threshold,
                "do_code_enrichment": do_code_enrichment,
                "do_formula_enrichment": do_formula_enrichment,
            },
            "sources": [
                {
                    "base64_string": base64.b64encode(file_bytes).decode(),
                    "filename": filename,
                    "kind": "file",
                }
            ],
            "target": {"kind": "inbody"},
        }

        response = requests.post(
            f"{self.base_url}/v1/convert/source",
            headers={"Content-Type": "application/json"},
            data=json.dumps(payload),
            timeout=self.timeout,
        )
        response.raise_for_status()
        return response.json()

    def health(self) -> bool:
        """Return True if the server responds to a GET /health."""
        try:
            r = requests.get(f"{self.base_url}/health", timeout=5)
            return r.ok
        except Exception:
            return False

document_converter.py

"""DocumentConverter — turns a raw Docling API response into a DoclingDocument + profile."""

from dataclasses import dataclass

from docling_core.transforms.profiler import DocumentProfiler
from docling_core.types.doc import DoclingDocument


@dataclass
class ConversionResult:
    """Holds the parsed document, its profile stats, and the optional markdown."""

    document: DoclingDocument
    stats: object          # DocumentProfiler stats (DocumentStats)
    markdown: str | None   # present only when "md" was requested


class DocumentConverter:
    """
    Converts a raw Docling API response dict into a :class:`ConversionResult`.

    Raises
    ------
    ValueError
        If no JSON payload can be found in the response (needed for DoclingDocument).
    """

    # Keys for the nested outputs dict
    _JSON_OUTPUTS_KEY = "json"

    # Keys the server may use for the markdown payload
    _MD_KEYS = ("md_content", "markdown")
    _MD_OUTPUTS_KEY = "md"

    def convert(self, raw_response: dict) -> ConversionResult:
        json_payload = self._extract_json(raw_response)
        doc = self._build_document(json_payload)
        stats = DocumentProfiler.profile_document(doc)
        markdown = self._extract_markdown(raw_response)
        return ConversionResult(document=doc, stats=stats, markdown=markdown)

    # ── private helpers ───────────────────────────────────────────────────

    def _extract_json(self, response: dict) -> dict | str:
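        # Guard against a missing or null "document" key so the ValueError below
        # is raised instead of an AttributeError on None.
        if json_content := (response.get("document") or {}).get("json_content"):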
            return json_content
        raise ValueError(
            "No JSON payload found in Docling response. "
            f"Available keys: {list(response.keys())}"
        )

    def _build_document(self, json_payload: dict | str) -> DoclingDocument:
        """
        Deserialise *json_payload* into a DoclingDocument.

        The Docling API returns a dict that uses ``filename`` instead of ``name``.
        We normalise it before validation so Pydantic does not complain about
        the missing ``name`` field.
        """
        if isinstance(json_payload, str):
            import json
            json_payload = json.loads(json_payload)

        # Normalise: copy ``filename`` → ``name`` when ``name`` is absent
        if "name" not in json_payload and "filename" in json_payload:
            json_payload = {**json_payload, "name": json_payload["filename"]}

        return DoclingDocument.model_validate(json_payload)

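    def _extract_markdown(self, response: dict) -> str | None:
        # Look at the top level first, then inside the nested "document" object,
        # which is where _extract_json finds "json_content".
        document = response.get("document") or {}
        for container in (response, document):
            for key in self._MD_KEYS:
                if value := container.get(key):
                    return value
        outputs = response.get("outputs") or {}
        return outputs.get(self._MD_OUTPUTS_KEY)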

docling-serve docker debug console output

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[4, 5, 6, 7] start=1775834413.797 end=1775834415.422 duration=1.625s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[4] start=1775834415.423 end=1775834415.423 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[5] start=1775834415.424 end=1775834415.424 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[6] start=1775834415.425 end=1775834415.426 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[7] start=1775834415.428 end=1775834415.428 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: run_id=3 pages=[8, 9, 10, 11] start=1775834413.797 end=1775834416.484 duration=2.687s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: run_id=3 pages=[12, 13, 14, 15] start=1775834416.484 end=1775834419.347 duration=2.864s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[8, 9, 10, 11] start=1775834416.484 end=1775834424.808 duration=8.323s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[12, 13, 14, 15] start=1775834424.808 end=1775834424.808 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[8] start=1775834424.808 end=1775834424.808 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[9] start=1775834424.809 end=1775834424.809 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[10] start=1775834424.810 end=1775834424.810 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[11] start=1775834424.812 end=1775834424.812 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[12] start=1775834424.813 end=1775834424.813 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[13] start=1775834424.815 end=1775834424.815 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[14] start=1775834424.816 end=1775834424.816 duration=0.000s

DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[15] start=1775834424.818 end=1775834424.818 duration=0.000s

INFO:docling.document_converter:Finished converting document 1706.03762v7.pdf in 21.45 sec.

INFO:docling_jobkit.convert.results:Processed 1 docs in 21.45 seconds.

INFO:docling_jobkit.orchestrators.local.worker:Worker 1 completed job 925c5fb8-2236-4957-8ef7-41b3823bbc28 in 21.45 seconds

DEBUG:docling_jobkit.orchestrators.local.worker:Worker 1 completely done with 925c5fb8-2236-4957-8ef7-41b3823bbc28

INFO:     192.168.65.1:59294 - "POST /v1/convert/source HTTP/1.1" 200 OK

INFO:docling_jobkit.orchestrators.local.orchestrator:Deleting result of task task_id='925c5fb8-2236-4957-8ef7-41b3823bbc28'

INFO:docling_jobkit.orchestrators.base_orchestrator:Deleting task_id='925c5fb8-2236-4957-8ef7-41b3823bbc28'
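
The stage names that appear in this output can be tallied with a few lines of Python. This is a sketch that assumes the PIPELINE_PROFILING line format shown above. The helper name summarize_stages is hypothetical, and the summed durations are simply the reported values added up, not wall-clock time.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches e.g. "PIPELINE_PROFILING Stage table: run_id=3 ... duration=1.625s"
STAGE_RE = re.compile(
    r"PIPELINE_PROFILING Stage (?P<stage>\w+): .*?duration=(?P<duration>[\d.]+)s"
)


def summarize_stages(log_text: str) -> dict[str, float]:
    """Sum the reported duration per pipeline stage name."""
    totals: dict[str, float] = defaultdict(float)
    for match in STAGE_RE.finditer(log_text):
        totals[match.group("stage")] += float(match.group("duration"))
    return dict(totals)


# Three lines copied from the console output above.
log_excerpt = """\
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[4, 5, 6, 7] start=1775834413.797 end=1775834415.422 duration=1.625s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[4] start=1775834415.423 end=1775834415.423 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: run_id=3 pages=[8, 9, 10, 11] start=1775834413.797 end=1775834416.484 duration=2.687s
"""
for stage, seconds in summarize_stages(log_excerpt).items():
    print(f"{stage}: {seconds:.3f}s")
```

In the excerpt I captured, only the layout, table, and assemble stages are reported. I do not see a separate stage for picture cropping, though I do not know whether one would be logged at all.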
