Basically...
I’m running into a persistent issue with DoclingServe v1.15.0 (docker pull ghcr.io/docling-project/docling-serve:v1.15.0) where the API response fails to include image data. Even when I explicitly configure the service with include_images: true and image_export_mode: "embedded", the image field in the pictures array is always returned as null.
I want to clarify that this isn't a client-side conversion or "helper" issue. I’ve seen previous discussions (like Issue #71) suggesting that docling-core should be used to generate images on the fly, but in this case, the raw JSON response from the server is simply empty for that field. Because the server isn't sending the actual image object data, there’s nothing for the client-side libraries to work with.
Environment
- Service: DoclingServe v1.15.0
- Deployment: Official Docker Image
- Architecture: ARM64
- Config: image_export_mode: "embedded", include_images: true
The Problem
When I send a document to the /v1/convert/source endpoint, the service successfully identifies the document layout and marks the bounding boxes for pictures. However, it fails to populate the image dataclass.
Actual Behavior
The "pictures" section of the raw JSON response looks like this:
"pictures": [
  {
    "self_ref": "#/pictures/0",
    "label": "picture",
    "prov": [
      {
        "page_no": 3,
        "bbox": {
          "l": 195.3986,
          "t": 719.1842,
          "r": 416.5754,
          "b": 398.1719,
          "coord_origin": "BOTTOMLEFT"
        }
      }
    ],
    "image": null,
    "annotations": []
  }
]
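To confirm that the null comes from the server and not from client-side parsing, the raw response can be checked before docling-core touches it. This is a minimal sketch, assuming the response layout that document_converter.py relies on (document, then json_content, then pictures); the helper name count_missing_images is hypothetical and not part of any Docling package.

```python
import json


def count_missing_images(raw_response: dict) -> tuple[int, int]:
    """Return (total pictures, pictures whose "image" is null) for a raw response."""
    document = raw_response.get("document") or {}
    json_content = document.get("json_content") or {}
    # Assumption: json_content may arrive as a JSON string instead of a dict.
    if isinstance(json_content, str):
        json_content = json.loads(json_content)
    pictures = json_content.get("pictures") or []
    missing = sum(1 for picture in pictures if picture.get("image") is None)
    return len(pictures), missing


# Sample shaped like the excerpt above (fields shortened for brevity).
sample = {
    "document": {
        "json_content": {
            "pictures": [
                {"self_ref": "#/pictures/0", "label": "picture", "image": None},
            ]
        }
    }
}
total, missing = count_missing_images(sample)
print(f"{missing} of {total} pictures have no image")
```

Running this on the dict returned by response.json() keeps DoclingDocument validation out of the picture entirely.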
Expected Behavior
The image field should not be null. It should contain the full image dataclass (including the base64-encoded string and relevant metadata) as per the "embedded" export mode specification.
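For reference, this is roughly how I would consume the field if it were populated. It is only a sketch under an assumption: that the embedded image object carries a uri field holding a base64 data URI (data:&lt;mimetype&gt;;base64,&lt;payload&gt;). The function name decode_embedded_image and the example value are hypothetical.

```python
import base64


def decode_embedded_image(image: dict) -> tuple[str, bytes]:
    """Split a base64 data URI into (mimetype, raw image bytes).

    Assumes image["uri"] looks like "data:image/png;base64,<payload>".
    """
    header, _, payload = image["uri"].partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("image uri is not a base64 data URI")
    mimetype = header[len("data:"):-len(";base64")]
    return mimetype, base64.b64decode(payload)


# Hypothetical example value; a real response would carry a full image payload.
example = {
    "mimetype": "image/png",
    "uri": "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode(),
}
mimetype, data = decode_embedded_image(example)
print(mimetype, len(data))
```

With the current server response there is nothing to pass into such a function, because image is null for every picture.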
Why this is not Issue #71
In Issue #71, the suggestion was to use docling-core to create images from the document. I am already using docling-core to handle the conversion (see my document_converter.py logic), but because the server returns a null value for the image property in the raw payload, the resulting DoclingDocument object is also empty. This appears to be a failure in the server's serialization or extraction pipeline rather than a client-side implementation detail.
Reproduction Context
I am using a Python client to call the service. Below is the configuration being sent to the endpoint:
Request Configuration (docling_service.py)
payload = {
    "options": {
        "to_formats": ["json"],
        "include_images": True,
        "image_export_mode": "embedded",
        "do_ocr": True,
        "do_table_structure": True,
        "table_mode": "fast",
    },
    "sources": [
        {
            "base64_string": base64_encoded_file,
            "filename": filename,
            "kind": "file",
        }
    ],
    "target": {"kind": "inbody"},
}
Server Logs
The Docker console indicates that the pipeline stages complete successfully, including the layout and assembly:
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: ... duration=2.687s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: ... duration=0.000s
INFO:docling.document_converter:Finished converting document 1706.03762v7.pdf in 21.45 sec.
INFO: 192.168.65.1 - "POST /v1/convert/source HTTP/1.1" 200 OK
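To see which pipeline stages the server actually reports, the PIPELINE_PROFILING lines can be tallied per stage. This is a small sketch using only the standard library; the regular expression is my own reading of the log format, and the sample lines are copied from the debug console output further down.

```python
import re
from collections import defaultdict

# Matches e.g. "... PIPELINE_PROFILING Stage layout: ... duration=2.687s"
STAGE_RE = re.compile(r"PIPELINE_PROFILING Stage (\w+): .*duration=([\d.]+)s")


def stage_durations(log_lines: list[str]) -> dict[str, float]:
    """Sum the reported duration (in seconds) for each pipeline stage."""
    totals: dict[str, float] = defaultdict(float)
    for line in log_lines:
        match = STAGE_RE.search(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
    return dict(totals)


log_lines = [
    "DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[4, 5, 6, 7] start=1775834413.797 end=1775834415.422 duration=1.625s",
    "DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[4] start=1775834415.423 end=1775834415.423 duration=0.000s",
    "DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: run_id=3 pages=[8, 9, 10, 11] start=1775834413.797 end=1775834416.484 duration=2.687s",
    "INFO:docling.document_converter:Finished converting document 1706.03762v7.pdf in 21.45 sec.",
]
totals = stage_durations(log_lines)
for stage, seconds in sorted(totals.items()):
    print(f"{stage}: {seconds:.3f}s")
```

Feeding the full container log into stage_durations gives a quick list of every stage name that appears for a given run.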
Questions
- Is the embedded image export mode currently supported/functional in the ARM64 Docker build for v1.15.0?
- Are there any additional backend dependencies or environment variables required to trigger the actual image cropping and serialization?
- Since the prov (provenance) data is correctly identifying the bounding boxes, what would cause the service to skip the population of the image field entirely?
Any help or insight into why the server is returning null here would be greatly appreciated!
Docling-serve response (Pictures element only)
"pictures": [
  {
    "self_ref": "#/pictures/0",
    "parent": {
      "cref": "#/body"
    },
    "children": [
      // truncated
      {}
    ],
    "content_layer": "body",
    "meta": null,
    "label": "picture",
    "prov": [
      {
        "page_no": 3,
        "bbox": {
          "l": 195.3986053466797,
          "t": 719.1842193603516,
          "r": 416.5754089355469,
          "b": 398.1719970703125,
          "coord_origin": "BOTTOMLEFT"
        },
        "charspan": [
          0,
          0
        ]
      }
    ],
    "captions": [
      {
        "cref": "#/texts/31"
      }
    ],
    "references": [],
    "footnotes": [],
    "image": null,
    "annotations": []
  },
Files
send_file.py
#!/usr/bin/env python3
"""
send_file.py

Process a PDF file using DoclingService and DocumentConverter,
then save the resulting JSON document to a specified output directory.

Usage:
    python send_file.py

The input PDF path and output directory are hardcoded as per requirements.
"""
import json
import os
import sys
from pathlib import Path

# Import the required modules from the existing codebase
from docling_service import DoclingService
from document_converter import DocumentConverter

# ----------------------------------------------------------------------
# Configuration (adjust if needed)
# ----------------------------------------------------------------------
INPUT_PDF = "docling-serve/files/1706.03762v7.pdf"
OUTPUT_DIR = "docling/docling-serve/files/output"
DOCLING_URL = os.getenv("DOCLING_URL", "http://localhost:5002")

# Conversion parameters (same defaults as in the Streamlit app)
CONVERSION_PARAMS = {
    "to_formats": ["json"],  # We only need JSON output
    "do_ocr": True,
    "force_ocr": False,
    "include_images": True,
    "do_table_structure": True,
    "table_mode": "fast",  # "fast" or "accurate"
    "do_picture_description": False,
    "picture_area_threshold": 0.05,
    "do_code_enrichment": False,
    "do_formula_enrichment": False,
}
# ----------------------------------------------------------------------


def main():
    # Check if input file exists
    input_path = Path(INPUT_PDF)
    if not input_path.is_file():
        print(f"❌ Input file not found: {INPUT_PDF}")
        sys.exit(1)

    # Create output directory if it doesn't exist
    output_dir = Path(OUTPUT_DIR)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Prepare output filename (stem + .json)
    output_json = output_dir / f"{input_path.stem}.json"
    print(f"📄 Processing: {input_path.name}")
    print(f"🔗 Docling service: {DOCLING_URL}")

    # Initialize services
    svc = DoclingService(base_url=DOCLING_URL)
    converter = DocumentConverter()

    # Health check
    if not svc.health():
        print("❌ Docling service is not healthy. Please ensure the server is running.")
        sys.exit(1)

    # Read file bytes
    try:
        with open(input_path, "rb") as f:
            file_bytes = f.read()
    except Exception as e:
        print(f"❌ Failed to read input file: {e}")
        sys.exit(1)

    # Perform conversion
    print("🔄 Converting...")
    try:
        raw_response = svc.convert_file(
            file_bytes,
            input_path.name,
            **CONVERSION_PARAMS
        )
    except Exception as e:
        print(f"❌ Conversion request failed: {e}")
        sys.exit(1)

    # Parse the raw response into a usable result object
    try:
        result = converter.convert(raw_response)
    except Exception as e:
        print(f"❌ Failed to parse conversion response: {e}")
        sys.exit(1)

    # Extract the validated DoclingDocument as JSON
    doc_json = result.document.model_dump_json(indent=2)

    # Write JSON to file
    try:
        with open(output_json, "w", encoding="utf-8") as f:
            f.write(doc_json)
        print(f"✅ JSON saved to: {output_json}")
    except Exception as e:
        print(f"❌ Failed to write output file: {e}")
        sys.exit(1)

    # Optional: print a short summary
    stats = result.stats
    print("\n📊 Document profile:")
    print(f" Pages: {stats.num_pages}")
    print(f" Tables: {stats.num_tables}")
    print(f" Pictures: {stats.num_pictures}")
    print(f" Text items:{stats.num_texts}")
    print(f" Total items:{stats.total_items}")


# ----------------------------------------------------------------------
if __name__ == "__main__":
    main()
docling_service.py
"""DoclingService: thin HTTP client for the Docling /v1/convert/source endpoint."""
import base64
import json
import os

import requests


class DoclingService:
    """Makes conversion requests to a running Docling server."""

    def __init__(self, base_url: str | None = None, timeout: int = 600):
        self.base_url = (base_url or os.getenv("DOCLING_URL", "http://localhost:5002")).rstrip("/")
        self.timeout = timeout

    # ── public ────────────────────────────────────────────────────────────
    def convert_file(
        self,
        file_bytes: bytes,
        filename: str,
        *,
        to_formats: list[str] | None = None,
        do_ocr: bool = True,
        force_ocr: bool = False,
        include_images: bool = True,
        do_table_structure: bool = True,
        table_mode: str = "fast",
        do_picture_description: bool = False,
        picture_area_threshold: float = 0.05,
        do_code_enrichment: bool = False,
        do_formula_enrichment: bool = False,
    ) -> dict:
        """
        Send *file_bytes* to Docling and return the raw response dict.

        ``to_formats`` always includes ``"json"`` (required for DoclingDocument).
        """
        formats = list(set(to_formats or ["json"]) | {"json"})
        payload = {
            "options": {
                "to_formats": formats,
                "do_ocr": do_ocr,
                "force_ocr": force_ocr,
                "include_images": include_images,
                "image_export_mode": "embedded",
                "do_table_structure": do_table_structure,
                "table_mode": table_mode,
                "do_picture_description": do_picture_description,
                "picture_description_area_threshold": picture_area_threshold,
                "do_code_enrichment": do_code_enrichment,
                "do_formula_enrichment": do_formula_enrichment,
            },
            "sources": [
                {
                    "base64_string": base64.b64encode(file_bytes).decode(),
                    "filename": filename,
                    "kind": "file",
                }
            ],
            "target": {"kind": "inbody"},
        }
        response = requests.post(
            f"{self.base_url}/v1/convert/source",
            headers={"Content-Type": "application/json"},
            data=json.dumps(payload),
            timeout=self.timeout,
        )
        response.raise_for_status()
        return response.json()

    def health(self) -> bool:
        """Return True if the server responds to a GET /health."""
        try:
            r = requests.get(f"{self.base_url}/health", timeout=5)
            return r.ok
        except Exception:
            return False
document_converter.py
"""DocumentConverter: turns a raw Docling API response into a DoclingDocument + profile."""
import json
from dataclasses import dataclass

from docling_core.transforms.profiler import DocumentProfiler
from docling_core.types.doc import DoclingDocument


@dataclass
class ConversionResult:
    """Holds the parsed document, its profile stats, and the optional markdown."""

    document: DoclingDocument
    stats: object  # DocumentProfiler stats (DocumentStats)
    markdown: str | None  # present only when "md" was requested


class DocumentConverter:
    """
    Converts a raw Docling API response dict into a :class:`ConversionResult`.

    Raises
    ------
    ValueError
        If no JSON payload can be found in the response (needed for DoclingDocument).
    """

    # Keys for the nested outputs dict
    _JSON_OUTPUTS_KEY = "json"
    # Keys the server may use for the markdown payload
    _MD_KEYS = ("md_content", "markdown")
    _MD_OUTPUTS_KEY = "md"

    def convert(self, raw_response: dict) -> ConversionResult:
        json_payload = self._extract_json(raw_response)
        doc = self._build_document(json_payload)
        stats = DocumentProfiler.profile_document(doc)
        markdown = self._extract_markdown(raw_response)
        return ConversionResult(document=doc, stats=stats, markdown=markdown)

    # ── private helpers ───────────────────────────────────────────────────
    def _extract_json(self, response: dict) -> dict | str:
        # Guard against a missing or null "document" key so the ValueError
        # below is raised instead of an AttributeError on None.
        document = response.get("document") or {}
        if json_content := document.get("json_content"):
            return json_content
        raise ValueError(
            "No JSON payload found in Docling response. "
            f"Available keys: {list(response.keys())}"
        )

    def _build_document(self, json_payload: dict | str) -> DoclingDocument:
        """
        Deserialise *json_payload* into a DoclingDocument.

        The Docling API returns a dict that uses ``filename`` instead of ``name``.
        We normalise it before validation so Pydantic does not complain about
        the missing ``name`` field.
        """
        if isinstance(json_payload, str):
            json_payload = json.loads(json_payload)
        # Normalise: copy ``filename`` → ``name`` when ``name`` is absent
        if "name" not in json_payload and "filename" in json_payload:
            json_payload = {**json_payload, "name": json_payload["filename"]}
        return DoclingDocument.model_validate(json_payload)

    def _extract_markdown(self, response: dict) -> str | None:
        for key in self._MD_KEYS:
            if value := response.get(key):
                return value
        outputs = response.get("outputs", {})
        return outputs.get(self._MD_OUTPUTS_KEY)
docling-serve docker debug console output
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[4, 5, 6, 7] start=1775834413.797 end=1775834415.422 duration=1.625s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[4] start=1775834415.423 end=1775834415.423 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[5] start=1775834415.424 end=1775834415.424 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[6] start=1775834415.425 end=1775834415.426 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[7] start=1775834415.428 end=1775834415.428 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: run_id=3 pages=[8, 9, 10, 11] start=1775834413.797 end=1775834416.484 duration=2.687s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage layout: run_id=3 pages=[12, 13, 14, 15] start=1775834416.484 end=1775834419.347 duration=2.864s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[8, 9, 10, 11] start=1775834416.484 end=1775834424.808 duration=8.323s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage table: run_id=3 pages=[12, 13, 14, 15] start=1775834424.808 end=1775834424.808 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[8] start=1775834424.808 end=1775834424.808 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[9] start=1775834424.809 end=1775834424.809 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[10] start=1775834424.810 end=1775834424.810 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[11] start=1775834424.812 end=1775834424.812 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[12] start=1775834424.813 end=1775834424.813 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[13] start=1775834424.815 end=1775834424.815 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[14] start=1775834424.816 end=1775834424.816 duration=0.000s
DEBUG:docling.pipeline.standard_pdf_pipeline:PIPELINE_PROFILING Stage assemble: run_id=3 pages=[15] start=1775834424.818 end=1775834424.818 duration=0.000s
INFO:docling.document_converter:Finished converting document 1706.03762v7.pdf in 21.45 sec.
INFO:docling_jobkit.convert.results:Processed 1 docs in 21.45 seconds.
INFO:docling_jobkit.orchestrators.local.worker:Worker 1 completed job 925c5fb8-2236-4957-8ef7-41b3823bbc28 in 21.45 seconds
DEBUG:docling_jobkit.orchestrators.local.worker:Worker 1 completely done with 925c5fb8-2236-4957-8ef7-41b3823bbc28
INFO: 192.168.65.1:59294 - "POST /v1/convert/source HTTP/1.1" 200 OK
INFO:docling_jobkit.orchestrators.local.orchestrator:Deleting result of task task_id='925c5fb8-2236-4957-8ef7-41b3823bbc28'
INFO:docling_jobkit.orchestrators.base_orchestrator:Deleting task_id='925c5fb8-2236-4957-8ef7-41b3823bbc28'