Run docling-serve VLM through external API #559

I am trying to get docling-serve to use a VLM by default through an external API endpoint, so far without success.
The system keeps looking for a local model and seems to ignore all the variables I have set.

Using docling-serve 1.15.1.

Environment variables:
DOCLING_SERVE_ENABLE_REMOTE_SERVICES=true
DOCLING_SERVE_ALLOW_CUSTOM_VLM_CONFIG=true
DOCLING_SERVE_VLM_API_URL="https://xxx/v1/chat/completions"
DOCLING_SERVE_VLM_API_HEADERS_JSON='{"Authorization":"Bearer xxx"}'
DOCLING_SERVE_VLM_API_PARAMS_JSON='{"model": "granite_docling", "max_completion_tokens": 1500}'
DOCLING_SERVE_VLM_RESPONSE_FORMAT="<enum_value>"
DOCLING_SERVE_VLM_TEMPERATURE="0.0"
DOCLING_SERVE_DEFAULT_PIPELINE=vlm
DOCLING_SERVE_MAX_SYNC_WAIT=600
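
To rule out quoting problems in the JSON-valued variables, a small check like the following could be run in the same environment before starting the server. This is only a sketch: the helper name `check_json_vars` is hypothetical and not part of docling-serve, and it only tests that the values parse as JSON objects, not whether docling-serve actually reads these variable names.

```python
import json
import os

# Variable names copied from the list above. Whether docling-serve reads
# them is NOT verified here; this only checks that the values are JSON.
JSON_VARS = (
    "DOCLING_SERVE_VLM_API_HEADERS_JSON",
    "DOCLING_SERVE_VLM_API_PARAMS_JSON",
)


def check_json_vars(env, names=JSON_VARS):
    """Return {name: problem} for each variable that is unset or not a JSON object."""
    problems = {}
    for name in names:
        raw = env.get(name)
        if raw is None:
            problems[name] = "not set"
            continue
        try:
            value = json.loads(raw)
        except json.JSONDecodeError as exc:
            # exc.msg and exc.pos describe what failed and at which character
            problems[name] = f"invalid JSON: {exc.msg} (char {exc.pos})"
            continue
        if not isinstance(value, dict):
            problems[name] = "expected a JSON object"
    return problems


if __name__ == "__main__":
    # Prints one line per problematic variable; prints nothing if all parse.
    for name, problem in check_json_vars(os.environ).items():
        print(f"{name}: {problem}")
```

A misplaced quote or a missing colon inside one of the JSON strings would show up here as an "invalid JSON" entry for that variable.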

The error shown is:
ERROR:docling_jobkit.orchestrators.local.worker:Worker 1 failed to process job 420d1082-ed82-4003-bdb6-06a843786198: Model 'ibm-granite/granite-docling-258M' not found in artifacts_path.
Expected location: /opt/app-root/src/.cache/docling/models/ibm-granite--granite-docling-258M

I have tried both from the UI and from an external client.

