Hello,
Thanks for this fantastic tool. The potential is incredible; however, to be honest, the documentation is lacking.
I appreciate the effort that went into this incredible codebase, but while trying to get the parameters `picture_description_api` and `vlm_pipeline_model_api` to work together, I found they were deprecated in favor of `picture_description_custom_config` and `vlm_pipeline_custom_config`. However, there isn't a single example in the documentation that helps us understand how the new version works.
In the past, I was able to configure `vlm_pipeline_model_api` in the following way:
```json
{
  "pipeline": "vlm",
  "do_ocr": false,
  "do_picture_description": true,
  "image_export_mode": "embedded",
  "include_images": true,
  "to_formats": ["md"],
  "vlm_pipeline_model_api": "{\"url\": \"https://my_openai_compatible_endpoint/api/chat/completions\", \"headers\": {\"Authorization\": \"Bearer sk-my-token\"}, \"params\": {\"model\": \"MY-VLM-MODEL\", \"max_tokens\": 8192}, \"prompt\": \"Convert this page to markdown. Do not miss any text and only output the bare markdown! Your response will be used as a RAG by another model, it is important you don't make any mistake. DO YOUR BEST\", \"response_format\": \"markdown\", \"timeout\": 180.0, \"concurrency\": 16}",
  "picture_description_api": "{\"url\": \"https://my_openai_compatible_endpoint/api/chat/completions\", \"headers\": {\"Authorization\": \"Bearer sk-my-token\"}, \"params\": {\"model\": \"MY-VLM-MODEL\"}, \"prompt\": \"Describe this image in detail for RAG purposes, your description must be absolutely detailed\", \"timeout\": 180.0, \"concurrency\": 16}"
}
```
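For context, this is roughly how I have been building and submitting that payload. The host, port, endpoint path, and request envelope here are from my local setup and are guesses/placeholders, not something taken from the docs, so adjust them to your deployment:

```python
import json
import urllib.request

# Inner API configs are passed as JSON-encoded strings, exactly as the
# (now deprecated) parameters expected them.
VLM_API_CONFIG = {
    "url": "https://my_openai_compatible_endpoint/api/chat/completions",
    "headers": {"Authorization": "Bearer sk-my-token"},
    "params": {"model": "MY-VLM-MODEL", "max_tokens": 8192},
    "prompt": (
        "Convert this page to markdown. Do not miss any text and only output "
        "the bare markdown! Your response will be used as a RAG by another "
        "model, it is important you don't make any mistake. DO YOUR BEST"
    ),
    "response_format": "markdown",
    "timeout": 180.0,
    "concurrency": 16,
}

PICTURE_API_CONFIG = {
    "url": "https://my_openai_compatible_endpoint/api/chat/completions",
    "headers": {"Authorization": "Bearer sk-my-token"},
    "params": {"model": "MY-VLM-MODEL"},
    "prompt": (
        "Describe this image in detail for RAG purposes, your description "
        "must be absolutely detailed"
    ),
    "timeout": 180.0,
    "concurrency": 16,
}


def build_options() -> dict:
    """Assemble the conversion options shown above (old parameter names)."""
    return {
        "pipeline": "vlm",
        "do_ocr": False,
        "do_picture_description": True,
        "image_export_mode": "embedded",
        "include_images": True,
        "to_formats": ["md"],
        "vlm_pipeline_model_api": json.dumps(VLM_API_CONFIG),
        "picture_description_api": json.dumps(PICTURE_API_CONFIG),
    }


def convert(server_url: str, source_url: str) -> str:
    """POST a conversion request. The endpoint path and the shape of the
    request body ('options' + 'sources') are assumptions from my setup."""
    body = {"options": build_options(), "sources": [{"url": source_url}]}
    req = urllib.request.Request(
        f"{server_url}/v1alpha/convert/source",  # placeholder endpoint
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```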
How can I achieve this now? I want to integrate it with Open WebUI.
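For reference, my best guess at the new format, purely inferred from the parameter names since I couldn't find it documented, was something like the following, but I don't know whether the keys or the nesting are right:

```json
{
  "pipeline": "vlm",
  "do_picture_description": true,
  "vlm_pipeline_custom_config": {
    "url": "https://my_openai_compatible_endpoint/api/chat/completions",
    "headers": {"Authorization": "Bearer sk-my-token"},
    "params": {"model": "MY-VLM-MODEL", "max_tokens": 8192}
  },
  "picture_description_custom_config": {
    "url": "https://my_openai_compatible_endpoint/api/chat/completions",
    "headers": {"Authorization": "Bearer sk-my-token"},
    "params": {"model": "MY-VLM-MODEL"}
  }
}
```

An example of the expected schema for these two parameters would clear this up immediately.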