LLM Node cannot read file with a custom field type of "File" when using a multimodal model #12269

meetwab · 2024-12-31T09:31:54Z

Self Checks

This is only for bug reports; if you would like to ask a question, please head to Discussions.
I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.

Dify version

0.14.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Step

Create a Chatflow application.
Add an LLM node and select an LLM with file processing capabilities.
Add a txt file variable in the start node.
Enter the file variable in the system prompt of the LLM node.
Run the test. The "files" array in the LLM node input remains empty.

Start Node Output

{
  "File": null,
  "BookName": "活着",
  "sys.query": "1",
  "sys.files": [
    {
      "dify_model_identity": "__dify__file__",
      "id": null,
      "tenant_id": "6687b21a-719e-445b-91f2-6151dfc72a82",
      "type": "document",
      "transfer_method": "local_file",
      "remote_url": "",
      "related_id": "49a01f90-ea92-4bab-913c-d029644ef04c",
      "filename": "余华 活着 (1).txt",
      "extension": ".txt",
      "mime_type": "text/plain",
      "size": 183406,
      "url": "/files/49a01f90-ea92-4bab-913c-d029644ef04c/file-preview?timestamp=1735631394&nonce=6559a45400e623bc133fa7c6c84cb785&sign=f3EyJDlM8UqFta8XwugcJIjO2ffrF1FEHAuUYUpyltE="
    }
  ],
  "sys.conversation_id": "ed1658a9-67ee-4b57-8e7c-b1681f0748fa",
  "sys.user_id": "8f1e639a-707c-4c98-bb75-9000aca33041",
  "sys.dialogue_count": 0,
  "sys.app_id": "68798d8e-6b35-4627-a0bc-6175038b0853",
  "sys.workflow_id": "09321b36-d496-4d85-8dcb-581408f9ac2c",
  "sys.workflow_run_id": "29da3e44-9a1c-4247-ab2c-ee11a680fa96"
}
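
In the Start Node output above, the custom "File" variable is null while the uploaded document appears under "sys.files". A minimal sketch for checking which file inputs are actually populated is shown here. It is plain Python run against a trimmed copy of the JSON above, not part of Dify; the helper names `is_file` and `summarize_file_inputs` are hypothetical.

```python
FILE_MARKER = "__dify__file__"  # value of "dify_model_identity" in the output above


def is_file(value):
    """Return True if the value looks like a serialized file object."""
    return isinstance(value, dict) and value.get("dify_model_identity") == FILE_MARKER


def summarize_file_inputs(start_output):
    """Map each null or file-bearing key to None or a list of filenames."""
    summary = {}
    for key, value in start_output.items():
        if value is None:
            summary[key] = None  # variable exists but nothing was bound to it
        elif is_file(value):
            summary[key] = [value.get("filename")]
        elif isinstance(value, list) and value and all(is_file(v) for v in value):
            summary[key] = [v.get("filename") for v in value]
    return summary


# Trimmed copy of the Start Node output from this report.
start_output = {
    "File": None,
    "BookName": "活着",
    "sys.query": "1",
    "sys.files": [
        {
            "dify_model_identity": "__dify__file__",
            "type": "document",
            "filename": "余华 活着 (1).txt",
            "mime_type": "text/plain",
        }
    ],
}

print(summarize_file_inputs(start_output))
```

If the variable referenced in the prompt is the one reported as None, the prompt has no file to attach, regardless of what "sys.files" contains.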

LLM Node Input

{
  "model_mode": "chat",
  "prompts": [
    {
      "role": "system",
      "text": "以下是用户上传的一份文件：\n请根据文件中的信息，回答用户的问题。\n",
      "files": []
    },
    {
      "role": "user",
      "text": "1\n你看到了什么内容？\n\n",
      "files": []
    }
  ],
  "model_provider": "azure_openai",
  "model_name": "gpt-4o"
}
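
To confirm the symptom from a saved copy of the LLM node input, a short check like the following can list the prompts that carry no files. This is an illustrative sketch, assuming the JSON shape shown above; `prompts_without_files` is a hypothetical helper, not a Dify function.

```python
def prompts_without_files(llm_input):
    """Return the roles of prompts whose "files" list is missing or empty."""
    return [
        prompt.get("role")
        for prompt in llm_input.get("prompts", [])
        if not prompt.get("files")
    ]


# Trimmed copy of the LLM node input from this report.
llm_input = {
    "model_mode": "chat",
    "prompts": [
        {"role": "system", "text": "...", "files": []},
        {"role": "user", "text": "...", "files": []},
    ],
    "model_provider": "azure_openai",
    "model_name": "gpt-4o",
}

print(prompts_without_files(llm_input))
```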

✔️ Expected Behavior

The output JSON (files) should include the file content.

❌ Actual Behavior

In the LLM node JSON, every "files" array is empty.

dosubot · 2024-12-31T09:33:21Z

Hello @meetwab! I'm here to help you with any bugs, questions, or contributions you have for the project. Let's work together to solve this issue!

I found a similar open issue regarding the LLM Node's inability to read files with a custom field type of "File" when using a multimodal model. The issue is due to the current limitations of the Dify platform, where the LLM Node only supports recognizing image files and does not support reading document files directly.

To address this, you can use a List Operator node to preprocess the array of file variables. This node can sort and filter the files, so that each file type is sent to its corresponding processing node. For image files, ensure that the vision feature is enabled in the LLM Node so that they can be processed. You can find more details in the related issue [1].
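
The filtering idea described above can be illustrated outside Dify with a short Python sketch. This is not the node's implementation; it only shows how a file array could be split by its "type" field so that each group goes to a different downstream step. The "image" entry and the function name `group_files_by_type` are assumptions for illustration; the report itself only shows a "document" entry.

```python
from collections import defaultdict


def group_files_by_type(files):
    """Group file dicts by their "type" field, preserving input order."""
    groups = defaultdict(list)
    for file in files:
        groups[file.get("type", "unknown")].append(file)
    return dict(groups)


files = [
    # Shape taken from "sys.files" in this report (trimmed).
    {"type": "document", "filename": "余华 活着 (1).txt", "mime_type": "text/plain"},
    # Hypothetical image entry, added only to show the split.
    {"type": "image", "filename": "cover.png", "mime_type": "image/png"},
]

groups = group_files_by_type(files)
for file_type, members in groups.items():
    print(file_type, [f["filename"] for f in members])
```

With the groups separated, image files could go to an LLM node with vision enabled, while document files would first need a step that turns them into text.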


dosubot bot added the 🐞 bug Something isn't working label Dec 31, 2024

cpwan mentioned this issue Jan 3, 2025

Handle file by Code Node #10528

Closed

5 tasks
