LLM Node cannot read file with a custom field type of "File" when using a multimodal model #12269

Open
5 tasks done
meetwab opened this issue Dec 31, 2024 · 1 comment
Labels
🐞 bug Something isn't working


meetwab commented Dec 31, 2024

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.14.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Create a Chatflow application.
  2. Add an LLM node and select an LLM with file processing capabilities.
  3. Add a file variable (for a .txt file) in the Start node.
  4. Reference the file variable in the system prompt of the LLM node.
  5. Run the test. The files array in the LLM node input remains empty.

Start Node Output

{
  "File": null,
  "BookName": "活着",
  "sys.query": "1",
  "sys.files": [
    {
      "dify_model_identity": "__dify__file__",
      "id": null,
      "tenant_id": "6687b21a-719e-445b-91f2-6151dfc72a82",
      "type": "document",
      "transfer_method": "local_file",
      "remote_url": "",
      "related_id": "49a01f90-ea92-4bab-913c-d029644ef04c",
      "filename": "余华 活着 (1).txt",
      "extension": ".txt",
      "mime_type": "text/plain",
      "size": 183406,
      "url": "/files/49a01f90-ea92-4bab-913c-d029644ef04c/file-preview?timestamp=1735631394&nonce=6559a45400e623bc133fa7c6c84cb785&sign=f3EyJDlM8UqFta8XwugcJIjO2ffrF1FEHAuUYUpyltE="
    }
  ],
  "sys.conversation_id": "ed1658a9-67ee-4b57-8e7c-b1681f0748fa",
  "sys.user_id": "8f1e639a-707c-4c98-bb75-9000aca33041",
  "sys.dialogue_count": 0,
  "sys.app_id": "68798d8e-6b35-4627-a0bc-6175038b0853",
  "sys.workflow_id": "09321b36-d496-4d85-8dcb-581408f9ac2c",
  "sys.workflow_run_id": "29da3e44-9a1c-4247-ab2c-ee11a680fa96"
}
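
One way to read the output above is to compare the custom `File` variable against `sys.files`. The following is a minimal sketch, not part of Dify: `summarize_start_output` is a hypothetical helper, and the sample data is trimmed from the Start node output to the relevant keys.

```python
import json

# Trimmed copy of the Start node output shown above (relevant keys only).
START_OUTPUT = json.loads("""
{
  "File": null,
  "sys.files": [
    {
      "type": "document",
      "filename": "余华 活着 (1).txt",
      "mime_type": "text/plain",
      "size": 183406
    }
  ]
}
""")


def summarize_start_output(output: dict, custom_key: str = "File") -> dict:
    """Hypothetical helper: report whether the custom file variable is set
    and which files arrived through sys.files instead."""
    sys_files = output.get("sys.files") or []
    return {
        "custom_variable_set": output.get(custom_key) is not None,
        "sys_files": [(f.get("filename"), f.get("type")) for f in sys_files],
    }


summary = summarize_start_output(START_OUTPUT)
print(summary)
```

In this run the custom `File` variable is `null`, and the uploaded document appears only under `sys.files`. Whether that is the cause of the empty prompt is not established by the report.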

LLM Node Input

{
  "model_mode": "chat",
  "prompts": [
    {
      "role": "system",
      "text": "以下是用户上传的一份文件:\n请根据文件中的信息,回答用户的问题。\n",
      "files": []
    },
    {
      "role": "user",
      "text": "1\n你看到了什么内容?\n\n",
      "files": []
    }
  ],
  "model_provider": "azure_openai",
  "model_name": "gpt-4o"
}

(English translation of the prompt text above: the system prompt reads "The following is a file uploaded by the user:" followed by "Please answer the user's question based on the information in the file."; the user prompt reads "1" followed by "What content do you see?")

✔️ Expected Behavior

The files field in the LLM node's JSON should include the uploaded file content.

❌ Actual Behavior

In the LLM node's JSON, files is an empty array for every prompt.
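
The expected and actual behavior can be expressed as a small check against the LLM node input. This is a sketch only: `roles_with_files` is a hypothetical helper, not a Dify function, and the prompt text is elided because only the `files` arrays matter here.

```python
def roles_with_files(node_input: dict) -> list[str]:
    """Hypothetical helper: return the roles of prompts that carry
    at least one attached file."""
    return [
        prompt["role"]
        for prompt in node_input.get("prompts", [])
        if prompt.get("files")
    ]


# Trimmed copy of the LLM node input from the report (prompt text elided).
llm_node_input = {
    "model_mode": "chat",
    "prompts": [
        {"role": "system", "text": "...", "files": []},
        {"role": "user", "text": "...", "files": []},
    ],
}

attached = roles_with_files(llm_node_input)
print(attached)  # prints [] for the input in this report
```

With the expected behavior, the list would contain `"system"`, since that is the prompt where the file variable was entered.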


dosubot bot commented Dec 31, 2024

Hello @meetwab! I'm here to help you with any bugs, questions, or contributions you have for the project. Let's work together to solve this issue!

I found a similar open issue regarding the LLM Node's inability to read files with a custom field type of "File" when using a multimodal model. The cause is a current limitation of the Dify platform: the LLM Node only recognizes image files and cannot read document files directly.

To address this, you can use a List Operation node to preprocess the array of file variables. This node can sort and filter the files, sending different file types to their corresponding processing nodes. For image files, ensure that the vision feature is enabled in the LLM Node to allow processing. You can find more details in the related issue here [1].
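
Conceptually, the filtering step described above amounts to splitting the file array by type before routing. The following plain-Python sketch illustrates that idea only; it is not the List Operation node's API. `group_files_by_type` and the `cover.png` entry are hypothetical, while the `type` values follow the file record in the issue.

```python
from collections import defaultdict


def group_files_by_type(files: list[dict]) -> dict[str, list[dict]]:
    """Group file records by their "type" field (e.g. "document", "image")."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for record in files:
        groups[record.get("type", "unknown")].append(record)
    return dict(groups)


files = [
    {"type": "document", "filename": "余华 活着 (1).txt", "extension": ".txt"},
    # Hypothetical second upload, added to show the split.
    {"type": "image", "filename": "cover.png", "extension": ".png"},
]

groups = group_files_by_type(files)
documents = groups.get("document", [])
images = groups.get("image", [])

print([f["filename"] for f in documents])
print([f["filename"] for f in images])
```

In a workflow, each group would then go to its corresponding processing node, with the image group sent to an LLM node that has the vision feature enabled.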


dosubot bot added the 🐞 bug label on Dec 31, 2024
cpwan mentioned this issue on Jan 3, 2025