HTTP Request Node Does Not Retrieve File Extension from Content-Disposition Header
Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.15.0
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
- Use an "http-request" node to download a .docx file
- Pass the downloaded file to a "doc extractor" node
✔️ Expected Behavior
The HTTP Request Node downloads the file and keeps its .docx extension.
❌ Actual Behavior
"doc extractor" node error: Unsupported Extension Type: .bin
Http request output:
{
"status_code": 200,
"body": "",
"headers": {
"server": "openresty",
"date": "Fri, 10 Jan 2025 08:24:28 GMT",
"content-type": "application/octet-stream; charset=UTF-8",
"content-length": "19121",
"connection": "keep-alive",
"accept-ranges": "bytes",
"etag": "64ba7c8713b64b689feeb4cc2b1f4954",
"content-disposition": "attachment; filename=\"%e4******%95.docx\"; charset=UTF-8"
},
"files": [
{
"dify_model_identity": "__dify__file__",
"id": null,
"tenant_id": "3e581dc4-d40f-412b-b1de-dcc39fa288b4",
"type": "custom",
"transfer_method": "tool_file",
"remote_url": null,
"related_id": "1c9dd6a3-d0e2-4fd6-b899-222fc34eb366",
"filename": "32b96deebef6444bad37bab894fb5d37.bin",
"extension": ".bin",
"mime_type": "application/octet-stream; charset=UTF-8",
"size": 19121,
"url": "/files/tools/1c9dd6a3-d0e2-4fd6-b899-222fc34eb366.bin?timestamp=1736497468&nonce=d2a5bb1c5141ab6b91ae877a8550d894&sign=5b7SpNwVHSfBruJ8zJc8AuGMAqOiCZfMRgWf0EnHuRg="
}
]
}
"doc extractor" node input:
{
"documents": [
{
"dify_model_identity": "__dify__file__",
"id": null,
"tenant_id": "3e581dc4-d40f-412b-b1de-dcc39fa288b4",
"type": "custom",
"transfer_method": "tool_file",
"remote_url": null,
"related_id": "1c9dd6a3-d0e2-4fd6-b899-222fc34eb366",
"filename": "32b96deebef6444bad37bab894fb5d37.bin",
"extension": ".bin",
"mime_type": "application/octet-stream; charset=UTF-8",
"size": 19121,
"url": "/files/tools/1c9dd6a3-d0e2-4fd6-b899-222fc34eb366.bin?timestamp=1736497468&nonce=e4c5844225ee8e0b11424aaaad307e1c&sign=ZkMjOv1-A9YpkQ3iBSn2Yu8ANZO9AbMarklwQh9IasU="
}
]
}
The "Unsupported Extension Type: .bin" error occurs because the "doc extractor" node does not support the .bin extension. The file downloaded by the "Http-request" node is being saved with a .bin extension due to its MIME type being application/octet-stream, which is a generic binary type.
To resolve this issue, you can modify the process to correctly identify the file type and extension. Here are some steps you can take:
- Check the Content-Disposition header: The Content-Disposition header in your response includes a filename with a .docx extension. Use this filename to set the correct extension for the file.
- Modify the file handling logic: Ensure that the logic in the "Http-request" node or subsequent processing steps correctly assigns the .docx extension to the file based on the Content-Disposition header.
- Update the file information: Before passing the file to the "doc extractor" node, update the file's metadata to reflect the correct extension and MIME type.
By ensuring the file is correctly identified as a .docx file, the "doc extractor" node should be able to process it without errors, as .docx is a supported extension [1].
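As a rough illustration of reading the extension from the header, here is a minimal Python sketch. It is not Dify's actual implementation: the function name extension_from_content_disposition is hypothetical, and the sketch assumes that a percent-encoded filename (as in the header reported above) should be URL-decoded before the extension is taken. It uses only the standard library.

```python
from email.message import Message
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import unquote


def extension_from_content_disposition(header_value: Optional[str]) -> Optional[str]:
    """Return the lowercase extension named in a Content-Disposition value, or None.

    Hypothetical helper for illustration only; not part of Dify.
    """
    if not header_value:
        return None

    # Let the standard library parse the header parameters
    # (handles quoted values and extra parameters such as "charset").
    msg = Message()
    msg["Content-Disposition"] = header_value
    filename = msg.get_filename()
    if not filename:
        return None

    # Assumption: servers may percent-encode the plain filename parameter,
    # so decode it before extracting the suffix.
    filename = unquote(filename)

    suffix = PurePosixPath(filename).suffix
    return suffix.lower() or None


if __name__ == "__main__":
    # Header value taken from the YAML report later in this thread.
    print(extension_from_content_disposition("attachment; filename=name.yaml"))
```

A caller could prefer this value when it is not None and fall back to the MIME-type-based extension (and finally to .bin) only when the header is missing or carries no filename.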
I will work on this issue. Please assign, if it is possible
https://github.com/langgenius/dify/pull/12644#issuecomment-2585286631
Hi, @EcoleKeine. I'm Dosu, and I'm helping the Dify team manage their backlog and am marking this issue as stale.
Issue Summary:
- The HTTP Request Node misidentifies .docx files as .bin due to MIME type application/octet-stream.
- I suggested checking the Content-Disposition header and updating file metadata.
- Halogen22 showed interest in resolving the issue and requested assignment.
- Kurokobo referenced a related pull request.
Next Steps:
- Please confirm if this issue is still relevant to the latest version of the Dify repository by commenting here.
- If no further updates are provided, this issue will be automatically closed in 15 days.
Thank you for your understanding and contribution!
dosu, the pull request(#12653) to fix this problem is not merged
This problem has not been fixed, has it?
I'm seeing the same behavior with PDF files.
I'm running self-hosted V1.0.0 Dify
I ran into the same problem with CSV. I'm running self-hosted V1.0.0 Dify
I ran into the same problem with Excel files; the returned extension is .bin
I use Dify web, version 1.4.2, same issue with yaml
{
"status_code": 200,
"body": "",
"headers": {
"accept-ranges": "bytes",
"content-disposition": "attachment; filename=name.yaml",
"content-length": "2254517",
"content-type": "application/yaml",
"date": "Tue, 01 Jul 2025 15:10:57 GMT",
"etag": "\"54af2a6578400c9ef6b9d1ab6ce077ba\"",
"last-modified": "Thu, 26 Jun 2025 08:15:46 GMT"
},
"files": [
{
"dify_model_identity": "__dify__file__",
"id": null,
"tenant_id": "b6d6745b-7467-4397-b6b9-c5b4c8aafaa9",
"type": "custom",
"transfer_method": "tool_file",
"remote_url": null,
"related_id": "273e93f2-dc72-43f1-b456-c98749c626be",
"filename": "0cbfa19996a44f48b73973c5cf238333.bin",
"extension": ".bin",
"mime_type": "application/yaml",
"size": 2254517,
"url": "/files/tools/273e93f2-dc72-43f1-b456-c98749c626be.bin?timestamp=1751382658&nonce=a99757666687d1a407be2746dcf155c9&sign=aaa"
}
]
}
I changed to application/x-yaml, text/x-yaml, text/yaml, but still no luck
I ran into the same problem with XML: the returned extension is .xsl (Dify 1.6.0). @EcoleKeine Http-request file output:
{
"documents": [
{
"dify_model_identity": "__dify__file__",
"id": null,
"tenant_id": "9024b8ec-023a-4beb-b01f-22db8b5ca2ec",
"type": "custom",
"transfer_method": "tool_file",
"remote_url": null,
"related_id": "cd210776-ced6-4198-94bb-878366477478",
"filename": "d87748f0255544d59f580cb30c6565be.xsl",
"extension": ".xsl",
"mime_type": "application/xml",
"size": 151271,
"url": "https://upload.dify.ai/files/tools/cd210776-ced6-4198-94bb-878366477478.xsl?timestamp=1752916630&nonce=a27208ca6ff66142a29df25912dd4499&sign=Sc1Df_87xnv8F5zVOJAvpB6jzieuMQZHsj4hfvgSf7E="
}
]
}
"doc extractor" node error: Unsupported Extension Type: .xsl
Input:
{
"ransomeRss": "https://www.ransomware.live/rss",
"NewsList": null,
"sys.files": [],
"sys.user_id": "a0d6e238-0154-4df4-9d4a-731b5bc1b10d",
"sys.app_id": "20a597f7-acf9-41ea-97b1-84066130f4a3",
"sys.workflow_id": "fd95cd7d-a1bb-472a-b55c-85985417c4c9",
"sys.workflow_run_id": "17ee5c37-cf2d-42f0-8c11-da469c9444f5"
}
@ChenMartin You should @-mention a project member; better yet, open a new issue.