Locally deployed Dify fails to read files after adding the pymupdf plugin #4452

Closed
yanpan22 opened this issue Apr 18, 2025 · 6 comments
Labels: not a bug / user error / unable to reproduce

Comments

@yanpan22

Requesting help: after adding the pymupdf plugin to my locally deployed Dify, reading a file fails with an error.
Input:
{
  "files": [
    {
      "dify_model_identity": "dify__file",
      "id": null,
      "tenant_id": "36e0aa23-c85f-4d83-b11c-6f09c1df3144",
      "type": "document",
      "transfer_method": "local_file",
      "remote_url": "",
      "related_id": "5c03033e-d65d-49fa-89c8-861538b7ae7b",
      "filename": "发明.pdf",
      "extension": ".pdf",
      "mime_type": "application/pdf",
      "size": 443814,
      "url": "/files/5c03033e-d65d-49fa-89c8-861538b7ae7b/file-preview?timestamp=1744964436&nonce=433c31bc2230fc88448a02798d72f5a2&sign=sbVn70SZz_ObRTrk0EwYJnSrEOVNaFzZneRy2oSgUr0="
    }
  ]
}

Output:
{
  "text": "Error processing 发明.pdf: Request URL is missing an 'http://' or 'https://' protocol.",
  "files": [],
  "json": [
    {
      "发明.pdf": {
        "error": "Request URL is missing an 'http://' or 'https://' protocol."
      }
    }
  ]
}

@JorjMcKie
Collaborator

Please provide evidence that PyMuPDF is causing an issue here, and please use English language exclusively: this will help our team and our international users to understand.

@yanpan22
Author

After deploying Dify locally and adding the pymupdf plugin, the file could not be read and an error occurred.

@JorjMcKie
Collaborator

Check whether PyMuPDF can read the file when it is provided locally via doc = pymupdf.open("input.pdf").
If that works, then the error is caused not by PyMuPDF but by your setup / configuration.
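
A minimal sketch of that local check, assuming the PDF has been saved next to the script as "input.pdf". The helper name can_open_locally is hypothetical and not part of PyMuPDF; only pymupdf.open and page_count are library features.

```python
def can_open_locally(path: str) -> bool:
    """Return True if PyMuPDF can open the local file at `path`."""
    try:
        # Imported inside the try block so that a missing PyMuPDF
        # installation is reported as a diagnostic instead of a crash.
        import pymupdf

        with pymupdf.open(path) as doc:
            print(f"{path}: opened, {doc.page_count} page(s)")
            return True
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        print(f"Could not open {path!r}: {exc}")
        return False


if __name__ == "__main__":
    can_open_locally("input.pdf")
```

If this reports success for the same file that fails inside Dify, the problem most likely lies in how the plugin obtains the file rather than in PyMuPDF itself.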

@CTD-Networks-CO-LTD

@JorjMcKie Hmm? Is this a similar problem? #4461

@JorjMcKie
Collaborator

@CTD-Networks-CO-LTD - We are talking past each other:

  1. Please store the file in question on your local computer, for instance under the name "input.pdf".
  2. Try to open it via doc = pymupdf.open("input.pdf").

Only if the statement under point 2 does not work will we be able to deal with your post.

The background is that we cannot reproduce your environment without installing additional software ("dify" etc.). This is beyond the scope we can cover here.

BTW: Judging from the provided JSON snippets, it seems that you are trying to open a remote file directly. This is not supported by PyMuPDF. PyMuPDF can only access files on local disks or file images resident in memory.
Therefore, non-local files must first be downloaded, either to a local disk or into memory.
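
One way to follow this advice is sketched below. It rests on assumptions: the "url" value in the posted JSON is a relative path, so it presumably needs a base URL before any HTTP client will accept it, which would be consistent with the "missing an 'http://' or 'https://' protocol" message. DIFY_BASE_URL and both helper names are hypothetical placeholders, not Dify or PyMuPDF APIs.

```python
import urllib.request
from urllib.parse import urljoin

# Hypothetical base URL of the local Dify instance; adjust to your deployment.
DIFY_BASE_URL = "http://localhost"


def absolute_file_url(base_url: str, file_url: str) -> str:
    """Resolve a relative path like '/files/<id>/file-preview?...' against base_url.

    A value that is already an absolute URL is returned unchanged.
    """
    return urljoin(base_url, file_url)


def open_remote_pdf(url: str):
    """Download the PDF into memory, then open it with PyMuPDF from the bytes."""
    import pymupdf  # imported here so the URL helper is usable without PyMuPDF

    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    # PyMuPDF never sees the URL, only the in-memory file image.
    return pymupdf.open(stream=data, filetype="pdf")
```

Usage would look like doc = open_remote_pdf(absolute_file_url(DIFY_BASE_URL, file_info["url"])), where file_info stands for one entry of the "files" list shown in the input above.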

@JorjMcKie
Collaborator

Closed as being part of this converted Discussions post.

@JorjMcKie added the "not a bug" label (not a bug / user error / unable to reproduce) and removed the "Waiting for information" label on Apr 22, 2025