Skip to content

[FIX] [self-hosted] image in chat not working #1112

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
azertylr opened this issue Feb 12, 2025 · 4 comments
Closed
3 tasks done

[FIX] [self-hosted] image in chat not working #1112

azertylr opened this issue Feb 12, 2025 · 4 comments
Labels
fix Fix something that isn't working as expected

Comments

@azertylr
Copy link

Describe the bug

Hello,

I have a self hosted version, using Gemini API and when I upload a image in the chat, the picture appear but the LLM cannot interact with it. I don't have a AWS S3 account, I don't known if it is the problem.

To Reproduce

  • Setup a Gemini as a vision model.
  • Send a picture in a chat and saying for example "describe the image"

Screenshots

Image

Platform

  • Server:
    • Self-Hosted Docker
  • Client:
    • Web browser
  • OS:
    • Linux

If self-hosted

  • Server Version : 1.36.4

Additional context

` /api/chat/sessions?client=web&agent_s
lug=khoj HTTP/1.1" 200
[17:30:31.279432] INFO uvicorn.access: 172.18.0.100:36832 - h11_impl.py:476
"GET
/chat?conversationId=762e8554-a2e4-4a
be-92e7-c13091da3b5c HTTP/1.1" 200
[17:30:31.416034] INFO uvicorn.access: 172.18.0.100:36836 - h11_impl.py:476
"GET
/static/_next/static/media/a6ecd16fa0
44d500-s.p.woff2 HTTP/1.1" 200
[17:30:31.608361] INFO uvicorn.access: 172.18.0.100:54616 - h11_impl.py:476
"GET /api/chat/options HTTP/1.1" 200
[17:30:31.615971] INFO uvicorn.access: 172.18.0.100:54606 - h11_impl.py:476
"GET /api/v1/user HTTP/1.1" 200
[17:30:31.718722] INFO uvicorn.access: 172.18.0.100:54616 - h11_impl.py:476
"GET /api/agents/options HTTP/1.1"
200
[17:30:31.724130] INFO uvicorn.access: 172.18.0.100:54630 - h11_impl.py:476
"GET /api/content/computer HTTP/1.1"
200
[17:30:31.738923] INFO uvicorn.access: 172.18.0.100:54606 - h11_impl.py:476
"GET
/api/chat/conversation/file-filters/7
62e8554-a2e4-4abe-92e7-c13091da3b5c
HTTP/1.1" 200
[17:30:31.751615] INFO uvicorn.access: 172.18.0.100:54634 - h11_impl.py:476
"GET
/api/agents/conversation?conversation
_id=762e8554-a2e4-4abe-92e7-c13091da3
b5c HTTP/1.1" 200
[17:30:31.947593] INFO uvicorn.access: 172.18.0.100:54634 - h11_impl.py:476
"GET
/api/chat/history?client=web&conversa
tion_id=762e8554-a2e4-4abe-92e7-c1309
1da3b5c&n=10 HTTP/1.1" 200
[17:30:32.069118] INFO uvicorn.access: 172.18.0.100:54634 - h11_impl.py:476
"GET /api/chat/sessions HTTP/1.1" 200
[17:30:32.084099] INFO uvicorn.access: 172.18.0.100:54606 - h11_impl.py:476
"GET /api/settings?detailed=true
HTTP/1.1" 200
[17:30:32.723865] INFO khoj.routers.storage: AWS is not storage.py:44
enabled. Skipping image upload
[17:30:32.738683] INFO uvicorn.access: 172.18.0.100:54606 - h11_impl.py:476
"POST /api/chat?client=web HTTP/1.1"
200
[17:30:33.115191] DEBUG khoj.routers.helpers: Chat actor: helpers.py:195
Infer information sources to refer:
0.358 seconds
[17:30:33.128236] DEBUG khoj.routers.api: No documents in api.py:389
knowledge base. Use a Khoj client to sync
and chat with your docs.
[17:30:33.133650] DEBUG khoj.routers.helpers: Conversation helpers.py:1397
Types: [<ConversationCommand.Default:
'default'>,
<ConversationCommand.Text: 'text'>]
[17:30:33.136795] INFO uvicorn.access: 172.18.0.100:54634 - h11_impl.py:476
"GET
/chat.txt?conversationId=f5b74cbd-555
b-4fd5-ba0a-1267bc90fbc2&_rsc=1rc0u
HTTP/1.1" 404
[17:30:33.139894] INFO uvicorn.access: 172.18.0.100:54630 - h11_impl.py:476
"GET
/chat.txt?conversationId=1f42ca3c-1a2
2-44f3-88d8-8f3976ec3b3a&_rsc=1rc0u
HTTP/1.1" 404
[17:30:33.149375] DEBUG khoj.processor.conversation.google gemini_chat.py:240
.gemini_chat: Conversation Context
for Gemini: ["Explain the
significance of this image"]...
[17:30:33.596607] INFO khoj.processor.conversation.utils: First utils.py:102
response took: 0.445 seconds
[17:30:33.723521] INFO khoj.processor.conversation.utils: Chat utils.py:91
streaming took: 0.572 seconds
[17:30:33.736441] INFO khoj.processor.conversation.utils: Saved utils.py:315
Conversation Turn
You (default): "Explain the significance
of this image"

                       Khoj: "I am sorry, but I do not have                 
                       access to the image you are referring                
                       to. Please upload the image or provide a             
                       description of it so I can help you                  
                       understand its significance.                         
                       "                                                    

[17:30:33.739307] INFO khoj.routers.api_chat: Chat response api_chat.py:726
time to first token: 0.961 seconds
[17:30:33.740722] INFO khoj.routers.api_chat: Chat response api_chat.py:727
total time: 1.547 seconds
[17:30:33.742040] INFO khoj.routers.api_chat: Chat response api_chat.py:728
cost: $0.00000
[17:30:33.745239] DEBUG khoj.routers.api_chat: Finished api_chat.py:1272
streaming response
[17:30:34.265640] DEBUG khoj.routers.helpers: Chat actor: helpers.py:195
Generate title from conversation
history: 0.391 seconds
[17:30:34.272573] INFO uvicorn.access: 172.18.0.100:54606 - h11_impl.py:476
"POST
/api/chat/title?conversation_id=762e8
554-a2e4-4abe-92e7-c13091da3b5c
HTTP/1.1" 200 `

@azertylr azertylr added the fix Fix something that isn't working as expected label Feb 12, 2025
@Livvux
Copy link

Livvux commented Feb 22, 2025

same issue here, even with aws it doesnt work

@sabaimran
Copy link
Member

  1. Can you verify if your chat model settings in the admin panel whether you've set vision_enabled to True?
  2. Can you verify in the chat controls tab that you are chatting with the Gemini vision model?

@loorisr
Copy link

loorisr commented Feb 23, 2025

Yes it is correct.
I'm using version 1.36.6

Image

Image

@debanjum
Copy link
Member

Hey folks, thanks for opening the bug and sharing details! I was able to reproduce and fix this issue for vision not working with inline images. You should now be able to use gemini and other vision models with inline images.

Feel free to reopen this issue if you still see it after the next release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Fix something that isn't working as expected
Projects
None yet
Development

No branches or pull requests

5 participants