[FIX] Unable to connect to local Ollama using self-hosted Docker #1100

Open
3 of 14 tasks
raphaelventura opened this issue Feb 2, 2025 · 23 comments
Labels
fix Fix something that isn't working as expected

Comments

@raphaelventura

raphaelventura commented Feb 2, 2025

Describe the bug

I'm trying to connect to my local ollama server, but I'm getting connection exceptions (too large to fit inside the issue) after these DEBUG messages appear in the server logs:

server-1    | [22:43:54.017709] DEBUG    khoj.processor.conversation.openai before_sleep.py:65
server-1    |                            .utils: Retrying                                     
server-1    |                            khoj.processor.conversation.openai                   
server-1    |                            .utils.completion_with_backoff in                    
server-1    |                            0.6096594730061932 seconds as it                     
server-1    |                            raised APIConnectionError:                           
server-1    |                            Connection error..                                   
server-1    | [22:43:55.932440] DEBUG    khoj.processor.conversation.openai before_sleep.py:65
server-1    |                            .utils: Retrying                                     
server-1    |                            khoj.processor.conversation.openai                   
server-1    |                            .utils.completion_with_backoff in                    
server-1    |                            0.6813156899799166 seconds as it                     
server-1    |                            raised APIConnectionError:                           
server-1    |                            Connection error..                                   
server-1    | [22:43:58.039865] DEBUG    khoj.routers.helpers: Chat actor:      helpers.py:195
server-1    |                            Infer information sources to refer:                  
server-1    |                            5.843 seconds  

I've tried setting the OPENAI_API_BASE variable to the two following values inside the docker-compose file:

      - OPENAI_API_BASE=http://host.docker.internal:11434/v1/
      - OPENAI_API_BASE=http://localhost:11434/v1/

(and set the same URLs for the AI model API later in the server admin panel).
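
For reference, this is roughly where the variable sits in the compose file (a minimal sketch, assuming the official docker-compose.yml layout; only the relevant keys of the server service are shown):

    services:
      server:
        environment:
          - OPENAI_API_BASE=http://host.docker.internal:11434/v1/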

To Reproduce

Steps to reproduce the behavior:

  1. Install ollama
  2. Execute ollama serve, then ollama pull llama3.1
  3. Launch docker-compose up, and configure the AI model API and Chat Model as stated in the docs.
  4. Choose the model that has been configured, and try chatting.

Screenshots

API config
Image

Models config
Image

I tried with the names 8b and latest, since the output of ollama list yields

NAME                ID              SIZE      MODIFIED    
llama3.1:latest     46e0c10c039e    4.9 GB    2 hours ago 

Platform

  • Server:
    • Cloud-Hosted (https://app.khoj.dev)
    • Self-Hosted Docker
    • Self-Hosted Python package
    • Self-Hosted source code
  • Client:
    • Obsidian
    • Emacs
    • Desktop app
    • Web browser
    • WhatsApp
  • OS:
    • Windows
    • macOS
    • Linux
    • Android
    • iOS

If self-hosted

  • Server Version: v1.36.0
raphaelventura added the fix label on Feb 2, 2025
@Francommit

Francommit commented Feb 5, 2025

Having a similar problem.
I was initially using Windows and reproduced the same thing on Linux, using Docker in conjunction with Ollama and LM Studio.

@infocillasas

Hi,
Same here. The server container cannot connect to host port 11434, on which Ollama listens.
Any ideas?

@Francommit

I thought it might just have been Windows being Windows with WSL, but I get the exact same problem with Linux Mint.

There's got to be something that we're just doing wrong.

@infocillasas

infocillasas commented Feb 8, 2025

@raphaelventura I struggled for a while but found what wasn't working on my setup (Ubuntu Linux):

Be sure that the AI model API set up in the Admin Web Interface is set to http://host.docker.internal:11434/v1/
On my setup, Ollama is running inside a Docker container (either bare docker run or docker compose).

So, based on your config shown in the screenshots above, prefer using your "ollama" API config and not "ollama-local".
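
(For anyone also running Ollama in Docker: publishing the port on the host is what lets host.docker.internal reach it. Roughly the standard command from the Ollama docs; the container name and volume are illustrative:)

    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama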

@raphaelventura
Author

raphaelventura commented Feb 8, 2025

Thanks @infocillasas, but I've tried both URLs (one after the other) in the yaml config file as well as on the model API config page.

My ollama instance isn't running inside a container; it's my distro's packaged application.

@Techn0Hippie

This is very much an open issue that I have been fighting for days. I can run this fine on my Mac in Docker, but when I install it on an Ubuntu desktop instance with a GPU I see the same issues and error message. I am hosting ollama on the same machine and have also tried with Docker (just like OP did).

I can easily connect to my local instance, but have no idea what this "/v1" is:
curl http://192.168.6.241:11434/v1/
404 page not found

curl http://192.168.6.241:11434
Ollama is running

@infocillasas

The /v1/ endpoint is the base of Ollama's OpenAI-compatible API and is the correct address to point Khoj at to connect to Ollama.

@nicolas33

nicolas33 commented Feb 18, 2025

Same issue here with docker and system ollama installation:

ollama pull llama3.2:latest

NOTICE: ollama run llama3.2 works fine

chat model

  • name: llama3.2
  • model type: openai
  • model api: ollama
  • max token: 800

model api

Khoj

v1.36.6

ollama logs

The ollama server logs show nothing, meaning no HTTP requests are reaching it.
Using the same ollama installation from Open WebUI works fine, with the expected logs in ollama.

error.log

@nicolas33

nicolas33 commented Feb 18, 2025

I can easily connect to my local instance, but have no idea what this "/v1" is: curl http://192.168.6.241:11434/v1/ 404 page not found
curl http://192.168.6.241:11434 Ollama is running

That's how the API is designed:

  • curl http://localhost:11434/v1 gives 404
  • curl http://localhost:11434/v1/models lists the models

@infocillasas

Try with host.docker.internal instead of localhost.

@nicolas33

Sadly, this didn't help:

host> curl http://localhost:11434
Ollama is running%

host> 
host> docker exec -it khoj-server-1 /bin/bash
root@633fd8c28278:/app# grep internal /etc/hosts
172.17.0.1      host.docker.internal

root@633fd8c28278:/app# curl http://host.docker.internal:11434
curl: (7) Failed to connect to host.docker.internal port 11434 after 0 ms: Connection refused

root@633fd8c28278:/app# curl https://8.8.8.8
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://dns.google/">here</A>.
</BODY></HTML>

root@633fd8c28278:/app#
host> docker network ls
NETWORK ID     NAME           DRIVER    SCOPE
a33f5632af16   bridge         bridge    local
d242b17567d6   host           host      local
2e266333f8e6   khoj_default   bridge    local
09a506d21dcf   none           null      local

host> brctl show
bridge name     bridge id               STP enabled     interfaces
br-2e266333f8e6         8000.02424dc296a0       no              veth05c3791
                                                        veth7f1269e
                                                        vetha75db8c
                                                        vethe6550e3
docker0         8000.0242f0873c9b       no

host> ip a
<...>
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:f0:87:3c:9b brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
6: br-2e266333f8e6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:4d:c2:96:a0 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-2e266333f8e6
       valid_lft forever preferred_lft forever
<...>

There is still no access in the ollama logs...

@nicolas33

BTW, I've tried setting OLLAMA_ORIGINS in the systemd service (restarted; the option is applied):

[Service]                                                                                                                                                                                                         
Environment="OLLAMA_ORIGINS=http://172.18.0.2:*,http://172.18.0.3:*,http://172.18.0.4:*,http://172.18.0.5:*,http://172.18.0.2,http://172.18.0.3,http://172.18.0.4,http://172.18.0.5"

I still get "Connection refused" in the docker instance and no access in the logs of ollama.

@nicolas33

nicolas33 commented Feb 19, 2025

It's now working here. In short, the ollama service and the docker network configuration did not match.

Detailed explanation (on Linux)

The docker config

The docker instance must be able to reach the ollama service running directly on the host.

The docker installation of khoj with compose uses docker's bridge mode via the "docker0" bridge. The containers' /etc/hosts files have the host.docker.internal hostname pointing to 172.17.0.1 (the IP of the bridge gateway).

The ollama config

Because of the bridge mode on Linux, the ollama service MUST be configured to listen on an address reachable from the docker bridge, e.g. 172.17.0.1:11434, and not only on 127.0.0.1. If not, the khoj app won't be able to reach the ollama service on the host.

It's expected to use host.docker.internal in your model api base url: http://host.docker.internal:11434/v1

Possible solutions

a) ollama listening on 172.17.0.1

For this to work, configure your service to listen on the correct address. For systemd, use systemctl edit ollama.service and add these lines to the config file (in the correct section):

### Anything between here and the comment below will become the contents of the drop-in file

[Service]
Environment="OLLAMA_HOST=172.17.0.1:11434"

Restart ollama:

systemctl daemon-reload
systemctl restart ollama.service

Downsides:

  • any other client must be reconfigured to use this IP address
  • the docker0 bridge must be up before ollama starts so it can bind to this address
b) ollama listening on all interfaces

Configure ollama to listen on 0.0.0.0:11434 by following the same steps above.

Downsides:

  • the docker0 bridge must be up before ollama starts so it can bind to the bridge address
  • ollama is now listening on ALL interfaces, so it can likely be reached from your other networks. This is insecure, and additional firewall configuration is strongly recommended (see the example below).
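
For example, with ufw the exposure can be limited to the docker subnets (an illustrative sketch; adjust the subnets to your setup, rules are matched in order):

# allow the docker bridge networks to reach ollama, deny everyone else
sudo ufw allow from 172.17.0.0/16 to any port 11434 proto tcp
sudo ufw allow from 172.18.0.0/16 to any port 11434 proto tcp
sudo ufw deny 11434/tcp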
c) Redirect the traffic

The idea is to forward traffic arriving on docker0 port 11434 to the ollama service on localhost:11434.
IOW, we keep ollama listening on localhost:11434.

With socat:

sudo socat TCP4-LISTEN:11434,bind=172.17.0.1,fork TCP:127.0.0.1:11434

NOTICE: something similar should be possible with nat PREROUTING and POSTROUTING iptables rules.
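
For the iptables variant, something along these lines should do it (an untested sketch: it DNATs container traffic hitting the bridge address to the loopback listener; route_localnet is required to allow DNAT to 127.0.0.1):

# allow DNAT to 127.0.0.1 for packets arriving on docker0
sudo sysctl -w net.ipv4.conf.docker0.route_localnet=1
# rewrite container traffic aimed at 172.17.0.1:11434 to ollama on localhost
sudo iptables -t nat -A PREROUTING -i docker0 -p tcp -d 172.17.0.1 --dport 11434 -j DNAT --to-destination 127.0.0.1:11434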

Downsides:

  • additional configuration is required for the full setup to work.
  • the docker0 bridge must be up

Testing

The docker instance must be able to reach the ollama service like this:

host> docker exec -it khoj-server-1 /bin/bash
root@aa2ea544712e:/app# curl http://host.docker.internal:11434
Ollama is runningroot@aa2ea544712e:/app#

Notice the "Ollama is running".

Hope this helps.

@raphaelventura
Author

Thanks very much for this detailed explanation.

I modified my ollama service with a new OLLAMA_HOST address and an ExecStartPre instruction to wait for the docker bridge to be up, as advised; see the sketch below.
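
Roughly like this (a sketch of the idea; the exact wait command may differ):

    [Service]
    Environment="OLLAMA_HOST=172.17.0.1:11434"
    # wait until the docker0 bridge address exists before ollama tries to bind to it
    ExecStartPre=/bin/sh -c 'until ip -4 addr show docker0 | grep -q 172.17.0.1; do sleep 1; done'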

Unfortunately, I still have a connection error. It may be trivial to troubleshoot but I'm quite unfamiliar with network stuff. I tried to ping / curl ollama's URL using

docker-compose start server 
docker-compose exec -it server /bin/bash
root@df6c8002d319:/app# curl http://host.docker.internal:11434
curl: (6) Could not resolve host: host.docker.internal

from the directory containing the official docker-compose.yml file.

From within the container, I can see that

root@df6c8002d319:/app# cat /etc/hosts                                                           
127.0.0.1       localhost                                                                       
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet    
ff00::0 ip6-mcastprefix                   
ff02::1 ip6-allnodes                                                                                                                                                                             
ff02::2 ip6-allrouters
172.18.0.3      df6c8002d319

Seems like I'm missing some alias for host.docker.internal. How should I handle this?

@nicolas33

nicolas33 commented Feb 19, 2025

I guess you didn't start all the docker stuff with docker-compose start server. Could you try with docker-compose up instead?

What's the output of

  • docker network ls
  • docker network inspect bridge
  • brctl show

@raphaelventura
Author

Here are the three outputs, in order, after launching docker-compose up:

❯ sudo docker network ls
NETWORK ID     NAME           DRIVER    SCOPE
52d79ccd92a1   bridge         bridge    local
c84193da4b2a   host           host      local
f57e0698514f   khoj_default   bridge    local
07593506cb45   none           null      local
❯ sudo docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "52d79ccd92a118d389c4bb5c3adcc8fdeb5daed79a53226bdebb3f2a1ed7944e",
        "Created": "2025-02-19T15:58:41.723093059+01:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
❯ brctl show
bridge name     bridge id               STP enabled     interfaces
br-f57e0698514f         8000.02424e01dcd6       no              veth0d73803
                                                        veth2b4770c
                                                        veth2da0e3f
                                                        veth6915c9d
docker0         8000.0242f5ad693f       no

@nicolas33

The outputs look good, and the docker0 bridge with the correct IPs and bindings is there. Do you have the host.docker.internal line in /etc/hosts when using docker-compose up?

@raphaelventura
Author

raphaelventura commented Feb 19, 2025

No, I didn't!

I tried to add

    extra_hosts:
      - "host.docker.internal:host-gateway"

inside the docker-compose config; the corresponding line then appeared inside /etc/hosts, and communication was established between my running ollama service and khoj. But now I'm getting (probably unrelated) timeout errors like these:

server-1    | [20:21:23.900044] DEBUG    khoj.processor.conversation.openai before_sleep.py:65
server-1    |                            .utils: Retrying                                      
server-1    |                            khoj.processor.conversation.openai                                                                                                                      
server-1    |                            .utils.completion_with_backoff in                                                                                                                       
server-1    |                            0.3248539937731624 seconds as it                                                                                                                        
server-1    |                            raised APITimeoutError: Request                                                                                                                         
server-1    |                            timed out..                                                                                                                                             
server-1    | [20:22:25.516722] DEBUG    khoj.processor.conversation.openai before_sleep.py:65                                                                                                   
server-1    |                            .utils: Retrying                                       
server-1    |                            khoj.processor.conversation.openai                     
server-1    |                            .utils.completion_with_backoff in                    
server-1    |                            0.49318227139203263 seconds as it                    
server-1    |                            raised APITimeoutError: Request                      
server-1    |                            timed out..                                          
server-1    | [20:23:27.470012] DEBUG    khoj.routers.api: Extracting search    helpers.py:195
server-1    |                            queries took: 184.879 seconds                        
server-1    | [20:23:27.471631] ERROR    khoj.routers.api_chat: Error         api_chat.py:1005
server-1    |                            searching knowledge base: Request                    
server-1    |                            timed out.. Attempting to respond                    
server-1    |                            without document references.  

I'll leave this issue open for now, since we had to go through some configuration that isn't documented; that may be of interest for contributors to look into.
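
For anyone landing here: the extra_hosts entry goes under the server service in the official docker-compose.yml, roughly like this (other keys unchanged):

    services:
      server:
        extra_hosts:
          - "host.docker.internal:host-gateway"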

@Francommit

Using Linux.

Tried a) and b) and am unfortunately still getting:

APIConnectionError: Connection error.

Pretty sure my config is right:

- OPENAI_BASE_URL=http://host.docker.internal:11434/v1/

[screenshots]

I'm performing the following:

docker compose up
ollama serve
ollama run deepseek-r1:32b

After editing said file above and running:

➜  .khoj systemctl daemon-reload
systemctl restart ollama.service

@raphaelventura
Author

raphaelventura commented Feb 20, 2025 via email

@btschwertfeger
Copy link

btschwertfeger commented Apr 30, 2025

I faced exactly the same issues described here.

Environment:

  • Ubuntu 24.04.2
  • NVIDIA RTX 4070 TI Super
  • NVIDIA-SMI 560.35.03 Driver Version: 560.35.03
  • CUDA Version: 12.6
  • Khoj v1.41.0
  • Khoj setup: docker-compose.yaml + ollama running as system service

I did something similar to @Francommit and do not get connection errors anymore.

Steps:

  1. Checked where ollama is listening and found out that it only accepts requests from localhost:
    ❯ ss -tuln | grep 11434
    tcp   LISTEN 0      4096                             127.0.0.1:11434      0.0.0.0:* 
    ... which is the reason for:
    ❯ docker exec -it khoj-server-1 bash
    root@8fe6c6cb95f2:/app#  curl http://localhost:11434/v1/models
    curl: (7) Failed to connect to localhost port 11434 after 0 ms: Connection refused
  2. ollama needs to accept requests from more than just localhost; for simplicity, from everywhere:
    ❯ sudo systemctl edit ollama.service
    [sudo] password for ...: 
    ... enter:
    [Service]
    Environment="OLLAMA_HOST=0.0.0.0:11434"
    
    ... and save
    Successfully installed edited file '/etc/systemd/system/ollama.service.d/override.conf'.
  3. Restart the service and check if it now accepts more than localhost:
    ❯ sudo systemctl restart ollama
    ❯ ss -tuln | grep 11434
    tcp   LISTEN 0      4096                                     *:11434            *:*
  4. Check if it is now available from within the khoj container:
    ❯ docker exec -it khoj-server-1 bash
    root@8fe6c6cb95f2:/app# curl http://host.docker.internal:11434/v1/models
    {"object":"list","data":[{"id":"llama3.1:8b-instruct-fp16","object":"model","created":1745992192,"owned_by":"library"},{"id":"llama3.1:8b","object":"model","created":1745989544,"owned_by":"library"}]}

So the issue is solved with that.


Apart from that, the ollama integration doesn't seem to respect my local models.

I have the following models downloaded:

❯       ollama list
NAME                         ID              SIZE      MODIFIED     
llama3.1:8b-instruct-fp16    4aacac419454    16 GB     10 hours ago    
llama3.1:8b                  46e0c10c039e    4.9 GB    11 hours ago 

I successfully integrated ollama into Khoj, but can't use these models directly.

  1. After configuring the "Ai model apis": Name: ollama; API key: None; API base URL: http://host.docker.internal:11434/v1
  2. Adding the desired model to "Chat models": Name llama3.1:8b; Model type: offline; Ai model api: ollama
  3. ... and setting the model as default model for all cases in http://localhost:42110/server/admin/database/serverchatsettings/
  4. Opening a new chat (even after restarting the container) and entering a message ends up in /app/src/khoj/processor/conversation/offline/utils.py:63 (load_model_from_cache), which seems to expect a model name that can be split on / (and not the name that ollama uses). So I tried Hugging Face-style model names such as "meta-llama/Llama-3.1-8B", but with no success; when doing the same thing and setting meta-llama/Llama-3.1-8B as the default for all chat cases, we get:
    # ... during /app/src/khoj/processor/conversation/offline/utils.py:58
    server-1    |                            ValueError: No file found in
    server-1    |                            meta-llama/Llama-3.1-8B that match                   
    server-1    |                            *Q4_K_M.gguf                                         
    server-1    |                                                                                 
    server-1    |                            Available Files:                                     
    server-1    |                            ["original", ".gitattributes",                       
    server-1    |                            "LICENSE", "README.md",                              
    server-1    |                            "USE_POLICY.md", "config.json",                      
    server-1    |                            "generation_config.json",                            
    server-1    |                            "model-00001-of-00004.safetensors",                  
    server-1    |                            "model-00002-of-00004.safetensors",                  
    server-1    |                            "model-00003-of-00004.safetensors",                  
    server-1    |                            "model-00004-of-00004.safetensors",                  
    server-1    |                            "model.safetensors.index.json",                      
    server-1    |                            "special_tokens_map.json",                           
    server-1    |                            "tokenizer.json",                                    
    server-1    |                            "tokenizer_config.json",                             
    server-1    |                            "original/consolidated.00.pth",                      
    server-1    |                            "original/params.json",                              
    server-1    |                            "original/tokenizer.model"]  
    

If I'm not missing something (please let me know), I can create another issue for that.

@raphaelventura
Author

raphaelventura commented Apr 30, 2025

1. After configuring the ["Ai model apis"](http://localhost:42110/server/admin/database/aimodelapi/): Name: ollama; API key: None; API base URL: http://host.docker.internal:11434/v1

2. Adding the desired model to ["Chat models"](http://localhost:42110/server/admin/database/chatmodel/): Name llama3.1:8b; Model type: offline; Ai model api: ollama

@btschwertfeger
I think you should not choose the offline Model type, but rather OpenAI since you're using the Ollama API which is OpenAI-compliant.

@btschwertfeger

1. After configuring the ["Ai model apis"](http://localhost:42110/server/admin/database/aimodelapi/): Name: ollama; API key: None; API base URL: http://host.docker.internal:11434/v1

2. Adding the desired model to ["Chat models"](http://localhost:42110/server/admin/database/chatmodel/): Name llama3.1:8b; Model type: offline; Ai model api: ollama

@btschwertfeger I think you should not choose the offline Model type, but rather OpenAI since you're using the Ollama API which is OpenAI-compliant.

Oh yes, that was the trick - Thank you!
