Closed
Labels
bug: Something isn't working
Description
Your current environment
The output of `python collect_env.py`
Your output of `python collect_env.py` here
Model Input Dumps
No response
🐛 Describe the bug
INFO: Started server process [11912]
INFO: Waiting for application startup.
INFO: Application startup complete.
ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 8080): address already in use
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
My command to start vLLM:
python3 -m vllm.entrypoints.openai.api_server --model hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 \
--host 0.0.0.0 --port 8080 --seed 42 --trust-remote-code --disable-frontend-multiprocessing \
--enable-chunked-prefill --tensor-parallel-size 2 --max-model-len 98304 >> "$LOG_FILE" 2>&1 &
If I change `--tensor-parallel-size` from 2 to 1, the issue does not occur.
The Docker image in use is `vllm/vllm-openai:v0.6.2`.
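To help narrow this down, here is a minimal diagnostic sketch that checks whether a TCP port can be bound before the server is launched. It uses only the Python standard library; the helper name `port_is_bindable` is hypothetical and not part of vLLM. It only reports whether something already holds the address at the moment of the check and does not explain why the bind fails with `--tensor-parallel-size 2`.

```python
import errno
import socket


def port_is_bindable(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently bind to (host, port).

    Hypothetical helper for diagnosis only; it is not a vLLM API.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            # EADDRINUSE is the "[Errno 98] ... address already in use"
            # condition shown in the log on Linux.
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other failure (e.g. permissions) is not masked
        return True


if __name__ == "__main__":
    # Host and port taken from the command in this report.
    print(port_is_bindable("0.0.0.0", 8080))
```

Running this inside the container just before starting the server, once with each tensor-parallel setting, would show whether the port is already taken before launch or only becomes unavailable during startup.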
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.