
[Bug]: vLLM v0.6.2 crashes on multiple GPUs #9225


Description

@tonyaw

Your current environment

The output of `python collect_env.py` was not provided in the report.

Model Input Dumps

No response

🐛 Describe the bug

INFO: Started server process [11912]
INFO: Waiting for application startup.
INFO: Application startup complete.
ERROR: [Errno 98] error while attempting to bind on address ('0.0.0.0', 8080): address already in use
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.

My command to start vLLM:

python3 -m vllm.entrypoints.openai.api_server --model hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 \
        --host 0.0.0.0 --port 8080  --seed 42 --trust-remote-code --disable-frontend-multiprocessing \
        --enable-chunked-prefill --tensor-parallel-size 2 --max-model-len 98304 >> "$LOG_FILE" 2>&1 &

If I change --tensor-parallel-size from 2 to 1, the issue does not occur.

The Docker image in use is "vllm/vllm-openai:v0.6.2".
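
As a quick sanity check (my own suggestion, not part of the original report), one can try to bind the same host/port before launching the server; if the bind fails with [Errno 98], some other process, for example a previous vLLM instance that never shut down, still holds port 8080:

import socket

def port_is_free(host: str = "0.0.0.0", port: int = 8080) -> bool:
    # Try to bind the address; an OSError with EADDRINUSE ([Errno 98])
    # means something else is already listening on it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

if __name__ == "__main__":
    print("port 8080 free:", port_is_free())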

