[Bug]: vLLM throws error when sampling from Cerebras GPT Models #11224

### Your current environment

<details>
<summary>The output of `python collect_env.py`</summary>

```text
python -u collect_env.py 
Collecting environment information...
Traceback (most recent call last):
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 765, in <module>
    main()
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 744, in main
    output = get_pretty_env_info()
             ^^^^^^^^^^^^^^^^^^^^^
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 739, in get_pretty_env_info
    return pretty_str(get_env_info())
                      ^^^^^^^^^^^^^^
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 568, in get_env_info
    vllm_version = get_vllm_version()
                   ^^^^^^^^^^^^^^^^^^
  File "/lfs/skampere1/0/rschaef/KoyejoLab-Pretraining-Inference-Compute-Exchange-Rate/collect_env.py", line 273, in get_vllm_version
    from vllm import __version__, __version_tuple__
ImportError: cannot import name '__version_tuple__' from 'vllm' (/lfs/skampere1/0/rschaef/miniconda3/envs/llmonk/lib/python3.11/site-packages/vllm/__init__.py)
```

</details>


### Model Input Dumps

_No response_

### 🐛 Describe the bug

vLLM throws an error when attempting to use Cerebras's models. Here is a minimal reproduction:

```python
from vllm import LLM, SamplingParams

# Load the Cerebras GPT checkpoint in bfloat16.
model = LLM(model="cerebras/Cerebras-GPT-1.3B", dtype="bfloat16")

# One sample per prompt, fixed seed for reproducibility.
model_sampling_params = SamplingParams(
    n=1,
    temperature=1.0,
    max_tokens=64,
    seed=0,
)

output = model.generate(
    prompts=["Please continue the following sentence: The quick brown fox jumps "],
    sampling_params=model_sampling_params,
)
```

The error is: `TypeError: 'NoneType' object is not iterable`

It arises in `_verify_embedding_mode`:

```python
    def _verify_embedding_mode(self) -> None:
        architectures = getattr(self.hf_config, "architectures", [])
        self.embedding_mode = any(
            ModelRegistry.is_embedding_model(arch) for arch in architectures)
```
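
A likely explanation, offered as a guess rather than a confirmed diagnosis: `getattr` only returns its default when the attribute is missing, so if `hf_config.architectures` exists but is set to `None`, the `[]` fallback is never used and the generator tries to iterate over `None`. The sketch below reproduces that behavior with stand-ins. `SimpleNamespace` plays the role of the Hugging Face config, and `is_embedding_model` is a hypothetical placeholder for `ModelRegistry.is_embedding_model`; neither is the real vLLM code.

```python
from types import SimpleNamespace

# Stand-in for a config whose `architectures` attribute exists but is None
# (an assumption about what the failing model's config looks like).
hf_config = SimpleNamespace(architectures=None)


def is_embedding_model(arch: str) -> bool:
    """Hypothetical placeholder for ModelRegistry.is_embedding_model."""
    return arch.endswith("EmbeddingModel")


# The default `[]` is ignored because the attribute is present.
architectures = getattr(hf_config, "architectures", [])

# Iterating over None raises the TypeError reported above.
error_message = None
try:
    any(is_embedding_model(arch) for arch in architectures)
except TypeError as exc:
    error_message = str(exc)

# One possible guard: coerce a None value to an empty list before iterating.
safe_architectures = getattr(hf_config, "architectures", None) or []
embedding_mode = any(is_embedding_model(arch) for arch in safe_architectures)
```

With the `or []` guard, a config without architectures is simply treated as a non-embedding model instead of raising. Whether that is the right behavior for vLLM is a design question for the maintainers.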

### Before submitting a new issue...

- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
