[Bug]: ValueError: Ray does not allocate any GPUs on the driver node. Consider adjusting the Ray placement group or running the driver on a GPU node. #6956

### Your current environment

```text
The output of `python collect_env.py`
```


### 🐛 Describe the bug

We should improve the way we create the default placement group in 

https://github.com/vllm-project/vllm/blob/cbbc904470668b9420e71595edeef76d673a2d59/vllm/executor/ray_utils.py#L119-L131

1. If we are already in a placement group but it does not contain the current node, error out. (This is a rare case; users usually don't set placement groups.)
2. If we are not in a placement group, we create one. Make sure it contains the current node.
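
For point 1, a minimal sketch of the check might look like the following. It is not the vLLM implementation; the helper names `bundles_on_node` and `verify_driver_in_placement_group` are hypothetical. It assumes the two mappings are taken from `ray.util.placement_group_table(pg)` (keys `"bundles"` and `"bundles_to_node_id"`, whose availability may depend on the Ray version) and that the driver node ID comes from `ray.get_runtime_context().get_node_id()`. The functions themselves only work on plain dicts, so they can be tried without a cluster.

```python
from typing import Dict, List


def bundles_on_node(
    bundles: Dict[int, Dict[str, float]],
    bundles_to_node_id: Dict[int, str],
    node_id: str,
    device_str: str = "GPU",
) -> List[int]:
    """Return indices of bundles that reserve `device_str` on `node_id`."""
    return [
        index
        for index, resources in bundles.items()
        if resources.get(device_str, 0) > 0
        and bundles_to_node_id.get(index) == node_id
    ]


def verify_driver_in_placement_group(
    bundles: Dict[int, Dict[str, float]],
    bundles_to_node_id: Dict[int, str],
    driver_node_id: str,
    device_str: str = "GPU",
) -> None:
    """Error out if no device bundle is placed on the driver node."""
    if not bundles_on_node(bundles, bundles_to_node_id, driver_node_id,
                           device_str):
        raise ValueError(
            f"The placement group has no {device_str} bundle on the driver "
            f"node {driver_node_id}. Adjust the placement group or run the "
            f"driver on a {device_str} node.")
```

Failing early here would surface the problem when the placement group is inspected, rather than later when workers are assigned to nodes.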


    num_devices_in_cluster = ray.cluster_resources().get(device_str, 0)
    if parallel_config.world_size > num_devices_in_cluster:
        raise ValueError(
            f"The number of required {device_str}s exceeds the total "
            f"number of available {device_str}s in the placement group.")
    # Create a new placement group
    placement_group_specs = ([{
        device_str: 1
    }] * parallel_config.world_size)
    current_placement_group = ray.util.placement_group(
        placement_group_specs)
    # Wait until PG is ready - this will block until all
    # requested resources are available, and will timeout
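
For point 2, one possible way to make a newly created placement group include the current node is to pin one bundle to it. The sketch below is an assumption about how this could be done, not the fix that was merged. It relies on the `node:<ip address>` resource that Ray defines on every node, and `build_placement_group_specs` is a hypothetical helper name. Note that it builds the bundles with a list comprehension: with `[{...}] * world_size`, as in the snippet above, every entry is the same dict object, so modifying the first bundle would modify all of them.

```python
from typing import Dict, List


def build_placement_group_specs(
    world_size: int,
    device_str: str,
    driver_ip: str,
) -> List[Dict[str, float]]:
    """Build one bundle per worker, pinning the first bundle to the driver."""
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    # A list comprehension creates a distinct dict for every bundle.
    specs: List[Dict[str, float]] = [
        {device_str: 1.0} for _ in range(world_size)
    ]
    # Requesting a small fraction of the node-specific resource means
    # this bundle can only be scheduled on the node with that IP address.
    specs[0][f"node:{driver_ip}"] = 0.001
    return specs
```

The returned list could then be passed to `ray.util.placement_group(...)` in place of `placement_group_specs`, with `driver_ip` obtained from something like `ray.util.get_node_ip_address()`.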
