Description
I am trying to run a distributed (multi-node) inference server with vLLM using Ray, but I keep getting the following `ValueError`:

```
Ray does not allocate any GPUs on the driver node. Consider adjusting the Ray placement group or running the driver on a GPU node.
```
I'm not sure exactly how to resolve this. I suspect the issue is in https://github.com/vllm-project/vllm/blob/main/vllm/engine/ray_utils.py, especially when a `ray_address` is passed. Is there a specific `ray_address` argument that gets passed during the `ray.init()` stage?
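For reference, my (possibly wrong) mental model of that stage is just the driver attaching to an already running cluster. Here is a minimal sketch of that assumption, not the actual `ray_utils.py` code:

```python
import ray

# Sketch of my assumption about the ray.init() stage (not the actual
# vllm/engine/ray_utils.py code): when a ray_address is supplied, the
# driver process attaches to the existing cluster, and the Ray placement
# group decides which nodes receive the GPU bundles.
ray.init(address="auto", ignore_reinit_error=True)
print(ray.cluster_resources())  # the CPUs/GPUs Ray sees across all nodes
```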
More specifically, it seems like this error is raised because of the `driver_dummy_worker` check at line 182 of https://github.com/vllm-project/vllm/blob/main/vllm/engine/llm_engine.py. I'm confused about what's going on in this piece of code:
```python
def _init_workers_ray(self, placement_group: "PlacementGroup",
                      **ray_remote_kwargs):
    if self.parallel_config.tensor_parallel_size == 1:
        num_gpus = self.cache_config.gpu_memory_utilization
    else:
        num_gpus = 1

    self.driver_dummy_worker: RayWorkerVllm = None
    self.workers: List[RayWorkerVllm] = []

    driver_ip = get_ip()
    for bundle_id, bundle in enumerate(placement_group.bundle_specs):
        if not bundle.get("GPU", 0):
            continue
        scheduling_strategy = PlacementGroupSchedulingStrategy(
            placement_group=placement_group,
            placement_group_capture_child_tasks=True,
            placement_group_bundle_index=bundle_id,
        )
        worker = ray.remote(
            num_cpus=0,
            num_gpus=num_gpus,
            scheduling_strategy=scheduling_strategy,
            **ray_remote_kwargs,
        )(RayWorkerVllm).remote(self.model_config.trust_remote_code)

        worker_ip = ray.get(worker.get_node_ip.remote())
        if worker_ip == driver_ip and self.driver_dummy_worker is None:
            # If the worker is on the same node as the driver, we use it
            # as the resource holder for the driver process.
            self.driver_dummy_worker = worker
        else:
            self.workers.append(worker)

    if self.driver_dummy_worker is None:
        raise ValueError(
            "Ray does not allocate any GPUs on the driver node. Consider "
            "adjusting the Ray placement group or running the driver on a "
            "GPU node.")
```
When the error is raised, the code checks whether `driver_dummy_worker` is `None`, but don't we set it to `None` ourselves just above, i.e. `self.driver_dummy_worker: RayWorkerVllm = None`?
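To make my question concrete, here is a reduced, self-contained sketch (hypothetical IPs, not vLLM code) of how I read that control flow: `driver_dummy_worker` starts as `None` and is only reassigned when a GPU bundle happens to land on the driver's node, so the final check seems to be asking whether any GPU bundle was placed on the node I launched the server from.

```python
# Reduced sketch of the control flow as I understand it (hypothetical IPs,
# not vLLM code).
driver_ip = "10.0.0.1"                          # node running the LLMEngine driver
gpu_bundle_node_ips = ["10.0.0.2", "10.0.0.3"]  # nodes where Ray placed GPU bundles

driver_dummy_worker = None
workers = []
for worker_ip in gpu_bundle_node_ips:
    if worker_ip == driver_ip and driver_dummy_worker is None:
        # A worker colocated with the driver holds the driver's GPU resources.
        driver_dummy_worker = worker_ip
    else:
        workers.append(worker_ip)

if driver_dummy_worker is None:
    # This is the ValueError I am hitting: no GPU bundle ended up on the
    # driver's node, so the initial None was never overwritten.
    raise ValueError("Ray does not allocate any GPUs on the driver node. ...")
```

If that reading is right, does it mean the node I start the server from must itself have at least one GPU included in the placement group?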