[Usage]: #12861

@werruww

Your current environment

2025-02-07 01:56:08.876832: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1738893368.916322 17728 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738893368.928223 17728 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
INFO 02-07 01:56:16 __init__.py:194] No platform detected, vLLM is running on UnspecifiedPlatform
Traceback (most recent call last):
  File "/content/vllm/examples/offline_inference/cpu_offload.py", line 16, in <module>
    llm = LLM(model="openai-community/gpt2", cpu_offload_gb=10)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/utils.py", line 1051, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/entrypoints/llm.py", line 242, in __init__
    self.llm_engine = self.engine_class.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/engine/llm_engine.py", line 481, in from_engine_args
    engine_config = engine_args.create_engine_config(usage_context)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/engine/arg_utils.py", line 1074, in create_engine_config
    device_config = DeviceConfig(device=self.device)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/config.py", line 1626, in __init__
    raise RuntimeError("Failed to infer device type")
RuntimeError: Failed to infer device type

!python /content/vllm/examples/offline_inference/cli.py
2025-02-07 01:57:36.838676: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1738893456.865144 18104 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738893456.872977 18104 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
INFO 02-07 01:57:41 __init__.py:194] No platform detected, vLLM is running on UnspecifiedPlatform
Traceback (most recent call last):
  File "/content/vllm/examples/offline_inference/cli.py", line 82, in <module>
    main(args)
  File "/content/vllm/examples/offline_inference/cli.py", line 39, in main
    llm = LLM(**asdict(engine_args))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/utils.py", line 1051, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/entrypoints/llm.py", line 242, in __init__
    self.llm_engine = self.engine_class.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/engine/llm_engine.py", line 481, in from_engine_args
    engine_config = engine_args.create_engine_config(usage_context)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/engine/arg_utils.py", line 1074, in create_engine_config
    device_config = DeviceConfig(device=self.device)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/vllm/config.py", line 1626, in __init__
    raise RuntimeError("Failed to infer device type")
RuntimeError: Failed to infer device type
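
Both runs log "No platform detected, vLLM is running on UnspecifiedPlatform" just before the RuntimeError, so a reasonable first check is whether this runtime exposes an accelerator to Python at all. The snippet below is a minimal diagnostic sketch, not part of vLLM and not a confirmed explanation of the failure. `describe_environment` is a hypothetical helper name, and the only third-party call it uses is PyTorch's `torch.cuda.is_available()`, which is skipped if PyTorch cannot be imported.

```python
import shutil


def describe_environment() -> dict:
    """Collect a few hints about which accelerator, if any, is visible.

    Hypothetical helper for debugging; it does not call into vLLM.
    """
    info = {
        # True if the NVIDIA driver tools are on PATH (a hint, not proof, of a GPU).
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "torch_importable": False,
        # Stays None when PyTorch cannot be imported.
        "torch_cuda_available": None,
    }
    try:
        import torch  # optional dependency for this check
    except Exception:
        return info
    info["torch_importable"] = True
    info["torch_cuda_available"] = bool(torch.cuda.is_available())
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `torch_cuda_available` is False or None, the runtime may be CPU-only or the installed PyTorch build may not match the GPU driver, which would be worth ruling out before looking at the example scripts themselves.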

How would you like to use vllm

!python /content/vllm/examples/offline_inference/cpu_offload.py

Labels: usage (How to use vllm)