Labels: bug (Something isn't working)
Description
Your current environment
The output of `python collect_env.py`
INFO 01-20 18:00:32 __init__.py:179] Automatically detected platform cuda.
Collecting environment information...
PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (aarch64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: 18.0.0 (git@github.com:llvm/llvm-project.git 5e5a22caf88ac1ccfa8dc5720295fdeba0ad9372)
CMake version: version 3.31.0-rc2
Libc version: glibc-2.31
Python version: 3.10.16 (main, Dec 11 2024, 16:18:56) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.10.216-tegra-aarch64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/aarch64-linux-gnu/libcudnn.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_train.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_train.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_train.so.8.6.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-7
Off-line CPU(s) list: 8-11
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
Vendor ID: ARM
Model: 1
Model name: ARMv8 Processor rev 1 (v8l)
Stepping: r0p1
CPU max MHz: 2201.6001
CPU min MHz: 115.2000
BogoMIPS: 62.50
L1d cache: 512 KiB
L1i cache: 512 KiB
L2 cache: 2 MiB
L3 cache: 4 MiB
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Mitigation; CSV2, but not BHB
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] nvidia-ml-py==12.560.30
[pip3] pyzmq==26.2.0
[pip3] torch==2.5.1
[pip3] torchvision==0.20.1a0
[pip3] transformers==4.48.0
[pip3] triton==3.0.0+git4f6f7687
[conda] numpy 1.26.4 pypi_0 pypi
[conda] nvidia-ml-py 12.560.30 pypi_0 pypi
[conda] pyzmq 26.2.0 py310h55e1596_3 conda-forge
[conda] torch 2.5.1 pypi_0 pypi
[conda] torchvision 0.20.1a0 pypi_0 pypi
[conda] transformers 4.48.0 pypi_0 pypi
[conda] triton 3.0.0+git4f6f7687 pypi_0 pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: 0.6.6.post2.dev264+g813f249f
vLLM Build Flags:
CUDA Archs: Not Set; ROCm: Disabled; Neuron: Disabled
GPU Topology:
Could not collect
LD_LIBRARY_PATH=/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/cv2/../../lib64:/home/orin/tools/SZ3/lib:/home/orin/tools/anaconda3/envs/llmserving_py/lib/python3.10/site-packages/nvidia/cuda_runtime/lib:/usr/local/cuda/lib64:/home/orin/tools/SZ3/lib:/home/orin/tools/anaconda3/envs/llmserving_py/lib/python3.10/site-packages/nvidia/cuda_runtime/lib:/usr/local/cuda/lib64:
CUDA_HOME=/usr/local/cuda
CUDA_MODULE_LOADING=LAZY
Model Input Dumps
No response
🐛 Describe the bug
The example usage from the official site crashes when constructing the LLM:
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest
llm = LLM(model="meta-llama/Llama-2-7b-hf", enable_lora=True) # <- crash
INFO 01-20 18:06:33 __init__.py:179] Automatically detected platform cuda.
ERROR 01-20 18:06:56 registry.py:296] Error in inspecting model architecture 'LlamaForCausalLM'
ERROR 01-20 18:06:56 registry.py:296] Traceback (most recent call last):
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 491, in _run_in_subprocess
ERROR 01-20 18:06:56 registry.py:296] returned.check_returncode()
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/subprocess.py", line 457, in check_returncode
ERROR 01-20 18:06:56 registry.py:296] raise CalledProcessError(self.returncode, self.args, self.stdout,
ERROR 01-20 18:06:56 registry.py:296] subprocess.CalledProcessError: Command '['/home/orin/tools/anaconda3/envs/demo/bin/python', '-m', 'vllm.model_executor.models.registry']' returned non-zero exit status 1.
ERROR 01-20 18:06:56 registry.py:296]
ERROR 01-20 18:06:56 registry.py:296] The above exception was the direct cause of the following exception:
ERROR 01-20 18:06:56 registry.py:296]
ERROR 01-20 18:06:56 registry.py:296] Traceback (most recent call last):
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 294, in _try_inspect_model_cls
ERROR 01-20 18:06:56 registry.py:296] return model.inspect_model_cls()
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 265, in inspect_model_cls
ERROR 01-20 18:06:56 registry.py:296] return _run_in_subprocess(
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 494, in _run_in_subprocess
ERROR 01-20 18:06:56 registry.py:296] raise RuntimeError(f"Error raised in subprocess:\n"
ERROR 01-20 18:06:56 registry.py:296] RuntimeError: Error raised in subprocess:
ERROR 01-20 18:06:56 registry.py:296] /home/orin/tools/anaconda3/envs/demo/lib/python3.10/runpy.py:126: RuntimeWarning: 'vllm.model_executor.models.registry' found in sys.modules after import of package 'vllm.model_executor.models', but prior to execution of 'vllm.model_executor.models.registry'; this may result in unpredictable behaviour
ERROR 01-20 18:06:56 registry.py:296] warn(RuntimeWarning(msg))
ERROR 01-20 18:06:56 registry.py:296] Traceback (most recent call last):
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/runpy.py", line 196, in _run_module_as_main
ERROR 01-20 18:06:56 registry.py:296] return _run_code(code, main_globals, None,
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/runpy.py", line 86, in _run_code
ERROR 01-20 18:06:56 registry.py:296] exec(code, run_globals)
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 515, in <module>
ERROR 01-20 18:06:56 registry.py:296] _run()
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 508, in _run
ERROR 01-20 18:06:56 registry.py:296] result = fn()
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 266, in <lambda>
ERROR 01-20 18:06:56 registry.py:296] lambda: _ModelInfo.from_model_cls(self.load_model_cls()))
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 269, in load_model_cls
ERROR 01-20 18:06:56 registry.py:296] mod = importlib.import_module(self.module_name)
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/importlib/__init__.py", line 126, in import_module
ERROR 01-20 18:06:56 registry.py:296] return _bootstrap._gcd_import(name[level:], package, level)
ERROR 01-20 18:06:56 registry.py:296] File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
ERROR 01-20 18:06:56 registry.py:296] File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
ERROR 01-20 18:06:56 registry.py:296] File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
ERROR 01-20 18:06:56 registry.py:296] File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
ERROR 01-20 18:06:56 registry.py:296] File "<frozen importlib._bootstrap_external>", line 883, in exec_module
ERROR 01-20 18:06:56 registry.py:296] File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/llama.py", line 39, in <module>
ERROR 01-20 18:06:56 registry.py:296] from vllm.model_executor.layers.logits_processor import LogitsProcessor
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/layers/logits_processor.py", line 12, in <module>
ERROR 01-20 18:06:56 registry.py:296] from vllm.model_executor.layers.vocab_parallel_embedding import (
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 137, in <module>
ERROR 01-20 18:06:56 registry.py:296] def get_masked_input_and_mask(
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/__init__.py", line 2424, in fn
ERROR 01-20 18:06:56 registry.py:296] return compile(
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/__init__.py", line 2447, in compile
ERROR 01-20 18:06:56 registry.py:296] return torch._dynamo.optimize(
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 716, in optimize
ERROR 01-20 18:06:56 registry.py:296] return _optimize(rebuild_ctx, *args, **kwargs)
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 790, in _optimize
ERROR 01-20 18:06:56 registry.py:296] compiler_config=backend.get_compiler_config()
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/__init__.py", line 2237, in get_compiler_config
ERROR 01-20 18:06:56 registry.py:296] from torch._inductor.compile_fx import get_patched_config_dict
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 49, in <module>
ERROR 01-20 18:06:56 registry.py:296] from torch._inductor.debug import save_args_for_compile_fx_inner
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/_inductor/debug.py", line 26, in <module>
ERROR 01-20 18:06:56 registry.py:296] from . import config, ir # noqa: F811, this is needed
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/_inductor/ir.py", line 77, in <module>
ERROR 01-20 18:06:56 registry.py:296] from .runtime.hints import ReductionHint
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/torch/_inductor/runtime/hints.py", line 36, in <module>
ERROR 01-20 18:06:56 registry.py:296] attr_desc_fields = {f.name for f in fields(AttrsDescriptor)}
ERROR 01-20 18:06:56 registry.py:296] File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/dataclasses.py", line 1198, in fields
ERROR 01-20 18:06:56 registry.py:296] raise TypeError('must be called with a dataclass type or instance') from None
ERROR 01-20 18:06:56 registry.py:296] TypeError: must be called with a dataclass type or instance
ERROR 01-20 18:06:56 registry.py:296]
Traceback (most recent call last):
File "/home/orin/workspace/paper/test.py", line 5, in <module>
llm = LLM(model="NousResearch/Llama-2-7b-chat-hf", enable_lora=True)
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/utils.py", line 1039, in inner
return fn(*args, **kwargs)
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 236, in __init__
self.llm_engine = self.engine_class.from_engine_args(
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 479, in from_engine_args
engine_config = engine_args.create_engine_config(usage_context)
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 1043, in create_engine_config
model_config = self.create_model_config()
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 969, in create_model_config
return ModelConfig(
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/config.py", line 342, in __init__
self.multimodal_config = self._init_multimodal_config(
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/config.py", line 402, in _init_multimodal_config
if ModelRegistry.is_multimodal_model(architectures):
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 430, in is_multimodal_model
model_cls, _ = self.inspect_model_cls(architectures)
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 390, in inspect_model_cls
return self._raise_for_unsupported(architectures)
File "/home/orin/tools/anaconda3/envs/demo/lib/python3.10/site-packages/vllm/model_executor/models/registry.py", line 347, in _raise_for_unsupported
raise ValueError(
ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details.
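For what it's worth, the traceback bottoms out in torch/_inductor/runtime/hints.py, where dataclasses.fields(AttrsDescriptor) rejects the AttrsDescriptor provided by the installed triton build (3.0.0+git4f6f7687). Below is a minimal diagnostic sketch, assuming AttrsDescriptor is exposed under one of the two import paths tried (that location is an assumption, not something confirmed in this report):

```python
# Diagnostic sketch (not from the original report). Assumption: the installed
# triton exposes AttrsDescriptor under one of the two import paths below.
# torch/_inductor/runtime/hints.py (line 36 in the traceback) calls
# dataclasses.fields(AttrsDescriptor), which only accepts dataclass types.
import dataclasses

try:
    from triton.backends.compiler import AttrsDescriptor  # assumed newer layout
except ImportError:
    from triton.compiler.compiler import AttrsDescriptor  # assumed older layout

print(AttrsDescriptor, dataclasses.is_dataclass(AttrsDescriptor))
# False here would match the TypeError above:
# "must be called with a dataclass type or instance"
```

If this prints False, the incompatibility is between torch 2.5.1's inductor and the bundled triton build rather than anything in vLLM itself, and aligning the triton wheel with the one torch 2.5.1 expects would be the first thing to try. Setting the TORCH_COMPILE_DISABLE=1 environment variable before importing vLLM might also skip the failing @torch.compile decoration in vocab_parallel_embedding.py, though that has not been verified on this platform.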
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.