[Bug]: Llama-3.2-11B-Vision-Instruct which is an encoder-decoder model fails with BlockManager V2 #9099

@sroy745

Description
Your current environment

The tests fail in all environments.

Model Input Dumps

No response

🐛 Describe the bug

The following two tests fail with BlockManager V2:

  • tests/models/encoder_decoder/vision_language/test_mllama.py::test_models[5-128-bfloat16-sizes0-meta-llama/Llama-3.2-11B-Vision-Instruct]
  • tests/models/encoder_decoder/vision_language/test_mllama.py::test_models[5-128-bfloat16-sizes4-meta-llama/Llama-3.2-11B-Vision-Instruct]

The tests fail with the following error:

Traceback (most recent call last):
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/utils.py", line 443, in wrapper
    f(*args, **kwargs)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 249, in test_models
    run_test(
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 138, in run_test
    _run_test(
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 187, in _run_test
    vllm_outputs_per_image = [
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 188, in <listcomp>
    vllm_model.generate_greedy_logprobs(prompts,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/conftest.py", line 768, in generate_greedy_logprobs
    return self.generate_w_logprobs(prompts,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/conftest.py", line 708, in generate_w_logprobs
    req_outputs = self.model.generate(inputs,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/utils.py", line 1051, in inner
    return fn(*args, **kwargs)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/entrypoints/llm.py", line 391, in generate
    outputs = self._run_engine(use_tqdm=use_tqdm)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/entrypoints/llm.py", line 899, in _run_engine
    step_outputs = self.llm_engine.step()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/engine/llm_engine.py", line 1357, in step
    ) = self.scheduler[virtual_engine].schedule()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1218, in schedule
    scheduler_outputs: SchedulerOutputs = self._schedule()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1178, in _schedule
    return self._schedule_default()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1013, in _schedule_default
    prefills = self._schedule_prefills(budget,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 949, in _schedule_prefills
    self._allocate_and_set_running(seq_group)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1411, in _allocate_and_set_running
    self.block_manager.allocate(seq_group)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/block_manager_v2.py", line 198, in allocate
    block_table = self._allocate_sequence(encoder_seq)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/block_manager_v2.py", line 154, in _allocate_sequence
    block_table.allocate(seq.get_token_ids())
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/block/block_table.py", line 95, in allocate
    assert token_ids
AssertionError

Looking into it. This is currently blocking #9084 and #8704.
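For context, a minimal sketch of the failure mode as I understand it: `BlockManagerV2._allocate_sequence` passes `seq.get_token_ids()` straight into `BlockTable.allocate`, which asserts the token list is non-empty; if the encoder sequence carries zero tokens (e.g. for a text-only prompt to this encoder-decoder model), the assertion fires. The guard below is illustrative only (`allocate_sequence` and `allocate_fn` are hypothetical names), not the actual vLLM fix:

```python
# Hypothetical sketch of a guard against empty encoder token sequences,
# mirroring the failing call in block_manager_v2.py::_allocate_sequence.
# Not the actual vLLM code or fix.

def allocate_sequence(token_ids, allocate_fn):
    """Allocate blocks only when there are tokens to place.

    BlockTable.allocate() asserts `token_ids` is non-empty
    (block_table.py:95), so an encoder sequence with zero tokens
    must be skipped rather than passed through.
    """
    if not token_ids:
        # Empty encoder sequence: nothing to allocate, so skip
        # the call that would trip `assert token_ids`.
        return None
    return allocate_fn(token_ids)
```

Whether skipping allocation (versus allocating a placeholder block table) is the right behavior for empty encoder sequences is exactly the question under investigation.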

cc: @afeldman-nm

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug (Something isn't working)