[Bug]: Llama-3.2-11B-Vision-Instruct which is an encoder-decoder model fails with BlockManager V2 #9099

@sroy745

Description
Your current environment

The tests fail in all environments.

Model Input Dumps

No response

🐛 Describe the bug

The following two tests fail with BlockManager V2:

  • tests/models/encoder_decoder/vision_language/test_mllama.py::test_models[5-128-bfloat16-sizes0-meta-llama/Llama-3.2-11B-Vision-Instruct]
  • tests/models/encoder_decoder/vision_language/test_mllama.py::test_models[5-128-bfloat16-sizes4-meta-llama/Llama-3.2-11B-Vision-Instruct]

The tests fail with the following error:

Traceback (most recent call last):
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/utils.py", line 443, in wrapper
    f(*args, **kwargs)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 249, in test_models
    run_test(
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 138, in run_test
    _run_test(
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 187, in _run_test
    vllm_outputs_per_image = [
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/models/encoder_decoder/vision_language/test_mllama.py", line 188, in <listcomp>
    vllm_model.generate_greedy_logprobs(prompts,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/conftest.py", line 768, in generate_greedy_logprobs
    return self.generate_w_logprobs(prompts,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/tests/conftest.py", line 708, in generate_w_logprobs
    req_outputs = self.model.generate(inputs,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/utils.py", line 1051, in inner
    return fn(*args, **kwargs)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/entrypoints/llm.py", line 391, in generate
    outputs = self._run_engine(use_tqdm=use_tqdm)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/entrypoints/llm.py", line 899, in _run_engine
    step_outputs = self.llm_engine.step()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/engine/llm_engine.py", line 1357, in step
    ) = self.scheduler[virtual_engine].schedule()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1218, in schedule
    scheduler_outputs: SchedulerOutputs = self._schedule()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1178, in _schedule
    return self._schedule_default()
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1013, in _schedule_default
    prefills = self._schedule_prefills(budget,
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 949, in _schedule_prefills
    self._allocate_and_set_running(seq_group)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/scheduler.py", line 1411, in _allocate_and_set_running
    self.block_manager.allocate(seq_group)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/block_manager_v2.py", line 198, in allocate
    block_table = self._allocate_sequence(encoder_seq)
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/block_manager_v2.py", line 154, in _allocate_sequence
    block_table.allocate(seq.get_token_ids())
  File "/home/jovyan/sroy-enc-dec-blk-mgr-fix/vllm/core/block/block_table.py", line 95, in allocate
    assert token_ids
AssertionError

Looking into it. This is currently blocking #9084 and #8704.
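For context, a minimal sketch of the failure mode as I understand it: `BlockManagerV2._allocate_sequence` passes `seq.get_token_ids()` straight into `BlockTable.allocate`, which asserts the token list is non-empty; if the encoder sequence carries zero tokens (e.g. for a text-only prompt to this encoder-decoder model), the assertion fires. The guard below is illustrative only (`allocate_sequence` and `allocate_fn` are hypothetical names), not the actual vLLM fix:

```python
# Hypothetical sketch of a guard against empty encoder token sequences,
# mirroring the failing call in block_manager_v2.py::_allocate_sequence.
# Not the actual vLLM code or fix.

def allocate_sequence(token_ids, allocate_fn):
    """Allocate blocks only when there are tokens to place.

    BlockTable.allocate() asserts `token_ids` is non-empty
    (block_table.py:95), so an encoder sequence with zero tokens
    must be skipped rather than passed through.
    """
    if not token_ids:
        # Empty encoder sequence: nothing to allocate, so skip
        # the call that would trip `assert token_ids`.
        return None
    return allocate_fn(token_ids)
```

Whether skipping allocation (versus allocating a placeholder block table) is the right behavior for empty encoder sequences is exactly the question under investigation.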

cc: @afeldman-nm

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug (Something isn't working)