Merged
2 changes: 1 addition & 1 deletion tests/core/block/e2e/test_correctness.py
@@ -183,7 +183,7 @@ def test_v1_v2_greedy_equality_with_cow(baseline_llm_generator,

# Allow only 2 sequences of ~128 tokens in worst case.
# Note 16 = 128/block_size
- "num_gpu_blocks_override": 2 * (16 + 1),
+ "num_gpu_blocks_override": 2 * (16 + 2),
Collaborator

Just curious: What is this change for?

Member Author

This is a corner case: the change in swapping order introduced by this PR leaves the test with too few blocks.
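For reference, the block budget in the diff above works out as follows. This is only a sketch of the arithmetic: `BLOCK_SIZE = 8` is an assumption inferred from the test comment "16 = 128/block_size", and `block_budget` is a hypothetical helper, not something in the test file.

```python
# Assumption: block_size is 8, inferred from the comment "16 = 128/block_size".
BLOCK_SIZE = 8
MAX_TOKENS_PER_SEQ = 128  # "~128 tokens in worst case"
NUM_SEQS = 2              # "Allow only 2 sequences"


def block_budget(spare_blocks_per_seq: int) -> int:
    """Hypothetical helper: total GPU blocks for NUM_SEQS sequences,
    each with some spare blocks on top of its worst-case footprint."""
    blocks_per_seq = MAX_TOKENS_PER_SEQ // BLOCK_SIZE  # 16
    return NUM_SEQS * (blocks_per_seq + spare_blocks_per_seq)


old_budget = block_budget(1)  # value before this PR: 2 * (16 + 1)
new_budget = block_budget(2)  # value after this PR:  2 * (16 + 2)
print(old_budget, new_budget)  # 34 36
```

So the override goes from 34 to 36 blocks, i.e. one extra spare block per sequence.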

Collaborator

@youkaichao Oh, why does the swapping order change? I thought the change in this PR doesn't actually change the FCFS logic at all.

Member Author

The running queue is FCFS. The swapping queue is more complicated: its order can shift when sequences are swapped in and out again and again (which happens in this test case).

That being said, I don't think we had strict FCFS even before this PR once repeated swap-in / swap-out is taken into account. I think we can only guarantee strict FCFS once we switch to a priority queue in the future.
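To illustrate the priority-queue idea in isolation, here is a toy sketch. It is not vLLM's scheduler. The tuples, sequence names, and both helper functions are hypothetical. It only shows that re-appending returning sequences to a plain queue can break arrival order, while a heap keyed on arrival time restores it.

```python
import heapq
from collections import deque

# Each entry is (arrival_time, name). All values here are made up for illustration.
waiting = [(2, "seq_c"), (3, "seq_d")]    # never swapped out
returning = [(0, "seq_a"), (1, "seq_b")]  # older sequences coming back from swap


def requeue_with_deque(waiting, returning):
    """Append returning sequences at the tail: order depends on swap history."""
    queue = deque(waiting)
    for seq in returning:
        queue.append(seq)
    return [name for _, name in queue]


def requeue_with_heap(waiting, returning):
    """Key the queue on arrival time: order is strict FCFS regardless of swaps."""
    heap = list(waiting)
    heapq.heapify(heap)
    for seq in returning:
        heapq.heappush(heap, seq)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


print(requeue_with_deque(waiting, returning))  # ['seq_c', 'seq_d', 'seq_a', 'seq_b']
print(requeue_with_heap(waiting, returning))   # ['seq_a', 'seq_b', 'seq_c', 'seq_d']
```

With the heap, the earliest arrivals are always served first, no matter how many times a sequence has been swapped out and back in.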

}
])
@pytest.mark.parametrize("baseline_llm_kwargs", [{