Conversation

WoosukKwon (Collaborator)

Fixes #66

This PR fixes a bug in our attention kernel. The bug was introduced in #53, which changed the precision of computations in the attention kernel. The kernel unit tests now pass normally.
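For context, the failure mode described here is a numerical-precision regression. The sketch below is not the vLLM kernel or the actual fix; it is only an illustration, in plain PyTorch with made-up shapes and tolerances, of the kind of unit test that catches such a regression by comparing a reduced-precision attention computation against a float32 reference.

```python
# Illustrative only: not the vLLM attention kernel or the fix from this PR.
# It shows the general testing pattern of checking a reduced-precision
# attention computation against a full float32 reference within a tolerance.
import torch

def reference_attention(q, k, v):
    # Scaled dot-product attention computed entirely in float32.
    scale = q.shape[-1] ** -0.5
    probs = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return probs @ v

def reduced_precision_attention(q, k, v):
    # Same computation, but intermediates are rounded through float16 to
    # mimic a lower-precision kernel; a precision bug would show up as a
    # large mismatch against the reference below.
    scale = q.shape[-1] ** -0.5
    scores = (q @ k.transpose(-2, -1) * scale).half().float()
    probs = torch.softmax(scores, dim=-1).half().float()
    return (probs @ v).half().float()

q, k, v = (torch.randn(1, 8, 64) for _ in range(3))
torch.testing.assert_close(
    reduced_precision_attention(q, k, v),
    reference_attention(q, k, v),
    atol=1e-2, rtol=1e-2,  # loose tolerance: fp16 rounding alone stays well inside this
)
```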

@WoosukKwon WoosukKwon merged commit 130d5fd into main May 4, 2023
@WoosukKwon WoosukKwon deleted the attn-kernel-bugfix branch May 4, 2023 09:56
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
SUMMARY:
- Added `licenses` subfolder for directories
- Moved `LICENSE-apache` into `licenses` directory
- Updated `setup.py` with NM Community License

TEST PLAN:
None
dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this pull request Jul 22, 2024
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Aug 15, 2024
maxdebayser pushed a commit to maxdebayser/vllm that referenced this pull request Feb 13, 2025
…ncies (vllm-project#68)

I already tried to fix this using IBM/vllm#66
but upstream didn't like that change (the behaviour to filter out
comments containing torch was intentional). After [some
discussion](vllm-project#12255), we agreed
on a different solution implemented in this PR. Note that I reverted the
changes from vllm-project#66 by force-pushing main.

Note that this has already been merged upstream in
vllm-project#12260, but I'm cherry-picking
the fix here since it is blocking the CI builds.
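As a rough illustration of the behaviour mentioned above (dropping requirement entries that mention torch), the snippet below is a hypothetical sketch: the function name and logic are invented here and are not vLLM's actual setup.py code or the fix from the referenced PRs.

```python
# Hypothetical sketch only: not vLLM's setup.py logic or the fix in the PRs
# referenced above. It just illustrates filtering out requirement entries
# that mention torch, on the assumption that torch is installed separately.
def filter_requirements(lines):
    kept = []
    for line in lines:
        entry = line.split("#", 1)[0].strip()  # drop inline comments
        if not entry:
            continue  # blank line or comment-only line
        if "torch" in entry:
            continue  # torch handled separately, so skip it here
        kept.append(entry)
    return kept

print(filter_requirements(["torch==2.1.2", "numpy", "# note about torch", "requests"]))
# -> ['numpy', 'requests']
```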
Development

Successfully merging this pull request may close these issues.

A critical bug in attention kernel after refactoring