[Core] Enable LoRA support for classification model #24596
Conversation
Code Review
This pull request introduces `ClassifierWithLoRA` to enable LoRA support for classification models. However, the implementation in the new file `vllm/lora/layers/classifier.py` has several critical issues: a hardcoded value that should be configurable, incorrect initialization of an internal state variable that will lead to runtime errors, and an incorrect and incomplete `forward` method implementation. These issues must be addressed for the layer to function correctly.
vllm/lora/layers/classifier.py (Outdated)

    ) -> None:
        self.lora_config = lora_config

        self.max_class_label =3  # self.lora_config.max_class_label
The `max_class_label` is hardcoded to `3`. This appears to be a placeholder and should be derived from `lora_config`, as suggested by the commented-out code. Hardcoding this value prevents the layer from being used for classification tasks with a different number of labels.
Suggested change:
-        self.max_class_label =3  # self.lora_config.max_class_label
+        self.max_class_label = self.lora_config.max_class_label
vllm/lora/layers/classifier.py (Outdated)

        self.lora_config = lora_config

        self.max_class_label =3  # self.lora_config.max_class_label
        self._label_slot = [-1] * self.max_class_label
The `_label_slot` list is initialized with a size of `self.max_class_label`. However, it is indexed by a LoRA slot index in `reset_lora` and `set_lora`, which can go up to `max_loras - 1`. If `max_loras` is greater than `max_class_label`, this will cause an `IndexError`. The size of `_label_slot` should be `max_loras` to correctly track the state for each LoRA slot.
Suggested change:
-        self._label_slot = [-1] * self.max_class_label
+        self._label_slot = [-1] * max_loras
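To make the indexing concern concrete, here is a minimal standalone sketch of slot bookkeeping sized by `max_loras`. The class and method bodies are illustrative assumptions for this comment, not code from the PR:

```python
# Hypothetical sketch (not this PR's code): the per-slot bookkeeping must be
# sized by max_loras because reset_lora/set_lora receive a LoRA *slot* index.
class ClassifierSlotState:
    def __init__(self, max_loras: int) -> None:
        # One entry per LoRA slot; -1 means "no adapter loaded in this slot".
        self._label_slot = [-1] * max_loras

    def reset_lora(self, index: int) -> None:
        # `index` ranges over [0, max_loras); a list sized by max_class_label
        # would raise IndexError whenever max_loras > max_class_label.
        self._label_slot[index] = -1

    def set_lora(self, index: int, num_labels: int) -> None:
        # Record how many class labels this adapter's classifier head outputs.
        self._label_slot[index] = num_labels


state = ClassifierSlotState(max_loras=8)
state.set_lora(5, num_labels=3)  # slot index 5 is valid only with max_loras sizing
```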
vllm/lora/layers/classifier.py (Outdated)

    def forward(self, input_: torch.Tensor) -> torch.Tensor:
        """Forward of ClassifierWithLoRA

        Args:
            input_: Tensor whose last dimension is `input_size`.

        Returns:
            - output

        """
        y = torch.zeros(self.input_size, self.max_class_label, device=input_.device)

        self.punica_wrapper.add_shrink(
            y,
            self.lora_a_stacked,
            add_input=True)
        # Cast y using self._label_slot
        return y
The `forward` method has several critical issues making it non-functional and incomplete:

- **Unused input:** The `input_` tensor is not used in the computation, which is incorrect for a forward pass.
- **Incorrect `add_shrink` call:** The arguments to `self.punica_wrapper.add_shrink` are mismatched with its definition (`add_shrink(y, x, lora_a_stacked, scale, **kwargs)`). The `input_` tensor should be passed as `x`, `self.lora_a_stacked` should be wrapped in a tuple for the `lora_a_stacked` argument, and a `scale` factor is missing. The current call will cause a `TypeError`.
- **Incomplete logic:** The comment `# Cast y using self._label_slot` on line 79 indicates missing implementation.
- **Incorrect `y` shape:** The `y` tensor is initialized with shape `(self.input_size, self.max_class_label)`, which is missing a batch dimension.
- **Missing LoRA B weights:** A standard LoRA layer uses both the A and B weight matrices. This implementation only seems to use `lora_a`, and `lora_b` is ignored in `set_lora`. The `forward` method should incorporate the full LoRA logic.

This method needs a complete rewrite to correctly implement the forward pass; a reference sketch of the required math follows below.
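For reference, here is a minimal eager-mode sketch of the computation the `forward` pass needs to perform for a single adapter: shrink with the LoRA A matrix, expand with the B matrix, then trim the label dimension using the count tracked in `_label_slot`. This intentionally bypasses the punica kernels and batched slot mapping, and the tensor layouts are assumptions rather than the PR's final implementation:

```python
import torch


def classifier_lora_reference(
    input_: torch.Tensor,   # (num_tokens, input_size)
    lora_a: torch.Tensor,   # (rank, input_size) for one adapter
    lora_b: torch.Tensor,   # (max_class_label, rank), zero-padded rows
    num_labels: int,        # this adapter's label count, from _label_slot
    scale: float = 1.0,
) -> torch.Tensor:
    # Shrink: project the hidden states down to the LoRA rank.
    shrunk = input_ @ lora_a.t()            # (num_tokens, rank)
    # Expand: project up to the (padded) classifier label space.
    logits = scale * (shrunk @ lora_b.t())  # (num_tokens, max_class_label)
    # Keep only the labels this adapter actually defines.
    return logits[:, :num_labels]
```

A production implementation would instead route through `punica_wrapper` so that tokens belonging to different adapters in the same batch each use their own A/B weights.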
Had a question on this PR: I thought LoRA already works for classification models? Or is this to extend support to LoRA weights for the classifier head as well as the base model?
Yes, mainly for multi-LoRA support of the classifier head.
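For context, a hedged sketch of the kind of usage this would enable, assuming the offline `classify` entrypoint accepts a `lora_request` once this lands; the model and adapter names/paths below are placeholders:

```python
from vllm import LLM
from vllm.lora.request import LoRARequest

# Placeholder names/paths: two classifier-head adapters over one base model.
llm = LLM(model="path/to/base-classification-model",
          task="classify",
          enable_lora=True,
          max_loras=2)

sentiment = LoRARequest("sentiment-head", 1, "/adapters/sentiment")
toxicity = LoRARequest("toxicity-head", 2, "/adapters/toxicity")

# Each request can route to a different classifier-head adapter.
print(llm.classify(["Great movie!"], lora_request=sentiment))
print(llm.classify(["Great movie!"], lora_request=toxicity))
```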
This pull request has merge conflicts that must be resolved before it can be merged.
Purpose
FIX #23719
FIX #19623
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
- Documentation update, including `supported_models.md` and `examples`, for a new model.