fix init weights issue for critic/reward model #983
Conversation
Hi @jouw, can you fix the format and DCO errors?
Signed-off-by: Hongwei Chen <[email protected]> Signed-off-by: jouw <[email protected]>
Signed-off-by: jouw <[email protected]>
Signed-off-by: raviguptaamd <[email protected]> Signed-off-by: jouw <[email protected]>
Force-pushed from e26fe55 to 9a7062b
Hi @hwchen2017, I have fixed the error. Please help review the change, thanks!
Signed-off-by: jouw <[email protected]>
Hi @hwchen2017, I have fixed the errors. Can you help merge the change? Thanks!
Hi @jouw, it seems this breaks our CI test using DS-Chat. Can you share more about the error you encountered?
Let's revert this PR and fix it.
This reverts commit 3d83278. Signed-off-by: Hongwei Chen <[email protected]>
Add the following code to disable the weight-initialization step; otherwise the model weights get randomly initialized and an error is raised. @hwchen2017
with no_init_weights():
Detailed explanation below.
Take Qwen3Model as an example; the function call stack is:
Qwen3Model.__init__() -> PreTrainedModel.post_init() -> PreTrainedModel.init_weights()
If we don't wrap the call model = model_class.from_config(model_config) in
with no_init_weights():
then the _init_weights flag stays true, the initialization pass runs, and the error occurs. See https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py