
Fix: SAE input is collected instead of SAE latent #131


Open
wants to merge 4 commits into base: main

Conversation

mahbubcseju

The latent cache should store the sparse activations generated by the SAE encoder. However, the following line mistakenly stores the decoder's output instead, because it calls the forward function rather than the encode function.

sae_latents = self.hookpoint_to_sparse_encode[hookpoint](
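
For clarity, here is a minimal, purely illustrative sketch of the distinction this comment is drawing. The ToySae class and its interface are assumptions made up for illustration (not the sparsify API): forward() would return the decoder's reconstruction of the input, while encode() would return the sparse latents that the cache is meant to hold.

import torch
import torch.nn as nn

class ToySae(nn.Module):
    # Toy SAE used only to illustrate the forward-vs-encode distinction.
    def __init__(self, d_model: int, d_sae: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_sae)
        self.decoder = nn.Linear(d_sae, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Sparse latent activations: what the latent cache should store.
        return torch.relu(self.encoder(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruction of the input (decoder output), not the latents.
        return self.decoder(self.encode(x))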

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Md Mahbubur Rahman does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@SrGonao
Collaborator

SrGonao commented Jun 3, 2025

This is incorrect. Hookpoint to sparse encode already returns the correct encoded latents (via https://github.com/EleutherAI/delphi/blob/main/delphi/sparse_coders/load_sparsify.py#L26-L34). In fact the suggested change is incorrect because encode returns a tuple with (top_indices, top_acts). Which version of sparsify are you using where this works?

@mahbubcseju
Author

Ah, I see! I was actually running this for the Gemma model; the issue seems to be specific to it.

hookpoint_to_sparse_encode = load_gemma_autoencoders(

This directly calls the load_gemma_autoencoders function; I believe this line should call the load_gemma_hooks function instead. Also, I don't see any sparsify settings for the Gemma model.
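
For reference, the change being discussed amounts to swapping the loader call. The sketch below is written entirely as comments because the actual argument list is not shown in the thread and is assumed here; only the two function names come from this conversation.

# current call, per the quoted line above:
# hookpoint_to_sparse_encode = load_gemma_autoencoders(...)
# proposed fix from this comment, presumably returning encode hooks that mirror the sparsify path:
# hookpoint_to_sparse_encode = load_gemma_hooks(...)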

@SrGonao
Collaborator

SrGonao commented Jun 3, 2025

Nice catch, I think that is the correct fix! Could you update the PR to change that line instead of the current change?

@mahbubcseju
Author

Just changed it. Thank you.

@SrGonao
Collaborator

SrGonao commented Jun 4, 2025

And can you confirm this works as expected? Otherwise I can do an evaluation run and check whether we get reasonable scores.
