
Fix: SAE input is collected instead of SAE latent #131


Open
wants to merge 4 commits into base: main

Conversation

mahbubcseju

The latent cache should store the sparse activations generated by the SAE encoder. However, the following line mistakenly stores the decoder's output instead, because it calls the forward function rather than the encode function.

sae_latents = self.hookpoint_to_sparse_encode[hookpoint](
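
For clarity, here is a minimal, purely illustrative sketch of the distinction this comment is drawing. The ToySae class and its interface are assumptions made up for illustration (not the sparsify API): forward() would return the decoder's reconstruction of the input, while encode() would return the sparse latents that the cache is meant to hold.

import torch
import torch.nn as nn

class ToySae(nn.Module):
    # Toy SAE used only to illustrate the forward-vs-encode distinction.
    def __init__(self, d_model: int, d_sae: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_sae)
        self.decoder = nn.Linear(d_sae, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Sparse latent activations: what the latent cache should store.
        return torch.relu(self.encoder(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruction of the input (decoder output), not the latents.
        return self.decoder(self.encode(x))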

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Md Mahbubur Rahman does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@SrGonao
Collaborator

SrGonao commented Jun 3, 2025

This is incorrect. Hookpoint to sparse encode already returns the correct encoded latents (via https://github.com/EleutherAI/delphi/blob/main/delphi/sparse_coders/load_sparsify.py#L26-L34). In fact the suggested change is incorrect because encode returns a tuple with (top_indices, top_acts). Which version of sparsify are you using where this works?

@mahbubcseju
Author

Ah, I see! I was actually running this for the Gemma model; the issue seems to be specific to it.

hookpoint_to_sparse_encode = load_gemma_autoencoders(

This directly calls the load_gemma_autoencoders function; I believe this line should call the load_gemma_hooks function instead. Also, I don't see any sparsify settings for the Gemma model.
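
For reference, the change being discussed amounts to swapping the loader call. The sketch below is written entirely as comments because the actual argument list is not shown in the thread and is assumed here; only the two function names come from this conversation.

# current call, per the quoted line above:
# hookpoint_to_sparse_encode = load_gemma_autoencoders(...)
# proposed fix from this comment, presumably returning encode hooks that mirror the sparsify path:
# hookpoint_to_sparse_encode = load_gemma_hooks(...)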

@SrGonao
Collaborator

SrGonao commented Jun 3, 2025

Nice catch, I think that is the correct fix! Could you update the PR to change that line instead of the current change?

@mahbubcseju
Author

Just changed it. Thank you.

@SrGonao
Collaborator

SrGonao commented Jun 4, 2025

And can you confirm this works as expected? Otherwise I can do an evaluation run and check whether we get reasonable scores.
