
Fix parameter naming: Separate l0_lambda from l1_coefficient #376

Closed · muyo8692 wants to merge 7 commits

Conversation

@muyo8692 commented Nov 14, 2024

Resolves #360

Description

Currently, when using the JumpReLU architecture, which involves L0 regularization, the code reuses the l1_coefficient parameter name for the L0 loss calculation. This naming is confusing and makes the configuration less intuitive. This PR introduces a dedicated l0_lambda parameter to clearly separate the two regularization terms.

Changes

  • Added a new config parameter, l0_lambda, for the L0 loss calculation (default: 0.0)
  • Added validation that enforces an explicit l0_lambda when architecture="jumprelu" is selected (see the sketch after this list)
  • Preserved existing training behavior: correctly configured models train as before, while using JumpReLU without specifying l0_lambda raises an error to prevent silent failures
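
A minimal sketch of what the validation described above might look like, assuming a dataclass-style config; the class name, defaults, and error message here are illustrative assumptions, not the actual sae_lens source:

```python
from dataclasses import dataclass


@dataclass
class RunnerConfigSketch:
    # Hypothetical stand-in for the real runner config class.
    architecture: str = "standard"
    l1_coefficient: float = 5.0  # sparsity penalty for L1-based architectures
    l0_lambda: float = 0.0       # new: dedicated coefficient for the L0 loss

    def __post_init__(self) -> None:
        # JumpReLU uses an L0 penalty, so require an explicit l0_lambda
        # rather than silently reusing l1_coefficient.
        if self.architecture == "jumprelu" and self.l0_lambda == 0.0:
            raise ValueError(
                "architecture='jumprelu' requires an explicit l0_lambda; "
                "l1_coefficient is no longer reused for the L0 loss."
            )
```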

Testing

  • Verified numerical equivalence: models trained with the original code and with the modified code produce identical metrics
  • Added an error-handling test for a missing l0_lambda with the JumpReLU architecture (see the test sketch below)
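
The error-handling test might look roughly like this, reusing the hypothetical RunnerConfigSketch from the sketch above; the test name and match pattern are assumptions:

```python
import pytest


def test_jumprelu_requires_explicit_l0_lambda():
    # Constructing a JumpReLU config without l0_lambda should fail loudly.
    with pytest.raises(ValueError, match="l0_lambda"):
        RunnerConfigSketch(architecture="jumprelu")
```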

Notes

This change improves code clarity while maintaining backward compatibility through explicit error handling rather than silent parameter reuse.

[Screenshot attached, 2024-11-14]

@chanind (Collaborator) commented Nov 14, 2024

I'm a bit worried about adding lots of architecture-specific parameters to the config, as this seems like it will get out of hand pretty quickly. I agree that using l1_coefficient for JumpReLU doesn't feel great, but I worry about the confusion burden of having lots of different parameters in the config that each apply to only one architecture. Longer-term, we should probably move to a config structure with something like an architecture_params dictionary, where each architecture can have its own settings.

If we do decide to add this parameter, I'm not a fan of the name l0_lambda, since we already have l1_coefficient and the two should be named consistently.

The benefit of an approach like the one in #360, where we just have a single sparsity_coefficient, is that we don't need multiple params and multiple schedulers when only one is actually relevant (see the sketch below).
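
For illustration, the two shapes under discussion might look like this; both sparsity_coefficient and architecture_params are proposals in this thread, not existing sae_lens config fields:

```python
# Option A (issue #360): one shared coefficient, applied to whichever
# sparsity penalty (L0 or L1) the chosen architecture uses.
config_a = dict(
    architecture="jumprelu",
    sparsity_coefficient=1e-3,
)

# Option B (longer-term idea): per-architecture settings grouped in a
# dictionary, keeping the top-level config architecture-agnostic.
config_b = dict(
    architecture="jumprelu",
    architecture_params={"l0_lambda": 1e-3},
)
```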

What do others think? cc @curt-tigges @hijohnnylin @anthonyduong9 @jbloomAus

Review comments on .vscode/settings.json and sae_lens/training/optim.py (outdated, resolved)
@muyo8692 (Author) commented
> (quoting @chanind's comment above)

Thanks for the comments! I totally agree with adding sparsity_coefficient instead of individual configs. I should have time to work on it this weekend.

@muyo8692 (Author) commented
@chanind I've made the adjustment; it should work fine now.

@chanind (Collaborator) commented Nov 20, 2024

Thank you for doing this! There are a few more complications, though:

  • @curt-tigges might be refactoring the configs / SAE classes soon to have separate architecture-specific config sections, which would make this unnecessary. @curt-tigges can speak to the timeline for that.
  • We should keep the old options working for backwards compatibility, with a deprecation message instructing the user to use sparsity_coefficient if their config still sets l1_coefficient. Otherwise this would be a breaking change (see the sketch after this list).
  • Docs/notebooks also need to be updated to talk about sparsity_coefficient instead of l1_coefficient. This is probably just a find/replace.
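
A backwards-compatibility shim along those lines might look like this; the field handling and warning wording are illustrative assumptions, not the eventual implementation:

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenamedConfigSketch:
    # Hypothetical config fragment showing only the renamed field.
    sparsity_coefficient: Optional[float] = None
    l1_coefficient: Optional[float] = None  # deprecated alias

    def __post_init__(self) -> None:
        if self.l1_coefficient is not None:
            warnings.warn(
                "l1_coefficient is deprecated; use sparsity_coefficient instead.",
                DeprecationWarning,
            )
            # Keep old configs working by mapping the legacy value across.
            if self.sparsity_coefficient is None:
                self.sparsity_coefficient = self.l1_coefficient
```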

I'm not sure it's worth it, though, if Curt is going to be refactoring this soon anyway. Up to you if you think it's worth doing!

@muyo8692 (Author) commented
Good to hear that there will be a refactoring of the configs; in that case I think I'll just wait for the new version! Closing the PR.

@muyo8692 closed this Nov 21, 2024
Linked issue: [Proposal] Rename l1_coefficient to sparsity_coefficient (#360)