Commit

chore: fixing whitespace so docs render as list not paragraph (#374)
chanind authored Nov 11, 2024
1 parent aa98caf commit 156ddc9
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions docs/training_saes.md
@@ -117,6 +117,7 @@ It may sound daunting to train a real SAE but nothing could be further from the
SAE training best practices are still evolving rapidly, so the default settings in SAELens may not be optimal for real SAEs. Fortunately, it's easy to see the training configuration of any SAE trained with SAELens and copy its values as a starting point! If there's an SAE on Hugging Face trained using SAELens, you can see all of its training settings in the `cfg.json` file in the SAE's repo. For instance, here's the [cfg.json](https://huggingface.co/jbloom/Gemma-2b-Residual-Stream-SAEs/blob/main/gemma_2b_blocks.12.hook_resid_post_16384/cfg.json) for a Gemma 2B standard SAE trained by Joseph Bloom. You can also get the config in SAELens as the second return value from `SAE.from_pretrained()`. For instance, the same config mentioned above can be accessed as `cfg_dict = SAE.from_pretrained("jbloom/Gemma-2b-Residual-Stream-SAEs", "gemma_2b_blocks.12.hook_resid_post_16384")[1]`. To get some inspiration, you can browse all SAEs uploaded to Hugging Face via SAELens with the [SAELens library tag](https://huggingface.co/models?library=saelens).
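
Once you have a reference config as a plain dictionary (from `cfg.json` or from the second return value of `SAE.from_pretrained()`), it can help to see exactly where your own settings differ from it. The following is a minimal sketch of that comparison using only the standard library. The helper name `config_differences` is hypothetical and not part of SAELens, and the keys and values in the two example dictionaries are illustrative placeholders rather than the contents of any real `cfg.json`.

```python
from __future__ import annotations

from typing import Any

MISSING = "<missing>"  # marker for a key that is absent from one of the configs


def config_differences(
    reference: dict[str, Any], mine: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {key: (reference_value, my_value)} for every setting that differs."""
    diffs: dict[str, tuple[Any, Any]] = {}
    for key in sorted(set(reference) | set(mine)):
        ref_val = reference.get(key, MISSING)
        my_val = mine.get(key, MISSING)
        if ref_val != my_val:
            diffs[key] = (ref_val, my_val)
    return diffs


# Placeholder configs for illustration only; real configs have many more keys.
reference_cfg = {"d_sae": 16384, "lr": 5e-5, "dtype": "float32"}
my_cfg = {"d_sae": 16384, "lr": 3e-4, "dtype": "bfloat16", "autocast": True}

for key, (ref_val, my_val) in config_differences(reference_cfg, my_cfg).items():
    print(f"{key}: reference={ref_val!r}, mine={my_val!r}")
```

Settings that show up in the output are the ones worth reviewing before you start a long training run.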

Some general performance tips:

- If your GPU supports it (most modern NVIDIA GPUs do), setting `autocast=True` and `autocast_lm=True` in the config will dramatically speed up training.
- We find that SAEs often struggle to train well with `dtype="bfloat16"`. We aren't sure why this is, but make sure to compare the SAE quality if you change the dtype.
- You can try turning on `compile_sae=True` and `compile_llm=True` in the config to see if it makes training faster. Your mileage may vary, though, since compilation can be finicky.
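
As a rough sketch, the tips above could be collected into one set of overrides and layered on top of whatever base settings you start from. The flag names come from the tips. The helper `with_performance_overrides` and the base config are hypothetical, and keeping `dtype` at `"float32"` and compilation off as a baseline is an assumption made here, not a recommendation from the docs.

```python
from __future__ import annotations

from typing import Any

# Flag names are taken from the tips above; the chosen values are assumptions.
PERFORMANCE_OVERRIDES: dict[str, Any] = {
    "autocast": True,      # mixed-precision autocast for the SAE
    "autocast_lm": True,   # mixed-precision autocast for the language model
    "dtype": "float32",    # compare SAE quality before switching to bfloat16
    "compile_sae": False,  # flip to True and time a short run to see if it helps
    "compile_llm": False,  # same experiment for the language model
}


def with_performance_overrides(
    base_cfg: dict[str, Any],
    overrides: dict[str, Any] = PERFORMANCE_OVERRIDES,
) -> dict[str, Any]:
    """Return a new dict with the overrides applied; base_cfg is left unchanged."""
    return {**base_cfg, **overrides}


# Hypothetical base settings, e.g. values copied from a reference cfg.json.
base_cfg = {"lr": 5e-5, "autocast": False}
train_cfg = with_performance_overrides(base_cfg)
print(train_cfg)
```

The resulting dictionary is only a container for keyword arguments. How it is passed to the SAELens training config depends on the SAELens version you have installed.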