
Remove einsum in logit_attrs in ActivationCache #788

Merged

Conversation

degenfabian
Contributor

Description

There are known accuracy issues that are, to some extent, caused by the use of einsum across large parts of the codebase. This PR removes one of these usages in the logit_attrs function in ActivationCache.py. There is no specific issue associated with this PR, although removing many of the einsum usages could help resolve the implementation inaccuracies raised in several issues. I have not added specific test cases for this change, but I ran all the models (except those that require authorization and the paid_cpu models) in the Colab_Compatibility notebook and did not run into any compatibility issues introduced by this change. Additionally, the changed code path is exercised 8 times across the unit and acceptance tests.
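For context, the kind of change involved here is replacing a per-position dot product written as an einsum with an equivalent broadcasted multiply-and-sum. The sketch below shows that pattern in isolation; the tensor names and shapes are illustrative assumptions, not the actual signature of logit_attrs.

```python
import torch

# Illustrative shapes (assumptions, not the actual logit_attrs signature):
# residual_stack:   [num_components, batch, pos, d_model]
# logit_directions: [batch, pos, d_model]
residual_stack = torch.randn(4, 2, 5, 16)
logit_directions = torch.randn(2, 5, 16)

# einsum form: dot product over d_model for each component/batch/position
attrs_einsum = torch.einsum("cbpd,bpd->cbp", residual_stack, logit_directions)

# equivalent form without einsum: broadcast, multiply, sum over the last dim
attrs_no_einsum = (residual_stack * logit_directions).sum(dim=-1)

assert torch.allclose(attrs_einsum, attrs_no_einsum, atol=1e-6)
```

The two forms are numerically equivalent up to floating-point accumulation order, which is why a change like this can be validated by re-running existing tests and the compatibility notebook rather than by adding new correctness tests.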

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

@bryce13950 bryce13950 merged commit a695f79 into TransformerLensOrg:dev Nov 26, 2024
13 checks passed
@degenfabian degenfabian deleted the remove_einsum_in_logit_attrs branch November 27, 2024 01:11
degenfabian added a commit to degenfabian/TransformerLens that referenced this pull request Dec 4, 2024
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
bryce13950 added a commit that referenced this pull request Dec 5, 2024
* fix prepend_bos to False by default for bloom model family

* add comment

* edit documentation

* fix wrong expected value for bloom-560m model loss

* fix expected loss value for bloom model computed with google colab

* set prepend_bos to user value, then to value in model config and then default to true

* fix format

* remove log points in test_hooked_transformer

* remove einsum in forward pass in AbstractAttention (#783)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* Colab compatibility bug fixes (#794)

* call functions on model object instead of model string in run_encoder_decoder_set

* remove generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet

* add testing function for HookedEncoders and stop testing BERT as HookedTransformer

* clear cell output to prevent test from failing

* add comment about bert working with free version of colab

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* remove einsum usage from create_alibi_bias function in AbstractAttention (#781)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* updated token location (#797)

* remove einsum from apply_causal_mask in abstract_attention (#782)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* clarified arguments a bit for hook_points (#799)

* remove einsum in logit_attrs in ActivationCache (#788)

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Remove einsum in compute_head_results in ActivationCache (#789)

* remove einsum in compute_head_results in ActivationCache

* ran format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Remove einsum usage in refactor_factored_attn_matrices in HookedTransformer (#791)

* remove einsum usage in refactor_factored_attn_matrices in HookedTransformer

* fix format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Remove einsum usage in _get_w_in_matrix in SVDInterpreter (#792)

* remove einsum usage in _get_w_in_matrix in SVDInterpreter

* fix format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* Remove einsum usage in forward function of BertMLMHead (#793)

* remove einsum usage in forward function of BertMLMHead

* fix format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Set default_prepend_bos to false in Bloom configuration

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>