Remove einsum usage from create_alibi_bias function #781
Merged: bryce13950 merged 4 commits into TransformerLensOrg:dev from degenfabian:remove_einsum_in_create_alibi_bias on Nov 25, 2024
Conversation
Same story as #783
degenfabian added a commit to degenfabian/TransformerLens that referenced this pull request on Dec 4, 2024:
…ion (TransformerLensOrg#781)
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
bryce13950 added a commit that referenced this pull request on Dec 5, 2024:
* fix prepend_bos to False by default for bloom model family
* add comment
* edit documentation
* fix wrong expected value for bloom-560m model loss
* fix expected loss value for bloom model computed with Google Colab
* set prepend_bos to user value, then to value in model config, and then default to True
* fix format
* remove log points in test_hooked_transformer
* remove einsum in forward pass in AbstractAttention (#783)
* Colab compatibility bug fixes (#794)
  * call functions on model object instead of model string in run_encoder_decoder_set
  * remove generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet
  * add testing function for HookedEncoders and stop testing BERT as HookedTransformer
  * clear cell output to prevent test from failing
  * add comment about BERT working with free version of Colab
* remove einsum usage from create_alibi_bias function in AbstractAttention (#781)
* updated token location (#797)
* remove einsum from apply_causal_mask in abstract_attention (#782)
* clarified arguments a bit for hook_points (#799)
* remove einsum in logit_attrs in ActivationCache (#788)
* remove einsum in compute_head_results in ActivationCache (#789)
* remove einsum usage in refactor_factored_attn_matrices in HookedTransformer (#791)
* remove einsum usage in _get_w_in_matrix in SVDInterpreter (#792)
* remove einsum usage in forward function of BertMLMHead (#793)
* set default_prepend_bos to False in Bloom configuration

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
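The einsum-removal commits listed above all follow the same pattern: each einsum call is replaced by an equivalent plain matrix multiply or broadcast, which tends to be faster and avoids einsum-related numerical quirks. A minimal sketch of the pattern (NumPy here for a self-contained example; the project itself uses PyTorch, and the shapes and variable names are illustrative, not the library's actual code):

```python
import numpy as np

# Illustrative attention-score-style contraction: contract over d_head,
# keeping query and key positions. Shapes are made up for the example.
q = np.arange(15, dtype=np.float64).reshape(3, 5)  # (query_pos, d_head)
k = np.arange(20, dtype=np.float64).reshape(4, 5)  # (key_pos, d_head)

# Before: einsum spelling of the contraction.
scores_einsum = np.einsum("qd,kd->qk", q, k)

# After: the equivalent plain matmul, as in the PRs above.
scores_matmul = q @ k.T

assert np.allclose(scores_einsum, scores_matmul)
```

The rewrite changes only how the operation is spelled, not what it computes, which is why the PR author could rely on existing tests exercising the same code paths.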
Description
Known numerical-accuracy issues are partly caused by the use of einsum across large parts of the codebase. This PR removes one such usage, in the create_alibi_bias function. No specific issue is associated with this PR, but removing einsum usages across the codebase may help resolve the implementation inaccuracies reported in many issues. I have not added dedicated test cases for this change; however, I ran all the models (except those requiring authorization and the paid_cpu models) in the Colab_Compatibility notebook without encountering any compatibility issues introduced by this change, and the changed code is executed over 250 times across the unit and acceptance tests.
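To make the change concrete: an ALiBi bias is a per-head slope multiplied over a matrix of relative position distances, and that outer-product-style einsum can be replaced by ordinary broadcasting. The sketch below (NumPy for a self-contained example; the real function is PyTorch, and the function names, the symmetric distance matrix, and the shapes here are simplifications, not the actual TransformerLens implementation) shows the two spellings agree:

```python
import numpy as np

def create_alibi_bias_einsum(slopes, n_ctx):
    # slopes: (n_heads,) per-head ALiBi slope.
    # Returns bias of shape (n_heads, n_ctx, n_ctx).
    rows = np.arange(n_ctx)[:, None]
    cols = np.arange(n_ctx)[None, :]
    distance = -np.abs(rows - cols).astype(np.float64)  # simplified distance matrix
    # Outer product over heads, spelled with einsum.
    return np.einsum("h,qk->hqk", slopes, distance)

def create_alibi_bias_broadcast(slopes, n_ctx):
    rows = np.arange(n_ctx)[:, None]
    cols = np.arange(n_ctx)[None, :]
    distance = -np.abs(rows - cols).astype(np.float64)
    # Same result via broadcasting: (n_heads, 1, 1) * (n_ctx, n_ctx).
    return slopes[:, None, None] * distance

slopes = np.array([0.5, 0.25])
a = create_alibi_bias_einsum(slopes, 4)
b = create_alibi_bias_broadcast(slopes, 4)
assert a.shape == (2, 4, 4)
assert np.allclose(a, b)
```

Because the two forms are mathematically identical, existing tests that exercise this code path (over 250 executions, per the description above) serve as the regression check.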
Type of change
Checklist: