Remove einsum usage from create_alibi_bias function #781
Merged: bryce13950 merged 4 commits into TransformerLensOrg:dev from degenfabian:remove_einsum_in_create_alibi_bias on Nov 25, 2024
Conversation
Same story as #783
degenfabian added a commit to degenfabian/TransformerLens that referenced this pull request on Dec 4, 2024:
…ion (TransformerLensOrg#781)
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
bryce13950 added a commit that referenced this pull request on Dec 5, 2024:
* fix prepend_bos to False by default for bloom model family
* add comment
* edit documentation
* fix wrong expected value for bloom-560m model loss
* fix expected loss value for bloom model computed with Google Colab
* set prepend_bos to user value, then to value in model config, and then default to True
* fix format
* remove log points in test_hooked_transformer
* remove einsum in forward pass in AbstractAttention (#783)
* Colab compatibility bug fixes (#794)
  * call functions on model object instead of model string in run_encoder_decoder_set
  * remove generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet
  * add testing function for HookedEncoders and stop testing BERT as HookedTransformer
  * clear cell output to prevent test from failing
  * add comment about BERT working with free version of Colab
* remove einsum usage from create_alibi_bias function in AbstractAttention (#781)
* updated token location (#797)
* remove einsum from apply_causal_mask in abstract_attention (#782)
* clarified arguments a bit for hook_points (#799)
* remove einsum in logit_attrs in ActivationCache (#788)
* remove einsum in compute_head_results in ActivationCache (#789)
* remove einsum usage in refactor_factored_attn_matrices in HookedTransformer (#791)
* remove einsum usage in _get_w_in_matrix in SVDInterpreter (#792)
* remove einsum usage in forward function of BertMLMHead (#793)
* set default_prepend_bos to False in Bloom configuration

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
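The einsum-removal commits listed above all follow the same pattern: each einsum call is replaced by an equivalent plain matrix multiply or broadcast, which tends to be faster and avoids einsum-related numerical quirks. A minimal sketch of the pattern (NumPy here for a self-contained example; the project itself uses PyTorch, and the shapes and variable names are illustrative, not the library's actual code):

```python
import numpy as np

# Illustrative attention-score-style contraction: contract over d_head,
# keeping query and key positions. Shapes are made up for the example.
q = np.arange(15, dtype=np.float64).reshape(3, 5)  # (query_pos, d_head)
k = np.arange(20, dtype=np.float64).reshape(4, 5)  # (key_pos, d_head)

# Before: einsum spelling of the contraction.
scores_einsum = np.einsum("qd,kd->qk", q, k)

# After: the equivalent plain matmul, as in the PRs above.
scores_matmul = q @ k.T

assert np.allclose(scores_einsum, scores_matmul)
```

The rewrite changes only how the operation is spelled, not what it computes, which is why the PR author could rely on existing tests exercising the same code paths.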
Description
Known numerical-accuracy issues are partly caused by the use of einsum across large parts of the codebase. This PR removes one such usage, in the create_alibi_bias function. No specific issue is associated with this PR, but removing einsum usages across the codebase may help resolve the implementation inaccuracies reported in many issues. I have not added dedicated test cases for this change; however, I ran all the models (except those requiring authorization and the paid_cpu models) in the Colab_Compatibility notebook without encountering any compatibility issues introduced by this change, and the changed code is executed over 250 times across the unit and acceptance tests.
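To make the change concrete: an ALiBi bias is a per-head slope multiplied over a matrix of relative position distances, and that outer-product-style einsum can be replaced by ordinary broadcasting. The sketch below (NumPy for a self-contained example; the real function is PyTorch, and the function names, the symmetric distance matrix, and the shapes here are simplifications, not the actual TransformerLens implementation) shows the two spellings agree:

```python
import numpy as np

def create_alibi_bias_einsum(slopes, n_ctx):
    # slopes: (n_heads,) per-head ALiBi slope.
    # Returns bias of shape (n_heads, n_ctx, n_ctx).
    rows = np.arange(n_ctx)[:, None]
    cols = np.arange(n_ctx)[None, :]
    distance = -np.abs(rows - cols).astype(np.float64)  # simplified distance matrix
    # Outer product over heads, spelled with einsum.
    return np.einsum("h,qk->hqk", slopes, distance)

def create_alibi_bias_broadcast(slopes, n_ctx):
    rows = np.arange(n_ctx)[:, None]
    cols = np.arange(n_ctx)[None, :]
    distance = -np.abs(rows - cols).astype(np.float64)
    # Same result via broadcasting: (n_heads, 1, 1) * (n_ctx, n_ctx).
    return slopes[:, None, None] * distance

slopes = np.array([0.5, 0.25])
a = create_alibi_bias_einsum(slopes, 4)
b = create_alibi_bias_broadcast(slopes, 4)
assert a.shape == (2, 4, 4)
assert np.allclose(a, b)
```

Because the two forms are mathematically identical, existing tests that exercise this code path (over 250 executions, per the description above) serve as the regression check.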
Type of change
Checklist: