Commit
Set default_prepend_bos to False in Bloom model configuration (#806)
* fix prepend_bos to False by default for bloom model family
* add comment
* edit documentation
* fix wrong expected value for bloom-560m model loss
* fix expected loss value for bloom model computed with google colab
* set prepend_bos to user value, then to value in model config and then default to true
* fix format
* remove log points in test_hooked_transformer

* remove einsum in forward pass in AbstractAttention (#783)
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* Colab compatibility bug fixes (#794)
* call functions on model object instead of model string in run_encoder_decoder_set
* remove generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet
* add testing function for HookedEncoders and stop testing BERT as HookedTransformer
* clear cell output to prevent test from failing
* add comment about bert working with free version of colab
---------
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* remove einsum usage from create_alibi_bias function in AbstractAttention (#781)
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* updated token location (#797)

* remove einsum from apply_causal_mask in abstract_attention (#782)
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* clarified arguments a bit for hook_points (#799)

* remove einsum in logit_attrs in ActivationCache (#788)
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Remove einsum in compute_head_results in ActivationCache (#789)
* remove einsum in compute_head_results in ActivationCache
* ran format
---------
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Remove einsum usage in refactor_factored_attn_matrices in HookedTransformer (#791)
* remove einsum usage in refactor_factored_attn_matrices in HookedTransformer
* fix format
---------
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Remove einsum usage in _get_w_in_matrix in SVDInterpreter (#792)
* remove einsum usage in _get_w_in_matrix in SVDInterpreter
* fix format
---------
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* Remove einsum usage in forward function of BertMLMHead (#793)
* remove einsum usage in forward function of BertMLMHead
* fix format
---------
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>

* Set default_prepend_bos to false in Bloom configuration
---------
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
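
The commit message describes a three-step precedence for prepend_bos: the value passed by the user, then the value in the model config, then a default of true. A minimal sketch of that fallback follows. The function and parameter names (`resolve_prepend_bos`, `user_value`, `config_default`) are hypothetical and chosen for illustration; they are not taken from the library source, and the actual implementation may differ.

```python
from typing import Optional


def resolve_prepend_bos(
    user_value: Optional[bool],
    config_default: Optional[bool],
) -> bool:
    """Pick the effective prepend_bos flag (illustrative helper, not library code).

    Precedence, as stated in the commit message:
      1. the value the caller passed explicitly,
      2. the value stored in the model config,
      3. True as the final default.
    """
    # An explicit user choice always wins, including an explicit False.
    if user_value is not None:
        return user_value
    # Otherwise fall back to the model config (assumed False for Bloom after this commit).
    if config_default is not None:
        return config_default
    # Nothing specified anywhere: default to prepending BOS.
    return True


# Assumed scenario: the config default is False and the caller passes nothing.
bloom_effective = resolve_prepend_bos(user_value=None, config_default=False)
```

Comparing against `None` rather than relying on truthiness matters here: an explicit `False` from the user must not fall through to the config value.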