Set default_prepend_bos to False in Bloom model configuration (#806)
* set prepend_bos to False by default for the Bloom model family

* add comment

* edit documentation

* fix incorrect expected loss value for the bloom-560m model

* fix the expected loss value for the Bloom model (recomputed with Google Colab)

* set prepend_bos to the user-supplied value, then to the value in the model config, and finally default to True (sketched below)

* fix format

* remove log points in test_hooked_transformer
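
The resolution order described in the prepend_bos commit above can be sketched as a small fallback chain. This is a minimal illustration of the described behavior, not the library's actual code; resolve_prepend_bos is a hypothetical helper:

```python
from typing import Optional

def resolve_prepend_bos(user_value: Optional[bool], config_default: Optional[bool]) -> bool:
    """Resolve prepend_bos: explicit user value > model config > global default True."""
    if user_value is not None:
        return user_value
    if config_default is not None:
        return config_default
    return True

# With Bloom's config now carrying default_prepend_bos=False:
assert resolve_prepend_bos(None, False) is False  # Bloom default applies
assert resolve_prepend_bos(True, False) is True   # explicit user value wins
assert resolve_prepend_bos(None, None) is True    # global fallback
```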

* remove einsum in forward pass in AbstractAttention (#783)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
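
For context on what these einsum-removal commits do: an attention-score einsum over query and key heads can be replaced by a transpose plus a batched matmul, which is friendlier to some backends. A hedged sketch of the pattern, with illustrative shapes rather than the exact AbstractAttention code:

```python
import torch

batch, pos, n_heads, d_head = 2, 5, 4, 8
q = torch.randn(batch, pos, n_heads, d_head)
k = torch.randn(batch, pos, n_heads, d_head)

# einsum formulation
scores_einsum = torch.einsum("bqhd,bkhd->bhqk", q, k)

# explicit version: move heads to dim 1, then use a batched matmul
q_ = q.permute(0, 2, 1, 3)  # [batch, head, q_pos, d_head]
k_ = k.permute(0, 2, 3, 1)  # [batch, head, d_head, k_pos]
scores_matmul = q_ @ k_     # [batch, head, q_pos, k_pos]

assert torch.allclose(scores_einsum, scores_matmul, atol=1e-6)
```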

* Colab compatibility bug fixes (#794)

* call functions on the model object instead of the model string in run_encoder_decoder_set

* remove the generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet

* add a testing function for HookedEncoders and stop testing BERT as a HookedTransformer

* clear cell output to prevent the test from failing

* add a comment about BERT working with the free version of Colab

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>

* remove einsum usage from create_alibi_bias function in AbstractAttention (#781)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
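
The ALiBi bias is an outer product of per-head slopes with a relative-distance matrix, so the einsum reduces to broadcasting. A sketch under assumed shapes, not the exact create_alibi_bias code:

```python
import torch

n_heads, n_ctx = 4, 6
slopes = 2.0 ** -torch.arange(1, n_heads + 1)     # per-head ALiBi slopes
pos = torch.arange(n_ctx)
distance = (pos[None, :] - pos[:, None]).float()  # [n_ctx, n_ctx]

bias_einsum = torch.einsum("h,qk->hqk", slopes, distance)

# broadcasting replaces the einsum with an elementwise product
bias_broadcast = slopes[:, None, None] * distance[None, :, :]

assert torch.allclose(bias_einsum, bias_broadcast)
```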

* updated token location (#797)

* remove einsum from apply_causal_mask in abstract_attention (#782)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
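
Applying a causal mask needs no einsum at all: the usual pattern is a lower-triangular boolean mask plus masked_fill. A sketch of that general pattern, not necessarily the exact apply_causal_mask implementation:

```python
import torch

def apply_causal_mask_sketch(scores: torch.Tensor, big_neg: float = -1e9) -> torch.Tensor:
    """scores: [batch, head, q_pos, k_pos]; block attention to future positions."""
    q_pos, k_pos = scores.shape[-2], scores.shape[-1]
    mask = torch.tril(torch.ones(q_pos, k_pos, dtype=torch.bool))
    return scores.masked_fill(~mask, big_neg)

scores = torch.randn(2, 4, 5, 5)
masked = apply_causal_mask_sketch(scores)  # upper triangle is now ~ -1e9
```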

* clarified arguments a bit for hook_points (#799)

* remove einsum in logit_attrs in ActivationCache (#788)

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
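
Logit attribution is a dot product of a stack of residual components with a logit direction, so the einsum becomes a matmul against a 1-D tensor. Illustrative shapes only:

```python
import torch

component, batch, pos, d_model = 3, 2, 4, 8
residual_stack = torch.randn(component, batch, pos, d_model)
direction = torch.randn(d_model)  # e.g. an unembedding direction

attrs_einsum = torch.einsum("cbpd,d->cbp", residual_stack, direction)
attrs_matmul = residual_stack @ direction  # matmul over the last dim

assert torch.allclose(attrs_einsum, attrs_matmul, atol=1e-6)
```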

* Remove einsum in compute_head_results in ActivationCache (#789)

* remove einsum in compute_head_results in ActivationCache

* ran format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
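
compute_head_results multiplies each head's z by that head's slice of W_O while keeping heads separate, which a broadcasted matmul handles without einsum. A sketch with assumed shapes:

```python
import torch

batch, pos, n_heads, d_head, d_model = 2, 5, 4, 8, 16
z = torch.randn(batch, pos, n_heads, d_head)
W_O = torch.randn(n_heads, d_head, d_model)

head_results_einsum = torch.einsum("bphd,hdm->bphm", z, W_O)

# broadcasted matmul: [batch, pos, head, 1, d_head] @ [head, d_head, d_model],
# then drop the singleton dimension
head_results_matmul = (z.unsqueeze(-2) @ W_O).squeeze(-2)

assert torch.allclose(head_results_einsum, head_results_matmul, atol=1e-5)
```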

* Remove einsum usage in refactor_factored_attn_matrices in HookedTransformer (#791)

* remove einsum usage in refactor_factored_attn_matrices in HookedTransformer

* fix format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
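
Any refactoring of the factored attention matrices must preserve the product W_Q W_K^T, which is all the attention scores depend on; the einsum there is a matmul against a transpose. An illustrative check:

```python
import torch

d_model, d_head = 16, 4
W_Q = torch.randn(d_model, d_head)
W_K = torch.randn(d_model, d_head)

qk_einsum = torch.einsum("md,nd->mn", W_Q, W_K)
qk_matmul = W_Q @ W_K.T  # [d_model, d_model], the effective QK circuit

assert torch.allclose(qk_einsum, qk_matmul, atol=1e-6)
```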

* Remove einsum usage in _get_w_in_matrix in SVDInterpreter (#792)

* remove einsum usage in _get_w_in_matrix in SVDInterpreter

* fix format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
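
One plausible shape of the operation in _get_w_in_matrix is folding a per-dimension LayerNorm gain into W_in, which is row scaling rather than a true contraction; broadcasting covers it. A hedged sketch with assumed shapes, not the exact SVDInterpreter code:

```python
import torch

d_model, d_mlp = 16, 64
ln_gain = torch.randn(d_model)      # LayerNorm scale parameter w
W_in = torch.randn(d_model, d_mlp)  # MLP input weights

folded_einsum = torch.einsum("d,dm->dm", ln_gain, W_in)
folded_broadcast = ln_gain[:, None] * W_in  # scale each row of W_in

assert torch.allclose(folded_einsum, folded_broadcast)
```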

* Remove einsum usage in forward function of BertMLMHead (#793)

* remove einsum usage in the forward function of BertMLMHead

* fix format

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
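
The BertMLMHead forward is essentially a dense layer, so the einsum is an ordinary matmul plus bias. Illustrative shapes:

```python
import torch

batch, pos, d_model = 2, 5, 16
resid = torch.randn(batch, pos, d_model)
W = torch.randn(d_model, d_model)
b = torch.randn(d_model)

out_einsum = torch.einsum("bpd,dm->bpm", resid, W) + b
out_matmul = resid @ W + b  # matmul broadcasts over batch and pos

assert torch.allclose(out_einsum, out_matmul, atol=1e-5)
```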

* set default_prepend_bos to False in the Bloom configuration

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
3 people authored Dec 5, 2024
1 parent f4279b5 commit b5bd89c
Showing 1 changed file with 1 addition and 0 deletions.

transformer_lens/loading_from_pretrained.py:

```diff
@@ -1158,6 +1158,7 @@ def convert_hf_model_config(model_name: str, **kwargs):
             "normalization_type": "LN",
             "post_embedding_ln": True,
             "positional_embedding_type": "alibi",
+            "default_prepend_bos": False,
         }
     elif architecture == "GPT2LMHeadCustomModel":
         # santacoder
```
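
A quick way to observe the effect of the added key, assuming the bloom-560m alias is available and the model fits in memory: with no explicit prepend_bos argument, Bloom now tokenizes without a leading BOS token, while an explicit argument still overrides the config default.

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("bloom-560m")

# no prepend_bos passed, so the model config's default (False) applies
tokens = model.to_tokens("Hello world")

# an explicit argument overrides the config default
tokens_with_bos = model.to_tokens("Hello world", prepend_bos=True)
```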
