Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Minor fixes so PretrainedTransformerIndexer works with roberta #3203

Merged
merged 1 commit into from
Aug 27, 2019

Conversation

matt-gardner
Copy link
Contributor

You currently need to manually install pytorch-transformers from master, because there is no released version that includes roberta in the AutoTokenizer and such. But this makes the indexer compatible with roberta.

@matt-gardner matt-gardner merged commit 993034f into allenai:master Aug 27, 2019
@matt-gardner matt-gardner deleted the roberta branch August 27, 2019 00:21
@@ -51,6 +51,8 @@ def __init__(self,
self.tokenizer = AutoTokenizer.from_pretrained(model_name, do_lower_case=do_lowercase)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm trying this PR out, and in the case where model name isn't loadable from AutoTokenizer.from_pretrained, it returns None. Maybe consider catching that and throwing a more informative error message here? (instead, it errors when you try to use self.tokenizer to get the id of the token.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, PR welcome.

reiyw pushed a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants