Implement function for BERT quantization tutorial, resolves issue #1971 #2403
✅ Deploy Preview for pytorch-tutorials-preview ready!
I continue to wonder about the `name=None` parameter, but it does exist in the version of this function at the following URL, so I guess we can keep it.
Quantizing BERT Model
https://pytorch.org/tutorials/prototype/graph_mode_dynamic_bert_tutorial.html?highlight=transformer#setup
    for _ in range(total_dims):
        values.append(rng.randint(0, vocab_size - 1))

    return torch.tensor(data=values, dtype=torch.long, device='cpu').view(shape).contiguous()
This would create an `int64` tensor, not an `int32` one, wouldn't it?
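The dtype question can be checked directly: in PyTorch, `torch.long` is an alias for `torch.int64`, while `torch.int` is an alias for `torch.int32`. A minimal sketch, using made-up values that are not taken from the PR:

```python
import torch

# Illustrative values only; the PR draws them from a random generator.
values = [3, 1, 4, 1, 5, 9]

as_long = torch.tensor(values, dtype=torch.long)  # what the snippet under review does
as_int = torch.tensor(values, dtype=torch.int)    # what the code comment describes

print(as_long.dtype)  # torch.int64
print(as_int.dtype)   # torch.int32
```

So either the comment in the code or the `dtype` argument would need to change for the two to agree.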
    # Creates a random int32 tensor of the shape within the vocab size
    if rng is None:
        rng = global_rng

    total_dims = 1
    for dim in shape:
        total_dims *= dim

    values = []
    for _ in range(total_dims):
        values.append(rng.randint(0, vocab_size - 1))

    return torch.tensor(data=values, dtype=torch.long, device='cpu').view(shape).contiguous()
Also, isn't it all just equivalent to the following?
    return torch.randint(0, vocab_size, size=shape, dtype=torch.int, device='cpu')
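To make the comparison concrete, here is a hedged sketch that puts the two approaches side by side. The function names `ids_tensor_loop` and `ids_tensor_randint` are hypothetical, chosen only for this illustration, and the loop version assumes `rng` is a `random.Random` instance. Two details matter: `torch.randint` takes the output shape through its `size` parameter, and its upper bound is exclusive, whereas `random.Random.randint` includes both endpoints.

```python
import random
import torch


def ids_tensor_loop(shape, vocab_size, rng=None):
    """Loop-based version mirroring the snippet under review (hypothetical name)."""
    if rng is None:
        rng = random.Random()
    total_dims = 1
    for dim in shape:
        total_dims *= dim
    # random.Random.randint is inclusive on both ends: [0, vocab_size - 1].
    values = [rng.randint(0, vocab_size - 1) for _ in range(total_dims)]
    return torch.tensor(values, dtype=torch.long, device='cpu').view(shape).contiguous()


def ids_tensor_randint(shape, vocab_size):
    """Vectorized version (hypothetical name); `high` is exclusive: [0, vocab_size)."""
    return torch.randint(0, vocab_size, size=shape, dtype=torch.int, device='cpu')


shape, vocab_size = (2, 3), 10
a = ids_tensor_loop(shape, vocab_size, rng=random.Random(0))
b = ids_tensor_randint(shape, vocab_size)
print(a.shape, a.dtype)
print(b.shape, b.dtype)
```

Both cover the same value range, but they are not drop-in identical: the loop version returns `int64` and draws from a Python `random.Random` generator, while the `torch.randint` version returns `int32` here and uses PyTorch's own generator, so the same seed would not reproduce the same values across the two.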
Fixes #1971
Fix missing implementation of `ids_tensor`.