Adding distillation loss functions from TinyBERT #1879
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -484,6 +484,8 @@ def forward( | |
input_ids: torch.Tensor, | ||
segment_ids: torch.Tensor, | ||
padding_mask: torch.Tensor, | ||
output_hidden_states: bool = False, | ||
output_attentions: bool = False, | ||
Review comment: Please add docstrings for these new parameters. Reply: OK, I have added the docstrings. |
||
**kwargs, | ||
): | ||
""" | ||
|
@@ -501,13 +503,17 @@ def forward( | |
input_ids, | ||
token_type_ids=segment_ids, | ||
attention_mask=padding_mask, | ||
output_hidden_states=self.model.encoder.config.output_hidden_states or output_hidden_states, | ||
output_attentions=self.model.encoder.config.output_attentions or output_attentions, | ||
return_dict=False | ||
) | ||
if self.model.encoder.config.output_hidden_states == True: | ||
sequence_output, pooled_output, all_hidden_states = output_tuple[0], output_tuple[1], output_tuple[2] | ||
return sequence_output, pooled_output, all_hidden_states | ||
else: | ||
sequence_output, pooled_output = output_tuple[0], output_tuple[1] | ||
return sequence_output, pooled_output | ||
return output_tuple | ||
# if self.model.encoder.config.output_hidden_states == True: | ||
Review comment: Please check the commented code. :) Reply: I have now deleted the commented code. It is unnecessary because the output tuple is now handled by HuggingFace transformers. |
||
# sequence_output, pooled_output, all_hidden_states = output_tuple[0], output_tuple[1], output_tuple[2] | ||
# return sequence_output, pooled_output, all_hidden_states | ||
# else: | ||
# sequence_output, pooled_output = output_tuple[0], output_tuple[1] | ||
# return sequence_output, pooled_output | ||
|
||
def enable_hidden_states_output(self): | ||
self.model.encoder.config.output_hidden_states = True | ||
|
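The `forward` change above returns the raw tuple produced with `return_dict=False`, so its length depends on which flags are active. The following is a minimal sketch of how a caller might name the entries of that tuple. It assumes a BERT-style encoder with a pooling layer and no cached key/values, where the tuple is ordered as sequence output, pooled output, then hidden states (if requested), then attentions (if requested). `resolve_flag` and `unpack_encoder_outputs` are hypothetical helpers for illustration and are not part of this PR or of Haystack.

```python
from typing import Any, Dict, Tuple


def resolve_flag(config_value: bool, call_value: bool) -> bool:
    """Mirror the `config.<flag> or <flag>` expression used in the diff.

    The flag is active if either the encoder config or the call enables it.
    """
    return bool(config_value or call_value)


def unpack_encoder_outputs(
    output_tuple: Tuple[Any, ...],
    output_hidden_states: bool = False,
    output_attentions: bool = False,
) -> Dict[str, Any]:
    """Name the entries of a tuple returned with `return_dict=False`.

    Hypothetical helper, assuming a BERT-style encoder with a pooler.

    :param output_tuple: Tuple returned by the language model's forward pass.
    :param output_hidden_states: Whether the hidden states of all layers were requested.
    :param output_attentions: Whether the attention weights of all layers were requested.
    """
    expected = 2 + int(output_hidden_states) + int(output_attentions)
    if len(output_tuple) != expected:
        raise ValueError(
            f"Expected {expected} entries, got {len(output_tuple)}."
        )

    named = {
        "sequence_output": output_tuple[0],
        "pooled_output": output_tuple[1],
    }
    idx = 2
    if output_hidden_states:
        named["all_hidden_states"] = output_tuple[idx]
        idx += 1
    if output_attentions:
        named["all_attentions"] = output_tuple[idx]
    return named
```

Placeholder strings stand in for tensors here: `unpack_encoder_outputs(("seq", "pool", "att"), output_attentions=True)` would map `"att"` to the `all_attentions` key, because no hidden states precede it in the tuple.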
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
from haystack.modeling.training.base import Trainer, DistillationTrainer | ||
from haystack.modeling.training.base import Trainer, DistillationTrainer, TinyBERTDistillationTrainer |
Review comment: Please add the docstrings for the new parameters here as well, e.g.:
Reply: I have added the docstrings.
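The hidden states and attentions exposed by the `forward` change are the inputs a TinyBERT-style distillation compares between teacher and student. As a rough sketch, the TinyBERT paper pairs student layer `m` with teacher layer `g(m) = m * N / M`, where `M` and `N` are the student and teacher layer counts. Whether `TinyBERTDistillationTrainer` uses exactly this mapping is not visible in the diff shown here, so the function names below are hypothetical and the mapping is an assumption. The index conventions assume that the hidden-states tuple has one extra leading entry for the embedding output, while the attentions tuple has one entry per layer.

```python
from typing import List, Tuple


def hidden_state_pairs(num_student_layers: int, num_teacher_layers: int) -> List[Tuple[int, int]]:
    """Pair indices into the student and teacher hidden-states tuples.

    Index 0 is the embedding output, so M student layers yield M + 1 pairs.
    Assumes the uniform mapping g(m) = m * N / M from the TinyBERT paper.
    """
    if num_student_layers <= 0 or num_teacher_layers % num_student_layers != 0:
        raise ValueError("Teacher layer count must be a positive multiple of the student layer count.")
    step = num_teacher_layers // num_student_layers
    return [(m, m * step) for m in range(num_student_layers + 1)]


def attention_pairs(num_student_layers: int, num_teacher_layers: int) -> List[Tuple[int, int]]:
    """Pair indices into the student and teacher attentions tuples.

    Attentions have no embedding entry, so layer m sits at index m - 1.
    """
    if num_student_layers <= 0 or num_teacher_layers % num_student_layers != 0:
        raise ValueError("Teacher layer count must be a positive multiple of the student layer count.")
    step = num_teacher_layers // num_student_layers
    return [(m - 1, m * step - 1) for m in range(1, num_student_layers + 1)]


print(hidden_state_pairs(4, 12))  # [(0, 0), (1, 3), (2, 6), (3, 9), (4, 12)]
print(attention_pairs(4, 12))     # [(0, 2), (1, 5), (2, 8), (3, 11)]
```

An intermediate-layer loss would then typically sum a mean squared error over each pair, with a learned projection on the student side when the hidden sizes differ.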