Fix bug with lazy data loading, un-implement __len__ on AllennlpLazyDataset #4328
Conversation
```diff
@@ -490,7 +490,7 @@ def run(self) -> Dict[str, Any]:
         return self.trainer.train()

     def finish(self, metrics: Dict[str, Any]):
-        if self.evaluation_data_loader and self.evaluate_on_test:
+        if self.evaluation_data_loader is not None and self.evaluate_on_test:
```
Need this so `len()` is not called.
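For context, here is why the bare `if` would call `len()`: Python's truthiness test falls back to `__len__` when `__bool__` is not defined, so `if loader:` implicitly calls `len(loader)`, which fails once the lazy dataset no longer implements `__len__()`. A minimal sketch (a hypothetical stand-in, not AllenNLP code):

```python
class LazyLoader:
    """Hypothetical stand-in for a data loader over a lazy (iterable) dataset."""

    def __iter__(self):
        yield from range(3)

    def __len__(self):
        # Mirrors torch.utils.data.IterableDataset: there is no meaningful length.
        raise TypeError("lazy datasets have no length")


loader = LazyLoader()

# `if loader:` calls bool(loader), which falls back to __len__ here -> TypeError.
# Comparing against None avoids the implicit len() call entirely:
if loader is not None:
    print("ok, no len() needed")
```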
```diff
@@ -361,7 +361,7 @@ def __init__(
         self.optimizer = optimizer

         if patience is None:  # no early stopping
-            if validation_data_loader:
+            if validation_data_loader is not None:
```
Need this so `len()` is not called.
By the way I tracked the …

I think this should close #4035.
```diff
@@ -138,7 +138,9 @@ def get_values(self):
                 self.batch_num_total_epoch_end[-1] / (len(self.batch_num_total_epoch_end) - 1)
             )
         else:
-            actual_num_steps_per_epoch = max(self.num_steps_per_epoch, self.last_batch_num_total)
+            actual_num_steps_per_epoch = max(
+                self.num_steps_per_epoch or 1, self.last_batch_num_total
+            )
```
Why does it default to `1` here? Isn't that a problem when it happens?
This gives the same behavior as before. I'm not sure if that's an issue.
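A sketch of the failure mode the `or 1` guards against (the values below are hypothetical): with a lazy loader, `num_steps_per_epoch` can be `None` because the data loader has no `__len__`, and `max(None, n)` raises a `TypeError` in Python 3:

```python
num_steps_per_epoch = None  # unknown when the data loader has no __len__
last_batch_num_total = 37   # hypothetical batch counter value

# max(None, 37) would raise:
#   TypeError: '>' not supported between instances of 'int' and 'NoneType'
# so fall back to 1 when the length is unknown:
actual_num_steps_per_epoch = max(num_steps_per_epoch or 1, last_batch_num_total)
print(actual_num_steps_per_epoch)  # 37
```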
This fixes a bug that occurs when using a lazy dataset reader where you'd get a PyTorch warning on every single batch during training. It does so by making `AllennlpLazyDataset` consistent with `torch.utils.data.IterableDataset` in that it no longer implements `__len__()` (previously `len()` would just return 1).

To reproduce this bug, just run `allennlp train` on an experiment with a lazy dataset reader like this:
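The config that followed isn't shown here. Purely as an illustration, a minimal jsonnet experiment of the kind described might look like the sketch below; the reader, model, paths, and hyperparameters are all placeholder choices, not the ones from the PR. The relevant line is `"lazy": true` on the dataset reader:

```jsonnet
{
    "dataset_reader": {
        "type": "sequence_tagging",
        // `lazy: true` makes read() return an AllennlpLazyDataset
        "lazy": true
    },
    "train_data_path": "path/to/train.tsv",
    "model": {
        "type": "simple_tagger",
        "text_field_embedder": {
            "token_embedders": {
                "tokens": {"type": "embedding", "embedding_dim": 10}
            }
        },
        "encoder": {"type": "lstm", "input_size": 10, "hidden_size": 10}
    },
    "data_loader": {"batch_size": 8},
    "trainer": {"num_epochs": 1, "optimizer": {"type": "adam"}}
}
```

With a config like this, before the fix, every batch during training triggered the PyTorch length warning, since `len()` on the lazy dataset always returned 1.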