Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Cases where the length of the IterableDataset is not overriden properly #4035

Closed
MaksymDel opened this issue Apr 7, 2020 · 4 comments · Fixed by #4328
Closed

Cases where the length of the IterableDataset is not overriden properly #4035

MaksymDel opened this issue Apr 7, 2020 · 4 comments · Fixed by #4328
Milestone

Comments

@MaksymDel
Copy link
Contributor

MaksymDel commented Apr 7, 2020

_LazyInstances (allennlp's subclass of IterableDataset) returns 1 when the __len__ method is not overridden.
This, for example, causes a problem with SlantedTriangular which expects the length of the dataset in batches as constructor parameter if used programmatically.

See #4028 (comment) and #4028 (comment) for more context.

@dirkgr
Copy link
Member

dirkgr commented Apr 7, 2020

It is baffling behavior to return 1 for the length. We should return None for the length when we don't know, and deal with the fallout wherever it happens.

@dirkgr dirkgr added this to the 1.1 milestone Apr 7, 2020
@JohnGiorgi
Copy link
Contributor

JohnGiorgi commented Apr 17, 2020

@maksym-del If I did want to use SlantedTriangular while lazy=True in my DatasetReader, would I just manually set dataloader.batches_per_epoch in my config?

(not just asking to be annoying, I actually do want to use SlantedTriangular with a lazy reader).

@MaksymDel
Copy link
Contributor Author

@JohnGiorgi yes, specifying batches per epoch in loader and passsing corresponding number to the scheduler should work.

@JohnGiorgi
Copy link
Contributor

@maksym-del Awesome, thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants