Downloaded datasets are not usable offline #761

Closed
ghazi-f opened this issue Oct 26, 2020 · 2 comments · Fixed by #1726

Comments

@ghazi-f
Contributor

ghazi-f commented Oct 26, 2020

I've been trying to use the IMDB dataset offline, but after downloading it and turning off the internet connection, loading it still raises an error from the requests library, which tries to reach the online dataset.
Is this the intended behavior?
(Sorry, I wrote the first version of this issue while still on nlp 0.3.0.)

@ghazi-f ghazi-f changed the title Cash hashing for the downloaded datasets is incompatible with offline mode Cache hashing for the downloaded datasets is incompatible with offline mode Oct 26, 2020
@ghazi-f ghazi-f changed the title Cache hashing for the downloaded datasets is incompatible with offline mode Downloaded datasets are not usable offline Oct 26, 2020
@lhoestq
Member

lhoestq commented Oct 27, 2020

Yes, currently you need an internet connection, because the lib checks the etag of the dataset script online to see whether you already have that version locally.

If we add a way to store the etag/hash locally after the first download, users could download the dataset once with an internet connection and then keep using it without one.

I'll let you know when we add this feature.
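
To illustrate the idea of storing the etag locally, here is a minimal sketch. It is not the library's actual implementation: `resolve_etag`, `_meta_path`, the JSON sidecar layout, and the `fetch_etag` callable are all hypothetical names chosen for this example. The sketch records the etag on a successful online check and falls back to the recorded value when the network is unreachable.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional


def _meta_path(cache_dir: Path, url: str) -> Path:
    # Hypothetical layout: one JSON sidecar file per URL, keyed by a hash of the URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return cache_dir / f"{key}.json"


def resolve_etag(
    url: str,
    cache_dir: Path,
    fetch_etag: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the etag for `url`, using the locally stored one when offline.

    `fetch_etag` is an assumed callable that performs the online check
    (for example an HTTP HEAD request) and raises an OSError subclass such as
    ConnectionError when the network is unavailable.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    meta = _meta_path(cache_dir, url)
    try:
        etag = fetch_etag(url)
    except OSError:
        # Offline: reuse the etag stored during an earlier online run, if any.
        if meta.exists():
            return json.loads(meta.read_text())["etag"]
        # Nothing was ever downloaded, so there is nothing to fall back to.
        raise
    # Online: remember the etag so a later offline run can find it.
    meta.write_text(json.dumps({"url": url, "etag": etag}))
    return etag
```

Catching `OSError` is a deliberate choice in this sketch: the built-in `ConnectionError` derives from it, and so do the exceptions raised by the requests library, so one `except` clause covers both.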

@albertvillanova
Member

Already fixed by:
