Offline loading #1726
Merged
Commits
24 commits
c2a7cab minor (lhoestq)
93c6134 add prepare module test (lhoestq)
6946abd fix windows path scheme check (lhoestq)
bbc1132 cached_path raises requests error if no internet (lhoestq)
23f766b look for cached modules if there's no internet (lhoestq)
afdecdd wip tests (lhoestq)
5ab0108 add warning message (lhoestq)
af473d8 update tests (lhoestq)
fc85400 style (lhoestq)
e394196 remove test modules if already exist (lhoestq)
100cca4 style (lhoestq)
1a7425e add init_dynamic_modules function for testing purposes (lhoestq)
2e3efee fix importlib cache (lhoestq)
247ea0f Merge branch 'master' into offline-loading (lhoestq)
aebb4d3 move csv, json, text and pandas to inside the package (lhoestq)
c56a765 add packaged datasets handling in prepare_module (lhoestq)
76238f6 update tests (lhoestq)
7e69c14 minor fix (lhoestq)
a567e8f add missing __init__.py (lhoestq)
78d2607 fix test (lhoestq)
235380c style (lhoestq)
92fce40 fix test (lhoestq)
75215c6 fix tests (lhoestq)
7080102 show last modification date in the warning (lhoestq)
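Several of the commits above describe one behaviour: when there is no internet connection, look for a previously cached module and warn with its last modification date. The snippet below is only a rough sketch of that idea, not the code from this PR. `load_with_offline_fallback`, `fetch` and `cache_path` are hypothetical names, and the built-in `ConnectionError` stands in for whatever network error the real download helper raises.

```python
import os
import tempfile
import time
import warnings
from typing import Callable


def load_with_offline_fallback(fetch: Callable[[], str], cache_path: str) -> str:
    """Hypothetical helper: try a fresh fetch, fall back to the cached copy when offline."""
    try:
        content = fetch()
    except ConnectionError:
        if not os.path.exists(cache_path):
            raise  # nothing cached, so the network error is the real failure
        modified = time.ctime(os.path.getmtime(cache_path))
        warnings.warn(f"No connection; using cached copy last modified on {modified}")
        with open(cache_path, encoding="utf-8") as f:
            return f.read()
    with open(cache_path, "w", encoding="utf-8") as f:
        f.write(content)  # refresh the cache for later offline runs
    return content


def offline_fetch() -> str:
    # Simulates a download attempt without internet access.
    raise ConnectionError("no internet")


cache_file = os.path.join(tempfile.mkdtemp(), "module.py")
first = load_with_offline_fallback(lambda: "print('hi')", cache_file)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    second = load_with_offline_fallback(offline_fetch, cache_file)
```

The second call returns the cached text and emits a single warning. If nothing has been cached yet, the original error is re-raised rather than hidden.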
@@ -0,0 +1,34 @@
import inspect
import re
from hashlib import sha256
from typing import List

from .csv import csv
from .json import json
from .pandas import pandas
from .text import text


def hash_python_lines(lines: List[str]) -> str:
    filtered_lines = []
    for line in lines:
        line.replace("\n", "")  # remove line breaks, white space and comments
        line.replace(" ", "")
        line.replace("\t", "")
        line = re.sub(r"#.*", "", line)
        if line:
            filtered_lines.append(line)
    full_str = "\n".join(filtered_lines)

    # Make a hash from all this code
    full_bytes = full_str.encode("utf-8")
    return sha256(full_bytes).hexdigest()


# get importable module names and hash for caching
_PACKAGED_DATASETS_MODULES = {
    "csv": (csv.__name__, hash_python_lines(inspect.getsource(csv).splitlines())),
    "json": (json.__name__, hash_python_lines(inspect.getsource(json).splitlines())),
    "pandas": (pandas.__name__, hash_python_lines(inspect.getsource(pandas).splitlines())),
    "text": (text.__name__, hash_python_lines(inspect.getsource(text).splitlines())),
}
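
A note on `hash_python_lines` as it appears in this diff: Python strings are immutable and `str.replace` returns a new string, so the three bare `line.replace(...)` calls discard their results. Only the `re.sub` that strips comments affects the hash. The snippet below is a minimal sketch of a variant that does strip whitespace, assuming that is what the inline comment intends. `hash_python_lines_stripped` is a hypothetical name and is not part of this PR.

```python
import re
from hashlib import sha256
from typing import List


def hash_python_lines_stripped(lines: List[str]) -> str:
    """Hypothetical variant: hash source lines with comments and whitespace removed."""
    filtered_lines = []
    for line in lines:
        line = re.sub(r"#.*", "", line)  # drop comments (also hits '#' inside string literals)
        line = re.sub(r"\s+", "", line)  # reassign, since str/regex helpers return new strings
        if line:
            filtered_lines.append(line)
    full_str = "\n".join(filtered_lines)
    return sha256(full_str.encode("utf-8")).hexdigest()


# The no-op in the original: the result of replace() is thrown away.
s = "a b"
s.replace(" ", "")
unchanged = s

# Two sources that differ only in spacing, blank lines and comments.
a = ["x = 1  # set x", "", "y = 2"]
b = ["x=1", "y=2   # different comment"]
same_hash = hash_python_lines_stripped(a) == hash_python_lines_stripped(b)
```

Switching to a variant like this would change every hash value, so modules cached under the old hashes would no longer match. Removing all whitespace also discards indentation, which means sources that differ only in nesting would hash the same. Whether that trade-off is acceptable depends on how the hash is used for caching.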
Empty file.
File renamed without changes.
Empty file.
File renamed without changes.
Empty file.
File renamed without changes.
Empty file.
File renamed without changes.
@lhoestq small typo here which breaks metrics loading, submitting a fix now
It was "metrics" with an 's'!! Good catch.