Add example rijn #803
Conversation
This looks good. As the script is a bit more elaborate, maybe you want to add a few print statements showing parts of the data structure, for example by using print(dataframe.head())?
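For instance, a minimal sketch of what that could look like (evals is the dataframe variable from the snippet quoted below; the exact printout is an assumption, not part of this PR):

    # Hypothetical illustration: peek at the evaluations dataframe
    # returned by openml.evaluations.list_evaluations_setups.
    print(evals.head())     # first few rows of the evaluations
    print(evals.shape)      # number of rows and columns fetched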
for idx, task_id in enumerate(suite.tasks):
    logging.info('Starting with task %d (%d/%d)' % (task_id, idx + 1, len(suite.tasks)))
    evals = openml.evaluations.list_evaluations_setups(
        evaluation_measure, flow=[flow_id], task=[task_id],
        size=limit_per_task, output_format='dataframe')
Don't you also have to limit by the uploader or study?
The filtering for study is implicit (as we enumerate over tasks from the suite).
As OpenML is a collaborative platform, we should in theory not care about who uploaded the results. If we did our job right, the results will be consistent.
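For completeness, if someone did want to restrict the listing to a specific uploader, a hedged sketch (the uploader parameter name and uploader_id are assumptions mirroring the singular flow/task style above, not part of this PR):

    # Hypothetical: additionally filter the evaluations by uploader id.
    # uploader_id is a placeholder for an actual OpenML user id.
    evals = openml.evaluations.list_evaluations_setups(
        evaluation_measure, flow=[flow_id], task=[task_id],
        uploader=[uploader_id], size=limit_per_task, output_format='dataframe')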
True, do you think it would make sense to specifically note this in the example?
Codecov Report
@@            Coverage Diff             @@
##           develop     #803     +/-   ##
===========================================
+ Coverage    87.71%   88.03%    +0.31%
===========================================
  Files           36       36
  Lines         4234     4363      +129
===========================================
+ Hits          3714     3841      +127
- Misses         520      522        +2
Continue to review full report at Codecov.
Everything passes, except for a random failure on AppVeyor.
Awesome, there are only two minor things left. If you want, I can also change them myself.