
Add example rijn #803

Merged: 13 commits into develop from add_example_rijn, Oct 11, 2019
Conversation

janvanrijn (Member) opened this pull request:

Reference Issue

What does this PR implement/fix? Explain your changes.

How should this PR be tested?

Any other comments?

mfeurer (Collaborator) left a comment:


This looks good. As the script is a bit more elaborate, maybe you want to add a few print statements showing parts of the data structure, for example by using print(dataframe.head())?
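A minimal sketch of the reviewer's suggestion, using a hypothetical stand-in DataFrame (pandas assumed; the column names below are made up and not the ones from the example script):

```python
import pandas as pd

# Hypothetical stand-in for the evaluations dataframe built in the example.
dataframe = pd.DataFrame({
    'task_id': [3, 6, 11],
    'predictive_accuracy': [0.95, 0.88, 0.99],
})

# head() returns the first (up to) five rows, giving readers a quick
# look at the structure without dumping the whole frame.
print(dataframe.head())
```

In the example script, a call like this after each data-loading step would show readers the shape of the intermediate data structures.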

(three review comments on examples/40_paper/2018_kdd_rijn_example.py, resolved)
for idx, task_id in enumerate(suite.tasks):
    logging.info('Starting with task %d (%d/%d)' % (task_id, idx + 1, len(suite.tasks)))
    evals = openml.evaluations.list_evaluations_setups(
        evaluation_measure, flow=[flow_id], task=[task_id],
        size=limit_per_task, output_format='dataframe')
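The snippet above queries the OpenML server per task. As an offline sketch of the per-task filtering and size limit it performs (with a hypothetical local DataFrame standing in for the server-side listings; all ids and values below are made up):

```python
import pandas as pd

# Hypothetical stand-in for the server-side evaluation listings.
all_evals = pd.DataFrame({
    'task_id': [3, 3, 6, 6, 11],
    'flow_id': [100, 100, 100, 200, 100],
    'value':   [0.95, 0.93, 0.88, 0.90, 0.99],
})

flow_id = 100       # assumed flow of interest (mirrors flow=[flow_id])
limit_per_task = 1  # mirrors size=limit_per_task in the real call
tasks = [3, 6, 11]  # hypothetical suite.tasks

frames = []
for task_id in tasks:
    # Filter to the current task and flow, then cap the number of rows,
    # analogous to what the server does for the real listing call.
    subset = all_evals[(all_evals['task_id'] == task_id)
                       & (all_evals['flow_id'] == flow_id)]
    frames.append(subset.head(limit_per_task))

evals = pd.concat(frames)
print(evals['value'].tolist())  # → [0.95, 0.88, 0.99]
```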
mfeurer (Collaborator):

Don't you also have to limit by the uploader or study?

janvanrijn (Member, Author):

The filtering for study is implicit, as we enumerate over the tasks from the suite.

As OpenML is a collaborative platform, we should in theory not care about who uploaded the results. If we did our job right, the results will be consistent.

mfeurer (Collaborator):

True, do you think it would make sense to specifically note this in the example?

(four more review comments on examples/40_paper/2018_kdd_rijn_example.py, resolved)
codecov-io commented on Oct 8, 2019:

Codecov Report

Merging #803 into develop will increase coverage by 0.31%.
The diff coverage is n/a.


@@             Coverage Diff             @@
##           develop     #803      +/-   ##
===========================================
+ Coverage    87.71%   88.03%   +0.31%     
===========================================
  Files           36       36              
  Lines         4234     4363     +129     
===========================================
+ Hits          3714     3841     +127     
- Misses         520      522       +2
Impacted Files Coverage Δ
openml/datasets/functions.py 95.61% <0%> (+0.1%) ⬆️
openml/flows/functions.py 90.78% <0%> (+3.36%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 382959f...807d33c. Read the comment docs.

(three more review comments on examples/40_paper/2018_kdd_rijn_example.py, resolved)

janvanrijn (Member, Author):
Everything passes except for a random failure on AppVeyor:

FAILED tests/test_tasks/test_split.py::OpenMLSplitTest::test_eq

mfeurer (Collaborator) left a comment:

Awesome, there are only two minor things left. If you want, I can also change them myself.

(two review comments on ci_scripts/install.sh, resolved)
@mfeurer mfeurer merged commit 3e23a3b into develop Oct 11, 2019
@mfeurer mfeurer deleted the add_example_rijn branch October 11, 2019 15:11
Labels: none yet
Projects: none yet
4 participants