AttributeError: 'NoneType' object has no attribute 'seed_model' #724

hp2500 · 2019-07-02T19:05:04Z

Hi there,

I have been trying to run experiments with a fairly new sklearn-extra classifier (https://github.com/Alex7Li/scikit-learn-extra/tree/master/sklearn_extra). The classifier runs fine on a local dataset. However, when I try to run it on an OpenML task, I get an error.

Here is a minimal example:

import openml
# define classifier
from sklearn_extra.fast_kernel import FKC_EigenPro
clf = FKC_EigenPro()
# get task
task = openml.tasks.get_task(3)
# run model on task
run = openml.runs.run_model_on_task(clf, task)
# publish run on openml
run.publish()

AttributeError Traceback (most recent call last)
in
4 task = openml.tasks.get_task(3)
5 # run model on task
----> 6 run = openml.runs.run_model_on_task(clf, task)
7 # publish run on openml
8 run.publish()

/miniconda3/lib/python3.7/site-packages/openml/runs/functions.py in run_model_on_task(model, task, avoid_duplicate_runs, flow_tags, seed, add_local_measures, upload_flow, return_flow)
104 seed=seed,
105 add_local_measures=add_local_measures,
--> 106 upload_flow=upload_flow,
107 )
108 if return_flow:

/miniconda3/lib/python3.7/site-packages/openml/runs/functions.py in run_flow_on_task(flow, task, avoid_duplicate_runs, flow_tags, seed, add_local_measures, upload_flow)
172 task, flow = flow, task
173
--> 174 flow.model = flow.extension.seed_model(flow.model, seed=seed)
175
176 # We only need to sync with the server right now if we want to upload the flow,

AttributeError: 'NoneType' object has no attribute 'seed_model'

Is there anything I can do to prevent this from happening?
@amueller

amueller · 2019-07-02T21:10:03Z

I haven't looked at this since the extensions got refactored.

It looks like you need to tell openml that the sklearn extension can be used to handle this estimator.
I'm a bit surprised that that's not the fallback / default.

hp2500 · 2019-07-05T15:47:09Z

I think I found a workaround. Using the estimator in a pipeline seems to solve the issue.
@amueller

mfeurer · 2019-07-08T13:19:44Z

This appears to have the same root cause as #718 and #720 and will be fixed via #722. Please reopen if this is not the case.

hp2500 · 2019-07-11T18:00:39Z

Hi there, unfortunately the issue still persists.

nok · 2019-07-11T21:26:39Z

Hello @hp2500 , why do you swap the objects in line 172?

task, flow = flow, task

To be clear I just read your provided source code and it confused me. Can you check the data type with type(flow) after line 172 please?

amueller · 2019-07-22T18:13:51Z

Easier way to reproduce is this:

# define classifier
import openml
from sklearn.linear_model import LogisticRegression

# there needs to be a version specified but this works lol.
__version__ = 0.1

class MyLR(LogisticRegression):
    pass

clf = MyLR()
# get task
task = openml.tasks.get_task(3)
# run model on task
run = openml.runs.run_model_on_task(clf, task)
# publish run on openml
run.publish()

RuntimeError: No extension could be found for flow None: __main__.MyLR

amueller · 2019-07-22T18:18:08Z

It's pretty obvious that this was broken in #647.
The sklearn extension should handle this, but it can only handle flows for which _is_sklearn_flow returns True, and that check is ',sklearn==' in flow.external_version
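
A minimal sketch of the check described above, written as a standalone function rather than the actual openml-python method. The two external_version strings are hypothetical examples made up for illustration; the real strings depend on the installed package versions.

```python
def is_sklearn_flow(external_version: str) -> bool:
    """Mimic the quoted check: the substring ',sklearn==' must be present."""
    return ",sklearn==" in external_version


# Hypothetical external_version strings (assumed format, not taken from a real run).
pipeline_version = "openml==0.9.0,sklearn==0.21.2"
subclass_version = "__main__==0.1,openml==0.9.0"

# A flow whose version string mentions sklearn passes the check.
print(is_sklearn_flow(pipeline_version))  # True

# A flow built from a class defined outside sklearn does not.
print(is_sklearn_flow(subclass_version))  # False
```

Under this reading, a flow created from an estimator that lives outside the sklearn package is never matched to the sklearn extension, even if the estimator subclasses an sklearn class.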

amueller · 2019-07-22T18:37:32Z

Ok so get_extension_by_model returns sklearn because isinstance(MyLR(), BaseEstimator) - which is also not the correct test btw but whatever.

The problem is that the flow created from that model is not recognized as an sklearn extension flow. That lookup is done by get_extension_by_flow, and the sklearn extension module doesn't include sklearn in the flow's external_version; it only includes the sklearn version in the tags:

openml-python/openml/extensions/sklearn/extension.py

Line 420 in 56fcc00

flow = OpenMLFlow(name=name,

There are two obvious fixes:
a) When creating the flow, allow setting the extension directly, because we know what the extension is supposed to be.

b) include the sklearn version in the external version

I feel we should be doing both possibly?
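
To make the mismatch and the two proposed fixes concrete, here is a toy model in plain Python. Everything in it is a stand-in: Flow, extension_by_model, and extension_by_flow are hypothetical simplifications of OpenMLFlow, get_extension_by_model, and get_extension_by_flow, and the version strings are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


class BaseEstimator:
    """Stand-in for sklearn.base.BaseEstimator (not the real class)."""


class MyLR(BaseEstimator):
    """Stand-in for the subclass used in the reproduction."""


@dataclass
class Flow:
    """Toy flow; the real OpenMLFlow has many more fields."""
    name: str
    external_version: str
    extension: Optional[str] = None  # used only by fix (a)


def extension_by_model(model: object) -> Optional[str]:
    # Lookup by model: an isinstance test, as described in the thread.
    return "sklearn" if isinstance(model, BaseEstimator) else None


def extension_by_flow(flow: Flow) -> Optional[str]:
    # Fix (a): an extension set at creation time takes precedence.
    if flow.extension is not None:
        return flow.extension
    # Lookup by flow: the external_version substring test.
    return "sklearn" if ",sklearn==" in flow.external_version else None


model = MyLR()

# Behaviour reported in the thread: the two lookups disagree.
broken = Flow("MyLR", "__main__==0.1,openml==0.9.0")

# Fix (a): set the extension directly when the flow is created.
fix_a = Flow("MyLR", "__main__==0.1,openml==0.9.0",
             extension=extension_by_model(model))

# Fix (b): include the sklearn version in external_version.
fix_b = Flow("MyLR", "__main__==0.1,openml==0.9.0,sklearn==0.21.2")

print(extension_by_model(model))   # sklearn
print(extension_by_flow(broken))   # None
print(extension_by_flow(fix_a))    # sklearn
print(extension_by_flow(fix_b))    # sklearn
```

In this sketch either fix alone makes the flow lookup agree with the model lookup. Fix (a) avoids the string test entirely, while fix (b) also helps flows that are later reconstructed from their stored external_version.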

amueller · 2019-07-22T18:50:18Z

damn meant to comment on #734

mfeurer closed this as completed Jul 8, 2019

mfeurer mentioned this issue Jul 9, 2019

Issues instantiating/running models from flows #718

Closed

hp2500 mentioned this issue Jul 11, 2019

Still not able to use non-sklearn estimators without wrapping them in a pipeline #734

Closed

amueller reopened this Jul 22, 2019

amueller closed this as completed Jul 22, 2019
