Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] fix mypy errors about custom eval and metric functions #5790

Merged
merged 4 commits into from
Mar 30, 2023

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Mar 17, 2023

Contributes to #3867.

Fixes the following mypy errors:

engine.py:258: error: Argument 1 to "eval_train" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]"  [arg-type]
engine.py:259: error: Argument 1 to "eval_valid" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]"  [arg-type]

sklearn.py:130: error: Too few arguments  [call-arg]
sklearn.py:130: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:132: error: Too many arguments  [call-arg]
sklearn.py:132: error: Too few arguments  [call-arg]
sklearn.py:132: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:132: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:134: error: Too many arguments  [call-arg]
sklearn.py:134: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:134: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:134: error: Argument 4 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:208: error: Too few arguments  [call-arg]
sklearn.py:208: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:210: error: Too many arguments  [call-arg]
sklearn.py:210: error: Too few arguments  [call-arg]
sklearn.py:210: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:210: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:212: error: Too many arguments  [call-arg]
sklearn.py:212: error: Argument 1 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:212: error: Argument 3 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:212: error: Argument 4 has incompatible type "Optional[ndarray[Any, Any]]"; expected "ndarray[Any, Any]"  [arg-type]
sklearn.py:814: error: Argument "feval" to "train" has incompatible type "List[_EvalFunctionWrapper]"; expected "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"  [arg-type]
sklearn.py:814: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
sklearn.py:814: note: Consider using "Sequence" instead, which is covariant

Callable[
[np.ndarray, np.ndarray],
[Optional[np.ndarray], np.ndarray],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why add all these Optionals?

mypy is struggling with the facts that Dataset.get_field() can return None.

This PR proposes updating the type hints for custom metric and objective functions to match that behavior.

I intentionally chose not to update the user-facing docs about custom metric and objective functions to reflect that the label, group, and weights passed to these functions can technically be None... in almost all situations, they should be non-None. I don't think complicating the docs is worth it.

Comment on lines +64 to +71
Callable[
[Optional[np.ndarray], np.ndarray, Optional[np.ndarray], Optional[np.ndarray]],
_LGBM_EvalFunctionResultType
],
Callable[
[Optional[np.ndarray], np.ndarray, Optional[np.ndarray], Optional[np.ndarray]],
List[_LGBM_EvalFunctionResultType]
]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting cases like

Callable[
    [np.ndarray, np.ndarray],
    Union[_LGBM_EvalFunctionResultType, List[_LGBM_EvalFunctionResultType]]
]

into

Union[
    Callable[
        [np.ndarray, np.ndarray],
        _LGBM_EvalFunctionResultType
    ],
    Callable[
        [np.ndarray, np.ndarray],
        List[_LGBM_EvalFunctionResultType]
    ]
]

helps mypy with errors like this:

Argument 1 to "eval_valid" of "Booster" has incompatible type "Union[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]], List[Callable[[ndarray[Any, Any], Dataset], Union[Tuple[str, float, bool], List[Tuple[str, float, bool]]]]], None]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]"  [arg-type]

And I think it's slightly more correct... I'd expect people to provide a custom metric function that returns a single tuple on every iteration or one that returns a list of tuples on each iteration, but not one that could return either of those.

elif argc == 4:
grad, hess = self.func(labels, preds, dataset.get_weight(), dataset.get_group())
grad, hess = self.func(labels, preds, dataset.get_weight(), dataset.get_group()) # type: ignore [call-arg]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy isn't able to break apart the individual Union items base on how many arguments inspect.signature() returns. This PR proposes just ignoring errors like this:

sklearn.py:132: error: Too many arguments  [call-arg]
sklearn.py:132: error: Too few arguments  [call-arg]

@@ -811,7 +829,7 @@ def _get_meta_data(collection, name, i):
num_boost_round=self.n_estimators,
valid_sets=valid_sets,
valid_names=eval_names,
feval=eval_metrics_callable,
feval=eval_metrics_callable, # type: ignore[arg-type]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This # type: ignore fixes this error:

Argument "feval" to "train" has incompatible type "List[_EvalFunctionWrapper]"; expected "Union[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]], List[Union[Callable[[ndarray[Any, Any], Dataset], Tuple[str, float, bool]], Callable[[ndarray[Any, Any], Dataset], List[Tuple[str, float, bool]]]]], None]"  [arg-type]
sklearn.py:814: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
sklearn.py:814: note: Consider using "Sequence" instead, which is covariant

I'm not sure why, but mypy isn't able to figure out that _EvalFunctionWrapper is actually a Callable[[ndarray, Dataset], Tuple[str, float, bool]].

I think that's maybe related to some issues it has with comparing lists to union types including lists? e.g. python/mypy#6463

This PR proposes just ignoring that error for now... the use of custom eval metric functions is well-covered by the unit tests in test_sklearn.py and test_dask.py.

@jameslamb jameslamb changed the title [python-package] ignore mypy errors about custom eval and metric functions [python-package] fix mypy errors about custom eval and metric functions Mar 19, 2023
@jameslamb
Copy link
Collaborator Author

I was looking into some other mypy errors tonight and they required touching the same type annotations as this PR I'd opened a few days ago. So I just pushed those changes here, in d0d8d9e.

This is ready for review.

@jameslamb jameslamb merged commit a528598 into master Mar 30, 2023
@jameslamb jameslamb deleted the ci/mypy-sklearn-custom-functions branch March 30, 2023 02:47
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at /~https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants