-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(python): Fix row_by_key
typing
#19888
Conversation
row_by_key
typingrow_by_key
typing
I cant find why mypy complains... |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #19888 +/- ##
===========================================
+ Coverage 59.38% 79.50% +20.11%
===========================================
Files 1554 1556 +2
Lines 215612 216407 +795
Branches 2452 2456 +4
===========================================
+ Hits 128035 172046 +44011
+ Misses 87019 43803 -43216
Partials 558 558 ☔ View full report in Codecov by Sentry. |
Thanks for fixing this, I am having the same problem in my code. This function should have 4 overloads as it branches on both
Your current implementation only considers one combination out of four. The full signature should be this: @overload
def rows_by_key(
self,
key: ColumnNameOrSelector | Sequence[ColumnNameOrSelector],
*,
named: Literal[False] = False,
include_key: bool = False,
unique: Literal[False] = False,
) -> dict[Any, list[Any]]: ...
@overload
def rows_by_key(
self,
key: ColumnNameOrSelector | Sequence[ColumnNameOrSelector],
*,
named: Literal[False] = False,
include_key: bool = False,
unique: Literal[True],
) -> dict[Any, Any]: ...
@overload
def rows_by_key(
self,
key: ColumnNameOrSelector | Sequence[ColumnNameOrSelector],
*,
named: Literal[True],
include_key: bool = False,
unique: Literal[False] = False,
) -> dict[Any, list[dict[str, Any]]]: ...
@overload
def rows_by_key(
self,
key: ColumnNameOrSelector | Sequence[ColumnNameOrSelector],
*,
named: Literal[True],
include_key: bool = False,
unique: Literal[True],
) -> dict[Any, dict[str, Any]]: ...
def rows_by_key(
self,
key: ColumnNameOrSelector | Sequence[ColumnNameOrSelector],
*,
named: bool = False,
include_key: bool = False,
unique: bool = False,
) -> dict[Any, Any]: |
Thanks hamdanal, I modified the signature based on your implementation and fixed the mypy error. |
py-polars/polars/dataframe/frame.py
Outdated
named: Literal[False] = ..., | ||
include_key: bool = ..., | ||
unique: Literal[False] = ..., | ||
) -> dict[Any, Iterable[tuple[Any, ...]]]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct, the values of the dictionary are not iterables of tuple, they should be a "list of Any":
import polars as pl
df = pl.DataFrame({
"w": ["a", "b"],
"x": ["q", "q"]
})
print(repr(df.rows_by_key(key=["w"])))
# defaultdict(<class 'list'>, {'a': ['q'], 'b': ['q']})
Actually all return types here are not correct. See my previous comment on this PR for the correct types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arff, the type depends on the number of columns, see:
import polars as pl
df = pl.DataFrame({
"w": ["a", "b"],
"x": ["q", "q"],
"z": ["v", "v"]
})
print(repr(df.rows_by_key(key=["w"], named=False, unique=True)))
# {'a': ('q', 'v'), 'b': ('q', 'v')}
But in any case, you are right, this is not correct.
Looks like I have been beaten to the punch - thanks @hamdanal ;) |
fixes: #19887