-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
TYP: Fix typing in ExtensionDtype registry #41203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 7 commits
7d4531d
41c5355
03d4e66
7f51bf0
1a71e71
bd21368
d7803f0
246ee33
37affba
3174b0f
4d6dbdb
eaefa6c
9b295a2
d9651a4
b72f9dd
00092c0
c4deaa0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,13 +8,16 @@ | |
TYPE_CHECKING, | ||
Any, | ||
TypeVar, | ||
cast, | ||
overload, | ||
) | ||
|
||
import numpy as np | ||
|
||
from pandas._libs.hashtable import object_hash | ||
from pandas._typing import ( | ||
DtypeObj, | ||
NpDtype, | ||
type_t, | ||
) | ||
from pandas.errors import AbstractMethodError | ||
|
@@ -29,7 +32,7 @@ | |
from pandas.core.arrays import ExtensionArray | ||
|
||
# To parameterize on same ExtensionDtype | ||
E = TypeVar("E", bound="ExtensionDtype") | ||
ExtensionDtypeT = TypeVar("ExtensionDtypeT", bound="ExtensionDtype") | ||
|
||
|
||
class ExtensionDtype: | ||
|
@@ -206,7 +209,9 @@ def construct_array_type(cls) -> type_t[ExtensionArray]: | |
raise AbstractMethodError(cls) | ||
|
||
@classmethod | ||
def construct_from_string(cls, string: str): | ||
def construct_from_string( | ||
cls: type_t[ExtensionDtypeT], string: str | ||
) -> ExtensionDtypeT: | ||
r""" | ||
Construct this type from a string. | ||
|
||
|
@@ -368,7 +373,7 @@ def _can_hold_na(self) -> bool: | |
return True | ||
|
||
|
||
def register_extension_dtype(cls: type[E]) -> type[E]: | ||
def register_extension_dtype(cls: type[ExtensionDtypeT]) -> type[ExtensionDtypeT]: | ||
""" | ||
Register an ExtensionType with pandas as class decorator. | ||
|
||
|
@@ -422,28 +427,52 @@ def register(self, dtype: type[ExtensionDtype]) -> None: | |
|
||
self.dtypes.append(dtype) | ||
|
||
def find(self, dtype: type[ExtensionDtype] | str) -> type[ExtensionDtype] | None: | ||
@overload | ||
def find(self, dtype: type[ExtensionDtypeT]) -> ExtensionDtypeT: | ||
... | ||
|
||
@overload | ||
def find(self, dtype: ExtensionDtypeT) -> ExtensionDtypeT: | ||
... | ||
|
||
@overload | ||
def find( | ||
self, dtype: NpDtype | type_t[str | float | int | complex | bool | object] | str | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. type_t and type used. choose one for consistency. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. changed to |
||
) -> ExtensionDtype | None: | ||
... | ||
|
||
def find( | ||
self, | ||
dtype: type[ExtensionDtype] | ||
| ExtensionDtype | ||
| NpDtype | ||
| type_t[str | float | int | complex | bool | object], | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. arent these subsumed by NpDtype? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. does #41945 help simplify There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, in next commit There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Had to undo this |
||
) -> type[ExtensionDtype] | ExtensionDtype | None: | ||
""" | ||
Parameters | ||
---------- | ||
dtype : Type[ExtensionDtype] or str | ||
dtype : ExtensionDtype class or instance or str or numpy dtype or python type | ||
|
||
Returns | ||
------- | ||
return the first matching dtype, otherwise return None | ||
""" | ||
if not isinstance(dtype, str): | ||
dtype_type = dtype | ||
dtype_type: type[ExtensionDtype] | type | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what about NpDtype added to the signature above? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In this PR, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the union of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed in next commit |
||
if not isinstance(dtype, type): | ||
dtype_type = type(dtype) | ||
else: | ||
dtype_type = dtype | ||
if issubclass(dtype_type, ExtensionDtype): | ||
return dtype | ||
# cast needed here as mypy doesn't know we have figured | ||
# out it is an ExtensionDtype | ||
return cast(ExtensionDtype, dtype) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not at this point in the logic There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. using code sample from #41203 (comment)
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, I see. Fixed in a new commit. |
||
|
||
return None | ||
|
||
for dtype_type in self.dtypes: | ||
for dtype_loop in self.dtypes: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. according to the annotations and so why is this changed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reverted back due to change of declaration of |
||
try: | ||
return dtype_type.construct_from_string(dtype) | ||
return dtype_loop.construct_from_string(dtype) | ||
except TypeError: | ||
pass | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out that
type[ExtensionDtypeT]
is not a possible return type, so I removed it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see code sample in comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so I can put it back, but then
type[ExtensionDtypeT]
conflicts withnpt.DTypeLike
, because that can be any type, so have to go back to specifying the specific types.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type[object] can create issues. but unfortunately
object
is a valid dtype, use ignores where needed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal here is to avoid false positives for users passing a valid dtype and also avoid using our
NpDtype
alias. Ideally we want to remove use of that alias completely.Examining this function and ignoring the docstring, any type can be passed and the return would be
None
if not an EA type. i.e. this function never raises a TypeError.maybe widen the types to any, use
object
and if mypy complains useAny
or ignore?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If working with the public api, you may want to consider reviving #40202.
if not create a test file and post the results (and the test file used) in a similar maner to #40200
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Registry.find()
is not in the public API, so do we need to worry about "false positives for users" ? This code is just so pandas developers don't mess things up.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes, The registry attribute was privatized in #40538 so that
pyright --verifytypes
passes. The changes in this PR are not to fix pyright issues?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I don't remember! It's been since January since I started this adventure. I did this PR as part of getting
ExtensionArray
working. I haven't been usingpyright
to verify this.