-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
PERF: DataFrame.where for EA dtype mask #51574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/internals/managers.py
Outdated
@@ -390,6 +398,14 @@ def putmask(self, mask, new, align: bool = True): | |||
align_keys = ["mask"] | |||
new = extract_array(new, extract_numpy=True) | |||
|
|||
if isinstance(mask, ABCDataFrame) and mask._mgr.any_extension_types: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a viable option to avoid getting here with DataFrames?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect something could be done in NDFrame._where
to avoid most cases. And then maybe .clip
could just call _where
?
If you're already thinking about changes to ._where
in #51547 I can close this if it makes sense to combine the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved this to NDFrame._where
and updated the timings in the OP.
I think we can do them separately, but I agree that this should be handled before getting to the manager level |
return self._update_inplace(result) | ||
else: | ||
return result.__finalize__(self) | ||
if lower is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basically reverts the initial clip pr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This basically reverts the initial clip pr?
Partially, but it still retains the new inplace behavior and the perf improvements for EAs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx, forgot about the inplace fix
Can you fix conflicts? |
thx @lukemanley |
doc/source/whatsnew/v2.1.0.rst
file if fixing a bug or adding a new feature.Perf improvement when the mask/cond is "boolean" (EA) dtype.
Ending up with an EA mask is common when working with EA-backed data:
ASV added:
EA-backed frames are not widely covered by ASVs, but there are other methods it may help with, e.g.:
Note a slight slowdown in the non-EA
clip
ASV. However, I think the simplification is worth it and it putsclip
back to where it was a few days ago in terms of perf (prior to #51472). In addition, we might be able to improve perf via the discussion in #51547.