-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Fixed pd.infer_freq
incompatible with Series["timestamp[s][pyarrow]"]
#58404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed pd.infer_freq
incompatible with Series["timestamp[s][pyarrow]"]
#58404
Conversation
pd.infer_freq
incompatible with Series["timestamp[s][pyarrow]"]
) | ||
index = values | ||
if isinstance(index, ABCSeries) and not ( | ||
is_datetime64_any_dtype(index) or index.dtype == object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check dtype.kind == "mM"
on ArrowDtype
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit surprised by this comment, isn't pandas.api.types.is_
supposed to be a single source of truth of what is considered a certain data type? How do you remember what are all the conditions to check otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For internal usages, checking dtype.kind
is more performant than api.types
functions so that is preferred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I guess I accidentally removed checking for timedelta. Although no tests failed. I am adding some tests for timedeltas as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another bug: when trying with Index["duration[s][pyarrow]"]
, it complains about the property asi8
missing. asi8 is denoted as deprecated on general index types. Moreover, the FrequencyInferer
assumes numpy dtype and makes use of internals (like index._data._ndarray
)
the easiest workaround would be to cast to numpy, but I guess we want to avoid that and keep dtype as pyarrow inside the ops.
This is going to convert to a (non-pyarrow) DatetimeIndex under the hood, which will incur a silent perf penalty. Better to just tell users to convert themselves? |
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
pd.infer_freq
incompatible withSeries["timestamp[s][pyarrow]"]
. #58403doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.