-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
ENH/API: preserve non-nano in to_datetime #50369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/tools/datetimes.py
Outdated
if not is_supported_unit(get_unit_from_dtype(arg_dtype)): | ||
# We go to closest supported reso, i.e. "s" | ||
arg = astype_overflowsafe( | ||
np.asarray(arg), np.dtype("M8[s]"), coerce=errors == "coerce" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is errors="ignore" handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will continue to incorrectly raise for the time being
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Planning on addressing in this PR or worth adding a TODO/xfail test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add a TODO in my next "assorted" branch if this isn't a show-stopper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(actually looks like this needs rebase so will add comment here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! just got a question about coverage
pandas/_libs/tslibs/np_datetime.pyx
Outdated
@@ -304,6 +304,7 @@ cpdef ndarray astype_overflowsafe( | |||
cnp.dtype dtype, | |||
bint copy=True, | |||
bint round_ok=True, | |||
bint coerce=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps is_coerce
, for consistency with tslib
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
if is_datetime64_dtype(bins): | ||
# As of 2.0, to_datetime may give non-nano, so we need to convert | ||
# here until the rest of this file recognizes non-nano | ||
bins = bins.astype("datetime64[ns]", copy=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any test that would hit the else
branch? I think now is_datetime64_dtype(bins)
is always True
in the test suite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no extant test, but im not sure its impossible to reach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps raise an AssertionError in the else
block? the it won't show up as uncovered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im not concerned about it showing as uncovered, and don't want to break potentially-working code in the wild.
will need to revisit this eventually to handle non-nano without the cast anyway
@@ -858,22 +858,31 @@ def test_to_datetime_dt64s_and_str(self, arg, format): | |||
"dt", [np.datetime64("1000-01-01"), np.datetime64("5000-01-02")] | |||
) | |||
def test_to_datetime_dt64s_out_of_bounds(self, cache, dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the name of the test change, now that they're no longer out-of-bounds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gentle reminder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel last one, sorry 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already renamed to reflect it is only out of ns bounds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed that, thanks!
ts = to_datetime(dt, errors="raise", cache=cache) | ||
assert isinstance(ts, Timestamp) | ||
assert ts.unit == "s" | ||
assert ts.asm8 == dt | ||
|
||
ts = to_datetime(dt, errors="coerce", cache=cache) | ||
assert isinstance(ts, Timestamp) | ||
assert ts.unit == "s" | ||
assert ts.asm8 == dt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is checking with both raise
and coerce
necessary, given that no errors are thrown in this test anyway? errors='ignore'
works too
could we either parametrise over that too, or just remove errors=
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gentle reminder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code changes are fine, I'd just left a couple of comments (which may have got lost in the conversation), so I've just bumped them
ts = to_datetime(dt, errors="raise", cache=cache) | ||
assert isinstance(ts, Timestamp) | ||
assert ts.unit == "s" | ||
assert ts.asm8 == dt | ||
|
||
ts = to_datetime(dt, errors="coerce", cache=cache) | ||
assert isinstance(ts, Timestamp) | ||
assert ts.unit == "s" | ||
assert ts.asm8 == dt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gentle reminder
@@ -858,22 +858,31 @@ def test_to_datetime_dt64s_and_str(self, arg, format): | |||
"dt", [np.datetime64("1000-01-01"), np.datetime64("5000-01-02")] | |||
) | |||
def test_to_datetime_dt64s_out_of_bounds(self, cache, dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gentle reminder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me pending green
Thanks @jbrockmendel |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.