Skip to content

API: require tz equality or just tzawareness-compat in setitem-like methods? #37605

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jbrockmendel opened this issue Nov 3, 2020 · 14 comments · Fixed by #44243
Closed

API: require tz equality or just tzawareness-compat in setitem-like methods? #37605

jbrockmendel opened this issue Nov 3, 2020 · 14 comments · Fixed by #44243
Labels
API Design Enhancement Indexing Related to indexing on series/frames, not to indexes themselves Timezones Timezone data dtype
Milestone

Comments

@jbrockmendel
Copy link
Member

There are a handful of methods that do casting/validation in DatetimeArray (and indirectly, Series and DatetimeIndex) methods:

__setitem__
shift
insert
fillna
take
searchsorted
__cmp__

In all cases, we upcast datetime64/datetime to Timestamp, try to parse strings, and check for tzawareness compat (i.e. raise if (self.tz is None) ^ (other.tz is None))

In the cases of searchsorted and cmp, we stop there, and don't require the actual timezones to match. In the other cases, we are stricter and raise if not tz_compare(self.tz, other.tz)

In #37299 we discussed the idea of making these uniform. I am generally positive on this, as it would make things simpler both for users and in the code. cc @jorisvandenbossche @jreback

Side-notes

  1. __sub__ doesn't do the same casting, but also requires tz match but could get by with only tzawareness compat, xref #31793 Add support for subtracting datetime from Timestamp #37329. BTW the stdlib datetime only requires tzawareness compat.

  2. In Series/DataFrame.__setitem__ we cast to object instead of raising on tz mismatch. For brevity I lump this in with "we raise on tz mismatch".

@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 3, 2020
@jorisvandenbossche
Copy link
Member

Personally, I would go for "tz-awareness compat" requirement for all cases, I think. I don't think there are really "ambiguous" cases about what the result would be if all involved are tz-aware? (which if of course not the same with mixed tz-awareness)

@jorisvandenbossche jorisvandenbossche added API Design Timezones Timezone data dtype and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 3, 2020
@jbrockmendel
Copy link
Member Author

im on board with going all-tzawareness-compat. Assuming we get consensus on that, should we do a deprecation cycle or Just Do It?

@jreback
Copy link
Contributor

jreback commented Nov 6, 2020

yeah i think should deprecate now

+1 in strict

@jbrockmendel
Copy link
Member Author

@jreback to be clear, joris and I are advocating for the non-strict version

@jbrockmendel
Copy link
Member Author

@TomAugspurger @simonjayhawkins @WillAyd @mroeschke any preferences here?

@mroeschke
Copy link
Member

If the non-strict route was chosen, what would be the resulting timezone after these operations?

@jbrockmendel
Copy link
Member Author

If the non-strict route was chosen, what would be the resulting timezone after these operations?

self.tz would be retained

@mroeschke
Copy link
Member

I guess either choice is okay with me with slight preference for full strict

@AnjoMan
Copy link
Contributor

AnjoMan commented Nov 25, 2020

I'm +1 on this as well. if we can standardize on one function for checking compatibility in all these cases, it will be easier to maintain and also have more test coverage

@qiuxiaomu
Copy link

qiuxiaomu commented Feb 25, 2023

Hi team, I recently bumped into this warning and after reading the change notes and relevant code changes (not very familiar with this area), I am not quite following.
result.loc[, 'timestamp'] = pd.to_datetime(result.timestamp).dt.tz_convert(desired_timezone)

This LOC of mine gives me warning.
Are we saying, instead of this one liner, I need to break it down to:

  1. cast to the desired object type first (.dt not going to do it)
  2. then tz_convert to desired timezone.

Thanks

@jbrockmendel
Copy link
Member Author

what warning? what dtype does result['timestamp'] have before you do tz_convert?

@qiuxiaomu
Copy link

qiuxiaomu commented Feb 26, 2023

The warning comes from here:

 if not timezones.tz_compare(self.tz, other.tz):
                # TODO(2.0): remove this check. GH#37605
                warnings.warn(
                    "Setitem-like behavior with mismatched timezones is deprecated "
                    "and will change in a future version. Instead of raising "
                    "(or for Index, Series, and DataFrame methods, coercing to "
                    "object dtype), the value being set (or passed as a "
                    "fill_value, or inserted) will be cast to the existing "
                    "DatetimeArray/DatetimeIndex/Series/DataFrame column's "
                    "timezone. To retain the old behavior, explicitly cast to "
                    "object dtype before the operation.",
                    FutureWarning,
                    stacklevel=find_stack_level(),
                )

The dtype of result['timestamp'] was datetime64[ns, UTC].

Specifically, result['timestamp'].dtype yields {DatetimeTZDtype: ()} datetime64[ns, UTC].

what warning? what dtype does result['timestamp'] have before you do tz_convert?

@jbrockmendel
Copy link
Member Author

To keep the current behavior, in 2.0 you'll need to do df["timestamp"] = ... instead of df.loc[:, "timestamp"] = ....

@qiuxiaomu
Copy link

To keep the current behavior, in 2.0 you'll need to do df["timestamp"] = ... instead of df.loc[:, "timestamp"] = ....

Thanks a lot !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Enhancement Indexing Related to indexing on series/frames, not to indexes themselves Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants