Skip to content

Deprecate decode timedelta #2105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

ocefpaf
Copy link
Contributor

@ocefpaf ocefpaf commented May 4, 2018

  • Closes Undesired decoding to timedelta64 (was: units of "seconds" translated to time coordinate) #1621 (remove if there is no corresponding issue, which should only be the case for minor changes)
  • Tests added (for all bug fixes or enhancements)
  • Tests passed (for all non-documentation changes)
  • Fully documented, including whats-new.rst for all changes and api.rst for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

I'll add tests, docs, and the whats-new entry later if I'm on the right path here.

xref.: #843, #940, and #2085

See http://nbviewer.jupyter.org/gist/ocefpaf/e736c07faf3d4c9361ecf546a692c2cd

@@ -293,10 +294,15 @@ def decode_cf_variable(name, var, concat_characters=True, mask_and_scale=True,
variables.CFScaleOffsetCoder()]:
var = coder.decode(var, name=name)

enable_future_time_unit_decoding = OPTIONS['enable_future_time_unit_decoding']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E501 line too long (82 > 79 characters)

@@ -180,6 +180,9 @@ def decode_cf_timedelta(num_timedeltas, units):
"""Given an array of numeric timedeltas in netCDF format, convert it into a
numpy timedelta64[ns] array.
"""
warnings.warn('Decoding timedeltas been deprecated and '
Copy link
Contributor Author

@ocefpaf ocefpaf May 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if you want to deprecate decode_cf_timedelta too or if you want to keep that around in case anyone wants to manually convert data to timedeltas. I don't have a user case for that but it was the default behavior, so maybe someone uses this.

PS: I never got any feedback from the mailing list users https://groups.google.com/forum/#!searchin/xarray/timedelta%7Csort:date/xarray/iadL57Zgw6w/KaW9UPVTCAAJ

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this function/CFTimedeltaCoder around for now and raise the warning from decode_cf_variable instead.

In the future (after we finish refactoring the backends API), users will have to explicitly pass it in as a coder to use when reading data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

return Variable(dims, data, attrs, encoding)

def decode(self, variable, name=None):
warnings.warn('Decoding timedeltas been deprecated and '
'will be removed in xarray v??.',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What version of xarray would be OK to make this default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say v0.11 for now. We can always push this back.

@@ -293,10 +294,15 @@ def decode_cf_variable(name, var, concat_characters=True, mask_and_scale=True,
variables.CFScaleOffsetCoder()]:
var = coder.decode(var, name=name)

enable_future_time_unit_decoding = OPTIONS['enable_future_time_unit_decoding']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An suggestions for a shorter name here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can split this expression across lines:

enable_future_time_unit_decoding = OPTIONS[
    'enable_future_time_unit_decoding']

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't really worried about the linter but rather with the long name 😄
But naming things is hard!

Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start! The last thing I'd like to see is support for automatically serializing/deserializing timedelta64 by saving an attribute dtype='timedelta64[ns]'. This will preserve the ability to roundtrip this data to netCDF, which I do think is useful for some users -- otherwise users will have xarray objects that they can no longer save to netCDF.

@@ -293,10 +294,15 @@ def decode_cf_variable(name, var, concat_characters=True, mask_and_scale=True,
variables.CFScaleOffsetCoder()]:
var = coder.decode(var, name=name)

enable_future_time_unit_decoding = OPTIONS['enable_future_time_unit_decoding']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can split this expression across lines:

enable_future_time_unit_decoding = OPTIONS[
    'enable_future_time_unit_decoding']

return Variable(dims, data, attrs, encoding)

def decode(self, variable, name=None):
warnings.warn('Decoding timedeltas been deprecated and '
'will be removed in xarray v??.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say v0.11 for now. We can always push this back.

@@ -180,6 +180,9 @@ def decode_cf_timedelta(num_timedeltas, units):
"""Given an array of numeric timedeltas in netCDF format, convert it into a
numpy timedelta64[ns] array.
"""
warnings.warn('Decoding timedeltas been deprecated and '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this function/CFTimedeltaCoder around for now and raise the warning from decode_cf_variable instead.

In the future (after we finish refactoring the backends API), users will have to explicitly pass it in as a coder to use when reading data.

@ocefpaf ocefpaf force-pushed the deprecate_decode_timedelta branch 2 times, most recently from 089b777 to 0e5ca37 Compare May 4, 2018 18:00
@ocefpaf ocefpaf force-pushed the deprecate_decode_timedelta branch from 4788690 to 39eb727 Compare May 5, 2018 18:41
@@ -335,6 +335,10 @@ like ``'days'`` for ``timedelta64`` data. ``calendar`` should be one of the cale
supported by netCDF4-python: 'standard', 'gregorian', 'proleptic_gregorian' 'noleap',
'365_day', '360_day', 'julian', 'all_leap', '366_day'.

The automatic decoding of untis like ``'days'`` to ``timedelta64`` is deprecated and
will be removed in xarray version 0.11. The default behavior will keep the same numeric
type of the original data instead of converting to ``timedelta64``.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shoyer b/c english is not my mother tongue I really appreciate your help wording this part.

When reading or writing netCDF files, xarray automatically decodes datetime and
timedelta arrays using `CF conventions`_ (that is, by using a ``units``
When reading or writing netCDF files, xarray automatically decodes datetime
arrays using `CF conventions`_ (that is, by using a ``units``
Copy link
Contributor Author

@ocefpaf ocefpaf May 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that removing the mention of timedelta here is better b/c the netCDF original data is rarely a timedelta. The case of wave periods, or any other phenomena 'period,' isn't a timedelta IMO.

@ocefpaf
Copy link
Contributor Author

ocefpaf commented May 5, 2018

@shoyer this is ready for a second round of reviews.

The last thing I'd like to see is support for automatically serializing/deserializing timedelta64 by saving an attribute dtype='timedelta64[ns]'. This will preserve the ability to roundtrip this data to netCDF, which I do think is useful for some users -- otherwise users will have xarray objects that they can no longer save to netCDF.

I'm not sure I follow. Roundtrip is easier now since the original time unit dtype is preserved, no? Ping @rsignell-usgs who is a netCDF/CF specialist 😉

@shoyer
Copy link
Member

shoyer commented May 7, 2018

I'm not sure I follow. Roundtrip is easier now since the original time unit dtype is preserved, no? Ping rsignell-usgs who is a netCDF/CF specialist

To clarify: I would like to be able to round-trip something like the following dataset to netCDF. The following code should still work without raising an error:

In [30]: import xarray

In [31]: import pandas as pd

In [32]: ds = xarray.Dataset({'foo': ('dt', [1, 2, 3])}, {'dt': pd.to_timedelta(['1 day', '2 days', '
    ...: 3 days'])})

In [33]: ds
Out[33]:
<xarray.Dataset>
Dimensions:  (dt: 3)
Coordinates:
  * dt       (dt) timedelta64[ns] 1 days 2 days 3 days
Data variables:
    foo      (dt) int64 1 2 3

In [34]: assert ds.identical(xarray.open_dataset(ds.to_netcdf()))

@ocefpaf
Copy link
Contributor Author

ocefpaf commented May 7, 2018

I would like to be able to round-trip something like the following dataset to netCDF.

OK, I though you meant roundtrip from the netCDF file and back. In my line of work handling high level python objects serialization like that is usually not desired as the user should be responsible of how s/he wants to save the data. (I also never had an application that required timedelta, so it was hard for me to contextualize that.)

I'll be away the next weeks, so if someone wants to pick it up from here please feel free to do so. Otherwise I'll try to get back to this once I return to the office.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Undesired decoding to timedelta64 (was: units of "seconds" translated to time coordinate)
3 participants