units = 'days' leads to timedelta64 for data variable #2085

Closed
mathause opened this issue Apr 26, 2018 · 5 comments

@mathause
Collaborator

Code Sample

import numpy as np
import xarray as xr

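def test_units(units):
    """Round-trip a float variable through netCDF and print its decoded dtype."""
    x = np.arange(10)
    data = np.random.randn(10)

    ds = xr.Dataset(data_vars=dict(data=('x', data)), coords=dict(x=('x', x)))
    # only the data variable carries the units attribute; 'x' has none
    ds.data.attrs['units'] = units

    ds.to_netcdf('tst.nc')

    # reopen with the default CF decoding
    decoded = xr.open_dataset('tst.nc')

    print(units.ljust(8), decoded.data.dtype)

    ds.close()
    decoded.close()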

test_units('seconds')
test_units('second')

test_units('minutes')
test_units('minute')

test_units('days')
test_units('day')

test_units('months')
test_units('years')

Problem description

Returns:

seconds  timedelta64[ns]
second   float64
minutes  timedelta64[ns]
minute   float64
days     timedelta64[ns]
day      float64
months   float64
years    float64

Expected Output

I would expect type float for all of them. Or is this expected behaviour?

I have a dataset that reports 'consecutive dry days', and the dataset creator correctly set the units of the data to 'days'. I don't want this variable to be decoded to timedelta64, but the time axis should still be decoded.
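One possible workaround, sketched under assumptions rather than taken from the thread: read the data without time decoding (for a file on disk, `xr.open_dataset(path, decode_times=False)`), temporarily remove the `units` attribute from the variable that should stay numeric, and then run `xr.decode_cf` so the time axis is still decoded. The in-memory dataset and the names `dry_days` and `time` below are hypothetical stand-ins for the real file.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the undecoded file contents.
raw = xr.Dataset(
    data_vars={"dry_days": ("time", np.array([3.0, 0.0, 7.0]), {"units": "days"})},
    coords={"time": ("time", np.array([0, 1, 2]), {"units": "days since 2000-01-01"})},
)

# Temporarily remove the attribute that would trigger timedelta decoding.
units = raw["dry_days"].attrs.pop("units")

# Decode everything else, including the time axis.
decoded = xr.decode_cf(raw)

# Restore the attribute as plain metadata on the still-numeric variable.
decoded["dry_days"].attrs["units"] = units

print(decoded["dry_days"].dtype)
print(decoded["time"].dtype)
```

Note that writing `decoded` back to netCDF and reopening it with default settings would hit the same decoding again, since the `units` attribute is restored. xarray releases later than the one in this report also accept a `decode_timedelta` argument in `open_dataset` and `decode_cf`, which may be the simpler route where available.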

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.126-48-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

xarray: 0.10.2+dev6.g9261601
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: 0.5.0
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: 1.0.0
dask: 0.17.2
distributed: 1.21.3
matplotlib: 2.2.2
cartopy: None
seaborn: None
setuptools: 39.0.1
pip: 9.0.2
conda: None
pytest: 3.4.2
IPython: 6.2.1
sphinx: None

@shoyer
Member

shoyer commented Apr 26, 2018

This is expected behavior, but you're not the first person to complain, and it seems like there is a consensus around changing it: #1621

I'm going to close this issue in favor of the other one.

@shoyer shoyer closed this as completed Apr 26, 2018
@mathause
Collaborator Author

Thanks & sorry I didn't see the other issue.

@ocefpaf
Contributor

ocefpaf commented Apr 26, 2018

@shoyer what is the path forward? In #940 I implemented a keyword so we could keep both behaviors, which I believe is a bad idea.

Would a PR that changes the current behavior to return floats instead of timedelta64 be OK? If so, I can try to put that together over this weekend.

@shoyer
Member

shoyer commented Apr 26, 2018

@ocefpaf I suggested one option in #1621 (comment), issuing a FutureWarning so anyone reliant on this behavior is not surprised. (We could also add an option for opting into the behavior early if you like, e.g., xarray.set_options(enable_future_time_unit_decoding=True) like in #1252.)

@ocefpaf
Contributor

ocefpaf commented Apr 26, 2018

Thanks! I'll look into those and should have something by next week.
