Series shift method breaks for series of pandas Intervals in Pandas 1.0 (works in 0.25.3) #31495
Comments
I don't agree with the expected output. The dtype was lost ( I think the bug is from the shifted interval sub-dtype not necessarily matching the original sub-dtype. In
In [21]: pd.arrays.IntervalArray._from_sequence([np.nan], dtype=pd.IntervalDtype('int64'))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-21-34d8dc2cf5c6> in <module>
----> 1 pd.arrays.IntervalArray._from_sequence([np.nan], dtype=pd.IntervalDtype('int64'))
~/sandbox/pandas/pandas/core/arrays/interval.py in _from_sequence(cls, scalars, dtype, copy)
243 @classmethod
244 def _from_sequence(cls, scalars, dtype=None, copy=False):
--> 245 return cls(scalars, dtype=dtype, copy=copy)
246
247 @classmethod
~/sandbox/pandas/pandas/core/arrays/interval.py in __new__(cls, data, closed, dtype, copy, verify_integrity)
182 copy=copy,
183 dtype=dtype,
--> 184 verify_integrity=verify_integrity,
185 )
186
~/sandbox/pandas/pandas/core/arrays/interval.py in _simple_new(cls, left, right, closed, copy, dtype, verify_integrity)
202 raise TypeError(msg)
203 elif dtype.subtype is not None:
--> 204 left = left.astype(dtype.subtype)
205 right = right.astype(dtype.subtype)
206
~/sandbox/pandas/pandas/core/indexes/numeric.py in astype(self, dtype, copy)
385 # TODO(jreback); this can change once we have an EA Index type
386 # GH 13149
--> 387 arr = astype_nansafe(self.values, dtype=dtype)
388 return Int64Index(arr)
389 return super().astype(dtype, copy=copy)
~/sandbox/pandas/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
866
867 if not np.isfinite(arr).all():
--> 868 raise ValueError("Cannot convert non-finite values (NA or inf) to integer")
869
870 elif is_object_dtype(arr):
ValueError: Cannot convert non-finite values (NA or inf) to integer
@owenlamont are you interested in looking into this?
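The last frame of the traceback above comes down to a missing value being cast to an integer subtype. A minimal sketch of that underlying limitation, using only the public `Series.astype` rather than the private `_from_sequence` call (the variable names are illustrative, and the exact exception message may differ between pandas versions):

```python
import numpy as np
import pandas as pd

# A float Series can hold NaN; an int64 one cannot.
floats = pd.Series([np.nan, 1.0])

try:
    floats.astype("int64")
    cast_error = None
except ValueError as exc:
    # pandas refuses to cast NaN to an integer dtype
    cast_error = exc

print(type(cast_error).__name__, cast_error)
```

This is consistent with the diagnosis in the comment: a shifted interval array needs a missing-value slot, which an `int64` sub-dtype cannot represent.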
The expected output is just what was in 0.25.3. I agree we can do better (preserve the dtype), but it's still a regression that it errors instead of returning the correct values (just not the correct dtype).
Agreed that it's a regression, but if we were writing a test, we wouldn't want to assert that the result dtype is object, right?
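A rough sketch of what such a test might look like, assuming the fixed behaviour keeps some interval dtype rather than falling back to object. The helper name `check_shift_keeps_intervals` and the sample data are hypothetical, and this is not the test that was actually added to pandas:

```python
import pandas as pd


def check_shift_keeps_intervals(periods=1):
    # Integer-bounded intervals, as in the second example of the report.
    ser = pd.Series([pd.Interval(0, 1), pd.Interval(1, 2), pd.Interval(2, 3)])
    result = ser.shift(periods)

    # The first `periods` slots should be missing ...
    assert result.iloc[:periods].isna().all()
    # ... and the remaining values should be the original ones, moved down.
    assert list(result.iloc[periods:]) == list(ser.iloc[:-periods])
    # Assumption: the result stays an interval dtype (the sub-dtype may
    # change, e.g. int64 -> float64, to make room for the missing value).
    assert isinstance(result.dtype, pd.IntervalDtype)
    return result


shifted = check_shift_keeps_intervals()
print(shifted.dtype)
```

Asserting on the dtype family rather than on `object` keeps the test from pinning down the 0.25.3 behaviour that the comments above consider a loss of information.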
Thanks for the quick responses. I'd love to contribute to Pandas but I'm not familiar with the code base and probably wouldn't be able to address this issue in a reasonable time. You've got me interested in learning, so I'll have a shot at compiling from source and see how I go from there.
Thanks! Since this is a regression I may fix it myself so that we can address it quickly. We have lots of "Good first issues" though :)
Ah, this also worked (partly) in 0.25 because there was not yet interval dtype inference in the Series constructor.
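To illustrate the point about constructor inference, here is a small sketch comparing a Series built with the inferred dtype against one forced to `object`, which mimics the pre-inference behaviour described above. The variable names are illustrative, and whether the object path avoids the error on 1.0.0 specifically is an assumption that has not been verified here:

```python
import pandas as pd

intervals = [pd.Interval(1, 2), pd.Interval(2, 3)]

# With inference (1.0 and later, per the comment above) this is an interval dtype.
inferred = pd.Series(intervals)

# Forcing object dtype keeps the Interval scalars as plain Python objects.
as_object = pd.Series(intervals, dtype=object)

# Shifting the object-dtype Series goes through the generic object path.
shifted_object = as_object.shift(1)

print(inferred.dtype)
print(as_object.dtype)
print(shifted_object)
```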
Code Sample, a copy-pastable example if possible
Problem description
Calling the shift method on an integer-indexed pandas Series of pandas Intervals raises opaque exceptions in 1.0.0. The same code works as expected in pandas 0.25.3. This is a breaking change, and I'm unsure whether it is intentional. I'm assuming it should still work the same as 0.25.3 for now. I searched for documented changes to the behaviour of shift but didn't find any.
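The original code sample is not reproduced here, so the following is only a sketch of the kind of call being described, using timestamp-bounded intervals. The first interval is borrowed from the expected output further down; the second interval and the `try_shift` helper are placeholders invented for illustration:

```python
import pandas as pd

# Placeholder data: a Series of timestamp intervals with a default integer index.
ts_intervals = pd.Series(
    [
        pd.Interval(pd.Timestamp("2020-09-04 10:00"), pd.Timestamp("2020-11-30 14:00")),
        pd.Interval(pd.Timestamp("2020-11-30 14:00"), pd.Timestamp("2020-12-31 14:00")),
    ]
)


def try_shift(ser, periods=1):
    # Return (result, None) on success or (None, exception) on failure,
    # so the same script can be run on different pandas versions.
    try:
        return ser.shift(periods), None
    except Exception as exc:
        return None, exc


shifted, error = try_shift(ts_intervals)
print(pd.__version__)
print(shifted if error is None else repr(error))
```

Printing the pandas version next to the outcome makes it easy to compare a 0.25.3 environment against a 1.0.0 one.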
Exception traceback for example 1:
Exception traceback for example 2
Expected Output
This is the actual output I got executing with pandas 0.25.3
1 NaN
2 (2020-09-04 10:00:00, 2020-11-30 14:00:00]
dtype: object
1 NaN
2 (1, 2]
dtype: object
Output of pd.show_versions()
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None
pandas : 1.0.0
numpy : 1.17.5
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200119
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.1
fastparquet : 0.3.2
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.2
numexpr : None
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : 0.2.2
scipy : 1.3.1
sqlalchemy : 1.3.13
tables : None
tabulate : 0.8.6
xarray : 0.14.1
xlrd : 1.2.0
xlwt : None
xlsxwriter : 1.2.7
numba : 0.48.0