DataFrame.iloc does not raise for setitem if empty #32896

SaturnFromTitan · 2020-03-21T20:26:06Z

test_setitem_ndarray_3d in pandas/tests/indexing/test_indexing.py isn't raising for empty index, iloc and DataFrame parametrization.

The mentioned PR added a new xfail test for this combination: test_setitem_ndarray_3d_does_not_fail_for_iloc_empty_dataframe. The goal of this PR is to remove the xfail test and remove the associated pytest.skip in test_setitem_ndarray_3d.

The text was updated successfully, but these errors were encountered:

simonjayhawkins · 2020-03-30T15:44:16Z

This didn't silently fail on 0.25.3

>>> import numpy as np
>>>
>>> import pandas as pd
>>>
>>> pd.__version__
'0.25.3'
>>>
>>> i = pd.Index([])
>>> obj = pd.DataFrame(np.random.randn(len(i), len(i)), index=i, columns=i)
>>> obj
Empty DataFrame
Columns: []
Index: []
>>>
>>> nd3 = np.random.randint(5, size=(2, 2, 2))
>>>
>>> obj.iloc[nd3] = 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 204, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "C:\Users\simon\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 191, in _get_setitem_indexer
    return self._convert_to_indexer(key, axis=axis, is_setter=True)
  File "C:\Users\simon\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 2175, in _convert_to_indexer
    self._validate_key(obj, axis)
  File "C:\Users\simon\Anaconda3\lib\site-packages\pandas\core\indexing.py", line 2031, in _validate_key
    raise IndexError("positional indexers are out-of-bounds")
IndexError: positional indexers are out-of-bounds

922f932 is the first bad commit
commit 922f932
Author: jbrockmendel [email protected]
Date: Thu Feb 27 04:52:36 2020 -0800

PERF: pass through to numpy validation for iloc setitem (#32257)

cc @jbrockmendel

jbrockmendel · 2020-04-01T21:18:21Z

This matches the numpy behavior:

arr = np.array([]).reshape(0, 0)
nd3 = np.random.randint(5, size=(2, 2, 2))
arr[nd3] = 0

This may be a numpy bug as well cc @seberg ?

seberg · 2020-04-01T21:27:16Z

Yeah, I guess it technically is a NumPy bug. The reason here must be the additional empty dimension that is not being indexed, somehow causing the index check to never be run to begin with.
Can you open a NumPy issue about it? I am surprised there is a regression, this is probably the case at least since NumPy 1.9, but then I guess the regression is due to a change in how pandas does things internally?

jbrockmendel · 2020-04-01T23:31:41Z

but then I guess the regression is due to a change in how pandas does things internally?

yes, we recently stopped doing some validation on iloc and instead fall through to let numpy do the validation

seberg · 2020-04-02T20:05:01Z

As a heads up, I am considering deprecating this for a release if that is OK? I agree that it is a straight bug, but apparently I intentionally put this in (although possibly not intentionally this exact case).
If it would be easier for you if I remove it right away, that would be a welcome vote though :) (may still decide to do that in any case).

jbrockmendel · 2020-04-02T20:20:58Z

So we'll put back in a check and raise, and then when our min numpy version catches up to 1.19(?) we can go back to falling through? If so, sounds good.

seberg · 2020-04-02T21:59:35Z

Yeah, I suppose so. You could get away with only doing it when the result/assignment may be empty (if you want to avoid doing unnecessary/slow checks). If we deprecate, I guess it would be 1.20 otherwise 1.19, but since you need backcompat anyway, I guess it makes no real difference.

phofl · 2020-11-22T23:14:38Z

cc @jbrockmendel This raises since cd649a2

Traceback (most recent call last):
  File "/home/developer/.config/JetBrains/PyCharm2020.2/scratches/scratch_4.py", line 145, in <module>
    obj.iloc[nd3] = 0
  File "/home/developer/PycharmProjects/pandas/pandas/core/indexing.py", line 687, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)
  File "/home/developer/PycharmProjects/pandas/pandas/core/indexing.py", line 1630, in _setitem_with_indexer
    self._setitem_with_indexer_split_path(indexer, value, name)
  File "/home/developer/PycharmProjects/pandas/pandas/core/indexing.py", line 1646, in _setitem_with_indexer_split_path
    raise ValueError(r"Cannot set values with ndim > 2")
ValueError: Cannot set values with ndim > 2

Can we close this ?

SaturnFromTitan added the Bug label Mar 21, 2020

jbrockmendel added the Indexing Related to indexing on series/frames, not to indexes themselves label Mar 27, 2020

simonjayhawkins added Regression Functionality that used to work in a prior pandas version and removed Bug labels Mar 30, 2020

seberg mentioned this issue Apr 2, 2020

DEP: Check integer array indices when result/subspace is empty numpy/numpy#15898

Closed

mroeschke added the Bug label May 11, 2020

jbrockmendel closed this as completed Nov 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DataFrame.iloc does not raise for setitem if empty #32896

DataFrame.iloc does not raise for setitem if empty #32896

SaturnFromTitan commented Mar 21, 2020 •

edited

Loading

simonjayhawkins commented Mar 30, 2020

jbrockmendel commented Apr 1, 2020

seberg commented Apr 1, 2020

jbrockmendel commented Apr 1, 2020

seberg commented Apr 2, 2020

jbrockmendel commented Apr 2, 2020

seberg commented Apr 2, 2020

phofl commented Nov 22, 2020 •

edited

Loading

DataFrame.iloc does not raise for setitem if empty #32896

DataFrame.iloc does not raise for setitem if empty #32896

Comments

SaturnFromTitan commented Mar 21, 2020 • edited Loading

simonjayhawkins commented Mar 30, 2020

jbrockmendel commented Apr 1, 2020

seberg commented Apr 1, 2020

jbrockmendel commented Apr 1, 2020

seberg commented Apr 2, 2020

jbrockmendel commented Apr 2, 2020

seberg commented Apr 2, 2020

phofl commented Nov 22, 2020 • edited Loading

SaturnFromTitan commented Mar 21, 2020 •

edited

Loading

phofl commented Nov 22, 2020 •

edited

Loading