BUG: loc with empty multiindex raises exception #38711
Added test case for the changes made in GH#36936
    df.loc[df.loc[df.loc[:, "value"] == 0].index, "value"] = 5
    result = df
    expected = DataFrame([1, 2, 3, 4], index=index, columns=["value"])
    tm.assert_equal(result, expected)
Does tm.assert_frame_equal not work here (and above)?
The most recent commit changes the assertion to tm.assert_frame_equal.
    # loc on empty multiindex == loc with False mask
    empty_multiindex = df.loc[df.loc[:, "value"] == 0, :].index
    result = df.loc[empty_multiindex, :]
    expected = df.loc[[False] * len(df.index), :]
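The equivalence this snippet asserts can be tried outside the test suite. The following is a minimal standalone sketch, assuming a pandas version that already includes the fix for GH#36936 (on older versions the `loc` call with the empty MultiIndex is expected to raise). The names `by_index` and `by_mask` are illustrative and do not appear in the PR; the public `pd.testing.assert_frame_equal` stands in for `tm.assert_frame_equal`.

```python
import pandas as pd

arrays = [["a", "a", "b", "a"], ["a", "a", "b", "b"]]
index = pd.MultiIndex.from_arrays(arrays, names=("idx1", "idx2"))
df = pd.DataFrame([1, 2, 3, 4], index=index, columns=["value"])

# No row has value == 0, so the filtered frame's index is an empty MultiIndex.
empty_multiindex = df.loc[df["value"] == 0, :].index

# Selecting with that empty MultiIndex (the case reported in GH#36936) ...
by_index = df.loc[empty_multiindex, :]
# ... should match selecting with an all-False boolean mask.
by_mask = df.loc[[False] * len(df.index), :]

# Raises AssertionError if the two empty frames differ.
pd.testing.assert_frame_equal(by_index, by_mask)
```

Both selections return a frame with zero rows that keeps the `value` column and the `idx1`/`idx2` level names.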
Can you construct the expected frame explicitly?
Constructing an empty DataFrame named expected with a MultiIndex sets expected.index.inferred_type to object by default.
result.index.inferred_type is string, as inferred from the index variable created earlier.
This mismatch makes tm.assert_frame_equal fail.
I used the following code to create an empty DataFrame with a MultiIndex:
import pandas._testing as tm
from pandas import DataFrame, MultiIndex


def test_loc_empty_multiindex():
    # GH#36936
    arrays = [["a", "a", "b", "a"], ["a", "a", "b", "b"]]
    index = MultiIndex.from_arrays(arrays, names=("idx1", "idx2"))
    df = DataFrame([1, 2, 3, 4], index=index, columns=["value"])

    # loc on empty multiindex == loc with False mask
    empty_multiindex = df.loc[df.loc[:, "value"] == 0, :].index
    result = df.loc[empty_multiindex, :]
    # explicitly constructed empty frame with a hand-built empty MultiIndex
    index2 = MultiIndex(levels=[[], []], codes=[[], []], names=["idx1", "idx2"])
    expected = DataFrame([], index=index2, columns=["value"], dtype="string")
    expected = expected.astype({"value": "int"})
    tm.assert_frame_equal(result, expected)  # this test fails

    # replacing value with loc on empty multiindex
    df.loc[df.loc[df.loc[:, "value"] == 0].index, "value"] = 5
    result = df
    expected = DataFrame([1, 2, 3, 4], index=index, columns=["value"])
    tm.assert_frame_equal(result, expected)
If the index variable is used directly to create the empty DataFrame, it introduces NaN values.
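For anyone who wants to look at the mismatch described above, here is a hedged sketch that prints the per-level inferred types of three empty MultiIndexes and then builds an explicit expected frame from a zero-length slice of the original index. The slice `index[:0]` is an assumption of this sketch, not something the PR used; it relies on slicing keeping the original levels, so the empty index has the same level types as the result. The names `hand_built` and `sliced` are illustrative, and the all-False mask is used to obtain `result` so the sketch does not depend on the GH#36936 fix.

```python
import pandas as pd

arrays = [["a", "a", "b", "a"], ["a", "a", "b", "b"]]
index = pd.MultiIndex.from_arrays(arrays, names=("idx1", "idx2"))
df = pd.DataFrame([1, 2, 3, 4], index=index, columns=["value"])

# Empty selection obtained through a boolean mask.
result = df.loc[[False] * len(df.index), :]

# Hand-built empty MultiIndex, as in the comment above: its levels are empty.
hand_built = pd.MultiIndex(levels=[[], []], codes=[[], []], names=["idx1", "idx2"])

# Assumption of this sketch: a zero-length slice keeps the original levels.
sliced = index[:0]

# Print the inferred type of each level so the three indexes can be compared.
for label, mi in [("result", result.index), ("hand_built", hand_built), ("sliced", sliced)]:
    print(label, [level.inferred_type for level in mi.levels])

# Explicit expected frame: zero rows, same column dtype, index taken from the slice.
expected = pd.DataFrame(
    {"value": pd.Series([], dtype=df["value"].dtype, index=sliced)}
)
pd.testing.assert_frame_equal(result, expected)
```

Unlike passing the full `index` to the constructor, the zero-length slice adds no rows, so no NaN values appear.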
@@ -695,3 +695,22 @@ def test_loc_getitem_index_differently_ordered_slice_none():
        columns=["a", "b"],
    )
    tm.assert_frame_equal(result, expected)


def test_loc_empty_multiindex():
Can you co-locate this with similar tests?
jreback@lhs2:~/pandas-dev$ grep empty pandas/tests/indexing/multiindex/*
pandas/tests/indexing/multiindex/test_getitem.py:def test_frame_getitem_multicolumn_empty_level():
pandas/tests/indexing/multiindex/test_getitem.py:def test_frame_mi_empty_slice():
pandas/tests/indexing/multiindex/test_loc.py: empty = Series(data=[], dtype=np.float64)
pandas/tests/indexing/multiindex/test_loc.py: result = x.loc[empty]
pandas/tests/indexing/multiindex/test_loc.py: # empty array:
pandas/tests/indexing/multiindex/test_loc.py: empty = np.array([])
pandas/tests/indexing/multiindex/test_loc.py: result = x.loc[empty]
pandas/tests/indexing/multiindex/test_loc.py: ([], []), # empty ok
pandas/tests/indexing/multiindex/test_loc.py:def test_loc_getitem_duplicates_multiindex_empty_indexer(columns_indexer):
pandas/tests/indexing/multiindex/test_loc.py: # empty indexer
IOW, there are two tests in test_getitem.py that test basically the same thing, so locate it there.
Moved the test to pandas/tests/indexing/multiindex/test_getitem.py
thanks @kasim95
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff