Skip to content

BUG: IntervalArray.__setitem__ creates copies incorrectly #27147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jbrockmendel opened this issue Jun 30, 2019 · 4 comments · Fixed by #32782
Closed

BUG: IntervalArray.__setitem__ creates copies incorrectly #27147

jbrockmendel opened this issue Jun 30, 2019 · 4 comments · Fixed by #32782
Labels
Bug Interval Interval data type

Comments

@jbrockmendel
Copy link
Member

from pandas.core.arrays import IntervalArray
import numpy as np
arr = np.random.randn(10)
a = IntervalArray.from_arrays(arr, arr+1)
b = a[:5]

b[0] = b[1]
assert a[0] == a[1]  # <-fails
@jschendel jschendel added Bug Interval Interval data type labels Jun 30, 2019
@jschendel jschendel added this to the Contributions Welcome milestone Jun 30, 2019
@jreback jreback modified the milestones: Contributions Welcome, 1.1 Mar 19, 2020
@jbrockmendel jbrockmendel reopened this Apr 10, 2020
@jbrockmendel
Copy link
Member Author

I originally thought #32782 would fix this, but it doesn't get all the way across the finish line.

@TomAugspurger
Copy link
Contributor

This from the explicit copies at

# Need to ensure that left and right are updated atomically, so we're
# forced to copy, update the copy, and swap in the new values.
left = self.left.copy(deep=True)
left._values[key] = value_left
self._left = left
right = self.right.copy(deep=True)
right._values[key] = value_right
self._right = right

To check that, this diff gets the original example working.

diff --git a/pandas/core/arrays/interval.py b/pandas/core/arrays/interval.py
index c861d25afd..344538d1a2 100644
--- a/pandas/core/arrays/interval.py
+++ b/pandas/core/arrays/interval.py
@@ -555,11 +555,13 @@ class IntervalArray(IntervalMixin, ExtensionArray):
 
         # Need to ensure that left and right are updated atomically, so we're
         # forced to copy, update the copy, and swap in the new values.
-        left = self.left.copy(deep=True)
+        # left = self.left.copy(deep=True)
+        left = self.left
         left._values[key] = value_left
         self._left = left
 
-        right = self.right.copy(deep=True)
+        # right = self.right.copy(deep=True)
+        right = self.right
         right._values[key] = value_right
         self._right = right

However, as the comment indicates I don't think we can just do that. Perhaps the flow should be

left2 = self.left.copy()
left2._values[key] = value_left

right2 = self.right.copy()
right2._values[key] = value_right

# We know the setitem works
left = self.left
left._values[key] = value_left
right = self.right
right._values[key] = value_right

(the fact that we're manipulating Index._values makes me a bit worried, but that's true regardless)

@TomAugspurger TomAugspurger modified the milestones: 1.1, Contributions Welcome Jun 17, 2020
@jbrockmendel
Copy link
Member Author

Perhaps the flow should be

That looks right to me. Ideally we could do this without making copies, but this is a strict improvement over the status quo.

@jbrockmendel
Copy link
Member Author

closed by #36310

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Interval Interval data type
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants