BUG/PERF: Fixed IntervalIndex.nbytes and itemsize #20600

Closed
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -896,6 +896,7 @@ Performance Improvements
- Improved performance of :func:`pandas.core.groupby.GroupBy.ffill` and :func:`pandas.core.groupby.GroupBy.bfill` (:issue:`11296`)
- Improved performance of :func:`pandas.core.groupby.GroupBy.any` and :func:`pandas.core.groupby.GroupBy.all` (:issue:`15435`)
- Improved performance of :func:`pandas.core.groupby.GroupBy.pct_change` (:issue:`19165`)
- Improved performance of ``IntervalIndex.nbytes`` and ``IntervalIndex.itemsize`` (:issue:`19209`)

.. _whatsnew_0230.docs:

@@ -1062,6 +1063,7 @@ Indexing
- Bug in ``Index`` subclasses constructors that ignore unexpected keyword arguments (:issue:`19348`)
- Bug in :meth:`Index.difference` when taking difference of an ``Index`` with itself (:issue:`20040`)
- Bug in :meth:`DataFrame.first_valid_index` and :meth:`DataFrame.last_valid_index` in presence of entire rows of NaNs in the middle of values (:issue:`20499`).
- Bug in ``IntervalIndex.nbytes`` and ``IntervalIndex.itemsize`` underreporting memory usage (:issue:`19209`).

MultiIndex
^^^^^^^^^^
8 changes: 8 additions & 0 deletions pandas/core/indexes/interval.py
@@ -728,6 +728,14 @@ def size(self):
        # Avoid materializing self.values
        return self.left.size

    @property
    def nbytes(self):
        return self.left.nbytes + self.right.nbytes

    @property
    def itemsize(self):
        return self.left.itemsize + self.right.itemsize

Contributor comment: This doesn’t really make sense to have. Why do you need this?


    @property
    def shape(self):
        # Avoid materializing self.values
19 changes: 19 additions & 0 deletions pandas/tests/indexes/interval/test_interval.py
@@ -24,6 +24,7 @@ def name(request):

class TestIntervalIndex(Base):
    _holder = IntervalIndex
    _compat_props = ['shape', 'ndim', 'size']

Contributor comment: What is this for? We don’t use it anywhere.


    def setup_method(self, method):
        self.index = IntervalIndex.from_arrays([0, 1], [1, 2])
@@ -964,3 +965,21 @@ def test_to_tuples_na(self, tuples, na_tuple):
            assert all(isna(x) for x in result_na)
        else:
            assert isna(result_na)

    def test_nbytes(self):
        # GH 19209
        left = np.arange(0, 4, dtype='i8')
        right = np.arange(1, 5, dtype='i8')

        result = IntervalIndex.from_arrays(left, right).nbytes
        expected = 64  # 4 * 8 * 2
        assert result == expected

    def test_itemsize(self):
        # GH 19209
        left = np.arange(0, 4, dtype='i8')
        right = np.arange(1, 5, dtype='i8')

        result = IntervalIndex.from_arrays(left, right).itemsize
        expected = 16  # 8 * 2
        assert result == expected
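
The arithmetic behind the expected values in the two tests (64 and 16) can be reproduced without pandas. The following is only a rough sketch, not the pandas implementation: `IntervalPairs` is a hypothetical class introduced for illustration, and the standard library's `array` module stands in for the NumPy arrays that back the left and right endpoints. It assumes the `'q'` typecode is 8 bytes wide, which holds on mainstream platforms.

```python
from array import array


class IntervalPairs:
    """Hypothetical stand-in for a container that stores interval
    endpoints in two separate typed arrays (not the pandas class)."""

    def __init__(self, left, right):
        if len(left) != len(right):
            raise ValueError("left and right must have the same length")
        self.left = left
        self.right = right

    @property
    def itemsize(self):
        # One interval occupies one element of each endpoint array.
        return self.left.itemsize + self.right.itemsize

    @property
    def nbytes(self):
        # Total payload: bytes held by the left array plus the right array.
        return (len(self.left) * self.left.itemsize
                + len(self.right) * self.right.itemsize)


# Assumption: 'q' (signed long long) is 8 bytes, matching dtype 'i8' in the tests.
pairs = IntervalPairs(array("q", range(0, 4)), array("q", range(1, 5)))
print(pairs.itemsize, pairs.nbytes)
```

Summing over both sides avoids building a combined object array just to measure it, which appears to be the motivation for the properties added in this diff.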