started to fixturize pandas/tests/base #31701

Merged: 4 commits, Feb 15, 2020
Changes from 2 commits
52 changes: 52 additions & 0 deletions pandas/tests/base/conftest.py
@@ -0,0 +1,52 @@
import numpy as np
import pytest

import pandas as pd
from pandas.tests.indexes.conftest import indices_dict
Contributor

I would rather just move indices_dict to pandas/conftest.py and create these helper series there (as fixtures). OK to do this in two PRs.

Contributor Author

I just did it in this one. Please let me know if it's what you intended.

create these helper series there (as fixtures)

Not sure what you mean by "as fixtures". As described in my comment above, I don't see a better way to create this index_or_series_obj fixture.



def _create_series(index):
    """ Helper for the _series dict """
    data = np.random.randn(len(index))
    return pd.Series(data, index=index, name=index.name)


_series = {
    f"series-with-{i_id}-index": _create_series(i) for i_id, i in indices_dict.items()
}


def _create_narrow_series(data_dtype):
Contributor

Just call this dtype, and use is_integer and so on for the comparisons; we generally try to avoid NumPy-specific checks, as they don't scale to all of our dtypes.

Contributor Author

@jreback I just tried that, but is_integer(np.int16) is actually False. The same holds for the other dtypes and for is_float 😐

    """ Helper for the _narrow_series dict """
    index = indices_dict["int"].copy()
    size = len(index)
    if np.issubdtype(data_dtype, np.floating):
        # floats: draw from a standard normal distribution
        data = np.random.randn(size)
    elif np.issubdtype(data_dtype, np.integer):
        # integers: a random permutation of 0..size-1
        data = np.random.choice(size, size=size, replace=False)
    else:
        raise ValueError(f"Received an unexpected data_dtype: {data_dtype}")

    return pd.Series(data.astype(data_dtype), index=index, name="a")
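
The is_integer result reported in the thread above may come down to scalar-level versus dtype-level helpers. A minimal sketch, assuming the reviewer had the dtype-level checks in pandas.api.types in mind (is_integer_dtype, is_float_dtype); that reading is an assumption and is not stated in the thread:

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer, is_integer_dtype

# Scalar-level check: np.int16 is a type object, not an integer value.
scalar_check_on_type = is_integer(np.int16)
scalar_check_on_value = is_integer(np.int16(3))

# Dtype-level checks accept NumPy types as well as pandas extension dtypes.
int16_is_integer = is_integer_dtype(np.int16)
uint8_is_integer = is_integer_dtype(np.uint8)
float32_is_float = is_float_dtype(np.float32)
nullable_is_integer = is_integer_dtype(pd.Int64Dtype())

print(scalar_check_on_type, scalar_check_on_value)
print(int16_is_integer, uint8_is_integer, float32_is_float, nullable_is_integer)
```

If that reading is right, the dtype-level helpers could stand in for the np.issubdtype checks in the helper while also covering extension dtypes.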


_narrow_series = {
    "float32-series": _create_narrow_series(np.float32),
    "int8-series": _create_narrow_series(np.int8),
    "int16-series": _create_narrow_series(np.int16),
    "int32-series": _create_narrow_series(np.int32),
    "uint8-series": _create_narrow_series(np.uint8),
    "uint16-series": _create_narrow_series(np.uint16),
    "uint32-series": _create_narrow_series(np.uint32),
}

_all_objs = {**indices_dict, **_series, **_narrow_series}


@pytest.fixture(params=_all_objs.keys())
def index_or_series_obj(request):
Member

Hmm, I think this is a rather complicated way of composing things. Given that there is already a fixture for indices that does something very similar, can we not just modify this to use that and create the Series objects as required?

Contributor Author

I'm not sure I understand you correctly. What do you mean by "create the Series objects as required"?

I need a fixture to replace self.all_objs from the Ops mixin. This will be used in many more places in pandas/tests/base/test_ops.py, so I can't just construct the Series objects in the test cases.

I was hoping that there are other fixtures which I could use instead of _series and _narrow_series, but I think I still know too little about the codebase to find them. I'd be grateful for any hints to point me in the right direction ✌️

Member

To rephrase: I'm asking whether there is a way to leverage the already existing indices fixture, as this duplicates part of that fixture's logic.

Contributor Author

@SaturnFromTitan, Feb 8, 2020

Not really. I checked two possibilities:

  1. Creating more parametrized fixtures for series and narrow_series:
@pytest.fixture
def series(indices):
    data = np.random.randn(len(indices))
    return pd.Series(data, index=indices, name=indices.name)

The problem is that when using these fixtures in a test like

def test1(indices, series):
    ...

the product of all fixture values is executed instead of the union, so a lot of redundant test cases are run.

  2. Letting index_or_series_obj depend on indices:
    The problem is that index_or_series_obj is supposed to have a different length, so this doesn't really work either.

-> Based on this, the current implementation seems to be the best we can do.
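
To make the product-versus-union point concrete, here is a rough count in plain Python. It is only a sketch: the id lists are hypothetical stand-ins for the real fixture parameters, and it assumes the two fixtures are parametrized independently of each other.

```python
from itertools import product

# Hypothetical parameter ids, stand-ins for the real fixture params.
index_ids = ["int", "float", "string"]
series_ids = [f"series-with-{i_id}-index" for i_id in index_ids]

# Two independently parametrized fixtures requested by one test:
# the test runs once per combination of parameter values.
product_cases = list(product(index_ids, series_ids))

# One fixture parametrized over the merged keys (the approach in this PR):
# the test runs once per object.
union_cases = index_ids + series_ids

print(len(product_cases), len(union_cases))
```

The gap widens quickly as either parameter list grows, which is the redundancy the comment describes.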

Contributor Author

Issue 1 and a potential solution (which requires a new dependency) are well explained in this SO post.

    """
    Fixture for tests on indexes, series and series with a narrow dtype
    copy to avoid mutation, e.g. setting .name
    """
    return _all_objs[request.param].copy(deep=True)
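
The docstring's "copy to avoid mutation" remark can be illustrated outside of pytest. The following is a minimal sketch under stated assumptions: `_shared` and `get_obj` are hypothetical stand-ins for the module-level `_all_objs` dict and the fixture body, not names from the PR.

```python
import pandas as pd

# Hypothetical stand-in for the module-level dict of shared objects.
_shared = {
    "index": pd.Index([1, 2, 3], name="idx"),
    "series": pd.Series([0.1, 0.2, 0.3], name="ser"),
}


def get_obj(key):
    """Return a deep copy, mirroring what the fixture does."""
    return _shared[key].copy(deep=True)


# Mutate the copies the way a test might.
for key in _shared:
    obj = get_obj(key)
    obj.name = "renamed"

ser = get_obj("series")
ser.iloc[0] = 99.0

# The shared objects keep their original names and values.
names_after = {key: obj.name for key, obj in _shared.items()}
first_value_after = _shared["series"].iloc[0]
print(names_after, first_value_after)
```

Without the copy, a test that renames or modifies the object would leak that change into every later test drawing from the same dict.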
30 changes: 15 additions & 15 deletions pandas/tests/base/test_ops.py
@@ -109,26 +109,26 @@ def test_binary_ops(klass, op_name, op):
         assert expected_str in getattr(klass, "r" + op_name).__doc__


-class TestTranspose(Ops):
+class TestTranspose:
     errmsg = "the 'axes' parameter is not supported"

-    def test_transpose(self):
-        for obj in self.objs:
-            tm.assert_equal(obj.transpose(), obj)
+    def test_transpose(self, index_or_series_obj):
+        obj = index_or_series_obj
+        tm.assert_equal(obj.transpose(), obj)

-    def test_transpose_non_default_axes(self):
-        for obj in self.objs:
-            with pytest.raises(ValueError, match=self.errmsg):
-                obj.transpose(1)
-            with pytest.raises(ValueError, match=self.errmsg):
-                obj.transpose(axes=1)
+    def test_transpose_non_default_axes(self, index_or_series_obj):
+        obj = index_or_series_obj
+        with pytest.raises(ValueError, match=self.errmsg):
+            obj.transpose(1)
+        with pytest.raises(ValueError, match=self.errmsg):
+            obj.transpose(axes=1)

-    def test_numpy_transpose(self):
-        for obj in self.objs:
-            tm.assert_equal(np.transpose(obj), obj)
+    def test_numpy_transpose(self, index_or_series_obj):
+        obj = index_or_series_obj
+        tm.assert_equal(np.transpose(obj), obj)

-            with pytest.raises(ValueError, match=self.errmsg):
-                np.transpose(obj, axes=1)
+        with pytest.raises(ValueError, match=self.errmsg):
+            np.transpose(obj, axes=1)


 class TestIndexOps(Ops):