-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
started to fixturize pandas/tests/base #31701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
3a9802d
ac4a5eb
0f88d47
e865e90
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
import numpy as np | ||
import pytest | ||
|
||
import pandas as pd | ||
from pandas.tests.indexes.conftest import indices_dict | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would rather just move the indices_dict to pandas/conftest.py and create these helper series there (as fixtures). ok to do this in 2 PRs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just did it in this one. Please let me know if it's what you intended.
Not sure what you mean with "as fixtures". Like described in my comment above, I don't see a better way to create this |
||
|
||
|
||
def _create_series(index): | ||
""" Helper for the _series dict """ | ||
data = np.random.randn(len(index)) | ||
return pd.Series(data, index=index, name=index.name) | ||
|
||
|
||
_series = { | ||
f"series-with-{i_id}-index": _create_series(i) for i_id, i in indices_dict.items() | ||
} | ||
|
||
|
||
def _create_narrow_series(data_dtype): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. just call this dtype, use the is_integer and so one for comparisons; we really try to avoid using numpy things generally (as they don't scale to all of our dtypes). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jreback I just tried that, but |
||
""" Helper for the _narrow_series dict """ | ||
index = indices_dict["int"].copy() | ||
size = len(index) | ||
if np.issubdtype(data_dtype, np.floating): | ||
data = np.random.choice(size, size=size, replace=False) | ||
elif np.issubdtype(data_dtype, np.integer): | ||
data = np.random.randn(size) | ||
else: | ||
raise ValueError(f"Received an unexpected data_dtype: {data_dtype}") | ||
|
||
return pd.Series(data.astype(data_dtype), index=index, name="a") | ||
|
||
|
||
_narrow_series = { | ||
"float32-series": _create_narrow_series(np.float32), | ||
"int8-series": _create_narrow_series(np.int8), | ||
"int16-series": _create_narrow_series(np.int16), | ||
"int32-series": _create_narrow_series(np.int32), | ||
"uint8-series": _create_narrow_series(np.uint8), | ||
"uint16-series": _create_narrow_series(np.uint16), | ||
"uint32-series": _create_narrow_series(np.uint32), | ||
} | ||
|
||
_all_objs = {**indices_dict, **_series, **_narrow_series} | ||
|
||
|
||
@pytest.fixture(params=_all_objs.keys()) | ||
def index_or_series_obj(request): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm I think this is a rather complicated way of composing things. Given there is already a fixture for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure I understand you right. What do you mean by "create the Series objects as required"? I need a fixture to replace self.all_objs from the I was hoping that there are other fixtures which I could use instead of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To rephrase I'm asking if there is a way to leverage the already existing indices fixture, as this duplicates part of the logic of that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not really. I checked two possibilities:
The problem is, that when using these fixtures in a test like
the product of all fixture values is executed instead of the union. So a lot of redundant test cases are run.
-> Based on this, the current implementation seems to be the best we can do. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Issue 1 and a potential solution (requires a new dependency) is well explained in this SO post. |
||
""" | ||
Fixture for tests on indexes, series and series with a narrow dtype | ||
copy to avoid mutation, e.g. setting .name | ||
""" | ||
return _all_objs[request.param].copy(deep=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a link to the original mixin this aims to replace:
https://github.com/pandas-dev/pandas/blob/master/pandas/tests/base/test_ops.py#L43-L83