BUG: assert_series_equal does not compare indices exactly #50473

Closed
3 tasks done
jaapaap79 opened this issue Dec 28, 2022 · 5 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member


@jaapaap79

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df1 = pd.DataFrame(
    [
        (20080813, "CLZ08", 1.5),
        (20080813, "CLZ09", -0.4),
        (20080813, "CLZ08", 1.5),
        (20080814, "CLZ09", -0.4),
    ],
    columns=["date", "asset", "value"],
).set_index(["date", "asset"])["value"]

df2 = pd.DataFrame(
    [
        (20080813, "CLZ08", 1.5),
        (20080813, "CLZ09", -0.4),
        (20080814, "CLZ08", 1.5),
        (20080814, "CLZ09", -0.4),
    ],
    columns=["date", "asset", "value"],
).set_index(["date", "asset"])["value"]

pd.testing.assert_series_equal(df1, df2, check_index_type=True, check_index=True)

Issue Description

assert_series_equal does not raise even though the indexes differ: the date in the third row is 20080813 in df1 and 20080814 in df2.

Expected Behavior

assert_series_equal should raise an AssertionError because the indexes differ.

Installed Versions

INSTALLED VERSIONS
------------------
commit : 8dab54d
python : 3.9.13.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 154 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_United States.1252

pandas : 1.5.2
numpy : 1.21.2
pytz : 2021.1
dateutil : 2.8.2
setuptools : 63.4.1
pip : 22.1.2
Cython : None
pytest : 7.1.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.8.0
html5lib : None
pymysql : None
psycopg2 : 2.9.1
jinja2 : 2.11.3
IPython : 8.5.0
pandas_datareader: 0.10.0
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : 2022.8.2
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 9.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2022.8.2
scipy : 1.7.1
snappy : None
sqlalchemy : 1.3.24
tables : None
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None
tzdata : None

@jaapaap79 jaapaap79 added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 28, 2022
@dicristina
Contributor

For assert_series_equal the default value for check_exact is False. assert_series_equal goes on to call assert_index_equal but it propagates its check_exact value, which is False in the example. This causes the index comparison to be done by assert_almost_equal which does not raise because the values are in fact almost equal according to the default parameters.
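The arithmetic behind this can be checked by hand. The following is a minimal sketch, assuming the approximate comparison behaves like a relative-tolerance test with the documented defaults rtol=1e-5 and atol=1e-8. math.isclose is used here only as a stand-in; it is not claimed to be the function pandas calls internally.

```python
import math

# The two date values that differ between the indexes in the report.
a, b = 20080813, 20080814

# Relative difference between the two integers (roughly 5e-8).
rel_diff = abs(a - b) / max(abs(a), abs(b))
print(rel_diff)

# Stand-in for the approximate comparison with the default tolerances.
close_default = math.isclose(a, b, rel_tol=1e-5, abs_tol=1e-8)

# The same comparison with a tighter relative tolerance of 4e-8.
close_tight = math.isclose(a, b, rel_tol=4e-8, abs_tol=1e-8)

print(close_default, close_tight)
```

Because the relative difference sits between 4e-8 and 1e-5, the default tolerance treats the two dates as almost equal while the tighter one does not.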

The most obvious solution is to pass check_exact=True, but you could also use a data type other than int for the first index level if possible (such as str or datetime64). You could also pass an rtol smaller than the default, such as 4e-8.
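As a rough illustration of two of these workarounds, the sketch below rebuilds the series from the report and records whether assert_series_equal raises under each setting. The helpers make_series and mismatch_detected are hypothetical names introduced only for this example. The outcome with an explicit check_exact=False is printed rather than asserted, since the thread only describes it for pandas 1.5.2.

```python
import pandas as pd


def make_series(third_date):
    """Build the series from the report; only the third row's date varies."""
    rows = [
        (20080813, "CLZ08", 1.5),
        (20080813, "CLZ09", -0.4),
        (third_date, "CLZ08", 1.5),
        (20080814, "CLZ09", -0.4),
    ]
    frame = pd.DataFrame(rows, columns=["date", "asset", "value"])
    return frame.set_index(["date", "asset"])["value"]


def mismatch_detected(left, right, **kwargs):
    """Return True if assert_series_equal raises for the given options."""
    try:
        pd.testing.assert_series_equal(left, right, **kwargs)
    except AssertionError:
        return True
    return False


s1 = make_series(20080813)
s2 = make_series(20080814)

# Workaround 1: compare everything exactly, including the index.
exact = mismatch_detected(s1, s2, check_exact=True)

# Workaround 2: keep the approximate comparison but tighten rtol.
tight = mismatch_detected(s1, s2, check_exact=False, rtol=4e-8)

# Approximate comparison with the default tolerance, as discussed above.
loose = mismatch_detected(s1, s2, check_exact=False)

print(exact, tight, loose)
```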

I recommend closing this issue.

@jaapaap79
Author

Does it really make sense to have assert_almost_equal on the index regardless?

@dicristina
Contributor

I guess that only for a small minority of tests would it be right to compare indices approximately, but I cannot point out even one such test. One could argue for check_exact to be True by default for assert_series_equal, as it is for assert_index_equal.

@MarcoGorelli MarcoGorelli changed the title BUG: BUG: assert_series_equal does not compare indices exactly Dec 31, 2022
@jaapaap79
Author

Or perhaps a separate flag check_index_exact that defaults to True and gets passed through to assert_index_equal?
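Until something like that exists, a similar effect can be approximated in user code. The sketch below is one possible approach, not a pandas API: assert_series_equal_exact_index is a hypothetical helper that first compares the indexes exactly with assert_index_equal, then compares the values with check_index=False so the usual tolerance applies only to the data.

```python
import pandas as pd


def assert_series_equal_exact_index(left, right, **kwargs):
    """Hypothetical helper: exact index check, tolerant value check."""
    # Compare the indexes exactly first.
    pd.testing.assert_index_equal(left.index, right.index, check_exact=True)
    # Then compare the values only, with whatever tolerance the caller passes.
    pd.testing.assert_series_equal(left, right, check_index=False, **kwargs)


a = pd.Series([1.5, -0.4], index=[20080813, 20080814])
b = pd.Series([1.5, -0.4 + 1e-9], index=[20080813, 20080814])
c = pd.Series([1.5, -0.4], index=[20080813, 20080815])

# Same index, values equal within tolerance: no error expected.
assert_series_equal_exact_index(a, b)

# Index differs by one in the last label: the exact index check should raise.
index_error = False
try:
    assert_series_equal_exact_index(a, c)
except AssertionError:
    index_error = True

print(index_error)
```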

@topper-123
Contributor

Thanks for the bug report, @jaapaap79. This is a duplicate of #40719. I suggest continuing the discussion there.

@topper-123 topper-123 closed this as not planned May 10, 2023