BUG: dtype comparison between numpy dtypes and pandas dtypes fail #19238


Closed
topper-123 opened this issue Jan 14, 2018 · 4 comments
Labels
API Design Dtype Conversions Unexpected or buggy dtype conversions

Comments

@topper-123
Contributor

Code Sample, a copy-pastable example if possible

>>> import numpy as np
>>> import pandas as pd
>>> ndt = np.dtype(object)
>>> pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])
>>> pdt == ndt
False  # ok
>>> ndt == pdt
TypeError: data type not understood

Problem description

The dtypes are not always comparable: the same issue occurs with IntervalDtype, and when the numpy dtype is of another kind (int, float, dates, etc.).

The issue may be a numpy issue rather than a pandas issue, but I'm raising it here for discussion first and can file an issue in the numpy repository later.
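
For example, the same asymmetry shows up when the numpy dtype is not object. A minimal sketch (behaviour as observed here with numpy 1.13; newer numpy releases may warn or return False instead of raising):

>>> import numpy as np
>>> import pandas as pd
>>> pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])
>>> pdt == np.dtype('int64')
False  # pandas dtype on the left-hand side works
>>> np.dtype('int64') == pdt
TypeError: data type not understood  # numpy dtype on the left-hand side raises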

Expected Output

The expected output was False for both comparisons.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 78d4e5d
python: 3.6.3.final.0
python-bits: 32
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0.dev0+561.g78d4e5d
pytest: 3.3.1
pip: 9.0.1
setuptools: 38.2.5
Cython: 0.26.1
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jreback
Contributor

jreback commented Jan 14, 2018

You can see the original issue that I filed against numpy in 2014, numpy/numpy#5329, as well as #8814.

This is a known numpy issue that won't be fixed :<

@jreback jreback closed this as completed Jan 14, 2018
@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions API Design labels Jan 14, 2018
@jreback jreback added this to the No action milestone Jan 14, 2018
@jschendel
Member

Note that pandas.core.dtypes.common.is_dtype_equal allows for safe comparisons between numpy and pandas dtypes:

In [2]: ndt = np.dtype(object)

In [3]: pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])

In [4]: pd.core.dtypes.common.is_dtype_equal(ndt, pdt)
Out[4]: False

In [5]: pd.core.dtypes.common.is_dtype_equal(pdt, ndt)
Out[5]: False

@topper-123
Contributor Author

Nice, thanks @jschendel. This solves my problem, though this will probably trip up some people from time to time.

BTW, this is also part of the public API as pd.api.types.is_dtype_equal.
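
For completeness, a quick check through the public alias (continuing the session from the original example; this should mirror the is_dtype_equal results above):

>>> pd.api.types.is_dtype_equal(ndt, pdt)
False
>>> pd.api.types.is_dtype_equal(pdt, ndt)
False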

@adamczykm

I, and probably many others, have wasted time because of this unexpected behaviour when comparing dtypes with '=='.
Why can't it be implemented in a numpy-compatible fashion -.- ?
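
For anyone hit by the TypeError directly, a minimal defensive sketch (the helper name here is made up; pandas' is_dtype_equal does roughly this internally):

def dtypes_equal(left, right):
    # Hypothetical helper: treat numpy's "data type not understood"
    # TypeError as "not equal" instead of letting it propagate.
    try:
        return left == right
    except TypeError:
        return False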
