BUG: dtype comparison between numpy dtypes and pandas dtypes fail #19238


Closed
topper-123 opened this issue Jan 14, 2018 · 4 comments
Labels
API Design Dtype Conversions Unexpected or buggy dtype conversions

Comments

@topper-123
Contributor

Code Sample, a copy-pastable example if possible

>>> import numpy as np
>>> import pandas as pd
>>> ndt = np.dtype(object)
>>> pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])
>>> pdt == ndt
False  # ok
>>> ndt == pdt
TypeError: data type not understood

Problem description

The dtypes are not always comparable: the same issue occurs with IntervalDtype, and when the numpy dtype is of another kind (int, float, dates, etc.).

The issue may be a numpy issue rather than a pandas issue, but I'm raising it here for discussion first and can file an issue in the numpy repository later.
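
For example, the same asymmetry shows up when the numpy dtype is not object. A minimal sketch (behaviour as observed here with numpy 1.13; newer numpy releases may warn or return False instead of raising):

>>> import numpy as np
>>> import pandas as pd
>>> pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])
>>> pdt == np.dtype('int64')
False  # pandas dtype on the left-hand side works
>>> np.dtype('int64') == pdt
TypeError: data type not understood  # numpy dtype on the left-hand side raises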

Expected Output

The expected output was False for both comparisons.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 78d4e5d
python: 3.6.3.final.0
python-bits: 32
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0.dev0+561.g78d4e5d
pytest: 3.3.1
pip: 9.0.1
setuptools: 38.2.5
Cython: 0.26.1
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jreback
Contributor

jreback commented Jan 14, 2018

You can see the original issue that I filed against numpy in 2014, numpy/numpy#5329, as well as #8814.

This is a known numpy issue that won't be fixed :<

@jreback jreback closed this as completed Jan 14, 2018
@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions API Design labels Jan 14, 2018
@jreback jreback added this to the No action milestone Jan 14, 2018
@jschendel
Member

Note that pandas.core.dtypes.common.is_dtype_equal allows for safe comparisons between numpy and pandas dtypes:

In [2]: ndt = np.dtype(object)

In [3]: pdt = pd.api.types.CategoricalDtype(categories=['German', 'English', 'French'])

In [4]: pd.core.dtypes.common.is_dtype_equal(ndt, pdt)
Out[4]: False

In [5]: pd.core.dtypes.common.is_dtype_equal(pdt, ndt)
Out[5]: False

@topper-123
Contributor Author

Nice, thanks @jschendel. This solves my problem, though this will probably trip up some people from time to time.

BTW, this is also part of the public API as pd.api.types.is_dtype_equal.
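
For completeness, a quick check through the public alias (continuing the session from the original example; this should mirror the is_dtype_equal results above):

>>> pd.api.types.is_dtype_equal(ndt, pdt)
False
>>> pd.api.types.is_dtype_equal(pdt, ndt)
False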

@adamczykm

I, and probably many others, have wasted time because of this unexpected behaviour when comparing dtypes with '=='.
Why can't it be implemented in a numpy-compatible fashion -.- ?
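
For anyone hit by the TypeError directly, a minimal defensive sketch (the helper name here is made up; pandas' is_dtype_equal does roughly this internally):

def dtypes_equal(left, right):
    # Hypothetical helper: treat numpy's "data type not understood"
    # TypeError as "not equal" instead of letting it propagate.
    try:
        return left == right
    except TypeError:
        return False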
