astype CategoricalDtype on unknown categorical is still unknown #2947
Comments
Looks like this is a bug in pandas, where
In [1]: import pandas as pd
In [2]: a = pd.Series(['a', 'b', 'c'])
In [3]: dtype = pd.api.types.CategoricalDtype(['x', 'y', 'z'])
In [4]: b = a.astype('category')
In [5]: b
Out[5]:
0 a
1 b
2 c
dtype: category
Categories (3, object): [a, b, c]
In [6]: b.astype(dtype)
Out[6]:
0 a
1 b
2 c
dtype: category
Categories (3, object): [a, b, c]
Whoops, thanks :)
Since pandas-dev/pandas#18593 is resolved, do you want to close this? On older versions of pandas, the code above doesn't work in either pandas or dask; on newer versions it works in both. I think this is fine: we try to mimic the behavior of pandas, and users shouldn't rely on non-pandas behavior.
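For reference, here is a minimal sketch of what the snippet above should produce on a pandas version that includes the fix. It assumes a reasonably recent pandas; the variable name c is only for illustration. The point is that astype with an explicit CategoricalDtype replaces the categories, and values not present in the new categories become missing.

```python
import pandas as pd

a = pd.Series(["a", "b", "c"])
dtype = pd.api.types.CategoricalDtype(["x", "y", "z"])

# First conversion infers the categories from the data: a, b, c.
b = a.astype("category")

# Second conversion should adopt the categories from `dtype`
# rather than keeping the inferred ones.
c = b.astype(dtype)

print(c.dtype == dtype)         # categories now come from `dtype`
print(list(c.cat.categories))   # the new categories
print(c.isna().all())           # a, b, c are not in x, y, z, so every value is missing
```

On an affected older version, the same code keeps the original categories, as in the output shown earlier in the thread.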
When converting from an unknown categorical to a CategoricalDtype, we should be able to have known categories:
Should be a similar fix to #2835.