pd.to/read_sql_table silently corrupts Categorical columns #8624
Comments
Thanks for the catch! This was untested. Seems there is a problem with the
@jreback The values are stored differently in
@jreback Should this be fixed in CategoricalBlock itself? Or should I just catch it in the sql function and reshape it there appropriately for
This is how they are stored: they are unconsolidated (as are sparse blocks), e.g. you cannot usually combine two different categorical columns. You will need to turn them into a full array. You should be using get_values() if this is a block, which will densify these types of structures. Note that you obviously lose the fact that it is a categorical, but CSV/SQL are not able to store this type of metadata.
Ah, yes, using that, the categorical is just returned and written as strings.
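For readers who want to avoid the problem from user code, here is a minimal sketch of the densify-before-writing idea discussed above. It does not use the internal block-level get_values() call; instead it assumes a reasonably recent pandas and converts each categorical column with np.asarray. The helper name densify_categoricals, the table name grades, and the sample data are all hypothetical, chosen only for illustration.

```python
import sqlite3

import numpy as np
import pandas as pd


def densify_categoricals(df):
    """Return a copy of df with categorical columns replaced by dense values.

    Hypothetical helper: the categorical metadata is intentionally dropped,
    since a plain SQL column cannot store it.
    """
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            # np.asarray materialises the category labels as a full array.
            out[col] = np.asarray(out[col])
    return out


# Sample frame with one categorical column (illustrative data only).
df = pd.DataFrame({"id": [1, 2, 3], "grade": pd.Categorical(["a", "b", "a"])})

conn = sqlite3.connect(":memory:")
densify_categoricals(df).to_sql("grades", conn, index=False)
roundtrip = pd.read_sql_query("SELECT id, grade FROM grades ORDER BY id", conn)
conn.close()
```

After the round trip, the grade column holds the plain labels and is no longer categorical; calling astype("category") on it restores the dtype if needed.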
Using a relational DB to store categorical columns in separate tables would be very cool, and rebuilding the frame in pandas with a JOIN across the multiple tables would save time on the wire. It would also save memory if the categorical were built directly.
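A rough sketch of what that suggestion could look like when done by hand, not something pandas does for you: the category labels go into a lookup table, the main table stores only integer codes, and the frame is rebuilt either with a JOIN or directly from the codes. The table names items and color_categories, the column names, and the sample data are assumptions made up for this example.

```python
import sqlite3

import pandas as pd

# Illustrative frame with a categorical column.
df = pd.DataFrame(
    {"id": [1, 2, 3, 4], "color": pd.Categorical(["red", "blue", "red", "green"])}
)

conn = sqlite3.connect(":memory:")

# Lookup table: one row per category label, keyed by its integer code.
cats = list(df["color"].cat.categories)
lookup = pd.DataFrame({"code": list(range(len(cats))), "label": cats})
lookup.to_sql("color_categories", conn, index=False)

# Main table: only the integer codes are stored, not the repeated labels.
main = pd.DataFrame(
    {"id": df["id"], "color_code": df["color"].cat.codes.astype("int64")}
)
main.to_sql("items", conn, index=False)

# Option 1: let the database expand the labels with a JOIN.
joined = pd.read_sql_query(
    "SELECT i.id, c.label AS color FROM items i "
    "JOIN color_categories c ON i.color_code = c.code ORDER BY i.id",
    conn,
)

# Option 2: transfer codes and labels separately and build the categorical
# directly, so the labels are never repeated per row.
labels = pd.read_sql_query(
    "SELECT code, label FROM color_categories ORDER BY code", conn
)
rows = pd.read_sql_query("SELECT id, color_code FROM items ORDER BY id", conn)
rebuilt = pd.DataFrame(
    {
        "id": rows["id"],
        "color": pd.Categorical.from_codes(
            rows["color_code"].to_numpy(), categories=labels["label"].tolist()
        ),
    }
)
conn.close()
```

Option 2 is closer to the memory-saving variant mentioned above, since only small integers and one copy of each label are read; a missing value would travel as the code -1, which Categorical.from_codes treats as NaN.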