ENH: Specify how pandas infers dtype on objects #41848
Comments
The supported way to do this is to place your objects in your
from physipy import m
s = pd.Series(QuantityArray([1, 2, 3]*m))
pandas will then set the dtype of your series to
Will this work for you?
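For illustration, the same behaviour can be seen with extension arrays that ship with pandas. This is only a sketch: it uses `pd.array` and `pd.Categorical` in place of physipy's `QuantityArray`, so it runs without physipy installed, and the variable names are made up for the example.

```python
import pandas as pd

# Plain Python integers: pandas infers a NumPy-backed integer dtype.
plain = pd.Series([1, 2, 3])

# The same values wrapped in an extension array: the Series takes its
# dtype from the array, with no dtype= argument needed.
wrapped = pd.Series(pd.array([1, 2, 3], dtype="Int64"))

# Categorical is also an extension array, so it carries its dtype too.
categorical = pd.Series(pd.Categorical(["a", "b", "a"]))

print(plain.dtype)
print(wrapped.dtype)       # Int64
print(categorical.dtype)   # category
```

In each wrapped case the dtype is read from the array itself rather than inferred from the scalar elements, which is why the wrapper is needed today.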
This does work, but I find it heavier than what I hoped for: semantically, my quantity object
For reference, matplotlib's unit interface does something like this. Basically, you define the conversion interface (kind of privately, developer-side), then register the class with its interface:
# Finally we register our object type with the Matplotlib units registry.
units.registry[datetime.date] = DateConverter()
Then any user can use the plotting interface for a
plt.plot(datetime.date.today())
as opposed to a heavier:
plt.plot(ConversionInterfaceDatetime(datetime.date.today()))
For example, would it be possible to extend infer_dtype, to create the proper mapping between a
Maybe my problem is that I don't see (yet, probably) the added value of the "wrapper" ExtensionArray (it feels like my base object plus the ExtensionType would suffice), but I definitely don't have a broad view over the subject. I should say that my base object, like
xref #27462 for the analogous issue for
The solution here is going to involve having a check in infer_dtype along the lines of
Hello there
Is your feature request related to a problem?
Context: I am creating a package to handle physical units (yes, another one), and I started working on the pandas interface implementation. I looked into the pandas extension page, as well as what pint did with pint-pandas. I am pretty satisfied with the result, except for one thing: when creating pandas objects (Series or DataFrame), I have to explicitly specify which dtype (using my DtypeExtension for my "Quantity" class) pandas should use to cast my Quantity object to the corresponding QuantityArrayExtension. Categorical objects kind of exhibit the same problem:
Now, I understand that for the Categorical example, it is not obvious what kind of dtype pandas should use, but for my custom class, I would like to be able to tell pandas how to behave.
Describe the solution you'd like
I would expect some interface like this:
Here, pandas admits it doesn't know the passed object's type, and so checks in its dtype_lut whether a corresponding dtype is set.
Another interface would be to add a method with a pandas-specific name to Quantity that does this look-up:
so that when pandas encounters an unknown object type, it first tries to get its Dtype using "obj.pd_type()".
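A rough sketch of both ideas in plain Python, with stand-in classes and no pandas internals involved. Only `dtype_lut` and `pd_type` come from the proposal above; `resolve_dtype`, `QuantityDtype`, `Angle`, and the stand-in `Quantity` are hypothetical names for illustration, and none of this is existing pandas API:

```python
class QuantityDtype:
    """Stand-in for the ExtensionDtype describing Quantity values."""
    name = "quantity"


class Quantity:
    """Stand-in for the package's scalar quantity class."""
    def __init__(self, value, unit):
        self.value = value
        self.unit = unit


class Angle:
    """Hypothetical class that uses the method-based hook instead."""
    def __init__(self, radians):
        self.radians = radians

    def pd_type(self):
        # Proposal 2: the object itself reports which dtype to use.
        return QuantityDtype()


# Proposal 1: a look-up table from scalar type to dtype class.
dtype_lut = {Quantity: QuantityDtype}


def resolve_dtype(obj):
    """Return a dtype instance for obj, or None if nothing is registered."""
    # Walk the MRO so subclasses of a registered type are also matched.
    for cls in type(obj).__mro__:
        if cls in dtype_lut:
            return dtype_lut[cls]()
    # Fall back to the method-based hook, if the object defines one.
    hook = getattr(obj, "pd_type", None)
    if callable(hook):
        return hook()
    return None


print(type(resolve_dtype(Quantity(1.0, "m"))).__name__)  # QuantityDtype
print(type(resolve_dtype(Angle(0.5))).__name__)          # QuantityDtype
print(resolve_dtype(3.14))                               # None
```

Returning None for unregistered objects would let the caller fall through to the usual inference, so existing behaviour stays unchanged for types nobody has registered.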
Cheers