Revisit Series implicit size mutability and implicit type conversions #32
Comments
Personally, I'd let the type conversion on mutation go away completely; I don't see any upside to it, especially with a first-class NA value that doesn't force casts. I'm more ambivalent about setting with expansion: sometimes that is useful (or at least convenient), e.g., below with a
|
I'm strongly supportive here -- this is a variation of #21. If we disable automatic reindexing in |
@shoyer IIUC However, there is a special case that needs addressing. In particular, even though this is a discouraged pattern, it's used a lot, e.g.
so an empty Series really can't default to dtype, and needs to be coerced on first fill. IOW, do we have a concept of so this is clear
but is this the same as |
Xarray is immutable in the same way proposed here for pandas: DataFrame columns can be mutated and existing rows can be mutated, but rows cannot be inserted or removed. This requires users to do things in a reasonably efficient way, because, as in pandas, data is ultimately stored in non-resizable arrays. Implicit size mutability is problematic because a naive user might expect the data to be backed by a structure that can be resized efficiently, like Python's |
The fact that this is used a lot tells me there is some API deficiency:
Series is not a dict and was never intended to be fully substitutable for one -- the lack of an empty method may be the culprit. We should look more at this |
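One way to avoid filling an initially empty Series label by label is to accumulate values in a plain dict and build the Series once at the end. The following is only a sketch of that idea, not a proposed API; the `records` data and the choice of `float64` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical input: (label, value) pairs arriving one at a time.
records = [("a", 1.5), ("b", 2.5), ("c", 3.5)]

# Accumulate in a dict, which is designed for cheap insertion.
values = {}
for label, value in records:
    values[label] = value

# Construct the Series once, with an explicit dtype,
# instead of enlarging an empty Series on every assignment.
s = pd.Series(values, dtype="float64")
```

With this approach the dtype is stated up front, so nothing has to be coerced on first fill.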
related: pandas-dev/pandas#9738 |
I haven't been able to wrap my head around these behaviors:
or
From first principles, I think these should raise KeyError and ValueError/TypeError, respectively. I'm concerned that preserving these APIs (especially implicit type changes) is going to be problematic for pandas 2.0 if our goal is to provide more precise / consistent / explicit behavior, particularly with respect to data types. If you want to change the type of something, you should cast (unless there is a standard implicit cast, e.g. int -> float in arithmetic). Implicit size mutation seems like a recipe for bugs to me (or simply a way of working around coding anti-patterns; it is better to reindex explicitly and then assign values).
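As a rough sketch of the explicit alternatives mentioned above (reindex before assigning to a new label, cast before assigning a value of another type), using only existing pandas calls. The labels, values, and variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"], dtype="int64")

# Explicit size change: reindex to the new set of labels first,
# then assign to a label that now exists.
# fill_value=0 is an arbitrary placeholder chosen so the dtype stays integer.
grown = s.reindex(["a", "b", "c", "d"], fill_value=0)
grown.loc["d"] = 4

# Explicit type change: cast first, then assign the float value.
as_float = s.astype("float64")
as_float.loc["a"] = 1.5
```

Both `reindex` and `astype` return new objects here, so the original `s` keeps its length and dtype.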
Let me know other thoughts about this.