-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
REF: handle 2D in tslibs.vectorized #46886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# like we do for the other functions because result is known C-contiguous | ||
# and is the first argument to PyArray_MultiIterNew2. The usual pattern | ||
# does not seem to work with object dtype. | ||
res_flat[i] = res_val |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@seberg is there a more idiomatic way to do this __setitem__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I want to replace it (it is not quite ideal, NumPy uses a different functionality internally now). This looks like it should be fine to use PyArray_SETITEM(arr, pointer, res_val)
as the "idiomatic" way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. To clarify though, I was hoping for something that didn't rely on setting into res_flat
, as that requires knowledge about the inner workings of the cnp.broadcast object that i only kinda-get, and that only because you and bashtage explained it in the comments to one of his PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are not trying to avoid Cython memory-views due to the buffer export overhead for smaller arrays, right?
In that case, I would suggest using (untested):
object res_flat[::1] = PyArray_BYTES(arr) # if that works, or the ravel()
That way, cython can deal with the correct assignment, and you know that object
is right here. (You can avoid the memoryview, but I think you may have to Py_INCREF
manually then.)
There could be faster/better ways to iterate the other array as well, but I suppose since this works with Python objects, it won't matter much. However, I think you can still replace the MultiIter
with a single array iterator in any case (just make sure that one iterates in C order). There is no reason to iteraterate res
, since that is effectively unused.
It would be really nice to have a "ufunc-like" pattern for this type of thing, but I would need more dedicated time to figure that out.
(The point being, that you would define a single "strided loop" for 1-D arrays, and than add machinery/code generation to enable to for the N-D case)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are not trying to avoid Cython memory-views due to the buffer export overhead for smaller arrays, right?
Never crossed my mind.
The question may be clarified by describing the failure mode I am looking to avoid:
In 6 months another contributor goes to make the analogous change to another function, uses this pattern as a template, but instead of mi = cnp.PyArray_MultiIterNew2(result, stamps)
reverses the arguments as mi = cnp.PyArray_MultiIterNew2(stamps, result)
, so setting res_flat[i] = res_val
is no longer necessarily correct.
i.e. using res_flat
at all is implicitly relying on knowledge of MultiIter
internals that AFAIK is not documented. It just seems like a code smell. Is that clearer?
but I think you may have to Py_INCREF manually then
Definitely don't want to be in that business.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are only relying on the C-order iteration I think. Which I believe MultiIter guarantees (but I am not sure). In any case, you do not need MultiIter, you only need PyArray_IterNew
with a comment that it iterates in C-order and is thus compatible with the res_flat
.
Then you can use object res_flat[::1] = res.ravel()
for assignment, relying on the fact that the typed memoryview is typed correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I think I was a bit too fuzzy about what I meant, so here is the working diff in the details.
So long the result is allocated as C-order, the same pattern could also be used elsewhere for small speedups ("iterating" one instead of two arrays), but I think here it may be particularly worthwhile, because I am not certain that cython
can optimize the item assignment otherwise.
It would be cool if cython had a way to use a more "ufunc-like" pattern for these iterations, but I guess that that would be a pretty big new thing...
diff --git a/pandas/_libs/tslibs/vectorized.pyx b/pandas/_libs/tslibs/vectorized.pyx
index 3f0f06d874..6764127552 100644
--- a/pandas/_libs/tslibs/vectorized.pyx
+++ b/pandas/_libs/tslibs/vectorized.pyx
@@ -135,9 +135,11 @@ def ints_to_pydatetime(
bint use_date = False, use_time = False, use_ts = False, use_pydt = False
object res_val
+ # Note that `result` (and thus `result_flat`) is C-order and
+ # `stamps_iter` iterates C-order as well, so the iteration matches:
ndarray result = cnp.PyArray_EMPTY(stamps.ndim, stamps.shape, cnp.NPY_OBJECT, 0)
- ndarray res_flat = result.ravel() # should NOT be a copy
- cnp.broadcast mi = cnp.PyArray_MultiIterNew2(result, stamps)
+ object[::1] res_flat = result.ravel() # should NOT be a copy
+ cnp.flatiter stamps_iter = cnp.PyArray_IterNew(stamps)
if box == "date":
assert (tz is None), "tz should be None when converting to date"
@@ -155,7 +157,7 @@ def ints_to_pydatetime(
for i in range(n):
# Analogous to: utc_val = stamps[i]
- utc_val = (<int64_t*>cnp.PyArray_MultiIter_DATA(mi, 1))[0]
+ utc_val = (<int64_t*>cnp.PyArray_ITER_DATA(stamps_iter))[0]
new_tz = tz
@@ -183,15 +185,11 @@ def ints_to_pydatetime(
else:
res_val = time(dts.hour, dts.min, dts.sec, dts.us, new_tz, fold=fold)
- # Note: we can index result directly instead of using PyArray_MultiIter_DATA
- # like we do for the other functions because result is known C-contiguous
- # and is the first argument to PyArray_MultiIterNew2. The usual pattern
- # does not seem to work with object dtype.
- # See discussion at
- # github.com/pandas-dev/pandas/pull/46886#discussion_r860261305
+ # Note: We use a typed memory-view here to use cython for handling
+ # need for object ownership handling (reference counting).
res_flat[i] = res_val
- cnp.PyArray_MultiIter_NEXT(mi)
+ cnp.PyArray_ITER_NEXT(stamps_iter)
return result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you only need PyArray_IterNew with a comment that it iterates in C-order
huh i guess i assumed it iterated in a cheapest-available order, so would be F-order if the array is F-contiguous. Good to know, thanks!
Added a comment pointing back to the discussion here; i think that's the best we'll be able to do for the forseeable future. cc @mroeschke |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.