Wrong buffer format returned for similar numpy arrays passed to pybind11 function #1806

JanuszL · 2019-06-13T11:58:40Z

Issue description

A function accepting pybind11::buffer reports a different underlying type for numpy arrays that look identical (at least at first glance). My guess is that this has more to do with numpy, which somehow fails to comply with Python's buffer protocol in some cases, but I want to make sure that pybind11 doesn't do anything that could affect this.

Reproducible example code

import numpy as np

a = np.array([1, 2], dtype=np.longlong)
b = np.array([1, 2], dtype=np.int64)
some_native_function_accepting_py11buffer(a)
some_native_function_accepting_py11buffer(b)

a reports the q format while b reports l, which is wrong in b's case. Numpy reports the dtype as int64 for both of them.
Full repro attached; just run run.sh (and make sure that pybind11 is present in the same directory as run.sh).
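
For readers without the attached repro, here is a minimal sketch of the same effect using only the standard library. It is an illustration, not the original repro: `array.array` stands in for the NumPy arrays, and the names `as_long` and `as_longlong` are made up for this example. It shows that two buffers can expose different format characters even when their items have the same size.

```python
import struct
from array import array

# Stand-ins for the two NumPy arrays: one buffer typed as C long ('l'),
# one typed as C long long ('q').
as_long = memoryview(array("l", [1, 2]))
as_longlong = memoryview(array("q", [1, 2]))

for name, view in (("long", as_long), ("long long", as_longlong)):
    print(f"{name:9s} format={view.format!r} itemsize={view.itemsize}")

# Whether the two item sizes match depends on the platform's C long.
same_size = struct.calcsize("@l") == struct.calcsize("@q")
print("C long and C long long have the same size here:", same_size)
```

Where a C long is 8 bytes, both views report itemsize 8 but the format characters still differ, which is the situation described above.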

eacousineau · 2019-06-19T12:32:19Z

Hm... Mayhaps it's due to an ordering mismatch in pybind11's NumPy type flags?
I had complained about it in this issue:
#1328

But then, uh, let my PR stall:
#1329

Lemme see about dusting that off, just in case it covers your problem at all.

JanuszL · 2019-06-19T13:45:00Z

It seems that it is not that simple. Basically, a py_buffer format can have two characters (https://docs.python.org/3/library/struct.html#byte-order-size-and-alignment):

- a modifier telling whether to use the native or the standard size (some modifiers also deal with byte ordering, but I don't care about those much)
- the actual format

And there can be more than one valid answer (mapping): where a C long is 8 bytes, np.longlong and np.int64 are both 8 bytes long and can be encoded using either l or q (assuming the default @ modifier, which selects the native size).
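
The two-part structure of the format string can be sketched with the standard `struct` module. This is only an illustration; `split_format` and `item_size` are hypothetical helper names, and the sketch handles single-item formats only.

```python
import struct

MODIFIERS = "@=<>!"  # byte order, size, and alignment prefixes from the struct docs

def split_format(fmt):
    """Split a single-item format string into (modifier, code)."""
    if fmt and fmt[0] in MODIFIERS:
        return fmt[0], fmt[1:]
    return "@", fmt  # no prefix means native order, size, and alignment

def item_size(fmt):
    """Size in bytes of one item, honouring the modifier."""
    modifier, code = split_format(fmt)
    return struct.calcsize(modifier + code)

print(split_format("q"), split_format("<l"))
# Standard sizes are fixed: 'l' is 4 bytes and 'q' is 8 bytes.
print("standard:", item_size("=l"), item_size("=q"))
# Native sizes depend on the platform's C long and C long long.
print("native:  ", item_size("@l"), item_size("@q"))
```

With the standard-size modifier the two codes never collide, while with the native modifier they can describe the same 8-byte signed integer.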
The problem I had was that I used the format_descriptor function instantiated for a given type and compared its result with the actual format in the tested py_buffer. For the above reason, this worked fine for the q format but not for l, as the implementation is not that flexible (https://github.com/pybind/pybind11/blob/master/include/pybind11/detail/common.h#L700) and returns only one character for a given data size and sign.
I think this logic should be extended. I went my own way in "Rework how DALI handles py_buffer format string" (NVIDIA/DALI#985), which is not that beautiful but works in all cases I have tested.

YannickJadoul · 2020-07-17T17:32:22Z

Related to the discussion on #1908: i, l, and q are related to the dtypes np.intc, np.long, and np.longlong, and int32 and int64 are only aliases in numpy.

If there's something where pybind11 mismatches pure numpy, please do reopen!

Python's integer format characters are ambiguous and platform dependent. pybind11's `format_descriptor<...>::format()` always returns "q" and "Q" for 64-bit integers, independent of the platform. Compatible passed-in Python buffers, on the other hand, might also have the equivalent format "l" or "L" set. See pybind/pybind11#1806 and pybind/pybind11#1908 for details. This fix introduces a special case for integer format comparisons, just checking size and signedness.
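
A comparison that checks only size and signedness might look roughly like the sketch below. It is written in Python for illustration and is not the code from any of the linked fixes; `integer_kind` and `formats_compatible` are hypothetical names, and byte order is deliberately ignored.

```python
import struct

SIGNED = "bhilq"
UNSIGNED = "BHILQ"

def _split(fmt):
    """Split a format string into (modifier, code); default modifier is '@'."""
    if fmt and fmt[0] in "@=<>!":
        return fmt[0], fmt[1:]
    return "@", fmt

def integer_kind(fmt):
    """Return (size_in_bytes, is_signed) for a single integer format, else None."""
    modifier, code = _split(fmt)
    if len(code) != 1 or code not in SIGNED + UNSIGNED:
        return None
    return struct.calcsize(modifier + code), code in SIGNED

def formats_compatible(expected, actual):
    """Treat integer formats as equal when size and signedness match."""
    expected_kind = integer_kind(expected)
    actual_kind = integer_kind(actual)
    if expected_kind is None or actual_kind is None:
        return expected == actual  # non-integer formats: fall back to exact match
    return expected_kind == actual_kind

# True only on platforms where a native C long is 8 bytes.
print(formats_compatible("q", "l"))
# Signedness differs, so these never match.
print(formats_compatible("q", "Q"))
```

The fallback to an exact string match keeps the special case limited to integers, so floating-point and struct formats behave as before.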

eacousineau mentioned this issue Jun 19, 2019

numpy: Provide concrete size aliases, test equivalence for dtype(...).num #1329

Merged

1 task

YannickJadoul closed this as completed Jul 17, 2020

hawkinsp mentioned this issue Nov 16, 2020

np.cint/np.int32 type confusion jax-ml/jax#4903

Closed

fthaler mentioned this issue Apr 9, 2021

Fix for Integer Format Check in Python SID Adapter GridTools/gridtools#1632

Merged

rwgk mentioned this issue Feb 10, 2023

FWD pybind11 google/pybind11clif#1806

Closed
