Feature/memory zero copy #152
By the way, reconstruction of
|
Implement support for aligned memory allocation, exposing Changed constructor of Non-positive alignment values are ignored and |
1. Memory class exposes the interface
2. Memory variant constructors can consume objects exposing the said interface, and take over memory zero copy.
3. Class implements
   - `memInst.copy_to_host(pyobj=None)`: If `pyobj` supports Python's buffer protocol, the content of USM memory in the instance is copied to the host buffer. Otherwise, a bytearray is allocated, populated and returned.
   - `memInst.copy_from_host(pyobj)`: Copies the buffer of `pyobj` into USM memory of the instance. Raises an exception if `pyobj` is not a byte array.
   - `memInst.copy_from_device(sycl_usm_obj)`: Copies USM memory of `sycl_usm_obj`, which exposes `__sycl_usm_array_interface__`, into USM memory of the instance.
4. Class is pickleable
5. Class implements a `tobytes` method that produces a bytes object populated by the content of USM memory.

Methods are currently not releasing the GIL, but I think they should.
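To make the copy semantics described in this commit concrete, here is a minimal host-only sketch. It is not dpctl code: `HostBackedMemory` is a hypothetical stand-in in which a plain `bytearray` plays the role of the USM allocation, so it runs without a SYCL device. Unlike the description above, this stand-in accepts any object supporting the buffer protocol in `copy_from_host`; that relaxation is an assumption of the sketch.

```python
class HostBackedMemory:
    """Hypothetical stand-in: a bytearray takes the place of USM memory."""

    def __init__(self, nbytes):
        self.nbytes = int(nbytes)
        self._store = bytearray(self.nbytes)

    def copy_from_host(self, pyobj):
        # memoryview() raises TypeError if pyobj lacks the buffer protocol.
        src = memoryview(pyobj).cast("B")
        if src.nbytes > self.nbytes:
            raise ValueError("source buffer is larger than the allocation")
        self._store[: src.nbytes] = src

    def copy_to_host(self, pyobj=None):
        try:
            dst = memoryview(pyobj).cast("B")
        except TypeError:
            # No usable buffer (including pyobj=None): allocate and return one.
            return bytearray(self._store)
        if dst.readonly or dst.nbytes < self.nbytes:
            raise ValueError("destination must be writable and large enough")
        dst[: self.nbytes] = self._store
        return pyobj

    def tobytes(self):
        # Immutable snapshot of the current content.
        return bytes(self._store)
```

The fallback branch mirrors the described behavior of returning a freshly populated `bytearray` when no host buffer is supplied.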
modularized test, + changes per black
Previously it would always produce shared memory on unpickling.
queue can no longer be specified via a positional argument, only through a keyword, to allow a user to specify alignment but not queue. The SYCL spec says that aligned allocation may return a null pointer when the requested alignment is not supported by the device. Non-positive alignments silently go unused (i.e. DPPLmalloc_* is used instead of DPPL_aligned_alloc_*)
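The signature change described in this commit can be sketched in plain Python. `choose_allocator` is a hypothetical helper, not part of dpctl; it only illustrates a keyword-only `queue` parameter and the rule that non-positive alignments fall back to the unaligned allocator. The returned strings are labels for this sketch, not real function names.

```python
def choose_allocator(nbytes, alignment=0, *, queue=None):
    """Hypothetical helper: report which allocator family would be picked.

    The bare * makes `queue` keyword-only, so the second positional
    argument is always read as the alignment.
    """
    if alignment > 0:
        # Stands in for the DPPL_aligned_alloc_* family.
        return ("aligned_alloc", nbytes, alignment)
    # Non-positive alignment is silently ignored: DPPLmalloc_* family.
    return ("malloc", nbytes, None)
```

With this shape, `choose_allocator(256, 64)` requests alignment without naming a queue, while passing a queue positionally raises `TypeError`.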
Added DPPLDevice_AreEq to check if two devices are pointer equal. Used in the test.
1. In that case we can avoid making the chain of reference objects very long.

```
In [1]: import dpctl, dpctl.memory as dpmem

In [2]: m = dpmem.MemoryUSMShared(256)

In [3]: m2 = dpmem.MemoryUSMShared(m)

In [4]: m3 = dpmem.MemoryUSMShared(m2)

In [5]: m3.reference_obj is m
Out[5]: True

In [6]: m2.reference_obj is m
Out[6]: True

In [7]: m2._pointer
Out[7]: 94798596370432

In [8]: m3._pointer
Out[8]: 94798596370432

In [9]: m._pointer
Out[9]: 94798596370432
```
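The behavior in the session above, where every view refers directly to the original owner, can be sketched as follows. `FakeMemory` and its fake address counter are hypothetical and exist only for illustration; the actual dpctl implementation may differ.

```python
import itertools

# Hypothetical source of distinct fake "addresses" for owning allocations.
_fake_addresses = itertools.count(0x1000, 0x1000)


class FakeMemory:
    """Hypothetical sketch of a memory object that can wrap another one."""

    def __init__(self, arg):
        if isinstance(arg, FakeMemory):
            # Zero-copy view: share the pointer and keep the owner alive.
            self._pointer = arg._pointer
            self.nbytes = arg.nbytes
            # Point at the root owner rather than at the intermediate
            # view, so the chain of references never grows past one hop.
            self.reference_obj = (
                arg if arg.reference_obj is None else arg.reference_obj
            )
        else:
            # Owning allocation: nothing to refer back to.
            self._pointer = next(_fake_addresses)
            self.nbytes = int(arg)
            self.reference_obj = None
```

Keeping a single hop means releasing an intermediate view never affects the lifetime of the underlying allocation.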
1. Removed dpctl.memory
2. Exposed MemoryUSMShared, MemoryUSMDevice, MemoryUSMHost to dpctl
3. When dpctl is cimported, MemoryUSMShared, MemoryUSMHost, MemoryUSMDevice and Memory classes are exposed.
Also simplified queue construction from context and device per PR feedback
…r user consumption
Now to access memory objects one does `import dpctl` followed by `import dpctl.memory`.
@diptorupd The refactoring is complete. Lapse in logic on L109 of |
+ formatting changes, + EXPECT_NO_FATAL_FAILURE on prefetch call.
Please take care of the minor formatting stuff and then merge.
Windows build failure in CI is not related to changes in this PR at all - test environment fails to activate. I was able to build the package on Windows manually, and test suite run was clean.
Closes #43, closes #76, closes #45, closes #120.
Work on the `Memory` class and its derivatives.

The constructors of classes `MemoryUSMShared`, `MemoryUSMDevice`, and `MemoryUSMHost` allow specifying the number of bytes to allocate, and optionally the `queue`. Alternatively, they accept a Python object expected to implement the `__sycl_usm_array_interface__` attribute, and create a USM memory object pointing to the same USM memory with zero copy.

Classes also implement several new features:

- `copy_from_host(pyobj)` to copy the content of a Python object's buffer to USM memory.
- `copy_to_host(pyobj=None)` to copy the content of USM memory to a pre-allocated Python object buffer. If the object does not implement the buffer protocol, a `bytearray` of the appropriate size is allocated and populated. The method returns the populated object.
- `copy_from_device(sycl_usm_obj)` to copy the content of USM memory behind a Python object that is expected to implement `__sycl_usm_array_interface__` to the USM memory of the instance.

Classes are now pickleable; pickling is implemented by copying to host and pickling that.
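As a rough illustration of pickling by way of a host copy, the sketch below uses Python's `__reduce__` protocol. All names here (`HostBackedBuffer`, `SharedLike`, `DeviceLike`, `_rebuild`) are hypothetical, and a `bytearray` stands in for USM memory; this is not the dpctl implementation.

```python
def _rebuild(cls, payload):
    """Recreate an instance of the original class from host bytes."""
    obj = cls(len(payload))
    obj.copy_from_host(payload)
    return obj


class HostBackedBuffer:
    """Hypothetical stand-in: a bytearray takes the place of USM memory."""

    def __init__(self, nbytes):
        self._store = bytearray(nbytes)

    def copy_from_host(self, data):
        view = memoryview(data).cast("B")
        self._store[: view.nbytes] = view

    def tobytes(self):
        return bytes(self._store)

    def __reduce__(self):
        # Serialize a host copy and record the concrete class, so that
        # reconstruction yields the same kind of object it started from.
        return (_rebuild, (type(self), self.tobytes()))


class SharedLike(HostBackedBuffer):
    pass


class DeviceLike(HostBackedBuffer):
    pass
```

When these classes live in an importable module, `pickle.loads(pickle.dumps(obj))` goes through the same `__reduce__` path. Recording `type(self)` is what lets a `DeviceLike` object come back as `DeviceLike` rather than as some default variant.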
@Alexander-Makaryev