Optimize CSR and sorted conversions #11
Conversation
pull out linear_loc to separate function
This stores the last few objects returned from reshape and transpose calls. This allows efficiencies from in-place operations like `sum_duplicates` and `sort_indices` to persist in iterative workflows. Modern NumPy programmers are accustomed to operations like `.transpose()` being cheap and aren't used to paying sorting costs after many computations. These assumptions are no longer true in sparse by default. However, by caching recent transpose and reshape objects we can reuse their in-place modifications. This greatly accelerates common machine learning workloads.
I have extended this to cache the last few results of reshape and transpose. This makes things run quite fast for me in practice. The commit message above describes the approach.
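A minimal sketch of that caching strategy, assuming a hypothetical wrapper class (`CachedOps`, its fixed-size cache, and the method names are illustrative, not this library's actual API):

```python
import numpy as np

class CachedOps:
    """Hypothetical sketch: remember the last few objects returned from
    transpose so that in-place work done on them (e.g. sort_indices)
    is reused on repeated calls instead of being recomputed."""

    def __init__(self, data, maxsize=3):
        self.data = data
        self.maxsize = maxsize
        self._cache = []  # list of (key, result) pairs, newest last

    def transpose(self, axes):
        key = ("transpose", tuple(axes))
        for k, result in self._cache:
            if k == key:
                return result  # same object as before; prior in-place sorting persists
        result = self.data.transpose(axes)
        self._cache.append((key, result))
        del self._cache[:-self.maxsize]  # keep only the most recent few entries
        return result
```

Because repeated calls with the same arguments return the same object, any sorting or duplicate-summing already done on that object is not paid for again.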
Merging shortly if no objections.
Looks good. A couple of things: …
All good points.
For the caching issue I'm tempted to implement the book-keeping necessary for invalidation.
Makes sense. Regarding the book-keeping for caching: we could detect changes in the arrays by making them properties, but that still wouldn't protect against users modifying the contents of these arrays either directly, or via views in derived CSR matrices (for example). One way we might address that is by setting the …
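A sketch of the property-based detection mentioned above (hypothetical names throughout): reassignments go through a setter that clears the cache, but, as the comment notes, in-place mutation of the array's contents would still slip through.

```python
import numpy as np

class Guarded:
    """Hypothetical sketch: route attribute access through a property so
    that reassigning the array invalidates cached results. This does NOT
    catch in-place edits to the array's contents or edits via views."""

    def __init__(self, data):
        self._data = np.asarray(data)
        self._cache = {}

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        self._data = np.asarray(value)
        self._cache.clear()  # reassignment invalidates everything cached

    def sum_cached(self):
        if "sum" not in self._cache:
            self._cache["sum"] = self._data.sum()
        return self._cache["sum"]
```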
Alternatively we could have the user explicitly opt in to caching. This may be simpler all around.
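A sketch of the opt-in variant (class and parameter names are illustrative): caching happens only when the user asks for it, so the default path avoids the invalidation problem entirely.

```python
import numpy as np

class Array:
    """Hypothetical sketch of opt-in caching: with cache=False (the
    default) every call recomputes; with cache=True repeated calls with
    the same arguments return the same object."""

    def __init__(self, data, cache=False):
        self.data = np.asarray(data)
        self._cache = {} if cache else None

    def reshape(self, shape):
        if self._cache is None:  # default: no caching, always recompute
            return self.data.reshape(shape)
        key = ("reshape", tuple(shape))
        if key not in self._cache:
            self._cache[key] = self.data.reshape(shape)
        return self._cache[key]
```

Users who opt in implicitly promise not to mutate the underlying arrays, which is the simple contract suggested above.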
+1, good solution for right now.
Planning to merge this soon if there are no further comments |
Looks good – one thing came to mind though: I wonder if it would be possible to use …
I would love to use a standard LRU data structure. However the …
Makes sense! |
This started by adding the optimized version of `tocsr` from #9 (cc @jakevdp).

Then, to avoid calling sorting routines excessively, we started tracking `sorted=` metadata. Then, to accelerate sorting and to consolidate the lexsort and reshape code, we based all sorting code on a single function called `linear_loc`, which provides the linear location of any index in a C-ordered array. This replaces `np.lexsort` with `np.argsort` and performs a quick `issorted` check beforehand. This can dramatically speed up some workflows, but it does limit the size of our arrays to `2**64` elements (zero or nonzero), so, for example, this library can no longer represent an array of shape `(1e6, 1e6, 1e6, 1e6)`. This was already the case in reshape. I'm personally OK with this for the time being; I think we can probably remove this limitation in the future (with a cost) if necessary.

There is still work to be done to optimize matvecs and matmuls, but this is a decent start.
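To illustrate the idea (a sketch consistent with the description above, not the exact implementation): `linear_loc` collapses N-dimensional coordinates into C-order linear positions so that a single `np.argsort` suffices where `np.lexsort` was needed, and the `uint64` accumulator is what imposes the `2**64`-element limit.

```python
import numpy as np

def linear_loc(coords, shape):
    """Map N-dimensional indices to their linear (flattened, C-order)
    positions. coords has shape (ndim, nnz). Using uint64 caps the total
    number of elements (zero or nonzero) at 2**64."""
    out = np.zeros(coords.shape[1], dtype=np.uint64)
    stride = 1
    for axis in range(coords.shape[0] - 1, -1, -1):  # last axis varies fastest
        out += coords[axis].astype(np.uint64) * stride
        stride *= shape[axis]
    return out

def issorted(x):
    """Quick monotonicity check so already-sorted data skips argsort."""
    return np.all(x[:-1] <= x[1:])
```

Sorting by the single `linear_loc` array with `np.argsort` then yields the same ordering that `np.lexsort` over the per-axis index arrays would produce.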