proposal: sync: SlicePool for variable-sized slices #73620
Comments
Personally I would change the signature to just: see #73598 (comment).
Updated 👍🏻
How will the code decide which sizes are reasonable to re-use and when to trigger a new allocation? The reason I am asking is that this is not something I have a "generically solved" solution for myself. If you have …
@klauspost I realize this is not clear from the proposal: what I was imagining is that … This is the simplest thing I could come up with; more complex solutions are definitely possible. Then, the usual … As an added protection against misuse, Put may even try to detect whether the slice capacity does not match the size class of the backing storage (e.g. a slice originally allocated as …).
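A tiny sketch of what such a check could look like, assuming power-of-two size classes purely for illustration; sizeClass and looksResliced are made-up names, not part of the proposal:

```go
package main

import (
	"fmt"
	"math/bits"
)

// sizeClass rounds n up to the next power of two; a stand-in for
// whatever size classes a real pool would actually use.
func sizeClass(n int) int {
	if n <= 1 {
		return 1
	}
	return 1 << bits.Len(uint(n-1))
}

// looksResliced reports whether s's capacity cannot have come straight
// from such a pool, i.e. it is not exactly a size class. A Put
// implementation could use this to silently drop suspicious slices.
func looksResliced(s []byte) bool {
	return cap(s) != sizeClass(cap(s))
}

func main() {
	orig := make([]byte, 100, sizeClass(100)) // cap 128, as a pool's Get might allocate
	fmt.Println(looksResliced(orig))          // false: cap 128 is a size class
	fmt.Println(looksResliced(orig[10:50]))   // true: cap 118 is not a size class
}
```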
@mateusz834 My main concern is someone subslicing a slice, e.g. … We posted almost at the same time so you may have missed it, but this is the reason why I wrote above:
Is that the expected (or desired) behavior? Consider:

```go
b := p.Get()
b = b[:0]
b = append(b, 1)
p.Put(b)
```

I believe that …
@mateusz834 my thinking there was just that there are good reasons for all options: … so I just went for the one that seemed to require the fewest assumptions about how the pool will be used.
Would sync/v2 not solve this? Is a new type really necessary?
I was also thinking that if we decided to do a …
Makes sense, but on the other hand I wonder whether this is something that is currently being done in practice. The cleanup might also be done before calling …
I think having a nil New is something we should still allow; sometimes the semantics of "give me something from the pool only if it is already there" are useful. IIRC even in sync/v2 there are arguments in favour of maintaining this.
I can only say for sure that I have often relied on the fact that the contents of what you put in the pool are preserved (e.g. knowing that objects I get out of the pool are in a very specific state that was set when I put them into the pool). Whether this is widespread practice I don't know (https://github.com/josharian/intern suggests it may be though).
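As a concrete illustration of relying on preserved contents, a simplified sketch in the spirit of github.com/josharian/intern (the map-in-a-pool shape follows that package; the code below is a paraphrase, not its actual source):

```go
package main

import (
	"fmt"
	"sync"
)

// A map stored in a sync.Pool acts as a best-effort string-interning
// cache. This only works because the pool preserves the contents of
// whatever is put into it; if the GC drops the pooled map, we simply
// start a fresh one.
var internPool sync.Pool

func intern(b []byte) string {
	m, _ := internPool.Get().(map[string]string)
	if m == nil {
		m = make(map[string]string)
	}
	defer internPool.Put(m) // contents must survive in the pool to be useful
	if s, ok := m[string(b)]; ok {
		return s // reuse a previously interned string
	}
	s := string(b)
	m[s] = s
	return s
}

func main() {
	a := intern([]byte("hello"))
	b := intern([]byte("hello"))
	fmt.Println(a == b) // true either way; with luck the second call reused the pooled map
}
```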
Log2 buckets are not really great. Especially at smaller sizes they fragment much more than needed. Log10 is better, but over-allocates when sizes get big. Also, the switchover depends on sizeof(T): what makes sense for … I am not really sure a good generic … Usually I end up with a "min_len" and "max_len" depending on the expected payloads. When the cap of a returned slice is > max_len I see it as an outlier and just dump it for the GC. In some cases I add a couple of buckets with cap ranges, using a …
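A minimal sketch of the single-bucket min/max approach described above; the bufpool package, the minLen/maxLen names, and the threshold values are illustrative assumptions, not part of the proposal:

```go
package bufpool

import "sync"

// Assumed thresholds for a typical workload; real values depend on the
// expected payload sizes.
const (
	minLen = 4 << 10  // every pooled buffer starts with at least this capacity
	maxLen = 64 << 10 // anything that grew beyond this is an outlier
)

var pool = sync.Pool{
	New: func() any { return make([]byte, 0, minLen) },
}

// Get returns a zero-length buffer with a capacity of at least minLen.
func Get() []byte {
	return pool.Get().([]byte)[:0]
}

// Put recycles b unless its capacity grew beyond maxLen, in which case
// it is simply dropped and left for the GC.
func Put(b []byte) {
	if cap(b) > maxLen {
		return
	}
	// Note: storing a slice header in a Pool boxes it into an interface,
	// which is exactly the allocation-on-Put footgun mentioned in the proposal.
	pool.Put(b[:0])
}
```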
@klauspost I think what we should end up doing is just what the runtime does when it needs to satisfy an allocation of size x bytes (IIRC it rounds up to the size class for x <= 32KB; for larger allocations it rounds to pages or groups of pages). It may not be perfect, and could potentially result in a large number of ephemeral pools if used inefficiently... or we could just add a MaxCap (that, when > 0, would cause Put to silently discard any slice of capacity larger than MaxCap).
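A rough illustration of that rounding policy; the pageSize constant, the power-of-two rounding, and the roundCap/maxCap names are simplifying assumptions, not the runtime's actual size-class tables:

```go
package slicepool

// pageSize is an assumption for illustration; the real value is a
// runtime implementation detail.
const pageSize = 8 << 10

// roundCap returns the capacity (in elements) a pool might hand out for
// a request of n elements of elemSize bytes each, or n unchanged when
// the request is too large to be pooled (MaxCap semantics).
func roundCap(n, elemSize, maxCap int) int {
	if n <= 0 || elemSize <= 0 {
		return n
	}
	if maxCap > 0 && n > maxCap {
		return n // Put would silently discard such slices anyway
	}
	size := n * elemSize
	if size <= 32<<10 {
		// Small allocation: round the byte size up to the next power of
		// two. The runtime's real size-class table is finer-grained.
		c := 1
		for c < size {
			c <<= 1
		}
		return c / elemSize
	}
	// Large allocation: round the byte size up to whole pages.
	pages := (size + pageSize - 1) / pageSize
	return pages * pageSize / elemSize
}
```

For example, roundCap(100, 1, 0) reports 128, while a 40,000-byte request gets rounded up to 40,960 bytes (five 8 KiB pages).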
Sharing my slice pool implementation which I've been using. Got a couple of methods in there to calculate memory usage and the number of allocated contiguous blocks. https://github.com/soypat/gsdf/blob/b629e072407e28d69621e4bbba272b3ade11383c/gleval/cpu.go#L222
What are the types we would use this for besides []byte?
@earthboundkid I would say that this would likely be useful in places where we don't have just []byte: …

These are just off the top of my head, and just from looking at uses in the standard library. It is true though that I would expect … It is also true that it would be great if none of this was needed; the old adage "every use of sync.Pool is a bug report on the GC" applies here.
Could the backing memory be used with slices of different element types, to supplement the above case (…)?

```go
type SlicePool struct {
	// ...
}

type SlicePoolHandle[E any] struct {
	sp *SlicePool
}

func GetSlicePoolHandle[E any](sp *SlicePool) SlicePoolHandle[E]

func (sph SlicePoolHandle[E]) Get(n int) []E
func (sph SlicePoolHandle[E]) Put(s []E)
```
But pools only really make sense in cases where they are API-transparent. In all of these cases, the allocated slices are passed out of the control of the respective packages and can't really be pooled.
@Merovius my thinking is that in those cases the intermediate slices could be recycled while the slice is growing, i.e. before it is passed to the outside world. Once it is passed to the outside world, it would never hit the pool again.
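A sketch of that pattern, using a single plain sync.Pool purely for illustration (mixing capacities in one pool like this is exactly the misuse SlicePool aims to make safe); collect and its growth policy are invented for this example:

```go
package main

import (
	"fmt"
	"sync"
)

var scratch sync.Pool // holds []int backing arrays of assorted capacities

// collect copies values into a freshly built slice. Each time the slice
// outgrows its backing array, the old array is returned to the pool
// immediately; only the final slice escapes to the caller and is never
// pooled again.
func collect(values []int) []int {
	var out []int
	if v, _ := scratch.Get().([]int); v != nil {
		out = v[:0]
	}
	for _, x := range values {
		if len(out) == cap(out) {
			grown := make([]int, len(out), 2*cap(out)+1)
			copy(grown, out)
			if cap(out) > 0 {
				scratch.Put(out[:0]) // recycle the outgrown backing array
			}
			out = grown
		}
		out = append(out, x)
	}
	return out
}

func main() {
	fmt.Println(collect([]int{1, 2, 3, 4, 5})) // [1 2 3 4 5]
}
```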
Proposal Details
To address the known footguns of using Pool with variable-sized slices (excessive memory usage on misuse, allocation on Put), I would like to propose the following addition to sync:
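The proposed declarations are not reproduced in this copy of the issue; what follows is only a rough sketch of the shape under discussion, pieced together from the Get(n)/Put/New/MaxCap mentions elsewhere in the thread, and may differ from the actual proposal:

```go
package sliceprop // sketch only; not the proposed API or package

// SlicePool is a hypothetical shape pieced together from the discussion:
// Get(n) returns a slice of length n and capacity >= n, Put recycles a
// slice, New follows the sync.Pool (v1) convention, and MaxCap bounds
// what Put will keep.
type SlicePool[T any] struct {
	// New, if non-nil, builds a new slice when the pool has nothing
	// suitable; its exact signature here is a guess.
	New func(n int) []T

	// MaxCap, when > 0, makes Put silently discard slices with a larger
	// capacity (suggested later in the thread).
	MaxCap int
}

// Get returns a slice with len == n and cap >= n. Placeholder body: a
// real implementation would consult the pool first.
func (p *SlicePool[T]) Get(n int) []T {
	if p.New != nil {
		return p.New(n)
	}
	return make([]T, n)
}

// Put offers s back for reuse. Placeholder body: a real implementation
// would retain the backing storage for later Get calls.
func (p *SlicePool[T]) Put(s []T) {
	if p.MaxCap > 0 && cap(s) > p.MaxCap {
		return // too large to keep
	}
}
```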
The guarantees given allow significant flexibility in implementation, while addressing the aforementioned footguns (note that while the allocations on Put may be solved in sync/v2 or with PoolOf, the problem of excessive memory usage on misuse won't).

Implementations could range incrementally from very simple (e.g. something in the spirit of https://github.com/josharian/intern) to a fully dedicated pool implementation deeply integrated with the memory allocator (e.g. alignment with size classes) and GC (e.g. incremental pool cleanup). While a very crude implementation of this proposal could very well live outside of the standard library, anything more performant would necessarily need to live in the standard library and be integrated with the runtime (as sync.Pool does).

As mentioned in #23199, the http2 logic makes use of a set of pools to handle slices of different sizes, and could likely be switched to SlicePool instead. A quick search on golang/go suggests at least 3 other places where this could be used to replace currently hand-rolled solutions, and another over all other Go code on GitHub suggests a few hundred more (it is true though that not all of these examples are slices [1]). A few other possibilities include JSON unmarshalling, slices.Collect, bytes.Buffer, io.ReadAll, math/big.nat, etc.

As a small note, the current proposal uses the sync.Pool (v1) convention of a New field because this proposal targets the current sync. If it were to target sync/v2, it would need to be adapted to whatever convention ends up being used in v2.

Implementation notes
@klauspost correctly points out that coming up with a generic policy for when to return a slice of size m > n is anything but easy, and it is quite likely completely impractical to come up with something that is optimal for all workloads. My suggestion (for now a bit handwavy) is to use the same (or very similar) logic used by the memory allocator in the runtime, rounding up allocations to size classes with bounded overhead.
Questions answered so far

- Must the slice passed to Put be the same as the slice returned by Get?
- Why are the contents of the slice returned by Get not specified?
- Is sync/v2 going to make this irrelevant?
- What would this be used for besides []byte?

Footnotes
[1] For the non-slices examples I was thinking to later propose a KeyedPool that would be able to handle all of those I've seen so far. KeyedPool would not have the ability to return slices of capacity "at least N", so it would not be a good fit, especially for byte slices.