proposal: sync: SlicePool for variable-sized slices #73620

Open
CAFxX opened this issue May 7, 2025 · 20 comments
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal

@CAFxX
Contributor

CAFxX commented May 7, 2025

Proposal Details

To address the known footguns of using Pool with variable-sized slices (excessive memory usage on misuse, allocation on Put), I would like to propose the following addition to sync:

package sync

// SlicePool is a generic pool for variable-sized slices.
type SlicePool[E any] struct {
	// New, if non-nil, must return a new slice of the specified capacity.
	New func(int) []E
}

// Get returns a slice of capacity at least n from the pool, if possible.
// If not possible and [New] is not nil, Get returns the value returned by [New](m),
// with m greater than or equal to n.
// Otherwise it returns nil.
//
// Callers must not assume any relationship between individual slices returned by Get
// and slices previously passed as arguments to [Put].
// Callers can assume that the length and contents of the returned slice are either as
// set by [New], or as they were when the slice was passed to [Put].
func (p *SlicePool[E]) Get(n int) []E

// Put adds the slice s to the pool.
// The slice s should not be used after it has been passed to Put.
//
// If the slice has been previously obtained from a call to [Get], when Put is called
// the slice should have the same capacity as it had when it was returned by [Get].
//
// Put is best effort, in that it may silently drop the slice in case it detects it would
// not be beneficial to add it to the pool.
func (p *SlicePool[E]) Put(s []E)

The guarantees given allow significant flexibility in implementation, while addressing the aforementioned footguns (note that while the allocations on Put may be solved in sync/v2 or with PoolOf, the problem of excessive memory usage on misuse won't be).

Implementations could range incrementally from very simple (e.g. something in the spirit of https://github.com/josharian/intern) to a fully dedicated pool implementation deeply integrated with the memory allocator (e.g. alignment with size classes) and GC (e.g. incremental pool cleanup). While a very crude implementation of this proposal could very well live outside of the standard library, anything more performant would necessarily need to live in the standard library, and be integrated with the runtime (as sync.Pool does).
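To make the intended ergonomics concrete, here is a hedged usage sketch. The SlicePool below is a toy stand-in written only so the example compiles and runs; the proposed type does not exist in sync, and a real implementation would be size-class aware and runtime-integrated as described above.

```go
package main

import (
	"fmt"
	"sync"
)

// SlicePool is a toy stand-in for the proposed sync.SlicePool, written
// only so this usage example runs. It ignores size classes entirely.
type SlicePool[E any] struct {
	New  func(int) []E
	pool sync.Pool
}

func (p *SlicePool[E]) Get(n int) []E {
	if v := p.pool.Get(); v != nil {
		if s := v.([]E); cap(s) >= n {
			return s
		}
		// A pooled slice that is too small is simply lost in this sketch.
	}
	if p.New != nil {
		return p.New(n)
	}
	return nil
}

func (p *SlicePool[E]) Put(s []E) { p.pool.Put(s) }

func main() {
	var bufs SlicePool[byte]
	bufs.New = func(n int) []byte { return make([]byte, n) }

	b := bufs.Get(4096) // capacity at least 4096
	// ... use b as a scratch buffer ...
	bufs.Put(b) // done: hand it back for reuse

	fmt.Println(cap(bufs.Get(4096)) >= 4096) // true
}
```

Note that this stand-in still boxes the slice header on Put, which is exactly the allocation footgun the proposal aims to eliminate.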

As mentioned in #23199 the http2 logic makes use of a set of pools to handle slices of different sizes, and could likely be switched to SlicePool instead. A quick search on golang/go suggests at least 3 other places where this could be used to replace currently hand-rolled solutions, and another search over all other Go code on GitHub suggests a few hundred more (though it is true that not all of these examples are slices¹). A few other possibilities include JSON unmarshalling, slices.Collect, bytes.Buffer, io.ReadAll, math/big.nat, etc.

As a small note, the current proposal uses the sync.Pool (v1) convention of a New field because this proposal targets the current sync. If it were to target sync/v2 it would need to be adapted to whatever convention ends up being used in v2.

Implementation notes

@klauspost correctly points out that coming up with a generic policy for when to return a slice of size m > n is anything but easy, and it is quite likely completely impractical to come up with something that is optimal for all workloads. My suggestion (for now a bit handwavy) is to use the same (or very similar) logic used by the memory allocator in the runtime, rounding up allocations to size classes with bounded overhead.
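As a concrete illustration of "rounding up to size classes with bounded overhead", here is a crude power-of-two version; the runtime's actual size classes are finer-grained, and roundUpPow2 is a name invented for this sketch.

```go
package main

import (
	"fmt"
	"math/bits"
)

// roundUpPow2 rounds n up to the next power of two: a crude stand-in
// for the runtime's size-class rounding. With power-of-two classes the
// overhead is bounded: Get(n) never returns more than 2x the request.
func roundUpPow2(n int) int {
	if n <= 1 {
		return 1
	}
	return 1 << bits.Len(uint(n-1))
}

func main() {
	for _, n := range []int{1, 5, 3000, 4096} {
		fmt.Println(n, "->", roundUpPow2(n))
	}
	// 1 -> 1, 5 -> 8, 3000 -> 4096, 4096 -> 4096
}
```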

Questions answered so far

Footnotes

  1. For the non-slices examples I was thinking to later propose a KeyedPool that would be able to handle all of those I've seen so far. KeyedPool would not have the ability to return slices of capacity "at least N", so it would not be a good fit especially for byte slices.

@CAFxX CAFxX added the Proposal label May 7, 2025
@gopherbot gopherbot added this to the Proposal milestone May 7, 2025
@mateusz834
Member

Personally I would change the signature to just: type SlicePool[E any] struct {}. It is pretty easy to convert []E into S; it does not cause any performance penalties, but has the benefit of a clearer type signature.

See #73598 (comment)

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

Personally I would change the signature to just: type SlicePool[E any] struct {}. It is pretty easy to convert []E into S; it does not cause any performance penalties, but has the benefit of a clearer type signature.

Updated 👍🏻

@CAFxX CAFxX changed the title proposal: sync: SlicePool proposal: sync: SlicePool for variable-sized slices May 7, 2025
@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label May 7, 2025
@klauspost
Contributor

How will the code decide which sizes are reasonable to re-use, and when to trigger a new alloc?

The reason I am asking is that this is not something I have a generic solution for myself.

If you have []byte buffers, you don't want a Get(20) to return a slice with 1MB capacity just because that is what the pool has around. If most of my buffers are 4KB, I don't want it to keep handing out 4MB slices because they were needed at some point.

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@klauspost I realize this is not clear from the proposal: what I was imagining is that m := roundUpToSizeClass(n), so getting a 1MB slice when you request a 3KB one would never happen (assuming power-of-2 size classes for simplicity, at most you'd get 4KB; in reality we would likely use the size classes defined by the runtime).

This is the simplest thing I could come up with, more complex solutions are definitely possible.

Then, the usual sync.Pool cleanup logic, applied to each size class, would mean that within a couple of GC cycles that old 1MB slice would be collected (if really it has not been used in these last GC cycles).

As an added protection against misuse, Put may even try to detect whether the slice capacity does not match the size class of the backing storage (e.g. a slice originally allocated as make([]byte, 4096) is being added to the pool after having been sliced to [:1024:1024] or [2048:]), in which case it would be silently dropped. I am not sure whether this is feasible in terms of overhead.
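The per-size-class approach sketched in this thread could look roughly like the following; bucketPool and its method names are invented for this example, power-of-two classes stand in for the runtime's real size classes, and the misuse check is the crude capacity test described above.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
)

// bucketPool is a minimal sketch: one sync.Pool per power-of-two size
// class. A real implementation would use the runtime's size classes.
type bucketPool struct {
	buckets [28]sync.Pool // bucket i holds slices with cap == 1<<i
}

// class returns the smallest i such that 1<<i >= n.
func (p *bucketPool) class(n int) int {
	if n <= 1 {
		return 0
	}
	return bits.Len(uint(n - 1))
}

func (p *bucketPool) Get(n int) []byte {
	i := p.class(n)
	if i >= len(p.buckets) {
		return make([]byte, n) // too big to pool
	}
	if v := p.buckets[i].Get(); v != nil {
		return v.([]byte)
	}
	return make([]byte, 1<<i)
}

func (p *bucketPool) Put(s []byte) {
	c := cap(s)
	// Drop slices whose capacity is not exactly a size class: they were
	// likely subsliced, the misuse discussed above.
	if c == 0 || c&(c-1) != 0 {
		return
	}
	if i := p.class(c); i < len(p.buckets) {
		// Note: boxing the slice header here allocates, the very
		// footgun a runtime-integrated SlicePool would avoid.
		p.buckets[i].Put(s[:c])
	}
}

func main() {
	var p bucketPool
	b := p.Get(3000)
	fmt.Println(cap(b)) // 4096: 3000 rounded up to the next class
	p.Put(b[:1000:1000]) // cap 1000 is not a class: silently dropped
	p.Put(b)             // cap 4096: accepted for reuse
}
```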

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@mateusz834 My main concern is someone subslicing a slice, e.g. make([]byte, 4096) into [:2048:2048] and [2048:], and then adding both subslices to the pool. Those two subslices would not be individually garbage-collectible, because they point to the same underlying storage. A similar issue arises if someone subslices in the same way and then adds only one of the two subslices.
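A short demonstration of this hazard: two subslices of a single allocation share backing storage, so pooling either one keeps the entire block alive.

```go
package main

import "fmt"

func main() {
	whole := make([]byte, 4096)
	a := whole[:2048:2048] // capacity restricted to 2048, memory still shared
	b := whole[2048:]      // second half of the same allocation
	a[0], b[0] = 1, 2
	fmt.Println(whole[0], whole[2048]) // 1 2: both writes land in `whole`
}
```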

We posted almost at the same time so you may have missed it, but this is the reason why above I wrote:

As an added protection against misuse, Put may even try to detect whether the slice capacity does not match the size class of the backing storage (e.g. a slice originally allocated as make([]byte, 4096) is being added to the pool after having been sliced to [:1024:1024] or [2048:]), in which case it would be silently dropped. I am not sure whether this is feasible in terms of overhead.

@mateusz834
Member

// Callers can assume that the length and contents of the returned slice are either as
// set by [New], or as they were when the slice was passed to [Put].

Is that the expected (or desired) behavior? Consider:

b := p.Get(8)
b = b[:0]
b = append(b, 1)
p.Put(b)

I believe that Put should do b = b[:cap(b)] implicitly; I don't see a good reason to preserve the length. This in turn could also allow dropping the len field of a slice when it is stored internally in the pool (with unsafe).

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@mateusz834 my thinking there was just that there are good reasons for all options:

  • leaving the length (and contents) unchanged
    • minimize the cleanup required, e.g. if you know that when you did the Put anything beyond len(s) was already zeroed, or alternatively that anything before len(s) is properly initialized (consider the case of E being a struct)
  • setting the length to n
    • Get(n) would be similar in spirit to append([]E(nil), make([]E, n)...)
    • may be a bit quirky for users to deal with the fact that they call Get(n) but New gets called with m>n
    • would potentially allow the runtime to do more creative things with the slice contents (e.g. MADV_DONTNEED)
  • setting the length to m (capacity of the slice)
    • would make it easier to see that the capacity can be higher than n
    • what happens if New(m) does not set len(s) == cap(s)?
    • would potentially allow the runtime to do more creative things with the slice contents (e.g. MADV_DONTNEED)

so I just went for the one that seemed to require the lowest number of assumptions on how the pool will be used

@jub0bs

jub0bs commented May 7, 2025

Would sync/v2 not solve this? Is a new type really necessary?

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@jub0bs

note that while the allocations on Put may be solved in #71076 or with #47657, the problem of excessive memory usage on misuse won't

@mateusz834
Member

mateusz834 commented May 7, 2025

what happens if New(m) does not set len(s) == cap(s)?

I was also thinking that if we decided to do b[:cap(b)], then New() is not really needed, as it can always just be a make().

minimize the cleanup required, e.g. if you know that when you did the Put anything beyond len(s) was already zeroed, or alternatively that anything before len(s) is properly initialized (consider the case of E being a struct)

Makes sense, but on the other hand I wonder whether this is something that is currently being done in practice. The cleanup might also be done before calling Put() (not ideal, as we might clean up something that is going to be GCed soon).

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@mateusz834

I was also thinking that if we decided to do b[:cap(b)], then New() is not really needed, as it can always just be a make().

I think having a nil New is something we should still allow; sometimes the semantics of "give me something from the pool only if it is already there" are useful. IIRC even in sync/v2 there are arguments in favour of maintaining this.

whether this is something that is currently being done in practice

I can only say for sure that I have often relied on the fact that the contents of what you put in the pool are preserved (e.g. knowing that objects I get out of the pool are in a very specific state that was set when I put them into the pool). Whether this is widespread practice I don't know (https://github.com/josharian/intern suggests it may be though).

@klauspost
Contributor

@klauspost I realize this is not clear from the proposal: what I was imagining is that m := roundUpToSizeClass(n), so getting a 1MB slice when you request a 3KB one would never happen (assuming power-of-2 size classes for simplicity, at most you'd get 4KB; in reality we would likely use the size classes defined by the runtime).

Log2 buckets are not really great: especially at smaller sizes they fragment much more than needed. Log10 is better, but over-allocates when sizes get big. Also, the switchover points depend on sizeof(T): what makes sense for []byte doesn't necessarily translate to []bigstruct.

I am not really sure a good generic roundUpToSizeClass(n) can be made. That observation is mostly just a reflection of my own inability to do so.

Usually I end up with a "min_len" and "max_len" depending on the expected payloads. When the cap of a returned slice is > max_len I see it as an outlier and just dump it for GC. In some cases I add a couple of buckets with cap ranges - using a sync.Pool underneath.
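The min/max-cap policy described in this comment could be sketched as follows; outlierPool and its field names are hypothetical, invented for this example rather than taken from any real API.

```go
package main

import (
	"fmt"
	"sync"
)

// outlierPool sketches the policy above: slices with capacity outside
// [minCap, maxCap] are treated as outliers and left to the GC.
type outlierPool struct {
	minCap, maxCap int
	pool           sync.Pool
}

func (p *outlierPool) Get(n int) []byte {
	if n <= p.maxCap {
		if v := p.pool.Get(); v != nil {
			if s := v.([]byte); cap(s) >= n {
				return s
			}
			// An undersized pooled slice is simply lost in this sketch.
		}
	}
	return make([]byte, n)
}

func (p *outlierPool) Put(s []byte) {
	if cap(s) < p.minCap || cap(s) > p.maxCap {
		return // outlier: dump it for the GC
	}
	p.pool.Put(s)
}

func main() {
	p := &outlierPool{minCap: 1024, maxCap: 64 << 10}
	p.Put(make([]byte, 1<<20)) // 1MB outlier: dropped, never handed out again
	b := p.Get(2048)
	fmt.Println(cap(b) <= 64<<10) // true: no oversized slice leaks back out
}
```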

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@klauspost I think what we should end up doing is exactly what the runtime does when it needs to satisfy an allocation of size x bytes (IIRC it rounds up to the size class for x <= 32KB; for larger allocations it rounds to pages or groups of pages). It may not be perfect, and could potentially result in a large number of ephemeral pools if used inefficiently... or we could just add a MaxCap field (that, when > 0, would cause Put to silently discard any slice of capacity larger than MaxCap).

@soypat

soypat commented May 7, 2025

Sharing my slice pool implementation which I've been using. Got a couple of methods in there to calculate memory usage and amount of allocated contiguous blocks. https://github.com/soypat/gsdf/blob/b629e072407e28d69621e4bbba272b3ade11383c/gleval/cpu.go#L222

@earthboundkid
Contributor

What are the types we would use this for besides []byte? Should it just be bytes.Pool instead and have the knobs be around that?

@CAFxX
Contributor Author

CAFxX commented May 8, 2025

@earthboundkid I would say that this would likely be useful in places where we don't have just []byte: e.g.

  • when unmarshalling arrays of unknown (a priori) length, like in JSON -- in this case the slice can be basically of any type
  • slices.Collect or in general any function that accumulates into a slice a stream of elements of unknown length -- same as above
  • math/big -- in this case it operates on []uint

These are just off the top of my head, and just looking at uses in the standard library.

It is true though that I would expect []byte to likely be the most common use.

It is also true that it would be great if none of this was needed; the old adage "every use of sync.Pool is a bug report on the GC" applies here.

@soypat

soypat commented May 8, 2025

Sharing my slice pool implementation which I've been using. Got a couple of methods in there to calculate memory usage and amount of allocated contiguous blocks. https://github.com/soypat/gsdf/blob/b629e072407e28d69621e4bbba272b3ade11383c/gleval/cpu.go#L222

Could the backing memory be used with slices of different element types, to supplement the above case (type VecPool struct) where I have to create a different pool for every type?

type SlicePool struct {
// ...
}
type SlicePoolHandle[E any] struct {
   sp *SlicePool
}
func GetSlicePoolHandle[E any](sp *SlicePool) SlicePoolHandle[E]

func (sph SlicePoolHandle[E]) Get(n int) []E

func (sph SlicePoolHandle[E]) Put(s []E)

@Merovius
Contributor

Merovius commented May 8, 2025

  • when unmarshalling arrays of unknown (a priori) length, like in JSON -- in this case the slice can be basically of any type
  • slices.Collect or in general any function that accumulates into a slice a stream of elements of unknown length -- same as above
  • math/big -- in this case it operates on []uint

But pools only really make sense in cases where they are API-transparent. In all of these cases, the allocated slices are passed out of the control of the respective packages and can't really be pooled.

@CAFxX
Contributor Author

CAFxX commented May 8, 2025

@Merovius my thinking is that in those cases the slices could be recycled while the slice is growing, i.e. before the slice is passed to the outside world. Once it is passed to the outside world, it would never hit the pool again.
This should cut the average number of allocations in those functions down to 1 (the last one), regardless of how many elements are in the stream, while keeping the pool usage API-transparent.
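The grow-and-recycle pattern described here can be sketched as follows; collect, get, and put are hypothetical names, with get/put standing in for pool calls.

```go
package main

import "fmt"

// collect accumulates seq into a slice, returning intermediate backing
// arrays to the pool (via put) as the slice grows. Only the final
// backing array escapes to the caller, and it never re-enters the pool.
func collect[E any](seq []E, get func(int) []E, put func([]E)) []E {
	out := get(8)[:0]
	for _, e := range seq {
		if len(out) == cap(out) {
			bigger := get(2 * cap(out))[:len(out)]
			copy(bigger, out)
			put(out) // the old, smaller backing array goes back to the pool
			out = bigger
		}
		out = append(out, e)
	}
	return out
}

func main() {
	recycled := 0
	get := func(n int) []int { return make([]int, n) } // stand-in for pool.Get
	put := func(s []int) { recycled++ }                // stand-in for pool.Put
	out := collect([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, get, put)
	fmt.Println(len(out), recycled) // 10 1: one intermediate array recycled
}
```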
