proposal: sync: SlicePool for variable-sized slices #73620

Open
CAFxX opened this issue May 7, 2025 · 20 comments
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal

@CAFxX
Contributor

CAFxX commented May 7, 2025

Proposal Details

To address the known footguns of using Pool with variable-sized slices (excessive memory usage on misuse, allocation on Put), I would like to propose the following addition to sync:

package sync

// SlicePool is a generic pool for variable-sized slices.
type SlicePool[E any] struct {
	// New, if non-nil, must return a new slice of the specified capacity.
	New func(int) []E
}

// Get returns a slice of capacity at least n from the pool, if possible.
// If not possible and [New] is not nil, Get returns the value returned by [New](m),
// with m greater than or equal to n.
// Otherwise it returns nil.
//
// Callers must not assume any relationship between individual slices returned by Get
// and slices previously passed as arguments to [Put].
// Callers can assume that the length and contents of the returned slice are either as
// set by [New], or as they were when the slice was passed to [Put].
func (p *SlicePool[E]) Get(n int) []E

// Put adds the slice s to the pool.
// The slice s should not be used after it has been passed to Put.
//
// If the slice has been previously obtained from a call to [Get], when Put is called
// the slice should have the same capacity as it had when it was returned by [Get].
//
// Put is best effort, in that it may silently drop the slice in case it detects it would
// not be beneficial to add it to the pool.
func (p *SlicePool[E]) Put(s []E)

The guarantees given allow significant flexibility in implementation, while addressing the aforementioned footguns (note that while the allocations on Put may be solved in sync/v2 or with PoolOf, the problem of excessive memory usage on misuse won't be).

Implementations could range incrementally from very simple (e.g. something in the spirit of https://github.com/josharian/intern) to a fully dedicated pool implementation deeply integrated with the memory allocator (e.g. alignment with size classes) and GC (e.g. incremental pool cleanup). While a very crude implementation of this proposal could very well live outside of the standard library, anything more performant would necessarily need to live in the standard library, and be integrated with the runtime (as sync.Pool does).
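To make the intended ergonomics concrete, here is a hedged usage sketch. The SlicePool below is a toy stand-in written only so the example compiles and runs; the proposed type does not exist in sync, and a real implementation would be size-class aware and runtime-integrated as described above.

```go
package main

import (
	"fmt"
	"sync"
)

// SlicePool is a toy stand-in for the proposed sync.SlicePool, written
// only so this usage example runs. It ignores size classes entirely.
type SlicePool[E any] struct {
	New  func(int) []E
	pool sync.Pool
}

func (p *SlicePool[E]) Get(n int) []E {
	if v := p.pool.Get(); v != nil {
		if s := v.([]E); cap(s) >= n {
			return s
		}
		// A pooled slice that is too small is simply lost in this sketch.
	}
	if p.New != nil {
		return p.New(n)
	}
	return nil
}

func (p *SlicePool[E]) Put(s []E) { p.pool.Put(s) }

func main() {
	var bufs SlicePool[byte]
	bufs.New = func(n int) []byte { return make([]byte, n) }

	b := bufs.Get(4096) // capacity at least 4096
	// ... use b as a scratch buffer ...
	bufs.Put(b) // done: hand it back for reuse

	fmt.Println(cap(bufs.Get(4096)) >= 4096) // true
}
```

Note that this stand-in still boxes the slice header on Put, which is exactly the allocation footgun the proposal aims to eliminate.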

As mentioned in #23199 the http2 logic makes use of a set of pools to handle slices of different sizes, and could likely be switched to SlicePool instead. A quick search on golang/go suggests at least 3 other places where this could be used to replace currently hand-rolled solutions, and another search over all other Go code on GitHub suggests a few hundred more (though it is true that not all of these examples are slices¹). A few other possibilities include JSON unmarshalling, slices.Collect, bytes.Buffer, io.ReadAll, math/big.nat, etc.

As a small note, the current proposal uses the sync.Pool (v1) convention of a New field because this proposal targets the current sync. If it were to target sync/v2 it would need to be adapted to whatever convention ends up being used in v2.

Implementation notes

@klauspost correctly points out that coming up with a generic policy for when to return a slice of size m > n is anything but easy, and it is quite likely completely impractical to come up with something that is optimal for all workloads. My suggestion (for now a bit handwavy) is to use the same (or very similar) logic used by the memory allocator in the runtime, rounding up allocations to size classes with bounded overhead.
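As a concrete illustration of "rounding up to size classes with bounded overhead", here is a crude power-of-two version; the runtime's actual size classes are finer-grained, and roundUpPow2 is a name invented for this sketch.

```go
package main

import (
	"fmt"
	"math/bits"
)

// roundUpPow2 rounds n up to the next power of two: a crude stand-in
// for the runtime's size-class rounding. With power-of-two classes the
// overhead is bounded: Get(n) never returns more than 2x the request.
func roundUpPow2(n int) int {
	if n <= 1 {
		return 1
	}
	return 1 << bits.Len(uint(n-1))
}

func main() {
	for _, n := range []int{1, 5, 3000, 4096} {
		fmt.Println(n, "->", roundUpPow2(n))
	}
	// 1 -> 1, 5 -> 8, 3000 -> 4096, 4096 -> 4096
}
```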

Questions answered so far

Footnotes

  1. For the non-slices examples I was thinking to later propose a KeyedPool that would be able to handle all of those I've seen so far. KeyedPool would not have the ability to return slices of capacity "at least N", so it would not be a good fit especially for byte slices.

@CAFxX CAFxX added the Proposal label May 7, 2025
@gopherbot gopherbot added this to the Proposal milestone May 7, 2025
@mateusz834
Member

Personally I would change the signature to just: type SlicePool[E any] struct {}. It is pretty easy to convert []E into S; it does not cause any performance penalties, but has the benefit of a clearer type signature.

See #73598 (comment)

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

Personally I would change the signature to just: type SlicePool[E any] struct {}. It is pretty easy to convert []E into S; it does not cause any performance penalties, but has the benefit of a clearer type signature.

Updated 👍🏻

@CAFxX CAFxX changed the title proposal: sync: SlicePool proposal: sync: SlicePool for variable-sized slices May 7, 2025
@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label May 7, 2025
@klauspost
Contributor

How will the code decide which sizes are reasonable to re-use, and when to trigger a new alloc?

The reason I am asking is that this is not something I have a generic solution for myself.

If you have []byte buffers, you don't want a Get(20) to return a slice with 1MB capacity just because that is what the pool has around. If most of my buffers are 4KB, I don't want it to keep handing out 4MB slices because they were needed at some point.

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@klauspost I realize this is not clear from the proposal: what I was imagining is that m := roundUpToSizeClass(n), so getting a 1MB slice when you request a 3KB one would never happen (assuming power-of-2 size classes for simplicity, at most you'd get 4KB; in reality we would likely use the size classes defined by the runtime).

This is the simplest thing I could come up with, more complex solutions are definitely possible.

Then, the usual sync.Pool cleanup logic, applied to each size class, would mean that within a couple of GC cycles that old 1MB slice would be collected (if really it has not been used in these last GC cycles).

As an added protection against misuse, Put may even try to detect whether the slice capacity does not match the size class of the backing storage (e.g. a slice originally allocated as make([]byte, 4096) is being added to the pool after having been sliced to [:1024:1024] or [2048:]), in which case it would be silently dropped. I am not sure whether this is feasible in terms of overhead.
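The per-size-class approach sketched in this thread could look roughly like the following; bucketPool and its method names are invented for this example, power-of-two classes stand in for the runtime's real size classes, and the misuse check is the crude capacity test described above.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
)

// bucketPool is a minimal sketch: one sync.Pool per power-of-two size
// class. A real implementation would use the runtime's size classes.
type bucketPool struct {
	buckets [28]sync.Pool // bucket i holds slices with cap == 1<<i
}

// class returns the smallest i such that 1<<i >= n.
func (p *bucketPool) class(n int) int {
	if n <= 1 {
		return 0
	}
	return bits.Len(uint(n - 1))
}

func (p *bucketPool) Get(n int) []byte {
	i := p.class(n)
	if i >= len(p.buckets) {
		return make([]byte, n) // too big to pool
	}
	if v := p.buckets[i].Get(); v != nil {
		return v.([]byte)
	}
	return make([]byte, 1<<i)
}

func (p *bucketPool) Put(s []byte) {
	c := cap(s)
	// Drop slices whose capacity is not exactly a size class: they were
	// likely subsliced, the misuse discussed above.
	if c == 0 || c&(c-1) != 0 {
		return
	}
	if i := p.class(c); i < len(p.buckets) {
		// Note: boxing the slice header here allocates, the very
		// footgun a runtime-integrated SlicePool would avoid.
		p.buckets[i].Put(s[:c])
	}
}

func main() {
	var p bucketPool
	b := p.Get(3000)
	fmt.Println(cap(b)) // 4096: 3000 rounded up to the next class
	p.Put(b[:1000:1000]) // cap 1000 is not a class: silently dropped
	p.Put(b)             // cap 4096: accepted for reuse
}
```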

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@mateusz834 My main concern is someone subslicing a slice, e.g. make([]byte, 4096) into [:2048:2048] and [2048:], and then adding both subslices to the pool. Those two subslices would not be individually garbage-collectible, because they point to the same underlying storage. A similar issue arises if someone subslices in the same way and then adds only one of the two subslices.
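A short demonstration of this hazard: two subslices of a single allocation share backing storage, so pooling either one keeps the entire block alive.

```go
package main

import "fmt"

func main() {
	whole := make([]byte, 4096)
	a := whole[:2048:2048] // capacity restricted to 2048, memory still shared
	b := whole[2048:]      // second half of the same allocation
	a[0], b[0] = 1, 2
	fmt.Println(whole[0], whole[2048]) // 1 2: both writes land in `whole`
}
```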

We posted almost at the same time so you may have missed it, but this is the reason why above I wrote:

As an added protection against misuse, Put may even try to detect whether the slice capacity does not match the size class of the backing storage (e.g. a slice originally allocated as make([]byte, 4096) is being added to the pool after having been sliced to [:1024:1024] or [2048:]), in which case it would be silently dropped. I am not sure whether this is feasible in terms of overhead.

@mateusz834
Member

// Callers can assume that the length and contents of the returned slice are either as
// set by [New], or as they were when the slice was passed to [Put].

Is that the expected (or desired) behavior? Consider:

b := p.Get(8)
b = b[:0]
b = append(b, 1)
p.Put(b)

I believe that Put should do b = b[:cap(b)] implicitly; I don't see a good reason to preserve the length. This in turn could also allow dropping the len field of a slice when it is stored internally in the pool (with unsafe).

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@mateusz834 my thinking there was just that there are good reasons for all options:

  • leaving the length (and contents) unchanged
    • minimize the cleanup required, e.g. if you know that when you did the Put anything beyond len(s) was already zeroed, or alternatively that anything before len(s) is properly initialized (consider the case of E being a struct)
  • setting the length to n
    • Get(n) would be similar in spirit to append([]E(nil), make([]E, n)...)
    • may be a bit quirky for users to deal with the fact that they call Get(n) but New gets called with m>n
    • would potentially allow the runtime to do more creative things with the slice contents (e.g. MADV_DONTNEED)
  • setting the length to m (capacity of the slice)
    • would make it easier to see that the capacity can be higher than n
    • what happens if New(m) does not set len(s) == cap(s)?
    • would potentially allow the runtime to do more creative things with the slice contents (e.g. MADV_DONTNEED)

so I just went for the one that seemed to require the lowest number of assumptions on how the pool will be used

@jub0bs

jub0bs commented May 7, 2025

Would sync/v2 not solve this? Is a new type really necessary?

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@jub0bs

note that while the allocations on Put may be solved in #71076 or with #47657, the problem of excessive memory usage on misuse won't

@mateusz834
Member

mateusz834 commented May 7, 2025

what happens if New(m) does not set len(s) == cap(s)?

I was also thinking that if we decided to do b[:cap(b)], then New() is not really needed, as it can always just be a make().

minimize the cleanup required, e.g. if you know that when you did the Put anything beyond len(s) was already zeroed, or alternatively that anything before len(s) is properly initialized (consider the case of E being a struct)

Makes sense, but on the other hand I wonder whether this is something that is currently being done in practice. The cleanup might also be done before calling Put() (not ideal, as we might clean up something that is going to be GCed soon).

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@mateusz834

I was also thinking that if we decided to do b[:cap(b)], then New() is not really needed, as it can always just be a make().

I think having a nil New is something we should still allow; sometimes the semantics of "give me something from the pool only if it is already there" are useful. IIRC even in sync/v2 there are arguments in favour of maintaining this.

whether this is something that is currently being done in practice

I can only say for sure that I have often relied on the fact that the contents of what you put in the pool are preserved (e.g. knowing that objects I get out of the pool are in a very specific state that was set when I put them into the pool). Whether this is widespread practice I don't know (https://github.com/josharian/intern suggests it may be though).

@klauspost
Contributor

@klauspost I realize this is not clear from the proposal: what I was imagining is that m := roundUpToSizeClass(n), so getting a 1MB slice when you request a 3KB one would never happen (assuming power-of-2 size classes for simplicity, at most you'd get 4KB; in reality we would likely use the size classes defined by the runtime).

Log2 buckets are not really great: especially at smaller sizes they fragment much more than needed. Log10 is better, but over-allocates when sizes get big. Also, the switchover points depend on sizeof(T): what makes sense for []byte doesn't necessarily translate to []bigstruct.

I am not really sure a good generic roundUpToSizeClass(n) can be made. That observation is mostly just a reflection of my own inability to do so.

Usually I end up with a "min_len" and "max_len" depending on the expected payloads. When the cap of a returned slice is > max_len I see it as an outlier and just dump it for GC. In some cases I add a couple of buckets with cap ranges - using a sync.Pool underneath.
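The min/max-cap policy described in this comment could be sketched as follows; outlierPool and its field names are hypothetical, invented for this example rather than taken from any real API.

```go
package main

import (
	"fmt"
	"sync"
)

// outlierPool sketches the policy above: slices with capacity outside
// [minCap, maxCap] are treated as outliers and left to the GC.
type outlierPool struct {
	minCap, maxCap int
	pool           sync.Pool
}

func (p *outlierPool) Get(n int) []byte {
	if n <= p.maxCap {
		if v := p.pool.Get(); v != nil {
			if s := v.([]byte); cap(s) >= n {
				return s
			}
			// An undersized pooled slice is simply lost in this sketch.
		}
	}
	return make([]byte, n)
}

func (p *outlierPool) Put(s []byte) {
	if cap(s) < p.minCap || cap(s) > p.maxCap {
		return // outlier: dump it for the GC
	}
	p.pool.Put(s)
}

func main() {
	p := &outlierPool{minCap: 1024, maxCap: 64 << 10}
	p.Put(make([]byte, 1<<20)) // 1MB outlier: dropped, never handed out again
	b := p.Get(2048)
	fmt.Println(cap(b) <= 64<<10) // true: no oversized slice leaks back out
}
```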

@CAFxX
Contributor Author

CAFxX commented May 7, 2025

@klauspost I think what we should end up doing is exactly what the runtime does when it needs to satisfy an allocation of size x bytes (IIRC it rounds up to the size class for x <= 32KB; for larger allocations it rounds to pages or groups of pages). It may not be perfect, and could potentially result in a large number of ephemeral pools if used inefficiently... or we could just add a MaxCap field (that, when > 0, would cause Put to silently discard any slice of capacity larger than MaxCap).

@soypat

soypat commented May 7, 2025

Sharing my slice pool implementation which I've been using. Got a couple of methods in there to calculate memory usage and amount of allocated contiguous blocks. https://github.com/soypat/gsdf/blob/b629e072407e28d69621e4bbba272b3ade11383c/gleval/cpu.go#L222

@earthboundkid
Contributor

What are the types we would use this for besides []byte? Should it just be bytes.Pool instead and have the knobs be around that?

@CAFxX
Contributor Author

CAFxX commented May 8, 2025

@earthboundkid I would say that this would likely be useful in places where we don't have just []byte: e.g.

  • when unmarshalling arrays of unknown (a priori) length, like in JSON -- in this case the slice can be basically of any type
  • slices.Collect or in general any function that accumulates into a slice a stream of elements of unknown length -- same as above
  • math/big -- in this case it operates on []uint

These are just off the top of my head, and just looking at uses in the standard library.

It is true though that I would expect []byte to likely be the most common use.

It is also true that it would be great if none of this was needed; the old adage "every use of sync.Pool is a bug report on the GC" applies here.

@soypat

soypat commented May 8, 2025

Sharing my slice pool implementation which I've been using. Got a couple of methods in there to calculate memory usage and amount of allocated contiguous blocks. https://github.com/soypat/gsdf/blob/b629e072407e28d69621e4bbba272b3ade11383c/gleval/cpu.go#L222

Could the backing memory be used with slices of different element types, to supplement the above case (type VecPool struct) where I have to create a different pool for every type?

type SlicePool struct {
// ...
}
type SlicePoolHandle[E any] struct {
   sp *SlicePool
}
func GetSlicePoolHandle[E any](sp *SlicePool) SlicePoolHandle[E]

func (sph SlicePoolHandle[E]) Get(n int) []E

func (sph SlicePoolHandle[E]) Put(s []E)

@Merovius
Contributor

Merovius commented May 8, 2025

  • when unmarshalling arrays of unknown (a priori) length, like in JSON -- in this case the slice can be basically of any type
  • slices.Collect or in general any function that accumulates into a slice a stream of elements of unknown length -- same as above
  • math/big -- in this case it operates on []uint

But pools only really make sense in cases where they are API-transparent. In all of these cases, the allocated slices are passed out of the control of the respective packages and can't really be pooled.

@CAFxX
Contributor Author

CAFxX commented May 8, 2025

@Merovius my thinking is that in those cases the slices could be recycled while the slice is growing, i.e. before the slice is passed to the outside world. Once it is passed to the outside world, it would never hit the pool again.
This should cut the average number of allocations in those functions down to 1 (the last one), regardless of how many elements are in the stream, while keeping the pool usage API-transparent.
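The grow-and-recycle pattern described here can be sketched as follows; collect, get, and put are hypothetical names, with get/put standing in for pool calls.

```go
package main

import "fmt"

// collect accumulates seq into a slice, returning intermediate backing
// arrays to the pool (via put) as the slice grows. Only the final
// backing array escapes to the caller, and it never re-enters the pool.
func collect[E any](seq []E, get func(int) []E, put func([]E)) []E {
	out := get(8)[:0]
	for _, e := range seq {
		if len(out) == cap(out) {
			bigger := get(2 * cap(out))[:len(out)]
			copy(bigger, out)
			put(out) // the old, smaller backing array goes back to the pool
			out = bigger
		}
		out = append(out, e)
	}
	return out
}

func main() {
	recycled := 0
	get := func(n int) []int { return make([]int, n) } // stand-in for pool.Get
	put := func(s []int) { recycled++ }                // stand-in for pool.Put
	out := collect([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, get, put)
	fmt.Println(len(out), recycled) // 10 1: one intermediate array recycled
}
```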
