-
Notifications
You must be signed in to change notification settings - Fork 18k
compress: add AppendCompress and AppendDecompress #54078
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I don't see anywhere to keep state here. Is this the correct usage? In particular, are we required to always keep passing the same destination slice to each call? Is there really no other state required? buf := make([]byte, 4096)
var dst []byte
for {
n, err := r.Read(buf)
if err != nil && err != io.EOF {
return nil, err
}
if n == 0 {
break
}
dst, err = gzip.AppendEncoded(dst, buf[:n], 9)
if err != nil {
return nil, err
}
}
return dst, nil |
The intention of the proposed API is for use when you already have a For your example, you could do: b, err := io.ReadAll(r) // converts io.Reader to []byte
if err != nil {
return nil, err
}
return gzip.AppendEncoded(nil, b, 9) // stays in the []byte world but I'm not sure if that's any faster or slower than doing: var bb bytes.Buffer
zw := gzip.NewWriter(bb) // stays in the io.Writer world
if _, err := io.Copy(zw, r); err != nil { // connects io.Writer and io.Reader together
return nil, err
}
zw.Close()
return bb.Bytes(), nil // converts io.Writer into a []byte Calling |
This API would remove the allocation of the window. How much can we remove of the other allocated state? Can these be implemented with no allocation other than the append? (That is, can we get everything to be stack-allocated? Or were you planning on a sync.Pool?) |
I was planning on using a |
This proposal has been added to the active column of the proposals project |
Is |
I'm fine with either |
strconv.AppendQuote is not AppendQuoted (no d). Does anyone object to AppendCompress and AppendDecompress? |
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. 🎉 |
Change https://go.dev/cl/429736 mentions this issue: |
There are situations where we have the entire input on hand and want to compress/decompress it into a buffer we already have on hand. I propose the addition of the following functions to
gzip
,flate
,zlib
:At face value, this appears to be yet another proposal for convenience functions (which has been proposed and rejected before: #16504). However, there are CPU and memory performance benefits to this API that is impossible to achieve otherwise.
Compression is the combination of LZ77 and Huffman encoding. The memory used by LZ77 is essentially the last 32KiB of uncompressed data. When operating with an
io.Reader
orio.Writer
, the compressor/decompressor is responsible for maintaining a copy of the LZ77 window. This is quite a bit of copying and extra memory usage. The advantage of the APIs above is that there is zero work that needs to be done to maintain the LZ77 window separately since theAppendEncoded.src
andAppendDecoded.dst
are the LZ77 windows. When decompressing, the only memory required is the memory for the Huffman decoder, which is relatively small. When compressing, the only memory required is the memory for the LZ77 search structure and Huffman encoder, which is still sizeable, but still smaller than the entire LZ77 window.Other considerations: The
flate
andzlib
APIs provide the ability to specify a dictionary and the proposed APIs have no means of specifying such a feature. It appears that dictionaries are rarely used such that adding it as part of the API is probably not worth it. Usage of dictionaries can still use theio.Reader
andio.Writer
based APIs.The text was updated successfully, but these errors were encountered: