Flush CodedOutputStream buffer on drop. #147

cfallin · 2016-04-02T03:44:17Z

The current behavior -- that some leftover bytes may be silently forgotten in a CodedOutputStream's output buffer on drop -- caught me by surprise.

I'm not sure if the current behavior is intentional. Perhaps you wanted to be able to explicitly return a Result on flush(), whereas drop() cannot do that? But in that case the user who wants to catch the error can now explicitly flush() first to avoid any panic on drop, and they should be doing that anyway to avoid silently forgetting bytes. In any case, the new semantics seem less error-prone to me.

Thanks!

This change is

stepancheg · 2016-04-02T04:13:49Z

Well, it is intentional. I think drop must only release resources, and not do anything heavy. The problem arises, for example, if CodedOutputStream is created over a socket, and buffer is not empty, then drop may hang indefinetly.

This behavior however contradics behavior of std::io::BufWriter (which I consider wrong for described reasons).

Need to think about it.

stepancheg · 2016-04-02T05:38:38Z

The problem with current BufWriter design: http://is.gd/ZdgbF9 Error is quietly ignored.

You propose to panic instead of ignore, which is also not good.

cfallin · 2016-04-02T05:55:19Z

Hmm, I see. It seems there's no perfect solution -- as long as something is buffered, there's potential for a delayed error. We either have to drop bytes, drop the error, or panic. So I understand your reasoning.

I'm fine if you decide to keep things as they are, but one more proposal just came to mind:

It seems to me, in one way of thinking, that dropping without flushing is "indeterminate" in the same way that a race condition or uninitialized memory are: no correct program should do that, and you'll get something but you don't know what. It might be better to demand that the user either explicitly flushes or explicitly clears the buffer before dropping, and panic if not. That is:

a stream would have an "unfinished business" flag set on any write and cleared on any flush or explicit clear, and initially false.
a stream still has flush() returning a Result.
a stream also has clear_buffer() that forgets any unsent bytes, and always succeeds.
on drop, if flush() or clear_buffer() has not been called since last write (flag is set), then panic.

The difference is that this panic would be due to incorrect use of API (i.e., always occuring), rather than only occurring on some infrequent error, so at least it would be less error-prone.

What do you think? I'm happy to drop this and write my own RAII wrapper if you disagree. Thanks for considering :-)

stepancheg · 2016-04-02T06:13:32Z

I like your solution, except it has a problem: drop must not panic if it is currently unwinding due to another panic. Otherwize code like this would crash the process, not just panic:

let mut buffered = ...
buffered.write(buf);
assert!(some_cond);
buffered.flush().unwrap();

To solve the problem, drop needs to query unwind reason (normal or panic, something similar to std::uncaught_exception of C++), and panic only if it is normal unwinding.

I think the issue should be raised in rust about BufWriter. And which solution is chosen for BufWriter should be used for CodedOutputStream.

stepancheg · 2016-04-02T06:23:10Z

Found it: panicking. So, yes, your solution should work. I'll open an issue in Rust.

stepancheg · 2016-04-02T06:51:30Z

The issue about BufWriter: rust-lang/rust#32677

cfallin · 2016-04-04T19:38:54Z

The rust-lang issue is still open, but I had a thought in the meantime: why not have CodedOutputStream actually use a BufWriter in its implementation? I agree with you that the two should have consistent behavior, and it seems this would be the simplest way to achieve that. A few questions:

Does CodedOutputStream roll its own buffer in order to have the drop-without-flush capability, or were there other reasons too (e.g. performance)?
If we refactor to use BufWriter under the covers, would you prefer to get a cancel() method on BufWriter first so that we can expose that in CodedOutputStream's API to support your drop-without-hang use-case?

stepancheg · 2016-04-04T20:14:36Z

why not have CodedOutputStream actually use a BufWriter in its implementation?

Several reasons:

CodedOutputStream could be implemented efficiently if it had access to BufWriter buffer (although CodedOutputStream is not optimized for that yet). BufWriter does not expose its buffer.
CodedOutputStream can write directly to byte array
C++ and Java versions of CodedOutputStream do own buffering
Historical reasons: there were not BufWriter when first version of rust-protobuf implemented

If we refactor to use BufWriter under the covers, would you prefer to get a cancel() method on BufWriter first so that we can expose that in CodedOutputStream's API to support your drop-without-hang use-case?

I think we can't use BufWriter under cover for reasons described above.

I'm not sure I understand what do you propose right now. My point is that CodedOutputStream should be to some degree consistent with rust API, and rust team hasn't come with decision yet (for example, they could mark certain functions as "destructors" which has to be called manually).

cfallin · 2016-04-04T20:19:16Z

OK, thanks. I was trying to figure out if we could make forward progress here by simply delegating to BufWriter, since you had stated you wanted to follow its semantics, but I understand the efficiency reasons for not doing so. Will wait to see how the BufWriter issue concludes.

This commit removes the internal buffering in CodedOutputStream. The protobuf library should rely on the application to provide a buffered Write implementation, if it so desires. By buffering internally, the protobuf library forces double buffering in situations where the application is already buffering. Additionally, it forces awkward resource lifetime issues as discussed in stepancheg#147. Removing the internal buffering simplifies the internals of CodedOutputStream and improves performance on the rust-protobuf-perftest write tests. Before: test1: write: 86 ns per iter test_repeated_bool: write: 91 ns per iter test_repeated_packed_int32: write: 127 ns per iter test_repeated_messages: write: 224 ns per iter test_optional_messages: write: 202 ns per iter test_strings: write: 136 ns per iter test_small_bytearrays: write: 1098 ns per iter test_large_bytearrays: write: 119221 ns per iter After: test1: write: 55 ns per iter test_repeated_bool: write: 66 ns per iter test_repeated_packed_int32: write: 122 ns per iter test_repeated_messages: write: 248 ns per iter test_optional_messages: write: 207 ns per iter test_strings: write: 118 ns per iter test_small_bytearrays: write: 1064 ns per iter test_large_bytearrays: write: 109567 ns per iter

stepancheg · 2019-04-15T03:05:38Z

This is merged into master long time ago, but it is not in stable though.

rm generated files

Flush CodedOutputStream buffer on drop.

6f1122b

cfallin mentioned this pull request Apr 2, 2016

Lint for types where cleanup before drop is desirable rust-lang/rust#32677

Open

stepancheg closed this Apr 15, 2019

jxs pushed a commit to jxs/rust-protobuf that referenced this pull request Oct 25, 2022

Merge pull request stepancheg#147 from nerdrew/gitignore-generated-files

2e8172e

rm generated files

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Flush CodedOutputStream buffer on drop. #147

Flush CodedOutputStream buffer on drop. #147

Uh oh!

cfallin commented Apr 2, 2016 •

edited

Loading

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

cfallin commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

cfallin commented Apr 4, 2016

Uh oh!

stepancheg commented Apr 4, 2016

Uh oh!

cfallin commented Apr 4, 2016

Uh oh!

stepancheg commented Apr 15, 2019

Uh oh!

Uh oh!

Flush CodedOutputStream buffer on drop. #147

Flush CodedOutputStream buffer on drop. #147

Uh oh!

Conversation

cfallin commented Apr 2, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

cfallin commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

stepancheg commented Apr 2, 2016

Uh oh!

cfallin commented Apr 4, 2016

Uh oh!

stepancheg commented Apr 4, 2016

Uh oh!

cfallin commented Apr 4, 2016

Uh oh!

stepancheg commented Apr 15, 2019

Uh oh!

Uh oh!

cfallin commented Apr 2, 2016 •

edited

Loading