-
Notifications
You must be signed in to change notification settings - Fork 385
Make Tree Borrows Provenance GC no longer produce stack overflows #3833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note: I'm not too sure what the point of the assert here was, but with the new implementation, it is already ensured otherwise that the root is never deleted. miri/src/borrow_tracker/tree_borrows/tree.rs Line 703 in a84aba2
|
a84aba2
to
06e2806
Compare
for (_perms_range, perms) in self.rperms.iter_mut_all() { | ||
perms.remove(idx); | ||
fn keep_only_needed(&mut self, root: UniIndex, live: &FxHashSet<BorTag>) -> bool { | ||
let mut stack = vec![(root, 0)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment explaining why we don't use the TreeVisitorStack
mem::swap(&mut self.nodes.get_mut(*tag).unwrap().children, &mut children_of_node); | ||
let to_remove: Vec<_> = | ||
children_of_node.drain_filter(|x| self.is_useless(*x, live)).collect(); | ||
mem::swap(&mut self.nodes.get_mut(*tag).unwrap().children, &mut children_of_node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then this can just be ...children = children_of_node
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. Since it's a SmallVec
you can probably create (and discard) the empty ones relatively cheaply. But also, since it's a SmallVec
, the two copies/swaps are not going to be completely cheap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A global let mut
outside the loop that survives all loop iterations is not a good idea IMO. So children_of_node
should be per-iteration. That means it is dead after this swap
call, so it makes little sense to use swap
in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. :)
There's just one thing I am wondering:
can we do the self.nodes.get_mut(*tag).unwrap().children = children_of_node;
after the for idx in to_remove {
? If yes, then we could entirely avoid the collect
, which seems quite nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but it does not avoid the collect
. That is needed because the closure passed to drain_filter
needs a shared reference to self
, while the loop body needs a mutable one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which makes me think, perhaps there's a better API than drain_filter
🤔 But that would need to be RFC'ed first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense, thanks.
// list of remaining children back in. | ||
let mut children_of_node = | ||
mem::take(&mut self.nodes.get_mut(*tag).unwrap().children); | ||
// Remove all useless children, and save them for later (again to appease the borrow checker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Remove all useless children, and save them for later (again to appease the borrow checker) | |
// Remove all useless children, and save them for later. | |
// The closure needs `&self` and the loop below needs `&mut self`, so we need to `collect` | |
// in to a temporary list. |
Just one last comment nit, then please squash the commits. |
620e1b7
to
ad46edb
Compare
squashed! |
@bors r+ |
☀️ Test successful - checks-actions |
Avoid extra copy by using `retain_mut` and moving the deletion into the closure Fixes the FIXME introduced in #3833. Thanks to `@dmitrii-ubskii` for the idea 🙂
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
…alfJung Make Tree Borrows Provenance GC compact the tree Follow-up on #3833 and #3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in [`tiny-skia`](https://github.com/RazrFalcon/tiny-skia). But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time (only for it to be OOM-killed 🤬), whereas it finishes within 24 seconds in Stacked Borrows. With this merged, it finishes in about 40 seconds under TB. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children. I am currently only 99% sure that this is sound, and I do also think that this could compact even more. So `@Vanille-N` please also have a look at whether I got the compacting logic right. For a more visual comparison, [here is a gist](https://gist.github.com/JoJoDeveloping/ae4a7f7c29335a4c233ef42d2f267b01) of what the tree looks like at one point during that test, with and without compacting. This new GC requires a different iteration order during accesses (since the current one can make the error messages non-deterministic), so it is rebased on top of #3843 and requires that PR to be merged first.
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in `tiny_skia`. But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time, whereas it finishes within seconds in Stacked Borrows. The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation. The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.
Most functions operating on Tree Borrows' trees are carefully written to not cause stack overflows due to too much recursion. The one exception is
Tree::keep_only_needed
, which just uses regular recursion.This function is part of the provenance GC, so it is called regularly for every allocation in the program.
Tests show that this is a problem in practice. For example, the test
fill::horizontal_line
in cratetiny-skia
(version 0.11.4) is such a test.This PR changes this, this test no now longer crashes. Instead, it succeeds (after a long time).