Skip to content

Make Tree Borrows Provenance GC no longer produce stack overflows #3833

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 22, 2024

Conversation

JoJoDeveloping
Copy link
Contributor

Most functions operating on Tree Borrows' trees are carefully written to not cause stack overflows due to too much recursion. The one exception is Tree::keep_only_needed, which just uses regular recursion.
This function is part of the provenance GC, so it is called regularly for every allocation in the program.

Tests show that this is a problem in practice. For example, the test fill::horizontal_line in crate tiny-skia (version 0.11.4) is such a test.

This PR changes this, this test no now longer crashes. Instead, it succeeds (after a long time).

@JoJoDeveloping
Copy link
Contributor Author

JoJoDeveloping commented Aug 21, 2024

Note: I'm not too sure what the point of the assert here was, but with the new implementation, it is already ensured otherwise that the root is never deleted.

assert!(root_is_needed);

for (_perms_range, perms) in self.rperms.iter_mut_all() {
perms.remove(idx);
fn keep_only_needed(&mut self, root: UniIndex, live: &FxHashSet<BorTag>) -> bool {
let mut stack = vec![(root, 0)];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining why we don't use the TreeVisitorStack

mem::swap(&mut self.nodes.get_mut(*tag).unwrap().children, &mut children_of_node);
let to_remove: Vec<_> =
children_of_node.drain_filter(|x| self.is_useless(*x, live)).collect();
mem::swap(&mut self.nodes.get_mut(*tag).unwrap().children, &mut children_of_node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then this can just be ...children = children_of_node, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. Since it's a SmallVec you can probably create (and discard) the empty ones relatively cheaply. But also, since it's a SmallVec, the two copies/swaps are not going to be completely cheap.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A global let mut outside the loop that survives all loop iterations is not a good idea IMO. So children_of_node should be per-iteration. That means it is dead after this swap call, so it makes little sense to use swap in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. :)

There's just one thing I am wondering:
can we do the self.nodes.get_mut(*tag).unwrap().children = children_of_node; after the for idx in to_remove {? If yes, then we could entirely avoid the collect, which seems quite nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but it does not avoid the collect. That is needed because the closure passed to drain_filter needs a shared reference to self, while the loop body needs a mutable one.

Copy link
Contributor Author

@JoJoDeveloping JoJoDeveloping Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which makes me think, perhaps there's a better API than drain_filter 🤔 But that would need to be RFC'ed first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense, thanks.

// list of remaining children back in.
let mut children_of_node =
mem::take(&mut self.nodes.get_mut(*tag).unwrap().children);
// Remove all useless children, and save them for later (again to appease the borrow checker)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Remove all useless children, and save them for later (again to appease the borrow checker)
// Remove all useless children, and save them for later.
// The closure needs `&self` and the loop below needs `&mut self`, so we need to `collect`
// in to a temporary list.

@RalfJung
Copy link
Member

Just one last comment nit, then please squash the commits.

@JoJoDeveloping
Copy link
Contributor Author

Just one last comment nit, then please squash the commits.

squashed!

@RalfJung
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Aug 22, 2024

📌 Commit ad46edb has been approved by RalfJung

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented Aug 22, 2024

⌛ Testing commit ad46edb with merge 0e56cfa...

@bors
Copy link
Contributor

bors commented Aug 22, 2024

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 0e56cfa to master...

@bors bors merged commit 0e56cfa into rust-lang:master Aug 22, 2024
8 checks passed
bors added a commit that referenced this pull request Aug 22, 2024
Avoid extra copy by using `retain_mut` and moving the deletion into the closure

Fixes the FIXME introduced in #3833. Thanks to `@dmitrii-ubskii` for the idea 🙂
JoJoDeveloping added a commit to JoJoDeveloping/miri that referenced this pull request Aug 23, 2024
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no
longer cause a stack overflow. One test that motivated it was the test
`fill::horizontal_line` in `tiny_skia`. But not causing stack overflows
was not a large improvents, since it did not fix the fundamental issue:
The tree was too large. The test now ran, but it required gigabytes of
memory and hours of time, whereas it finishes within seconds in Stacked
Borrows.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate
a slice in chunks. That iterator is written to reborrow at each call to
`next`, which creates a linear tree with a bunch of intermediary nodes,
which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes
are removed. Care is taken to not remove nodes that are protected, or
that otherwise restrict their children.
JoJoDeveloping added a commit to JoJoDeveloping/miri that referenced this pull request Aug 25, 2024
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no
longer cause a stack overflow. One test that motivated it was the test
`fill::horizontal_line` in `tiny_skia`. But not causing stack overflows
was not a large improvents, since it did not fix the fundamental issue:
The tree was too large. The test now ran, but it required gigabytes of
memory and hours of time, whereas it finishes within seconds in Stacked
Borrows.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate
a slice in chunks. That iterator is written to reborrow at each call to
`next`, which creates a linear tree with a bunch of intermediary nodes,
which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes
are removed. Care is taken to not remove nodes that are protected, or
that otherwise restrict their children.
JoJoDeveloping added a commit to JoJoDeveloping/miri that referenced this pull request Aug 25, 2024
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no
longer cause a stack overflow. One test that motivated it was the test
`fill::horizontal_line` in `tiny_skia`. But not causing stack overflows
was not a large improvents, since it did not fix the fundamental issue:
The tree was too large. The test now ran, but it required gigabytes of
memory and hours of time, whereas it finishes within seconds in Stacked
Borrows.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate
a slice in chunks. That iterator is written to reborrow at each call to
`next`, which creates a linear tree with a bunch of intermediary nodes,
which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes
are removed. Care is taken to not remove nodes that are protected, or
that otherwise restrict their children.
JoJoDeveloping added a commit to JoJoDeveloping/miri that referenced this pull request Aug 27, 2024
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no
longer cause a stack overflow. One test that motivated it was the test
`fill::horizontal_line` in `tiny_skia`. But not causing stack overflows
was not a large improvents, since it did not fix the fundamental issue:
The tree was too large. The test now ran, but it required gigabytes of
memory and hours of time, whereas it finishes within seconds in Stacked
Borrows.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate
a slice in chunks. That iterator is written to reborrow at each call to
`next`, which creates a linear tree with a bunch of intermediary nodes,
which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes
are removed. Care is taken to not remove nodes that are protected, or
that otherwise restrict their children.
JoJoDeveloping added a commit to JoJoDeveloping/miri that referenced this pull request Aug 27, 2024
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no
longer cause a stack overflow. One test that motivated it was the test
`fill::horizontal_line` in `tiny_skia`. But not causing stack overflows
was not a large improvents, since it did not fix the fundamental issue:
The tree was too large. The test now ran, but it required gigabytes of
memory and hours of time, whereas it finishes within seconds in Stacked
Borrows.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate
a slice in chunks. That iterator is written to reborrow at each call to
`next`, which creates a linear tree with a bunch of intermediary nodes,
which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes
are removed. Care is taken to not remove nodes that are protected, or
that otherwise restrict their children.
bors added a commit that referenced this pull request Aug 28, 2024
…alfJung

Make Tree Borrows Provenance GC compact the tree

Follow-up on #3833 and #3835. In these PRs, the TB GC was fixed to no longer cause a stack overflow. One test that motivated it was the test `fill::horizontal_line` in [`tiny-skia`](https://github.com/RazrFalcon/tiny-skia). But not causing stack overflows was not a large improvents, since it did not fix the fundamental issue: The tree was too large. The test now ran, but it required gigabytes of memory and hours of time (only for it to be OOM-killed 🤬), whereas it finishes within 24 seconds in Stacked Borrows. With this merged, it finishes in about 40 seconds under TB.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate a slice in chunks. That iterator is written to reborrow at each call to `next`, which creates a linear tree with a bunch of intermediary nodes, which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes are removed. Care is taken to not remove nodes that are protected, or that otherwise restrict their children.

I am currently only 99% sure that this is sound, and I do also think that this could compact even more. So `@Vanille-N` please also have a look at whether I got the compacting logic right.

For a more visual comparison, [here is a gist](https://gist.github.com/JoJoDeveloping/ae4a7f7c29335a4c233ef42d2f267b01) of what the tree looks like at one point during that test, with and without compacting.

This new GC requires a different iteration order during accesses (since the current one can make the error messages non-deterministic), so it is rebased on top of #3843 and requires that PR to be merged first.
JoJoDeveloping added a commit to JoJoDeveloping/miri that referenced this pull request Oct 27, 2024
Follow-up on rust-lang#3833 and rust-lang#3835. In these PRs, the TB GC was fixed to no
longer cause a stack overflow. One test that motivated it was the test
`fill::horizontal_line` in `tiny_skia`. But not causing stack overflows
was not a large improvents, since it did not fix the fundamental issue:
The tree was too large. The test now ran, but it required gigabytes of
memory and hours of time, whereas it finishes within seconds in Stacked
Borrows.

The problem in that test was that it used [`slice::chunked`](https://doc.rust-lang.org/std/primitive.slice.html#method.chunks) to iterate
a slice in chunks. That iterator is written to reborrow at each call to
`next`, which creates a linear tree with a bunch of intermediary nodes,
which also fragments the `RangeMap` for that allocation.

The solution is to now compact the tree, so that these interior nodes
are removed. Care is taken to not remove nodes that are protected, or
that otherwise restrict their children.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants