Add chapter on fuzzing #1646
Changes from all commits
7a8e791
48864ab
b941d22
7e6c2ea
80b04f9
19cafc0
a38cd10
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,149 @@ | ||
# Fuzzing | ||
|
||
<!-- date-check: Mar 2023 --> | ||
|
||
For the purposes of this guide, *fuzzing* is any testing methodology that | ||
involves compiling a wide variety of programs in an attempt to uncover bugs in | ||
rustc. Fuzzing is often used to find internal compiler errors (ICEs). It is | ||
valuable because it can find bugs before users run into them and can | ||
provide small, self-contained programs that make a bug easier to track down. | ||
However, some common mistakes can reduce the helpfulness of fuzzing and end up | ||
making contributors' lives harder. To maximize your positive impact on the Rust | ||
project, please read this guide before reporting fuzzer-generated bugs! | ||
|
||
## Guidelines | ||
|
||
### In a nutshell | ||
|
||
*Please do:* | ||
|
||
- Ensure the bug is still present on the latest nightly rustc | ||
- Include a reasonably minimal, standalone example along with any bug report | ||
- Include all of the information requested in the bug report template | ||
- Search for existing reports with the same message and query stack | ||
- Format the test case with `rustfmt`, if it maintains the bug | ||
|
||
- Indicate that the bug was found by fuzzing | ||
I would like to add something like "remove distractions". Often, fuzzed snippets are full of weird code and one of (or even multiple together) these weirdnesses is triggering the ICE. While it's not always easy without understanding the issue, at least trying to remove as many weirdnesses as possible for a few minutes can be great in making the report more obvious.
I see you already mentioned something like this below, but I really like the "distractions" wording, Generally, a little less minimal but less "weird" example is easier to understand.
This is discussed a bit later on, under "extra credit" and also under "minimization". My inclination is to avoid putting too many things in the "Please do" list to make sure that people don't see it and think either (1) "That's so much work! I'm not going to do all that, I'm just going to report it however I feel" or (2) "That's so much work! I'm going to fuzz something else.". I think we want a short list of guidelines that include everything at the "just right" intersection of important and easy. I put bisection (even though it's important), minimization, and adding to Glacier under "extra credit", because they seem a bit harder. I put All of that being said! I've never fixed a bug in rustc, so I don't have a good picture of what's most important.
I clarified this point in the "extra credit" section and used this wording!
|
||
|
||
*Please don't:* | ||
|
||
- Report lots of bugs that use internal features, including but not | ||
limited to `custom_mir`, `lang_items`, `no_core`, and `rustc_attrs`. | ||
- Seed your fuzzer with inputs that are known to crash rustc (details | ||
below). | ||
|
||
### Discussion | ||
|
||
If you're not sure whether or not an ICE is a duplicate of one that's already | ||
been reported, please go ahead and report it and link to issues you think might | ||
be related. ICEs raised on the same line but with different *query stacks* | ||
are usually distinct bugs. For example, [#109020][#109020] and [#109129][#109129] | ||
had similar error messages: | ||
|
||
``` | ||
error: internal compiler error: compiler/rustc_middle/src/ty/normalize_erasing_regions.rs:195:90: Failed to normalize <[closure@src/main.rs:36:25: 36:28] as std::ops::FnOnce<(Emplacable<()>,)>>::Output, maybe try to call `try_normalize_erasing_regions` instead | ||
``` | ||
``` | ||
error: internal compiler error: compiler/rustc_middle/src/ty/normalize_erasing_regions.rs:195:90: Failed to normalize <() as Project>::Assoc, maybe try to call `try_normalize_erasing_regions` instead | ||
``` | ||
but different query stacks: | ||
``` | ||
query stack during panic: | ||
#0 [fn_abi_of_instance] computing call ABI of `<[closure@src/main.rs:36:25: 36:28] as core::ops::function::FnOnce<(Emplacable<()>,)>>::call_once - shim(vtable)` | ||
end of query stack | ||
``` | ||
``` | ||
query stack during panic: | ||
#0 [check_mod_attrs] checking attributes in top-level module | ||
#1 [analysis] running analysis passes on this crate | ||
end of query stack | ||
``` | ||
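
When triaging many fuzzer-generated reports, comparing query stacks by hand gets tedious. The following is a minimal sketch, not part of rustc or any existing tool, that pulls the query names out of captured stderr so that two reports can be compared. The function name `query_stack` is hypothetical, and the parsing assumes the output format shown in the examples above.

```rust
/// Extracts the query names (for example `check_mod_attrs`) from the
/// "query stack during panic" section of rustc's stderr.
/// Returns an empty vector if no query stack is present.
fn query_stack(stderr: &str) -> Vec<String> {
    stderr
        .lines()
        // Skip everything up to and including the header line.
        .skip_while(|line| !line.contains("query stack during panic"))
        .skip(1)
        // Stop at the footer line.
        .take_while(|line| !line.contains("end of query stack"))
        .filter_map(|line| {
            // Frames look like: `#0 [check_mod_attrs] checking attributes ...`
            let line = line.trim_start();
            if !line.starts_with('#') {
                return None;
            }
            let start = line.find('[')?;
            let end = line[start..].find(']')? + start;
            Some(line[start + 1..end].to_string())
        })
        .collect()
}
```

Two ICEs whose `query_stack` results differ are worth reporting separately even when the panic location is the same. Identical stacks are only a hint that the reports may be duplicates, not proof.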
|
||
[#109020]: https://github.com/rust-lang/rust/issues/109020 | ||
[#109129]: https://github.com/rust-lang/rust/issues/109129 | ||
|
||
## Building a corpus | ||
|
||
When building a corpus, be sure to avoid collecting tests that are already | ||
known to crash rustc. A fuzzer that is seeded with such tests is more likely to | ||
generate bugs with the same root cause, wasting everyone's time. The simplest | ||
way to avoid this is to loop over each file in the corpus, see if it causes an | ||
ICE, and remove it if so. | ||
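
The loop described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not an existing tool: the function names `looks_like_ice` and `prune_crashing_inputs` are hypothetical, and ICE detection is a heuristic (exit status 101, or the phrase "internal compiler error" on stderr). It does not handle timeouts or nested directories.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Heuristic ICE check: rustc exits with status 101 when it panics, and ICE
/// reports contain the phrase "internal compiler error".
fn looks_like_ice(exit_code: Option<i32>, stderr: &str) -> bool {
    exit_code == Some(101) || stderr.contains("internal compiler error")
}

/// Runs `rustc` on every `.rs` file directly inside `corpus_dir` and deletes
/// the files that crash the compiler. Returns the number of files removed.
fn prune_crashing_inputs(rustc: &str, corpus_dir: &Path, out_dir: &Path) -> io::Result<usize> {
    // Collect the paths first so that files are not deleted mid-iteration.
    let mut paths: Vec<PathBuf> = Vec::new();
    for entry in fs::read_dir(corpus_dir)? {
        paths.push(entry?.path());
    }

    let mut removed = 0;
    for path in paths {
        if path.extension().and_then(|ext| ext.to_str()) != Some("rs") {
            continue;
        }
        // `--emit=mir` avoids spending time on machine-code generation.
        let output = Command::new(rustc)
            .arg("--emit=mir")
            .arg("--out-dir")
            .arg(out_dir)
            .arg(&path)
            .output()?;
        let stderr = String::from_utf8_lossy(&output.stderr);
        if looks_like_ice(output.status.code(), &stderr) {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

A call might look like `prune_crashing_inputs("rustc", Path::new("corpus"), Path::new("out"))`, where both directory names are placeholders. Run it on a copy of the corpus, since it deletes files.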
|
||
To build a corpus, you may want to use: | ||
|
||
- The rustc/rust-analyzer/clippy test suites (or even source code) --- though avoid | ||
tests that are already known to cause failures, which often begin with comments | ||
like `// failure-status: 101` or `// known-bug: #NNN`. | ||
- The already-fixed ICEs in [Glacier][glacier] --- though avoid the unfixed | ||
ones in `ices/`! | ||
|
||
## Extra credit | ||
|
||
Here are a few things you can do to help the Rust project after filing an ICE. | ||
|
||
- [Bisect][bisect] the bug to figure out when it was introduced | ||
✨ If this list is arbitrarily ordered, I would prefer if we reorder this in terms of value to the ICE solver / compiler person like me 😸 :
My only thought here is that it's possible to shrink the test case in terms of bytes, LoC, etc. without reducing the number of unrelated errors (indeed, while increasing them).
|
||
- Fix "distractions": problems with the test case that don't contribute to | ||
triggering the ICE, such as syntax errors or borrow-checking errors | ||
- Minimize the test case (see below) | ||
- Add the minimal test case to [Glacier][glacier] | ||
|
||
[bisect]: https://github.com/rust-lang/cargo-bisect-rustc/blob/master/TUTORIAL.md | ||
|
||
## Minimization | ||
I just found two Zulip threads about minimization:
One of them mentions
(Might not do this before merging, but I will try to get to it at some point!)
|
||
|
||
It is helpful to carefully *minimize* the fuzzer-generated input. When | ||
minimizing, be careful to preserve the original error, and avoid introducing | ||
distracting problems such as syntax, type-checking, or borrow-checking errors. | ||
|
||
There are some tools that can help with minimization. If you're not sure how | ||
to avoid introducing syntax, type-, and borrow-checking errors while using | ||
these tools, post both the complete and minimized test cases. Generally, | ||
*syntax-aware* tools give the best results in the least amount of time. | ||
[`treereduce-rust`][treereduce] and [picireny][picireny] are syntax-aware. | ||
[`halfempty`][halfempty] is not, but is generally a high-quality tool. | ||
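
Reduction tools are typically driven by an "interestingness" check that decides whether a candidate still reproduces the bug. The sketch below shows one way such a check could be written over captured stderr. It is an assumption-laden illustration, not the interface of any tool named above: `is_interesting` is a hypothetical name, and it only detects errors that carry an error code (such as `error[E0308]`), so uncoded syntax errors would slip through.

```rust
/// Returns true if a candidate reduction still shows the original ICE and
/// has not picked up coded compile errors along the way.
///
/// `original_message` is a distinctive fragment of the original ICE message,
/// chosen by the person doing the minimization.
fn is_interesting(stderr: &str, original_message: &str) -> bool {
    // The original error must be preserved.
    let same_ice = stderr.contains(original_message);
    // Type- and borrow-checking errors carry a code such as `error[E0308]`.
    let distracting = stderr
        .lines()
        .any(|line| line.trim_start().starts_with("error[E"));
    same_ice && !distracting
}
```

Rejecting candidates that introduce new errors makes each reduction step stricter, so the result may be slightly larger, but it is usually easier to read.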
|
||
|
||
[halfempty]: https://github.com/googleprojectzero/halfempty | ||
[picireny]: https://github.com/renatahodovan/picireny | ||
[treereduce]: https://github.com/langston-barrett/treereduce | ||
|
||
## Effective fuzzing | ||
|
||
When fuzzing rustc, you may want to avoid generating machine code, since this | ||
is mostly done by LLVM. Try `--emit=mir` instead. | ||
|
||
A variety of compiler flags can uncover different issues. `-Zmir-opt-level=4` | ||
will turn on MIR optimization passes that are not run by default, potentially | ||
uncovering interesting bugs. `-Zvalidate-mir` validates MIR after each | ||
transformation, which can help uncover such bugs. | ||
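
A fuzzing harness can try each input under several flag combinations. The following sketch builds the argument lists for the flags mentioned above. The function name `flag_configurations` and the particular three combinations are illustrative assumptions. Note that `-Z` flags are only accepted by a nightly (or locally built) rustc.

```rust
/// Builds rustc argument lists for a few flag combinations.
/// Every list starts with `--emit=mir` and ends with the input file.
fn flag_configurations(input: &str) -> Vec<Vec<String>> {
    let extra_flag_sets: [&[&str]; 3] = [
        // Defaults only.
        &[],
        // Validate MIR with the default optimization level.
        &["-Zvalidate-mir"],
        // Enable extra MIR optimization passes and validate their output.
        &["-Zmir-opt-level=4", "-Zvalidate-mir"],
    ];

    let mut configurations = Vec::new();
    for extra_flags in extra_flag_sets {
        let mut args = vec!["--emit=mir".to_string()];
        args.extend(extra_flags.iter().map(|flag| flag.to_string()));
        args.push(input.to_string());
        configurations.push(args);
    }
    configurations
}
```

Each inner vector can be passed to `std::process::Command::args`. Running every input under every configuration multiplies the cost per input, so it may be worth measuring which combinations actually find new bugs.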
|
||
If you're fuzzing a compiler you built, you may want to build it with `-C | ||
target-cpu=native` or even PGO/BOLT to squeeze out a few more executions per | ||
second. Of course, it's best to try multiple build configurations and see | ||
what actually results in superior throughput. | ||
|
||
You may want to build rustc from source with debug assertions to find | ||
additional bugs, though this is a trade-off: it can slow down fuzzing by | ||
requiring extra work for every execution. To enable debug assertions, add this | ||
to `config.toml` when compiling rustc: | ||
|
||
```toml | ||
[rust] | ||
debug-assertions = true | ||
``` | ||
|
||
ICEs that require debug assertions to reproduce should be tagged | ||
[`requires-debug-assertions`][requires-debug-assertions]. | ||
|
||
[requires-debug-assertions]: https://github.com/rust-lang/rust/labels/requires-debug-assertions | ||
|
||
## Existing projects | ||
|
||
- [fuzz-rustc][fuzz-rustc] demonstrates how to fuzz rustc with libfuzzer | ||
- [icemaker][icemaker] runs rustc and other tools on a large number of source | ||
files with a variety of flags to catch ICEs | ||
- [tree-splicer][tree-splicer] generates new source files by combining existing | ||
ones while maintaining correct syntax | ||
|
||
[glacier]: https://github.com/rust-lang/glacier | ||
[fuzz-rustc]: https://github.com/dwrensha/fuzz-rustc | ||
[icemaker]: https://github.com/matthiaskrgr/icemaker/ | ||
[tree-splicer]: https://github.com/langston-barrett/tree-splicer/ |