Commit d64af77

Merge pull request #1766 from ravicodelabs/fix-asm-example-wording
Fix asm example explanation for `inlateout` usage (22.1 Inline Assembly)
2 parents d272717 + 93b0b9a commit d64af77

1 file changed: +6 -5 lines changed
src/unsafe/asm.md

Lines changed: 6 additions & 5 deletions
@@ -139,8 +139,8 @@ can be written at any time, and can therefore not share its location with any ot
 However, to guarantee optimal performance it is important to use as few registers as possible,
 so they won't have to be saved and reloaded around the inline assembly block.
 To achieve this Rust provides a `lateout` specifier. This can be used on any output that is
-written only after all inputs have been consumed.
-There is also a `inlateout` variant of this specifier.
+written only after all inputs have been consumed. There is also an `inlateout` variant of this
+specifier.
 
 Here is an example where `inlateout` *cannot* be used in `release` mode or other optimized cases:
 
@@ -163,11 +163,12 @@ unsafe {
 assert_eq!(a, 12);
 # }
 ```
-The above could work well in unoptimized cases (`Debug` mode), but if you want optimized performance (`release` mode or other optimized cases), it could not work.
 
-That is because in optimized cases, the compiler is free to allocate the same register for inputs `b` and `c` since it knows they have the same value. However it must allocate a separate register for `a` since it uses `inout` and not `inlateout`. If `inlateout` was used, then `a` and `c` could be allocated to the same register, in which case the first instruction to overwrite the value of `c` and cause the assembly code to produce the wrong result.
+In unoptimized cases (e.g. `Debug` mode), replacing `inout(reg) a` with `inlateout(reg) a` in the above example can continue to give the expected result. However, with `release` mode or other optimized cases, using `inlateout(reg) a` can instead lead to the final value `a = 16`, causing the assertion to fail.
 
-However the following example can use `inlateout` since the output is only modified after all input registers have been read:
+This is because in optimized cases, the compiler is free to allocate the same register for inputs `b` and `c` since it knows that they have the same value. Furthermore, when `inlateout` is used, `a` and `c` could be allocated to the same register, in which case the first `add` instruction would overwrite the initial load from variable `c`. This is in contrast to how using `inout(reg) a` ensures a separate register is allocated for `a`.
+
+However, the following example can use `inlateout` since the output is only modified after all input registers have been read:
 
 ```rust
 # #[cfg(target_arch = "x86_64")] {
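
The register-sharing argument in the new wording can be checked with a toy model. The sketch below is not real inline assembly and does not reflect how the compiler actually allocates registers. It assumes the example block consists of two instructions, `add {0}, {1}` then `add {0}, {2}`, with `a = b = c = 4` (consistent with the `assert_eq!(a, 12)` in the hunk and the `a = 16` claim). The function `run_block` and its register-index parameters are hypothetical names introduced only for illustration.

```rust
/// Toy model of the block `add {0}, {1}` followed by `add {0}, {2}`.
/// `ra`, `rb`, `rc` are hypothetical register indices (0..3) chosen for
/// the operands `a`, `b`, `c`. This is a simulation, not real `asm!`.
fn run_block(a: u64, b: u64, c: u64, ra: usize, rb: usize, rc: usize) -> u64 {
    // Assumption: operands may share a register only if they start equal.
    assert!(ra != rb || a == b);
    assert!(ra != rc || a == c);
    assert!(rb != rc || b == c);

    // Load the initial values into the tiny register file.
    let mut regs = [0u64; 3];
    regs[ra] = a;
    regs[rb] = b;
    regs[rc] = c;

    regs[ra] = regs[ra].wrapping_add(regs[rb]); // add {0}, {1}
    regs[ra] = regs[ra].wrapping_add(regs[rc]); // add {0}, {2}
    regs[ra]
}

fn main() {
    // `inout`-style allocation: `a` has its own register; `b` and `c` share one.
    let separate = run_block(4, 4, 4, 0, 1, 1);
    // `inlateout`-style allocation: `a` is additionally placed in `c`'s register.
    let aliased = run_block(4, 4, 4, 0, 1, 0);
    println!("a with a separate register: {}", separate);
    println!("a sharing a register with c: {}", aliased);
}
```

Under these assumptions, the separate-register case yields 12, while the shared-register case yields 16, because the first `add` has already changed the register that the second `add` reads as `c`.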
