Skip to content

Improve optimizations for boolean values #15464

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 7, 2014
Merged

Conversation

dotdash
Copy link
Contributor

@dotdash dotdash commented Jul 5, 2014

LLVM doesn't handle i1 value in allocas/memory very well and skips a number of optimizations if it hits it. So we have to do the same thing that Clang does, using i1 for SSA values, but storing i8 in memory.

Fixes #15203.

@lilyball
Copy link
Contributor

lilyball commented Jul 5, 2014

I am not qualified to evaluate this patch, but do you have any benchmarks that demonstrate an improvement?

@dotdash
Copy link
Contributor Author

dotdash commented Jul 5, 2014

See the test case given in #15203. (I only had the "Fixes" on a commit, added it to the PR description now).

Before:

running 1 test
test bench_p3 ... bench:    319188 ns/iter (+/- 37902)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured

After:

running 1 test
test bench_p3 ... bench:    138881 ns/iter (+/- 13047)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured

The relevant optimization here is turning the store loop into a memset. It's not triggered for types which are not a byte-width multiple.

@dotdash dotdash changed the title Improve optimizations for booleans value Improve optimizations for boolean values Jul 5, 2014
let lltemp = builder.alloca(val_ty(llforeign_arg), "");
builder.store(llforeign_arg, lltemp);
llforeign_arg = lltemp;
llforeign_arg = if ty::type_is_bool(rust_ty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason isn't this using the store_ty abstraction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't, because this function doesn't use the regular building blocks like Block, but uses a Builder directly.

dotdash added 2 commits July 6, 2014 22:12
LLVM doesn't really like types with a bit-width that isn't a multiple of
8 and disable various optimizations if it encounters such types used
with loads/stores. OTOH, booleans must be represented as i1 when used as
SSA values. To get the best results, we must use i1 for SSA values, and
i8 when storing the value to memory.

By using range asserts on loads, LLVM can eliminate the required
zero-extend and truncate operations.

Fixes rust-lang#15203
bors added a commit that referenced this pull request Jul 7, 2014
LLVM doesn't handle i1 value in allocas/memory very well and skips a number of optimizations if it hits it. So we have to do the same thing that Clang does, using i1 for SSA values, but storing i8 in memory.

Fixes #15203.
@bors bors closed this Jul 7, 2014
@bors bors merged commit dd4112b into rust-lang:master Jul 7, 2014
@dotdash dotdash deleted the bool_stores branch February 4, 2015 12:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Performance regression
5 participants