This repository was archived by the owner on Jul 11, 2019. It is now read-only.

Performs slower than GNU parallel when transparent huge pages are enabled #22

Closed
Shnatsel opened this issue Sep 24, 2016 · 15 comments

Comments

@Shnatsel

The benchmark in README.md as of version 0.5.0 claims a time of 0:04 for Rust parallel vs 0:54 for GNU parallel. However, this benchmark is misleading because the command line used is completely useless:

seq 1 10000 | time -v parallel echo > /dev/null would never print anything because it lacks parameter substitution. The correct command that actually does something would be seq 1 10000 | time -v parallel echo '{}' > /dev/null

On my machine, with the command that actually does something, Rust parallel takes 1:02 vs 0:33 for GNU parallel.

@mmstick
Owner

mmstick commented Sep 24, 2016

Parameter substitution isn't required. When you don't provide a placeholder token, it is automatically inferred, so parallel echo '{}' is equivalent to parallel echo. I'm not sure why it would be slower on your system. Using either command still gives me the same results for my version, with GNU Parallel taking 40x longer.
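
As an illustration only, the inference described above can be sketched in shell. This is not the project's actual code; the function name infer_template is hypothetical, and the sketch assumes the only rule is "append {} when the command contains no placeholder".

```shell
#!/bin/sh
# Hypothetical helper: return the command template unchanged if it already
# contains a "{}" placeholder, otherwise append one at the end.
infer_template() {
    case "$1" in
        *'{}'*) printf '%s\n' "$1" ;;
        *)      printf '%s {}\n' "$1" ;;
    esac
}

infer_template "echo"      # prints: echo {}
infer_template "echo {}"   # prints: echo {}
```

Under that assumption both invocations resolve to the same template, which is why the two benchmark commands would do the same work.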

@mmstick
Owner

mmstick commented Sep 24, 2016

What processor do you have? Is it AMD or Intel? Cores? Hyper-threading?

@Shnatsel
Author

Shnatsel commented Sep 25, 2016

Here's my /proc/cpuinfo

A difference that large is more likely to stem from a different rustc version than from a different CPU.

$ rustc --version
rustc 1.11.0 (9b21dcd6a 2016-08-15)

Parallel is built from git with cargo build --release

I've double-checked and I am indeed using version 0.5.0 of Rust parallel

@Shnatsel
Author

Forcing the CPU to its highest frequency only makes GNU parallel a bit faster (3x difference again) and does not affect Rust parallel.

Explicitly passing -j 4 does not change anything either.

@mmstick
Owner

mmstick commented Sep 25, 2016

I do believe that it is a rustc or LLVM bug. Sadly, I've noticed that my AMD FX 8120 also performs abysmally for some reason, even slower than my mobile Intel CPU, and others have noted the same behavior. Intel processors aren't exhibiting this issue, and my benchmarks were taken on an i5-2410M CPU @ 2.30GHz using the Performance governor.

@Shnatsel changed the title from "Benchmark in README.md is incorrect/misleading" to "Performs slower than GNU parallel on non-Intel CPUs" on Sep 25, 2016
@mmstick
Owner

mmstick commented Sep 25, 2016

I've reported the bug at rust-lang/rust#36705, so maybe someone who works closer to the lower-level side of Rust can give some insight into why AMD hardware is executing much slower than Intel hardware.

@Shnatsel
Author

Here's a recorded sysprof session, so whoever investigates this can see where the time is spent:

sysprof-log-for-weird-performance.zip

@erickt

erickt commented Sep 25, 2016

@Shnatsel / @mmstick: Just to be safe, are your AMD users compiling with optimizations turned on, a la cargo build --release ...? If not, perhaps enabling native CPU optimizations with RUSTFLAGS="-C target-cpu=native" cargo build --release ... might actually get the program properly optimized. That could help lead us towards what's going on if it's still performing poorly.

@Shnatsel
Author

Nope, enabling optimizations like that didn't help.

export RUSTFLAGS="-C target-cpu=native"
cargo build --release

Still slow.

@mmstick
Owner

mmstick commented Sep 25, 2016

I can also confirm that my AMD systems see no improvement from enabling native optimizations. I have perf data from both my Intel laptop and AMD desktop with both debug and release builds on the associated Rust issue: rust-lang/rust#36705

@Shnatsel changed the title from "Performs slower than GNU parallel on non-Intel CPUs" to "Performs slower than GNU parallel when transparent huge pages are enabled" on Sep 27, 2016
@mmstick
Owner

mmstick commented Sep 28, 2016

The solution to the problem is for Linux distributions to change their default transparent huge pages setting from always to madvise, as Solus did some time ago after experiencing issues with always: https://git.solus-project.com/packages/kernel/commit/?id=40b3b940348ce91ca7c03278f7f238a66883ad8f

Therefore, this should be reported as a bug against your Linux distribution.
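
For anyone who wants to check their own system first, here is a minimal sketch. It assumes the standard Linux sysfs location for the transparent huge pages switch; the helper names parse_thp_mode and current_thp_mode, and the THP_FILE override, are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Assumed sysfs location of the transparent huge pages (THP) switch on Linux.
THP_FILE="${THP_FILE:-/sys/kernel/mm/transparent_hugepage/enabled}"

# Extract the active mode from a line such as "always [madvise] never";
# the kernel marks the active value with square brackets.
parse_thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Print the active mode, or "unknown" if the file is missing or unreadable.
current_thp_mode() {
    if [ -r "$THP_FILE" ]; then
        parse_thp_mode "$(cat "$THP_FILE")"
    else
        echo "unknown"
    fi
}

echo "THP mode: $(current_thp_mode)"

# To switch modes until the next reboot (needs root), one could run:
#   echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
# and then repeat the benchmark from this thread:
#   seq 1 10000 | time -v parallel echo '{}' > /dev/null
```

The script only reads the setting. The write is left as a comment because it needs root and changes system-wide behavior, and it does not persist across reboots, which is why the distribution default matters.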

@mmstick closed this as completed on Sep 28, 2016
@cuviper

cuviper commented Sep 28, 2016

For another datapoint, if you need to convince your distro, Fedora switched to madvise in 2014:
http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/commit/?id=9a031d5070d9f8f5916c48637bd0c237cd52eaf9

@cuviper

cuviper commented Sep 28, 2016

I should clarify, I didn't mention the year for one-upmanship, but just to indicate experience with the change. If Solus only made this change a few weeks ago, other distros may wonder if they've fully experienced the fallout of that change. Fedora's longer period with madvise should inspire more confidence.

@cuviper

cuviper commented Sep 28, 2016

Even earlier, Debian has used madvise since 2012:
https://anonscm.debian.org/cgit/kernel/linux.git/commit/debian/config/config?id=c36d637ccaf86c15082a41018160b2bc9a431440
(I'll stop digging now.)

@mmstick
Owner

mmstick commented Sep 29, 2016

Considering Debian has been using it since 2012, I wonder why Ubuntu didn't decide to do the same.


4 participants