Performs slower than GNU parallel when transparent huge pages are enabled #22
Comments
Parameter substitution isn't required. When you don't provide a placeholder token, it is automatically inferred, so |
What processor do you have? Is it AMD or Intel? Cores? Hyper-threading? |
That kind of difference would rather stem from rustc difference than CPU difference.
Parallel is built from git with I've double-checked and I am indeed using version 0.5.0 of Rust parallel |
Forcing CPU into the highest frequency only makes GNU parallel a bit faster (3x difference again), does not affect Rust parallel. Explicitly passing |
I do believe that it is a rustc or LLVM bug. Sadly, I've noticed that my AMD FX 8120 also performs abysmally for some reason -- even slower than my mobile Intel CPU, and others have noted the same behavior. Intel processors aren't exhibiting this issue, and my benchmarks were taken from an i5-2410M CPU @ 2.30GHz using the Performance governor. |
I've reported the bug here: rust-lang/rust#36705, so maybe someone who works closer to the lower-level side of Rust can give some insight on why AMD hardware is executing much slower than Intel hardware. |
Here's a recorded sysprof session so whoever investigates this can see where the time is spent |
@Shnatsel / @mmstick: Just to be safe, are your AMD users compiling with optimizations turned on, a la |
Nope, enabling optimizations like that didn't help.
Still slow. |
I can also confirm that my AMD systems see no improvement from enabling native optimizations. I have perf data from both my Intel laptop and AMD desktop with both debug and release builds on the associated Rust issue: rust-lang/rust#36705 |
The solution to the problem is for Linux distributions to change their default transparent huge pages setting from always to madvise, as Solus did some time ago after experiencing issues with always: https://git.solus-project.com/packages/kernel/commit/?id=40b3b940348ce91ca7c03278f7f238a66883ad8f This should therefore be reported as a bug against your Linux distribution. |
For another datapoint, if you need to convince your distro, Fedora switched to madvise in 2014: |
I should clarify, I didn't mention the year for one-upmanship, but just to indicate experience with the change. If Solus only made this change a few weeks ago, other distros may wonder if they've fully experienced the fallout of that change. Fedora's longer period with madvise should inspire more confidence. |
Even earlier, Debian has used madvise since 2012: |
Considering Debian has been using it since 2012, I wonder why Ubuntu didn't decide to do the same. |
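For anyone who wants to check which mode their own system is using, here is a minimal sketch. It assumes the standard sysfs location /sys/kernel/mm/transparent_hugepage/enabled, where the kernel marks the active choice with square brackets. The helper name thp_active_mode is made up for this example and is not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the active transparent huge pages mode from a
# sysfs-style line. The kernel brackets the active choice, for example:
#   always [madvise] never
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Assumed standard sysfs path; skipped quietly if the file is not readable.
thp_file=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
    echo "current THP mode: $(thp_active_mode "$(cat "$thp_file")")"
fi

# To switch the running system (as root, not persistent across reboots):
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
```

If this prints always, the system is in the configuration this issue is about. Writing madvise to the same file changes it until the next reboot; a persistent change is usually made by the distribution or through the transparent_hugepage=madvise kernel boot parameter.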
The benchmark in README.md as of version 0.5.0 claims time 0:04 for Rust parallel vs 0:54 for GNU parallel. However, this benchmark is misleading because the command line used is completely useless:
seq 1 10000 | time -v parallel echo > /dev/null
would never print anything because it lacks parameter substitution. The correct command that actually does something would be
seq 1 10000 | time -v parallel echo '{}' > /dev/null
On my machine Rust parallel measures 1:02 vs 0:33 for GNU parallel for the actually useful command.