-
-
Notifications
You must be signed in to change notification settings - Fork 465
Uniform distribution: bias and usize portability #809
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It's just the usual hit form rejection sampling, yes? |
All of these use rejection sampling, which makes the cost variable (especially when the required range is within a factor of 2 of the type's range). I'm not sure why the |
Maybe it is branch misprediction? You could try running the benchmarks separately with I think we should have the bias corrections for sure. I'm not so sure about the portability improvements, there is a lot of code duplication (that could be reduced with macros). I would prefer not to promise value-stability for different platforms for platform-dependent types, and state this in the documentation. The tests could be fixed accordingly. I think for sampling |
Good point @vks that this would be better done within Compatibility: the changes to Performance: there are some minor wins and losses; nothing too significant I think:
|
@@ -451,6 +451,18 @@ impl<'a, S: Index<usize, Output = T> + ?Sized + 'a, T: 'a> ExactSizeIterator | |||
} | |||
|
|||
|
|||
// Sample a number uniformly between 0 and `ubound`. Uses 32-bit sampling where | |||
// possible, primarily in order to produce the same output on 32-bit and 64-bit | |||
// platforms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add #[inline]
to encourage LLVM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense but has negligible effect on benchmarks (seq_iter_choose_from_1000
and seq_iter_window_hinted_choose_from_1000
still being about 12% slower than before this PR). But we can live with this small hit.
Other than the broken tests, this looks good! |
I dropped the uniformity test which was responsible for most of the failures and appears useless. |
The remaining failures will be fixed by #813, so I think this can be merged. |
The usize64 bench is noticably slower than the others, perhaps due to use of rejection sampling.
Signed extension of zone was incorrect. This method has near identical performance in benchmarks.
Primarily for value stability, also slight performance boost.
Rebased on master; hopefully it passes this time |
Bias
I noticed that the
zone
produced via signed extension is not a multiple of therange
. This means that the implementations for 8- and 16-bit types (@pitdicker) were biased. I added a fix for this which does not appear to have any performance impact.It's worth mentioning however that the bias is tiny: the biggest deviation from a multiple of the range I could find was 32579 (for range = 65355); since this is sampling from a
u32
the probability of generating a biased sample is thus ~7.6e-6. With 100 million samples this bias is still lost in the noise.Note: the included uniformity test is unfinished because it's pretty useless. Included for example but it should probably just be deleted.
Portability for isize/usize samples
As discussed in #805. Unfortunately this does have a performance hit:
It also results in quite a bit of redundant, ugly code. As such I'm not happy about adding it (though it would be nice to have).
Thoughts? @burdges @vks @pitdicker