Fix NumericRange[Double] #489

eed3si9n · 2018-03-03T14:12:36Z

Ref scala/bug#8518
Ref scala/bug#9875
Ref scala/bug#9874
Ref scala/bug#8620
Ref scala/bug#8670

NumericRange[Double] has been broken because various parts of the code assumes sane plus operation, which is not possible with Double or Float. Within a few iterations of adding 0.1 or 0.2 it can quickly accumulate floating errors that are surprising.

scala> import collection.immutable.NumericRange
import collection.immutable.NumericRange

scala> val r = NumericRange.inclusive(1.0, 2.0, 0.2)(scala.math.Numeric.DoubleAsIfIntegral)
r: scala.collection.immutable.NumericRange.Inclusive[Double] = NumericRange 1.0 to 2.0 by 0.2

scala> assert(r(3) == r.toList(3), s"${r(3)} != ${r.toList(3)}")
java.lang.AssertionError: assertion failed: 1.6 != 1.5999999999999999

scala> val x = NumericRange.inclusive(0.0, 1.0, 0.1)(scala.math.Numeric.DoubleAsIfIntegral)
x: scala.collection.immutable.NumericRange.Inclusive[Double] = NumericRange 0.0 to 1.0 by 0.1

scala> x.drop(3).take(3).toList
res3: List[Double] = List(0.30000000000000004, 0.4) // take(3) mean taking three thing usually

When encountering Double or Float, this internally uses BigDecimal, which can add numbers without losing precision. This will produce range iterator that looks like List(1.0, 1.2, 1.4, 1.6, 1.8, 2.0) when converted into List, as opposed to List(1.0, 1.2, 1.4, 1.5999999999999999, 1.7999999999999998, 1.9999999999999998).

Ref scala/bug#8518 Ref scala/bug#9875 Ref scala/bug#9874 NumericRange[Double] has been broken because various parts of the code assumes sane plus operation, which is not possible with Double or Float. Within a few iterations of adding 0.1 or 0.2 it can quickly accumulate floating errors that are surprising. ```scala scala> import collection.immutable.NumericRange import collection.immutable.NumericRange scala> val r = NumericRange.inclusive(1.0, 2.0, 0.2)(scala.math.Numeric.DoubleAsIfIntegral) r: scala.collection.immutable.NumericRange.Inclusive[Double] = NumericRange 1.0 to 2.0 by 0.2 scala> assert(r(3) == r.toList(3), s"${r(3)} != ${r.toList(3)}") java.lang.AssertionError: assertion failed: 1.6 != 1.5999999999999999 scala> val x = NumericRange.inclusive(0.0, 1.0, 0.1)(scala.math.Numeric.DoubleAsIfIntegral) x: scala.collection.immutable.NumericRange.Inclusive[Double] = NumericRange 0.0 to 1.0 by 0.1 scala> x.drop(3).take(3).toList res3: List[Double] = List(0.30000000000000004, 0.4) ``` When encountering Double or Float, this internally uses BigDecimal, which can add numbers without losing precision. This will produce range iterator that looks like `List(1.0, 1.2, 1.4, 1.6, 1.8, 2.0)` when converted into List, as opposed to `List(1.0, 1.2, 1.4, 1.5999999999999999, 1.7999999999999998, 1.9999999999999998)`.

Ref scala/bug#8670

pnf · 2018-03-03T17:15:10Z

Did you consider Kahan’s algorithm? It might be accurate enough and also more efficient.

pnf · 2018-03-03T17:16:09Z

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

eed3si9n · 2018-03-03T22:09:37Z

@pnf Thanks for the pointer. It didn't occur to me, but it makes sense.

eed3si9n · 2018-03-04T00:38:59Z

I implemented Kahan summation in https://github.com/eed3si9n/collection-strawman/tree/wip/double-range2, but it wasn't accurate enough out of the box.

For test I am using val r = NumericRange.inclusive(0.0, 1000.0, 0.1)(DoubleAsIfIntegral).

collection-strawman> junit/testOnly strawman.collection.NumericRangeTest
[info] Compiling 1 Scala source to /Users/eed3si9n/work/collection-strawman/collections/.jvm/target/scala-2.13.0-M2/classes...
[info] Test run started
[info] Test strawman.collection.NumericRangeTest.emptyIterator started
[info] Test strawman.collection.NumericRangeTest.nonEmptyIterator started
[info] Test strawman.collection.NumericRangeTest.doubleRangeToList started
[info] Test strawman.collection.NumericRangeTest.doubleRangeUneven started
[info] Test strawman.collection.NumericRangeTest.doubleRangeContains started
List(0.0, 0.1, 0.2, 0.30000000000000004, ..snip.. 
37.800000000000004, 37.9, 38.0, 38.1, 38.2, 38.300000000000004, 38.400000000000006, 
38.5, 38.6, 38.7, 38.800000000000004, 38.900000000000006, 39.0, 39.1, 39.2, 
39.300000000000004, 39.400000000000006, 39.5, 39.6, 39.7, 39.800000000000004, 
39.900000000000006, 40.0, 40.1, 40.2, 40.300000000000004, 40.400000000000006, 40.5, 
40.6, 40.7, 40.800000000000004, 40.900000000000006, 41.0, 41.1, 41.2, 41.300000000000004, 
41.400000000000006, 41.5, 41.6, 41.7, 41.800000000000004, 41.900000000000006, 42.0, 
..snip.., 1000.0)
[error] Test strawman.collection.NumericRangeTest.doubleRangeContains failed: java.lang.AssertionError: expected:<-1> but was:<3>, took 0.05 sec

As you can see there the lower bits are pretty noisy, and it's noisy enough for contains(r(i)) to not pass. This required me to run an aggressive round off. I chose 12 decimal, but it probably needed to be calibrated based on the range and step.

But probably because of the rounding, it actually became slower than doing things in BigDecimal land. Here's looping on 10000 points and calling contains(r(i)):

Kahan with rounding
11156 ms
11189 ms
11464 ms

BigDecial without rounding
3416 ms
3011 ms
3263 ms

sjrd · 2018-03-04T04:02:48Z

Reading from my phone right now, so I'm not completely sure, but I think this PR will have severe consequences on the code size in Scala.js, because it will pull in the big number implementation every time one uses NumericRange, whether with floating point numbers or other Numerics.

eed3si9n · 2018-03-04T04:12:06Z

@sjrd Perhaps I should refactor and provide Double specific Range class?

julienrf · 2018-03-05T08:07:28Z

@eed3si9n It looks like you are still using BigDecimals in the base NumericRange implementation, right?

eed3si9n · 2018-03-05T08:11:12Z

@julienrf It's used in NumericRange companion, but I tried to not use them in sealed class NumericRange[T], assuming that's where it hurts Scala.JS if it needs to code gen for a bunch of types.

Ichoran · 2018-03-05T19:08:07Z

Seth suggests we discuss this here instead of over at scala/bug#10759, so (reposting):

Is it possible to drop numeric ranges that pretend to be for Float and Double? These are just bad news due to numeric imprecision. If you say something like 0.1 to 0.7 by (0.1 + 0.2) you would think you would have three steps, but you either won't, or you will but other code will be randomly inaccurate (because you rounded things enough to make the above work). For what it's worth, Octave's equivalent, [0.1:x:0.7] produces [0.1, 0.4, 0.7] if x is up to 3e-16 above 0.3, but [0.1, 0.4] if it is 4e-16 above 0.3. (And you also get weird cases where for some values of x you end on the expected endpoint when the endpoint is small but miss it if it's bigger.) This just emphasizes that nobody can really "do it right".

Normally I don't like breaking changes, but in the case where there is no way to do the right thing except by not actually using the type that is stated, I think the better thing to do is remove the feature.

Note that sanity is possible if you have typed in numeric literals but not if you just have x to y by z because you can't know whether the imprecision in x and y and z is intended or not. If you can't even tell if your inputs honestly represent the intended values, using BigDecimal is not enough to recover non-surprising behavior. And I'm not sure being "better but still flawed" is justification enough to basically use a different data type than the one you're saying you're using.

SethTisue · 2018-03-06T01:23:41Z

I lean towards keeping it, and perhaps adding some cautionary language to the Scaladoc.

If we get rid of DoubleRange and the like, it won't stop people from inlining their own versions into their code and making all the same mistakes. I think we should provide the best version we can and document any limitations or sharp edges. Some people will still get hurt, but they'd have gotten hurt anyway, maybe worse.

Let's give people clean needles, rather than hope that if they only have dirty needles, they'll stop taking drugs. Because they won't stop.

optician · 2018-03-06T03:50:25Z

I agree with @SethTisue. Path through pitfall is unavoidable and better to have well-known pitfall rather than private handmade . Every ambitious dreamer can dig own anyway.

eed3si9n · 2018-03-06T04:40:37Z

As it stands, Double range is so broken that I can't imagine anyone using it in production.
So, there is a danger in giving users a real-enough looking vehicle that works for day-to-day usage, but the wheels come off when going over 70mph on the highway (112km/h).

Although I enjoyed addressing a handful of issues in this PR, I also don't want to be responsible for accident with Real consequences (pun). I am ok with strawman withdrawing the Float and Double range. It would be cool if the compilation failed with suggestion to migrate to BigDecimal via a (restligeist) macro.

SethTisue · 2018-03-06T04:42:53Z

As it stands, Double range is so broken that I can't imagine anyone using it in production

even after your rewrite?

Ichoran · 2018-03-06T04:43:39Z

@SethTisue - The problem is that it is a minefield, and the best way to avoid the mines is to use a BigDecimal range instead. Except then we are saying "this is a Double", but everything acts as if it is a BigDecimal (e.g. performance); we just don't say so.

So we should just come clean and say, "Look, Float and Double just aren't suitable for ranges. Here's a BigDecimal range that you can use almost as easily."

eed3si9n · 2018-03-06T04:58:42Z

even after your rewrite?

After my change, some might be tempted to use it. But as @Ichoran's examples show, I am only handling sample-code usages of with literals, but not really solving the fundamental issue that adding two Doubles, even unassuming values like 0.1, is not accurate.

Related: it might be cool to reconsider BigDecimal literal scala/bug#5988

eed3si9n · 2018-03-06T22:39:39Z

I am closing this for now.

NthPortal · 2018-03-15T04:19:53Z

Is there a plan/issue/PR to get rid of Double and Float ranges?

eed3si9n · 2018-03-15T05:14:49Z

I was gonna send a PR for this at some point, but I can open an issue first in case if anyone else wants to pick it up.

eed3si9n · 2018-03-15T05:22:47Z

scala/bug#10781

Fixes scala/bug#10781 At the basis of range is an assumption that summation works reliably. Unfortunately it is not true for IEEE floating point number where numbers are stored as approximate fraction of binary numbers. Natually, Double and Float ranges are completely broken. See numbers of issues such as scala/bug#8518, scala/bug#9875, scala/bug#9874, scala/bug#8620, scala/bug#8670, and scala/bug#10759. I've attempted to fix the Double range in scala/collection-strawman#489 by faking it further using BigDecimal, but ultimately I was not able to overcome the fact that IEEE floats are not suited for ranges. This removes both Float and Double ranges, as well as the fake `Integral[T]` instances that underpins the ranges.

Fixes scala/bug#10781 At the basis of range is an assumption that summation works reliably. Unfortunately it is not true for IEEE floating point number where numbers are stored as approximate fraction of binary numbers. Natually, Double and Float ranges are completely broken. See numbers of issues such as scala/bug#8518, scala/bug#9875, scala/bug#9874, scala/bug#8620, scala/bug#8670, and scala/bug#10759. I've attempted to fix the Double range in scala/collection-strawman#489 by faking it further using BigDecimal, but ultimately I was not able to overcome the fact that IEEE floats are not suited for ranges. This removes both Float and Double ranges, as well as the fake `Integral[T]` instances that underpin their existence.

SethTisue · 2018-04-25T13:49:51Z

uncle! I'm okay with the removal.

Fixes scala/bug#10781 At the basis of range is an assumption that summation works reliably. Unfortunately it is not true for IEEE floating point number where numbers are stored as approximate fraction of binary numbers. Natually, Double and Float ranges are completely broken. See numbers of issues such as scala/bug#8518, scala/bug#9875, scala/bug#9874, scala/bug#8620, scala/bug#8670, and scala/bug#10759. I've attempted to fix the Double range in scala/collection-strawman#489 by faking it further using BigDecimal, but ultimately I was not able to overcome the fact that IEEE floats are not suited for ranges. This removes both Float and Double ranges, as well as the fake `Integral[T]` instances that underpin their existence.

This was referenced Mar 3, 2018

NumericRange[Double].foreach() doesn't match apply(idx) due to floating point rounding scala/bug#9875

Closed

NumericRange[Double].contains() doesn't work due to rounding errors scala/bug#9874

Closed

Fix count when there's a stub

aaa339a

Ref scala/bug#8670

increase the testing range

864239f

eed3si9n mentioned this pull request Mar 4, 2018

Floating point ranges don't work scala/bug#8620

Closed

extract DoubleRange out

286543e

SethTisue mentioned this pull request Mar 5, 2018

Nonsensical floating-point Ranges are allowed scala/bug#10759

Closed

eed3si9n closed this Mar 6, 2018

eed3si9n mentioned this pull request Mar 15, 2018

Remove Float range and Double range scala/bug#10781

Closed

eed3si9n mentioned this pull request Mar 25, 2018

bug#10781. Remove Float range and Double range [ci: last-only] scala/scala#6468

Merged

eed3si9n deleted the wip/double-range branch April 25, 2018 17:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix NumericRange[Double] #489

Fix NumericRange[Double] #489

eed3si9n commented Mar 3, 2018 •

edited

Loading

pnf commented Mar 3, 2018

pnf commented Mar 3, 2018

eed3si9n commented Mar 3, 2018

eed3si9n commented Mar 4, 2018

sjrd commented Mar 4, 2018

eed3si9n commented Mar 4, 2018

julienrf commented Mar 5, 2018

eed3si9n commented Mar 5, 2018

Ichoran commented Mar 5, 2018

SethTisue commented Mar 6, 2018

optician commented Mar 6, 2018

eed3si9n commented Mar 6, 2018

SethTisue commented Mar 6, 2018

Ichoran commented Mar 6, 2018 •

edited

Loading

eed3si9n commented Mar 6, 2018 •

edited

Loading

eed3si9n commented Mar 6, 2018

NthPortal commented Mar 15, 2018

eed3si9n commented Mar 15, 2018 •

edited

Loading

eed3si9n commented Mar 15, 2018

SethTisue commented Apr 25, 2018

Fix NumericRange[Double] #489

Fix NumericRange[Double] #489

Conversation

eed3si9n commented Mar 3, 2018 • edited Loading

pnf commented Mar 3, 2018

pnf commented Mar 3, 2018

eed3si9n commented Mar 3, 2018

eed3si9n commented Mar 4, 2018

sjrd commented Mar 4, 2018

eed3si9n commented Mar 4, 2018

julienrf commented Mar 5, 2018

eed3si9n commented Mar 5, 2018

Ichoran commented Mar 5, 2018

SethTisue commented Mar 6, 2018

optician commented Mar 6, 2018

eed3si9n commented Mar 6, 2018

SethTisue commented Mar 6, 2018

Ichoran commented Mar 6, 2018 • edited Loading

eed3si9n commented Mar 6, 2018 • edited Loading

eed3si9n commented Mar 6, 2018

NthPortal commented Mar 15, 2018

eed3si9n commented Mar 15, 2018 • edited Loading

eed3si9n commented Mar 15, 2018

SethTisue commented Apr 25, 2018

eed3si9n commented Mar 3, 2018 •

edited

Loading

Ichoran commented Mar 6, 2018 •

edited

Loading

eed3si9n commented Mar 6, 2018 •

edited

Loading

eed3si9n commented Mar 15, 2018 •

edited

Loading