Use unlimited MathContext for BigDecimal arithmetic #3303

travisbrown · 2020-02-18T15:22:21Z

Scala 2.13 changes the behavior of arithmetic on BigDecimal so that the MathContext of the first operand is used (instead of always using UNLIMITED, which is what was done in previous Scala versions; see this issue for some discussion of the change).

While this change is arguably the right thing to do in the standard library (assuming we're stuck with the mistake of carrying around a context in the first place), I think I've convinced myself that it's the wrong thing to do in Cats's instances. For one thing the 2.13 change means that operations from these instances are no longer commutative or even associative, even for fairly simple cases:

import cats.kernel.instances.bigDecimal._, cats.syntax.semigroup._

val x = BigDecimal("1e20")(java.math.MathContext.UNLIMITED)
val y = BigDecimal("1e-20")

And then:

scala> (x |+| y) == (y |+| x)
res0: Boolean = false

scala> (x |+| (y |+| x)) == ((x |+| y) |+| x)
res1: Boolean = false

…while on 2.12 both of these results are true.

These issues don't turn up when we're checking the laws right now, because we have this instance in cats.kernel.laws.Tests:

   // default Arbitrary[BigDecimal] is a bit too intense :/
    implicit val arbBigDecimal: Arbitrary[BigDecimal] =
      Arbitrary(arbitrary[Double].map(n => BigDecimal(n.toString)))

This workaround seems to me like a bad idea to start with, and for more complex test cases it's not enough, anyway (see for example the issue @johnynek ran into in #3279).

This PR changes the implementations of all of the operations that require contexts back to the 2.12 behavior (it has no effect on 2.12). I've also removed the custom Arbitrary[BigDecimal] in the cats-kernel-laws tests. It precedes 2.13 (it was introduced in #1319 in 2016), but removing it doesn't seem to cause problems in a few test runs I've just done on both 2.12 and 2.13. If it turns out that we needed it to avoid flakiness I'll revert that part.

codecov-io · 2020-02-18T15:44:46Z

Codecov Report

Merging #3303 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3303      +/-   ##
==========================================
+ Coverage   93.15%   93.15%   +<.01%     
==========================================
  Files         378      378              
  Lines        7579     7585       +6     
  Branches      203      210       +7     
==========================================
+ Hits         7060     7066       +6     
  Misses        519      519

Flag	Coverage Δ
#scala_version_212	`93.4% <100%> (ø)`	⬆️
#scala_version_213	`92.93% <100%> (+0.01%)`	⬆️

Impacted Files	Coverage Δ
...la/cats/kernel/instances/BigDecimalInstances.scala	`100% <100%> (ø)`	⬆️
...src/main/scala-2.13+/cats/instances/arraySeq.scala	`100% <0%> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6e24fc...cbe2247. Read the comment docs.

tpolecat

Gosh. Yeah this makes sense to me, but in the event someone is surprised by x + y not equaling x |+| y and .sum not equalling .combineAll it might be good to add a comment.

erikerlandson · 2020-02-18T15:55:16Z

If the MathContext was part of the type signature of BigDecimal - in pseudocode maybe something like Algebra[BigDecimal, Context[...]] - would these types be properly associative and/or commutative? That's to say, are BigDecimal objects having the same context associative or commutative?

At any rate, part of the problem is certainly that BigDecimals with differing contexts definitely are not the same type.

tixxit · 2020-02-18T15:55:43Z

Just to make sure I understand, the old behaviour would add the numbers using an unlimited context, then truncate them using the lhs' MathContext? The new behaviour adds them using the lhs' MathContext?

erikerlandson · 2020-02-18T16:02:28Z

Another slant on this is that it reminds me of some monoidal data sketches (e.g. t-digests) where in general s1 |+| s2 is not exactly equal to |s2| + |s1| however it is the case that they are always very similar, for some appropriate definition of "similar" (and likewise for associativity). These kind of cases have had me wondering in the past if there is some useful epsilon/delta formalism for objects that are not formally associative or commutative, but are "within epsilon of exact associativity, with probability 1-delta", etc

travisbrown · 2020-02-18T16:18:23Z

@tixxit There's no truncation on 2.13 with this change, or on 2.12 in either case, since Scala's BigDecimal constructor doesn't truncate, even if you provide a context that would require rounding (if you want to be sure a Scala BigDecimal has been rounded according to its context you have to call .rounded explicitly).

Without this change on 2.13 the context of the lhs is used to add (subtract, etc.), but there's no explicit rounding. After this change on 2.13 (and in either case on 2.12) unlimited precision is used to add, and the lhs's context is propagated, but there's still no rounding.

I guess it's worth noting that the propagation means that |+| isn't really commutative on either 2.12 or 2.13, since:

scala> (x |+| y).mc == (y |+| x).mc
res13: Boolean = false

…but I can't make myself care much about this, and throwing away the contexts instead of propagating the lhs's seems likely to be surprising to users.

non · 2020-02-18T16:34:04Z

I'm fine with this change, especially if we document the difference between x |+| y and x + y. Since BigDecimal is conceptually broken (as Rex says on the original ticket) there's no way to avoid some confusing behavior somewhere.

travisbrown · 2020-02-18T16:44:03Z

@erikerlandson

If the MathContext was part of the type signature of BigDecimal - in pseudocode maybe something like Algebra[BigDecimal, Context[...]] - would these types be properly associative and/or commutative? That's to say, are BigDecimal objects having the same context associative or commutative?

It's pretty easy to break associativity with non-UNLIMITED contexts. Off the top of my head I'm not sure how you could break commutativity? But yeah, if we could require UNLIMITED via the types there'd be no issue.

travisbrown · 2020-02-18T16:46:18Z

@erikerlandson

These kind of cases have had me wondering in the past if there is some useful epsilon/delta formalism for objects that are not formally associative or commutative, but are "within epsilon of exact associativity, with probability 1-delta", etc

Agreed, but I think that would fit better somewhere like Algebra or Spire. In cats-kernel I think it's reasonable just to say "we'll perform operations in a way that doesn't throw out precision, if you want to do that you can do it yourself".

travisbrown · 2020-02-18T16:58:26Z

@tpolecat Added a comment, thanks for the suggestion!

johnynek · 2020-02-18T17:20:04Z

What about defining an order in MathContext and then taking the max or min rather than taking the left?

I can see an argument for taking both the max and the min. I’d probably lean towards the max.

travisbrown · 2020-02-18T17:36:17Z

@johnynek

What about defining an order in MathContext and then taking the max or min rather than taking the left?

I like this in theory but I think it's going to break expectations. Take the following:

myUnlimitedPrecisionValue |+| BigDecimal(1)

In my view this pretty much absolutely must not return a limited-precision result, so we have to take the max, but then the following returns a Decimal128 value:

myDecimal32Value |+| BigDecimal(1)

…which also seems kind of weird? I wish this was properly commutative with respect to context, but I also think that only matters to people who care about the context, who are also likely to expect the standard "use the lhs's context" behavior (which has been around forever, not just in 2.13).

travisbrown · 2020-02-21T08:53:32Z

@johnynek There are other issues with ordering MathContext and taking the max. A MathContext is a pair of precision and rounding mode, and there's no good order for rounding modes. You could pick an arbitrary one (like ordinal) I guess just to make the commutativity work out, but that's not very satisfying.

The bigger issue (apart from the user expectation problem) is that it seems really strange to have an ordering on contexts that we use for these operations but ignore in our ordering on BigDecimal itself (and Hash, etc.).

My feeling is that Cats should follow the standard library's approach to propagating contexts but otherwise pretend they don't exist. This gets us uniform behavior across Scala versions, reliable associativity everywhere, and reliable commutativity as long as you don't look at mc (which our Eq and Hash instances don't).

travisbrown · 2020-02-24T14:52:56Z

@johnynek Are you okay with the solution in this PR? It doesn't make the commutativity situation with respect to mc worse on any Scala version—it just doesn't fix it—and it does fix the commutativity situation with respect to the value on 2.13.

We just hit another spurious test failure CI (#3316) because of this issue.

johnynek · 2020-02-24T16:39:33Z

I’m not really happy about it, but I am not trying to block it. If it isn’t commutative (I’m on my phone now) we should make sure it doesn’t extend commutative group (or monoid etc...)

I’d rather define an Ordering on MathContext and use that as I said. Or remove the implicit instance of the typeclasses and require a MathContext to construct the instance.

travisbrown · 2020-02-24T17:08:37Z

@johnynek To summarize:

Right now we have an implicit CommutativeGroup instance for BigDecimal that has three problems:

It's not commutative on any Scala version, because you can observe the instances' MathContexts, and |+|, etc. always return a value that uses the first operand's.
It's also not commutative on Scala 2.13 because it uses the first operand's MathContext for the operation.
It's not even associative on Scala 2.13 for the same reason.
The instance can return different values on different Scala versions for the same reason.

This PR fixes the latter three problems, but doesn't change any behavior related to the first, and doesn't really have anything to do with the first. I've opened #3317 to track the first separately.

travisbrown added 2 commits February 18, 2020 08:53

Use MathContext.UNLIMITED for BigDecimal arithmetic

fe342bb

Use default Arbitrary for BigDecimal

a1aff96

travisbrown requested review from johnynek, non and LukaJCB February 18, 2020 15:22

tpolecat reviewed Feb 18, 2020

View reviewed changes

Add note about precision of BigDecimal operations

cbe2247

tpolecat approved these changes Feb 18, 2020

View reviewed changes

travisbrown added this to the 2.2.0-M1 milestone Feb 20, 2020

travisbrown added the enhancement label Feb 20, 2020

travisbrown mentioned this pull request Feb 24, 2020

Add semiflatTap and leftSemiflatTap functions to EitherT #3316

Merged

LukaJCB approved these changes Feb 24, 2020

View reviewed changes

travisbrown mentioned this pull request Feb 24, 2020

CommutativeGroup for BigDecimal isn't actually commutative #3317

Open

travisbrown merged commit bce605e into typelevel:master Feb 24, 2020

travisbrown mentioned this pull request Feb 24, 2020

Add Semigroup and Monoid combinators reverse and intercalate #3279

Merged

This was referenced Feb 26, 2020

Reinstate @johnynek's intercalate optimization from #3279 #3321

Closed

Specific commutativity and associativity tests for BigDecimal #3325

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use unlimited MathContext for BigDecimal arithmetic #3303

Use unlimited MathContext for BigDecimal arithmetic #3303

travisbrown commented Feb 18, 2020 •

edited

Loading

codecov-io commented Feb 18, 2020 •

edited

Loading

tpolecat left a comment •

edited

Loading

erikerlandson commented Feb 18, 2020 •

edited

Loading

tixxit commented Feb 18, 2020

erikerlandson commented Feb 18, 2020

travisbrown commented Feb 18, 2020

non commented Feb 18, 2020

travisbrown commented Feb 18, 2020

travisbrown commented Feb 18, 2020

travisbrown commented Feb 18, 2020

johnynek commented Feb 18, 2020

travisbrown commented Feb 18, 2020

travisbrown commented Feb 21, 2020

travisbrown commented Feb 24, 2020

johnynek commented Feb 24, 2020

travisbrown commented Feb 24, 2020

Use unlimited MathContext for BigDecimal arithmetic #3303

Use unlimited MathContext for BigDecimal arithmetic #3303

Conversation

travisbrown commented Feb 18, 2020 • edited Loading

codecov-io commented Feb 18, 2020 • edited Loading

Codecov Report

tpolecat left a comment • edited Loading

Choose a reason for hiding this comment

erikerlandson commented Feb 18, 2020 • edited Loading

tixxit commented Feb 18, 2020

erikerlandson commented Feb 18, 2020

travisbrown commented Feb 18, 2020

non commented Feb 18, 2020

travisbrown commented Feb 18, 2020

travisbrown commented Feb 18, 2020

travisbrown commented Feb 18, 2020

johnynek commented Feb 18, 2020

travisbrown commented Feb 18, 2020

travisbrown commented Feb 21, 2020

travisbrown commented Feb 24, 2020

johnynek commented Feb 24, 2020

travisbrown commented Feb 24, 2020

travisbrown commented Feb 18, 2020 •

edited

Loading

codecov-io commented Feb 18, 2020 •

edited

Loading

tpolecat left a comment •

edited

Loading

erikerlandson commented Feb 18, 2020 •

edited

Loading