Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use unlimited MathContext for BigDecimal arithmetic #3303

Merged
merged 3 commits into from
Feb 24, 2020

Conversation

travisbrown
Copy link
Contributor

@travisbrown travisbrown commented Feb 18, 2020

Scala 2.13 changes the behavior of arithmetic on BigDecimal so that the MathContext of the first operand is used (instead of always using UNLIMITED, which is what was done in previous Scala versions; see this issue for some discussion of the change).

While this change is arguably the right thing to do in the standard library (assuming we're stuck with the mistake of carrying around a context in the first place), I think I've convinced myself that it's the wrong thing to do in Cats's instances. For one thing the 2.13 change means that operations from these instances are no longer commutative or even associative, even for fairly simple cases:

import cats.kernel.instances.bigDecimal._, cats.syntax.semigroup._

val x = BigDecimal("1e20")(java.math.MathContext.UNLIMITED)
val y = BigDecimal("1e-20")

And then:

scala> (x |+| y) == (y |+| x)
res0: Boolean = false

scala> (x |+| (y |+| x)) == ((x |+| y) |+| x)
res1: Boolean = false

…while on 2.12 both of these results are true.

These issues don't turn up when we're checking the laws right now, because we have this instance in cats.kernel.laws.Tests:

   // default Arbitrary[BigDecimal] is a bit too intense :/
    implicit val arbBigDecimal: Arbitrary[BigDecimal] =
      Arbitrary(arbitrary[Double].map(n => BigDecimal(n.toString)))

This workaround seems to me like a bad idea to start with, and for more complex test cases it's not enough, anyway (see for example the issue @johnynek ran into in #3279).

This PR changes the implementations of all of the operations that require contexts back to the 2.12 behavior (it has no effect on 2.12). I've also removed the custom Arbitrary[BigDecimal] in the cats-kernel-laws tests. It precedes 2.13 (it was introduced in #1319 in 2016), but removing it doesn't seem to cause problems in a few test runs I've just done on both 2.12 and 2.13. If it turns out that we needed it to avoid flakiness I'll revert that part.

@codecov-io
Copy link

codecov-io commented Feb 18, 2020

Codecov Report

Merging #3303 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3303      +/-   ##
==========================================
+ Coverage   93.15%   93.15%   +<.01%     
==========================================
  Files         378      378              
  Lines        7579     7585       +6     
  Branches      203      210       +7     
==========================================
+ Hits         7060     7066       +6     
  Misses        519      519
Flag Coverage Δ
#scala_version_212 93.4% <100%> (ø) ⬆️
#scala_version_213 92.93% <100%> (+0.01%) ⬆️
Impacted Files Coverage Δ
...la/cats/kernel/instances/BigDecimalInstances.scala 100% <100%> (ø) ⬆️
...src/main/scala-2.13+/cats/instances/arraySeq.scala 100% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e6e24fc...cbe2247. Read the comment docs.

Copy link
Member

@tpolecat tpolecat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gosh. Yeah this makes sense to me, but in the event someone is surprised by x + y not equaling x |+| y and .sum not equalling .combineAll it might be good to add a comment.

@erikerlandson
Copy link
Contributor

erikerlandson commented Feb 18, 2020

If the MathContext was part of the type signature of BigDecimal - in pseudocode maybe something like Algebra[BigDecimal, Context[...]] - would these types be properly associative and/or commutative? That's to say, are BigDecimal objects having the same context associative or commutative?

At any rate, part of the problem is certainly that BigDecimals with differing contexts definitely are not the same type.

@tixxit
Copy link

tixxit commented Feb 18, 2020

Just to make sure I understand, the old behaviour would add the numbers using an unlimited context, then truncate them using the lhs' MathContext? The new behaviour adds them using the lhs' MathContext?

@erikerlandson
Copy link
Contributor

Another slant on this is that it reminds me of some monoidal data sketches (e.g. t-digests) where in general s1 |+| s2 is not exactly equal to |s2| + |s1| however it is the case that they are always very similar, for some appropriate definition of "similar" (and likewise for associativity). These kind of cases have had me wondering in the past if there is some useful epsilon/delta formalism for objects that are not formally associative or commutative, but are "within epsilon of exact associativity, with probability 1-delta", etc

@travisbrown
Copy link
Contributor Author

@tixxit There's no truncation on 2.13 with this change, or on 2.12 in either case, since Scala's BigDecimal constructor doesn't truncate, even if you provide a context that would require rounding (if you want to be sure a Scala BigDecimal has been rounded according to its context you have to call .rounded explicitly).

Without this change on 2.13 the context of the lhs is used to add (subtract, etc.), but there's no explicit rounding. After this change on 2.13 (and in either case on 2.12) unlimited precision is used to add, and the lhs's context is propagated, but there's still no rounding.

I guess it's worth noting that the propagation means that |+| isn't really commutative on either 2.12 or 2.13, since:

scala> (x |+| y).mc == (y |+| x).mc
res13: Boolean = false

…but I can't make myself care much about this, and throwing away the contexts instead of propagating the lhs's seems likely to be surprising to users.

@non
Copy link
Contributor

non commented Feb 18, 2020

I'm fine with this change, especially if we document the difference between x |+| y and x + y. Since BigDecimal is conceptually broken (as Rex says on the original ticket) there's no way to avoid some confusing behavior somewhere.

@travisbrown
Copy link
Contributor Author

@erikerlandson

If the MathContext was part of the type signature of BigDecimal - in pseudocode maybe something like Algebra[BigDecimal, Context[...]] - would these types be properly associative and/or commutative? That's to say, are BigDecimal objects having the same context associative or commutative?

It's pretty easy to break associativity with non-UNLIMITED contexts. Off the top of my head I'm not sure how you could break commutativity? But yeah, if we could require UNLIMITED via the types there'd be no issue.

@travisbrown
Copy link
Contributor Author

@erikerlandson

These kind of cases have had me wondering in the past if there is some useful epsilon/delta formalism for objects that are not formally associative or commutative, but are "within epsilon of exact associativity, with probability 1-delta", etc

Agreed, but I think that would fit better somewhere like Algebra or Spire. In cats-kernel I think it's reasonable just to say "we'll perform operations in a way that doesn't throw out precision, if you want to do that you can do it yourself".

@travisbrown
Copy link
Contributor Author

@tpolecat Added a comment, thanks for the suggestion!

@johnynek
Copy link
Contributor

What about defining an order in MathContext and then taking the max or min rather than taking the left?

I can see an argument for taking both the max and the min. I’d probably lean towards the max.

@travisbrown
Copy link
Contributor Author

@johnynek

What about defining an order in MathContext and then taking the max or min rather than taking the left?

I like this in theory but I think it's going to break expectations. Take the following:

myUnlimitedPrecisionValue |+| BigDecimal(1)

In my view this pretty much absolutely must not return a limited-precision result, so we have to take the max, but then the following returns a Decimal128 value:

myDecimal32Value |+| BigDecimal(1)

…which also seems kind of weird? I wish this was properly commutative with respect to context, but I also think that only matters to people who care about the context, who are also likely to expect the standard "use the lhs's context" behavior (which has been around forever, not just in 2.13).

@travisbrown travisbrown added this to the 2.2.0-M1 milestone Feb 20, 2020
@travisbrown
Copy link
Contributor Author

@johnynek There are other issues with ordering MathContext and taking the max. A MathContext is a pair of precision and rounding mode, and there's no good order for rounding modes. You could pick an arbitrary one (like ordinal) I guess just to make the commutativity work out, but that's not very satisfying.

The bigger issue (apart from the user expectation problem) is that it seems really strange to have an ordering on contexts that we use for these operations but ignore in our ordering on BigDecimal itself (and Hash, etc.).

My feeling is that Cats should follow the standard library's approach to propagating contexts but otherwise pretend they don't exist. This gets us uniform behavior across Scala versions, reliable associativity everywhere, and reliable commutativity as long as you don't look at mc (which our Eq and Hash instances don't).

@travisbrown
Copy link
Contributor Author

@johnynek Are you okay with the solution in this PR? It doesn't make the commutativity situation with respect to mc worse on any Scala version—it just doesn't fix it—and it does fix the commutativity situation with respect to the value on 2.13.

We just hit another spurious test failure CI (#3316) because of this issue.

@johnynek
Copy link
Contributor

I’m not really happy about it, but I am not trying to block it. If it isn’t commutative (I’m on my phone now) we should make sure it doesn’t extend commutative group (or monoid etc...)

I’d rather define an Ordering on MathContext and use that as I said. Or remove the implicit instance of the typeclasses and require a MathContext to construct the instance.

@travisbrown
Copy link
Contributor Author

@johnynek To summarize:

Right now we have an implicit CommutativeGroup instance for BigDecimal that has three problems:

  1. It's not commutative on any Scala version, because you can observe the instances' MathContexts, and |+|, etc. always return a value that uses the first operand's.
  2. It's also not commutative on Scala 2.13 because it uses the first operand's MathContext for the operation.
  3. It's not even associative on Scala 2.13 for the same reason.
  4. The instance can return different values on different Scala versions for the same reason.

This PR fixes the latter three problems, but doesn't change any behavior related to the first, and doesn't really have anything to do with the first. I've opened #3317 to track the first separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants