-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Сalculation speed up for the Gregorian calendar #5849
base: main
Are you sure you want to change the base?
Conversation
Performance of new version is much better(~60%) at the end of a year, ~25% better in avg, and the same at the start of a year.
Performance of new version is much better(~400%) (at least on my CPU). Seems like indexing (in prev algo) is pretty slow.
TODO: Is this comment true? |
Before in `utils/calendarical_calculations` was only test of equality with prev impl [⚠️ ] In old Julian algo there was a comment: ``` Julian epoch is equivalent to fixed_from_iso of December 30th of 0 year 1st Jan of 1st year Julian is equivalent to December 30th of 0th year of ISO year ``` is it true?
iso_old_file.rs is just old version of the iso.rs file. iso_old_algos.rs is some of old algo implementations that was changed outside of the iso.rs file. All of them were copied without changes. |
@@ -23,7 +23,7 @@ rust-version.workspace = true | |||
|
|||
# This is a special exception: The algorithms in this crate are based on "Calendrical Calculations" by Reingold and Dershowitz | |||
# which has its lisp code published at /~https://github.com/EdReingold/calendar-code2/ | |||
license = "Apache-2.0" | |||
license = "Apache-2.0" # TODO: Need to add `MIT`/`GNU`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the license was picked by a lawyer. Why would GNU be needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a realization of the algorithm(from the author of the article) in C/C++ and in the repo no apache 2.0 license
Here is a comment with mentioning it in the PR
So I don't sure is it necessary to add any of them or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. We'd have to talk to the lawyer again for this.
Typically algorithms themselves aren't copyrightable, however we would indeed need to check with our lawyer to pull in this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the policy is, we can redistribute MIT (but not GPL) code under the Apache-2.0 license, and all third-party code should retain its copyright comments inline in the code, similar to the Reingold code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first link goes to a file with the following comment at the top
// SPDX-License-Identifier: GPL-3.0-or-later
According to the terms of the GPL license, which are fairly strict, an Apache-licensed crate such as calendrical_calculations
would not be able to redistribute that code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A workaround to this type of issue would be for any GPL-licensed code to live in its own crate, and then either icu_calendar
or calendrical_calculations
has an optional Cargo feature to consume it. Clients who would like the speedup and are okay consuming GPL-licensed code would need to manually enable the Cargo feature.
I do not know whether such a GPL crate could live in this repository or whether it would need its own repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather avoid introducing GPL licensed code in our dep tree at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sffc @Manishearth
As I can understand you can implement algos from an article without such license restriction.
And if you will check the code by link and the code from PR it's will be pretty clear* that code was inspired by the article and only then it was matched with author's code for reference to authority. So maybe I can just remove link to the author implementation and leave only links to the article?
[*]: because of naming(unnamed const and very short names that say almost nothing (except for y/m/d ones, yeah they can say something, but nothing about how and why)) in the repo of the article's author. And in the PR even some consts was changed because of we have larger valid dates' interval -- in the author's code they are just magic numbers again; and how will you change such consts without understanding for what and why? And of course in the PR code there is plenty comments about why and how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand that generally algorithms are not copyrightable, but either way, we will have to get approval from our lawyer for doing this, and they may choose to be more cautious about this. We were already quite cautious about the Reingold&Dershowitz algoritms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am in general not super in favor of this much added code for an optimization which I'm not sure matters. Thoughts @sffc?
It is well commented, so we do have that.
Thank you for the PR!
As far as I can tell, the amount of library code is reduced by this PR. Most of it is in new benches and tests. This looks like a reasonably high quality PR. @Nikita-str, can you post the results of the benchmarks before and after the change? Are we talking a 5% or 50% speedup, for example? |
no, iso.rs has a pretty large diff. It's also replacing relatively straightforward code with complex code. The tests and benches are definitely welcome either way! |
I will run benches from the PR, and they do in such way to minimize LTO optimization that turn on for I actually don't sure that benches from the And I guess that do so against LTO optimization(don't allow him to predict the dates) is pretty fair because I looked on generated asm code (for my machine and I can give full asm to you if needed) for new and old implementation of
But, because of I never seen the LTO turned on for benches, I don't sure that I do all correct, and maybe it makes sense to discuss. Sorry for such big prelude, now the benches.
Few words about
And let's calc time ratio (
|
@Manishearth |
Regarding your comment on LTO: we only benchmark optimized code; we assume that users in performance-sensitive environments will be building with all optimizations enabled. I fully expect that the compiler will unwrap the loop, but you have enough |
About
So the question is only: do we want to test perf of a concrete cycles/cycles with some info(I guess -- not) or a general case when we don't know which dates we will have (for me it seems like a more real use case)? Oh, one note about the cycles: I use it to calc average performance of algo, because there may be cases like Now, about
As I can understand, I should remove as many |
Problems:
Discussion: @nekevss: Cassio Neri gave a CPP conference presentation and has other sample code and benchmarks. There's an open issue on jiff as well. Conclusion:
|
I re-read the notes and noticed that it states:
I don't agree with this; |
@sffc I don't think that's sufficient justification? They're in the critical path (also, only sometimes? I'm happy to believe these are actually performance critical but so far I have yet to see benchmarks showing these numbers to be relevant. It is factually true that the perf benefit is not yet clear. |
The yd/ymd benchmarks measure the time taken for ~1,200,000 (365×2×2000) runs of the function, if I'm understanding the setup correctly (I'm on a phone, I might not be). The true time taken by most of these functions is in the order of nanoseconds. The fixed benchmarks measure time taken for 10,000 runs, which also ends up with numbers in the realm of nanoseconds. I am extremely skeptical that a single call to these functions during formatting is at all relevant for performance. |
Here are the numbers with iteration counts. I've listed the iteration count for each section under ITER, and the divided time is in braces. Note that some of these functions run an operation four times, I haven't divided by four. A couple of these numbers seem a bit more low than I'd expect, so I'm not yet certain of these numbers, but at the very least these should be the right order of magnitude. Everything is in ns,
|
To expand on this, when it comes to whether we should optimize something when there are tradeoffs involved, here's how I look at it:
In this case we have a pile of functions, some of which are used in perf-critical code: I think the day-of-year and day-of-week stuff is going to be called in every format for certain skeleta, but the to/from fixed stuff is less critical to formatting? I could be wrong, but AIUI we should not call to/from fixed during formatting unless using a converting calendar. So right off the bat it's unclear if we should be optimizing all of these. Then, on top of that, all of these functions are single or low-double digit nanoseconds (assuming my calcoulations above are correct, I do not have faith in them at the moment), and I'd be really surprised if that shows up on the order of magnitude of formatting performance since formatting actually hits memory and does string writing. |
Solving #5562 by implementing algo from the article for the Gregorian calendar.
The article: "Euclidean Affine Functions and Applications to Calendar Algorithms" by Cassio Neri & Lorenz Schneider (Feb. 2021).
Draft PR:As I understand there is many places where parts of the logic is used and I only change the main file.Also need to run/writes benchmarks in the repo (by now, I run benches only in local pseudo repo where I test the algo).Also with the same logic can be speeded up the Julian calendar.Tests: not only test the same value as prev algos