Add CalendarDuration type #1290
base: main
For initialization the methods work much like the builder pattern, except that we don't need a separate builder type. I have intentionally added only methods that set a field, not methods that add some value to a field. And also not
To get the value of the individual components, we have not five but four methods (because minutes and seconds are closely related):
1458553 to 78336c9
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##             main    #1290     +/-   ##
=========================================
  Coverage   91.80%   91.81%
=========================================
  Files          37       38      +1
  Lines       18151    18409    +258
=========================================
+ Hits        16664    16902    +238
- Misses       1487     1507     +20
☔ View full report in Codecov by Sentry.
Good start!
src/calendar_duration.rs
Outdated
// `seconds` can either encode `minutes << 6 | seconds`, or just seconds.
seconds: u32,
// `nanos` encodes `nanoseconds << 2 | has_minutes << 1 | 1`.
nanos: NonZeroU32,
I'd prefer to implement the boring version here and optimize memory usage later if we decide that is necessary. Since these are public fields anyway, there should be no need to prematurely optimize.
There is a little premature optimization going on, but my goal is more about having two clear modes of operation. My thought process went like this:
- Include a flag for 'I care about leap seconds'/'leap seconds don't exist'.
  However, that is not really something we can serialize, and it is also not enough to make `CalendarDuration` at least in theory capable of working with leap seconds correctly.
- Have explicit `minutes` and `seconds` fields.
- Suppose we pre-multiply the `minutes` field with `60`. This turns the fields into `seconds_in_utc` and `seconds_in_tai`.
  This is becoming a mess. Having two pretty much identical fields that only differ in how they count leap seconds within one duration seems unreasonable.
- Introduce two sane modes of operation:
  - arbitrarily large minutes and only sub-minute seconds
  - arbitrarily large seconds
To encode that with Rust types:
    pub struct CalendarDuration {
        // Components with a nominal duration
        months: u32,
        days: u32,
        // Components with an accurate duration
        accurate: AccurateDuration,
    }

    enum AccurateDuration {
        MinutesAndSubMinuteSecondsAndNanos(u32, u8, u32),
        SecondsAndNanos(u32, u32),
    }
We can simplify that a bit if we manually uphold the relation between the `minutes` and `seconds` fields to:

    pub struct CalendarDuration {
        // Components with a nominal duration
        months: u32,
        days: u32,
        // Components with an accurate duration
        minutes: u32,
        seconds: u32,
        nanos: u32,
    }
The idea is that if you have an existing `CalendarDuration` and change it with `with_seconds`, it should switch to the `SecondsAndNanos` mode. Previous minutes and seconds should be forgotten. And if `minutes` is non-zero, `seconds` must be less than `61`.
If we already have to uphold a relation between `minutes` and `seconds` anyway, why not encode them in a single `u32` and get the smaller type size for free? And if we don't do it now but want to keep the option for the optimization open for later, we should already include the range checks.
I have to admit stuffing the discriminant inside the `nanos` field, and using the other bit for niche optimization, is a premature optimization. Can we please keep it?
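For readers following the thread, here is a minimal sketch of the two modes using the plain enum from this comment. The constructor `new`, the helper `nanos`, the method `with_minutes_and_seconds`, and the `Option` return type are assumptions made for illustration; only `with_seconds` and the enum variants are named in the discussion, and none of this is the PR's actual code.

```rust
/// Accurate part of a duration, in one of two modes (illustrative sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AccurateDuration {
    /// Arbitrarily large minutes, sub-minute seconds (0..=60), nanoseconds.
    MinutesAndSubMinuteSecondsAndNanos(u32, u8, u32),
    /// Arbitrarily large seconds, nanoseconds.
    SecondsAndNanos(u32, u32),
}

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CalendarDuration {
    // Components with a nominal duration
    months: u32,
    days: u32,
    // Components with an accurate duration
    accurate: AccurateDuration,
}

impl CalendarDuration {
    /// Hypothetical constructor: an empty duration in seconds mode.
    pub const fn new() -> Self {
        Self { months: 0, days: 0, accurate: AccurateDuration::SecondsAndNanos(0, 0) }
    }

    /// Nanoseconds are kept when the mode changes.
    fn nanos(&self) -> u32 {
        match self.accurate {
            AccurateDuration::MinutesAndSubMinuteSecondsAndNanos(_, _, n) => n,
            AccurateDuration::SecondsAndNanos(_, n) => n,
        }
    }

    /// Switch to seconds mode; previous minutes and seconds are forgotten.
    pub fn with_seconds(mut self, seconds: u32) -> Self {
        self.accurate = AccurateDuration::SecondsAndNanos(seconds, self.nanos());
        self
    }

    /// Switch to minutes mode; `seconds` must be less than 61.
    /// (Hypothetical name and signature.)
    pub fn with_minutes_and_seconds(mut self, minutes: u32, seconds: u8) -> Option<Self> {
        if seconds >= 61 {
            return None;
        }
        let nanos = self.nanos();
        self.accurate =
            AccurateDuration::MinutesAndSubMinuteSecondsAndNanos(minutes, seconds, nanos);
        Some(self)
    }
}
```

With this shape the mode is visible in the type, at the cost of a discriminant that the packed encoding avoids.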
IMO that's way too much magic. I'd rather have the actual enum if we're going to do something like this.
I added a `MinutesAndSeconds` type to encode this in.
And I switched the accurate part of a duration to a 64-bit type.
With a 32-bit value we could only store an accurate duration of ca. 136 years. With a 64-bit type `CalendarDuration` can cover all uses of `std::time::Duration` and `TimeDelta` (forgetting the negative part). With this type we also never have to make methods fallible that construct a duration from the difference between two dates; the result always fits.
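The "ca. 136 years" figure can be checked with a little arithmetic. The sketch below assumes an average Gregorian year of 365.2425 days; the constant and the helper names `max_years` and `fits_in_u32_seconds` are made up for this illustration and are not part of chrono.

```rust
use std::time::Duration;

/// Seconds in an average Gregorian year: 365.2425 days of 86_400 seconds.
const SECONDS_PER_YEAR: u64 = 31_556_952;

/// Whole average years that fit in a seconds counter with the given maximum.
const fn max_years(max_seconds: u64) -> u64 {
    max_seconds / SECONDS_PER_YEAR
}

/// True if the whole seconds of `d` fit in a 32-bit counter.
fn fits_in_u32_seconds(d: Duration) -> bool {
    d.as_secs() <= u32::MAX as u64
}
```

`max_years(u32::MAX as u64)` evaluates to 136, while `std::time::Duration` itself stores its seconds as a `u64`, so only a 64-bit seconds counter can hold every value it can.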
src/calendar_duration.rs
Outdated
}

/// Set the months component of this duration to `months`.
pub const fn months(mut self, months: u32) -> Self {
I think we should call this `with_months()` and allow `months()` to be used for the getter (same for the other ones).
Renamed. I was going for having short names for initialization, but there is something to be said for making the getters have the short names.
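A minimal sketch of the naming convention agreed on here, with `with_*` setters that consume and return `self` and short getter names. Only the months and days components are shown, and `new` and the exact field set are assumptions for illustration rather than the PR's actual code.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct CalendarDuration {
    months: u32,
    days: u32,
}

impl CalendarDuration {
    /// Hypothetical constructor for an empty duration.
    pub const fn new() -> Self {
        Self { months: 0, days: 0 }
    }

    /// Set the months component of this duration to `months`.
    pub const fn with_months(mut self, months: u32) -> Self {
        self.months = months;
        self
    }

    /// Set the days component of this duration to `days`.
    pub const fn with_days(mut self, days: u32) -> Self {
        self.days = days;
        self
    }

    /// Get the months component.
    pub const fn months(&self) -> u32 {
        self.months
    }

    /// Get the days component.
    pub const fn days(&self) -> u32 {
        self.days
    }
}
```

Chained calls such as `CalendarDuration::new().with_months(2).with_days(15)` then read like a builder without needing a separate builder type.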
(Please avoid adding more stuff to this PR -- other than per review feedback.)
78336c9 to 33cfcec
Setting to draft for now. I have most of the parser ready, but need to clean up things a bit more before it is ready for a second look.
ceee9b7 to 1916234
Sorry. I added two more things, but minor I think:
1916234 to 7e8e29f
f930164 to 850054f
Started working on this again.
9bc4c86 to a6790d2
Put a couple of hours into self-review and improving the initial documentation.
Let's split this into smaller chunks, first the `CalendarDuration` type itself (up to and including the `Default` commit, except squashing that) but removing some of the extra setters.
///
/// # Encoding
///
/// - Seconds: `seconds << 2 | 0b10`
How much memory is this stuff saving? It's not obvious to me that this is worth it.
My main motivation is to encode the 'either seconds, or minutes and seconds' cases.
Still, I prefer to make the type not larger than necessary, especially if one of the goals is to replace the `Days` and `Months` types. An enum will add 8 bytes to a type of currently 20 bytes; 40%.
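One way to weigh the size argument is to measure candidate layouts with `std::mem::size_of`. The two structs below are hypothetical stand-ins, not the layouts in the PR, so they do not reproduce the 20-byte and 8-byte figures quoted above; they only show the kind of overhead a separate discriminant introduces.

```rust
use std::mem::size_of;

/// Hypothetical packed layout: minutes and seconds share one `u32`.
#[allow(dead_code)]
struct Packed {
    months: u32,
    days: u32,
    mins_and_secs: u32,
    nanos: u32,
}

/// Hypothetical enum alternative with an explicit discriminant.
#[allow(dead_code)]
enum MinutesOrSeconds {
    MinutesAndSeconds(u32, u8),
    Seconds(u32),
}

/// Same components as `Packed`, but with the enum instead of the packed field.
#[allow(dead_code)]
struct WithEnum {
    months: u32,
    days: u32,
    accurate: MinutesOrSeconds,
    nanos: u32,
}

/// Bytes the enum-based layout costs on top of the packed one.
fn enum_overhead() -> usize {
    size_of::<WithEnum>() - size_of::<Packed>()
}
```

Neither `u32` nor `u8` has a spare niche, so the enum needs room for a tag plus alignment padding, which is where the extra bytes come from.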
I spent some time thinking this over and would still prefer that we start with a "real" enum -- that doesn't rely on bit-shifting tricks. If we get an issue that complains about the size we can decide whether it's worth optimizing this. But it seems unlikely to me at this point that people will want to hold a ton of these in a `Vec` or similar.
With an enum we would still have most of the complexity.
- In my opinion we want to keep the range checks in `MinutesAndSeconds::from_seconds` and `from_minutes_and_seconds`.
  The restricted range for minutes gives us the guarantee that `minutes * 60 + seconds` does not overflow.
  The checks would be needed to allow for this 'optimization' in the future.
- We still need the special case where we encode a value created with `from_minutes_and_seconds(0, s)` the same as `from_seconds(s)` so that they compare equal.
- We still need a match to get both values in `mins_and_secs()`.
Switching to an enum is not going to save many lines, or maybe even none.
I just don't consider a couple of bitwise operations much of a problem, and they are well worth it for the much nicer type size. Really don't want to drop them 😞.
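To make the three points above concrete, here is a rough sketch of a packed `MinutesAndSeconds`. Only the seconds encoding `seconds << 2 | 0b10` and the method names come from the thread; the minutes encoding (`minutes << 8 | seconds << 2 | 0b11`), the limits, the parameter types, and the extra `total_seconds` helper are assumptions chosen for this illustration and may differ from the PR.

```rust
use std::num::NonZeroU64;

/// Either seconds, or minutes plus sub-minute seconds, packed in 64 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MinutesAndSeconds(NonZeroU64);

impl MinutesAndSeconds {
    // Assumed limits: they leave room for the two tag bits.
    const MAX_SECONDS: u64 = (1 << 62) - 1;
    const MAX_MINUTES: u64 = (1 << 56) - 1;

    /// Seconds mode: `seconds << 2 | 0b10`. Bit 1 is set, so never zero.
    pub fn from_seconds(seconds: u64) -> Option<Self> {
        if seconds > Self::MAX_SECONDS {
            return None;
        }
        NonZeroU64::new(seconds << 2 | 0b10).map(Self)
    }

    /// Minutes mode (assumed layout): `minutes << 8 | seconds << 2 | 0b11`.
    pub fn from_minutes_and_seconds(minutes: u64, seconds: u8) -> Option<Self> {
        if minutes == 0 {
            // Special case: encode like `from_seconds` so both compare equal.
            return Self::from_seconds(seconds as u64);
        }
        if minutes > Self::MAX_MINUTES || seconds >= 61 {
            return None;
        }
        NonZeroU64::new(minutes << 8 | (seconds as u64) << 2 | 0b11).map(Self)
    }

    /// Decode into `(minutes, seconds)`; seconds mode reports zero minutes.
    pub fn mins_and_secs(&self) -> (u64, u64) {
        let v = self.0.get();
        match v & 0b1 {
            0 => (0, v >> 2),
            _ => (v >> 8, (v >> 2) & 0x3f),
        }
    }

    /// Cannot overflow: `MAX_MINUTES * 60 + 60` is below `2^62`.
    pub fn total_seconds(&self) -> u64 {
        let (minutes, seconds) = self.mins_and_secs();
        minutes * 60 + seconds
    }
}
```

The range checks, the zero-minutes special case, and the two-arm match would all still be needed with a plain enum; the bitwise operations are the only part the enum would remove.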
How are we going to proceed here?
I still think this is a textbook case of premature optimization and think we should drop the bitshifting for now.
We have a difference of opinion here. If chrono gets a `CalendarDuration` as new functionality I want it to be as good as I can make it. Do you want to block the functionality on this detail?
a6790d2 to 3739e2f
@djc You gave an approval, but do you want to have a final look?
See #1282.
As a first PR this adds nothing but the type, methods to set and get the individual components, and a `Display` implementation.
We have 5 components: months, days, minutes, seconds and nanoseconds.
Example of ways to initialize a `CalendarDuration`:
As described in #1282 (comment) we squeeze both the minutes and seconds components into one `u32`.
The idea is that it is too strange and niche to have two large components that only differ in how they count leap seconds.
So we either have:
The `Debug` implementation will format the type as if it is a struct with fields for 5 components.