-
Notifications
You must be signed in to change notification settings - Fork 545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
0.5.x: Timezone conversion result (LocalResult
)
#1448
Comments
ProposalWe should remove the pub enum LocalResult<T> {
Single(T),
Ambiguous(T, T), /* first, last */
InGap(DateTime<Utc>, T, T), /* time of transition, before, after */
Err(Error)
} New is Return
|
I agree with the arguments that wrapping Two principles look important to me:
Therefor I propose a slightly different naming and behaviour.
I vouch for deprecating/removing the
|
The terms other libraries use to describe what happens during a transition are (https://peps.python.org/pep-0495/#terminology):
So the current naming seems good to keep. During a fold there are two clear answers: the earliest and latest result.
I am not sure which parts of dealing with timezone transitions makes you believe the result is an approximation or incorrect. Of course what helper methods make the most sense besides the current ones is something we can discuss. I think something like |
I updated the description of |
Can we implement a separate trait for these which we can use to "specialize" results for these cases? |
I remember reading that idea in one of the issues, but can't find it. Honestly I have no idea what that would look like, and if it would make chrono any easier to use. In my opinion returning If methods on |
Thanks for the link, that was what I was looking for. Of course there is the argument to make that
Sorry I was cutting short there. Of course the user should be able to fully handle all cases if he intends to (
This is not fully clear to me. Let's assume
Returning UTC wouldn't be the desired time zone anyway, so |
Never heard of that either. If |
There are two terms to describe what happens to the local time: it is folded, or there is a gap. Then there are two terms to describe the result of mapping a local (naive) datetime within a transition to an absolute
That is an argument. I used
It seems just as useful and controversial as
How about the example in the opening post? I want to know the number of seconds since midnight, but midnight doesn't exist because it falls in a transition gap? |
I'm still lost. In the first post you mention that 23:30 UTC (before/after mapped to UTC) vs 00:00 UTC (naive timestamp) Edit: If there is a DST transition from 23:30 to 00:30, is 23:30 even defined? Just like 24:00, the upper bound is usually not included. So Regarding the minutes since midnight and it is 1:30, I would expect to obtain "+30". Is this correct? |
Sorry, I'm not saying things clear enough 😄. 23:30:00+1:00 is the same instant in time as 00:30:00+2:00, and the same as 22:30:00 UTC. If local time jumps from 23:30:00+1:00 to 00:30:00+2:00 there is a gap of 1 hour in the values. But both the start and the end are exactly at 22:30:00 UTC. There are multiple ways to deal with missing values because they fall inside a gap in local time. Let's continue with the time 00:00:00 from the example, where we don't know the offset and want to map it to a UTC value.
With the example of the first post, how much time has elapsed since midnight, I would use the first option. Edit: |
Nearly so. Because the offset type is a generic (it can also be a |
In chrono at least we treat it as defined. Both values are allowed and map to the same time in UTC. With the conversion from UTC to local time, when there is no offset available, I believe we always pick the value after the transition as canonical. 00:30 in this case. |
My head is spinning, but I think i got it now. The second value would be my preferred choice as user. I approve. I had twisted it the other way around, which of course makes no sense at all: Wrong signatureGap(00:00 UTC, 22:30 +2:00, 23:30 + 1:00)
|
The reverse is obviously not possible. If the trait enforces a return type of Result<u8, Error>, we will never get more information out of it. So the trait has to return |
I am not sure what you are trying to come up with to be honest. To make sure: it is a design choice to have a custom With planned variants for our
|
The proposal by djc was to "specialize" the return type for certain implementations. Once the trait returns Implementing Edit: Indeed it is the same mapping with Error:NotEnough and Error::DoesNotExist. |
You can't specialize the return type. You can have a different set of methods. I'm still struggling with the composition of all of these cases:
I'm inclined to think that it would be better to use a normal |
Better nameShall we try to pick a better name for The problem with the current How about Type of the genericInterestingly in chrono 4.19 Now that we have much more complex and correct implementations it is natural to look up the offset for the given local time in a time zone, and So typically we crate a It seems to me figuring out the offset is always the natural first step. Possible errorsIt is still a bit hard to see all the potential errors because we have more than one stategy for dealing with the problem that we can't return any errors today. We currently often return Another interesting case is that currently we silently fall back to assuming UTC if we can't determine the current time zone on Unix. The following is what I can come up with given our current code. It seems general enough to make sense as a generic error type for all implementers of the /// Error type for time zone lookups.
#[non_exhaustive]
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum TzLookupError {
/// Unable to determine the local time zone of the os/platform.
TimeZoneUnknown,
/// Error returned by a platform API.
OsError(i32),
/// `TZ` environment variable set to an invalid value.
InvalidTzString,
/// Unable to locate an IANA time zone database.
NoTzdb,
/// The specified time zone is not found (in the database).
TimeZoneNotFound,
/// There is an error when reading/validating the time zone data.
InvalidTimeZoneData,
/// The result would be out of range.
OutOfRange,
}
impl From<TzLookupError> for Error {
fn from(error: TzLookupError) -> Self {
Error::TzLookupFailure(error)
}
} This error type can be wrapped in our MethodsI'll bite. We currently have five methods on impl<T> LocalResult<T> {
/// Returns `Some` only when the conversion result is unique, or `None` otherwise.
pub fn single(self) -> Option<T>;
/// Returns `Some` for the earliest possible conversion result, or `None` if none.
pub fn earliest(self) -> Option<T>;
/// Returns `Some` for the latest possible conversion result, or `None` if none.
pub fn latest(self) -> Option<T>;
/// Maps a `LocalResult<T>` into `LocalResult<U>` with given function.
pub fn map<U, F: FnMut(T) -> U>(self, mut f: F) -> LocalResult<U>;
}
impl<T: fmt::Debug> LocalResult<T> {
/// Returns the single unique conversion result, or panics accordingly.
#[track_caller]
pub fn unwrap(self) -> T;
} If the type can carry an error and a way to describe a gap, it could become: pub enum TzLookup<T> {
Single(T),
Ambiguous(T, T), /* first, last */
InGap(DateTime<Utc>, T, T), /* time of transition, before, after */
Err(TzLookupError),
}
impl<T> TzLookup<T> {
/// Returns `Ok` if the time zone lookup has a single result.
///
/// # Errors
///
/// Returns [`Error::Ambiguous`] if the local time falls within fold, and
/// [`Error::DoesNotExist`] if it falls in a gap in the local time.
///
/// Passes on any error that may have been returned by the type implementing [`TimeZone`].
pub fn single(self) -> Result<T, Error>;
/// Returns the earliest possible result of a the time zone lookup.
///
/// # Errors
///
/// Returns [`Error::DoesNotExist`] if it falls in a gap in the local time.
///
/// Passes on any error that may have been returned by the type implementing [`TimeZone`].
pub fn earliest(self) -> Result<T, Error>;
/// Returns the latest possible result of a the time zone lookup.
///
/// # Errors
///
/// Returns [`Error::DoesNotExist`] if it falls in a gap in the local time.
///
/// Passes on any error that may have been returned by the type implementing [`TimeZone`].
pub fn latest(self) -> Result<T, Error>;
/// Returns the earliest possible result of a the time zone lookup, or the start of a gap if the
/// local time if it falls within a gap.
///
/// # Errors
///
/// Passes on any error that may have been returned by the type implementing [`TimeZone`].
pub fn earliest_or_before(self) -> Result<T, Error>;
/// Returns the latest possible result of a the time zone lookup, or the end of a gap if the
/// local time if it falls within a gap.
///
/// # Errors
///
/// Passes on any error that may have been returned by the type implementing [`TimeZone`].
pub fn latest_or_after(self) -> Result<T, Error>;
}
impl<T: fmt::Debug> TzLookup<T> {
/// Returns a single unique conversion result or panics.
///
/// `unwrap` is best combined with time zone types where the lookup can never fail like [`Utc`]
/// and [`FixedOffset`]. Note that for [`FixedOffset`] there is a rare case where a resulting
/// `DateTime` can be out of range.
///
/// # Panics
///
/// Panics if the local time falls within a fold or a gap in the local time, and on any error
/// that may have been returned by the type implementing [`TimeZone`].
#[track_caller]
pub fn unwrap(self) -> T;
/// Returns a single unique conversion result or panics. Assumes UTC if the type implementing
/// [`TimeZone`] returns an error.
///
/// # Panics
///
/// Panics if the local time falls within a fold or a gap in the local time.
#[track_caller]
pub fn unwrap_or_utc(self) -> T;
} |
I've changed my mind a bit about this. I still believe it would be a big improvement if the return type of many methods on |
Agreed. To condense the long post above, what about
Maybe we think about this differently because you view Or maybe because my idea's heavily leans on the users using the helper methods of Including the Also it allows us to provide better helper methods such as |
Maybe we should dig ourselves out of this first so we have a better sense of how things fit together? |
That should not need an awful lot of digging 😆. I'll make a PR in a few days so we have something to shoot at. |
What would we need to change to propagate the error instead of unwrapping on this line in Three methods:
Those methods currently come with the assumption that finding the offset of a UTC datetime in some time zone will never fail. Within chrono it turns out these methods are used almost always in methods that are already fallible. For 0.5 I would just change these methods to return a |
I think it might help? We could deprecate the old methods, which might get us feedback sooner about where people run into trouble. |
How about |
Sounds good to me! |
I'm not sure I understand the structure you have there. Why are all of the Also what is I was more thinking of having two structs. It may be a bit more library code but it would be fairly simple and may be easier for users to understand for the reasons mentioned. pub enum TzDatetimeResolution<TZ: TimeZone> {
Single(DateTime<TZ>),
Ambiguous {
earliest: DateTime<TZ>,
latest: DateTime<TZ>,
},
InGap {
/// The instant the gap ends.
after: DateTime<TZ>,
},
}
pub enum TzOffsetResolution<T: Offset> {
Single(T),
Ambiguous {
earliest: T,
latest: T,
},
InGap {
before_offset: T,
after_offset: T,
/// The instant the gap ends.
/// Do we need this? Maybe if they want to know the boundaries of the offset validity there should be a dedicated API for that.
/// It is a bit weird that we tack this information on for one particular case but not others.
after_datetime: DateTime<Utc>,
},
} But I may be thinking this because my use case is |
Oops, those should be local values.
Have you seen the four ways in which you may want to handle a gap that I described in #1448 (comment)? |
I am not thinking entirely clearly this evening... Ignore what I said about the |
Can you provide an example when you would want to do the middle two options? Continue the offset on either side of the gap? It seems that this would lead to very strange behaviour and I can't think of a use case. Notably the result of your calculation would be erratic based on small time shifts.
It seems like we are trying to combine "offset at local time" and "UTC at local time" and ending up with a more complex solution than we need. But I'm trying to make sure that I understand all of the questions that we are trying to answer with these APIs. |
No. Option one and four make most sense to my personally. The middle options are what libc, Windows and javascript do though. Even if I can't see it directly, they must have some use cases or not be entirely unreasonable. Making them possible would at least be helpful for compatibility reasons. For the rest we seem to have a misunderstanding that I can't put my finger on. Is it okay to pause this discussion for now? |
If I understood the topic correctly, this is an example where I want the offset before the DST to continue after the gap. I have two NaiveTime input fields for allowing a user to create an event with a begin and end time. He will put "1:30" as begin time and "2:30" as end time. The chosen date has a DST from 2:00 to 3:00. It is fair to assume that he wants his event to last 1h. Of course the correct thing is to directly notify him that the LocalTime for the EventEnd does not exist. Or the input fields require an UTC offset as well. But both seem very user unfriendly.
As developer I would just take his NaiveTimes, calculate the TimeDelta, assume the first DST offset (-01:00) and add the TimeDelta to the LocalTime. This also works if both times are inside the DST gap.
Because the choice of picking the DST offset is arbitrary and up to the developer, in cases where the duration between two times matters more than the "correct" time, it is important to stay on the same offset despite a DST transition happening. |
Interesting example! If a use case can assume it is more important to preserve the duration than the exact end time, you would get more consistent results by coding it with a duration: fn local_start_end(start: NaiveDateTime, end: NaiveDateTime) -> (DateTime<Local>, DateTime<Local>) {
let duration = end.duration_since(start).unwrap(); // may fail if end < start
let start = Local.from_local_datetime(&start).earliest().unwrap(); // may fail if inside a gap
let end = start + duration; // no problems getting an end time with offset
(start, end)
} Or you could choose to only do this fallback in case the end time does not exist. |
Okay. New and complete proposal. I think I now better get @kevincox's concerns. Base ideas
APIpub trait TimeZone {
type Offset: Offset;
fn from_utc_datetime(&self, utc: &NaiveDateTime) -> Result<DateTime<Self>, TzError>;
fn from_local_datetime(&self, local: &NaiveDateTime) -> TzResolution<Self>;
/* with_ymd_and_hms, with_timestamp* */
}
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum TzResolution<Tz: TimeZone> {
Single(DateTime<Tz>),
Ambiguous {
first: DateTime<Tz>,
last: DateTime<Tz>,
},
InGap {
gap_start: DateTime<Tz>,
gap_end: DateTime<Tz>,
},
Err(TzError),
}
impl<Tz> TzResolution<Tz> {
/// Returns `Ok` if the time zone lookup has a single result.
pub fn single(self) -> Result<DateTime<Tz>, Error>;
/// Returns the earliest possible result.
pub fn earliest(self) -> Result<DateTime<Tz>, Error>;
/// Returns the latest possible result.
pub fn latest(self) -> Result<DateTime<Tz>, Error>;
/// Returns the earliest possible result or the datetime of the gap.
pub fn earliest_or_gap(self) -> Result<DateTime<Tz>, Error>;
/// Returns the latest possible result or the datetime of the gap.
pub fn latest_or_gap(self) -> Result<DateTime<Tz>, Error>;
}
impl<Tz: fmt::Debug> TzResolution<Tz> {
/// Returns a single unique conversion result or panics.
#[track_caller]
pub fn unwrap(self) -> DateTime<Tz>;
}
/// Error type for time zone lookups.
#[non_exhaustive]
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum TzError {
/// Unable to determine the local time zone of the os/platform.
TimeZoneUnknown,
/// Error returned by a platform API.
OsError(OsError),
/// `TZ` environment variable set to an invalid value.
InvalidTzString,
/// Unable to locate an IANA time zone database.
NoTzdb,
/// The specified time zone is not found (in the database).
TimeZoneNotFound,
/// There is an error when reading/validating the time zone data.
InvalidTimeZoneData,
/// The result would be out of range.
OutOfRange,
}
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub struct OsError(/* ... */); |
Re #1448 (comment): Yeah, that use case is good but also a bit strange. Because if I have an event that ends in a gap I probably want to keep the duration, but if it ends after the gap then you probably want to keep the end time? I think a perfect UI would at least inform the user about the decision that is being made here. I think it is worth trying to make that case very explicit, but it is also nice to make it easy. I think the API proposed should make it easy enough to apply the duration difference to the converted start time as a fallback for landing in a gap. That proposal looks great to me. I only have one question. What is |
Interesting issue. How can the new design deal with a time that does not exist in DST? e.g.
|
@TCeason what timezone transition happens in Shanghai in May of 1986? By "no valid time", which of the |
LocalResult::None. Because of dst in Asia/Shanghai 1986-05-14 02:00:00 is not exists |
Well, arguably Postgres is wrong and chrono is being more careful/explicit. |
I'm not sure. Shanghai is +08. PG add one hour and display this value with +09 |
No -- due to the DST transition, times between 02:00 and 03:00 did not exist on May 4th 1986. |
Yes, I think that's reasonable. However, in real business scenarios, there may be cases where string is converted to ts. If it's LoadResult::None, How can I try to do a proper conversion for that ts? Convert to the right value? E.g. in clickhouse give a value that delete a hour. SELECT toDateTime('1986-05-04 02:01:00'); SELECT toDateTime('1986-05-04 02:01:00') Query id: 93c03d26-c2ab-433f-a1e1-8cb6244c489b ┌─toDateTime('1986-05-04 02:01:00')─┐
|
Don't you have any way to catch invalid values before this point? |
It's hard. Because varchar may have different formats. Now I try to add a match arm when matching None does some process. But not only this case return None.
|
I would need this feature to be able to order two claimed local times, i.e. two instances of I read through the discussion and have a few opinions to add.
Here is the API I'd like to see: pub enum TzResolution<Tz: TimeZone> {
Single(DateTime<Tz>),
Ambiguous(
// Earliest
DateTime<Tz>,
// Latest
DateTime<Tz>,
),
Missing(
// The instant the gap starts and ends.
DateTime<Tz>,
),
}
trait TimeZone {
// Panic if there was an error while resolving the local time,
// caused by for example missing time zone data files,
// an error in an OS API, or overflow.
fn offset_from_local_datetime(&self, local: &NaiveDateTime) -> TzResolution<Self>;
// With result instead of panic
fn try_offset_from_local_datetime(
&self,
local: &NaiveDateTime,
) -> Result<TzResolution<Self>, impl Error>;
} |
Chiming in here as I’m very interested in being able to more gracefully handle gaps. Did you have a look at what Temporal (the official new datetime API for JS) is doing in this regard? Its very close to the API you suggest here. Other reasons for choosing the 'extrapolation' approach:
The figure in the Python docs here also shows how this ‘extrapolation’ makes sense graphically. edit: add missing link, spelling |
I understand the convenience in extrapolation like that, could that be a separate function? My function name is probably not the best, but: fn offset_from_local_datetime_rfc5545(&self, local: &NaiveDateTime) -> DateTime<impl TimeZone>; This function signature is more convenient for those who don't want custom handling for folds and gaps, but also having access to something like |
Just a JFYI: https://docs.rs/jiff/latest/jiff/tz/struct.TimeZone.html#method.to_ambiguous_zoned (I've been using the jiff crate with great success and their API to disambiguate datetime conversion is also very user-friendly imo.) |
Part of this has been discussed before in #716 and #1049, and was my motivation to start contributing to chrono in #1017.
The calculation to go from a local time (wall clock time) to an instant in UTC can end up in three cases:
Our current
LocalResult
type matches those three cases:The type of
T
is usually an offset orDateTime
.As we discovered this simple definition has some issues.
No way to distinguish errors from the
None
caseThe type that implements
TimeZone
and returns aLocalResult
may not be completely under our/your control. Prime example is ourLocal
type. It wraps platform APIs, which can fail for various cases. Our current code returnsLocalResult::None
on such a failure.Now if a method returns
LocalResult::None
there is no way for the user to know if it is because the datetime falls within a gap, or because there is an error.Not enough info to handle the
None
caseSuppose we want to write a function that returns exactly how many seconds have passed since midnight, taking into account DST transitions.
That should be simple. Take the supplied
DateTime
, create aNaiveDateTime
with the time set to 00:00:00, useTimeZone::from_local_datetime
to create aDateTime
at midnight and subtract.What if some government decided to do a DST transition right at midnight, say from 23:30 to 00:30?
TimeZone::from_local_datetime
would returnLocalResult::None
, and our function can't return a result. However both 23:30 (from before moving the clock forwards) and 00:30 map to the same time in UTC, and that is the time you could consider midnight for this calculation.LocalResult::None
would be more useful if it returned more info thanNone
, such as the UTC time during the gap and the value ofT
before and after the transition.Related problem: wrapping other APIs that don't give enough info
Other APIs may return a single value, even when it would be ambiguous or doesn't exist.
And I suppose there could be API's, like chrono 0.4.x, that can return two results for ambiguous cases but can't return more info for the 'gap' case.
I'm not sure if there is anything special we want to do here. But it may be useful for callers to know that some platform or library may return bogus results around transitions, with no way to detect that.
When to use a
LocalResult
Methods on
DateTime
currently don't return aLocalResult
, but consider any result that falls in a timezone transition as an error. This makes it cumbersome for users to handle these cases correctly. They have to work on aNaiveDateTime
and then convert it back withTimeZone::from_local_datetime
. Returning aLocalResult
would be a big improvement for correctness in my opinion.(a proposal comes in the next comment)
cc @Zomtir
The text was updated successfully, but these errors were encountered: