unofficial mirror of bug-guile@gnu.org 
* bug#22033: time-utc format is lossy
From: Zefram @ 2015-11-27 19:38 UTC
  To: 22033

In SRFI-19, round-tripping UTC dates through the time-utc structure
format goes wrong for one of the seconds around a leap second:

scheme@(guile-user)> (use-modules (srfi srfi-19))
scheme@(guile-user)> (define (tdate d) (write (list (date->string d "~4") (date->string (time-utc->date (date->time-utc d) 0) "~4"))) (newline))
scheme@(guile-user)> (tdate (make-date 0 59 59 23 30 6 2012 0))
("2012-06-30T23:59:59Z" "2012-06-30T23:59:59Z")
scheme@(guile-user)> (tdate (make-date 0 60 59 23 30 6 2012 0))
("2012-06-30T23:59:60Z" "2012-06-30T23:59:60Z")
scheme@(guile-user)> (tdate (make-date 0 0 0 0 1 7 2012 0))
("2012-07-01T00:00:00Z" "2012-06-30T23:59:60Z")
scheme@(guile-user)> (tdate (make-date 0 1 0 0 1 7 2012 0))
("2012-07-01T00:00:01Z" "2012-07-01T00:00:01Z")

Observe that the second immediately following the leap second, the
first second of the following UTC day, isn't round-tripped correctly.
It comes back as the leap second.  These two seconds are perfectly
distinct parts of the UTC time scale, and the time-utc format ought to
preserve their distinction.
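
One way to see where the distinction is lost is to look at the seconds
count stored in each time-utc value.  The following is a minimal sketch,
assuming Guile's (srfi srfi-19) module; `utc-seconds` is a hypothetical
helper introduced only for this illustration, and the printed values may
depend on the Guile version in use.

```scheme
(use-modules (srfi srfi-19))

;; Hypothetical helper: the seconds field of the time-utc value for a date.
(define (utc-seconds d)
  (time-second (date->time-utc d)))

;; The leap second and the second that follows it, as in the transcript.
(define leap-second (make-date 0 60 59 23 30 6 2012 0))
(define next-second (make-date 0 0 0 0 1 7 2012 0))

;; If both dates print the same count, the time-utc format cannot tell
;; them apart, whatever the reverse conversion then chooses to return.
(display (list (utc-seconds leap-second) (utc-seconds next-second)))
(newline)
```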

-zefram

* bug#22033: time-utc format is lossy
From: Zefram @ 2017-04-24 20:32 UTC
  To: 22033

I wrote:
>                                   These two seconds are perfectly
>distinct parts of the UTC time scale, and the time-utc format ought to
>preserve their distinction.

This is a problematic goal.  At the time I wrote the bug report I didn't
have a satisfactory idea of how to achieve it, but I think I've come up
with one now.

The essential problem is that the SRFI-19 time structure expects to
encapsulate a scalar value -- as it says, a count of seconds since
some epoch -- but there is no natural scalar representation of a UTC
time.  Because of the irregularity imposed by its leaps, the natural
representation of a UTC time is a two-part structure, consisting of an
integer identifying the day and a fractional count of seconds elapsed
within the day.  Because UTC days contain differing numbers of seconds,
this is a variable-radix system.  SRFI-19 doesn't offer any structure that
has this simple form.  The only structure that it describes as separating
representation of the day from time of day is the date structure, which
splits up the time representation much more and has the complication of
the timezone offset.

The present approach of the library is to squeeze a UTC time into the time
structure by converting the variable-radix value into a scalar by using
a fixed radix of 86400.  This has the advantage of producing a scalar,
and of the scalar behaving continuously on most UTC days, but the major
downside of being lossy, aliasing some UTC times.  The scalar also isn't
really a count of seconds since an epoch, as SRFI-19 expects, breaking
arithmetic on it.  It looks rather as though this part of SRFI-19 was
written expecting this sort of transformation of UTC, but conflictingly
expecting it to serve as an unambiguous encoding and as a genuine count
of seconds since an epoch.
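
The aliasing can be reproduced with plain arithmetic.  This is a sketch
of the fixed-radix idea only, not the library's code: `flatten-utc` is
a hypothetical helper, and day numbers are assumed to count from
1970-01-01, so that 2012-06-30 is day 15521.

```scheme
;; Hypothetical helper, not part of SRFI-19: flatten a (day, second-of-day)
;; pair into a scalar using a fixed radix.
(define (flatten-utc day second-of-day radix)
  (+ (* day radix) second-of-day))

;; Day 15521 (2012-06-30) ended with a leap second, so its seconds of
;; day run from 0 to 86400 inclusive.
(define leap-second      (flatten-utc 15521 86400 86400)) ; 23:59:60
(define following-second (flatten-utc 15522 0 86400))     ; 00:00:00 next day

(display (= leap-second following-second)) ; prints #t: the two times alias
(newline)
```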

A simple workaround would be to create a scalar in the same kind of
way but using a larger fixed radix: minimally 86401, or more roundly
131072.  This means we have a scalar value that fits easily into the time
structure, and unambiguously encodes all UTC times.  But it's still not
a count of seconds since an epoch, and it's appreciably less like such
a count because it's no longer continuous across (most) UTC day ends.
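
A sketch of this workaround, under the same assumptions as before
(hypothetical helper names, day numbers counted from 1970-01-01).  It
shows both the unambiguous decoding and the jump at the day boundary.

```scheme
(define radix 131072) ; 2^17, the "rounder" choice; 86401 would also work

(define (encode-utc day second-of-day)
  (+ (* day radix) second-of-day))

(define (decode-utc scalar)
  ;; Returns (day . second-of-day); assumes a non-negative scalar.
  (cons (quotient scalar radix) (remainder scalar radix)))

(define leap-second      (encode-utc 15521 86400)) ; 2012-06-30T23:59:60
(define following-second (encode-utc 15522 0))     ; 2012-07-01T00:00:00

(display (decode-utc leap-second)) (newline)      ; (15521 . 86400)
(display (decode-utc following-second)) (newline) ; (15522 . 0)

;; Consecutive seconds are 44672 apart across this day boundary, so
;; the scalar is not a count of seconds.
(display (- following-second leap-second)) (newline)
```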

Since the time structure has separate fields for seconds and nanoseconds,
it would be possible to borrow a trick sometimes used with the Unix
struct timespec: extending the nanoseconds range to represent leap
seconds.  This would be mostly like the present arrangement, with
the seconds count increasing by 86400 per UTC day, but with a leap
second unambiguously represented by the seconds count of the preceding
second and a nanoseconds count in the range [1000000000, 2000000000).
This fixes the ambiguity, but retains all the other downsides of the
present badly-behaved scalar, and adds the substantial downside of
breaking expectations of normalisation.
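
As a rough illustration of the extended-nanoseconds trick (a
hypothetical encoding, not what Guile's SRFI-19 does; `encode-utc` is
an invented name, and only positive leap seconds are handled):

```scheme
(define nanos-per-second 1000000000)

(define (encode-utc day second-of-day nanosecond)
  ;; Returns (seconds . nanoseconds).
  (if (= second-of-day 86400)
      ;; Leap second: reuse the seconds count of 23:59:59 and push the
      ;; nanoseconds into the range [1000000000, 2000000000).
      (cons (+ (* day 86400) 86399) (+ nanos-per-second nanosecond))
      (cons (+ (* day 86400) second-of-day) nanosecond)))

(display (encode-utc 15521 86399 0)) (newline) ; (1341100799 . 0)
(display (encode-utc 15521 86400 0)) (newline) ; (1341100799 . 1000000000)
(display (encode-utc 15522 0 0)) (newline)     ; (1341100800 . 0)
```

All three seconds get distinct encodings, but the middle one is no
longer normalised, which is the downside described above.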

The alternative to all of those hacks is to produce a continuous scalar
value that genuinely counts the seconds of UTC.  This is feasible.
It would have a distinct representation for all points on the UTC
time scale.  By being a true scalar value it would fully meet SRFI-19's
description of the time structure, would be represented in normalised
fashion, and would support arithmetic operations on the seconds of UTC
(fixing bug#26164 with no extra effort).

The downside is that this is an unusual and somewhat surprising
arrangement.  I've never previously seen a linear count of UTC
seconds brought out as a product of any time library.  It would
mean that a time-utc structure is not an encoding of a UTC time as
normally understood: the date structure would serve that purpose, and
a time-utc would instead have a hybrid meaning halfway between what we
usually think of as UTC and TAI times.  In the leap-seconds era (1972
onwards), the scalar value in a time-utc would be a constant offset
from the scalar value in the corresponding time-tai.  This implies that
conversion operations would be in a different place from where they
are now.  Whereas currently date/time-utc conversions are almost purely
arithmetical and time-utc/time-tai conversions involve the leap second
table, instead date/time-utc conversions would require the leap second
table and time-utc/time-tai conversions would be purely arithmetical
for the leap-seconds era.  (Frequency offsets would come into the
time-utc/time-tai conversions, for times in the rubber-seconds era.)
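
To make the proposal concrete, here is a minimal sketch of a linear
count in which the leap second table is consulted when flattening a
(day, second-of-day) pair.  The names `leap-days`, `leaps-before` and
`linear-utc` are hypothetical, the table is deliberately partial, and
the resulting scalar is therefore offset from a true count by however
many earlier leap seconds the table omits.

```scheme
;; Partial, illustrative table: day numbers (counted from 1970-01-01) of
;; UTC days that ended with a positive leap second.  A real table would
;; list every leap second since 1972.
(define leap-days '(15521 16616 17166)) ; 2012-06-30, 2015-06-30, 2016-12-31

(define (leaps-before day)
  ;; Number of listed leap seconds in days strictly before DAY.
  (let loop ((days leap-days) (n 0))
    (cond ((null? days) n)
          ((< (car days) day) (loop (cdr days) (+ n 1)))
          (else (loop (cdr days) n)))))

(define (linear-utc day second-of-day)
  ;; Every elapsed second counts once, including leap seconds.
  (+ (* day 86400) (leaps-before day) second-of-day))

;; 23:59:59, 23:59:60 and 00:00:00 are now one unit apart from each other.
(display (- (linear-utc 15521 86400) (linear-utc 15521 86399))) (newline) ; 1
(display (- (linear-utc 15522 0) (linear-utc 15521 86400))) (newline)     ; 1
```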

I'm pretty sure that this actually-linear treatment of time-utc is not
what the author of SRFI-19 envisioned.  But it fits the actual words of
the standard better than anything else I can imagine, and would fix a
bunch of problems that otherwise look painful.  I reckon this is the best
way forward.  What do you think?  If you like it, I could work up a patch.

-zefram

* bug#22033: time-utc format is lossy
From: Mark H Weaver @ 2018-10-20 22:08 UTC
  To: Zefram; +Cc: 22033

tags 22033 + notabug
close 22033
thanks

Hi Zefram,

Zefram <zefram@fysh.org> writes:

> I wrote:
>>                                   These two seconds are perfectly
>>distinct parts of the UTC time scale, and the time-utc format ought to
>>preserve their distinction.
>
> This is a problematic goal.  At the time I wrote the bug report I didn't
> have a satisfactory idea of how to achieve it, but I think I've come up
> with one now.
>
> The essential problem is that the SRFI-19 time structure expects to
> encapsulate a scalar value -- as it says, a count of seconds since
> some epoch -- but there is no natural scalar representation of a UTC
> time.  Because of the irregularity imposed by its leaps, the natural
> representation of a UTC time is a two-part structure, consisting of an
> integer identifying the day and a fractional count of seconds elapsed
> within the day.  Because UTC days contain differing numbers of seconds,
> this is a variable-radix system.

More precisely, UTC days contain differing numbers of TAI seconds.
However, they contain equal numbers of UTC seconds.

I don't see how we can fix this given the definition of UTC.  UTC, when
represented as a number of seconds since some epoch, simply cannot
represent leap seconds that cause UTC to jump backwards, as all leap
seconds so far have done.  This is an inherent problem with UTC, and is
one of the reasons that TAI is more appropriate than UTC for many
applications.

Your objections here are valid, and cut to the heart of the
long-standing debate over whether leap seconds are a good idea, a debate
which continues today.  If you're curious to read more on this,
<https://www.cl.cam.ac.uk/~mgk25/time/#leap> is a good starting point.

You might also be interested to know that your idea to encode leap
seconds within the 'nanoseconds' field was also proposed by Markus Kuhn
and mentioned by Olin Shivers on the SRFI-19 mailing list during the
early discussion of SRFI-19:

  https://srfi-email.schemers.org/srfi-19/msg/2772123

It's an interesting idea, but I don't think it's something that we can
unilaterally change in an existing, long-finalized SRFI.  It would need
to be part of a new SRFI, I think.

So, I'm closing this as not-a-bug, although I acknowledge that the issue
you raised is valid.  Feel free to reopen and continue the discussion if
you disagree.

In any case, thanks very much for your many interesting and detailed bug
reports, and I apologize for the long delay in addressing them.

    Regards,
      Mark