unofficial mirror of guix-devel@gnu.org 
* [outreachy] “guix git log --date=”
@ 2021-02-01 19:49 zimoun
  2021-02-01 20:41 ` Christopher Baines
  0 siblings, 1 reply; 3+ messages in thread
From: zimoun @ 2021-02-01 19:49 UTC (permalink / raw)
  To: Magali, Guix Devel, Gábor Boskovits

Hi Magali,

As discussed today at our weekly meeting, it could be cool to add the
option:

  guix git log --date=YYYY-MM-DD

listing the first (resp. last) commit of the day.  Or maybe all the
commits of that day.  This information would be really useful to feed
“guix time-machine”.  The use case I am interested in is to easily find
the commit when I only know the date of publication/submission of a
paper.

(For the date format, at first, the one used by “git log --date=short”.)
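
To make the proposal concrete, here is a minimal Python sketch of the
selection such an option could perform.  It is not the “guix git log”
implementation; the names `commits_on` and `first_and_last`, the sample
hashes and the timestamps are all hypothetical, and comparing dates in
UTC is an assumption.

```python
from datetime import date, datetime, timezone

UTC = timezone.utc

# Hypothetical sample data: (commit hash, committer timestamp).
COMMITS = [
    ("aaa111", datetime(2021, 1, 16, 22, 10, tzinfo=UTC)),
    ("ccc333", datetime(2021, 1, 17, 23, 7, tzinfo=UTC)),
    ("bbb222", datetime(2021, 1, 17, 8, 5, tzinfo=UTC)),
    ("ddd444", datetime(2021, 1, 18, 0, 30, tzinfo=UTC)),
]


def commits_on(commits, day):
    """Return the commits whose committer date falls on `day` (UTC), oldest first."""
    same_day = [c for c in commits if c[1].astimezone(UTC).date() == day]
    return sorted(same_day, key=lambda c: c[1])


def first_and_last(commits, day):
    """Return (first, last) commit of `day`, or None if there is none."""
    selected = commits_on(commits, day)
    if not selected:
        return None
    return selected[0], selected[-1]


if __name__ == "__main__":
    day = date.fromisoformat("2021-01-17")  # the YYYY-MM-DD format
    for commit_hash, when in commits_on(COMMITS, day):
        print(commit_hash, when.isoformat())
```

Either of the two commits returned by `first_and_last` could then be
passed to “guix time-machine”; which one is the better default is
left open here.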


The second thing that could be nice is to profile the function
“get-commits” a bit.  Do not hesitate to ping here if you need help with
the Guile profiler. :-)

Feel free to push the latest features you implemented and drop an email
here to reach out to testers. ;-)


Cheers,
simon





* Re: [outreachy] “guix git log --date=”
  2021-02-01 19:49 [outreachy] “guix git log --date=” zimoun
@ 2021-02-01 20:41 ` Christopher Baines
  2021-02-01 21:44   ` zimoun
  0 siblings, 1 reply; 3+ messages in thread
From: Christopher Baines @ 2021-02-01 20:41 UTC (permalink / raw)
  To: zimoun; +Cc: guix-devel



zimoun <zimon.toutoune@gmail.com> writes:

> As discussed today at our weekly meeting, it could be cool to add the
> option:
>
>   guix git log --date=YYYY-MM-DD
>
> listing the first (resp. last) commit of the day.  Or maybe all the
> commits of that day.  This information would be really useful to feed
> “guix time-machine”.  The use case I am interested in is to easily find
> the commit when I only know the date of publication/submission of a
> paper.

I'd be a little careful about the implementation of this: commits have a
commit date and an author date, but neither of these tells you when
commits reached a given branch.

Take the following commit for example:
f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20 [1]

Would it be shown when running the following?

  guix git log --date=2021-01-17

Its commit date is the 17th, so maybe yes? But this commit didn't
actually turn up on the master branch until the 18th, at least according
to the Guix Data Service [2].

Taking your paper use case, if I produce some results on the 17th, even
perhaps stating the time down to the second, and you then try to
reproduce the environment using the commit dates, you're going to get
some commits that I didn't have.

Approaches that work most of the time, or have subtleties that might not
be immediately obvious, make me a little nervous.

1:
commit f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20
AuthorDate: Sun Jan 3 16:26:16 2021
CommitDate: Sun Jan 17 23:07:29 2021

    gnu: wxmaxima: Update to 20.12.2.

2: https://data.guix.gnu.org/revision/f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20
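
The gap between a commit date and the moment a commit reaches a branch
can be illustrated with a small Python sketch.  The hash and commit date
are those of footnote 1; the time zone (UTC), the exact time the commit
reached master on the 18th, and the function names are assumptions made
only for illustration.

```python
from datetime import datetime, timezone

UTC = timezone.utc

COMMIT = {
    "hash": "f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20",
    # CommitDate from footnote 1; the time zone is assumed to be UTC.
    "commit_date": datetime(2021, 1, 17, 23, 7, 29, tzinfo=UTC),
    # Hypothetical: the thread only says "the 18th", not a time of day.
    "reached_master": datetime(2021, 1, 18, 9, 0, 0, tzinfo=UTC),
}


def selected_by_commit_date(commit, when):
    """True if a filter on the commit date would include the commit at `when`."""
    return commit["commit_date"] <= when


def present_on_branch(commit, when):
    """True if the commit had actually reached the branch at `when`."""
    return commit["reached_master"] <= when


if __name__ == "__main__":
    # Someone who pulled late on the 17th, after the recorded commit date.
    pull_time = datetime(2021, 1, 17, 23, 30, 0, tzinfo=UTC)
    print("selected by commit date:", selected_by_commit_date(COMMIT, pull_time))
    print("present on master:", present_on_branch(COMMIT, pull_time))
```

Under these assumptions the two predicates disagree for any pull between
the commit date and the arrival on master, which is the window where a
date-based lookup returns a commit the original user did not have.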



* Re: [outreachy] “guix git log --date=”
  2021-02-01 20:41 ` Christopher Baines
@ 2021-02-01 21:44   ` zimoun
  0 siblings, 0 replies; 3+ messages in thread
From: zimoun @ 2021-02-01 21:44 UTC (permalink / raw)
  To: Christopher Baines; +Cc: guix-devel

Hi Chris,

On Mon, 01 Feb 2021 at 20:41, Christopher Baines <mail@cbaines.net> wrote:
> zimoun <zimon.toutoune@gmail.com> writes:
>
>> As discussed today at our weekly meeting, it could be cool to add the
>> option:
>>
>>   guix git log --date=YYYY-MM-DD
>>
>> listing the first (resp. last) commit of the day.  Or maybe all the
>> commits of that day.  This information would be really useful to feed
>> “guix time-machine”.  The use case I am interested in is to easily find
>> the commit when I only know the date of publication/submission of a
>> paper.
>
> I'd be a little careful about the implementation of this: commits have a
> commit date and an author date, but neither of these tells you when
> commits reached a given branch.

We are focusing on the commit date, which is the one that makes sense.

> Take the following commit for example:
> f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20 [1]
>
> Would it be shown when running the following?
>
>   guix git log --date=2021-01-17
>
> Its commit date is the 17th, so maybe yes? But this commit didn't
> actually turn up on the master branch until the 18th, at least according
> to the Guix Data Service [2].

Yes, for sure.  It is difficult to rebuild the exact date history a
posteriori.  Well, merges increase the number of candidate commits, so
the correct commit will be listed among false positives.


> Taking your paper use case, if I produce some results on the 17th, even
> perhaps stating the time down to the second, and you then try to
> reproduce the environment using the commit dates, you're going to get
> some commits that I didn't have.

I agree.  The error rate (on average) depends on the number of commits
per day (on average) vs the number of commits per day in other branches
vs the number of merges of such branches.

Well, I did not do the stats, so I am just guessing that the
approximation is not so bad and that the false positives are acceptable
in practice.

From my understanding, you are right that the correct approach would be
to take care of the merges of the ‘staging’ and ‘core-updates’ branches.

But even without that, it is already inexact and a rough approximation.
Let us consider the last CRAN update for example: it is impossible to
have the exact same environment knowing only the date, say 2021-01-20.

--8<---------------cut here---------------start------------->8---
$ git log --pretty="%cd %s" --before=2021-01-21 --after=2021-01-19 \
    | grep 'r-' | grep Update | head -n1
Wed Jan 20 17:19:10 2021 +0100 gnu: r-fdrtool: Update to 1.2.16.
$ git log --pretty="%cd %s" --before=2021-01-21 --after=2021-01-19 \
    | grep 'r-' | grep Update | tail -n1
Wed Jan 20 17:18:59 2021 +0100 gnu: r-foreign: Update to 0.8-81.
$ git log --pretty="%cd %s" --before=2021-01-21 --after=2021-01-19 \
    | grep 'r-' | grep Update | wc -l
144
--8<---------------cut here---------------end--------------->8---

On the same day, a user who pulled before 17:18 got one CRAN environment
and a user who pulled after 17:20 got another.  Not to mention the one
who pulled in between. :-)
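
The three situations can be sketched in Python.  The two timestamps and
the count of 144 come from the log output above (“git log” prints the
newest commit first, so “head” gives the newest and “tail” the oldest);
the function name `cran_state` and the simplification that a commit is
available to users at its commit date are assumptions.

```python
from datetime import datetime, timedelta, timezone

TZ = timezone(timedelta(hours=1))  # the +0100 offset shown in the log

OLDEST = datetime(2021, 1, 20, 17, 18, 59, tzinfo=TZ)  # r-foreign update
NEWEST = datetime(2021, 1, 20, 17, 19, 10, tzinfo=TZ)  # r-fdrtool update
TOTAL_UPDATES = 144


def cran_state(pull_time):
    """Classify which of the CRAN updates a pull at `pull_time` would include.

    Simplification: a commit is assumed to be available at its commit date.
    """
    if pull_time < OLDEST:
        return "none"
    if pull_time >= NEWEST:
        return "all"
    return "partial"


if __name__ == "__main__":
    pulls = [
        datetime(2021, 1, 20, 17, 0, 0, tzinfo=TZ),
        datetime(2021, 1, 20, 17, 19, 5, tzinfo=TZ),
        datetime(2021, 1, 20, 17, 30, 0, tzinfo=TZ),
    ]
    for pull in pulls:
        # All three pulls share the date 2021-01-20.
        print(pull.date(), pull.time(), cran_state(pull))
```

All three pulls carry the same date, so the date alone cannot tell
which of the three environments a given user had.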

> Approaches that work most of the time, or have subtleties that might not
> be immediately obvious, make me a little nervous.
>
> 1:
> commit f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20
> AuthorDate: Sun Jan 3 16:26:16 2021
> CommitDate: Sun Jan 17 23:07:29 2021
>
>     gnu: wxmaxima: Update to 20.12.2.
>
> 2: https://data.guix.gnu.org/revision/f5f642058a3b6bf3eda5eb714ad5fa1f0a2b1b20

As an aside, note that the Data Service also makes an approximation on
the commit dates.  It considers only batches of pushed commits and not
each commit individually, as explained for example here:

<https://lists.gnu.org/archive/html/guix-devel/2020-11/msg00420.html>


Thanks for sharing your experience.  Yeah, dealing with dates and Git is
not as straightforward as it appears at first.  I agree that maybe this
first approximation is too rough and maybe not useful in practice,
because knowing only the date is too vague.


Cheers,
simon




Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git
