unofficial mirror of bug-guix@gnu.org 
* bug#62656: broken guix time-machine + software-heritage
From: Nicolas Graves via Bug reports for GNU Guix @ 2023-04-03 21:39 UTC
  To: 62656


Hi Guix!

I was trying to use guix time-machine as I did in the past, but the
recent updates with software heritage seem to have broken my use of it.

Here's the channels.scm file I used:

(list (channel
        (name 'guix)
        (url "/https://git.savannah.gnu.org/git/guix.git")
        (commit "1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1")
        (introduction
          (make-channel-introduction
            "9edb3f66fd807b096b48283debdcddccfea34bad"
            (openpgp-fingerprint
              "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

Here is the output and backtrace of the time-machine call, after roughly 10
hours of object processing on the Software Heritage side:

> guix time-machine -C channels.scm -- shell
Updating channel 'guix' from Git repository at '/https://git.savannah.gnu.org/git/guix.git'...
SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/hooks/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/info/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/info/exclude
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/info/refs
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/info/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/info/packs
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/pack-20648aeebad9dc6d8a29c87bd99d8fd773e1266a.idx
swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/pack-20648aeebad9dc6d8a29c87bd99d8fd773e1266a.pack
Backtrace:
In ice-9/boot-9.scm:
  1752:10 19 (with-exception-handler _ _ #:unwind? _ # _)
In guix/store.scm:
   659:37 18 (thunk)
In guix/status.scm:
    830:4 17 (call-with-status-report _ _)
In guix/store.scm:
   1298:8 16 (call-with-build-handler #<procedure 7f8d1bf5adb0 at g…> …)
In guix/inferior.scm:
   928:34 15 (cached-channel-instance #<store-connection 256.99 7f8…> …)
In guix/channels.scm:
    528:7 14 (loop _ _)
In guix/combinators.scm:
    48:26 13 (fold2 #<procedure 7f8d1bf592a0 at guix/channels.scm:5…> …)
In guix/channels.scm:
   538:29 12 (_ #<<channel> name: guix url: "/https://git.savannah.…> …)
   409:17 11 (latest-channel-instance #<store-connection 256.99 7f8…> …)
In guix/git.scm:
   477:29 10 (update-cached-checkout _ #:ref _ #:recursive? _ # _ # _ …)
    378:2  9 (_ git-error #<<git-error> code: -1 message: "failed to…>)
In guix/utils.scm:
    959:8  8 (call-with-temporary-directory _)
In guix/git.scm:
   380:10  7 (_ "/tmp/guix-directory.v8A5Fq")
In guix/swh.scm:
    655:8  6 (call-with-temporary-directory #<procedure 7f8d248dba80…>)
   682:11  5 (_ "/tmp/guix-directory.4kHVt8")
In guix/build/utils.scm:
  1018:28  4 (_)
In unknown file:
           3 (get-bytevector-n! #<input: string 7f8d1aad5cb0> # 0 #)
In web/response.scm:
     95:2  2 (read! _ _ _)
In ice-9/boot-9.scm:
  1685:16  1 (raise-exception _ #:continuable? _)
  1685:16  0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (53394376 296632320))'.
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: swh\:1\:rev\:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack: Cannot utime: No such file or directory
tar: swh\:1\:rev\:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects: Cannot utime: No such file or directory
tar: swh\:1\:rev\:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git: Cannot utime: No such file or directory
tar: Error is not recoverable: exiting now
zsh: exit 1     guix time-machine -C channels.scm -- shell

Hope this can be fixed soon, good luck ;) 

-- 
Best regards,
Nicolas Graves




* bug#62656: broken guix time-machine + software-heritage
From: Simon Tournier @ 2023-04-04 10:51 UTC
  To: Nicolas Graves, 62656

Hi,

Cool you did this test! :-)

On Mon, 03 Apr 2023 at 23:39, Nicolas Graves via Bug reports for GNU Guix <bug-guix@gnu.org> wrote:

> Here is the output and backtrace of the time-machine call, after roughly 10
> hours of object processing on the Software Heritage side:

Last time I checked, I never got the object from SWH because of a bug
on their side.  Nice that they now cook the content.

Note that SWH is an archive, so extracting a dataset as large as the
Guix repository is expected to take a long time.  Much of the data is
stored cold, not to say frozen, and that’s why it takes a long time to
warm it up.


>> guix time-machine -C channels.scm -- shell
> Updating channel 'guix' from Git repository at '/https://git.savannah.gnu.org/git/guix.git'...
> SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD

[...]

> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/objects/pack/pack-20648aeebad9dc6d8a29c87bd99d8fd773e1266a.pack
> Backtrace:
> In ice-9/boot-9.scm:
>   1752:10 19 (with-exception-handler _ _ #:unwind? _ # _)
> In guix/store.scm:
>    659:37 18 (thunk)

[...]

> In unknown file:
>            3 (get-bytevector-n! #<input: string 7f8d1aad5cb0> # 0 #)
> In web/response.scm:
>      95:2  2 (read! _ _ _)
> In ice-9/boot-9.scm:
>   1685:16  1 (raise-exception _ #:continuable? _)
>   1685:16  0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (53394376 296632320))'.

Well, if I understand correctly, SWH cooked the full Git repository of
Guix and it is probably too big.  Hmm, I do not know how to
investigate…

Thanks for the report!

Cheers,
simon




* bug#62656: broken guix time-machine + software-heritage
From: Ludovic Courtès @ 2023-04-26  9:50 UTC
  To: Nicolas Graves; +Cc: 62656

Hello,

Nicolas Graves <ngraves@ngraves.fr> writes:

> I was trying to use guix time-machine as I did in the past, but the
> recent updates with software heritage seem to have broken my use of it.
>
> Here's the channels.scm file I used:
>
> (list (channel
>         (name 'guix)
>         (url "/https://git.savannah.gnu.org/git/guix.git")
>         (commit "1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1")
>         (introduction
>           (make-channel-introduction
>             "9edb3f66fd807b096b48283debdcddccfea34bad"
>             (openpgp-fingerprint
>               "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

Interesting test!

> Here is the output and backtrace of the time-machine call, after roughly 10
> hours of object processing on the Software Heritage side:
>
>> guix time-machine -C channels.scm -- shell
> Updating channel 'guix' from Git repository at '/https://git.savannah.gnu.org/git/guix.git'...
> SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description

[...]

>            3 (get-bytevector-n! #<input: string 7f8d1aad5cb0> # 0 #)
> In web/response.scm:
>      95:2  2 (read! _ _ _)
> In ice-9/boot-9.scm:
>   1685:16  1 (raise-exception _ #:continuable? _)
>   1685:16  0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (53394376 296632320))'.

I can reproduce it like this:

--8<---------------cut here---------------start------------->8---
$ wget -O/tmp/swh.git \
   "https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/"
--2023-04-26 11:43:22--  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Resolving archive.softwareheritage.org (archive.softwareheritage.org)... 128.93.166.15
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 296632320 (283M) [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              13%[===>                             ]  39.11M  84.1MB/s    in 0.5s    

2023-04-26 11:43:40 (84.1 MB/s) - Connection closed at byte 41015184. Retrying.

--2023-04-26 11:43:41--  (try: 2)  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 296632320 (283M), 255617136 (244M) remaining [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              65%[++++================>            ] 184.66M  96.7MB/s    in 1.5s    

2023-04-26 11:44:00 (96.7 MB/s) - Connection closed at byte 193634304. Retrying.

[…]

--2023-04-26 11:48:01--  (try:12)  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 296632320 (283M), 28199637 (27M) remaining [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              90%[+++++++++++++++++++++++++++++    ] 256.00M  5.39KB/s    in 0.3s    

2023-04-26 11:48:19 (5.39 KB/s) - Connection closed at byte 268434406. Retrying.

--2023-04-26 11:48:29--  (try:13)  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 296632320 (283M), 28197914 (27M) remaining [application/x-tar]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git              90%[+++++++++++++++++++++++++++++    ] 256.00M  --.-KB/s    in 0s      

2023-04-26 11:48:46 (0.00 B/s) - Connection closed at byte 268434406. Retrying.
--8<---------------cut here---------------end--------------->8---

The server keeps closing the connection prematurely.  Unlike our client
in Guile, wget keeps retrying and so, little by little, it eventually
gets more bytes.  In my case it seems to get stuck at 90% though, where
each attempt gives it zero or very few additional bytes.
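
A minimal sketch of that retry behaviour, assuming curl is available.  It
is not what wget or the Guile client does internally, and the helper names
(curl_resume, download_with_retries) are made up for this example.  The
idea is simply to call a resuming fetcher until the file holds the number
of bytes the server announced:

```shell
# Hypothetical fetcher: "-C -" makes curl resume from the current size of
# the output file; "--location" follows HTTP redirects.
curl_resume() {
    curl --silent --show-error --location -C - -o "$2" "$1"
}

# download_with_retries FETCHER URL FILE EXPECTED_BYTES MAX_TRIES
# Call "FETCHER URL FILE" until FILE holds at least EXPECTED_BYTES bytes,
# giving up after MAX_TRIES attempts.  (Names are assumptions of this sketch.)
download_with_retries() {
    fetcher=$1 url=$2 out=$3 expected=$4 max_tries=$5
    attempt=0
    while [ "$attempt" -lt "$max_tries" ]; do
        attempt=$((attempt + 1))
        # A connection closed early makes the fetcher fail; keep going.
        "$fetcher" "$url" "$out" || true
        size=0
        if [ -f "$out" ]; then
            size=$(($(wc -c < "$out")))
        fi
        if [ "$size" -ge "$expected" ]; then
            return 0
        fi
    done
    return 1
}
```

With the length reported in the transcript, a call could look like:

    download_with_retries curl_resume "$url" /tmp/swh.git 296632320 20

Retrying only helps as long as each attempt brings new bytes, which is not
the case once the transfer stalls.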

I suspect this is an issue at SWH.  I’ll bring it up there.

Thanks,
Ludo’.




* bug#62656: broken guix time-machine + software-heritage
From: Ludovic Courtès @ 2023-04-26 10:01 UTC
  To: Nicolas Graves; +Cc: 62656

Ludovic Courtès <ludovic.courtes@inria.fr> writes:

> The server keeps closing the connection prematurely.  Unlike our client
> in Guile, wget keeps retrying and so, little by little, it eventually
> gets more bytes.  In my case it seems to get stuck at 90% though, where
> each attempt gives it zero or very few additional bytes.
>
> I suspect this is an issue at SWH.  I’ll bring it up there.

👉 https://gitlab.softwareheritage.org/swh/devel/swh-vault/-/issues/4346




* bug#62656: broken guix time-machine + software-heritage
From: Simon Tournier @ 2023-04-28 14:43 UTC
  To: Ludovic Courtès, Nicolas Graves; +Cc: 62656

Hi,

On Wed, 26 Apr 2023 at 11:50, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description

[...]


> I suspect this is an issue at SWH.  I’ll bring it up there.

Aside from the potential bug on the SWH side, maybe we could ask for a
flat cooking instead of a git-bare cooking.

Considering the size of the Guix repository, it can take hours to cook
it – remember the test with CRLF ;-) – when most of the time, we need
only one specific revision.

We could tweak ‘clone-from-swh’ from (guix git) to use 'flat
instead of 'git-bare.  However, I am unsure what other tweaks that would
require, since a Git repository is expected.


Cheers,
simon




* bug#62656: broken guix time-machine + software-heritage
From: Ludovic Courtès @ 2023-05-02  7:42 UTC
  To: Simon Tournier; +Cc: 62656, Nicolas Graves

Hi!

Simon Tournier <zimon.toutoune@gmail.com> writes:

> On Wed, 26 Apr 2023 at 11:50, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:
>
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/HEAD
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/branches/
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/config
>>> swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git/description
>
> [...]
>
>
>> I suspect this is an issue at SWH.  I’ll bring it up there.
>
> Aside from the potential bug on the SWH side, maybe we could ask for a
> flat cooking instead of a git-bare cooking.
>
> Considering the size of the Guix repository, it can take hours to cook
> it – remember the test with CRLF ;-) – when most of the time, we need
> only one specific revision.
>
> We could tweak ‘clone-from-swh’ from (guix git) to use 'flat
> instead of 'git-bare.  However, I am unsure what other tweaks that would
> require, since a Git repository is expected.

Yeah, ‘clone-from-swh’ is really cloning, so it needs ‘git-bare’.
Generally, in the case of channels, we need a full clone, not just a
revision.  Various bits of the machinery expect the clone: (guix
describe), (guix channels), and so on.

Ludo’.




* bug#62656: broken guix time-machine + software-heritage
From: Simon Tournier @ 2023-05-02 18:01 UTC
  To: Ludovic Courtès; +Cc: Nicolas Graves, 62656

Hi,

On Tue, 02 May 2023 at 09:42, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

>> We could tweak ‘clone-from-swh’ from (guix git) to use 'flat
>> instead of 'git-bare.  However, I am unsure what other tweaks that would
>> require, since a Git repository is expected.
>
> Yeah, ‘clone-from-swh’ is really cloning, so it needs ‘git-bare’.
> Generally, in the case of channels, we need a full clone, not just a
> revision.  Various bits of the machinery expect the clone: (guix
> describe), (guix channels), and so on.

Even if the bug at SWH were fixed, at the rate the Guix repo is
growing, it would be impractical to cook the whole Guix repo.  And it
seems odd to me when, most of the time, we need only a very restricted
set of commits.

We could imagine locally creating a new repo (git init) and adding only
the content of the commit specified by “guix time-machine”.
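
A minimal sketch of that idea, assuming the content of the requested
revision has already been unpacked into a directory (for instance from a
flat cooking) and that git is installed.  The helper name snapshot_to_repo
and the committer identity are invented for the example:

```shell
# snapshot_to_repo DIR
# Turn the already-extracted source tree DIR into a new repository holding a
# single parentless commit.  Hypothetical helper, not part of Guix.
snapshot_to_repo() {
    tree=$1
    git -C "$tree" init -q || return 1
    git -C "$tree" add -A || return 1
    # Placeholder identity so the commit works without any Git configuration.
    git -C "$tree" \
        -c user.name="snapshot" -c user.email="snapshot@example.invalid" \
        commit -q -m "Snapshot of a single revision"
}
```

The resulting commit has no parents and gets a different ID from the one
named in channels.scm, so anything that relies on the real history would
not work on such a repository.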

Cheers,
simon

PS: Just some numbers backing the growth rate:

        $ git log --oneline | wc -l
        114457

        $ git log --oneline --before=2019-05-01 | wc -l
        43845

        $ git log --oneline --after=2019-05-01 | wc -l
        70612


 1. We are cooking 43845 commits of the history that are useless because
    unreachable with the time-machine.  They pre-date the introduction
    of the inferiors – yes, we could refine and consider v0.15 instead
    of v1.0.0. :-)

 2. The first commit is from 2012.  Over the first 7 years, 38% of the
    history was produced.  In less than 4 years, we have produced
    62% of the history!  Yeah, that’s cool!

    Basically, in less than 5 years from now, we will generate the same
    number of commits as over the past 10 years.
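
The percentages in item 2 follow directly from the three counts above.  A
small sketch of the arithmetic (the function name share is arbitrary; awk
is assumed to be available):

```shell
# Commit counts quoted above: whole history, before and after 2019-05-01.
total=114457
before=43845
after=70612

# share PART WHOLE
# Print PART as a percentage of WHOLE, rounded to the nearest integer.
share() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.0f\n", 100 * part / whole }'
}

echo "before 2019-05-01: $(share "$before" "$total")%"
echo "after 2019-05-01:  $(share "$after" "$total")%"
```

This prints 38% and 62%, and the two counts add up to the total.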
    




* bug#62656: broken guix time-machine + software-heritage
From: Ludovic Courtès @ 2023-05-04  7:22 UTC
  To: Simon Tournier; +Cc: Nicolas Graves, 62656

Hi,

Simon Tournier <zimon.toutoune@gmail.com> writes:

> On Tue, 02 May 2023 at 09:42, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:
>
>>> We could tweak ‘clone-from-swh’ from (guix git) to use 'flat
>>> instead of 'git-bare.  However, I am unsure what other tweaks that would
>>> require, since a Git repository is expected.
>>
>> Yeah, ‘clone-from-swh’ is really cloning, so it needs ‘git-bare’.
>> Generally, in the case of channels, we need a full clone, not just a
>> revision.  Various bits of the machinery expect the clone: (guix
>> describe), (guix channels), and so on.
>
> Even if the bug at SWH were fixed, at the rate the Guix repo is
> growing, it would be impractical to cook the whole Guix repo.

Falling back to SWH to fetch channels is something we expect to be rare,
though.

> And it seems odd to me when, most of the time, we need only a very
> restricted set of commits.
>
> We could imagine locally creating a new repo (git init) and adding only
> the content of the commit specified by “guix time-machine”.

To do that we’d need to say goodbye to the features I mentioned above.

> PS: Just some numbers backing the growth rate:
>
>         $ git log --oneline | wc -l
>         114457
>
>         $ git log --oneline --before=2019-05-01 | wc -l
>         43845
>
>         $ git log --oneline --after=2019-05-01 | wc -l
>         70612
>
>
>  1. We are cooking 43845 commits of the history that are useless because
>     unreachable with the time-machine.  They pre-date the introduction
>     of the inferiors – yes, we could refine and consider v0.15 instead
>     of v1.0.0. :-)
>
>  2. The first commit is from 2012.  Over the first 7 years, 38% of the
>     history was produced.  In less than 4 years, we have produced
>     62% of the history!  Yeah, that’s cool!
>
>     Basically, in less than 5 years from now, we will generate the same
>     number of commits as over the past 10 years.

Heh, insightful figures!

Ludo’.




* bug#62656: broken guix time-machine + software-heritage
From: Simon Tournier @ 2023-05-04  7:57 UTC
  To: Ludovic Courtès; +Cc: Nicolas Graves, 62656

Hi Ludo,

On Thu, 4 May 2023 at 09:22, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

> > Even if the bug at SWH were fixed, at the rate the Guix repo is
> > growing, it would be impractical to cook the whole Guix repo.
>
> Falling back to SWH to fetch channels is something we expect to be rare,
> though.

Being rare will not make it practical. ;-)

What I am trying to point out is that, considering the size of the Guix
repository and its growth rate, the current implementation will not scale and
the fallback will be impossible for the end-user.

> > And it seems odd to me when, most of the time, we need only a very
> > restricted set of commits.
> >
> > We could imagine locally creating a new repo (git init) and adding only
> > the content of the commit specified by “guix time-machine”.
>
> To do that we’d need to say goodbye to the features I mentioned above.

Well, I do not see which features will be missing.  I am talking about
making this practical:

    guix time-machine -C channels.scm -- shell -m manifest.scm

and not about having a complete working Guix.  Say I read a paper that
mentions this command line and want to inspect it, so I run the
command.  I do not care about the other 114456 commits of
the history.  And for sure "guix time-machine -C channels.scm --
describe -f channels" will not be a fixed point.

Maybe we could imagine an option for bypassing the complete clone
and restricting it to one specific commit.


Cheers,
simon




* bug#62656: broken guix time-machine + software-heritage
From: Ludovic Courtès @ 2023-05-04 13:05 UTC
  To: Simon Tournier; +Cc: Nicolas Graves, 62656

Hi,

Simon Tournier <zimon.toutoune@gmail.com> writes:

> On Thu, 4 May 2023 at 09:22, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:
>
>> > Even if the bug at SWH were fixed, at the rate the Guix repo is
>> > growing, it would be impractical to cook the whole Guix repo.
>>
>> Falling back to SWH to fetch channels is something we expect to be rare,
>> though.
>
> Being rare will not make it practical. ;-)
>
> What I am trying to point out is that, considering the size of the Guix
> repository and its growth rate, the current implementation will not scale and
> the fallback will be impossible for the end-user.

It’s not impossible, it just takes time (how long exactly, I don’t know,
we should check with the SWH folks what we can expect and what the
relevant factors are.)

That it takes time is acceptable IMO: we’re likely talking about
disaster recovery after the Savannah repo and its GitHub mirror have
disappeared.

Other channels are typically smaller but also more likely to vanish; I
wonder how that affects the cooking time at SWH (again, something to ask
them).

>> > And it seems odd to me when, most of the time, we need only a very
>> > restricted set of commits.
>> >
>> > We could imagine locally creating a new repo (git init) and adding only
>> > the content of the commit specified by “guix time-machine”.
>>
>> To do that we’d need to say goodbye to the features I mentioned above.
>
> Well, I do not see which features will be missing.

Those mentioned earlier, provenance tracking and downgrade detection in
particular.

Ludo’.




* bug#62656: broken guix time-machine + software-heritage
From: Simon Tournier @ 2023-05-04 17:00 UTC
  To: Ludovic Courtès; +Cc: 62656, Nicolas Graves

Hi,

On Thu, 04 May 2023 at 15:05, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

>> Well, I do not see which features will be missing.
>
> Those mentioned earlier, provenance tracking and downgrade detection in
> particular.

Do we care about provenance tracking for this scenario?  Similarly, do
we care about downgrade detection for this scenario?

I mean, we are not talking about a regular scenario but as you said a
worst-case scenario.

Somehow, I am missing where “security” (provenance tracking and
downgrade detection) fits in the picture.

If tomorrow Savannah is totally down and, let’s assume, the malicious Eve is
serving https://git.savannah.gnu.org/git/guix.git, the authentication
is useless since Eve can easily rewrite it.  The only mechanism that
protects Alice is the commit SHA-1 hash she has at hand.  Eve needs to
attack this SHA-1 with some collision.  And if it’s possible to produce
a pre-image attack on SHA-1, then nothing would prevent Eve from also
replacing the origins of some packages in
https://git.savannah.gnu.org/git/guix.git.
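
For illustration, the check Alice can do with the commit hash she has at
hand could look like the sketch below.  The helper name head_is_commit is
made up; it assumes git is installed and that the repository obtained from
the mirror or from SWH sits in a local directory.  It only compares
identifiers, so it is exactly as strong as SHA-1:

```shell
# head_is_commit DIR EXPECTED
# Succeed only if HEAD of the repository in DIR is exactly commit EXPECTED.
# Hypothetical helper; it compares IDs and does not check any signature.
head_is_commit() {
    dir=$1 expected=$2
    actual=$(git -C "$dir" rev-parse --verify --quiet 'HEAD^{commit}') || return 1
    [ "$actual" = "$expected" ]
}
```

For example, on an extracted clone in some-dir:

    head_is_commit some-dir 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 && echo "commit matches"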

Moreover, cloning from SWH using git-bare does not protect either.
Well, you are trusting SWH.  You have no means to be sure that
the repository you get back from SWH is the one you expect.  The only
way is to inspect the signatures; it means the end-user knows exactly
which GPG key from .guix-authorizations they must trust.

Obviously, the former could be injected in the latter. ;-)  Note that
SWH heavily relies on SHA-1, IIUC.

Yeah, we should talk with SWH’s folks. :-)

Cheers,
simon




* bug#62656: broken guix time-machine + software-heritage
From: Ludovic Courtès @ 2023-05-05  7:36 UTC
  To: Simon Tournier; +Cc: 62656, Nicolas Graves

Hi!

Simon Tournier <zimon.toutoune@gmail.com> writes:

> On Thu, 04 May 2023 at 15:05, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:
>
>>> Well, I do not see which features will be missing.
>>
>> Those mentioned earlier, provenance tracking and downgrade detection in
>> particular.
>
> Do we care about provenance tracking for this scenario?  Similarly, do
> we care about downgrade detection for this scenario?

Provenance tracking, yes.  I wrote about the current status: (guix
describe), (guix channels), etc. expect a full Git repo, which is why
things are done this way.

We could imagine a different design, but that’s a broader endeavor.

[...]

> If tomorrow Savannah is totally down and, let’s assume, the malicious Eve is
> serving https://git.savannah.gnu.org/git/guix.git, the authentication
> is useless since Eve can easily rewrite it.

The authentication mechanism is designed to make this impossible.
That’s why one can run:

  guix pull --url=https://github.com/guix-mirror/guix

without fear (worst that can happen is that the mirror is stale).

> The only mechanism that protects Alice is the commit SHA-1 hash she
> has at hand.  Eve needs to attack this SHA-1 with some collision.  And
> if it’s possible to produce a pre-image attack on SHA-1, then nothing
> would prevent Eve from also replacing the origins of some packages in
> https://git.savannah.gnu.org/git/guix.git.

True to some extent—see the section about SHA1 in the Programming paper¹.

Ludo’.

¹ https://doi.org/10.22152/programming-journal.org/2023/7/1




* bug#62656: broken guix time-machine + software-heritage
From: Simon Tournier @ 2023-10-24 13:23 UTC
  To: Ludovic Courtès, Nicolas Graves; +Cc: 62656

Hi,

On Wed, 26 Apr 2023 at 12:01, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

>> I suspect this is an issue at SWH.  I’ll bring it up there.
>
> https://gitlab.softwareheritage.org/swh/devel/swh-vault/-/issues/4346

Issue closed. \o/

Now, it passes:

--8<---------------cut here---------------start------------->8---
$ time wget -O/tmp/swh.git \
   "https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/"
--2023-10-24 15:12:14--  https://archive.softwareheritage.org/api/1/vault/git-bare/swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1/raw/
Resolving archive.softwareheritage.org (archive.softwareheritage.org)... 128.93.166.15
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|128.93.166.15|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://swhvaultstorage.blob.core.windows.net/contents-uncompressed/4210e49babbe65df77ab7075d68615ca5edc2a23?se=2023-10-25T13%3A12%3A14Z&sp=r&sv=2019-02-02&sr=b&rscd=attachment%3B%20filename%3D%22swh_1_rev_1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git.tar%22&sig=scSRKMM3zV0UO5rb91lk/M8AUhlQEeKrhm31VbvhB6w%3D [following]
--2023-10-24 15:12:14--  https://swhvaultstorage.blob.core.windows.net/contents-uncompressed/4210e49babbe65df77ab7075d68615ca5edc2a23?se=2023-10-25T13%3A12%3A14Z&sp=r&sv=2019-02-02&sr=b&rscd=attachment%3B%20filename%3D%22swh_1_rev_1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git.tar%22&sig=scSRKMM3zV0UO5rb91lk/M8AUhlQEeKrhm31VbvhB6w%3D
Resolving swhvaultstorage.blob.core.windows.net (swhvaultstorage.blob.core.windows.net)... 20.209.11.33
Connecting to swhvaultstorage.blob.core.windows.net (swhvaultstorage.blob.core.windows.net)|20.209.11.33|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 296632320 (283M) [application/octet-stream]
Saving to: ‘/tmp/swh.git’

/tmp/swh.git                      100%[===========================================================>] 282.89M  6.87MB/s    in 37s     

2023-10-24 15:12:51 (7.70 MB/s) - ‘/tmp/swh.git’ saved [296632320/296632320]


real	0m37.034s
user	0m0.973s
sys	0m2.602s
--8<---------------cut here---------------end--------------->8---

Please note:

--8<---------------cut here---------------start------------->8---
$ file swh.git
swh.git: POSIX tar archive (GNU)

$ mkdir -p some-dir
$ mv swh.git some-dir/
$ cd some-dir/
$ tar -xf swh.git
$ mv swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1.git .git

$ git log --oneline -10
1984d56b0e (HEAD -> master) gnu: Add scilab.
13b2d110ee gnu: Add suitesparse-3.
f4d7b901db gnu: matio: Add header file.
42b938ae8c gnu: Add audmes.
be5e280e5f Revert "gnu: network-manager: Update to 1.43.4."
7ceedc7df7 gnu: conan: Update to 2.0.2.
57c3662ddd gnu: conan: Use gexps and remove input labels.
113146d31c gnu: r-mumin: Update to 1.47.5.
c029bac121 gnu: r-tclust: Update to 1.5-4.
aadc68f297 gnu: r-car: Update to 3.1-2.

$ git log --oneline | wc -l
110743

$ git log --format="%cd %s" | tail -3
Wed Apr 18 23:34:19 2012 +0200 Add `.gitignore'.
Wed Apr 18 23:34:12 2012 +0200 Split (guix) in (guix store) and (guix derivations).
Wed Apr 18 23:21:11 2012 +0200 Initial commit.
--8<---------------cut here---------------end--------------->8---

And only the master branch seems to be around:

--8<---------------cut here---------------start------------->8---
$ git branch -avv
* master 1984d56b0e gnu: Add scilab.
--8<---------------cut here---------------end--------------->8---

Lastly, there is an SWH redirection that is probably not supported:

--8<---------------cut here---------------start------------->8---
$ guix time-machine -q --commit=1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 -- describe
SWH: found revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 with directory at 'https://archive.softwareheritage.org/api/1/directory/1ea499e7529e67a0632ecbe0a8214f0618a82c1a/'
SWH: object swh:1:rev:1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 could not be fetched from the vault
guix time-machine: warning: revision 1984d56b0e437af7be7fa6cf8e1a00e45eb8ffa1 of https://git.savannah.gnu.org/git/guix.git could not be fetched from Software Heritage
guix time-machine: error: Git error: failed to resolve address for git.savannah.gnu.org: Name or service not known
--8<---------------cut here---------------end--------------->8---

The issue is progressing…

Cheers,
simon




* bug#62656: close 62656
From: Nicolas Graves via Bug reports for GNU Guix @ 2024-02-04 13:03 UTC
  To: 62656-done


Issue fixed.

-- 
Best regards,
Nicolas Graves




Thread overview: 14+ messages
2023-04-03 21:39 bug#62656: broken guix time-machine + software-heritage Nicolas Graves via Bug reports for GNU Guix
2023-04-04 10:51 ` Simon Tournier
2023-04-26  9:50 ` Ludovic Courtès
2023-04-26 10:01   ` Ludovic Courtès
2023-10-24 13:23     ` Simon Tournier
2023-04-28 14:43   ` Simon Tournier
2023-05-02  7:42     ` Ludovic Courtès
2023-05-02 18:01       ` Simon Tournier
2023-05-04  7:22         ` Ludovic Courtès
2023-05-04  7:57           ` Simon Tournier
2023-05-04 13:05             ` Ludovic Courtès
2023-05-04 17:00               ` Simon Tournier
2023-05-05  7:36                 ` Ludovic Courtès
2024-02-04 13:03 ` bug#62656: close 62656 Nicolas Graves via Bug reports for GNU Guix
