From: "Ludovic Courtès" <ludovic.courtes@inria.fr>
To: zimoun <zimon.toutoune@gmail.com>
Cc: rekado@elephly.net, Timothy Sample <samplet@ngyro.com>,
	39885@debbugs.gnu.org, me@tobias.gr
Subject: bug#39885: Bioconductor tarballs are not archived
Date: Fri, 22 Dec 2023 14:40:01 +0100
Message-ID: <874jgacq4u.fsf_-_@gnu.org>
In-Reply-To: <87lesqmmrr.fsf@gmail.com> (zimoun's message of "Mon, 18 Jul 2022 18:03:04 +0200")

Hello!

zimoun <zimon.toutoune@gmail.com> skribis:

> Since 2020, I have provided several examples of breakage with bug#39885 [1].
> Here is another one:
>
> $ guix time-machine --commit=77e2de365497bf4c8b81cbd78624f78293490485 \
>        -- build r-biocneighbors -S

[...]

> Starting download of /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz
> From https://web.archive.org/web/20220718175152/https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz...
> download failed "https://web.archive.org/web/20220718175152/https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz" 404 "NOT FOUND"
> Trying to use Disarchive to assemble /gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz...
> could not find its Disarchive specification
> failed to download "/gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz" from ("https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.4.1.tar.gz" "https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/BiocNeighbors_1.4.1.tar.gz")
> builder for `/gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv' failed to produce output path `/gnu/store/zgf7x09kgiqbvj0dmhplxi1xzpljxd7k-BiocNeighbors_1.4.1.tar.gz'
> build of /gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv failed
> View build log at '/var/log/guix/drvs/q9/ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv.gz'.
> guix build: error: build of `/gnu/store/q9ggmh5a9bzmnr49p10x1w9sv6pzjarv-BiocNeighbors_1.4.1.tar.gz.drv' failed
>
> Well, several comments:
>
>  1. Berlin or Bordeaux do not have it as substitutes,
>  2. Disarchive does not have it,
>  3. Many others neither.

I was wondering whether we’re now doing better for Bioconductor
tarballs.  The answer, based on a small sample, seems to be “not quite”:

--8<---------------cut here---------------start------------->8---
$ guix lint -c archival $(guix package -A ^r-bioc | cut -f1)
gnu/packages/bioconductor.scm:19708:12: r-biocbaseutils@1.4.0: Disarchive entry refers to non-existent SWH directory '726af85395d163b5a21e52e4df1bf18aa0072f6b'
gnu/packages/bioconductor.scm:19752:12: r-bioccheck@1.38.0: Disarchive entry refers to non-existent SWH directory '12cfedcbc27005a3fb7e01c5c4b727e0116f596f'
gnu/packages/bioconductor.scm:16892:5: r-biocfilecache@2.10.1: Disarchive entry refers to non-existent SWH directory '6a2d6d909a7cedd56e96f5a98770deeaaaa8d220'
gnu/packages/bioconductor.scm:4540:12: r-biocgenerics@0.48.1: Disarchive entry refers to non-existent SWH directory '6f19ea14f46dbc75909b77bc08e9023daae6fb9e'
gnu/packages/bioconductor.scm:19785:5: r-biocgraph@1.64.0: Disarchive entry refers to non-existent SWH directory '977ff052b4e6c948af7af0fc14ae61f71427cb1a'
gnu/packages/bioconductor.scm:21524:6: r-biocio@1.12.0: Disarchive entry refers to non-existent SWH directory '29d8fef9a5b386384f20513c612f1e34f6118532'
gnu/packages/bioconductor.scm:13090:5: r-biocneighbors@1.20.0: Disarchive entry refers to non-existent SWH directory '6d3728b2dee78cceecdeba0318f3e57b6013d96f'
gnu/packages/bioconductor.scm:19957:5: r-bioconcotk@1.22.0: Disarchive entry refers to non-existent SWH directory '251081d4bc3f061ef8e16338eb042ad4c71ed02d'
gnu/packages/bioconductor.scm:20003:5: r-biocor@1.26.0: Disarchive entry refers to non-existent SWH directory '0cc9d3dcde06fb353cdd77f3b538845d16a77720'
gnu/packages/bioconductor.scm:6613:12: r-biocparallel@1.36.0: Disarchive entry refers to non-existent SWH directory '41e09414898f61655bcc99fdd44d69b0531c0b2d'
gnu/packages/bioconductor.scm:20030:5: r-biocpkgtools@1.20.0: Disarchive entry refers to non-existent SWH directory '55de8618648ed16797a8effd5b508c652a5d7cbe'
gnu/packages/bioconductor.scm:20144:5: r-biocset@1.16.0: Disarchive entry refers to non-existent SWH directory '1cfa6cac0cb453f2882a35c8f5ae6ddfa713ad2d'
gnu/packages/bioconductor.scm:13276:5: r-biocsingular@1.18.0: Disarchive entry refers to non-existent SWH directory '992d3f9d48633fa5d46b9a7640a825054e9538aa'
gnu/packages/bioconductor.scm:19806:12: r-biocstyle@2.30.0: Disarchive entry refers to non-existent SWH directory 'bb17c3bd9ac7c373b24782fcfecdde5fa2f0a965'
gnu/packages/bioconductor.scm:22965:5: r-biocthis@1.12.0: Disarchive entry refers to non-existent SWH directory '3d08f77aae1e81ce9ca9bb9ae2adf4d4c7421d11'
gnu/packages/bioconductor.scm:4521:5: r-biocversion@3.18.1: source not archived on Software Heritage and missing from the Disarchive database
gnu/packages/bioconductor.scm:19830:12: r-biocviews@1.70.0: Disarchive entry refers to non-existent SWH directory '47e0877ab988469fc09a37505dd769f9626cac2e'
gnu/packages/bioconductor.scm:20182:5: r-biocworkflowtools@1.28.0: Disarchive entry refers to non-existent SWH directory '393f3472cc27f632caea3488aef93a7675b403ef'
$ guix describe
Generation 285  Dec 17 2023 23:31:56    (current)
  guix 6ab2426
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: 6ab242609daec00e8bd54f7bff54557c92695724
--8<---------------cut here---------------end--------------->8---

In all cases but one, we’re doing the right thing Disarchive-wise (a
Disarchive entry exists), but Software Heritage (SWH) did not archive
the directories those entries refer to.

<https://guix.gnu.org/sources.json> has entries like:

--8<---------------cut here---------------start------------->8---
    {
      "type": "url",
      "urls": [
        "https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.20.0.tar.gz",
        "https://bioconductor.org/packages/3.18/bioc/src/contrib/BiocNeighbors_1.20.0.tar.gz",
        "https://bordeaux.guix.gnu.org/file/BiocNeighbors_1.20.0.tar.gz/sha256/0a5wg099fgwjbzd6r3mr4l02rcmjqlkdcz1w97qzwx1mir41fmas",
        "https://ci.guix.gnu.org/file/BiocNeighbors_1.20.0.tar.gz/sha256/0a5wg099fgwjbzd6r3mr4l02rcmjqlkdcz1w97qzwx1mir41fmas",
        "https://tarballs.nixos.org/sha256/0a5wg099fgwjbzd6r3mr4l02rcmjqlkdcz1w97qzwx1mir41fmas"
      ],
      "integrity": "sha256-WlUXSI41dP7xSTx81ibFsrIsACW5jmzaX5I/lxJ4vCg=",
      "outputHashAlgo": "sha256",
      "outputHashMode": "flat"
    },
--8<---------------cut here---------------end--------------->8---

Note that we have at least one copy on our infra:

--8<---------------cut here---------------start------------->8---
$ wget -qO- "https://bordeaux.guix.gnu.org/file/BiocNeighbors_1.20.0.tar.gz/sha256/0a5wg099fgwjbzd6r3mr4l02rcmjqlkdcz1w97qzwx1mir41fmas"|guix hash  - -f base64
WlUXSI41dP7xSTx81ibFsrIsACW5jmzaX5I/lxJ4vCg=
--8<---------------cut here---------------end--------------->8---
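
The same SHA-256 digest appears in this message under three spellings:
base64 in the "integrity" field, Nix-style base32 in the /file/ URLs,
and hexadecimal in the Disarchive URL.  As a minimal sketch (plain
Python rather than Guix code; the helper name `nix_base32` is made up
for illustration), the conversion between them can be written as:

```python
import base64

# Alphabet of the Nix/Guix base32 encoding: digits and lowercase
# letters, without e, o, t and u.
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode DIGEST the way Nix and Guix print base32 hashes."""
    length = (len(digest) * 8 + 4) // 5  # ceil(number of bits / 5)
    chars = []
    # The most significant 5-bit group is emitted first, so walk the
    # groups from the highest bit offset down to zero.
    for n in range(length - 1, -1, -1):
        i, j = divmod(n * 5, 8)  # byte index, bit offset within the byte
        low = digest[i] >> j
        high = digest[i + 1] << (8 - j) if i + 1 < len(digest) else 0
        chars.append(NIX_BASE32_ALPHABET[(low | high) & 0x1F])
    return "".join(chars)


# "integrity" value copied from the sources.json entry quoted above.
integrity = "sha256-WlUXSI41dP7xSTx81ibFsrIsACW5jmzaX5I/lxJ4vCg="
digest = base64.b64decode(integrity.split("-", 1)[1])

print(digest.hex())        # hexadecimal form
print(nix_base32(digest))  # form used in the .../file/NAME/sha256/HASH URLs
```

This only re-encodes a digest that is already known; it does not hash
the tarball itself, which is what the ‘guix hash’ command above does.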

<https://ci.guix.gnu.org/file/BiocNeighbors_1.20.0.tar.gz/sha256/0a5wg099fgwjbzd6r3mr4l02rcmjqlkdcz1w97qzwx1mir41fmas>
returns 404, and I can see why: for /file, ‘guix publish’ relies on
things being available in the store, and we no longer keep them on
ci.guix.  We do have a substitute at
<https://ci.guix.gnu.org/nar/6kfpflffl7b4hx6ibb5k879ar8ffcxb7-BiocNeighbors_1.20.0.tar.gz>
though, so we should fix this.

What about hypothesis (2), that Disarchive does not have it?  This is
what we have:

--8<---------------cut here---------------start------------->8---
$ wget -qO- https://disarchive.guix.gnu.org/sha256/5a5517488e3574fef1493c7cd626c5b2b22c0025b98e6cda5f923f971278bc28 | grep swh
                        (swhid "swh:1:dir:6d3728b2dee78cceecdeba0318f3e57b6013d96f"))
--8<---------------cut here---------------end--------------->8---
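
The lookup above can be sketched in Python, under two assumptions: that
Disarchive specifications are served at <base>/sha256/<hex digest>, as
the command suggests, and that the directory reference always has the
`(swhid "swh:1:dir:...")` shape shown in the output.  The function names
are hypothetical, and no network access is performed here:

```python
import base64
import re

# Host taken from the wget command above.
DISARCHIVE_BASE = "https://disarchive.guix.gnu.org"


def disarchive_url(integrity: str) -> str:
    """Map an SRI string such as "sha256-<base64>" to a Disarchive URL."""
    algorithm, b64 = integrity.split("-", 1)
    return f"{DISARCHIVE_BASE}/{algorithm}/{base64.b64decode(b64).hex()}"


def swh_directory_ids(spec_text: str) -> list:
    """Return the SWH directory hashes referenced in a Disarchive spec."""
    return re.findall(r'\(swhid "swh:1:dir:([0-9a-f]{40})"\)', spec_text)


# Values copied from this message.
url = disarchive_url("sha256-WlUXSI41dP7xSTx81ibFsrIsACW5jmzaX5I/lxJ4vCg=")
ids = swh_directory_ids(
    '(swhid "swh:1:dir:6d3728b2dee78cceecdeba0318f3e57b6013d96f"))')

print(url)
print(ids)
```

The directory hash extracted this way is the one that ‘guix lint -c
archival’ reports as non-existent for r-biocneighbors@1.20.0.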

I checked with folks on #swh-devel and it turns out that “the legacy
nixguix lister that is still used in production did not detect the
fallback URL as a tarball URL” (the bordeaux.guix.gnu.org URL), but this
is fixed in the new lister, which should be in production “soon”.

As for past tarballs, #swh-devel comrades say we could send them a list
of URLs and they’d create “Save Code Now” requests on our behalf (we
cannot do it ourselves since the site doesn’t accept plain tarballs).

Any volunteer to write a script that’d generate a list of Bioconductor
content-addressed URLs (the bordeaux.guix.gnu.org/file ones) for, say,
the past couple of years?
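
A minimal sketch of what such a script could look like, in Python.  It
assumes that sources.json is a JSON object whose "sources" key holds
entries shaped like the one quoted above; that top-level key is an
assumption, since only a single entry is shown in this message.  The
function names and the second sample entry are hypothetical:

```python
import json
import urllib.request

CAS_PREFIX = "https://bordeaux.guix.gnu.org/file/"


def load_sources(url: str = "https://guix.gnu.org/sources.json") -> dict:
    """Download and parse sources.json (not called in this sketch)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def bioconductor_cas_urls(sources: dict) -> list:
    """Return the content-addressed URLs of Bioconductor tarballs."""
    found = set()
    # Assumption: entries live under a top-level "sources" key.
    for entry in sources.get("sources", []):
        urls = entry.get("urls", [])
        # Keep only entries with at least one upstream Bioconductor URL.
        if any("//bioconductor.org/" in u for u in urls):
            found.update(u for u in urls if u.startswith(CAS_PREFIX))
    return sorted(found)


sample = {
    "sources": [
        {   # URLs copied from the entry quoted above
            "type": "url",
            "urls": [
                "https://bioconductor.org/packages/release/bioc/src/contrib/BiocNeighbors_1.20.0.tar.gz",
                "https://bordeaux.guix.gnu.org/file/BiocNeighbors_1.20.0.tar.gz/sha256/0a5wg099fgwjbzd6r3mr4l02rcmjqlkdcz1w97qzwx1mir41fmas",
            ],
        },
        {   # hypothetical non-Bioconductor entry, for illustration only
            "type": "url",
            "urls": ["https://example.org/foo-1.0.tar.gz"],
        },
    ]
}

for cas_url in bioconductor_cas_urls(sample):
    print(cas_url)
```

A single sources.json presumably only describes one Guix revision, so
covering the past couple of years would mean running this against the
data of older revisions as well and merging the results.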

Thanks!

Ludo’.