From: Simon Tournier <zimon.toutoune@gmail.com>
To: Timothy Sample <samplet@ngyro.com>, guix-devel@gnu.org
Subject: Re: Preservation of Guix (PoG) report 2023-03-13
Date: Tue, 14 Mar 2023 11:36:48 +0100
Message-ID: <86356739hb.fsf@gmail.com>
In-Reply-To: <87r0tsm7u4.fsf@ngyro.com>

Hi,

On Mon, 13 Mar 2023 at 19:37, Timothy Sample <samplet@ngyro.com> wrote:

> Note that you can link to the most recent version of the report using
> <https://ngyro.com/pog-reports/latest/>.

Awesome! \o/

Well, I do not remember whether you also consider the ’origin’ records
(fixed-outputs) that are used as ’inputs’ or ’patches’.  Do you?

Basically, what ’package-direct-sources’ from (guix packages) returns.

For instance, see the package ’ntp’,

--8<---------------cut here---------------start------------->8---
(source
     (origin
       (method url-fetch)
       (uri (list (string-append
                   "https://www.eecis.udel.edu/~ntp/ntp_spool/ntp4/ntp-"
                   (version-major+minor version)
[...]
       (sha256
        (base32 "06cwhimm71safmwvp6nhxp6hvxsg62whnbgbgiflsqb8mgg40n7n"))
       ;; Add an upstream patch to fix build with GCC 10.  Taken from
       ;; <https://bugs.ntp.org/show_bug.cgi?id=3688>.
       (patches (list (origin
                        (method url-fetch)
                        (uri "https://bugs.ntp.org/attachment.cgi?id=1760\
&action=diff&context=patch&collapsed=&headers=1&format=raw")
                        (file-name "ntp-gcc-compat.patch")
                        (sha256
                         (base32
                          "13d28sg45rflc7kqiv30asrhna8n69wlpwx16l65rravgpvp90h2")))
--8<---------------cut here---------------end--------------->8---

or see the package ’tensorflow’,

--8<---------------cut here---------------start------------->8---
    (native-inputs
     `(("pkg-config" ,pkg-config)
[...]
       ("boringssl-src"
        ,(let ((commit "ee7aa02")
               (revision "1"))
           (origin
             (method git-fetch)
             (uri (git-reference
                   (url "https://boringssl.googlesource.com/boringssl")
                   (commit commit)))
             (file-name (string-append "boringssl-0-" revision
                                       (string-take commit 7)
                                       "-checkout"))
             (sha256
              (base32
               "1jf693q0nw0adsic6cgmbdx6g7wr4rj4vxa8j1hpn792fqhd8wgw")))))
--8<---------------cut here---------------end--------------->8---


> Over the whole set, 77.1% are known to be safely tucked away in the
> Software Heritage archive.  But it’s actually much better than that.  If
> we only look at the most recent sampled commit (from Sunday the 5th),
> that number becomes 87.4%, which is starting to look pretty good!

Just to point out that the new nixguix loader [1] is still in SWH staging
and not yet deployed, IIRC.  It will not change the coverage much on our
side but it should fix some corner cases.

1: <https://gitlab.softwareheritage.org/swh/meta/-/issues/4662>


>      This is kinda like an automated version of Simon’s recent
> investigation.

Neat!  Note that I also wanted to check the SWH capacity for cooking,
not only the end points.  For instance, it allowed us to discover a
mismatch due to a CR/LF normalization that was not covered; now fixed
with commit 58f20fa8181bdcd4269671e1d3cef1268947af3a.
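
To illustrate why line-ending normalization breaks a checksum match,
here is a minimal sketch in Python.  It is not what Guix or SWH run; the
function names and the sample bytes are made up for the example.  It
only shows that rewriting CR/LF as LF on one side yields different
bytes, hence a different SHA-256.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hexadecimal SHA-256 digest of DATA."""
    return hashlib.sha256(data).hexdigest()


def normalize_crlf(data: bytes) -> bytes:
    """Rewrite CR/LF line endings as LF (hypothetical normalization step)."""
    return data.replace(b"\r\n", b"\n")


# Made-up file content with CR/LF line endings.
original = b"line one\r\nline two\r\n"
normalized = normalize_crlf(original)

# The two digests differ, so a hash recorded for one form of the file
# cannot validate the other form.
print(sha256_hex(original))
print(sha256_hex(normalized))
```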


> Here’s a rough road map for that based on a glance at the script’s
> output:
>
>     • Subversion support (for TeX-based documentation stuff, I guess)

For the interested reader, details for helping in the implementation:

    https://issues.guix.gnu.org/issue/43442#9
    https://issues.guix.gnu.org/issue/43442#11

However, the whole dance would be easier if SWH considered storing and
exposing NAR hashes on their side, as discussed here:

    https://gitlab.softwareheritage.org/swh/meta/-/issues/4538
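
For readers unfamiliar with NAR hashes, here is a rough sketch in Python
of how one is computed for a single regular file.  It follows the NAR
layout as I understand it: strings prefixed by their length as a 64-bit
little-endian integer and zero-padded to a multiple of 8 bytes.  The
function names are illustrative and directories are not handled.  Note
that Guix displays such hashes in nix-base32 while the sketch prints
hexadecimal.

```python
import hashlib
import struct


def nar_string(data: bytes) -> bytes:
    """Encode DATA as a NAR string: length (u64 LE), bytes, zero padding."""
    padding = (8 - len(data) % 8) % 8
    return struct.pack("<Q", len(data)) + data + b"\0" * padding


def nar_of_regular_file(contents: bytes, executable: bool = False) -> bytes:
    """Serialize one regular file as a NAR (sketch: no directories/symlinks)."""
    parts = [
        nar_string(b"nix-archive-1"),
        nar_string(b"("),
        nar_string(b"type"),
        nar_string(b"regular"),
    ]
    if executable:
        # The executable bit is the only permission recorded in a NAR.
        parts += [nar_string(b"executable"), nar_string(b"")]
    parts += [nar_string(b"contents"), nar_string(contents), nar_string(b")")]
    return b"".join(parts)


def nar_sha256_hex(contents: bytes, executable: bool = False) -> str:
    """Return the hexadecimal SHA-256 of the NAR serialization."""
    return hashlib.sha256(nar_of_regular_file(contents, executable)).hexdigest()


# The NAR hash differs from the plain hash of the same bytes, which is
# why a lookup by NAR hash needs support on the archive side.
print(nar_sha256_hex(b"hello"))
print(hashlib.sha256(b"hello").hexdigest())
```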


>              However, 42% of them are old Bioconductor packages.  They
> seem to be lost.  It looks like Bioconductor now stores multiple package
> versions per Bioconductor version [2], but before version 3.15 that was
> not the case.  As an example, take “ggcyto” from Bioconductor 3.10 [3].
> We packaged version 1.14.0, and then at some point Bioconductor 3.10
> switched to version 1.14.1.  We packaged that, too, but now 1.14.0 is
> gone.

Well, I have not investigated much because it falls between December 2019
and March 2020, and “guix time-machine” is not smooth that far back.

First question: do we have the source tarball in Berlin or Bordeaux or
somewhere else?  If yes, there is hope. :-) Else, it is probably gone
forever.

The hope is: https://git.bioconductor.org/packages/ggcyto

If we have the tarball with the correct checksum from commit
f5f440312d848e12463f0c6f7510a86b623a9e27

--8<---------------cut here---------------start------------->8---
+    (version "1.14.0")
+    (source
+     (origin
+       (method url-fetch)
+       (uri (bioconductor-uri "ggcyto" version))
+       (sha256
+        (base32
+         "165qszvy5z176h1l3dnjb5dcm279b6bjl5n5gzz8wfn4xpn8anc8"))))
--8<---------------cut here---------------end--------------->8---

then we can disassemble it and then, using the Git repository, we can try
to assemble the content from SWH and the metadata from the Disarchive DB.

For sure, this is yet another example of why we should augment the way
Guix fetches with intrinsic identifiers.  See:

    https://lists.gnu.org/archive/html/guix-devel/2023-03/msg00025.html
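
As a concrete case of an intrinsic identifier, here is a minimal sketch
in Python computing the SWHID of a single file content.  It assumes the
content identifier is the Git blob hash (SHA-1 over a "blob <size>\0"
header followed by the bytes) prefixed with "swh:1:cnt:".  The function
names are illustrative, not part of any Guix or SWH API.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the Git blob hash of DATA (same value as git hash-object)."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def swhid_for_content(data: bytes) -> str:
    """Return the SWHID of a file content, computed from its bytes alone."""
    return "swh:1:cnt:" + git_blob_sha1(data)


# The identifier depends only on the bytes, not on where they were
# fetched from, so it survives the disappearance of the upstream URL.
print(swhid_for_content(b""))
# swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```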


>        I know it’s been discussed before, but I can’t remember what the
> conclusion was.  Are these just gone forever?

Discussed here:

    https://issues.guix.gnu.org/issue/39885
    https://issues.guix.gnu.org/issue/54787



Cheers,
simon

