unofficial mirror of guix-devel@gnu.org 
* Preservation of Guix Report 2021-12-06
From: Timothy Sample @ 2021-12-06 19:59 UTC
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 2822 bytes --]

Hi Guix,

This is an update to the preservation of Guix report.  There are no new
commits or fixed-output derivations in this report, but I spent some
time cleaning up the results, and I think the improvements are worth
sharing.  The last report generated a lot of questions.  This one
doesn’t answer all of them, but it’s a big improvement:

    <https://ngyro.com/pog-reports/2021-12-06/>

Since the last report, I added many more reference categories and moved
them to the database.  The new categories are 'hg', 'svn', 'cvs', 'bzr',
'tar-bz2', 'tar', 'zip', and 'text'.  Of these, only 'tar' and 'text'
are being processed.  The rest are currently unsupported by my scripts.
Moving the categories to the database allows me to make manual
corrections when needed.  It also encouraged me to look through the
references a bit more carefully to track down some of the weirder 'text'
sources (like Bash patches) and fix up some other ones (in the style of
“/tar_gz?download=yes”).

I also made the fetching code more tenacious.  Now it uses the
content-addressed mirrors from Guix and Nix to find regular files, and
will recover “easy” Git references from SWH (“easy” means the commit is
specified).
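
As a rough illustration of the “easy” distinction, here is a minimal
Python sketch.  It assumes that “the commit is specified” means the
commit field holds a full 40-character hexadecimal commit ID rather
than a tag or 'git describe' string.  The helper name is hypothetical
and is not taken from the actual scripts.

```python
import re

# Assumption: an "easy" reference names a full 40-hex-digit commit ID,
# as opposed to a tag or `git describe` output.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_easy_reference(commit: str) -> bool:
    """Return True if COMMIT looks like a full commit ID (hypothetical helper)."""
    return FULL_SHA1.fullmatch(commit) is not None


# Commit fields copied from the attached list of missing Git sources.
examples = [
    "210ba2b3b310b8b7a6ee4a4e35e50f7fa379643f",  # full commit ID
    "v0.4.1-37-g5f91ee6",                        # `git describe` style
    "OTP-24.0.2",                                # tag name
]

for commit in examples:
    print(commit, is_easy_reference(commit))
```

Under this assumption, tags and 'git describe' strings would need an
extra lookup step before they could be matched to an archived commit.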

Between improving the fetching code and adding 'tar' and 'text'
processing, I’ve computed another 2.5K SWHIDs.  We now have SWHIDs for
86% of our fixed-output derivations.  There are only 51 “unknown”
non-recursive Git sources now (the list is attached).

But that’s not all!

The scripts now categorize failures, so we have a better idea of what’s
going on with the remaining 14% “unknown” sources:

  no-ref:        13
  disarchive:   863
  fetch:       1262
  bail:        3324
  -----------------
  total:       5462

The “bail” category is all the stuff my scripts don’t yet process, like
Mercurial repositories and bzip2 tarballs.

The “fetch” category is everything the scripts couldn’t track down.

The “disarchive” category is all the tarballs Disarchive failed to
process.  An interesting thing here is that most of them are from Cargo.
Long story short: older versions of Cargo used the “miniz”
implementation of DEFLATE (rewritten in Rust) to compress tarballs.
Disarchive doesn’t support this (yet...?).  There are 686
old-Cargo-produced tarballs in the “disarchive” category.

The “no-ref” category covers a few fixed-output derivations used in
bootstrapping that do not come from an origin record.  I will probably
just load them by hand eventually.
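
To make the relative weight of each category easier to see, here is a
small Python sketch that recomputes the total and the share of each
category from the numbers above.  It is only arithmetic on the
published counts, not part of the report scripts; the 'share' helper
is made up for the example.

```python
# Failure counts copied from the table in this message.
failures = {"no-ref": 13, "disarchive": 863, "fetch": 1262, "bail": 3324}

total = sum(failures.values())


def share(count: int, whole: int) -> float:
    """Percentage of WHOLE represented by COUNT, rounded to one decimal."""
    return round(100 * count / whole, 1)


for name, count in failures.items():
    print(f"{name:<12}{count:>5}  {share(count, total):>5}%")
print(f"{'total':<12}{total:>5}")

# 686 of the 863 Disarchive failures are old-Cargo tarballs.
cargo_share = share(686, failures["disarchive"])
print(f"old-Cargo share of disarchive failures: {cargo_share}%")
```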

(In the future I hope to put some of this in the report itself.)

One last thing to add is that the SWH folks were very quick to fix the
loading error, so the increase in missing sources for recent commits is
now gone.


-- Tim


[-- Attachment #2: git-missing.txt --]
[-- Type: text/plain, Size: 7666 bytes --]

00000nij3ray7nssvq0lzb352wmnab8ffzk7dgff2c68mvjbh1l6
(git-reference
 (url "https://github.com/fdik/libetpan")
 (commit "210ba2b3b310b8b7a6ee4a4e35e50f7fa379643f"))

03ym14g9qhjqmryr5z065kynqm8yhmvnbs2djl6vp3i9cmqln8cl
(git-reference
 (url "https://git.savannah.gnu.org/git/emacsy.git")
 (commit "v0.4.1-37-g5f91ee6"))

04c2vqxg31mk15cfrhzrivykis8fmf0m1d8h1qdjdmlfxd4qwaqf
(git-reference
 (url "https://github.com/hboetes/mg")
 (commit "20210609"))

06plnhi1489wqsag5wgm16hb1xd1a8nbnb9gw7635d3fidxyb0wp
(git-reference
 (url "https://github.com/erlang/otp")
 (commit "OTP-24.0.2"))

06w86xk7sjl2x2h3z6msn8kpmwj05qdimcym77wzhz5s94dzh1bl
(git-reference
 (url "https://github.com/hboetes/mg")
 (commit "20180408"))

07xansmhn4l0b9ghzf56vyx8cqg0q01aq3pz5ikx2i19v5f0rc66
(git-reference
 (url "https://github.com/tgvaughan/elpher")
 (commit "v1.4.6"))

08yca0a0prrnrc7ir7ajd56yxvxpcs4m1k8f5kf273f5whgr7wzw
(git-reference
 (url "https://github.com/ProtonVPN/linux-cli")
 (commit "v2.2.4"))

0a066f56hnb9znbwnv1blm31j0ysv05n4wzlkli0zgw087c9047x
(git-reference
 (url "https://source.atlas.engineer/public/next")
 (commit "1.2.0"))

0anmprm63a88kii251rl296v1g4iq62r6n4nssx5jbc0hzkknanz
(git-reference
 (url "https://git.savannah.gnu.org/git/nomad.git/")
 (commit "0.2.0-alpha-100-g6a565d3"))

0d269474kk1933c55hx4azw3sak5ycfrxkw6ida0sb2cm00kfich
(git-reference
 (url "https://gitlab.savoirfairelinux.com/sflphone/libiax2.git")
 (commit "0e5980f1d78ce462e2d1ed6bc39ff35c8341f201"))

0f8zr2jxr0v4zcd98zqx99zxdn768vjpzwxsbsd6ss3if405sq2a
(git-reference
 (url "https://github.com/erlang/otp")
 (commit "OTP-24.0.5"))

0fy4qsss3i3pkq1rpgjds4aipbwlh1dr9hbbf7jn2a1c63kfks0r
(git-reference
 (url "https://git.zx2c4.com/wireguard-go/")
 (commit "v0.0.20200320"))

0fzhwbpyndwrmxip9zlcwkrr675l5pzwcygi45hv7w1hn39w0hxp
(git-reference
 (url "https://github.com/emacsattic/relative-buffers")
 (commit "9762fe268e9ff150dcec2e2e45d862d82d5c4008"))

0gv83i5ybj1z3ykbbldjzf7dbfjszp84c0yzrpshj611b9wp0176
(git-reference
 (url "https://github.com/erlang/otp.git")
 (commit "OTP-21.0.5"))

0ibq30xrf871pkpasi8p9krn0pmd86rsdzb3jqvz3wnp4wa3hl9d
(git-reference
 (url "https://source.atlas.engineer/public/next")
 (commit "1.3.0"))

0ifnaclsz7w08mc485i3j1kkcpd1m8q5qamckrfwc375ac13xf4g
(git-reference
 (url "git://anongit.kde.org/kdenlive.git")
 (commit "v18.08.1"))

0p0ha6prp7pyadp61clbhc6b55023vxzfwy14j2qygb2mkq7fhic
(git-reference
 (url "https://git.savannah.gnu.org/git/nomad.git/")
 (commit "0.2.0-alpha-199-g3e7a475"))

0rac70p6rpvdx9v0bdd8nphgr7imdxb7nz0x77n3p7h3180zz9x0
(git-reference
 (url "https://github.com/OpenVisualCloud/SVT-HEVC")
 (commit "v1.5.1"))

0rgiypb9ig8x4rl3hfzpy7kwnx1q3064nvlrv4fk0dnp84girn0v
(git-reference
 (url "https://github.com/scour-project/scour")
 (commit "v038.1"))

0w0dff3s7wv2d9m78a4jhckiik58q38wx6wpbba5hzbs4yxz35ck
(git-reference
 (url "https://github.com/parallaxinc/propgcc")
 (commit "4c46ecbe79ffbecd2ce918497ace5b956736b5a3"))
(git-reference
 (url "https://github.com/parallaxinc/propgcc.git")
 (commit "4c46ecbe79ffbecd2ce918497ace5b956736b5a3"))

0xxd9nhqiclpkdd9crymvba37fl0xs5mikwhya68nfzcgar7w480
(git-reference
 (url "https://github.com/libretro/RetroArch.git")
 (commit "v1.7.8"))

0y7v9ikrmy5dbjlpbpacp08gy838i8z54m8m4ps7ldk1j6kyia3n
(git-reference
 (url "https://github.com/ProtonVPN/linux-cli")
 (commit "v2.2.6"))

0zw5wra9hc717srmcar1wm4i34kyj8c49ny4bb7y3nrvkjp2pdb5
(git-reference
 (url "https://github.com/aureliojargas/clitest")
 (commit "v0.3.0"))

106v177y3nrjv2l1yskch4phpqd8h97b67zj0jiq9pc3c69jr1ay
(git-reference
 (url "https://github.com/powertab/rtmidi.git")
 (commit "2.1.0"))

12bs2rfmmy021087i10vxibdbbvd5vld0vk3h5hymhpz7rgszcmg
(git-reference
 (url "https://github.com/sekrit-twc/zimg.git")
 (commit "release-2.9.3"))

13i7dczbqwhws08zzrdraki1zkqv0qkbgx9c1r8vmg5qr9f7hfzg
(git-reference
 (url "https://github.com/KittyKatt/screenFetch")
 (commit "v3.9.0"))

14knnvfaskfz97vs3lfqrdpcbcx22s6qp16213wdnvnsf4c1lx1b
(git-reference
 (url "https://chromium.googlesource.com/webm/libvpx")
 (commit "v1.9.0-88-g12059d956"))

14vrm8lvwksf697sqks7xfd1xaqjlqjc9afjk33sksq5p27wr203
(git-reference
 (url "https://github.com/hboetes/mg")
 (commit "20180927"))

15i5ixpryfrbf3vrrb5rici8fb585f25k0v1ljds16bp1f1msr4q
(git-reference
 (url "https://git.code.sf.net/p/itk-snap/src")
 (commit "v3.8.0"))

1827jljs8mps489fm7xw63cakdqwc5grilrr5n9spr2rlk76jpx3
(git-reference
 (url "https://github.com/dom4j/dom4j.git")
 (commit "version-2.1.0"))

189jsj87hycs57a54x0b9lifwvhr63nypb9vfxdrq7rwrpcvi5f8
(git-reference
 (url "https://inqlab.net/git/guile-sodium.git")
 (commit "v0.1.0"))

19s6dbn47xy30dwfdd2p8fxz6z63rp5h7sm0barb69r7mvgnqvc1
(git-reference
 (url "https://source.atlas.engineer/public/next")
 (commit "1.2.1"))

1a87mka2sfzhbch2jip6wlvvs0glxq9lqwmyrp359d1rmwwmqiw9
(git-reference
 (url "https://github.com/markfasheh/duperemove")
 (commit "v0.11.2"))

1bif1k738knhifxhkn0d2x1m521zkx40pri44vyjqncp9r95hkbk
(git-reference
 (url "https://source.atlas.engineer/public/next")
 (commit "1.2.2"))

1ccy7qz1wcmggqlf3hwigbqq4wrx1amds4x9bxz9py6bypglyjc5
(git-reference
 (url "https://framagit.org/contrapunctus/chronometrist.git")
 (commit "v0.4.2"))

1cljkkyi9dxqpqhx8y6l2ja4zjmlya26m26kqxml8gx08vyvddhx
(git-reference
 (url "git://git.tuxfamily.org/gitroot/non/non.git")
 (commit "5ae43bb27c42387052a73e5ffc5d33efb9d946a9"))

1cs1i1hxwrv0a512j54yrvfh743nci1chx6qjgp4jyzq98ncvxgg
(git-reference
 (url "https://git.savannah.gnu.org/git/emacsy.git")
 (commit "v0.4.1-31-g415d96f"))

1dj37vk712dx76y25g13na24wbpn7a5ddmlpf4n51gm10sib54wj
(git-reference
 (url "https://github.com/erlang/otp")
 (commit "OTP-21.3.8.13"))
(git-reference
 (url "https://github.com/erlang/otp.git")
 (commit "OTP-21.3.8.13"))

1f1gapvs9j89qr474103dqgsiyb96phlnsmq5hiv4ba242blg9lb
(git-reference
 (url "https://git.umaneti.net/flycheck-grammalecte/")
 (commit "v1.3"))

1gw27lqc3f525n8qdcmr2nyn16y9g10z9f6dnmckyyxcdzvhq35n
(git-reference
 (url "https://github.com/JohnCremona/eclib/")
 (commit "v20190909"))

1hnh2mnmw179gr094r561w6cw1haid0lpvpqvkc24wpj82vphzpa
(git-reference
 (url "http://dr-qubit.org/git/undo-tree.git")
 (commit "release/0.6.6"))

1ijglmwkdy1l87gj429qfjis0v8b1zlxhbyfhx5za8664h68nqka
(git-reference
 (url "https://inqlab.net/git/eris.git")
 (commit "v0.2.0"))

1ks1fa0027s3xp0z6qp0dxmayvrb4dwwscfhbx7da0khp153f2cp
(git-reference
 (url "https://github.com/trezor/trezord-go")
 (commit "v2.0.29"))
(git-reference
 (url "https://github.com/trezor/trezord-go.git")
 (commit "v2.0.29"))

1ljjqzghcap4admv0hvw6asm148b80mfgjgxjjcw6qc95fkjjjlr
(git-reference
 (url "https://framagit.org/contrapunctus/chronometrist.git")
 (commit "v0.4.3"))

1lv1nckvzyhpn8cs6m40f2np15b3a8071kh7sy1216q2345s2ckc
(git-reference
 (url "https://notabug.org/cage/tinmop")
 (commit "v0.8.1"))

1nr0149y2nvrxj56gc12jqnfl01g6z9ypfsgl6pfg85cw73hnggk
(git-reference
 (url "http://dr-qubit.org/git/undo-tree.git")
 (commit "release/0.7.1"))

1p3lw4bcm2dph3pf1h4i0d9pzrcfr83r0iadqanxkwbmm1bl11pm
(git-reference
 (url "https://github.com/erlang/otp")
 (commit "OTP-23.2.1"))

1pm4pwg2abd0j9cn5v3k2ksk9ig4vlwxmlw9rrglanziv9l967qp
(git-reference
 (url "https://github.com/kr/pretty.git")
 (commit "v0.2.0"))

1s29hz44rb5dwzq8d4i4bfg77dr0v3ywpvidpa6xzg7hnnv3mhi5
(git-reference
 (url "https://github.com/jurplel/qView.git")
 (commit "2.0"))

1sx31zp6q2qc6fz3r78rx34zp2x4blrqzxwbpww71vb6lp1clmdm
(git-reference
 (url "https://github.com/MaskRay/ccls")
 (commit "0.20190823.3"))

1xv3h6svw9aay5ixpql231md3pf00qxvhg62z88daraf18hlkfja
(git-reference
 (url "https://github.com/OpenCPN/OpenCPN")
 (commit "v5.0.0"))
(git-reference
 (url "https://github.com/OpenCPN/OpenCPN.git")
 (commit "v5.0.0"))


* Re: Preservation of Guix Report 2021-12-06
From: Ludovic Courtès @ 2021-12-07 16:41 UTC
  To: Timothy Sample; +Cc: guix-devel

Hello,

Timothy Sample <samplet@ngyro.com> writes:

>     <https://ngyro.com/pog-reports/2021-12-06/>
>
> Since the last report, I added many more reference categories and moved
> them to the database.  The new categories are 'hg', 'svn', 'cvs', 'bzr',
> 'tar-bz2', 'tar', 'zip', and 'text'.  Of these, only 'tar' and 'text'
> are being processed.  The rest are currently unsupported by my scripts.
> Moving the categories to the database allows me to make manual
> corrections when needed.  It also encouraged me to look through the
> references a bit more carefully to track down some of the weirder 'text'
> sources (like Bash patches) and fix up some other ones (in the style of
> “/tar_gz?download=yes”).

Good to see these additional details.

The SWH folks told me that plain files (like .el or .patch files) that
appear in ‘sources.json’ are currently not archived, but that this could
change.  So seeing that 86% of them are archived is good news.

> I also made the fetching code more tenacious.  Now it uses the
> content-addressed mirrors from Guix and Nix to find regular files, and
> will recover “easy” Git references from SWH (“easy” means the commit is
> specified).

I suppose the scripts could use ‘url-fetch’, or even build the
fixed-output tarballs, to benefit from Guix’s fallback methods.
(Apologies if I’m stating the obvious.)

> The “disarchive” category is all the tarballs Disarchive failed to
> process.  An interesting thing here is that most of them are from Cargo.
> Long story short: older versions of Cargo used the “miniz”
> implementation of DEFLATE (rewritten in Rust) to compress tarballs.
> Disarchive doesn’t support this (yet...?).  There are 686
> old-Cargo-produced tarballs in the “disarchive” category.

Ah, I don’t want to hear about Rust!  ;-)

> One last thing to add is that the SWH folks were very quick to fix the
> loading error, so the increase in missing sources for recent commits is
> now gone.

Awesome.

Thanks for the update!

Ludo’.



* Re: Preservation of Guix Report 2021-12-06
From: zimoun @ 2021-12-07 18:33 UTC
  To: Timothy Sample; +Cc: Guix Devel

Hi Timothy,


On Mon, 6 Dec 2021 at 21:02, Timothy Sample <samplet@ngyro.com> wrote:

>     <https://ngyro.com/pog-reports/2021-12-06/>

Thanks!  Really cool!


As mentioned in [1], the upstream of one source has disappeared:

    https://code.divoplade.fr/mkdir-p.git

and the package was removed by commit
0d2400ceca8d0a0358abaf4cd699e54ddad0e885 two months ago.  Fortunately,
we still have it in Berlin.

    guix time-machine --commit=3275c9e1f5 -- build -S guile-mkdir-p

However, it returns

/gnu/store/dz88dc1k8q7161f3j1m668hi8zna4qcx-guile-mkdir-p-1.0.1.tar.xz

So, it needs some work to push it to SWH and get the same checksum
01k20rjcv6p0spmw8ls776aar6bfw0jxw46d2n12w0cb2p79xjv8, IIUC.

1: <https://lists.gnu.org/archive/html/guix-devel/2021-12/msg00032.html>


We have to investigate these three:

    R "https://github.com/dalanicolai/djvu3";
    R "https://github.com/halostatue/minitar";
    R "https://github.com/stardiviner/org-contacts.el";

because the Web front-end of SWH says 'succeeded' when archiving; see,
for instance,

https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/dalanicolai/djvu3

but PoG asks for the commit 37b675be1d4d436cdd0c3b5d3f13e88b59a7bf18,
which is not there, if I read correctly.  Well, the only reference to
djvu3 in the Guix repo is commit
b61ee34c9d7b679cb772e6e8ff0c0876ccf087b1, which indeed refers to commit
37b675b.  However, if upstream is cloned:

    git clone https://github.com/dalanicolai/djvu3
    git -C djvu3 log --oneline | grep 37b

returns nothing, which probably means upstream rewrote its history in
place.  Arf!
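
Note that a plain grep for '37b' can also match commit subjects or the
middle of another hash.  Here is a minimal Python sketch of a stricter
check on the text produced by 'git log --oneline'.  The function name
and the sample log lines are made up for the illustration.  Keep in
mind that 'git log' without '--all' only lists commits reachable from
HEAD.

```python
def commit_in_log(oneline_log: str, full_commit: str) -> bool:
    """Return True if an abbreviated hash in the log is a prefix of FULL_COMMIT.

    ONELINE_LOG is the text printed by `git log --oneline`, where each
    line starts with an abbreviated commit hash followed by the subject.
    """
    for line in oneline_log.splitlines():
        fields = line.split(maxsplit=1)
        if not fields:
            continue  # skip blank lines
        if full_commit.startswith(fields[0]):
            return True
    return False


# Hypothetical log output; these lines are not from the real repository.
sample_log = """\
a1b2c3d Add new rendering backend
937b123 Fix issue 37b in parser
"""

wanted = "37b675be1d4d436cdd0c3b5d3f13e88b59a7bf18"
print(commit_in_log(sample_log, wanted))
```

With this sample, grep would report the second line even though the
wanted commit is absent, whereas the prefix check does not.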

Berlin also has it.  However, again it requires some work to push to
SWH with the expected commit.


I have not investigated the other two.  Similar cases, I guess.


Last, maybe I am missing something, but I do not see 'python-scikit-learn'
or 'sway' listed, even though bug#51726 [2] is still pending.  Somehow, I
have the same question I tried to ask there [3].


2: <https://issues.guix.gnu.org/51726>
3: <https://lists.gnu.org/archive/html/guix-devel/2021-12/msg00064.html>


Cheers,
simon

