* Preservation of Guix Report 2021-11-30
@ 2021-12-01 18:48 Timothy Sample
  2021-12-02 14:09 ` Timothy Sample
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Timothy Sample @ 2021-12-01 18:48 UTC (permalink / raw)
  To: guix-devel

Hi Guix!

Here’s a new version of the Preservation of Guix Report:

    <https://ngyro.com/pog-reports/2021-11-30/>

I actually made one a month ago but my message about it never made it to
the list somehow.  The most important part of that message was to
highlight how well we are doing for Git sources.

Here’s what I wrote:

>  This version has a breakdown by different origin types.  The good
>  news is that Git origins are doing very well.  We’ve confirmed that
>  97.2% of the 9,272 Git origins that we’re tracking are in the SWH
>  archive.  Most of the progress there is due to zimoun wading through
>  the missing packages and telling SWH to store them – thanks, zimoun!

That’s still basically true this month, but we have a few more missing
Git sources.  Actually, we are starting to lose sources!  If you look at
the graph of commits, you can see a sharp increase in missing sources
for recent commits.  It looks like a problem on the SWH side.  Visiting
[1] and selecting “Show all visits”, you can see that the nixguix loader
has been having trouble loading our “sources.json” recently.

[1] <https://archive.softwareheritage.org/browse/origin/visits/?origin_url=https://guix.gnu.org/sources.json>

I will try and get in touch with SWH about this.  While it’s troubling,
it certainly is a good confirmation that doing some basic monitoring is
important!

That’s the bad news.  The good news is I’ve added support for XZ to
Disarchive (to be officially released in Disarchive soon).  That means
that we have information about 4K more sources.  We now know the status
of 80% of our sources.  Unfortunately, 40% of the XZ sources are
missing!  Most of them are old, as can be seen in this (secret) graph:

    <https://ngyro.com/pog-reports/2021-11-30/tar-xz-rel-hist.svg>

(The filename format is “{tar-gz,tar-xz,git}-{rel,abs}-hist.svg” if you
want to see all the secret graphs.)
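
For convenience, here is a minimal sketch that expands the filename pattern above into the six graph URLs. It assumes every combination in the pattern exists for this report; the helper name `pog_graph_urls` is made up for illustration and is not part of any PoG tooling.

```shell
# Print the URL of every per-type history graph of one PoG report.
# The type and scale names come from the filename pattern quoted above;
# `pog_graph_urls` is a hypothetical helper name.
pog_graph_urls() {
    base=$1    # report URL without a trailing slash
    for kind in tar-gz tar-xz git; do
        for scale in rel abs; do
            printf '%s/%s-%s-hist.svg\n' "$base" "$kind" "$scale"
        done
    done
}

pog_graph_urls https://ngyro.com/pog-reports/2021-11-30
```

The nested loops are plain POSIX shell, so the sketch does not depend on Bash brace expansion.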

Lastly, if you scroll to the bottom of the report and select “View
Schema”, I’ve added some example queries that generate lists of
interesting sources.  For example – if you’re so inclined – you could
look at the 128 “unknown”, non-recursive Git sources that we should know
about and figure out what’s going on.  ;)


-- Tim



* Re: Preservation of Guix Report 2021-11-30
  2021-12-01 18:48 Preservation of Guix Report 2021-11-30 Timothy Sample
@ 2021-12-02 14:09 ` Timothy Sample
  2021-12-02 17:15 ` zimoun
  2021-12-06 13:09 ` Ludovic Courtès
  2 siblings, 0 replies; 8+ messages in thread
From: Timothy Sample @ 2021-12-02 14:09 UTC (permalink / raw)
  To: guix-devel

Timothy Sample <samplet@ngyro.com> writes:

> [W]e are starting to lose sources!  If you look at the graph of
> commits, you can see a sharp increase in missing sources for recent
> commits.  It looks like a problem on the SWH side.  Visiting [1] and
> selecting “Show all visits”, you can see that the nixguix loader has
> been having trouble loading our “sources.json” recently.
>
> [1] <https://archive.softwareheritage.org/browse/origin/visits/?origin_url=https://guix.gnu.org/sources.json>
>
> I will try and get in touch with SWH about this.

The good folks over there have opened a ticket:

    <https://forge.softwareheritage.org/T3763>


-- Tim



* Re: Preservation of Guix Report 2021-11-30
  2021-12-01 18:48 Preservation of Guix Report 2021-11-30 Timothy Sample
  2021-12-02 14:09 ` Timothy Sample
@ 2021-12-02 17:15 ` zimoun
  2021-12-03 10:02   ` Ricardo Wurmus
  2021-12-06 13:10   ` Ludovic Courtès
  2021-12-06 13:09 ` Ludovic Courtès
  2 siblings, 2 replies; 8+ messages in thread
From: zimoun @ 2021-12-02 17:15 UTC (permalink / raw)
  To: Timothy Sample; +Cc: Guix Devel

Hi,

On Wed, 1 Dec 2021 at 19:52, Timothy Sample <samplet@ngyro.com> wrote:

> Here’s a new version of the Preservation of Guix Report:
>
>     <https://ngyro.com/pog-reports/2021-11-30/>

Cool!

> That’s still basically true this month, but we have a few more missing
> Git sources.

Well, we need more tools to explore the data (do not hold your breath
:-)).  Using the database, some Emacs macros, a quick Guix script and
a patient review of SWH, the results for missing vs unknown Git
sources (type:git) look more or less like this:

 1. Guix is not able to find the source, but SWH schedules and processes it with success
 2. SWH processes the source but fails

Some sources are lost upstream (and the package was removed from Guix a
long time ago).  For instance, mumimu was removed 2 years ago.  For now, I
count 3.  It should be easily fixable, IMHO.  I will have a look later.

The '?' means that SWH is still processing.  Well, I bet the ingestion
will fail.

On the Guix side, the packages to tackle are the ones marked 'R'.  It
means the way Guix uses the API does not get the correct information.
For instance, djvu3 is saved by SWH [1], but "guix lint -c archival
emacs-djvu3" always schedules it for archiving.  This list is not
complete because I found more, as pointed out here [2], and Ludo found
at least one thing incorrect [3].

What surprises me is that PoG does not return 'python-scikit-learn' when:

--8<---------------cut here---------------start------------->8---
$ guix lint -c archival python-scikit-learn
gnu/packages/machine-learning.scm:946:5: python-scikit-learn@0.24.2:
scheduled Software Heritage archival
--8<---------------cut here---------------end--------------->8---

Anyway. :-)

Last, the number 35 (missing) or 240 (unknown) is a bit inflated
because many versions / tags / commits refer to one URL, and thus if a
failure happens on this URL, all of them are reported as failing.
Therefore, if we fix the list below (which is much less than 35+340 ;-)),
we should be almost done for 100% coverage of Git sources.  It's some
work on the plate. ;-)
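
To illustrate the inflation, here is a rough sketch that compares the number of (URL, tag) rows with the number of distinct repositories behind them. The input format, the helper name `count_distinct_urls`, and the tags are assumptions made up for the example; only the URLs come from the lists in this message, and this is not how PoG itself counts.

```shell
# Read "URL TAG" pairs on stdin and print how many distinct repositories
# they refer to.  A trailing "/" or ".git" is dropped so that both
# spellings of one repository collapse into one entry.
# (`count_distinct_urls` is a hypothetical helper name.)
count_distinct_urls() {
    cut -d' ' -f1 | sed -e 's,/*$,,' -e 's,\.git$,,' | sort -u | grep -c .
}

# URLs taken from the lists in this message; the tags are placeholders.
sample='https://github.com/halostatue/minitar.git tag-a
https://github.com/halostatue/minitar tag-b
https://github.com/adobe-fonts/source-han-sans tag-a
https://github.com/adobe-fonts/source-han-sans.git tag-b'

printf 'rows: %s\n' "$(printf '%s\n' "$sample" | grep -c .)"
printf 'distinct URLs: %s\n' "$(printf '%s\n' "$sample" | count_distinct_urls)"
```

The normalization is deliberately crude: it does not merge "git://" and "https://" spellings of the same host, which the lists show also occur.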


Cheers,
simon


1: <https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/dalanicolai/djvu3>
2: <https://lists.gnu.org/archive/html/guix-devel/2021-10/msg00250.html>
3: <https://lists.gnu.org/archive/html/guix-devel/2021-10/msg00291.html>


F: SWH failure
R: SWH says succeed but Guix does not find it
H: hidden package, not checked
>: Duplicate
I: Duplicate but in
d: Disappeared

Missing
=======

d "https://code.divoplade.fr/mkdir-p.git"

R "https://github.com/dalanicolai/djvu3"
> "https://github.com/halostatue/minitar.git"
R "https://github.com/halostatue/minitar"
R "https://github.com/stardiviner/org-contacts.el"

F "https://github.com/adobe-fonts/source-han-sans"
> "https://github.com/adobe-fonts/source-han-sans.git"
F "git://pumpa.branchable.com/"
F "https://git.mfiano.net/mfiano/pngload.git"
F "https://gitlab.com/sequoia-pgp/sequoia.git"

H "https://github.com/desktop-app/tg_owt.git"


Unknown
=======

d "https://source.atlas.engineer/public/next"
d "https://git.elephly.net/software/mumimu.git"

R "https://chromium.googlesource.com/webm/libvpx"
R "https://framagit.org/contrapunctus/chronometrist.git"
R "https://git.savannah.gnu.org/git/emacsy.git"
R "https://git.savannah.gnu.org/git/hurd/incubator.git"
R "https://git.savannah.gnu.org/git/nomad.git/"
R "https://git.sr.ht/~pkal/autocrypt"
R "https://git.sr.ht/~zge/bang"
R "https://git.zx2c4.com/wireguard-go/"
R "https://github.com/MaskRay/ccls"
R "https://github.com/OpenCPN/OpenCPN"
R "https://github.com/ProtonVPN/linux-cli"
R "https://github.com/aureliojargas/clitest"
R "https://github.com/boostorg/signals2.git"
R "https://github.com/cdown/clipmenu.git"
R "https://github.com/cdown/clipnotify.git"
R "https://github.com/emacsattic/relative-buffers"
R "https://github.com/emacsmirror/cl-print"
R "https://github.com/jurplel/qView.git"
R "https://github.com/kr/pretty.git"
R "https://github.com/parallaxinc/propgcc"
R "https://github.com/sekrit-twc/zimg.git"
R "https://github.com/tlaplus/tlaplus"
R "https://inqlab.net/git/eris.git"
R "https://inqlab.net/git/guile-sodium.git"

F "https://git.joeyh.name/filters"
I "git://git.joeyh.name/filters"

F "git://git.tuxfamily.org/gitroot/non/non.git"
F "https://anonscm.debian.org/cgit/users/kaction-guest/retired/dev.guile-bash.git"
F "https://bitbucket.org/eeeickythump/cl-abstract-classes"
F "https://framagit.org/a-guile-mind/guile-wiredtiger.git"
F "https://github.com/LLNL/hypre.git"
F "https://github.com/PacificBiosciences/cDNA_primer"
F "https://github.com/atomnuker/wlstream"
F "https://github.com/biod/undeaD.git"
F "https://github.com/fdik/libetpan"
F "https://github.com/jujudusud/caps-lv2"
F "https://github.com/mattn/runewidth"
F "https://github.com/powertab/rtmidi.git"
F "https://github.com/proofit404/edbi-sqlite"
F "https://github.com/syohex/git-gutter-fringe"
F "https://github.com/tgvaughan/elpher"
F "https://gitlab.com/kavalogic-inc/inspekt3d.git"
F "https://gitlab.savoirfairelinux.com/sflphone/libiax2.git"
F "https://go.googlesource.com/x/mod"


? "git://anongit.kde.org/kdenlive.git"
? "git://dthompson.us/guile-websocket.git"
? "http://dr-qubit.org/git/undo-tree.git"
? "http://git.fuzzle.org/mloop"
? "http://www.foldr.org/~michaelw/projects/redshank.git"
? "https://framagit.org/contrapunctus/chronometrist.git"
? "https://git.gnome.org/browse/byzanz"
? "https://git.savannah.gnu.org/git/emacs.git/"
? "https://git.zapb.de/libjaylink.git"
? "https://github.com/emacsorphanage/git-gutter-fringe"
? "https://github.com/erlang/otp"
? "https://github.com/golang/mod"
? "https://github.com/libretro/RetroArch.git"
? "https://github.com/parallaxinc/propgcc.git"
? "https://github.com/scour-project/scour"



* Re: Preservation of Guix Report 2021-11-30
  2021-12-02 17:15 ` zimoun
@ 2021-12-03 10:02   ` Ricardo Wurmus
  2021-12-03 13:22     ` zimoun
  2021-12-06 13:10   ` Ludovic Courtès
  1 sibling, 1 reply; 8+ messages in thread
From: Ricardo Wurmus @ 2021-12-03 10:02 UTC (permalink / raw)
  To: zimoun; +Cc: guix-devel


zimoun <zimon.toutoune@gmail.com> writes:

> F "git://pumpa.branchable.com/"

This has moved to http://source.pumpa.branchable.com/

-- 
Ricardo



* Re: Preservation of Guix Report 2021-11-30
  2021-12-03 10:02   ` Ricardo Wurmus
@ 2021-12-03 13:22     ` zimoun
  0 siblings, 0 replies; 8+ messages in thread
From: zimoun @ 2021-12-03 13:22 UTC (permalink / raw)
  To: Ricardo Wurmus; +Cc: guix-devel

Hi,

On Fri, 03 Dec 2021 at 10:02, Ricardo Wurmus <rekado@elephly.net> wrote:
> zimoun <zimon.toutoune@gmail.com> writes:
>
>> F "git://pumpa.branchable.com/"
>
> This has moved to http://source.pumpa.branchable.com/

Thanks.  While waiting for the easy fix (perfect first contribution for
Outreachy :-)) as specified by #52257 [1], let's save it:

--8<---------------cut here---------------start------------->8---
$ guix repl
GNU Guile 3.0.7
Copyright (C) 1995-2021 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guix-user)> ,use(guix swh)
scheme@(guix-user)> (save-origin "http://source.pumpa.branchable.com/" "git")
$1 = #<<save-reply> origin-url: "http://source.pumpa.branchable.com" origin-type: #<unspecified> request-date: #<date nanosecond: 648231 second: 45 minute: 12 hour: 13 day: 3 month: 12 year: 2021 zone-offset: 0> request-status: pending task-status: not-created>
--8<---------------cut here---------------end--------------->8---

And the result can be seen at
https://archive.softwareheritage.org/save/#requests.

Note that the SWH web interface returns «The origin url is not valid or
does not reference a code repository» if the submission is done through
it, but the request is accepted via the API.


1: <http://issues.guix.gnu.org/issue/52257>


Cheers,
simon

PS:
Note that:

--8<---------------cut here---------------start------------->8---
$ git clone git://pumpa.branchable.com/
Cloning into 'pumpa.branchable.com'...
remote: Enumerating objects: 5819, done.
remote: Counting objects: 100% (5819/5819), done.
remote: Compressing objects: 100% (1495/1495), done.
remote: Total 5819 (delta 4275), reused 5819 (delta 4275)
Receiving objects: 100% (5819/5819), 1.51 MiB | 537.00 KiB/s, done.
Resolving deltas: 100% (4275/4275), done.
--8<---------------cut here---------------end--------------->8---



* Re: Preservation of Guix Report 2021-11-30
  2021-12-01 18:48 Preservation of Guix Report 2021-11-30 Timothy Sample
  2021-12-02 14:09 ` Timothy Sample
  2021-12-02 17:15 ` zimoun
@ 2021-12-06 13:09 ` Ludovic Courtès
  2 siblings, 0 replies; 8+ messages in thread
From: Ludovic Courtès @ 2021-12-06 13:09 UTC (permalink / raw)
  To: Timothy Sample; +Cc: guix-devel

Hi!

Timothy Sample <samplet@ngyro.com> skribis:

> Here’s a new version of the Preservation of Guix Report:
>
>     <https://ngyro.com/pog-reports/2021-11-30/>

The PoG reports are all 404 right now.

> Here’s what I wrote:
>
>>  This version has a breakdown by different origin types.  The good
>>  news is that Git origins are doing very well.  We’ve confirmed that
>>  97.2% of the 9,272 Git origins that we’re tracking are in the SWH
>>  archive.  Most of the progress there is due to zimoun wading through
>>  the missing packages and telling SWH to store them – thanks, zimoun!

Yay!

> That’s still basically true this month, but we have a few more missing
> Git sources.  Actually, we are starting to lose sources!  If you look at
> the graph of commits, you can see a sharp increase in missing sources
> for recent commits.  It looks like a problem on the SWH side.  Visiting
> [1] and selecting “Show all visits”, you can see that the nixguix loader
> has been having trouble loading our “sources.json” recently.
>
> [1] <https://archive.softwareheritage.org/browse/origin/visits/?origin_url=https://guix.gnu.org/sources.json>
>
> I will try and get in touch with SWH about this.  While it’s troubling,
> it certainly is a good confirmation that doing some basic monitoring is
> important!

Yup.

> That’s the bad news.  The good news is I’ve added support for XZ to
> Disarchive (to be officially released in Disarchive soon).  That means
> that we have information about 4K more sources.  We now know the status
> of 80% of our sources.  Unfortunately, 40% of the XZ sources are
> missing!  Most of them are old, as can be seen in this (secret) graph:
>
>     <https://ngyro.com/pog-reports/2021-11-30/tar-xz-rel-hist.svg>
>
> (The filename format is “{tar-gz,tar-xz,git}-{rel,abs}-hist.svg” if you
> want to see all the secret graphs.)

Can’t wait to see the secret graphs.  :-)

Thanks!

Ludo’.



* Re: Preservation of Guix Report 2021-11-30
  2021-12-02 17:15 ` zimoun
  2021-12-03 10:02   ` Ricardo Wurmus
@ 2021-12-06 13:10   ` Ludovic Courtès
  2021-12-06 14:00     ` zimoun
  1 sibling, 1 reply; 8+ messages in thread
From: Ludovic Courtès @ 2021-12-06 13:10 UTC (permalink / raw)
  To: zimoun; +Cc: Guix Devel

zimoun <zimon.toutoune@gmail.com> skribis:

> Where I am surprised is that PoG does not return 'python-scikit-learn' when:
>
> $ guix lint -c archival python-scikit-learn
> gnu/packages/machine-learning.scm:946:5: python-scikit-learn@0.24.2:
> scheduled Software Heritage archival

This is most likely due to <https://issues.guix.gnu.org/51726>.

Ludo’.



* Re: Preservation of Guix Report 2021-11-30
  2021-12-06 13:10   ` Ludovic Courtès
@ 2021-12-06 14:00     ` zimoun
  0 siblings, 0 replies; 8+ messages in thread
From: zimoun @ 2021-12-06 14:00 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: Guix Devel

Hi,

On Mon, 06 Dec 2021 at 14:10, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
>
>> Where I am surprised is that PoG does not return 'python-scikit-learn' when:
>>
>> $ guix lint -c archival python-scikit-learn
>> gnu/packages/machine-learning.scm:946:5: python-scikit-learn@0.24.2:
>> scheduled Software Heritage archival
>
> This is most likely due to <https://issues.guix.gnu.org/51726>.

Yes, that’s the explanation why “guix lint” is always scheduling it. :-)

The very same mechanism is used for the fallback.  Therefore, Guix cannot
use SWH as a fallback for ’python-scikit-learn’; for now. ;-)

PoG uses the same API endpoint*, IIUC.  Somehow, «PoG code finds ’foo’»
has to match «Guix code can use SWH as a fallback for ’foo’».  Taking
’python-scikit-learn’ as an example, the question is: is that the case?

In other words, does the coverage computed by the PoG code correctly
represent what the Guix code can reach?  Looking for corner cases; the
devil is in the details. ;-)


*API: instead of one query per request, PoG sends 1000 queries per request.
Because SWH limits to 100 requests per hour, PoG checks for all packages
when Guix requires… bah! :-)
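
To put the figures from the footnote above in perspective, here is a back-of-the-envelope sketch. It takes the 1000 queries per request and 100 requests per hour at face value and uses the 9,272 Git origins quoted earlier in the thread as an example workload; the helper names are made up, and the real PoG and Guix request patterns may differ.

```shell
# Ceiling of $1 / $2 using integer arithmetic.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Hourly quota windows needed to look up $1 sources when each request
# carries $2 queries and at most $3 requests are allowed per hour.
# (`hours_needed` is a hypothetical helper name.)
hours_needed() {
    requests=$(ceil_div "$1" "$2")
    ceil_div "$requests" "$3"
}

sources=9272   # number of Git origins quoted earlier in the thread
echo "one query per request:    $(hours_needed "$sources" 1 100) h"
echo "1000 queries per request: $(hours_needed "$sources" 1000 100) h"
```

Under these assumptions, batching turns a check that would span several days of quota into one that fits within a single hour.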


Cheers,
simon

