unofficial mirror of guix-devel@gnu.org 
* SWH and lookup (bug?)
From: zimoun @ 2021-10-26 15:35 UTC
  To: Guix Devel

Hi,

I do not know if it is a bug, but I noticed a weird behaviour.
Between 26/10/2021, 16:53:13 and 26/10/2021, 16:59:18, I scheduled the
archival of “missing” packages.  The status reports “succeeded” for them
(browse the pages at [1]), so I am sure they are in. :-)

Now, if I run “guix lint -c archival” on these same packages, it
schedules them again.  So something is not working as expected, I
guess.


For instance, take the package ’sway’ (enter
https://github.com/swaywm/sway in the search bar at [1]).  SWH says the
archival request succeeded, and the repository is indeed archived, for
instance there [2].

Then, if you look at the visits page, it says that the repository has
been visited several times in 2021 [3].

1: <https://archive.softwareheritage.org/save/#requests>
2: <https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/swaywm/sway&release=1.5.1>
3: <https://archive.softwareheritage.org/browse/origin/visits/?origin_url=https://github.com/swaywm/sway>


However, I get this:

--8<---------------cut here---------------start------------->8---
$ guix lint -c archival sway
gnu/packages/wm.scm:1527:5: sway@1.5.1: scheduled Software Heritage archival
--8<---------------cut here---------------end--------------->8---

Sway was updated by commit 7fc378aa7bc1fff5d87ed993205a1e825a7872d9
(CommitDate: Fri Nov 20 00:10:11 2020 +0100), and git-blame says that
all the commits modifying the ’source’ field are older.

On top of that, the Preservation of Guix database pog.db [4] does not
list it as missing, IIUC.

4: <https://ngyro.com/pog-reports/2021-10-22/pog.db>

However, if I understand correctly, it uses a different entry point from
the one defined by (guix swh).  Right?


Here is a short list of packages showing similar behaviour, IIRC.

--8<---------------cut here---------------start------------->8---
open-zwave
ao
slop
wxwidgets
wxwidgets-gtk2
wlroots
sway
monolith
tidy-html
wabt
rss-bridge
libsass
glslang
spirv-tools
protonvpn-cli
remmina
vlang
ganeti-instance-debootstrap
skopeo
vim-full
xxd
vim
neovim
--8<---------------cut here---------------end--------------->8---

Basically, the ’swh-download’ path somehow relies on ’lookup-revision’
or ’lookup-origin-revision’, which are the same procedures that decide
whether a save is triggered (check-archival).


Cheers,
simon



* Re: SWH and lookup (bug?)
From: Ludovic Courtès @ 2021-10-29 14:57 UTC
  To: zimoun; +Cc: Guix Devel

zimoun <zimon.toutoune@gmail.com> skribis:

> For instance, try with the package ’sway’ (search bar [1]:
> https://github.com/swaywm/sway).  SWH says the status for archiving
> succeeded.  Even, it is archived for instance there [2].
>
> Then if you give a look at the visit webpage, it says that the
> repository had been visited several times on 2021 [3].
>
> 1: <https://archive.softwareheritage.org/save/#requests>
> 2: <https://archive.softwareheritage.org/browse/origin/directory/?origin_url=https://github.com/swaywm/sway&release=1.5.1>
> 3: <https://archive.softwareheritage.org/browse/origin/visits/?origin_url=https://github.com/swaywm/sway>
>
>
> However, I get this:
>
> $ guix lint -c archival sway                    
> gnu/packages/wm.scm:1527:5: sway@1.5.1: scheduled Software Heritage archival

Indeed, I’m getting that as well.

Right now Sway’s origin refers to the “1.5.1” tag.

I found the problem:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(guix swh)
scheme@(guile-user)> (lookup-origin-revision "https://github.com/swaywm/sway" "1.5.1")
$2 = #f
scheme@(guile-user)> (lookup-origin "https://github.com/swaywm/sway")
$3 = #<<origin> visits-url: "https://archive.softwareheritage.org/api/1/origin/https://github.com/swaywm/sway/visits/" type: #<unspecified> url: "https://github.com/swaywm/sway">
scheme@(guile-user)> (car (origin-visits $3))
$4 = #<<visit> date: #<date nanosecond: 490956 second: 6 minute: 45 hour: 14 day: 29 month: 10 year: 2021 zone-offset: 0> origin: "https://github.com/swaywm/sway" url: "https://archive.softwareheritage.org/api/1/origin/https://github.com/swaywm/sway/visit/41/" snapshot-url: "https://archive.softwareheritage.org/api/1/snapshot/10ba0257e3290ce4504c2413f32b9358d72975d6/" status: full number: 41>
scheme@(guile-user)> (define s (visit-snapshot $4))
scheme@(guile-user)> ,pp (map branch-name (snapshot-branches s))
*** output flushed ***
scheme@(guile-user)> (length (snapshot-branches s))
$6 = 1000
scheme@(guile-user)> (filter (lambda (b)
			       (string-prefix? "refs/tags" (branch-name b)))
			     (snapshot-branches s))
$7 = ()
scheme@(guile-user)> ,use(srfi srfi-1)
scheme@(guile-user)> ,pp (take (snapshot-branches s) 10)
$8 = (#<<branch> name: "refs/pull/2715/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/2f258eff6fd2c89a94caa658c1ea22beb76d728a/">
 #<<branch> name: "refs/pull/2713/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/4e4898e90f4d9b721091137a744deac335e73f12/">
 #<<branch> name: "refs/pull/2712/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/d129108cddc485299443d0b98c3bdf3f9839aa1c/">
 #<<branch> name: "refs/pull/271/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/20cb390323b19dc0c767ba63925def7f51c31044/">
 #<<branch> name: "refs/pull/2709/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/426c33f4dc2515867a0d3b04cb865d5cad091d10/">
 #<<branch> name: "refs/pull/2708/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/b1a0e95e8e6ecf66542cc62e6109949de59afb5e/">
 #<<branch> name: "refs/pull/2704/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/6194a445d3b10e8afc968712faccdd1d127a8beb/">
 #<<branch> name: "refs/pull/2703/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/f16529e2588f5e71d6777f4c06dfb58b29308cd0/">
 #<<branch> name: "refs/pull/2701/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/baeb28ea6230ef9aa409ee52abe208720120e45c/">
 #<<branch> name: "refs/pull/270/head" target-type: revision target-url: "https://archive.softwareheritage.org/api/1/revision/a32cbb52ce81ee38d2928ba873ff7fc182df8393/">)
--8<---------------cut here---------------end--------------->8---

This snapshot has more than 1,000 branches, mostly ‘refs/pull’ branches.
But by default, the endpoint used by ‘visit-snapshot’ only returns the
first 1,000 branches, and then it’s up to the caller to use the
pagination mechanism.

It’s not implemented though!  It turns out the ‘refs/tags’ “branches”
were not among the first thousand branches, so the code incorrectly
thinks that the tag is missing.

The solution is to implement pagination (yuk!), or to use an endpoint to
look up a branch by name instead of using ‘snapshot-branches’ (is there
such an endpoint?).
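
The pagination option can be sketched as follows.  This is a minimal,
self-contained illustration and not the (guix swh) code: ‘fetch-page’,
‘all-snapshot-branches’ and ‘tags’ are hypothetical names, and the
network request is faked with an in-memory list so that the loop can be
run in a plain Guile REPL.  The assumption is that each reply carries one
page of branches plus an indication of where the next page starts, or #f
on the last page.

```scheme
(use-modules (srfi srfi-1))             ;take, drop, concatenate

(define page-size 1000)

;; Fake data (assumption): 2,500 pull-request refs followed by two tags,
;; standing in for the branches of a large snapshot.
(define fake-branches
  (append (map (lambda (n) (format #f "refs/pull/~a/head" n))
               (iota 2500 1))
          '("refs/tags/1.5" "refs/tags/1.5.1")))

;; Hypothetical stand-in for one request to the snapshot endpoint.
;; Return a pair: the page of branch names starting at OFFSET, and the
;; offset of the next page, or #f when this is the last page.
(define (fetch-page offset)
  (let* ((remaining (drop fake-branches offset))
         (count     (min page-size (length remaining)))
         (next      (+ offset count)))
    (cons (take remaining count)
          (and (< next (length fake-branches)) next))))

;; Keep requesting pages until there is no next page, then return all
;; the branch names in order.
(define (all-snapshot-branches)
  (let loop ((offset 0)
             (pages  '()))
    (let* ((reply (fetch-page offset))
           (pages (cons (car reply) pages)))
      (if (cdr reply)
          (loop (cdr reply) pages)
          (concatenate (reverse pages))))))

;; Keep only the names under refs/tags.
(define (tags branches)
  (filter (lambda (name) (string-prefix? "refs/tags" name))
          branches))
```

With this fake data, (tags (car (fetch-page 0))) returns the empty list,
which mirrors the false negative above, whereas
(tags (all-snapshot-branches)) returns ("refs/tags/1.5" "refs/tags/1.5.1").
The cost is one request per thousand branches, which is why a lookup by
name would be preferable if such an endpoint exists.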

Thoughts?

Ludo’.



* Re: SWH and lookup (bug?)
From: zimoun @ 2021-11-02 11:23 UTC
  To: Ludovic Courtès; +Cc: Guix Devel

Hi Ludo,

On Fri, 29 Oct 2021 at 16:57, Ludovic Courtès <ludo@gnu.org> wrote:

> Right now Sway’s origin refers to the “1.5.1” tag.
>
> I found the problem:

Wow!  Thanks for sharing.

> The solution is to implement pagination (yuk!), or to use an endpoint to
> look up a branch by name instead of using ‘snapshot-branches’ (is there
> such an endpoint?).

Maybe the solution is to stop referring to tags and only use commits, as
you proposed elsewhere [1]:

        No, I think we should consider always referring to commits
        instead of tags.  It’s annoying from a readability viewpoint,
        but it would ensure reproducibility.  Even flatpak has this
        policy.  :-)

          https://github.com/flathub/flathub/wiki/App-Requirements

Although switching would be hard and a lot of work, it would solve many
of the lookup issues.

At least, we could refer to the commit instead of the tag in ’origin’
for these specific packages.  Bah, it is inelegant. :-)
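
To make the difference concrete, here is a sketch of the two ways of
writing the reference, to be evaluated in ‘guix repl’.  The URL and the
“1.5.1” tag come from the discussion above; the 40-zero commit ID is only
a placeholder and not the real commit behind the tag, and the variable
names are made up for the illustration.

```scheme
(use-modules (guix git-download))

;; Reference by tag: the ‘commit’ field holds the tag name.
(define sway-by-tag
  (git-reference
   (url "https://github.com/swaywm/sway")
   (commit "1.5.1")))

;; Reference by commit: the ‘commit’ field holds a full commit ID.
;; PLACEHOLDER value, to be replaced by the commit the tag points to.
(define sway-by-commit
  (git-reference
   (url "https://github.com/swaywm/sway")
   (commit "0000000000000000000000000000000000000000")))
```

With the second form, a lookup can go through ’lookup-revision’ on the
commit ID directly, so it would presumably not depend on listing the
branches of a snapshot.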

Note that using another endpoint via SWHID could be an option, as
discussed here [2].  For instance, the lookup would use Disarchive-DB.
On the one hand, it would ease coverage and so on.  On the other hand,
it means relying on another service, which does not appear
straightforward when speaking about long-term support.  Disarchive-DB is
somehow a Git repo, so archiving it on SWH is more or less easy, but a
mechanism to fall back locally if this very service is down is still
missing.  Well, the story is not complete yet. ;-)

1: <https://yhetil.org/guix/87mtmr2a3t.fsf_-_@gnu.org>
2: <https://yhetil.org/guix/87h7cz29r4.fsf@gnu.org>

Cheers,
simon

