From: "Ludovic Courtès" <ludo@gnu.org>
To: Christopher Baines <mail@cbaines.net>
Cc: guix-devel@gnu.org
Subject: Re: Package file indexing
Date: Fri, 03 Jan 2020 12:26:58 +0100
Message-ID: <87tv5cpypp.fsf@gnu.org>
In-Reply-To: <87png11xgi.fsf@cbaines.net> (Christopher Baines's message of "Thu, 02 Jan 2020 19:15:41 +0000")

Hello!

Christopher Baines <mail@cbaines.net> writes:

> Pierre Neidhardt <mail@ambrevar.xyz> writes:
>
>> Hello again!
>>
>> I'm resurrecting this since I've just started working on this as part of
>> the NGI application! :)
>>
>>>>> Internally it’d call ‘guix substitute’ to fetch the file index from
>>>>> the substitute server, check its signature, cache it locally, and then
>>>>> look up the file.
>>
>> What about storing the file listing in the narinfo instead?
>> Is this doable?  If so, then it should be quite simple to implement: it
>> would basically mimic "guix size".
>
> I haven't followed this thread particularly well, but at least from my
> recent experience messing with nar and narinfo stuff in the Guix Data
> Service, I'd be cautious about trying to adapt narinfo files for this
> purpose.
>
> It seems to me that narinfo files are good at capturing the
> information about the hash, size, location and signature of the
> nar. They're small and human-readable.
>
> I think making information about the contents of Guix store items more
> available is great, but even in the average case, it seems like that's
> too much information to pack into a narinfo file. Imagining a manifest
> in the abstract, having a list of the files and directories as well as
> the hashes and sizes of the files could be really useful, but for most
> store items, all that information would be much larger than the narinfo
> file itself. A separate file might be more flexible.

I concur!  Actually, there’s a separate file already: the nar itself.

  wget -q -O - https://ci.guix.gnu.org/nar/lzip/1gyi4i5lbpr7apm74p08dwy11fhzh4j7-sed-4.7 \
     | lzip -d | guix archive -t

But…

> Additionally, now that I'm thinking about this, having information about
> each store item is great, but if you want to know which store items in a
> particular revision of Guix contain files called foo, then it might take
> a while to download and search them all. Having something that's focused
> around the packages in a channel, and acts as an index for all of the
> files in all of the available outputs might be faster to search, by
> doing the combining of the data upfront.

… I agree.  I think file search has to be a service providing access to
a fast database.

I think the Guix Data Service is a good fit since it knows about
packages, derivations, commits, and how they map to each other.  :-)  It
could download nars and do the equivalent of ‘guix archive -t’ to get
the list of file names.
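
As a rough illustration of "doing the combining of the data upfront", here
is a minimal shell sketch of a flat-file index.  It is not how the Guix Data
Service works: index_item, search_file and the INDEX file are hypothetical
names, a real service would use a proper database, and each listing is
treated as opaque lines of text rather than assuming the exact output format
of ‘guix archive -t’.

```shell
#!/bin/sh
# Hypothetical flat index: one "STORE-ITEM<TAB>LISTING-LINE" record per line.
INDEX="${INDEX:-./file-index.tsv}"

# index_item ITEM: read a file listing on stdin (for example the output of
# 'guix archive -t') and append one record per non-empty line to the index.
index_item() {
    item="$1"
    while IFS= read -r line; do
        if [ -n "$line" ]; then
            printf '%s\t%s\n' "$item" "$line"
        fi
    done >> "$INDEX"
}

# search_file STRING: print each store item whose listing contains STRING.
# Only the listing field is matched, never the store item name itself.
search_file() {
    awk -F '\t' -v pat="$1" 'index($2, pat) { print $1 }' "$INDEX" | sort -u
}
```

With the pipeline shown earlier, the listing for sed could be fed in by
appending "| index_item 1gyi4i5lbpr7apm74p08dwy11fhzh4j7-sed-4.7" after
‘guix archive -t’, and then queried with "search_file bin/sed".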

There’s an argument that it would be nice if file search were
implemented as part of ‘guix publish’ because that would immediately
benefit everyone without going through complex setups.  However, ‘guix
publish’ wouldn’t really know what to index upfront, or maybe it could
index lazily like it does with “baking”.
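
To sketch what "indexing lazily" could mean, the shell function below
computes a listing only on the first request and serves it from a cache
afterwards.  This is a simplified, synchronous stand-in and not the actual
‘guix publish’ mechanism; list_files_lazily and CACHE_DIR are hypothetical
names, and the command that produces the listing is supplied by the caller.

```shell
#!/bin/sh
# Hypothetical cache directory holding one listing file per store item.
CACHE_DIR="${CACHE_DIR:-./listing-cache}"

# list_files_lazily ITEM COMMAND [ARGS...]: print the listing for ITEM.
# On a cache miss, run COMMAND to produce the listing, store it, then print it.
list_files_lazily() {
    item="$1"
    shift
    cached="$CACHE_DIR/$item"
    if [ ! -f "$cached" ]; then
        mkdir -p "$CACHE_DIR"
        # Write to a temporary file first so that a failed run leaves no
        # partial cache entry behind.
        if "$@" > "$cached.tmp"; then
            mv "$cached.tmp" "$cached"
        else
            rm -f "$cached.tmp"
            return 1
        fi
    fi
    cat "$cached"
}
```

The first call for a given item pays the cost of fetching and listing the
nar; later calls only read the cached file.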

Food for thought!

Ludo’.

