From: zimoun <zimon.toutoune@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Arun Isaac <arunisaac@systemreboot.net>,
	Pierre Neidhardt <mail@ambrevar.xyz>,
	39258@debbugs.gnu.org
Subject: [bug#39258] [PATCH v4 0/3] Faster cache generation (similar as v3)
Date: Sun, 3 May 2020 20:10:10 +0200
Message-ID: <CAJ3okZ1GS3aMjX3kGBYOkJi03MzGe2qgfAznWE5aGNn+zKonrw@mail.gmail.com>
In-Reply-To: <87r1w1ynnm.fsf@gnu.org>

Hi Ludo,

On Sun, 3 May 2020 at 18:43, Ludovic Courtès <ludo@gnu.org> wrote:

> > Therefore the cache '/lib/guix/package.cache' contains more
> > information.
>
> This breaks the binary interface, so we’ll have to analyze the impact of
> such a change and devise a strategy.

Interface between what and what?

Because, from my understanding, this file is used by only one guix.
What am I missing?


Note that I have read your comment in v3 2/3 but I did not understand it. Sorry.

--8<---------------cut here---------------start------------->8---
I realize the other cache also has that problem, but it would be nice to
add a version tag to the cache.  Basically emit something like:

  (package-metadata-cache (version 0) VECTOR …)

instead of just:

  (VECTOR …)
--8<---------------cut here---------------end--------------->8---
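
To make the quoted version-tag suggestion concrete, here is a minimal
sketch in plain Guile.  It assumes the check runs on the datum obtained
once the cache has been loaded; '%cache-version' and 'cache-entries'
are hypothetical names, not part of Guix.

```scheme
(use-modules (ice-9 match))

;; Hypothetical constant: the version this reader understands.
(define %cache-version 0)

(define (cache-entries sexp)
  "Return the vector of entries stored in SEXP when it carries the expected
version tag.  Return #f for an untagged cache or for another version."
  (match sexp
    (('package-metadata-cache ('version version) (? vector? entries))
     (and (eqv? version %cache-version) entries))
    (_ #f)))
```

A caller that gets #f can then ignore the stale cache and fall back to
the slow path instead of misreading it.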



> > (The v4 structure of 'package.cache' is a quick draft, so details
> > should be discussed and an interesting move would be to have a
> > structured (binary and all strings) S-exp; because it should become an
> > entry point to export the packages list to JSON.  WDYT?)
>
> It’s on purpose that this cache is an object file: it just needs to be
> mmap’d, and that’s it.  It’s the cheapest possible way to do it.
> Parsing sexps would be more costly, and since we’re talking about
> startup time, this is sensitive.

I agree; either I worded it badly or I misunderstand something.
For example, 'supported-systems' is saved as a list of strings,
whereas 'license' is expanded into 3 strings without being packed into
a list of strings.  From my point of view, this is inconsistent and I
do not know which is best (readability, startup time, etc.).


> > To be clear about BM25 and caching, what I have in mind is:
> >   1. "guix search --build-index" optionally done by the user if they want, for example, the BM25 ranking.
>
> Something that must be done explicitly doesn’t seem great to me.  As a
> user, I’d rather not think about search indexes and all.  But I don’t
> know, maybe if it happened automatically on the first ‘guix search’
> invocation that’d be fine.

I do not think building the BM25 index the first time "guix search" is
called is an option.  As a back-of-the-envelope estimate, Xapian*
needs ~25 seconds to do so.

From my point of view, there are two options:
 a) "guix pull" spends this extra ~25 seconds (compared to 10 seconds to
build the v4 cache)
 b) the user manually builds the index (I agree it is awkward!)

Well, the first question is to evaluate whether it is worth it -- I am
using the v2 version based on Xapian to get an idea.  If you have
suggestions about queries (terms a user could type) and results
(packages a user could expect), they are welcome.


*Xapian: I do not think we could do better, but I have not checked yet
whether there is a bottleneck in Guix, Guile-Xapian or Xapian.


> >  1. The name of 'fold-packages*' should be misleading since it does not return "true" packages.
>
> Did you see ‘fold-available-packages’?  It seems you could extend it
> instead of introducing ‘fold-packages*’, no?

Yes and no.

 a) 'fold-available-packages' requires modifying the 'lambda' in
'find-packages-by-description',
 b) 'fold-packages*' returning a 'package' needs fewer tweaks, IMHO.

Well, I agree that in the long term, what 'fold-packages*' does could
be done by 'fold-available-packages' with an adequate 'proc'.

Thank you for the suggestion; re-reading v3 2/3 carefully, I see that
you had already mentioned it. :-)
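
As an illustration of a single fold driven by an adequate 'proc', here
is a self-contained sketch in plain Guile.  Everything in it is a
hypothetical stand-in: '<entry>' is not Guix's <package> record,
'%cache' is a made-up toy cache, and 'fold-cache-entries' only imitates
the calling convention of a fold that hands each entry to PROC
together with keyword arguments.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical stand-in for a search result; not Guix's <package> record.
(define-record-type <entry>
  (make-entry name version synopsis)
  entry?
  (name entry-name)
  (version entry-version)
  (synopsis entry-synopsis))

;; Toy cache made up for this example: a vector of vectors of strings.
(define %cache
  (vector (vector "hello" "2.10" "Hello, GNU world")
          (vector "guile" "3.0.2" "Scheme implementation")))

(define (fold-cache-entries proc init cache)
  "Call (PROC NAME VERSION RESULT #:synopsis SYNOPSIS) for each entry of
CACHE, threading RESULT through the calls, and return the final result."
  (let loop ((i 0) (result init))
    (if (= i (vector-length cache))
        result
        (let ((e (vector-ref cache i)))
          (loop (+ i 1)
                (proc (vector-ref e 0) (vector-ref e 1) result
                      #:synopsis (vector-ref e 2)))))))

(define (cache->entries cache)
  "Return the entries of CACHE as a list of <entry> records, in cache order."
  (reverse
   (fold-cache-entries
    ;; The 'proc' decides what is built; here, one record per entry.
    (lambda* (name version result #:key synopsis #:allow-other-keys)
      (cons (make-entry name version synopsis) result))
    '()
    cache)))
```

With this shape, returning records or returning plain strings is only
a matter of which 'proc' is passed, so a second fold procedure is not
strictly needed.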


> >  2. The function 'package->recutils' in 'guix/ui.scm' is modified but it is not the better.
> >
> >           (match (package-supported-systems p)
> >             (('cache supported-systems)
> >              (string-join supported-systems))
> >             (_
> >              (string-join (package-transitive-supported-systems p)))))
> >
> >     However it avoids to duplicate code; as it is done in version v3.
>
> I made suggestions to Arun’s v3 about the API here.  Essentially, I
> think I proposed having a procedure that takes the list of fields as
> keyword parameters, and ‘package->recutils’ would just delegate to that.

Yes, it was already your suggestion in v3 3/3.  Do you suggest
refactoring 'package->recutils'?  For example,

--8<---------------cut here---------------start------------->8---
(define* (package->recutils name version
                            ... all-the-other-fields ...
                            port #:optional (width (%text-width))
                            #:key
                            (hyperlinks? (supports-hyperlinks? port))
                            (extra-fields '()))
--8<---------------cut here---------------end--------------->8---
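
One possible reading of the quoted suggestion (a procedure taking the
fields as keyword parameters, with 'package->recutils' delegating to
it) is sketched below in plain Guile.  The names '<pkg>',
'fields->recutils' and 'pkg->recutils' are hypothetical, only three
fields are shown, and width and hyperlink handling are left out.

```scheme
(use-modules (srfi srfi-9))

;; Hypothetical minimal record standing in for Guix's <package>.
(define-record-type <pkg>
  (make-pkg name version synopsis)
  pkg?
  (name pkg-name)
  (version pkg-version)
  (synopsis pkg-synopsis))

(define* (fields->recutils port
                           #:key name version synopsis (extra-fields '()))
  "Write the given fields to PORT in recutils format.  EXTRA-FIELDS is a
list of (NAME . VALUE) pairs written after the regular fields."
  (format port "name: ~a~%" name)
  (format port "version: ~a~%" version)
  (format port "synopsis: ~a~%" synopsis)
  (for-each (lambda (field)
              (format port "~a: ~a~%" (car field) (cdr field)))
            extra-fields)
  ;; A blank line separates records.
  (newline port))

(define* (pkg->recutils p port #:key (extra-fields '()))
  "Extract the fields of P and delegate to 'fields->recutils'."
  (fields->recutils port
                    #:name (pkg-name p)
                    #:version (pkg-version p)
                    #:synopsis (pkg-synopsis p)
                    #:extra-fields extra-fields))
```

Code reading from the cache could then call 'fields->recutils' directly
with the cached strings, without building a package object first.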



> >  4. Impolite '@@' is used to access the private license construction.
>
> (guix licenses) could provide a ‘string->license’ procedure.

Well, do you suggest:

    (define (string->license name) (license name #f #f))

? Skipping 'uri' and 'comment'?  Naive question: what is the purpose
of these 2 fields?  Because they are not exposed at the CLI level,
AFAIK, and I do not think a user evaluates '(license-uri pkg)' in a
script.

Well, I think that the hyperlink feature could be used to display the
license URI too.  WDYT?
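
For comparison with the one-liner above, here is a sketch of a
'string->license' that keeps 'uri' and 'comment' when the name is
already known, and only falls back to a bare license otherwise.  It is
self-contained plain Guile under stated assumptions: the '<license>'
record below is a local stand-in with the three fields discussed here,
and the '%licenses' table and its URLs are made up for the example.

```scheme
(use-modules (srfi srfi-1)
             (srfi srfi-9))

;; Local stand-in record with the three fields discussed in this message.
(define-record-type <license>
  (license name uri comment)
  license?
  (name license-name)
  (uri license-uri)
  (comment license-comment))

;; Made-up table of known licenses; the URLs are placeholders.
(define %licenses
  (list (license "GPL 3+" "https://example.org/gpl3" "made-up comment")
        (license "Expat" "https://example.org/expat" #f)))

(define (string->license name)
  "Return the known license called NAME, keeping its URI and comment.
When NAME is unknown, return a license with only its name set."
  (or (find (lambda (l) (string=? (license-name l) name))
            %licenses)
      (license name #f #f)))
```

With a lookup like this, the URI stays available if the hyperlink
feature is later used to display it.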



> Stopping here for now because I’m sorta drowning in patch review.  :-)

Thank you for all the comments.


> Thanks for exploring this design space, we’re making progress!

My pleasure. Scheme is designed to explore. ;-)


Cheers,
simon




