From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: [bug#58136] [PATCH] ui: Improve sort order when searching package names.
X-GNU-PR-Message: followup 58136
X-GNU-PR-Package: guix-patches
X-GNU-PR-Keywords: patch
To: zimoun
Cc: ludo@gnu.org, 58136@debbugs.gnu.org
Date: Wed, 12 Oct 2022 13:24:08 +0200
From: Lars-Dominik Braun
References: <86wn9na82p.fsf@gmail.com>
In-Reply-To: <86wn9na82p.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="1WUFmoobFniE+iSj"
Content-Disposition: inline
--1WUFmoobFniE+iSj
Content-Type: multipart/mixed; boundary="6l9BvxkPFUTZKg+P"
Content-Disposition: inline

--6l9BvxkPFUTZKg+P
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Hi Simon,

> In addition to your proposal which LGTM, maybe we could also use the
> ’upstream-name’ properties.  Most of the time, the Guix name matches the
> upstream name, but sometimes not.  Although, it would not fix the issue
> for ggplot2 since there is no upstream-name for this package. :-)

I agree that using the upstream-name would be a good idea.

>   2. set the “namespace” weight to 1 (or 2 if you prefer)
>
> Otherwise, for example, generic name as CSV could artificially bump
> the relevance and hide relevant packages.  For instance, compare
>
>         guix search csv

The issue here is that we don’t know what the user is searching for. If
we add more weight to the package name, then libraries (rust-csv,
ghc-csv, …) usually win. IMO a search for “csv” should return tools for
manipulating CSV files, such as csvkit, csvdiff, xlsx2csv, … Likewise,
“json” should yield tools like jq, json.sh and possibly others which I
cannot find right now. But maybe I’m searching for a C library that
parses CSV instead. And then what…?

As for ggplot2, the particular issue seems to be that scores are added
up for each match, and the descriptions of some of our packages mention
“ggplot2” a lot. So I tried using MAX instead of +, which works, but
results in little variation between scores and thus a weird sort order
(descending by name). It does not feel like an improvement either.
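To make the difference between + and MAX concrete, here is a minimal
sketch in Guile. It is not the code from guix/ui.scm: the procedures
match-count, score-sum and score-max, the field weights (4 for the name,
2 for the description) and both package entries are made up for
illustration.

```scheme
;; Illustrative sketch only; this is not the code in guix/ui.scm.
;; Field weights and package data below are made up.
(use-modules (ice-9 regex)    ;fold-matches
             (srfi srfi-1))   ;fold

(define (match-count rx str)
  "Return the number of non-overlapping matches of RX in STR."
  (fold-matches rx str 0 (lambda (m count) (+ count 1))))

(define (score-sum rx fields)
  "Additive scheme: every match in every field adds that field's weight."
  (fold (lambda (field total)
          (+ total (* (cdr field) (match-count rx (car field)))))
        0 fields))

(define (score-max rx fields)
  "Maximum scheme: only the best-scoring field counts, and repeated
matches within a field add nothing."
  (fold (lambda (field best)
          (max best (* (cdr field) (min 1 (match-count rx (car field))))))
        0 fields))

;; Each package is a list of (STRING . WEIGHT) pairs: name first, then
;; description.
(define pkg-a
  '(("r-ggplot2" . 4)
    ("An implementation of the grammar of graphics." . 2)))

(define pkg-b
  '(("r-ggthemes-demo" . 4)
    ("Themes for ggplot2.  Works with any ggplot2 plot and extends ggplot2 scales." . 2)))

(define rx (make-regexp "ggplot2" regexp/icase))

(display (list (score-sum rx pkg-a) (score-sum rx pkg-b)))   ;(4 6)
(newline)
(display (list (score-max rx pkg-a) (score-max rx pkg-b)))   ;(4 2)
(newline)
```

With +, the package that merely mentions ggplot2 three times in its
description outranks the one named after it. With MAX the order flips,
but every package that matches in the same field ends up with the same
score, which is the lack of variation described above.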

Cheers,
Lars

-- 
Lars-Dominik Braun
Wissenschaftlicher Mitarbeiter/Research Associate

www.leibniz-psychology.org
ZPID - Leibniz-Institut für Psychologie /
ZPID - Leibniz Institute for Psychology
Universitätsring 15
D-54296 Trier - Germany
Tel.: +49–651–201-4964

--6l9BvxkPFUTZKg+P
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="v2.patch"
Content-Transfer-Encoding: 7bit

diff --git a/guix/packages.scm b/guix/packages.scm
index 94e464cd01..9934501cdb 100644
--- a/guix/packages.scm
+++ b/guix/packages.scm
@@ -86,6 +86,7 @@ (define-module (guix packages)
             this-package
             package-name
             package-upstream-name
+            package-upstream-name*
             package-version
             package-full-name
             package-source
@@ -657,6 +658,38 @@ (define (package-upstream-name package)
   (or (assq-ref (package-properties package) 'upstream-name)
       (package-name package)))
 
+(define (package-upstream-name* package)
+  "Return the upstream name of PACKAGE, which could be different from the name
+it has in Guix."
+  (let ((namespaces (list "cl-"
+                          "ecl-"
+                          "emacs-"
+                          "ghc-"
+                          "go-"
+                          "guile-"
+                          "java-"
+                          "julia-"
+                          "lua-"
+                          "minetest-"
+                          "node-"
+                          "ocaml-"
+                          "perl-"
+                          "python-"
+                          "r-"
+                          "ruby-"
+                          "rust-"
+                          "sbcl-"
+                          "texlive-"))
+        (name (package-name package)))
+    (or (assq-ref (package-properties package) 'upstream-name)
+        (let loop ((prefixes namespaces))
+          (match prefixes
+            ('() name)
+            ((prefix rest ...)
+             (if (string-prefix? prefix name)
+                 (substring name (string-length prefix))
+                 (loop (cdr prefixes)))))))))
+
 (define (hidden-package p)
   "Return a \"hidden\" version of P--i.e., one that 'fold-packages' and thus,
 user interfaces, ignores."
diff --git a/guix/ui.scm b/guix/ui.scm
index dad2b853ac..da16a50f9f 100644
--- a/guix/ui.scm
+++ b/guix/ui.scm
@@ -1623,10 +1623,23 @@ (define (relevance obj regexps metrics)
   (define (score regexp str)
     (fold-matches regexp str 0
                   (lambda (m score)
-                    (+ score
-                       (if (string=? (match:substring m) str)
-                           5              ;exact match
-                           1)))))
+                    (let* ((start (- (match:start m) 1))
+                           (end (match:end m))
+                           (left (if (>= start 0) (string-ref str start) #f))
+                           (right (if (< end (string-length str)) (string-ref str end) #f))
+                           (delimiter-classes '(Cc Cf Pd Pe Pf Pi Po Ps Sk Zs Zl Zp))
+                           (delim-left (or (member (and=> left char-general-category) delimiter-classes) (eq? left #f)))
+                           (delim-right (or (member (and=> right char-general-category) delimiter-classes) (eq? right #f))))
+                      (max score
+                        (cond
+                          ;; regexp is a full match for str.
+                          ((and (eq? left #f) (eq? right #f)) 4)
+                          ;; regexp matches a single word in str.
+                          ((and delim-left delim-right) 3)
+                          ;; regexp matches the beginning or end of a word in str.
+                          ((or delim-left delim-right) 2)
+                          ;; Everything else.
+                          (#t 1)))))))
 
   (define (regexp->score regexp)
     (let ((score-regexp (lambda (str) (score regexp str))))
@@ -1635,10 +1648,11 @@ (define (regexp->score regexp)
                 ((field . weight)
                  (match (field obj)
                    (#f  relevance)
+                   ('() relevance)
                    ((? string? str)
-                    (+ relevance (* (score-regexp str) weight)))
+                    (max relevance (* (score-regexp str) weight)))
                    ((lst ...)
-                    (+ relevance (* weight (apply + (map score-regexp lst)))))))))
+                    (max relevance (* weight (apply max (map score-regexp lst)))))))))
             0 metrics)))
 
   (let loop ((regexps regexps)
@@ -1655,7 +1669,8 @@ (define (regexp->score regexp)
 (define %package-metrics
   ;; Metrics used to compute the "relevance score" of a package against a set
   ;; of regexps.
-  `((,package-name . 4)
+  `((,package-name . 5)
+    (,package-upstream-name* . 1)
 
     ;; Match against uncommon outputs.
     (,(lambda (package)

--6l9BvxkPFUTZKg+P--

--1WUFmoobFniE+iSj
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQGzBAABCAAdFiEEyk+M9DfXR4/aBV/UQhN3ARo3hEYFAmNGo8wACgkQQhN3ARo3
hEba5gv+ILcz/1yKN1+6IUZ4SaSfwURreKttay/0O7uTVddFMAfOmkryx4SOfLoN
0fpmhpJ93nF7TYkaUBelvTSOmrb0bZa63OgxfOxLc4DoDRjAY+F7bY2Oa9HvRM1r
dCuHh/Fob7JvVFy9Z8jA5iqmnICrUcNgJnh1TI8xNybY+g+1nvLhyKwWH5+ZjtDt
J9r60nDeisxk8Dkoub3mxJbILBymjscviPWRoA0iwY0//KZv3JIRl4ICufqGMil0
S2TkL5vpzVYtCLfhyvv33rKNkXjj5bHGy6Cy94psVkVo3v0wIf5JS5mJGpYv3wEP
wTp74j7dd1wnZgwhjlK3hUXBnm/vwBPOJxTYdH0VGUjxmqFcGJ7Hm6tYWv++3Zbz
x5BK6l3dgXCNgcNf5yUShRncpVLHyeOnjQ/Lv6i64pKaPxRpVkCXO66uyTpc+lgq
WDLbjmKjq8nTiokwfRtFMNspwjquhlju4C5NXsM21WQJ916TOotSRmVlONKQvdh4
whHkNPMm
=Ji6J
-----END PGP SIGNATURE-----

--1WUFmoobFniE+iSj--