From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp0 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id 8ClGOwz/B2CRdwAA0tVLHw (envelope-from ) for ; Wed, 20 Jan 2021 09:59:40 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp0 with LMTPS id sD0XNwz/B2BCJAAA1q6Kng (envelope-from ) for ; Wed, 20 Jan 2021 09:59:40 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 247D39404CF for ; Wed, 20 Jan 2021 09:59:30 +0000 (UTC) Received: from localhost ([::1]:36484 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l2AGv-0005MI-Tk for larch@yhetil.org; Wed, 20 Jan 2021 04:59:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:57080) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l2AGV-0005M8-5l for guix-patches@gnu.org; Wed, 20 Jan 2021 04:59:04 -0500 Received: from debbugs.gnu.org ([209.51.188.43]:41403) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1l2AGU-0007Il-Jt for guix-patches@gnu.org; Wed, 20 Jan 2021 04:59:02 -0500 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1l2AGU-0003NQ-Ih for guix-patches@gnu.org; Wed, 20 Jan 2021 04:59:02 -0500 X-Loop: help-debbugs@gnu.org Subject: [bug#45893] Hint for package name: full matrix iteration Resent-From: zimoun Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Wed, 20 Jan 2021 09:59:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 45893 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: patch To: Ludovic =?UTF-8?Q?Court=C3=A8s?= Received: via spool by 45893-submit@debbugs.gnu.org id=B45893.161113670512936 (code B ref 45893); Wed, 20 Jan 2021 09:59:02 +0000 Received: (at 45893) by debbugs.gnu.org; 20 Jan 2021 09:58:25 +0000 Received: from localhost ([127.0.0.1]:52949 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1l2AFs-0003Ma-KS for submit@debbugs.gnu.org; Wed, 20 Jan 2021 04:58:24 -0500 Received: from mail-wm1-f50.google.com ([209.85.128.50]:54061) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1l2AFq-0003ML-7O for 45893@debbugs.gnu.org; Wed, 20 Jan 2021 04:58:23 -0500 Received: by mail-wm1-f50.google.com with SMTP id j18so2263522wmi.3 for <45893@debbugs.gnu.org>; Wed, 20 Jan 2021 01:58:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-transfer-encoding; bh=PibM+cCAPr2E8aM6E5XRyoc3peBbr7yIvgt9qbG066I=; b=gpygHfLOrfGIr3hRG7x6vGIlSq1BQMNdtA9o8u/ZOqMHkBcpCGYseHocAXmL7FzZPJ sZE3ko0njtaVes1iBRn/guh9LtFY3pjzjQItkwctP3qVlZqixG9Z0uDk4IXaVsGOIBB4 J0WPd8gevfhR/SswtiTH/WZN+thYS0xj2XGZq6Yxb4paPSZ7ZTX3ro8vcuYfbO2+OXN7 lqb229A3kcK3ZM9tb2a7lZnFCQR2Ej/upc1NltlCdbMkyeodk0UiIlQxZRnHgiNPkNZF o14nCU/rX/CsM6BAwFy2SJ5Hg1LS+dnFcdDHjULoEQrNkl3GorDmSRzZo+j7VEsl2ndl oV+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version:content-transfer-encoding; bh=PibM+cCAPr2E8aM6E5XRyoc3peBbr7yIvgt9qbG066I=; b=sqkk6TO2fXrYYjVbm7jxf1ddRvqjeUQPr8GLbUL32EL32QW2LM2mcz0/aRPsP+iUjp E7RLFcsHN4uw3d1loqcgxMBVfGMbur+j886y/XxomAy5MzgAyUN5dzxZRs2M9rKqa+kC KF1gjCNv/JI875aFGhur5eD2dTgixlLBY3rnNXnZ7++TourGk27nMg2tpeX3FOCuC3Tx st6nYE8CkqHmHIOQ8ITLQIdB2cpLySmYZUWBdWuhfDxb3XLZSW3gvbNlJzMBgiki+44X /gKLlmwgecDQef51X0dZyCjRJkSDYKZL5fqhHqay5M6frD4y3WYAx4Jhet8F51xmFEbO Hi9Q== X-Gm-Message-State: AOAM533VafAonongdBTYnp74ydJlPuWv2LOyAL9Yf3LoUn4xJoojg9ak Sf28k24RtYKKzmEeBu3AZeDbOxp8Nq8= X-Google-Smtp-Source: ABdhPJx3Lkz5lsaN5l4wtAka32U9cf6BN1hWh6m3qKZ8nv8Dn4kwVFC/EkxZ8ehZ5NWHXz7l+nleBw== X-Received: by 2002:a1c:a406:: with SMTP id n6mr3550276wme.53.1611136695876; Wed, 20 Jan 2021 01:58:15 -0800 (PST) Received: from lili ([2a01:e0a:59b:9120:65d2:2476:f637:db1e]) by smtp.gmail.com with ESMTPSA id b132sm2990582wmh.21.2021.01.20.01.58.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Jan 2021 01:58:15 -0800 (PST) From: zimoun In-Reply-To: <86o8hk8olh.fsf@gmail.com> References: <865z3xlp2y.fsf@gmail.com> <20210116002634.10401-1-zimon.toutoune@gmail.com> <20210116002634.10401-3-zimon.toutoune@gmail.com> <87a6t4hlom.fsf_-_@gnu.org> <86o8hk8olh.fsf@gmail.com> Date: Wed, 20 Jan 2021 10:49:47 +0100 Message-ID: <86ft2w7xac.fsf@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 45893@debbugs.gnu.org Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Migadu-Flow: FLOW_IN X-Migadu-Spam-Score: -1.25 Authentication-Results: aspmx1.migadu.com; dkim=fail ("headers rsa verify failed") header.d=gmail.com header.s=20161025 header.b=gpygHfLO; dmarc=fail reason="SPF not aligned (relaxed)" header.from=gmail.com (policy=none); spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Migadu-Queue-Id: 247D39404CF X-Spam-Score: -1.25 X-Migadu-Scanner: scn1.migadu.com X-TUID: URobE8CzFYw9 Hi, On Wed, 20 Jan 2021 at 00:59, zimoun wrote: > However, the first improvement is to speed up =E2=80=99string-distance=E2= =80=99. Well, > the naive recursive implementation is well-known to be really slow. For the interested reader, the patch implements the naive recursion version. And just to compare, I have quickly compared to the iterative with full matrix. See [1]. Roughly speaking, it is a factor of 5x. --8<---------------cut here---------------start------------->8--- scheme@(guix-user)> ,time (define x (read-the-cache "macs-mgit")) ;; 3.425513s real time, 4.524607s run time. 1.619604s spent in GC. scheme@(guix-user)> ,time (define x (compute-distance2 "macs-mgit")) ;; 0.870167s real time, 1.056089s run time. 0.271307s spent in GC. scheme@(guix-user)> ,time (define x (compute-distance "macs-mgit")) ;; 3.637917s real time, 5.601863s run time. 2.847500s spent in GC. --8<---------------cut here---------------end--------------->8--- The naive recursion version seems fast enough for options and commands =E2=80=93=E2=80=93 because there are few. The key advantage of recursion i= mplementation is the readibility, IMHO. Compare with the iteration with full matrix below. I am in favor to merge this naive recursion version for now. And postpone improvements. Because to be competitive for package hints, instead of plain =E2=80=9Cedit distance=E2=80=9C which scales poorly, there= is 2 directions: - implement =E2=80=99string-distance=E2=80=99 at the C level (in the stand= ard library?) - pre-process the package names at package cache build time; with suffix tree or n-gram or =E2=80=93=E2=80=93 in the scope of = =E2=80=9Cguix search=E2=80=9D improvements. Both are piece of works and I am not convinced the package name hint is worth. =20 Cheers, simon 1: --8<---------------cut here---------------start------------->8--- (define (edit-distance s1 s2) (let* ((as (string->list s1)) (bs (string->list s2)) (s (list->vector as)) (t (list->vector bs)) (m (length as)) (n (length bs)) (d (make-typed-array 'u32 0 (1+ m) (1+ n)))) (do ((i 1 (1+ i))) ((> i m)) (array-set! d i i 0)) (do ((j 1 (1+ j))) ((> j n)) (array-set! d j 0 j)) (do ((j 1 (1+ j))) ((> j n)) (do ((i 1 (1+ i))) ((> i m)) (let* ((c1 (vector-ref s (1- i))) (c2 (vector-ref t (1- j))) (cost (if (char=3D? c1 c2) 0 1)) (deletion (1+ (array-ref d (1- i) j))) (insertion (1+ (array-ref d i (1- j)))) (substitution (+ cost (array-ref d (1- i) (1- j)))) (v (min deletion insertion substitution))) (array-set! d v i j)))) (array-ref d m n))) --8<---------------cut here---------------end--------------->8---