From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp0 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id GOemBEqA1V5JSwAA0tVLHw (envelope-from ) for ; Mon, 01 Jun 2020 22:25:14 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp0 with LMTPS id UB1vAEqA1V6BSAAA1q6Kng (envelope-from ) for ; Mon, 01 Jun 2020 22:25:14 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 9970E94036C for ; Mon, 1 Jun 2020 22:25:12 +0000 (UTC) Received: from localhost ([::1]:60272 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jfsrm-0004uN-3l for larch@yhetil.org; Mon, 01 Jun 2020 18:25:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50098) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jfsre-0004uD-TJ for guix-patches@gnu.org; Mon, 01 Jun 2020 18:25:02 -0400 Received: from debbugs.gnu.org ([209.51.188.43]:53955) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jfsre-0005SU-JC for guix-patches@gnu.org; Mon, 01 Jun 2020 18:25:02 -0400 Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1jfsre-0003Oc-FB for guix-patches@gnu.org; Mon, 01 Jun 2020 18:25:02 -0400 X-Loop: help-debbugs@gnu.org Subject: [bug#39258] KMP string search algorithm? Resent-From: Leo Famulari Original-Sender: "Debbugs-submit" Resent-CC: guix-patches@gnu.org Resent-Date: Mon, 01 Jun 2020 22:25:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 39258 X-GNU-PR-Package: guix-patches X-GNU-PR-Keywords: To: zimoun Cc: Arun Isaac , Ludovic =?UTF-8?Q?Court=C3=A8s?= , 39258@debbugs.gnu.org Received: via spool by 39258-submit@debbugs.gnu.org id=B39258.159105026513004 (code B ref 39258); Mon, 01 Jun 2020 22:25:02 +0000 Received: (at 39258) by debbugs.gnu.org; 1 Jun 2020 22:24:25 +0000 Received: from localhost ([127.0.0.1]:37268 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jfsr2-0003Nf-UX for submit@debbugs.gnu.org; Mon, 01 Jun 2020 18:24:25 -0400 Received: from wout2-smtp.messagingengine.com ([64.147.123.25]:54225) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1jfsr1-0003NT-6y for 39258@debbugs.gnu.org; Mon, 01 Jun 2020 18:24:23 -0400 Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.west.internal (Postfix) with ESMTP id 55B4E9CD; Mon, 1 Jun 2020 18:24:17 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 01 Jun 2020 18:24:17 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=famulari.name; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:in-reply-to; s=mesmtp; bh=rBTGbzaqNCLwuyZUt/axayMN HtAcCyX93g9xiJ3WoI4=; b=M701//WVMnOPSw/SxpKkPiXc9kawGh2u13zFz2Ci Zks3puqNz++L3YQCiUtIG479gD9sjpeHKCGk3CXppuUx3I418mug1Zp2XxfqM/ll RDV7Yh17qLgSjlS21MKRoTvEFu7loy9MCOd2mX3LjvnVgPW2JT6rrrYYdPztwwXc vhQ= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; bh=rBTGbz aqNCLwuyZUt/axayMNHtAcCyX93g9xiJ3WoI4=; b=NxKDv9XSSkWWgOhlhBAK31 Zkwa0WNeUpIj0XB3XCUtRQKNjiFEWQntqiwwV32PmRHh7Bqn2x/AgOXz5yqGbYHD YKYdDm6qBQkK/g20s/CZNyBAkQb9gK7+3q14GcJm/WWfF4ygOf8HCXkujYuYWztJ OigzcDJx3fvGLkuXX+88KGyoKo5W/VypPhlD7c0U49fi36L3eu5qsiqJToOuw/u+ F2XzY+Res265f9nBtGvQXxct5jacM8Zpd520uvfOB8A4q8ScgTsPCfHPni1Y/2NB p+VzBggOwgm5NBr00Gg72nNdIcbYAYwq711KrMGWn/RKQzTTg5OcztdUGA7/b6XQ == X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduhedrudefiedgtdelucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvffukfhfgggtuggjsehttdertddttddvnecuhfhrohhmpefnvghoucfh rghmuhhlrghrihcuoehlvghosehfrghmuhhlrghrihdrnhgrmhgvqeenucggtffrrghtth gvrhhnpeektdfhtdeigeejleeiffeuuedtiedttddvhfdvhfejgfdujeeiveeifefhtdej veenucffohhmrghinhepfihikhhiphgvughirgdrohhrghenucfkphepjeeirdduvdegrd dufeekrdeifeenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhr ohhmpehlvghosehfrghmuhhlrghrihdrnhgrmhgv X-ME-Proxy: Received: from localhost (c-76-124-138-63.hsd1.pa.comcast.net [76.124.138.63]) by mail.messagingengine.com (Postfix) with ESMTPA id 64B2B30624CC; Mon, 1 Jun 2020 18:24:16 -0400 (EDT) Date: Mon, 1 Jun 2020 18:24:14 -0400 From: Leo Famulari Message-ID: <20200601222414.GA30829@jasmine.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Score: -0.7 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-Spam-Score: -1.7 (-) X-BeenThere: guix-patches@gnu.org List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guix-patches-bounces+larch=yhetil.org@gnu.org Sender: "Guix-patches" X-Scanner: scn0 Authentication-Results: aspmx1.migadu.com; dkim=fail (rsa verify failed) header.d=famulari.name header.s=mesmtp header.b=M701//WV; dkim=fail (rsa verify failed) header.d=messagingengine.com header.s=fm2 header.b=NxKDv9XS; dmarc=none; spf=pass (aspmx1.migadu.com: domain of guix-patches-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=guix-patches-bounces@gnu.org X-Spam-Score: 3.99 X-TUID: rchP2XdON+Zi On Mon, Jun 01, 2020 at 12:11:52PM +0200, zimoun wrote: > Dear, > > > > Often our search strings are only literal strings. So, we can save some time > > > by using string-contains instead of invoking the regexp engine. Patch 2 does > > > this. In addition, guile's string-contains uses a naive O(n^2) string search > > > algorithm. We should perhaps use the O(n) Knuth-Morris-Pratt algorithm[1]. In > > > fact, a comment on line 2006 of libguile/srfi-13.c in the guile source code > > > mentions this. If implemented, the KMP algorithm could speed up guix search > > > further. > > > > > > [1]: https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm > > It could improve. > Well, I will try to do some back-to-envelop computations because I am > not convinced that the mean value of 'n' (length of description, > isn't) is large enough to really see an improvement for the end-user; > the visible bottleneck is I/O. > > All the best, > simon > > ps; > To be honest, I thought this kind of algorithm was the default. :-) I also recommend taking a look at the Boyer Moore string search implementation in (guix build grafts). It would be great to generalize it and make it accessible to other parts of Guix.