From: Pierre Neidhardt
To: Arun Isaac, Ricardo Wurmus
Cc: guix-devel@gnu.org
Subject: Re: File search progress: database review and question on triggers
Date: Thu, 13 Aug 2020 08:58:16 +0200
Message-ID: <87eeobxcaf.fsf@ambrevar.xyz>
In-Reply-To: <87eeobh01d.fsf@systemreboot.net>
References: <87sgcuh8rb.fsf@ambrevar.xyz> <87y2ml429i.fsf@elephly.net>
 <87364tgja3.fsf@ambrevar.xyz> <87y2mlf4jw.fsf@ambrevar.xyz>
 <87pn7x3pyw.fsf@elephly.net> <87r1sbel4f.fsf@ambrevar.xyz>
 <87eeobh01d.fsf@systemreboot.net>

Hi Arun,

Arun Isaac writes:

> sqlite insert statements can be very fast. sqlite.org claims 50000 or
> more insert statements per second. But in order to achieve that speed
> all insert statements have to be grouped together in a single
> transaction. See https://www.sqlite.org/faq.html#q19

Very good point, thanks a lot! I'll try again!

>> A string-contains filter takes less than 1 second.
>
> Guile's string-contains function uses a naive O(nk) implementation,
> where 'n' is the length of string s1 and 'k' is the length of string
> s2. If it was implemented using the Knuth-Morris-Pratt algorithm, it
> could cost only O(n+k). So, there is some scope for improvement here. In
> fact, a comment on line 2007 of libguile/srfi-13.c in the guile source
> tree makes this very point.

Thanks for this low-level insight, this is very useful!

Then if we can get Knuth-Morris-Pratt in the next Guile version, a
textual database could be an ideal option in terms of performance.

>> I need to measure the time SQL takes for a regexp match.
>
> sqlite, by default, does not come with regexp support. You might have to
> load some external library. See
> https://www.sqlite.org/lang_expr.html#the_like_glob_regexp_and_match_operators
>
> --8<---------------cut here---------------start------------->8---
> The REGEXP operator is a special syntax for the regexp() user
> function. No regexp() user function is defined by default and so use of
> the REGEXP operator will normally result in an error message. If an
> application-defined SQL function named "regexp" is added at run-time,
> then the "X REGEXP Y" operator will be implemented as a call to
> "regexp(Y,X)".
> --8<---------------cut here---------------end--------------->8---

I saw that. Now I need to measure how regexp search performs compared
to pattern search.

Cheers!

-- 
Pierre Neidhardt
https://ambrevar.xyz/
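
The two SQLite points raised in the message above (grouping inserts in a
single transaction, and supplying an application-defined regexp()
function) can be illustrated with a minimal sketch. It assumes Python's
standard sqlite3 module rather than the Guile bindings discussed in the
thread. The table name "files", the helper names, and the sample paths
are hypothetical and exist only for illustration.

```python
import re
import sqlite3


def sql_regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X), so the pattern
    # arrives as the first argument and the column value as the second.
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


def build_index(paths):
    """Create an in-memory table of paths (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # No regexp() exists by default; register one at run time.
    conn.create_function("regexp", 2, sql_regexp)
    conn.execute("CREATE TABLE files (path TEXT NOT NULL)")
    # The sqlite3 module opens a transaction implicitly before the first
    # INSERT; the "with" block commits once at the end, so all rows are
    # written in a single transaction instead of one commit per row.
    with conn:
        conn.executemany(
            "INSERT INTO files (path) VALUES (?)", ((p,) for p in paths)
        )
    return conn


def search_like(conn, substring):
    # Pattern search; LIKE is case-insensitive for ASCII by default.
    rows = conn.execute(
        "SELECT path FROM files WHERE path LIKE ? ORDER BY path",
        (f"%{substring}%",),
    )
    return [row[0] for row in rows]


def search_regexp(conn, pattern):
    # Regexp search through the function registered in build_index.
    rows = conn.execute(
        "SELECT path FROM files WHERE path REGEXP ? ORDER BY path",
        (pattern,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_index(["/bin/ls", "/bin/lsblk", "/lib/libc.so.6"])
    print(search_like(conn, "ls"))
    print(search_regexp(conn, r"/ls$"))
```

To compare the two kinds of search, each call could be wrapped with
time.perf_counter() on a realistically sized table. Note that the regexp
function here runs in the host language for every row, so its cost
depends on that language's regexp engine as much as on SQLite.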