From: Ihor Radchenko
Newsgroups: gmane.emacs.bugs
Subject: bug#63225: Compiling regexp patterns (and REGEXP_CACHE_SIZE in search.c)
Date: Mon, 08 May 2023 19:44:47 +0000
Message-ID: <87354638wg.fsf@localhost>
To: Mattias Engdegård
Cc: 63225@debbugs.gnu.org

Mattias Engdegård writes:

> On 8 May 2023, at 15:56, Ihor Radchenko wrote:
>
>> I am not sure what I can make out of hits/misses, but I am at least able
>> to look into frequency data, via sort re.log | uniq -c > re-freq.log
>
> I'm mostly curious about the regexp cache behaviour. What cache size
> did you use in this run?

50

> Hardly 20, given the low miss rate? It would be interesting to see what
> sequence of regexps most commonly cause thrashing.

Here is the log: https://0x0.st/HZgH.log

>> It would be even nicer if apart from frequency, there was information
>> about time taken to search for each regexp.
>
> That's a bit messier but could be done if really needed.

So far, my impression from this discussion is that Elisp regexps can
have various non-obvious pitfalls that may need to be considered.
However, Org uses so many regexps that optimizing them all is not a
viable option, especially when the optimization may involve changing
the syntax. Having data on the major bottlenecks would at least allow
us to focus on the regexps that really slow things down in practice.

-- 
Ihor Radchenko // yantar92,
Org mode contributor
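
A rough sketch of how a log like re.log could be replayed to look at
thrashing, not only frequency. It assumes the log holds one regexp per
line in search order (as the sort | uniq -c pipeline suggests) and that
the cache can be modelled as a plain LRU of fixed size. Both are
assumptions for illustration, and the function names are hypothetical;
this is not the code in search.c.

```python
from collections import Counter, OrderedDict


def simulate_lru(patterns, cache_size):
    """Replay regexp strings through a simple LRU cache model.

    Returns (hits, misses, evictions), where evictions counts how often
    each regexp was pushed out of the modelled cache.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    misses = 0
    evictions = Counter()
    for pattern in patterns:
        if pattern in cache:
            hits += 1
            cache.move_to_end(pattern)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= cache_size:
                # Drop the least recently used entry to make room.
                victim, _ = cache.popitem(last=False)
                evictions[victim] += 1
            cache[pattern] = True
    return hits, misses, evictions


def load_patterns(path):
    """Read the log, assuming one regexp per line (an assumption)."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return [line.rstrip("\n") for line in log]
```

Calling simulate_lru(load_patterns("re.log"), 20) and then again with
50 would give a rough comparison of the two cache sizes mentioned
above, and evictions.most_common(10) would list the regexps that are
evicted most often under this model.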