From: Ihor Radchenko
Newsgroups: gmane.emacs.devel
Subject: Limits on the regexp string length
Date: Wed, 21 Dec 2022 12:51:11 +0000
Message-ID: <87h6xox6m8.fsf@localhost>
To: emacs-devel@gnu.org

Hi,

I am writing as a follow-up to a recent bug report we got in Org.

Rudolf Adamkovič (December 14, 2022, emacs-orgmode.gnu.org)
Subject: Radio links work only in small numbers
https://orgmode.org/list/m2lenax5m6.fsf@me.com

It looks like the length of regular expressions in Emacs is limited, and
regexps exceeding this limit cause an error to be thrown:
"Regular expression too big".

Is there any rationale behind this limit?
Can we increase it somehow from Elisp?

The regexps in question are giant (or re1 re2 ...) forms, where we are
searching for occurrences of word combinations from a list. The compiled
discrete automata should not occupy too much memory: no more than
~ max_phrase_length * char_table_size.

P.S.
Note that `regexp-opt' is not suitable because we need to treat any run
of newlines and spaces inside the word combinations as equivalent.

-- 
Ihor Radchenko // yantar92,
Org mode contributor
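
For illustration, the kind of regexp described above can be sketched
roughly as follows. This is only a minimal sketch, not Org's actual code:
the `my/' helper names are hypothetical, and the whitespace class
"[ \t\n]+" is an assumption about what "newlines/spaces" should cover.

```elisp
;; Hypothetical helpers for illustration only; not Org's implementation.

(defun my/phrase-to-regexp (phrase)
  "Return a regexp matching PHRASE with any whitespace run between words."
  ;; Quote each word literally, then join the words with a matcher
  ;; for one or more spaces, tabs, or newlines (assumed whitespace set).
  (mapconcat #'regexp-quote (split-string phrase) "[ \t\n]+"))

(defun my/phrases-to-regexp (phrases)
  "Return a single alternation regexp matching any phrase in PHRASES."
  (concat "\\(?:"
          (mapconcat #'my/phrase-to-regexp phrases "\\|")
          "\\)"))

(defun my/regexp-compiles-p (regexp)
  "Return t if Emacs can compile REGEXP, nil if compiling signals an error."
  ;; Matching against an empty string forces the regexp to be compiled.
  (condition-case nil
      (progn (string-match-p regexp "") t)
    (error nil)))
```

With a long enough PHRASES list, `my/regexp-compiles-p' would be expected
to return nil once the alternation exceeds the limit discussed here; the
exact threshold is not something this sketch assumes.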