* Limits on the regexp string length
From: Ihor Radchenko @ 2022-12-21 12:51 UTC
To: emacs-devel
Hi,
I am writing as a follow-up of a recent bug report we got in Org.
Rudolf Adamkovič <salutis@me.com> (December 14, 2022)
Subject: Radio links work only in small numbers
https://orgmode.org/list/m2lenax5m6.fsf@me.com
It looks like the length of regular expressions in Emacs is limited, and
regexps exceeding this limit cause an error to be thrown: "Regular
expression too big".
Is there any rationale behind this limit? Can we increase it somehow
from Elisp?
The regexps in question are giant alternations of the form
(or re1 re2 ...), which we use to search for occurrences of word
combinations from a list.
The compiled discrete automata should not occupy too much memory: no more
than roughly max_phrase_length * char_table_size.
P.S. Note that `regexp-opt' is not suitable because we need to match
any number of newlines or spaces between the words of each combination,
treating the two as equivalent.
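To make the whitespace requirement concrete, here is a minimal sketch in C
(not the actual Org code) of how one phrase could be turned into a regexp
fragment that tolerates any run of spaces, tabs, or newlines between words.
The helper name phrase_to_regexp is hypothetical, and regexp special
characters in the phrase are deliberately left unescaped for brevity.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only.  Copy PHRASE into OUT,
   replacing every run of whitespace with a character class that matches
   one or more spaces, tabs, or newlines.  Return the number of bytes
   written (excluding the terminating NUL), or -1 if OUT, of capacity
   CAP, is too small.  Regexp special characters are NOT escaped.  */
static long
phrase_to_regexp (const char *phrase, char *out, size_t cap)
{
  /* Literal space, tab, and newline inside a bracket expression.  */
  static const char ws[] = "[ \t\n]+";
  const size_t ws_len = sizeof ws - 1;
  size_t n = 0;

  if (cap == 0)
    return -1;

  for (const char *p = phrase; *p; )
    {
      if (isspace ((unsigned char) *p))
        {
          /* Collapse the whole whitespace run into one fragment.  */
          while (isspace ((unsigned char) *p))
            p++;
          if (n + ws_len >= cap)
            return -1;
          memcpy (out + n, ws, ws_len);
          n += ws_len;
        }
      else
        {
          if (n + 1 >= cap)
            return -1;
          out[n++] = *p++;
        }
    }
  out[n] = '\0';
  return (long) n;
}
```

Each word gap costs several pattern bytes instead of one, so an
alternation over a long phrase list grows noticeably faster than the raw
text of the phrases would suggest.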
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
* Re: Limits on the regexp string length
From: Eli Zaretskii @ 2022-12-21 13:39 UTC
To: Ihor Radchenko; +Cc: emacs-devel
> From: Ihor Radchenko <yantar92@posteo.net>
> Date: Wed, 21 Dec 2022 12:51:11 +0000
>
> It looks like the length of regular expressions in Emacs is limited and
> regexps exceeding this length cause error being thrown: "Regular
> expression too big".
>
> Is there any rationale behind this limit? Can we increase it somehow
> from Elisp?
See this part of regex-emacs.c:
  /* This is not an arbitrary limit: the arguments which represent offsets
     into the pattern are two bytes long.  So if 2^15 bytes turns out to
     be too small, many things would have to change.  */
  # define MAX_BUF_SIZE (1 << 15)

  /* Extend the buffer by at least N bytes via realloc and
     reset the pointers that pointed into the old block to point to the
     correct places in the new one.  If extending the buffer results in it
     being larger than MAX_BUF_SIZE, then flag memory exhausted.  */
  #define EXTEND_BUFFER(n) \
    do { \
      ptrdiff_t requested_extension = n; \
      unsigned char *old_buffer = bufp->buffer; \
      if (MAX_BUF_SIZE - bufp->allocated < requested_extension) \
        return REG_ESIZE; \
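
The comment above ties the limit to two-byte offsets. The following is a
rough, self-contained sketch of why that caps the compiled pattern at 2^15
bytes. It assumes the two bytes are read back as a signed 16-bit value, which
is what the 2^15 figure suggests. The function names store_number and
extract_number are illustrative and are not claimed to match the helpers in
regex-emacs.c.

```c
/* Store NUMBER at DEST as two bytes, low byte first.  Only the low
   16 bits survive; anything above them is silently dropped.  */
static void
store_number (unsigned char *dest, int number)
{
  unsigned int u = (unsigned int) number;
  dest[0] = u & 0xFF;
  dest[1] = (u >> 8) & 0xFF;
}

/* Read the two bytes back as a signed 16-bit quantity (assumption:
   offsets can point backward as well as forward, so one bit goes to
   the sign).  */
static int
extract_number (const unsigned char *src)
{
  unsigned int raw = src[0] | ((unsigned int) src[1] << 8);
  return raw >= 0x8000 ? (int) raw - 0x10000 : (int) raw;
}
```

Under that assumption, an offset of 32767 round-trips intact, while 32768
comes back as -32768, so any jump across more than 2^15 - 1 bytes of compiled
pattern would land in the wrong place. Raising MAX_BUF_SIZE alone would
therefore not be enough; the offset encoding itself would have to widen.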