* re-search-forward/backward causes a segmentation fault
@ 2003-10-08 23:28 Kenichi Handa
2003-10-11 5:37 ` Richard Stallman
0 siblings, 1 reply; 4+ messages in thread
From: Kenichi Handa @ 2003-10-08 23:28 UTC
Cc: mule-ja
I got this bug report.
----------------------------------------------------------------------
With these Emacsen:
NTEmacs 21.3(WindowsXP), NTEmacs 21.3.50(CVS Head, Windows??)
Emacs 21.2 (Zaurus)
Emacs 21.3 (RHL7.2, Debian)
Emacs 21.3.50 (CVS Head, Solaris7)
evaluating the following causes a segmentation fault.
(let* ((re "[X\xd1d8]*")
(re2 (concat re re))
(re4 (concat re2 re2))
(re8 (concat re4 re4))
(re16 (concat re8 re8)))
(re-search-backward
(concat
re "\\|" "\\(" re2 " \\|" re2 "\\|" re2 "\\)" re
"\\([X" (make-string 1816 ?\xd1d8) "]\\|" re4 "\\|" re2
"\\(" re4 "\\|" re "\\|" re4 "\\)\\|"
re4 "\\|" re4 "\\|" re4 "\\|"
re2 "\\(" re16 "\\|" re8 "\\|" re8 "\\)\\|"
re2 "\\(" re "\\|" re2 "\\|" re4 "\\)\\|"
re2 "\\(" re "\\|" re "\\|" re8 "\\|" re4 "\\(" re2 "\\)\\)\\|"
re16 re16 "\\|" re "\\(" re8 "\\|" re8 "\\)\\|"
re2 "\\|" re "\\(" re16 "\\|" re4 "\\|" re4 "\\)\\|"
re4 "\\(" re4 "\\|" re4 "\\)\\|" re "\\(" re8 "\\|" re8 "\\|"
re8 "\\(" re4 "\\|" re4 "\\)" "\\)\\|"
re2 "\\(" re "\\|" re "\\|" re2 "\\|" re2 "\\|" re2 "\\|" re "\\|"
re "\\(" re4 "\\|" re "\\)" "\\|" re16 "\\|"
re "\\(" re4 "\\|" re16 re16 "\\)\\|"
re "\\(" re16 re16 "\\|" re "\\|" re4 "\\(" re8 "\\|" re4 "\\)\\)\\|"
re "\\|" re "\\|" re "\\)\\|" re "\\(" re8 "\\|"
re2 "\\)\\|" re4 "\\|" re4 "\\|" re4 "\\|" re4 "\\|" re2 "\\|"
re2 "\\|" re2 "\\|" re2 "\\|" re2 "\\(" re "\\|" re8 "\\|"
re16 re16 re16 re8 "\\|" re8 "\\|" re2 "\\|" re2 "\\|"
re2 "\\|" re2 "\\|" re2 "\\|" re2 "\\|" re2 "\\)\\|"
(mapconcat 'identity (make-list 39 re4) "\\|") "\\|"
re "\\|" re "\\)") nil t))
This kind of giant regular expression is generated by migemo
(http://migemo.namazu.org/).
----------------------------------------------------------------------
I also confirmed the same phenomenon on:
Emacs 21.3.50 (CVS Head, Debian)
Here's the backtrace I got at that time.
Program received signal SIGABRT, Aborted.
0x4030f781 in kill () from /lib/libc.so.6
(gdb) bt 10
#0 0x4030f781 in kill () from /lib/libc.so.6
#1 0x080d964a in abort () at emacs.c:417
#2 0x0811c295 in re_match_2_internal (bufp=0x83b692c,
string1=0x8665988 ";; This buffer is for notes you don't want to save, and for Lisp evaluation.\n;; If you want to create a file, visit that file with C-x C-f,\n;; then enter the text in that file's own buffer.\n\n(let* ((r"...,
size1=1554, string2=0x8666216 "\n", size2=1, pos=1554, regs=0x83acc44,
stop=1554) at regex.c:5866
#3 0x08119044 in re_search_2 (bufp=0x83b692c,
str1=0x8665988 ";; This buffer is for notes you don't want to save, and for Lisp evaluation.\n;; If you want to create a file, visit that file with C-x C-f,\n;; then enter the text in that file's own buffer.\n\n(let* ((r"...,
size1=1554, str2=0x8666216 "\n", size2=1, startpos=1554, range=-1554,
regs=0x83acc44, stop=1554) at regex.c:4260
#4 0x08110643 in search_buffer (string=1751017996, pos=1551, pos_byte=1554,
lim=1, lim_byte=1, n=-1, RE=1, trt=-2007727184, inverse_trt=-2007707384,
posix=0) at search.c:1069
#5 0x081103dc in search_command (string=1751017996, bound=675020044,
noerror=675020092, count=675020044, direction=-1, RE=1, posix=0)
at search.c:904
#6 0x081123d4 in Fre_search_backward (regexp=1751017996, bound=675020044,
noerror=675020092, count=675020044) at search.c:2108
#7 0x08132b91 in Feval (form=-1467761176) at eval.c:2088
#8 0x08130714 in Fprogn (args=-1467763496) at eval.c:408
#9 0x08131197 in FletX (args=-1467761184) at eval.c:878
(More stack frames follow...)
(gdb) up
#1 0x080d964a in abort () at emacs.c:417
(gdb) up
#2 0x0811c295 in re_match_2_internal (bufp=0x83b692c,
string1=0x8665988 ";; This buffer is for notes you don't want to save, and for Lisp evaluation.\n;; If you want to create a file, visit that file with C-x C-f,\n;; then enter the text in that file's own buffer.\n\n(let* ((r"...,
size1=1554, string2=0x8666216 "\n", size2=1, pos=1554, regs=0x83acc44,
stop=1554) at regex.c:5866
(gdb) p p[-1]
$1 = 168 '\250' <- This is an invalid (re_opcode_t).
---
Ken'ichi HANDA
handa@m17n.org
* Re: re-search-forward/backward causes a segmentation fault
From: Richard Stallman @ 2003-10-11 5:37 UTC
Cc: emacs-devel
The regexp you showed me is too big to be handled with the current
regexp format. The bug was that regex.c thought that 2^16 bytes was
the limit. Since jump offsets are signed, really only 2^15 bytes can
be accommodated.
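To make the arithmetic concrete, here is a minimal sketch of a two-byte
signed offset being stored and read back. The helper names store_offset
and extract_offset are hypothetical and are not the code in regex.c; the
low-byte-first layout is also an assumption made for the illustration.

```c
#include <assert.h>

/* Hypothetical helper (not from regex.c): write OFFSET into two bytes,
   low byte first.  Anything above bit 15 is silently dropped. */
void store_offset(unsigned char *dst, long offset)
{
    assert(dst != 0);
    /* Converting to unsigned is well defined for negative values too. */
    unsigned long u = (unsigned long)offset;
    dst[0] = (unsigned char)(u & 0xff);
    dst[1] = (unsigned char)((u >> 8) & 0xff);
}

/* Hypothetical helper: read the two bytes back as a signed 16-bit value. */
long extract_offset(const unsigned char *src)
{
    assert(src != 0);
    unsigned int raw = (unsigned int)src[0] | ((unsigned int)src[1] << 8);
    /* Values with bit 15 set represent negative offsets. */
    return raw >= 0x8000u ? (long)raw - 0x10000L : (long)raw;
}
```

With these helpers, storing 32767 and reading it back gives 32767, but
storing 32768 reads back as -32768: a forward jump longer than 2^15 - 1
bytes turns into a backward jump, which would send the matcher into the
wrong part of the compiled pattern.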
* Re: re-search-forward/backward causes a segmentation fault
From: Kenichi Handa @ 2003-10-13 2:11 UTC
Cc: emacs-devel
In article <E1A8CRN-0001Ey-FS@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> The regexp you showed me is too big to be handled with the current
> regexp format. The bug was that regex.c thought that 2^16 bytes was
> the limit. Since jump offsets are signed, really only 2^15 bytes can
> be accommodated.
I see. So, regex_compile should check the size of offset
before storing it in a buffer for compiled code. But,
doesn't it mean that if regex_compile does that check, we
don't have to have the limit of 2^16 as below?
/* This is not an arbitrary limit: the arguments which represent offsets
into the pattern are two bytes long. So if 2^16 bytes turns out to
be too small, many things would have to change. */
/* Any other compiler which, like MSC, has allocation limit below 2^16
bytes will have to use approach similar to what was done below for
MSC and drop MAX_BUF_SIZE a bit. Otherwise you may end up
reallocating to 0 bytes. Such thing is not going to work too well.
You have been warned!! */
#if defined _MSC_VER && !defined WIN32
/* Microsoft C 16-bit versions limit malloc to approx 65512 bytes. */
# define MAX_BUF_SIZE 65500L
#else
# define MAX_BUF_SIZE (1L << 16)
#endif
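As a rough illustration of the kind of check being asked about, the
sketch below refuses to store an offset that does not fit in a signed
two-byte field. The names store_offset_checked, STORE_OK and
STORE_TOO_BIG are hypothetical, and this is only one way such a check
might look, not what regex_compile actually does.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch only. */
enum store_status { STORE_OK = 0, STORE_TOO_BIG = 1 };

/* Range of a signed two-byte offset. */
#define SKETCH_OFFSET_MIN (-32768L)
#define SKETCH_OFFSET_MAX 32767L

/* Store OFFSET in two bytes (low byte first) only if it is
   representable; otherwise leave DST untouched and report failure. */
enum store_status store_offset_checked(unsigned char *dst, long offset)
{
    assert(dst != 0);
    if (offset < SKETCH_OFFSET_MIN || offset > SKETCH_OFFSET_MAX)
        return STORE_TOO_BIG;
    unsigned long u = (unsigned long)offset;
    dst[0] = (unsigned char)(u & 0xff);
    dst[1] = (unsigned char)((u >> 8) & 0xff);
    return STORE_OK;
}
```

A caller would turn STORE_TOO_BIG into a "pattern too big" compile error
rather than emitting a jump that wraps around.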
---
Ken'ichi HANDA
handa@m17n.org
* Re: re-search-forward/backward causes a segmentation fault
From: Richard Stallman @ 2003-10-13 18:21 UTC
Cc: emacs-devel
> I see. So, regex_compile should check the size of offset
> before storing it in a buffer for compiled code. But,
> doesn't it mean that if regex_compile does that check, we
> don't have to have the limit of 2^16 as below?
regex_compile is the place that checks, and I am going to cut the
value of MAX_BUF_SIZE by 50%.
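A sketch of what the halved limit means for buffer growth, assuming the
compiled-pattern buffer is grown by doubling and clamped at the cap.
SKETCH_MAX_BUF_SIZE and next_buffer_size are hypothetical names for
illustration; the real logic lives in regex.c and may differ.

```c
#include <assert.h>

/* Half of the old 1L << 16 limit, so that the distance between any two
   positions inside the buffer stays within the signed two-byte range. */
#define SKETCH_MAX_BUF_SIZE (1L << 15)

/* Return the next allocation size for a buffer of ALLOCATED bytes,
   or 0 if the buffer is already at the cap and cannot grow. */
long next_buffer_size(long allocated)
{
    assert(allocated > 0);
    if (allocated >= SKETCH_MAX_BUF_SIZE)
        return 0;
    long doubled = allocated * 2;
    return doubled > SKETCH_MAX_BUF_SIZE ? SKETCH_MAX_BUF_SIZE : doubled;
}
```

When next_buffer_size returns 0, the compiler would report that the
pattern is too large instead of producing code it cannot address.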