From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: npostavs@users.sourceforge.net Newsgroups: gmane.emacs.bugs Subject: bug#24358: 25.1.50; re-search-forward errors with "Variable binding depth exceeds max-specpdl-size" Date: Mon, 24 Oct 2016 22:00:02 -0400 Message-ID: <878ttdrxwd.fsf@users.sourceforge.net> References: <83oa2erx0k.fsf@gnu.org> <87lgxht8hp.fsf@users.sourceforge.net> <871sz8kq2v.fsf@gmail.com> <87shroroh8.fsf@users.sourceforge.net> <838ttfpnxt.fsf@gnu.org> <83vawjo21l.fsf@gnu.org> <83bmybnopx.fsf@gnu.org> <8360ojnk0n.fsf@gnu.org> <83twc3m198.fsf@gnu.org> <83pomrlz27.fsf@gnu.org> <83k2cynabi.fsf@gnu.org> <87eg35swni.fsf@users.sourceforge.net> <83lgxd50ic.fsf@gnu.org> <8337jl4tes.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: blaine.gmane.org 1477360826 15010 195.159.176.226 (25 Oct 2016 02:00:26 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 25 Oct 2016 02:00:26 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) Cc: sam.halliday@gmail.com, 24358@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Tue Oct 25 04:00:21 2016 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1byr2C-00025k-Vx for geb-bug-gnu-emacs@m.gmane.org; Tue, 25 Oct 2016 04:00:13 +0200 Original-Received: from localhost ([::1]:51125 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1byr2F-0001rV-83 for geb-bug-gnu-emacs@m.gmane.org; Mon, 24 Oct 2016 22:00:15 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:36705) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1byr26-0001mF-NW for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 22:00:08 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1byr23-0005Qg-Kg for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 22:00:06 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:40228) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1byr23-0005Qa-H8 for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 22:00:03 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1byr23-0008Fl-7P for bug-gnu-emacs@gnu.org; Mon, 24 Oct 2016 22:00:03 -0400 X-Loop: help-debbugs@gnu.org Resent-From: npostavs@users.sourceforge.net Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 25 Oct 2016 02:00:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 24358 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch fixed Original-Received: via spool by 24358-submit@debbugs.gnu.org id=B24358.147736077431653 (code B ref 24358); Tue, 25 Oct 2016 02:00:03 +0000 Original-Received: (at 24358) by debbugs.gnu.org; 25 Oct 2016 01:59:34 +0000 Original-Received: from localhost ([127.0.0.1]:55627 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1byr1a-0008ET-3J for submit@debbugs.gnu.org; Mon, 24 Oct 2016 21:59:34 -0400 Original-Received: from mail-oi0-f53.google.com ([209.85.218.53]:35265) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1byr1Y-0008EH-3Q for 24358@debbugs.gnu.org; Mon, 24 Oct 2016 21:59:32 -0400 Original-Received: by mail-oi0-f53.google.com with SMTP id i127so14960896oia.2 for <24358@debbugs.gnu.org>; Mon, 24 Oct 2016 18:59:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=21kNctDzVDUMgaG1DbovJpyjWnZAiP9IJdLSS9bLXYA=; b=N//HzFOh7brtloAqGEvCh57I7szfSXTOI+4sXUfwq0+bbyA0dtAoFnjB1DJbUpz1P0 H9jg+Ht4GnC/DgpwNCzgZCdP7i7VF0LNJLNRPJ5zE7DNadIYCqVeISba29GJaqZFbNIT YNHmZMD//Tb1zQR42UvtbBiWI4BTRnZ0vsUMoZWfIisBie/n7XxksTPHSjvVo2GUsovM 70I2WDAUkr6TwVDptIoRV5lc2Th6mEFsI5F3UZnxdN1Obg8JR0NcF59cjaSCrJwLeWSs EyIbtDn1Cti3hMtAPLEtog5Un+uHYKLx/6odCIDgbJWZO7U50DTb0oo0BZux2XC8sX8F RsRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=21kNctDzVDUMgaG1DbovJpyjWnZAiP9IJdLSS9bLXYA=; b=awg1uAB08tktGgP/0FmIz0bktf2GdLlseno4yWxe9IsVhmgHLpL0wxwAOTrHRW8bfE eyW9DEF5dvtzolPGbYP/L1Y9WcFSvLIc22yk6JvtJr+7CkqSu5n7nK66Pd9Guu2t57/J Zhjnu2S3NR0SpMK1amgES3qKLIIk1uvbeYeZWJ/NEx649IPUkTZ/PUb+7ljCRh+ApQ/l PamLhIrEnmk39rhv1/b3udKPygFsEVdiWx3pnDYBZ4Q6XxyjStkSYSL5QUWFhANCims7 3N8XD3vt/LTOUOG5T3CORU3q5HW7fGGykdOolEbdlEYVFFOonAf7+6PAWnxMGSlEWjys gMeg== X-Gm-Message-State: ABUngvcWq2nfO8HaE0tiKnIBGoEs4NgsAntcquZcw3tLUduUVnUKOJ9NxQXEX+4g92h0fw== X-Received: by 10.107.32.10 with SMTP id g10mr16845736iog.183.1477360766312; Mon, 24 Oct 2016 18:59:26 -0700 (PDT) Original-Received: from zony ([45.2.7.130]) by smtp.googlemail.com with ESMTPSA id v197sm476382ita.0.2016.10.24.18.59.24 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 24 Oct 2016 18:59:25 -0700 (PDT) In-Reply-To: <8337jl4tes.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 24 Oct 2016 19:13:15 +0300") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:124985 Archived-At: --=-=-= Content-Type: text/plain Eli Zaretskii writes: >> Oh, I see. Yes, I think you're right, the pointers stored in regstart, >> regend, and fail_stack could become inconsistent. Hard to say what >> kind of regex could trigger it, but it seems quite possible. > > Maybe we should simply call r_alloc_inhibit_buffer_relocation around > any calls to regex.c functions, and remove the code you wrote to > handle relocations internally. How's this: --=-=-= Content-Type: text/plain Content-Disposition: attachment; filename=v4-0001-Revert-fixes-to-allocation-of-regex-matching.patch Content-Description: reverting patch >From c3e87ee1e1399f3687db02dc71b13830852a50eb Mon Sep 17 00:00:00 2001 From: Noam Postavsky Date: Mon, 24 Oct 2016 19:54:29 -0400 Subject: [PATCH v4 1/2] Revert fixes to allocation of regex matching The fix was not complete, and completing it was proving too complicated. - Revert "* src/regex.c (re_search_2): Make new code safe for -Wjump-misses-init." This reverts commit c2a17924a57483d14692c8913edbe8ad24b5ffbb. - Revert "Port to GCC 6.2.1 + --enable-gcc-warnings" This reverts commit f6134bbda259c115c06d4a9a3ab5c39340a15949. - Revert "Fix handling of allocation in regex matching" This reverts commit ad66b3fadb7ae22a4cbb82bb1507c39ceadf3897. - Revert "Fix handling of buffer relocation in regex.c functions" This reverts commit ee04aedc723b035eedaf975422d4eb242894121b. --- src/dired.c | 4 +--- src/regex.c | 73 ------------------------------------------------------------ src/regex.h | 4 +--- src/search.c | 40 ++++++++++----------------------- 4 files changed, 14 insertions(+), 107 deletions(-) diff --git a/src/dired.c b/src/dired.c index 006f74c..dba575c 100644 --- a/src/dired.c +++ b/src/dired.c @@ -259,11 +259,9 @@ directory_files_internal (Lisp_Object directory, Lisp_Object full, QUIT; bool wanted = (NILP (match) - || (re_match_object = name, - re_search (bufp, SSDATA (name), len, 0, len, 0) >= 0)); + || re_search (bufp, SSDATA (name), len, 0, len, 0) >= 0); immediate_quit = 0; - re_match_object = Qnil; /* Stop protecting name from GC. */ if (wanted) { diff --git a/src/regex.c b/src/regex.c index b12e95b..56b18e6 100644 --- a/src/regex.c +++ b/src/regex.c @@ -1438,62 +1438,11 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax) #define NEXT_FAILURE_HANDLE(h) fail_stack.stack[(h) - 3].integer #define TOP_FAILURE_HANDLE() fail_stack.frame -#ifdef emacs -# define STR_BASE_PTR(obj) \ - (NILP (obj) ? current_buffer->text->beg \ - : STRINGP (obj) ? SDATA (obj) \ - : NULL) -#else -# define STR_BASE_PTR(obj) NULL -#endif #define ENSURE_FAIL_STACK(space) \ while (REMAINING_AVAIL_SLOTS <= space) { \ - re_char *orig_base = STR_BASE_PTR (re_match_object); \ - bool might_relocate = orig_base != NULL; \ - ptrdiff_t string1_off, end1_off, end_match_1_off; \ - ptrdiff_t string2_off, end2_off, end_match_2_off; \ - ptrdiff_t d_off, dend_off, dfail_off; \ - if (might_relocate) \ - { \ - if (string1) \ - { \ - string1_off = string1 - orig_base; \ - end1_off = end1 - orig_base; \ - end_match_1_off = end_match_1 - orig_base; \ - } \ - if (string2) \ - { \ - string2_off = string2 - orig_base; \ - end2_off = end2 - orig_base; \ - end_match_2_off = end_match_2 - orig_base; \ - } \ - d_off = d - orig_base; \ - dend_off = dend - orig_base; \ - dfail_off = dfail - orig_base; \ - } \ if (!GROW_FAIL_STACK (fail_stack)) \ return -2; \ - /* In Emacs, GROW_FAIL_STACK might relocate string pointers. */ \ - if (might_relocate) \ - { \ - re_char *new_base = STR_BASE_PTR (re_match_object); \ - if (string1) \ - { \ - string1 = new_base + string1_off; \ - end1 = new_base + end1_off; \ - end_match_1 = new_base + end_match_1_off; \ - } \ - if (string2) \ - { \ - string2 = new_base + string2_off; \ - end2 = new_base + end2_off; \ - end_match_2 = new_base + end_match_2_off; \ - } \ - d = new_base + d_off; \ - dend = new_base + dend_off; \ - dfail = new_base + dfail_off; \ - } \ DEBUG_PRINT ("\n Doubled stack; size now: %zd\n", (fail_stack).size);\ DEBUG_PRINT (" slots available: %zd\n", REMAINING_AVAIL_SLOTS);\ } @@ -4380,10 +4329,6 @@ re_search_2 (struct re_pattern_buffer *bufp, const char *str1, size_t size1, /* Loop through the string, looking for a place to start matching. */ for (;;) { - ptrdiff_t offset1, offset2; - re_char *orig_base; - bool might_relocate; - /* If the pattern is anchored, skip quickly past places we cannot match. We don't bother to treat startpos == 0 specially @@ -4500,17 +4445,6 @@ re_search_2 (struct re_pattern_buffer *bufp, const char *str1, size_t size1, && !bufp->can_be_null) return -1; - /* re_match_2_internal may allocate, relocating the Lisp text - object that we're searching. */ - IF_LINT (offset2 = 0); /* Work around GCC bug 78081. */ - orig_base = STR_BASE_PTR (re_match_object); - might_relocate = orig_base != NULL; - if (might_relocate) - { - if (string1) offset1 = string1 - orig_base; - if (string2) offset2 = string2 - orig_base; - } - val = re_match_2_internal (bufp, string1, size1, string2, size2, startpos, regs, stop); @@ -4520,13 +4454,6 @@ re_search_2 (struct re_pattern_buffer *bufp, const char *str1, size_t size1, if (val == -2) return -2; - if (might_relocate) - { - re_char *new_base = STR_BASE_PTR (re_match_object); - if (string1) string1 = offset1 + new_base; - if (string2) string2 = offset2 + new_base; - } - advance: if (!range) break; diff --git a/src/regex.h b/src/regex.h index 61c771c..51f4424 100644 --- a/src/regex.h +++ b/src/regex.h @@ -169,9 +169,7 @@ extern reg_syntax_t re_syntax_options; #ifdef emacs # include "lisp.h" /* In Emacs, this is the string or buffer in which we are matching. - It is used for looking up syntax properties, and also to recompute - pointers in case the object is relocated as a side effect of - calling malloc (if it calls r_alloc_sbrk in ralloc.c). + It is used for looking up syntax properties. If the value is a Lisp string object, we are matching text in that string; if it's nil, we are matching text in the current buffer; if diff --git a/src/search.c b/src/search.c index b50e7f0..fa5ac44 100644 --- a/src/search.c +++ b/src/search.c @@ -287,10 +287,8 @@ looking_at_1 (Lisp_Object string, bool posix) immediate_quit = 1; QUIT; /* Do a pending quit right away, to avoid paradoxical behavior */ - /* Get pointers and sizes of the two strings that make up the - visible portion of the buffer. Note that we can use pointers - here, unlike in search_buffer, because we only call re_match_2 - once, after which we never use the pointers again. */ + /* Get pointers and sizes of the two strings + that make up the visible portion of the buffer. */ p1 = BEGV_ADDR; s1 = GPT_BYTE - BEGV_BYTE; @@ -409,7 +407,6 @@ string_match_1 (Lisp_Object regexp, Lisp_Object string, Lisp_Object start, (NILP (Vinhibit_changing_match_data) ? &search_regs : NULL)); immediate_quit = 0; - re_match_object = Qnil; /* Stop protecting string from GC. */ /* Set last_thing_searched only when match data is changed. */ if (NILP (Vinhibit_changing_match_data)) @@ -480,7 +477,6 @@ fast_string_match_internal (Lisp_Object regexp, Lisp_Object string, SBYTES (string), 0, SBYTES (string), 0); immediate_quit = 0; - re_match_object = Qnil; /* Stop protecting string from GC. */ return val; } @@ -568,7 +564,6 @@ fast_looking_at (Lisp_Object regexp, ptrdiff_t pos, ptrdiff_t pos_byte, len = re_match_2 (buf, (char *) p1, s1, (char *) p2, s2, pos_byte, NULL, limit_byte); immediate_quit = 0; - re_match_object = Qnil; /* Stop protecting string from GC. */ return len; } @@ -1183,8 +1178,8 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, if (RE && !(trivial_regexp_p (string) && NILP (Vsearch_spaces_regexp))) { - unsigned char *base; - ptrdiff_t off1, off2, s1, s2; + unsigned char *p1, *p2; + ptrdiff_t s1, s2; struct re_pattern_buffer *bufp; bufp = compile_pattern (string, @@ -1198,19 +1193,16 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, can take too long. */ QUIT; /* Do a pending quit right away, to avoid paradoxical behavior */ - /* Get offsets and sizes of the two strings that make up the - visible portion of the buffer. We compute offsets instead of - pointers because re_search_2 may call malloc and therefore - change the buffer text address. */ + /* Get pointers and sizes of the two strings + that make up the visible portion of the buffer. */ - base = current_buffer->text->beg; - off1 = BEGV_ADDR - base; + p1 = BEGV_ADDR; s1 = GPT_BYTE - BEGV_BYTE; - off2 = GAP_END_ADDR - base; + p2 = GAP_END_ADDR; s2 = ZV_BYTE - GPT_BYTE; if (s1 < 0) { - off2 = off1; + p2 = p1; s2 = ZV_BYTE - BEGV_BYTE; s1 = 0; } @@ -1225,16 +1217,12 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, { ptrdiff_t val; - val = re_search_2 (bufp, - (char*) (base + off1), s1, - (char*) (base + off2), s2, + val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, pos_byte - BEGV_BYTE, lim_byte - pos_byte, (NILP (Vinhibit_changing_match_data) ? &search_regs : &search_regs_1), /* Don't allow match past current point */ pos_byte - BEGV_BYTE); - /* Update 'base' due to possible relocation inside re_search_2. */ - base = current_buffer->text->beg; if (val == -2) { matcher_overflow (); @@ -1274,15 +1262,11 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, { ptrdiff_t val; - val = re_search_2 (bufp, - (char*) (base + off1), s1, - (char*) (base + off2), s2, - pos_byte - BEGV_BYTE, lim_byte - pos_byte, + val = re_search_2 (bufp, (char *) p1, s1, (char *) p2, s2, + pos_byte - BEGV_BYTE, lim_byte - pos_byte, (NILP (Vinhibit_changing_match_data) ? &search_regs : &search_regs_1), lim_byte - BEGV_BYTE); - /* Update 'base' due to possible relocation inside re_search_2. */ - base = current_buffer->text->beg; if (val == -2) { matcher_overflow (); -- 2.9.3 --=-=-= Content-Type: text/plain Content-Disposition: attachment; filename=v4-0002-Inhibit-buffer-relocation-during-regex-searches.patch Content-Description: patch >From 79806c03ad9f45d8ed016a06f5e24cf69df85ad2 Mon Sep 17 00:00:00 2001 From: Noam Postavsky Date: Mon, 24 Oct 2016 21:22:07 -0400 Subject: [PATCH v4 2/2] Inhibit buffer relocation during regex searches * src/search.c (looking_at_1, fast_looking_at, search_buffer): Prevent relocation of buffer contents during calls to re_search_2. This ensures the pointers into buffer text won't be invalidated by r_alloc_sbrk (called from malloc with configurations where REL_ALLOC=yes). --- src/search.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/src/search.c b/src/search.c index fa5ac44..15504be 100644 --- a/src/search.c +++ b/src/search.c @@ -308,12 +308,20 @@ looking_at_1 (Lisp_Object string, bool posix) re_match_object = Qnil; +#ifdef REL_ALLOC + /* Prevent ralloc.c from relocating the current buffer while + searching it. */ + r_alloc_inhibit_buffer_relocation (1); +#endif i = re_match_2 (bufp, (char *) p1, s1, (char *) p2, s2, PT_BYTE - BEGV_BYTE, (NILP (Vinhibit_changing_match_data) ? &search_regs : NULL), ZV_BYTE - BEGV_BYTE); immediate_quit = 0; +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif if (i == -2) matcher_overflow (); @@ -561,8 +569,16 @@ fast_looking_at (Lisp_Object regexp, ptrdiff_t pos, ptrdiff_t pos_byte, buf = compile_pattern (regexp, 0, Qnil, 0, multibyte); immediate_quit = 1; +#ifdef REL_ALLOC + /* Prevent ralloc.c from relocating the current buffer while + searching it. */ + r_alloc_inhibit_buffer_relocation (1); +#endif len = re_match_2 (buf, (char *) p1, s1, (char *) p2, s2, pos_byte, NULL, limit_byte); +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif immediate_quit = 0; return len; @@ -1213,6 +1229,12 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, } re_match_object = Qnil; +#ifdef REL_ALLOC + /* Prevent ralloc.c from relocating the current buffer while + searching it. */ + r_alloc_inhibit_buffer_relocation (1); +#endif + while (n < 0) { ptrdiff_t val; @@ -1254,6 +1276,9 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, else { immediate_quit = 0; +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif return (n); } n++; @@ -1296,11 +1321,17 @@ search_buffer (Lisp_Object string, ptrdiff_t pos, ptrdiff_t pos_byte, else { immediate_quit = 0; +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif return (0 - n); } n--; } immediate_quit = 0; +#ifdef REL_ALLOC + r_alloc_inhibit_buffer_relocation (0); +#endif return (pos); } else /* non-RE case */ -- 2.9.3 --=-=-= Content-Type: text/plain I did 'make extraclean && . ../config-debug.sh && make' (all in one, to ensure I didn't forget to clean) and it succeeded. > > I'm sorry I didn't look at this closer before, and thus caused you to > work on all those changes, which we might now need to revert. Yes, that's how hindsight works, you only get it when it's too late. ;) --=-=-=--