unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: npostavs@users.sourceforge.net
To: Eli Zaretskii <eliz@gnu.org>
Cc: 24751@debbugs.gnu.org
Subject: bug#24751: 26.0.50; Regex stack overflow not detected properly (gets "Variable binding depth exceeds max-specpdl-size")
Date: Sun, 13 Nov 2016 00:39:39 -0500	[thread overview]
Message-ID: <87a8d4lyzo.fsf@users.sourceforge.net> (raw)
In-Reply-To: <83zilcipcr.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 06 Nov 2016 17:45:40 +0200")

[-- Attachment #1: Type: text/plain, Size: 376 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>
> I think the patch can be simplified, where we now multiply by the size
> of fail_stack_elt_t and then divide by it: simply remove both the
> multiplication and the division.  That will make the code easier to
> read, and will make the units of each variable clear, something that I
> think is at the heart of this issue.

Ah, right.


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2147 bytes --]

From 7b24484346417c8fdf46fd7ee0be1758393f13fb Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sat, 5 Nov 2016 16:51:53 -0400
Subject: [PATCH v2] Fix computation of regex stack limit

The regex stack limit was being computed as the number of stack entries,
whereas it was being compared with the current size as measured in
bytes.  This could cause indefinite looping when nearing the stack limit
if re_max_failures happened not to be a multiple of sizeof
fail_stack_elt_t (Bug #24751).

* src/regex.c (GROW_FAIL_STACK): Compute both current stack size and
limit as numbers of stack entries.
---
 src/regex.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/regex.c b/src/regex.c
index 1c6c9e5..d23ba01 100644
--- a/src/regex.c
+++ b/src/regex.c
@@ -1319,23 +1319,20 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax)
 #define FAIL_STACK_GROWTH_FACTOR 4
 
 #define GROW_FAIL_STACK(fail_stack)					\
-  (((fail_stack).size * sizeof (fail_stack_elt_t)			\
-    >= re_max_failures * TYPICAL_FAILURE_SIZE)				\
+  (((fail_stack).size >= re_max_failures * TYPICAL_FAILURE_SIZE)        \
    ? 0									\
    : ((fail_stack).stack						\
       = REGEX_REALLOCATE_STACK ((fail_stack).stack,			\
 	  (fail_stack).size * sizeof (fail_stack_elt_t),		\
-	  min (re_max_failures * TYPICAL_FAILURE_SIZE,			\
-	       ((fail_stack).size * sizeof (fail_stack_elt_t)		\
-		* FAIL_STACK_GROWTH_FACTOR))),				\
+          min (re_max_failures * TYPICAL_FAILURE_SIZE,                  \
+               ((fail_stack).size * FAIL_STACK_GROWTH_FACTOR))          \
+          * sizeof (fail_stack_elt_t)),                                 \
 									\
       (fail_stack).stack == NULL					\
       ? 0								\
       : ((fail_stack).size						\
-	 = (min (re_max_failures * TYPICAL_FAILURE_SIZE,		\
-		 ((fail_stack).size * sizeof (fail_stack_elt_t)		\
-		  * FAIL_STACK_GROWTH_FACTOR))				\
-	    / sizeof (fail_stack_elt_t)),				\
+         = (min (re_max_failures * TYPICAL_FAILURE_SIZE,                \
+                 ((fail_stack).size * FAIL_STACK_GROWTH_FACTOR))),      \
 	 1)))
 
 
-- 
2.9.3


[-- Attachment #3: Type: text/plain, Size: 2448 bytes --]



>
>> but effectively increases the size of the failure stack (so the
>> sample file size has to be increased 8-fold to get a regex stack
>> overflow).
>
> Which IMO is exactly TRT, since re_max_failures was computed given the
> runtime stack size of 8MB, so having it bail out after merely 800KB
> doesn't sound right to me, don't you agree?

Yes, I suppose we should also try to make use of the stack, rather than
calling malloc, right?  Something like this:

diff --git i/src/regex.c w/src/regex.c
index d23ba01..dcabde5 100644
--- i/src/regex.c
+++ w/src/regex.c
@@ -447,7 +447,11 @@ init_syntax_once (void)
 #else /* not REGEX_MALLOC  */
 
 # ifdef emacs
-#  define REGEX_USE_SAFE_ALLOCA USE_SAFE_ALLOCA
+#  define REGEX_USE_SAFE_ALLOCA                                         \
+  ptrdiff_t sa_avail = re_max_failures                                  \
+    * TYPICAL_FAILURE_SIZE * sizeof (fail_stack_elt_t);                 \
+  ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
+
 #  define REGEX_SAFE_FREE() SAFE_FREE ()
 #  define REGEX_ALLOCATE SAFE_ALLOCA
 # else



>
>> Strangely, changing the value in the definition of re_max_failures
>> doesn't seem to have any effect, it stays 40000 regardless.  I am
>> quite confused.
>
> I don't think I follow.  Can you tell what you tried to change, and
> where did you see the lack of any effect?

Modifying the initial value of re_max_failures as in the below patch,
has no effect on the size of the file at which stack regex overflow
happens (I confirmed it gets compiled by adding a #warning on the
previous line).  Printing re_max_failures in gdb still shows 40000.

diff --git i/src/regex.c w/src/regex.c
index d23ba01..d9170c0 100644
--- i/src/regex.c
+++ w/src/regex.c
@@ -1249,7 +1249,7 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax)
    whose default stack limit is 2mb.  In order for a larger
    value to work reliably, you have to try to make it accord
    with the process stack limit.  */
-size_t re_max_failures = 40000;
+size_t re_max_failures = 20;
 # else
 size_t re_max_failures = 4000;
 # endif


Actually I find Emacs still compiles if I removed that line completely,
there's just a compile warning saying

    regex.o: In function `re_match_2_internal':
    /home/npostavs/src/emacs/emacs-master/lib-src/../src/regex.c:5529: warning: the 're_max_failures' variable is obsolete and will go away.

I guess there's some kind of definition of it in libc?

  reply	other threads:[~2016-11-13  5:39 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-21  3:54 bug#24751: 26.0.50; Regex stack overflow not detected properly (gets "Variable binding depth exceeds max-specpdl-size") npostavs
2016-11-04  8:22 ` Eli Zaretskii
2016-11-05 19:34   ` npostavs
2016-11-06 15:45     ` Eli Zaretskii
2016-11-13  5:39       ` npostavs [this message]
2016-11-13 16:12         ` Eli Zaretskii
2016-11-15  3:08           ` npostavs
2016-11-15 16:12             ` Eli Zaretskii
2016-11-16  1:06               ` npostavs
2016-11-16 16:25                 ` Eli Zaretskii
2016-11-16 23:25                   ` npostavs
2016-11-17 16:21                     ` Eli Zaretskii
2016-11-19 10:02                       ` Eli Zaretskii
2017-01-01 18:33                       ` npostavs
2017-01-01 18:41                         ` Eli Zaretskii
2017-01-01 18:57                           ` npostavs
2017-01-01 20:06                             ` Eli Zaretskii
2017-01-02  4:49                       ` npostavs
2017-01-02 15:24                         ` Eli Zaretskii
2017-01-02 18:30                           ` npostavs
2017-01-02 19:22                             ` Eli Zaretskii
2017-01-08 23:49                               ` npostavs

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a8d4lyzo.fsf@users.sourceforge.net \
    --to=npostavs@users.sourceforge.net \
    --cc=24751@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).