From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: npostavs@users.sourceforge.net Newsgroups: gmane.emacs.bugs Subject: bug#24751: 26.0.50; Regex stack overflow not detected properly (gets "Variable binding depth exceeds max-specpdl-size") Date: Sun, 13 Nov 2016 00:39:39 -0500 Message-ID: <87a8d4lyzo.fsf@users.sourceforge.net> References: <87twc6tl0i.fsf@users.sourceforge.net> <83h97nlknj.fsf@gnu.org> <87mvhdoh4q.fsf@users.sourceforge.net> <83zilcipcr.fsf@gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: blaine.gmane.org 1479015621 25890 195.159.176.226 (13 Nov 2016 05:40:21 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Sun, 13 Nov 2016 05:40:21 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) Cc: 24751@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Sun Nov 13 06:40:17 2016 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1c5nWV-0005Nu-5d for geb-bug-gnu-emacs@m.gmane.org; Sun, 13 Nov 2016 06:40:11 +0100 Original-Received: from localhost ([::1]:60730 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c5nWY-0006jd-34 for geb-bug-gnu-emacs@m.gmane.org; Sun, 13 Nov 2016 00:40:14 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:50241) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1c5nWP-0006hi-7t for bug-gnu-emacs@gnu.org; Sun, 13 Nov 2016 00:40:06 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1c5nWM-00071n-3b for bug-gnu-emacs@gnu.org; Sun, 13 Nov 2016 00:40:05 -0500 Original-Received: from debbugs.gnu.org ([208.118.235.43]:39958) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1c5nWL-00071Z-W4 for bug-gnu-emacs@gnu.org; Sun, 13 Nov 2016 00:40:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1c5nWL-0003q0-NN for bug-gnu-emacs@gnu.org; Sun, 13 Nov 2016 00:40:01 -0500 X-Loop: help-debbugs@gnu.org Resent-From: npostavs@users.sourceforge.net Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 13 Nov 2016 05:40:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 24751 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 24751-submit@debbugs.gnu.org id=B24751.147901554214671 (code B ref 24751); Sun, 13 Nov 2016 05:40:01 +0000 Original-Received: (at 24751) by debbugs.gnu.org; 13 Nov 2016 05:39:02 +0000 Original-Received: from localhost ([127.0.0.1]:55357 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1c5nVN-0003oL-50 for submit@debbugs.gnu.org; Sun, 13 Nov 2016 00:39:01 -0500 Original-Received: from mail-it0-f43.google.com ([209.85.214.43]:38844) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1c5nVK-0003o2-NQ for 24751@debbugs.gnu.org; Sun, 13 Nov 2016 00:38:59 -0500 Original-Received: by mail-it0-f43.google.com with SMTP id q124so51102383itd.1 for <24751@debbugs.gnu.org>; Sat, 12 Nov 2016 21:38:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=P9Y8wCVmtj1msQ1a3RokRsfAcUrmU6TjLpDazaOSYcQ=; b=s/Sofetg40SmF7S1J3fPKftM1JE8rLHBBbK9kpXoFJFAoKih2ck5LBu1VONKYmq9hY gMeZKjlScPOPWuPeApzdMk7n6BuaB0RY5EtOOZUaxjyXJk5Sip7XVjBVPDkDNV1hzkO0 rkh6coQYK3sKvCVf8yyMfb1aPryCBsGi3uW2QE4lfzxQ8iaDZ4iVMlf7yBo1pvPA39Wi wKIEclvbYRBZ/oRKhpWZKb8Rn61OZn1+j21ULsD0HJv+h1DtgdWjRj57fAk+ruFNfkQ6 KlynTAKqkJDMMGY/zk+bFcWmUSD8mJ8nTkFGrIEe+o7PoT7r8vIbzYNNTu342eQ3wv8U ds8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=P9Y8wCVmtj1msQ1a3RokRsfAcUrmU6TjLpDazaOSYcQ=; b=HiZHer6EnCFlTrbtZwavUpnbeBK5VZurilRYaUPbKgQJUU1WgCYM7m8/kFbP5Jfsed DZiItMCK/tabwMmYNglOK+Fo2dLyNWCgRcUoEYOMcXytZv5D9D3pkBfomys0Ljp+ZJ7d aHyT75nf6gN5/aB1hDcN1Lavk6qxEhf8iN9t9RZiLnGgX2f3lqKS7nlIbzkTx0X3sc07 +VuK0Voa+X0uGMyv25rt2buRkY0vWJ2b5tHKr4SiQbwbH6zlKlOfA7QWpKL4KfxLOC6A 7/xuVdlzR+HdeQDbl6oaGv8NOraFoXsEiH28xYUyezIS9L5vmGCQnMDjgpbHCot66CAf 46ag== X-Gm-Message-State: ABUngvcWBHA1cZKFSDN822y2w6W9pZdYaqiXOoHpS5QtlrbOnpBHNoi/6QOo3YlGlLUI5w== X-Received: by 10.107.175.147 with SMTP id p19mr22663887ioo.80.1479015532753; Sat, 12 Nov 2016 21:38:52 -0800 (PST) Original-Received: from zony ([45.2.7.65]) by smtp.googlemail.com with ESMTPSA id v197sm16966422ita.0.2016.11.12.21.38.51 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 12 Nov 2016 21:38:51 -0800 (PST) In-Reply-To: <83zilcipcr.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 06 Nov 2016 17:45:40 +0200") X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.org gmane.emacs.bugs:125652 Archived-At: --=-=-= Content-Type: text/plain Eli Zaretskii writes: > > I think the patch can be simplified, where we now multiply by the size > of fail_stack_elt_t and then divide by it: simply remove both the > multiplication and the division. That will make the code easier to > read, and will make the units of each variable clear, something that I > think is at the heart of this issue. Ah, right. --=-=-= Content-Type: text/plain Content-Disposition: attachment; filename=v2-0001-Fix-computation-of-regex-stack-limit.patch Content-Description: patch >From 7b24484346417c8fdf46fd7ee0be1758393f13fb Mon Sep 17 00:00:00 2001 From: Noam Postavsky Date: Sat, 5 Nov 2016 16:51:53 -0400 Subject: [PATCH v2] Fix computation of regex stack limit The regex stack limit was being computed as the number of stack entries, whereas it was being compared with the current size as measured in bytes. This could cause indefinite looping when nearing the stack limit if re_max_failures happened not to be a multiple of sizeof fail_stack_elt_t (Bug #24751). * src/regex.c (GROW_FAIL_STACK): Compute both current stack size and limit as numbers of stack entries. --- src/regex.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/src/regex.c b/src/regex.c index 1c6c9e5..d23ba01 100644 --- a/src/regex.c +++ b/src/regex.c @@ -1319,23 +1319,20 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax) #define FAIL_STACK_GROWTH_FACTOR 4 #define GROW_FAIL_STACK(fail_stack) \ - (((fail_stack).size * sizeof (fail_stack_elt_t) \ - >= re_max_failures * TYPICAL_FAILURE_SIZE) \ + (((fail_stack).size >= re_max_failures * TYPICAL_FAILURE_SIZE) \ ? 0 \ : ((fail_stack).stack \ = REGEX_REALLOCATE_STACK ((fail_stack).stack, \ (fail_stack).size * sizeof (fail_stack_elt_t), \ - min (re_max_failures * TYPICAL_FAILURE_SIZE, \ - ((fail_stack).size * sizeof (fail_stack_elt_t) \ - * FAIL_STACK_GROWTH_FACTOR))), \ + min (re_max_failures * TYPICAL_FAILURE_SIZE, \ + ((fail_stack).size * FAIL_STACK_GROWTH_FACTOR)) \ + * sizeof (fail_stack_elt_t)), \ \ (fail_stack).stack == NULL \ ? 0 \ : ((fail_stack).size \ - = (min (re_max_failures * TYPICAL_FAILURE_SIZE, \ - ((fail_stack).size * sizeof (fail_stack_elt_t) \ - * FAIL_STACK_GROWTH_FACTOR)) \ - / sizeof (fail_stack_elt_t)), \ + = (min (re_max_failures * TYPICAL_FAILURE_SIZE, \ + ((fail_stack).size * FAIL_STACK_GROWTH_FACTOR))), \ 1))) -- 2.9.3 --=-=-= Content-Type: text/plain > >> but effectively increases the size of the failure stack (so the >> sample file size has to be increased 8-fold to get a regex stack >> overflow). > > Which IMO is exactly TRT, since re_max_failures was computed given the > runtime stack size of 8MB, so having it bail out after merely 800KB > doesn't sound right to me, don't you agree? Yes, I suppose we should also try to make use of the stack, rather than calling malloc, right? Something like this: diff --git i/src/regex.c w/src/regex.c index d23ba01..dcabde5 100644 --- i/src/regex.c +++ w/src/regex.c @@ -447,7 +447,11 @@ init_syntax_once (void) #else /* not REGEX_MALLOC */ # ifdef emacs -# define REGEX_USE_SAFE_ALLOCA USE_SAFE_ALLOCA +# define REGEX_USE_SAFE_ALLOCA \ + ptrdiff_t sa_avail = re_max_failures \ + * TYPICAL_FAILURE_SIZE * sizeof (fail_stack_elt_t); \ + ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false + # define REGEX_SAFE_FREE() SAFE_FREE () # define REGEX_ALLOCATE SAFE_ALLOCA # else > >> Strangely, changing the value in the definition of re_max_failures >> doesn't seem to have any effect, it stays 40000 regardless. I am >> quite confused. > > I don't think I follow. Can you tell what you tried to change, and > where did you see the lack of any effect? Modifying the initial value of re_max_failures as in the below patch, has no effect on the size of the file at which stack regex overflow happens (I confirmed it gets compiled by adding a #warning on the previous line). Printing re_max_failures in gdb still shows 40000. diff --git i/src/regex.c w/src/regex.c index d23ba01..d9170c0 100644 --- i/src/regex.c +++ w/src/regex.c @@ -1249,7 +1249,7 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax) whose default stack limit is 2mb. In order for a larger value to work reliably, you have to try to make it accord with the process stack limit. */ -size_t re_max_failures = 40000; +size_t re_max_failures = 20; # else size_t re_max_failures = 4000; # endif Actually I find Emacs still compiles if I removed that line completely, there's just a compile warning saying regex.o: In function `re_match_2_internal': /home/npostavs/src/emacs/emacs-master/lib-src/../src/regex.c:5529: warning: the 're_max_failures' variable is obsolete and will go away. I guess there's some kind of definition of it in libc? --=-=-=--