From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: (error "Stack overflow in regexp matcher") and (?)wrong display of regexp in backtrace
Date: Sun, 15 Mar 2020 10:39:22 +0000
Message-ID: <20200315103922.GA4928@ACM>
To: emacs-devel@gnu.org

Hello, Emacs.

I'm not sure whether I've tripped over bugs or not here, so I'm posting
to emacs-devel.

1. Start the Emacs-27 pretest or master with -Q.
2. Visit the file given for bug #40052, namely:
   http://hg.openjdk.java.net/jdk/jdk/raw-file/29edf1cb3c02/src/hotspot/share/runtime/globals.hpp .
   (Note: this file is one gigantic #define macro and is thus slow to
   scroll, etc.  That is the topic of bug #40052.)
3. M-: (setq debug-on-error t)
4. Move point to the #define at line 115.
5. Type a space.

This causes a stack overflow error in the regexp engine, producing this
backtrace:

Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
  re-search-forward("\\(\\\\\\(.\\|\n\\)\\|[^\\\n\15]\\)*" nil t)
  c-before-change-check-unbalanced-strings(5717 5717)
  #f(compiled-function (fn) #)(c-before-change-check-unbalanced-strings)
  mapc(#f(compiled-function (fn) #) (c-extend-region-for-CPP c-before-change-check-raw-strings c-before-change-check-<>-operators c-depropertize-CPP c-invalidate-macro-cache c-truncate-bs-cache c-before-change-chec$
  c-before-change(5717 5717)
  self-insert-command(1 32)
  funcall-interactively(self-insert-command 1 32)
  call-interactively(self-insert-command nil nil)
  command-execute(self-insert-command)

First of all, note the regexp,

"\\(\\\\\\(.\\|\n\\)\\|[^\\\n\15]\\)*"
                             ^^^

In the source, the "\15" is "\r".  Why is this substitution being made
for the backtrace?  Is it intentional (in which case, why not do the same
to the "\n"?), or is it a bug?  To me, it is more like a bug.

More importantly, why is there a stack overflow here at all?  Even though
the regexp matcher has a long, long piece of buffer to scan over, the
regexp is a simple linear search, without any nesting to speak of.  There
would appear to be no need for any backtracking.

-- 
Alan Mackenzie (Nuremberg, Germany).
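
As an aside on the two points above, here is a minimal sketch in Python
(not Emacs Lisp, and not part of the original report).  It checks that
"\15" is just the octal spelling of the carriage return written as "\r"
in the source, and it transliterates the regexp into Python's syntax to
show what it is meant to consume: each iteration takes either a
backslash followed by any character, or one character that is not a
backslash, newline or carriage return.  The names PATTERN and
macro_line_end are hypothetical, the groups are made non-capturing for
brevity, and Python's matcher is a different engine, so this says
nothing about why Emacs runs out of regexp stack.

```python
import re

# "\15" is an octal escape: 0o15 is 13, the code point of carriage return,
# so the string shown in the backtrace has the same contents as the source.
assert "\15" == "\r"
assert ord("\15") == 0o15 == 13

# Approximate Python transliteration of the regexp from the backtrace.
# Assumption: Emacs's \( \| \) are rendered as (?: | ) here.
PATTERN = re.compile(r'(?:\\(?:.|\n)|[^\\\n\r])*')


def macro_line_end(text, start=0):
    """Return the index where the match starting at `start` stops.

    The match runs over escaped characters (including backslash-newline
    continuations) and ordinary characters, and stops at the first
    unescaped newline or carriage return, or at the end of the text.
    """
    return PATTERN.match(text, start).end()
```

For a three-line macro such as "#define A \\\n  b \\\n  c\nint x;", the
call macro_line_end(text) stops at the newline that ends the macro, so
the scan itself advances strictly left to right.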