From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#13802: stack overflow in mm-add-meta-html-tag
Date: Sun, 24 Feb 2013 21:04:21 -0500
To: Thien-Thi Nguyen
Cc: 13802@debbugs.gnu.org
References: <87wqtyrry6.fsf@zigzag.favinet>
In-Reply-To: <87wqtyrry6.fsf@zigzag.favinet> (Thien-Thi Nguyen's message of "Sun, 24 Feb 2013 10:17:53 +0100")

> I see a "Stack overflow in regexp matcher" error traceable back to
> lisp/gnus/mm-decode.el func ‘mm-add-meta-html-tag’ fragment:

> (re-search-forward "\
> text/\\(\\sw+\\)\\(?:\;\\s-*charset=\\(.+\\)\\)?[\"'][^>]*>" nil t)

Hmm... I don't see any obvious reason for a stack overflow unless the
text has some very long lines or a lot of space between elements.

> One idea (untested) is to replace the ".+" (used to match the charset)
> with a more specific pattern.  Perhaps "[^<>]+" or "\\sw+"?

I don't think that would help.  To avoid such overflow, you need to
reduce the backtracking, i.e. reduce the number of cases where two
options are possible according to the simplistic regexp-optimizer.

\s pattern is actually very poor in this respect, because the optimizer
can't know anything about the chars that this matches (since it depends
on text-properties).

The flip side is that replacing \\s- with [ \t\n] might help (this way,
the optimizer will see that the + repetition does not need backtracking
since a char cannot both match a loop iteration and the "after the loop"
content).  Similarly using [^;'\"]+ instead of \\sw+ would help, and
maybe replacing .+ with [^'\"\n]+ would help as well.

> Thinking more systematically, maybe Emacs should add a condition
> ‘stack-overflow/regexp’ (or something like that) such that code can
> ‘condition-case’ for it and try a fallback path.

In reality, such overflow should only ever happen if you have backrefs
in your regexp.


        Stefan
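
For illustration, the substitutions suggested above could be applied to the quoted fragment roughly as follows. This is only a sketch: it covers just the tail of the regexp shown in the report, the names `my/meta-content-type-tail-regexp` and `my/parse-meta-content-type` are hypothetical and not part of mm-decode.el, and it has not been checked against the input that triggered the overflow. Note also that [^;'\"]+ accepts more characters than \\sw+ does, so the match is not strictly equivalent.

```elisp
;; Hypothetical sketch, not the code in mm-decode.el: the quoted tail of the
;; regexp with syntax classes replaced by explicit character sets.
(defconst my/meta-content-type-tail-regexp
  (concat "text/\\([^;'\"]+\\)"                        ; was \\sw+
          "\\(?:;[ \t\n]*charset=\\([^'\"\n]+\\)\\)?"  ; was \\s-* and .+
          "[\"'][^>]*>")
  "Tail of the content-type regexp using explicit character sets.")

(defun my/parse-meta-content-type (string)
  "Return (SUBTYPE . CHARSET) parsed from STRING, or nil if nothing matches.
CHARSET is nil when STRING carries no charset parameter."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (when (re-search-forward my/meta-content-type-tail-regexp nil t)
      (cons (match-string 1) (match-string 2)))))

(my/parse-meta-content-type "content=\"text/html; charset=utf-8\">")
;; => ("html" . "utf-8")
```

Each repeated set here excludes the character that must follow it, which is the property described above as letting the optimizer skip backtracking for that loop.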