From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: mah@everybody.org, 61514@debbugs.gnu.org
Subject: bug#61514: 30.0.50; sadistically long xml line hangs emacs
Date: Mon, 20 Feb 2023 14:19:18 +0200
Message-ID: <831qmkwmux.fsf@gnu.org>
In-Reply-To: <jwvlektp6n7.fsf-monnier+emacs@gnu.org> (message from Stefan Monnier on Sun, 19 Feb 2023 18:48:43 -0500)
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: "Mark A. Hershberger" <mah@everybody.org>, 61514@debbugs.gnu.org
> Date: Sun, 19 Feb 2023 18:48:43 -0500
>
> > The problem is in the combination of nxml-mode and some subtle
> > bug/misfeature in our regexp routines. Specifically, when we overflow
> > the fail stack, we fail to recover in this case, and seem to infloop
> > inside re_match_2_internal, or maybe recover very inefficiently (I
> > waited for almost 1 hour before giving up). The call which causes the
> > loop is in xmltok.el, in the indicated line:
> >
> > (defun xmltok-scan-attributes ()
> >   (let ((recovering nil)
> >         (atts-needing-normalization nil))
> >     (while (cond ((or (looking-at (xmltok-attribute regexp))
> >                       ;; use non-greedy group
> >                       (when (looking-at (concat "[^<>\n]+?"  <<<<<<<<<<<<<<<<<
> >                                                 (xmltok-attribute regexp)))
> >                         (unless recovering
> >                           (xmltok-add-error "Malformed attribute"
> >                                             (point)
> >                                             (save-excursion
> >                                               (goto-char (xmltok-attribute start
> >                                                                            name))
> >                                               (skip-chars-backward "\r\n\t ")
> >                                               (point))))
> >                         t))
> >
> > The regexp that causes this is as follows:
> >
> > "[^<>\n]+?\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=\\(?:[ \r\t\n]*\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"
>
> IIUC the above describes the code where we're stuck inf-looping inside
> `looking-at`?

Not inflooping, but very slowly backtracking, or so it seems.

> Is it the same place where the regexp-stack overflow happens (and with
> the same regexp)?

It's (almost) the same place, but not the same regexp.  The regexp
which causes the stack-overflow message (which is emitted from
set-auto-mode, before entering redisplay) is this:
"\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=\\(?:[ \r\t\n]*\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"
As you can see, the prepended "[^<>\n]+?" in the regexp which "hangs"
makes all the difference.  So the looking-at which fails reasonably
quickly is the first call to looking-at above, whereas the one that
"hangs" is the second one.  Maybe this points to a way out of this
misery?
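
A minimal sketch of the difference between the two calls, under stated
assumptions: `my/attr-match-times` is a hypothetical helper (not part of
xmltok.el), and its `attr` regexp is a much-simplified stand-in for
(xmltok-attribute regexp), just a name followed by "=".  It times the
anchored match and the lazy-prefixed match on a line that contains no
"=" at all, so both must fail.

```elisp
;; Hypothetical helper for illustration only; not code from xmltok.el.
(require 'benchmark)

(defun my/attr-match-times (n)
  "Time both `looking-at' calls on a line of N characters with no \"=\".
Return a list (ANCHORED LAZY-PREFIX) of elapsed seconds."
  ;; Simplified stand-in for (xmltok-attribute regexp): NAME, optional
  ;; whitespace, then "=".
  (let ((attr "[_[:alpha:]][-._[:alnum:]]*[ \r\t\n]*="))
    (with-temp-buffer
      (insert (make-string n ?x))
      (goto-char (point-min))
      (list
       ;; First call: ATTR is tried only at point, then fails.
       (benchmark-elapse (looking-at attr))
       ;; Second call: the non-greedy prefix lets the matcher retry
       ;; ATTR after every possible prefix length before failing.
       (benchmark-elapse (looking-at (concat "[^<>\n]+?" attr)))))))
```

Evaluating (my/attr-match-times 5000) and then doubling the argument
should show the second figure growing much faster than the first, since
each extra prefix length costs another scan of the rest of the line.
The real regexp is far more complex, so this only illustrates the shape
of the problem, not the actual timings seen in this bug.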