From: miha--- via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Ioannis Kappas <ioannis.kappas@gmail.com>
Cc: Lars Ingebrigtsen <larsi@gnus.org>, 53808@debbugs.gnu.org
Subject: bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
Date: Mon, 07 Feb 2022 12:42:20 +0100
Message-ID: <86r18eucmb.fsf@miha-pc>
In-Reply-To: <CAMRHuGBMe7v+0vd1yW7BPqZzfPHo6=tSz+ejbU-6mjP8OJSioA@mail.gmail.com>
Ioannis Kappas <ioannis.kappas@gmail.com> writes:
> Thanks for looking into this! The patch looks good and reduces the
> issue considerably, but I've noticed there is still some undesired
> behaviour with non SGR CSI sequences. I was expecting the following
> test to display the non SGR `\e[a' characters verbatim in the output
> (this is in the context of the
> test/lisp/ansi-color-tests.el:ansi-color-incomplete-sequences-test()),
>
> (dolist (fun (list ansi-filt ansi-app))
>   (with-temp-buffer
>     (should (equal (funcall fun "\e[a") ""))
>     (should (equal (funcall fun "\e[33m Z \e[0m")
>                    (with-temp-buffer
>                      (concat "\e[a" (funcall fun "\e[33m Z \e[0m")))))
>     ))
>
> but fails to do so with
>
> Test ansi-color-incomplete-sequences-test condition:
> (ert-test-failed
> ((should
> (equal
> (funcall fun "\33[33m Z \33[0m")
> (with-temp-buffer ...)))
> :form
> (equal " Z " "\33[a Z ")
> :value nil :explanation
> (arrays-of-different-length 3 6 " Z " "\33[a Z " first-mismatch-at 0)))
>
> i.e. the "\e[a" seq does not appear in the output. Even before that, I
> was expecting (equal (funcall fun "\e[a") "") to fail and (equal
> (funcall fun "\e[a") "\e[a") to be true instead (as this can't be the
> start of a valid SGR expression).
>
> Is there a reason why the ansi-color library tries to match input
> against the CSI superset sequence instead of the SGR subset? The
> package appears to be dealing exclusively with the latter and using
> CSI regexps seems like an unnecessary complication to me.
It seems that filtering of non-SGR CSI sequences was introduced in the
commit from Sat May 29 14:25:00 2010 -0400
(bc8d33d540d079af28ea93a0cf8df829911044ca) to fix bug#6085. And indeed,
if I set 'ansi-color-control-seq-regexp' to the more specific SGR-only
regexp "\e\\[[0-9;]*m", I get a lot of distracting "^[[K" in the output
of "grep --color=always" on my system.
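
To illustrate the difference, here is a minimal Emacs Lisp sketch. The
`my-' names and the sample string are made up for this example. The CSI
regexp is written from the ECMA-48 grammar (parameter bytes 0x30-0x3F,
intermediate bytes 0x20-0x2F, one final byte 0x40-0x7E) and is only
assumed to be equivalent to the default value of
'ansi-color-control-seq-regexp'; the SGR-only regexp is the one quoted
above.

```elisp
;; Hypothetical names; not part of ansi-color.el.
(defconst my-csi-regexp "\e\\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]"
  "Assumed CSI matcher: parameter bytes, intermediate bytes, final byte.")

(defconst my-sgr-regexp "\e\\[[0-9;]*m"
  "SGR-only matcher: numeric parameters followed by a final m.")

(defun my-strip (regexp string)
  "Return STRING with every match of REGEXP removed."
  (let ((case-fold-search nil))         ; keep the final byte case-sensitive
    (replace-regexp-in-string regexp "" string)))

;; Illustrative input in the style of "grep --color=always": SGR
;; sequences interleaved with the non-SGR "erase in line" sequence ESC [ K.
(defconst my-sample "\e[01;31m\e[Kfoo\e[m\e[K bar")

(my-strip my-csi-regexp my-sample)   ; => "foo bar"
(my-strip my-sgr-regexp my-sample)   ; => "\e[Kfoo\e[K bar"
```

With the SGR-only regexp the two ESC [ K sequences survive, which is
what shows up as "^[[K" in the buffer.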
> (Just for reference, I'm using the terminology found in the ANSI
> escape code in wikipedia at
> https://en.wikipedia.org/w/index.php?title=ANSI_escape_code&oldid=1070369816#Description)
>
> The SGR set as I understand it is the char sequence starting with the
> ESC control character followed by the [ character followed by zero or
> more of [0-9]+; followed by [0-9]+ followed by m. For example, ESC[33m
> or ESC[3;31m. This is what I tried to capture as a fragment with the
> "\e\\(?:\\[\\|$\\)\\(?:(?:[0-9]+;?\\)*" regexp in my original patch.
I believe 'ansi-color--control-seq-fragment-regexp' should mirror
'ansi-color-control-seq-regexp' as exactly as possible. In other words,
if one matches all CSI sequences, the other shouldn't match only SGR
sequences.
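
A small sketch of what can go wrong when the two disagree. All names are
hypothetical, and both fragment regexps are my own guesses at what such
a regexp could look like (the CSI grammar without its final byte, and a
narrower SGR-only variant), not the values from any patch. The input
ends in the first part of a split non-SGR sequence such as ESC [ ? 2 5 l.

```elisp
;; Hypothetical names and regexps, for illustration only.
(defconst my-csi-fragment-at-end
  "\\(?:\e\\[[\x30-\x3F]*[\x20-\x2F]*\\|\e\\)\\'"
  "Assumed fragment matcher for any CSI sequence cut off at end of string.")

(defconst my-sgr-fragment-at-end
  "\\(?:\e\\[[0-9;]*\\|\e\\)\\'"
  "Assumed fragment matcher for an SGR sequence cut off at end of string.")

(defun my-held-back (fragment-regexp chunk)
  "Return the tail of CHUNK that FRAGMENT-REGEXP would hold back, or nil."
  (let ((case-fold-search nil))
    (when (string-match fragment-regexp chunk)
      (match-string 0 chunk))))

;; The chunk ends in the middle of a non-SGR CSI sequence.
(my-held-back my-csi-fragment-at-end "text\e[?25")  ; => "\e[?25"
(my-held-back my-sgr-fragment-at-end "text\e[?25")  ; => nil
```

With the SGR-only fragment nothing is held back, so the partial sequence
would be emitted as-is and the full CSI regexp would never get to see it
in one piece.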
> Another minor observation, perhaps the following concat could be moved
> into defconst in the interest of performance (it appears twice in the
> patch)?
>
> (let ((fragment ""))
> (push (substring string start
> - (if (string-match "\033" string start)
> + (if (string-match
> + (concat "\\(?:"
> ansi-color--control-seq-fragment-regexp "\\)\\'")
> + string start)
Thanks, noted, I will hopefully send the simple patch soon.
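
Roughly along these lines, as a sketch only: the names and the fragment
regexp value are placeholders I made up for this message, not what the
patch will necessarily contain. The point is that the anchored regexp is
built once at load time instead of on every call.

```elisp
;; Placeholder names and value; the real ones may differ.
(defconst my-fragment-regexp "\e\\[[\x30-\x3F]*[\x20-\x2F]*\\|\e"
  "Assumed matcher for the beginning of a CSI sequence.")

(defconst my-fragment-at-end-regexp
  (concat "\\(?:" my-fragment-regexp "\\)\\'")
  "The fragment matcher anchored at end of string, computed once.")

(defun my-split-trailing-fragment (chunk)
  "Split CHUNK into (COMPLETE . FRAGMENT).
FRAGMENT is a possibly incomplete control sequence at the end of
CHUNK, or the empty string if there is none."
  (let* ((case-fold-search nil)
         (pos (string-match my-fragment-at-end-regexp chunk)))
    (if pos
        (cons (substring chunk 0 pos) (substring chunk pos))
      (cons chunk ""))))

(my-split-trailing-fragment "foo\e[3")  ; => ("foo" . "\e[3")
(my-split-trailing-fragment "foo")      ; => ("foo" . "")
```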
> Best Regards
Thanks, best regards.