unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: miha--- via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: Ioannis Kappas <ioannis.kappas@gmail.com>
Cc: Lars Ingebrigtsen <larsi@gnus.org>, 53808@debbugs.gnu.org
Subject: bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
Date: Mon, 07 Feb 2022 12:42:20 +0100	[thread overview]
Message-ID: <86r18eucmb.fsf@miha-pc> (raw)
In-Reply-To: <CAMRHuGBMe7v+0vd1yW7BPqZzfPHo6=tSz+ejbU-6mjP8OJSioA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3450 bytes --]

Ioannis Kappas <ioannis.kappas@gmail.com> writes:

> Thanks for looking into this! The patch looks good and reduces the
> issue considerably, but I've noticed there is still some undesired
> behaviour with non SGR CSI sequences. I was expecting the following
> test to display the non SGR `\e[a' characters verbatim in the output
> (this is in the context of the
> test/lisp/ansi-color-tests.el:ansi-color-incomplete-sequences-test()),
>
> (dolist (fun (list ansi-filt ansi-app))
>         (with-temp-buffer
>           (should (equal (funcall fun "\e[a") ""))
>           (should (equal (funcall fun "\e[33m Z \e[0m")
>                          (with-temp-buffer
>                            (concat "\e[a" (funcall fun "\e[33m Z \e[0m")))))
>           ))
>
> but fails to do so with
>
> Test ansi-color-incomplete-sequences-test condition:
>     (ert-test-failed
>      ((should
>        (equal
>         (funcall fun "\33[33m Z \33[0m")
>         (with-temp-buffer ...)))
>       :form
>       (equal " Z " "\33[a Z ")
>       :value nil :explanation
>       (arrays-of-different-length 3 6 " Z " "\33[a Z " first-mismatch-at 0)))
>
> i.e. the "\e[a" seq does not appear in the output. Even before that, I
> was expecting  (equal (funcall fun "\e[a") "") to fail and (equal
> (funcall fun "\e[a") "\e[a") to be true instead (as this can't be the
> start of a valid SGR expression).
>
> Is there a reason why the ansi-color library tries to match input
> against the CSI superset sequence instead of the SGR subset? The
> package appears to be dealing exclusively with the latter and using
> CSI regexps seems like an unnecessary complication to me.

Seems like filtering of non-SGR CSI sequences was introduced in commit
from Sat May 29 14:25:00 2010 -0400
(bc8d33d540d079af28ea93a0cf8df829911044ca) to fix bug#6085. And indeed,
if I try to set 'ansi-color-control-seq-regexp' to the more specific
SGR-only regexp "\e\\[[0-9;]*m", I get a lot of distracting "^[[K" in
the output of "grep --color=always" on my system.

> (Just for reference, I'm using the terminology found in the ANSI
> escape code in wikipedia at
> https://en.wikipedia.org/w/index.php?title=ANSI_escape_code&oldid=1070369816#Description)
>
> The SGR set as I understand it is the char sequence starting with the
> ESC control character followed by the [ character followed by zero or
> more of [0-9]+; followed by [0-9]+ followed by m. For example, ESC[33m
> or ESC[3;31m. This is what I tried to capture as a fragment with the
> "\e\\(?:\\[\\|$\\)\\(?:(?:[0-9]+;?\\)*"  regexp in my original patch.

I believe 'ansi-color--control-seq-fragment-regexp' should mirror
'ansi-color-control-seq-regexp' as exactly as possible. In other words,
if one matches all CSI sequences, the other shouldn't match only SGR
sequences.

> Another minor observation, perhaps the following concat could be moved
> into defconst in the interest of performance (it appears twice in the
> patch)?
>
>      (let ((fragment ""))
>        (push (substring string start
> -                       (if (string-match "\033" string start)
> +                       (if (string-match
> +                            (concat "\\(?:"
> ansi-color--control-seq-fragment-regexp "\\)\\'")
> +                            string start)

Thanks, noted, I will hopefully send the simple patch soon.

> Best Regards

Thanks, best regards.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

      reply	other threads:[~2022-02-07 11:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-05 20:52 bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char Ioannis Kappas
2022-02-05 21:00 ` Ioannis Kappas
2022-02-05 21:47   ` Ioannis Kappas
2022-02-05 21:56   ` Lars Ingebrigtsen
2022-02-05 22:05     ` Ioannis Kappas
2022-02-06 20:36       ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
2022-02-06 22:55         ` Lars Ingebrigtsen
2022-02-07  7:51         ` Ioannis Kappas
2022-02-07 11:42           ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86r18eucmb.fsf@miha-pc \
    --to=bug-gnu-emacs@gnu.org \
    --cc=53808@debbugs.gnu.org \
    --cc=ioannis.kappas@gmail.com \
    --cc=larsi@gnus.org \
    --cc=miha@kamnitnik.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).