unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
@ 2022-02-05 20:52 Ioannis Kappas
  2022-02-05 21:00 ` Ioannis Kappas
  0 siblings, 1 reply; 9+ messages in thread
From: Ioannis Kappas @ 2022-02-05 20:52 UTC (permalink / raw)
  To: 53808

Hi,

There appears to be an issue with `ansi-color-apply': a stray ESC
control character in the input string can block the colorization process:

(with-temp-buffer (ansi-color-apply "a\ebc"))
;; => "a"

(with-temp-buffer (concat (ansi-color-apply "a\ebc") (ansi-color-apply "xyz")))
;; => "a"

The process is blocked after the character "a"; the rest is never
printed. It can only resume when a CSI sequence (i.e. one starting
with ESC [) appears in the stream:

(with-temp-buffer (concat (ansi-color-apply "ab\ec") (ansi-color-apply
"x\e[yz")))
;; => "ab^[cxz"

or, using a valid SGR as an example

(with-temp-buffer (concat (ansi-color-apply "ab\ec") (ansi-color-apply
"x\e[3myz")))
;; => #("ab^[cxyz" 5 7
;;      (font-lock-face italic))


This behavior can pose serious problems to applications which support
ANSI colorization of their output streams, but otherwise treat ESC as
any other control character (e.g. REPLs that colorize their output with
ansi-color but also want to display every other character). Their
output might be blocked indefinitely when an ESC character appears in
it.

My expectation is that a character sequence starting with ESC that is
not part of an SGR sequence should be output immediately, rather than
treated as a potential SGR sequence (which by definition it can never
be) that blocks further processing.

e.g.
(with-temp-buffer (ansi-color-apply "a\ebc"))
;; => "a^[bc"

Analysis to follow.

Thanks

In GNU Emacs 29.0.50 (build 1, x86_64-w64-mingw32)
Repository revision: 3a8e140ad115633791d057bd10998d80c33e6dc7
Repository branch: master
Windowing system distributor 'Microsoft Corp.', version 10






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-05 20:52 bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char Ioannis Kappas
@ 2022-02-05 21:00 ` Ioannis Kappas
  2022-02-05 21:47   ` Ioannis Kappas
  2022-02-05 21:56   ` Lars Ingebrigtsen
  0 siblings, 2 replies; 9+ messages in thread
From: Ioannis Kappas @ 2022-02-05 21:00 UTC (permalink / raw)
  To: 53808

The issue appears to be caused by the ansi-color context logic, which
tries to handle potential SGR sequences split between string fragments.
The SGR sequence is accumulated into the context until it is complete
and only then output with the rest of the input string.

But currently, the beginning of an SGR sequence is identified naively
from its first character alone (ESC, aka ^[, \033 or \e), and input is
held back until a valid control sequence is matched in the accumulated
context string, rather than checking whether the fragment can still
become a valid SGR sequence:

(defconst ansi-color-control-seq-regexp
  ;; See ECMA 48, section 5.4 "Control Sequences".
  "\e\\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]"
  "Regexp matching an ANSI control sequence.")

(defun ansi-color-apply (string)
  "Translates SGR control sequences into text properties..."
  (let* ((context
          (ansi-color--ensure-context 'ansi-color-context nil))
         (face-vec (car context))
         (start 0)
         end result)
    ;; If context was saved and is a string, prepend it.
    (setq string (concat (cadr context) string))
    (setcar (cdr context) "")
    ;; Find the next escape sequence.
    (while (setq end (string-match ansi-color-control-seq-regexp string start))
      (let ((esc-end (match-end 0)))
        ;; ...
        (push (substring string start end) result)
        (setq start (match-end 0))
        ;; ...
        ))
    ;; ...

    ;; save context, add the remainder of the string to the result
    (if (string-match "\033" string start)
        (let ((pos (match-beginning 0)))
          (setcar (cdr context) (substring string pos))
          (push (substring string start pos) result))
      (push (substring string start) result))
    (apply 'concat (nreverse result))))
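
To see the hold-back in isolation, here is a minimal sketch of just
that final "save context" step. The helper name `my-old-tail-split' is
hypothetical and not part of ansi-color.el; it only mimics the
`string-match' on "\033" quoted above and leaves out all face handling:

```elisp
;; Hypothetical helper, for illustration only (not in ansi-color.el).
(defun my-old-tail-split (string start)
  "Split STRING from START the way the quoted tail logic does.
Return (OUTPUT . PENDING): everything from the first ESC onwards is
treated as a possible sequence fragment and held back as PENDING."
  (if (string-match "\033" string start)
      (let ((pos (match-beginning 0)))
        (cons (substring string start pos) (substring string pos)))
    (cons (substring string start) "")))

(my-old-tail-split "a\ebc" 0)   ; => ("a" . "\ebc")
(my-old-tail-split "abc" 0)     ; => ("abc" . "")
```

In the first call "bc" ends up in the pending part even though "\eb"
can never grow into a control sequence.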


A solution (open to discussion) could be to identify a partial SGR
fragment based on its actual specification rather than only starting
with the ESC char:

modified   lisp/ansi-color.el
@@ -501,6 +501,19 @@ ansi-color-filter-apply
       (setcar (cdr context) fragment))
     (apply #'concat (nreverse result))))

+(defconst ansi-color--sgr-partial-regex
+  "\e\\(?:\\[\\|$\\)\\(?:(?:[0-9]+;?\\)*"
+  "A regexp for locating the beginning of a partial SGR
+  sequence.")
+
+(defun ansi-color--sgr-fragment-pos (string start)
+  "Check if STRING ends with a partial SGR sequence and return
+its position or nil otherwise. Start looking in STRING at position START."
+  (save-match-data
+    (when (and (string-match ansi-color--sgr-partial-regex string start)
+               (= (match-end 0) (1- (length string))))
+      (match-beginning 0))))
+
 (defun ansi-color-apply (string)
   "Translates SGR control sequences into text properties.
 Delete all other control sequences without processing them.
@@ -549,8 +562,8 @@ ansi-color-apply
       (put-text-property start (length string)
                          'font-lock-face face string))
     ;; save context, add the remainder of the string to the result
-    (if (string-match "\033" string start)
-        (let ((pos (match-beginning 0)))
+    (if-let ((pos (ansi-color--sgr-fragment-pos string start)))
+        (progn
           (setcar (cdr context) (substring string pos))
           (push (substring string start pos) result))
       (push (substring string start) result))

Let me know your thoughts, there is also `ansi-color-filter-apply' and
`ansi-color-filter-region' that would need similar treatment. I also
have a unit test in development.

Thanks






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-05 21:00 ` Ioannis Kappas
@ 2022-02-05 21:47   ` Ioannis Kappas
  2022-02-05 21:56   ` Lars Ingebrigtsen
  1 sibling, 0 replies; 9+ messages in thread
From: Ioannis Kappas @ 2022-02-05 21:47 UTC (permalink / raw)
  To: 53808

(sorry, I sent out the wrong patch, the correct one is

modified   lisp/ansi-color.el
@@ -501,6 +501,20 @@ ansi-color-filter-apply
       (setcar (cdr context) fragment))
     (apply #'concat (nreverse result))))

+(defconst ansi-color--sgr-partial-regex
+  "\e\\(?:\\[\\|$\\)\\(?:[0-9]+;?\\)*"
+  "A regexp for locating the beginning of a partial SGR
+  sequence.")
+
+(defun ansi-color--sgr-fragment-pos (string start)
+  "Check if STRING ends with a partial SGR sequence and return
+its position or nil otherwise. Start looking in STRING at position START."
+  (save-match-data
+    (when (and (string-match ansi-color--sgr-partial-regex string start)
+               (or (= (match-end 0) 0)
+                   (= (match-end 0) (length string))) )
+      (match-beginning 0))))
+
 (defun ansi-color-apply (string)
   "Translates SGR control sequences into text properties.
 Delete all other control sequences without processing them.
@@ -549,8 +563,8 @@ ansi-color-apply
       (put-text-property start (length string)
                          'font-lock-face face string))
     ;; save context, add the remainder of the string to the result
-    (if (string-match "\033" string start)
-        (let ((pos (match-beginning 0)))
+    (if-let ((pos (ansi-color--sgr-fragment-pos string start)))
+        (progn
           (setcar (cdr context) (substring string pos))
           (push (substring string start pos) result))
       (push (substring string start) result))



)






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-05 21:00 ` Ioannis Kappas
  2022-02-05 21:47   ` Ioannis Kappas
@ 2022-02-05 21:56   ` Lars Ingebrigtsen
  2022-02-05 22:05     ` Ioannis Kappas
  1 sibling, 1 reply; 9+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-05 21:56 UTC (permalink / raw)
  To: Ioannis Kappas; +Cc: 53808, Miha Rihtaršič

Ioannis Kappas <ioannis.kappas@gmail.com> writes:

> A solution (open to discussion) could be to identify a partial SGR
> fragment based on its actual specification rather than only starting
> with the ESC char:

Hm...  what happens if the ESC arrives in one chunk and then the rest of
the SGR sequence in the next chunk?

(Miha has done work in this area recently; added to the CCs.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-05 21:56   ` Lars Ingebrigtsen
@ 2022-02-05 22:05     ` Ioannis Kappas
  2022-02-06 20:36       ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 9+ messages in thread
From: Ioannis Kappas @ 2022-02-05 22:05 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 53808, Miha Rihtaršič

On Sat, Feb 5, 2022 at 9:56 PM Lars Ingebrigtsen <larsi@gnus.org> wrote:
>
> Ioannis Kappas <ioannis.kappas@gmail.com> writes:
>
> > A solution (open to discussion) could be to identify a partial SGR
> > fragment based on its actual specification rather than only starting
> > with the ESC char:
>
> Hm...  what happens if the ESC arrives in one chunk and then the rest of
> the SGR sequence in the next chunk?

It is handled correctly: if the concatenated sequence is an SGR, it is
output as such, i.e. all the tests in
test/lisp/ansi-color-tests.el:ansi-color-incomplete-sequences-test
still pass.
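
As a rough illustration of the split case (a toy, not the real
`ansi-color-apply'), the sketch below feeds chunks through a fragment
check adapted from the corrected patch. The names
`my-sgr-partial-regex', `my-sgr-fragment-pos' and `my-feed-chunks' are
hypothetical, and the toy only holds fragments back; it does not
translate complete SGR sequences into faces:

```elisp
;; Regexp copied from the corrected patch; the names are made up here.
(defconst my-sgr-partial-regex
  "\e\\(?:\\[\\|$\\)\\(?:[0-9]+;?\\)*"
  "Regexp for the beginning of a partial SGR sequence.")

(defun my-sgr-fragment-pos (string)
  "Return the start of a partial SGR sequence ending STRING, or nil."
  (save-match-data
    (when (and (string-match my-sgr-partial-regex string)
               (= (match-end 0) (length string)))
      (match-beginning 0))))

(defun my-feed-chunks (chunks)
  "Feed CHUNKS through a toy filter and return the emitted strings.
A trailing partial SGR sequence is held back and prepended to the
next chunk; everything else is passed through untouched."
  (let ((pending "") (out '()))
    (dolist (chunk chunks)
      (let* ((s (concat pending chunk))
             (pos (my-sgr-fragment-pos s)))
        (setq pending (if pos (substring s pos) ""))
        (push (if pos (substring s 0 pos) s) out)))
    (nreverse out)))

(my-feed-chunks '("a\e" "[33mb"))   ; => ("a" "\e[33mb")
(my-feed-chunks '("a\ebc" "xyz"))   ; => ("a\ebc" "xyz")
```

In the first call the lone ESC waits for the next chunk, so the
complete sequence arrives in one piece; in the second call nothing is
held back because "\eb" cannot start an SGR sequence.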

Here is the list of unit tests I have come up with thus far, showing
what I consider correct handling of non-SGR sequences:

(ert-deftest ansi-color-context-non-sgr ()

  (with-temp-buffer
    (let ((text (ansi-color-apply "\e[33mHello World\e[0m")))
      (should (string= "Hello World" text))
      (should (equal (get-char-property 0 'font-lock-face text)
                     '(:foreground "yellow3")))
      ))

  (with-temp-buffer
    (let ((pretext (ansi-color-apply "5"))
          (text (ansi-color-apply "\e[33mHello World\e[0m")))
      (should (string= "Hello World" text))
      (should (equal (get-char-property 0 'font-lock-face text)
                     '(:foreground "yellow3")))
      ))

  (with-temp-buffer
    (let ((pretext (ansi-color-apply "\e"))
          (text (ansi-color-apply "\e[33mHello World\e[0m")))
      (should (string= "\eHello World" text))
      (should (equal (get-char-property 1 'font-lock-face text)
                     '(:foreground "yellow3")))
      ))

  (with-temp-buffer
    (let ((pretext (ansi-color-apply "\e["))
          (text (ansi-color-apply "\e[33mHello World\e[0m")))
      (should (string= "\e[Hello World" text))
      (should (equal (get-char-property 2 'font-lock-face text)
                     '(:foreground "yellow3")))
      ))

  (with-temp-buffer
    (let ((pretext (ansi-color-apply "\e[33"))
          (text (ansi-color-apply "mHello World\e[0m")))
      (should (string= "Hello World" text))
      (should (equal (get-char-property 2 'font-lock-face text)
                     '(:foreground "yellow3")))
      ))

  (with-temp-buffer
    (let ((pretext (ansi-color-apply "\e[33m"))
          (text (ansi-color-apply "Hello World\e[0m")))
      (should (string= "Hello World" text))
      (should (equal (get-char-property 2 'font-lock-face text)
                     '(:foreground "yellow3")))
      ))
  (with-temp-buffer
    (let ((pretext (ansi-color-apply "\e[33;1"))
          (text (ansi-color-apply "mHello World\e[0m")))
      (should (string= "Hello World" text))
      (should (equal (get-char-property 2 'font-lock-face text)
                     '(ansi-color-bold (:foreground "yellow3"))))
      ))

  (with-temp-buffer
    (let ((pretext (ansi-color-apply "\e[33;"))
          (text (ansi-color-apply "1mHello World\e[0m")))
      (should (string= "Hello World" text))
      (should (equal (get-char-property 2 'font-lock-face text)
                     '(ansi-color-bold (:foreground "yellow3"))))
      ))
  )

> (Miha has done work in this area recently; added to the CCs.)

Looking forward to his feedback :) it is because of his work I've
decided to raise this against 29 instead of the 28 branch.






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-05 22:05     ` Ioannis Kappas
@ 2022-02-06 20:36       ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-06 22:55         ` Lars Ingebrigtsen
  2022-02-07  7:51         ` Ioannis Kappas
  0 siblings, 2 replies; 9+ messages in thread
From: miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-06 20:36 UTC (permalink / raw)
  To: Ioannis Kappas, Lars Ingebrigtsen; +Cc: 53808


[-- Attachment #1.1: Type: text/plain, Size: 996 bytes --]

Ioannis Kappas <ioannis.kappas@gmail.com> writes:

> It is handled correctly as expected if the concatenated sequence is an
> SGR, it is output as such, i.e. all
> test/lisp/ansi-color-tests.el:ansi-color-incomplete-sequences-test
> pass still pass.
>
> Here is the list of unit tests showing of what I consider correct
> handling of non SGR sequences I have came up with thus far
>
> (ert-deftest ansi-color-context-non-sgr ()
>
> [...]
>
>   (with-temp-buffer
>     (let ((pretext (ansi-color-apply "\e[33;"))
>           (text (ansi-color-apply "1mHello World\e[0m")))
>       (should (string= "Hello World" text))
>       (should (equal (get-char-property 2 'font-lock-face text)
>                      '(ansi-color-bold (:foreground "yellow3"))))
>       ))
>   )

Thanks. I took the liberty of working on your patch, adding support for
ansi-color-apply-on-region, ansi-color-filter-region,
ansi-color-filter-apply. I also added some tests as you suggested and
made a minor simplification.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-ansi-color-don-t-get-stuck-on-e.patch --]
[-- Type: text/x-patch, Size: 5364 bytes --]

From 162045f83154d3df7b482871b05076a92efd02f9 Mon Sep 17 00:00:00 2001
From: Ioannis Kappas <ioannis.kappas@gmail.com>
Date: Sun, 6 Feb 2022 21:25:56 +0100
Subject: [PATCH] ansi-color: don't get stuck on \e

* lisp/ansi-color.el (ansi-color--control-seq-fragment-regexp): New
constant.

(ansi-color-filter-apply):
(ansi-color-apply):
(ansi-color-filter-region):
(ansi-color-apply-on-region): Don't get stuck on \e if it is
determined that it cannot start a valid ANSI escape
sequence (Bug#53808).

* test/lisp/ansi-color-tests.el (ansi-color-incomplete-sequences-test):
Test for \e that doesn't start a valid ANSI escape sequence.
---
 lisp/ansi-color.el            | 26 ++++++++++++++++++++------
 test/lisp/ansi-color-tests.el | 20 +++++++++++++++++++-
 2 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/lisp/ansi-color.el b/lisp/ansi-color.el
index 3973d9db08..e5d2e2c4ac 100644
--- a/lisp/ansi-color.el
+++ b/lisp/ansi-color.el
@@ -347,6 +347,10 @@ ansi-color-control-seq-regexp
   "\e\\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]"
   "Regexp matching an ANSI control sequence.")
 
+(defconst ansi-color--control-seq-fragment-regexp
+  "\e\\[[\x30-\x3F]*[\x20-\x2F]*\\|\e"
+  "Regexp matching a partial ANSI control sequence.")
+
 (defconst ansi-color-parameter-regexp "\\([0-9]*\\)[m;]"
   "Regexp that matches SGR control sequence parameters.")
 
@@ -492,7 +496,9 @@ ansi-color-filter-apply
     ;; save context, add the remainder of the string to the result
     (let ((fragment ""))
       (push (substring string start
-                       (if (string-match "\033" string start)
+                       (if (string-match
+                            (concat "\\(?:" ansi-color--control-seq-fragment-regexp "\\)\\'")
+                            string start)
                            (let ((pos (match-beginning 0)))
                              (setq fragment (substring string pos))
                              pos)
@@ -549,7 +555,9 @@ ansi-color-apply
       (put-text-property start (length string)
                          'font-lock-face face string))
     ;; save context, add the remainder of the string to the result
-    (if (string-match "\033" string start)
+    (if (string-match
+         (concat "\\(?:" ansi-color--control-seq-fragment-regexp "\\)\\'")
+         string start)
         (let ((pos (match-beginning 0)))
           (setcar (cdr context) (substring string pos))
           (push (substring string start pos) result))
@@ -685,7 +693,11 @@ ansi-color-filter-region
       (while (re-search-forward ansi-color-control-seq-regexp end-marker t)
         (delete-region (match-beginning 0) (match-end 0)))
       ;; save context, add the remainder of the string to the result
-      (if (re-search-forward "\033" end-marker t)
+      (set-marker start (point))
+      (while (re-search-forward ansi-color--control-seq-fragment-regexp
+                                end-marker t))
+      (if (and (/= (point) start)
+               (= (point) end-marker))
 	  (set-marker start (match-beginning 0))
         (set-marker start nil)))))
 
@@ -742,10 +754,12 @@ ansi-color-apply-on-region
             ;; Otherwise, strip.
             (delete-region esc-beg esc-end))))
       ;; search for the possible start of a new escape sequence
-      (if (re-search-forward "\033" end-marker t)
+      (while (re-search-forward ansi-color--control-seq-fragment-regexp
+                                end-marker t))
+      (if (and (/= (point) start-marker)
+               (= (point) end-marker))
           (progn
-            (while (re-search-forward "\033" end-marker t))
-            (backward-char)
+            (goto-char (match-beginning 0))
             (funcall ansi-color-apply-face-function
                      start-marker (point)
                      (ansi-color--face-vec-face face-vec))
diff --git a/test/lisp/ansi-color-tests.el b/test/lisp/ansi-color-tests.el
index 71b706c763..2ff7fc6aaf 100644
--- a/test/lisp/ansi-color-tests.el
+++ b/test/lisp/ansi-color-tests.el
@@ -171,7 +171,25 @@ ansi-color-incomplete-sequences-test
           (insert str)
           (ansi-color-apply-on-region opoint (point))))
       (should (ansi-color-tests-equal-props
-               propertized-str (buffer-string))))))
+               propertized-str (buffer-string))))
+
+    ;; \e not followed by '[' and invalid ANSI escape sequences
+    (dolist (fun (list ansi-filt ansi-app))
+      (with-temp-buffer
+        (should (equal (funcall fun "\e") ""))
+        (should (equal (funcall fun "\e[33m test \e[0m")
+                       (with-temp-buffer
+                         (concat "\e" (funcall fun "\e[33m test \e[0m"))))))
+      (with-temp-buffer
+        (should (equal (funcall fun "\e[") ""))
+        (should (equal (funcall fun "\e[33m Z \e[0m")
+                       (with-temp-buffer
+                         (concat "\e[" (funcall fun "\e[33m Z \e[0m"))))))
+      (with-temp-buffer
+        (should (equal (funcall fun "\e a \e\e[\e[") "\e a \e\e["))
+        (should (equal (funcall fun "\e[33m Z \e[0m")
+                       (with-temp-buffer
+                         (concat "\e[" (funcall fun "\e[33m Z \e[0m")))))))))
 
 (provide 'ansi-color-tests)
 
-- 
2.34.1


[-- Attachment #1.3: Type: text/plain, Size: 32 bytes --]


Thanks again and best regards.



* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-06 20:36       ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2022-02-06 22:55         ` Lars Ingebrigtsen
  2022-02-07  7:51         ` Ioannis Kappas
  1 sibling, 0 replies; 9+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-06 22:55 UTC (permalink / raw)
  To: miha; +Cc: Ioannis Kappas, 53808

<miha@kamnitnik.top> writes:

> Thanks. I took the liberty of working on your patch, adding support for
> ansi-color-apply-on-region, ansi-color-filter-region,
> ansi-color-filter-apply. I also added some tests as you suggested and
> made a minor simplification.

Thanks; applied to Emacs 29.

Ioannis' original code was small enough to apply without an FSF
copyright assignment, so I noted the different authors in the commit
message.

Ioannis, this change was small enough to apply without assigning
copyright to the FSF, but for future patches you want to submit, it
might make sense to get the paperwork started now, so that subsequent
patches can be applied speedily. Would you be willing to sign such
paperwork?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-06 20:36       ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-02-06 22:55         ` Lars Ingebrigtsen
@ 2022-02-07  7:51         ` Ioannis Kappas
  2022-02-07 11:42           ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 9+ messages in thread
From: Ioannis Kappas @ 2022-02-07  7:51 UTC (permalink / raw)
  To: Miha Rihtaršič; +Cc: Lars Ingebrigtsen, 53808

Hi Miha,

On Sun, Feb 6, 2022 at 8:30 PM <miha@kamnitnik.top> wrote:

> Thanks. I took the liberty of working on your patch, adding support for
> ansi-color-apply-on-region, ansi-color-filter-region,
> ansi-color-filter-apply. I also added some tests as you suggested and
> made a minor simplification.
>

Thanks for looking into this! The patch looks good and reduces the
issue considerably, but I've noticed there is still some undesired
behaviour with non-SGR CSI sequences. I was expecting the following
test to display the non-SGR `\e[a' characters verbatim in the output
(this is in the context of
test/lisp/ansi-color-tests.el:ansi-color-incomplete-sequences-test):

(dolist (fun (list ansi-filt ansi-app))
        (with-temp-buffer
          (should (equal (funcall fun "\e[a") ""))
          (should (equal (funcall fun "\e[33m Z \e[0m")
                         (with-temp-buffer
                           (concat "\e[a" (funcall fun "\e[33m Z \e[0m")))))
          ))

but it fails with:

Test ansi-color-incomplete-sequences-test condition:
    (ert-test-failed
     ((should
       (equal
        (funcall fun "\33[33m Z \33[0m")
        (with-temp-buffer ...)))
      :form
      (equal " Z " "\33[a Z ")
      :value nil :explanation
      (arrays-of-different-length 3 6 " Z " "\33[a Z " first-mismatch-at 0)))

i.e. the "\e[a" sequence does not appear in the output. Even before
that, I was expecting (equal (funcall fun "\e[a") "") to fail and
(equal (funcall fun "\e[a") "\e[a") to be true instead (as this can't
be the start of a valid SGR sequence).

Is there a reason why the ansi-color library tries to match input
against the CSI superset sequence instead of the SGR subset? The
package appears to be dealing exclusively with the latter and using
CSI regexps seems like an unnecessary complication to me.

(Just for reference, I'm using the terminology found in the Wikipedia
article on ANSI escape codes at
https://en.wikipedia.org/w/index.php?title=ANSI_escape_code&oldid=1070369816#Description)

The SGR set, as I understand it, is the character sequence that starts
with the ESC control character, followed by the [ character, followed
by zero or more of [0-9]+; then [0-9]+ and finally m. For example,
ESC[33m or ESC[3;31m. This is what I tried to capture as a fragment with
the "\e\\(?:\\[\\|$\\)\\(?:(?:[0-9]+;?\\)*" regexp in my original patch.
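
Written out as a regexp for complete sequences, that reading of SGR
might look like the sketch below. `my-strict-sgr-regexp' and
`my-strict-sgr-p' are hypothetical names, and this is only the strict
reading described above (at least one numeric parameter), not what
ansi-color.el uses:

```elisp
;; Hypothetical names; a strict "ESC [ (N ;)* N m" reading of SGR.
(defconst my-strict-sgr-regexp
  "\e\\[\\(?:[0-9]+;\\)*[0-9]+m"
  "Regexp for a complete SGR sequence with at least one parameter.")

(defun my-strict-sgr-p (string)
  "Return t if STRING is exactly one SGR sequence in the strict reading."
  (and (string-match-p (concat "\\`" my-strict-sgr-regexp "\\'") string)
       t))

(my-strict-sgr-p "\e[33m")     ; => t
(my-strict-sgr-p "\e[3;31m")   ; => t
(my-strict-sgr-p "\e[a")       ; => nil
(my-strict-sgr-p "\e[m")       ; => nil (no parameter at all)
```

The last case shows one consequence of this reading: a parameterless
"\e[m" is rejected.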

Another minor observation, perhaps the following concat could be moved
into defconst in the interest of performance (it appears twice in the
patch)?

     (let ((fragment ""))
       (push (substring string start
-                       (if (string-match "\033" string start)
+                       (if (string-match
+                            (concat "\\(?:" ansi-color--control-seq-fragment-regexp "\\)\\'")
+                            string start)
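
One way the hoisting could look, as a sketch only: the `my-' names
below are hypothetical stand-ins (the first constant copies the value
of `ansi-color--control-seq-fragment-regexp' from the patch), and
`my-fragment-pos' is just there to exercise the precomputed regexp:

```elisp
;; Stand-in for `ansi-color--control-seq-fragment-regexp' (same value).
(defconst my-control-seq-fragment-regexp
  "\e\\[[\x30-\x3F]*[\x20-\x2F]*\\|\e"
  "Regexp matching a partial ANSI control sequence.")

;; The anchored variant is built once, when the defconst is evaluated.
(defconst my-control-seq-fragment-at-end-regexp
  (concat "\\(?:" my-control-seq-fragment-regexp "\\)\\'")
  "Regexp matching a partial control sequence at the end of a string.")

(defun my-fragment-pos (string start)
  "Return the start of a partial control sequence ending STRING, or nil.
Searching begins at START."
  (when (string-match my-control-seq-fragment-at-end-regexp string start)
    (match-beginning 0)))

(my-fragment-pos "abc\e[33" 0)   ; => 3
(my-fragment-pos "a\ebc" 0)      ; => nil
```

Since `defconst' evaluates its value form once at load time, the
`concat' no longer runs on every call.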

Best Regards






* bug#53808: 29.0.50; ansi colorization process could block indefinetly on stray ESC char
  2022-02-07  7:51         ` Ioannis Kappas
@ 2022-02-07 11:42           ` miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 9+ messages in thread
From: miha--- via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2022-02-07 11:42 UTC (permalink / raw)
  To: Ioannis Kappas; +Cc: Lars Ingebrigtsen, 53808

[-- Attachment #1: Type: text/plain, Size: 3450 bytes --]

Ioannis Kappas <ioannis.kappas@gmail.com> writes:

> Thanks for looking into this! The patch looks good and reduces the
> issue considerably, but I've noticed there is still some undesired
> behaviour with non SGR CSI sequences. I was expecting the following
> test to display the non SGR `\e[a' characters verbatim in the output
> (this is in the context of the
> test/lisp/ansi-color-tests.el:ansi-color-incomplete-sequences-test()),
>
> (dolist (fun (list ansi-filt ansi-app))
>         (with-temp-buffer
>           (should (equal (funcall fun "\e[a") ""))
>           (should (equal (funcall fun "\e[33m Z \e[0m")
>                          (with-temp-buffer
>                            (concat "\e[a" (funcall fun "\e[33m Z \e[0m")))))
>           ))
>
> but fails to do so with
>
> Test ansi-color-incomplete-sequences-test condition:
>     (ert-test-failed
>      ((should
>        (equal
>         (funcall fun "\33[33m Z \33[0m")
>         (with-temp-buffer ...)))
>       :form
>       (equal " Z " "\33[a Z ")
>       :value nil :explanation
>       (arrays-of-different-length 3 6 " Z " "\33[a Z " first-mismatch-at 0)))
>
> i.e. the "\e[a" seq does not appear in the output. Even before that, I
> was expecting  (equal (funcall fun "\e[a") "") to fail and (equal
> (funcall fun "\e[a") "\e[a") to be true instead (as this can't be the
> start of a valid SGR expression).
>
> Is there a reason why the ansi-color library tries to match input
> against the CSI superset sequence instead of the SGR subset? The
> package appears to be dealing exclusively with the latter and using
> CSI regexps seems like an unnecessary complication to me.

It seems that filtering of non-SGR CSI sequences was introduced in the
commit from Sat May 29 14:25:00 2010 -0400
(bc8d33d540d079af28ea93a0cf8df829911044ca) to fix bug#6085. And indeed,
if I try to set 'ansi-color-control-seq-regexp' to the more specific
SGR-only regexp "\e\\[[0-9;]*m", I get a lot of distracting "^[[K" in
the output of "grep --color=always" on my system.
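
The difference can be reproduced on a small string. This is only a
sketch: `my-strip' is a hypothetical helper, and the sample is an
assumed stand-in for grep-like output that mixes SGR sequences with
the non-SGR "\e[K" sequence:

```elisp
;; Hypothetical helper: delete every match of REGEXP from STRING.
(defun my-strip (regexp string)
  "Return STRING with all matches of REGEXP removed."
  (replace-regexp-in-string regexp "" string))

(let ((csi-regexp "\e\\[[\x30-\x3F]*[\x20-\x2F]*[\x40-\x7E]") ; all CSI
      (sgr-regexp "\e\\[[0-9;]*m")                            ; SGR only
      (sample "\e[01;31m\e[Kmatch\e[m\e[K"))                  ; assumed input
  (list (my-strip csi-regexp sample)
        (my-strip sgr-regexp sample)))
;; => ("match" "\e[Kmatch\e[K")
```

With the SGR-only regexp the "\e[K" sequences survive, which is what
would show up as "^[[K" in a buffer.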

> (Just for reference, I'm using the terminology found in the ANSI
> escape code in wikipedia at
> https://en.wikipedia.org/w/index.php?title=ANSI_escape_code&oldid=1070369816#Description)
>
> The SGR set as I understand it is the char sequence starting with the
> ESC control character followed by the [ character followed by zero or
> more of [0-9]+; followed by [0-9]+ followed by m. For example, ESC[33m
> or ESC[3;31m. This is what I tried to capture as a fragment with the
> "\e\\(?:\\[\\|$\\)\\(?:(?:[0-9]+;?\\)*"  regexp in my original patch.

I believe 'ansi-color--control-seq-fragment-regexp' should mirror
'ansi-color-control-seq-regexp' as exactly as possible. In other words,
if one matches all CSI sequences, the other shouldn't match only SGR
sequences.

> Another minor observation, perhaps the following concat could be moved
> into defconst in the interest of performance (it appears twice in the
> patch)?
>
>      (let ((fragment ""))
>        (push (substring string start
> -                       (if (string-match "\033" string start)
> +                       (if (string-match
> +                            (concat "\\(?:" ansi-color--control-seq-fragment-regexp "\\)\\'")
> +                            string start)

Thanks, noted, I will hopefully send the simple patch soon.

> Best Regards

Thanks, best regards.




Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
