From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Brian D. Carlstrom"
Newsgroups: gmane.emacs.devel
Subject: Re: Man-fontify-manpage does not handle man, version 1.5o1, ANSI escape sequences
Date: Tue, 30 Nov 2004 00:40:07 -0800
Message-ID: <16812.12775.164457.262301@zot.electricrain.com>
Reply-To: "Brian D. Carlstrom"
Original-To: Stefan Monnier
Cc: emacs-devel@gnu.org
X-Mailer: VM 7.18 under Emacs 21.3.1
Xref: main.gmane.org gmane.emacs.devel:30521

Stefan Monnier writes:
> > Please describe exactly what actions triggered the bug
> > and the precise symptoms of the bug:
>
> > My GNU/Linux system recently had several upgrades:
> >     kernel upgraded to 2.6.9
> >     glibc upgraded to 2.3.3
> >     man upgraded to 1.5o1
> >     (other unknown upgrades, I'd have to ask administrator)
>
> > Since then my M-x man output has been full of ANSI escape sequences that
> > weren't previously there. I traced this to the fact that
> > Man-fontify-manpage assumes that the ANSI sequences will be terminated
> > by "\e[0m". However, the new "man" output uses more specific attribute
> > termination sequences. For example:
>
> >     bold      "\e[22m"
> >     underline "\e[24m"
> >     reverse   "\e[27m"
>
> > I append a fix below. Basically I pull out the code that previously only
> > handled ANSI bold sequences and replace it with a new function
> > Man-fontify-manpage-ANSI that I call from Man-fontify-manpage to handle
> > bold, underlining, and reverse video.
>
> Can you try the patch below instead? It tries to handle the case where the
> code does \e[1m...\e[4m....\e[0m where the 0 turns off both bold
> and underline.

This patch did not work as is. Here is a list of issues and
fixes/workarounds. I append my new working version at the end.

1. The regexp should be:

       "\e\\[\\([1470]\\|2\\([247]\\)\\)m"

   Specifically, I changed \\e to \e and [ to \\[. Actually, "3." below
   requires a further change.

2. The digit characters need to be converted to the numbers they
   represent (I'm not sure it's safe to assume ASCII). char-after
   returns the character code, for example 49 for ?1, so none of the
   numeric case clauses ever matched. So I changed these expressions

       (char-after (match-beginning 2))
       (char-after (match-beginning 1))

   to

       (- (char-after (match-beginning 2)) ?0)
       (- (char-after (match-beginning 1)) ?0)

   I am not at all confident in the multi-byte character set portability
   of this.

3. The old code lumped the "clear all attributes" code \e[0m in with the
   [147] codes marking the start of attributes:

> +	  (while (re-search-forward "\\e[\\([1470]\\|2\\([247]\\)\\)m" nil t)
...
> +	    (cons (case (char-after (match-beginning 1))
> +		    (1 Man-overstrike-face)
> +		    (4 Man-underline-face)
> +		    (7 Man-reverse-face)
> +		    (t nil))
> +		  faces)))

   However, this led to the faces list being filled with nils. I instead
   break out the \e[0m case and use it to set faces to nil.

       (while (re-search-forward "\e\\[\\([147]\\|\\(0\\)\\|2\\([247]\\)\\)m" nil t)
         (if faces (put-text-property start (match-beginning 0) 'face faces))
         (setq start (match-beginning 0))
         (setq faces
               (cond
                ((match-beginning 3)
                 (message "before case 3 %s" faces)
                 (delq (case (- (char-after (match-beginning 3)) ?0)
                         (2 Man-overstrike-face)
                         (4 Man-underline-face)
                         (7 Man-reverse-face)
                         (t (error "Unexpected case 3")))
                       faces))
                ((match-beginning 2)
                 (message "before case 2 %s" faces)
                 nil)
                ((match-beginning 1)
                 (message "before case 1 %s" faces)
                 (cons (case (- (char-after (match-beginning 1)) ?0)
                         (1 Man-overstrike-face)
                         (4 Man-underline-face)
                         (7 Man-reverse-face)
                         (t (error "Unexpected case 1 %s"
                                   (- (char-after (match-beginning 1)) ?0))))
                       faces))
                (t (error "Unexpected case"))))

4. The "start" value is never advanced, so the attributes would always
   span from the start of the buffer. I added this to advance the
   pointer:

       (setq start (match-beginning 0))

I will say that your approach is more bulletproof in terms of handling
overlapping escape sequences but a little harder to read.
I guess I might just comment the three cases and the magic constants,
which were more self-explanatory in my original version.

> BTW, is it right that bold is turned on with \e[1m and turned off with
> \e[22m? It seems odd that it isn't \e[21m to turn it off or \e[2m to turn
> it on, seeing how the other fit the \e[Nm and \e[2Nm rule.

Yeah, I noticed that "almost" rule. Here is the reference I used:

    http://www.catalyst.com/support/help/cstools3/visual/terminal/escapeseq.html

    [nm   Select display attributes and color

          n Value   Description
          0         Reset to default attributes and color
          1         Bold attribute
          2         Dim attribute
          4         Underline attribute
          5         Blink attribute (ignored)
          7         Reverse attribute
          8         Hidden attribute
          22        Clear bold attribute
          24        Clear underline attribute
          25        Clear blink attribute (ignored)
          27        Clear reverse attribute

Here is another corroborating reference:

    http://www.isthe.com/chongo/tech/comp/ansi_escapes.html

    00   for normal display (or just 0)
    01   for bold on (or just 1)
    02   faint (or just 2)
    03   standout (or just 3)
    04   underline (or just 4)
    05   blink on (or just 5)
    07   reverse video on (or just 7)
    08   nondisplayed (invisible) (or just 8)
    22   normal
    23   no-standout
    24   no-underline
    25   no-blink
    27   no-reverse

Note that the second reference says the single-digit codes can actually
have leading zeros, which the code we are discussing doesn't handle but
would be easy to fix... (sorry, I might have fixed it if I had noticed
this first)

-bri

;; Fontify ANSI escapes.
(let ((faces nil)
      (start (point))
      code)
  (while (re-search-forward "\e\\[\\([147]\\|\\(0\\)\\|2\\([247]\\)\\)m" nil t)
    ;; Apply the faces active so far to the text before this escape.
    (if faces (put-text-property start (match-beginning 0) 'face faces))
    (setq start (match-beginning 0))
    (setq faces
          (cond
           ;; Group 3: \e[22m, \e[24m, \e[27m turn one attribute off.
           ((match-beginning 3)
            (message "before case 3 %s" faces)
            (delq (case (- (char-after (match-beginning 3)) ?0)
                    (2 Man-overstrike-face)
                    (4 Man-underline-face)
                    (7 Man-reverse-face)
                    (t (error "Unexpected case 3")))
                  faces))
           ;; Group 2: \e[0m clears all attributes.
           ((match-beginning 2)
            (message "before case 2 %s" faces)
            nil)
           ;; Group 1 (set on every match, so tested last):
           ;; \e[1m, \e[4m, \e[7m turn one attribute on.
           ((match-beginning 1)
            (message "before case 1 %s" faces)
            (cons (case (- (char-after (match-beginning 1)) ?0)
                    (1 Man-overstrike-face)
                    (4 Man-underline-face)
                    (7 Man-reverse-face)
                    (t (error "Unexpected case 1 %s"
                              (- (char-after (match-beginning 1)) ?0))))
                  faces))
           (t (error "Unexpected case"))))
    ;; Remove the escape itself; START now points at the text after it.
    (delete-region (match-beginning 0) (match-end 0))))
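
For what it's worth, here is a minimal sketch of how the leading-zero
case mentioned above might be handled. It is not the patch under
discussion and not part of man.el: the names sgr-sketch-on-alist,
sgr-sketch-off-alist and sgr-sketch-fontify are hypothetical, and the
faces bold, underline and highlight are stand-ins I assume here in place
of the Man-*-face variables. Instead of matching single digits it
matches any run of digits and converts it with string-to-number, so
"01" and "1" are treated alike. Unknown codes are simply stripped, and
compound sequences such as \e[1;4m are not handled.

```elisp
;; Hypothetical tables (assumptions, not taken from man.el).
(defvar sgr-sketch-on-alist
  '((1 . bold) (4 . underline) (7 . highlight))
  "Assumed mapping from \"attribute on\" codes to faces.")

(defvar sgr-sketch-off-alist
  '((22 . bold) (24 . underline) (27 . highlight))
  "Assumed mapping from \"attribute off\" codes to faces.")

(defun sgr-sketch-fontify ()
  "Replace \\e[Nm escapes after point with `face' text properties."
  (let ((faces nil)
        (start (point)))
    (while (re-search-forward "\e\\[\\([0-9]+\\)m" nil t)
      ;; string-to-number ignores leading zeros: "01" => 1, "00" => 0.
      (let ((code (string-to-number (match-string 1)))
            (beg (match-beginning 0)))
        ;; Apply the faces active so far to the text before this escape.
        (when faces
          (put-text-property start beg 'face faces))
        ;; Remove the escape; the text after it now begins at BEG.
        (delete-region beg (match-end 0))
        (setq start beg)
        (setq faces
              (cond
               ;; 0 resets every attribute.
               ((= code 0) nil)
               ;; 1, 4, 7 switch one attribute on.
               ((assq code sgr-sketch-on-alist)
                (cons (cdr (assq code sgr-sketch-on-alist)) faces))
               ;; 22, 24, 27 switch one attribute off.
               ((assq code sgr-sketch-off-alist)
                (remq (cdr (assq code sgr-sketch-off-alist)) faces))
               ;; Anything else leaves the active faces unchanged.
               (t faces)))))))
```

To try it, insert some text containing escapes into a scratch buffer,
move point to the beginning, and evaluate (sgr-sketch-fontify). Using
remq rather than delq keeps the face lists already stored as text
properties from being modified behind our back.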