From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#27270: display-raw-bytes-as-hex generates ambiguous output for Emacs strings
Date: Sun, 11 Jun 2017 17:48:04 +0300
Message-ID: <83r2yq60nv.fsf@gnu.org>
References: <29d6844f-2f6f-11c1-7877-a9d169e613f8@cs.ucla.edu>
 <83tw3s8jhr.fsf@gnu.org>
 <1c05b888-0c4a-05c8-248a-6e550637fff4@cs.ucla.edu>
 <8737bbxp6a.fsf@users.sourceforge.net>
 <2d5a8cd8-0884-bc1e-4298-a84dca61acbf@cs.ucla.edu>
 <831squ8no8.fsf@gnu.org>
 <93d9c575-4eb2-ea9e-d998-a8f3cff33a1e@cs.ucla.edu>
 <83y3t271ar.fsf@gnu.org> <83shja6yoq.fsf@gnu.org>
 <83r2yt7lad.fsf@gnu.org>
 <2202b54b-606f-0a10-abf7-5cb1a9164897@cs.ucla.edu>
 <83h8zo71au.fsf@gnu.org>
Cc: 27270@debbugs.gnu.org, v.schneidermann@gmail.com, npostavs@users.sourceforge.net
To: Paul Eggert
In-reply-to: (message from Paul Eggert on Sat, 10 Jun 2017 17:04:40 -0700)
Xref: news.gmane.org gmane.emacs.bugs:133473

> Cc: npostavs@users.sourceforge.net, 27270@debbugs.gnu.org,
>  v.schneidermann@gmail.com
> From: Paul Eggert
> Date: Sat, 10 Jun 2017 17:04:40 -0700
>
> On 06/10/2017 12:24 AM, Eli Zaretskii wrote:
> > So your proposal would mean a change to the Lisp reader to support
> > such escapes, right?  If so, isn't such a change
> > backward-incompatible?
>
> Yes, but only in the sense that undocumented escapes evaluate to
> themselves, e.g., "\F" is currently the same as "F" in Emacs Lisp
> because there is no escape sequence \F currently defined for character
> constants. But there's nothing new here, e.g., when we added "\N{...}"
> last year we changed the interpretation of the formerly-undocumented \N
> escape.

Then maybe the new hex display should use the \N{U+nnn} format?

> >> Also, display-raw-bytes-as-hex would cause raw bytes to be displayed with this
> >> new X escape, rather than with with the x escape.
> > It could only do that for codepoints below 256 decimal, so that
> > limitation should be taken into account when deciding on the proposal.
>
> Ouch, I hadn't thought of that.
>
> Wait -- doesn't that mean that "display-raw-bytes-as-hex" is a
> misleading name, because it affects the display not only of raw bytes,
> but of other undisplayable characters?

That's true, but since the chances of a _user_ changing the
printable-chars char-table are pretty slim, I didn't think it was
justified to obfuscate the name.

> Shouldn't we change its name to
> something more generic and more accurate, like "display-characters-as-hex"?

Codepoints whose printable-chars entry is nil cannot in good faith be
called "characters", IMO.  "Codepoints", maybe?  But again, that makes
the discoverability harder, so I'm not sure it's worth the hassle.

> Anyway, to address the point you raised: how about a different idea? We
> extend the existing \x syntax in strings so that \x{dddd} has the same
> meaning as "\xdddd", except that the "}" terminates the escape. This
> syntax is used by Perl and so is in the same family as \N{...}. We also
> change display-raw-bytes-as-hex to use this new syntax when a character
> is immediately followed by a hexadecimal digit. That way, most
> characters are displayed as before, but my problematic example is
> displayed as "x\x{90}5y", which is a good visual cue of the unusual
> situation.

See above: why not \N{U+...}?  The only downside is that it's much
longer than \xNN.  Could be another option, perhaps.
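
For illustration only, the three display styles discussed in this
message can be modeled with a short Python sketch.  This is not Emacs
code.  The names render and needs_escape are hypothetical; treating
only U+0080..U+009F as undisplayable is a simplifying assumption
(Emacs consults the printable-chars char-table); and \x{...} is the
syntax proposed in this thread, not something the Lisp reader is known
to accept.  Whether \N{U+...} would apply to every escaped codepoint
or only to the ambiguous ones is not settled above, so the sketch
applies it everywhere.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def needs_escape(ch: str) -> bool:
    """Assumption: only C1 controls (U+0080..U+009F) are undisplayable."""
    return 0x80 <= ord(ch) <= 0x9F


def render(text: str, style: str = "plain") -> str:
    """Render TEXT, showing undisplayable codepoints as hex escapes.

    style "plain":   always \\xNN (the ambiguous display from the bug report)
    style "braces":  \\x{NN} only when the next character is a hex digit
    style "unicode": \\N{U+NN} for every escaped codepoint
    """
    out = []
    for i, ch in enumerate(text):
        if not needs_escape(ch):
            out.append(ch)
            continue
        code = ord(ch)
        next_is_hex = i + 1 < len(text) and text[i + 1] in HEX_DIGITS
        if style == "unicode":
            out.append("\\N{U+%X}" % code)
        elif style == "braces" and next_is_hex:
            out.append("\\x{%x}" % code)
        else:
            out.append("\\x%x" % code)
    return "".join(out)


# The problematic example: 'x', U+0090, '5', 'y'.
sample = "x" + chr(0x90) + "5y"

for style in ("plain", "braces", "unicode"):
    print(style, render(sample, style))
```

With this sample, the plain style produces x\x905y, which reads as if
it contained U+0905, while the other two styles keep the trailing "5"
visibly separate from the escape.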