unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Alan Mackenzie <acm@muc.de>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel@gnu.org
Subject: Re: Fwd: Re: Inadequate documentation of silly characters on screen.
Date: Thu, 19 Nov 2009 08:20:40 +0000	[thread overview]
Message-ID: <20091119082040.GA1720@muc.de> (raw)
In-Reply-To: <jwvlji3fgzi.fsf-monnier+emacs@gnu.org>

Morning, Stefan!

On Wed, Nov 18, 2009 at 08:27:24PM -0500, Stefan Monnier wrote:

> The integer 241 is used to represent the char ?ñ, but it's also used for
> many other things, one of them being to represent the byte 241 (tho such
> a byte can also be represented as the integer 4194289).

> Now strings come in two flavors: multibyte (i.e. sequences of chars) and
> unibyte (i.e. sequences of bytes).  So when you do:

>    M-: (setq nl "\n")
>    M-: (aset nl 0 ?ñ)
>    M-: (insert nl)

> The `aset' part may do two different things depending on whether `nl' is
> unibyte or multibyte: it will either insert the char ?ñ or the byte 241.
> In the above code the "\n" is taken as a unibyte string, tho I'm not
> sure why we made this arbitrary choice.

The above sequence "works" in Emacs 22.3, in the sense that "ñ" gets
displayed - when I do M-: (aset nl 0 ?ñ), I get

   "2289 (#o4361, #x8f1)" (Emacs 22.3)
   "241 (#o361, #xf1)"    (Emacs 23.1)

displayed in the echo area.  So my `aset' invocation is trying to write a
multibyte ?ñ into a unibyte ?\n, and gets truncated from #x8f1 to #xf1 in
the process.  Surely this behaviour in Emacs 23.1 is a bug?  Shouldn't we
fix it before the pretest?  How about interpreting "\n" and friends as
multibyte or unibyte according to the prevailing flavour?

> If you give us more context (i.e. more of the real code where the
> problem show up), maybe we can tell you how to avoid it.

OK.  I have my own routine to display regexps.  As a first step, I
translate \n -> ñ, (and \t, \r, \f similarly).  This is how:

    (defun translate-rnt (regexp)
      "REGEXP is a string.  Translate any \t \n \r and \f characters
    to wierd non-ASCII printable characters: \t to Î (206, \xCE), \n
    to ñ (241, \xF1), \r to ® (174, \xAE) and \f to £ (163, \xA3).
    The original string is modified."
      (let (ch pos)
        (while (setq pos (string-match "[\t\n\r\f]" regexp))
          (setq ch (aref regexp pos))
          (aset regexp pos                        ; <===================
                (cond ((eq ch ?\t) ?Î)
                      ((eq ch ?\n) ?ñ)
                      ((eq ch ?\r) ?®)
                      (t           ?£))))
        regexp))



> Usually, I recommend to stay away from `aset' on strings for various
> reasons, and it seems that it also helps avoid those tricky issues (tho
> it doesn't protect you from them completely).

Again, surely this is a bug?  These tricky issues should be dealt with in
the lisp interpreter in a way that lisp hackers don't have to worry
about.  Why do we have both unibyte and multibyte?  Is there any reason
not to remove unibyte altogether (though obviously not for 23.2).

What was the change between 22.3 and 23.1 that broke my code?  Would it,
perhaps, be a good idea to reconsider that change?

>         Stefan

-- 
Alan Mackenzie (Nurmberg, Germany).




  reply	other threads:[~2009-11-19  8:20 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-11-18 19:12 [acm@muc.de: Re: Inadequate documentation of silly characters on screen.] Alan Mackenzie
2009-11-19  1:27 ` Fwd: Re: Inadequate documentation of silly characters on screen Stefan Monnier
2009-11-19  8:20   ` Alan Mackenzie [this message]
2009-11-19  8:50     ` Miles Bader
2009-11-19 10:16     ` Fwd: " Andreas Schwab
2009-11-19 12:21       ` Alan Mackenzie
2009-11-19 13:21       ` Jason Rumney
2009-11-19 13:35         ` Stefan Monnier
2009-11-19 14:18         ` Alan Mackenzie
2009-11-19 14:58           ` Jason Rumney
2009-11-19 15:42             ` Alan Mackenzie
2009-11-19 19:39               ` Eli Zaretskii
2009-11-19 15:30           ` Stefan Monnier
2009-11-19 15:58             ` Alan Mackenzie
2009-11-19 16:06               ` Andreas Schwab
2009-11-19 16:47               ` Aidan Kehoe
2009-11-19 17:29                 ` Alan Mackenzie
2009-11-19 18:21                   ` Aidan Kehoe
2009-11-20  2:43                   ` Stephen J. Turnbull
2009-11-19 19:45                 ` Eli Zaretskii
2009-11-19 20:07                   ` Eli Zaretskii
2009-11-19 19:55                 ` Stefan Monnier
2009-11-20  3:13                   ` Stephen J. Turnbull
2009-11-19 16:55               ` David Kastrup
2009-11-19 18:08                 ` Alan Mackenzie
2009-11-19 19:25                   ` Davis Herring
2009-11-19 21:25                     ` Alan Mackenzie
2009-11-19 22:31                       ` David Kastrup
2009-11-21 22:52                         ` Richard Stallman
2009-11-23  2:08                           ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Stefan Monnier
2009-11-23 20:38                             ` Richard Stallman
2009-11-23 21:34                               ` Per Starbäck
2009-11-24 22:47                                 ` Richard Stallman
2009-11-25  1:33                                   ` Kenichi Handa
2009-11-25  2:29                                     ` Displaying bytes (was: Inadequate documentation of silly Stefan Monnier
2009-11-25  2:50                                       ` Lennart Borgman
2009-11-25  6:25                                       ` Stephen J. Turnbull
2009-11-25  5:40                                     ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Ulrich Mueller
2009-11-26 22:59                                       ` Displaying bytes Reiner Steib
2009-11-27  0:16                                         ` Ulrich Mueller
2009-11-27  1:41                                         ` Stefan Monnier
2009-11-27  4:14                                         ` Stephen J. Turnbull
2009-11-25  5:59                                     ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Stephen J. Turnbull
2009-11-25  8:16                                       ` Kenichi Handa
2009-11-29 16:01                                     ` Richard Stallman
2009-11-29 16:31                                       ` Displaying bytes (was: Inadequate documentation of silly Stefan Monnier
2009-11-29 22:01                                         ` Juri Linkov
2009-11-30  6:05                                           ` tomas
2009-11-30 12:09                                             ` Andreas Schwab
2009-11-30 12:39                                               ` tomas
2009-11-29 22:19                                       ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Kim F. Storm
2009-11-30  1:42                                         ` Stephen J. Turnbull
2009-11-24  1:28                               ` Displaying bytes Stefan Monnier
2009-11-24 22:47                                 ` Richard Stallman
2009-11-25  2:18                                   ` Stefan Monnier
2009-11-26  6:24                                     ` Richard Stallman
2009-11-26  8:59                                       ` David Kastrup
2009-11-26 14:57                                       ` Stefan Monnier
2009-11-26 16:28                                         ` Lennart Borgman
2009-11-27  6:36                                         ` Richard Stallman
2009-11-24 22:47                                 ` Richard Stallman
2009-11-20  8:48                       ` Fwd: Re: Inadequate documentation of silly characters on screen Eli Zaretskii
2009-11-19 19:52                   ` Eli Zaretskii
2009-11-19 20:53                     ` Alan Mackenzie
2009-11-19 22:16                       ` David Kastrup
2009-11-20  8:55                         ` Eli Zaretskii
2009-11-19 20:05                   ` Stefan Monnier
2009-11-19 21:27                     ` Alan Mackenzie
2009-11-19 19:43               ` Eli Zaretskii
2009-11-19 21:57                 ` Alan Mackenzie
2009-11-19 23:10                   ` Stefan Monnier
2009-11-19 20:02               ` Stefan Monnier
2009-11-19 14:08     ` Stefan Monnier
2009-11-19 14:50       ` Jason Rumney
2009-11-19 15:27         ` Stefan Monnier
2009-11-19 23:12           ` Miles Bader
2009-11-20  2:16             ` Stefan Monnier
2009-11-20  3:37             ` Stephen J. Turnbull
2009-11-20  4:30               ` Stefan Monnier
2009-11-20  7:18                 ` Stephen J. Turnbull
2009-11-20 14:16                   ` Stefan Monnier
2009-11-21  4:13                     ` Stephen J. Turnbull
2009-11-21  5:24                       ` Stefan Monnier
2009-11-21  6:42                         ` Stephen J. Turnbull
2009-11-21  6:49                           ` Stefan Monnier
2009-11-21  7:27                             ` Stephen J. Turnbull
2009-11-23  1:58                               ` Stefan Monnier
2009-11-21 12:33                           ` David Kastrup
2009-11-21 13:55                             ` Stephen J. Turnbull
2009-11-21 14:36                               ` David Kastrup
2009-11-21 17:53                                 ` Stephen J. Turnbull
2009-11-21 23:30                                   ` David Kastrup
2009-11-22  1:27                                     ` Sebastian Rose
2009-11-22  8:06                                       ` David Kastrup
2009-11-22 23:52                                         ` Sebastian Rose
2009-11-19 17:08       ` Fwd: " Alan Mackenzie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20091119082040.GA1720@muc.de \
    --to=acm@muc.de \
    --cc=emacs-devel@gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).