From: Alan Mackenzie <acm@muc.de>
To: Andreas Schwab <schwab@linux-m68k.org>
Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
Subject: Re: Fwd: Re: Inadequate documentation of silly characters on screen.
Date: Thu, 19 Nov 2009 12:21:19 +0000 [thread overview]
Message-ID: <20091119122119.GB1720@muc.de> (raw)
In-Reply-To: <m3ws1mx1jw.fsf@hase.home>
Hi, Andreas,
On Thu, Nov 19, 2009 at 11:16:03AM +0100, Andreas Schwab wrote:
> Alan Mackenzie <acm@muc.de> writes:
> > So my `aset' invocation is trying to write a multibyte ?ñ into a
> > unibyte ?\n, and gets truncated from #x8f1 to #xf1 in the process.
> Nothing gets truncated. In Emacs 23 ?ñ is simply the number 241,
> whereas in Emacs 22 is it the number 2289. You can put 2289 in a string
> in Emacs 23, but there is no defined unicode character with that value.
Ah, thanks! So when I do
M-: (setq nl "\n")
M-: (aset nl 0 ?ñ)
M-: (insert nl)
, after the `aset', the string nl correctly contains, one character which
is the single byte #xf1. The bug happens in `insert', where something is
interpreting the byte #xf1 as the signed integer #xfffff.....ffff1.
Delving into the bowels of Emacs, I find this in character.h:
1. #define STRING_CHAR_AND_LENGTH(p, len, actual_len) \
2. (!((p)[0] & 0x80) \
3. ? ((actual_len) = 1, (p)[0]) \
4. : ! ((p)[0] & 0x20) \
5. ? ((actual_len) = 2, \
6. (((((p)[0] & 0x1F) << 6) \
7. | ((p)[1] & 0x3F)) \
8. + (((unsigned char) (p)[0]) < 0xC2 ? 0x3FFF80 : 0))) \
9. : ! ((p)[0] & 0x10) \
10. ? ((actual_len) = 3, \
11. ((((p)[0] & 0x0F) << 12) \
12. | (((p)[1] & 0x3F) << 6) \
13. | ((p)[2] & 0x3F))) \
14. : string_char ((p), NULL, &actual_len))
#xf1 drops through all this nonsense to string_char (in character.c). It
drops through to this case:
else if (! (*p & 0x08))
{
c = ((((p)[0] & 0xF) << 18)
| (((p)[1] & 0x3F) << 12)
| (((p)[2] & 0x3F) << 6)
| ((p)[3] & 0x3F));
p += 4;
}
, where it obviously becomes silly. At least, I think that's where it
ends up. This isn't the most maintainable piece of code in Emacs.
So, if ISO-8559-1 characters are now represented as single bytes in
Emacs, what test for mutibyticity should STRING_CHAR_AND_LENGTH be using?
> Andreas.
--
Alan Mackenzie (Nuremberg, Germany).
next prev parent reply other threads:[~2009-11-19 12:21 UTC|newest]
Thread overview: 96+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-18 19:12 [acm@muc.de: Re: Inadequate documentation of silly characters on screen.] Alan Mackenzie
2009-11-19 1:27 ` Fwd: Re: Inadequate documentation of silly characters on screen Stefan Monnier
2009-11-19 8:20 ` Alan Mackenzie
2009-11-19 8:50 ` Miles Bader
2009-11-19 10:16 ` Fwd: " Andreas Schwab
2009-11-19 12:21 ` Alan Mackenzie [this message]
2009-11-19 13:21 ` Jason Rumney
2009-11-19 13:35 ` Stefan Monnier
2009-11-19 14:18 ` Alan Mackenzie
2009-11-19 14:58 ` Jason Rumney
2009-11-19 15:42 ` Alan Mackenzie
2009-11-19 19:39 ` Eli Zaretskii
2009-11-19 15:30 ` Stefan Monnier
2009-11-19 15:58 ` Alan Mackenzie
2009-11-19 16:06 ` Andreas Schwab
2009-11-19 16:47 ` Aidan Kehoe
2009-11-19 17:29 ` Alan Mackenzie
2009-11-19 18:21 ` Aidan Kehoe
2009-11-20 2:43 ` Stephen J. Turnbull
2009-11-19 19:45 ` Eli Zaretskii
2009-11-19 20:07 ` Eli Zaretskii
2009-11-19 19:55 ` Stefan Monnier
2009-11-20 3:13 ` Stephen J. Turnbull
2009-11-19 16:55 ` David Kastrup
2009-11-19 18:08 ` Alan Mackenzie
2009-11-19 19:25 ` Davis Herring
2009-11-19 21:25 ` Alan Mackenzie
2009-11-19 22:31 ` David Kastrup
2009-11-21 22:52 ` Richard Stallman
2009-11-23 2:08 ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Stefan Monnier
2009-11-23 20:38 ` Richard Stallman
2009-11-23 21:34 ` Per Starbäck
2009-11-24 22:47 ` Richard Stallman
2009-11-25 1:33 ` Kenichi Handa
2009-11-25 2:29 ` Displaying bytes (was: Inadequate documentation of silly Stefan Monnier
2009-11-25 2:50 ` Lennart Borgman
2009-11-25 6:25 ` Stephen J. Turnbull
2009-11-25 5:40 ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Ulrich Mueller
2009-11-26 22:59 ` Displaying bytes Reiner Steib
2009-11-27 0:16 ` Ulrich Mueller
2009-11-27 1:41 ` Stefan Monnier
2009-11-27 4:14 ` Stephen J. Turnbull
2009-11-25 5:59 ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Stephen J. Turnbull
2009-11-25 8:16 ` Kenichi Handa
2009-11-29 16:01 ` Richard Stallman
2009-11-29 16:31 ` Displaying bytes (was: Inadequate documentation of silly Stefan Monnier
2009-11-29 22:01 ` Juri Linkov
2009-11-30 6:05 ` tomas
2009-11-30 12:09 ` Andreas Schwab
2009-11-30 12:39 ` tomas
2009-11-29 22:19 ` Displaying bytes (was: Inadequate documentation of silly characters on screen.) Kim F. Storm
2009-11-30 1:42 ` Stephen J. Turnbull
2009-11-24 1:28 ` Displaying bytes Stefan Monnier
2009-11-24 22:47 ` Richard Stallman
2009-11-25 2:18 ` Stefan Monnier
2009-11-26 6:24 ` Richard Stallman
2009-11-26 8:59 ` David Kastrup
2009-11-26 14:57 ` Stefan Monnier
2009-11-26 16:28 ` Lennart Borgman
2009-11-27 6:36 ` Richard Stallman
2009-11-24 22:47 ` Richard Stallman
2009-11-20 8:48 ` Fwd: Re: Inadequate documentation of silly characters on screen Eli Zaretskii
2009-11-19 19:52 ` Eli Zaretskii
2009-11-19 20:53 ` Alan Mackenzie
2009-11-19 22:16 ` David Kastrup
2009-11-20 8:55 ` Eli Zaretskii
2009-11-19 20:05 ` Stefan Monnier
2009-11-19 21:27 ` Alan Mackenzie
2009-11-19 19:43 ` Eli Zaretskii
2009-11-19 21:57 ` Alan Mackenzie
2009-11-19 23:10 ` Stefan Monnier
2009-11-19 20:02 ` Stefan Monnier
2009-11-19 14:08 ` Stefan Monnier
2009-11-19 14:50 ` Jason Rumney
2009-11-19 15:27 ` Stefan Monnier
2009-11-19 23:12 ` Miles Bader
2009-11-20 2:16 ` Stefan Monnier
2009-11-20 3:37 ` Stephen J. Turnbull
2009-11-20 4:30 ` Stefan Monnier
2009-11-20 7:18 ` Stephen J. Turnbull
2009-11-20 14:16 ` Stefan Monnier
2009-11-21 4:13 ` Stephen J. Turnbull
2009-11-21 5:24 ` Stefan Monnier
2009-11-21 6:42 ` Stephen J. Turnbull
2009-11-21 6:49 ` Stefan Monnier
2009-11-21 7:27 ` Stephen J. Turnbull
2009-11-23 1:58 ` Stefan Monnier
2009-11-21 12:33 ` David Kastrup
2009-11-21 13:55 ` Stephen J. Turnbull
2009-11-21 14:36 ` David Kastrup
2009-11-21 17:53 ` Stephen J. Turnbull
2009-11-21 23:30 ` David Kastrup
2009-11-22 1:27 ` Sebastian Rose
2009-11-22 8:06 ` David Kastrup
2009-11-22 23:52 ` Sebastian Rose
2009-11-19 17:08 ` Fwd: " Alan Mackenzie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091119122119.GB1720@muc.de \
--to=acm@muc.de \
--cc=emacs-devel@gnu.org \
--cc=monnier@iro.umontreal.ca \
--cc=schwab@linux-m68k.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this external index
https://git.savannah.gnu.org/cgit/emacs.git
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.