all messages for Emacs-related lists mirrored at yhetil.org
From: "Stephen J. Turnbull" <stephen@xemacs.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Miles Bader <miles@gnu.org>, Alan Mackenzie <acm@muc.de>,
	emacs-devel@gnu.org, Jason Rumney <jasonr@gnu.org>
Subject: Re: Inadequate documentation of silly characters on screen.
Date: Sat, 21 Nov 2009 15:42:23 +0900
Message-ID: <877htk2xbk.fsf@uwakimon.sk.tsukuba.ac.jp>
In-Reply-To: <jwvlji0pi9p.fsf-monnier+emacs@gnu.org>

Stefan Monnier writes:

 > I don't know what you mean.  The eight-bit "chars" were introduced to
 > make sure that decoding+reencoding will always return the exact same
 > byte-sequence, no matter what coding-system was used (i.e. even if the
 > byte-sequence is invalid for that coding-system).  Dunno how XEmacs
 > handles it.

Honestly, it currently doesn't, or doesn't very well, despite some
work by Aidan.

However, I think a well-behaved platform should by default signal an
error (something derived from invalid-state, in XEmacs's error
hierarchy) in such a case; normally this means corruption in the file.
There are special cases like utf8latex whose error messages give you a
certain number of octets without respecting character boundaries; I
agree there is a need to handle this case.  What Python 3 (PEP 383) does is
provide a family of coding system variants which use invalid Unicode
surrogates to encode "raw bytes" for situations where the user asks
you to proceed despite invalid octet sequences for the coding system;
since Emacs's internal code is UTF-8, any Unicode surrogate is invalid
and could be used for this purpose.  This would make non-Emacs apps
barf errors on such Emacs autosaves, but they'll probably barf on the
source file, too.
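
For concreteness, here is a minimal sketch of the PEP 383 behavior
mentioned above.  In Python 3 the mechanism is exposed as the
"surrogateescape" error handler, which can be combined with any codec.
The helper names decode_lenient and encode_lenient are made up for
this illustration; nothing here describes what Emacs or XEmacs
actually does.

```python
# Sketch of PEP 383: undecodable bytes are smuggled through as lone
# low surrogates (byte 0xNN becomes U+DC00 + 0xNN) and restored on encoding.
# decode_lenient / encode_lenient are hypothetical names for this example.

def decode_lenient(raw: bytes, codec: str = "utf-8") -> str:
    """Decode, mapping each invalid byte to a lone surrogate."""
    return raw.decode(codec, errors="surrogateescape")


def encode_lenient(text: str, codec: str = "utf-8") -> bytes:
    """Encode, turning escaped surrogates back into the original bytes."""
    return text.encode(codec, errors="surrogateescape")


raw = b"caf\xe9 \xff"          # Latin-1 bytes, not valid UTF-8
text = decode_lenient(raw)     # 'caf\udce9 \udcff'
roundtrip = encode_lenient(text)

# A strict encoder rejects the lone surrogates, which is the "barf"
# that other applications would produce on such a file.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The round trip returns the original octets exactly, which is the
property the eight-bit "chars" were introduced to guarantee.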

 > > And it should be either an error to (aset string pos 241) (sorry
 > > Alan!) or 241 should be implicitly interpreted as Latin-1 (ie, ?ñ).  I
 > > favor the former, because what Alan is doing screws Spanish-speaking
 > > users AFAICS.  OTOH, the latter extends naturally if you have plans to
 > > add support for fixed-width Unicode buffers (UTF-16 and UTF-32).
 > 
 > I understand this even less.

There's a typo in the expr above; it should be "multibyte-string".  The
proposed treatment of 241 is due to the fact that it is currently
illegal in multibyte strings AIUI.

Re the bit about Spanish-speakers: AIUI, Alan is translating multiline
strings to oneline strings by using an unusual graphic character.  But
it's only unusual in non-Spanish cases; Spanish-speakers may very well
want to include comments like "¡I wanna write this comment in Español!"
which would presumably get unfolded to "¡I wanna write this comment in
Espa\nol!"  Not very nice.
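
The collision can be sketched as follows.  This is a hypothetical
reconstruction of the folding scheme as I understand it, not Alan's
code: fold, unfold and PLACEHOLDER are names invented for the example,
and the only assumption is that character 241 stands in for newline.

```python
# Hypothetical sketch: fold newlines into character 241, then unfold.
# Character 241 is U+00F1 (ñ) when read as Latin-1/Unicode.

PLACEHOLDER = chr(241)


def fold(s: str) -> str:
    """Turn a multi-line string into one line using the placeholder."""
    return s.replace("\n", PLACEHOLDER)


def unfold(s: str) -> str:
    """Restore newlines from the placeholder."""
    return s.replace(PLACEHOLDER, "\n")


comment = "¡I wanna write this comment in Español!"
damaged = unfold(fold(comment))   # the genuine ñ comes back as a newline
```

Text without ñ survives the round trip; text containing it does not,
because unfold cannot tell a real ñ from a folded newline.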

Re widechar buffers: the codes for Latin-1 characters in UTF-16 and
UTF-32 are just zero-padded extensions of the unibyte codes.  I'm
pretty sure it's this kind of thing that Ben had in mind when he
originally designed the XEmacs version of the Mule internal encoding
to make (= (char-int ?ñ) 241) true in all versions of XEmacs.
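
The zero-padding claim is easy to check with any Unicode-aware
language; the sketch below uses Python's standard codecs (big-endian
variants, so that no byte order mark is emitted).  It says nothing
about Emacs internals and only illustrates the encodings themselves.

```python
# Latin-1 code points coincide with the first 256 Unicode code points,
# so their UTF-16 and UTF-32 forms are the Latin-1 byte padded with zeros.

ch = "ñ"
code_point = ord(ch)                 # 241

latin1 = ch.encode("latin-1")        # b'\xf1'
utf16 = ch.encode("utf-16-be")       # b'\x00\xf1'
utf32 = ch.encode("utf-32-be")       # b'\x00\x00\x00\xf1'
utf8 = ch.encode("utf-8")            # b'\xc3\xb1', not a padded form

# The same relationship holds for every Latin-1 character.
all_padded = all(
    chr(i).encode("utf-16-be") == b"\x00" + bytes([i])
    and chr(i).encode("utf-32-be") == b"\x00\x00\x00" + bytes([i])
    for i in range(256)
)
```

UTF-8 is the odd one out: there 241 becomes a two-octet sequence, so
the identity with the unibyte code holds only in the fixed-width forms.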

 > I think XEmacs's fundamental tradeoffs are subtly different but
 > lead to very far-reaching consequences,

Indeed, but I'm not talking about XEmacs, except for comparison of
techniques.





