From: Miles Bader <miles@lsi.nec.co.jp>
Cc: handa@m17n.org
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: 04 Mar 2003 11:48:57 +0900
Message-ID: <buod6l73kyu.fsf@mcspd15.ucom.lsi.nec.co.jp>
In-Reply-To: <E18pv9d-0004nD-00@fencepost.gnu.org>

Richard Stallman <rms@gnu.org> writes:
>     a buffer/string should have an associated `unibyte encoding'
>     attribute, which would allow it to be encoded using the
>     straightforward and efficient `unibyte representation' but appear
>     to lisp/whoever as being a multibyte buffer/string (all of whose
>     characters happen to have the same charset).
> 
> This is more or less what a unibyte buffer is now, except that there
> is only one possibility for which character sets can be stored in it:
> it holds the character codes from 0 to 0377.

Yeah, but I'm saying that emacs should be able to use this efficient
representation for other character sets as well -- I think it's far more
common to have buffers storing non-raw 8-bit characters than raw
characters, so why is the uncommon case optimized?

> If we wanted to hide from the user the distinction between unibyte and
> multibyte buffers, we would have to change the buffer's representation
> automatically when inserting characters that don't fit unibyte.  That
> seems like a bad idea.

Well, I agree that it would be annoying if your 10-megabyte raw-bytes buffer
suddenly got converted because you accidentally inserted a Chinese
character. :-)

However, I think that in many cases such a conversion would be OK, and
since 99% of the time people _don't_ mix character sets, it would
probably be a win on average.

Maybe there could be a buffer-local variable that `locks' the buffer's
character set, and would cause an error to be signalled if some code
attempts to insert incompatible text (instead of converting the
buffer)?  This might catch coding errors better than the current
`just insert the raw-codes' unibyte buffers (if you _really_ want to
insert the raw-codes, you can of course do so explicitly).
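
To make the two behaviours above concrete, here is a rough sketch.  It is
Python rather than Emacs Lisp, and it is only a toy model: the class
`CompactBuffer', its `locked' flag, and the use of Latin-1 as the
single-byte charset are all assumptions made for illustration, not
anything Emacs provides.  The buffer keeps one byte per character while
every inserted character fits the charset; an insertion that doesn't fit
either converts the buffer to a wide representation or, if the buffer is
locked, signals an error and leaves the contents untouched.

```python
class CompactBuffer:
    """Toy text buffer: one byte per character for as long as that works."""

    def __init__(self, charset="latin-1", locked=False):
        # Assumption: `charset` names a single-byte codec such as Latin-1.
        self.charset = charset
        # Hypothetical per-buffer lock: refuse text outside the charset.
        self.locked = locked
        self.compact = bytearray()  # compact form, one byte per character
        self.wide = None            # list of code points after conversion

    def is_compact(self):
        return self.wide is None

    def insert(self, text):
        if self.wide is not None:
            # Already converted: store code points directly.
            self.wide.extend(ord(ch) for ch in text)
            return
        try:
            encoded = text.encode(self.charset)
        except UnicodeEncodeError:
            if self.locked:
                # Locked buffer: signal an error instead of converting.
                raise ValueError(
                    "text does not fit locked charset " + self.charset)
            # Unlocked buffer: convert existing contents, then append.
            self.wide = [ord(ch) for ch in self.compact.decode(self.charset)]
            self.compact = None
            self.wide.extend(ord(ch) for ch in text)
        else:
            self.compact.extend(encoded)

    def text(self):
        if self.wide is None:
            return self.compact.decode(self.charset)
        return "".join(chr(cp) for cp in self.wide)
```

Callers only ever see `insert' and `text'; which representation is in use
is an internal detail, which is the point of the proposal.  The conversion
step is linear in the buffer size, which is exactly the cost the
10-megabyte example above is worried about.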

> The advantage of unibyte mode for some European Latin-N users is that
> they don't have to deal with encoding and decoding, so they never have
> to specify a coding system.  It is possible that today we could get
> the same results using multibyte buffers and forcing use of a specific
> Latin-N coding system.  People could try experimenting with this and
> seeing if it provides results that are just like what European users
> now get with unibyte mode.

Perhaps the same advantages could be had, without making a special case,
by having an `uninterpreted' character set, which would effectively be
treated by the display code as `just send whatever code raw to the terminal.'

> As for the idea that efficiency should never be a factor in deciding
> what to do here, I am skeptical of that.

I'm not saying that efficiency isn't an issue, I'm saying that Lisp
programmers shouldn't have to worry about it as much.  They should be
able to just use `normal' coding methods (which currently means
multibyte by default), and expect that Emacs would optimize this in
certain common cases; currently, if a Lisp programmer wants extra
efficiency, he's got to use special and more dangerous operations.

I realize that what I'm suggesting is a bit much, at least for the near
future, but I also think the current design is somewhat broken, and
makes it too easy for programmers to do the wrong thing.

-Miles
-- 
Ich bin ein Virus. Mach' mit und kopiere mich in Deine .signature.
