From: Kenichi Handa <handa@m17n.org>
Cc: miles@gnu.org
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: Wed, 26 Feb 2003 14:32:16 +0900 (JST)
Message-ID: <200302260532.OAA29294@etlken.m17n.org>
In-Reply-To: <200302260252.h1Q2qIK08490@rum.cs.yale.edu> (monnier+gnu/emacs@rum.cs.yale.edu)
In article <200302260252.h1Q2qIK08490@rum.cs.yale.edu>, "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> I consider this context-dependent meaning of unibyte strings
> to be a problem. I understand why text in a unibyte buffer
> has such an ambiguous meaning and agree that it's difficult
> to avoid, but it's not a reason to carry over this difficulty
> to strings where it is not needed.
Why is it not needed? Strings and buffers are not that
different; both are containers of characters. If we get a
unibyte string from a unibyte buffer by buffer-substring,
how should we treat that string?
>> In the former case, as it is given to encode-coding-string,
>> it is a multibyte form by which emacs represents
>> character(s), not a sequence of characters representing raw
>> bytes.
> The problem is that the multibyteness of strings is not
> always as easy to guess/control.
I agree.
> For example: what is the multibyteness of
> (concat "\201" (format "%s" "hello"))
> and
> (concat "\201" (format "%s" 1))
The latter yields multibyte, but I think it's a bug. I found
that (format "%s" 1) is implemented by using
prin1-to-string, and prin1-to-string prints an object to a
temporary buffer and returns that buffer's string. So, in a
multibyte session, (format "%s" 1) yields a multibyte
string. :-(
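For anyone who wants to check this on their own build, multibyte-string-p reports which representation a string has. A minimal sketch follows; my-string-kind is a hypothetical helper, not something from Emacs or this thread, and the results for the two concat forms depend on the Emacs version and session, so none are stated here:

```elisp
;; Hypothetical helper (an assumption for illustration only):
;; classify a string by its internal representation.
(defun my-string-kind (s)
  "Return the symbol `multibyte' or `unibyte' for string S."
  (if (multibyte-string-p s) 'multibyte 'unibyte))

;; The two forms quoted above.  Evaluate them in the Emacs at hand;
;; the answers are version-dependent, so no expected values are given.
(list (my-string-kind (concat "\201" (format "%s" "hello")))
      (my-string-kind (concat "\201" (format "%s" 1))))
```

If the two elements of the resulting list differ, the build still shows the inconsistency discussed here.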
>> In the latter case, as it is given to string-to-multibyte,
>> it should be regarded as a sequence of characters representing
>> raw bytes, thus the result of (string-to-multibyte
>> "\201\300") is still a sequence of raw-bytes. Encoding
>> raw-bytes should yield the same raw-bytes.
> Indeed, that's what I and `setenv' would want.
>> And, this behaviour of encode-coding-string on a unibyte
>> string is a natural consequence of encode-coding-region in a
>> unibyte buffer.
> As mentioned above, I understand why it works that way in buffers,
> but I don't think it has to work the same way for strings.
So, do you mean that you want this?
If a unibyte buffer has \201\300 in the region between FROM and TO:
  (encode-coding-string (buffer-substring FROM TO) 'iso-latin-1)
    => "\201\300"
whereas
  (encode-coding-region FROM TO 'iso-latin-1)
changes the region to \300.
Isn't it more confusing?
By the way, I also really, really hate this unibyte/multibyte
problem. Sometimes I think I should have opposed the
introduction of such a concept more strongly.
imagine there's no unibyte
it's easy if you try
no bytes below us
above us only chars
imagine all the people living in multibyte
:-)
---
Ken'ichi HANDA
handa@m17n.org