From: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: emacs-devel@gnu.org
Subject: Re: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue 28]
Date: Mon, 20 Jan 2003 19:45:28 -0500
Message-ID: <200301210045.h0L0jS812745@rum.cs.yale.edu>
In-Reply-To: 200301210010.JAA17551@etlken.m17n.org
> In article <6480-Mon20Jan2003214849+0200-eliz@is.elta.co.il>, "Eli Zaretskii" <eliz@is.elta.co.il> writes:
> >> On process reading, if raw-text is used, the process output
> >> is at first read as a unibyte string, the string is converted
> >> to multibyte by string-as-multibyte (not by not-yet-existing
> >> string-to-multibyte), then inserted in a multibyte buffer.
>
> > Sorry, I don't think I understand the difference. What will we have
> > in the buffer after process output is converted as you describe in the
> > last paragraph above?
>
> Ok, here's an example (Latin-1 lang. env.).
>
> unibyte sequence (hex): 81 81 C0 C0
>
>                         result of conversion      display in multibyte buffer
> string-as-multibyte:    9E A1 81 C0 C0            \201À\300
> string-make-multibyte:  9E A1 9E A1 81 C0 81 C0   \201\201ÀÀ
> string-to-multibyte:    9E A1 9E A1 C0 C0         \201\201\300\300

I find the terminology and the concepts confusing.
On the other hand, I understand the concept of encoding and decoding.
The following equivalences almost hold:

   (string-as-multibyte str)   == (decode-coding-string str 'internal)
   (string-make-multibyte str) == (decode-coding-string str 'default)
   (string-to-multibyte str)   == (decode-coding-string str 'raw-text)

I said "almost" because:
1 - there is no `internal' coding-system as of now.  In Emacs-21 we'd
    use `emacs-mule' but for Emacs-22 it would be `utf-8-emacs'.
    I'm still not sure what such a thing is useful for, tho (see
    my other email).
2 - there is no `default' coding-system either.  Or maybe
    locale-coding-system is this default: if your locale is
    latin-1 then that's latin-1.  For non-8-bit locales,
    I don't know what string-make-multibyte does.
3 - when called with a `raw-text' coding-system, decode-coding-string
    returns a unibyte string, which is obviously not what we want here.
    It might make sense for internal operations to return unibyte
    strings for the `raw-text' case, but I was really surprised that
    decode-coding-string would ever return a unibyte string.
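
One way to see how far these near-equivalences hold on a given Emacs build
is to evaluate something like the following.  It is only a sketch under
stated assumptions: it assumes `unibyte-string' is available (that function
is not mentioned in this thread), it skips any string-FOO-multibyte
function that is not defined, and it deliberately makes no claim about
what gets printed, since that depends on the Emacs version and the
language environment.

```elisp
;; Sketch only: inspect the conversions on the bytes from the quoted example.
;; Assumes `unibyte-string' exists in the running Emacs.
(let ((bytes (unibyte-string #x81 #x81 #xC0 #xC0)))
  ;; The three string-FOO-multibyte functions discussed above.
  (dolist (fn '(string-as-multibyte string-make-multibyte string-to-multibyte))
    (when (fboundp fn)
      (let ((result (funcall fn bytes)))
        (message "%s: multibyte=%s chars=%S"
                 fn (multibyte-string-p result) (append result nil)))))
  ;; Point 3: does decoding with `raw-text' return a unibyte string here?
  (let ((decoded (decode-coding-string bytes 'raw-text)))
    (message "decode raw-text: multibyte=%s chars=%S"
             (multibyte-string-p decoded) (append decoded nil))))
```

Comparing the `multibyte=' flag and the character lists in *Messages*
shows whether decode-coding-string with `raw-text' matches
string-to-multibyte on that build, or returns a unibyte string as
described in point 3.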

I think avoiding string-FOO-multibyte and using decode-coding-string
instead would make things a lot clearer.

-- Stefan