From: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: emacs-devel@gnu.org
Subject: Re: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue 28]
Date: Mon, 20 Jan 2003 19:45:28 -0500
Message-ID: <200301210045.h0L0jS812745@rum.cs.yale.edu>
In-Reply-To: 200301210010.JAA17551@etlken.m17n.org

> In article <6480-Mon20Jan2003214849+0200-eliz@is.elta.co.il>, "Eli Zaretskii" <eliz@is.elta.co.il> writes:
> >>  On process reading, if raw-text is used, the process output
> >>  is at first read as a unibyte string, the string is converted
> >>  to multibyte by string-as-multibyte (not by the not-yet-existing
> >>  string-to-multibyte), then inserted in a multibyte buffer.
> 
> > Sorry, I don't think I understand the difference.  What will we have
> > in the buffer after process output is converted as you describe in the
> > last paragraph above?
> 
> Ok, here's an example (Latin-1 lang. env.).
> 
> unibyte sequence (hex): 81    81    C0    C0
>                         result of conversion    display in multibyte buffer
> string-as-multibyte:    9E A1 81    C0    C0    \201À\300
> string-make-multibyte:  9E A1 9E A1 81 C0 81 C0 \201\201ÀÀ
> string-to-multibyte:    9E A1 9E A1 C0    C0    \201\201\300\300
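
The three rows of the table can be reproduced with a small model. This is only a sketch in Python, not Emacs code. It assumes the byte layout that the table itself implies for a Latin-1 environment: a raw byte in 0x80..0x9F is stored as 9E followed by the byte plus 0x20, a Latin-1 character is stored as 81 followed by the byte, and a raw byte in 0xA0..0xFF is stored as itself. The function names mirror the Emacs ones for readability; the model ignores every other leading code.

```python
# Toy model of the internal byte layout implied by the table above
# (Latin-1 language environment).  The rules are inferred from the
# table, not taken from the Emacs sources.

LATIN1_LEAD = 0x81    # leading byte of a Latin-1 character (assumed)
RAW_CTRL_LEAD = 0x9E  # leading byte of a raw 0x80..0x9F byte (assumed)


def raw_byte(b):
    """Internal form of byte b kept as a raw, undecoded byte."""
    if 0x80 <= b < 0xA0:
        return [RAW_CTRL_LEAD, b + 0x20]
    return [b]


def latin1_char(b):
    """Internal form of byte b decoded as a Latin-1 character."""
    if b >= 0xA0:
        return [LATIN1_LEAD, b]
    return raw_byte(b)  # 0x80..0x9F has no Latin-1 graphic character


def string_to_multibyte(data):
    """Keep every byte as a raw byte."""
    return [x for b in data for x in raw_byte(b)]


def string_make_multibyte(data):
    """Decode every byte as a Latin-1 character where possible."""
    return [x for b in data for x in latin1_char(b)]


def string_as_multibyte(data):
    """Reinterpret the bytes as if they were already in internal form."""
    out, i = [], 0
    while i < len(data):
        b = data[i]
        if b == LATIN1_LEAD and i + 1 < len(data) and data[i + 1] >= 0xA0:
            out += [b, data[i + 1]]  # already a valid Latin-1 sequence
            i += 2
        else:
            out += raw_byte(b)       # stray byte: keep it as a raw byte
            i += 1
    return out


def hexs(seq):
    """Format a byte sequence the way the table does."""
    return " ".join(f"{b:02X}" for b in seq)


data = bytes([0x81, 0x81, 0xC0, 0xC0])
print("string-as-multibyte:  ", hexs(string_as_multibyte(data)))
print("string-make-multibyte:", hexs(string_make_multibyte(data)))
print("string-to-multibyte:  ", hexs(string_to_multibyte(data)))
```

Only string-as-multibyte looks at neighbouring bytes, which is why the second 81 and the first C0 merge into a single character in its row while the other two functions always convert byte by byte.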

I find the terminology and the concepts confusing.
On the other hand, I understand the concept of encoding and decoding.
The following equivalences almost hold:

 (string-as-multibyte str) == (decode-coding-string str 'internal)
 (string-make-multibyte str) == (decode-coding-string str 'default)
 (string-to-multibyte str) == (decode-coding-string str 'raw-text)

I said "almost" because:

1 - there is no `internal' coding-system as of now.  In Emacs-21 we'd
    use `emacs-mule' but for Emacs-22 it would be `utf-8-emacs'.
    I'm still not sure what such a thing is useful for, tho (see
    my other email).

2 - there is no `default' coding-system either.  Or maybe
    locale-coding-system is this default: if your locale is
    latin-1 then that's latin-1.  For non-8-bit locales,
    I don't know what string-make-multibyte does.

3 - when called with a `raw-text' coding-system, decode-coding-string
    returns a unibyte string, which is obviously not what we want here.
    It might make sense for internal operations to return unibyte
    strings for the `raw-text' case, but I was really surprised that
    decode-coding-string would ever return a unibyte string.

I think avoiding string-FOO-multibyte and using decode-coding-string
instead would make things a lot clearer.
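
As a loose analogy outside Emacs, the "everything is a decode" view can be sketched with Python's codecs. This is an assumption-laden illustration, not a description of Emacs: `utf-8` stands in for the hypothetical `internal` coding-system, `latin-1` for the hypothetical `default` one in a Latin-1 locale, and the `surrogateescape` error handler stands in for Emacs's raw-byte characters. The input bytes are chosen for this sketch and are not the ones from the table.

```python
# Analogy only: three "conversions" expressed as three decodes.
# The helper names are hypothetical and exist only for this sketch.


def decode_internal(data: bytes) -> str:
    """Like decoding with an `internal' coding-system: valid sequences
    become characters, stray bytes survive as raw bytes."""
    return data.decode("utf-8", errors="surrogateescape")


def decode_default(data: bytes) -> str:
    """Like decoding with the locale's coding-system (Latin-1 here)."""
    return data.decode("latin-1")


def decode_raw(data: bytes) -> str:
    """Like decoding with `raw-text': every non-ASCII byte stays raw."""
    return data.decode("ascii", errors="surrogateescape")


data = bytes([0x81, 0xC3, 0x80, 0xC0])

for name, fn in [("internal", decode_internal),
                 ("default", decode_default),
                 ("raw-text", decode_raw)]:
    text = fn(data)
    print(f"{name:9} {len(text)} characters  {ascii(text)}")
```

In this analogy all three decodes return a character string, never a byte string, which is the behaviour point 3 argues for, and the result differs only in which coding-system is named.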


-- Stefan
