From: Kenichi Handa <handa@m17n.org>
Cc: monnier+gnu/emacs@rum.cs.yale.edu
Subject: Re: unibyte<->multibyte conversion [Re: Emacs-diffs Digest, Vol 2, Issue 28]
Date: Mon, 27 Jan 2003 16:38:39 +0900 (JST)
Message-ID: <200301270738.QAA14597@etlken.m17n.org>
In-Reply-To: <200301260130.h0Q1Uo518101@rum.cs.yale.edu> (monnier+gnu/emacs@rum.cs.yale.edu)

In article <200301260130.h0Q1Uo518101@rum.cs.yale.edu>, "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> I don't understand your question.  When people use string-FOO-multibyte
> it's generally because they don't understand what's going on and they
> think "a char is a char is a char and I don't get this multibyte madness":
> using decode-coding-string would force them to better understand what's
> going on.

But I suspect that such people won't use the correct coding
system anyway.  To use the correct coding system, they must
clearly understand what kind of multibyte string they want.
And if they understand that, there should be no difficulty
in using the correct string-FOO-multibyte function.

In one sense, it seems clean to use the concept of decoding
and encoding for all unibyte<->multibyte conversions
coherently.  But, that hides what Emacs actually does.

You wrote:
> I find it more helpful to think in terms of bytes and chars:

Definitely.  But,

> unibyte strings are sequences of bytes while multibyte
> strings are sequences of chars.

Unfortunately no.

Emacs can represent a character sequence in either a unibyte
or a multibyte string.  Emacs can also represent a raw-byte
sequence in either a unibyte or a multibyte string.  For a
multibyte string, which of the two it represents (char-seq
or byte-seq) can be detected from the kind of characters it
contains.  But for a unibyte string, that is impossible;
only the context in which it is used decides that.

For string-make-multibyte, the input is a char-seq, and the
result of conversion is also a char-seq.  So, the concept of
decoding is not applicable here.

For string-to-multibyte, the input is a byte-seq, and the
result of conversion is also a byte-seq.  So, again, the
concept of decoding is not applicable either.

For string-as-multibyte, the input is a byte-seq, and the
result of conversion is a char-seq.  So, only here is the
concept of decoding applicable.

I hope this explains why I insist on string-FOO-multibyte
functions.
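
The byte-seq versus char-seq distinction can be sketched in
Emacs Lisp.  This is only an illustration under assumptions:
the helper name my-byte-vs-char-demo is hypothetical, the
`utf-8' coding system is chosen just for the example, and
the behavior shown is that of a recent Emacs, which may
differ from the Emacs discussed in this thread.

```elisp
;; Hypothetical helper, named for illustration only.
(defun my-byte-vs-char-demo (bytes)
  "Show how the unibyte string BYTES looks under two conversions.
Return a list (UNIBYTE-P AS-BYTES-LENGTH AS-CHARS-LENGTH)."
  (let (;; byte-seq -> byte-seq: each byte stays one raw-byte element.
        (as-bytes (string-to-multibyte bytes))
        ;; byte-seq -> char-seq: the bytes are decoded into characters.
        ;; `utf-8' is an assumption made for this example.
        (as-chars (decode-coding-string bytes 'utf-8)))
    (list (not (multibyte-string-p bytes))
          (length as-bytes)
          (length as-chars))))

;; "\303\251" is a two-byte unibyte string: the UTF-8 encoding
;; of LATIN SMALL LETTER E WITH ACUTE.
(my-byte-vs-char-demo "\303\251")
;; => (t 2 1)
```

The same two input bytes give a multibyte string of length 2
when they are kept as a byte-seq, and of length 1 when they
are decoded into a char-seq.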

By the way, it may be good to introduce coding system
aliases `internal' and `default', and to write, for
instance, in the docstring of string-as-multibyte that the
effect is the same as
(decode-coding-string UNIBYTE-STRING 'internal).

---
Ken'ichi HANDA
handa@m17n.org
