From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: handa@m17n.org, emacs-devel@gnu.org
Subject: Re: Emacs 23 character code space
Date: Sun, 23 Nov 2008 06:22:45 -0500
Message-ID: <E1L4D37-0003Oc-7J@fencepost.gnu.org>
In-Reply-To: <jwvbpw7dr2w.fsf-monnier+emacs@gnu.org> (message from Stefan Monnier on Sat, 22 Nov 2008 23:16:49 -0500)

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 22 Nov 2008 23:16:49 -0500
> Cc: emacs-devel@gnu.org, Kenichi Handa <handa@m17n.org>
> 
> I think we should state somewhere that unibyte strings and buffers
> contain bytes only.  And that multibyte strings and buffers contain
> chars.  And that bytes are a subset of chars.

Please take a look at the current version of nonascii.texi in CVS; I
have already stated this there.  Specific suggestions for improvement
are welcome, of course.

(The text I was quoting was the original one written by Handa-san, not
the one I put into the manual.)

> >     @defun string-to-multibyte string
> >     This function returns a multibyte string containing the same sequence
> >     of characters as @var{string}.  If @var{string} is a multibyte string,
> >     it is returned unchanged.
> >     @end defun
> 
> > I'm not sure I understand the effect of this function.
> 
> It returns a string containing the same bytes (in the sense of
> ASCII+eight-bit, not in the sense of the underlying internal
> representation, which we should as much as possible not mention
> anywhere) but in a multibyte string instead.  I.e. the output is
> a multibyte string of the same length whose chars are bytes.

So you are saying, in effect, that the result of this function is
well defined only for a string that holds ASCII characters and raw
8-bit bytes?
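
To make the question concrete, here is a minimal sketch of the case
described above: a unibyte string holding ASCII plus one raw 8-bit
byte.  The function name my-demo-to-multibyte is hypothetical and used
only for illustration; the sketch assumes the reader treats a literal
with only ASCII and octal escapes below 256 as a unibyte string.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-to-multibyte ()
  "Convert a unibyte string with one raw byte and report what changed."
  (let* ((raw "abc\300")                     ; unibyte: 3 ASCII chars + byte #xC0
         (multi (string-to-multibyte raw)))  ; same sequence, multibyte string
    (list (multibyte-string-p raw)           ; input is unibyte
          (multibyte-string-p multi)         ; output is multibyte
          (= (length raw) (length multi))))) ; length in characters is unchanged

(my-demo-to-multibyte)   ; => (nil t t)
```

The raw byte survives as a single "eight-bit" character in the result,
which is why the length does not change.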

> >     @defun string-to-unibyte string
> >     This function returns a unibyte string containing the same sequence of
> >     characters as @var{string}.  It signals an error if @var{string}
> >     contains a non-@acronym{ASCII} character.  If @var{string} is a
> >     unibyte string, it is returned unchanged.
> >     @end defun
> 
> > Since this function handles any non-ASCII characters lossily, when
> > would it be useful?
> 
> I think the "non-ASCII" part is incorrect.  It probably should say
> "non-byte char" instead.

"Non-ASCII characters" here does not mean "anything but ASCII
characters", it means "any character except ASCII and raw 8-bit
bytes" (assuming I understand the text correctly).  I will make sure
this tricky distinction is clear in the manual.
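
As a sketch of that distinction (assuming I understand the behavior
correctly): a multibyte string made only of ASCII and raw 8-bit bytes
converts cleanly, while a real non-ASCII character such as U+00E9
signals an error.  The name my-demo-to-unibyte is hypothetical.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-to-unibyte (string)
  "Return STRING converted by `string-to-unibyte', or `failed' on error."
  (condition-case nil
      (string-to-unibyte string)
    (error 'failed)))

;; ASCII plus a raw byte: the conversion succeeds and round-trips.
(equal (my-demo-to-unibyte (string-to-multibyte "abc\300")) "abc\300")  ; => t

;; U+00E9 is a character, not a raw byte, so the conversion fails.
(my-demo-to-unibyte "\u00e9")                                           ; => failed
```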

> In 99% (actually 99.99999% for the `as' case) of the cases you shouldn't
> use string-{as/make/to}-{uni/multi}byte.  Instead you should use
> {en/de}code-coding-string.

This specific section is not about en/decoding text, it's about
converting between unibyte and multibyte.  Unless we want to remove
any mention of these capabilities (and leave Lisp programmers without
any documentation on how to handle binary data and/or byte streams of
undecoded text), I don't think we can remove the description of these
functions from the manual.
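
For contrast, the en/decoding route recommended above looks roughly
like this.  It is only a sketch: the helper name my-demo-roundtrip is
hypothetical, and utf-8 is picked arbitrarily as the coding system.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-demo-roundtrip (text)
  "Encode TEXT as UTF-8 bytes, decode it back, and report the result."
  (let* ((bytes (encode-coding-string text 'utf-8))   ; unibyte byte stream
         (back (decode-coding-string bytes 'utf-8)))  ; multibyte text again
    (list (multibyte-string-p bytes)   ; encoded form is unibyte
          (length bytes)               ; number of bytes
          (string= back text))))       ; decoding restores the text

(my-demo-roundtrip "\u00e9")   ; => (nil 2 t)
```

Here the bytes are an encoding of the text, whereas the
string-to-{uni/multi}byte functions only change the container and
leave undecoded bytes as they are.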



