Re: setenv -> locale-coding-system cannot handle ASCII?!

unofficial mirror of emacs-devel@gnu.org 

From: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: monnier+gnu/emacs@rum.cs.yale.edu
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: Tue, 25 Feb 2003 21:52:18 -0500
Message-ID: <200302260252.h1Q2qIK08490@rum.cs.yale.edu>
In-Reply-To: 200302260234.LAA29082@etlken.m17n.org

> >>    (if (multibyte-string-p variable)
> >>        (setq variable (encode-coding-string variable locale-coding-system)))
> >>  
> >>  multibyte-string-p is mandatory because encode-coding-string
> >>  will change the byte-sequence of `variable' even if it is
> >>  unibyte.
> >>  Ex. (encode-coding-string "\201\300" 'iso-latin-1) => "\300"
> 
> > I find this behavior annoying because it makes the emacs-mule
> > encoding appear in a situation where it is not mentioned.
> > I wish that
> 
> >     (encode-coding-string "\201\300" 'iso-latin-1)
> > and
> >     (encode-coding-string (string-to-multibyte "\201\300") 'iso-latin-1)
> 
> > returned the same value.
> 
> Why?  As I wrote before, what the bytes of a unibyte string
> mean depends on the context.

I consider this context-dependent meaning of unibyte strings
to be a problem.  I understand why text in a unibyte buffer
has such an ambiguous meaning and agree that it's difficult
to avoid, but it's not a reason to carry over this difficulty
to strings where it is not needed.

> In the former case, as it is given to encode-coding-string,
> it is a multibyte form by which emacs represents
> character(s), not a sequence of characters representing raw
> bytes.

The problem is that the multibyteness of strings is not
always easy to guess or control.  For example, what is the
multibyteness of

	(concat "\201" (format "%s" "hello"))
and
	(concat "\201" (format "%s" 1))

> In the latter case, as it is given to string-to-multibyte,
> it should be regarded as a sequence of characters representing
> raw bytes, thus the result of (string-to-multibyte
> "\201\300") is still a sequence of raw-bytes.  Encoding
> raw-bytes should yield the same raw-bytes.

Indeed, that's what I and `setenv' would want.

> And, this behaviour of encode-coding-string on a unibyte
> string is a natural consequence of encode-coding-region in a
> unibyte buffer.

As mentioned above, I understand why it works that way in buffers,
but I don't think it has to work the same way for strings.

	Stefan
