From: "Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu>
Cc: monnier+gnu/emacs@rum.cs.yale.edu
Subject: Re: setenv -> locale-coding-system cannot handle ASCII?!
Date: Wed, 26 Feb 2003 00:50:27 -0500
Message-ID: <200302260550.h1Q5oSc08967@rum.cs.yale.edu>
In-Reply-To: 200302260532.OAA29294@etlken.m17n.org

> > I consider this context-dependent meaning of unibyte strings
> > to be a problem.  I understand why text in a unibyte buffer
> > has such an ambiguous meaning and agree that it's difficult
> > to avoid, but it's not a reason to carry over this difficulty
> > to strings where it is not needed.
> 
> Why is it not needed?  Strings and buffers are not that
> different, both are containers of characters.

They are used differently.  Operations on strings generally apply to the
whole string: you can only encode/decode a whole string at a time.

> If we get a unibyte string from a unibyte buffer by buffer-substring,
> how should we treat that string?

Like any other unibyte string: as a sequence of raw bytes.
If you want to treat it as a sequence of characters, then
you need to pass it through `string-as-multibyte'.

In buffers, there is sometimes a need to represent multibyte chars
inside a unibyte buffer because only part of the buffer is
decoded.  For a string, that can be avoided.  You can make sure
that if it is decoded it's a multibyte string and if it's not
then it's a unibyte string.

> > For example: what is the multibyteness of
> 
> > 	(concat "\201" (format "%s" "hello"))
> > and
> > 	(concat "\201" (format "%s" 1))
> 
> The latter yields multibyte, but I think it's a bug.  I found
> that "(format "%s" 1)" is implemented by using
> prin1-to-string, and prin1-to-string prints an object to a
> temporary buffer and gets that buffer string.  So, in a
> multibyte session "(format "%s" 1)" yields a multibyte
> string.  :-(

I know: I bumped into it yesterday while playing around with tar-mode.
How about the attached patch?

> So, do you mean that you want this?
> 
>     If a unibyte buffer has \201\300 in the region FROM and TO,
> 
>     (encode-coding-string (buffer-substring FROM TO) 'iso-latin-1)
> 	=> "\201\300"
> 
>     (encode-coding-region FROM TO 'iso-latin-1) changes the
>     region to \300.

Yes, I guess I'd be happy with it.

> Isn't it more confusing?

Not to me.

> By the way, I also really really hate this unibyte/multibyte
> problem.  Sometimes I think I should have opposed the
> introduction of such a concept more strongly.

But it's pretty damn handy for binary data.


	Stefan


PS: I wish there was a way to swap two buffers' contents so that
    tar-mode could swap the (potentially very large) data to
    a helper buffer (without needing to copy this large data)
    and then use multibyte for the display and unibyte for
    the helper buffer.


Index: print.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/print.c,v
retrieving revision 1.184
diff -u -r1.184 print.c
--- print.c	4 Feb 2003 14:03:13 -0000	1.184
+++ print.c	26 Feb 2003 05:43:26 -0000
@@ -774,9 +774,12 @@
   /* Make Vprin1_to_string_buffer be the default buffer after PRINTFINSH */
   PRINTFINISH;
   set_buffer_internal (XBUFFER (Vprin1_to_string_buffer));
+  if (ZV == ZV_BYTE)
+    Fset_buffer_multibyte (Qnil);
   object = Fbuffer_string ();
 
   Ferase_buffer ();
+  Fset_buffer_multibyte (Qt);
   set_buffer_internal (old);
 
   Vdeactivate_mark = tem;
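
The `ZV == ZV_BYTE' test in the patch relies on a simple property: if
the end of the buffer is at the same position whether counted in
characters or in bytes, then every character occupies exactly one byte,
so switching the temporary buffer to unibyte before taking the string
should lose nothing.  The following is only a standalone sketch of that
idea, not Emacs code.  It assumes well-formed UTF-8 as a stand-in for
Emacs's internal multibyte encoding, and the names count_chars and
can_be_unibyte are hypothetical.

```c
#include <stddef.h>

/* Count the characters in NBYTES bytes at BUF.  Assumption: BUF holds
   well-formed UTF-8, used here only as a stand-in for Emacs's internal
   multibyte representation.  Every byte that is not a continuation
   byte (10xxxxxx) starts a new character.  */
static size_t
count_chars (const unsigned char *buf, size_t nbytes)
{
  size_t nchars = 0;
  for (size_t i = 0; i < nbytes; i++)
    if ((buf[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}

/* Hypothetical analogue of the patch's `ZV == ZV_BYTE' check: return 1
   if the character count equals the byte count, i.e. every character
   is a single byte and the text can be handed out as unibyte.  */
static int
can_be_unibyte (const unsigned char *buf, size_t nbytes)
{
  return count_chars (buf, nbytes) == nbytes;
}
```

Under these assumptions, the one-byte text "1" passes the check, while
the two-byte sequence \303\200 (one character in UTF-8) does not, which
mirrors why the printed form of 1 could come back as a unibyte string.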
