From: Eli Zaretskii <eliz@gnu.org>
To: Andreas Politz <politza@hochschule-trier.de>
Cc: 15426@debbugs.gnu.org
Subject: bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
Date: Sat, 21 Sep 2013 09:48:50 +0300
Message-ID: <83vc1uk6ul.fsf@gnu.org>
In-Reply-To: <87eh8jgqkp.fsf@hochschule-trier.de>

> From: Andreas Politz <politza@hochschule-trier.de>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  15426@debbugs.gnu.org
> Date: Fri, 20 Sep 2013 22:56:22 +0200
> 
> (let ((d "/tmp/\303\204")) ;; utf-8 for german umlaut "A 

This makes d a unibyte string:

  (setq d "/tmp/\303\204")
  "/tmp/\303\204"

  (multibyte-string-p d)
    => nil

Why would one do such a thing in the first place?  Are any of the file
names involved in your real-life use case unibyte strings that include
bytes above 127?  If so, I suggest finding out how they came into
existence -- that might be the source of your trouble.

Handling of unibyte strings in Emacs is optimized for certain use
cases, and manipulating file names on the Lisp level is certainly not
one of them.  I suggest staying away from unibyte strings as non-ASCII
file names unless you really must use them (which normally is only
necessary if you need to encode and decode file names by hand, for
example when you get them from some program whose output encoding is
different from the encoding of file names on your system).  Otherwise,
Lisp code should only ever manipulate file names with non-ASCII
characters as multibyte strings.
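
To illustrate the "decode by hand" case, here is a minimal sketch.  The
helper name my-decode-file-name is hypothetical (it is not part of
Emacs), and the utf-8 default is an assumption about how the bytes were
produced; decode-coding-string and multibyte-string-p are the standard
functions.

```elisp
;; Hypothetical helper, not part of Emacs: turn a unibyte file name
;; received from an external source into a multibyte string.
;; The utf-8 default is an assumption; pass CODING if the source
;; uses a different encoding.
(defun my-decode-file-name (raw &optional coding)
  "Return RAW decoded with CODING (default utf-8) if RAW is unibyte."
  (if (multibyte-string-p raw)
      raw                               ; already decoded, leave alone
    (decode-coding-string raw (or coding 'utf-8))))

(let* ((raw "/tmp/\303\204")            ; 7 bytes, unibyte
       (name (my-decode-file-name raw)))
  (list (multibyte-string-p raw)
        (multibyte-string-p name)
        (length raw)
        (length name)))
;; => (nil t 7 6)
```

After decoding, the two bytes \303\204 have become a single character,
and the result is the kind of string Lisp code should pass around.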

>   (when (file-exists-p d)
>     (delete-directory d t))
>   (make-directory d)
>   (append
>    (list (car (directory-files d t)) 
>          (file-exists-p (car (directory-files d t))))
>    ;; switch to a multibyte buffer
>    (with-temp-buffer
>      (list (car (directory-files d t))
> 	   (file-exists-p (car (directory-files d t)))))))
> --------------------8<-------------------------------------
> 
> If I save this somewhere (/tmp/foo.el), do
> 
> $ LC_ALL=C emacs -Q /tmp/foo.el
> 
> and evaluate it with C-x C-e, the minibuffer displays
> 
> => ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)

"The minibuffer displays" is the key point here: to display anything
in the minibuffer or echo area, Emacs first _inserts_ the textual
representation of that thing into a buffer, and then triggers
redisplay.  Insertion of unibyte strings into a multibyte buffer, or
insertion of multibyte strings into the minibuffer when the current
buffer is unibyte, causes all kinds of transformations on the inserted
string, whose purpose is to intuit what the user expects to see.  What
you see is the result of those transformations.  And yes, that result
can be baffling at times; that's why I suggest staying away from
unibyte strings as much as you can, certainly as long as those strings
are file names with non-ASCII characters.
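
As a rough illustration of why the representation matters (a sketch
only, not a trace of what the echo-area code actually does): changing
the representation of a unibyte string is not the same as decoding it.
The function name my-compare-conversions is hypothetical, and utf-8 is
an assumption about the bytes in question.

```elisp
;; Hypothetical helper, not part of Emacs: compare a representation
;; change (string-to-multibyte) with a real decode of the same bytes.
(defun my-compare-conversions (raw)
  "Return lengths of RAW converted vs. decoded, and whether they match."
  (let ((converted (string-to-multibyte raw))        ; bytes kept as raw bytes
        (decoded (decode-coding-string raw 'utf-8))) ; bytes read as utf-8
    (list (length converted)
          (length decoded)
          (string= converted decoded))))

(my-compare-conversions "\303\204")
;; => (2 1 nil)
```

The converted string still holds two raw-byte characters, while the
decoded one holds a single character, so the two do not compare equal
even though both are multibyte.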

Again, I suggest figuring out whether, and how, you got unibyte
strings as file names in your original use case.

> I hope that clarifies it.

Sorry, it does not.





  reply	other threads:[~2013-09-21  6:48 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-20 16:47 bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer Andreas Politz
2013-09-20 17:46 ` Eli Zaretskii
2013-09-20 18:51   ` Andreas Politz
2013-09-20 19:08     ` Eli Zaretskii
2013-09-20 19:15   ` Stefan Monnier
2013-09-20 19:17     ` Eli Zaretskii
2013-09-20 20:56       ` Andreas Politz
2013-09-21  6:48         ` Eli Zaretskii [this message]
2013-09-21  9:35           ` Andreas Politz
2013-09-21  9:38             ` Andreas Politz
2013-09-21 11:59             ` Eli Zaretskii
2013-09-21 17:12               ` Andreas Politz
2013-09-21 18:53                 ` Eli Zaretskii
2013-09-21 16:06           ` Stefan Monnier
2013-09-21 16:26             ` Eli Zaretskii
2013-09-22  1:29               ` Stefan Monnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83vc1uk6ul.fsf@gnu.org \
    --to=eliz@gnu.org \
    --cc=15426@debbugs.gnu.org \
    --cc=politza@hochschule-trier.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.