From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#15426: 24.3.50; Multibyte filenames and directory-files in unibyte buffer
Date: Sat, 21 Sep 2013 09:48:50 +0300
Message-ID: <83vc1uk6ul.fsf@gnu.org>
References: <87ob7nh22t.fsf@hochschule-trier.de> <831u4jl738.fsf@gnu.org> <83wqmbjoat.fsf@gnu.org> <87eh8jgqkp.fsf@hochschule-trier.de>
In-reply-to: <87eh8jgqkp.fsf@hochschule-trier.de>
To: Andreas Politz
Cc: 15426@debbugs.gnu.org

> From: Andreas Politz
> Cc: Stefan Monnier, 15426@debbugs.gnu.org
> Date: Fri, 20 Sep 2013 22:56:22 +0200
>
> (let ((d "/tmp/\303\204")) ;; utf-8 for german umlaut "A

This makes d a unibyte string:

  (setq d "/tmp/\303\204")
  "/tmp/\303\204"
  (multibyte-string-p d) => nil

Why would one do such a thing in the first place? Are any of the file
names involved in your real-life use case unibyte strings that include
bytes above 127? If there are, I suggest finding out how they came
into existence -- that might be the source of your trouble.

Handling of unibyte strings in Emacs is optimized for certain use
cases, certainly not those that manipulate file names on the Lisp
level. I suggest staying away from unibyte strings as non-ASCII file
names unless you really must use them (which normally is only
necessary if you need to encode and decode file names by hand, like
when you get them from some program, and the encoding of process
output is different from the encoding of file names on your system).
Otherwise, Lisp code should only ever handle non-ASCII file names as
multibyte strings.

>   (when (file-exists-p d)
>     (delete-directory d t))
>   (make-directory d)
>   (append
>    (list (car (directory-files d t))
>          (file-exists-p (car (directory-files d t))))
>    ;; switch to a multibyte buffer
>    (with-temp-buffer
>      (list (car (directory-files d t))
>            (file-exists-p (car (directory-files d t)))))))
> --------------------8<-------------------------------------
>
> If I save this somewhere (/tmp/foo.el), do
>
>   $ LC_ALL=C emacs -Q /tmp/foo.el
>
> and evaluate it with C-x C-e, the minibuffer displays
>
> => ("/tmp/\301\203\300\204/." nil "/tmp/\303\204/." t)

"The minibuffer displays" is the key point here: to display anything
in the minibuffer or echo area, Emacs first _inserts_ the textual
representation of that thing into a buffer, and then triggers
redisplay.
Insertion of unibyte strings into a multibyte buffer, or insertion of
multibyte strings into the minibuffer when the current buffer is
unibyte, causes all kinds of transformations on the inserted string,
whose purpose is to intuit what the user expects to see. What you see
is the result of those transformations. And yes, that result could be
baffling at times; that's why I suggest staying away from unibyte
strings as much as you can, certainly as long as those strings are
file names with non-ASCII characters.

Again, I suggest figuring out whether, and how, you got unibyte
strings as file names in your original use case.

> I hope that clarifies it.

Sorry, it does not.
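
For reference, here is a minimal sketch of the by-hand decoding
mentioned above. It assumes the raw bytes are UTF-8; that choice is an
assumption made for illustration, and real code would more likely take
the coding system from file-name-coding-system or from whatever
encoding the producing program is known to use. It only shows the
difference between the unibyte literal and its decoded, multibyte
form:

```elisp
;; A string literal written with octal escapes is unibyte: the two
;; escapes are two raw bytes, not one character.
(let* ((raw "/tmp/\303\204")
       ;; Assumption: the bytes are UTF-8.  Decoding turns them into
       ;; a multibyte string containing the single character "Ä".
       (decoded (decode-coding-string raw 'utf-8)))
  (list (multibyte-string-p raw)      ; nil
        (multibyte-string-p decoded)  ; t
        (length raw)                  ; 7 (5 ASCII characters + 2 bytes)
        (length decoded)))            ; 6 (5 ASCII characters + 1 character)
```

Passing the decoded string, rather than the raw one, to functions
such as directory-files keeps the file name multibyte throughout.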