From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Why does dired go through extra efforts to avoid unibyte names
Date: Tue, 02 Jan 2018 23:14:20 -0500
References: <83lghlfinq.fsf@gnu.org>
To: emacs-devel@gnu.org

>> I bumped into the following code in dired-get-filename:
>>
>>     ;; The above `read' will return a unibyte string if FILE
>>     ;; contains eight-bit-control/graphic characters.
>>     (if (and enable-multibyte-characters
>>              (not (multibyte-string-p file)))
>>         (setq file (string-to-multibyte file)))
>>
>> and I'm wondering why we don't want a unibyte string here.
>> `vc-region-history` told me this comes from the commit appended below,
>> which seems to indicate that we're worried about a subsequent encoding,
>> but AFAIK unibyte file names are not (re)encoded, and passing them
>> through string-to-multibyte would actually make things worse in this
>> respect (since it might cause the kind of (re)encoding this is
>> supposedly trying to avoid).
>>
>> What am I missing?
>
> Why does it matter whether eight-bit-* characters are encoded one more
> or one less time?

That's part of the question, indeed.

> As for the reason for using string-to-multibyte: maybe it's because we
> use concat further down in the function, which will determine whether
> the result will be unibyte or multibyte according to its own ideas of
> what's TRT?

But `concat` will do a string-to-multibyte for us, if needed, so that
doesn't seem like a good reason.
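
A minimal sketch of that point, assuming a recent Emacs in which `concat`
promotes a unibyte argument the same way `string-to-multibyte` does. The
function name `my-concat-demo` and the sample values are hypothetical, and
are not taken from dired:

```elisp
;; Sketch only: compares implicit promotion by `concat' with an
;; explicit `string-to-multibyte' call.  All names are hypothetical.
(defun my-concat-demo ()
  "Return (RAW-MULTIBYTE-P IMPLICIT-MULTIBYTE-P SAME-RESULT-P)."
  (let* ((raw (unibyte-string #xC0 #xC1)) ; unibyte string of two raw bytes
         (name "\u00e9")                  ; \u escape forces a multibyte string
         ;; Let `concat' decide how to combine unibyte and multibyte.
         (implicit (concat raw name))
         ;; Convert explicitly first, as dired-get-filename does.
         (explicit (concat (string-to-multibyte raw) name)))
    (list (multibyte-string-p raw)
          (multibyte-string-p implicit)
          (string= implicit explicit))))
```

If the assumption holds, `(my-concat-demo)` should return `(nil t t)`: the
raw input is unibyte, the concatenation is multibyte, and the explicit
conversion makes no difference to the result.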

This said, when that code was written, maybe `concat` used
string-make-multibyte internally instead, so this call to
string-to-multibyte might have been added to avoid using
string-make-multibyte inside `concat`?

It would be good to have a concrete case that needed the above code, to
see if the problem still exists.


        Stefan