From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Why does dired go through extra efforts to avoid unibyte names
Date: Fri, 05 Jan 2018 11:10:27 +0200
Message-ID: <837eswbri4.fsf@gnu.org>
References: <83lghlfinq.fsf@gnu.org> <83tvw3asgk.fsf@gnu.org>
To: Stefan Monnier
Cc: emacs-devel@gnu.org
In-reply-to: (message from Stefan Monnier on Wed, 03 Jan 2018 15:09:06 -0500)

> From: Stefan Monnier
> Date: Wed, 03 Jan 2018 15:09:06 -0500
>
> > Eight-bit-* characters are not in general modified by encoding them,
> > so you could encode them any number of times and still get the same
> > bytes as result.
>
> Agreed. But even if it were not the case, I don't see why that would
> explain the presence of this code.

I meant to ask why do _you_ worry about eight-bit-* characters being
encoded more than once?

> >> > As for the reason for using string-to-multibyte: maybe it's because we
> >> > use concat further down in the function, which will determine whether
> >> > the result will be unibyte or multibyte according to its own ideas of
> >> > what's TRT?
> >> But `concat` will do a string-to-multibyte for us, if needed
> > Not if the other concatenated parts are ASCII (which tend to be
> > unibyte strings).
>
> But that's still perfectly fine as well since it will then result in
> a unibyte string which will get "encoded" correctly.

Where do you see encoding in this picture?

I think the issue is that we want dired-get-filename to always return
a multibyte string, so that its callers don't need to deal with the
complications, like inserting unibyte strings into multibyte buffers,
concatenating them with leading directories to form other file names,
etc.
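
A minimal sketch of the concat behavior discussed above, for illustration only. It is not the code in dired-get-filename, and the function name and the file and directory names are made up. It assumes ASCII-only string literals are read as unibyte strings, which is the case the quoted text refers to.

```elisp
;; Hypothetical helper, not part of dired.  Returns a list of two
;; booleans: whether a plain concat of unibyte ASCII parts is multibyte,
;; and whether the result is multibyte once one part has been passed
;; through string-to-multibyte first.
(defun my-demo-concat-multibyteness ()
  (let* ((dir "/tmp/")       ; ASCII-only literal, a unibyte string
         (file "foo.txt")    ; ASCII-only literal, a unibyte string
         ;; All arguments are unibyte, so concat returns a unibyte string.
         (plain (concat dir file))
         ;; One multibyte argument is enough to make the result multibyte.
         (forced (concat dir (string-to-multibyte file))))
    (list (multibyte-string-p plain)
          (multibyte-string-p forced))))

(my-demo-concat-multibyteness)   ; => (nil t)
```

Under these assumptions, converting the name before concatenating is what guarantees a multibyte result regardless of what the other parts look like.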