From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Why does dired go through extra efforts to avoid unibyte names
Date: Fri, 05 Jan 2018 11:12:38 -0500
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
References: <83lghlfinq.fsf@gnu.org> <83tvw3asgk.fsf@gnu.org> <837eswbri4.fsf@gnu.org>
In-Reply-To: <837eswbri4.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 05 Jan 2018 11:10:27 +0200")

> I meant to ask why do _you_ worry about eight-bit-* characters being
> encoded more than once?

I don't really worry about it (other than as part of understanding why
the only explanation accompanying this code mentions it).

> I think the issue is that we want dired-get-filename to always return
> a multibyte string, so that its callers don't need to deal with the
> complications, like inserting unibyte strings into multibyte buffers,
> concatenating them with leading directories to form other file names,
> etc.

AFAICT a multibyte string which only consists of ASCII and eight-bit
bytes will "suffer" from the exact same problems as the corresponding
unibyte string (two such strings can be called "equal modulo
multibyteness").

Actually, most primitives will handle those two strings in the same
way.  E.g. inserting either string into a buffer gives the same result
(both for unibyte and multibyte buffers), and concatenating either of
those strings to a multibyte string gives the same result.
Concatenating either of those strings to a unibyte string does not
give the same result, but the two results are again "equal modulo
multibyteness".

So I can't imagine a scenario where calling string-to-multibyte here
will help subsequent code.


        Stefan
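
The comparisons described above can be tried out with a small sketch
like the following.  It is only an illustration, not code from dired:
the three `my-...' functions are hypothetical names, and the sample
strings (a file name with one raw byte, a multibyte and a unibyte
leading directory) are made up.  "Equal modulo multibyteness" is
assumed here to mean "equal once both strings have been passed through
string-to-multibyte".

```elisp
;; Sketch only: all `my-...' names are hypothetical helpers, not Emacs APIs.

(defun my-equal-modulo-multibyteness-p (a b)
  "Return non-nil if A and B are equal once both are made multibyte."
  (string= (string-to-multibyte a) (string-to-multibyte b)))

(defun my-insert-result (string multibyte)
  "Insert STRING into a temporary buffer and return the buffer contents.
The buffer is multibyte if MULTIBYTE is non-nil, unibyte otherwise."
  (with-temp-buffer
    (set-buffer-multibyte multibyte)
    (insert string)
    (buffer-string)))

(defun my-check-multibyteness-claims ()
  "Compare a unibyte string with its multibyte twin; return a plist."
  (let* ((uni (unibyte-string ?f ?o ?o #xC0))  ; ASCII plus one raw byte
         (multi (string-to-multibyte uni))      ; same content, multibyte
         (mb-dir "\u00e9/")                     ; multibyte leading directory
         (ub-dir (unibyte-string ?d #xC1 ?/)))  ; unibyte leading directory
    (list
     ;; Inserting into a multibyte buffer.
     :insert-multibyte-same
     (string= (my-insert-result uni t) (my-insert-result multi t))
     ;; Inserting into a unibyte buffer.
     :insert-unibyte-same
     (string= (my-insert-result uni nil) (my-insert-result multi nil))
     ;; Concatenating to a multibyte string.
     :concat-multibyte-same
     (string= (concat mb-dir uni) (concat mb-dir multi))
     ;; Concatenating to a unibyte string: strict comparison ...
     :concat-unibyte-same
     (string= (concat ub-dir uni) (concat ub-dir multi))
     ;; ... and comparison modulo multibyteness.
     :concat-unibyte-same-modulo
     (my-equal-modulo-multibyteness-p (concat ub-dir uni)
                                      (concat ub-dir multi)))))
```

Evaluating (my-check-multibyteness-claims) in a given Emacs version is
one way to check the argument: if it holds, every entry should be
non-nil except :concat-unibyte-same, which is the one case where the
two results differ in representation.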