From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
Date: Sun, 10 Jul 2022 10:23:28 -0400
To: Eli Zaretskii
Cc: 56469@debbugs.gnu.org
References: <83y1x2177x.fsf@gnu.org>
In-Reply-To: <83y1x2177x.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 09 Jul 2022 21:17:22 +0300")

> Please bootstrap Emacs in a directory with such a name, and if that
> works, I'm okay with installing this change.

Pushed, thanks.

W.r.t. the comment, it's indeed unrelated to the patch (other than the
fact that it touches the same code).  The question is what we should
return when we do:

    finalname = (nchars == nbytes)
      ? make_uninit_string (nbytes)
      : make_uninit_multibyte_string (nchars, nbytes);

At that point the actual bytes are "decoded" (i.e. in our internal
UTF-8 encoding), so (nchars == nbytes) checks whether it's "pure ASCII"
or not, and if it's pure ASCII we return a unibyte string.

Our file-name manipulation routines always consider unibyte-ASCII and
multibyte-ASCII as "equivalent", and indeed DECODE_FILE and ENCODE_FILE
take advantage of that: they return their argument as-is when it's
all-ASCII, so as to avoid allocating a string unnecessarily.

So in the above code snippet, when the string is all-ASCII, we actually
have a choice, and both a unibyte string and a multibyte string should
work.  Currently in that case we return a unibyte string, but I think
we're better off returning a multibyte string, because the subsequent
"all-ASCII" test (that DE/ENCODE_FILE will perform when we pass that
filename to some further operation) will be more efficient: for
a multibyte string it's a constant-time (nchars == nbytes) test,
whereas for a unibyte string it requires looking at each and every
byte.

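To make the cost difference concrete, here is a minimal sketch of the two
"all-ASCII" checks.  It is not the actual Emacs code: `struct sketch_string`
and `sketch_all_ascii_p` are hypothetical names used only for illustration,
and the struct is an assumed stand-in rather than the real Lisp string
layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Lisp string; NOT the real Emacs layout.  */
struct sketch_string
{
  const unsigned char *data;
  ptrdiff_t nbytes;   /* length in bytes */
  ptrdiff_t nchars;   /* length in characters; only used if multibyte */
  bool multibyte;     /* true if the string carries a character count */
};

/* Return true if S contains only ASCII bytes.  */
bool
sketch_all_ascii_p (const struct sketch_string *s)
{
  assert (s->data != NULL);

  if (s->multibyte)
    /* A non-ASCII character occupies more than one byte, so equal
       counts mean every character is ASCII.  Constant time.  */
    return s->nchars == s->nbytes;

  /* Unibyte: there is no character count to compare against, so
     every byte has to be inspected.  Linear in nbytes.  */
  for (ptrdiff_t i = 0; i < s->nbytes; i++)
    if (s->data[i] >= 0x80)
      return false;
  return true;
}
```

Under these assumptions, an all-ASCII name such as "foo.txt" passes the
check either way, but only the multibyte variant answers without walking
the bytes.
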
IOW, while it makes sense to return a "decoded unibyte" string from
DECODE_FILE in order to avoid an allocation, I don't think it makes
sense to return such a "decoded unibyte" string when we have to
allocate a new string anyway.


        Stefan