From: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
To: 56469@debbugs.gnu.org
Subject: bug#56469: 29.0.50; Unibyte dir in directory_files_internal
Date: Sat, 09 Jul 2022 13:44:52 -0400
Message-ID: <jwvy1x2p4dn.fsf@iro.umontreal.ca>
Package: Emacs
Version: 29.0.50

If you have a directory named "/tmp/\303a" with a file named "fée"
inside, then (directory-files "/tmp/\303a" 'full) is likely to return
a funny string which is multibyte but contains an invalid
UTF-8 sequence (its bytes spell "/tmp/\303a/f\303\251e").
That string seems to be printed as "/tmp/á/fée", which corresponds
to "/tmp/\303\241/f\303\251e".
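
To make the byte-level problem concrete, here is a minimal standalone C
sketch (not Emacs code; both function names are made up for illustration).
It checks whether a byte string is well-formed UTF-8, and it shows what a
lenient two-byte decoder, one that does not verify the continuation byte,
would make of the pair \303 'a'.  Whether the Emacs printer takes exactly
this path is an assumption on my part; the report only says what the string
"seems to be printed as".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if the LEN bytes at S are structurally valid UTF-8.
   Simplified: checks lead bytes and continuation bytes only; it does
   not reject overlong forms or surrogates.  */
bool
utf8_valid_p (const unsigned char *s, size_t len)
{
  size_t i = 0;
  while (i < len)
    {
      unsigned char c = s[i];
      size_t extra;
      if (c < 0x80)
        extra = 0;
      else if ((c & 0xE0) == 0xC0)
        extra = 1;
      else if ((c & 0xF0) == 0xE0)
        extra = 2;
      else if ((c & 0xF8) == 0xF0)
        extra = 3;
      else
        return false;           /* stray continuation or invalid lead byte */
      if (extra > len - i - 1)
        return false;           /* sequence truncated at end of string */
      for (size_t k = 1; k <= extra; k++)
        if ((s[i + k] & 0xC0) != 0x80)
          return false;         /* e.g. \303 followed by plain 'a' */
      i += extra + 1;
    }
  return true;
}

/* Hypothetical lenient reader: decode a two-byte sequence while
   trusting, without checking, that NEXT is a continuation byte.  */
uint32_t
lenient_decode2 (unsigned char lead, unsigned char next)
{
  assert ((lead & 0xE0) == 0xC0);
  return ((uint32_t) (lead & 0x1F) << 6) | (next & 0x3F);
}
```

With these definitions, utf8_valid_p rejects the bytes
"/tmp/\303a/f\303\251e" and accepts "/tmp/\303\241/f\303\251e", while
lenient_decode2 (0303, 'a') yields 0xE1, the code point of "á".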

Such a string with an invalid UTF-8 sequence is handled quite gracefully
by Emacs, so I wasn't able to get an actual crash out of it, but it's
still something we should avoid.

I suggest the patch below.  In a comment I also suggest that we stop
using unibyte strings where a multibyte string would work just as well.
This is because for those ASCII-only strings, it's cheaper to test
bytes==chars to (re)discover that they are ASCII-only (when they're
multibyte) than to loop through the bytes (when they're unibyte).
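
The cost argument can be sketched in standalone C as follows.  The struct
below is a hypothetical stand-in, not Emacs's actual string object, and
ascii_only_p is an illustrative name.  The sketch relies on the fact that
in a UTF-8-like multibyte encoding every non-ASCII character occupies at
least two bytes, so equal byte and character counts imply ASCII-only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical string object, for illustration only.  */
struct str
{
  const unsigned char *data;
  size_t nbytes;        /* length in bytes */
  size_t nchars;        /* length in characters (used if multibyte) */
  bool multibyte;
};

/* Return true if S contains only ASCII characters.  */
bool
ascii_only_p (const struct str *s)
{
  assert (s != NULL);
  if (s->multibyte)
    /* O(1): any non-ASCII character would need more than one byte.  */
    return s->nchars == s->nbytes;
  /* Unibyte: no character count to compare against, so scan, O(n).  */
  for (size_t i = 0; i < s->nbytes; i++)
    if (s->data[i] >= 0x80)
      return false;
  return true;
}
```

For a multibyte "/tmp/a" (6 bytes, 6 chars) the answer comes from a single
comparison; for the unibyte "/tmp/\303a" the loop has to walk the bytes
until it reaches \303.
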
Stefan
diff --git a/src/dired.c b/src/dired.c
index 6bb8c2fcb9f..33ddfafd8e7 100644
--- a/src/dired.c
+++ b/src/dired.c
@@ -219,6 +219,13 @@ directory_files_internal (Lisp_Object directory, Lisp_Object full,
     }
 #endif
 
+  if (!NILP (full) && !STRING_MULTIBYTE (directory))
+    { /* We will be concatenating 'directory' with local file name.
+         We always decode local file names, so in order to safely concatenate
+         them we need 'directory' to be multibyte.  */
+      directory = Fstring_to_multibyte (directory);
+    }
+
   ptrdiff_t directory_nbytes = SBYTES (directory);
   re_match_object = Qt;
 
@@ -263,9 +270,10 @@ directory_files_internal (Lisp_Object directory, Lisp_Object full,
           ptrdiff_t name_nbytes = SBYTES (name);
           ptrdiff_t nbytes = directory_nbytes + needsep + name_nbytes;
           ptrdiff_t nchars = SCHARS (directory) + needsep + SCHARS (name);
-          finalname = make_uninit_multibyte_string (nchars, nbytes);
-          if (nchars == nbytes)
-            STRING_SET_UNIBYTE (finalname);
+          /* FIXME: Why not make them all multibyte?  */
+          finalname = (nchars == nbytes)
+            ? make_uninit_string (nchars, nbytes)
+            : make_uninit_multibyte_string (nchars, nbytes);
           memcpy (SDATA (finalname), SDATA (directory), directory_nbytes);
           if (needsep)
             SSET (finalname, directory_nbytes, DIRECTORY_SEP);
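
To illustrate why converting 'directory' to multibyte first makes the
byte-wise concatenation safe, here is a standalone C sketch of such a
conversion.  It is not the body of Fstring_to_multibyte; to_multibyte is
a made-up name, and the two-byte encoding of raw bytes (lead byte 0xC0 or
0xC1 followed by one continuation byte) is my recollection of Emacs's
internal representation, so treat it as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Convert the LEN unibyte bytes at SRC into a multibyte representation
   in DST, which must have room for 2 * LEN bytes.  Return the number
   of bytes written.  ASCII bytes are copied unchanged; each byte
   >= 0x80 becomes a self-contained two-byte sequence (assumed
   encoding, see the note above).  */
size_t
to_multibyte (const unsigned char *src, size_t len, unsigned char *dst)
{
  assert (src != NULL && dst != NULL);
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    {
      unsigned char b = src[i];
      if (b < 0x80)
        dst[n++] = b;
      else
        {
          dst[n++] = 0xC0 | ((b >> 6) & 1);   /* lead byte: 0xC0 or 0xC1 */
          dst[n++] = 0x80 | (b & 0x3F);       /* continuation byte */
        }
    }
  return n;
}
```

Under that assumption, the 7-byte unibyte "/tmp/\303a" becomes 8 bytes in
which \303 is spelled \301\203.  The raw byte is then a complete sequence
of its own, so the following 'a' (and any decoded file name appended
after it) can no longer be swallowed as a continuation byte.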