* expand-file-name problem for eight-bit-control/graphic
@ 2003-03-13 7:47 Kenichi Handa
2003-03-15 6:54 ` Richard Stallman
0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2003-03-13 7:47 UTC (permalink / raw)
I've just found that expand-file-name sometimes converts a
unibyte filename to multibyte, and a multibyte filename to
unibyte.
Ex.1 unibyte->multibyte
(expand-file-name "~/\201\300") => "/home/handa/À"
Ex.2 multibyte->unibyte
(expand-file-name "~/À\200") => "/home/handa/\201\300\236\240"
The reason is that it uses make_string and build_string
blindly. It seems that the attached patch fixes this bug,
but, as expand-file-name is one of the heavily system-dependent
parts, and has lots of "#ifdef", I'd like to ask other
people to confirm that this patch doesn't cause any problems.
---
Ken'ichi HANDA
handa@m17n.org
2003-03-13 Kenichi Handa <handa@etlken2>
* fileio.c (Fexpand_file_name): Preserve multibyteness of NAME in
the return value.
*** fileio.c.~1.474.~ Mon Feb 3 09:16:21 2003
--- fileio.c Thu Mar 13 16:28:58 2003
***************
*** 1028,1033 ****
--- 1028,1034 ----
#endif /* DOS_NT */
int length;
Lisp_Object handler;
+ int multibyte;
CHECK_STRING (name);
***************
*** 1111,1116 ****
--- 1112,1123 ----
name = FILE_SYSTEM_CASE (name);
#endif
+ if (STRING_MULTIBYTE (default_directory))
+ default_directory = ENCODE_FILE (default_directory);
+ multibyte = STRING_MULTIBYTE (name);
+ if (multibyte)
+ name = ENCODE_FILE (name);
+
nm = SDATA (name);
#ifdef DOS_NT
***************
*** 1275,1281 ****
{
#ifdef VMS
if (index (nm, '/'))
! return build_string (sys_translate_unix (nm));
#endif /* VMS */
#ifdef DOS_NT
/* Make sure directories are all separated with / or \ as
--- 1282,1294 ----
{
#ifdef VMS
if (index (nm, '/'))
! {
! nm = sys_translate_unix (nm);
! name = make_unibyte_string (nm, strlen (nm));
! if (multibyte)
! name = DECODE_FILE (name);
! return name;
! }
#endif /* VMS */
#ifdef DOS_NT
/* Make sure directories are all separated with / or \ as
***************
*** 1286,1307 ****
if (IS_DIRECTORY_SEP (nm[1]))
{
if (strcmp (nm, SDATA (name)) != 0)
! name = build_string (nm);
}
else
#endif
/* drive must be set, so this is okay */
if (strcmp (nm - 2, SDATA (name)) != 0)
{
! name = make_string (nm - 2, p - nm + 2);
SSET (name, 0, DRIVE_LETTER (drive));
SSET (name, 1, ':');
}
return name;
#else /* not DOS_NT */
! if (nm == SDATA (name))
! return name;
! return build_string (nm);
#endif /* not DOS_NT */
}
}
--- 1299,1324 ----
if (IS_DIRECTORY_SEP (nm[1]))
{
if (strcmp (nm, SDATA (name)) != 0)
! name = make_unibyte_string (nm, strlen (nm));
}
else
#endif
/* drive must be set, so this is okay */
if (strcmp (nm - 2, SDATA (name)) != 0)
{
! name = make_unibyte_string (nm - 2, p - nm + 2);
SSET (name, 0, DRIVE_LETTER (drive));
SSET (name, 1, ':');
}
+ if (multibyte)
+ name = DECODE_FILE (name);
return name;
#else /* not DOS_NT */
! if (nm != SDATA (name))
! name = make_unibyte_string (nm, strlen (nm));
! if (multibyte)
! name = DECODE_FILE (name);
! return name;
#endif /* not DOS_NT */
}
}
***************
*** 1670,1676 ****
CORRECT_DIR_SEPS (target);
#endif /* DOS_NT */
! return make_string (target, o - target);
}
#if 0
--- 1687,1696 ----
CORRECT_DIR_SEPS (target);
#endif /* DOS_NT */
! name = make_unibyte_string (target, o - target);
! if (multibyte)
! name = DECODE_FILE (name);
! return name;
}
#if 0
* Re: expand-file-name problem for eight-bit-control/graphic
2003-03-13 7:47 expand-file-name problem for eight-bit-control/graphic Kenichi Handa
@ 2003-03-15 6:54 ` Richard Stallman
2003-03-18 2:03 ` Kenichi Handa
0 siblings, 1 reply; 6+ messages in thread
From: Richard Stallman @ 2003-03-15 6:54 UTC (permalink / raw)
Cc: emacs-devel
I have a lot of doubts about this code, because it seems to encode and
then decode the file name. Since the arguments and values are both
strings for use within Emacs, I think it is incorrect for
expand-file-name to ever encode or decode a file name.
* Re: expand-file-name problem for eight-bit-control/graphic
2003-03-15 6:54 ` Richard Stallman
@ 2003-03-18 2:03 ` Kenichi Handa
2003-03-18 13:24 ` Kenichi Handa
0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2003-03-18 2:03 UTC (permalink / raw)
Cc: emacs-devel
In article <E18u5Z4-0004yd-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> I have a lot of doubts about this code, because it seems to encode and
> then decode the file name. Since the arguments and values are both
> strings for use within Emacs, I think it is incorrect for
> expand-file-name to ever encode or decode a file name.
But, at least, we can't use build_string and make_string
blindly to reconstruct a file name. So, how about the
attached change?
---
Ken'ichi HANDA
handa@m17n.org
*** fileio.c.~1.474.~ Tue Mar 18 09:59:02 2003
--- fileio.c Tue Mar 18 10:52:33 2003
***************
*** 992,997 ****
--- 992,1014 ----
\f
+ /* Return a string made from NBYTES bytes at P. If MULTIBYTE is
+ nonzero, the string is multibyte (it is assumed that the bytes are
+ in correct multibyte form). If MULTIBYTE is zero, the string is
+ unibyte. */
+
+ static Lisp_Object
+ bytes_to_string (unsigned char *p, int nbytes, int multibyte)
+ {
+ int nchars;
+
+ if (! multibyte)
+ return make_unibyte_string ((char *) p, nbytes);
+ nchars = multibyte_chars_in_text (p, nbytes);
+ return make_multibyte_string ((char *) p, nchars, nbytes);
+ }
+
+
DEFUN ("expand-file-name", Fexpand_file_name, Sexpand_file_name, 1, 2, 0,
doc: /* Convert filename NAME to absolute, and canonicalize it.
Second arg DEFAULT-DIRECTORY is directory to start with if NAME is relative
***************
*** 1028,1033 ****
--- 1045,1051 ----
#endif /* DOS_NT */
int length;
Lisp_Object handler;
+ int multibyte;
CHECK_STRING (name);
***************
*** 1111,1116 ****
--- 1129,1135 ----
name = FILE_SYSTEM_CASE (name);
#endif
+ multibyte = STRING_MULTIBYTE (name);
nm = SDATA (name);
#ifdef DOS_NT
***************
*** 1275,1281 ****
{
#ifdef VMS
if (index (nm, '/'))
! return build_string (sys_translate_unix (nm));
#endif /* VMS */
#ifdef DOS_NT
/* Make sure directories are all separated with / or \ as
--- 1294,1304 ----
{
#ifdef VMS
if (index (nm, '/'))
! {
! nm = sys_translate_unix (nm);
! length = strlen (nm);
! return bytes_to_string (nm, length, multibyte);
! }
#endif /* VMS */
#ifdef DOS_NT
/* Make sure directories are all separated with / or \ as
***************
*** 1286,1299 ****
if (IS_DIRECTORY_SEP (nm[1]))
{
if (strcmp (nm, SDATA (name)) != 0)
! name = build_string (nm);
}
else
#endif
/* drive must be set, so this is okay */
if (strcmp (nm - 2, SDATA (name)) != 0)
{
! name = make_string (nm - 2, p - nm + 2);
SSET (name, 0, DRIVE_LETTER (drive));
SSET (name, 1, ':');
}
--- 1309,1322 ----
if (IS_DIRECTORY_SEP (nm[1]))
{
if (strcmp (nm, SDATA (name)) != 0)
! name = bytes_to_string (nm, strlen (nm), multibyte);
}
else
#endif
/* drive must be set, so this is okay */
if (strcmp (nm - 2, SDATA (name)) != 0)
{
! name = bytes_to_string (nm, strlen (nm), multibyte);
SSET (name, 0, DRIVE_LETTER (drive));
SSET (name, 1, ':');
}
***************
*** 1301,1307 ****
#else /* not DOS_NT */
if (nm == SDATA (name))
return name;
! return build_string (nm);
#endif /* not DOS_NT */
}
}
--- 1324,1330 ----
#else /* not DOS_NT */
if (nm == SDATA (name))
return name;
! return bytes_to_string (nm, strlen (nm), multibyte);
#endif /* not DOS_NT */
}
}
***************
*** 1670,1676 ****
CORRECT_DIR_SEPS (target);
#endif /* DOS_NT */
! return make_string (target, o - target);
}
#if 0
--- 1693,1699 ----
CORRECT_DIR_SEPS (target);
#endif /* DOS_NT */
! return bytes_to_string (target, o - target, multibyte);
}
#if 0
* Re: expand-file-name problem for eight-bit-control/graphic
2003-03-18 2:03 ` Kenichi Handa
@ 2003-03-18 13:24 ` Kenichi Handa
2003-03-19 13:36 ` Richard Stallman
2003-03-20 18:49 ` Juanma Barranquero
0 siblings, 2 replies; 6+ messages in thread
From: Kenichi Handa @ 2003-03-18 13:24 UTC (permalink / raw)
Cc: emacs-devel
I wrote:
> But, at least, we can't use build_string and make_string
> blindly to reconstruct a file name. So, how about the
> attached change?
I found that several other functions in fileio.c have the
same problem as expand-file-name. They all do something
like this:
str = SDATA (filename);
...
if (STRING_MULTIBYTE (filename))
return make_string (beg, p - beg);
return make_unibyte_string (beg, p - beg);
and make_string will return a unibyte string if FILENAME
originally contains eight-bit-control/graphic characters.
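
To make the pattern above concrete, here is a minimal standalone sketch in C. It is not Emacs code: struct toy_string, toy_looks_multibyte, toy_guess_string and toy_build_string are hypothetical names, and the sniffing rule is invented for illustration only (it is not claimed to be what make_string actually checks). The byte values are the ones from Ex.1 and Ex.2 earlier in the thread. The point is only that a constructor which guesses multibyteness from the bytes can disagree with the source string, while one that is handed the flag cannot.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Lisp string; only the multibyte flag matters here.  */
struct toy_string
{
  const unsigned char *data;
  size_t nbytes;
  int multibyte;
};

/* Hypothetical sniffing rule, NOT the real Emacs encoding: a byte in
   0x81..0x8F followed by a byte >= 0xA0 counts as one two-byte
   character.  Return 1 if at least one such pair is found and no other
   high byte appears; otherwise return 0.  */
static int
toy_looks_multibyte (const unsigned char *p, size_t nbytes)
{
  size_t i = 0;
  int pairs = 0;

  while (i < nbytes)
    {
      if (p[i] < 0x80)
        i++;
      else if (p[i] >= 0x81 && p[i] <= 0x8F
               && i + 1 < nbytes && p[i + 1] >= 0xA0)
        {
          pairs++;
          i += 2;
        }
      else
        return 0;               /* stray high byte: call it unibyte */
    }
  return pairs > 0;
}

/* Content-sniffing constructor: the flag is guessed from the bytes.  */
static struct toy_string
toy_guess_string (const unsigned char *p, size_t nbytes)
{
  assert (p != NULL);
  struct toy_string s = { p, nbytes, toy_looks_multibyte (p, nbytes) };
  return s;
}

/* Flag-preserving constructor: the caller passes the flag taken from
   the string the bytes came from.  */
static struct toy_string
toy_build_string (const unsigned char *p, size_t nbytes, int multibyte)
{
  assert (p != NULL);
  struct toy_string s = { p, nbytes, multibyte != 0 };
  return s;
}
```

With the four bytes of the unibyte name from Ex.1 ('~', '/', 0x81, 0xC0), toy_guess_string reports multibyte, whereas toy_build_string called with multibyte = 0 keeps it unibyte. With the bytes 0x81 0xC0 0x9E 0xA0 from Ex.2, the toy rule rejects 0x9E and reports unibyte, whereas passing multibyte = 1 preserves the flag.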
Another bug is in read-file-name. It doesn't decode
homedir.
Here's a new patch which replaces the previous one.
I think this fix is important because nowadays people more
often encounter, for instance, utf-8 filenames in a latin-1
locale or vice versa.
---
Ken'ichi HANDA
handa@m17n.org
*** fileio.c.~1.474.~ Tue Mar 18 09:59:02 2003
--- fileio.c Tue Mar 18 22:09:35 2003
***************
*** 235,240 ****
--- 235,258 ----
Lisp_Object *, struct coding_system *));
static int e_write P_ ((int, Lisp_Object, int, int, struct coding_system *));
+ static Lisp_Object build_file_name P_ ((const unsigned char *, int, int));
+
+ /* Return a string made from NBYTES bytes at P. If MULTIBYTE is
+ nonzero, the string is multibyte (it is assumed that the bytes are
+ in correct multibyte form). If MULTIBYTE is zero, the string is
+ unibyte. */
+
+ static Lisp_Object
+ build_file_name (const unsigned char *p, int nbytes, int multibyte)
+ {
+ int nchars;
+
+ if (! multibyte)
+ return make_unibyte_string ((char *) p, nbytes);
+ nchars = multibyte_chars_in_text (p, nbytes);
+ return make_multibyte_string ((char *) p, nchars, nbytes);
+ }
+
\f
void
report_file_error (string, data)
***************
*** 447,455 ****
CORRECT_DIR_SEPS (beg);
#endif /* DOS_NT */
! if (STRING_MULTIBYTE (filename))
! return make_string (beg, p - beg);
! return make_unibyte_string (beg, p - beg);
}
DEFUN ("file-name-nondirectory", Ffile_name_nondirectory,
--- 465,471 ----
CORRECT_DIR_SEPS (beg);
#endif /* DOS_NT */
! return build_file_name (beg, p - beg, STRING_MULTIBYTE (filename));
}
DEFUN ("file-name-nondirectory", Ffile_name_nondirectory,
***************
*** 488,496 ****
)
p--;
! if (STRING_MULTIBYTE (filename))
! return make_string (p, end - p);
! return make_unibyte_string (p, end - p);
}
DEFUN ("unhandled-file-name-directory", Funhandled_file_name_directory,
--- 504,510 ----
)
p--;
! return build_file_name (p, end - p, STRING_MULTIBYTE (filename));
}
DEFUN ("unhandled-file-name-directory", Funhandled_file_name_directory,
***************
*** 631,637 ****
return call2 (handler, Qfile_name_as_directory, file);
buf = (char *) alloca (SBYTES (file) + 10);
! return build_string (file_name_as_directory (buf, SDATA (file)));
}
\f
/*
--- 645,652 ----
return call2 (handler, Qfile_name_as_directory, file);
buf = (char *) alloca (SBYTES (file) + 10);
! file_name_as_directory (buf, SDATA (file));
! return build_file_name (buf, strlen (buf), STRING_MULTIBYTE (file));
}
\f
/*
***************
*** 831,837 ****
buf = (char *) alloca (SBYTES (directory) + 20);
#endif
directory_file_name (SDATA (directory), buf);
! return build_string (buf);
}
static char make_temp_name_tbl[64] =
--- 846,852 ----
buf = (char *) alloca (SBYTES (directory) + 20);
#endif
directory_file_name (SDATA (directory), buf);
! return build_file_name (buf, strlen (buf), STRING_MULTIBYTE (directory));
}
static char make_temp_name_tbl[64] =
***************
*** 1275,1281 ****
{
#ifdef VMS
if (index (nm, '/'))
! return build_string (sys_translate_unix (nm));
#endif /* VMS */
#ifdef DOS_NT
/* Make sure directories are all separated with / or \ as
--- 1290,1300 ----
{
#ifdef VMS
if (index (nm, '/'))
! {
! nm = sys_translate_unix (nm);
! return build_file_name (nm, strlen (nm),
! STRING_MULTIBYTE (name));
! }
#endif /* VMS */
#ifdef DOS_NT
/* Make sure directories are all separated with / or \ as
***************
*** 1286,1299 ****
if (IS_DIRECTORY_SEP (nm[1]))
{
if (strcmp (nm, SDATA (name)) != 0)
! name = build_string (nm);
}
else
#endif
/* drive must be set, so this is okay */
if (strcmp (nm - 2, SDATA (name)) != 0)
{
! name = make_string (nm - 2, p - nm + 2);
SSET (name, 0, DRIVE_LETTER (drive));
SSET (name, 1, ':');
}
--- 1305,1320 ----
if (IS_DIRECTORY_SEP (nm[1]))
{
if (strcmp (nm, SDATA (name)) != 0)
! name
! = build_file_name (nm, strlen (nm), STRING_MULTIBYTE (name));
}
else
#endif
/* drive must be set, so this is okay */
if (strcmp (nm - 2, SDATA (name)) != 0)
{
! name
! = build_file_name (nm, strlen (nm), STRING_MULTIBYTE (name));
SSET (name, 0, DRIVE_LETTER (drive));
SSET (name, 1, ':');
}
***************
*** 1301,1307 ****
#else /* not DOS_NT */
if (nm == SDATA (name))
return name;
! return build_string (nm);
#endif /* not DOS_NT */
}
}
--- 1322,1328 ----
#else /* not DOS_NT */
if (nm == SDATA (name))
return name;
! return build_file_name (nm, strlen (nm), STRING_MULTIBYTE (name));
#endif /* not DOS_NT */
}
}
***************
*** 1670,1676 ****
CORRECT_DIR_SEPS (target);
#endif /* DOS_NT */
! return make_string (target, o - target);
}
#if 0
--- 1691,1697 ----
CORRECT_DIR_SEPS (target);
#endif /* DOS_NT */
! return build_file_name (target, o - target, STRING_MULTIBYTE (name));
}
#if 0
***************
*** 2101,2107 ****
}
#ifdef VMS
! return build_string (nm);
#else
/* See if any variables are substituted into the string
--- 2122,2128 ----
}
#ifdef VMS
! return build_file_name (nm, strlen (nm), STRING_MULTIBYTE (filename));
#else
/* See if any variables are substituted into the string
***************
*** 2244,2252 ****
xnm = p;
#endif
! if (STRING_MULTIBYTE (filename))
! return make_string (xnm, x - xnm);
! return make_unibyte_string (xnm, x - xnm);
badsubst:
error ("Bad format environment-variable substitution");
--- 2265,2271 ----
xnm = p;
#endif
! return build_file_name (xnm, x - xnm, STRING_MULTIBYTE (filename));
badsubst:
error ("Bad format environment-variable substitution");
***************
*** 6023,6028 ****
--- 6042,6048 ----
Lisp_Object val, insdef, tem;
struct gcpro gcpro1, gcpro2;
register char *homedir;
+ Lisp_Object decoded_homedir;
int replace_in_history = 0;
int add_to_history = 0;
int count;
***************
*** 6045,6069 ****
CORRECT_DIR_SEPS (homedir);
}
#endif
if (homedir != 0
&& STRINGP (dir)
! && !strncmp (homedir, SDATA (dir), strlen (homedir))
! && IS_DIRECTORY_SEP (SREF (dir, strlen (homedir))))
{
! dir = make_string (SDATA (dir) + strlen (homedir) - 1,
! SBYTES (dir) - strlen (homedir) + 1);
! SSET (dir, 0, '~');
}
/* Likewise for default_filename. */
if (homedir != 0
&& STRINGP (default_filename)
! && !strncmp (homedir, SDATA (default_filename), strlen (homedir))
! && IS_DIRECTORY_SEP (SREF (default_filename, strlen (homedir))))
{
default_filename
! = make_string (SDATA (default_filename) + strlen (homedir) - 1,
! SBYTES (default_filename) - strlen (homedir) + 1);
! SSET (default_filename, 0, '~');
}
if (!NILP (default_filename))
{
--- 6065,6093 ----
CORRECT_DIR_SEPS (homedir);
}
#endif
+ if (homedir != 0)
+ decoded_homedir
+ = DECODE_FILE (make_unibyte_string (homedir, strlen (homedir)));
if (homedir != 0
&& STRINGP (dir)
! && !strncmp (SDATA (decoded_homedir), SDATA (dir),
! SBYTES (decoded_homedir))
! && IS_DIRECTORY_SEP (SREF (dir, SBYTES (decoded_homedir))))
{
! dir = Fsubstring (dir, make_number (SCHARS (decoded_homedir) + 1), Qnil);
! dir = concat2 (build_string ("~"), dir);
}
/* Likewise for default_filename. */
if (homedir != 0
&& STRINGP (default_filename)
! && !strncmp (SDATA (decoded_homedir), SDATA (default_filename),
! SBYTES (decoded_homedir))
! && IS_DIRECTORY_SEP (SREF (default_filename, SBYTES (decoded_homedir))))
{
default_filename
! = Fsubstring (default_filename,
! make_number (SCHARS (decoded_homedir) + 1), Qnil);
! default_filename = concat2 (build_string ("~"), default_filename);
}
if (!NILP (default_filename))
{
* Re: expand-file-name problem for eight-bit-control/graphic
2003-03-18 13:24 ` Kenichi Handa
@ 2003-03-19 13:36 ` Richard Stallman
2003-03-20 18:49 ` Juanma Barranquero
1 sibling, 0 replies; 6+ messages in thread
From: Richard Stallman @ 2003-03-19 13:36 UTC (permalink / raw)
Cc: emacs-devel
This change looks good. But it seems that build_file_name
could be useful for other purposes, not only for file names,
so maybe it should have a different name that doesn't say it's
only for file names.
* Re: expand-file-name problem for eight-bit-control/graphic
2003-03-18 13:24 ` Kenichi Handa
2003-03-19 13:36 ` Richard Stallman
@ 2003-03-20 18:49 ` Juanma Barranquero
1 sibling, 0 replies; 6+ messages in thread
From: Juanma Barranquero @ 2003-03-20 18:49 UTC (permalink / raw)
Cc: emacs-devel
ELISP> (expand-file-name "c:/tmp")
"c:/tmp"
ELISP> (expand-file-name "c://tmp")
"c:mp"
ELISP> (expand-file-name "c:\\tmp")
"c:mp"
/L/e/k/t/u