* expand-file-name problem for eight-bit-control/graphic
From: Kenichi Handa @ 2003-03-13 7:47 UTC

I've just found that expand-file-name sometimes converts a unibyte
filename to multibyte, and a multibyte filename to unibyte.

Ex.1 unibyte->multibyte
  (expand-file-name "~/\201\300") => "/home/handa/À"

Ex.2 multibyte->unibyte
  (expand-file-name "~/À\200") => "/home/handa/\201\300\236\240"

The reason is that it uses make_string and build_string blindly.

It seems that the attached patch fixes this bug, but, as
expand-file-name is one of the heavily system-dependent parts, and has
lots of "#ifdef", I'd like to ask other people to confirm that this
patch doesn't cause any problems.

---
Ken'ichi HANDA
handa@m17n.org

2003-03-13  Kenichi Handa  <handa@etlken2>

	* fileio.c (Fexpand_file_name): Preserve multibyteness of NAME in
	the return value.

*** fileio.c.~1.474.~	Mon Feb 3 09:16:21 2003
--- fileio.c	Thu Mar 13 16:28:58 2003
***************
*** 1028,1033 ****
--- 1028,1034 ----
  #endif /* DOS_NT */
    int length;
    Lisp_Object handler;
+   int multibyte;

    CHECK_STRING (name);

***************
*** 1111,1116 ****
--- 1112,1123 ----
    name = FILE_SYSTEM_CASE (name);
  #endif

+   if (STRING_MULTIBYTE (default_directory))
+     default_directory = ENCODE_FILE (default_directory);
+   multibyte = STRING_MULTIBYTE (name);
+   if (multibyte)
+     name = ENCODE_FILE (name);
+
    nm = SDATA (name);

  #ifdef DOS_NT
***************
*** 1275,1281 ****
          {
  #ifdef VMS
            if (index (nm, '/'))
!             return build_string (sys_translate_unix (nm));
  #endif /* VMS */
  #ifdef DOS_NT
            /* Make sure directories are all separated with / or \ as
--- 1282,1294 ----
          {
  #ifdef VMS
            if (index (nm, '/'))
!             {
!               nm = sys_translate_unix (nm);
!               name = make_unibyte_string (nm, strlen (nm));
!               if (multibyte)
!                 name = DECODE_FILE (name);
!               return name;
!             }
  #endif /* VMS */
  #ifdef DOS_NT
            /* Make sure directories are all separated with / or \ as
***************
*** 1286,1307 ****
            if (IS_DIRECTORY_SEP (nm[1]))
              {
                if (strcmp (nm, SDATA (name)) != 0)
!                 name = build_string (nm);
              }
            else
  #endif
            /* drive must be set, so this is okay */
            if (strcmp (nm - 2, SDATA (name)) != 0)
              {
!               name = make_string (nm - 2, p - nm + 2);
                SSET (name, 0, DRIVE_LETTER (drive));
                SSET (name, 1, ':');
              }
            return name;
  #else /* not DOS_NT */
!           if (nm == SDATA (name))
!             return name;
!           return build_string (nm);
  #endif /* not DOS_NT */
          }
      }
--- 1299,1324 ----
            if (IS_DIRECTORY_SEP (nm[1]))
              {
                if (strcmp (nm, SDATA (name)) != 0)
!                 name = make_unibyte_string (nm, strlen (nm));
              }
            else
  #endif
            /* drive must be set, so this is okay */
            if (strcmp (nm - 2, SDATA (name)) != 0)
              {
!               name = make_unibyte_string (nm - 2, p - nm + 2);
                SSET (name, 0, DRIVE_LETTER (drive));
                SSET (name, 1, ':');
              }
+           if (multibyte)
+             name = DECODE_FILE (name);
            return name;
  #else /* not DOS_NT */
!           if (nm != SDATA (name))
!             name = make_unibyte_string (nm, strlen (nm));
!           if (multibyte)
!             name = DECODE_FILE (name);
!           return name;
  #endif /* not DOS_NT */
          }
      }
***************
*** 1670,1676 ****
    CORRECT_DIR_SEPS (target);
  #endif /* DOS_NT */

!   return make_string (target, o - target);
  }

  #if 0
--- 1687,1696 ----
    CORRECT_DIR_SEPS (target);
  #endif /* DOS_NT */

!   name = make_unibyte_string (target, o - target);
!   if (multibyte)
!     name = DECODE_FILE (name);
!   return name;
  }

  #if 0
* Re: expand-file-name problem for eight-bit-control/graphic
From: Richard Stallman @ 2003-03-15 6:54 UTC
Cc: emacs-devel

I have a lot of doubts about this code, because it seems to encode and
then decode the file name. Since the arguments and values are both
strings for use within Emacs, I think it is incorrect for
expand-file-name to ever encode or decode a file name.
* Re: expand-file-name problem for eight-bit-control/graphic
From: Kenichi Handa @ 2003-03-18 2:03 UTC
Cc: emacs-devel

In article <E18u5Z4-0004yd-00@fencepost.gnu.org>, Richard Stallman <rms@gnu.org> writes:
> I have a lot of doubts about this code, because it seems to encode and
> then decode the file name. Since the arguments and values are both
> strings for use within Emacs, I think it is incorrect for
> expand-file-name to ever encode or decode a file name.

But, at least, we can't use build_string and make_string blindly to
reconstruct a file name. So, how about the attached change?

---
Ken'ichi HANDA
handa@m17n.org

*** fileio.c.~1.474.~	Tue Mar 18 09:59:02 2003
--- fileio.c	Tue Mar 18 10:52:33 2003
***************
*** 992,997 ****
--- 992,1014 ----


  \f
+ /* Return a string made from NBYTES bytes at P.  If MULTIBYTE is
+    nonzero, the string is multibyte (it is assumed that the bytes are
+    in correct multibyte form).  If MULTIBYTE is zero, the string is
+    unibyte.  */
+
+ static Lisp_Object
+ bytes_to_string (unsigned char *p, int nbytes, int multibyte)
+ {
+   int nchars;
+
+   if (! multibyte)
+     return make_unibyte_string ((char *) p, nbytes);
+   nchars = multibyte_chars_in_text (p, nbytes);
+   return make_multibyte_string ((char *) p, nchars, nbytes);
+ }
+
+
  DEFUN ("expand-file-name", Fexpand_file_name, Sexpand_file_name, 1, 2, 0,
         doc: /* Convert filename NAME to absolute, and canonicalize it.
  Second arg DEFAULT-DIRECTORY is directory to start with if NAME is relative
***************
*** 1028,1033 ****
--- 1045,1051 ----
  #endif /* DOS_NT */
    int length;
    Lisp_Object handler;
+   int multibyte;

    CHECK_STRING (name);

***************
*** 1111,1116 ****
--- 1129,1135 ----
    name = FILE_SYSTEM_CASE (name);
  #endif

+   multibyte = STRING_MULTIBYTE (name);
    nm = SDATA (name);

  #ifdef DOS_NT
***************
*** 1275,1281 ****
          {
  #ifdef VMS
            if (index (nm, '/'))
!             return build_string (sys_translate_unix (nm));
  #endif /* VMS */
  #ifdef DOS_NT
            /* Make sure directories are all separated with / or \ as
--- 1294,1304 ----
          {
  #ifdef VMS
            if (index (nm, '/'))
!             {
!               nm = sys_translate_unix (nm);
!               length = strlen (nm);
!               return bytes_to_string (nm, length, multibyte);
!             }
  #endif /* VMS */
  #ifdef DOS_NT
            /* Make sure directories are all separated with / or \ as
***************
*** 1286,1299 ****
            if (IS_DIRECTORY_SEP (nm[1]))
              {
                if (strcmp (nm, SDATA (name)) != 0)
!                 name = build_string (nm);
              }
            else
  #endif
            /* drive must be set, so this is okay */
            if (strcmp (nm - 2, SDATA (name)) != 0)
              {
!               name = make_string (nm - 2, p - nm + 2);
                SSET (name, 0, DRIVE_LETTER (drive));
                SSET (name, 1, ':');
              }
--- 1309,1322 ----
            if (IS_DIRECTORY_SEP (nm[1]))
              {
                if (strcmp (nm, SDATA (name)) != 0)
!                 name = bytes_to_string (nm, strlen (nm), multibyte);
              }
            else
  #endif
            /* drive must be set, so this is okay */
            if (strcmp (nm - 2, SDATA (name)) != 0)
              {
!               name = bytes_to_string (nm, strlen (nm), multibyte);
                SSET (name, 0, DRIVE_LETTER (drive));
                SSET (name, 1, ':');
              }
***************
*** 1301,1307 ****
  #else /* not DOS_NT */
            if (nm == SDATA (name))
              return name;
!           return build_string (nm);
  #endif /* not DOS_NT */
          }
      }
--- 1324,1330 ----
  #else /* not DOS_NT */
            if (nm == SDATA (name))
              return name;
!           return bytes_to_string (nm, strlen (nm), multibyte);
  #endif /* not DOS_NT */
          }
      }
***************
*** 1670,1676 ****
    CORRECT_DIR_SEPS (target);
  #endif /* DOS_NT */

!   return make_string (target, o - target);
  }

  #if 0
--- 1693,1699 ----
    CORRECT_DIR_SEPS (target);
  #endif /* DOS_NT */

!   return bytes_to_string (target, o - target, multibyte);
  }

  #if 0
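The helper proposed in the message above rests on one idea: the caller states whether the bytes are multibyte, so the constructor never has to guess from the byte values. The following is a minimal, self-contained C sketch of that idea, not Emacs code. The names `sketch_string`, `sketch_chars_in_text`, and `sketch_bytes_to_string` are hypothetical, and UTF-8 lead-byte counting is used only as a stand-in for Emacs's internal multibyte form, which is different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Lisp string; not the Emacs Lisp_Object.  */
struct sketch_string
{
  unsigned char *data;
  size_t nbytes;     /* length in bytes */
  size_t nchars;     /* length in characters */
  int multibyte;     /* nonzero if DATA holds multibyte text */
};

/* Count characters in NBYTES bytes at P.  Assumption: UTF-8-style
   sequences, where every byte that is not of the form 10xxxxxx starts
   a character.  Emacs's own multibyte form is not UTF-8.  */
static size_t
sketch_chars_in_text (const unsigned char *p, size_t nbytes)
{
  size_t i, nchars = 0;

  for (i = 0; i < nbytes; i++)
    if ((p[i] & 0xC0) != 0x80)
      nchars++;
  return nchars;
}

/* Build a string from NBYTES bytes at P.  The representation is taken
   from MULTIBYTE, never inferred from the bytes themselves.  */
static struct sketch_string
sketch_bytes_to_string (const unsigned char *p, size_t nbytes, int multibyte)
{
  struct sketch_string s;

  assert (p != NULL || nbytes == 0);
  s.data = malloc (nbytes + 1);
  if (s.data == NULL)
    abort ();
  if (nbytes > 0)
    memcpy (s.data, p, nbytes);
  s.data[nbytes] = '\0';
  s.nbytes = nbytes;
  s.multibyte = multibyte != 0;
  /* In a unibyte string every byte is one character.  */
  s.nchars = s.multibyte ? sketch_chars_in_text (p, nbytes) : nbytes;
  return s;
}
```

With the same four bytes, the unibyte result reports four characters and the multibyte result reports three. The difference comes only from the flag the caller passes in.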
* Re: expand-file-name problem for eight-bit-control/graphic
From: Kenichi Handa @ 2003-03-18 13:24 UTC
Cc: emacs-devel

I wrote:
> But, at least, we can't use build_string and make_string
> blindly to reconstruct a file name. So, how about the
> attached change?

I found that several other functions in fileio.c have the same problem
as expand-file-name. They all do something like this:

  str = SDATA (filename);
  ...
  if (STRING_MULTIBYTE (filename))
    return make_string (beg, p - beg);
  return make_unibyte_string (beg, p - beg);

and make_string will return a unibyte string if FILENAME originally
contains eight-bit-control/graphics.

Another bug is in read-file-name. It doesn't decode homedir.

Here's a new patch which replaces the previous one. I think this fix
is important because nowadays people more often encounter, for
instance, utf-8 filenames in latin-1 locale or vice versa.

---
Ken'ichi HANDA
handa@m17n.org

*** fileio.c.~1.474.~	Tue Mar 18 09:59:02 2003
--- fileio.c	Tue Mar 18 22:09:35 2003
***************
*** 235,240 ****
--- 235,258 ----
                                  Lisp_Object *, struct coding_system *));
  static int e_write P_ ((int, Lisp_Object, int, int,
                          struct coding_system *));
+ static Lisp_Object build_file_name P_ ((const unsigned char *, int, int));
+
+ /* Return a string made from NBYTES bytes at P.  If MULTIBYTE is
+    nonzero, the string is multibyte (it is assumed that the bytes are
+    in correct multibyte form).  If MULTIBYTE is zero, the string is
+    unibyte.  */
+
+ static Lisp_Object
+ build_file_name (const unsigned char *p, int nbytes, int multibyte)
+ {
+   int nchars;
+
+   if (! multibyte)
+     return make_unibyte_string ((char *) p, nbytes);
+   nchars = multibyte_chars_in_text (p, nbytes);
+   return make_multibyte_string ((char *) p, nchars, nbytes);
+ }
+
  \f
  void
  report_file_error (string, data)
***************
*** 447,455 ****
    CORRECT_DIR_SEPS (beg);
  #endif /* DOS_NT */

!   if (STRING_MULTIBYTE (filename))
!     return make_string (beg, p - beg);
!   return make_unibyte_string (beg, p - beg);
  }

  DEFUN ("file-name-nondirectory", Ffile_name_nondirectory,
--- 465,471 ----
    CORRECT_DIR_SEPS (beg);
  #endif /* DOS_NT */

!   return build_file_name (beg, p - beg, STRING_MULTIBYTE (filename));
  }

  DEFUN ("file-name-nondirectory", Ffile_name_nondirectory,
***************
*** 488,496 ****
           )
      p--;

!   if (STRING_MULTIBYTE (filename))
!     return make_string (p, end - p);
!   return make_unibyte_string (p, end - p);
  }

  DEFUN ("unhandled-file-name-directory", Funhandled_file_name_directory,
--- 504,510 ----
           )
      p--;

!   return build_file_name (p, end - p, STRING_MULTIBYTE (filename));
  }

  DEFUN ("unhandled-file-name-directory", Funhandled_file_name_directory,
***************
*** 631,637 ****
      return call2 (handler, Qfile_name_as_directory, file);

    buf = (char *) alloca (SBYTES (file) + 10);
!   return build_string (file_name_as_directory (buf, SDATA (file)));
  }
  \f
  /*
--- 645,652 ----
      return call2 (handler, Qfile_name_as_directory, file);

    buf = (char *) alloca (SBYTES (file) + 10);
!   file_name_as_directory (buf, SDATA (file));
!   return build_file_name (buf, strlen (buf), STRING_MULTIBYTE (file));
  }
  \f
  /*
***************
*** 831,837 ****
    buf = (char *) alloca (SBYTES (directory) + 20);
  #endif
    directory_file_name (SDATA (directory), buf);
!   return build_string (buf);
  }

  static char make_temp_name_tbl[64] =
--- 846,852 ----
    buf = (char *) alloca (SBYTES (directory) + 20);
  #endif
    directory_file_name (SDATA (directory), buf);
!   return build_file_name (buf, strlen (buf), STRING_MULTIBYTE (directory));
  }

  static char make_temp_name_tbl[64] =
***************
*** 1275,1281 ****
          {
  #ifdef VMS
            if (index (nm, '/'))
!             return build_string (sys_translate_unix (nm));
  #endif /* VMS */
  #ifdef DOS_NT
            /* Make sure directories are all separated with / or \ as
--- 1290,1300 ----
          {
  #ifdef VMS
            if (index (nm, '/'))
!             {
!               nm = sys_translate_unix (nm);
!               return build_file_name (nm, strlen (nm),
!                                       STRING_MULTIBYTE (name));
!             }
  #endif /* VMS */
  #ifdef DOS_NT
            /* Make sure directories are all separated with / or \ as
***************
*** 1286,1299 ****
            if (IS_DIRECTORY_SEP (nm[1]))
              {
                if (strcmp (nm, SDATA (name)) != 0)
!                 name = build_string (nm);
              }
            else
  #endif
            /* drive must be set, so this is okay */
            if (strcmp (nm - 2, SDATA (name)) != 0)
              {
!               name = make_string (nm - 2, p - nm + 2);
                SSET (name, 0, DRIVE_LETTER (drive));
                SSET (name, 1, ':');
              }
--- 1305,1320 ----
            if (IS_DIRECTORY_SEP (nm[1]))
              {
                if (strcmp (nm, SDATA (name)) != 0)
!                 name
!                   = build_file_name (nm, strlen (nm), STRING_MULTIBYTE (name));
              }
            else
  #endif
            /* drive must be set, so this is okay */
            if (strcmp (nm - 2, SDATA (name)) != 0)
              {
!               name
!                 = build_file_name (nm, strlen (nm), STRING_MULTIBYTE (name));
                SSET (name, 0, DRIVE_LETTER (drive));
                SSET (name, 1, ':');
              }
***************
*** 1301,1307 ****
  #else /* not DOS_NT */
            if (nm == SDATA (name))
              return name;
!           return build_string (nm);
  #endif /* not DOS_NT */
          }
      }
--- 1322,1328 ----
  #else /* not DOS_NT */
            if (nm == SDATA (name))
              return name;
!           return build_file_name (nm, strlen (nm), STRING_MULTIBYTE (name));
  #endif /* not DOS_NT */
          }
      }
***************
*** 1670,1676 ****
    CORRECT_DIR_SEPS (target);
  #endif /* DOS_NT */

!   return make_string (target, o - target);
  }

  #if 0
--- 1691,1697 ----
    CORRECT_DIR_SEPS (target);
  #endif /* DOS_NT */

!   return build_file_name (target, o - target, STRING_MULTIBYTE (name));
  }

  #if 0
***************
*** 2101,2107 ****
      }

  #ifdef VMS
!   return build_string (nm);
  #else

    /* See if any variables are substituted into the string
--- 2122,2128 ----
      }

  #ifdef VMS
!   return build_file_name (nm, strlen (nm), STRING_MULTIBYTE (filename));
  #else

    /* See if any variables are substituted into the string
***************
*** 2244,2252 ****
    xnm = p;
  #endif

!   if (STRING_MULTIBYTE (filename))
!     return make_string (xnm, x - xnm);
!   return make_unibyte_string (xnm, x - xnm);

   badsubst:
    error ("Bad format environment-variable substitution");
--- 2265,2271 ----
    xnm = p;
  #endif

!   return build_file_name (xnm, x - xnm, STRING_MULTIBYTE (filename));

   badsubst:
    error ("Bad format environment-variable substitution");
***************
*** 6023,6028 ****
--- 6042,6048 ----
    Lisp_Object val, insdef, tem;
    struct gcpro gcpro1, gcpro2;
    register char *homedir;
+   Lisp_Object decoded_homedir;
    int replace_in_history = 0;
    int add_to_history = 0;
    int count;
***************
*** 6045,6069 ****
        CORRECT_DIR_SEPS (homedir);
      }
  #endif
    if (homedir != 0
        && STRINGP (dir)
!       && !strncmp (homedir, SDATA (dir), strlen (homedir))
!       && IS_DIRECTORY_SEP (SREF (dir, strlen (homedir))))
      {
!       dir = make_string (SDATA (dir) + strlen (homedir) - 1,
!                          SBYTES (dir) - strlen (homedir) + 1);
!       SSET (dir, 0, '~');
      }
    /* Likewise for default_filename.  */
    if (homedir != 0
        && STRINGP (default_filename)
!       && !strncmp (homedir, SDATA (default_filename), strlen (homedir))
!       && IS_DIRECTORY_SEP (SREF (default_filename, strlen (homedir))))
      {
        default_filename
!         = make_string (SDATA (default_filename) + strlen (homedir) - 1,
!                        SBYTES (default_filename) - strlen (homedir) + 1);
!       SSET (default_filename, 0, '~');
      }
    if (!NILP (default_filename))
      {
--- 6065,6093 ----
        CORRECT_DIR_SEPS (homedir);
      }
  #endif
+   if (homedir != 0)
+     decoded_homedir
+       = DECODE_FILE (make_unibyte_string (homedir, strlen (homedir)));
    if (homedir != 0
        && STRINGP (dir)
!       && !strncmp (SDATA (decoded_homedir), SDATA (dir),
!                    SBYTES (decoded_homedir))
!       && IS_DIRECTORY_SEP (SREF (dir, SBYTES (decoded_homedir))))
      {
!       dir = Fsubstring (dir, make_number (SCHARS (decoded_homedir) + 1), Qnil);
!       dir = concat2 (build_string ("~"), dir);
      }
    /* Likewise for default_filename.  */
    if (homedir != 0
        && STRINGP (default_filename)
!       && !strncmp (SDATA (decoded_homedir), SDATA (default_filename),
!                    SBYTES (decoded_homedir))
!       && IS_DIRECTORY_SEP (SREF (default_filename, SBYTES (decoded_homedir))))
      {
        default_filename
!         = Fsubstring (default_filename,
!                       make_number (SCHARS (decoded_homedir) + 1), Qnil);
!       default_filename = concat2 (build_string ("~"), default_filename);
      }
    if (!NILP (default_filename))
      {
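The read-file-name part of the patch above compares the home directory with DIR only once both are in the same representation, and then replaces the matching prefix with "~". Below is a minimal sketch of that prefix test on plain C strings, assuming both arguments are already in one representation. `sketch_abbreviate_home` is a hypothetical name, only '/' is treated as a directory separator, and nothing here is Emacs code. The sketch keeps the separator, so a successful result begins with "~/".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* If DIR names something below HOME, return a newly allocated copy of
   DIR with the HOME prefix replaced by "~"; otherwise return NULL.
   Assumption: HOME and DIR use the same byte representation, so a
   byte-wise prefix comparison is meaningful.  */
static char *
sketch_abbreviate_home (const char *home, const char *dir)
{
  size_t hlen, dlen;
  char *out;

  assert (home != NULL && dir != NULL);
  hlen = strlen (home);
  dlen = strlen (dir);

  /* The prefix must match byte for byte, and the byte right after it
     must be a separator, so "/home/handax" does not match.  */
  if (hlen == 0 || strncmp (home, dir, hlen) != 0 || dir[hlen] != '/')
    return NULL;

  /* '~' + the rest of DIR starting at the separator + the final NUL.  */
  out = malloc (dlen - hlen + 2);
  if (out == NULL)
    abort ();
  out[0] = '~';
  memcpy (out + 1, dir + hlen, dlen - hlen + 1);
  return out;
}
```

If HOME were still in its encoded form while DIR had been decoded, the byte comparison could fail even though both name the same directory, which is why the two must agree before the test.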
* Re: expand-file-name problem for eight-bit-control/graphic
From: Richard Stallman @ 2003-03-19 13:36 UTC
Cc: emacs-devel

This change looks good. But it seems that build_file_name could be
useful for other purposes, not only for file names, so maybe it should
have a different name that doesn't say it's only for file names.
* Re: expand-file-name problem for eight-bit-control/graphic
From: Juanma Barranquero @ 2003-03-20 18:49 UTC
Cc: emacs-devel

ELISP> (expand-file-name "c:/tmp")
"c:/tmp"
ELISP> (expand-file-name "c://tmp")
"c:mp"
ELISP> (expand-file-name "c:\\tmp")
"c:mp"

/L/e/k/t/u