* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP @ 2013-01-26 5:52 Shigeru Fukaya 2013-01-26 10:26 ` Stefan Monnier 2013-01-26 12:56 ` Eli Zaretskii 0 siblings, 2 replies; 15+ messages in thread From: Shigeru Fukaya @ 2013-01-26 5:52 UTC (permalink / raw) To: 13553 Hi, `file-attributes' returns nil as stat info for a file, its name ending with a character whose second byte is 0x5c. It is because IS_DIRECTORY_SEP is wrongly used. IS_DIRECTORY_SEP only works when its argument is surely on a start byte of dbcs characters. I find incorrect usage, at least in some functions of `w32.c'. (readdir, stat_worker, read_unc_volume) I show, as a silly sample, of easy fix. #define IS_AT_DIRECTORY_SEP(p,i) \ (p[i] == '/' || (p[i] == '\\' && _mbsbtype(p,i) == _MBC_SINGLE)) and, IS_DIRECTORY_SEP(p[i]) to IS_AT_DIRECTORY_SEP(p, i) Regards, Shigeru ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 5:52 bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP Shigeru Fukaya @ 2013-01-26 10:26 ` Stefan Monnier 2013-01-26 11:16 ` Eli Zaretskii 2013-01-26 12:56 ` Eli Zaretskii 1 sibling, 1 reply; 15+ messages in thread From: Stefan Monnier @ 2013-01-26 10:26 UTC (permalink / raw) To: Shigeru Fukaya; +Cc: 13553 > It is because IS_DIRECTORY_SEP is wrongly used. > IS_DIRECTORY_SEP only works when its argument is surely on a start > byte of dbcs characters. If the file name is properly decoded (i.e. in its multibyte form), there can't be a 0x5c byte other than at the beginning of a char. So this bug can only happens if you pass a unibyte filename. IOW the bug is probably in the caller rather than in the file-attributes function. Please show us the actual situation where this happens, since file names are normally always multibyte. Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 10:26 ` Stefan Monnier @ 2013-01-26 11:16 ` Eli Zaretskii 2013-01-26 12:40 ` Stefan Monnier 0 siblings, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-01-26 11:16 UTC (permalink / raw) To: Stefan Monnier; +Cc: 13553, shigeru.fukaya > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sat, 26 Jan 2013 05:26:04 -0500 > Cc: 13553@debbugs.gnu.org > > > It is because IS_DIRECTORY_SEP is wrongly used. > > IS_DIRECTORY_SEP only works when its argument is surely on a start > > byte of dbcs characters. > > If the file name is properly decoded (i.e. in its multibyte form), there > can't be a 0x5c byte other than at the beginning of a char. > > So this bug can only happens if you pass a unibyte filename. IOW the > bug is probably in the caller rather than in the file-attributes function. No, it's not in the caller. It's in w32.c, whose functions almost always manipulate encoded file names (because they shadow system APIs). > Please show us the actual situation where this happens, since file names > are normally always multibyte. You call file-attributes, which encodes the file name and passes it to 'lstat'. The implementation of 'lstat' in w32.c then looks at the last byte of the encoded file name to see if there's a slash or backslash there. Boom! I will fix that. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 11:16 ` Eli Zaretskii @ 2013-01-26 12:40 ` Stefan Monnier 2013-01-26 13:23 ` Eli Zaretskii 0 siblings, 1 reply; 15+ messages in thread From: Stefan Monnier @ 2013-01-26 12:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 13553, shigeru.fukaya > You call file-attributes, which encodes the file name and passes it to > 'lstat'. The implementation of 'lstat' in w32.c then looks at the > last byte of the encoded file name to see if there's a slash or > backslash there. Boom! I see. So, does that meant that w32.c can't faithfully implement lstat without doing the moral equivalent of re-decoding its argument? Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 12:40 ` Stefan Monnier @ 2013-01-26 13:23 ` Eli Zaretskii 2013-01-26 22:24 ` Stefan Monnier 0 siblings, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-01-26 13:23 UTC (permalink / raw) To: Stefan Monnier; +Cc: 13553, shigeru.fukaya > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: shigeru.fukaya@gmail.com, 13553@debbugs.gnu.org > Date: Sat, 26 Jan 2013 07:40:08 -0500 > > > You call file-attributes, which encodes the file name and passes it to > > 'lstat'. The implementation of 'lstat' in w32.c then looks at the > > last byte of the encoded file name to see if there's a slash or > > backslash there. Boom! > > I see. So, does that meant that w32.c can't faithfully implement lstat > without doing the moral equivalent of re-decoding its argument? It can, if we limit such support to Windows codepage encodings. See the changes I made on the emacs-24 branch revisions 111194 and 111200. This will still lose if the user binds file-name-coding-system to something like shift_jis (instead of using its Windows extension cp932), but there's nothing I can do about that. I was lucky to find in Windows APIs 2 functions that can move forward and backward by DBCS characters, and that accept a codepage as their argument (i.e. do not limit support to the current system codepage). Personally, I think supporting all possible Windows codepages is good enough. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 13:23 ` Eli Zaretskii @ 2013-01-26 22:24 ` Stefan Monnier 2013-01-27 7:06 ` Eli Zaretskii 0 siblings, 1 reply; 15+ messages in thread From: Stefan Monnier @ 2013-01-26 22:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 13553, shigeru.fukaya >> > You call file-attributes, which encodes the file name and passes it to >> > 'lstat'. The implementation of 'lstat' in w32.c then looks at the >> > last byte of the encoded file name to see if there's a slash or >> > backslash there. Boom! >> I see. So, does that meant that w32.c can't faithfully implement lstat >> without doing the moral equivalent of re-decoding its argument? > It can, if we limit such support to Windows codepage encodings. See > the changes I made on the emacs-24 branch revisions 111194 and 111200. CharNextExA and CharPrevExA, in my mind, do perform the moral equivalent of decoding the argument (just in an incremental way). > Personally, I think supporting all possible Windows codepages is > good enough. Fine by me, Stefan ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 22:24 ` Stefan Monnier @ 2013-01-27 7:06 ` Eli Zaretskii 2013-01-27 8:36 ` Shigeru Fukaya 0 siblings, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-01-27 7:06 UTC (permalink / raw) To: Stefan Monnier; +Cc: 13553, shigeru.fukaya > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: shigeru.fukaya@gmail.com, 13553@debbugs.gnu.org > Date: Sat, 26 Jan 2013 17:24:37 -0500 > > CharNextExA and CharPrevExA, in my mind, do perform the moral equivalent > of decoding the argument (just in an incremental way). No, they are the equivalents of NEXT_CHAR_BOUNDARY and PREV_CHAR_BOUNDARY, except that they can do this for many different encodings, not just for UTF-8. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-27 7:06 ` Eli Zaretskii @ 2013-01-27 8:36 ` Shigeru Fukaya 2013-01-27 9:40 ` Eli Zaretskii 0 siblings, 1 reply; 15+ messages in thread From: Shigeru Fukaya @ 2013-01-27 8:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 13553 One more, please. CharNextExA/CharPrevExA work based on dynamically specified codepage. And, _mbspbrk/_mbslwr are also used in w32.c. They work based on MS Window's locale setting. Isn't it inconsistent? _mbsinc/_mbsdec would simply be sufficient? (Besides, I don't know well, _mbslwr seems to me lower latin characters' case on some locale. Is it true, or intended behavior?) Thank you, Shigeru >> From: Stefan Monnier <monnier@iro.umontreal.ca> >> Cc: shigeru.fukaya@gmail.com, 13553@debbugs.gnu.org >> Date: Sat, 26 Jan 2013 17:24:37 -0500 >> >> CharNextExA and CharPrevExA, in my mind, do perform the moral equivalent >> of decoding the argument (just in an incremental way). > >No, they are the equivalents of NEXT_CHAR_BOUNDARY and >PREV_CHAR_BOUNDARY, except that they can do this for many different >encodings, not just for UTF-8. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-27 8:36 ` Shigeru Fukaya @ 2013-01-27 9:40 ` Eli Zaretskii 0 siblings, 0 replies; 15+ messages in thread From: Eli Zaretskii @ 2013-01-27 9:40 UTC (permalink / raw) To: Shigeru Fukaya; +Cc: 13553 > From: Shigeru Fukaya <shigeru.fukaya@gmail.com> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, > 13553@debbugs.gnu.org > Date: Sun, 27 Jan 2013 17:36:48 +0900 > > CharNextExA/CharPrevExA work based on dynamically specified codepage. > > And, _mbspbrk/_mbslwr are also used in w32.c. They work based on MS > Window's locale setting. > > Isn't it inconsistent? It is, but I don't know of any practical way of eliminating that inconsistency. In the normal case, where the current file-name-coding-system is the same as the ANSI codepage, there is no inconsistency, so using CharNextExA/CharPrevExA is never worse than using _mbsinc/_mbsdec. It is sometimes better, because the code which calls the _mbs* functions is executed only by some functions in w32.c. > _mbsinc/_mbsdec would simply be sufficient? That would preclude code like this: (let ((file-name-coding-system 'cp932)) (expand-file-name "SOMETHING" "C:/") I think it's reasonable to expect this to be supported, even if the default file-name encoding is not cp932. > (Besides, I don't know well, _mbslwr seems to me lower latin > characters' case on some locale. Is it true, or intended behavior?) The intended behavior is to downcase letters for which lower-case is defined. I don't know if this is limited to Latin characters. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 5:52 bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP Shigeru Fukaya 2013-01-26 10:26 ` Stefan Monnier @ 2013-01-26 12:56 ` Eli Zaretskii 2013-01-26 14:06 ` Eli Zaretskii 2013-01-26 16:33 ` Shigeru Fukaya 1 sibling, 2 replies; 15+ messages in thread From: Eli Zaretskii @ 2013-01-26 12:56 UTC (permalink / raw) To: Shigeru Fukaya; +Cc: 13553 > From: Shigeru Fukaya <shigeru.fukaya@gmail.com> > Date: Sat, 26 Jan 2013 14:52:59 +0900 > > `file-attributes' returns nil as stat info for a file, its name ending > with a character whose second byte is 0x5c. > > It is because IS_DIRECTORY_SEP is wrongly used. > IS_DIRECTORY_SEP only works when its argument is surely on a start > byte of dbcs characters. > I find incorrect usage, at least in some functions of `w32.c'. > (readdir, stat_worker, read_unc_volume) Thanks. I fixed w32.c functions that use IS_DIRECTORY_SEP in revision 111200 on the emacs-24 branch. As I cannot easily test DBCS file-name encodings on my system, please test the fixed version and see if the problem indeed goes away. The patch relative to emacs-24 branch is below, for your convenience. After looking at the code which I fixed, I'm afraid I don't understand why file-attributes returned nil in your case. The code that uses IS_DIRECTORY_SEP there is only a fallback, for when the GetFileInformationByHandle API fails for some reason. You don't say what version of Windows you are using, but I won't expect that fallback code be executed for any reasonably recent Windows version. So could you please trace through the code that file-attributes calls, and see which use of IS_DIRECTORY_SEP caused the problem? I'm afraid that it's not in w32.c at all, but elsewhere. === modified file 'src/w32.c' --- src/w32.c 2013-01-25 14:47:37 +0000 +++ src/w32.c 2013-01-26 12:49:34 +0000 @@ -1503,12 +1503,17 @@ parse_root (char * name, char ** pPath) else if (IS_DIRECTORY_SEP (name[0]) && IS_DIRECTORY_SEP (name[1])) { int slashes = 2; + int dbcs_p = max_filename_mbslen () > 1; + name += 2; do { if (IS_DIRECTORY_SEP (*name) && --slashes == 0) break; - name++; + if (dbcs_p) + name = CharNextExA (file_name_codepage, name, 0); + else + name++; } while ( *name ); if (IS_DIRECTORY_SEP (name[0])) @@ -2369,12 +2374,23 @@ get_volume_info (const char * name, cons { char *str = temp; int slashes = 4; + int dbcs_p = max_filename_mbslen () > 1; + rootname = temp; do { if (IS_DIRECTORY_SEP (*name) && --slashes == 0) break; - *str++ = *name++; + if (!dbcs_p) + *str++ = *name++; + else + { + const char *p = name; + + name = CharNextExA (file_name_codepage, name, 0); + memcpy (str, p, name - p); + str += name - p; + } } while ( *name ); @@ -2610,11 +2626,23 @@ readdir (DIR *dirp) { char filename[MAXNAMLEN + 3]; int ln; + int dbcs_p = max_filename_mbslen () > 1; strcpy (filename, dir_pathname); ln = strlen (filename) - 1; - if (!IS_DIRECTORY_SEP (filename[ln])) - strcat (filename, "\\"); + if (!dbcs_p) + { + if (!IS_DIRECTORY_SEP (filename[ln])) + strcat (filename, "\\"); + } + else + { + char *end = filename + ln + 1; + char *last_char = CharPrevExA (file_name_codepage, filename, end, 0); + + if (!IS_DIRECTORY_SEP (*last_char)) + strcat (filename, "\\"); + } strcat (filename, "*"); /* Note: No need to resolve symlinks in FILENAME, because @@ -2719,6 +2747,7 @@ read_unc_volume (HANDLE henum, char *rea DWORD bufsize = 512; char *buffer; char *ptr; + int dbcs_p = max_filename_mbslen () > 1; count = 1; buffer = alloca (bufsize); @@ -2729,7 +2758,13 @@ read_unc_volume (HANDLE henum, char *rea /* WNetEnumResource returns \\resource\share...skip forward to "share". */ ptr = ((LPNETRESOURCE) buffer)->lpRemoteName; ptr += 2; - while (*ptr && !IS_DIRECTORY_SEP (*ptr)) ptr++; + if (!dbcs_p) + while (*ptr && !IS_DIRECTORY_SEP (*ptr)) ptr++; + else + { + while (*ptr && !IS_DIRECTORY_SEP (*ptr)) + ptr = CharNextExA (file_name_codepage, ptr, 0); + } ptr++; strncpy (readbuf, ptr, size); @@ -2766,9 +2801,11 @@ logon_network_drive (const char *path) { NETRESOURCE resource; char share[MAX_PATH]; - int i, n_slashes; + int n_slashes; char drive[4]; UINT drvtype; + char *p; + int dbcs_p; if (IS_DIRECTORY_SEP (path[0]) && IS_DIRECTORY_SEP (path[1])) drvtype = DRIVE_REMOTE; @@ -2790,13 +2827,18 @@ logon_network_drive (const char *path) n_slashes = 2; strncpy (share, path, MAX_PATH); /* Truncate to just server and share name. */ - for (i = 2; i < MAX_PATH; i++) + dbcs_p = max_filename_mbslen () > 1; + for (p = share + 2; *p && p < share + MAX_PATH; ) { - if (IS_DIRECTORY_SEP (share[i]) && ++n_slashes > 3) + if (IS_DIRECTORY_SEP (*p) && ++n_slashes > 3) { - share[i] = '\0'; + *p = '\0'; break; } + if (dbcs_p) + p = CharNextExA (file_name_codepage, p, 0); + else + p++; } resource.dwType = RESOURCETYPE_DISK; @@ -3557,6 +3599,7 @@ stat_worker (const char * path, struct s DWORD access_rights = 0; DWORD fattrs = 0, serialnum = 0, fs_high = 0, fs_low = 0, nlinks = 1; FILETIME ctime, atime, wtime; + int dbcs_p; if (path == NULL || buf == NULL) { @@ -3751,6 +3794,7 @@ stat_worker (const char * path, struct s did not ask for extra precision, resolving symlinks will fly in the face of that request, since the user then wants the lightweight version of the code. */ + dbcs_p = max_filename_mbslen () > 1; rootdir = (path >= save_name + len - 1 && (IS_DIRECTORY_SEP (*path) || *path == 0)); @@ -3778,8 +3822,19 @@ stat_worker (const char * path, struct s } else if (rootdir) { - if (!IS_DIRECTORY_SEP (name[len-1])) - strcat (name, "\\"); + if (!dbcs_p) + { + if (!IS_DIRECTORY_SEP (name[len-1])) + strcat (name, "\\"); + } + else + { + char *end = name + len; + char *n = CharPrevExA (file_name_codepage, name, end, 0); + + if (!IS_DIRECTORY_SEP (*n)) + strcat (name, "\\"); + } if (GetDriveType (name) < 2) { errno = ENOENT; @@ -3791,15 +3846,37 @@ stat_worker (const char * path, struct s } else { - if (IS_DIRECTORY_SEP (name[len-1])) - name[len - 1] = 0; + if (!dbcs_p) + { + if (IS_DIRECTORY_SEP (name[len-1])) + name[len - 1] = 0; + } + else + { + char *end = name + len; + char *n = CharPrevExA (file_name_codepage, name, end, 0); + + if (IS_DIRECTORY_SEP (*n)) + *n = 0; + } /* (This is hacky, but helps when doing file completions on network drives.) Optimize by using information available from active readdir if possible. */ len = strlen (dir_pathname); - if (IS_DIRECTORY_SEP (dir_pathname[len-1])) - len--; + if (!dbcs_p) + { + if (IS_DIRECTORY_SEP (dir_pathname[len-1])) + len--; + } + else + { + char *end = dir_pathname + len; + char *n = CharPrevExA (file_name_codepage, dir_pathname, end, 0); + + if (IS_DIRECTORY_SEP (*n)) + len--; + } if (dir_find_handle != INVALID_HANDLE_VALUE && !(is_a_symlink && follow_symlinks) && strnicmp (save_name, dir_pathname, len) == 0 @@ -4060,6 +4137,7 @@ symlink (char const *filename, char cons char linkfn[MAX_PATH], *tgtfn; DWORD flags = 0; int dir_access, filename_ends_in_slash; + int dbcs_p; /* Diagnostics follows Posix as much as possible. */ if (filename == NULL || linkname == NULL) @@ -4085,6 +4163,8 @@ symlink (char const *filename, char cons return -1; } + dbcs_p = max_filename_mbslen () > 1; + /* Note: since empty FILENAME was already rejected, we can safely refer to FILENAME[1]. */ if (!(IS_DIRECTORY_SEP (filename[0]) || IS_DEVICE_SEP (filename[1]))) @@ -4099,8 +4179,21 @@ symlink (char const *filename, char cons char tem[MAX_PATH]; char *p = linkfn + strlen (linkfn); - while (p > linkfn && !IS_ANY_SEP (p[-1])) - p--; + if (!dbcs_p) + { + while (p > linkfn && !IS_ANY_SEP (p[-1])) + p--; + } + else + { + char *p1 = CharPrevExA (file_name_codepage, linkfn, p, 0); + + while (p > linkfn && !IS_ANY_SEP (*p1)) + { + p = p1; + p1 = CharPrevExA (file_name_codepage, linkfn, p1, 0); + } + } if (p > linkfn) strncpy (tem, linkfn, p - linkfn); tem[p - linkfn] = '\0'; @@ -4115,7 +4208,15 @@ symlink (char const *filename, char cons exist, but ends in a slash, we create a symlink to directory. If FILENAME exists and is a directory, we always create a symlink to directory. */ - filename_ends_in_slash = IS_DIRECTORY_SEP (filename[strlen (filename) - 1]); + if (!dbcs_p) + filename_ends_in_slash = IS_DIRECTORY_SEP (filename[strlen (filename) - 1]); + else + { + const char *end = filename + strlen (filename); + const char *n = CharPrevExA (file_name_codepage, filename, end, 0); + + filename_ends_in_slash = IS_DIRECTORY_SEP (*n); + } if (dir_access == 0 || filename_ends_in_slash) flags = SYMBOLIC_LINK_FLAG_DIRECTORY; @@ -4440,6 +4541,7 @@ chase_symlinks (const char *file) char link[MAX_PATH]; ssize_t res, link_len; int loop_count = 0; + int dbcs_p; if (is_windows_9x () == TRUE || !is_symlink (file)) return (char *)file; @@ -4447,13 +4549,27 @@ chase_symlinks (const char *file) if ((link_len = GetFullPathName (file, MAX_PATH, link, NULL)) == 0) return (char *)file; + dbcs_p = max_filename_mbslen () > 1; target[0] = '\0'; do { /* Remove trailing slashes, as we want to resolve the last non-trivial part of the link name. */ - while (link_len > 3 && IS_DIRECTORY_SEP (link[link_len-1])) - link[link_len--] = '\0'; + if (!dbcs_p) + { + while (link_len > 3 && IS_DIRECTORY_SEP (link[link_len-1])) + link[link_len--] = '\0'; + } + else if (link_len > 3) + { + char *n = CharPrevExA (file_name_codepage, link, link + link_len, 0); + + while (n >= link + 2 && IS_DIRECTORY_SEP (*n)) + { + n[1] = '\0'; + n = CharPrevExA (file_name_codepage, link, n, 0); + } + } res = readlink (link, target, MAX_PATH); if (res > 0) @@ -4466,8 +4582,21 @@ chase_symlinks (const char *file) the symlink, then copy the result back to target. */ char *p = link + link_len; - while (p > link && !IS_ANY_SEP (p[-1])) - p--; + if (!dbcs_p) + { + while (p > link && !IS_ANY_SEP (p[-1])) + p--; + } + else + { + char *p1 = CharPrevExA (file_name_codepage, link, p, 0); + + while (p > link && !IS_ANY_SEP (*p1)) + { + p = p1; + p1 = CharPrevExA (file_name_codepage, link, p1, 0); + } + } strcpy (p, target); strcpy (target, link); } ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 12:56 ` Eli Zaretskii @ 2013-01-26 14:06 ` Eli Zaretskii 2013-01-26 16:33 ` Shigeru Fukaya 1 sibling, 0 replies; 15+ messages in thread From: Eli Zaretskii @ 2013-01-26 14:06 UTC (permalink / raw) To: shigeru.fukaya; +Cc: 13553 > Date: Sat, 26 Jan 2013 14:56:46 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: 13553@debbugs.gnu.org > > After looking at the code which I fixed, I'm afraid I don't understand > why file-attributes returned nil in your case. The code that uses > IS_DIRECTORY_SEP there is only a fallback, for when the > GetFileInformationByHandle API fails for some reason. You don't say > what version of Windows you are using, but I won't expect that > fallback code be executed for any reasonably recent Windows version. Actually, that fallback code gets also executed if you customize w32-get-true-file-attributes to nil, or if the file is on a remote (i.e. networked) filesystem. Could one of these be true in your case? ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 12:56 ` Eli Zaretskii 2013-01-26 14:06 ` Eli Zaretskii @ 2013-01-26 16:33 ` Shigeru Fukaya 2013-01-26 17:37 ` Eli Zaretskii 1 sibling, 1 reply; 15+ messages in thread From: Shigeru Fukaya @ 2013-01-26 16:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 13553 I greatly appreciate your very quick work, thank you. I looked through your fix of all functions (parse_root, get_volume_info, readdir, read_unc_volume, logon_network_drive, star_worker, symlink, chase_symlinks) one by one, and found good at first look. I'll try build and check it next, anyway. Yes, my `w32-get-true-file-attributes' is nil. (Isn't it default?) As for coding symbol, not a few Japanese use 'cp932 not 'shift-jis I suppose. But we need some notice for users. Regards, Shigeru >> From: Shigeru Fukaya <shigeru.fukaya@gmail.com> >> Date: Sat, 26 Jan 2013 14:52:59 +0900 >> >> `file-attributes' returns nil as stat info for a file, its name ending >> with a character whose second byte is 0x5c. >> >> It is because IS_DIRECTORY_SEP is wrongly used. >> IS_DIRECTORY_SEP only works when its argument is surely on a start >> byte of dbcs characters. >> I find incorrect usage, at least in some functions of `w32.c'. >> (readdir, stat_worker, read_unc_volume) > >Thanks. I fixed w32.c functions that use IS_DIRECTORY_SEP in revision >111200 on the emacs-24 branch. As I cannot easily test DBCS file-name >encodings on my system, please test the fixed version and see if the >problem indeed goes away. The patch relative to emacs-24 branch is >below, for your convenience. > >After looking at the code which I fixed, I'm afraid I don't understand >why file-attributes returned nil in your case. The code that uses >IS_DIRECTORY_SEP there is only a fallback, for when the >GetFileInformationByHandle API fails for some reason. You don't say >what version of Windows you are using, but I won't expect that >fallback code be executed for any reasonably recent Windows version. > >So could you please trace through the code that file-attributes calls, >and see which use of IS_DIRECTORY_SEP caused the problem? I'm afraid >that it's not in w32.c at all, but elsewhere. > > >=== modified file 'src/w32.c' ... ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 16:33 ` Shigeru Fukaya @ 2013-01-26 17:37 ` Eli Zaretskii 2013-01-27 6:56 ` Shigeru Fukaya 0 siblings, 1 reply; 15+ messages in thread From: Eli Zaretskii @ 2013-01-26 17:37 UTC (permalink / raw) To: Shigeru Fukaya; +Cc: 13553 > From: Shigeru Fukaya <shigeru.fukaya@gmail.com> > Cc: 13553@debbugs.gnu.org > Date: Sun, 27 Jan 2013 01:33:32 +0900 > > I'll try build and check it next, anyway. Thank you. > Yes, my `w32-get-true-file-attributes' is nil. (Isn't it default?) The default is 'local', which means your local files use a different code path, one which doesn't need to use IS_DIRECTORY_SEP. If your value was nil, then I understand why the code I fixed could cause trouble. > As for coding symbol, not a few Japanese use 'cp932 not 'shift-jis I > suppose. But we need some notice for users. Not sure what you are saying here. Do you think many Japanese Windows users will set file-name-coding-system to shift_jis, or that most of them will set it to cp932? Normally, users don't customize file-name-coding-system at all, in which case Emacs will use default-file-name-coding-system, that is automatically set to cp932, according to the system-wide codepage. ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-26 17:37 ` Eli Zaretskii @ 2013-01-27 6:56 ` Shigeru Fukaya 2013-01-27 7:15 ` Eli Zaretskii 0 siblings, 1 reply; 15+ messages in thread From: Shigeru Fukaya @ 2013-01-27 6:56 UTC (permalink / raw) To: Eli Zaretskii; +Cc: 13553 Built on migw32, and no trouble, thank you (I didn't do coverage tests, sorry). But we must remember, my raise of issue is the incorrect usage of IS_DIRECTORY_SEP. There are still more in fileio.c and more. (I really expect remove the check of '\' from IS_DIRECTORY_SEP, and handle '\' only where it is necessary, and things wolud be better) >> As for coding symbol, not a few Japanese use 'cp932 not 'shift-jis I >> suppose. But we need some notice for users. > >Not sure what you are saying here. Do you think many Japanese Windows >users will set file-name-coding-system to shift_jis, or that most of >them will set it to cp932? > >Normally, users don't customize file-name-coding-system at all, in >which case Emacs will use default-file-name-coding-system, that is >automatically set to cp932, according to the system-wide codepage. Yes, you are right. I mean, maybe, the case of using some remote file system. You can check by yourself their (our?) usage if you like. Some seems still using shift-jis, not cp932. http://www.google.co.jp/search?q=file-name-coding-system+sjis Regards, Shigeru ^ permalink raw reply [flat|nested] 15+ messages in thread
* bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP 2013-01-27 6:56 ` Shigeru Fukaya @ 2013-01-27 7:15 ` Eli Zaretskii 0 siblings, 0 replies; 15+ messages in thread From: Eli Zaretskii @ 2013-01-27 7:15 UTC (permalink / raw) To: Shigeru Fukaya; +Cc: 13553-done > From: Shigeru Fukaya <shigeru.fukaya@gmail.com> > Cc: 13553@debbugs.gnu.org > Date: Sun, 27 Jan 2013 15:56:54 +0900 > > Built on migw32, and no trouble, thank you > (I didn't do coverage tests, sorry). Thanks. I'm therefore closing this bug. > But we must remember, my raise of issue is the incorrect usage of > IS_DIRECTORY_SEP. There are still more in fileio.c and more. Yes. This is being discussed on emacs-devel, and once that discussion ends, the fileio.c and dired.c functions will be fixed as well. > >> As for coding symbol, not a few Japanese use 'cp932 not 'shift-jis I > >> suppose. But we need some notice for users. > > > >Not sure what you are saying here. Do you think many Japanese Windows > >users will set file-name-coding-system to shift_jis, or that most of > >them will set it to cp932? > > > >Normally, users don't customize file-name-coding-system at all, in > >which case Emacs will use default-file-name-coding-system, that is > >automatically set to cp932, according to the system-wide codepage. > > Yes, you are right. I mean, maybe, the case of using some remote file > system. Accessing remote files doesn't go through functions in w32.c, it goes through file handlers (in Tramp). > You can check by yourself their (our?) usage if you like. > Some seems still using shift-jis, not cp932. > > http://www.google.co.jp/search?q=file-name-coding-system+sjis If they want 100% solid support, they will have to change their customizations, sorry. (We could implement an equivalence table, whereby, e.g., shift_jis would be mapped to cp932, but these encodings are slightly different, so I think that would be a kludge.) In any case, the new code in w32.c is no worse than the previous one, when file-name-coding-system is set to anything that is not a recognized Windows codepage. It is actually slightly better: it uses the system-wide ANSI codepage in that case. In most cases, this would be the right thing anyway. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2013-01-27 9:40 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-01-26 5:52 bug#13553: 24.3.50; incorrect usage of IS_DIRECTORY_SEP Shigeru Fukaya 2013-01-26 10:26 ` Stefan Monnier 2013-01-26 11:16 ` Eli Zaretskii 2013-01-26 12:40 ` Stefan Monnier 2013-01-26 13:23 ` Eli Zaretskii 2013-01-26 22:24 ` Stefan Monnier 2013-01-27 7:06 ` Eli Zaretskii 2013-01-27 8:36 ` Shigeru Fukaya 2013-01-27 9:40 ` Eli Zaretskii 2013-01-26 12:56 ` Eli Zaretskii 2013-01-26 14:06 ` Eli Zaretskii 2013-01-26 16:33 ` Shigeru Fukaya 2013-01-26 17:37 ` Eli Zaretskii 2013-01-27 6:56 ` Shigeru Fukaya 2013-01-27 7:15 ` Eli Zaretskii
Code repositories for project(s) associated with this external index https://git.savannah.gnu.org/cgit/emacs.git https://git.savannah.gnu.org/cgit/emacs/org-mode.git This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.