* Re: file-attribute on certain Chinese filenames failed
From: Kenichi Handa @ 2007-02-13 5:36 UTC (permalink / raw)
To: MJ Chan; +Cc: emacs-devel
In article <45c9d948.5c6acfa4.4c9b.ffffeb01@mx.google.com>, MJ Chan <mjchan.inbox@gmail.com> writes:
> I had trouble in accessing some files that contains certain Chinese
> (big5) characters (in fact, I found only one character so far that is
> causing the problem; all others are good). After checking around, I
> found that the problem would take place in "lstat" call in the C
> source. One example is in "file-attributes" (src/dired.c).
> For example, if I created a file named "=26345" (see below for the
> "describe-char" on that particular Chinese character) and then
> evaluated this, it returned nil.
> (file-attributes "=26345")
> nil
I can't reproduce that problem on GNU/Linux. For instance,
this code seems to work correctly.
(let ((file-name-coding-system 'big5)
      (filename (string #x26345)))
  (with-temp-file filename (insert "abc\n"))
  (file-attributes filename))
=> (nil 1 8308 8308 (17873 19356) (17873 19356) (17873 19356)
    4 "-rw-rw-r--" nil 11175288 21)
But, when I run the same code on Windows, write-region
causes this error:
(file-error "Opening output file" "invalid argument"
            "c:/cygwin/home/handa/\x26345")
I suspect that this is because the big5 byte sequence of the
character #x26345 is "\267|", and Windows XP (at least my
Japanese version) doesn't allow creating a file whose name
contains "|". For instance, I can't create a file "a|" either.
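A minimal sketch of why "|" is the character that collides here, assuming the usual Big5 trail-byte ranges (0x40-0x7E and 0xA1-0xFE). The helper names are made up for illustration and are not Emacs code. It filters a set of characters that are reserved in Windows file names down to those that can also be the second byte of a Big5 character:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: can BYTE be the second (trail) byte of a
   two-byte Big5 character?  Assumes the usual Big5 trail ranges
   0x40-0x7E and 0xA1-0xFE.  */
bool
big5_trail_possible (unsigned char byte)
{
  return (byte >= 0x40 && byte <= 0x7E) || (byte >= 0xA1 && byte <= 0xFE);
}

/* Copy into OUT (capacity OUTSIZE, including the terminating NUL)
   every character of SET that could also be a Big5 trail byte.
   Return the number of characters copied.  */
size_t
big5_trail_collisions (const char *set, char *out, size_t outsize)
{
  size_t n = 0;

  assert (set != NULL && out != NULL);
  for (; *set != '\0'; set++)
    if (big5_trail_possible ((unsigned char) *set) && n + 1 < outsize)
      out[n++] = *set;
  if (outsize > 0)
    out[n] = '\0';
  return n;
}
```

Called with the set "*?|<>\"", only "|" (0x7C) is copied, because the other five characters lie below 0x40. Under these assumptions, any Big5 character whose trail byte is 0x7C looks as if it contained "|" to code that scans the name one byte at a time.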
> As mentioned above, the culprit seems to be in calling lstat with
> encoded version of the file name as shown below (taken from dired.c)
> GCPRO1 (filename);
> encoded = ENCODE_FILE (filename);
> UNGCPRO;
> if (lstat (SDATA (encoded), &s) < 0)
> return Qnil;
> If I changed the call to use un-encoded filename (i.e. lstat
> (filename,...)), then it is good. But I am not sure if this is the
> right thing to do.
I believe it's not the right fix, and first of all, I have
no idea why such a change fixes your case.
Anyway, it seems that this is a Windows-specific problem.
---
Kenichi Handa
handa@m17n.org
* Re: file-attribute on certain Chinese filenames failed
From: MJ Chan @ 2007-02-13 6:06 UTC (permalink / raw)
To: Kenichi Handa; +Cc: emacs-devel
Thanks for looking into the problem.
The same code you used did not cause write-region to fail on my
XP machine. That is, the file "會" was written correctly. But again,
file-attributes returned nil.
Also, if I use utf-8, then file-attributes incorrectly reports the
file as a directory, as shown below. This is odd.
(let ((file-name-coding-system 'utf-8))
  (file-attributes "會"))
=> (t 1 5 5 (17873 20762) (17873 20755) (17430 62310) 0 "drwxrwxrwx" nil 0 (41075 . 32279))
(let ((file-name-coding-system 'big5))
  (file-attributes "會"))
=> nil
If you have any other clues, please let me know.
BTW, my machine is running English XP with Big5 (Chinese/Taiwan) set for
non-Unicode programs in Regional and Language Options in Control Panel.
* Re: file-attribute on certain Chinese filenames failed
From: Jason Rumney @ 2007-02-14 15:15 UTC (permalink / raw)
To: MJ Chan; +Cc: emacs-devel, Kenichi Handa
MJ Chan wrote:
> Also, if I use utf-8, then file-attributes reports incorrectly the
> file as a directory as shown below. This is odd.
>
It is not odd that an incorrect setting produces incorrect results.
utf-8 is never a valid file-name-coding-system on Windows XP.
* Re: file-attribute on certain Chinese filenames failed
From: Eli Zaretskii @ 2007-02-17 12:22 UTC (permalink / raw)
To: MJ Chan; +Cc: emacs-devel, handa
> Date: Tue, 13 Feb 2007 01:06:39 -0500
> From: MJ Chan <mjchan.inbox@gmail.com>
> Cc: emacs-devel@gnu.org
>
> The same code you used did not cause write-region to fail on my
> XP. That is, the file "會" was written correctly. But again,
> file-attributes returned nil.
Could you perhaps step with a debugger into the function `stat' (in
w32.c) and see where it fails in this case?
I tried to reproduce this on Windows, but couldn't: file-attributes
works for me with non-ASCII file names. However, I don't have access
to a Chinese Windows system.
* Re: file-attribute on certain Chinese filenames failed
From: MJ Chan @ 2007-02-17 15:09 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, handa
Thanks for pointing out w32.c, in which the function stat is defined. I
thought Emacs used the native Windows stat.
Indeed, the problem is in stat, which calls strpbrk to check for
invalid filename characters (*?|<>\"). The encoded form of the
Chinese/Big5 character that I have the problem with contains '|'.
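The false positive can be reproduced outside Emacs with a small sketch. The function name is invented for illustration, and the two bytes 0xB7 0x7C are the Big5 sequence "\267|" reported earlier in this thread. The check below mimics a byte-wise strpbrk test; it is not the actual code from w32.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Characters that the check discussed in this thread treats as
   invalid in a file name.  */
#define INVALID_CHARS "*?|<>\""

/* Byte-wise check: the name is reported as invalid as soon as ANY
   byte matches, even when that byte is only the trail byte of a
   two-byte character.  */
bool
bytewise_name_is_invalid (const char *name)
{
  assert (name != NULL);
  return strpbrk (name, INVALID_CHARS) != NULL;
}
```

Here bytewise_name_is_invalid ("\xB7\x7C") is true even though the two bytes form a single Big5 character, which matches the symptom of stat rejecting the name before it ever reaches the file system.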
I also did some tests with the native Windows stat call, and it did not
fail with that Chinese filename.
* Re: file-attribute on certain Chinese filenames failed
From: Eli Zaretskii @ 2007-02-18 22:15 UTC (permalink / raw)
To: MJ Chan; +Cc: emacs-devel, handa
> Date: Sat, 17 Feb 2007 10:09:16 -0500
> From: MJ Chan <mjchan.inbox@gmail.com>
> Cc: handa@m17n.org, emacs-devel@gnu.org
>
> Indeed, the problem is in stat, which calls strpbrk for checking
> invalid filename, (*?|<>\"). The Chinese/Big5 character that I have
> problem with contains '|'.
Thanks for pointing out this blunder.
Does the patch below fix the problem for you with Chinese file names?
Index: src/w32.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/w32.c,v
retrieving revision 1.110
diff -u -r1.110 w32.c
--- src/w32.c 21 Jan 2007 04:18:15 -0000 1.110
+++ src/w32.c 18 Feb 2007 22:13:58 -0000
@@ -33,6 +33,7 @@
 #include <sys/time.h>
 #include <sys/utime.h>
+#include <mbstring.h>
 /* must include CRT headers *before* config.h */
 #ifdef HAVE_CONFIG_H
@@ -2387,7 +2388,7 @@
   name = (char *) map_w32_filename (path, &path);
   /* must be valid filename, no wild cards or other invalid characters */
-  if (strpbrk (name, "*?|<>\""))
+  if (_mbspbrk (name, "*?|<>\""))
     {
       errno = ENOENT;
       return -1;
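For readers without the Microsoft C runtime at hand, the idea behind switching to a multibyte-aware search can be sketched in portable C. This is only an illustration: dbcs_pbrk and is_lead_byte are hypothetical names, and the lead-byte range 0x81-0xFE is an assumption modelled on the Windows Big5 code page, whereas the real _mbspbrk consults the runtime's current multibyte code page.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: every byte in 0x81-0xFE starts a two-byte character.  */
int
is_lead_byte (unsigned char c)
{
  return c >= 0x81 && c <= 0xFE;
}

/* Hypothetical stand-in for a multibyte-aware strpbrk: return a
   pointer to the first single-byte character of STR that occurs in
   the ASCII-only set CHARSET, or NULL if there is none.  Two-byte
   characters are skipped as a unit, so their trail bytes are never
   compared against CHARSET.  */
const char *
dbcs_pbrk (const char *str, const char *charset)
{
  const char *p;

  assert (str != NULL && charset != NULL);
  while (*str != '\0')
    {
      if (is_lead_byte ((unsigned char) *str) && str[1] != '\0')
        {
          str += 2;             /* skip lead byte and trail byte */
          continue;
        }
      for (p = charset; *p != '\0'; p++)
        if (*p == *str)
          return str;
      str++;
    }
  return NULL;
}
```

With this scan, the name "\xB7\x7C" yields NULL, while a real "|" following that character is still found, which is the behaviour the patch relies on.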
* Re: file-attribute on certain Chinese filenames failed
From: MJ Chan @ 2007-02-19 4:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, handa
Yes, the patch has fixed the problem. Thanks.
* Re: file-attribute on certain Chinese filenames failed
From: Eli Zaretskii @ 2007-02-23 18:41 UTC (permalink / raw)
To: MJ Chan; +Cc: emacs-devel, handa
> Date: Sun, 18 Feb 2007 23:03:47 -0500
> From: MJ Chan <mjchan.inbox@gmail.com>
> Cc: handa@m17n.org, emacs-devel@gnu.org
>
> Yes, the patch has fixed the problem. Thanks.
Thanks, I installed it (and also replaced a couple of other similar
uses of strpbrk).