* Re: file-attribute on certain Chinese filenames failed
       [not found] <45c9d948.5c6acfa4.4c9b.ffffeb01@mx.google.com>
@ 2007-02-13  5:36 ` Kenichi Handa
  2007-02-13  6:06   ` MJ Chan
  0 siblings, 1 reply; 8+ messages in thread
From: Kenichi Handa @ 2007-02-13  5:36 UTC (permalink / raw)
  To: MJ Chan; +Cc: emacs-devel

In article <45c9d948.5c6acfa4.4c9b.ffffeb01@mx.google.com>, MJ Chan <mjchan.inbox@gmail.com> writes:

> I had trouble accessing some files whose names contain certain Chinese
> (big5) characters (in fact, I have found only one character so far that
> causes the problem; all others are fine). After checking around, I
> found that the problem occurs in the "lstat" call in the C
> source. One example is in "file-attributes" (src/dired.c).

> For example, if I created a file named "=26345" (see below for the
> "describe-char" on that particular Chinese character) and then
> evaluated this, it returned nil. 

> (file-attributes "=26345")
> nil

I can't reproduce that problem on GNU/Linux.  For instance,
this code seems to work correctly:

(let ((file-name-coding-system 'big5)
      (filename (string #x26345)))
  (with-temp-file filename (insert "abc\n"))
  (file-attributes filename))

=> (nil 1 8308 8308 (17873 19356) (17873 19356) (17873 19356)
    4 "-rw-rw-r--" nil 11175288 21)

But, when I run the same code on Windows, write-region
causes this error:

(file-error "Opening output file" "invalid argument"
            "c:/cygwin/home/handa/\x26345")

I suspect that this is because the big5 sequence of the
character #x26345 is "\267|", and Windows-XP (at least my
Japanese version) doesn't allow creating a file containing
"|".  For instance, I can't create a file "a|" either.

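To make the byte-level point concrete, here is a minimal C sketch (an
editorial illustration, not code from Emacs).  It assumes the Big5 byte
sequence quoted above, "\267|", and shows that a plain byte scan finds
'|' at the second byte.  The names big5_name and first_byte_equal_to
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Big5 bytes for the character, as given in the thread:
   octal \267 (0xB7) followed by '|' (0x7C).  */
static const unsigned char big5_name[] = { 0267, '|', '\0' };

/* Return the index of the first byte equal to CH, or -1 if none.
   This is a plain byte comparison; it knows nothing about
   double-byte characters.  */
int first_byte_equal_to(const unsigned char *s, unsigned char ch)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; s[i] != '\0'; i++)
        if (s[i] == ch)
            return (int) i;
    return -1;
}
```

Calling first_byte_equal_to (big5_name, '|') reports index 1, that is,
the trail byte of the two-byte character.
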
> As mentioned above, the culprit seems to be calling lstat with the
> encoded version of the file name, as shown below (taken from dired.c)

>   GCPRO1 (filename);
>   encoded = ENCODE_FILE (filename);
>   UNGCPRO;

>   if (lstat (SDATA (encoded), &s) < 0)

>     return Qnil;

> If I changed the call to use un-encoded filename (i.e. lstat
> (filename,...)), then it is good. But I am not sure if this is the
> right thing to do. 

I believe it's not the right fix, and first of all, I have
no idea why such a change fixes your case.

Anyway, it seems that this is a Windows-specific problem.

---
Kenichi Handa
handa@m17n.org


* Re: file-attribute on certain Chinese filenames failed
  2007-02-13  5:36 ` file-attribute on certain Chinese filenames failed Kenichi Handa
@ 2007-02-13  6:06   ` MJ Chan
  2007-02-14 15:15     ` Jason Rumney
  2007-02-17 12:22     ` Eli Zaretskii
  0 siblings, 2 replies; 8+ messages in thread
From: MJ Chan @ 2007-02-13  6:06 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

Thanks for looking into the problem. 

The same code you used did not cause write-region to fail on my
XP. That is, the file "會" was written correctly. But again,
file-attributes returned nil.

Also, if I use utf-8, then file-attributes incorrectly reports the
file as a directory, as shown below. This is odd.

(let ((file-name-coding-system 'utf-8))
  (file-attributes "會"))
=> (t 1 5 5 (17873 20762) (17873 20755) (17430 62310) 0 "drwxrwxrwx" nil 0 (41075 . 32279))

(let ((file-name-coding-system 'big5))
  (file-attributes "會"))
=> nil

If you have any other clues, please let me know.

BTW, my machine is running English XP with Big5 (Chinese/Taiwan) set for
non-Unicode programs in Regional and Language Options in Control Panel.


* Re: file-attribute on certain Chinese filenames failed
  2007-02-13  6:06   ` MJ Chan
@ 2007-02-14 15:15     ` Jason Rumney
  2007-02-17 12:22     ` Eli Zaretskii
  1 sibling, 0 replies; 8+ messages in thread
From: Jason Rumney @ 2007-02-14 15:15 UTC (permalink / raw)
  To: MJ Chan; +Cc: emacs-devel, Kenichi Handa

MJ Chan wrote:
> Also, if I use utf-8, then file-attributes incorrectly reports the
> file as a directory, as shown below. This is odd.
>   

It is not odd that an incorrect setting produces incorrect results.
utf-8 is never a valid file-name-coding-system on Windows XP.
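
One way to see why the setting matters: the same character is a
different byte string under each coding system.  The sketch below
assumes the character is U+6703 and encodes it as UTF-8 by hand.  The
result, E6 9C 83, shares no bytes with the Big5 sequence "\267|"
(B7 7C) quoted earlier in the thread, so a name encoded as UTF-8 would
presumably be read by a Big5 system as some other name.  utf8_byte is a
hypothetical helper, and surrogate code points are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Return byte I of the UTF-8 encoding of code point CP (BMP only),
   or -1 if I is past the end of the encoding.  */
int utf8_byte(unsigned int cp, size_t i)
{
    unsigned char out[3];
    size_t len;

    assert(cp <= 0xFFFF);
    if (cp < 0x80) {
        /* One byte: 0xxxxxxx */
        out[0] = (unsigned char) cp;
        len = 1;
    } else if (cp < 0x800) {
        /* Two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        len = 2;
    } else {
        /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        len = 3;
    }
    return i < len ? out[i] : -1;
}
```
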


* Re: file-attribute on certain Chinese filenames failed
  2007-02-13  6:06   ` MJ Chan
  2007-02-14 15:15     ` Jason Rumney
@ 2007-02-17 12:22     ` Eli Zaretskii
  2007-02-17 15:09       ` MJ Chan
  1 sibling, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2007-02-17 12:22 UTC (permalink / raw)
  To: MJ Chan; +Cc: emacs-devel, handa

> Date: Tue, 13 Feb 2007 01:06:39 -0500
> From: MJ Chan <mjchan.inbox@gmail.com>
> Cc: emacs-devel@gnu.org
> 
> The same code you used did not cause write-region to fail on my
> XP. That is, the file "會" was written correctly. But again,
> file-attributes returned nil.

Could you perhaps step with a debugger into the function `stat' (in
w32.c) and see where it fails in this case?

I tried to reproduce this on Windows, but couldn't: file-attributes
works for me with non-ASCII file names.  However, I don't have access
to a Chinese Windows system.


* Re: file-attribute on certain Chinese filenames failed
  2007-02-17 12:22     ` Eli Zaretskii
@ 2007-02-17 15:09       ` MJ Chan
  2007-02-18 22:15         ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: MJ Chan @ 2007-02-17 15:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, handa

Thanks for pointing me to w32.c, where the function stat is defined. I
had thought Emacs used the native Windows stat.

Indeed, the problem is in stat, which calls strpbrk to check for
characters that are invalid in file names (*?|<>\"). The Big5 encoding
of the Chinese character I have trouble with contains the byte '|'.

I also ran some tests with the native Windows stat call, and it did not
fail with that Chinese filename.
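
The false positive can be reproduced outside Emacs.  The sketch below
is not the code from w32.c; it only applies the same strpbrk check to
the byte sequence "\267|" given earlier in the thread.  The function
name looks_invalid_bytewise is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if NAME contains any byte from the reserved set, else 0.
   strpbrk compares single bytes, so the trail byte of a double-byte
   character is treated as if it were a standalone ASCII character.  */
int looks_invalid_bytewise(const char *name)
{
    assert(name != NULL);
    return strpbrk(name, "*?|<>\"") != NULL;
}
```

With this check, "\267|" is rejected exactly like "a|", even though
the first is a single two-byte character and only the second contains
a real '|'.
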


* Re: file-attribute on certain Chinese filenames failed
  2007-02-17 15:09       ` MJ Chan
@ 2007-02-18 22:15         ` Eli Zaretskii
  2007-02-19  4:03           ` MJ Chan
  0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2007-02-18 22:15 UTC (permalink / raw)
  To: MJ Chan; +Cc: emacs-devel, handa

> Date: Sat, 17 Feb 2007 10:09:16 -0500
> From: MJ Chan <mjchan.inbox@gmail.com>
> Cc: handa@m17n.org, emacs-devel@gnu.org
> 
> Indeed, the problem is in stat, which calls strpbrk for checking
> invalid filename, (*?|<>\"). The Chinese/Big5 character that I have
> problem with contains '|'. 

Thanks for pointing out this blunder.

Does the patch below fix the problem for you with Chinese file names?


Index: src/w32.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/w32.c,v
retrieving revision 1.110
diff -u -r1.110 w32.c
--- src/w32.c	21 Jan 2007 04:18:15 -0000	1.110
+++ src/w32.c	18 Feb 2007 22:13:58 -0000
@@ -33,6 +33,7 @@
 #include <sys/time.h>
 #include <sys/utime.h>
 
+#include <mbstring.h>
 /* must include CRT headers *before* config.h */
 
 #ifdef HAVE_CONFIG_H
@@ -2387,7 +2388,7 @@
 
   name = (char *) map_w32_filename (path, &path);
   /* must be valid filename, no wild cards or other invalid characters */
-  if (strpbrk (name, "*?|<>\""))
+  if (_mbspbrk (name, "*?|<>\""))
     {
       errno = ENOENT;
       return -1;
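
For readers without the Microsoft C runtime at hand, the idea behind
the switch to _mbspbrk can be sketched in portable C: skip both bytes
of a double-byte character before testing for the reserved ASCII
characters.  This is only an illustration, not what the patch installs.
The lead-byte range 0x81..0xFE is an assumption for Big5-style
encodings (the real function depends on the multibyte code page in
use), and dbcs_pbrk and is_big5_lead_byte are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: any byte in 0x81..0xFE starts a two-byte character.  */
static int is_big5_lead_byte(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Like strpbrk, but never matches the trail byte of a two-byte
   character.  Returns a pointer to the first match, or NULL.  */
const char *dbcs_pbrk(const char *s, const char *reject)
{
    const unsigned char *p = (const unsigned char *) s;

    assert(s != NULL && reject != NULL);
    while (*p != '\0') {
        if (is_big5_lead_byte(*p) && p[1] != '\0') {
            p += 2;             /* skip lead and trail byte together */
            continue;
        }
        if (strchr(reject, *p) != NULL)
            return (const char *) p;
        p++;
    }
    return NULL;
}
```

Under these assumptions dbcs_pbrk ("\267|", "*?|<>\"") returns NULL,
while a genuine '|' in an ASCII name such as "a|" is still caught.
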


* Re: file-attribute on certain Chinese filenames failed
  2007-02-18 22:15         ` Eli Zaretskii
@ 2007-02-19  4:03           ` MJ Chan
  2007-02-23 18:41             ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: MJ Chan @ 2007-02-19  4:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, handa

Yes, the patch has fixed the problem. Thanks. 


* Re: file-attribute on certain Chinese filenames failed
  2007-02-19  4:03           ` MJ Chan
@ 2007-02-23 18:41             ` Eli Zaretskii
  0 siblings, 0 replies; 8+ messages in thread
From: Eli Zaretskii @ 2007-02-23 18:41 UTC (permalink / raw)
  To: MJ Chan; +Cc: emacs-devel, handa

> Date: Sun, 18 Feb 2007 23:03:47 -0500
> From: MJ Chan <mjchan.inbox@gmail.com>
> Cc: handa@m17n.org, emacs-devel@gnu.org
> 
> Yes, the patch has fixed the problem. Thanks. 

Thanks, I installed it (and also replaced a couple of other similar
uses of strpbrk).

