* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
       [not found] ` <200408101144.UAA13100@etlken.m17n.org>
@ 2004-08-10 12:01 ` Jason Rumney
  2004-08-10 16:49   ` Mattis
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Rumney @ 2004-08-10 12:01 UTC
  Cc: emacs-devel, brakjoller

Kenichi Handa wrote:

> In article <c99f54dd040810043627699b54@mail.gmail.com>, Mattis <brakjoller@gmail.com> writes:
>
>>> Are they real question marks?  Have you checked them with
>>> "C-x ="?
>>
>> Yes, they are real question marks.
>
> Hmmm, then something is different in Windows port.
>
>> I do not know how to proceed. It definitely seems like the low level
>> emacs functions do not support all characters under Windows. Is this
>> considered a bug or a non-feature?
>
> I have no idea because I have almost no knowledge about
> Windows.  Jason, do you know why

The functions that deal with file names in the C library on Windows
return ? for characters that are not supported by the system locale,
even though NTFS supports Unicode file names.

The answer would be to use Unicode-aware Win32 API functions instead of
the standard C library, but such functions are only supported on some
versions of Windows, so determining when to use them and when not to is
a problem.

It is quite unusual for users to enter filenames in languages other than
their own, and Emacs is certainly not the only application that has this
problem (try "dir" in the Windows command prompt for example), so this
has not been a high priority. It is probably appropriate to look at this
in the next version of Emacs that uses Unicode internally. To do so now
is too big a change for feature-freeze, I think, and we already have a
big change to support Unicode on the clipboard to install (which has
more benefits to users than file names IMHO).

> (let ((file-name-coding-system 'raw-text))
>   (directory-files DIRECTORY_NAME))
>
> returns `?' in file names if they are encoded in utf-16-le?
* Re: Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-10 12:01 ` [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs] Jason Rumney
@ 2004-08-10 16:49   ` Mattis
  2004-08-10 17:15     ` Stefan Monnier
  0 siblings, 1 reply; 10+ messages in thread
From: Mattis @ 2004-08-10 16:49 UTC
  Cc: emacs-devel, Kenichi Handa

> The answer would be to use Unicode-aware Win32 API functions
> instead of the standard C library, but such functions are only
> supported on some versions of Windows, so determining
> when to use them and when not to is a problem.

Well, probably, yes. I agree that this might not be a high-priority
case, but IMHO it should be solved sooner or later.

> It is quite unusual for users to enter filenames in languages
> other than their own

Probably. I had a real-world problem, though, even if it might not be
very common.

Anyway, I do not want to argue about priorities here, but I want to
remind you of the second problem, the "freeze" that happens on both
Windows and GNU/Linux:

1. $ emacs --no-init-file --no-site-file
2. (setq debug-on-init t)
3. (setq file-name-coding-system 'utf-16-le)
4. Wait for a while

Notice how Emacs seems to "freeze". It is possible to unfreeze it using
C-g, and then debug output like this is displayed:

  ange-ftp-completion-hook-function(file-exists-p "/")
  file-exists-p("/")
  make-directory("/home/mathias/.emacs.d/auto-save-list/" t)

Since the primary problem (Unicode support for file names) will not be
solved, I will not be affected by the problem above in a real-world
case, but I wanted you to know that something is fishy here, even on
GNU/Linux.

Thanks for the help!

/Mathias
* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-10 16:49   ` Mattis
@ 2004-08-10 17:15     ` Stefan Monnier
  2004-08-11 10:59       ` Mattis
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2004-08-10 17:15 UTC
  Cc: emacs-devel, Kenichi Handa, Jason Rumney

> Notice how Emacs seems to "freeze". It is possible to unfreeze it using
> C-g, and then debug output like this is displayed:

>   ange-ftp-completion-hook-function(file-exists-p "/")
>   file-exists-p("/")
>   make-directory("/home/mathias/.emacs.d/auto-save-list/" t)

Yes, the problem is as follows:

   (file-exists-p "/home/mathias/.emacs.d/auto-save-list/") -> nil

so make-directory decides the dir needs to be created, but it first
checks to see if the parent needs to be created as well:

   (file-exists-p "/home/mathias/.emacs.d/") -> nil

so it tries to create the parent, checking its own parent in turn:

   (file-exists-p "/home/mathias/") -> nil
   ...
   (file-exists-p "/home/") -> nil
   ...
   (file-exists-p "/") -> nil
   ...
   (file-exists-p "/") -> nil
   ...
   (file-exists-p "/") -> nil
   ...

because the parent of "/" is "/" and because after encoding in utf-16,
even "/" doesn't exist.

Such an encoding is clearly completely wrong for such a system, so I'm
not sure how important it is to protect oneself against such situations.
After all, there are several other ways to screw oneself and this one is
at least reasonably easy to revert. Try the following ed-emulator:

   M-: (use-global-map (make-keymap)) RET

-- Stefan
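The walk described in this message can be sketched in Emacs Lisp. This is only an illustration under stated assumptions, not the code of make-directory: `sketch-parent`, `sketch-missing-ancestors` and the `exists-p` predicate argument are hypothetical names introduced here. The sketch shows that the parent of "/" is "/" itself, so a loop that stops only when some ancestor exists never ends if every existence check fails, and that comparing a directory with its own parent is enough to terminate it.

```elisp
;; Illustration only; these helpers are hypothetical, not part of Emacs.
(defun sketch-parent (dir)
  "Return the parent directory of DIR, with a trailing slash."
  (file-name-directory (directory-file-name dir)))

(defun sketch-missing-ancestors (dir exists-p)
  "Return DIR and those of its ancestors for which EXISTS-P returns nil.
The list is ordered outermost first.  The walk stops at the first
existing directory, or when a directory is its own parent (the root)."
  (let ((missing '())
        (done nil))
    (while (not done)
      (if (funcall exists-p dir)
          (setq done t)
        (push dir missing)
        (let ((parent (sketch-parent dir)))
          (if (equal parent dir)
              (setq done t)          ; reached "/", so stop here
            (setq dir parent)))))
    missing))

;; Simulate the situation from the thread: every existence check fails.
(sketch-missing-ancestors "/home/mathias/.emacs.d/auto-save-list/"
                          (lambda (_dir) nil))
```

With a predicate that always returns nil, the call returns the five directories from "/" down to the auto-save-list directory and then stops, instead of asking about "/" forever.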
* Re: Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-10 17:15     ` Stefan Monnier
@ 2004-08-11 10:59       ` Mattis
  2004-08-11 14:57         ` Stefan Monnier
  0 siblings, 1 reply; 10+ messages in thread
From: Mattis @ 2004-08-11 10:59 UTC
  Cc: emacs-devel, Kenichi Handa, Jason Rumney

> Yes, the problem is as follows:
>
>    (file-exists-p "/home/mathias/.emacs.d/auto-save-list/") -> nil
>
> so make-directory decides the dir needs to be created, but it first
> checks to see if the parent needs to be created as well:
>
>    (file-exists-p "/home/mathias/.emacs.d/") -> nil
>
> so it tries to create the parent, checking its own parent in turn:
>
>    (file-exists-p "/home/mathias/") -> nil
>    ...
>    (file-exists-p "/home/") -> nil
>    ...
>    (file-exists-p "/") -> nil
>    ...
>    (file-exists-p "/") -> nil
>    ...
> because the parent of "/" is "/" and because after encoding in utf-16,
> even "/" doesn't exist.
>
> Such an encoding is clearly completely wrong for such a system, so I'm
> not sure how important it is to protect oneself against such situations.

I see. But what if the test that file-exists-p does was to encode the
"test-string" first in the same encoding? Or maybe this would be crazy,
this isn't exactly my area of expertise. :)

> After all, there are several other ways to screw oneself and this one is
> at least reasonably easy to revert.

Agreed.

Anyway, the conclusion of all this seems to be that Emacs is not 100 %
ready for Unicode yet, so I have to avoid trying to torture it with
these in the meantime... :)

/Mathias
* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-11 10:59       ` Mattis
@ 2004-08-11 14:57         ` Stefan Monnier
  2004-08-12  8:22           ` Mattis
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Monnier @ 2004-08-11 14:57 UTC
  Cc: emacs-devel, Kenichi Handa, Jason Rumney

> I see. But what if the test that file-exists-p does was to encode the
> "test-string" first in the same encoding? Or maybe this would be crazy,
> this isn't exactly my area of expertise. :)

I do not understand what you are trying to say here.
The problem is that (file-exists-p "/") needs to turn the stream of
*characters* "/" into a stream of *bytes* (which is what the encoding is
for) and if you specify the wrong encoding you might get the wrong
bytes, so the OS will of course not find the corresponding file.

> Anyway, the conclusion of all this seems to be that Emacs is not 100 %
> ready for Unicode yet, so I have to avoid trying to torture it with
> these in the meantime... :)

While the Unicode support is not finished (and probably never will be
given the amount of detail you can get into if you care to), it has
nothing to do with the problem at hand. The only valid conclusion here
is that your file names are not encoded in utf-16 and thus you get what
you deserve if you wrongly tell Emacs to use utf-16 encoding for
filenames.

        Stefan
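The characters-to-bytes step can be made visible with a small sketch. It is an illustration, not what file-exists-p does internally; `sketch-encoded-bytes` is a hypothetical helper introduced here, built on the standard `encode-coding-string`. It uses the coding system name `utf-16le`, which in recent Emacs versions is little-endian UTF-16 without a byte-order mark. The `utf-16-le` name used in the thread may additionally prepend a byte-order mark, depending on the Emacs version, so it is left out of the expected values.

```elisp
;; Illustration only: list the bytes that different coding systems
;; produce for the same one-character file name.
(defun sketch-encoded-bytes (name coding-system)
  "Return the bytes of NAME encoded with CODING-SYSTEM, as a list of integers."
  (append (encode-coding-string name coding-system) nil))

(sketch-encoded-bytes "/" 'utf-8)     ; => (47)
(sketch-encoded-bytes "/" 'utf-16le)  ; => (47 0)
```

The root directory is known to the OS as the single byte 47. Any encoding that yields a different byte sequence for "/" names something else, or nothing at all, as far as the file system is concerned.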
* Re: Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-11 14:57         ` Stefan Monnier
@ 2004-08-12  8:22           ` Mattis
  2004-08-12 11:00             ` Jason Rumney
  2004-08-12 11:36             ` Kenichi Handa
  0 siblings, 2 replies; 10+ messages in thread
From: Mattis @ 2004-08-12 8:22 UTC
  Cc: emacs-devel, Kenichi Handa, Jason Rumney

> The problem is that (file-exists-p "/") needs to turn the stream
> of *characters* "/" into a stream of *bytes* (which is what
> the encoding is for) and if you specify the wrong encoding
> you might get the wrong bytes, so the OS will of course
> not find the corresponding file.

Ah, yes, I understand now.

> > Anyway, the conclusion of all this seems to be that Emacs is not 100 %
> > ready for Unicode yet, so I have to avoid trying to torture it with
> > these in the meantime... :)
>
> While the Unicode support is not finished (and probably never will be
> given the amount of detail you can get into if you care to), it has
> nothing to do with the problem at hand. The only valid conclusion here
> is that your file names are not encoded in utf-16 and thus you get what
> you deserve if you wrongly tell Emacs to use utf-16 encoding for
> filenames.

OK. So, the best would be to have the encoding set to "undecided" and
then let Emacs figure out which encoding is used, right? And hopefully
it will be able to do this correctly in later versions. (I'm talking
about the problem with Unicode file names on Windows now, not the
"freeze".)

Thanks for all the help.

/Mathias
* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-12  8:22           ` Mattis
@ 2004-08-12 11:00             ` Jason Rumney
  2004-08-12 11:36             ` Kenichi Handa
  1 sibling, 0 replies; 10+ messages in thread
From: Jason Rumney @ 2004-08-12 11:00 UTC
  Cc: Kenichi Handa, Stefan Monnier, emacs-devel

Mattis wrote:

> OK. So, the best would be to have the encoding set to "undecided"

The best is to leave it at what Emacs determined at startup, i.e. the
value appropriate for your system locale setting.
* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-12  8:22           ` Mattis
  2004-08-12 11:00             ` Jason Rumney
@ 2004-08-12 11:36             ` Kenichi Handa
  2004-08-12 12:31               ` Jason Rumney
  1 sibling, 1 reply; 10+ messages in thread
From: Kenichi Handa @ 2004-08-12 11:36 UTC
  Cc: jasonr, monnier, emacs-devel

In article <c99f54dd0408120122401e9526@mail.gmail.com>, Mattis <brakjoller@gmail.com> writes:

> OK. So, the best would be to have the encoding set to
> "undecided" and then let Emacs figure out which encoding
> is used, right?

Unfortunately, no. With that setting, Emacs may be able to decode the
file name correctly, but it doesn't remember the encoding for the time
it has to encode the file name.

I believe that Windows itself has a function to handle such a
file name consistently. So, I think the best is to make a
special coding system, say windows-file-name, that works
only for Windows. It decodes file names into utf-8 sequence
(internal code of Emacs-Unicode), and encodes file names into
locale encoding or utf-16le-with-signature depending on the
contents.

---
Ken'ichi HANDA
handa@m17n.org
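The "depending on the contents" choice in this proposal could look roughly like the following sketch. It is an assumption-laden illustration, not an existing Emacs facility: `sketch-file-name-coding` is a hypothetical helper, no windows-file-name coding system is defined here, and `iso-latin-1` merely stands in for whatever the locale encoding would be. It relies on the standard function `unencodable-char-position` to test whether the locale encoding can represent every character of the name.

```elisp
;; Illustration only: `sketch-file-name-coding' is a hypothetical helper,
;; not an existing Emacs function or coding system.
(defun sketch-file-name-coding (name locale-coding)
  "Return LOCALE-CODING if it can encode every character of NAME.
Otherwise return `utf-16le-with-signature', standing in for the
Unicode path mentioned in the proposal."
  (if (unencodable-char-position 0 (length name) locale-coding nil name)
      'utf-16le-with-signature
    locale-coding))

;; A name the locale encoding can represent, and one it cannot.
(sketch-file-name-coding "notes.txt" 'iso-latin-1)
(sketch-file-name-coding "\u65e5\u8a18.txt" 'iso-latin-1)
```

The sketch only selects a coding system name. Actually passing UTF-16 names to the OS would still require the Unicode-aware Win32 functions discussed elsewhere in the thread.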
* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-12 11:36             ` Kenichi Handa
@ 2004-08-12 12:31               ` Jason Rumney
  2004-08-13  1:46                 ` Kenichi Handa
  0 siblings, 1 reply; 10+ messages in thread
From: Jason Rumney @ 2004-08-12 12:31 UTC
  Cc: emacs-devel, monnier, brakjoller

Kenichi Handa wrote:

> I believe that Windows itself has a function to handle such a
> file name consistently. So, I think the best is to make a
> special coding system, say windows-file-name, that works
> only for Windows.

Part of this discussion has occurred off list, so you may not have seen
my previous mail on this.

Currently Emacs uses the standard C library functions for file I/O on
all platforms. In my judgement, changing this would be too much work to
take on during feature-freeze, especially since it is complicated by the
fact that the full Unicode API is not supported on all versions of
Windows.

In future when this is implemented, I do not see the need for a
"windows-file-name" coding-system. In the versions of Windows where the
Unicode API is fully supported, there is no need for
file-name-coding-system, since it is known to be
utf-16-le-with-signature. In the cases where those APIs are not
supported, then file-name-coding-system should be used with the standard
C library as now.
* Re: [brakjoller@gmail.com: setting utf-16 as file-name-coding-system locks up emacs]
  2004-08-12 12:31               ` Jason Rumney
@ 2004-08-13  1:46                 ` Kenichi Handa
  0 siblings, 0 replies; 10+ messages in thread
From: Kenichi Handa @ 2004-08-13 1:46 UTC
  Cc: emacs-devel, monnier, brakjoller

In article <411B631E.4050006@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:

> Part of this discussion has occurred off list, so you may not have seen
> my previous mail on this.

> Currently Emacs uses the standard C library functions for file I/O on
> all platforms. In my judgement, changing this would be too much work to
> take on during feature-freeze, especially since it is complicated by the
> fact that the full Unicode API is not supported on all versions of
> Windows.

Of course, I agree that we shouldn't change the current
behaviour now. I was talking about what to do in
emacs-unicode.

> In future when this is implemented, I do not see the need for a
> "windows-file-name" coding-system. In the versions of Windows where the
> Unicode API is fully supported, there is no need for
> file-name-coding-system, since it is known to be
> utf-16-le-with-signature. In the cases where those APIs are not
> supported, then file-name-coding-system should be used with the standard
> C library as now.

Ok, I see.

---
Ken'ichi HANDA
handa@m17n.org