* cvs-quickdir and UTF-8 encoded file names
@ 2003-08-16 16:33 Karl Eichwalder
2003-08-21 1:36 ` Kenichi Handa
0 siblings, 1 reply; 6+ messages in thread
From: Karl Eichwalder @ 2003-08-16 16:33 UTC
cvs-quickdir does not work properly for UTF-8 encoded file names
containing umlauts like ä, ö, ü, etc. The names are displayed like this
Albrecht D\303\274rer
instead of
Albrecht Dürer
And they are marked as "missing".
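For readers unfamiliar with the notation: `\303\274` is the octal rendering of the two bytes 0xC3 0xBC, which are the UTF-8 encoding of ü. The short Python sketch below is only an illustration of that relationship (it is not Emacs code and was not part of the original report); the variable names are made up for the example.

```python
# Illustrative only: the UTF-8 bytes behind "Albrecht D\303\274rer".
name = "Albrecht D\u00fcrer"
raw = name.encode("utf-8")

# Render every non-ASCII byte as an octal escape, which is how an
# undecoded (raw-byte) file name ends up being displayed.
escaped = "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in raw)

print(escaped)              # Albrecht D\303\274rer
print(raw.decode("utf-8"))  # Albrecht Dürer
```

Seeing the escaped form therefore suggests that the bytes were never decoded as UTF-8, not that the name itself is damaged.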
Platform: SuSE Linux 8.2 (x86)
locale :
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE=C
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
--
| ,__o
http://www.gnu.franken.de/ke/ | _-\_<,
ke@suse.de (work) / keichwa@gmx.net (home) | (*)/'(*)
* Re: cvs-quickdir and UTF-8 encoded file names
2003-08-16 16:33 cvs-quickdir and UTF-8 encoded file names Karl Eichwalder
@ 2003-08-21 1:36 ` Kenichi Handa
2003-08-21 4:47 ` Karl Eichwalder
0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2003-08-21 1:36 UTC
Cc: emacs-devel
In article <sh8ypt7ebc.fsf@tux.gnu.franken.de>, Karl Eichwalder <keichwa@gmx.net> writes:
> cvs-quickdir does not work properly for UTF-8 encoded file names
> containing umlauts like ä, ö, ü, etc. The names are displayed like this
> Albrecht D\303\274rer
> instead of
> Albrecht Dürer
> And they are marked as "missing".
> Platform: SuSE Linux 8.2 (x86)
> locale :
> LANG=de_DE.UTF-8
> LC_CTYPE="de_DE.UTF-8"
Please show me the result of C-h C RET and the values of
these variables:
default-enable-multibyte-characters
enable-multibyte-characters
default-file-name-coding-system
file-name-coding-system
And when you read CVS/Entries directly, how are those file
names decoded?
---
Ken'ichi HANDA
handa@m17n.org
* Re: cvs-quickdir and UTF-8 encoded file names
2003-08-21 1:36 ` Kenichi Handa
@ 2003-08-21 4:47 ` Karl Eichwalder
2003-08-21 6:26 ` Kenichi Handa
0 siblings, 1 reply; 6+ messages in thread
From: Karl Eichwalder @ 2003-08-21 4:47 UTC
Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> Please show me the result of C-h C RET and the values of
> these variables:
> default-enable-multibyte-characters
> enable-multibyte-characters
> default-file-name-coding-system
> file-name-coding-system
Thanks for asking:
Coding system for saving this buffer:
Not set locally, use the default.
Default coding system (for new files):
u -- mule-utf-8 (alias: utf-8)
Coding system for keyboard input:
nil
Coding system for terminal output:
u -- mule-utf-8 (alias: utf-8)
Defaults for subprocess I/O:
decoding: u -- mule-utf-8 (alias: utf-8)
encoding: u -- mule-utf-8 (alias: utf-8)
Priority order for recognizing coding systems when reading files:
1. mule-utf-8 (alias: utf-8)
2. iso-latin-1 (alias: iso-8859-1 latin-1)
3. mule-utf-16be-with-signature (alias: utf-16be-with-signature mule-utf-16-be utf-16-be)
4. mule-utf-16le-with-signature (alias: utf-16le-with-signature mule-utf-16-le utf-16-le)
5. iso-2022-jp (alias: junet)
6. iso-2022-7bit
7. iso-2022-7bit-lock (alias: iso-2022-int-1)
8. iso-2022-8bit-ss2
9. emacs-mule
10. raw-text
11. japanese-shift-jis (alias: shift_jis sjis)
12. chinese-big5 (alias: big5 cn-big5)
13. no-conversion
Other coding systems cannot be distinguished automatically
from these, and therefore cannot be recognized automatically
with the present coding system priorities.
The following are decoded correctly but recognized as iso-2022-7bit-lock:
iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
iso-2022-jp-2 iso-2022-kr
Particular coding systems specified for certain file names:
OPERATION TARGET PATTERN CODING SYSTEM(s)
--------- -------------- ----------------
File I/O "ChangeLog" (utf-8 . utf-8)
"\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"
(no-conversion . no-conversion)
"\\.tgz\\'" (no-conversion . no-conversion)
"\\.bz2\\'" (no-conversion . no-conversion)
"\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"
(no-conversion . no-conversion)
"\\.elc\\'" (emacs-mule . emacs-mule)
"\\.utf\\(-8\\)?\\'" utf-8
"\\(\\`\\|/\\)loaddefs.el\\'"
(raw-text . raw-text-unix)
"\\.tar\\'" (no-conversion . no-conversion)
"\\.po[tx]?\\'\\|\\.po\\."
po-find-file-coding-system
"" (undecided)
Process I/O nothing specified
Network I/O nothing specified
default-enable-multibyte-characters's value is t
enable-multibyte-characters's value is t
Local in buffer *cvs*; global value is t
default-file-name-coding-system's value is mule-utf-8
file-name-coding-system's value is nil
> And when you read CVS/Entries directly, how are those file
> names decoded?
Is this the value you want to know?
Coding system for saving this buffer:
t -- raw-text-unix
To see this value I did:
C-x C-f CVS/Entries RET
M-x describe-coding-system RET
Thanks for your help.
--
| ,__o
http://www.gnu.franken.de/ke/ | _-\_<,
ke@suse.de (work) / keichwa@gmx.net (home) | (*)/'(*)
* Re: cvs-quickdir and UTF-8 encoded file names
2003-08-21 4:47 ` Karl Eichwalder
@ 2003-08-21 6:26 ` Kenichi Handa
[not found] ` <shada37yfp.fsf@tux.gnu.franken.de>
0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2003-08-21 6:26 UTC
Cc: emacs-devel
In article <shu18bty54.fsf@tux.gnu.franken.de>, Karl Eichwalder <keichwa@gmx.net> writes:
>> And when you read CVS/Entries directly, how are those file
>> names decoded?
> Is this the value you want to know?
> Coding system for saving this buffer:
> t -- raw-text-unix
> To see this value I did:
> C-x C-f CVS/Entries RET
> M-x describe-coding-system RET
Thank you for the info. Somehow, Emacs fails to detect the
encoding of this file. Please send me that file in some
8-bit-transparent form (e.g. uuencoded or base64-encoded).
---
Ken'ichi HANDA
handa@m17n.org
* Re: cvs-quickdir and UTF-8 encoded file names
[not found] ` <shada37yfp.fsf@tux.gnu.franken.de>
@ 2003-08-25 1:14 ` Kenichi Handa
2003-08-25 4:16 ` Karl Eichwalder
0 siblings, 1 reply; 6+ messages in thread
From: Kenichi Handa @ 2003-08-25 1:14 UTC
Cc: emacs-devel
In article <shada37yfp.fsf@tux.gnu.franken.de>, Karl Eichwalder <keichwa@gmx.net> writes:
>> Thank you for the info. Somehow, Emacs fails to detect the
>> encoding of this file. Please send me that file in some
>> 8-bit-transparent form (e.g. uuencoded or base64-encoded).
> Here it comes:
I found an invalid UTF-8 sequence on line 279. It seems
that the file name on this line is in ISO-8859-1, not UTF-8.
Thus, Emacs failed to detect the file as utf-8 and decoded it as
raw-text.
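A sketch of how such a line can be located, written in Python for illustration only. The sample Entries content is invented (the real file was not posted to the list), and `first_invalid_utf8_line` is a hypothetical helper name.

```python
def first_invalid_utf8_line(data: bytes):
    """Return (line_number, raw_line) for the first line that is not
    valid UTF-8, or None if every line decodes cleanly."""
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            return number, line
    return None


# Hypothetical CVS/Entries content: line 1 is UTF-8, line 2 is ISO-8859-1.
entries = (
    b"/Albrecht D\xc3\xbcrer/1.1/Sat Aug 16 16:00:00 2003//\n"
    b"/Albrecht D\xfcrer/1.1/Sat Aug 16 16:00:00 2003//\n"
)

print(first_invalid_utf8_line(entries))
```

The byte 0xFC (ü in ISO-8859-1) can never occur in well-formed UTF-8, so a single such name is enough for a strict UTF-8 check of the whole file to fail.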
Perhaps the file Entries should be read by:
(let ((coding-system-for-read (or default-file-name-coding-system
                                  file-name-coding-system)))
  ...)
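To illustrate what forcing the decoding would change, here is a rough Python analogue. It is a sketch under assumptions, not the pcl-cvs code: the Entries content is invented, and `errors="surrogateescape"` is Python's way of keeping undecodable bytes, used here only as a stand-in for how Emacs retains raw bytes.

```python
# Hypothetical Entries content: one UTF-8 name, one ISO-8859-1 name.
data = (
    b"/D\xc3\xbcrer-utf8/1.1/date//\n"
    b"/D\xfcrer-latin1/1.1/date//\n"
)

# Forcing UTF-8 while preserving undecodable bytes decodes the valid
# line correctly; only the stray ISO-8859-1 byte stays undecoded.
text = data.decode("utf-8", errors="surrogateescape")
good, bad = text.splitlines()

print(good)  # /Dürer-utf8/1.1/date//

# The preserved byte round-trips, so the on-disk name can still be found.
print(bad.encode("utf-8", errors="surrogateescape") == b"/D\xfcrer-latin1/1.1/date//")  # True
```

With automatic detection, by contrast, one bad line causes the whole file to be treated as raw bytes, so even the correctly encoded names are displayed with octal escapes.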
But that file also contains a "date" string. Does Emacs use
that part of the information too? If so, how is it encoded?
Does it contain only ASCII characters? Or is it encoded in
the user's locale? Is there any possibility that the encoding
of the file name is different from the encoding of the date string
in a normal situation?
As my knowledge of CVS (and of the CVS handling code in Emacs) is
limited, I'd like to ask some other person to fix this
problem. Of course, I'll answer any Mule-related questions.
---
Ken'ichi HANDA
handa@m17n.org
* Re: cvs-quickdir and UTF-8 encoded file names
2003-08-25 1:14 ` Kenichi Handa
@ 2003-08-25 4:16 ` Karl Eichwalder
0 siblings, 0 replies; 6+ messages in thread
From: Karl Eichwalder @ 2003-08-25 4:16 UTC
Cc: emacs-devel
Kenichi Handa <handa@m17n.org> writes:
> I found an invalid UTF-8 sequence on line 279. It seems
> that the file name on this line is in ISO-8859-1, not UTF-8.
Thanks for tracking down my error; the file name on my disk is also
ISO-8859-1 encoded. It might have slipped in when I accidentally started
Emacs within an ISO-8859-1 environment.
> Thus, Emacs failed to detect the file as utf-8 and decoded it as
> raw-text.
>
> Perhaps the file Entries should be read by:
>
> (let ((coding-system-for-read (or default-file-name-coding-system
>                                   file-name-coding-system)))
>   ...)
If it is possible to detect such an encoding mismatch, Emacs should
raise an exception telling the user how to solve the problem:
Convert file name
Use raw text
Use UTF-8
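The idea could look roughly like this, sketched in Python purely for illustration. `FileNameEncodingMismatch` and `check_file_name` are hypothetical names, the three remedies are simply the ones listed above, and nothing here reflects an existing Emacs facility.

```python
class FileNameEncodingMismatch(Exception):
    """Raised when a file name's bytes do not match the expected encoding."""

    REMEDIES = ("Convert file name", "Use raw text", "Use UTF-8")

    def __init__(self, raw_name: bytes, expected: str):
        self.raw_name = raw_name
        self.expected = expected
        super().__init__(
            f"{raw_name!r} is not valid {expected}; options: "
            + ", ".join(self.REMEDIES)
        )


def check_file_name(raw_name: bytes, expected: str = "utf-8") -> str:
    """Decode raw_name with the expected encoding, or raise an error
    that tells the user which remedies are available."""
    try:
        return raw_name.decode(expected)
    except UnicodeDecodeError:
        raise FileNameEncodingMismatch(raw_name, expected) from None


try:
    check_file_name(b"Albrecht D\xfcrer")
except FileNameEncodingMismatch as err:
    print(err)
    # "Convert file name" remedy, assuming the name really is ISO-8859-1:
    print(err.raw_name.decode("iso-8859-1").encode("utf-8"))
```

Whether the name really is ISO-8859-1 cannot be determined from the bytes alone, which is why the choice is left to the user in this sketch.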
Sorry, I cannot answer the other questions. Thanks again for your
help.
--
| ,__o
http://www.gnu.franken.de/ke/ | _-\_<,
ke@suse.de (work) / keichwa@gmx.net (home) | (*)/'(*)