* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
@ 2009-04-09 16:28 Markus Triska
2009-04-09 17:22 ` Eli Zaretskii
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Markus Triska @ 2009-04-09 16:28 UTC
To: emacs-pretest-bug
With ~/töst.txt existing, when I do:
$ emacs -Q ~/
and press:
C-\ german-postfix RET C-s oe RET
to search for "ö" in the dired buffer (the input method correctly
converts the entered "oe" to "ö" in the minibuffer), I get:
Failing wrapped I-search [DE<]: ö
C-u C-x = on the "ö" in the dired buffer yields:
character: o (111, #o157, #x6f)
preferred charset: ascii (ASCII (ISO646 IRV))
code point: 0x6F
syntax: w which means: word
category: .:Base, a:ASCII, l:Latin, r:Roman
buffer code: #x6F
file code: #x6F (encoded by coding system utf-8-unix)
display: composed to form "ö" (see below)
Composed with the following character(s) "̈" using this font:
xft:-unknown-Cochin-normal-normal-normal-*-20-*-*-*-*-0-iso10646-1
by these glyphs:
[0 1 111 82 11 1 10 8 0 nil]
[0 1 776 235 6 0 6 12 -9 [-9 -1 0]]
Character code properties: customize what to show
name: LATIN SMALL LETTER O
general-category: Ll (Letter, Lowercase)
There are text properties here:
dired-filename t
fontified t
help-echo "mouse-2: visit this file in other window"
mouse-face highlight
C-u C-x = on the first "t" in "töst.txt" yields:
character: t (116, #o164, #x74)
preferred charset: ascii (ASCII (ISO646 IRV))
code point: 0x74
syntax: w which means: word
category: .:Base, a:ASCII, l:Latin, r:Roman
buffer code: #x74
file code: #x74 (encoded by coding system utf-8-unix)
display: by this font (glyph code)
xft:-bitstream-Bitstream Vera Sans Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x57)
Character code properties: customize what to show
name: LATIN SMALL LETTER T
general-category: Ll (Letter, Lowercase)
There are text properties here:
dired-filename t
fontified t
help-echo "mouse-2: visit this file in other window"
mouse-face highlight
In GNU Emacs 23.0.92.3 (i386-apple-darwin9.6.1, GTK+ Version 2.14.7)
of 2009-04-09 on mt-imac.local
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure '--with-tiff=no''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
2009-04-09 16:28 bug#2940: 23.0.92; C-s in dired fails to find files with umlauts Markus Triska
@ 2009-04-09 17:22 ` Eli Zaretskii
2009-04-09 17:33 ` Markus Triska
2011-07-11 22:02 ` Alp Aker
2019-11-02 6:12 ` Stefan Kangas
2 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2009-04-09 17:22 UTC
To: Markus Triska, 2940
> From: Markus Triska <markus.triska@gmx.at>
> Date: Thu, 09 Apr 2009 18:28:48 +0200
> Cc:
>
>
> With ~/töst.txt existing, when I do:
>
> $ emacs -Q ~/
>
> and press:
>
> C-\ german-postfix RET C-s oe RET
>
> to search for "ö" in the dired buffer (the input method correctly
> converts the entered "oe" to "ö" in the minibuffer), I get:
>
> Failing wrapped I-search [DE<]: ö
What's your value of file-name-coding-system? Does it help to say
C-x RET c utf-8 RET C-x d
instead of just "C-x d"?
* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
2009-04-09 17:22 ` Eli Zaretskii
@ 2009-04-09 17:33 ` Markus Triska
0 siblings, 0 replies; 8+ messages in thread
From: Markus Triska @ 2009-04-09 17:33 UTC
To: Eli Zaretskii; +Cc: 2940
Eli Zaretskii <eliz@gnu.org> writes:
> What's your value of file-name-coding-system?
It is nil, and default-file-name-coding-system is 'utf-8.
> Does it help to say
>
> C-x RET c utf-8 RET C-x d
>
> instead of just "C-x d"?
No, unfortunately not. Also for C-s it does not seem to make a
difference. When I enter an "ö" in *scratch*, C-u C-x = on it says:
character: ö (246, #o366, #xf6)
preferred charset: unicode (Unicode (ISO10646))
code point: 0xF6
syntax: w which means: word
category: .:Base, j:Japanese, l:Latin
to input: type "oe" with german-postfix
buffer code: #xC3 #xB6
file code: #xC3 #xB6 (encoded by coding system utf-8-unix)
display: by this font (glyph code)
xft:-bitstream-Bitstream Vera Sans Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x7C)
Character code properties: customize what to show
name: LATIN SMALL LETTER O WITH DIAERESIS
old-name: LATIN SMALL LETTER O DIAERESIS
general-category: Ll (Letter, Lowercase)
decomposition: (111 776) ('o' '̈')
There are text properties here:
fontified t
This "ö" is thus also rendered with the expected font, in contrast to
the one in dired.
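
The two C-u C-x = reports above can be put side by side in a minimal sketch, assuming only the stock ucs-normalize library. The constant names are hypothetical and exist only for illustration. The dired buffer holds "o" (111) followed by the combining diaeresis (776, U+0308), whereas the input method produces the single precomposed character U+00F6, so a literal comparison of the two fails until one side is normalized.

```elisp
(require 'ucs-normalize)

;; Hypothetical names, for illustration only.
(defconst demo-typed-o-umlaut "\u00f6"
  "Precomposed form, as entered through german-postfix.")
(defconst demo-dired-o-umlaut "o\u0308"
  "Decomposed form, as reported by C-u C-x = in the dired buffer.")

;; The two strings render alike but differ character by character,
;; so a literal search for one does not match the other.
(string= demo-typed-o-umlaut demo-dired-o-umlaut)

;; The decomposed form is two characters long, the precomposed one is one.
(list (length demo-typed-o-umlaut) (length demo-dired-o-umlaut))

;; Normalizing the decomposed form to NFC yields the precomposed one.
(string= (ucs-normalize-NFC-string demo-dired-o-umlaut)
         demo-typed-o-umlaut)
```

This only illustrates the character-level mismatch; it does not by itself say where in Emacs the file names should be normalized.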
* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
2009-04-09 16:28 bug#2940: 23.0.92; C-s in dired fails to find files with umlauts Markus Triska
2009-04-09 17:22 ` Eli Zaretskii
@ 2011-07-11 22:02 ` Alp Aker
2011-07-15 20:38 ` Glenn Morris
2019-11-02 6:12 ` Stefan Kangas
2 siblings, 1 reply; 8+ messages in thread
From: Alp Aker @ 2011-07-11 22:02 UTC
To: Eli Zaretskii; +Cc: 2940, markus.triska
Eli Zaretskii wrote:
>>> (require 'ucs-normalize)
>>> (setq file-name-coding-system 'utf-8-hfs)
>
> It could be that Emacs should do this on that platform automatically,
> yes. But some Darwin expert should look into this and provide feedback,
> before we decide.
I'm no expert, but it doesn't look as if this is necessary.
/lisp/term/ns-win.el already defines a coding system utf-8-nfd that
performs normalization and it sets that as the value of
file-name-coding-system. This takes care of the fact that the HFS+
filesystem uses decomposed file names, and indeed I can't reproduce (in
either 24.0.50 or 23.3) the behavior described in the original bug report.
OTOH, the code in question has been present in ns-win.el since the NS code
was first merged into the main branch (rev 89434), so I'm not sure how the
OP's problem arose in the first place.
* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
2011-07-11 22:02 ` Alp Aker
@ 2011-07-15 20:38 ` Glenn Morris
2011-07-16 17:38 ` Alp Aker
0 siblings, 1 reply; 8+ messages in thread
From: Glenn Morris @ 2011-07-15 20:38 UTC
To: Alp Aker; +Cc: 2940, markus.triska
Alp Aker wrote:
> OTOH, the code in question has been present in ns-win.el since the NS
> code was first merged into the main branch (rev 89434), so I'm not
> sure how the OP's problem arose in the first place.
IIUC, he's not using a --with-ns build. It's a "normal", gtk build that
happens to be running on a Mac. So ns-win.el isn't in use.
* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
2011-07-15 20:38 ` Glenn Morris
@ 2011-07-16 17:38 ` Alp Aker
0 siblings, 0 replies; 8+ messages in thread
From: Alp Aker @ 2011-07-16 17:38 UTC
To: Glenn Morris; +Cc: 2940, markus.triska
Glenn Morris wrote:
> IIUC, he's not using a --with-ns build. It's a "normal", gtk build that
> happens to be running on a Mac. So ns-win.el isn't in use.
My mistake; since it was running on Darwin I just assumed an NS build, and
didn't look at the build info in the original bug report.
Making this the default behavior for non-NS builds running on a Mac is
probably TRT. It was once possible to use Darwin with UFS, but that
hasn't been true for the last three major versions, so going forward it
will be a vanishingly rare case where (eq system-type 'darwin) doesn't
imply that the file system is a variant of HFS+. And it's reasonable for
users to expect that Emacs will, out of the box, properly handle file
names on the system it was built on.
OTOH, just adding something like:
(when (eq system-type 'darwin)
  (require 'ucs-normalize)
  (setq file-name-coding-system 'utf-8-hfs))
to x-win.el might not be the best solution. The utf-8-hfs coding system
does both post-read conversion (normalizing to precomposed utf-8) and
pre-write conversion (normalizing to Apple's variant of decomposed utf-8).
The latter is unnecessary: the OS itself will do normalization on any
filename handed to it. (Observe that the coding system defined in
ns-win.el only does post-read conversion.)
For local operations, the redundant pre-write conversion is harmless.
But using decomposed utf-8 might cause trouble when dealing with remote
files. So it's probably more robust to follow ns-win.el's lead and define
a coding system that only does post-read conversion. Thus:
(when (eq system-type 'darwin)
  (require 'ucs-normalize)
  (define-coding-system 'utf-8-hfs-for-read
    "UTF-8 based coding system for HFS+ file names."
    :coding-type 'utf-8
    :mnemonic ?U
    :charset-list '(unicode)
    :post-read-conversion 'ucs-normalize-hfs-nfd-post-read-conversion)
  (setq file-name-coding-system 'utf-8-hfs-for-read))
would be the addition to make to x-win.el.
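
For readers unfamiliar with the :post-read-conversion hook, here is a rough sketch of what such a function does. Both function names below are hypothetical, and the sketch uses plain NFC (ucs-normalize-NFC-region) rather than the HFS-specific variant that ucs-normalize.el provides, so it is an approximation of the idea and not the library's own code. A post-read conversion function is called with point at the start of the freshly decoded text and with the number of decoded characters as its argument, and it returns the character count after conversion.

```elisp
(require 'ucs-normalize)

;; Hypothetical function; it follows the :post-read-conversion calling
;; convention: point is at the start of the decoded text, LEN is its
;; length in characters, and the return value is the new length.
(defun demo-compose-post-read-conversion (len)
  "Normalize the LEN characters after point to NFC; return the new length."
  (save-excursion
    (save-restriction
      (narrow-to-region (point) (+ (point) len))
      (ucs-normalize-NFC-region (point-min) (point-max))
      (- (point-max) (point-min)))))

;; Hypothetical helper that simulates decoding a file name into a
;; buffer and then running the conversion over it.
(defun demo-decode-name (name)
  "Return NAME after running the sketch conversion over it."
  (with-temp-buffer
    (insert name)
    (goto-char (point-min))
    (demo-compose-post-read-conversion (length name))
    (buffer-string)))

;; A decomposed "töst.txt" (9 characters) comes back precomposed
;; (8 characters), which is the form a typed search string has.
(demo-decode-name "to\u0308st.txt")
```

Because only the read direction is converted, names handed back to the OS are left as they are, which matches the reasoning above that the OS normalizes on write anyway.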
* bug#2940: 23.0.92; C-s in dired fails to find files with umlauts
2009-04-09 16:28 bug#2940: 23.0.92; C-s in dired fails to find files with umlauts Markus Triska
2009-04-09 17:22 ` Eli Zaretskii
2011-07-11 22:02 ` Alp Aker
@ 2019-11-02 6:12 ` Stefan Kangas
2019-11-02 9:17 ` bug#2940: Aw: " Markus Triska
2 siblings, 1 reply; 8+ messages in thread
From: Stefan Kangas @ 2019-11-02 6:12 UTC
To: Markus Triska; +Cc: 2940
Markus Triska <markus.triska@gmx.at> writes:
> With ~/töst.txt existing, when I do:
>
> $ emacs -Q ~/
>
> and press:
>
> C-\ german-postfix RET C-s oe RET
>
> to search for "ö" in the dired buffer (the input method correctly
> converts the entered "oe" to "ö" in the minibuffer), I get:
>
> Failing wrapped I-search [DE<]: ö
I can't reproduce this on current master. Are you still seeing this
on a modern version of Emacs?
If I don't hear back from you within a couple of weeks, I'll just
close this bug as unreproducible.
Best regards,
Stefan Kangas
* bug#2940: Aw: Re: 23.0.92; C-s in dired fails to find files with umlauts
2019-11-02 6:12 ` Stefan Kangas
@ 2019-11-02 9:17 ` Markus Triska
0 siblings, 0 replies; 8+ messages in thread
From: Markus Triska @ 2019-11-02 9:17 UTC
To: Stefan Kangas; +Cc: 2940
> I can't reproduce this on current master. Are you still seeing this
> on a modern version of Emacs?
Yes, I can reproduce this exact same issue with Emacs 26.1 on OSX,
and also with a recent version of Debian.
All the best,
Markus