* searching for non ascii characters
@ 2005-08-02 20:27 Radomir Hejl
2005-08-02 20:55 ` Peter Dyballa
[not found] ` <mailman.2370.1123016502.20277.help-gnu-emacs@gnu.org>
0 siblings, 2 replies; 6+ messages in thread
From: Radomir Hejl @ 2005-08-02 20:27 UTC
Hello,
when in a text mode, I usually use an input method, and I can find any
character with C-s. After saving the file and reading it back from disk,
non-ASCII characters can no longer be found. When I do C-u C-x = on a
non-ASCII character before saving, I see the charset latin-iso8859-2.
Doing C-u C-x = after saving, I usually see mule-unicode-0100-24ff or
latin-iso8859-1 instead.
So now I can only search successfully for ASCII characters. What should I
tweak in Emacs so that searching works again?
I already asked in comp.emacs with no response.
Thanks, Radek.
* Re: searching for non ascii characters
From: Peter Dyballa @ 2005-08-02 20:55 UTC
Cc: help-gnu-emacs
On 2005-08-02 at 22:27, Radomir Hejl wrote:
> Hello,
> when in a text mode, I usually use an input method, and I can find any
> character with C-s. After saving the file and reading it back from disk,
> non-ASCII characters can no longer be found. When I do C-u C-x = on a
> non-ASCII character before saving, I see the charset latin-iso8859-2.
> Doing C-u C-x = after saving, I usually see mule-unicode-0100-24ff or
> latin-iso8859-1 instead.
>
> So now I can only search successfully for ASCII characters. What should I
> tweak in Emacs so that searching works again?
>
Put a line like
;;; -*- mode: Text; coding: iso-8859-2; -*-
in the file's header. It could be that a
(prefer-coding-system 'iso-latin-2-unix)
is already enough. The environment variable LC_CTYPE is important: GNU
Emacs derives a few settings from it, in particular
default-buffer-file-coding-system. Then there's
file-coding-system-alist ...
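A minimal sketch of how these two settings might look in a .emacs, assuming the files in question are ISO 8859-2 text. The "\\.txt\\'" pattern is a hypothetical example and not something taken from the report:

```elisp
;; Prefer ISO 8859-2 with Unix line endings whenever Emacs has to
;; guess the encoding of a file.
(prefer-coding-system 'iso-latin-2-unix)

;; Hypothetical pattern: read and write files whose names end in
;; ".txt" as ISO 8859-2.  Adjust the regexp to the real file names.
(add-to-list 'file-coding-system-alist '("\\.txt\\'" . iso-latin-2))
```

An entry in file-coding-system-alist only affects file names that match its pattern, while prefer-coding-system changes the detection priority for every file.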
--
Greetings
Pete
Bake Pizza not war!
* Re: searching for non ascii characters
From: rahed @ 2005-08-03 13:28 UTC
Peter Dyballa <Peter_Dyballa@Web.DE> writes:
>> with C-s. After saving the file and reading it back from disk,
>> non-ASCII characters can no longer be found. When I do C-u C-x = on a
>> non-ASCII character before
>
> Put a line like
>
> ;;; -*- mode: Text; coding: iso-8859-2; -*-
>
> in the file's header. It could be that a
I put that line first in the file. The symptoms are unchanged.
> (prefer-coding-system 'iso-latin-2-unix)
>
> is already enough. The environment variable LC_CTYPE is important: GNU
> Emacs derives a few settings from it, in particular
> default-buffer-file-coding-system. Then there's
> file-coding-system-alist ...
I also put (prefer-coding-system 'iso-latin-2-unix) in my .emacs (I had cp1250 there before).
The charsets are still the same as before these changes.
Character listing after writing the file and reading it back from disk:
character: š (01210241, 331937, 0x510a1)
charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
code point: 33 33
syntax: word
category: l:Latin
buffer code: 0x9C 0xF4 0xA1 0xA1
file code: B9 (encoded by coding system iso-latin-2-unix)
font: -outline-Courier New-normal-r-normal-normal-13-97-96-96-c-80-iso10646-1
Thank you, any hints appreciated.
Radek
* Re: searching for non ascii characters
From: Peter Dyballa @ 2005-08-03 14:09 UTC
Cc: help-gnu-emacs
On 2005-08-03 at 15:28, rahed@cwazy.co.uk wrote:
> character: š (01210241, 331937, 0x510a1)
> charset: mule-unicode-0100-24ff (Unicode characters of the range U+0100..U+24FF.)
> code point: 33 33
> syntax: word
> category: l:Latin
> buffer code: 0x9C 0xF4 0xA1 0xA1
> file code: B9 (encoded by coding system iso-latin-2-unix)
> font: -outline-Courier New-normal-r-normal-normal-13-97-96-96-c-80-iso10646-1
>
My own test file with ISO 8859-2 encoding has this in GNU Emacs 23:
character: š (0541, 353, 0x161)
preferred charset: iso-8859-2 (ISO/IEC 8859/2)
code point: 0xB9
syntax: w which means: word
category: j:Japanese l:Latin
buffer code: 0xC5 0xA1
file code: 0xB9 (encoded by coding system iso-latin-2-unix)
display: by this font (glyph code)
-B&H-LucidaTypewriter-Medium-R-Normal-Sans-10-100-75-75-M-60-ISO10646-1
(0x161)
and this in GNU Emacs 22 and 21.3:
character: š (04471, 2361, 0x939, U+0161)
charset: [latin-iso8859-2]
(Right-Hand Part of Latin Alphabet 2 (ISO/IEC 8859-2):
ISO-IR-101.)
code point: [57]
syntax: w which means: word
category: l:Latin
buffer code: 0x82 0xB9
file code: 0xB9 (encoded by coding system iso-latin-2-unix)
display: by this font (glyph code)
-B&H-LucidaTypewriter-Medium-R-Normal-Sans-10-100-75-75-M-60-ISO8859-2
(0xB9)
Both use the right charset and encoding. If you close and reopen the
file and it has '-*- coding: iso-8859-2; -*-' in its header, on the first
line (or the second), Emacs should switch to that coding, unless a block
of local (file) variables at the end of the file says something
different, or Emacs is pinned to a specific coding system. Did you
restart Emacs after changing .emacs? Can you check the state of the
variable (C-h v on the variable from .emacs in a freshly started Emacs)?
If it differs from what you set, then either the statement is not
executed, or it appears more than once and gets reset some time after
this line ... What does the tail of your file look like?
The last thing I can think of is the use of fontsets instead of fonts.
What is your status?
Your file has the correct byte, 0xB9, at the position of LATIN SMALL
LETTER S WITH CARON, so it is presumably still correctly encoded. To see
it in ISO/IEC 8859-2 you can use revert-buffer-with-coding-system, C-x
RET r CODING-SYSTEM. Use M-x list-coding-systems to see what your system
has.
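The byte values in the listings above can be cross-checked from Lisp. A small sketch, assuming Emacs 23 or later (where unibyte-string is available and characters are Unicode code points internally); it will not give the same numbers on 21.3:

```elisp
;; Decode the single file byte 0xB9 as ISO 8859-2 and look at the
;; resulting character code.
(string-to-char (decode-coding-string (unibyte-string #xB9) 'iso-8859-2))
;; => 353, i.e. #x161, LATIN SMALL LETTER S WITH CARON

;; Going the other way, encoding that character as ISO 8859-2 yields
;; a one-byte string holding 0xB9 again.
(encode-coding-string (string #x161) 'iso-8859-2)
```

Either form can be evaluated with M-: or with C-x C-e in the *scratch* buffer.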
--
Greetings
Pete
Windows is a bit like Beaujolais nouveau: with every new vintage you
know it will be disgusting, but you have some anyway, out of
masochism.
* Re: searching for non ascii characters
From: rahed @ 2005-08-03 14:52 UTC
Peter Dyballa <Peter_Dyballa@Web.DE> writes:
> Both use the right charset and encoding. If you close and reopen the
> file and it has '-*- coding: iso-8859-2; -*-' in its header, on the first
> line (or the second), Emacs should switch to that coding, unless a block
> of local (file) variables at the end of the file says something
> different, or Emacs is pinned to a specific coding system. Did you
> restart Emacs after changing .emacs? Can you check the state of the
> variable (C-h v on the variable from .emacs in a freshly started Emacs)?
> If it differs from what you set, then either the statement is not
> executed, or it appears more than once and gets reset some time after
> this line ... What does the tail of your file look like?
According to C-h v, my prefer-coding-system value is iso-latin-2. My test file now has only two lines: the relevant header and the non-ASCII characters.
>
> The last thing I can think of is the use of fontsets instead of
> fonts. What is your status?
I'm not sure which status you mean, but M-x list-fontsets shows
Fontset: -*-*-*-*-*-*-*-*-*-*-*-*-fontset-default
Fontset: -*-courier new-normal-r-*-*-13-*-*-*-c-*-fontset-standard
> Your file has the correct byte, 0xB9, at the position of LATIN SMALL
> LETTER S WITH CARON, so it is presumably still correctly encoded. To see
> it in ISO/IEC 8859-2 you can use revert-buffer-with-coding-system, C-x
> RET r CODING-SYSTEM. Use M-x list-coding-systems to see what your system
> has.
I don't think my Emacs has a revert-buffer-with-coding-system function (I checked with M-x apropos).
I can only use revert-buffer, with no change of coding system. I use 21.3 on Windows XP.
Radek
* Re: searching for non ascii characters
From: Peter Dyballa @ 2005-08-03 15:11 UTC
Cc: help-gnu-emacs
On 2005-08-03 at 16:52, rahed@cwazy.co.uk wrote:
> I don't think my Emacs has a revert-buffer-with-coding-system function
> (I checked with M-x apropos).
> I can only use revert-buffer, with no change of coding system. I use
> 21.3 on Windows XP.
>
Could be; my 21.3.50 came from CVS ... 21.4 is probably no better in
this respect. I'm not on Windows, but since there are ready-built
Emacsen from CVS for Unices, there should be some for Windows too.
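Where revert-buffer-with-coding-system is missing, a similar effect can probably be had by binding coding-system-for-read around a plain revert-buffer. A minimal sketch: the command name my-revert-buffer-as-latin-2 is made up for illustration, and the code has not been tried on 21.3:

```elisp
(defun my-revert-buffer-as-latin-2 ()
  "Re-read the visited file from disk, decoding it as ISO 8859-2.
Unsaved changes in the buffer are discarded without asking."
  (interactive)
  ;; A non-nil coding-system-for-read overrides encoding detection
  ;; for the file read that revert-buffer performs.
  (let ((coding-system-for-read 'iso-latin-2))
    ;; Arguments: ignore auto-save files, skip the confirmation prompt.
    (revert-buffer t t)))
```

From the keyboard, another route is to kill the buffer and then type C-x RET c iso-latin-2 RET C-x C-f FILE RET, which applies the given coding system to the command that follows.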
--
Greetings
Pete
Mac OS X is like a wigwam: no fences, no gates, but an apache inside.