* replace placeholders
@ 2009-03-01 7:18 henry atting
2009-03-01 17:17 ` Eli Zaretskii
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: henry atting @ 2009-03-01 7:18 UTC
To: help-gnu-emacs
I have a file which was converted from DOS to Unix and from latin1 to
utf-8. Now it is speckled with placeholders (\226) for characters that
cannot be displayed.
How can I get rid of them? As each one is only a display substitute,
`query-replace' does not work.
henry
* Re: replace placeholders
From: Eli Zaretskii @ 2009-03-01 17:17 UTC
To: help-gnu-emacs
> From: henry atting <nspm_01@literaturlatenight.de>
> Date: Sun, 01 Mar 2009 08:18:47 +0100
>
> How can I get rid of it? As it is only a substitution `query-replace'
> does not work.
How did you try to replace them with `query-replace' (how did you type
the \226 thing to `query-replace's prompt)? And what version of Emacs
is that?
* Re: replace placeholders
From: Ian Eure @ 2009-03-01 17:55 UTC
To: henry atting; +Cc: help-gnu-emacs
On Feb 28, 2009, at 11:18 PM, henry atting wrote:
> I have a file which was converted from dos to unix and from latin1 to
> utf-8. Now it is speckled with all these placeholders (\226) for non
> presentable signs.
It sounds like either your transcoding to UTF-8 is broken, or you're
viewing the file with the wrong encoding.
\226 (0xE2) is LATIN SMALL LETTER A WITH CIRCUMFLEX in ISO-8859-1, so
if that was present in the input it should have been converted to 0xC3
0xA2.
Alternatively, it could be the first byte of a UTF-8-encoded code point
from the General Punctuation block (e.g. curly quotes), all of which are
three bytes long and start with 0xE2. That would point to your editor
reading the file with the wrong encoding.
Either way, I don't think simply removing the characters is the
correct solution.
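The byte arithmetic behind both readings can be checked outside Emacs. The following is a minimal Python sketch, not part of the original message; the sample characters are picked only for illustration.

```python
# Reading 1: the Latin-1 byte 0xE2 is "â"; a correct conversion to UTF-8
# turns it into the two bytes 0xC3 0xA2.
a_circumflex = bytes([0xE2]).decode("latin-1")
utf8_of_a_circumflex = a_circumflex.encode("utf-8")

# Reading 2: characters from the General Punctuation block (U+2000..U+206F)
# encode to three UTF-8 bytes, the first of which is 0xE2.
samples = ["\u2013", "\u2018", "\u201c", "\u2026"]  # en dash, quotes, ellipsis
lead_bytes = {ch.encode("utf-8")[0] for ch in samples}
lengths = {len(ch.encode("utf-8")) for ch in samples}

print(a_circumflex, " ".join(f"{b:02X}" for b in utf8_of_a_circumflex))
print(sorted(f"{b:02X}" for b in lead_bytes), sorted(lengths))
```

A lone 0xE2 shown as a placeholder would therefore fit either case; the sketch cannot tell which one applies to a given file.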
- Ian
* Re: replace placeholders
From: henry atting @ 2009-03-02 14:06 UTC
To: help-gnu-emacs
On Sun, Mar 01 2009, Eli Zaretskii wrote:
>> From: henry atting <nspm_01@literaturlatenight.de>
>> Date: Sun, 01 Mar 2009 08:18:47 +0100
>>
>> How can I get rid of it? As it is only a substitution `query-replace'
>> does not work.
>
> How did you try to replace them with `query-replace' (how did you type
> the \226 thing to `query-replace's prompt)? And what version of Emacs
> is that?
My Emacs version is GNU Emacs 23.0.90.1.
I simply typed \226 at `query-replace's prompt.
Normally I use `recode' for converting files from latin1 to utf-8.
This time I wanted to do it with Emacs; my steps were:
M-x set-buffer-file-coding-system RET undecided-unix
then
M-x set-buffer-file-coding-system RET utf-8
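For comparison, the same conversion can be done outside Emacs, much as `recode' would do it. Below is a minimal Python sketch, assuming the source really is latin1 with DOS line endings; the function name `convert_latin1_dos_to_utf8_unix` and the file names in the usage note are made up for illustration.

```python
from pathlib import Path


def convert_latin1_dos_to_utf8_unix(src: str, dst: str) -> None:
    """Re-encode a Latin-1 file as UTF-8 and turn CRLF line endings into LF."""
    raw = Path(src).read_bytes()
    text = raw.decode("latin-1")        # assumption: the input is Latin-1
    text = text.replace("\r\n", "\n")   # DOS -> Unix line endings
    Path(dst).write_bytes(text.encode("utf-8"))
```

It would be called as `convert_latin1_dos_to_utf8_unix("in.tex", "out.tex")`. Decoding as Latin-1 never fails, so a file that was in some other 8-bit encoding is converted without complaint and only shows up later as odd characters.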
henry
* Re: replace placeholders
From: Peter Dyballa @ 2009-03-02 15:02 UTC
To: henry atting; +Cc: help-gnu-emacs
On 02.03.2009 at 15:06, henry atting wrote:
> I simply typed \226 to `query-replace's prompt.
It should have been C-q 2 2 6 followed by any non-digit key. Some care
is needed, though: GNU Emacs 23.x (possibly 22.x as well) can be set to
read the digits as hex (to input Unicode characters directly) or in
another radix instead of octal. What you typed was the literal string
made up of the characters \, 2, and 6.
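How the digits 226 are read depends on the radix. The Python sketch below is an illustration, not something from the thread: it reads them as octal, decimal and hex, and then shows what the octal value would be if the original file had been windows-1252 rather than latin1, which is only a guess about this particular file.

```python
digits = "226"

as_octal = int(digits, 8)      # how C-q reads digits by default
as_decimal = int(digits, 10)
as_hex = int(digits, 16)

for label, value in (("octal", as_octal), ("decimal", as_decimal), ("hex", as_hex)):
    print(f"{label:8} -> {value:4d}  0x{value:X}")

# Byte 0x96 is a C1 control character in ISO-8859-1 but an en dash in
# windows-1252. Whether that applies to the file discussed here is an assumption.
print(repr(bytes([as_octal]).decode("cp1252")))
```

Read as octal, \226 is byte 0x96, not 0xE2; the latter is what the same digits give when read as decimal.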
--
With peaceful greetings
Pete
Pastoral care instead of health insurance: that is new and liberal, that gets my vote!
* Re: replace placeholders
From: henry atting @ 2009-03-03 15:49 UTC
To: help-gnu-emacs
On Mon, Mar 02 2009, Peter Dyballa wrote:
> On 02.03.2009 at 15:06, henry atting wrote:
>
>> I simply typed \226 to `query-replace's prompt.
>
>
> It should have been: C-q 2 2 6 <something non-digital> – and sometimes
> care needs to be taken: GNU Emacs 23.x (could be also 22.x) can be
> set to accept hex (to input directly Unicode characters) and other
> coding instead of octal. Now you have strings comprised of \, 2, and
> 6.
Yes, this works fine.
Strange. After I eliminated all these placeholders (it is a *.tex file)
I do not really miss anything in the PDF output; the source does not
contain any letter with a circumflex.
Anyway, for now I do not have to convert any other latin1 file.
Thanks
henry
* Re: replace placeholders
From: Xah Lee @ 2009-03-03 20:36 UTC
To: help-gnu-emacs
On Mar 3, 7:49 am, henry atting <nspm...@literaturlatenight.de> wrote:
> On Mon, Mar 02 2009, Peter Dyballa wrote:
>
> > On 02.03.2009 at 15:06, henry atting wrote:
>
> >> I simply typed \226 to `query-replace's prompt.
>
> > It should have been: C-q 2 2 6 <something non-digital> – and sometimes
> > care needs to be taken: GNU Emacs 23.x (could be also 22.x) can be
> > set to accept hex (to input directly Unicode characters) and other
> > coding instead of octal. Now you have strings comprised of \, 2, and
> > 6.
>
> Yes, this works fine.
> Strange. After I eliminated all these placeholders (it is a *.tex file)
> I do not really miss anything in the PDF output, the source does not
> contain any letter with a circumflex.
> Anyway, for now I do not have to convert any other latin1 file.
The best way I have found to replace such a character is simply to copy
and paste it into the query-replace prompt.
To find out what the character is, you can type “Ctrl+u Ctrl+x =”. It
will give you the full Unicode name, code number, and all sorts of other
info. But you have to install the Unicode data file first. See:
Q: I have this character α on the screen. How do I find out its
Unicode hex value or name?
You can find out a character's decimal, octal, or hex value by
placing your cursor on the character and typing
“Alt+x what-cursor-position” (Ctrl+x =). You can get more info if you
place your cursor on the character and then press “Ctrl+u Ctrl+x =”.
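The three notations are the same number written in different bases, which can be checked outside Emacs too. A minimal Python sketch, with an output format only loosely modelled on what Emacs prints:

```python
ch = "α"
code = ord(ch)  # the Unicode code point as an integer

# The same value in decimal, octal and hex.
summary = f"Char: {ch} ({code}, #o{code:o}, #x{code:x})"
print(summary)
```

The hex value is the one that appears in the usual U+03B1 notation.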
However, if you want the complete Unicode info for a character, you
need to download a Unicode data file and tell Emacs where it is.
The Unicode data file can be downloaded from: http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
After you have downloaded it, place the following code in your “~/.emacs”
to tell Emacs where it is:
;; Set the Unicode data file location (used by what-cursor-position).
(let ((x "~/Documents/emacs/UnicodeData.txt"))
  (when (file-exists-p x)
    (setq describe-char-unicodedata-file x)))
Then restart Emacs. Once you have done this, place your cursor on a
Unicode char and type “Ctrl+u Ctrl+x =”. Emacs will then give you all
the Unicode info about that char, including the code point in decimal,
octal, and hex notation, as well as the Unicode character name, category,
the font Emacs is using, and more.
For example, here's the output on the character “α”:
character: α (332721, #o1211661, #x513b1, U+03B1)
charset: mule-unicode-0100-24ff
(Unicode characters of the range U+0100..U+24FF.)
code point: #x27 #x31
syntax: w which means: word
category: g:Greek
buffer code: #x9C #xF4 #xA7 #xB1
file code: #xCE #xB1 (encoded by coding system mule-utf-8-unix)
display: by this font (glyph code)
-apple-symbol-medium-r-normal--14-140-72-72-m-140-mac-symbol
(#x61)
Unicode data:
Name: GREEK SMALL LETTER ALPHA
Category: lowercase letter
Combining class: Spacing
Bidi category: Left-to-Right
Uppercase: Α
Titlecase: Α
There are text properties here:
fontified t
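Most of the fields in that output can be cross-checked outside Emacs. A minimal sketch using the Unicode database bundled with Python (the `unicodedata` module), which may correspond to a different Unicode version than the downloaded UnicodeData.txt; the dictionary keys are just labels chosen here:

```python
import unicodedata

ch = "α"
info = {
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),           # "Ll" = lowercase letter
    "combining class": unicodedata.combining(ch),   # 0 = spacing
    "bidi category": unicodedata.bidirectional(ch), # "L" = left-to-right
    "uppercase": ch.upper(),
    "file code (UTF-8)": " ".join(f"#x{b:02X}" for b in ch.encode("utf-8")),
}
for key, value in info.items():
    print(f"{key}: {value}")
```

The UTF-8 bytes correspond to the “file code” line above; the “buffer code” line is an Emacs-internal representation and has no counterpart here.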
The above is from:
• Emacs and Unicode Tips
http://xahlee.org/emacs/emacs_n_unicode.html
Xah
∑ http://xahlee.org/
☄