unofficial mirror of help-gnu-emacs@gnu.org
* replace placeholders
@ 2009-03-01  7:18 henry atting
  2009-03-01 17:17 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: henry atting @ 2009-03-01  7:18 UTC (permalink / raw)
  To: help-gnu-emacs

I have a file which was converted from DOS to Unix line endings and from
latin1 to utf-8. Now it is speckled with placeholders (\226) for
non-displayable characters.
How can I get rid of them? As it is only a display substitution,
`query-replace' does not work.

henry




* Re: replace placeholders
  2009-03-01  7:18 replace placeholders henry atting
@ 2009-03-01 17:17 ` Eli Zaretskii
  2009-03-01 17:55 ` Ian Eure
       [not found] ` <mailman.2136.1235927829.31690.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 7+ messages in thread
From: Eli Zaretskii @ 2009-03-01 17:17 UTC (permalink / raw)
  To: help-gnu-emacs

> From: henry atting <nspm_01@literaturlatenight.de>
> Date: Sun, 01 Mar 2009 08:18:47 +0100
> 
> How can I get rid of it? As it is only a substitution `query-replace'
> does not work.

How did you try to replace them with `query-replace' (how did you type
the \226 thing to `query-replace's prompt)?  And what version of Emacs
is that?





* Re: replace placeholders
  2009-03-01  7:18 replace placeholders henry atting
  2009-03-01 17:17 ` Eli Zaretskii
@ 2009-03-01 17:55 ` Ian Eure
       [not found] ` <mailman.2136.1235927829.31690.help-gnu-emacs@gnu.org>
  2 siblings, 0 replies; 7+ messages in thread
From: Ian Eure @ 2009-03-01 17:55 UTC (permalink / raw)
  To: henry atting; +Cc: help-gnu-emacs

On Feb 28, 2009, at 11:18 PM, henry atting wrote:

> I have a file which was converted from dos to unix and from latin1 to
> utf-8. Now it is speckled with all these placeholders (\226) for non
> presentable signs.

It sounds like either your transcoding to UTF-8 is broken, or you're  
viewing the file with the wrong encoding.

\226 (0xE2) is LATIN SMALL LETTER A WITH CIRCUMFLEX in ISO-8859-1, so  
if that was present in the input it should have been converted to 0xC3  
0xA2.

Alternatively, it could be the start of a UTF-8 encoded code point from
the General Punctuation block (e.g. curly quotes), all of which are three
bytes starting with 0xE2. This would point to your editor reading the
file with the wrong encoding.

Either way, I don't think simply removing the characters is the  
correct solution.
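
The byte values above can be checked with a short Python sketch (Python is used here only for illustration; it is not part of the thread). One assumption is labeled in the comments: Emacs normally displays raw bytes as octal escapes, so \226 may denote byte 0x96 rather than decimal 226 (0xE2). If that is the case, the byte is EN DASH in windows-1252, which a file of DOS/Windows origin could plausibly contain.

```python
# Decimal 226 is 0xE2: "â" in ISO-8859-1, encoded as C3 A2 in UTF-8.
a_circumflex = bytes([0xE2]).decode("latin-1")
a_circumflex_utf8 = a_circumflex.encode("utf-8")

# General Punctuation code points (U+2000..U+206F) all encode to three
# UTF-8 bytes starting with 0xE2, e.g. EN DASH and the curly quotes.
en_dash_utf8 = "\u2013".encode("utf-8")
left_quote_utf8 = "\u201c".encode("utf-8")

# Assumption: the \226 shown by Emacs is an octal escape, i.e. byte 0x96.
# In windows-1252 that byte is EN DASH; in ISO-8859-1 it is a C1 control.
octal_byte = bytes([0o226])
as_cp1252 = octal_byte.decode("cp1252")

print(a_circumflex, a_circumflex_utf8.hex())
print(en_dash_utf8.hex(), left_quote_utf8.hex())
print(octal_byte.hex(), repr(as_cp1252))
```

Which reading applies depends on how the \226 was displayed, so it is worth inspecting the character in the buffer before deciding what to replace it with.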

  - Ian





* Re: replace placeholders
       [not found] ` <mailman.2136.1235927829.31690.help-gnu-emacs@gnu.org>
@ 2009-03-02 14:06   ` henry atting
  2009-03-02 15:02     ` Peter Dyballa
       [not found]     ` <mailman.2209.1236006175.31690.help-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 7+ messages in thread
From: henry atting @ 2009-03-02 14:06 UTC (permalink / raw)
  To: help-gnu-emacs

On Sun, Mar 01 2009, Eli Zaretskii wrote:

>> From: henry atting <nspm_01@literaturlatenight.de>
>> Date: Sun, 01 Mar 2009 08:18:47 +0100
>> 
>> How can I get rid of it? As it is only a substitution `query-replace'
>> does not work.
>
> How did you try to replace them with `query-replace' (how did you type
> the \226 thing to `query-replace's prompt)?  And what version of Emacs
> is that?

My emacs version is: GNU Emacs 23.0.90.1
I simply typed \226 to `query-replace's prompt.

Normally I use `recode' for converting files from latin1 to utf-8.
This time I wanted to do it with emacs, my steps were:

  M-x set-buffer-file-coding-system RET undecided-unix

then

  M-x set-buffer-file-coding-system RET utf-8
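
For comparison, the conversion itself (decode with the source encoding, normalize DOS line endings, re-encode as UTF-8) can be sketched outside Emacs in Python. The function name `to_utf8_unix` and the choice of `cp1252` as the source encoding are assumptions for illustration; a file that really is ISO-8859-1 would use `latin-1` instead. Note that in Emacs, `set-buffer-file-coding-system' only changes the encoding used when the buffer is saved; it does not re-decode bytes that were already read.

```python
def to_utf8_unix(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode raw bytes, convert CRLF line endings to LF, encode as UTF-8."""
    text = raw.decode(source_encoding)
    text = text.replace("\r\n", "\n")
    return text.encode("utf-8")


# A DOS-style line containing byte 0x96 (EN DASH in windows-1252).
sample = b"pages 10\x9620\r\n"
converted = to_utf8_unix(sample)
print(converted)
```

For a real file, the bytes would come from `open(path, "rb").read()` and the result would be written back in binary mode.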

henry



* Re: replace placeholders
  2009-03-02 14:06   ` henry atting
@ 2009-03-02 15:02     ` Peter Dyballa
       [not found]     ` <mailman.2209.1236006175.31690.help-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Dyballa @ 2009-03-02 15:02 UTC (permalink / raw)
  To: henry atting; +Cc: help-gnu-emacs


On 02.03.2009 at 15:06, henry atting wrote:

> I simply typed \226 to `query-replace's prompt.


It should have been: C-q 2 2 6 <any non-digit key>. Some care is
needed: GNU Emacs 23.x (possibly 22.x as well) can be set, via
`read-quoted-char-radix', to read the digits after C-q as hex (to input
Unicode characters directly) or decimal instead of octal. What you
typed instead was the literal string made up of \, 2, 2, and 6.
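
The difference between typing the four characters \226 and entering the single character with that octal code can be illustrated in Python, whose string escapes happen to use the same octal notation. This is an analogy, not Emacs code; the radix comparison simply shows what the digits 226 would mean if read as octal, decimal, or hex.

```python
# The four literal characters backslash, 2, 2, 6 (what was typed at the prompt).
literal = "\\226"

# The single character with octal code 226 (what an octal C-q 2 2 6 enters).
quoted = "\226"

# The same digits read in different radices give different code points.
as_octal = int("226", 8)
as_decimal = int("226", 10)
as_hex = int("226", 16)

print(len(literal), len(quoted), hex(ord(quoted)))
print(hex(as_octal), hex(as_decimal), hex(as_hex))
```

A search for the first string can never match an occurrence of the second, which is why the original query-replace found nothing.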

--
With peaceful greetings

   Pete

Pastoral care instead of health insurance: that is new and liberal, that's who I'm voting for!







* Re: replace placeholders
       [not found]     ` <mailman.2209.1236006175.31690.help-gnu-emacs@gnu.org>
@ 2009-03-03 15:49       ` henry atting
  2009-03-03 20:36         ` Xah Lee
  0 siblings, 1 reply; 7+ messages in thread
From: henry atting @ 2009-03-03 15:49 UTC (permalink / raw)
  To: help-gnu-emacs

On Mon, Mar 02 2009, Peter Dyballa wrote:

> On 02.03.2009 at 15:06, henry atting wrote:
>
>> I simply typed \226 to `query-replace's prompt.
>
>
> It should have been: C-q 2 2 6 <something non-digital> – and sometimes
> care needs to be taken: GNU Emacs 23.x (could be also 22.x)  can be
> set to accept hex (to input directly Unicode characters) and  other
> coding instead of octal. Now you have strings comprised of \,  2, and
> 6.

Yes, this works fine.
Strange. After I eliminated all these placeholders (it is a *.tex file)
I do not really miss anything in the PDF output; the source does not
contain any letter with a circumflex.
Anyway, for now I do not have to convert any other latin1 file.

Thanks
henry



* Re: replace placeholders
  2009-03-03 15:49       ` henry atting
@ 2009-03-03 20:36         ` Xah Lee
  0 siblings, 0 replies; 7+ messages in thread
From: Xah Lee @ 2009-03-03 20:36 UTC (permalink / raw)
  To: help-gnu-emacs

On Mar 3, 7:49 am, henry atting <nspm...@literaturlatenight.de> wrote:
> On Mon, Mar 02 2009, Peter Dyballa wrote:
>
> > On 02.03.2009 at 15:06, henry atting wrote:
>
> >> I simply typed \226 to `query-replace's prompt.
>
> > It should have been: C-q 2 2 6 <something non-digital> – and sometimes
> > care needs to be taken: GNU Emacs 23.x (could be also 22.x)  can be
> > set to accept hex (to input directly Unicode characters) and  other
> > coding instead of octal. Now you have strings comprised of \,  2, and
> > 6.
>
> Yes, this works fine.
> Strange. After I eliminated all these placeholders (it is a *.tex file)
> I do not really miss anything in the PDF output, the source does not
> contain any letter with a circumflex.
> Anyway, for now I do not have to convert any other latin1 file.

The best way I have found is to simply copy the character from the
buffer and paste it into the query-replace prompt.

To find out what that char is, you can do “Ctrl+u Ctrl+x =”. It will
give you the full unicode name, code number, and all sorts of info. But
you have to install the unicode data file first. See:

Q: I have this character α on the screen. How to find out its
unicode's hex value or name?

You can find out a character's decimal, octal, or hex values by
placing your cursor on the character and typing
“Alt+x what-cursor-position” (Ctrl+x =). You can get more info if you
place your cursor on the character, then press “Ctrl+u Ctrl+x =”.

However, if you want the complete unicode info of a character, you
need to download a unicode data file and let emacs know where it is.
The unicode data file can be downloaded at: http://www.unicode.org/Public/UNIDATA/UnicodeData.txt.
After you downloaded it, place the following code in your “~/.emacs”
to let emacs know where it is:

; set unicode data file location. (used by what-cursor-position)
(let ((x "~/Documents/emacs/UnicodeData.txt"))
  (when (file-exists-p x)
    (setq describe-char-unicodedata-file x)))

Then restart emacs. Once you've done this, place your cursor on a
unicode char and do “Ctrl+u Ctrl+x =”; emacs will then give you all
the unicode info about that char, including the code point in decimal,
octal, and hex notations, as well as the unicode character name,
category, the font emacs is using, and others.

For example, here's the output on the character “α”:

      character: α (332721, #o1211661, #x513b1, U+03B1)
        charset: mule-unicode-0100-24ff
                 (Unicode characters of the range U+0100..U+24FF.)
     code point: #x27 #x31
         syntax: w 	which means: word
       category: g:Greek
    buffer code: #x9C #xF4 #xA7 #xB1
      file code: #xCE #xB1 (encoded by coding system mule-utf-8-unix)
        display: by this font (glyph code)
     -apple-symbol-medium-r-normal--14-140-72-72-m-140-mac-symbol (#x61)
   Unicode data:
           Name: GREEK SMALL LETTER ALPHA
       Category: lowercase letter
Combining class: Spacing
  Bidi category: Left-to-Right
      Uppercase: Α
      Titlecase: Α

There are text properties here:
  fontified            t

above from:
• Emacs and Unicode Tips
  http://xahlee.org/emacs/emacs_n_unicode.html
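
Much of the same information can also be obtained outside Emacs. A minimal sketch using Python's standard `unicodedata` module, which ships its own copy of the Unicode database so no data file needs to be downloaded; the helper name `describe_char` is hypothetical and not part of the post above:

```python
import unicodedata


def describe_char(ch: str) -> dict:
    """Return code point, name, category and UTF-8 bytes for one character."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "decimal": cp,
        "octal": oct(cp),
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "bidi": unicodedata.bidirectional(ch),
        "uppercase": ch.upper(),
        "utf8": ch.encode("utf-8").hex(" "),
    }


info = describe_char("α")
for key, value in info.items():
    print(f"{key}: {value}")
```

The UTF-8 bytes reported here correspond to the "file code" line in the Emacs output; the Emacs-internal "buffer code" has no equivalent in this sketch.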

  Xah
∑ http://xahlee.org/
