* replace placeholders
From: henry atting @ 2009-03-01 7:18 UTC
To: help-gnu-emacs

I have a file which was converted from DOS to Unix and from latin1 to
utf-8. Now it is speckled with all these placeholders (\226) for
non-presentable signs.

How can I get rid of them? As \226 is only a display substitute for the
real character, `query-replace' does not find it.

henry

* Re: replace placeholders
From: Eli Zaretskii @ 2009-03-01 17:17 UTC
To: help-gnu-emacs

> From: henry atting <nspm_01@literaturlatenight.de>
> Date: Sun, 01 Mar 2009 08:18:47 +0100
>
> How can I get rid of it? As it is only a substitution `query-replace'
> does not work.

How did you try to replace them with `query-replace' (how did you type
the \226 thing at `query-replace's prompt)? And what version of Emacs
is that?

* Re: replace placeholders
From: Ian Eure @ 2009-03-01 17:55 UTC
To: henry atting; +Cc: help-gnu-emacs

On Feb 28, 2009, at 11:18 PM, henry atting wrote:

> I have a file which was converted from dos to unix and from latin1 to
> utf-8. Now it is speckeld with all these placeholders (\226) for non
> presentable signs.

It sounds like either your transcoding to UTF-8 is broken, or you're
viewing the file with the wrong encoding.

Read as decimal, 226 (0xE2) is LATIN SMALL LETTER A WITH CIRCUMFLEX in
ISO-8859-1, so if that was present in the input it should have been
converted to 0xC3 0xA2. Alternatively, it could be the start of a UTF-8
encoded code point from the General Punctuation block (e.g. curly
quotes), which are all three bytes starting with 0xE2. This would point
to your editor reading the file with the wrong encoding.

Note, though, that Emacs displays such escapes in octal, and octal 226
is byte 0x96: a C1 control character in ISO-8859-1, and EN DASH in
windows-1252.

Either way, I don't think simply removing the characters is the correct
solution.

- Ian
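
The byte values in this message can be checked from Emacs Lisp. The
following is only a sketch: it assumes that the \226 on screen is Emacs's
octal escape notation, and the helper name `my-utf8-bytes' is made up
for illustration.

```elisp
;; Emacs shows an undisplayable byte or control character as a backslash
;; followed by OCTAL digits, so "\226" means octal 226, not decimal 226.
(format "#x%X" #o226)   ; => "#x96"
(format "#x%X" 226)     ; => "#xE2"

(defun my-utf8-bytes (ch)
  "Return the UTF-8 encoding of character CH as a list of hex strings.
Hypothetical helper, not part of Emacs."
  (mapcar (lambda (byte) (format "%02X" byte))
          ;; `append' with nil turns the unibyte string into a byte list.
          (append (encode-coding-string (string ch) 'utf-8) nil)))

;; U+00E2, LATIN SMALL LETTER A WITH CIRCUMFLEX: two bytes.
(my-utf8-bytes #xE2)     ; => ("C3" "A2")

;; U+2013, EN DASH, from the General Punctuation block: three bytes,
;; the first of which is E2.
(my-utf8-bytes #x2013)   ; => ("E2" "80" "93")

;; The single byte #x96 decoded as windows-1252 is U+2013 (EN DASH).
(decode-coding-string (unibyte-string #x96) 'windows-1252)
```

If the file was really written on DOS/Windows as windows-1252 (an
assumption, the thread does not confirm it), the last form suggests the
placeholders stand for en dashes rather than for accented letters.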

* Re: replace placeholders
From: henry atting @ 2009-03-02 14:06 UTC
To: help-gnu-emacs

On Sun, Mar 01 2009, Eli Zaretskii wrote:

>> From: henry atting <nspm_01@literaturlatenight.de>
>> Date: Sun, 01 Mar 2009 08:18:47 +0100
>>
>> How can I get rid of it? As it is only a substitution `query-replace'
>> does not work.
>
> How did you try to replace them with `query-replace' (how did you type
> the \226 thing to `query-replace's prompt)? And what version of Emacs
> is that?

My Emacs version is GNU Emacs 23.0.90.1.

I simply typed \226 at `query-replace's prompt.

Normally I use `recode' for converting files from latin1 to utf-8.
This time I wanted to do it with Emacs. My steps were:

M-x set-buffer-file-coding-system RET undecided-unix

then

M-x set-buffer-file-coding-system RET utf-8

henry
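
`set-buffer-file-coding-system' only changes how the buffer will be
written; it does not change how the file's bytes were decoded when the
file was read. A minimal sketch of a conversion that names the source
encoding explicitly follows. The function name `my-recode-file-to-utf8'
is hypothetical, and the choice of `windows-1252' in the usage line is
an assumption about the file, not something the thread establishes.

```elisp
(defun my-recode-file-to-utf8 (file source-coding)
  "Rewrite FILE in place as UTF-8 with Unix line endings.
SOURCE-CODING is the coding system the bytes of FILE are assumed to
be in, for example `latin-1' or `windows-1252'.
Hypothetical helper, not part of Emacs."
  (with-temp-buffer
    ;; Decode the bytes with the stated coding system instead of
    ;; letting Emacs guess.
    (let ((coding-system-for-read source-coding))
      (insert-file-contents file))
    ;; Encode the decoded text as UTF-8 with LF line endings on write.
    (let ((coding-system-for-write 'utf-8-unix))
      (write-region (point-min) (point-max) file))))

;; Hypothetical use (the file name is a placeholder):
;; (my-recode-file-to-utf8 "~/paper.tex" 'windows-1252)
```

Interactively, a similar result can be had with C-x RET r
(`revert-buffer-with-coding-system') to re-read the file in the right
encoding, then C-x RET f utf-8-unix RET and saving the buffer.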

* Re: replace placeholders
From: Peter Dyballa @ 2009-03-02 15:02 UTC
To: henry atting; +Cc: help-gnu-emacs

On 02.03.2009 at 15:06, henry atting wrote:

> I simply typed \226 to `query-replace's prompt.

It should have been: C-q 2 2 6 <any non-digit key>. Some care is needed
here: GNU Emacs 23.x (possibly 22.x as well) can be set to accept hex
(to input Unicode characters directly) or another radix instead of
octal. What you typed instead is a literal string made up of the
characters \, 2, and 6.

--
With peaceful greetings

Pete

Pastoral care instead of health insurance: that is new and liberal,
that's what I'll vote for!
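
The same replacement can be done without typing the character at all.
The sketch below assumes the placeholder is the character with code
octal 226 (U+0096), which is what C-q 2 2 6 inserts in a multibyte
buffer; `my-replace-char' is a hypothetical helper name. The radix that
C-q uses is controlled by the variable `read-quoted-char-radix'.

```elisp
;; C-q reads a character code in the radix given by this variable.
;; The default is 8 (octal), which is why C-q 2 2 6 works.
;; To type codes in hex instead (C-q 9 6 for the same character):
;; (setq read-quoted-char-radix 16)

(defun my-replace-char (code replacement)
  "Replace each character with code CODE in the buffer by REPLACEMENT.
REPLACEMENT is a string.  Return the number of replacements made.
Hypothetical helper, not part of Emacs."
  (let ((count 0)
        (case-fold-search nil))
    (save-excursion
      (goto-char (point-min))
      (while (search-forward (string code) nil t)
        ;; Literal replacement, no case conversion.
        (replace-match replacement t t)
        (setq count (1+ count))))
    count))

;; Hypothetical use: turn each character shown as \226 into an en dash.
;; (my-replace-char #o226 "\u2013")
```

Passing the code as a number sidesteps the problem from the original
post, where the four characters \, 2, 2, 6 were searched for literally.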

* Re: replace placeholders
From: henry atting @ 2009-03-03 15:49 UTC
To: help-gnu-emacs

On Mon, Mar 02 2009, Peter Dyballa wrote:

> On 02.03.2009 at 15:06, henry atting wrote:
>
>> I simply typed \226 to `query-replace's prompt.
>
> It should have been: C-q 2 2 6 <something non-digital> – and sometimes
> care needs to be taken: GNU Emacs 23.x (could be also 22.x) can be
> set to accept hex (to input directly Unicode characters) and other
> coding instead of octal. Now you have strings comprised of \, 2, and
> 6.

Yes, this works fine.

Strange: after I eliminated all these placeholders (it is a *.tex file)
I do not really miss anything in the PDF output, and the source does not
contain any letter with a circumflex.

Anyway, for now I do not have to convert any other latin1 file.

Thanks
henry

* Re: replace placeholders
From: Xah Lee @ 2009-03-03 20:36 UTC
To: help-gnu-emacs

On Mar 3, 7:49 am, henry atting <nspm...@literaturlatenight.de> wrote:
> On Mon, Mar 02 2009, Peter Dyballa wrote:
>
> > On 02.03.2009 at 15:06, henry atting wrote:
>
> >> I simply typed \226 to `query-replace's prompt.
>
> > It should have been: C-q 2 2 6 <something non-digital> – and sometimes
> > care needs to be taken: GNU Emacs 23.x (could be also 22.x) can be
> > set to accept hex (to input directly Unicode characters) and other
> > coding instead of octal. Now you have strings comprised of \, 2, and
> > 6.
>
> Yes, this works fine.
> Strange. After I eliminated all these placeholders (it is a *.tex file)
> I do not really miss anything in the PDF output, the source does not
> contain any letter with a circumflex.
> Anyway, for now I do not have to convert any other latin1 file.

The best way I found to replace it is simply to copy and paste the char
into query-replace.

To find out what that char is, you can do “Ctrl+u Ctrl+x =”. It'll give
you the full Unicode name, code number, and all sorts of info. But you
have to install the Unicode data file... See:

Q: I have this character α on the screen. How to find out its Unicode
hex value or name?

You can find out a character's decimal, octal, or hex values by placing
your cursor on the character and typing “Alt+x what-cursor-position”
(Ctrl+x =). You can get more info if you place your cursor on the
character, then press “Ctrl+u Ctrl+x =”.

However, if you want the complete Unicode info of a character, you need
to download a Unicode data file and let Emacs know where it is. The
Unicode data file can be downloaded at:
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

After you have downloaded it, place the following code in your
“~/.emacs” to let Emacs know where it is:

    ; set unicode data file location. (used by what-cursor-position)
    (let ((x "~/Documents/emacs/UnicodeData.txt"))
      (when (file-exists-p x)
        (setq describe-char-unicodedata-file x)))

Then restart Emacs.

Once you've done this, place your cursor on a Unicode char and do
“Ctrl+u Ctrl+x =”. Emacs will then give you all the Unicode info about
that char, including the code point in decimal, octal, and hex
notations, as well as the Unicode character name, category, the font
Emacs is using, and others. For example, here's the output on the
character “α”:

      character: α (332721, #o1211661, #x513b1, U+03B1)
        charset: mule-unicode-0100-24ff
                 (Unicode characters of the range U+0100..U+24FF.)
     code point: #x27 #x31
         syntax: w    which means: word
       category: g:Greek
    buffer code: #x9C #xF4 #xA7 #xB1
      file code: #xCE #xB1 (encoded by coding system mule-utf-8-unix)
        display: by this font (glyph code)
         -apple-symbol-medium-r-normal--14-140-72-72-m-140-mac-symbol (#x61)

    Unicode data:
      Name: GREEK SMALL LETTER ALPHA
      Category: lowercase letter
      Combining class: Spacing
      Bidi category: Left-to-Right
      Uppercase: Α
      Titlecase: Α

    There are text properties here:
      fontified            t

above from:
• Emacs and Unicode Tips
  http://xahlee.org/emacs/emacs_n_unicode.html

  Xah
∑ http://xahlee.org/

☄
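
Part of this information can also be obtained from Lisp. The sketch
below assumes an Emacs that provides `get-char-code-property' (Emacs 23
and later, which ship their own copy of the Unicode data); the function
names `my-char-summary' and `my-describe-char-at-point' are made up for
illustration.

```elisp
(defun my-char-summary (ch)
  "Return a one-line description of character CH.
Hypothetical helper, not part of Emacs."
  (format "U+%04X  decimal %d  octal #o%o  name %s"
          ch ch ch
          ;; The `name' property is the Unicode character name, if known.
          (or (get-char-code-property ch 'name) "unknown")))

(defun my-describe-char-at-point ()
  "Show a short summary of the character at point in the echo area."
  (interactive)
  (let ((ch (char-after)))
    (if ch
        (message "%s" (my-char-summary ch))
      (message "No character at point"))))

;; For α the summary starts with: U+03B1  decimal 945  octal #o1661
;; (my-char-summary ?α)
```

With point on a placeholder, M-x my-describe-char-at-point shows its
code in the same octal form that the \226 display uses, so the two can
be compared directly.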