unofficial mirror of help-gnu-emacs@gnu.org
From: ken <gebser@speakeasy.net>
To: GNU Emacs List <help-gnu-emacs@gnu.org>
Subject: replacing characters and whacky trans-buffer conversion
Date: Tue, 06 Mar 2007 10:15:00 -0500
Message-ID: <45ED8574.3040201@speakeasy.net>


An email comes in with this (emdash) character in it: –

It looks like an em-dash until the text containing it is pasted into an
emacs buffer; then it appears as a series of "garbage characters".
(Copy and paste the emdash into an emacs buffer yourself, and perhaps
you'll see what I mean.)

To me, and possibly to you, this emdash appears in Emacs as nine (9)
"garbage" characters.

Because I want to programmatically replace these 9 garbage characters
with something latin1-friendly, I copy-and-paste these nine characters
into an *.el file containing a line like this:

  (replace-string "–" "--" nil (point-min) (point-max))

The sought string (i.e., the first argument above) isn't found, however,
because, for some wacky reason, the emdash pasted into the *.el file
differs by one character from exactly the same emdash pasted into
the other Emacs buffer (the one I'm saving the email in).

In the emacs buffer containing the email, the fourth garbage character
(as shown by C-u C-x=) is:

  character: β (05542, 2914, 0xb62)
    charset: greek-iso8859-7
	     (Right-Hand Part of Latin/Greek Alphabet (ISO/IEC 8859-7): ISO-IR-126)
 code point: 98
     syntax: word
   category: g:Greek
buffer code: 0x86 0xE2
  file code: not encodable by coding system undecided-unix
       font: -ETL-Fixed-Medium-R-Normal--16-160-72-72-C-80-ISO8859-7

In the *.el buffer, the fourth garbage character (which should be
exactly the same character) is:

  character: â (0342, 226, 0xe2)
    charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
 code point: 226
     syntax: whitespace
   category:
buffer code: 0xE2
  file code: 0xE2 (encoded by coding system raw-text-unix)
       font: -ETL-Fixed-Medium-R-Normal--16-160-72-72-C-80-ISO8859-1

I tried entering "C-q 5542 RETURN" into the *.el file, but Emacs
immediately turns it into the second (â, or 0342) character.  Doing the
same in the other Emacs buffer (containing my copy of the email)
*does* enter the good (β, or 05542) character.
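
The two C-u C-x = reports above seem consistent with the buffers
storing text differently: one shows a multibyte charset
(greek-iso8859-7), the other shows eight-bit-graphic with a
raw-text-unix file coding.  A minimal sketch for comparing the two
buffers side by side follows.  The function name
my-buffer-encoding-summary is made up for illustration; the two
variables it reads, enable-multibyte-characters and
buffer-file-coding-system, are standard Emacs buffer-local variables.

```elisp
;; Hypothetical helper (not from the original message): report how a
;; buffer stores its text, so two buffers can be compared.
(defun my-buffer-encoding-summary (&optional buffer)
  "Return a plist describing how BUFFER (default: current) stores text."
  (with-current-buffer (or buffer (current-buffer))
    (list :multibyte enable-multibyte-characters
          :coding-system buffer-file-coding-system)))
```

Evaluating (my-buffer-encoding-summary) once in the email buffer and
once in the *.el buffer should show whether they differ in the
:multibyte flag or in the coding system, which may be why the same
C-q input ends up as different characters.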

All I really want is for the above replace-string function to work as
expected.  But emacs consistently converts that fourth character in the
emdash string into a different character, subsequently causing the
search to fail.  So how do I get the correct "garbage" characters into
the first argument of the replace-string function-- i.e., into the *.el
file?
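
One possible way around pasting literal characters into the *.el file
is to build the search string from a character code, so that the
encoding of the *.el file no longer matters.  The sketch below rests
on two assumptions: that the character is U+2013 (which is how it
appears in this copy of the message), and that the email buffer has
been decoded so that it holds that single character rather than
several "garbage" characters.  The function name my-replace-en-dash is
made up for illustration.  It uses search-forward and replace-match,
which the replace-string documentation recommends for Lisp code.

```elisp
;; Hypothetical helper (not from the original message).
;; Assumes the buffer was decoded correctly and contains U+2013 itself.
(defun my-replace-en-dash (&optional replacement)
  "Replace every U+2013 in the current buffer with REPLACEMENT.
REPLACEMENT defaults to two ASCII hyphens.
Return the number of replacements made."
  (let ((dash (string (decode-char 'ucs #x2013))) ; built from the code point
        (new (or replacement "--"))
        (count 0))
    (save-excursion
      (goto-char (point-min))
      ;; Literal (non-regexp) search through the whole buffer.
      (while (search-forward dash nil t)
        ;; FIXEDCASE and LITERAL are both t: insert NEW exactly as given.
        (replace-match new t t)
        (setq count (1+ count))))
    count))
```

If the buffer really holds the nine separate characters instead, this
will find nothing; in that case the decoding of the email buffer would
probably need to be sorted out first.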


tnx,
ken


-- 
"Genius might be described as a supreme capacity for getting its
possessors into trouble of all kinds."
	-- Samuel Butler

