From mboxrd@z Thu Jan 1 00:00:00 1970
From: ken
Newsgroups: gmane.emacs.help
To: GNU Emacs List
Subject: Re: replacing characters and whacky trans-buffer conversion
Date: Wed, 07 Mar 2007 16:03:52 -0500
Message-ID: <45EF28B8.2050905@speakeasy.net>
In-Reply-To: <45EF2512.9060200@speakeasy.net>
References: <45ED8574.3040201@speakeasy.net> <45EF2512.9060200@speakeasy.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
User-Agent: Thunderbird 2.0pre (X11/20070214)
List-Id: Users list for the GNU Emacs text editor
Xref: news.gmane.org gmane.emacs.help:41768

Sorry... in the below it should say 5542 instead of 5442.

On 03/07/2007 03:48 PM somebody named ken wrote:
> Okay, try this:
>
> Create two buffers in one emacs frame.
>
> In one of them enter C-q 5442 RETURN.  You should get a Greek character
> which looks much like a German double-s.
>
> Using only emacs, put this character into the kill buffer and yank it
> into the second buffer.
>
> When I do this, I get a different character in the second buffer.  Its
> coding (ascertained via C-x=) is also different.
>
> Go back to the first buffer and do another yank.  What character do you
> get?  I get the original character, the one inserted with C-q 5442 RETURN.
>
>
>
> On 03/06/2007 10:15 AM somebody named ken wrote:
>> An email comes in with this (emdash) character in it: –
>>
>> It looks like an em-dash until the text containing it is pasted into an
>> emacs buffer; then it appears as a series of "garbage characters".
>> (Copy and paste the emdash into an emacs buffer yourself, and perhaps
>> you'll see what I mean.)
>>
>> To me and, possibly to you, this emdash appears in emacs as nine (9)
>> "garbage" characters.
>>
>> Because I want to programmatically replace these 9 garbage characters
>> with something latin1-friendly, I copy-and-paste these nine characters
>> into an *.el file containing a line like this:
>>
>> (replace-string "–" "--" nil (point-min) (point-max))
>>
>> The sought string (i.e., the first argument above) isn't found, however,
>> because, for some whacky reason, the emdash pasted into the *.el file is
>> different-- by one character-- from exactly the same emdash pasted into
>> the other emacs buffer (the one I'm saving the email in).
>>
>> In the emacs buffer containing the email, the fourth garbage character
>> (as shown by C-u C-x=) is:
>>
>> character: β (05542, 2914, 0xb62)
>> charset: greek-iso8859-7
>> (Right-Hand Part of Latin/Greek Alphabet (ISO/IEC 8859-7): ISO-IR-126)
>> code point: 98
>> syntax: word
>> category: g:Greek
>> buffer code: 0x86 0xE2
>> file code: not encodable by coding system undecided-unix
>> font: -ETL-Fixed-Medium-R-Normal--16-160-72-72-C-80-ISO8859-7
>>
>> In the *.el buffer, the fourth garbage character (which should be
>> exactly the same character) is:
>>
>> character: â (0342, 226, 0xe2)
>> charset: eight-bit-graphic (8-bit graphic char (0xA0..0xFF))
>> code point: 226
>> syntax: whitespace
>> category:
>> buffer code: 0xE2
>> file code: 0xE2 (encoded by coding system raw-text-unix)
>> font: -ETL-Fixed-Medium-R-Normal--16-160-72-72-C-80-ISO8859-1
>>
>> I tried entering "C-q 5542 RETURN" into the *.el file, but emacs
>> immediately makes it into the second (â, or 0342) character.  Doing the
>> same into the other emacs buffer (containing my copy of the email)
>> *does* enter the good (β, or 05542) character.
>>
>> All I really want is for the above replace-string function to work as
>> expected.  But emacs consistently converts that fourth character in the
>> emdash string into a different character, subsequently causing the
>> search to fail.
>> So how do I get the correct "garbage" characters into
>> the first argument of the replace-string function-- i.e., into the *.el
>> file?
>>
>>
>> tnx,
>> ken
>>
>>
>
>

-- 
"Genius might be described as a supreme capacity for getting its
possessors into trouble of all kinds."
                -- Samuel Butler
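
Both C-x= reports quoted above end in the same buffer byte, 0xE2: one buffer reads it as a greek-iso8859-7 character, the other as a raw eight-bit character. The following is only a minimal sketch of that difference, and of one possible way to build a search string from a character code instead of pasting the character into the *.el file. It assumes Emacs 23 or later (Unicode-based internals), not the Emacs version used in this thread, and the `my-` function names are hypothetical.

```elisp
;; Sketch only; assumes Emacs 23 or later.
;; The `my-' names are hypothetical and do not come from the thread.

(defun my-decode-byte (byte coding)
  "Return the character that the single BYTE decodes to under CODING."
  (aref (decode-coding-string (unibyte-string byte) coding) 0))

;; The same byte, read two ways:
(my-decode-byte #xE2 'iso-8859-7)   ; Greek small letter beta, U+03B2
(my-decode-byte #xE2 'iso-8859-1)   ; Latin small a with circumflex, U+00E2

(defun my-replace-char-by-code (code replacement)
  "Replace every character whose code is CODE with the string REPLACEMENT.
Scans the accessible part of the current buffer and returns the number
of replacements made.  The search string is built from CODE, so the
character itself never has to be typed or pasted into the source file."
  (save-excursion
    (goto-char (point-min))
    (let ((needle (string code))
          (case-fold-search nil)
          (count 0))
      (while (search-forward needle nil t)
        ;; FIXEDCASE and LITERAL are both t: insert REPLACEMENT verbatim.
        (replace-match replacement t t)
        (setq count (1+ count)))
      count)))
```

In a buffer whose text was decoded correctly, something like (my-replace-char-by-code #x2013 "--") would replace each U+2013 EN DASH, if that is indeed the character in the incoming mail. In older Emacs versions the internal character codes differ (the 05542 reported by C-x= above is one such code), so the numeric value passed in would have to be the one C-x= reports in that version.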