From: "Buchs, Kevin"
Newsgroups: gmane.emacs.help
Subject: those funny non-ASCII characters
Date: Thu, 24 May 2012 18:49:29 -0500

I often paste content from web pages into an Emacs org-mode buffer, and I get odd quote characters or dashes that are not ASCII. I wrote a Lisp function to remove the Unicode ones that fit in 8 bits. Lately I am seeing characters that are not being caught. They display in Emacs as the expected character, but when I kill and yank them into my Lisp code, they are not found. When I save the buffer, I am asked for a coding system and chose raw text.
When I open the file again, these characters show up as a special symbol (a dashed circle with a flag off the top) followed by pairs or triples of \2xx escapes. For example, the dash character I just stored appears as this sequence: circle-flag \200 \231. Dumping them with GNU/Linux od, I get octal byte strings such as: 340 245 206 340 244 206 210 200, and for the dash mentioned above, 342 200 231.

I am very naive about character coding, so please excuse my ignorance. I would guess these are 16-bit Unicode (UTF-16) characters. Can someone enlighten me as to how I can determine what these characters are (after they are pasted into a buffer) and how I can write a function to replace them with ASCII equivalents? The only thing I could think of was hexl mode, but that didn't turn out well.

Thanks.

Kevin Buchs | Senior Engineer | SPPDG | 507-538-5459 | buchs.kevin@mayo.edu
Mayo Clinic | 200 First Street SW | Rochester, MN 55905 | http://www.mayo.edu/sppdg
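
One way to check what the od bytes stand for, outside Emacs, is to decode them by hand. The sketch below is only an illustration in Python, and it assumes the dumped values are octal and that the bytes are UTF-8 rather than a 16-bit encoding; neither is established in the post. The names describe_octal_bytes, ASCII_EQUIVALENTS and to_ascii are made up for this example, and the replacement table is a small hypothetical sample, not a complete list.

```python
import unicodedata


def describe_octal_bytes(dump):
    """Decode space-separated octal byte values as UTF-8 and name each character.

    Assumption: the bytes are valid UTF-8; decode() raises UnicodeDecodeError
    if they are not.
    """
    raw = bytes(int(token, 8) for token in dump.split())
    text = raw.decode("utf-8")
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in text]


# Hypothetical sample table: a few typographic characters and ASCII stand-ins.
ASCII_EQUIVALENTS = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "--",  # em dash
})


def to_ascii(text):
    """Replace the characters listed in ASCII_EQUIVALENTS; leave the rest alone."""
    return text.translate(ASCII_EQUIVALENTS)


if __name__ == "__main__":
    print(describe_octal_bytes("342 200 231"))
    # prints ['U+2019 RIGHT SINGLE QUOTATION MARK']
    print(to_ascii("it\u2019s \u201cfine\u201d"))
    # prints it's "fine"
```

Under those assumptions, 342 200 231 comes out as a single character, U+2019, stored as three bytes. Inside Emacs, C-u C-x = (what-cursor-position with a prefix argument) reports the same kind of information for the character at point.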