From: dkcombs@panix.com (David Combs)
Newsgroups: gmane.emacs.help
Subject: Re: how to scan file for non-ascii chars (eg cut-n-paste from ms-word)
Date: Tue, 18 Jan 2011 19:54:12 +0000 (UTC)
Organization: Public Access Networks Corp.
To: help-gnu-emacs@gnu.org

In article , Eli Zaretskii wrote:
>> From: dkcombs@panix.com (David Combs)
>> Newsgroups: gnu.emacs.help
>> Date: 8 Jan 2011 19:53:01 -0500
>>
>> When I 'cut-n-paste' from eg ms-word-produced document, into an
>> emacs buffer (ie ascii), you get all kinds of "non-ascii" chars,
>> eg left and right double-quotes, like these:
>>
>>
>> Char: . (8221, #o20035, #x201d) point=250 of 4096 (6%) column=7
>> Char: . (8220, #o20034, #x201c) point=218 of 4096 (5%) column=42
>>
>>
>> accents, and so on.
>>
>> When I go to save the buffer, emacs will ask if I want to
>> save it in eg japanese format. Not exactly what I want.
>
>Doesn't it suggest utf-8 as one of the possible encodings? If so, why
>not use utf-8 and leave these characters in the file?

Because (er, as an excuse) I often want to copy-paste them
into an ASCII hints-and-tricks file I keep for my own use,
and which I then edit and search-within via emacs (of course).

Suppose I want to PRINT from that supposedly-ASCII file on my
old (but wonderful) HP-1200 laserjet. All it has for fonts are
the original times, some sans-serif one, something else (I forget),
and "symbol". Isn't that a problem?

FURTHER, and more importantly, how do I *search* for one of these
funny things, a left-double-quote, say? It's so *easy* to
just hit C-s "!

Given my current state of emacs-knowledge on "foreign" fonts (like zero),
that's what I say -- until I can somehow learn more.

Thanks!

>
>> What I'd like to do is change those "strange" characters
>> to their plain-ascii "equivalent", so to speak. Like
>> '"' for double quote (left OR right), etc.
>
>Not sure why would you want that, but doesn't M-% solve this problem
>nicely? If not, why not?
>

You mean do a query-replace on each non-ascii char?

How do I even know which ones are even *in* some buffer of text?

What'd be nice is something that went through the whole
buffer *once*, doing the "right thing" with each non-ascii char.
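
For illustration only, here is a minimal sketch of that "go through it once" idea. It is written in Python rather than Emacs Lisp and is not taken from this thread. The names `ASCII_EQUIVALENTS`, `find_non_ascii` and `asciify` are hypothetical, and the replacement table is just an assumed starting set of typical word-processor characters, not a complete or authoritative list. The first function answers "which ones are even in there?", the second does the one-pass replacement.

```python
import unicodedata

# Hypothetical mapping from common word-processor characters to ASCII
# stand-ins; an assumption for this sketch, extend it to taste.
ASCII_EQUIVALENTS = {
    "\u201c": '"',    # left double quotation mark (8220)
    "\u201d": '"',    # right double quotation mark (8221)
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}


def find_non_ascii(text):
    """Return (offset, char, code point) for every non-ASCII character."""
    return [(i, ch, ord(ch)) for i, ch in enumerate(text) if ord(ch) > 127]


def asciify(text):
    """Replace each non-ASCII character in a single pass over the text.

    Table entries win; otherwise accents are stripped via NFKD
    decomposition; anything still non-ASCII becomes '?'.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in ASCII_EQUIVALENTS:
            out.append(ASCII_EQUIVALENTS[ch])
        else:
            decomposed = unicodedata.normalize("NFKD", ch)
            stripped = "".join(c for c in decomposed if ord(c) < 128)
            out.append(stripped or "?")
    return "".join(out)


if __name__ == "__main__":
    sample = "\u201cna\u00efve\u201d caf\u00e9 \u2014 done\u2026"
    # Report what is in the text before changing anything.
    for offset, ch, code in find_non_ascii(sample):
        print(f"offset={offset} U+{code:04X} {unicodedata.name(ch, 'UNKNOWN')}")
    print(asciify(sample))
```

Falling back to '?' for anything the table and the decomposition cannot handle is a design choice of this sketch: it keeps the output strictly ASCII and makes the leftovers easy to search for afterwards. Inside Emacs itself, a regexp search (C-M-s) for `[^[:ascii:]]` should jump to each non-ASCII character in turn.
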
Do I make any sense? Or do I not really understand?

Thanks,

David