From: ken
Newsgroups: gmane.emacs.help
Subject: Re: How to get rid of Microsoft dumb quotes, e.g. \222 for apostrophe?
Date: Mon, 19 Feb 2007 09:17:36 -0500
Message-ID: <45D9B180.2040401@speakeasy.net>
References: <1171628373.417583.61410@k78g2000cwa.googlegroups.com> <87zm7e8e7j.fsf@wivenhoe.staff8.ul.ie>
To: help-gnu-emacs@gnu.org

> "Endless Story" writes:
>
>> I have just started seeing lots of nasty stuff like \222 instead of
>> apostrophes in working on text files in Emacs on XP, then trying to
>> reformat these files for LaTeX.
>
> ....

The code below is not textbook elisp, but it works, is easy to
understand, and is easy to modify and extend with other "characters".
For example, if you have a typical set of chars which signal the
beginning of a paragraph (like "\n\n"), you could insert another
replace-string line to convert that to the appropriate LaTeX (or HTML
or whatever) coding for "paragraph". Such a set of replacements might
be better organized into a separate (but similar) function, however.

Open Source == Your Choice.

(defun replace-garbage-chars ()
  "Replace goofy MS and other garbage characters with latin1 equivalents."
  (interactive)
  (save-excursion                       ; save the current point
    (replace-string "—" "--" nil (point-min) (point-max)) ; multi-byte
    (replace-string "‘" "`" nil (point-min) (point-max))
    (replace-string "’" "'" nil (point-min) (point-max))
    (replace-string "“" "``" nil (point-min) (point-max))
    (replace-string "”" "''" nil (point-min) (point-max))
    (replace-string "–" "--" nil (point-min) (point-max))))

Note that the chars/strings within the first set of double-quotes in
each pair of replace-string args appear in Emacs as, e.g., "\221". To
enter one of these escaped (octal) numbers, e.g. "\221", do
C-q 2 2 1 RETURN.

Also, multi-byte strings such as the first should be toward the top of
the list so that single-byte replacements don't cut them up, which
would make subsequent searches for them impossible.

To discover the code for a new (garbage) char to be replaced, put the
point over it and do "C-x ="; the octal code returned in the minibuffer
(the first one listed in my Emacs) is the escaped number you want to
replace.

With this function in a file in a directory on the Emacs load path, and
this in my ~/.emacs:

(global-set-key "\C-cr" 'replace-garbage-chars)

doing C-c r in an Emacs buffer performs the replacements without moving
the point... exactly what I was looking for.

Enjoy,
ken
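
For comparison, the documentation for replace-string suggests that Lisp
programs use a loop over search-forward and replace-match instead. The
following is only a sketch of that approach, not a drop-in replacement
for the function above. The names my-garbage-char-table and
my-replace-garbage-chars are made up for illustration, and the sketch
assumes the buffer was decoded so that the offending characters are real
Unicode punctuation. If they show up as "\221" and friends instead, the
search strings would have to be entered with C-q as described above.

```elisp
;; Sketch only: `my-garbage-char-table' and `my-replace-garbage-chars'
;; are hypothetical names, not part of Emacs or of the original function.
(defvar my-garbage-char-table
  '(("\u2014" . "--")    ; em dash
    ("\u2013" . "--")    ; en dash
    ("\u2018" . "`")     ; left single quote
    ("\u2019" . "'")     ; right single quote / apostrophe
    ("\u201C" . "``")    ; left double quote
    ("\u201D" . "''"))   ; right double quote
  "Alist of (SEARCH . REPLACEMENT) strings.
Longer SEARCH strings should come first so that shorter ones
cannot cut them up.")

(defun my-replace-garbage-chars ()
  "Replace every SEARCH string in `my-garbage-char-table' in the buffer."
  (interactive)
  (save-excursion                       ; leave point where it was
    (dolist (pair my-garbage-char-table)
      (goto-char (point-min))           ; restart at the top for each pair
      (while (search-forward (car pair) nil t)
        ;; FIXEDCASE and LITERAL are both t: insert the replacement verbatim.
        (replace-match (cdr pair) t t)))))
```

Adding a new character then means adding one line to the table rather
than another replace-string call, and the function can be run with
M-x my-replace-garbage-chars or bound to a key in the same way as above.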