From mboxrd@z Thu Jan 1 00:00:00 1970
From: ken
Newsgroups: gmane.emacs.help
Subject: fixing M$ character codes, redux
Date: Thu, 07 Oct 2004 18:23:23 -0400
Message-ID: <4165C1DB.4090706@speakeasy.net>
Reply-To: gebser@speakeasy.net

Because I often run into much the same set of goofy M$ (or whatever) characters, I followed the earlier thread about programmatically translating the bad characters into something useful. Unfortunately, as with one other poster here, the supplied script didn't work at all for me. I'm guessing that the set of screwy characters I was getting might be coming from somewhere else, maybe not from Windoze, or maybe it has something to do with the input method or charset I'm set up for.

So I created this handful of commands:

    (replace-string "\205" "..." nil nil nil) ; might be a dash (-) (??)
    (replace-string "\222" "'" nil nil nil)
    (replace-string "\223" "``" nil nil nil)
    (replace-string "\224" "''" nil nil nil)
    (replace-string "\226" "-" nil nil nil)
    (replace-string "\227" "-- " nil nil nil)
    (replace-string "\240" " " nil nil nil) ;soft space

They all work just peachy. I run each one separately by typing C-x C-[ C-[ (for me, the same as C-x ESC ESC), which prompts me in the minibuffer to rerun (redo) the last command. I delete the default that "redo" provides and paste in each of the above "replace-string ..." commands. I developed and tried this on one file today, and it works great.
(Please note that these "characters" appear in the edited file just as they do in the first arguments of the above commands, except that C-f moves over all four characters at once; e.g., the four characters of "\234" are treated as just one character... which, in a sense, it is.)

What I'd like to do is wrap all the above commands into one defun. I tried adapting some other code:

    (defun kef.de8 ()
      "Turn 8bit characters into 7bit equivalents."
      (interactive)
      (mapcar
       (function
        (lambda (old_and_new)
          (save-excursion
            (apply 'query-replace old_and_new))))
       ("\205" "...") ; might be a dash (-) (??)
       ("\222" "'")
       ("\223" "``" )
       ("\224" "''")
       ("\226" "-")
       ("\227" "-- ")
       ("\240" " ") ;soft space
       )))

But running this didn't work: the minibuffer told me it made no replacements, even though the "(replace-string ...)" commands above did work. I could write a little utility in C or some other language to do this, but elisp still makes an idiot out of me. Any help?

BTW, if you're using Linux, check out "man 7 iso_8859-1", "man charsets", and 'man -k character |grep "character set"' for more information on this kind of stuff. Also see "man iconv" for a command-line utility that does character conversions from the shell.
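
One possible way to get the single-defun wrapper asked for above, offered as a minimal sketch rather than a definitive fix. It keeps the name kef.de8 and the same seven pairs from the post; the variable kef.de8-table is a hypothetical name introduced only for this illustration. Two things differ from the attempt above: the pairs are gathered into one quoted list, so Lisp treats them as data instead of trying to call "\205" as a function, and the replacing is done with a search-forward / replace-match loop, which is what the replace-string docstring recommends for use in Lisp programs. The sketch assumes the buffer holds the raw 8-bit bytes that the replace-string commands matched; if the file was decoded some other way, the search strings would need to change.

```elisp
;; Hypothetical table name; the pairs are the ones from the post.
(defvar kef.de8-table
  '(("\205" . "...")   ; the post notes this might be a dash instead
    ("\222" . "'")
    ("\223" . "``")
    ("\224" . "''")
    ("\226" . "-")
    ("\227" . "-- ")
    ("\240" . " "))    ; "soft space" in the post
  "Alist of (OLD . NEW): 8-bit characters and their 7-bit replacements.")

(defun kef.de8 ()
  "Turn 8bit characters into 7bit equivalents in the whole buffer."
  (interactive)
  (save-excursion
    (dolist (pair kef.de8-table)
      ;; Restart from the top of the buffer for every pair.
      (goto-char (point-min))
      (while (search-forward (car pair) nil t)
        ;; FIXEDCASE and LITERAL are both t: insert NEW exactly as written.
        (replace-match (cdr pair) t t)))))
```

After evaluating both forms, M-x kef.de8 runs all the replacements in the current buffer without prompting. Swapping the inner loop for query-replace would bring back the per-match confirmation that the original attempt used.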