unofficial mirror of help-gnu-emacs@gnu.org
From: ken <gebser@speakeasy.net>
Subject: fixing M$ character codes, redux
Date: Thu, 07 Oct 2004 18:23:23 -0400
Message-ID: <4165C1DB.4090706@speakeasy.net>

Because I often run into much the same set of goofy M$ (or whatever)
characters, I followed the earlier thread about programmatically
translating the bad characters into something useful.  Unfortunately,
as with one other poster here, the supplied script didn't work at all
for me.  I'm guessing that the screwy characters I was getting came
from somewhere else, maybe not from Windoze, or maybe it has something
to do with the input method or charset I'm set up for.

So I created this handful of commands:

(replace-string "\205" "..." nil nil nil)  ; 0x85: ellipsis in Windows-1252
(replace-string "\222" "'" nil nil nil)    ; 0x92: right single quote
(replace-string "\223" "``" nil nil nil)   ; 0x93: left double quote
(replace-string "\224" "''" nil nil nil)   ; 0x94: right double quote
(replace-string "\226" "-" nil nil nil)    ; 0x96: en dash
(replace-string "\227" "-- " nil nil nil)  ; 0x97: em dash
(replace-string "\240" " " nil nil nil)    ; 0xA0: no-break space

They all work just peachy.  I run each one separately by typing C-x C-[
C-[ (for me, the same as C-x ESC ESC), which prompts in the minibuffer
to rerun (redo) the last command.  I delete the default that "redo"
provides and paste in each of the above "replace-string ..." commands.
I developed and tried this on one file today, and it works great.
(Please note that the "characters" in the edited file appear just as
they do in the first arguments of the above commands, except that C-f
treats all four displayed characters, e.g. "\234", as a single
character, which in a sense they are.)

What I'd like to do is wrap all the above commands into one defun.  I 
tried using some other code:

(defun kef.de8 ()
  "Turn 8bit characters into 7bit equivalents."
  (interactive)
  (mapcar
   (function (lambda (old_and_new)
    (save-excursion (apply 'query-replace old_and_new))))
   '(
     ("\205" "...") ; might be a dash (-) (??)
     ("\222" "'")
     ("\223" "``" )
     ("\224" "''")
     ("\226" "-")
     ("\227" "-- ")
     ("\240" " ")    ;soft space
)))

But running this didn't work: the minibuffer told me it made no
replacements, whereas the "(replace-string ...)" commands above did
work.  I could write a little utility in C or some other language to
do this, but elisp still makes an idiot out of me.  Any help?
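
A minimal sketch of one way such a wrapper could be written, offered as
an illustration rather than as the script from the earlier thread.  The
names kef-de8-table and kef-de8-sketch are hypothetical.  It assumes the
file was decoded as Latin-1, so that the stray bytes sit in the buffer
as the characters with octal codes 205, 222, and so on; if the file was
visited some other way, the keys would need adjusting.  Unlike
query-replace, which starts at point and asks for confirmation, this
rewinds to the start of the buffer for every pair and replaces silently.

```elisp
;; Hypothetical names; an illustration only, not the earlier thread's script.
;; ASSUMPTION: the buffer was decoded as Latin-1, so each stray byte is
;; present as the character with the same code (octal 205 = U+0085, etc.).
(defvar kef-de8-table
  '((#o205 . "...")   ; ellipsis in Windows-1252
    (#o222 . "'")     ; right single quote
    (#o223 . "``")    ; left double quote
    (#o224 . "''")    ; right double quote
    (#o226 . "-")     ; en dash
    (#o227 . "-- ")   ; em dash
    (#o240 . " "))    ; no-break space
  "Alist mapping 8-bit character codes to 7-bit replacement strings.")

(defun kef-de8-sketch ()
  "Replace every character listed in `kef-de8-table' in the buffer."
  (interactive)
  (save-excursion
    (dolist (pair kef-de8-table)
      (let ((from (string (car pair))))   ; code -> one-character string
        (goto-char (point-min))           ; rewind for every pair
        (while (search-forward from nil t)
          ;; FIXEDCASE and LITERAL: insert the replacement verbatim.
          (replace-match (cdr pair) t t))))))
```

With the definitions evaluated, M-x kef-de8-sketch would run all seven
replacements over the current buffer (or its narrowed portion) and
leave point where it was.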

BTW, if you're using Linux, check out "man 7 iso_8859-1", "man 
charsets", and 'man -k character |grep "character set"' for more 
information on this kind of stuff.  Also see "man iconv" for a 
commandline utility for doing character conversions from the shell.
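
A similar lookup can be sketched from inside Emacs.  The helper below
(kef-cp1252-lookup, a hypothetical name) decodes a string of raw bytes
as Windows-1252 and returns the resulting code points.  It only shows
what the bytes would mean if the text had been written in that
encoding; it does not establish where a given file actually came from.

```elisp
;; Hypothetical helper, for illustration only.
;; ASSUMPTION: the bytes in question were produced as Windows-1252.
(defun kef-cp1252-lookup (byte-string)
  "Decode unibyte BYTE-STRING as Windows-1252; return a list of code points."
  ;; `append' with a final nil turns the decoded string into a list
  ;; of characters, which Emacs Lisp represents as integers.
  (append (decode-coding-string byte-string 'windows-1252) nil))

;; The seven bytes handled by the replace-string commands above:
(kef-cp1252-lookup "\205\222\223\224\226\227\240")
;; Code points: U+2026, U+2019, U+201C, U+201D, U+2013, U+2014, U+00A0
```

Under that assumption the bytes are the ellipsis, the curly quotes, the
en and em dashes, and the no-break space.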


-- 
See this movie before it's against the law: The Corporation
<http://www.TheCorporation.com/>
