* fixing M$ character codes
@ 2004-07-03 16:00 nospam55
[not found] ` <Jym.wzfz884ct0.fsf@econet.org>
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: nospam55 @ 2004-07-03 16:00 UTC (permalink / raw)
Hi! Help me and mankind!
I often have to manually correct downloaded text originating from M$ software, i.e. I have
to do the eternal replacements
\222 into '
\213 into <
etc. etc.,
where the codes are taken from the well-known
(defvar gnus-article-dumbquotes-map
'(("\202" ",")
("\203" "f")
("\204" ",,")
("\205" "...")
("\213" "<")
("\214" "OE")
("\221" "`")
("\222" "'")
("\223" "``")
("\224" "\"")
("\225" "*")
("\226" "-")
("\227" "--")
("\231" "(TM)")
("\233" ">")
("\234" "oe")
("\264" "'"))
"Table for MS-to-Latin1 translation.")
Now, the big question, whose answer I didn't find on the internet:
is there a way to have Emacs fix this mess?
My dream is to have something like
M-x fix-evil-empire-nonsense RET
that automatically does the replacements above on the selected region for us
poor humans.
Any package? Any defun?
Thank you :)
PS: micro$oft-ware makes many people write
\222 for ' apostrophe
\202 for , comma
\213 for < "less than"
\233 for > "greater than"
\224 for " double quote
\225 for asterisk
etc. etc.
without realizing it :(
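A minimal sketch of what such a command could look like. It assumes the text was read so that the stray codes show up as the characters U+0080 to U+009F (as when a Windows-1252 file is decoded as Latin-1). The table name `my-dumbquotes-alist` is hypothetical and copies only some rows of the map quoted above; the command name is the one wished for in the post.

```elisp
;; Hypothetical table: some rows of the map quoted above, keyed by
;; character code instead of by one-character string.
(defvar my-dumbquotes-alist
  '((#x82 . ",")      ; \202
    (#x85 . "...")    ; \205
    (#x8b . "<")      ; \213
    (#x91 . "`")      ; \221
    (#x92 . "'")      ; \222
    (#x93 . "``")     ; \223
    (#x94 . "\"")     ; \224
    (#x95 . "*")      ; \225
    (#x96 . "-")      ; \226
    (#x97 . "--")     ; \227
    (#x9b . ">"))     ; \233
  "Alist mapping stray 8-bit character codes to ASCII replacements.")

(defun fix-evil-empire-nonsense (beg end)
  "Replace the characters listed in `my-dumbquotes-alist' between BEG and END.
Runs without asking; interactively it acts on the region."
  (interactive "r")
  (save-excursion
    (let ((limit (copy-marker end)))    ; a marker follows size changes
      (goto-char beg)
      (while (< (point) limit)
        (let ((new (cdr (assq (char-after) my-dumbquotes-alist))))
          (if new
              (progn (delete-char 1) (insert new))
            (forward-char 1))))
      (set-marker limit nil))))
```

It walks the region once, character by character, so a replacement text is never rescanned. If the buffer holds raw undecoded bytes instead, the character codes will differ and the table would need adjusting.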
^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <Jym.wzfz884ct0.fsf@econet.org>]
* Re: fixing M$ character codes
2004-07-03 16:00 fixing M$ character codes nospam55
[not found] ` <Jym.wzfz884ct0.fsf@econet.org>
@ 2004-07-04 14:08 ` Jym Dyer
2004-07-05 10:00 ` Haines Brown
2004-07-04 21:06 ` Jesper Harder
2004-07-05 14:22 ` nospam55
3 siblings, 1 reply; 10+ messages in thread
From: Jym Dyer @ 2004-07-04 14:08 UTC (permalink / raw)
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1943 bytes --]
=v= I think ideally the code would parse headers to figure out
whether the brain damaged quotes are supposed to be ISO-Latin,
Windows-1252, UTF-8, or whatever. But for now I just use a
sledgehammer and convert any and all needlessly-8bit characters
to their 7bit equivalents.
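For the "ideal" route, where the charset is actually known, Emacs can decode the bytes instead of replacing them. A small sketch, assuming the raw bytes are at hand as a unibyte string and that the sender really used Windows-1252; `my-decode-cp1252` is a made-up helper name.

```elisp
(defun my-decode-cp1252 (bytes)
  "Decode the unibyte string BYTES as Windows-1252 text."
  (decode-coding-string bytes 'windows-1252))

;; 0x93 and 0x94 are the curly double quotes in Windows-1252.
(my-decode-cp1252 (unibyte-string #x93 ?h ?i #x94))
```

The result should be "hi" wrapped in U+201C and U+201D, i.e. real curly quotes rather than 7bit stand-ins. Whether that is preferable to the sledgehammer depends on what the text is fed into afterwards.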
=v= The code I use is below. I suppose someday I ought to make
it more comprehensive, but for now I just add what I need
along the way. (Warning: this converts all known quotes and
dashes to ASCII equivalents, but also converts centered dots to
asterisks, which isn't exactly an equivalent.)
<_Jym_>
(defun jym.de8 ()
"Turn 8bit characters into 7bit equivalents."
(interactive)
(mapcar
(function (lambda (old_and_new)
(save-excursion (apply 'query-replace old_and_new))))
'(("" "-")
("¹" "'")
("²" "''")
("³" "``")
("·" "*")
("
" "...")
("" "--")
("" "`")
("" "`")
("" "``") ; = 0x93
("" "''") ; = 0x94
("" "*")
("" "-") ; = 0x96
("" "--") ; = 0x97
("" "`")
("" "'")
("" "``")
("" "''")
("â" "") )))
;mapcar;
;defun jym.de8;
(defun jym.de8qp ()
  "Turn quoted printable 8bit into 7bit equivalents."
  (interactive)
  ;; mapc, not mapcar: only the side effects matter here.
  (mapc
   (function (lambda (old_and_new)
               ;; save-excursion: every pair starts again from the same point.
               (save-excursion (apply 'query-replace old_and_new))))
   '(("=\n" "")                         ; soft line break
     ;("=E2=80=94" "--")
     ("=E2=80=99" "'")                  ; UTF-8
     ("=E2=80=9C" "``")                 ; UTF-8
     ("=E2=80=9D" "''")                 ; UTF-8
     ("=0D\n" "\n")                     ; = \r\n
     ("=20\n" "\n")
     ("=2E" ".")
     ("=3F" "?")
     ("=46" "F")
     ("=5B" "[")
     ("=5D" "]")
     ("=8B" "--")
     ("=8C" "`")
     ("=91" "`")
     ("=92" "'")
     ("=93" "``")                       ; = 0223
     ("=94" "''")
     ("=96" "-")                        ; = 0226
     ("=97" "--")                       ; = 0227
     ("=A0" " ")
     ("=A5" "'")
     ("=AD" "--")
     ("=AE" "\"")
     ("=B2" "``")
     ("=B3" "''")
     ("=B9" "'")) ))
;mapc;
;defun jym.de8qp;
* Re: fixing M$ character codes
2004-07-04 14:08 ` Jym Dyer
@ 2004-07-05 10:00 ` Haines Brown
2004-07-05 10:19 ` Thomas Gehrlein
2004-07-07 15:12 ` Jym Dyer
0 siblings, 2 replies; 10+ messages in thread
From: Haines Brown @ 2004-07-05 10:00 UTC (permalink / raw)
Jym Dyer <jym@econet.org> writes:
> =v= I think ideally the code would parse headers to figure out
> whether the brain damaged quotes are supposed to be ISO-Latin,
> Windows-1252, UTF-8, or whatever. But for now I just use a
> sledgehammer and convert any and all needlessly-8bit characters
> to their 7bit equivalents.
>
> =v= The code I use is below. I suppose someday I ought to make
> it more comprehensive, but for now I just add what I need
> along the way. (Warning: this converts all known quotes and
> dashes to ASCII equivalents, but also converts centered dots to
> asterisks, which isn't exactly an equivalent.)
> <_Jym_>
>
>
> (defun jym.de8 ()
> "Turn 8bit characters into 7bit equivalents."
> (interactive)
> (mapcar
> (function (lambda (old_and_new)
> (save-excursion (apply 'query-replace old_and_new))))
> '(("" "-")
> ("¹" "'")
> ...
Jym,
As one who often has to process documents with 8-bit characters, your
lisp code was certainly welcome. But it does not seem to do anything.
I'm running emacs 21.2.1 and pasted the code you supplied into
~/.emacs, and reloaded emacs. If I open a test file that is filled
with these 8-bit characters, it is displayed in emacs without any
change.
What am I doing wrong?
--
Haines Brown
* Re: fixing M$ character codes
2004-07-05 10:00 ` Haines Brown
@ 2004-07-05 10:19 ` Thomas Gehrlein
2004-07-07 15:12 ` Jym Dyer
1 sibling, 0 replies; 10+ messages in thread
From: Thomas Gehrlein @ 2004-07-05 10:19 UTC (permalink / raw)
Haines Brown <brownh@teufel.hartford-hwp.com> writes:
> As one who often has to process documents with 8-bit characters, your
> lisp code was certainly welcome. But it does not seem to do anything.
>
> I'm running emacs 21.2.1 and pasted the code you supplied into
> ~/.emacs, and reloaded emacs. If I open a test file that is filled
> with these 8-bit characters, it is displayed in emacs without any
> change.
>
> What am I doing wrong?
Did you try calling the function interactively?
M-x jym.de8 RET
Thomas
* Re: fixing M$ character codes
2004-07-05 10:00 ` Haines Brown
2004-07-05 10:19 ` Thomas Gehrlein
@ 2004-07-07 15:12 ` Jym Dyer
2004-07-07 20:38 ` Haines Brown
1 sibling, 1 reply; 10+ messages in thread
From: Jym Dyer @ 2004-07-07 15:12 UTC (permalink / raw)
> As one who often has to process documents with 8-bit
> characters, your lisp code was certainly welcome.
> But it does not seem to do anything.
=v= I type "Meta-X jym.de8" and it goes through a bunch of
query-replaces. (Or "Meta-X jym.de8qp" for quoted-printable
buffers.) Like I said, it's pretty much a sledgehammer and
someday I'll clean it up. But it does the job.
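For readers unsure what a quoted-printable buffer is: quoted-printable is the MIME transfer encoding (RFC 2045) in which a byte outside plain printable ASCII is written as = followed by two hex digits, and a = at the end of a line marks a soft line break. A hedged sketch using the qp library that ships with Emacs; the sample string is made up, and the byte decoding and the charset decoding are kept as two separate steps.

```elisp
(require 'qp)

;; "=E2=80=99" is the UTF-8 byte sequence for the right single quote,
;; and "=" at the end of a line is a soft line break that gets removed.
(decode-coding-string
 (quoted-printable-decode-string "It=E2=80=99s one lo=\nng line")
 'utf-8)
```

This should come back as one line reading "It" + U+2019 + "s one long line". It is the decode-properly alternative to replacing each =XX sequence by hand, and it only helps when the charset (UTF-8 here, by assumption) is known.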
<_Jym_>
* Re: fixing M$ character codes
2004-07-07 15:12 ` Jym Dyer
@ 2004-07-07 20:38 ` Haines Brown
2004-07-15 16:59 ` Jym Dyer
0 siblings, 1 reply; 10+ messages in thread
From: Haines Brown @ 2004-07-07 20:38 UTC (permalink / raw)
Jym Dyer <jym@econet.org> writes:
> > As one who often has to process documents with 8-bit
> > characters, your lisp code was certainly welcome.
> > But it does not seem to do anything.
>
> =v= I type "Meta-X jym.de8" and it goes through a bunch of
> query-replaces. (Or "Meta-X jym.de8qp" for quoted-printable
> buffers.) Like I said, it's pretty much a sledgehammer and
> someday I'll clean it up. But it does the job.
> <_Jym_>
I think I've got myself straightened out, thanks.
One thing that confused me was that I was not sure how to define
"quoted printable."
I'd also like to avoid being queried for each kind of replacement, as
happens with the ! command.
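One way to skip the questions entirely is to swap `query-replace` for a plain `search-forward` / `replace-match` loop. A rough sketch, not Jym's code: `my-replace-pairs-noquery` is a hypothetical helper that takes the same kind of (OLD NEW) list as jym.de8qp and works on the whole accessible buffer.

```elisp
(defun my-replace-pairs-noquery (pairs)
  "Replace every OLD with NEW for each (OLD NEW) in PAIRS, without asking.
Works on the whole accessible buffer and returns the number of replacements."
  (let ((count 0)
        (case-fold-search nil))         ; match OLD exactly as written
    (save-excursion
      (dolist (pair pairs)
        ;; An empty OLD would match forever without moving point, so skip it.
        (unless (string= (car pair) "")
          (goto-char (point-min))
          (while (search-forward (car pair) nil t)
            ;; FIXEDCASE and LITERAL are both t: insert NEW verbatim.
            (replace-match (cadr pair) t t)
            (setq count (1+ count))))))
    count))
```

For example, `(my-replace-pairs-noquery '(("=92" "'") ("=93" "``") ("=94" "''")))` rewrites those three sequences in one go. Narrow the buffer first if only part of it should be touched.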
--
Haines Brown
* Re: fixing M$ character codes
2004-07-03 16:00 fixing M$ character codes nospam55
[not found] ` <Jym.wzfz884ct0.fsf@econet.org>
2004-07-04 14:08 ` Jym Dyer
@ 2004-07-04 21:06 ` Jesper Harder
2004-07-05 14:22 ` nospam55
3 siblings, 0 replies; 10+ messages in thread
From: Jesper Harder @ 2004-07-04 21:06 UTC (permalink / raw)
nospam55 <nospa@no.yahoo.no> writes:
> I often have to manually correct downloaded text originating from M$ software,
> i.e. I have to do the eternal replacements, where the codes
> are taken from the well-known
>
> (defvar gnus-article-dumbquotes-map
>
> is there a way to have Emacs fix this mess?
>
> My dream is to have something like
>
> M-x fix-evil-empire-nonsense RET
Sure, the command is called `M-x article-treat-dumbquotes'. It works
in all buffers, not just the Gnus article buffer which it is intended
for.
You'll probably need to autoload it if you're using an infidel Usenet
client not approved by Gnus Towers and the Church of Emacs.
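The autoload could look roughly like this in ~/.emacs. It is a sketch that assumes the function lives in gnus-art.el, which is where Gnus keeps its article commands in the Emacs versions I know of; the docstring text is my own wording.

```elisp
;; Make the command available without loading Gnus up front.
;; Assumption: `article-treat-dumbquotes' is defined in gnus-art.el.
(autoload 'article-treat-dumbquotes "gnus-art"
  "Translate dumb quotes using `gnus-article-dumbquotes-map'." t)
```

The final t marks it as an interactive command, so it shows up under M-x before gnus-art has been loaded.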
--
Jesper Harder <http://purl.org/harder/>
* Re: fixing M$ character codes
2004-07-03 16:00 fixing M$ character codes nospam55
` (2 preceding siblings ...)
2004-07-04 21:06 ` Jesper Harder
@ 2004-07-05 14:22 ` nospam55
3 siblings, 0 replies; 10+ messages in thread
From: nospam55 @ 2004-07-05 14:22 UTC (permalink / raw)
Thomas Gehrlein <thomas.gehrlein@t-online.de> wrote :
> Did you try calling the function interactively?
>
> M-x jym.de8 RET
Yes, I think this is the correct use of Jym's funcs,
after selecting the text to convert as the region.
I think the problem is that
some funny 8-bit chars in the jym.de8 func body got lost on the
journey from Jym's HD to our HDs :(
For example, I got the piece
...
("" "`")
("" "``") ; = 0x93
("" "''") ; = 0x94
("" "*")
("" "-") ; = 0x96
("" "--") ; = 0x97
("" "`")
("" "'")
...
with those strange empty "" strings.
Could you quote those chars for us, Jym? For example,
in octal, writing \044 for the dollar sign, etc.?
We can then type them all back in on our PCs.
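To make a table like that survive the trip through mail, the 8-bit characters can be turned into octal escapes before posting. A small sketch; `my-octal-escape` is a hypothetical helper, and it only touches character codes 128 to 255.

```elisp
(defun my-octal-escape (text)
  "Return TEXT with every character coded 128 to 255 written as \\ooo."
  (mapconcat (lambda (ch)
               (if (and (>= ch 128) (<= ch 255))
                   (format "\\%03o" ch) ; e.g. code 147 becomes \223
                 (string ch)))          ; everything else is kept as is
             text ""))
```

Running it over the string part of each table entry gives plain ASCII such as \223 that can be retyped or pasted back into a defvar on the receiving side.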
nospam55 :)