From mboxrd@z Thu Jan 1 00:00:00 1970
From: Reiner Steib
Newsgroups: gmane.emacs.devel
Subject: Re: html2text
Date: Mon, 08 Nov 2004 16:51:34 +0100
References: <1099247139.071920.12084.nullmailer@Update.UU.SE>
In-Reply-To: (Jari Aalto's message of "Sat, 06 Nov 2004 17:47:32 +0200")
Original-To: jari.aalto@cante.net (Jari Aalto+mail.emacs)
Cc: Emacs development
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)

On Sat, Nov 06 2004, Jari Aalto+mail.emacs wrote:

> This is your copy. Article has been posted to the newsgroup(s).

I didn't see your message on emacs-devel, see .

> * Sun 2004-10-31 Alfred Szmidt gmane.emacs.devel
> * Message-Id: 1099247139.071920.12084.nullmailer AT Update.UU.SE
> | html2text is quite nice, but it doesn't strip all HTML files into
> | something that is readable. The following patch makes it strip some
> | "newer" tags that have croped up.
>
> There is more entities. This patch is against the Gnus CVS, but I
> assume it will work for Emacs as well. The entities are in
> alphabetical order.
>
> 2004-11-06 Sat Jari Aalto
>
>         * text2html (html2text-replace-list). Added more HTML 4.0
>         entities.

It seems you have signed papers for Emacs as you are listed in the
AUTHORS file. But I can't check it myself.
Could you please confirm?

[ The suggested patch from Jari's original message was: ]

--8<---------------cut here---------------start------------->8---
--- html2text.el.7.10	2004-11-06 17:20:46.000000000 +0200
+++ html2text.el	2004-11-06 17:41:12.000000000 +0200
@@ -42,8 +42,42 @@
 (defvar html2text-format-single-element-list '(("hr" . html2text-clean-hr)))
 (defvar html2text-replace-list
-  '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
-    ("&amp;" . "&") ("&apos;" . "'"))
+  '(("&acute;" . "`")
+    ("&amp;" . "&")
+    ("&apos;" . "'")
+    ("&brvbar;" . "|")
+    ("&cent;" . "c")
+    ("&circ;" . "^")
+    ("&copy;" . "(C)")
+    ("&curren;" . "¤")
+    ("&deg;" . "degree")
+    ("&divide;" . "/")
+    ("&euro;" . "e")
+    ("&frac12;" . "½")
+    ("&gt;" . ">")
+    ("&iquest;" . "?")
+    ("&laquo;" . "<<")
+    ("&ldquo" . "\"")
+    ("&lsaquo;" . "(")
+    ("&lsquo;" . "`")
+    ("&lt;" . "<")
+    ("&mdash;" . "--")
+    ("&nbsp;" . " ")
+    ("&ndash;" . "-")
+    ("&permil;" . "%%")
+    ("&plusmn;" . "+-")
+    ("&pound;" . "£")
+    ("&quot;" . "\"")
+    ("&raquo;" . ">>")
+    ("&rdquo" . "\"")
+    ("&reg;" . "(R)")
+    ("&rsaquo;" . ")")
+    ("&rsquo;" . "'")
+    ("&sect;" . "§")
+    ("&sup1;" . "^1")
+    ("&sup2;" . "^2")
+    ("&sup3;" . "^3")
+    ("&tilde;" . "~"))
   "The map of entity to text.
--8<---------------cut here---------------end--------------->8---

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/