From: David De La Harpe Golden
Newsgroups: gmane.emacs.devel
Subject: Re: Rmail-mbox branch
Date: Mon, 08 Sep 2008 18:55:08 +0100
Message-ID: <48C566FC.1070808@harpegolden.net>
To: rms@gnu.org
Cc: "Stephen J. Turnbull", emacs-devel@gnu.org, monnier@IRO.UMontreal.CA, evilborisnet@netscape.net

Richard M. Stallman wrote:
> HTML is made of plain ASCII text.

(1) No it isn't, at least not necessarily.

http://www.w3.org/TR/REC-html40-971218/charset.html#encodings

While HTML numeric/entity refs _allow_ most all >7-bit chars to make it
through an ascii channel, IF someone uses them, AFAIK their use is _not_
mandated, you can e.g. just write your html in utf-8 and use ☺
characters directly. i.e. they're a facility a bit like those C
trigraphs (entity refs have a range of other uses in SGML/XML land).

(2) text/html can have a declared charset. It _may_ be US-ASCII, but it
isn't necessarily the case that it is.

http://www.ietf.org/rfc/rfc2854.txt

"Because of the availability within HTML itself for using character
entity references, documents that use a wide repertoire of characters
may still be represented using the US-ASCII charset and transported
without encoding. However, transport of text/html using a charset other
than US-ASCII may require base64 or quoted-printable encoding for 7-bit
channels."

When I wrote a mostly-ascii Unicode HTML mail with a capable mailer (not
that I make a habit of HTML mail, bleurgh), it got sent as:

Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

At some point, after pasting in a big bunch of unicode squiggles
(presumably some heuristic on proportion of non-7-bit-ascii chars or
encoded length), it decided to switch to:

Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: base64
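
To make the two points above concrete, here is a minimal Python sketch using only the standard library. It shows the same HTML fragment carried three ways: as US-ASCII with numeric character references, as UTF-8 with quoted-printable, and as UTF-8 with base64. The function names are made up for illustration, and pick_transfer_encoding is only a guess at the kind of "encoded length" heuristic a mailer might apply; it is not the actual logic of any particular mailer.

```python
import base64
import quopri


def to_ascii_html(text: str) -> bytes:
    """Encode as US-ASCII, turning each non-ASCII char into a numeric
    character reference such as &#9786; (the entity-ref route)."""
    return text.encode("ascii", errors="xmlcharrefreplace")


def to_quoted_printable(text: str) -> bytes:
    """Encode as UTF-8, then wrap in quoted-printable for a 7-bit channel."""
    return quopri.encodestring(text.encode("utf-8"))


def to_base64(text: str) -> bytes:
    """Encode as UTF-8, then wrap in base64 for a 7-bit channel."""
    return base64.encodebytes(text.encode("utf-8"))


def pick_transfer_encoding(text: str) -> str:
    """Hypothetical heuristic (an assumption, not a real mailer's rule):
    use whichever transfer encoding yields the shorter result."""
    qp_len = len(to_quoted_printable(text))
    b64_len = len(to_base64(text))
    return "quoted-printable" if qp_len <= b64_len else "base64"


if __name__ == "__main__":
    mostly_ascii = "<p>Hello \u263a</p>"
    squiggles = "\u263a" * 50

    print(to_ascii_html(mostly_ascii))
    print(to_quoted_printable(mostly_ascii))
    print(pick_transfer_encoding(mostly_ascii))
    print(pick_transfer_encoding(squiggles))
```

With mostly-ASCII input, quoted-printable stays close to the original size because only the few non-ASCII bytes are expanded to three characters each. Once most of the bytes are non-ASCII, base64's fixed 4-for-3 overhead is smaller, which would be consistent with the switch described above.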