From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Stephen J. Turnbull"
Newsgroups: gmane.emacs.devel
Subject: Re: Rmail-mbox branch
Date: Mon, 08 Sep 2008 18:53:41 +0900
Message-ID: <87myij0xqy.fsf@xemacs.org>
To: Francesco Potorti`
Cc: evilborisnet@netscape.net, monnier@IRO.UMontreal.CA, rms@gnu.org, emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Francesco Potorti` writes:

 > >[3] For practical purposes we can assume that message bodies which are
 > >'text/plain; charset=XYZ' for XYZ=iso-8859-1, XYZ=iso-8859-15, and
 > >XYZ=utf-8 will generally "just work", too. But it's easy to imagine
 > >situations where a MIME-conforming MUA might render such incorrectly
 > >without a MIME Content-Type header to guide it, such as a
 > >Latin-2-using locale.
 >
 > One thing that I find useful is searching mbox files using standard Unix
 > text tools. This is not possible for base-64 coded messages. However,
 > I can edit an mbox file, convert its text from base-64 and edit the MIME
 > header appropriately, thus still obtaining an RFC-compliant message that
 > is searchable and that can be displayed without further decoding (it is
 > smaller, too). I do this routinely for the three charsets you mentioned
 > above, but it should work for any charset. I suggest that this
 > conversion be made automatically whenever possible and sensible.

This is always trivially possible in Emacs.[1] The key, as you
mention, is setting the Content-Transfer-Encoding field to '8bit' after
decoding. You may also want to recode from a legacy charset to UTF-8,
in which case you'll need to fix the Content-Type field's 'charset'
parameter, too.

However, whether it is "sensible" depends on the user's environment,
including what other MUAs and text-processing tools are available.
I would rather make this a folder-specific user setting, defaulting
to 'ask. You yourself could set it to 'always since you have
favorable experience with it. Or perhaps the values should be
something like 'never, 'ask, 'always-content-type (which keeps the
charset specified in the 'charset' parameter of the Content-Type
field), and 'always-utf8.

Footnotes:
[1] Of course you have to know how to use Mule, knowledge of which has
not yet fully diffused into the community. Once you've got that,
though, it's trivial. Borrow the BASE64 and QP decoders from Gnus, and
use decode-coding-region and encode-coding-region to fix up the MIME
charset.
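
For concreteness, here is a rough sketch of the conversion described above, written in Python with the standard `email` package rather than Emacs Lisp, so it is not what Rmail would actually do. The setting names come from the message; the function name `unarmor` and the `ask` callback are hypothetical. It assumes a single, non-multipart message and does not handle mbox framing or "From " line escaping in the decoded body.

```python
import base64
from email import message_from_bytes

# Setting names mirror the values proposed in the message; the function
# name and the `ask` callback are hypothetical, for illustration only.
POLICIES = ("never", "ask", "always-content-type", "always-utf8")


def unarmor(raw: bytes, policy: str = "ask", ask=lambda msg: "never") -> bytes:
    """Return `raw` with a base64/QP text body rewritten as 8bit, per `policy`."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    msg = message_from_bytes(raw)
    cte = str(msg.get("Content-Transfer-Encoding", "")).strip().lower()
    # Only single-part text bodies with a reversible transfer encoding qualify.
    if (msg.is_multipart() or msg.get_content_maintype() != "text"
            or cte not in ("base64", "quoted-printable")):
        return raw
    if policy == "ask":
        policy = ask(msg)  # expected to return one of the other three values
    if policy in ("never", "ask"):
        return raw

    body = msg.get_payload(decode=True)  # undo base64 / quoted-printable
    if policy == "always-utf8":
        charset = msg.get_content_charset("us-ascii")
        try:
            body = body.decode(charset).encode("utf-8")
        except (LookupError, UnicodeDecodeError):
            return raw  # unknown or mislabeled charset: leave the message alone
        msg.set_param("charset", "utf-8")  # fix the Content-Type parameter

    msg.replace_header("Content-Transfer-Encoding", "8bit")
    msg.set_payload(body)  # a bytes payload is written back out unchanged
    return msg.as_bytes()


# A made-up Latin-1 message whose body is base64-armored.
demo = (b"From: a@example.org\n"
        b"MIME-Version: 1.0\n"
        b"Content-Type: text/plain; charset=iso-8859-1\n"
        b"Content-Transfer-Encoding: base64\n\n"
        + base64.b64encode("café\n".encode("iso-8859-1")) + b"\n")
converted = unarmor(demo, policy="always-utf8")
```

After the call, `converted` carries the body as raw UTF-8 octets, so a plain `grep` can find the text. With 'always-content-type the body stays in its declared charset and only the transfer encoding changes. A real implementation would also need to respect the line-length limits that 8bit imposes.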