From: Reiner Steib <4.uce.03.r.s@nurfuerspam.de>
Newsgroups: gmane.emacs.bugs
Subject: Re: Broken charset=utf-16be articles with Gnus and Emacs 21.3
Date: Mon, 31 Mar 2003 15:41:28 +0200
Organization: Dept. of Theoretical Physics, University of Ulm
Reply-To: reiner.steib@gmx.de
Cc: Kenichi Handa
Mail-Followup-To: bugs@gnus.org, bug-gnu-emacs@gnu.org, Mark Trettin, Simon Krahnke, Kenichi Handa
User-Agent: Gnus/5.090017 (Oort Gnus v0.17) Emacs/21.3 (gnu/linux)

On Mon, Mar 31 2003, Jesper Harder wrote:

> FWIW, Oort Gnus and current CVS Emacs works as expected.  It used to
> exhibit the same bug you're describing,

Yes, I remember seeing this on the Ding-List in some of Kai's
articles.  But Kai and I didn't recall whether the problem was in
Emacs CVS-HEAD or in Oort Gnus (and whether it was fixed or not).

> but it was fixed earlier this year -- probably by this change:
>
> 2003-01-03  Dave Love
>
>	* international/mule-cmds.el (sort-coding-systems):
>	Adjust priority of utf-16 and x-ctext.

Yes, that's it, thanks for pointing this out.  After applying the
following patch to lisp/international/mule-cmds.el from Emacs 21.3.1
and evaluating `sort-coding-systems', I get utf-8 as expected.

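One way to look at the resulting order directly is sketched here.  It is
only an illustration, not necessarily what was run for this report: it
assumes a session in which `sort-coding-systems' is defined, and the
coding system names are the Emacs 21 ones, so any name the running
Emacs does not know is skipped.  Evaluate it once before and once
after loading the patched definition shown below and compare the two
results.

```elisp
;; Sketch: ask `sort-coding-systems' how it ranks the UTF candidates.
;; The candidate names are assumptions taken from Emacs 21.3; names that
;; are not coding systems in the running Emacs are filtered out first.
(let ((candidates
       (delq nil
             (mapcar (lambda (cs)
                       (and (coding-system-p cs) cs))
                     '(mule-utf-16-be mule-utf-16-le mule-utf-8)))))
  ;; `mapcar' returns a fresh list, so it is safe to hand it to
  ;; `sort-coding-systems', which may sort its argument destructively.
  (sort-coding-systems candidates))
```

The first element of the returned list is the coding system that is
ranked highest among the candidates.
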
--8<---------------cut here---------------start------------->8---
--- mule-cmds.el	26 Dec 2002 17:27:20 -0000	1.216
+++ mule-cmds.el	3 Jan 2003 20:16:11 -0000	1.217
@@ -425,9 +425,18 @@
                        (let ((base (coding-system-base x)))
                          (+ (if (eq base most-preferred) 64 0)
                             (let ((mime (coding-system-get base 'mime-charset)))
+                              ;; Prefer coding systems corresponding to a
+                              ;; MIME charset.
                               (if mime
-                                  (if (string-match "^x-" (symbol-name mime))
-                                      16 32)
+                                  ;; Lower utf-16 priority so that we
+                                  ;; normally prefer utf-8 to it, and put
+                                  ;; x-ctext below that.
+                                  (cond ((or (eq base 'mule-utf-16-le)
+                                             (eq base 'mule-utf-16-be))
+                                         16)
+                                        ((string-match "^x-" (symbol-name mime))
+                                         8)
+                                        (t 32))
                                 0))
                             (if (memq base lang-preferred) 8 0)
                             (if (string-match "-with-esc$" (symbol-name base))
--8<---------------cut here---------------end--------------->8---

> Didn't it get applied to Emacs 21.3?

No.  If the EMACS_21_1_RC branch is still maintained, it should
probably be applied there as well.

On Mon, Mar 31 2003, Kenichi Handa wrote:

> Oops, I've just found that Emacs' coding systems utf-16-le and
> utf-16-be produce BOM (Byte Order Mark) which is a bug according to
> their specifications.  I've just installed a fix.

Does it make sense to apply it to EMACS_21_1_RC too?

>> Expected behavior:
>> - The article should be encoded with
>>   "Content-Type: text/plain; charset=utf-8".

> I don't know why GNUS prefers utf-16-X to utf-8.  At least,
> sort-coding-systems prefers utf-8.

Apparently not in Emacs 21.3, unless I misunderstood the
abovementioned patch to `sort-coding-systems'.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo--- PGP key available via WWW   http://rsteib.home.pages.de/