From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Abrahamsen
Newsgroups: gmane.emacs.devel
Subject: Re: Gnus; Restore multi encoding support for NNTP
Date: Mon, 27 Dec 2021 19:17:32 -0800
Message-ID: <87r19x2zar.fsf@ericabrahamsen.net>
References: <87wnjqb62b.fsf@gnus.org> <87sfueb3y1.fsf@gnus.org>
Mime-Version: 1.0
Content-Type: text/plain
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cc: Lars Ingebrigtsen, Emacs Devel
To: LdBeth
In-Reply-To: (LdBeth's message of "Mon, 27 Dec 2021 21:58:41 +0800")
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:283462

LdBeth writes:

>>>>>> In <87sfueb3y1.fsf@gnus.org>
>>>>>> Lars Ingebrigtsen wrote:
> Lars> LdBeth writes:
>
> ldb> That would only solve the problem displaying the groups list from
> ldb> server-mode, after Gnus saves the decoded group names in
> ldb> ~/.newsrc.eld and reads in from a new session, it would not able
> ldb> to correctly figure out the original group name from the starup
> ldb> screen group-mode. That is why a mapping is needed (and it needs
> ldb> to be saved with the .newsrc.eld file).
>
> Lars> It knows the coding system to use for that group name, so it can
> Lars> use that when encoding the name, too, surely?
>
> Probably you mean using the coding system in
> `gnus-group-name-charset-group-alist'?
>
> That won't work in certain case, say, there's a group name on a server
> is "nntp+news.newsfan.net:\346\265\213\350\257\225" (in UTF-8)
> while the rest group names on that server are in GBK, so I set
>
> ```
> (setq gnus-group-name-charset-group-alist
>       '(("\346\265\213\350\257\225" . utf-8)
>         ("news\\.newsfan\\.net" . gbk)))
> ```
>
> And it's not able to use that to correct encode the names.

Trying to catch up here...

The moral intent of the changes in
cb12a84f2c519a48dd87453c925e3bc36d9944db was to move the site of group
name decoding from just-in-time conversion before display to the user,
to conversion over-the-wire when talking to the server. Meaning that
group name strings should be decoded as they arrive from the server, and
encoded before they're sent to the server. Locally (including in file
names for agent/cache files) they should always be utf-8-emacs.

`gnus-group-name-charset-group-alist' ought to be the right tool here,
but I'm not 100% sure that it is used correctly both for incoming and
outgoing group names. Also it's obviously got a bit of a
chicken-and-the-egg problem, in that you can't match the regexp
correctly unless you've already got the properly-decoded group name.

> What's worse is when there are two group names having different coding
> systems decoded to the same UTF-8 string for the same server, if gnus
> doesn't correctly record which one uses which, well... (This is the
> reason for using charset string property for that)

Gnus does not handle this situation correctly, and I can't imagine it
ever did. But probably it should. How does this per-group encoding
information arrive from the server?

Eric
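
For illustration, the decode-on-arrival / encode-before-sending round
trip discussed above could be sketched roughly like this in Emacs Lisp.
This is not how Gnus does it: every `my-' name is hypothetical, the
alist is just LdBeth's example, the GBK bytes for the second group name
are an assumption made up for the demo, and remembering the coding
system in a `charset' text property follows LdBeth's suggestion rather
than any existing Gnus behavior.

```elisp
;; Illustration only.  Every `my-' name is hypothetical, not a Gnus API.

(defvar my-group-charset-alist
  '(("\346\265\213\350\257\225" . utf-8)   ; raw UTF-8 bytes, from the example
    ("news\\.newsfan\\.net" . gbk))
  "Stand-in for `gnus-group-name-charset-group-alist'.")

(defun my-charset-for (name)
  "Return the coding system of the first alist regexp matching NAME.
Default to utf-8 when nothing matches."
  (or (assoc-default name my-group-charset-alist #'string-match-p)
      'utf-8))

(defun my-decode-group-name (raw)
  "Decode RAW (unibyte, as received from the server).
The coding system used is remembered in a `charset' text property."
  (let ((cs (my-charset-for raw)))
    (propertize (decode-coding-string raw cs) 'charset cs)))

(defun my-encode-group-name (name)
  "Encode NAME for the server using the coding system it arrived in.
Fall back to the alist when NAME carries no `charset' property."
  (encode-coding-string
   name
   (or (get-text-property 0 'charset name)
       (my-charset-for name))))

;; Two distinct names on the wire.  The first is from the example above;
;; the GBK bytes in the second are assumed for the sake of the demo.
(defvar my-raw-utf8 "nntp+news.newsfan.net:\346\265\213\350\257\225")
(defvar my-raw-gbk  "nntp+news.newsfan.net:\262\342\312\324")
```

Both raw names decode to the same string, so the alist alone cannot tell
them apart on the way back out. With the property attached,
`(my-encode-group-name (my-decode-group-name my-raw-gbk))' should give
back the original bytes. The sketch says nothing about keeping that
property alive once the name is written to .newsrc.eld, which is the
harder part of the problem.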