* Gnus; Restore multi encoding support for NNTP
@ 2021-12-27 9:42 LdBeth
  2021-12-27 12:11 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 25+ messages in thread
From: LdBeth @ 2021-12-27 9:42 UTC
  To: Emacs Devel

[-- Attachment #1: Type: text/plain, Size: 1953 bytes --]

I have reported this problem as bug #52792:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=52792

There used to be special handling to decode NNTP group names in
different coding systems. Starting with Emacs 27, this was removed in
favor of working with UTF-8 internally. That works fine with email or
RSS, but not with NNTP servers, which tend to retain their old
settings; as a result, some group names cannot be displayed correctly
in the Group buffer.

However, since this bug only affects people who use Gnus with NNTP
servers that still use non-UTF-8 charsets, I guess it is better that I
get my hands on it myself.

The basic plan is to restore the option to decode group names based on
`gnus-group-name-charset-group-alist'. This custom variable exists
because a server can use several incompatible charsets, especially
when group names are in different languages. It seems this variable is
no longer used anywhere except for decoding the group name displayed
in the article buffer.

However, that would only resolve the display issue. Other changes are
needed to restore the decoded names to their original coding, so that
requests to the NNTP server can be made properly.

The old Gnus code cached the original coding via the now-deleted
`gnus-agent-decoded-group-names' variable; the original string was
passed everywhere and converted to the user's coding system for
display. Going with the original approach means reverting part of
commit cb12a84f2c519a48dd87453c925e3bc36d9944db for the NNTP-related
functions.

A possible new approach is to save the original coding system via the
`charset' string property and use UTF-8 internally. I'm not yet sure
whether that property could be lost during Gnus' internal processing.

The middle way is to keep the mapping in an alist or hash table, and
convert back to the original encoding when communicating with the
server.

Thoughts, comments, and related information are appreciated.

--
LDB

[-- Attachment #2: OpenPGP Digital Signature --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-27 9:42 Gnus; Restore multi encoding support for NNTP LdBeth
@ 2021-12-27 12:11 ` Lars Ingebrigtsen
  2021-12-27 12:41 ` LdBeth
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-27 12:11 UTC
  To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

> The basic plan is to restore the option to decode group names based on
> `gnus-group-name-charset-group-alist'. The reason having this custom
> variable is because a server could use different incompatible charset
> especially when group names are in different languages. It seems this
> variable is not been used in else where except for decode group name
> been displayed in article buffer.

[...]

> The middle way is keep the mapping relation in an alist or hashtable,
> and convert back to the original encoding for communicate the server.

Can't we just use `gnus-group-name-charset-group-alist' directly and
decode based on that when reading in the groups list from the backend?

(It's been some years since I looked at that code; adding Eric to the
CCs.)

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-27 12:11 ` Lars Ingebrigtsen
@ 2021-12-27 12:41 ` LdBeth
  2021-12-27 12:57 ` Lars Ingebrigtsen
  0 siblings, 1 reply; 25+ messages in thread
From: LdBeth @ 2021-12-27 12:41 UTC
  To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

>>>>> In <87wnjqb62b.fsf@gnus.org>
>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:

Lars> Can't we just use `gnus-group-name-charset-group-alist' directly and
Lars> decode based on that when reading in the groups list from the backend?

Lars> (It's been some years since I looked at that code; adding Eric to the
Lars> CCs.)

That would only solve the problem of displaying the group list in
server mode. After Gnus saves the decoded group names in ~/.newsrc.eld
and reads them back in a new session, it would not be able to figure
out the original group name from the startup screen in group mode.
That is why a mapping is needed (and it needs to be saved with the
.newsrc.eld file).

--
LDB
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-27 12:41 ` LdBeth
@ 2021-12-27 12:57 ` Lars Ingebrigtsen
  2021-12-27 13:58 ` LdBeth
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-27 12:57 UTC
  To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

> That would only solve the problem displaying the groups list from
> server-mode, after Gnus saves the decoded group names in ~/.newsrc.eld
> and reads in from a new session, it would not able to correctly figure
> out the original group name from the starup screen group-mode. That is
> why a mapping is needed (and it needs to be saved with the
> .newsrc.eld file).

It knows the coding system to use for that group name, so it can use
that when encoding the name, too, surely?

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-27 12:57 ` Lars Ingebrigtsen
@ 2021-12-27 13:58 ` LdBeth
  2021-12-28 3:17 ` Eric Abrahamsen
  ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: LdBeth @ 2021-12-27 13:58 UTC
  To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

>>>>> In <87sfueb3y1.fsf@gnus.org>
>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:
Lars> LdBeth <andpuke@foxmail.com> writes:

ldb> That would only solve the problem displaying the groups list from
ldb> server-mode, after Gnus saves the decoded group names in ~/.newsrc.eld
ldb> and reads in from a new session, it would not able to correctly figure
ldb> out the original group name from the starup screen group-mode. That is
ldb> why a mapping is needed (and it needs to be saved with the
ldb> .newsrc.eld file).

Lars> It knows the coding system to use for that group name, so it can use
Lars> that when encoding the name, too, surely?

You probably mean using the coding system from
`gnus-group-name-charset-group-alist'?

That won't work in certain cases. Say one group name on a server is
"nntp+news.newsfan.net:\346\265\213\350\257\225" (in UTF-8)
while the rest of the group names on that server are in GBK, so I set

```
(setq gnus-group-name-charset-group-alist
      '(("\346\265\213\350\257\225" . utf-8)
        ("news\\.newsfan\\.net" . gbk)))
```

Gnus is then not able to use that to encode the names correctly.

What's worse is when two group names on the same server, in different
coding systems, decode to the same UTF-8 string. If Gnus doesn't
correctly record which one uses which, well... (This is the reason for
using the `charset' string property.)

--
LDB
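To make the collision described above concrete, here is a minimal
sketch of two different octet strings that decode to the same group
name. The function name `my/same-decoded-name-demo' is hypothetical
and exists only for illustration; the octets are the ones quoted in
this thread, and nothing here is Gnus code.

```elisp
(defun my/same-decoded-name-demo ()
  "Compare one name as UTF-8 octets and as GBK octets (illustration only)."
  (let* ((utf8-octets "\346\265\213\350\257\225") ; name encoded in UTF-8
         (gbk-octets "\262\342\312\324")          ; same name encoded in GBK
         (from-utf8 (decode-coding-string utf8-octets 'utf-8))
         (from-gbk (decode-coding-string gbk-octets 'gbk)))
    (list
     ;; The on-the-wire octet strings differ...
     (string= utf8-octets gbk-octets)
     ;; ...but the decoded names compare equal (`string=' ignores
     ;; text properties),
     (string= from-utf8 from-gbk)
     ;; so re-encoding only round-trips if the right coding system
     ;; is remembered for each name.
     (string= (encode-coding-string from-gbk 'gbk) gbk-octets)
     (string= (encode-coding-string from-utf8 'utf-8) utf8-octets))))
```

Once decoded, the two names can only be told apart by something stored
alongside the string, which is the motivation for either a text
property or a separate mapping.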
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-27 13:58 ` LdBeth
@ 2021-12-28 3:17 ` Eric Abrahamsen
  2021-12-28 14:31 ` Lars Ingebrigtsen
  2021-12-28 14:29 ` Lars Ingebrigtsen
  2021-12-30 10:23 ` [PATCH] " LdBeth
  2 siblings, 1 reply; 25+ messages in thread
From: Eric Abrahamsen @ 2021-12-28 3:17 UTC
  To: LdBeth; +Cc: Lars Ingebrigtsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

>>>>>> In <87sfueb3y1.fsf@gnus.org>
>>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:
> Lars> LdBeth <andpuke@foxmail.com> writes:
>
> ldb> That would only solve the problem displaying the groups list from
> ldb> server-mode, after Gnus saves the decoded group names in
> ldb> ~/.newsrc.eld
> ldb> and reads in from a new session, it would not able to correctly
> ldb> figure
> ldb> out the original group name from the starup screen group-mode.
> ldb> That is
> ldb> why a mapping is needed (and it needs to be saved with the
> ldb> .newsrc.eld file).
>
> Lars> It knows the coding system to use for that group name, so it can
> Lars> use
> Lars> that when encoding the name, too, surely?
>
> Probably you mean using the coding system in
> `gnus-group-name-charset-group-alist'?
>
> That won't work in certain case, say, there's a group name on a server
> is
> "nntp+news.newsfan.net:\346\265\213\350\257\225" (in UTF-8)
> while the rest group names on that server are in GBK, so I set
>
> ```
> (setq gnus-group-name-charset-group-alist
> '(("\346\265\213\350\257\225" . utf-8)
> ("news\\.newsfan\\.net" . gbk)))
> ```
>
> And it's not able to use that to correct encode the names.

Trying to catch up here...

The moral intent of the changes in
cb12a84f2c519a48dd87453c925e3bc36d9944db was to move the site of group
name decoding from just-in-time conversion before display to the user,
to conversion over-the-wire when talking to the server. Meaning that
group name strings should be decoded as they arrive from the server,
and encoded before they're sent to the server. Locally (including in
file names for agent/cache files) they should always be utf-8-emacs.

`gnus-group-name-charset-group-alist' ought to be the right tool here,
but I'm not 100% sure that it is used correctly both for incoming and
outgoing group names. Also it's obviously got a bit of a
chicken-and-the-egg problem, in that you can't match the regexp
correctly unless you've already got the properly-decoded group name.

> What's worse is when there are two group names having different coding
> systems decoded to the same UTF-8 string for the same server, if gnus
> doesn't correctly record which one uses which, well... (This is the
> reason for using charset string property for that)

Gnus does not handle this situation correctly, and I can't imagine it
ever did. But probably it should. How does this per-group encoding
information arrive from the server?

Eric
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-28 3:17 ` Eric Abrahamsen
@ 2021-12-28 14:31 ` Lars Ingebrigtsen
  2021-12-28 15:40 ` LdBeth
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-28 14:31 UTC
  To: Eric Abrahamsen; +Cc: Emacs Devel

Eric Abrahamsen <eric@ericabrahamsen.net> writes:

>> What's worse is when there are two group names having different coding
>> systems decoded to the same UTF-8 string for the same server, if gnus
>> doesn't correctly record which one uses which, well... (This is the
>> reason for using charset string property for that)
>
> Gnus does not handle this situation correctly, and I can't imagine it
> ever did.

I think it did, since we just kept the unencoded data in the newsrc
alist, and then decoded before displaying. So you could have any number
of groups with names that decoded to the same displayed string.

> But probably it should. How does this per-group encoding
> information arrive from the server?

It doesn't -- you just have to know (and put that knowledge in
`gnus-group-name-charset-group-alist').

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-28 14:31 ` Lars Ingebrigtsen
@ 2021-12-28 15:40 ` LdBeth
  0 siblings, 0 replies; 25+ messages in thread
From: LdBeth @ 2021-12-28 15:40 UTC
  To: Lars Ingebrigtsen, Eric Abrahamsen; +Cc: Emacs Devel

Eric> The moral intent of the changes in
Eric> cb12a84f2c519a48dd87453c925e3bc36d9944db was to move the site of
Eric> group name decoding from just-in-time conversion before display
Eric> to the user, to conversion over-the-wire when talking to the
Eric> server. Meaning that group name strings should be decoded as
Eric> they arrive from the server, and encoded before they're sent to
Eric> the server. Locally (including in file names for agent/cache
Eric> files) they should always be utf-8-emacs.

Thanks for the explanation. Then I won't go in the direction of adding
back the just-in-time conversion.

Eric> Gnus does not handle this situation correctly, and I can't imagine it
Eric> ever did.

Lars> I think it did, since we just kept the unencoded data in the newsrc
Lars> alist, and then decoded before displaying. So you could have any number
Lars> of groups with names that decoded to the same displayed string.

Gnus used to store the raw byte strings, so yes, that was feasible.

Eric> How does this per-group encoding information arrive from the
Eric> server?

We never get encoding information directly from the server. Sometimes
the group names contain the name of the coding system in ASCII. Most
of the time we just know that most NNTP servers hosted in the same
country or region use the same coding system, so we can match by
top-level domain name; that's why
`gnus-group-name-charset-group-alist' is used. The last resort is
guessing: there's a `find-coding-systems-string' function from mule.el
that works very well; other news clients might use their own guessing
utilities.

--
LDB
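As a small aside on guessing from raw octets, here is a sketch that
asks Emacs which coding systems an undecoded string might be in, using
`detect-coding-string' (the function the first patch in this thread
also calls). The helper name `my/guess-coding-systems' is hypothetical.
The answer depends on the user's coding-system priorities and short
names are often ambiguous, so no particular result is assumed here.

```elisp
(defun my/guess-coding-systems (octets)
  "Return the coding systems Emacs considers plausible for OCTETS.
OCTETS should be a unibyte string as received from the server.
The returned list is ordered by the current coding-system priority."
  (detect-coding-string octets))

;; Example call with the GBK octets used elsewhere in this thread:
;; (my/guess-coding-systems "\262\342\312\324")
```

A guess like this can only rank candidates; it cannot replace knowing
the server's convention, which is what the alist encodes.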
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-27 13:58 ` LdBeth
  2021-12-28 3:17 ` Eric Abrahamsen
@ 2021-12-28 14:29 ` Lars Ingebrigtsen
  2021-12-28 15:43 ` LdBeth
  2021-12-30 10:23 ` [PATCH] " LdBeth
  2 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-28 14:29 UTC
  To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

> That won't work in certain case, say, there's a group name on a server is
> "nntp+news.newsfan.net:\346\265\213\350\257\225" (in UTF-8)
> while the rest group names on that server are in GBK, so I set
>
> ```
> (setq gnus-group-name-charset-group-alist
> '(("\346\265\213\350\257\225" . utf-8)
> ("news\\.newsfan\\.net" . gbk)))
> ```
>
> And it's not able to use that to correct encode the names.

I think it should be? (But the mechanism would have to do the matching
in some coding system (or as octets) in both cases.)

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
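A minimal sketch of the "match as octets" idea, assuming the raw
(still undecoded) group name is available at lookup time. The variable
`my/charset-alist' and the helper `my/coding-system-for-raw-group' are
hypothetical stand-ins, not Gnus code; the alist is the one quoted
above, and the second group name is made up for the example.

```elisp
(defvar my/charset-alist
  '(("\346\265\213\350\257\225" . utf-8)
    ("news\\.newsfan\\.net" . gbk))
  "Hypothetical stand-in for `gnus-group-name-charset-group-alist'.")

(defun my/coding-system-for-raw-group (raw-name)
  "Return the coding system of the first entry matching RAW-NAME.
RAW-NAME is the unibyte, undecoded group name, so each regexp is
compared against it octet by octet.  Return nil if nothing matches."
  ;; `assoc-default' calls the test with (REGEXP RAW-NAME).
  (assoc-default raw-name my/charset-alist #'string-match-p))

(defun my/coding-system-lookup-demo ()
  "Look up one UTF-8 named group and one GBK named group."
  (list
   (my/coding-system-for-raw-group
    "nntp+news.newsfan.net:\346\265\213\350\257\225")
   (my/coding-system-for-raw-group
    "nntp+news.newsfan.net:\262\342\312\324")))
```

Entry order matters in this sketch: the specific UTF-8 pattern has to
come before the server-wide GBK pattern, because the first match wins.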
* Re: Gnus; Restore multi encoding support for NNTP
  2021-12-28 14:29 ` Lars Ingebrigtsen
@ 2021-12-28 15:43 ` LdBeth
  0 siblings, 0 replies; 25+ messages in thread
From: LdBeth @ 2021-12-28 15:43 UTC
  To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

>>>>> In <87mtkkajlk.fsf@gnus.org>
>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:
Lars> LdBeth <andpuke@foxmail.com> writes:

>> That won't work in certain case, say, there's a group name on a server is
>> "nntp+news.newsfan.net:\346\265\213\350\257\225" (in UTF-8)
>> while the rest group names on that server are in GBK, so I set
>>
>> ```
>> (setq gnus-group-name-charset-group-alist
>> '(("\346\265\213\350\257\225" . utf-8)
>> ("news\\.newsfan\\.net" . gbk)))
>> ```
>>
>> And it's not able to use that to correct encode the names.

Lars> I think it should be? (But the mechanism would have to do the matching
Lars> in some coding system (or as octets) in both cases.)

Ah, yes, I get what you mean. It is possible to decode
"\346\265\213\350\257\225" first and match the stored group name in
UTF-8, but that still won't solve the "many group names decode to one
UTF-8 name" problem.

--
LDB
* [PATCH] Gnus; Restore multi encoding support for NNTP
  2021-12-27 13:58 ` LdBeth
  2021-12-28 3:17 ` Eric Abrahamsen
  2021-12-28 14:29 ` Lars Ingebrigtsen
@ 2021-12-30 10:23 ` LdBeth
  2021-12-30 14:49 ` Lars Ingebrigtsen
  2 siblings, 1 reply; 25+ messages in thread
From: LdBeth @ 2021-12-30 10:23 UTC
  To: Emacs Devel; +Cc: Eric Abrahamsen, Lars Ingebrigtsen

[-- Attachment #1: Type: text/plain, Size: 514 bytes --]

Hi,

This patch is intended to fix bug #52792.

It uses `gnus-group-name-charset' to decode group names when fetching
from **NNTP servers only**; `gnus-group-real-name' is modified to
encode group names based on their attached `charset' text property.

The expectation is that when Gnus saves group names to ~/.newsrc.eld,
the text properties are saved correctly.

If we can be sure that only NNTP needs this special encoding, I may
move the coding conversion to nntp.el instead, and fix
`gnus-group-name-charset'.

--
LDB

[-- Attachment #2: gnus.patch --]
[-- Type: text/plain, Size: 2109 bytes --]

diff --git a/gnus-group.el b/gnus-group.el
index 2ec001f..c990158 100644
--- a/gnus-group.el
+++ b/gnus-group.el
@@ -1197,6 +1197,9 @@ The following commands are available:
 ;; FIXME: If we never have to coerce group names to unibyte now, how
 ;; much of this is necessary? How much encoding/decoding do we still
 ;; have to do?
+;; At least, nntp method still needs this.
+;; Maybe we can just reduce this function to just lookup
+;; `gnus-group-name-charset-group-alist'.
 (defun gnus-group-name-charset (method group)
   (unless method
     (setq method (gnus-find-method-for-group group)))
diff --git a/gnus-srvr.el b/gnus-srvr.el
index fa880b7..39a94c1 100644
--- a/gnus-srvr.el
+++ b/gnus-srvr.el
@@ -775,13 +775,17 @@ claim them."
             (while (not (eobp))
               (ignore-errors
                 (push (cons
-                       (decode-coding-string
-                        (buffer-substring
-                         (point)
-                         (progn
-                           (skip-chars-forward "^ \t")
-                           (point)))
-                        'utf-8-emacs)
+                       (let ((name
+                              (buffer-substring
+                               (point)
+                               (progn
+                                 (skip-chars-forward "^ \t")
+                                 (point)))))
+                         (if (eq (detect-coding-string name t) 'undecided)
+                             name
+                           (decode-coding-string
+                            name
+                            (inline (gnus-group-name-charset method name)))))
                        (let ((last (read cur)))
                          (cons (read cur) last)))
                       groups))
diff --git a/gnus-util.el b/gnus-util.el
index 8dbdcc8..f204e81 100644
--- a/gnus-util.el
+++ b/gnus-util.el
@@ -622,9 +622,9 @@ If N, return the Nth ancestor instead."
 (defmacro gnus-group-real-name (group)
   "Find the real name of a foreign newsgroup."
   `(let ((gname ,group))
-     (if (string-match "^[^:]+:" gname)
-         (substring gname (match-end 0))
-       gname)))
+     (encode-coding-string (if (string-match "^[^:]+:" gname)
+                               (substring gname (match-end 0))
+                             (get-text-property 0 'charset gname)))))
 
 (defmacro gnus-group-server (group)
   "Find the server name of a foreign newsgroup.
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
  2021-12-30 10:23 ` [PATCH] " LdBeth
@ 2021-12-30 14:49 ` Lars Ingebrigtsen
  2021-12-30 14:54 ` Eli Zaretskii
  2021-12-30 15:18 ` LdBeth
  0 siblings, 2 replies; 25+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-30 14:49 UTC
  To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

> + (encode-coding-string (if (string-match "^[^:]+:" gname)
> + (substring gname (match-end 0))
> + (get-text-property 0 'charset gname)))))

Where is the `charset' property set?

And... I don't understand why `gnus-group-real-name' should encode
anything -- it's a function used everywhere to just strip the prefix
from group names, and doesn't really have anything conceptually to do
with translating to on-the-wire formats.

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
  2021-12-30 14:49 ` Lars Ingebrigtsen
@ 2021-12-30 14:54 ` Eli Zaretskii
  2021-12-30 15:18 ` LdBeth
  1 sibling, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2021-12-30 14:54 UTC
  To: Lars Ingebrigtsen; +Cc: eric, emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Thu, 30 Dec 2021 15:49:43 +0100
> Cc: Eric Abrahamsen <eric@ericabrahamsen.net>,
> Emacs Devel <emacs-devel@gnu.org>
>
> LdBeth <andpuke@foxmail.com> writes:
>
> > + (encode-coding-string (if (string-match "^[^:]+:" gname)
> > + (substring gname (match-end 0))
> > + (get-text-property 0 'charset gname)))))
>
> Where is the `charset' property set?

The decoding functions (decode-coding-region etc.) set it.
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
  2021-12-30 14:49 ` Lars Ingebrigtsen
  2021-12-30 14:54 ` Eli Zaretskii
@ 2021-12-30 15:18 ` LdBeth
  2021-12-31 15:59 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 25+ messages in thread
From: LdBeth @ 2021-12-30 15:18 UTC
  To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

>>>>> In <87wnjm6tbs.fsf@gnus.org>
>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:

Lars> Where is the `charset' property set?

This property is added automatically by `decode-coding-string' when
the decoding is non-trivial. It seems Gnus can save and restore text
properties without further modification, and at least this won't break
anything that already works. Please let me know if this behavior is
unreliable.

```
(decode-coding-string "\262\342\312\324" 'gbk)
;; ==> #("测试" 0 2 (charset chinese-gbk)) ;; "test" in Chinese
```

Lars> And... I don't understand why `gnus-group-real-name' should encode
Lars> anything -- it's a function used everywhere to just strip the prefix
Lars> from group names, and doesn't really have anything conceptually to do
Lars> with translating to on-the-wire formats.

Oops, I did this wrong. I meant to modify only the places where
`gnus-group-real-name' is called inside `gnus-int.el'.

By the way, I figured it is not a good idea to do the encoding in
nntp.el, because the decoding is not done in nntp.el either.

--
LDB
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
  2021-12-30 15:18 ` LdBeth
@ 2021-12-31 15:59 ` Lars Ingebrigtsen
  2022-01-01 2:11 ` LdBeth
  0 siblings, 1 reply; 25+ messages in thread
From: Lars Ingebrigtsen @ 2021-12-31 15:59 UTC
  To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

> This property is automatically added by `decode-coding-string` when
> the decoding is non trivial. It seems Gnus can save and restore text
> properties without further modification, and at least this won't break
> anything that is already working. Please let me know if this is an
> unreliable behavior.
>
> ```
> (decode-coding-string "\262\342\312\324" 'gbk)
> ;; ==> #("测试" 0 2 (charset chinese-gbk)) ;; "test" in Chinese
>
> ```

Ah, right, I'd totally forgotten that bit. I think it can be relied
upon. And storing the info as a text property will probably work in
Gnus -- it'll save the data to .newsrc.eld, as you've found out -- but
it sounds pretty brittle to me. That is, I wouldn't be surprised if the
text property goes missing at some point, because the code in Gnus isn't
written with text properties in mind.

> Lars> And... I don't understand why `gnus-group-real-name' should encode
> Lars> anything -- it's a function used everywhere to just strip the prefix
> Lars> from group names, and doesn't really have anything conceptually to do
> Lars> with translating to on-the-wire formats.
>
> oops, I'm doing this wrong. I was meant to modify only where
> `gnus-group-real-name' been called inside `gnus-int.el`.
>
> Btw I figured it is not a good idea to do encoding in nntp.el because
> the decoding was not done in nntp.el either.

Perhaps just having this in an alist in nntp.el somewhere would be the
most logical choice, even though it means that nntp.el peeks at Gnus
variables.

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
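To illustrate the brittleness concern, here is a sketch of ordinary
string operations that silently drop text properties. It only shows
general Emacs Lisp behavior; whether Gnus actually performs these
operations on group names is not claimed here. The function name
`my/charset-survival-demo' is hypothetical.

```elisp
(defun my/charset-survival-demo ()
  "Return the `charset' property of a decoded name after some operations."
  (let ((name (decode-coding-string "\262\342\312\324" 'gbk)))
    (list
     ;; Directly after decoding, the property reported earlier in
     ;; this thread is present.
     (get-text-property 0 'charset name)
     ;; Copying without properties drops it.
     (get-text-property 0 'charset (substring-no-properties name))
     ;; Rebuilding the string from its characters drops it as well.
     (get-text-property 0 'charset (apply #'string (append name nil))))))
```

Any code path that rebuilds or strips a group name in a similar way
would lose the original coding system, which is the argument for
keeping the mapping in a separate alist instead.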
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
  2021-12-31 15:59 ` Lars Ingebrigtsen
@ 2022-01-01 2:11 ` LdBeth
  2022-01-01 3:32 ` LdBeth
  2022-01-01 6:58 ` Eli Zaretskii
  0 siblings, 2 replies; 25+ messages in thread
From: LdBeth @ 2022-01-01 2:11 UTC
  To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

[-- Attachment #1: Type: text/plain, Size: 1618 bytes --]

>>>>> In <874k6o7okc.fsf@gnus.org>
>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:

Lars> Ah, right, I'd totally forgotten that bit. I think it can be relied
Lars> upon. And storing the info as a text property will probably work in
Lars> Gnus -- it'll save the data to .newsrc.eld, as you've found out -- but
Lars> it sounds pretty brittle to me. That is, I wouldn't be surprised if the
Lars> text property goes missing at some point, because the code in Gnus isn't
Lars> written with text properties in mind.

I have now figured out how to write the text property into
.newsrc.eld: Gnus does an extra UTF-8 encoding step when saving group
names. Since it already uses UTF-8 internally, I think it would be
safe to just remove that.

ldb> Btw I figured it is not a good idea to do encoding in nntp.el because
ldb> the decoding was not done in nntp.el either.

Lars> Perhaps just having this in an alist in nntp.el somewhere would be the
Lars> most logical choice, even though it means that nntp.el peeks at Gnus
Lars> variables.

I figured it is more difficult to do the encoding precisely in nntp.el.

Besides, I think it would be better to keep
`gnus-group-name-charset-group-alist' generic across all backends,
which is the old Emacs 26 behavior.

Right now this patch has no problem accessing and subscribing to
servers that use the GBK coding system, and saving the group names
with their text property (tested against the git master branch).

The one missing piece of the puzzle is that the text property is lost
at some point after the .newsrc.eld file is read in.
I'll do a trace later to find out if this can be worked out. [-- Attachment #2: gnus.patch --] [-- Type: text/plain, Size: 14625 bytes --] diff --git a/lisp/gnus/gnus-group.el b/lisp/gnus/gnus-group.el index b042930..9db3d11 100644 --- a/lisp/gnus/gnus-group.el +++ b/lisp/gnus/gnus-group.el @@ -1230,6 +1230,12 @@ gnus-group-decoded-name (let ((charset (gnus-group-name-charset nil string))) (gnus-group-name-decode string charset))) +(defun gnus-group-encoded-name (string) + ;; search for `charset' property added by `decode-coding-string' + (let ((pos (text-property-not-all 0 (length string) 'charset nil string))) + (if pos (encode-coding-string string (get-text-property pos 'charset string)) + string))) + (defun gnus-group-list-groups (&optional level unread lowest update-level) "List newsgroups with level LEVEL or lower that have unread articles. Default is all subscribed groups. diff --git a/lisp/gnus/gnus-int.el b/lisp/gnus/gnus-int.el index 255c11f..4fcc44d 100644 --- a/lisp/gnus/gnus-int.el +++ b/lisp/gnus/gnus-int.el @@ -472,7 +472,7 @@ gnus-request-compact-group (result (funcall (gnus-get-function gnus-command-method 'request-compact-group) - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) t))) result)) @@ -493,7 +493,8 @@ gnus-request-group (setq gnus-command-method (inline (gnus-server-to-method gnus-command-method)))) (funcall (inline (gnus-get-function gnus-command-method 'request-group)) - (gnus-group-real-name group) (nth 1 gnus-command-method) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method) dont-check info))) @@ -503,7 +504,8 @@ gnus-request-group-description (func 'request-group-description)) (when (gnus-check-backend-function func group) (funcall (gnus-get-function gnus-command-method func) - (gnus-group-real-name group) (nth 1 gnus-command-method))))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method))))) 
(defun gnus-request-group-scan (group info) "Request that GROUP get a complete rescan." @@ -511,13 +513,15 @@ gnus-request-group-scan (func 'request-group-scan)) (when (gnus-check-backend-function func group) (funcall (gnus-get-function gnus-command-method func) - (gnus-group-real-name group) (nth 1 gnus-command-method) info)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method) info)))) (defun gnus-close-group (group) "Request the GROUP be closed." (let ((gnus-command-method (inline (gnus-find-method-for-group group)))) (funcall (gnus-get-function gnus-command-method 'close-group) - (gnus-group-real-name group) (nth 1 gnus-command-method)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method)))) (defun gnus-retrieve-headers (articles group &optional fetch-old) "Request headers for ARTICLES in GROUP. @@ -531,14 +535,14 @@ gnus-retrieve-headers (gnus-agent-retrieve-headers articles group fetch-old)) (t (funcall (gnus-get-function gnus-command-method 'retrieve-headers) - articles (gnus-group-real-name group) + articles (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) fetch-old))))) (defun gnus-retrieve-articles (articles group) "Request ARTICLES in GROUP." (let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'retrieve-articles) - articles (gnus-group-real-name group) + articles (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) (defun gnus-retrieve-groups (groups command-method) @@ -557,7 +561,7 @@ gnus-request-type 'request-type (car gnus-command-method))) 'unknown (funcall (gnus-get-function gnus-command-method 'request-type) - (gnus-group-real-name group) article)))) + (gnus-group-real-name (gnus-group-encoded-name group)) article)))) (defun gnus-request-update-group-status (group status) "Change the status of a group. 
@@ -568,7 +572,7 @@ gnus-request-update-group-status nil (funcall (gnus-get-function gnus-command-method 'request-update-group-status) - (gnus-group-real-name group) status + (gnus-group-real-name (gnus-group-encoded-name group)) status (nth 1 gnus-command-method))))) (defun gnus-request-set-mark (group action) @@ -578,7 +582,7 @@ gnus-request-set-mark 'request-set-mark (car gnus-command-method))) action (funcall (gnus-get-function gnus-command-method 'request-set-mark) - (gnus-group-real-name group) action + (gnus-group-real-name (gnus-group-encoded-name group)) action (nth 1 gnus-command-method)) (gnus-run-hook-with-args gnus-after-set-mark-hook group action)))) @@ -590,7 +594,8 @@ gnus-request-update-mark mark (gnus-run-hook-with-args gnus-before-update-mark-hook group article mark) (funcall (gnus-get-function gnus-command-method 'request-update-mark) - (gnus-group-real-name group) article mark)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + article mark)))) (defun gnus-request-article (article group &optional buffer) "Request the ARTICLE in GROUP. @@ -598,7 +603,7 @@ gnus-request-article If BUFFER, insert the article in that group." (let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'request-article) - article (gnus-group-real-name group) + article (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) buffer))) (defun gnus-request-thread (header group) @@ -606,7 +611,7 @@ gnus-request-thread (let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'request-thread) header - (gnus-group-real-name group)))) + (gnus-group-real-name (gnus-group-encoded-name group))))) (defun gnus-select-group-with-message-id (group message-id) "Activate and select GROUP with the given MESSAGE-ID selected. @@ -654,7 +659,7 @@ gnus-simplify-group-name "Return the simplest representation of the name of GROUP. 
This is the string that Gnus uses to identify the group." (gnus-group-prefixed-name - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) (gnus-group-method group))) (defun gnus-warp-to-article () @@ -722,7 +727,8 @@ gnus-request-body clean-up t)) ;; Use `head' function. ((fboundp head) - (setq res (funcall head article (gnus-group-real-name group) + (setq res (funcall head article + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) ;; Use `article' function. (t @@ -751,7 +757,7 @@ gnus-request-expunge-group (gnus-server-to-method command-method) command-method))) (funcall (gnus-get-function gnus-command-method 'request-expunge-group) - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) (defvar mail-source-plugged) @@ -768,7 +774,7 @@ gnus-request-scan (not (gnus-agent-method-p gnus-command-method))) (setq gnus-internal-registry-spool-current-method gnus-command-method) (funcall (gnus-get-function gnus-command-method 'request-scan) - (and group (gnus-group-real-name group)) + (and group (gnus-group-real-name (gnus-group-encoded-name group))) (nth 1 gnus-command-method))))) (defun gnus-request-update-info (info command-method) @@ -792,7 +798,7 @@ gnus-request-marks 'request-marks (car gnus-command-method)) (let ((group (gnus-info-group info))) (and (funcall (gnus-get-function gnus-command-method 'request-marks) - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) info (nth 1 gnus-command-method)) ;; If the minimum article number is greater than 1, then all ;; smaller article numbers are known not to exist; we'll @@ -816,7 +822,8 @@ gnus-request-expire-articles (not-deleted (funcall (gnus-get-function gnus-command-method 'request-expire-articles) - articles (gnus-group-real-name group) (nth 1 gnus-command-method) + articles (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 
gnus-command-method) force))) (when (and gnus-agent (gnus-agent-method-p gnus-command-method)) @@ -830,7 +837,8 @@ gnus-request-move-article (let* ((gnus-command-method (gnus-find-method-for-group group)) (result (funcall (gnus-get-function gnus-command-method 'request-move-article) - article (gnus-group-real-name group) + article + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) accept-function last move-is-internal))) (when (and result gnus-agent @@ -864,7 +872,9 @@ gnus-request-accept-article (result (funcall (gnus-get-function gnus-command-method 'request-accept-article) - (if (stringp group) (gnus-group-real-name group) group) + (if (stringp group) + (gnus-group-real-name (gnus-group-encoded-name group)) + group) (cadr gnus-command-method) last))) (when (and gnus-agent @@ -883,7 +893,9 @@ gnus-request-replace-article (message-encode-message-body))) (let* ((func (car (gnus-group-name-to-method group))) (result (funcall (intern (format "%s-request-replace-article" func)) - article (gnus-group-real-name group) buffer))) + article + (gnus-group-real-name (gnus-group-encoded-name group)) + buffer))) (when (and gnus-agent (gnus-agent-method-p gnus-command-method)) (gnus-agent-regenerate-group group (list article))) result)) @@ -892,7 +904,7 @@ gnus-request-restore-buffer "Request a new buffer restored to the state of ARTICLE." 
(let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'request-restore-buffer) - article (gnus-group-real-name group) + article (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) (defun gnus-request-create-group (group &optional command-method args) @@ -902,13 +914,15 @@ gnus-request-create-group command-method) (gnus-find-method-for-group group)))) (funcall (gnus-get-function gnus-command-method 'request-create-group) - (gnus-group-real-name group) (nth 1 gnus-command-method) args))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method) args))) (defun gnus-request-delete-group (group &optional force) (let* ((gnus-command-method (gnus-find-method-for-group group)) (result (funcall (gnus-get-function gnus-command-method 'request-delete-group) - (gnus-group-real-name group) force (nth 1 gnus-command-method)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + force (nth 1 gnus-command-method)))) (when result (gnus-cache-delete-group group) (gnus-agent-delete-group group)) @@ -918,8 +932,9 @@ gnus-request-rename-group (let* ((gnus-command-method (gnus-find-method-for-group group)) (result (funcall (gnus-get-function gnus-command-method 'request-rename-group) - (gnus-group-real-name group) - (gnus-group-real-name new-name) (nth 1 gnus-command-method)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (gnus-group-real-name (gnus-group-encoded-name new-name)) + (nth 1 gnus-command-method)))) (when result (gnus-cache-rename-group group new-name) (gnus-agent-rename-group group new-name)) diff --git a/lisp/gnus/gnus-srvr.el b/lisp/gnus/gnus-srvr.el index fa880b7..94c6e2e 100644 --- a/lisp/gnus/gnus-srvr.el +++ b/lisp/gnus/gnus-srvr.el @@ -775,13 +775,12 @@ gnus-browse-foreign-server (while (not (eobp)) (ignore-errors (push (cons - (decode-coding-string - (buffer-substring + (gnus-group-decoded-name + (buffer-substring (point) 
(progn (skip-chars-forward "^ \t") - (point))) - 'utf-8-emacs) + (point)))) (let ((last (read cur))) (cons (read cur) last))) groups)) @@ -789,7 +788,7 @@ gnus-browse-foreign-server (while (not (eobp)) (ignore-errors (push (cons - (decode-coding-string + (gnus-group-decoded-name (if (eq (char-after) ?\") (read cur) (let ((p (point)) (name "")) @@ -801,8 +800,7 @@ gnus-browse-foreign-server (skip-chars-forward "^ \t\\\\") (setq name (concat name (buffer-substring p (point))))) - name)) - 'utf-8-emacs) + name))) (let ((last (read cur))) (cons (read cur) last))) groups)) diff --git a/lisp/gnus/gnus-start.el b/lisp/gnus/gnus-start.el index 606bd3a..b1b2366 100644 --- a/lisp/gnus/gnus-start.el +++ b/lisp/gnus/gnus-start.el @@ -2893,26 +2893,6 @@ gnus-gnus-to-quick-newsrc-format ;; Remove the `gnus-killed-list' from the list of variables ;; to be saved, if required. (delq 'gnus-killed-list (copy-sequence gnus-variable-list))))) - ;; Encode group names in `gnus-newsrc-alist' and - ;; `gnus-topic-alist' in order to keep newsrc.eld files - ;; compatible with older versions of Gnus. At some point, - ;; if/when a new version of Gnus is released, stop doing - ;; this and move the corresponding decode in - ;; `gnus-read-newsrc-el-file' into a conversion routine. - (gnus-newsrc-alist - (mapcar (lambda (info) - (cons (encode-coding-string (car info) 'utf-8-emacs) - (cdr info))) - gnus-newsrc-alist)) - (gnus-topic-alist - (when (memq 'gnus-topic-alist variables) - (mapcar (lambda (elt) - (cons (car elt) ; Topic name - (mapcar (lambda (g) - (encode-coding-string - g 'utf-8-emacs)) - (cdr elt)))) - gnus-topic-alist))) variable) ;; Insert the variables into the file. (while variables ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: LdBeth @ 2022-01-01 3:32 UTC
To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]

>>>>> In <tencent_CA1EFFD4DC58BB7F1C417AAC30747544AD09@qq.com>
>>>>> LdBeth <andpuke@foxmail.com> wrote:

ldb> Right now this patch has no problem accessing and subscribing to servers
ldb> with the GBK coding system, and it saves the group names with their text
ldb> property (tested against the git master branch). The one missing piece
ldb> of the puzzle is that the text property is lost at some point after the
ldb> newsrc.eld file is read in. I'll do a trace later to find out if this
ldb> can be worked out.

I have now removed the "extraneous" decoding routines that ran when
gnus-newsrc-alist is converted to a hash table. I ran some tests against
the server I use and it works fine for me.

The minimal .gnus.el I use:

```
(setq gnus-select-method '(nnnil ""))
(add-to-list 'gnus-secondary-select-methods
             '(nntp "news.newsfan.net"))
(setq gnus-group-name-charset-group-alist
      '((".*" . gbk)))
```

Note that after entering a group there may still be wrongly decoded
article names, but that can be solved by setting up
`gnus-summary-show-article-charset-alist`, `mm-coding-system-priorities`,
etc. These are not related to this patch and are quite complex, so I'd
rather not cover them here.

Btw, happy new year.
[-- Attachment #2: gnus.patch --] [-- Type: text/plain, Size: 15808 bytes --] diff --git a/lisp/gnus/gnus-group.el b/lisp/gnus/gnus-group.el index b042930..9db3d11 100644 --- a/lisp/gnus/gnus-group.el +++ b/lisp/gnus/gnus-group.el @@ -1230,6 +1230,12 @@ gnus-group-decoded-name (let ((charset (gnus-group-name-charset nil string))) (gnus-group-name-decode string charset))) +(defun gnus-group-encoded-name (string) + ;; search for `charset' property added by `decode-coding-string' + (let ((pos (text-property-not-all 0 (length string) 'charset nil string))) + (if pos (encode-coding-string string (get-text-property pos 'charset string)) + string))) + (defun gnus-group-list-groups (&optional level unread lowest update-level) "List newsgroups with level LEVEL or lower that have unread articles. Default is all subscribed groups. diff --git a/lisp/gnus/gnus-int.el b/lisp/gnus/gnus-int.el index 255c11f..4fcc44d 100644 --- a/lisp/gnus/gnus-int.el +++ b/lisp/gnus/gnus-int.el @@ -472,7 +472,7 @@ gnus-request-compact-group (result (funcall (gnus-get-function gnus-command-method 'request-compact-group) - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) t))) result)) @@ -493,7 +493,8 @@ gnus-request-group (setq gnus-command-method (inline (gnus-server-to-method gnus-command-method)))) (funcall (inline (gnus-get-function gnus-command-method 'request-group)) - (gnus-group-real-name group) (nth 1 gnus-command-method) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method) dont-check info))) @@ -503,7 +504,8 @@ gnus-request-group-description (func 'request-group-description)) (when (gnus-check-backend-function func group) (funcall (gnus-get-function gnus-command-method func) - (gnus-group-real-name group) (nth 1 gnus-command-method))))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method))))) (defun gnus-request-group-scan (group info) "Request that GROUP 
get a complete rescan." @@ -511,13 +513,15 @@ gnus-request-group-scan (func 'request-group-scan)) (when (gnus-check-backend-function func group) (funcall (gnus-get-function gnus-command-method func) - (gnus-group-real-name group) (nth 1 gnus-command-method) info)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method) info)))) (defun gnus-close-group (group) "Request the GROUP be closed." (let ((gnus-command-method (inline (gnus-find-method-for-group group)))) (funcall (gnus-get-function gnus-command-method 'close-group) - (gnus-group-real-name group) (nth 1 gnus-command-method)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method)))) (defun gnus-retrieve-headers (articles group &optional fetch-old) "Request headers for ARTICLES in GROUP. @@ -531,14 +535,14 @@ gnus-retrieve-headers (gnus-agent-retrieve-headers articles group fetch-old)) (t (funcall (gnus-get-function gnus-command-method 'retrieve-headers) - articles (gnus-group-real-name group) + articles (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) fetch-old))))) (defun gnus-retrieve-articles (articles group) "Request ARTICLES in GROUP." (let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'retrieve-articles) - articles (gnus-group-real-name group) + articles (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) (defun gnus-retrieve-groups (groups command-method) @@ -557,7 +561,7 @@ gnus-request-type 'request-type (car gnus-command-method))) 'unknown (funcall (gnus-get-function gnus-command-method 'request-type) - (gnus-group-real-name group) article)))) + (gnus-group-real-name (gnus-group-encoded-name group)) article)))) (defun gnus-request-update-group-status (group status) "Change the status of a group. 
@@ -568,7 +572,7 @@ gnus-request-update-group-status nil (funcall (gnus-get-function gnus-command-method 'request-update-group-status) - (gnus-group-real-name group) status + (gnus-group-real-name (gnus-group-encoded-name group)) status (nth 1 gnus-command-method))))) (defun gnus-request-set-mark (group action) @@ -578,7 +582,7 @@ gnus-request-set-mark 'request-set-mark (car gnus-command-method))) action (funcall (gnus-get-function gnus-command-method 'request-set-mark) - (gnus-group-real-name group) action + (gnus-group-real-name (gnus-group-encoded-name group)) action (nth 1 gnus-command-method)) (gnus-run-hook-with-args gnus-after-set-mark-hook group action)))) @@ -590,7 +594,8 @@ gnus-request-update-mark mark (gnus-run-hook-with-args gnus-before-update-mark-hook group article mark) (funcall (gnus-get-function gnus-command-method 'request-update-mark) - (gnus-group-real-name group) article mark)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + article mark)))) (defun gnus-request-article (article group &optional buffer) "Request the ARTICLE in GROUP. @@ -598,7 +603,7 @@ gnus-request-article If BUFFER, insert the article in that group." (let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'request-article) - article (gnus-group-real-name group) + article (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) buffer))) (defun gnus-request-thread (header group) @@ -606,7 +611,7 @@ gnus-request-thread (let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'request-thread) header - (gnus-group-real-name group)))) + (gnus-group-real-name (gnus-group-encoded-name group))))) (defun gnus-select-group-with-message-id (group message-id) "Activate and select GROUP with the given MESSAGE-ID selected. @@ -654,7 +659,7 @@ gnus-simplify-group-name "Return the simplest representation of the name of GROUP. 
This is the string that Gnus uses to identify the group." (gnus-group-prefixed-name - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) (gnus-group-method group))) (defun gnus-warp-to-article () @@ -722,7 +727,8 @@ gnus-request-body clean-up t)) ;; Use `head' function. ((fboundp head) - (setq res (funcall head article (gnus-group-real-name group) + (setq res (funcall head article + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) ;; Use `article' function. (t @@ -751,7 +757,7 @@ gnus-request-expunge-group (gnus-server-to-method command-method) command-method))) (funcall (gnus-get-function gnus-command-method 'request-expunge-group) - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) (defvar mail-source-plugged) @@ -768,7 +774,7 @@ gnus-request-scan (not (gnus-agent-method-p gnus-command-method))) (setq gnus-internal-registry-spool-current-method gnus-command-method) (funcall (gnus-get-function gnus-command-method 'request-scan) - (and group (gnus-group-real-name group)) + (and group (gnus-group-real-name (gnus-group-encoded-name group))) (nth 1 gnus-command-method))))) (defun gnus-request-update-info (info command-method) @@ -792,7 +798,7 @@ gnus-request-marks 'request-marks (car gnus-command-method)) (let ((group (gnus-info-group info))) (and (funcall (gnus-get-function gnus-command-method 'request-marks) - (gnus-group-real-name group) + (gnus-group-real-name (gnus-group-encoded-name group)) info (nth 1 gnus-command-method)) ;; If the minimum article number is greater than 1, then all ;; smaller article numbers are known not to exist; we'll @@ -816,7 +822,8 @@ gnus-request-expire-articles (not-deleted (funcall (gnus-get-function gnus-command-method 'request-expire-articles) - articles (gnus-group-real-name group) (nth 1 gnus-command-method) + articles (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 
gnus-command-method) force))) (when (and gnus-agent (gnus-agent-method-p gnus-command-method)) @@ -830,7 +837,8 @@ gnus-request-move-article (let* ((gnus-command-method (gnus-find-method-for-group group)) (result (funcall (gnus-get-function gnus-command-method 'request-move-article) - article (gnus-group-real-name group) + article + (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method) accept-function last move-is-internal))) (when (and result gnus-agent @@ -864,7 +872,9 @@ gnus-request-accept-article (result (funcall (gnus-get-function gnus-command-method 'request-accept-article) - (if (stringp group) (gnus-group-real-name group) group) + (if (stringp group) + (gnus-group-real-name (gnus-group-encoded-name group)) + group) (cadr gnus-command-method) last))) (when (and gnus-agent @@ -883,7 +893,9 @@ gnus-request-replace-article (message-encode-message-body))) (let* ((func (car (gnus-group-name-to-method group))) (result (funcall (intern (format "%s-request-replace-article" func)) - article (gnus-group-real-name group) buffer))) + article + (gnus-group-real-name (gnus-group-encoded-name group)) + buffer))) (when (and gnus-agent (gnus-agent-method-p gnus-command-method)) (gnus-agent-regenerate-group group (list article))) result)) @@ -892,7 +904,7 @@ gnus-request-restore-buffer "Request a new buffer restored to the state of ARTICLE." 
(let ((gnus-command-method (gnus-find-method-for-group group))) (funcall (gnus-get-function gnus-command-method 'request-restore-buffer) - article (gnus-group-real-name group) + article (gnus-group-real-name (gnus-group-encoded-name group)) (nth 1 gnus-command-method)))) (defun gnus-request-create-group (group &optional command-method args) @@ -902,13 +914,15 @@ gnus-request-create-group command-method) (gnus-find-method-for-group group)))) (funcall (gnus-get-function gnus-command-method 'request-create-group) - (gnus-group-real-name group) (nth 1 gnus-command-method) args))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (nth 1 gnus-command-method) args))) (defun gnus-request-delete-group (group &optional force) (let* ((gnus-command-method (gnus-find-method-for-group group)) (result (funcall (gnus-get-function gnus-command-method 'request-delete-group) - (gnus-group-real-name group) force (nth 1 gnus-command-method)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + force (nth 1 gnus-command-method)))) (when result (gnus-cache-delete-group group) (gnus-agent-delete-group group)) @@ -918,8 +932,9 @@ gnus-request-rename-group (let* ((gnus-command-method (gnus-find-method-for-group group)) (result (funcall (gnus-get-function gnus-command-method 'request-rename-group) - (gnus-group-real-name group) - (gnus-group-real-name new-name) (nth 1 gnus-command-method)))) + (gnus-group-real-name (gnus-group-encoded-name group)) + (gnus-group-real-name (gnus-group-encoded-name new-name)) + (nth 1 gnus-command-method)))) (when result (gnus-cache-rename-group group new-name) (gnus-agent-rename-group group new-name)) diff --git a/lisp/gnus/gnus-srvr.el b/lisp/gnus/gnus-srvr.el index fa880b7..94c6e2e 100644 --- a/lisp/gnus/gnus-srvr.el +++ b/lisp/gnus/gnus-srvr.el @@ -775,13 +775,12 @@ gnus-browse-foreign-server (while (not (eobp)) (ignore-errors (push (cons - (decode-coding-string - (buffer-substring + (gnus-group-decoded-name + (buffer-substring (point) 
(progn (skip-chars-forward "^ \t") - (point))) - 'utf-8-emacs) + (point)))) (let ((last (read cur))) (cons (read cur) last))) groups)) @@ -789,7 +788,7 @@ gnus-browse-foreign-server (while (not (eobp)) (ignore-errors (push (cons - (decode-coding-string + (gnus-group-decoded-name (if (eq (char-after) ?\") (read cur) (let ((p (point)) (name "")) @@ -801,8 +800,7 @@ gnus-browse-foreign-server (skip-chars-forward "^ \t\\\\") (setq name (concat name (buffer-substring p (point))))) - name)) - 'utf-8-emacs) + name))) (let ((last (read cur))) (cons (read cur) last))) groups)) diff --git a/lisp/gnus/gnus-start.el b/lisp/gnus/gnus-start.el index 606bd3a..2999d6b 100644 --- a/lisp/gnus/gnus-start.el +++ b/lisp/gnus/gnus-start.el @@ -1831,11 +1831,7 @@ gnus-make-hashtable-from-newsrc-alist (if (setq rest (member method methods)) (setf (gnus-info-method info) (car rest)) (push method methods))) - ;; Check for encoded group names and decode them. - (when (string-match-p "[^[:ascii:]]" (setq gname (gnus-info-group info))) - (let ((decoded (gnus-group-decoded-name gname))) - (setf gname decoded - (gnus-info-group info) decoded))) + (setf gname (gnus-info-group info)) ;; Check for duplicates. (if (gethash gname gnus-newsrc-hashtb) ;; Remove this entry from the alist. @@ -2406,17 +2402,6 @@ gnus-read-newsrc-el-file (when gnus-newsrc-assoc (setq gnus-newsrc-alist gnus-newsrc-assoc)))) (gnus-make-hashtable-from-newsrc-alist) - (when gnus-topic-alist - (setq gnus-topic-alist - (mapcar - (lambda (elt) - (cons (car elt) - (mapcar (lambda (g) - (if (string-match-p "[^[:ascii:]]" g) - (gnus-group-decoded-name g) - g)) - (cdr elt)))) - gnus-topic-alist))) (when (file-newer-than-file-p file ding-file) ;; Old format quick file (gnus-message 5 "Reading %s..." file) @@ -2893,26 +2878,6 @@ gnus-gnus-to-quick-newsrc-format ;; Remove the `gnus-killed-list' from the list of variables ;; to be saved, if required. 
(delq 'gnus-killed-list (copy-sequence gnus-variable-list))))) - ;; Encode group names in `gnus-newsrc-alist' and - ;; `gnus-topic-alist' in order to keep newsrc.eld files - ;; compatible with older versions of Gnus. At some point, - ;; if/when a new version of Gnus is released, stop doing - ;; this and move the corresponding decode in - ;; `gnus-read-newsrc-el-file' into a conversion routine. - (gnus-newsrc-alist - (mapcar (lambda (info) - (cons (encode-coding-string (car info) 'utf-8-emacs) - (cdr info))) - gnus-newsrc-alist)) - (gnus-topic-alist - (when (memq 'gnus-topic-alist variables) - (mapcar (lambda (elt) - (cons (car elt) ; Topic name - (mapcar (lambda (g) - (encode-coding-string - g 'utf-8-emacs)) - (cdr elt)))) - gnus-topic-alist))) variable) ;; Insert the variables into the file. (while variables ^ permalink raw reply related [flat|nested] 25+ messages in thread
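The core of the patch above is the small `gnus-group-encoded-name` helper, which the `gnus-request-*` wrappers call before handing a name to the backend. The following is a minimal standalone sketch of that helper, copied under the hypothetical name `my/group-encoded-name` so it can be evaluated without Gnus. As an assumption for illustration, the `charset` property is attached by hand with `propertize` and holds a coding system name (`gbk`); the sketch does not show what a real decoder attaches.

```elisp
;; Standalone copy of the helper from the patch, renamed so that it does
;; not clash with Gnus.  `my/group-encoded-name' is a hypothetical name.
(defun my/group-encoded-name (string)
  "Encode STRING using the coding system in its `charset' text property.
Return STRING unchanged when no character carries the property."
  (let ((pos (text-property-not-all 0 (length string) 'charset nil string)))
    (if pos
        (encode-coding-string string (get-text-property pos 'charset string))
      string)))

;; Assumption: the property is set by hand and names a coding system.
;; "\u4e2d\u6587" is the two-character Chinese word for "Chinese".
(defvar my/demo-name (propertize "\u4e2d\u6587.test" 'charset 'gbk))

;; The unibyte GBK byte sequence that would be sent to the server.
(defvar my/demo-wire-name (my/group-encoded-name my/demo-name))
```

A plain ASCII name such as "comp.lang.lisp" carries no property and is returned as-is, which is why the wrappers can call the helper unconditionally.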
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: Lars Ingebrigtsen @ 2022-01-03 11:18 UTC
To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

LdBeth <andpuke@foxmail.com> writes:

> +  (let ((pos (text-property-not-all 0 (length string) 'charset nil string)))
> +    (if pos (encode-coding-string string (get-text-property pos 'charset string))
> +      string)))

Like I said before, I'm not really very enthusiastic about stashing
this data in the text properties -- it's quite likely that there are
packages or functions out there that'll do various transforms on the
group names, and the text properties may be lost. If this data has to
be stored non-ephemerally, then storing it in the group parameter list,
for instance, would be less brittle.

> @@ -472,7 +472,7 @@ gnus-request-compact-group
>      (result
>       (funcall (gnus-get-function gnus-command-method
>                                   'request-compact-group)
> -              (gnus-group-real-name group)
> +              (gnus-group-real-name (gnus-group-encoded-name group))
>                (nth 1 gnus-command-method) t)))
>      result))

(etc.) And I'm not sure about giving the backends the encoded names --
that's a major change in behaviour, and has to end up causing problems
somewhere (for instance, in nnimap which is encoded to utf-7, and would
be double-encoded if there's a `charset' text property on the group
name).

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
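The brittleness concern can be made concrete with a small sketch. It shows that a property-free copy of a name is still `equal` to the original, so nothing signals that the coding information has gone missing, and it contrasts that with a side table keyed on the decoded name. The table and the `my/...` functions are hypothetical stand-ins for illustration only; they are not the Gnus group parameter API.

```elisp
;; A decoded name carrying the coding information as a text property.
(defvar my/name-with-prop (propertize "\u4e2d\u6587.test" 'charset 'gbk))

;; Any function that returns a fresh, property-free copy loses it.
(defvar my/name-stripped (substring-no-properties my/name-with-prop))

;; Hypothetical side table: decoded group name -> coding system.
;; `equal' ignores text properties, so both variants hit the same entry.
(defvar my/group-coding-table (make-hash-table :test #'equal))

(defun my/remember-group-coding (name coding)
  "Record that group NAME must be encoded with CODING for the server."
  (puthash (substring-no-properties name) coding my/group-coding-table))

(defun my/group-wire-name (name)
  "Return NAME encoded for the server, or NAME itself if nothing is recorded."
  (let ((coding (gethash name my/group-coding-table)))
    (if coding
        (encode-coding-string name coding)
      name)))

(my/remember-group-coding my/name-with-prop 'gbk)
```

With the table in place, `my/group-wire-name` gives the same bytes for the stripped name as for the propertized one, which is the sense in which out-of-band storage is less brittle.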
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: Lars Ingebrigtsen @ 2022-01-03 11:25 UTC
To: LdBeth; +Cc: Eric Abrahamsen, Emacs Devel

Lars Ingebrigtsen <larsi@gnus.org> writes:

> (etc.) And I'm not sure about giving the backends the encoded names --
> that's a major change in behaviour, and has to end up causing problems
> somewhere (for instance, in nnimap which is encoded to utf-7, and would
> be double-encoded if there's a `charset' text property on the group
> name).

On the other hand -- I guess this was what Gnus did before Eric's
changes? So perhaps there'll be no problems.

--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: LdBeth @ 2022-01-03 14:00 UTC
To: Lars Ingebrigtsen; +Cc: Eric Abrahamsen, LdBeth, Emacs Devel

>>>>> In <87v8z13w54.fsf@gnus.org>
>>>>> Lars Ingebrigtsen <larsi@gnus.org> wrote:

Lars> LdBeth <andpuke@foxmail.com> writes:

>> +  (let ((pos (text-property-not-all 0 (length string) 'charset nil string)))
>> +    (if pos (encode-coding-string string (get-text-property pos 'charset string))
>> +      string)))

Lars> Like I said before, I'm not really very enthusiastic about
Lars> stashing this data in the text properties -- it's quite likely
Lars> that there are packages or functions out there that'll do various
Lars> transforms on the group names, and the text properties may be
Lars> lost. If this data has to be stored non-ephemerally, then
Lars> storing it in the group parameter list, for instance, would be
Lars> less brittle.

Thanks for the suggestion. It seems possible to add the coding
information to the group parameter list, so the newsrc.eld file can
still be saved with ASCII coding. However, before a group has been
subscribed and added to `gnus-newsrc-hashtb', the charset information
seems to have no other place to be stored unless we add another
bookkeeping facility.

As for the group names, they are stored in a text property in the
group mode buffer and are retrieved via
`(get-text-property (point) 'gnus-group)'; it seems they are only
looked up and compared.

Or maybe we could just prepend the charset information to the decoded
group name. For example, instead of `nntp+news.server.net:group.name'
it becomes `nntp-gbk+news.server.net:group.name'. In that case the
information stays attached to the group name, and can certainly
survive most of the transformations.

>> @@ -472,7 +472,7 @@ gnus-request-compact-group
>>      (result
>>       (funcall (gnus-get-function gnus-command-method
>>                                   'request-compact-group)
>> -              (gnus-group-real-name group)
>> +              (gnus-group-real-name (gnus-group-encoded-name group))
>>                (nth 1 gnus-command-method) t)))
>>      result))

Lars> (etc.) And I'm not sure about giving the backends the encoded names --
Lars> that's a major change in behaviour, and has to end up causing problems
Lars> somewhere (for instance, in nnimap which is encoded to utf-7, and would
Lars> be double-encoded if there's a `charset' text property on the group
Lars> name).

In older versions it was just a call to `string-as-unibyte'. And for
nnimap, the `gnus-group-name-charset' function already avoids doing
extra encoding.

--
LDB
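The `nntp-gbk+news.server.net:group.name` naming idea floated in this message can be sketched as a small parser. Everything here is hypothetical: `my/split-charset-prefix` and the exact `BACKEND-CODING+` syntax are assumptions made for illustration, and nothing like this exists in Gnus.

```elisp
(defun my/split-charset-prefix (group)
  "Split GROUP of the assumed form BACKEND-CODING+SERVER:NAME.
Return (CODING . PLAIN-NAME), where PLAIN-NAME has the usual
BACKEND+SERVER:NAME shape.  CODING is nil and PLAIN-NAME is GROUP
itself when GROUP carries no coding prefix."
  (if (string-match "\\`\\([^-+:]+\\)-\\([^+:]+\\)\\+" group)
      (let ((backend (match-string 1 group))
            (coding (match-string 2 group))
            (rest (substring group (match-end 0))))
        (cons (intern coding)
              (concat backend "+" rest)))
    (cons nil group)))

;; Usage: recover the coding system and the conventional group name.
(defvar my/split-demo
  (my/split-charset-prefix "nntp-gbk+news.server.net:group.name"))
```

One design caveat of this sketch: it treats the first hyphen before the `+` as the separator, so it would misparse any backend whose own name contains a hyphen. Hyphens in the server part are not affected, because matching stops at the first `+`.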
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: Eli Zaretskii @ 2022-01-01 6:58 UTC
To: LdBeth; +Cc: eric, larsi, emacs-devel

> Date: Sat, 01 Jan 2022 10:11:12 +0800
> From: LdBeth <andpuke@foxmail.com>
> Cc: Eric Abrahamsen <eric@ericabrahamsen.net>, LdBeth <andpuke@foxmail.com>,
>  Emacs Devel <emacs-devel@gnu.org>
>
> I have now figured out how to write the text property into .newsrc.eld:
> Gnus does extra UTF-8 encoding when saving group names; since it is
> now already using UTF-8 encoding internally, I think it would be safe
> to just remove that.

If by "using UTF-8 internally" you mean the internal representation of
buffer text and strings, then encoding is still needed for correct
handling of codepoints outside of Unicode.
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: LdBeth @ 2022-01-01 8:34 UTC
To: Eli Zaretskii; +Cc: eric, LdBeth, larsi, emacs-devel

>>>>> In <83tueoeyby.fsf@gnu.org>
>>>>> Eli Zaretskii <eliz@gnu.org> wrote:

>> I have now figured out how to write the text property into .newsrc.eld:
>> Gnus does extra UTF-8 encoding when saving group names; since it is
>> now already using UTF-8 encoding internally, I think it would be safe
>> to just remove that.

Eli> If by "using UTF-8 internally" you mean the internal representation of
Eli> buffer text and strings, then encoding is still needed for correct
Eli> handling of codepoints outside of Unicode.

Gnus has been using the `utf-8-emacs' coding system to save the
newsrc.eld file for a while. According to the Elisp manual, that is
the coding system that can handle the internal codepoints used by
Emacs.

--
LDB
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: Eli Zaretskii @ 2022-01-01 8:56 UTC
To: LdBeth; +Cc: eric, larsi, emacs-devel

> Date: Sat, 01 Jan 2022 16:34:12 +0800
> From: LdBeth <andpuke@foxmail.com>
> Cc: eric@ericabrahamsen.net, LdBeth <andpuke@foxmail.com>, larsi@gnus.org,
>  emacs-devel@gnu.org
>
> >>>>> In <83tueoeyby.fsf@gnu.org>
> >>>>> Eli Zaretskii <eliz@gnu.org> wrote:
>
> >> I have now figured out how to write the text property into .newsrc.eld:
> >> Gnus does extra UTF-8 encoding when saving group names; since it is
> >> now already using UTF-8 encoding internally, I think it would be safe
> >> to just remove that.
>
> Eli> If by "using UTF-8 internally" you mean the internal representation of
> Eli> buffer text and strings, then encoding is still needed for correct
> Eli> handling of codepoints outside of Unicode.
>
> Gnus has been using the `utf-8-emacs' coding system to save the
> newsrc.eld file for a while. According to the Elisp manual, that is
> the coding system that can handle the internal codepoints used by
> Emacs.

You are saying that encoding by utf-8-emacs is a no-op? AFAIR, that's
not true.
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: LdBeth @ 2022-01-01 9:26 UTC
To: Eli Zaretskii; +Cc: eric, LdBeth, larsi, emacs-devel

>>>>> In <83mtkfg7g1.fsf@gnu.org>
>>>>> Eli Zaretskii <eliz@gnu.org> wrote:

>> Eli> If by "using UTF-8 internally" you mean the internal representation of
>> Eli> buffer text and strings, then encoding is still needed for correct
>> Eli> handling of codepoints outside of Unicode.
>>
>> Gnus has been using the `utf-8-emacs' coding system to save the
>> newsrc.eld file for a while. According to the Elisp manual, that is
>> the coding system that can handle the internal codepoints used by
>> Emacs.

Eli> You are saying that encoding by utf-8-emacs is a no-op? AFAIR, that's
Eli> not true.

I mean, there should be no problem using `prin1' to write any Emacs
string to a file saved with the utf-8-emacs coding system, and then
using `read' to get it back correctly, given that the file has a
`-*- coding: utf-8-emacs -*-` header line.

Encoding by utf-8-emacs was used under the assumption that Gnus from a
much older version of Emacs can safely read the file. By treating
everything as UTF-8, Gnus has already broken compatibility with older
versions (a .newsrc.eld containing wrongly encoded characters cannot
work with older versions that correctly encode and decode non-UTF-8
group names), so I think there is no point in continuing to restrict
the charset of .newsrc.eld to ASCII.

So this patch can take advantage of the UTF-8 internal string encoding
without the redundancy of keeping an extra translation table, which
would probably not be as good as the approach taken by this patch.

--
LDB
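The `prin1`/`read` round trip described in this message can be sketched as follows. This is a minimal illustration, not the code Gnus uses to write newsrc.eld: `my/roundtrip-through-file` is a hypothetical helper, and it binds `coding-system-for-write` and `coding-system-for-read` explicitly instead of relying on a `-*- coding: utf-8-emacs -*-` cookie in the file.

```elisp
(defun my/roundtrip-through-file (object)
  "Print OBJECT to a temporary utf-8-emacs file and read it back."
  (let ((file (make-temp-file "newsrc-demo")))
    (unwind-protect
        (progn
          ;; Write the printed representation; for a propertized string
          ;; `prin1' uses the #("..." START END PLIST) read syntax.
          (let ((coding-system-for-write 'utf-8-emacs))
            (with-temp-file file
              (prin1 object (current-buffer))))
          ;; Read the file back with the same coding system.
          (with-temp-buffer
            (let ((coding-system-for-read 'utf-8-emacs))
              (insert-file-contents file))
            (goto-char (point-min))
            (read (current-buffer))))
      (delete-file file))))

;; Assumption for illustration: a decoded group name tagged by hand.
(defvar my/saved-name (propertize "\u4e2d\u6587.test" 'charset 'gbk))
(defvar my/restored-name (my/roundtrip-through-file my/saved-name))
```

Both the characters and the `charset` property come back from the file, which is the behaviour the patch relies on once the extra encode and decode steps around newsrc.eld are removed.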
* Re: [PATCH] Gnus; Restore multi encoding support for NNTP
From: Eli Zaretskii @ 2022-01-01 9:35 UTC
To: LdBeth; +Cc: eric, larsi, emacs-devel

> Date: Sat, 01 Jan 2022 17:26:00 +0800
> From: LdBeth <andpuke@foxmail.com>
> Cc: LdBeth <andpuke@foxmail.com>,
>  eric@ericabrahamsen.net,
>  larsi@gnus.org,
>  emacs-devel@gnu.org
>
> >>>>> In <83mtkfg7g1.fsf@gnu.org>
> >>>>> Eli Zaretskii <eliz@gnu.org> wrote:
>
> >> Eli> If by "using UTF-8 internally" you mean the internal representation of
> >> Eli> buffer text and strings, then encoding is still needed for correct
> >> Eli> handling of codepoints outside of Unicode.
> >>
> >> Gnus has been using the `utf-8-emacs' coding system to save the
> >> newsrc.eld file for a while. According to the Elisp manual, that is
> >> the coding system that can handle the internal codepoints used by
> >> Emacs.
>
> Eli> You are saying that encoding by utf-8-emacs is a no-op? AFAIR, that's
> Eli> not true.
>
> I mean, there should be no problem using `prin1' to write any Emacs
> string to a file saved with the utf-8-emacs coding system, and then
> using `read' to get it back correctly, given that the file has a
> `-*- coding: utf-8-emacs -*-` header line.

That's fine. I thought you were suggesting not to encode the text
written to a file at all. Apologies for my misunderstanding.
end of thread, other threads:[~2022-01-03 14:00 UTC | newest]

Thread overview: 25+ messages
2021-12-27  9:42 Gnus; Restore multi encoding support for NNTP LdBeth
2021-12-27 12:11 ` Lars Ingebrigtsen
2021-12-27 12:41 ` LdBeth
2021-12-27 12:57 ` Lars Ingebrigtsen
2021-12-27 13:58 ` LdBeth
2021-12-28  3:17 ` Eric Abrahamsen
2021-12-28 14:31 ` Lars Ingebrigtsen
2021-12-28 15:40 ` LdBeth
2021-12-28 14:29 ` Lars Ingebrigtsen
2021-12-28 15:43 ` LdBeth
2021-12-30 10:23 ` [PATCH] " LdBeth
2021-12-30 14:49 ` Lars Ingebrigtsen
2021-12-30 14:54 ` Eli Zaretskii
2021-12-30 15:18 ` LdBeth
2021-12-31 15:59 ` Lars Ingebrigtsen
2022-01-01  2:11 ` LdBeth
2022-01-01  3:32 ` LdBeth
2022-01-03 11:18 ` Lars Ingebrigtsen
2022-01-03 11:25 ` Lars Ingebrigtsen
2022-01-03 14:00 ` LdBeth
2022-01-01  6:58 ` Eli Zaretskii
2022-01-01  8:34 ` LdBeth
2022-01-01  8:56 ` Eli Zaretskii
2022-01-01  9:26 ` LdBeth
2022-01-01  9:35 ` Eli Zaretskii