From: Florian Weimer
Newsgroups: gmane.emacs.devel,gmane.emacs.gnus.general
Subject: Re: master ef14acf: Make nnml handle invalid non-ASCII headers more consistently
Date: Sat, 17 Dec 2022 15:57:18 +0100
Message-ID: <87bko26ptd.fsf@oldenburg.str.redhat.com>
References: <20210122180801.14756.84264@vcs0.savannah.gnu.org> <20210122180802.F0A1E20A10@vcs0.savannah.gnu.org> <874jtvq8c2.fsf@oldenburg.str.redhat.com> <83k02qiicb.fsf@gnu.org>
To: Eli Zaretskii
Cc: Lars Ingebrigtsen , Eric Abrahamsen , emacs-devel@gnu.org, ding@gnus.org

* Eli Zaretskii:

>> From: Florian Weimer
>> Cc: Lars Ingebrigtsen , ding@gnus.org
>> Date: Fri, 16 Dec 2022 23:42:21 +0100
>>
>> * Lars Ingebrigtsen:
>>
>> > branch: master
>> > commit ef14acfb68bb5b0ce42221e9681b93562f8085eb
>> > Author: Lars Ingebrigtsen
>> > Commit: Lars Ingebrigtsen
>> >
>> >     Make nnml handle invalid non-ASCII headers more consistently
>> >
>> >     * lisp/gnus/nnml.el (nnml--encode-headers): New function to
>> >     RFC2047-encode invalid Subject/From headers (bug#45925).  This
>> >     will make them be displayed more consistently in the Summary
>> >     buffer (but still "wrong" sometimes, since there's not that much
>> >     we can guess at at this stage, charset wise).
>> >     (nnml-parse-head): Use it.
>> > ---
>> >  lisp/gnus/nnml.el | 16 ++++++++++++++++
>> >  1 file changed, 16 insertions(+)
>> >
>> > diff --git a/lisp/gnus/nnml.el b/lisp/gnus/nnml.el
>> > index ebececa..3cdfc74 100644
>> > --- a/lisp/gnus/nnml.el
>> > +++ b/lisp/gnus/nnml.el
>> > @@ -769,8 +769,24 @@ article number.  This function is called narrowed to an article."
>> >        (let ((headers (nnheader-parse-head t)))
>> >          (setf (mail-header-chars headers) chars)
>> >          (setf (mail-header-number headers) number)
>> > +        ;; If there's non-ASCII raw characters in the data,
>> > +        ;; RFC2047-encode them to avoid having arbitrary data in the
>> > +        ;; .overview file.
>> > +        (nnml--encode-headers headers)
>> >          headers))))
>>
>> Unfortunately, this change in particular causes Gnus to stop storing
>> messages into nnmail after receiving a message with this header:
>>
>> From: =?utf-8?b?572X5YuH5YiaKFlvbmdnYW5nIEx1bykgdmlhIEVsZnV0aWxzLWRldmVs?=
>>
>>
>> The logged error message is:
>>
>> Mail source (maildir :path …) failed: (error Invalid data for rfc2047 encoding: 罗勇刚(Yonggang Luo) via Elfutils-devel )
>>
>> On an older Emacs without this change, it seems that the original header
>> is written to the .overview file, which sidesteps the problem that not
>> all strings are encodable by the rfc2047 functions.
>
> Thanks.  I guess this From header is invalid because there's no space
> between the "罗勇刚" and the "(Yonggang Luo)" parts?

Yes, that seems to be what's tripping the encoder.  But I'm not sure if
proper encoding of ( or ) (as =28 or =29 using the Q encoding, or using
the B encoding as in the raw text) is actually invalid.  RFC 2047 only
talks about unencoded ( or ).  In contrast, encoded ( and ) are valid
syntax at the RFC 822 layer because encoding hides them.

> Does the naïve patch below solve the problem?
>
> diff --git a/lisp/gnus/nnml.el b/lisp/gnus/nnml.el
> index 40e4b9e..7aa445e 100644
> --- a/lisp/gnus/nnml.el
> +++ b/lisp/gnus/nnml.el
> @@ -776,17 +776,22 @@ nnml-parse-head
>          (nnml--encode-headers headers)
>          headers))))
>
> +;; RFC2047-encode Subject and From, but leave invalid headers unencoded.
>  (defun nnml--encode-headers (headers)
>    (let ((subject (mail-header-subject headers))
>          (rfc2047-encoding-type 'mime))
>      (unless (string-match "\\`[[:ascii:]]*\\'" subject)
> -      (setf (mail-header-subject headers)
> -            (mail-encode-encoded-word-string subject t))))
> +      (let ((encoded-subject
> +             (ignore-errors (mail-encode-encoded-word-string subject t))))
> +        (if encoded-subject
> +            (setf (mail-header-subject headers) encoded-subject)))))
>    (let ((from (mail-header-from headers))
>          (rfc2047-encoding-type 'address-mime))
>      (unless (string-match "\\`[[:ascii:]]*\\'" from)
> -      (setf (mail-header-from headers)
> -            (rfc2047-encode-string from t))))
> +      (let ((encoded-from
> +             (ignore-errors (rfc2047-encode-string from t))))
> +        (if encoded-from
> +            (setf (mail-header-from headers) encoded-from))))))
>
>  (defun nnml-get-nov-buffer (group &optional incrementalp)
>    (let ((buffer (gnus-get-buffer-create

Thanks!  I somehow can't reproduce the original issue.  I expect more
problematic messages to arrive next week, though, and will report then
how it goes.

Florian
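
For reference, the encoded word from the From header quoted in this message can be inspected outside Emacs. The sketch below is not part of the patch or of Gnus. It uses Python's standard email.header module, and the helper name decode_words is made up for illustration. It shows that the wire form of the header contains no literal parentheses, and that they appear only once the B encoding is decoded, which is the string the error message reports.

```python
from email.header import decode_header

# Encoded word copied from the From header quoted in this message
# (B encoding, charset utf-8).
raw = "=?utf-8?b?572X5YuH5YiaKFlvbmdnYW5nIEx1bykgdmlhIEVsZnV0aWxzLWRldmVs?="


def decode_words(value):
    """Hypothetical helper: join all RFC 2047 parts of VALUE into one str."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Encoded words come back as bytes plus a charset name.
            data = data.decode(charset or "ascii")
        parts.append(data)
    return "".join(parts)


decoded = decode_words(raw)
print(decoded)

# The raw header has no literal parentheses; they exist only after decoding.
print("(" in raw, "(" in decoded)
```

Under these assumptions, the decoded text is the display name followed directly by the parenthesised part, with no space in between, which matches the string in the logged error.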