From: Visuwesh
Newsgroups: gmane.emacs.bugs
Subject: bug#74624: 29.4.50; Gnus cannot parse some filenames(UTF8) in an attachment
Date: Sun, 01 Dec 2024 11:54:11 +0530
Message-ID: <875xo3q5tg.fsf@gmail.com>
References: <87v7w44srm.fsf@localdomain> <86ed2s7kxp.fsf@gnu.org>
In-Reply-To: <86ed2s7kxp.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 30 Nov 2024 18:20:18 +0200")
To: Eli Zaretskii
Cc: 74624@debbugs.gnu.org, Konstantin

[Sat, November 30, 2024] Eli Zaretskii wrote:

>> From: Konstantin
>> Date: Sat, 30 Nov 2024 18:59:25 +0300
>>
>> From time to time i get emails with attachments from my colleges, which they send from
>> "Roundcube" web-interface.
>>
>> Often, i cannot open these attachments by =RET=(gnus-article-press-button)
>> or save them =o=(gnus-mime-save-part) with correct name.
>> (interestingly =X-m=(gnus-summary-save-parts) works correctly)
>>
>> The reason is gnus cannot parse correctly some attached filenames.
>>
>> The example of such attachment (I took it from gnus-summary-show-raw-article)
>>
>> --=_d38c0abddd645077f401d42fa430d9d5
>> Content-Transfer-Encoding: base64
>> Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document;
>>  name="=?UTF-8?Q?=D0=9E=D0=B1=D0=B7=D0=BE=D1=80_2024_=28=D0=BD=D0=B0_=2Ed?=
>>  =?UTF-8?Q?ocx?="
>> Content-Disposition: attachment;
>>  filename*0*=UTF-8''%D0%9E%D0%B1%D0%B7%D0%BE%D1%80%202024%20%28%D0%BD%D0;
>>  filename*1*=%B0%20.docx;
>>  size=10
>>
>> c2Rmc2FmYXNmCg==
>> --=_d38c0abddd645077f401d42fa430d9d5--
>>
>> I have tried to examine the reason. As i see it,
>> gnus-data for such attachment is formed incorrectly:
>>
>> (#
>>  ("application/vnd.openxmlformats-officedocument.word..."
>>   (name . "Обзор 2024 (на .docx"))
>>  base64 nil
>>  ("attachment" (size . "10")
>>   (filename . "Обзор 2024 (н\320")) nil nil nil)
>>
>> One can see that the filename is broken.
>> It should be "Обзор 2024 (на .docx" just like the name.
>
> It looks like Gnus fails to decipher the file name when it is split in
> the middle of a UTF-8 sequence.
>
> I don't know Gnus.  If you can help me by showing where the value of
> 'gnus-data property is calculated, I might be able to find the bug and
> suggest a fix.

The decoding of the filename in the Content-Disposition header is done
in mm-dissect-buffer by calling mail-header-parse-content-disposition.
Specifically, rfc2231-parse-string.  The following patch fixes the issue
on my end:

diff --git a/lisp/mail/rfc2231.el b/lisp/mail/rfc2231.el
index 33324cafb5b..632e270a922 100644
--- a/lisp/mail/rfc2231.el
+++ b/lisp/mail/rfc2231.el
@@ -193,7 +193,7 @@ rfc2231-parse-string
                     (push (list attribute value encoded) cparams))
                    ;; Repetition of a part; do nothing.
                    ((and elem
-                         (null number))
+                         (null part))
                     )
                    ;; Concatenate continuation parts.
                    (t

NUMBER is the variable used during the parsing portion of the function
in the big condition-case form above the cl-loop form which the patch
modifies.  In the header below

    Content-Disposition: attachment;
     filename*0*=UTF-8''%D0%9E%D0%B1%D0%B7%D0%BE%D1%80%202024%20%28%D0%BD%D0;
     filename*1*=%B0%20.docx;
     size=10

the function first parses filename*0* and here NUMBER is 0, then
filename*1* and here NUMBER is 1.  By the time it finishes parsing size,
NUMBER is set to nil.

The loop should use the value of NUMBER pushed to PARAMETERS as the 3rd element (referred to as `part' by the cl-loop form) instead of whatever value NUMBER happened to be when we parsed the last element.
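
For anyone who wants to check this without a Roundcube message at hand, here is a minimal sketch of a reproduction. It assumes only that rfc2231.el can be loaded; the variable name my-header is made up for illustration, and the header value is the one from the report with the folded lines joined by spaces.

```elisp
(require 'rfc2231)

;; `my-header' is a hypothetical name used only for this sketch.  The
;; value is the Content-Disposition header from the report, with the
;; folded continuation lines joined onto one line.
(let* ((my-header
        (concat "attachment; "
                "filename*0*=UTF-8''%D0%9E%D0%B1%D0%B7%D0%BE%D1%80%202024%20%28%D0%BD%D0; "
                "filename*1*=%B0%20.docx; "
                "size=10"))
       (parsed (rfc2231-parse-string my-header)))
  ;; PARSED has the form (TYPE (ATTRIBUTE . VALUE) ...), so the
  ;; filename parameter is looked up in its cdr.
  (cdr (assq 'filename (cdr parsed))))
```

Going by the gnus-data in the report, evaluating this in an unpatched Emacs should return the truncated filename, and with the patch applied I would expect the full name, the same as the `name' parameter. Note that size comes last in the header; that ordering is what leaves NUMBER at nil when the loop runs.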