From: Konstantin <reich-cv@yandex.ru>
To: Visuwesh <visuweshm@gmail.com>
Cc: Eli Zaretskii <eliz@gnu.org>, 74624@debbugs.gnu.org
Subject: bug#74624: 29.4.50; Gnus cannot parse some filenames(UTF8) in an attachment
Date: Sun, 01 Dec 2024 10:52:30 +0300
Message-ID: <87bjxvsuv5.fsf@localdomain>
In-Reply-To: <875xo3q5tg.fsf@gmail.com> (Visuwesh's message of "Sun, 01 Dec 2024 11:54:11 +0530")


Visuwesh <visuweshm@gmail.com> writes:

> [Sat Nov 30, 2024] Eli Zaretskii wrote:
>
>>> From: Konstantin <reich-cv@yandex.ru>
>>> Date: Sat, 30 Nov 2024 18:59:25 +0300
>>> 
>>> From time to time I get emails with attachments from my colleagues, which they send from
>>> the "Roundcube" web interface.
>>>
>>> Often, I cannot open these attachments with =RET= (gnus-article-press-button)
>>> or save them under the correct name with =o= (gnus-mime-save-part).
>>> (Interestingly, =X-m= (gnus-summary-save-parts) works correctly.)
>>>
>>> The reason is that Gnus cannot correctly parse some attachment filenames.
>>>
>>> Here is an example of such an attachment (taken from gnus-summary-show-raw-article):
>>> 
>>>  --=_d38c0abddd645077f401d42fa430d9d5
>>> Content-Transfer-Encoding: base64
>>> Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document;
>>>  name="=?UTF-8?Q?=D0=9E=D0=B1=D0=B7=D0=BE=D1=80_2024_=28=D0=BD=D0=B0_=2Ed?=
>>>  =?UTF-8?Q?ocx?="
>>> Content-Disposition: attachment;
>>>  filename*0*=UTF-8''%D0%9E%D0%B1%D0%B7%D0%BE%D1%80%202024%20%28%D0%BD%D0;
>>>  filename*1*=%B0%20.docx;
>>>  size=10
>>> 
>>> c2Rmc2FmYXNmCg==
>>> --=_d38c0abddd645077f401d42fa430d9d5--
>>> 
>>> I tried to find the cause. As I see it, the
>>> gnus-data for such an attachment is formed incorrectly:
>>> 
>>> (#<buffer  *mm*-480444>
>>>      ("application/vnd.openxmlformats-officedocument.word..."
>>>      (name . "О️бзор 2024 (на .docx"))
>>>      base64 nil
>>>      ("attachment" (size . "10")
>>>      (filename . "О️бзор 2024 (н\320")) nil nil nil)
>>> 
>>> One can see that the filename is broken.
>>> It should be "О️бзор 2024 (на .docx" just like the name.
>>
>> It looks like Gnus fails to decipher the file name when it is split in
>> the middle of a UTF-8 sequence.
>>
>> I don't know Gnus.  If you can help me by showing where the value of
>> 'gnus-data property is calculated, I might be able to find the bug and
>> suggest a fix.
>
> The decoding of the filename in the Content-Disposition header is done
> in mm-dissect-buffer by calling mail-header-parse-content-disposition.
> Specifically, the work is done in rfc2231-parse-string.  The following
> patch fixes the issue on my end:
>
> diff --git a/lisp/mail/rfc2231.el b/lisp/mail/rfc2231.el
> index 33324cafb5b..632e270a922 100644
> --- a/lisp/mail/rfc2231.el
> +++ b/lisp/mail/rfc2231.el
> @@ -193,7 +193,7 @@ rfc2231-parse-string
>  		     (push (list attribute value encoded) cparams))
>  		    ;; Repetition of a part; do nothing.
>  		    ((and elem
> -			  (null number))
> +			  (null part))
>  		     )
>  		    ;; Concatenate continuation parts.
>  		    (t
>
> NUMBER is the variable used during the parsing portion of the function
> in the big condition-case form above the cl-loop form which the patch
> modifies.  In the header below
>
>     Content-Disposition: attachment;
>       filename*0*=UTF-8''%D0%9E%D0%B1%D0%B7%D0%BE%D1%80%202024%20%28%D0%BD%D0;
>       filename*1*=%B0%20.docx;
>       size=10
>
> the function first parses filename*0* and here NUMBER is 0, then
> filename*1* and here NUMBER is 1.  By the time it finishes parsing size,
> NUMBER is set to nil.  The loop should use the value of NUMBER pushed to
> PARAMETERS as the 3rd element (referred to as `part' by the cl-loop
> form) instead of whatever value NUMBER happened to be when we parsed the
> last element.
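
The explanation above can be modeled outside Emacs. What follows is a hypothetical, heavily simplified Python sketch of the merge step, not the Emacs Lisp code itself: the tuple layout, the `merge` function, and the `use_stale_number` flag are assumptions made for illustration only.

```python
# Simplified model of the merge step; NOT the actual rfc2231.el code.
# Each entry is (attribute, value, part) as recorded during parsing.
PARAMETERS = [
    ("filename", "UTF-8''%D0%9E%D0%B1%D0%B7%D0%BE%D1%80%202024%20%28%D0%BD%D0", 0),
    ("filename", "%B0%20.docx", 1),
    ("size", "10", None),
]

def merge(parameters, use_stale_number):
    # Models NUMBER after parsing: whatever the last parameter left behind.
    stale_number = parameters[-1][2]
    merged = {}
    for attribute, value, part in sorted(parameters, key=lambda p: p[2] or 0):
        # Buggy variant checks the leftover variable; fixed variant checks
        # the part number stored with this very entry.
        checked = stale_number if use_stale_number else part
        if attribute not in merged or part == 0:
            merged[attribute] = value      # first part
        elif checked is None:
            pass                           # treated as a repetition: ignored
        else:
            merged[attribute] += value     # continuation: concatenate
    return merged

buggy = merge(PARAMETERS, use_stale_number=True)
fixed = merge(PARAMETERS, use_stale_number=False)
print(buggy["filename"])
print(fixed["filename"])
```

In this model the stale value is `None` because `size` was parsed last, so the `filename*1*` continuation is mistaken for a repetition and dropped. Checking the per-entry part number instead keeps it.
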

Thank you, the patch does indeed fix this bug.