From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Garjola Dindi
Newsgroups: gmane.emacs.help
Subject: Re: Incorrect rendering of accented characters in HTML e-mail (Gnus)
Date: Sat, 10 Oct 2020 17:53:07 +0200
Message-ID: <87mu0u84d8.fsf@pc-117-162.ovh.com>
References: <87362mp5md.fsf@pc-117-162.ovh.com> <83v9fi41ux.fsf@gnu.org>
 <87tuv287za.fsf@pc-117-162.ovh.com> <83tuv23zud.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
 logging-data="38133"; mail-complaints-to="usenet@ciao.gmane.io"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
To: help-gnu-emacs@gnu.org
Cancel-Lock: sha1:PoJstn8vnJJl3qKsDDz58JOhNZI=
Original-X-From: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane-mx.org@gnu.org Sat Oct 10 17:54:17 2020
Return-path:
Envelope-to: geh-help-gnu-emacs@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io
 with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92)
 (envelope-from ) id 1kRHCL-0009oB-Ej
 for geh-help-gnu-emacs@m.gmane-mx.org; Sat, 10 Oct 2020 17:54:17 +0200
Original-Received: from localhost ([::1]:35220 helo=lists1p.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from ) id 1kRHCK-0008KM-CY
 for geh-help-gnu-emacs@m.gmane-mx.org; Sat, 10 Oct 2020 11:54:16 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:39816)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from ) id 1kRHBP-0008JT-BZ
 for help-gnu-emacs@gnu.org; Sat, 10 Oct 2020 11:53:19 -0400
Original-Received: from static.214.254.202.116.clients.your-server.de
 ([116.202.254.214]:42406 helo=ciao.gmane.io) by eggs.gnu.org
 with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1)
 (envelope-from ) id 1kRHBM-0001su-5T
 for help-gnu-emacs@gnu.org; Sat, 10 Oct 2020 11:53:18 -0400
Original-Received: from list by ciao.gmane.io with local (Exim 4.92)
 (envelope-from ) id 1kRHBJ-0008b5-5F
 for help-gnu-emacs@gnu.org; Sat, 10 Oct 2020 17:53:13 +0200
X-Injected-Via-Gmane: http://gmane.org/
Received-SPF: pass client-ip=116.202.254.214;
 envelope-from=geh-help-gnu-emacs@m.gmane-mx.org; helo=ciao.gmane.io
X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/10 09:17:45
X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy]
X-Spam_score_int: -16
X-Spam_score: -1.7
X-Spam_bar: -
X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9,
 HEADER_FROM_DIFFERENT_DOMAINS=0.249, SPF_HELO_NONE=0.001, SPF_PASS=-0.001
 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: help-gnu-emacs@gnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Users list for the GNU Emacs text editor
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: help-gnu-emacs-bounces+geh-help-gnu-emacs=m.gmane-mx.org@gnu.org
Original-Sender: "help-gnu-emacs"
Xref: news.gmane.io gmane.emacs.help:124416
Archived-At:

On Sat 10-Oct-2020 at 16:44:26 +02, Eli Zaretskii wrote:
>> From: Garjola Dindi
>> Date: Sat, 10 Oct 2020 16:35:05 +0200
>>
>> >> The html part of the e-mails contains
>> >>
>> >> ,----
>> >> | < #part type=text/plain format="flowed" charset="utf-8"
>> >> | disposition=inline nofile=yes>
>> >> `----
>> >>
>> >> so I guess that the html renderer should pick it up. I have tested shr,
>> >> gnus-w3m and w3m and I always get the same result.
>> >>
>> >> I would be grateful if somebody could help me understand what happens.
>> >
>> > How does the character appear in the original HTML?
>>
>> Thanks for your quick response.
>>
>> I don't know if I am inspecting the message correctly, because when I
>> enter the edit mode, all characters appear OK. Therefore, I am not sure
>> if I an seeing the original html.
>
> Can you use some other tool, like wget or curl, to download the text
> as it comes from the server?
>

I have these tools, but I don't know how to use them to get the text
from the imap server. Since I am using offlineimap, I have the file on
disk and I can compare it before and after the dummy edit I do in Gnus.

For instance, before Gnus reads the message, I have:

,----
| Content-Type: multipart/alternative; boundary="=-1601233669-108915-4573-6815-27-="
| MIME-Version: 1.0
|
|
| --=-1601233669-108915-4573-6815-27-=
| Content-Type: text/plain; charset=utf-8; format=flowed
| Content-Transfer-Encoding: 8bit
|
|
|
| Elen Buzaré et Jérôme Robin sont membres fondateurs de l'association
| Stoa Gallica, la première association francophone de stoïcisme, fondée
| le 15 juillet 2017. Ils ont chaleureusement accepté de répondre à
| quelques unes de mes questions.
| --
`----

and after Gnus dummy edit, I have:

,----
| Content-Type: multipart/alternative; boundary="=-=-="
|
| --=-=-=
| Content-Type: text/plain; charset=utf-8; format=flowed
| Content-Disposition: inline
| Content-Transfer-Encoding: quoted-printable
|
|
|
| Elen Buzar=C3=A9 et J=C3=A9r=C3=B4me Robin sont membres fondateurs de l'ass=
| ociation=20
| Stoa Gallica, la premi=C3=A8re association francophone de sto=C3=AFcisme, f=
| ond=C3=A9e=20
| le 15 juillet 2017. Ils ont chaleureusement accept=C3=A9 de r=C3=A9pondre =
| =C3=A0=20
| quelques unes de mes questions.
| --=20
`----

>> I have also noticed that the I also have the same issue with non html
>> e-mails.
>
> "Same issue" in what sense? Is just é replaced by i, or does
> something like that happen with every non-ASCII letter?
>

The characters are incorrectly displayed. In html I have the é -> i
replacement. In plain text, I have the é replaced by \351.

>> For instance, here is what I see in the article buffer:
>>
>> ,----
>> | \311lodie, qui a rejoint l'\351quipe podcast, me dit que sa soeur, qui a une
>> | formation th\351\342trale, serait disponible ponctuellement pour faire des
>> | voix pour des lectures.
>> | Pour le moment on a jamais eu ce besoin mais \347a
>> | peut ouvrir des perspectives.
>> `----
>>
>> (I have replaced the non printable chars with \xxx)
>
> What do you mean by "non printable" here? Do they look like octal
> escapes or do they look like something else?
>

When sending the message to the list, Gnus said that these were
non-printable characters, so I replaced each of them with the sequence
of individual characters that reads as its octal escape.

> Btw, the above is not UTF-8 encoding, it's Latin-1 encoding.
>
> Does the problem go away if you start Emacs as "emacs -Q"?

I have tried, but with "emacs -Q", Gnus does not find the nnmaildir
groups. So I don't know how to proceed.

Thanks again.
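
The two encodings mentioned above can be told apart outside Emacs with a
short Python sketch. It uses only the standard library; the helper names
(octal_escape_to_byte, is_valid_utf8) are made up for illustration, and
the sample bytes are copied from the excerpts above. It only illustrates
the Latin-1 versus UTF-8 distinction and makes no claim about what Gnus
does internally.

```python
import quopri

# Octal escapes seen in the article buffer, copied from the excerpt above.
OCTAL_ESCAPES = ["311", "351", "342", "347"]


def octal_escape_to_byte(escape: str) -> bytes:
    """Turn an octal escape such as "351" into the single byte it denotes."""
    return bytes([int(escape, 8)])


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Each escape is one byte; read as Latin-1 it gives one accented letter.
latin1_chars = [octal_escape_to_byte(e).decode("latin-1") for e in OCTAL_ESCAPES]

# The same lone bytes are not valid UTF-8 on their own.
lone_bytes_are_utf8 = [is_valid_utf8(octal_escape_to_byte(e)) for e in OCTAL_ESCAPES]

# Quoted-printable fragment copied from the file on disk after the dummy edit.
qp_fragment = b"Elen Buzar=C3=A9 et J=C3=A9r=C3=B4me Robin"
utf8_bytes = quopri.decodestring(qp_fragment)  # undo the =XX escapes
utf8_text = utf8_bytes.decode("utf-8")         # two bytes per accented letter

for escape, char in zip(OCTAL_ESCAPES, latin1_chars):
    # Print code points rather than glyphs so the output is terminal-independent.
    print(f"\\{escape} -> U+{ord(char):04X}")
print("lone bytes valid as UTF-8:", lone_bytes_are_utf8)
print("QP fragment valid as UTF-8:", is_valid_utf8(utf8_bytes))
```

Read this way, \351 is the single byte 0xE9, which is é in Latin-1, while
the quoted-printable text on disk carries é as the two UTF-8 bytes C3 A9.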