On Thu, Aug 22, 2024 at 09:44:04PM +0300, Eli Zaretskii wrote:
> > From: Joseph Turner

[...]

> > When decoding, should plz fallback to detect-coding-region instead of utf-8?
>
> If this is HTML, then I think it is okay to trust the headers about
> the charset and default to UTF-8.  The problem with
> detect-coding-region is that some of it is based on guesswork [...]

Yes, and it's incredibly crude guesswork at times. Talk to the
server admin.

With HTML and friends, you get one or two layers of fun, because
they can declare the encoding /within/ the stream (HTML in two
different ways, at least). If the "outer layer" decides to helpfully
recode, then the inner declarations are lying. (I actually had this
with HTML mails: the MIME layer recoded Latin-1 to UTF-8, so the tag
in there was a lie. Needless to say, html2text made mojibake :-)

Cheers
--
t
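
PS: a minimal sketch of the layering problem described above, in Python rather than Emacs Lisp, just to make the failure concrete. It is not what plz does; the function name decode_body, the sample HTML and the regexp for the inner tag are all made up for illustration. It shows the "trust the outer header, default to UTF-8" policy next to what happens if you believe a stale inner declaration instead.

```python
import re
from typing import Optional


def decode_body(raw: bytes, header_charset: Optional[str] = None) -> str:
    """Hypothetical helper: trust the outer (transport) charset, else UTF-8."""
    charset = header_charset or "utf-8"
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or undecodable bytes: fall back to UTF-8
        # and mark the damage instead of guessing.
        return raw.decode("utf-8", errors="replace")


# A document written in Latin-1 that says so in an inner declaration.
# "caf\u00e9" is "cafe" with an acute accent on the e.
page = '<meta charset="iso-8859-1"><p>caf\u00e9</p>'
original = page.encode("iso-8859-1")

# The outer layer "helpfully" recodes the bytes to UTF-8 but leaves the
# inner tag untouched, so the tag is now lying.
recoded = original.decode("iso-8859-1").encode("utf-8")

# Trusting the outer layer gives the right text back.
good = decode_body(recoded, "utf-8")

# Trusting the stale inner declaration gives mojibake.
inner = re.search(rb'charset="([^"]+)"', recoded).group(1).decode("ascii")
bad = recoded.decode(inner)

print(good)
print(bad)
```

Here good round-trips to the original text, while bad has the two UTF-8 bytes of the accented letter read as two separate Latin-1 characters.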