From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
Date: Tue, 11 May 2021 15:04:05 +0300
Message-ID: <83cztx5vey.fsf@gnu.org>
References: <0ed1c9c7-26c1-b801-1910-6d5bb50dec3d@yahoo.de>
 <46c6dd22-ecff-aa7d-e019-1784060574c2@yahoo.de>
 <83zgx265bm.fsf@gnu.org> <83sg2u5zz5.fsf@gnu.org>
 <87fsyur1rx.fsf@gnus.org> <87fsyu4jof.fsf@igel.home>
 <83lf8m5x2c.fsf@gnu.org>
 <35838176-9518-6c4a-eb71-25ce7cb0ec4e@yahoo.de>
 <83k0o65vf9.fsf@gnu.org> <87bl9i4g7m.fsf@igel.home>
In-Reply-To: <87bl9i4g7m.fsf@igel.home> (message from Andreas Schwab on
 Mon, 10 May 2021 20:05:33 +0200)
To: Andreas Schwab
Cc: rdiezmail-emacs@yahoo.de, larsi@gnus.org, 48324@debbugs.gnu.org

> From: Andreas Schwab
> Cc: "R. Diez", larsi@gnus.org,
>  48324@debbugs.gnu.org
> Date: Mon, 10 May 2021 20:05:33 +0200
>
> On Mai 10 2021, Eli Zaretskii wrote:
>
> > FTR, here's a shorter and easier recipe:
> >
> >   emacs -Q
> >   C-x C-f foo.txt RET
> >   C-x RET f utf-8-with-signature-dos RET
> >   1 2 3
> >   C-x C-s
> >   M-x hexl-mode RET
> >   M-x hexl-insert-hex-char RET 00 RET
>
> I guess the gist is that hexl-mode not only needs to account for the EOL
> type, but also for the signature when computing original-point.

Actually, it turned out that wasn't the main problem.  (It was still a
problem, but the same problem happened in a buffer produced by
hexl-find-file.)

The main problems were that (a) hexl.el handled null bytes as
characters that need to be encoded before inserting them (as if they
were non-ASCII characters), and (b) its handling of non-ASCII
characters was incorrect when the encoding of the original file used a
BOM (because encode-coding-char didn't remove the BOM from the encoded
byte sequence).  By contrast, hexl-find-file visits the file
literally, so its encoding of a null byte was trivially correct.

This should now be fixed on the master branch.

The capability of inserting multibyte characters via Hexl is somewhat
problematic, so I made a point of describing the issues in the
relevant doc strings (because the problems are intrinsic and IMO hard
or impossible to solve in general).
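
For readers who want to see the two failure modes in isolation, here is a minimal sketch in Python. It is an analogy only, not the hexl.el code and not the actual fix: Python's "utf-8-sig" codec stands in for a with-signature coding system, and the function names encode_char_naive and bytes_to_insert are hypothetical names made up for this illustration.

```python
import codecs


def encode_char_naive(ch: str) -> bytes:
    """Encode one character with a signature codec, keeping whatever it emits.

    Analogy for problem (b): the codec prepends the BOM on every call,
    so each inserted character would carry its own copy of the signature.
    """
    return ch.encode("utf-8-sig")


def bytes_to_insert(ch: str) -> bytes:
    """Hypothetical corrected helper, assuming a UTF-8 target encoding.

    ASCII characters (including the null byte) are passed through as a
    single raw byte, which sidesteps problem (a).  Anything else is
    encoded and the leading BOM, if present, is stripped.
    """
    if ord(ch) < 0x80:
        return bytes([ord(ch)])
    raw = ch.encode("utf-8-sig")
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw


if __name__ == "__main__":
    print(encode_char_naive("\x00"))   # BOM followed by the null byte
    print(bytes_to_insert("\x00"))     # just the null byte
    print(bytes_to_insert("\u00e9"))   # UTF-8 bytes of e-acute, no BOM
```

The naive variant returns four bytes for a null character (EF BB BF 00), which mirrors the duplicated BOM seen in the recipe quoted above; the second variant returns the single byte 00.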