From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
Date: Sat, 02 Jul 2022 19:37:07 +0300
Message-ID: <83k08vbhe4.fsf@gnu.org>
In-Reply-To: <87y1xbxzio.fsf@gnus.org> (message from Lars Ingebrigtsen on Sat, 02 Jul 2022 18:14:39 +0200)
To: Lars Ingebrigtsen
Cc: rgm@gnu.org, schwab@linux-m68k.org, 48324@debbugs.gnu.org

> From: Lars Ingebrigtsen
> Cc: Glenn Morris, schwab@linux-m68k.org, 48324@debbugs.gnu.org
> Date: Sat, 02 Jul 2022 18:14:39 +0200
>
> Eli Zaretskii writes:
>
> > This actually reveals a design flaw in string-limit: we cannot simply
> > use encode-coding-char to encode the characters one by one. I added a
> > FIXME comment to explain why, as I don't currently have any clever
> > ideas for how to implement it more correctly, except by iterations,
> > which is inelegant. Ideas welcome.
>
> Hm... do we have some way of knowing that the coding system we're using
> is one that should have a BOM? And a function to remove the BOM?

The problem is not just with BOM. The problem will happen with any
coding-system that produces prefix and/or suffix bytes when it encodes
strings. The FIXME I added mentions ISO-2022 7-bit encodings as
another example. And then there are coding-systems with
pre-write-conversion, and those can produce any additions they like.

> If we had both, then we could strip the BOM from the individual chars,
> and add one to the front.

AFAIR, what we have now already handles BOM in coding-systems that are
known to produce a BOM. See encode-coding-char.
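
To make the point about prefix and suffix bytes concrete, here is a minimal sketch in Python rather than Emacs Lisp. It is only an analogy: Python's `utf-8-sig` and `iso2022_jp` codecs stand in for Emacs coding-systems that emit a BOM or ISO-2022 escape sequences, and the helper names (`encoded_len_whole`, `encoded_len_per_char`, `limit_naive`, `limit_iterative`) are hypothetical and not part of Emacs or of string-limit. It shows why summing per-character encodings miscounts, and what the "iterations" fallback might look like.

```python
def encoded_len_whole(text: str, codec: str) -> int:
    """Byte length when the whole string is encoded in one call."""
    return len(text.encode(codec))


def encoded_len_per_char(text: str, codec: str) -> int:
    """Byte length obtained by encoding each character separately.

    Any prefix/suffix the codec emits per call (BOM, escape sequences)
    is counted once per character instead of once per string.
    """
    return sum(len(ch.encode(codec)) for ch in text)


def limit_naive(text: str, codec: str, max_bytes: int) -> str:
    """Truncate by accumulating per-character sizes (the flawed approach)."""
    kept, used = [], 0
    for ch in text:
        size = len(ch.encode(codec))
        if used + size > max_bytes:
            break
        kept.append(ch)
        used += size
    return "".join(kept)


def limit_iterative(text: str, codec: str, max_bytes: int) -> str:
    """Truncate by re-encoding ever shorter prefixes until one fits.

    Correct for codecs with prefix/suffix bytes, but quadratic in the
    worst case, which is the inelegance of iterating.
    """
    for end in range(len(text), -1, -1):
        if len(text[:end].encode(codec)) <= max_bytes:
            return text[:end]
    return ""  # even the empty string does not fit (e.g. BOM alone is too big)


if __name__ == "__main__":
    samples = (("utf-8", "abcd"), ("utf-8-sig", "abcd"), ("iso2022_jp", "日本"))
    for codec, sample in samples:
        print(codec, encoded_len_whole(sample, codec),
              encoded_len_per_char(sample, codec))
    print(repr(limit_naive("abcd", "utf-8-sig", 6)))
    print(repr(limit_iterative("abcd", "utf-8-sig", 6)))
```

With plain `utf-8` the two length functions agree. With the other two codecs the per-character sum is larger, so the naive limiter keeps fewer characters than would actually fit. Stripping a known BOM from each character would not help with coding-systems whose additions are arbitrary, such as those with pre-write-conversion, which is why the sketch falls back to re-encoding prefixes.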