From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#44486: 27.1; C-@ chars corrupt elisp buffer
Date: Sat, 14 Nov 2020 17:53:57 -0500
To: Eli Zaretskii
Cc: thievol@posteo.net, larsi@gnus.org, schwab@linux-m68k.org, 44486@debbugs.gnu.org
In-Reply-To: <83y2j3u7zv.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 14 Nov 2020 20:08:04 +0200")

>> If `utf-8` is preferable over `prefer-utf-8` for this usage I think
>> the problem is in `prefer-utf-8` since it was introduced
>> specifically for that.
>
> The implementation doesn't support your POV.

Then I think the implementation is in error.

>> >> I believe if there's a NUL byte in such a file but it otherwise doesn't
>> >> contain any invalid UTF-8 byte sequence, it will result in better
>> >> behavior if we treat it as UTF-8 than as binary.
>> > We treat null bytes as the _single_ telltale sign of a binary file.
>>
>> A .el file should *never* be a binary file.
>
> We are not talking about .el files, we are talking about _any_ file
> read using prefer-utf-8.

`prefer-utf-8` was not introduced because it seemed like a good idea, in
the hope that someone would find it useful. It was introduced to solve
a concrete need, which is that of `.el` files. It's quite possible that
there are other situations that have the same needs as `.el` files, but
from where I stand it looks like "the needs of .el files (and similar
cases)" should determine the intended behavior of `prefer-utf-8` rather
than its current implementation.

> For .el files, we can always bind inhibit-null-byte-detection to t
> when we load or visit such files.

We could, but I'm having trouble imagining a situation where we'd want
to use `prefer-utf-8` and not inhibit "NUL means binary".

The "NUL means binary" heuristic fundamentally says that `binary` is the
first coding system we try, and only if this one fails (for lack of NUL
bytes) do we consider others. But for `prefer-utf-8` we should first
consider utf-8, and only if this fails should we consider others
(potentially including `binary` if you want; my opinion is not as strong
there).

> I'm not talking about .el files. The coding-system's applicability is
> wider than that.

Could be. But it's its "raison d'être" (and AFAIK currently still the
sole application), so it should handle this case as best it can.


        Stefan
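
The two detection orders contrasted above can be sketched as a toy model.
This is not Emacs's detection code: the function names are made up for
illustration, and "other" is only a placeholder for whatever further
detection would pick. It merely shows how a buffer that is valid UTF-8
but contains a NUL byte is classified differently depending on which
candidate is tried first.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if DATA decodes as UTF-8 (a NUL byte is valid UTF-8)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def detect_nul_first(data: bytes) -> str:
    """Model of "NUL means binary": `binary` is tried first and is
    rejected only when the data contains no NUL byte."""
    if b"\x00" in data:
        return "binary"
    return "utf-8" if is_valid_utf8(data) else "other"


def detect_utf8_first(data: bytes) -> str:
    """Model of the order argued for here: utf-8 is tried first, and
    `binary` is considered only once utf-8 has failed."""
    if is_valid_utf8(data):
        return "utf-8"
    return "binary" if b"\x00" in data else "other"


# A .el-like buffer that is valid UTF-8 but contains a stray NUL (C-@).
sample = b'(message "a\x00b")\n'
print(detect_nul_first(sample))   # binary
print(detect_utf8_first(sample))  # utf-8
```

In this model the two orders only disagree on data that is valid UTF-8
and also contains a NUL, which is exactly the case under discussion.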