From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#44486: 27.1; C-@ chars corrupt elisp buffer
Date: Sat, 14 Nov 2020 12:55:51 -0500
To: Eli Zaretskii
Cc: thievol@posteo.net, larsi@gnus.org, schwab@linux-m68k.org, 44486@debbugs.gnu.org
In-Reply-To: <83eekvvruq.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 14 Nov 2020 18:13:49 +0200")

>> >> Actually, for prefer-utf-8 files, I think we never want to automatically
>> >> fallback to binary.
>> > I think you are assuming prefer-utf-8 is something other than what it
>> > is. It is not a variant of UTF-8, it is a variant of 'undecided'
>> > (i.e. it starts by detecting the encoding), which prefers UTF-8 if
>> > that can decode the text.
>> My position is not based on principles but on pragmatic concerns.
>> AFAIK `prefer-utf-8` is only ever used for files which are known to
>> contain text and should almost always contain UTF-8 text.
> For those, we should use utf-8, not prefer-utf-8.

No, `utf-8` should be used when other coding systems should be
considered as errors (i.e. not "almost always" but "always"), whereas
`prefer-utf-8` is for use when utf-8 is the most likely one and other
coding systems should be tried only when there's some evidence that the
file actually doesn't use utf-8.

`prefer-utf-8` was introduced specifically for `.el` files (and I don't
know of any other use of that encoding so far). If `utf-8` is
preferable over `prefer-utf-8` for this usage I think the problem is in
`prefer-utf-8` since it was introduced specifically for that.

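To make the distinction concrete, here is a rough sketch in Python (not Emacs's actual decoder, which is written in C). The function names and the fixed `latin-1` fallback are made up for illustration; the real `prefer-utf-8` runs Emacs's general detection rather than a single hard-coded fallback.

```python
from typing import Tuple


def decode_strict_utf8(data: bytes) -> str:
    """Model of `utf-8`: any input that is not valid UTF-8 is an error."""
    # bytes.decode raises UnicodeDecodeError on invalid UTF-8.
    return data.decode("utf-8")


def decode_prefer_utf8(data: bytes, fallback: str = "latin-1") -> Tuple[str, str]:
    """Model of `prefer-utf-8` as described above (hypothetical helper).

    UTF-8 is tried first; another coding system is considered only when
    there is evidence against UTF-8, here an invalid byte sequence.
    Returns the decoded text and the name of the codec that was used.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: a single fixed fallback stands in for real detection.
        return data.decode(fallback), fallback
```

With this model, `decode_prefer_utf8(b"caf\xe9")` falls back to the other codec, while `decode_strict_utf8(b"caf\xe9")` raises an error instead of guessing.
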
>> I believe if there's a NUL byte in such a file but it otherwise doesn't
>> contain any invalid UTF-8 byte sequence, it will result in better
>> behavior if we treat it as UTF-8 than as binary.
> We treat null bytes as the _single_ telltale sign of a binary file.

A `.el` file should *never* be a binary file.

> If we disable that in coding-systems that are supposed to _detect_
> encoding, we will never be able to detect binary files.

In which scenario would it be beneficial to detect a `.el` file as being
binary instead of utf-8?


        Stefan


PS: Especially since NUL bytes can and do occur in ELisp code.
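
A NUL byte is itself valid UTF-8 (U+0000 encodes as the single byte 0x00), so the two policies under discussion really do diverge on such a file. A small Python sketch of the difference, assuming detection is reduced to just these two checks; `classify` and its `nul_means_binary` flag are hypothetical names, not Emacs functions:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def classify(data: bytes, nul_means_binary: bool) -> str:
    """Toy classifier contrasting the two positions in this thread.

    nul_means_binary=True  models "a NUL byte is the telltale sign of binary".
    nul_means_binary=False models "valid UTF-8 stays UTF-8 even with NULs".
    """
    if nul_means_binary and b"\x00" in data:
        return "binary"
    return "utf-8" if is_valid_utf8(data) else "other"


# An Elisp-looking line whose string literal contains a raw NUL byte.
sample = b'(setq sep "\x00")\n'
```

Here `classify(sample, nul_means_binary=True)` reports the file as binary, whereas `classify(sample, nul_means_binary=False)` keeps it as UTF-8 text.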