From: Eli Zaretskii
To: Alan Mackenzie
Cc: 50946@debbugs.gnu.org, joaotavora@gmail.com
Newsgroups: gmane.emacs.bugs
Subject: bug#50946: insert-file-contents can corrupt buffers. [Was: bug#50946: Emacs-28: Inadequate coding in hack-elisp-shorthands]
Date: Sun, 03 Oct 2021 15:40:24 +0300
Message-ID: <83lf3a8eo7.fsf@gnu.org>
In-Reply-To: (message from Alan Mackenzie on Sun, 3 Oct 2021 12:10:19 +0000)

> Date: Sun, 3 Oct 2021 12:10:19 +0000
> Cc: joaotavora@gmail.com, 50946@debbugs.gnu.org
> From: Alan Mackenzie
>
> Create a file ~/utf8-chars.txt as follows.  All the non-ascii characters
> are 2-byte German UTF8 characters.  Only the Q is an ascii character.
> There is a LF at the end:
>
>     ÄäÖöQÜüß
>
> Now, in an empty buffer,
>
>     M-: (insert-file-contents "~/utf8-chars.txt" nil 3 15)
>
> .. The first character of this buffer is now the Emacs encoding of the
> raw byte 0xa4.
>
> Now do
>
>     M-: (insert-file-contents "~/utf8-chars.txt" nil 0 3)
>
> The entire buffer, apart from the Q and the LF, now consists of raw
> bytes, and the buffer is now 16 characters long.  (Is this a bug?).
> Note that the Q is now further back from the end of the buffer than it
> should be.

OK, thanks.

> My opinion, for what it's worth, is that using insert-file-contents in
> hack-elisp-shorthands is a Bad Thing.  Even if it is possible to get it
> working rigorously, it is surely not worth the trouble.  Why not simply
> visit the file in a buffer, and check for buffer local variables in the
> normal fashion?

We already visit the file when we load it.

João, why didn't you simply insert

  (alist-get 'elisp-shorthands (hack-local-variables--find-variables))

in load-with-code-conversion, immediately after it calls
insert-file-contents?  Are there any problems with that, and if so,
what are they?

> There are bugs in the documentation of insert-file-contents in the elisp
> manual.  It confuses bytes with characters, and it fails to mention the
> need to keep BEG and END at character boundaries.  I propose installing
> the following patch to the release branch:

Thanks, I will review this later.  However:

> @@ -580,7 +583,8 @@ Reading from Files
>  This function works like @code{insert-file-contents} except that it
>  does not run @code{after-insert-file-functions}, and does not do
>  format decoding, character code conversion, automatic uncompression,
> -and so on.
> +and so on.  @var{beg} and @var{end}, if non-@code{nil}, should be at
> +character boundaries, as in @code{insert-file-contents}.
>  @end defun

I don't think I understand why you made this second correction:
insert-file-contents-literally deals with bytes to begin with.

> The doc strings of insert-file-contents\(-literally\)? will also need to
> be updated.

In some sense, yes.
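

A minimal sketch of the byte-versus-character issue in the recipe quoted above, written in Python purely for illustration.  It assumes the test file is the UTF-8 encoding of the quoted line followed by a LF, and it uses Python's strict UTF-8 decoder as a stand-in; it does not model how Emacs represents raw bytes in a buffer.  The helper names is_char_boundary and decodes_cleanly are hypothetical.

```python
# Assumed contents of ~/utf8-chars.txt: seven 2-byte characters, one ASCII
# "Q", and a trailing LF.  Escapes are used so the source stays pure ASCII.
TEXT = "\u00c4\u00e4\u00d6\u00f6Q\u00dc\u00fc\u00df\n"  # "ÄäÖöQÜüß" + LF
DATA = TEXT.encode("utf-8")


def is_char_boundary(data: bytes, offset: int) -> bool:
    """Return True if OFFSET does not land on a UTF-8 continuation byte.

    Continuation bytes have the bit pattern 10xxxxxx.  Offsets at or
    beyond either end of the data are treated as boundaries.
    """
    if offset <= 0 or offset >= len(data):
        return True
    return (data[offset] & 0xC0) != 0x80


def decodes_cleanly(data: bytes, beg: int, end: int) -> bool:
    """Return True if the byte range [BEG, END) is valid UTF-8 on its own."""
    try:
        data[beg:end].decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print("file size in bytes:", len(DATA))
    for beg, end in [(3, 15), (0, 3), (4, 15), (0, 4)]:
        print(
            f"range {beg}..{end}:",
            "beg on boundary" if is_char_boundary(DATA, beg) else "beg mid-character",
            "/",
            "end on boundary" if is_char_boundary(DATA, end) else "end mid-character",
            "/",
            "decodes" if decodes_cleanly(DATA, beg, end) else "does not decode",
        )
```

Under these assumptions, byte offset 3 is the second byte of ä (0xa4), which matches the raw byte reported in the recipe, whereas offset 4 is the first byte of Ö and so is a valid place to start reading.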