From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#50946: insert-file-contents can corrupt buffers.
Date: Sun, 03 Oct 2021 18:25:57 +0300
Message-ID: <83czom870a.fsf@gnu.org>
Cc: 50946@debbugs.gnu.org, joaotavora@gmail.com
To: Alan Mackenzie

> Date: Sun, 3 Oct 2021 15:04:27 +0000
> Cc: joaotavora@gmail.com, 50946@debbugs.gnu.org
> From: Alan Mackenzie
>
> Here is an updated patch, superseding my patch from midday.
> I have amended the descriptions of the two functions, replacing
> "corruption" of the buffer by "inserting raw-text characters" in the
> first function, and added explanation to the second.

Thanks, see some comments below.

> I wasn't able to find a suitable target for a cross-reference explaining
> "raw-text".

I think "Coding System Basics" is where we describe that encoding.

> --- a/doc/lispref/files.texi
> +++ b/doc/lispref/files.texi
> @@ -556,14 +556,18 @@ Reading from Files
>
>  If @var{beg} and @var{end} are non-@code{nil}, they should be numbers
>  that are byte offsets specifying the portion of the file to insert.
> -In this case, @var{visit} must be @code{nil}.  For example,
> +In this case, @var{visit} must be @code{nil}.  Be careful to ensure
> +that these byte positions are at character boundaries.  Otherwise,
> +Emacs's character code conversion will insert one or more raw-text
> +characters into the buffer, which is probably not what you want.  For

This isn't the whole story.  The problem is mainly with the
autodetection of encoding: it can go awry if you give it only a
portion of the file.  But if you bind coding-system-for-read, that
problem goes away, and the only effect of using BEG and END arguments
is limited to the first character/byte read.

In particular, if you read a file in chunks, the character at the
boundary could end up as 2 or more raw bytes -- but as long as you
bind coding-system-for-read, no other parts are supposed to be
affected.  And the problematic sequence of raw bytes can then be
converted back to the original character with very simple Lisp.

So the text you propose is too "frightening", in that it basically
says "don't use that".  Which is too tough, because valid use cases
for that feature do exist, and if the programmer knows what he/she is
doing it doesn't have to produce garbled buffers.

For the manual, we need more informative text, which mentions
coding-system-for-read.
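
To make the point about coding-system-for-read concrete, here is a
minimal sketch, not taken from the patch or from Emacs itself.  The
helper name my-insert-file-chunk and the UTF-8 default are assumptions
made only for illustration; the calls it uses (insert-file-contents
with BEG and END, and a let-binding of coding-system-for-read) are the
ones discussed above.

```elisp
;; Hypothetical helper for illustration; the name and the UTF-8
;; default are assumptions, not part of Emacs.
(defun my-insert-file-chunk (file beg end &optional coding)
  "Insert the bytes BEG..END of FILE at point, decoded with CODING.
CODING defaults to `utf-8'.  Binding `coding-system-for-read'
bypasses encoding autodetection, which can guess wrong when it
sees only a portion of the file."
  (let ((coding-system-for-read (or coding 'utf-8)))
    ;; VISIT must be nil when BEG and END are given.
    (insert-file-contents file nil beg end)))
```

Called as (my-insert-file-chunk "/some/file" 0 4096), this should
decode everything in the chunk normally, with at most a character
split at a chunk edge showing up as raw bytes.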