unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Alan Mackenzie <acm@muc.de>
Cc: 50946@debbugs.gnu.org, joaotavora@gmail.com
Subject: bug#50946: insert-file-contents can corrupt buffers.
Date: Sun, 03 Oct 2021 18:25:57 +0300
Message-ID: <83czom870a.fsf@gnu.org>
In-Reply-To: <YVnGe06VlKnKDFX8@ACM> (message from Alan Mackenzie on Sun, 3 Oct 2021 15:04:27 +0000)

> Date: Sun, 3 Oct 2021 15:04:27 +0000
> Cc: joaotavora@gmail.com, 50946@debbugs.gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> Here is an updated patch, superseding my patch from midday.  I have
> amended the descriptions of the two functions, replacing "corruption" of
> the buffer by "inserting raw-text characters" in the first function, and
> added explanation to the second.

Thanks, see below some comments.

> I wasn't able to find a suitable target for a cross-reference explaining
> "raw-text".

I think "Coding System Basics" is where we describe that encoding.

> --- a/doc/lispref/files.texi
> +++ b/doc/lispref/files.texi
> @@ -556,14 +556,18 @@ Reading from Files
>  
>  If @var{beg} and @var{end} are non-@code{nil}, they should be numbers
>  that are byte offsets specifying the portion of the file to insert.
> -In this case, @var{visit} must be @code{nil}.  For example,
> +In this case, @var{visit} must be @code{nil}.  Be careful to ensure
> +that these byte positions are at character boundaries.  Otherwise,
> +Emacs's character code conversion will insert one or more raw-text
> +characters into the buffer, which is probably not what you want.  For

This isn't the whole story.  The problem is mainly with the
autodetection of encoding: it can go awry if you give it only a
portion of the file.  But if you bind coding-system-for-read, that
problem goes away, and the only effect of using BEG and END arguments
is limited to the first character/byte read.  In particular, if you
read a file in chunks, the character at the boundary could end up as 2
or more raw bytes -- but as long as you bind coding-system-for-read,
no other parts are supposed to be affected.  And the problematic
sequence of raw bytes can then be converted back to the original
character with very simple Lisp.
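
To illustrate the point about binding coding-system-for-read, here is
a minimal sketch of reading a byte range of a file.  The function name
my-insert-file-chunk is hypothetical, and the choice of utf-8 is an
assumption made for the example; a real caller would bind whatever
encoding the file is known to use.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-insert-file-chunk (file beg end)
  "Insert bytes BEG to END of FILE at point, decoding them as UTF-8.
BEG and END are byte offsets into FILE, as for `insert-file-contents'."
  ;; Binding `coding-system-for-read' bypasses autodetection of the
  ;; encoding, which could go awry when it sees only part of the file.
  ;; UTF-8 is an assumption here; bind the file's actual encoding.
  (let ((coding-system-for-read 'utf-8))
    ;; VISIT must be nil when BEG and END are given.
    (insert-file-contents file nil beg end)))
```

With the encoding fixed like this, a chunk boundary that falls inside
a multibyte character should affect only the bytes at that boundary,
not the rest of the inserted text.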

So the text you propose is too "frightening", in that it basically
says "don't use that".  That is too strong, because valid use cases
for that feature do exist, and a programmer who knows what he/she is
doing need not end up with a garbled buffer.  For the manual, we need
more informative text, which mentions coding-system-for-read.




