unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#9800: Incomplete truncated file buffers from the /proc filesystem
@ 2011-10-19 22:59 Juri Linkov
  2011-10-20  8:22 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Juri Linkov @ 2011-10-19 22:59 UTC (permalink / raw)
  To: 9800

Large files from the /proc filesystem are visited incompletely,
their file buffers are truncated at the position 65536.
One possible test case to reproduce this is to load enough libraries
with e.g. (imagemagick-register-types) and visit Emacs's maps file
in /proc/$PID/maps.

Andreas said in http://lists.gnu.org/archive/html/emacs-devel/2011-10/msg00782.html
that it's due to this code in `insert-file-contents':

	  /* The file size returned from stat may be zero, but data
	     may be readable nonetheless, for example when this is a
	     file in the /proc filesystem.  */
	  if (end_offset == 0)
	    end_offset = READ_BUF_SIZE;

How this could be fixed?  Should it keep reading while more data can be
read from the file?





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-19 22:59 bug#9800: Incomplete truncated file buffers from the /proc filesystem Juri Linkov
@ 2011-10-20  8:22 ` Eli Zaretskii
  2011-10-20  8:44   ` Andreas Schwab
  2011-10-24  2:53 ` Paul Eggert
  2022-02-07  0:10 ` Lars Ingebrigtsen
  2 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2011-10-20  8:22 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 9800

> From: Juri Linkov <juri@jurta.org>
> Date: Thu, 20 Oct 2011 01:59:42 +0300
> 
> Large files from the /proc filesystem are visited incompletely,
> their file buffers are truncated at the position 65536.
> One possible test case to reproduce this is to load enough libraries
> with e.g. (imagemagick-register-types) and visit Emacs's maps file
> in /proc/$PID/maps.
> 
> Andreas said in http://lists.gnu.org/archive/html/emacs-devel/2011-10/msg00782.html
> that it's due to this code in `insert-file-contents':
> 
> 	  /* The file size returned from stat may be zero, but data
> 	     may be readable nonetheless, for example when this is a
> 	     file in the /proc filesystem.  */
> 	  if (end_offset == 0)
> 	    end_offset = READ_BUF_SIZE;
> 
> How this could be fixed?  Should it keep reading while more data can be
> read from the file?

Does lseek work on these files?  If so, we could use something like

   lseek (fd, 0L, SEEK_END)

to find its size.  Or we could treat those files as non-regular, where
we set end_offset to TYPE_MAXIMUM (off_t) -- would that work with
these files?





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-20  8:22 ` Eli Zaretskii
@ 2011-10-20  8:44   ` Andreas Schwab
  2023-02-12  7:38     ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Andreas Schwab @ 2011-10-20  8:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 9800

Eli Zaretskii <eliz@gnu.org> writes:

> Does lseek work on these files?

No.  The contents are completely dynamic, generated on-the-fly when
reading.

> Or we could treat those files as non-regular,

That is the only option.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-19 22:59 bug#9800: Incomplete truncated file buffers from the /proc filesystem Juri Linkov
  2011-10-20  8:22 ` Eli Zaretskii
@ 2011-10-24  2:53 ` Paul Eggert
  2011-10-24 21:50   ` Richard Stallman
  2011-11-03 20:32   ` Lars Magne Ingebrigtsen
  2022-02-07  0:10 ` Lars Ingebrigtsen
  2 siblings, 2 replies; 15+ messages in thread
From: Paul Eggert @ 2011-10-24  2:53 UTC (permalink / raw)
  To: 9800

It strikes me that regular files can go as you read them, too,
and that Emacs is not doing this properly.  That is, Emacs should
be fixed so that it continues to read from a growing regular file
until a proper EOF is reached (i.e., until read returns 0).

If Emacs is fixed in this way, it will read these /proc files
correctly too.





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-24  2:53 ` Paul Eggert
@ 2011-10-24 21:50   ` Richard Stallman
  2011-10-24 22:02     ` Paul Eggert
  2011-11-03 20:32   ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Stallman @ 2011-10-24 21:50 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 9800

    It strikes me that regular files can go as you read them, too,
    and that Emacs is not doing this properly.  That is, Emacs should
    be fixed so that it continues to read from a growing regular file
    until a proper EOF is reached (i.e., until read returns 0).

I think there was a reason for doing it this way.  Perhaps so as to
allocate the space before reading the file.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use free telephony http://directory.fsf.org/category/tel/





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-24 21:50   ` Richard Stallman
@ 2011-10-24 22:02     ` Paul Eggert
  2023-02-12 10:21       ` Arsen Arsenović via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 15+ messages in thread
From: Paul Eggert @ 2011-10-24 22:02 UTC (permalink / raw)
  To: rms; +Cc: 9800

On 10/24/11 14:50, Richard Stallman wrote:
> I think there was a reason for doing it this way.  Perhaps so as to
> allocate the space before reading the file.

Yes, that sounds right.  And in the typical case where the file is not
growing, that allocates space efficiently.  If the file is growing, though,
it's OK to allocate more space after discovering that the initial
allocation was too small.





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-24  2:53 ` Paul Eggert
  2011-10-24 21:50   ` Richard Stallman
@ 2011-11-03 20:32   ` Lars Magne Ingebrigtsen
  2011-11-04  9:36     ` Juri Linkov
  1 sibling, 1 reply; 15+ messages in thread
From: Lars Magne Ingebrigtsen @ 2011-11-03 20:32 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 9800

Paul Eggert <eggert@cs.ucla.edu> writes:

> It strikes me that regular files can go as you read them, too,
> and that Emacs is not doing this properly.  That is, Emacs should
> be fixed so that it continues to read from a growing regular file
> until a proper EOF is reached (i.e., until read returns 0).

Sounds like a good idea, but remember to bail out some time before
reading the infinitely big files to the very end.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-11-03 20:32   ` Lars Magne Ingebrigtsen
@ 2011-11-04  9:36     ` Juri Linkov
  2011-11-04 10:54       ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Juri Linkov @ 2011-11-04  9:36 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Paul Eggert, 9800

>> It strikes me that regular files can go as you read them, too,
>> and that Emacs is not doing this properly.  That is, Emacs should
>> be fixed so that it continues to read from a growing regular file
>> until a proper EOF is reached (i.e., until read returns 0).
>
> Sounds like a good idea, but remember to bail out some time before
> reading the infinitely big files to the very end.  :-)

Maybe limit the reading by the value of `large-file-warning-threshold'.





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-11-04  9:36     ` Juri Linkov
@ 2011-11-04 10:54       ` Eli Zaretskii
  0 siblings, 0 replies; 15+ messages in thread
From: Eli Zaretskii @ 2011-11-04 10:54 UTC (permalink / raw)
  To: Juri Linkov; +Cc: larsi, eggert, 9800

> From: Juri Linkov <juri@jurta.org>
> Date: Fri, 04 Nov 2011 11:36:01 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, 9800@debbugs.gnu.org
> 
> >> It strikes me that regular files can go as you read them, too,
> >> and that Emacs is not doing this properly.  That is, Emacs should
> >> be fixed so that it continues to read from a growing regular file
> >> until a proper EOF is reached (i.e., until read returns 0).
> >
> > Sounds like a good idea, but remember to bail out some time before
> > reading the infinitely big files to the very end.  :-)
> 
> Maybe limit the reading by the value of `large-file-warning-threshold'.

IMO, that value is ridiculously low for such use.  Maybe multiply it
by some large factor.





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-19 22:59 bug#9800: Incomplete truncated file buffers from the /proc filesystem Juri Linkov
  2011-10-20  8:22 ` Eli Zaretskii
  2011-10-24  2:53 ` Paul Eggert
@ 2022-02-07  0:10 ` Lars Ingebrigtsen
  2022-02-07 19:41   ` Juri Linkov
  2 siblings, 1 reply; 15+ messages in thread
From: Lars Ingebrigtsen @ 2022-02-07  0:10 UTC (permalink / raw)
  To: Juri Linkov; +Cc: 9800

Juri Linkov <juri@jurta.org> writes:

> Large files from the /proc filesystem are visited incompletely,
> their file buffers are truncated at the position 65536.

It seems like this issue has been exacerbated somewhat since this was
reported.

(with-temp-buffer
  (insert-file-contents "/proc/cpuinfo")
  (buffer-size))
=> 16384

(with-temp-buffer
  (call-process "cat" nil t nil "/proc/cpuinfo")
  (buffer-size))
=> 24626

But perhaps it's dependent on the block size.  (This is on
Debian/bookworm with Emacs 29.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2022-02-07  0:10 ` Lars Ingebrigtsen
@ 2022-02-07 19:41   ` Juri Linkov
  0 siblings, 0 replies; 15+ messages in thread
From: Juri Linkov @ 2022-02-07 19:41 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 9800

>> Large files from the /proc filesystem are visited incompletely,
>> their file buffers are truncated at the position 65536.
>
> It seems like this issue has been exacerbated somewhat since this was
> reported.
>
> (with-temp-buffer
>   (insert-file-contents "/proc/cpuinfo")
>   (buffer-size))
> => 16384
>
> (with-temp-buffer
>   (call-process "cat" nil t nil "/proc/cpuinfo")
>   (buffer-size))
> => 24626
>
> But perhaps it's dependent on the block size.  (This is on
> Debian/bookworm with Emacs 29.)

I confirm that the block size now is decreased from 65536 to 16384,
so more buffers are truncated.





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-20  8:44   ` Andreas Schwab
@ 2023-02-12  7:38     ` Eli Zaretskii
  2023-02-12  9:24       ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2023-02-12  7:38 UTC (permalink / raw)
  To: Andreas Schwab, Paul Eggert; +Cc: juri, 9800

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Thu, 20 Oct 2011 10:44:57 +0200
> Cc: Juri Linkov <juri@jurta.org>, 9800@debbugs.gnu.org
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Does lseek work on these files?
> 
> No.  The contents are completely dynamic, generated on-the-fly when
> reading.
> 
> > Or we could treat those files as non-regular,
> 
> That is the only option.

Are all the files in "/proc" of this nature?  If so, we could consider
all of the files in that directory non-regular; if that is all that's
needed to visit /proc/foo files, insert-file-contents already has code
to deal with non-regular files.

Paul, do you see any downsides to such heuristic?  We could make it a
user option if the heuristic could sometimes backfire, I guess.





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2023-02-12  7:38     ` Eli Zaretskii
@ 2023-02-12  9:24       ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 15+ messages in thread
From: Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-02-12  9:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: juri, Paul Eggert, Andreas Schwab, 9800

There is /proc/config.gz which does report a size, contrary to other files which report a size of 0. 

> On Feb 12, 2023, at 15:41, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> 
>> 
>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Date: Thu, 20 Oct 2011 10:44:57 +0200
>> Cc: Juri Linkov <juri@jurta.org>, 9800@debbugs.gnu.org
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>>> Does lseek work on these files?
>> 
>> No.  The contents are completely dynamic, generated on-the-fly when
>> reading.
>> 
>>> Or we could treat those files as non-regular,
>> 
>> That is the only option.
> 
> Are all the files in "/proc" of this nature?  If so, we could consider
> all of the files in that directory non-regular; if that is all that's
> needed to visit /proc/foo files, insert-file-contents already has code
> to deal with non-regular files.
> 
> Paul, do you see any downsides to such heuristic?  We could make it a
> user option if the heuristic could sometimes backfire, I guess.

Best,


RY





^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2011-10-24 22:02     ` Paul Eggert
@ 2023-02-12 10:21       ` Arsen Arsenović via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-02-13 20:47         ` Paul Eggert
  0 siblings, 1 reply; 15+ messages in thread
From: Arsen Arsenović via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-02-12 10:21 UTC (permalink / raw)
  To: Paul Eggert; +Cc: rms, 9800

[-- Attachment #1: Type: text/plain, Size: 1655 bytes --]

Hi,

I was just debugging this before I found the bug report.  The diagnosis
is right: st_size is wrong for proc files (and, I'd argue, for regular
files sometimes).  So, I agree with Paul.

Paul Eggert <eggert@cs.ucla.edu> writes:

> On 10/24/11 14:50, Richard Stallman wrote:
>> I think there was a reason for doing it this way.  Perhaps so as to
>> allocate the space before reading the file.
>
> Yes, that sounds right.  And in the typical case where the file is not
> growing, that allocates space efficiently.  If the file is growing, though,
> it's OK to allocate more space after discovering that the initial
> allocation was too small.

Right.  The best possible approach is, likely:

  fstat (fd, x, &st)
  bufsz = max (READ_BUF_SIZE, st.st_size)
  buf = malloc (bufsz)

  int ret = 0, readsz = 0;
  do
    {
      readsz += ret;
      if (readsz == bufsz && size isn't unreasonable)
        {
          /* value chosen arbitrarily.  */
          bufsz += min (16 * READ_BUF_SIZE, bufsz)
          buf = realloc (buf, bufsz)
        }
      errno = 0
      ret = read (fd, buf + readsz, bufsz - readsz)
    }
  while (ret > 0 || errno == EINTR);

... or such.  This approach is robust and general, and I suspect it'd
even work for named pipes.

st_size isn't a good enough indicator of size, and it can go out of date
before TOU, however, it's - no doubt - a useful hint in the 99% case.
Using st_size to figure out a base allocation size and extending
appropriately is a well known strategy, and it would be appropriate to
do so here.

Thanks in advance, have a great day.
-- 
Arsen Arsenović

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 381 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* bug#9800: Incomplete truncated file buffers from the /proc filesystem
  2023-02-12 10:21       ` Arsen Arsenović via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-02-13 20:47         ` Paul Eggert
  0 siblings, 0 replies; 15+ messages in thread
From: Paul Eggert @ 2023-02-13 20:47 UTC (permalink / raw)
  To: Arsen Arsenović; +Cc: rms, 9800-done

[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]

On 2023-02-12 02:21, Arsen Arsenović wrote:
> ... or such.  This approach is robust and general, and I suspect it'd
> even work for named pipes.

Although indeed robust and general and it will work with named pipes in 
some cases, it still has a problem if the other side of the named pipe 
outputs data very slowly: Emacs will still seem to hang until you type C-g.

That being said, the approach is an improvement and it fixes the 
original bug report so I installed the attached and am boldly closing 
the bug report (we can reopen it if I'm wrong). The last patch in the 
attached series is the actual fix: the others are minor cleanups of this 
messy area, which I discovered while looking into the fix.

This patch does not address the abovementioned issue of named pipes, nor 
the issue of inserting very large files: the code should behave roughly 
the same as before in those two areas. These issues can be raised in 
separate bug reports if needed.

PS. I was surprised to see that Emacs master currently has several test 
case failures on GNU/Linux (specifically the latest Fedora and Ubuntu 
releases). I hope these are known and that people are working on them.

1 files did not finish:
   lisp/server-tests.log
4 files contained unexpected results:
   src/lread-tests.log
   lisp/international/mule-tests.log
   lisp/emacs-lisp/map-tests.log
   lisp/emacs-lisp/bytecomp-tests.log

[-- Attachment #2: 0001-Improve-insert-file-contents-checking.patch --]
[-- Type: text/x-patch, Size: 2041 bytes --]

From 5284af27ee5250c631ff4ee2f3d8682f0c5df8bc Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Feb 2023 17:30:46 -0800
Subject: [PATCH 1/5] Improve insert-file-contents checking

* src/fileio.c (Finsert_file_contents): Check BEG, END,
REPLACE for validity before launching into opening files etc.
---
 src/fileio.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/src/fileio.c b/src/fileio.c
index c672e0f7baf..64337abdaef 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -3908,7 +3908,6 @@ because (1) it preserves some marker positions (in unchanged portions
   int fd;
   ptrdiff_t inserted = 0;
   ptrdiff_t how_much;
-  off_t beg_offset, end_offset;
   int unprocessed;
   specpdl_ref count = SPECPDL_INDEX ();
   Lisp_Object handler, val, insval, orig_filename, old_undo;
@@ -3970,6 +3969,17 @@ because (1) it preserves some marker positions (in unchanged portions
       goto handled;
     }
 
+  if (!NILP (visit))
+    {
+      if (!NILP (beg) || !NILP (end))
+	error ("Attempt to visit less than an entire file");
+      if (BEG < Z && NILP (replace))
+	error ("Cannot do file visiting in a non-empty buffer");
+    }
+
+  off_t beg_offset = !NILP (beg) ? file_offset (beg) : 0;
+  off_t end_offset = !NILP (end) ? file_offset (end) : -1;
+
   orig_filename = filename;
   filename = ENCODE_FILE (filename);
 
@@ -4030,22 +4040,7 @@ because (1) it preserves some marker positions (in unchanged portions
 		  build_string ("not a regular file"), orig_filename);
     }
 
-  if (!NILP (visit))
-    {
-      if (!NILP (beg) || !NILP (end))
-	error ("Attempt to visit less than an entire file");
-      if (BEG < Z && NILP (replace))
-	error ("Cannot do file visiting in a non-empty buffer");
-    }
-
-  if (!NILP (beg))
-    beg_offset = file_offset (beg);
-  else
-    beg_offset = 0;
-
-  if (!NILP (end))
-    end_offset = file_offset (end);
-  else
+  if (end_offset < 0)
     {
       if (!regular)
 	end_offset = TYPE_MAXIMUM (off_t);
-- 
2.37.2


[-- Attachment #3: 0002-Improve-insert-file-contents-on-non-regular-files.patch --]
[-- Type: text/x-patch, Size: 1521 bytes --]

From b0842671e750be08356425e2fc38251e7b08d5d7 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Feb 2023 17:52:46 -0800
Subject: [PATCH 2/5] Improve insert-file-contents on non-regular files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/fileio.c (Finsert_file_contents):
If the file is not a regular file, check REPLACE and VISIT
before doing further syscalls that won’t matter in this case.
---
 src/fileio.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/fileio.c b/src/fileio.c
index 64337abdaef..751b8ec573c 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -4022,7 +4022,6 @@ because (1) it preserves some marker positions (in unchanged portions
   if (!S_ISREG (st.st_mode))
     {
       regular = false;
-      seekable = lseek (fd, 0, SEEK_CUR) < 0;
 
       if (! NILP (visit))
         {
@@ -4030,14 +4029,15 @@ because (1) it preserves some marker positions (in unchanged portions
 	  goto notfound;
         }
 
+      if (!NILP (replace))
+	xsignal2 (Qfile_error,
+		  build_string ("not a regular file"), orig_filename);
+
+      seekable = lseek (fd, 0, SEEK_CUR) < 0;
       if (!NILP (beg) && !seekable)
 	xsignal2 (Qfile_error,
 		  build_string ("cannot use a start position in a non-seekable file/device"),
 		  orig_filename);
-
-      if (!NILP (replace))
-	xsignal2 (Qfile_error,
-		  build_string ("not a regular file"), orig_filename);
     }
 
   if (end_offset < 0)
-- 
2.37.2


[-- Attachment #4: 0003-Don-t-scan-text-twice-to-guess-coding-system.patch --]
[-- Type: text/x-patch, Size: 1195 bytes --]

From ccc092115172f15c9135771f90d0000f8bf21614 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 13 Feb 2023 08:51:45 -0800
Subject: [PATCH 3/5] =?UTF-8?q?Don=E2=80=99t=20scan=20text=20twice=20to=20?=
 =?UTF-8?q?guess=20coding=20system?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/fileio.c (Finsert_file_contents): If the file shrank
below 4 KiB, don’t read duplicate text into READ_BUF.
This also removes a use of SEEK_END, which Linux /proc
file systems do not support (not that we should get here
with /proc).
---
 src/fileio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/fileio.c b/src/fileio.c
index 751b8ec573c..47177be0f4d 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -4119,7 +4119,7 @@ because (1) it preserves some marker positions (in unchanged portions
 		  if (nread == 1024)
 		    {
 		      int ntail;
-		      if (lseek (fd, - (1024 * 3), SEEK_END) < 0)
+		      if (lseek (fd, st.st_size - 1024 * 3, SEEK_CUR) < 0)
 			report_file_error ("Setting file position",
 					   orig_filename);
 		      ntail = emacs_read_quit (fd, read_buf + nread, 1024 * 3);
-- 
2.37.2


[-- Attachment #5: 0004-Fix-src-fileio.c-comment.patch --]
[-- Type: text/x-patch, Size: 1022 bytes --]

From bae5fa5d9a8ef8c41fbb3408eea441a2ee14d1db Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 13 Feb 2023 08:53:52 -0800
Subject: [PATCH 4/5] Fix src/fileio.c comment
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/fileio.c (Finsert_file_contents): Fix comment.
Since the code relies on st_size, it’s limited to
regular files, not to seekable files.
---
 src/fileio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/fileio.c b/src/fileio.c
index 47177be0f4d..ee30db8b49b 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -4101,7 +4101,7 @@ because (1) it preserves some marker positions (in unchanged portions
       else
 	{
 	  /* Don't try looking inside a file for a coding system
-	     specification if it is not seekable.  */
+	     specification if it is not a regular file.  */
 	  if (regular && !NILP (Vset_auto_coding_function))
 	    {
 	      /* Find a coding system specified in the heading two
-- 
2.37.2


[-- Attachment #6: 0005-Fix-insert-file-contents-on-proc-files.patch --]
[-- Type: text/x-patch, Size: 4914 bytes --]

From b950b46f514989442fdd9937a0e96d53a3affa88 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 13 Feb 2023 12:32:11 -0800
Subject: [PATCH 5/5] Fix insert-file-contents on /proc files

This should fix Bug#9800 (2011-10-19).
* src/fileio.c (Finsert_file_contents):
Do not trust st_size even on regular files, as the file might
be a Linux /proc file, or it might be growing.
Instead, always read to EOF when END is nil.
---
 src/fileio.c | 57 +++++++++++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 32 deletions(-)

diff --git a/src/fileio.c b/src/fileio.c
index ee30db8b49b..b80f8d61de4 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -3907,7 +3907,6 @@ because (1) it preserves some marker positions (in unchanged portions
   struct timespec mtime;
   int fd;
   ptrdiff_t inserted = 0;
-  ptrdiff_t how_much;
   int unprocessed;
   specpdl_ref count = SPECPDL_INDEX ();
   Lisp_Object handler, val, insval, orig_filename, old_undo;
@@ -3920,7 +3919,8 @@ because (1) it preserves some marker positions (in unchanged portions
   bool replace_handled = false;
   bool set_coding_system = false;
   Lisp_Object coding_system;
-  bool read_quit = false;
+  /* Negative if read error, 0 if OK so far, positive if quit.  */
+  ptrdiff_t read_quit = 0;
   /* If the undo log only contains the insertion, there's no point
      keeping it.  It's typically when we first fill a file-buffer.  */
   bool empty_undo_list_p
@@ -4404,7 +4404,7 @@ because (1) it preserves some marker positions (in unchanged portions
       ptrdiff_t bufpos;
       unsigned char *decoded;
       ptrdiff_t temp;
-      ptrdiff_t this = 0;
+      ptrdiff_t this;
       specpdl_ref this_count = SPECPDL_INDEX ();
       bool multibyte
 	= ! NILP (BVAR (current_buffer, enable_multibyte_characters));
@@ -4580,8 +4580,12 @@ because (1) it preserves some marker positions (in unchanged portions
     }
 
   move_gap_both (PT, PT_BYTE);
-  if (GAP_SIZE < total)
-    make_gap (total - GAP_SIZE);
+
+  /* Ensure the gap is at least one byte larger than needed for the
+     estimated file size, so that in the usual case we read to EOF
+     without reallocating.  */
+  if (GAP_SIZE <= total)
+    make_gap (total - GAP_SIZE + 1);
 
   if (beg_offset != 0 || !NILP (replace))
     {
@@ -4589,12 +4593,6 @@ because (1) it preserves some marker positions (in unchanged portions
 	report_file_error ("Setting file position", orig_filename);
     }
 
-  /* In the following loop, HOW_MUCH contains the total bytes read so
-     far for a regular file, and not changed for a special file.  But,
-     before exiting the loop, it is set to a negative value if I/O
-     error occurs.  */
-  how_much = 0;
-
   /* Total bytes inserted.  */
   inserted = 0;
 
@@ -4603,23 +4601,26 @@ because (1) it preserves some marker positions (in unchanged portions
   {
     ptrdiff_t gap_size = GAP_SIZE;
 
-    while (how_much < total)
+    while (NILP (end) || inserted < total)
       {
-	/* `try' is reserved in some compilers (Microsoft C).  */
-	ptrdiff_t trytry = min (total - how_much, READ_BUF_SIZE);
 	ptrdiff_t this;
 
+	if (gap_size == 0)
+	  {
+	    /* The size estimate was wrong.  Make the gap 50% larger.  */
+	    make_gap (GAP_SIZE >> 1);
+	    gap_size = GAP_SIZE - inserted;
+	  }
+
+	/* 'try' is reserved in some compilers (Microsoft C).  */
+	ptrdiff_t trytry = min (gap_size, READ_BUF_SIZE);
+	if (!NILP (end))
+	  trytry = min (trytry, total - inserted);
+
 	if (!seekable && NILP (end))
 	  {
 	    Lisp_Object nbytes;
 
-	    /* Maybe make more room.  */
-	    if (gap_size < trytry)
-	      {
-		make_gap (trytry - gap_size);
-		gap_size = GAP_SIZE - inserted;
-	      }
-
 	    /* Read from the file, capturing `quit'.  When an
 	       error occurs, end the loop, and arrange for a quit
 	       to be signaled after decoding the text we read.  */
@@ -4630,7 +4631,7 @@ because (1) it preserves some marker positions (in unchanged portions
 
 	    if (NILP (nbytes))
 	      {
-		read_quit = true;
+		read_quit = 1;
 		break;
 	      }
 
@@ -4649,19 +4650,11 @@ because (1) it preserves some marker positions (in unchanged portions
 
 	if (this <= 0)
 	  {
-	    how_much = this;
+	    read_quit = this;
 	    break;
 	  }
 
 	gap_size -= this;
-
-	/* For a regular file, where TOTAL is the real size,
-	   count HOW_MUCH to compare with it.
-	   For a special file, where TOTAL is just a buffer size,
-	   so don't bother counting in HOW_MUCH.
-	   (INSERTED is where we count the number of characters inserted.)  */
-	if (seekable || !NILP (end))
-	  how_much += this;
 	inserted += this;
       }
   }
@@ -4682,7 +4675,7 @@ because (1) it preserves some marker positions (in unchanged portions
   emacs_close (fd);
   clear_unwind_protect (fd_index);
 
-  if (how_much < 0)
+  if (read_quit < 0)
     report_file_error ("Read error", orig_filename);
 
  notfound:
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-02-13 20:47 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-19 22:59 bug#9800: Incomplete truncated file buffers from the /proc filesystem Juri Linkov
2011-10-20  8:22 ` Eli Zaretskii
2011-10-20  8:44   ` Andreas Schwab
2023-02-12  7:38     ` Eli Zaretskii
2023-02-12  9:24       ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2011-10-24  2:53 ` Paul Eggert
2011-10-24 21:50   ` Richard Stallman
2011-10-24 22:02     ` Paul Eggert
2023-02-12 10:21       ` Arsen Arsenović via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-02-13 20:47         ` Paul Eggert
2011-11-03 20:32   ` Lars Magne Ingebrigtsen
2011-11-04  9:36     ` Juri Linkov
2011-11-04 10:54       ` Eli Zaretskii
2022-02-07  0:10 ` Lars Ingebrigtsen
2022-02-07 19:41   ` Juri Linkov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).