From: Paul Eggert <eggert@cs.ucla.edu>
To: "Arsen Arsenović" <arsen@aarsen.me>
Cc: rms@gnu.org, 9800-done@debbugs.gnu.org
Subject: bug#9800: Incomplete truncated file buffers from the /proc filesystem
Date: Mon, 13 Feb 2023 12:47:33 -0800 [thread overview]
Message-ID: <f711f43f-6892-f849-21ec-26376c351f11@cs.ucla.edu> (raw)
In-Reply-To: <861qmvcglp.fsf@aarsen.me>
[-- Attachment #1: Type: text/plain, Size: 1425 bytes --]
On 2023-02-12 02:21, Arsen Arsenović wrote:
> ... or such. This approach is robust and general, and I suspect it'd
> even work for named pipes.
Although indeed robust and general and it will work with named pipes in
some cases, it still has a problem if the other side of the named pipe
outputs data very slowly: Emacs will still seem to hang until you type C-g.
That being said, the approach is an improvement and it fixes the
original bug report so I installed the attached and am boldly closing
the bug report (we can reopen it if I'm wrong). The last patch in the
attached series is the actual fix: the others are minor cleanups of this
messy area, which I discovered while looking into the fix.
This patch does not address the abovementioned issue of named pipes, nor
the issue of inserting very large files: the code should behave roughly
the same as before in those two areas. These issues can be raised in
separate bug reports if needed.
PS. I was surprised to see that Emacs master currently has several test
case failures on GNU/Linux (specifically the latest Fedora and Ubuntu
releases). I hope these are known and that people are working on them.
1 files did not finish:
lisp/server-tests.log
4 files contained unexpected results:
src/lread-tests.log
lisp/international/mule-tests.log
lisp/emacs-lisp/map-tests.log
lisp/emacs-lisp/bytecomp-tests.log
[-- Attachment #2: 0001-Improve-insert-file-contents-checking.patch --]
[-- Type: text/x-patch, Size: 2041 bytes --]
From 5284af27ee5250c631ff4ee2f3d8682f0c5df8bc Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Feb 2023 17:30:46 -0800
Subject: [PATCH 1/5] Improve insert-file-contents checking
* src/fileio.c (Finsert_file_contents): Check BEG, END,
REPLACE for validity before launching into opening files etc.
---
src/fileio.c | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/src/fileio.c b/src/fileio.c
index c672e0f7baf..64337abdaef 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -3908,7 +3908,6 @@ because (1) it preserves some marker positions (in unchanged portions
int fd;
ptrdiff_t inserted = 0;
ptrdiff_t how_much;
- off_t beg_offset, end_offset;
int unprocessed;
specpdl_ref count = SPECPDL_INDEX ();
Lisp_Object handler, val, insval, orig_filename, old_undo;
@@ -3970,6 +3969,17 @@ because (1) it preserves some marker positions (in unchanged portions
goto handled;
}
+ if (!NILP (visit))
+ {
+ if (!NILP (beg) || !NILP (end))
+ error ("Attempt to visit less than an entire file");
+ if (BEG < Z && NILP (replace))
+ error ("Cannot do file visiting in a non-empty buffer");
+ }
+
+ off_t beg_offset = !NILP (beg) ? file_offset (beg) : 0;
+ off_t end_offset = !NILP (end) ? file_offset (end) : -1;
+
orig_filename = filename;
filename = ENCODE_FILE (filename);
@@ -4030,22 +4040,7 @@ because (1) it preserves some marker positions (in unchanged portions
build_string ("not a regular file"), orig_filename);
}
- if (!NILP (visit))
- {
- if (!NILP (beg) || !NILP (end))
- error ("Attempt to visit less than an entire file");
- if (BEG < Z && NILP (replace))
- error ("Cannot do file visiting in a non-empty buffer");
- }
-
- if (!NILP (beg))
- beg_offset = file_offset (beg);
- else
- beg_offset = 0;
-
- if (!NILP (end))
- end_offset = file_offset (end);
- else
+ if (end_offset < 0)
{
if (!regular)
end_offset = TYPE_MAXIMUM (off_t);
--
2.37.2
[-- Attachment #3: 0002-Improve-insert-file-contents-on-non-regular-files.patch --]
[-- Type: text/x-patch, Size: 1521 bytes --]
From b0842671e750be08356425e2fc38251e7b08d5d7 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 12 Feb 2023 17:52:46 -0800
Subject: [PATCH 2/5] Improve insert-file-contents on non-regular files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/fileio.c (Finsert_file_contents):
If the file is not a regular file, check REPLACE and VISIT
before doing further syscalls that won’t matter in this case.
---
src/fileio.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/fileio.c b/src/fileio.c
index 64337abdaef..751b8ec573c 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -4022,7 +4022,6 @@ because (1) it preserves some marker positions (in unchanged portions
if (!S_ISREG (st.st_mode))
{
regular = false;
- seekable = lseek (fd, 0, SEEK_CUR) < 0;
if (! NILP (visit))
{
@@ -4030,14 +4029,15 @@ because (1) it preserves some marker positions (in unchanged portions
goto notfound;
}
+ if (!NILP (replace))
+ xsignal2 (Qfile_error,
+ build_string ("not a regular file"), orig_filename);
+
+ seekable = lseek (fd, 0, SEEK_CUR) < 0;
if (!NILP (beg) && !seekable)
xsignal2 (Qfile_error,
build_string ("cannot use a start position in a non-seekable file/device"),
orig_filename);
-
- if (!NILP (replace))
- xsignal2 (Qfile_error,
- build_string ("not a regular file"), orig_filename);
}
if (end_offset < 0)
--
2.37.2
[-- Attachment #4: 0003-Don-t-scan-text-twice-to-guess-coding-system.patch --]
[-- Type: text/x-patch, Size: 1195 bytes --]
From ccc092115172f15c9135771f90d0000f8bf21614 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 13 Feb 2023 08:51:45 -0800
Subject: [PATCH 3/5] =?UTF-8?q?Don=E2=80=99t=20scan=20text=20twice=20to=20?=
=?UTF-8?q?guess=20coding=20system?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/fileio.c (Finsert_file_contents): If the file shrank
below 4 KiB, don’t read duplicate text into READ_BUF.
This also removes a use of SEEK_END, which Linux /proc
file systems do not support (not that we should get here
with /proc).
---
src/fileio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/fileio.c b/src/fileio.c
index 751b8ec573c..47177be0f4d 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -4119,7 +4119,7 @@ because (1) it preserves some marker positions (in unchanged portions
if (nread == 1024)
{
int ntail;
- if (lseek (fd, - (1024 * 3), SEEK_END) < 0)
+ if (lseek (fd, st.st_size - 1024 * 3, SEEK_CUR) < 0)
report_file_error ("Setting file position",
orig_filename);
ntail = emacs_read_quit (fd, read_buf + nread, 1024 * 3);
--
2.37.2
[-- Attachment #5: 0004-Fix-src-fileio.c-comment.patch --]
[-- Type: text/x-patch, Size: 1022 bytes --]
From bae5fa5d9a8ef8c41fbb3408eea441a2ee14d1db Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 13 Feb 2023 08:53:52 -0800
Subject: [PATCH 4/5] Fix src/fileio.c comment
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/fileio.c (Finsert_file_contents): Fix comment.
Since the code relies on st_size, it’s limited to
regular files, not to seekable files.
---
src/fileio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/fileio.c b/src/fileio.c
index 47177be0f4d..ee30db8b49b 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -4101,7 +4101,7 @@ because (1) it preserves some marker positions (in unchanged portions
else
{
/* Don't try looking inside a file for a coding system
- specification if it is not seekable. */
+ specification if it is not a regular file. */
if (regular && !NILP (Vset_auto_coding_function))
{
/* Find a coding system specified in the heading two
--
2.37.2
[-- Attachment #6: 0005-Fix-insert-file-contents-on-proc-files.patch --]
[-- Type: text/x-patch, Size: 4914 bytes --]
From b950b46f514989442fdd9937a0e96d53a3affa88 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 13 Feb 2023 12:32:11 -0800
Subject: [PATCH 5/5] Fix insert-file-contents on /proc files
This should fix Bug#9800 (2011-10-19).
* src/fileio.c (Finsert_file_contents):
Do not trust st_size even on regular files, as the file might
be a Linux /proc file, or it might be growing.
Instead, always read to EOF when END is nil.
---
src/fileio.c | 57 +++++++++++++++++++++++-----------------------------
1 file changed, 25 insertions(+), 32 deletions(-)
diff --git a/src/fileio.c b/src/fileio.c
index ee30db8b49b..b80f8d61de4 100644
--- a/src/fileio.c
+++ b/src/fileio.c
@@ -3907,7 +3907,6 @@ because (1) it preserves some marker positions (in unchanged portions
struct timespec mtime;
int fd;
ptrdiff_t inserted = 0;
- ptrdiff_t how_much;
int unprocessed;
specpdl_ref count = SPECPDL_INDEX ();
Lisp_Object handler, val, insval, orig_filename, old_undo;
@@ -3920,7 +3919,8 @@ because (1) it preserves some marker positions (in unchanged portions
bool replace_handled = false;
bool set_coding_system = false;
Lisp_Object coding_system;
- bool read_quit = false;
+ /* Negative if read error, 0 if OK so far, positive if quit. */
+ ptrdiff_t read_quit = 0;
/* If the undo log only contains the insertion, there's no point
keeping it. It's typically when we first fill a file-buffer. */
bool empty_undo_list_p
@@ -4404,7 +4404,7 @@ because (1) it preserves some marker positions (in unchanged portions
ptrdiff_t bufpos;
unsigned char *decoded;
ptrdiff_t temp;
- ptrdiff_t this = 0;
+ ptrdiff_t this;
specpdl_ref this_count = SPECPDL_INDEX ();
bool multibyte
= ! NILP (BVAR (current_buffer, enable_multibyte_characters));
@@ -4580,8 +4580,12 @@ because (1) it preserves some marker positions (in unchanged portions
}
move_gap_both (PT, PT_BYTE);
- if (GAP_SIZE < total)
- make_gap (total - GAP_SIZE);
+
+ /* Ensure the gap is at least one byte larger than needed for the
+ estimated file size, so that in the usual case we read to EOF
+ without reallocating. */
+ if (GAP_SIZE <= total)
+ make_gap (total - GAP_SIZE + 1);
if (beg_offset != 0 || !NILP (replace))
{
@@ -4589,12 +4593,6 @@ because (1) it preserves some marker positions (in unchanged portions
report_file_error ("Setting file position", orig_filename);
}
- /* In the following loop, HOW_MUCH contains the total bytes read so
- far for a regular file, and not changed for a special file. But,
- before exiting the loop, it is set to a negative value if I/O
- error occurs. */
- how_much = 0;
-
/* Total bytes inserted. */
inserted = 0;
@@ -4603,23 +4601,26 @@ because (1) it preserves some marker positions (in unchanged portions
{
ptrdiff_t gap_size = GAP_SIZE;
- while (how_much < total)
+ while (NILP (end) || inserted < total)
{
- /* `try' is reserved in some compilers (Microsoft C). */
- ptrdiff_t trytry = min (total - how_much, READ_BUF_SIZE);
ptrdiff_t this;
+ if (gap_size == 0)
+ {
+ /* The size estimate was wrong. Make the gap 50% larger. */
+ make_gap (GAP_SIZE >> 1);
+ gap_size = GAP_SIZE - inserted;
+ }
+
+ /* 'try' is reserved in some compilers (Microsoft C). */
+ ptrdiff_t trytry = min (gap_size, READ_BUF_SIZE);
+ if (!NILP (end))
+ trytry = min (trytry, total - inserted);
+
if (!seekable && NILP (end))
{
Lisp_Object nbytes;
- /* Maybe make more room. */
- if (gap_size < trytry)
- {
- make_gap (trytry - gap_size);
- gap_size = GAP_SIZE - inserted;
- }
-
/* Read from the file, capturing `quit'. When an
error occurs, end the loop, and arrange for a quit
to be signaled after decoding the text we read. */
@@ -4630,7 +4631,7 @@ because (1) it preserves some marker positions (in unchanged portions
if (NILP (nbytes))
{
- read_quit = true;
+ read_quit = 1;
break;
}
@@ -4649,19 +4650,11 @@ because (1) it preserves some marker positions (in unchanged portions
if (this <= 0)
{
- how_much = this;
+ read_quit = this;
break;
}
gap_size -= this;
-
- /* For a regular file, where TOTAL is the real size,
- count HOW_MUCH to compare with it.
- For a special file, where TOTAL is just a buffer size,
- so don't bother counting in HOW_MUCH.
- (INSERTED is where we count the number of characters inserted.) */
- if (seekable || !NILP (end))
- how_much += this;
inserted += this;
}
}
@@ -4682,7 +4675,7 @@ because (1) it preserves some marker positions (in unchanged portions
emacs_close (fd);
clear_unwind_protect (fd_index);
- if (how_much < 0)
+ if (read_quit < 0)
report_file_error ("Read error", orig_filename);
notfound:
--
2.37.2
next prev parent reply other threads:[~2023-02-13 20:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-10-19 22:59 bug#9800: Incomplete truncated file buffers from the /proc filesystem Juri Linkov
2011-10-20 8:22 ` Eli Zaretskii
2011-10-20 8:44 ` Andreas Schwab
2023-02-12 7:38 ` Eli Zaretskii
2023-02-12 9:24 ` Ruijie Yu via Bug reports for GNU Emacs, the Swiss army knife of text editors
2011-10-24 2:53 ` Paul Eggert
2011-10-24 21:50 ` Richard Stallman
2011-10-24 22:02 ` Paul Eggert
2023-02-12 10:21 ` Arsen Arsenović via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-02-13 20:47 ` Paul Eggert [this message]
2011-11-03 20:32 ` Lars Magne Ingebrigtsen
2011-11-04 9:36 ` Juri Linkov
2011-11-04 10:54 ` Eli Zaretskii
2022-02-07 0:10 ` Lars Ingebrigtsen
2022-02-07 19:41 ` Juri Linkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f711f43f-6892-f849-21ec-26376c351f11@cs.ucla.edu \
--to=eggert@cs.ucla.edu \
--cc=9800-done@debbugs.gnu.org \
--cc=arsen@aarsen.me \
--cc=rms@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).