unofficial mirror of bug-gnu-emacs@gnu.org 
From: Pip Cet <pipcet@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 46881@debbugs.gnu.org, Paul Eggert <eggert@cs.ucla.edu>
Subject: bug#46881: 28.0.50; pdumper dumping causes way too many syscalls
Date: Wed, 3 Mar 2021 07:35:45 +0000
Message-ID: <CAOqdjBcLB-zzV=Kbh6xcJHuATZBy5je-UtA4SwM07ZvtyE3RmQ@mail.gmail.com>
In-Reply-To: <83r1kw6b06.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1208 bytes --]

On Wed, Mar 3, 2021 at 5:51 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Tue, 2 Mar 2021 20:45:04 +0000
> >
> > On Tue, Mar 2, 2021 at 8:35 PM Pip Cet <pipcet@gmail.com> wrote:
> > > I've looked into the problem, and it seems easy to solve and worth it
> > > in terms of debuggability and performance.

Since debuggability is such a concern, we probably shouldn't leak the
buffer memory. Revised patch attached. (This patch also removes the
lseek() syscalls; while not quite as numerous as the read() ones,
those did clutter up straces here).
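
For readers who want the idea without the Emacs context, here is a
minimal standalone sketch of the same approach: stage all writes in a
growable memory buffer, treat seeks as pointer updates, and issue the
real write() calls only once at the end, freeing the buffer afterwards.
The names (struct membuf, membuf_write, membuf_seek, membuf_flush) are
hypothetical and not part of pdumper.c. Unlike the attached patch, this
sketch uses plain realloc, tracks the maximum offset in the write path
rather than in the seek path, and does not retry on EINTR.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the buffer-related dump_context fields.  */
struct membuf
{
  char *data;        /* Staged file contents.  */
  size_t size;       /* Bytes currently allocated.  */
  size_t offset;     /* Current "file position".  */
  size_t max_offset; /* One past the highest byte written so far.  */
};

/* Copy NBYTE bytes from SRC to the current position, doubling the
   buffer as needed.  Return 0 on success, -1 if out of memory.  */
static int
membuf_write (struct membuf *b, const void *src, size_t nbyte)
{
  size_t end = b->offset + nbyte;
  if (end > b->size)
    {
      size_t new_size = b->size ? b->size : 4096;
      while (new_size < end)
        new_size *= 2;
      char *p = realloc (b->data, new_size);
      if (!p)
        return -1;
      /* Zero the new tail so gaps left by seeks have defined contents.  */
      memset (p + b->size, 0, new_size - b->size);
      b->data = p;
      b->size = new_size;
    }
  if (nbyte > 0)
    memcpy (b->data + b->offset, src, nbyte);
  b->offset = end;
  if (b->max_offset < end)
    b->max_offset = end;
  return 0;
}

/* "Seeking" only moves the position; no syscall is involved.  */
static void
membuf_seek (struct membuf *b, size_t offset)
{
  b->offset = offset;
}

/* Write the staged bytes to FD, then release the buffer whether or
   not the write succeeded.  Return 0 on success, -1 on error.  */
static int
membuf_flush (struct membuf *b, int fd)
{
  size_t done = 0;
  int status = 0;
  while (done < b->max_offset)
    {
      ssize_t n = write (fd, b->data + done, b->max_offset - done);
      if (n <= 0)
        {
          status = -1;
          break;
        }
      done += (size_t) n;
    }
  free (b->data);
  memset (b, 0, sizeof *b);
  return status;
}
```

A caller would write the body first, seek back to offset 0 to fill in
the header, and then call membuf_flush once; the file descriptor sees
only the final contents.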

> In particular, is it safe to allocate
> large amounts of memory off the heap while dumping?

Even if it isn't, we'd still be faster re-running the dump after
growing the dumper image than the current approach is.

> A couple of
> places in pdumper.c say some parts of the code should not call malloc.

IIUC, the prohibition on calling malloc, if it is still a concern,
applies only when loading the dump, not while writing it.

My main concern is the possibility of a partly-written dump file,
since we no longer turn "!UMPEDGNUEMACS" into "DUMPEDGNUEMACS" after
the dump. Maybe it would make sense to restore that feature?
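
To make the concern concrete, here is a rough sketch of the
placeholder-magic technique: the file starts out with an invalid magic
string, and the first byte is corrected only after everything else has
been written, so a dump interrupted midway never looks valid. The
helper names (begin_dump, finish_dump, dump_is_complete) and the
constant dump_magic_sketch are hypothetical, and this is not how
pdumper.c is currently structured.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* The magic a complete dump file starts with (illustrative copy).  */
static const char dump_magic_sketch[] = "DUMPEDGNUEMACS";
#define MAGIC_LEN (sizeof dump_magic_sketch - 1)

/* Create PATH and write a deliberately invalid magic ("!UMPED...").
   Return an open descriptor positioned after the magic, or -1.  */
static int
begin_dump (const char *path)
{
  char magic[MAGIC_LEN];
  memcpy (magic, dump_magic_sketch, MAGIC_LEN);
  magic[0] = '!';
  int fd = open (path, O_RDWR | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;
  if (write (fd, magic, MAGIC_LEN) != (ssize_t) MAGIC_LEN)
    {
      close (fd);
      return -1;
    }
  return fd;
}

/* Called after all other data is written: turn '!' into 'D', then
   close the file.  Return 0 on success, -1 on error.  */
static int
finish_dump (int fd)
{
  int ok = (lseek (fd, 0, SEEK_SET) == 0
            && write (fd, dump_magic_sketch, 1) == 1);
  return close (fd) == 0 && ok ? 0 : -1;
}

/* Return nonzero if PATH starts with the valid magic.  */
static int
dump_is_complete (const char *path)
{
  char buf[MAGIC_LEN];
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return 0;
  ssize_t n = read (fd, buf, MAGIC_LEN);
  close (fd);
  return (n == (ssize_t) MAGIC_LEN
          && memcmp (buf, dump_magic_sketch, MAGIC_LEN) == 0);
}
```

A loader would call something like dump_is_complete before trusting the
file. A more careful version would probably also fsync the data before
flipping the byte; that is left out here to keep the sketch short.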

Pip

[-- Attachment #2: 0001-Prepare-pdumper-dump-file-in-memory-write-it-in-one-.patch --]
[-- Type: text/x-patch, Size: 2932 bytes --]

From fae67c02955a5bbea16d554b8e735dc8bef6a9e2 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Tue, 2 Mar 2021 20:38:23 +0000
Subject: [PATCH] Prepare pdumper dump file in memory, write it in one go
 (Bug#46881)

* src/pdumper.c (struct dump_context): Add buf, buf_size, max_offset fields.
(dump_grow_buffer): New function.
(dump_write): Use memcpy, not an actual emacs_write.
(dump_seek): Keep track of maximum seen offset. Don't actually seek.
(Fdump_emacs_portable): Write out the file contents when done.
---
 src/pdumper.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/src/pdumper.c b/src/pdumper.c
index 337742fda4ade..29d1cb862e07f 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -473,6 +473,10 @@ dump_fingerprint (char const *label,
 {
   /* Header we'll write to the dump file when done.  */
   struct dump_header header;
+  /* Data that will be written to the dump file.  */
+  void *buf;
+  dump_off buf_size;
+  dump_off max_offset;
 
   Lisp_Object old_purify_flag;
   Lisp_Object old_post_gc_hook;
@@ -581,6 +585,13 @@ dump_fingerprint (char const *label,
 \f
 /* Dump file creation */
 
+static void dump_grow_buffer (struct dump_context *ctx)
+{
+  ctx->buf = xrealloc (ctx->buf, ctx->buf_size = (ctx->buf_size ?
+						  (ctx->buf_size * 2)
+						  : 8 * 1024 * 1024));
+}
+
 static dump_off dump_object (struct dump_context *ctx, Lisp_Object object);
 static dump_off dump_object_for_offset (struct dump_context *ctx,
 					Lisp_Object object);
@@ -747,8 +758,9 @@ dump_write (struct dump_context *ctx, const void *buf, dump_off nbyte)
   eassert (nbyte == 0 || buf != NULL);
   eassert (ctx->obj_offset == 0);
   eassert (ctx->flags.dump_object_contents);
-  if (emacs_write (ctx->fd, buf, nbyte) < nbyte)
-    report_file_error ("Could not write to dump file", ctx->dump_filename);
+  while (ctx->offset + nbyte > ctx->buf_size)
+    dump_grow_buffer (ctx);
+  memcpy ((char *)ctx->buf + ctx->offset, buf, nbyte);
   ctx->offset += nbyte;
 }
 
@@ -828,10 +840,9 @@ dump_tailq_pop (struct dump_tailq *tailq)
 static void
 dump_seek (struct dump_context *ctx, dump_off offset)
 {
+  if (ctx->max_offset < ctx->offset)
+    ctx->max_offset = ctx->offset;
   eassert (ctx->obj_offset == 0);
-  if (lseek (ctx->fd, offset, SEEK_SET) < 0)
-    report_file_error ("Setting file position",
-                       ctx->dump_filename);
   ctx->offset = offset;
 }
 
@@ -4159,6 +4170,12 @@ DEFUN ("dump-emacs-portable",
   ctx->header.magic[0] = dump_magic[0];
   dump_seek (ctx, 0);
   dump_write (ctx, &ctx->header, sizeof (ctx->header));
+  if (emacs_write (ctx->fd, ctx->buf, ctx->max_offset) < ctx->max_offset)
+    report_file_error ("Could not write to dump file", ctx->dump_filename);
+  xfree (ctx->buf);
+  ctx->buf = NULL;
+  ctx->buf_size = 0;
+  ctx->max_offset = 0;
 
   dump_off
     header_bytes = header_end - header_start,
-- 
2.30.1

