From: Daniel Colascione
Newsgroups: gmane.emacs.bugs
Subject: bug#46881: 28.0.50; pdumper dumping causes way too many syscalls
Date: Thu, 4 Mar 2021 17:26:32 -0500
Message-ID: <90e99fc5-280d-63bb-9bc4-3efe89b9f9e2@dancol.org>
References: <83r1kw6b06.fsf@gnu.org>
In-Reply-To: <83r1kw6b06.fsf@gnu.org>
To: Eli Zaretskii, Pip Cet, Paul Eggert
Cc: 46881@debbugs.gnu.org

On 3/3/21 12:51 AM, Eli Zaretskii wrote:
>> From: Pip Cet
>> Date: Tue, 2 Mar 2021 20:45:04 +0000
>>
>> On Tue, Mar 2, 2021 at 8:35 PM Pip Cet wrote:
>>> I've looked into the problem, and it seems easy to solve and worth it
>>> in terms of debuggability and performance.
>> Very rough benchmarks, but this seems to be clearly worth it:
>>
>> Performance:
>> With patch:
>> real 0m3.861s
>> user 0m3.776s
>> sys 0m0.085s
>>
>> Without patch:
>> real 0m7.001s
>> user 0m4.476s
>> sys 0m2.511s
>>
>> Number of syscalls:
>> With patch: 415442
>> Without patch: 2028307
>>
>>> Patch will be attached once this has a bug number.
>> And here's the patch. Testing would be very appreciated.
>>
>> I'm unsure about the precise usage of dump_off vs ptrdiff_t here; I
>> don't think it matters, but suggestions, nitpicks, and comments, on
>> this or any other aspect, would be very appreciated.
>> From 92ee138852b34ede2f43dd7f93f310fc746bb3bf Mon Sep 17 00:00:00 2001
>> From: Pip Cet
>> Date: Tue, 2 Mar 2021 20:38:23 +0000
>> Subject: [PATCH] Prepare pdumper dump file in memory, write it in one go
>> (Bug#46881)
>>
>> * src/pdumper.c (struct dump_context): Add buf, buf_size, max_offset fields.
>> (grow_buffer): New function.
>> (dump_write): Use memcpy, not an actual emacs_write.
>> (dump_seek): Keep track of maximum seen offset.
>> (Fdump_emacs_portable): Write out the file contents when done.
>> ---
>>  src/pdumper.c | 20 ++++++++++++++++++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/pdumper.c b/src/pdumper.c
>> index 337742fda4ade..62ddad8ee5e34 100644
>> --- a/src/pdumper.c
>> +++ b/src/pdumper.c
>> @@ -473,6 +473,10 @@ dump_fingerprint (char const *label,
>>  {
>>    /* Header we'll write to the dump file when done. */
>>    struct dump_header header;
>> +  /* Data that will be written to the dump file. */
>> +  void *buf;
>> +  ptrdiff_t buf_size;
>> +  ptrdiff_t max_offset;
>>
>>    Lisp_Object old_purify_flag;
>>    Lisp_Object old_post_gc_hook;
>> @@ -581,6 +585,13 @@ dump_fingerprint (char const *label,
>>
>>  /* Dump file creation */
>>
>> +static void dump_grow_buffer (struct dump_context *ctx)
>> +{
>> +  ctx->buf = xrealloc (ctx->buf, ctx->buf_size = (ctx->buf_size ?
>> +                                                   (ctx->buf_size * 2)
>> +                                                   : 1024 * 1024));
>> +}
>> +
>>  static dump_off dump_object (struct dump_context *ctx, Lisp_Object object);
>>  static dump_off dump_object_for_offset (struct dump_context *ctx,
>>                                          Lisp_Object object);
>> @@ -747,8 +758,9 @@ dump_write (struct dump_context *ctx, const void *buf, dump_off nbyte)
>>    eassert (nbyte == 0 || buf != NULL);
>>    eassert (ctx->obj_offset == 0);
>>    eassert (ctx->flags.dump_object_contents);
>> -  if (emacs_write (ctx->fd, buf, nbyte) < nbyte)
>> -    report_file_error ("Could not write to dump file", ctx->dump_filename);
>> +  while (ctx->offset + nbyte > ctx->buf_size)
>> +    dump_grow_buffer (ctx);
>> +  memcpy ((char *)ctx->buf + ctx->offset, buf, nbyte);
>>    ctx->offset += nbyte;
>>  }
>>
>> @@ -828,6 +840,8 @@ dump_tailq_pop (struct dump_tailq *tailq)
>>  static void
>>  dump_seek (struct dump_context *ctx, dump_off offset)
>>  {
>> +  if (ctx->max_offset < ctx->offset)
>> +    ctx->max_offset = ctx->offset;
>>    eassert (ctx->obj_offset == 0);
>>    if (lseek (ctx->fd, offset, SEEK_SET) < 0)
>>      report_file_error ("Setting file position",
>> @@ -4159,6 +4173,8 @@ DEFUN ("dump-emacs-portable",
>>    ctx->header.magic[0] = dump_magic[0];
>>    dump_seek (ctx, 0);
>>    dump_write (ctx, &ctx->header, sizeof (ctx->header));
>> +  if (emacs_write (ctx->fd, ctx->buf, ctx->max_offset) < ctx->max_offset)
>> +    report_file_error ("Could not write to dump file", ctx->dump_filename);
>>
>>    dump_off
>>      header_bytes = header_end - header_start,
>> --
>> 2.30.1
> Thanks.
>
> Daniel, Paul: any comments? In particular, is it safe to allocate
> large amounts of memory off the heap while dumping? A couple of
> places in pdumper.c says some parts of code should call malloc.

It looks fine, but wouldn't dumping to a FILE* (with internal buffering)
do the same basic thing in a simpler way? There aren't any particular
constraints on the environment _during_ the dump: we even make new lisp
objects. It's when loading the dump, early in initialization, that you
have to be careful.
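
For illustration only, here is a rough sketch of what the FILE*-based
variant could look like. It is not taken from pdumper.c: every name in
it (sketch_header, sketch_context, sketch_write, sketch_seek,
sketch_dump) is hypothetical, and it uses plain stdio rather than
Emacs's own wrappers. It assumes the dumper's pattern is "write a
placeholder header, append many small records, seek back and rewrite
the header". One caveat worth measuring: typical stdio implementations
flush pending output on fseek, so a dumper that seeks often may save
fewer syscalls than the in-memory buffer does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the dump header; not the real
   struct dump_header.  */
struct sketch_header
{
  char magic[8];
  long payload_bytes;
};

/* Hypothetical context holding a buffered stream instead of a raw fd.  */
struct sketch_context
{
  FILE *fp;
  long offset;
};

/* Append NBYTE bytes at the current position.  stdio collects these
   small writes in its own buffer before handing them to the kernel.  */
static int
sketch_write (struct sketch_context *ctx, const void *buf, size_t nbyte)
{
  assert (ctx->fp != NULL);
  if (fwrite (buf, 1, nbyte, ctx->fp) != nbyte)
    return -1;
  ctx->offset += (long) nbyte;
  return 0;
}

/* Reposition the stream.  Implementations typically flush pending
   output here, so each seek may still cost real syscalls.  */
static int
sketch_seek (struct sketch_context *ctx, long offset)
{
  if (fseek (ctx->fp, offset, SEEK_SET) != 0)
    return -1;
  ctx->offset = offset;
  return 0;
}

/* Write a placeholder header and NRECORDS small records, then seek
   back and fill in the real header.  Returns 0 on success.  */
static int
sketch_dump (const char *filename, int nrecords)
{
  struct sketch_context ctx = { fopen (filename, "wb"), 0 };
  if (ctx.fp == NULL)
    return -1;

  struct sketch_header header;
  memset (&header, 0, sizeof header);
  int status = sketch_write (&ctx, &header, sizeof header);

  for (int i = 0; status == 0 && i < nrecords; i++)
    status = sketch_write (&ctx, &i, sizeof i);

  if (status == 0)
    {
      memcpy (header.magic, "SKETCH", 7);
      header.payload_bytes = ctx.offset - (long) sizeof header;
      status = sketch_seek (&ctx, 0);
    }
  if (status == 0)
    status = sketch_write (&ctx, &header, sizeof header);

  /* fclose flushes whatever is still buffered; a failure here means
     the file may be incomplete.  */
  if (fclose (ctx.fp) != 0)
    status = -1;
  return status;
}
```

Calling sketch_dump ("some-file", 1000) should leave a file whose
header records 1000 * sizeof (int) payload bytes. Comparing syscall
counts for this against the in-memory approach (for example under
strace) would show whether the simpler code is good enough.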