unofficial mirror of emacs-devel@gnu.org 
From: Daniel Colascione <dancol@dancol.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: "Eric S. Raymond" <esr@thyrsus.com>,
	stephen@xemacs.org, Emacs developers <emacs-devel@gnu.org>
Subject: Re: Time to drop the pre-dump phase in the build?
Date: Fri, 10 Jan 2014 21:30:13 -0800
Message-ID: <52D0D6E5.9060507@dancol.org>
In-Reply-To: <jwv8uun2jo9.fsf-monnier+emacs@gnu.org>

On 01/10/2014 09:13 PM, Stefan Monnier wrote:
>> Another possibility is to just allocate enough space in the emacs image
>> itself in BSS, then replace that mapping with a view of the dump file.
>
> Indeed, that should work, assuming you can mmap into existing space.

On POSIX-y systems, you can just mmap on top of the existing section. On 
Windows, you have to unmap first, but I think it could be made to work.
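
For the POSIX case, here is a minimal sketch of what I have in mind. The
names (reserved_heap, RESERVED_SIZE, overlay_dump_on_reserved) and the
temporary "dump" file are made up for illustration and are not existing
Emacs code. It assumes a system where 64 KiB is a multiple of the page
size, and it does not cover the Windows unmap-then-map variant.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical size of the region reserved inside the executable.  */
#define RESERVED_SIZE 65536

/* Zero-initialized, so it lands in BSS.  The alignment keeps the start
   address page-aligned, which MAP_FIXED requires.  */
static char reserved_heap[RESERVED_SIZE] __attribute__ ((aligned (65536)));

/* Write a recognizable stand-in "dump" to a temporary file, then map
   that file privately on top of reserved_heap.
   Return 0 on success, -1 on failure.  */
int
overlay_dump_on_reserved (void)
{
  static const char magic[] = "DUMPED-HEAP";
  char path[] = "/tmp/dump-sketch-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);                /* The open descriptor keeps the file alive.  */

  if (ftruncate (fd, RESERVED_SIZE) != 0
      || pwrite (fd, magic, sizeof magic, 0) != (ssize_t) sizeof magic)
    {
      close (fd);
      return -1;
    }

  /* MAP_FIXED atomically replaces whatever was mapped at this address
     (here, the BSS pages) with a copy-on-write view of the file.  */
  void *p = mmap (reserved_heap, RESERVED_SIZE, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_FIXED, fd, 0);
  close (fd);
  if (p == MAP_FAILED)
    return -1;
  assert (p == (void *) reserved_heap);

  /* The former BSS region now shows the file's contents.  */
  return memcmp (reserved_heap, magic, sizeof magic) == 0 ? 0 : -1;
}
```

Calling overlay_dump_on_reserved () from main should return 0 if the
overlay worked. Because the mapping is MAP_PRIVATE, later writes to
reserved_heap stay in the process and never reach the file.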

> But not nearly as bad: the main dump problem we have is with generating
> the `emacs' executable, whereas here we'd only need to generate the
> "swap file" which is later loaded into the same executable.
> Should still be a lot more portable.

Do you mean building emacs with a large block of zeros in .data, using it 
as a heap, and replacing the contents of that section (without modifying 
the executable image structure) to actually "dump" emacs?

>> By the way: is it me, or are we dirtying far too much of the current emacs
>> image? On my Emacs, we're dirtying (and COWing) 8MB; if I make
>> Fgarbage_collect a no-op, that drops to 4MB.
>
> For sure, GC will dirty up pretty much all pages that hold Lisp objects
> (except for those in the purespace), because of the need to set/reset
> the `mark' bit.

I was thinking about this problem. What if we were to just treat all 
image-backed objects as already marked if they're in pages that are 
unmodified? (We can perform this test very cheaply, at least on */Linux 
and Windows.) Then we wouldn't mark them during GC, and we additionally 
wouldn't demand-page those objects just for GC.

The problem we create is that we might have modified image-backed 
objects reachable only from unmodified image-backed objects, and these 
modified objects might point to heap-allocated objects that we really 
should mark. So what if we walk the per-type allocation lists during the 
*mark* phase and treat all in-image objects on modified pages as 
individual roots? This way, we eventually mark all heap-allocated 
objects. (Let's assume that no image-backed unmodified object can 
directly point to a heap-allocated object.)
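
To make the scheme concrete, here is a toy model of that mark phase. It
is only a sketch under stated assumptions: struct obj, image_objs,
heap_objs and page_dirty are hypothetical stand-ins, not Emacs data
structures, and the page_dirty array replaces the real OS query for
whether an image page is still unmodified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { OBJS_PER_PAGE = 4, IMAGE_PAGES = 2, IMAGE_OBJS = 8,
       HEAP_OBJS = 4, MAX_REFS = 2 };

struct obj
{
  bool marked;
  struct obj *refs[MAX_REFS];
};

static struct obj image_objs[IMAGE_OBJS];   /* "dumped" objects */
static struct obj heap_objs[HEAP_OBJS];     /* heap-allocated objects */
static bool page_dirty[IMAGE_PAGES];        /* stand-in for the OS query */

static bool
in_image (const struct obj *o)
{
  uintptr_t p = (uintptr_t) o;
  return p >= (uintptr_t) image_objs
         && p < (uintptr_t) (image_objs + IMAGE_OBJS);
}

static size_t
page_of (const struct obj *o)
{
  assert (in_image (o));
  return (size_t) (o - image_objs) / OBJS_PER_PAGE;
}

static void
mark (struct obj *o)
{
  if (!o)
    return;
  /* Objects on unmodified image pages count as already marked, so we
     neither write to them nor read their fields.  */
  if (in_image (o) && !page_dirty[page_of (o)])
    return;
  if (o->marked)
    return;
  o->marked = true;
  for (size_t i = 0; i < MAX_REFS; i++)
    mark (o->refs[i]);
}

void
mark_phase (struct obj **roots, size_t nroots)
{
  for (size_t i = 0; i < nroots; i++)
    mark (roots[i]);
  /* Treat every image object on a modified page as its own root, since
     it may be reachable only through unmodified pages we skipped.  */
  for (size_t i = 0; i < IMAGE_OBJS; i++)
    if (page_dirty[i / OBJS_PER_PAGE])
      mark (&image_objs[i]);
}

/* image_objs[0] sits on a clean page and points to image_objs[5], which
   was modified after the dump to point to heap_objs[2].  */
void
setup_example (void)
{
  memset (image_objs, 0, sizeof image_objs);
  memset (heap_objs, 0, sizeof heap_objs);
  memset (page_dirty, 0, sizeof page_dirty);
  image_objs[0].refs[0] = &image_objs[5];
  image_objs[5].refs[0] = &heap_objs[2];
  page_dirty[1] = true;
}
```

After setup_example (), running mark_phase with image_objs[0] as the only
root marks heap_objs[2] through the dirty-page scan, even though the
traversal from image_objs[0] stops immediately. image_objs[0] itself is
never written, so its page would stay clean.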

This way, we can avoid touching most dumped data structures during GC. 
We might modify them for other reasons, though, like setting symbol 
value cells --- but if my quick and dirty GC test worked correctly, we 
should still save quite a bit on commit charge without worrying about 
these cases.


