unofficial mirror of emacs-devel@gnu.org 
* Time to drop the pre-dump phase in the build?
@ 2014-01-10 19:15 Eric S. Raymond
  2014-01-10 19:49 ` Eli Zaretskii
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Eric S. Raymond @ 2014-01-10 19:15 UTC (permalink / raw)
  To: emacs-devel

My current transition task is still tag cleanup and signing. I'll
report on that shortly.

I had some off-list conversation with one of our lurkers about
tag cleaning. During it he made an interestingly radical suggestion:
Maybe it's time to stop pre-dumping compiled Lisp into the Emacs build.

While this made sense as a performance hack back in the day, hardware
(most relevantly disk I/O) is much, *much* faster now.  And SSDs are
making access to disk not much slower than main memory. Compilation
on demand might be fast enough today.

There are good reasons to think about dropping this technique:

(1) It makes cross-build of Emacs a pain in the ass.

(2) Even in the non-crossbuild case, it requires a whole lot of
    build-system hair we could otherwise do without.

(3) Back when I last looked at it (admittedly a long time ago) 
    the dump code was both the largest single source of porting
    problems and a serious attractor of crash bugs.  

(4) We're presently buying some startup speed at the cost of a larger
    minimum working set.  I don't *know* that this is a bad trade
    under modern cache hierarchies, but I think the question deserves
    examination.

If anybody wants to own this problem, comparative benchmarking seems
like a good place to start.  That is, hard numbers about the 
actual performance effects of pre-dumping.  That'd head off a
lot of arguments, anyway.

(Why, yes.  I *do* enjoy shaking up people's long-held assumptions.
This wasn't obvious already?)
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Every election is a sort of advance auction sale of stolen goods. 
	-- H.L. Mencken 



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:15 Time to drop the pre-dump phase in the build? Eric S. Raymond
@ 2014-01-10 19:49 ` Eli Zaretskii
  2014-01-11  6:16   ` Paul Eggert
                     ` (2 more replies)
  2014-01-10 19:51 ` Stefan Monnier
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 25+ messages in thread
From: Eli Zaretskii @ 2014-01-10 19:49 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: emacs-devel

> From: esr@thyrsus.com (Eric S. Raymond)
> Date: Fri, 10 Jan 2014 14:15:30 -0500 (EST)
> 
> (2) Even in the non-crossbuild case, it requires a whole lot of
>     build-system hair we could otherwise do without.

Like what?

> (3) Back when I last looked at it (admittedly a long time ago) 
>     the dump code was both the largest single source of porting
>     problems and a serious attractor of crash bugs.  

Didn't hear about these in a while, perhaps several years.

> (4) We're presently buying some startup speed at the cost of a larger
>     minimum working set.

That's not true: we only preload stuff that is almost immediately
necessary anyway.  You'd have almost the same footprint before you
type anything in Emacs after it starts, even if you start "emacs -Q",
let alone a full-blown session that loads a .emacs.

In any case, without showing numbers for the footprint, and some
analysis of which files might not be needed right away, it's very hard
to have a rational discussion.

> If anybody wants to own this problem, comparative benchmarking seems
> like a good place to start.  That is, hard numbers about the 
> actual performance effects of pre-dumping.  That'd head off a
> lot of arguments, anyway.

I suggest filing a feature request bug report, so that this (and any
followups) gets recorded.

> (Why, yes.  I *do* enjoy shaking up people's long-held assumptions.
> This wasn't obvious already?)

Let's have one revolution at a time, shall we?




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:15 Time to drop the pre-dump phase in the build? Eric S. Raymond
  2014-01-10 19:49 ` Eli Zaretskii
@ 2014-01-10 19:51 ` Stefan Monnier
  2014-01-10 20:09   ` Eric S. Raymond
  2014-01-10 20:13   ` Eli Zaretskii
  2014-01-10 20:20 ` Barry Warsaw
  2014-01-10 22:19 ` Daniel Colascione
  3 siblings, 2 replies; 25+ messages in thread
From: Stefan Monnier @ 2014-01-10 19:51 UTC (permalink / raw)
  To: Eric S. Raymond; +Cc: emacs-devel

> If anybody wants to own this problem, comparative benchmarking seems
> like a good place to start.  That is, hard numbers about the 
> actual performance effects of pre-dumping.  That'd head off a
> lot of arguments, anyway.

On my main desktop machine (AMD E350 with 6GB of RAM):

   time src/temacs --batch => 10s
   time src/emacs --batch => 0.07s

This is on a Samsung 840 SSD.
So, as long as the real startup (GUI and .emacs) takes less than 10s,
I think this idea is a non-starter.
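
A minimal sketch of how such a comparison could be repeated and summarized, in Python. The helper names `median_wall_time` and `speedup` are hypothetical (they are not part of any Emacs tooling), and the choice of a median over several runs is an assumption made here to smooth out cache effects.

```python
import statistics
import subprocess
import time


def median_wall_time(argv, runs=5):
    """Run argv `runs` times; return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast run is than the slow one."""
    return slow_seconds / fast_seconds
```

With the figures above, `speedup(10, 0.07)` comes out at roughly 143. In a build tree, one would pass something like `["src/temacs", "--batch"]` and `["src/emacs", "--batch"]` to `median_wall_time`; those paths are taken from the commands quoted in this message.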


        Stefan




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:51 ` Stefan Monnier
@ 2014-01-10 20:09   ` Eric S. Raymond
  2014-01-10 20:13   ` Eli Zaretskii
  1 sibling, 0 replies; 25+ messages in thread
From: Eric S. Raymond @ 2014-01-10 20:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca>:
> > If anybody wants to own this problem, comparative benchmarking seems
> > like a good place to start.  That is, hard numbers about the 
> > actual performance effects of pre-dumping.  That'd head off a
> > lot of arguments, anyway.
> 
> On my main desktop machine (AMD E350 with 6GB of RAM):
> 
>    time src/temacs --batch => 10s
>    time src/emacs --batch => 0.07s
> 
> This is on a Samsung 840 SSD.
> So, as long as the real startup (GUI and .emacs) takes less than 10s,
> I think this idea is a non-starter.

Fair enough.  I wasn't attached to it, but asking the question seemed
like a good thing.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:51 ` Stefan Monnier
  2014-01-10 20:09   ` Eric S. Raymond
@ 2014-01-10 20:13   ` Eli Zaretskii
  1 sibling, 0 replies; 25+ messages in thread
From: Eli Zaretskii @ 2014-01-10 20:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: esr, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 10 Jan 2014 14:51:30 -0500
> Cc: emacs-devel@gnu.org
> 
> On my main desktop machine (AMD E350 with 6GB of RAM):
> 
>    time src/temacs --batch => 10s
>    time src/emacs --batch => 0.07s

Here I have similar numbers:

  time src/temacs --batch => 6.9s
  time src/emacs --batch => 0.075s




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:15 Time to drop the pre-dump phase in the build? Eric S. Raymond
  2014-01-10 19:49 ` Eli Zaretskii
  2014-01-10 19:51 ` Stefan Monnier
@ 2014-01-10 20:20 ` Barry Warsaw
  2014-01-10 20:30   ` Eli Zaretskii
  2014-01-10 22:19 ` Daniel Colascione
  3 siblings, 1 reply; 25+ messages in thread
From: Barry Warsaw @ 2014-01-10 20:20 UTC (permalink / raw)
  To: emacs-devel


On Jan 10, 2014, at 02:15 PM, Eric S. Raymond wrote:

>Maybe it's time to stop pre-dumping compiled Lisp into the Emacs build.

Interestingly enough albeit tangential, I haven't even byte-compiled my
personal elisp files in *decades*.  I just load them from source, which works
much better with having them under a vcs.  Performance (startup or runtime)
hasn't been a problem since I made the switch.

I'm not suggesting that for all of Emacs's elisp of course.

-Barry


* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 20:20 ` Barry Warsaw
@ 2014-01-10 20:30   ` Eli Zaretskii
  2014-01-10 21:06     ` Barry Warsaw
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2014-01-10 20:30 UTC (permalink / raw)
  To: Barry Warsaw; +Cc: emacs-devel

> From: Barry Warsaw <barry@python.org>
> Date: Fri, 10 Jan 2014 15:20:35 -0500
> 
> Interestingly enough albeit tangential, I haven't even byte-compiled my
> personal elisp files in *decades*.  I just load them from source

Did you measure performance?  Byte-compiled code is about 2 times
faster.

> Performance (startup or runtime) hasn't been a problem since I made
> the switch.

If you are used to slightly slower operation, you will in time stop
paying attention to the slow-down.  E.g., I run an unoptimized build
of Emacs most of the time (due to better debugging opportunities), and
think it is fast enough -- until I fire up an optimized build and am
amazed by its speed.




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 20:30   ` Eli Zaretskii
@ 2014-01-10 21:06     ` Barry Warsaw
  0 siblings, 0 replies; 25+ messages in thread
From: Barry Warsaw @ 2014-01-10 21:06 UTC (permalink / raw)
  Cc: emacs-devel


On Jan 10, 2014, at 10:30 PM, Eli Zaretskii wrote:

>> Interestingly enough albeit tangential, I haven't even byte-compiled my
>> personal elisp files in *decades*.  I just load them from source
>
>Did you measure performance?  Byte-compiled code is about 2 times
>faster.

Not really, but two trends make it not really worth it:

1) Computers are so much more insanely faster now than when I started with
Emacs, or even started vc'ing my personal elisp.

2) So many of my personal hacks have been subsumed by standard Emacs
functionality that I'm really down to just a handful of files now.

>> Performance (startup or runtime) hasn't been a problem since I made
>> the switch.
>
>If you are used to slightly slower operation, you will in time stop
>paying attention to the slow-down.  E.g., I run an unoptimized build
>of Emacs most of the time (due to better debugging opportunities), and
>think it is fast enough -- until I fire up an optimized build and am
>amazed by its speed.

Fast enough is good enough. :)

-Barry


* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:15 Time to drop the pre-dump phase in the build? Eric S. Raymond
                   ` (2 preceding siblings ...)
  2014-01-10 20:20 ` Barry Warsaw
@ 2014-01-10 22:19 ` Daniel Colascione
  2014-01-10 22:58   ` David Kastrup
  2014-01-10 23:23   ` Stefan Monnier
  3 siblings, 2 replies; 25+ messages in thread
From: Daniel Colascione @ 2014-01-10 22:19 UTC (permalink / raw)
  To: Eric S. Raymond, emacs-devel

On 01/10/2014 11:15 AM, Eric S. Raymond wrote:
> My current transition task is still tag cleanup and signing. I'll
> report on that shortly.
>
> I had some off-list conversation with one of our lurkers about
> tag cleaning. During it he made an interestingly radical suggestion:
> Maybe it's time to stop pre-dumping compiled Lisp into the Emacs build.

Disagree. As other benchmarks in this thread indicate, dumping is still 
a very useful optimization. Besides: the build complexity is 
well-understood.

> While this made sense as a performance hack back in the day, hardware
> (most relevantly disk I/O) is much, *much* faster now.  And SSDs are
> making access to disk not much slower than main memory. Compilation
> on demand might be fast enough today.

Not everyone has an SSD.

> There are good reasons to think about dropping this technique:
>
> (1) It makes cross-build of Emacs a pain in the ass.

Meh?

> (3) Back when I last looked at it (admittedly a long time ago)
>      the dump code was both the largest single source of porting
>      problems and a serious attractor of crash bugs.

That's why the XEmacs portable dumper is better than the current Emacs 
setup. But not by enough to get distracted with ripping the guts out of 
the system.

>
> (4) We're presently buying some startup speed at the cost of a larger
>      minimum working set.

The minimum working set is zero. Modern operating systems demand-page 
necessary information. The dumped information is file-backed, so the 
commit charge is zero as well.
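
A small sketch of the mechanism being described, assuming Python's `mmap` module as a stand-in for what the OS does with the dumped image. The function name `private_mapping_demo` and the payload are hypothetical. It maps a file copy-on-write, reads it, writes to the mapping, and checks that the file on disk is untouched. It does not measure working set or commit charge.

```python
import mmap
import os
import tempfile


def private_mapping_demo(payload=b"dumped lisp data"):
    """Map a file copy-on-write, modify the view, return what each side sees."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        # ACCESS_COPY gives a private mapping: reads are served from the
        # file's pages, writes go to a private copy and never reach the file.
        with mmap.mmap(fd, len(payload), access=mmap.ACCESS_COPY) as view:
            before = bytes(view[:])
            view[0:1] = b"D"
            after = bytes(view[:])
        with open(path, "rb") as f:
            on_disk = f.read()
    finally:
        os.close(fd)
        os.unlink(path)
    return before, after, on_disk
```

Pages that are only read can stay backed by the file; only pages that are written need private memory, which is the distinction the paragraph above relies on.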

>      under modern cache hierarchies, but I think the question deserves
>      examination.

Unless circumstances have materially changed (as they would if, say, 
non-volatile main memory became common), I don't want to waste time 
rehashing old debates.




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 22:19 ` Daniel Colascione
@ 2014-01-10 22:58   ` David Kastrup
  2014-01-11  0:05     ` Daniel Colascione
  2014-01-10 23:23   ` Stefan Monnier
  1 sibling, 1 reply; 25+ messages in thread
From: David Kastrup @ 2014-01-10 22:58 UTC (permalink / raw)
  To: emacs-devel

Daniel Colascione <dancol@dancol.org> writes:

> On 01/10/2014 11:15 AM, Eric S. Raymond wrote:
>
>> (4) We're presently buying some startup speed at the cost of a larger
>>      minimum working set.
>
> The minimum working set is zero. Modern operating systems demand-page
> necessary information.

That's a popular misconception.  The key point to note is "page" in
demand-paging.  Unless one garbage-collects, topologically sorts, and
compacts the memory, most of the stuff that gets paged in along with
the required data will never be accessed because it is unrelated.  Now
a temacs dump has not seen much action with regard to fragmentation,
but the normal Lisp programming styles still allocate and release
enough transient memory that the image will be mixed up quite a bit
more than byte-compiled files will be.  Of course, if the byte-compiled
files are small, you'll get into block waste as well.
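
The page-granularity point can be made concrete with a toy calculation. This is a sketch under stated assumptions: a 4096-byte page, and two made-up layouts of one hundred 16-byte objects (`packed` and `scattered` are illustrative data, not measurements of an Emacs image).

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pages_touched(objects, page_size=PAGE_SIZE):
    """Return the set of page numbers covered by (address, size) pairs."""
    pages = set()
    for address, size in objects:
        first = address // page_size
        last = (address + size - 1) // page_size
        pages.update(range(first, last + 1))
    return pages


def paged_in_bytes(objects, page_size=PAGE_SIZE):
    """Bytes that must be resident to access every listed object."""
    return len(pages_touched(objects, page_size)) * page_size


# 100 objects of 16 bytes each, laid out back to back.
packed = [(i * 16, 16) for i in range(100)]
# The same 100 objects, one per page, with unrelated data in between.
scattered = [(i * PAGE_SIZE, 16) for i in range(100)]
```

Both layouts need the same 1600 bytes of live data, but the packed one fits in a single page while the scattered one pulls in 100 pages.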

-- 
David Kastrup





* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 22:19 ` Daniel Colascione
  2014-01-10 22:58   ` David Kastrup
@ 2014-01-10 23:23   ` Stefan Monnier
  2014-01-11  0:07     ` Daniel Colascione
  2014-01-11 20:13     ` Glenn Morris
  1 sibling, 2 replies; 25+ messages in thread
From: Stefan Monnier @ 2014-01-10 23:23 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eric S. Raymond, emacs-devel

> That's why the XEmacs portable dumper is better than the current Emacs
> setup.

Right, a portable dumper would be nice to have.
Tho I don't know enough of the details to know what are the downsides
(e.g. does it require relocation?  If so that means the file can't just
be mmap'd read-only and shared between processes, right?).


        Stefan




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 22:58   ` David Kastrup
@ 2014-01-11  0:05     ` Daniel Colascione
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Colascione @ 2014-01-11  0:05 UTC (permalink / raw)
  To: David Kastrup, emacs-devel

On 01/10/2014 02:58 PM, David Kastrup wrote:
> Daniel Colascione <dancol@dancol.org> writes:
>
>> On 01/10/2014 11:15 AM, Eric S. Raymond wrote:
>>
>>> (4) We're presently buying some startup speed at the cost of a larger
>>>       minimum working set.
>>
>> The minimum working set is zero. Modern operating systems demand-page
>> necessary information.
>
> That's a popular misconception.  The key point to note is "page" in
> demand-paging.  Unless one garbage-collects, topologically sorts, and
> compacts the memory, most of the stuff that gets paged in along with
> the required data will never be accessed because it is unrelated.  Now
> a temacs dump has not seen much action with regard to fragmentation,
> but the normal Lisp programming styles still allocate and release
> enough transient memory that the image will be mixed up quite a bit
> more than byte-compiled files will be.  Of course, if the byte-compiled
> files are small, you'll get into block waste as well.

~/edev/trunk/src
$ ps -eo pid,rss,cmd | grep '[e]macs'
31132 38516 ./temacs -Q
31136 31312 ./emacs -Q

[For emacs]
Address           Kbytes     RSS   Dirty Mode   Mapping
0000000000827000   11320   10552    8020 rw---  emacs




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 23:23   ` Stefan Monnier
@ 2014-01-11  0:07     ` Daniel Colascione
  2014-01-11  2:58       ` Stefan Monnier
  2014-01-11 20:13     ` Glenn Morris
  1 sibling, 1 reply; 25+ messages in thread
From: Daniel Colascione @ 2014-01-11  0:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eric S. Raymond, emacs-devel

On 01/10/2014 03:23 PM, Stefan Monnier wrote:
>> That's why the XEmacs portable dumper is better than the current Emacs
>> setup.
>
> Right, a portable dumper would be nice to have.
> Tho I don't know enough of the details to know what are the downsides
> (e.g. does it require relocation?  If so that means the file can't just
> be mmap'd read-only and shared between processes, right?).

If I'm reading the XEmacs Internals documentation properly, pdump *may* 
require relocation, but not if the offset at which the dumpfile is 
loaded happens to match the offset at which it was dumped. That's about 
as good as you can ask for.
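
A toy model of the relocation question, as a sketch only: this is not the XEmacs pdump format. The dump is modelled as a list of words plus a list of slots known to hold absolute addresses; `load_dump` and its parameters are hypothetical names. Pointers need patching only when the load address differs from the address recorded at dump time.

```python
def load_dump(words, pointer_slots, dumped_base, load_base):
    """Return (image, relocated_count) for a toy dump.

    words:         list of ints standing in for the dumped heap
    pointer_slots: indexes of the words that hold absolute addresses
    """
    delta = load_base - dumped_base
    if delta == 0:
        # Addresses are already valid, so the dump can be used untouched
        # (in a real system: mapped read-only and shared).
        return words, 0
    # Otherwise a private, writable copy is needed so pointers can be fixed.
    image = list(words)
    for slot in pointer_slots:
        image[slot] += delta
    return image, len(pointer_slots)
```

The relocating branch is the costly one: it has to write to the image, so those pages can no longer be shared between processes.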




* Re: Time to drop the pre-dump phase in the build?
  2014-01-11  0:07     ` Daniel Colascione
@ 2014-01-11  2:58       ` Stefan Monnier
  2014-01-11  3:37         ` Daniel Colascione
  0 siblings, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2014-01-11  2:58 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eric S. Raymond, emacs-devel

>>> That's why the XEmacs portable dumper is better than the current Emacs
>>> setup.
>> Right, a portable dumper would be nice to have.
>> Tho I don't know enough of the details to know what are the downsides
>> (e.g. does it require relocation?  If so that means the file can't just
>> be mmap'd read-only and shared between processes, right?).
> If I'm reading the XEmacs Internals documentation properly, pdump *may*
> require relocation, but not if the offset at which the dumpfile is loaded
> happens to match the offset at which it was dumped. That's about as good
> as you can ask for.

Which begs the question: when is it the case that the offset matches?
Can we assume it to be the common case?


        Stefan




* Re: Time to drop the pre-dump phase in the build?
  2014-01-11  2:58       ` Stefan Monnier
@ 2014-01-11  3:37         ` Daniel Colascione
  2014-01-11  5:13           ` Stefan Monnier
  2014-01-11 16:14           ` Stephen J. Turnbull
  0 siblings, 2 replies; 25+ messages in thread
From: Daniel Colascione @ 2014-01-11  3:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eric S. Raymond, stephen, Emacs developers

On 01/10/2014 06:58 PM, Stefan Monnier wrote:
>>>> That's why the XEmacs portable dumper is better than the current Emacs
>>>> setup.
>>> Right, a portable dumper would be nice to have.
>>> Tho I don't know enough of the details to know what are the downsides
>>> (e.g. does it require relocation?  If so that means the file can't just
>>> be mmap'd read-only and shared between processes, right?).
>> If I'm reading the XEmacs Internals documentation properly, pdump *may*
>> require relocation, but not if the offset at which the dumpfile is loaded
>> happens to match the offset at which it was dumped. That's about as good
>> as you can ask for.
>
> Which begs the question: when is it the case that the offset matches?
> Can we assume it to be the common case?

Someone who actually uses XEmacs can probably provide better commentary 
(+ Stephen), but I imagine that in a 64-bit address space, you'll pretty 
likely be able to map the dump file in the same place every time. On a 
32-bit system with ASLR, maybe not as often. Cygwin has similar problems 
involving fork.

Another possibility is to just allocate enough space in the emacs image 
itself in BSS, then replace that mapping with a view of the dump file. 
(This way, we always map the dump file at the same place relative to the 
emacs image base). Or we can make the dump file a section in the image, 
but at that point, we're starting to talk about portability problems again.

By the way: is it me, or are we dirtying far too much of the current 
emacs image? On my Emacs, we're dirtying (and COWing) 8MB; if I make 
Fgarbage_collect a no-op, that drops to 4MB.




* Re: Time to drop the pre-dump phase in the build?
  2014-01-11  3:37         ` Daniel Colascione
@ 2014-01-11  5:13           ` Stefan Monnier
  2014-01-11  5:30             ` Daniel Colascione
  2014-01-11 16:14           ` Stephen J. Turnbull
  1 sibling, 1 reply; 25+ messages in thread
From: Stefan Monnier @ 2014-01-11  5:13 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eric S. Raymond, stephen, Emacs developers

> Another possibility is to just allocate enough space in the emacs image
> itself in BSS, then replace that mapping with a view of the dump file.

Indeed, that should work, assuming you can mmap into existing space.

> image base). Or we can make the dump file a section in the image, but at
> that point, we're starting to talk about portability problems again.

But not nearly as bad: the main dump problem we have is with generating
the `emacs' executable, whereas here we'd only need to generate the
"swap file" which is later loaded into the same executable.
Should still be a lot more portable.

> By the way: is it me, or are we dirtying far too much of the current emacs
> image? On my Emacs, we're dirtying (and COWing) 8MB; if I make
> Fgarbage_collect a no-op, that drops to 4MB.

For sure, GC will dirty up pretty much all pages that hold Lisp objects
(except for those in the purespace), because of the need to set/reset
the `mark' bit.
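
To illustrate why in-object mark bits dirty pages, here is a toy mark phase, sketched under simplifying assumptions: a 4096-byte page, a heap modelled as a dict from object address to the addresses it references, and one mark bit stored inside each object. The name `mark_and_count_dirty_pages` is hypothetical. It is not the Emacs collector.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def mark_and_count_dirty_pages(heap, roots, page_size=PAGE_SIZE):
    """Mark everything reachable from roots.

    heap: dict mapping an object's address to the addresses it points to.
    Returns (marked_addresses, pages_written).
    """
    marked = set()
    dirty_pages = set()
    stack = list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked:
            continue
        # Setting the mark bit is a write into the object itself...
        marked.add(addr)
        # ...so the page holding that object becomes dirty.
        dirty_pages.add(addr // page_size)
        stack.extend(heap[addr])
    return marked, dirty_pages
```

Every page holding at least one reachable object ends up written, even if the program never modified the objects themselves.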


        Stefan




* Re: Time to drop the pre-dump phase in the build?
  2014-01-11  5:13           ` Stefan Monnier
@ 2014-01-11  5:30             ` Daniel Colascione
  0 siblings, 0 replies; 25+ messages in thread
From: Daniel Colascione @ 2014-01-11  5:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eric S. Raymond, stephen, Emacs developers

On 01/10/2014 09:13 PM, Stefan Monnier wrote:
>> Another possibility is to just allocate enough space in the emacs image
>> itself in BSS, then replace that mapping with a view of the dump file.
>
> Indeed, that should work, assuming you can mmap into existing space.

On POSIX-y systems, you can just mmap on top of the existing section. On 
Windows, you have to unmap first, but I think it could be made to work.

> But not nearly as bad: the main dump problem we have is with generating
> the `emacs' executable, whereas here we'd only need to generate the
> "swap file" which is later loaded into the same executable.
> Should still be a lot more portable.

Do you mean building emacs with a large blob of zero in .data, using it 
as a heap, and replacing the contents of that section (without modifying 
the executable image structure) to actually "dump" emacs?

>> By the way: is it me, or are we dirtying far too much of the current emacs
>> image? On my Emacs, we're dirtying (and COWing) 8MB; if I make
>> Fgarbage_collect a no-op, that drops to 4MB.
>
> For sure, GC will dirty up pretty much all pages that hold Lisp objects
> (except for those in the purespace), because of the need to set/reset
> the `mark' bit.

I was thinking about this problem. What if we were to just treat all 
image-backed objects as already marked if they're in pages that are 
unmodified? (We can perform this test very cheaply, at least on */Linux 
and Windows.) Then we wouldn't mark them during GC, and we additionally 
don't demand-page objects just for GC.

The problem we create is that we might have modified image-backed 
objects reachable only from unmodified image-backed objects, and these 
modified objects might point to heap-allocated objects that we really 
should mark. So what if we walk the per-type allocation lists during the 
*mark* phase and treat all in-image objects on modified pages as 
individual roots? This way, we eventually mark all heap-allocated 
objects. (Let's assume that no image-backed unmodified object can 
directly point to a heap-allocated object.)

This way, we can avoid touching most dumped data structures during GC. 
We might modify them for other reasons, though, like setting symbol 
value cells --- but if my quick and dirty GC test worked correctly, we 
should still save quite a bit on commit charge without worrying about 
these cases.
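
The scheme above can be sketched on a toy heap. Everything here is an assumption for illustration: the heap is a dict from address to referenced addresses, `image_objects` is the set of image-backed addresses, and `modified_pages` stands in for whatever cheap per-page "was this written?" test the OS provides. As stated above, it also assumes no unmodified image-backed object points directly at a heap-allocated object.

```python
def mark_skipping_clean_image(heap, roots, image_objects, modified_pages,
                              page_size=4096):
    """Toy mark phase that never touches image objects on unmodified pages.

    Returns (marked_addresses, pages_written).
    """
    def on_clean_image_page(addr):
        return addr in image_objects and addr // page_size not in modified_pages

    # Image objects on modified pages become extra roots, so anything they
    # reference is still found even if only clean image objects lead to them.
    extra_roots = [a for a in image_objects if a // page_size in modified_pages]

    marked, pages_written = set(), set()
    stack = list(roots) + extra_roots
    while stack:
        addr = stack.pop()
        if addr in marked or on_clean_image_page(addr):
            # Clean image objects count as already marked: no write, no walk.
            continue
        marked.add(addr)
        pages_written.add(addr // page_size)
        stack.extend(heap[addr])
    return marked, pages_written
```

In a case where a clean image object points to a modified image object, which in turn points to a heap object, the heap object is still marked via the extra roots, while the clean image page is never written.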




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:49 ` Eli Zaretskii
@ 2014-01-11  6:16   ` Paul Eggert
  2014-01-11  7:17   ` Richard Stallman
  2014-01-11  7:17   ` Richard Stallman
  2 siblings, 0 replies; 25+ messages in thread
From: Paul Eggert @ 2014-01-11  6:16 UTC (permalink / raw)
  To: Eli Zaretskii, Eric S. Raymond; +Cc: emacs-devel

>> >(3) Back when I last looked at it (admittedly a long time ago)
>> >     the dump code was both the largest single source of porting
>> >     problems and a serious attractor of crash bugs.

> Didn't hear about these in a while, perhaps several years.

I ran into a porting problem a couple of weeks ago:
the Emacs dump approach doesn't work with AddressSanitizer
on GNU/Linux, e.g., gcc -fsanitize=address if you're using
GCC 4.8 or later.  It turns out that John Wiegley reported
the problem on emacs-devel in 2012; see:

http://lists.gnu.org/archive/html/emacs-devel/2012-06/msg00600.html

AddressSanitizer would be a useful bug-squasher for Emacs,
I expect.




* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:49 ` Eli Zaretskii
  2014-01-11  6:16   ` Paul Eggert
@ 2014-01-11  7:17   ` Richard Stallman
  2014-01-12  0:16     ` Nix
  2014-01-11  7:17   ` Richard Stallman
  2 siblings, 1 reply; 25+ messages in thread
From: Richard Stallman @ 2014-01-11  7:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: esr, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

    > (3) Back when I last looked at it (admittedly a long time ago) 
    >     the dump code was both the largest single source of porting
    >     problems and a serious attractor of crash bugs.  

    Didn't hear about these in a while, perhaps several years.

20 years ago we needed to port the dumping code to various different
systems.  It was substantial work.  But it seems there have been no new
such systems in a long time, and that code seems to be stable.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call.





* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 19:49 ` Eli Zaretskii
  2014-01-11  6:16   ` Paul Eggert
  2014-01-11  7:17   ` Richard Stallman
@ 2014-01-11  7:17   ` Richard Stallman
  2 siblings, 0 replies; 25+ messages in thread
From: Richard Stallman @ 2014-01-11  7:17 UTC (permalink / raw)
  To: esr, Eli Zaretskii; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

Without dumping, Emacs startup would be comparable to
temacs -l loadup.  That takes almost 2 minutes on my machine.
emacs -Q takes about 3 seconds.

Maybe you can make it faster than that.
Perhaps half of the files would not need to be loaded at startup.
It would not need to do GC as much as it does.
Nonetheless it will be much slower than emacs -Q is now.

Thus, I am strongly opposed to this change.
Perhaps it would be ok, considering only fast machines.
But Emacs has to be good on slower machines too.

I'm not against eliminating dumping if and when the benefits of
dumping are truly no longer needed.  If you can make the initial
loading 20 times as fast as temacs -l loadup, it would be just a small
annoyance.


-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call.





* Re: Time to drop the pre-dump phase in the build?
  2014-01-11  3:37         ` Daniel Colascione
  2014-01-11  5:13           ` Stefan Monnier
@ 2014-01-11 16:14           ` Stephen J. Turnbull
  1 sibling, 0 replies; 25+ messages in thread
From: Stephen J. Turnbull @ 2014-01-11 16:14 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eric S. Raymond, Stefan Monnier, Emacs developers

Daniel Colascione writes:

 > > Which begs the question: when is it the case that the offset matches?
 > > Can we assume it to be the common case?
 > 
 > Someone who actually uses XEmacs can probably provide better commentary 
 > (+ Stephen), but I imagine that in a 64-bit address space, you'll pretty 
 > likely be able to map the dump file in the same place every time.

I don't really know.  I'm pretty sure you have to avoid ASLR, XEmacs's
pdump doesn't know how to cope with that in any case IIRC.

Olivier Galibert would know.  Maybe Marcus Crestani or Martin
Buchholz.






* Re: Time to drop the pre-dump phase in the build?
  2014-01-10 23:23   ` Stefan Monnier
  2014-01-11  0:07     ` Daniel Colascione
@ 2014-01-11 20:13     ` Glenn Morris
  1 sibling, 0 replies; 25+ messages in thread
From: Glenn Morris @ 2014-01-11 20:13 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier wrote:

>> That's why the XEmacs portable dumper is better than the current Emacs
>> setup.
>
> Right, a portable dumper would be nice to have.
> Tho I don't know enough of the details to know what are the downsides
> (e.g. does it require relocation?  If so that means the file can't just
> be mmap'd read-only and shared between processes, right?).

Apparently there was an Emacs 21 patch, but AFAIK it never appeared:

http://lists.gnu.org/archive/html/emacs-devel/2002-04/msg00723.html
http://lists.gnu.org/archive/html/emacs-devel/2008-02/msg02194.html

And it broke in 22:
http://knagano.blogspot.com/2005/10/portable-dumper-revisited.html

If you read Japanese (I don't), there's a paper at:
http://lc.linux.or.jp/lc2002/papers/nagano0920h.pdf
  (Has some startup times listed!)

Source:  http://www.sodan.org/~knagano/emacs/pdump




* Re: Time to drop the pre-dump phase in the build?
  2014-01-11  7:17   ` Richard Stallman
@ 2014-01-12  0:16     ` Nix
  2014-01-12  3:48       ` Eli Zaretskii
  0 siblings, 1 reply; 25+ messages in thread
From: Nix @ 2014-01-12  0:16 UTC (permalink / raw)
  To: Richard Stallman; +Cc: esr, Eli Zaretskii, Stephen J. Turnbull, emacs-devel

On 11 Jan 2014, Richard Stallman outgrape:
> 20 years ago we needed to port the dumping code to various different
> systems.  It was substantial work.  But it seems there have been no new
> such systems in a long time, and that code seems to be stable.

It still requires substantial ugly hacks, e.g. there is code in glibc to
serialize and deserialize malloc state whose sole purpose is to support
Emacs dumping, and which cannot be changed since that would force Emacs
to be redumped when glibc was upgraded. It seems plausible that this
might eventually retard glibc allocator development :(

XEmacs long ago migrated to a 'portable undumper', whereby (IIRC) the
Lisp heap is serialized into a form that is then mmap()ed in at startup
time (using a separate file, so unexec() is no longer necessary). It was
a lot of work, but doing something similar might be worth considering in
the future anyway.
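
A minimal sketch of the idea described above, in Python rather than C for
brevity. It is not XEmacs's actual pdump format: the header layout, the
cell layout, and the names dump_list, load_list and roundtrip are all
assumptions made up for illustration. The point it tries to show is that
if the dumped heap stores file offsets instead of absolute pointers, the
file can be mapped read-only at any address and walked without being
patched, so ASLR should not matter and the pages could in principle be
shared between processes. A dump that stores absolute addresses would
instead need relocation at load time, which means writing to the mapping.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk layout (NOT XEmacs's real pdump format):
#   header: magic (4 bytes) + offset of the root cell (uint32)
#   cell:   value (int32) + offset of the next cell (uint32, 0 = end of list)
HEADER = struct.Struct("<4sI")
CELL = struct.Struct("<iI")
MAGIC = b"DUMP"


def dump_list(values, path):
    """Serialize a linked list, storing file offsets instead of pointers."""
    with open(path, "wb") as f:
        root = HEADER.size if values else 0
        f.write(HEADER.pack(MAGIC, root))
        for i, value in enumerate(values):
            is_last = i == len(values) - 1
            next_off = 0 if is_last else HEADER.size + (i + 1) * CELL.size
            f.write(CELL.pack(value, next_off))


def load_list(path):
    """Map the dump read-only and walk it by offset; nothing is relocated."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as image:
            magic, off = HEADER.unpack_from(image, 0)
            if magic != MAGIC:
                raise ValueError("not a dump file")
            values = []
            while off != 0:
                value, off = CELL.unpack_from(image, off)
                values.append(value)
            return values


def roundtrip(values):
    """Dump to a temporary file, then load it back through the mapping."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "heap.dump")
        dump_list(values, path)
        return load_list(path)
```

Calling roundtrip([3, 1, 4]) writes the three cells and reads them back
through the read-only mapping. A real dumper would also have to deal with
object types, alignment, and references into the executable itself, none
of which this sketch attempts.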

-- 
NULL && (void)




* Re: Time to drop the pre-dump phase in the build?
  2014-01-12  0:16     ` Nix
@ 2014-01-12  3:48       ` Eli Zaretskii
  2014-01-12  3:53         ` Eric S. Raymond
  0 siblings, 1 reply; 25+ messages in thread
From: Eli Zaretskii @ 2014-01-12  3:48 UTC (permalink / raw)
  To: Nix; +Cc: esr, rms, turnbull, emacs-devel

> From: Nix <nix@esperi.org.uk>
> Emacs: ed  ::  20-megaton hydrogen bomb : firecracker
> Date: Sun, 12 Jan 2014 00:16:17 +0000
> Cc: esr@thyrsus.com, Eli Zaretskii <eliz@gnu.org>,
> 	"Stephen J. Turnbull" <turnbull@sk.tsukuba.ac.jp>, emacs-devel@gnu.org
> 
> It still requires substantial ugly hacks, e.g. there is code in glibc to
> serialize and deserialize malloc state whose sole purpose is to support
> Emacs dumping, and which cannot be changed since that would force Emacs
> to be redumped when glibc was upgraded. It seems plausible that this
> might eventually retard glibc allocator development :(
> 
> XEmacs long ago migrated to a 'portable undumper', whereby (IIRC) the
> Lisp heap is serialized into a form that is then mmap()ed in at startup
> time (using a separate file, so unexec() is no longer necessary). It was
> a lot of work, but doing something similar might be worth considering in
> the future anyway.

No change of this scale ever happens in Emacs, unless someone steps
forward and does the job, or most of it.  People who want this to
happen should take notice and act.  Talk won't cut it.




* Re: Time to drop the pre-dump phase in the build?
  2014-01-12  3:48       ` Eli Zaretskii
@ 2014-01-12  3:53         ` Eric S. Raymond
  0 siblings, 0 replies; 25+ messages in thread
From: Eric S. Raymond @ 2014-01-12  3:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Nix, rms, turnbull, emacs-devel

Eli Zaretskii <eliz@gnu.org>:
> > XEmacs long ago migrated to a 'portable undumper', whereby (IIRC) the
> > Lisp heap is serialized into a form that is then mmap()ed in at startup
> > time (using a separate file, so unexec() is no longer necessary). It was
> > a lot of work, but doing something similar might be worth considering in
> > the future anyway.
> 
> No change of this scale ever happens in Emacs, unless someone steps
> forward and does the job, or most of it.  People who want this to
> happen should take notice and act.  Talk won't cut it.

Since I raised the possibility: I won't have the bandwidth to do this
in the foreseeable future.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>




end of thread, other threads:[~2014-01-12  3:53 UTC | newest]

Thread overview: 25+ messages
2014-01-10 19:15 Time to drop the pre-dump phase in the build? Eric S. Raymond
2014-01-10 19:49 ` Eli Zaretskii
2014-01-11  6:16   ` Paul Eggert
2014-01-11  7:17   ` Richard Stallman
2014-01-12  0:16     ` Nix
2014-01-12  3:48       ` Eli Zaretskii
2014-01-12  3:53         ` Eric S. Raymond
2014-01-11  7:17   ` Richard Stallman
2014-01-10 19:51 ` Stefan Monnier
2014-01-10 20:09   ` Eric S. Raymond
2014-01-10 20:13   ` Eli Zaretskii
2014-01-10 20:20 ` Barry Warsaw
2014-01-10 20:30   ` Eli Zaretskii
2014-01-10 21:06     ` Barry Warsaw
2014-01-10 22:19 ` Daniel Colascione
2014-01-10 22:58   ` David Kastrup
2014-01-11  0:05     ` Daniel Colascione
2014-01-10 23:23   ` Stefan Monnier
2014-01-11  0:07     ` Daniel Colascione
2014-01-11  2:58       ` Stefan Monnier
2014-01-11  3:37         ` Daniel Colascione
2014-01-11  5:13           ` Stefan Monnier
2014-01-11  5:30             ` Daniel Colascione
2014-01-11 16:14           ` Stephen J. Turnbull
2014-01-11 20:13     ` Glenn Morris
