When should ralloc.c be used? (WAS: bug#24358)

unofficial mirror of emacs-devel@gnu.org 

* When should ralloc.c be used? (WAS: bug#24358)
       [not found]                   ` <831szqhbc2.fsf@gnu.org>
@ 2016-10-22  3:03                     ` npostavs
  2016-10-22  5:32                       ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: npostavs @ 2016-10-22  3:03 UTC (permalink / raw)
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
>>     Thread 1 "emacs" hit Hardware watchpoint 4: current_buffer->text->beg
>> 
>>     Old value = (unsigned char *) 0x18351b8 ""
>>     New value = (unsigned char *) 0x188a1b8 ""
>>     r_alloc_sbrk (size=290816) at ralloc.c:818
>
> r_alloc_sbrk?  What OS is this?  We only use ralloc.c on a handful of
> them, as of Emacs 25.

Should ralloc.c be used on GNU/Linux systems that have GNU libc?

I found this in my config.log (I think it's related to ralloc use,
though I'm finding the configure code a bit confusing):

    configure:11440: checking whether malloc is Doug Lea style
    configure:11461: gcc -o conftest -O0 -g3 -march=native     conftest.c  >&5 
    conftest.c: In function 'main':
    conftest.c:107:6: error: '__malloc_initialize_hook' undeclared (first use in this function)
          __malloc_initialize_hook = hook;
          ^~~~~~~~~~~~~~~~~~~~~~~~

It seems that my malloc.h does not declare __malloc_initialize_hook,
even though 'man 3 malloc_hook' says

       #include <malloc.h>

       void *(*__malloc_hook)(size_t size, const void *caller);

       void *(*__realloc_hook)(void *ptr, size_t size, const void *caller);

       void *(*__memalign_hook)(size_t alignment, size_t size,
                                const void *caller);

       void (*__free_hook)(void *ptr, const void *caller);

       void (*__malloc_initialize_hook)(void);

       void (*__after_morecore_hook)(void);

Perhaps this is because the recommended way to set this hook (which is
different from what the configure test is using) doesn't require a
declaration?

       The variable __malloc_initialize_hook points at a function that is called once when the
       malloc implementation is initialized.  This is a weak variable, so it can be overridden in
       the application with a definition like the following:

           void (*__malloc_initialize_hook)(void) = my_init_hook;




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-22  3:03                     ` When should ralloc.c be used? (WAS: bug#24358) npostavs
@ 2016-10-22  5:32                       ` Paul Eggert
  2016-10-22  7:29                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-22  5:32 UTC (permalink / raw)
  To: npostavs, emacs-devel

npostavs@users.sourceforge.net wrote:
> Should ralloc.c be used on GNU/Linux systems that have GNU libc?

Yes, with bleeding-edge glibc, as __malloc_initialize_hook has been removed. 
Evidently your man page is out of sync with your glibc.




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-22  5:32                       ` Paul Eggert
@ 2016-10-22  7:29                         ` Eli Zaretskii
  2016-10-22 18:34                           ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-22  7:29 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, npostavs

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Fri, 21 Oct 2016 22:32:49 -0700
> 
> npostavs@users.sourceforge.net wrote:
> > Should ralloc.c be used on GNU/Linux systems that have GNU libc?
> 
> Yes, with bleeding-edge glibc, as __malloc_initialize_hook has been removed. 

If that's the case, shouldn't we switch such glibc systems to use mmap
instead?  It should be free of at least some of the problems in
ralloc.c, I think.

Alternatively, how about supporting an external Doug Lea malloc
library (assuming such a library exists and Emacs can be linked
against it)?

ralloc.c is generally "bad news", we've gone to non-trivial efforts
during the last years to reduce its usage to the minimum.  I always
thought that only MSDOS and perhaps a few *BSD systems still use it.
Having it creep back into GNU/Linux is really a bad regression, IMO.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-22  7:29                         ` Eli Zaretskii
@ 2016-10-22 18:34                           ` Paul Eggert
  2016-10-22 19:43                             ` When should ralloc.c be used? Stefan Monnier
  2016-10-24  0:21                             ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  0 siblings, 2 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-22 18:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, npostavs

Eli Zaretskii wrote:
> Having it creep back into GNU/Linux is really a bad regression, IMO.

I don't like it either, but would rather work on redoing the build process so 
that we can use the native malloc on all hosts.




* Re: When should ralloc.c be used?
  2016-10-22 18:34                           ` Paul Eggert
@ 2016-10-22 19:43                             ` Stefan Monnier
  2016-10-23  2:37                               ` Paul Eggert
  2016-10-24  0:21                             ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-22 19:43 UTC (permalink / raw)
  To: emacs-devel

>> Having it creep back into GNU/Linux is really a bad regression, IMO.
> I don't like it either, but would rather work on redoing the build process
> so that we can use the native malloc on all hosts.

But that doesn't explain why we'd need to use ralloc in the mean time.


        Stefan





* Re: When should ralloc.c be used?
  2016-10-22 19:43                             ` When should ralloc.c be used? Stefan Monnier
@ 2016-10-23  2:37                               ` Paul Eggert
  2016-10-23  6:53                                 ` Eli Zaretskii
  2016-10-23 12:55                                 ` When should ralloc.c be used? Stefan Monnier
  0 siblings, 2 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-23  2:37 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

Stefan Monnier wrote:
> that doesn't explain why we'd need to use ralloc in the mean time.

I suppose you're right that we don't need to; we could instead hack on Emacs to 
get it to work without ralloc on recent glibc. If someone wants to do that, 
great. I'd rather spend my own limited cycles on fixing the main problem, which 
is unexec.




* Re: When should ralloc.c be used?
  2016-10-23  2:37                               ` Paul Eggert
@ 2016-10-23  6:53                                 ` Eli Zaretskii
  2016-10-23  7:57                                   ` Paul Eggert
  2016-10-23 16:44                                   ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier
  2016-10-23 12:55                                 ` When should ralloc.c be used? Stefan Monnier
  1 sibling, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23  6:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: monnier, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 22 Oct 2016 19:37:36 -0700
> 
> Stefan Monnier wrote:
> > that doesn't explain why we'd need to use ralloc in the mean time.
> 
> I suppose you're right that we don't need to; we could instead hack on Emacs to 
> get it to work without ralloc on recent glibc.

How about using mmap in those cases?

> If someone wants to do that, great. I'd rather spend my own limited
> cycles on fixing the main problem, which is unexec.

I thought we agreed to get rid of unexec by loading a single .elc file
at startup of Emacs, and remove the distinction between temacs and
emacs altogether.  Is that what you'd like to work on?

Thanks.




* Re: When should ralloc.c be used?
  2016-10-23  6:53                                 ` Eli Zaretskii
@ 2016-10-23  7:57                                   ` Paul Eggert
  2016-10-23  8:58                                     ` Eli Zaretskii
  2016-10-23 16:44                                   ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-23  7:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii wrote:

> How about using mmap in those cases?

I don't know, and would rather not spend time investigating.

> I thought we agreed to get rid of unexec by loading a single .elc file
> at startup of Emacs

Yes, if that performs well enough. We don't know yet whether it will. It's on my 
list of things to look into, but it's not trivial.




* Re: When should ralloc.c be used?
  2016-10-23  7:57                                   ` Paul Eggert
@ 2016-10-23  8:58                                     ` Eli Zaretskii
  2016-10-23  9:38                                       ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23  8:58 UTC (permalink / raw)
  To: Paul Eggert; +Cc: monnier, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 23 Oct 2016 00:57:21 -0700
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> > I thought we agreed to get rid of unexec by loading a single .elc file
> > at startup of Emacs
> 
> Yes, if that performs well enough. We don't know yet whether it will. It's on my 
> list of things to look into, but it's not trivial.

Can you share the concerns and the tests you'd like to be performed?
Perhaps others (myself included) could help with such testing.

Thanks.




* Re: When should ralloc.c be used?
  2016-10-23  8:58                                     ` Eli Zaretskii
@ 2016-10-23  9:38                                       ` Paul Eggert
  2016-10-23 12:50                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-23  9:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Eli Zaretskii wrote:
> Can you share the concerns and the tests you'd like to be performed?

I wrote something along those lines in Bug#23529; see the URL below. My main 
concern is startup time and energy. I don't have detailed benchmarks.

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23529#197




* Re: When should ralloc.c be used?
  2016-10-23  9:38                                       ` Paul Eggert
@ 2016-10-23 12:50                                         ` Eli Zaretskii
  2016-10-23 13:39                                           ` Stefan Monnier
  2016-10-23 15:22                                           ` Andreas Schwab
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 12:50 UTC (permalink / raw)
  To: Paul Eggert; +Cc: monnier, emacs-devel

Is it reasonable to require a version of glibc that still supports
__malloc_initialize_hook?  When people upgrade to a newer glibc, the
previous version is still left on the system, I presume (for programs
that need them which were built against those old versions)?

I took a look at our sources, and we have a lot of places where we
call malloc, directly or indirectly, while holding C pointers to data
of Lisp strings.  We also have several (maybe half a dozen) places
where the same happens with C pointers to buffer text.  Auditing all
of these and fixing them is a non-trivial job, so maybe we should try
to avoid the problems in the first place?  Is it a practical solution?


* Re: When should ralloc.c be used?
  2016-10-23 12:50                                         ` Eli Zaretskii
@ 2016-10-23 13:39                                           ` Stefan Monnier
  2016-10-23 14:01                                             ` Eli Zaretskii
  2016-10-23 15:22                                           ` Andreas Schwab
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 13:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

> Is it reasonable to require a version of glibc that still supports
> __malloc_initialize_hook?  When people upgrade to a newer glibc, the
> previous version is still left on the system, I presume (for programs
> that need them which were built against those old versions)?

Not necessarily, no.  E.g. it's not the case for fresh new installs.


        Stefan




* Re: When should ralloc.c be used?
  2016-10-23 13:39                                           ` Stefan Monnier
@ 2016-10-23 14:01                                             ` Eli Zaretskii
  2016-10-23 14:18                                               ` Stefan Monnier
  2016-10-23 18:19                                               ` Paul Eggert
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 14:01 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  emacs-devel@gnu.org
> Date: Sun, 23 Oct 2016 09:39:57 -0400
> 
> > Is it reasonable to require a version of glibc that still supports
> > __malloc_initialize_hook?  When people upgrade to a newer glibc, the
> > previous version is still left on the system, I presume (for programs
> > that need them which were built against those old versions)?
> 
> Not necessarily, no.  E.g. it's not the case for fresh new installs.

But they can downgrade, right?




* Re: When should ralloc.c be used?
  2016-10-23 14:01                                             ` Eli Zaretskii
@ 2016-10-23 14:18                                               ` Stefan Monnier
  2016-10-23 18:19                                               ` Paul Eggert
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 14:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

>> > Is it reasonable to require a version of glibc that still supports
>> > __malloc_initialize_hook?  When people upgrade to a newer glibc, the
>> > previous version is still left on the system, I presume (for programs
>> > that need them which were built against those old versions)?
>> Not necessarily, no.  E.g. it's not the case for fresh new installs.
> But they can downgrade, right?

Depends on the details of the distribution, but while it's technically
probably possible, it's not necessarily simple for the end-user.


        Stefan




* Re: When should ralloc.c be used?
  2016-10-23 14:01                                             ` Eli Zaretskii
  2016-10-23 14:18                                               ` Stefan Monnier
@ 2016-10-23 18:19                                               ` Paul Eggert
  2016-10-23 19:03                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-23 18:19 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: emacs-devel

Eli Zaretskii wrote:
> But they can downgrade, right?

No, as users don't necessarily have an older glibc to downgrade to. That train 
has already left the station. With my blessing, I might add.




* Re: When should ralloc.c be used?
  2016-10-23 18:19                                               ` Paul Eggert
@ 2016-10-23 19:03                                                 ` Eli Zaretskii
  2016-10-23 20:36                                                   ` Stefan Monnier
  2016-10-24  4:59                                                   ` Paul Eggert
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 19:03 UTC (permalink / raw)
  To: Paul Eggert; +Cc: monnier, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 23 Oct 2016 11:19:15 -0700
> 
> Eli Zaretskii wrote:
> > But they can downgrade, right?
> 
> No, as users don't necessarily have an older glibc to downgrade to. That train 
> has already left the station. With my blessing, I might add.

Then what are our choices to solve this for Emacs 25.2?  If GNU/Linux
starts using ralloc more and more, we will have crashes and data
corruption all over the place.  It's inconceivable to release 25.2 in
this state.

Any suggestions?




* Re: When should ralloc.c be used?
  2016-10-23 19:03                                                 ` Eli Zaretskii
@ 2016-10-23 20:36                                                   ` Stefan Monnier
  2016-10-24  6:54                                                     ` Eli Zaretskii
  2016-10-24  4:59                                                   ` Paul Eggert
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 20:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

> Then what are our choices to solve this for Emacs 25.2?  If GNU/Linux
> starts using ralloc more and more, we will have crashes and data
> corruption all over the place.  It's inconceivable to release 25.2 in
> this state.

What's wrong with using gmalloc without ralloc and with mmap'd buffers?


        Stefan




* Re: When should ralloc.c be used?
  2016-10-23 20:36                                                   ` Stefan Monnier
@ 2016-10-24  6:54                                                     ` Eli Zaretskii
  2016-10-24 10:15                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  6:54 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  emacs-devel@gnu.org
> Date: Sun, 23 Oct 2016 16:36:24 -0400
> 
> > Then what are our choices to solve this for Emacs 25.2?  If GNU/Linux
> > starts using ralloc more and more, we will have crashes and data
> > corruption all over the place.  It's inconceivable to release 25.2 in
> > this state.
> 
> What's wrong with using gmalloc without ralloc and with mmap'd buffers?

Nothing, if it works.  But someone should set up Emacs to do that, and
make sure the result builds, bootstraps, and works reliably,
i.e. doesn't have all the problems reported recently in this and
related bugs.

I don't have access to any platforms that are affected by this
(fencepost doesn't yet have such a new glibc).  I will do this myself
if no one else comes to help, but I really could use help from people
who work on platforms that are affected by this issue.  Noam and Sam
help, but we need more manpower and more expertise.

In any case, I asked what were our alternatives, because I'm not sure
we have a clear view of those.  Making decisions with just a peephole
view of the issues is never a good idea.  The best solution might not
be changing the configury to eliminate ralloc, it could be something
entirely different.

For example, I see in regex.c a set of special definitions for
REGEX_ALLOCATE_STACK and friends conditioned by this:

  #if defined REL_ALLOC && defined REGEX_MALLOC

These definitions call directly a few functions in ralloc.c, as
opposed to going via malloc.  Does anyone know what this is about?
Should we try building with REGEX_MALLOC on platforms that use
ralloc.c, and see whether the problems with regex searches triggered
by relocation go away?

Yet another idea is to enlarge the stack space available to SAFE_ALLOCA
in regex.c, so that the failure stack is allocated off the C runtime
stack, thus side-stepping the relocation issues.
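
To make the stack-versus-heap idea concrete, here is a minimal sketch.  It
is not Emacs code: TOY_MAX_STACK merely stands in for a limit like
MAX_ALLOCA, and toy_sum_with_scratch is a hypothetical helper.  Scratch
space below the limit comes from the C stack, so no allocator call (and
hence no relocation) can happen; only larger requests fall back to malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit, standing in for something like MAX_ALLOCA.  */
enum { TOY_MAX_STACK = 16 * 1024 };

/* Copy N bytes from SRC into scratch space and return their sum.
   Scratch space comes from the C stack when N fits, and from the
   heap otherwise.  *USED_HEAP records which path was taken.
   Return -1 if the heap allocation fails.  */
long
toy_sum_with_scratch (const unsigned char *src, size_t n, int *used_heap)
{
  unsigned char stack_buf[TOY_MAX_STACK];
  unsigned char *scratch = stack_buf;

  assert (src != NULL && used_heap != NULL);
  *used_heap = 0;

  if (n > sizeof stack_buf)
    {
      /* Too big for the stack: this is the path that calls the
         allocator, which is where relocation could be triggered.  */
      scratch = malloc (n);
      if (scratch == NULL)
        return -1;
      *used_heap = 1;
    }

  memcpy (scratch, src, n);

  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += scratch[i];

  if (*used_heap)
    free (scratch);
  return sum;
}
```

Raising the limit moves more requests onto the first path, at the price of
a larger stack frame; whether the C stack of a given platform tolerates
that is a separate question this sketch does not answer.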

And maybe there are other possibilities.  We really need to come up
with all the possible ideas, try which ones work, and decide as
quickly as possible what is the best one.  This currently is the most
serious blocking issue on the way towards releasing Emacs 25.2 soon,
as we wanted to.


* Re: When should ralloc.c be used?
  2016-10-24  6:54                                                     ` Eli Zaretskii
@ 2016-10-24 10:15                                                       ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 10:15 UTC (permalink / raw)
  To: emacs-devel; +Cc: eggert, monnier

> Date: Mon, 24 Oct 2016 09:54:12 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org
> 
> And maybe there are other possibilities.  We really need to come up
> with all the possible ideas, try which ones work, and decide as
> quickly as possible what is the best one.  This currently is the most
> serious blocking issue on the way towards releasing Emacs 25.2 soon,
> as we wanted to.

So I think the most promising alternatives at this point are:

  . Build with gmalloc but without ralloc.

    Would people who have ralloc.o in their src directory please
    reconfigure with REL_ALLOC=no, and see if the result works
    reliably in your day-to-day work?  Please report the results
    here, and if you were hit by one of the related bugs (24358 and
    24764), please report also to the corresponding bug addresses.

  . Back-port the HYBRID_MALLOC changes from master.  Not sure if the
    patch is simple and safe enough, or whether the result is tested
    well enough to have that on emacs-25.

If one of these works, we should consider reverting the changes in
regex.c that attempt to handle relocation during regex.c calls.  We
should also consider removing ralloc.c from any of our builds, in the
hope that the platforms which we care about have a much better malloc
implementation than what was available 20 years ago.

Comments?




* Re: When should ralloc.c be used?
  2016-10-23 19:03                                                 ` Eli Zaretskii
  2016-10-23 20:36                                                   ` Stefan Monnier
@ 2016-10-24  4:59                                                   ` Paul Eggert
  2016-10-24  7:44                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-24  4:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
> Then what are our choices to solve this for Emacs 25.2?

Sorry, I've lost context. What's "this"? Doesn't draft Emacs 25.2 work 
adequately on bleeding-edge glibc without our doing anything special? If not, 
what are the problems and how can we reproduce them?

I attempted to reproduce the problems, whatever they are, by building draft 
Emacs 25.2 with './configure emacs_cv_var_doug_lea_malloc=no' on x86-64 Ubuntu 
16.04.1 (I don't have easy access to 16.10 yet, but this does build and link 
gmalloc.o and ralloc.o). --enable-gcc-warnings did generate some warnings, which 
I just now fixed, but I didn't observe any runtime misbehaviors.


* Re: When should ralloc.c be used?
  2016-10-24  4:59                                                   ` Paul Eggert
@ 2016-10-24  7:44                                                     ` Eli Zaretskii
  2016-10-24  8:29                                                       ` Andreas Schwab
  2016-10-24 16:21                                                       ` Paul Eggert
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  7:44 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 23 Oct 2016 21:59:52 -0700
> 
> Eli Zaretskii wrote:
> > Then what are our choices to solve this for Emacs 25.2?
> 
> Sorry, I've lost context. What's "this"?

The fact that Emacs 25.2 builds with ralloc.c on recent GNU/Linux
systems, which triggers random crashes, inability to build Emacs in
some cases, and other atrocities, such as corruption of buffer text.

> Doesn't draft Emacs 25.2 work adequately on bleeding-edge glibc
> without our doing anything special?

Not even close.

> If not, what are the problems and how can we reproduce them?

One problem is that relocation of buffer text and Lisp string data can
happen during regex searches, due to reallocation of the failure stack
to a size that exceeds MAX_ALLOCA.  Noam fixed that by some
non-trivial code in regex.c, but those changes seem to have uncovered
a problem that precludes bootstrapping Emacs 25.2, reported here:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358#123

which was independently reported for a different machine here:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24772#11

I predict that more people will start hitting this as they upgrade to
newer glibc and start building Emacs with ralloc.c.

So we can't even build Emacs 25.2 reliably on "bleeding-edge"
GNU/Linux systems -- how's that for "working adequately"?

Then there's bug#24764, which sounds a lot like it's also related to
ralloc.c.  Because Michael Heerdegen talked about problems with
browsing Web pages, I looked at xml.c, and sure thing, it passes a C
pointer to buffer text to libxml2 functions which call malloc
internally.  I installed a ralloc-specific workaround for that, but
didn't yet hear from Michael if that was sufficient to solve his
frequent crashes in GC and corruptions of buffer text.

> I attempted to reproduce the problems, whatever they are, by building draft 
> Emacs 25.2 with './configure emacs_cv_var_doug_lea_malloc=no' on x86-64 Ubuntu 
> 16.04.1 (I don't have easy access to 16.10 yet, but this does build and link 
> gmalloc.o and ralloc.o). --enable-gcc-warnings did generate some warnings, which 
> I just now fixed, but I didn't observe any runtime misbehaviors.

The problems don't happen immediately, and the problem with bootstrap
(bug#24358) seems to be dependent on some factor we don't yet
understand: Noam cannot reproduce it, although his system is very
similar to the one where it does happen.  If you want an almost
immediate manifestation of the problem in a build with ralloc, remove
the calls to r_alloc_inhibit_buffer_relocation in xml.c, and browse
some Web pages; you will sooner or later hit an assertion violation in
parse_region.
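
The guard pattern behind such inhibit calls can be sketched with a toy
model.  Nothing below is the real ralloc.c or xml.c code: struct
toy_buffer, toy_maybe_relocate, and toy_external_parse are hypothetical
stand-ins, and only the idea of an inhibit counter around an external call
is taken from the discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a buffer whose text may be moved by the allocator.  */
struct toy_buffer { char *text; size_t len; };

static int toy_inhibit_count;   /* > 0 means relocation is forbidden */

/* Nonzero INHIBIT forbids relocation; zero undoes one such request.  */
void
toy_inhibit_relocation (int inhibit)
{
  if (inhibit)
    toy_inhibit_count++;
  else if (toy_inhibit_count > 0)
    toy_inhibit_count--;
}

/* Simulate what a relocating allocator may do behind the caller's
   back: move the text to fresh storage.  Return 1 if the text moved,
   0 if relocation was inhibited or memory ran out.  */
int
toy_maybe_relocate (struct toy_buffer *b)
{
  assert (b != NULL && b->text != NULL);
  if (toy_inhibit_count > 0)
    return 0;
  char *moved = malloc (b->len + 1);
  if (moved == NULL)
    return 0;
  memcpy (moved, b->text, b->len + 1);
  free (b->text);
  b->text = moved;
  return 1;
}

/* Hypothetical external parser that allocates internally while it
   reads through the raw pointer P.  Without the guard below, P could
   dangle as soon as the simulated allocation moves the text.  */
size_t
toy_external_parse (struct toy_buffer *b, const char *p, size_t n)
{
  toy_maybe_relocate (b);       /* side effect of an internal malloc */
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (p[i] == '<')
      count++;
  return count;
}

/* Guarded call: relocation stays off while the raw pointer is live.  */
size_t
toy_parse_region (struct toy_buffer *b)
{
  toy_inhibit_relocation (1);
  size_t result = toy_external_parse (b, b->text, b->len);
  toy_inhibit_relocation (0);
  return result;
}
```

Dropping the two toy_inhibit_relocation calls from toy_parse_region
reproduces the hazard in miniature: the parser would then read through a
pointer to freed storage.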

More generally, I found a few more places in the sources where we hold
C pointers to buffer text around calls to functions that can call
malloc.  I'm not sure I found all of them, because the only way to
look for them I know of is not perfect.  As for the same situation
where we hold C pointers to Lisp string data where malloc can be
called, there are virtually dozens of them; the most frequent paradigm
is something like

  char *beg = SSDATA (lisp_string);
  char *end = beg + something;
  Lisp_Object new_string = make_unibyte_string (beg, end - beg);

(The catch here is that make_unibyte_string calls malloc internally,
which could relocate the data of the original lisp_string, and thus
invalidate the pointers 'beg' and 'end'.)

I don't remember well enough the internals of ralloc.c: perhaps it
doesn't relocate Lisp string data unless the string is long enough? or
at all?  So the problems with Lisp string might not be as grave as I
fear, but this should certainly be looked into before we dismiss all
those cases.

Bottom line: when GNU/Linux systems started using ralloc.c, they've
potentially exposed Emacs 25.2 to very serious instability, on the
platform that we consider by far the most important one.  We need to
move fast and thoroughly to investigate the possible solutions and
decide which one to use.


* Re: When should ralloc.c be used?
  2016-10-24  7:44                                                     ` Eli Zaretskii
@ 2016-10-24  8:29                                                       ` Andreas Schwab
  2016-10-24  8:47                                                         ` Eli Zaretskii
  2016-10-24 16:21                                                       ` Paul Eggert
  1 sibling, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2016-10-24  8:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

On Oct 24 2016, Eli Zaretskii <eliz@gnu.org> wrote:

> I don't remember well enough the internals of ralloc.c: perhaps it
> doesn't relocate Lisp string data unless the string is long enough? or
> at all?  So the problems with Lisp string might not be as grave as I
> fear, but this should certainly be looked into before we dismiss all
> those cases.

String data can be relocated even without ralloc, see
compact_small_strings.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."




* Re: When should ralloc.c be used?
  2016-10-24  8:29                                                       ` Andreas Schwab
@ 2016-10-24  8:47                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  8:47 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: eggert, emacs-devel

> From: Andreas Schwab <schwab@suse.de>
> Cc: Paul Eggert <eggert@cs.ucla.edu>,  emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 10:29:35 +0200
> 
> On Oct 24 2016, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > I don't remember well enough the internals of ralloc.c: perhaps it
> > doesn't relocate Lisp string data unless the string is long enough? or
> > at all?  So the problems with Lisp string might not be as grave as I
> > fear, but this should certainly be looked into before we dismiss all
> > those cases.
> 
> String data can be relocated even without ralloc, see
> compact_small_strings.

Yes, but that's called only by GC, so not a problem in most (if not
all) the places I've seen that, and is not related to ralloc anyway.

Looking at ralloc.c and its callers, I think the only blocks of memory
it relocates on its own are those allocated with r_alloc or r_realloc,
which is only used for buffer text and (under REGEX_MALLOC) for
regex.c failure stack.  Which means compact_small_strings is the
_only_ place where string data is relocated, so just calls to malloc
cannot.

IOW, I confused GC and compact_small_strings with ralloc, when I
talked about string data, and our only problem is with pointers to
buffer text.
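
As an illustration of the buffer-text case only, here is a toy model of
the hazard and of the usual way around it.  It is a sketch under stated
assumptions, not Emacs code: struct toy_text and both functions are
hypothetical, and realloc stands in for whatever moves the text.  The
point is to remember an offset rather than a raw pointer across any call
that may move the storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for relocatable buffer text.  BEG points at LEN
   characters followed by a terminating NUL.  */
struct toy_text { char *beg; size_t len; };

/* Grow the text by EXTRA blanks.  Like a relocating allocator, this
   may move the storage, so any old pointer into it may dangle.
   Return 0 on success, -1 on allocation failure.  */
int
toy_grow (struct toy_text *t, size_t extra)
{
  char *moved = realloc (t->beg, t->len + extra + 1);
  if (moved == NULL)
    return -1;
  memset (moved + t->len, ' ', extra);
  moved[t->len + extra] = '\0';
  t->beg = moved;
  t->len += extra;
  return 0;
}

/* Return the character P pointed at, after growing the text.
   P is turned into an offset first, because the offset survives
   relocation while P itself may not.  Return NUL on failure.  */
char
toy_char_after_growth (struct toy_text *t, const char *p, size_t extra)
{
  assert (t->beg <= p && p < t->beg + t->len);
  size_t offset = (size_t) (p - t->beg);

  if (toy_grow (t, extra) != 0)
    return '\0';

  /* P must not be dereferenced from here on; re-derive instead.  */
  return t->beg[offset];
}
```

The same discipline applies to any pair of pointers such as a region start
and end: keep both as offsets while the storage may move, and convert back
to pointers only afterwards.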

Am I missing something?

Thanks.


* Re: When should ralloc.c be used?
  2016-10-24  7:44                                                     ` Eli Zaretskii
  2016-10-24  8:29                                                       ` Andreas Schwab
@ 2016-10-24 16:21                                                       ` Paul Eggert
  2016-10-24 16:39                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-24 16:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 10/24/2016 12:44 AM, Eli Zaretskii wrote:
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358... Then there's 
> bug#24764 ...

These bugs seem to be fixed now (thanks to you!). As Andreas pointed 
out, the problems with ralloc.c are not as severe as initially feared, 
since they should be limited to pointers to buffer text and should not 
extend to pointers to Lisp strings.

As I understand it, although the ralloc.c approach worked for a long 
time, it fell out of favor on common platforms and so hasn't been 
debugged as thoroughly for the past several years. Unfortunately, recent 
changes to glibc have caused ralloc.c to be used again on common GNU 
platforms and this is shaking out longstanding bugs with the ralloc.c 
approach. This means people using bleeding-edge glibc are suffering 
problems similar to what people on now-unusual platforms must have had 
for some time.

Surely we can fix these ralloc.c-related bugs as they come up. That 
being said, they are a hassle for users and maintainers, and if dropping 
ralloc.c works and doesn't cause significant performance degradation it 
sounds like that would be a win.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:21                                                       ` Paul Eggert
@ 2016-10-24 16:39                                                         ` Eli Zaretskii
  2016-10-24 16:54                                                           ` Paul Eggert
                                                                             ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 16:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 24 Oct 2016 09:21:50 -0700
> 
> On 10/24/2016 12:44 AM, Eli Zaretskii wrote:
> > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358... Then there's 
> > bug#24764 ...
> 
> These bugs seem to be fixed now (thanks to you!).

Some of them are fixed.  At least one more remains (I hope to fix it
soon).  What's more, we still don't know whether the changes in
regex.c by Noam are correct enough to solve the problems with
relocation of buffer text while we search the buffer, and we also
don't know yet whether the two bugs mentioned above are solved because
we didn't hear from their OPs.

Since these all are related to ralloc.c, the question is whether we
should get rid of using it on GNU/Linux, instead of chasing each of
these problems (which is anything but easy and might take time).

> As Andreas pointed out, the problems with ralloc.c are not as severe
> as initially feared, since they should be limited to pointers to
> buffer text and should not extend to pointers to Lisp strings.

Indeed, and that's a relief.

> As I understand it, although the ralloc.c approach worked for a long 
> time, it fell out of favor on common platforms and so hasn't been 
> debugged as thoroughly for the past several years. Unfortunately, recent 
> changes to glibc have caused ralloc.c to be used again on common GNU 
> platforms and this is shaking out longstanding bugs with the ralloc.c 
> approach. This means people using bleeding-edge glibc are suffering 
> problems similar to what people on now-unusual platforms must have had 
> for some time.

Yes, exactly.  And since most people at least here use Emacs on
GNU/Linux, the nasty problems due to ralloc.c are popping up much
faster and more frequently than they did when only *BSD and Windows
used ralloc.c.

> Surely we can fix these ralloc.c-related bugs as they come up. That 
> being said, they are a hassle for users and maintainers, and if dropping 
> ralloc.c works and doesn't cause significant performance degradation it 
> sounds like that would be a win.

Right, so I'd like your opinion and comments about the possible
solutions proposed so far:

  . Build with gmalloc but without ralloc.

  . Back-port the HYBRID_MALLOC changes from master.  Not sure if the
    patch is simple and safe enough, or whether the result is tested
    well enough to have that on emacs-25.

  . Build with gmalloc and use mmap for buffer text allocation.

Thanks.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:39                                                         ` Eli Zaretskii
@ 2016-10-24 16:54                                                           ` Paul Eggert
  2016-10-24 17:05                                                             ` Eli Zaretskii
  2016-10-28  6:18                                                           ` Jérémie Courrèges-Anglas
  2016-10-28  6:19                                                           ` Jérémie Courrèges-Anglas
  2 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-24 16:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

On 10/24/2016 09:39 AM, Eli Zaretskii wrote:
>
> Right, so I'd like your opinion and comments about the possible
> solutions proposed so far:
>
>    . Build with gmalloc but without ralloc.

This goes back to what we were doing, no?

>
>    . Back-port the HYBRID_MALLOC changes from master.  Not sure if the
>      patch is simple and safe enough, or whether the result is tested
>      well enough to have that on emacs-25.

This sounds riskier.

>    . Build with gmalloc and use mmap for buffer text allocation.

This also sounds riskier.

How about the attached patch for emacs-25? Basically, it says "use 
ralloc.c only if requested via './configure REL_ALLOC=yes'". I assume 
that this patch need not be ported to master, due to HYBRID_MALLOC. I 
haven't tested this.

[-- Attachment #2: ralloc.diff --]
[-- Type: text/x-patch, Size: 735 bytes --]

diff --git a/configure.ac b/configure.ac
index ae7dfe5..19b44bd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2189,18 +2189,10 @@ if test "$doug_lea_malloc" = "yes" ; then
   AC_DEFINE(DOUG_LEA_MALLOC, 1,
     [Define to 1 if the system memory allocator is Doug Lea style,
      with malloc hooks and malloc_set_state.])
-
-  ## Use mmap directly for allocating larger buffers.
-  ## FIXME this comes from src/s/{gnu,gnu-linux}.h:
-  ## #ifdef DOUG_LEA_MALLOC; #undef REL_ALLOC; #endif
-  ## Does the AC_FUNC_MMAP test below make this check unnecessary?
-  case "$opsys" in
-    mingw32|gnu*) REL_ALLOC=no ;;
-  esac
 fi
 
 if test x"${REL_ALLOC}" = x; then
-  REL_ALLOC=${GNU_MALLOC}
+  REL_ALLOC=no
 fi
 
 use_mmap_for_buffers=no

^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:54                                                           ` Paul Eggert
@ 2016-10-24 17:05                                                             ` Eli Zaretskii
  2016-10-25  6:23                                                               ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 17:05 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 24 Oct 2016 09:54:24 -0700
> 
> >    . Build with gmalloc but without ralloc.
> 
> This goes back to what we were doing, no?

No, we were using the glibc malloc, AFAIK.  Or am I missing something?

> >    . Back-port the HYBRID_MALLOC changes from master.  Not sure if the
> >      patch is simple and safe enough, or whether the result is tested
> >      well enough to have that on emacs-25.
> 
> This sounds riskier.
> 
> >    . Build with gmalloc and use mmap for buffer text allocation.
> 
> This also sounds riskier.

I agree with your assessments.

> How about the attached patch for emacs-25? Basically, it says "use 
> ralloc.c only if requested via './configure REL_ALLOC=yes'".

LGTM.  Should we wait for people to build with REL_ALLOC=no manually,
to see if there are any problems, or should we push this right away?

> I assume that this patch need not be ported to master, due to
> HYBRID_MALLOC.

Yes, I think so.

Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 17:05                                                             ` Eli Zaretskii
@ 2016-10-25  6:23                                                               ` Paul Eggert
  2016-10-25 16:11                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-25  6:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 784 bytes --]

>>>    . Build with gmalloc but without ralloc.
>>
>> This goes back to what we were doing, no?
>
> No, we were using the glibc malloc, AFAIK.  Or am I missing something?

No you're right, I was sloppy.

>> How about the attached patch for emacs-25? Basically, it says "use
>> ralloc.c only if requested via './configure REL_ALLOC=yes'".
>
> LGTM.  Should we wait for people to build with REL_ALLOC=no manually,
> to see if there are any problems, or should we push this right away?

I doubt whether many more people will build with REL_ALLOC=no. So I think we 
should push it into emacs-25. Proposed patch attached. I have tested this on 
Fedora 24 x86-64 and Ubuntu 16.04 x86-64 with './configure 
emacs_cv_var_doug_lea_malloc=no' to simulate bleeding-edge glibc.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Default-REL_ALLOC-to-no.patch --]
[-- Type: text/x-diff; name="0001-Default-REL_ALLOC-to-no.patch", Size: 1732 bytes --]

From 561110345c48d48b4621d2e59487c2c17fcc988c Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 24 Oct 2016 23:11:32 -0700
Subject: [PATCH] Default REL_ALLOC to 'no'

This should make ralloc-related bugs less likely on GNU/Linux
systems with bleeding-edge glibc.  See the email thread containing:
http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00801.html
Do not merge to master.
* configure.ac (REL_ALLOC): Default to 'no' on all platforms, not
merely on platforms with Doug Lea malloc.  Although bleeding-edge
glibc no longer exports __malloc_initialize_hook and so no longer
passes the configure-time test for Doug Lea malloc, ralloc tickles
longstanding bugs like Bug#24358 and Bug#24764 and Emacs is likely
to be more reliable without it.  This patch is not needed on
master, which uses hybrid malloc in this situation.
---
 configure.ac | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/configure.ac b/configure.ac
index ae7dfe5..19b44bd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2189,18 +2189,10 @@ if test "$doug_lea_malloc" = "yes" ; then
   AC_DEFINE(DOUG_LEA_MALLOC, 1,
     [Define to 1 if the system memory allocator is Doug Lea style,
      with malloc hooks and malloc_set_state.])
-
-  ## Use mmap directly for allocating larger buffers.
-  ## FIXME this comes from src/s/{gnu,gnu-linux}.h:
-  ## #ifdef DOUG_LEA_MALLOC; #undef REL_ALLOC; #endif
-  ## Does the AC_FUNC_MMAP test below make this check unnecessary?
-  case "$opsys" in
-    mingw32|gnu*) REL_ALLOC=no ;;
-  esac
 fi
 
 if test x"${REL_ALLOC}" = x; then
-  REL_ALLOC=${GNU_MALLOC}
+  REL_ALLOC=no
 fi
 
 use_mmap_for_buffers=no
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-25  6:23                                                               ` Paul Eggert
@ 2016-10-25 16:11                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 16:11 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 24 Oct 2016 23:23:45 -0700
> 
> > LGTM.  Should we wait for people to build with REL_ALLOC=no manually,
> > to see if there are any problems, or should we push this right away?
> 
> I doubt whether many more people will build with REL_ALLOC=no. So I think we 
> should push it into emacs-25. Proposed patch attached.

I agree.  Anyway, at least one person already tried this build and
reported that problems due to relocation are gone.

So please push to emacs-25.

Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:39                                                         ` Eli Zaretskii
  2016-10-24 16:54                                                           ` Paul Eggert
@ 2016-10-28  6:18                                                           ` Jérémie Courrèges-Anglas
  2016-10-28  6:19                                                           ` Jérémie Courrèges-Anglas
  2 siblings, 0 replies; 375+ messages in thread
From: Jérémie Courrèges-Anglas @ 2016-10-28  6:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

[...]

>> As I understand it, although the ralloc.c approach worked for a long 
>> time, it fell out of favor on common platforms and so hasn't been 
>> debugged as thoroughly for the past several years. Unfortunately, recent 
>> changes to glibc have caused ralloc.c to be used again on common GNU 
>> platforms and this is shaking out longstanding bugs with the ralloc.c 
>> approach. This means people using bleeding-edge glibc are suffering 
>> problems similar to what people on now-unusual platforms must have had 
>> for some time.
>
> Yes, exactly.  And since most people at least here use Emacs on
> GNU/Linux, the nasty problems due to ralloc.c are popping up much
> faster and more frequently than they did when only *BSD and Windows
> used ralloc.c.

I'm a bit surprised that such issues happen on recent glibc systems.
Emacs has been using ralloc on OpenBSD for years, and seems to be
pretty stable.  Granted, memory corruption bugs can depend on many
parameters, but still...

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:39                                                         ` Eli Zaretskii
  2016-10-24 16:54                                                           ` Paul Eggert
  2016-10-28  6:18                                                           ` Jérémie Courrèges-Anglas
@ 2016-10-28  6:19                                                           ` Jérémie Courrèges-Anglas
  2016-10-28  7:40                                                             ` Eli Zaretskii
  2 siblings, 1 reply; 375+ messages in thread
From: Jérémie Courrèges-Anglas @ 2016-10-28  6:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

[...]

>> As I understand it, although the ralloc.c approach worked for a long 
>> time, it fell out of favor on common platforms and so hasn't been 
>> debugged as thoroughly for the past several years. Unfortunately, recent 
>> changes to glibc have caused ralloc.c to be used again on common GNU 
>> platforms and this is shaking out longstanding bugs with the ralloc.c 
>> approach. This means people using bleeding-edge glibc are suffering 
>> problems similar to what people on now-unusual platforms must have had 
>> for some time.
>
> Yes, exactly.  And since most people at least here use Emacs on
> GNU/Linux, the nasty problems due to ralloc.c are popping up much
> faster and more frequently than they did when only *BSD and Windows
> used ralloc.c.

I'm a bit surprised that such issues happen on recent glibc systems.
Emacs has been using ralloc on OpenBSD for years, and seems to be
pretty stable.  Granted, memory corruption bugs can depend on many
parameters, but still...

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-28  6:19                                                           ` Jérémie Courrèges-Anglas
@ 2016-10-28  7:40                                                             ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  7:40 UTC (permalink / raw)
  To: Jérémie Courrèges-Anglas; +Cc: eggert, emacs-devel

> From: jca@wxcvbn.org (Jérémie Courrèges-Anglas)
> Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org
> Date: Fri, 28 Oct 2016 08:19:48 +0200
> 
> > Yes, exactly.  And since most people at least here use Emacs on
> > GNU/Linux, the nasty problems due to ralloc.c are popping up much
> > faster and more frequently than they did when only *BSD and Windows
> > used ralloc.c.
> 
> I'm a bit surprised that such issues happen on recent glibc systems.
> Emacs has been using ralloc on OpenBSD for years, and seems to be
> pretty stable.  Granted, memory corruption bugs can depend on many
> parameters, but still...

I guess your usage patterns side-step the problematic code.  E.g., if
the resulting memory footprint is stable (i.e. never grows too much
too fast), ralloc will not need to relocate buffer text too
frequently, so you won't bump into these problems.  And some of those
problems appeared only recently: e.g., EWW, which triggers the problem
when it calls libxml2, is a 25.1 addition.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 12:50                                         ` Eli Zaretskii
  2016-10-23 13:39                                           ` Stefan Monnier
@ 2016-10-23 15:22                                           ` Andreas Schwab
  2016-10-23 15:49                                             ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2016-10-23 15:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, monnier, emacs-devel

On Oct 23 2016, Eli Zaretskii <eliz@gnu.org> wrote:

> Is it reasonable to require a version of glibc that still supports
> __malloc_initialize_hook?  When people upgrade to a newer glibc, the
> previous version is still left on the system, I presume (for programs
> that need them which were built against those old versions)?

There will ever be only one glibc.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 15:22                                           ` Andreas Schwab
@ 2016-10-23 15:49                                             ` Eli Zaretskii
  2016-10-23 15:57                                               ` Andreas Schwab
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 15:49 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: eggert, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Sun, 23 Oct 2016 17:22:37 +0200
> Cc: Paul Eggert <eggert@cs.ucla.edu>, monnier@iro.umontreal.ca,
> 	emacs-devel@gnu.org
> 
> On Oct 23 2016, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > Is it reasonable to require a version of glibc that still supports
> > __malloc_initialize_hook?  When people upgrade to a newer glibc, the
> > previous version is still left on the system, I presume (for programs
> > that need them which were built against those old versions)?
> 
> There will ever be only one glibc.

Sorry, I don't understand what that means.  Can't it be that some
older program is linked against an older libc.so.N version?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 15:49                                             ` Eli Zaretskii
@ 2016-10-23 15:57                                               ` Andreas Schwab
  2016-10-23 17:06                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2016-10-23 15:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On Oct 23 2016, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Date: Sun, 23 Oct 2016 17:22:37 +0200
>> Cc: Paul Eggert <eggert@cs.ucla.edu>, monnier@iro.umontreal.ca,
>> 	emacs-devel@gnu.org
>> 
>> On Oct 23 2016, Eli Zaretskii <eliz@gnu.org> wrote:
>> 
>> > Is it reasonable to require a version of glibc that still supports
>> > __malloc_initialize_hook?  When people upgrade to a newer glibc, the
>> > previous version is still left on the system, I presume (for programs
>> > that need them which were built against those old versions)?
>> 
>> There will ever be only one glibc.
>
> Sorry, I don't understand what that means.  Can't it be that some
> older program is linked against an older libc.so.N version?

There is no older version.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 15:57                                               ` Andreas Schwab
@ 2016-10-23 17:06                                                 ` Eli Zaretskii
  2016-10-23 20:35                                                   ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 17:06 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: eggert, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Sun, 23 Oct 2016 17:57:15 +0200
> Cc: eggert@cs.ucla.edu, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> >> > Is it reasonable to require a version of glibc that still supports
> >> > __malloc_initialize_hook?  When people upgrade to a newer glibc, the
> >> > previous version is still left on the system, I presume (for programs
> >> > that need them which were built against those old versions)?
> >> 
> >> There will ever be only one glibc.
> >
> > Sorry, I don't understand what that means.  Can't it be that some
> > older program is linked against an older libc.so.N version?
> 
> There is no older version.

That doesn't really help understanding the issues.  I guess it's above
my pay grade.

(When trying to solve such a grave problem, we need everybody's best
expertise and help, and you are one of the best experts on these
matters here.  Looks like I'm too naïve expecting such cooperation.)



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 17:06                                                 ` Eli Zaretskii
@ 2016-10-23 20:35                                                   ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 20:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, Andreas Schwab, emacs-devel

> That doesn't really help understanding the issues.  I guess it's above
> my pay grade.

IIUC it's been libc.so.6 ever since distributions moved from the
"old libc" to glibc-2.


        Stefan
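
[Editorial sketch, not part of the original message.]  One way to see
the point about a single libc.so.6 is to compare the glibc version a
program was compiled against with the one it is actually running on.
The sketch below assumes a glibc system for the interesting case and
falls back to a placeholder string elsewhere; `running_libc_version`
and `describe_libc` are hypothetical helper names, while
`gnu_get_libc_version`, `__GLIBC__` and `__GLIBC_MINOR__` are provided
by glibc itself.

```c
#include <assert.h>
#include <stdio.h>   /* on glibc this also defines __GLIBC__ */
#include <string.h>
#ifdef __GLIBC__
# include <gnu/libc-version.h>
#endif

/* Return the version string of the C library the process is running
   on, or a placeholder when not built against glibc.  */
static const char *
running_libc_version (void)
{
#ifdef __GLIBC__
  return gnu_get_libc_version ();
#else
  return "unknown (not glibc)";
#endif
}

/* Write a one-line description into OUT (at most OUTSIZE bytes).
   The "built against" numbers are fixed at compile time; the
   "running on" string is queried at run time.  Returns what
   snprintf returns.  */
static int
describe_libc (char *out, size_t outsize)
{
  assert (out != NULL && outsize > 0);
#ifdef __GLIBC__
  return snprintf (out, outsize,
                   "built against glibc %d.%d, running on glibc %s",
                   __GLIBC__, __GLIBC_MINOR__, running_libc_version ());
#else
  return snprintf (out, outsize, "running on %s",
                   running_libc_version ());
#endif
}
```

A binary built some time ago and run after a system upgrade would be
expected to report the older numbers in the first half and the newer
version in the second, since both resolve to the same libc.so.6.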



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Skipping unexec via a big .elc file (was: When should ralloc.c be used?)
  2016-10-23  6:53                                 ` Eli Zaretskii
  2016-10-23  7:57                                   ` Paul Eggert
@ 2016-10-23 16:44                                   ` Stefan Monnier
  2016-10-23 17:34                                     ` Eli Zaretskii
  2016-10-24 18:34                                     ` Lars Brinkhoff
  1 sibling, 2 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 16:44 UTC (permalink / raw)
  To: emacs-devel

>> If someone wants to do that, great. I'd rather spend my own limited
>> cycles on fixing the main problem, which is unexec.
> I thought we agreed to get rid of unexec by loading a single .elc file
> at startup of Emacs, and remove the distinction between temacs and
> Emacs altogether.  Is that what you'd like to work on?

FWIW, I just did a quick experiment with the patch below which dumps the
state of Emacs's obarray after loadup.el into a big "dumped.elc" file.
Not sure if such an approach could work, but in any case I expect that
a working .elc file should likely be of comparable size.

The result is a .elc file of 3.3MB which seems reasonable.
When I try to load it, tho, I get:

    % time src/emacs -Q --batch -l dumped.elc -f kill-emacs
    src/emacs -Q --batch -l dumped.elc -f kill-emacs  3.50s user 0.00s system 99% cpu 3.506 total
    % 

And that's with a warm cache on "i3-4170 CPU @ 3.70GHz" (my first, and
still only, CPU that goes beyond 3GHz).

So even if there might be ways to speed this up, it doesn't look
too promising.


        Stefan


diff --git a/lisp/loadup.el b/lisp/loadup.el
index 5c16464..dddd71f 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -474,6 +474,65 @@
 						invocation-directory)
 			      (expand-file-name name invocation-directory)
 			      t)))
+      (message "Dumping into dumped.elc...preparing...")
+
+      ;; Dump the current state into a file so we can reload it!
+      (with-current-buffer (generate-new-buffer "dumped.elc")
+        (message "Dumping into dumped.elc...generating...")
+        (insert ";ELC\^W\^@\^@\^@\n;;; Compiled\n;;; in Emacs version " emacs-version "\n")
+        (let ((cmds '()))
+          (setcdr global-buffers-menu-map nil) ;; Get rid of buffer objects!
+          (mapatoms
+           (lambda (s)
+             (when (and (fboundp s)
+                        (not (subrp (symbol-function s)))
+                        ;; FIXME: We need these, but they contain
+                        ;; unprintable objects.
+                        (not (memq s '(rename-buffer))))
+               (push `(fset ',s ,(macroexp-quote (symbol-function s))) cmds))
+             (when (and (boundp s) (not (keywordp s))
+                        (not (memq s '(nil t
+                                       ;; I think we don't need these!
+                                       terminal-frame
+                                       ;; FIXME: We need these, but they contain
+                                       ;; unprintable objects.
+                                       advertised-signature-table
+                                       undo-auto--undoably-changed-buffers))))
+               ;; FIXME: Don't record in the load-history!
+               ;; FIXME: Handle varaliases!
+               (let ((v (symbol-value s)))
+                 (push `(defvar ,s
+                          ,(cond
+                            ((subrp v)
+                             `(symbol-function ',(intern (subr-name v))))
+                            ((and (markerp v) (null (marker-buffer v)))
+                             '(make-marker))
+                            ((and (overlayp v) (null (overlay-buffer v)))
+                             '(let ((ol (make-overlay (point-min) (point-min))))
+                                (delete-overlay ol)
+                                ol))
+                            (v (macroexp-quote v))))
+                       cmds)))
+             (when (symbol-plist s)
+               (push `(setplist ',s ',(symbol-plist s)) cmds))))
+          (message "Dumping into dumped.elc...printing...")
+          (let ((print-circle t)
+                (print-gensym t)
+                (print-quoted t)
+                (print-level nil)
+                (print-length nil)
+                (print-escape-newlines t))
+            (print `(progn . ,cmds) (current-buffer)))
+          (goto-char (point-min))
+          (while (re-search-forward " (\\(defvar\\|setplist\\|fset\\) " nil t)
+            (goto-char (match-beginning 0))
+            (delete-char 1) (insert "\n"))
+          (message "Dumping into dumped.elc...saving...")
+          (let ((coding-system-for-write 'emacs-internal))
+            (write-region (point-min) (point-max) (buffer-name)))
+          (message "Dumping into dumped.elc...done")
+          ))
+
       (kill-emacs)))
 
 ;; For machines with CANNOT_DUMP defined in config.h,




^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file (was: When should ralloc.c be used?)
  2016-10-23 16:44                                   ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier
@ 2016-10-23 17:34                                     ` Eli Zaretskii
  2016-10-23 20:27                                       ` Skipping unexec via a big .elc file Stefan Monnier
  2016-10-24  1:07                                       ` Stefan Monnier
  2016-10-24 18:34                                     ` Lars Brinkhoff
  1 sibling, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 17:34 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 12:44:33 -0400
> 
> >> If someone wants to do that, great. I'd rather spend my own limited
> >> cycles on fixing the main problem, which is unexec.
> > I thought we agreed to get rid of unexec by loading a single .elc file
> > at startup of Emacs, and remove the distinction between temacs and
> > Emacs altogether.  Is that what you'd like to work on?
> 
> FWIW, I just did a quick experiment with the patch below which dumps the
> state of Emacs's obarray after loadup.el into a big "dumped.elc" file.
> Not sure if such an approach could work, but in any case I expect that
> a working .elc file should likely be of comparable size.
> 
> The result is a .elc file of 3.3MB which seems reasonable.
> When I try to load it, tho, I get:
> 
>     % time src/emacs -Q --batch -l dumped.elc -f kill-emacs
>     src/emacs -Q --batch -l dumped.elc -f kill-emacs  3.50s user 0.00s system 99% cpu 3.506 total
>     % 
> 
> And that's with a warm cache on "i3-4170 CPU @ 3.70GHz" (my first, and
> still only, CPU that goes beyond 3GHz).
> 
> So even if there might be ways to speed this up, it doesn't look
> too promising.

That sounds strangely long, as I got less than 2 sec with all the
preloaded *.elc files concatenated to a single file, and that's before
I made pure-copy a no-op.

Another report was that "loadup" with pure-copy short-circuited took
less than 0.5 sec.  See

  https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html

Was your Emacs an optimized build?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-23 17:34                                     ` Eli Zaretskii
@ 2016-10-23 20:27                                       ` Stefan Monnier
  2016-10-24  6:22                                         ` Eli Zaretskii
  2016-10-24  1:07                                       ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 20:27 UTC (permalink / raw)
  To: emacs-devel

>   https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html
> Was your Emacs an optimized build?

I tried it with my local build of master (not optimized) as well as with
the "emacs24" executable provided by Debian.  Both times were comparable.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-23 20:27                                       ` Skipping unexec via a big .elc file Stefan Monnier
@ 2016-10-24  6:22                                         ` Eli Zaretskii
  2016-10-24 12:47                                           ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  6:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 16:27:48 -0400
> 
> >   https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html
> > Was your Emacs an optimized build?
> 
> I tried it with my local build of master (not optimized) as well as with
> the "emacs24" executable provided by Debian.  Both times were comparable.

An unoptimized Emacs runs about 3 times slower, so I cannot explain
your comparable results with both versions, it doesn't match any of my
experiences.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  6:22                                         ` Eli Zaretskii
@ 2016-10-24 12:47                                           ` Stefan Monnier
  2016-10-24 13:08                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 12:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> I tried it with my local build of master (not optimized) as well as with
>> the "emacs24" executable provided by Debian.  Both times were comparable.
> An unoptimized Emacs runs about 3 times slower,

In my experience it's much less drastic, unless you include
enable_checking and such in "unoptimized".

> so I cannot explain your comparable results with both versions, it
> doesn't match any of my experiences.

The way I explained it to myself is that the lread.c code is much
less affected (e.g. it should almost be unaffected by enable_checking).

BTW, have you tried my experiment on your side?


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 12:47                                           ` Stefan Monnier
@ 2016-10-24 13:08                                             ` Eli Zaretskii
  2016-10-24 14:15                                               ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 13:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 08:47:49 -0400
> 
> > so I cannot explain your comparable results with both versions, it
> > doesn't match any of my experiences.
> 
> The way I explained it to myself is that the lread.c code is much
> less affected (e.g. it should almost be unaffected by enable_checking).

Reading Lisp involves a lot of CPU-intensive processing.

> BTW, have you tried my experiment on your side?

No.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 13:08                                             ` Eli Zaretskii
@ 2016-10-24 14:15                                               ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 14:15 UTC (permalink / raw)
  To: emacs-devel

>> The way I explained it to myself is that the lread.c code is much
>> less affected (e.g. it should almost be unaffected by enable_checking).
> Reading Lisp involves a lot of CPU-intensive processing.

Yes, but it's a different kind of code, so it may be
affected differently.
In any case, I have no concrete data to back up this intuition and
I don't believe it very strongly either.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-23 17:34                                     ` Eli Zaretskii
  2016-10-23 20:27                                       ` Skipping unexec via a big .elc file Stefan Monnier
@ 2016-10-24  1:07                                       ` Stefan Monnier
  2016-10-24  6:39                                         ` Eli Zaretskii
  2016-10-24  9:40                                         ` Ken Raeburn
  1 sibling, 2 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24  1:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> That sounds strangely long, as I got less than 2 sec with all the
> preloaded *.elc files concatenated to a single file, and that's before
> I made pure-copy a no-op.
> Another report was that "loadup" with pure-copy short-circuited took
> less than 0.5 sec.  See

Hmm... indeed, I got to 0.72s with his patch (on a different, slower
machine (a Thinkpad X201s, i.e. with an i7 CPU L620 @ 2.00GHz)).

If I re-add international/characters it goes up a bit to
0.96s, but still nowhere near the 3s I got on my big .elc file.
[ I wonder what makes loading my big file so slow.  ]

This said, there's still a factor 5-10 to get to "immediate", tho.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  1:07                                       ` Stefan Monnier
@ 2016-10-24  6:39                                         ` Eli Zaretskii
  2016-10-24  6:47                                           ` Lars Ingebrigtsen
  2016-10-24 13:04                                           ` Stefan Monnier
  2016-10-24  9:40                                         ` Ken Raeburn
  1 sibling, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  6:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sun, 23 Oct 2016 21:07:47 -0400
> 
> > That sounds strangely long, as I got less than 2 sec with all the
> > preloaded *.elc files concatenated to a single file, and that's before
> > I made pure-copy a no-op.
> > Another report was that "loadup" with pure-copy short-circuited took
> > less than 0.5 sec.  See
> 
> Hmm... indeed, I got to 0.72s with his patch (on a different, slower
> machine (a Thinkpad X201s, i.e. with an i7 CPU L620 @ 2.00GHz)).
> 
> If I re-add international/characters it goes up a bit to
> 0.96s, but still nowhere near the 3s I got on my big .elc file.
> [ I wonder what makes loading my big file so slow.  ]
> 
> This said, there's still a factor 5-10 to get to "immediate", tho.

A small price to pay for the advantages, IMO.  The most important
advantage in my view is that the dumping/loading process becomes very
simple and understandable even by people with minimal knowledge of C
subtleties and Emacs internals, let alone development tools like the
assembler and the linker.  This would make future maintenance much
more robust and reliable, and also allow more contributors to work on
improving, speeding up, and extending the build process.  The
alternatives all require us to depend on a dwindling handful of
people, which is a huge disadvantage in the long run.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  6:39                                         ` Eli Zaretskii
@ 2016-10-24  6:47                                           ` Lars Ingebrigtsen
  2016-10-24  7:17                                             ` Eli Zaretskii
  2016-10-24 13:04                                           ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Lars Ingebrigtsen @ 2016-10-24  6:47 UTC (permalink / raw)
  To: emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> If I re-add international/characters it goes up a bit to
>> 0.96s, but still nowhere near the 3s I got on my big .elc file.
>> [ I wonder what makes loading my big file so slow.  ]
>> 
>> This said, there's still a factor 5-10 to get to "immediate", tho.
>
> A small price to pay for the advantages, IMO.

I think a one second startup time for "emacs -Q -nw" on a fast machine
sounds pretty horrific, myself.  Not all people live in Emacs, but
instead start and stop Emacs to do small edits to files.

It would also make using Emacs on slower, smaller mobile devices an
unsatisfying experience.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  6:47                                           ` Lars Ingebrigtsen
@ 2016-10-24  7:17                                             ` Eli Zaretskii
  2016-10-24  8:24                                               ` Andreas Schwab
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  7:17 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: emacs-devel

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Date: Mon, 24 Oct 2016 08:47:39 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> If I re-add international/characters it goes up a bit to
> >> 0.96s, but still nowhere near the 3s I got on my big .elc file.
> >> [ I wonder what makes loading my big file so slow.  ]
> >> 
> >> This said, there's still a factor 5-10 to get to "immediate", tho.
> >
> > A small price to pay for the advantages, IMO.
> 
> I think a one second startup time for "emacs -Q -nw" on a fast machine
> sounds pretty horrific, myself.

We are not talking about 1 sec, we are talking about less than half
that time, potentially even 1/4th of a second.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  7:17                                             ` Eli Zaretskii
@ 2016-10-24  8:24                                               ` Andreas Schwab
  2016-10-24  8:41                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2016-10-24  8:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Lars Ingebrigtsen, emacs-devel

On Oct 24 2016, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Lars Ingebrigtsen <larsi@gnus.org>
>> Date: Mon, 24 Oct 2016 08:47:39 +0200
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >> If I re-add international/characters it goes up a bit to
>> >> 0.96s, but still nowhere near the 3s I got on my big .elc file.
>> >> [ I wonder what makes loading my big file so slow.  ]
>> >> 
>> >> This said, there's still a factor 5-10 to get to "immediate", tho.
>> >
>> > A small price to pay for the advantages, IMO.
>> 
>> I think a one second startup time for "emacs -Q -nw" on a fast machine
>> sounds pretty horrific, myself.
>
> We are not talking about 1 sec, we are talking about less than half
> that time, potentially even 1/4th of a second.

That's still a lot.

$ time emacs --batch --eval t
0.027user 0.011system 0m0.048selapsed 79.66%CPU

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  8:24                                               ` Andreas Schwab
@ 2016-10-24  8:41                                                 ` Eli Zaretskii
  2016-10-24  9:47                                                   ` Daniel Colascione
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  8:41 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: larsi, emacs-devel

> From: Andreas Schwab <schwab@suse.de>
> Cc: Lars Ingebrigtsen <larsi@gnus.org>,  emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 10:24:26 +0200
> 
> > We are not talking about 1 sec, we are talking about less than half
> > that time, potentially even 1/4th of a second.
> 
> That's still a lot.
> 
> $ time emacs --batch --eval t
> 0.027user 0.011system 0m0.048selapsed 79.66%CPU

Then I guess you will have to continue using unexec, and when that
alternative disappears, switch to some other editor.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  8:41                                                 ` Eli Zaretskii
@ 2016-10-24  9:47                                                   ` Daniel Colascione
  2016-10-24 10:00                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24  9:47 UTC (permalink / raw)
  To: Eli Zaretskii, Andreas Schwab; +Cc: larsi, emacs-devel

On 10/24/2016 01:41 AM, Eli Zaretskii wrote:
>> From: Andreas Schwab <schwab@suse.de>
>> Cc: Lars Ingebrigtsen <larsi@gnus.org>,  emacs-devel@gnu.org
>> Date: Mon, 24 Oct 2016 10:24:26 +0200
>>
>>> We are not talking about 1 sec, we are talking about less than half
>>> that time, potentially even 1/4th of a second.
>>
>> That's still a lot.
>>
>> $ time emacs --batch --eval t
>> 0.027user 0.011system 0m0.048selapsed 79.66%CPU
>
> Then I guess you will have to continue using unexec, and when that
> alternative disappears, switch to some other editor.
>

I have lots of scripts that run using emacs -Q --batch; many are invoked 
frequently in other scripts. Making each take 250ms instead of 27ms to 
run will greatly increase the overall runtime of the high-level 
operations. I don't see a need to regress performance here, since a 
custom malloc will perform at least as well as the last glibc malloc 
that supported unexec (since it could in principle be a literal copy of 
that code), and we found the performance of that malloc acceptable. I 
care _much_ more about runtime performance than I do about allocation 
throughput once started.
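
A rough sketch of the arithmetic behind this concern, assuming a
hypothetical wrapper that starts Emacs sequentially N times and
counting only the startup cost.  The 27 ms and 250 ms figures are the
ones quoted above; the function name and the run counts are purely
illustrative:

```python
def total_startup_seconds(invocations: int, startup_ms: float) -> float:
    """Time spent on startup alone across sequential invocations."""
    return invocations * startup_ms / 1000.0


# Compare the two startup costs mentioned in the thread for two
# hypothetical workloads.
for runs in (100, 1000):
    before = total_startup_seconds(runs, 27)
    after = total_startup_seconds(runs, 250)
    print(f"{runs} runs: {before:.1f}s -> {after:.1f}s")
```

Under these assumptions the slowdown is a bit over 9x regardless of the
number of runs, so the absolute cost only matters for scripts that
start Emacs many times.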

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  9:47                                                   ` Daniel Colascione
@ 2016-10-24 10:00                                                     ` Eli Zaretskii
  2016-10-24 10:03                                                       ` Daniel Colascione
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 10:00 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: schwab, larsi, emacs-devel

> Cc: larsi@gnus.org, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 24 Oct 2016 02:47:03 -0700
> 
> >> $ time emacs --batch --eval t
> >> 0.027user 0.011system 0m0.048selapsed 79.66%CPU
> >
> > Then I guess you will have to continue using unexec, and when that
> > alternative disappears, switch to some other editor.
> >
> 
> I have lots of scripts that run using emacs -Q --batch; many are invoked 
> frequently in other scripts. Making each take 250ms instead of 27ms to 
> run will greatly increase the overall runtime of the high-level 
> operations.

Maybe --batch won't need to load all of the elc code, maybe we could
have a smaller batch.elc for that.  Or maybe what Ken just wrote will
bring the load time below 100 ms, who knows.

IOW, I think we are arguing prematurely about something whose
performance we don't really understand, haven't measured yet, and
haven't even written yet.  Doesn't sound like a good idea.

> I don't see a need to regress performance here, since a 
> custom malloc will perform at least as well as the last glibc malloc 
> that supported unexec (since it could in principle be a literal copy of 
> that code), and we found the performance of that malloc acceptable. I 
> care _much_ more about runtime performance than I do about allocation 
> throughput once started.

The desire to drop unexec is not just because of malloc, it's because
advances in compilers, linkers, and system security make maintenance
of unexec harder and harder.  For example, unexec is incompatible with
address sanitization and other similar security techniques.  It also
regularly breaks when some new section is invented by the linker.
Etc. etc.

Therefore, we already decided to move towards eliminating unexec, and
the only issue we should discuss is how to do that.  You are in fact
suggesting to overturn that decision, which I don't think people will
agree with.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 10:00                                                     ` Eli Zaretskii
@ 2016-10-24 10:03                                                       ` Daniel Colascione
  2016-10-24 10:18                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24 10:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: schwab, larsi, emacs-devel

On 10/24/2016 03:00 AM, Eli Zaretskii wrote:
>> Cc: larsi@gnus.org, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Mon, 24 Oct 2016 02:47:03 -0700
>>
>>>> $ time emacs --batch --eval t
>>>> 0.027user 0.011system 0m0.048selapsed 79.66%CPU
>>>
>>> Then I guess you will have to continue using unexec, and when that
>>> alternative disappears, switch to some other editor.
>>>
>>
>> I have lots of scripts that run using emacs -Q --batch; many are invoked
>> frequently in other scripts. Making each take 250ms instead of 27ms to
>> run will greatly increase the overall runtime of the high-level
>> operations.
>
> Maybe --batch won't need to load all of the elc code, maybe we could
> have a smaller batch.elc for that.  Or maybe what Ken just wrote will
> bring the load time below 100 ms, who knows.
>
> IOW, I think we are arguing prematurely about something whose
> performance we don't really understand, haven't measured yet, and
> haven't even written yet.  Doesn't sound like a good idea.
>
>> I don't see a need to regress performance here, since a
>> custom malloc will perform at least as well as the last glibc malloc
>> that supported unexec (since it could in principle be a literal copy of
>> that code), and we found the performance of that malloc acceptable. I
>> care _much_ more about runtime performance than I do about allocation
>> throughput once started.
>
> The desire to drop unexec is not just because of malloc, it's because
> advances in compilers, linkers, and system security make maintenance
> of unexec harder and harder.  For example, unexec is incompatible with
> address sanitization and other similar security techniques.  It also
> regularly breaks when some new section is invented by the linker.
> Etc. etc.
>
> Therefore, we already decided to move towards eliminating unexec, and
> the only issue we should discuss is how to do that.  You are in fact
> suggesting to overturn that decision, which I don't think people will
> agree with.

Sure. I'd like to have a PIE Emacs myself. We're talking about methods. 
I don't think the XEmacs-style "portable dumper" approach, with 
relocations, has been given adequate consideration.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 10:03                                                       ` Daniel Colascione
@ 2016-10-24 10:18                                                         ` Eli Zaretskii
  2016-10-24 10:28                                                           ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 10:18 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: schwab, larsi, emacs-devel

> Cc: schwab@suse.de, larsi@gnus.org, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 24 Oct 2016 03:03:37 -0700
> 
> I don't think the XEmacs-style "portable dumper" approach, with 
> relocations, has been given adequate consideration.

I think everyone agrees, which is why that approach is not being
considered.

But loading the "pre-loaded" *.elc files as quickly as possible is IMO
an attractive approach, because it's very simple and doesn't require
knowing too much about unrelated issues.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 10:18                                                         ` Eli Zaretskii
@ 2016-10-24 10:28                                                           ` Philipp Stephani
  2016-10-24 10:51                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2016-10-24 10:28 UTC (permalink / raw)
  To: Eli Zaretskii, Daniel Colascione; +Cc: schwab, larsi, emacs-devel

Eli Zaretskii <eliz@gnu.org> wrote on Mon, 24 Oct 2016 at 12:19:

> > Cc: schwab@suse.de, larsi@gnus.org, emacs-devel@gnu.org
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Mon, 24 Oct 2016 03:03:37 -0700
> >
> > I don't think the XEmacs-style "portable dumper" approach, with
> > relocations, has been given adequate consideration.
>
> I think everyone agrees, which is why that approach is not being
> considered.
>
> But loading the "pre-loaded" *.elc files as quickly as possible is IMO
> an attractive approach, because it's very simple and doesn't require
> knowing too much about unrelated issues.
>
>
I agree, we should strive for simplicity first and performance later. I'd
suggest to use the pre-loaded .elc approach in master and work on a faster
(but still portable) replacement later, when the need arises. Switching to
a portable dumper now means we can cut out lots of code and workarounds,
which is a significant win.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 10:28                                                           ` Philipp Stephani
@ 2016-10-24 10:51                                                             ` Eli Zaretskii
  2016-10-24 13:52                                                               ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 10:51 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: schwab, larsi, dancol, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Mon, 24 Oct 2016 10:28:06 +0000
> Cc: schwab@suse.de, larsi@gnus.org, emacs-devel@gnu.org
> 
>  But loading the "pre-loaded" *.elc files as quickly as possible is IMO
>  an attractive approach, because it's very simple and doesn't require
>  knowing too much about unrelated issues.
> 
> I agree, we should strive for simplicity first and performance later. I'd suggest to use the pre-loaded .elc
> approach in master and work on a faster (but still portable) replacement later, when the need arises.

I agree: we should make it work right first, and speed it up later.
There's a lot of room for speed improvement, some of the ideas were
already voiced here.

If someone wants to work on this, I think some of the stuff that
should be done is this:

  . Implement a command that writes a given list of *.elc files into a
    single file.

  . Make the C code that today runs at dump time and records various
    build-related variables, such as source-directory and
    system-configuration-features, record the values in a Lisp file
    (eventually will be the same .elc file that is loaded at startup).

(I'm sure there are more items in that list, but I didn't think long
enough to come up with more.)

Volunteers are welcome.
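
As a starting point for the first item, here is a naive sketch, in
Python rather than Lisp, that assumes plain byte-wise concatenation is
good enough for a first timing experiment.  It does not adjust
per-file headers, doc string references, or anything else that depends
on the name of the file being loaded; the function name and the file
names are made up for the demonstration:

```python
import tempfile
from pathlib import Path


def concat_files(inputs, output):
    """Write the contents of each input file to `output`, in order."""
    with open(output, "wb") as out:
        for name in inputs:
            data = Path(name).read_bytes()
            out.write(data)
            # Make sure the next file starts on a fresh line.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")


# Tiny demonstration with stand-in files (not real .elc contents).
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "a.elc")
    second = Path(tmp, "b.elc")
    first.write_bytes(b'(message "a")\n')
    second.write_bytes(b'(message "b")')
    combined_path = Path(tmp, "combined.elc")
    concat_files([first, second], combined_path)
    combined = combined_path.read_bytes()

print(combined.decode())
```

A real command would presumably live in Lisp and take the list of
preloaded files from the build, but the shape of the job is the same.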



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 10:51                                                             ` Eli Zaretskii
@ 2016-10-24 13:52                                                               ` Stefan Monnier
  2016-10-24 16:04                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 13:52 UTC (permalink / raw)
  To: emacs-devel

>   . Implement a command that writes a given list of *.elc files into a
>     single file.

>   . Make the C code that today runs at dump time and records various
>     build-related variables, such as source-directory and
>     system-configuration-features, record the values in a Lisp file
>     (eventually will be the same .elc file that is loaded at startup).

BTW, my dumped.elc attempt was specifically trying to solve these
issues: by dumping the state of the obarray, we automatically get these
vars set like we want them.  It also solves other side-issues such as
making sure that `C-h f dolist RET' points to "subr.el" rather than to
"dumped.elc".


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 13:52                                                               ` Stefan Monnier
@ 2016-10-24 16:04                                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 16:04 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 24 Oct 2016 09:52:54 -0400
> 
> >   . Implement a command that writes a given list of *.elc files into a
> >     single file.
> 
> >   . Make the C code that today runs at dump time and records various
> >     build-related variables, such as source-directory and
> >     system-configuration-features, record the values in a Lisp file
> >     (eventually will be the same .elc file that is loaded at startup).
> 
> BTW, my dumped.elc attempt was specifically trying to solve these
> issues: by dumping the state of the obarray, we automatically get these
> vars set like we want them.  It also solves other side-issues such as
> making sure that `C-h f dolist RET' points to "subr.el" rather than to
> "dumped.elc".

I consider tinkering with obarray's internals still too "advanced" to
prefer it to a simple generation of Lisp code that records values of a
few variables.  (And are you sure all of the information we record at
dump time is in obarray? I am not.)  The number of these variables is
not large, so finding them in the sources will not be hard.
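
To illustrate what "simple generation of Lisp code" could look like,
here is a minimal sketch in Python.  It assumes the recorded values are
strings and that plain `setq' forms are acceptable; the helper names
and the example values are hypothetical, not taken from the Emacs
sources:

```python
def lisp_string(value: str) -> str:
    """Quote a Python string as a Lisp string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_setq(name: str, value: str) -> str:
    """Render one (setq NAME "VALUE") form."""
    return f"(setq {name} {lisp_string(value)})"


# Hypothetical values; a real build would supply its own.
build_vars = {
    "source-directory": "/home/user/src/emacs/",
    "system-configuration-features": "XPM JPEG GNUTLS",
}
recorded = "\n".join(render_setq(k, v) for k, v in build_vars.items())
print(recorded)
```

The output is one form per variable, which could be appended to the
file that is loaded at startup.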



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  6:39                                         ` Eli Zaretskii
  2016-10-24  6:47                                           ` Lars Ingebrigtsen
@ 2016-10-24 13:04                                           ` Stefan Monnier
  2016-10-24 13:35                                             ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 13:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> A small price to pay for the advantages, IMO.

I think some users will run away screaming if Emacs takes a whole second
to start up.

> The most important advantage in my view is that the dumping/loading
> process becomes very simple and understandable even by people with
> minimal knowledge of C subtleties and Emacs internals,

Yes, the benefits are clear, but the cost is pretty steep.

I think we could live with a 0.2s startup time, but that's already
a pretty high cost:
- 0.2s feels sluggish when you expect "immediate".
- byte-compilation has historically moved from "do it in a single
  session", to "start a separate Emacs session for each file" for good
  reasons.  A 0.2s startup time imposes either a much slower
  byte-compilation, or will compel us to go back to "do it all in
  a single session".

> This would make future maintenance much more robust and reliable, and
> also allow more contributors to work on improving, speeding up, and
> extending the build process.  The alternatives all require us to
> depend on a dwindling handful of people, which is a huge disadvantage
> in the long run.

Maybe there's indeed a lot of speed up still waiting there, and by
reducing loading time of .elc files (and/or allowing more laziness there)
we could bring down the 0.96s to 0.2s *and* speed up other uses at the
same time.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 13:04                                           ` Stefan Monnier
@ 2016-10-24 13:35                                             ` Eli Zaretskii
  2016-10-24 14:45                                               ` Daniel Colascione
  2016-10-25 22:46                                               ` Perry E. Metzger
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 13:35 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 09:04:29 -0400
> 
> > A small price to pay for the advantages, IMO.
> 
> I think some users will run away screaming if Emacs takes a whole second
> to start up.

It depends.  If those users, like me, have hundreds of buffers in
their sessions, and use desktop.el to recreate their sessions, they
already wait a few seconds for that.

And I don't expect the result to be 1 sec, that is a rounded-up
value that is already higher than what I saw.

> > The most important advantage in my view is that the dumping/loading
> > process becomes very simple and understandable even by people with
> > minimal knowledge of C subtleties and Emacs internals,
> 
> Yes, the benefits are clear, but the cost is pretty steep.

We will have to speed this up, of course.  You didn't expect tossing
unexec to be an easy job, did you?

> I think we could live with a 0.2s startup time, but that's already
> a pretty high cost:
> - 0.2s feels sluggish when you expect "immediate".
> - byte-compilation has historically moved from "do it in a single
>   session", to "start a separate Emacs session for each file" for good
>   reasons.  A 0.2s startup time imposes either a much slower
>   byte-compilation, or will compel us to go back to "do it all in
>   a single session".

I think you forget parallelism.  We have been building Emacs with
several compilations running in parallel for a long time.  And
byte-compiling a typical file already takes more than 0.2 sec,
sometimes (often?) significantly more, so I don't see a catastrophe yet.
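
A back-of-the-envelope model of this, assuming each file gets its own
Emacs process, a fixed number of jobs run at once, and every file takes
the same time to compile.  The file count, job count, 0.5 sec compile
time and 0.03 sec baseline startup are hypothetical; only the 0.2 sec
startup comes from the discussion:

```python
import math


def build_wall_seconds(files, jobs, startup, compile_each):
    """Wall time if each file gets its own Emacs and `jobs` run at once."""
    rounds = math.ceil(files / jobs)
    return rounds * (startup + compile_each)


# Hypothetical build: 1000 files, 8 parallel jobs, 0.5 sec per file.
fast = build_wall_seconds(1000, 8, startup=0.03, compile_each=0.5)
slow = build_wall_seconds(1000, 8, startup=0.2, compile_each=0.5)
print(f"startup 0.03s: {fast:.2f}s, startup 0.2s: {slow:.2f}s")
```

With these made-up numbers the build goes from about 66 sec to about
87.5 sec.  Parallelism shrinks the absolute cost, but the relative
overhead of the slower startup stays the same.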

> > This would make future maintenance much more robust and reliable, and
> > also allow more contributors to work on improving, speeding up, and
> > extending the build process.  The alternatives all require us to
> > depend on a dwindling handful of people, which is a huge disadvantage
> > in the long run.
> 
> Maybe there's indeed a lot of speed up still waiting there, and by
> reducing loading time of .elc files (and/or allowing more laziness there)
> we could bring down the 0.96s to 0.2s *and* speed up other uses at the
> same time.

That's my hope, yes.  E.g., maybe reading the startup.elc file could
run in another thread?

In any case, I don't think it's right to throw out this idea without
trying very hard to make it work, because the benefits are so clear.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 13:35                                             ` Eli Zaretskii
@ 2016-10-24 14:45                                               ` Daniel Colascione
  2016-10-24 15:58                                                 ` Eli Zaretskii
  2016-10-25 22:46                                               ` Perry E. Metzger
  1 sibling, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24 14:45 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: emacs-devel

On 10/24/2016 06:35 AM, Eli Zaretskii wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: emacs-devel@gnu.org
>> Date: Mon, 24 Oct 2016 09:04:29 -0400
>>
>>> A small price to pay for the advantages, IMO.
>>
>> I think some users will run away screaming if Emacs takes a whole second
>> to start up.
>
> It depends.  If those users, like me, have hundreds of buffers in
> their sessions, and use desktop.el to recreate their sessions, they
> already wait a few seconds for that.
>
> And I don't expect the result to be 1 sec, that's a rounded up
> value that is already higher than what I saw.
>
>>> The most important advantage in my view is that the dumping/loading
>>> process becomes very simple and understandable even by people with
>>> minimal knowledge of C subtleties and Emacs internals,
>>
>> Yes, the benefits are clear, but the cost is pretty steep.
>
> We will have to speed this up, of course.  You didn't expect tossing
> unexec to be an easy job, did you?
>
>> I think we could live with a 0.2s startup time, but that's already
>> a pretty high cost:
>> - 0.2s feels sluggish when you expect "immediate".
>> - byte-compilation has historically moved from "do it in a single
>>   session", to "start a separate Emacs session for each file" for good
>>   reasons.  A 0.2s startup time imposes either a much slower
>>   byte-compilation, or will compel us to go back to "do it all in
>>   a single session".
>
> I think you forget parallelism.  We build Emacs with several
> compilations running in parallel for a long time.  And byte-compiling
> a typical file already takes more than 0.2 sec, sometimes (often?)
> significantly more, so I don't see a catastrophe yet.
>
>>> This would make future maintenance much more robust and reliable, and
>>> also allow more contributors to work on improving, speeding up, and
>>> extending the build process.  The alternatives all require us to
>>> depend on a dwindling handful of people, which is a huge disadvantage
>>> in the long run.
>>
>> Maybe there's indeed a lot of speed up still waiting there, and by
>> reducing loading time of .elc files (and/or allowing more laziness there)
>> we could bring down the 0.96s to 0.2s *and* speed up other uses at the
>> same time.
>
> That's my hope, yes.  E.g., maybe reading the startup.elc file could
> run in another thread?
>
> In any case, I don't think it's right to throw out this idea without
> trying very hard to make it work, because the benefits are so clear.

I'm worried that it'll be deemed to "work" at a level of performance 
much worse than what we have today. My preference would be to keep 
hammering on this approach and others until we find something with only 
minimal performance regressions. I don't see the unexec maintenance 
situation being desperate enough that we need to accept a big 
performance loss.




* Re: Skipping unexec via a big .elc file
  2016-10-24 14:45                                               ` Daniel Colascione
@ 2016-10-24 15:58                                                 ` Eli Zaretskii
  2016-10-24 16:17                                                   ` Daniel Colascione
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 15:58 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: monnier, emacs-devel

> Cc: emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 24 Oct 2016 07:45:17 -0700
> 
> > In any case, I don't think it's right to throw out this idea without
> > trying very hard to make it work, because the benefits are so clear.
> 
> I'm worried that it'll be deemed to "work" at a level of performance 
> much worse than what we have today.

Why would you worry that it'll be accepted then more easily than it's
accepted now?  The same arguments will be voiced in the future if the
solution's performance turns out to be insufficient.

> I don't see the unexec maintenance situation being desperate enough
> that we need to accept a big performance loss.

I very much disagree with this: the unexec maintenance situation is
actually so fragile that it could break at any moment, in the sense
that we could very easily get into having no people on board who know
enough about unexec to solve the next problem that will break it.  The
number of people who do know gets smaller and smaller with each year.
That is not healthy at all for the future of the project.


* Re: Skipping unexec via a big .elc file
  2016-10-24 15:58                                                 ` Eli Zaretskii
@ 2016-10-24 16:17                                                   ` Daniel Colascione
  2016-10-24 16:51                                                     ` Philipp Stephani
  2016-10-24 16:52                                                     ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24 16:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

On 10/24/2016 08:58 AM, Eli Zaretskii wrote:
>> Cc: emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Mon, 24 Oct 2016 07:45:17 -0700
>>
>>> In any case, I don't think it's right to throw out this idea without
>>> trying very hard to make it work, because the benefits are so clear.
>>
>> I'm worried that it'll be deemed to "work" at a level of performance
>> much worse than what we have today.
>
> Why would you worry that it'll be accepted then more easily than it's
> accepted now?  The same arguments will be voiced in the future if the
> solution's performance turns out to be insufficient.
>
>> I don't see the unexec maintenance situation being desperate enough
>> that we need to accept a big performance loss.
>
> I very much disagree with this: the unexec maintenance situation is
> actually so fragile that it could break at any moment, in the sense
> that we could very easily get into having no people on board who know
> enough about unexec to solve the next problem that will break it.  The
> number of people who do know gets smaller and smaller with each year.
> That is not healthy at all for the future of the project.

In both this discussion and the one about insdel, you've expressed the 
sentiment that we need to optimize for a world in which very few people 
have time to maintain Emacs internals. I have a more optimistic view: 
people are generally good at figuring things out, and if learning about 
unexec or other esoteric facilities is what prevents a developer from
porting Emacs to a new platform or fixing an important bug, that 
developer will put time into learning about these mechanisms.

That is, we *could* get into a situation where "no people on board [] 
know enough about unexec to solve the next problem", but that situation 
will resolve itself when people learn about unexec.




* Re: Skipping unexec via a big .elc file
  2016-10-24 16:17                                                   ` Daniel Colascione
@ 2016-10-24 16:51                                                     ` Philipp Stephani
  2016-10-24 19:47                                                       ` Daniel Colascione
  2016-10-24 16:52                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2016-10-24 16:51 UTC (permalink / raw)
  To: Daniel Colascione, Eli Zaretskii; +Cc: monnier, emacs-devel


Daniel Colascione <dancol@dancol.org> schrieb am Mo., 24. Okt. 2016 um
18:35 Uhr:

> That is, we *could* get into a situation where "no people on board []
> know enough about unexec to solve the next problem"

I'd argue that we are already in this situation.  For example, nobody knows
how to make unexec work with ASLR or PIE; when I tried fuzzing Emacs with
AFL, the dumped binary would simply crash; the dumped binary is not
reproducible (i.e. bit-by-bit identical after every build); and I think
dumping also doesn't work with ASan. The fraction of situations where unexec
doesn't work any more gets larger and larger. If we had people who could
solve these problems, it should get smaller instead.



* Re: Skipping unexec via a big .elc file
  2016-10-24 16:51                                                     ` Philipp Stephani
@ 2016-10-24 19:47                                                       ` Daniel Colascione
  2016-10-25 15:59                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24 19:47 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: Eli Zaretskii, monnier, emacs-devel

Philipp Stephani <p.stephani2@gmail.com> writes:

> Daniel Colascione <dancol@dancol.org> schrieb am Mo., 24. Okt. 2016 um 18:35 Uhr:
>
>  That is, we *could* get into a situation where "no people on board []
>  know enough about unexec to solve the next problem"
>
> I'd argue that we are already in this situation.  For example, nobody
> knows how to make unexec work with ASLR or PIE; when I tried fuzzing
> Emacs with AFL, the dumped binary would simply crash; the dumped
> binary is not reproducible (i.e. bit-by-bit identical after every
> build); and I think dumping also doesn't work with ASan. The fraction
> of situations where unexec doesn't work any more gets larger and
> larger. If we had people who could solve these problems, it should get
> smaller instead.

It's not a matter of "not knowing" how to make unexec work with PIE and
PIC code generally --- the problem is that the naive approach currently
used for serializing program state depends on the process address space
being reproducible: we don't specially mark pointers in the saved image,
so we can't relocate them. There have been numerous discussions on
emacs-devel about relocation schemes, with proposals ranging from just
making elc faster to translating elisp to C.

Everyone who's seriously thought about the unexec problem _understands_
the issue. unexec isn't black magic. Getting rid of the current scheme
is a matter of finding the right relocation scheme (which for all I know
might as well be "make elc better") and finding the time to
implement it.

My preferred approach is the portable dumper one: basically what we're
doing today, except that instead of just blindly copying the data
segment and heap to a new emacs binary, we'll write this information to
a separate file, stored in a portable format, a file that we'll keep
alongside the Emacs binary.  We'll store in this file metadata about
where the pointers are. (There are two kinds of pointers in this file:
pointers to other parts of the file and pointers to the Emacs binary.)

At startup, we'll load the dump file and walk the relocations, fixing up
all the embedded addresses to account for the new process's different
address space.  There's no binary other than the one that the compiler
generates; this data file is just data, so ASLR, ASAN, and other clever
things should work fine. (Some people have proposed asking the system
dynamic linker to do the relocating, but I'd prefer to do it ourselves,
in a portable way.)
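
To make the relocation walk concrete, here is a minimal sketch in Python (chosen for brevity; a real dumper would be C inside Emacs). Everything in it is an assumption for illustration: the image is a flat array of 64-bit little-endian words, pointers are stored as offsets relative to either the dump or the Emacs binary, and the names `make_dump` and `apply_relocations` are hypothetical, not existing Emacs functions.

```python
import struct

# One 64-bit little-endian slot; assumed word size for this sketch.
WORD = struct.Struct("<Q")
MASK = 0xFFFFFFFFFFFFFFFF


def make_dump(words, relocs):
    """Pack WORDS into a flat image.

    RELOCS is the metadata described above: a list of (offset, kind)
    pairs, where kind is "dump" (pointer into the dump file itself)
    or "binary" (pointer into the Emacs executable).
    """
    image = bytearray()
    for w in words:
        image += WORD.pack(w)
    return image, list(relocs)


def apply_relocations(image, relocs, dump_base, binary_base):
    """Fix up every recorded pointer slot for the new address space."""
    bases = {"dump": dump_base, "binary": binary_base}
    for offset, kind in relocs:
        (value,) = WORD.unpack_from(image, offset)
        # Stored value is an offset; add the base chosen at load time.
        WORD.pack_into(image, offset, (value + bases[kind]) & MASK)
    return image
```

Slots that are not listed in the relocation table (plain integers, string bytes) are left untouched, which is why the table has to record where the pointers are.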

We can't save all of the Emacs data segment this way, but we can
relocate and restore anything that's marked with staticpro. The overall
experience should be very similar to what we have today.

Additionally, the purespace concept remains useful: if we take pure
storage and put it in its own region of the dump file, we don't need to
take copy-on-write faults for data that cannot contain pointers.

Speaking of COW faults: a refinement of this scheme is to do the
relocations lazily, in a SIGSEGV handler.  (Map the dump file PROT_NONE
so any access traps.)  In the SIGSEGV handler, we can relocate just the
page we faulted, then continue. This way, we don't need to slurp in the
entire dump file from disk just to start emacs -Q -batch: we can
demand-page!

Whether this refinement is worth the trouble is something only
experimentation can tell, but it's an option if we need it.  With this
refinement, the portable dumping approach should be safe, semantically
familiar to unexec, ASLR-compatible, _and_ very nearly as fast as what
we have today.
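
The lazy refinement can be simulated without any signal handling. The sketch below is only a model, written in Python under stated assumptions: the "trap" is an explicit check in an accessor rather than a real PROT_NONE mapping plus SIGSEGV handler, pointer slots are word-aligned so none straddles a page, and `LazyDump`, `read_word`, and the 4096-byte page size are hypothetical names and values chosen for illustration.

```python
import struct

PAGE_SIZE = 4096                 # assumed page size for this sketch
WORD = struct.Struct("<Q")       # one 64-bit little-endian pointer slot
MASK = 0xFFFFFFFFFFFFFFFF


class LazyDump:
    """Relocate a dump image one page at a time, on first access."""

    def __init__(self, image, reloc_offsets, base):
        self.image = bytearray(image)
        self.base = base
        self.relocated_pages = set()
        # Group relocation offsets by the page they live on.
        self.by_page = {}
        for off in reloc_offsets:
            self.by_page.setdefault(off // PAGE_SIZE, []).append(off)

    def _fault(self, page):
        # Stand-in for the SIGSEGV handler: fix up just this page.
        for off in self.by_page.get(page, ()):
            (value,) = WORD.unpack_from(self.image, off)
            WORD.pack_into(self.image, off, (value + self.base) & MASK)
        self.relocated_pages.add(page)

    def read_word(self, offset):
        page = offset // PAGE_SIZE
        if page not in self.relocated_pages:
            self._fault(page)
        (value,) = WORD.unpack_from(self.image, offset)
        return value
```

Pages that are never read are never relocated, which is the whole point: a short batch run only pays for the parts of the dump it touches.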


* Re: Skipping unexec via a big .elc file
  2016-10-24 19:47                                                       ` Daniel Colascione
@ 2016-10-25 15:59                                                         ` Eli Zaretskii
  2016-10-25 16:14                                                           ` Daniel Colascione
                                                                             ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 15:59 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: p.stephani2, monnier, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Cc: Eli Zaretskii <eliz@gnu.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 12:47:56 -0700
> 
> > I'd argue that we are already in this situation.  For example, nobody
> > knows how to make unexec work with ASLR or PIE; when I tried fuzzing
> > Emacs with AFL, the dumped binary would simply crash; the dumped
> > binary is not reproducible (i.e. bit-by-bit identical after every
> > build); and I think dumping also doesn't work with ASan. The fraction
> > of situations where unexec doesn't work any more gets larger and
> > larger. If we had people who could solve these problems, it should get
> > smaller instead.
> 
> Everyone who's seriously thought about the unexec problem _understands_
> the issue.

The important point is that the number of people here who can claim
such understanding, enough so to fix the issues, is diminishingly
small, and gets smaller every year.

> My preferred approach is the portable dumper one: basically what we're
> doing today, except that instead of just blindly copying the data
> segment and heap to a new emacs binary, we'll write this information to
> a separate file, stored in a portable format, a file that we'll keep
> alongside the Emacs binary.  We'll store in this file metadata about
> where the pointers are. (There are two kinds of pointers in this file:
> pointers to other parts of the file and pointers to the Emacs binary.)
> 
> At startup, we'll load the dump file and walk the relocations, fixing up
> all the embedded addresses to account for the new process's different
> address space.

Why do you think this will have better performance than reading a
single .elc file at startup?  It's still mainly file I/O and
processing of the file's contents, just like with byte-compiled files.

If we have no reason to believe this portable dumper will be
significantly faster, we should IMO investigate the .elc method first,
because it's so much simpler, both in its implementation and in future
maintenance.  E.g., adding a new kind of Lisp object to Emacs would
require corresponding changes in the dumper.

> We can't save all of the Emacs data segment this way, but we can
> relocate and restore anything that's marked with staticpro. The overall
> experience should be very similar to what we have today.
> [...]
> Speaking of COW faults: a refinement of this scheme is to do the
> relocations lazily, in a SIGSEGV handler.  (Map the dump file PROT_NONE
> so any access traps.)  In the SIGSEGV handler, we can relocate just the
> page we faulted, then continue. This way, we don't need to slurp in the
> entire dump file from disk just to start emacs -Q -batch: we can
> demand-page!

Demand paging in an application, and an application such as Emacs on
top of that, makes little sense to me.  This is the OS business, not
ours.  Using mmap as a fast way to read a file, yes, that's done in
many applications.  But please let's leave demand paging out of our
scope.

IMO the less we mess with low-level techniques that no other
applications use the better, both because we have very few people who
can do that and because doing so runs higher risk of becoming broken
by future developments in the platforms we deem important.  The
long-term tendency in Emacs development should be to move away from
such techniques, not to acquire more of them.


* Re: Skipping unexec via a big .elc file
  2016-10-25 15:59                                                         ` Eli Zaretskii
@ 2016-10-25 16:14                                                           ` Daniel Colascione
  2016-10-25 17:05                                                             ` Eli Zaretskii
  2016-10-25 19:49                                                           ` Stefan Monnier
  2016-10-25 22:53                                                           ` Perry E. Metzger
  2 siblings, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-25 16:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, monnier, emacs-devel

On 10/25/2016 08:59 AM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Cc: Eli Zaretskii <eliz@gnu.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> Date: Mon, 24 Oct 2016 12:47:56 -0700
>>
>>> I'd argue that we are already in this situation.  For example, nobody
>>> knows how to make unexec work with ASLR or PIE; when I tried fuzzing
>>> Emacs with AFL, the dumped binary would simply crash; the dumped
>>> binary is not reproducible (i.e. bit-by-bit identical after every
>>> build); and I think dumping also doesn't work with ASan. The fraction
>>> of situations where unexec doesn't work any more gets larger and
>>> larger. If we had people who could solve these problems, it should get
>>> smaller instead.
>>
>> Everyone who's seriously thought about the unexec problem _understands_
>> the issue.
>
> The important point is that the number of people here who can claim
> such understanding, enough so to fix the issues, is diminishingly
> small, and gets smaller every year.

There's no demand for more yet. There isn't a catastrophe --- just low 
demand for core-change expertise. There used to be a lot more (at least
per-capita) stonemasons in historical societies than in today's society. 
That doesn't mean we've forgotten how to cut stones, and if there were a 
sudden need to do it, more stonemasons would magically appear.

>> My preferred approach is the portable dumper one: basically what we're
>> doing today, except that instead of just blindly copying the data
>> segment and heap to a new emacs binary, we'll write this information to
>> a separate file, stored in a portable format, a file that we'll keep
>> alongside the Emacs binary.  We'll store in this file metadata about
>> where the pointers are. (There are two kinds of pointers in this file:
>> pointers to other parts of the file and pointers to the Emacs binary.)
>>
>> At startup, we'll load the dump file and walk the relocations, fixing up
>> all the embedded addresses to account for the new process's different
>> address space.
>
> Why do you think this will have better performance than reading a
> single .elc file at startup?  It's still mainly file I/O and
> processing of the file's contents, just like with byte-compiled files.

Because a portable dumper can do less, on both file I/O and processing 
of the file's contents. There's no lisp evaluation, no slurping a whole 
file into memory. Having to read all of Emacs into memory on startup is 
a burden even on a fast, modern machine like mine.

~/edev/trunk/src
$ sync && sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

~/edev/trunk/src
$ time pv < emacs >/dev/null
48.6MiB 0:00:00 [ 455MiB/s] 
[=========================================================>] 100% 


real	0m0.116s
user	0m0.000s
sys	0m0.016s

That's pretty fast, but it's not free. Not having to do this much IO on 
startup in the first place would be even better.

> If we have no reason to believe this portable dumper will be
> significantly faster, we should IMO investigate the .elc method first,
> because it's so much simpler, both in its implementation and in future
> maintenance.  E.g., adding a new kind of Lisp object to Emacs would
> require corresponding changes in the dumper.

Adding a new kind of lisp object requires changes throughout core 
anyway. At the very least, you need to teach GC where your new object 
keeps its pointers, and that's exactly the knowledge that the dumper 
would need.

>> We can't save all of the Emacs data segment this way, but we can
>> relocate and restore anything that's marked with staticpro. The overall
>> experience should be very similar to what we have today.
>> [...]
>> Speaking of COW faults: a refinement of this scheme is to do the
>> relocations lazily, in a SIGSEGV handler.  (Map the dump file PROT_NONE
>> so any access traps.)  In the SIGSEGV handler, we can relocate just the
>> page we faulted, then continue. This way, we don't need to slurp in the
>> entire dump file from disk just to start emacs -Q -batch: we can
>> demand-page!
>
> Demand paging in an application, and an application such as Emacs on
> top of that, makes little sense to me.

Why? It's conceptually no different from autoload. There is no technique 
in computer science so rarefied that it's only good in ring zero.

> This is the OS business, not
> ours.  Using mmap as a fast way to read a file, yes, that's done in
> many applications.  But please let's leave demand paging out of our
> scope.

Emacs isn't just an application. It's a Lisp virtual machine, and 
employing the optimization techniques used in other virtual machines can 
be important wins.

(FWIW, mmap isn't a particularly fast way of doing bulk file reads. 
That's why GNU grep removed its mmap support.)

> IMO the less we mess with low-level techniques that no other
> applications use the better, both because we have very few people who
> can do that and because doing so runs higher risk of becoming broken
> by future developments in the platforms we deem important.  The
> long-term tendency in Emacs development should be to move away from
> such techniques, not to acquire more of them.

I'm for anything that delivers meaningful performance advantages.




* Re: Skipping unexec via a big .elc file
  2016-10-25 16:14                                                           ` Daniel Colascione
@ 2016-10-25 17:05                                                             ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 17:05 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: p.stephani2, monnier, emacs-devel

> Cc: p.stephani2@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Tue, 25 Oct 2016 09:14:55 -0700
> 
> >> Everyone who's seriously thought about the unexec problem _understands_
> >> the issue.
> >
> > The important point is that the number of people here who can claim
> > such understanding, enough so to fix the issues, is diminishingly
> > small, and gets smaller every year.
> 
> There's no demand for more yet.

Not true.  Demand for this level of expertise is continuous in Emacs,
and has not dwindled in the 25 years that I've been involved.

> There used to be a lot more (at least
> per-capita) stonemasons in historical societies than in today's society. 
> That doesn't mean we've forgotten how to cut stones, and if there were a 
> sudden need to do it, more stonemasons would magically appear.

I think your optimism is misplaced.  I'm old enough to have seen
several proficiencies go extinct due to new technology that made them
irrelevant.  When demand for those forgotten proficiencies came up,
people invariably ran to the few still around who knew how to do it;
they didn't learn it themselves (and didn't even know how).

> > Why do you think this will have better performance than reading a
> > single .elc file at startup?  It's still mainly file I/O and
> > processing of the file's contents, just like with byte-compiled files.
> 
> Because a portable dumper can do less, on both file I/O and processing 
> of the file's contents. There's no lisp evaluation, no slurping a whole 
> file into memory. Having to read all of Emacs into memory on startup is 
> a burden even on a fast, modern machine like mine.
> 
> ~/edev/trunk/src
> $ sync && sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
> 
> ~/edev/trunk/src
> $ time pv < emacs >/dev/null
> 48.6MiB 0:00:00 [ 455MiB/s] 
> [=========================================================>] 100% 
> 
> 
> real	0m0.116s

Which is definitely comparable with my measurements of loading all of
the *.elc files concatenated, which were proclaimed to be "too slow".

> > If we have no reason to believe this portable dumper will be
> > significantly faster, we should IMO investigate the .elc method first,
> > because it's so much simpler, both in its implementation and in future
> > maintenance.  E.g., adding a new kind of Lisp object to Emacs would
> > require corresponding changes in the dumper.
> 
> Adding a new kind of lisp object requires changes throughout core 
> anyway.

The changes in the dumper are _in_addition_ to that.

> > Demand paging in an application, and an application such as Emacs on
> > top of that, makes little sense to me.
> 
> Why? It's conceptually no different from autoload.

The devil is in the details, though.  And there are a lot of details
in this case that are completely unrelated to the concept.  If you
don't get them all right, you get a subtly unstable application that
will crash randomly in hard to reproduce and debug situations.

> > This is the OS business, not
> > ours.  Using mmap as a fast way to read a file, yes, that's done in
> > many applications.  But please let's leave demand paging out of our
> > scope.
> 
> Emacs isn't just an application. It's a Lisp virtual machine

No, it's not.  It's an application with a powerful extension language.

> (FWIW, mmap isn't a particularly fast way of doing bulk file reads. 
> That's why GNU grep removed its mmap support.)

It was an example of a low-level technique that is sufficiently simple
to use, that's all.

> > IMO the less we mess with low-level techniques that no other
> > applications use the better, both because we have very few people who
> > can do that and because doing so runs higher risk of becoming broken
> > by future developments in the platforms we deem important.  The
> > long-term tendency in Emacs development should be to move away from
> > such techniques, not to acquire more of them.
> 
> I'm for anything that delivers meaningful performance advantages.

IME, that way lies madness.  It's the exact opposite of the direction
Emacs should evolve if we want to prevent it from becoming a marginal
package for a few enthusiasts.




* Re: Skipping unexec via a big .elc file
  2016-10-25 15:59                                                         ` Eli Zaretskii
  2016-10-25 16:14                                                           ` Daniel Colascione
@ 2016-10-25 19:49                                                           ` Stefan Monnier
  2016-10-25 22:53                                                           ` Perry E. Metzger
  2 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-25 19:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, Daniel Colascione, emacs-devel

>> At startup, we'll load the dump file and walk the relocations, fixing up
>> all the embedded addresses to account for the new process's different
>> address space.
> Why do you think this will have better performance than reading a
> single .elc file at startup?  It's still mainly file I/O and
> processing of the file's contents, just like with byte-compiled files.

I guess it depends if we can get lread.c to be bound by file-I/O.
Currently, it's significantly slower.

It's clear on the surface that lread.c has more work to do than an ideal
"portable undumper":
- the PU just has to find all pointers and increment them by
  a fixed offset (it could do so either by a GC-like traversal, or by
  consulting an auxiliary precomputed table of addresses stored alongside
  the dump state)
- lread.c has to check every byte for lexing/parsing, it has to call the
  memory allocator for every object, tie the knots for cyclic objects,
  `intern` the symbols, decode the \ in strings, ...

The jury is still out whether this extra work can be implemented
efficiently enough.  There are other differences which can impact the
performance (e.g. the size of the "dump" is likely different in the two
cases, so the amount of I/O is affected) and the desirability (speeding
up loading of .elc would benefit other cases as well, we could generate
a generic dump.elc rather than have it be OS-specific, ...)


        Stefan




* Re: Skipping unexec via a big .elc file
  2016-10-25 15:59                                                         ` Eli Zaretskii
  2016-10-25 16:14                                                           ` Daniel Colascione
  2016-10-25 19:49                                                           ` Stefan Monnier
@ 2016-10-25 22:53                                                           ` Perry E. Metzger
  2016-10-26  2:36                                                             ` Eli Zaretskii
  2 siblings, 1 reply; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-25 22:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, Daniel Colascione, monnier, emacs-devel

On Tue, 25 Oct 2016 18:59:36 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> > Everyone who's seriously thought about the unexec problem
> > _understands_ the issue.
> 
> The important point is that the number of people here who can claim
> such understanding, enough so to fix the issues, is diminishingly
> small, and gets smaller every year.

Just an aside: when you attract fewer and fewer users, you end up with
fewer and fewer contributors. Fewer and fewer contributors makes
maintenance harder and can create a death spiral for projects. If, in
an effort to make maintenance easier, you scare off a lot of users,
you could end up making the maintenance situation worse in the long
run. I'm not saying that longer start time would scare off users as
such, but in general, this balance has to be weighed in making
decisions about usability vs. maintenance costs.

Perry
-- 
Perry E. Metzger		perry@piermont.com


* Re: Skipping unexec via a big .elc file
  2016-10-25 22:53                                                           ` Perry E. Metzger
@ 2016-10-26  2:36                                                             ` Eli Zaretskii
  2016-10-26  2:37                                                               ` Perry E. Metzger
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-26  2:36 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: p.stephani2, dancol, monnier, emacs-devel

> Date: Tue, 25 Oct 2016 18:53:13 -0400
> From: "Perry E. Metzger" <perry@piermont.com>
> Cc: Daniel Colascione <dancol@dancol.org>, p.stephani2@gmail.com,
>  monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> On Tue, 25 Oct 2016 18:59:36 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> > > Everyone who's seriously thought about the unexec problem
> > > _understands_ the issue.
> > 
> > The important point is that the number of people here who can claim
> > such understanding, enough so to fix the issues, is diminishingly
> > small, and gets smaller every year.
> 
> Just an aside: when you attract fewer and fewer users, you end up with
> fewer and fewer contributors. Fewer and fewer contributors makes
> maintenance harder and can create a death spiral for projects. If, in
> an effort to make maintenance easier, you scare off a lot of users,
> you could end up making the maintenance situation worse in the long
> run. I'm not saying that longer start time would scare off users as
> such, but in general, this balance has to be weighed in making
> decisions about usability vs. maintenance costs.

That's a profoundly false premise, and a misrepresentation of
everything I wrote.  No one is arguing for slower startup that will
annoy users!  The issue at hand is which approach to prefer when the
startup time is comparable.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-26  2:36                                                             ` Eli Zaretskii
@ 2016-10-26  2:37                                                               ` Perry E. Metzger
  0 siblings, 0 replies; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-26  2:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: p.stephani2, dancol, monnier, emacs-devel

On Wed, 26 Oct 2016 05:36:19 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> That's a profoundly false premise, and a misrepresentation of
> everything I wrote.  No one is arguing for slower startup that will
> annoy users!  The issue at hand is which approach to prefer when the
> startup time is comparable.

Good to hear.

Perry
-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 16:17                                                   ` Daniel Colascione
  2016-10-24 16:51                                                     ` Philipp Stephani
@ 2016-10-24 16:52                                                     ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 16:52 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 24 Oct 2016 09:17:14 -0700
> 
> > I very much disagree with this: the unexec maintenance situation is
> > actually so fragile that it could break at any moment, in the sense
> > that we could very easily get into having no people on board who know
> > enough about unexec to solve the next problem that will break it.  The
> > number of people who do know gets smaller and smaller with each year.
> > That is not healthy at all for the future of the project.
> 
> In both this discussion and the one about insdel, you've expressed the 
> sentiment that we need to optimize for a world in which very few people 
> have time to maintain Emacs internals. I have a more optimistic view: 
> people are generally good at figuring things out, and if learning about 
> unexec or other esoteric facilities is what prevents a developer from 
> porting Emacs to a new platform or fixing an important bug, that 
> developer will put time into learning about these mechanisms.

Even if you are right, such "figuring out" will take time, and will
delay Emacs development if not stall it.  With enough bad luck, we
could start people abandoning ship.  Like I said, Emacs already cannot
be built on a system with ASLR; how soon do you think this and similar
problems will be considered fatal flaws?

Yes, I'm a pessimist about these aspects of Emacs development.  My
reasons are what I see before my eyes almost every day: some problems
in Emacs are not touched until one of the few who know enough do it.
Look at the last installment of this saga, with ralloc-induced
problems: the same usual suspects are involved in solving it.  If all
of those few were run over by a bus, how fast would these problems be
identified and solved?  And this problem is by far simpler than the
unexec subtleties.  It's no accident that no one (perhaps except Paul)
is seriously working on the unexec replacement.  Why would you believe
that this could change in the future, when most of our contributors
lack proficiency working at this level?

So yes, I think your optimism is misplaced.  But that doesn't matter,
because no solution for unexec that is not good enough,
performance-wise and otherwise, will be accepted by the crowd, no
matter how grave the current situation is.  So you should not be
worried about this.  What _is_ important, IMO, is that if and when we
do need to drop unexec, we will have _some_ solution, however
imperfect, to start with and get it up to speed.  Because whatever the
solution, making it happen is a lot of work, and we had better have
done most of it by then.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 13:35                                             ` Eli Zaretskii
  2016-10-24 14:45                                               ` Daniel Colascione
@ 2016-10-25 22:46                                               ` Perry E. Metzger
  1 sibling, 0 replies; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-25 22:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

On Mon, 24 Oct 2016 16:35:08 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> > I think some users will run away screaming if Emacs takes a whole
> > second to start up.  
> 
> It depends.  If those users, like me, have hundreds of buffers in
> their sessions, and use desktop.el to recreate their sessions, they
> already wait a few seconds for that.

Just FYI, I find the fact that emacs starts up instantaneously a big
win. If it starts taking a substantial period to start up again like
it did 30 years ago, I'm going to be unhappy. I get that this
simplifies maintenance but the startup time is a big lose.

Perry
-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  1:07                                       ` Stefan Monnier
  2016-10-24  6:39                                         ` Eli Zaretskii
@ 2016-10-24  9:40                                         ` Ken Raeburn
  2016-10-24 13:13                                           ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-10-24  9:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

> On Oct 23, 2016, at 21:07, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>> That sounds strangely long, as I got less than 2 sec with all the
>> preloaded *.elc files concatenated to a single file, and that's before
>> I made pure-copy a no-op.
>> Another report was that "loadup" with pure-copy short-circuited took
>> less than 0.5 sec.  See
> 
> Hmm... indeed, I got to 0.72s with his patch (on a different, slower
> machine (a Thinkpad X201s, i.e. with an i7 CPU L620 @ 2.00GHz)).
> 
> If I re-add international/characters it goes up a bit to
> 0.96s, but still nowhere near the 3s I got on my big .elc file.
> [ I wonder what makes loading my big file so slow.  ]
> 
> This said, there's still a factor 5-10 to get to "immediate", tho.

I think this came up in the thread Eli referred to, but when I’ve looked at startup time in CANNOT_DUMP builds, a couple of things jumped out at me:

* Garbage collection time.  If we’re not trying to dump out as compact as possible an image, squeezing out every byte is less important.  Drop all of the explicit calls in loadup.el.  Consider raising gc-cons-threshold to the point where it doesn’t trigger during loadup; maybe set it back after startup completes, or the first time Emacs is idle more than a couple seconds.

* I/O processing time — not the I/O system calls, but the C library processing.  Change getc to getc_unlocked in charset.c and lread.c.  (And/or change the loading of dumped.elc to read everything into a buffer and execute code from the buffer, if that might be faster.)  Mutex locking time is costly on Mac OS X, but not exactly free in glibc either.
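
To illustrate the second point, here is a minimal sketch of the getc_unlocked idea.  It is not the actual charset.c or lread.c change; count_newlines_unlocked is a hypothetical helper, and the sketch assumes a POSIX system.  The stream lock is taken once with flockfile, and each character is then read without further locking:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the newline bytes remaining in STREAM.
   The stream lock is acquired once for the whole loop, rather than
   once per character as plain getc would do.  */
static long
count_newlines_unlocked (FILE *stream)
{
  long count = 0;
  int c;

  assert (stream != NULL);
  flockfile (stream);
  while ((c = getc_unlocked (stream)) != EOF)
    if (c == '\n')
      count++;
  funlockfile (stream);
  return count;
}
```

If only one thread ever touches the stream, the explicit flockfile/funlockfile pair could arguably be dropped as well; it is kept here so the sketch stays safe under concurrent use.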

As I recall, I had startup times under a second without any loadup/dump preprocessing with these changes.  (And all the “purecopy” stuff skipped, in a CANNOT_DUMP build.)

Your “dumped.elc” might trigger some of the same issues.  If the eventual idea is to stuff the “dumped” data into a char array to link into the final installed executable, the second issue is less relevant, though.

Did you check whether actually byte compiling the written file made a difference?

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24  9:40                                         ` Ken Raeburn
@ 2016-10-24 13:13                                           ` Stefan Monnier
  2016-10-25  9:02                                             ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 13:13 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Eli Zaretskii, emacs-devel

> * Garbage collection time.  If we’re not trying to dump out as compact as
> possible an image, squeezing out every byte is less important.  Drop all of
> the explicit calls in loadup.el.  Consider raising gc-cons-threshold to the
> point where it doesn’t trigger during loadup; maybe set it back after
> startup completes, or the first time Emacs is idle more
> than a couple seconds.

The patch to which I was referring (and which I used) does get rid of
most gc calls.

> As I recall, I had startup times under a second without any loadup/dump
> preprocessing with these changes.  (And all the “purecopy” stuff skipped,
> in a CANNOT_DUMP build.)

The patch also skips the purecopy by setting purify-flag to nil.

> Did you check whether actually byte compiling the written file made
> a difference?

dumped.elc has no code to compile.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 13:13                                           ` Stefan Monnier
@ 2016-10-25  9:02                                             ` Ken Raeburn
  2016-10-25 13:48                                               ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-10-25  9:02 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

On Oct 24, 2016, at 09:13, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> Did you check whether actually byte compiling the written file made
>> a difference?
> 
> dumped.elc has no code to compile.

It has a lot of fset and setplist calls which can be compiled, especially if you reorder things such that they’re not mixed up with the defvar calls that don’t compile.  The generated .elc output is about 25% larger.  I don’t expect the C parts of fset and setplist to be affected at all, of course; the parsing and interpretation of the Lisp may be another matter.  Unfortunately, byte-compile-file doesn’t preserve the sharing of objects (“#42#”) present in the input file, so the output isn’t semantically the same.

I did some profiling.  Without byte compiling, it appears that around half of the CPU time used loading the file in my test is spent in Frassq(…,read_objects), called from substitute_object_recurse.  For processing a file with this much sharing of objects, an assoc list with O(n) access time may not be the best choice.  Whatever we replace it with, it appears we need to be able to look up cons cells in a collection by either element.

The next top users of CPU time (_IO_getc, oblookup) are less significant, though there are some easy minor gains to be made there.

With a hacked-up 31-slot hash table replacing read_objects, the getc_unlocked changes, and setting OBARRAY_SIZE to 8191, I got the load time for the file in batch mode on my test system from just under a half second to about a quarter second.  Nearly half the remaining CPU time is split between readchar, read1, readbyte_from_file, and Fassq.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-25  9:02                                             ` Ken Raeburn
@ 2016-10-25 13:48                                               ` Stefan Monnier
  2016-10-27  8:51                                                 ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-25 13:48 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Eli Zaretskii, emacs-devel

>>> Did you check whether actually byte compiling the written file made
>>> a difference?
>> dumped.elc has no code to compile.
> It has a lot of fset and setplist calls which can be compiled, especially if
> you reorder things such that they’re not mixed up with the defvar calls that
> don’t compile.

"A lot of" is relative: the time to read them compared to an equivalent
byte-code version should be negligible, and their execution time should
be even more negligible.

> The generated .elc output is about 25% larger.

That's not because of byte-compilation per se.  It's because the
byte-compiler uses `print-circle' but only within each top-level entity,
so you lose sharing between functions and between variables.

IOW you can get the exact same 25% larger file by printing each
fset/defvar/setplist separately (instead of printing them as one big
`progn`).  And you can trick the byte-compiler into preserving this
sharing by replacing the leading `progn` (which the byte-compiler
removes) with a (let () ...), tho maybe you'll need to really add some
dummy binding in that `let` to make sure the byte-compiler doesn't end
up removing it.

> I did some profiling.  Without byte compiling, it appears that around half
> of the CPU time used loading the file in my test is spent in
> Frassq(…,read_objects), called from substitute_object_recurse.

Ah, that's what it is.  Clearly we should be able to optimize most of
this away.

> For processing a file with this much sharing of objects, an assoc list with
> O(n) access time may not be the best choice.

Indeed.

> Whatever we replace it with, it appears we need to be able to look up
> cons cells in a collection by either element.

Ideally, we could get rid of substitute_object_in_subtree entirely.
E.g. the patch below skips it for the case of "#n=(...)", and by peeking
ahead to decide the type of placeholder we build, we should be able to
get rid of it in all cases.
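
As a rough illustration of the cons case, here is a self-contained toy in plain C.  `struct cell` and `fill_placeholder` are hypothetical stand-ins for a Lisp cons and for the Fsetcar/Fsetcdr pair, not Emacs code.  Every "#n#" read so far already points at the placeholder, so overwriting the placeholder's fields makes all of those references see the real value without walking the subtree:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cons cell.  */
struct cell
{
  struct cell *car;
  struct cell *cdr;
};

/* Copy VALUE's fields into PLACEHOLDER and return PLACEHOLDER.
   Pointers to PLACEHOLDER stored earlier (the "#n#" references)
   stay valid and now denote the finished object.  */
static struct cell *
fill_placeholder (struct cell *placeholder, const struct cell *value)
{
  assert (placeholder != NULL && value != NULL);
  placeholder->car = value->car;
  placeholder->cdr = value->cdr;
  return placeholder;
}
```

For an input like "#1=(a . #1#)", the freshly read value has the placeholder in its cdr, so after the copy the placeholder's cdr points back at the placeholder itself and the cycle is closed with two stores.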

        Stefan

diff --git a/src/lread.c b/src/lread.c
index 58d518c..a06a78f 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -2936,12 +2936,21 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
 		      tem = read0 (readcharfun);

 		      /* Now put it everywhere the placeholder was...  */
-		      substitute_object_in_subtree (tem, placeholder);
+                      if (CONSP (tem))
+                        {
+                          Fsetcar (placeholder, XCAR (tem));
+                          Fsetcdr (placeholder, XCDR (tem));
+                          return placeholder;
+                        }
+                      else
+                        {
+		          substitute_object_in_subtree (tem, placeholder);

-		      /* ...and #n# will use the real value from now on.  */
-		      Fsetcdr (cell, tem);
+		          /* ...and #n# will use the real value from now on.  */
+		          Fsetcdr (cell, tem);

-		      return tem;
+		          return tem;
+                        }
 		    }

 		  /* #n# returns a previously read object.  */

^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-25 13:48                                               ` Stefan Monnier
@ 2016-10-27  8:51                                                 ` Ken Raeburn
  2016-10-30 14:43                                                   ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-10-27  8:51 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

> On Oct 25, 2016, at 09:48, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>>>> Did you check whether actually byte compiling the written file made
>>>> a difference?
>>> dumped.elc has no code to compile.
>> It has a lot of fset and setplist calls which can be compiled, especially if
>> you reorder things such that they’re not mixed up with the defvar calls that
>> don’t compile.
> 
> "A lot of" is relative: the time to read them compared to an equivalent
> byte-code version should be negligible, and their execution time should
> be even more negligible.
> 
>> The generated .elc output is about 25% larger.
> 
> That's not because of byte-compilation per se.  It's because the
> byte-compiler uses `print-circle' but only within each top-level entity,
> so you lose sharing between functions and between variables.
> 
> IOW you can get the exact same 25% larger file by printing each
> fset/defvar/setplist separately (instead of printing them as one big
> `progn`).  And you can trick the byte-compiler into preserving this
> sharing by replacing the leading `progn` (which the byte-compiler
> removes) with a (let () ...), tho maybe you'll need to really add some
> dummy binding in that `let` to make sure the byte-compiler doesn't end
> up removing it.

Ah, yes… “(let () …)” was enough with no bindings.  The compiled file, which now contains only one big byte-code invocation, is still larger than the original dumped file, though not by as much, and from a couple of spot checks it looks like the data sharing is indeed preserved.  It also takes longer to load.  Oh well.

> Ideally, we could get rid of substitute_object_in_subtree entirely.
> E.g. the patch below skips it for the case of "#n=(...)", and by peeking
> ahead to decide the type of placeholder we build, we should be able to
> get rid of it in all cases.

I would think not for types using flexible array members, since we may not know the allocation size until we’ve seen the end of the object.

In poking around with gdb, most of the invocations of substitute_object_in_subtree I looked at got a subtree of nil.  It appears to me that if the “subtree” passed isn’t the placeholder and isn’t one of the types we process recursively, then we will never do any substitution, right?  So the checking of seen_list and read_objects isn’t relevant.

I started my tests over with an updated source tree from upstream and put in your loadup.el change.  Running “time emacs -batch -l dumped.elc” took 3.5s; according to “perf record”/“perf report”, Frassq took about 85% of the CPU time, and Fassq took about 9%.

Added your lread.c patch; run time is about 1.8s, 70% in Frassq and almost 20% in Fassq.

Patched substitute_object_recurse after the check for the subtree matching the placeholder, so that if the subtree passed was a symbol or number, it would simply be returned without consulting seen_list or read_objects.  Run time is now 0.7s; Fassq is a bit over 50% of that, and Frassq about 17%, and _IO_getc around 11%.  I think it should be safe to short-circuit it for some other types as well.
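
The shape of that short-circuit, as a toy sketch rather than the actual lread.c code: `struct obj`, `substitute`, and the `bookkeeping_lookups` counter are all made up for illustration, the counter merely marks where the seen_list/read_objects checks would go, and the sketch assumes acyclic input since it keeps no seen list.

```c
#include <assert.h>
#include <stddef.h>

enum kind { KIND_SYMBOL, KIND_NUMBER, KIND_CONS };

struct obj
{
  enum kind kind;
  struct obj *car, *cdr;   /* Used only when kind == KIND_CONS.  */
};

/* Counts how often we reach the point where the expensive
   seen-list / read-objects checks would be performed.  */
static unsigned long bookkeeping_lookups;

/* Replace every reference to PLACEHOLDER under SUBTREE with VALUE.
   Assumes SUBTREE is acyclic (hypothetical simplification).  */
static struct obj *
substitute (struct obj *subtree, struct obj *placeholder, struct obj *value)
{
  assert (subtree != NULL);
  if (subtree == placeholder)
    return value;

  /* Symbols and numbers hold no references, so nothing beneath them
     can need substitution: return before any bookkeeping.  */
  if (subtree->kind != KIND_CONS)
    return subtree;

  bookkeeping_lookups++;
  subtree->car = substitute (subtree->car, placeholder, value);
  subtree->cdr = substitute (subtree->cdr, placeholder, value);
  return subtree;
}
```

In a three-element list, only the three cons cells reach the bookkeeping step; the atoms in the list and the terminating nil return immediately.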

I had my getc_unlocked change sitting around so I pulled that in.  Run time is now 0.6s, with Fassq at 57% and Frassq at 18%.

Next on the profiling chart is oblookup, but it’s only at 4% so I’m going to ignore OBARRAY_SIZE for now.  However, OBARRAY_SIZE could affect the order of atoms in processing, which could drastically rearrange the ordering of the data structures in dumped.elc.

I think the next step is to look at replacing read_objects, probably with a pair of hash tables, but it’s getting a bit late for trying that tonight.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-27  8:51                                                 ` Ken Raeburn
@ 2016-10-30 14:43                                                   ` Ken Raeburn
  2016-10-30 15:31                                                     ` Simon Leinen
                                                                       ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-10-30 14:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

I wrote:
> Patched substitute_object_recurse after the check for the subtree matching the placeholder, so that if the subtree passed was a symbol or number, it would simply be returned without consulting seen_list or read_objects.  Run time is now 0.7s; Fassq is a bit over 50% of that, and Frassq about 17%, and _IO_getc around 11%.  I think it should be safe to short-circuit it for some other types as well.
> 
> I had my getc_unlocked change sitting around so I pulled that in.  Run time is now 0.6s, with Fassq at 57% and Frassq at 18%.
> 
> Next on the profiling chart is oblookup, but it’s only at 4% so I’m going to ignore OBARRAY_SIZE for now.  However, OBARRAY_SIZE could affect the order of atoms in processing, which could drastically rearrange the ordering of the data structures in dumped.elc.
> 
> I think the next step is to look at replacing read_objects, probably with a pair of hash tables, but it’s getting a bit late for trying that tonight.

I switched over to a pair of hash tables and the run time is just under 0.2s on my test machine now.  Profiling reports are now topped by read1, readchar, and readbyte_from_file (now including the expanded getc_unlocked calls), accounting for about 30% of the CPU time between them.  The hash functions and substitute_object_recurse are not taking a significant amount of time.
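
For anyone who wants the general idea without reading the branch, here is a stand-alone sketch of a two-index table.  It is not the code I pushed: `struct read_table` and its functions are hypothetical names, the bucket count is arbitrary, and plain C pointers stand in for Lisp objects.  Each entry is chained into one table keyed by the "#N=" number and another keyed by the object, so both directions of lookup avoid a linear scan:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { NBUCKETS = 64 };          /* Arbitrary size for the sketch.  */

struct entry
{
  long number;                   /* The N in "#N=".  */
  const void *object;            /* The object read for that label.  */
  struct entry *next_by_number;
  struct entry *next_by_object;
};

struct read_table
{
  struct entry *by_number[NBUCKETS];
  struct entry *by_object[NBUCKETS];
};

static size_t
hash_number (long n)
{
  return (size_t) n % NBUCKETS;
}

static size_t
hash_object (const void *p)
{
  /* Drop the low bits, which alignment usually leaves at zero.  */
  return (size_t) ((uintptr_t) p >> 3) % NBUCKETS;
}

/* Record that label NUMBER denotes OBJECT.
   Return 0 on success, -1 if memory is exhausted.  */
static int
read_table_put (struct read_table *t, long number, const void *object)
{
  struct entry *e = malloc (sizeof *e);
  if (!e)
    return -1;
  size_t hn = hash_number (number), ho = hash_object (object);
  e->number = number;
  e->object = object;
  e->next_by_number = t->by_number[hn];
  t->by_number[hn] = e;
  e->next_by_object = t->by_object[ho];
  t->by_object[ho] = e;
  return 0;
}

/* Label -> object (the old assq direction); NULL if absent.  */
static const void *
read_table_object (const struct read_table *t, long number)
{
  const struct entry *e = t->by_number[hash_number (number)];
  for (; e; e = e->next_by_number)
    if (e->number == number)
      return e->object;
  return NULL;
}

/* Object -> "is it labelled?" (the old rassq direction).  */
static int
read_table_has_object (const struct read_table *t, const void *object)
{
  const struct entry *e = t->by_object[hash_object (object)];
  for (; e; e = e->next_by_object)
    if (e->object == object)
      return 1;
  return 0;
}

/* Free all entries; each entry sits in exactly one by_number chain.  */
static void
read_table_clear (struct read_table *t)
{
  for (size_t i = 0; i < NBUCKETS; i++)
    {
      struct entry *e = t->by_number[i];
      while (e)
        {
          struct entry *next = e->next_by_number;
          free (e);
          e = next;
        }
      t->by_number[i] = NULL;
      t->by_object[i] = NULL;
    }
}
```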

I took a look at the types of shared data in one of the generated dumped.elc files I got; almost 2700 were strings (all without text properties), almost 1900 were cons cells, and the rest numbered under 300.  So I’m not sure special-casing other types besides Lisp_Cons in read1 will gain us much.

It took me a while to sort through the lookups being done during and after parsing of an object and how the checks for circular objects work, but I think I’ve got it working.  I’ve pushed a scratch branch over with the changes if you’d like to try them, though I think I botched the git push syntax when trying to create “scratch/raeburn/startup” somehow, so I created “scratch/raeburn-startup”… or possibly I’ve created both?  I saw an email notification go out for both, but I only see the latter in the repository browser interface… 

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-30 14:43                                                   ` Ken Raeburn
@ 2016-10-30 15:31                                                     ` Simon Leinen
  2016-10-30 16:52                                                     ` Daniel Colascione
  2016-10-31 14:27                                                     ` Stefan Monnier
  2 siblings, 0 replies; 375+ messages in thread
From: Simon Leinen @ 2016-10-30 15:31 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Eli Zaretskii, Stefan Monnier, Emacs developers

On Sun, Oct 30, 2016 at 3:43 PM, Ken Raeburn <raeburn@raeburn.org> wrote:
> I switched over to a pair of hash tables and the run time is just under 0.2s on my test machine now.  Profiling reports are now topped by read1, readchar, and readbyte_from_file (now including the expanded getc_unlocked calls), accounting for about 30% of the CPU time between them.  The hash functions and substitute_object_recurse are not taking a significant amount of time. [...]

Promising!

Years ago I spent some time optimizing the MIB-reading code in
UCD/Net-SNMP, and found that the biggest win was to treat the input
file as one big buffer (I actually mmap()ped it) and then avoid most
of the memory allocation overhead of token creation by using start/end
pointers directly into that buffer.  I never upstreamed that code, and
I'm not sure the representation would have been acceptable to the
other developers.  But it sure was fast.  Maybe an approach like that
would be suitable for .elc loading.
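
A minimal sketch of the start/end-pointer idea follows.  It is not the Net-SNMP code, which I no longer have at hand; `struct token`, `next_token`, and `token_equals` are names invented for this example.  The buffer could come from mmap() or from one big read(); the scanner itself never allocates or copies:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A token is just a slice of the caller's buffer.  */
struct token
{
  const char *start;
  size_t len;
};

/* Scan the next whitespace-delimited token in [*CURSOR, END).
   Return 1 and advance *CURSOR past it, or 0 at end of input.  */
static int
next_token (const char **cursor, const char *end, struct token *tok)
{
  const char *p = *cursor;

  assert (p != NULL && p <= end);
  while (p < end && isspace ((unsigned char) *p))
    p++;
  if (p == end)
    {
      *cursor = p;
      return 0;
    }
  tok->start = p;
  while (p < end && !isspace ((unsigned char) *p))
    p++;
  tok->len = (size_t) (p - tok->start);
  *cursor = p;
  return 1;
}

/* Compare a token with a NUL-terminated string without copying it.  */
static int
token_equals (const struct token *tok, const char *s)
{
  return strlen (s) == tok->len && memcmp (tok->start, s, tok->len) == 0;
}
```

The trade-off is that tokens are only valid while the buffer stays mapped, which is part of why the representation might not suit every code base.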
-- 
Simon.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-30 14:43                                                   ` Ken Raeburn
  2016-10-30 15:31                                                     ` Simon Leinen
@ 2016-10-30 16:52                                                     ` Daniel Colascione
  2016-10-31 14:27                                                     ` Stefan Monnier
  2 siblings, 0 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-30 16:52 UTC (permalink / raw)
  To: Ken Raeburn, Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

On 10/30/2016 07:43 AM, Ken Raeburn wrote:
>
> I wrote:
>> Patched substitute_object_recurse after the check for the subtree matching the placeholder, so that if the subtree passed was a symbol or number, it would simply be returned without consulting seen_list or read_objects.  Run time is now 0.7s; Fassq is a bit over 50% of that, and Frassq about 17%, and _IO_getc around 11%.  I think it should be safe to short-circuit it for some other types as well.
>>
>> I had my getc_unlocked change sitting around so I pulled that in.  Run time is now 0.6s, with Fassq at 57% and Frassq at 18%.
>>
>> Next on the profiling chart is oblookup, but it’s only at 4% so I’m going to ignore OBARRAY_SIZE for now.  However, OBARRAY_SIZE could affect the order of atoms in processing, which could drastically rearrange the ordering of the data structures in dumped.elc.
>>
>> I think the next step is to look at replacing read_objects, probably with a pair of hash tables, but it’s getting a bit late for trying that tonight.
>
> I switched over to a pair of hash tables and the run time is just under 0.2s on my test machine now.  Profiling reports are now topped by read1, readchar, and readbyte_from_file (now including the expanded getc_unlocked calls), accounting for about 30% of the CPU time between them.  The hash functions and substitute_object_recurse are not taking a significant amount of time.
>
> I took a look at the types of shared data in one of the generated dumped.elc files I got; almost 2700 were strings (all without text properties), almost 1900 were cons cells, and the rest numbered under 300.  So I’m not sure special-casing other types besides Lisp_Cons in read1 will gain us much.
>
> It took me a while to sort through the lookups being done during and after parsing of an object and how the checks for circular objects work, but I think I’ve got it working.  I’ve pushed a scratch branch over with the changes if you’d like to try them, though I think I botched the git push syntax when trying to create “scratch/raeburn/startup” somehow, so I created “scratch/raeburn-startup”… or possibly I’ve created both?  I saw an email notification go out for both, but I only see the latter in the repository browser interface…
>
> Ken

Awesome! Even if we go for something besides a big elc file for startup, 
these improvements will help.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-30 14:43                                                   ` Ken Raeburn
  2016-10-30 15:31                                                     ` Simon Leinen
  2016-10-30 16:52                                                     ` Daniel Colascione
@ 2016-10-31 14:27                                                     ` Stefan Monnier
  2016-11-02  7:36                                                       ` Ken Raeburn
  2 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-31 14:27 UTC (permalink / raw)
  To: emacs-devel

> I switched over to a pair of hash tables and the run time is just under 0.2s
> on my test machine now.  Profiling reports are now topped by read1,
> readchar, and readbyte_from_file (now including the expanded getc_unlocked
> calls), accounting for about 30% of the CPU time between them.  The hash
> functions and substitute_object_recurse are not taking a significant amount
> of time.

BTW, I don't know if you've tried to make that dumped file work
correctly, but in case you haven't, here's my latest attempt.

It mostly works, tho there are still issues, such as
global-font-lock-mode still failing to be properly enabled.


        Stefan


diff --git a/lisp/emacs-lisp/macroexp.el b/lisp/emacs-lisp/macroexp.el
index 310ca29..9ca53eb 100644
--- a/lisp/emacs-lisp/macroexp.el
+++ b/lisp/emacs-lisp/macroexp.el
@@ -439,7 +439,8 @@ macroexp--const-symbol-p
   (or (memq symbol '(nil t))
       (keywordp symbol)
       (if any-value
-	  (or (memq symbol byte-compile-const-variables)
+	  (or (and (boundp 'byte-compile-const-variables)
+                   (memq symbol byte-compile-const-variables))
 	      ;; FIXME: We should provide a less intrusive way to find out
 	      ;; if a variable is "constant".
 	      (and (boundp symbol)
diff --git a/lisp/international/mule.el b/lisp/international/mule.el
index 21ab7e1..bb4808b 100644
--- a/lisp/international/mule.el
+++ b/lisp/international/mule.el
@@ -290,7 +290,7 @@ define-charset
 		      elt))
 		  props))
     (setcdr (assq :plist attrs) props)
-
+    (put name 'internal--charset-args (mapcar #'cdr attrs))
     (apply 'define-charset-internal name (mapcar 'cdr attrs))))
 
 
@@ -911,6 +911,8 @@ define-coding-system
 	  (cons :name (cons name (cons :docstring (cons (purecopy docstring)
 							props)))))
     (setcdr (assq :plist common-attrs) props)
+    (put name 'internal--cs-args
+         (mapcar #'cdr (append common-attrs spec-attrs)))
     (apply 'define-coding-system-internal
 	   name (mapcar 'cdr (append common-attrs spec-attrs)))))
 
diff --git a/lisp/loadup.el b/lisp/loadup.el
index 21c64a8..5967334 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -1,4 +1,4 @@
-;;; loadup.el --- load up standardly loaded Lisp files for Emacs
+;;; loadup.el --- load up standardly loaded Lisp files for Emacs  -*- lexical-binding:t -*-
 
 ;; Copyright (C) 1985-1986, 1992, 1994, 2001-2016 Free Software
 ;; Foundation, Inc.
@@ -461,6 +461,150 @@
 						invocation-directory)
 			      (expand-file-name name invocation-directory)
 			      t)))
+      (message "Dumping into dumped.elc...preparing...")
+
+      ;; Dump the current state into a file so we can reload it!
+      (message "Dumping into dumped.elc...generating...")
+      (let ((faces '())
+            (coding-systems '()) (coding-system-aliases '())
+            (charsets '()) (charset-aliases '())
+            (cmds '()))
+        (setcdr global-buffers-menu-map nil) ;; Get rid of buffer objects!
+        (mapatoms
+         (lambda (s)
+           (when (fboundp s)
+             (if (subrp (symbol-function s))
+                 ;; subr objects aren't readable!
+                 (unless (equal (symbol-name s) (subr-name (symbol-function s)))
+                   (push `(fset ',s (symbol-function ',(intern (subr-name (symbol-function s))))) cmds))
+               (if (memq s '(rename-buffer))
+                   ;; FIXME: We need these, but they contain
+                   ;; unprintable objects.
+                   nil
+                 (push `(fset ',s ,(macroexp-quote (symbol-function s)))
+                       cmds))))
+           (when (and (boundp s)
+                      (not (macroexp--const-symbol-p s 'any-value))
+                      ;; I think we don't need/want these!
+                      (not (memq s '(terminal-frame obarray
+                                     initial-window-system window-system
+                                     ;; custom-delayed-init-variables
+                                     exec-path
+                                     process-environment
+                                     command-line-args noninteractive))))
+             ;; FIXME: Handle varaliases!
+             (let ((v (symbol-value s)))
+               (push `(set-default
+                       ',s
+                       ,(cond
+                         ;; FIXME: (Correct) hack to avoid
+                         ;; unprintable objects.
+                         ((eq s 'undo-auto--undoably-changed-buffers) nil)
+                         ;; FIXME: Incorrect hack to avoid
+                         ;; unprintable objects.
+                         ((eq s 'advertised-signature-table)
+                          (make-hash-table :test 'eq :weakness 'key))
+                         ((subrp v)
+                          `(symbol-function ',(intern (subr-name v))))
+                         ((and (markerp v) (null (marker-buffer v)))
+                          '(make-marker))
+                         ((and (overlayp v) (null (overlay-buffer v)))
+                          '(let ((ol (make-overlay (point-min) (point-min))))
+                             (delete-overlay ol)
+                             ol))
+                         (v (macroexp-quote v))))
+                     cmds)
+               (push `(defvar ,s) cmds)))
+           (when (symbol-plist s)
+             (push `(setplist ',s ',(symbol-plist s)) cmds))
+           (when (get s 'face-defface-spec)
+             (push s faces))
+           (if (get s 'internal--cs-args)
+               (push s coding-systems))
+           (when (and (coding-system-p s)
+                      (not (eq s (car (coding-system-aliases s)))))
+             (push (cons s (car (coding-system-aliases s)))
+                   coding-system-aliases))
+           (if (get s 'internal--charset-args)
+               (push s charsets)
+             (when (and (charsetp s)
+                        (not (eq s (get-charset-property s :name))))
+               (push (cons s (get-charset-property s :name))
+                     charset-aliases))))
+         obarray)
+        (message "Dumping into dumped.elc...printing...")
+        (with-current-buffer (generate-new-buffer "dumped.elc")
+          (insert ";ELC\^W\^@\^@\^@\n;;; Compiled\n;;; in Emacs version "
+                  emacs-version "\n")
+          (let ((print-circle t)
+                (print-gensym t)
+                (print-quoted t)
+                (print-level nil)
+                (print-length nil)
+                (print-escape-newlines t)
+                (standard-output (current-buffer)))
+            (print `(progn . ,cmds))
+            (terpri)
+            (print `(let ((css ',charsets))
+                      (dotimes (i 3)
+                        (dolist (cs (prog1 css (setq css nil)))
+                          ;; (message "Defining charset %S..." cs)
+                          (condition-case nil
+                              (progn
+                                (apply #'define-charset-internal
+                                       cs (get cs 'internal--charset-args))
+                                ;; (message "Defining charset %S...done" cs)
+                                )
+                            (error
+                             ;; (message "Defining charset %S...postponed"
+                             ;;          cs)
+                             (push cs css)))))))
+            (terpri)
+            (print `(dolist (cs ',charset-aliases)
+                      (define-charset-alias (car cs) (cdr cs))))
+            (terpri)
+            (print `(let ((css ',coding-systems))
+                      (dotimes (i 3)
+                        (dolist (cs (prog1 css (setq css nil)))
+                          ;; (message "Defining coding-system %S..." cs)
+                          (condition-case nil
+                              (progn
+                                (apply #'define-coding-system-internal
+                                       cs (get cs 'internal--cs-args))
+                                ;; (message "Defining coding-system %S...done" cs)
+                                )
+                            (error
+                             ;; (message "Defining coding-system %S...postponed"
+                             ;;          cs)
+                             (push cs css)))))))
+            (print `(dolist (f ',faces)
+                      (face-spec-set f (get f 'face-defface-spec)
+                                     'face-defface-spec)))
+            (terpri)
+            (print `(dolist (cs ',coding-system-aliases)
+                      (define-coding-system-alias (car cs) (cdr cs))))
+            (terpri)
+            (print `(progn
+                      ;; (message "Done preloading!")
+                      ;; (message "custom-delayed-init-variables = %S"
+                      ;;          custom-delayed-init-variables)
+                      ;; (message "Running top-level = %S" top-level)
+                      (setq debug-on-error t)
+                      (use-global-map global-map)
+                      (eval top-level)
+                      ;; (message "top-level done!?")
+                      ))
+            (terpri))
+          (goto-char (point-min))
+          (while (re-search-forward " (\\(defvar\\|setplist\\|fset\\) " nil t)
+            (goto-char (match-beginning 0))
+            (delete-char 1) (insert "\n"))
+          (message "Dumping into dumped.elc...saving...")
+          (let ((coding-system-for-write 'emacs-internal))
+            (write-region (point-min) (point-max) (buffer-name)))
+          (message "Dumping into dumped.elc...done")
+          ))
+
       (kill-emacs)))
 
 ;; For machines with CANNOT_DUMP defined in config.h,
diff --git a/src/coding.c b/src/coding.c
index 9f709be..a677758 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -10326,8 +10326,9 @@ usage: (define-coding-system-internal ...)  */)
       CHECK_NUMBER_CAR (reg_usage);
       CHECK_NUMBER_CDR (reg_usage);
 
-      request = Fcopy_sequence (args[coding_arg_iso2022_request]);
-      for (tail = request; CONSP (tail); tail = XCDR (tail))
+      request = Qnil;
+      for (tail = args[coding_arg_iso2022_request];
+            CONSP (tail); tail = XCDR (tail))
 	{
 	  int id;
 	  Lisp_Object tmp1;
@@ -10339,7 +10340,8 @@ usage: (define-coding-system-internal ...)  */)
 	  CHECK_NATNUM_CDR (val);
 	  if (XINT (XCDR (val)) >= 4)
 	    error ("Invalid graphic register number: %"pI"d", XINT (XCDR (val)));
-	  XSETCAR (val, make_number (id));
+	  request = Fcons (Fcons (make_number (id), XCDR (val)),
+                           request);
 	}
 
       flags = args[coding_arg_iso2022_flags];
diff --git a/src/emacs.c b/src/emacs.c
index 2480dfc..bdf3742 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1593,9 +1593,9 @@ Using an Emacs configured with --with-x-toolkit=lucid does not have this problem
 #endif
 	  Vtop_level = list2 (Qload, build_unibyte_string (file));
 	}
-      /* Unless next switch is -nl, load "loadup.el" first thing.  */
-      if (! no_loadup)
-	Vtop_level = list2 (Qload, build_string ("loadup.el"));
+      else if (! no_loadup)
+        /* Unless next switch is -nl, load "dumped.elc" first thing.  */
+	Vtop_level = list2 (Qload, build_string ("../src/dumped.elc"));
     }
 
   /* Set up for profiling.  This is known to work on FreeBSD,




^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-31 14:27                                                     ` Stefan Monnier
@ 2016-11-02  7:36                                                       ` Ken Raeburn
  2016-11-02 12:17                                                         ` Stefan Monnier
  2016-11-02 12:22                                                         ` Stefan Monnier
  0 siblings, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-11-02  7:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Oct 31, 2016, at 10:27, Stefan Monnier <monnier@iro.umontreal.ca> wrote:


>> I switched over to a pair of hash tables and the run time is just under 0.2s
>> on my test machine now.  Profiling reports are now topped by read1,
>> readchar, and readbyte_from_file (now including the expanded getc_unlocked
>> calls), accounting for about 30% of the CPU time between them.  The hash
>> functions and substitute_object_recurse are not taking a significant amount
>> of time.
> 
> BTW, I don't know if you've tried to make that dumped file work
> correctly, but in case you haven't, here's my latest attempt.

Thanks!  Looks like you’ve refined the handling of faces and other attributes.  Have you tried it out in batch mode?  I’m getting a crash in realize_face with a null cache pointer.

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-11-02  7:36                                                       ` Ken Raeburn
@ 2016-11-02 12:17                                                         ` Stefan Monnier
  2016-11-02 12:22                                                         ` Stefan Monnier
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-11-02 12:17 UTC (permalink / raw)
  To: emacs-devel

> Thanks!  Looks like you’ve refined the handling of faces and other
> attributes.  Have you tried it out in batch mode?

No, haven't gotten that far.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-11-02  7:36                                                       ` Ken Raeburn
  2016-11-02 12:17                                                         ` Stefan Monnier
@ 2016-11-02 12:22                                                         ` Stefan Monnier
  2016-11-03  5:37                                                           ` Ken Raeburn
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-11-02 12:22 UTC (permalink / raw)
  To: emacs-devel

BTW, it might be worth comparing the behavior with the one we get with
the "normal" temacs (i.e. by loading loadup.el instead of dumped.elc),
as well as with what we get with CANNOT_DUMP: from what I remember the
current code doesn't handle CANNOT_DUMP 100% correctly (which is OK so
far because CANNOT_DUMP is only ever used temporarily during porting
until unexec is working).

IOW some of the problems we may encounter could be unrelated to what we
do w.r.t. dumped.elc.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-11-02 12:22                                                         ` Stefan Monnier
@ 2016-11-03  5:37                                                           ` Ken Raeburn
  2016-12-11 13:34                                                             ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-11-03  5:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Nov 2, 2016, at 08:22, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> BTW, it might be worth comparing the behavior with the one we get with
> the "normal" temacs (i.e. by loading loadup.el instead of dumped.elc),
> as well as with what we get with CANNOT_DUMP: from what I remember the
> current code doesn't handle CANNOT_DUMP 100% correctly (which is OK so
> far because CANNOT_DUMP is only ever used temporarily during porting
> until unexec is working).

…which is why it seems like I have to keep fixing bugs every time I try to use it.

If CANNOT_DUMP mode worked reliably, I’d think it would be the logical starting point for this work — it’s already compiling with the expectation that we’ll be loading the Lisp code when the user starts it up rather than preparing for unexec and a second invocation of main().  Changing the thing we load at startup from loadup.el to dumped.elc should be pretty minor.

I’m trying CANNOT_DUMP out right now.  As soon as I’ve got it bootstrapping again, I’ll try pulling in the other changes.

> IOW some of the problems we may encounter could be unrelated to what we
> do w.r.t. dumped.elc.

True.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-11-03  5:37                                                           ` Ken Raeburn
@ 2016-12-11 13:34                                                             ` Ken Raeburn
  2016-12-11 15:42                                                               ` Eli Zaretskii
                                                                                 ` (4 more replies)
  0 siblings, 5 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-11 13:34 UTC (permalink / raw)
  To: Emacs developers

I’ve pushed an update to the scratch/raeburn-startup branch.  It includes several updates:

* Stefan’s Oct 31 patch instead of his earlier one.  This does more reinitializing of charsets, coding systems, etc., which I believe were absent from the previous version.

* More patches to the recursive object substitution pass done during reading.  The big costs on Mac OS X seem to differ from my Linux/GNU/X11 build — there’s a much larger dumped.elc file, and an entirely different compiler — but I’ve managed to trim the run time there a bit.

* Changed gc-cons-threshold to be much larger.  By itself, this isn’t a good change.  But we’d exceed the old value many times over just reading the big “progn” form; this way my Linux/GNU/X11 run doesn’t trigger GC during startup, though I think the Mac version still does.  I think a better strategy might try to defer or discourage GC during startup, and do it instead when we have idle cycles while the user isn’t trying to get something done.  But revamping the GC strategy is a different discussion.

* Larger obarray.  After startup, my Linux/GNU/X11 build has over 15k symbols, and my Mac build has over 21k.  The old obarray size of 1511 meant average chain lengths of over 10 and 14.  Shorter chains mean less time spent in oblookup.  And extra slots are cheap.

* Open-code reading ASCII symbol characters from a file in read1().  The hot path involved examining readcharfun to determine its type, compare it against some known symbols, select a function to call, have that function check to see if we’re doing pushback instead of actually reading, block input, do the actual getc() call, and unblock input — all for each character.  The new version duplicates a bunch of code, but once it sees we’re reading from a file, skips most of that for the common path through the inner loop.  This cut maybe 10% off of some of my run times.
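
The idea in the last item can be sketched in C roughly as follows.  This is a toy illustration only, not the code on the branch: `struct reader`, `generic_readchar`, `read_symbol_generic` and `read_symbol_fast` are hypothetical names, the notion of a symbol character is simplified to ASCII letters, digits, `-` and `_`, and plain `getc` stands in for the `getc_unlocked` call and the input blocking mentioned above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for readcharfun: a source tag plus its state.  */
enum source_kind { SOURCE_FILE, SOURCE_STRING };

struct reader
{
  enum source_kind kind;
  FILE *fp;          /* used when kind == SOURCE_FILE */
  const char *str;   /* used when kind == SOURCE_STRING */
  int pushback;      /* negative when nothing has been pushed back */
};

/* Simplified test for an ASCII symbol constituent.  */
int
symbol_char_p (int c)
{
  return c != EOF && c < 128 && (isalnum (c) || c == '-' || c == '_');
}

/* Generic path: every character pays for the pushback check and for
   the dispatch on the kind of source.  */
int
generic_readchar (struct reader *r)
{
  if (r->pushback >= 0)
    {
      int c = r->pushback;
      r->pushback = -1;
      return c;
    }
  switch (r->kind)
    {
    case SOURCE_FILE:
      return getc (r->fp);
    case SOURCE_STRING:
      return *r->str ? (unsigned char) *r->str++ : EOF;
    }
  return EOF;
}

/* Read one symbol name through the generic per-character path.  */
size_t
read_symbol_generic (struct reader *r, char *buf, size_t cap)
{
  size_t n = 0;
  int c;
  assert (cap > 0);
  while (symbol_char_p (c = generic_readchar (r)) && n + 1 < cap)
    buf[n++] = (char) c;
  r->pushback = c;   /* remember the character that ended the symbol */
  buf[n] = '\0';
  return n;
}

/* Fast path: decide once that we are reading from a file with no
   pending pushback, then loop on getc directly.  */
size_t
read_symbol_fast (struct reader *r, char *buf, size_t cap)
{
  size_t n = 0;
  int c;
  assert (cap > 0);
  if (r->kind != SOURCE_FILE || r->pushback >= 0)
    return read_symbol_generic (r, buf, cap);
  while (symbol_char_p (c = getc (r->fp)) && n + 1 < cap)
    buf[n++] = (char) c;
  r->pushback = c;   /* remember the character that ended the symbol */
  buf[n] = '\0';
  return n;
}
```

Both functions return the same result for the same input; the fast one checks the source type and the pushback state once per symbol rather than once per character, at the cost of duplicating the inner loop.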

With all these changes — Stefan’s new patch with additional initialization, and my updates to shave a little more time off — I’m still hitting just under 0.2s for:

  time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'

on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel Core 2 Duo, 2.8GHz) takes over half a second (including at least one GC invocation).

It can be tested by running “temacs” after building it.  The lisp load path will be set based on the source tree, not the installation prefix.  If “-nl” and “-l” arguments are not given, it’ll load “../src/dumped.elc”, but that’s interpreted relative to the lisp *source* directory.  If you build in a directory other than the checked-out tree (i.e., $srcdir is not “.”) as I do, you’ll need to copy dumped.elc from the src directory of the build tree where it’s generated to the src directory of the source tree where it’s sought.

If dumped.elc isn’t found, temacs will exit with status 42.  Under Stefan’s version, an X11 run would spit out a message saying the file wasn’t found and exit, but a tty run would get into a loop complaining about internal-echo-keystrokes-prefix and would need to be killed from another terminal.  This way, it only kind of sucks, equally in both cases. :-)

The remaining time still seems to be about 2/3 reading and parsing bytes, allocating objects, and updating (mostly scanning) the obarray.  There should be a bit more time that can be squeezed out.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 13:34                                                             ` Ken Raeburn
@ 2016-12-11 15:42                                                               ` Eli Zaretskii
  2016-12-24 11:06                                                                 ` Eli Zaretskii
  2016-12-11 19:18                                                               ` Richard Stallman
                                                                                 ` (3 subsequent siblings)
  4 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-11 15:42 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 11 Dec 2016 08:34:01 -0500
> 
> I’ve pushed an update to the scratch/raeburn-startup branch.  It includes several updates:

Thank you for your work, I will definitely try to check it out soon.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 15:42                                                               ` Eli Zaretskii
@ 2016-12-24 11:06                                                                 ` Eli Zaretskii
  2016-12-25 15:46                                                                   ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-24 11:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, emacs-devel

> Date: Sun, 11 Dec 2016 17:42:21 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> > From: Ken Raeburn <raeburn@raeburn.org>
> > Date: Sun, 11 Dec 2016 08:34:01 -0500
> > 
> > I’ve pushed an update to the scratch/raeburn-startup branch.  It includes several updates:
> 
> Thank you for your work, I will definitely try to check it out soon.

I took a quick look.  There are a few issues with the Windows build
there, but I have one question to which I'd like to know the answer
first: why are we still dumping temacs to emacs, instead of loading
dumped.elc into a bare emacs?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-24 11:06                                                                 ` Eli Zaretskii
@ 2016-12-25 15:46                                                                   ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-12-25 15:46 UTC (permalink / raw)
  To: emacs-devel

> I took a quick look.  There are a few issues with the Windows build
> there, but I have one question to which I'd like to know the answer
> first: why are we still dumping temacs to emacs, instead of loading
> dumped.elc into a bare emacs?

I don't know if Ken has another reason, but in my case it's simply
because I haven't bothered to change the code so as not to call the
unexec dump.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 13:34                                                             ` Ken Raeburn
  2016-12-11 15:42                                                               ` Eli Zaretskii
@ 2016-12-11 19:18                                                               ` Richard Stallman
  2016-12-15 12:57                                                                 ` Ken Raeburn
  2016-12-11 19:18                                                               ` Richard Stallman
                                                                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-12-11 19:18 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > * Changed gc-cons-threshold to be much larger.

How about binding it to a higher value for loadup?

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 19:18                                                               ` Richard Stallman
@ 2016-12-15 12:57                                                                 ` Ken Raeburn
  2016-12-15 16:04                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-12-15 12:57 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

> On Dec 11, 2016, at 14:18, Richard Stallman <rms@gnu.org> wrote:
> 
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
> 
>> * Changed gc-cons-threshold to be much larger.
> 
> How about binding it to a higher value for loadup?

That may be good enough.  But GC will probably kick in right after we set it back, so most methods we might try for measuring startup time will probably incur the cost of at least one GC pass, and it’ll happen when the user starts Emacs in real life.  I guess one question is how much it matters.  It’s only a fraction of a second, but I’m trying to shave a startup time of 0.2s (or 0.6s on Mac OS X) down closer to 0.1s, so a fraction of a second can make a difference.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 12:57                                                                 ` Ken Raeburn
@ 2016-12-15 16:04                                                                   ` Eli Zaretskii
  2016-12-15 16:26                                                                     ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-15 16:04 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: rms, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Thu, 15 Dec 2016 07:57:09 -0500
> Cc: emacs-devel@gnu.org
> 
> > How about binding it to a higher value for loadup?
> 
> That may be good enough.  But GC will probably kick in right after we set it back

AFAIK, just setting the GC threshold doesn't automatically invoke GC,
you need to do something that calls maybe_gc.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 16:04                                                                   ` Eli Zaretskii
@ 2016-12-15 16:26                                                                     ` Ken Raeburn
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-15 16:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rms, emacs-devel


> On Dec 15, 2016, at 11:04, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Thu, 15 Dec 2016 07:57:09 -0500
>> Cc: emacs-devel@gnu.org
>> 
>>> How about binding it to a higher value for loadup?
>> 
>> That may be good enough.  But GC will probably kick in right after we set it back
> 
> AFAIK, just setting the GC threshold doesn't automatically invoke GC,
> you need to do something that calls maybe_gc.

Right, but if we’re not following it up with evaluating another form from dumped.elc (eval_sub can invoke GC) or invoking some compiled routine (branch operations can invoke GC), then we’re probably ready to check for availability of user input (which can invoke GC).
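
The interaction being discussed can be modelled with a small C sketch.  It is a toy model under stated assumptions, not Emacs's allocator: `struct toy_gc` and the `toy_*` functions are made-up names, and the only behaviour taken from the thread is that changing the threshold does not collect by itself, while the next check at a safe point (the role `maybe_gc` plays) does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy collector state; all names here are hypothetical.  */
struct toy_gc
{
  size_t threshold;        /* plays the role of gc-cons-threshold */
  size_t consed_since_gc;  /* bytes allocated since the last collection */
  unsigned collections;    /* number of collections run so far */
};

/* Allocation only updates the counter; it never collects.  */
void
toy_allocate (struct toy_gc *gc, size_t bytes)
{
  assert (gc != NULL);
  gc->consed_since_gc += bytes;
}

/* Changing the threshold is just a store; no collection happens here.  */
void
toy_set_threshold (struct toy_gc *gc, size_t threshold)
{
  assert (gc != NULL);
  gc->threshold = threshold;
}

/* Safe-point check: collect only if enough has been allocated since
   the last collection.  Returns 1 if a collection ran, else 0.  */
int
toy_maybe_gc (struct toy_gc *gc)
{
  assert (gc != NULL);
  if (gc->consed_since_gc <= gc->threshold)
    return 0;
  gc->collections++;
  gc->consed_since_gc = 0;
  return 1;
}
```

In this model, allocating a few megabytes under a very high threshold and then restoring a low one leaves `collections` unchanged until the next call to `toy_maybe_gc`, which then collects at once.  That is the sense in which the restored threshold takes effect "right after": at the first safe point, not at the assignment.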


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 13:34                                                             ` Ken Raeburn
  2016-12-11 15:42                                                               ` Eli Zaretskii
  2016-12-11 19:18                                                               ` Richard Stallman
@ 2016-12-11 19:18                                                               ` Richard Stallman
  2016-12-12 17:25                                                                 ` Ken Raeburn
  2016-12-13 15:21                                                               ` Ken Brown
  2016-12-24 13:37                                                               ` Eli Zaretskii
  4 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-12-11 19:18 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > * Larger obarray.  After startup, my Linux/GNU/X11 build has over
  > * 15k symbols, and my Mac build has over 21k.  The old obarray
  > * size of 1511 meant average chain lengths of over 10 and 14.
  > * Shorter chains mean less time spent in oblookup.  And extra
  > * slots are cheap.

This may be a good idea, but it has nothing to do with any particular
method of startup or dumping.  So how about doing it unconditionally?

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 19:18                                                               ` Richard Stallman
@ 2016-12-12 17:25                                                                 ` Ken Raeburn
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-12 17:25 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel


> On Dec 11, 2016, at 14:18, Richard Stallman <rms@gnu.org> wrote:
> 
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
> 
>> * Larger obarray.  After startup, my Linux/GNU/X11 build has over
>> * 15k symbols, and my Mac build has over 21k.  The old obarray
>> * size of 1511 meant average chain lengths of over 10 and 14.
>> * Shorter chains mean less time spent in oblookup.  And extra
>> * slots are cheap.
> 
> This may be a good idea, but it has nothing to do with any particular
> method of startup or dumping.  So how about doing it unconditionally?

A few of the changes on this branch would probably improve speed at least a tiny bit regardless of the startup method.  This one also has the advantage of being a trivial change with, as far as I can see, no down side, so, yeah….
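
For reference, the chain-length figures quoted above are just the ratio of symbols to buckets.  A minimal C sketch of that arithmetic, with `average_chain_length` as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Average number of symbols per bucket in a chained hash table,
   assuming every symbol lands in exactly one bucket.  */
double
average_chain_length (size_t symbols, size_t buckets)
{
  assert (buckets > 0);
  return (double) symbols / (double) buckets;
}
```

With 1511 buckets, 15110 symbols give an average of exactly 10 and 21154 symbols give exactly 14, which matches the "over 10 and 14" figures for builds with slightly more symbols than that.  A table ten times as large (a size picked purely for illustration; the new size is not stated here) would bring the averages down to roughly 1 and 1.4.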

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-11 13:34                                                             ` Ken Raeburn
                                                                                 ` (2 preceding siblings ...)
  2016-12-11 19:18                                                               ` Richard Stallman
@ 2016-12-13 15:21                                                               ` Ken Brown
  2016-12-14  5:30                                                                 ` Ken Raeburn
  2016-12-24 13:37                                                               ` Eli Zaretskii
  4 siblings, 1 reply; 375+ messages in thread
From: Ken Brown @ 2016-12-13 15:21 UTC (permalink / raw)
  To: Ken Raeburn, Emacs developers

On 12/11/2016 8:34 AM, Ken Raeburn wrote:
> I’ve pushed an update to the scratch/raeburn-startup branch.

Did you actually push these changes?  The last commit I see at 
http://git.savannah.gnu.org/cgit/emacs.git/log/?h=scratch/raeburn-startup 
is dated 2016-10-30.

Ken






^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-13 15:21                                                               ` Ken Brown
@ 2016-12-14  5:30                                                                 ` Ken Raeburn
  2016-12-14  5:45                                                                   ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-12-14  5:30 UTC (permalink / raw)
  To: Ken Brown; +Cc: Emacs developers


> On Dec 13, 2016, at 10:21, Ken Brown <kbrown@cornell.edu> wrote:
> 
> On 12/11/2016 8:34 AM, Ken Raeburn wrote:
>> I’ve pushed an update to the scratch/raeburn-startup branch.
> 
> Did you actually push these changes?  The last commit I see at http://git.savannah.gnu.org/cgit/emacs.git/log/?h=scratch/raeburn-startup is dated 2016-10-30.
> 
> Ken

Strange, I’m not sure what happened.  I’ll push it again.  Just as well, I’ve got a couple minor updates to add anyway.

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-14  5:30                                                                 ` Ken Raeburn
@ 2016-12-14  5:45                                                                   ` Ken Raeburn
  2016-12-14 10:58                                                                     ` Phil Sainty
                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-14  5:45 UTC (permalink / raw)
  To: Ken Brown; +Cc: Emacs developers


> On Dec 14, 2016, at 00:30, Ken Raeburn <raeburn@raeburn.org> wrote:
> 
> 
>> On Dec 13, 2016, at 10:21, Ken Brown <kbrown@cornell.edu> wrote:
>> 
>> On 12/11/2016 8:34 AM, Ken Raeburn wrote:
>>> I’ve pushed an update to the scratch/raeburn-startup branch.
>> 
>> Did you actually push these changes?  The last commit I see at http://git.savannah.gnu.org/cgit/emacs.git/log/?h=scratch/raeburn-startup is dated 2016-10-30.
>> 
>> Ken
> 
> Strange, I’m not sure what happened.  I’ll push it again.  Just as well, I’ve got a couple minor updates to add anyway.

I must have overlooked it the first time, but “git push -f” is being rejected with:

remote: error: denying non-fast-forward refs/heads/scratch/raeburn-startup (you should pull first)

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-14  5:45                                                                   ` Ken Raeburn
@ 2016-12-14 10:58                                                                     ` Phil Sainty
  2016-12-14 12:06                                                                       ` Yuri Khan
  2016-12-14 11:00                                                                     ` Lars Ingebrigtsen
  2016-12-15 11:45                                                                     ` Ken Raeburn
  2 siblings, 1 reply; 375+ messages in thread
From: Phil Sainty @ 2016-12-14 10:58 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Emacs developers

On 14/12/16 18:45, Ken Raeburn wrote:
> I must have overlooked it the first time, but “git push -f” is being
> rejected with:
>
> remote: error: denying non-fast-forward refs/heads/scratch/raeburn-startup (you should pull first)

Which typically means that you've amended or rebased your local
history since you last pushed it. Or potentially someone else has
pushed to that branch in the interim. Your local branch and the
remote branch have diverged, at any rate.

If you *really really* want to push a revised history -- bearing
in mind that it will cause merge issues for anyone who has already
pulled from that branch (which is why you would usually refrain
from doing such a thing, and why server-side protection like this
exists in the first place) -- then you can generally work around
the protection by deleting the upstream branch and then pushing
your new version of it.

Alternatively, you may need to rebase your local branch onto the
upstream revision.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-14 10:58                                                                     ` Phil Sainty
@ 2016-12-14 12:06                                                                       ` Yuri Khan
  0 siblings, 0 replies; 375+ messages in thread
From: Yuri Khan @ 2016-12-14 12:06 UTC (permalink / raw)
  To: Phil Sainty; +Cc: Ken Raeburn, Emacs developers

On Wed, Dec 14, 2016 at 5:58 PM, Phil Sainty <psainty@orcon.net.nz> wrote:

> If you *really really* want to push a revised history -- bearing
> in mind that it will cause merge issues for anyone who has already
> pulled from that branch (which is why you would usually refrain
> from doing such a thing, and why server-side protection like this
> exists in the first place) -- then you can generally work around
> the protection by deleting the upstream branch and then pushing
> your new version of it.

Scratch branches are personal and others should exercise caution
if/when pulling them. It is okay for the branch owner to delete and
recreate it.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-14  5:45                                                                   ` Ken Raeburn
  2016-12-14 10:58                                                                     ` Phil Sainty
@ 2016-12-14 11:00                                                                     ` Lars Ingebrigtsen
  2016-12-15 11:45                                                                     ` Ken Raeburn
  2 siblings, 0 replies; 375+ messages in thread
From: Lars Ingebrigtsen @ 2016-12-14 11:00 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Ken Brown, Emacs developers

Ken Raeburn <raeburn@raeburn.org> writes:

> I must have overlooked it the first time, but “git push -f” is being
> rejected with:
>
> remote: error: denying non-fast-forward
> refs/heads/scratch/raeburn-startup (you should pull first)

Sounds like you need to say "git pull --rebase" before you push.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-14  5:45                                                                   ` Ken Raeburn
  2016-12-14 10:58                                                                     ` Phil Sainty
  2016-12-14 11:00                                                                     ` Lars Ingebrigtsen
@ 2016-12-15 11:45                                                                     ` Ken Raeburn
  2016-12-15 17:28                                                                       ` Ken Raeburn
  2016-12-16 14:22                                                                       ` Robert Pluim
  2 siblings, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-15 11:45 UTC (permalink / raw)
  To: Emacs developers

Branch scratch/raeburn-startup deleted and re-pushed.

In addition to the changes I mentioned earlier, I found an unnecessary memset in the face reinitialization code that could go, and an initialization form was being emitted that tried to incorporate the obarray by value (which wouldn’t work because the symbol chains don’t all get dumped); omitting the latter for now cuts the file size a percent or so.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 11:45                                                                     ` Ken Raeburn
@ 2016-12-15 17:28                                                                       ` Ken Raeburn
  2016-12-15 19:59                                                                         ` Eli Zaretskii
  2016-12-16 14:22                                                                       ` Robert Pluim
  1 sibling, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-12-15 17:28 UTC (permalink / raw)
  To: Emacs developers

One area I’m contemplating now is whether we can trim the size of dumped.elc.

Question: How useable does Emacs need to be if the Lisp code is improperly installed (or not installed) and can’t be loaded?

With the big-elc approach as currently implemented, assuming we store dumped.elc in the install tree along with the other Lisp code, it basically can’t start without that Lisp library.

If that’s okay, then the next question is: How much do we *need* to load before processing user input?

My impression has been that the current loadup.el contents cover not just the bare minimum that Emacs absolutely needs to have to function, but also the popular things we want to have readily available without having to wait for a Lisp package to load (like buff-menu), especially if it’s in response to a simple keypress or mouse click.  (The X support code is loaded in an X build, even if no X display is present at startup; I think parts of it fall into both categories.)  And adding stuff is fairly cheap; loadup and unexec can take as long as they want, only the speed of relaunching the resulting executable affects the user.

With the big-elc approach, the tradeoffs change.  Reading a bunch of function definitions from dumped.elc is only a tiny bit faster than reading the same definitions from the original .elc files, because we don’t have to open more files.  (At least, the cost is trivial if the files are in cache.  It’ll be OS- and system-dependent.)  Only the time for precomputation that gets done as the file is loaded is saved, and exchanged for the time needed to parse the saved result.

If we can trim some stuff from loadup.el, and resort to autoloading that stuff later, that may save us some startup time.  (Some text mode commands?  Or buff-menu?)

If we really want some of the other stuff to be able to run immediately when the user hits a key, maybe there’s some way to compromise between that and faster startup.  A strawman proposal: load the “must-haves” via dumped.elc; load the user’s init file, read files and execute eval commands as indicated by the command line options; check for user input; if there’s no user input (e.g., use an idle timer set for 3s), start going through a list of “nice-to-haves” and loading them, continuing until user input is available.  If the user starts typing or otherwise invoking some of those nice-to-have commands right away, they’ll have to wait while autoloading happens, but if we get a couple idle seconds, we may still pull the commands in before the user needs them.

Of course, if the user types something while we’re loading a file, that file will have to finish loading before we can respond; it’s sort of a guessing game as to how much idle time suggests that the user is doing something else and probably won’t type anything in the next second or two.  Perhaps we can divide the task further to keep any individual delay shorter: read a .elc file into a buffer, check for input, parse into S-expressions, check for input, eval the S-expressions…
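
A minimal sketch of that strawman in Emacs Lisp follows.  The my-lazy-* names and the feature list are hypothetical; only run-with-idle-timer, input-pending-p, require, timerp and cancel-timer are existing functions.  It assumes each nice-to-have is a feature that can be loaded with require, and it only checks for input between files, not within one.

```elisp
;; Hypothetical sketch: the my-lazy-* names are not part of Emacs.
(defvar my-lazy-features (list 'ring 'thingatpt)
  "Hypothetical list of \"nice-to-have\" features to load while idle.")

(defvar my-lazy-timer nil
  "Idle timer that drives `my-lazy-load-next'.")

(defun my-lazy-load-next ()
  "Load pending features one by one, stopping as soon as input arrives."
  (while (and my-lazy-features (not (input-pending-p)))
    ;; NOERROR is t, so a missing library does not abort the loop.
    (require (pop my-lazy-features) nil t))
  ;; Nothing left to load: stop waking up on every idle period.
  (when (and (null my-lazy-features) (timerp my-lazy-timer))
    (cancel-timer my-lazy-timer)))

;; Run after 3 seconds of idleness, and again on each later idle period.
(setq my-lazy-timer (run-with-idle-timer 3 t #'my-lazy-load-next))
```

If the user is typing when the timer fires, the loop exits immediately and the remaining features wait for the next idle period, or for an ordinary autoload.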

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 17:28                                                                       ` Ken Raeburn
@ 2016-12-15 19:59                                                                         ` Eli Zaretskii
  2016-12-15 22:07                                                                           ` Clément Pit--Claudel
                                                                                             ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-15 19:59 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Thu, 15 Dec 2016 12:28:15 -0500
> 
> Question: How useable does Emacs need to be if the Lisp code is improperly installed (or not installed) and can’t be loaded?

I had this same idea just the other day.  We have auto-loading, so I
thought: maybe just starting temacs and letting it load whatever it
needs, when it needs it, would be good enough?

Just to see what we would be up against, I ran

  ./temacs -Q -nl

and sure enough, it errored out right away because some C code called
Lisp which wasn't loaded yet.

What's more, auto-loading doesn't work for preloaded packages, because
we have code in autoload.el to skip/ignore autoload cookies in files
mentioned in loadup.el.

So my next idea would be to come up with a smaller loadup.el which
only loads the stuff that is needed for temacs to start.  I didn't try
that yet, but I did think that Phillip's work on ldefs-boot might just
be a good starting point: those ldefs-boot-*.el files might be just
what we need.

IMO, it would be interesting to see where this will take us, and what
kind of performance could that produce.

Thanks.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 19:59                                                                         ` Eli Zaretskii
@ 2016-12-15 22:07                                                                           ` Clément Pit--Claudel
  2016-12-16  7:54                                                                             ` Eli Zaretskii
  2016-12-16  7:56                                                                           ` Eli Zaretskii
  2016-12-19 15:09                                                                           ` Phillip Lord
  2 siblings, 1 reply; 375+ messages in thread
From: Clément Pit--Claudel @ 2016-12-15 22:07 UTC (permalink / raw)
  To: emacs-devel


On 2016-12-15 14:59, Eli Zaretskii wrote:
> IMO, it would be interesting to see where this will take us, and what
> kind of performance could that produce.

This sounds like a good idea; I wonder how much it will break, though.  Many external packages don't (require) preloaded packages (some preloaded packages don't or used to not export a (provide), in fact), which may cause issues if these packages aren't preloaded anymore.


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 22:07                                                                           ` Clément Pit--Claudel
@ 2016-12-16  7:54                                                                             ` Eli Zaretskii
  2016-12-16 14:28                                                                               ` Clément Pit--Claudel
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-16  7:54 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: emacs-devel

> From: Clément Pit--Claudel <clement.pit@gmail.com>
> Date: Thu, 15 Dec 2016 17:07:50 -0500
> 
> On 2016-12-15 14:59, Eli Zaretskii wrote:
> > IMO, it would be interesting to see where this will take us, and what
> > kind of performance could that produce.
> 
> This sounds like a good idea; I wonder how much it will break, though.  Many external packages don't (require) preloaded packages (some preloaded packages don't or used to not export a (provide), in fact), which may cause issues if these packages aren't preloaded anymore.

Autoloading should fix that.  This idea won't work anyway without
adding the relevant symbols to loaddefs.el.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16  7:54                                                                             ` Eli Zaretskii
@ 2016-12-16 14:28                                                                               ` Clément Pit--Claudel
  2016-12-16 14:39                                                                                 ` Eli Zaretskii
  2016-12-19 15:11                                                                                 ` Phillip Lord
  0 siblings, 2 replies; 375+ messages in thread
From: Clément Pit--Claudel @ 2016-12-16 14:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On 2016-12-16 02:54, Eli Zaretskii wrote:
>> From: Clément Pit--Claudel <clement.pit@gmail.com>
>> Date: Thu, 15 Dec 2016 17:07:50 -0500
>>
>> On 2016-12-15 14:59, Eli Zaretskii wrote:
>>> IMO, it would be interesting to see where this will take us, and what
>>> kind of performance could that produce.
>>
>> This sounds like a good idea; I wonder how much it will break, though.  Many external packages don't (require) preloaded packages (some preloaded packages don't or used to not export a (provide), in fact), which may cause issues if these packages aren't preloaded anymore.
> 
> Autoloading should fix that.  This idea won't work anyway without
> adding the relevant symbols to loaddefs.el.

Indeed; but then we need to autoload all functions in these files, right?
Also, does autoloading work for macros? And would there not be potential issues with variables and/or defcustoms? (I can't think of any)

Clément.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16 14:28                                                                               ` Clément Pit--Claudel
@ 2016-12-16 14:39                                                                                 ` Eli Zaretskii
  2016-12-16 15:28                                                                                   ` Clément Pit--Claudel
  2016-12-17 14:56                                                                                   ` Stefan Monnier
  2016-12-19 15:11                                                                                 ` Phillip Lord
  1 sibling, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-16 14:39 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Clément Pit--Claudel <clement.pit@gmail.com>
> Date: Fri, 16 Dec 2016 09:28:37 -0500
> 
> > Autoloading should fix that.  This idea won't work anyway without
> > adding the relevant symbols to loaddefs.el.
> 
> Indeed; but then we need to autoload all functions in these files, right?

Not sure about "all", but most of them, yes.  And variables.

> Also, does autoloading work for macros?

It doesn't, but why would macros be a problem?  They need to be seen
by the byte compiler when it compiles the package, not when the
package is loaded.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16 14:39                                                                                 ` Eli Zaretskii
@ 2016-12-16 15:28                                                                                   ` Clément Pit--Claudel
  2016-12-16 21:27                                                                                     ` Eli Zaretskii
  2016-12-17 14:56                                                                                   ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Clément Pit--Claudel @ 2016-12-16 15:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On 2016-12-16 09:39, Eli Zaretskii wrote:
>> Cc: emacs-devel@gnu.org
>> From: Clément Pit--Claudel <clement.pit@gmail.com>
>> Date: Fri, 16 Dec 2016 09:28:37 -0500
>>
>>> Autoloading should fix that.  This idea won't work anyway without
>>> adding the relevant symbols to loaddefs.el.
>>
>> Indeed; but then we need to autoload all functions in these files, right?
> 
> Not sure about "all", but most of them, yes.  And variables.
> 
>> Also, does autoloading work for macros?
> 
> It doesn't, but why would macros be a problem?  They need to be seen
> by the byte compiler when it compiles the package, not when the
> package is loaded.

Right; but won't we have a problem when package.el compiles newly downloaded packages that depend on formerly autoloaded libraries?

Clément.


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16 15:28                                                                                   ` Clément Pit--Claudel
@ 2016-12-16 21:27                                                                                     ` Eli Zaretskii
  2016-12-16 21:38                                                                                       ` Noam Postavsky
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-16 21:27 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Clément Pit--Claudel <clement.pit@gmail.com>
> Date: Fri, 16 Dec 2016 10:28:22 -0500
> 
> >> Also, does autoloading work for macros?
> > 
> > It doesn't, but why would macros be a problem?  They need to be seen
> > by the byte compiler when it compiles the package, not when the
> > package is loaded.
> 
> Right; but won't we have a problem when package.el compiles newly downloaded packages that depend on formerly autoloaded libraries?

I don't know.  Maybe.  Determining this would be part of the job of
exploring this alternative.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16 21:27                                                                                     ` Eli Zaretskii
@ 2016-12-16 21:38                                                                                       ` Noam Postavsky
  0 siblings, 0 replies; 375+ messages in thread
From: Noam Postavsky @ 2016-12-16 21:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Clément Pit--Claudel, Emacs developers

On Fri, Dec 16, 2016 at 4:27 PM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Cc: emacs-devel@gnu.org
>> From: Clément Pit--Claudel <clement.pit@gmail.com>
>> Date: Fri, 16 Dec 2016 10:28:22 -0500
>>
>> >> Also, does autoloading work for macros?
>> >
>> > It doesn't, but why would macros be a problem?  They need to be seen
>> > by the byte compiler when it compiles the package, not when the
>> > package is loaded.

Why do we think autoloading doesn't work for macros?

autoload is a built-in function in `C source code'.

(autoload FUNCTION FILE &optional DOCSTRING INTERACTIVE TYPE)
[...]
Fifth arg TYPE indicates the type of the object:
[...]
   `macro' or t says FUNCTION is really a macro.
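
For illustration, a small sketch of the TYPE argument in use.  The macro name my-with-thing and the library name "my-thing" are hypothetical; autoload and autoloadp are existing functions.

```elisp
;; Hypothetical names: `my-with-thing' would be defined in my-thing.el.
;; The fifth argument `macro' marks the autoloaded object as a macro.
(autoload 'my-with-thing "my-thing" "Run BODY with a thing set up." nil 'macro)

;; Until the library is loaded, the function cell holds an autoload
;; object of the form (autoload FILE DOCSTRING INTERACTIVE TYPE).
(defvar my-autoload-object (symbol-function 'my-with-thing))
(defvar my-autoload-type (nth 4 my-autoload-object))
```

Here my-autoload-type ends up as the symbol macro, which is what tells the evaluator and the byte compiler to load the file before expanding a use of my-with-thing.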


>>
>> Right; but won't we have a problem when package.el compiles newly downloaded packages that depend on formerly autoloaded libraries?
>
> I don't know.  Maybe.  Determining this would be part of the job of
> exploring this alternative.
>



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16 14:39                                                                                 ` Eli Zaretskii
  2016-12-16 15:28                                                                                   ` Clément Pit--Claudel
@ 2016-12-17 14:56                                                                                   ` Stefan Monnier
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-12-17 14:56 UTC (permalink / raw)
  To: emacs-devel

>> Also, does autoloading work for macros?
> It doesn't, but why would macros be a problem?

Huh?  Autoloading works fine for macros.  Or maybe I'm misunderstanding
Clement's question.

The problem of autoloading is with variables, coding systems, faces, ...
This said, I don't see any strong reason why we couldn't arrange to
autoload coding systems.

Maybe we should instrument the Emacs code to mark functions that are
called during a normal "start Emacs opening a file in fundamental mode".
Then we can look at the preloaded functions&macros which haven't been
used.  This should give us a good idea of how much there is to gain on
this front.
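
A rough Lisp-level approximation of that instrumentation, as a sketch only: it records calls through advice rather than marking functions in the C code, and all my-* names are hypothetical.  advice-add and the hash-table functions are existing APIs.

```elisp
;; Hypothetical sketch: record which of a set of functions get called.
(defvar my-called-functions (make-hash-table :test 'eq)
  "Symbols of traced functions that have been called at least once.")

(defun my-trace-function (sym)
  "Advise SYM so that calling it marks it in `my-called-functions'."
  ;; The backquoted lambda embeds SYM directly, so this works with or
  ;; without lexical binding.
  (advice-add sym :before
              `(lambda (&rest _) (puthash ',sym t my-called-functions))
              '((name . my-trace))))

(defun my-unused-among (syms)
  "Return the members of SYMS that were never called."
  (delq nil (mapcar (lambda (s)
                      (unless (gethash s my-called-functions) s))
                    syms)))

;; Demo with two stand-in functions; only one of them is called.
(defun my-demo-used () 1)
(defun my-demo-unused () 2)
(mapc #'my-trace-function '(my-demo-used my-demo-unused))
(my-demo-used)
(defvar my-demo-report (my-unused-among '(my-demo-used my-demo-unused)))
;; my-demo-report is (my-demo-unused)
```

Applied to the preloaded functions instead of the two stand-ins, the report after a plain startup would list the candidates for removal from loadup.el.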

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16 14:28                                                                               ` Clément Pit--Claudel
  2016-12-16 14:39                                                                                 ` Eli Zaretskii
@ 2016-12-19 15:11                                                                                 ` Phillip Lord
  1 sibling, 0 replies; 375+ messages in thread
From: Phillip Lord @ 2016-12-19 15:11 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: Eli Zaretskii, emacs-devel

Clément Pit--Claudel <clement.pit@gmail.com> writes:

> On 2016-12-16 02:54, Eli Zaretskii wrote:
>>> From: Clément Pit--Claudel <clement.pit@gmail.com>
>>> Date: Thu, 15 Dec 2016 17:07:50 -0500
>>>
>>> On 2016-12-15 14:59, Eli Zaretskii wrote:
>>>> IMO, it would be interesting to see where this will take us, and what
>>>> kind of performance could that produce.
>>>
>>> This sounds like a good idea; I wonder how much it will break, though.
>>> Many external packages don't (require) preloaded packages (some preloaded
>>> packages don't or used to not export a (provide), in fact), which may cause
>>> issues if these packages aren't preloaded anymore.
>> 
>> Autoloading should fix that.  This idea won't work anyway without
>> adding the relevant symbols to loaddefs.el.
>
> Indeed; but then we need to autoload all functions in these files, right?


Unfortunately, there is a fairly random aspect to it. You need to
autoload the first function in the file that actually gets called. Once
the file is actually loaded, autoloads don't matter any more. Obviously,
though, which the first function actually is may change.
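
A small self-contained demonstration of that effect, as a sketch.  The library my-demo-lib and its two functions are hypothetical and are written to a temporary directory; only the first function gets an autoload.

```elisp
;; Hypothetical demo: only `my-demo-first' is autoloaded, yet calling it
;; loads the whole file, which also defines `my-demo-second'.
(defvar my-demo-result
  (let* ((dir (make-temp-file "my-autoload-demo" t))
         (file (expand-file-name "my-demo-lib.el" dir))
         (load-path (cons dir load-path)))
    (with-temp-file file
      (insert "(defun my-demo-first () 'first)\n"
              "(defun my-demo-second () 'second)\n"))
    (autoload 'my-demo-first "my-demo-lib")
    (prog1 (list (fboundp 'my-demo-second)   ; before the load
                 (my-demo-first)             ; triggers the load
                 (fboundp 'my-demo-second))  ; after the load
      (delete-directory dir t))))
;; my-demo-result is (nil first t)
```

Had the first call been to my-demo-second instead, it would have failed with void-function, which is why the set of entry points that need autoloads depends on call order.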

Phil




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 19:59                                                                         ` Eli Zaretskii
  2016-12-15 22:07                                                                           ` Clément Pit--Claudel
@ 2016-12-16  7:56                                                                           ` Eli Zaretskii
  2016-12-19 15:15                                                                             ` Phillip Lord
  2016-12-19 15:09                                                                           ` Phillip Lord
  2 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-16  7:56 UTC (permalink / raw)
  To: raeburn; +Cc: emacs-devel

> Date: Thu, 15 Dec 2016 21:59:08 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> So my next idea would be to come up with a smaller loadup.el which
> only loads the stuff that is needed for temacs to start.  I didn't try
> that yet, but I did think that Phillip's work on ldefs-boot might just
> be a good starting point: those ldefs-boot-*.el files might be just
> what we need.

Oh, and one more thought: there could be a separate, smaller loadup
file for batch invocations, since speed of startup in that mode is
somewhat more important than in the interactive mode.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-16  7:56                                                                           ` Eli Zaretskii
@ 2016-12-19 15:15                                                                             ` Phillip Lord
  0 siblings, 0 replies; 375+ messages in thread
From: Phillip Lord @ 2016-12-19 15:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Thu, 15 Dec 2016 21:59:08 +0200
>> From: Eli Zaretskii <eliz@gnu.org>
>> Cc: emacs-devel@gnu.org
>> 
>> So my next idea would be to come up with a smaller loadup.el which
>> only loads the stuff that is needed for temacs to start.  I didn't try
>> that yet, but I did think that Phillip's work on ldefs-boot might just
>> be a good starting point: those ldefs-boot-*.el files might be just
>> what we need.
>
> Oh, and one more thought: there could be a separate, smaller loadup
> file for batch invocations, since speed of startup in that mode is
> somewhat more important than in the interactive mode.


loadup would need refactoring anyway. It doesn't just loadup at the
moment. It also dumps and kills Emacs. It's also got some strange
syntactic constraints because the Makefile does some sed based parsing
of it.

Phil



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-15 19:59                                                                         ` Eli Zaretskii
  2016-12-15 22:07                                                                           ` Clément Pit--Claudel
  2016-12-16  7:56                                                                           ` Eli Zaretskii
@ 2016-12-19 15:09                                                                           ` Phillip Lord
  2016-12-20 18:57                                                                             ` Ken Raeburn
  2 siblings, 1 reply; 375+ messages in thread
From: Phillip Lord @ 2016-12-19 15:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ken Raeburn, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:
> So my next idea would be to come up with a smaller loadup.el which
> only loads the stuff that is needed for temacs to start.  I didn't try
> that yet, but I did think that Phillip's work on ldefs-boot might just
> be a good starting point: those ldefs-boot-*.el files might be just
> what we need.
>
> IMO, it would be interesting to see where this will take us, and what
> kind of performance could that produce.

I looked at this a little and in fact the boot code that I have written
does tell you exactly which autoloads you need to get temacs to work --
it's not very many, I think that there are only 10 or so (bytecomp.el
for instance).

Of course, this is 10 autoloads PLUS all of the non-auto loads in
loadup.el. My own feeling is that this is a bit unclean at the moment;
given that loadup.el is supposed to support temacs till the point that
it dumps, probably all of the autoloads used for this process should be
explicitly in loadup; or, alternatively, we should have very few
non-auto loads in loadup and do everything via autoload.

Phil

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-19 15:09                                                                           ` Phillip Lord
@ 2016-12-20 18:57                                                                             ` Ken Raeburn
  2016-12-20 23:22                                                                               ` Stefan Monnier
  2016-12-21 12:13                                                                               ` Phillip Lord
  0 siblings, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-20 18:57 UTC (permalink / raw)
  To: Phillip Lord; +Cc: Eli Zaretskii, emacs-devel

On Dec 19, 2016, at 10:09, Phillip Lord <phillip.lord@russet.org.uk> wrote:

> I looked at this a little and in fact the boot code that I have written
> does tell you exactly which autoloads you need to get temacs to work --
> it's not very many, I think that there are only 10 or so (bytecomp.el
> for instance).

This sounds like it could be the biggest help for startup time at this point.  Are you going to look further into making a lightweight loadup file?

Looking at ldefs-boot.el and loaddefs.el, and contemplating the parsing of them, I wonder: If we go the big-elc route, can we defer loading the doc strings until they’re actually needed?  Perhaps using the “(#$ . nnnn)” syntax used in .elc files, or somehow pointing at the real .el or .elc files defining the functions?  Maybe just omit the function doc strings, if the help code does something reasonable in that case?

I’ve still been poking at the reader code, but for small-ish changes I think I’m hitting a point of diminishing returns.  My current test case run time is about 0.15-0.16s, though the run times are short enough that minor system activity at the same time can affect the results.  I’ve got one more experiment in the works that cuts almost 20% of the size of dumped.elc, and cuts the test run time to about 0.14s.  (Sharing interned symbols in the printer, so “setplist … setplist …” becomes “#4=setplist … #4# …”, drastically cutting into the 90% of oblookup calls that are done for symbols already in the obarray, and the related string manipulations, as well as the legibility of the generated file.)  After that, I think the next step is further specialization of read1/readchar/read_escape/readbyte for the get-file-char case, and maybe more tweaks to try to optimize for mostly-ASCII input.  But those result in more code duplication and additional maintenance work, for probably small benefit, so they’re not looking all that appealing.
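
For reference, a sketch of the label syntax that experiment relies on.  The reader already accepts #N= and #N# around any object, symbols included.  The stock printer, with print-circle set, emits such labels for repeated strings and conses but not for interned symbols, so the symbol case below shows only the reading side.  The my-* variable names are hypothetical.

```elisp
;; Reading: #1# refers back to the object labelled by #1=.
(defvar my-shared-form (read "(#1=setplist a #1# b)"))
;; my-shared-form is (setplist a setplist b)

;; Printing: with `print-circle', a string that occurs twice in the
;; same structure is written once and then referenced by its label.
(defvar my-printed
  (let ((print-circle t)
        (s (copy-sequence "doc")))
    (prin1-to-string (list s s))))
;; my-printed is "(#1=\"doc\" #1#)"
```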

So, at this point, I’m inclined to finish my current experiment with the printer, and maybe set aside work on the big-elc performance for a bit, maybe look into threading bugs or the state of the CANNOT_DUMP code.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-20 18:57                                                                             ` Ken Raeburn
@ 2016-12-20 23:22                                                                               ` Stefan Monnier
  2016-12-21  7:44                                                                                 ` Ken Raeburn
  2016-12-21 12:13                                                                               ` Phillip Lord
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-12-20 23:22 UTC (permalink / raw)
  To: emacs-devel

> them, I wonder: If we go the big-elc route, can we defer loading the doc
> strings until they’re actually needed?  Perhaps using the “(#$ . nnnn)”

In the dumped.elc file I generate, there should be basically
no docstrings (the data I dump already uses either the NNN or the (#$
. NNN) representation to point to docstrings in the DOC file or in the
original .elc file), so I don't think there's much opportunity for deferral.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-12-20 23:22                                                                               ` Stefan Monnier
@ 2016-12-21  7:44                                                                                 ` Ken Raeburn
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-12-21  7:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Dec 20, 2016, at 18:22, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> them, I wonder: If we go the big-elc route, can we defer loading the doc
>> strings until they’re actually needed?  Perhaps using the “(#$ . nnnn)”
> 
> In the dumped.elc file I generate, there should be basically
> no docstrings (the data I dump already uses either the NNN or the (#$
> . NNN) representation to point to docstrings in the DOC file or in the
> original .elc file), so I don't think there's much opportunity for deferral.

Ah, yes, I forgot that happens even for the loaddefs doc strings not explicitly using that syntax, thanks to Snarf-documentation.  At least, so long as all the files we might pre-load under various conditions are all covered by the DOC file, if we do take the approach of a smaller file for batch mode.

Ken



* Re: Skipping unexec via a big .elc file
  2016-12-20 18:57                                                                             ` Ken Raeburn
  2016-12-20 23:22                                                                               ` Stefan Monnier
@ 2016-12-21 12:13                                                                               ` Phillip Lord
  1 sibling, 0 replies; 375+ messages in thread
From: Phillip Lord @ 2016-12-21 12:13 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Eli Zaretskii, emacs-devel

Ken Raeburn <raeburn@raeburn.org> writes:

> On Dec 19, 2016, at 10:09, Phillip Lord <phillip.lord@russet.org.uk> wrote:
>
>> I looked at this a little and in fact the boot code that I have written
>> does tell you exactly which autoloads you need to get temacs to work --
>> it's not very many, I think that there are only 10 or so (bytecomp.el
>> for instance).
>
> This sounds like it could be the biggest help for startup time at this point.
> Are you going to look further into making a lightweight loadup file?
>
> Looking at ldefs-boot.el and loaddefs.el, and contemplating the parsing of
> them, I wonder: If we go the big-elc route, can we defer loading the doc
> strings until they’re actually needed?  Perhaps using the “(#$ . nnnn)” syntax
> used in .elc files, or somehow pointing at the real .el or .elc files defining
> the functions?  Maybe just omit the function doc strings, if the help code
> does something reasonable in that case?

In the ldefs-boot-auto.el file, there are no doc strings. This was
mostly because it was extra effort to add them, and they made no
difference for the use intended.

One simple way to lazy load doc strings would be to fiddle with help
code so that it loads the relevant file before looking for the
docstring. This does assume that all the code in dumped.elc is
idempotent, though.
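
As a rough sketch of that idea (not code from the branch): the hypothetical helper `my-doc-with-lazy-load` below returns a function's doc string, loading a given file first if none is available yet. The helper name and the explicit FILE argument are assumptions made for illustration; real help code would have to work out the file on its own.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs or of
;; the dumped.elc branch.
(defun my-doc-with-lazy-load (function file)
  "Return the doc string of FUNCTION, loading FILE first if it has none.
FILE is loaded only when FUNCTION has no doc string yet.  Loading
re-evaluates the whole file, which is why the code in it needs to be
idempotent."
  (or (ignore-errors (documentation function t))
      (progn
        ;; NOERROR is nil and NOMESSAGE is t: fail loudly, load quietly.
        (load file nil t)
        (documentation function t))))
```

Because the load re-runs every top-level form in FILE, anything with side effects beyond plain definitions would be executed a second time, which is the idempotence concern mentioned above.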

Incidentally, unless I have misunderstood, dumped.elc will duplicate
code also found in other .elc files (say, byte-run.elc, and
nadvice.elc)? Is dumped.elc going to detect these dependencies and
redump as necessary?

Phil




* Re: Skipping unexec via a big .elc file
  2016-12-15 11:45                                                                     ` Ken Raeburn
  2016-12-15 17:28                                                                       ` Ken Raeburn
@ 2016-12-16 14:22                                                                       ` Robert Pluim
  1 sibling, 0 replies; 375+ messages in thread
From: Robert Pluim @ 2016-12-16 14:22 UTC (permalink / raw)
  To: emacs-devel

Ken Raeburn <raeburn@raeburn.org> writes:

> Branch scratch/raeburn-startup deleted and re-pushed.
>
> In addition to the changes I mentioned earlier, I found an unnecessary
> memset in the face reinitialization code that could go, and an
> initialization form was being emitted that tried to incorporate the
> obarray by value (which wouldn’t work because the symbol chains don’t
> all get dumped); omitting the latter for now cuts the file size a
> percent or so.

Hmm, it crashes for me when doing a bootstrap on GNU/Linux Mint (based
on Ubuntu 16.04)

/bin/bash: line 1: 21979 Aborted                 EMACSLOADPATH= '../src/emacs' -batch --no-site-file --no-site-lisp --eval '(setq load-prefer-newer t)' -f batch-byte-compile calc/calcalg2.el
Makefile:282: recipe for target 'calc/calcalg2.elc' failed

Backtrace (this is compiled -ggdb -O0 with gcc 5.4.0)

(gdb) bt
#0  terminate_due_to_signal (sig=6, backtrace_limit=40) at emacs.c:367
#1  0x00000000005866d3 in emacs_abort () at sysdep.c:2337
#2  0x000000000057100d in unblock_input_to (level=-1) at keyboard.c:7170
#3  0x0000000000571024 in unblock_input () at keyboard.c:7186
#4  0x0000000000639605 in read1 (readcharfun=25152, pch=0x7fffffff719c, first_in_list=false) at lread.c:3446
#5  0x000000000063a906 in read_list (flag=false, readcharfun=25152) at lread.c:3928
#6  0x00000000006371e9 in read1 (readcharfun=25152, pch=0x7fffffff7514, first_in_list=false) at lread.c:2629
#7  0x0000000000636336 in read0 (readcharfun=25152) at lread.c:2210
#8  0x00000000006361fb in read_internal_start (stream=25152, start=0, end=0) at lread.c:2176
#9  0x0000000000635ef2 in Fread (stream=25152) at lread.c:2110
#10 0x000000000060748d in Ffuncall (nargs=2, args=0x7fffffff76b0) at eval.c:2715
#11 0x0000000000606d54 in call1 (fn=39792, arg1=25152) at eval.c:2574
#12 0x00000000006358a5 in readevalloop (readcharfun=25152, stream=0x19bc7a0, sourcename=26993220, printflag=false, 
    unibyte=0, readfun=0, start=0, end=0) at lread.c:1958
#13 0x0000000000633fed in Fload (file=10190276, noerror=0, nomessage=45216, nosuffix=0, must_suffix=45216)
    at lread.c:1367
#14 0x00000000006136d9 in Frequire (feature=4534752, filename=0, noerror=0) at fns.c:2894
#15 0x0000000000607516 in Ffuncall (nargs=2, args=0x7fffffff7ae8) at eval.c:2722
#16 0x000000000064e9ef in exec_byte_code (bytestr=26973012, vector=22861437, maxdepth=10, args_template=0, nargs=0, 
    args=0x0) at bytecode.c:639
#17 0x000000000064dd56 in Fbyte_code (bytestr=26973012, vector=22861437, maxdepth=10) at bytecode.c:319
#18 0x0000000000605f02 in eval_sub (form=26933523) at eval.c:2194
#19 0x00000000006359c3 in readevalloop (readcharfun=25152, stream=0x18c7190, sourcename=26972436, printflag=false, 
    unibyte=0, readfun=0, start=0, end=0) at lread.c:1980
#20 0x0000000000633fed in Fload (file=26802292, noerror=0, nomessage=45216, nosuffix=0, must_suffix=45216)
    at lread.c:1367
#21 0x00000000006136d9 in Frequire (feature=13579232, filename=0, noerror=0) at fns.c:2894
#22 0x0000000000607516 in Ffuncall (nargs=2, args=0x7fffffff8518) at eval.c:2722
#23 0x00000000006063fa in Fapply (nargs=2, args=0x7fffffff8518) at eval.c:2300
#24 0x000000000060735d in Ffuncall (nargs=3, args=0x7fffffff8510) at eval.c:2695
#25 0x000000000064e9ef in exec_byte_code (bytestr=25727860, vector=22453597, maxdepth=38, args_template=1030, 
    nargs=1, args=0x7fffffff8a78) at bytecode.c:639
#26 0x0000000000607dd8 in funcall_lambda (fun=22453765, nargs=1, arg_vector=0x7fffffff8a70) at eval.c:2879
#27 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff8a68) at eval.c:2764
#28 0x000000000064e9ef in exec_byte_code (bytestr=25715156, vector=22445469, maxdepth=18, args_template=1030, 
    nargs=1, args=0x7fffffff8f60) at bytecode.c:639
#29 0x0000000000607dd8 in funcall_lambda (fun=22445517, nargs=1, arg_vector=0x7fffffff8f58) at eval.c:2879
#30 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff8f50) at eval.c:2764
#31 0x000000000064e9ef in exec_byte_code (bytestr=25714756, vector=22445325, maxdepth=22, args_template=1030, 
    nargs=1, args=0x7fffffff9438) at bytecode.c:639
#32 0x0000000000607dd8 in funcall_lambda (fun=22445373, nargs=1, arg_vector=0x7fffffff9430) at eval.c:2879
#33 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff9428) at eval.c:2764
#34 0x000000000064e9ef in exec_byte_code (bytestr=17023828, vector=22255765, maxdepth=42, args_template=2058, 
    nargs=2, args=0x7fffffff9948) at bytecode.c:639
#35 0x0000000000607dd8 in funcall_lambda (fun=22255869, nargs=2, arg_vector=0x7fffffff9938) at eval.c:2879
#36 0x00000000006077ae in Ffuncall (nargs=3, args=0x7fffffff9930) at eval.c:2764
#37 0x000000000064e9ef in exec_byte_code (bytestr=25714692, vector=22433525, maxdepth=18, args_template=1030, 
    nargs=1, args=0x7fffffff9e08) at bytecode.c:639
#38 0x0000000000607dd8 in funcall_lambda (fun=22445421, nargs=1, arg_vector=0x7fffffff9e00) at eval.c:2879
#39 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff9df8) at eval.c:2764
#40 0x000000000064e9ef in exec_byte_code (bytestr=25699108, vector=22436957, maxdepth=22, args_template=1030, 
    nargs=1, args=0x7fffffffa330) at bytecode.c:639
#41 0x0000000000607dd8 in funcall_lambda (fun=22433429, nargs=1, arg_vector=0x7fffffffa328) at eval.c:2879
#42 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffa320) at eval.c:2764
#43 0x000000000064e9ef in exec_byte_code (bytestr=25698596, vector=22437133, maxdepth=66, args_template=1030, 
    nargs=1, args=0x7fffffffa978) at bytecode.c:639
#44 0x0000000000607dd8 in funcall_lambda (fun=22433477, nargs=1, arg_vector=0x7fffffffa970) at eval.c:2879
#45 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffa968) at eval.c:2764
#46 0x000000000064e9ef in exec_byte_code (bytestr=25677076, vector=22432581, maxdepth=66, args_template=2054, 
    nargs=1, args=0x7fffffffb030) at bytecode.c:639
#47 0x0000000000607dd8 in funcall_lambda (fun=22429357, nargs=1, arg_vector=0x7fffffffb028) at eval.c:2879
#48 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffb020) at eval.c:2764
#49 0x000000000064e9ef in exec_byte_code (bytestr=25918004, vector=19599701, maxdepth=34, args_template=1030, 
    nargs=1, args=0x7fffffffb578) at bytecode.c:639
#50 0x0000000000607dd8 in funcall_lambda (fun=19599829, nargs=1, arg_vector=0x7fffffffb570) at eval.c:2879
#51 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffb568) at eval.c:2764
#52 0x000000000064e9ef in exec_byte_code (bytestr=25917348, vector=20758333, maxdepth=42, args_template=1026, 
    nargs=0, args=0x7fffffffbb58) at bytecode.c:639
#53 0x0000000000607dd8 in funcall_lambda (fun=20758493, nargs=0, arg_vector=0x7fffffffbb58) at eval.c:2879
#54 0x00000000006077ae in Ffuncall (nargs=1, args=0x7fffffffbb50) at eval.c:2764
#55 0x000000000064e9ef in exec_byte_code (bytestr=10932540, vector=10932573, maxdepth=94, args_template=1030, 
    nargs=1, args=0x7fffffffc4e8) at bytecode.c:639
#56 0x0000000000607dd8 in funcall_lambda (fun=10932493, nargs=1, arg_vector=0x7fffffffc4e0) at eval.c:2879
#57 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffc4d8) at eval.c:2764
#58 0x000000000064e9ef in exec_byte_code (bytestr=10909516, vector=10909549, maxdepth=86, args_template=2, nargs=0, 
    args=0x7fffffffd108) at bytecode.c:639
#59 0x0000000000607dd8 in funcall_lambda (fun=10909469, nargs=0, arg_vector=0x7fffffffd108) at eval.c:2879
#60 0x00000000006077ae in Ffuncall (nargs=1, args=0x7fffffffd100) at eval.c:2764
#61 0x000000000064e9ef in exec_byte_code (bytestr=10905556, vector=10905589, maxdepth=50, args_template=2, nargs=0, 
    args=0x7fffffffd6f0) at bytecode.c:639
#62 0x0000000000607dd8 in funcall_lambda (fun=10905509, nargs=0, arg_vector=0x7fffffffd6f0) at eval.c:2879
#63 0x0000000000607b46 in apply_lambda (fun=10905509, args=0, count=4) at eval.c:2816
#64 0x000000000060607e in eval_sub (form=19109027) at eval.c:2233
#65 0x0000000000605572 in Feval (form=19109027, lexical=0) at eval.c:2010
#66 0x00000000005644f6 in top_level_2 () at keyboard.c:1127
#67 0x0000000000603fbd in internal_condition_case (bfun=0x5644d3 <top_level_2>, handlers=19536, 
    hfun=0x563f70 <cmd_error>) at eval.c:1314
#68 0x0000000000564537 in top_level_1 (ignore=0) at keyboard.c:1135
#69 0x00000000006038cc in internal_catch (tag=46608, func=0x5644f8 <top_level_1>, arg=0) at eval.c:1080
#70 0x000000000056442b in command_loop () at keyboard.c:1096
#71 0x0000000000563b55 in recursive_edit_1 () at keyboard.c:703
#72 0x0000000000563ccc in Frecursive_edit () at keyboard.c:774
#73 0x0000000000561951 in main (argc=9, argv=0x7fffffffdc38) at emacs.c:1698

Lisp Backtrace:
"read" (0xffff76b8)
"require" (0xffff7af0)
"byte-code" (0xffff7f20)
"require" (0xffff8520)
"apply" (0xffff8518)
"byte-compile-file-form-require" (0xffff8a70)
"byte-compile-file-form" (0xffff8f58)
0x1567d38 PVEC_COMPILED
"byte-compile-recurse-toplevel" (0xffff9938)
"byte-compile-toplevel-file-form" (0xffff9e00)
0x1564e90 PVEC_COMPILED
"byte-compile-from-buffer" (0xffffa970)
"byte-compile-file" (0xffffb028)
"batch-byte-compile-file" (0xffffb570)
"batch-byte-compile" (0xffffbb58)
"command-line-1" (0xffffc4e0)
"command-line" (0xffffd108)
"normal-top-level" (0xffffd6f0)
(gdb) 





* Re: Skipping unexec via a big .elc file
  2016-12-11 13:34                                                             ` Ken Raeburn
                                                                                 ` (3 preceding siblings ...)
  2016-12-13 15:21                                                               ` Ken Brown
@ 2016-12-24 13:37                                                               ` Eli Zaretskii
  2016-12-26 17:48                                                                 ` Eli Zaretskii
  4 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-24 13:37 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 11 Dec 2016 08:34:01 -0500
> 
> With all these changes — Stefan’s new patch with additional initialization, and my updates to shave a little more time off — I’m still hitting just under 0.2s for:
> 
>   time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'
> 
> on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel Core 2 Duo, 2.8GHz) takes over half a second (including at least one GC invocation).

For the record, my timing is 0.828s with an unoptimized build of the
branch, as opposed to 13.2s with an unoptimized build on master, and
5.343s with an optimized (-Og) build of Emacs 25.1.90.  The CPU is
Core i7-2600, 3.4GHz; the compiler used is GCC 5.3.0.




* Re: Skipping unexec via a big .elc file
  2016-12-24 13:37                                                               ` Eli Zaretskii
@ 2016-12-26 17:48                                                                 ` Eli Zaretskii
  2017-01-07  9:40                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-12-26 17:48 UTC (permalink / raw)
  To: raeburn; +Cc: emacs-devel

> Date: Sat, 24 Dec 2016 15:37:11 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> I’m still hitting just under 0.2s for:
> 
> >   time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'
> > 
> > on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel Core 2 Duo, 2.8GHz) takes over half a second (including at least one GC invocation).
> 
> For the record, my timing is 0.828s with an unoptimized build of the
> branch, as opposed to 13.2s with an unoptimized build on master, and
> 5.343s with an optimized (-Og) build of Emacs 25.1.90.  The CPU is
> Core i7-2600, 3.4GHz; the compiler used is GCC 5.3.0.

With an optimized (-O2) build on the same system, the above command
takes 0.190s on the average.  Byte-compiling Lisp files in batch mode
shows that loading dumped.elc takes about 0.150s, as that is the
difference between byte-compiling with temacs and the dumped emacs.
For example, compiling simple.el takes 0.656s with temacs and 0.515s
with dumped emacs.  IOW, the overhead is additive, as expected.




* Re: Skipping unexec via a big .elc file
  2016-12-26 17:48                                                                 ` Eli Zaretskii
@ 2017-01-07  9:40                                                                   ` Eli Zaretskii
  2017-01-09 10:28                                                                     ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-07  9:40 UTC (permalink / raw)
  To: raeburn; +Cc: emacs-devel

Ken,

I tried to get rid of calling dump-emacs in the raeburn-startup
branch, see the changes below.  The resulting code builds and produces
dumped.elc, but then fails to compile the *.el files:

  ...
  Loading d:/gnu/git/emacs/no-unexec/lisp/leim/leim-list.el (source)...
  Finding pointers to doc strings...
  Finding pointers to doc strings...done
  Dumping under the name emacs
  Dumping into dumped.elc...preparing...
  Dumping into dumped.elc...generating...
  Dumping into dumped.elc...printing...
  Dumping into dumped.elc...saving...
  Dumping into dumped.elc...done
  mv -f emacs.exe bootstrap-emacs.exe
  make -C ../lisp compile-first EMACS="../src/bootstrap-emacs.exe"
  make[2]: Entering directory `/d/gnu/git/emacs/no-unexec/lisp'
    ELC      emacs-lisp/macroexp.elc
  Loading ../src/dumped.elc...
  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
    ELC      emacs-lisp/cconv.elc
  Loading ../src/dumped.elc...
  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)

This could be related to the fact that the original code produced the
first dumped.elc in the top-level directory, not in src/, and I needed
to fix that, since otherwise bootstrap-emacs would exit immediately
(see the changes below).  In the original version, src/dumped.elc was
only produced after all the necessary Lisp files were byte-compiled
already.

So it seems like the current build process on this branch still
somehow depends on a dumped emacs executable, until it byte-compiles
all the preloaded Lisp files, and produces dumped.elc from that.  IOW,
the first dumped.elc produced before byte-compiling those files is not
up to the job of running Emacs for byte-compiling Lisp files.  How can
we fix that, so that unexec and its call can be really removed from
the sources?  Or did I miss something?

Thanks.

diff --git a/lisp/loadup.el b/lisp/loadup.el
index 54d19c1..873d804 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -453,27 +453,30 @@
       ;; confused people installing Emacs (they'd install the file
       ;; under the name `xemacs'), and it's inconsistent with every
       ;; other GNU program's build process.
-      (dump-emacs "emacs" "temacs")
-      (message "%d pure bytes used" pure-bytes-used)
-      ;; Recompute NAME now, so that it isn't set when we dump.
-      (if (not (or (eq system-type 'ms-dos)
-                   ;; Don't bother adding another name if we're just
-                   ;; building bootstrap-emacs.
-                   (equal (last command-line-args) '("bootstrap"))))
-	  (let ((name (concat "emacs-" emacs-version))
-		(exe (if (eq system-type 'windows-nt) ".exe" "")))
-	    (while (string-match "[^-+_.a-zA-Z0-9]+" name)
-	      (setq name (concat (downcase (substring name 0 (match-beginning 0)))
+      ;; (dump-emacs "emacs" "temacs")
+      ;; (message "%d pure bytes used" pure-bytes-used)
+      (let ((exe (if (memq system-type '(windows-nt ms-dos)) ".exe" "")))
+        (copy-file (expand-file-name (concat "temacs" exe) invocation-directory)
+                   (expand-file-name (concat "emacs" exe) invocation-directory)
+                   t)
+        ;; Recompute NAME now, so that it isn't set when we dump.
+        (if (not (or (eq system-type 'ms-dos)
+                     ;; Don't bother adding another name if we're just
+                     ;; building bootstrap-emacs.
+                     (equal (last command-line-args) '("bootstrap"))))
+            (let ((name (concat "emacs-" emacs-version)))
+              (while (string-match "[^-+_.a-zA-Z0-9]+" name)
+                (setq name (concat (downcase (substring name 0 (match-beginning 0)))
 				 "-"
 				 (substring name (match-end 0)))))
-	    (setq name (concat name exe))
-            (message "Adding name %s" name)
-	    ;; When this runs on Windows, invocation-directory is not
-	    ;; necessarily the current directory.
-	    (add-name-to-file (expand-file-name (concat "emacs" exe)
-						invocation-directory)
-			      (expand-file-name name invocation-directory)
-			      t)))
+              (setq name (concat name exe))
+              (message "Adding name %s" name)
+              ;; When this runs on Windows, invocation-directory is not
+              ;; necessarily the current directory.
+              (add-name-to-file (expand-file-name (concat "emacs" exe)
+                                                  invocation-directory)
+                                (expand-file-name name invocation-directory)
+                                t))))
       (message "Dumping into dumped.elc...preparing...")
 
       ;; Dump the current state into a file so we can reload it!
@@ -555,6 +558,7 @@
          obarray)
         (message "Dumping into dumped.elc...printing...")
         (with-current-buffer (generate-new-buffer "dumped.elc")
+          (setq default-directory invocation-directory)
           (insert ";ELC\^W\^@\^@\^@\n;;; Compiled\n;;; in Emacs version "
                   emacs-version "\n")
           (let ((print-circle t)




* Re: Skipping unexec via a big .elc file
  2017-01-07  9:40                                                                   ` Eli Zaretskii
@ 2017-01-09 10:28                                                                     ` Ken Raeburn
  2017-01-10  2:25                                                                       ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-01-09 10:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs developers

> On Jan 7, 2017, at 04:40, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> Ken,
> 
> I tried to get rid of calling dump-emacs in the raeburn-startup
> branch, see the changes below.  The resulting code builds and produces
> dumped.elc, but then fails to compile the *.el files:

I’ve been looking into it this weekend.  It appears that in some of my builds I’m seeing in dumped.elc stuff along the lines of:

(setplist 'window-parameter '(gv-expander (closure (t) #19=(do &rest args) (gv--defsetter 'window-parameter (lambda #20=(val &rest args) `(,'set-window-parameter . #21=(,@args ,val))) . #22=(do args))) side-effect-free t))

That’s with my #N# patch removed; that patch obfuscates the code but I don’t think it should be changing the meaning.

The comma-quote-symbol syntax looks strange to me, could that be causing it?

> This could be related to the fact that the original code produced the
> first dumped.elc in the top-level directory, not in src/, and I needed
> to fix that, since otherwise bootstrap-emacs would exit immediately
> (see the changes below).  In the original version, src/dumped.elc was
> only produced after all the necessary Lisp files were byte-compiled
> already.

In the GNU/Linux build, the dumped.elc file is generated in the src directory of the build tree.  So that part of your patch didn’t alter anything for my testing as far as I can see.

But the GNU/Linux build supports building in a separate tree from the source tree, a mode I usually do my builds in, and at startup we look for dumped.elc in the src directory of the source tree, not the build tree.  So I still have to tweak it manually.

> So it seems like the current build process on this branch still
> somehow depends on a dumped emacs executable, until it byte-compiles
> all the preloaded Lisp files, and produces dumped.elc from that.  IOW,
> the first dumped.elc produced before byte-compiling those files is not
> up to the job of running Emacs for byte-compiling Lisp files.  How can
> we fix that, so that unexec and its call can be really removed from
> the sources?  Or did I miss something?

A workaround might be to use loadup.el instead of dumped.elc during that stage.  But that doesn’t fix the problem.

Ken


* Re: Skipping unexec via a big .elc file
  2017-01-09 10:28                                                                     ` Ken Raeburn
@ 2017-01-10  2:25                                                                       ` Stefan Monnier
  2017-01-10  9:46                                                                         ` Andreas Schwab
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-01-10  2:25 UTC (permalink / raw)
  To: emacs-devel

> `(,'set-window-parameter . #21=(,@args ,val))) . #22=(do args)))
> side-effect-free t))
>
> The comma-quote-symbol syntax looks strange to me, could that be causing it?

The ,' is a result of evaluation of code like

    ``(,',setter ,@args ,val)

so, it's indeed strange, but only to the extent that nested backquotes
are "strange".
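
To make the nested-backquote point concrete, here is a small sketch. The function name `my-setter-template` is made up for illustration; it only mimics the shape of the form above and is not the actual gv.el code.

```elisp
;; Hypothetical helper; it mimics the shape of the nested backquote
;; above and is not taken from gv.el.
(defun my-setter-template (setter)
  "Return the inner backquote form with SETTER already filled in.
Only the doubly unquoted SETTER is evaluated by the outer backquote;
the args and val unquotes are left for the inner backquote."
  ``(,',setter ,@args ,val))

;; With `print-quoted' non-nil,
;;   (my-setter-template 'set-window-parameter)
;; yields a form that prints as
;;   `(,'set-window-parameter ,@args ,val)
```

The leftover ,' is simply what remains once the innermost comma has been evaluated and the quoted result spliced back in.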

Eli wrote:
>  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)

Hmm... I don't understand this.  This message seems to come from
backquote.el:

   ((eq (car s) backquote-unquote-symbol)
    (if (<= level 0)
        (cond
         ((> (length s) 2)
          ;; We could support it with: (cons 2 `(list . ,(cdr s)))
          ;; But let's not encourage such uses.
          (error "Multiple args to , are not supported: %S" s))
         (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
                  (nth 1 s))))
      (backquote-delay-process s (1- level))))

but then `s` should have \, in its car, whereas the above message
indicates that (car s) is (\, (quote set-window-parameter)) which
implies we should not have entered this branch.

Maybe I'm just too tired to read this code right, tho.


        Stefan





* Re: Skipping unexec via a big .elc file
  2017-01-10  2:25                                                                       ` Stefan Monnier
@ 2017-01-10  9:46                                                                         ` Andreas Schwab
  2017-01-10 17:19                                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2017-01-10  9:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Jan 09 2017, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> Eli wrote:
>>  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
>
> Hmm... I don't understand this.  This message seems to come from
> backquote.el:
>
>    ((eq (car s) backquote-unquote-symbol)
>     (if (<= level 0)
>         (cond
>          ((> (length s) 2)
>           ;; We could support it with: (cons 2 `(list . ,(cdr s)))
>           ;; But let's not encourage such uses.
>           (error "Multiple args to , are not supported: %S" s))
>          (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
>                   (nth 1 s))))
>       (backquote-delay-process s (1- level))))
>
> but then `s` should have \, in its car, whereas the above message
> indicates that (car s) is (\, (quote set-window-parameter)) which
> implies we should not have entered this branch.

That can only mean that something clobbered backquote-unquote-symbol.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: Skipping unexec via a big .elc file
  2017-01-10  9:46                                                                         ` Andreas Schwab
@ 2017-01-10 17:19                                                                           ` Eli Zaretskii
  2017-01-11  6:32                                                                             ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-10 17:19 UTC (permalink / raw)
  To: Andreas Schwab, Ken Raeburn; +Cc: monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Tue, 10 Jan 2017 10:46:25 +0100
> Cc: emacs-devel@gnu.org
> 
> On Jan 09 2017, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
> > Eli wrote:
> >>  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
> >
> > Hmm... I don't understand this.  This message seems to come from
> > backquote.el:
> >
> >    ((eq (car s) backquote-unquote-symbol)
> >     (if (<= level 0)
> >         (cond
> >          ((> (length s) 2)
> >           ;; We could support it with: (cons 2 `(list . ,(cdr s)))
> >           ;; But let's not encourage such uses.
> >           (error "Multiple args to , are not supported: %S" s))
> >          (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
> >                   (nth 1 s))))
> >       (backquote-delay-process s (1- level))))
> >
> > but then `s` should have \, in its car, whereas the above message
> > indicates that (car s) is (\, (quote set-window-parameter)) which
> > implies we should not have entered this branch.
> 
> That can only mean that something clobbered backquote-unquote-symbol.

Yes, the value of backquote-unquote-symbol at this point is indeed
this:

   (\, (quote set-window-parameter))

I guess something is wrong with reading dumped.elc?
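
A quick sanity check for this kind of clobbering could look like the sketch below. The function name `my-backquote-symbols-intact-p` is hypothetical; the three variables are the constants backquote.el defines, and the expected values are the ones a stock Emacs uses.

```elisp
(require 'backquote)

;; Hypothetical diagnostic; not part of Emacs.
(defun my-backquote-symbols-intact-p ()
  "Return non-nil if the backquote marker constants hold their stock values."
  (and (eq backquote-backquote-symbol '\`)
       (eq backquote-unquote-symbol '\,)
       (eq backquote-splice-symbol '\,@)))
```

In a session showing the error above, this would be expected to return nil, since backquote-unquote-symbol holds a list rather than the comma symbol.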




* Re: Skipping unexec via a big .elc file
  2017-01-10 17:19                                                                           ` Eli Zaretskii
@ 2017-01-11  6:32                                                                             ` Ken Raeburn
  2017-01-12  8:17                                                                               ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-01-11  6:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andreas Schwab, monnier, emacs-devel

> On Jan 10, 2017, at 12:19, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Date: Tue, 10 Jan 2017 10:46:25 +0100
>> Cc: emacs-devel@gnu.org
>> 
>> On Jan 09 2017, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> 
>>> Eli wrote:
>>>> Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
>>> 
>>> Hmm... I don't understand this.  This message seems to come from
>>> backquote.el:
>>> 
>>>   ((eq (car s) backquote-unquote-symbol)
>>>    (if (<= level 0)
>>>        (cond
>>>         ((> (length s) 2)
>>>          ;; We could support it with: (cons 2 `(list . ,(cdr s)))
>>>          ;; But let's not encourage such uses.
>>>          (error "Multiple args to , are not supported: %S" s))
>>>         (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
>>>                  (nth 1 s))))
>>>      (backquote-delay-process s (1- level))))
>>> 
>>> but then `s` should have \, in its car, whereas the above message
>>> indicates that (car s) is (\, (quote set-window-parameter)) which
>>> implies we should not have entered this branch.
>> 
>> That can only mean that something clobbered backquote-unquote-symbol.
> 
> Yes, the value of backquote-unquote-symbol at this point is indeed
> this:
> 
>   (\, (quote set-window-parameter))
> 
> I guess something is wrong with reading dumped.elc?

At the moment it’s looking to me like it might be a problem with my #N# patch for writing out symbols.  It got a little more of a speedup reading dumped.elc, but if I drop that change, I get a lot further in trying to bootstrap the tree with your change.  It still fails while processing the “leim” directory, though.

Indeed, looking at dumped.elc, I see:
  (#35# '#5646# '#218#)
where 35 is set-default, 5646 is backquote-unquote-symbol, and 218 is ,'set-window-parameter thanks to "#218=,'#897=set-window-parameter" being read from dumped.elc.  I suspect 218 was supposed to be just the comma, but the special printing of comma forms was still applied, and it is not compatible with the #N# handling, so comma and related symbols should just be excluded from that hack.
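
The reader side of this can be reproduced in a stock Emacs, without the branch's symbol-reference printing. A minimal sketch, with a made-up function name and a made-up symbol `foo`: a label written directly before a comma form names the whole comma form, never the bare comma symbol.

```elisp
;; Stock reader behaviour only; the symbol-reference printing itself
;; lives on the branch and is not reproduced here.
(defun my-read-labelled-comma ()
  "Read a two-element list whose first element is a labelled comma form."
  (read "(#1=,'foo #1#)"))

;; Both elements of the result are the same (\, (quote foo)) cons, so a
;; later #1# can never stand for the comma symbol alone.
```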

I’ll test that out, but in the meantime, commenting out the binding in loadup.el of print-symbols-as-references should make things work again (bootstrapping up until partway through the leim directory).

Ken


* Re: Skipping unexec via a big .elc file
  2017-01-11  6:32                                                                             ` Ken Raeburn
@ 2017-01-12  8:17                                                                               ` Ken Raeburn
  2017-01-14 10:41                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-01-12  8:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andreas Schwab, monnier, emacs-devel

On Jan 11, 2017, at 01:32, Ken Raeburn <raeburn@raeburn.org> wrote:
> Indeed, looking at dumped.elc, I see:
>  (#35# '#5646# '#218#)
> where 35 is set-default, 5646 is backquote-unquote-symbol, and 218 is ,'set-window-parameter, thanks to "#218=,'#897=set-window-parameter" being read from dumped.elc.  I suspect 218 was supposed to be just the comma; the special printing of comma forms was still applied even though it is not compatible with the #N# handling, so comma and related symbols should just be excluded from that hack.

There were other instances that made it clear that “#218#” was being printed where “,” was intended, including the absence of a space before whatever followed, which is normal for a comma (e.g., “#218##219#” where #219# referred to some ordinary symbol).

I’ve just uploaded a workaround for that (including a comma-dot sequence that I’m not familiar with, but which seems to get the same treatment as comma and comma-at), and a bug fix I found relating to one of my earlier changes.

Now, with your patch to avoid unexec, it’s successfully compiling in the lisp directory but fails in leim, which I haven’t dug into yet:

make[2]: Entering directory '/home/raeburn/dev/emacs/s/lisp'
make -C ../leim all EMACS="../src/emacs"
make[3]: Entering directory '/home/raeburn/dev/emacs/s/leim'
/bin/mkdir -p ../lisp/leim/ja-dic
  GEN      ../lisp/leim/ja-dic/ja-dic.el
Loading ../src/dumped.elc...
Reading file "/home/raeburn/dev/emacs/s/leim/SKK-DIC/SKK-JISYO.L" ...
Processing OKURI-ARI entries ...
Debugger entered--Lisp error: (search-failed "^\\cH")
  re-search-forward("^\\cH")
  (let ((from (point)) to) (search-forward ";; okuri-nasi") (beginning-of-line) (setq to (point)) (narrow-to-region from to) (skkdic-convert-okuri-ari skkbuf buf) (widen) (goto-char to) (forward-line 1) (setq from (point)) (re-search-forward "^\\cH") (setq to (match-beginning 0)) (narrow-to-region from to) (skkdic-convert-postfix skkbuf buf) (widen) (goto-char to) (skkdic-convert-prefix skkbuf buf) (skkdic-collect-okuri-nasi) (skkdic-convert-okuri-nasi skkbuf buf) (save-current-buffer (set-buffer buf) (goto-char (point-max)) (insert ";;\n(provide 'ja-dic)\n\n" ";; Local Variables:\n" ";; version-control: never\n" ";; no-update-autoloads: t\n" ";; coding: utf-8\n" ";; End:\n\n" ";;; ja-dic.el ends here\n")))
…


* Re: Skipping unexec via a big .elc file
  2017-01-12  8:17                                                                               ` Ken Raeburn
@ 2017-01-14 10:41                                                                                 ` Eli Zaretskii
  2017-01-14 10:55                                                                                   ` Andreas Schwab
                                                                                                     ` (3 more replies)
  0 siblings, 4 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-14 10:41 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: schwab, monnier, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Thu, 12 Jan 2017 03:17:40 -0500
> Cc: Andreas Schwab <schwab@linux-m68k.org>,
>  monnier@iro.umontreal.ca,
>  emacs-devel@gnu.org
> 
> I’ve just uploaded a workaround for that (including a comma-dot sequence that I’m not familiar with, but which seems to get the same treatment as comma and comma-at), and a bug fix I found relating to one of my earlier changes.
> 
> Now, with your patch to avoid unexec, it’s successfully compiling in the lisp directory but fails

It does fail for me while byte compiling 2 files:

    ELC      leim/ja-dic/ja-dic.elc
  Loading ../src/dumped.elc...

  In toplevel form:
  leim/ja-dic/ja-dic.el:76:1:Error: Args out of range: " s  ", 2432, 2432
  Makefile:282: recipe for target `leim/ja-dic/ja-dic.elc' failed
  make[2]: *** [leim/ja-dic/ja-dic.elc] Error 1

    ELC      net/eww.elc
  Loading ../src/dumped.elc...

  In toplevel form:
  net/eww.el:29:1:Error: Undefined category: >
  Makefile:282: recipe for target `net/eww.elc' failed
  make[2]: *** [net/eww.elc] Error 1

The line number in the error message is bogus, it points to a require
line (that's a known issue with byte-compiler error reporting, I
think).  Running the latter compilation under GDB, I see this:

  Thread 1 hit Breakpoint 3, Fsignal (error_symbol=21616,
      data=-4611686018320478144) at eval.c:1471
  1471      signal_or_quit (error_symbol, data, false);
  (gdb) pp error_symbol
  error
  (gdb) pp data
  ("Undefined category: >")
  (gdb) bt
  #0  Fsignal (error_symbol=21616, data=-4611686018320478144) at eval.c:1471
  #1  0x0114c616 in xsignal (error_symbol=21616, data=-4611686018320478144)
      at lisp.h:3872
  #2  0x01221129 in xsignal1 (error_symbol=21616, arg=-9223372036747839088)
      at eval.c:1606
  #3  0x01221df8 in verror (
      m=0x167d0c5 <DEFAULT_REHASH_SIZE+373> "Undefined category: %c",
      ap=0x826bb4 ">") at eval.c:1791
  #4  0x01221e16 in error (
      m=0x167d0c5 <DEFAULT_REHASH_SIZE+373> "Undefined category: %c")
      at eval.c:1803
  #5  0x011108b2 in Fmodify_category_entry (character=4611686018427387937,
      category=4611686018427387966, table=-6917529027624565016, reset=0)
      at category.c:368
  #6  0x01225980 in Ffuncall (nargs=3, args=0x826d88) at eval.c:2726
  #7  0x0128959c in exec_byte_code (bytestr=-9223372036747840008,
      vector=-6917529027534179736, maxdepth=4611686018427387911,
      args_template=0, nargs=0, args=0x0) at bytecode.c:639
  #8  0x012886b0 in Fbyte_code (bytestr=-9223372036747840008,
      vector=-6917529027534179736, maxdepth=4611686018427387911)
      at bytecode.c:319
  #9  0x012238ef in eval_sub (form=-4611686018320476528) at eval.c:2194
  #10 0x01269693 in readevalloop (readcharfun=28056,
      stream=0x77c5fd00 <msvcrt!_iob+128>, sourcename=-9223372036747840088,
      printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980
  #11 0x01266f73 in Fload (file=-9223372036747853616, noerror=0,
      nomessage=55328, nosuffix=0, must_suffix=0) at lread.c:1367
  #12 0x01223a45 in eval_sub (form=-4611686018320474720) at eval.c:2202
  #13 0x0121c562 in Fprogn (body=-4611686018320474624) at eval.c:432
  #14 0x0121c277 in Fif (args=-4611686018320475808) at eval.c:390
  #15 0x012232ef in eval_sub (form=-4611686018320475824) at eval.c:2141
  #16 0x01268a56 in readevalloop_eager_expand_eval (val=-4611686018320474576,
      macroexpand=62241920) at lread.c:1792
  #17 0x01269679 in readevalloop (readcharfun=-6917529027540611248, stream=0x0,
      sourcename=-9223372036754301328, printflag=false, unibyte=0, readfun=0,
      start=0, end=0) at lread.c:1978
  #18 0x01269b55 in Feval_buffer (buffer=-6917529027540611248, printflag=0,
      filename=-9223372036754305384, unibyte=0, do_allow_print=55328)
      at lread.c:2044
  #19 0x01225a39 in Ffuncall (nargs=6, args=0x828098) at eval.c:2731
  #20 0x0128959c in exec_byte_code (bytestr=-9223372036760655744,
      vector=-6917529027547213384, maxdepth=4611686018427387910,
      args_template=0, nargs=0, args=0x0) at bytecode.c:639
  #21 0x01226fc8 in funcall_lambda (fun=-6917529027547213024, nargs=4,
      arg_vector=0x8286c0) at eval.c:2957
  #22 0x01225e1f in Ffuncall (nargs=5, args=0x8286b8) at eval.c:2764
  #23 0x01225110 in call4 (fn=64816760, arg1=-9223372036754305384,
      arg2=-9223372036754305384, arg3=0, arg4=55328) at eval.c:2599
  #24 0x01266b7c in Fload (file=-9223372036754314936, noerror=0,
      nomessage=55328, nosuffix=0, must_suffix=55328) at lread.c:1311
  #25 0x012391ce in Frequire (feature=76025368, filename=0, noerror=0)
      at fns.c:2894
  #26 0x012258f2 in Ffuncall (nargs=2, args=0x829068) at eval.c:2722
  #27 0x0122429d in Fapply (nargs=2, args=0x829068) at eval.c:2300
  #28 0x012256f3 in Ffuncall (nargs=3, args=0x829060) at eval.c:2695
  #29 0x0128959c in exec_byte_code (bytestr=-9223372036754937720,
      vector=-6917529027541241920, maxdepth=4611686018427387913,
      args_template=4611686018427388161, nargs=1, args=0x829648)
      at bytecode.c:639
  #30 0x012268de in funcall_lambda (fun=-6917529027541241752, nargs=1,
      arg_vector=0x829640) at eval.c:2879
  #31 0x01225e1f in Ffuncall (nargs=2, args=0x829638) at eval.c:2764
  #32 0x0128959c in exec_byte_code (bytestr=-9223372036754938152,
      vector=-6917529027541242792, maxdepth=4611686018427387908,
      args_template=4611686018427388161, nargs=1, args=0x829bb0)
      at bytecode.c:639
  #33 0x012268de in funcall_lambda (fun=-6917529027541242744, nargs=1,
      arg_vector=0x829ba8) at eval.c:2879
  #34 0x01225e1f in Ffuncall (nargs=2, args=0x829ba0) at eval.c:2764
  #35 0x0128959c in exec_byte_code (bytestr=-9223372036754938216,
      vector=-6917529027541242960, maxdepth=4611686018427387909,
      args_template=4611686018427388161, nargs=1, args=0x82a108)
      at bytecode.c:639
  #36 0x012268de in funcall_lambda (fun=-6917529027541242912, nargs=1,
      arg_vector=0x82a100) at eval.c:2879
  #37 0x01225e1f in Ffuncall (nargs=2, args=0x82a0f8) at eval.c:2764
  #38 0x0128959c in exec_byte_code (bytestr=-9223372036755022888,
      vector=-6917529027541352232, maxdepth=4611686018427387914,
      args_template=4611686018427388418, nargs=2, args=0x82a698)
      at bytecode.c:639
  #39 0x012268de in funcall_lambda (fun=-6917529027541352128, nargs=2,
      arg_vector=0x82a688) at eval.c:2879
  #40 0x01225e1f in Ffuncall (nargs=3, args=0x82a680) at eval.c:2764
  #41 0x0128959c in exec_byte_code (bytestr=-9223372036754938232,
      vector=-6917529027541242864, maxdepth=4611686018427387908,
      args_template=4611686018427388161, nargs=1, args=0x82abd8)
      at bytecode.c:639
  #42 0x012268de in funcall_lambda (fun=-6917529027541242840, nargs=1,
      arg_vector=0x82abd0) at eval.c:2879
  #43 0x01225e1f in Ffuncall (nargs=2, args=0x82abc8) at eval.c:2764
  #44 0x0128959c in exec_byte_code (bytestr=-9223372036754939440,
      vector=-6917529027541248760, maxdepth=4611686018427387909,
      args_template=4611686018427388161, nargs=1, args=0x82b180)
      at bytecode.c:639
  #45 0x012268de in funcall_lambda (fun=-6917529027541248584, nargs=1,
      arg_vector=0x82b178) at eval.c:2879
  #46 0x01225e1f in Ffuncall (nargs=2, args=0x82b170) at eval.c:2764
  #47 0x0128959c in exec_byte_code (bytestr=-9223372036754939488,
      vector=-6917529027541248536, maxdepth=4611686018427387920,
      args_template=4611686018427388161, nargs=1, args=0x82b848)
      at bytecode.c:639
  #48 0x012268de in funcall_lambda (fun=-6917529027541248088, nargs=1,
      arg_vector=0x82b840) at eval.c:2879
  #49 0x01225e1f in Ffuncall (nargs=2, args=0x82b838) at eval.c:2764
  #50 0x0128959c in exec_byte_code (bytestr=-9223372036754945616,
      vector=-6917529027541250008, maxdepth=4611686018427387920,
      args_template=4611686018427388417, nargs=1, args=0x82bf80)
      at bytecode.c:639
  #51 0x012268de in funcall_lambda (fun=-6917529027541249216, nargs=1,
      arg_vector=0x82bf78) at eval.c:2879
  #52 0x01225e1f in Ffuncall (nargs=2, args=0x82bf70) at eval.c:2764
  #53 0x0128959c in exec_byte_code (bytestr=-9223372036754832632,
      vector=-6917529027541134088, maxdepth=4611686018427387912,
      args_template=4611686018427388161, nargs=1, args=0x82c548)
      at bytecode.c:639
  #54 0x012268de in funcall_lambda (fun=-6917529027541133960, nargs=1,
      arg_vector=0x82c540) at eval.c:2879
  #55 0x01225e1f in Ffuncall (nargs=2, args=0x82c538) at eval.c:2764
  #56 0x0128959c in exec_byte_code (bytestr=-9223372036754832680,
      vector=-6917529027541140920, maxdepth=4611686018427387914,
      args_template=4611686018427388160, nargs=0, args=0x82cba8)
      at bytecode.c:639
  #57 0x012268de in funcall_lambda (fun=-6917529027541140760, nargs=0,
      arg_vector=0x82cba8) at eval.c:2879
  #58 0x01225e1f in Ffuncall (nargs=1, args=0x82cba0) at eval.c:2764
  #59 0x0128959c in exec_byte_code (bytestr=-9223372036757936560,
      vector=-6917529027544929360, maxdepth=4611686018427387927,
      args_template=4611686018427388161, nargs=1, args=0x82d5b8)
      at bytecode.c:639
  #60 0x012268de in funcall_lambda (fun=-6917529027544928504, nargs=1,
      arg_vector=0x82d5b0) at eval.c:2879
  #61 0x01225e1f in Ffuncall (nargs=2, args=0x82d5a8) at eval.c:2764
  #62 0x0128959c in exec_byte_code (bytestr=-9223372036760992488,
      vector=-6917529027547292792, maxdepth=4611686018427387925,
      args_template=4611686018427387904, nargs=0, args=0x82e258)
      at bytecode.c:639
  #63 0x012268de in funcall_lambda (fun=-6917529027547291064, nargs=0,
      arg_vector=0x82e258) at eval.c:2879
  #64 0x01225e1f in Ffuncall (nargs=1, args=0x82e250) at eval.c:2764
  #65 0x0128959c in exec_byte_code (bytestr=-9223372036756613024,
      vector=-6917529027542917320, maxdepth=4611686018427387916,
      args_template=4611686018427387904, nargs=0, args=0x82e850)
      at bytecode.c:639
  #66 0x012268de in funcall_lambda (fun=-6917529027542916696, nargs=0,
      arg_vector=0x82e850) at eval.c:2879
  #67 0x012263b2 in apply_lambda (fun=-6917529027542916696, args=0, count=21)
      at eval.c:2816
  #68 0x01223de3 in eval_sub (form=-4611686018332481616) at eval.c:2233
  #69 0x01222b56 in Feval (form=-4611686018332481616, lexical=0) at eval.c:2010
  #70 0x01223886 in eval_sub (form=-4611686018328424112) at eval.c:2191
  #71 0x0121c562 in Fprogn (body=-4611686018328424080) at eval.c:432
  #72 0x012232ef in eval_sub (form=-4611686018328424240) at eval.c:2141
  #73 0x01269693 in readevalloop (readcharfun=28056,
      stream=0x77c5fce0 <msvcrt!_iob+96>, sourcename=-9223372036838132000,
      printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980
  #74 0x01266f73 in Fload (file=-9223372036838132176, noerror=0, nomessage=0,
      nosuffix=0, must_suffix=0) at lread.c:1367
  #75 0x01223a45 in eval_sub (form=-4611686018340754848) at eval.c:2202
  #76 0x012205e2 in internal_lisp_condition_case (var=0,
      bodyform=-4611686018340754848, handlers=-4611686018340754768)
      at eval.c:1285
  #77 0x0121fe64 in Fcondition_case (args=-4611686018340754736) at eval.c:1211
  #78 0x012232ef in eval_sub (form=-4611686018340754720) at eval.c:2141
  #79 0x01222b56 in Feval (form=-4611686018340754720, lexical=0) at eval.c:2010
  #80 0x01155640 in top_level_2 () at keyboard.c:1127
  #81 0x01220675 in internal_condition_case (bfun=0x115560a <top_level_2>,
      handlers=21616, hfun=0x1154dc1 <cmd_error>) at eval.c:1314
  #82 0x011556a6 in top_level_1 (ignore=0) at keyboard.c:1135
  #83 0x0121f7fd in internal_catch (tag=57512, func=0x1155646 <top_level_1>,
      arg=0) at eval.c:1080
  #84 0x01155522 in command_loop () at keyboard.c:1096
  #85 0x011547f3 in recursive_edit_1 () at keyboard.c:703
  #86 0x01154a8f in Frecursive_edit () at keyboard.c:774
  #87 0x01152244 in main (argc=7, argv=0xa440d8) at emacs.c:1698

  Lisp Backtrace:
  "modify-category-entry" (0x826d90)
  "byte-code" (0x827270)
  "load" (0x827a90)
  "if" (0x827cc0)
  "eval-buffer" (0x8280a0)
  "load-with-code-conversion" (0x8286c0)
  "require" (0x829070)
  "apply" (0x829068)
  "byte-compile-file-form-require" (0x829640)
  "byte-compile-file-form" (0x829ba8)
  0x5f36be0 PVEC_COMPILED
  "byte-compile-recurse-toplevel" (0x82a688)
  "byte-compile-toplevel-file-form" (0x82abd0)
  0x5f355b8 PVEC_COMPILED
  "byte-compile-from-buffer" (0x82b840)
  "byte-compile-file" (0x82bf78)
  "batch-byte-compile-file" (0x82c540)
  "batch-byte-compile" (0x82cba8)
  "command-line-1" (0x82d5b0)
  "command-line" (0x82e258)
  "normal-top-level" (0x82e850)
  "eval" (0x82eb30)
  "progn" (0x82ed20)
  "load" (0x82f500)
  "condition-case" (0x82f7d0)
  (gdb) fr 11
  #11 0x01266f73 in Fload (file=-9223372036747853616, noerror=0,
      nomessage=55328, nosuffix=0, must_suffix=0) at lread.c:1367
  1367        readevalloop (Qget_file_char, stream, hist_file_name,
  (gdb) pp file
  "kinsoku"
  (gdb)

So it is loading kinsoku.el, and the code that triggers the error is this:

  (while (< idx len)
    (setq ch (aref kinsoku-bol idx)
	  idx (1+ idx))
    (modify-category-entry ch ?>)))

The category '>' is defined in characters.el.  Surprisingly,
characters.elc in this branch is identical to the file on master, so
byte compilation (see below) is off the hook here.  What else could
explain that this category is deemed unknown?
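
For reference, the error itself is easy to reproduce in isolation.  The
following is only a sketch on a scratch category table (the function
name and the docstring are made up for the illustration); it shows that
'modify-category-entry' rejects a category that has no definition in
the table it is given, and accepts it once 'define-category' has run
on that table:

```elisp
(defun demo-undefined-category ()
  "Return (ERROR-TEXT MNEMONICS) for a scratch category table.
Illustration only; the standard category table is not touched."
  (let ((table (make-category-table))
        err-text)
    ;; In a fresh table no category is defined, so this signals an error.
    (condition-case err
        (modify-category-entry ?a ?> table)
      (error (setq err-text (error-message-string err))))
    ;; Once the category has a definition, the same call succeeds.
    (define-category ?> "Demo category (placeholder docstring)" table)
    (modify-category-entry ?a ?> table)
    (list err-text
          (with-temp-buffer
            (set-category-table table)
            (category-set-mnemonics (char-category-set ?a))))))
```

Calling (demo-undefined-category) should return
("Undefined category: >" ">").  So one thing worth checking in the
failing build is whether the category table in effect when kinsoku.el
is loaded still carries the definitions made by characters.el.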

Running the ja-dic.el compilation under GDB, I see this:

  Thread 1 hit Breakpoint 3, Fsignal (error_symbol=9464,
      data=-4611686018325493760) at eval.c:1471
  1471      signal_or_quit (error_symbol, data, false);
  (gdb) pp error_symbol
  args-out-of-range
  (gdb) pp data
  (" s  " 2432 2432)
  (gdb) bt
  #0  Fsignal (error_symbol=9464, data=-4611686018325493760) at eval.c:1471
  #1  0x0114c616 in xsignal (error_symbol=9464, data=-4611686018325493760)
      at lisp.h:3872
  #2  0x0122120b in xsignal3 (error_symbol=9464, arg1=-9223372036754289936,
      arg2=4611686018427390336, arg3=4611686018427390336) at eval.c:1618
  #3  0x011f8170 in args_out_of_range_3 (a1=-9223372036754289936,
      a2=4611686018427390336, a3=4611686018427390336) at data.c:169
  #4  0x0122f302 in validate_subarray (array=-9223372036754289936,
      from=4611686018427390336, to=4611686018427390336, size=4, ifrom=0x828da8,
      ito=0x828da4) at fns.c:1257
  #5  0x0122f3a1 in Fsubstring (string=-9223372036754289936,
      from=4611686018427390336, to=4611686018427390336) at fns.c:1282
  #6  0x0128a8b7 in exec_byte_code (bytestr=-9223372036754295792,
      vector=-6917529027540600688, maxdepth=4611686018427387908,
      args_template=0, nargs=0, args=0x0) at bytecode.c:958
  #7  0x01226fc8 in funcall_lambda (fun=-6917529027540652976, nargs=1,
      arg_vector=0x829398) at eval.c:2957
  #8  0x01225e1f in Ffuncall (nargs=2, args=0x829390) at eval.c:2764
  #9  0x0128959c in exec_byte_code (bytestr=-9223372036754295728,
      vector=-6917529027540600592, maxdepth=4611686018427387912,
      args_template=0, nargs=0, args=0x0) at bytecode.c:639
  #10 0x01226fc8 in funcall_lambda (fun=-6917529027540600496, nargs=15663,
      arg_vector=0x60c72d0) at eval.c:2957
  #11 0x01225e1f in Ffuncall (nargs=15664, args=0x60c72c8) at eval.c:2764
  #12 0x012246e8 in Fapply (nargs=2, args=0x829970) at eval.c:2343
  #13 0x01224f2d in apply1 (fn=-6917529027540600496, arg=-4611686018326943040)
      at eval.c:2559
  #14 0x0121f650 in Fmacroexpand (form=-4611686018326943056,
      environment=-4611686018327011424) at eval.c:1035
  #15 0x0122588f in Ffuncall (nargs=3, args=0x829b48) at eval.c:2718
  #16 0x0128959c in exec_byte_code (bytestr=-9223372036758992784,
      vector=-6917529027545329872, maxdepth=4611686018427387914,
      args_template=4611686018427388418, nargs=2, args=0x82a110)
      at bytecode.c:639
  #17 0x012268de in funcall_lambda (fun=-6917529027545333672, nargs=2,
      arg_vector=0x82a100) at eval.c:2879
  #18 0x01225e1f in Ffuncall (nargs=3, args=0x82a0f8) at eval.c:2764
  #19 0x0128959c in exec_byte_code (bytestr=-9223372036755022888,
      vector=-6917529027541352232, maxdepth=4611686018427387914,
      args_template=4611686018427388418, nargs=2, args=0x82a698)
      at bytecode.c:639
  #20 0x012268de in funcall_lambda (fun=-6917529027541352128, nargs=2,
      arg_vector=0x82a688) at eval.c:2879
  #21 0x01225e1f in Ffuncall (nargs=3, args=0x82a680) at eval.c:2764
  #22 0x0128959c in exec_byte_code (bytestr=-9223372036754938232,
      vector=-6917529027541242864, maxdepth=4611686018427387908,
      args_template=4611686018427388161, nargs=1, args=0x82abd8)
      at bytecode.c:639
  #23 0x012268de in funcall_lambda (fun=-6917529027541242840, nargs=1,
      arg_vector=0x82abd0) at eval.c:2879
  #24 0x01225e1f in Ffuncall (nargs=2, args=0x82abc8) at eval.c:2764
  #25 0x0128959c in exec_byte_code (bytestr=-9223372036754939440,
      vector=-6917529027541248760, maxdepth=4611686018427387909,
      args_template=4611686018427388161, nargs=1, args=0x82b180)
      at bytecode.c:639
  #26 0x012268de in funcall_lambda (fun=-6917529027541248584, nargs=1,
      arg_vector=0x82b178) at eval.c:2879
  #27 0x01225e1f in Ffuncall (nargs=2, args=0x82b170) at eval.c:2764
  #28 0x0128959c in exec_byte_code (bytestr=-9223372036754939488,
      vector=-6917529027541248536, maxdepth=4611686018427387920,
      args_template=4611686018427388161, nargs=1, args=0x82b848)
      at bytecode.c:639
  #29 0x012268de in funcall_lambda (fun=-6917529027541248088, nargs=1,
      arg_vector=0x82b840) at eval.c:2879
  #30 0x01225e1f in Ffuncall (nargs=2, args=0x82b838) at eval.c:2764
  #31 0x0128959c in exec_byte_code (bytestr=-9223372036754945616,
      vector=-6917529027541250008, maxdepth=4611686018427387920,
      args_template=4611686018427388417, nargs=1, args=0x82bf80)
      at bytecode.c:639
  #32 0x012268de in funcall_lambda (fun=-6917529027541249216, nargs=1,
      arg_vector=0x82bf78) at eval.c:2879
  #33 0x01225e1f in Ffuncall (nargs=2, args=0x82bf70) at eval.c:2764
  #34 0x0128959c in exec_byte_code (bytestr=-9223372036754832632,
      vector=-6917529027541134088, maxdepth=4611686018427387912,
      args_template=4611686018427388161, nargs=1, args=0x82c548)
      at bytecode.c:639
  #35 0x012268de in funcall_lambda (fun=-6917529027541133960, nargs=1,
      arg_vector=0x82c540) at eval.c:2879
  #36 0x01225e1f in Ffuncall (nargs=2, args=0x82c538) at eval.c:2764
  #37 0x0128959c in exec_byte_code (bytestr=-9223372036754832680,
      vector=-6917529027541140920, maxdepth=4611686018427387914,
      args_template=4611686018427388160, nargs=0, args=0x82cba8)
      at bytecode.c:639
  #38 0x012268de in funcall_lambda (fun=-6917529027541140760, nargs=0,
      arg_vector=0x82cba8) at eval.c:2879
  #39 0x01225e1f in Ffuncall (nargs=1, args=0x82cba0) at eval.c:2764
  #40 0x0128959c in exec_byte_code (bytestr=-9223372036757936560,
      vector=-6917529027544929360, maxdepth=4611686018427387927,
      args_template=4611686018427388161, nargs=1, args=0x82d5b8)
      at bytecode.c:639
  #41 0x012268de in funcall_lambda (fun=-6917529027544928504, nargs=1,
      arg_vector=0x82d5b0) at eval.c:2879
  #42 0x01225e1f in Ffuncall (nargs=2, args=0x82d5a8) at eval.c:2764
  #43 0x0128959c in exec_byte_code (bytestr=-9223372036760992488,
      vector=-6917529027547292792, maxdepth=4611686018427387925,
      args_template=4611686018427387904, nargs=0, args=0x82e258)
      at bytecode.c:639
  #44 0x012268de in funcall_lambda (fun=-6917529027547291064, nargs=0,
      arg_vector=0x82e258) at eval.c:2879
  #45 0x01225e1f in Ffuncall (nargs=1, args=0x82e250) at eval.c:2764
  #46 0x0128959c in exec_byte_code (bytestr=-9223372036756613024,
      vector=-6917529027542917320, maxdepth=4611686018427387916,
      args_template=4611686018427387904, nargs=0, args=0x82e850)
      at bytecode.c:639
  #47 0x012268de in funcall_lambda (fun=-6917529027542916696, nargs=0,
      arg_vector=0x82e850) at eval.c:2879
  #48 0x012263b2 in apply_lambda (fun=-6917529027542916696, args=0, count=21)
      at eval.c:2816
  #49 0x01223de3 in eval_sub (form=-4611686018332481616) at eval.c:2233
  #50 0x01222b56 in Feval (form=-4611686018332481616, lexical=0) at eval.c:2010
  #51 0x01223886 in eval_sub (form=-4611686018328424112) at eval.c:2191
  #52 0x0121c562 in Fprogn (body=-4611686018328424080) at eval.c:432
  #53 0x012232ef in eval_sub (form=-4611686018328424240) at eval.c:2141
  #54 0x01269693 in readevalloop (readcharfun=28056,
      stream=0x77c5fce0 <msvcrt!_iob+96>, sourcename=-9223372036838132000,
      printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980
  #55 0x01266f73 in Fload (file=-9223372036838132176, noerror=0, nomessage=0,
      nosuffix=0, must_suffix=0) at lread.c:1367
  #56 0x01223a45 in eval_sub (form=-4611686018340754848) at eval.c:2202
  #57 0x012205e2 in internal_lisp_condition_case (var=0,
      bodyform=-4611686018340754848, handlers=-4611686018340754768)
      at eval.c:1285
  #58 0x0121fe64 in Fcondition_case (args=-4611686018340754736) at eval.c:1211
  #59 0x012232ef in eval_sub (form=-4611686018340754720) at eval.c:2141
  #60 0x01222b56 in Feval (form=-4611686018340754720, lexical=0) at eval.c:2010
  #61 0x01155640 in top_level_2 () at keyboard.c:1127
  #62 0x01220675 in internal_condition_case (bfun=0x115560a <top_level_2>,
      handlers=21616, hfun=0x1154dc1 <cmd_error>) at eval.c:1314
  #63 0x011556a6 in top_level_1 (ignore=0) at keyboard.c:1135
  #64 0x0121f7fd in internal_catch (tag=57512, func=0x1155646 <top_level_1>,
      arg=0) at eval.c:1080
  #65 0x01155522 in command_loop () at keyboard.c:1096
  #66 0x011547f3 in recursive_edit_1 () at keyboard.c:703
  #67 0x01154a8f in Frecursive_edit () at keyboard.c:774
  #68 0x01152244 in main (argc=7, argv=0xa440d8) at emacs.c:1698

  Lisp Backtrace:
  "skkdic-extract-conversion-data" (0x829398)
  0x5fd3950 PVEC_COMPILED
  "macroexpand" (0x829b50)
  "macroexp-macroexpand" (0x82a100)
  "byte-compile-recurse-toplevel" (0x82a688)
  "byte-compile-toplevel-file-form" (0x82abd0)
  0x5f355b8 PVEC_COMPILED
  "byte-compile-from-buffer" (0x82b840)
  "byte-compile-file" (0x82bf78)
  "batch-byte-compile-file" (0x82c540)
  "batch-byte-compile" (0x82cba8)
  "command-line-1" (0x82d5b0)
  "command-line" (0x82e258)
  "normal-top-level" (0x82e850)
  "eval" (0x82eb30)
  "progn" (0x82ed20)
  "load" (0x82f500)
  "condition-case" (0x82f7d0)
  (gdb) fr 5
  #5  0x0122f3a1 in Fsubstring (string=-9223372036754289936,
      from=4611686018427390336, to=4611686018427390336) at fns.c:1282
  1282      validate_subarray (string, from, to, size, &ifrom, &ito);
  (gdb) pp string
  " s  "
  (gdb)

The error seems to come from this function in ja-dic-cnv.el:

  (defun skkdic-extract-conversion-data (entry)
    (string-match "^\\cj+[a-z]* " entry)
    (let ((kana (substring entry (match-beginning 0) (1- (match-end 0))))
	  (i (match-end 0))
	  candidates)
      (while (string-match "[^ ]+" entry i)
	(setq candidates (cons (match-string 0 entry) candidates))
	(setq i (match-end 0)))
      (cons (skkdic-get-kana-compact-codes kana) candidates)))

The call to 'substring' is the one that errors out, presumably because
the preceding 'string-match' failed and left stale match data behind.
So this again points to some problem with categories, as "\\cj" is in
the regexp.
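
As a side note on the mechanism: skkdic-extract-conversion-data uses
the match data without checking the return value of 'string-match'.
If the match fails, 'match-beginning' and 'match-end' may still
describe an earlier, unrelated search, which would be consistent with
offsets of 2432 being applied to a 4-character string.  That reading is
an assumption, not something verified in this build.  A guarded variant
could look like the sketch below; the helper name is hypothetical and
the regexp is plain ASCII so that it does not depend on categories:

```elisp
(defun demo-extract-key (entry)
  "Return the leading lowercase word of ENTRY, or nil if there is none.
Hypothetical helper; only the match-then-substring pattern mirrors
the function quoted above."
  ;; Consult the match data only when this particular match succeeded.
  (when (string-match "^[a-z]+ " entry)
    (substring entry (match-beginning 0) (1- (match-end 0)))))
```

With this, (demo-extract-key "abc def") gives "abc", and
(demo-extract-key " s  ") gives nil instead of signalling
args-out-of-range.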

> make[2]: Entering directory '/home/raeburn/dev/emacs/s/lisp'
> make -C ../leim all EMACS="../src/emacs"
> make[3]: Entering directory '/home/raeburn/dev/emacs/s/leim'
> /bin/mkdir -p ../lisp/leim/ja-dic
>   GEN      ../lisp/leim/ja-dic/ja-dic.el
> Loading ../src/dumped.elc...
> Reading file "/home/raeburn/dev/emacs/s/leim/SKK-DIC/SKK-JISYO.L" ...
> Processing OKURI-ARI entries ...
> Debugger entered--Lisp error: (search-failed "^\\cH")
>   re-search-forward("^\\cH")
>   (let ((from (point)) to) (search-forward ";; okuri-nasi") (beginning-of-line) (setq to (point)) (narrow-to-region from to) (skkdic-convert-okuri-ari skkbuf buf) (widen) (goto-char to) (forward-line 1) (setq from (point)) (re-search-forward "^\\cH") (setq to (match-beginning 0)) (narrow-to-region from to) (skkdic-convert-postfix skkbuf buf) (widen) (goto-char to) (skkdic-convert-prefix skkbuf buf) (skkdic-collect-okuri-nasi) (skkdic-convert-okuri-nasi skkbuf buf) (save-current-buffer (set-buffer buf) (goto-char (point-max)) (insert ";;\n(provide 'ja-dic)\n\n" ";; Local Variables:\n" ";; version-control: never\n" ";; no-update-autoloads: t\n" ";; coding: utf-8\n" ";; End:\n\n" ";;; ja-dic.el ends here\n")))

Not sure why I didn't see the error with okuri-nasi; perhaps the
previous build attempts had already generated that file.  If I do

  touch leim/SKK-DIC/SKK-JISYO.L

the next "make" indeed fails as on your system.

One other thing I noticed is that most of the *.elc files produced by
this build are different from those I see on master.  The differences
are sometimes just a few bytes (e.g., in mule-diag.elc), but sometimes
much larger (e.g., files.elc).  Perhaps this points to some subtle
problem in byte compilation?  But even if so, that cannot explain the
failure to compile eww.el and ja-dic.el.

HTH




* Re: Skipping unexec via a big .elc file
  2017-01-14 10:41                                                                                 ` Eli Zaretskii
@ 2017-01-14 10:55                                                                                   ` Andreas Schwab
  2017-01-14 11:07                                                                                     ` Eli Zaretskii
  2017-01-14 15:30                                                                                   ` Stefan Monnier
                                                                                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2017-01-14 10:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ken Raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

> The line number in the error message is bogus, it points to a require
> line (that's a known issue with byte-compiler error reporting, I
> think).

It's not bogus, since the error was raised while the byte-compiler
evaluated the form there.  Lisp errors don't carry line number
information so there isn't much the byte-compiler can do.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




* Re: Skipping unexec via a big .elc file
  2017-01-14 10:55                                                                                   ` Andreas Schwab
@ 2017-01-14 11:07                                                                                     ` Eli Zaretskii
  2017-01-14 11:26                                                                                       ` Alan Mackenzie
  2017-01-14 12:19                                                                                       ` Andreas Schwab
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-14 11:07 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: raeburn, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Ken Raeburn <raeburn@raeburn.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Sat, 14 Jan 2017 11:55:42 +0100
> 
> On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > The line number in the error message is bogus, it points to a require
> > line (that's a known issue with byte-compiler error reporting, I
> > think).
> 
> It's not bogus, since the error was raised while the byte-compiler
> evaluated the form there.

It's "bogus" in the sense that it isn't useful for finding the code
which triggered the error.




* Re: Skipping unexec via a big .elc file
  2017-01-14 11:07                                                                                     ` Eli Zaretskii
@ 2017-01-14 11:26                                                                                       ` Alan Mackenzie
  2017-01-14 12:19                                                                                       ` Andreas Schwab
  1 sibling, 0 replies; 375+ messages in thread
From: Alan Mackenzie @ 2017-01-14 11:26 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, Andreas Schwab, monnier, emacs-devel

Hello, Eli.

On Sat, Jan 14, 2017 at 01:07:17PM +0200, Eli Zaretskii wrote:
> > From: Andreas Schwab <schwab@linux-m68k.org>
> > Cc: Ken Raeburn <raeburn@raeburn.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> > Date: Sat, 14 Jan 2017 11:55:42 +0100

> > On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

> > > The line number in the error message is bogus, it points to a require
> > > line (that's a known issue with byte-compiler error reporting, I
> > > think).

> > It's not bogus, since the error was raised while the byte-compiler
> > evaluated the form there.

> It's "bogus" in the sense that it isn't useful for finding the code
> which triggered the error.

Just as a matter of interest, I spent quite a bit of time in the summer
trying to fix this.  My approach was this:
(i) The modified reader created an association list between each cons it
  creates and the source code position.
(ii) Each time a compiler function transformed such a cons, instead of
  the function returning the transformed form, it did setcar/setcdr into
  the original cons to preserve the mapping in the association list.
(iii) On emitting an error/warning, the compiler would look up the
  source code position in the association list.
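
A toy sketch of steps (i)-(iii), for illustration only.  It is not the
code from that experiment; every `my-' name is hypothetical, and a real
reader would record positions for every cons it creates, not just the
one shown here:

```elisp
;; Hypothetical names throughout; this only models the bookkeeping.
(defvar my-pos-alist nil
  "Alist mapping cons cells (compared with `eq') to source positions.")

(defun my-record (cell pos)
  "Remember that CELL was read at source position POS; return CELL."
  (push (cons cell pos) my-pos-alist)
  cell)

(defun my-rewrite-in-place (cell new-form)
  "Overwrite CELL with the car and cdr of NEW-FORM and return CELL.
CELL keeps its identity, so its entry in `my-pos-alist' stays valid."
  (setcar cell (car new-form))
  (setcdr cell (cdr new-form))
  cell)

(defun my-position-of (cell)
  "Return the recorded source position of CELL, or nil if unknown."
  (cdr (assq cell my-pos-alist)))

;; Step (i): the "reader" records where the form started.
(let ((form (my-record (list 'when 'x '(foo)) 120)))
  ;; Step (ii): a transformation rewrites the form, reusing the same cons.
  (my-rewrite-in-place form '(if x (progn (foo))))
  ;; Step (iii): a later diagnostic can still look the position up.
  (message "%S came from position %S" form (my-position-of form)))
```

The lookup uses `assq', so it depends on the transformed form being the
very same cons that the reader recorded, which is why step (ii) mutates
instead of returning a fresh list.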

I'm confident that such an approach would work.  However, it was an
enormous amount of work to adapt the compiler, and I got distracted by
other things, so I haven't managed to produce anything workable yet.  At
least there's already a reliable test suite for this, namely make
bootstrap.  :-)

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 11:07                                                                                     ` Eli Zaretskii
  2017-01-14 11:26                                                                                       ` Alan Mackenzie
@ 2017-01-14 12:19                                                                                       ` Andreas Schwab
  2017-01-14 13:05                                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2017-01-14 12:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: Ken Raeburn <raeburn@raeburn.org>,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> Date: Sat, 14 Jan 2017 11:55:42 +0100
>> 
>> On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:
>> 
>> > The line number in the error message is bogus, it points to a require
>> > line (that's a known issue with byte-compiler error reporting, I
>> > think).
>> 
>> It's not bogus, since the error was raised while the byte-compiler
>> evaluated the form there.
>
> It's "bogus" in the sense that it isn't useful for finding the code
> which triggered the error.

It is as bogus as every Lisp error.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 12:19                                                                                       ` Andreas Schwab
@ 2017-01-14 13:05                                                                                         ` Eli Zaretskii
  2017-01-14 15:12                                                                                           ` Andreas Schwab
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-14 13:05 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: raeburn, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Sat, 14 Jan 2017 13:19:12 +0100
> Cc: raeburn@raeburn.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> 
> >> > The line number in the error message is bogus, it points to a require
> >> > line (that's a known issue with byte-compiler error reporting, I
> >> > think).
> >> 
> >> It's not bogus, since the error was raised while the byte-compiler
> >> evaluated the form there.
> >
> > It's "bogus" in the sense that it isn't useful for finding the code
> > which triggered the error.
> 
> It is as bogus as every Lisp error.

On the contrary: most of them provide useful information about the
error locus.  This one clearly didn't.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 13:05                                                                                         ` Eli Zaretskii
@ 2017-01-14 15:12                                                                                           ` Andreas Schwab
  2017-01-14 17:37                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Andreas Schwab @ 2017-01-14 15:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Date: Sat, 14 Jan 2017 13:19:12 +0100
>> Cc: raeburn@raeburn.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> 
>> >> > The line number in the error message is bogus, it points to a require
>> >> > line (that's a known issue with byte-compiler error reporting, I
>> >> > think).
>> >> 
>> >> It's not bogus, since the error was raised while the byte-compiler
>> >> evaluated the form there.
>> >
>> > It's "bogus" in the sense that it isn't useful for finding the code
>> > which triggered the error.
>> 
>> It is as bogus as every Lisp error.
>
> On the contrary: most of them provide useful information about the
> error locus.  This one clearly didn't.

It accurately tells you the form that caused the error, something you
never get from a Lisp error.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 15:12                                                                                           ` Andreas Schwab
@ 2017-01-14 17:37                                                                                             ` Eli Zaretskii
  2017-01-14 18:50                                                                                               ` Andreas Schwab
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-14 17:37 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: raeburn, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: raeburn@raeburn.org,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Sat, 14 Jan 2017 16:12:22 +0100
> 
> >> It is as bogus as every Lisp error.
> >
> > On the contrary: most of them provide useful information about the
> > error locus.  This one clearly didn't.
> 
> It accurately tells you the form that caused the error, something you
> never get from a Lisp error.

It's accurate, but utterly useless.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 17:37                                                                                             ` Eli Zaretskii
@ 2017-01-14 18:50                                                                                               ` Andreas Schwab
  0 siblings, 0 replies; 375+ messages in thread
From: Andreas Schwab @ 2017-01-14 18:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: raeburn@raeburn.org,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
>> Date: Sat, 14 Jan 2017 16:12:22 +0100
>> 
>> >> It is as bogus as every Lisp error.
>> >
>> > On the contrary: most of them provide useful information about the
>> > error locus.  This one clearly didn't.
>> 
>> It accurately tells you the form that caused the error, something you
>> never get from a Lisp error.
>
> It's accurate, but utterly useless.

No, it isn't.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 10:41                                                                                 ` Eli Zaretskii
  2017-01-14 10:55                                                                                   ` Andreas Schwab
@ 2017-01-14 15:30                                                                                   ` Stefan Monnier
  2017-01-14 17:42                                                                                     ` Eli Zaretskii
  2017-01-21  7:58                                                                                   ` Ken Raeburn
  2017-02-02  9:10                                                                                   ` Ken Raeburn
  3 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-01-14 15:30 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ken Raeburn, schwab, emacs-devel

>   leim/ja-dic/ja-dic.el:76:1:Error: Args out of range: " s  ", 2432, 2432
[...]
> The line number in the error message is bogus, it points to a require
> line (that's a known issue with byte-compiler error reporting, I
> think).

I don't think it's "bogus": it says that the error occurred while
compiling that `require` line, i.e. while loading the corresponding file.
You can set byte-compile-debug (along with debug-on-error) to get
a backtrace which will be more useful.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 15:30                                                                                   ` Stefan Monnier
@ 2017-01-14 17:42                                                                                     ` Eli Zaretskii
  2017-01-14 18:11                                                                                       ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-14 17:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: raeburn, schwab, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 14 Jan 2017 10:30:45 -0500
> Cc: Ken Raeburn <raeburn@raeburn.org>, schwab@linux-m68k.org,
> 	emacs-devel@gnu.org
> 
> You can set byte-compile-debug (along with debug-on-error) to get
> a backtrace which will be more useful.

That doesn't help when one is presented with a build log.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 17:42                                                                                     ` Eli Zaretskii
@ 2017-01-14 18:11                                                                                       ` Stefan Monnier
  2017-01-14 20:13                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-01-14 18:11 UTC (permalink / raw)
  To: emacs-devel

>> You can set byte-compile-debug (along with debug-on-error) to get
>> a backtrace which will be more useful.
> That doesn't help when one is presented with a build log.

Not directly, no, indeed.  Usually I then fire an interactive Emacs, set
the vars and call byte-compile-file to reproduce the problem in an
environment where I can investigate the backtrace comfortably.
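
A minimal sketch of that workflow, to be evaluated in an interactive
session (for instance in *scratch*).  The file name is taken from the
error message quoted earlier in the thread and is assumed to be relative
to the top of the build tree; substitute whichever file the build log
names:

```elisp
;; Both variables exist in Emacs; the file path below is only an example.
(setq debug-on-error t       ; enter the debugger when an error is signaled
      byte-compile-debug t)  ; raise compile-time errors instead of logging them

;; Re-run the compilation that failed in the build log.
(byte-compile-file "leim/ja-dic/ja-dic.el")
```

With both variables set, the error should land in the debugger with a
backtrace rather than being reduced to a single line in the compile log.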

My point was simply that this is an *evaluation* error more than an error
in the compiled code, so the poverty of the info is due to the poverty
of info we get when running Elisp code (and this is indeed somewhat
linked to the byte-compiler since the byte-compiler doesn't preserve
the source location in the bytecode it emits).

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 18:11                                                                                       ` Stefan Monnier
@ 2017-01-14 20:13                                                                                         ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-01-14 20:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 14 Jan 2017 13:11:26 -0500
> 
> >> You can set byte-compile-debug (along with debug-on-error) to get
> >> a backtrace which will be more useful.
> > That doesn't help when one is presented with a build log.
> 
> Not directly, no, indeed.  Usually I then fire an interactive Emacs, set
> the vars and call byte-compile-file to reproduce the problem in an
> environment where I can investigate the backtrace comfortably.

There's more than one way of tracking the real locus of the problem.
My point is that either way, it's an annoyance which makes
investigation of such problems significantly less efficient than when
the byte compiler points out the source file and the line number where
it happens, or close thereabouts (which is what happens most of the
time).

I gather that we are in violent agreement about that.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 10:41                                                                                 ` Eli Zaretskii
  2017-01-14 10:55                                                                                   ` Andreas Schwab
  2017-01-14 15:30                                                                                   ` Stefan Monnier
@ 2017-01-21  7:58                                                                                   ` Ken Raeburn
  2017-01-22 16:55                                                                                     ` Ken Raeburn
  2017-02-02  9:10                                                                                   ` Ken Raeburn
  3 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-01-21  7:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andreas Schwab, Stefan Monnier, Emacs developers

I think I may have figured out why the face-cache crashes I was getting weren’t very reproducible.  Some of the face creation code paths ensure that a cache exists for a frame before using it (the handling of “menu” in internal-set-lisp-face-attribute, for example), and some do not.  In a regular Emacs build, the order of operations in the C and Lisp code dictates the order in which face definitions are processed.  So, for example, in a batch-mode test invocation I tried, the “menu” face handling created the cache for frame “F1” before using it.

With dumped.elc, face property settings get restored, but the generated code assumes that the order doesn’t matter, so the list of face names depends not just on which Lisp code was loaded, but also on the order in which the symbols are seen under “mapatoms”, i.e., on load order and the obarray size.  (So my Mac/NS and GNU/Linux/X11 builds have different lists of names, in different orders.)

I’m looking at internal-set-lisp-face-attribute as a place to always ensure the existence of the cache, but there may be a better location.

On Jan 14, 2017, at 05:41, Eli Zaretskii <eliz@gnu.org> wrote:
> [… much about failures I’m still looking at…]
> One other thing I noticed is that most of the *.elc files produced by
> this build are different from those I see on master.  The differences
> are sometimes just a few bytes (e.g., in mule-diag.elc), but sometimes
> much larger (e.g., files.elc).  Perhaps this points to some subtle
> problem in byte compilation?  But even if so, that cannot explain the
> failure to compile eww.el and ja-dic.el.

I built a couple versions, and found several .elc files different.  The first case I looked at was macroexp--const-symbol-p in macroexp.elc.  From disassembling, it appears that the expression “(boundp 'byte-compile-const-variables)” is optimized out in the build from the branch point, but not in the build including the dumped.elc changes.  I’m not sure why yet, but it’s almost certainly a bug that they’re different.  And a bug affecting the emacs-lisp environment and/or the byte compiler output could certainly cause later attempts at byte compilation (using newly byte-compiled code) to misbehave.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-21  7:58                                                                                   ` Ken Raeburn
@ 2017-01-22 16:55                                                                                     ` Ken Raeburn
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-01-22 16:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andreas Schwab, Stefan Monnier, Emacs developers


On Jan 21, 2017, at 02:58, Ken Raeburn <raeburn@raeburn.org> wrote:
> I built a couple versions, and found several .elc files different.  The first case I looked at was macroexp--const-symbol-p in macroexp.elc.  From disassembling, it appears that the expression “(boundp 'byte-compile-const-variables)” is optimized out in the build from the branch point, but not in the build including the dumped.elc changes.  I’m not sure why yet, but it’s almost certainly a bug that they’re different.  And a bug affecting the emacs-lisp environment and/or the byte compiler output could certainly cause later attempts at byte compilation (using newly byte-compiled code) to misbehave.

Ah, this may be a false alarm.  I’d overlooked the fact that the updated version (October 31) of Stefan’s patch changed that code to insert that expression on the branch, and I assumed the two were compiling the same source.  But if byte-compile-const-variables can be seen as unbound, that could also alter the optimization results compared to the master branch.  Perhaps that should be fixed, if possible.

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-01-14 10:41                                                                                 ` Eli Zaretskii
                                                                                                     ` (2 preceding siblings ...)
  2017-01-21  7:58                                                                                   ` Ken Raeburn
@ 2017-02-02  9:10                                                                                   ` Ken Raeburn
  2017-02-04 10:37                                                                                     ` Eli Zaretskii
  3 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-02-02  9:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Andreas Schwab, Stefan Monnier, emacs-devel

On Jan 14, 2017, at 05:41, Eli Zaretskii <eliz@gnu.org> wrote:
> It does fail for me while byte compiling 2 files:

I still haven’t hit these.

> The category '>' is defined in characters.el.  Surprisingly,
> characters.elc in this branch is identical to the file on master, so
> byte compilation (see below) is off the hook here.  What else could
> explain that this category is deemed unknown?

I recently noticed the standard syntax and category tables don’t appear to be among the information dumped out, so anything set by characters.el isn’t preserved.  I also didn’t see an existing way to restore them trivially.  (A huge list of “modify-syntax-entry” calls and such seems impractical.)  Also, the buffer-local nature of some variables was being lost; that presented itself as a failure to get syntax-based highlighting in C source files.

Having patched around these, I’m still failing on the same file, but later; it prompts me for the coding system to use to write an output file, because it’s not valid UTF-8.  Apparently the leading comments copied from SKK-JISYO.L are being corrupted.  The first non-ASCII bytes in the buffer (in the “ACKNOWLEDGEMENTS” part of the comment) are 0xe3, 0x81, 0x93, 0xe3, 0x81, 0xae, 0xe8 in a normal build and 0xf5, 0x80, 0x84, 0xac, 0xf5, 0x80, 0x85, 0x87 in my build.  I discovered this maybe half an hour ago, so that’s as far as I’ve gotten.

I’ve just pushed my changes, plus your change to avoid the dump-emacs call, to the branch.  Saving and restoring the standard syntax table seems cheap, because it was already referenced by other objects that were dumped out, but the standard category table almost doubles the size of my dumped.elc, and presumably increases the time to read it accordingly.  Perhaps reading characters.el(c) at startup would be a better choice.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-02  9:10                                                                                   ` Ken Raeburn
@ 2017-02-04 10:37                                                                                     ` Eli Zaretskii
  2017-02-05 14:19                                                                                       ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-02-04 10:37 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Thu, 2 Feb 2017 04:10:38 -0500
> Cc: Andreas Schwab <schwab@linux-m68k.org>,
>  Stefan Monnier <monnier@iro.umontreal.ca>,
>  emacs-devel@gnu.org
> 
> Perhaps reading characters.el(c) at startup would be a better
> choice.

How about changing emacs.c to read characters.el(c) just after
dumped.elc?

Alternatively, would it be possible to simply append characters.elc to
the end of dumped.elc, as part of preparing the latter?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-04 10:37                                                                                     ` Eli Zaretskii
@ 2017-02-05 14:19                                                                                       ` Ken Raeburn
  2017-02-05 15:51                                                                                         ` Eli Zaretskii
                                                                                                           ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-02-05 14:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs developers

> On Feb 4, 2017, at 05:37, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Thu, 2 Feb 2017 04:10:38 -0500
>> Cc: Andreas Schwab <schwab@linux-m68k.org>,
>> Stefan Monnier <monnier@iro.umontreal.ca>,
>> emacs-devel@gnu.org
>> 
>> Perhaps reading characters.el(c) at startup would be a better
>> choice.
> 
> How about changing emacs.c to read characters.el(c) just after
> dumped.elc?
> 
> Alternatively, would it be possible to simply append characters.elc to
> the end of dumped.elc, as part of preparing the latter?

For now, I changed loadup.el to emit a “load” form to get characters.elc at startup, and that seems to be working.  Copying the contents of characters.elc may be very slightly faster, but I haven’t done any timing tests.

I also tracked down my new ja-dic-cnv problem.  It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with.  Calling unify-charset on the various charsets seems to be needed.

With that change, I’m able to run “make bootstrap” in a GNU/Linux/X11 configuration and it runs to completion.  I haven’t yet tested it on macOS, or compared the .elc for the differences you were describing.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-05 14:19                                                                                       ` Ken Raeburn
@ 2017-02-05 15:51                                                                                         ` Eli Zaretskii
  2017-02-05 23:19                                                                                           ` Ken Raeburn
  2017-02-05 20:03                                                                                         ` Ken Brown
  2017-02-25 14:52                                                                                         ` Eli Zaretskii
  2 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-02-05 15:51 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 5 Feb 2017 09:19:38 -0500
> Cc: Emacs developers <emacs-devel@gnu.org>
> 
> For now, I changed loadup.el to emit a “load” form to get characters.elc at startup, and that seems to be working.  Copying the contents of characters.elc may be very slightly faster, but I haven’t done any timing tests.
> 
> I also tracked down my new ja-dic-cnv problem.  It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with.  Calling unify-charset on the various charsets seems to be needed.
> 
> With that change, I’m able to run “make bootstrap” in a GNU/Linux/X11 configuration and it runs to completion.  I haven’t yet tested it on macOS, or compared the .elc for the differences you were describing.

Thanks.  Are those changes committed to the branch?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-05 15:51                                                                                         ` Eli Zaretskii
@ 2017-02-05 23:19                                                                                           ` Ken Raeburn
  2017-02-06 15:20                                                                                             ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-02-05 23:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On Feb 5, 2017, at 10:51, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> Thanks.  Are those changes committed to the branch?

Yes, I pushed them this morning.

Ken



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-05 23:19                                                                                           ` Ken Raeburn
@ 2017-02-06 15:20                                                                                             ` Ken Raeburn
  2017-02-06 15:39                                                                                               ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-02-06 15:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

And now I’ve got a *possible* explanation for why I’m seeing differences in some .elc files.

It appears that the .elc output can vary depending on whether other loaded Lisp code was compiled or not.

I found differences in diary-lib.elc between two of my build trees (raeburn-startup branch, and the branch point).  In the first function I looked at, the differences came down to this:

335  constant  (12 31 -1)    constant  (12 31 -1)
336  dup                     dup
337  varbind   date          varbind   date
339  dup                     varbind   date
340  varbind   date          constant  (12 31 -1)
342  car                     car
343  unbind    1             unbind    1

The constant here comes from calendar-absolute-from-gregorian, a defsubst in calendar.el.  I tried with one source base (at the branch point), compiling diary-lib.el with calendar.elc present, and again with calendar.elc missing so that calendar.el would get used.  The generated .elc files showed the same differences.

This is arguably a bug, but not one added by the big-elc changes.

I almost always build with a make option like “-j4”, so the timing of byte compilations of different files relative to one another isn’t entirely predictable.  This *could* account for a lot of differences I’m seeing.  It’s not a sure thing that this sort of thing is the only cause of differences; I’ll have to do a fully serialized bootstrap of both versions of the code to see.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-06 15:20                                                                                             ` Ken Raeburn
@ 2017-02-06 15:39                                                                                               ` Stefan Monnier
  2017-02-06 19:08                                                                                                 ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-02-06 15:39 UTC (permalink / raw)
  To: emacs-devel

> It appears that the .elc output can vary depending on whether other loaded
> Lisp code was compiled or not.

Indeed: the culprit is the defsubst implementation.  Currently, if
a function is byte-compiled, the optimizer inlines its byte-codes
and when it's not yet byte-compiled, then it inlines the source code.

We should probably change that so that when it finds that the defsubst
function is not yet byte-compiled, it byte-compiles it and then inlines
the resulting byte-codes.
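
A small sketch of the two states a defsubst can be in, assuming the
forms are evaluated from source (M-x eval-buffer, say).  The name
`my-year-end' is hypothetical; the point is only that whichever state
the definition is in when another file is compiled decides which of
the two inlining paths described above gets taken:

```elisp
;; Hypothetical defsubst, used only to show the two states.
(defsubst my-year-end ()
  "Return a constant list."
  (list 12 31 -1))

;; Evaluated from source: the definition is not byte-code yet, so a
;; compiler inlining `my-year-end' now would work from the source form.
(message "compiled before: %S"
         (byte-code-function-p (symbol-function 'my-year-end)))

;; Compile the definition in place.
(byte-compile 'my-year-end)

;; Now the definition is a byte-code object: the other inlining path.
(message "compiled after: %S"
         (byte-code-function-p (symbol-function 'my-year-end)))
```

Compiling a caller before and after the `byte-compile' call and running
`disassemble' on each result is one way to compare the two outputs.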

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-06 15:39                                                                                               ` Stefan Monnier
@ 2017-02-06 19:08                                                                                                 ` Ken Raeburn
  2017-02-06 22:39                                                                                                   ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-02-06 19:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


> On Feb 6, 2017, at 10:39, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> 
>> It appears that the .elc output can vary depending on whether other loaded
>> Lisp code was compiled or not.
> 
> Indeed: the culprit is the defsubst implementation.  Currently, if
> a function is byte-compiled, the optimizer inlines its byte-codes
> and when it's not yet byte-compiled, then it inlines the source code.
> 
> We should probably change that so that when it finds that the defsubst
> function is not yet byte-compiled, it byte-compiles it and then inlines
> the resulting byte-codes.

Is this a known (and filed) bug?  A quick search for defsubst in debbugs only finds me one unrelated report.

In any case, doing “make bootstrap” from clean trees (which I’m assuming will byte-compile files in the same order each time) still gets me a few differences between the branch point and the branch, including python.elc differing in use of dynamic docstrings, and url-handler.elc file-name-handler wrappers saying “no original documentation”.  Still more to debug, I guess.

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-06 19:08                                                                                                 ` Ken Raeburn
@ 2017-02-06 22:39                                                                                                   ` Stefan Monnier
  2017-02-08 10:31                                                                                                     ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-02-06 22:39 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> Is this a known (and filed) bug?

I don't think it's filed, no.  I've known about it for a while now, and
it came up "recently" in the discussion about reproducible builds.
Until then it wasn't considered a real bug, I think, more like
a quirk.

> In any case, doing “make bootstrap” from clean trees (which I’m assuming
> will byte-compile files in the same order each time)

Not sure if make guarantees a specific order of execution in that case,
but in my experience I think it does operate in a deterministic way, indeed.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-06 22:39                                                                                                   ` Stefan Monnier
@ 2017-02-08 10:31                                                                                                     ` Ken Raeburn
  2017-02-08 14:38                                                                                                       ` Ken Brown
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-02-08 10:31 UTC (permalink / raw)
  To: Emacs developers

On Feb 6, 2017, at 17:39, Stefan Monnier <monnier@IRO.UMontreal.CA> wrote:

>> Is this a known (and filed) bug?
> 
> I don't think it's filed, no.  I've known about it for a while now, and
> it came up "recently" in the discussion about reproducible builds.
> Until then it wasn't considered as a real bug, I think, more like
> a quirk.

Ah, okay.  I didn’t follow that discussion closely.  I haven’t got the bandwidth to keep up on everything, and until now I thought I didn’t care about this one. :-)

With my bootstrap builds running without parallel make, I’ve gotten things much further along in terms of generating .elc files that match what I get without all the big-elc changes.

The difference in progmodes/python.elc came down to the use of UTF-8 in the environment during byte compilation affecting the generated doc strings (using format-message in a macro).  Removing internal--text-quoting-flag from the stuff saved in dumped.elc made the files match for me on my Mac (with UTF-8 in use by default).  I think that’s just papering over the real problem (the macro’s result shouldn’t depend on the UTF-8-ness of the environment), but the flag should reflect the environment of the current Emacs invocation anyway, not the one that produced dumped.elc.
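
To illustrate the dependency (a minimal sketch, not code from the branch): format-message translates grave accents and apostrophes in its format string according to text-quoting-style, so a macro that calls it during expansion bakes the current style into the .elc.  The function name and the message text below are made up for the example.

```elisp
;; Hypothetical helper: format the same message under three quoting
;; styles.  A macro calling format-message while it expands would
;; freeze whichever result the compiling Emacs produced into the .elc.
(defun demo-quote-styles ()
  "Return the same message formatted under three quoting styles."
  (mapcar (lambda (style)
            (let ((text-quoting-style style))
              (format-message "Use `%s' here" "foo")))
          '(curve straight grave)))
```

With text-quoting-style left at nil, Emacs picks a style depending on whether curved quotes seem to be displayable in the current environment, which is consistent with doc strings that differ between a UTF-8 and a non-UTF-8 build.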

The difference in url/url-handler.elc was because the subr doc strings were getting lost.  The numbers (“DOC” file offsets) stored in the Lisp_Subr structure weren’t preserved, so url-handlers-create-wrapper would just fill in “No original documentation.”  I’m making dumped.elc invoke Snarf-documentation for now.

A tangent: As it happens, a couple years back I was experimenting with having C-based subr/variable documentation stuffed into the executable instead of needing the DOC file, in ways that wouldn’t add a lot of Lisp data unless the doc strings were actually needed.  For subr documentation, it doesn’t create Lisp strings until they’re requested.  For variables, I’ve got an idea on deferring the Lisp string creation, but currently they’re created at startup and stuffed into the property list.  I’ve just updated it to recent Emacs sources, in case we might want to explore that direction further; it might be more efficient than patching up doc pointers every time we start up.

Anyway, with the changes I’ve just pushed to the branch, my bootstrapped tree has .elc files that match those built from the branch point, except for mule.elc, macroexp.elc (both source files changed on the branch), bytecomp.elc and byte-opt.elc (probably due to macroexp changes).

I haven’t tried any more extensive testing.
There may be some funny stuff going on in restoring the charset definitions that I still need to look into.

I haven’t pulled in Ken Brown’s Cygwin changes; Ken, feel free to push those to the branch as well.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-08 10:31                                                                                                     ` Ken Raeburn
@ 2017-02-08 14:38                                                                                                       ` Ken Brown
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Brown @ 2017-02-08 14:38 UTC (permalink / raw)
  To: Ken Raeburn, Emacs developers

On 2/8/2017 5:31 AM, Ken Raeburn wrote:
> I haven’t pulled in Ken Brown’s Cygwin changes; Ken, feel free to push those to the branch as well.

Done.

Ken




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-05 14:19                                                                                       ` Ken Raeburn
  2017-02-05 15:51                                                                                         ` Eli Zaretskii
@ 2017-02-05 20:03                                                                                         ` Ken Brown
  2017-02-25 14:52                                                                                         ` Eli Zaretskii
  2 siblings, 0 replies; 375+ messages in thread
From: Ken Brown @ 2017-02-05 20:03 UTC (permalink / raw)
  To: Ken Raeburn, Eli Zaretskii; +Cc: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 230 bytes --]

On 2/5/2017 9:19 AM, Ken Raeburn wrote:
> With that change, I’m able to run “make bootstrap” in a GNU/Linux/X11 configuration and it runs to completion.

The attached patch enables the build to succeed on Cygwin.

Ken

[-- Attachment #2: 0001-Fix-build-on-Cygwin.patch --]
[-- Type: text/plain, Size: 1518 bytes --]

From 165c3356ebb5413277197e4e17c97e7758f96396 Mon Sep 17 00:00:00 2001
From: Ken Brown <kbrown@cornell.edu>
Date: Sun, 5 Feb 2017 14:58:59 -0500
Subject: [PATCH] Fix build on Cygwin

* configure.ac: Use system malloc on Cygwin.

* lisp/loadup.el: Use ".exe" suffix on Cygwin.
---
 configure.ac   | 4 +---
 lisp/loadup.el | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index 425e338..c1fd14d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2158,9 +2158,7 @@ AC_DEFUN
 test "$CANNOT_DUMP" = yes ||
 case "$opsys" in
   ## darwin ld insists on the use of malloc routines in the System framework.
-  darwin | mingw32 | nacl | sol2-10) ;;
-  cygwin) hybrid_malloc=yes
-          system_malloc= ;;
+  cygwin | darwin | mingw32 | nacl | sol2-10) ;;
   *) test "$ac_cv_func_sbrk" = yes && system_malloc=$emacs_cv_sanitize_address;;
 esac
 
diff --git a/lisp/loadup.el b/lisp/loadup.el
index 80e9a28..72f24a6 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -455,7 +455,7 @@
       ;; other GNU program's build process.
       ;; (dump-emacs "emacs" "temacs")
       ;; (message "%d pure bytes used" pure-bytes-used)
-      (let ((exe (if (memq system-type '(windows-nt ms-dos)) ".exe" "")))
+      (let ((exe (if (memq system-type '(cygwin windows-nt ms-dos)) ".exe" "")))
         (copy-file (expand-file-name (concat "temacs" exe) invocation-directory)
                    (expand-file-name (concat "emacs" exe) invocation-directory)
                    t)
-- 
2.8.3


^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-05 14:19                                                                                       ` Ken Raeburn
  2017-02-05 15:51                                                                                         ` Eli Zaretskii
  2017-02-05 20:03                                                                                         ` Ken Brown
@ 2017-02-25 14:52                                                                                         ` Eli Zaretskii
  2017-02-25 15:19                                                                                           ` Eli Zaretskii
  2017-02-26 12:37                                                                                           ` Ken Raeburn
  2 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-02-25 14:52 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 5 Feb 2017 09:19:38 -0500
> Cc: Emacs developers <emacs-devel@gnu.org>
> 
> I also tracked down my new ja-dic-cnv problem.  It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with.  Calling unify-charset on the various charsets seems to be needed.

Is this part in the repository?  Because I still get prompted for an
encoding when producing ja-dic.el:

    GEN      ../lisp/leim/ja-dic/ja-dic.el
  Reading file "d:/gnu/git/emacs/no-unexec/leim/SKK-DIC/SKK-JISYO.L" ...
  Processing OKURI-ARI entries ...
  Processing POSTFIX entries ...
  Processing PREFIX entries ...
  Collecting OKURI-NASI entries ...
  collected 26% ...
  collected 30% ...
  collected 40% ...
  collected 50% ...
  collected 60% ...
  collected 70% ...
  collected 80% ...
  collected 90% ...
  Processing OKURI-NASI entries ...
  processed 10% ...
  processed 20% ...
  processed 30% ...
  processed 40% ...
  processed 50% ...
  processed 60% ...
  processed 70% ...
  processed 80% ...
  processed 90% ...
  processed 100% ...
  Select coding system (default japanese-shift-jis): utf-8-unix

I needed to type utf-8-unix by hand.  Any ideas?  Is it possible that
this happens because my default encoding is not UTF-8?

I also pushed a small Windows-specific change to the branch, to allow
Windows users to try building this branch.

Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-25 14:52                                                                                         ` Eli Zaretskii
@ 2017-02-25 15:19                                                                                           ` Eli Zaretskii
  2017-02-26 12:37                                                                                           ` Ken Raeburn
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-02-25 15:19 UTC (permalink / raw)
  To: raeburn; +Cc: emacs-devel

> Date: Sat, 25 Feb 2017 16:52:12 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
> 
> Is this part in the repository?  Because I still get prompted for an
> encoding when producing ja-dic.el:
> 
>     GEN      ../lisp/leim/ja-dic/ja-dic.el

Also, it looks like the logic in startup.el that should bypass certain
stuff under -Q isn't working, because I see my abbrevs being loaded
even though I invoked "emacs -Q".  Thoughts?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-25 14:52                                                                                         ` Eli Zaretskii
  2017-02-25 15:19                                                                                           ` Eli Zaretskii
@ 2017-02-26 12:37                                                                                           ` Ken Raeburn
  2017-03-04 14:23                                                                                             ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-02-26 12:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


> On Feb 25, 2017, at 09:52, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Sun, 5 Feb 2017 09:19:38 -0500
>> Cc: Emacs developers <emacs-devel@gnu.org>
>> 
>> I also tracked down my new ja-dic-cnv problem.  It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with.  Calling unify-charset on the various charsets seems to be needed.
> 
> Is this part in the repository?  Because I still get prompted for an
> encoding when producing ja-dic.el:

Yes, change d864464 has the unify-charset changes.



> 
>    GEN      ../lisp/leim/ja-dic/ja-dic.el
>  Reading file "d:/gnu/git/emacs/no-unexec/leim/SKK-DIC/SKK-JISYO.L" ...
>  Processing OKURI-ARI entries ...
>  Processing POSTFIX entries ...
>  Processing PREFIX entries ...
>  Collecting OKURI-NASI entries ...
>  collected 26% ...
>  collected 30% ...
>  collected 40% ...
>  collected 50% ...
>  collected 60% ...
>  collected 70% ...
>  collected 80% ...
>  collected 90% ...
>  Processing OKURI-NASI entries ...
>  processed 10% ...
>  processed 20% ...
>  processed 30% ...
>  processed 40% ...
>  processed 50% ...
>  processed 60% ...
>  processed 70% ...
>  processed 80% ...
>  processed 90% ...
>  processed 100% ...
>  Select coding system (default japanese-shift-jis): utf-8-unix
> 
> I needed to type utf-8-unix by hand.  Any ideas?  Is it possible that
> this happens because my default encoding is not UTF-8?

Looks like my environment has LANG=en_US.UTF-8, on Mac and GNU/Linux.  But setting LANG=C or en_US.ISO8859-1 doesn’t seem to cause the build to get hung up this way for me.

Did you do a full bootstrap after updating?  An outdated dumped.elc could certainly do this, and I know at least some of the dependencies aren’t current with the changes on the branch.  (I’ve taken to going as far as “git clean -f -d -x”, then using autogen.sh, configure, and “make bootstrap”, fairly often.)

> I also pushed a small Windows-specific change to the branch, to allow
> Windows users to try building this branch.

Great!

> Also, it looks like the logic in startup.el that should bypass certain
> stuff under -Q isn't working, because I see my abbrevs being loaded
> even though I invoked "emacs -Q".  Thoughts?

Strange… this is also working for me.  At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”.

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-02-26 12:37                                                                                           ` Ken Raeburn
@ 2017-03-04 14:23                                                                                             ` Eli Zaretskii
  2017-03-06  8:46                                                                                               ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-03-04 14:23 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 26 Feb 2017 07:37:56 -0500
> Cc: emacs-devel@gnu.org
> 
> >    GEN      ../lisp/leim/ja-dic/ja-dic.el
> >  Reading file "d:/gnu/git/emacs/no-unexec/leim/SKK-DIC/SKK-JISYO.L" ...
> >  Processing OKURI-ARI entries ...
> >  Processing POSTFIX entries ...
> >  Processing PREFIX entries ...
> >  Collecting OKURI-NASI entries ...
> >  collected 26% ...
> >  collected 30% ...
> >  collected 40% ...
> >  collected 50% ...
> >  collected 60% ...
> >  collected 70% ...
> >  collected 80% ...
> >  collected 90% ...
> >  Processing OKURI-NASI entries ...
> >  processed 10% ...
> >  processed 20% ...
> >  processed 30% ...
> >  processed 40% ...
> >  processed 50% ...
> >  processed 60% ...
> >  processed 70% ...
> >  processed 80% ...
> >  processed 90% ...
> >  processed 100% ...
> >  Select coding system (default japanese-shift-jis): utf-8-unix
> > 
> > I needed to type utf-8-unix by hand.  Any ideas?  Is it possible that
> > this happens because my default encoding is not UTF-8?
> 
> Looks like my environment has LANG=en_US.UTF-8, on Mac and GNU/Linux.  But setting LANG=C or en_US.ISO8859-1 doesn’t seem to cause the build to get hung up this way for me.
> 
> Did you do a full bootstrap after updating?  An outdated dumped.elc could certainly do this, and I know at least some of the dependencies aren’t current with the changes on the branch.  (I’ve taken to going as far as “git clean -f -d -x”, then using autogen.sh, configure, and “make bootstrap”, fairly often.)

I've bootstrapped now, and this problem is gone.  Thanks.

> > Also, it looks like the logic in startup.el that should bypass certain
> > stuff under -Q isn't working, because I see my abbrevs being loaded
> > even though I invoked "emacs -Q".  Thoughts?
> 
> Strange… this is also working for me.  At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”.

This problem is still there.  It has nothing to do with loading
~/.emacs, though: startup.el always loads your ~/.emacs.d/abbrev_defs,
if that file exists.  I'm not sure why it loads that file, but I
verified that the master version does that as well.

So the issue here is not that the file is loaded, but how it is
processed.  I only noticed this because my abbrev_defs file uses a
function that is only defined in my .emacs.  So "emacs -Q" on the
raeburn-startup branch barfs because that function is not known.
Strangely, "emacs -Q" on the master branch doesn't signal an error,
and I don't even see Fsignal called if I set a breakpoint there.  I
don't (yet) understand why the behavior differs.

If you insert into your abbrev_defs file something that references a
function which is not defined, do you see the same problem as I do?

Btw, one thing that I saw while debugging is that purify-flag is set
to t while running the startup code.  This is because init_alloc_once
is called during startup (previously, it was only called by temacs).
I don't know if this is related to the issue (setting purify-flag to
nil in Frecursive_edit didn't help), but I thought I'd bring it up,
because maybe we need to set it to nil earlier.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-04 14:23                                                                                             ` Eli Zaretskii
@ 2017-03-06  8:46                                                                                               ` Ken Raeburn
  2017-03-11 12:27                                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-03-06  8:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On Mar 4, 2017, at 09:23, Eli Zaretskii <eliz@gnu.org> wrote:

>>> Also, it looks like the logic in startup.el that should bypass certain
>>> stuff under -Q isn't working, because I see my abbrevs being loaded
>>> even though I invoked "emacs -Q".  Thoughts?
>> 
>> Strange… this is also working for me.  At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”.
> 
> This problem is still there.  It has nothing to do with loading
> ~/.emacs, though: startup.el always loads your ~/.emacs.d/abbrev_defs,
> if that file exists.  I'm not sure why it loads that file, but I
> verified that the master version does that as well.

Odd, seems like -Q should skip that, with the rest of the user’s initializations.

> 
> So the issue here is not that the file is loaded, but how it is
> processed.  I only noticed this because my abbrev_defs file uses a
> function that is only defined in my .emacs.  So "emacs -Q" on the
> raeburn-startup branch barfs because that function is not known.
> Strangely, "emacs -Q" on the master branch doesn't signal an error,
> and I don't even see Fsignal called if I set a breakpoint there.  I
> don't (yet) understand why the different behavior.
> 
> If you insert into your abbrev_defs file something that references a
> function which is not defined, do you see the same problem as I do?

I added a line:

  (missing-function)

in between some define-abbrev-table invocations, and “emacs -Q” on master (2-3 weeks old) and raeburn-startup both complain about it for me.

> Btw, one thing that I saw while debugging is that purify-flag is set
> to t while running the startup code.  This is because init_alloc_once
> is called during startup (previously, it was only called by temacs).
> I don't know if this is related to the issue (setting purify-flag to
> nil in Frecursive_edit didn't help), but I thought I'd bring it up,
> because maybe we need to set it to nil earlier.

I’ve been thinking that the branch should probably set CANNOT_DUMP unconditionally.  The behavior around pure storage and such under CANNOT_DUMP is probably closer to what we want for the branch.  But there’s at least one bug in building with CANNOT_DUMP for macOS I’ve got to clear up first.  As you say, it may not be at all related to the problem you’re running into, though.

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-06  8:46                                                                                               ` Ken Raeburn
@ 2017-03-11 12:27                                                                                                 ` Eli Zaretskii
  2017-03-11 13:18                                                                                                   ` Andreas Schwab
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-03-11 12:27 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 6 Mar 2017 03:46:17 -0500
> Cc: emacs-devel@gnu.org
> 
> On Mar 4, 2017, at 09:23, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> >>> Also, it looks like the logic in startup.el that should bypass certain
> >>> stuff under -Q isn't working, because I see my abbrevs being loaded
> >>> even though I invoked "emacs -Q".  Thoughts?
> >> 
> >> Strange… this is also working for me.  At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”.
> > 
> > This problem is still there.  It has nothing to do with loading
> > ~/.emacs, though: startup.el always loads your ~/.emacs.d/abbrev_defs,
> > if that file exists.  I'm not sure why it loads that file, but I
> > verified that the master version does that as well.
> 
> Odd, seems like -Q should skip that, with the rest of the user’s initializations.

Maybe so, but this code has been there since about forever, and the
documentation of -Q doesn't say that the user's abbrevs are bypassed;
that happens only under -batch.  In any case, it's a separate problem.

> > So the issue here is not that the file is loaded, but how it is
> > processed.  I only noticed this because my abbrev_defs file uses a
> > function that is only defined in my .emacs.  So "emacs -Q" on the
> > raeburn-startup branch barfs because that function is not known.
> > Strangely, "emacs -Q" on the master branch doesn't signal an error,
> > and I don't even see Fsignal called if I set a breakpoint there.  I
> > don't (yet) understand why the different behavior.
> > 
> > If you insert into your abbrev_defs file something that references a
> > function which is not defined, do you see the same problem as I do?
> 
> I added a line:
> 
>   (missing-function)
> 
> in between some define-abbrev-table invocations, and “emacs -Q” on master (2-3 weeks old) and raeburn-startup both complain about it for me.

I debugged this some more: this has nothing to do with unknown
functions, you just need to have global abbrevs in the abbrev_defs
file, for example:

  (define-abbrev-table 'global-abbrev-table
    '(
      ("abbout" "about" nil 0)
      ("abotu" "about" nil 0)))

The problem seems to be that global-abbrev-table is not an abbrev
table at the point where startup.el calls quietly-read-abbrev-file
(abbrev-table-p returns nil for it).  If I make this change:

diff --git a/lisp/startup.el b/lisp/startup.el
index 4a04f9c..7f55962 100644
--- a/lisp/startup.el
+++ b/lisp/startup.el
@@ -1263,6 +1263,8 @@ command-line
 	      (deactivate-mark)))
 
 	;; If the user has a file of abbrevs, read it (unless -batch).
+	(or (abbrev-table-p global-abbrev-table)
+	    (setq global-abbrev-table (make-abbrev-table)))
 	(when (and (not noninteractive)
 		   (file-exists-p abbrev-file-name)
 		   (file-readable-p abbrev-file-name))

then "emacs -Q" starts up normally.  Can you reproduce this?

global-abbrev-table is defined in abbrev.el like this:

  (defvar global-abbrev-table (make-abbrev-table)
    "The abbrev table whose abbrevs affect all buffers.
  Each buffer may also have a local abbrev table.
  If it does, the local table overrides the global one
  for any particular abbrev defined in both.")

So I think the issue could be that this defvar somehow doesn't end up
in dumped.elc as an abbrev table under the new build procedure.  Does that make sense?



^ permalink raw reply related	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-11 12:27                                                                                                 ` Eli Zaretskii
@ 2017-03-11 13:18                                                                                                   ` Andreas Schwab
  2017-03-11 13:42                                                                                                     ` Eli Zaretskii
                                                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Andreas Schwab @ 2017-03-11 13:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ken Raeburn, emacs-devel

On Mär 11 2017, Eli Zaretskii <eliz@gnu.org> wrote:

> So I think the issue could be that this defvar somehow doesn't end up
> in dumped.elc as an abbrev table under the new build procedure.  Does that make sense?

I think the problem is that an abbrev table is actually an obarray,
which does not have a suitable print syntax.

ELISP> global-abbrev-table
[## 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

ELISP> (abbrev-table-p [## 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0])
nil
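
The same round trip can be checked without typing the vector by hand.
This is only a sketch; the function name is made up, and the error
handler is there on the assumption that some Emacs versions print
obarrays in a form the reader rejects outright.

```elisp
(require 'abbrev)

;; Hypothetical helper: print an abbrev table, read the text back,
;; and test both objects with abbrev-table-p.
(defun demo-abbrev-table-round-trip ()
  "Return (ORIGINAL-OK REREAD-OK) for a printed and reread abbrev table."
  (let* ((table (make-abbrev-table))
         (printed (prin1-to-string table))
         ;; If the printed form cannot be read at all, treat that the
         ;; same as reading back something that is not an abbrev table.
         (reread (condition-case nil
                     (car (read-from-string printed))
                   (error nil))))
    (list (abbrev-table-p table)
          (abbrev-table-p reread))))
```

The first element should be non-nil and the second nil: whatever the
reader produces, the properties stored on the table's own "" symbol do
not survive printing.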

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-11 13:18                                                                                                   ` Andreas Schwab
@ 2017-03-11 13:42                                                                                                     ` Eli Zaretskii
  2017-03-11 15:48                                                                                                     ` Stefan Monnier
  2017-03-11 23:59                                                                                                     ` Ken Raeburn
  2 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-03-11 13:42 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: raeburn, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Ken Raeburn <raeburn@raeburn.org>,  emacs-devel@gnu.org
> Date: Sat, 11 Mar 2017 14:18:28 +0100
> 
> On Mär 11 2017, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > So I think the issue could be that this defvar somehow doesn't end up
> > in dumped.elc as an abbrev table under the new build procedure.  Does that make sense?
> 
> I think the problem is that an abbrev table is actually an obarray,
> which does not have a suitable print syntax.

Indeed, makes perfect sense.  Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-11 13:18                                                                                                   ` Andreas Schwab
  2017-03-11 13:42                                                                                                     ` Eli Zaretskii
@ 2017-03-11 15:48                                                                                                     ` Stefan Monnier
  2017-03-11 21:48                                                                                                       ` Richard Stallman
  2017-03-11 23:59                                                                                                     ` Ken Raeburn
  2 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-03-11 15:48 UTC (permalink / raw)
  To: emacs-devel

> I think the problem is that an abbrev table is actually an obarray,
> which does not have a suitable print syntax.

Maybe now would be a good time to change the representation of
abbrev-tables?


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-11 15:48                                                                                                     ` Stefan Monnier
@ 2017-03-11 21:48                                                                                                       ` Richard Stallman
  2017-03-11 22:06                                                                                                         ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2017-03-11 21:48 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Maybe now would be a good time to change the representation of
  > abbrev-tables?

How would we prefer for them to print?

I am not convinced that it would be useful or convenient
to have them print out in a way that describes their contents.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-11 21:48                                                                                                       ` Richard Stallman
@ 2017-03-11 22:06                                                                                                         ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2017-03-11 22:06 UTC (permalink / raw)
  To: emacs-devel

>> Maybe now would be a good time to change the representation of
>> abbrev-tables?
> How would we prefer for them to print?

I'm not talking about "representation" in the sense of "print format"
but in terms of which data-structure to use for them.  But yes, of
course that will affect the way they print.  Clearly, I'd hope that the
new representation would print `read'ably.

> I am not convinced that it would be useful or convenient
> to have them print out in a way that describes their contents.

The way abbrev-tables print right now is just plain bad: it's neither
computer-readable nor human-readable.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-03-11 13:18                                                                                                   ` Andreas Schwab
  2017-03-11 13:42                                                                                                     ` Eli Zaretskii
  2017-03-11 15:48                                                                                                     ` Stefan Monnier
@ 2017-03-11 23:59                                                                                                     ` Ken Raeburn
  2017-03-12 17:06                                                                                                       ` Stefan Monnier
  2017-03-13  8:25                                                                                                       ` Ken Raeburn
  2 siblings, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-03-11 23:59 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eli Zaretskii, emacs-devel


On Mar 11, 2017, at 08:18, Andreas Schwab <schwab@linux-m68k.org> wrote:

> On Mär 11 2017, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> So I think the issue could be that this defvar somehow doesn't end up
>> in dumped.elc as an abbrev table under the new build procedure.  Does that make sense?
> 
> I think the problem is that an abbrev table is actually an obarray,
> which does not have a suitable print syntax.

Ah, yes.  Thanks for noticing that.

And just yesterday I was thinking how convenient — and surprising — it was that we didn’t have to dump out any obarray objects; oh well.  Unless we’re going to arrange for obarrays to be printable and readable in a useful form, they’ll need special-casing.  But abbrev variables should be easy enough to recognize and process.  I’ll take a look.

Ken



* Re: Skipping unexec via a big .elc file
  2017-03-11 23:59                                                                                                     ` Ken Raeburn
@ 2017-03-12 17:06                                                                                                       ` Stefan Monnier
  2017-03-13  8:25                                                                                                       ` Ken Raeburn
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2017-03-12 17:06 UTC (permalink / raw)
  To: emacs-devel

> And just yesterday I was thinking how convenient — and surprising — it was
> that we didn’t have to dump out any obarray objects; oh well.  Unless we’re
> going to arrange for obarrays to be printable and readable in a useful form,
> they’ll need special-casing.  But abbrev variables should be easy enough to
> recognize and process.  I’ll take a look.

My personal favorite choice is to deprecate obarrays (most uses would
be better served by a hash-table), but getting rid of them completely is
rather tricky.  So we probably want to solve the obarray problem
regardless of whether we deprecate them.

It seems fairly easy, tho:
- add a `make-obarray` function, which basically does the same as
  `make-vector` but uses another tag.  Use it in abbrev.el (and other
  applicable places).
- change `intern` and friends to accept those other kinds of vectors.
- change print.c to do something more clever with obarrays.
- deprecate use of plain vectors as obarrays.
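
For context, a sketch of the current convention that the list above wants to move away from, where an obarray is just a vector of zeros handed to `intern`.  The variable names are made up for illustration, and the proposed `make-obarray` is not used because it does not exist yet:

```elisp
;; `my-demo-obarray' and `my-demo-symbol' are made-up names.
(defvar my-demo-obarray (make-vector 7 0)
  "A plain vector used as an obarray, per the historical convention.")

;; `intern' accepts the vector and creates a symbol private to it.
(defvar my-demo-symbol (intern "foo" my-demo-obarray))

;; The new symbol is distinct from the `foo' in the global obarray.
(message "%S" (eq my-demo-symbol 'foo))

;; The value carries no tag of its own for print.c to dispatch on.
(message "%S" (vectorp my-demo-obarray))
```

A distinct tag is what would let the printer tell an obarray apart from an ordinary vector.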

I'm in the mood for procrastinating, so don't be surprised if a patch
shows up.


        Stefan





* Re: Skipping unexec via a big .elc file
  2017-03-11 23:59                                                                                                     ` Ken Raeburn
  2017-03-12 17:06                                                                                                       ` Stefan Monnier
@ 2017-03-13  8:25                                                                                                       ` Ken Raeburn
  2017-03-26 16:44                                                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-03-13  8:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 871 bytes --]

I have a patch which seems to recreate all the abbrev tables that were in the initial Emacs process, including the sharing (lisp-mode-abbrev-table being a parent of emacs-lisp-mode-abbrev-table; local-abbrev-table set to the fundamental mode table).  Please let me know if it fixes your problem.

It depends on a couple key things: (1) The abbrev tables are empty at that point, so we don’t have to worry about reconstructing all the abbrevs in a table; this can be fixed, it’s just tedious. (2) Abbrev-table values are only used as symbol values, or parents of other abbrev tables.  This is much harder.  Stefan’s printable replacement for obarrays would probably be a better solution.  Though, normally I’d expect people to want printing of an obarray to show symbol names, and for this use case we need the function, value, and plist data as well.
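
The sharing part relies on numbering each distinct table by identity.  A stripped-down sketch of that idea, with made-up names and plain lists standing in for abbrev tables (this is not the code in the patch):

```elisp
;; All `my-demo-' names are hypothetical.
(defvar my-demo-seen (make-hash-table :test 'eq)
  "Maps each object already seen to its index.")
(defvar my-demo-next-index 0)

(defun my-demo-index-for (object)
  "Return a stable index for OBJECT, assigning a new one on first sight."
  (or (gethash object my-demo-seen)
      (prog1 (puthash object my-demo-next-index my-demo-seen)
        (setq my-demo-next-index (1+ my-demo-next-index)))))

;; Two `equal' but distinct objects get different indices, while the
;; same object always maps back to the same index.
(defvar my-demo-indices
  (let ((a (list 1))
        (b (list 1)))
    (list (my-demo-index-for a)
          (my-demo-index-for b)
          (my-demo-index-for a))))

(message "%S" my-demo-indices)   ; => (0 1 0)
```

Because the hash table uses `eq`, a table that is both a variable's value and another table's parent is created once and referenced twice.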

Ken


[-- Attachment #2: abbrev-table-patch --]
[-- Type: application/octet-stream, Size: 4844 bytes --]

commit dd4d7941a3915d16a0f78ab547e3656ec95f4f29
Author: Ken Raeburn <raeburn@raeburn.org>
Date:   Mon Mar 13 03:21:53 2017 -0400

    Dump and restore empty abbrev tables.
    
    Abbrev tables are obarrays and thus don't print out in a useful form.
    They need to be assembled at load time.  Fortunately, loadup.el only
    gives us empty abbrev tables, so we don't have to actually restore any
    abbrevs, only the tables.
    
    * lisp/loadup.el: When variable values are abbrev tables, emit a
    "make-abbrev-table" initialization with the appropriate property
    lists.  Check abbrev tables and their parents for instances of
    sharing.  Reject any abbrev tables that are not empty.

diff --git a/lisp/loadup.el b/lisp/loadup.el
index cc9ed7be1a..48a1208ed7 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -484,6 +484,10 @@
             (coding-systems '()) (coding-system-aliases '())
             (charsets '()) (charset-aliases '())
             (unified-charsets '())
+            (abbrev-tables (make-hash-table :test 'eq))
+            (abbrev-assign-cmds '())
+            (abbrev-make-cmds '())
+            (abbrev-counter 0)
             (cmds '()))
         (setcdr global-buffers-menu-map nil) ;; Get rid of buffer objects!
         (push `(internal--set-standard-syntax-table
@@ -539,6 +543,42 @@
                           '(let ((ol (make-overlay (point-min) (point-min))))
                              (delete-overlay ol)
                              ol))
+                         ;; abbrev-table-p isn't very defensive
+                         ((condition-case nil
+                              (abbrev-table-p v)
+                            (error nil))
+                          (cl-labels ((replace-abbrevs-for-dump
+                                       (table)
+                                       (or (abbrev-table-empty-p table)
+                                           (error "Non-empty abbrev tables not handled"))
+                                       (let ((newval (gethash table abbrev-tables)))
+                                         (if newval
+                                             `(aref scratch-abbrev-tables ,newval)
+                                           (let* ((props (symbol-plist (obarray-get table ""))))
+                                             (cond ((plist-get props :parents)
+                                                    (setq props (copy-sequence props))
+                                                    (plist-put props
+                                                               :parents
+                                                               (mapcar (lambda (value)
+                                                                         (replace-abbrevs-for-dump value))
+                                                                       (plist-get props :parents))))
+                                                   ((eq (length props) 2)
+                                                    ;; Only :abbrev-table-modiff, which gets added at creation anyway.
+                                                    (setq props nil)))
+                                             (push `(aset scratch-abbrev-tables
+                                                          ,abbrev-counter
+                                                          (make-abbrev-table ',props))
+                                                   abbrev-make-cmds)
+                                             (puthash table abbrev-counter abbrev-tables)
+                                             (prog1
+                                                 `(aref scratch-abbrev-tables ,abbrev-counter)
+                                               (setq abbrev-counter (1+ abbrev-counter))))))))
+                            (push `(set-default ',s
+                                                ,(replace-abbrevs-for-dump v))
+                                  abbrev-assign-cmds))
+                          ;; Placeholder to be used before we know
+                          ;; we've defined make-abbrev-table.
+                          0)
                          (v (macroexp-quote v))))
                      cmds)
                ;; Local variables: make-variable-buffer-local,
@@ -591,6 +631,10 @@
             (print '(get-buffer-create "*Messages*"))
             (print `(progn . ,cmds))
             (terpri)
+            ;; Now that make-abbrev-table is defined, use it.
+            (print `(let ((scratch-abbrev-tables (make-vector ,abbrev-counter 0)))
+                      ,@(nreverse abbrev-make-cmds)
+                      ,@abbrev-assign-cmds))
             (print `(let ((css ',charsets))
                       (dotimes (i 3)
                         (dolist (cs (prog1 css (setq css nil)))


* Re: Skipping unexec via a big .elc file
  2017-03-13  8:25                                                                                                       ` Ken Raeburn
@ 2017-03-26 16:44                                                                                                         ` Eli Zaretskii
  2017-03-28  2:27                                                                                                           ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-03-26 16:44 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 13 Mar 2017 04:25:19 -0400
> Cc: Emacs developers <emacs-devel@gnu.org>
> 
> I have a patch which seems to recreate all the abbrev tables that were in the initial Emacs process, including the sharing (lisp-mode-abbrev-table being a parent of emacs-lisp-mode-abbrev-table; local-abbrev-table set to the fundamental mode table).  Please let me know if it fixes your problem.

Sorry for the long delay: Life™ intervened big time...

> It depends on a couple key things: (1) The abbrev tables are empty at that point, so we don’t have to worry about reconstructing all the abbrevs in a table; this can be fixed, it’s just tedious. (2) Abbrev-table values are only used as symbol values, or parents of other abbrev tables.  This is much harder.  Stefan’s printable replacement for obarrays would probably be a better solution.  Though, normally I’d expect people to want printing of an obarray to show symbol names, and for this use case we need the function, value, and plist data as well.

I applied your patch, and while dumping I get an error message:

  Dumping into dumped.elc...preparing...
  Dumping into dumped.elc...generating...
  Symbol's function definition is void: cl-labels

and dumped.elc is not re-created.  What did I miss?




* Re: Skipping unexec via a big .elc file
  2017-03-26 16:44                                                                                                         ` Eli Zaretskii
@ 2017-03-28  2:27                                                                                                           ` Ken Raeburn
  2017-03-31  6:57                                                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-03-28  2:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On Mar 26, 2017, at 12:44, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Mon, 13 Mar 2017 04:25:19 -0400
>> Cc: Emacs developers <emacs-devel@gnu.org>
>> 
>> I have a patch which seems to recreate all the abbrev tables that were in the initial Emacs process, including the sharing (lisp-mode-abbrev-table being a parent of emacs-lisp-mode-abbrev-table; local-abbrev-table set to the fundamental mode table).  Please let me know if it fixes your problem.
> 
> Sorry for the long delay: Life™ intervened big time…

It happens.  No worries.

> 
>> It depends on a couple key things: (1) The abbrev tables are empty at that point, so we don’t have to worry about reconstructing all the abbrevs in a table; this can be fixed, it’s just tedious. (2) Abbrev-table values are only used as symbol values, or parents of other abbrev tables.  This is much harder.  Stefan’s printable replacement for obarrays would probably be a better solution.  Though, normally I’d expect people to want printing of an obarray to show symbol names, and for this use case we need the function, value, and plist data as well.
> 
> I applied your patch, and while dumping I get an error message:
> 
>  Dumping into dumped.elc...preparing...
>  Dumping into dumped.elc...generating...
>  Symbol's function definition is void: cl-labels
> 
> and dumped.elc is not re-created.  What did I miss?

Looks like I missed a “require” or “load” to pull in cl-macs.  Perhaps it’s loaded by something else in my build that’s platform-dependent (X11 vs Windows?) and isn’t in yours; I’m not sure. But it isn’t working for me to just load it explicitly without fixing up the load path too.  Perhaps I should’ve just defined a helper function instead of using cl-labels.

For now, try adding this patch. It bootstraps for me, and should get cl-labels defined.

diff --git a/lisp/loadup.el b/lisp/loadup.el
index 4ef9712ab6..f9251020cd 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -57,6 +57,17 @@
 ;; Add subdirectories to the load-path for files that might get
 ;; autoloaded when bootstrapping.
 ;; This is because PATH_DUMPLOADSEARCH is just "../lisp".
+(let ((dir (car load-path)))
+  (message "load path is %S" load-path)
+  (setq load-path (list (expand-file-name "." dir)
+                        (expand-file-name "emacs-lisp" dir)
+                        (expand-file-name "language" dir)
+                        (expand-file-name "international" dir)
+                        (expand-file-name "textmodes" dir)
+                        (expand-file-name "vc" dir))))
+
+(setq purify-flag nil)
+
 (if (or (equal (member "bootstrap" command-line-args) '("bootstrap"))
 	;; FIXME this is irritatingly fragile.
 	(equal (nth 4 command-line-args) "unidata-gen.el")
@@ -64,19 +75,10 @@
 	(if (fboundp 'dump-emacs)
 	    (string-match "src/bootstrap-emacs" (nth 0 command-line-args))
 	  t))
-    (let ((dir (car load-path)))
-      ;; We'll probably overflow the pure space.
-      (setq purify-flag nil)
-      ;; Value of max-lisp-eval-depth when compiling initially.
-      ;; During bootstrapping the byte-compiler is run interpreted when
-      ;; compiling itself, which uses a lot more stack than usual.
-      (setq max-lisp-eval-depth 2200)
-      (setq load-path (list (expand-file-name "." dir)
-			    (expand-file-name "emacs-lisp" dir)
-			    (expand-file-name "language" dir)
-			    (expand-file-name "international" dir)
-			    (expand-file-name "textmodes" dir)
-			    (expand-file-name "vc" dir)))))
+    ;; Value of max-lisp-eval-depth when compiling initially.
+    ;; During bootstrapping the byte-compiler is run interpreted when
+    ;; compiling itself, which uses a lot more stack than usual.
+    (setq max-lisp-eval-depth 2200))
 
 (if (eq t purify-flag)
     ;; Hash consing saved around 11% of pure space in my tests.
@@ -308,6 +310,8 @@
 ;; Preload some constants and floating point functions.
 (load "emacs-lisp/float-sup")
 
+(load "emacs-lisp/cl-macs")
+
 (load "vc/vc-hooks")
 (load "vc/ediff-hook")
 (load "uniquify")





* Re: Skipping unexec via a big .elc file
  2017-03-28  2:27                                                                                                           ` Ken Raeburn
@ 2017-03-31  6:57                                                                                                             ` Eli Zaretskii
  2017-03-31  8:40                                                                                                               ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-03-31  6:57 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 27 Mar 2017 22:27:26 -0400
> Cc: emacs-devel@gnu.org
> 
> > I applied your patch, and while dumping I get an error message:
> > 
> >  Dumping into dumped.elc...preparing...
> >  Dumping into dumped.elc...generating...
> >  Symbol's function definition is void: cl-labels
> > 
> > and dumped.elc is not re-created.  What did I miss?
> 
> Looks like I missed a “require” or “load” to pull in cl-macs.  Perhaps it’s loaded by something else in my build that’s platform-dependent (X11 vs Windows?) and isn’t in yours; I’m not sure. But it isn’t working for me to just load it explicitly without fixing up the load path too.  Perhaps I should’ve just defined a helper function instead of using cl-labels.
> 
> For now, try adding this patch. It bootstraps for me, and should get cl-labels defined.

This fixes the problem, and Emacs now starts OK, so the abbrevs issue
is also solved.

I think you should push all the changes you asked me to apply as
patches.

What is the roadmap ahead?  Are there any known issues left, before we
can consider this to be a candidate for merging to master, and asking
people to test it in their routine workflows before we actually merge?

Thanks.




* Re: Skipping unexec via a big .elc file
  2017-03-31  6:57                                                                                                             ` Eli Zaretskii
@ 2017-03-31  8:40                                                                                                               ` Ken Raeburn
  2017-04-03 16:15                                                                                                                 ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-03-31  8:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> This fixes the problem, and Emacs now starts OK, so the abbrevs issue
> is also solved.

Great!

> I think you should push all the changes you asked me to apply as
> patches.

Will do, probably this weekend.

> What is the roadmap ahead?  Are there any known issues left, before we
> can consider this to be a candidate for merging to master, and asking
> people to test it in their routine workflows before we actually merge?
> 
> Thanks.

There are a number of issues on my list.  Some can be dealt with while people are experimenting, or even after merging.  Others may affect usability, like the current inability to report clearly and consistently if dumped.elc can’t be found.  I haven’t even tested doing “make install”; I always run Emacs from the build tree.  There are a few other minor bugs, like a few unprintable definitions not getting dumped, that it’d be nice to address; I’ll go back over my list and take a look.

Ken


* Re: Skipping unexec via a big .elc file
  2017-03-31  8:40                                                                                                               ` Ken Raeburn
@ 2017-04-03 16:15                                                                                                                 ` Ken Raeburn
  2017-04-03 16:57                                                                                                                   ` Alan Mackenzie
  2017-04-10 16:19                                                                                                                   ` Ken Raeburn
  0 siblings, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-04-03 16:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


On Mar 31, 2017, at 04:40, Ken Raeburn <raeburn@raeburn.org> wrote:

> 
> On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote:
>> 
>> This fixes the problem, and Emacs now starts OK, so the abbrevs issue
>> is also solved.
> 
> Great!
> 
>> I think you should push all the changes you asked me to apply as
>> patches.
> 
> Will do, probably this weekend.

Looks like the abbrev change isn’t actually working right… I got the quoting wrong, so the abbrev tables are constructed as (mostly) proper abbrev tables, and in the right order, but the “:parents” properties are bad. Working on fixing it up….
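
As a generic illustration of how a quoting slip can leave a property holding an unevaluated form instead of a table (made-up names, not the loadup.el code, and only an assumption about the kind of mistake involved):

```elisp
;; Stand-in for the vector of rebuilt tables; all names are made up.
(defvar my-demo-tables (vector 'parent-table))

;; Quoted plist: the `aref' form is kept as data and never evaluated.
(defvar my-demo-quoted
  (plist-get '(:parents ((aref my-demo-tables 0))) :parents))

;; Backquoted plist with an unquote: the form is evaluated first.
(defvar my-demo-evaluated
  (plist-get `(:parents (,(aref my-demo-tables 0))) :parents))

(message "%S" my-demo-quoted)     ; a list holding the unevaluated form
(message "%S" my-demo-evaluated)  ; a list holding the table stand-in
```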





* Re: Skipping unexec via a big .elc file
  2017-04-03 16:15                                                                                                                 ` Ken Raeburn
@ 2017-04-03 16:57                                                                                                                   ` Alan Mackenzie
  2017-04-03 18:35                                                                                                                     ` Ken Raeburn
  2017-04-10 16:19                                                                                                                   ` Ken Raeburn
  1 sibling, 1 reply; 375+ messages in thread
From: Alan Mackenzie @ 2017-04-03 16:57 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

Hello, Ken.

On Mon, Apr 03, 2017 at 12:15:29 -0400, Ken Raeburn wrote:

> On Mar 31, 2017, at 04:40, Ken Raeburn <raeburn@raeburn.org> wrote:


> > On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote:

> >> This fixes the problem, and Emacs now starts OK, so the abbrevs issue
> >> is also solved.

> > Great!

> >> I think you should push all the changes you asked me to apply as
> >> patches.

> > Will do, probably this weekend.

> Looks like the abbrev change isn’t actually working right… I got the
> quoting wrong, so the abbrev tables are constructed as (mostly) proper
> abbrev tables, and in the right order, but the “:parents” properties are
> bad. Working on fixing it up….

I, for one, am feeling enthusiastic about this new way of building Emacs,
and am looking forward to trying it out in the near future.

Thanks for a great job almost finished!

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Skipping unexec via a big .elc file
  2017-04-03 16:57                                                                                                                   ` Alan Mackenzie
@ 2017-04-03 18:35                                                                                                                     ` Ken Raeburn
  2017-04-03 19:14                                                                                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-04-03 18:35 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On Apr 3, 2017, at 12:57, Alan Mackenzie <acm@muc.de> wrote:

> Hello, Ken.
> 

> I, for one, am feeling enthusiastic about this new way of building Emacs,
> and am looking forward to trying it out in the near future.
> 
> Thanks for a great job almost finished!

Just making sure credit goes where it’s due:  Stefan did great work on the key piece, processing the Lisp environment for dumping as Lisp.  I’ve tried to improve the Lisp reader performance a bit, and fix up a couple minor bugs here and there, and maybe put a little polish on it.

Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough.  The .elc files are still (essentially) Lisp, and parsing text is not the most efficient way to load a bunch of object definitions.

Ken


* Re: Skipping unexec via a big .elc file
  2017-04-03 18:35                                                                                                                     ` Ken Raeburn
@ 2017-04-03 19:14                                                                                                                       ` Eli Zaretskii
  2017-04-04  8:08                                                                                                                         ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-04-03 19:14 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: acm, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 3 Apr 2017 14:35:16 -0400
> Cc: emacs-devel@gnu.org
> 
> Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough.

I published my preliminary timings in these 2 messages:

  http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00923.html
  http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00959.html




* Re: Skipping unexec via a big .elc file
  2017-04-03 19:14                                                                                                                       ` Eli Zaretskii
@ 2017-04-04  8:08                                                                                                                         ` Ken Raeburn
  2017-04-04  9:51                                                                                                                           ` Robert Pluim
                                                                                                                                             ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-04-04  8:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel

On Apr 3, 2017, at 15:14, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Mon, 3 Apr 2017 14:35:16 -0400
>> Cc: emacs-devel@gnu.org
>> 
>> Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough.
> 
> I published my preliminary timings in these 2 messages:
> 
>  http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00923.html
>  http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00959.html

Yes, I got some speedups, but I didn’t get it as fast as I was hoping.  Some of my changes since your second message above might’ve improved the numbers a little, but some (like loading the doc pointers at startup, and I think “uniquify” is going to need to be loaded at startup too because it attaches advice to “rename-buffer” which we can’t save properly) may slow it a little too.

I was aiming for a startup time under a tenth of a second, and didn’t get there, though there were a couple of additional things that could be tried, with some effort.  I’m not sure how a startup time of nearly a fifth of a second will feel for people.  If they start Emacs once as part of logging in, it probably won’t be an issue.  If they start it every time they want to edit a file, it may be annoying to have the startup time increased by even 0.15s.

Still, I suppose we can let people try it out, and find out what they think.  Then we can decide if it’s good enough, if further speedup measures are worth exploring, or if it’s a dead end.

Ken


* Re: Skipping unexec via a big .elc file
  2017-04-04  8:08                                                                                                                         ` Ken Raeburn
@ 2017-04-04  9:51                                                                                                                           ` Robert Pluim
  2017-04-04 10:27                                                                                                                           ` joakim
  2017-04-07  5:46                                                                                                                           ` Lars Brinkhoff
  2 siblings, 0 replies; 375+ messages in thread
From: Robert Pluim @ 2017-04-04  9:51 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: acm, Eli Zaretskii, emacs-devel

Ken Raeburn <raeburn@raeburn.org> writes:

>
> I was aiming for a startup time under a tenth of a second, and didn’t
> get there, though there were a couple of additional things that could
> be tried, with some effort.  I’m not sure how a startup time of nearly a
> fifth of a second will feel for people.  If they start Emacs once as
> part of logging in, it probably won’t be an issue.  If they start it
> every time they want to edit a file, it may be annoying to have the
> startup time increased by even 0.15s.

I have an emacs I normally keep running, and occasionally start one
for a quick editing task. 0.15s is completely lost in the noise for
me. I've tried the branch, as far as I'm concerned its speed is fine.

Regards

Robert




* Re: Skipping unexec via a big .elc file
  2017-04-04  8:08                                                                                                                         ` Ken Raeburn
  2017-04-04  9:51                                                                                                                           ` Robert Pluim
@ 2017-04-04 10:27                                                                                                                           ` joakim
  2017-04-04 12:14                                                                                                                             ` Clément Pit-Claudel
  2017-04-07  5:46                                                                                                                           ` Lars Brinkhoff
  2 siblings, 1 reply; 375+ messages in thread
From: joakim @ 2017-04-04 10:27 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: acm, Eli Zaretskii, emacs-devel

Ken Raeburn <raeburn@raeburn.org> writes:

> On Apr 3, 2017, at 15:14, Eli Zaretskii <eliz@gnu.org> wrote:
>
>>> From: Ken Raeburn <raeburn@raeburn.org>
>>> Date: Mon, 3 Apr 2017 14:35:16 -0400
>>> Cc: emacs-devel@gnu.org
>>> 
>>> Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough.
>> 
>> I published my preliminary timings in these 2 messages:
>> 
>>  http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00923.html
>>  http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00959.html
>
> Yes, I got some speedups, but I didn’t get it as fast as I was hoping.  Some of my changes since your second message above might’ve improved the numbers a little, but some (like loading the doc pointers at startup, and I think “uniquify” is going to need to be loaded at startup too because it attaches advice to “rename-buffer” which we can’t save properly) may slow it a little too.
>
> I was aiming for a startup time under a tenth of a second, and didn’t get there, though there were a couple of additional things that could be tried, with some effort.  I’m not sure how a startup time of nearly a fifth of a second will feel for people.  If they start Emacs once as part of logging in, it probably won’t be an issue.  If they start it every time they want to edit a file, it may be annoying to have the startup time increased by even 0.15s.

In my case I mostly use long-running sessions, so slow emacs startup
isn't so bad for me. Most of the boot time seems to be spent in 3rd party
libs anyway.

But on the other hand I think there is a valid use case for using Emacs
for things like batch processing, web servers and such. And in those
cases startup time matters. Again otoh, you might want to use
emacsclient together with a long running emacs in those cases. But I'm
not really using emacs for that sort of thing so one should listen to
the people actually doing it primarily.

>
> Still, I suppose we can let people try it out, and find out what they think.  Then we can decide if it’s good enough, if further speedup measures are worth exploring, or if it’s a dead end.
>
> Ken
-- 
Joakim Verona
joakim@verona.se




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-04 10:27                                                                                                                           ` joakim
@ 2017-04-04 12:14                                                                                                                             ` Clément Pit-Claudel
  2017-04-04 14:38                                                                                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Clément Pit-Claudel @ 2017-04-04 12:14 UTC (permalink / raw)
  To: emacs-devel

On 2017-04-04 06:27, joakim@verona.se wrote:
> 
> But on the other hand I think there is a valid use case for using Emacs
> for things like batch processing, web servers and such. And in those
> cases startup time matters. Again otoh, you might want to use
> emacsclient together with a long running emacs in those cases. But I'm
> not really using emacs for that sort of thing so one should listen to
> the people actually doing it primarily.

I use Emacs in batch mode a lot; using a server is tricky, because anything you load or change on one run persists across future executions (for example, one execution might load a file, then the next one might forget to load the file explicitly but still work because that file was previously loaded; when you run the program again in a fresh instance, things fail).

Additionally, there are bugs in Emacsclient that make it tricky to use on its own, so my current code relies on two instances of Emacs: a long-lived one and a short-lived one. The short-lived one is used to connect to the long-running server. (This saves a lot because most of the execution time is otherwise spent loading packages.)

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-04 12:14                                                                                                                             ` Clément Pit-Claudel
@ 2017-04-04 14:38                                                                                                                               ` Eli Zaretskii
  2017-04-04 15:16                                                                                                                                 ` Clément Pit-Claudel
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-04-04 14:38 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Tue, 4 Apr 2017 08:14:48 -0400
> 
> Additionally, there are bugs in Emacsclient that make it tricky to use on its own

Why aren't these bugs being fixed?  Are there bug reports with the
details?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-04 14:38                                                                                                                               ` Eli Zaretskii
@ 2017-04-04 15:16                                                                                                                                 ` Clément Pit-Claudel
  2017-04-04 15:53                                                                                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Clément Pit-Claudel @ 2017-04-04 15:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 2017-04-04 10:38, Eli Zaretskii wrote:
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Tue, 4 Apr 2017 08:14:48 -0400
>>
>> Additionally, there are bugs in Emacsclient that make it tricky to use on its own
> 
> Why aren't these bugs being fixed?  Are there bug reports with the
> details?

I can think of 3: "Is emacsclient --eval broken?" from emacs-devel on 2016-08-03, which got fixed almost instantly (thanks Johan Bockgård!). #24616, which now has documentation and so is arguably fixed (but errors still pop up on the server, not the client, and so the server has to capture backtraces and send them back explicitly). "How can I rethrow an error after recording a backtrace?" from emacs-devel on 2016-08-04, which is due to emacsclient not incrementing num_nonmacro_input_events.

One of them is fixed, the second seems to be mostly wontfix, and the third is open.  But all three are relevant when trying to remain compatible with older Emacsen.

Cheers,
Clément.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-04 15:16                                                                                                                                 ` Clément Pit-Claudel
@ 2017-04-04 15:53                                                                                                                                   ` Eli Zaretskii
  2017-04-04 18:22                                                                                                                                     ` Clément Pit-Claudel
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-04-04 15:53 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Tue, 4 Apr 2017 11:16:35 -0400
> Cc: emacs-devel@gnu.org
> 
> One of them is fixed, the second seems to be mostly wontfix, and the third is open.  But all three are relevant when trying to remain compatible with older Emacsen.

So there's only one bug that remains, is that right?

As for older versions, you could fix them by retrofitting the patches
into them, right?  And in any case, they seem to be unrelated to the
issue at hand, which will only affect the next release and those after
it.  Right?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-04 15:53                                                                                                                                   ` Eli Zaretskii
@ 2017-04-04 18:22                                                                                                                                     ` Clément Pit-Claudel
  0 siblings, 0 replies; 375+ messages in thread
From: Clément Pit-Claudel @ 2017-04-04 18:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 2017-04-04 11:53, Eli Zaretskii wrote:
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Tue, 4 Apr 2017 11:16:35 -0400
>> Cc: emacs-devel@gnu.org
>> 
>> One of them is fixed, the second seems to be mostly wontfix,
>> and the third is open.  But all three are relevant when trying to
>> remain compatible with older Emacsen.
> 
> So there's only one bug that remains, is that right?
> 
> As for older versions, you could fix them by retrofitting the
> patches into them, right?

Correct. When the patches are in Lisp, yes, to some extent.  In C, no, of course.

> And in any case, they seem to be unrelated to the issue at hand,
> which will only affect the next release and those after it.  Right?

Correct. I sought to point out that, since old versions prevent some authors from using emacsclient, emacsclient being fast does not mitigate much of the cost of Emacs itself starting slowly.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-04  8:08                                                                                                                         ` Ken Raeburn
  2017-04-04  9:51                                                                                                                           ` Robert Pluim
  2017-04-04 10:27                                                                                                                           ` joakim
@ 2017-04-07  5:46                                                                                                                           ` Lars Brinkhoff
  2017-04-07  7:28                                                                                                                             ` Eli Zaretskii
  2 siblings, 1 reply; 375+ messages in thread
From: Lars Brinkhoff @ 2017-04-07  5:46 UTC (permalink / raw)
  To: emacs-devel

Ken Raeburn wrote:
> I was aiming for a startup time under a tenth of a second, and didn’t
> get there, though there were a couple of additional things that could
> be tried, with some effort.  I’m not sure how a startup time of nearly a
> fifth of a second will feel for people.

Every invocation of async-start launches a new emacs subprocess, doesn't
it?  So startup time would also affect uses of async.el.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07  5:46                                                                                                                           ` Lars Brinkhoff
@ 2017-04-07  7:28                                                                                                                             ` Eli Zaretskii
  2017-04-07  9:02                                                                                                                               ` Ken Raeburn
  2017-04-07 13:23                                                                                                                               ` Skipping unexec via a big .elc file Stefan Monnier
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-04-07  7:28 UTC (permalink / raw)
  To: Lars Brinkhoff, Ken Raeburn; +Cc: emacs-devel

> From: Lars Brinkhoff <lars@nocrew.org>
> Date: Fri, 07 Apr 2017 07:46:12 +0200
> 
> Ken Raeburn wrote:
> > I was aiming for a startup time under a tenth of a second, and didn’t
> > get there, though there were a couple of additional things that could
> > be tried, with some effort.  I’m not sure how a startup time of nearly a
> > fifth of a second will feel for people.
> 
> Every invocation of async-start launches a new emacs subprocess, doesn't
> it?  So startup time would also affect uses of async.el.

Perhaps we could have a separate, much smaller dumped.elc for batch
invocations, to cater to these use cases.  Ken, does this make sense?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07  7:28                                                                                                                             ` Eli Zaretskii
@ 2017-04-07  9:02                                                                                                                               ` Ken Raeburn
  2017-04-07 13:40                                                                                                                                 ` Eli Zaretskii
  2017-04-07 13:23                                                                                                                               ` Skipping unexec via a big .elc file Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-04-07  9:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Lars Brinkhoff, emacs-devel


On Apr 7, 2017, at 03:28, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Lars Brinkhoff <lars@nocrew.org>
>> Date: Fri, 07 Apr 2017 07:46:12 +0200
>> 
>> Ken Raeburn wrote:
>>> I was aiming for a startup time under a tenth of a second, and didn’t
>>> get there, though there were a couple of additional things that could
>>> be tried, with some effort.  I’m not sure how a startup time of nearly a
>>> fifth of a second will feel for people.
>> 
>> Every invocation of async-start launches a new emacs subprocess, doesn't
>> it?  So startup time would also affect uses of async.el.
> 
> Perhaps we could have a separate, much smaller dumped.elc for batch
> invocations, to cater to these use cases.  Ken, does this make sense?

We could do it, sure.  For example, stuff relating to window systems probably isn’t of much use in batch mode.  One question is, do we change such things to use autoload in case the user’s init file references their functions, or do we require that the user know to use “load” or “require”?

Maybe we can find files currently loaded that we could change over to autoloading in all cases, improving the interactive startup time too?

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07  9:02                                                                                                                               ` Ken Raeburn
@ 2017-04-07 13:40                                                                                                                                 ` Eli Zaretskii
  2017-04-07 16:02                                                                                                                                   ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-04-07 13:40 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: lars, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Fri, 7 Apr 2017 05:02:30 -0400
> Cc: Lars Brinkhoff <lars@nocrew.org>,
>  emacs-devel@gnu.org
> 
> > Perhaps we could have a separate, much smaller dumped.elc for batch
> > invocations, to cater to these use cases.  Ken, does this make sense?
> 
> We could do it, sure.  For example, stuff relating to window systems probably isn’t of much use in batch mode.  One question is, do we change such things to use autoload in case the user’s init file references their functions, or do we require that the user know to use “load” or “require”?

I don't think I understand the question: -batch implies -Q, so the
user's init file is not relevant.

> Maybe we can find files currently loaded that we could change over to autoloading in all cases, improving the interactive startup time too?

Yes, making dumped.elc smaller by using autoload is another way of
slashing some load time.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07 13:40                                                                                                                                 ` Eli Zaretskii
@ 2017-04-07 16:02                                                                                                                                   ` Ken Raeburn
  2017-04-07 16:17                                                                                                                                     ` Clément Pit-Claudel
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-04-07 16:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lars, emacs-devel


On Apr 7, 2017, at 09:40, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Fri, 7 Apr 2017 05:02:30 -0400
>> Cc: Lars Brinkhoff <lars@nocrew.org>,
>> emacs-devel@gnu.org
>> 
>>> Perhaps we could have a separate, much smaller dumped.elc for batch
>>> invocations, to cater to these use cases.  Ken, does this make sense?
>> 
>> We could do it, sure.  For example, stuff relating to window systems probably isn’t of much use in batch mode.  One question is, do we change such things to use autoload in case the user’s init file references their functions, or do we require that the user know to use “load” or “require”?
> 
> I don't think I understand the question: -batch implies -Q, so the
> user's init file is not relevant.

Sorry, was too late at night I guess.  For batch mode, it’s code loaded via -l options that might have to add new explicit dependencies, if we don’t add autoloads for everything.  I would expect autoloads would be the direction we’d want to go….


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07 16:02                                                                                                                                   ` Ken Raeburn
@ 2017-04-07 16:17                                                                                                                                     ` Clément Pit-Claudel
  2017-04-08 15:03                                                                                                                                       ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Clément Pit-Claudel @ 2017-04-07 16:17 UTC (permalink / raw)
  To: emacs-devel

On 2017-04-07 12:02, Ken Raeburn wrote:
> Sorry, was too late at night I guess.  For batch mode, it’s code
> loaded via -l options that might have to add new explicit
> dependencies, if we don’t add autoloads for everything.  I would
> expect autoloads would be the direction we’d want to go….

Removing some preloaded packages in favor of autoloads is probably a good idea.  We should be careful, though: as previously discussed, essentially everything in packages that were previously preloaded needs to be autoloaded now, since many packages don't (require) the preloaded features that they use. 




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07 16:17                                                                                                                                     ` Clément Pit-Claudel
@ 2017-04-08 15:03                                                                                                                                       ` Philipp Stephani
  2017-04-08 15:15                                                                                                                                         ` Clément Pit-Claudel
  0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2017-04-08 15:03 UTC (permalink / raw)
  To: Clément Pit-Claudel, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 954 bytes --]

Clément Pit-Claudel <cpitclaudel@gmail.com> wrote on Fri, 7 Apr 2017 at
18:18:

> On 2017-04-07 12:02, Ken Raeburn wrote:
> > Sorry, was too late at night I guess.  For batch mode, it’s code
> > loaded via -l options that might have to add new explicit
> > dependencies, if we don’t add autoloads for everything.  I would
> > expect autoloads would be the direction we’d want to go….
>
> Removing some preloaded packages in favor of autoloads is probably a good
> idea.  We should be careful, though: as previously discussed, essentially
> everything in packages that were previously preloaded needs to be
> autoloaded now, since many packages don't (require) the preloaded features
> that they use.
>
>
>
Doesn't that effectively just move most of the code to loaddefs.el, from
which it again has to be either preloaded or byte-compiled into the "big
.elc file"? Does this really bring measurable benefits nowadays?

[-- Attachment #2: Type: text/html, Size: 1422 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 15:03                                                                                                                                       ` Philipp Stephani
@ 2017-04-08 15:15                                                                                                                                         ` Clément Pit-Claudel
  2017-04-08 15:53                                                                                                                                           ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Clément Pit-Claudel @ 2017-04-08 15:15 UTC (permalink / raw)
  To: Philipp Stephani, emacs-devel

On 2017-04-08 11:03, Philipp Stephani wrote:
> Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>> … essentially everything in packages that were previously
>> preloaded needs to be autoloaded now, since many packages don't
>> (require) the preloaded features that they use.

> Doesn't that effectively just move most of the code to loaddefs.el,
> from which it again has to be either preloaded or byte-compiled into
> the "big .elc file"? Does this really bring measurable benefits
> nowadays?

(Sorry if I'm misunderstanding you)

I think the idea is that you can defer loading the implementation of a significant fraction of currently-preloaded functions, because many of these are currently unused.

So the intended saving is that currently-preloaded but uncommonly-used functions would not be dumped to the big-elc (their signatures, in the form of an autoload, would be).  Packages that use them without (require)-ing the corresponding feature first would still work, but startup would be faster.

(I hope I got this right)



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 15:15                                                                                                                                         ` Clément Pit-Claudel
@ 2017-04-08 15:53                                                                                                                                           ` Philipp Stephani
  2017-04-08 16:18                                                                                                                                             ` Eli Zaretskii
  2017-04-08 17:58                                                                                                                                             ` Clément Pit-Claudel
  0 siblings, 2 replies; 375+ messages in thread
From: Philipp Stephani @ 2017-04-08 15:53 UTC (permalink / raw)
  To: Clément Pit-Claudel, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2332 bytes --]

Clément Pit-Claudel <cpitclaudel@gmail.com> wrote on Sat, 8 Apr 2017 at
17:15:

> On 2017-04-08 11:03, Philipp Stephani wrote:
> > Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
> >> … essentially everything in packages that were previously
> >> preloaded needs to be autoloaded now, since many packages don't
> >> (require) the preloaded features that they use.
>
> > Doesn't that effectively just move most of the code to loaddefs.el,
> > from which it again has to be either preloaded or byte-compiled into
> > the "big .elc file"? Does this really bring measurable benefits
> > nowadays?
>
> (Sorry if I'm misunderstanding you)
>
> I think the idea is that you can defer loading the implementation of a
> significant fraction of currently-preloaded functions, because many of
> these are currently unused.
>
> So the intended saving is that currently-preloaded but uncommonly-used
> functions would not be dumped to the big-elc (their signatures, in the form
> of an autoload, would be).  Packages that use them without (require)-ing the
> corresponding feature first would still work, but startup would be faster.
>

The question is whether there is actually a significant speed-up.
Autoloading is traditionally used for a small number of interactive
commands that cause large optional libraries to be loaded. In such cases I
could imagine that the performance gain is still significant. However, you
now suggest that preloaded libraries get turned into autoloads. The
structure of those libraries is typically quite different: they consist to a
large extent of individual helper functions that are independent of each
other. My guess is that this could make overall performance worse: it will
cause loaddefs.el to contain all the signatures and docstrings of these
helper functions, and loaddefs.el is itself not byte-compiled. Therefore,
you now need to load the definitions effectively twice: once in
loaddefs.el, once the functions are actually used. Therefore such a change
shouldn't be made without measuring its impact.
I'd actually prefer going in the other direction: preload much more than
now, and remove lots of stuff from autoloads. This will probably need a
different strategy for preloading (Daniel's approach, or Rmacs, or an Elisp
LLVM compiler, ...).

[-- Attachment #2: Type: text/html, Size: 3057 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 15:53                                                                                                                                           ` Philipp Stephani
@ 2017-04-08 16:18                                                                                                                                             ` Eli Zaretskii
  2017-04-08 18:01                                                                                                                                               ` Stefan Monnier
  2017-04-08 17:58                                                                                                                                             ` Clément Pit-Claudel
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-04-08 16:18 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: cpitclaudel, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Sat, 08 Apr 2017 15:53:49 +0000
> 
> The question is whether there is actually a significant speed-up.
> Autoloading is traditionally used for a small number of interactive commands that cause large optional libraries
> to be loaded. In such cases I could imagine that the performance gain is still significant. However, you now
> suggest that preloaded libraries get turned into autoloads. The structure of those libraries is typically quite
> different: they consist to a large extent of individual helper functions that are independent of each other. My
> guess is that this could make overall performance worse: it will cause loaddefs.el to contain all the signatures
> and docstrings of these helper functions, and loaddefs.el is itself not byte-compiled. Therefore, you now need
> to load the definitions effectively twice: once in loaddefs.el, once the functions are actually used. Therefore
> such a change shouldn't be made without measuring its impact.

This issue will not be resolved by guessing, but by measurements.  So
if you are interested and can produce a dumped.elc that only loads
what's necessary in a -batch session, and that dumped.elc does or
doesn't load significantly faster than the full one, we will know who
is right here.

Thanks.

> I'd actually prefer going in the other direction: preload much more than now, and remove lots of stuff from
> autoloads. This will probably need a different strategy for preloading (Daniel's approach, or Rmacs, or an Elisp
> LLVM compiler, ...).

Given that load time is an issue, loading more stuff than strictly
necessary seems to make very little sense to me.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 16:18                                                                                                                                             ` Eli Zaretskii
@ 2017-04-08 18:01                                                                                                                                               ` Stefan Monnier
  2017-05-01 11:41                                                                                                                                                 ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2017-04-08 18:01 UTC (permalink / raw)
  To: emacs-devel

>> I'd actually prefer going in the other direction: preload much more than
>> now, and remove lots of stuff from
>> autoloads. This will probably need a different strategy for preloading
>> (Daniel's approach, or Rmacs, or an Elisp
>> LLVM compiler, ...).
> Given that load time is an issue, loading more stuff than strictly
> necessary seems to make very little sense to me.

IIUC, using Daniel's approach, it should be possible to preload using
mmap in a time that's largely independent from the size of the
preloaded file (or at least, with a small constant).


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 18:01                                                                                                                                               ` Stefan Monnier
@ 2017-05-01 11:41                                                                                                                                                 ` Philipp Stephani
  0 siblings, 0 replies; 375+ messages in thread
From: Philipp Stephani @ 2017-05-01 11:41 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 705 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> wrote on Sat, 8 Apr 2017 at
20:06:

> >> I'd actually prefer going in the other direction: preload much more than
> >> now, and remove lots of stuff from
> >> autoloads. This will probably need a different strategy for preloading
> >> (Daniel's approach, or Rmacs, or an Elisp
> >> LLVM compiler, ...).
> > Given that load time is an issue, loading more stuff than strictly
> > necessary seems to make very little sense to me.
>
> IIUC, using Daniel's approach, it should be possible to preload using
> mmap in a time that's largely independent from the size of the
> preloaded file (or at least, with a small constant).
>
>
Yes, that would be ideal.

[-- Attachment #2: Type: text/html, Size: 1071 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 15:53                                                                                                                                           ` Philipp Stephani
  2017-04-08 16:18                                                                                                                                             ` Eli Zaretskii
@ 2017-04-08 17:58                                                                                                                                             ` Clément Pit-Claudel
  2017-05-01 11:40                                                                                                                                               ` Philipp Stephani
  1 sibling, 1 reply; 375+ messages in thread
From: Clément Pit-Claudel @ 2017-04-08 17:58 UTC (permalink / raw)
  To: Philipp Stephani, emacs-devel

On 2017-04-08 11:53, Philipp Stephani wrote:
> However, you now suggest that preloaded libraries get turned into 
> autoloads.

I didn't suggest this :) I just pointed out that there were difficulties with that approach.

> Therefore such a change shouldn't be made without measuring its 
> impact.

Yup, as always when trying to optimize things :)

> I'd actually prefer going in the other direction: preload much more
> than now, and remove lots of stuff from autoloads. This will probably
> need a different strategy for preloading (Daniel's approach, or
> Rmacs, or an Elisp LLVM compiler, ...).

I don't have an opinion on this topic; I admire the work of both Daniel and Ken, but I'm happy to defer to you and other experts for technical opinions.

Clément.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-08 17:58                                                                                                                                             ` Clément Pit-Claudel
@ 2017-05-01 11:40                                                                                                                                               ` Philipp Stephani
  2017-05-01 12:07                                                                                                                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2017-05-01 11:40 UTC (permalink / raw)
  To: Clément Pit-Claudel, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 699 bytes --]

Clément Pit-Claudel <cpitclaudel@gmail.com> wrote on Sat, 8 Apr 2017 at
19:58:

>
> > I'd actually prefer going into the other direction: preload much more
> > than now, and remove lots of stuff from autoloads. This will probably
> > need a different strategy for preloading (Daniel's approach, or
> > Rmacs, or an Elisp LLVM compiler, ...).
>
> I don't have an opinion on this topic; I admire the work of both Daniel
> and Ken, but I'm happy to defer to you and other experts for technical
> opinions.
>
>
I'm absolutely not an expert on this. All I'm suggesting is that the impact
of such changes should be measured, and that startup time in batch mode
isn't everything.


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-05-01 11:40                                                                                                                                               ` Philipp Stephani
@ 2017-05-01 12:07                                                                                                                                                 ` Eli Zaretskii
  2017-05-18 17:39                                                                                                                                                   ` Daniel Colascione
  2017-05-21  8:44                                                                                                                                                   ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-05-01 12:07 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: cpitclaudel, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Mon, 01 May 2017 11:40:46 +0000
> 
> All I'm suggesting is that the impact of such changes should be
> measured, and that startup time in batch mode isn't everything. 

Startup time in batch mode isn't everything, but if it's a frequent
use case in which even relatively short delays are tangible, we should
try to find a way of minimizing those delays.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-05-01 12:07                                                                                                                                                 ` Eli Zaretskii
@ 2017-05-18 17:39                                                                                                                                                   ` Daniel Colascione
  2017-05-18 19:45                                                                                                                                                     ` Eli Zaretskii
  2017-05-21  8:44                                                                                                                                                   ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn
  1 sibling, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2017-05-18 17:39 UTC (permalink / raw)
  To: Eli Zaretskii, Philipp Stephani; +Cc: cpitclaudel, emacs-devel

On 05/01/2017 05:07 AM, Eli Zaretskii wrote:
>> From: Philipp Stephani <p.stephani2@gmail.com>
>> Date: Mon, 01 May 2017 11:40:46 +0000
>>
>> All I'm suggesting is that the impact of such changes should be
>> measured, and that startup time in batch mode isn't everything.
>
> Startup time in batch mode isn't everything, but if it's a frequent
> use case in which even relatively short delays are tangible, we should
> try to find a way of minimizing those delays.

I'm in a position to rebase my portable dumper patch. The last few 
months have been, er, interesting.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-05-18 17:39                                                                                                                                                   ` Daniel Colascione
@ 2017-05-18 19:45                                                                                                                                                     ` Eli Zaretskii
  2018-12-25 15:46                                                                                                                                                       ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-05-18 19:45 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: p.stephani2, cpitclaudel, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 18 May 2017 10:39:34 -0700
> Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org
> 
> I'm in a position to rebase my portable dumper patch. The last few 
> months have been, er, interesting.

Please make a branch with the patch, so people could try it.

Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-05-18 19:45                                                                                                                                                     ` Eli Zaretskii
@ 2018-12-25 15:46                                                                                                                                                       ` Philipp Stephani
  2018-12-25 17:21                                                                                                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2018-12-25 15:46 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: Clément Pit-Claudel, Daniel Colascione, Emacs developers

On Thu, 18 May 2017 at 21:45, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Thu, 18 May 2017 10:39:34 -0700
> > Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org
> >
> > I'm in a position to rebase my portable dumper patch. The last few
> > months have been, er, interesting.
>
> Please make a branch with the patch, so people could try it.

Hi, what's the state of the pdumper branch? Any chance we can merge it
to master?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2018-12-25 15:46                                                                                                                                                       ` Philipp Stephani
@ 2018-12-25 17:21                                                                                                                                                         ` Eli Zaretskii
  2018-12-25 19:15                                                                                                                                                           ` Daniel Colascione
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2018-12-25 17:21 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: cpitclaudel, dancol, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Tue, 25 Dec 2018 16:46:23 +0100
> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel <cpitclaudel@gmail.com>, 
> 	Emacs developers <emacs-devel@gnu.org>
> 
> Hi, what's the state of the pdumper branch? Any chance we can merge it
> to master?

Last time this popped up:

  http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2018-12-25 17:21                                                                                                                                                         ` Eli Zaretskii
@ 2018-12-25 19:15                                                                                                                                                           ` Daniel Colascione
  2018-12-26 15:27                                                                                                                                                             ` Eli Zaretskii
  2019-01-07 21:37                                                                                                                                                             ` Daniel Colascione
  0 siblings, 2 replies; 375+ messages in thread
From: Daniel Colascione @ 2018-12-25 19:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Philipp Stephani, dancol, cpitclaudel, emacs-devel

>> From: Philipp Stephani <p.stephani2@gmail.com>
>> Date: Tue, 25 Dec 2018 16:46:23 +0100
>> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel
>> <cpitclaudel@gmail.com>,
>> 	Emacs developers <emacs-devel@gnu.org>
>>
>> Hi, what's the state of the pdumper branch? Any chance we can merge it
>> to master?
>
> Last time this popped up:
>
>   http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html
>

Yeah, it's about time we finally get around to doing this. I have some
time between now and the end of the year, and I'll rebase the work and
land it, assuming no new major objections.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2018-12-25 19:15                                                                                                                                                           ` Daniel Colascione
@ 2018-12-26 15:27                                                                                                                                                             ` Eli Zaretskii
  2019-01-07 21:37                                                                                                                                                             ` Daniel Colascione
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2018-12-26 15:27 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: p.stephani2, dancol, cpitclaudel, emacs-devel

> Date: Tue, 25 Dec 2018 11:15:57 -0800
> From: "Daniel Colascione" <dancol@dancol.org>
> Cc: Philipp Stephani <p.stephani2@gmail.com>, dancol@dancol.org,
> 	cpitclaudel@gmail.com, emacs-devel@gnu.org
> 
> Yeah, it's about time we finally get around to doing this. I have some
> time between now and the end of the year, and I'll rebase the work and
> land it, assuming no new major objections.

Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2018-12-25 19:15                                                                                                                                                           ` Daniel Colascione
  2018-12-26 15:27                                                                                                                                                             ` Eli Zaretskii
@ 2019-01-07 21:37                                                                                                                                                             ` Daniel Colascione
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
  1 sibling, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2019-01-07 21:37 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Eli Zaretskii, dancol, cpitclaudel, Philipp Stephani, emacs-devel

>>> From: Philipp Stephani <p.stephani2@gmail.com>
>>> Date: Tue, 25 Dec 2018 16:46:23 +0100
>>> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel
>>> <cpitclaudel@gmail.com>,
>>> 	Emacs developers <emacs-devel@gnu.org>
>>>
>>> Hi, what's the state of the pdumper branch? Any chance we can merge it
>>> to master?
>>
>> Last time this popped up:
>>
>>   http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html
>>
>
> Yeah, it's about time we finally get around to doing this. I have some
> time between now and the end of the year, and I'll rebase the work and
> land it, assuming no new major objections.

I'm still working on this, FWIW. I'd hoped it'd be a simple rebase, but
the vectorization of Lisp_Misc forced more changes than I'd thought.





^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-07 21:37                                                                                                                                                             ` Daniel Colascione
@ 2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
  2019-01-16  8:45                                                                                                                                                                 ` Tassilo Horn
                                                                                                                                                                                   ` (5 more replies)
  0 siblings, 6 replies; 375+ messages in thread
From: Daniel Colascione @ 2019-01-15 22:46 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Eli Zaretskii, Daniel Colascione, cpitclaudel, Philipp Stephani,
	emacs-devel

>>>> From: Philipp Stephani <p.stephani2@gmail.com>
>>>> Date: Tue, 25 Dec 2018 16:46:23 +0100
>>>> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel
>>>> <cpitclaudel@gmail.com>,
>>>> 	Emacs developers <emacs-devel@gnu.org>
>>>>
>>>> Hi, what's the state of the pdumper branch? Any chance we can merge it
>>>> to master?
>>>
>>> Last time this popped up:
>>>
>>>   http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html
>>>
>>
>> Yeah, it's about time we finally get around to doing this. I have some
>> time between now and the end of the year, and I'll rebase the work and
>> land it, assuming no new major objections.
>
> I'm still working on this, FWIW. I'd hoped it'd be a simple rebase, but
> the vectorization of Lisp_Misc forced more changes than I'd thought.

> Hi Daniel.
>> I'm still working on this, FWIW. I'd hoped it'd be a simple rebase, but
>> the vectorization of Lisp_Misc forced more changes than I'd thought.
>>
>
> Good to know you are working on this.

I landed pdumper. It works on my machine (tm)! Let me know about any
breakage. Plenty of people tested the old pdumper branch, but the changes
necessary to rebase it could use a good once over.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
@ 2019-01-16  8:45                                                                                                                                                                 ` Tassilo Horn
  2019-01-16 10:25                                                                                                                                                                 ` Robert Pluim
                                                                                                                                                                                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 375+ messages in thread
From: Tassilo Horn @ 2019-01-16  8:45 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

"Daniel Colascione" <dancol@dancol.org> writes:

> I landed pdumper. It works on my machine (tm)! Let me know about any
> breakage. Plenty of people tested the old pdumper branch, but the
> changes necessary to rebase it could use a good once over.

I'm not sure if it's related, but since today I encounter bug#34094.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
  2019-01-16  8:45                                                                                                                                                                 ` Tassilo Horn
@ 2019-01-16 10:25                                                                                                                                                                 ` Robert Pluim
  2019-01-16 11:58                                                                                                                                                                 ` Phillip Lord
                                                                                                                                                                                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 375+ messages in thread
From: Robert Pluim @ 2019-01-16 10:25 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Philipp Stephani, Eli Zaretskii, cpitclaudel, emacs-devel

"Daniel Colascione" <dancol@dancol.org> writes:


> I landed pdumper. It works on my machine (tm)! Let me know about any
> breakage. Plenty of people tested the old pdumper branch, but the changes
> necessary to rebase it could use a good once over.

It works for me on x86_64-apple-darwin18.2.0 and x86_64-pc-linux-gnu
(I needed 'make bootstrap', which I guess is not unexpected for such a
big change).

Robert



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
  2019-01-16  8:45                                                                                                                                                                 ` Tassilo Horn
  2019-01-16 10:25                                                                                                                                                                 ` Robert Pluim
@ 2019-01-16 11:58                                                                                                                                                                 ` Phillip Lord
  2019-01-18 12:46                                                                                                                                                                   ` Windows Binaries with pdumper Phillip Lord
  2019-01-16 12:00                                                                                                                                                                 ` Skipping unexec via a big .elc file Elias Mårtenson
                                                                                                                                                                                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 375+ messages in thread
From: Phillip Lord @ 2019-01-16 11:58 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Philipp Stephani, Eli Zaretskii, cpitclaudel, emacs-devel


I tried to build some Windows snapshot binaries. I am getting this
error on master:


CC       pdumper.o
../../../../git/master/src/pdumper.c: In function 'dump_cold_bignum':
../../../../git/master/src/pdumper.c:3447:53: error: conversion to 'mp_size_t {aka long int}' from 'size_t {aka long long unsigned int}' may alter its value [-Werror=conversion]
       mp_limb_t limb = mpz_getlimbn (bignum->value, i);
                                                     ^
cc1.exe: some warnings being treated as errors
make[1]: *** [Makefile:392: pdumper.o] Error 1
make[1]: Leaving directory '/home/Administrator/emacs-build/build/emacs-27.0.50-snapshot/x86_64/src'
make: *** [Makefile:423: src] Error 2
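
The error above comes from passing a size_t index where GMP expects an
mp_size_t, which the log reports as 'long int' (narrower than size_t on
64-bit Windows). One way to silence -Werror=conversion without hiding a real
overflow is a range-checked conversion. The sketch below is only an
illustration, not necessarily the fix applied on master; limb_index_t and
size_to_limb_index are hypothetical names, and limb_index_t stands in for
mp_size_t so the example builds without GMP.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for GMP's mp_size_t, reported as 'long int' in the log.  */
typedef long limb_index_t;

/* Hypothetical helper: if I fits in limb_index_t, store it in *OUT and
   return true; otherwise leave *OUT untouched and return false.  */
static bool
size_to_limb_index (size_t i, limb_index_t *out)
{
  if (i > (size_t) LONG_MAX)
    return false;
  *out = (limb_index_t) i;   /* Explicit cast: the range was checked above.  */
  return true;
}
```

Declaring the loop variable with the narrower type in the first place would
avoid the conversion entirely, at the cost of a check where the limb count is
computed.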


Phil



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Windows Binaries with pdumper
  2019-01-16 11:58                                                                                                                                                                 ` Phillip Lord
@ 2019-01-18 12:46                                                                                                                                                                   ` Phillip Lord
  2019-01-21 11:30                                                                                                                                                                     ` Jostein Kjønigsen
  0 siblings, 1 reply; 375+ messages in thread
From: Phillip Lord @ 2019-01-18 12:46 UTC (permalink / raw)
  To: emacs-devel


I've updated the snapshot binaries for Windows to the latest trunk,
which includes the pdumper.

https://alpha.gnu.org/gnu/emacs/pretest/windows/emacs-27/

Phil



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Windows Binaries with pdumper
  2019-01-18 12:46                                                                                                                                                                   ` Windows Binaries with pdumper Phillip Lord
@ 2019-01-21 11:30                                                                                                                                                                     ` Jostein Kjønigsen
  2019-01-21 14:19                                                                                                                                                                       ` Phillip Lord
  0 siblings, 1 reply; 375+ messages in thread
From: Jostein Kjønigsen @ 2019-01-21 11:30 UTC (permalink / raw)
  To: Phillip Lord, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 605 bytes --]

Hey Phil.

Thanks for the build. That's great news!

I've installed it and it seems to run just fine, while loading it with a
full config and doing some very basic tasks.
Are there any particular areas of this build you want tested?

--
Kind regards
Jostein Kjønigsen

jostein@kjonigsen.net 🍵 jostein@gmail.com
https://jostein.kjonigsen.net


On Fri, Jan 18, 2019, at 1:46 PM, Phillip Lord wrote:
> 
> I've updated the snapshot binaries for Windows to the latest trunk,
> which includes the pdumper.
> 
> https://alpha.gnu.org/gnu/emacs/pretest/windows/emacs-27/
> 
> Phil
> 


[-- Attachment #2: Type: text/html, Size: 1553 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Windows Binaries with pdumper
  2019-01-21 11:30                                                                                                                                                                     ` Jostein Kjønigsen
@ 2019-01-21 14:19                                                                                                                                                                       ` Phillip Lord
  0 siblings, 0 replies; 375+ messages in thread
From: Phillip Lord @ 2019-01-21 14:19 UTC (permalink / raw)
  To: Jostein Kjønigsen; +Cc: jostein, emacs-devel


I am not the best person to ask, since I neither use Windows nor have
much knowledge of the pdumper. The motivation for putting the
snapshots up was to get it into daily use by more people.

Phil

Jostein Kjønigsen <jostein@secure.kjonigsen.net> writes:

> Hey Phil.
>
> Thanks for the build. That's great news!
>
> I've installed it and it seems to run just fine, while loading it with a
> full config and doing some very basic tasks.
> Are there any particular areas of this build you want tested?
>
> --
> Kind regards
> Jostein Kjønigsen
>
> jostein@kjonigsen.net 🍵 jostein@gmail.com
> https://jostein.kjonigsen.net
>
>
> On Fri, Jan 18, 2019, at 1:46 PM, Phillip Lord wrote:
>> 
>> I've updated the snapshot binaries for Windows to the latest trunk,
>> which includes the pdumper.
>> 
>> https://alpha.gnu.org/gnu/emacs/pretest/windows/emacs-27/
>> 
>> Phil
>> 



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
                                                                                                                                                                                   ` (2 preceding siblings ...)
  2019-01-16 11:58                                                                                                                                                                 ` Phillip Lord
@ 2019-01-16 12:00                                                                                                                                                                 ` Elias Mårtenson
  2019-01-16 15:59                                                                                                                                                                 ` Eli Zaretskii
  2019-01-16 21:56                                                                                                                                                                 ` Clément Pit-Claudel
  5 siblings, 0 replies; 375+ messages in thread
From: Elias Mårtenson @ 2019-01-16 12:00 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Philipp Stephani, Eli Zaretskii, cpitclaudel, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 556 bytes --]

On Wed, 16 Jan 2019 at 06:47, Daniel Colascione <dancol@dancol.org> wrote:

> I landed pdumper. It works on my machine (tm)! Let me know about any
> breakage. Plenty of people tested the old pdumper branch, but the changes
> necessary to rebase it could use a good once over.
>

Very nice, I've waited for this. Thanks a lot. I'm using it right now, and
it seems to work fine.

A small comment. There seems to be a typo in the word “build” in NEWS:

“Emacs now needs an "emacs.pdmp" file, generated during the built”.

Regards,
Elias


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
                                                                                                                                                                                   ` (3 preceding siblings ...)
  2019-01-16 12:00                                                                                                                                                                 ` Skipping unexec via a big .elc file Elias Mårtenson
@ 2019-01-16 15:59                                                                                                                                                                 ` Eli Zaretskii
  2019-01-16 16:08                                                                                                                                                                   ` Daniel Colascione
  2019-01-16 21:56                                                                                                                                                                 ` Clément Pit-Claudel
  5 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2019-01-16 15:59 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> Date: Tue, 15 Jan 2019 14:46:03 -0800
> From: "Daniel Colascione" <dancol@dancol.org>
> Cc: "Daniel Colascione" <dancol@dancol.org>,
>  "Eli Zaretskii" <eliz@gnu.org>,
>  "Philipp Stephani" <p.stephani2@gmail.com>,
>  cpitclaudel@gmail.com,
>  emacs-devel@gnu.org
> 
> I landed pdumper.

Thanks.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-16 15:59                                                                                                                                                                 ` Eli Zaretskii
@ 2019-01-16 16:08                                                                                                                                                                   ` Daniel Colascione
  0 siblings, 0 replies; 375+ messages in thread
From: Daniel Colascione @ 2019-01-16 16:08 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel

>> Date: Tue, 15 Jan 2019 14:46:03 -0800
>> From: "Daniel Colascione" <dancol@dancol.org>
>> Cc: "Daniel Colascione" <dancol@dancol.org>,
>>  "Eli Zaretskii" <eliz@gnu.org>,
>>  "Philipp Stephani" <p.stephani2@gmail.com>,
>>  cpitclaudel@gmail.com,
>>  emacs-devel@gnu.org
>>
>> I landed pdumper.
>
> Thanks.

I should have a fix for the crash in detect_coding soon-ish. Who knew that
struct coding_system had function pointers? (This structure is _mostly_
ephemeral, reloaded from Lisp, but it's also persistent in some contexts.)
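
For readers wondering why function pointers are a problem here: a raw code
address saved in a dump need not be valid in the process that later loads the
dump. A common way around this, sketched below, is to persist a small
identifier and resolve it through a table at run time. This is an assumption
about the general technique, not a description of the actual fix; every name
in the sketch (detector_id, coding_record, run_detector and so on) is
hypothetical and unrelated to the real struct coding_system.

```c
#include <stddef.h>

/* Identifiers are stable across processes, unlike code addresses.  */
enum detector_id { DETECT_NONE, DETECT_ASCII, DETECT_COUNT };

typedef int (*detector_fn) (const unsigned char *, size_t);

/* Always reports "no match".  */
static int
detect_none (const unsigned char *s, size_t n)
{
  (void) s;
  (void) n;
  return 0;
}

/* Return 1 if all N bytes of S are 7-bit, else 0.  */
static int
detect_ascii (const unsigned char *s, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (s[i] >= 0x80)
      return 0;
  return 1;
}

/* Built fresh in every process, so it always holds valid addresses.  */
static const detector_fn detector_table[DETECT_COUNT] = {
  detect_none,
  detect_ascii,
};

/* The part that would be written to the dump: an id, not a pointer.  */
struct coding_record
{
  enum detector_id detector;
};

/* Resolve the id at call time; unknown ids report "no match".  */
static int
run_detector (const struct coding_record *r, const unsigned char *s, size_t n)
{
  if ((unsigned) r->detector >= DETECT_COUNT)
    return 0;
  return detector_table[r->detector] (s, n);
}
```

The alternative is to keep the pointers in the structure and have the loader
relocate or reinitialize them after the dump is mapped.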




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
                                                                                                                                                                                   ` (4 preceding siblings ...)
  2019-01-16 15:59                                                                                                                                                                 ` Eli Zaretskii
@ 2019-01-16 21:56                                                                                                                                                                 ` Clément Pit-Claudel
  5 siblings, 0 replies; 375+ messages in thread
From: Clément Pit-Claudel @ 2019-01-16 21:56 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, Philipp Stephani, emacs-devel

On 15/01/2019 17.46, Daniel Colascione wrote:
> I landed pdumper.

Congratulations!




^ permalink raw reply	[flat|nested] 375+ messages in thread

* compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-01 12:07                                                                                                                                                 ` Eli Zaretskii
  2017-05-18 17:39                                                                                                                                                   ` Daniel Colascione
@ 2017-05-21  8:44                                                                                                                                                   ` Ken Raeburn
  2017-05-21  8:53                                                                                                                                                     ` Paul Eggert
  2017-05-21 16:02                                                                                                                                                     ` John Wiegley
  1 sibling, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-05-21  8:44 UTC (permalink / raw)
  To: Emacs developers

I haven’t had much time to further the work on the big-elc approach recently, but there is one idea I want to toss out there for possibly improving the load time further: Changing the .elc file format to a binary one.  I’m not talking about a memory image like Daniel is working on.  I mean a file representing a sequence of S-expressions, but optimized for loading speed rather than for human readability.

The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc.  While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.

I haven’t got a complete, concrete proposal, but I see at least a couple general approaches possible:

1) Follow the model of flat object file formats: Some file sections have data of various types (string content, symbol names, integer or floating constants); others (the equivalent of standard object file “relocation” data) would provide info on how to allocate and fill in the container objects (pairs, vectors, etc) desired, with references to the symbols or strings or other container objects.

2) Continue to use the current recursive processing, but with a binary format.  Some (byte? word?) value indicates “this is string data”, it’s followed by a byte count and that many bytes of string content (always using the Emacs internal encoding, so we don’t have to translate when reading).  Another value indicates an integer constant.  Another value indicates a vector, and is followed by a length and then that many other values, which are each processed recursively before we get back to the object following the vector.  Each object’s initializer’s length is dependent on the type, and for container types, the values contained within.
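
As a rough illustration of approach 2 (not code from the experiment described below), a recursive reader over tagged, length-prefixed data might look like the following sketch.  The tag values, the 32-bit little-endian length encoding, and every name here (struct obj, read_obj, and so on) are assumptions made up for the example; a real reader would build Lisp objects rather than this toy tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tag bytes; the message above does not fix any values. */
enum { TAG_INT = 1, TAG_STRING = 2, TAG_VECTOR = 3 };

struct obj {
  int tag;
  uint32_t ival;         /* TAG_INT: the value */
  size_t len;            /* TAG_STRING: byte count; TAG_VECTOR: elements read */
  unsigned char *bytes;  /* TAG_STRING: raw content, copied as one block */
  struct obj **elts;     /* TAG_VECTOR: child objects */
};

struct cursor { const unsigned char *p, *end; };

/* Read a 32-bit little-endian count (an assumed encoding). */
static int read_u32(struct cursor *c, uint32_t *out)
{
  if ((size_t)(c->end - c->p) < 4) return -1;
  *out = (uint32_t)c->p[0] | (uint32_t)c->p[1] << 8
       | (uint32_t)c->p[2] << 16 | (uint32_t)c->p[3] << 24;
  c->p += 4;
  return 0;
}

static void free_obj(struct obj *o)
{
  if (!o) return;
  for (size_t i = 0; o->elts && i < o->len; i++) free_obj(o->elts[i]);
  free(o->elts);
  free(o->bytes);
  free(o);
}

/* Read one object; containers recurse before returning to the caller. */
static struct obj *read_obj(struct cursor *c)
{
  assert(c && c->p <= c->end);
  if (c->p == c->end) return NULL;
  struct obj *o = calloc(1, sizeof *o);
  if (!o) return NULL;
  o->tag = *c->p++;
  uint32_t n;
  switch (o->tag) {
  case TAG_INT:
    if (read_u32(c, &n) != 0) goto fail;
    o->ival = n;
    return o;
  case TAG_STRING:
    if (read_u32(c, &n) != 0 || (size_t)(c->end - c->p) < n) goto fail;
    o->bytes = malloc(n ? n : 1);
    if (!o->bytes) goto fail;
    memcpy(o->bytes, c->p, n);  /* one block copy, no per-character work */
    o->len = n;
    c->p += n;
    return o;
  case TAG_VECTOR:
    /* Every element needs at least a tag byte, so n cannot exceed
       the number of bytes left; this rejects absurd counts early. */
    if (read_u32(c, &n) != 0 || (size_t)(c->end - c->p) < n) goto fail;
    o->elts = calloc(n ? n : 1, sizeof *o->elts);
    if (!o->elts) goto fail;
    for (uint32_t i = 0; i < n; i++) {
      o->elts[i] = read_obj(c);
      if (!o->elts[i]) goto fail;
      o->len = i + 1;
    }
    return o;
  default:
    goto fail;
  }
fail:
  free_obj(o);
  return NULL;
}
```

The length of each initializer depends on its type, as described above: fixed for the integer, count plus content for the string, and count plus recursively read elements for the vector.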

Either way, getting away from the expensive one-character-at-a-time processing, multibyte coding, escape processing, etc., and pushing around groups of bytes whenever possible should save us time.

This would be useable not just for the dumped.elc file, but for other compiled Lisp files as well, whether in the distribution or from ELPA or the user’s own code.

I did throw together a half-baked attempt to try some of this out.  I added a new “#” construct for unibyte strings, putting the byte count into the file so that the string data could be copied with fread() instead of a READCHAR loop.  I also added a new version of the “#n#” syntax that uses a fixed number of READCHAR calls and avoids the decimal arithmetic.  So, the file can no longer be processed as Lisp, and it still requires some text parsing, though not nearly as much as before; some of the worst of both worlds.  But the load time for dumped.elc did drop by another 12% in my tests (start in batch mode, print a message and exit, from 0.227s down to 0.2s or less per run, still loading a couple of standard-elc-format files during startup).
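
The gain from putting the byte count in the file can be sketched as below.  This is not the actual lread.c change: the function names are hypothetical, and plain getc stands in for the READCHAR loop.  It only contrasts one fread call for a counted string with one call per byte.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read NBYTES of raw string content from FP with a single fread call.
   Returns a malloc'd, NUL-terminated buffer, or NULL on a short read
   or allocation failure.  (Hypothetical helper, not Emacs code.) */
static char *read_counted_bytes(FILE *fp, size_t nbytes)
{
  assert(fp != NULL);
  if (nbytes + 1 == 0) return NULL;   /* size would overflow */
  char *buf = malloc(nbytes + 1);
  if (!buf) return NULL;
  if (fread(buf, 1, nbytes, fp) != nbytes) { free(buf); return NULL; }
  buf[nbytes] = '\0';
  return buf;
}

/* Baseline for comparison: same result, but one getc call per byte,
   standing in for a character-at-a-time reader loop. */
static char *read_bytes_one_at_a_time(FILE *fp, size_t nbytes)
{
  assert(fp != NULL);
  if (nbytes + 1 == 0) return NULL;
  char *buf = malloc(nbytes + 1);
  if (!buf) return NULL;
  for (size_t i = 0; i < nbytes; i++) {
    int ch = getc(fp);
    if (ch == EOF) { free(buf); return NULL; }
    buf[i] = (char)ch;
  }
  buf[nbytes] = '\0';
  return buf;
}
```

Both functions return the same bytes; the first simply avoids the per-character call overhead, which is only possible when the count is known before the content is read.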

I’m curious if people think this might be an approach worth pursuing.  Or if the Lisp-based elc format is seen as advantageous in ways I’m not seeing….

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-21  8:44                                                                                                                                                   ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn
@ 2017-05-21  8:53                                                                                                                                                     ` Paul Eggert
  2017-05-28 11:07                                                                                                                                                       ` Ken Raeburn
  2017-05-21 16:02                                                                                                                                                     ` John Wiegley
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2017-05-21  8:53 UTC (permalink / raw)
  To: Ken Raeburn, Emacs developers

Ken Raeburn wrote:
> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc.  While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.

Although it does indeed sound like a big change, I don't see why it would 
prevent us from sharing the files across machines. Emacs can use standard ELF 
and DWARF format on any platform if Emacs is doing the loading. And there should 
be some software-engineering benefit in using the same format that Guile uses.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-21  8:53                                                                                                                                                     ` Paul Eggert
@ 2017-05-28 11:07                                                                                                                                                       ` Ken Raeburn
  2017-05-28 12:43                                                                                                                                                         ` Philipp Stephani
  2017-05-28 21:09                                                                                                                                                         ` Paul Eggert
  0 siblings, 2 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-05-28 11:07 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs developers

On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:

> Ken Raeburn wrote:
>> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc.  While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.
> 
> Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses.

Sorry for the delay in responding.

The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI.  Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use.  Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF.
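
To make the header fields concrete, here is a small sketch that pulls the word size, byte order, OS ABI, and machine type out of the first 20 bytes of an ELF file.  The offsets and constant values come from the ELF specification; the struct and function names are invented for this example, and the constants are defined locally rather than taken from <elf.h> so the sketch does not depend on a Linux-specific header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets into e_ident and class/data values, per the ELF specification. */
enum { EI_CLASS = 4, EI_DATA = 5, EI_OSABI = 7 };
enum { ELFCLASS32 = 1, ELFCLASS64 = 2 };
enum { ELFDATA2LSB = 1, ELFDATA2MSB = 2 };
enum { E_MACHINE_OFFSET = 18 };   /* e_machine follows e_ident[16] and e_type */

/* Hypothetical summary struct for this example. */
struct elf_ident {
  int bits;            /* 32 or 64 */
  int little_endian;   /* 1 or 0 */
  unsigned osabi;      /* e_ident[EI_OSABI] */
  unsigned machine;    /* e_machine; 0 is EM_NONE */
};

/* Parse the start of an ELF header from H (LEN bytes available).
   Returns 0 on success, -1 if this is not a recognizable ELF header. */
static int parse_elf_ident(const unsigned char *h, size_t len,
                           struct elf_ident *out)
{
  assert(h != NULL && out != NULL);
  if (len < 20) return -1;
  if (h[0] != 0x7f || h[1] != 'E' || h[2] != 'L' || h[3] != 'F') return -1;

  if (h[EI_CLASS] == ELFCLASS32) out->bits = 32;
  else if (h[EI_CLASS] == ELFCLASS64) out->bits = 64;
  else return -1;

  if (h[EI_DATA] == ELFDATA2LSB) out->little_endian = 1;
  else if (h[EI_DATA] == ELFDATA2MSB) out->little_endian = 0;
  else return -1;

  out->osabi = h[EI_OSABI];

  /* e_machine is a 16-bit field stored in the file's own byte order. */
  const unsigned char *m = h + E_MACHINE_OFFSET;
  out->machine = out->little_endian ? (unsigned)(m[0] | (m[1] << 8))
                                    : (unsigned)((m[0] << 8) | m[1]);
  return 0;
}
```

A loader has to read the byte-order field before it can interpret even the machine type, which is a small example of the platform dependence discussed here.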

I think Guile is using whatever the native word size and architecture are.  If we do that for Emacs, they’re not portable between platforms.  Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted.

We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files.  Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-28 11:07                                                                                                                                                       ` Ken Raeburn
@ 2017-05-28 12:43                                                                                                                                                         ` Philipp Stephani
  2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
  2017-05-28 21:09                                                                                                                                                         ` Paul Eggert
  1 sibling, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2017-05-28 12:43 UTC (permalink / raw)
  To: Ken Raeburn, Paul Eggert; +Cc: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 2561 bytes --]

Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:

>
> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
>
> > Ken Raeburn wrote:
> >> The Guile project has taken this idea pretty far; they’re generating
> ELF object files with a few special sections for Guile objects, using the
> standard DWARF sections for debug information, etc.  While it has a certain
> appeal (making C modules and Lisp files look much more similar, maybe being
> able to link Lisp and C together into one executable image, letting GDB
> understand some of your data), switching to a machine-specific format would
> be a pretty drastic change, when we can currently share the files across
> machines.
> >
> > Although it does indeed sound like a big change, I don't see why it
> would prevent us from sharing the files across machines. Emacs can use
> standard ELF and DWARF format on any platform if Emacs is doing the
> loading. And there should be some software-engineering benefit in using the
> same format that Guile uses.
>
> Sorry for the delay in responding.
>
> The ELF format has header fields indicating the word size, endianness,
> machine architecture (though there’s a value for “none”), and OS ABI.  Some
> fields vary in size or order depending on whether the 32-bit or 64-bit
> format is in use.  Some other format details (e.g., relocation types,
> interpretation of certain ranges of values in some fields) are
> architecture- or OS-dependent; we might not care about many of those
> details, but relocations are likely needed if we want to play linking games
> or use DWARF.
>
> I think Guile is using whatever the native word size and architecture
> are.  If we do that for Emacs, they’re not portable between platforms.
> Currently it works for me to put my Lisp files, both source and compiled,
> into ~/elisp and use them from different kinds of machines if my home
> directory is NFS-mounted.
>
> We could instead pick fixed values (say, architecture “none”,
> little-endian, 32-bit), but then there’s no guarantee that we could use any
> of the usual GNU tools on them without a bunch of work, or that we’d ever
> be able to use non-GNU tools to treat them as object files.  Then again, we
> couldn’t expect to do the latter portably anyway, since some of the
> platforms don’t even use ELF.
>
>
Is there any significant advantage of using ELF, or could this just use one
of the standard binary serialization formats (protobuf, flatbuffer, ...)?

[-- Attachment #2: Type: text/html, Size: 2873 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-28 12:43                                                                                                                                                         ` Philipp Stephani
@ 2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
  2017-07-02 15:46                                                                                                                                                             ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-05-29  9:33 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: Paul Eggert, Emacs developers

[-- Attachment #1: Type: text/plain, Size: 6033 bytes --]

On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote:

> 
> 
> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:
> 
> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
> 
> > Ken Raeburn wrote:
> >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc.  While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.
> >
> > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses.
> 
> Sorry for the delay in responding.
> 
> The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI.  Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use.  Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF.
> 
> I think Guile is using whatever the native word size and architecture are.  If we do that for Emacs, they’re not portable between platforms.  Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted.
> 
> We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files.  Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF.
> 
> 
> Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? 

That’s an interesting idea.  If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own.  It’ll need to handle data structures with circular or cross-linked references.  And we have the doc string delayed-loading optimization (that currently uses #$ and #@ syntaxes); presumably we’d like to keep that optimization in some form.  It would be good not to have to build all our data structures on ones generated by the tool with its own bookkeeping fields; having anything in a cons cell besides the “car” and “cdr” slots would mean a significant increase in memory use.

I initially said, “follow the model of flat object file formats”, not “use ELF”; ELF is just one way of organizing the data of an object file, with years of experience behind it, which we could use wholesale or borrow some lessons from.  One of the typical advantages of object file formats is that the data is grouped for efficient memory usage; some sections of a file will be mapped into the address space read-only (shared between processes), other sections read-write (possibly shared until copied on write), and others not mapped at all.  For example, we might put symbol names (normally never modified but it can be done), doc strings (to be loaded later, only if needed), byte code, and other strings into their own sections, and create Lisp_String objects and such pointing to those bytes as needed.  We don’t keep much in the way of source location information for Lisp code around, but if we ever change that, arguably it could go in a file section that’s not mapped or read until the debugger wants the information.

The Guile project’s documentation says their use of ELF is intended to build on existing work to invent a good object file format with several desired characteristics (https://www.gnu.org/software/guile/manual/html_node/Object-File-Format.html):

	• Above all else, it should be very cheap to load a compiled file.
	• It should be possible to statically allocate constants in the file. For example, a bytevector literal in source code can be emitted directly into the object file.
	• The compiled file should enable maximum code and data sharing between different processes.
	• The compiled file should contain debugging information, such as line numbers, but that information should be separated from the code itself. It should be possible to strip debugging information if space is tight.

They’re generating byte code currently, but are looking toward generating native code as well (instead?).

Their write-up implicitly assumes that, as with “normal” object files, the idea is to mmap the data into the address space, some of it read-only and some of it automatically getting some patching up, and then using those in-memory objects directly.  There’s no explicit discussion of the tradeoffs of loading a file all at once versus reading one object tree (S-expression) at a time from an input stream, but especially when mapping and using much of the data unmodified is feasible, I suspect the all-at-once approach is likely to be more efficient.  Whether that would be true in a case like Emacs, I don’t know.
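
For reference, the all-at-once approach on a POSIX system starts from something like the sketch below: map the whole file read-only and hand out pointers into the mapping instead of copying objects out of a stream.  The function name is hypothetical, this is not Guile's loader, and a real loader would also need the patching-up step mentioned above, which is omitted here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map PATH read-only into the address space (POSIX only).
   On success, returns the base address and stores the size in *LEN_OUT;
   returns NULL on failure or for an empty file.  The pages are shared
   with other processes mapping the same file until someone writes,
   which a PROT_READ mapping never does.  Hypothetical helper. */
static const void *map_file_readonly(const char *path, size_t *len_out)
{
  assert(path != NULL && len_out != NULL);
  int fd = open(path, O_RDONLY);
  if (fd < 0) return NULL;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size <= 0) {
    close(fd);
    return NULL;
  }

  void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);   /* the mapping remains valid after the descriptor is closed */
  if (p == MAP_FAILED) return NULL;

  *len_out = (size_t)st.st_size;
  return p;
}
```

The caller releases the mapping with munmap when the data is no longer needed.  Nothing is read from disk until a page is actually touched, so sections that are never consulted cost little beyond address space.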

They use DWARF for carrying some debug information, but so far I’m unsure what information is actually stored there.

Ken

[-- Attachment #2: Type: text/html, Size: 8822 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
@ 2017-07-02 15:46                                                                                                                                                             ` Philipp Stephani
  2017-07-03  1:44                                                                                                                                                               ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2017-07-02 15:46 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Paul Eggert, Emacs developers

[-- Attachment #1: Type: text/plain, Size: 3249 bytes --]

Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33:

>
> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote:
>
>
>
> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:
>
>>
>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
>>
>> > Ken Raeburn wrote:
>> >> The Guile project has taken this idea pretty far; they’re generating
>> ELF object files with a few special sections for Guile objects, using the
>> standard DWARF sections for debug information, etc.  While it has a certain
>> appeal (making C modules and Lisp files look much more similar, maybe being
>> able to link Lisp and C together into one executable image, letting GDB
>> understand some of your data), switching to a machine-specific format would
>> be a pretty drastic change, when we can currently share the files across
>> machines.
>> >
>> > Although it does indeed sound like a big change, I don't see why it
>> would prevent us from sharing the files across machines. Emacs can use
>> standard ELF and DWARF format on any platform if Emacs is doing the
>> loading. And there should be some software-engineering benefit in using the
>> same format that Guile uses.
>>
>> Sorry for the delay in responding.
>>
>> The ELF format has header fields indicating the word size, endianness,
>> machine architecture (though there’s a value for “none”), and OS ABI.  Some
>> fields vary in size or order depending on whether the 32-bit or 64-bit
>> format is in use.  Some other format details (e.g., relocation types,
>> interpretation of certain ranges of values in some fields) are
>> architecture- or OS-dependent; we might not care about many of those
>> details, but relocations are likely needed if we want to play linking games
>> or use DWARF.
>>
>> I think Guile is using whatever the native word size and architecture
>> are.  If we do that for Emacs, they’re not portable between platforms.
>> Currently it works for me to put my Lisp files, both source and compiled,
>> into ~/elisp and use them from different kinds of machines if my home
>> directory is NFS-mounted.
>>
>> We could instead pick fixed values (say, architecture “none”,
>> little-endian, 32-bit), but then there’s no guarantee that we could use any
>> of the usual GNU tools on them without a bunch of work, or that we’d ever
>> be able to use non-GNU tools to treat them as object files.  Then again, we
>> couldn’t expect to do the latter portably anyway, since some of the
>> platforms don’t even use ELF.
>>
>>
> Is there any significant advantage of using ELF, or could this just use
> one of the standard binary serialization formats (protobuf, flatbuffer,
> ...)?
>
>
> That’s an interesting idea.  If one of the popular serialization libraries
> is compatibly licensed, easy to use, and performs well, it may be better
> than rolling our own.
>

I've tried this out (with flatbuffers), but I haven't seen significant
speed improvements. It might very well be the case that during loading the
reader is already fast enough (e.g. for ELC files it doesn't do any
decoding), and it's the evaluator that's too slow.

[-- Attachment #2: Type: text/html, Size: 4253 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-07-02 15:46                                                                                                                                                             ` Philipp Stephani
@ 2017-07-03  1:44                                                                                                                                                               ` Ken Raeburn
  2017-09-24 13:57                                                                                                                                                                 ` Philipp Stephani
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-07-03  1:44 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: Paul Eggert, Emacs developers

[-- Attachment #1: Type: text/plain, Size: 3869 bytes --]


On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com> wrote:

> Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33:
> 
> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote:
> 
>> 
>> 
>> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:
>> 
>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
>> 
>> > Ken Raeburn wrote:
>> >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc.  While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.
>> >
>> > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses.
>> 
>> Sorry for the delay in responding.
>> 
>> The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI.  Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use.  Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF.
>> 
>> I think Guile is using whatever the native word size and architecture are.  If we do that for Emacs, they’re not portable between platforms.  Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted.
>> 
>> We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files.  Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF.
>> 
>> 
>> Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? 
> 
> That’s an interesting idea.  If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own.
> 
> I've tried this out (with flatbuffers), but I haven't seen significant speed improvements. It might very well be the case that during loading the reader is already fast enough (e.g. for ELC files it doesn't do any decoding), and it's the evaluator that's too slow.

What’s your test case, and how are you measuring the performance?

In my tests with the one big elc file, using the Linux “perf” tool, it seems that readchar, read1, encode_char, and ungetc are where a good chunk of CPU time is still spent — about 1/4 in my testing with the “big elc file” code.  My experiment in May cut down a chunk of the overall run time (start in batch mode, print a message, and exit) with some ugly reader syntax hacks. Tests with smaller files may have different characteristics though…

Ken

[-- Attachment #2: Type: text/html, Size: 6224 bytes --]

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-07-03  1:44                                                                                                                                                               ` Ken Raeburn
@ 2017-09-24 13:57                                                                                                                                                                 ` Philipp Stephani
  2017-09-27  8:31                                                                                                                                                                   ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2017-09-24 13:57 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Paul Eggert, Emacs developers

[-- Attachment #1: Type: text/plain, Size: 3926 bytes --]

Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 3 July 2017 at 03:44:

>
> On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com> wrote:
>
> Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33:
>
>>
>> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com>
>> wrote:
>>
>>
>>
>> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:
>>
>>>
>>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
>>>
>>> > Ken Raeburn wrote:
>>> >> The Guile project has taken this idea pretty far; they’re generating
>>> ELF object files with a few special sections for Guile objects, using the
>>> standard DWARF sections for debug information, etc.  While it has a certain
>>> appeal (making C modules and Lisp files look much more similar, maybe being
>>> able to link Lisp and C together into one executable image, letting GDB
>>> understand some of your data), switching to a machine-specific format would
>>> be a pretty drastic change, when we can currently share the files across
>>> machines.
>>> >
>>> > Although it does indeed sound like a big change, I don't see why it
>>> would prevent us from sharing the files across machines. Emacs can use
>>> standard ELF and DWARF format on any platform if Emacs is doing the
>>> loading. And there should be some software-engineering benefit in using the
>>> same format that Guile uses.
>>>
>>> Sorry for the delay in responding.
>>>
>>> The ELF format has header fields indicating the word size, endianness,
>>> machine architecture (though there’s a value for “none”), and OS ABI.  Some
>>> fields vary in size or order depending on whether the 32-bit or 64-bit
>>> format is in use.  Some other format details (e.g., relocation types,
>>> interpretation of certain ranges of values in some fields) are
>>> architecture- or OS-dependent; we might not care about many of those
>>> details, but relocations are likely needed if we want to play linking games
>>> or use DWARF.
>>>
>>> I think Guile is using whatever the native word size and architecture
>>> are.  If we do that for Emacs, they’re not portable between platforms.
>>> Currently it works for me to put my Lisp files, both source and compiled,
>>> into ~/elisp and use them from different kinds of machines if my home
>>> directory is NFS-mounted.
>>>
>>> We could instead pick fixed values (say, architecture “none”,
>>> little-endian, 32-bit), but then there’s no guarantee that we could use any
>>> of the usual GNU tools on them without a bunch of work, or that we’d ever
>>> be able to use non-GNU tools to treat them as object files.  Then again, we
>>> couldn’t expect to do the latter portably anyway, since some of the
>>> platforms don’t even use ELF.
>>>
>>>
>> Is there any significant advantage of using ELF, or could this just use
>> one of the standard binary serialization formats (protobuf, flatbuffer,
>> ...)?
>>
>>
>> That’s an interesting idea.  If one of the popular serialization
>> libraries is compatibly licensed, easy to use, and performs well, it may
>> be better than rolling our own.
>>
>
> I've tried this out (with flatbuffers), but I haven't seen significant
> speed improvements. It might very well be the case that during loading the
> reader is already fast enough (e.g. for ELC files it doesn't do any
> decoding), and it's the evaluator that's too slow.
>
>
> What’s your test case, and how are you measuring the performance?
>

IIRC I've repeatedly loaded one of the biggest .elc files shipped with
Emacs and measured the total loading time. I haven't done any detailed
profiling, since I was hoping for a significant speed increase that would
justify the work.
If people are generally interested in pursuing this further, I'd be happy
to put my code into a scratch branch.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-09-24 13:57                                                                                                                                                                 ` Philipp Stephani
@ 2017-09-27  8:31                                                                                                                                                                   ` Ken Raeburn
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-09-27  8:31 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: Paul Eggert, Emacs developers

On Sep 24, 2017, at 09:57, Philipp Stephani <p.stephani2@gmail.com> wrote:
> Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 3 July 2017 at 03:44:
>
> On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com> wrote:
>
>> Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33:
>>
>> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote:
>>
>>>
>>>
>>> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07:
>>>
>>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
>>> 
>>> > Ken Raeburn wrote:
>>> >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc.  While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.
>>> >
>>> > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses.
>>> 
>>> Sorry for the delay in responding.
>>> 
>>> The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI.  Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use.  Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF.
>>> 
>>> I think Guile is using whatever the native word size and architecture are.  If we do that for Emacs, they’re not portable between platforms.  Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted.
>>> 
>>> We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files.  Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF.
>>> 
>>> 
>>> Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? 
>> 
>> That’s an interesting idea.  If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own.
>> 
>> I've tried this out (with flatbuffers), but I haven't seen significant speed improvements. It might very well be the case that during loading the reader is already fast enough (e.g. for ELC files it doesn't do any decoding), and it's the evaluator that's too slow.
> 
> What’s your test case, and how are you measuring the performance?
> 
> IIRC I've repeatedly loaded one of the biggest .elc files shipped with Emacs and measured the total loading time. I haven't done any detailed profiling, since I was hoping for a significant speed increase that would justify the work.

It’ll depend on what the code in that file is doing.

In the raeburn-startup branch, the last bit of profiling I did showed nearly 1/3 of the CPU time of a simple run of Emacs in batch mode was spent reading and parsing the saved Lisp environment.  You can see a graph at http://www.mit.edu/~raeburn/emacs.svg; if you haven’t read up on flame graphs (http://www.brendangregg.com/flamegraphs.html), they provide a nice visualization of the CPU time consumption broken down by what the current call stack looks like.  Most of the rest of the CPU time was spent executing the loaded code (lots of fset and setplist calls), but the biggest chunk of that was executing a nested load of international/characters.elc; during that nested load, most of the time was spent in execution (mostly char table processing) and very little in parsing.

So… for the saved Lisp environment file, excluding the nested load, reading and parsing is about 2/3 of the CPU time used; for characters.elc, reading and parsing is a minuscule portion of the CPU time.

Loading a Lisp file internally uses the Lisp “read” routine, which requires an input stream of character values (not byte values) to be supplied; we examine the stream object and dispatch to various bits of code depending on its type (buffer, marker, function, certain special symbols), *for each character*.  Each byte is examined to see if it’s part of a multibyte character.  Each character is considered to see if it’s allowed to be part of a symbol name or string or whatever we’re in the middle of parsing, or if it’s a backslash quoting some other character, etc.

Hence my hopes for a non-text-based format, designed to streamline reading data from files, where we can do things like specify a vector length or string length up front instead of having to consider each character and process character quoting sequences, stuff like that.  E.g., here’s a unibyte string of 47 bytes, so just copy the bytes without considering every one separately.  No human-readable printed form, no escape sequences needed.
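
As a rough sketch of that difference (not code from the branch; the record layout, the tag value, and the names read_quoted and read_counted are assumptions made up for illustration), compare a reader that has to look at every byte of a quoted string with one that is told the length up front:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout (not an existing Emacs format):
   1 tag byte, a 4-byte little-endian length, then LEN raw bytes.  */
enum { TAG_UNIBYTE_STRING = 0x01 };

/* Text-style reader.  IN points just past the opening double quote.
   Every byte is examined so that backslash escapes and the closing
   quote can be recognized.  Returns the number of bytes stored in
   OUT, or -1 on error.  */
static long
read_quoted (const unsigned char *in, size_t in_len,
             unsigned char *out, size_t out_cap)
{
  size_t n = 0;
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < in_len; i++)
    {
      unsigned char c = in[i];
      if (c == '"')
        return (long) n;        /* closing quote ends the string */
      if (c == '\\')
        {
          if (++i == in_len)
            return -1;          /* dangling backslash */
          c = in[i];            /* take the quoted byte literally */
        }
      if (n == out_cap)
        return -1;              /* output buffer too small */
      out[n++] = c;
    }
  return -1;                    /* no closing quote found */
}

/* Binary-style reader.  The length is known before the payload is
   touched, so the payload is copied with a single memcpy.  Returns
   the number of bytes stored in OUT, or -1 on error.  */
static long
read_counted (const unsigned char *in, size_t in_len,
              unsigned char *out, size_t out_cap)
{
  assert (in != NULL && out != NULL);
  if (in_len < 5 || in[0] != TAG_UNIBYTE_STRING)
    return -1;
  uint32_t len = (uint32_t) in[1]
                 | (uint32_t) in[2] << 8
                 | (uint32_t) in[3] << 16
                 | (uint32_t) in[4] << 24;
  if (len > in_len - 5 || len > out_cap)
    return -1;
  memcpy (out, in + 5, len);
  return (long) len;
}
```

Both functions produce the same bytes for the same string; the second one simply never needs to ask whether a given byte is a quote or a backslash.  Whether that saves enough to matter is exactly what would need measuring.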

Another help might be finding a faster way to load the character data.  I’ve got the branch loading characters.elc at startup because saving and parsing the generated tables was even slower than evaluating the Lisp code to generate them.  Perhaps we can do some processing of them during the build and convert them into some other form that lets us start up faster.

> If people are generally interested in pursuing this further, I'd be happy to put my code into a scratch branch.

I’d be curious to take a look…

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-28 11:07                                                                                                                                                       ` Ken Raeburn
  2017-05-28 12:43                                                                                                                                                         ` Philipp Stephani
@ 2017-05-28 21:09                                                                                                                                                         ` Paul Eggert
  2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2017-05-28 21:09 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Emacs developers

Ken Raeburn wrote:
> I think Guile is using whatever the native word size and architecture are.  If we do that for Emacs, they’re not portable between platforms.

Sure, but we're talking about the format Emacs uses to save its state, not the 
format of .elc files. Currently Emacs saves its state as an executable file that 
in general cannot be moved from one GNU/Linux distribution to another even if 
they have the same architecture. Switching to Guile's platform-neutral approach 
would make Emacs's saved-state format more portable, not less.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-28 21:09                                                                                                                                                         ` Paul Eggert
@ 2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
  2017-05-29 16:37                                                                                                                                                             ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2017-05-29  9:33 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Emacs developers

On May 28, 2017, at 17:09, Paul Eggert <eggert@cs.ucla.edu> wrote:

> Ken Raeburn wrote:
>> I think Guile is using whatever the native word size and architecture are.  If we do that for Emacs, they’re not portable between platforms.
> 
> Sure, but we're talking about the format Emacs uses to save its state, not the format of .elc files. Currently Emacs saves its state as an executable file that in general cannot be moved from one GNU/Linux distribution to another even if they have the same architecture. Switching to Guile's platform-neutral approach would make Emacs's saved-state format more portable, not less.

Actually, I was referring to compiled-Lisp files generally, including the “dumped.elc” file, when I suggested it.

And I wouldn’t describe Guile’s “ELF everywhere” approach as entirely platform-neutral.  I built a Guile tree tonight to take a look.  My guess earlier about using the native architecture was wrong (it uses “none”), but it appears that the generated files are specific to the host’s byte order and word size.  So some sharing is possible between similar platforms, but not across all as with the current .elc format.
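
To illustrate what being specific to byte order and word size means for a loader, here is a minimal sketch in C that inspects the identification bytes at the start of an ELF file.  The offsets and values are those of the standard ELF header; the struct and function names (elf_traits, elf_probe, elf_matches_host) are hypothetical, and this is not how Guile or Emacs actually does it:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Positions and values within the ELF identification bytes.  */
enum { IDX_CLASS = 4, IDX_DATA = 5, IDENT_SIZE = 16 };
enum { CLASS_32 = 1, CLASS_64 = 2 };
enum { DATA_LSB = 1, DATA_MSB = 2 };

struct elf_traits
{
  int word_bits;        /* 32 or 64 */
  int little_endian;    /* 1 if little-endian, 0 if big-endian */
  unsigned machine;     /* e_machine; 0 means "none" */
};

/* Fill OUT from the first LEN bytes of a file.  Returns 0 on
   success, -1 if the bytes do not look like an ELF header.  */
static int
elf_probe (const unsigned char *hdr, size_t len, struct elf_traits *out)
{
  assert (hdr != NULL && out != NULL);
  if (len < IDENT_SIZE + 4)
    return -1;
  if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
    return -1;

  if (hdr[IDX_CLASS] == CLASS_32)
    out->word_bits = 32;
  else if (hdr[IDX_CLASS] == CLASS_64)
    out->word_bits = 64;
  else
    return -1;

  if (hdr[IDX_DATA] == DATA_LSB)
    out->little_endian = 1;
  else if (hdr[IDX_DATA] == DATA_MSB)
    out->little_endian = 0;
  else
    return -1;

  /* e_type occupies bytes 16..17 and e_machine bytes 18..19, both
     stored in the byte order announced above.  */
  out->machine = out->little_endian
                 ? (unsigned) hdr[18] | (unsigned) hdr[19] << 8
                 : (unsigned) hdr[18] << 8 | (unsigned) hdr[19];
  return 0;
}

/* Return 1 if a file with traits T has the same word size and byte
   order as the machine running this code, else 0.  */
static int
elf_matches_host (const struct elf_traits *t)
{
  const unsigned one = 1;
  int host_little = *(const unsigned char *) &one == 1;
  return t->word_bits == (int) (sizeof (void *) * CHAR_BIT)
         && t->little_endian == host_little;
}
```

A file with machine "none" can still fail elf_matches_host on a machine with a different word size or byte order, which is the limited kind of sharing described above.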

Even saving just the Lisp state as with “dumped.elc”, I think there could be state from the environment or build options that varies across distributions.  Lists of supported image types, distro customizations, things like that.  I’m not sure what benefit there is in trying to share saved Emacs state across distros.  If the goal is for a user to save a massively customized environment for future invocations, perhaps we should just work on speeding up the loading of the customizations.

If we want a standardized object/executable format specifically for the preloaded environment, perhaps using the native format by way of the C compiler is a better choice.  I think this may have come up in the discussion before.  The big loss there is the ability to create a new saved environment without having a C compiler handy, but it seems like a thing few people are likely to want to do, and even fewer non-developers who might not be able to install a compiler.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
@ 2017-05-29 16:37                                                                                                                                                             ` Paul Eggert
  2017-05-29 17:39                                                                                                                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2017-05-29 16:37 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Emacs developers

Ken Raeburn wrote:

> And I wouldn’t describe Guile’s “ELF everywhere” approach as entirely platform-neutral.

That's correct. It's more portable than what Emacs currently does, but it's less 
portable than saving state in .elc format.

> Even saving just the Lisp state as with “dumped.elc”, I think there could be state from the environment or build options that varies across distributions.

Yes, quite true. Even with "dumped.elc" or with any of the other methods 
proposed, it would be quite difficult to make the saved state portable to any 
platform. That kind of portability should not be our goal.

> If we want standardized object/executable format specifically for the preloaded environment, perhaps using the native format by way of the C compiler is a better choice.  I think this may have come up in the discussion before.  The big loss there is the ability to create a new saved environment without having a C compiler handy, but it seems like a thing few people are likely to want to do, and even fewer non-developers who might not be able to install a compiler.

Yes, this is my preferred solution too; I was the one who made that suggestion. 
Although Eli didn't like the idea at the time, perhaps there will come a day 
when we revisit it. It should be faster than even Guile's ELF-based loading, 
which is already plenty fast.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29 16:37                                                                                                                                                             ` Paul Eggert
@ 2017-05-29 17:39                                                                                                                                                               ` Eli Zaretskii
  2017-05-29 18:03                                                                                                                                                                 ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-05-29 17:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: raeburn, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 29 May 2017 09:37:15 -0700
> Cc: Emacs developers <emacs-devel@gnu.org>
> 
> > If we want standardized object/executable format specifically for the preloaded environment, perhaps using the native format by way of the C compiler is a better choice.  I think this may have come up in the discussion before.  The big loss there is the ability to create a new saved environment without having a C compiler handy, but it seems like a thing few people are likely to want to do, and even fewer non-developers who might not be able to install a compiler.
> 
> Yes, this is my preferred solution too; I was the one who made that suggestion. 
> Although Eli didn't like the idea at the time, perhaps there will come a day 
> when we revisit it. It should be faster than even Guile's ELF-based loading, 
> which is already plenty fast.

I have no doubt it will be faster.  My problem with this alternative
is that I believe maintaining it will need experts on C and C
compilers that we cannot rely on having available.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29 17:39                                                                                                                                                               ` Eli Zaretskii
@ 2017-05-29 18:03                                                                                                                                                                 ` Paul Eggert
  2017-05-29 18:53                                                                                                                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2017-05-29 18:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, emacs-devel

Eli Zaretskii wrote:
> My problem with this alternative
> is that I believe maintaining it will need experts on C and C
> compilers that we cannot rely on having available.

I don't see why we'd need any more expertise in C than we already require. The 
Emacs core is written in C, and one must be expert in C to maintain it. The C 
code that would be output would be simple -- much simpler than what we are 
already maintaining.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29 18:03                                                                                                                                                                 ` Paul Eggert
@ 2017-05-29 18:53                                                                                                                                                                   ` Eli Zaretskii
  2017-05-29 20:15                                                                                                                                                                     ` Paul Eggert
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2017-05-29 18:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: raeburn, emacs-devel

> Cc: raeburn@raeburn.org, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 29 May 2017 11:03:38 -0700
> 
> Eli Zaretskii wrote:
> > My problem with this alternative
> > is that I believe maintaining it will need experts on C and C
> > compilers that we cannot rely on having available.
> 
> I don't see why we'd need any more expertise in C than we already require. The 
> Emacs core is written in C, and one must be expert in C to maintain it. The C 
> code that would be output would be simple -- much simpler than what we are 
> already maintaining.

This idea requires _generating_ C, something we don't currently do,
AFAIK.

As for whether it will be simple, I reserve my judgment, since no code
was presented to demonstrate the idea for some reasonably complex Lisp
data.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29 18:53                                                                                                                                                                   ` Eli Zaretskii
@ 2017-05-29 20:15                                                                                                                                                                     ` Paul Eggert
  2017-05-30  5:52                                                                                                                                                                       ` Ken Raeburn
  2017-05-30  5:55                                                                                                                                                                       ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Paul Eggert @ 2017-05-29 20:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: raeburn, emacs-devel

Eli Zaretskii wrote:
> This idea requires _generating_ C, something we don't currently do

Actually, Emacs generates and reformats C code all the time, e.g., when editing 
and reindenting it. And the Emacs build procedure generates plenty of C code, 
e.g., lib/stdlib.h. It is not a stretch to assume enough C expertise to deal 
with this sort of thing.

> As for whether it will be simple, I reserve my judgment

That's discouraging.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29 20:15                                                                                                                                                                     ` Paul Eggert
@ 2017-05-30  5:52                                                                                                                                                                       ` Ken Raeburn
  2017-05-30  5:55                                                                                                                                                                       ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-05-30  5:52 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, emacs-devel

Ah, yes, I remember this part of the discussion now… sorry, didn’t mean to stir up the same old argument again.

The expertise question would be an issue for attempting to adopt “ELF everywhere” in some fashion too, especially if we tried to do something interesting with using DWARF to store some kind of debug info.

On May 29, 2017, at 16:15, Paul Eggert <eggert@cs.ucla.edu> wrote:

> Eli Zaretskii wrote:
>> This idea requires _generating_ C, something we don't currently do
> 
> Actually, Emacs generates and reformats C code all the time, e.g., when editing and reindenting it. And the Emacs build procedure generates plenty of C code, e.g., lib/stdlib.h. It is not a stretch to assume enough C expertise to deal with this sort of thing.
> 
>> As for whether it will be simple, I reserve my judgment
> 
> That's discouraging.

I dunno, sounds like an invitation to produce an implementation that shows how straightforward it can be.  But, I had other things I was planning to work on tonight…

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-29 20:15                                                                                                                                                                     ` Paul Eggert
  2017-05-30  5:52                                                                                                                                                                       ` Ken Raeburn
@ 2017-05-30  5:55                                                                                                                                                                       ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2017-05-30  5:55 UTC (permalink / raw)
  To: Paul Eggert; +Cc: raeburn, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 29 May 2017 13:15:28 -0700
> Cc: raeburn@raeburn.org, emacs-devel@gnu.org
> 
> Eli Zaretskii wrote:
> > This idea requires _generating_ C, something we don't currently do
> 
> Actually, Emacs generates and reformats C code all the time, e.g., when editing 
> and reindenting it. And the Emacs build procedure generates plenty of C code, 
> e.g., lib/stdlib.h. It is not a stretch to assume enough C expertise to deal 
> with this sort of thing.

I can only say I disagree.

> > As for whether it will be simple, I reserve my judgment
> 
> That's discouraging.

I'm sorry if you feel like that, but I don't see why: we are
discussing hypothetical code, and I have no idea what it will look
like.  I just don't want to opine about something I never saw.  I
think that's reasonable.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file)
  2017-05-21  8:44                                                                                                                                                   ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn
  2017-05-21  8:53                                                                                                                                                     ` Paul Eggert
@ 2017-05-21 16:02                                                                                                                                                     ` John Wiegley
  1 sibling, 0 replies; 375+ messages in thread
From: John Wiegley @ 2017-05-21 16:02 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: Emacs developers

>>>>> "KR" == Ken Raeburn <raeburn@raeburn.org> writes:

KR> I haven’t had much time to further the work on the big-elc approach
KR> recently, but there is one idea I want to toss out there for possibly
KR> improving the load time further: Changing the .elc file format to a binary
KR> one. I’m not talking about a memory image like Daniel is working on. I
KR> mean a file representing a sequence of S-expressions, but optimized for
KR> loading speed rather than for human readability.

I would like to see this; I can't think of a reason not to encode the
information in the best format for loading.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-07  7:28                                                                                                                             ` Eli Zaretskii
  2017-04-07  9:02                                                                                                                               ` Ken Raeburn
@ 2017-04-07 13:23                                                                                                                               ` Stefan Monnier
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2017-04-07 13:23 UTC (permalink / raw)
  To: emacs-devel

> Perhaps we could have a separate, much smaller dumped.elc for batch
> invocations, to cater to these use cases.  Ken, does this make sense?

FWIW, I'm not sure if there is much to save there.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2017-04-03 16:15                                                                                                                 ` Ken Raeburn
  2017-04-03 16:57                                                                                                                   ` Alan Mackenzie
@ 2017-04-10 16:19                                                                                                                   ` Ken Raeburn
  1 sibling, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2017-04-10 16:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Apr 3, 2017, at 12:15, Ken Raeburn <raeburn@raeburn.org> wrote:

> On Mar 31, 2017, at 04:40, Ken Raeburn <raeburn@raeburn.org> wrote:
> 
>> 
>> On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote:
>>> 
>>> This fixes the problem, and Emacs now starts OK, so the abbrevs issue
>>> is also solved.
>> 
>> Great!
>> 
>>> I think you should push all the changes you asked me to apply as
>>> patches.
>> 
>> Will do, probably this weekend.
> 
> Looks like the abbrev change isn’t actually working right… I got the quoting wrong, so the abbrev tables are constructed as (mostly) proper abbrev tables, and in the right order, but the “:parent” properties are bad. Working on fixing it up….

I’ve finally gotten what I think is a fixed version pushed to the branch, which should get the parent links right between abbrev tables.

It’s also got a fix to a problem I keep hitting doing parallel bootstrap builds.  (I goofed slightly on the log message though, got the name of the temp file wrong.)  I like to try to do bootstrap builds when testing and before pushing changes, so hopefully future fixes won’t be so tediously slow to check.

I’m still cleaning up my list of open issues, and will probably check that in in the branch’s admin/notes directory.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-23 16:44                                   ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier
  2016-10-23 17:34                                     ` Eli Zaretskii
@ 2016-10-24 18:34                                     ` Lars Brinkhoff
  2016-10-24 19:52                                       ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Lars Brinkhoff @ 2016-10-24 18:34 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:
> FWIW, I just did a quick experiment with the patch below which dumps
> the state of Emacs's obarray after loadup.el into a big "dumped.elc"
> file.  [...]  So even if there might be ways to speed this up, it
> doesn't look too promising.

I suppose it's obvious that this dumped.elc can't easily be converted to
a c file which is compiled and linked into the final emacs.  For the
benefit of me and perhaps others that would otherwise waste time on
this, could someone just briefly explain why?




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: Skipping unexec via a big .elc file
  2016-10-24 18:34                                     ` Lars Brinkhoff
@ 2016-10-24 19:52                                       ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 19:52 UTC (permalink / raw)
  To: Lars Brinkhoff; +Cc: emacs-devel

> From: Lars Brinkhoff <lars@nocrew.org>
> Date: Mon, 24 Oct 2016 20:34:49 +0200
> 
> I suppose it's obvious that this dumped.elc can't easily be converted to
> a c file which is compiled and linked into the final emacs.

It can.  More accurately, we could implement a back-end to the unexec
process that generates a C source file.  That was Paul's suggestion.

I consider this option less desirable for several reasons:

  . writing and maintaining such a C back-end would be non-trivial,
    and would require good control of portable C programming,
    something that most of our contributors lack

  . it requires a C compiler, i.e. end-users cannot dump their own
    customized Emacs without having a compiler and linker installed




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23  2:37                               ` Paul Eggert
  2016-10-23  6:53                                 ` Eli Zaretskii
@ 2016-10-23 12:55                                 ` Stefan Monnier
  2016-10-23 14:28                                   ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 12:55 UTC (permalink / raw)
  To: emacs-devel

> I suppose you're right that we don't need to; we could instead hack on Emacs
> to get it to work without ralloc on recent glibc.

I thought it's just a matter of saying "don't use ralloc" (i.e. the use
of ralloc is only an optimization hack to try and avoid fragmentation
problems).


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 12:55                                 ` When should ralloc.c be used? Stefan Monnier
@ 2016-10-23 14:28                                   ` Stefan Monnier
  2016-10-23 14:57                                     ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 14:28 UTC (permalink / raw)
  To: emacs-devel

> I thought it's just a matter of saying "don't use ralloc" (i.e. the use
> of ralloc is only an optimization hack to try and avoid fragmentation
> problems).

And AFAICT we should just never use ralloc because the rest of Emacs's
code is actually not prepared to deal with the implications, and trying
to fix it is not only a lot of work, but would make the code less
maintainable.  I'd rather live with the fragmentation.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 14:28                                   ` Stefan Monnier
@ 2016-10-23 14:57                                     ` Eli Zaretskii
  2016-10-23 15:07                                       ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 14:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 10:28:22 -0400
> 
> I'd rather live with the fragmentation.

You mean, use gmalloc without ralloc?  Is that feasible?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 14:57                                     ` Eli Zaretskii
@ 2016-10-23 15:07                                       ` Stefan Monnier
  2016-10-23 15:44                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 15:07 UTC (permalink / raw)
  To: emacs-devel

>> I'd rather live with the fragmentation.
> You mean, use gmalloc without ralloc?

Not only when we use gmalloc but always.  I suggest we get rid of ralloc.c.

> Is that feasible?

Why not?


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 15:07                                       ` Stefan Monnier
@ 2016-10-23 15:44                                         ` Eli Zaretskii
  2016-10-23 16:30                                           ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 15:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 11:07:23 -0400
> 
> >> I'd rather live with the fragmentation.
> > You mean, use gmalloc without ralloc?
> 
> Not only when we use gmalloc but always.  I suggest we get rid of ralloc.c.
> 
> > Is that feasible?
> 
> Why not?

I don't think we ever used such a configuration.  Is modern sbrk good
enough for gmalloc?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 15:44                                         ` Eli Zaretskii
@ 2016-10-23 16:30                                           ` Stefan Monnier
  2016-10-23 16:45                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 16:30 UTC (permalink / raw)
  To: emacs-devel

> I don't think we ever used such a configuration.  Is modern sbrk good
> enough for gmalloc?

Why not?


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 16:30                                           ` Stefan Monnier
@ 2016-10-23 16:45                                             ` Eli Zaretskii
  2016-10-23 16:49                                               ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 16:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 12:30:32 -0400
> 
> > I don't think we ever used such a configuration.  Is modern sbrk good
> > enough for gmalloc?
> 
> Why not?

"Why not" is never a useful answer.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 16:45                                             ` Eli Zaretskii
@ 2016-10-23 16:49                                               ` Stefan Monnier
  2016-10-23 17:35                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 16:49 UTC (permalink / raw)
  To: emacs-devel

>> > I don't think we ever used such a configuration.  Is modern sbrk good
>> > enough for gmalloc?
>> Why not?
> "Why not" is never a useful answer.

It just means that I really see no reason why it wouldn't work just fine.
It's not like glibc's malloc was particularly magical, so we should be
able to do the same in gmalloc.c.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 16:49                                               ` Stefan Monnier
@ 2016-10-23 17:35                                                 ` Eli Zaretskii
  2016-10-23 20:23                                                   ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 17:35 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 12:49:29 -0400
> 
> >> > I don't think we ever used such a configuration.  Is modern sbrk good
> >> > enough for gmalloc?
> >> Why not?
> > "Why not" is never a useful answer.
> 
> It just means that I really see no reason why it wouldn't work just fine.
> It's not like glibc's malloc was particularly magical, so we should be
> able to do the same in gmalloc.c.

AFAIK, glibc's malloc doesn't use sbrk anymore.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 17:35                                                 ` Eli Zaretskii
@ 2016-10-23 20:23                                                   ` Stefan Monnier
  2016-10-23 20:33                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 20:23 UTC (permalink / raw)
  To: emacs-devel

>> It just means that I really see no reason why it wouldn't work just fine.
>> It's not like glibc's malloc was particularly magical, so we should be
>> able to do the same in gmalloc.c.
> AFAIK, glibc's malloc doesn't use sbrk anymore.

I don't think it matters very much since we use mmap for the buffers,
which is the main source of fragmentation otherwise, AFAIK.

And if it proves to really be a problem, we could replace our gmalloc.c
with a more recent one which builds on mmap.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 20:23                                                   ` Stefan Monnier
@ 2016-10-23 20:33                                                     ` Eli Zaretskii
  2016-10-23 20:44                                                       ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-23 20:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 16:23:38 -0400
> 
> >> It just means that I really see no reason why it wouldn't work just fine.
> >> It's not like glibc's malloc was particularly magical, so we should be
> >> able to do the same in gmalloc.c.
> > AFAIK, glibc's malloc doesn't use sbrk anymore.
> 
> I don't think it matters very much since we use mmap for the buffers,

No, we don't, not on GNU/Linux anyway.  Or do you see
USE_MMAP_FOR_BUFFERS defined to 1 in your src/config.h?

> And if it proves to really be a problem, we could replace our gmalloc.c
> with a more recent one which builds on mmap.

That's a possibility, yes.  But someone would have to bring such a
gmalloc, and probably leave the current one as well, to minimize the
impact on unaffected platforms.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 20:33                                                     ` Eli Zaretskii
@ 2016-10-23 20:44                                                       ` Stefan Monnier
  2016-10-24  5:11                                                         ` Paul Eggert
  2016-10-24  6:59                                                         ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-23 20:44 UTC (permalink / raw)
  To: emacs-devel

>> I don't think it matters very much since we use mmap for the buffers,
> No, we don't, not on GNU/Linux anyway.

AFAIK the decision not to use mmap was due to the fact that glibc's
malloc itself uses mmap.  But if we don't use glibc's malloc, then why
wouldn't we decide to use mmap ourselves for the buffers?

> Or do you see USE_MMAP_FOR_BUFFERS defined to 1 in your src/config.h?

I'm not sure how to interpret what I see.  On Debian stable I see:

  Should Emacs use the GNU version of malloc?             yes
      (Using Doug Lea's new malloc from the GNU C Library.)
  Should Emacs use a relocating allocator for buffers?    no
  Should Emacs use mmap(2) for buffer allocation?         no

and on Debian testing I see:

  Should Emacs use the GNU version of malloc?             no (only before dumping)
  Should Emacs use a relocating allocator for buffers?    no
  Should Emacs use mmap(2) for buffer allocation?         no

so, in neither case do I see REL_ALLOC enabled.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 20:44                                                       ` Stefan Monnier
@ 2016-10-24  5:11                                                         ` Paul Eggert
  2016-10-24 12:33                                                           ` Stefan Monnier
  2016-10-24  6:59                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-24  5:11 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

Stefan Monnier wrote:
> in neither case do I see REL_ALLOC enabled.

It looks like you are using the master branch. I think Eli is worried more 
urgently about the emacs-25 branch. For emacs-25 with bleeding-edge glibc, I 
would expect:

   Should Emacs use the GNU version of malloc?             yes
   Should Emacs use a relocating allocator for buffers?    yes
   Should Emacs use mmap(2) for buffer allocation?         no

because emacs-25 will compile both gmalloc.o and ralloc.o on such a platform.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24  5:11                                                         ` Paul Eggert
@ 2016-10-24 12:33                                                           ` Stefan Monnier
  2016-10-24 13:05                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 12:33 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

>   Should Emacs use the GNU version of malloc?             yes
>   Should Emacs use a relocating allocator for buffers?    yes
>   Should Emacs use mmap(2) for buffer allocation?         no
> because emacs-25 will compile both gmalloc.o and ralloc.o on such a platform.

But I fail to see what's hard about changing that to "rel_alloc=no, mmap=yes".


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 12:33                                                           ` Stefan Monnier
@ 2016-10-24 13:05                                                             ` Eli Zaretskii
  2016-10-24 14:12                                                               ` Stefan Monnier
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 13:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 24 Oct 2016 08:33:10 -0400
> Cc: emacs-devel@gnu.org
> 
> >   Should Emacs use the GNU version of malloc?             yes
> >   Should Emacs use a relocating allocator for buffers?    yes
> >   Should Emacs use mmap(2) for buffer allocation?         no
> > because emacs-25 will compile both gmalloc.o and ralloc.o on such a platform.
> 
> But I fail to see what's hard about changing that to "rel_alloc=no, mmap=yes".

Why do we need mmap at all?  Why not just use malloc (as implemented
by gmalloc)?

Using mmap has disadvantages: when you need to enlarge buffer text,
and that fails (because there are no more free pages/addresses after
the already allocated region), we need to copy buffer text to the new
allocation.  This happens quite a lot when we visit a compressed
buffer.  (The MS-Windows emulation of mmap in w32heap.c reserves twice
the number of pages as originally requested, for that very reason.)
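
To make the failure mode concrete, here is a minimal sketch of
enlarging an mmap'ed region.  It is not the code Emacs uses; the name
grow_mapping is hypothetical, and the sketch assumes anonymous private
mappings and lengths that are multiples of the page size.  It first
asks for the pages right after the region, and copies only when the
system places them somewhere else.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper, for illustration only.  Grow the anonymous
   mapping at OLD from OLD_LEN to NEW_LEN bytes.  Both lengths are
   assumed to be multiples of the page size.  Return the (possibly
   moved) region, or NULL if no memory could be obtained.  */
void *
grow_mapping (void *old, size_t old_len, size_t new_len)
{
  char *want = (char *) old + old_len;
  size_t extra = new_len - old_len;

  /* Ask for the pages right after the region.  Without MAP_FIXED the
     address is only a hint, so the mapping may land elsewhere.  */
  void *tail = mmap (want, extra, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (tail == (void *) want)
    return old;                 /* Grown in place, nothing to copy.  */
  if (tail != MAP_FAILED)
    munmap (tail, extra);       /* Wrong place, give it back.  */

  /* Fall back to a fresh region and copy the old contents over.
     This is the expensive path described above.  */
  void *fresh = mmap (NULL, new_len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (fresh == MAP_FAILED)
    return NULL;
  memcpy (fresh, old, old_len);
  munmap (old, old_len);
  return fresh;
}
```

Whether the in-place path succeeds depends on what the system has
already mapped after the region, which is why reserving extra pages
up front, as w32heap.c does, reduces the number of copies.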

So if we can stop using ralloc without also using mmap directly for
buffer text, that'd be a win, I think.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 13:05                                                             ` Eli Zaretskii
@ 2016-10-24 14:12                                                               ` Stefan Monnier
  2016-10-24 16:00                                                                 ` Eli Zaretskii
  2016-10-24 14:37                                                               ` Stefan Monnier
  2016-10-25  3:12                                                               ` Ken Raeburn
  2 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 14:12 UTC (permalink / raw)
  To: emacs-devel

>> But I fail to see what's hard about changing that to "rel_alloc=no,
>> mmap=yes".
> Why do we need mmap at all?  Why not just use malloc (as implemented
> by gmalloc)?

AFAIU the reason we use ralloc is because of memory fragmentation, and
mmap brings similar benefits.  Maybe we don't need either of them, but
at least at some point in the past the fragmentation issue was
sufficient to convince people to write that code.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 14:12                                                               ` Stefan Monnier
@ 2016-10-24 16:00                                                                 ` Eli Zaretskii
  2016-10-24 18:51                                                                   ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 16:00 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 24 Oct 2016 10:12:27 -0400
> 
> >> But I fail to see what's hard about changing that to "rel_alloc=no,
> >> mmap=yes".
> > Why do we need mmap at all?  Why not just use malloc (as implemented
> > by gmalloc)?
> 
> AFAIU the reason we use ralloc is because of memory fragmentation, and
> mmap brings similar benefits.

But we have successfully used the glibc's malloc, without mmap, for
years without any sign of fragmentation problems.  So these
fragmentation problems are not as bad as they sound, at least in one
malloc implementation.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:00                                                                 ` Eli Zaretskii
@ 2016-10-24 18:51                                                                   ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 18:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> But we have successfully used the glibc's malloc, without mmap, for
> years without any sign of fragmentation problems.

Yes, glibc's malloc was good enough.  I'm not sure that applies to
gmalloc.c.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 13:05                                                             ` Eli Zaretskii
  2016-10-24 14:12                                                               ` Stefan Monnier
@ 2016-10-24 14:37                                                               ` Stefan Monnier
  2016-10-24 15:40                                                                 ` Eli Zaretskii
  2016-10-25  3:12                                                               ` Ken Raeburn
  2 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 14:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

> Using mmap has disadvantages: when you need to enlarge buffer text,
> and that fails (because there are no more free pages/addresses after
> the already allocated region), we need to copy buffer text to the new
> allocation.

All allocators suffer from this problem.  I haven't seen any evidence
that the mmap-based allocation code is significantly more prone to it.

Also, the glibc allocators used mmap internally when allocating
large-ish chunks (e.g. for buffer text), so if that was a problem, we
would have noticed, I think.

> (The MS-Windows emulation of mmap in w32heap.c reserves twice the
> number of pages as originally requested, for that very reason.)

Indeed, if this problem proves significant, there are fairly easy ways
to reduce its impact, such as using the kind of approach you mention.

Another advantage of using mmap is that it can return the memory to the
OS once you kill your large buffer, whereas with gmalloc+ralloc this
basically never happens, AFAIK.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 14:37                                                               ` Stefan Monnier
@ 2016-10-24 15:40                                                                 ` Eli Zaretskii
  2016-10-24 16:27                                                                   ` Daniel Colascione
  2016-10-24 18:45                                                                   ` Stefan Monnier
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 15:40 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eggert, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 10:37:19 -0400
> 
> > Using mmap has disadvantages: when you need to enlarge buffer text,
> > and that fails (because there are no more free pages/addresses after
> > the already allocated region), we need to copy buffer text to the new
> > allocation.
> 
> All allocators suffer from this problem.  I haven't seen any evidence
> that the mmap-based allocation code is significantly more prone to it.

I have seen that.  The native glibc malloc, the one GNU/Linux systems
were using until we got screwed by the recent glibc, didn't have this
problem, while mmap-based allocator did.  Don't ask me how glibc does
it, I don't know; but the fact is there.  This was discovered when the
Windows mmap emulation in w32heap.c was developed and tested.

> Also, the glibc allocators used mmap internally when allocating
> large-ish chunks (e.g. for buffer text), so if that was a problem, we
> would have noticed, I think.

True; but they somehow work around the problem.

> Another advantage of using mmap is that it can return the memory to the
> OS once you kill your large buffer, whereas with gmalloc+ralloc this
> basically never happens, AFAIK.

Not entirely true: ralloc calls the system sbrk with a negative
argument when it feels like it.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 15:40                                                                 ` Eli Zaretskii
@ 2016-10-24 16:27                                                                   ` Daniel Colascione
  2016-10-24 16:57                                                                     ` Eli Zaretskii
                                                                                       ` (4 more replies)
  2016-10-24 18:45                                                                   ` Stefan Monnier
  1 sibling, 5 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24 16:27 UTC (permalink / raw)
  To: Eli Zaretskii, Stefan Monnier; +Cc: eggert, emacs-devel

On 10/24/2016 08:40 AM, Eli Zaretskii wrote:
>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org
>> Date: Mon, 24 Oct 2016 10:37:19 -0400
>>
>>> Using mmap has disadvantages: when you need to enlarge buffer text,
>>> and that fails (because there are no more free pages/addresses after
>>> the already allocated region), we need to copy buffer text to the new
>>> allocation.

64-bit address spaces are *huge*. What about just making every buffer 
allocation 2GB long or so, marked PROT_NONE? You don't actually have to 
commit all that memory --- all you've done is set aside that address 
space. But because you've set aside so much address space, you'll very 
likely be able to expand the actual allocation region (a subset of the 
reserved region) as much as you want.
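
A rough sketch of that reserve-then-commit idea, assuming a POSIX
system with anonymous mmap.  The names reserve_region and
commit_prefix are made up for illustration, and how a given kernel
accounts for the reserved range is not something this sketch settles.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: set aside SIZE bytes of address space that
   cannot be read or written yet.  Return NULL on failure.  */
void *
reserve_region (size_t size)
{
  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_NORESERVE
  /* Where available, ask that no swap be reserved for the range.  */
  flags |= MAP_NORESERVE;
#endif
  void *base = mmap (NULL, size, PROT_NONE, flags, -1, 0);
  return base == MAP_FAILED ? NULL : base;
}

/* Hypothetical helper: make the first LEN bytes of a reserved region
   usable.  LEN is assumed to be a multiple of the page size.  Return
   0 on success, -1 on failure, like mprotect.  */
int
commit_prefix (void *base, size_t len)
{
  return mprotect (base, len, PROT_READ | PROT_WRITE);
}
```

Growing a buffer then means calling commit_prefix with a larger
length; the base address never changes, so nothing has to be copied as
long as the buffer fits in the reservation.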

>> All allocators suffer from this problem.  I haven't seen any evidence
>> that the mmap-based allocation code is significantly more prone to it.
>
> I have seen that.  The native glibc malloc, the one GNU/Linux systems
> were using until we got screwed by the recent glibc, didn't have this
> problem, while mmap-based allocator did.  Don't ask me how glibc does
> it, I don't know; but the fact is there.  This was discovered when the
> Windows mmap emulation in w32heap.c was developed and tested.
>
>> Also, the glibc allocators used mmap internally when allocating
>> large-ish chunks (e.g. for buffer text), so if that was a problem, we
>> would have noticed, I think.
>
> True; but they somehow work around the problem.
>
>> Another advantage of using mmap is that it can return the memory to the
>> OS once you kill your large buffer, whereas with gmalloc+ralloc this
>> basically never happens, AFAIK.
>
> Not entirely true: ralloc calls the system sbrk with a negative
> argument when it feels like it.

You can also madvise(MADV_DONTNEED, ...) regions *inside* the heap that 
contain only freed memory. This procedure also returns memory to the 
operating system.
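
For reference, the actual call takes the address first and the advice
last.  A minimal sketch, with a hypothetical helper name
(release_pages) and assuming the range is page-aligned and contains
only freed memory:

```c
#define _DEFAULT_SOURCE         /* for madvise on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel that the page-aligned range
   [START, START + LEN) holds no live data.  The mapping itself stays
   valid and can be written to again later.  Return 0 on success, -1
   on failure, like madvise.  */
int
release_pages (void *start, size_t len)
{
  return madvise (start, len, MADV_DONTNEED);
}
```

On Linux, for private anonymous mappings, later reads of the range see
zero-filled pages; other systems may treat the advice differently, so
the allocator must not expect the old contents to survive.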



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:27                                                                   ` Daniel Colascione
@ 2016-10-24 16:57                                                                     ` Eli Zaretskii
  2016-10-25  2:34                                                                     ` Richard Stallman
                                                                                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 16:57 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel

> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 24 Oct 2016 09:27:43 -0700
> 
> >>> Using mmap has disadvantages: when you need to enlarge buffer text,
> >>> and that fails (because there are no more free pages/addresses after
> >>> the already allocated region), we need to copy buffer text to the new
> >>> allocation.
> 
> 64-bit address spaces are *huge*. What about just making every buffer 
> allocation 2GB long or so, marked PROT_NONE? You don't actually have to 
> commit all that memory --- all you've done is set aside that address 
> space. But because you've set aside so much address space, you'll very 
> likely be able to expand the actual allocation region (a subset of the 
> reserved region) as much as you want.

Sounds OK, although I'm not an expert on that.  But in any case, these
ideas are not baked enough to be applied to the release branch, if we
want to release Emacs 25.2 soon (as in "in a couple of months").

(Of course, there's always the case of a file larger than 2GB, it's
not unheard of, although still quite rare.)



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:27                                                                   ` Daniel Colascione
  2016-10-24 16:57                                                                     ` Eli Zaretskii
@ 2016-10-25  2:34                                                                     ` Richard Stallman
  2016-10-25 14:13                                                                     ` Stefan Monnier
                                                                                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-25  2:34 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eliz, eggert, monnier, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > 64-bit address spaces are *huge*. What about just making every buffer 
  > allocation 2GB long or so, marked PROT_NONE?

Does Linux handle such sparseness efficiently?
I don't know.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:27                                                                   ` Daniel Colascione
  2016-10-24 16:57                                                                     ` Eli Zaretskii
  2016-10-25  2:34                                                                     ` Richard Stallman
@ 2016-10-25 14:13                                                                     ` Stefan Monnier
  2016-10-25 14:14                                                                     ` Stefan Monnier
  2016-10-28  6:03                                                                     ` Jérémie Courrèges-Anglas
  4 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-25 14:13 UTC (permalink / raw)
  To: emacs-devel

> 64-bit address spaces are *huge*. What about just making every buffer
> allocation 2GB long or so, marked PROT_NONE?

Won't be sufficient for 3GB buffers, obviously ;-)


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:27                                                                   ` Daniel Colascione
                                                                                       ` (2 preceding siblings ...)
  2016-10-25 14:13                                                                     ` Stefan Monnier
@ 2016-10-25 14:14                                                                     ` Stefan Monnier
  2016-10-28  6:03                                                                     ` Jérémie Courrèges-Anglas
  4 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-25 14:14 UTC (permalink / raw)
  To: emacs-devel

>> Not entirely true: ralloc calls the system sbrk with a negative
>> argument when it feels like it.
> You can also madvise(MADV_DONTNEED, ...) regions *inside* the heap that
> contain only freed memory. This procedure also returns memory to the
> operating system.

ralloc.c doesn't do this currently.


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 16:27                                                                   ` Daniel Colascione
                                                                                       ` (3 preceding siblings ...)
  2016-10-25 14:14                                                                     ` Stefan Monnier
@ 2016-10-28  6:03                                                                     ` Jérémie Courrèges-Anglas
  2016-10-28  6:23                                                                       ` Daniel Colascione
  4 siblings, 1 reply; 375+ messages in thread
From: Jérémie Courrèges-Anglas @ 2016-10-28  6:03 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, eggert, Stefan Monnier, emacs-devel

Daniel Colascione <dancol@dancol.org> writes:

> On 10/24/2016 08:40 AM, Eli Zaretskii wrote:
>>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>>> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org
>>> Date: Mon, 24 Oct 2016 10:37:19 -0400
>>>
>>>> Using mmap has disadvantages: when you need to enlarge buffer text,
>>>> and that fails (because there are no more free pages/addresses after
>>>> the already allocated region), we need to copy buffer text to the new
>>>> allocation.
>
> 64-bit address spaces are *huge*. What about just making every buffer
> allocation 2GB long or so, marked PROT_NONE? You don't actually have to
> commit all that memory --- all you've done is set aside that address
> space.

IIUC you suggest relying on memory overcommit.  That doesn't sound
portable at all.  Not all OSes do overcommit and the ones that do
generally provide a way to disable it.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE




* Re: When should ralloc.c be used?
  2016-10-28  6:03                                                                     ` Jérémie Courrèges-Anglas
@ 2016-10-28  6:23                                                                       ` Daniel Colascione
  2016-10-28  7:09                                                                         ` Jérémie Courrèges-Anglas
  2016-10-28  7:46                                                                         ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-28  6:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, Stefan Monnier, emacs-devel

jca@wxcvbn.org (Jérémie Courrèges-Anglas) writes:

> Daniel Colascione <dancol@dancol.org> writes:
>
>> On 10/24/2016 08:40 AM, Eli Zaretskii wrote:
>>>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>>>> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org
>>>> Date: Mon, 24 Oct 2016 10:37:19 -0400
>>>>
>>>>> Using mmap has disadvantages: when you need to enlarge buffer text,
>>>>> and that fails (because there are no more free pages/addresses after
>>>>> the already allocated region), we need to copy buffer text to the new
>>>>> allocation.
>>
>> 64-bit address spaces are *huge*. What about just making every buffer
>> allocation 2GB long or so, marked PROT_NONE? You don't actually have to
>> commit all that memory --- all you've done is set aside that address
>> space.
>
> IIUC you suggest relying on memory overcommit.  That doesn't sound
> portable at all.  Not all OSes do overcommit and the ones who do
> generally provide a way to disable it.

You understand incorrectly. "Overcommit" is the practice of allowing an
operating system to lie about how much memory it's guaranteed to give
applications in the future.  We're not talking about guaranteed
memory. We're talking about setting aside address space only, not asking
the OS to make guarantees about future memory availability.  All major
operating systems, even ones like Windows that don't do overcommit,
provide ways to reserve address space without asking the OS to guarantee
availability of memory.

That said, my idea probably isn't the best --- but it doesn't rely
on overcommit.




* Re: When should ralloc.c be used?
  2016-10-28  6:23                                                                       ` Daniel Colascione
@ 2016-10-28  7:09                                                                         ` Jérémie Courrèges-Anglas
  2016-10-28  7:46                                                                         ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Jérémie Courrèges-Anglas @ 2016-10-28  7:09 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: Eli Zaretskii, eggert, Stefan Monnier, emacs-devel

Daniel Colascione <dancol@dancol.org> writes:

> jca@wxcvbn.org (Jérémie Courrèges-Anglas) writes:
>
>> Daniel Colascione <dancol@dancol.org> writes:
>>
>>> On 10/24/2016 08:40 AM, Eli Zaretskii wrote:
>>>>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>>>>> Cc: eggert@cs.ucla.edu,  emacs-devel@gnu.org
>>>>> Date: Mon, 24 Oct 2016 10:37:19 -0400
>>>>>
>>>>>> Using mmap has disadvantages: when you need to enlarge buffer text,
>>>>>> and that fails (because there are no more free pages/addresses after
>>>>>> the already allocated region), we need to copy buffer text to the new
>>>>>> allocation.
>>>
>>> 64-bit address spaces are *huge*. What about just making every buffer
>>> allocation 2GB long or so, marked PROT_NONE? You don't actually have to
>>> commit all that memory --- all you've done is set aside that address
>>> space.
>>
>> IIUC you suggest relying on memory overcommit.  That doesn't sound
>> portable at all.  Not all OSes do overcommit and the ones who do
>> generally provide a way to disable it.
>
> You understand incorrectly. "Overcommit" is the practice of allowing an
> operating system to lie about how much memory it's guaranteed to give
> applications in the future.  We're not talking about guaranteed
> memory. We're talking about setting aside address space only, not asking
> the OS to make guarantees about future memory availability.  All major
> operating systems, even ones like Windows that don't do overcommit,
> provide ways to reserve address space without asking the OS to guarantee
> availability of memory.

Can you point at some documentation regarding those techniques?  I fail
to find one that would work on my "non-major", mostly POSIX OS.

> That said, my idea probably isn't the best --- but it doesn't rely
> on overcommit.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE




* Re: When should ralloc.c be used?
  2016-10-28  6:23                                                                       ` Daniel Colascione
  2016-10-28  7:09                                                                         ` Jérémie Courrèges-Anglas
@ 2016-10-28  7:46                                                                         ` Eli Zaretskii
  2016-10-28  8:11                                                                           ` Daniel Colascione
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  7:46 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel

> From: Daniel Colascione <dancol@dancol.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
> Date: Thu, 27 Oct 2016 23:23:05 -0700
> 
> We're talking about setting aside address space only, not asking
> the OS to make guarantees about future memory availability.  All major
> operating systems, even ones like Windows that don't do overcommit,
> provide ways to reserve address space without asking the OS to guarantee
> availability of memory.

Not sure I understand you: if a portion of the address space has been
reserved, how come these addresses won't be available when we try to
commit them later?  There might not be physical pages available for
that, but virtual memory for those addresses must be available, no?




* Re: When should ralloc.c be used?
  2016-10-28  7:46                                                                         ` Eli Zaretskii
@ 2016-10-28  8:11                                                                           ` Daniel Colascione
  2016-10-28  8:27                                                                             ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-28  8:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On 10/28/2016 12:46 AM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  eggert@cs.ucla.edu,  emacs-devel@gnu.org
>> Date: Thu, 27 Oct 2016 23:23:05 -0700
>>
>> We're talking about setting aside address space only, not asking
>> the OS to make guarantees about future memory availability.  All major
>> operating systems, even ones like Windows that don't do overcommit,
>> provide ways to reserve address space without asking the OS to guarantee
>> availability of memory.
>
> Not sure I understand you: if a portion of the address space has been
> reserved, how come these addresses won't be available when we try to
> commit them later?  There might not be physical pages available for
> that, but virtual memory for those addresses must be available, no?

I'm not sure I understand what you're confused about, so I'll try a 
broader explanation.

Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the 
initial mapping, that address space is unavailable for other uses. But 
because the page protections are PROT_NONE, my program has no legal 
right to access that page, so the OS doesn't have to guarantee that it 
can find a physical page to back that page I've mmaped. In this state, 
the memory is reserved.

The 20GB PROT_NONE address space reservation itself requires very little 
memory. It's just a note in the kernel's VM interval tree that says "the 
addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is

Now imagine I change the protections to PROT_READ|PROT_WRITE --- once 
the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right 
to access that page; under a strict accounting scheme (that is, without 
overcommit), the OS has to guarantee that it'll be able to go find a 
physical page to back that virtual page. In this state, the memory is 
committed -- the kernel has committed to finding backing storage for 
that page at some point when the current process tries to access it.

Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. 
I can write a program that reserves 20GB of address space. That's fine. 
The kernel isn't promising to give you 20GB of memory: it's just setting 
aside address space. Now if I attempt to map 20GB PROT_READ|PROT_WRITE, on any 
reasonable (i.e., not overcommit) system, mmap should fail, since 
there's no way a system with 1GB of RAM and 1GB of swap can promise to 
provide 20GB of private memory.

Overcommit confuses the issue: the kernel will _commit_ to as much 
memory as you ask it for and then renege on that commitment when it 
finds it convenient. An overcommit system with 1GB of RAM and 1GB of 
swap will happily let you make that 20GB PROT_READ|PROT_WRITE mapping. 
It'll just kill you after you use more than 2GB of that mapping. A 
non-overcommit system understands how to say "sorry, I can't let you do 
that" up front. On a non-overcommit system, your process will never be 
killed for accessing memory that the kernel told the process in advance 
that it could use.
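
To see which of these regimes a given Linux box is in, something like
the following sketch could be used. It assumes the Linux-specific file
/proc/sys/vm/overcommit_memory; the function names are hypothetical.

```c
#include <stdio.h>

/* Read the Linux overcommit policy.  Returns 0 (heuristic overcommit),
   1 (always overcommit), 2 (strict accounting), or -1 if the file
   cannot be read or parsed, e.g. on a non-Linux system.  */
static int
overcommit_mode (void)
{
  FILE *f = fopen ("/proc/sys/vm/overcommit_memory", "r");
  int mode = -1;

  if (!f)
    return -1;
  if (fscanf (f, "%d", &mode) != 1)
    mode = -1;
  fclose (f);
  return mode;
}

/* Human-readable name for a value returned by overcommit_mode.  */
static const char *
overcommit_mode_name (int mode)
{
  switch (mode)
    {
    case 0: return "heuristic overcommit";
    case 1: return "always overcommit";
    case 2: return "strict accounting";
    default: return "unknown";
    }
}
```

A caller would print overcommit_mode_name (overcommit_mode ()); only
mode 2 gives the "sorry, I can't let you do that" behavior up front.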

I think Jérémie is working with a mental model where every memory 
mapping is commit, and only an overcommit system allows you to commit 
more memory than the system actually has. Any system will let you 
reserve more address space than you have available commit: reservations 
are cheap.

(In Windows, the corresponding concepts are MEM_RESERVE and MEM_COMMIT. 
Windows is much more explicit about the difference between memory 
reservations and memory commitments than Linux is, because most of the 
time, Linux users don't care about the difference.)
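
A rough sketch of how the two vocabularies line up, assuming a thin
wrapper with hypothetical names (region_reserve, region_commit,
region_release). The Windows branch uses VirtualAlloc and VirtualFree;
the other branch uses mmap and mprotect as described above.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#ifdef _WIN32
# include <windows.h>
#else
# include <sys/mman.h>
#endif

/* Set aside SIZE bytes of address space without committing memory.
   Returns the base address, or NULL on failure.  */
static void *
region_reserve (size_t size)
{
#ifdef _WIN32
  return VirtualAlloc (NULL, size, MEM_RESERVE, PAGE_NOACCESS);
#else
  void *p = mmap (NULL, size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? NULL : p;
#endif
}

/* Commit SIZE bytes at ADDR, which must be page-aligned and lie inside
   a reserved region.  Returns 0 on success, -1 on failure; failure is
   legal even though the reservation succeeded.  */
static int
region_commit (void *addr, size_t size)
{
#ifdef _WIN32
  return VirtualAlloc (addr, size, MEM_COMMIT, PAGE_READWRITE) ? 0 : -1;
#else
  return mprotect (addr, size, PROT_READ | PROT_WRITE);
#endif
}

/* Give back the whole reservation made by region_reserve.  */
static void
region_release (void *base, size_t size)
{
#ifdef _WIN32
  (void) size;                  /* MEM_RELEASE requires a size of 0 */
  VirtualFree (base, 0, MEM_RELEASE);
#else
  munmap (base, size);
#endif
}
```

The point of the split is that only region_commit asks the system for
memory; region_reserve consumes address space and a little bookkeeping.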


* Re: When should ralloc.c be used?
  2016-10-28  8:11                                                                           ` Daniel Colascione
@ 2016-10-28  8:27                                                                             ` Eli Zaretskii
  2016-10-28  8:44                                                                               ` Daniel Colascione
  2016-10-28 11:40                                                                               ` Jérémie Courrèges-Anglas
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  8:27 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 28 Oct 2016 01:11:08 -0700
> 
> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the 
> initial mapping, that address space is unavailable for other uses. But 
> because the page protections are PROT_NONE, my program has no legal 
> right to access that page, so the OS doesn't have to guarantee that it 
> can find a physical page to back that page I've mmaped. In this state, 
> the memory is reserved.
> 
> The 20GB PROT_NONE address space reservation itself requires very little 
> memory. It's just a note in the kernel's VM interval tree that says "the 
> addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is
> 
> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once 
> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right 
> to access that page; under a strict accounting scheme (that is, without 
> overcommit), the OS has to guarantee that it'll be able to go find a 
> physical page to back that virtual page. In this state, the memory is 
> committed -- the kernel has committed to finding backing storage for 
> that page at some point when the current process tries to access it.

I'm with you up to here.  My question is whether PROT_READ|PROT_WRITE
call could fail after PROT_NONE succeeded.  You seem to say it could;
I thought it couldn't.

> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. 
> I can write a program that reserves 20GB of address space.

I thought such a reservation should fail, because you don't have
enough virtual memory for 20GB of addresses.  IOW, I thought the
ability to reserve address space is restricted by the actual amount of
virtual memory available on the system at the time of the call.  You
seem to say I was wrong.




* Re: When should ralloc.c be used?
  2016-10-28  8:27                                                                             ` Eli Zaretskii
@ 2016-10-28  8:44                                                                               ` Daniel Colascione
  2016-10-28  9:43                                                                                 ` Eli Zaretskii
  2016-10-28 11:40                                                                               ` Jérémie Courrèges-Anglas
  1 sibling, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-28  8:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On 10/28/2016 01:27 AM, Eli Zaretskii wrote:
>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 28 Oct 2016 01:11:08 -0700
>>
>> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the
>> initial mapping, that address space is unavailable for other uses. But
>> because the page protections are PROT_NONE, my program has no legal
>> right to access that page, so the OS doesn't have to guarantee that it
>> can find a physical page to back that page I've mmaped. In this state,
>> the memory is reserved.
>>
>> The 20GB PROT_NONE address space reservation itself requires very little
>> memory. It's just a note in the kernel's VM interval tree that says "the
>> addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is
>>
>> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once
>> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right
>> to access that page; under a strict accounting scheme (that is, without
>> overcommit), the OS has to guarantee that it'll be able to go find a
>> physical page to back that virtual page. In this state, the memory is
>> committed -- the kernel has committed to finding backing storage for
>> that page at some point when the current process tries to access it.
>
> I'm with you up to here.  My question is whether PROT_READ|PROT_WRITE
> call could fail after PROT_NONE succeeded.  You seem to say it could;
> I thought it couldn't.

Yes, it can fail. This program just failed on my system, which is a 
strict accounting (echo 2 > /proc/sys/vm/overcommit_memory) Linux box 
with much less than 100GB total commit available.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <errno.h>

size_t GB = (size_t) 1024 * 1024 * 1024;
int
main()
{
     size_t sz = 100*GB;
     void* mem = mmap(NULL, sz, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
     if (mem == MAP_FAILED) {
         fprintf(stderr, "map failed: %s\n", strerror(errno));
         return 1;
     }

     if (mprotect(mem, sz, PROT_READ|PROT_WRITE)) {
         fprintf(stderr, "mprotect failed: %s\n", strerror(errno));
         return 1;
     }

     fprintf(stderr, "mprotect worked\n");
     return 0;
}

>> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap.
>> I can write a program that reserves 20GB of address space.
>
> I thought such a reservation should fail, because you don't have
> enough virtual memory for 20GB of addresses.  IOW, I thought the
> ability to reserve address space is restricted by the actual amount of
> virtual memory available on the system at the time of the call.  You
> seem to say I was wrong.

I'm not sure you're even wrong :-) What does "virtual memory" mean to 
you? I'm not sure what you have in mind maps to any of the concepts I'm 
using.

When we allocate memory, we can consume two resources: address space and 
commit. That 100GB mmap above doesn't consume virtual memory, but it 
does consume address space. Address space is a finite resource, but 
usually much larger than commit, which is the sum of RAM and swap space. 
When you commit a page, the resource you're consuming is commit.

(Technically, the 100GB mapping consumes enough real memory for the OS 
to remember you've set aside that address space, but it's usually a 
negligible book-keeping note. On my system, I can make sz equal to 80TB 
or so before the mmap starts to fail: that's about the size of the 
address space range dictated by amd64 processor design.)

(In a 32-bit process on modern systems, it's frequently the case that 
you have more commit on the system than any one process has address space.)
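
The address-space limit mentioned above can be probed with a sketch
along these lines. It assumes a POSIX system with MAP_ANONYMOUS;
largest_reservation is a hypothetical name, and it only tries powers of
two, so it gives a lower bound rather than the exact limit.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Return the largest power-of-two size, in bytes, for which a single
   PROT_NONE reservation succeeds, or 0 if even 1MB cannot be reserved.
   Nothing is committed: each probe is unmapped immediately.  */
static size_t
largest_reservation (void)
{
  size_t best = 0;

  /* The shift eventually overflows to 0, which ends the loop.  */
  for (size_t sz = (size_t) 1 << 20; sz != 0; sz <<= 1)
    {
      void *p = mmap (NULL, sz, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (p == MAP_FAILED)
        break;
      munmap (p, sz);
      best = sz;
    }
  return best;
}
```

The result depends on the architecture and on any address-space limit
set for the process (for instance with ulimit -v), not on how much RAM
or swap the machine has.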




* Re: When should ralloc.c be used?
  2016-10-28  8:44                                                                               ` Daniel Colascione
@ 2016-10-28  9:43                                                                                 ` Eli Zaretskii
  2016-10-28  9:52                                                                                   ` Daniel Colascione
  2016-10-28 12:11                                                                                   ` Stefan Monnier
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  9:43 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 28 Oct 2016 01:44:33 -0700
> 
> >> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap.
> >> I can write a program that reserves 20GB of address space.
> >
> > I thought such a reservation should fail, because you don't have
> > enough virtual memory for 20GB of addresses.  IOW, I thought the
> > ability to reserve address space is restricted by the actual amount of
> > virtual memory available on the system at the time of the call.  You
> > seem to say I was wrong.
> 
> I'm not sure you're even wrong :-) What does "virtual memory" mean to 
> you?

Physical + swap, as usual.

> When we allocate memory, we can consume two resources: address space and 
> commit. That 100GB mmap above doesn't consume virtual memory, but it 
> does consume address space. Address space is a finite resource, but 
> usually much larger than commit, which is the sum of RAM and swap space. 
> When you commit a page, the resource you're consuming is commit.

If reserving a range of addresses doesn't necessarily mean they will
be later available for committing, then what is the purpose of
reserving them in the first place?  What good does it do?

We have in w32heap.c:mmap_realloc code that attempts to commit pages
that were previously reserved.  That code does recover from a failure
to commit, but such a failure is deemed unusual and causes special
warnings under a debugger.  I never saw these warnings happen, except
when we had bugs in that code.  You seem to say that this is based on
false premises, and there's nothing unusual about MEM_COMMIT failing
for the range of pages previously reserved with MEM_RESERVE.




* Re: When should ralloc.c be used?
  2016-10-28  9:43                                                                                 ` Eli Zaretskii
@ 2016-10-28  9:52                                                                                   ` Daniel Colascione
  2016-10-28 12:25                                                                                     ` Eli Zaretskii
  2016-10-28 12:11                                                                                   ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-28  9:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On 10/28/2016 02:43 AM, Eli Zaretskii wrote:
>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 28 Oct 2016 01:44:33 -0700
>>
>>>> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap.
>>>> I can write a program that reserves 20GB of address space.
>>>
>>> I thought such a reservation should fail, because you don't have
>>> enough virtual memory for 20GB of addresses.  IOW, I thought the
>>> ability to reserve address space is restricted by the actual amount of
>>> virtual memory available on the system at the time of the call.  You
>>> seem to say I was wrong.
>>
>> I'm not sure you're even wrong :-) What does "virtual memory" mean to
>> you?
>
> Physical + swap, as usual.
>
>> When we allocate memory, we can consume two resources: address space and
>> commit. That 100GB mmap above doesn't consume virtual memory, but it
>> does consume address space. Address space is a finite resource, but
>> usually much larger than commit, which is the sum of RAM and swap space.
>> When you commit a page, the resource you're consuming is commit.
>
> If reserving a range of addresses doesn't necessarily mean they will
> be later available for committing, then what is the purpose of
> reserving them in the first place?  What good does it do?

Reserving address space is useful for making sure you have a contiguous 
range of virtual addresses that you can use later.

> We have in w32heap.c:mmap_realloc code that attempts to commit pages
> that were previously reserved.  That code does recover from a failure
> to commit, but such a failure is deemed unusual and causes special
> warnings under debugger.  I never saw these warnings happen, except
> when we had bugs in that code.  You seem to say that this is based on
> false premises, and there's nothing unusual about MEM_COMMIT to fail
> for the range of pages previously reserved with MEM_RESERVE.

The MEM_COMMIT failure might be rare in practice --- systems have a lot 
of memory these days --- but MEM_COMMIT failing for a memory region 
previously reserved with MEM_RESERVE is perfectly legal. MEM_RESERVE 
does not stake a claim on the system's memory resources. It consumes 
only your own address space.
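
Both points can be sketched together for the POSIX case: a region whose
base address never moves because the address space was reserved up
front, and a grow operation that is allowed to fail. The struct and
function names are hypothetical, and this is not what buffer.c or
w32heap.c actually does.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

struct text_region
{
  char *base;        /* start of the reserved range; never moves */
  size_t reserved;   /* bytes of address space set aside */
  size_t committed;  /* bytes currently readable and writable */
};

/* Reserve RESERVE bytes of address space.  Returns 0 on success.  */
static int
text_region_init (struct text_region *r, size_t reserve)
{
  void *p = mmap (NULL, reserve, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  r->base = p;
  r->reserved = reserve;
  r->committed = 0;
  return 0;
}

/* Make at least WANT bytes usable, committing only the new tail.
   Returns 0 on success and -1 if the reservation is too small or the
   system refuses the commit.  On failure the already committed part is
   untouched, so the caller can report the error or fall back to
   allocating elsewhere and copying.  */
static int
text_region_grow (struct text_region *r, size_t want)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  size_t target = (want + page - 1) & ~(page - 1);

  if (target <= r->committed)
    return 0;
  if (target > r->reserved)
    return -1;                  /* out of reserved address space */
  if (mprotect (r->base + r->committed, target - r->committed,
                PROT_READ | PROT_WRITE) != 0)
    return -1;                  /* commit refused by the system */
  r->committed = target;
  return 0;
}
```

Growing never copies and never changes r->base; the price is that every
call to text_region_grow needs a failure path.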






* Re: When should ralloc.c be used?
  2016-10-28  9:52                                                                                   ` Daniel Colascione
@ 2016-10-28 12:25                                                                                     ` Eli Zaretskii
  2016-10-28 13:37                                                                                       ` Stefan Monnier
  2016-10-28 15:41                                                                                       ` Daniel Colascione
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28 12:25 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 28 Oct 2016 02:52:19 -0700
> 
> > If reserving a range of addresses doesn't necessarily mean they will
> > be later available for committing, then what is the purpose of
> > reserving them in the first place?  What good does it do?
> 
> Reserving address space is useful for making sure you have a contiguous 
> range of virtual addresses that you can use later.

But if committing more pages from the reserved range is not guaranteed
to succeed, I cannot rely on getting that contiguous range of
addresses, can I?

> > We have in w32heap.c:mmap_realloc code that attempts to commit pages
> > that were previously reserved.  That code does recover from a failure
> > to commit, but such a failure is deemed unusual and causes special
> > warnings under debugger.  I never saw these warnings happen, except
> > when we had bugs in that code.  You seem to say that this is based on
> > false premises, and there's nothing unusual about MEM_COMMIT to fail
> > for the range of pages previously reserved with MEM_RESERVE.
> 
> The MEM_COMMIT failure might be rare in practice --- systems have a lot 
> of memory these days --- but MEM_COMMIT failing for a memory region 
> previously reserved with MEM_RESERVE is perfectly legal.

I can only say that I never saw that happening.

Thanks.




* Re: When should ralloc.c be used?
  2016-10-28 12:25                                                                                     ` Eli Zaretskii
@ 2016-10-28 13:37                                                                                       ` Stefan Monnier
  2016-10-28 14:30                                                                                         ` Eli Zaretskii
  2016-10-28 15:41                                                                                       ` Daniel Colascione
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 13:37 UTC (permalink / raw)
  To: emacs-devel

> But if committing more pages from the reserved range is not guaranteed
> to succeed, I cannot rely on getting that contiguous range of
> addresses, can I?

It should only fail in those cases where a new mmap (or malloc) would
also fail.


        Stefan





* Re: When should ralloc.c be used?
  2016-10-28 13:37                                                                                       ` Stefan Monnier
@ 2016-10-28 14:30                                                                                         ` Eli Zaretskii
  2016-10-28 14:43                                                                                           ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28 14:30 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 28 Oct 2016 09:37:04 -0400
> 
> > But if committing more pages from the reserved range is not guaranteed
> > to succeed, I cannot rely on getting that contiguous range of
> > addresses, can I?
> 
> It should only fail in those cases where a new mmap (or malloc) would
> also fail.

That means never, for all practical purposes.




* Re: When should ralloc.c be used?
  2016-10-28 14:30                                                                                         ` Eli Zaretskii
@ 2016-10-28 14:43                                                                                           ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 14:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> > But if committing more pages from the reserved range is not guaranteed
>> > to succeed, I cannot rely on getting that contiguous range of
>> > addresses, can I?
>> It should only fail in those cases where a new mmap (or malloc) would
>> also fail.
> That means never, for all practical purposes.

Of course,


        Stefan




* Re: When should ralloc.c be used?
  2016-10-28 12:25                                                                                     ` Eli Zaretskii
  2016-10-28 13:37                                                                                       ` Stefan Monnier
@ 2016-10-28 15:41                                                                                       ` Daniel Colascione
  2016-10-29  6:08                                                                                         ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-28 15:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

On 10/28/2016 05:25 AM, Eli Zaretskii wrote:
>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 28 Oct 2016 02:52:19 -0700
>>
>>> If reserving a range of addresses doesn't necessarily mean they will
>>> be later available for committing, then what is the purpose of
>>> reserving them in the first place?  What good does it do?
>>
>> Reserving address space is useful for making sure you have a contiguous
>> range of virtual addresses that you can use later.
>
> But if committing more pages from the reserved range is not guaranteed
> to succeed, I cannot rely on getting that contiguous range of
> addresses, can I?

You already _have_ the range of addresses. You just can't do anything 
with them yet.

Here's another use case: magic ring buffers. (Where you put two 
consecutive views of the same file in memory next to each other so that 
operations on the ring buffer don't need to be split even in cases where 
they'd wrap the end of the ring.)

Say on our 1GB RAM, 1GB swap system we want to memory-map a 5GB ring 
buffer log file. We can do it safely and atomically like this:

1) Reserve 10GB of address space with an anonymous PROT_NONE mapping; 
the mapping is at $ADDR
2) Memory-map our log file at $ADDR with PROT_READ|PROT_WRITE; (the 
mapping is file-backed, not anonymous, so it doesn't count against 
system commit charge)
3) Memory-map the log file _again_ at $ADDR+5GB

Now we have a nice mirrored view of our ring buffer, and thanks to the 
PROT_NONE mapping we set up in step one, no other thread was able to 
sneak in the middle and allocate something in the [$ADDR+5GB,$ADDR+10GB) 
range and spoil our ability to set up the mirroring.

In this instance, setting aside address space without allocating backing 
storage for it turned out to be very useful.
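
The three steps above can be sketched as follows, scaled down to a few
pages and a temporary file. ring_map and demo_ring are hypothetical
names; the sketch assumes POSIX mmap with MAP_FIXED and a writable /tmp.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* MAP_ANONYMOUS and mkstemp on glibc */
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the first SIZE bytes of FD twice, back to back.  SIZE must be a
   multiple of the page size.  Returns the base address or NULL.  */
static void *
ring_map (int fd, size_t size)
{
  /* Step 1: reserve 2 * SIZE bytes of address space.  */
  char *base = mmap (NULL, 2 * size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return NULL;

  /* Steps 2 and 3: MAP_FIXED replaces each half of the reservation in
     place, so no other thread can claim the addresses in between.  */
  for (int i = 0; i < 2; i++)
    if (mmap (base + i * size, size, PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED)
      {
        munmap (base, 2 * size);        /* drop the whole range */
        return NULL;
      }
  return base;
}

/* Demo: a write that crosses the wrap point shows up in both views.
   Returns 0 on success.  */
static int
demo_ring (void)
{
  size_t size = 16 * (size_t) sysconf (_SC_PAGESIZE);
  char path[] = "/tmp/ringXXXXXX";
  int fd = mkstemp (path);

  if (fd < 0)
    return -1;
  unlink (path);
  if (ftruncate (fd, (off_t) size) != 0)
    {
      close (fd);
      return -1;
    }
  char *ring = ring_map (fd, size);
  close (fd);                   /* the mappings keep the file alive */
  if (!ring)
    return -1;

  ring[size - 1] = 'A';         /* last byte of the first view */
  ring[size] = 'B';             /* first byte of the second view */
  int ok = ring[2 * size - 1] == 'A' && ring[0] == 'B';
  munmap (ring, 2 * size);
  return ok ? 0 : 1;
}
```

If either file mapping fails, the sketch simply unmaps the whole
reservation and reports failure; nothing outside the reserved range was
ever touched.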

>
>>> We have in w32heap.c:mmap_realloc code that attempts to commit pages
>>> that were previously reserved.  That code does recover from a failure
>>> to commit, but such a failure is deemed unusual and causes special
>>> warnings under debugger.  I never saw these warnings happen, except
>>> when we had bugs in that code.  You seem to say that this is based on
>>> false premises, and there's nothing unusual about MEM_COMMIT to fail
>>> for the range of pages previously reserved with MEM_RESERVE.
>>
>> The MEM_COMMIT failure might be rare in practice --- systems have a lot
>> of memory these days --- but MEM_COMMIT failing for a memory region
>> previously reserved with MEM_RESERVE is perfectly legal.
>
> I can only say that I never saw that happening.
>
> Thanks.
>




* Re: When should ralloc.c be used?
  2016-10-28 15:41                                                                                       ` Daniel Colascione
@ 2016-10-29  6:08                                                                                         ` Eli Zaretskii
  2016-10-29  6:14                                                                                           ` Daniel Colascione
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-29  6:08 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel

> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Fri, 28 Oct 2016 08:41:48 -0700
> 
> >> Reserving address space is useful for making sure you have a contiguous
> >> range of virtual addresses that you can use later.
> >
> > But if committing more pages from the reserved range is not guaranteed
> > to succeed, I cannot rely on getting that contiguous range of
> > addresses, can I?
> 
> You already _have_ the range of addresses. You just can't do anything 
> with them yet.

It's no use "having" the addresses, in the above sense, if I can't
rely on being able to do anything with them later.

> Here's another use case: magic ring buffers. (Where you put two 
> consecutive views of the same file in memory next to each other so that 
> operations on the ring buffer don't need to be split even in cases where 
> they'd wrap the end of the ring.)
> 
> Say on our 1GB RAM, 1GB swap system we want to memory-map a 5GB ring 
> buffer log file. We can do it safely and atomically like this:
> 
> 1) Reserve 10GB of address space with an anonymous PROT_NONE mapping; 
> the mapping is at $ADDR
> 2) Memory-map our log file at $ADDR with PROT_READ|PROT_WRITE; (the 
> mapping is file-backed, not anonymous, so it doesn't count against 
> system commit charge)
> 3) Memory-map the log file _again_ at $ADDR+5GB

If 3) fails, what do you do?

> Now we have a nice mirrored view of our ring buffer, and thanks to the 
> PROT_NONE mapping we set up in step one, no other thread was able to 
> sneak in the middle and allocate something in the [$ADDR+5GB,$ADDR+10GB) 
> range and spoil our ability to set up the mirroring.
> 
> In this instance, setting aside address space without allocating backing 
> storage for it turned out to be very useful.

Not if PROT_READ|PROT_WRITE call fails.

But if, as Stefan says, this will "never" happen, then the problem
doesn't exist in practice, and for all practical purposes what I
thought should happen, does happen, even if in theory it can fail.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-29  6:08                                                                                         ` Eli Zaretskii
@ 2016-10-29  6:14                                                                                           ` Daniel Colascione
  0 siblings, 0 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-29  6:14 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 28 Oct 2016 08:41:48 -0700
>> 
>> >> Reserving address space is useful for making sure you have a contiguous
>> >> range of virtual addresses that you can use later.
>> >
>> > But if committing more pages from the reserved range is not guaranteed
>> > to succeed, I cannot rely on getting that contiguous range of
>> > addresses, can I?
>> 
>> You already _have_ the range of addresses. You just can't do anything 
>> with them yet.
>
> It's no use "having" the addresses, in the above sense, if I can't
> rely on being able to do anything with them later.

You can rely on nobody else using that address space, though. This
exclusion in itself is valuable. It's like an electric company buying
right-of-way for a high-voltage transmission line.  Sure, the electric
company isn't doing anything with that long strip of land, but the value
is in nobody _else_ doing anything with it either.

>
>> Here's another use case: magic ring buffers. (Where you put two 
>> consecutive views of the same file in memory next to each other so that 
>> operations on the ring buffer don't need to be split even in cases where 
>> they'd wrap the end of the ring.)
>> 
>> Say on our 1GB RAM, 1GB swap system we want to memory-map a 5GB ring 
>> buffer log file. We can do it safely and atomically like this:
>> 
>> 1) Reserve 10GB of address space with an anonymous PROT_NONE mapping; 
>> the mapping is at $ADDR
>> 2) Memory-map our log file at $ADDR with PROT_READ|PROT_WRITE; (the 
>> mapping is file-backed, not anonymous, so it doesn't count against 
>> system commit charge)
>> 3) Memory-map the log file _again_ at $ADDR+5GB
>
> If 3) fails, what do you do?

Unmap the original mapping and mapping #2, then fail the higher-level
make_magic_ring_buffer operation. Any operation that allocates memory
can fail.
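
A minimal sketch of the three steps and the failure path, assuming
POSIX mmap with MAP_FIXED on a system that has MAP_ANONYMOUS.  Only the
name make_magic_ring_buffer comes from the paragraph above; the
signature, the ring_buffer_demo helper and the one-page size are
illustrative assumptions, not code from any real project:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS, ftruncate, fileno */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the first LEN bytes of FD twice, back to back.  Returns the base
   address, or NULL on failure.  LEN must be a multiple of the page
   size.  Hypothetical helper, not an existing API.  */
void *
make_magic_ring_buffer (int fd, size_t len)
{
  long page = sysconf (_SC_PAGESIZE);
  assert (page > 0 && len > 0 && len % (size_t) page == 0);

  /* Step 1: reserve 2*LEN of address space; nothing is committed.  */
  char *base = mmap (NULL, 2 * len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return NULL;

  /* Steps 2 and 3: overlay the file on each half.  MAP_FIXED replaces
     the reservation in place, so no other thread can take the range.  */
  if (mmap (base, len, PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED
      || mmap (base + len, len, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED)
    {
      /* One munmap covers the whole range, whatever mix of
         reservation and file mapping it currently holds.  */
      munmap (base, 2 * len);
      return NULL;
    }
  return base;
}

/* Usage sketch: returns 0 if a byte written through the first view
   is visible through the second, -1 otherwise.  */
int
ring_buffer_demo (void)
{
  size_t len = (size_t) sysconf (_SC_PAGESIZE);
  FILE *f = tmpfile ();
  if (f == NULL)
    return -1;
  int ok = 0;
  if (ftruncate (fileno (f), (off_t) len) == 0)
    {
      char *ring = make_magic_ring_buffer (fileno (f), len);
      if (ring != NULL)
        {
          ring[0] = 'x';
          ok = ring[len] == 'x';
          munmap (ring, 2 * len);
        }
    }
  fclose (f);
  return ok ? 0 : -1;
}
```

The demo uses a single page instead of 5GB so it can run anywhere; the
structure is the same at any page-aligned size.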

>
>> Now we have a nice mirrored view of our ring buffer, and thanks to the 
>> PROT_NONE mapping we set up in step one, no other thread was able to 
>> sneak in the middle and allocate something in the [$ADDR+5GB,$ADDR+10GB) 
>> range and spoil our ability to set up the mirroring.
>> 
>> In this instance, setting aside address space without allocating backing 
>> storage for it turned out to be very useful.
>
> Not if PROT_READ|PROT_WRITE call fails.
>
> But if, as Stefan says, this will "never" happen, then the problem
> doesn't exist in practice, and for all practical purposes what I
> thought should happen, does happen, even if in theory it can fail.

It's unlikely, but it's a legal failure mode.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-28  9:43                                                                                 ` Eli Zaretskii
  2016-10-28  9:52                                                                                   ` Daniel Colascione
@ 2016-10-28 12:11                                                                                   ` Stefan Monnier
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 12:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel, eggert

> If reserving a range of addresses doesn't necessarily mean they will
> be later available for committing, then what is the purpose of
> reserving them in the first place?  What good does it do?

My guess is that you can later use that address space to mmap files
in there.  Equivalently, you could increase the swap space between the
time you PROT_NONE and the time you switch to PROT_RW.

PROT_NONE is useful in a situation such as ours: you want to mmap
a hundred buffers, and make sure you can grow any of them without
knowing beforehand which one will grow.

But most likely, whether it's useful or not to be able to reserve 80TB
of address space even if you'd never be able to PROT_RW later was not
really relevant to the design of the API.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-28  8:27                                                                             ` Eli Zaretskii
  2016-10-28  8:44                                                                               ` Daniel Colascione
@ 2016-10-28 11:40                                                                               ` Jérémie Courrèges-Anglas
  2016-10-28 13:03                                                                                 ` Stefan Monnier
  2016-10-28 15:34                                                                                 ` Daniel Colascione
  1 sibling, 2 replies; 375+ messages in thread
From: Jérémie Courrèges-Anglas @ 2016-10-28 11:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel, monnier, eggert

Eli Zaretskii <eliz@gnu.org> writes:

>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Fri, 28 Oct 2016 01:11:08 -0700
>> 
>> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the 
>> initial mapping, that address space is unavailable for other uses. But 
>> because the page protections are PROT_NONE, my program has no legal 
>> right to access that page, so the OS doesn't have to guarantee that it 
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> can find a physical page to back that page I've mmaped. In this state, 
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is what I think is a problem in your reasoning.  "Doesn't have to
guarantee" doesn't mean that the kernel *should not* actually check the
available memory and resource limits.

>> the memory is reserved.
>> 
>> The 20GB PROT_NONE address space reservation itself requires very little 
>> memory. It's just a note in the kernel's VM interval tree that says "the 
>> addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is
>> 
>> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once 
>> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right 
>> to access that page; under a strict accounting scheme (that is, without 
>> overcommit), the OS has to guarantee that it'll be able to go find a 
>> physical page to back that virtual page. In this state, the memory is 
>> committed -- the kernel has committed to finding backing storage for 
>> that page at some point when the current process tries to access it.
>
> I'm with you up to here.  My question is whether PROT_READ|PROT_WRITE
> call could fail after PROT_NONE succeeded.  You seem to say it could;
> I thought it couldn't.

I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would
have changed anything here, but on *some* OSes it does, however it is
not portable.  At least OpenBSD doesn't behave like what you describe.
IMHO people who rely on this kind of reservations rely on
implementation-defined behavior.

Also, sanity wise, I'd prefer having mmap(2) fail right away rather than
having mprotect(2) fail, much later.  *If* mprotect(2) actually fails;
of course, you don't want to play Russian roulette with your OS's
flavor of the OOM-killer either.

>> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. 
>> I can write a program that reserves 20GB of address space.
>
> I thought such a reservation should fail, because you don't have
> enough virtual memory for 20GB of addresses.  IOW, I thought the
> ability to reserve address space is restricted by the actual amount of
> virtual memory available on the system at the time of the call.  You
> seem to say I was wrong.


-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-28 11:40                                                                               ` Jérémie Courrèges-Anglas
@ 2016-10-28 13:03                                                                                 ` Stefan Monnier
  2016-10-28 14:41                                                                                   ` Jérémie Courrèges-Anglas
  2016-10-28 15:34                                                                                 ` Daniel Colascione
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 13:03 UTC (permalink / raw)
  To: emacs-devel

> I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would
> have changed anything here, but on *some* OSes it does, however it is
> not portable.  At least OpenBSD doesn't behave like what you describe.

Are you sure?  Can you point to concrete evidence?

Not that it's important (using a hard-coded number like 2GB wouldn't
work, so we'd more likely use something like w32heap.c's "pre-allocate
double the size", which doesn't suffer from that problem anyway and
still guarantees efficient behavior when growing a buffer progressively
from 1B to 100GB).


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-28 13:03                                                                                 ` Stefan Monnier
@ 2016-10-28 14:41                                                                                   ` Jérémie Courrèges-Anglas
  0 siblings, 0 replies; 375+ messages in thread
From: Jérémie Courrèges-Anglas @ 2016-10-28 14:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would
>> have changed anything here, but on *some* OSes it does, however it is
>> not portable.  At least OpenBSD doesn't behave like what you describe.
>
> Are you sure?  Can you point to concrete evidence?

Erm, I think there was a problem with my tests.

data:
- system has 8GB of ram
- no swap
- "data" rlimit set to 32GB, the per-process maximum supported on
  OpenBSD/amd64

with Daniel's test program asking for 20GB:
- mmap(PROT_NONE) _succeeds_
- mprotect(PROT_READ|PROT_WRITE) _succeeds_

An mmap call directly asking for 20GB with PROT_READ|PROT_WRITE also
succeeds.  The protection flags aren't checked to decide whether ENOMEM
should be returned, and the process has no easy way to tell whether the
requested amount of memory is actually usable (-> SIGBUS if the system
can't map enough pages).
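
Daniel's test program is not reproduced in this message.  As a rough
idea of what such a test does, here is a hypothetical reconstruction,
assuming POSIX mmap/mprotect and MAP_ANONYMOUS; the function name and
return codes are made up for illustration:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve SIZE bytes with PROT_NONE, then try to commit them.
   Returns 0 if both calls succeed, 1 if the reservation fails,
   2 if the commit fails; *ERR receives errno of the failing call.
   Hypothetical helper, not the program used in the test above.  */
int
reserve_then_commit (size_t size, int *err)
{
  assert (size > 0 && err != NULL);
  *err = 0;

  void *p = mmap (NULL, size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    {
      *err = errno;
      return 1;
    }

  int result = 0;
  if (mprotect (p, size, PROT_READ | PROT_WRITE) != 0)
    {
      *err = errno;
      result = 2;
    }
  else
    ((volatile char *) p)[0] = 1;   /* touch only the first page */

  munmap (p, size);
  return result;
}
```

A return value of 0 only says that both calls succeeded; as noted
above, it does not prove that every page in the range can actually be
backed when it is touched.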

The reason why my test initially failed is that I assumed that ulimit -d
was 4GB on this box, not 1.5GB (default for OpenBSD/amd64).  Not
double-checking this was sloppy, my sincere apologies to Daniel and the
other readers.

> Not that it's important (using a hard-coded number like 2GB wouldn't
> work, so we'd more likely use something like w32heap.c's "pre-allocate
> double the size", which doesn't suffer from that problem anyway and
> still guarantees efficient behavior when growing a buffer progressively
> from 1B to 100GB).

Ack.  Note that the test above was using the maximum value for
ulimit -d; for the record, a single allocation of 2GB would be rejected
by default on all of our supported platforms.

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-28 11:40                                                                               ` Jérémie Courrèges-Anglas
  2016-10-28 13:03                                                                                 ` Stefan Monnier
@ 2016-10-28 15:34                                                                                 ` Daniel Colascione
  1 sibling, 0 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-28 15:34 UTC (permalink / raw)
  To: Eli Zaretskii, eggert, monnier, emacs-devel

On 10/28/2016 04:40 AM, Jérémie Courrèges-Anglas wrote:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org
>>> From: Daniel Colascione <dancol@dancol.org>
>>> Date: Fri, 28 Oct 2016 01:11:08 -0700
>>>
>>> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the
>>> initial mapping, that address space is unavailable for other uses. But
>>> because the page protections are PROT_NONE, my program has no legal
>>> right to access that page, so the OS doesn't have to guarantee that it
>                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> can find a physical page to back that page I've mmaped. In this state,
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This is what I think is a problem in your reasoning.  "Doesn't have to
> guarantee" doesn't mean that the kernel *should not* actually check the
> available memory and resource limits.
>

IMHO, an OS that rejects big PROT_NONE mappings merely because it might 
not be able to change them to PROT_READ|PROT_WRITE later is broken. The 
non-overcommit Linux behavior (which is identical to Windows behavior) 
is the _right_ _thing_ _to_ _do_. The OS is letting the process manage 
its address space and assuming that the programmer knows what he wanted 
to do.

>>> the memory is reserved.
>>>
>>> The 20GB PROT_NONE address space reservation itself requires very little
>>> memory. It's just a note in the kernel's VM interval tree that says "the
>>> addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is
>>>
>>> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once
>>> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right
>>> to access that page; under a strict accounting scheme (that is, without
>>> overcommit), the OS has to guarantee that it'll be able to go find a
>>> physical page to back that virtual page. In this state, the memory is
>>> committed -- the kernel has committed to finding backing storage for
>>> that page at some point when the current process tries to access it.
>>
>> I'm with you up to here.  My question is whether PROT_READ|PROT_WRITE
>> call could fail after PROT_NONE succeeded.  You seem to say it could;
>> I thought it couldn't.
>
> I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would
> have changed anything here, but on *some* OSes it does, however it is
> not portable.  At least OpenBSD doesn't behave like what you describe.

How does it behave?

> IMHO people who rely on this kind of reservations rely on
> implementation-defined behavior.

OpenBSD is a coelacanth. It's a relic. It doesn't even have a unified buffer 
cache.

> Also, sanity wise, I'd prefer having mmap(2) fail right away rather than
> having mprotect(2) fail, much later.

Then ask for PROT_READ|PROT_WRITE access right away. Ask for commit, not 
just address space.

> *If* mprotect(2) actually fails;
> of course, you don't want to play Russian roulette with your OS's
> flavor of the OOM-killer either.

That's why overcommit is an abomination.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 15:40                                                                 ` Eli Zaretskii
  2016-10-24 16:27                                                                   ` Daniel Colascione
@ 2016-10-24 18:45                                                                   ` Stefan Monnier
  2016-10-24 19:38                                                                     ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 18:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel

>> > Using mmap has disadvantages: when you need to enlarge buffer text,
>> > and that fails (because there are no more free pages/addresses after
>> > the already allocated region), we need to copy buffer text to the new
>> > allocation.
>> All allocators suffer from this problem.  I haven't seen any evidence
>> that the mmap-based allocation code is significantly more prone to it.
> I have seen that.

Could you give some details (mostly about the scale of the problem)?

> The native glibc malloc, the one GNU/Linux systems were using until we
> got screwed by the recent glibc, didn't have this problem, while the
> mmap-based allocator did.  Don't ask me how glibc does it, I don't
> know; but the fact is there.

It likely mmaps a bit more than requested, like you do in w32heap.c.

>> Another advantage of using mmap is that it can return the memory to the
>> OS once you kill your large buffer, whereas with gmalloc+ralloc this
>> basically never happens, AFAIK.
> Not entirely true: ralloc calls the system sbrk with a negative
> argument when it feels like it.

That's why I said "basically".  Yes, in theory it can sometimes
return memory.  In practice, this is rare.  In contrast, with mmap,
returning memory to the OS is the rule rather than the exception.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 18:45                                                                   ` Stefan Monnier
@ 2016-10-24 19:38                                                                     ` Eli Zaretskii
  2016-10-25 14:12                                                                       ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 19:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eggert, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 14:45:59 -0400
> 
> >> > Using mmap has disadvantages: when you need to enlarge buffer text,
> >> > and that fails (because there are no more free pages/addresses after
> >> > the already allocated region), we need to copy buffer text to the new
> >> > allocation.
> >> All allocators suffer from this problem.  I haven't seen any evidence
> >> that the mmap-based allocation code is significantly more prone to it.
> > I have seen that.
> 
> Could you give some details (mostly about the scale of the problem)?

Visiting a large compressed file (e.g., an Emacs release tarball
compressed with gzip) takes several times as long in a build that
uses mmap as in a build without mmap.

> > Not entirely true: ralloc calls the system sbrk with a negative
> > argument when it feels like it.
> 
> That's why I said "basically".  Yes, in theory it can sometimes
> return memory.  In practice, this is rare.  In contrast, with mmap,
> returning memory to the OS is the rule rather than the exception.

How so?  Releasing memory in both cases requires basically the same
situation: a large enough block of contiguous memory not in use.  It
seems ralloc is actually at an advantage, because relocating blocks
helps collect together a larger free block.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 19:38                                                                     ` Eli Zaretskii
@ 2016-10-25 14:12                                                                       ` Stefan Monnier
  2016-10-25 16:36                                                                         ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-25 14:12 UTC (permalink / raw)
  To: emacs-devel

>> That's why I said "basically".  Yes, in theory it can sometimes
>> return memory.  In practice, this is rare.  In contrast, with mmap,
>> returning memory to the OS is the rule rather than the exception.
> How so?  Releasing memory in both cases requires basically the same
> situation: a large enough block of contiguous memory not in use.

IIUC releasing memory with sbrk can only be done if that memory is at
the end of the heap.

> It seems ralloc is actually at an advantage, because relocating blocks
> helps collect together a larger free block.

mmap can always free what it has allocated before, without any need to
relocate anything.
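
To illustrate (a sketch only, assuming POSIX mmap and MAP_ANONYMOUS;
the function name is made up): map three regions, release the second
one, and the other two stay valid wherever the kernel placed them.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map three independent regions, unmap the second, and check that the
   first and third are still usable.  Returns 0 on success, -1 on
   failure.  Hypothetical helper for illustration.  */
int
free_middle_mapping (void)
{
  long page = sysconf (_SC_PAGESIZE);
  assert (page > 0);
  size_t len = 16 * (size_t) page;
  char *r[3];

  for (int i = 0; i < 3; i++)
    {
      r[i] = mmap (NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (r[i] == MAP_FAILED)
        {
          while (i-- > 0)
            munmap (r[i], len);
          return -1;
        }
      r[i][0] = (char) ('a' + i);
    }

  /* No relocation and no "end of heap" condition is needed here.  */
  int rc = munmap (r[1], len);
  int ok = rc == 0 && r[0][0] == 'a' && r[2][0] == 'c';

  munmap (r[0], len);
  munmap (r[2], len);
  return ok ? 0 : -1;
}
```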


        Stefan




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-25 14:12                                                                       ` Stefan Monnier
@ 2016-10-25 16:36                                                                         ` Eli Zaretskii
  2016-10-25 19:27                                                                           ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 16:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Tue, 25 Oct 2016 10:12:23 -0400
> 
> >> That's why I said "basically".  Yes, in theory it can sometimes
> >> return memory.  In practice, this is rare.  In contrast, with mmap,
> >> returning memory to the OS is the rule rather than the exception.
> > How so?  Releasing memory in both cases requires basically the same
> > situation: a large enough block of contiguous memory not in use.
> 
> IIUC releasing memory with sbrk can only be done if that memory is at
> the end of the heap.

Since ralloc.c relocates blocks, it can make this happen more easily.

> > It seems ralloc is actually at an advantage, because relocating blocks
> > helps collect together a larger free block.
> 
> mmap can always free what it has allocated before, without any need to
> relocate anything.

It makes no sense to release random pages here and there, all you get
is fragmentation at address space level.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-25 16:36                                                                         ` Eli Zaretskii
@ 2016-10-25 19:27                                                                           ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-25 19:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> IIUC releasing memory with sbrk can only be done if that memory is at
>> the end of the heap.
> Since ralloc.c relocates blocks, it can make this happen more easily.

But the sbrk area is shared with gmalloc, whose data is not relocatable,
so as soon as gmalloc calls sbrk, the space previously allocated by
ralloc can't be returned any more.

>> > It seems ralloc is actually at an advantage, because relocating blocks
>> > helps collect together a larger free block.
>> mmap can always free what it has allocated before, without any need to
>> relocate anything.
> It makes no sense to release random pages here and there, all you get
> is fragmentation at address space level.

Yes, but address space is much more plentiful.

Note that glibc uses exactly this approach and it works very well for us.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 13:05                                                             ` Eli Zaretskii
  2016-10-24 14:12                                                               ` Stefan Monnier
  2016-10-24 14:37                                                               ` Stefan Monnier
@ 2016-10-25  3:12                                                               ` Ken Raeburn
  2016-10-25 16:06                                                                 ` Eli Zaretskii
  2 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-10-25  3:12 UTC (permalink / raw)
  To: Emacs development discussions

On Oct 24, 2016, at 09:05, Eli Zaretskii <eliz@gnu.org> wrote:

> Using mmap has disadvantages: when you need to enlarge buffer text,
> and that fails (because there are no more free pages/addresses after
> the already allocated region), we need to copy buffer text to the new
> allocation.  This happens quite a lot when we visit a compressed
> buffer.  (The MS-Windows emulation of mmap in w32heap.c reserves twice
> the number of pages as originally requested, for that very reason.)

In the general case, yes.  But modern Linux kernels have an “mremap” system call which can “move” a range of pages to a portion of the address space that can accommodate a larger size, by tweaking page tables rather than copying all the bits around.  I’m pretty sure modern glibc realloc uses it.  I had a project a while back where code ported to Solaris ran far slower than the GNU/Linux version because lots of realloc calls were done on a large array; Solaris copied, GNU/Linux remapped.

  void *mremap(void *old_address, size_t old_size,
               size_t new_size, int flags, ... /* void *new_address */);

Of course you can’t shift bytes within a page this way, or add new space anywhere but after the last page of the old region.  (No hint in the man page whether you can use an explicit new address range overlapping the old range to shift a chunk of memory a la memmove, or if the results would be undefined a la memcpy.)
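
A minimal sketch of growing a mapping this way, assuming Linux with glibc (mremap and MREMAP_MAYMOVE need _GNU_SOURCE).  The names grow_mapping and grow_mapping_demo and the sizes are illustrative only:

```c
#define _GNU_SOURCE   /* for mremap and MREMAP_MAYMOVE */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Grow a mapping from OLD_SIZE to NEW_SIZE bytes, letting the kernel
   move it if it cannot grow in place.  Returns the (possibly new)
   address, or NULL on failure, in which case OLD is still valid.
   Hypothetical helper for illustration.  */
void *
grow_mapping (void *old, size_t old_size, size_t new_size)
{
  assert (old != NULL && new_size >= old_size);
  void *p = mremap (old, old_size, new_size, MREMAP_MAYMOVE);
  return p == MAP_FAILED ? NULL : p;
}

/* Usage sketch: returns 0 if the old contents survive the growth and
   the newly added pages read as zero, -1 otherwise.  */
int
grow_mapping_demo (void)
{
  size_t small = (size_t) 1 << 16, big = (size_t) 1 << 24;
  char *buf = mmap (NULL, small, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (buf == MAP_FAILED)
    return -1;
  memset (buf, 'e', small);

  char *grown = grow_mapping (buf, small, big);
  if (grown == NULL)
    {
      munmap (buf, small);
      return -1;
    }

  int ok = grown[0] == 'e' && grown[small - 1] == 'e'
           && grown[small] == 0;
  munmap (grown, big);
  return ok ? 0 : -1;
}
```

The caller has to treat the returned pointer like the result of realloc: the old address may no longer be valid after a successful call.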

I don’t know if any other systems support it.  The performance savings for one of our favorite systems might be worth the special-casing.  Though, if glibc realloc does the right thing, maybe using malloc/realloc for buffer storage would suffice.

Ken

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-25  3:12                                                               ` Ken Raeburn
@ 2016-10-25 16:06                                                                 ` Eli Zaretskii
  2016-10-26  4:36                                                                   ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 16:06 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Mon, 24 Oct 2016 23:12:40 -0400
> 
> > Using mmap has disadvantages: when you need to enlarge buffer text,
> > and that fails (because there are no more free pages/addresses after
> > the already allocated region), we need to copy buffer text to the new
> > allocation.  This happens quite a lot when we visit a compressed
> > buffer.  (The MS-Windows emulation of mmap in w32heap.c reserves twice
> > the number of pages as originally requested, for that very reason.)
> 
> In the general case, yes.  But modern Linux kernels have an “mremap” system 
> call which can “move” a range of pages to a portion of the address space that 
> can accommodate a larger size, by tweaking page tables rather than copying all 
> the bits around.  I’m pretty sure modern glibc realloc uses it.

AFAIU, this feature will only help us if someone adds code to use it
in buffer.c:mmap_enlarge.  Or are you saying that the OS will call
mremap for us automatically when mmap_enlarge attempts to map
additional pages at the end of an mmaped region?

> I don’t know if any other systems support it.  The performance savings for one 
> of our favorite systems might be worth the special-casing.  Though, if glibc 
> realloc does the right thing, maybe using malloc/realloc for buffer storage 
> would suffice.

If the Linux kernel is the only system that allows implementation of
mremap, then it doesn't really help in the long run, because on master
we don't need mmap at all for GNU/Linux systems.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-25 16:06                                                                 ` Eli Zaretskii
@ 2016-10-26  4:36                                                                   ` Ken Raeburn
  2016-10-26 11:40                                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Ken Raeburn @ 2016-10-26  4:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel


> On Oct 25, 2016, at 12:06, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Mon, 24 Oct 2016 23:12:40 -0400
>> 
>>> Using mmap has disadvantages: when you need to enlarge buffer text,
>>> and that fails (because there are no more free pages/addresses after
>>> the already allocated region), we need to copy buffer text to the new
>>> allocation.  This happens quite a lot when we visit a compressed
>>> buffer.  (The MS-Windows emulation of mmap in w32heap.c reserves twice
>>> the number of pages as originally requested, for that very reason.)
>> 
>> In the general case, yes.  But modern Linux kernels have an “mremap” system 
>> call which can “move” a range of pages to a portion of the address space that 
>> can accommodate a larger size, by tweaking page tables rather than copying all 
>> the bits around.  I’m pretty sure modern glibc realloc uses it.
> 
> AFAIU, this feature will only help us if someone adds code to use it
> in buffer.c:mmap_enlarge.  Or are you saying that the OS will call
> mremap for us automatically when mmap_enlarge attempts to map
> additional pages at the end of an mmaped region?

It could be done explicitly, but my experience was that malloc/realloc would just do it for us; we’d just have to use malloc/realloc instead of explicitly calling mmap.  I just took a quick look at the glibc sources (2.19, as patched and packaged by Debian), and it looks like the use of mmap kicks in by default for 128kB or larger allocations, though the threshold can be changed at run time.
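
For the run-time threshold change, a sketch assuming glibc's mallopt and M_MMAP_THRESHOLD from <malloc.h>; the function name and sizes are made up for illustration, and whether realloc ends up using mremap internally is an implementation detail this code cannot observe:

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: mallopt, M_MMAP_THRESHOLD */
#include <stdlib.h>
#include <string.h>

/* Set the mmap threshold to THRESHOLD bytes, then grow a block that
   starts above the threshold to 64 times the threshold with realloc.
   Returns 0 if the old contents survive, -1 on any failure.
   Hypothetical helper for illustration.  */
int
realloc_growth_demo (size_t threshold)
{
  assert (threshold > 0);
  if (mallopt (M_MMAP_THRESHOLD, (int) threshold) != 1)
    return -1;

  size_t small = 2 * threshold, big = 64 * threshold;
  char *p = malloc (small);
  if (p == NULL)
    return -1;
  memset (p, 'k', small);

  /* The application code is the same whether or not the library can
     avoid copying.  */
  char *q = realloc (p, big);
  if (q == NULL)
    {
      free (p);
      return -1;
    }

  int ok = q[0] == 'k' && q[small - 1] == 'k';
  free (q);
  return ok ? 0 : -1;
}
```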

> If the Linux kernel is the only system that allows implementation of
> mremap, then it doesn't really help in the long run, because on master
> we don't need mmap at all for GNU/Linux systems.

A man page browser at freebsd.org for several platforms seems to indicate that NetBSD has picked it up, but neither FreeBSD nor OpenBSD.  I don’t know if NetBSD’s realloc will use it, but it’s certainly simpler if we just ignore mremap for explicit use, and just bear in mind that realloc may not always have to pay the expected copying penalty on all systems….


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-26  4:36                                                                   ` Ken Raeburn
@ 2016-10-26 11:40                                                                     ` Eli Zaretskii
  2016-10-27  8:51                                                                       ` Ken Raeburn
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-26 11:40 UTC (permalink / raw)
  To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Wed, 26 Oct 2016 00:36:42 -0400
> Cc: emacs-devel@gnu.org
> 
> >>> Using mmap has disadvantages: when you need to enlarge buffer text,
> >>> and that fails (because there are no more free pages/addresses after
> >>> the already allocated region), we need to copy buffer text to the new
> >>> allocation.  This happens quite a lot when we visit a compressed
> >>> buffer.  (The MS-Windows emulation of mmap in w32heap.c reserves twice
> >>> the number of pages as originally requested, for that very reason.)
> >> 
> >> In the general case, yes.  But modern Linux kernels have an “mremap” system 
> >> call which can “move” a range of pages to a portion of the address space
> >> that can accommodate a larger size, by tweaking page tables rather than
> >> copying all the bits around.  I’m pretty sure modern glibc realloc uses it.
> > 
> > AFAIU, this feature will only help us if someone adds code to use it
> > in buffer.c:mmap_enlarge.  Or are you saying that the OS will call
> > mremap for us automatically when mmap_enlarge attempts to map
> > additional pages at the end of an mmaped region?
> 
> It could be done explicitly, but my experience was that malloc/realloc would 
> just do it for us; we’d just have to use malloc/realloc instead of explicitly 
> calling mmap.

I think we've lost context of the discussion.  Please see above: this
is about the disadvantages of using mmap directly, i.e. for those
cases where the native malloc or gmalloc suffer from memory
fragmentation, and we decide to use mmap in buffer.c to countermand
that.

I've pointed out the disadvantages of using mmap directly, and you
mentioned the mremap syscall as the counter-argument.  If you thought
I was talking about problems mmap could cause to the malloc
implementation, then that's a misunderstanding: I was explicitly
talking about using mmap directly for allocating buffer text.  My
point was that we should only use mmap if necessary, as it comes at a
price.

> A man page browser at freebsd.org for several platforms seems to indicate that 
> NetBSD has picked it up, but neither FreeBSD nor OpenBSD.  I don’t know if 
> NetBSD’s realloc will use it, but it’s certainly simpler if we just ignore 
> mremap for explicit use, and just bear in mind that realloc may not always have 
> to pay the expected copying penalty on all systems….

Once again, this is about the cases where using malloc for buffer text
gives unsatisfactory results, and mmap is being considered as a
remedy.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-26 11:40                                                                     ` Eli Zaretskii
@ 2016-10-27  8:51                                                                       ` Ken Raeburn
  0 siblings, 0 replies; 375+ messages in thread
From: Ken Raeburn @ 2016-10-27  8:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> I think we've lost context of the discussion.  Please see above: this
> is about the disadvantages of using mmap directly, i.e. for those
> cases where the native malloc or gmalloc suffer from memory
> fragmentation, and we decide to use mmap in buffer.c to countermand
> that.

Yes, sorry, I got a bit off track…

Ken


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-23 20:44                                                       ` Stefan Monnier
  2016-10-24  5:11                                                         ` Paul Eggert
@ 2016-10-24  6:59                                                         ` Eli Zaretskii
  2016-10-24 12:45                                                           ` Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  6:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 23 Oct 2016 16:44:10 -0400
> 
> >> I don't think it matters very much since we use mmap for the buffers,
> > No, we don't, not on GNU/Linux anyway.
> 
> AFAIK the decision not to use mmap was due to the fact that glibc's
> malloc itself uses mmap.  But if we don't use glibc's malloc, then why
> wouldn't we decide to use mmap ourselves for the buffers?

I already asked that:

  http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00678.html

The only answer was disappointing:

  I don't know, and would rather not spend time investigating.

It looks like either people don't realize what a land mine we just
stepped on, or they simply don't care enough.

Does it make sense to anyone to release Emacs 25.2 that doesn't work
reliably on recent GNU/Linux systems?  Because that's what is going to
happen if we don't invest all the resources we have into solving this.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24  6:59                                                         ` Eli Zaretskii
@ 2016-10-24 12:45                                                           ` Stefan Monnier
  2016-10-24 13:07                                                             ` Eli Zaretskii
  2016-10-24 16:56                                                             ` Richard Stallman
  0 siblings, 2 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 12:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> It looks like either people don't realize what a land mine we just
> stepped on, or they simply don't care enough.
[...]
> Does it make sense to anyone to release Emacs 25.2 that doesn't work
> reliably on recent GNU/Linux systems?  Because that's what is going to
> happen if we don't invest all the resources we have into solving this.

I must be missing something: you seem to know about past severe problems
we've had because of fragmentation, whereas I can't remember any
such occurrence.


        Stefan



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 12:45                                                           ` Stefan Monnier
@ 2016-10-24 13:07                                                             ` Eli Zaretskii
  2016-10-24 14:42                                                               ` Stefan Monnier
  2016-10-24 16:56                                                             ` Richard Stallman
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 13:07 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 08:45:04 -0400
> 
> > It looks like either people don't realize what a land mine we just
> > stepped on, or they simply don't care enough.
> [...]
> > Does it make sense to anyone to release Emacs 25.2 that doesn't work
> > reliably on recent GNU/Linux systems?  Because that's what is going to
> > happen if we don't invest all the resources we have into solving this.
> 
> I must be missing something: you seem to know about past severe problems
> we've had because of fragmentation, whereas I can't remember any
> such occurrence.

I don't understand how you get to talking about fragmentation.  I
never mentioned anything like that.  The problems I was talking about
are all related to using ralloc.c on GNU/Linux systems with a recent
enough glibc.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 13:07                                                             ` Eli Zaretskii
@ 2016-10-24 14:42                                                               ` Stefan Monnier
  2016-10-24 15:43                                                                 ` Eli Zaretskii
  2016-10-24 16:10                                                                 ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 14:42 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

> I don't understand how you get to talking about fragmentation.  I
> never mentioned anything like that.  The problems I was talking about
> are all related to using ralloc.c on GNU/Linux systems with a recent
> enough glibc.

I misunderstood, then.  I fully agree that ralloc.c is a landmine, if
that's what you meant.  That's why I think we should get rid of it.

And if we really want to keep it, we should prefer mmap over ralloc
(i.e. we should only consider ralloc in those cases where the mmap
alternative is unavailable (not sure if there are still systems where
this is the case, the DOS port maybe?)).

AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for
Emacs-25.2 with the new glibc, with no known problem.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 14:42                                                               ` Stefan Monnier
@ 2016-10-24 15:43                                                                 ` Eli Zaretskii
  2016-10-24 18:50                                                                   ` Stefan Monnier
  2016-10-24 16:10                                                                 ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 15:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 10:42:02 -0400
> 
> AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for
> Emacs-25.2 with the new glibc, with no known problem.

So you consider this preferable to the 2 alternatives I mentioned in

  http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00740.html

?

They both avoid using mmap, since I don't want to re-introduce its
disadvantages to GNU/Linux systems.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 15:43                                                                 ` Eli Zaretskii
@ 2016-10-24 18:50                                                                   ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 18:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for
>> Emacs-25.2 with the new glibc, with no known problem.
> So you consider this preferable to the 2 alternatives I mentioned in
>   http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00740.html
> ?

Not sure.  AFAIU, gmalloc-mmap-ralloc suffers from fragmentation, which
was the reason why ralloc was written in the first place, so I would
tend to shy away from it, but I have not personally seen those problems,
so I don't have a strong opinion on this.

As for using HYBRID_MALLOC, that would be a better solution I think, but
I haven't looked at the corresponding patch, so I don't know how safe it is.

        Stefan

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 14:42                                                               ` Stefan Monnier
  2016-10-24 15:43                                                                 ` Eli Zaretskii
@ 2016-10-24 16:10                                                                 ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 16:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 24 Oct 2016 10:42:02 -0400
> Cc: emacs-devel@gnu.org
> 
> And if we really want to keep it, we should prefer mmap over ralloc
> (i.e. we should only consider ralloc in those cases where the mmap
> alternative is unavailable (not sure if there are still systems where
> this is the case, the DOS port maybe?)).

The DOS port has code to work with its system malloc.  That code was
tested at the time, so this port shouldn't be an obstacle on the way
of getting rid of ralloc.c.

> AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for
> Emacs-25.2 with the new glibc, with no known problem.

I think gmalloc without mmap might be better.  See the alternatives I
mentioned elsewhere in this thread.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-24 12:45                                                           ` Stefan Monnier
  2016-10-24 13:07                                                             ` Eli Zaretskii
@ 2016-10-24 16:56                                                             ` Richard Stallman
  1 sibling, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-24 16:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I must be missing something: you seem to know about past severe problems
  > we've had because of fragmentation, whereas I can't remember any
  > such occurrence.

Fragmentation caused problems bad enough to motivate me to write
ralloc.  But I don't know about the present situation.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-22 18:34                           ` Paul Eggert
  2016-10-22 19:43                             ` When should ralloc.c be used? Stefan Monnier
@ 2016-10-24  0:21                             ` Richard Stallman
  2016-10-24  3:59                               ` Paul Eggert
                                                 ` (2 more replies)
  1 sibling, 3 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-24  0:21 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, npostavs, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I don't like it either, but would rather work on redoing the build process so 
  > that we can use the native malloc on all hosts.

That may not be desirable, though.  We started using GNU malloc
because it gave much better performance than some native mallocs.
Whether that is true today, I have no idea; I am only saying
that it is an issue to consider.

Stefan said:

  > But that doesn't explain why we'd need to use ralloc in the mean time.

Why would we not want to use ralloc?  It made a big improvement for
memory management when I wrote it.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24  0:21                             ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
@ 2016-10-24  3:59                               ` Paul Eggert
  2016-10-24  7:15                               ` Eli Zaretskii
  2016-10-24 14:04                               ` When should ralloc.c be used? Stefan Monnier
  2 siblings, 0 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-24  3:59 UTC (permalink / raw)
  To: rms; +Cc: eliz, npostavs, emacs-devel

Richard Stallman wrote:
>   > I don't like it either, but would rather work on redoing the build process so
>   > that we can use the native malloc on all hosts.
>
> That may not be desirable, though.  We started using GNU malloc
> because it gave much better performance

We could continue to do that, on the set of platforms where our copy of GNU 
malloc works significantly better than native malloc. My impression, though, is 
that this set of platforms is gradually shrinking due to improvements in memory 
allocators. See, e.g.:

Berger ED, Zorn BG, McKinley KS. Reconsidering custom memory allocation. 
OOPSLA'02. http://dx.doi.org/10.1145/2502508.2502522
https://people.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24  0:21                             ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  2016-10-24  3:59                               ` Paul Eggert
@ 2016-10-24  7:15                               ` Eli Zaretskii
  2016-10-24 16:55                                 ` Richard Stallman
  2016-10-24 14:04                               ` When should ralloc.c be used? Stefan Monnier
  2 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24  7:15 UTC (permalink / raw)
  To: rms; +Cc: npostavs, eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: eliz@gnu.org, emacs-devel@gnu.org, npostavs@users.sourceforge.net
> Date: Sun, 23 Oct 2016 20:21:33 -0400
> 
>   > I don't like it either, but would rather work on redoing the build process so 
>   > that we can use the native malloc on all hosts.
> 
> That may not be desirable, though.  We started using GNU malloc
> because it gave much better performance than some native mallocs.
> Whether that is true today, I have no idea; I am only saying
> that it is an issue to consider.

I think native malloc on GNU/Linux is much better these days; we were
using it all the recent years, until glibc developers removed the
hooks we needed for unexec support (which is why those GNU/Linux
systems where this change is already installed switched to gmalloc and
ralloc instead).

Emacs 25.1 switched to native malloc on MS-Windows as well, and I see
no problems with memory management due to that, perhaps even a small
improvement.

>   > But that doesn't explain why we'd need to use ralloc in the mean time.
> 
> Why would we not want to use ralloc?

It imposes hard-to-fulfill requirements on functions that get C
pointers to buffer text or to Lisp string data: those functions must
never call malloc, directly or indirectly.

This requirement was well known to the few Emacs developers in the
distant past, when all the platforms used ralloc.  But since the
modern platforms gradually migrated away from ralloc, this is almost
unknown to most current developers, and code crept in that violates
this requirement.  Fixing all that code is hard, because most of it is
not easily found; it manifests itself in corruption of buffer text,
random segfaults and aborts during GC, which happen a long time after
the offending code did its job.
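
The hazard can be shown with a toy model.  This is only a sketch, not ralloc.c itself: struct reloc_block, reloc_init, reloc_grow and char_after_grow are hypothetical names, and the "allocator" here always moves the block on growth to mimic relocation.  The point is that code which must survive an allocation should carry an offset and re-derive the pointer afterwards, never hold a raw char * across the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for relocatable buffer text: the data may be
   moved whenever more memory is requested.  */
struct reloc_block
{
  char *data;
  size_t size;
  unsigned moves;   /* how many times the data changed address */
};

/* Copy TEXT (with its terminating NUL) into a fresh block.
   Returns 0 on success, -1 on allocation failure.  */
static int reloc_init(struct reloc_block *b, const char *text)
{
  b->size = strlen(text) + 1;
  b->moves = 0;
  b->data = malloc(b->size);
  if (b->data == NULL)
    return -1;
  memcpy(b->data, text, b->size);
  return 0;
}

/* Grow B to NEW_SIZE bytes, always at a new address.  The new block
   is allocated before the old one is freed, so the two addresses
   are guaranteed to differ.  */
static int reloc_grow(struct reloc_block *b, size_t new_size)
{
  assert(new_size >= b->size);
  char *fresh = malloc(new_size);
  if (fresh == NULL)
    return -1;
  memcpy(fresh, b->data, b->size);
  free(b->data);   /* any raw char * into the old block now dangles */
  b->data = fresh;
  b->size = new_size;
  b->moves++;
  return 0;
}

/* Safe pattern: remember OFFSET, let the allocation happen, then
   re-derive the address from B->data.  Returns the character, or -1
   if growing failed.  */
static int char_after_grow(struct reloc_block *b, size_t offset)
{
  assert(offset < b->size);
  if (reloc_grow(b, b->size * 2) != 0)
    return -1;
  return (unsigned char) b->data[offset];
}
```

Had char_after_grow saved b->data + offset before calling reloc_grow and dereferenced it afterwards, it would have read freed memory, which is the kind of delayed corruption described above.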

> It made a big improvement for memory management when I wrote it.

It is no longer a big improvement, as modern platforms manage memory
much better in their native malloc implementations.  So ralloc is
nowadays a significant disadvantage almost without advantages.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24  7:15                               ` Eli Zaretskii
@ 2016-10-24 16:55                                 ` Richard Stallman
  2016-10-24 17:09                                   ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-24 16:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I think native malloc on GNU/Linux is much better these days; we were
  > using it all the recent years, until glibc developers removed the
  > hooks we needed for unexec support (which is why those GNU/Linux
  > systems where this change is already installed switched to gmalloc and
  > ralloc instead).

Should we talk with them about putting in those hooks or other
suitable hooks?  Then we could go back to the libc malloc.

  > It imposes hard-to-fulfill requirements on functions that get C
  > pointers to buffer text or to Lisp string data: those functions must
  > never call malloc, directly or indirectly.

I think the way to fix those is by systematically looking at the
source for them, rather than by debugging.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24 16:55                                 ` Richard Stallman
@ 2016-10-24 17:09                                   ` Eli Zaretskii
  2016-10-25  2:35                                     ` Richard Stallman
                                                       ` (2 more replies)
  0 siblings, 3 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 17:09 UTC (permalink / raw)
  To: rms; +Cc: npostavs, eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: eggert@cs.ucla.edu, emacs-devel@gnu.org,
> 	npostavs@users.sourceforge.net
> Date: Mon, 24 Oct 2016 12:55:36 -0400
> 
>   > I think native malloc on GNU/Linux is much better these days; we were
>   > using it all the recent years, until glibc developers removed the
>   > hooks we needed for unexec support (which is why those GNU/Linux
>   > systems where this change is already installed switched to gmalloc and
>   > ralloc instead).
> 
> Should we talk with them about putting in those hooks or other
> suitable hooks?  Then we could go back to the libc malloc.

I think we tried, and more or less failed.  (That was in the context
of unexec, but the arguments are more or less similar.)

>   > It imposes hard-to-fulfill requirements on functions that get C
>   > pointers to buffer text or to Lisp string data: those functions must
>   > never call malloc, directly or indirectly.
> 
> I think the way to fix those is by systematically looking at the
> source for them, rather than by debugging.

Yes, but finding out whether this is so is not easy, because the
malloc call is sometimes buried very deep.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24 17:09                                   ` Eli Zaretskii
@ 2016-10-25  2:35                                     ` Richard Stallman
  2016-10-25  6:38                                       ` Paul Eggert
  2016-10-25 16:04                                       ` Eli Zaretskii
  2016-10-25  2:35                                     ` Richard Stallman
  2016-10-25 23:00                                     ` Perry E. Metzger
  2 siblings, 2 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-25  2:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > Should we talk with them about putting in those hooks or other
  > > suitable hooks?  Then we could go back to the libc malloc.

  > I think we tried, and more or less failed.  (That was in the context
  > of unexec, but the arguments are more or less similar.)

How did it fail?  Did they give it a strong try?

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25  2:35                                     ` Richard Stallman
@ 2016-10-25  6:38                                       ` Paul Eggert
  2016-10-25 16:04                                       ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-25  6:38 UTC (permalink / raw)
  To: rms, Eli Zaretskii; +Cc: npostavs, emacs-devel

Richard Stallman wrote:
>   > > Should we talk with them about putting in those hooks or other
>   > > suitable hooks?  Then we could go back to the libc malloc.
>
>   > I think we tried, and more or less failed.  (That was in the context
>   > of unexec, but the arguments are more or less similar.)
>
> How did it fail?  Did they give it a strong try?

It was more the other way around. People working on the glibc memory allocator 
convinced me that the malloc hooks were a significant impediment to performance 
improvements within glibc, and that Emacs unexec didn't really need those hooks 
any more. Emacs was the only major user of that part of the old glibc API.

For those interested in GNU malloc performance improvements, a talk related to 
the current effort is scheduled a week from Thursday in Santa Fe. Please see:

O'Donell C. linux and glibc: The 4.5TiB malloc API trace. LPC 2016. 
https://linuxplumbersconf.org/2016/ocw/proposals/3921

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25  2:35                                     ` Richard Stallman
  2016-10-25  6:38                                       ` Paul Eggert
@ 2016-10-25 16:04                                       ` Eli Zaretskii
  2016-10-25 23:49                                         ` Richard Stallman
  2016-10-25 23:49                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  1 sibling, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 16:04 UTC (permalink / raw)
  To: rms; +Cc: npostavs, eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: eggert@cs.ucla.edu, emacs-devel@gnu.org,
> 	npostavs@users.sourceforge.net
> Date: Mon, 24 Oct 2016 22:35:53 -0400
> 
>   > > Should we talk with them about putting in those hooks or other
>   > > suitable hooks?  Then we could go back to the libc malloc.
> 
>   > I think we tried, and more or less failed.  (That was in the context
>   > of unexec, but the arguments are more or less similar.)
> 
> How did it fail?

My take is that the glibc developers don't really want to hear about
keeping those hooks.

> Did they give it a strong try?

I don't know what that means in practice.  What would make the try
"strong"?

You can see the discussion starting here:

  http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg00956.html

You took some part in the discussion, at least its public part (I
understand there was also an off-list part).  I think once you said
here:

  http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01633.html

that you favored replacing unexec by a more portable scheme, there was
no longer any reason to make our argument stronger.

Since then Paul implemented a workaround on the master branch, which
uses gmalloc during dumping, and switches to the native malloc in the
dumped executable.
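
As a rough illustration of that two-phase idea, here is a sketch under stated assumptions; it is not the code on master.  A static bump arena stands in for gmalloc, and the names arena, dumped, in_arena, hybrid_malloc and hybrid_free are hypothetical.  Before the "dump" memory comes from the arena; afterwards requests go to the system malloc, and free must first check which allocator owns the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical arena standing in for memory handed out before the
   dump.  Its size is a multiple of the maximum alignment.  */
static _Alignas(max_align_t) unsigned char arena[1 << 16];
static size_t arena_used;
static int dumped;   /* 0 while dumping, 1 in the dumped executable */

/* Nonzero if P points into the pre-dump arena.  */
static int in_arena(const void *p)
{
  uintptr_t start = (uintptr_t) arena;
  uintptr_t q = (uintptr_t) p;
  return q >= start && q < start + sizeof arena;
}

/* Before the dump, carve memory out of the arena; after it, defer to
   the system malloc.  */
static void *hybrid_malloc(size_t size)
{
  if (dumped)
    return malloc(size);

  if (size == 0)
    size = 1;
  if (size > sizeof arena - arena_used)
    return NULL;

  /* Round up so the next allocation stays suitably aligned.  */
  size_t align = _Alignof(max_align_t);
  size_t need = (size + align - 1) / align * align;

  void *p = arena + arena_used;
  arena_used += need;
  return p;
}

/* Arena memory is never handed to the system free; everything else
   came from malloc and goes back to it.  */
static void hybrid_free(void *p)
{
  if (p == NULL || in_arena(p))
    return;
  free(p);
}
```

The ownership test in hybrid_free is the essential part: passing a pre-dump pointer to the system free would corrupt the native heap.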

At the time, we didn't realize, I think, that removing the glibc hooks
would cause GNU/Linux systems to start using ralloc.c, which is the
trigger for the present discussion.  The discovery of this issue means
that the hope expressed in the Jan 2016 discussions that Emacs
versions before 25 will continue to be usable on GNU/Linux systems
with a newer glibc -- that hope was too optimistic.  Based on what
we've learned the hard way during the last couple of weeks, I'd say
that all the Emacs versions before 25.2 (including 25.1) will be
unstable on such GNU systems to the degree of making them almost
unusable.  E.g., one bug report related to this claims crashes inside
GC once every 10-15 minutes, something that is IMO unbearably
frequent, especially given that segfaults during GC almost always
cause loss of work (because auto-saving almost always fails).

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 16:04                                       ` Eli Zaretskii
@ 2016-10-25 23:49                                         ` Richard Stallman
  2016-10-26  5:08                                           ` Paul Eggert
  2016-10-26 11:37                                           ` Eli Zaretskii
  2016-10-25 23:49                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  1 sibling, 2 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-25 23:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  >   I think once you said
  > here:

  >   http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01633.html

  > that you favored replacing unexec by a more portable scheme, there was
  > no longer any reason to make our argument stronger.

In general, I'm in favor of a more portable method.  But we don't have
one now.  Is it feasible to do?  Is anyone working on one?  

If not, then I hope we can design, with the Glibc developers, a
different set of hooks to allow us to make unexec work.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 23:49                                         ` Richard Stallman
@ 2016-10-26  5:08                                           ` Paul Eggert
  2016-10-26 11:46                                             ` Eli Zaretskii
  2016-10-27  1:23                                             ` Richard Stallman
  2016-10-26 11:37                                           ` Eli Zaretskii
  1 sibling, 2 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-26  5:08 UTC (permalink / raw)
  To: rms, Eli Zaretskii; +Cc: emacs-devel, npostavs

Richard Stallman wrote:
> In general, I'm in favor of a more portable method.  But we don't have
> one now.  Is it feasible to do?  Is anyone working on one?

Yes, it's feasible. It is on my list of things to do. Admittedly I'm stretched 
thin, and the approach I prefer (generating and then compiling C code) is not 
everybody's favorite.

> If not, then I hope we can design, with the Glibc developers, a
> different set of hooks to allow us to make unexec work.

That would be more work than fixing Emacs, I expect. Plus, malloc hooks are not 
the only reason unexec is dicey.

> We should withdraw 25.1, I think.

I don't think that will help. Similar problems likely affect 24.5 and earlier 
versions, if they are built against bleeding-edge glibc and are configured in 
the default way.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-26  5:08                                           ` Paul Eggert
@ 2016-10-26 11:46                                             ` Eli Zaretskii
  2016-10-26 13:10                                               ` Noam Postavsky
  2016-10-27  1:23                                             ` Richard Stallman
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-26 11:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel, rms, npostavs

> Cc: npostavs@users.sourceforge.net, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 25 Oct 2016 22:08:21 -0700
> 
> > We should withdraw 25.1, I think.
> 
> I don't think that will help. Similar problems likely affect 24.5 and earlier 
> versions, if they are built against bleeding-edge glibc and are configured in 
> the default way.

Indeed, I agree.  People who first bump into this with Emacs 25.1 will
have 25.2 soon enough (I hope).  By contrast, those who will try to
build Emacs 24.x on the newer GNU/Linux systems will be unable to
resolve the instability problems exposed by using ralloc.c, except by
back-porting patches we have just committed to the Emacs repository,
which is not easy.

Not sure what to do with the old versions, or whether anything can be
done.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-26 11:46                                             ` Eli Zaretskii
@ 2016-10-26 13:10                                               ` Noam Postavsky
  2016-10-26 14:20                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Noam Postavsky @ 2016-10-26 13:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Paul Eggert, rms, Emacs developers

On Wed, Oct 26, 2016 at 7:46 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>> Cc: npostavs@users.sourceforge.net, emacs-devel@gnu.org
>> From: Paul Eggert <eggert@cs.ucla.edu>
>> Date: Tue, 25 Oct 2016 22:08:21 -0700
>>
>> > We should withdraw 25.1, I think.
>>
>> I don't think that will help. Similar problems likely affect 24.5 and earlier
>> versions, if they are built against bleeding-edge glibc and are configured in
>> the default way.
>
> Indeed, I agree.  People who first bump into this with Emacs 25.1 will
> have 25.2 soon enough (I hope).  By contrast, those who will try to
> build Emacs 24.x on the newer GNU/Linux systems will be unable to
> resolve the instability problems exposed by using ralloc.c, except by
> back-porting patches we have just committed to the Emacs repository,
> which is not easy.

Wouldn't configuring with REL_ALLOC=no work?



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-26 13:10                                               ` Noam Postavsky
@ 2016-10-26 14:20                                                 ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-26 14:20 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: eggert, rms, emacs-devel

> From: Noam Postavsky <npostavs@users.sourceforge.net>
> Date: Wed, 26 Oct 2016 09:10:35 -0400
> Cc: Paul Eggert <eggert@cs.ucla.edu>, rms@gnu.org, Emacs developers <emacs-devel@gnu.org>
> 
> Wouldn't configuring with REL_ALLOC=no work?

It could, yes.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-26  5:08                                           ` Paul Eggert
  2016-10-26 11:46                                             ` Eli Zaretskii
@ 2016-10-27  1:23                                             ` Richard Stallman
  2016-10-27  1:36                                               ` Paul Eggert
  1 sibling, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-27  1:23 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, emacs-devel, npostavs

  > Yes, it's feasible. It is on my list of things to do. Admittedly I'm stretched 
  > thin, and the approach I prefer (generating and then compiling C code) is not 
  > everybody's favorite.

Could you explain that more?

Does anyone want to implement another approach?


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27  1:23                                             ` Richard Stallman
@ 2016-10-27  1:36                                               ` Paul Eggert
  2016-10-27 13:35                                                 ` Perry E. Metzger
                                                                   ` (3 more replies)
  0 siblings, 4 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-27  1:36 UTC (permalink / raw)
  To: rms; +Cc: eliz, emacs-devel, npostavs

On 10/26/2016 06:23 PM, Richard Stallman wrote:
> Could you explain that more?

The main idea is to save the current Emacs state as C source code, then 
compile the (large and boring) .c file and relink Emacs with the 
resulting .o file instead of a dummy .o file that it would start off 
with. Most of this new .o file would be data; perhaps some would be code 
that would initialize the data, though we'd want to minimize this.

> Does anyone want to implement another approach?

Eli has mentioned a simpler approach, where we build an .elc file when 
saving Emacs state and load the .elc file during normal startup. The 
main worry about this approach is performance.
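
A minimal sketch of the "save state as C source" idea described above, written in Python for brevity rather than in Emacs's own C. It assumes a toy state made of named integers and strings; the names `emit_c_source`, `c_string_literal`, `fill_column` and `mode_name` are hypothetical, and real Emacs state (tagged Lisp objects that point at each other) would need far more machinery than this.

```python
def c_string_literal(text):
    """Return TEXT as a double-quoted C string literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def emit_c_source(state):
    """Render a {name: int-or-str} mapping as C definitions with initializers."""
    lines = ["/* Generated from saved state; do not edit by hand. */"]
    for name, value in sorted(state.items()):
        if isinstance(value, bool):
            # bool is a subclass of int in Python; reject it explicitly.
            raise TypeError(f"unsupported value for {name}: {value!r}")
        if isinstance(value, int):
            lines.append(f"long {name} = {value};")
        elif isinstance(value, str):
            lines.append(f"const char {name}[] = {c_string_literal(value)};")
        else:
            raise TypeError(f"unsupported value for {name}: {value!r}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical saved state; these are not real Emacs internals.
    print(emit_c_source({"fill_column": 70, "mode_name": "Fundamental"}))
```

In the scheme described above, text like this would be compiled and the resulting .o file linked in place of the dummy one, so most of the saved state arrives as initialized data rather than as code run at startup.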

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27  1:36                                               ` Paul Eggert
@ 2016-10-27 13:35                                                 ` Perry E. Metzger
  2016-10-27 14:51                                                   ` Paul Eggert
  2016-10-27 13:44                                                 ` Fabrice Popineau
                                                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-27 13:35 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, npostavs, rms, emacs-devel

On Wed, 26 Oct 2016 18:36:02 -0700 Paul Eggert <eggert@cs.ucla.edu>
wrote:
> On 10/26/2016 06:23 PM, Richard Stallman wrote:
> > Could you explain that more?  
> 
> The main idea is to save the current Emacs state as C source code,
> then compile the (large and boring) .c file and relink Emacs with
> the resulting .o file instead of a dummy .o file that it would
> start off with. Most of this new .o file would be data; perhaps
> some would be code that would initialize the data, though we'd want
> to minimize this.

Could the new dynamic loading feature be used here so it wouldn't be
necessary to re-link emacs?

Perry
-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 13:35                                                 ` Perry E. Metzger
@ 2016-10-27 14:51                                                   ` Paul Eggert
  2016-10-27 15:05                                                     ` Perry E. Metzger
  0 siblings, 1 reply; 375+ messages in thread
From: Paul Eggert @ 2016-10-27 14:51 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: eliz, npostavs, Fabrice Popineau, rms, emacs-devel

On 10/27/2016 06:35 AM, Perry E. Metzger wrote:
> Could the new dynamic loading feature be used here so it wouldn't be necessary to re-link emacs?

It might be doable, though I expect it'd be more work. Dynamic loading 
purposely isolates modules from Emacs internals, and most likely we'd 
need several bridges over that moat.

On 10/27/2016 06:44 AM, Fabrice Popineau wrote:

> I find it disturbing that a C compiler will be needed to redump Emacs.

It's a tradeoff, yes. My impression is that the rare user who redumps 
Emacs typically has a C compiler installed or can easily install one, so 
it shouldn't be much to ask.




^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 14:51                                                   ` Paul Eggert
@ 2016-10-27 15:05                                                     ` Perry E. Metzger
  2016-10-27 18:13                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-27 15:05 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, npostavs, Fabrice Popineau, rms, emacs-devel

On Thu, 27 Oct 2016 07:51:37 -0700 Paul Eggert <eggert@cs.ucla.edu>
wrote:
> On 10/27/2016 06:44 AM, Fabrice Popineau wrote:
> 
> > I find it disturbing that a C compiler will be needed to redump
> > Emacs.  
> It's a tradeoff, yes. My impression is that the rare user who
> redumps Emacs typically has a C compiler installed or can easily
> install one, so it shouldn't be much to ask.

Agreed. On free operating systems it's easy (that's the whole point of
freedom!), and even on non-free operating systems free compilers are
available, so it isn't a big deal for most users, I think. (On macOS
the official Xcode compiler is also available gratis.)

Perry
-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 15:05                                                     ` Perry E. Metzger
@ 2016-10-27 18:13                                                       ` Eli Zaretskii
  2016-10-27 21:03                                                         ` Perry E. Metzger
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-27 18:13 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: npostavs, eggert, fabrice.popineau, rms, emacs-devel

> Date: Thu, 27 Oct 2016 11:05:03 -0400
> From: "Perry E. Metzger" <perry@piermont.com>
> Cc: rms@gnu.org, eliz@gnu.org, emacs-devel@gnu.org,
>  npostavs@users.sourceforge.net, Fabrice Popineau
>  <fabrice.popineau@gmail.com>
> 
> > It's a tradeoff, yes. My impression is that the rare user who
> > redumps Emacs typically has a C compiler installed or can easily
> > install one, so it shouldn't be much to ask.
> 
> Agreed. On free operating systems it's easy (that's the whole point of
> freedom!), and even on non-free operating systems free compilers are
> available, so it isn't a big deal for most users I think. (On macOS
> the official XCode compiler is also available gratis.)

I can assure you that installing a fully functioning environment for
compiling programs is not a trivial task on MS-Windows.  It isn't
enough to have just a compiler: you need Binutils, support libraries
and header files, and a well-configured MSYS installation to be able
to run the Emacs build scripts and Makefiles.

Also, please don't forget that some people run Emacs on machines where
they are not system administrators and are not allowed to install
arbitrary packages.

Not everyone is in the same position as you and Paul (or myself).



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 18:13                                                       ` Eli Zaretskii
@ 2016-10-27 21:03                                                         ` Perry E. Metzger
  2016-10-27 21:07                                                           ` Daniel Colascione
  2016-10-28  7:03                                                           ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-27 21:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, fabrice.popineau, rms, emacs-devel

On Thu, 27 Oct 2016 21:13:05 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> > Date: Thu, 27 Oct 2016 11:05:03 -0400
> > From: "Perry E. Metzger" <perry@piermont.com>
> > Cc: rms@gnu.org, eliz@gnu.org, emacs-devel@gnu.org,
> >  npostavs@users.sourceforge.net, Fabrice Popineau
> >  <fabrice.popineau@gmail.com>
> >   
> > > It's a tradeoff, yes. My impression is that the rare user who
> > > redumps Emacs typically has a C compiler installed or can easily
> > > install one, so it shouldn't be much to ask.  
> > 
> > Agreed. On free operating systems it's easy (that's the whole
> > point of freedom!), and even on non-free operating systems free
> > compilers are available, so it isn't a big deal for most users I
> > think. (On macOS the official XCode compiler is also available
> > gratis.)  
> 
> I can assure you that installing a fully functioning environment for
> compiling programs is not a trivial task on MS-Windows.  It isn't
> enough to have just a compiler: you need Binutils, support libraries
> and header files, and a well-configured MSYS installation to be able
> to run Emacs build script and Makefiles.
> 
> Also, please don't forget that some people run Emacs on machines
> where they are not system administrators and are not allowed to
> install arbitrary packages.
> 
> Not everyone is in the same position as you and Paul (or myself).
> 

Sure, but most people never, ever undump an Emacs either unless
they're building from scratch or doing Emacs dev work...

-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 21:03                                                         ` Perry E. Metzger
@ 2016-10-27 21:07                                                           ` Daniel Colascione
  2016-10-27 23:23                                                             ` Perry E. Metzger
  2016-10-28  7:06                                                             ` When should ralloc.c be used? (WAS: bug#24358) Eli Zaretskii
  2016-10-28  7:03                                                           ` Eli Zaretskii
  1 sibling, 2 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-27 21:07 UTC (permalink / raw)
  To: Perry E. Metzger, Eli Zaretskii
  Cc: emacs-devel, eggert, fabrice.popineau, rms, npostavs

On 10/27/2016 02:03 PM, Perry E. Metzger wrote:
> On Thu, 27 Oct 2016 21:13:05 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
>>> Date: Thu, 27 Oct 2016 11:05:03 -0400
>>> From: "Perry E. Metzger" <perry@piermont.com>
>>> Cc: rms@gnu.org, eliz@gnu.org, emacs-devel@gnu.org,
>>>  npostavs@users.sourceforge.net, Fabrice Popineau
>>>  <fabrice.popineau@gmail.com>
>>>
>>>> It's a tradeoff, yes. My impression is that the rare user who
>>>> redumps Emacs typically has a C compiler installed or can easily
>>>> install one, so it shouldn't be much to ask.
>>>
>>> Agreed. On free operating systems it's easy (that's the whole
>>> point of freedom!), and even on non-free operating systems free
>>> compilers are available, so it isn't a big deal for most users I
>>> think. (On macOS the official XCode compiler is also available
>>> gratis.)
>>
>> I can assure you that installing a fully functioning environment for
>> compiling programs is not a trivial task on MS-Windows.  It isn't
>> enough to have just a compiler: you need Binutils, support libraries
>> and header files, and a well-configured MSYS installation to be able
>> to run Emacs build script and Makefiles.
>>
>> Also, please don't forget that some people run Emacs on machines
>> where they are not system administrators and are not allowed to
>> install arbitrary packages.
>>
>> Not everyone is in the same position as you and Paul (or myself).
>>
>
> Sure, but most people never, ever undump an Emacs either unless
> they're building from scratch or doing Emacs dev work...
>

That's because it doesn't really work. That's why I added code that 
explicitly stops repeated dumps. It doesn't mean I don't want to support 
user dumps.

It pains me to see people tolerate 30-second Emacs startup times. The 
daemon is a hack. I want Emacs's normal mode of operation to be to start 
from *user* *specific* saved state --- that way, all Emacs instances can 
be as fast as emacs -Q.

If we need a compiler to make this happen, so be it. We'll just require 
libgcc, or hell, check it in to the repository, the way gcc checks in 
its dependencies.

An additional benefit of integrating with a compiler at runtime is the 
potential to JIT elisp code. Both LLVM and GCC these days have usable JIT 
interfaces. We could even serialize JIT traces in these user Emacs dumps.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 21:07                                                           ` Daniel Colascione
@ 2016-10-27 23:23                                                             ` Perry E. Metzger
  2016-10-27 23:32                                                               ` When should ralloc.c be used? Daniel Colascione
  2016-10-28  7:06                                                             ` When should ralloc.c be used? (WAS: bug#24358) Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-27 23:23 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: eggert, rms, npostavs, fabrice.popineau, emacs-devel,
	Eli Zaretskii

On Thu, 27 Oct 2016 14:07:46 -0700 Daniel Colascione
<dancol@dancol.org> wrote:
> If we need a compiler to make this happen, so be it. We'll just
> require libgcc, or hell, check it in to the repository, the way gcc
> checks in its dependencies.
> 
> An additional benefit of integrating with a compiler at runtime is
> the potential to JIT elisp code. Both LLVM and GCC these days have
> usable JIT interfaces. We could even serialize JIT traces in these
> user Emacs dumps.

Having a JIT for emacs bytecode (or some other IR) would be really
superb. I had no idea that GCC now had JIT support, but if it is as
easy to use as LLVM's, a prototype would not be a hard project. (I
presume RMS would insist on GCC as the basis.)

Of course, given that Emacs already byte compiles everything, maybe
going straight to machine code rather than the bytecode + JIT would
be good? Again, I don't know what GCC's infra is like, but if it is
as good as LLVM's that would be quite straightforward.

Perry
-- 
Perry E. Metzger		perry@piermont.com



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used?
  2016-10-27 23:23                                                             ` Perry E. Metzger
@ 2016-10-27 23:32                                                               ` Daniel Colascione
  0 siblings, 0 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-27 23:32 UTC (permalink / raw)
  To: Perry E. Metzger
  Cc: eggert, rms, npostavs, fabrice.popineau, emacs-devel,
	Eli Zaretskii

"Perry E. Metzger" <perry@piermont.com> writes:

> On Thu, 27 Oct 2016 14:07:46 -0700 Daniel Colascione
> <dancol@dancol.org> wrote:
>> If we need a compiler to make this happen, so be it. We'll just
>> require libgcc, or hell, check it in to the repository, the way gcc
>> checks in its dependencies.
>> 
>> An additional benefit of integrating with a compiler at runtime is
>> the potential to JIT elisp code. Both LLVM and GCC these days have
>> usable JIT interfaces. We could even serialize JIT traces in these
>> user Emacs dumps.
>
> Having a JIT for emacs bytecode (or some other IR) would be really
> superb. I had no idea that GCC now had JIT support, but if it is as
> easy to use as LLVM's, a prototype would not be a hard project. (I
> presume RMS would insist on GCC as the basis.)

GCC's interface isn't nearly as mature as LLVM's yet, but there's promise:

https://gcc.gnu.org/wiki/JIT

> Of course, given that Emacs already byte compiles everything, maybe
> going straight to machine code rather than the bytecode + JIT would
> be good? Again, I don't know what GCC's infra is like, but if it is
> as good as LLVM's that would be quite straightforward.

AOT is all the rage right now (JEP 295), but I believe that tracing JITs
are ultimately the right choice for code density and installation
latency reasons. But this is one of those arguments that's never going
to be solved.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 21:07                                                           ` Daniel Colascione
  2016-10-27 23:23                                                             ` Perry E. Metzger
@ 2016-10-28  7:06                                                             ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  7:06 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: eggert, rms, npostavs, fabrice.popineau, emacs-devel, perry

> Cc: npostavs@users.sourceforge.net, eggert@cs.ucla.edu,
>  fabrice.popineau@gmail.com, rms@gnu.org, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Thu, 27 Oct 2016 14:07:46 -0700
> 
> > Sure, but most people never, ever undump an Emacs either unless
> > they're building from scratch or doing Emacs dev work...
> >
> 
> That's because it doesn't really work. That's why I added code that 
> explicitly stops repeated dumps. It doesn't mean I don't want to support 
> user dumps.

Exactly.

> It pains me to see people tolerate 30 second Emacs startup times. The 
> daemon is a hack. I want Emacs normal mode of operation to be to start 
> from *user* *specific* saved state --- that way, all Emacs instances can 
> be as fast as emacs -Q.
> 
> If we need a compiler to make this happen, so be it.

But if there's a simpler method that doesn't get us anywhere near 30
sec, that should be "good enough".

> We'll just require libgcc, or hell, check it in to the repository,

You can't, not without having all the GCC sources.  But that's an
aside.

> An additional benefit of integrating with a compiler at runtime is the 
> potential to JIT elisp code. Both LLVM and GCC these days have usable JIT 
> interfaces. We could even serialize JIT traces in these user Emacs dumps.

This should be an opt-in feature, not a hard requirement, IMO.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 21:03                                                         ` Perry E. Metzger
  2016-10-27 21:07                                                           ` Daniel Colascione
@ 2016-10-28  7:03                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  7:03 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: npostavs, eggert, fabrice.popineau, rms, emacs-devel

> Date: Thu, 27 Oct 2016 17:03:45 -0400
> From: "Perry E. Metzger" <perry@piermont.com>
> Cc: eggert@cs.ucla.edu, rms@gnu.org, emacs-devel@gnu.org,
>  npostavs@users.sourceforge.net, fabrice.popineau@gmail.com
> 
> > Not everyone is in the same position as you and Paul (or myself).
> > 
> 
> Sure, but most people never, ever undump an Emacs either unless
> they're building from scratch or doing Emacs dev work...

Oh, so now we are going to argue that a feature that can't be easily
had is not important?  Then I'll claim that the Emacs startup time is
not important, either, because "most people never, ever" start Emacs
except when their machine starts, and their Emacs session is
thereafter running for weeks and months without ever restarting.

Let's agree to respect other people's usage patterns and
circumstances, even if they are different from ours.  Emacs is great
because it allows so many different patterns, so preferring one of
them too much is something we should avoid.  If each one of us sees
only their personal needs as the most important ones, we will never be
able to agree on anything.

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27  1:36                                               ` Paul Eggert
  2016-10-27 13:35                                                 ` Perry E. Metzger
@ 2016-10-27 13:44                                                 ` Fabrice Popineau
  2016-10-27 15:35                                                   ` Eli Zaretskii
  2016-10-27 20:39                                                 ` Richard Stallman
  2016-10-27 20:40                                                 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  3 siblings, 1 reply; 375+ messages in thread
From: Fabrice Popineau @ 2016-10-27 13:44 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Eli Zaretskii, Noam Postavsky, rms, Emacs developers

2016-10-27 3:36 GMT+02:00 Paul Eggert <eggert@cs.ucla.edu>:

> On 10/26/2016 06:23 PM, Richard Stallman wrote:
>
>> Could you explain that more?
>>
>
> The main idea is to save the current Emacs state as C source code, then
> compile the (large and boring) .c file and relink Emacs with the resulting
> .o file instead of a dummy .o file that it would start off with. Most of
> this new .o file would be data; perhaps some would be code that would
> initialize the data, though we'd want to minimize this.
>
>
I find it disturbing that a C compiler will be needed to redump Emacs.
Am I alone?

Fabrice

^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 13:44                                                 ` Fabrice Popineau
@ 2016-10-27 15:35                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-27 15:35 UTC (permalink / raw)
  To: Fabrice Popineau; +Cc: npostavs, eggert, rms, emacs-devel

> From: Fabrice Popineau <fabrice.popineau@gmail.com>
> Date: Thu, 27 Oct 2016 15:44:27 +0200
> Cc: rms@gnu.org, Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>, 
> 	Noam Postavsky <npostavs@users.sourceforge.net>
> 
>  The main idea is to save the current Emacs state as C source code, then compile the (large and
>  boring) .c file and relink Emacs with the resulting .o file instead of a dummy .o file that it would start off
>  with. Most of this new .o file would be data; perhaps some would be code that would initialize the data,
>  though we'd want to minimize this.
> 
> I find it disturbing that a C compiler will be needed to redump Emacs.
> Am I alone ? 

No.



^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27  1:36                                               ` Paul Eggert
  2016-10-27 13:35                                                 ` Perry E. Metzger
  2016-10-27 13:44                                                 ` Fabrice Popineau
@ 2016-10-27 20:39                                                 ` Richard Stallman
  2016-10-28  6:48                                                   ` Eli Zaretskii
  2016-10-28 12:51                                                   ` When should ralloc.c be used? Stefan Monnier
  2016-10-27 20:40                                                 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  3 siblings, 2 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-27 20:39 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, emacs-devel, npostavs

  > Eli has mentioned a simpler approach, where we build an .elc file when 
  > saving Emacs state and load the .elc file during normal startup. The 
  > main worry about this approach is performance.

There is no reason why we have to choose between C code and Lisp code.
It's worth developing a special-purpose format for this, if that would
be considerably faster.

However, reading a file that specifies construction of objects is
always going to be slower than copying a memory dump.  What part of the
time could we save with a different format?

Here's an idea.  We separate (1) creating objects from (2) storing them.
We define several operations.  We record all objects created
by sequence number.

We have these ways of creating an object.  That object is given
the next consecutive sequence number.

* intern (specify symbol name)
* variable value (specify variable name)
* string (specify contents)
* integer (specify value)
* cons (specify two sequence numbers)
* list (specify N sequence numbers)
* array (specify N sequence numbers)
* expression (specify the expression textually; it gets evalled)

And these ways of storing the last object.

* store in a variable (specify symbol name)
* store in a function cell (specify symbol name)
* store in car of cons cell (specify its sequence number)
* store in cdr of cons cell (specify its sequence number)
* store in array element (specify array sequence number and index)
* store in symbol property (specify symbol sequence number and property name)
* store using expression (specify a lambda expression with one arg, textually)
* store in special internal place (specify a code number to say which place)

And one storage reclaimer

* discard all sequence numbers back to N.
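
A minimal sketch of how a loader for such a format might look, written in Python for brevity. Everything here is an assumption made for illustration: the operation names (`"cons"`, `"store-variable"`, and so on), the `Loader`, `Cons` and `Symbol` classes, the use of `None` for nil, and the reading of "discard back to N" as "keep only sequence numbers below N". Only a subset of the operations listed above is covered; arrays, expressions, symbol properties and internal places are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass
class Cons:
    car: object
    cdr: object


class Loader:
    """Toy interpreter for a sequence-numbered object-construction format."""

    def __init__(self):
        self.objects = []    # index in this list = sequence number
        self.symbols = {}    # intern table: name -> Symbol
        self.variables = {}  # symbol name -> value cell
        self.functions = {}  # symbol name -> function cell

    def _create(self, obj):
        # Every created object gets the next consecutive sequence number.
        self.objects.append(obj)

    def run(self, ops):
        for op, *args in ops:
            if op in ("string", "integer"):
                self._create(args[0])
            elif op == "intern":
                self._create(self.symbols.setdefault(args[0], Symbol(args[0])))
            elif op == "variable-value":
                self._create(self.variables[args[0]])
            elif op == "cons":
                self._create(Cons(self.objects[args[0]], self.objects[args[1]]))
            elif op == "list":
                chain = None  # None stands in for nil
                for seq in reversed(args):
                    chain = Cons(self.objects[seq], chain)
                self._create(chain)
            # The store operations all act on the most recently created object.
            elif op == "store-variable":
                self.variables[args[0]] = self.objects[-1]
            elif op == "store-function":
                self.functions[args[0]] = self.objects[-1]
            elif op == "store-car":
                self.objects[args[0]].car = self.objects[-1]
            elif op == "store-cdr":
                self.objects[args[0]].cdr = self.objects[-1]
            elif op == "discard":
                del self.objects[args[0]:]
            else:
                raise ValueError(f"unknown operation: {op}")


# Build the pair (1 . "two"), store it in the variable `demo`, then reclaim.
DEMO_OPS = [
    ("integer", 1),              # sequence number 0
    ("string", "two"),           # sequence number 1
    ("cons", 0, 1),              # sequence number 2
    ("store-variable", "demo"),  # stores object 2
    ("discard", 0),              # forget all sequence numbers
]

if __name__ == "__main__":
    loader = Loader()
    loader.run(DEMO_OPS)
    print(loader.variables["demo"])  # prints Cons(car=1, cdr='two')
```

Because stores always refer to the last object created, each record needs only the operands that name its target, and the discard operation keeps the sequence-number table from growing without bound.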


^ permalink raw reply	[flat|nested] 375+ messages in thread

* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 20:39                                                 ` Richard Stallman
@ 2016-10-28  6:48                                                   ` Eli Zaretskii
  2016-10-28 19:12                                                     ` Richard Stallman
  2016-10-28 12:51                                                   ` When should ralloc.c be used? Stefan Monnier
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  6:48 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: eliz@gnu.org, npostavs@users.sourceforge.net, emacs-devel@gnu.org
> Date: Thu, 27 Oct 2016 16:39:59 -0400
> 
>   > Eli has mentioned a simpler approach, where we build an .elc file when 
>   > saving Emacs state and load the .elc file during normal startup. The 
>   > main worry about this approach is performance.
> 
> There is no reason why we have to choose between C code and Lisp code.
> It's worth developing a special-purpose format for this, if that would
> be considerably faster.

The Lisp approach has a huge advantage: it is much simpler, so
everyone here will understand it, and it is much easier to maintain
and develop.

So if the performance hit is bearable (meaning will be accepted by the
crowd), it should IMO be preferred for reasons of project management
and its future, even though faster methods exist.  IOW, the goal of
bringing unexec out of the shadows of the system-level black magic it
is now should trump the "faster is always better" principle, if we
care about the future of Emacs in the face of the fact that fewer and
fewer people know, or even want to know, about segments and offsets in
a binary executable file.

And speaking about performance: I suggest people who worry about that
start by comparing startup times of past versions of Emacs.  Using
this simple benchmark proposed by Andreas:

  time emacs -batch --eval t

I see that we've been consistently adding 10% of startup time with
each major release, beginning with Emacs 23, so 25.1 starts about 25%
slower than 22.3.  If that didn't cause an outcry, then perhaps our
concerns about differences of this order of magnitude in startup time
are a tad exaggerated?


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-28  6:48                                                   ` Eli Zaretskii
@ 2016-10-28 19:12                                                     ` Richard Stallman
  2016-10-29  6:37                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-28 19:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > The Lisp approach has a huge advantage: it is much simpler, so
  > everyone here will understand it, and it is much easier to maintain
  > and develop.

The special format I propose is simple enough.

  > So if the performance hit is bearable (meaning will be accepted by the
  > crowd), it should IMO be preferred for reasons of project management
  > and its future,

Slowness here affects every user and is quite noticeable.
Don't we already know that Lisp is too slow for this?

It is worth substantial extra effort to speed this up.

  > care about the future of Emacs in the face of the fact that fewer and
  > fewer people know, or even want to know, about segments and offsets in
  > a binary executable file.

That is an argument for replacing unexec with something that saves the
data to reload, but it is not an argument for using Lisp as the format.

  >   Using
  > this simple benchmark proposed by Andreas:

  >   time emacs -batch --eval t

I just tried it with my current build (from June).  It took .26 seconds,
which is fast enough.

If replacing unexec with loading Lisp takes .05 seconds more, I won't
complain.  But I think it will take several seconds, if not minutes.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-28 19:12                                                     ` Richard Stallman
@ 2016-10-29  6:37                                                       ` Eli Zaretskii
  2016-10-29 14:55                                                         ` When should ralloc.c be used? Stefan Monnier
  2016-10-29 16:38                                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-29  6:37 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: eggert@cs.ucla.edu, npostavs@users.sourceforge.net,
> 	emacs-devel@gnu.org
> Date: Fri, 28 Oct 2016 15:12:27 -0400
> 
>   > The Lisp approach has a huge advantage: it is much simpler, so
>   > everyone here will understand it, and it is much easier to maintain
>   > and develop.
> 
> The special format I propose is simple enough.

For you and me (and a few others), maybe.  For most of the current
Emacs contributors it's nowhere near "simple enough", because it
requires one to be familiar with intimate details of Emacs object
design and implementation.

IOW, for the purposes of this discussion, I consider anything that is
not mostly Lisp "not simple".

>   > So if the performance hit is bearable (meaning will be accepted by the
>   > crowd), it should IMO be preferred for reasons of project management
>   > and its future,
> 
> Slowness here affects every user and is quite noticeable.
> Don't we already know that Lisp is too slow for this?

No, we don't know that, because we never tried to implement any method
of reading compiled Lisp that is optimized for speed and targets a
bare Emacs.

E.g., it turned out that most of the time it takes 'loadup' to do its
job is due to the linear search of pure strings in
find_string_data_in_pure, called by make_pure_string.  If we call
'loadup' upon every startup, the need for pure storage goes away, and
the 'loadup' time can be sped up tenfold.  And that is even before
making all of the preloaded files a single file, which speeds up
things at least twofold more, according to my measurements.

So here you have a 20-fold speedup just by two very simple measures.
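
To make the cost concrete, here is a minimal sketch of why deduplicating by linear search gets expensive as the pool grows.  It is not the code of find_string_data_in_pure; it is a simplified stand-in that compares whole strings, and the names (struct pool, intern_linear, append_only) and the comparison counter are made up for illustration.

```c
#include <string.h>

enum { POOL_MAX = 64 };

/* Hypothetical string pool with a counter for the work done.  */
struct pool
{
  const char *items[POOL_MAX];
  int count;
  long comparisons;
};

/* Deduplicating insert: scan every stored string before adding.
   Inserting N distinct strings costs 0 + 1 + ... + (N-1) comparisons.  */
const char *
intern_linear (struct pool *p, const char *s)
{
  for (int i = 0; i < p->count; i++)
    {
      p->comparisons++;
      if (strcmp (p->items[i], s) == 0)
        return p->items[i];
    }
  if (p->count == POOL_MAX)
    return NULL;
  p->items[p->count++] = s;
  return s;
}

/* Non-deduplicating insert: constant work, at the price of
   possibly storing the same text twice.  */
const char *
append_only (struct pool *p, const char *s)
{
  if (p->count == POOL_MAX)
    return NULL;
  p->items[p->count++] = s;
  return s;
}
```

The trade-off is the one described above: the scan buys a more compact pool, which pays off when the result is dumped once, but it is pure overhead if the pool is rebuilt on every startup.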

>   > care about the future of Emacs in the face of the fact that fewer and
>   > fewer people know, or even want to know, about segments and offsets in
>   > a binary executable file.
> 
> That is an argument for replacing unexec with something that saves the
> data to reload, but it is not an argument for using Lisp as the format.

It is an argument for both, because I don't think we can count on too
many people here being able to tinker with Lisp object internals in
the future.  The fewer such features we have that will need
maintenance, the better for Emacs viability in the long run.

>   >   time emacs -batch --eval t
> 
> I just tried it with my current build (from June).  It took .26 seconds,
> which is fast enough.
> 
> If replacing unexec with loading Lisp takes .05 seconds more, I won't
> complain.  But I think it will take several seconds, if not minutes.

How long does it take on your system to do this:

  time src/temacs -batch -l loadup

And if you modify Emacs with the patch posted here:

  http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html

how long does it take temacs to loadup then?


* Re: When should ralloc.c be used?
  2016-10-29  6:37                                                       ` Eli Zaretskii
@ 2016-10-29 14:55                                                         ` Stefan Monnier
  2016-10-30 16:13                                                           ` Eli Zaretskii
  2016-10-29 16:38                                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-29 14:55 UTC (permalink / raw)
  To: emacs-devel

> E.g., it turned out that most of the time it takes 'loadup' to do its
> job is due to the linear search of pure strings in
> find_string_data_in_pure, called by make_pure_string.

Indeed.  Hash-consing pure objects takes another very significant
percentage of the time.

> If we call 'loadup' upon every startup, the need for pure storage goes
> away, and the 'loadup' time can be sped up tenfold.

Actually, there's no *need* for pure storage in either case.

There are benefits to the use of pure storage, and some of them remain
even if we don't dump: one of the benefits that remains is to reduce the
size of the GC'd heap and hence speed up the GC.  Whether that's
significant enough to bother with is of course debatable.  But note
also that we could keep the use of pure-space without paying the hefty
price of find_string_data_in_pure (or hash-consing), since these merely
try to make the purespace more compact.  The extra cost is OK when we
dump the result, but it's not worth the trouble if we don't dump.

> And that is even before making all of the preloaded files a single
> file, which speeds up things at least twofold more, according to
> my measurements.

BTW, how large is that single file (I'm curious if its size is
significantly different from the 3.2MB I got for my dumped.elc)?

        Stefan


* Re: When should ralloc.c be used?
  2016-10-29 14:55                                                         ` When should ralloc.c be used? Stefan Monnier
@ 2016-10-30 16:13                                                           ` Eli Zaretskii
  2016-10-30 21:47                                                             ` Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-30 16:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 29 Oct 2016 10:55:20 -0400
> 
> BTW, how large is that single file (I'm curious if its size is
> significantly different from the 3.2MB I got for my dumped.elc)?

Its size is 4.2MB.  It's basically a concatenation of all the
preloaded *.elc files, with all but a single preamble removed.




* Re: When should ralloc.c be used?
  2016-10-30 16:13                                                           ` Eli Zaretskii
@ 2016-10-30 21:47                                                             ` Stefan Monnier
  0 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-30 21:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> BTW, how large is that single file (I'm curious if its size is
>> significantly different from the 3.2MB I got for my dumped.elc)?
> Its size is 4.2MB.  It's basically a concatenation of all the
> preloaded *.elc files, with all but a single preamble removed.

OK, so it's pretty much exactly the same size as what I get (the 3.2MB
I get turns into 4.2MB if I print each function/var separately, thus
preventing sharing between them, which is what happens in the normal
.elc files).

Good to know that there's no significant difference between the
two in this regard, thanks.

        Stefan


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-29  6:37                                                       ` Eli Zaretskii
  2016-10-29 14:55                                                         ` When should ralloc.c be used? Stefan Monnier
@ 2016-10-29 16:38                                                         ` Richard Stallman
  2016-10-29 21:57                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-29 16:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > For you and me (and a few others), maybe.  For most of the current
  > Emacs contributors it's nowhere near "simple enough", because it
  > requires one to be familiar with intimate details of Emacs object
  > design and implementation.

No it doesn't.  The code to look at objects and output them this way
wouldn't have to know any more about how they are represented
than the code for Fprinc.  It would operate using the usual macros
for decomposing objects.

The idea that all C code should be regarded as unmaintainable
is a nonstarter.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-29 16:38                                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
@ 2016-10-29 21:57                                                           ` Eli Zaretskii
  2016-10-31 19:18                                                             ` Richard Stallman
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-29 21:57 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: eggert@cs.ucla.edu, npostavs@users.sourceforge.net,
> 	emacs-devel@gnu.org
> Date: Sat, 29 Oct 2016 12:38:48 -0400
> 
>   > For you and me (and a few others), maybe.  For most of the current
>   > Emacs contributors it's nowhere near "simple enough", because it
>   > requires one to be familiar with intimate details of Emacs object
>   > design and implementation.
> 
> No it doesn't.  The code to look at objects and output them this way
> wouldn't have to know any more about how they are represented
> than the code for Fprinc.

Code like that in princ (actually, in its subroutines) is exactly what I
think we should try to avoid.

> The idea that all C code should be regarded as unmaintainable
> is a nonstarter.

I didn't say it will be unmaintainable; I said its maintenance will be
harder than that of Lisp code.




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-29 21:57                                                           ` Eli Zaretskii
@ 2016-10-31 19:18                                                             ` Richard Stallman
  2016-10-31 20:58                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-31 19:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I didn't say it will be unmaintainable; I said its maintenance will be
  > harder than that of Lisp code.

That's no horrible thing.  Speeding up startup is important.
If some C code is an effective way to do it, we shouldn't
reject that just because of a general preference for Lisp code.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-31 19:18                                                             ` Richard Stallman
@ 2016-10-31 20:58                                                               ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-31 20:58 UTC (permalink / raw)
  To: rms; +Cc: npostavs, eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: eggert@cs.ucla.edu, emacs-devel@gnu.org,
> 	npostavs@users.sourceforge.net
> Date: Mon, 31 Oct 2016 15:18:38 -0400
> 
>   > I didn't say it will be unmaintainable; I said its maintenance will be
>   > harder than that of Lisp code.
> 
> That's no horrible thing.

"Horrible" is in the eyes of the beholder.  I think keeping Emacs as
maintainable as possible is very important for its future.

> If some C code is an effective way to do it, we shouldn't
> reject that just because of a general preference for Lisp code.

I'm not rejecting it, just explaining why it shouldn't be the first
priority, IMO.




* Re: When should ralloc.c be used?
  2016-10-27 20:39                                                 ` Richard Stallman
  2016-10-28  6:48                                                   ` Eli Zaretskii
@ 2016-10-28 12:51                                                   ` Stefan Monnier
  1 sibling, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 12:51 UTC (permalink / raw)
  To: emacs-devel

> There is no reason why we have to choose between C code and Lisp code.
> It's worth developing a special-purpose format for this, if that would
> be considerably faster.

We're still investigating how much time can be gained just by optimizing
lread.c.  But yes, maybe another format (which could also be used for
.elc files, of course) would allow reading faster.


        Stefan





* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27  1:36                                               ` Paul Eggert
                                                                   ` (2 preceding siblings ...)
  2016-10-27 20:39                                                 ` Richard Stallman
@ 2016-10-27 20:40                                                 ` Richard Stallman
  2016-10-27 22:34                                                   ` Paul Eggert
  2016-10-28  6:55                                                   ` Eli Zaretskii
  3 siblings, 2 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-27 20:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: eliz, emacs-devel, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Eli has mentioned a simpler approach, where we build an .elc file when 
  > saving Emacs state and load the .elc file during normal startup. The 
  > main worry about this approach is performance.

Any such scheme has this problem:
how to find all the places where initialization has stored some sort
of value?  They do not all have ways to access them and set them from Lisp.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 20:40                                                 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
@ 2016-10-27 22:34                                                   ` Paul Eggert
  2016-10-28  2:40                                                     ` Richard Stallman
  2016-10-28  2:40                                                     ` Richard Stallman
  2016-10-28  6:55                                                   ` Eli Zaretskii
  1 sibling, 2 replies; 375+ messages in thread
From: Paul Eggert @ 2016-10-27 22:34 UTC (permalink / raw)
  To: rms; +Cc: emacs-devel

On 10/27/2016 01:40 PM, Richard Stallman wrote:
> Any such scheme has this problem:
> how to find all the places where initialization has stored some sort
> of value?  They do not all have ways to access them and set them from Lisp.

My impression is that most such initializations are so small and fast 
that we needn't worry about saving and restoring their state. We can 
simply redo the initialization when Emacs starts up again - this will be 
the default behavior if we leave the temacs initialization code alone, 
which means we'd get this for very little maintenance effort. Any 
counterexamples we can handle specially, by saving and restoring their 
state by hand (so to speak).

Something along the lines of your idea for storage creation should work, 
though we'll have to be careful about destructive operations like setcar 
that can cause an object with an earlier sequence number to point to an 
object with a later sequence number. It's not clear whether it has 
significant advantages over the C-based approach.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 22:34                                                   ` Paul Eggert
@ 2016-10-28  2:40                                                     ` Richard Stallman
  2016-10-28  2:40                                                     ` Richard Stallman
  1 sibling, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-28  2:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Something along the lines of your idea for storage creation should work, 
  > though we'll have to be careful about destructive operations like setcar 
  > that can cause an object with an earlier sequence number to point to an 
  > object with a later sequence number.

Why so?

The purpose of sequence numbers is simply so you can refer to the
objects already made -- not for proving some theorems of
well-foundedness.  Cycles should not be a problem.
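
A minimal sketch of that point, assuming a record stream of my own invention (the op_kind values, struct op and replay are hypothetical names, and only cons cells are modelled): as long as every store names objects that have already been created, the loader can replay stores in any order, including ones that close a cycle or make an earlier object point at a later one.

```c
#include <stdlib.h>

/* Hypothetical record kinds for a dump stream.  */
enum op_kind { OP_MAKE_CONS, OP_SET_CAR, OP_SET_CDR };

struct op
{
  enum op_kind kind;
  int target;   /* sequence number of the cell being modified */
  int value;    /* sequence number of the object being stored */
};

struct cons
{
  struct cons *car;
  struct cons *cdr;
};

/* Replay COUNT records into OBJS (capacity CAP).  Return the number
   of objects created, or -1 on a malformed record.  Objects made
   before an error are not freed; this is a sketch, not a loader.  */
int
replay (const struct op *ops, int count, struct cons **objs, int cap)
{
  int made = 0;
  for (int i = 0; i < count; i++)
    {
      const struct op *o = &ops[i];
      if (o->kind == OP_MAKE_CONS)
        {
          if (made >= cap)
            return -1;
          objs[made] = calloc (1, sizeof **objs);
          if (!objs[made])
            return -1;
          made++;
          continue;
        }
      /* A store may only name objects that already exist.  */
      if (o->target < 0 || o->target >= made
          || o->value < 0 || o->value >= made)
        return -1;
      if (o->kind == OP_SET_CAR)
        objs[o->target]->car = objs[o->value];
      else
        objs[o->target]->cdr = objs[o->value];
    }
  return made;
}
```

Feeding it two OP_MAKE_CONS records followed by "set cdr of 0 to 1" and "set cdr of 1 to 0" yields a two-element circular list; the only thing the sequence numbers must guarantee is that creation precedes use.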

  >  It's not clear whether it has 
  > significant advantages over the C-based approach.

Here are two:

* You don't need a C compiler to dump Emacs.

* You don't need to relink to dump Emacs.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 22:34                                                   ` Paul Eggert
  2016-10-28  2:40                                                     ` Richard Stallman
@ 2016-10-28  2:40                                                     ` Richard Stallman
  2016-10-28  7:21                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-28  2:40 UTC (permalink / raw)
  To: Paul Eggert; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > Any such scheme has this problem:
  > > how to find all the places where initialization has stored some sort
  > > of value?  They do not all have ways to access them and set them from Lisp.

  > My impression is that most such initializations are so small and fast 
  > that we needn't worry about saving and restoring their state. We can 
  > simply redo the initialization when Emacs starts up again - this will be 
  > the default behavior if we leave the temacs initialization code alone, 
  > which means we'd get this for very little maintenance effort. 

It could be so, but someone will have to try it.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-28  2:40                                                     ` Richard Stallman
@ 2016-10-28  7:21                                                       ` Eli Zaretskii
  0 siblings, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  7:21 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Date: Thu, 27 Oct 2016 22:40:01 -0400
> Cc: emacs-devel@gnu.org
> 
>   > My impression is that most such initializations are so small and fast 
>   > that we needn't worry about saving and restoring their state. We can 
>   > simply redo the initialization when Emacs starts up again - this will be 
>   > the default behavior if we leave the temacs initialization code alone, 
>   > which means we'd get this for very little maintenance effort. 
> 
> It could be so, but someone will have to try it.

We already have a CANNOT_DUMP configuration, which does precisely
that, used by some systems, so this code is in reasonably good shape.
It's just a question of making it fast enough.




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-27 20:40                                                 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  2016-10-27 22:34                                                   ` Paul Eggert
@ 2016-10-28  6:55                                                   ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-28  6:55 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: eliz@gnu.org, npostavs@users.sourceforge.net, emacs-devel@gnu.org
> Date: Thu, 27 Oct 2016 16:40:14 -0400
> 
>   > Eli has mentioned a simpler approach, where we build an .elc file when 
>   > saving Emacs state and load the .elc file during normal startup. The 
>   > main worry about this approach is performance.
> 
> Any such scheme has this problem:
> how to find all the places where initialization has stored some sort
> of value?  They do not all have ways to access them and set them from Lisp.

The part that must be done in C, like DEFUN etc., will be done at
startup of the Emacs session.  The part that stores values in
variables exposed to Lisp will be either moved to startup.el, or done
at startup in C.  A few variables whose value depends on the build
directory and other stuff that is only known at build time will be
recorded in a special Lisp file created by the build and loaded at
startup as part of the .elc file described above.

Btw, Paul's description of "saving state" in a .elc file is
inaccurate: most of that file is just the preloaded Lisp packages,
like simple.el, subr.el, etc.  There's very little saved state
there, because that state will simply be created at startup, each time
Emacs starts.

In source terms, most of the "if (!initialized)" parts will be run
each time Emacs starts.
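
As a rough illustration of that guard pattern (a sketch only: init_subsystem, start_process and the counter are hypothetical, and the flag is merely a stand-in for the one the source calls "initialized"), the difference between the two schemes is whether the flag comes back already set when the process starts:

```c
#include <stdbool.h>

static bool initialized;    /* stand-in for the flag tested in the source */
static int one_time_runs;   /* counts how often the guarded block ran */

/* A subsystem initializer with the usual guard.  */
void
init_subsystem (void)
{
  if (!initialized)
    {
      /* Work that a dumped Emacs performs only while being built.  */
      one_time_runs++;
    }
}

/* Simulate one process start.  RESTORED_FROM_DUMP says whether the
   flag was already set in the image the process started from.  */
void
start_process (bool restored_from_dump)
{
  initialized = restored_from_dump;
  init_subsystem ();
  initialized = true;
}
```

With a dumped image the guarded block is skipped on every normal start; without one it simply runs each time, which is why its cost matters for startup time.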

Does that answer your question?


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 23:49                                         ` Richard Stallman
  2016-10-26  5:08                                           ` Paul Eggert
@ 2016-10-26 11:37                                           ` Eli Zaretskii
  2016-10-27  1:24                                             ` Richard Stallman
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-26 11:37 UTC (permalink / raw)
  To: rms; +Cc: eggert, emacs-devel, npostavs

> From: Richard Stallman <rms@gnu.org>
> CC: npostavs@users.sourceforge.net, eggert@cs.ucla.edu,
> 	emacs-devel@gnu.org
> Date: Tue, 25 Oct 2016 19:49:33 -0400
> 
>   >   I think once you said
>   > here:
> 
>   >   http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01633.html
> 
>   > that you favored replacing unexec by a more portable scheme, there was
>   > no longer any reason to make our argument stronger.
> 
> In general, I'm in favor of a more portable method.  But we don't have
> one now.  Is it feasible to do?  Is anyone working on one?  

We are warming up.  There are a few ideas, but I'm not sure we have
decided which one is the best yet.

> If not, then I hope we can design, with the Glibc developers, a
> different set of hooks to allow us to make unexec work.

Frankly, I think that ship has sailed, and cannot be turned around.




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-26 11:37                                           ` Eli Zaretskii
@ 2016-10-27  1:24                                             ` Richard Stallman
  2016-10-28 12:57                                               ` When should ralloc.c be used? Stefan Monnier
  0 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-27  1:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > If not, then I hope we can design, with the Glibc developers, a
  > > different set of hooks to allow us to make unexec work.

  > Frankly, I think that ship has sailed, and cannot be turned around.

I think that is an exaggeration.  They got rid of ONE set of hooks for
specific practical reasons.  Maybe we can design a different set of
hooks which do the job and which are not a problem for them to
support.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.


* Re: When should ralloc.c be used?
  2016-10-27  1:24                                             ` Richard Stallman
@ 2016-10-28 12:57                                               ` Stefan Monnier
  2016-10-28 19:13                                                 ` Richard Stallman
  0 siblings, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 12:57 UTC (permalink / raw)
  To: emacs-devel

> I think that is an exaggeration.  They got rid of ONE set of hooks for
> specific practical reasons.  Maybe we can design a different set of
> hooks which do the job and which are not a problem for them to
> support.

While that is true, I think there is very little motivation to go down
that road even on Emacs's side: this glibc-malloc "issue" is just one
more nail in the unexec coffin, so even if we can find a way back we'd
still be stuck with the problem of doing unexec with address
randomization (for example), and maintenance of unexec (which has proved
less problematic than I expected, over the years, admittedly, but
remains a source of worry).

        Stefan


* Re: When should ralloc.c be used?
  2016-10-28 12:57                                               ` When should ralloc.c be used? Stefan Monnier
@ 2016-10-28 19:13                                                 ` Richard Stallman
  2016-10-28 22:46                                                   ` Stefan Monnier
  2016-10-29  6:39                                                   ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-28 19:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > I think that is an exaggeration.  They got rid of ONE set of hooks for
  > > specific practical reasons.  Maybe we can design a different set of
  > > hooks which do the job and which are not a problem for them to
  > > support.

  > While that is true, I think there is very little motivation to go down
  > that road even on Emacs's side: this glibc-malloc "issue" is just one
  > more nail in the unexec coffin, so even if we can find a way back we'd
  > still be stuck with the problem of doing unexec with address
  > randomization (for example), and maintenance of unexec (which has proved

I just did  'time temacs -batch -l loadup'.  It took over 18 seconds.
We have a long way to go to make that fast enough.

Perhaps what we need is to dump that data verbatim in a format
chosen by us, then relocate all the pointers if necessary.
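
As a rough illustration of "dump verbatim, then relocate", here is a
minimal sketch.  It is not Emacs code: struct cell, NIL_OFFSET and
the image_* functions are hypothetical names, and a real dump would
have to handle every Lisp object type, not a single cons-like cell.
Before the image is written out, absolute pointers are turned into
offsets from the start of the image; after it is loaded at some other
address, the offsets are turned back into pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marks "no next cell" in the dumped form, since offset 0 is valid.  */
#define NIL_OFFSET ((uintptr_t) -1)

/* Hypothetical object: an immediate value plus a pointer to the
   next cell in the same image (0 at the end of the list).  */
struct cell
{
  uintptr_t car;
  uintptr_t cdr;
};

/* Before dumping: replace absolute pointers by offsets from IMAGE,
   so the bytes can be written to a file verbatim.  */
void
image_unrelocate (struct cell *image, size_t n)
{
  for (size_t i = 0; i < n; i++)
    image[i].cdr = (image[i].cdr
                    ? image[i].cdr - (uintptr_t) image
                    : NIL_OFFSET);
}

/* After loading at a possibly different address: turn the offsets
   back into absolute pointers.  */
void
image_relocate (struct cell *image, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (image[i].cdr == NIL_OFFSET)
      image[i].cdr = 0;
    else
      {
        assert (image[i].cdr < n * sizeof *image);
        image[i].cdr += (uintptr_t) image;
      }
}

/* Follow the cdr pointers from C and add up the car fields.  */
uintptr_t
list_sum (const struct cell *c)
{
  uintptr_t sum = 0;
  for (; c; c = (const struct cell *) c->cdr)
    sum += c->car;
  return sum;
}
```

If the image happens to load at the same address it was dumped from,
the relocation pass could in principle be skipped, which is the "if
necessary" part.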

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used?
  2016-10-28 19:13                                                 ` Richard Stallman
@ 2016-10-28 22:46                                                   ` Stefan Monnier
  2016-10-29 16:35                                                     ` Richard Stallman
  2016-10-29  6:39                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Stefan Monnier @ 2016-10-28 22:46 UTC (permalink / raw)
  To: emacs-devel

> I just did  'time temacs -batch -l loadup'.  It took over 18 seconds.
> We have a long way to go to make that fast enough.

It's easy to bring it down to 1s: the current loadup.el spends most of
its time in things that are not terribly important (to try and scrape a few
bytes here and there: worthwhile in the context of a one-off dump step,
but not that important otherwise).


        Stefan





* Re: When should ralloc.c be used?
  2016-10-28 22:46                                                   ` Stefan Monnier
@ 2016-10-29 16:35                                                     ` Richard Stallman
  0 siblings, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-29 16:35 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > It's easy to bring it down to 1s: the current loadup.el spends most of
  > its time in things that are not terribly important (to try and scrape a few
  > bytes here and there: worthwhile in the context of a one-off dump step,
  > but not that important otherwise).

Please show us.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used?
  2016-10-28 19:13                                                 ` Richard Stallman
  2016-10-28 22:46                                                   ` Stefan Monnier
@ 2016-10-29  6:39                                                   ` Eli Zaretskii
  2016-10-29 16:37                                                     ` Richard Stallman
  1 sibling, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-29  6:39 UTC (permalink / raw)
  To: rms; +Cc: monnier, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Date: Fri, 28 Oct 2016 15:13:19 -0400
> Cc: emacs-devel@gnu.org
> 
> I just did  'time temacs -batch -l loadup'.  It took over 18 seconds.
> We have a long way to go to make that fast enough.

Try the patch I pointed to in a previous message, and see what kind of
speedup is possible by 2 simple measures.




* Re: When should ralloc.c be used?
  2016-10-29  6:39                                                   ` Eli Zaretskii
@ 2016-10-29 16:37                                                     ` Richard Stallman
  2016-10-29 21:51                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-29 16:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > I just did  'time temacs -batch -l loadup'.  It took over 18 seconds.
  > > We have a long way to go to make that fast enough.

  > Try the patch I pointed to in a previous message, and see what kind of
  > speedup is possible by 2 simple measures.

It would be a lot of work for me to try that myself, and I think it is
not necessary if you already tried it.  What fractional speedup did
you observe when you tried it?

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used?
  2016-10-29 16:37                                                     ` Richard Stallman
@ 2016-10-29 21:51                                                       ` Eli Zaretskii
  2016-10-30 11:33                                                         ` Richard Stallman
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-29 21:51 UTC (permalink / raw)
  To: rms; +Cc: monnier, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Sat, 29 Oct 2016 12:37:34 -0400
> 
>   > Try the patch I pointed to in a previous message, and see what kind of
>   > speedup is possible by 2 simple measures.
> 
> It would be a lot of work for me to try that myself, and I think it is
> not necessary if you already tried it.  What fractional speedup did
> you observe when you tried it?

About 20, as I wrote.




* Re: When should ralloc.c be used?
  2016-10-29 21:51                                                       ` Eli Zaretskii
@ 2016-10-30 11:33                                                         ` Richard Stallman
  2016-10-30 15:33                                                           ` Alp Aker
  2016-10-30 16:08                                                           ` Eli Zaretskii
  0 siblings, 2 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-30 11:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > It would be a lot of work for me to try that myself, and I think it is
  > > not necessary if you already tried it.  What fractional speedup did
  > > you observe when you tried it?

  > About 20, as I wrote.

Sorry, I am not sure what 20 means here.
Was it a speedup of 20%?
A factor of 20?

If it means 20%, it would be a good step, but much more would
be required to make non-dumping ok for starting Emacs.


-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used?
  2016-10-30 11:33                                                         ` Richard Stallman
@ 2016-10-30 15:33                                                           ` Alp Aker
  2016-10-30 17:19                                                             ` Richard Stallman
  2016-10-30 16:08                                                           ` Eli Zaretskii
  1 sibling, 1 reply; 375+ messages in thread
From: Alp Aker @ 2016-10-30 15:33 UTC (permalink / raw)
  To: rms; +Cc: Eli Zaretskii, Emacs devel


On Sun, Oct 30, 2016 at 7:33 AM, Richard Stallman <rms@gnu.org> wrote:

> Sorry, I am not sure what 20 means here.
> Was it a speedup of 20%?
> A factor of 20?

He meant a factor of 20.  Here's the original comment:

> E.g., it turned out that most of the time it takes 'loadup' to do its
> job is due to the linear search of pure strings in
> find_string_data_in_pure, called by make_pure_string.  If we call
> 'loadup' upon every startup, the need for pure storage goes away, and
> the 'loadup' time can be sped up tenfold.  And that is even before
> making all of the preloaded files a single file, which speeds up
> things at least twofold more, according to my measurements.



* Re: When should ralloc.c be used?
  2016-10-30 15:33                                                           ` Alp Aker
@ 2016-10-30 17:19                                                             ` Richard Stallman
  0 siblings, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-30 17:19 UTC (permalink / raw)
  To: Alp Aker; +Cc: eliz, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > Sorry, I am not sure what 20 means here.
  > > Was it a speedup of 20%?
  > > A factor of 20?

  > He meant a factor of 20.  Here's the original comment:

With a 20-times speedup, maybe it is fast enough.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used?
  2016-10-30 11:33                                                         ` Richard Stallman
  2016-10-30 15:33                                                           ` Alp Aker
@ 2016-10-30 16:08                                                           ` Eli Zaretskii
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-30 16:08 UTC (permalink / raw)
  To: rms; +Cc: monnier, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Sun, 30 Oct 2016 07:33:12 -0400
> 
>   > > It would be a lot of work for me to try that myself, and I think it is
>   > > not necessary if you already tried it.  What fractional speedup did
>   > > you observe when you tried it?
> 
>   > About 20, as I wrote.
> 
> Sorry, I am not sure what 20 means here.
> Was it a speedup of 20%?
> A factor of 20?

A factor of 20.




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 16:04                                       ` Eli Zaretskii
  2016-10-25 23:49                                         ` Richard Stallman
@ 2016-10-25 23:49                                         ` Richard Stallman
  1 sibling, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-25 23:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  >   Based on what
  > we've learned the hard way during the last couple of weeks, I'd say
  > that all the Emacs versions before 25.2 (including 25.1) will be
  > unstable on such GNU systems to the degree of making them almost
  > unusable.

We should withdraw 25.1, I think.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24 17:09                                   ` Eli Zaretskii
  2016-10-25  2:35                                     ` Richard Stallman
@ 2016-10-25  2:35                                     ` Richard Stallman
  2016-10-25 16:05                                       ` Eli Zaretskii
  2016-10-25 23:00                                     ` Perry E. Metzger
  2 siblings, 1 reply; 375+ messages in thread
From: Richard Stallman @ 2016-10-25  2:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > I think the way to fix those is by systematically looking at the
  > > source for them, rather than by debugging.

  > Yes, but finding out whether this is so is not easy, because the
  > malloc call is sometimes buried very deep.

There are programs that determine call trees.  We could find these
problems by analyzing the output.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25  2:35                                     ` Richard Stallman
@ 2016-10-25 16:05                                       ` Eli Zaretskii
  2016-10-27  1:22                                         ` Richard Stallman
  0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-25 16:05 UTC (permalink / raw)
  To: rms; +Cc: npostavs, eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> CC: eggert@cs.ucla.edu, emacs-devel@gnu.org,
> 	npostavs@users.sourceforge.net
> Date: Mon, 24 Oct 2016 22:35:55 -0400
> 
>   > > I think the way to fix those is by systematically looking at the
>   > > source for them, rather than by debugging.
> 
>   > Yes, but finding out whether this is so is not easy, because the
>   > malloc call is sometimes buried very deep.
> 
> There are programs that determine call trees.  We could find these
> problems by analyzing the output.

Yes, but the real problem is to determine whether the code needs any
changes at all.  For that, one must understand the control flow, and
figure out whether pointers to buffer text are used across malloc
calls without any updates.  This is the hardest part, because pointers
are frequently passed down to subroutines and to their subroutines,
which use them or call malloc only under certain conditions.  For
example, it could be that a subroutine only calls malloc if the
passed-in pointer does not originate from a buffer object.

This analysis is what makes the source study hard.
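
To make the hazard concrete, here is a minimal sketch.  It is not
Emacs code: struct text_buf and the text_buf_* helpers are
hypothetical, and realloc merely stands in for the relocation that
ralloc.c can perform when something calls malloc.  The point is that
a raw pointer taken before the allocation may be stale afterwards,
whereas an offset re-applied to the current base stays valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer whose text can be moved.  */
struct text_buf
{
  char *beg;      /* current start of the text; may change */
  size_t size;    /* allocated size in bytes */
};

/* Grow the buffer.  Like a relocating allocator, this may move the
   text, so any pointer derived from the old B->beg is stale after
   the call.  Return 0 on success, -1 on allocation failure.  */
int
text_buf_grow (struct text_buf *b, size_t new_size)
{
  char *p = realloc (b->beg, new_size);
  if (!p)
    return -1;
  b->beg = p;
  b->size = new_size;
  return 0;
}

/* Safe pattern: remember a position as an offset and re-derive the
   pointer from the current base after anything that may allocate.  */
char *
text_buf_at (const struct text_buf *b, size_t offset)
{
  assert (offset < b->size);
  return b->beg + offset;
}
```

A caller that kept "char *p = b.beg + 3" across text_buf_grow would
be exactly the kind of code that has to be found and fixed; a caller
that kept the offset 3 and called text_buf_at afterwards would not.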

Anyway, I think I just finished hunting and fixing those cases, so the
only remaining issue is with regex.c functions, for which we have a
patch that will most probably do the job.


* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 16:05                                       ` Eli Zaretskii
@ 2016-10-27  1:22                                         ` Richard Stallman
  0 siblings, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-27  1:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Yes, but the real problem is to determine whether the code needs any
  > changes at all.  For that, one must understand the control flow, and
  > figure out whether pointers to buffer text are used across malloc
  > calls without any updates.

That is true.  But this can find the functions that need to be checked,
because they make buffer pointers and they indirectly call malloc.


-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.





* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-24 17:09                                   ` Eli Zaretskii
  2016-10-25  2:35                                     ` Richard Stallman
  2016-10-25  2:35                                     ` Richard Stallman
@ 2016-10-25 23:00                                     ` Perry E. Metzger
  2016-10-26  2:37                                       ` Eli Zaretskii
  2016-10-27  1:25                                       ` Richard Stallman
  2 siblings, 2 replies; 375+ messages in thread
From: Perry E. Metzger @ 2016-10-25 23:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, emacs-devel, rms, npostavs

On Mon, 24 Oct 2016 20:09:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> >   > It imposes hard-to-fulfill requirements on functions that get
> >   > C pointers to buffer text or to Lisp string data: those
> >   > functions must never call malloc, directly or indirectly.  
> > 
> > I think the way to fix those is by systematically looking at the
> > source for them, rather than by debugging.  
> 
> Yes, but finding out whether this is so is not easy, because the
> malloc call is sometimes buried very deep.

Could this be found by doing a debugging build where malloc
aborts in the conditions where it can't be called directly or
indirectly? Then one could just run that way and find the instances
pretty easily.

Perry
-- 
Perry E. Metzger		perry@piermont.com




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 23:00                                     ` Perry E. Metzger
@ 2016-10-26  2:37                                       ` Eli Zaretskii
  2016-10-27  1:25                                       ` Richard Stallman
  1 sibling, 0 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-26  2:37 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: eggert, emacs-devel, rms, npostavs

> Date: Tue, 25 Oct 2016 19:00:45 -0400
> From: "Perry E. Metzger" <perry@piermont.com>
> Cc: rms@gnu.org, npostavs@users.sourceforge.net, eggert@cs.ucla.edu,
>  emacs-devel@gnu.org
> 
> On Mon, 24 Oct 2016 20:09:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote:
> > >   > It imposes hard-to-fulfill requirements on functions that get
> > >   > C pointers to buffer text or to Lisp string data: those
> > >   > functions must never call malloc, directly or indirectly.  
> > > 
> > > I think the way to fix those is by systematically looking at the
> > > source for them, rather than by debugging.  
> > 
> > Yes, but finding out whether this is so is not easy, because the
> > malloc call is sometimes buried very deep.
> 
> Could this be found by doing a debugging build where malloc
> aborts in the conditions where it can't be called directly or
> indirectly?

I don't know how to define those conditions.  If you have concrete
suggestions, please describe them.




* Re: When should ralloc.c be used? (WAS: bug#24358)
  2016-10-25 23:00                                     ` Perry E. Metzger
  2016-10-26  2:37                                       ` Eli Zaretskii
@ 2016-10-27  1:25                                       ` Richard Stallman
  1 sibling, 0 replies; 375+ messages in thread
From: Richard Stallman @ 2016-10-27  1:25 UTC (permalink / raw)
  To: Perry E. Metzger; +Cc: npostavs, eliz, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Could this be found by doing a debugging build where malloc
  > aborts in the conditions where it can't be called directly or
  > indirectly?

We could make the functions that create pointers into buffers also
increment a global counter when they do that, and decrement the
counter when done.  malloc would abort if the counter is nonzero.

The hard part would be arranging to reset the counter to zero when
there is a nonlocal exit out of such a region.
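
A minimal sketch of that idea follows.  All the names here are
hypothetical (nothing like them exists in Emacs as far as this
message assumes), checked_malloc stands in for whatever wrapper the
debugging build would install, and setjmp/longjmp stand in for the
real nonlocal-exit machinery.  The handler saves the counter before
entering the protected body and restores it if the body exits
nonlocally.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Number of live regions that hold C pointers into buffer text.  */
static int buffer_pointer_depth;

void
begin_buffer_pointer_region (void)
{
  buffer_pointer_depth++;
}

void
end_buffer_pointer_region (void)
{
  assert (buffer_pointer_depth > 0);
  buffer_pointer_depth--;
}

int
allocation_forbidden (void)
{
  return buffer_pointer_depth != 0;
}

/* Debugging allocator: abort if called inside such a region.  */
void *
checked_malloc (size_t size)
{
  if (allocation_forbidden ())
    {
      fputs ("malloc called while a buffer pointer is live\n", stderr);
      abort ();
    }
  return malloc (size);
}

/* Run BODY; if it exits via longjmp, restore the counter to the
   value it had on entry.  Return 1 after a nonlocal exit, else 0.  */
int
run_protected (void (*body) (jmp_buf *))
{
  jmp_buf env;
  volatile int saved_depth = buffer_pointer_depth;
  if (setjmp (env) != 0)
    {
      buffer_pointer_depth = saved_depth;
      return 1;
    }
  body (&env);
  return 0;
}

/* Example body: enters a region, then exits without leaving it.  */
void
leaky_body (jmp_buf *env)
{
  begin_buffer_pointer_region ();
  longjmp (*env, 1);
}
```

Restoring the saved value, rather than resetting to zero, keeps the
count right when the handler itself runs inside an outer region.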

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.


* Re: When should ralloc.c be used?
  2016-10-24  0:21                             ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
  2016-10-24  3:59                               ` Paul Eggert
  2016-10-24  7:15                               ` Eli Zaretskii
@ 2016-10-24 14:04                               ` Stefan Monnier
  2 siblings, 0 replies; 375+ messages in thread
From: Stefan Monnier @ 2016-10-24 14:04 UTC (permalink / raw)
  To: emacs-devel

>> But that doesn't explain why we'd need to use ralloc in the mean time.
> Why would we not want to use ralloc?  It made a big improvement for
> memory management when I wrote it.

But that was before we were able to use mmap for the allocation of
buffer memory, which is the main source of fragmentation AFAIK.
Also the size of virtual and physical memory was quite different back then.


        Stefan





end of thread, other threads:[~2019-01-21 14:19 UTC | newest]

Thread overview: 375+ messages
     [not found] <ksa8fz8eud.fsf@luna.netfonds.no>
     [not found] ` <87twe6sx2g.fsf@users.sourceforge.net>
     [not found]   ` <87eg51ng4r.fsf_-_@users.sourceforge.net>
     [not found]     ` <87k2djwumn.fsf@users.sourceforge.net>
     [not found]       ` <83h98nidvd.fsf@gnu.org>
     [not found]         ` <87eg3rvtsf.fsf@users.sourceforge.net>
     [not found]           ` <83k2dihpm9.fsf@gnu.org>
     [not found]             ` <8760p2wzgj.fsf@users.sourceforge.net>
     [not found]               ` <838ttyhhzu.fsf@gnu.org>
     [not found]                 ` <871szqwu51.fsf@users.sourceforge.net>
     [not found]                   ` <831szqhbc2.fsf@gnu.org>
2016-10-22  3:03                     ` When should ralloc.c be used? (WAS: bug#24358) npostavs
2016-10-22  5:32                       ` Paul Eggert
2016-10-22  7:29                         ` Eli Zaretskii
2016-10-22 18:34                           ` Paul Eggert
2016-10-22 19:43                             ` When should ralloc.c be used? Stefan Monnier
2016-10-23  2:37                               ` Paul Eggert
2016-10-23  6:53                                 ` Eli Zaretskii
2016-10-23  7:57                                   ` Paul Eggert
2016-10-23  8:58                                     ` Eli Zaretskii
2016-10-23  9:38                                       ` Paul Eggert
2016-10-23 12:50                                         ` Eli Zaretskii
2016-10-23 13:39                                           ` Stefan Monnier
2016-10-23 14:01                                             ` Eli Zaretskii
2016-10-23 14:18                                               ` Stefan Monnier
2016-10-23 18:19                                               ` Paul Eggert
2016-10-23 19:03                                                 ` Eli Zaretskii
2016-10-23 20:36                                                   ` Stefan Monnier
2016-10-24  6:54                                                     ` Eli Zaretskii
2016-10-24 10:15                                                       ` Eli Zaretskii
2016-10-24  4:59                                                   ` Paul Eggert
2016-10-24  7:44                                                     ` Eli Zaretskii
2016-10-24  8:29                                                       ` Andreas Schwab
2016-10-24  8:47                                                         ` Eli Zaretskii
2016-10-24 16:21                                                       ` Paul Eggert
2016-10-24 16:39                                                         ` Eli Zaretskii
2016-10-24 16:54                                                           ` Paul Eggert
2016-10-24 17:05                                                             ` Eli Zaretskii
2016-10-25  6:23                                                               ` Paul Eggert
2016-10-25 16:11                                                                 ` Eli Zaretskii
2016-10-28  6:18                                                           ` Jérémie Courrèges-Anglas
2016-10-28  6:19                                                           ` Jérémie Courrèges-Anglas
2016-10-28  7:40                                                             ` Eli Zaretskii
2016-10-23 15:22                                           ` Andreas Schwab
2016-10-23 15:49                                             ` Eli Zaretskii
2016-10-23 15:57                                               ` Andreas Schwab
2016-10-23 17:06                                                 ` Eli Zaretskii
2016-10-23 20:35                                                   ` Stefan Monnier
2016-10-23 16:44                                   ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier
2016-10-23 17:34                                     ` Eli Zaretskii
2016-10-23 20:27                                       ` Skipping unexec via a big .elc file Stefan Monnier
2016-10-24  6:22                                         ` Eli Zaretskii
2016-10-24 12:47                                           ` Stefan Monnier
2016-10-24 13:08                                             ` Eli Zaretskii
2016-10-24 14:15                                               ` Stefan Monnier
2016-10-24  1:07                                       ` Stefan Monnier
2016-10-24  6:39                                         ` Eli Zaretskii
2016-10-24  6:47                                           ` Lars Ingebrigtsen
2016-10-24  7:17                                             ` Eli Zaretskii
2016-10-24  8:24                                               ` Andreas Schwab
2016-10-24  8:41                                                 ` Eli Zaretskii
2016-10-24  9:47                                                   ` Daniel Colascione
2016-10-24 10:00                                                     ` Eli Zaretskii
2016-10-24 10:03                                                       ` Daniel Colascione
2016-10-24 10:18                                                         ` Eli Zaretskii
2016-10-24 10:28                                                           ` Philipp Stephani
2016-10-24 10:51                                                             ` Eli Zaretskii
2016-10-24 13:52                                                               ` Stefan Monnier
2016-10-24 16:04                                                                 ` Eli Zaretskii
2016-10-24 13:04                                           ` Stefan Monnier
2016-10-24 13:35                                             ` Eli Zaretskii
2016-10-24 14:45                                               ` Daniel Colascione
2016-10-24 15:58                                                 ` Eli Zaretskii
2016-10-24 16:17                                                   ` Daniel Colascione
2016-10-24 16:51                                                     ` Philipp Stephani
2016-10-24 19:47                                                       ` Daniel Colascione
2016-10-25 15:59                                                         ` Eli Zaretskii
2016-10-25 16:14                                                           ` Daniel Colascione
2016-10-25 17:05                                                             ` Eli Zaretskii
2016-10-25 19:49                                                           ` Stefan Monnier
2016-10-25 22:53                                                           ` Perry E. Metzger
2016-10-26  2:36                                                             ` Eli Zaretskii
2016-10-26  2:37                                                               ` Perry E. Metzger
2016-10-24 16:52                                                     ` Eli Zaretskii
2016-10-25 22:46                                               ` Perry E. Metzger
2016-10-24  9:40                                         ` Ken Raeburn
2016-10-24 13:13                                           ` Stefan Monnier
2016-10-25  9:02                                             ` Ken Raeburn
2016-10-25 13:48                                               ` Stefan Monnier
2016-10-27  8:51                                                 ` Ken Raeburn
2016-10-30 14:43                                                   ` Ken Raeburn
2016-10-30 15:31                                                     ` Simon Leinen
2016-10-30 16:52                                                     ` Daniel Colascione
2016-10-31 14:27                                                     ` Stefan Monnier
2016-11-02  7:36                                                       ` Ken Raeburn
2016-11-02 12:17                                                         ` Stefan Monnier
2016-11-02 12:22                                                         ` Stefan Monnier
2016-11-03  5:37                                                           ` Ken Raeburn
2016-12-11 13:34                                                             ` Ken Raeburn
2016-12-11 15:42                                                               ` Eli Zaretskii
2016-12-24 11:06                                                                 ` Eli Zaretskii
2016-12-25 15:46                                                                   ` Stefan Monnier
2016-12-11 19:18                                                               ` Richard Stallman
2016-12-15 12:57                                                                 ` Ken Raeburn
2016-12-15 16:04                                                                   ` Eli Zaretskii
2016-12-15 16:26                                                                     ` Ken Raeburn
2016-12-11 19:18                                                               ` Richard Stallman
2016-12-12 17:25                                                                 ` Ken Raeburn
2016-12-13 15:21                                                               ` Ken Brown
2016-12-14  5:30                                                                 ` Ken Raeburn
2016-12-14  5:45                                                                   ` Ken Raeburn
2016-12-14 10:58                                                                     ` Phil Sainty
2016-12-14 12:06                                                                       ` Yuri Khan
2016-12-14 11:00                                                                     ` Lars Ingebrigtsen
2016-12-15 11:45                                                                     ` Ken Raeburn
2016-12-15 17:28                                                                       ` Ken Raeburn
2016-12-15 19:59                                                                         ` Eli Zaretskii
2016-12-15 22:07                                                                           ` Clément Pit--Claudel
2016-12-16  7:54                                                                             ` Eli Zaretskii
2016-12-16 14:28                                                                               ` Clément Pit--Claudel
2016-12-16 14:39                                                                                 ` Eli Zaretskii
2016-12-16 15:28                                                                                   ` Clément Pit--Claudel
2016-12-16 21:27                                                                                     ` Eli Zaretskii
2016-12-16 21:38                                                                                       ` Noam Postavsky
2016-12-17 14:56                                                                                   ` Stefan Monnier
2016-12-19 15:11                                                                                 ` Phillip Lord
2016-12-16  7:56                                                                           ` Eli Zaretskii
2016-12-19 15:15                                                                             ` Phillip Lord
2016-12-19 15:09                                                                           ` Phillip Lord
2016-12-20 18:57                                                                             ` Ken Raeburn
2016-12-20 23:22                                                                               ` Stefan Monnier
2016-12-21  7:44                                                                                 ` Ken Raeburn
2016-12-21 12:13                                                                               ` Phillip Lord
2016-12-16 14:22                                                                       ` Robert Pluim
2016-12-24 13:37                                                               ` Eli Zaretskii
2016-12-26 17:48                                                                 ` Eli Zaretskii
2017-01-07  9:40                                                                   ` Eli Zaretskii
2017-01-09 10:28                                                                     ` Ken Raeburn
2017-01-10  2:25                                                                       ` Stefan Monnier
2017-01-10  9:46                                                                         ` Andreas Schwab
2017-01-10 17:19                                                                           ` Eli Zaretskii
2017-01-11  6:32                                                                             ` Ken Raeburn
2017-01-12  8:17                                                                               ` Ken Raeburn
2017-01-14 10:41                                                                                 ` Eli Zaretskii
2017-01-14 10:55                                                                                   ` Andreas Schwab
2017-01-14 11:07                                                                                     ` Eli Zaretskii
2017-01-14 11:26                                                                                       ` Alan Mackenzie
2017-01-14 12:19                                                                                       ` Andreas Schwab
2017-01-14 13:05                                                                                         ` Eli Zaretskii
2017-01-14 15:12                                                                                           ` Andreas Schwab
2017-01-14 17:37                                                                                             ` Eli Zaretskii
2017-01-14 18:50                                                                                               ` Andreas Schwab
2017-01-14 15:30                                                                                   ` Stefan Monnier
2017-01-14 17:42                                                                                     ` Eli Zaretskii
2017-01-14 18:11                                                                                       ` Stefan Monnier
2017-01-14 20:13                                                                                         ` Eli Zaretskii
2017-01-21  7:58                                                                                   ` Ken Raeburn
2017-01-22 16:55                                                                                     ` Ken Raeburn
2017-02-02  9:10                                                                                   ` Ken Raeburn
2017-02-04 10:37                                                                                     ` Eli Zaretskii
2017-02-05 14:19                                                                                       ` Ken Raeburn
2017-02-05 15:51                                                                                         ` Eli Zaretskii
2017-02-05 23:19                                                                                           ` Ken Raeburn
2017-02-06 15:20                                                                                             ` Ken Raeburn
2017-02-06 15:39                                                                                               ` Stefan Monnier
2017-02-06 19:08                                                                                                 ` Ken Raeburn
2017-02-06 22:39                                                                                                   ` Stefan Monnier
2017-02-08 10:31                                                                                                     ` Ken Raeburn
2017-02-08 14:38                                                                                                       ` Ken Brown
2017-02-05 20:03                                                                                         ` Ken Brown
2017-02-25 14:52                                                                                         ` Eli Zaretskii
2017-02-25 15:19                                                                                           ` Eli Zaretskii
2017-02-26 12:37                                                                                           ` Ken Raeburn
2017-03-04 14:23                                                                                             ` Eli Zaretskii
2017-03-06  8:46                                                                                               ` Ken Raeburn
2017-03-11 12:27                                                                                                 ` Eli Zaretskii
2017-03-11 13:18                                                                                                   ` Andreas Schwab
2017-03-11 13:42                                                                                                     ` Eli Zaretskii
2017-03-11 15:48                                                                                                     ` Stefan Monnier
2017-03-11 21:48                                                                                                       ` Richard Stallman
2017-03-11 22:06                                                                                                         ` Stefan Monnier
2017-03-11 23:59                                                                                                     ` Ken Raeburn
2017-03-12 17:06                                                                                                       ` Stefan Monnier
2017-03-13  8:25                                                                                                       ` Ken Raeburn
2017-03-26 16:44                                                                                                         ` Eli Zaretskii
2017-03-28  2:27                                                                                                           ` Ken Raeburn
2017-03-31  6:57                                                                                                             ` Eli Zaretskii
2017-03-31  8:40                                                                                                               ` Ken Raeburn
2017-04-03 16:15                                                                                                                 ` Ken Raeburn
2017-04-03 16:57                                                                                                                   ` Alan Mackenzie
2017-04-03 18:35                                                                                                                     ` Ken Raeburn
2017-04-03 19:14                                                                                                                       ` Eli Zaretskii
2017-04-04  8:08                                                                                                                         ` Ken Raeburn
2017-04-04  9:51                                                                                                                           ` Robert Pluim
2017-04-04 10:27                                                                                                                           ` joakim
2017-04-04 12:14                                                                                                                             ` Clément Pit-Claudel
2017-04-04 14:38                                                                                                                               ` Eli Zaretskii
2017-04-04 15:16                                                                                                                                 ` Clément Pit-Claudel
2017-04-04 15:53                                                                                                                                   ` Eli Zaretskii
2017-04-04 18:22                                                                                                                                     ` Clément Pit-Claudel
2017-04-07  5:46                                                                                                                           ` Lars Brinkhoff
2017-04-07  7:28                                                                                                                             ` Eli Zaretskii
2017-04-07  9:02                                                                                                                               ` Ken Raeburn
2017-04-07 13:40                                                                                                                                 ` Eli Zaretskii
2017-04-07 16:02                                                                                                                                   ` Ken Raeburn
2017-04-07 16:17                                                                                                                                     ` Clément Pit-Claudel
2017-04-08 15:03                                                                                                                                       ` Philipp Stephani
2017-04-08 15:15                                                                                                                                         ` Clément Pit-Claudel
2017-04-08 15:53                                                                                                                                           ` Philipp Stephani
2017-04-08 16:18                                                                                                                                             ` Eli Zaretskii
2017-04-08 18:01                                                                                                                                               ` Stefan Monnier
2017-05-01 11:41                                                                                                                                                 ` Philipp Stephani
2017-04-08 17:58                                                                                                                                             ` Clément Pit-Claudel
2017-05-01 11:40                                                                                                                                               ` Philipp Stephani
2017-05-01 12:07                                                                                                                                                 ` Eli Zaretskii
2017-05-18 17:39                                                                                                                                                   ` Daniel Colascione
2017-05-18 19:45                                                                                                                                                     ` Eli Zaretskii
2018-12-25 15:46                                                                                                                                                       ` Philipp Stephani
2018-12-25 17:21                                                                                                                                                         ` Eli Zaretskii
2018-12-25 19:15                                                                                                                                                           ` Daniel Colascione
2018-12-26 15:27                                                                                                                                                             ` Eli Zaretskii
2019-01-07 21:37                                                                                                                                                             ` Daniel Colascione
2019-01-15 22:46                                                                                                                                                               ` Daniel Colascione
2019-01-16  8:45                                                                                                                                                                 ` Tassilo Horn
2019-01-16 10:25                                                                                                                                                                 ` Robert Pluim
2019-01-16 11:58                                                                                                                                                                 ` Phillip Lord
2019-01-18 12:46                                                                                                                                                                   ` Windows Binaries with pdumper Phillip Lord
2019-01-21 11:30                                                                                                                                                                     ` Jostein Kjønigsen
2019-01-21 14:19                                                                                                                                                                       ` Phillip Lord
2019-01-16 12:00                                                                                                                                                                 ` Skipping unexec via a big .elc file Elias Mårtenson
2019-01-16 15:59                                                                                                                                                                 ` Eli Zaretskii
2019-01-16 16:08                                                                                                                                                                   ` Daniel Colascione
2019-01-16 21:56                                                                                                                                                                 ` Clément Pit-Claudel
2017-05-21  8:44                                                                                                                                                   ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn
2017-05-21  8:53                                                                                                                                                     ` Paul Eggert
2017-05-28 11:07                                                                                                                                                       ` Ken Raeburn
2017-05-28 12:43                                                                                                                                                         ` Philipp Stephani
2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
2017-07-02 15:46                                                                                                                                                             ` Philipp Stephani
2017-07-03  1:44                                                                                                                                                               ` Ken Raeburn
2017-09-24 13:57                                                                                                                                                                 ` Philipp Stephani
2017-09-27  8:31                                                                                                                                                                   ` Ken Raeburn
2017-05-28 21:09                                                                                                                                                         ` Paul Eggert
2017-05-29  9:33                                                                                                                                                           ` Ken Raeburn
2017-05-29 16:37                                                                                                                                                             ` Paul Eggert
2017-05-29 17:39                                                                                                                                                               ` Eli Zaretskii
2017-05-29 18:03                                                                                                                                                                 ` Paul Eggert
2017-05-29 18:53                                                                                                                                                                   ` Eli Zaretskii
2017-05-29 20:15                                                                                                                                                                     ` Paul Eggert
2017-05-30  5:52                                                                                                                                                                       ` Ken Raeburn
2017-05-30  5:55                                                                                                                                                                       ` Eli Zaretskii
2017-05-21 16:02                                                                                                                                                     ` John Wiegley
2017-04-07 13:23                                                                                                                               ` Skipping unexec via a big .elc file Stefan Monnier
2017-04-10 16:19                                                                                                                   ` Ken Raeburn
2016-10-24 18:34                                     ` Lars Brinkhoff
2016-10-24 19:52                                       ` Eli Zaretskii
2016-10-23 12:55                                 ` When should ralloc.c be used? Stefan Monnier
2016-10-23 14:28                                   ` Stefan Monnier
2016-10-23 14:57                                     ` Eli Zaretskii
2016-10-23 15:07                                       ` Stefan Monnier
2016-10-23 15:44                                         ` Eli Zaretskii
2016-10-23 16:30                                           ` Stefan Monnier
2016-10-23 16:45                                             ` Eli Zaretskii
2016-10-23 16:49                                               ` Stefan Monnier
2016-10-23 17:35                                                 ` Eli Zaretskii
2016-10-23 20:23                                                   ` Stefan Monnier
2016-10-23 20:33                                                     ` Eli Zaretskii
2016-10-23 20:44                                                       ` Stefan Monnier
2016-10-24  5:11                                                         ` Paul Eggert
2016-10-24 12:33                                                           ` Stefan Monnier
2016-10-24 13:05                                                             ` Eli Zaretskii
2016-10-24 14:12                                                               ` Stefan Monnier
2016-10-24 16:00                                                                 ` Eli Zaretskii
2016-10-24 18:51                                                                   ` Stefan Monnier
2016-10-24 14:37                                                               ` Stefan Monnier
2016-10-24 15:40                                                                 ` Eli Zaretskii
2016-10-24 16:27                                                                   ` Daniel Colascione
2016-10-24 16:57                                                                     ` Eli Zaretskii
2016-10-25  2:34                                                                     ` Richard Stallman
2016-10-25 14:13                                                                     ` Stefan Monnier
2016-10-25 14:14                                                                     ` Stefan Monnier
2016-10-28  6:03                                                                     ` Jérémie Courrèges-Anglas
2016-10-28  6:23                                                                       ` Daniel Colascione
2016-10-28  7:09                                                                         ` Jérémie Courrèges-Anglas
2016-10-28  7:46                                                                         ` Eli Zaretskii
2016-10-28  8:11                                                                           ` Daniel Colascione
2016-10-28  8:27                                                                             ` Eli Zaretskii
2016-10-28  8:44                                                                               ` Daniel Colascione
2016-10-28  9:43                                                                                 ` Eli Zaretskii
2016-10-28  9:52                                                                                   ` Daniel Colascione
2016-10-28 12:25                                                                                     ` Eli Zaretskii
2016-10-28 13:37                                                                                       ` Stefan Monnier
2016-10-28 14:30                                                                                         ` Eli Zaretskii
2016-10-28 14:43                                                                                           ` Stefan Monnier
2016-10-28 15:41                                                                                       ` Daniel Colascione
2016-10-29  6:08                                                                                         ` Eli Zaretskii
2016-10-29  6:14                                                                                           ` Daniel Colascione
2016-10-28 12:11                                                                                   ` Stefan Monnier
2016-10-28 11:40                                                                               ` Jérémie Courrèges-Anglas
2016-10-28 13:03                                                                                 ` Stefan Monnier
2016-10-28 14:41                                                                                   ` Jérémie Courrèges-Anglas
2016-10-28 15:34                                                                                 ` Daniel Colascione
2016-10-24 18:45                                                                   ` Stefan Monnier
2016-10-24 19:38                                                                     ` Eli Zaretskii
2016-10-25 14:12                                                                       ` Stefan Monnier
2016-10-25 16:36                                                                         ` Eli Zaretskii
2016-10-25 19:27                                                                           ` Stefan Monnier
2016-10-25  3:12                                                               ` Ken Raeburn
2016-10-25 16:06                                                                 ` Eli Zaretskii
2016-10-26  4:36                                                                   ` Ken Raeburn
2016-10-26 11:40                                                                     ` Eli Zaretskii
2016-10-27  8:51                                                                       ` Ken Raeburn
2016-10-24  6:59                                                         ` Eli Zaretskii
2016-10-24 12:45                                                           ` Stefan Monnier
2016-10-24 13:07                                                             ` Eli Zaretskii
2016-10-24 14:42                                                               ` Stefan Monnier
2016-10-24 15:43                                                                 ` Eli Zaretskii
2016-10-24 18:50                                                                   ` Stefan Monnier
2016-10-24 16:10                                                                 ` Eli Zaretskii
2016-10-24 16:56                                                             ` Richard Stallman
2016-10-24  0:21                             ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
2016-10-24  3:59                               ` Paul Eggert
2016-10-24  7:15                               ` Eli Zaretskii
2016-10-24 16:55                                 ` Richard Stallman
2016-10-24 17:09                                   ` Eli Zaretskii
2016-10-25  2:35                                     ` Richard Stallman
2016-10-25  6:38                                       ` Paul Eggert
2016-10-25 16:04                                       ` Eli Zaretskii
2016-10-25 23:49                                         ` Richard Stallman
2016-10-26  5:08                                           ` Paul Eggert
2016-10-26 11:46                                             ` Eli Zaretskii
2016-10-26 13:10                                               ` Noam Postavsky
2016-10-26 14:20                                                 ` Eli Zaretskii
2016-10-27  1:23                                             ` Richard Stallman
2016-10-27  1:36                                               ` Paul Eggert
2016-10-27 13:35                                                 ` Perry E. Metzger
2016-10-27 14:51                                                   ` Paul Eggert
2016-10-27 15:05                                                     ` Perry E. Metzger
2016-10-27 18:13                                                       ` Eli Zaretskii
2016-10-27 21:03                                                         ` Perry E. Metzger
2016-10-27 21:07                                                           ` Daniel Colascione
2016-10-27 23:23                                                             ` Perry E. Metzger
2016-10-27 23:32                                                               ` When should ralloc.c be used? Daniel Colascione
2016-10-28  7:06                                                             ` When should ralloc.c be used? (WAS: bug#24358) Eli Zaretskii
2016-10-28  7:03                                                           ` Eli Zaretskii
2016-10-27 13:44                                                 ` Fabrice Popineau
2016-10-27 15:35                                                   ` Eli Zaretskii
2016-10-27 20:39                                                 ` Richard Stallman
2016-10-28  6:48                                                   ` Eli Zaretskii
2016-10-28 19:12                                                     ` Richard Stallman
2016-10-29  6:37                                                       ` Eli Zaretskii
2016-10-29 14:55                                                         ` When should ralloc.c be used? Stefan Monnier
2016-10-30 16:13                                                           ` Eli Zaretskii
2016-10-30 21:47                                                             ` Stefan Monnier
2016-10-29 16:38                                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
2016-10-29 21:57                                                           ` Eli Zaretskii
2016-10-31 19:18                                                             ` Richard Stallman
2016-10-31 20:58                                                               ` Eli Zaretskii
2016-10-28 12:51                                                   ` When should ralloc.c be used? Stefan Monnier
2016-10-27 20:40                                                 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
2016-10-27 22:34                                                   ` Paul Eggert
2016-10-28  2:40                                                     ` Richard Stallman
2016-10-28  2:40                                                     ` Richard Stallman
2016-10-28  7:21                                                       ` Eli Zaretskii
2016-10-28  6:55                                                   ` Eli Zaretskii
2016-10-26 11:37                                           ` Eli Zaretskii
2016-10-27  1:24                                             ` Richard Stallman
2016-10-28 12:57                                               ` When should ralloc.c be used? Stefan Monnier
2016-10-28 19:13                                                 ` Richard Stallman
2016-10-28 22:46                                                   ` Stefan Monnier
2016-10-29 16:35                                                     ` Richard Stallman
2016-10-29  6:39                                                   ` Eli Zaretskii
2016-10-29 16:37                                                     ` Richard Stallman
2016-10-29 21:51                                                       ` Eli Zaretskii
2016-10-30 11:33                                                         ` Richard Stallman
2016-10-30 15:33                                                           ` Alp Aker
2016-10-30 17:19                                                             ` Richard Stallman
2016-10-30 16:08                                                           ` Eli Zaretskii
2016-10-25 23:49                                         ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman
2016-10-25  2:35                                     ` Richard Stallman
2016-10-25 16:05                                       ` Eli Zaretskii
2016-10-27  1:22                                         ` Richard Stallman
2016-10-25 23:00                                     ` Perry E. Metzger
2016-10-26  2:37                                       ` Eli Zaretskii
2016-10-27  1:25                                       ` Richard Stallman
2016-10-24 14:04                               ` When should ralloc.c be used? Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
