* When should ralloc.c be used? (WAS: bug#24358) [not found] ` <831szqhbc2.fsf@gnu.org> @ 2016-10-22 3:03 ` npostavs 2016-10-22 5:32 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: npostavs @ 2016-10-22 3:03 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Thread 1 "emacs" hit Hardware watchpoint 4: current_buffer->text->beg >> >> Old value = (unsigned char *) 0x18351b8 "" >> New value = (unsigned char *) 0x188a1b8 "" >> r_alloc_sbrk (size=290816) at ralloc.c:818 > > r_alloc_sbrk? What OS is this? We only use ralloc.c on a handful of > them, as of Emacs 25. Should ralloc.c be used on GNU/Linux systems that have GNU libc? I found this in my config.log (I think it's related to ralloc use, though I'm finding the configure code a bit confusing): configure:11440: checking whether malloc is Doug Lea style configure:11461: gcc -o conftest -O0 -g3 -march=native conftest.c >&5 conftest.c: In function 'main': conftest.c:107:6: error: '__malloc_initialize_hook' undeclared (first use in this function) __malloc_initialize_hook = hook; ^~~~~~~~~~~~~~~~~~~~~~~~ It seems that my malloc.h does not declare __malloc_initialize_hook, even though 'man 3 malloc_hook' says #include <malloc.h> void *(*__malloc_hook)(size_t size, const void *caller); void *(*__realloc_hook)(void *ptr, size_t size, const void *caller); void *(*__memalign_hook)(size_t alignment, size_t size, const void *caller); void (*__free_hook)(void *ptr, const void *caller); void (*__malloc_initialize_hook)(void); void (*__after_morecore_hook)(void); Perhaps this is because the recommended way to set this hook (which is different from what the configure test is using) doesn't require a declaration? The variable __malloc_initialize_hook points at a function that is called once when the malloc implementation is initialized.
This is a weak variable, so it can be overridden in the application with a definition like the following: void (*__malloc_initialize_hook)(void) = my_init_hook; ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-22 3:03 ` When should ralloc.c be used? (WAS: bug#24358) npostavs @ 2016-10-22 5:32 ` Paul Eggert 2016-10-22 7:29 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-22 5:32 UTC (permalink / raw) To: npostavs, emacs-devel npostavs@users.sourceforge.net wrote: > Should ralloc.c be used on GNU/Linux systems that have GNU libc? Yes, with bleeding-edge glibc, as __malloc_initialize_hook has been removed. Evidently your man page is out of sync with your glibc. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-22 5:32 ` Paul Eggert @ 2016-10-22 7:29 ` Eli Zaretskii 2016-10-22 18:34 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-22 7:29 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, npostavs > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 21 Oct 2016 22:32:49 -0700 > > npostavs@users.sourceforge.net wrote: > > Should ralloc.c be used on GNU/Linux systems that have GNU libc? > > Yes, with bleeding-edge glibc, as __malloc_initialize_hook has been removed. If that's the case, shouldn't we switch such glibc systems to use mmap instead? It should be free of at least some of the problems in ralloc.c, I think. Alternatively, how about supporting an external Doug Lea malloc library (assuming such a library exists and Emacs can be linked against it)? ralloc.c is generally "bad news", we've gone to non-trivial efforts during the last years to reduce its usage to the minimum. I always thought that only MSDOS and perhaps a few *BSD systems still use it. Having it creep back into GNU/Linux is really a bad regression, IMO. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-22 7:29 ` Eli Zaretskii @ 2016-10-22 18:34 ` Paul Eggert 2016-10-22 19:43 ` When should ralloc.c be used? Stefan Monnier 2016-10-24 0:21 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 0 siblings, 2 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-22 18:34 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, npostavs Eli Zaretskii wrote: > Having it creep back into GNU/Linux is really a bad regression, IMO. I don't like it either, but would rather work on redoing the build process so that we can use the native malloc on all hosts. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-22 18:34 ` Paul Eggert @ 2016-10-22 19:43 ` Stefan Monnier 2016-10-23 2:37 ` Paul Eggert 2016-10-24 0:21 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-22 19:43 UTC (permalink / raw) To: emacs-devel >> Having it creep back into GNU/Linux is really a bad regression, IMO. > I don't like it either, but would rather work on redoing the build process > so that we can use the native malloc on all hosts. But that doesn't explain why we'd need to use ralloc in the mean time. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-22 19:43 ` When should ralloc.c be used? Stefan Monnier @ 2016-10-23 2:37 ` Paul Eggert 2016-10-23 6:53 ` Eli Zaretskii 2016-10-23 12:55 ` When should ralloc.c be used? Stefan Monnier 0 siblings, 2 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-23 2:37 UTC (permalink / raw) To: Stefan Monnier, emacs-devel Stefan Monnier wrote: > that doesn't explain why we'd need to use ralloc in the mean time. I suppose you're right that we don't need to; we could instead hack on Emacs to get it to work without ralloc on recent glibc. If someone wants to do that, great. I'd rather spend my own limited cycles on fixing the main problem, which is unexec. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 2:37 ` Paul Eggert @ 2016-10-23 6:53 ` Eli Zaretskii 2016-10-23 7:57 ` Paul Eggert 2016-10-23 16:44 ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier 2016-10-23 12:55 ` When should ralloc.c be used? Stefan Monnier 1 sibling, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 6:53 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 22 Oct 2016 19:37:36 -0700 > > Stefan Monnier wrote: > > that doesn't explain why we'd need to use ralloc in the mean time. > > I suppose you're right that we don't need to; we could instead hack on Emacs to > get it to work without ralloc on recent glibc. How about using mmap in those cases? > If someone wants to do that, great. I'd rather spend my own limited > cycles on fixing the main problem, which is unexec. I thought we agreed to get rid of unexec by loading a single .elc file at startup of Emacs, and remove the distinction between temacs and emacs altogether. Is that what you'd like to work on? Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 6:53 ` Eli Zaretskii @ 2016-10-23 7:57 ` Paul Eggert 2016-10-23 8:58 ` Eli Zaretskii 2016-10-23 16:44 ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-23 7:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel Eli Zaretskii wrote: > How about using mmap in those cases? I don't know, and would rather not spend time investigating. > I thought we agreed to get rid of unexec by loading a single .elc file > at startup of Emacs Yes, if that performs well enough. We don't know yet whether it will. It's on my list of things to look into, but it's not trivial. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 7:57 ` Paul Eggert @ 2016-10-23 8:58 ` Eli Zaretskii 2016-10-23 9:38 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 8:58 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 23 Oct 2016 00:57:21 -0700 > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org > > > I thought we agreed to get rid of unexec by loading a single .elc file > > at startup of Emacs > > Yes, if that performs well enough. We don't know yet whether it will. It's on my > list of things to look into, but it's not trivial. Can you share the concerns and the tests you'd like to be performed? Perhaps others (myself included) could help with such testing. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 8:58 ` Eli Zaretskii @ 2016-10-23 9:38 ` Paul Eggert 2016-10-23 12:50 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-23 9:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel Eli Zaretskii wrote: > Can you share the concerns and the tests you'd like to be performed? I wrote something along those lines in Bug#23529; see the URL below. My main concern is startup time and energy. I don't have detailed benchmarks. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=23529#197 ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 9:38 ` Paul Eggert @ 2016-10-23 12:50 ` Eli Zaretskii 2016-10-23 13:39 ` Stefan Monnier 2016-10-23 15:22 ` Andreas Schwab 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 12:50 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel Is it reasonable to require a version of glibc that still supports __malloc_initialize_hook? When people upgrade to a newer glibc, the previous version is still left on the system, I presume (for programs that need them which were built against those old versions)? I took a look at our sources, and we have a lot of places where we call malloc, directly or indirectly, while holding C pointers to data of Lisp strings. We also have several (maybe half a dozen) places where the same happens with C pointers to buffer text. Auditing all of these and fixing them is a non-trivial job, so maybe we should try to avoid the problems in the first place? Is it a practical solution? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 12:50 ` Eli Zaretskii @ 2016-10-23 13:39 ` Stefan Monnier 2016-10-23 14:01 ` Eli Zaretskii 2016-10-23 15:22 ` Andreas Schwab 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 13:39 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel > Is it reasonable to require a version of glibc that still supports > __malloc_initialize_hook? When people upgrade to a newer glibc, the > previous version is still left on the system, I presume (for programs > that need them which were built against those old versions)? Not necessarily, no. E.g. it's not the case for fresh new installs. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 13:39 ` Stefan Monnier @ 2016-10-23 14:01 ` Eli Zaretskii 2016-10-23 14:18 ` Stefan Monnier 2016-10-23 18:19 ` Paul Eggert 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 14:01 UTC (permalink / raw) To: Stefan Monnier; +Cc: eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > Date: Sun, 23 Oct 2016 09:39:57 -0400 > > > Is it reasonable to require a version of glibc that still supports > > __malloc_initialize_hook? When people upgrade to a newer glibc, the > > previous version is still left on the system, I presume (for programs > > that need them which were built against those old versions)? > > Not necessarily, no. E.g. it's not the case for fresh new installs. But they can downgrade, right? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 14:01 ` Eli Zaretskii @ 2016-10-23 14:18 ` Stefan Monnier 2016-10-23 18:19 ` Paul Eggert 1 sibling, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 14:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel >> > Is it reasonable to require a version of glibc that still supports >> > __malloc_initialize_hook? When people upgrade to a newer glibc, the >> > previous version is still left on the system, I presume (for programs >> > that need them which were built against those old versions)? >> Not necessarily, no. E.g. it's not the case for fresh new installs. > But they can downgrade, right? Depends on the details of the distribution, but while it's technically probably possible, it's not necessarily simple for the end-user. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 14:01 ` Eli Zaretskii 2016-10-23 14:18 ` Stefan Monnier @ 2016-10-23 18:19 ` Paul Eggert 2016-10-23 19:03 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-23 18:19 UTC (permalink / raw) To: Eli Zaretskii, Stefan Monnier; +Cc: emacs-devel Eli Zaretskii wrote: > But they can downgrade, right? No, as users don't necessarily have an older glibc to downgrade to. That train has already left the station. With my blessing, I might add. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 18:19 ` Paul Eggert @ 2016-10-23 19:03 ` Eli Zaretskii 2016-10-23 20:36 ` Stefan Monnier 2016-10-24 4:59 ` Paul Eggert 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 19:03 UTC (permalink / raw) To: Paul Eggert; +Cc: monnier, emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 23 Oct 2016 11:19:15 -0700 > > Eli Zaretskii wrote: > > But they can downgrade, right? > > No, as users don't necessarily have an older glibc to downgrade to. That train > has already left the station. With my blessing, I might add. Then what are our choices to solve this for Emacs 25.2? If GNU/Linux starts using ralloc more and more, we will have crashes and data corruption all over the place. It's inconceivable to release 25.2 in this state. Any suggestions? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 19:03 ` Eli Zaretskii @ 2016-10-23 20:36 ` Stefan Monnier 2016-10-24 6:54 ` Eli Zaretskii 2016-10-24 4:59 ` Paul Eggert 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 20:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel > Then what are our choices to solve this for Emacs 25.2? If GNU/Linux > starts using ralloc more and more, we will have crashes and data > corruption all over the place. It's inconceivable to release 25.2 in > this state. What's wrong with using gmalloc without ralloc and with mmap'd buffers? Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 20:36 ` Stefan Monnier @ 2016-10-24 6:54 ` Eli Zaretskii 2016-10-24 10:15 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 6:54 UTC (permalink / raw) To: Stefan Monnier; +Cc: eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > Date: Sun, 23 Oct 2016 16:36:24 -0400 > > > Then what are our choices to solve this for Emacs 25.2? If GNU/Linux > > starts using ralloc more and more, we will have crashes and data > > corruption all over the place. It's inconceivable to release 25.2 in > > this state. > > What's wrong with using gmalloc without ralloc and with mmap'd buffers? Nothing, if it works. But someone should set up Emacs to do that, and make sure the result builds, bootstraps, and works reliably, i.e. doesn't have all the problems reported recently in this and related bugs. I don't have access to any platforms that are affected by this (fencepost doesn't yet have such a new glibc). I will do this myself if no one else comes to help, but I really could use help from people who work on platforms that are affected by this issue. Noam and Sam help, but we need more manpower and more expertise. In any case, I asked what were our alternatives, because I'm not sure we have a clear view of those. Making decisions with just a peephole view of the issues is never a good idea. The best solution might not be changing the configury to eliminate ralloc, it could be something entirely different. For example, I see in regex.c a set of special definitions for REGEX_ALLOCATE_STACK and friends conditioned by this: #if defined REL_ALLOC && defined REGEX_MALLOC These definitions call directly a few functions in ralloc.c, as opposed to going via malloc. Does anyone know what is this about? 
Should we try building with REGEX_MALLOC on platforms that use ralloc.c, and see whether the problems with regex searches triggered by relocation go away? Yet another idea is to enlarge the stack space available to SAFE_ALLOCA in regex.c, so that the failure stack is allocated off the C runtime stack, thus side-stepping the relocation issues. And maybe there are other possibilities. We really need to come up with all the possible ideas, try which ones work, and decide as quickly as possible what is the best one. This currently is the most serious blocking issue on the way towards releasing Emacs 25.2 soon, as we wanted to. ^ permalink raw reply [flat|nested] 375+ messages in thread
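[Editorial illustration] The stack-versus-heap idea in the message above can be sketched in C. This is only a minimal model under stated assumptions: `SCRATCH_ON_STACK_MAX` and `reverse_with_scratch` are hypothetical names, the threshold is not Emacs's MAX_ALLOCA, and a fixed-size array stands in for SAFE_ALLOCA. The point it shows is that a request below the threshold never calls the allocator, so nothing can be relocated underneath it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold; Emacs's real limit is MAX_ALLOCA and its
   value is not reproduced here.  */
enum { SCRATCH_ON_STACK_MAX = 4096 };

/* Reverse LEN bytes of SRC into DST using scratch space.  Small
   requests use a stack array, so no allocator call (and hence no
   relocation) can happen; larger ones fall back to malloc.
   *USED_HEAP reports which path was taken.  Returns 0 on success,
   -1 if the heap fallback fails.  */
static int
reverse_with_scratch (char *dst, const char *src, size_t len, int *used_heap)
{
  char stack_buf[SCRATCH_ON_STACK_MAX];
  char *scratch = stack_buf;

  *used_heap = len > sizeof stack_buf;
  if (*used_heap)
    {
      scratch = malloc (len);
      if (!scratch)
        return -1;
    }

  for (size_t i = 0; i < len; i++)
    scratch[i] = src[len - 1 - i];
  memcpy (dst, scratch, len);

  if (*used_heap)
    free (scratch);
  return 0;
}
```

Raising the threshold moves more requests onto the stack path, at the cost of larger stack frames; whether that is acceptable for the regex failure stack is exactly the open question in the message.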
* Re: When should ralloc.c be used? 2016-10-24 6:54 ` Eli Zaretskii @ 2016-10-24 10:15 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 10:15 UTC (permalink / raw) To: emacs-devel; +Cc: eggert, monnier > Date: Mon, 24 Oct 2016 09:54:12 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org > > And maybe there are other possibilities. We really need to come up > with all the possible ideas, try which ones work, and decide as > quickly as possible what is the best one. This currently is the most > serious blocking issue on the way towards releasing Emacs 25.2 soon, > as we wanted to. So I think the most promising alternatives at this point are: . Build with gmalloc but without ralloc. Would people who have ralloc.o in their src directory please reconfigure with REL_ALLOC=no, and see if the result works reliably in your day-to-day work? Please report the results here, and if you were hit by one of the related bugs (24358 and 24764), please report also to the corresponding bug addresses. . Back-port the HYBRID_MALLOC changes from master. Not sure if the patch is simple and safe enough, or whether the result is tested well enough to have that on emacs-25. If one of these works, we should consider reverting the changes in regex.c that attempt to handle relocation during regex.c calls. We should also consider removing ralloc.c from any of our builds, in the hope that the platforms which we care about have a much better malloc implementation than what was available 20 years ago. Comments? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 19:03 ` Eli Zaretskii 2016-10-23 20:36 ` Stefan Monnier @ 2016-10-24 4:59 ` Paul Eggert 2016-10-24 7:44 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-24 4:59 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel Eli Zaretskii wrote: > Then what are our choices to solve this for Emacs 25.2? Sorry, I've lost context. What's "this"? Doesn't draft Emacs 25.2 work adequately on bleeding-edge glibc without our doing anything special? If not, what are the problems and how can we reproduce them? I attempted to reproduce the problems, whatever they are, by building draft Emacs 25.2 with './configure emacs_cv_var_doug_lea_malloc=no' on x86-64 Ubuntu 16.04.1 (I don't have easy access to 16.10 yet, but this does build and link gmalloc.o and ralloc.o). --enable-gcc-warnings did generate some warnings, which I just now fixed, but I didn't observe any runtime misbehaviors. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 4:59 ` Paul Eggert @ 2016-10-24 7:44 ` Eli Zaretskii 2016-10-24 8:29 ` Andreas Schwab 2016-10-24 16:21 ` Paul Eggert 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 7:44 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sun, 23 Oct 2016 21:59:52 -0700 > > Eli Zaretskii wrote: > > Then what are our choices to solve this for Emacs 25.2? > > Sorry, I've lost context. What's "this"? The fact that Emacs 25.2 builds with ralloc.c on recent GNU/Linux systems, which triggers random crashes, inability to build Emacs in some cases, and other atrocities, such as corruption of buffer text. > Doesn't draft Emacs 25.2 work adequately on bleeding-edge glibc > without our doing anything special? Not even close. > If not, what are the problems and how can we reproduce them? One problem is that relocation of buffer text and Lisp string data can happen during regex searches, due to reallocation of the failure stack to a size that exceeds MAX_ALLOCA. Noam fixed that by some non-trivial code in regex.c, but those changes seem to have uncovered a problem that precludes bootstrapping Emacs 25.2, reported here: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358#123 which was independently reported for a different machine here: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24772#11 I predict that more people will start hitting this as they upgrade to newer glibc and start building Emacs with ralloc.c. So we can't even build Emacs 25.2 reliably on "bleeding-edge" GNU/Linux systems -- how's that for "working adequately"? Then there's bug#24764, which sounds a lot like it's also related to ralloc.c. Because Michael Heerdegen talked about problems with browsing Web pages, I looked at xml.c, and sure thing, it passes a C pointer to buffer text to libxml2 functions which call malloc internally. 
I installed a ralloc-specific workaround for that, but didn't yet hear from Michael if that was sufficient to solve his frequent crashes in GC and corruptions of buffer text. > I attempted to reproduce the problems, whatever they are, by building draft > Emacs 25.2 with './configure emacs_cv_var_doug_lea_malloc=no' on x86-64 Ubuntu > 16.04.1 (I don't have easy access to 16.10 yet, but this does build and link > gmalloc.o and ralloc.o). --enable-gcc-warnings did generate some warnings, which > I just now fixed, but I didn't observe any runtime misbehaviors. The problems don't happen immediately, and the problem with bootstrap (bug#24358) seems to be dependent on some factor we don't yet understand: Noam cannot reproduce it, although his system is very similar to the one where it does happen. If you want an almost immediate manifestation of the problem in a build with ralloc, remove the calls to r_alloc_inhibit_buffer_relocation in xml.c, and browse some Web pages, you will sooner or later hit an assertion violation in parse_region. More generally, I found a few more places in the sources where we hold C pointers to buffer text around calls to functions that can call malloc. I'm not sure I found all of them, because the only way to look for them I know of is not perfect. As for the same situation where we hold C pointers to Lisp string data where malloc can be called, there are virtually dozens of them, the most frequent paradigm is something like char *beg = SSDATA (lisp_string); char *end = beg + something; Lisp_Object new_string = make_unibyte_string (beg, end - beg); (The catch here is that make_unibyte_string calls malloc internally, which could relocate the data of the original lisp_string, and thus invalidate the pointers 'beg' and 'end'.) I don't remember well enough the internals of ralloc.c: perhaps it doesn't relocate Lisp string data unless the string is long enough? or at all? 
So the problems with Lisp string might not be as grave as I fear, but this should certainly be looked into before we dismiss all those cases. Bottom line: when GNU/Linux systems started using ralloc.c, they've potentially exposed Emacs 25.2 to very serious instability, on the platform that we consider by far the most important one. We need to move fast and thoroughly to investigate the possible solutions and decide which one to use. ^ permalink raw reply [flat|nested] 375+ messages in thread
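[Editorial illustration] The hazard described in the message above, a C pointer into text that moves when an allocation happens, can be modelled in a few lines. This is a sketch under assumptions, not Emacs code: `struct text_block`, `text_block_grow`, and `char_after_growth` are hypothetical names, and plain `realloc` stands in for the relocation that ralloc.c performs on buffer text. The safe pattern is to carry an offset across the call that may relocate and recompute the pointer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for relocatable buffer text; this is not
   Emacs's real struct buffer_text.  */
struct text_block
{
  char *beg;     /* current start of the text; may change on growth */
  size_t size;   /* bytes currently allocated */
};

/* Grow the block.  Like the r_alloc_sbrk call in the watchpoint
   output earlier in the thread, this may move the text, so any raw
   pointer into the old storage becomes stale.  Returns 0 on success,
   -1 on allocation failure (the block is left unchanged).  */
static int
text_block_grow (struct text_block *tb, size_t new_size)
{
  char *p = realloc (tb->beg, new_size);
  if (!p)
    return -1;
  tb->beg = p;
  tb->size = new_size;
  return 0;
}

/* Safe pattern: remember an offset, not a pointer, across the call
   that may relocate, then recompute the pointer afterwards.  OFFSET
   must lie within the block's size before the call.  */
static char
char_after_growth (struct text_block *tb, size_t offset, size_t new_size)
{
  /* A 'char *p = tb->beg + offset;' taken here would be unsafe to
     dereference once text_block_grow has run.  */
  if (text_block_grow (tb, new_size) != 0)
    return '\0';
  return tb->beg[offset];
}
```

Auditing real code for this pattern is harder than the sketch suggests, because the allocating call is often several frames below the function that holds the pointer.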
* Re: When should ralloc.c be used? 2016-10-24 7:44 ` Eli Zaretskii @ 2016-10-24 8:29 ` Andreas Schwab 2016-10-24 8:47 ` Eli Zaretskii 2016-10-24 16:21 ` Paul Eggert 1 sibling, 1 reply; 375+ messages in thread From: Andreas Schwab @ 2016-10-24 8:29 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel On Okt 24 2016, Eli Zaretskii <eliz@gnu.org> wrote: > I don't remember well enough the internals of ralloc.c: perhaps it > doesn't relocate Lisp string data unless the string is long enough? or > at all? So the problems with Lisp string might not be as grave as I > fear, but this should certainly be looked into before we dismiss all > those cases. String data can be relocated even without ralloc, see compact_small_strings. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 8:29 ` Andreas Schwab @ 2016-10-24 8:47 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 8:47 UTC (permalink / raw) To: Andreas Schwab; +Cc: eggert, emacs-devel > From: Andreas Schwab <schwab@suse.de> > Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 10:29:35 +0200 > > On Okt 24 2016, Eli Zaretskii <eliz@gnu.org> wrote: > > > I don't remember well enough the internals of ralloc.c: perhaps it > > doesn't relocate Lisp string data unless the string is long enough? or > > at all? So the problems with Lisp string might not be as grave as I > > fear, but this should certainly be looked into before we dismiss all > > those cases. > > String data can be relocated even without ralloc, see > compact_small_strings. Yes, but that's called only by GC, so not a problem in most (if not all) the places I've seen that, and is not related to ralloc anyway. Looking at ralloc.c and its callers, I think the only blocks of memory it relocates on its own are those allocated with r_alloc or r_realloc, which is only used for buffer text and (under REGEX_MALLOC) for regex.c failure stack. Which means compact_small_strings is the _only_ place where string data is relocated, so just calls to malloc cannot. IOW, I confused GC and compact_small_strings with ralloc, when I talked about string data, and our only problem is with pointers to buffer text. Am I missing something? Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
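[Editorial illustration] Since only buffer text is at risk, the workaround mentioned earlier for xml.c amounts to bracketing the foreign call with an inhibit/restore pair. The sketch below models that shape only. All names here (`inhibit_relocation`, `may_relocate`, `call_with_stable_text`, `sample_parser`) are hypothetical, and the counter is an assumed model; the internals of the real r_alloc_inhibit_buffer_relocation in ralloc.c are not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an inhibit counter.  */
static int relocation_inhibited;

/* Nonzero INHIBIT pins the text; zero releases one level of pinning.  */
static void
inhibit_relocation (int inhibit)
{
  if (inhibit)
    relocation_inhibited++;
  else if (relocation_inhibited > 0)
    relocation_inhibited--;
}

/* Stand-in for the allocator's decision point: it may move text only
   when nobody has asked it not to.  */
static int
may_relocate (void)
{
  return relocation_inhibited == 0;
}

/* Bracket a callback that receives a raw pointer to text and may
   allocate internally (as a library parser would).  */
static int
call_with_stable_text (int (*fn) (const char *, size_t),
                       const char *text, size_t len)
{
  inhibit_relocation (1);
  int result = fn (text, len);
  inhibit_relocation (0);
  return result;
}

/* Example callback: returns LEN if the text was pinned while it ran,
   or -1 if relocation was still possible.  */
static int
sample_parser (const char *text, size_t len)
{
  (void) text;
  return may_relocate () ? -1 : (int) len;
}
```

A counter rather than a flag lets such brackets nest; whether the real function nests the same way is not something this sketch asserts.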
* Re: When should ralloc.c be used? 2016-10-24 7:44 ` Eli Zaretskii 2016-10-24 8:29 ` Andreas Schwab @ 2016-10-24 16:21 ` Paul Eggert 2016-10-24 16:39 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-24 16:21 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 10/24/2016 12:44 AM, Eli Zaretskii wrote: > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358... Then there's > bug#24764 ... These bugs seem to be fixed now (thanks to you!). As Andreas pointed out, the problems with ralloc.c are not as severe as initially feared, since they should be limited to pointers to buffer text and should not extend to pointers to Lisp strings. As I understand it, although the ralloc.c approach worked for a long time, it fell out of favor on common platforms and so hasn't been debugged as thoroughly for the past several years. Unfortunately, recent changes to glibc have caused ralloc.c to be used again on common GNU platforms and this is shaking out longstanding bugs with the ralloc.c approach. This means people using bleeding-edge glibc are suffering problems similar to what people on now-unusual platforms must have had for some time. Surely we can fix these ralloc.c-related bugs as they come up. That being said, they are a hassle for users and maintainers, and if dropping ralloc.c works and doesn't cause significant performance degradation it sounds like that would be a win. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:21 ` Paul Eggert @ 2016-10-24 16:39 ` Eli Zaretskii 2016-10-24 16:54 ` Paul Eggert ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 16:39 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 24 Oct 2016 09:21:50 -0700 > > On 10/24/2016 12:44 AM, Eli Zaretskii wrote: > > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358... Then there's > > bug#24764 ... > > These bugs seem to be fixed now (thanks to you!). Some of them are fixed. At least one more remains (I hope to fix it soon). What's more, we still don't know whether the changes in regex.c by Noam are correct enough to solve the problems with relocation of buffer text while we search the buffer, and we also don't know yet whether the two bugs mentioned above are solved because we didn't hear from their OPs. Since these all are related to ralloc.c, the question is whether we should get rid of using it on GNU/Linux, instead of chasing each of these problems (which is anything but easy and might take time). > As Andreas pointed out, the problems with ralloc.c are not as severe > as initially feared, since they should be limited to pointers to > buffer text and should not extend to pointers to Lisp strings. Indeed, and that's a relief. > As I understand it, although the ralloc.c approach worked for a long > time, it fell out of favor on common platforms and so hasn't been > debugged as thoroughly for the past several years. Unfortunately, recent > changes to glibc have caused ralloc.c to be used again on common GNU > platforms and this is shaking out longstanding bugs with the ralloc.c > approach. This means people using bleeding-edge glibc are suffering > problems similar to what people on now-unusual platforms must have had > for some time. Yes, exactly.
And since most people at least here use Emacs on GNU/Linux, the nasty problems due to ralloc.c are popping up much faster and more frequently than they did when only *BSD and Windows used ralloc.c. > Surely we can fix these ralloc.c-related bugs as they come up. That > being said, they are a hassle for users and maintainers, and if dropping > ralloc.c works and doesn't cause significant performance degradation it > sounds like that would be a win. Right, so I'd like your opinion and comments about the possible solutions proposed so far: . Build with gmalloc but without ralloc. . Back-port the HYBRID_MALLOC changes from master. Not sure if the patch is simple and safe enough, or whether the result is tested well enough to have that on emacs-25. . Build with gmalloc and use mmap for buffer text allocation. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:39 ` Eli Zaretskii @ 2016-10-24 16:54 ` Paul Eggert 2016-10-24 17:05 ` Eli Zaretskii 2016-10-28 6:18 ` Jérémie Courrèges-Anglas 2016-10-28 6:19 ` Jérémie Courrèges-Anglas 2 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-24 16:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 775 bytes --] On 10/24/2016 09:39 AM, Eli Zaretskii wrote: > > Right, so I'd like your opinion and comments about the possible > solutions proposed so far: > > . Build with gmalloc but without ralloc. This goes back to what we were doing, no? > > . Back-port the HYBRID_MALLOC changes from master. Not sure if the > patch is simple and safe enough, or whether the result is tested > well enough to have that on emacs-25. This sounds riskier. > . Build with gmalloc and use mmap for buffer text allocation. This also sounds riskier. How about the attached patch for emacs-25? Basically, it says "use ralloc.c only if requested via './configure REL_ALLOC=yes'". I assume that this patch need not be ported to master, due to HYBRID_MALLOC. I haven't tested this.
[-- Attachment #2: ralloc.diff --]
[-- Type: text/x-patch, Size: 735 bytes --]
diff --git a/configure.ac b/configure.ac
index ae7dfe5..19b44bd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2189,18 +2189,10 @@ if test "$doug_lea_malloc" = "yes" ; then
   AC_DEFINE(DOUG_LEA_MALLOC, 1,
 	    [Define to 1 if the system memory allocator is Doug Lea style,
 	     with malloc hooks and malloc_set_state.])
-
-  ## Use mmap directly for allocating larger buffers.
-  ## FIXME this comes from src/s/{gnu,gnu-linux}.h:
-  ## #ifdef DOUG_LEA_MALLOC; #undef REL_ALLOC; #endif
-  ## Does the AC_FUNC_MMAP test below make this check unnecessary?
-  case "$opsys" in
-    mingw32|gnu*) REL_ALLOC=no ;;
-  esac
 fi
 
 if test x"${REL_ALLOC}" = x; then
-  REL_ALLOC=${GNU_MALLOC}
+  REL_ALLOC=no
 fi
 
 use_mmap_for_buffers=no
^ permalink raw reply related [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:54 ` Paul Eggert @ 2016-10-24 17:05 ` Eli Zaretskii 2016-10-25 6:23 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 17:05 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 24 Oct 2016 09:54:24 -0700 > > > . Build with gmalloc but without ralloc. > > This goes back to what we were doing, no? No, we were using the glibc malloc, AFAIK. Or am I missing something? > > . Back-port the HYBRID_MALLOC changes from master. Not sure if the > > patch is simple and safe enough, or whether the result is tested > > well enough to have that on emacs-25. > > This sounds riskier. > > > . Build with gmalloc and use mmap for buffer text allocation. > > This also sounds riskier. I agree with your assessments. > How about the attached patch for emacs-25? Basically, it says "use > ralloc.c only if requested via './configure REL_ALLOC=yes'". LGTM. Should we wait for people to build with REL_ALLOC=no manually, to see if there are any problems, or should we push this right away? > I assume that this patch need not be ported to master, due to > HYBRID_MALLOC. Yes, I think so. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 17:05 ` Eli Zaretskii @ 2016-10-25 6:23 ` Paul Eggert 2016-10-25 16:11 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-25 6:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel [-- Attachment #1: Type: text/plain, Size: 784 bytes --] >>> . Build with gmalloc but without ralloc. >> >> This goes back to what we were doing, no? > > No, we were using the glibc malloc, AFAIK. Or am I missing something? No you're right, I was sloppy. >> How about the attached patch for emacs-25? Basically, it says "use >> ralloc.c only if requested via './configure REL_ALLOC=yes'". > > LGTM. Should we wait for people to build with REL_ALLOC=no manually, > to see if there are any problems, or should we push this right away? I doubt whether many more people will build with REL_ALLOC=no. So I think we should push it into emacs-25. Proposed patch attached. I have tested this on Fedora 24 x86-64 and Ubuntu 16.04 x86-64 with './configure emacs_cv_var_doug_lea_malloc=no' to simulate bleeding-edge glibc.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Default-REL_ALLOC-to-no.patch --]
[-- Type: text/x-diff; name="0001-Default-REL_ALLOC-to-no.patch", Size: 1732 bytes --]
From 561110345c48d48b4621d2e59487c2c17fcc988c Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 24 Oct 2016 23:11:32 -0700
Subject: [PATCH] Default REL_ALLOC to 'no'

This should make ralloc-related bugs less likely on GNU/Linux systems
with bleeding-edge glibc. See the email thread containing:
http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00801.html
Do not merge to master.
* configure.ac (REL_ALLOC): Default to 'no' on all platforms,
not merely on platforms with Doug Lea malloc. Although
bleeding-edge glibc no longer exports __malloc_initialize_hook and
so no longer passes the configure-time test for Doug Lea malloc,
ralloc tickles longstanding bugs like Bug#24358 and Bug#24764 and
Emacs is likely to be more reliable without it. This patch is not
needed on master, which uses hybrid malloc in this situation.
---
 configure.ac | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/configure.ac b/configure.ac
index ae7dfe5..19b44bd 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2189,18 +2189,10 @@ if test "$doug_lea_malloc" = "yes" ; then
   AC_DEFINE(DOUG_LEA_MALLOC, 1,
 	    [Define to 1 if the system memory allocator is Doug Lea style,
 	     with malloc hooks and malloc_set_state.])
-
-  ## Use mmap directly for allocating larger buffers.
-  ## FIXME this comes from src/s/{gnu,gnu-linux}.h:
-  ## #ifdef DOUG_LEA_MALLOC; #undef REL_ALLOC; #endif
-  ## Does the AC_FUNC_MMAP test below make this check unnecessary?
-  case "$opsys" in
-    mingw32|gnu*) REL_ALLOC=no ;;
-  esac
 fi
 
 if test x"${REL_ALLOC}" = x; then
-  REL_ALLOC=${GNU_MALLOC}
+  REL_ALLOC=no
 fi
 
 use_mmap_for_buffers=no
--
2.7.4
^ permalink raw reply related [flat|nested] 375+ messages in thread
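For readers following the patch, the change in defaulting can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: the function names old_rel_alloc and new_rel_alloc and their argument order are made up here and do not appear in configure.ac, and GNU_MALLOC=yes for the bleeding-edge glibc case is an assumption drawn from the discussion, not from a real config.log.

```sh
#!/bin/sh
# Illustrative sketch only; these helpers are hypothetical and are not
# part of configure.ac. They mirror the two versions of the hunk above.

# old_rel_alloc DOUG_LEA_MALLOC OPSYS GNU_MALLOC [REL_ALLOC]
# Pre-patch logic: REL_ALLOC is forced to "no" only when the Doug Lea
# test passed on mingw32 or gnu*; otherwise it falls back to GNU_MALLOC.
old_rel_alloc () {
  doug_lea_malloc=$1
  opsys=$2
  GNU_MALLOC=$3
  REL_ALLOC=${4-}
  if test "$doug_lea_malloc" = "yes"; then
    case "$opsys" in
      mingw32|gnu*) REL_ALLOC=no ;;
    esac
  fi
  if test x"${REL_ALLOC}" = x; then
    REL_ALLOC=${GNU_MALLOC}
  fi
  echo "$REL_ALLOC"
}

# new_rel_alloc [REL_ALLOC]
# Post-patch logic: an unset REL_ALLOC becomes "no" on every platform.
new_rel_alloc () {
  REL_ALLOC=${1-}
  if test x"${REL_ALLOC}" = x; then
    REL_ALLOC=no
  fi
  echo "$REL_ALLOC"
}

old_rel_alloc no gnu-linux yes    # Doug Lea test fails: prints "yes"
old_rel_alloc yes gnu-linux yes   # Doug Lea test passes: prints "no"
new_rel_alloc                     # nothing requested: prints "no"
new_rel_alloc yes                 # explicit request: prints "yes"
```

The first call is the case the thread is about: once the configure-time test for Doug Lea malloc fails, the old fallback silently turns ralloc.c on, whereas the patched default needs an explicit './configure REL_ALLOC=yes'.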
* Re: When should ralloc.c be used? 2016-10-25 6:23 ` Paul Eggert @ 2016-10-25 16:11 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 16:11 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 24 Oct 2016 23:23:45 -0700 > > > LGTM. Should we wait for people to build with REL_ALLOC=no manually, > > to see if there are any problems, or should we push this right away? > > I doubt whether many more people will build with REL_ALLOC=no. So I think we > should push it into emacs-25. Proposed patch attached. I agree. Anyway, at least one person already tried this build and reported that problems due to relocation are gone. So please push to emacs-25. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:39 ` Eli Zaretskii 2016-10-24 16:54 ` Paul Eggert 2016-10-28 6:18 ` Jérémie Courrèges-Anglas @ 2016-10-28 6:19 ` Jérémie Courrèges-Anglas 2016-10-28 7:40 ` Eli Zaretskii 2 siblings, 1 reply; 375+ messages in thread From: Jérémie Courrèges-Anglas @ 2016-10-28 6:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: [...] >> As I understand it, although the ralloc.c approach worked for a long >> time, it fell out of favor on common platforms and so hasn't been >> debugged as thoroughly for the past several years. Unfortunately, recent >> changes to glibc have caused ralloc.c to be used again on common GNU >> platforms and this is shaking out longstanding bugs with the ralloc.c >> approach. This means people using bleeding-edge glibc are suffering >> problems similar to what people on now-unusual platforms must have had >> for some time. > > Yes, exactly. And since most people at least here use Emacs on > GNU/Linux, the nasty problems due to ralloc.c are popping up much > faster and more frequently than they did when only *BSD and Windows > used ralloc.c. I'm a bit surprised that such issues happen on recent glibc systems. Emacs has been using ralloc on OpenBSD for years, and seems to be pretty stable. Granted, memory corruption bugs can depend on many parameters, but still... -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 6:19 ` Jérémie Courrèges-Anglas @ 2016-10-28 7:40 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 7:40 UTC (permalink / raw) To: Jérémie Courrèges-Anglas; +Cc: eggert, emacs-devel > From: jca@wxcvbn.org (Jérémie Courrèges-Anglas) > Cc: Paul Eggert <eggert@cs.ucla.edu>, emacs-devel@gnu.org > Date: Fri, 28 Oct 2016 08:19:48 +0200 > > > Yes, exactly. And since most people at least here use Emacs on > > GNU/Linux, the nasty problems due to ralloc.c are popping up much > > faster and more frequently than they did when only *BSD and Windows > > used ralloc.c. > > I'm a bit surprised that such issues happen on recent glibc systems. > Emacs has been using ralloc on OpenBSD for years, and seems to be > pretty stable. Granted, memory corruption bugs can depend on many > parameters, but still... I guess your usage patterns side-step the problematic code. E.g., if the resulting memory footprint is stable (i.e. never grows too much too fast), ralloc will not need to relocate buffer text too frequently, so you won't bump into these problems. And some of those problems appeared only recently: e.g., EWW, which triggers the problem when it calls libxml2, is a 25.1 addition. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 12:50 ` Eli Zaretskii 2016-10-23 13:39 ` Stefan Monnier @ 2016-10-23 15:22 ` Andreas Schwab 2016-10-23 15:49 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Andreas Schwab @ 2016-10-23 15:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, monnier, emacs-devel On Okt 23 2016, Eli Zaretskii <eliz@gnu.org> wrote: > Is it reasonable to require a version of glibc that still supports > __malloc_initialize_hook? When people upgrade to a newer glibc, the > previous version is still left on the system, I presume (for programs > that need them which were built against those old versions)? There will ever be only one glibc. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 15:22 ` Andreas Schwab @ 2016-10-23 15:49 ` Eli Zaretskii 2016-10-23 15:57 ` Andreas Schwab 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 15:49 UTC (permalink / raw) To: Andreas Schwab; +Cc: eggert, monnier, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Date: Sun, 23 Oct 2016 17:22:37 +0200 > Cc: Paul Eggert <eggert@cs.ucla.edu>, monnier@iro.umontreal.ca, > emacs-devel@gnu.org > > On Okt 23 2016, Eli Zaretskii <eliz@gnu.org> wrote: > > > Is it reasonable to require a version of glibc that still supports > > __malloc_initialize_hook? When people upgrade to a newer glibc, the > > previous version is still left on the system, I presume (for programs > > that need them which were built against those old versions)? > > There will ever be only one glibc. Sorry, I don't understand what that means. Can't it be that some older program is linked against an older libc.so.N version? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 15:49 ` Eli Zaretskii @ 2016-10-23 15:57 ` Andreas Schwab 2016-10-23 17:06 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Andreas Schwab @ 2016-10-23 15:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel On Okt 23 2016, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Andreas Schwab <schwab@linux-m68k.org> >> Date: Sun, 23 Oct 2016 17:22:37 +0200 >> Cc: Paul Eggert <eggert@cs.ucla.edu>, monnier@iro.umontreal.ca, >> emacs-devel@gnu.org >> >> On Okt 23 2016, Eli Zaretskii <eliz@gnu.org> wrote: >> >> > Is it reasonable to require a version of glibc that still supports >> > __malloc_initialize_hook? When people upgrade to a newer glibc, the >> > previous version is still left on the system, I presume (for programs >> > that need them which were built against those old versions)? >> >> There will ever be only one glibc. > > Sorry, I don't understand what that means. Can't it be that some > older program is linked against an older libc.so.N version? There is no older version. Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 15:57 ` Andreas Schwab @ 2016-10-23 17:06 ` Eli Zaretskii 2016-10-23 20:35 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 17:06 UTC (permalink / raw) To: Andreas Schwab; +Cc: eggert, monnier, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Date: Sun, 23 Oct 2016 17:57:15 +0200 > Cc: eggert@cs.ucla.edu, monnier@iro.umontreal.ca, emacs-devel@gnu.org > > >> > Is it reasonable to require a version of glibc that still supports > >> > __malloc_initialize_hook? When people upgrade to a newer glibc, the > >> > previous version is still left on the system, I presume (for programs > >> > that need them which were built against those old versions)? > >> > >> There will ever be only one glibc. > > > > Sorry, I don't understand what that means. Can't it be that some > > older program is linked against an older libc.so.N version? > > There is no older version. That doesn't really help understanding the issues. I guess it's above my pay grade. (When trying to solve such a grave problem, we need everybody's best expertise and help, and you are one of the best experts on these matters here. Looks like I'm too naïve expecting such cooperation.) ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 17:06 ` Eli Zaretskii @ 2016-10-23 20:35 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 20:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, Andreas Schwab, emacs-devel > That doesn't really help understanding the issues. I guess it's above > my pay grade. IIUC it's been libc.so.6 ever since distributions moved from the "old libc" to glibc-2. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Skipping unexec via a big .elc file (was: When should ralloc.c be used?) 2016-10-23 6:53 ` Eli Zaretskii 2016-10-23 7:57 ` Paul Eggert @ 2016-10-23 16:44 ` Stefan Monnier 2016-10-23 17:34 ` Eli Zaretskii 2016-10-24 18:34 ` Lars Brinkhoff 1 sibling, 2 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 16:44 UTC (permalink / raw) To: emacs-devel >> If someone wants to do that, great. I'd rather spend my own limited >> cycles on fixing the main problem, which is unexec. > I thought we agreed to get rid of unexec by loading a single .elc file > at startup of Emacs, and remove the distinction between temacs and > Emacs altogether. Is that what you'd like to work on? FWIW, I just did a quick experiment with the patch below which dumps the state of Emacs's obarray after loadup.el into a big "dumped.elc" file. Not sure if such an approach could work, but in any case I expect that a working .elc file should likely be of comparable size. The result is a .elc file of 3.3MB which seems reasonable. When I try to load it, tho, I get: % time src/emacs -Q --batch -l dumped.elc -f kill-emacs src/emacs -Q --batch -l dumped.elc -f kill-emacs 3.50s user 0.00s system 99% cpu 3.506 total % And that's with a warm cache on "i3-4170 CPU @ 3.70GHz" (my first, and still only, CPU that goes beyond 3GHz). So even if there might be ways to speed this up, it doesn't look too promising. Stefan
diff --git a/lisp/loadup.el b/lisp/loadup.el
index 5c16464..dddd71f 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -474,6 +474,65 @@
                                               invocation-directory)
                             (expand-file-name name invocation-directory)
                             t)))
+      (message "Dumping into dumped.elc...preparing...")
+
+      ;; Dump the current state into a file so we can reload it!
+      (with-current-buffer (generate-new-buffer "dumped.elc")
+        (message "Dumping into dumped.elc...generating...")
+        (insert ";ELC\^W\^@\^@\^@\n;;; Compiled\n;;; in Emacs version " emacs-version "\n")
+        (let ((cmds '()))
+          (setcdr global-buffers-menu-map nil) ;; Get rid of buffer objects!
+          (mapatoms
+           (lambda (s)
+             (when (and (fboundp s)
+                        (not (subrp (symbol-function s)))
+                        ;; FIXME: We need these, but they contain
+                        ;; unprintable objects.
+                        (not (memq s '(rename-buffer))))
+               (push `(fset ',s ,(macroexp-quote (symbol-function s))) cmds))
+             (when (and (boundp s) (not (keywordp s))
+                        (not (memq s '(nil t
+                                       ;; I think we don't need these!
+                                       terminal-frame
+                                       ;; FIXME: We need these, but they contain
+                                       ;; unprintable objects.
+                                       advertised-signature-table
+                                       undo-auto--undoably-changed-buffers))))
+               ;; FIXME: Don't record in the load-history!
+               ;; FIXME: Handle varaliases!
+               (let ((v (symbol-value s)))
+                 (push `(defvar ,s
+                          ,(cond
+                            ((subrp v)
+                             `(symbol-function ',(intern (subr-name v))))
+                            ((and (markerp v) (null (marker-buffer v)))
+                             '(make-marker))
+                            ((and (overlayp v) (null (overlay-buffer v)))
+                             '(let ((ol (make-overlay (point-min) (point-min))))
+                                (delete-overlay ol)
+                                ol))
+                            (v (macroexp-quote v))))
+                       cmds)))
+             (when (symbol-plist s)
+               (push `(setplist ',s ',(symbol-plist s)) cmds))))
+          (message "Dumping into dumped.elc...printing...")
+          (let ((print-circle t)
+                (print-gensym t)
+                (print-quoted t)
+                (print-level nil)
+                (print-length nil)
+                (print-escape-newlines t))
+            (print `(progn . ,cmds) (current-buffer)))
+          (goto-char (point-min))
+          (while (re-search-forward " (\\(defvar\\|setplist\\|fset\\) " nil t)
+            (goto-char (match-beginning 0))
+            (delete-char 1) (insert "\n"))
+          (message "Dumping into dumped.elc...saving...")
+          (let ((coding-system-for-write 'emacs-internal))
+            (write-region (point-min) (point-max) (buffer-name)))
+          (message "Dumping into dumped.elc...done")
+          ))
+
       (kill-emacs)))
 
 ;; For machines with CANNOT_DUMP defined in config.h,
^ permalink raw reply related [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file (was: When should ralloc.c be used?) 2016-10-23 16:44 ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier @ 2016-10-23 17:34 ` Eli Zaretskii 2016-10-23 20:27 ` Skipping unexec via a big .elc file Stefan Monnier 2016-10-24 1:07 ` Stefan Monnier 2016-10-24 18:34 ` Lars Brinkhoff 1 sibling, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 17:34 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 12:44:33 -0400 > > >> If someone wants to do that, great. I'd rather spend my own limited > >> cycles on fixing the main problem, which is unexec. > > I thought we agreed to get rid of unexec by loading a single .elc file > > at startup of Emacs, and remove the distinction between temacs and > > Emacs altogether. Is that what you'd like to work on? > > FWIW, I just did a quick experiment with the patch below which dumps the > state of Emacs's obarray after loadup.el into a big "dumped.elc" file. > Not sure if such an approach could work, but in any case I expect that > a working .elc file should likely be of comparable size. > > The result is a .elc file of 3.3MB which seems reasonable. > When I try to load it, tho, I get: > > % time src/emacs -Q --batch -l dumped.elc -f kill-emacs > src/emacs -Q --batch -l dumped.elc -f kill-emacs 3.50s user 0.00s system 99% cpu 3.506 total > % > > And that's with a warm cache on "i3-4170 CPU @ 3.70GHz" (my first, and > still only, CPU that goes beyond 3GHz). > > So even if there might be ways to speed this up, it doesn't look > too promising. That sounds strangely long, as I got less than 2 sec with all the preloaded *.elc files concatenated to a single file, and that's before I made pure-copy a no-op. Another report was that "loadup" with pure-copy short-circuited took less than 0.5 sec. 
See https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html Was your Emacs an optimized build? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-23 17:34 ` Eli Zaretskii @ 2016-10-23 20:27 ` Stefan Monnier 2016-10-24 6:22 ` Eli Zaretskii 2016-10-24 1:07 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 20:27 UTC (permalink / raw) To: emacs-devel > https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html > Was your Emacs an optimized build? I tried it with my local build of master (not optimized) as well as with the "emacs24" executable provided by Debian. Both times were comparable. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-23 20:27 ` Skipping unexec via a big .elc file Stefan Monnier @ 2016-10-24 6:22 ` Eli Zaretskii 2016-10-24 12:47 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 6:22 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 16:27:48 -0400 > > > https://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html > > Was your Emacs an optimized build? > > I tried it with my local build of master (not optimized) as well as with > the "emacs24" executable provided by Debian. Both times were comparable. An unoptimized Emacs runs about 3 times slower, so I cannot explain your comparable results with both versions, it doesn't match any of my experiences. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 6:22 ` Eli Zaretskii @ 2016-10-24 12:47 ` Stefan Monnier 2016-10-24 13:08 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 12:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel >> I tried it with my local build of master (not optimized) as well as with >> the "emacs24" executable provided by Debian. Both times were comparable. > An unoptimized Emacs runs about 3 times slower, In my experience it's much less drastic, unless you include enable_checking and such in "unoptimized". > so I cannot explain your comparable results with both versions, it > doesn't match any of my experiences. The way I explained it to myself is that the lread.c code is much less affected (e.g. it should almost be unaffected by enable_checking). BTW, have you tried my experiment on your side? Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 12:47 ` Stefan Monnier @ 2016-10-24 13:08 ` Eli Zaretskii 2016-10-24 14:15 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 13:08 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 08:47:49 -0400 > > > so I cannot explain your comparable results with both versions, it > > doesn't match any of my experiences. > > The way I explained it to myself is that the lread.c code is much > less affected (e.g. it should almost be unaffected by enable_checking). Reading Lisp involves a lot of CPU-intensive processing. > BTW, have you tried my experiment on your side? No. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 13:08 ` Eli Zaretskii @ 2016-10-24 14:15 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 14:15 UTC (permalink / raw) To: emacs-devel >> The way I explained it to myself is that the lread.c code is much >> less affected (e.g. it should almost be unaffected by enable_checking). > Reading Lisp involves a lot of CPU-intensive processing. Yes, but it's a different kind of code, so it may be affected differently. In any case, I have no concrete data to back up this intuition and I don't believe it very strongly either. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-23 17:34 ` Eli Zaretskii 2016-10-23 20:27 ` Skipping unexec via a big .elc file Stefan Monnier @ 2016-10-24 1:07 ` Stefan Monnier 2016-10-24 6:39 ` Eli Zaretskii 2016-10-24 9:40 ` Ken Raeburn 1 sibling, 2 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 1:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > That sounds strangely long, as I got less than 2 sec with all the > preloaded *.elc files concatenated to a single file, and that's before > I made pure-copy a no-op. > Another report was that "loadup" with pure-copy short-circuited took > less than 0.5 sec. See Hmm... indeed, I got to 0.72s with his patch (on a different, slower machine (a Thinkpad X201s, i.e. with a i7 CPU L620 @ 2.00GHz)). If I re-add international/characters it goes up a bit to 0.96s, but still nowhere near the 3s I got on my big .elc file. [ I wonder what makes loading my big file so slow. ] This said, there's still a factor 5-10 to get to "immediate", tho. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 1:07 ` Stefan Monnier @ 2016-10-24 6:39 ` Eli Zaretskii 2016-10-24 6:47 ` Lars Ingebrigtsen 2016-10-24 13:04 ` Stefan Monnier 2016-10-24 9:40 ` Ken Raeburn 1 sibling, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 6:39 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: emacs-devel@gnu.org > Date: Sun, 23 Oct 2016 21:07:47 -0400 > > > That sounds strangely long, as I got less than 2 sec with all the > > preloaded *.elc files concatenated to a single file, and that's before > > I made pure-copy a no-op. > > Another report was that "loadup" with pure-copy short-circuited took > > less than 0.5 sec. See > > Hmm... indeed, I got to 0.72s with his patch (on a different, slower > machine (a Thinkpad X201s, i.e. with a i7 CPU L620 @ 2.00GHz)). > > If I re-add international/characters it goes up a bit to > 0.96s, but still nowhere near the 3s I got on my big .elc file. > [ I wonder what makes loading my big file so slow. ] > > This said, there's still a factor 5-10 to get to "immediate", tho. A small price to pay for the advantages, IMO. The most important advantage in my view is that the dumping/loading process becomes very simple and understandable even by people with minimal knowledge of C subtleties and Emacs internals, let alone development tools like the assembler and the linker. This would make future maintenance much more robust and reliable, and also allow more contributors to work on improving, speeding up, and extending the build process. The alternatives all require us to depend on a dwindling handful of people, which is a huge disadvantage in the long run. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 6:39 ` Eli Zaretskii @ 2016-10-24 6:47 ` Lars Ingebrigtsen 2016-10-24 7:17 ` Eli Zaretskii 2016-10-24 13:04 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Lars Ingebrigtsen @ 2016-10-24 6:47 UTC (permalink / raw) To: emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> If I re-add international/characters it goes up a bit to >> 0.96s, but still nowhere near the 3s I got on my big .elc file. >> [ I wonder what makes loading my big file so slow. ] >> >> This said, there's still a factor 5-10 to get to "immediate", tho. > > A small price to pay for the advantages, IMO. I think a one second startup time for "emacs -Q -nw" on a fast machine sounds pretty horrific, myself. Not all people live in Emacs, but instead start and stop Emacs to do small edits to files. It would also make using Emacs on slower, smaller mobile devices an unsatisfying experience. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 6:47 ` Lars Ingebrigtsen @ 2016-10-24 7:17 ` Eli Zaretskii 2016-10-24 8:24 ` Andreas Schwab 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 7:17 UTC (permalink / raw) To: Lars Ingebrigtsen; +Cc: emacs-devel > From: Lars Ingebrigtsen <larsi@gnus.org> > Date: Mon, 24 Oct 2016 08:47:39 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > >> If I re-add international/characters it goes up a bit to > >> 0.96s, but still nowhere near the 3s I got on my big .elc file. > >> [ I wonder what makes loading my big file so slow. ] > >> > >> This said, there's still a factor 5-10 to get to "immediate", tho. > > > > A small price to pay for the advantages, IMO. > > I think a one second startup time for "emacs -Q -nw" on a fast machine > sounds pretty horrific, myself. We are not talking about 1 sec, we are talking about less than half that time, potentially even 1/4th of a second. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 7:17 ` Eli Zaretskii @ 2016-10-24 8:24 ` Andreas Schwab 2016-10-24 8:41 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Andreas Schwab @ 2016-10-24 8:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Lars Ingebrigtsen, emacs-devel On Okt 24 2016, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Lars Ingebrigtsen <larsi@gnus.org> >> Date: Mon, 24 Oct 2016 08:47:39 +0200 >> >> Eli Zaretskii <eliz@gnu.org> writes: >> >> >> If I re-add international/characters it goes up a bit to >> >> 0.96s, but still nowhere near the 3s I got on my big .elc file. >> >> [ I wonder what makes loading my big file so slow. ] >> >> >> >> This said, there's still a factor 5-10 to get to "immediate", tho. >> > >> > A small price to pay for the advantages, IMO. >> >> I think a one second startup time for "emacs -Q -nw" on a fast machine >> sounds pretty horrific, myself. > > We are not talking about 1 sec, we are talking about less than half > that time, potentially even 1/4th of a second. That's still a lot. $ time emacs --batch --eval t 0.027user 0.011system 0m0.048selapsed 79.66%CPU Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different." ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 8:24 ` Andreas Schwab @ 2016-10-24 8:41 ` Eli Zaretskii 2016-10-24 9:47 ` Daniel Colascione 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 8:41 UTC (permalink / raw) To: Andreas Schwab; +Cc: larsi, emacs-devel > From: Andreas Schwab <schwab@suse.de> > Cc: Lars Ingebrigtsen <larsi@gnus.org>, emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 10:24:26 +0200 > > > We are not talking about 1 sec, we are talking about less than half > > that time, potentially even 1/4th of a second. > > That's still a lot. > > $ time emacs --batch --eval t > 0.027user 0.011system 0m0.048selapsed 79.66%CPU Then I guess you will have to continue using unexec, and when that alternative disappears, switch to some other editor. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 8:41 ` Eli Zaretskii @ 2016-10-24 9:47 ` Daniel Colascione 2016-10-24 10:00 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-24 9:47 UTC (permalink / raw) To: Eli Zaretskii, Andreas Schwab; +Cc: larsi, emacs-devel On 10/24/2016 01:41 AM, Eli Zaretskii wrote: >> From: Andreas Schwab <schwab@suse.de> >> Cc: Lars Ingebrigtsen <larsi@gnus.org>, emacs-devel@gnu.org >> Date: Mon, 24 Oct 2016 10:24:26 +0200 >> >>> We are not talking about 1 sec, we are talking about less than half >>> that time, potentially even 1/4th of a second. >> >> That's still a lot. >> >> $ time emacs --batch --eval t >> 0.027user 0.011system 0m0.048selapsed 79.66%CPU > > Then I guess you will have to continue using unexec, and when that > alternative disappears, switch to some other editor. > I have lots of scripts that run using emacs -Q --batch; many are invoked frequently in other scripts. Making each take 250ms instead of 27ms to run will greatly increase the overall runtime of the high-level operations. I don't see a need to regress performance here, since a custom malloc will perform at least as well as the last glibc malloc that supported unexec (since it could in principle be a literal copy of that code), and we found the performance of that malloc acceptable. I care _much_ more about runtime performance than I do about allocation throughput once started. ^ permalink raw reply [flat|nested] 375+ messages in thread
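The concern about frequently invoked batch scripts can be put into rough numbers. The sketch below takes the 27 ms and 250 ms startup figures from the messages above; the count of 1000 invocations is purely an assumption for illustration, and extra_startup_ms is a hypothetical helper, not an Emacs tool.

```sh
#!/bin/sh
# Hypothetical helper: extra time (in ms) spent on startup when each of
# COUNT batch invocations takes NEW_MS instead of OLD_MS to start.
# usage: extra_startup_ms COUNT OLD_MS NEW_MS
extra_startup_ms () {
  echo $(( $1 * ($3 - $2) ))
}

# Assumed workload: 1000 invocations of "emacs -Q --batch".
extra_startup_ms 1000 27 250   # prints 223000, i.e. about 223 seconds
```

Whether such an overhead matters depends on how many invocations a given workflow really makes, which the thread does not quantify.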
* Re: Skipping unexec via a big .elc file 2016-10-24 9:47 ` Daniel Colascione @ 2016-10-24 10:00 ` Eli Zaretskii 2016-10-24 10:03 ` Daniel Colascione 0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 10:00 UTC
To: Daniel Colascione; +Cc: schwab, larsi, emacs-devel

> Cc: larsi@gnus.org, emacs-devel@gnu.org
> From: Daniel Colascione <dancol@dancol.org>
> Date: Mon, 24 Oct 2016 02:47:03 -0700
>
> >> $ time emacs --batch --eval t
> >> 0.027user 0.011system 0m0.048selapsed 79.66%CPU
> >
> > Then I guess you will have to continue using unexec, and when that
> > alternative disappears, switch to some other editor.
> >
>
> I have lots of scripts that run using emacs -Q --batch; many are invoked
> frequently in other scripts. Making each take 250ms instead of 27ms to
> run will greatly increase the overall runtime of the high-level
> operations.

Maybe --batch won't need to load all of the elc code, maybe we could
have a smaller batch.elc for that. Or maybe what Ken just wrote will
bring the load time below 100 ms, who knows.

IOW, I think we are arguing prematurely about something whose
performance we don't really understand, haven't measured yet, and
haven't even written yet. Doesn't sound like a good idea.

> I don't see a need to regress performance here, since a
> custom malloc will perform at least as well as the last glibc malloc
> that supported unexec (since it could in principle be a literal copy of
> that code), and we found the performance of that malloc acceptable. I
> care _much_ more about runtime performance than I do about allocation
> throughput once started.

The desire to drop unexec is not just because of malloc, it's because
advances in compilers, linkers, and system security make maintenance
of unexec harder and harder. For example, unexec is incompatible with
address sanitization and other similar security techniques. It also
regularly breaks when some new section is invented by the linker.
Etc. etc.
Therefore, we already decided to move towards eliminating unexec, and the only issue we should discuss is how to do that. You are in fact suggesting to overturn that decision, which I don't think people will agree with. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 10:00 ` Eli Zaretskii @ 2016-10-24 10:03 ` Daniel Colascione 2016-10-24 10:18 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-24 10:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: schwab, larsi, emacs-devel On 10/24/2016 03:00 AM, Eli Zaretskii wrote: >> Cc: larsi@gnus.org, emacs-devel@gnu.org >> From: Daniel Colascione <dancol@dancol.org> >> Date: Mon, 24 Oct 2016 02:47:03 -0700 >> >>>> $ time emacs --batch --eval t >>>> 0.027user 0.011system 0m0.048selapsed 79.66%CPU >>> >>> Then I guess you will have to continue using unexec, and when that >>> alternative disappears, switch to some other editor. >>> >> >> I have lots of scripts that run using emacs -Q --batch; many are invoked >> frequently in other scripts. Making each take 250ms instead of 27ms to >> run will greatly increase the overall runtime of the high-level >> operations. > > Maybe --batch won't need to load all of the elc code, maybe we could > have a smaller batch.elc for that. Or maybe what Ken just wrote will > bring the load time below 100 ms, who knows. > > IOW, I think we are arguing prematurely about something whose > performance we don't really understand, haven't measured yet, and > haven't even written yet. Doesn't sound like a good idea. > >> I don't see a need to regress performance here, since a >> custom malloc will perform at least as well as the last glibc malloc >> that supported unexec (since it could in principle be a literal copy of >> that code), and we found the performance of that malloc acceptable. I >> care _much_ more about runtime performance than I do about allocation >> throughput once started. > > The desire to drop unexec is not just because of malloc, it's because > advances in compilers, linkers, and system security make maintenance > of unexec harder and harder. For example, unexec is incompatible with > address sanitation and other similar security techniques. 
It also > regularly breaks when some new section is invented by the linker. > Etc. etc. > > Therefore, we already decided to move towards eliminating unexec, and > the only issue we should discuss is how to do that. You are in fact > suggesting to overturn that decision, which I don't think people will > agree with. Sure. I'd like to have a PIE Emacs myself. We're talking about methods. I don't think the XEmacs-style "portable dumper" approach, with relocations, has been given adequate consideration. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 10:03 ` Daniel Colascione @ 2016-10-24 10:18 ` Eli Zaretskii 2016-10-24 10:28 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 10:18 UTC (permalink / raw) To: Daniel Colascione; +Cc: schwab, larsi, emacs-devel > Cc: schwab@suse.de, larsi@gnus.org, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Mon, 24 Oct 2016 03:03:37 -0700 > > I don't think the XEmacs-style "portable dumper" approach, with > relocations, has been given adequate consideration. I think everyone agrees, which is why that approach is not being considered. But loading the "pre-loaded" *.elc files as quickly as possible is IMO an attractive approach, because it's very simple and doesn't require knowing too much about unrelated issues. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 10:18 ` Eli Zaretskii @ 2016-10-24 10:28 ` Philipp Stephani 2016-10-24 10:51 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2016-10-24 10:28 UTC
To: Eli Zaretskii, Daniel Colascione; +Cc: schwab, larsi, emacs-devel

Eli Zaretskii <eliz@gnu.org> wrote on Mon, 24 Oct 2016 at 12:19:

> > Cc: schwab@suse.de, larsi@gnus.org, emacs-devel@gnu.org
> > From: Daniel Colascione <dancol@dancol.org>
> > Date: Mon, 24 Oct 2016 03:03:37 -0700
> >
> > I don't think the XEmacs-style "portable dumper" approach, with
> > relocations, has been given adequate consideration.
>
> I think everyone agrees, which is why that approach is not being
> considered.
>
> But loading the "pre-loaded" *.elc files as quickly as possible is IMO
> an attractive approach, because it's very simple and doesn't require
> knowing too much about unrelated issues.
>

I agree, we should strive for simplicity first and performance later.
I'd suggest using the pre-loaded .elc approach in master and working on
a faster (but still portable) replacement later, when the need arises.
Switching to a portable dumper now means we can cut out lots of code and
workarounds, which is a significant win.
* Re: Skipping unexec via a big .elc file 2016-10-24 10:28 ` Philipp Stephani @ 2016-10-24 10:51 ` Eli Zaretskii 2016-10-24 13:52 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 10:51 UTC
To: Philipp Stephani; +Cc: schwab, larsi, dancol, emacs-devel

> From: Philipp Stephani <p.stephani2@gmail.com>
> Date: Mon, 24 Oct 2016 10:28:06 +0000
> Cc: schwab@suse.de, larsi@gnus.org, emacs-devel@gnu.org
>
> But loading the "pre-loaded" *.elc files as quickly as possible is IMO
> an attractive approach, because it's very simple and doesn't require
> knowing too much about unrelated issues.
>
> I agree, we should strive for simplicity first and performance later. I'd suggest to use the pre-loaded .elc
> approach in master and work on a faster (but still portable) replacement later, when the need arises.

I agree: we should make it work right first, and speed it up later.
There's a lot of room for speed improvement, some of the ideas were
already voiced here.

If someone wants to work on this, I think some of the stuff that
should be done is this:

  . Implement a command that writes a given list of *.elc files into a
    single file.
  . Make the C code that today runs at dump time and records various
    build-related variables, such as source-directory and
    system-configuration-features, record the values in a Lisp file
    (eventually will be the same .elc file that is loaded at startup).

(I'm sure there are more items in that list, but I didn't think long
enough to come up with more.)

Volunteers are welcome.
* Re: Skipping unexec via a big .elc file 2016-10-24 10:51 ` Eli Zaretskii @ 2016-10-24 13:52 ` Stefan Monnier 2016-10-24 16:04 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 13:52 UTC (permalink / raw) To: emacs-devel > . Implement a command that writes a given list of *.elc files into a > single file. > . Make the C code that today runs at dump time and records various > build-related variables, such as source-directory and > system-configuration-features, record the values in a Lisp file > (eventually will be the same .elc file that is loaded at startup). BTW, my dumped.elc attempt was specifically trying to solve these issues: by dumping the state of the obarray, we automatically get these vars set like we want them. It also solves other side-issues such as making sure that `C-h f dolist RET' points to "subr.el" rather than to "dumped.elc". Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 13:52 ` Stefan Monnier @ 2016-10-24 16:04 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 16:04 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Mon, 24 Oct 2016 09:52:54 -0400 > > > . Implement a command that writes a given list of *.elc files into a > > single file. > > > . Make the C code that today runs at dump time and records various > > build-related variables, such as source-directory and > > system-configuration-features, record the values in a Lisp file > > (eventually will be the same .elc file that is loaded at startup). > > BTW, my dumped.elc attempt was specifically trying to solve these > issues: by dumping the state of the obarray, we automatically get these > vars set like we want them. It also solves other side-issues such as > making sure that `C-h f dolist RET' points to "subr.el" rather than to > "dumped.elc". I consider tinkering with obarray's internals still too "advanced" to prefer it to a simple generation of Lisp code that records values of a few variables. (And are you sure all of the information we record at dump time is in obarray? I am not.) The number of these variables is not large, so finding them in the sources will not be hard. ^ permalink raw reply [flat|nested] 375+ messages in thread
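To make the "simple generation of Lisp code that records values of a few variables" idea concrete, here is a minimal C sketch. It is not taken from the Emacs sources: format_setq is a hypothetical helper, and it assumes the recorded value is a NUL-terminated string in which only the double quote and the backslash need escaping for the Lisp reader.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not an Emacs function: render
   (setq NAME "VALUE") into OUT, escaping '"' and '\\' in VALUE.
   Returns 0 on success, -1 if OUT is too small.  */
static int
format_setq (char *out, size_t out_size, const char *name, const char *value)
{
  int len = snprintf (out, out_size, "(setq %s \"", name);
  if (len < 0 || (size_t) len >= out_size)
    return -1;
  size_t n = (size_t) len;

  for (const char *p = value; *p; p++)
    {
      if (*p == '"' || *p == '\\')
        {
          /* Keep one byte in reserve for the terminator.  */
          if (n + 1 >= out_size)
            return -1;
          out[n++] = '\\';
        }
      if (n + 1 >= out_size)
        return -1;
      out[n++] = *p;
    }

  /* Room for the closing quote, the closing paren, and the NUL.  */
  if (n + 3 > out_size)
    return -1;
  memcpy (out + n, "\")", 3);
  return 0;
}
```

A build step could call such a helper once per variable (source-directory, system-configuration-features, and so on) and append the resulting forms to the Lisp file that is loaded at startup.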
* Re: Skipping unexec via a big .elc file 2016-10-24 6:39 ` Eli Zaretskii 2016-10-24 6:47 ` Lars Ingebrigtsen @ 2016-10-24 13:04 ` Stefan Monnier 2016-10-24 13:35 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 13:04 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > A small price to pay for the advantages, IMO. I think some users will run away screaming if Emacs takes a whole second to start up. > The most important advantage in my view is that the dumping/loading > process becomes very simple and understandable even by people with > minimal knowledge of C subtleties and Emacs internals, Yes, the benefits are clear, but the cost is pretty steep. I think we could live with a 0.2s startup time, but that's already a pretty high cost: - 0.2s feels sluggish when you expect "immediate". - byte-compilation has historically moved from "do it in a single session", to "start a separate Emacs session for each file" for good reasons. A 0.2s startup time imposes either a much slower byte-compilation, or will compel us to go back to "do it all in a single session". > This would make future maintenance much more robust and reliable, and > also allow more contributors to work on improving, speeding up, and > extending the build process. The alternatives all require us to > depend on a dwindling handful of people, which is a huge disadvantage > in the long run. Maybe there's indeed a lot of speed up still waiting there, and by reducing loading time of .elc files (and/or allowing more laziness there) we could bring down the 0.96s to 0.2s *and* speed up other uses at the same time. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 13:04 ` Stefan Monnier @ 2016-10-24 13:35 ` Eli Zaretskii 2016-10-24 14:45 ` Daniel Colascione 2016-10-25 22:46 ` Perry E. Metzger 0 siblings, 2 replies; 375+ messages in thread
From: Eli Zaretskii @ 2016-10-24 13:35 UTC
To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Mon, 24 Oct 2016 09:04:29 -0400
>
> > A small price to pay for the advantages, IMO.
>
> I think some users will run away screaming if Emacs takes a whole second
> to start up.

It depends. If those users, like me, have hundreds of buffers in
their sessions, and use desktop.el to recreate their sessions, they
already wait a few seconds for that.

And I don't expect the result to be 1 sec, that's a rounded-up
value that is already higher than what I saw.

> > The most important advantage in my view is that the dumping/loading
> > process becomes very simple and understandable even by people with
> > minimal knowledge of C subtleties and Emacs internals,
>
> Yes, the benefits are clear, but the cost is pretty steep.

We will have to speed this up, of course. You didn't expect tossing
unexec to be an easy job, did you?

> I think we could live with a 0.2s startup time, but that's already
> a pretty high cost:
> - 0.2s feels sluggish when you expect "immediate".
> - byte-compilation has historically moved from "do it in a single
> session", to "start a separate Emacs session for each file" for good
> reasons. A 0.2s startup time imposes either a much slower
> byte-compilation, or will compel us to go back to "do it all in
> a single session".

I think you forget parallelism. We have been building Emacs with
several compilations running in parallel for a long time. And
byte-compiling a typical file already takes more than 0.2 sec,
sometimes (often?) significantly more, so I don't see a catastrophe yet.
> > This would make future maintenance much more robust and reliable, and > > also allow more contributors to work on improving, speeding up, and > > extending the build process. The alternatives all require us to > > depend on a dwindling handful of people, which is a huge disadvantage > > in the long run. > > Maybe there's indeed a lot of speed up still waiting there, and by > reducing loading time of .elc files (and/or allowing more laziness there) > we could bring down the 0.96s to 0.2s *and* speed up other uses at the > same time. That's my hope, yes. E.g., maybe reading the startup.elc file could run in another thread? In any case, I don't think it's right to throw out this idea without trying very hard to make it work, because the benefits are so clear. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 13:35 ` Eli Zaretskii @ 2016-10-24 14:45 ` Daniel Colascione 2016-10-24 15:58 ` Eli Zaretskii 2016-10-25 22:46 ` Perry E. Metzger 1 sibling, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-24 14:45 UTC (permalink / raw) To: Eli Zaretskii, Stefan Monnier; +Cc: emacs-devel On 10/24/2016 06:35 AM, Eli Zaretskii wrote: >> From: Stefan Monnier <monnier@iro.umontreal.ca> >> Cc: emacs-devel@gnu.org >> Date: Mon, 24 Oct 2016 09:04:29 -0400 >> >>> A small price to pay for the advantages, IMO. >> >> I think some users will run away screaming if Emacs takes a whole second >> to start up. > > It depends. If those users, like me, have hundreds of buffers in > their sessions, and use desktop.el to recreate their sessions, they > already wait a few seconds for that. > > And I don't expect the result to be 1 sec, that's is a rounded up > value that is already higher than what I saw. > >>> The most important advantage in my view is that the dumping/loading >>> process becomes very simple and understandable even by people with >>> minimal knowledge of C subtleties and Emacs internals, >> >> Yes, the benefits are clear, but the cost is pretty steep. > > We will have to speed this up, of course. You didn't expect tossing > unexec to be an easy job, did you? > >> I think we could live with a 0.2s startup time, but that's already >> a pretty high cost: >> - 0.2s feels sluggish when you expect "immediate". >> - byte-compilation has historically moved from "do it in a single >> session", to "start a separate Emacs session for each file" for good >> reasons. A 0.2s startup time imposes either a much slower >> byte-compilation, or will compel us to go back to "do it all in >> a single session". > > I think you forget parallelism. We build Emacs with several > compilations running in parallel for a long time. And byte-compiling > a typical file already takes more than 0.2 sec, sometimes (often?) 
> significantly more, so I don't see a catastrophe yet. > >>> This would make future maintenance much more robust and reliable, and >>> also allow more contributors to work on improving, speeding up, and >>> extending the build process. The alternatives all require us to >>> depend on a dwindling handful of people, which is a huge disadvantage >>> in the long run. >> >> Maybe there's indeed a lot of speed up still waiting there, and by >> reducing loading time of .elc files (and/or allowing more laziness there) >> we could bring down the 0.96s to 0.2s *and* speed up other uses at the >> same time. > > That's my hope, yes. E.g., maybe reading the startup.elc file could > run in another thread? > > In any case, I don't think it's right to throw out this idea without > trying very hard to make it work, because the benefits are so clear. I'm worried that it'll be deemed to "work" at a level of performance much worse than what we have today. My preference would be to keep hammering on this approach and others until we find something with only minimal performance regressions. I don't see the unexec maintenance situation being desperate enough that we need to accept a big performance loss. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 14:45 ` Daniel Colascione @ 2016-10-24 15:58 ` Eli Zaretskii 2016-10-24 16:17 ` Daniel Colascione 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 15:58 UTC (permalink / raw) To: Daniel Colascione; +Cc: monnier, emacs-devel > Cc: emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Mon, 24 Oct 2016 07:45:17 -0700 > > > In any case, I don't think it's right to throw out this idea without > > trying very hard to make it work, because the benefits are so clear. > > I'm worried that it'll be deemed to "work" at a level of performance > much worse than what we have today. Why would you worry that it'll be accepted then more easily than it's accepted now? The same arguments will be voiced in the future if the solution's performance turns out to be insufficient. > I don't see the unexec maintenance situation being desperate enough > that we need to accept a big performance loss. I very much disagree with this: the unexec maintenance situation is actually so fragile that it could break at any moment, in the sense that we could very easily get into having no people on board who know enough about unexec to solve the next problem that will break it. The number of people who do know gets smaller and smaller with each year. That is not healthy at all for the future of the project. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 15:58 ` Eli Zaretskii @ 2016-10-24 16:17 ` Daniel Colascione 2016-10-24 16:51 ` Philipp Stephani 2016-10-24 16:52 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread
From: Daniel Colascione @ 2016-10-24 16:17 UTC
To: Eli Zaretskii; +Cc: monnier, emacs-devel

On 10/24/2016 08:58 AM, Eli Zaretskii wrote:
>> Cc: emacs-devel@gnu.org
>> From: Daniel Colascione <dancol@dancol.org>
>> Date: Mon, 24 Oct 2016 07:45:17 -0700
>>
>>> In any case, I don't think it's right to throw out this idea without
>>> trying very hard to make it work, because the benefits are so clear.
>>
>> I'm worried that it'll be deemed to "work" at a level of performance
>> much worse than what we have today.
>
> Why would you worry that it'll be accepted then more easily than it's
> accepted now? The same arguments will be voiced in the future if the
> solution's performance turns out to be insufficient.
>
>> I don't see the unexec maintenance situation being desperate enough
>> that we need to accept a big performance loss.
>
> I very much disagree with this: the unexec maintenance situation is
> actually so fragile that it could break at any moment, in the sense
> that we could very easily get into having no people on board who know
> enough about unexec to solve the next problem that will break it. The
> number of people who do know gets smaller and smaller with each year.
> That is not healthy at all for the future of the project.

In both this discussion and the one about insdel, you've expressed the
sentiment that we need to optimize for a world in which very few people
have time to maintain Emacs internals. I have a more optimistic view:
people are generally good at figuring things out, and if learning about
unexec or other esoteric facilities is what prevents a developer from
porting Emacs to a new platform or fixing an important bug, that
developer will put time into learning about these mechanisms.
That is, we *could* get into a situation where "no people on board [] know enough about unexec to solve the next problem", but that situation will resolve itself when people learn about unexec. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 16:17 ` Daniel Colascione @ 2016-10-24 16:51 ` Philipp Stephani 2016-10-24 19:47 ` Daniel Colascione 2016-10-24 16:52 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread
From: Philipp Stephani @ 2016-10-24 16:51 UTC
To: Daniel Colascione, Eli Zaretskii; +Cc: monnier, emacs-devel

Daniel Colascione <dancol@dancol.org> wrote on Mon, 24 Oct 2016 at 18:35:

> That is, we *could* get into a situation where "no people on board []
> know enough about unexec to solve the next problem"

I'd argue that we are already in this situation. For example, nobody
knows how to make unexec work with ASLR or PIE; when I tried fuzzing
Emacs with AFL, the dumped binary would simply crash; the dumped binary
is not reproducible (i.e. bit-by-bit identical after every build); and I
think dumping also doesn't work with ASan. The fraction of situations
where unexec no longer works gets larger and larger. If we had people
who could solve these problems, it should get smaller instead.
* Re: Skipping unexec via a big .elc file 2016-10-24 16:51 ` Philipp Stephani @ 2016-10-24 19:47 ` Daniel Colascione 2016-10-25 15:59 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-24 19:47 UTC (permalink / raw) To: Philipp Stephani; +Cc: Eli Zaretskii, monnier, emacs-devel Philipp Stephani <p.stephani2@gmail.com> writes: > Daniel Colascione <dancol@dancol.org> schrieb am Mo., 24. Okt. 2016 um 18:35 Uhr: > > That is, we *could* get into a situation where "no people on board [] > know enough about unexec to solve the next problem" > > I'd argue that we are already in this situation. For example, nobody > knows how to make unexec work with ASLR or PIE; when I tried fuzzing > Emacs with AFL, the dumped binary would simply crash; the dumped > binary is not reproducible (i.e. bit-by-bit identical after every > build); and I think dumping also doesn't work with ASan. The fraction > of situation where unexec doesn't work any more gets larger and > larger. If we had people who could solve these problems, it should get > smaller instead. It's not a matter of "not knowing" how to make unexec work with PIE and PIC code generally --- the problem is that the naive approach currently used for serializing program state depends on the process address state being reproducible: we don't specially mark pointers in the saved image, so we can't relocate them. There have been numerous discussions on emacs-devel about relocation schemes, with proposals ranging from just making elc faster to translating elisp to C. Everyone who's seriously thought about the unexec problem _understands_ the issue. unexec isn't black magic. Getting rid of the current scheme is a matter of finding the right relocation scheme (which for all I know might as well be "make elc better") and finding the time to implement it. 
My preferred approach is the portable dumper one: basically what we're doing today, except that instead of just blindly copying the data segment and heap to a new emacs binary, we'll write this information to a separate file, stored in a portable format, a file that we'll keep alongside the Emacs binary. We'll store in this file metadata about where the pointers are. (There are two kinds of pointers in this file: pointers to other parts of the file and pointers to the Emacs binary.) At startup, we'll load the dump file and walk the relocations, fixing up all the embedded addresses to account for the new process's different address space. There's no binary other than the one that the compiler generates; this data file is just data, so ASLR, ASAN, and other clever things should work fine. (Some people have proposed asking the system dynamic linker to do the relocating, but I'd prefer to do it ourselves, in a portable way.) We can't save all of the Emacs data segment this way, but we can relocate and restore anything that's marked with staticpro. The overall experience should be very similar to what we have today. Additionally, the purespace concept remains useful: if we take pure storage and put it in its own region of the dump file, we don't need to take copy-on-write faults for data that cannot contain pointers. Speaking of COW faults: a refinement of this scheme is to do the relocations lazily, in a SIGSEGV handler. (Map the dump file PROT_NONE so any access traps.) In the SIGSEGV handler, we can relocate just the page we faulted, then continue. This way, we don't need to slurp in the entire dump file from disk just to start emacs -Q -batch: we can demand-page! Whether this refinement is worth the trouble is something only experimentation can tell, but it's an option if we need it. With this refinement, the portable dumping approach should be safe, semantically familiar to unexec, ASLR-compatible, _and_ very nearly as fast as what we have today. 
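For illustration only, the relocation walk described above might look like the following C sketch. It is not code from Emacs or XEmacs: the reloc record, the two pointer kinds, and apply_relocs are hypothetical names, and the sketch assumes that each pointer slot in the dump file stores an offset (from the start of the dump, or from the base of the executable) that is turned into an absolute address at load time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk metadata; none of these names exist in Emacs.  */
enum reloc_kind
{
  RELOC_DUMP,   /* slot holds an offset into the dump image itself */
  RELOC_BINARY  /* slot holds an offset into the Emacs executable */
};

struct reloc
{
  size_t slot_offset;    /* position of the pointer-sized slot in the image */
  enum reloc_kind kind;  /* which base address the slot is relative to */
};

/* Turn every recorded slot from an offset into an absolute address,
   given where the image and the executable ended up in this process.  */
static void
apply_relocs (unsigned char *image, size_t image_size,
              const struct reloc *relocs, size_t n_relocs,
              uintptr_t binary_base)
{
  uintptr_t dump_base = (uintptr_t) image;

  for (size_t i = 0; i < n_relocs; i++)
    {
      size_t off = relocs[i].slot_offset;

      /* Refuse slots that would run past the end of the image.  */
      assert (image_size >= sizeof (uintptr_t)
              && off <= image_size - sizeof (uintptr_t));

      /* memcpy avoids alignment assumptions about the slot.  */
      uintptr_t value;
      memcpy (&value, image + off, sizeof value);
      value += relocs[i].kind == RELOC_DUMP ? dump_base : binary_base;
      memcpy (image + off, &value, sizeof value);
    }
}
```

A lazy variant would call the same routine from the fault handler, restricted to the relocations that fall within the faulting page.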
* Re: Skipping unexec via a big .elc file 2016-10-24 19:47 ` Daniel Colascione @ 2016-10-25 15:59 ` Eli Zaretskii 2016-10-25 16:14 ` Daniel Colascione ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 15:59 UTC (permalink / raw) To: Daniel Colascione; +Cc: p.stephani2, monnier, emacs-devel > From: Daniel Colascione <dancol@dancol.org> > Cc: Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca, emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 12:47:56 -0700 > > > I'd argue that we are already in this situation. For example, nobody > > knows how to make unexec work with ASLR or PIE; when I tried fuzzing > > Emacs with AFL, the dumped binary would simply crash; the dumped > > binary is not reproducible (i.e. bit-by-bit identical after every > > build); and I think dumping also doesn't work with ASan. The fraction > > of situation where unexec doesn't work any more gets larger and > > larger. If we had people who could solve these problems, it should get > > smaller instead. > > Everyone who's seriously thought about the unexec problem _understands_ > the issue. The important point is that the number of people here who can claim such understanding, enough so to fix the issues, is diminishingly small, and gets smaller every year. > My preferred approach is the portable dumper one: basically what we're > doing today, except that instead of just blindly copying the data > segment and heap to a new emacs binary, we'll write this information to > a separate file, stored in a portable format, a file that we'll keep > alongside the Emacs binary. We'll store in this file metadata about > where the pointers are. (There are two kinds of pointers in this file: > pointers to other parts of the file and pointers to the Emacs binary.) > > At startup, we'll load the dump file and walk the relocations, fixing up > all the embedded addresses to account for the new process's different > address space. 
Why do you think this will have better performance than reading a
single .elc file at startup? It's still mainly file I/O and
processing of the file's contents, just like with byte-compiled files.

If we have no reason to believe this portable dumper will be
significantly faster, we should IMO investigate the .elc method first,
because it's so much simpler, both in its implementation and in future
maintenance. E.g., adding a new kind of Lisp object to Emacs would
require corresponding changes in the dumper.

> We can't save all of the Emacs data segment this way, but we can
> relocate and restore anything that's marked with staticpro. The overall
> experience should be very similar to what we have today.
> [...]
> Speaking of COW faults: a refinement of this scheme is to do the
> relocations lazily, in a SIGSEGV handler. (Map the dump file PROT_NONE
> so any access traps.) In the SIGSEGV handler, we can relocate just the
> page we faulted, then continue. This way, we don't need to slurp in the
> entire dump file from disk just to start emacs -Q -batch: we can
> demand-page!

Demand paging in an application, and an application such as Emacs on
top of that, makes little sense to me. This is the OS business, not
ours. Using mmap as a fast way to read a file, yes, that's done in
many applications. But please let's leave demand paging out of our
scope.

IMO the less we mess with low-level techniques that no other
applications use the better, both because we have very few people who
can do that and because doing so runs a higher risk of becoming broken
by future developments in the platforms we deem important. The
long-term tendency in Emacs development should be to move away from
such techniques, not to acquire more of them.
* Re: Skipping unexec via a big .elc file 2016-10-25 15:59 ` Eli Zaretskii @ 2016-10-25 16:14 ` Daniel Colascione 2016-10-25 17:05 ` Eli Zaretskii 2016-10-25 19:49 ` Stefan Monnier 2016-10-25 22:53 ` Perry E. Metzger 2 siblings, 1 reply; 375+ messages in thread
From: Daniel Colascione @ 2016-10-25 16:14 UTC
To: Eli Zaretskii; +Cc: p.stephani2, monnier, emacs-devel

On 10/25/2016 08:59 AM, Eli Zaretskii wrote:
>> From: Daniel Colascione <dancol@dancol.org>
>> Cc: Eli Zaretskii <eliz@gnu.org>, monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> Date: Mon, 24 Oct 2016 12:47:56 -0700
>>
>>> I'd argue that we are already in this situation. For example, nobody
>>> knows how to make unexec work with ASLR or PIE; when I tried fuzzing
>>> Emacs with AFL, the dumped binary would simply crash; the dumped
>>> binary is not reproducible (i.e. bit-by-bit identical after every
>>> build); and I think dumping also doesn't work with ASan. The fraction
>>> of situations where unexec doesn't work any more gets larger and
>>> larger. If we had people who could solve these problems, it should get
>>> smaller instead.
>>
>> Everyone who's seriously thought about the unexec problem _understands_
>> the issue.
>
> The important point is that the number of people here who can claim
> such understanding, enough so to fix the issues, is diminishingly
> small, and gets smaller every year.

There's no demand for more yet. There isn't a catastrophe --- just low
demand for core-change expertise. There used to be a lot more (at least
per-capita) stonemasons in historical societies than in today's society.
That doesn't mean we've forgotten how to cut stones, and if there were a
sudden need to do it, more stonemasons would magically appear.
>> My preferred approach is the portable dumper one: basically what we're >> doing today, except that instead of just blindly copying the data >> segment and heap to a new emacs binary, we'll write this information to >> a separate file, stored in a portable format, a file that we'll keep >> alongside the Emacs binary. We'll store in this file metadata about >> where the pointers are. (There are two kinds of pointers in this file: >> pointers to other parts of the file and pointers to the Emacs binary.) >> >> At startup, we'll load the dump file and walk the relocations, fixing up >> all the embedded addresses to account for the new process's different >> address space. > > Why do you think this will have better performance that reading a > single .elc file at startup? It's still mainly file I/O and > processing of the file's contents, just like with byte-compiled files. Because a portable dumper can do less, on both file I/O and processing of the file's contents. There's no lisp evaluation, no slurping a whole file into memory. Having to read all of Emacs into memory on startup is a burden even on a fast, modern machine like mine.

~/edev/trunk/src
$ sync && sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

~/edev/trunk/src
$ time pv < emacs >/dev/null
48.6MiB 0:00:00 [ 455MiB/s] [=========================================================>] 100%

real 0m0.116s
user 0m0.000s
sys 0m0.016s

That's pretty fast, but it's not free. Not having to do this much IO on startup in the first place would be even better. > If we have no reason to believe this portable dumper will be > significantly faster, we should IMO investigate the .elc method first, > because it's so much simpler, both in its implementation and in future > maintenance. E.g., adding a new kind of Lisp object to Emacs would > require corresponding changes in the dumper. Adding a new kind of lisp object requires changes throughout core anyway.
At the very least, you need to teach GC where your new object keeps its pointers, and that's exactly the knowledge that the dumper would need. >> We can't save all of the Emacs data segment this way, but we can >> relocate and restore anything that's marked with staticpro. The overall >> experience should be very similar to what we have today. >> [...] >> Speaking of COW faults: a refinement of this scheme is to do the >> relocations lazily, in a SIGSEGV handler. (Map the dump file PROT_NONE >> so any access traps.) In the SIGSEGV handler, we can relocate just the >> page we faulted, then continue. This way, we don't need to slurp in the >> entire dump file from disk just to start emacs -Q -batch: we can >> demand-page! > > Demand paging in an application, and an application such as Emacs on > top of that, makes little sense to me. Why? It's conceptually no different from autoload. There is no technique in computer science so rarefied that it's only good in ring zero. > This is the OS business, not > ours. Using mmap as a fast way to read a file, yes, that's done in > many applications. But please lets leave demand paging out of our > scope. Emacs isn't just an application. It's a Lisp virtual machine, and employing the optimization techniques used in other virtual machines can be important wins. (FWIW, mmap isn't a particularly fast way of doing bulk file reads. That's why GNU grep removed its mmap support.) > IMO the less we mess with low-level techniques that no other > applications use the better, both because we have very few people who > can do that and because doing so runs higher risk of becoming broken > by future developments in the platforms we deem important. The > long-term tendency in Emacs development should be to move away from > such techniques, not to acquire more of them. I'm for anything that delivers meaningful performance advantages. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-25 16:14 ` Daniel Colascione @ 2016-10-25 17:05 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 17:05 UTC (permalink / raw) To: Daniel Colascione; +Cc: p.stephani2, monnier, emacs-devel > Cc: p.stephani2@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Tue, 25 Oct 2016 09:14:55 -0700 > > >> Everyone who's seriously thought about the unexec problem _understands_ > >> the issue. > > > > The important point is that the number of people here who can claim > > such understanding, enough so to fix the issues, is diminishingly > > small, and gets smaller every year. > > There's no demand for more yet. Not true. Demand for this level of expertise is continuous in Emacs, and never dwindles, not for the last 25 years that I'm involved. > There used* to be a lot more (at least > per-capita) stonemasons in historical societies than in today's society. > That doesn't mean we've forgotten how to cut stones, and if there were a > sudden need to do it, more stonemasons would magically appear. I think your optimism is misplaced. I'm old enough to have seen several proficiencies go extinct due to new technology that made them irrelevant. When demand for those forgotten proficiencies came up, people invariably run to the few still around who know how to do that, they don't learn that themselves (and don't even know how). > > Why do you think this will have better performance that reading a > > single .elc file at startup? It's still mainly file I/O and > > processing of the file's contents, just like with byte-compiled files. > > Because a portable dumper can do less, on both file I/O and processing > of the file's contents. There's no lisp evaluation, no slurping a whole > file into memory. Having to read all of Emacs into memory on startup is > a burden even on a fast, modern machine like mine. 
> > ~/edev/trunk/src > $ sync && sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' > > ~/edev/trunk/src > $ time pv < emacs >/dev/null > 48.6MiB 0:00:00 [ 455MiB/s] > [=========================================================>] 100% > > > real 0m0.116s Which is definitely comparable with my measurements of loading all of the *.elc files concatenated, which were proclaimed to be "too slow". > > If we have no reason to believe this portable dumper will be > > significantly faster, we should IMO investigate the .elc method first, > > because it's so much simpler, both in its implementation and in future > > maintenance. E.g., adding a new kind of Lisp object to Emacs would > > require corresponding changes in the dumper. > > Adding a new kind of lisp object requires changes throughout core > anyway. The changes in the dumper are _in_addition_ to that. > > Demand paging in an application, and an application such as Emacs on > > top of that, makes little sense to me. > > Why? It's conceptually no different from autoload. The devil is in the details, though. And there are a lot of details in this case that are completely unrelated to the concept. If you don't get them all right, you get a subtly unstable application that will crash randomly in hard to reproduce and debug situations. > > This is the OS business, not > > ours. Using mmap as a fast way to read a file, yes, that's done in > > many applications. But please lets leave demand paging out of our > > scope. > > Emacs isn't just an application. It's a Lisp virtual machine No, it's not. It's an application with a powerful extension language. > (FWIW, mmap isn't a particularly fast way of doing bulk file reads. > That's why GNU grep removed its mmap support.) It was an example of a low-level technique that is sufficiently simple to use, that's all. 
> > IMO the less we mess with low-level techniques that no other > > applications use the better, both because we have very few people who > > can do that and because doing so runs higher risk of becoming broken > > by future developments in the platforms we deem important. The > > long-term tendency in Emacs development should be to move away from > > such techniques, not to acquire more of them. > > I'm for anything that delivers meaningful performance advantages. IME, that way lies madness. It's the exact opposite of the direction Emacs should evolve if we want to prevent it from becoming a marginal package for a few enthusiasts. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-25 15:59 ` Eli Zaretskii 2016-10-25 16:14 ` Daniel Colascione @ 2016-10-25 19:49 ` Stefan Monnier 2016-10-25 22:53 ` Perry E. Metzger 2 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-25 19:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, Daniel Colascione, emacs-devel >> At startup, we'll load the dump file and walk the relocations, fixing up >> all the embedded addresses to account for the new process's different >> address space. > Why do you think this will have better performance that reading a > single .elc file at startup? It's still mainly file I/O and > processing of the file's contents, just like with byte-compiled files. I guess it depends if we can get lread.c to be bound by file-I/O. Currently, it's significantly slower. It's clear on the surface that lread.c has more work to do than an ideal "portable undumper": - the PU just has to find all pointers and increment them by a fixed offset (it could do so either by a GC-like traversal, or by consulting an auxiliary precomputed table of addresses stored alongside the dump state) - lread.c has to check every byte for lexing/parsing, it has to call the memory allocator for every object, tie the knots for cyclic objects, `intern` the symbols, decode the \ in strings, ... The jury is still out whether this extra work can be implemented efficiently enough. There are other differences which can impact the performance (e.g. the size of the "dump" is likely different in the two cases, so the amount of I/O is affected) and the desirability (speeding up loading of .elc would benefit other cases as well, we could generate a generic dump.elc rather than have it be OS-specific, ...) Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
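The fixed-offset fix-up Stefan describes for an ideal "portable undumper" can be illustrated with a small C sketch. It assumes the second variant he mentions, a precomputed table of slot offsets stored alongside the dump. The names `relocate_image`, `old_base`, and `reloc_offsets` are hypothetical and nothing here is taken from the Emacs sources; it also handles only pointers into the dump itself, not the second kind (pointers into the Emacs binary) mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation walk.  Each entry of RELOC_OFFSETS is the
   byte offset, within the dump image, of a pointer-sized slot holding
   an address that was valid when the image lived at OLD_BASE.  */
void
relocate_image (unsigned char *image, size_t image_size,
                uintptr_t old_base,
                const size_t *reloc_offsets, size_t n_relocs)
{
  /* Unsigned subtraction wraps, so this also works when the image
     moved to a lower address.  */
  uintptr_t delta = (uintptr_t) image - old_base;

  for (size_t i = 0; i < n_relocs; i++)
    {
      size_t off = reloc_offsets[i];
      assert (off + sizeof (uintptr_t) <= image_size);

      /* memcpy avoids assuming that the slot is suitably aligned.  */
      uintptr_t slot;
      memcpy (&slot, image + off, sizeof slot);
      slot += delta;
      memcpy (image + off, &slot, sizeof slot);
    }
}
```

The loop touches only the slots listed in the table, which is the sense in which this is less work than lexing and allocating every object; whether that advantage survives in a real dump format is exactly what the thread leaves open.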
* Re: Skipping unexec via a big .elc file 2016-10-25 15:59 ` Eli Zaretskii 2016-10-25 16:14 ` Daniel Colascione 2016-10-25 19:49 ` Stefan Monnier @ 2016-10-25 22:53 ` Perry E. Metzger 2016-10-26 2:36 ` Eli Zaretskii 2 siblings, 1 reply; 375+ messages in thread From: Perry E. Metzger @ 2016-10-25 22:53 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, Daniel Colascione, monnier, emacs-devel On Tue, 25 Oct 2016 18:59:36 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > Everyone who's seriously thought about the unexec problem > > _understands_ the issue. > > The important point is that the number of people here who can claim > such understanding, enough so to fix the issues, is diminishingly > small, and gets smaller every year. Just an aside: when you attract fewer and fewer users, you end up with fewer and fewer contributors. Fewer and fewer contributors makes maintenance harder and can create a death spiral for projects. If, in an effort to make maintenance easier, you scare off a lot of users, you could end up making the maintenance situation worse in the long run. I'm not saying that longer start time would scare off users as such, but in general, this balance has to be weighed in making decisions about usability vs. maintenance costs. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-25 22:53 ` Perry E. Metzger @ 2016-10-26 2:36 ` Eli Zaretskii 2016-10-26 2:37 ` Perry E. Metzger 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-26 2:36 UTC (permalink / raw) To: Perry E. Metzger; +Cc: p.stephani2, dancol, monnier, emacs-devel > Date: Tue, 25 Oct 2016 18:53:13 -0400 > From: "Perry E. Metzger" <perry@piermont.com> > Cc: Daniel Colascione <dancol@dancol.org>, p.stephani2@gmail.com, > monnier@iro.umontreal.ca, emacs-devel@gnu.org > > On Tue, 25 Oct 2016 18:59:36 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > > Everyone who's seriously thought about the unexec problem > > > _understands_ the issue. > > > > The important point is that the number of people here who can claim > > such understanding, enough so to fix the issues, is diminishingly > > small, and gets smaller every year. > > Just an aside: when you attract fewer and fewer users, you end up with > fewer and fewer contributors. Fewer and fewer contributors makes > maintenance harder and can create a death spiral for projects. If, in > an effort to make maintenance easier, you scare off a lot of users, > you could end up making the maintenance situation worse in the long > run. I'm not saying that longer start time would scare off users as > such, but in general, this balance has to be weighed in making > decisions about usability vs. maintenance costs. That's a profoundly false premise, and a misrepresentation of everything I wrote. No one is arguing for slower startup that will annoy users! The issue at hand is which approach to prefer when the startup time is comparable. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-26 2:36 ` Eli Zaretskii @ 2016-10-26 2:37 ` Perry E. Metzger 0 siblings, 0 replies; 375+ messages in thread From: Perry E. Metzger @ 2016-10-26 2:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: p.stephani2, dancol, monnier, emacs-devel On Wed, 26 Oct 2016 05:36:19 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > That's a profoundly false premise, and a misrepresentation of > everything I wrote. No one is arguing for slower startup that will > annoy users! The issue at hand is which approach to prefer when the > startup time is comparable. Good to hear. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 16:17 ` Daniel Colascione 2016-10-24 16:51 ` Philipp Stephani @ 2016-10-24 16:52 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 16:52 UTC (permalink / raw) To: Daniel Colascione; +Cc: monnier, emacs-devel > Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Mon, 24 Oct 2016 09:17:14 -0700 > > > I very much disagree with this: the unexec maintenance situation is > > actually so fragile that it could break at any moment, in the sense > > that we could very easily get into having no people on board who know > > enough about unexec to solve the next problem that will break it. The > > number of people who do know gets smaller and smaller with each year. > > That is not healthy at all for the future of the project. > > In both this discussion and the one about insdel, you've expressed the > sentiment that we need to optimize for a world in which very few people > have time to maintain Emacs internals. I have a more optimistic view: > people are generally good at figuring things out, and if learning about > unexec or other esoteric facilities is what prevents a developer from > porting Emacs to a new platform or fixing an important bug, that > developer will put time into learning about these mechanisms. Even if you are right, such "figuring out" will take time, and will delay Emacs development if not stall it. With enough bad luck, we could see people start abandoning ship. Like I said, Emacs already cannot be built on a system with ASLR; how soon do you think this and similar problems will be considered fatal flaws? Yes, I'm a pessimist about these aspects of Emacs development. My reasons are what I see before my eyes almost every day: some problems in Emacs are not touched until one of the few who know enough does it.
Look at the last installment of this saga, with ralloc-induced problems: the same usual suspects are involved in solving it. If all of those few were run over by a bus, how fast would these problems be identified and solved? And this problem is far simpler than the unexec subtleties. It's no accident that no one (perhaps except Paul) is seriously working on the unexec replacement. Why would you believe that this could change in the future, when most of our contributors lack proficiency working at this level? So yes, I think your optimism is misplaced. But that doesn't matter, because no solution for unexec that is not good enough, performance-wise and otherwise, will be accepted by the crowd, no matter how grave the current situation is. So you should not be worried about this. What _is_ important, IMO, is that if and when we do need to drop unexec, we will have _some_ solution, however imperfect, to start with and get it up to speed. Because whatever the solution, making it happen is a lot of work, and we had better have done most of it by then. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 13:35 ` Eli Zaretskii 2016-10-24 14:45 ` Daniel Colascione @ 2016-10-25 22:46 ` Perry E. Metzger 1 sibling, 0 replies; 375+ messages in thread From: Perry E. Metzger @ 2016-10-25 22:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel On Mon, 24 Oct 2016 16:35:08 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > I think some users will run away screaming if Emacs takes a whole > > second to start up. > > It depends. If those users, like me, have hundreds of buffers in > their sessions, and use desktop.el to recreate their sessions, they > already wait a few seconds for that. Just FYI, I find the fact that emacs starts up instantaneously a big win. If it starts taking a substantial period to start up again like it did 30 years ago, I'm going to be unhappy. I get that this simplifies maintenance but the startup time is a big lose. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 1:07 ` Stefan Monnier 2016-10-24 6:39 ` Eli Zaretskii @ 2016-10-24 9:40 ` Ken Raeburn 2016-10-24 13:13 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2016-10-24 9:40 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel > On Oct 23, 2016, at 21:07, Stefan Monnier <monnier@iro.umontreal.ca> wrote: > >> That sounds strangely long, as I got less than 2 sec with all the >> preloaded *.elc files concatenated to a single file, and that's before >> I made pure-copy a no-op. >> Another report was that "loadup" with pure-copy short-circuited took >> less than 0.5 sec. See > > Hmm... indeed, I got to 0.72s with his patch (on a different, slower > machine (a Thinkpad X201s, i.e. with a i7 CPU L620 @ 2.00GHz)). > > If I re-add international/characters it goes up a bit to > 0.96s, but still nowhere near the 3s I got on my big .elc file. > [ I wonder what makes loading my big file so slow. ] > > This said, there's still a factor 5-10 to get to "immediate", tho. I think this came up in the thread Eli referred to, but when I’ve looked at startup time in CANNOT_DUMP builds, a couple of things jumped out at me: * Garbage collection time. If we’re not trying to dump out as compact as possible an image, squeezing out every byte is less important. Drop all of the explicit calls in loadup.el. Consider raising gc-cons-threshold to the point where it doesn’t trigger during loadup; maybe set it back after startup completes, or the first time Emacs is idle more than a couple seconds. * I/O processing time — not the I/O system calls, but the C library processing. Change getc to getc_unlocked in charset.c and lread.c. (And/or change the loading of dumped.elc to read everything into a buffer and execute code from the buffer, if that might be faster.) Mutex locking time is costly on Mac OS X, but not exactly free in glibc either. 
As I recall, I had startup times under a second without any loadup/dump preprocessing with these changes. (And all the “purecopy” stuff skipped, in a CANNOT_DUMP build.) Your “dumped.elc” might trigger some of the same issues. If the eventual idea is to stuff the “dumped” data into a char array to link into the final installed executable, the second issue is less relevant, though. Did you check whether actually byte compiling the written file made a difference? Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
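Ken's second point, replacing `getc` with `getc_unlocked`, can be sketched in C as below. POSIX leaves it to the caller of `getc_unlocked` to guarantee exclusive access to the stream; this sketch does so by taking the stream lock once around the whole loop with `flockfile`/`funlockfile`, which is an assumption of the example and not necessarily what the proposed changes to charset.c and lread.c do. The function name `count_bytes_unlocked` is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining in FP, locking the stream once for the
   whole loop instead of once per character as plain getc may do.  */
long
count_bytes_unlocked (FILE *fp)
{
  assert (fp != NULL);
  long n = 0;
  int c;

  flockfile (fp);
  while ((c = getc_unlocked (fp)) != EOF)
    n++;
  funlockfile (fp);

  return n;
}
```

A reader loop shaped like this pays the mutex cost once per file and not once per byte, which is where the saving Ken describes would come from.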
* Re: Skipping unexec via a big .elc file 2016-10-24 9:40 ` Ken Raeburn @ 2016-10-24 13:13 ` Stefan Monnier 2016-10-25 9:02 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 13:13 UTC (permalink / raw) To: Ken Raeburn; +Cc: Eli Zaretskii, emacs-devel > * Garbage collection time. If we’re not trying to dump out as compact as > possible an image, squeezing out every byte is less important. Drop all of > the explicit calls in loadup.el. Consider raising gc-cons-threshold to the > point where it doesn’t trigger during loadup; maybe set it back after > startup completes, or the first time Emacs is idle more > than a couple seconds. The patch to which I was referring (and which I used) does get rid of most gc calls. > As I recall, I had startup times under a second without any loadup/dump > preprocessing with these changes. (And all the “purecopy” stuff skipped, > in a CANNOT_DUMP build.) The patch also skips the purecopy by setting purify-flag to nil. > Did you check whether actually byte compiling the written file made > a difference? dumped.elc has no code to compile. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 13:13 ` Stefan Monnier @ 2016-10-25 9:02 ` Ken Raeburn 2016-10-25 13:48 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2016-10-25 9:02 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel On Oct 24, 2016, at 09:13, Stefan Monnier <monnier@iro.umontreal.ca> wrote: >> Did you check whether actually byte compiling the written file made >> a difference? > > dumped.elc has no code to compile. It has a lot of fset and setplist calls which can be compiled, especially if you reorder things such that they’re not mixed up with the defvar calls that don’t compile. The generated .elc output is about 25% larger. I don’t expect the C parts of fset and setplist to be affected at all, of course; the parsing and interpretation of the Lisp may be another matter. Unfortunately, byte-compile-file doesn’t preserve the sharing of objects (“#42#”) present in the input file, so the output isn’t semantically the same. I did some profiling. Without byte compiling, it appears that around half of the CPU time used loading the file in my test is spent in Frassq(…,read_objects), called from substitute_object_recurse. For processing a file with this much sharing of objects, an assoc list with O(n) access time may not be the best choice. Whatever we replace it with, it appears we need to be able to look up cons cells in a collection by either element. The next top users of CPU time (_IO_getc, oblookup) are less significant, though there are some easy minor gains to be made there. With a hacked-up 31-slot hash table replacing read_objects, the getc_unlocked changes, and setting OBARRAY_SIZE to 8191, I got the load time for the file in batch mode on my test system from just under a half second to about a quarter second. Nearly half the remaining CPU time is split between readchar, read1, readbyte_from_file, and Fassq. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-25 9:02 ` Ken Raeburn @ 2016-10-25 13:48 ` Stefan Monnier 2016-10-27 8:51 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-25 13:48 UTC (permalink / raw) To: Ken Raeburn; +Cc: Eli Zaretskii, emacs-devel >>> Did you check whether actually byte compiling the written file made >>> a difference? >> dumped.elc has no code to compile. > It has a lot of fset and setplist calls which can be compiled, especially if > you reorder things such that they’re not mixed up with the defvar calls that > don’t compile. "A lot of" is relative: the time to read them compared to an equivalent byte-code version should be negligible, and their execution time should be even more negligible. > The generated .elc output is about 25% larger. That's not because of byte-compilation per se. It's because the byte-compiler uses `print-circle' but only within each top-level entity, so you lose sharing between functions and between variables. IOW you can get the exact same 25% larger file by printing each fset/defvar/setplist separately (instead of printing them as one big `progn`). And you can trick the byte-compiler into preserving this sharing by replacing the leading `progn` (which the byte-compiler removes) with a (let () ...), though maybe you'll need to really add some dummy binding in that `let` to make sure the byte-compiler doesn't end up removing it. > I did some profiling. Without byte compiling, it appears that around half > of the CPU time used loading the file in my test is spent in > Frassq(…,read_objects), called from substitute_object_recurse. Ah, that's what it is. Clearly we should be able to optimize most of this away. > For processing a file with this much sharing of objects, an assoc list with > O(n) access time may not be the best choice. Indeed. > Whatever we replace it with, it appears we need to be able to look up > cons cells in a collection by either element.
Ideally, we could get rid of substitute_object_in_subtree entirely. E.g. the patch below skips it for the case of "#n=(...)", and by peeping ahead to decide the type of placeholder we build, we should be able to get rid of it in all cases. Stefan

diff --git a/src/lread.c b/src/lread.c
index 58d518c..a06a78f 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -2936,12 +2936,21 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
                   tem = read0 (readcharfun);

                   /* Now put it everywhere the placeholder was... */
-                  substitute_object_in_subtree (tem, placeholder);
+                  if (CONSP (tem))
+                    {
+                      Fsetcar (placeholder, XCAR (tem));
+                      Fsetcdr (placeholder, XCDR (tem));
+                      return placeholder;
+                    }
+                  else
+                    {
+                      substitute_object_in_subtree (tem, placeholder);

-                  /* ...and #n# will use the real value from now on. */
-                  Fsetcdr (cell, tem);
+                      /* ...and #n# will use the real value from now on. */
+                      Fsetcdr (cell, tem);

-                  return tem;
+                      return tem;
+                    }
                 }

               /* #n# returns a previously read object. */

^ permalink raw reply related [flat|nested] 375+ messages in thread
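The idea behind the patch, filling the placeholder in place so that earlier `#n#` references already point at the final object, can be shown with a toy C model. `struct cons`, `make_placeholder`, and `fill_placeholder` are hypothetical stand-ins, not the Emacs `Lisp_Object` machinery, and the sketch covers only the cons case that the patch special-cases.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a Lisp cons cell; not the Emacs representation.  */
struct cons
{
  void *car;
  void *cdr;
};

/* Allocate the cell that "#n=" will denote before its contents are
   read, so any "#n#" met while reading can point at it directly.  */
struct cons *
make_placeholder (void)
{
  struct cons *p = malloc (sizeof *p);
  assert (p != NULL);
  p->car = NULL;
  p->cdr = NULL;
  return p;
}

/* Once the real cons has been read, copy its fields into the
   placeholder and return the placeholder as the object.  Existing
   references need no rewriting, so no tree walk is required.  */
struct cons *
fill_placeholder (struct cons *placeholder, const struct cons *read_value)
{
  placeholder->car = read_value->car;
  placeholder->cdr = read_value->cdr;
  return placeholder;
}
```

Reading something like `#1=(a . #1#)` in this model means the cdr read for the inner `#1#` is the placeholder itself; after `fill_placeholder` the cell is its own cdr with no substitution pass.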
* Re: Skipping unexec via a big .elc file 2016-10-25 13:48 ` Stefan Monnier @ 2016-10-27 8:51 ` Ken Raeburn 2016-10-30 14:43 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2016-10-27 8:51 UTC (permalink / raw) To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel > On Oct 25, 2016, at 09:48, Stefan Monnier <monnier@iro.umontreal.ca> wrote: > >>>> Did you check whether actually byte compiling the written file made >>>> a difference? >>> dumped.elc has no code to compile. >> It has a lot of fset and setplist calls which can be compiled, especially if >> you reorder things such that they’re not mixed up with the defvar calls that >> don’t compile. > > "A lot of" is relative: the time to read them compared to an equivalent > byte-code version should be negligeable, and their execution time should > be even more negligeable. > >> The generated .elc output is about 25% larger. > > That's not because of byte-compilation per-se. It's because the > byte-compiler uses `print-circle' but only within each top-level entity, > so you lose sharing between functions and between variables. > > IOW you can get the exact same 25% larger file by printing each > fset/defvar/setplist separately (instead of printing them as one big > `progn`). And you can trick the byte-compiler to preserve this sharing > by replacing the leading `progn` (which the byte-compiler removes) into > a (let () ...), tho maybe you'll need to really add some dummy binding > in that `let` to make sure the byte-compiler doesn't end up removing it. Ah, yes… “(let () …)” was enough with no bindings. Now the compiled file, which now contains only one big byte-code invocation, is still larger than the original dumped file, though not as much, and from a couple of spot checks it looks like the data sharing is indeed preserved. It also takes longer to load. Oh well. > Ideally, we could get rid of substitute_object_in_subtree entirely. > E.g. 
the patch below skips it for the case of "#n=(...)", and by peeping > ahead to decide the type of placeholder we build, we should be able to > get rid of it in all cases. I would think not for types using flexible array members, since we may not know the allocation size until we’ve seen the end of the object. In poking around with gdb, most of the invocations of substitute_object_in_subtree I looked at got a subtree of nil. It appears to me that if the “subtree” passed isn’t the placeholder and isn’t one of the types we process recursively, then we will never do any substitution, right? So the checking of seen_list and read_objects isn’t relevant. I started my tests over with an updated source tree from upstream and put in your loadup.el change. Running “time emacs -batch -l dumped.elc” took 3.5s; according to “perf record”/“perf report”, Frassq took about 85% of the CPU time, and Fassq took about 9%. Added your lread.c patch; run time is about 1.8s, 70% in Frassq and almost 20% in Fassq. Patched substitute_object_recurse after the check for the subtree matching the placeholder, so that if the subtree passed was a symbol or number, it would simply be returned without consulting seen_list or read_objects. Run time is now 0.7s; Fassq is a bit over 50% of that, and Frassq about 17%, and _IO_getc around 11%. I think it should be safe to short-circuit it for some other types as well. I had my getc_unlocked change sitting around so I pulled that in. Run time is now 0.6s, with Fassq at 57% and Frassq at 18%. Next on the profiling chart is oblookup, but it’s only at 4% so I’m going to ignore OBARRAY_SIZE for now. However, OBARRAY_SIZE could affect the order of atoms in processing, which could drastically rearrange the ordering of the data structures in dumped.elc. I think the next step is to look at replacing read_objects, probably with a pair of hash tables, but it’s getting a bit late for trying that tonight. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-10-30 14:43 UTC
To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

I wrote:
> Patched substitute_object_recurse after the check for the subtree matching the placeholder, so that if the subtree passed was a symbol or number, it would simply be returned without consulting seen_list or read_objects. Run time is now 0.7s; Fassq is a bit over 50% of that, and Frassq about 17%, and _IO_getc around 11%. I think it should be safe to short-circuit it for some other types as well.
>
> I had my getc_unlocked change sitting around so I pulled that in. Run time is now 0.6s, with Fassq at 57% and Frassq at 18%.
>
> Next on the profiling chart is oblookup, but it’s only at 4% so I’m going to ignore OBARRAY_SIZE for now. However, OBARRAY_SIZE could affect the order of atoms in processing, which could drastically rearrange the ordering of the data structures in dumped.elc.
>
> I think the next step is to look at replacing read_objects, probably with a pair of hash tables, but it’s getting a bit late for trying that tonight.

I switched over to a pair of hash tables and the run time is just under 0.2s on my test machine now. Profiling reports are now topped by read1, readchar, and readbyte_from_file (now including the expanded getc_unlocked calls), accounting for about 30% of the CPU time between them. The hash functions and substitute_object_recurse are not taking a significant amount of time.

I took a look at the types of shared data in one of the generated dumped.elc files I got; almost 2700 were strings (all without text properties), almost 1900 were cons cells, and the rest numbered under 300. So I’m not sure special-casing other types besides Lisp_Cons in read1 will gain us much.

It took me a while to sort through the lookups being done during and after parsing of an object and how the checks for circular objects work, but I think I’ve got it working. I’ve pushed a scratch branch over with the changes if you’d like to try them, though I think I botched the git push syntax when trying to create “scratch/raeburn/startup” somehow, so I created “scratch/raeburn-startup”… or possibly I’ve created both? I saw an email notification go out for both, but I only see the latter in the repository browser interface…

Ken
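To illustrate why swapping an association list for a hash table removes the Fassq/Frassq cost, here is a minimal sketch of a fixed-size, open-addressing table keyed by an integer label. Everything here is an assumption for illustration: the names (`toy_table`, `toy_put`, `toy_get`), the fixed capacity, and the hash function are hypothetical, and the branch itself presumably uses Emacs's own Lisp hash tables rather than anything like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_SLOTS 4096   /* power of two; hypothetical fixed capacity */

struct toy_entry { intptr_t key; void *value; int used; };
struct toy_table { struct toy_entry slots[TOY_TABLE_SLOTS]; size_t count; };

/* Multiplicative hash, reduced to a slot index.  */
static size_t
toy_hash (intptr_t key)
{
  uint64_t h = (uint64_t) key * 0x9E3779B97F4A7C15ULL;
  return (size_t) (h >> 32) & (TOY_TABLE_SLOTS - 1);
}

/* Insert KEY -> VALUE, or overwrite the value if KEY is present.  */
static void
toy_put (struct toy_table *t, intptr_t key, void *value)
{
  assert (t->count < TOY_TABLE_SLOTS - 1);   /* sketch: no resizing */
  size_t i = toy_hash (key);
  while (t->slots[i].used && t->slots[i].key != key)
    i = (i + 1) & (TOY_TABLE_SLOTS - 1);     /* linear probing */
  if (!t->slots[i].used)
    {
      t->slots[i].used = 1;
      t->slots[i].key = key;
      t->count++;
    }
  t->slots[i].value = value;
}

/* Return the value stored for KEY, or NULL if KEY is absent.  */
static void *
toy_get (const struct toy_table *t, intptr_t key)
{
  size_t i = toy_hash (key);
  while (t->slots[i].used)
    {
      if (t->slots[i].key == key)
        return t->slots[i].value;
      i = (i + 1) & (TOY_TABLE_SLOTS - 1);
    }
  return NULL;
}
```

With a list, each lookup walks every entry added so far, so reading n shared objects costs on the order of n² comparisons; with a table like this, each lookup is expected to touch a small, roughly constant number of slots while the table stays sparsely filled.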
* Re: Skipping unexec via a big .elc file
From: Simon Leinen @ 2016-10-30 15:31 UTC
To: Ken Raeburn; +Cc: Eli Zaretskii, Stefan Monnier, Emacs developers

On Sun, Oct 30, 2016 at 3:43 PM, Ken Raeburn <raeburn@raeburn.org> wrote:
> I switched over to a pair of hash tables and the run time is just under 0.2s on my test machine now. Profiling reports are now topped by read1, readchar, and readbyte_from_file (now including the expanded getc_unlocked calls), accounting for about 30% of the CPU time between them. The hash functions and substitute_object_recurse are not taking a significant amount of time.
[...]

Promising!

Years ago I spent some time optimizing the MIB-reading code in UCD/Net-SNMP, and found that the biggest win was to treat the input file as one big buffer (I actually mmap()ped it) and then avoid most of the memory allocation overhead of token creation by using start/end pointers directly into that buffer.

I never upstreamed that code, and I'm not sure the representation would have been acceptable to the other developers. But it sure was fast.

Maybe an approach like that would be suitable for .elc loading.
--
Simon.
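The start/end-pointer idea can be sketched roughly like this. It is not the Net-SNMP code (which was never published) and not anything in lread.c; `toy_token` and `toy_next_token` are hypothetical names, the token grammar is a simplified Lisp-like one, and the buffer is assumed to have been obtained elsewhere (for example from mmap()).

```c
#include <assert.h>
#include <stddef.h>

/* A token is just a half-open range [start, end) inside the input
   buffer; nothing is copied or allocated per token.  */
struct toy_token { const char *start; const char *end; };

static int
toy_is_space (char c)
{
  return c == ' ' || c == '\n' || c == '\t';
}

static int
toy_is_delim (char c)
{
  return toy_is_space (c) || c == '(' || c == ')';
}

/* Find the next token in BUF[0..LEN) at or after *POS and advance
   *POS past it.  Returns 1 and fills TOK on success, 0 at end of
   input.  Parentheses are single-character tokens.  */
static int
toy_next_token (const char *buf, size_t len, size_t *pos,
                struct toy_token *tok)
{
  assert (buf != NULL && pos != NULL && tok != NULL);

  size_t i = *pos;
  while (i < len && toy_is_space (buf[i]))
    i++;
  if (i == len)
    {
      *pos = i;
      return 0;
    }

  size_t start = i;
  if (buf[i] == '(' || buf[i] == ')')
    i++;
  else
    while (i < len && !toy_is_delim (buf[i]))
      i++;

  tok->start = buf + start;
  tok->end = buf + i;
  *pos = i;
  return 1;
}
```

The trade-off is that tokens are only valid while the buffer stays mapped, so anything that must outlive the read (interned symbol names, string contents) still has to be copied at some point.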
* Re: Skipping unexec via a big .elc file
From: Daniel Colascione @ 2016-10-30 16:52 UTC
To: Ken Raeburn, Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

On 10/30/2016 07:43 AM, Ken Raeburn wrote:
>
> I wrote:
>> Patched substitute_object_recurse after the check for the subtree matching the placeholder, so that if the subtree passed was a symbol or number, it would simply be returned without consulting seen_list or read_objects. Run time is now 0.7s; Fassq is a bit over 50% of that, and Frassq about 17%, and _IO_getc around 11%. I think it should be safe to short-circuit it for some other types as well.
>>
>> I had my getc_unlocked change sitting around so I pulled that in. Run time is now 0.6s, with Fassq at 57% and Frassq at 18%.
>>
>> Next on the profiling chart is oblookup, but it’s only at 4% so I’m going to ignore OBARRAY_SIZE for now. However, OBARRAY_SIZE could affect the order of atoms in processing, which could drastically rearrange the ordering of the data structures in dumped.elc.
>>
>> I think the next step is to look at replacing read_objects, probably with a pair of hash tables, but it’s getting a bit late for trying that tonight.
>
> I switched over to a pair of hash tables and the run time is just under 0.2s on my test machine now. Profiling reports are now topped by read1, readchar, and readbyte_from_file (now including the expanded getc_unlocked calls), accounting for about 30% of the CPU time between them. The hash functions and substitute_object_recurse are not taking a significant amount of time.
>
> I took a look at the types of shared data in one of the generated dumped.elc files I got; almost 2700 were strings (all without text properties), almost 1900 were cons cells, and the rest numbered under 300. So I’m not sure special-casing other types besides Lisp_Cons in read1 will gain us much.
>
> It took me a while to sort through the lookups being done during and after parsing of an object and how the checks for circular objects work, but I think I’ve got it working. I’ve pushed a scratch branch over with the changes if you’d like to try them, though I think I botched the git push syntax when trying to create “scratch/raeburn/startup” somehow, so I created “scratch/raeburn-startup”… or possibly I’ve created both? I saw an email notification go out for both, but I only see the latter in the repository browser interface…
>
> Ken

Awesome! Even if we go for something besides a big elc file for startup, these improvements will help.
* Re: Skipping unexec via a big .elc file 2016-10-30 14:43 ` Ken Raeburn 2016-10-30 15:31 ` Simon Leinen 2016-10-30 16:52 ` Daniel Colascione @ 2016-10-31 14:27 ` Stefan Monnier 2016-11-02 7:36 ` Ken Raeburn 2 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-31 14:27 UTC (permalink / raw) To: emacs-devel > I switched over to a pair of hash tables and the run time is just under 0.2s > on my test machine now. Profiling reports are now topped by read1, > readchar, and readbyte_from_file (now including the expanded getc_unlocked > calls), accounting for about 30% of the CPU time between them. The hash > functions and substitute_object_recurse are not taking a significant amount > of time. BTW, I don't know if you've tried to make that dumped file work correctly, but in case you haven't here's my latest attempt. It mostly works, tho there are still issues such as the fact that the global-font-lock-mode still fails to be properly enabled. Stefan diff --git a/lisp/emacs-lisp/macroexp.el b/lisp/emacs-lisp/macroexp.el index 310ca29..9ca53eb 100644 --- a/lisp/emacs-lisp/macroexp.el +++ b/lisp/emacs-lisp/macroexp.el @@ -439,7 +439,8 @@ macroexp--const-symbol-p (or (memq symbol '(nil t)) (keywordp symbol) (if any-value - (or (memq symbol byte-compile-const-variables) + (or (and (boundp 'byte-compile-const-variables) + (memq symbol byte-compile-const-variables)) ;; FIXME: We should provide a less intrusive way to find out ;; if a variable is "constant". 
(and (boundp symbol) diff --git a/lisp/international/mule.el b/lisp/international/mule.el index 21ab7e1..bb4808b 100644 --- a/lisp/international/mule.el +++ b/lisp/international/mule.el @@ -290,7 +290,7 @@ define-charset elt)) props)) (setcdr (assq :plist attrs) props) - + (put name 'internal--charset-args (mapcar #'cdr attrs)) (apply 'define-charset-internal name (mapcar 'cdr attrs)))) @@ -911,6 +911,8 @@ define-coding-system (cons :name (cons name (cons :docstring (cons (purecopy docstring) props))))) (setcdr (assq :plist common-attrs) props) + (put name 'internal--cs-args + (mapcar #'cdr (append common-attrs spec-attrs))) (apply 'define-coding-system-internal name (mapcar 'cdr (append common-attrs spec-attrs))))) diff --git a/lisp/loadup.el b/lisp/loadup.el index 21c64a8..5967334 100644 --- a/lisp/loadup.el +++ b/lisp/loadup.el @@ -1,4 +1,4 @@ -;;; loadup.el --- load up standardly loaded Lisp files for Emacs +;;; loadup.el --- load up standardly loaded Lisp files for Emacs -*- lexical-binding:t -*- ;; Copyright (C) 1985-1986, 1992, 1994, 2001-2016 Free Software ;; Foundation, Inc. @@ -461,6 +461,150 @@ invocation-directory) (expand-file-name name invocation-directory) t))) + (message "Dumping into dumped.elc...preparing...") + + ;; Dump the current state into a file so we can reload it! + (message "Dumping into dumped.elc...generating...") + (let ((faces '()) + (coding-systems '()) (coding-system-aliases '()) + (charsets '()) (charset-aliases '()) + (cmds '())) + (setcdr global-buffers-menu-map nil) ;; Get rid of buffer objects! + (mapatoms + (lambda (s) + (when (fboundp s) + (if (subrp (symbol-function s)) + ;; subr objects aren't readable! + (unless (equal (symbol-name s) (subr-name (symbol-function s))) + (push `(fset ',s (symbol-function ',(intern (subr-name (symbol-function s))))) cmds)) + (if (memq s '(rename-buffer)) + ;; FIXME: We need these, but they contain + ;; unprintable objects. 
+ nil + (push `(fset ',s ,(macroexp-quote (symbol-function s))) + cmds)))) + (when (and (boundp s) + (not (macroexp--const-symbol-p s 'any-value)) + ;; I think we don't need/want these! + (not (memq s '(terminal-frame obarray + initial-window-system window-system + ;; custom-delayed-init-variables + exec-path + process-environment + command-line-args noninteractive)))) + ;; FIXME: Handle varaliases! + (let ((v (symbol-value s))) + (push `(set-default + ',s + ,(cond + ;; FIXME: (Correct) hack to avoid + ;; unprintable objects. + ((eq s 'undo-auto--undoably-changed-buffers) nil) + ;; FIXME: Incorrect hack to avoid + ;; unprintable objects. + ((eq s 'advertised-signature-table) + (make-hash-table :test 'eq :weakness 'key)) + ((subrp v) + `(symbol-function ',(intern (subr-name v)))) + ((and (markerp v) (null (marker-buffer v))) + '(make-marker)) + ((and (overlayp v) (null (overlay-buffer v))) + '(let ((ol (make-overlay (point-min) (point-min)))) + (delete-overlay ol) + ol)) + (v (macroexp-quote v)))) + cmds) + (push `(defvar ,s) cmds))) + (when (symbol-plist s) + (push `(setplist ',s ',(symbol-plist s)) cmds)) + (when (get s 'face-defface-spec) + (push s faces)) + (if (get s 'internal--cs-args) + (push s coding-systems)) + (when (and (coding-system-p s) + (not (eq s (car (coding-system-aliases s))))) + (push (cons s (car (coding-system-aliases s))) + coding-system-aliases)) + (if (get s 'internal--charset-args) + (push s charsets) + (when (and (charsetp s) + (not (eq s (get-charset-property s :name)))) + (push (cons s (get-charset-property s :name)) + charset-aliases)))) + obarray) + (message "Dumping into dumped.elc...printing...") + (with-current-buffer (generate-new-buffer "dumped.elc") + (insert ";ELC\^W\^@\^@\^@\n;;; Compiled\n;;; in Emacs version " + emacs-version "\n") + (let ((print-circle t) + (print-gensym t) + (print-quoted t) + (print-level nil) + (print-length nil) + (print-escape-newlines t) + (standard-output (current-buffer))) + (print `(progn . 
,cmds)) + (terpri) + (print `(let ((css ',charsets)) + (dotimes (i 3) + (dolist (cs (prog1 css (setq css nil))) + ;; (message "Defining charset %S..." cs) + (condition-case nil + (progn + (apply #'define-charset-internal + cs (get cs 'internal--charset-args)) + ;; (message "Defining charset %S...done" cs) + ) + (error + ;; (message "Defining charset %S...postponed" + ;; cs) + (push cs css))))))) + (terpri) + (print `(dolist (cs ',charset-aliases) + (define-charset-alias (car cs) (cdr cs)))) + (terpri) + (print `(let ((css ',coding-systems)) + (dotimes (i 3) + (dolist (cs (prog1 css (setq css nil))) + ;; (message "Defining coding-system %S..." cs) + (condition-case nil + (progn + (apply #'define-coding-system-internal + cs (get cs 'internal--cs-args)) + ;; (message "Defining coding-system %S...done" cs) + ) + (error + ;; (message "Defining coding-system %S...postponed" + ;; cs) + (push cs css))))))) + (print `(dolist (f ',faces) + (face-spec-set f (get f 'face-defface-spec) + 'face-defface-spec))) + (terpri) + (print `(dolist (cs ',coding-system-aliases) + (define-coding-system-alias (car cs) (cdr cs)))) + (terpri) + (print `(progn + ;; (message "Done preloading!") + ;; (message "custom-delayed-init-variables = %S" + ;; custom-delayed-init-variables) + ;; (message "Running top-level = %S" top-level) + (setq debug-on-error t) + (use-global-map global-map) + (eval top-level) + ;; (message "top-level done!?") + )) + (terpri)) + (goto-char (point-min)) + (while (re-search-forward " (\\(defvar\\|setplist\\|fset\\) " nil t) + (goto-char (match-beginning 0)) + (delete-char 1) (insert "\n")) + (message "Dumping into dumped.elc...saving...") + (let ((coding-system-for-write 'emacs-internal)) + (write-region (point-min) (point-max) (buffer-name))) + (message "Dumping into dumped.elc...done") + )) + (kill-emacs))) ;; For machines with CANNOT_DUMP defined in config.h, diff --git a/src/coding.c b/src/coding.c index 9f709be..a677758 100644 --- a/src/coding.c +++ b/src/coding.c @@ 
-10326,8 +10326,9 @@ usage: (define-coding-system-internal ...) */) CHECK_NUMBER_CAR (reg_usage); CHECK_NUMBER_CDR (reg_usage); - request = Fcopy_sequence (args[coding_arg_iso2022_request]); - for (tail = request; CONSP (tail); tail = XCDR (tail)) + request = Qnil; + for (tail = args[coding_arg_iso2022_request]; + CONSP (tail); tail = XCDR (tail)) { int id; Lisp_Object tmp1; @@ -10339,7 +10340,8 @@ usage: (define-coding-system-internal ...) */) CHECK_NATNUM_CDR (val); if (XINT (XCDR (val)) >= 4) error ("Invalid graphic register number: %"pI"d", XINT (XCDR (val))); - XSETCAR (val, make_number (id)); + request = Fcons (Fcons (make_number (id), XCDR (val)), + request); } flags = args[coding_arg_iso2022_flags]; diff --git a/src/emacs.c b/src/emacs.c index 2480dfc..bdf3742 100644 --- a/src/emacs.c +++ b/src/emacs.c @@ -1593,9 +1593,9 @@ Using an Emacs configured with --with-x-toolkit=lucid does not have this problem #endif Vtop_level = list2 (Qload, build_unibyte_string (file)); } - /* Unless next switch is -nl, load "loadup.el" first thing. */ - if (! no_loadup) - Vtop_level = list2 (Qload, build_string ("loadup.el")); + else if (! no_loadup) + /* Unless next switch is -nl, load "loadup.el" first thing. */ + Vtop_level = list2 (Qload, build_string ("../src/dumped.elc")); } /* Set up for profiling. This is known to work on FreeBSD, ^ permalink raw reply related [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-11-02 7:36 UTC
To: Stefan Monnier; +Cc: emacs-devel

On Oct 31, 2016, at 10:27, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

>> I switched over to a pair of hash tables and the run time is just under 0.2s
>> on my test machine now. Profiling reports are now topped by read1,
>> readchar, and readbyte_from_file (now including the expanded getc_unlocked
>> calls), accounting for about 30% of the CPU time between them. The hash
>> functions and substitute_object_recurse are not taking a significant amount
>> of time.
>
> BTW, I don't know if you've tried to make that dumped file work
> correctly, but in case you haven't here's my latest attempt.

Thanks! Looks like you’ve refined the handling of faces and other attributes.

Have you tried it out in batch mode? I’m getting a crash in realize_face with a null cache pointer.

Ken
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2016-11-02 12:17 UTC
To: emacs-devel

> Thanks! Looks like you’ve refined the handling of faces and other
> attributes. Have you tried it out in batch mode?

No, haven't gotten that far.

Stefan
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2016-11-02 12:22 UTC
To: emacs-devel

BTW, it might be worth comparing the behavior with the one we get with the "normal" temacs (i.e. by loading loadup.el instead of dumped.elc), as well as with what we get with CANNOT_DUMP: from what I remember the current code doesn't handle CANNOT_DUMP 100% correctly (which is OK so far because CANNOT_DUMP is only ever used temporarily during porting until unexec is working).

IOW some of the problems we may encounter could be unrelated to what we do w.r.t. dumped.elc.

Stefan
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-11-03 5:37 UTC
To: Stefan Monnier; +Cc: emacs-devel

On Nov 2, 2016, at 08:22, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> BTW, it might be worth comparing the behavior with the one we get with
> the "normal" temacs (i.e. by loading loadup.el instead of dumped.elc),
> as well as with what we get with CANNOT_DUMP: from what I remember the
> current code doesn't handle CANNOT_DUMP 100% correctly (which is OK so
> far because CANNOT_DUMP is only ever used temporarily during porting
> until unexec is working).

…which is why it seems like I have to keep fixing bugs every time I try to use it.

If CANNOT_DUMP mode worked reliably, I’d think it would be the logical starting point for this work — it’s already compiling with the expectation that we’ll be loading the Lisp code when the user starts it up rather than preparing for unexec and a second invocation of main(). Changing the thing we load at startup from loadup.el to dumped.elc should be pretty minor.

I’m trying CANNOT_DUMP out right now. As soon as I’ve got it bootstrapping again, I’ll try pulling in the other changes.

> IOW some of the problems we may encounter could be unrelated to what we
> do w.r.t dumped.elc.

True.
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-12-11 13:34 UTC
To: Emacs developers

I’ve pushed an update to the scratch/raeburn-startup branch. It includes several updates:

* Stefan’s Oct 31 patch instead of his earlier one. This does more reinitializing of charsets, coding systems, etc., which I believe were absent from the previous version.

* More patches to the recursive object substitution pass done during reading. The big costs on Mac OS X seem to differ from my Linux/GNU/X11 build — there’s a much larger dumped.elc file, and an entirely different compiler — but I’ve managed to trim the run time there a bit.

* Changed gc-cons-threshold to be much larger. By itself, this isn’t a good change. But we’d exceed the old value many times over just reading the big “progn” form; this way my Linux/GNU/X11 run doesn’t trigger GC during startup, though I think the Mac version still does. I think a better strategy might try to defer or discourage GC during startup, and do it instead when we have idle cycles while the user isn’t trying to get something done. But revamping the GC strategy is a different discussion.

* Larger obarray. After startup, my Linux/GNU/X11 build has over 15k symbols, and my Mac build has over 21k. The old obarray size of 1511 meant average chain lengths of over 10 and 14. Shorter chains mean less time spent in oblookup. And extra slots are cheap.

* Open-code reading ASCII symbol characters from a file in read1(). The hot path involved examining readcharfun to determine its type, comparing it against some known symbols, selecting a function to call, having that function check to see if we’re doing pushback instead of actually reading, blocking input, doing the actual getc() call, and unblocking input — all for each character. The new version duplicates a bunch of code, but once it sees we’re reading from a file, skips most of that for the common path through the inner loop. This cut maybe 10% off of some of my run times.

With all these changes — Stefan’s new patch with additional initialization, and my updates to shave a little more time off — I’m still hitting just under 0.2s for:

  time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'

on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel Core 2 Duo, 2.8GHz) takes over half a second (including at least one GC invocation).

It can be tested by running “temacs” after building it. The lisp load path will be set based on the source tree, not the installation prefix. If “-nl” and “-l” arguments are not given, it’ll load “../src/dumped.elc”, but that’s interpreted relative to the lisp *source* directory. If you build in a directory other than the checked-out tree (i.e., $srcdir is not “.”) as I do, you’ll need to copy dumped.elc from the src directory of the build tree where it’s generated to the src directory of the source tree where it’s sought.

If dumped.elc isn’t found, temacs will exit with status 42. Under Stefan’s version, an X11 run would spit out a message saying the file wasn’t found and exit, but a tty run would get into a loop complaining about internal-echo-keystrokes-prefix and would need to be killed from another terminal. This way, it only kind of sucks, equally in both cases. :-)

The remaining time still seems to be about 2/3 reading and parsing bytes, allocating objects, and updating (mostly scanning) the obarray. There should be a bit more time that can be squeezed out.

Ken
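The obarray chain-length figures in the message above follow from simple arithmetic, sketched here. The helper names are hypothetical; the symbol counts used in the comments are the lower bounds quoted in the message (“over 15k”, “over 21k”), and the comparison estimate assumes symbols hash uniformly across buckets, which is an idealization.

```c
#include <assert.h>

/* Average chain length (load factor) of a chained hash table with
   SYMBOLS entries spread over BUCKETS slots.  */
static double
toy_avg_chain (unsigned long symbols, unsigned long buckets)
{
  assert (buckets > 0);
  return (double) symbols / (double) buckets;
}

/* Rough expected number of entries compared during a successful
   lookup under uniform hashing: about 1 + load / 2.  */
static double
toy_expected_compares (unsigned long symbols, unsigned long buckets)
{
  return 1.0 + toy_avg_chain (symbols, buckets) / 2.0;
}

/* With the old size of 1511 buckets:
     toy_avg_chain (15000, 1511) is about 9.9
     toy_avg_chain (21000, 1511) is about 13.9
   so symbol counts somewhat above 15k and 21k give the "over 10 and
   14" quoted in the message.  */
```

Growing the bucket count by roughly a factor of ten would bring the load factor near 1, at the cost of one extra pointer-sized slot per bucket, which is why the message calls extra slots cheap.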
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2016-12-11 15:42 UTC
To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 11 Dec 2016 08:34:01 -0500
>
> I’ve pushed an update to the scratch/raeburn-startup branch. It includes several updates:

Thank you for your work, I will definitely try to check it out soon.
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2016-12-24 11:06 UTC
To: Eli Zaretskii; +Cc: raeburn, emacs-devel

> Date: Sun, 11 Dec 2016 17:42:21 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> > From: Ken Raeburn <raeburn@raeburn.org>
> > Date: Sun, 11 Dec 2016 08:34:01 -0500
> >
> > I’ve pushed an update to the scratch/raeburn-startup branch. It includes several updates:
>
> Thank you for your work, I will definitely try to check it out soon.

I took a quick look. There are a few issues with the Windows build there, but I have one question to which I'd like to know the answer first: why are we still dumping temacs to emacs, instead of loading dumped.elc into a bare emacs?
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2016-12-25 15:46 UTC
To: emacs-devel

> I took a quick look. There are a few issues with the Windows build
> there, but I have one question to which I'd like to know the answer
> first: why are we still dumping temacs to emacs, instead of loading
> dumped.elc into a bare emacs?

I don't know if Ken has another reason, but in my case it's simply because I haven't bothered to change the code so as not to call the unexec dump.

Stefan
* Re: Skipping unexec via a big .elc file
From: Richard Stallman @ 2016-12-11 19:18 UTC
To: Ken Raeburn; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> * Changed gc-cons-threshold to be much larger.

How about binding it to a higher value for loadup?

--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-12-15 12:57 UTC
To: rms; +Cc: emacs-devel

> On Dec 11, 2016, at 14:18, Richard Stallman <rms@gnu.org> wrote:
>
> [[[ To any NSA and FBI agents reading my email: please consider ]]]
> [[[ whether defending the US Constitution against all enemies, ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>> * Changed gc-cons-threshold to be much larger.
>
> How about binding it to a higher value for loadup?

That may be good enough. But GC will probably kick in right after we set it back, so probably most methods we might try for measuring startup time will incur the cost of at least one GC pass, and it’ll happen when the user starts Emacs in real life.

I guess one question is, how much does it matter? It’s only a fraction of a second, but I’m trying to shave a startup time of 0.2s (or 0.6s on Mac OS X) down closer to 0.1s, so a fraction of a second can make a difference.
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2016-12-15 16:04 UTC
To: Ken Raeburn; +Cc: rms, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Thu, 15 Dec 2016 07:57:09 -0500
> Cc: emacs-devel@gnu.org
>
> > How about binding it to a higher value for loadup?
>
> That may be good enough. But GC will probably kick in right after we set it back

AFAIK, just setting the GC threshold doesn't automatically invoke GC; you need to do something that calls maybe_gc.
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-12-15 16:26 UTC
To: Eli Zaretskii; +Cc: rms, emacs-devel

> On Dec 15, 2016, at 11:04, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Ken Raeburn <raeburn@raeburn.org>
>> Date: Thu, 15 Dec 2016 07:57:09 -0500
>> Cc: emacs-devel@gnu.org
>>
>>> How about binding it to a higher value for loadup?
>>
>> That may be good enough. But GC will probably kick in right after we set it back
>
> AFAIK, just setting the GC threshold doesn't automatically invoke GC,
> you need do something that calls maybe_gc.

Right, but if we’re not following it up with evaluating another form from dumped.elc (eval_sub can invoke GC) or invoking some compiled routine (branch operations can invoke GC), then we’re probably ready to check for availability of user input (which can invoke GC).
* Re: Skipping unexec via a big .elc file
From: Richard Stallman @ 2016-12-11 19:18 UTC
To: Ken Raeburn; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> * Larger obarray. After startup, my Linux/GNU/X11 build has over
> * 15k symbols, and my Mac build has over 21k. The old obarray
> * size of 1511 meant average chain lengths of over 10 and 14.
> * Shorter chains mean less time spent in oblookup. And extra
> * slots are cheap.

This may be a good idea, but it has nothing to do with any particular method of startup or dumping. So how about doing it unconditionally?

--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-12-12 17:25 UTC
To: rms; +Cc: emacs-devel

> On Dec 11, 2016, at 14:18, Richard Stallman <rms@gnu.org> wrote:
>
> [[[ To any NSA and FBI agents reading my email: please consider ]]]
> [[[ whether defending the US Constitution against all enemies, ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>> * Larger obarray. After startup, my Linux/GNU/X11 build has over
>> * 15k symbols, and my Mac build has over 21k. The old obarray
>> * size of 1511 meant average chain lengths of over 10 and 14.
>> * Shorter chains mean less time spent in oblookup. And extra
>> * slots are cheap.
>
> This may be a good idea, but it has nothing to do with any particular
> method of startup or dumping. So how about doing it unconditionally?

A few of the changes on this branch would probably improve speed at least a tiny bit regardless of the startup method. This one also has the advantage of being a trivial change with, as far as I can see, no down side, so, yeah….

Ken
* Re: Skipping unexec via a big .elc file
From: Ken Brown @ 2016-12-13 15:21 UTC
To: Ken Raeburn, Emacs developers

On 12/11/2016 8:34 AM, Ken Raeburn wrote:
> I’ve pushed an update to the scratch/raeburn-startup branch.

Did you actually push these changes? The last commit I see at http://git.savannah.gnu.org/cgit/emacs.git/log/?h=scratch/raeburn-startup is dated 2016-10-30.

Ken
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-12-14 5:30 UTC
To: Ken Brown; +Cc: Emacs developers

> On Dec 13, 2016, at 10:21, Ken Brown <kbrown@cornell.edu> wrote:
>
> On 12/11/2016 8:34 AM, Ken Raeburn wrote:
>> I’ve pushed an update to the scratch/raeburn-startup branch.
>
> Did you actually push these changes? The last commit I see at http://git.savannah.gnu.org/cgit/emacs.git/log/?h=scratch/raeburn-startup is dated 2016-10-30.
>
> Ken

Strange, I’m not sure what happened. I’ll push it again. Just as well, I’ve got a couple minor updates to add anyway.

Ken
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2016-12-14 5:45 UTC
To: Ken Brown; +Cc: Emacs developers

> On Dec 14, 2016, at 00:30, Ken Raeburn <raeburn@raeburn.org> wrote:
>
>
>> On Dec 13, 2016, at 10:21, Ken Brown <kbrown@cornell.edu> wrote:
>>
>> On 12/11/2016 8:34 AM, Ken Raeburn wrote:
>>> I’ve pushed an update to the scratch/raeburn-startup branch.
>>
>> Did you actually push these changes? The last commit I see at http://git.savannah.gnu.org/cgit/emacs.git/log/?h=scratch/raeburn-startup is dated 2016-10-30.
>>
>> Ken
>
> Strange, I’m not sure what happened. I’ll push it again. Just as well, I’ve got a couple minor updates to add anyway.

I must have overlooked it the first time, but “git push -f” is being rejected with:

  remote: error: denying non-fast-forward refs/heads/scratch/raeburn-startup (you should pull first)

Ken
* Re: Skipping unexec via a big .elc file 2016-12-14 5:45 ` Ken Raeburn @ 2016-12-14 10:58 ` Phil Sainty 2016-12-14 12:06 ` Yuri Khan 2016-12-14 11:00 ` Lars Ingebrigtsen 2016-12-15 11:45 ` Ken Raeburn 2 siblings, 1 reply; 375+ messages in thread From: Phil Sainty @ 2016-12-14 10:58 UTC (permalink / raw) To: Ken Raeburn; +Cc: Emacs developers On 14/12/16 18:45, Ken Raeburn wrote: > I must have overlooked it the first time, but “git push -f” is being rejected with: > remote: error: denying non-fast-forward refs/heads/scratch/raeburn-startup (you should pull first) Which typically means that you've amended or rebased your local history since you last pushed it. Or potentially someone else has pushed to that branch in the interim. Your local branch and the remote branch have diverged, at any rate. If you *really really* want to push a revised history -- bearing in mind that it will cause merge issues for anyone who has already pulled from that branch (which is why you would usually refrain from doing such a thing, and why server-side protection like this exists in the first place) -- then you can generally work around the protection by deleting the upstream branch and then pushing your new version of it. Alternatively, you may need to rebase your local branch onto the upstream revision. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-14 10:58 ` Phil Sainty @ 2016-12-14 12:06 ` Yuri Khan 0 siblings, 0 replies; 375+ messages in thread From: Yuri Khan @ 2016-12-14 12:06 UTC (permalink / raw) To: Phil Sainty; +Cc: Ken Raeburn, Emacs developers On Wed, Dec 14, 2016 at 5:58 PM, Phil Sainty <psainty@orcon.net.nz> wrote: > If you *really really* want to push a revised history -- bearing > in mind that it will cause merge issues for anyone who has already > pulled from that branch (which is why you would usually refrain > from doing such a thing, and why server-side protection like this > exists in the first place) -- then you can generally work around > the protection by deleting the upstream branch and then pushing > your new version of it. Scratch branches are personal and others should exercise caution if/when pulling them. It is okay for the branch owner to delete and recreate it. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-14 5:45 ` Ken Raeburn 2016-12-14 10:58 ` Phil Sainty @ 2016-12-14 11:00 ` Lars Ingebrigtsen 2016-12-15 11:45 ` Ken Raeburn 2 siblings, 0 replies; 375+ messages in thread From: Lars Ingebrigtsen @ 2016-12-14 11:00 UTC (permalink / raw) To: Ken Raeburn; +Cc: Ken Brown, Emacs developers Ken Raeburn <raeburn@raeburn.org> writes: > I must have overlooked it the first time, but “git push -f” is being > rejected with: > > remote: error: denying non-fast-forward > refs/heads/scratch/raeburn-startup (you should pull first) Sounds like you need to say "git pull --rebase" before you push. -- (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-14 5:45 ` Ken Raeburn 2016-12-14 10:58 ` Phil Sainty 2016-12-14 11:00 ` Lars Ingebrigtsen @ 2016-12-15 11:45 ` Ken Raeburn 2016-12-15 17:28 ` Ken Raeburn 2016-12-16 14:22 ` Robert Pluim 2 siblings, 2 replies; 375+ messages in thread From: Ken Raeburn @ 2016-12-15 11:45 UTC (permalink / raw) To: Emacs developers Branch scratch/raeburn-startup deleted and re-pushed. In addition to the changes I mentioned earlier, I found an unnecessary memset in the face reinitialization code that could go, and an initialization form was being emitted that tried to incorporate the obarray by value (which wouldn’t work because the symbol chains don’t all get dumped); omitting the latter for now cuts the file size a percent or so. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-15 11:45 ` Ken Raeburn @ 2016-12-15 17:28 ` Ken Raeburn 2016-12-15 19:59 ` Eli Zaretskii 2016-12-16 14:22 ` Robert Pluim 1 sibling, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2016-12-15 17:28 UTC (permalink / raw) To: Emacs developers One area I’m contemplating now is whether we can trim the size of dumped.elc. Question: How useable does Emacs need to be if the Lisp code is improperly installed (or not installed) and can’t be loaded? With the big-elc approach as currently implemented, assuming we store dumped.elc in the install tree along with the other Lisp code, it basically can’t start without that Lisp library. If that’s okay, then the next question is: How much do we *need* to load before processing user input? My impression has been that the current loadup.el contents cover not just the bare minimum that Emacs absolutely needs to have to function, but also the popular things we want to have readily available without having to wait for a Lisp package to load (like buff-menu), especially if it’s in response to a simple keypress or mouse click. (The X support code is loaded in an X build, even if no X display is present at startup; I think parts of it fall into both categories.) And adding stuff is fairly cheap; loadup and unexec can take as long as they want, only the speed of relaunching the resulting executable affects the user. With the big-elc approach, the tradeoffs change. Reading a bunch of function definitions from dumped.elc is only a tiny bit faster than reading the same definitions from the original .elc files, because we don’t have to open more files. (At least, the cost is trivial if the files are in cache. It’ll be OS- and system-dependent.) Only the time for precomputation that gets done as the file is loaded is saved, and exchanged for the time needed to parse the saved result. 
If we can trim some stuff from loadup.el, and resort to autoloading that stuff later, that may save us some startup time. (Some text mode commands? Or buff-menu?) If we really want some of the other stuff to be able to run immediately when the user hits a key, maybe there’s some way to compromise between that and faster startup. A strawman proposal: load the “must-haves” via dumped.elc; load the user’s init file, read files and execute eval commands as indicated by the command line options; check for user input; if there’s no user input (e.g., use an idle timer set for 3s), start going through a list of “nice-to-haves” and loading them, continuing until user input is available. If the user starts typing or otherwise invoking some of those nice-to-have commands right away, they’ll have to wait while autoloading happens, but if we get a couple idle seconds, we may still pull the commands in before the user needs them. Of course, if the user types something while we’re loading a file, that file will have to finish loading before we can respond; it’s sort of a guessing game as to how much idle time suggests that the user is doing something else and probably won’t type anything in the next second or two. Perhaps we can divide the task further to keep any individual delay shorter: read a .elc file into a buffer, check for input, parse into S-expressions, check for input, eval the S-expressions… Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
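The strawman above (load the must-haves, then pull in the nice-to-haves only while the user is idle) can be sketched in a few lines of Lisp. This is only an illustration, assuming a 3-second idle timer as in the message; the names `my-nice-to-have-features`, `my-idle-load-timer` and `my-load-next-feature` are hypothetical and are not part of Emacs or of the scratch/raeburn-startup branch.

```elisp
;;; Sketch only: every `my-' name below is hypothetical.

(defvar my-nice-to-have-features '(buff-menu text-mode)
  "Hypothetical list of features to load once Emacs has been idle.")

(defvar my-idle-load-timer nil
  "Idle timer driving `my-load-next-feature', or nil when done.")

(defun my-load-next-feature ()
  "Load pending features one at a time, stopping as soon as input arrives."
  (while (and my-nice-to-have-features
              (not (input-pending-p)))
    (let ((feature (pop my-nice-to-have-features)))
      (unless (featurep feature)
        ;; NOERROR is non-nil, so a missing library does not signal.
        (require feature nil t))))
  ;; Nothing left to load: stop waking up on idle.
  (when (and (null my-nice-to-have-features) my-idle-load-timer)
    (cancel-timer my-idle-load-timer)
    (setq my-idle-load-timer nil)))

;; Run whenever Emacs has been idle for 3 seconds; repeat on later idle
;; periods until the list is empty.
(setq my-idle-load-timer
      (run-with-idle-timer 3 t #'my-load-next-feature))
```

Input is only checked between files here, so one large library can still delay a keystroke. The finer-grained split suggested in the message (read, parse, eval, with a check between each) would need support in the loader itself and is not shown.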
* Re: Skipping unexec via a big .elc file 2016-12-15 17:28 ` Ken Raeburn @ 2016-12-15 19:59 ` Eli Zaretskii 2016-12-15 22:07 ` Clément Pit--Claudel ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-12-15 19:59 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Thu, 15 Dec 2016 12:28:15 -0500 > > Question: How useable does Emacs need to be if the Lisp code is improperly installed (or not installed) and can’t be loaded? I had this same idea just the other day. We have auto-loading, so I went, so maybe just starting temacs and letting it load whatever it needs when it needs that would be good enough? Just to see what would we be up against, I ran ./temacs -Q -nl and sure thing, it errored out right away because some C code called Lisp which wasn't loaded yet. What's more, auto-loading doesn't work for preloaded packages, because we have code in autoload.el to skip/ignore autoload cookies in files mentioned in loadup.el. So my next idea would be to come up with a smaller loadup.el which only loads the stuff that is needed for temacs to start. I didn't try that yet, but I did think that Phillip's work on ldefs-boot might just be a good starting point: those ldefs-boot-*.el files might be just what we need. IMO, it would be interesting to see where this will take us, and what kind of performance could that produce. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-15 19:59 ` Eli Zaretskii @ 2016-12-15 22:07 ` Clément Pit--Claudel 2016-12-16 7:54 ` Eli Zaretskii 2016-12-16 7:56 ` Eli Zaretskii 2016-12-19 15:09 ` Phillip Lord 2 siblings, 1 reply; 375+ messages in thread From: Clément Pit--Claudel @ 2016-12-15 22:07 UTC (permalink / raw) To: emacs-devel On 2016-12-15 14:59, Eli Zaretskii wrote: > IMO, it would be interesting to see where this will take us, and what > kind of performance could that produce. This sounds like a good idea; I wonder how much it will break, though. Many external packages don't (require) preloaded packages (some preloaded packages don't or used to not export a (provide), in fact), which may cause issues if these packages aren't preloaded anymore. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-15 22:07 ` Clément Pit--Claudel @ 2016-12-16 7:54 ` Eli Zaretskii 2016-12-16 14:28 ` Clément Pit--Claudel 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-12-16 7:54 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel > From: Clément Pit--Claudel <clement.pit@gmail.com> > Date: Thu, 15 Dec 2016 17:07:50 -0500 > > On 2016-12-15 14:59, Eli Zaretskii wrote: > > IMO, it would be interesting to see where this will take us, and what > > kind of performance could that produce. > > This sounds like a good idea; I wonder how much it will break, though. Many external packages don't (require) preloaded packages (some preloaded packages don't or used to not export a (provide), in fact), which may cause issues if these packages aren't preloaded anymore. Autoloading should fix that. This idea won't work anyway without adding the relevant symbols to loaddefs.el. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 7:54 ` Eli Zaretskii @ 2016-12-16 14:28 ` Clément Pit--Claudel 2016-12-16 14:39 ` Eli Zaretskii 2016-12-19 15:11 ` Phillip Lord 0 siblings, 2 replies; 375+ messages in thread From: Clément Pit--Claudel @ 2016-12-16 14:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 2016-12-16 02:54, Eli Zaretskii wrote: >> From: Clément Pit--Claudel <clement.pit@gmail.com> >> Date: Thu, 15 Dec 2016 17:07:50 -0500 >> >> On 2016-12-15 14:59, Eli Zaretskii wrote: >>> IMO, it would be interesting to see where this will take us, and what >>> kind of performance could that produce. >> >> This sounds like a good idea; I wonder how much it will break, though. Many external packages don't (require) preloaded packages (some preloaded packages don't or used to not export a (provide), in fact), which may cause issues if these packages aren't preloaded anymore. > > Autoloading should fix that. This idea won't work anyway without > adding the relevant symbols to loaddefs.el. Indeed; but then we need to autoload all functions in these files, right? Also, does autoloading work for macros? And would there not be potential issues with variables and/or defcustoms? (I can't think of any) Clément. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 14:28 ` Clément Pit--Claudel @ 2016-12-16 14:39 ` Eli Zaretskii 2016-12-16 15:28 ` Clément Pit--Claudel 2016-12-17 14:56 ` Stefan Monnier 2016-12-19 15:11 ` Phillip Lord 1 sibling, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-12-16 14:39 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Clément Pit--Claudel <clement.pit@gmail.com> > Date: Fri, 16 Dec 2016 09:28:37 -0500 > > > Autoloading should fix that. This idea won't work anyway without > > adding the relevant symbols to loaddefs.el. > > Indeed; but then we need to autoload all functions in these files, right? Not sure about "all", but most of them, yes. And variables. > Also, does autoloading work for macros? It doesn't, but why would macros be a problem? They need to be seen by the byte compiler when it compiles the package, not when the package is loaded. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 14:39 ` Eli Zaretskii @ 2016-12-16 15:28 ` Clément Pit--Claudel 2016-12-16 21:27 ` Eli Zaretskii 2016-12-17 14:56 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Clément Pit--Claudel @ 2016-12-16 15:28 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 2016-12-16 09:39, Eli Zaretskii wrote: >> Cc: emacs-devel@gnu.org >> From: Clément Pit--Claudel <clement.pit@gmail.com> >> Date: Fri, 16 Dec 2016 09:28:37 -0500 >> >>> Autoloading should fix that. This idea won't work anyway without >>> adding the relevant symbols to loaddefs.el. >> >> Indeed; but then we need to autoload all functions in these files, right? > > Not sure about "all", but most of them, yes. And variables. > >> Also, does autoloading work for macros? > > It doesn't, but why would macros be a problem? They need to be seen > by the byte compiler when it compiles the package, not when the > package is loaded. Right; but won't we have a problem when package.el compiles newly downloaded packages that depend on formerly autoloaded libraries? Clément. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 15:28 ` Clément Pit--Claudel @ 2016-12-16 21:27 ` Eli Zaretskii 2016-12-16 21:38 ` Noam Postavsky 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-12-16 21:27 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: emacs-devel > Cc: emacs-devel@gnu.org > From: Clément Pit--Claudel <clement.pit@gmail.com> > Date: Fri, 16 Dec 2016 10:28:22 -0500 > > >> Also, does autoloading work for macros? > > > > It doesn't, but why would macros be a problem? They need to be seen > > by the byte compiler when it compiles the package, not when the > > package is loaded. > > Right; but won't we have a problem when package.el compiles newly downloaded packages that depend on formerly autoloaded libraries? I don't know. Maybe. Determining this would be part of the job of exploring this alternative. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 21:27 ` Eli Zaretskii @ 2016-12-16 21:38 ` Noam Postavsky 0 siblings, 0 replies; 375+ messages in thread From: Noam Postavsky @ 2016-12-16 21:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Clément Pit--Claudel, Emacs developers On Fri, Dec 16, 2016 at 4:27 PM, Eli Zaretskii <eliz@gnu.org> wrote: >> Cc: emacs-devel@gnu.org >> From: Clément Pit--Claudel <clement.pit@gmail.com> >> Date: Fri, 16 Dec 2016 10:28:22 -0500 >> >> >> Also, does autoloading work for macros? >> > >> > It doesn't, but why would macros be a problem? They need to be seen >> > by the byte compiler when it compiles the package, not when the >> > package is loaded. Why do we think autoloading doesn't work for macros? autoload is a built-in function in `C source code'. (autoload FUNCTION FILE &optional DOCSTRING INTERACTIVE TYPE) [...] Fifth arg TYPE indicates the type of the object: [...] `macro' or t says FUNCTION is really a macro. >> >> Right; but won't we have a problem when package.el compiles newly downloaded packages that depend on formerly autoloaded libraries? > > I don't know. Maybe. Determining this would be part of the job of > exploring this alternative. > ^ permalink raw reply [flat|nested] 375+ messages in thread
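To make the docstring quoted above concrete, here is a minimal sketch of declaring a macro as autoloaded through the fifth (TYPE) argument. The macro name `my-with-thing` and the library name "my-thing" are made up for illustration; nothing with these names exists in Emacs.

```elisp
;; Hypothetical names: `my-with-thing' and the library "my-thing".
;; The fifth argument `macro' tells Emacs (and the byte compiler) that
;; the definition found in "my-thing" will be a macro, not a function.
(autoload 'my-with-thing "my-thing"
  "Run BODY with a thing set up." nil 'macro)

;; Until the library is actually loaded, the symbol's function cell
;; holds an autoload object rather than a real definition.
(autoloadp (symbol-function 'my-with-thing))
```

The library is only looked for when the macro is first expanded, so the declaration itself works even though "my-thing" does not exist. Whether every preloaded macro can be handled this way when package.el compiles third-party code is the open question raised in the thread, and this sketch does not settle it.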
* Re: Skipping unexec via a big .elc file 2016-12-16 14:39 ` Eli Zaretskii 2016-12-16 15:28 ` Clément Pit--Claudel @ 2016-12-17 14:56 ` Stefan Monnier 1 sibling, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-12-17 14:56 UTC (permalink / raw) To: emacs-devel >> Also, does autoloading work for macros? > It doesn't, but why would macros be a problem? Huh? Autoloading works fine for macros. Or maybe I'm misunderstanding Clement's question. The problem of autoloading is with variables, coding systems, faces, ... This said, I don't see any strong reason why we couldn't arrange to autoload coding systems. Maybe we should instrument the Emacs code to mark functions that are called during a normal "start Emacs opening a file in fundamental mode". Then we can look at the preloaded functions&macros which haven't been used. This should give us a good idea of how much there is to gain on this front. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 14:28 ` Clément Pit--Claudel 2016-12-16 14:39 ` Eli Zaretskii @ 2016-12-19 15:11 ` Phillip Lord 1 sibling, 0 replies; 375+ messages in thread From: Phillip Lord @ 2016-12-19 15:11 UTC (permalink / raw) To: Clément Pit--Claudel; +Cc: Eli Zaretskii, emacs-devel Clément Pit--Claudel <clement.pit@gmail.com> writes: > On 2016-12-16 02:54, Eli Zaretskii wrote: >>> From: Clément Pit--Claudel <clement.pit@gmail.com> >>> Date: Thu, 15 Dec 2016 17:07:50 -0500 >>> >>> On 2016-12-15 14:59, Eli Zaretskii wrote: >>>> IMO, it would be interesting to see where this will take us, and what >>>> kind of performance could that produce. >>> >>> This sounds like a good idea; I wonder how much it will break, though. >>> Many external packages don't (require) preloaded packages (some preloaded >>> packages don't or used to not export a (provide), in fact), which may cause >>> issues if these packages aren't preloaded anymore. >> >> Autoloading should fix that. This idea won't work anyway without >> adding the relevant symbols to loaddefs.el. > > Indeed; but then we need to autoload all functions in these files, right? Unfortunately, there is a fairly random aspect to it. You need to autoload the first function in the file that actually gets called. Once the file is actually loaded, autoloads don't matter any more. Obviously, though, which the first function actually is may change. Phil ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-15 19:59 ` Eli Zaretskii 2016-12-15 22:07 ` Clément Pit--Claudel @ 2016-12-16 7:56 ` Eli Zaretskii 2016-12-19 15:15 ` Phillip Lord 2016-12-19 15:09 ` Phillip Lord 2 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-12-16 7:56 UTC (permalink / raw) To: raeburn; +Cc: emacs-devel > Date: Thu, 15 Dec 2016 21:59:08 +0200 > From: Eli Zaretskii <eliz@gnu.org> > Cc: emacs-devel@gnu.org > > So my next idea would be to come up with a smaller loadup.el which > only loads the stuff that is needed for temacs to start. I didn't try > that yet, but I did think that Phillip's work on ldefs-boot might just > be a good starting point: those ldefs-boot-*.el files might be just > what we need. Oh, and one more thought: there could be a separate, smaller loadup file for batch invocations, since speed of startup in that mode is somewhat more important than in the interactive mode. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-16 7:56 ` Eli Zaretskii @ 2016-12-19 15:15 ` Phillip Lord 0 siblings, 0 replies; 375+ messages in thread From: Phillip Lord @ 2016-12-19 15:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: raeburn, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Date: Thu, 15 Dec 2016 21:59:08 +0200 >> From: Eli Zaretskii <eliz@gnu.org> >> Cc: emacs-devel@gnu.org >> >> So my next idea would be to come up with a smaller loadup.el which >> only loads the stuff that is needed for temacs to start. I didn't try >> that yet, but I did think that Phillip's work on ldefs-boot might just >> be a good starting point: those ldefs-boot-*.el files might be just >> what we need. > > Oh, and more more thought: there could be a separate, smaller loadup > file for batch invocations, since speed of startup in that mode is > somewhat more important than in the interactive mode. loadup would need refactoring anyway. It doesn't just loadup at the moment. It also dumps and kills Emacs. It's also got some strange syntactic constraints because the Makefile does some sed based parsing of it. Phil ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-15 19:59 ` Eli Zaretskii 2016-12-15 22:07 ` Clément Pit--Claudel 2016-12-16 7:56 ` Eli Zaretskii @ 2016-12-19 15:09 ` Phillip Lord 2016-12-20 18:57 ` Ken Raeburn 2 siblings, 1 reply; 375+ messages in thread From: Phillip Lord @ 2016-12-19 15:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Ken Raeburn, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: > So my next idea would be to come up with a smaller loadup.el which > only loads the stuff that is needed for temacs to start. I didn't try > that yet, but I did think that Phillip's work on ldefs-boot might just > be a good starting point: those ldefs-boot-*.el files might be just > what we need. > > IMO, it would be interesting to see where this will take us, and what > kind of performance could that produce. I looked at this a little and in fact the boot code that I have written does tell you exactly which autoloads you need to get temacs to work -- it's not very many, I think that there are only 10 or so (bytecomp.el for instance). Of course, this is 10 autoloads PLUS all of the non-auto loads in loadup.el. My own feeling is that this is a bit unclean at the moment; given that loadup.el is supposed to support temacs till the point that it dumps, probably all of the autoloads used for this process should be explicitly in loadup; or, alternatively, we should have very few non-auto loads in loadup and do everything via autoload. Phil ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-19 15:09 ` Phillip Lord @ 2016-12-20 18:57 ` Ken Raeburn 2016-12-20 23:22 ` Stefan Monnier 2016-12-21 12:13 ` Phillip Lord 0 siblings, 2 replies; 375+ messages in thread From: Ken Raeburn @ 2016-12-20 18:57 UTC (permalink / raw) To: Phillip Lord; +Cc: Eli Zaretskii, emacs-devel On Dec 19, 2016, at 10:09, Phillip Lord <phillip.lord@russet.org.uk> wrote: > I looked at this a little and in fact the boot code that I have written > does tell you exactly which autoloads you need to get temacs to work -- > it's not very many, I think that there are only 10 or so (bytecomp.el > for instance). This sounds like it could be the biggest help for startup time at this point. Are you going to look further into making a lightweight loadup file? Looking at ldefs-boot.el and loaddefs.el, and contemplating the parsing of them, I wonder: If we go the big-elc route, can we defer loading the doc strings until they’re actually needed? Perhaps using the “(#$ . nnnn)” syntax used in .elc files, or somehow pointing at the real .el or .elc files defining the functions? Maybe just omit the function doc strings, if the help code does something reasonable in that case? I’ve still been poking at the reader code, but for small-ish changes I think I’m hitting a point of diminishing returns. My current test case run time is about 0.15-0.16s, though the run times are short enough that minor system activity at the same time can affect the results. I’ve got one more experiment in the works that cuts almost 20% of the size of dumped.elc, and cuts the test run time to about 0.14s. (Sharing interned symbols in the printer, so “setplist … setplist …” becomes “#4=setplist … #4# …”, drastically cutting into the 90% of oblookup calls that are done for symbols already in the obarray, and the related string manipulations, as well as the legibility of the generated file.) 
After that, I think the next step is further specialization of read1/readchar/read_escape/readbyte for the get-file-char case, and maybe more tweaks to try to optimize for mostly-ASCII input. But those result in more code duplication and additional maintenance work, for probably small benefit, so they’re not looking all that appealing. So, at this point, I’m inclined to finish my current experiment with the printer, and maybe set aside work on the big-elc performance for a bit, maybe look into threading bugs or the state of the CANNOT_DUMP code. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
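The symbol-sharing idea in the message above relies on the reader's existing `#N=` / `#N#` label syntax. The sketch below only shows that the reader accepts a labelled symbol and returns the same object for the back-reference; the variable names are hypothetical, and how the experimental printer chooses which symbols to label is not shown in the thread.

```elisp
;; Hypothetical variable names, for illustration only.

;; Plain form: the reader parses and looks up "setplist" twice.
(defvar my-plain (read "(setplist a setplist b)"))

;; Labelled form: the second occurrence is a back-reference to the
;; object read at label 4, so no second name lookup is needed.
(defvar my-shared (read "(#4=setplist a #4# b)"))

;; Both forms read as the same list, and the two `setplist' elements of
;; the labelled form are the very same symbol.
(list (equal my-plain my-shared)
      (eq (nth 0 my-shared) (nth 2 my-shared)))
```

The cost, as the message notes, is that the generated file becomes harder to read by eye.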
* Re: Skipping unexec via a big .elc file 2016-12-20 18:57 ` Ken Raeburn @ 2016-12-20 23:22 ` Stefan Monnier 2016-12-21 7:44 ` Ken Raeburn 2016-12-21 12:13 ` Phillip Lord 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-12-20 23:22 UTC (permalink / raw) To: emacs-devel > them, I wonder: If we go the big-elc route, can we defer loading the doc > strings until they’re actually needed? Perhaps using the “(#$ . nnnn)” In the dumped.elc file I generate, there should be basically no docstrings (the data I dump already uses either the NNN or the (#$ . NNN) representation to point to docstrings in the DOC file or in the original .elc file), so I don't think there's much opportunity for deferral. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-20 23:22 ` Stefan Monnier @ 2016-12-21 7:44 ` Ken Raeburn 0 siblings, 0 replies; 375+ messages in thread From: Ken Raeburn @ 2016-12-21 7:44 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel On Dec 20, 2016, at 18:22, Stefan Monnier <monnier@iro.umontreal.ca> wrote: >> them, I wonder: If we go the big-elc route, can we defer loading the doc >> strings until they’re actually needed? Perhaps using the “(#$ . nnnn)” > > In the dumped.elc file I generate, there should be basically > no docstrings (the data I dump already uses either the NNN or the (#$ > . NNN) representation to point to docstrings in the DOC file or in the > original .elc file), so I don't think there's much opportunity for deferral. Ah, yes, I forgot that happens even for the loaddefs doc strings not explicitly using that syntax, thanks to Snarf-documentation. At least, so long as all the files we might pre-load under various conditions are all covered by the DOC file, if we do take the approach of a smaller file for batch mode. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-20 18:57 ` Ken Raeburn 2016-12-20 23:22 ` Stefan Monnier @ 2016-12-21 12:13 ` Phillip Lord 1 sibling, 0 replies; 375+ messages in thread From: Phillip Lord @ 2016-12-21 12:13 UTC (permalink / raw) To: Ken Raeburn; +Cc: Eli Zaretskii, emacs-devel Ken Raeburn <raeburn@raeburn.org> writes: > On Dec 19, 2016, at 10:09, Phillip Lord <phillip.lord@russet.org.uk> wrote: > >> I looked at this a little and in fact the boot code that I have written >> does tell you exactly which autoloads you need to get temacs to work -- >> it's not very many, I think that there are only 10 or so (bytecomp.el >> for instance). > > This sounds like it could be the biggest help for startup time at this point. > Are you going to look further into making a lightweight loadup file? > > Looking at ldefs-boot.el and loaddefs.el, and contemplating the parsing of > them, I wonder: If we go the big-elc route, can we defer loading the doc > strings until they’re actually needed? Perhaps using the “(#$ . nnnn)” syntax > used in .elc files, or somehow pointing at the real .el or .elc files defining > the functions? Maybe just omit the function doc strings, if the help code > does something reasonable in that case? In the ldefs-boot-auto.el file, there are no doc strings. This was mostly because it was extra effort to add them, and they made no difference for the use intended. One simple way to lazy load doc strings would be to fiddle with help code so that it loads the relevant file before looking for the docstring. This does assume that the all the code in dumped.elc is idempotent, though. Incidentally, unless I have misunderstood, dumped.elc will duplicate code also found in other .elc files (say, byte-run.elc, and nadvice.elc)? Is dumped.elc going to detect these dependencies and redump as necessary? Phil ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-12-15 11:45 ` Ken Raeburn 2016-12-15 17:28 ` Ken Raeburn @ 2016-12-16 14:22 ` Robert Pluim 1 sibling, 0 replies; 375+ messages in thread From: Robert Pluim @ 2016-12-16 14:22 UTC (permalink / raw) To: emacs-devel Ken Raeburn <raeburn@raeburn.org> writes: > Branch scratch/raeburn-startup deleted and re-pushed. > > In addition to the changes I mentioned earlier, I found an unnecessary > memset in the face reinitialization code that could go, and an > initialization form was being emitted that tried to incorporate the > obarray by value (which wouldn’t work because the symbol chains don’t > all get dumped); omitting the latter for now cuts the file size a > percent or so. Hmm, it crashes for me when doing a bootstrap on GNU/Linux Mint (based on Ubuntu 16.04) /bin/bash: line 1: 21979 Aborted EMACSLOADPATH= '../src/emacs' -batch --no-site-file --no-site-lisp --eval '(setq load-prefer-newer t)' -f batch-byte-compile calc/calcalg2.el Makefile:282: recipe for target 'calc/calcalg2.elc' failed Backtrace (this is compiled -ggdb -O0 with gcc 5.4.0) (gdb) bt #0 terminate_due_to_signal (sig=6, backtrace_limit=40) at emacs.c:367 #1 0x00000000005866d3 in emacs_abort () at sysdep.c:2337 #2 0x000000000057100d in unblock_input_to (level=-1) at keyboard.c:7170 #3 0x0000000000571024 in unblock_input () at keyboard.c:7186 #4 0x0000000000639605 in read1 (readcharfun=25152, pch=0x7fffffff719c, first_in_list=false) at lread.c:3446 #5 0x000000000063a906 in read_list (flag=false, readcharfun=25152) at lread.c:3928 #6 0x00000000006371e9 in read1 (readcharfun=25152, pch=0x7fffffff7514, first_in_list=false) at lread.c:2629 #7 0x0000000000636336 in read0 (readcharfun=25152) at lread.c:2210 #8 0x00000000006361fb in read_internal_start (stream=25152, start=0, end=0) at lread.c:2176 #9 0x0000000000635ef2 in Fread (stream=25152) at lread.c:2110 #10 0x000000000060748d in Ffuncall (nargs=2, args=0x7fffffff76b0) at eval.c:2715 #11 0x0000000000606d54 in 
call1 (fn=39792, arg1=25152) at eval.c:2574 #12 0x00000000006358a5 in readevalloop (readcharfun=25152, stream=0x19bc7a0, sourcename=26993220, printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1958 #13 0x0000000000633fed in Fload (file=10190276, noerror=0, nomessage=45216, nosuffix=0, must_suffix=45216) at lread.c:1367 #14 0x00000000006136d9 in Frequire (feature=4534752, filename=0, noerror=0) at fns.c:2894 #15 0x0000000000607516 in Ffuncall (nargs=2, args=0x7fffffff7ae8) at eval.c:2722 #16 0x000000000064e9ef in exec_byte_code (bytestr=26973012, vector=22861437, maxdepth=10, args_template=0, nargs=0, args=0x0) at bytecode.c:639 #17 0x000000000064dd56 in Fbyte_code (bytestr=26973012, vector=22861437, maxdepth=10) at bytecode.c:319 #18 0x0000000000605f02 in eval_sub (form=26933523) at eval.c:2194 #19 0x00000000006359c3 in readevalloop (readcharfun=25152, stream=0x18c7190, sourcename=26972436, printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980 #20 0x0000000000633fed in Fload (file=26802292, noerror=0, nomessage=45216, nosuffix=0, must_suffix=45216) at lread.c:1367 #21 0x00000000006136d9 in Frequire (feature=13579232, filename=0, noerror=0) at fns.c:2894 #22 0x0000000000607516 in Ffuncall (nargs=2, args=0x7fffffff8518) at eval.c:2722 #23 0x00000000006063fa in Fapply (nargs=2, args=0x7fffffff8518) at eval.c:2300 #24 0x000000000060735d in Ffuncall (nargs=3, args=0x7fffffff8510) at eval.c:2695 #25 0x000000000064e9ef in exec_byte_code (bytestr=25727860, vector=22453597, maxdepth=38, args_template=1030, nargs=1, args=0x7fffffff8a78) at bytecode.c:639 #26 0x0000000000607dd8 in funcall_lambda (fun=22453765, nargs=1, arg_vector=0x7fffffff8a70) at eval.c:2879 #27 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff8a68) at eval.c:2764 #28 0x000000000064e9ef in exec_byte_code (bytestr=25715156, vector=22445469, maxdepth=18, args_template=1030, nargs=1, args=0x7fffffff8f60) at bytecode.c:639 #29 0x0000000000607dd8 in funcall_lambda 
(fun=22445517, nargs=1, arg_vector=0x7fffffff8f58) at eval.c:2879 #30 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff8f50) at eval.c:2764 #31 0x000000000064e9ef in exec_byte_code (bytestr=25714756, vector=22445325, maxdepth=22, args_template=1030, nargs=1, args=0x7fffffff9438) at bytecode.c:639 #32 0x0000000000607dd8 in funcall_lambda (fun=22445373, nargs=1, arg_vector=0x7fffffff9430) at eval.c:2879 #33 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff9428) at eval.c:2764 #34 0x000000000064e9ef in exec_byte_code (bytestr=17023828, vector=22255765, maxdepth=42, args_template=2058, nargs=2, args=0x7fffffff9948) at bytecode.c:639 #35 0x0000000000607dd8 in funcall_lambda (fun=22255869, nargs=2, arg_vector=0x7fffffff9938) at eval.c:2879 #36 0x00000000006077ae in Ffuncall (nargs=3, args=0x7fffffff9930) at eval.c:2764 #37 0x000000000064e9ef in exec_byte_code (bytestr=25714692, vector=22433525, maxdepth=18, args_template=1030, nargs=1, args=0x7fffffff9e08) at bytecode.c:639 #38 0x0000000000607dd8 in funcall_lambda (fun=22445421, nargs=1, arg_vector=0x7fffffff9e00) at eval.c:2879 #39 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffff9df8) at eval.c:2764 #40 0x000000000064e9ef in exec_byte_code (bytestr=25699108, vector=22436957, maxdepth=22, args_template=1030, nargs=1, args=0x7fffffffa330) at bytecode.c:639 #41 0x0000000000607dd8 in funcall_lambda (fun=22433429, nargs=1, arg_vector=0x7fffffffa328) at eval.c:2879 #42 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffa320) at eval.c:2764 #43 0x000000000064e9ef in exec_byte_code (bytestr=25698596, vector=22437133, maxdepth=66, args_template=1030, nargs=1, args=0x7fffffffa978) at bytecode.c:639 #44 0x0000000000607dd8 in funcall_lambda (fun=22433477, nargs=1, arg_vector=0x7fffffffa970) at eval.c:2879 #45 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffa968) at eval.c:2764 #46 0x000000000064e9ef in exec_byte_code (bytestr=25677076, vector=22432581, maxdepth=66, args_template=2054, nargs=1, 
args=0x7fffffffb030) at bytecode.c:639 #47 0x0000000000607dd8 in funcall_lambda (fun=22429357, nargs=1, arg_vector=0x7fffffffb028) at eval.c:2879 #48 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffb020) at eval.c:2764 #49 0x000000000064e9ef in exec_byte_code (bytestr=25918004, vector=19599701, maxdepth=34, args_template=1030, nargs=1, args=0x7fffffffb578) at bytecode.c:639 #50 0x0000000000607dd8 in funcall_lambda (fun=19599829, nargs=1, arg_vector=0x7fffffffb570) at eval.c:2879 #51 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffb568) at eval.c:2764 #52 0x000000000064e9ef in exec_byte_code (bytestr=25917348, vector=20758333, maxdepth=42, args_template=1026, nargs=0, args=0x7fffffffbb58) at bytecode.c:639 #53 0x0000000000607dd8 in funcall_lambda (fun=20758493, nargs=0, arg_vector=0x7fffffffbb58) at eval.c:2879 #54 0x00000000006077ae in Ffuncall (nargs=1, args=0x7fffffffbb50) at eval.c:2764 #55 0x000000000064e9ef in exec_byte_code (bytestr=10932540, vector=10932573, maxdepth=94, args_template=1030, nargs=1, args=0x7fffffffc4e8) at bytecode.c:639 #56 0x0000000000607dd8 in funcall_lambda (fun=10932493, nargs=1, arg_vector=0x7fffffffc4e0) at eval.c:2879 #57 0x00000000006077ae in Ffuncall (nargs=2, args=0x7fffffffc4d8) at eval.c:2764 #58 0x000000000064e9ef in exec_byte_code (bytestr=10909516, vector=10909549, maxdepth=86, args_template=2, nargs=0, args=0x7fffffffd108) at bytecode.c:639 #59 0x0000000000607dd8 in funcall_lambda (fun=10909469, nargs=0, arg_vector=0x7fffffffd108) at eval.c:2879 #60 0x00000000006077ae in Ffuncall (nargs=1, args=0x7fffffffd100) at eval.c:2764 #61 0x000000000064e9ef in exec_byte_code (bytestr=10905556, vector=10905589, maxdepth=50, args_template=2, nargs=0, args=0x7fffffffd6f0) at bytecode.c:639 #62 0x0000000000607dd8 in funcall_lambda (fun=10905509, nargs=0, arg_vector=0x7fffffffd6f0) at eval.c:2879 #63 0x0000000000607b46 in apply_lambda (fun=10905509, args=0, count=4) at eval.c:2816 #64 0x000000000060607e in eval_sub 
(form=19109027) at eval.c:2233 #65 0x0000000000605572 in Feval (form=19109027, lexical=0) at eval.c:2010 #66 0x00000000005644f6 in top_level_2 () at keyboard.c:1127 #67 0x0000000000603fbd in internal_condition_case (bfun=0x5644d3 <top_level_2>, handlers=19536, hfun=0x563f70 <cmd_error>) at eval.c:1314 #68 0x0000000000564537 in top_level_1 (ignore=0) at keyboard.c:1135 #69 0x00000000006038cc in internal_catch (tag=46608, func=0x5644f8 <top_level_1>, arg=0) at eval.c:1080 #70 0x000000000056442b in command_loop () at keyboard.c:1096 #71 0x0000000000563b55 in recursive_edit_1 () at keyboard.c:703 #72 0x0000000000563ccc in Frecursive_edit () at keyboard.c:774 #73 0x0000000000561951 in main (argc=9, argv=0x7fffffffdc38) at emacs.c:1698 Lisp Backtrace: "read" (0xffff76b8) "require" (0xffff7af0) "byte-code" (0xffff7f20) "require" (0xffff8520) "apply" (0xffff8518) "byte-compile-file-form-require" (0xffff8a70) "byte-compile-file-form" (0xffff8f58) 0x1567d38 PVEC_COMPILED "byte-compile-recurse-toplevel" (0xffff9938) "byte-compile-toplevel-file-form" (0xffff9e00) 0x1564e90 PVEC_COMPILED "byte-compile-from-buffer" (0xffffa970) "byte-compile-file" (0xffffb028) "batch-byte-compile-file" (0xffffb570) "batch-byte-compile" (0xffffbb58) "command-line-1" (0xffffc4e0) "command-line" (0xffffd108) "normal-top-level" (0xffffd6f0) (gdb) ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2016-12-24 13:37 UTC
To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 11 Dec 2016 08:34:01 -0500
>
> With all these changes (Stefan’s new patch with additional
> initialization, and my updates to shave a little more time off) I’m
> still hitting just under 0.2s for:
>
>     time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'
>
> on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel
> Core 2 Duo, 2.8GHz) takes over half a second (including at least one
> GC invocation).

For the record, my timing is 0.828s with an unoptimized build of the
branch, as opposed to 13.2s with an unoptimized build on master, and
5.343s with an optimized (-Og) build of Emacs 25.1.90. The CPU is
Core i7-2600, 3.4GHz; the compiler used is GCC 5.3.0.
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2016-12-26 17:48 UTC
To: raeburn; +Cc: emacs-devel

> Date: Sat, 24 Dec 2016 15:37:11 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> > I’m still hitting just under 0.2s for:
> >
> >     time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'
> >
> > on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel
> > Core 2 Duo, 2.8GHz) takes over half a second (including at least
> > one GC invocation).
>
> For the record, my timing is 0.828s with an unoptimized build of the
> branch, as opposed to 13.2s with an unoptimized build on master, and
> 5.343s with an optimized (-Og) build of Emacs 25.1.90. The CPU is
> Core i7-2600, 3.4GHz; the compiler used is GCC 5.3.0.

With an optimized (-O2) build on the same system, the above command
takes 0.190s on the average.

Byte-compiling Lisp files in batch mode shows that loading dumped.elc
takes about 0.150s, as that is the difference between byte-compiling
with temacs and the dumped emacs. For example, compiling simple.el
takes 0.656s with temacs and 0.515s with dumped emacs. IOW, the
overhead is additive, as expected.
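The additive-overhead observation in the message above can be checked with a few lines of arithmetic. The following is only a sketch in Python that reuses the two simple.el timings quoted there; the variable names are made up for illustration and nothing here comes from the Emacs sources or build system.

```python
# Timings quoted in the message, in seconds (optimized -O2 build).
compile_with_temacs = 0.656        # byte-compiling simple.el, loading dumped.elc first
compile_with_dumped_emacs = 0.515  # the same job with the unexec-dumped emacs

# If loading dumped.elc is a fixed additive cost, the difference between the
# two runs should land near the "about 0.150s" figure given in the message.
load_overhead = compile_with_temacs - compile_with_dumped_emacs
print(f"measured overhead: {load_overhead:.3f}s")

# Fraction of the temacs run that is spent on that fixed cost.
overhead_share = load_overhead / compile_with_temacs
print(f"share of the temacs compile time: {overhead_share:.1%}")
```

Because the cost is fixed per process rather than per file, it matters most for builds that start one Emacs per compiled file, which is how the Lisp directory is byte-compiled in the logs in this thread.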
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-07 9:40 UTC
To: raeburn; +Cc: emacs-devel

Ken,

I tried to get rid of calling dump-emacs in the raeburn-startup
branch, see the changes below. The resulting code builds and produces
dumped.elc, but then fails to compile the *.el files:

  ...
  Loading d:/gnu/git/emacs/no-unexec/lisp/leim/leim-list.el (source)...
  Finding pointers to doc strings...
  Finding pointers to doc strings...done
  Dumping under the name emacs
  Dumping into dumped.elc...preparing...
  Dumping into dumped.elc...generating...
  Dumping into dumped.elc...printing...
  Dumping into dumped.elc...saving...
  Dumping into dumped.elc...done
  mv -f emacs.exe bootstrap-emacs.exe
  make -C ../lisp compile-first EMACS="../src/bootstrap-emacs.exe"
  make[2]: Entering directory `/d/gnu/git/emacs/no-unexec/lisp'
    ELC      emacs-lisp/macroexp.elc
  Loading ../src/dumped.elc...
  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
    ELC      emacs-lisp/cconv.elc
  Loading ../src/dumped.elc...
  Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)

This could be related to the fact that the original code produced the
first dumped.elc in the top-level directory, not in src/, and I needed
to fix that, since otherwise bootstrap-emacs would exit immediately
(see the changes below). In the original version, src/dumped.elc was
only produced after all the necessary Lisp files were byte-compiled
already.

So it seems like the current build process on this branch still
somehow depends on a dumped emacs executable, until it byte-compiles
all the preloaded Lisp files, and produces dumped.elc from that.
IOW, the first dumped.elc produced before byte-compiling those files is not up to the job of running Emacs for byte-compiling Lisp files. How can we fix that, so that unexec and its call can be really removed from the sources? Or did I miss something? Thanks. diff --git a/lisp/loadup.el b/lisp/loadup.el index 54d19c1..873d804 100644 --- a/lisp/loadup.el +++ b/lisp/loadup.el @@ -453,27 +453,30 @@ ;; confused people installing Emacs (they'd install the file ;; under the name `xemacs'), and it's inconsistent with every ;; other GNU program's build process. - (dump-emacs "emacs" "temacs") - (message "%d pure bytes used" pure-bytes-used) - ;; Recompute NAME now, so that it isn't set when we dump. - (if (not (or (eq system-type 'ms-dos) - ;; Don't bother adding another name if we're just - ;; building bootstrap-emacs. - (equal (last command-line-args) '("bootstrap")))) - (let ((name (concat "emacs-" emacs-version)) - (exe (if (eq system-type 'windows-nt) ".exe" ""))) - (while (string-match "[^-+_.a-zA-Z0-9]+" name) - (setq name (concat (downcase (substring name 0 (match-beginning 0))) + ;; (dump-emacs "emacs" "temacs") + ;; (message "%d pure bytes used" pure-bytes-used) + (let ((exe (if (memq system-type '(windows-nt ms-dos)) ".exe" ""))) + (copy-file (expand-file-name (concat "temacs" exe) invocation-directory) + (expand-file-name (concat "emacs" exe) invocation-directory) + t) + ;; Recompute NAME now, so that it isn't set when we dump. + (if (not (or (eq system-type 'ms-dos) + ;; Don't bother adding another name if we're just + ;; building bootstrap-emacs. 
+ (equal (last command-line-args) '("bootstrap")))) + (let ((name (concat "emacs-" emacs-version))) + (while (string-match "[^-+_.a-zA-Z0-9]+" name) + (setq name (concat (downcase (substring name 0 (match-beginning 0))) "-" (substring name (match-end 0))))) - (setq name (concat name exe)) - (message "Adding name %s" name) - ;; When this runs on Windows, invocation-directory is not - ;; necessarily the current directory. - (add-name-to-file (expand-file-name (concat "emacs" exe) - invocation-directory) - (expand-file-name name invocation-directory) - t))) + (setq name (concat name exe)) + (message "Adding name %s" name) + ;; When this runs on Windows, invocation-directory is not + ;; necessarily the current directory. + (add-name-to-file (expand-file-name (concat "emacs" exe) + invocation-directory) + (expand-file-name name invocation-directory) + t)))) (message "Dumping into dumped.elc...preparing...") ;; Dump the current state into a file so we can reload it! @@ -555,6 +558,7 @@ obarray) (message "Dumping into dumped.elc...printing...") (with-current-buffer (generate-new-buffer "dumped.elc") + (setq default-directory invocation-directory) (insert ";ELC\^W\^@\^@\^@\n;;; Compiled\n;;; in Emacs version " emacs-version "\n") (let ((print-circle t) ^ permalink raw reply related [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2017-01-09 10:28 UTC
To: Eli Zaretskii; +Cc: Emacs developers

> On Jan 7, 2017, at 04:40, Eli Zaretskii <eliz@gnu.org> wrote:
>
> Ken,
>
> I tried to get rid of calling dump-emacs in the raeburn-startup
> branch, see the changes below. The resulting code builds and produces
> dumped.elc, but then fails to compile the *.el files:

I’ve been looking into it this weekend. It appears that in some of my builds I’m seeing in dumped.elc stuff along the lines of:

    (setplist 'window-parameter
              '(gv-expander
                (closure (t) #19=(do &rest args)
                  (gv--defsetter 'window-parameter
                    (lambda #20=(val &rest args)
                      `(,'set-window-parameter . #21=(,@args ,val)))
                    . #22=(do args)))
                side-effect-free t))

That’s with my #N# patch removed; that patch obfuscates the code but I don’t think it should be changing the meaning.

The comma-quote-symbol syntax looks strange to me, could that be causing it?

> This could be related to the fact that the original code produced the
> first dumped.elc in the top-level directory, not in src/, and I needed
> to fix that, since otherwise bootstrap-emacs would exit immediately
> (see the changes below). In the original version, src/dumped.elc was
> only produced after all the necessary Lisp files were byte-compiled
> already.

In the GNU/Linux build, the dumped.elc file is generated in the src directory of the build tree. So that part of your patch didn’t alter anything for my testing as far as I can see.

But the GNU/Linux build supports building in a separate tree from the source tree, a mode I usually do my builds in, and at startup we look for dumped.elc in the src directory of the source tree, not the build tree. So I still have to tweak it manually.

> So it seems like the current build process on this branch still
> somehow depends on a dumped emacs executable, until it byte-compiles
> all the preloaded Lisp files, and produces dumped.elc from that. IOW,
> the first dumped.elc produced before byte-compiling those files is not
> up to the job of running Emacs for byte-compiling Lisp files. How can
> we fix that, so that unexec and its call can be really removed from
> the sources? Or did I miss something?

A workaround might be to use loadup.el instead of dumped.elc during that stage. But that doesn’t fix the problem.

Ken
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2017-01-10 2:25 UTC
To: emacs-devel

>     `(,'set-window-parameter . #21=(,@args ,val))) . #22=(do args)))
>   side-effect-free t))
>
> The comma-quote-symbol syntax looks strange to me, could that be causing it?

The ,' is a result of evaluation of code like

    ``(,',setter ,@args ,val)

so, it's indeed strange, but only to the extent that nested backquotes
are "strange".

Eli wrote:
> Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)

Hmm... I don't understand this. This message seems to come from
backquote.el:

     ((eq (car s) backquote-unquote-symbol)
      (if (<= level 0)
          (cond
           ((> (length s) 2)
            ;; We could support it with: (cons 2 `(list . ,(cdr s)))
            ;; But let's not encourage such uses.
            (error "Multiple args to , are not supported: %S" s))
           (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
                    (nth 1 s))))
        (backquote-delay-process s (1- level))))

but then `s` should have \, in its car, whereas the above message
indicates that (car s) is (\, (quote set-window-parameter)) which
implies we should not have entered this branch.

Maybe I'm just too tired to read this code right, tho.

        Stefan
* Re: Skipping unexec via a big .elc file
From: Andreas Schwab @ 2017-01-10 9:46 UTC
To: Stefan Monnier; +Cc: emacs-devel

On Jan 09 2017, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> Eli wrote:
>> Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
>
> Hmm... I don't understand this. This message seems to come from
> backquote.el:
>
>      ((eq (car s) backquote-unquote-symbol)
>       (if (<= level 0)
>           (cond
>            ((> (length s) 2)
>             ;; We could support it with: (cons 2 `(list . ,(cdr s)))
>             ;; But let's not encourage such uses.
>             (error "Multiple args to , are not supported: %S" s))
>            (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
>                     (nth 1 s))))
>         (backquote-delay-process s (1- level))))
>
> but then `s` should have \, in its car, whereas the above message
> indicates that (car s) is (\, (quote set-window-parameter)) which
> implies we should not have entered this branch.

That can only mean that something clobbered backquote-unquote-symbol.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-10 17:19 UTC
To: Andreas Schwab, Ken Raeburn; +Cc: monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Tue, 10 Jan 2017 10:46:25 +0100
> Cc: emacs-devel@gnu.org
>
> On Jan 09 2017, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> > Eli wrote:
> >> Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
> >
> > Hmm... I don't understand this. This message seems to come from
> > backquote.el:
> >
> >      ((eq (car s) backquote-unquote-symbol)
> >       (if (<= level 0)
> >           (cond
> >            ((> (length s) 2)
> >             ;; We could support it with: (cons 2 `(list . ,(cdr s)))
> >             ;; But let's not encourage such uses.
> >             (error "Multiple args to , are not supported: %S" s))
> >            (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
> >                     (nth 1 s))))
> >         (backquote-delay-process s (1- level))))
> >
> > but then `s` should have \, in its car, whereas the above message
> > indicates that (car s) is (\, (quote set-window-parameter)) which
> > implies we should not have entered this branch.
>
> That can only mean that something clobbered backquote-unquote-symbol.

Yes, the value of backquote-unquote-symbol at this point is indeed
this:

  (\, (quote set-window-parameter))

I guess something is wrong with reading dumped.elc?
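To make the reasoning in the last three messages concrete, here is a small model of the quoted cond branch, written in Python so it can be run without an Emacs build. It is a sketch under stated assumptions, not the backquote.el code: symbols are modelled as strings, lists as Python lists, and the names `eq` and `classify` are hypothetical. It also assumes that the clobbered value of backquote-unquote-symbol is the very same list object as (car s), which is what a shared reference in the dump file would produce.

```python
def eq(a, b):
    """Model of Lisp `eq`: symbols (str) compare by name, lists by identity."""
    if isinstance(a, str) and isinstance(b, str):
        return a == b
    return a is b


def classify(s, unquote_symbol, level=0):
    """Mirror the shape of the quoted branch; raise where it calls `error`."""
    if eq(s[0], unquote_symbol):          # (eq (car s) backquote-unquote-symbol)
        if level <= 0:
            if len(s) > 2:                # (> (length s) 2)
                raise ValueError(f"Multiple args to , are not supported: {s!r}")
            return "unquote"
        return "delayed"                  # stands in for backquote-delay-process
    return "not-an-unquote"


# (\, (quote set-window-parameter)), shared between both positions of s.
shared = [",", ["quote", "set-window-parameter"]]
s = [shared, "temp", shared, "end"]

# An ordinary two-element unquote form takes the unquote branch.
print(classify([",", "x"], ","))

# Healthy state: the variable holds the symbol `,` and (car s) is a list,
# so the branch is not entered, as Stefan expected.
print(classify(s, ","))

# Clobbered state: the variable holds the same list object as (car s),
# so the identity test succeeds and the length check signals the error.
try:
    classify(s, shared)
except ValueError as err:
    print("error:", err)
```

The point of the model is only that an identity comparison against a clobbered variable is enough to reach the error; it says nothing about how the variable got clobbered.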
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2017-01-11 6:32 UTC
To: Eli Zaretskii; +Cc: Andreas Schwab, monnier, emacs-devel

> On Jan 10, 2017, at 12:19, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Date: Tue, 10 Jan 2017 10:46:25 +0100
>> Cc: emacs-devel@gnu.org
>>
>> On Jan 09 2017, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>>
>>> Eli wrote:
>>>> Multiple args to , are not supported: ((\, (quote set-window-parameter)) temp (\, (quote set-window-parameter)) end)
>>>
>>> Hmm... I don't understand this. This message seems to come from
>>> backquote.el:
>>>
>>>      ((eq (car s) backquote-unquote-symbol)
>>>       (if (<= level 0)
>>>           (cond
>>>            ((> (length s) 2)
>>>             ;; We could support it with: (cons 2 `(list . ,(cdr s)))
>>>             ;; But let's not encourage such uses.
>>>             (error "Multiple args to , are not supported: %S" s))
>>>            (t (cons (if (eq (car-safe (nth 1 s)) 'quote) 0 1)
>>>                     (nth 1 s))))
>>>         (backquote-delay-process s (1- level))))
>>>
>>> but then `s` should have \, in its car, whereas the above message
>>> indicates that (car s) is (\, (quote set-window-parameter)) which
>>> implies we should not have entered this branch.
>>
>> That can only mean that something clobbered backquote-unquote-symbol.
>
> Yes, the value of backquote-unquote-symbol at this point is indeed
> this:
>
>  (\, (quote set-window-parameter))
>
> I guess something is wrong with reading dumped.elc?

At the moment it’s looking to me like it might be a problem with my #N# patch for writing out symbols. It got a little more of a speedup reading dumped.elc, but if I drop that change, I get a lot further in trying to bootstrap the tree with your change. It still fails while processing the “leim” directory, though.

Indeed, looking at dumped.elc, I see:

    (#35# '#5646# '#218#)

where 35 is set-default, 5646 is backquote-unquote-symbol, and 218 is ,'set-window-parameter thanks to "#218=,'#897=set-window-parameter" being read from dumped.elc. I suspect 218 was supposed to be just the comma, but the special printing of comma forms was still applied but is not compatible with the #N# handling, so comma and related symbols should just be excluded from that hack.

I’ll test that out, but in the meantime, commenting out the binding in loadup.el of print-symbols-as-references should make things work again (bootstrapping up until partway through the leim directory).

Ken
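The interaction Ken suspects can be reproduced with a toy printer and reader. What follows is a sketch in Python, not the code in print.c or lread.c: symbols are strings, lists are Python lists, and every name in it (`print_obj`, `read_obj`, `to_text`, `read_text`, the `label_commas` flag) is hypothetical. It models only two behaviours described in the thread: each symbol gets a `#N=` label on first use and `#N#` afterwards, and a two-element list headed by the comma symbol is printed in shorthand as a comma glued to its argument.

```python
import re


def print_obj(obj, labels, out, label_commas=True):
    """Toy printer: symbols are str, lists are Python lists."""
    if isinstance(obj, str):
        if obj == "," and not label_commas:
            out.append("\\,")                     # escaped symbol, never labelled
        elif obj in labels:
            out.append(f"#{labels[obj]}#")        # back-reference
        else:
            labels[obj] = len(labels) + 1
            out.append(f"#{labels[obj]}={obj}")   # first use defines the label
    elif len(obj) == 2 and obj[0] == ",":
        # Shorthand for (\, X): a comma glued to the printed X.
        if label_commas:
            print_obj(",", labels, out)           # emits "#N=,": the problem case
        else:
            out.append(",")
        print_obj(obj[1], labels, out, label_commas)
    else:
        out.append("(")
        for i, item in enumerate(obj):
            if i:
                out.append(" ")
            print_obj(item, labels, out, label_commas)
        out.append(")")


def to_text(obj, label_commas=True):
    out = []
    print_obj(obj, {}, out, label_commas)
    return "".join(out)


TOKEN = re.compile(r"#(\d+)=|#(\d+)#|([(),])|(\\,|[^\s()#,]+)")


def read_obj(text, pos, labels):
    """Toy reader for well-formed input; returns (object, next position)."""
    while text[pos].isspace():
        pos += 1
    m = TOKEN.match(text, pos)
    define, ref, punct, symbol = m.groups()
    pos = m.end()
    if define:
        obj, pos = read_obj(text, pos, labels)
        labels[define] = obj                      # label covers the whole next object
        return obj, pos
    if ref:
        return labels[ref], pos
    if symbol:
        return ("," if symbol == "\\," else symbol), pos
    if punct == ",":
        arg, pos = read_obj(text, pos, labels)
        return [",", arg], pos
    items = []                                    # punct == "("
    while True:
        while text[pos].isspace():
            pos += 1
        if text[pos] == ")":
            return items, pos + 1
        item, pos = read_obj(text, pos, labels)
        items.append(item)


def read_text(text):
    return read_obj(text, 0, {})[0]


# ((\, x) \,): an unquote form followed by the bare comma symbol.
original = [[",", "x"], ","]

broken = to_text(original)
print(broken)
print(read_text(broken))          # the bare symbol comes back as the whole form

fixed = to_text(original, label_commas=False)
print(fixed)
print(read_text(fixed) == original)
```

In the labelled output the `#N=` prefix sits in front of the shorthand comma, so the reader attaches the label to the entire unquote form and every later `#N#` yields that form instead of the symbol. Leaving the comma symbol out of the labelling, as the message proposes for comma and related symbols, restores the round trip in this model.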
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2017-01-12 8:17 UTC
To: Eli Zaretskii; +Cc: Andreas Schwab, monnier, emacs-devel

On Jan 11, 2017, at 01:32, Ken Raeburn <raeburn@raeburn.org> wrote:
> Indeed, looking at dumped.elc, I see:
>    (#35# '#5646# '#218#)
> where 35 is set-default, 5646 is backquote-unquote-symbol, and 218 is ,'set-window-parameter thanks to "#218=,'#897=set-window-parameter" being read from dumped.elc. I suspect 218 was supposed to be just the comma, but the special printing of comma forms was still applied but is not compatible with the #N# handling, so comma and related symbols should just be excluded from that hack.

There were other instances that made it clear that “#218#” was being printed where “,” was intended, including with the lack of space before whatever followed that’s normal for a comma (e.g., “#218##219#” where #219# referred to some ordinary symbol).

I’ve just uploaded a workaround for that (including a comma-dot sequence that I’m not familiar with, but which seems to get the same treatment as comma and comma-at), and a bug fix I found relating to one of my earlier changes.

Now, with your patch to avoid unexec, it’s successfully compiling in the lisp directory but fails in leim, which I haven’t dug into yet:

make[2]: Entering directory '/home/raeburn/dev/emacs/s/lisp'
make -C ../leim all EMACS="../src/emacs"
make[3]: Entering directory '/home/raeburn/dev/emacs/s/leim'
/bin/mkdir -p ../lisp/leim/ja-dic
  GEN      ../lisp/leim/ja-dic/ja-dic.el
Loading ../src/dumped.elc...
Reading file "/home/raeburn/dev/emacs/s/leim/SKK-DIC/SKK-JISYO.L" ...
Processing OKURI-ARI entries ...
Debugger entered--Lisp error: (search-failed "^\\cH")
  re-search-forward("^\\cH")
  (let ((from (point)) to) (search-forward ";; okuri-nasi") (beginning-of-line) (setq to (point)) (narrow-to-region from to) (skkdic-convert-okuri-ari skkbuf buf) (widen) (goto-char to) (forward-line 1) (setq from (point)) (re-search-forward "^\\cH") (setq to (match-beginning 0)) (narrow-to-region from to) (skkdic-convert-postfix skkbuf buf) (widen) (goto-char to) (skkdic-convert-prefix skkbuf buf) (skkdic-collect-okuri-nasi) (skkdic-convert-okuri-nasi skkbuf buf) (save-current-buffer (set-buffer buf) (goto-char (point-max)) (insert ";;\n(provide 'ja-dic)\n\n" ";; Local Variables:\n" ";; version-control: never\n" ";; no-update-autoloads: t\n" ";; coding: utf-8\n" ";; End:\n\n" ";;; ja-dic.el ends here\n")))
  …
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-14 10:41 UTC
To: Ken Raeburn; +Cc: schwab, monnier, emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Thu, 12 Jan 2017 03:17:40 -0500
> Cc: Andreas Schwab <schwab@linux-m68k.org>,
>  monnier@iro.umontreal.ca,
>  emacs-devel@gnu.org
>
> I’ve just uploaded a workaround for that (including a comma-dot sequence that I’m not familiar with, but which seems to get the same treatment as comma and comma-at), and a bug fix I found relating to one of my earlier changes.
>
> Now, with your patch to avoid unexec, it’s successfully compiling in the lisp directory but fails

It does fail for me while byte compiling 2 files:

    ELC      leim/ja-dic/ja-dic.elc
  Loading ../src/dumped.elc...

  In toplevel form:
  leim/ja-dic/ja-dic.el:76:1:Error: Args out of range: " s ", 2432, 2432
  Makefile:282: recipe for target `leim/ja-dic/ja-dic.elc' failed
  make[2]: *** [leim/ja-dic/ja-dic.elc] Error 1

    ELC      net/eww.elc
  Loading ../src/dumped.elc...

  In toplevel form:
  net/eww.el:29:1:Error: Undefined category: >
  Makefile:282: recipe for target `net/eww.elc' failed
  make[2]: *** [net/eww.elc] Error 1

The line number in the error message is bogus, it points to a require
line (that's a known issue with byte-compiler error reporting, I
think).
Running the latter compilation under GDB, I see this: Thread 1 hit Breakpoint 3, Fsignal (error_symbol=21616, data=-4611686018320478144) at eval.c:1471 1471 signal_or_quit (error_symbol, data, false); (gdb) pp error_symbol error (gdb) pp data ("Undefined category: >") (gdb) bt #0 Fsignal (error_symbol=21616, data=-4611686018320478144) at eval.c:1471 #1 0x0114c616 in xsignal (error_symbol=21616, data=-4611686018320478144) at lisp.h:3872 #2 0x01221129 in xsignal1 (error_symbol=21616, arg=-9223372036747839088) at eval.c:1606 #3 0x01221df8 in verror ( m=0x167d0c5 <DEFAULT_REHASH_SIZE+373> "Undefined category: %c", ap=0x826bb4 ">") at eval.c:1791 #4 0x01221e16 in error ( m=0x167d0c5 <DEFAULT_REHASH_SIZE+373> "Undefined category: %c") at eval.c:1803 #5 0x011108b2 in Fmodify_category_entry (character=4611686018427387937, category=4611686018427387966, table=-6917529027624565016, reset=0) at category.c:368 #6 0x01225980 in Ffuncall (nargs=3, args=0x826d88) at eval.c:2726 #7 0x0128959c in exec_byte_code (bytestr=-9223372036747840008, vector=-6917529027534179736, maxdepth=4611686018427387911, args_template=0, nargs=0, args=0x0) at bytecode.c:639 #8 0x012886b0 in Fbyte_code (bytestr=-9223372036747840008, vector=-6917529027534179736, maxdepth=4611686018427387911) at bytecode.c:319 #9 0x012238ef in eval_sub (form=-4611686018320476528) at eval.c:2194 #10 0x01269693 in readevalloop (readcharfun=28056, stream=0x77c5fd00 <msvcrt!_iob+128>, sourcename=-9223372036747840088, printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980 #11 0x01266f73 in Fload (file=-9223372036747853616, noerror=0, nomessage=55328, nosuffix=0, must_suffix=0) at lread.c:1367 #12 0x01223a45 in eval_sub (form=-4611686018320474720) at eval.c:2202 #13 0x0121c562 in Fprogn (body=-4611686018320474624) at eval.c:432 #14 0x0121c277 in Fif (args=-4611686018320475808) at eval.c:390 #15 0x012232ef in eval_sub (form=-4611686018320475824) at eval.c:2141 #16 0x01268a56 in readevalloop_eager_expand_eval 
(val=-4611686018320474576, macroexpand=62241920) at lread.c:1792 #17 0x01269679 in readevalloop (readcharfun=-6917529027540611248, stream=0x0, sourcename=-9223372036754301328, printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1978 #18 0x01269b55 in Feval_buffer (buffer=-6917529027540611248, printflag=0, filename=-9223372036754305384, unibyte=0, do_allow_print=55328) at lread.c:2044 #19 0x01225a39 in Ffuncall (nargs=6, args=0x828098) at eval.c:2731 #20 0x0128959c in exec_byte_code (bytestr=-9223372036760655744, vector=-6917529027547213384, maxdepth=4611686018427387910, args_template=0, nargs=0, args=0x0) at bytecode.c:639 #21 0x01226fc8 in funcall_lambda (fun=-6917529027547213024, nargs=4, arg_vector=0x8286c0) at eval.c:2957 #22 0x01225e1f in Ffuncall (nargs=5, args=0x8286b8) at eval.c:2764 #23 0x01225110 in call4 (fn=64816760, arg1=-9223372036754305384, arg2=-9223372036754305384, arg3=0, arg4=55328) at eval.c:2599 #24 0x01266b7c in Fload (file=-9223372036754314936, noerror=0, nomessage=55328, nosuffix=0, must_suffix=55328) at lread.c:1311 #25 0x012391ce in Frequire (feature=76025368, filename=0, noerror=0) at fns.c:2894 #26 0x012258f2 in Ffuncall (nargs=2, args=0x829068) at eval.c:2722 #27 0x0122429d in Fapply (nargs=2, args=0x829068) at eval.c:2300 #28 0x012256f3 in Ffuncall (nargs=3, args=0x829060) at eval.c:2695 #29 0x0128959c in exec_byte_code (bytestr=-9223372036754937720, vector=-6917529027541241920, maxdepth=4611686018427387913, args_template=4611686018427388161, nargs=1, args=0x829648) at bytecode.c:639 #30 0x012268de in funcall_lambda (fun=-6917529027541241752, nargs=1, arg_vector=0x829640) at eval.c:2879 #31 0x01225e1f in Ffuncall (nargs=2, args=0x829638) at eval.c:2764 #32 0x0128959c in exec_byte_code (bytestr=-9223372036754938152, vector=-6917529027541242792, maxdepth=4611686018427387908, args_template=4611686018427388161, nargs=1, args=0x829bb0) at bytecode.c:639 #33 0x012268de in funcall_lambda (fun=-6917529027541242744, nargs=1, 
arg_vector=0x829ba8) at eval.c:2879 #34 0x01225e1f in Ffuncall (nargs=2, args=0x829ba0) at eval.c:2764 #35 0x0128959c in exec_byte_code (bytestr=-9223372036754938216, vector=-6917529027541242960, maxdepth=4611686018427387909, args_template=4611686018427388161, nargs=1, args=0x82a108) at bytecode.c:639 #36 0x012268de in funcall_lambda (fun=-6917529027541242912, nargs=1, arg_vector=0x82a100) at eval.c:2879 #37 0x01225e1f in Ffuncall (nargs=2, args=0x82a0f8) at eval.c:2764 #38 0x0128959c in exec_byte_code (bytestr=-9223372036755022888, vector=-6917529027541352232, maxdepth=4611686018427387914, args_template=4611686018427388418, nargs=2, args=0x82a698) at bytecode.c:639 #39 0x012268de in funcall_lambda (fun=-6917529027541352128, nargs=2, arg_vector=0x82a688) at eval.c:2879 #40 0x01225e1f in Ffuncall (nargs=3, args=0x82a680) at eval.c:2764 #41 0x0128959c in exec_byte_code (bytestr=-9223372036754938232, vector=-6917529027541242864, maxdepth=4611686018427387908, args_template=4611686018427388161, nargs=1, args=0x82abd8) at bytecode.c:639 #42 0x012268de in funcall_lambda (fun=-6917529027541242840, nargs=1, arg_vector=0x82abd0) at eval.c:2879 #43 0x01225e1f in Ffuncall (nargs=2, args=0x82abc8) at eval.c:2764 #44 0x0128959c in exec_byte_code (bytestr=-9223372036754939440, vector=-6917529027541248760, maxdepth=4611686018427387909, args_template=4611686018427388161, nargs=1, args=0x82b180) at bytecode.c:639 #45 0x012268de in funcall_lambda (fun=-6917529027541248584, nargs=1, arg_vector=0x82b178) at eval.c:2879 #46 0x01225e1f in Ffuncall (nargs=2, args=0x82b170) at eval.c:2764 #47 0x0128959c in exec_byte_code (bytestr=-9223372036754939488, vector=-6917529027541248536, maxdepth=4611686018427387920, args_template=4611686018427388161, nargs=1, args=0x82b848) at bytecode.c:639 #48 0x012268de in funcall_lambda (fun=-6917529027541248088, nargs=1, arg_vector=0x82b840) at eval.c:2879 #49 0x01225e1f in Ffuncall (nargs=2, args=0x82b838) at eval.c:2764 #50 0x0128959c in exec_byte_code 
(bytestr=-9223372036754945616, vector=-6917529027541250008, maxdepth=4611686018427387920, args_template=4611686018427388417, nargs=1, args=0x82bf80) at bytecode.c:639 #51 0x012268de in funcall_lambda (fun=-6917529027541249216, nargs=1, arg_vector=0x82bf78) at eval.c:2879 #52 0x01225e1f in Ffuncall (nargs=2, args=0x82bf70) at eval.c:2764 #53 0x0128959c in exec_byte_code (bytestr=-9223372036754832632, vector=-6917529027541134088, maxdepth=4611686018427387912, args_template=4611686018427388161, nargs=1, args=0x82c548) at bytecode.c:639 #54 0x012268de in funcall_lambda (fun=-6917529027541133960, nargs=1, arg_vector=0x82c540) at eval.c:2879 #55 0x01225e1f in Ffuncall (nargs=2, args=0x82c538) at eval.c:2764 #56 0x0128959c in exec_byte_code (bytestr=-9223372036754832680, vector=-6917529027541140920, maxdepth=4611686018427387914, args_template=4611686018427388160, nargs=0, args=0x82cba8) at bytecode.c:639 #57 0x012268de in funcall_lambda (fun=-6917529027541140760, nargs=0, arg_vector=0x82cba8) at eval.c:2879 #58 0x01225e1f in Ffuncall (nargs=1, args=0x82cba0) at eval.c:2764 #59 0x0128959c in exec_byte_code (bytestr=-9223372036757936560, vector=-6917529027544929360, maxdepth=4611686018427387927, args_template=4611686018427388161, nargs=1, args=0x82d5b8) at bytecode.c:639 #60 0x012268de in funcall_lambda (fun=-6917529027544928504, nargs=1, arg_vector=0x82d5b0) at eval.c:2879 #61 0x01225e1f in Ffuncall (nargs=2, args=0x82d5a8) at eval.c:2764 #62 0x0128959c in exec_byte_code (bytestr=-9223372036760992488, vector=-6917529027547292792, maxdepth=4611686018427387925, args_template=4611686018427387904, nargs=0, args=0x82e258) at bytecode.c:639 #63 0x012268de in funcall_lambda (fun=-6917529027547291064, nargs=0, arg_vector=0x82e258) at eval.c:2879 #64 0x01225e1f in Ffuncall (nargs=1, args=0x82e250) at eval.c:2764 #65 0x0128959c in exec_byte_code (bytestr=-9223372036756613024, vector=-6917529027542917320, maxdepth=4611686018427387916, args_template=4611686018427387904, nargs=0, 
args=0x82e850) at bytecode.c:639 #66 0x012268de in funcall_lambda (fun=-6917529027542916696, nargs=0, arg_vector=0x82e850) at eval.c:2879 #67 0x012263b2 in apply_lambda (fun=-6917529027542916696, args=0, count=21) at eval.c:2816 #68 0x01223de3 in eval_sub (form=-4611686018332481616) at eval.c:2233 #69 0x01222b56 in Feval (form=-4611686018332481616, lexical=0) at eval.c:2010 #70 0x01223886 in eval_sub (form=-4611686018328424112) at eval.c:2191 #71 0x0121c562 in Fprogn (body=-4611686018328424080) at eval.c:432 #72 0x012232ef in eval_sub (form=-4611686018328424240) at eval.c:2141 #73 0x01269693 in readevalloop (readcharfun=28056, stream=0x77c5fce0 <msvcrt!_iob+96>, sourcename=-9223372036838132000, printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980 #74 0x01266f73 in Fload (file=-9223372036838132176, noerror=0, nomessage=0, nosuffix=0, must_suffix=0) at lread.c:1367 #75 0x01223a45 in eval_sub (form=-4611686018340754848) at eval.c:2202 #76 0x012205e2 in internal_lisp_condition_case (var=0, bodyform=-4611686018340754848, handlers=-4611686018340754768) at eval.c:1285 #77 0x0121fe64 in Fcondition_case (args=-4611686018340754736) at eval.c:1211 #78 0x012232ef in eval_sub (form=-4611686018340754720) at eval.c:2141 #79 0x01222b56 in Feval (form=-4611686018340754720, lexical=0) at eval.c:2010 #80 0x01155640 in top_level_2 () at keyboard.c:1127 #81 0x01220675 in internal_condition_case (bfun=0x115560a <top_level_2>, handlers=21616, hfun=0x1154dc1 <cmd_error>) at eval.c:1314 #82 0x011556a6 in top_level_1 (ignore=0) at keyboard.c:1135 #83 0x0121f7fd in internal_catch (tag=57512, func=0x1155646 <top_level_1>, arg=0) at eval.c:1080 #84 0x01155522 in command_loop () at keyboard.c:1096 #85 0x011547f3 in recursive_edit_1 () at keyboard.c:703 #86 0x01154a8f in Frecursive_edit () at keyboard.c:774 #87 0x01152244 in main (argc=7, argv=0xa440d8) at emacs.c:1698 Lisp Backtrace: "modify-category-entry" (0x826d90) "byte-code" (0x827270) "load" (0x827a90) "if" (0x827cc0) 
"eval-buffer" (0x8280a0) "load-with-code-conversion" (0x8286c0) "require" (0x829070) "apply" (0x829068) "byte-compile-file-form-require" (0x829640) "byte-compile-file-form" (0x829ba8) 0x5f36be0 PVEC_COMPILED "byte-compile-recurse-toplevel" (0x82a688) "byte-compile-toplevel-file-form" (0x82abd0) 0x5f355b8 PVEC_COMPILED "byte-compile-from-buffer" (0x82b840) "byte-compile-file" (0x82bf78) "batch-byte-compile-file" (0x82c540) "batch-byte-compile" (0x82cba8) "command-line-1" (0x82d5b0) "command-line" (0x82e258) "normal-top-level" (0x82e850) "eval" (0x82eb30) "progn" (0x82ed20) "load" (0x82f500) "condition-case" (0x82f7d0) (gdb) fr 11 #11 0x01266f73 in Fload (file=-9223372036747853616, noerror=0, nomessage=55328, nosuffix=0, must_suffix=0) at lread.c:1367 1367 readevalloop (Qget_file_char, stream, hist_file_name, (gdb) pp file "kinsoku" (gdb) So it is loading kinsoku.el, and the code which triggers this is this: (while (< idx len) (setq ch (aref kinsoku-bol idx) idx (1+ idx)) (modify-category-entry ch ?>))) The category '>' is defined in characters.el. Surprisingly, characters.elc in this branch is identical to the file on master, so byte compilation (see below) is off the hook here. What else could explain that this category is deemed unknown? 
Running the ja-dic.el compilation under GDB, I see this: Thread 1 hit Breakpoint 3, Fsignal (error_symbol=9464, data=-4611686018325493760) at eval.c:1471 1471 signal_or_quit (error_symbol, data, false); args-out-of-range (gdb) pp data (" s " 2432 2432) (gdb) bt #0 Fsignal (error_symbol=9464, data=-4611686018325493760) at eval.c:1471 #1 0x0114c616 in xsignal (error_symbol=9464, data=-4611686018325493760) at lisp.h:3872 #2 0x0122120b in xsignal3 (error_symbol=9464, arg1=-9223372036754289936, arg2=4611686018427390336, arg3=4611686018427390336) at eval.c:1618 #3 0x011f8170 in args_out_of_range_3 (a1=-9223372036754289936, a2=4611686018427390336, a3=4611686018427390336) at data.c:169 #4 0x0122f302 in validate_subarray (array=-9223372036754289936, from=4611686018427390336, to=4611686018427390336, size=4, ifrom=0x828da8, ito=0x828da4) at fns.c:1257 #5 0x0122f3a1 in Fsubstring (string=-9223372036754289936, from=4611686018427390336, to=4611686018427390336) at fns.c:1282 #6 0x0128a8b7 in exec_byte_code (bytestr=-9223372036754295792, vector=-6917529027540600688, maxdepth=4611686018427387908, args_template=0, nargs=0, args=0x0) at bytecode.c:958 #7 0x01226fc8 in funcall_lambda (fun=-6917529027540652976, nargs=1, arg_vector=0x829398) at eval.c:2957 #8 0x01225e1f in Ffuncall (nargs=2, args=0x829390) at eval.c:2764 #9 0x0128959c in exec_byte_code (bytestr=-9223372036754295728, vector=-6917529027540600592, maxdepth=4611686018427387912, args_template=0, nargs=0, args=0x0) at bytecode.c:639 #10 0x01226fc8 in funcall_lambda (fun=-6917529027540600496, nargs=15663, arg_vector=0x60c72d0) at eval.c:2957 #11 0x01225e1f in Ffuncall (nargs=15664, args=0x60c72c8) at eval.c:2764 #12 0x012246e8 in Fapply (nargs=2, args=0x829970) at eval.c:2343 #13 0x01224f2d in apply1 (fn=-6917529027540600496, arg=-4611686018326943040) at eval.c:2559 #14 0x0121f650 in Fmacroexpand (form=-4611686018326943056, environment=-4611686018327011424) at eval.c:1035 #15 0x0122588f in Ffuncall (nargs=3, args=0x829b48) at 
eval.c:2718 #16 0x0128959c in exec_byte_code (bytestr=-9223372036758992784, vector=-6917529027545329872, maxdepth=4611686018427387914, args_template=4611686018427388418, nargs=2, args=0x82a110) at bytecode.c:639 #17 0x012268de in funcall_lambda (fun=-6917529027545333672, nargs=2, arg_vector=0x82a100) at eval.c:2879 #18 0x01225e1f in Ffuncall (nargs=3, args=0x82a0f8) at eval.c:2764 #19 0x0128959c in exec_byte_code (bytestr=-9223372036755022888, vector=-6917529027541352232, maxdepth=4611686018427387914, args_template=4611686018427388418, nargs=2, args=0x82a698) at bytecode.c:639 #20 0x012268de in funcall_lambda (fun=-6917529027541352128, nargs=2, arg_vector=0x82a688) at eval.c:2879 #21 0x01225e1f in Ffuncall (nargs=3, args=0x82a680) at eval.c:2764 #22 0x0128959c in exec_byte_code (bytestr=-9223372036754938232, vector=-6917529027541242864, maxdepth=4611686018427387908, args_template=4611686018427388161, nargs=1, args=0x82abd8) at bytecode.c:639 #23 0x012268de in funcall_lambda (fun=-6917529027541242840, nargs=1, arg_vector=0x82abd0) at eval.c:2879 #24 0x01225e1f in Ffuncall (nargs=2, args=0x82abc8) at eval.c:2764 #25 0x0128959c in exec_byte_code (bytestr=-9223372036754939440, vector=-6917529027541248760, maxdepth=4611686018427387909, args_template=4611686018427388161, nargs=1, args=0x82b180) at bytecode.c:639 #26 0x012268de in funcall_lambda (fun=-6917529027541248584, nargs=1, arg_vector=0x82b178) at eval.c:2879 #27 0x01225e1f in Ffuncall (nargs=2, args=0x82b170) at eval.c:2764 #28 0x0128959c in exec_byte_code (bytestr=-9223372036754939488, vector=-6917529027541248536, maxdepth=4611686018427387920, args_template=4611686018427388161, nargs=1, args=0x82b848) at bytecode.c:639 #29 0x012268de in funcall_lambda (fun=-6917529027541248088, nargs=1, arg_vector=0x82b840) at eval.c:2879 #30 0x01225e1f in Ffuncall (nargs=2, args=0x82b838) at eval.c:2764 #31 0x0128959c in exec_byte_code (bytestr=-9223372036754945616, vector=-6917529027541250008, maxdepth=4611686018427387920, 
args_template=4611686018427388417, nargs=1, args=0x82bf80) at bytecode.c:639 #32 0x012268de in funcall_lambda (fun=-6917529027541249216, nargs=1, arg_vector=0x82bf78) at eval.c:2879 #33 0x01225e1f in Ffuncall (nargs=2, args=0x82bf70) at eval.c:2764 #34 0x0128959c in exec_byte_code (bytestr=-9223372036754832632, vector=-6917529027541134088, maxdepth=4611686018427387912, args_template=4611686018427388161, nargs=1, args=0x82c548) at bytecode.c:639 #35 0x012268de in funcall_lambda (fun=-6917529027541133960, nargs=1, arg_vector=0x82c540) at eval.c:2879 #36 0x01225e1f in Ffuncall (nargs=2, args=0x82c538) at eval.c:2764 #37 0x0128959c in exec_byte_code (bytestr=-9223372036754832680, vector=-6917529027541140920, maxdepth=4611686018427387914, args_template=4611686018427388160, nargs=0, args=0x82cba8) at bytecode.c:639 #38 0x012268de in funcall_lambda (fun=-6917529027541140760, nargs=0, arg_vector=0x82cba8) at eval.c:2879 #39 0x01225e1f in Ffuncall (nargs=1, args=0x82cba0) at eval.c:2764 #40 0x0128959c in exec_byte_code (bytestr=-9223372036757936560, vector=-6917529027544929360, maxdepth=4611686018427387927, args_template=4611686018427388161, nargs=1, args=0x82d5b8) at bytecode.c:639 #41 0x012268de in funcall_lambda (fun=-6917529027544928504, nargs=1, arg_vector=0x82d5b0) at eval.c:2879 #42 0x01225e1f in Ffuncall (nargs=2, args=0x82d5a8) at eval.c:2764 #43 0x0128959c in exec_byte_code (bytestr=-9223372036760992488, vector=-6917529027547292792, maxdepth=4611686018427387925, args_template=4611686018427387904, nargs=0, args=0x82e258) at bytecode.c:639 #44 0x012268de in funcall_lambda (fun=-6917529027547291064, nargs=0, arg_vector=0x82e258) at eval.c:2879 #45 0x01225e1f in Ffuncall (nargs=1, args=0x82e250) at eval.c:2764 #46 0x0128959c in exec_byte_code (bytestr=-9223372036756613024, vector=-6917529027542917320, maxdepth=4611686018427387916, args_template=4611686018427387904, nargs=0, args=0x82e850) at bytecode.c:639 #47 0x012268de in funcall_lambda (fun=-6917529027542916696, 
nargs=0, arg_vector=0x82e850) at eval.c:2879 #48 0x012263b2 in apply_lambda (fun=-6917529027542916696, args=0, count=21) at eval.c:2816 #49 0x01223de3 in eval_sub (form=-4611686018332481616) at eval.c:2233 #50 0x01222b56 in Feval (form=-4611686018332481616, lexical=0) at eval.c:2010 #51 0x01223886 in eval_sub (form=-4611686018328424112) at eval.c:2191 #52 0x0121c562 in Fprogn (body=-4611686018328424080) at eval.c:432 #53 0x012232ef in eval_sub (form=-4611686018328424240) at eval.c:2141 #54 0x01269693 in readevalloop (readcharfun=28056, stream=0x77c5fce0 <msvcrt!_iob+96>, sourcename=-9223372036838132000, printflag=false, unibyte=0, readfun=0, start=0, end=0) at lread.c:1980 #55 0x01266f73 in Fload (file=-9223372036838132176, noerror=0, nomessage=0, nosuffix=0, must_suffix=0) at lread.c:1367 #56 0x01223a45 in eval_sub (form=-4611686018340754848) at eval.c:2202 #57 0x012205e2 in internal_lisp_condition_case (var=0, bodyform=-4611686018340754848, handlers=-4611686018340754768) at eval.c:1285 #58 0x0121fe64 in Fcondition_case (args=-4611686018340754736) at eval.c:1211 #59 0x012232ef in eval_sub (form=-4611686018340754720) at eval.c:2141 #60 0x01222b56 in Feval (form=-4611686018340754720, lexical=0) at eval.c:2010 #61 0x01155640 in top_level_2 () at keyboard.c:1127 #62 0x01220675 in internal_condition_case (bfun=0x115560a <top_level_2>, handlers=21616, hfun=0x1154dc1 <cmd_error>) at eval.c:1314 #63 0x011556a6 in top_level_1 (ignore=0) at keyboard.c:1135 #64 0x0121f7fd in internal_catch (tag=57512, func=0x1155646 <top_level_1>, arg=0) at eval.c:1080 #65 0x01155522 in command_loop () at keyboard.c:1096 #66 0x011547f3 in recursive_edit_1 () at keyboard.c:703 #67 0x01154a8f in Frecursive_edit () at keyboard.c:774 #68 0x01152244 in main (argc=7, argv=0xa440d8) at emacs.c:1698 Lisp Backtrace: "skkdic-extract-conversion-data" (0x829398) 0x5fd3950 PVEC_COMPILED "macroexpand" (0x829b50) "macroexp-macroexpand" (0x82a100) "byte-compile-recurse-toplevel" (0x82a688) 
"byte-compile-toplevel-file-form" (0x82abd0) 0x5f355b8 PVEC_COMPILED "byte-compile-from-buffer" (0x82b840) "byte-compile-file" (0x82bf78) "batch-byte-compile-file" (0x82c540) "batch-byte-compile" (0x82cba8) "command-line-1" (0x82d5b0) "command-line" (0x82e258) "normal-top-level" (0x82e850) "eval" (0x82eb30) "progn" (0x82ed20) "load" (0x82f500) "condition-case" (0x82f7d0) (gdb) fr 5 #5 0x0122f3a1 in Fsubstring (string=-9223372036754289936, from=4611686018427390336, to=4611686018427390336) at fns.c:1282 1282 validate_subarray (string, from, to, size, &ifrom, &ito); (gdb) pp string " s " (gdb) The error seems to come from this function in ja-dic-cnv.el: (defun skkdic-extract-conversion-data (entry) (string-match "^\\cj+[a-z]* " entry) (let ((kana (substring entry (match-beginning 0) (1- (match-end 0)))) (i (match-end 0)) candidates) (while (string-match "[^ ]+" entry i) (setq candidates (cons (match-string 0 entry) candidates)) (setq i (match-end 0))) (cons (skkdic-get-kana-compact-codes kana) candidates))) The call to 'substring' is the one that errors out. So this again points to some problem with categories, as "\\cj" is in the regexp. > make[2]: Entering directory '/home/raeburn/dev/emacs/s/lisp' > make -C ../leim all EMACS="../src/emacs" > make[3]: Entering directory '/home/raeburn/dev/emacs/s/leim' > /bin/mkdir -p ../lisp/leim/ja-dic > GEN ../lisp/leim/ja-dic/ja-dic.el > Loading ../src/dumped.elc... > Reading file "/home/raeburn/dev/emacs/s/leim/SKK-DIC/SKK-JISYO.L" ... > Processing OKURI-ARI entries ... 
> Debugger entered--Lisp error: (search-failed "^\\cH")
> re-search-forward("^\\cH")
> (let ((from (point)) to) (search-forward ";; okuri-nasi") (beginning-of-line) (setq to (point)) (narrow-to-region from to) (skkdic-convert-okuri-ari skkbuf buf) (widen) (goto-char to) (forward-line 1) (setq from (point)) (re-search-forward "^\\cH") (setq to (match-beginning 0)) (narrow-to-region from to) (skkdic-convert-postfix skkbuf buf) (widen) (goto-char to) (skkdic-convert-prefix skkbuf buf) (skkdic-collect-okuri-nasi) (skkdic-convert-okuri-nasi skkbuf buf) (save-current-buffer (set-buffer buf) (goto-char (point-max)) (insert ";;\n(provide 'ja-dic)\n\n" ";; Local Variables:\n" ";; version-control: never\n" ";; no-update-autoloads: t\n" ";; coding: utf-8\n" ";; End:\n\n" ";;; ja-dic.el ends here\n")))

Not sure why I didn't see the error with okuri-nasi, perhaps the previous build attempts already generated that. If I do

  touch leim/SKK-DIC/SKK-JISYO.L

the next "make" indeed fails as on your system.

One other thing I noticed is that most of the *.elc files produced by this build are different from those I see on master. The differences are sometimes just a few bytes (e.g., in mule-diag.elc), but sometimes much larger (e.g., files.elc). Perhaps this points to some subtle problem in byte compilation? But even if so, that cannot explain the failure to compile eww.el and ja-dic.el.

HTH
* Re: Skipping unexec via a big .elc file
From: Andreas Schwab @ 2017-01-14 10:55 UTC
To: Eli Zaretskii; +Cc: Ken Raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

> The line number in the error message is bogus, it points to a require
> line (that's a known issue with byte-compiler error reporting, I
> think).

It's not bogus, since the error was raised while the byte-compiler evaluated the form there. Lisp errors don't carry line number information so there isn't much the byte-compiler can do.

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-14 11:07 UTC
To: Andreas Schwab; +Cc: raeburn, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: Ken Raeburn <raeburn@raeburn.org>, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Sat, 14 Jan 2017 11:55:42 +0100
>
> On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:
>
> > The line number in the error message is bogus, it points to a require
> > line (that's a known issue with byte-compiler error reporting, I
> > think).
>
> It's not bogus, since the error was raised while the byte-compiler
> evaluated the form there.

It's "bogus" in the sense that it isn't useful for finding the code which triggered the error.
* Re: Skipping unexec via a big .elc file
From: Alan Mackenzie @ 2017-01-14 11:26 UTC
To: Eli Zaretskii; +Cc: raeburn, Andreas Schwab, monnier, emacs-devel

Hello, Eli.

On Sat, Jan 14, 2017 at 01:07:17PM +0200, Eli Zaretskii wrote:
> > From: Andreas Schwab <schwab@linux-m68k.org>
> > Cc: Ken Raeburn <raeburn@raeburn.org>, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> > Date: Sat, 14 Jan 2017 11:55:42 +0100

> > On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

> > > The line number in the error message is bogus, it points to a require
> > > line (that's a known issue with byte-compiler error reporting, I
> > > think).

> > It's not bogus, since the error was raised while the byte-compiler
> > evaluated the form there.

> It's "bogus" in the sense that it isn't useful for finding the code
> which triggered the error.

Just as a matter of interest, I spent quite a bit of time in the summer trying to fix this. My approach was this:

(i) The modified reader created an association list between each cons it creates and the source code position.
(ii) Each time a compiler function transformed such a cons, instead of the function returning the transformed form, it did setcar/setcdr into the original cons to preserve the mapping in the association table.
(iii) On emitting an error/warning, the compiler would look up the source code position in the association list.

I'm confident that such an approach would work. However, it was an enormous amount of work to adapt the compiler, and I got distracted by other things, so haven't managed to produce anything workable, yet.

At least there's already a reliable test suite for this, namely make bootstrap. :-)

--
Alan Mackenzie (Nuremberg, Germany).
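The three steps Alan describes can be illustrated with a small Emacs Lisp sketch. This is not code from his patch: every `my-` name is hypothetical, positions are registered by hand rather than by a modified reader, and an `eq` hash table stands in for the association list he mentions. It only shows why rewriting a cons in place with setcar/setcdr keeps the position lookup working.

```elisp
;; Hypothetical sketch of the idea, not code from the actual patch.
;; A real implementation would record positions inside the Lisp reader;
;; here they are registered by hand.
(defvar my-cons-positions (make-hash-table :test #'eq)
  "Map each cons cell produced by the reader to its source position.")

(defun my-record-position (cell pos)
  "Remember that CELL was read at source position POS, and return CELL."
  (puthash cell pos my-cons-positions)
  cell)

(defun my-transform-in-place (cell new-form)
  "Overwrite CELL with the contents of NEW-FORM, keeping CELL's identity.
Because CELL is still the same object, its recorded position stays valid."
  (setcar cell (car new-form))
  (setcdr cell (cdr new-form))
  cell)

(defun my-position-of (cell)
  "Return the recorded source position of CELL, or nil if unknown."
  (gethash cell my-cons-positions))

;; Usage: a form read at position 120 is rewritten by a compiler pass,
;; yet the lookup still finds where it came from.
(let ((form (my-record-position (list 'when 'x 'y) 120)))
  (my-transform-in-place form (list 'if 'x (list 'progn 'y)))
  (list form (my-position-of form)))
;; => ((if x (progn y)) 120)
```

Had the pass returned a freshly consed form instead, the lookup would yield nil, which is the situation step (ii) is meant to avoid.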
* Re: Skipping unexec via a big .elc file
From: Andreas Schwab @ 2017-01-14 12:19 UTC
To: Eli Zaretskii; +Cc: raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: Ken Raeburn <raeburn@raeburn.org>, monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> Date: Sat, 14 Jan 2017 11:55:42 +0100
>>
>> On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:
>>
>> > The line number in the error message is bogus, it points to a require
>> > line (that's a known issue with byte-compiler error reporting, I
>> > think).
>>
>> It's not bogus, since the error was raised while the byte-compiler
>> evaluated the form there.
>
> It's "bogus" in the sense that it isn't useful for finding the code
> which triggered the error.

It is as bogus as every Lisp error.

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-14 13:05 UTC
To: Andreas Schwab; +Cc: raeburn, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Date: Sat, 14 Jan 2017 13:19:12 +0100
> Cc: raeburn@raeburn.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
>
> >> > The line number in the error message is bogus, it points to a require
> >> > line (that's a known issue with byte-compiler error reporting, I
> >> > think).
> >>
> >> It's not bogus, since the error was raised while the byte-compiler
> >> evaluated the form there.
> >
> > It's "bogus" in the sense that it isn't useful for finding the code
> > which triggered the error.
>
> It is as bogus as every Lisp error.

On the contrary: most of them provide useful information about the error locus. This one clearly didn't.
* Re: Skipping unexec via a big .elc file
From: Andreas Schwab @ 2017-01-14 15:12 UTC
To: Eli Zaretskii; +Cc: raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Date: Sat, 14 Jan 2017 13:19:12 +0100
>> Cc: raeburn@raeburn.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
>>
>> >> > The line number in the error message is bogus, it points to a require
>> >> > line (that's a known issue with byte-compiler error reporting, I
>> >> > think).
>> >>
>> >> It's not bogus, since the error was raised while the byte-compiler
>> >> evaluated the form there.
>> >
>> > It's "bogus" in the sense that it isn't useful for finding the code
>> > which triggered the error.
>>
>> It is as bogus as every Lisp error.
>
> On the contrary: most of them provide useful information about the
> error locus. This one clearly didn't.

It accurately tells you the form that caused the error, something you never get from a Lisp error.

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-14 17:37 UTC
To: Andreas Schwab; +Cc: raeburn, monnier, emacs-devel

> From: Andreas Schwab <schwab@linux-m68k.org>
> Cc: raeburn@raeburn.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
> Date: Sat, 14 Jan 2017 16:12:22 +0100
>
> >> It is as bogus as every Lisp error.
> >
> > On the contrary: most of them provide useful information about the
> > error locus. This one clearly didn't.
>
> It accurately tells you the form that caused the error, something you
> never get from a Lisp error.

It's accurate, but utterly useless.
* Re: Skipping unexec via a big .elc file
From: Andreas Schwab @ 2017-01-14 18:50 UTC
To: Eli Zaretskii; +Cc: raeburn, monnier, emacs-devel

On Jan 14 2017, Eli Zaretskii <eliz@gnu.org> wrote:

>> From: Andreas Schwab <schwab@linux-m68k.org>
>> Cc: raeburn@raeburn.org, monnier@iro.umontreal.ca, emacs-devel@gnu.org
>> Date: Sat, 14 Jan 2017 16:12:22 +0100
>>
>> >> It is as bogus as every Lisp error.
>> >
>> > On the contrary: most of them provide useful information about the
>> > error locus. This one clearly didn't.
>>
>> It accurately tells you the form that caused the error, something you
>> never get from a Lisp error.
>
> It's accurate, but utterly useless.

No, it isn't.

Andreas.

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2017-01-14 15:30 UTC
To: Eli Zaretskii; +Cc: Ken Raeburn, schwab, emacs-devel

> leim/ja-dic/ja-dic.el:76:1:Error: Args out of range: " s ", 2432, 2432
[...]
> The line number in the error message is bogus, it points to a require
> line (that's a known issue with byte-compiler error reporting, I
> think).

I don't think it's "bogus": it says that the error occurred while compiling that `require` line, i.e. while loading the corresponding file.

You can set byte-compile-debug (along with debug-on-error) to get a backtrace which will be more useful.

Stefan
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-14 17:42 UTC
To: Stefan Monnier; +Cc: raeburn, schwab, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 14 Jan 2017 10:30:45 -0500
> Cc: Ken Raeburn <raeburn@raeburn.org>, schwab@linux-m68k.org,
> emacs-devel@gnu.org
>
> You can set byte-compile-debug (along with debug-on-error) to get
> a backtrace which will be more useful.

That doesn't help when one is presented with a build log.
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2017-01-14 18:11 UTC
To: emacs-devel

>> You can set byte-compile-debug (along with debug-on-error) to get
>> a backtrace which will be more useful.
> That doesn't help when one is presented with a build log.

Not directly, no, indeed. Usually I then fire an interactive Emacs, set the vars and call byte-compile-file to reproduce the problem in an environment where I can investigate the backtrace comfortably.

My point was simply that this is an *evaluation* error more than an error in the compiled code, so the poverty of the info is due to the poverty of info we get when running Elisp code (and this is indeed somewhat linked to the byte-compiler since the byte-compiler doesn't preserve the source location in the bytecode it emits).

Stefan
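The interactive workflow Stefan describes might look like the following when evaluated in a fresh interactive session, for example in *scratch*. Both variables are standard Emacs options; the file name is an assumption taken from the build log quoted earlier in the thread and should be adjusted to the local tree.

```elisp
;; Sketch only: reproduce a byte-compilation error with a usable backtrace.
;; The file name is an assumption based on the build log in this thread.
(setq debug-on-error t       ; enter the debugger when an error is signaled
      byte-compile-debug t)  ; raise compile errors as signals, not log entries
(byte-compile-file "lisp/leim/ja-dic/ja-dic.el")
```

With both variables set, the error should land in the *Backtrace* buffer instead of appearing only as a one-line message in the compilation log.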
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-01-14 20:13 UTC
To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 14 Jan 2017 13:11:26 -0500
>
> >> You can set byte-compile-debug (along with debug-on-error) to get
> >> a backtrace which will be more useful.
> > That doesn't help when one is presented with a build log.
>
> Not directly, no, indeed. Usually I then fire an interactive Emacs, set
> the vars and call byte-compile-file to reproduce the problem in an
> environment where I can investigate the backtrace comfortably.

There's more than one way of tracking the real locus of the problem. My point is that either way, it's an annoyance which makes investigation of such problems significantly less efficient than when the byte compiler points out the source file and the line number where it happens, or close thereabouts (which is what happens most of the time).

I gather that we are in violent agreement about that.
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2017-01-21 7:58 UTC
To: Eli Zaretskii; +Cc: Andreas Schwab, Stefan Monnier, Emacs developers

I think I may have figured out why I was getting crashes relating to the face cache but it wasn’t very reproducible.

Some of the face creation code paths will ensure that a cache exists for a frame before using it (like the handling of “menu” in internal-set-lisp-face-attribute) and some do not.

In a regular Emacs build, the order of operations in the C and Lisp code dictates the order in which face definitions are processed. So, for example, in a batch-mode test invocation I tried, the “menu” face handling created the cache for frame “F1” before using it.

But using dumped.elc, face property settings get restored, but the code generated assumes that the order doesn’t matter, so the list of face names depends not just on which Lisp code was loaded, but on the order they’re seen under “mapatoms”, i.e., based on load order and the obarray size. (So my Mac/NS and GNU/Linux/X11 builds have different lists of names, and different orders.)

I’m looking at internal-set-lisp-face-attribute as a place to always ensure the existence of the cache, but there may be a better location.

On Jan 14, 2017, at 05:41, Eli Zaretskii <eliz@gnu.org> wrote:
> [… much about failures I’m still looking at…]
> One other thing I noticed is that most of the *.elc files produced by
> this build are different from those I see on master. The differences
> are sometimes just a few bytes (e.g., in mule-diag.elc), but sometimes
> much larger (e.g., files.elc). Perhaps this points to some subtle
> problem in byte compilation? But even if so, that cannot explain the
> failure to compile eww.el and ja-dic.el.

I built a couple versions, and found several .elc files different. The first case I looked at was macroexp--const-symbol-p in macroexp.elc. From disassembling, it appears that the expression “(boundp 'byte-compile-const-variables)” is optimized out in the build from the branch point, but not in the build including the dumped.elc changes. I’m not sure why yet, but it’s almost certainly a bug that they’re different. And a bug affecting the emacs-lisp environment and/or the byte compiler output could certainly cause later attempts at byte compilation (using newly byte-compiled code) to misbehave.

Ken
* Re: Skipping unexec via a big .elc file 2017-01-21 7:58 ` Ken Raeburn @ 2017-01-22 16:55 ` Ken Raeburn 0 siblings, 0 replies; 375+ messages in thread From: Ken Raeburn @ 2017-01-22 16:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Andreas Schwab, Stefan Monnier, Emacs developers On Jan 21, 2017, at 02:58, Ken Raeburn <raeburn@raeburn.org> wrote: > I built a couple versions, and found several .elc files different. The first case I looked at was macroexp--const-symbol-p in macroexp.elc. From disassembling, it appears that the expression “(boundp 'byte-compile-const-variables)” is optimized out in the build from the branch point, but not in the build including the dumped.elc changes. I’m not sure why yet, but it’s almost certainly a bug that they’re different. And a bug affecting the emacs-lisp environment and/or the byte compiler output could certainly cause later attempts at byte compilation (using newly byte-compiled code) to misbehave. Ah, this may be a false alarm. I’d overlooked the fact that the updated version (October 31) of Stefan’s patch changed that code to insert that expression on the branch, and I assumed the two were compiling the same source. But if byte-compile-const-variables can be seen as unbound, that could also alter the optimization results compared to the master branch. Perhaps that should be fixed, if possible. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-01-14 10:41 ` Eli Zaretskii ` (2 preceding siblings ...) 2017-01-21 7:58 ` Ken Raeburn @ 2017-02-02 9:10 ` Ken Raeburn 2017-02-04 10:37 ` Eli Zaretskii 3 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-02-02 9:10 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Andreas Schwab, Stefan Monnier, emacs-devel On Jan 14, 2017, at 05:41, Eli Zaretskii <eliz@gnu.org> wrote: > It does fail for me while byte compiling 2 files: I still haven’t hit these. > The category '>' is defined in characters.el. Surprisingly, > characters.elc in this branch is identical to the file on master, so > byte compilation (see below) is off the hook here. What else could > explain that this category is deemed unknown? I recently noticed the standard syntax and category tables don’t appear to be among the information dumped out, so anything set by characters.el isn’t preserved. I also didn’t see an existing way to restore them trivially. (A huge list of “modify-syntax-entry” calls and such seems impractical.) Also, the buffer-local nature of some variables was being lost; that presented itself as a failure to get syntax-based highlighting in C source files. Having patched around these, I’m still failing on the same file, but later; it prompts me for the coding system to use to write an output file, because it’s not valid UTF-8. Apparently the leading comments copied from SKK-JISYO.L are being corrupted. The first non-ASCII bytes in the buffer (in the “ACKNOWLEDGEMENTS” part of the comment) are 0xe3, 0x81, 0x93, 0xe3, 0x81, 0xae, 0xe8 in a normal build and 0xf5, 0x80, 0x84, 0xac, 0xf5, 0x80, 0x85, 0x87 in my build. I discovered this maybe half an hour ago, so that’s as far as I’ve gotten. I’ve just pushed my changes, plus your change to avoid the dump-emacs call, to the branch. 
Saving and restoring the standard syntax table seems cheap, because it was already referenced by other objects that were dumped out, but the standard category table almost doubles the size of my dumped.elc, and presumably increases the time to read it accordingly. Perhaps reading characters.el(c) at startup would be a better choice. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-02-02 9:10 ` Ken Raeburn @ 2017-02-04 10:37 ` Eli Zaretskii 2017-02-05 14:19 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-02-04 10:37 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Thu, 2 Feb 2017 04:10:38 -0500 > Cc: Andreas Schwab <schwab@linux-m68k.org>, > Stefan Monnier <monnier@iro.umontreal.ca>, > emacs-devel@gnu.org > > Perhaps reading characters.el(c) at startup would be a better > choice. How about changing emacs.c to read characters.el(c) just after dumped.elc? Alternatively, would it be possible to simply append characters.elc to the end of dumped.elc, as part of preparing the latter? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-02-04 10:37 ` Eli Zaretskii @ 2017-02-05 14:19 ` Ken Raeburn 2017-02-05 15:51 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Ken Raeburn @ 2017-02-05 14:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Emacs developers > On Feb 4, 2017, at 05:37, Eli Zaretskii <eliz@gnu.org> wrote: > >> From: Ken Raeburn <raeburn@raeburn.org> >> Date: Thu, 2 Feb 2017 04:10:38 -0500 >> Cc: Andreas Schwab <schwab@linux-m68k.org>, >> Stefan Monnier <monnier@iro.umontreal.ca>, >> emacs-devel@gnu.org >> >> Perhaps reading characters.el(c) at startup would be a better >> choice. > > How about changing emacs.c to read characters.el(c) just after > dumped.elc? > > Alternatively, would it be possible to simply append characters.elc to > the end of dumped.elc, as part of preparing the latter? For now, I changed loadup.el to emit a “load” form to get characters.elc at startup, and that seems to be working. Copying the contents of characters.elc may be very slightly faster, but I haven’t done any timing tests. I also tracked down my new ja-dic-cnv problem. It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with. Calling unify-charset on the various charsets seems to be needed. With that change, I’m able to run “make bootstrap” in a GNU/Linux/X11 configuration and it runs to completion. I haven’t yet tested it on macOS, or compared the .elc for the differences you were describing. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-02-05 14:19 ` Ken Raeburn @ 2017-02-05 15:51 ` Eli Zaretskii 2017-02-05 23:19 ` Ken Raeburn 2017-02-05 20:03 ` Ken Brown 2017-02-25 14:52 ` Eli Zaretskii 2 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-02-05 15:51 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Sun, 5 Feb 2017 09:19:38 -0500 > Cc: Emacs developers <emacs-devel@gnu.org> > > For now, I changed loadup.el to emit a “load” form to get characters.elc at startup, and that seems to be working. Copying the contents of characters.elc may be very slightly faster, but I haven’t done any timing tests. > > I also tracked down my new ja-dic-cnv problem. It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with. Calling unify-charset on the various charsets seems to be needed. > > With that change, I’m able to run “make bootstrap” in a GNU/Linux/X11 configuration and it runs to completion. I haven’t yet tested it on macOS, or compared the .elc for the differences you were describing. Thanks. Are those changes committed to the branch? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2017-02-05 23:19 UTC
To: Eli Zaretskii; +Cc: emacs-devel

On Feb 5, 2017, at 10:51, Eli Zaretskii <eliz@gnu.org> wrote:

> Thanks. Are those changes committed to the branch?

Yes, I pushed them this morning.

Ken
* Re: Skipping unexec via a big .elc file
From: Ken Raeburn @ 2017-02-06 15:20 UTC
To: Eli Zaretskii; +Cc: emacs-devel

And now I’ve got a *possible* explanation for why I’m seeing differences in some .elc files.

It appears that the .elc output can vary depending on whether other loaded Lisp code was compiled or not.

I found differences in diary-lib.elc between two of my build trees (raeburn-startup branch, and the branch point). In the first function I looked at, the differences came down to this:

    335  constant  (12 31 -1)        constant  (12 31 -1)
    336  dup                         dup
    337  varbind   date              varbind   date
    339  dup                         varbind   date
    340  varbind   date              constant  (12 31 -1)
    342  car                         car
    343  unbind    1                 unbind    1

The constant here comes from calendar-absolute-from-gregorian, a defsubst in calendar.el. I tried with one source base (at the branch point), compiling diary-lib.el with calendar.elc present, and again with calendar.elc missing so that calendar.el would get used. The generated .elc files showed the same differences.

This is arguably a bug, but not one added by the big-elc changes.

I almost always build with a make option like “-j4”, so the timing of byte compilations of different files relative to one another isn’t entirely predictable. This *could* account for a lot of differences I’m seeing. It’s not a sure thing that this sort of thing is the only cause of differences; I’ll have to do a fully serialized bootstrap of both versions of the code to see.

Ken
* Re: Skipping unexec via a big .elc file
From: Stefan Monnier @ 2017-02-06 15:39 UTC
To: emacs-devel

> It appears that the .elc output can vary depending on whether other loaded
> Lisp code was compiled or not.

Indeed: the culprit is the defsubst implementation. Currently, if a defsubst function is already byte-compiled, the optimizer inlines its byte-code; if it is not yet byte-compiled, the optimizer inlines its source code instead.

We should probably change that so that when the optimizer finds that the defsubst function is not yet byte-compiled, it byte-compiles it and then inlines the resulting byte-code.

Stefan
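The two inlining cases Stefan describes can be tried out with a small Emacs Lisp sketch. This is only an illustration: `demo-first`, `demo-caller-before` and `demo-caller-after` are hypothetical names, not anything from calendar.el or diary-lib.el, and whether the two results actually differ depends on the Emacs version, since later byte compilers may handle the uncompiled case differently.

```elisp
(require 'bytecomp)

;; Hypothetical inline function, used only for this illustration.
(defsubst demo-first (cell)
  "Return the first element of CELL."
  (car cell))

;; Case 1: compile a caller while `demo-first' is still uncompiled source.
(defvar demo-caller-before
  (byte-compile '(lambda (cell) (demo-first cell))))

;; Case 2: byte-compile `demo-first' itself, then compile the same caller.
(byte-compile 'demo-first)
(defvar demo-caller-after
  (byte-compile '(lambda (cell) (demo-first cell))))

;; Both callers compute the same value; this only reports whether the
;; generated byte-code objects happen to be identical on this Emacs.
(message "Same byte-code in both cases: %s"
         (equal demo-caller-before demo-caller-after))
```

Running `disassemble` on each of the two objects gives a listing comparable to the one quoted from diary-lib.elc.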
* Re: Skipping unexec via a big .elc file 2017-02-06 15:39 ` Stefan Monnier @ 2017-02-06 19:08 ` Ken Raeburn 2017-02-06 22:39 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-02-06 19:08 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > On Feb 6, 2017, at 10:39, Stefan Monnier <monnier@iro.umontreal.ca> wrote: > >> It appears that the .elc output can vary depending on whether other loaded >> Lisp code was compiled or not. > > Indeed: the culprit is the defsubst implementation. Currently, if > a function is byte-compiled, the optimizer inlines its byte-codes > and when it's not yet byte-compiled, then it inlines the source code. > > We should probably change that so that when it finds that the defsubst > function is not yet byte-compiled, it byte-compiles it and then inlines > the resulting byte-codes. Is this a known (and filed) bug? A quick search for defsubst in debbugs only finds me one unrelated report. In any case, doing “make bootstrap” from clean trees (which I’m assuming will byte-compile files in the same order each time) still gets me a few differences between the branch point and the branch, including python.elc differing in use of dynamic docstrings, and url-handler.elc file-name-handler wrappers saying “no original documentation”. Still more to debug, I guess. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-02-06 19:08 ` Ken Raeburn @ 2017-02-06 22:39 ` Stefan Monnier 2017-02-08 10:31 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2017-02-06 22:39 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > Is this a known (and filed) bug? I don't think it's filed, no. I've known about it for a while now, and it came up "recently" in the discussion about reproducible builds. Until then it wasn't considered as a real bug, I think, more like a quirk. > In any case, doing “make bootstrap” from clean trees (which I’m assuming > will byte-compile files in the same order each time) Not sure if make guarantees a specific order of execution in that case, but in my experience I think it does operate in a deterministic way, indeed. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-02-06 22:39 ` Stefan Monnier @ 2017-02-08 10:31 ` Ken Raeburn 2017-02-08 14:38 ` Ken Brown 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-02-08 10:31 UTC (permalink / raw) To: Emacs developers On Feb 6, 2017, at 17:39, Stefan Monnier <monnier@IRO.UMontreal.CA> wrote: >> Is this a known (and filed) bug? > > I don't think it's filed, no. I've known about it for a while now, and > it came up "recently" in the discussion about reproducible builds. > Until then it wasn't considered as a real bug, I think, more like > a quirk. Ah, okay. I didn’t follow that discussion closely. I haven’t got the bandwidth to keep up on everything, and until now I thought I didn’t care about this one. :-) With my bootstrap builds running without parallel make, I’ve gotten things much further along in terms of generating .elc files that match what I get without all the big-elc changes. The difference in progmodes/python.elc came down to the use of UTF-8 in the environment during byte compilation affecting the generated doc strings (using format-message in a macro). Removing internal--text-quoting-flag from the stuff saved in dumped.elc made the files match for me on my Mac (with UTF-8 in use by default). I think that’s just papering over the real problem (the macro’s result shouldn’t depend on the UTF-8-ness of the environment), but the flag should reflect the environment of the current Emacs invocation anyway, not the one that produced dumped.elc. The difference in url/url-handler.elc was because the subr doc strings were getting lost. The numbers (“DOC” file offsets) stored in the Lisp_Subr structure weren’t preserved, so url-handlers-create-wrapper would just fill in “No original documentation.” I’m making dumped.elc invoke Snarf-documentation for now. 
A tangent: As it happens, a couple years back I was experimenting with having C-based subr/variable documentation stuffed into the executable instead of needing the DOC file, in ways that wouldn’t add a lot of Lisp data unless the doc strings were actually needed. For subr documentation, it doesn’t create Lisp strings until they’re requested. For variables, I’ve got an idea on deferring the Lisp string creation, but currently they’re created at startup and stuffed into the property list. I’ve just updated it to recent Emacs sources, in case we might want to explore that direction further; it might be more efficient than patching up doc pointers every time we start up. Anyway, with the changes I’ve just pushed to the branch, my bootstrapped tree has .elc files that match those built from the branch point, except for mule.elc, macroexp.elc (both source files changed on the branch), bytecomp.elc and byte-opt.elc (probably due to macroexp changes). I haven’t tried any more extensive testing. There may be some funny stuff going on in restoring the charset definitions that I still need to look into. I haven’t pulled in Ken Brown’s Cygwin changes; Ken, feel free to push those to the branch as well. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
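The python.elc difference mentioned above comes from `format-message` producing different quote characters depending on the text quoting style in effect. A minimal sketch of that mechanism, assuming the style is forced through the `text-quoting-style` variable rather than derived from the environment (the helper `demo-quoted-doc` is a hypothetical name, and this is not the macro from python.el):

```elisp
;; Hypothetical helper: format one fixed message under a given quoting style.
(defun demo-quoted-doc (style)
  "Return a sample message formatted under text quoting STYLE."
  (let ((text-quoting-style style))
    (format-message "Use `%s' here" "foo")))

(demo-quoted-doc 'grave)   ; => "Use `foo' here"
(demo-quoted-doc 'curve)   ; => "Use ‘foo’ here"
```

If a macro calls `format-message` at compile time, the resulting string is baked into the .elc file, so two builds running under different quoting styles can emit different doc strings from the same source.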
* Re: Skipping unexec via a big .elc file
From: Ken Brown @ 2017-02-08 14:38 UTC
To: Ken Raeburn, Emacs developers

On 2/8/2017 5:31 AM, Ken Raeburn wrote:

> I haven’t pulled in Ken Brown’s Cygwin changes; Ken, feel free to push those to the branch as well.

Done.

Ken
* Re: Skipping unexec via a big .elc file
From: Ken Brown @ 2017-02-05 20:03 UTC
To: Ken Raeburn, Eli Zaretskii; +Cc: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 230 bytes --]

On 2/5/2017 9:19 AM, Ken Raeburn wrote:

> With that change, I’m able to run “make bootstrap” in a GNU/Linux/X11 configuration and it runs to completion.

The attached patch enables the build to succeed on Cygwin.

Ken

[-- Attachment #2: 0001-Fix-build-on-Cygwin.patch --]
[-- Type: text/plain, Size: 1518 bytes --]

From 165c3356ebb5413277197e4e17c97e7758f96396 Mon Sep 17 00:00:00 2001
From: Ken Brown <kbrown@cornell.edu>
Date: Sun, 5 Feb 2017 14:58:59 -0500
Subject: [PATCH] Fix build on Cygwin

* configure.ac: Use system malloc on Cygwin.
* lisp/loadup.el: Use ".exe" suffix on Cygwin.
---
 configure.ac   | 4 +---
 lisp/loadup.el | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index 425e338..c1fd14d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2158,9 +2158,7 @@ AC_DEFUN
 test "$CANNOT_DUMP" = yes ||
 case "$opsys" in
   ## darwin ld insists on the use of malloc routines in the System framework.
-  darwin | mingw32 | nacl | sol2-10) ;;
-  cygwin) hybrid_malloc=yes
-          system_malloc= ;;
+  cygwin | darwin | mingw32 | nacl | sol2-10) ;;
   *) test "$ac_cv_func_sbrk" = yes && system_malloc=$emacs_cv_sanitize_address;;
 esac

diff --git a/lisp/loadup.el b/lisp/loadup.el
index 80e9a28..72f24a6 100644
--- a/lisp/loadup.el
+++ b/lisp/loadup.el
@@ -455,7 +455,7 @@
   ;; other GNU program's build process.
   ;; (dump-emacs "emacs" "temacs")
   ;; (message "%d pure bytes used" pure-bytes-used)
-  (let ((exe (if (memq system-type '(windows-nt ms-dos)) ".exe" "")))
+  (let ((exe (if (memq system-type '(cygwin windows-nt ms-dos)) ".exe" "")))
     (copy-file (expand-file-name (concat "temacs" exe) invocation-directory)
                (expand-file-name (concat "emacs" exe) invocation-directory)
                t)
--
2.8.3
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-02-25 14:52 UTC
To: Ken Raeburn; +Cc: emacs-devel

> From: Ken Raeburn <raeburn@raeburn.org>
> Date: Sun, 5 Feb 2017 09:19:38 -0500
> Cc: Emacs developers <emacs-devel@gnu.org>
>
> I also tracked down my new ja-dic-cnv problem. It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with. Calling unify-charset on the various charsets seems to be needed.

Is this part in the repository? Because I still get prompted for an encoding when producing ja-dic.el:

  GEN ../lisp/leim/ja-dic/ja-dic.el
  Reading file "d:/gnu/git/emacs/no-unexec/leim/SKK-DIC/SKK-JISYO.L" ...
  Processing OKURI-ARI entries ...
  Processing POSTFIX entries ...
  Processing PREFIX entries ...
  Collecting OKURI-NASI entries ...
  collected 26% ...
  collected 30% ...
  collected 40% ...
  collected 50% ...
  collected 60% ...
  collected 70% ...
  collected 80% ...
  collected 90% ...
  Processing OKURI-NASI entries ...
  processed 10% ...
  processed 20% ...
  processed 30% ...
  processed 40% ...
  processed 50% ...
  processed 60% ...
  processed 70% ...
  processed 80% ...
  processed 90% ...
  processed 100% ...
  Select coding system (default japanese-shift-jis): utf-8-unix

I needed to type utf-8-unix by hand. Any ideas? Is it possible that this happens because my default encoding is not UTF-8?

I also pushed a small Windows-specific change to the branch, to allow Windows users to try building this branch.

Thanks.
* Re: Skipping unexec via a big .elc file
From: Eli Zaretskii @ 2017-02-25 15:19 UTC
To: raeburn; +Cc: emacs-devel

> Date: Sat, 25 Feb 2017 16:52:12 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-devel@gnu.org
>
> Is this part in the repository? Because I still get prompted for an
> encoding when producing ja-dic.el:
>
>   GEN ../lisp/leim/ja-dic/ja-dic.el

Also, it looks like the logic in startup.el that should bypass certain stuff under -Q isn't working, because I see my abbrevs being loaded even though I invoked "emacs -Q". Thoughts?
* Re: Skipping unexec via a big .elc file 2017-02-25 14:52 ` Eli Zaretskii 2017-02-25 15:19 ` Eli Zaretskii @ 2017-02-26 12:37 ` Ken Raeburn 2017-03-04 14:23 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-02-26 12:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > On Feb 25, 2017, at 09:52, Eli Zaretskii <eliz@gnu.org> wrote: > >> From: Ken Raeburn <raeburn@raeburn.org> >> Date: Sun, 5 Feb 2017 09:19:38 -0500 >> Cc: Emacs developers <emacs-devel@gnu.org> >> >> I also tracked down my new ja-dic-cnv problem. It looks like SKK-JISYO.L was being mangled on read because the input sequences weren’t recognized as Unicode compatible; this caused the resulting buffer not to be considered UTF-8 compatible, so it prompted for a coding system to write with. Calling unify-charset on the various charsets seems to be needed. > > Is this part in the repository? Because I still get prompted for an > encoding when producing ja-dic.el: Yes, change d864464 has the unify-charset changes. > > GEN ../lisp/leim/ja-dic/ja-dic.el > Reading file "d:/gnu/git/emacs/no-unexec/leim/SKK-DIC/SKK-JISYO.L" ... > Processing OKURI-ARI entries ... > Processing POSTFIX entries ... > Processing PREFIX entries ... > Collecting OKURI-NASI entries ... > collected 26% ... > collected 30% ... > collected 40% ... > collected 50% ... > collected 60% ... > collected 70% ... > collected 80% ... > collected 90% ... > Processing OKURI-NASI entries ... > processed 10% ... > processed 20% ... > processed 30% ... > processed 40% ... > processed 50% ... > processed 60% ... > processed 70% ... > processed 80% ... > processed 90% ... > processed 100% ... > Select coding system (default japanese-shift-jis): utf-8-unix > > I needed to type utf-8-unix by hand. Any ideas? Is it possible that > this happens because my default encoding is not UTF-8? Looks like my environment has LANG=en_US.UTF-8, on Mac and GNU/Linux. 
But setting LANG=C or en_US.ISO8859-1 doesn’t seem to cause the build to get hung up this way for me. Did you do a full bootstrap after updating? An outdated dumped.elc could certainly do this, and I know at least some of the dependencies aren’t current with the changes on the branch. (I’ve taken to going as far as “git clean -f -d -x”, then using autogen.sh, configure, and “make bootstrap”, fairly often.) > I also pushed a small Windows-specific change to the branch, to allow > Windows users try building this branch. Great! > Also, it looks like the logic in startup.el that should bypass certain > stuff under -Q isn't working, because I see my abbrevs being loaded > even though I invoked "emacs -Q". Thoughts? Strange… this is also working for me. At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-02-26 12:37 ` Ken Raeburn @ 2017-03-04 14:23 ` Eli Zaretskii 2017-03-06 8:46 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-03-04 14:23 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Sun, 26 Feb 2017 07:37:56 -0500 > Cc: emacs-devel@gnu.org > > > GEN ../lisp/leim/ja-dic/ja-dic.el > > Reading file "d:/gnu/git/emacs/no-unexec/leim/SKK-DIC/SKK-JISYO.L" ... > > Processing OKURI-ARI entries ... > > Processing POSTFIX entries ... > > Processing PREFIX entries ... > > Collecting OKURI-NASI entries ... > > collected 26% ... > > collected 30% ... > > collected 40% ... > > collected 50% ... > > collected 60% ... > > collected 70% ... > > collected 80% ... > > collected 90% ... > > Processing OKURI-NASI entries ... > > processed 10% ... > > processed 20% ... > > processed 30% ... > > processed 40% ... > > processed 50% ... > > processed 60% ... > > processed 70% ... > > processed 80% ... > > processed 90% ... > > processed 100% ... > > Select coding system (default japanese-shift-jis): utf-8-unix > > > > I needed to type utf-8-unix by hand. Any ideas? Is it possible that > > this happens because my default encoding is not UTF-8? > > Looks like my environment has LANG=en_US.UTF-8, on Mac and GNU/Linux. But setting LANG=C or en_US.ISO8859-1 doesn’t seem to cause the build to get hung up this way for me. > > Did you do a full bootstrap after updating? An outdated dumped.elc could certainly do this, and I know at least some of the dependencies aren’t current with the changes on the branch. (I’ve taken to going as far as “git clean -f -d -x”, then using autogen.sh, configure, and “make bootstrap”, fairly often.) I've bootstrapped now, and this problem is gone. Thanks. 
> > Also, it looks like the logic in startup.el that should bypass certain
> > stuff under -Q isn't working, because I see my abbrevs being loaded
> > even though I invoked "emacs -Q". Thoughts?
>
> Strange… this is also working for me. At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”.

This problem is still there. It has nothing to do with loading ~/.emacs, though: startup.el always loads your ~/.emacs.d/abbrev_defs, if that file exists. I'm not sure why it loads that file, but I verified that the master version does that as well.

So the issue here is not that the file is loaded, but how it is processed. I only noticed this because my abbrev_defs file uses a function that is only defined in my .emacs. So "emacs -Q" on the raeburn-startup branch barfs because that function is not known. Strangely, "emacs -Q" on the master branch doesn't signal an error, and I don't even see Fsignal called if I set a breakpoint there. I don't (yet) understand why the behavior differs.

If you insert into your abbrev_defs file something that references a function which is not defined, do you see the same problem as I do?

Btw, one thing that I saw while debugging is that purify-flag is set to t while running the startup code. This is because init_alloc_once is called during startup (previously, it was only called by temacs). I don't know if this is related to the issue (setting purify-flag to nil in Frecursive_edit didn't help), but I thought I'd bring it up, because maybe we need to set it to nil earlier.
* Re: Skipping unexec via a big .elc file 2017-03-04 14:23 ` Eli Zaretskii @ 2017-03-06 8:46 ` Ken Raeburn 2017-03-11 12:27 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-03-06 8:46 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Mar 4, 2017, at 09:23, Eli Zaretskii <eliz@gnu.org> wrote: >>> Also, it looks like the logic in startup.el that should bypass certain >>> stuff under -Q isn't working, because I see my abbrevs being loaded >>> even though I invoked "emacs -Q". Thoughts? >> >> Strange… this is also working for me. At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”. > > This problem is still there. It has nothing to do with loading > ~/.emacs, though: startup.el always loads your ~/.emacs.d/abbrev_defs, > if that file exists. I'm not sure why it loads that file, but I > verified that the master version does that as well. Odd, seems like -Q should skip that, with the rest of the user’s initializations. > > So the issue here is not that the file is loaded, but how it is > processed. I only noticed this because my abbrev_defs file uses a > function that is only defined in my .emacs. So "emacs -Q" on the > raeburn-startup branch barfs because that function is not known. > Strangely, "emacs -Q" on the master branch doesn't signal an error, > and I don't even see Fsignal called if I set a breakpoint there. I > don't (yet) understand why the different behavior. > > If you insert into your abbrev_defs file something that references a > function which is not defined, do you see the same problem as I do? I added a line: (missing-function) in between some define-abbrev-table invocations, and “emacs -Q” on master (2-3 weeks old) and raeburn-startup both complain about it for me. > Btw, one thing that I saw while debugging is that purify-flag is set > to t while running the startup code. This is because init_alloc_once > is called during startup (previously, it was only called by temacs). 
> I don't know if this is related to the issue (setting purify-flag to > nil in Frecursive_edit didn't help), but I thought I'd bring it up, > because maybe we need to set it to nil earlier. I’ve been thinking that the branch should probably set CANNOT_DUMP unconditionally. The behavior around pure storage and such under CANNOT_DUMP is probably closer to what we want for the branch. But there’s at least one bug in building with CANNOT_DUMP for macOS I’ve got to clear up first. As you say, it may not be at all related to the problem you’re running into, though. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-06 8:46 ` Ken Raeburn @ 2017-03-11 12:27 ` Eli Zaretskii 2017-03-11 13:18 ` Andreas Schwab 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-03-11 12:27 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Mon, 6 Mar 2017 03:46:17 -0500 > Cc: emacs-devel@gnu.org > > On Mar 4, 2017, at 09:23, Eli Zaretskii <eliz@gnu.org> wrote: > > >>> Also, it looks like the logic in startup.el that should bypass certain > >>> stuff under -Q isn't working, because I see my abbrevs being loaded > >>> even though I invoked "emacs -Q". Thoughts? > >> > >> Strange… this is also working for me. At least, settings from my .emacs aren’t being applied, when I use “emacs -Q”. > > > > This problem is still there. It has nothing to do with loading > > ~/.emacs, though: startup.el always loads your ~/.emacs.d/abbrev_defs, > > if that file exists. I'm not sure why it loads that file, but I > > verified that the master version does that as well. > > Odd, seems like -Q should skip that, with the rest of the user’s initializations. Maybe so, but this code has been there since about forever, and the documentation of -Q doesn't say user's abbrevs are bypassed, only under -batch. In any case, it's a separate problem. > > So the issue here is not that the file is loaded, but how it is > > processed. I only noticed this because my abbrev_defs file uses a > > function that is only defined in my .emacs. So "emacs -Q" on the > > raeburn-startup branch barfs because that function is not known. > > Strangely, "emacs -Q" on the master branch doesn't signal an error, > > and I don't even see Fsignal called if I set a breakpoint there. I > > don't (yet) understand why the different behavior. > > > > If you insert into your abbrev_defs file something that references a > > function which is not defined, do you see the same problem as I do? 
> > I added a line: > > (missing-function) > > in between some define-abbrev-table invocations, and “emacs -Q” on master (2-3 weeks old) and raeburn-startup both complain about it for me. I debugged this some more: this has nothing to do with unknown functions, you just need to have global abbrevs in the abbrev_defs file, for example: (define-abbrev-table 'global-abbrev-table '( ("abbout" "about" nil 0) ("abotu" "about" nil 0))) The problem seems to be that global-abbrev-table is not an abbrev table where startup.el calls quietly-read-abbrev-file (abbrev-table-p returns nil for it). If I make this change: diff --git a/lisp/startup.el b/lisp/startup.el index 4a04f9c..7f55962 100644 --- a/lisp/startup.el +++ b/lisp/startup.el @@ -1263,6 +1263,8 @@ command-line (deactivate-mark))) ;; If the user has a file of abbrevs, read it (unless -batch). + (or (abbrev-table-p global-abbrev-table) + (setq global-abbrev-table (make-abbrev-table))) (when (and (not noninteractive) (file-exists-p abbrev-file-name) (file-readable-p abbrev-file-name)) then "emacs -Q" starts up normally. Can you reproduce this? global-abbrev-table is defined in abbrev.el like this: (defvar global-abbrev-table (make-abbrev-table) "The abbrev table whose abbrevs affect all buffers. Each buffer may also have a local abbrev table. If it does, the local table overrides the global one for any particular abbrev defined in both.") So I think the issue could be that this defvar somehow doesn't end up in dumped.elc as an abbrev table under the new build procedure. Does that make sense? ^ permalink raw reply related [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-11 12:27 ` Eli Zaretskii @ 2017-03-11 13:18 ` Andreas Schwab 2017-03-11 13:42 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Andreas Schwab @ 2017-03-11 13:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Ken Raeburn, emacs-devel On Mär 11 2017, Eli Zaretskii <eliz@gnu.org> wrote: > So I think the issue could be that this defvar somehow doesn't end up > in dumped.elc as an abbrev table under the new build procedure. Does that make sense? I think the problem is that an abbrev table is actually an obarray, which does not have a suitable print syntax. ELISP> global-abbrev-table [## 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] ELISP> (abbrev-table-p [## 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]) nil Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 375+ messages in thread
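A minimal sketch of the point above, assuming only the stock abbrev.el functions: it prints an empty abbrev table, reads the text back, and checks whether the result is still recognised as an abbrev table. The helper name my/abbrev-table-round-trips-p is hypothetical and exists only for this illustration.

```elisp
(require 'abbrev)

(defun my/abbrev-table-round-trips-p (table)
  "Return non-nil if TABLE is still an abbrev table after print and read.
Hypothetical helper, for illustration only."
  (condition-case nil
      (abbrev-table-p
       (car (read-from-string (prin1-to-string table))))
    ;; Some Emacs versions print obarrays in a form `read' rejects
    ;; outright; treat that as "did not survive" as well.
    (error nil)))

;; A freshly made table is an abbrev table, but its printed form is not.
(list (abbrev-table-p (make-abbrev-table))
      (my/abbrev-table-round-trips-p (make-abbrev-table)))
```

On the Emacs versions I know of, the second element should come out as nil, which matches the abbrev-table-p result shown in the message above. Either the text reads back as a plain vector without the table's properties, or it does not read back at all.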
* Re: Skipping unexec via a big .elc file 2017-03-11 13:18 ` Andreas Schwab @ 2017-03-11 13:42 ` Eli Zaretskii 2017-03-11 15:48 ` Stefan Monnier 2017-03-11 23:59 ` Ken Raeburn 2 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2017-03-11 13:42 UTC (permalink / raw) To: Andreas Schwab; +Cc: raeburn, emacs-devel > From: Andreas Schwab <schwab@linux-m68k.org> > Cc: Ken Raeburn <raeburn@raeburn.org>, emacs-devel@gnu.org > Date: Sat, 11 Mar 2017 14:18:28 +0100 > > On Mär 11 2017, Eli Zaretskii <eliz@gnu.org> wrote: > > > So I think the issue could be that this defvar somehow doesn't end up > > in dumped.elc as an abbrev table under the new build procedure. Does that make sense? > > I think the problem is that an abbrev table is actually an obarray, > which does not have a suitable print syntax. Indeed, makes perfect sense. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-11 13:18 ` Andreas Schwab 2017-03-11 13:42 ` Eli Zaretskii @ 2017-03-11 15:48 ` Stefan Monnier 2017-03-11 21:48 ` Richard Stallman 2017-03-11 23:59 ` Ken Raeburn 2 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2017-03-11 15:48 UTC (permalink / raw) To: emacs-devel > I think the problem is that an abbrev table is actually an obarray, > which does not have a suitable print syntax. Maybe now would be a good time to change the representation of abbrev-tables? Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-11 15:48 ` Stefan Monnier @ 2017-03-11 21:48 ` Richard Stallman 2017-03-11 22:06 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2017-03-11 21:48 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Maybe now would be a good time to change the representation of > abbrev-tables? How would we prefer for them to print? I am not convinced that it would be useful or convenient to have them print out in a way that describes their contents. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-11 21:48 ` Richard Stallman @ 2017-03-11 22:06 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2017-03-11 22:06 UTC (permalink / raw) To: emacs-devel >> Maybe now would be a good time to change the representation of >> abbrev-tables? > How would we prefer for them to print? I'm not talking about "representation" in the sense of "print format" but in terms of which data-structure to use for them. But yes, of course that will affect the way they print. Clearly, I'd hope that the new representation would print `read'ably. > I am not convinced that it would be useful or convenient > to have them print out in a way that describes their contents. The way abbrev-tables print right now is just plain bad: it's neither computer-readable nor human-readable. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-11 13:18 ` Andreas Schwab 2017-03-11 13:42 ` Eli Zaretskii 2017-03-11 15:48 ` Stefan Monnier @ 2017-03-11 23:59 ` Ken Raeburn 2017-03-12 17:06 ` Stefan Monnier 2017-03-13 8:25 ` Ken Raeburn 2 siblings, 2 replies; 375+ messages in thread From: Ken Raeburn @ 2017-03-11 23:59 UTC (permalink / raw) To: Andreas Schwab; +Cc: Eli Zaretskii, emacs-devel On Mar 11, 2017, at 08:18, Andreas Schwab <schwab@linux-m68k.org> wrote: > On Mär 11 2017, Eli Zaretskii <eliz@gnu.org> wrote: > >> So I think the issue could be that this defvar somehow doesn't end up >> in dumped.elc as an abbrev table under the new build procedure. Does that make sense? > > I think the problem is that an abbrev table is actually an obarray, > which does not have a suitable print syntax. Ah, yes. Thanks for noticing that. And just yesterday I was thinking how convenient — and surprising — it was that we didn’t have to dump out any obarray objects; oh well. Unless we’re going to arrange for obarrays to be printable and readable in a useful form, they’ll need special-casing. But abbrev variables should be easy enough to recognize and process. I’ll take a look. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-11 23:59 ` Ken Raeburn @ 2017-03-12 17:06 ` Stefan Monnier 2017-03-13 8:25 ` Ken Raeburn 1 sibling, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2017-03-12 17:06 UTC (permalink / raw) To: emacs-devel > And just yesterday I was thinking how convenient -- and surprising -- it was > that we didn’t have to dump out any obarray objects; oh well. Unless we’re > going to arrange for obarrays to be printable and readable in a useful form, > they’ll need special-casing. But abbrev variables should be easy enough to > recognize and process. I’ll take a look. My personal favorite choice is to deprecate obarrays (most uses would be better served by a hash-table), but getting rid of them completely is rather tricky. So we probably want to solve the obarray problem regardless of whether we deprecate them. It seems fairly easy, tho: - add a `make-obarray` function, which basically does the same as `make-vector` but uses another tag. Use it in abbrev.el (and other applicable places). - change `intern` and friends to accept those other kinds of vectors. - change print.c to do something more clever with obarrays. - deprecate use of plain vectors as obarrays. I'm in the mood for procrastinating, so don't be surprised if a patch shows up, Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
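To illustrate why a hash table is attractive here, a small sketch, assuming a table that maps abbrev strings straight to their expansions. That is a simplification: real abbrev tables also carry per-abbrev properties and hooks. The helper name my/hash-table-round-trip is hypothetical. Unlike an obarray, a hash table has a read syntax, so it can be printed and read back.

```elisp
(defun my/hash-table-round-trip (table)
  "Return a copy of TABLE obtained by printing it and reading it back.
Hypothetical helper, for illustration only."
  (car (read-from-string (prin1-to-string table))))

(let ((abbrevs (make-hash-table :test 'equal)))
  (puthash "abotu" "about" abbrevs)
  ;; Look the abbrev up in the re-read copy, not in the original.
  (gethash "abotu" (my/hash-table-round-trip abbrevs)))
;; => "about"
```

The copy is a distinct object, so any sharing between tables would still have to be rebuilt separately.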
* Re: Skipping unexec via a big .elc file 2017-03-11 23:59 ` Ken Raeburn 2017-03-12 17:06 ` Stefan Monnier @ 2017-03-13 8:25 ` Ken Raeburn 2017-03-26 16:44 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-03-13 8:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Emacs developers [-- Attachment #1: Type: text/plain, Size: 871 bytes --] I have a patch which seems to recreate all the abbrev tables that were in the initial Emacs process, including the sharing (lisp-mode-abbrev-table being a parent of emacs-lisp-mode-abbrev-table; local-abbrev-table set to the fundamental mode table). Please let me know if it fixes your problem. It depends on a couple key things: (1) The abbrev tables are empty at that point, so we don’t have to worry about reconstructing all the abbrevs in a table; this can be fixed, it’s just tedious. (2) Abbrev-table values are only used as symbol values, or parents of other abbrev tables. This is much harder. Stefan’s printable replacement for obarrays would probably be a better solution. Though, normally I’d expect people to want printing of an obarray to show symbol names, and for this use case we need the function, value, and plist data as well. Ken [-- Attachment #2: abbrev-table-patch --] [-- Type: application/octet-stream, Size: 4844 bytes --] commit dd4d7941a3915d16a0f78ab547e3656ec95f4f29 Author: Ken Raeburn <raeburn@raeburn.org> Date: Mon Mar 13 03:21:53 2017 -0400 Dump and restore empty abbrev tables. Abbrev tables are obarrays and thus don't print out in a useful form. They need to be assembled at load time. Fortunately, loadup.el only gives us empty abbrev tables, so we don't have to actually restore any abbrevs, only the tables. * lisp/loadup.el: When variable values are abbrev tables, emit a "make-abbrev-table" initialization with the appropriate property lists. Check abbrev tables and their parents for instances of sharing. Reject any abbrev tables that are not empty. 
diff --git a/lisp/loadup.el b/lisp/loadup.el index cc9ed7be1a..48a1208ed7 100644 --- a/lisp/loadup.el +++ b/lisp/loadup.el @@ -484,6 +484,10 @@ (coding-systems '()) (coding-system-aliases '()) (charsets '()) (charset-aliases '()) (unified-charsets '()) + (abbrev-tables (make-hash-table :test 'eq)) + (abbrev-assign-cmds '()) + (abbrev-make-cmds '()) + (abbrev-counter 0) (cmds '())) (setcdr global-buffers-menu-map nil) ;; Get rid of buffer objects! (push `(internal--set-standard-syntax-table @@ -539,6 +543,42 @@ '(let ((ol (make-overlay (point-min) (point-min)))) (delete-overlay ol) ol)) + ;; abbrev-table-p isn't very defensive + ((condition-case nil + (abbrev-table-p v) + (error nil)) + (cl-labels ((replace-abbrevs-for-dump + (table) + (or (abbrev-table-empty-p table) + (error "Non-empty abbrev tables not handled")) + (let ((newval (gethash table abbrev-tables))) + (if newval + `(aref scratch-abbrev-tables ,newval) + (let* ((props (symbol-plist (obarray-get table "")))) + (cond ((plist-get props :parents) + (setq props (copy-sequence props)) + (plist-put props + :parents + (mapcar (lambda (value) + (replace-abbrevs-for-dump value)) + (plist-get props :parents)))) + ((eq (length props) 2) + ;; Only :abbrev-table-modiff, which gets added at creation anyway. + (setq props nil))) + (push `(aset scratch-abbrev-tables + ,abbrev-counter + (make-abbrev-table ',props)) + abbrev-make-cmds) + (puthash table abbrev-counter abbrev-tables) + (prog1 + `(aref scratch-abbrev-tables ,abbrev-counter) + (setq abbrev-counter (1+ abbrev-counter)))))))) + (push `(set-default ',s + ,(replace-abbrevs-for-dump v)) + abbrev-assign-cmds)) + ;; Placeholder to be used before we know + ;; we've defined make-abbrev-table. + 0) (v (macroexp-quote v)))) cmds) ;; Local variables: make-variable-buffer-local, @@ -591,6 +631,10 @@ (print '(get-buffer-create "*Messages*")) (print `(progn . ,cmds)) (terpri) + ;; Now that make-abbrev-table is defined, use it. 
+ (print `(let ((scratch-abbrev-tables (make-vector ,abbrev-counter 0))) + ,@(nreverse abbrev-make-cmds) + ,@abbrev-assign-cmds)) (print `(let ((css ',charsets)) (dotimes (i 3) (dolist (cs (prog1 css (setq css nil))) ^ permalink raw reply related [flat|nested] 375+ messages in thread
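The idea in the commit message can be sketched in a much reduced form. This is not the patch itself, only an illustration under stated assumptions: the names my/abbrev-ref, my/abbrev-dump-form and the generated variable tables are hypothetical, only empty tables are handled, and the only property carried over is :parents. Each distinct table gets one slot in a scratch vector, so tables that were `eq' before dumping are `eq' again after the generated form runs. The sketch uses a plain defun for the recursion and builds the :parents value with `list', so the parent references are evaluated when the form runs rather than stored as quoted data.

```elisp
(require 'abbrev)

(defun my/abbrev-ref (table seen forms)
  "Return a form referring to the recreated copy of TABLE.
SEEN maps tables already handled to their slot index.  FORMS is a
cons cell whose cdr collects construction forms, newest first.
Hypothetical helper, for illustration only."
  (or (abbrev-table-empty-p table)
      (error "Non-empty abbrev tables not handled"))
  (let ((idx (gethash table seen)))
    (unless idx
      ;; Handle parents first, so their slots are filled before use.
      (let ((parents (mapcar (lambda (p) (my/abbrev-ref p seen forms))
                             (abbrev-table-get table :parents))))
        (setq idx (hash-table-count seen))
        (puthash table idx seen)
        (push `(aset tables ,idx
                     (make-abbrev-table
                      ,(and parents `(list :parents (list ,@parents)))))
              (cdr forms))))
    `(aref tables ,idx)))

(defun my/abbrev-dump-form (bindings)
  "Return a form that recreates the tables in BINDINGS.
BINDINGS is an alist of (SYMBOL . ABBREV-TABLE)."
  (let* ((seen (make-hash-table :test 'eq))
         (forms (list nil))
         (assigns (mapcar (lambda (b)
                            `(set ',(car b)
                                  ,(my/abbrev-ref (cdr b) seen forms)))
                          bindings)))
    `(let ((tables (make-vector ,(hash-table-count seen) nil)))
       ,@(reverse (cdr forms))
       ,@assigns)))
```

Evaluating the returned form in a fresh session should set each symbol to a new empty table, with shared tables and parent links pointing at the same recreated objects.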
* Re: Skipping unexec via a big .elc file 2017-03-13 8:25 ` Ken Raeburn @ 2017-03-26 16:44 ` Eli Zaretskii 2017-03-28 2:27 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-03-26 16:44 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Mon, 13 Mar 2017 04:25:19 -0400 > Cc: Emacs developers <emacs-devel@gnu.org> > > I have a patch which seems to recreate all the abbrev tables that were in the initial Emacs process, including the sharing (lisp-mode-abbrev-table being a parent of emacs-lisp-mode-abbrev-table; local-abbrev-table set to the fundamental mode table). Please let me know if it fixes your problem. Sorry for the long delay: Life™ intervened big time... > It depends on a couple key things: (1) The abbrev tables are empty at that point, so we don’t have to worry about reconstructing all the abbrevs in a table; this can be fixed, it’s just tedious. (2) Abbrev-table values are only used as symbol values, or parents of other abbrev tables. This is much harder. Stefan’s printable replacement for obarrays would probably be a better solution. Though, normally I’d expect people to want printing of an obarray to show symbol names, and for this use case we need the function, value, and plist data as well. I applied your patch, and while dumping I get an error message: Dumping into dumped.elc...preparing... Dumping into dumped.elc...generating... Symbol's function definition is void: cl-labels and dumped.elc is not re-created. What did I miss? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-26 16:44 ` Eli Zaretskii @ 2017-03-28 2:27 ` Ken Raeburn 2017-03-31 6:57 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-03-28 2:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Mar 26, 2017, at 12:44, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Ken Raeburn <raeburn@raeburn.org> >> Date: Mon, 13 Mar 2017 04:25:19 -0400 >> Cc: Emacs developers <emacs-devel@gnu.org> >> >> I have a patch which seems to recreate all the abbrev tables that were in the initial Emacs process, including the sharing (lisp-mode-abbrev-table being a parent of emacs-lisp-mode-abbrev-table; local-abbrev-table set to the fundamental mode table). Please let me know if it fixes your problem. > > Sorry for the long delay: Life™ intervened big time… It happens. No worries. > >> It depends on a couple key things: (1) The abbrev tables are empty at that point, so we don’t have to worry about reconstructing all the abbrevs in a table; this can be fixed, it’s just tedious. (2) Abbrev-table values are only used as symbol values, or parents of other abbrev tables. This is much harder. Stefan’s printable replacement for obarrays would probably be a better solution. Though, normally I’d expect people to want printing of an obarray to show symbol names, and for this use case we need the function, value, and plist data as well. > > I applied your patch, and while dumping I get an error message: > > Dumping into dumped.elc...preparing... > Dumping into dumped.elc...generating... > Symbol's function definition is void: cl-labels > > and dumped.elc is not re-created. What did I miss? Looks like I missed a “require” or “load” to pull in cl-macs. Perhaps it’s loaded by something else in my build that’s platform-dependent (X11 vs Windows?) and isn’t in yours; I’m not sure. But it isn’t working for me to just load it explicitly without fixing up the load path too. 
Perhaps I should’ve just defined a helper function instead of using cl-labels. For now, try adding this patch. It bootstraps for me, and should get cl-labels defined. diff --git a/lisp/loadup.el b/lisp/loadup.el index 4ef9712ab6..f9251020cd 100644 --- a/lisp/loadup.el +++ b/lisp/loadup.el @@ -57,6 +57,17 @@ ;; Add subdirectories to the load-path for files that might get ;; autoloaded when bootstrapping. ;; This is because PATH_DUMPLOADSEARCH is just "../lisp". +(let ((dir (car load-path))) + (message "load path is %S" load-path) + (setq load-path (list (expand-file-name "." dir) + (expand-file-name "emacs-lisp" dir) + (expand-file-name "language" dir) + (expand-file-name "international" dir) + (expand-file-name "textmodes" dir) + (expand-file-name "vc" dir)))) + +(setq purify-flag nil) + (if (or (equal (member "bootstrap" command-line-args) '("bootstrap")) ;; FIXME this is irritatingly fragile. (equal (nth 4 command-line-args) "unidata-gen.el") @@ -64,19 +75,10 @@ (if (fboundp 'dump-emacs) (string-match "src/bootstrap-emacs" (nth 0 command-line-args)) t)) - (let ((dir (car load-path))) - ;; We'll probably overflow the pure space. - (setq purify-flag nil) - ;; Value of max-lisp-eval-depth when compiling initially. - ;; During bootstrapping the byte-compiler is run interpreted when - ;; compiling itself, which uses a lot more stack than usual. - (setq max-lisp-eval-depth 2200) - (setq load-path (list (expand-file-name "." dir) - (expand-file-name "emacs-lisp" dir) - (expand-file-name "language" dir) - (expand-file-name "international" dir) - (expand-file-name "textmodes" dir) - (expand-file-name "vc" dir))))) + ;; Value of max-lisp-eval-depth when compiling initially. + ;; During bootstrapping the byte-compiler is run interpreted when + ;; compiling itself, which uses a lot more stack than usual. + (setq max-lisp-eval-depth 2200)) (if (eq t purify-flag) ;; Hash consing saved around 11% of pure space in my tests. 
@@ -308,6 +310,8 @@ ;; Preload some constants and floating point functions. (load "emacs-lisp/float-sup") +(load "emacs-lisp/cl-macs") + (load "vc/vc-hooks") (load "vc/ediff-hook") (load "uniquify") ^ permalink raw reply related [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-28 2:27 ` Ken Raeburn @ 2017-03-31 6:57 ` Eli Zaretskii 2017-03-31 8:40 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-03-31 6:57 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Mon, 27 Mar 2017 22:27:26 -0400 > Cc: emacs-devel@gnu.org > > > I applied your patch, and while dumping I get an error message: > > > > Dumping into dumped.elc...preparing... > > Dumping into dumped.elc...generating... > > Symbol's function definition is void: cl-labels > > > > and dumped.elc is not re-created. What did I miss? > > Looks like I missed a “require” or “load” to pull in cl-macs. Perhaps it’s loaded by something else in my build that’s platform-dependent (X11 vs Windows?) and isn’t in yours; I’m not sure. But it isn’t working for me to just load it explicitly without fixing up the load path too. Perhaps I should’ve just defined a helper function instead of using cl-labels. > > For now, try adding this patch. It bootstraps for me, and should get cl-labels defined. This fixes the problem, and Emacs now starts OK, so the abbrevs issue is also solved. I think you should push all the changes you asked me to apply as patches. What is the roadmap ahead? Are there any known issues left, before we can consider this to be a candidate for merging to master, and asking people to test it in their routine workflows before we actually merge? Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-31 6:57 ` Eli Zaretskii @ 2017-03-31 8:40 ` Ken Raeburn 2017-04-03 16:15 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-03-31 8:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote: > > This fixes the problem, and Emacs now starts OK, so the abbrevs issue > is also solved. Great! > I think you should push all the changes you asked me to apply as > patches. Will do, probably this weekend. > What is the roadmap ahead? Are there any known issues left, before we > can consider this be a candidate for merging to master, and asking > people to test it in their routine workflows before we actually merge? > > Thanks. There are a number of issues on my list. Some can be dealt with while people are experimenting, or even after merging. Others may affect usability, like the current inability to report clearly and consistently if dumped.elc can’t be found. I haven’t even tested doing “make install”; I always run Emacs from the build tree. There are a few other minor bugs, like a few unprintable definitions not getting dumped, that it’d be nice to address; I’ll go back over my list and take a look. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-03-31 8:40 ` Ken Raeburn @ 2017-04-03 16:15 ` Ken Raeburn 2017-04-03 16:57 ` Alan Mackenzie 2017-04-10 16:19 ` Ken Raeburn 0 siblings, 2 replies; 375+ messages in thread From: Ken Raeburn @ 2017-04-03 16:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Mar 31, 2017, at 04:40, Ken Raeburn <raeburn@raeburn.org> wrote: > > On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote: >> >> This fixes the problem, and Emacs now starts OK, so the abbrevs issue >> is also solved. > > Great! > >> I think you should push all the changes you asked me to apply as >> patches. > > Will do, probably this weekend. Looks like the abbrev change isn’t actually working right… I got the quoting wrong, so the abbrev tables are constructed as (mostly) proper abbrev tables, and in the right order, but the “:parent” properties are bad. Working on fixing it up…. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-03 16:15 ` Ken Raeburn @ 2017-04-03 16:57 ` Alan Mackenzie 2017-04-03 18:35 ` Ken Raeburn 2017-04-10 16:19 ` Ken Raeburn 1 sibling, 1 reply; 375+ messages in thread From: Alan Mackenzie @ 2017-04-03 16:57 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel Hello, Ken. On Mon, Apr 03, 2017 at 12:15:29 -0400, Ken Raeburn wrote: > On Mar 31, 2017, at 04:40, Ken Raeburn <raeburn@raeburn.org> wrote: > > On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote: > >> This fixes the problem, and Emacs now starts OK, so the abbrevs issue > >> is also solved. > > Great! > >> I think you should push all the changes you asked me to apply as > >> patches. > > Will do, probably this weekend. > Looks like the abbrev change isn’t actually working right… I got the > quoting wrong, so the abbrev tables are constructed as (mostly) proper > abbrev tables, and in the right order, but the “:parent” properties are > bad. Working on fixing it up…. I, for one, am feeling enthusiastic about this new way of building Emacs, and am looking forward to trying it out in the near future. Thanks for a great job almost finished! -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-03 16:57 ` Alan Mackenzie @ 2017-04-03 18:35 ` Ken Raeburn 2017-04-03 19:14 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-04-03 18:35 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel On Apr 3, 2017, at 12:57, Alan Mackenzie <acm@muc.de> wrote: > Hello, Ken. > > I, for one, am feeling enthusiastic about this new way of building Emacs, > and am looking forward to trying it out in the near future. > > Thanks for a great job almost finished! Just making sure credit goes where it’s due: Stefan did great work on the key piece, processing the Lisp environment for dumping as Lisp. I’ve tried to improve the Lisp reader performance a bit, and fix up a couple minor bugs here and there, and maybe put a little polish on it. Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough. The .elc files are still (essentially) Lisp, and parsing text is not the most efficient way to load a bunch of object definitions. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
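For anyone who wants a rough feel for the cost of parsing Lisp text, here is a small sketch using the standard benchmark library. It is only a toy under stated assumptions: the helper name my/time-read-back and the dummy defconst forms are made up for illustration, and reading a synthetic list says little about loading the real dumped.elc.

```elisp
(require 'benchmark)

(defun my/time-read-back (n)
  "Return the seconds needed to `read' a printed list of N small forms.
Hypothetical helper, for illustration only."
  (let ((text (prin1-to-string
               (mapcar (lambda (i) (list 'defconst 'my/dummy i))
                       (number-sequence 1 n)))))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1 (read-from-string text)))))

(my/time-read-back 100000)
```

The number depends heavily on the machine and the Emacs build, so it is best used for comparing two builds on the same system.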
* Re: Skipping unexec via a big .elc file 2017-04-03 18:35 ` Ken Raeburn @ 2017-04-03 19:14 ` Eli Zaretskii 2017-04-04 8:08 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-04-03 19:14 UTC (permalink / raw) To: Ken Raeburn; +Cc: acm, emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Mon, 3 Apr 2017 14:35:16 -0400 > Cc: emacs-devel@gnu.org > > Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough. I published my preliminary timings in these 2 messages: http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00923.html http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00959.html ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-03 19:14 ` Eli Zaretskii @ 2017-04-04 8:08 ` Ken Raeburn 2017-04-04 9:51 ` Robert Pluim ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Ken Raeburn @ 2017-04-04 8:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, emacs-devel On Apr 3, 2017, at 15:14, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Ken Raeburn <raeburn@raeburn.org> >> Date: Mon, 3 Apr 2017 14:35:16 -0400 >> Cc: emacs-devel@gnu.org >> >> Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough. > > I published my preliminary timings in these 2 messages: > > http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00923.html > http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00959.html Yes, I got some speedups, but I didn’t get it as fast as I was hoping. Some of my changes since your second message above might’ve improved the numbers a little, but some (like loading the doc pointers at startup, and I think “uniquify” is going to need to be loaded at startup too because it attaches advice to “rename-buffer” which we can’t save properly) may slow it a little too. I was aiming for a startup time under a tenth of a second, and didn’t get there, though there were a couple of additional things that could be tried, with some effort. I’m not sure how a startup time of nearly a fifth of a second will feel for people. If they start Emacs once as part of logging in, it probably won’t be an issue. If they start it every time they want to edit a file, it may be annoying to have the startup time increased by even 0.15s. Still, I suppose we can let people try it out, and find out what they think. Then we can decide if it’s good enough, if further speedup measures are worth exploring, or if it’s a dead end. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 8:08 ` Ken Raeburn @ 2017-04-04 9:51 ` Robert Pluim 2017-04-04 10:27 ` joakim 2017-04-07 5:46 ` Lars Brinkhoff 2 siblings, 0 replies; 375+ messages in thread From: Robert Pluim @ 2017-04-04 9:51 UTC (permalink / raw) To: Ken Raeburn; +Cc: acm, Eli Zaretskii, emacs-devel Ken Raeburn <raeburn@raeburn.org> writes: > > I was aiming for a startup time under a tenth of a second, and didn’t > get there, though there were a couple of additional things that could > be tried, with some effort. I’m not sure a startup time of nearly a > fifth of a second will feel for people. If they start Emacs once as > part of logging in, it probably won’t be an issue. If they start it > every time they want to edit a file, it may be annoying to have the > startup time increased by even 0.15s. I have an emacs I normally keep running, and occasionally start one for a quick editing task. 0.15s is completely lost in the noise for me. I've tried the branch, as far as I'm concerned its speed is fine. Regards Robert ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 8:08 ` Ken Raeburn 2017-04-04 9:51 ` Robert Pluim @ 2017-04-04 10:27 ` joakim 2017-04-04 12:14 ` Clément Pit-Claudel 2017-04-07 5:46 ` Lars Brinkhoff 2 siblings, 1 reply; 375+ messages in thread From: joakim @ 2017-04-04 10:27 UTC (permalink / raw) To: Ken Raeburn; +Cc: acm, Eli Zaretskii, emacs-devel Ken Raeburn <raeburn@raeburn.org> writes: > On Apr 3, 2017, at 15:14, Eli Zaretskii <eliz@gnu.org> wrote: > >>> From: Ken Raeburn <raeburn@raeburn.org> >>> Date: Mon, 3 Apr 2017 14:35:16 -0400 >>> Cc: emacs-devel@gnu.org >>> >>> Despite a few little speed-ups, I’ve got my doubts as to whether it’s going to be fast enough. >> >> I published my preliminary timings in these 2 messages: >> >> http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00923.html >> http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00959.html > > Yes, I got some speedups, but I didn’t get it as fast as I was hoping. Some of my changes since your second message above might’ve improved the numbers a little, but some (like loading the doc pointers at startup, and I think “uniquify” is going to need to be loaded at startup too because it attaches advice to “rename-buffer” which we can’t save properly) may slow it a little too. > > I was aiming for a startup time under a tenth of a second, and didn’t get there, though there were a couple of additional things that could be tried, with some effort. I’m not sure a startup time of nearly a fifth of a second will feel for people. If they start Emacs once as part of logging in, it probably won’t be an issue. If they start it every time they want to edit a file, it may be annoying to have the startup time increased by even 0.15s. In my case I mostly use long-running sessions, so slow emacs startup isn't so bad for me. Most of the boot time seems to happen in 3rd party libs anyway.
But on the other hand I think there is a valid use case for using Emacs for things like batch processing, web servers and such. And in those cases startup time matters. Again otoh, you might want to use emacsclient together with a long running emacs in those cases. But I'm not really using emacs for that sort of thing so one should listen to the people actually doing it primarily. > > Still, I suppose we can let people try it out, and find out what they think. Then we can decide if it’s good enough, if further speedup measures are worth exploring, or if it’s a dead end. > > Ken -- Joakim Verona joakim@verona.se ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 10:27 ` joakim @ 2017-04-04 12:14 ` Clément Pit-Claudel 2017-04-04 14:38 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Clément Pit-Claudel @ 2017-04-04 12:14 UTC (permalink / raw) To: emacs-devel On 2017-04-04 06:27, joakim@verona.se wrote: > > But on the other hand I think there is a valid use case for using Emacs > for things like batch processing, web servers and such. And in those > cases startup time matters. Again otoh, you might want to use > emacsclient together with a long running emacs in those cases. But I'm > not really using emacs for that sort of thing so one should listen to > the people actually doing it primarily. I use Emacs in batch mode a lot; using a server is tricky, because anything you load or change on one run persists across future executions (for example one execution might load a file, then the next one might forget to load the file explicitly, but still work well because that file was previously loaded; when you run the program again in a fresh instance, things fail). Additionally, there are bugs in Emacsclient that make it tricky to use on its own, so my current code relies on two instances of Emacs: a long-lived one and a short-lived one. The short-lived one is used to connect to the long-running server. (This saves a lot because most of the execution time is otherwise spent loading packages). ^ permalink raw reply [flat|nested] 375+ messages in thread
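The state-leak problem described above can be made visible with a small sketch, assuming that the `features' list is a good enough proxy for "what got loaded". The helper name my/features-added-by is hypothetical. It runs a job and reports which features that job newly provided, so a long-lived server could log what each run left behind.

```elisp
(defun my/features-added-by (thunk)
  "Call THUNK and return the list of features it newly provided.
Hypothetical helper, for illustration only."
  (let ((before (copy-sequence features))
        (added '()))
    (funcall thunk)
    ;; Anything in `features' now that was not there before the call
    ;; stays loaded for every later job in the same Emacs process.
    (dolist (f features)
      (unless (memq f before)
        (push f added)))
    added))

(my/features-added-by (lambda () (require 'subr-x)))
```

This only catches libraries that call `provide'. Variables or advice changed by a job would need their own checks.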
* Re: Skipping unexec via a big .elc file 2017-04-04 12:14 ` Clément Pit-Claudel @ 2017-04-04 14:38 ` Eli Zaretskii 2017-04-04 15:16 ` Clément Pit-Claudel 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-04-04 14:38 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Tue, 4 Apr 2017 08:14:48 -0400 > > Additionally, there are bugs in Emacsclient that make it tricky to use on its own Why aren't these bugs being fixed? Are there bug reports with the details? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 14:38 ` Eli Zaretskii @ 2017-04-04 15:16 ` Clément Pit-Claudel 2017-04-04 15:53 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Clément Pit-Claudel @ 2017-04-04 15:16 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 2017-04-04 10:38, Eli Zaretskii wrote: >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> >> Date: Tue, 4 Apr 2017 08:14:48 -0400 >> >> Additionally, there are bugs in Emacsclient that make it tricky to use on its own > > Why aren't these bugs being fixed? Are there bug reports with the > details? I can think of 3: "Is emacsclient --eval broken?" from emacs-devel on 2016-08-03, which got fixed almost instantly (thanks Johan Bockgård!). #24616, which now has documentation and so is arguably fixed (but errors still pop up on the server, not the client, and so the server has to capture backtraces and send them back explicitly). "How can I rethrow an error after recording a backtrace?" from emacs-devel on 2016-08-04, which is due to emacsclient not incrementing num_nonmacro_input_events. One of them is fixed, the second seems to be mostly wontfix, and the third is open. But all three are relevant when trying to remain compatible with older Emacsen. Cheers, Clément. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 15:16 ` Clément Pit-Claudel @ 2017-04-04 15:53 ` Eli Zaretskii 2017-04-04 18:22 ` Clément Pit-Claudel 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-04-04 15:53 UTC (permalink / raw) To: Clément Pit-Claudel; +Cc: emacs-devel > From: Clément Pit-Claudel <cpitclaudel@gmail.com> > Date: Tue, 4 Apr 2017 11:16:35 -0400 > Cc: emacs-devel@gnu.org > > One of them is fixed, the second seems seems to be mostly wontfix, and the third is open. But all three are relevant when trying to remain compatible with older Emacsen. So there's only one bug that remains, is that right? As for older versions, you could fix them by retrofitting the patches into them, right? And in any case, they seem to be unrelated to the issue at hand, which will only affect the next release and those after it. Right? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 15:53 ` Eli Zaretskii @ 2017-04-04 18:22 ` Clément Pit-Claudel 0 siblings, 0 replies; 375+ messages in thread From: Clément Pit-Claudel @ 2017-04-04 18:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On 2017-04-04 11:53, Eli Zaretskii wrote: >> From: Clément Pit-Claudel <cpitclaudel@gmail.com> Date: Tue, 4 Apr >> 2017 11:16:35 -0400 Cc: emacs-devel@gnu.org >> >> One of them is fixed, the second seems to be mostly wontfix, >> and the third is open. But all three are relevant when trying to >> remain compatible with older Emacsen. > > So there's only one bug that remains, is that right? > > As for older versions, you could fix them by retrofitting the > patches into them, right? Correct. When the patches are in Lisp, yes, to some extent. In C, no, of course. > And in any case, they seem to be unrelated to the issue at hand, > which will only affect the next release and those after it. Right? Correct. I sought to point out that, since old versions prevent some authors from using emacsclient, emacsclient being fast does not mitigate much of the cost of Emacs itself starting slowly. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-04 8:08 ` Ken Raeburn 2017-04-04 9:51 ` Robert Pluim 2017-04-04 10:27 ` joakim @ 2017-04-07 5:46 ` Lars Brinkhoff 2017-04-07 7:28 ` Eli Zaretskii 2 siblings, 1 reply; 375+ messages in thread From: Lars Brinkhoff @ 2017-04-07 5:46 UTC (permalink / raw) To: emacs-devel Ken Raeburn wrote: > I was aiming for a startup time under a tenth of a second, and didn’t > get there, though there were a couple of additional things that could > be tried, with some effort. I’m not sure how a startup time of nearly a > fifth of a second will feel for people. Every invocation of async-start launches a new emacs subprocess, doesn't it? So startup time would also affect uses of async.el. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 5:46 ` Lars Brinkhoff @ 2017-04-07 7:28 ` Eli Zaretskii 2017-04-07 9:02 ` Ken Raeburn 2017-04-07 13:23 ` Skipping unexec via a big .elc file Stefan Monnier 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2017-04-07 7:28 UTC (permalink / raw) To: Lars Brinkhoff, Ken Raeburn; +Cc: emacs-devel > From: Lars Brinkhoff <lars@nocrew.org> > Date: Fri, 07 Apr 2017 07:46:12 +0200 > > Ken Raeburn wrote: > > I was aiming for a startup time under a tenth of a second, and didn’t > > get there, though there were a couple of additional things that could > > be tried, with some effort. I’m not sure a startup time of nearly a > > fifth of a second will feel for people. > > Every invokation of async-start launches a new emacs subprocess, doesn't > it? So startup time would also affect uses of async.el. Perhaps we could have a separate, much smaller dumped.elc for batch invocations, to cater to these use cases. Ken, does this make sense? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 7:28 ` Eli Zaretskii @ 2017-04-07 9:02 ` Ken Raeburn 2017-04-07 13:40 ` Eli Zaretskii 2017-04-07 13:23 ` Skipping unexec via a big .elc file Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-04-07 9:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Lars Brinkhoff, emacs-devel On Apr 7, 2017, at 03:28, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Lars Brinkhoff <lars@nocrew.org> >> Date: Fri, 07 Apr 2017 07:46:12 +0200 >> >> Ken Raeburn wrote: >>> I was aiming for a startup time under a tenth of a second, and didn’t >>> get there, though there were a couple of additional things that could >>> be tried, with some effort. I’m not sure how a startup time of nearly a >>> fifth of a second will feel for people. >> >> Every invocation of async-start launches a new emacs subprocess, doesn't >> it? So startup time would also affect uses of async.el. > > Perhaps we could have a separate, much smaller dumped.elc for batch > invocations, to cater to these use cases. Ken, does this make sense? We could do it, sure. For example, stuff relating to window systems probably isn’t of much use in batch mode. One question is, do we change such things to use autoload in case the user’s init file references their functions, or do we require that the user know to use “load” or “require”? Maybe we can find files currently loaded that we could change over to autoloading in all cases, improving the interactive startup time too? Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 9:02 ` Ken Raeburn @ 2017-04-07 13:40 ` Eli Zaretskii 2017-04-07 16:02 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-04-07 13:40 UTC (permalink / raw) To: Ken Raeburn; +Cc: lars, emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Fri, 7 Apr 2017 05:02:30 -0400 > Cc: Lars Brinkhoff <lars@nocrew.org>, > emacs-devel@gnu.org > > > Perhaps we could have a separate, much smaller dumped.elc for batch > > invocations, to cater to these use cases. Ken, does this make sense? > > We could do it, sure. For example, stuff relating to window systems probably isn’t of much use in batch mode. One question is, do we change such things to use autoload in case the user’s init file references their functions, or do we require that the user know to use “load” or “require”? I don't think I understand the question: -batch implies -Q, so the user's init file is not relevant. > Maybe we can find file currently loaded that we could change over to autoloading in all cases, improving the interactive startup time too? Yes, making dumped.elc smaller by using autoload is another way of slashing some load time. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 13:40 ` Eli Zaretskii @ 2017-04-07 16:02 ` Ken Raeburn 2017-04-07 16:17 ` Clément Pit-Claudel 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-04-07 16:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: lars, emacs-devel On Apr 7, 2017, at 09:40, Eli Zaretskii <eliz@gnu.org> wrote: >> From: Ken Raeburn <raeburn@raeburn.org> >> Date: Fri, 7 Apr 2017 05:02:30 -0400 >> Cc: Lars Brinkhoff <lars@nocrew.org>, >> emacs-devel@gnu.org >> >>> Perhaps we could have a separate, much smaller dumped.elc for batch >>> invocations, to cater to these use cases. Ken, does this make sense? >> >> We could do it, sure. For example, stuff relating to window systems probably isn’t of much use in batch mode. One question is, do we change such things to use autoload in case the user’s init file references their functions, or do we require that the user know to use “load” or “require”? > > I don't think I understand the question: -batch implies -Q, so the > user's init file is not relevant. Sorry, was too late at night I guess. For batch mode, it’s code loaded via -l options that might have to add new explicit dependencies, if we don’t add autoloads for everything. I would expect autoloads would be the direction we’d want to go…. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 16:02 ` Ken Raeburn @ 2017-04-07 16:17 ` Clément Pit-Claudel 2017-04-08 15:03 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Clément Pit-Claudel @ 2017-04-07 16:17 UTC (permalink / raw) To: emacs-devel On 2017-04-07 12:02, Ken Raeburn wrote: > Sorry, was too late at night I guess. For batch mode, it’s code > loaded via -l options that might have to add new explicit > dependencies, if we don’t add autoloads for everything. I would > expect autoloads would be the direction we’d want to go…. Removing some preloaded packages in favor of autoloads is probably a good idea. We should be careful, though: as previously discussed, essentially everything in packages that were previously preloaded needs to be autoloaded now, since many packages don't (require) the preloaded features that they use. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 16:17 ` Clément Pit-Claudel @ 2017-04-08 15:03 ` Philipp Stephani 2017-04-08 15:15 ` Clément Pit-Claudel 0 siblings, 1 reply; 375+ messages in thread From: Philipp Stephani @ 2017-04-08 15:03 UTC (permalink / raw) To: Clément Pit-Claudel, emacs-devel Clément Pit-Claudel <cpitclaudel@gmail.com> wrote on Fri, 7 Apr 2017 at 18:18: > On 2017-04-07 12:02, Ken Raeburn wrote: > > Sorry, was too late at night I guess. For batch mode, it’s code > > loaded via -l options that might have to add new explicit > > dependencies, if we don’t add autoloads for everything. I would > > expect autoloads would be the direction we’d want to go…. > > Removing some preloaded packages in favor of autoloads is probably a good > idea. We should be careful, though: as previously discussed, essentially > everything in packages that were previously preloaded needs to be > autoloaded now, since many packages don't (require) the preloaded features > that they use. > > > Doesn't that effectively just move most of the code to loaddefs.el, from which it again has to be either preloaded or byte-compiled into the "big .elc file"? Does this really bring measurable benefits nowadays? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-08 15:03 ` Philipp Stephani @ 2017-04-08 15:15 ` Clément Pit-Claudel 2017-04-08 15:53 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Clément Pit-Claudel @ 2017-04-08 15:15 UTC (permalink / raw) To: Philipp Stephani, emacs-devel On 2017-04-08 11:03, Philipp Stephani wrote: > Clément Pit-Claudel <cpitclaudel@gmail.com> wrote: >> … essentially everything in packages that were previously >> preloaded needs to be autoloaded now, since many packages don't >> (require) the preloaded features that they use. > Doesn't that effectively just move most of the code to loaddefs.el, > from which it again has to be either preloaded or byte-compiled into > the "big .elc file"? Does this really bring measurable benefits > nowadays? (Sorry if I'm misunderstanding you) I think the idea is that you can defer loading the implementation of a significant fraction of currently-preloaded functions, because many of these are currently unused. So the intended saving is that currently-preloaded but uncommonly-used functions would not be dumped to the big-elc (their signatures, in the form of an autoload, would be). Packages that use them without (require)-ing the corresponding feature first would still work, but startup would be faster. (I hope I got this right) ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-08 15:15 ` Clément Pit-Claudel @ 2017-04-08 15:53 ` Philipp Stephani 2017-04-08 16:18 ` Eli Zaretskii 2017-04-08 17:58 ` Clément Pit-Claudel 0 siblings, 2 replies; 375+ messages in thread From: Philipp Stephani @ 2017-04-08 15:53 UTC (permalink / raw) To: Clément Pit-Claudel, emacs-devel Clément Pit-Claudel <cpitclaudel@gmail.com> wrote on Sat, 8 Apr 2017 at 17:15: > On 2017-04-08 11:03, Philipp Stephani wrote: > > Clément Pit-Claudel <cpitclaudel@gmail.com> wrote: > >> … essentially everything in packages that were previously > >> preloaded needs to be autoloaded now, since many packages don't > >> (require) the preloaded features that they use. > > > Doesn't that effectively just move most of the code to loaddefs.el, > > from which it again has to be either preloaded or byte-compiled into > > the "big .elc file"? Does this really bring measurable benefits > > nowadays? > > (Sorry if I'm misunderstanding you) > > I think the idea is that you can defer loading the implementation of a > significant fraction of currently-preloaded functions, because many of > these are currently unused. > > So the intended saving is that currently-preloaded but uncommonly-used > functions would not be dumped to the big-elc (their signatures, in the form > of an autoload, would be). Packages that use them without (require)-ing the > corresponding feature first would still work, but startup would be faster. > The question is whether there is actually a significant speed-up. Autoloading is traditionally used for a small number of interactive commands that cause large optional libraries to be loaded. In such cases I could imagine that the performance gain is still significant. However, you now suggest that preloaded libraries get turned into autoloads. 
The structure of those libraries is typically quite different: they consist to a large extent of individual helper functions that are independent of each other. My guess is that this could make overall performance worse: it will cause loaddefs.el to contain all the signatures and docstrings of these helper functions, and loaddefs.el is itself not byte-compiled. Therefore, you now need to load the definitions effectively twice: once in loaddefs.el, once the functions are actually used. Therefore such a change shouldn't be made without measuring its impact. I'd actually prefer going in the other direction: preload much more than now, and remove lots of stuff from autoloads. This will probably need a different strategy for preloading (Daniel's approach, or Rmacs, or an Elisp LLVM compiler, ...). ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-08 15:53 ` Philipp Stephani @ 2017-04-08 16:18 ` Eli Zaretskii 2017-04-08 18:01 ` Stefan Monnier 2017-04-08 17:58 ` Clément Pit-Claudel 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-04-08 16:18 UTC (permalink / raw) To: Philipp Stephani; +Cc: cpitclaudel, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Sat, 08 Apr 2017 15:53:49 +0000 > > The question is whether there is actually a significant speed-up. > Autoloading is traditionally used for a small number of interactive commands that cause large optional libraries > to be loaded. In such cases I could imagine that the performance gain is still significant. However, you now > suggest that preloaded libraries get turned into autoloads. The structure of those libraries is typically quite > different: the consist to a large extent of individual helper functions that are independent of each other. My > guess is that this could make overall performance worse: it will cause loaddefs.el to contain all the signatures > and docstrings of these helper functions, and loaddefs.el is itself not byte-compiled. Therefore, you now need > to load the definitions effectively twice: once in loaddefs.el, once the functions are actually used. Therefore > such a change shouldn't be made without measuring its impact. This issue will not be resolved by guessing, but by measurements. So if you are interested and can produce a dumped.elc that only loads what's necessary in -batch session, and that dumped.elc does or doesn't load significantly faster than the full one, we will know who is right here. Thanks. > I'd actually prefer going into the other direction: preload much more than now, and remove lots of stuff from > autoloads. This will probably need a different strategy for preloading (Daniel's approach, or Rmacs, or an Elisp > LLVM compiler, ...). 
Given that load time is an issue, loading more stuff than strictly necessary seems to make very little sense to me. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-08 16:18 ` Eli Zaretskii @ 2017-04-08 18:01 ` Stefan Monnier 2017-05-01 11:41 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2017-04-08 18:01 UTC (permalink / raw) To: emacs-devel >> I'd actually prefer going into the other direction: preload much more than >> now, and remove lots of stuff from >> autoloads. This will probably need a different strategy for preloading >> (Daniel's approach, or Rmacs, or an Elisp >> LLVM compiler, ...). > Given that load time is an issue, loading more stuff than strictly > necessary seems to make very little sense to me. IIUC, using Daniel's approach, it should be possible to preload using mmap in a time that's largely independent from the size of the preloaded file (or at least, with a small constant). Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
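[Editorial sketch, not part of the thread.] Stefan's point about mmap can be illustrated roughly as follows: mapping a file read-only only sets up page tables, and pages are faulted in when first touched, so the cost of the call itself does not grow with the file size the way reading and evaluating a large .elc does. This is a minimal sketch assuming a POSIX system; map_image is a hypothetical helper name and is not taken from the portable dumper patch, which also has to handle relocations and other details not shown here.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from Emacs): map the file at PATH
   read-only and return its base address, or NULL on failure.
   On success the file length is stored in *LEN_OUT.  */
const unsigned char *
map_image (const char *path, size_t *len_out)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat (fd, &st) != 0 || st.st_size <= 0)
    {
      close (fd);
      return NULL;
    }

  /* No data is copied here; pages are loaded lazily on first access.  */
  void *base = mmap (NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE,
                     fd, 0);
  close (fd);   /* The mapping stays valid after the descriptor is closed.  */
  if (base == MAP_FAILED)
    return NULL;

  *len_out = (size_t) st.st_size;
  return base;
}
```

A caller would pass the path of the dumped image and later release the mapping with munmap (base, len).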
* Re: Skipping unexec via a big .elc file 2017-04-08 18:01 ` Stefan Monnier @ 2017-05-01 11:41 ` Philipp Stephani 0 siblings, 0 replies; 375+ messages in thread From: Philipp Stephani @ 2017-05-01 11:41 UTC (permalink / raw) To: Stefan Monnier, emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> wrote on Sat, 8 Apr 2017 at 20:06: > >> I'd actually prefer going into the other direction: preload much more > than > >> now, and remove lots of stuff from > >> autoloads. This will probably need a different strategy for preloading > >> (Daniel's approach, or Rmacs, or an Elisp > >> LLVM compiler, ...). > > Given that load time is an issue, loading more stuff than strictly > > necessary seems to make very little sense to me. > > IIUC, using Daniel's approach, it should be possible to preload using > mmap in a time that's largely independent from the size of the > preloaded file (or at least, with a small constant). > > Yes, that would be ideal. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-08 15:53 ` Philipp Stephani 2017-04-08 16:18 ` Eli Zaretskii @ 2017-04-08 17:58 ` Clément Pit-Claudel 2017-05-01 11:40 ` Philipp Stephani 1 sibling, 1 reply; 375+ messages in thread From: Clément Pit-Claudel @ 2017-04-08 17:58 UTC (permalink / raw) To: Philipp Stephani, emacs-devel On 2017-04-08 11:53, Philipp Stephani wrote: > However, you now suggest that preloaded libraries get turned into > autoloads. I didn't suggest this :) I just pointed out that there were difficulties with that approach. > Therefore such a change shouldn't be made without measuring its > impact. Yup, as always when trying to optimize things :) > I'd actually prefer going into the other direction: preload much more > than now, and remove lots of stuff from autoloads. This will probably > need a different strategy for preloading (Daniel's approach, or > Rmacs, or an Elisp LLVM compiler, ...). I don't have an opinion on this topic; I admire the work of both Daniel and Ken, but I'm happy to defer to you and other experts for technical opinions. Clément. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-08 17:58 ` Clément Pit-Claudel @ 2017-05-01 11:40 ` Philipp Stephani 2017-05-01 12:07 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Philipp Stephani @ 2017-05-01 11:40 UTC (permalink / raw) To: Clément Pit-Claudel, emacs-devel Clément Pit-Claudel <cpitclaudel@gmail.com> wrote on Sat, 8 Apr 2017 at 19:58: > > > I'd actually prefer going into the other direction: preload much more > > than now, and remove lots of stuff from autoloads. This will probably > > need a different strategy for preloading (Daniel's approach, or > > Rmacs, or an Elisp LLVM compiler, ...). > > I don't have an opinion on this topic; I admire the work of both Daniel > and Ken, but I'm happy to defer to you and other experts for technical > opinions. > > I'm absolutely not an expert on this. All I'm suggesting is that the impact of such changes should be measured, and that startup time in batch mode isn't everything. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-05-01 11:40 ` Philipp Stephani @ 2017-05-01 12:07 ` Eli Zaretskii 2017-05-18 17:39 ` Daniel Colascione 2017-05-21 8:44 ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2017-05-01 12:07 UTC (permalink / raw) To: Philipp Stephani; +Cc: cpitclaudel, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Mon, 01 May 2017 11:40:46 +0000 > > All I'm suggesting is that the impact of such changes should be > measured, and that startup time in batch mode isn't everything. Startup time in batch mode isn't everything, but if it's a frequent use case in which even relatively short delays are tangible, we should try to find a way of minimizing those delays. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-05-01 12:07 ` Eli Zaretskii @ 2017-05-18 17:39 ` Daniel Colascione 2017-05-18 19:45 ` Eli Zaretskii 2017-05-21 8:44 ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn 1 sibling, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2017-05-18 17:39 UTC (permalink / raw) To: Eli Zaretskii, Philipp Stephani; +Cc: cpitclaudel, emacs-devel On 05/01/2017 05:07 AM, Eli Zaretskii wrote: >> From: Philipp Stephani <p.stephani2@gmail.com> >> Date: Mon, 01 May 2017 11:40:46 +0000 >> >> All I'm suggesting is that the impact of such changes should be >> measured, and that startup time in batch mode isn't everything. > > Startup time in batch mode isn't everything, but if it's a frequent > use case in which even relatively short delays are tangible, we should > try to find a way of minimizing those delays. I'm in a position to rebase my portable dumper patch. The last few months have been, er, interesting. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-05-18 17:39 ` Daniel Colascione @ 2017-05-18 19:45 ` Eli Zaretskii 2018-12-25 15:46 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-05-18 19:45 UTC (permalink / raw) To: Daniel Colascione; +Cc: p.stephani2, cpitclaudel, emacs-devel > From: Daniel Colascione <dancol@dancol.org> > Date: Thu, 18 May 2017 10:39:34 -0700 > Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org > > I'm in a position to rebase my portable dumper patch. The last few > months have been, er, interesting. Please make a branch with the patch, so people could try it. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-05-18 19:45 ` Eli Zaretskii @ 2018-12-25 15:46 ` Philipp Stephani 2018-12-25 17:21 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Philipp Stephani @ 2018-12-25 15:46 UTC (permalink / raw) To: Eli Zaretskii Cc: Clément Pit-Claudel, Daniel Colascione, Emacs developers On Thu, 18 May 2017 at 21:45, Eli Zaretskii <eliz@gnu.org> wrote: > > > From: Daniel Colascione <dancol@dancol.org> > > Date: Thu, 18 May 2017 10:39:34 -0700 > > Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org > > > > I'm in a position to rebase my portable dumper patch. The last few > > months have been, er, interesting. > > Please make a branch with the patch, so people could try it. Hi, what's the state of the pdumper branch? Any chance we can merge it to master? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2018-12-25 15:46 ` Philipp Stephani @ 2018-12-25 17:21 ` Eli Zaretskii 2018-12-25 19:15 ` Daniel Colascione 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2018-12-25 17:21 UTC (permalink / raw) To: Philipp Stephani; +Cc: cpitclaudel, dancol, emacs-devel > From: Philipp Stephani <p.stephani2@gmail.com> > Date: Tue, 25 Dec 2018 16:46:23 +0100 > Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel <cpitclaudel@gmail.com>, > Emacs developers <emacs-devel@gnu.org> > > Hi, what's the state of the pdumper branch? Any chance we can merge it > to master? Last time this popped up: http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2018-12-25 17:21 ` Eli Zaretskii @ 2018-12-25 19:15 ` Daniel Colascione 2018-12-26 15:27 ` Eli Zaretskii 2019-01-07 21:37 ` Daniel Colascione 0 siblings, 2 replies; 375+ messages in thread From: Daniel Colascione @ 2018-12-25 19:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Philipp Stephani, dancol, cpitclaudel, emacs-devel >> From: Philipp Stephani <p.stephani2@gmail.com> >> Date: Tue, 25 Dec 2018 16:46:23 +0100 >> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel >> <cpitclaudel@gmail.com>, >> Emacs developers <emacs-devel@gnu.org> >> >> Hi, what's the state of the pdumper branch? Any chance we can merge it >> to master? > > Last time this popped up: > > http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html > Yeah, it's about time we finally get around to doing this. I have some time between now and the end of the year, and I'll rebase the work and land it, assuming no new major objections. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2018-12-25 19:15 ` Daniel Colascione @ 2018-12-26 15:27 ` Eli Zaretskii 2019-01-07 21:37 ` Daniel Colascione 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2018-12-26 15:27 UTC (permalink / raw) To: Daniel Colascione; +Cc: p.stephani2, dancol, cpitclaudel, emacs-devel > Date: Tue, 25 Dec 2018 11:15:57 -0800 > From: "Daniel Colascione" <dancol@dancol.org> > Cc: Philipp Stephani <p.stephani2@gmail.com>, dancol@dancol.org, > cpitclaudel@gmail.com, emacs-devel@gnu.org > > Yeah, it's about time we finally get around to doing this. I have some > time between now and the end of the year, and I'll rebase the work and > land it, assuming no new major objections. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2018-12-25 19:15 ` Daniel Colascione 2018-12-26 15:27 ` Eli Zaretskii @ 2019-01-07 21:37 ` Daniel Colascione 2019-01-15 22:46 ` Daniel Colascione 1 sibling, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2019-01-07 21:37 UTC (permalink / raw) To: Daniel Colascione Cc: Eli Zaretskii, dancol, cpitclaudel, Philipp Stephani, emacs-devel >>> From: Philipp Stephani <p.stephani2@gmail.com> >>> Date: Tue, 25 Dec 2018 16:46:23 +0100 >>> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel >>> <cpitclaudel@gmail.com>, >>> Emacs developers <emacs-devel@gnu.org> >>> >>> Hi, what's the state of the pdumper branch? Any chance we can merge it >>> to master? >> >> Last time this popped up: >> >> http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html >> > > Yeah, it's about time we finally get around to doing this. I have some > time between now and the end of the year, and I'll rebase the work and > land it, assuming no new major objections. I'm still working on this, FWIW. I'd hoped it'd be a simple rebase, but the vectorization of Lisp_Misc forced more changes than I'd thought. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-07 21:37 ` Daniel Colascione @ 2019-01-15 22:46 ` Daniel Colascione 2019-01-16 8:45 ` Tassilo Horn ` (5 more replies) 0 siblings, 6 replies; 375+ messages in thread From: Daniel Colascione @ 2019-01-15 22:46 UTC (permalink / raw) To: Daniel Colascione Cc: Eli Zaretskii, Daniel Colascione, cpitclaudel, Philipp Stephani, emacs-devel >>>> From: Philipp Stephani <p.stephani2@gmail.com> >>>> Date: Tue, 25 Dec 2018 16:46:23 +0100 >>>> Cc: Daniel Colascione <dancol@dancol.org>, Clément Pit-Claudel >>>> <cpitclaudel@gmail.com>, >>>> Emacs developers <emacs-devel@gnu.org> >>>> >>>> Hi, what's the state of the pdumper branch? Any chance we can merge it >>>> to master? >>> >>> Last time this popped up: >>> >>> http://lists.gnu.org/archive/html/emacs-devel/2018-10/msg00295.html >>> >> >> Yeah, it's about time we finally get around to doing this. I have some >> time between now and the end of the year, and I'll rebase the work and >> land it, assuming no new major objections. > > I'm still working on this, FWIW. I'd hoped it'd be a simple rebase, but > the vectorization of Lisp_Misc forced more changes than I'd thought. > Hi Daniel. >> I'm still working on this, FWIW. I'd hoped it'd be a simple rebase, but >> the vectorization of Lisp_Misc forced more changes than I'd thought. >> > > Good to know knowing You are working on this. I landed pdumper. It works on my machine (tm)! Let me know about any breakage. Plenty of people tested the old pdumper branch, but the changes necessary to rebase it could use a good once over. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-15 22:46 ` Daniel Colascione @ 2019-01-16 8:45 ` Tassilo Horn 2019-01-16 10:25 ` Robert Pluim ` (4 subsequent siblings) 5 siblings, 0 replies; 375+ messages in thread From: Tassilo Horn @ 2019-01-16 8:45 UTC (permalink / raw) To: Daniel Colascione; +Cc: emacs-devel "Daniel Colascione" <dancol@dancol.org> writes: > I landed pdumper. It works on my machine (tm)! Let me know about any > breakage. Plenty of people tested the old pdumper branch, but the > changes necessary to rebase it could use a good once over. I'm not sure if it's related, but since today I encounter bug#34094. Bye, Tassilo ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-15 22:46 ` Daniel Colascione 2019-01-16 8:45 ` Tassilo Horn @ 2019-01-16 10:25 ` Robert Pluim 2019-01-16 11:58 ` Phillip Lord ` (3 subsequent siblings) 5 siblings, 0 replies; 375+ messages in thread From: Robert Pluim @ 2019-01-16 10:25 UTC (permalink / raw) To: Daniel Colascione Cc: Philipp Stephani, Eli Zaretskii, cpitclaudel, emacs-devel "Daniel Colascione" <dancol@dancol.org> writes: > I landed pdumper. It works on my machine (tm)! Let me know about any > breakage. Plenty of people tested the old pdumper branch, but the changes > necessary to rebase it could use a good once over. It works for me on x86_64-apple-darwin18.2.0 and x86_64-pc-linux-gnu (I needed 'make bootstrap', which I guess is not unexpected for such a big change). Robert ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-15 22:46 ` Daniel Colascione 2019-01-16 8:45 ` Tassilo Horn 2019-01-16 10:25 ` Robert Pluim @ 2019-01-16 11:58 ` Phillip Lord 2019-01-18 12:46 ` Windows Binaries with pdumper Phillip Lord 2019-01-16 12:00 ` Skipping unexec via a big .elc file Elias Mårtenson ` (2 subsequent siblings) 5 siblings, 1 reply; 375+ messages in thread From: Phillip Lord @ 2019-01-16 11:58 UTC (permalink / raw) To: Daniel Colascione Cc: Philipp Stephani, Eli Zaretskii, cpitclaudel, emacs-devel I thought I would build some Windows snapshot binaries. I am getting this error on master: CC pdumper.o ../../../../git/master/src/pdumper.c: In function 'dump_cold_bignum': ../../../../git/master/src/pdumper.c:3447:53: error: conversion to 'mp_size_t {aka long int}' from 'size_t {aka long long unsigned int}' may alter its value [-Werror=conversion] mp_limb_t limb = mpz_getlimbn (bignum->value, i); ^ cc1.exe: some warnings being treated as errors make[1]: *** [Makefile:392: pdumper.o] Error 1 make[1]: Leaving directory '/home/Administrator/emacs-build/build/emacs-27.0.50-snapshot/x86_64/src' make: *** [Makefile:423: src] Error 2 Phil ^ permalink raw reply [flat|nested] 375+ messages in thread
* Windows Binaries with pdumper 2019-01-16 11:58 ` Phillip Lord @ 2019-01-18 12:46 ` Phillip Lord 2019-01-21 11:30 ` Jostein Kjønigsen 0 siblings, 1 reply; 375+ messages in thread From: Phillip Lord @ 2019-01-18 12:46 UTC (permalink / raw) To: emacs-devel I've updated the snapshot binaries for Windows to the latest trunk, which includes the pdumper. https://alpha.gnu.org/gnu/emacs/pretest/windows/emacs-27/ Phil ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Windows Binaries with pdumper 2019-01-18 12:46 ` Windows Binaries with pdumper Phillip Lord @ 2019-01-21 11:30 ` Jostein Kjønigsen 2019-01-21 14:19 ` Phillip Lord 0 siblings, 1 reply; 375+ messages in thread From: Jostein Kjønigsen @ 2019-01-21 11:30 UTC (permalink / raw) To: Phillip Lord, emacs-devel [-- Attachment #1: Type: text/plain, Size: 605 bytes --] Hey Phil. Thanks for the build. That's great news! I've installed it and it seems to run just fine, while loading it with a full config and doing some very basic tasks. Are there any particular areas of this build you want tested? -- Kind regards, Jostein Kjønigsen jostein@kjonigsen.net 🍵 jostein@gmail.com https://jostein.kjonigsen.net On Fri, Jan 18, 2019, at 1:46 PM, Phillip Lord wrote: > > I've updated the snapshot binaries for Windows to the latest trunk, > which includes the pdumper. > > https://alpha.gnu.org/gnu/emacs/pretest/windows/emacs-27/ > > Phil > [-- Attachment #2: Type: text/html, Size: 1553 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Windows Binaries with pdumper 2019-01-21 11:30 ` Jostein Kjønigsen @ 2019-01-21 14:19 ` Phillip Lord 0 siblings, 0 replies; 375+ messages in thread From: Phillip Lord @ 2019-01-21 14:19 UTC (permalink / raw) To: Jostein Kjønigsen; +Cc: jostein, emacs-devel I am not the best person to ask, since I neither use Windows nor know much about the pdumper. The motivation for putting the snapshots up was to get it into daily use by more people. Phil Jostein Kjønigsen <jostein@secure.kjonigsen.net> writes: > Hey Phil. > > Thanks for the build. That's great news! > > I've installed it and seems to run just fine, while loading it with a > full config and doing some very basic tasks. > Are there any particular areas of this build you want tested? > > -- > Kind regards, > Jostein Kjønigsen > > jostein@kjonigsen.net 🍵 jostein@gmail.com > https://jostein.kjonigsen.net > > > On Fri, Jan 18, 2019, at 1:46 PM, Phillip Lord wrote: >> >> I've updated the snapshot binaries for Windows to the latest trunk, >> which includes the pdumper. >> >> https://alpha.gnu.org/gnu/emacs/pretest/windows/emacs-27/ >> >> Phil >> ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-15 22:46 ` Daniel Colascione ` (2 preceding siblings ...) 2019-01-16 11:58 ` Phillip Lord @ 2019-01-16 12:00 ` Elias Mårtenson 2019-01-16 15:59 ` Eli Zaretskii 2019-01-16 21:56 ` Clément Pit-Claudel 5 siblings, 0 replies; 375+ messages in thread From: Elias Mårtenson @ 2019-01-16 12:00 UTC (permalink / raw) To: Daniel Colascione Cc: Philipp Stephani, Eli Zaretskii, cpitclaudel, emacs-devel [-- Attachment #1: Type: text/plain, Size: 556 bytes --] On Wed, 16 Jan 2019 at 06:47, Daniel Colascione <dancol@dancol.org> wrote: > I landed pdumper. It works on my machine (tm)! Let me know about any > breakage. Plenty of people tested the old pdumper branch, but the changes > necessary to rebase it could use a good once over. > Very nice, I've waited for this. Thanks a lot. I'm using it right now, and it seems to work fine. A small comment. There seems to be a typo in the word “build” in NEWS: “Emacs now needs an "emacs.pdmp" file, generated during the built”. Regards, Elias [-- Attachment #2: Type: text/html, Size: 1004 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-15 22:46 ` Daniel Colascione ` (3 preceding siblings ...) 2019-01-16 12:00 ` Skipping unexec via a big .elc file Elias Mårtenson @ 2019-01-16 15:59 ` Eli Zaretskii 2019-01-16 16:08 ` Daniel Colascione 2019-01-16 21:56 ` Clément Pit-Claudel 5 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2019-01-16 15:59 UTC (permalink / raw) To: Daniel Colascione; +Cc: emacs-devel > Date: Tue, 15 Jan 2019 14:46:03 -0800 > From: "Daniel Colascione" <dancol@dancol.org> > Cc: "Daniel Colascione" <dancol@dancol.org>, > "Eli Zaretskii" <eliz@gnu.org>, > "Philipp Stephani" <p.stephani2@gmail.com>, > cpitclaudel@gmail.com, > emacs-devel@gnu.org > > I landed pdumper. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-16 15:59 ` Eli Zaretskii @ 2019-01-16 16:08 ` Daniel Colascione 0 siblings, 0 replies; 375+ messages in thread From: Daniel Colascione @ 2019-01-16 16:08 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel >> Date: Tue, 15 Jan 2019 14:46:03 -0800 >> From: "Daniel Colascione" <dancol@dancol.org> >> Cc: "Daniel Colascione" <dancol@dancol.org>, >> "Eli Zaretskii" <eliz@gnu.org>, >> "Philipp Stephani" <p.stephani2@gmail.com>, >> cpitclaudel@gmail.com, >> emacs-devel@gnu.org >> >> I landed pdumper. > > Thanks. Should have a fix for the crash in detect_coding soon-ish. Who knew that struct coding_system had function pointers? (This structure is _mostly_ ephemeral, reloaded from Lisp, but it's also persistent on some contexts.) ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2019-01-15 22:46 ` Daniel Colascione ` (4 preceding siblings ...) 2019-01-16 15:59 ` Eli Zaretskii @ 2019-01-16 21:56 ` Clément Pit-Claudel 5 siblings, 0 replies; 375+ messages in thread From: Clément Pit-Claudel @ 2019-01-16 21:56 UTC (permalink / raw) To: Daniel Colascione; +Cc: Eli Zaretskii, Philipp Stephani, emacs-devel On 15/01/2019 17.46, Daniel Colascione wrote: > I landed pdumper. Congratulations! ^ permalink raw reply [flat|nested] 375+ messages in thread
* compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-01 12:07 ` Eli Zaretskii 2017-05-18 17:39 ` Daniel Colascione @ 2017-05-21 8:44 ` Ken Raeburn 2017-05-21 8:53 ` Paul Eggert 2017-05-21 16:02 ` John Wiegley 1 sibling, 2 replies; 375+ messages in thread From: Ken Raeburn @ 2017-05-21 8:44 UTC (permalink / raw) To: Emacs developers I haven’t had much time to further the work on the big-elc approach recently, but there is one idea I want to toss out there for possibly improving the load time further: Changing the .elc file format to a binary one. I’m not talking about a memory image like Daniel is working on. I mean a file representing a sequence of S-expressions, but optimized for loading speed rather than for human readability. The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines. I haven’t got a complete, concrete proposal, but I see at least a couple general approaches possible: 1) Follow the model of flat object file formats: Some file sections have data of various types (string content, symbol names, integer or floating constants); others (the equivalent of standard object file “relocation” data) would provide info on how to allocate and fill in the container objects (pairs, vectors, etc) desired, with references to the symbols or strings or other container objects. 2) Continue to use the current recursive processing, but with a binary format. Some (byte? word?) 
value indicates “this is string data”, it’s followed by a byte count and that many bytes of string content (always using the Emacs internal encoding, so we don’t have to translate when reading). Another value indicates an integer constant. Another value indicates a vector, and is followed by a length and then that many other values, which are each processed recursively before we get back to the object following the vector. Each object’s initializer’s length is dependent on the type, and for container types, the values contained within. Either way, getting away from the expensive one-character-at-a-time processing, multibyte coding, escape processing, etc., and pushing around groups of bytes whenever possible should save us time. This would be useable not just for the dumped.elc file, but for other compiled Lisp files as well, whether in the distribution or from ELPA or the user’s own code. I did throw together a half-baked attempt to try some of this out. I added a new “#” construct for unibyte strings, putting the byte count into the file so that the string data could be copied with fread() instead of a READCHAR loop. I also added a new version of the “#n#” syntax that uses a fixed number of READCHAR calls and avoids the decimal arithmetic. So, the file can no longer be processed as Lisp, and it still requires some text parsing, though not nearly as much as before; some of the worst of both worlds. But the load time for dumped.elc did drop by another 12% in my tests (start in batch mode, print a message and exit, from 0.227s down to 0.2s or less per run, still loading a couple of standard-elc-format files during startup). I’m curious if people think this might be an approach worth pursuing. Or if the Lisp-based elc format is seen as advantageous in ways I’m not seeing…. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
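For readers following along, approach 2 above (a type tag, then a length, then the payload, with containers processed recursively) can be sketched in a few lines. This is only an illustration in Python, not Emacs code: the tag values, the field widths (one-byte tag, 32-bit little-endian length, 64-bit integers) and the names `encode`/`decode` are all assumptions made up for the example, since the message does not fix any of them.

```python
import struct

# Hypothetical one-byte type tags; the thread does not define any.
TAG_INT, TAG_STR, TAG_VEC = 0, 1, 2


def encode(obj):
    """Serialize an int, a bytes string, or a (nested) list of those."""
    if isinstance(obj, int):
        # tag + fixed-width 64-bit little-endian integer
        return struct.pack("<Bq", TAG_INT, obj)
    if isinstance(obj, bytes):
        # tag + byte count + raw bytes (no escaping, no decoding)
        return struct.pack("<BI", TAG_STR, len(obj)) + obj
    if isinstance(obj, list):
        # tag + element count, followed by each element in turn
        head = struct.pack("<BI", TAG_VEC, len(obj))
        return head + b"".join(encode(item) for item in obj)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def decode(buf, pos=0):
    """Read one object starting at POS; return (object, next position)."""
    tag = buf[pos]
    pos += 1
    if tag == TAG_INT:
        (value,) = struct.unpack_from("<q", buf, pos)
        return value, pos + 8
    if tag == TAG_STR:
        (count,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        # The whole string is taken as one slice, not character by character.
        return bytes(buf[pos:pos + count]), pos + count
    if tag == TAG_VEC:
        (count,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = decode(buf, pos)  # recurse before moving on
            items.append(item)
        return items, pos
    raise ValueError(f"unknown tag {tag} at offset {pos - 1}")
```

The point of the sketch is the one made in the message: string contents are moved as a block of known length, and integers need no decimal arithmetic, so none of the per-character reader work remains.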
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-21 8:44 ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn @ 2017-05-21 8:53 ` Paul Eggert 2017-05-28 11:07 ` Ken Raeburn 2017-05-21 16:02 ` John Wiegley 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2017-05-21 8:53 UTC (permalink / raw) To: Ken Raeburn, Emacs developers Ken Raeburn wrote: > The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines. Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-21 8:53 ` Paul Eggert @ 2017-05-28 11:07 ` Ken Raeburn 2017-05-28 12:43 ` Philipp Stephani 2017-05-28 21:09 ` Paul Eggert 0 siblings, 2 replies; 375+ messages in thread From: Ken Raeburn @ 2017-05-28 11:07 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs developers On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote: > Ken Raeburn wrote: >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines. > > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses. Sorry for the delay in responding. The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI. Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use. Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF. I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, they’re not portable between platforms. 
Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted. We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files. Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
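To make the portability concern concrete, here is a small Python sketch that reads the platform-describing fields from the start of an ELF header. The offsets used (class at byte 4, data encoding at byte 5, OS ABI at byte 7, `e_machine` as a 16-bit field at offset 18) come from the ELF specification; the function name and the returned dictionary layout are just choices made for this illustration.

```python
import struct

WORD_SIZE = {1: "32-bit", 2: "64-bit"}                 # EI_CLASS values
BYTE_ORDER = {1: "little-endian", 2: "big-endian"}     # EI_DATA values


def describe_elf_header(header):
    """Return the platform-describing fields from the first 20 header bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    ei_class, ei_data, ei_osabi = header[4], header[5], header[7]
    if ei_class not in WORD_SIZE or ei_data not in BYTE_ORDER:
        raise ValueError("invalid EI_CLASS or EI_DATA")
    # e_type and e_machine are 16-bit fields stored in the file's own
    # byte order, so even reading them depends on EI_DATA.
    fmt = ("<" if ei_data == 1 else ">") + "HH"
    _e_type, e_machine = struct.unpack_from(fmt, header, 16)
    return {
        "word_size": WORD_SIZE[ei_class],
        "byte_order": BYTE_ORDER[ei_data],
        "os_abi": ei_osabi,      # 0 is ELFOSABI_NONE (System V)
        "machine": e_machine,    # 0 is EM_NONE, 62 is EM_X86_64
    }
```

A loader that accepted only files matching its own word size and byte order would reject files written on a different kind of machine, which is the sharing problem described above; fixing the values (for example machine 0, little-endian, 32-bit) avoids that at the cost of no longer matching the native format.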
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-28 11:07 ` Ken Raeburn @ 2017-05-28 12:43 ` Philipp Stephani 2017-05-29 9:33 ` Ken Raeburn 2017-05-28 21:09 ` Paul Eggert 1 sibling, 1 reply; 375+ messages in thread From: Philipp Stephani @ 2017-05-28 12:43 UTC (permalink / raw) To: Ken Raeburn, Paul Eggert; +Cc: Emacs developers [-- Attachment #1: Type: text/plain, Size: 2561 bytes --] Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07: > > On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote: > > > Ken Raeburn wrote: > >> The Guile project has taken this idea pretty far; they’re generating > ELF object files with a few special sections for Guile objects, using the > standard DWARF sections for debug information, etc. While it has a certain > appeal (making C modules and Lisp files look much more similar, maybe being > able to link Lisp and C together into one executable image, letting GDB > understand some of your data), switching to a machine-specific format would > be a pretty drastic change, when we can currently share the files across > machines. > > > > Although it does indeed sound like a big change, I don't see why it > would prevent us from sharing the files across machines. Emacs can use > standard ELF and DWARF format on any platform if Emacs is doing the > loading. And there should be some software-engineering benefit in using the > same format that Guile uses. > > Sorry for the delay in responding. > > The ELF format has header fields indicating the word size, endianness, > machine architecture (though there’s a value for “none”), and OS ABI. Some > fields vary in size or order depending on whether the 32-bit or 64-bit > format is in use.
Some other format details (e.g., relocation types, > interpretation of certain ranges of values in some fields) are > architecture- or OS-dependent; we might not care about many of those > details, but relocations are likely needed if we want to play linking games > or use DWARF. > > I think Guile is using whatever the native word size and architecture > are. If we do that for Emacs, they’re not portable between platforms. > Currently it works for me to put my Lisp files, both source and compiled, > into ~/elisp and use them from different kinds of machines if my home > directory is NFS-mounted. > > We could instead pick fixed values (say, architecture “none”, > little-endian, 32-bit), but then there’s no guarantee that we could use any > of the usual GNU tools on them without a bunch of work, or that we’d ever > be able to use non-GNU tools to treat them as object files. Then again, we > couldn’t expect to do the latter portably anyway, since some of the > platforms don’t even use ELF. > > Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? [-- Attachment #2: Type: text/html, Size: 2873 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-28 12:43 ` Philipp Stephani @ 2017-05-29 9:33 ` Ken Raeburn 2017-07-02 15:46 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-05-29 9:33 UTC (permalink / raw) To: Philipp Stephani; +Cc: Paul Eggert, Emacs developers [-- Attachment #1: Type: text/plain, Size: 6033 bytes --] On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote: > > > Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07: > > On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote: > > > Ken Raeburn wrote: > >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines. > > > > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses. > > Sorry for the delay in responding. > > The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI. Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use.
Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF. > > I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, they’re not portable between platforms. Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted. > > We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files. Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF. > > > Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? That’s an interesting idea. If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own. It’ll need to handle data structures with circular or cross-linked references. And we have the doc string delayed-loading optimization (that currently uses #$ and #@ syntaxes); presumably we’d like to keep that optimization in some form. It would be good not to have to build all our data structures on ones generated by the tool with its own bookkeeping fields; having anything in a cons cell besides the “car” and “cdr” slots would mean a significant increase in memory use. 
I initially said, “follow the model of flat object file formats”, not “use ELF”; ELF is just one way of organizing the data of an object file, with years of experience behind it, which we could use wholesale or borrow some lessons from. One of the typical advantages of object file formats is that the data is grouped for efficient memory usage; some sections of a file will be mapped into the address space read-only (shared between processes), other sections read-write (possibly shared until copied on write), and others not mapped at all. For example, we might put symbol names (normally never modified but it can be done), doc strings (to be loaded later, only if needed), byte code, and other strings into their own sections, and create Lisp_String objects and such pointing to those bytes as needed. We don’t keep much in the way of source location information for Lisp code around, but if we ever change that, arguably it could go in a file section that’s not mapped or read until the debugger wants the information. The Guile project’s documentation says their use of ELF is intended to build on existing work to invent a good object file format with several desired characteristics (https://www.gnu.org/software/guile/manual/html_node/Object-File-Format.html): • Above all else, it should be very cheap to load a compiled file. • It should be possible to statically allocate constants in the file. For example, a bytevector literal in source code can be emitted directly into the object file. • The compiled file should enable maximum code and data sharing between different processes. • The compiled file should contain debugging information, such as line numbers, but that information should be separated from the code itself. It should be possible to strip debugging information if space is tight. They’re generating byte code currently, but are looking forward towards generating native code as well (instead?). 
Their write-up implicitly assumes that, as with “normal” object files, the idea is to mmap the data into the address space, some of it read-only and some of it automatically getting some patching up, and then using those in-memory objects directly. There’s no explicit discussion of the tradeoffs of loading a file all at once versus reading one object tree (S-expression) at a time from an input stream, but especially when mapping and using much of the data unmodified is feasible, I suspect the all-at-once approach is likely to be more efficient. Whether that would be true in a case like Emacs, I don’t know. They use DWARF for carrying some debug information, but so far I’m unsure what information is actually stored there. Ken [-- Attachment #2: Type: text/html, Size: 8822 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
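As a rough illustration of the "flat object file" model raised earlier in this thread, and of the requirement in this message that circular or cross-linked structures be handled, here is a Python sketch of a two-pass loader. Everything in it is hypothetical: the `atoms` table stands in for data sections (strings, symbol names, numbers), the `containers` table stands in for the relocation-like records describing pairs and vectors, and the `("atom", i)` / `("obj", j)` slot encoding is invented for the example.

```python
def load_flat(atoms, containers):
    """Build container objects from a flat description.

    ATOMS is a list of leaf values.  CONTAINERS is a list of slot lists;
    each slot is ("atom", i) for atoms[i] or ("obj", j) for container j.
    Returns the list of constructed containers (Python lists here).
    """
    # Pass 1: allocate every container, still empty, so that later
    # references can point at objects that are not filled in yet.
    objects = [[None] * len(slots) for slots in containers]

    # Pass 2: patch each slot, much like applying relocations.
    for obj, slots in zip(objects, containers):
        for k, (kind, index) in enumerate(slots):
            if kind == "atom":
                obj[k] = atoms[index]
            elif kind == "obj":
                obj[k] = objects[index]
            else:
                raise ValueError(f"unknown slot kind: {kind!r}")
    return objects


# Two "pairs" that point at each other, which a purely recursive
# reader could not build without something like the #n# syntax.
atoms = ["a", 42]
containers = [
    [("atom", 0), ("obj", 1)],
    [("atom", 1), ("obj", 0)],
]
pairs = load_flat(atoms, containers)
```

Because allocation is separated from filling, cycles and shared substructure need no special casing, and the atom data never has to be parsed at all. Whether this beats reading one S-expression at a time for Emacs is exactly the open question in the message above.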
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 9:33 ` Ken Raeburn @ 2017-07-02 15:46 ` Philipp Stephani 2017-07-03 1:44 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Philipp Stephani @ 2017-07-02 15:46 UTC (permalink / raw) To: Ken Raeburn; +Cc: Paul Eggert, Emacs developers [-- Attachment #1: Type: text/plain, Size: 3249 bytes --] Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33: > > On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote: > > > > Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at > 13:07: > >> >> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote: >> >> > Ken Raeburn wrote: >> >> The Guile project has taken this idea pretty far; they’re generating >> ELF object files with a few special sections for Guile objects, using the >> standard DWARF sections for debug information, etc. While it has a certain >> appeal (making C modules and Lisp files look much more similar, maybe being >> able to link Lisp and C together into one executable image, letting GDB >> understand some of your data), switching to a machine-specific format would >> be a pretty drastic change, when we can currently share the files across >> machines. >> > >> > Although it does indeed sound like a big change, I don't see why it >> would prevent us from sharing the files across machines. Emacs can use >> standard ELF and DWARF format on any platform if Emacs is doing the >> loading. And there should be some software-engineering benefit in using the >> same format that Guile uses. >> >> Sorry for the delay in responding. >> >> The ELF format has header fields indicating the word size, endianness, >> machine architecture (though there’s a value for “none”), and OS ABI. Some >> fields vary in size or order depending on whether the 32-bit or 64-bit >> format is in use.
Some other format details (e.g., relocation types, >> interpretation of certain ranges of values in some fields) are >> architecture- or OS-dependent; we might not care about many of those >> details, but relocations are likely needed if we want to play linking games >> or use DWARF. >> >> I think Guile is using whatever the native word size and architecture >> are. If we do that for Emacs, they’re not portable between platforms. >> Currently it works for me to put my Lisp files, both source and compiled, >> into ~/elisp and use them from different kinds of machines if my home >> directory is NFS-mounted. >> >> We could instead pick fixed values (say, architecture “none”, >> little-endian, 32-bit), but then there’s no guarantee that we could use any >> of the usual GNU tools on them without a bunch of work, or that we’d ever >> be able to use non-GNU tools to treat them as object files. Then again, we >> couldn’t expect to do the latter portably anyway, since some of the >> platforms don’t even use ELF. >> >> > Is there any significant advantage of using ELF, or could this just use > one of the standard binary serialization formats (protobuf, flatbuffer, > ...)? > > > That’s an interesting idea. If one of the popular serialization libraries > is compatibly licensed, easy to use, and performs well, it may be better > than rolling our own. > I've tried this out (with flatbuffers), but I haven't seen significant speed improvements. It might very well be the case that during loading the reader is already fast enough (e.g. for ELC files it doesn't do any decoding), and it's the evaluator that's too slow. [-- Attachment #2: Type: text/html, Size: 4253 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-07-02 15:46 ` Philipp Stephani @ 2017-07-03 1:44 ` Ken Raeburn 2017-09-24 13:57 ` Philipp Stephani 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-07-03 1:44 UTC (permalink / raw) To: Philipp Stephani; +Cc: Paul Eggert, Emacs developers [-- Attachment #1: Type: text/plain, Size: 3869 bytes --] On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com> wrote: > Ken Raeburn <raeburn@raeburn.org> wrote on Mon, 29 May 2017 at 11:33: > > On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> wrote: > >> >> >> Ken Raeburn <raeburn@raeburn.org> wrote on Sun, 28 May 2017 at 13:07: >> >> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote: >> >> > Ken Raeburn wrote: >> >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines. >> > >> > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses. >> >> Sorry for the delay in responding. >> >> The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI.
Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use. Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF. >> >> I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, they’re not portable between platforms. Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted. >> >> We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files. Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF. >> >> >> Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? > > That’s an interesting idea. If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own. > > I've tried this out (with flatbuffers), but I haven't seen significant speed improvements. It might very well be the case that during loading the reader is already fast enough (e.g. for ELC files it doesn't do any decoding), and it's the evaluator that's too slow. What’s your test case, and how are you measuring the performance? In my tests with the one big elc file, using the Linux “perf” tool, it seems that readchar, read1, encode_char, and ungetc are where a good chunk of CPU time is still spent — about 1/4 in my testing with the “big elc file” code. 
My experiment in May cut down a chunk of the overall run time (start in batch mode, print a message, and exit) with some ugly reader syntax hacks. Tests with smaller files may have different characteristics though… Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
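The ELF header fields mentioned in this message (word size, endianness, OS ABI, machine architecture) can be read from the first 20 bytes of a file. The following is a minimal sketch, not code from Emacs or Guile. The byte offsets and values follow the ELF specification; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Offsets into the ELF header, per the ELF specification.  */
enum { EI_CLASS = 4, EI_DATA = 5, EI_OSABI = 7, E_MACHINE_OFFSET = 18 };

/* Hypothetical summary type, invented for this sketch.  */
struct elf_summary
{
  int word_bits;      /* 32 or 64; 0 if the class byte is unrecognized */
  int little_endian;  /* 1, 0, or -1 if the data byte is unrecognized */
  unsigned osabi;     /* raw EI_OSABI byte */
  unsigned machine;   /* e_machine; 0 is EM_NONE */
};

/* Fill *OUT from the first LEN bytes of an ELF file.
   Return 0 on success, -1 if BUF is too short or lacks the ELF magic.  */
int
summarize_elf_header (const unsigned char *buf, size_t len,
                      struct elf_summary *out)
{
  assert (buf != NULL && out != NULL);
  if (len < 20 || buf[0] != 0x7f || buf[1] != 'E'
      || buf[2] != 'L' || buf[3] != 'F')
    return -1;

  out->word_bits = buf[EI_CLASS] == 1 ? 32 : buf[EI_CLASS] == 2 ? 64 : 0;
  out->little_endian = buf[EI_DATA] == 1 ? 1 : buf[EI_DATA] == 2 ? 0 : -1;
  out->osabi = buf[EI_OSABI];

  /* e_machine is a 16-bit field stored in the file's own byte order.
     An unrecognized encoding is decoded as little-endian here.  */
  const unsigned char *m = buf + E_MACHINE_OFFSET;
  if (out->little_endian == 0)
    out->machine = ((unsigned) m[0] << 8) | m[1];
  else
    out->machine = m[0] | ((unsigned) m[1] << 8);
  return 0;
}
```

A loader that wanted files to be shareable across machines would have to either accept every combination of these fields or fix them to one choice, which is the trade-off described above.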
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-07-03 1:44 ` Ken Raeburn @ 2017-09-24 13:57 ` Philipp Stephani 2017-09-27 8:31 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Philipp Stephani @ 2017-09-24 13:57 UTC (permalink / raw) To: Ken Raeburn; +Cc: Paul Eggert, Emacs developers [-- Attachment #1: Type: text/plain, Size: 3926 bytes --] Ken Raeburn <raeburn@raeburn.org> schrieb am Mo., 3. Juli 2017 um 03:44 Uhr: > > On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com> wrote: > > Ken Raeburn <raeburn@raeburn.org> schrieb am Mo., 29. Mai 2017 um > 11:33 Uhr: > >> >> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com> >> wrote: >> >> >> >> Ken Raeburn <raeburn@raeburn.org> schrieb am So., 28. Mai 2017 um >> 13:07 Uhr: >> >>> >>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu> wrote: >>> >>> > Ken Raeburn wrote: >>> >> The Guile project has taken this idea pretty far; they’re generating >>> ELF object files with a few special sections for Guile objects, using the >>> standard DWARF sections for debug information, etc. While it has a certain >>> appeal (making C modules and Lisp files look much more similar, maybe being >>> able to link Lisp and C together into one executable image, letting GDB >>> understand some of your data), switching to a machine-specific format would >>> be a pretty drastic change, when we can currently share the files across >>> machines. >>> > >>> > Although it does indeed sound like a big change, I don't see why it >>> would prevent us from sharing the files across machines. Emacs can use >>> standard ELF and DWARF format on any platform if Emacs is doing the >>> loading. And there should be some software-engineering benefit in using the >>> same format that Guile uses. >>> >>> Sorry for the delay in responding. 
>>> >>> The ELF format has header fields indicating the word size, endianness, >>> machine architecture (though there’s a value for “none”), and OS ABI. Some >>> fields vary in size or order depending on whether the 32-bit or 64-bit >>> format is in use. Some other format details (e.g., relocation types, >>> interpretation of certain ranges of values in some fields) are >>> architecture- or OS-dependent; we might not care about many of those >>> details, but relocations are likely needed if we want to play linking games >>> or use DWARF. >>> >>> I think Guile is using whatever the native word size and architecture >>> are. If we do that for Emacs, they’re not portable between platforms. >>> Currently it works for me to put my Lisp files, both source and compiled, >>> into ~/elisp and use them from different kinds of machines if my home >>> directory is NFS-mounted. >>> >>> We could instead pick fixed values (say, architecture “none”, >>> little-endian, 32-bit), but then there’s no guarantee that we could use any >>> of the usual GNU tools on them without a bunch of work, or that we’d ever >>> be able to use non-GNU tools to treat them as object files. Then again, we >>> couldn’t expect to do the latter portably anyway, since some of the >>> platforms don’t even use ELF. >>> >>> >> Is there any significant advantage of using ELF, or could this just use >> one of the standard binary serialization formats (protobuf, flatbuffer, >> ...)? >> >> >> That’s an interesting idea. If one of the popular serialization >> libraries is compatibly licensed, easy to use, and performs well, it may >> be better than rolling our own. >> > > I've tried this out (with flatbuffers), but I haven't seen significant > speed improvements. It might very well be the case that during loading the > reader is already fast enough (e.g. for ELC files it doesn't do any > decoding), and it's the evaluator that's too slow. > > > What’s your test case, and how are you measuring the performance? 
> IIRC I've repeatedly loaded one of the biggest .elc files shipped with Emacs and measured the total loading time. I haven't done any detailed profiling, since I was hoping for a significant speed increase that would justify the work. If people are generally interested in pursuing this further, I'd be happy to put my code into a scratch branch. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-09-24 13:57 ` Philipp Stephani @ 2017-09-27 8:31 ` Ken Raeburn 0 siblings, 0 replies; 375+ messages in thread From: Ken Raeburn @ 2017-09-27 8:31 UTC (permalink / raw) To: Philipp Stephani; +Cc: Paul Eggert, Emacs developers [-- Attachment #1: Type: text/plain, Size: 6615 bytes --] On Sep 24, 2017, at 09:57, Philipp Stephani <p.stephani2@gmail.com> wrote: > Ken Raeburn <raeburn@raeburn.org <mailto:raeburn@raeburn.org>> schrieb am Mo., 3. Juli 2017 um 03:44 Uhr: > > On Jul 2, 2017, at 11:46, Philipp Stephani <p.stephani2@gmail.com <mailto:p.stephani2@gmail.com>> wrote: > >> Ken Raeburn <raeburn@raeburn.org <mailto:raeburn@raeburn.org>> schrieb am Mo., 29. Mai 2017 um 11:33 Uhr: >> >> On May 28, 2017, at 08:43, Philipp Stephani <p.stephani2@gmail.com <mailto:p.stephani2@gmail.com>> wrote: >> >>> >>> >>> Ken Raeburn <raeburn@raeburn.org <mailto:raeburn@raeburn.org>> schrieb am So., 28. Mai 2017 um 13:07 Uhr: >>> >>> On May 21, 2017, at 04:53, Paul Eggert <eggert@cs.ucla.edu <mailto:eggert@cs.ucla.edu>> wrote: >>> >>> > Ken Raeburn wrote: >>> >> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines. >>> > >>> > Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses. 
>>> >>> Sorry for the delay in responding. >>> >>> The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI. Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use. Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF. >>> >>> I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, they’re not portable between platforms. Currently it works for me to put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines if my home directory is NFS-mounted. >>> >>> We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files. Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF. >>> >>> >>> Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)? >> >> That’s an interesting idea. If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own. >> >> I've tried this out (with flatbuffers), but I haven't seen significant speed improvements. It might very well be the case that during loading the reader is already fast enough (e.g. for ELC files it doesn't do any decoding), and it's the evaluator that's too slow. > > What’s your test case, and how are you measuring the performance? 
> > IIRC I've repeatedly loaded one of the biggest .elc files shipped with Emacs and measured the total loading time. I haven't done any detailed profiling, since I was hoping for a significant speed increase that would justify the work. It’ll depend on what the code in that file is doing. In the raeburn-startup branch, the last bit of profiling I did showed nearly 1/3 of the CPU time of a simple run of Emacs in batch mode was spent reading and parsing the saved Lisp environment. (You can see a graph at http://www.mit.edu/~raeburn/emacs.svg; if you haven’t read up on flame graphs (http://www.brendangregg.com/flamegraphs.html), they provide a nice visualization of the CPU time consumption broken down by what the current call stack looks like.) Most of the rest of the CPU time was spent executing the loaded code (lots of fset and setplist calls), but the biggest chunk of that was executing a nested load of international/characters.elc; during that nested load, most of the time was spent in execution (mostly char table processing) and very little in parsing. So… for the saved Lisp environment file, excluding the nested load, reading and parsing is about 2/3 of the CPU time used; for characters.elc, reading and parsing is a minuscule portion of the CPU time. Loading a Lisp file internally uses the Lisp “read” routine, which requires an input stream of character values (not byte values) to be supplied; we examine the stream object and dispatch to various bits of code depending on its type (buffer, marker, function, certain special symbols), *for each character*. Each byte is examined to see if it’s part of a multibyte character. Each character is considered to see if it’s allowed to be part of a symbol name or string or whatever we’re in the middle of parsing, or if it’s a backslash quoting some other character, etc.
Hence my hopes for a non-text-based format, designed to streamline reading data from files, where we can do things like specify a vector length or string length up front instead of having to consider each character and process character quoting sequences, stuff like that. E.g., here’s a unibyte string of 47 bytes, so just copy the bytes without considering every one separately. No human-readable printed form, no escape sequences needed. Another help might be finding a faster way to load the character data. I’ve got the branch loading characters.elc at startup because saving and parsing the generated tables was even slower than evaluating the Lisp code to generate them. Perhaps we can do some processing of them during the build and convert them into some other form that lets us start up faster. > If people are generally interested in pursuing this further, I'd be happy to put my code into a scratch branch. I’d be curious to take a look… Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
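To illustrate the difference Ken describes, here is a rough sketch contrasting a text-style string reader, which must look at every byte and handle backslash quoting, with a length-prefixed reader that copies the payload in one step. This is not the Emacs reader and not a proposed file format: the function names and the 4-byte little-endian length prefix are assumptions chosen for the example, and the text reader handles only the simplest form of backslash quoting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Text style: parse a double-quoted string, examining each byte and
   treating a backslash as quoting the next byte.  Store the decoded
   bytes in OUT and their count in *OUT_LEN.  Return the number of
   input bytes consumed, or 0 on malformed input or overflow.  */
size_t
read_quoted (const unsigned char *in, size_t in_len,
             unsigned char *out, size_t out_cap, size_t *out_len)
{
  assert (in != NULL && out != NULL && out_len != NULL);
  if (in_len < 2 || in[0] != '"')
    return 0;
  size_t i = 1, n = 0;
  while (i < in_len && in[i] != '"')
    {
      if (in[i] == '\\' && ++i >= in_len)
        return 0;               /* backslash at end of input */
      if (n >= out_cap)
        return 0;
      out[n++] = in[i++];
    }
  if (i >= in_len)
    return 0;                   /* no closing quote */
  *out_len = n;
  return i + 1;
}

/* Binary style (hypothetical layout): a 4-byte little-endian length
   followed by that many raw bytes.  No per-byte inspection is needed.
   Return the number of input bytes consumed, or 0 on error.  */
size_t
read_prefixed (const unsigned char *in, size_t in_len,
               unsigned char *out, size_t out_cap, size_t *out_len)
{
  assert (in != NULL && out != NULL && out_len != NULL);
  if (in_len < 4)
    return 0;
  uint32_t n = (uint32_t) in[0] | (uint32_t) in[1] << 8
               | (uint32_t) in[2] << 16 | (uint32_t) in[3] << 24;
  if (n > in_len - 4 || n > out_cap)
    return 0;
  memcpy (out, in + 4, n);
  *out_len = n;
  return 4 + (size_t) n;
}
```

Both readers produce the same bytes for the same logical string; the second one does a bounds check and a single memcpy, which is the kind of saving the message is hoping for. Whether it matters in practice depends on how much of the load time is parsing, as discussed above.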
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-28 11:07 ` Ken Raeburn 2017-05-28 12:43 ` Philipp Stephani @ 2017-05-28 21:09 ` Paul Eggert 2017-05-29 9:33 ` Ken Raeburn 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2017-05-28 21:09 UTC (permalink / raw) To: Ken Raeburn; +Cc: Emacs developers Ken Raeburn wrote: > I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, they’re not portable between platforms. Sure, but we're talking about the format Emacs uses to save its state, not the format of .elc files. Currently Emacs saves its state as an executable file that in general cannot be moved from one GNU/Linux distribution to another even if they have the same architecture. Switching to Guile's platform-neutral approach would make Emacs's saved-state format more portable, not less. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-28 21:09 ` Paul Eggert @ 2017-05-29 9:33 ` Ken Raeburn 2017-05-29 16:37 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2017-05-29 9:33 UTC (permalink / raw) To: Paul Eggert; +Cc: Emacs developers On May 28, 2017, at 17:09, Paul Eggert <eggert@cs.ucla.edu> wrote: > Ken Raeburn wrote: >> I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, they’re not portable between platforms. > > Sure, but we're talking about the format Emacs uses to save its state, not the format of .elc files. Currently Emacs saves its state as an executable file that in general cannot be moved from one GNU/Linux distribution to another even if they have the same architecture. Switching to Guile's platform-neutral approach would make Emacs's saved-state format more portable, not less. Actually, I was referring to compiled-Lisp files generally, including the “dumped.elc” file, when I suggested it. And I wouldn’t describe Guile’s “ELF everywhere” approach as entirely platform-neutral. I built a Guile tree tonight to take a look. My guess earlier about using the native architecture was wrong (it uses “none”), but it appears that the generated files are specific to the host’s byte order and word size. So some sharing is possible between similar platforms, but not across all as with the current .elc format. Even saving just the Lisp state as with “dumped.elc”, I think there could be state from the environment or build options that varies across distributions. Lists of supported image types, distro customizations, things like that. I’m not sure what benefit there is in trying to share saved Emacs state across distros. If the goal is for a user to save a massively customized environment for future invocations, perhaps we should just work on speeding up the loading of the customizations. 
If we want standardized object/executable format specifically for the preloaded environment, perhaps using the native format by way of the C compiler is a better choice. I think this may have come up in the discussion before. The big loss there is the ability to create a new saved environment without having a C compiler handy, but it seems like a thing few people are likely to want to do, and even fewer non-developers who might not be able to install a compiler. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 9:33 ` Ken Raeburn @ 2017-05-29 16:37 ` Paul Eggert 2017-05-29 17:39 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2017-05-29 16:37 UTC (permalink / raw) To: Ken Raeburn; +Cc: Emacs developers Ken Raeburn wrote: > And I wouldn’t describe Guile’s “ELF everywhere” approach as entirely platform-neutral. That's correct. It's more portable than what Emacs currently does, but it's less portable than saving state in .elc format. > Even saving just the Lisp state as with “dumped.elc”, I think there could be state from the environment or build options that varies across distributions. Yes, quite true. Even with "dumped .elc" or with any of the other methods proposed, it would be quite difficult to make the saved state portable to any platform. That kind of portability should not be our goal. > If we want standardized object/executable format specifically for the preloaded environment, perhaps using the native format by way of the C compiler is a better choice. I think this may have come up in the discussion before. The big loss there is the ability to create a new saved environment without having a C compiler handy, but it seems like a thing few people are likely to want to do, and even fewer non-developers who might not be able to install a compiler. Yes, this is my preferred solution too; I was the one who made that suggestion. Although Eli didn't like the idea at the time, perhaps there will come a day when we revisit it. It should be faster than even Guile's ELF-based loading, which is already plenty fast. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 16:37 ` Paul Eggert @ 2017-05-29 17:39 ` Eli Zaretskii 2017-05-29 18:03 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-05-29 17:39 UTC (permalink / raw) To: Paul Eggert; +Cc: raeburn, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 29 May 2017 09:37:15 -0700 > Cc: Emacs developers <emacs-devel@gnu.org> > > > If we want standardized object/executable format specifically for the preloaded environment, perhaps using the native format by way of the C compiler is a better choice. I think this may have come up in the discussion before. The big loss there is the ability to create a new saved environment without having a C compiler handy, but it seems like a thing few people are likely to want to do, and even fewer non-developers who might not be able to install a compiler. > > Yes, this is my preferred solution too; I was the one who made that suggestion. > Although Eli didn't like the idea at the time, perhaps there will come a day > when we revisit it. It should be faster than even Guile's ELF-based loading, > which is already plenty fast. I have no doubt it will be faster. My problem with this alternative is that I believe maintaining it will need experts on C and C compilers that we cannot rely on having available. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 17:39 ` Eli Zaretskii @ 2017-05-29 18:03 ` Paul Eggert 2017-05-29 18:53 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2017-05-29 18:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: raeburn, emacs-devel Eli Zaretskii wrote: > My problem with this alternative > is that I believe maintaining it will need experts on C and C > compilers that we cannot rely on having available. I don't see why we'd need any more expertise in C than we already require. The Emacs core is written in C, and one must be expert in C to maintain it. The C code that would be output would be simple -- much simpler than what we are already maintaining. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 18:03 ` Paul Eggert @ 2017-05-29 18:53 ` Eli Zaretskii 2017-05-29 20:15 ` Paul Eggert 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2017-05-29 18:53 UTC (permalink / raw) To: Paul Eggert; +Cc: raeburn, emacs-devel > Cc: raeburn@raeburn.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 29 May 2017 11:03:38 -0700 > > Eli Zaretskii wrote: > > My problem with this alternative > > is that I believe maintaining it will need experts on C and C > > compilers that we cannot rely on having available. > > I don't see why we'd need any more expertise in C than we already require. The > Emacs core is written in C, and one must be expert in C to maintain it. The C > code that would be output would be simple -- much simpler than what we are > already maintaining. This idea requires _generating_ C, something we don't currently do, AFAIK. As for whether it will be simple, I reserve my judgment, since no code was presented to demonstrate the idea for some reasonably complex Lisp data. ^ permalink raw reply [flat|nested] 375+ messages in thread
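For readers trying to picture what "generating C" could mean for the simplest possible data, the sketch below emits C source for a statically initialized linked list of integers. It is purely hypothetical: it is not code anyone in the thread presented, it says nothing about how real Lisp objects (symbols, strings, char tables) would be handled, and the names emit_int_list and struct cell are invented here. The emitted text assumes a declaration such as "struct cell { int value; struct cell *next; };" exists elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical generator: write into BUF (capacity CAP) C source that
   defines NAME as a statically initialized chain of N cells holding
   VALS, each cell pointing at the next and the last one at null.
   Return the number of characters written, or -1 if BUF is too small.  */
int
emit_int_list (char *buf, size_t cap, const char *name,
               const int *vals, size_t n)
{
  assert (buf != NULL && name != NULL && vals != NULL && n > 0);
  int k = snprintf (buf, cap, "static struct cell %s[%zu] = {\n", name, n);
  if (k < 0 || (size_t) k >= cap)
    return -1;
  size_t used = (size_t) k;

  for (size_t i = 0; i < n; i++)
    {
      if (i + 1 < n)
        k = snprintf (buf + used, cap - used, "  { %d, &%s[%zu] },\n",
                      vals[i], name, i + 1);
      else
        k = snprintf (buf + used, cap - used, "  { %d, 0 },\n", vals[i]);
      if (k < 0 || (size_t) k >= cap - used)
        return -1;
      used += (size_t) k;
    }

  k = snprintf (buf + used, cap - used, "};\n");
  if (k < 0 || (size_t) k >= cap - used)
    return -1;
  return (int) (used + (size_t) k);
}
```

The point of the sketch is only that pointers between objects become address constants the compiler and linker resolve, so nothing has to be parsed at startup. It does not address the maintenance concern raised in this message.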
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 18:53 ` Eli Zaretskii @ 2017-05-29 20:15 ` Paul Eggert 2017-05-30 5:52 ` Ken Raeburn 2017-05-30 5:55 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Paul Eggert @ 2017-05-29 20:15 UTC (permalink / raw) To: Eli Zaretskii; +Cc: raeburn, emacs-devel Eli Zaretskii wrote: > This idea requires _generating_ C, something we don't currently do Actually, Emacs generates and reformats C code all the time, e.g., when editing and reindenting it. And the Emacs build procedure generates plenty of C code, e.g., lib/stdlib.h. It is not a stretch to assume enough C expertise to deal with this sort of thing. > As for whether it will be simple, I reserve my judgment That's discouraging. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 20:15 ` Paul Eggert @ 2017-05-30 5:52 ` Ken Raeburn 2017-05-30 5:55 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Ken Raeburn @ 2017-05-30 5:52 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, emacs-devel Ah, yes, I remember this part of the discussion now… sorry, didn’t mean to stir up the same old argument again. The expertise question would be an issue for attempting to adopt “ELF everywhere” in some fashion too, especially if we tried to do something interesting with using DWARF to store some kind of debug info. On May 29, 2017, at 16:15, Paul Eggert <eggert@cs.ucla.edu> wrote: > Eli Zaretskii wrote: >> This idea requires _generating_ C, something we don't currently do > > Actually, Emacs generates and reformats C code all the time, e.g., when editing and reindenting it. And the Emacs build procedure generates plenty of C code, e.g., lib/stdlib.h. It is not a stretch to assume enough C expertise to deal with this sort of thing. > >> As for whether it will be simple, I reserve my judgment > > That's discouraging. I dunno, sounds like an invitation to produce an implementation that shows how straightforward it can be. But, I had other things I was planning to work on tonight… Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-29 20:15 ` Paul Eggert 2017-05-30 5:52 ` Ken Raeburn @ 2017-05-30 5:55 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2017-05-30 5:55 UTC (permalink / raw) To: Paul Eggert; +Cc: raeburn, emacs-devel > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 29 May 2017 13:15:28 -0700 > Cc: raeburn@raeburn.org, emacs-devel@gnu.org > > Eli Zaretskii wrote: > > This idea requires _generating_ C, something we don't currently do > > Actually, Emacs generates and reformats C code all the time, e.g., when editing > and reindenting it. And the Emacs build procedure generates plenty of C code, > e.g., lib/stdlib.h. It is not a stretch to assume enough C expertise to deal > with this sort of thing. I can only say I disagree. > > As for whether it will be simple, I reserve my judgment > > That's discouraging. I'm sorry if you feel like that, but I don't see why: we are discussing hypothetical code that I have no idea what it will look like. I just don't want to opine about something I never saw. I think it's reasonable. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: compiled lisp file format (Re: Skipping unexec via a big .elc file) 2017-05-21 8:44 ` compiled lisp file format (Re: Skipping unexec via a big .elc file) Ken Raeburn 2017-05-21 8:53 ` Paul Eggert @ 2017-05-21 16:02 ` John Wiegley 1 sibling, 0 replies; 375+ messages in thread From: John Wiegley @ 2017-05-21 16:02 UTC (permalink / raw) To: Ken Raeburn; +Cc: Emacs developers [-- Attachment #1: Type: text/plain, Size: 771 bytes --] >>>>> "KR" == Ken Raeburn <raeburn@raeburn.org> writes: KR> I haven’t had much time to further the work on the big-elc approach KR> recently, but there is one idea I want to toss out there for possibly KR> improving the load time further: Changing the .elc file format to a binary KR> one. I’m not talking about a memory image like Daniel is working on. I KR> mean a file representing a sequence of S-expressions, but optimized for KR> loading speed rather than for human readability. I would like to see this; I can't think of a reason not to encode the information in the best format for loading. -- John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 658 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-07 7:28 ` Eli Zaretskii 2017-04-07 9:02 ` Ken Raeburn @ 2017-04-07 13:23 ` Stefan Monnier 1 sibling, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2017-04-07 13:23 UTC (permalink / raw) To: emacs-devel > Perhaps we could have a separate, much smaller dumped.elc for batch > invocations, to cater to these use cases. Ken, does this make sense? FWIW, I'm not sure if there is much to save there. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2017-04-03 16:15 ` Ken Raeburn 2017-04-03 16:57 ` Alan Mackenzie @ 2017-04-10 16:19 ` Ken Raeburn 1 sibling, 0 replies; 375+ messages in thread From: Ken Raeburn @ 2017-04-10 16:19 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel On Apr 3, 2017, at 12:15, Ken Raeburn <raeburn@raeburn.org> wrote: > On Mar 31, 2017, at 04:40, Ken Raeburn <raeburn@raeburn.org> wrote: > >> >> On Mar 31, 2017, at 02:57, Eli Zaretskii <eliz@gnu.org> wrote: >>> >>> This fixes the problem, and Emacs now starts OK, so the abbrevs issue >>> is also solved. >> >> Great! >> >>> I think you should push all the changes you asked me to apply as >>> patches. >> >> Will do, probably this weekend. > > Looks like the abbrev change isn’t actually working right… I got the quoting wrong, so the abbrev tables are constructed as (mostly) proper abbrev tables, and in the right order, but the “:parent” properties are bad. Working on fixing it up…. I’ve finally gotten what I think is a fixed version pushed to the branch, which should get the parent links right between abbrev tables. It’s also got a fix to a problem I keep hitting doing parallel bootstrap builds. (I goofed slightly on the log message though, got the name of the temp file wrong.) I like to try to do bootstrap builds when testing and before pushing changes, so hopefully future fixes won’t be so tediously slow to check. I’m still cleaning up my list of open issues, and will probably check that in in the branch’s admin/notes directory. Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-23 16:44 ` Skipping unexec via a big .elc file (was: When should ralloc.c be used?) Stefan Monnier 2016-10-23 17:34 ` Eli Zaretskii @ 2016-10-24 18:34 ` Lars Brinkhoff 2016-10-24 19:52 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Lars Brinkhoff @ 2016-10-24 18:34 UTC (permalink / raw) To: emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: > FWIW, I just did a quick experiment with the patch below which dumps > the state of Emacs's obarray after loadup.el into a big "dumped.elc" > file. [...] So even if there might be ways to speed this up, it > doesn't look too promising. I suppose it's obvious that this dumped.elc can't easily be converted to a c file which is compiled and linked into the final emacs. For the benefit of me and perhaps others that would otherwise waste time on this, could someone just briefly explain why? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: Skipping unexec via a big .elc file 2016-10-24 18:34 ` Lars Brinkhoff @ 2016-10-24 19:52 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 19:52 UTC (permalink / raw) To: Lars Brinkhoff; +Cc: emacs-devel > From: Lars Brinkhoff <lars@nocrew.org> > Date: Mon, 24 Oct 2016 20:34:49 +0200 > > I suppose it's obvious that this dumped.elc can't easily be converted to > a c file which is compiled and linked into the final emacs. It can. More accurately, we could implement a back-end to the unexec process that generates C source file. That was Paul's suggestion. I consider this option less desirable for several reasons: . writing and maintaining such a C back-end would be non-trivial, and would require good control of portable C programming, something that most of our contributors lack . it requires a C compiler, i.e. end-users cannot dump their own customized Emacs without having a compiler and linker installed ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 2:37 ` Paul Eggert 2016-10-23 6:53 ` Eli Zaretskii @ 2016-10-23 12:55 ` Stefan Monnier 2016-10-23 14:28 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 12:55 UTC (permalink / raw) To: emacs-devel > I suppose you're right that we don't need to; we could instead hack on Emacs > to get it to work without ralloc on recent glibc. I thought it's just a matter of saying "don't use ralloc" (i.e. the use of ralloc is only an optimization hack to try and avoid fragmentation problems). Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 12:55 ` When should ralloc.c be used? Stefan Monnier @ 2016-10-23 14:28 ` Stefan Monnier 2016-10-23 14:57 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 14:28 UTC (permalink / raw) To: emacs-devel > I thought it's just a matter of saying "don't use ralloc" (i.e. the use > of ralloc is only an optimization hack to try and avoid fragmentation > problems). And AFAICT we should just never use ralloc because the rest of Emacs's code is actually not prepared to deal with the implications, and trying to fix it is not only a lot of work, but would make the code less maintainable. I'd rather live with the fragmentation. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
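The message does not spell out which implications it has in mind. One hazard commonly associated with a relocating allocator, and consistent with the watchpoint at the start of this thread showing current_buffer->text->beg changing, is that a raw pointer into a block becomes stale once the block moves. The sketch below is a simplified stand-in and not ralloc.c: struct text, relocate_text, and char_at are invented names, and the "relocation" is simulated with malloc, memcpy, and free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a block of buffer text whose base address may move.  */
struct text
{
  char *beg;
  size_t size;
};

/* Grow T to NEW_SIZE by moving its contents to a freshly allocated
   block, simulating what a relocating allocator may do at any time.
   Any pointer previously derived from T->beg is invalid afterwards.
   Return 0 on success, -1 if allocation fails.  */
int
relocate_text (struct text *t, size_t new_size)
{
  assert (t != NULL && new_size >= t->size);
  char *moved = malloc (new_size);
  if (moved == NULL)
    return -1;
  memcpy (moved, t->beg, t->size);
  free (t->beg);
  t->beg = moved;
  t->size = new_size;
  return 0;
}

/* Safe across relocation: the caller keeps an offset, and the address
   is recomputed from the current base on every access.  */
char
char_at (const struct text *t, size_t offset)
{
  assert (t != NULL && offset < t->size);
  return t->beg[offset];
}
```

Code that holds "char *p = t.beg + 2" across a call that can trigger relocation reads freed memory afterwards, while code that holds the offset 2 and calls char_at keeps working. Auditing every call path for the first pattern is the kind of work the message argues is not worth doing.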
* Re: When should ralloc.c be used? 2016-10-23 14:28 ` Stefan Monnier @ 2016-10-23 14:57 ` Eli Zaretskii 2016-10-23 15:07 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 14:57 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 10:28:22 -0400 > > I'd rather live with the fragmentation. You mean, use gmalloc without ralloc? Is that feasible? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 14:57 ` Eli Zaretskii @ 2016-10-23 15:07 ` Stefan Monnier 2016-10-23 15:44 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 15:07 UTC (permalink / raw) To: emacs-devel >> I'd rather live with the fragmentation. > You mean, use gmalloc without ralloc? Not only when we use gmalloc but always. I suggest we get rid of ralloc.c. > Is that feasible? Why not? Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 15:07 ` Stefan Monnier @ 2016-10-23 15:44 ` Eli Zaretskii 2016-10-23 16:30 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 15:44 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 11:07:23 -0400 > > >> I'd rather live with the fragmentation. > > You mean, use gmalloc without ralloc? > > Not only when we use gmalloc but always. I suggest we get rid of ralloc.c. > > > Is that feasible? > > Why not? I don't think we ever used such a configuration. Is modern sbrk good enough for gmalloc? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 15:44 ` Eli Zaretskii @ 2016-10-23 16:30 ` Stefan Monnier 2016-10-23 16:45 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 16:30 UTC (permalink / raw) To: emacs-devel > I don't think we ever used such a configuration. Is modern sbrk good > enough for gmalloc? Why not? Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 16:30 ` Stefan Monnier @ 2016-10-23 16:45 ` Eli Zaretskii 2016-10-23 16:49 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 16:45 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 12:30:32 -0400 > > > I don't think we ever used such a configuration. Is modern sbrk good > > enough for gmalloc? > > Why not? "Why not" is never a useful answer. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 16:45 ` Eli Zaretskii @ 2016-10-23 16:49 ` Stefan Monnier 2016-10-23 17:35 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 16:49 UTC (permalink / raw) To: emacs-devel >> > I don't think we ever used such a configuration. Is modern sbrk good >> > enough for gmalloc? >> Why not? > "Why not" is never a useful answer. It just means that I really see no reason why it wouldn't work just fine. It's not like glibc's malloc was particularly magical, so we should be able to do the same in gmalloc.c. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 16:49 ` Stefan Monnier @ 2016-10-23 17:35 ` Eli Zaretskii 2016-10-23 20:23 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 17:35 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 12:49:29 -0400 > > >> > I don't think we ever used such a configuration. Is modern sbrk good > >> > enough for gmalloc? > >> Why not? > > "Why not" is never a useful answer. > > It just means that I really see no reason why it wouldn't work just fine. > It's not like glibc's malloc was particularly magical, so we should be > able to do the same in gmalloc.c. AFAIK, glibc's malloc doesn't use sbrk anymore. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 17:35 ` Eli Zaretskii @ 2016-10-23 20:23 ` Stefan Monnier 2016-10-23 20:33 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 20:23 UTC (permalink / raw) To: emacs-devel >> It just means that I really see no reason why it wouldn't work just fine. >> It's not like glibc's malloc was particularly magical, so we should be >> able to do the same in gmalloc.c. > AFAIK, glibc's malloc doesn't use sbrk anymore. I don't think it matters very much since we use mmap for the buffers, which is the main source of fragmentation otherwise, AFAIK. And if it proves to really be a problem, we could replace our gmalloc.c with a more recent one which builds on mmap. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 20:23 ` Stefan Monnier @ 2016-10-23 20:33 ` Eli Zaretskii 2016-10-23 20:44 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-23 20:33 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 16:23:38 -0400 > > >> It just means that I really see no reason why it wouldn't work just fine. > >> It's not like glibc's malloc was particularly magical, so we should be > >> able to do the same in gmalloc.c. > > AFAIK, glibc's malloc doesn't use sbrk anymore. > > I don't think it matters very much since we use mmap for the buffers, No, we don't, not on GNU/Linux anyway. Or do you see USE_MMAP_FOR_BUFFERS defined to 1 in your src/config.h? > And if it proves to really be a problem, we could replace our gmalloc.c > with a more recent one which builds on mmap. That's a possibility, yes. But someone would have to bring such a gmalloc, and probably leave the current one as well, to minimize the impact on unaffected platforms. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 20:33 ` Eli Zaretskii @ 2016-10-23 20:44 ` Stefan Monnier 2016-10-24 5:11 ` Paul Eggert 2016-10-24 6:59 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-23 20:44 UTC (permalink / raw) To: emacs-devel >> I don't think it matters very much since we use mmap for the buffers, > No, we don't, not on GNU/Linux anyway. AFAIK the decision not to use mmap was due to the fact that glibc's malloc itself uses mmap. But if we don't use glibc's malloc, then why wouldn't we decide to use mmap ourselves for the buffers? > Or do you see USE_MMAP_FOR_BUFFERS defined to 1 in your src/config.h? I'm not sure how to interpret what I see. On Debian stable I see: Should Emacs use the GNU version of malloc? yes (Using Doug Lea's new malloc from the GNU C Library.) Should Emacs use a relocating allocator for buffers? no Should Emacs use mmap(2) for buffer allocation? no and on Debian testing I see: Should Emacs use the GNU version of malloc? no (only before dumping) Should Emacs use a relocating allocator for buffers? no Should Emacs use mmap(2) for buffer allocation? no so, in neither case do I see REL_ALLOC enabled. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 20:44 ` Stefan Monnier @ 2016-10-24 5:11 ` Paul Eggert 2016-10-24 12:33 ` Stefan Monnier 2016-10-24 6:59 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-24 5:11 UTC (permalink / raw) To: Stefan Monnier, emacs-devel Stefan Monnier wrote: > in neither case do I see REL_ALLOC enabled. It looks like you are using the master branch. I think Eli is worried more urgently about the emacs-25 branch. For emacs-25 with bleeding-edge glibc, I would expect: Should Emacs use the GNU version of malloc? yes Should Emacs use a relocating allocator for buffers? yes Should Emacs use mmap(2) for buffer allocation? no because emacs-25 will compile both gmalloc.o and ralloc.o on such a platform. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 5:11 ` Paul Eggert @ 2016-10-24 12:33 ` Stefan Monnier 2016-10-24 13:05 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 12:33 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel > Should Emacs use the GNU version of malloc? yes > Should Emacs use a relocating allocator for buffers? yes > Should Emacs use mmap(2) for buffer allocation? no > because emacs-25 will compile both gmalloc.o and ralloc.o on such a platform. But I fail to see what's hard about changing that to "rel_alloc=no, mmap=yes". Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 12:33 ` Stefan Monnier @ 2016-10-24 13:05 ` Eli Zaretskii 2016-10-24 14:12 ` Stefan Monnier ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 13:05 UTC (permalink / raw) To: Stefan Monnier; +Cc: eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Mon, 24 Oct 2016 08:33:10 -0400 > Cc: emacs-devel@gnu.org > > > Should Emacs use the GNU version of malloc? yes > > Should Emacs use a relocating allocator for buffers? yes > > Should Emacs use mmap(2) for buffer allocation? no > > because emacs-25 will compile both gmalloc.o and ralloc.o on such a platform. > > But I fail to see what's hard about changing that to "rel_alloc=no, mmap=yes". Why do we need mmap at all? Why not just use malloc (as implemented by gmalloc)? Using mmap has disadvantages: when you need to enlarge buffer text, and that fails (because there are no more free pages/addresses after the already allocated region), we need to copy buffer text to the new allocation. This happens quite a lot when we visit a compressed buffer. (The MS-Windows emulation of mmap in w32heap.c reserves twice the number of pages as originally requested, for that very reason.) So if we can stop using ralloc without also using mmap directly for buffer text, that'd be a win, I think. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 13:05 ` Eli Zaretskii @ 2016-10-24 14:12 ` Stefan Monnier 2016-10-24 16:00 ` Eli Zaretskii 2016-10-24 14:37 ` Stefan Monnier 2016-10-25 3:12 ` Ken Raeburn 2 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 14:12 UTC (permalink / raw) To: emacs-devel >> But I fail to see what's hard about changing that to "rel_alloc=no, >> mmap=yes". > Why do we need mmap at all? Why not just use malloc (as implemented > by gmalloc)? AFAIU the reason we use ralloc is because of memory fragmentation, and mmap brings similar benefits. Maybe we don't need either of them, but at least at some point in the past the fragmentation issue was sufficient to convince people to write that code. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 14:12 ` Stefan Monnier @ 2016-10-24 16:00 ` Eli Zaretskii 2016-10-24 18:51 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 16:00 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Mon, 24 Oct 2016 10:12:27 -0400 > > >> But I fail to see what's hard about changing that to "rel_alloc=no, > >> mmap=yes". > > Why do we need mmap at all? Why not just use malloc (as implemented > > by gmalloc)? > > AFAIU the reason we use ralloc is because of memory fragmentation, and > mmap brings similar benefits. But we have successfully used the glibc's malloc, without mmap, for years without any sign of fragmentation problems. So these fragmentation problems are not as bad as they sound, at least in one malloc implementation. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:00 ` Eli Zaretskii @ 2016-10-24 18:51 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 18:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > But we have successfully used the glibc's malloc, without mmap, for > years without any sign of fragmentation problems. Yes, glibc's malloc was good enough. I'm not sure that applies to gmalloc.c. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 13:05 ` Eli Zaretskii 2016-10-24 14:12 ` Stefan Monnier @ 2016-10-24 14:37 ` Stefan Monnier 2016-10-24 15:40 ` Eli Zaretskii 2016-10-25 3:12 ` Ken Raeburn 2 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 14:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel > Using mmap has disadvantages: when you need to enlarge buffer text, > and that fails (because there are no more free pages/addresses after > the already allocated region), we need to copy buffer text to the new > allocation. All allocators suffer from this problem. I haven't seen any evidence that the mmap-based allocation code is significantly more prone to it. Also, the glibc allocators used mmap internally when allocating large-ish chunks (e.g. for buffer text), so if that was a problem, we would have noticed, I think. > (The MS-Windows emulation of mmap in w32heap.c reserves twice the > number of pages as originally requested, for that very reason.) Indeed, if this problem proves significant, there are fairly easy ways to reduce its impact, such as using the kind of approach you mention. Another advantage of using mmap is that it can return the memory to the OS once you kill your large buffer, whereas with gmalloc+ralloc this basically never happens, AFAIK. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 14:37 ` Stefan Monnier @ 2016-10-24 15:40 ` Eli Zaretskii 2016-10-24 16:27 ` Daniel Colascione 2016-10-24 18:45 ` Stefan Monnier 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 15:40 UTC (permalink / raw) To: Stefan Monnier; +Cc: eggert, emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 10:37:19 -0400 > > > Using mmap has disadvantages: when you need to enlarge buffer text, > > and that fails (because there are no more free pages/addresses after > > the already allocated region), we need to copy buffer text to the new > > allocation. > > All allocators suffer from this problem. I haven't seen any evidence > that the mmap-based allocation code is significantly more prone to it. I have seen that. The native glibc malloc, the one GNU/Linux systems were using until we got screwed by the recent glibc, didn't have this problem, while mmap-based allocator did. Don't ask me how glibc does it, I don't know; but the fact is there. This was discovered when the Windows mmap emulation in w32heap.c was developed and tested. > Also, the glibc allocators used mmap internally when allocating > large-ish chunks (e.g. for buffer text), so if that was a problem, we > would have noticed, I think. True; but they somehow work around the problem. > Another advantage of using mmap is that it can return the memory to the > OS once you kill your large buffer, whereas with gmalloc+ralloc this > basically never happens, AFAIK. Not entirely true: ralloc calls the system sbrk with a negative argument when it feels like it. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 15:40 ` Eli Zaretskii @ 2016-10-24 16:27 ` Daniel Colascione 2016-10-24 16:57 ` Eli Zaretskii ` (4 more replies) 2016-10-24 18:45 ` Stefan Monnier 1 sibling, 5 replies; 375+ messages in thread From: Daniel Colascione @ 2016-10-24 16:27 UTC (permalink / raw) To: Eli Zaretskii, Stefan Monnier; +Cc: eggert, emacs-devel On 10/24/2016 08:40 AM, Eli Zaretskii wrote: >> From: Stefan Monnier <monnier@iro.umontreal.ca> >> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org >> Date: Mon, 24 Oct 2016 10:37:19 -0400 >> >>> Using mmap has disadvantages: when you need to enlarge buffer text, >>> and that fails (because there are no more free pages/addresses after >>> the already allocated region), we need to copy buffer text to the new >>> allocation. 64-bit address spaces are *huge*. What about just making every buffer allocation 2GB long or so, marked PROT_NONE? You don't actually have to commit all that memory --- all you've done is set aside that address space. But because you've set aside so much address space, you'll very likely be able to expand the actual allocation region (a subset of the reserved region) as much as you want. >> All allocators suffer from this problem. I haven't seen any evidence >> that the mmap-based allocation code is significantly more prone to it. > > I have seen that. The native glibc malloc, the one GNU/Linux systems > were using until we got screwed by the recent glibc, didn't have this > problem, while mmap-based allocator did. Don't ask me how glibc does > it, I don't know; but the fact is there. This was discovered when the > Windows mmap emulation in w32heap.c was developed and tested. > >> Also, the glibc allocators used mmap internally when allocating >> large-ish chunks (e.g. for buffer text), so if that was a problem, we >> would have noticed, I think. > > True; but they somehow work around the problem. 
> >> Another advantage of using mmap is that it can return the memory to the >> OS once you kill your large buffer, whereas with gmalloc+ralloc this >> basically never happens, AFAIK. > > Not entirely true: ralloc calls the system sbrk with a negative > argument when it feels like it. You can also madvise(MADV_DONTNEED, ...) regions *inside* the heap that contain only freed memory. This procedure also returns memory to the operating system. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:27 ` Daniel Colascione @ 2016-10-24 16:57 ` Eli Zaretskii 2016-10-25 2:34 ` Richard Stallman ` (3 subsequent siblings) 4 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 16:57 UTC (permalink / raw) To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Mon, 24 Oct 2016 09:27:43 -0700 > > >>> Using mmap has disadvantages: when you need to enlarge buffer text, > >>> and that fails (because there are no more free pages/addresses after > >>> the already allocated region), we need to copy buffer text to the new > >>> allocation. > > 64-bit address spaces are *huge*. What about just making every buffer > allocation 2GB long or so, marked PROT_NONE? You don't actually have to > commit all that memory --- all you've done is set aside that address > space. But because you've set aside so much address space, you'll very > likely be able to expand the actual allocation region (a subset of the > reserved region) as much as you want. Sounds OK, although I'm not an expert on that. But in any case, these ideas are not baked enough to be applied to the release branch, if we want to release Emacs 25.2 soon (as in "in a couple of months"). (Of course, there's always the case of a file larger than 2GB, it's not unheard of, although still quite rare.) ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:27 ` Daniel Colascione 2016-10-24 16:57 ` Eli Zaretskii @ 2016-10-25 2:34 ` Richard Stallman 2016-10-25 14:13 ` Stefan Monnier ` (2 subsequent siblings) 4 siblings, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-25 2:34 UTC (permalink / raw) To: Daniel Colascione; +Cc: eliz, eggert, monnier, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > 64-bit address spaces are *huge*. What about just making every buffer > allocation 2GB long or so, marked PROT_NONE? Does Linux handle such sparseness efficiently? I don't know. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:27 ` Daniel Colascione 2016-10-24 16:57 ` Eli Zaretskii 2016-10-25 2:34 ` Richard Stallman @ 2016-10-25 14:13 ` Stefan Monnier 2016-10-25 14:14 ` Stefan Monnier 2016-10-28 6:03 ` Jérémie Courrèges-Anglas 4 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-25 14:13 UTC (permalink / raw) To: emacs-devel > 64-bit address spaces are *huge*. What about just making every buffer > allocation 2GB long or so, marked PROT_NONE? Won't be sufficient for 3GB buffers, obviously ;-) Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:27 ` Daniel Colascione ` (2 preceding siblings ...) 2016-10-25 14:13 ` Stefan Monnier @ 2016-10-25 14:14 ` Stefan Monnier 2016-10-28 6:03 ` Jérémie Courrèges-Anglas 4 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-25 14:14 UTC (permalink / raw) To: emacs-devel >> Not entirely true: ralloc calls the system sbrk with a negative >> argument when it feels like it. > You can also madvise(MADV_DONTNEED, ...) regions *inside* the heap that > contain only freed memory. This procedure also returns memory to the > operating system. ralloc.c doesn't do this currently. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 16:27 ` Daniel Colascione ` (3 preceding siblings ...) 2016-10-25 14:14 ` Stefan Monnier @ 2016-10-28 6:03 ` Jérémie Courrèges-Anglas 2016-10-28 6:23 ` Daniel Colascione 4 siblings, 1 reply; 375+ messages in thread From: Jérémie Courrèges-Anglas @ 2016-10-28 6:03 UTC (permalink / raw) To: Daniel Colascione; +Cc: Eli Zaretskii, eggert, Stefan Monnier, emacs-devel Daniel Colascione <dancol@dancol.org> writes: > On 10/24/2016 08:40 AM, Eli Zaretskii wrote: >>> From: Stefan Monnier <monnier@iro.umontreal.ca> >>> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org >>> Date: Mon, 24 Oct 2016 10:37:19 -0400 >>> >>>> Using mmap has disadvantages: when you need to enlarge buffer text, >>>> and that fails (because there are no more free pages/addresses after >>>> the already allocated region), we need to copy buffer text to the new >>>> allocation. > > 64-bit address spaces are *huge*. What about just making every buffer > allocation 2GB long or so, marked PROT_NONE? You don't actually have to > commit all that memory --- all you've done is set aside that address > space. IIUC you suggest relying on memory overcommit. That doesn't sound portable at all. Not all OSes do overcommit and the ones who do generally provide a way to disable it. -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 6:03 ` Jérémie Courrèges-Anglas @ 2016-10-28 6:23 ` Daniel Colascione 2016-10-28 7:09 ` Jérémie Courrèges-Anglas 2016-10-28 7:46 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Daniel Colascione @ 2016-10-28 6:23 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, Stefan Monnier, emacs-devel jca@wxcvbn.org (Jérémie Courrèges-Anglas) writes: > Daniel Colascione <dancol@dancol.org> writes: > >> On 10/24/2016 08:40 AM, Eli Zaretskii wrote: >>>> From: Stefan Monnier <monnier@iro.umontreal.ca> >>>> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org >>>> Date: Mon, 24 Oct 2016 10:37:19 -0400 >>>> >>>>> Using mmap has disadvantages: when you need to enlarge buffer text, >>>>> and that fails (because there are no more free pages/addresses after >>>>> the already allocated region), we need to copy buffer text to the new >>>>> allocation. >> >> 64-bit address spaces are *huge*. What about just making every buffer >> allocation 2GB long or so, marked PROT_NONE? You don't actually have to >> commit all that memory --- all you've done is set aside that address >> space. > > IIUC you suggest relying on memory overcommit. That doesn't sound > portable at all. Not all OSes do overcommit and the ones who do > generally provide a way to disable it. You understand incorrectly. "Overcommit" is the practice of allowing an operating system to lie about how much memory it's guaranteed to give applications in the future. We're not talking about guaranteed memory. We're talking about setting aside address space only, not asking the OS to make guarantees about future memory availability. All major operating systems, even ones like Windows that don't do overcommit, provide ways to reserve address space without asking the OS to guarantee availability of memory. That said, my idea probably isn't the best --- but it doesn't rely on overcommit. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 6:23 ` Daniel Colascione @ 2016-10-28 7:09 ` Jérémie Courrèges-Anglas 2016-10-28 7:46 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Jérémie Courrèges-Anglas @ 2016-10-28 7:09 UTC (permalink / raw) To: Daniel Colascione; +Cc: Eli Zaretskii, eggert, Stefan Monnier, emacs-devel Daniel Colascione <dancol@dancol.org> writes: > jca@wxcvbn.org (Jérémie Courrèges-Anglas) writes: > >> Daniel Colascione <dancol@dancol.org> writes: >> >>> On 10/24/2016 08:40 AM, Eli Zaretskii wrote: >>>>> From: Stefan Monnier <monnier@iro.umontreal.ca> >>>>> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org >>>>> Date: Mon, 24 Oct 2016 10:37:19 -0400 >>>>> >>>>>> Using mmap has disadvantages: when you need to enlarge buffer text, >>>>>> and that fails (because there are no more free pages/addresses after >>>>>> the already allocated region), we need to copy buffer text to the new >>>>>> allocation. >>> >>> 64-bit address spaces are *huge*. What about just making every buffer >>> allocation 2GB long or so, marked PROT_NONE? You don't actually have to >>> commit all that memory --- all you've done is set aside that address >>> space. >> >> IIUC you suggest relying on memory overcommit. That doesn't sound >> portable at all. Not all OSes do overcommit and the ones who do >> generally provide a way to disable it. > > You understand incorrectly. "Overcommit" is the practice of allowing an > operating system to lie about how much memory it's guaranteed to give > applications in the future. We're not talking about guaranteed > memory. We're talking about setting aside address space only, not asking > the OS to make guarantees about future memory availability. All major > operating systems, even ones like Windows that don't do overcommit, > provide ways to reserve address space without asking the OS to guarantee > availability of memory. Can you point at some documentation regarding those techniques? 
I fail to find one that would work on my "non-major", mostly POSIX OS. > That said, my idea probably isn't the best --- but it doesn't rely > on overcommit. -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 6:23 ` Daniel Colascione 2016-10-28 7:09 ` Jérémie Courrèges-Anglas @ 2016-10-28 7:46 ` Eli Zaretskii 2016-10-28 8:11 ` Daniel Colascione 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 7:46 UTC (permalink / raw) To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel > From: Daniel Colascione <dancol@dancol.org> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Thu, 27 Oct 2016 23:23:05 -0700 > > We're talking about setting aside address space only, not asking > the OS to make guarantees about future memory availability. All major > operating systems, even ones like Windows that don't do overcommit, > provide ways to reserve address space without asking the OS to guarantee > availability of memory. Not sure I understand you: if a portion of the address space has been reserved, how come these addresses won't be available when we try to commit them later? There might not be physical pages available for that, but virtual memory for those addresses must be available, no? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 7:46 ` Eli Zaretskii @ 2016-10-28 8:11 ` Daniel Colascione 2016-10-28 8:27 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-28 8:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel On 10/28/2016 12:46 AM, Eli Zaretskii wrote: >> From: Daniel Colascione <dancol@dancol.org> >> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, eggert@cs.ucla.edu, emacs-devel@gnu.org >> Date: Thu, 27 Oct 2016 23:23:05 -0700 >> >> We're talking about setting aside address space only, not asking >> the OS to make guarantees about future memory availability. All major >> operating systems, even ones like Windows that don't do overcommit, >> provide ways to reserve address space without asking the OS to guarantee >> availability of memory. > > Not sure I understand you: if a portion of the address space has been > reserved, how come these addresses won't be available when we try to > commit them later? There might not be physical pages available for > that, but virtual memory for those addresses must be available, no? I'm not sure I understand what you're confused about, so I'll try a broader explanation. Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the initial mapping, that address space is unavailable for other uses. But because the page protections are PROT_NONE, my program has no legal right to access that page, so the OS doesn't have to guarantee that it can find a physical page to back that page I've mmaped. In this state, the memory is reserved. The 20GB PROT_NONE address space reservation itself requires very little memory. It's just a note in the kernel's VM interval tree that says "the addresses in range [0x20000, 0x500020000) are reserved". 
Virtual memory is Now imagine I change the protections to PROT_READ|PROT_WRITE --- once the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right to access that page; under a strict accounting scheme (that is, without overcommit), the OS has to guarantee that it'll be able to go find a physical page to back that virtual page. In this state, the memory is committed -- the kernel has committed to finding backing storage for that page at some point when the current process tries to access it. Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. I can write a program that reserves 20GB of address space. That's fine. The kernel isn't promising to give you 20GB of memory: it's setting aside address space. Now if I attempt to map 20GB PROT_READ|PROT_WRITE, on any reasonable (i.e., non-overcommit) system, mmap should fail, since there's no way a system with 1GB of RAM and 1GB of swap can promise to provide 20GB of private memory. Overcommit confuses the issue: the kernel will _commit_ to as much memory as you ask it for and then renege on that commitment when it finds it convenient. An overcommit system with 1GB of RAM and 1GB of swap will happily let you make that 20GB PROT_READ|PROT_WRITE mapping. It'll just kill you after you use more than 2GB of that mapping. A non-overcommit system understands how to say "sorry, I can't let you do that" up front. On a non-overcommit system, your process will never be killed for accessing memory that the kernel told the process in advance that it could use. I think Jérémie is working with a mental model where every memory mapping is a commitment, and only an overcommit system allows you to commit more memory than the system actually has. Any system will let you reserve more address space than you have available commit: reservations are cheap. (In Windows, the corresponding concepts are MEM_RESERVE and MEM_COMMIT. 
Windows is much more explicit about the difference between memory reservations and memory commitments than Linux is, because most of the time, Linux users don't care about the difference.) ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 8:11 ` Daniel Colascione @ 2016-10-28 8:27 ` Eli Zaretskii 2016-10-28 8:44 ` Daniel Colascione 2016-10-28 11:40 ` Jérémie Courrèges-Anglas 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 8:27 UTC (permalink / raw) To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel > Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Fri, 28 Oct 2016 01:11:08 -0700 > > Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the > initial mapping, that address space is unavailable for other uses. But > because the page protections are PROT_NONE, my program has no legal > right to access that page, so the OS doesn't have to guarantee that it > can find a physical page to back that page I've mmaped. In this state, > the memory is reserved. > > The 20GB PROT_NONE address space reservation itself requires very little > memory. It's just a note in the kernel's VM interval tree that says "the > addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is > > Now imagine I change the protections to PROT_READ|PROT_WRITE --- once > the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right > to access that page; under a strict accounting scheme (that is, without > overcommit), the OS has to guarantee that it'll be able to go find a > physical page to back that virtual page. In this state, the memory is > committed -- the kernel has committed to finding backing storage for > that page at some point when the current process tries to access it. I'm with you up to here. My question is whether PROT_READ|PROT_WRITE call could fail after PROT_NONE succeeded. You seem to say it could; I thought it couldn't. > Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. > I can write a program that reserves 20GB of address space. 
I thought such a reservation should fail, because you don't have enough virtual memory for 20GB of addresses. IOW, I thought the ability to reserve address space is restricted by the actual amount of virtual memory available on the system at the time of the call. You seem to say I was wrong.
* Re: When should ralloc.c be used? 2016-10-28 8:27 ` Eli Zaretskii @ 2016-10-28 8:44 ` Daniel Colascione 2016-10-28 9:43 ` Eli Zaretskii 2016-10-28 11:40 ` Jérémie Courrèges-Anglas 1 sibling, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-28 8:44 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel On 10/28/2016 01:27 AM, Eli Zaretskii wrote: >> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org >> From: Daniel Colascione <dancol@dancol.org> >> Date: Fri, 28 Oct 2016 01:11:08 -0700 >> >> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the >> initial mapping, that address space is unavailable for other uses. But >> because the page protections are PROT_NONE, my program has no legal >> right to access that page, so the OS doesn't have to guarantee that it >> can find a physical page to back that page I've mmaped. In this state, >> the memory is reserved. >> >> The 20GB PROT_NONE address space reservation itself requires very little >> memory. It's just a note in the kernel's VM interval tree that says "the >> addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is >> >> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once >> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right >> to access that page; under a strict accounting scheme (that is, without >> overcommit), the OS has to guarantee that it'll be able to go find a >> physical page to back that virtual page. In this state, the memory is >> committed -- the kernel has committed to finding backing storage for >> that page at some point when the current process tries to access it. > > I'm with you up to here. My question is whether PROT_READ|PROT_WRITE > call could fail after PROT_NONE succeeded. You seem to say it could; > I thought it couldn't. Yes, it can fail. 
This program just failed on my system, which is a strict accounting (echo 2 > /proc/sys/vm/overcommit_memory) Linux box with much less than 100GB total commit available.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <errno.h>

size_t GB = (size_t) 1024 * 1024 * 1024;

int main()
{
  size_t sz = 100*GB;
  void* mem = mmap(NULL, sz, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED) {
    fprintf(stderr, "map failed: %s\n", strerror(errno));
    return 1;
  }

  if (mprotect(mem, sz, PROT_READ|PROT_WRITE)) {
    fprintf(stderr, "mprotect failed: %s\n", strerror(errno));
    return 1;
  }

  fprintf(stderr, "mprotect worked\n");
  return 0;
}

>> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap.
>> I can write a program that reserves 20GB of address space.
>
> I thought such a reservation should fail, because you don't have
> enough virtual memory for 20GB of addresses. IOW, I thought the
> ability to reserve address space is restricted by the actual amount of
> virtual memory available on the system at the time of the call. You
> seem to say I was wrong.

I'm not sure you're even wrong :-) What does "virtual memory" mean to you? I'm not sure what you have in mind maps to any of the concepts I'm using.

When we allocate memory, we can consume two resources: address space and commit. That 100GB mmap above doesn't consume virtual memory, but it does consume address space. Address space is a finite resource, but usually much larger than commit, which is the sum of RAM and swap space. When you commit a page, the resource you're consuming is commit.

(Technically, the 100GB mapping consumes real memory enough for the OS to remember you've set aside that address space, but it's usually a negligible book-keeping note. On my system, I can make sz equal to 80TB or so before the mmap starts to fail: that's about the size of the address space range dictated by amd64 processor design.)
(In a 32-bit process on modern systems, it's frequently the case that you have more commit on the system than any one process has address space.)
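The "commit" resource described above is something Linux exposes in /proc/meminfo: Committed_AS is the amount currently committed, and CommitLimit is the ceiling that applies when strict accounting (overcommit_memory = 2) is enabled. A minimal sketch for reading those fields follows. read_meminfo_kb is a hypothetical helper invented for this illustration, not anything from Emacs or from the thread, and the code is Linux-specific.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find the line "KEY:   <value> kB" in
   /proc/meminfo (Linux-specific) and store the value in *OUT_KB.
   Returns 0 on success, -1 if the file or the key is missing.  */
static int
read_meminfo_kb (const char *key, long *out_kb)
{
  assert (key != NULL && out_kb != NULL);

  FILE *f = fopen ("/proc/meminfo", "r");
  if (f == NULL)
    return -1;

  size_t keylen = strlen (key);
  char line[256];
  int status = -1;
  while (fgets (line, sizeof line, f) != NULL)
    if (strncmp (line, key, keylen) == 0 && line[keylen] == ':')
      {
        /* The number follows the colon, padded with spaces.  */
        if (sscanf (line + keylen + 1, "%ld", out_kb) == 1)
          status = 0;
        break;
      }

  fclose (f);
  return status;
}
```

Calling it with "CommitLimit" and "Committed_AS" before and after the mprotect call in the test program should show, on a strict-accounting box, that only the PROT_READ|PROT_WRITE step moves Committed_AS; the PROT_NONE reservation should leave it essentially unchanged.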
* Re: When should ralloc.c be used? 2016-10-28 8:44 ` Daniel Colascione @ 2016-10-28 9:43 ` Eli Zaretskii 2016-10-28 9:52 ` Daniel Colascione 2016-10-28 12:11 ` Stefan Monnier 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 9:43 UTC (permalink / raw) To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel > Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Fri, 28 Oct 2016 01:44:33 -0700 > > >> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. > >> I can write a program that reserves 20GB of address space. > > > > I thought such a reservation should fail, because you don't have > > enough virtual memory for 20GB of addresses. IOW, I thought the > > ability to reserve address space is restricted by the actual amount of > > virtual memory available on the system at the time of the call. You > > seem to say I was wrong. > > I'm not sure you're even wrong :-) What does "virtual memory" mean to > you? Physical + swap, as usual. > When we allocate memory, we can consume two resources: address space and > commit. That 100GB mmap above doesn't consume virtual memory, but it > does consume address space. Address space is a finite resource, but > usually much larger than commit, which is the sum of RAM and swap space. > When you commit a page, the resource you're consuming is commit. If reserving a range of addresses doesn't necessarily mean they will be later available for committing, then what is the purpose of reserving them in the first place? What good does it do? We have in w32heap.c:mmap_realloc code that attempts to commit pages that were previously reserved. That code does recover from a failure to commit, but such a failure is deemed unusual and causes special warnings under debugger. I never saw these warnings happen, except when we had bugs in that code. 
You seem to say that this is based on false premises, and there's nothing unusual about MEM_COMMIT failing for the range of pages previously reserved with MEM_RESERVE.
* Re: When should ralloc.c be used? 2016-10-28 9:43 ` Eli Zaretskii @ 2016-10-28 9:52 ` Daniel Colascione 2016-10-28 12:25 ` Eli Zaretskii 2016-10-28 12:11 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-28 9:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel On 10/28/2016 02:43 AM, Eli Zaretskii wrote: >> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org >> From: Daniel Colascione <dancol@dancol.org> >> Date: Fri, 28 Oct 2016 01:44:33 -0700 >> >>>> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. >>>> I can write a program that reserves 20GB of address space. >>> >>> I thought such a reservation should fail, because you don't have >>> enough virtual memory for 20GB of addresses. IOW, I thought the >>> ability to reserve address space is restricted by the actual amount of >>> virtual memory available on the system at the time of the call. You >>> seem to say I was wrong. >> >> I'm not sure you're even wrong :-) What does "virtual memory" mean to >> you? > > Physical + swap, as usual. > >> When we allocate memory, we can consume two resources: address space and >> commit. That 100GB mmap above doesn't consume virtual memory, but it >> does consume address space. Address space is a finite resource, but >> usually much larger than commit, which is the sum of RAM and swap space. >> When you commit a page, the resource you're consuming is commit. > > If reserving a range of addresses doesn't necessarily mean they will > be later available for committing, then what is the purpose of > reserving them in the first place? What good does it do? Reserving address space is useful for making sure you have a contiguous range of virtual addresses that you can use later. > We have in w32heap.c:mmap_realloc code that attempts to commit pages > that were previously reserved. 
That code does recover from a failure > to commit, but such a failure is deemed unusual and causes special > warnings under debugger. I never saw these warnings happen, except > when we had bugs in that code. You seem to say that this is based on > false premises, and there's nothing unusual about MEM_COMMIT to fail > for the range of pages previously reserved with MEM_RESERVE. The MEM_COMMIT failure might be rare in practice --- systems have a lot of memory these days --- but MEM_COMMIT failing for a memory region previously reserved with MEM_RESERVE is perfectly legal. MEM_RESERVE does not stake a claim on the system's memory resources. It consumes only your own address space. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 9:52 ` Daniel Colascione @ 2016-10-28 12:25 ` Eli Zaretskii 2016-10-28 13:37 ` Stefan Monnier 2016-10-28 15:41 ` Daniel Colascione 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 12:25 UTC (permalink / raw) To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel > Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Fri, 28 Oct 2016 02:52:19 -0700 > > > If reserving a range of addresses doesn't necessarily mean they will > > be later available for committing, then what is the purpose of > > reserving them in the first place? What good does it do? > > Reserving address space is useful for making sure you have a contiguous > range of virtual addresses that you can use later. But if committing more pages from the reserved range is not guaranteed to succeed, I cannot rely on getting that contiguous range of addresses, can I? > > We have in w32heap.c:mmap_realloc code that attempts to commit pages > > that were previously reserved. That code does recover from a failure > > to commit, but such a failure is deemed unusual and causes special > > warnings under debugger. I never saw these warnings happen, except > > when we had bugs in that code. You seem to say that this is based on > > false premises, and there's nothing unusual about MEM_COMMIT to fail > > for the range of pages previously reserved with MEM_RESERVE. > > The MEM_COMMIT failure might be rare in practice --- systems have a lot > of memory these days --- but MEM_COMMIT failing for a memory region > previously reserved with MEM_RESERVE is perfectly legal. I can only say that I never saw that happening. Thanks. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 12:25 ` Eli Zaretskii @ 2016-10-28 13:37 ` Stefan Monnier 2016-10-28 14:30 ` Eli Zaretskii 2016-10-28 15:41 ` Daniel Colascione 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 13:37 UTC (permalink / raw) To: emacs-devel > But if committing more pages from the reserved range is not guaranteed > to succeed, I cannot rely on getting that contiguous range of > addresses, can I? It should only fail in those cases where a new mmap (or malloc) would also fail. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 13:37 ` Stefan Monnier @ 2016-10-28 14:30 ` Eli Zaretskii 2016-10-28 14:43 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 14:30 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Fri, 28 Oct 2016 09:37:04 -0400 > > > But if committing more pages from the reserved range is not guaranteed > > to succeed, I cannot rely on getting that contiguous range of > > addresses, can I? > > It should only fail in those cases where a new mmap (or malloc) would > also fail. That means never, for all practical purposes. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 14:30 ` Eli Zaretskii @ 2016-10-28 14:43 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 14:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel >> > But if committing more pages from the reserved range is not guaranteed >> > to succeed, I cannot rely on getting that contiguous range of >> > addresses, can I? >> It should only fail in those cases where a new mmap (or malloc) would >> also fail. > That means never, for all practical purposes. Of course, Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 12:25 ` Eli Zaretskii 2016-10-28 13:37 ` Stefan Monnier @ 2016-10-28 15:41 ` Daniel Colascione 2016-10-29 6:08 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Daniel Colascione @ 2016-10-28 15:41 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel On 10/28/2016 05:25 AM, Eli Zaretskii wrote: >> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org >> From: Daniel Colascione <dancol@dancol.org> >> Date: Fri, 28 Oct 2016 02:52:19 -0700 >> >>> If reserving a range of addresses doesn't necessarily mean they will >>> be later available for committing, then what is the purpose of >>> reserving them in the first place? What good does it do? >> >> Reserving address space is useful for making sure you have a contiguous >> range of virtual addresses that you can use later. > > But if committing more pages from the reserved range is not guaranteed > to succeed, I cannot rely on getting that contiguous range of > addresses, can I? You already _have_ the range of addresses. You just can't do anything with them yet. Here's another use case: magic ring buffers. (Where you put two consecutive views of the same file in memory next to each other so that operations on the ring buffer don't need to be split even in cases where they'd wrap the end of the ring.) Say on our 1GB RAM, 1GB swap system we want to memory-map a 5GB ring buffer log file. 
We can do it safely and atomically like this: 1) Reserve 10GB of address space with an anonymous PROT_NONE mapping; the mapping is at $ADDR 2) Memory-map our log file at $ADDR with PROT_READ|PROT_WRITE; (the mapping is file-backed, not anonymous, so it doesn't count against system commit charge) 3) Memory-map the log file _again_ at $ADDR+5GB Now we have a nice mirrored view of our ring buffer, and thanks to the PROT_NONE mapping we set up in step one, no other thread was able to sneak in the middle and allocate something in the [$ADDR+5GB,$ADDR+10GB) range and spoil our ability to set up the mirroring. In this instance, setting aside address space without allocating backing storage for it turned out to be very useful. > >>> We have in w32heap.c:mmap_realloc code that attempts to commit pages >>> that were previously reserved. That code does recover from a failure >>> to commit, but such a failure is deemed unusual and causes special >>> warnings under debugger. I never saw these warnings happen, except >>> when we had bugs in that code. You seem to say that this is based on >>> false premises, and there's nothing unusual about MEM_COMMIT to fail >>> for the range of pages previously reserved with MEM_RESERVE. >> >> The MEM_COMMIT failure might be rare in practice --- systems have a lot >> of memory these days --- but MEM_COMMIT failing for a memory region >> previously reserved with MEM_RESERVE is perfectly legal. > > I can only say that I never saw that happening. > > Thanks. > ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 15:41 ` Daniel Colascione @ 2016-10-29 6:08 ` Eli Zaretskii 2016-10-29 6:14 ` Daniel Colascione 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-29 6:08 UTC (permalink / raw) To: Daniel Colascione; +Cc: eggert, monnier, emacs-devel > Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Fri, 28 Oct 2016 08:41:48 -0700 > > >> Reserving address space is useful for making sure you have a contiguous > >> range of virtual addresses that you can use later. > > > > But if committing more pages from the reserved range is not guaranteed > > to succeed, I cannot rely on getting that contiguous range of > > addresses, can I? > > You already _have_ the range of addresses. You just can't do anything > with them yet. It's no use "having" the addresses, in the above sense, if I can't rely on being able to do anything with them later. > Here's another use case: magic ring buffers. (Where you put two > consecutive views of the same file in memory next to each other so that > operations on the ring buffer don't need to be split even in cases where > they'd wrap the end of the ring.) > > Say on our 1GB RAM, 1GB swap system we want to memory-map a 5GB ring > buffer log file. We can do it safely and atomically like this: > > 1) Reserve 10GB of address space with an anonymous PROT_NONE mapping; > the mapping is at $ADDR > 2) Memory-map our log file at $ADDR with PROT_READ|PROT_WRITE; (the > mapping is file-backed, not anonymous, so it doesn't count against > system commit charge) > 3) Memory-map the log file _again_ at $ADDR+5GB If 3) fails, what do you do? > Now we have a nice mirrored view of our ring buffer, and thanks to the > PROT_NONE mapping we set up in step one, no other thread was able to > sneak in the middle and allocate something in the [$ADDR+5GB,$ADDR+10GB) > range and spoil our ability to set up the mirroring. 
> > In this instance, setting aside address space without allocating backing > storage for it turned out to be very useful. Not if PROT_READ|PROT_WRITE call fails. But if, as Stefan says, this will "never" happen, then the problem doesn't exist in practice, and for all practical purposes what I thought should happen, does happen, even if in theory it can fail. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-29 6:08 ` Eli Zaretskii @ 2016-10-29 6:14 ` Daniel Colascione 0 siblings, 0 replies; 375+ messages in thread From: Daniel Colascione @ 2016-10-29 6:14 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, monnier, emacs-devel Eli Zaretskii <eliz@gnu.org> writes: >> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org >> From: Daniel Colascione <dancol@dancol.org> >> Date: Fri, 28 Oct 2016 08:41:48 -0700 >> >> >> Reserving address space is useful for making sure you have a contiguous >> >> range of virtual addresses that you can use later. >> > >> > But if committing more pages from the reserved range is not guaranteed >> > to succeed, I cannot rely on getting that contiguous range of >> > addresses, can I? >> >> You already _have_ the range of addresses. You just can't do anything >> with them yet. > > It's no use "having" the addresses, in the above sense, if I can't > rely on being able to do anything with them later. You can rely on nobody else using that address space, though. This exclusion in itself is valuable. It's like an electric company buying right-of-way for a high-voltage transmission line. Sure, the electric company isn't doing anything with that long strip of land, but the value is in nobody _else_ doing anything with it either. > >> Here's another use case: magic ring buffers. (Where you put two >> consecutive views of the same file in memory next to each other so that >> operations on the ring buffer don't need to be split even in cases where >> they'd wrap the end of the ring.) >> >> Say on our 1GB RAM, 1GB swap system we want to memory-map a 5GB ring >> buffer log file. 
We can do it safely and atomically like this: >> >> 1) Reserve 10GB of address space with an anonymous PROT_NONE mapping; >> the mapping is at $ADDR >> 2) Memory-map our log file at $ADDR with PROT_READ|PROT_WRITE; (the >> mapping is file-backed, not anonymous, so it doesn't count against >> system commit charge) >> 3) Memory-map the log file _again_ at $ADDR+5GB > > If 3) fails, what do you do? Unmap the original mapping and mapping #2, then fail the higher-level make_magic_ring_buffer operation. Any operation that allocates memory can fail. > >> Now we have a nice mirrored view of our ring buffer, and thanks to the >> PROT_NONE mapping we set up in step one, no other thread was able to >> sneak in the middle and allocate something in the [$ADDR+5GB,$ADDR+10GB) >> range and spoil our ability to set up the mirroring. >> >> In this instance, setting aside address space without allocating backing >> storage for it turned out to be very useful. > > Not if PROT_READ|PROT_WRITE call fails. > > But if, as Stefan says, this will "never" happen, then the problem > doesn't exist in practice, and for all practical purposes what I > thought should happen, does happen, even if in theory it can fail. It's unlikely, but it's a legal failure mode. ^ permalink raw reply [flat|nested] 375+ messages in thread
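The three steps, together with the cleanup described for the case where step 3 fails, can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the thread names a make_magic_ring_buffer operation but shows no code for it, so the signature here is invented; open_scratch_file is a hypothetical stand-in for opening the log file; and SIZE is assumed to be a non-zero multiple of the page size.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and mkstemp on glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the log file: an unlinked temporary file
   of SIZE bytes.  Returns a file descriptor, or -1 on failure.  */
static int
open_scratch_file (size_t size)
{
  char path[] = "/tmp/ringXXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);
  if (ftruncate (fd, (off_t) size) != 0)
    {
      close (fd);
      return -1;
    }
  return fd;
}

/* Map the first SIZE bytes of FD twice, back to back.  Returns the
   base address of the 2*SIZE view, or NULL on failure.  */
static void *
make_magic_ring_buffer (int fd, size_t size)
{
  assert (size > 0 && size % (size_t) sysconf (_SC_PAGESIZE) == 0);

  /* Step 1: reserve address space only.  PROT_NONE means nothing
     here has to be backed by RAM or swap.  */
  char *base = mmap (NULL, 2 * size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return NULL;

  /* Steps 2 and 3: MAP_FIXED replaces each half of the reservation
     with a shared, file-backed view of the same bytes.  */
  if (mmap (base, size, PROT_READ | PROT_WRITE,
            MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED
      || mmap (base + size, size, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED)
    {
      /* Undo whatever is mapped in the range and fail the whole
         operation.  */
      munmap (base, 2 * size);
      return NULL;
    }
  return base;
}
```

With the returned pointer, a write at offset i is visible at offset i + SIZE and vice versa, which is what lets ring-buffer operations run across the wrap point without being split.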
* Re: When should ralloc.c be used? 2016-10-28 9:43 ` Eli Zaretskii 2016-10-28 9:52 ` Daniel Colascione @ 2016-10-28 12:11 ` Stefan Monnier 1 sibling, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 12:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel, eggert > If reserving a range of addresses doesn't necessarily mean they will > be later available for committing, then what is the purpose of > reserving them in the first place? What good does it do? My guess is that you can later use that address space to mmap files in there. Equivalently, you could increase the swap space between the time you PROT_NONE and the time you switch to PROT_RW. PROT_NONE is useful in a situation such as ours: you want to mmap a hundred buffers, and make sure you can grow any of them without knowing beforehand which one will grow. But most likely, whether it's useful or not to be able to reserve 80TB of address space even if you'd never be able to PROT_RW later was not really relevant to the design of the API. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
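The "reserve now, grow any of them later" idea can be sketched as follows. This is a minimal illustration, not code from Emacs: struct reserved_buf and its two functions are hypothetical names, and the sketch assumes a POSIX system where an anonymous PROT_NONE mapping can later be made usable with mprotect, as in the test program earlier in the thread.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical growable buffer: a large reservation of which only a
   prefix is usable at any time.  */
struct reserved_buf
{
  char *base;        /* start of the reserved range */
  size_t reserved;   /* bytes of address space set aside */
  size_t committed;  /* bytes currently readable and writable */
};

static size_t
round_up_to_page (size_t n)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  return (n + page - 1) / page * page;
}

/* Reserve address space without asking for backing storage.  */
static int
reserved_buf_init (struct reserved_buf *b, size_t reserve)
{
  reserve = round_up_to_page (reserve);
  void *p = mmap (NULL, reserve, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  b->base = p;
  b->reserved = reserve;
  b->committed = 0;
  return 0;
}

/* Make the first NEW_SIZE bytes usable.  The base address never
   changes; on failure the existing contents are left untouched.  */
static int
reserved_buf_grow (struct reserved_buf *b, size_t new_size)
{
  assert (b->committed <= b->reserved);
  size_t want = round_up_to_page (new_size);
  if (want > b->reserved)
    return -1;
  if (want <= b->committed)
    return 0;
  if (mprotect (b->base + b->committed, want - b->committed,
                PROT_READ | PROT_WRITE) != 0)
    return -1;          /* the commit step was refused */
  b->committed = want;
  return 0;
}
```

The point of the design is that growing never moves the data: the only step that can be refused for lack of memory is the mprotect call, and a refusal leaves the buffer exactly as it was.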
* Re: When should ralloc.c be used? 2016-10-28 8:27 ` Eli Zaretskii 2016-10-28 8:44 ` Daniel Colascione @ 2016-10-28 11:40 ` Jérémie Courrèges-Anglas 2016-10-28 13:03 ` Stefan Monnier 2016-10-28 15:34 ` Daniel Colascione 1 sibling, 2 replies; 375+ messages in thread From: Jérémie Courrèges-Anglas @ 2016-10-28 11:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Daniel Colascione, emacs-devel, monnier, eggert Eli Zaretskii <eliz@gnu.org> writes: >> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org >> From: Daniel Colascione <dancol@dancol.org> >> Date: Fri, 28 Oct 2016 01:11:08 -0700 >> >> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the >> initial mapping, that address space is unavailable for other uses. But >> because the page protections are PROT_NONE, my program has no legal >> right to access that page, so the OS doesn't have to guarantee that it ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >> can find a physical page to back that page I've mmaped. In this state, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is what I think is a problem in your reasoning. "Doesn't have to guarantee" doesn't mean that the kernel *should not* actually check the available memory and resource limits. >> the memory is reserved. >> >> The 20GB PROT_NONE address space reservation itself requires very little >> memory. It's just a note in the kernel's VM interval tree that says "the >> addresses in range [0x20000, 0x500020000) are reserved". Virtual memory is >> >> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once >> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right >> to access that page; under a strict accounting scheme (that is, without >> overcommit), the OS has to guarantee that it'll be able to go find a >> physical page to back that virtual page. 
In this state, the memory is >> committed -- the kernel has committed to finding backing storage for >> that page at some point when the current process tries to access it. > > I'm with you up to here. My question is whether PROT_READ|PROT_WRITE > call could fail after PROT_NONE succeeded. You seem to say it could; > I thought it couldn't. I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would have changed anything here, but on *some* OSes it does, however it is not portable. At least OpenBSD doesn't behave like what you describe. IMHO people who rely on this kind of reservation rely on implementation-defined behavior. Also, sanity-wise, I'd prefer having mmap(2) fail right away rather than having mprotect(2) fail, much later. *If* mprotect(2) actually fails; of course, you don't want to play Russian roulette with your OS's flavor of the OOM-killer either. >> Say you have a strict-accounting system with 1GB of RAM and 1GB of swap. >> I can write a program that reserves 20GB of address space. > > I thought such a reservation should fail, because you don't have > enough virtual memory for 20GB of addresses. IOW, I thought the > ability to reserve address space is restricted by the actual amount of > virtual memory available on the system at the time of the call. You > seem to say I was wrong. -- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
* Re: When should ralloc.c be used? 2016-10-28 11:40 ` Jérémie Courrèges-Anglas @ 2016-10-28 13:03 ` Stefan Monnier 2016-10-28 14:41 ` Jérémie Courrèges-Anglas 2016-10-28 15:34 ` Daniel Colascione 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 13:03 UTC (permalink / raw) To: emacs-devel > I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would > have changed anything here, but on *some* OSes it does, however it is > not portable. At least OpenBSD doesn't behave like what you describe. Are you sure? Can you point to concrete evidence? Not that it's important (using a hard-coded number like 2GB wouldn't work, so we'd more likely use something like w32heap.c's "pre-allocate double the size", which doesn't suffer from that problem anyway and still guarantees efficient behavior when growing a buffer progressively from 1B to 100GB). Stefan
* Re: When should ralloc.c be used? 2016-10-28 13:03 ` Stefan Monnier @ 2016-10-28 14:41 ` Jérémie Courrèges-Anglas 0 siblings, 0 replies; 375+ messages in thread From: Jérémie Courrèges-Anglas @ 2016-10-28 14:41 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Stefan Monnier <monnier@iro.umontreal.ca> writes: >> I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would >> have changed anything here, but on *some* OSes it does, however it is >> not portable. At least OpenBSD doesn't behave like what you describe. > > Are you sure? Can you point to concrete evidence? Erm, I think there was a problem with my tests. data: - system has 8GB of ram - no swap - "data" rlimit set to 32GB, the per-process maximum supported on OpenBSD/amd64 with Daniel's test program asking for 20GB: - mmap(PROT_NONE) _succeeds_ - mprotect(PROT_READ|PROT_WRITE) _succeeds_ An mmap call directly asking for 20GB with PROT_READ|PROT_WRITE also succeeds. The protection flags aren't checked to decide whether ENOMEM should be returned, and the process has no easy way to tell whether the requested amount of memory is actually usable (-> SIGBUS if the system can't map enough pages). The reason why my test initially failed is that I assumed that ulimit -d was 4GB on this box, not 1.5GB (default for OpenBSD/amd64). Not double-checking this was sloppy, my sincere apologies to Daniel and the other readers. > Not that's it's important (using a hard-coded number like 2GB wouldn't > work, so we'd more likely use something like w32heap.c's "pre-allocate > double the size", which doesn't suffer from that problem anyway and > still guarantees efficient behavior when growing a buffer progressively > from 1B to 100GB). Ack. Note that the test above was using the maximum value for ulimit -d; for the record, a single allocation of 2GB would be rejected by default on all of our supported platforms. 
-- jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE ^ permalink raw reply [flat|nested] 375+ messages in thread
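The mistake described above (assuming "ulimit -d" was 4GB when it was in fact 1.5GB) can be avoided by asking the system for the limit before interpreting a failed allocation. A small sketch using the POSIX getrlimit call follows; exceeds_limit and request_exceeds_data_limit are hypothetical helper names, and the check is deliberately coarse, since it ignores what the process already uses and since the way RLIMIT_DATA applies to mmap differs between operating systems.

```c
#include <assert.h>
#include <sys/resource.h>

/* Pure comparison, kept separate so it can be checked without
   depending on the limits of the machine it runs on.  */
static int
exceeds_limit (rlim_t soft_limit, rlim_t request)
{
  return soft_limit != RLIM_INFINITY && request > soft_limit;
}

/* Returns 1 if REQUEST bytes is larger than the current soft data
   limit ("ulimit -d"), 0 if it is not, and -1 if the limit cannot
   be read.  */
static int
request_exceeds_data_limit (rlim_t request)
{
  struct rlimit rl;
  if (getrlimit (RLIMIT_DATA, &rl) != 0)
    return -1;
  return exceeds_limit (rl.rlim_cur, request);
}
```

With the numbers from this message, a 20GB request exceeds a 1.5GB soft limit but not a 32GB one, which is the difference between the first, mistaken test run and the corrected one.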
* Re: When should ralloc.c be used? 2016-10-28 11:40 ` Jérémie Courrèges-Anglas 2016-10-28 13:03 ` Stefan Monnier @ 2016-10-28 15:34 ` Daniel Colascione 1 sibling, 0 replies; 375+ messages in thread From: Daniel Colascione @ 2016-10-28 15:34 UTC (permalink / raw) To: Eli Zaretskii, eggert, monnier, emacs-devel On 10/28/2016 04:40 AM, Jérémie Courrèges-Anglas wrote: > Eli Zaretskii <eliz@gnu.org> writes: > >>> Cc: monnier@iro.umontreal.ca, eggert@cs.ucla.edu, emacs-devel@gnu.org >>> From: Daniel Colascione <dancol@dancol.org> >>> Date: Fri, 28 Oct 2016 01:11:08 -0700 >>> >>> Say I mmap (anonymously, for simplicity) a page PROT_NONE. After the >>> initial mapping, that address space is unavailable for other uses. But >>> because the page protections are PROT_NONE, my program has no legal >>> right to access that page, so the OS doesn't have to guarantee that it > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>> can find a physical page to back that page I've mmaped. In this state, > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > This is what I think is a problem in your reasoning. "Doesn't have to > guarantee" doesn't mean that the kernel *should not* actually check the > available memory and resource limits. > IMHO, an OS that rejects big PROT_NONE mappings merely because it might not be able to change them to PROT_READ|PROT_WRITE later is broken. The non-overcommit Linux behavior (which is identical to Windows behavior) is the _right_ _thing_ _to_ _do_. The OS is letting the process manage its address space and assuming that the programmer knows what he wanted to do. >>> the memory is reserved. >>> >>> The 20GB PROT_NONE address space reservation itself requires very little >>> memory. It's just a note in the kernel's VM interval tree that says "the >>> addresses in range [0x20000, 0x500020000) are reserved".
Virtual memory is >>> >>> Now imagine I change the protections to PROT_READ|PROT_WRITE --- once >>> the PROT_READ|PROT_WRITE mprotect succeeds, my program has every right >>> to access that page; under a strict accounting scheme (that is, without >>> overcommit), the OS has to guarantee that it'll be able to go find a >>> physical page to back that virtual page. In this state, the memory is >>> committed -- the kernel has committed to finding backing storage for >>> that page at some point when the current process tries to access it. >> >> I'm with you up to here. My question is whether PROT_READ|PROT_WRITE >> call could fail after PROT_NONE succeeded. You seem to say it could; >> I thought it couldn't. > > I wouldn't have thought that PROT_NONE vs PROT_READ|PROT_WRITE would > have changed anything here, but on *some* OSes it does, however it is > not portable. At least OpenBSD doesn't behave like what you describe. How does it behave? > IMHO people who rely on this kind of reservations rely on > implementation-defined behavior. OpenBSD is a Coelacanth. It's a relic. It doesn't even have a unified buffer cache. > Also, sanity wise, I'd prefer having mmap(2) fail right away rather than > having mprotect(2) fail, much later. Then ask for PROT_READ|PROT_WRITE access right away. Ask for commit, not just address space. > *If* mprotect(2) actually fails ; > of course, you don't want to play russian roulette with your OS's > flavor of the OOM-killer either. That's why overcommit is an abomination.
* Re: When should ralloc.c be used? 2016-10-24 15:40 ` Eli Zaretskii 2016-10-24 16:27 ` Daniel Colascione @ 2016-10-24 18:45 ` Stefan Monnier 2016-10-24 19:38 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 18:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel >> > Using mmap has disadvantages: when you need to enlarge buffer text, >> > and that fails (because there are no more free pages/addresses after >> > the already allocated region), we need to copy buffer text to the new >> > allocation. >> All allocators suffer from this problem. I haven't seen any evidence >> that the mmap-based allocation code is significantly more prone to it. > I have seen that. Could you give some details (mostly about the scale of the problem)? > The native glibc malloc, the one GNU/Linux systems were using until we > got screwed by the recent glibc, didn't have this problem, while the > mmap-based allocator did. Don't ask me how glibc does it, I don't > know; but the fact is there. It likely mmaps a bit more than requested, like you do in w32heap.c. >> Another advantage of using mmap is that it can return the memory to the >> OS once you kill your large buffer, whereas with gmalloc+ralloc this >> basically never happens, AFAIK. > Not entirely true: ralloc calls the system sbrk with a negative > argument when it feels like it. That's why I said "basically". Yes, in theory it can sometimes return memory. In practice, this is rare. In contrast, with mmap, returning memory to the OS is the rule rather than the exception. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 18:45 ` Stefan Monnier @ 2016-10-24 19:38 ` Eli Zaretskii 2016-10-25 14:12 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 19:38 UTC (permalink / raw) To: Stefan Monnier; +Cc: eggert, emacs-devel > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 14:45:59 -0400 > > >> > Using mmap has disadvantages: when you need to enlarge buffer text, > >> > and that fails (because there are no more free pages/addresses after > >> > the already allocated region), we need to copy buffer text to the new > >> > allocation. > >> All allocators suffer from this problem. I haven't seen any evidence > >> that the mmap-based allocation code is significantly more prone to it. > > I have seen that. > > Could you give some details (mostly about the scale of the problem)? Visiting a large compressed file (e.g., an Emacs release tarball compressed with gzip) takes with mmap several times as long as in a build without mmap. > > Not entirely true: ralloc calls the system sbrk with a negative > > argument when it feels like it. > > That's why I said "basically". Yes, in theory it can sometimes > return memory. In practice, this is rare. In contrast, with mmap, > returning memory to the OS is the rule rather than the exception. How so? Releasing memory in both cases requires basically the same situation: a large enough block of contiguous memory not in use. It seems ralloc is actually at an advantage, because relocating blocks helps collect together a larger free block. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 19:38 ` Eli Zaretskii @ 2016-10-25 14:12 ` Stefan Monnier 2016-10-25 16:36 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-25 14:12 UTC (permalink / raw) To: emacs-devel >> That's why I said "basically". Yes, in theory it can sometimes >> return memory. In practice, this is rare. In contrast, with mmap, >> returning memory to the OS is the rule rather than the exception. > How so? Releasing memory in both cases requires basically the same > situation: a large enough block of contiguous memory not in use. IIUC releasing memory with sbrk can only be done if that memory is at the end of the heap. > It seems ralloc is actually at an advantage, because relocating blocks > helps collect together a larger free block. mmap can always free what it has allocated before, without any need to relocate anything. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
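The claim that mmap can free any earlier allocation, without it having to sit at the end of a heap, can be sketched as follows. This is a minimal illustration assuming a POSIX-like system with MAP_ANONYMOUS; the function name unmap_middle_demo is hypothetical. It says nothing about how ralloc.c or gmalloc behave; it only shows that munmap does not care about allocation order.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Map three independent regions, return the middle one to the OS first,
   then check that its neighbours are still usable.  Returns 0 on success.  */
static int unmap_middle_demo(void)
{
  size_t len = (size_t) 64 << 10;
  char *r[3];

  for (int i = 0; i < 3; i++)
    {
      r[i] = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (r[i] == MAP_FAILED)
        {
          while (i-- > 0)          /* undo the mappings made so far */
            munmap(r[i], len);
          return -1;
        }
    }

  /* No relocation and no "end of heap" condition is needed here.  */
  int ok = munmap(r[1], len) == 0;

  r[0][0] = 'a';
  r[2][0] = 'c';
  ok = ok && r[0][0] == 'a' && r[2][0] == 'c';

  munmap(r[0], len);
  munmap(r[2], len);
  return ok ? 0 : -1;
}
```

With sbrk, by contrast, only the top of the break can be given back, so an equivalent release of the "middle" region would first require moving whatever lies above it.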
* Re: When should ralloc.c be used? 2016-10-25 14:12 ` Stefan Monnier @ 2016-10-25 16:36 ` Eli Zaretskii 2016-10-25 19:27 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 16:36 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Tue, 25 Oct 2016 10:12:23 -0400 > > >> That's why I said "basically". Yes, in theory it can sometimes > >> return memory. In practice, this is rare. In contrast, with mmap, > >> returning memory to the OS is the rule rather than the exception. > > How so? Releasing memory in both cases requires basically the same > > situation: a large enough block of contiguous memory not in use. > > IIUC releasing memory with sbrk can only be done if that memory is at > the end of the heap. Since ralloc.c relocates blocks, it can make this happen more easily. > > It seems ralloc is actually at an advantage, because relocating blocks > > helps collect together a larger free block. > > mmap can always free what it has allocated before, without any need to > relocate anything. It makes no sense to release random pages here and there, all you get is fragmentation at address space level. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-25 16:36 ` Eli Zaretskii @ 2016-10-25 19:27 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-25 19:27 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel >> IIUC releasing memory with sbrk can only be done if that memory is at >> the end of the heap. > Since ralloc.c relocates blocks, it can make this happen more easily. But the sbrk area is shared with gmalloc, whose data is not relocatable, so as soon as gmalloc calls sbrk, the space previously allocated by ralloc can't be returned any more. >> > It seems ralloc is actually at an advantage, because relocating blocks >> > helps collect together a larger free block. >> mmap can always free what it has allocated before, without any need to >> relocate anything. > It makes no sense to release random pages here and there, all you get > is fragmentation at address space level. Yes, but address space is much more plentiful. Note that glibc uses exactly this approach and it works very well for us. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 13:05 ` Eli Zaretskii 2016-10-24 14:12 ` Stefan Monnier 2016-10-24 14:37 ` Stefan Monnier @ 2016-10-25 3:12 ` Ken Raeburn 2016-10-25 16:06 ` Eli Zaretskii 2 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2016-10-25 3:12 UTC (permalink / raw) To: Emacs development discussions On Oct 24, 2016, at 09:05, Eli Zaretskii <eliz@gnu.org> wrote: > Using mmap has disadvantages: when you need to enlarge buffer text, > and that fails (because there are no more free pages/addresses after > the already allocated region), we need to copy buffer text to the new > allocation. This happens quite a lot when we visit a compressed > buffer. (The MS-Windows emulation of mmap in w32heap.c reserves twice > the number of pages as originally requested, for that very reason.) In the general case, yes. But modern Linux kernels have an “mremap” system call which can “move” a range of pages to a portion of the address space that can accommodate a larger size, by tweaking page tables rather than copying all the bits around. I’m pretty sure modern glibc realloc uses it. I had a project a while back where code ported to Solaris ran far slower than the GNU/Linux version because lots of realloc calls were done on a large array; Solaris copied, GNU/Linux remapped. void *mremap(void *old_address, size_t old_size, size_t new_size, int flags, ... /* void *new_address */); Of course you can’t shift bytes within a page this way, or add new space anywhere but after the last page of the old region. (No hint in the man page whether you can use an explicit new address range overlapping the old range to shift a chunk of memory a la memmove, or if the results would be undefined a la memcpy.) I don’t know if any other systems support it. The performance savings for one of our favorite systems might be worth the special-casing. Though, if glibc realloc does the right thing, maybe using malloc/realloc for buffer storage would suffice. 
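A minimal sketch of the explicit approach, assuming Linux with glibc (mremap is not portable, and _GNU_SOURCE is needed to get its declaration). The names grow_mapping and mremap_demo are hypothetical; nothing here is taken from buffer.c.

```c
#define _GNU_SOURCE       /* for mremap and MREMAP_MAYMOVE */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Grow an anonymous mapping.  MREMAP_MAYMOVE lets the kernel pick a new
   address when the region cannot be extended in place; the pages are
   remapped rather than copied.  Returns NULL on failure, in which case
   the old mapping is still valid.  */
static void *grow_mapping(void *old, size_t old_len, size_t new_len)
{
  void *p = mremap(old, old_len, new_len, MREMAP_MAYMOVE);
  return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical demo: write into a 64 KiB mapping, grow it to 1 MiB and
   check that the contents came along.  Returns 0 on success.  */
static int mremap_demo(void)
{
  size_t old_len = (size_t) 64 << 10;
  size_t new_len = (size_t) 1 << 20;
  char *p = mmap(NULL, old_len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  memcpy(p, "buffer text", sizeof "buffer text");

  char *q = grow_mapping(p, old_len, new_len);
  if (!q)
    {
      munmap(p, old_len);
      return -1;
    }
  /* The address may have changed: only q may be used from here on.  */
  int ok = strcmp(q, "buffer text") == 0;
  q[new_len - 1] = 'x';            /* the new tail is usable too */
  ok = ok && q[new_len - 1] == 'x';
  munmap(q, new_len);
  return ok ? 0 : -1;
}
```

Note that any pointer into the old region becomes invalid if the mapping moves, so callers have to treat the returned address the same way they would treat the result of realloc.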
Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-25 3:12 ` Ken Raeburn @ 2016-10-25 16:06 ` Eli Zaretskii 2016-10-26 4:36 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 16:06 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Mon, 24 Oct 2016 23:12:40 -0400 > > > Using mmap has disadvantages: when you need to enlarge buffer text, > > and that fails (because there are no more free pages/addresses after > > the already allocated region), we need to copy buffer text to the new > > allocation. This happens quite a lot when we visit a compressed > > buffer. (The MS-Windows emulation of mmap in w32heap.c reserves twice > > the number of pages as originally requested, for that very reason.) > > In the general case, yes. But modern Linux kernels have an “mremap” system > call which can “move” a range of pages to a portion of the address space that > can accommodate a larger size, by tweaking page tables rather than copying all > the bits around. I’m pretty sure modern glibc realloc uses it. AFAIU, this feature will only help us if someone adds code to use it in buffer.c:mmap_enlarge. Or are you saying that the OS will call mremap for us automatically when mmap_enlarge attempts to map additional pages at the end of an mmaped region? > I don’t know if any other systems support it. The performance savings for one > of our favorite systems might be worth the special-casing. Though, if glibc > realloc does the right thing, maybe using malloc/realloc for buffer storage > would suffice. If the Linux kernel is the only system that allows implementation of mremap, then it doesn't really help in the long run, because on master we don't need mmap at all for GNU/Linux systems. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-25 16:06 ` Eli Zaretskii @ 2016-10-26 4:36 ` Ken Raeburn 2016-10-26 11:40 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Ken Raeburn @ 2016-10-26 4:36 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > On Oct 25, 2016, at 12:06, Eli Zaretskii <eliz@gnu.org> wrote: > >> From: Ken Raeburn <raeburn@raeburn.org> >> Date: Mon, 24 Oct 2016 23:12:40 -0400 >> >>> Using mmap has disadvantages: when you need to enlarge buffer text, >>> and that fails (because there are no more free pages/addresses after >>> the already allocated region), we need to copy buffer text to the new >>> allocation. This happens quite a lot when we visit a compressed >>> buffer. (The MS-Windows emulation of mmap in w32heap.c reserves twice >>> the number of pages as originally requested, for that very reason.) >> >> In the general case, yes. But modern Linux kernels have an “mremap” system >> call which can “move” a range of pages to a portion of the address space that >> can accommodate a larger size, by tweaking page tables rather than copying all >> the bits around. I’m pretty sure modern glibc realloc uses it. > > AFAIU, this feature will only help us if someone adds code to use it > in buffer.c:mmap_enlarge. Or are you saying that the OS will call > mremap for us automatically when mmap_enlarge attempts to map > additional pages at the end of an mmaped region? It could be done explicitly, but my experience was that malloc/realloc would just do it for us; we’d just have to use malloc/realloc instead of explicitly calling mmap. I just took a quick look at the glibc sources (2.19, as patched and packaged by Debian), and it looks like the use of mmap kicks in by default for 128kB or larger allocations, though the threshold can be changed at run time. 
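For illustration, the run-time threshold change can be sketched with glibc's mallopt. This assumes glibc (mallopt and M_MMAP_THRESHOLD are declared in <malloc.h> there and are not portable); the names set_mmap_threshold and realloc_large_demo are hypothetical. Whether realloc then grows the block by remapping rather than copying is an implementation detail that the sketch cannot observe; it only checks that the contents survive.

```c
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

/* Ask glibc malloc to serve requests of at least BYTES with mmap.
   mallopt returns 1 on success and 0 on error.  */
static int set_mmap_threshold(size_t bytes)
{
  return mallopt(M_MMAP_THRESHOLD, (int) bytes) == 1 ? 0 : -1;
}

/* Hypothetical demo: allocate above the threshold, grow the block with
   realloc, and verify the data.  Returns 0 on success.  */
static int realloc_large_demo(void)
{
  if (set_mmap_threshold((size_t) 1 << 20) != 0)   /* 1 MiB */
    return -1;

  size_t old_len = (size_t) 2 << 20;               /* 2 MiB */
  size_t new_len = (size_t) 8 << 20;               /* 8 MiB */
  char *p = malloc(old_len);
  if (!p)
    return -1;
  memset(p, 'e', old_len);

  char *q = realloc(p, new_len);
  if (!q)
    {
      free(p);
      return -1;
    }
  int ok = q[0] == 'e' && q[old_len - 1] == 'e';
  free(q);
  return ok ? 0 : -1;
}
```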
> If the Linux kernel is the only system that allows implementation of > mremap, then it doesn't really help in the long run, because on master > we don't need mmap at all for GNU/Linux systems. A man page browser at freebsd.org for several platforms seems to indicate that NetBSD has picked it up, but neither FreeBSD nor OpenBSD. I don’t know if NetBSD’s realloc will use it, but it’s certainly simpler if we just ignore mremap for explicit use, and just bear in mind that realloc may not always have to pay the expected copying penalty on all systems…. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-26 4:36 ` Ken Raeburn @ 2016-10-26 11:40 ` Eli Zaretskii 2016-10-27 8:51 ` Ken Raeburn 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-26 11:40 UTC (permalink / raw) To: Ken Raeburn; +Cc: emacs-devel > From: Ken Raeburn <raeburn@raeburn.org> > Date: Wed, 26 Oct 2016 00:36:42 -0400 > Cc: emacs-devel@gnu.org > > >>> Using mmap has disadvantages: when you need to enlarge buffer text, > >>> and that fails (because there are no more free pages/addresses after > >>> the already allocated region), we need to copy buffer text to the new > >>> allocation. This happens quite a lot when we visit a compressed > >>> buffer. (The MS-Windows emulation of mmap in w32heap.c reserves twice > >>> the number of pages as originally requested, for that very reason.) > >> > >> In the general case, yes. But modern Linux kernels have an “mremap” system > >> call which can “move” a range of pages to a portion of the address space > >> that > >> can accommodate a larger size, by tweaking page tables rather than copying > >> all > >> the bits around. I’m pretty sure modern glibc realloc uses it. > > > > AFAIU, this feature will only help us if someone adds code to use it > > in buffer.c:mmap_enlarge. Or are you saying that the OS will call > > mremap for us automatically when mmap_enlarge attempts to map > > additional pages at the end of an mmaped region? > > It could be done explicitly, but my experience was that malloc/realloc would > just do it for us; we’d just have to use malloc/realloc instead of explicitly > calling mmap. I think we've lost context of the discussion. Please see above: this is about the disadvantages of using mmap directly, i.e. for those cases where the native malloc or gmalloc suffer from memory fragmentation, and we decide to use mmap in buffer.c to countermand that. I've pointed out the disadvantages of using mmap directly, and you mentioned the mremap syscall as the counter-argument. 
If you thought I was talking about problems mmap could cause to the malloc implementation, then that's a misunderstanding: I was explicitly talking about using mmap directly for allocating buffer text. My point was that we should only use mmap if necessary, as it comes at a price. > A man page browser at freebsd.org for several platforms seems to indicate that > NetBSD has picked it up, but neither FreeBSD nor OpenBSD. I don’t know if > NetBSD’s realloc will use it, but it’s certainly simpler if we just ignore > mremap for explicit use, and just bear in mind that realloc may not always have > to pay the expected copying penalty on all systems…. Once again, this is about the cases where using malloc for buffer text gives unsatisfactory results, and mmap is being considered as a remedy. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-26 11:40 ` Eli Zaretskii @ 2016-10-27 8:51 ` Ken Raeburn 0 siblings, 0 replies; 375+ messages in thread From: Ken Raeburn @ 2016-10-27 8:51 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > I think we've lost context of the discussion. Please see above: this > is about the disadvantages of using mmap directly, i.e. for those > cases where the native malloc or gmalloc suffer from memory > fragmentation, and we decide to use mmap in buffer.c to countermand > that. Yes, sorry, I got a bit off track… Ken ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-23 20:44 ` Stefan Monnier 2016-10-24 5:11 ` Paul Eggert @ 2016-10-24 6:59 ` Eli Zaretskii 2016-10-24 12:45 ` Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 6:59 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 23 Oct 2016 16:44:10 -0400 > > >> I don't think it matters very much since we use mmap for the buffers, > > No, we don't, not on GNU/Linux anyway. > > AFAIK the decision not to use mmap was due to the fact that glibc's > malloc itself uses mmap. But if we don't use glibc's malloc, then why > wouldn't we decide to use mmap ourselves for the buffers? I already asked that: http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00678.html The only answer was disappointing: I don't know, and would rather not spend time investigating. It looks like either people don't realize what a land mine we just stepped on, or they simply don't care enough. Does it make sense to anyone to release Emacs 25.2 that doesn't work reliably on recent GNU/Linux systems? Because that's what is going to happen if we don't invest all the resources we have into solving this. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 6:59 ` Eli Zaretskii @ 2016-10-24 12:45 ` Stefan Monnier 2016-10-24 13:07 ` Eli Zaretskii 2016-10-24 16:56 ` Richard Stallman 0 siblings, 2 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 12:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > It looks like either people don't realize what a land mine we just > stepped on, or they simply don't care enough. [...] > Does it make sense to anyone to release Emacs 25.2 that doesn't work > reliably on recent GNU/Linux systems? Because that's what is going to > happen if we don't invest all the resources we have into solving this. I must be missing something: you seem to know about past severe problems we've had because of fragmentation, whereas I can't remember any such occurrence. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 12:45 ` Stefan Monnier @ 2016-10-24 13:07 ` Eli Zaretskii 2016-10-24 14:42 ` Stefan Monnier 2016-10-24 16:56 ` Richard Stallman 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 13:07 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 08:45:04 -0400 > > > It looks like either people don't realize what a land mine we just > > stepped on, or they simply don't care enough. > [...] > > Does it make sense to anyone to release Emacs 25.2 that doesn't work > > reliably on recent GNU/Linux systems? Because that's what is going to > > happen if we don't invest all the resources we have into solving this. > > I must be missing something: you seem to know about past severe problems > we've had because of fragmentation, whereas I can't remember any > such occurrence. I don't understand how you get to talking about fragmentation. I never mentioned anything like that. The problems I was talking about are all related to using ralloc.c on GNU/Linux systems with a recent enough glibc. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 13:07 ` Eli Zaretskii @ 2016-10-24 14:42 ` Stefan Monnier 2016-10-24 15:43 ` Eli Zaretskii 2016-10-24 16:10 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 14:42 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel > I don't understand how you get to talking about fragmentation. I > never mentioned anything like that. The problems I was talking about > are all related to using ralloc.c on GNU/Linux systems with a recent > enough glibc. I misunderstood, then. I fully agree that ralloc.c is a landmine, if that's what you meant. That's why I think we should get rid of it. And if we really want to keep it, we should prefer mmap over ralloc (i.e. we should only consider ralloc in those cases where the mmap alternative is unavailable (not sure if there are still systems where this is the case, the DOS port maybe?)). AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for Emacs-25.2 with the new glibc, with no known problem. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 14:42 ` Stefan Monnier @ 2016-10-24 15:43 ` Eli Zaretskii 2016-10-24 18:50 ` Stefan Monnier 2016-10-24 16:10 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 15:43 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Cc: emacs-devel@gnu.org > Date: Mon, 24 Oct 2016 10:42:02 -0400 > > AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for > Emacs-25.2 with the new glibc, with no known problem. So you consider this preferable to the 2 alternatives I mentioned in http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00740.html ? They both avoid using mmap, since I wouldn't want to re-introduce its disadvantages to GNU/Linux systems. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 15:43 ` Eli Zaretskii @ 2016-10-24 18:50 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 18:50 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel >> AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for >> Emacs-25.2 with the new glibc, with no known problem. > So you consider this preferable to the 2 alternatives I mentioned in > http://lists.gnu.org/archive/html/emacs-devel/2016-10/msg00740.html > ? Not sure. AFAIU, gmalloc-mmap-ralloc suffers from fragmentation, which was the reason why ralloc was written in the first place, so I would tend to shy away from it, but I have not personally seen those problems, so I don't have a strong opinion on this. As for using HYBRID_MALLOC, that would be a better solution I think, but I haven't looked at the corresponding patch, so I don't know how safe it is. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 14:42 ` Stefan Monnier 2016-10-24 15:43 ` Eli Zaretskii @ 2016-10-24 16:10 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 16:10 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Mon, 24 Oct 2016 10:42:02 -0400 > Cc: emacs-devel@gnu.org > > And if we really want to keep it, we should prefer mmap over ralloc > (i.e. we should only consider ralloc in those cases where the mmap > alternative is unavailable (not sure if there are still systems where > this is the case, the DOS port maybe?)). The DOS port has code to work with its system malloc. That code was tested at the time, so this port shouldn't be an obstacle on the way of getting rid of ralloc.c. > AFAIK, gmalloc+mmap-ralloc is a perfectly acceptable solution for > Emacs-25.2 with the new glibc, with no known problem. I think gmalloc without mmap might be better. See the alternatives I mentioned elsewhere in this thread. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-24 12:45 ` Stefan Monnier 2016-10-24 13:07 ` Eli Zaretskii @ 2016-10-24 16:56 ` Richard Stallman 1 sibling, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-24 16:56 UTC (permalink / raw) To: Stefan Monnier; +Cc: eliz, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I must be missing something: you seem to know about past severe problems > we've had because of fragmentation, whereas I can't remember any > such occurrence. Fragmentation caused problems bad enough to motivate me to write ralloc. But I don't know about the present situation. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-22 18:34 ` Paul Eggert 2016-10-22 19:43 ` When should ralloc.c be used? Stefan Monnier @ 2016-10-24 0:21 ` Richard Stallman 2016-10-24 3:59 ` Paul Eggert ` (2 more replies) 1 sibling, 3 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-24 0:21 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, npostavs, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I don't like it either, but would rather work on redoing the build process so > that we can use the native malloc on all hosts. That may not be desirable, though. We started using GNU malloc because it gave much better performance than some native mallocs. Whether that is true today, I have no idea; I am only saying that it is an issue to consider. Stefan said: > But that doesn't explain why we'd need to use ralloc in the mean time. Why would we not want to use ralloc? It made a big improvement for memory management when I wrote it. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 0:21 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman @ 2016-10-24 3:59 ` Paul Eggert 2016-10-24 7:15 ` Eli Zaretskii 2016-10-24 14:04 ` When should ralloc.c be used? Stefan Monnier 2 siblings, 0 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-24 3:59 UTC (permalink / raw) To: rms; +Cc: eliz, npostavs, emacs-devel Richard Stallman wrote: > > I don't like it either, but would rather work on redoing the build process so > > that we can use the native malloc on all hosts. > > That may not be desirable, though. We started using GNU malloc > because it gave much better performance We could continue to do that, on the set of platforms where our copy of GNU malloc works significantly better than native malloc. My impression, though, is that this set of platforms is gradually shrinking due to improvements in memory allocators. See, e.g.: Berger ED, Zorn BG, McKinley KS. Reconsidering custom memory allocation. OOPSLA'02. http://dx.doi.org/10.1145/2502508.2502522 https://people.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 0:21 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 2016-10-24 3:59 ` Paul Eggert @ 2016-10-24 7:15 ` Eli Zaretskii 2016-10-24 16:55 ` Richard Stallman 2016-10-24 14:04 ` When should ralloc.c be used? Stefan Monnier 2 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 7:15 UTC (permalink / raw) To: rms; +Cc: npostavs, eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: eliz@gnu.org, emacs-devel@gnu.org, npostavs@users.sourceforge.net > Date: Sun, 23 Oct 2016 20:21:33 -0400 > > > I don't like it either, but would rather work on redoing the build process so > > that we can use the native malloc on all hosts. > > That may not be desirable, though. We started using GNU malloc > because it gave much better performance than some native mallocs. > Whether that is true today, I have no idea; I am only saying > that it is an issue to consider. I think native malloc on GNU/Linux is much better these days; we were using it all the recent years, until glibc developers removed the hooks we needed for unexec support (which is why those GNU/Linux systems where this change is already installed switched to gmalloc and ralloc instead). Emacs 25.1 switched to native malloc on MS-Windows as well, and I see no problems with memory management due to that, perhaps even a small improvement. > > But that doesn't explain why we'd need to use ralloc in the mean time. > > Why would we not want to use ralloc? It imposes hard-to-fulfill requirements on functions that get C pointers to buffer text or to Lisp string data: those functions must never call malloc, directly or indirectly. This requirement was well known to the few Emacs developers in the distant past, when all the platforms used ralloc. But since the modern platforms gradually migrated away from ralloc, this is almost unknown to most current developers, and code crept in that violates this requirement. 
Fixing all that code is hard, because most of it is not easily found; it manifests itself in corruption of buffer text, random segfaults and aborts during GC, which happen a long time after the offending code did its job. > It made a big improvement for memory management when I wrote it. It is no longer a big improvement, as modern platforms manage memory much better in their native malloc implementations. So nowadays ralloc is a significant disadvantage with almost no advantages. ^ permalink raw reply [flat|nested] 375+ messages in thread
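The hazard described in this message, a C pointer into relocatable text going stale once an allocation happens, can be illustrated with a small stand-in. This sketch does not use ralloc.c or any Emacs API: plain realloc plays the role of the relocation, and struct text_buf, grow_text and offset_survives_relocation are hypothetical names. It shows the usual defensive pattern of remembering an offset and re-deriving the pointer after anything that might allocate.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for relocatable buffer text (an assumption for illustration).  */
struct text_buf
{
  char *beg;     /* may change whenever the text is relocated */
  size_t len;
};

/* Growing the text may move it; realloc stands in for the relocation
   that ralloc could perform behind the caller's back.  */
static int grow_text(struct text_buf *b, size_t new_len)
{
  char *p = realloc(b->beg, new_len);
  if (!p)
    return -1;
  b->beg = p;
  b->len = new_len;
  return 0;
}

/* Keep an offset across the allocation instead of a raw char pointer,
   then recompute the address from the possibly new base.
   Returns 0 on success.  */
static int offset_survives_relocation(void)
{
  struct text_buf b = { malloc(16), 16 };
  if (!b.beg)
    return -1;
  memcpy(b.beg, "hello, world", sizeof "hello, world");

  size_t off = 7;                  /* position of 'w', kept as an offset */

  if (grow_text(&b, (size_t) 1 << 20) != 0)
    {
      free(b.beg);
      return -1;
    }

  /* A char pointer saved before grow_text could now dangle; the offset
     is still meaningful relative to the current b.beg.  */
  int ok = b.beg[off] == 'w';
  free(b.beg);
  return ok ? 0 : -1;
}
```

The difficulty raised in the thread is that with ralloc the relocating call need not be visible at the call site: any function that reaches malloc, however indirectly, plays the role of grow_text here.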
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 7:15 ` Eli Zaretskii @ 2016-10-24 16:55 ` Richard Stallman 2016-10-24 17:09 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-24 16:55 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I think native malloc on GNU/Linux is much better these days; we were > using it all the recent years, until glibc developers removed the > hooks we needed for unexec support (which is why those GNU/Linux > systems where this change is already installed switched to gmalloc and > ralloc instead). Should we talk with them about putting in those hooks or other suitable hooks? Then we could go back to the libc malloc. > It imposes hard-to-fulfill requirements on functions that get C > pointers to buffer text or to Lisp string data: those functions must > never call malloc, directly or indirectly. I think the way to fix those is by systematically looking at the source for them, rather than by debugging. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 16:55 ` Richard Stallman @ 2016-10-24 17:09 ` Eli Zaretskii 2016-10-25 2:35 ` Richard Stallman ` (2 more replies) 0 siblings, 3 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-24 17:09 UTC (permalink / raw) To: rms; +Cc: npostavs, eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: eggert@cs.ucla.edu, emacs-devel@gnu.org, > npostavs@users.sourceforge.net > Date: Mon, 24 Oct 2016 12:55:36 -0400 > > > I think native malloc on GNU/Linux is much better these days; we were > > using it all the recent years, until glibc developers removed the > > hooks we needed for unexec support (which is why those GNU/Linux > > systems where this change is already installed switched to gmalloc and > > ralloc instead). > > Should we talk with them about putting in those hooks or other > suitable hooks? Then we could go back to the libc malloc. I think we tried, and more or less failed. (That was in the context of unexec, but the arguments are more or less similar.) > > It imposes hard-to-fulfill requirements on functions that get C > > pointers to buffer text or to Lisp string data: those functions must > > never call malloc, directly or indirectly. > > I think the way to fix those is by systematically looking at the > source for them, rather than by debugging. Yes, but finding out whether this is so is not easy, because the malloc call is sometimes buried very deep. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 17:09 ` Eli Zaretskii @ 2016-10-25 2:35 ` Richard Stallman 2016-10-25 6:38 ` Paul Eggert 2016-10-25 16:04 ` Eli Zaretskii 2016-10-25 2:35 ` Richard Stallman 2016-10-25 23:00 ` Perry E. Metzger 2 siblings, 2 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-25 2:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Should we talk with them about putting in those hooks or other > > suitable hooks? Then we could go back to the libc malloc. > I think we tried, and more or less failed. (That was in the context > of unexec, but the arguments are more or less similar.) How did it fail? Did they give it a strong try? -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 2:35 ` Richard Stallman @ 2016-10-25 6:38 ` Paul Eggert 2016-10-25 16:04 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-25 6:38 UTC (permalink / raw) To: rms, Eli Zaretskii; +Cc: npostavs, emacs-devel Richard Stallman wrote: > > > Should we talk with them about putting in those hooks or other > > > suitable hooks? Then we could go back to the libc malloc. > > > I think we tried, and more or less failed. (That was in the context > > of unexec, but the arguments are more or less similar.) > > How did it fail? Did they give it a strong try? It was more the other way around. People working on the glibc memory allocator convinced me that the malloc hooks were a significant impediment to performance improvements within glibc, and that Emacs unexec didn't really need those hooks any more. Emacs was the only major user of that part of the old glibc API. For those interested in GNU malloc performance improvements, a talk related to the current effort is scheduled a week from Thursday in Santa Fe. Please see: O'Donell C. linux and glibc: The 4.5TiB malloc API trace. LPC 2016. https://linuxplumbersconf.org/2016/ocw/proposals/3921 ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 2:35 ` Richard Stallman 2016-10-25 6:38 ` Paul Eggert @ 2016-10-25 16:04 ` Eli Zaretskii 2016-10-25 23:49 ` Richard Stallman 2016-10-25 23:49 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 1 sibling, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 16:04 UTC (permalink / raw) To: rms; +Cc: npostavs, eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: eggert@cs.ucla.edu, emacs-devel@gnu.org, > npostavs@users.sourceforge.net > Date: Mon, 24 Oct 2016 22:35:53 -0400 > > > > Should we talk with them about putting in those hooks or other > > > suitable hooks? Then we could go back to the libc malloc. > > > I think we tried, and more or less failed. (That was in the context > > of unexec, but the arguments are more or less similar.) > > How did it fail? My take is that the glibc developers don't really want to hear about keeping those hooks. > Did they give it a strong try? I don't know what that means in practice. What would make the try "strong"? You can see the discussion starting here: http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg00956.html You took some part in the discussion, at least its public part (I understand there was also an off-list part). I think once you said here: http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01633.html that you favored replacing unexec by a more portable scheme, there was no longer any reasons to make our argument stronger. Since then Paul implemented a workaround on the master branch, which uses gmalloc during dumping, and switches to the native malloc in the dumped executable. At the time, we didn't realize, I think, that removing the glibc hooks will cause GNU/Linux systems to start using ralloc.c, which is the trigger for the present discussion. 
The discovery of this issue means that the hope expressed in the Jan 2016 discussions, that Emacs versions before 25 would continue to be usable on GNU/Linux systems with a newer glibc, was too optimistic. Based on what we've learned the hard way during the last couple of weeks, I'd say that all Emacs versions before 25.2 (including 25.1) will be unstable on such GNU systems, to the degree of being almost unusable. E.g., one bug report related to this claims crashes inside GC once every 10-15 minutes, which is IMO unbearably frequent, especially given that segfaults during GC almost always cause loss of work (because auto-saving almost always fails). ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 16:04 ` Eli Zaretskii @ 2016-10-25 23:49 ` Richard Stallman 2016-10-26 5:08 ` Paul Eggert 2016-10-26 11:37 ` Eli Zaretskii 2016-10-25 23:49 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 1 sibling, 2 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-25 23:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I think once you said > here: > http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01633.html > that you favored replacing unexec by a more portable scheme, there was > no longer any reasons to make our argument stronger. In general, I'm in favor of a more portable method. But we don't have one now. Is it feasible to do? Is anyone working on one? If not, then I hope we can design, with the Glibc developers, a different set of hooks to allow us to make unexec work. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 23:49 ` Richard Stallman @ 2016-10-26 5:08 ` Paul Eggert 2016-10-26 11:46 ` Eli Zaretskii 2016-10-27 1:23 ` Richard Stallman 2016-10-26 11:37 ` Eli Zaretskii 1 sibling, 2 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-26 5:08 UTC (permalink / raw) To: rms, Eli Zaretskii; +Cc: emacs-devel, npostavs Richard Stallman wrote: > In general, I'm in favor of a more portable method. But we don't have > one now. Is it feasible to do? Is anyone working on one? Yes, it's feasible. It is on my list of things to do. Admittedly I'm stretched thin, and the approach I prefer (generating and then compiling C code) is not everybody's favorite. > If not, then I hope we can design, with the Glibc developers, a > different set of hooks to allow us to make unexec work. That would be more work than fixing Emacs, I expect. Plus, malloc hooks are not the only reason unexec is dicey. > We should withdraw 25.1, I think. I don't think that will help. Similar problems likely affect 24.5 and earlier versions, if they are built against bleeding-edge glibc and are configured in the default way. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-26 5:08 ` Paul Eggert @ 2016-10-26 11:46 ` Eli Zaretskii 2016-10-26 13:10 ` Noam Postavsky 2016-10-27 1:23 ` Richard Stallman 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-26 11:46 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, rms, npostavs > Cc: npostavs@users.sourceforge.net, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Tue, 25 Oct 2016 22:08:21 -0700 > > > We should withdraw 25.1, I think. > > I don't think that will help. Similar problems likely affect 24.5 and earlier > versions, if they are built against bleeding-edge glibc and are configured in > the default way. Indeed, I agree. People who first bump into this with Emacs 25.1 will have 25.2 soon enough (I hope). By contrast, those who will try to build Emacs 24.x on the newer GNU/Linux systems will be unable to resolve the instability problems exposed by using ralloc.c, except by back-porting patches we have just committed to the Emacs repository, which is not easy. Not sure what to do with the old versions, or whether anything can be done. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-26 11:46 ` Eli Zaretskii @ 2016-10-26 13:10 ` Noam Postavsky 2016-10-26 14:20 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Noam Postavsky @ 2016-10-26 13:10 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Paul Eggert, rms, Emacs developers On Wed, Oct 26, 2016 at 7:46 AM, Eli Zaretskii <eliz@gnu.org> wrote: >> Cc: npostavs@users.sourceforge.net, emacs-devel@gnu.org >> From: Paul Eggert <eggert@cs.ucla.edu> >> Date: Tue, 25 Oct 2016 22:08:21 -0700 >> >> > We should withdraw 25.1, I think. >> >> I don't think that will help. Similar problems likely affect 24.5 and earlier >> versions, if they are built against bleeding-edge glibc and are configured in >> the default way. > > Indeed, I agree. People who first bump into this with Emacs 25.1 will > have 25.2 soon enough (I hope). By contrast, those who will try to > build Emacs 24.x on the newer GNU/Linux systems will be unable to > resolve the instability problems exposed by using ralloc.c, except by > back-porting patches we have just committed to the Emacs repository, > which is not easy. Wouldn't configuring with REL_ALLOC=no work? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-26 13:10 ` Noam Postavsky @ 2016-10-26 14:20 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-26 14:20 UTC (permalink / raw) To: Noam Postavsky; +Cc: eggert, rms, emacs-devel > From: Noam Postavsky <npostavs@users.sourceforge.net> > Date: Wed, 26 Oct 2016 09:10:35 -0400 > Cc: Paul Eggert <eggert@cs.ucla.edu>, rms@gnu.org, Emacs developers <emacs-devel@gnu.org> > > Wouldn't configuring with REL_ALLOC=no work? It could, yes. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-26 5:08 ` Paul Eggert 2016-10-26 11:46 ` Eli Zaretskii @ 2016-10-27 1:23 ` Richard Stallman 2016-10-27 1:36 ` Paul Eggert 1 sibling, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-27 1:23 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Yes, it's feasible. It is on my list of things to do. Admittedly I'm stretched > thin, and the approach I prefer (generating and then compiling C code) is not > everybody's favorite. Could you explain that more? Does anyone want to implement another approach? -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 1:23 ` Richard Stallman @ 2016-10-27 1:36 ` Paul Eggert 2016-10-27 13:35 ` Perry E. Metzger ` (3 more replies) 0 siblings, 4 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-27 1:36 UTC (permalink / raw) To: rms; +Cc: eliz, emacs-devel, npostavs On 10/26/2016 06:23 PM, Richard Stallman wrote: > Could you explain that more? The main idea is to save the current Emacs state as C source code, then compile the (large and boring) .c file and relink Emacs with the resulting .o file instead of a dummy .o file that it would start off with. Most of this new .o file would be data; perhaps some would be code that would initialize the data, though we'd want to minimize this. > Does anyone want to implement another approach? Eli has mentioned a simpler approach, where we build an .elc file when saving Emacs state and load the .elc file during normal startup. The main worry about this approach is performance. ^ permalink raw reply [flat|nested] 375+ messages in thread
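The "save state as C source" idea described above can be sketched in miniature. This is only an illustration under stated assumptions: the function names, the `dumped_` prefix, and the restriction to integers and strings are all hypothetical, variable names are assumed to already be valid C identifiers, and real Emacs state would consist of Lisp objects with pointers between them, which this toy does not attempt to model.

```python
def c_string_literal(text):
    """Return TEXT as a C string literal, escaping non-printable bytes."""
    out = []
    for byte in text.encode("utf-8"):
        char = chr(byte)
        if char in ('"', "\\"):
            out.append("\\" + char)
        elif 32 <= byte < 127:
            out.append(char)
        else:
            # Three-digit octal escapes cannot absorb a following digit.
            out.append("\\%03o" % byte)
    return '"' + "".join(out) + '"'


def emit_state(variables):
    """Emit C source defining one initialized global per saved variable."""
    lines = ["/* Generated file: saved state as static data. */"]
    for name, value in sorted(variables.items()):
        if isinstance(value, int):
            lines.append("long dumped_%s = %dL;" % (name, value))
        elif isinstance(value, str):
            lines.append("const char dumped_%s[] = %s;"
                         % (name, c_string_literal(value)))
        else:
            raise TypeError("unsupported value for %s: %r" % (name, value))
    return "\n".join(lines) + "\n"


source = emit_state({"gc_threshold": 800000, "greeting": 'hi "x"\n'})
```

The generated text would then be compiled and linked in place of the dummy object file; since nearly all of it is initialized data, the startup cost would mostly be whatever the loader spends mapping that data.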
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 1:36 ` Paul Eggert @ 2016-10-27 13:35 ` Perry E. Metzger 2016-10-27 14:51 ` Paul Eggert 2016-10-27 13:44 ` Fabrice Popineau ` (2 subsequent siblings) 3 siblings, 1 reply; 375+ messages in thread From: Perry E. Metzger @ 2016-10-27 13:35 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, npostavs, rms, emacs-devel On Wed, 26 Oct 2016 18:36:02 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: > On 10/26/2016 06:23 PM, Richard Stallman wrote: > > Could you explain that more? > > The main idea is to save the current Emacs state as C source code, > then compile the (large and boring) .c file and relink Emacs with > the resulting .o file instead of a dummy .o file that it would > start off with. Most of this new .o file would be data; perhaps > some would be code that would initialize the data, though we'd want > to minimize this. Could the new dynamic loading feature be used here so it wouldn't be necessary to re-link emacs? Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 13:35 ` Perry E. Metzger @ 2016-10-27 14:51 ` Paul Eggert 2016-10-27 15:05 ` Perry E. Metzger 0 siblings, 1 reply; 375+ messages in thread From: Paul Eggert @ 2016-10-27 14:51 UTC (permalink / raw) To: Perry E. Metzger; +Cc: eliz, npostavs, Fabrice Popineau, rms, emacs-devel On 10/27/2016 06:35 AM, Perry E. Metzger wrote: > Could the new dynamic loading feature be used here so it wouldn't be necessary to re-link emacs? It might be doable, though I expect it'd be more work. Dynamic loading purposely isolates modules from Emacs internals, and most likely we'd need several bridges over that moat. On 10/27/2016 06:44 AM, Fabrice Popineau wrote: > I find it disturbing that a C compiler will be needed to redump Emacs. It's a tradeoff, yes. My impression is that the rare user who redumps Emacs typically has a C compiler installed or can easily install one, so it shouldn't be much to ask. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 14:51 ` Paul Eggert @ 2016-10-27 15:05 ` Perry E. Metzger 2016-10-27 18:13 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Perry E. Metzger @ 2016-10-27 15:05 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, npostavs, Fabrice Popineau, rms, emacs-devel On Thu, 27 Oct 2016 07:51:37 -0700 Paul Eggert <eggert@cs.ucla.edu> wrote: > On 10/27/2016 06:44 AM, Fabrice Popineau wrote: > > > I find it disturbing that a C compiler will be needed to redump > > Emacs. > It's a tradeoff, yes. My impression is that the rare user who > redumps Emacs typically has a C compiler installed or can easily > install one, so it shouldn't be much to ask. Agreed. On free operating systems it's easy (that's the whole point of freedom!), and even on non-free operating systems free compilers are available, so it isn't a big deal for most users I think. (On macOS the official XCode compiler is also available gratis.) Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 15:05 ` Perry E. Metzger @ 2016-10-27 18:13 ` Eli Zaretskii 2016-10-27 21:03 ` Perry E. Metzger 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-27 18:13 UTC (permalink / raw) To: Perry E. Metzger; +Cc: npostavs, eggert, fabrice.popineau, rms, emacs-devel > Date: Thu, 27 Oct 2016 11:05:03 -0400 > From: "Perry E. Metzger" <perry@piermont.com> > Cc: rms@gnu.org, eliz@gnu.org, emacs-devel@gnu.org, > npostavs@users.sourceforge.net, Fabrice Popineau > <fabrice.popineau@gmail.com> > > > It's a tradeoff, yes. My impression is that the rare user who > > redumps Emacs typically has a C compiler installed or can easily > > install one, so it shouldn't be much to ask. > > Agreed. On free operating systems it's easy (that's the whole point of > freedom!), and even on non-free operating systems free compilers are > available, so it isn't a big deal for most users I think. (On macOS > the official XCode compiler is also available gratis.) I can assure you that installing a fully functioning environment for compiling programs is not a trivial task on MS-Windows. It isn't enough to have just a compiler: you need Binutils, support libraries and header files, and a well-configured MSYS installation to be able to run Emacs build script and Makefiles. Also, please don't forget that some people run Emacs on machines where they are not system administrators and are not allowed to install arbitrary packages. Not everyone is in the same position as you and Paul (or myself). ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 18:13 ` Eli Zaretskii @ 2016-10-27 21:03 ` Perry E. Metzger 2016-10-27 21:07 ` Daniel Colascione 2016-10-28 7:03 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Perry E. Metzger @ 2016-10-27 21:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, fabrice.popineau, rms, emacs-devel On Thu, 27 Oct 2016 21:13:05 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > Date: Thu, 27 Oct 2016 11:05:03 -0400 > > From: "Perry E. Metzger" <perry@piermont.com> > > Cc: rms@gnu.org, eliz@gnu.org, emacs-devel@gnu.org, > > npostavs@users.sourceforge.net, Fabrice Popineau > > <fabrice.popineau@gmail.com> > > > > > It's a tradeoff, yes. My impression is that the rare user who > > > redumps Emacs typically has a C compiler installed or can easily > > > install one, so it shouldn't be much to ask. > > > > Agreed. On free operating systems it's easy (that's the whole > > point of freedom!), and even on non-free operating systems free > > compilers are available, so it isn't a big deal for most users I > > think. (On macOS the official XCode compiler is also available > > gratis.) > > I can assure you that installing a fully functioning environment for > compiling programs is not a trivial task on MS-Windows. It isn't > enough to have just a compiler: you need Binutils, support libraries > and header files, and a well-configured MSYS installation to be able > to run Emacs build script and Makefiles. > > Also, please don't forget that some people run Emacs on machines > where they are not system administrators and are not allowed to > install arbitrary packages. > > Not everyone is in the same position as you and Paul (or myself). > Sure, but most people never, ever undump an Emacs either unless they're building from scratch or doing Emacs dev work... -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 21:03 ` Perry E. Metzger @ 2016-10-27 21:07 ` Daniel Colascione 2016-10-27 23:23 ` Perry E. Metzger 2016-10-28 7:06 ` When should ralloc.c be used? (WAS: bug#24358) Eli Zaretskii 2016-10-28 7:03 ` Eli Zaretskii 1 sibling, 2 replies; 375+ messages in thread From: Daniel Colascione @ 2016-10-27 21:07 UTC (permalink / raw) To: Perry E. Metzger, Eli Zaretskii Cc: emacs-devel, eggert, fabrice.popineau, rms, npostavs On 10/27/2016 02:03 PM, Perry E. Metzger wrote: > On Thu, 27 Oct 2016 21:13:05 +0300 Eli Zaretskii <eliz@gnu.org> wrote: >>> Date: Thu, 27 Oct 2016 11:05:03 -0400 >>> From: "Perry E. Metzger" <perry@piermont.com> >>> Cc: rms@gnu.org, eliz@gnu.org, emacs-devel@gnu.org, >>> npostavs@users.sourceforge.net, Fabrice Popineau >>> <fabrice.popineau@gmail.com> >>> >>>> It's a tradeoff, yes. My impression is that the rare user who >>>> redumps Emacs typically has a C compiler installed or can easily >>>> install one, so it shouldn't be much to ask. >>> >>> Agreed. On free operating systems it's easy (that's the whole >>> point of freedom!), and even on non-free operating systems free >>> compilers are available, so it isn't a big deal for most users I >>> think. (On macOS the official XCode compiler is also available >>> gratis.) >> >> I can assure you that installing a fully functioning environment for >> compiling programs is not a trivial task on MS-Windows. It isn't >> enough to have just a compiler: you need Binutils, support libraries >> and header files, and a well-configured MSYS installation to be able >> to run Emacs build script and Makefiles. >> >> Also, please don't forget that some people run Emacs on machines >> where they are not system administrators and are not allowed to >> install arbitrary packages. >> >> Not everyone is in the same position as you and Paul (or myself). 
>> > > Sure, but most people never, ever undump an Emacs either unless > they're building from scratch or doing Emacs dev work... > That's because it doesn't really work. That's why I added code that explicitly stops repeated dumps. It doesn't mean I don't want to support user dumps. It pains me to see people tolerate 30 second Emacs startup times. The daemon is a hack. I want Emacs normal mode of operation to be to start from *user* *specific* saved state --- that way, all Emacs instances can be as fast as emacs -Q. If we need a compiler to make this happen, so be it. We'll just require libgcc, or hell, check it in to the repository, the way gcc checks in its dependencies. An additional benefit of integrating with a compiler at runtime is the potential to JIT elisp code. Both LLVM and GCC these days have usable JIT interfaces. We could even serialize JIT traces in these user Emacs dumps. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 21:07 ` Daniel Colascione @ 2016-10-27 23:23 ` Perry E. Metzger 2016-10-27 23:32 ` When should ralloc.c be used? Daniel Colascione 2016-10-28 7:06 ` When should ralloc.c be used? (WAS: bug#24358) Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Perry E. Metzger @ 2016-10-27 23:23 UTC (permalink / raw) To: Daniel Colascione Cc: eggert, rms, npostavs, fabrice.popineau, emacs-devel, Eli Zaretskii On Thu, 27 Oct 2016 14:07:46 -0700 Daniel Colascione <dancol@dancol.org> wrote: > If we need a compiler to make this happen, so be it. We'll just > require libgcc, or hell, check it in to the repository, the way gcc > checks in its dependencies. > > An additional benefit of integrating with a compiler at runtime is > the potential to JIT elisp code. Both LLVM and GCC these days have > usable JIT interfaces. We could even serialize JIT traces in these > user Emacs dumps. Having a JIT for emacs bytecode (or some other IR) would be really superb. I had no idea that GCC now had JIT support, but if it is as easy to use as LLVM's, a prototype would not be a hard project. (I presume RMS would insist on GCC as the basis.) Of course, given that Emacs already byte compiles everything, maybe going straight to machine code rather than the bytecode + JIT would be good? Again, I don't know what GCC's infra is like, but if it is as good as LLVM's that would be quite straightforward. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-27 23:23 ` Perry E. Metzger @ 2016-10-27 23:32 ` Daniel Colascione 0 siblings, 0 replies; 375+ messages in thread From: Daniel Colascione @ 2016-10-27 23:32 UTC (permalink / raw) To: Perry E. Metzger Cc: eggert, rms, npostavs, fabrice.popineau, emacs-devel, Eli Zaretskii "Perry E. Metzger" <perry@piermont.com> writes: > On Thu, 27 Oct 2016 14:07:46 -0700 Daniel Colascione > <dancol@dancol.org> wrote: >> If we need a compiler to make this happen, so be it. We'll just >> require libgcc, or hell, check it in to the repository, the way gcc >> checks in its dependencies. >> >> An additional benefit of integrating with a compiler at runtime is >> the potential to JIT elisp code. Both LLVM and GCC these days have >> usable JIT interfaces. We could even serialize JIT traces in these >> user Emacs dumps. > > Having a JIT for emacs bytecode (or some other IR) would be really > superb. I had no idea that GCC now had JIT support, but if it is as > easy to use as LLVM's, a prototype would not be a hard project. (I > presume RMS would insist on GCC as the basis.) GCC's interface isn't nearly as mature as LLVM's yet, but there's promise https://gcc.gnu.org/wiki/JIT > Of course, given that Emacs already byte compiles everything, maybe > going straight to machine code rather than the bytecode + JIT would > be good? Again, I don't know what GCC's infra is like, but if it is > as good as LLVM's that would be quite straightforward. AOT is all the rage right now (JEP 295), but I believe that tracing JITs are ultimately the right choice for code density and installation latency reasons. But this is one of those arguments that's never going to be solved. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 21:07 ` Daniel Colascione 2016-10-27 23:23 ` Perry E. Metzger @ 2016-10-28 7:06 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 7:06 UTC (permalink / raw) To: Daniel Colascione Cc: eggert, rms, npostavs, fabrice.popineau, emacs-devel, perry > Cc: npostavs@users.sourceforge.net, eggert@cs.ucla.edu, > fabrice.popineau@gmail.com, rms@gnu.org, emacs-devel@gnu.org > From: Daniel Colascione <dancol@dancol.org> > Date: Thu, 27 Oct 2016 14:07:46 -0700 > > > Sure, but most people never, ever undump an Emacs either unless > > they're building from scratch or doing Emacs dev work... > > > > That's because it doesn't really work. That's why I added code that > explicitly stops repeated dumps. It doesn't mean I don't want to support > user dumps. Exactly. > It pains me to see people tolerate 30 second Emacs startup times. The > daemon is a hack. I want Emacs normal mode of operation to be to start > from *user* *specific* saved state --- that way, all Emacs instances can > be as fast as emacs -Q. > > If we need a compiler to make this happen, so be it. But if there's a simpler method that doesn't get us anywhere near 30 sec, that should be "good enough". > We'll just require libgcc, or hell, check it in to the repository, You can't, not without having all the GCC sources. But that's an aside. > An additional benefit of integrating with a compiler at runtime is the > potential to JIT elisp code. Both LLVM and GCC these days have usable JIT > interfaces. We could even serialize JIT traces in these user Emacs dumps. This should be an opt-in feature, not a hard requirement, IMO. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 21:03 ` Perry E. Metzger 2016-10-27 21:07 ` Daniel Colascione @ 2016-10-28 7:03 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 7:03 UTC (permalink / raw) To: Perry E. Metzger; +Cc: npostavs, eggert, fabrice.popineau, rms, emacs-devel > Date: Thu, 27 Oct 2016 17:03:45 -0400 > From: "Perry E. Metzger" <perry@piermont.com> > Cc: eggert@cs.ucla.edu, rms@gnu.org, emacs-devel@gnu.org, > npostavs@users.sourceforge.net, fabrice.popineau@gmail.com > > > Not everyone is in the same position as you and Paul (or myself). > > > > Sure, but most people never, ever undump an Emacs either unless > they're building from scratch or doing Emacs dev work... Oh, so now we are going to argue that a feature that can't be easily had is not important? Then I'll claim that the Emacs startup time is not important, either, because "most people never, ever" start Emacs except when their machine starts, and their Emacs session is thereafter running for weeks and months without ever restarting. Let's agree to respect other people's usage patterns and circumstances, even if they are different from ours. Emacs is great because it allows so many different patterns, so preferring one of them too much is something we should avoid. If each one of us sees only their personal needs as the most important ones, we will never be able to agree on anything. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 1:36 ` Paul Eggert 2016-10-27 13:35 ` Perry E. Metzger @ 2016-10-27 13:44 ` Fabrice Popineau 2016-10-27 15:35 ` Eli Zaretskii 2016-10-27 20:39 ` Richard Stallman 2016-10-27 20:40 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 3 siblings, 1 reply; 375+ messages in thread From: Fabrice Popineau @ 2016-10-27 13:44 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, Noam Postavsky, rms, Emacs developers 2016-10-27 3:36 GMT+02:00 Paul Eggert <eggert@cs.ucla.edu>: > On 10/26/2016 06:23 PM, Richard Stallman wrote: > >> Could you explain that more? >> > > The main idea is to save the current Emacs state as C source code, then > compile the (large and boring) .c file and relink Emacs with the resulting > .o file instead of a dummy .o file that it would start off with. Most of > this new .o file would be data; perhaps some would be code that would > initialize the data, though we'd want to minimize this. > > I find it disturbing that a C compiler will be needed to redump Emacs. Am I alone? Fabrice ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 13:44 ` Fabrice Popineau @ 2016-10-27 15:35 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-27 15:35 UTC (permalink / raw) To: Fabrice Popineau; +Cc: npostavs, eggert, rms, emacs-devel > From: Fabrice Popineau <fabrice.popineau@gmail.com> > Date: Thu, 27 Oct 2016 15:44:27 +0200 > Cc: rms@gnu.org, Eli Zaretskii <eliz@gnu.org>, Emacs developers <emacs-devel@gnu.org>, > Noam Postavsky <npostavs@users.sourceforge.net> > > The main idea is to save the current Emacs state as C source code, then compile the (large and > boring) .c file and relink Emacs with the resulting .o file instead of a dummy .o file that it would start off > with. Most of this new .o file would be data; perhaps some would be code that would initialize the data, > though we'd want to minimize this. > > I find it disturbing that a C compiler will be needed to redump Emacs. > Am I alone ? No. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 1:36 ` Paul Eggert 2016-10-27 13:35 ` Perry E. Metzger 2016-10-27 13:44 ` Fabrice Popineau @ 2016-10-27 20:39 ` Richard Stallman 2016-10-28 6:48 ` Eli Zaretskii 2016-10-28 12:51 ` When should ralloc.c be used? Stefan Monnier 2016-10-27 20:40 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 3 siblings, 2 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-27 20:39 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Eli has mentioned a simpler approach, where we build an .elc file when > saving Emacs state and load the .elc file during normal startup. The > main worry about this approach is performance. There is no reason why we have to choose between C code and Lisp code. It's worth developing a special-purpose format for this, if that would be considerably faster. However, reading a file that specifies construction of objects is always going to be slower than copying a memory dump. What part of the time could we save with a different format? Here's an idea. We separate (1) creating objects from (2) storing them. We define several operations. We record all objects created by sequence number. We have these ways of creating an object. That object is given the next consecutive sequence number. * intern (specify symbol name) * variable value (specify variable name) * string (specify contents) * integer (specify value) * cons (specify two sequence numbers) * list (specify N sequence numbers) * array (specify N sequence numbers) * expression (specify the expression textually; it gets evalled) And these ways of storing the last object. 
* store in a variable (specify symbol name) * store in a function cell (specify symbol name) * store in car of cons cell (specify its sequence number) * store in cdr of cons cell (specify its sequence number) * store in array element (specify array sequence number and index) * store in symbol property (specify symbol sequence number and property name) * store using expression (specify a lambda expression with one arg, textually) * store in special internal place (specify a code number to say which place) And one storage reclaimer * discard all sequence numbers back to N. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
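A minimal sketch of how a loader for the record format proposed above might work, written in Python rather than Emacs's C purely for illustration. The record names ("intern", "setcdr", "discard", and so on), the `Loader` class, and the reading of "discard back to N" as "keep only sequence numbers below N" are all assumptions, not anything specified in the message. It covers only a subset of the listed operations and shows that a store record can make an earlier object point to a later one, or to itself.

```python
class Symbol:
    def __init__(self, name):
        self.name = name


class Cons:
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


class Loader:
    """Hypothetical reader for a create/store record stream."""

    def __init__(self):
        self.objects = []    # list index == sequence number
        self.obarray = {}    # name -> Symbol
        self.variables = {}  # variable name -> value

    def intern(self, name):
        return self.obarray.setdefault(name, Symbol(name))

    def run(self, records):
        objs = self.objects
        for op, *args in records:
            # Creation records: each appends one object, which
            # thereby receives the next sequence number.
            if op == "intern":
                objs.append(self.intern(args[0]))
            elif op == "variable-value":
                objs.append(self.variables[args[0]])
            elif op in ("string", "integer"):
                objs.append(args[0])
            elif op == "cons":
                objs.append(Cons(objs[args[0]], objs[args[1]]))
            elif op == "list":
                tail = None  # None stands in for nil
                for seq in reversed(args):
                    tail = Cons(objs[seq], tail)
                objs.append(tail)
            elif op == "array":
                objs.append([objs[seq] for seq in args])
            # Store records: each puts the most recently created
            # object somewhere.
            elif op == "set-variable":
                self.variables[args[0]] = objs[-1]
            elif op == "setcar":
                objs[args[0]].car = objs[-1]
            elif op == "setcdr":
                objs[args[0]].cdr = objs[-1]
            elif op == "aset":
                objs[args[0]][args[1]] = objs[-1]
            # Storage reclaimer: forget sequence numbers >= N.
            elif op == "discard":
                del objs[args[0]:]
            else:
                raise ValueError(f"unknown record: {op}")


loader = Loader()
loader.run([
    ("integer", 1),            # seq 0
    ("cons", 0, 0),            # seq 1: (1 . 1)
    ("setcdr", 1),             # last object is seq 1: its cdr now points to itself
    ("set-variable", "ring"),  # ring = that circular cons
    ("string", "hello"),       # seq 2
    ("intern", "greeting"),    # seq 3
    ("array", 2, 3),           # seq 4
    ("set-variable", "vec"),
    ("discard", 0),            # sequence table emptied; stored values survive
])
```

The sequence table is only a naming device during loading, so discarding entries does not affect objects that were already stored in variables.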
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 20:39 ` Richard Stallman @ 2016-10-28 6:48 ` Eli Zaretskii 2016-10-28 19:12 ` Richard Stallman 2016-10-28 12:51 ` When should ralloc.c be used? Stefan Monnier 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 6:48 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, npostavs > From: Richard Stallman <rms@gnu.org> > CC: eliz@gnu.org, npostavs@users.sourceforge.net, emacs-devel@gnu.org > Date: Thu, 27 Oct 2016 16:39:59 -0400 > > > Eli has mentioned a simpler approach, where we build an .elc file when > > saving Emacs state and load the .elc file during normal startup. The > > main worry about this approach is performance. > > There is no reason why we have to choose between C code and Lisp code. > It's worth developing a special-purpose format for this, if that would > be considerably faster. The Lisp approach has a huge advantage: it is much simpler, so everyone here will understand it, and it is much easier to maintain and develop. So if the performance hit is bearable (meaning will be accepted by the crowd), it should IMO be preferred for reasons of project management and its future, even though faster methods exist. IOW, the goal of bringing the unexec out of the shadows of system-level black magic it is now should stomp the "faster is always better" principle, if we care about the future of Emacs in the face of the fact that fewer and fewer people know, or even want to know, about segments and offsets in a binary executable file. And speaking about performance: I suggest people who worry about that start by comparing startup times of past versions of Emacs. Using this simple benchmark proposed by Andreas: time emacs -batch --eval t I see that we've been consistently adding 10% of startup time with each major release, beginning with Emacs 23, so 25.1 starts about 25% slower than 22.3. 
If that didn't cause an outcry, then perhaps our concern over differences of this order of magnitude in startup time is a tad exaggerated? ^ permalink raw reply [flat|nested] 375+ messages in thread
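The startup comparison described above could be scripted along these lines. This is only a sketch in Python: the helper names `time_command` and `slowdown_percent` are hypothetical, and any paths to particular Emacs builds would have to be supplied by whoever runs it.

```python
import subprocess
import time


def time_command(argv, runs=5):
    """Return the best wall-clock time in seconds over `runs` runs of argv."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def slowdown_percent(old_seconds, new_seconds):
    """How much slower the new time is than the old one, in percent."""
    return (new_seconds / old_seconds - 1.0) * 100.0


# Example use (binary paths are placeholders, not real locations):
#   old = time_command(["/path/to/older/emacs", "-batch", "--eval", "t"])
#   new = time_command(["/path/to/newer/emacs", "-batch", "--eval", "t"])
#   print(f"{slowdown_percent(old, new):.1f}% slower")
```

Taking the best of several runs reduces noise from disk caching and other processes, which matters when the times being compared are fractions of a second.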
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-28 6:48 ` Eli Zaretskii @ 2016-10-28 19:12 ` Richard Stallman 2016-10-29 6:37 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-28 19:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > The Lisp approach has a huge advantage: it is much simpler, so > everyone here will understand it, and it is much easier to maintain > and develop. The special format I propose is simple enough. > So if the performance hit is bearable (meaning will be accepted by the > crowd), it should IMO be preferred for reasons of project management > and its future, Slowness here affects every user and is quite noticeable. Don't we already know that Lisp is too slow for this? It is worth substantial extra effort to speed this up. > care about the future of Emacs in the face of the fact that fewer and > fewer people know, or even want to know, about segments and offsets in > a binary executable file. That is an argument for replacing unexec with something that saves the data to reload, but it is not an argument for using Lisp as the format. > Using > this simple benchmark proposed by Andreas: > time emacs -batch --eval t I just tried it with my current build (from June). It took .26 seconds, which is fast enough. If replacing unexec with loading Lisp takes .05 seconds more, I won't complain. But I think it will take several seconds, if not minutes. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-28 19:12 ` Richard Stallman @ 2016-10-29 6:37 ` Eli Zaretskii 2016-10-29 14:55 ` When should ralloc.c be used? Stefan Monnier 2016-10-29 16:38 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 0 siblings, 2 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-29 6:37 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, npostavs > From: Richard Stallman <rms@gnu.org> > CC: eggert@cs.ucla.edu, npostavs@users.sourceforge.net, > emacs-devel@gnu.org > Date: Fri, 28 Oct 2016 15:12:27 -0400 > > > The Lisp approach has a huge advantage: it is much simpler, so > > everyone here will understand it, and it is much easier to maintain > > and develop. > > The special format I propose is simple enough. For you and me (and a few others), maybe. For most of the current Emacs contributors it's nowhere near "simple enough", because it requires one to be familiar with intimate details of Emacs object design and implementation. IOW, for the purposes of this discussion, I consider anything that is not mostly Lisp "not simple". > > So if the performance hit is bearable (meaning will be accepted by the > > crowd), it should IMO be preferred for reasons of project management > > and its future, > > Slowness here affects every user and is quite noticeable. > Don't we already know that Lisp is too slow for this? No, we don't know that, because we never tried to implement any method of reading compiled Lisp that is optimized for speed and targets a bare Emacs. E.g., it turned out that most of the time it takes 'loadup' to do its job is due to the linear search of pure strings in find_string_data_in_pure, called by make_pure_string. If we call 'loadup' upon every startup, the need for pure storage goes away, and the 'loadup' time can be sped up tenfold. And that is even before making all of the preloaded files a single file, which speeds up things at least twofold more, according to my measurements. 
So here you have a 20-fold speedup just by two very simple measures. > > care about the future of Emacs in the face of the fact that fewer and > > fewer people know, or even want to know, about segments and offsets in > > a binary executable file. > > That is an argument for replacing unexec with something that saves the > data to reload, but it is not an argument for using Lisp as the format. It is an argument for both, because I don't think we can count on too many people here being able to tinker with Lisp object internals in the future. The fewer such features we have that will need maintenance, the better for Emacs's viability in the long run. > > time emacs -batch --eval t > > I just tried it with my current build (from June). It took .26 seconds, > which is fast enough. > > If replacing unexec with loading Lisp takes .05 seconds more, I won't > complain. But I think it will take several seconds, if not minutes. How long does it take on your system to do this: time src/temacs -batch -l loadup And if you modify Emacs with the patch posted here: http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01049.html how long does it take temacs to loadup then? ^ permalink raw reply [flat|nested] 375+ messages in thread
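To illustrate why searching pure storage for an existing copy of each string gets expensive, here is a simplified model in Python. It is not Emacs's actual C code: `SearchingPool` and `AppendingPool` are hypothetical names, and the model ignores details such as terminators and alignment. It only shows the trade-off between a compact pool that scans everything stored so far on every insertion and a larger pool that simply appends.

```python
class SearchingPool:
    """Shares string data by scanning the whole pool before each insertion."""

    def __init__(self):
        self.data = bytearray()

    def add(self, s: bytes) -> int:
        # The scan cost grows with the amount of data already stored,
        # so filling the pool takes time roughly quadratic in its size.
        pos = self.data.find(s)
        if pos < 0:
            pos = len(self.data)
            self.data += s
        return pos


class AppendingPool:
    """Never searches: constant work per insertion, no sharing."""

    def __init__(self):
        self.data = bytearray()

    def add(self, s: bytes) -> int:
        pos = len(self.data)
        self.data += s
        return pos


strings = [b"foo", b"bar", b"foo", b"oba"]

searching = SearchingPool()
searching_offsets = [searching.add(s) for s in strings]

appending = AppendingPool()
appending_offsets = [appending.add(s) for s in strings]

# The searching pool is smaller because repeated data is shared,
# but every add() had to look through everything stored so far.
print(len(searching.data), len(appending.data))
```

The extra compaction pays off when the result is dumped once and reused; if the pool is rebuilt on every startup, the scanning cost is paid every time.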
* Re: When should ralloc.c be used? 2016-10-29 6:37 ` Eli Zaretskii @ 2016-10-29 14:55 ` Stefan Monnier 2016-10-30 16:13 ` Eli Zaretskii 2016-10-29 16:38 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-29 14:55 UTC (permalink / raw) To: emacs-devel > E.g., it turned out that most of the time it takes 'loadup' to do its > job is due to the linear search of pure strings in > find_string_data_in_pure, called by make_pure_string. Indeed. hash-consing pure objects takes another very significant percentage of the time. > If we call 'loadup' upon every startup, the need for pure storage goes > away, and the 'loadup' time can be sped up tenfold. Actually, there's no *need* for pure storage in either case. There are benefits to the use of pure storage, and some of them remain even if we don't dump: one of the benefits that remains is to reduce the size of the GC'd heap and hence speed up the GC. Whether that's significant enough to bother with it is of course debatable. But note also that we could keep the use of pure-space without paying the hefty price of find_string_data_in_pure (or hash-consing), since these merely try to make the purespace more compact. The extra cost is OK when we dump the result, but it's not worth the trouble if we don't dump. > And that is even before making all of the preloaded files a single > file, which speeds up things at least twofold more, according to > my measurements. BTW, how large is that single file (I'm curious if its size is significantly different from the 3.2MB I got for my dumped.elc)? Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-29 14:55 ` When should ralloc.c be used? Stefan Monnier @ 2016-10-30 16:13 ` Eli Zaretskii 2016-10-30 21:47 ` Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-30 16:13 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sat, 29 Oct 2016 10:55:20 -0400 > > BTW, how large is that single file (I'm curious if its size is > significantly different from the 3.2MB I got for my dumped.elc)? Its size is 4.2MB. It's basically a concatenation of all the preloaded *.elc files, with all but a single preamble removed. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-30 16:13 ` Eli Zaretskii @ 2016-10-30 21:47 ` Stefan Monnier 0 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-30 21:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel >> BTW, how large is that single file (I'm curious if its size is >> significantly different from the 3.2MB I got for my dumped.elc)? > Its size is 4.2MB. It's basically a concatenation of all the > preloaded *.elc files, with all but a single preamble removed. OK, so it's pretty much exactly the same size as what I get (the 3.2MB I get turns into 4.2MB if I print each function/var separately, thus preventing sharing between them, which is what happens in the normal .elc files). Good to know that there's no significant difference between the two in this regard, thanks. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-29 6:37 ` Eli Zaretskii 2016-10-29 14:55 ` When should ralloc.c be used? Stefan Monnier @ 2016-10-29 16:38 ` Richard Stallman 2016-10-29 21:57 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-29 16:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > For you and me (and a few others), maybe. For most of the current > Emacs contributors it's nowhere near "simple enough", because it > requires one to be familiar with intimate details of Emacs object > design and implementation. No it doesn't. The code to look at objects and output them this way wouldn't have to know any more about how they are represented than the code for Fprinc. It would operate using the usual macros for decomposing objects. The idea that all C code should be regarded as unmaintainable is a nonstarter. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-29 16:38 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman @ 2016-10-29 21:57 ` Eli Zaretskii 2016-10-31 19:18 ` Richard Stallman 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-29 21:57 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, npostavs > From: Richard Stallman <rms@gnu.org> > CC: eggert@cs.ucla.edu, npostavs@users.sourceforge.net, > emacs-devel@gnu.org > Date: Sat, 29 Oct 2016 12:38:48 -0400 > > > For you and me (and a few others), maybe. For most of the current > > Emacs contributors it's nowhere near "simple enough", because it > > requires one to be familiar with intimate details of Emacs object > > design and implementation. > > No it doesn't. The code to look at objects and output them this way > wouldn't have to know any more about how they are represented > than the code for Fprinc. The code like in princ (actually in its subroutines) is exactly what I think we should try to avoid. > The idea that all C code should be regarded as unmaintainable > is a nonstarter. I didn't say it will be unmaintainable, I said its maintenance will be harder than of Lisp code. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-29 21:57 ` Eli Zaretskii @ 2016-10-31 19:18 ` Richard Stallman 2016-10-31 20:58 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-31 19:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I didn't say it will be unmaintainable, I said its maintenance will be > harder than of Lisp code. That's no horrible thing. Speeding up startup is important. If some C code is an effective way to do it, we shouldn't reject that just because of a general preference for Lisp code. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-31 19:18 ` Richard Stallman @ 2016-10-31 20:58 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-31 20:58 UTC (permalink / raw) To: rms; +Cc: npostavs, eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: eggert@cs.ucla.edu, emacs-devel@gnu.org, > npostavs@users.sourceforge.net > Date: Mon, 31 Oct 2016 15:18:38 -0400 > > > I didn't say it will be unmaintainable, I said its maintenance will be > > harder than of Lisp code. > > That's no horrible thing. "Horrible" is in the eyes of the beholder. I think keeping Emacs as maintainable as possible is very important for its future. > If some C code is an effective way to do it, we shouldn't > reject that just because of a general preference for Lisp code. I'm not rejecting it, just explaining why it shouldn't be the first priority, IMO. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-27 20:39 ` Richard Stallman 2016-10-28 6:48 ` Eli Zaretskii @ 2016-10-28 12:51 ` Stefan Monnier 1 sibling, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 12:51 UTC (permalink / raw) To: emacs-devel > There is no reason why we have to choose between C code and Lisp code. > It's worth developing a special-purpose format for this, if that would > be considerably faster. We're still investigating how much time can be gained just by optimizing lread.c. But yes, maybe another format (which could also be used for .elc files, of course) would allow reading faster. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 1:36 ` Paul Eggert ` (2 preceding siblings ...) 2016-10-27 20:39 ` Richard Stallman @ 2016-10-27 20:40 ` Richard Stallman 2016-10-27 22:34 ` Paul Eggert 2016-10-28 6:55 ` Eli Zaretskii 3 siblings, 2 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-27 20:40 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Eli has mentioned a simpler approach, where we build an .elc file when > saving Emacs state and load the .elc file during normal startup. The > main worry about this approach is performance. Any such scheme has this problem: how to find all the places where initialization has stored some sort of value? They do not all have ways to access them and set them from Lisp. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 20:40 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman @ 2016-10-27 22:34 ` Paul Eggert 2016-10-28 2:40 ` Richard Stallman 2016-10-28 2:40 ` Richard Stallman 2016-10-28 6:55 ` Eli Zaretskii 1 sibling, 2 replies; 375+ messages in thread From: Paul Eggert @ 2016-10-27 22:34 UTC (permalink / raw) To: rms; +Cc: emacs-devel On 10/27/2016 01:40 PM, Richard Stallman wrote: > Any such scheme has this problem: > how to find all the places where initialization has stored some sort > of value? They do not all have ways to access them and set them from Lisp. My impression is that most such initializations are so small and fast that we needn't worry about saving and restoring their state. We can simply redo the initialization when Emacs starts up again - this will be the default behavior if we leave the temacs initialization code alone, which means we'd get this for very little maintenance effort. Any counterexamples we can handle specially, by saving and restoring their state by hand (so to speak). Something along the lines of your idea for storage creation should work, though we'll have to be careful about destructive operations like setcar that can cause an object with an earlier sequence number to point to an object with a later sequence number. It's not clear whether it has significant advantages over the C-based approach. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 22:34 ` Paul Eggert @ 2016-10-28 2:40 ` Richard Stallman 2016-10-28 2:40 ` Richard Stallman 1 sibling, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-28 2:40 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Something along the lines of your idea for storage creation should work, > though we'll have to be careful about destructive operations like setcar > that can cause an object with an earlier sequence number to point to an > object with a later sequence number. Why so? The purpose of sequence numbers is simply so you can refer to the objects already made -- not for proving some theorems of well-foundedness. Cycles should not be a problem. > It's not clear whether it has > significant advantages over the C-based approach. Here are two: * You don't need a C compiler to dump Emacs. * You don't need to relink to dump Emacs. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 22:34 ` Paul Eggert 2016-10-28 2:40 ` Richard Stallman @ 2016-10-28 2:40 ` Richard Stallman 2016-10-28 7:21 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-28 2:40 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Any such scheme has this problem: > > how to find all the places where initialization has stored some sort > > of value? They do not all have ways to access them and set them from Lisp. > My impression is that most such initializations are so small and fast > that we needn't worry about saving and restoring their state. We can > simply redo the initialization when Emacs starts up again - this will be > the default behavior if we leave the temacs initialization code alone, > which means we'd get this for very little maintenance effort. It could be so, but someone will have to try it. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-28 2:40 ` Richard Stallman @ 2016-10-28 7:21 ` Eli Zaretskii 0 siblings, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 7:21 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > Date: Thu, 27 Oct 2016 22:40:01 -0400 > Cc: emacs-devel@gnu.org > > > My impression is that most such initializations are so small and fast > > that we needn't worry about saving and restoring their state. We can > > simply redo the initialization when Emacs starts up again - this will be > > the default behavior if we leave the temacs initialization code alone, > > which means we'd get this for very little maintenance effort. > > It could be so, but someone will have to try it. We already have a CANNOT_DUMP configuration, which does precisely that and is used by some systems, so this code is in reasonably good shape. It's just a question of making it fast enough. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-27 20:40 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 2016-10-27 22:34 ` Paul Eggert @ 2016-10-28 6:55 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-28 6:55 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, npostavs > From: Richard Stallman <rms@gnu.org> > CC: eliz@gnu.org, npostavs@users.sourceforge.net, emacs-devel@gnu.org > Date: Thu, 27 Oct 2016 16:40:14 -0400 > > > Eli has mentioned a simpler approach, where we build an .elc file when > > saving Emacs state and load the .elc file during normal startup. The > > main worry about this approach is performance. > > Any such scheme has this problem: > how to find all the places where initialization has stored some sort > of value? They do not all have ways to access them and set them from Lisp. The part that must be done in C, like DEFUN etc. will be done at startup of the Emacs session. The part that stores values in variables exposed to Lisp will be either moved to startup.el, or done at startup in C. A few variables whose value depends on the build directory and other stuff that is only known at build time will be recorded in a special Lisp file created by the build and loaded at startup as part of the .elc file described above. Btw, Paul's description about "saving state" in a .elc file is inaccurate: most of that file is just the preloaded Lisp packages, like simple.el, subr.el, etc. There's very little of saved state there, because that state will simply be created at startup, each time Emacs starts. In source terms, most of the "if (!initialized)" parts will be run each time Emacs starts. Does that answer your question? ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 23:49 ` Richard Stallman 2016-10-26 5:08 ` Paul Eggert @ 2016-10-26 11:37 ` Eli Zaretskii 2016-10-27 1:24 ` Richard Stallman 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-26 11:37 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, npostavs > From: Richard Stallman <rms@gnu.org> > CC: npostavs@users.sourceforge.net, eggert@cs.ucla.edu, > emacs-devel@gnu.org > Date: Tue, 25 Oct 2016 19:49:33 -0400 > > > I think once you said > > here: > > > http://lists.gnu.org/archive/html/emacs-devel/2016-01/msg01633.html > > > that you favored replacing unexec by a more portable scheme, there was > > no longer any reasons to make our argument stronger. > > In general, I'm in favor of a more portable method. But we don't have > one now. Is it feasible to do? Is anyone working on one? We are warming up. There are a few ideas, but I'm not sure we have decided which one is the best yet. > If not, then I hope we can design, with the Glibc developers, a > different set of hooks to allow us to make unexec work. Frankly, I think that ship has sailed, and cannot be turned around. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-26 11:37 ` Eli Zaretskii @ 2016-10-27 1:24 ` Richard Stallman 2016-10-28 12:57 ` When should ralloc.c be used? Stefan Monnier 0 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-27 1:24 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > If not, then I hope we can design, with the Glibc developers, a > > different set of hooks to allow us to make unexec work. > Frankly, I think that ship has sailed, and cannot be turned around. I think that is an exaggeration. They got rid of ONE set of hooks for specific practical reasons. Maybe we can design a different set of hooks which do the job and which are not a problem for them to support. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-27 1:24 ` Richard Stallman @ 2016-10-28 12:57 ` Stefan Monnier 2016-10-28 19:13 ` Richard Stallman 0 siblings, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 12:57 UTC (permalink / raw) To: emacs-devel > I think that is an exaggeration. They got rid of ONE set of hooks for > specific practical reasons. Maybe we can design a different set of > hooks which do the job and which are not a problem for them to > support. While that is true, I think there is very little motivation to go down that road even in Emacs's side: this glibc-malloc "issue" is just one more nail in the unexec coffin, so even if we can find a way back we'd still be stuck with the problem of doing unexec with address randomization (for example), and maintenance of unexec (which has proved less problematic than I expected, over the years, admittedly, but remains a source of worry). Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 12:57 ` When should ralloc.c be used? Stefan Monnier @ 2016-10-28 19:13 ` Richard Stallman 2016-10-28 22:46 ` Stefan Monnier 2016-10-29 6:39 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-28 19:13 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I think that is an exaggeration. They got rid of ONE set of hooks for > > specific practical reasons. Maybe we can design a different set of > > hooks which do the job and which are not a problem for them to > > support. > While that is true, I think there is very little motivation to go down > that road even in Emacs's side: this glibc-malloc "issue" is just one > more nail in the unexec coffin, so even if we can find a way back we'd > still be stuck with the problem of doing unexec with address > randomization (for example), and maintenance of unexec (which has proved I just did 'time temacs -batch -l loadup'. It took over 18 seconds. We have a long way to go to make that fast enough. Perhaps what we need is to dump that data verbatim in a format chosen by us, then relocate all the pointers if necessary. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
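The idea of dumping data verbatim and then relocating pointers can be sketched as follows, in Python for brevity. Everything here is an assumption made for illustration: the 8-byte little-endian word size, the explicit list of pointer-slot offsets stored alongside the image, and the `relocate` function name. It is not a description of any existing Emacs dump format.

```python
import struct

WORD = struct.Struct("<Q")  # assumed: 8-byte little-endian machine words


def relocate(image, pointer_offsets, dump_base, load_base):
    """Return a copy of image with each pointer slot adjusted for load_base.

    pointer_offsets lists the byte offsets of words that held absolute
    addresses when the image was dumped at dump_base.
    """
    out = bytearray(image)
    delta = load_base - dump_base
    for off in pointer_offsets:
        (ptr,) = WORD.unpack_from(out, off)
        WORD.pack_into(out, off, (ptr + delta) % 2**64)
    return out


DUMP_BASE = 0x1000
LOAD_BASE = 0x5000

# Two cons-like cells of two words each. Cell 0 holds the integer 42 and
# a pointer to cell 1 (at byte offset 16); cell 1 holds 7 and a zero word.
image = b"".join(WORD.pack(w) for w in [42, DUMP_BASE + 16, 7, 0])
pointer_offsets = [8]  # only the second word of cell 0 is a pointer

loaded = relocate(image, pointer_offsets, DUMP_BASE, LOAD_BASE)
words = [WORD.unpack_from(loaded, i)[0] for i in range(0, len(loaded), 8)]
```

If the image happens to be loaded at the address it was dumped from, the delta is zero and the loop changes nothing, which matches the "if necessary" in the suggestion above.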
* Re: When should ralloc.c be used? 2016-10-28 19:13 ` Richard Stallman @ 2016-10-28 22:46 ` Stefan Monnier 2016-10-29 16:35 ` Richard Stallman 2016-10-29 6:39 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Stefan Monnier @ 2016-10-28 22:46 UTC (permalink / raw) To: emacs-devel > I just did 'time temacs -batch -l loadup'. It took over 18 seconds. > We have a long way to go to make that fast enough. It's easy to bring it down to 1s: the current loadup.el spends most of its time in things that are not terribly important (to try and scrape a few bytes here and there: worthwhile in the context of a one-off dump step, but not that important otherwise). Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 22:46 ` Stefan Monnier @ 2016-10-29 16:35 ` Richard Stallman 0 siblings, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-29 16:35 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > It's easy to bring it down to 1s: the current loadup.el spend most of > its time in things that are not terribly important (to try and scrape a few > bytes here and there: worthwhile in the context of a one-off dump step, > but not that important otherwise). Please show us. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-28 19:13 ` Richard Stallman 2016-10-28 22:46 ` Stefan Monnier @ 2016-10-29 6:39 ` Eli Zaretskii 2016-10-29 16:37 ` Richard Stallman 1 sibling, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-29 6:39 UTC (permalink / raw) To: rms; +Cc: monnier, emacs-devel > From: Richard Stallman <rms@gnu.org> > Date: Fri, 28 Oct 2016 15:13:19 -0400 > Cc: emacs-devel@gnu.org > > I just did 'time temacs -batch -l loadup'. It took over 18 seconds. > We have a long way to go to make that fast enough. Try the patch I pointed to in a previous message, and see what kind of speedup is possible by 2 simple measures. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-29 6:39 ` Eli Zaretskii @ 2016-10-29 16:37 ` Richard Stallman 2016-10-29 21:51 ` Eli Zaretskii 0 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-29 16:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I just did 'time temacs -batch -l loadup'. It took over 18 seconds. > > We have a long way to go to make that fast enough. > Try the patch I pointed to in a previous message, and see what kind of > speedup is possible by 2 simple measures. It would be a lot of work for me to try that myself, and I think it is not necessary if you already tried it. What fractional speedup did you observe when you tried it? -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-29 16:37 ` Richard Stallman @ 2016-10-29 21:51 ` Eli Zaretskii 2016-10-30 11:33 ` Richard Stallman 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-29 21:51 UTC (permalink / raw) To: rms; +Cc: monnier, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: monnier@iro.umontreal.ca, emacs-devel@gnu.org > Date: Sat, 29 Oct 2016 12:37:34 -0400 > > > Try the patch I pointed to in a previous message, and see what kind of > > speedup is possible by 2 simple measures. > > It would be a lot of work for me to try that myself, and I think it is > not necessary if you already tried it. What fractional speedup did > you observe when you tried it? About 20, as I wrote. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-29 21:51 ` Eli Zaretskii @ 2016-10-30 11:33 ` Richard Stallman 2016-10-30 15:33 ` Alp Aker 2016-10-30 16:08 ` Eli Zaretskii 0 siblings, 2 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-30 11:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: monnier, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > It would be a lot of work for me to try that myself, and I think it is > > not necessary if you already tried it. What fractional speedup did > > you observe when you tried it? > About 20, as I wrote. Sorry, I am not sure what 20 means here. Was it a speedup of 20%? A factor of 20? If it means 20%, it would be a good step, but much more would be required to make non-dumping ok for starting Emacs. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-30 11:33 ` Richard Stallman @ 2016-10-30 15:33 ` Alp Aker 2016-10-30 17:19 ` Richard Stallman 2016-10-30 16:08 ` Eli Zaretskii 1 sibling, 1 reply; 375+ messages in thread From: Alp Aker @ 2016-10-30 15:33 UTC (permalink / raw) To: rms; +Cc: Eli Zaretskii, Emacs devel [-- Attachment #1: Type: text/plain, Size: 680 bytes --] On Sun, Oct 30, 2016 at 7:33 AM, Richard Stallman <rms@gnu.org> wrote: > Sorry, I am not sure what 20 means here. > Was it a speedup of 20%? > A factor of 20? He meant a factor of 20. Here's the original comment: > E.g., it turned out that most of the time it takes 'loadup' to do its > job is due to the linear search of pure strings in > find_string_data_in_pure, called by make_pure_string. If we call > 'loadup' upon every startup, the need for pure storage goes away, and > the 'loadup' time can be sped up tenfold. And that is even before > making all of the preloaded files a single file, which speeds up > things at least twofold more, according to my measurements. [-- Attachment #2: Type: text/html, Size: 984 bytes --] ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-30 15:33 ` Alp Aker @ 2016-10-30 17:19 ` Richard Stallman 0 siblings, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-30 17:19 UTC (permalink / raw) To: Alp Aker; +Cc: eliz, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > Sorry, I am not sure what 20 means here. > > Was it a speedup of 20%? > > A factor of 20? > He meant a factor of 20. Here's the original comment: With a 20-times speedup, maybe it is fast enough. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? 2016-10-30 11:33 ` Richard Stallman 2016-10-30 15:33 ` Alp Aker @ 2016-10-30 16:08 ` Eli Zaretskii 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-30 16:08 UTC (permalink / raw) To: rms; +Cc: monnier, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: monnier@iro.umontreal.ca, emacs-devel@gnu.org > Date: Sun, 30 Oct 2016 07:33:12 -0400 > > > > It would be a lot of work for me to try that myself, and I think it is > > > not necessary if you already tried it. What fractional speedup did > > > you observe when you tried it? > > > About 20, as I wrote. > > Sorry, I am not sure what 20 means here. > Was it a speedup of 20%? > A factor of 20? A factor of 20. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 16:04 ` Eli Zaretskii 2016-10-25 23:49 ` Richard Stallman @ 2016-10-25 23:49 ` Richard Stallman 1 sibling, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-25 23:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, npostavs [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Based on what > we've learned the hard way during the last couple of weeks, I'd say > that all the Emacs versions before 25.2 (including 25.1) will be > unstable on such GNU systems to the degree of making them almost > unusable. We should withdraw 25.1, I think. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 17:09 ` Eli Zaretskii 2016-10-25 2:35 ` Richard Stallman @ 2016-10-25 2:35 ` Richard Stallman 2016-10-25 16:05 ` Eli Zaretskii 2016-10-25 23:00 ` Perry E. Metzger 2 siblings, 1 reply; 375+ messages in thread From: Richard Stallman @ 2016-10-25 2:35 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I think the way to fix those is by systematically looking at the > > source for them, rather than by debugging. > Yes, but finding out whether this is so is not easy, because the > malloc call is sometimes buried very deep. There are programs that determine call trees. We could find these problems by analyzing the output. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 2:35 ` Richard Stallman @ 2016-10-25 16:05 ` Eli Zaretskii 2016-10-27 1:22 ` Richard Stallman 0 siblings, 1 reply; 375+ messages in thread From: Eli Zaretskii @ 2016-10-25 16:05 UTC (permalink / raw) To: rms; +Cc: npostavs, eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > CC: eggert@cs.ucla.edu, emacs-devel@gnu.org, > npostavs@users.sourceforge.net > Date: Mon, 24 Oct 2016 22:35:55 -0400 > > > > I think the way to fix those is by systematically looking at the > > > source for them, rather than by debugging. > > > Yes, but finding out whether this is so is not easy, because the > > malloc call is sometimes buried very deep. > > There are programs that determine call trees. We could find these > problems by analyzing the output. Yes, but the real problem is to determine whether the code needs any changes at all. For that, one must understand the control flow, and figure out whether pointers to buffer text are used across malloc calls without any updates. This is the hardest part, because pointers are frequently passed down to subroutines and to their subroutines, which use them or call malloc only under certain conditions. For example, it could be that a subroutine only calls malloc if the passed-in pointer does not originate from a buffer object. This analysis is what makes the source study hard. Anyway, I think I just finished hunting and fixing those cases, so the only remaining issue is with regex.c functions, for which we have a patch that will most probably do the job. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 16:05 ` Eli Zaretskii @ 2016-10-27 1:22 ` Richard Stallman 0 siblings, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-27 1:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: npostavs, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Yes, but the real problem is to determine whether the code needs any > changes at all. For that, one must understand the control flow, and > figure out whether pointers to buffer text are used across malloc > calls without any updates. That is true. But this can find the functions that need to be checked, because they make buffer pointers and they indirectly call malloc. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-24 17:09 ` Eli Zaretskii 2016-10-25 2:35 ` Richard Stallman 2016-10-25 2:35 ` Richard Stallman @ 2016-10-25 23:00 ` Perry E. Metzger 2016-10-26 2:37 ` Eli Zaretskii 2016-10-27 1:25 ` Richard Stallman 2 siblings, 2 replies; 375+ messages in thread From: Perry E. Metzger @ 2016-10-25 23:00 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, emacs-devel, rms, npostavs On Mon, 24 Oct 2016 20:09:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > > It imposes hard-to-fulfill requirements on functions that get > > > C pointers to buffer text or to Lisp string data: those > > > functions must never call malloc, directly or indirectly. > > > > I think the way to fix those is by systematically looking at the > > source for them, rather than by debugging. > > Yes, but finding out whether this is so is not easy, because the > malloc call is sometimes buried very deep. Could this be found by doing a debugging build where malloc aborts in the conditions where it can't be called directly or indirectly? Then one could just run that way and find the instances pretty easily. Perry -- Perry E. Metzger perry@piermont.com ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 23:00 ` Perry E. Metzger @ 2016-10-26 2:37 ` Eli Zaretskii 2016-10-27 1:25 ` Richard Stallman 1 sibling, 0 replies; 375+ messages in thread From: Eli Zaretskii @ 2016-10-26 2:37 UTC (permalink / raw) To: Perry E. Metzger; +Cc: eggert, emacs-devel, rms, npostavs > Date: Tue, 25 Oct 2016 19:00:45 -0400 > From: "Perry E. Metzger" <perry@piermont.com> > Cc: rms@gnu.org, npostavs@users.sourceforge.net, eggert@cs.ucla.edu, > emacs-devel@gnu.org > > On Mon, 24 Oct 2016 20:09:32 +0300 Eli Zaretskii <eliz@gnu.org> wrote: > > > > It imposes hard-to-fulfill requirements on functions that get > > > > C pointers to buffer text or to Lisp string data: those > > > > functions must never call malloc, directly or indirectly. > > > > > > I think the way to fix those is by systematically looking at the > > > source for them, rather than by debugging. > > > > Yes, but finding out whether this is so is not easy, because the > > malloc call is sometimes buried very deep. > > Could this be found by doing a debugging build where malloc > aborts in the conditions where it can't be called directly or > indirectly? I don't know how to define those conditions. If you have concrete suggestions, please describe them. ^ permalink raw reply [flat|nested] 375+ messages in thread
* Re: When should ralloc.c be used? (WAS: bug#24358) 2016-10-25 23:00 ` Perry E. Metzger 2016-10-26 2:37 ` Eli Zaretskii @ 2016-10-27 1:25 ` Richard Stallman 1 sibling, 0 replies; 375+ messages in thread From: Richard Stallman @ 2016-10-27 1:25 UTC (permalink / raw) To: Perry E. Metzger; +Cc: npostavs, eliz, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Could this be found by doing a debugging build where malloc > aborts in the conditions where it can't be called directly or > indirectly? We could make the functions that create pointers into buffers also increment a global counter when they do that, and decrement the counter when done. malloc would abort if the counter is nonzero. The hard part would be arranging to reset the counter to zero when there is a nonlocal exit out of such a region. -- Dr Richard Stallman President, Free Software Foundation (gnu.org, fsf.org) Internet Hall-of-Famer (internethalloffame.org) Skype: No way! See stallman.org/skype.html. ^ permalink raw reply [flat|nested] 375+ messages in thread
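[Editorial illustration, not part of the original message.] The counter idea proposed above might look roughly like the following sketch. All names (`buffer_pointer_depth`, `begin_buffer_pointer`, `end_buffer_pointer`, `checked_malloc`, `run_with_catch`) are hypothetical and chosen for the example; nothing here is claimed to exist in Emacs. The nonlocal exit is modelled with setjmp/longjmp, and the "hard part" is handled in the simplest possible way, by restoring the depth saved at the catch point.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Greater than zero while some code holds a raw pointer into buffer text. */
static int buffer_pointer_depth;

/* Called when a raw pointer into buffer text is created. */
static void begin_buffer_pointer(void)
{
  buffer_pointer_depth++;
}

/* Called when the raw pointer is no longer used. */
static void end_buffer_pointer(void)
{
  assert(buffer_pointer_depth > 0);
  buffer_pointer_depth--;
}

/* Nonzero when no raw buffer pointer is live. */
static int malloc_allowed(void)
{
  return buffer_pointer_depth == 0;
}

/* Debug-build wrapper: abort if an allocation happens while a raw
   buffer pointer is live, otherwise allocate normally. */
static void *checked_malloc(size_t n)
{
  if (!malloc_allowed())
    abort();
  return malloc(n);
}

static jmp_buf catch_point;

/* A body that leaves its guarded region through a nonlocal exit,
   skipping the matching end_buffer_pointer call. */
static void body_that_exits_nonlocally(void)
{
  begin_buffer_pointer();
  longjmp(catch_point, 1);
}

/* Run `body`; if it exits via longjmp, restore the depth that was in
   effect when the catch point was established. */
static void run_with_catch(void (*body)(void))
{
  const int saved_depth = buffer_pointer_depth;
  if (setjmp(catch_point) == 0)
    body();
  else
    buffer_pointer_depth = saved_depth;
}
```

After `run_with_catch(body_that_exits_nonlocally)` the depth is back to its value at the catch point, so a later `checked_malloc` does not abort spuriously. A real implementation would have to hook every place where a nonlocal exit can land, which is exactly the difficulty the message points out; this sketch covers only a single catch point.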
* Re: When should ralloc.c be used? 2016-10-24 0:21 ` When should ralloc.c be used? (WAS: bug#24358) Richard Stallman 2016-10-24 3:59 ` Paul Eggert 2016-10-24 7:15 ` Eli Zaretskii @ 2016-10-24 14:04 ` Stefan Monnier 2 siblings, 0 replies; 375+ messages in thread From: Stefan Monnier @ 2016-10-24 14:04 UTC (permalink / raw) To: emacs-devel >> But that doesn't explain why we'd need to use ralloc in the mean time. > Why would we not want to use ralloc? It made a big improvement for > memory management when I wrote it. But that was before we were able to use mmap for the allocation of buffer memory, which is the main source of fragmentation AFAIK. Also the size of virtual and physical memory was quite different back then. Stefan ^ permalink raw reply [flat|nested] 375+ messages in thread