From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.devel Subject: Re: GC and stack marking Date: Thu, 22 May 2014 18:20:04 +0300 Message-ID: <83mwe9n8m3.fsf@gnu.org> References: <83a9add91p.fsf@gnu.org> <8338g4bd7m.fsf@gnu.org> <83wqdgntan.fsf@gnu.org> <83ppj7oniv.fsf@gnu.org> <838upvnnsq.fsf@gnu.org> <83y4xvm2tg.fsf@gnu.org> Reply-To: Eli Zaretskii NNTP-Posting-Host: plane.gmane.org X-Trace: ger.gmane.org 1400772038 4633 80.91.229.3 (22 May 2014 15:20:38 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 22 May 2014 15:20:38 +0000 (UTC) Cc: fabrice.popineau@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Thu May 22 17:20:31 2014 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1WnUn2-0001Sz-Ix for ged-emacs-devel@m.gmane.org; Thu, 22 May 2014 17:20:16 +0200 Original-Received: from localhost ([::1]:38163 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WnUn1-0005ie-So for ged-emacs-devel@m.gmane.org; Thu, 22 May 2014 11:20:15 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:56841) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WnUms-0005dw-9o for emacs-devel@gnu.org; Thu, 22 May 2014 11:20:12 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WnUmm-0003ly-J7 for emacs-devel@gnu.org; Thu, 22 May 2014 11:20:06 -0400 Original-Received: from mtaout27.012.net.il ([80.179.55.183]:51731) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WnUmf-0003YL-OQ; Thu, 22 May 2014 11:19:54 -0400 Original-Received: from conversion-daemon.mtaout27.012.net.il by mtaout27.012.net.il (HyperSendmail v2007.08) id <0N5Z00K00EB3HR00@mtaout27.012.net.il>; Thu, 22 May 2014 18:16:37 +0300 (IDT) Original-Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout27.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N5Z00F23EFPF260@mtaout27.012.net.il>; Thu, 22 May 2014 18:16:37 +0300 (IDT) In-reply-to: <83y4xvm2tg.fsf@gnu.org> X-012-Sender: halo1@inter.net.il X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 80.179.55.183 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:172027 Archived-At: > Date: Wed, 21 May 2014 20:58:19 +0300 > From: Eli Zaretskii > Cc: fabrice.popineau@gmail.com, emacs-devel@gnu.org > > > From: Stefan Monnier > > Cc: emacs-devel@gnu.org, fabrice.popineau@gmail.com > > Date: Wed, 21 May 2014 13:40:21 -0400 > > > > > I already tried that before, and came up empty-handed. I tried again > > > now; the address of that value on the stack does not correspond to any > > > local variable in the corresponding stack frame, and I also cannot > > > find that address in the disassembly of the function whose stack frame > > > includes the value. > > > > It might simply be a slot that's unused by the current stack frame, > > whose value comes from some stack frame that existed some time in > > the past. > > That's probably what it is, yes. That's definitely what it is. The value gets onto the stack when loadup.el does this: (when (hash-table-p purify-flag) (let ((strings 0) (vectors 0) (bytecodes 0) (conses 0) (others 0)) (maphash (lambda (k v) (cond ((stringp k) (setq strings (1+ strings))) ((vectorp k) (setq vectors (1+ vectors))) ((consp k) (setq conses (1+ conses))) ((byte-code-function-p v) (setq bytecodes (1+ bytecodes))) (t (setq others (1+ others))))) purify-flag) (message "Pure-hashed: %d strings, %d vectors, %d conses, %d bytecodes, %d others" strings vectors conses bytecodes others))) The call to hash-table-p pushes the table address on the stack before calling Fhash_table_p, and it remains there until the call to mark_stack. > > > Now, I have a question: mark_stack stops examining the stack when it > > > gets to its own stack frame. That is certainly safe, but it sounds > > > too conservative: it should stop at the stack frame of > > > Fgarbage_collect, I think, because no live Lisp object can appear > > > while Fgarbage_collect runs, right? > > > > Sounds right, yes. > > I will try that and see if that helps. Of course, if my reading of > GDB data is correct, and the value was indeed in the > Fgarbage_collect's stack frame, it must help. It did help, at least in an unoptimized build. The suggested patch is below. It just reshuffles the existing code: we now determine the limit for searching the stack in Fgarbage_collect, and then call a subroutine that does what Fgarbage_collect should actually do. This way, none of the variables local to Fgarbage_collect or its stack will be searched by mark_stack. Is the patchy below OK for the trunk? Does anyone see anything problematic with it? --- src/alloc.c~ 2014-05-21 18:04:29 +0300 +++ src/alloc.c 2014-05-22 18:18:32 +0300 @@ -4880,61 +4880,8 @@ dump_zombies (void) from the stack start. */ static void -mark_stack (void) +mark_stack (void *end) { - void *end; - -#ifdef HAVE___BUILTIN_UNWIND_INIT - /* Force callee-saved registers and register windows onto the stack. - This is the preferred method if available, obviating the need for - machine dependent methods. */ - __builtin_unwind_init (); - end = &end; -#else /* not HAVE___BUILTIN_UNWIND_INIT */ -#ifndef GC_SAVE_REGISTERS_ON_STACK - /* jmp_buf may not be aligned enough on darwin-ppc64 */ - union aligned_jmpbuf { - Lisp_Object o; - sys_jmp_buf j; - } j; - volatile bool stack_grows_down_p = (char *) &j > (char *) stack_base; -#endif - /* This trick flushes the register windows so that all the state of - the process is contained in the stack. */ - /* Fixme: Code in the Boehm GC suggests flushing (with `flushrs') is - needed on ia64 too. See mach_dep.c, where it also says inline - assembler doesn't work with relevant proprietary compilers. */ -#ifdef __sparc__ -#if defined (__sparc64__) && defined (__FreeBSD__) - /* FreeBSD does not have a ta 3 handler. */ - asm ("flushw"); -#else - asm ("ta 3"); -#endif -#endif - - /* Save registers that we need to see on the stack. We need to see - registers used to hold register variables and registers used to - pass parameters. */ -#ifdef GC_SAVE_REGISTERS_ON_STACK - GC_SAVE_REGISTERS_ON_STACK (end); -#else /* not GC_SAVE_REGISTERS_ON_STACK */ - -#ifndef GC_SETJMP_WORKS /* If it hasn't been checked yet that - setjmp will definitely work, test it - and print a message with the result - of the test. */ - if (!setjmp_tested_p) - { - setjmp_tested_p = 1; - test_setjmp (); - } -#endif /* GC_SETJMP_WORKS */ - - sys_setjmp (j.j); - end = stack_grows_down_p ? (char *) &j + sizeof j : (char *) &j; -#endif /* not GC_SAVE_REGISTERS_ON_STACK */ -#endif /* not HAVE___BUILTIN_UNWIND_INIT */ /* This assumes that the stack is a contiguous region in memory. If that's not the case, something has to be done here to iterate @@ -5542,22 +5489,14 @@ mark_pinned_symbols (void) } } -DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "", - doc: /* Reclaim storage for Lisp objects no longer needed. -Garbage collection happens automatically if you cons more than -`gc-cons-threshold' bytes of Lisp data since previous garbage collection. -`garbage-collect' normally returns a list with info on amount of space in use, -where each entry has the form (NAME SIZE USED FREE), where: -- NAME is a symbol describing the kind of objects this entry represents, -- SIZE is the number of bytes used by each one, -- USED is the number of those objects that were found live in the heap, -- FREE is the number of those objects that are not live but that Emacs - keeps around for future allocations (maybe because it does not know how - to return them to the OS). -However, if there was overflow in pure space, `garbage-collect' -returns nil, because real GC can't be done. -See Info node `(elisp)Garbage Collection'. */) - (void) +/* Subroutine of Fgarbage_collect that does most of the work. It is a + separate function so that we could limit mark_stack in searching + the stack frames below this function, thus avoiding the rare cases + where mark_stack finds values that look like live Lisp objects on + portions of stack that couldn't possibly contain such live + objects. */ +static Lisp_Object +garbage_collect_1 (void *end) { struct buffer *nextb; char stack_top_variable; @@ -5655,7 +5594,7 @@ See Info node `(elisp)Garbage Collection #if (GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS \ || GC_MARK_STACK == GC_MARK_STACK_CHECK_GCPROS) - mark_stack (); + mark_stack (end); #else { register struct gcpro *tail; @@ -5678,7 +5617,7 @@ See Info node `(elisp)Garbage Collection #endif #if GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES - mark_stack (); + mark_stack (end); #endif /* Everything is now marked, except for the data in font caches @@ -5838,6 +5777,82 @@ See Info node `(elisp)Garbage Collection return retval; } +DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "", + doc: /* Reclaim storage for Lisp objects no longer needed. +Garbage collection happens automatically if you cons more than +`gc-cons-threshold' bytes of Lisp data since previous garbage collection. +`garbage-collect' normally returns a list with info on amount of space in use, +where each entry has the form (NAME SIZE USED FREE), where: +- NAME is a symbol describing the kind of objects this entry represents, +- SIZE is the number of bytes used by each one, +- USED is the number of those objects that were found live in the heap, +- FREE is the number of those objects that are not live but that Emacs + keeps around for future allocations (maybe because it does not know how + to return them to the OS). +However, if there was overflow in pure space, `garbage-collect' +returns nil, because real GC can't be done. +See Info node `(elisp)Garbage Collection'. */) + (void) +{ +#if (GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS \ + || GC_MARK_STACK == GC_MARK_STACK_CHECK_GCPROS \ + || GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES) + void *end; + +#ifdef HAVE___BUILTIN_UNWIND_INIT + /* Force callee-saved registers and register windows onto the stack. + This is the preferred method if available, obviating the need for + machine dependent methods. */ + __builtin_unwind_init (); + end = &end; +#else /* not HAVE___BUILTIN_UNWIND_INIT */ +#ifndef GC_SAVE_REGISTERS_ON_STACK + /* jmp_buf may not be aligned enough on darwin-ppc64 */ + union aligned_jmpbuf { + Lisp_Object o; + sys_jmp_buf j; + } j; + volatile bool stack_grows_down_p = (char *) &j > (char *) stack_base; +#endif + /* This trick flushes the register windows so that all the state of + the process is contained in the stack. */ + /* Fixme: Code in the Boehm GC suggests flushing (with `flushrs') is + needed on ia64 too. See mach_dep.c, where it also says inline + assembler doesn't work with relevant proprietary compilers. */ +#ifdef __sparc__ +#if defined (__sparc64__) && defined (__FreeBSD__) + /* FreeBSD does not have a ta 3 handler. */ + asm ("flushw"); +#else + asm ("ta 3"); +#endif +#endif + + /* Save registers that we need to see on the stack. We need to see + registers used to hold register variables and registers used to + pass parameters. */ +#ifdef GC_SAVE_REGISTERS_ON_STACK + GC_SAVE_REGISTERS_ON_STACK (end); +#else /* not GC_SAVE_REGISTERS_ON_STACK */ + +#ifndef GC_SETJMP_WORKS /* If it hasn't been checked yet that + setjmp will definitely work, test it + and print a message with the result + of the test. */ + if (!setjmp_tested_p) + { + setjmp_tested_p = 1; + test_setjmp (); + } +#endif /* GC_SETJMP_WORKS */ + + sys_setjmp (j.j); + end = stack_grows_down_p ? (char *) &j + sizeof j : (char *) &j; +#endif /* not GC_SAVE_REGISTERS_ON_STACK */ +#endif /* not HAVE___BUILTIN_UNWIND_INIT */ +#endif /* GC_MARK_STACK */ + return garbage_collect_1 (end); +} /* Mark Lisp objects in glyph matrix MATRIX. Currently the only interesting objects referenced from glyphs are strings. */