unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Eli Zaretskii <eliz@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: fabrice.popineau@gmail.com, monnier@iro.umontreal.ca,
	emacs-devel@gnu.org
Subject: Re: GC and stack marking
Date: Thu, 22 May 2014 18:20:04 +0300	[thread overview]
Message-ID: <83mwe9n8m3.fsf@gnu.org> (raw)
In-Reply-To: <83y4xvm2tg.fsf@gnu.org>

> Date: Wed, 21 May 2014 20:58:19 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: fabrice.popineau@gmail.com, emacs-devel@gnu.org
> 
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: emacs-devel@gnu.org,  fabrice.popineau@gmail.com
> > Date: Wed, 21 May 2014 13:40:21 -0400
> > 
> > > I already tried that before, and came up empty-handed.  I tried again
> > > now; the address of that value on the stack does not correspond to any
> > > local variable in the corresponding stack frame, and I also cannot
> > > find that address in the disassembly of the function whose stack frame
> > > includes the value.
> > 
> > It might simply be a slot that's unused by the current stack frame,
> > whose value comes from some stack frame that existed some time in
> > the past.
> 
> That's probably what it is, yes.

That's definitely what it is.  The value gets onto the stack when
loadup.el does this:

  (when (hash-table-p purify-flag)
    (let ((strings 0)
	  (vectors 0)
	  (bytecodes 0)
	  (conses 0)
	  (others 0))
      (maphash (lambda (k v)
		 (cond
		  ((stringp k) (setq strings (1+ strings)))
		  ((vectorp k) (setq vectors (1+ vectors)))
		  ((consp k)   (setq conses  (1+ conses)))
		  ((byte-code-function-p v) (setq bytecodes (1+ bytecodes)))
		  (t           (setq others  (1+ others)))))
	       purify-flag)
      (message "Pure-hashed: %d strings, %d vectors, %d conses, %d bytecodes, %d others"
	       strings vectors conses bytecodes others)))

The call to hash-table-p pushes the table address on the stack before
calling Fhash_table_p, and it remains there until the call to
mark_stack.

> > > Now, I have a question: mark_stack stops examining the stack when it
> > > gets to its own stack frame.  That is certainly safe, but it sounds
> > > too conservative: it should stop at the stack frame of
> > > Fgarbage_collect, I think, because no live Lisp object can appear
> > > while Fgarbage_collect runs, right?
> > 
> > Sounds right, yes.
> 
> I will try that and see if that helps.  Of course, if my reading of
> GDB data is correct, and the value was indeed in the
> Fgarbage_collect's stack frame, it must help.

It did help, at least in an unoptimized build.  The suggested patch is
below.  It just reshuffles the existing code: we now determine the
limit for searching the stack in Fgarbage_collect, and then call a
subroutine that does what Fgarbage_collect should actually do.  This
way, none of the variables local to Fgarbage_collect or its stack will
be searched by mark_stack.

Is the patchy below OK for the trunk?  Does anyone see anything
problematic with it?

--- src/alloc.c~	2014-05-21 18:04:29 +0300
+++ src/alloc.c	2014-05-22 18:18:32 +0300
@@ -4880,61 +4880,8 @@ dump_zombies (void)
    from the stack start.  */
 
 static void
-mark_stack (void)
+mark_stack (void *end)
 {
-  void *end;
-
-#ifdef HAVE___BUILTIN_UNWIND_INIT
-  /* Force callee-saved registers and register windows onto the stack.
-     This is the preferred method if available, obviating the need for
-     machine dependent methods.  */
-  __builtin_unwind_init ();
-  end = &end;
-#else /* not HAVE___BUILTIN_UNWIND_INIT */
-#ifndef GC_SAVE_REGISTERS_ON_STACK
-  /* jmp_buf may not be aligned enough on darwin-ppc64 */
-  union aligned_jmpbuf {
-    Lisp_Object o;
-    sys_jmp_buf j;
-  } j;
-  volatile bool stack_grows_down_p = (char *) &j > (char *) stack_base;
-#endif
-  /* This trick flushes the register windows so that all the state of
-     the process is contained in the stack.  */
-  /* Fixme: Code in the Boehm GC suggests flushing (with `flushrs') is
-     needed on ia64 too.  See mach_dep.c, where it also says inline
-     assembler doesn't work with relevant proprietary compilers.  */
-#ifdef __sparc__
-#if defined (__sparc64__) && defined (__FreeBSD__)
-  /* FreeBSD does not have a ta 3 handler.  */
-  asm ("flushw");
-#else
-  asm ("ta 3");
-#endif
-#endif
-
-  /* Save registers that we need to see on the stack.  We need to see
-     registers used to hold register variables and registers used to
-     pass parameters.  */
-#ifdef GC_SAVE_REGISTERS_ON_STACK
-  GC_SAVE_REGISTERS_ON_STACK (end);
-#else /* not GC_SAVE_REGISTERS_ON_STACK */
-
-#ifndef GC_SETJMP_WORKS  /* If it hasn't been checked yet that
-			    setjmp will definitely work, test it
-			    and print a message with the result
-			    of the test.  */
-  if (!setjmp_tested_p)
-    {
-      setjmp_tested_p = 1;
-      test_setjmp ();
-    }
-#endif /* GC_SETJMP_WORKS */
-
-  sys_setjmp (j.j);
-  end = stack_grows_down_p ? (char *) &j + sizeof j : (char *) &j;
-#endif /* not GC_SAVE_REGISTERS_ON_STACK */
-#endif /* not HAVE___BUILTIN_UNWIND_INIT */
 
   /* This assumes that the stack is a contiguous region in memory.  If
      that's not the case, something has to be done here to iterate
@@ -5542,22 +5489,14 @@ mark_pinned_symbols (void)
     }
 }
 
-DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "",
-       doc: /* Reclaim storage for Lisp objects no longer needed.
-Garbage collection happens automatically if you cons more than
-`gc-cons-threshold' bytes of Lisp data since previous garbage collection.
-`garbage-collect' normally returns a list with info on amount of space in use,
-where each entry has the form (NAME SIZE USED FREE), where:
-- NAME is a symbol describing the kind of objects this entry represents,
-- SIZE is the number of bytes used by each one,
-- USED is the number of those objects that were found live in the heap,
-- FREE is the number of those objects that are not live but that Emacs
-  keeps around for future allocations (maybe because it does not know how
-  to return them to the OS).
-However, if there was overflow in pure space, `garbage-collect'
-returns nil, because real GC can't be done.
-See Info node `(elisp)Garbage Collection'.  */)
-  (void)
+/* Subroutine of Fgarbage_collect that does most of the work.  It is a
+   separate function so that we could limit mark_stack in searching
+   the stack frames below this function, thus avoiding the rare cases
+   where mark_stack finds values that look like live Lisp objects on
+   portions of stack that couldn't possibly contain such live
+   objects.  */
+static Lisp_Object
+garbage_collect_1 (void *end)
 {
   struct buffer *nextb;
   char stack_top_variable;
@@ -5655,7 +5594,7 @@ See Info node `(elisp)Garbage Collection
 
 #if (GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS \
      || GC_MARK_STACK == GC_MARK_STACK_CHECK_GCPROS)
-  mark_stack ();
+  mark_stack (end);
 #else
   {
     register struct gcpro *tail;
@@ -5678,7 +5617,7 @@ See Info node `(elisp)Garbage Collection
 #endif
 
 #if GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES
-  mark_stack ();
+  mark_stack (end);
 #endif
 
   /* Everything is now marked, except for the data in font caches
@@ -5838,6 +5777,82 @@ See Info node `(elisp)Garbage Collection
   return retval;
 }
 
+DEFUN ("garbage-collect", Fgarbage_collect, Sgarbage_collect, 0, 0, "",
+       doc: /* Reclaim storage for Lisp objects no longer needed.
+Garbage collection happens automatically if you cons more than
+`gc-cons-threshold' bytes of Lisp data since previous garbage collection.
+`garbage-collect' normally returns a list with info on amount of space in use,
+where each entry has the form (NAME SIZE USED FREE), where:
+- NAME is a symbol describing the kind of objects this entry represents,
+- SIZE is the number of bytes used by each one,
+- USED is the number of those objects that were found live in the heap,
+- FREE is the number of those objects that are not live but that Emacs
+  keeps around for future allocations (maybe because it does not know how
+  to return them to the OS).
+However, if there was overflow in pure space, `garbage-collect'
+returns nil, because real GC can't be done.
+See Info node `(elisp)Garbage Collection'.  */)
+  (void)
+{
+#if (GC_MARK_STACK == GC_MAKE_GCPROS_NOOPS		\
+     || GC_MARK_STACK == GC_MARK_STACK_CHECK_GCPROS	\
+     || GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES)
+  void *end;
+
+#ifdef HAVE___BUILTIN_UNWIND_INIT
+  /* Force callee-saved registers and register windows onto the stack.
+     This is the preferred method if available, obviating the need for
+     machine dependent methods.  */
+  __builtin_unwind_init ();
+  end = &end;
+#else /* not HAVE___BUILTIN_UNWIND_INIT */
+#ifndef GC_SAVE_REGISTERS_ON_STACK
+  /* jmp_buf may not be aligned enough on darwin-ppc64 */
+  union aligned_jmpbuf {
+    Lisp_Object o;
+    sys_jmp_buf j;
+  } j;
+  volatile bool stack_grows_down_p = (char *) &j > (char *) stack_base;
+#endif
+  /* This trick flushes the register windows so that all the state of
+     the process is contained in the stack.  */
+  /* Fixme: Code in the Boehm GC suggests flushing (with `flushrs') is
+     needed on ia64 too.  See mach_dep.c, where it also says inline
+     assembler doesn't work with relevant proprietary compilers.  */
+#ifdef __sparc__
+#if defined (__sparc64__) && defined (__FreeBSD__)
+  /* FreeBSD does not have a ta 3 handler.  */
+  asm ("flushw");
+#else
+  asm ("ta 3");
+#endif
+#endif
+
+  /* Save registers that we need to see on the stack.  We need to see
+     registers used to hold register variables and registers used to
+     pass parameters.  */
+#ifdef GC_SAVE_REGISTERS_ON_STACK
+  GC_SAVE_REGISTERS_ON_STACK (end);
+#else /* not GC_SAVE_REGISTERS_ON_STACK */
+
+#ifndef GC_SETJMP_WORKS  /* If it hasn't been checked yet that
+			    setjmp will definitely work, test it
+			    and print a message with the result
+			    of the test.  */
+  if (!setjmp_tested_p)
+    {
+      setjmp_tested_p = 1;
+      test_setjmp ();
+    }
+#endif /* GC_SETJMP_WORKS */
+
+  sys_setjmp (j.j);
+  end = stack_grows_down_p ? (char *) &j + sizeof j : (char *) &j;
+#endif /* not GC_SAVE_REGISTERS_ON_STACK */
+#endif /* not HAVE___BUILTIN_UNWIND_INIT */
+#endif /* GC_MARK_STACK */
+  return garbage_collect_1 (end);
+}
 
 /* Mark Lisp objects in glyph matrix MATRIX.  Currently the
    only interesting objects referenced from glyphs are strings.  */



  reply	other threads:[~2014-05-22 15:20 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-19 16:31 GC and stack marking Eli Zaretskii
2014-05-19 18:47 ` Paul Eggert
2014-05-19 19:14   ` Eli Zaretskii
2014-05-19 19:58     ` Paul Eggert
2014-05-19 20:03       ` Eli Zaretskii
2014-05-19 20:17         ` Paul Eggert
2014-05-20 16:37           ` Eli Zaretskii
2014-05-20 13:44 ` Stefan Monnier
2014-05-20 16:57   ` Eli Zaretskii
2014-05-20 17:54     ` Stefan Monnier
2014-05-20 19:28       ` Eli Zaretskii
2014-05-20 22:01         ` Stefan Monnier
2014-05-21  2:48           ` Eli Zaretskii
2014-05-21  3:01             ` Stefan Monnier
2014-05-21 15:39               ` Eli Zaretskii
2014-05-21 15:57                 ` Dmitry Antipov
2014-05-21 16:06                   ` Dmitry Antipov
2014-05-21 16:55                     ` Eli Zaretskii
2014-05-21 16:53                   ` Eli Zaretskii
2014-05-21 17:40                 ` Stefan Monnier
2014-05-21 17:58                   ` Eli Zaretskii
2014-05-22 15:20                     ` Eli Zaretskii [this message]
2014-05-22 16:14                       ` Stefan Monnier
2014-05-24 12:03                         ` Eli Zaretskii
2014-05-20 19:12     ` Daniel Colascione
2014-05-20 19:43       ` Eli Zaretskii
2014-05-20 22:03         ` Stefan Monnier
2014-05-21  2:51           ` Eli Zaretskii
2014-05-31  6:31   ` Florian Weimer
2014-05-31 14:24     ` Stefan Monnier
  -- strict thread matches above, loose matches on Subject: below --
2014-05-21 19:31 Barry OReilly
2014-05-21 20:13 ` Eli Zaretskii
2014-05-21 20:49   ` Barry OReilly
2014-05-22  2:43     ` Eli Zaretskii
2014-05-22  3:12       ` Daniel Colascione
2014-05-22  5:37         ` David Kastrup
2014-05-22 13:57           ` Stefan Monnier
2014-05-22 15:49         ` Eli Zaretskii
2014-05-22 14:59       ` Barry OReilly
2014-05-22 17:03         ` Eli Zaretskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83mwe9n8m3.fsf@gnu.org \
    --to=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    --cc=fabrice.popineau@gmail.com \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).