unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: Eli Zaretskii <eliz@gnu.org>
Cc: jrm@ftfl.ca, mattiase@acm.org, 37006@debbugs.gnu.org
Subject: bug#37006: 27.0.50; garbage collection not happening after 26de2d42
Date: Sat, 14 Sep 2019 00:51:18 -0700	[thread overview]
Message-ID: <b66de82b-d22d-e8fa-22dd-0022db5a5b78@cs.ucla.edu> (raw)
In-Reply-To: <83o90qqzqw.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 334 bytes --]

On 8/15/19 7:17 AM, Eli Zaretskii wrote:
> OK, but I still hope we will switch to EMACS_INT instead.

I installed the attached patch into master, and it changes the relevant 
threshold variables to EMACS_INT, as part of a somewhat-larger cleanup. This 
should make a practical difference only on 32-bit Emacs without --with-wide-int.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Improve-gc-cons-percentage-calculation.patch --]
[-- Type: text/x-patch; name="0001-Improve-gc-cons-percentage-calculation.patch", Size: 12388 bytes --]

From bac66302e92bdd3a353102d2076548e7e83d92e5 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 14 Sep 2019 00:32:01 -0700
Subject: [PATCH] Improve gc-cons-percentage calculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The old calculation relied on a hodgpodge of partly updated GC
stats to find a number to multiply gc-cons-percentage by.
The new one counts data found by the previous GC, plus half of
the data allocated since then; this is more systematic albeit
still ad hoc.
* src/alloc.c (consing_until_gc, gc_threshold, consing_threshold):
Now EMACS_INT, not intmax_t.
(HI_THRESHOLD): New macro.
(tally_consing): New function.
(make_interval, allocate_string, allocate_string_data)
(make_float, free_cons, allocate_vectorlike, Fmake_symbol): Use it.
(allow_garbage_collection, inhibit_garbage_collection)
(consing_threshold, garbage_collect):
Use HI_THRESHOLD rather than INTMAX_MAX.
(consing_threshold): New arg SINCE_GC.  All callers changed.
(bump_consing_until_gc): Return new consing_until_gc, instead of
nil.  All callers changed.  Don’t worry about overflow since we
now saturate at HI_THRESHOLD.  Guess that half of
recently-allocated objects are still alive, instead of relying on
the previous (even less-accurate) hodgepodge.
(maybe_garbage_collect): New function.
(garbage_collect): Work even if a finalizer disables or enables
memory profiling.  Do not use malloc_probe if GC reclaimed nothing.
* src/lisp.h (maybe_gc): Call maybe_garbage_collect instead
of garbage_collect.
---
 src/alloc.c | 127 ++++++++++++++++++++++++++++++----------------------
 src/lisp.h  |   5 ++-
 2 files changed, 77 insertions(+), 55 deletions(-)

diff --git a/src/alloc.c b/src/alloc.c
index ca8311cc00..497f600551 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -224,7 +224,7 @@ struct emacs_globals globals;
 
 /* maybe_gc collects garbage if this goes negative.  */
 
-intmax_t consing_until_gc;
+EMACS_INT consing_until_gc;
 
 #ifdef HAVE_PDUMPER
 /* Number of finalizers run: used to loop over GC until we stop
@@ -238,9 +238,16 @@ bool gc_in_progress;
 
 /* System byte and object counts reported by GC.  */
 
+/* Assume byte counts fit in uintptr_t and object counts fit into
+   intptr_t.  */
 typedef uintptr_t byte_ct;
 typedef intptr_t object_ct;
 
+/* Large-magnitude value for a threshold count, which fits in EMACS_INT.
+   Using only half the EMACS_INT range avoids overflow hassles.
+   There is no need to fit these counts into fixnums.  */
+#define HI_THRESHOLD (EMACS_INT_MAX / 2)
+
 /* Number of live and free conses etc. counted by the most-recent GC.  */
 
 static struct gcstat
@@ -299,7 +306,7 @@ static intptr_t garbage_collection_inhibited;
 
 /* The GC threshold in bytes, the last time it was calculated
    from gc-cons-threshold and gc-cons-percentage.  */
-static intmax_t gc_threshold;
+static EMACS_INT gc_threshold;
 
 /* If nonzero, this is a warning delivered by malloc and not yet
    displayed.  */
@@ -536,6 +543,15 @@ XFLOAT_INIT (Lisp_Object f, double n)
   XFLOAT (f)->u.data = n;
 }
 
+/* Account for allocation of NBYTES in the heap.  This is a separate
+   function to avoid hassles with implementation-defined conversion
+   from unsigned to signed types.  */
+static void
+tally_consing (ptrdiff_t nbytes)
+{
+  consing_until_gc -= nbytes;
+}
+
 #ifdef DOUG_LEA_MALLOC
 static bool
 pointers_fit_in_lispobj_p (void)
@@ -1372,7 +1388,7 @@ make_interval (void)
 
   MALLOC_UNBLOCK_INPUT;
 
-  consing_until_gc -= sizeof (struct interval);
+  tally_consing (sizeof (struct interval));
   intervals_consed++;
   RESET_INTERVAL (val);
   val->gcmarkbit = 0;
@@ -1739,7 +1755,7 @@ allocate_string (void)
   MALLOC_UNBLOCK_INPUT;
 
   ++strings_consed;
-  consing_until_gc -= sizeof *s;
+  tally_consing (sizeof *s);
 
 #ifdef GC_CHECK_STRING_BYTES
   if (!noninteractive)
@@ -1859,7 +1875,7 @@ allocate_string_data (struct Lisp_String *s,
       old_data->string = NULL;
     }
 
-  consing_until_gc -= needed;
+  tally_consing (needed);
 }
 
 
@@ -2464,7 +2480,7 @@ make_float (double float_value)
 
   XFLOAT_INIT (val, float_value);
   eassert (!XFLOAT_MARKED_P (XFLOAT (val)));
-  consing_until_gc -= sizeof (struct Lisp_Float);
+  tally_consing (sizeof (struct Lisp_Float));
   floats_consed++;
   return val;
 }
@@ -2535,8 +2551,8 @@ free_cons (struct Lisp_Cons *ptr)
   ptr->u.s.u.chain = cons_free_list;
   ptr->u.s.car = dead_object ();
   cons_free_list = ptr;
-  if (INT_ADD_WRAPV (consing_until_gc, sizeof *ptr, &consing_until_gc))
-    consing_until_gc = INTMAX_MAX;
+  ptrdiff_t nbytes = sizeof *ptr;
+  tally_consing (-nbytes);
 }
 
 DEFUN ("cons", Fcons, Scons, 2, 2, 0,
@@ -3153,7 +3169,7 @@ allocate_vectorlike (ptrdiff_t len)
   if (find_suspicious_object_in_range (p, (char *) p + nbytes))
     emacs_abort ();
 
-  consing_until_gc -= nbytes;
+  tally_consing (nbytes);
   vector_cells_consed += len;
 
   MALLOC_UNBLOCK_INPUT;
@@ -3438,7 +3454,7 @@ Its value is void, and its function definition and property list are nil.  */)
   MALLOC_UNBLOCK_INPUT;
 
   init_symbol (val, name);
-  consing_until_gc -= sizeof (struct Lisp_Symbol);
+  tally_consing (sizeof (struct Lisp_Symbol));
   symbols_consed++;
   return val;
 }
@@ -5477,7 +5493,7 @@ staticpro (Lisp_Object const *varaddress)
 static void
 allow_garbage_collection (intmax_t consing)
 {
-  consing_until_gc = consing - (INTMAX_MAX - consing_until_gc);
+  consing_until_gc = consing - (HI_THRESHOLD - consing_until_gc);
   garbage_collection_inhibited--;
 }
 
@@ -5487,7 +5503,7 @@ inhibit_garbage_collection (void)
   ptrdiff_t count = SPECPDL_INDEX ();
   record_unwind_protect_intmax (allow_garbage_collection, consing_until_gc);
   garbage_collection_inhibited++;
-  consing_until_gc = INTMAX_MAX;
+  consing_until_gc = HI_THRESHOLD;
   return count;
 }
 
@@ -5761,11 +5777,13 @@ mark_and_sweep_weak_table_contents (void)
     }
 }
 
-/* Return the number of bytes to cons between GCs, assuming
-   gc-cons-threshold is THRESHOLD and gc-cons-percentage is
-   PERCENTAGE.  */
-static intmax_t
-consing_threshold (intmax_t threshold, Lisp_Object percentage)
+/* Return the number of bytes to cons between GCs, given THRESHOLD and
+   PERCENTAGE.  When calculating a threshold based on PERCENTAGE,
+   assume SINCE_GC bytes have been allocated since the most recent GC.
+   The returned value is positive and no greater than HI_THRESHOLD.  */
+static EMACS_INT
+consing_threshold (intmax_t threshold, Lisp_Object percentage,
+		   intmax_t since_gc)
 {
   if (!NILP (Vmemory_full))
     return memory_full_cons_threshold;
@@ -5775,42 +5793,33 @@ consing_threshold (intmax_t threshold, Lisp_Object percentage)
       if (FLOATP (percentage))
 	{
 	  double tot = (XFLOAT_DATA (percentage)
-			* total_bytes_of_live_objects ());
+			* (total_bytes_of_live_objects () + since_gc));
 	  if (threshold < tot)
 	    {
-	      if (tot < INTMAX_MAX)
-		threshold = tot;
+	      if (tot < HI_THRESHOLD)
+		return tot;
 	      else
-		threshold = INTMAX_MAX;
+		return HI_THRESHOLD;
 	    }
 	}
-      return threshold;
+      return min (threshold, HI_THRESHOLD);
     }
 }
 
-/* Adjust consing_until_gc, assuming gc-cons-threshold is THRESHOLD and
-   gc-cons-percentage is PERCENTAGE.  */
-static Lisp_Object
+/* Adjust consing_until_gc and gc_threshold, given THRESHOLD and PERCENTAGE.
+   Return the updated consing_until_gc.  */
+
+static EMACS_INT
 bump_consing_until_gc (intmax_t threshold, Lisp_Object percentage)
 {
-  /* If consing_until_gc is negative leave it alone, since this prevents
-     negative integer overflow and a GC would have been done soon anyway.  */
-  if (0 <= consing_until_gc)
-    {
-      threshold = consing_threshold (threshold, percentage);
-      intmax_t sum;
-      if (INT_ADD_WRAPV (consing_until_gc, threshold - gc_threshold, &sum))
-	{
-	  /* Scale the threshold down so that consing_until_gc does
-	     not overflow.  */
-	  sum = INTMAX_MAX;
-	  threshold = INTMAX_MAX - consing_until_gc + gc_threshold;
-	}
-      consing_until_gc = sum;
-      gc_threshold = threshold;
-    }
-
-  return Qnil;
+  /* Guesstimate that half the bytes allocated since the most
+     recent GC are still in use.  */
+  EMACS_INT since_gc = (gc_threshold - consing_until_gc) >> 1;
+  EMACS_INT new_gc_threshold = consing_threshold (threshold, percentage,
+						  since_gc);
+  consing_until_gc += new_gc_threshold - gc_threshold;
+  gc_threshold = new_gc_threshold;
+  return consing_until_gc;
 }
 
 /* Watch changes to gc-cons-threshold.  */
@@ -5821,7 +5830,8 @@ watch_gc_cons_threshold (Lisp_Object symbol, Lisp_Object newval,
   intmax_t threshold;
   if (! (INTEGERP (newval) && integer_to_intmax (newval, &threshold)))
     return Qnil;
-  return bump_consing_until_gc (threshold, Vgc_cons_percentage);
+  bump_consing_until_gc (threshold, Vgc_cons_percentage);
+  return Qnil;
 }
 
 /* Watch changes to gc-cons-percentage.  */
@@ -5829,7 +5839,18 @@ static Lisp_Object
 watch_gc_cons_percentage (Lisp_Object symbol, Lisp_Object newval,
 			  Lisp_Object operation, Lisp_Object where)
 {
-  return bump_consing_until_gc (gc_cons_threshold, newval);
+  bump_consing_until_gc (gc_cons_threshold, newval);
+  return Qnil;
+}
+
+/* It may be time to collect garbage.  Recalculate consing_until_gc,
+   since it might depend on current usage, and do the garbage
+   collection if the recalculation says so.  */
+void
+maybe_garbage_collect (void)
+{
+  if (bump_consing_until_gc (gc_cons_threshold, Vgc_cons_percentage) < 0)
+    garbage_collect ();
 }
 
 /* Subroutine of Fgarbage_collect that does most of the work.  */
@@ -5841,7 +5862,6 @@ garbage_collect (void)
   bool message_p;
   ptrdiff_t count = SPECPDL_INDEX ();
   struct timespec start;
-  byte_ct tot_before = 0;
 
   eassert (weak_hash_tables == NULL);
 
@@ -5856,14 +5876,15 @@ garbage_collect (void)
   FOR_EACH_BUFFER (nextb)
     compact_buffer (nextb);
 
-  if (profiler_memory_running)
-    tot_before = total_bytes_of_live_objects ();
+  byte_ct tot_before = (profiler_memory_running
+			? total_bytes_of_live_objects ()
+			: (byte_ct) -1);
 
   start = current_timespec ();
 
   /* In case user calls debug_print during GC,
      don't let that cause a recursive GC.  */
-  consing_until_gc = INTMAX_MAX;
+  consing_until_gc = HI_THRESHOLD;
 
   /* Save what's currently displayed in the echo area.  Don't do that
      if we are GC'ing because we've run out of memory, since
@@ -5975,7 +5996,7 @@ garbage_collect (void)
   unblock_input ();
 
   consing_until_gc = gc_threshold
-    = consing_threshold (gc_cons_threshold, Vgc_cons_percentage);
+    = consing_threshold (gc_cons_threshold, Vgc_cons_percentage, 0);
 
   if (garbage_collection_messages && NILP (Vmemory_full))
     {
@@ -6008,11 +6029,11 @@ garbage_collect (void)
   gcs_done++;
 
   /* Collect profiling data.  */
-  if (profiler_memory_running)
+  if (tot_before != (byte_ct) -1)
     {
       byte_ct tot_after = total_bytes_of_live_objects ();
-      byte_ct swept = tot_before <= tot_after ? 0 : tot_before - tot_after;
-      malloc_probe (min (swept, SIZE_MAX));
+      if (tot_after < tot_before)
+	malloc_probe (min (tot_before - tot_after, SIZE_MAX));
     }
 }
 
diff --git a/src/lisp.h b/src/lisp.h
index 024e5edb26..02f8a7b668 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3824,9 +3824,10 @@ extern void mark_maybe_objects (Lisp_Object const *, ptrdiff_t);
 extern void mark_stack (char const *, char const *);
 extern void flush_stack_call_func (void (*func) (void *arg), void *arg);
 extern void garbage_collect (void);
+extern void maybe_garbage_collect (void);
 extern const char *pending_malloc_warning;
 extern Lisp_Object zero_vector;
-extern intmax_t consing_until_gc;
+extern EMACS_INT consing_until_gc;
 #ifdef HAVE_PDUMPER
 extern int number_finalizers_run;
 #endif
@@ -5056,7 +5057,7 @@ INLINE void
 maybe_gc (void)
 {
   if (consing_until_gc < 0)
-    garbage_collect ();
+    maybe_garbage_collect ();
 }
 
 INLINE_HEADER_END
-- 
2.17.1


  parent reply	other threads:[~2019-09-14  7:51 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <5075406D-6DB8-4560-BB64-7198526FCF9F@acm.org>
2019-08-11 16:23 ` bug#37006: 27.0.50; garbage collection not happening after 26de2d42 Mattias Engdegård
2019-08-11 17:07   ` Eli Zaretskii
     [not found] ` <83h86nu0pq.fsf@gnu.org>
     [not found]   ` <86pnlbphus.fsf@phe.ftfl.ca>
2019-08-12  2:31     ` Eli Zaretskii
2019-08-12 14:34       ` Joseph Mingrone
2019-08-12 16:49         ` Eli Zaretskii
2019-08-12 17:00           ` Mattias Engdegård
2019-08-13 15:37             ` Eli Zaretskii
2019-08-13 16:48               ` Mattias Engdegård
2019-08-13 17:04                 ` Eli Zaretskii
2019-08-13 17:29                   ` Mattias Engdegård
2019-08-13 17:21           ` Paul Eggert
2019-08-13 17:53             ` Eli Zaretskii
2019-08-13 19:32               ` Paul Eggert
2019-08-14 16:06                 ` Eli Zaretskii
2019-08-15  1:37                   ` Paul Eggert
2019-08-15 14:17                     ` Eli Zaretskii
2019-08-15 18:51                       ` Paul Eggert
2019-08-15 19:34                         ` Eli Zaretskii
2019-09-14  7:51                       ` Paul Eggert [this message]
2019-09-14  8:30                         ` Eli Zaretskii
2019-08-11 12:39 Joseph Mingrone
2019-08-11 15:13 ` Eli Zaretskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b66de82b-d22d-e8fa-22dd-0022db5a5b78@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=37006@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    --cc=jrm@ftfl.ca \
    --cc=mattiase@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).