unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
@ 2019-09-06 13:52 Michael Heerdegen
  2019-09-07 14:23 ` Michael Heerdegen
  2019-09-08  1:11 ` Paul Eggert
  0 siblings, 2 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-06 13:52 UTC (permalink / raw)
  To: 37321


Hi,

apparently due to the latest changes in gc (I noticed the problem today,
after doing a make bootstrap yesterday), el-searching is slowed down by
a factor of 10000 or so.  Profiling shows that nearly all of the time is
spent in gc.

To reproduce from emacs -Q:

M-x list-packages
Install el-search
M-x el-search-emacs-elisp-sources 'pcase M-RET

This will start an el-occur for the symbol 'pcase in the Emacs Elisp
sources.  This normally takes around 10 seconds but is now just
unusable.

(If you want me to bisect, do I have to make bootstrap, or does a simple
make suffice?)

Hope someone can help.

TIA,

Michael.





In GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.10)
 of 2019-09-06 built on drachen
Repository revision: 459b416e5f950c6127f10c9eca68d65bdab688d4
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description: Debian GNU/Linux bullseye/sid






^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-06 13:52 bug#37321: 27.0.50; Excessive gc in a use case (el-search) Michael Heerdegen
@ 2019-09-07 14:23 ` Michael Heerdegen
  2019-09-07 15:30   ` Michael Heerdegen
  2019-09-08  1:11 ` Paul Eggert
  1 sibling, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-07 14:23 UTC (permalink / raw)
  To: 37321

Michael Heerdegen <michael_heerdegen@web.de> writes:

> M-x el-search-emacs-elisp-sources 'pcase M-RET

I found the culprit.  It's this simple function:

#+begin_src emacs-lisp
(defun el-search--flatten-tree (tree)
  "Return a list of `el-search--atomic-p' objects in TREE."
  (let ((elements ())
        (walked-objects ;to avoid infinite recursion for circular TREEs
         (make-hash-table :test #'eq))
         (gc-cons-percentage 0.8)
        ) ;Why is binding it here more effective than binding it more top-level?
    (cl-labels ((walker (object)
                        (if (el-search--atomic-p object)
                            (push object elements)
                          (unless (gethash object walked-objects)
                            (puthash object t walked-objects)
                            (if (consp object)
                                (progn
                                  (while (consp object)
                                    (walker (car object))
                                    (setq object (cdr object))
                                    (if (gethash object walked-objects)
                                        (setq object nil)
                                      (puthash object t walked-objects)))
                                  (when object ;dotted list
                                    (walker object)))
                              (cl-loop for elt being the elements of object do (walker elt)))))))
      (walker tree)
      elements)))
#+end_src

I've put a lot of work in optimizing el-searching speed.  It now mainly
depends of effectiveness of gc, mostly in the above function.  When I
wrote it I noticed that it worked best when binding gc-cons-percentage
to a value of 0.8 (inside the function, see the code).  But now this
binding slows the thing down as hell.  When I remove the binding,
searching is fast again, but it still takes twice as long as in the past
when 0.8 worked.

I've no clue about how gc works and how the above function can be
optimized, and if gc is expected to be that slow now with the
gc-cons-percentage binding in effect.  Advice appreciated.

TIA,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-07 14:23 ` Michael Heerdegen
@ 2019-09-07 15:30   ` Michael Heerdegen
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-07 15:30 UTC (permalink / raw)
  To: 37321

Michael Heerdegen <michael_heerdegen@web.de> writes:

> #+begin_src emacs-lisp
> (defun el-search--flatten-tree (tree)
>   "Return a list of `el-search--atomic-p' objects in TREE."
>   (let ((elements ())
>         (walked-objects ;to avoid infinite recursion for circular TREEs
>          (make-hash-table :test #'eq))
>          (gc-cons-percentage 0.8)
>         ) ;Why is binding it here more effective than binding it more top-level?
>     (cl-labels ((walker (object)
>                         (if (el-search--atomic-p object)
>                             (push object elements)
>                           (unless (gethash object walked-objects)
>                             (puthash object t walked-objects)
>                             (if (consp object)
>                                 (progn
>                                   (while (consp object)
>                                     (walker (car object))
>                                     (setq object (cdr object))
>                                     (if (gethash object walked-objects)
>                                         (setq object nil)
>                                       (puthash object t walked-objects)))
>                                   (when object ;dotted list
>                                     (walker object)))
>                               (cl-loop for elt being the elements of object do (walker elt)))))))
>       (walker tree)
>       elements)))
> #+end_src

It doesn't seem to be related to the hash table usage - if I remove the
hash table test based infloop check part of the function, the behavior
wrt gc doesn't change.  So I wonder what in this function does it make
so much depend on gc behavior?

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-06 13:52 bug#37321: 27.0.50; Excessive gc in a use case (el-search) Michael Heerdegen
  2019-09-07 14:23 ` Michael Heerdegen
@ 2019-09-08  1:11 ` Paul Eggert
  2019-09-08 14:52   ` Michael Heerdegen
  1 sibling, 1 reply; 38+ messages in thread
From: Paul Eggert @ 2019-09-08  1:11 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 37321

[-- Attachment #1: Type: text/plain, Size: 85 bytes --]

Thanks for reporting the bug. I installed the attached patch; please give it a try.


[-- Attachment #2: 0001-Fix-bug-when-gc-cons-percentage-is-bumped-to-0.8.patch --]
[-- Type: text/x-patch, Size: 3907 bytes --]

From 4c31e7c3e4e70b38ab51a99d61e215aefe0190dd Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 7 Sep 2019 18:08:12 -0700
Subject: [PATCH] Fix bug when gc-cons-percentage is bumped to 0.8

Problem reported by Michael Heerdegen (Bug#37321).
* src/alloc.c (gc_threshold): New static var.
(bump_consing_until_gc): Change args from DIFF to THRESHOLD and
PERCENTAGE.  All uses changed.  When accounting for a changed
gc-cons-percentage, do not assume that total_bytes_of_live_objects
returns the same value now that it did the last time we were
called.
---
 src/alloc.c | 45 ++++++++++++++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/src/alloc.c b/src/alloc.c
index 5fc515f33b..be98cfd5f5 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -297,6 +297,10 @@ #define PUREBEG (char *) pure
 
 static intptr_t garbage_collection_inhibited;
 
+/* The GC threshold in bytes, the last time it was calculated
+   from gc-cons-threshold and gc-cons-percentage.  */
+static intmax_t gc_threshold;
+
 /* If nonzero, this is a warning delivered by malloc and not yet
    displayed.  */
 
@@ -5808,15 +5812,28 @@ consing_threshold (intmax_t threshold, Lisp_Object percentage)
     }
 }
 
-/* Increment consing_until_gc by DIFF, avoiding overflow.  */
+/* Adjust consing_until_gc, assuming gc-cons-threshold is THRESHOLD and
+   gc-cons-percentage is PERCENTAGE.  */
 static Lisp_Object
-bump_consing_until_gc (intmax_t diff)
+bump_consing_until_gc (intmax_t threshold, Lisp_Object percentage)
 {
   /* If consing_until_gc is negative leave it alone, since this prevents
      negative integer overflow and a GC would have been done soon anyway.  */
-  if (0 <= consing_until_gc
-      && INT_ADD_WRAPV (consing_until_gc, diff, &consing_until_gc))
-    consing_until_gc = INTMAX_MAX;
+  if (0 <= consing_until_gc)
+    {
+      threshold = consing_threshold (threshold, percentage);
+      intmax_t sum;
+      if (INT_ADD_WRAPV (consing_until_gc, threshold - gc_threshold, &sum))
+	{
+	  /* Scale the threshold down so that consing_until_gc does
+	     not overflow.  */
+	  sum = INTMAX_MAX;
+	  threshold = INTMAX_MAX - consing_until_gc + gc_threshold;
+	}
+      consing_until_gc = sum;
+      gc_threshold = threshold;
+    }
+
   return Qnil;
 }
 
@@ -5825,13 +5842,10 @@ bump_consing_until_gc (intmax_t diff)
 watch_gc_cons_threshold (Lisp_Object symbol, Lisp_Object newval,
 			 Lisp_Object operation, Lisp_Object where)
 {
-  Lisp_Object percentage = Vgc_cons_percentage;
   intmax_t threshold;
-  intmax_t diff = (INTEGERP (newval) && integer_to_intmax (newval, &threshold)
-		   ? (consing_threshold (threshold, percentage)
-		      - consing_threshold (gc_cons_threshold, percentage))
-		   : 0);
-  return bump_consing_until_gc (diff);
+  if (! (INTEGERP (newval) && integer_to_intmax (newval, &threshold)))
+    return Qnil;
+  return bump_consing_until_gc (threshold, Vgc_cons_percentage);
 }
 
 /* Watch changes to gc-cons-percentage.  */
@@ -5839,10 +5853,7 @@ watch_gc_cons_threshold (Lisp_Object symbol, Lisp_Object newval,
 watch_gc_cons_percentage (Lisp_Object symbol, Lisp_Object newval,
 			  Lisp_Object operation, Lisp_Object where)
 {
-  intmax_t threshold = gc_cons_threshold;
-  intmax_t diff = (consing_threshold (threshold, newval)
-		   - consing_threshold (threshold, Vgc_cons_percentage));
-  return bump_consing_until_gc (diff);
+  return bump_consing_until_gc (gc_cons_threshold, newval);
 }
 
 /* Subroutine of Fgarbage_collect that does most of the work.  */
@@ -5987,8 +5998,8 @@ garbage_collect_1 (struct gcstat *gcst)
 
   unblock_input ();
 
-  consing_until_gc = consing_threshold (gc_cons_threshold,
-					Vgc_cons_percentage);
+  consing_until_gc = gc_threshold
+    = consing_threshold (gc_cons_threshold, Vgc_cons_percentage);
 
   if (garbage_collection_messages && NILP (Vmemory_full))
     {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-08  1:11 ` Paul Eggert
@ 2019-09-08 14:52   ` Michael Heerdegen
  2019-09-08 15:25     ` Michael Heerdegen
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-08 14:52 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Paul Eggert <eggert@cs.ucla.edu> writes:

> Thanks for reporting the bug. I installed the attached patch; please
> give it a try.

Thank you, that fixes the issue for me.

FWIW, in the end, the search is still slower as before, by a factor of 5
or so.  But I had found .8 only by trying - if the gc implementation has
been changed, maybe some other value works better now.

I think I'll rewrite my use case to use hash tables instead, that should
be more efficient (less garbage).


Thanks,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-08 14:52   ` Michael Heerdegen
@ 2019-09-08 15:25     ` Michael Heerdegen
  2019-09-14  8:04       ` Paul Eggert
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-08 15:25 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Michael Heerdegen <michael_heerdegen@web.de> writes:

> I think I'll rewrite my use case to use hash tables instead, that should
> be more efficient (less garbage).

Unfortunately this doesn't make a difference :-(

Garbage collection takes ~ 50% of the search time.  I don't think I
produce unnaturally amounts of garbage.  But my Emacs seems to have
become slower independently from gc.  There is still a slow down of
searching by a factor of at least 3 or so that doesn't seem to be
related to gc.

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-08 15:25     ` Michael Heerdegen
@ 2019-09-14  8:04       ` Paul Eggert
  2019-09-14  8:37         ` Eli Zaretskii
                           ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Paul Eggert @ 2019-09-14  8:04 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 37321

[-- Attachment #1: Type: text/plain, Size: 1419 bytes --]

On 9/8/19 8:25 AM, Michael Heerdegen wrote:
> my Emacs seems to have
> become slower independently from gc.  There is still a slow down of
> searching by a factor of at least 3 or so that doesn't seem to be
> related to gc.

Have you tried turning on profiling to see what the problem might be?

I looked into it a bit, and I think the problem is related to this line in 
el-search--flatten-tree:

         (gc-cons-percentage 0.8)) ;Why is binding it here more effective than 
binding it more top-level?

That binding is not effective in general, because it causes Emacs to do most of 
its computation with gc-cons-percentage equal to 0.8, but a small amount of it 
with gc-cons-percentage equal to 0.1 (the default value). This is in an attempt 
to disable most GC. However, the attempt fails in master because when 
gc-cons-percentage drops to 0.1 Emacs does a garbage collection pretty much 
right away. (In Emacs 26, the code lucks out because Emacs happens to not look 
at gc-cons-percentage during the brief time that it is 0.1, so it doesn't GC.)

Proposed ELPA fix attached; it should help performance in Emacs master (where I 
checked in some changes recently to help master work a bit better on this 
example). The abovementioned line shouldn't be needed in Emacs master.

There may be other places in el-search that would benefit from a change similar 
to the attached patch, but I didn't investigate this.

[-- Attachment #2: 0001-Port-GC-tuning-to-Emacs-27-alpha.patch --]
[-- Type: text/x-patch, Size: 1249 bytes --]

From 3e85fb302a9cf88ed3668d41c4b2e4cdb7017c54 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 14 Sep 2019 01:01:20 -0700
Subject: [PATCH] Port GC tuning to Emacs 27 alpha

* packages/el-search/el-search.el (el-search-setup-search):
Set gc-cons-percentage to 0.8 while searching.
---
 packages/el-search/el-search.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/packages/el-search/el-search.el b/packages/el-search/el-search.el
index 4e9cd98a4..5e213cf2b 100644
--- a/packages/el-search/el-search.el
+++ b/packages/el-search/el-search.el
@@ -2168,11 +2168,12 @@ files (i.e. file names) to search in.
 With optional FROM-HERE non-nil, the first buffer in this stream
 should be the current buffer, and searching will start at the
 current buffer's point instead of its beginning."
+ (let ((gc-cons-percentage 0.8))
   (el-search-setup-search-1 pattern get-buffer-stream nil setup-function)
   (if (not el-search-occur-flag)
       (el-search-continue-search from-here)
     (setq el-search-occur-flag nil)
-    (el-search-occur)))
+    (el-search-occur))))
 
 (defun el-search-stream-of-directory-files (&optional directory recurse)
   "Return a stream of emacs-lisp files in DIRECTORY.
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  8:04       ` Paul Eggert
@ 2019-09-14  8:37         ` Eli Zaretskii
  2019-09-14  8:52           ` Paul Eggert
  2019-09-16 23:53         ` Michael Heerdegen
  2019-09-25  9:42         ` Michael Heerdegen
  2 siblings, 1 reply; 38+ messages in thread
From: Eli Zaretskii @ 2019-09-14  8:37 UTC (permalink / raw)
  To: Paul Eggert; +Cc: michael_heerdegen, 37321

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 14 Sep 2019 01:04:07 -0700
> Cc: 37321@debbugs.gnu.org
> 
>          (gc-cons-percentage 0.8)) ;Why is binding it here more effective than 
> binding it more top-level?
> 
> That binding is not effective in general, because it causes Emacs to do most of 
> its computation with gc-cons-percentage equal to 0.8, but a small amount of it 
> with gc-cons-percentage equal to 0.1 (the default value). This is in an attempt 
> to disable most GC. However, the attempt fails in master because when 
> gc-cons-percentage drops to 0.1 Emacs does a garbage collection pretty much 
> right away. (In Emacs 26, the code lucks out because Emacs happens to not look 
> at gc-cons-percentage during the brief time that it is 0.1, so it doesn't GC.)

Maybe we should have something in NEWS describing the Lisp-visible
effects of the GC changes, perhaps with a couple of practical do's and
dont's regarding GC?

I'm not sure if you are reading Reddit, but it turns out a lot of
Emacs users have GC-related tricks in their init files, and some
unbundled packages even do that as part of their operation, like the
one in question here.  IMNSHO, a large part of this proliferation is
based on blind copying from others without really understanding what
this does, and it reached epidemic proportions.  It might be a good
idea to give users who do care some inside information to allow them
to tune their GC tricks to these changes -- it could lower the number
of bug reports that I'm afraid will follow.

Thanks.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  8:37         ` Eli Zaretskii
@ 2019-09-14  8:52           ` Paul Eggert
  2019-09-14  9:57             ` Eli Zaretskii
  2019-09-15  4:33             ` Richard Stallman
  0 siblings, 2 replies; 38+ messages in thread
From: Paul Eggert @ 2019-09-14  8:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, 37321

On 9/14/19 1:37 AM, Eli Zaretskii wrote:
> It might be a good
> idea to give users who do care some inside information to allow them
> to tune their GC tricks to these changes -- it could lower the number
> of bug reports that I'm afraid will follow.

I could add something to NEWS along the lines of "The heuristics for when to 
garbage collect have been tweaked, in an attempt to make Emacs perform a bit 
better and follow its documentation a bit more closely." Dunno how much that 
would help, though. It'd be hard to be much more specific.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  8:52           ` Paul Eggert
@ 2019-09-14  9:57             ` Eli Zaretskii
  2019-09-14 17:57               ` Paul Eggert
  2019-09-15  4:33             ` Richard Stallman
  1 sibling, 1 reply; 38+ messages in thread
From: Eli Zaretskii @ 2019-09-14  9:57 UTC (permalink / raw)
  To: Paul Eggert; +Cc: michael_heerdegen, 37321

> Cc: michael_heerdegen@web.de, 37321@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 14 Sep 2019 01:52:27 -0700
> 
> On 9/14/19 1:37 AM, Eli Zaretskii wrote:
> > It might be a good
> > idea to give users who do care some inside information to allow them
> > to tune their GC tricks to these changes -- it could lower the number
> > of bug reports that I'm afraid will follow.
> 
> I could add something to NEWS along the lines of "The heuristics for when to 
> garbage collect have been tweaked, in an attempt to make Emacs perform a bit 
> better and follow its documentation a bit more closely." Dunno how much that 
> would help, though. It'd be hard to be much more specific.

If we cannot give any specifics, I think at least telling users to
re-tune their GC tricks would help.

Thanks.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  9:57             ` Eli Zaretskii
@ 2019-09-14 17:57               ` Paul Eggert
  2019-09-14 18:16                 ` Eli Zaretskii
  0 siblings, 1 reply; 38+ messages in thread
From: Paul Eggert @ 2019-09-14 17:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: michael_heerdegen, 37321

[-- Attachment #1: Type: text/plain, Size: 193 bytes --]

On 9/14/19 2:57 AM, Eli Zaretskii wrote:
> If we cannot give any specifics, I think at least telling users to
> re-tune their GC tricks would help.

I installed the attached to try to do that.

[-- Attachment #2: 0001-Improve-doc-of-GC-thresholds.patch --]
[-- Type: text/x-patch, Size: 2794 bytes --]

From 1acc0cc9aaf25c808a60cf09cf8a4d1c653c3aa9 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 14 Sep 2019 10:53:24 -0700
Subject: [PATCH] Improve doc of GC thresholds

* doc/lispref/internals.texi (Garbage Collection), etc/NEWS:
Warn that control over GC is only approximate.
---
 doc/lispref/internals.texi | 13 ++++++++++---
 etc/NEWS                   |  7 +++++++
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/doc/lispref/internals.texi b/doc/lispref/internals.texi
index f85c266ede..c52999e1cd 100644
--- a/doc/lispref/internals.texi
+++ b/doc/lispref/internals.texi
@@ -533,9 +533,6 @@ Garbage Collection
 trigger another garbage collection.  You can use the result returned by
 @code{garbage-collect} to get an information about size of the particular
 object type; space allocated to the contents of buffers does not count.
-Note that the subsequent garbage collection does not happen immediately
-when the threshold is exhausted, but only the next time the Lisp interpreter
-is called.
 
 The initial threshold value is @code{GC_DEFAULT_THRESHOLD}, defined in
 @file{alloc.c}.  Since it's defined in @code{word_size} units, the value
@@ -562,6 +559,16 @@ Garbage Collection
 proportion.
 @end defopt
 
+  Control over the garbage collector via @code{gc-cons-threshold} and
+@code{gc-cons-percentage} is only approximate.  Although Emacs checks
+for threshold exhaustion regularly, for efficiency reasons it does not
+do so immediately after every change to the heap or to
+@code{gc-cons-threshold} or @code{gc-cons-percentage}, so exhausting
+the threshold does not immediately trigger garbage collection.  Also,
+for efficency in threshold calculations Emacs approximates the heap
+size, which counts the bytes used by currently-accessible objects in
+the heap.
+
   The value returned by @code{garbage-collect} describes the amount of
 memory used by Lisp data, broken down by data type.  By contrast, the
 function @code{memory-limit} provides information on the total amount of
diff --git a/etc/NEWS b/etc/NEWS
index 94c98a7ebe..252c6bf9b9 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2429,6 +2429,13 @@ remote systems, which support this check.
 +++
 ** 'memory-limit' now returns a better estimate of memory consumption.
 
++++
+** When interpreting 'gc-cons-percentage', Emacs now estimates the
+heap size more often and (we hope) more accurately.  E.g., formerly
+(progn (let ((gc-cons-percentage 0.8)) BODY1) BODY2) continued to use
+the 0.8 value during BODY2 until the next garbage collection, but that
+is no longer true.  Applications may need to re-tune their GC tricks.
+
 +++
 ** New macro 'combine-change-calls' arranges to call the change hooks
 ('before-change-functions' and 'after-change-functions') just once
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14 17:57               ` Paul Eggert
@ 2019-09-14 18:16                 ` Eli Zaretskii
  0 siblings, 0 replies; 38+ messages in thread
From: Eli Zaretskii @ 2019-09-14 18:16 UTC (permalink / raw)
  To: Paul Eggert; +Cc: michael_heerdegen, 37321

> Cc: michael_heerdegen@web.de, 37321@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 14 Sep 2019 10:57:35 -0700
> 
> On 9/14/19 2:57 AM, Eli Zaretskii wrote:
> > If we cannot give any specifics, I think at least telling users to
> > re-tune their GC tricks would help.
> 
> I installed the attached to try to do that.

Thanks.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  8:52           ` Paul Eggert
  2019-09-14  9:57             ` Eli Zaretskii
@ 2019-09-15  4:33             ` Richard Stallman
  1 sibling, 0 replies; 38+ messages in thread
From: Richard Stallman @ 2019-09-15  4:33 UTC (permalink / raw)
  To: Paul Eggert; +Cc: michael_heerdegen, 37321

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I could add something to NEWS along the lines of "The heuristics for when to 
  > garbage collect have been tweaked, in an attempt to make Emacs perform a bit 
  > better and follow its documentation a bit more closely." Dunno how much that 
  > would help, though. It'd be hard to be much more specific.

That is so general that I am not sure it is worth mentioning in NEWS.
Users don't need to read about it to get the benefit of it.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)







^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  8:04       ` Paul Eggert
  2019-09-14  8:37         ` Eli Zaretskii
@ 2019-09-16 23:53         ` Michael Heerdegen
  2019-09-17  0:55           ` Paul Eggert
  2019-09-17 12:47           ` Noam Postavsky
  2019-09-25  9:42         ` Michael Heerdegen
  2 siblings, 2 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-16 23:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Paul Eggert <eggert@cs.ucla.edu> writes:

> > my Emacs seems to have become slower independently from gc.  There
> > is still a slow down of searching by a factor of at least 3 or so
> > that doesn't seem to be related to gc.
>
> Have you tried turning on profiling to see what the problem might be?

For el-searches, yes, I did.  But the amounts of time spent were
distributed quite evenly between different things, I could not identify
a culprit.  It looked more like everything got proportionally slower.

> I looked into it a bit, and I think the problem is related to this
> line in el-search--flatten-tree:
>
>         (gc-cons-percentage 0.8)) ;Why is binding it here more
>         effective than binding it more top-level?
>
> That binding is not effective in general, because it causes Emacs to
> do most of its computation with gc-cons-percentage equal to 0.8, but a
> small amount of it with gc-cons-percentage equal to 0.1 (the default
> value).

Yeah.  Funnily enough I had experimented a lot with where to put this
binding.  I also had tried what you suggest in your patch.  What I
currently use looks weird and I don't understand why but in the past it
produced the most efficient result - even more efficient than with a
more durable binding.  Seems this is not true any more.

> This is in an attempt to disable most GC. However, the attempt
> fails in master because when gc-cons-percentage drops to 0.1 Emacs
> does a garbage collection pretty much right away. (In Emacs 26, the
> code lucks out because Emacs happens to not look at gc-cons-percentage
> during the brief time that it is 0.1, so it doesn't GC.)

Ah, ok.

> There may be other places in el-search that would benefit from a
> change similar to the attached patch, but I didn't investigate this.

Yes, I think so.

Question: why didn't it help to switch to hash tables?  My use case is
like this: very frequently I need to collect N (N is variable with an
order of magnitude of roughly 0 < N < 100.000 or so) objects in a
structure and later perform member tests whether a given element is
equal to one of the N.  I used to use a list and `member' to implement
this.  When I use a hash-table that associates the N elements with t
instead, and use gethash as member test, do I produce less garbage?
That would be good but when using this it didn't lower the amount of
time spent in gc.

Thanks,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-16 23:53         ` Michael Heerdegen
@ 2019-09-17  0:55           ` Paul Eggert
  2019-09-21  0:41             ` Michael Heerdegen
  2019-09-17 12:47           ` Noam Postavsky
  1 sibling, 1 reply; 38+ messages in thread
From: Paul Eggert @ 2019-09-17  0:55 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 37321

On 9/16/19 4:53 PM, Michael Heerdegen wrote:

> For el-searches, yes, I did.  But the amounts of time spent were
> distributed quite evenly between different things, I could not identify
> a culprit.  It looked more like everything got proportionally slower.

That doesn't sound good. Can you identify which commit did that, if any?

> When I use a hash-table that associates the N elements with t
> instead, and use gethash as member test, do I produce less garbage?

Hard to say, but you can use (memory-use-counts) to estimate how much 
garbage you're creating, since it counts all uses whereas 
(garbage-collect) counts only live uses. There's also memory profiling, 
as opposed to CPU profiling.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-16 23:53         ` Michael Heerdegen
  2019-09-17  0:55           ` Paul Eggert
@ 2019-09-17 12:47           ` Noam Postavsky
  2019-09-21  0:44             ` Michael Heerdegen
  1 sibling, 1 reply; 38+ messages in thread
From: Noam Postavsky @ 2019-09-17 12:47 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: Paul Eggert, 37321

Michael Heerdegen <michael_heerdegen@web.de> writes:
>
> Question: why didn't it help to switch to hash tables?  My use case is
> like this: very frequently I need to collect N (N is variable with an
> order of magnitude of roughly 0 < N < 100.000 or so) objects in a
> structure and later perform member tests whether a given element is
> equal to one of the N.  I used to use a list and `member' to implement
> this.  When I use a hash-table that associates the N elements with t
> instead, and use gethash as member test, do I produce less garbage?
> That would be good but when using this it didn't lower the amount of
> time spent in gc.

I would expect it to produce more garbage.  A list of length N has to
contain 2N slots (2 for each cons = car+cdr).  A hash table with N
items, needs at least 2N as well: N keys + N values.  And since it
stores these in vectors/arrays, as you add items it has to reallocate
them to resize (and the final size will likely be a bit higher than N),
producing more garbage (this can be avoided if you can pass :size N up
front).

gethash as a member test should be faster than lists for large N though.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-17  0:55           ` Paul Eggert
@ 2019-09-21  0:41             ` Michael Heerdegen
  2019-09-21  0:46               ` Michael Heerdegen
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-21  0:41 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Paul Eggert <eggert@cs.ucla.edu> writes:

> > For el-searches, yes, I did.  But the amounts of time spent were
> > distributed quite evenly between different things, I could not identify
> > a culprit.  It looked more like everything got proportionally slower.
>
> That doesn't sound good. Can you identify which commit did that, if any?

I made the gc related change to el-search now and search times don't
look that bad any more.  I still have a slow down of 2 or so, but I
think that can at least partly be explained by changes I made to
el-search in the meantime.  I'll keep an eye on it, but I guess
everything is good.

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-17 12:47           ` Noam Postavsky
@ 2019-09-21  0:44             ` Michael Heerdegen
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-21  0:44 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: Paul Eggert, 37321

Noam Postavsky <npostavs@gmail.com> writes:

> I would expect it to produce more garbage.  A list of length N has to
> contain 2N slots (2 for each cons = car+cdr).  A hash table with N
> items, needs at least 2N as well: N keys + N values.  And since it
> stores these in vectors/arrays, as you add items it has to reallocate
> them to resize (and the final size will likely be a bit higher than N),
> producing more garbage (this can be avoided if you can pass :size N up
> front).

Makes sense, thanks.  So in the case I had in mind switching to hash
tables offers no advantages.

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-21  0:41             ` Michael Heerdegen
@ 2019-09-21  0:46               ` Michael Heerdegen
  2019-09-21  6:19                 ` Paul Eggert
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-21  0:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Michael Heerdegen <michael_heerdegen@web.de> writes:

> I made the gc related change to el-search now and search times don't
> look that bad any more.  I still have a slow down of 2 or so, but I
> think that can at least partly be explained by changes I made to
> el-search in the meantime.  I'll keep an eye on it, but I guess
> everything is good.

After reading all messages again, seems we are all happy and everything
is done.  Can we close?

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-21  0:46               ` Michael Heerdegen
@ 2019-09-21  6:19                 ` Paul Eggert
  0 siblings, 0 replies; 38+ messages in thread
From: Paul Eggert @ 2019-09-21  6:19 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 37321-done

On 9/20/19 5:46 PM, Michael Heerdegen wrote:

> After reading all messages again, seems we are all happy and everything
> is done.  Can we close?

Sure, closing. Thanks for checking things out.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-14  8:04       ` Paul Eggert
  2019-09-14  8:37         ` Eli Zaretskii
  2019-09-16 23:53         ` Michael Heerdegen
@ 2019-09-25  9:42         ` Michael Heerdegen
  2019-09-25 20:37           ` Paul Eggert
  2 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-25  9:42 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Paul Eggert <eggert@cs.ucla.edu> writes:

> That binding is not effective in general, because it causes Emacs to
> do most of its computation with gc-cons-percentage equal to 0.8, but a
> small amount of it with gc-cons-percentage equal to 0.1 (the default
> value). [...]
> Proposed ELPA fix attached [...]
>
> There may be other places in el-search that would benefit from a
> change similar to the attached patch, but I didn't investigate this.

That would actually mean that I would have to add the gc-cons-percentage
binding to nearly every single command, which are quite a lot.  Isn't
there a better way?

Thanks,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-25  9:42         ` Michael Heerdegen
@ 2019-09-25 20:37           ` Paul Eggert
  2019-09-26 11:42             ` Michael Heerdegen
  0 siblings, 1 reply; 38+ messages in thread
From: Paul Eggert @ 2019-09-25 20:37 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: 37321

On 9/25/19 2:42 AM, Michael Heerdegen wrote:
> That would actually mean that I would have to add the gc-cons-percentage
> binding to nearly every single command, which are quite a lot.  Isn't
> there a better way?

I don't see one offhand, sorry. Except for the obvious way, which 
depends on a larger question: what's different about el-search? That is, 
if you are observing significantly better performance with 
gc-cons-percentage set to 0.8, shouldn't we set it to be that value by 
default in Emacs?





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-25 20:37           ` Paul Eggert
@ 2019-09-26 11:42             ` Michael Heerdegen
  2019-09-26 12:14               ` Eli Zaretskii
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-26 11:42 UTC (permalink / raw)
  To: Paul Eggert; +Cc: 37321

Paul Eggert <eggert@cs.ucla.edu> writes:

> I don't see one offhand, sorry. Except for the obvious way, which
> depends on a larger question: what's different about el-search?

El-searching produces a lot of garbage on the way, since I must
continuously `read'.  Not a very special thing, though.

> That is, if you are observing significantly better performance with
> gc-cons-percentage set to 0.8, shouldn't we set it to be that value by
> default in Emacs?

Good question.  Could it have downsides (I guess garbage collection
would happen less often but take longer then so one could get noticeable
interrupts)?


Regards,

Michael





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-26 11:42             ` Michael Heerdegen
@ 2019-09-26 12:14               ` Eli Zaretskii
  2019-09-26 13:03                 ` Michael Heerdegen
  2019-10-08  8:43                 ` Michael Heerdegen
  0 siblings, 2 replies; 38+ messages in thread
From: Eli Zaretskii @ 2019-09-26 12:14 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: eggert, 37321

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Date: Thu, 26 Sep 2019 13:42:02 +0200
> Cc: 37321@debbugs.gnu.org
> 
> > That is, if you are observing significantly better performance with
> > gc-cons-percentage set to 0.8, shouldn't we set it to be that value by
> > default in Emacs?
> 
> Good question.  Could it have downsides (I guess garbage collection
> would happen less often but take longer then so one could get noticeable
> interrupts)?

Using 0.8 means we won't GC before we cons 80% of the heap, which
sounds too high.

Btw, any reason why you enlarge gc-cons-percentage instead of
gc-cons-threshold?  The former is less determinate because it depends
on the size of the heap, which you cannot control.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-26 12:14               ` Eli Zaretskii
@ 2019-09-26 13:03                 ` Michael Heerdegen
  2019-10-08  8:43                 ` Michael Heerdegen
  1 sibling, 0 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-09-26 13:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> Btw, any reason why you enlarge gc-cons-percentage instead of
> gc-cons-threshold?  The former is less determinate because it depends
> on the size of the heap, which you cannot control.

I don't recall.  Just ignorance I guess.

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-09-26 12:14               ` Eli Zaretskii
  2019-09-26 13:03                 ` Michael Heerdegen
@ 2019-10-08  8:43                 ` Michael Heerdegen
  2019-10-08  9:09                   ` Eli Zaretskii
  1 sibling, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-08  8:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> > From: Michael Heerdegen <michael_heerdegen@web.de>
> > Date: Thu, 26 Sep 2019 13:42:02 +0200
> > Cc: 37321@debbugs.gnu.org
> >
> > > That is, if you are observing significantly better performance with
> > > gc-cons-percentage set to 0.8, shouldn't we set it to be that value by
> > > default in Emacs?
> >
> > Good question.  Could it have downsides (I guess garbage collection
> > would happen less often but take longer then so one could get noticeable
> > interrupts)?
>
> Using 0.8 means we won't GC before we cons 80% of the heap, which
> sounds too high.

I've now had it set to 0.8 in my init file all the time and Emacs
behaved ok afaict.  So, do you want to try to increase some gc value in
Emacs master now?

Regards,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08  8:43                 ` Michael Heerdegen
@ 2019-10-08  9:09                   ` Eli Zaretskii
  2019-10-08  9:11                     ` Michael Heerdegen
  0 siblings, 1 reply; 38+ messages in thread
From: Eli Zaretskii @ 2019-10-08  9:09 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: eggert, 37321

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: eggert@cs.ucla.edu,  37321@debbugs.gnu.org
> Date: Tue, 08 Oct 2019 10:43:06 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Using 0.8 means we won't GC before we cons 80% of the heap, which
> > sounds too high.
> 
> I've now had it set to 0.8 in my init file all the time and Emacs
> behaved ok afaict.  So, do you want to try to increase some gc value in
> Emacs master now?

No, I think it's too high for being the default (you can, of course,
do that locally).  Can you instead increase gc-cons-threshold, leaving
the percentage at its default value, and see if you can get the same
good results?  It is much easier for us to bump gc-cons-threshold
(assuming the higher value is still reasonable), because that is an
absolute number, so it is independent of the heap size.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08  9:09                   ` Eli Zaretskii
@ 2019-10-08  9:11                     ` Michael Heerdegen
  2019-10-08  9:19                       ` Eli Zaretskii
  2019-10-08  9:22                       ` Michael Heerdegen
  0 siblings, 2 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-08  9:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> No, I think it's too high for being the default (you can, of course,
> do that locally).  Can you instead increase gc-cons-threshold, leaving
> the percentage at its default value, and see if you can get the same
> good results?  It is much easier for us to bump gc-cons-threshold
> (assuming the higher value is still reasonable), because that is an
> absolute number, so it is independent of the heap size.

Sure.  Which value would you suggest?

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08  9:11                     ` Michael Heerdegen
@ 2019-10-08  9:19                       ` Eli Zaretskii
  2019-10-08 11:12                         ` Michael Heerdegen
  2019-10-08  9:22                       ` Michael Heerdegen
  1 sibling, 1 reply; 38+ messages in thread
From: Eli Zaretskii @ 2019-10-08  9:19 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: eggert, 37321

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: eggert@cs.ucla.edu,  37321@debbugs.gnu.org
> Date: Tue, 08 Oct 2019 11:11:56 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > No, I think it's too high for being the default (you can, of course,
> > do that locally).  Can you instead increase gc-cons-threshold, leaving
> > the percentage at its default value, and see if you can get the same
> > good results?  It is much easier for us to bump gc-cons-threshold
> > (assuming the higher value is still reasonable), because that is an
> > absolute number, so it is independent of the heap size.
> 
> Sure.  Which value would you suggest?

Start with 4 times the default.  If that's not enough, double the
value until you are satisfied, and tell what was the final value.

Thanks.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08  9:11                     ` Michael Heerdegen
  2019-10-08  9:19                       ` Eli Zaretskii
@ 2019-10-08  9:22                       ` Michael Heerdegen
  1 sibling, 0 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-08  9:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Michael Heerdegen <michael_heerdegen@web.de> writes:

> Sure.  Which value would you suggest?

BTW, as a guidance: an el-search for the symbol pcase in my complete
load-path (and with the .8 setting) currently behaves like this:

Elapsed time: 52.331413s (7.141519s in 20 GCs)

Conses: 4.32514e+07
Floats: 152079
Vector-Cells: 2.735e+08
Symbols: 163489
String-Chars: 8.00088e+07
Intervals: 28094
Strings: 2.7693e+06

(the last values are the printed vector difference of the return value
of `memory-use-counts' from before and after command invocation).

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08  9:19                       ` Eli Zaretskii
@ 2019-10-08 11:12                         ` Michael Heerdegen
  2019-10-08 12:11                           ` Eli Zaretskii
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-08 11:12 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> Start with 4 times the default.  If that's not enough, double the
> value until you are satisfied, and tell what was the final value.

I leave it to you to interpret the result (can we expect that other use
cases have a similar optimum?):



My current setup (setq gc-cons-percentage .8)

52.331413s (7.141519s in 20 GCs)


Default: (setq gc-cons-threshold 800000)

87.626289s (41.526705s in 269 GCs)


(setq gc-cons-threshold (* 800000 2))

86.213154s (40.132441s in 267 GCs)


(setq gc-cons-threshold (* 800000 4))

87.422085s (40.967298s in 267 GCs)


(setq gc-cons-threshold (* 800000 8))

84.519948s (38.535352s in 245 GCs)


(setq gc-cons-threshold (* 800000 16))

81.596529s (34.603730s in 193 GCs)


(setq gc-cons-threshold (* 800000 32))

70.866717s (24.850518s in 123 GCs)


(setq gc-cons-threshold (* 800000 64))

61.893515s (14.088496s in 63 GCs)


(setq gc-cons-threshold (* 800000 128))

56.277259s (8.122554s in 32 GCs)


(setq gc-cons-threshold (* 800000 256))

55.173263s (6.067774s in 17 GCs)


(setq gc-cons-threshold (* 800000 512))

56.116736s (8.825274s in 9 GCs)


(setq gc-cons-threshold (* 800000 1024))

70.832969s (19.474471s in 5 GCs)



Regards,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08 11:12                         ` Michael Heerdegen
@ 2019-10-08 12:11                           ` Eli Zaretskii
  2019-10-08 12:38                             ` Michael Heerdegen
  0 siblings, 1 reply; 38+ messages in thread
From: Eli Zaretskii @ 2019-10-08 12:11 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: eggert, 37321

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: eggert@cs.ucla.edu,  37321@debbugs.gnu.org
> Date: Tue, 08 Oct 2019 13:12:29 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Start with 4 times the default.  If that's not enough, double the
> > value until you are satisfied, and tell what was the final value.
> 
> I leave it to you to interpret the result (can we expect that other use
> cases have a similar optimum?):
> [...]
> 61.893515s (14.088496s in 63 GCs)
> 
> 
> (setq gc-cons-threshold (* 800000 128))
> 
> 56.277259s (8.122554s in 32 GCs)
> 
> (setq gc-cons-threshold (* 800000 256))
> 
> 55.173263s (6.067774s in 17 GCs)
> 
> (setq gc-cons-threshold (* 800000 512))
> 
> 56.116736s (8.825274s in 9 GCs)
> 
> (setq gc-cons-threshold (* 800000 1024))
> 
> 70.832969s (19.474471s in 5 GCs)

IMO, this suggests that your code produces a lot of garbage, so it is
not a good basis for setting the default values.  I would be OK with
bumping up the default gc-cons-threshold 8-fold, or even 16-fold, but
increasing it by a factor of 256 is too much, IMO: it would mean we
only GC after consing ~200MB of data, which is about half the heap
size of my current Emacs session, which has been up and running for 28
days, and has 360 live buffers.

Thanks.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08 12:11                           ` Eli Zaretskii
@ 2019-10-08 12:38                             ` Michael Heerdegen
  2019-10-08 13:03                               ` Eli Zaretskii
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-08 12:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> IMO, this suggests that your code produces a lot of garbage

Yes, mostly unavoidable afaik.

>, so it is not a good basis for setting the default values.  I would be
>OK with bumping up the default gc-cons-threshold 8-fold, or even
>16-fold, but increasing it by a factor of 256 is too much, IMO: it
>would mean we only GC after consing ~200MB of data, which is about half
>the heap size of my current Emacs session, which has been up and
>running for 28 days, and has 360 live buffers.

Ok, then I'll increase gc-cons-threshold in el-search as it had been
suggested.  Unless you have a better idea of how to handle the garbage
in a way that gc doesn't lower the speed that much (I wish I could
explicitly free the many garbage objects).


Thanks,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08 12:38                             ` Michael Heerdegen
@ 2019-10-08 13:03                               ` Eli Zaretskii
  2019-10-09 14:47                                 ` Michael Heerdegen
  0 siblings, 1 reply; 38+ messages in thread
From: Eli Zaretskii @ 2019-10-08 13:03 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: eggert, 37321

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: eggert@cs.ucla.edu,  37321@debbugs.gnu.org
> Date: Tue, 08 Oct 2019 14:38:57 +0200
> 
> Ok, then I'll increase gc-cons-threshold in el-search as it had been
> suggested.  Unless you have a better idea of how to handle the garbage
> in a way that gc doesn't lower the speed that much

AFAIK, the latter can only be done by changing your algorithms to
produce less garbage.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-08 13:03                               ` Eli Zaretskii
@ 2019-10-09 14:47                                 ` Michael Heerdegen
  2019-10-09 15:33                                   ` Eli Zaretskii
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-09 14:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> > in a way that gc doesn't lower the speed that much
>
> AFAIK, the latter can only be done by changing your algorithms to
> produce less garbage.

I tried to find out what code produces the most garbage.  It turned out
that ca. 50% of garbage was generated by code that prevents infinite
recursion when recursing into nested structures.  I use hash tables to
collect visited objects, and the hash tables cause the garbage.  I tried
to reuse hash tables and clear them after each use, but this makes the
code much slower than what I win from gc.

But 99,9% of el-searched code isn't cyclic, so the effort is for nothing
most of the time.  Is there an efficient way to find out if a given
object is cyclic?

For now I try with this:

(lambda ()
  (save-excursion)
  (goto-char (point-min))
  (search-forward-regexp "#[0-9+]=[^\"]" nil t))

(all treated objects are read from a buffer, so I can inspect the
contents) and get good results but it feels a bit hackish.


Thanks,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-09 14:47                                 ` Michael Heerdegen
@ 2019-10-09 15:33                                   ` Eli Zaretskii
  2019-10-09 20:53                                     ` Paul Eggert
  2019-10-10 10:58                                     ` Michael Heerdegen
  0 siblings, 2 replies; 38+ messages in thread
From: Eli Zaretskii @ 2019-10-09 15:33 UTC (permalink / raw)
  To: Michael Heerdegen; +Cc: eggert, 37321

> From: Michael Heerdegen <michael_heerdegen@web.de>
> Cc: eggert@cs.ucla.edu,  37321@debbugs.gnu.org
> Date: Wed, 09 Oct 2019 16:47:58 +0200
> 
> > AFAIK, the latter can only be done by changing your algorithms to
> > produce less garbage.
> 
> I tried to find out what code produces the most garbage.  It turned out
> that ca. 50% of garbage was generated by code that prevents infinite
> recursion when recursing into nested structures.  I use hash tables to
> collect visited objects, and the hash tables cause the garbage.  I tried
> to reuse hash tables and clear them after each use, but this makes the
> code much slower than what I win from gc.
> 
> But 99,9% of el-searched code isn't cyclic, so the effort is for nothing
> most of the time.  Is there an efficient way to find out if a given
> object is cyclic?

The hare/tortoise method we use in data.c?

Or maybe you could make your search non-recursive by using an
internal queue of requests (like in BFS vs DFS)?

(I wrote that based on 5 sec of thought, so maybe it makes no sense at
all.)





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-09 15:33                                   ` Eli Zaretskii
@ 2019-10-09 20:53                                     ` Paul Eggert
  2019-10-10 10:58                                     ` Michael Heerdegen
  1 sibling, 0 replies; 38+ messages in thread
From: Paul Eggert @ 2019-10-09 20:53 UTC (permalink / raw)
  To: Eli Zaretskii, Michael Heerdegen; +Cc: 37321

On 10/9/19 8:33 AM, Eli Zaretskii wrote:
> The hare/tortoise method we use in data.c?

Yes, that's a good suggestion, as Brent's teleporting tortoise-hare 
algorithm (see citation in lisp.h) is quite good at lessening the 
overhead of checking for this rare situation. Perhaps there's a way we 
could "export" that algorithm from the C code, so that Elisp code could 
use the algorithm without having to reinvent it.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* bug#37321: 27.0.50; Excessive gc in a use case (el-search)
  2019-10-09 15:33                                   ` Eli Zaretskii
  2019-10-09 20:53                                     ` Paul Eggert
@ 2019-10-10 10:58                                     ` Michael Heerdegen
  1 sibling, 0 replies; 38+ messages in thread
From: Michael Heerdegen @ 2019-10-10 10:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: eggert, 37321

Eli Zaretskii <eliz@gnu.org> writes:

> The hare/tortoise method we use in data.c?

I didn't know that one.  Nice!  Thanks for the suggestions,

Michael.





^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-10-10 10:58 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-06 13:52 bug#37321: 27.0.50; Excessive gc in a use case (el-search) Michael Heerdegen
2019-09-07 14:23 ` Michael Heerdegen
2019-09-07 15:30   ` Michael Heerdegen
2019-09-08  1:11 ` Paul Eggert
2019-09-08 14:52   ` Michael Heerdegen
2019-09-08 15:25     ` Michael Heerdegen
2019-09-14  8:04       ` Paul Eggert
2019-09-14  8:37         ` Eli Zaretskii
2019-09-14  8:52           ` Paul Eggert
2019-09-14  9:57             ` Eli Zaretskii
2019-09-14 17:57               ` Paul Eggert
2019-09-14 18:16                 ` Eli Zaretskii
2019-09-15  4:33             ` Richard Stallman
2019-09-16 23:53         ` Michael Heerdegen
2019-09-17  0:55           ` Paul Eggert
2019-09-21  0:41             ` Michael Heerdegen
2019-09-21  0:46               ` Michael Heerdegen
2019-09-21  6:19                 ` Paul Eggert
2019-09-17 12:47           ` Noam Postavsky
2019-09-21  0:44             ` Michael Heerdegen
2019-09-25  9:42         ` Michael Heerdegen
2019-09-25 20:37           ` Paul Eggert
2019-09-26 11:42             ` Michael Heerdegen
2019-09-26 12:14               ` Eli Zaretskii
2019-09-26 13:03                 ` Michael Heerdegen
2019-10-08  8:43                 ` Michael Heerdegen
2019-10-08  9:09                   ` Eli Zaretskii
2019-10-08  9:11                     ` Michael Heerdegen
2019-10-08  9:19                       ` Eli Zaretskii
2019-10-08 11:12                         ` Michael Heerdegen
2019-10-08 12:11                           ` Eli Zaretskii
2019-10-08 12:38                             ` Michael Heerdegen
2019-10-08 13:03                               ` Eli Zaretskii
2019-10-09 14:47                                 ` Michael Heerdegen
2019-10-09 15:33                                   ` Eli Zaretskii
2019-10-09 20:53                                     ` Paul Eggert
2019-10-10 10:58                                     ` Michael Heerdegen
2019-10-08  9:22                       ` Michael Heerdegen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).