* bug#37321: 27.0.50; Excessive gc in a use case (el-search) @ 2019-09-06 13:52 Michael Heerdegen 2019-09-07 14:23 ` Michael Heerdegen 2019-09-08 1:11 ` Paul Eggert 0 siblings, 2 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-09-06 13:52 UTC (permalink / raw) To: 37321 Hi, apparently due to the latest changes in gc (I noticed the problem today, after doing a make bootstrap yesterday), el-searching is slowed down by a factor of 10000 or so. Profiling shows that nearly all of the time is spent in gc. To reproduce from emacs -Q: M-x list-packages Install el-search M-x el-search-emacs-elisp-sources 'pcase M-RET This will start an el-occur for the symbol 'pcase in the Emacs Elisp sources. This normally takes around 10 seconds but is now just unusable. (If you want me to bisect, do I have to make bootstrap, or does a simple make suffice?) Hope someone can help. TIA, Michael. In GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.10) of 2019-09-06 built on drachen Repository revision: 459b416e5f950c6127f10c9eca68d65bdab688d4 Repository branch: master Windowing system distributor 'The X.Org Foundation', version 11.0.12004000 System Description: Debian GNU/Linux bullseye/sid ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-06 13:52 bug#37321: 27.0.50; Excessive gc in a use case (el-search) Michael Heerdegen @ 2019-09-07 14:23 ` Michael Heerdegen 2019-09-07 15:30 ` Michael Heerdegen 2019-09-08 1:11 ` Paul Eggert 1 sibling, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-07 14:23 UTC (permalink / raw) To: 37321 Michael Heerdegen <michael_heerdegen@web.de> writes: > M-x el-search-emacs-elisp-sources 'pcase M-RET I found the culprit. It's this simple function: #+begin_src emacs-lisp (defun el-search--flatten-tree (tree) "Return a list of `el-search--atomic-p' objects in TREE." (let ((elements ()) (walked-objects ;to avoid infinite recursion for circular TREEs (make-hash-table :test #'eq)) (gc-cons-percentage 0.8) ) ;Why is binding it here more effective than binding it more top-level? (cl-labels ((walker (object) (if (el-search--atomic-p object) (push object elements) (unless (gethash object walked-objects) (puthash object t walked-objects) (if (consp object) (progn (while (consp object) (walker (car object)) (setq object (cdr object)) (if (gethash object walked-objects) (setq object nil) (puthash object t walked-objects))) (when object ;dotted list (walker object))) (cl-loop for elt being the elements of object do (walker elt))))))) (walker tree) elements))) #+end_src I've put a lot of work in optimizing el-searching speed. It now mainly depends of effectiveness of gc, mostly in the above function. When I wrote it I noticed that it worked best when binding gc-cons-percentage to a value of 0.8 (inside the function, see the code). But now this binding slows the thing down as hell. When I remove the binding, searching is fast again, but it still takes twice as long as in the past when 0.8 worked. I've no clue about how gc works and how the above function can be optimized, and if gc is expected to be that slow now with the gc-cons-percentage binding in effect. Advice appreciated. TIA, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-07 14:23 ` Michael Heerdegen @ 2019-09-07 15:30 ` Michael Heerdegen 0 siblings, 0 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-09-07 15:30 UTC (permalink / raw) To: 37321 Michael Heerdegen <michael_heerdegen@web.de> writes: > #+begin_src emacs-lisp > (defun el-search--flatten-tree (tree) > "Return a list of `el-search--atomic-p' objects in TREE." > (let ((elements ()) > (walked-objects ;to avoid infinite recursion for circular TREEs > (make-hash-table :test #'eq)) > (gc-cons-percentage 0.8) > ) ;Why is binding it here more effective than binding it more top-level? > (cl-labels ((walker (object) > (if (el-search--atomic-p object) > (push object elements) > (unless (gethash object walked-objects) > (puthash object t walked-objects) > (if (consp object) > (progn > (while (consp object) > (walker (car object)) > (setq object (cdr object)) > (if (gethash object walked-objects) > (setq object nil) > (puthash object t walked-objects))) > (when object ;dotted list > (walker object))) > (cl-loop for elt being the elements of object do (walker elt))))))) > (walker tree) > elements))) > #+end_src It doesn't seem to be related to the hash table usage - if I remove the hash table test based infloop check part of the function, the behavior wrt gc doesn't change. So I wonder what in this function does it make so much depend on gc behavior? Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-06 13:52 bug#37321: 27.0.50; Excessive gc in a use case (el-search) Michael Heerdegen 2019-09-07 14:23 ` Michael Heerdegen @ 2019-09-08 1:11 ` Paul Eggert 2019-09-08 14:52 ` Michael Heerdegen 1 sibling, 1 reply; 38+ messages in thread From: Paul Eggert @ 2019-09-08 1:11 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 37321 [-- Attachment #1: Type: text/plain, Size: 85 bytes --] Thanks for reporting the bug. I installed the attached patch; please give it a try. [-- Attachment #2: 0001-Fix-bug-when-gc-cons-percentage-is-bumped-to-0.8.patch --] [-- Type: text/x-patch, Size: 3907 bytes --] From 4c31e7c3e4e70b38ab51a99d61e215aefe0190dd Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sat, 7 Sep 2019 18:08:12 -0700 Subject: [PATCH] Fix bug when gc-cons-percentage is bumped to 0.8 Problem reported by Michael Heerdegen (Bug#37321). * src/alloc.c (gc_threshold): New static var. (bump_consing_until_gc): Change args from DIFF to THRESHOLD and PERCENTAGE. All uses changed. When accounting for a changed gc-cons-percentage, do not assume that total_bytes_of_live_objects returns the same value now that it did the last time we were called. --- src/alloc.c | 45 ++++++++++++++++++++++++++++----------------- 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/src/alloc.c b/src/alloc.c index 5fc515f33b..be98cfd5f5 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -297,6 +297,10 @@ #define PUREBEG (char *) pure static intptr_t garbage_collection_inhibited; +/* The GC threshold in bytes, the last time it was calculated + from gc-cons-threshold and gc-cons-percentage. */ +static intmax_t gc_threshold; + /* If nonzero, this is a warning delivered by malloc and not yet displayed. */ @@ -5808,15 +5812,28 @@ consing_threshold (intmax_t threshold, Lisp_Object percentage) } } -/* Increment consing_until_gc by DIFF, avoiding overflow. */ +/* Adjust consing_until_gc, assuming gc-cons-threshold is THRESHOLD and + gc-cons-percentage is PERCENTAGE. */ static Lisp_Object -bump_consing_until_gc (intmax_t diff) +bump_consing_until_gc (intmax_t threshold, Lisp_Object percentage) { /* If consing_until_gc is negative leave it alone, since this prevents negative integer overflow and a GC would have been done soon anyway. */ - if (0 <= consing_until_gc - && INT_ADD_WRAPV (consing_until_gc, diff, &consing_until_gc)) - consing_until_gc = INTMAX_MAX; + if (0 <= consing_until_gc) + { + threshold = consing_threshold (threshold, percentage); + intmax_t sum; + if (INT_ADD_WRAPV (consing_until_gc, threshold - gc_threshold, &sum)) + { + /* Scale the threshold down so that consing_until_gc does + not overflow. */ + sum = INTMAX_MAX; + threshold = INTMAX_MAX - consing_until_gc + gc_threshold; + } + consing_until_gc = sum; + gc_threshold = threshold; + } + return Qnil; } @@ -5825,13 +5842,10 @@ bump_consing_until_gc (intmax_t diff) watch_gc_cons_threshold (Lisp_Object symbol, Lisp_Object newval, Lisp_Object operation, Lisp_Object where) { - Lisp_Object percentage = Vgc_cons_percentage; intmax_t threshold; - intmax_t diff = (INTEGERP (newval) && integer_to_intmax (newval, &threshold) - ? (consing_threshold (threshold, percentage) - - consing_threshold (gc_cons_threshold, percentage)) - : 0); - return bump_consing_until_gc (diff); + if (! (INTEGERP (newval) && integer_to_intmax (newval, &threshold))) + return Qnil; + return bump_consing_until_gc (threshold, Vgc_cons_percentage); } /* Watch changes to gc-cons-percentage. */ @@ -5839,10 +5853,7 @@ watch_gc_cons_threshold (Lisp_Object symbol, Lisp_Object newval, watch_gc_cons_percentage (Lisp_Object symbol, Lisp_Object newval, Lisp_Object operation, Lisp_Object where) { - intmax_t threshold = gc_cons_threshold; - intmax_t diff = (consing_threshold (threshold, newval) - - consing_threshold (threshold, Vgc_cons_percentage)); - return bump_consing_until_gc (diff); + return bump_consing_until_gc (gc_cons_threshold, newval); } /* Subroutine of Fgarbage_collect that does most of the work. */ @@ -5987,8 +5998,8 @@ garbage_collect_1 (struct gcstat *gcst) unblock_input (); - consing_until_gc = consing_threshold (gc_cons_threshold, - Vgc_cons_percentage); + consing_until_gc = gc_threshold + = consing_threshold (gc_cons_threshold, Vgc_cons_percentage); if (garbage_collection_messages && NILP (Vmemory_full)) { -- 2.17.1 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-08 1:11 ` Paul Eggert @ 2019-09-08 14:52 ` Michael Heerdegen 2019-09-08 15:25 ` Michael Heerdegen 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-08 14:52 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Paul Eggert <eggert@cs.ucla.edu> writes: > Thanks for reporting the bug. I installed the attached patch; please > give it a try. Thank you, that fixes the issue for me. FWIW, in the end, the search is still slower as before, by a factor of 5 or so. But I had found .8 only by trying - if the gc implementation has been changed, maybe some other value works better now. I think I'll rewrite my use case to use hash tables instead, that should be more efficient (less garbage). Thanks, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-08 14:52 ` Michael Heerdegen @ 2019-09-08 15:25 ` Michael Heerdegen 2019-09-14 8:04 ` Paul Eggert 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-08 15:25 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Michael Heerdegen <michael_heerdegen@web.de> writes: > I think I'll rewrite my use case to use hash tables instead, that should > be more efficient (less garbage). Unfortunately this doesn't make a difference :-( Garbage collection takes ~ 50% of the search time. I don't think I produce unnaturally amounts of garbage. But my Emacs seems to have become slower independently from gc. There is still a slow down of searching by a factor of at least 3 or so that doesn't seem to be related to gc. Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-08 15:25 ` Michael Heerdegen @ 2019-09-14 8:04 ` Paul Eggert 2019-09-14 8:37 ` Eli Zaretskii ` (2 more replies) 0 siblings, 3 replies; 38+ messages in thread From: Paul Eggert @ 2019-09-14 8:04 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 37321 [-- Attachment #1: Type: text/plain, Size: 1419 bytes --] On 9/8/19 8:25 AM, Michael Heerdegen wrote: > my Emacs seems to have > become slower independently from gc. There is still a slow down of > searching by a factor of at least 3 or so that doesn't seem to be > related to gc. Have you tried turning on profiling to see what the problem might be? I looked into it a bit, and I think the problem is related to this line in el-search--flatten-tree: (gc-cons-percentage 0.8)) ;Why is binding it here more effective than binding it more top-level? That binding is not effective in general, because it causes Emacs to do most of its computation with gc-cons-percentage equal to 0.8, but a small amount of it with gc-cons-percentage equal to 0.1 (the default value). This is in an attempt to disable most GC. However, the attempt fails in master because when gc-cons-percentage drops to 0.1 Emacs does a garbage collection pretty much right away. (In Emacs 26, the code lucks out because Emacs happens to not look at gc-cons-percentage during the brief time that it is 0.1, so it doesn't GC.) Proposed ELPA fix attached; it should help performance in Emacs master (where I checked in some changes recently to help master work a bit better on this example). The abovementioned line shouldn't be needed in Emacs master. There may be other places in el-search that would benefit from a change similar to the attached patch, but I didn't investigate this. [-- Attachment #2: 0001-Port-GC-tuning-to-Emacs-27-alpha.patch --] [-- Type: text/x-patch, Size: 1249 bytes --] From 3e85fb302a9cf88ed3668d41c4b2e4cdb7017c54 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sat, 14 Sep 2019 01:01:20 -0700 Subject: [PATCH] Port GC tuning to Emacs 27 alpha * packages/el-search/el-search.el (el-search-setup-search): Set gc-cons-percentage to 0.8 while searching. --- packages/el-search/el-search.el | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/packages/el-search/el-search.el b/packages/el-search/el-search.el index 4e9cd98a4..5e213cf2b 100644 --- a/packages/el-search/el-search.el +++ b/packages/el-search/el-search.el @@ -2168,11 +2168,12 @@ files (i.e. file names) to search in. With optional FROM-HERE non-nil, the first buffer in this stream should be the current buffer, and searching will start at the current buffer's point instead of its beginning." + (let ((gc-cons-percentage 0.8)) (el-search-setup-search-1 pattern get-buffer-stream nil setup-function) (if (not el-search-occur-flag) (el-search-continue-search from-here) (setq el-search-occur-flag nil) - (el-search-occur))) + (el-search-occur)))) (defun el-search-stream-of-directory-files (&optional directory recurse) "Return a stream of emacs-lisp files in DIRECTORY. -- 2.21.0 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 8:04 ` Paul Eggert @ 2019-09-14 8:37 ` Eli Zaretskii 2019-09-14 8:52 ` Paul Eggert 2019-09-16 23:53 ` Michael Heerdegen 2019-09-25 9:42 ` Michael Heerdegen 2 siblings, 1 reply; 38+ messages in thread From: Eli Zaretskii @ 2019-09-14 8:37 UTC (permalink / raw) To: Paul Eggert; +Cc: michael_heerdegen, 37321 > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 14 Sep 2019 01:04:07 -0700 > Cc: 37321@debbugs.gnu.org > > (gc-cons-percentage 0.8)) ;Why is binding it here more effective than > binding it more top-level? > > That binding is not effective in general, because it causes Emacs to do most of > its computation with gc-cons-percentage equal to 0.8, but a small amount of it > with gc-cons-percentage equal to 0.1 (the default value). This is in an attempt > to disable most GC. However, the attempt fails in master because when > gc-cons-percentage drops to 0.1 Emacs does a garbage collection pretty much > right away. (In Emacs 26, the code lucks out because Emacs happens to not look > at gc-cons-percentage during the brief time that it is 0.1, so it doesn't GC.) Maybe we should have something in NEWS describing the Lisp-visible effects of the GC changes, perhaps with a couple of practical do's and dont's regarding GC? I'm not sure if you are reading Reddit, but it turns out a lot of Emacs users have GC-related tricks in their init files, and some unbundled packages even do that as part of their operation, like the one in question here. IMNSHO, a large part of this proliferation is based on blind copying from others without really understanding what this does, and it reached epidemic proportions. It might be a good idea to give users who do care some inside information to allow them to tune their GC tricks to these changes -- it could lower the number of bug reports that I'm afraid will follow. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 8:37 ` Eli Zaretskii @ 2019-09-14 8:52 ` Paul Eggert 2019-09-14 9:57 ` Eli Zaretskii 2019-09-15 4:33 ` Richard Stallman 0 siblings, 2 replies; 38+ messages in thread From: Paul Eggert @ 2019-09-14 8:52 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, 37321 On 9/14/19 1:37 AM, Eli Zaretskii wrote: > It might be a good > idea to give users who do care some inside information to allow them > to tune their GC tricks to these changes -- it could lower the number > of bug reports that I'm afraid will follow. I could add something to NEWS along the lines of "The heuristics for when to garbage collect have been tweaked, in an attempt to make Emacs perform a bit better and follow its documentation a bit more closely." Dunno how much that would help, though. It'd be hard to be much more specific. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 8:52 ` Paul Eggert @ 2019-09-14 9:57 ` Eli Zaretskii 2019-09-14 17:57 ` Paul Eggert 2019-09-15 4:33 ` Richard Stallman 1 sibling, 1 reply; 38+ messages in thread From: Eli Zaretskii @ 2019-09-14 9:57 UTC (permalink / raw) To: Paul Eggert; +Cc: michael_heerdegen, 37321 > Cc: michael_heerdegen@web.de, 37321@debbugs.gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 14 Sep 2019 01:52:27 -0700 > > On 9/14/19 1:37 AM, Eli Zaretskii wrote: > > It might be a good > > idea to give users who do care some inside information to allow them > > to tune their GC tricks to these changes -- it could lower the number > > of bug reports that I'm afraid will follow. > > I could add something to NEWS along the lines of "The heuristics for when to > garbage collect have been tweaked, in an attempt to make Emacs perform a bit > better and follow its documentation a bit more closely." Dunno how much that > would help, though. It'd be hard to be much more specific. If we cannot give any specifics, I think at least telling users to re-tune their GC tricks would help. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 9:57 ` Eli Zaretskii @ 2019-09-14 17:57 ` Paul Eggert 2019-09-14 18:16 ` Eli Zaretskii 0 siblings, 1 reply; 38+ messages in thread From: Paul Eggert @ 2019-09-14 17:57 UTC (permalink / raw) To: Eli Zaretskii; +Cc: michael_heerdegen, 37321 [-- Attachment #1: Type: text/plain, Size: 193 bytes --] On 9/14/19 2:57 AM, Eli Zaretskii wrote: > If we cannot give any specifics, I think at least telling users to > re-tune their GC tricks would help. I installed the attached to try to do that. [-- Attachment #2: 0001-Improve-doc-of-GC-thresholds.patch --] [-- Type: text/x-patch, Size: 2794 bytes --] From 1acc0cc9aaf25c808a60cf09cf8a4d1c653c3aa9 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sat, 14 Sep 2019 10:53:24 -0700 Subject: [PATCH] Improve doc of GC thresholds * doc/lispref/internals.texi (Garbage Collection), etc/NEWS: Warn that control over GC is only approximate. --- doc/lispref/internals.texi | 13 ++++++++++--- etc/NEWS | 7 +++++++ 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/doc/lispref/internals.texi b/doc/lispref/internals.texi index f85c266ede..c52999e1cd 100644 --- a/doc/lispref/internals.texi +++ b/doc/lispref/internals.texi @@ -533,9 +533,6 @@ Garbage Collection trigger another garbage collection. You can use the result returned by @code{garbage-collect} to get an information about size of the particular object type; space allocated to the contents of buffers does not count. -Note that the subsequent garbage collection does not happen immediately -when the threshold is exhausted, but only the next time the Lisp interpreter -is called. The initial threshold value is @code{GC_DEFAULT_THRESHOLD}, defined in @file{alloc.c}. Since it's defined in @code{word_size} units, the value @@ -562,6 +559,16 @@ Garbage Collection proportion. @end defopt + Control over the garbage collector via @code{gc-cons-threshold} and +@code{gc-cons-percentage} is only approximate. Although Emacs checks +for threshold exhaustion regularly, for efficiency reasons it does not +do so immediately after every change to the heap or to +@code{gc-cons-threshold} or @code{gc-cons-percentage}, so exhausting +the threshold does not immediately trigger garbage collection. Also, +for efficency in threshold calculations Emacs approximates the heap +size, which counts the bytes used by currently-accessible objects in +the heap. + The value returned by @code{garbage-collect} describes the amount of memory used by Lisp data, broken down by data type. By contrast, the function @code{memory-limit} provides information on the total amount of diff --git a/etc/NEWS b/etc/NEWS index 94c98a7ebe..252c6bf9b9 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -2429,6 +2429,13 @@ remote systems, which support this check. +++ ** 'memory-limit' now returns a better estimate of memory consumption. ++++ +** When interpreting 'gc-cons-percentage', Emacs now estimates the +heap size more often and (we hope) more accurately. E.g., formerly +(progn (let ((gc-cons-percentage 0.8)) BODY1) BODY2) continued to use +the 0.8 value during BODY2 until the next garbage collection, but that +is no longer true. Applications may need to re-tune their GC tricks. + +++ ** New macro 'combine-change-calls' arranges to call the change hooks ('before-change-functions' and 'after-change-functions') just once -- 2.17.1 ^ permalink raw reply related [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 17:57 ` Paul Eggert @ 2019-09-14 18:16 ` Eli Zaretskii 0 siblings, 0 replies; 38+ messages in thread From: Eli Zaretskii @ 2019-09-14 18:16 UTC (permalink / raw) To: Paul Eggert; +Cc: michael_heerdegen, 37321 > Cc: michael_heerdegen@web.de, 37321@debbugs.gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Sat, 14 Sep 2019 10:57:35 -0700 > > On 9/14/19 2:57 AM, Eli Zaretskii wrote: > > If we cannot give any specifics, I think at least telling users to > > re-tune their GC tricks would help. > > I installed the attached to try to do that. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 8:52 ` Paul Eggert 2019-09-14 9:57 ` Eli Zaretskii @ 2019-09-15 4:33 ` Richard Stallman 1 sibling, 0 replies; 38+ messages in thread From: Richard Stallman @ 2019-09-15 4:33 UTC (permalink / raw) To: Paul Eggert; +Cc: michael_heerdegen, 37321 [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > I could add something to NEWS along the lines of "The heuristics for when to > garbage collect have been tweaked, in an attempt to make Emacs perform a bit > better and follow its documentation a bit more closely." Dunno how much that > would help, though. It'd be hard to be much more specific. That is so general that I am not sure it is worth mentioning in NEWS. Users don't need to read about it to get the benefit of it. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 8:04 ` Paul Eggert 2019-09-14 8:37 ` Eli Zaretskii @ 2019-09-16 23:53 ` Michael Heerdegen 2019-09-17 0:55 ` Paul Eggert 2019-09-17 12:47 ` Noam Postavsky 2019-09-25 9:42 ` Michael Heerdegen 2 siblings, 2 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-09-16 23:53 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Paul Eggert <eggert@cs.ucla.edu> writes: > > my Emacs seems to have become slower independently from gc. There > > is still a slow down of searching by a factor of at least 3 or so > > that doesn't seem to be related to gc. > > Have you tried turning on profiling to see what the problem might be? For el-searches, yes, I did. But the amounts of time spent were distributed quite evenly between different things, I could not identify a culprit. It looked more like everything got proportionally slower. > I looked into it a bit, and I think the problem is related to this > line in el-search--flatten-tree: > > (gc-cons-percentage 0.8)) ;Why is binding it here more > effective than binding it more top-level? > > That binding is not effective in general, because it causes Emacs to > do most of its computation with gc-cons-percentage equal to 0.8, but a > small amount of it with gc-cons-percentage equal to 0.1 (the default > value). Yeah. Funnily enough I had experimented a lot with where to put this binding. I also had tried what you suggest in your patch. What I currently use looks weird and I don't understand why but in the past it produced the most efficient result - even more efficient than with a more durable binding. Seems this is not true any more. > This is in an attempt to disable most GC. However, the attempt > fails in master because when gc-cons-percentage drops to 0.1 Emacs > does a garbage collection pretty much right away. (In Emacs 26, the > code lucks out because Emacs happens to not look at gc-cons-percentage > during the brief time that it is 0.1, so it doesn't GC.) Ah, ok. > There may be other places in el-search that would benefit from a > change similar to the attached patch, but I didn't investigate this. Yes, I think so. Question: why didn't it help to switch to hash tables? My use case is like this: very frequently I need to collect N (N is variable with an order of magnitude of roughly 0 < N < 100.000 or so) objects in a structure and later perform member tests whether a given element is equal to one of the N. I used to use a list and `member' to implement this. When I use a hash-table that associates the N elements with t instead, and use gethash as member test, do I produce less garbage? That would be good but when using this it didn't lower the amount of time spent in gc. Thanks, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-16 23:53 ` Michael Heerdegen @ 2019-09-17 0:55 ` Paul Eggert 2019-09-21 0:41 ` Michael Heerdegen 2019-09-17 12:47 ` Noam Postavsky 1 sibling, 1 reply; 38+ messages in thread From: Paul Eggert @ 2019-09-17 0:55 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 37321 On 9/16/19 4:53 PM, Michael Heerdegen wrote: > For el-searches, yes, I did. But the amounts of time spent were > distributed quite evenly between different things, I could not identify > a culprit. It looked more like everything got proportionally slower. That doesn't sound good. Can you identify which commit did that, if any? > When I use a hash-table that associates the N elements with t > instead, and use gethash as member test, do I produce less garbage? Hard to say, but you can use (memory-use-counts) to estimate how much garbage you're creating, since it counts all uses whereas (garbage-collect) counts only live uses. There's also memory profiling, as opposed to CPU profiling. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-17 0:55 ` Paul Eggert @ 2019-09-21 0:41 ` Michael Heerdegen 2019-09-21 0:46 ` Michael Heerdegen 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-21 0:41 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Paul Eggert <eggert@cs.ucla.edu> writes: > > For el-searches, yes, I did. But the amounts of time spent were > > distributed quite evenly between different things, I could not identify > > a culprit. It looked more like everything got proportionally slower. > > That doesn't sound good. Can you identify which commit did that, if any? I made the gc related change to el-search now and search times don't look that bad any more. I still have a slow down of 2 or so, but I think that can at least partly be explained by changes I made to el-search in the meantime. I'll keep an eye on it, but I guess everything is good. Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-21 0:41 ` Michael Heerdegen @ 2019-09-21 0:46 ` Michael Heerdegen 2019-09-21 6:19 ` Paul Eggert 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-21 0:46 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Michael Heerdegen <michael_heerdegen@web.de> writes: > I made the gc related change to el-search now and search times don't > look that bad any more. I still have a slow down of 2 or so, but I > think that can at least partly be explained by changes I made to > el-search in the meantime. I'll keep an eye on it, but I guess > everything is good. After reading all messages again, seems we are all happy and everything is done. Can we close? Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-21 0:46 ` Michael Heerdegen @ 2019-09-21 6:19 ` Paul Eggert 0 siblings, 0 replies; 38+ messages in thread From: Paul Eggert @ 2019-09-21 6:19 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 37321-done On 9/20/19 5:46 PM, Michael Heerdegen wrote: > After reading all messages again, seems we are all happy and everything > is done. Can we close? Sure, closing. Thanks for checking things out. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-16 23:53 ` Michael Heerdegen 2019-09-17 0:55 ` Paul Eggert @ 2019-09-17 12:47 ` Noam Postavsky 2019-09-21 0:44 ` Michael Heerdegen 1 sibling, 1 reply; 38+ messages in thread From: Noam Postavsky @ 2019-09-17 12:47 UTC (permalink / raw) To: Michael Heerdegen; +Cc: Paul Eggert, 37321 Michael Heerdegen <michael_heerdegen@web.de> writes: > > Question: why didn't it help to switch to hash tables? My use case is > like this: very frequently I need to collect N (N is variable with an > order of magnitude of roughly 0 < N < 100.000 or so) objects in a > structure and later perform member tests whether a given element is > equal to one of the N. I used to use a list and `member' to implement > this. When I use a hash-table that associates the N elements with t > instead, and use gethash as member test, do I produce less garbage? > That would be good but when using this it didn't lower the amount of > time spent in gc. I would expect it to produce more garbage. A list of length N has to contain 2N slots (2 for each cons = car+cdr). A hash table with N items, needs at least 2N as well: N keys + N values. And since it stores these in vectors/arrays, as you add items it has to reallocate them to resize (and the final size will likely be a bit higher than N), producing more garbage (this can be avoided if you can pass :size N up front). gethash as a member test should be faster than lists for large N though. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-17 12:47 ` Noam Postavsky @ 2019-09-21 0:44 ` Michael Heerdegen 0 siblings, 0 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-09-21 0:44 UTC (permalink / raw) To: Noam Postavsky; +Cc: Paul Eggert, 37321 Noam Postavsky <npostavs@gmail.com> writes: > I would expect it to produce more garbage. A list of length N has to > contain 2N slots (2 for each cons = car+cdr). A hash table with N > items, needs at least 2N as well: N keys + N values. And since it > stores these in vectors/arrays, as you add items it has to reallocate > them to resize (and the final size will likely be a bit higher than N), > producing more garbage (this can be avoided if you can pass :size N up > front). Makes sense, thanks. So in the case I had in mind switching to hash tables offers no advantages. Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-14 8:04 ` Paul Eggert 2019-09-14 8:37 ` Eli Zaretskii 2019-09-16 23:53 ` Michael Heerdegen @ 2019-09-25 9:42 ` Michael Heerdegen 2019-09-25 20:37 ` Paul Eggert 2 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-25 9:42 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Paul Eggert <eggert@cs.ucla.edu> writes: > That binding is not effective in general, because it causes Emacs to > do most of its computation with gc-cons-percentage equal to 0.8, but a > small amount of it with gc-cons-percentage equal to 0.1 (the default > value). [...] > Proposed ELPA fix attached [...] > > There may be other places in el-search that would benefit from a > change similar to the attached patch, but I didn't investigate this. That would actually mean that I would have to add the gc-cons-percentage binding to nearly every single command, which are quite a lot. Isn't there a better way? Thanks, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-25 9:42 ` Michael Heerdegen @ 2019-09-25 20:37 ` Paul Eggert 2019-09-26 11:42 ` Michael Heerdegen 0 siblings, 1 reply; 38+ messages in thread From: Paul Eggert @ 2019-09-25 20:37 UTC (permalink / raw) To: Michael Heerdegen; +Cc: 37321 On 9/25/19 2:42 AM, Michael Heerdegen wrote: > That would actually mean that I would have to add the gc-cons-percentage > binding to nearly every single command, which are quite a lot. Isn't > there a better way? I don't see one offhand, sorry. Except for the obvious way, which depends on a larger question: what's different about el-search? That is, if you are observing significantly better performance with gc-cons-percentage set to 0.8, shouldn't we set it to be that value by default in Emacs? ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-25 20:37 ` Paul Eggert @ 2019-09-26 11:42 ` Michael Heerdegen 2019-09-26 12:14 ` Eli Zaretskii 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-09-26 11:42 UTC (permalink / raw) To: Paul Eggert; +Cc: 37321 Paul Eggert <eggert@cs.ucla.edu> writes: > I don't see one offhand, sorry. Except for the obvious way, which > depends on a larger question: what's different about el-search? El-searching produces a lot of garbage on the way, since I must continuously `read'. Not a very special thing, though. > That is, if you are observing significantly better performance with > gc-cons-percentage set to 0.8, shouldn't we set it to be that value by > default in Emacs? Good question. Could it have downsides (I guess garbage collection would happen less often but take longer then so one could get noticeable interrupts)? Regards, Michael ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-26 11:42 ` Michael Heerdegen @ 2019-09-26 12:14 ` Eli Zaretskii 2019-09-26 13:03 ` Michael Heerdegen 2019-10-08 8:43 ` Michael Heerdegen 0 siblings, 2 replies; 38+ messages in thread From: Eli Zaretskii @ 2019-09-26 12:14 UTC (permalink / raw) To: Michael Heerdegen; +Cc: eggert, 37321 > From: Michael Heerdegen <michael_heerdegen@web.de> > Date: Thu, 26 Sep 2019 13:42:02 +0200 > Cc: 37321@debbugs.gnu.org > > > That is, if you are observing significantly better performance with > > gc-cons-percentage set to 0.8, shouldn't we set it to be that value by > > default in Emacs? > > Good question. Could it have downsides (I guess garbage collection > would happen less often but take longer then so one could get noticeable > interrupts)? Using 0.8 means we won't GC before we cons 80% of the heap, which sounds too high. Btw, any reason why you enlarge gc-cons-percentage instead of gc-cons-threshold? The former is less determinate because it depends on the size of the heap, which you cannot control. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-26 12:14 ` Eli Zaretskii @ 2019-09-26 13:03 ` Michael Heerdegen 2019-10-08 8:43 ` Michael Heerdegen 1 sibling, 0 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-09-26 13:03 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > Btw, any reason why you enlarge gc-cons-percentage instead of > gc-cons-threshold? The former is less determinate because it depends > on the size of the heap, which you cannot control. I don't recall. Just ignorance I guess. Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-09-26 12:14 ` Eli Zaretskii 2019-09-26 13:03 ` Michael Heerdegen @ 2019-10-08 8:43 ` Michael Heerdegen 2019-10-08 9:09 ` Eli Zaretskii 1 sibling, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-10-08 8:43 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > > From: Michael Heerdegen <michael_heerdegen@web.de> > > Date: Thu, 26 Sep 2019 13:42:02 +0200 > > Cc: 37321@debbugs.gnu.org > > > > > That is, if you are observing significantly better performance with > > > gc-cons-percentage set to 0.8, shouldn't we set it to be that value by > > > default in Emacs? > > > > Good question. Could it have downsides (I guess garbage collection > > would happen less often but take longer then so one could get noticeable > > interrupts)? > > Using 0.8 means we won't GC before we cons 80% of the heap, which > sounds too high. I've now had it set to 0.8 in my init file all the time and Emacs behaved ok afaict. So, do you want to try to increase some gc value in Emacs master now? Regards, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 8:43 ` Michael Heerdegen @ 2019-10-08 9:09 ` Eli Zaretskii 2019-10-08 9:11 ` Michael Heerdegen 0 siblings, 1 reply; 38+ messages in thread From: Eli Zaretskii @ 2019-10-08 9:09 UTC (permalink / raw) To: Michael Heerdegen; +Cc: eggert, 37321 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: eggert@cs.ucla.edu, 37321@debbugs.gnu.org > Date: Tue, 08 Oct 2019 10:43:06 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Using 0.8 means we won't GC before we cons 80% of the heap, which > > sounds too high. > > I've now had it set to 0.8 in my init file all the time and Emacs > behaved ok afaict. So, do you want to try to increase some gc value in > Emacs master now? No, I think it's too high for being the default (you can, of course, do that locally). Can you instead increase gc-cons-threshold, leaving the percentage at its default value, and see if you can get the same good results? It is much easier for us to bump gc-cons-threshold (assuming the higher value is still reasonable), because that is an absolute number, so it is independent of the heap size. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 9:09 ` Eli Zaretskii @ 2019-10-08 9:11 ` Michael Heerdegen 2019-10-08 9:19 ` Eli Zaretskii 2019-10-08 9:22 ` Michael Heerdegen 0 siblings, 2 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-10-08 9:11 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > No, I think it's too high for being the default (you can, of course, > do that locally). Can you instead increase gc-cons-threshold, leaving > the percentage at its default value, and see if you can get the same > good results? It is much easier for us to bump gc-cons-threshold > (assuming the higher value is still reasonable), because that is an > absolute number, so it is independent of the heap size. Sure. Which value would you suggest? Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 9:11 ` Michael Heerdegen @ 2019-10-08 9:19 ` Eli Zaretskii 2019-10-08 11:12 ` Michael Heerdegen 2019-10-08 9:22 ` Michael Heerdegen 1 sibling, 1 reply; 38+ messages in thread From: Eli Zaretskii @ 2019-10-08 9:19 UTC (permalink / raw) To: Michael Heerdegen; +Cc: eggert, 37321 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: eggert@cs.ucla.edu, 37321@debbugs.gnu.org > Date: Tue, 08 Oct 2019 11:11:56 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > No, I think it's too high for being the default (you can, of course, > > do that locally). Can you instead increase gc-cons-threshold, leaving > > the percentage at its default value, and see if you can get the same > > good results? It is much easier for us to bump gc-cons-threshold > > (assuming the higher value is still reasonable), because that is an > > absolute number, so it is independent of the heap size. > > Sure. Which value would you suggest? Start with 4 times the default. If that's not enough, double the value until you are satisfied, and tell what was the final value. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 9:19 ` Eli Zaretskii @ 2019-10-08 11:12 ` Michael Heerdegen 2019-10-08 12:11 ` Eli Zaretskii 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-10-08 11:12 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > Start with 4 times the default. If that's not enough, double the > value until you are satisfied, and tell what was the final value. I leave it to you to interpret the result (can we expect that other use cases have a similar optimum?): My current setup (setq gc-cons-percentage .8) 52.331413s (7.141519s in 20 GCs) Default: (setq gc-cons-threshold 800000) 87.626289s (41.526705s in 269 GCs) (setq gc-cons-threshold (* 800000 2)) 86.213154s (40.132441s in 267 GCs) (setq gc-cons-threshold (* 800000 4)) 87.422085s (40.967298s in 267 GCs) (setq gc-cons-threshold (* 800000 8)) 84.519948s (38.535352s in 245 GCs) (setq gc-cons-threshold (* 800000 16)) 81.596529s (34.603730s in 193 GCs) (setq gc-cons-threshold (* 800000 32)) 70.866717s (24.850518s in 123 GCs) (setq gc-cons-threshold (* 800000 64)) 61.893515s (14.088496s in 63 GCs) (setq gc-cons-threshold (* 800000 128)) 56.277259s (8.122554s in 32 GCs) (setq gc-cons-threshold (* 800000 256)) 55.173263s (6.067774s in 17 GCs) (setq gc-cons-threshold (* 800000 512)) 56.116736s (8.825274s in 9 GCs) (setq gc-cons-threshold (* 800000 1024)) 70.832969s (19.474471s in 5 GCs) Regards, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 11:12 ` Michael Heerdegen @ 2019-10-08 12:11 ` Eli Zaretskii 2019-10-08 12:38 ` Michael Heerdegen 0 siblings, 1 reply; 38+ messages in thread From: Eli Zaretskii @ 2019-10-08 12:11 UTC (permalink / raw) To: Michael Heerdegen; +Cc: eggert, 37321 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: eggert@cs.ucla.edu, 37321@debbugs.gnu.org > Date: Tue, 08 Oct 2019 13:12:29 +0200 > > Eli Zaretskii <eliz@gnu.org> writes: > > > Start with 4 times the default. If that's not enough, double the > > value until you are satisfied, and tell what was the final value. > > I leave it to you to interpret the result (can we expect that other use > cases have a similar optimum?): > [...] > 61.893515s (14.088496s in 63 GCs) > > > (setq gc-cons-threshold (* 800000 128)) > > 56.277259s (8.122554s in 32 GCs) > > (setq gc-cons-threshold (* 800000 256)) > > 55.173263s (6.067774s in 17 GCs) > > (setq gc-cons-threshold (* 800000 512)) > > 56.116736s (8.825274s in 9 GCs) > > (setq gc-cons-threshold (* 800000 1024)) > > 70.832969s (19.474471s in 5 GCs) IMO, this suggests that your code produces a lot of garbage, so it is not a good basis for setting the default values. I would be OK with bumping up the default gc-cons-threshold 8-fold, or even 16-fold, but increasing it by a factor of 256 is too much, IMO: it would mean we only GC after consing ~200MB of data, which is about half the heap size of my current Emacs session, which has been up and running for 28 days, and has 360 live buffers. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 12:11 ` Eli Zaretskii @ 2019-10-08 12:38 ` Michael Heerdegen 2019-10-08 13:03 ` Eli Zaretskii 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-10-08 12:38 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > IMO, this suggests that your code produces a lot of garbage Yes, mostly unavoidable afaik. >, so it is not a good basis for setting the default values. I would be >OK with bumping up the default gc-cons-threshold 8-fold, or even >16-fold, but increasing it by a factor of 256 is too much, IMO: it >would mean we only GC after consing ~200MB of data, which is about half >the heap size of my current Emacs session, which has been up and >running for 28 days, and has 360 live buffers. Ok, then I'll increase gc-cons-threshold in el-search as it had been suggested. Unless you have a better idea of how to handle the garbage in a way that gc doesn't lower the speed that much (I wish I could explicitly free the many garbage objects). Thanks, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 12:38 ` Michael Heerdegen @ 2019-10-08 13:03 ` Eli Zaretskii 2019-10-09 14:47 ` Michael Heerdegen 0 siblings, 1 reply; 38+ messages in thread From: Eli Zaretskii @ 2019-10-08 13:03 UTC (permalink / raw) To: Michael Heerdegen; +Cc: eggert, 37321 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: eggert@cs.ucla.edu, 37321@debbugs.gnu.org > Date: Tue, 08 Oct 2019 14:38:57 +0200 > > Ok, then I'll increase gc-cons-threshold in el-search as it had been > suggested. Unless you have a better idea of how to handle the garbage > in a way that gc doesn't lower the speed that much AFAIK, the latter can only be done by changing your algorithms to produce less garbage. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 13:03 ` Eli Zaretskii @ 2019-10-09 14:47 ` Michael Heerdegen 2019-10-09 15:33 ` Eli Zaretskii 0 siblings, 1 reply; 38+ messages in thread From: Michael Heerdegen @ 2019-10-09 14:47 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > > in a way that gc doesn't lower the speed that much > > AFAIK, the latter can only be done by changing your algorithms to > produce less garbage. I tried to find out what code produces the most garbage. It turned out that ca. 50% of garbage was generated by code that prevents infinite recursion when recursing into nested structures. I use hash tables to collect visited objects, and the hash tables cause the garbage. I tried to reuse hash tables and clear them after each use, but this makes the code much slower than what I win from gc. But 99,9% of el-searched code isn't cyclic, so the effort is for nothing most of the time. Is there an efficient way to find out if a given object is cyclic? For now I try with this: (lambda () (save-excursion) (goto-char (point-min)) (search-forward-regexp "#[0-9+]=[^\"]" nil t)) (all treated objects are read from a buffer, so I can inspect the contents) and get good results but it feels a bit hackish. Thanks, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-09 14:47 ` Michael Heerdegen @ 2019-10-09 15:33 ` Eli Zaretskii 2019-10-09 20:53 ` Paul Eggert 2019-10-10 10:58 ` Michael Heerdegen 0 siblings, 2 replies; 38+ messages in thread From: Eli Zaretskii @ 2019-10-09 15:33 UTC (permalink / raw) To: Michael Heerdegen; +Cc: eggert, 37321 > From: Michael Heerdegen <michael_heerdegen@web.de> > Cc: eggert@cs.ucla.edu, 37321@debbugs.gnu.org > Date: Wed, 09 Oct 2019 16:47:58 +0200 > > > AFAIK, the latter can only be done by changing your algorithms to > > produce less garbage. > > I tried to find out what code produces the most garbage. It turned out > that ca. 50% of garbage was generated by code that prevents infinite > recursion when recursing into nested structures. I use hash tables to > collect visited objects, and the hash tables cause the garbage. I tried > to reuse hash tables and clear them after each use, but this makes the > code much slower than what I win from gc. > > But 99,9% of el-searched code isn't cyclic, so the effort is for nothing > most of the time. Is there an efficient way to find out if a given > object is cyclic? The hare/tortoise method we use in data.c? Or maybe you could make your search non-recursive by using an internal queue of requests (like in BFS vs DFS)? (I wrote that based on 5 sec of thought, so maybe it makes no sense at all.) ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-09 15:33 ` Eli Zaretskii @ 2019-10-09 20:53 ` Paul Eggert 2019-10-10 10:58 ` Michael Heerdegen 1 sibling, 0 replies; 38+ messages in thread From: Paul Eggert @ 2019-10-09 20:53 UTC (permalink / raw) To: Eli Zaretskii, Michael Heerdegen; +Cc: 37321 On 10/9/19 8:33 AM, Eli Zaretskii wrote: > The hare/tortoise method we use in data.c? Yes, that's a good suggestion, as Brent's teleporting tortoise-hare algorithm (see citation in lisp.h) is quite good at lessening the overhead of checking for this rare situation. Perhaps there's a way we could "export" that algorithm from the C code, so that Elisp code could use the algorithm without having to reinvent it. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-09 15:33 ` Eli Zaretskii 2019-10-09 20:53 ` Paul Eggert @ 2019-10-10 10:58 ` Michael Heerdegen 1 sibling, 0 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-10-10 10:58 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Eli Zaretskii <eliz@gnu.org> writes: > The hare/tortoise method we use in data.c? I didn't know that one. Nice! Thanks for the suggestions, Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
* bug#37321: 27.0.50; Excessive gc in a use case (el-search) 2019-10-08 9:11 ` Michael Heerdegen 2019-10-08 9:19 ` Eli Zaretskii @ 2019-10-08 9:22 ` Michael Heerdegen 1 sibling, 0 replies; 38+ messages in thread From: Michael Heerdegen @ 2019-10-08 9:22 UTC (permalink / raw) To: Eli Zaretskii; +Cc: eggert, 37321 Michael Heerdegen <michael_heerdegen@web.de> writes: > Sure. Which value would you suggest? BTW, as a guidance: an el-search for the symbol pcase in my complete load-path (and with the .8 setting) currently behaves like this: Elapsed time: 52.331413s (7.141519s in 20 GCs) Conses: 4.32514e+07 Floats: 152079 Vector-Cells: 2.735e+08 Symbols: 163489 String-Chars: 8.00088e+07 Intervals: 28094 Strings: 2.7693e+06 (the last values are the printed vector difference of the return value of `memory-use-counts' from before and after command invocation). Michael. ^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2019-10-10 10:58 UTC | newest] Thread overview: 38+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-09-06 13:52 bug#37321: 27.0.50; Excessive gc in a use case (el-search) Michael Heerdegen 2019-09-07 14:23 ` Michael Heerdegen 2019-09-07 15:30 ` Michael Heerdegen 2019-09-08 1:11 ` Paul Eggert 2019-09-08 14:52 ` Michael Heerdegen 2019-09-08 15:25 ` Michael Heerdegen 2019-09-14 8:04 ` Paul Eggert 2019-09-14 8:37 ` Eli Zaretskii 2019-09-14 8:52 ` Paul Eggert 2019-09-14 9:57 ` Eli Zaretskii 2019-09-14 17:57 ` Paul Eggert 2019-09-14 18:16 ` Eli Zaretskii 2019-09-15 4:33 ` Richard Stallman 2019-09-16 23:53 ` Michael Heerdegen 2019-09-17 0:55 ` Paul Eggert 2019-09-21 0:41 ` Michael Heerdegen 2019-09-21 0:46 ` Michael Heerdegen 2019-09-21 6:19 ` Paul Eggert 2019-09-17 12:47 ` Noam Postavsky 2019-09-21 0:44 ` Michael Heerdegen 2019-09-25 9:42 ` Michael Heerdegen 2019-09-25 20:37 ` Paul Eggert 2019-09-26 11:42 ` Michael Heerdegen 2019-09-26 12:14 ` Eli Zaretskii 2019-09-26 13:03 ` Michael Heerdegen 2019-10-08 8:43 ` Michael Heerdegen 2019-10-08 9:09 ` Eli Zaretskii 2019-10-08 9:11 ` Michael Heerdegen 2019-10-08 9:19 ` Eli Zaretskii 2019-10-08 11:12 ` Michael Heerdegen 2019-10-08 12:11 ` Eli Zaretskii 2019-10-08 12:38 ` Michael Heerdegen 2019-10-08 13:03 ` Eli Zaretskii 2019-10-09 14:47 ` Michael Heerdegen 2019-10-09 15:33 ` Eli Zaretskii 2019-10-09 20:53 ` Paul Eggert 2019-10-10 10:58 ` Michael Heerdegen 2019-10-08 9:22 ` Michael Heerdegen
Code repositories for project(s) associated with this public inbox https://git.savannah.gnu.org/cgit/emacs.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).