On 10/09/2023 08:33, Eli Zaretskii wrote: >> Date: Sun, 10 Sep 2023 04:30:24 +0300 >> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net, >> 64735@debbugs.gnu.org >> From: Dmitry Gutov >> >> On 08/09/2023 09:35, Eli Zaretskii wrote: >>> I think you'd need to expose consing_until_gc to Lisp, and then you >>> can collect the data from Lisp. >> >> I can expose it to Lisp and print all three from post-gc-hook, but the >> result just looks like this: >> >> gc-pct 0.1 gc-thr 800000 cugc 4611686018427387903 >> >> Perhaps I need to add a hook which runs at the beginning of GC? Or of >> maybe_gc even? > > You could record its value in a local variable at the entry to > garbage_collect, and the expose that value to Lisp. That also doesn't seem to give much, given that the condition for entering 'maybe_garbage_collect' is (consing_until_gc < 0). I.e. we wait until it's down to 0, then garbage-collect. What we could perhaps do is add another hook (or a printer) at the beginning of maybe_gc, but either would result in lots and lots of output. >> Alternatively, (memory-use-counts) seems to retain some counters which >> don't get erased during garbage collection. > > Maybe using those will be good enough, indeed. I added this instrumentation: (defvar last-mem-counts '(0 0 0 0 0 0 0)) (defun gc-record-after () (let* ((counts (memory-use-counts)) (diff (cl-map 'list (lambda (old new) (- new old)) last-mem-counts counts))) (setq last-mem-counts counts) (message "counts diff %s" diff))) (add-hook 'post-gc-hook #'gc-record-after) so that after each garbage collection we print the differences in all counters (CONSES FLOATS VECTOR-CELLS SYMBOLS STRING-CHARS INTERVALS STRINGS). And a message call when the process finishes. And made those recordings during the benchmark runs of two different listing methods (one using make-process, another using process-file) to list all files in a large directory (there are ~200000 files there). The make-process one I also ran with a different (large) value of read-process-output-max. Results attached. What's in there? First of all, for find-directory-files-recursively-3, there are 0 garbage collections between the beginning of the function and when we start parsing the output (no GCs while the process is writing to the buffer synchronously). I guess inserting output in a buffer doesn't increase consing, so there's nothing to GC? Next: for find-directory-files-recursively-2, the process only finishes at the end, when all GC cycles are done for. I suppose that also means we block the process's output while Lisp is running, and also that whatever GC events occur might coincide with the chunks of output coming from the process, and however many of them turn out to be in total. So there is also a second recording for find-directory-files-recursively-2 with read-process-output-max=409600. It does improve the performance significantly (and reduce the number of GC pauses). I guess what I'm still not clear on, is whether the number of GC pauses is fewer because of less consing (the only column that looks significantly different is the 3rd: VECTOR-CELLS), or because the process finishes faster due to larger buffers, which itself causes fewer calls to maybe_gc. And, of course, what else could be done to reduce the time spent in GC in the asynchronous case.