Opportunistic GC

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* Opportunistic GC
@ 2021-03-08  3:35 Stefan Monnier
  2021-03-08  7:20 ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08  3:35 UTC (permalink / raw)
  To: emacs-devel

I've been running with the code below recently to try and see if
opportunistic GC can be worth the trouble.

I've mostly focused on measuring the GC behavior rather than on tuning
the opportunistic parameters, so there's no doubt we can do better, but
here's what I've gotten so far:

- Gnus-focused session:
  ((opportunistic-gcs . 848)
   (jit 6 0 0) (cmd 526 241 707) (noncmd 92 1 2)
   (commands . 65745))
- General session:
  ((opportunistic-gcs . 1332)
   (jit 115 0 0) (cmd 239 19 89) (noncmd 1749 18 38)
   (commands . 176431))

The precise meaning of those numbers can be seen in the code, but here's
the explanation in words:
`commands` is the number of commands that were run.
`opportunistic-gcs` is the number of times we performed GC opportunistically
`jit`, `cmd`, and `noncmd` are the number of times we GC'd non-opportunistically
during a `cmd`, between commands (for `noncmd`) or during jit-lock.
The first number is the number of times a single GC took place
during/between the command, the second number counts the number of times
several GCs took place, and the third is to sum of those "multiple GCs".

The middle and third numbers are those for which opportunistic GC
fundamentally can't help very much: the command (or the work done
between commands) was just sufficiently long-running that it's hard to
avoid running a GC during this period.

[ Note that the way we measure for jit-lock it's really unlikely we'll
  ever get "multiple GCs" because it would require >1 GC performed
  during a single jit-lock chunk.  ]

Note that because my opportunistic GC calls (garbage-collect-maybe 3),
every opportunistic GC avoids only 1/3 non-opportunistic GC.

So, while these numbers don't look completely hopeless, they seem to
indicate that opportunistic GCs aren't very useful at least with the
tuning parameters I used.  To be more useful, maybe we'd need to
increase the non-opportunistic `gc-cons-percentage` and/or reduce the
idle-time before opportunistic GC (see "magic formula" below).

There's also a question of where those `noncmd` GCs come from.
(timers?  process filters?  non-jit part of redisplay? ...?)

        Stefan


(defvar gc--opportunistic-last-gcs gcs-done)
(defvar gc--opportunistic-state 'noncmd)
(defvar gc--opportunistic-counters nil)

(defun gc--check ()
  (let ((gcs-counted
         (+ (alist-get 'multi-gcs gc--opportunistic-counters 0)
                   (alist-get 'earlier-gcs gc--opportunistic-counters 0)
                   (alist-get 'single-gc-cmds gc--opportunistic-counters 0)
                   (alist-get 'noncmds-gcs gc--opportunistic-counters 0)
                   (alist-get 'opportunistic-gcs gc--opportunistic-counters 0)
                   (or (car (alist-get 'error-gcs gc--opportunistic-counters)) 0))))
    (unless (= gcs-done gcs-counted)
      (push (+ (- gcs-done gcs-counted)
               (or (car (alist-get 'error-gcs gc--opportunistic-counters)) 0))
            (alist-get 'error-gcs gc--opportunistic-counters)))))

(defun gc--opportunistic-record (nextstate)
  (let ((counts (alist-get gc--opportunistic-state gc--opportunistic-counters)))
    (unless counts
      (setf (alist-get gc--opportunistic-state gc--opportunistic-counters)
            (setq counts (list 0 0 0))))
    (pcase (prog1 (- gcs-done gc--opportunistic-last-gcs)
             (setq gc--opportunistic-last-gcs gcs-done))
      ((pred (>= 0)) nil)
      (1 (cl-incf (nth 0 counts)))
      (gcs (cl-incf (nth 1 counts))
           (cl-incf (nth 2 counts) gcs))))
  (setq gc--opportunistic-state nextstate))

(defun gc--opportunistic-postch ()
  (cl-incf (alist-get 'commands gc--opportunistic-counters 0))
  (gc--opportunistic-record 'noncmd))

(defun gc--opportunistic-prech ()
  (cl-callf identity
      (alist-get 'earlier-gcs gc--opportunistic-counters gcs-done))
  (gc--opportunistic-record 'cmd)
  ;; (gc--check)
  )

(defun gc--opportunistic-jitlock (orig-fun start)
  (if (eq gc--opportunistic-state 'cmd)
      ;; Count jit-lock execution which happens during a command as
      ;; being part of command execution rather than as part of jit-lock!
      (funcall orig-fun start)
    (let ((gc--opportunistic-state gc--opportunistic-state))
      (gc--opportunistic-record 'jit)
      (unwind-protect
          (funcall orig-fun start)
        (gc--opportunistic-record 'postjit)))))

(add-hook 'pre-command-hook #'gc--opportunistic-prech)
(add-hook 'post-command-hook #'gc--opportunistic-postch)
(advice-add 'jit-lock-function :around #'gc--opportunistic-jitlock)

(defun gc--opportunistic ()
  "Run the GC during idle time."
  ;; This is good for two reasons:
  ;; - It reduces the number of times we have to GC in the middle of
  ;;   an operation.
  ;; - It means we GC when the C stack is short, reducing the risk of false
  ;;   positives from the conservative stack scanning.
  (when (garbage-collect-maybe 3)
    (cl-incf (alist-get 'opportunistic-gcs gc--opportunistic-counters 0))
    ;; Don't double count this GC in other categories.
    (cl-incf gc--opportunistic-last-gcs)
    ;; Recalibrate the timer.
    (cancel-function-timers #'gc--opportunistic)
    (run-with-idle-timer
     ;; FIXME: Magic formula!
     (+ 1 (* 10 (/ gc-elapsed gcs-done))) t #'gc--opportunistic)))

(defun gc--opportunistic-score ()
  "Show the current counters's that keep track of GC behavior."
  (interactive)
  (message "%S" gc--opportunistic-counters))





^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  3:35 Opportunistic GC Stefan Monnier
@ 2021-03-08  7:20 ` Pip Cet
  2021-03-08  8:26   ` martin rudalics
                     ` (5 more replies)
  0 siblings, 6 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08  7:20 UTC (permalink / raw)
  To: Stefan Monnier, eliz; +Cc: emacs-devel

On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> I've been running with the code below recently to try and see if
> opportunistic GC can be worth the trouble.

Just a random idea: What if we exploit the fact most people have more
than one CPU core these days, garbage-collect in a separate fork()ed
process, then do only the sweeping in the main process?

Immediate advantage: we'd never mark objects in the "real" Emacs
process, making debugging easier.

My implementation proposal would be to pipe(), fork(), run GC, then
send through the pipe the Lisp_Objects which can be collected by the
original Emacs.

For me, it was a bit difficult to see that this would indeed be safe,
but I'm now pretty convinced it would be: objects that are unreachable
in the child Emacs cannot become reachable in the parent Emacs (they
might show up on the stack, but that's a false positive).

We'd have to be careful not to run two "real" GC processes at the same
time. And, of course, we'd require fork() and pipe(), so I'm not sure
this would work on Windows. Furthermore, if we ever decide to fall
back to doing the real GC in the main process, we must not sweep
objects which were determined to be unreachable by a conflicting
henchman process.

And of course it would mean that we would increase memory usage for
GC, so in low-memory situations, we'd have to fall back to traditional
GC.

Eli, would this qualify as a "small" GC change, and thus be vetoed?

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  7:20 ` Pip Cet
@ 2021-03-08  8:26   ` martin rudalics
  2021-03-08  9:02     ` Pip Cet
  2021-03-08 14:01   ` Andrea Corallo via Emacs development discussions.
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 61+ messages in thread
From: martin rudalics @ 2021-03-08  8:26 UTC (permalink / raw)
  To: Pip Cet, Stefan Monnier, eliz; +Cc: emacs-devel

 > For me, it was a bit difficult to see that this would indeed be safe,
 > but I'm now pretty convinced it would be: objects that are unreachable
 > in the child Emacs cannot become reachable in the parent Emacs (they
 > might show up on the stack, but that's a false positive).

How would this handle the standard problem of the Lisp thread storing a
pointer to an unmarked object A in an already marked object B and
releasing all other pointers to A?

martin



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  8:26   ` martin rudalics
@ 2021-03-08  9:02     ` Pip Cet
  2021-03-08  9:51       ` martin rudalics
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-08  9:02 UTC (permalink / raw)
  To: martin rudalics; +Cc: eliz, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 8:26 AM martin rudalics <rudalics@gmx.at> wrote:
>  > For me, it was a bit difficult to see that this would indeed be safe,
>  > but I'm now pretty convinced it would be: objects that are unreachable
>  > in the child Emacs cannot become reachable in the parent Emacs (they
>  > might show up on the stack, but that's a false positive).
>
> How would this handle the standard problem of the Lisp thread storing a
> pointer to an unmarked object A in an already marked object B and
> releasing all other pointers to A?

We fork(), we don't clone(). If A exists at the time of fork() but
isn't reachable through the thread's stack, that's a bug anyway; if it
doesn't exist yet at the time of fork(), it's not collected in this
cycle.

Did I miss something?

(Other than weak hash tables, of course, which may resurrect
unreachable objects, but I intend to handle those...)

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  9:02     ` Pip Cet
@ 2021-03-08  9:51       ` martin rudalics
  2021-03-08 10:44         ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: martin rudalics @ 2021-03-08  9:51 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, Stefan Monnier, emacs-devel

 > We fork(), we don't clone(). If A exists at the time of fork() but
 > isn't reachable through the thread's stack, that's a bug anyway; if it
 > doesn't exist yet at the time of fork(), it's not collected in this
 > cycle.

I have no idea what the difference between fork and clone implies here.

IIUC you have to make a copy of Lisp thread and heap, have the collector
operate on the copies and, when done, pass the now unmarked objects to
the Lisp thread for recycling in the original heap.  IIUC that doubles
the size needed for heap and stack.  And short-lived objects have to
wait for the next cycle to get recovered.  Or what am I missing?

martin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  9:51       ` martin rudalics
@ 2021-03-08 10:44         ` Pip Cet
  2021-03-08 11:37           ` martin rudalics
                             ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08 10:44 UTC (permalink / raw)
  To: martin rudalics; +Cc: eliz, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 9:51 AM martin rudalics <rudalics@gmx.at> wrote:
>  > We fork(), we don't clone(). If A exists at the time of fork() but
>  > isn't reachable through the thread's stack, that's a bug anyway; if it
>  > doesn't exist yet at the time of fork(), it's not collected in this
>  > cycle.
>
> I have no idea what the difference between fork and clone implies here.

fork() is a (relatively) cheap copy-on-write "copy" of the address space.

clone() would cause severe problems, of the type you describe above.
fork() is the (very, computers are expensive) poor man's way to avoid
those problems.

> IIUC you have to make a copy of Lisp thread and heap, have the collector

A "copy", as above.

> operate on the copies and, when done, pass the now unmarked objects to
> the Lisp thread for recycling in the original heap.  IIUC that doubles

Yes, that part is correct.

> the size needed for heap and stack.

Not quite. It's a copy-on-write copy, and we would avoid unsharing the
pages we don't set mark bits in. Pages that get collected don't have
set mark bits, so they wouldn't get copied.

And, yes, we should keep the mark bits separate from the data so we
could avoid unsharing an entire page because a single object in it
survives GC.

If the stack is of significant size (it usually isn't), it won't get
copied, either.

> And short-lived objects have to
> wait for the next cycle to get recovered.

I'm not aware of any GC algorithm that recovers objects allocated
after the GC cycle started :-)

> Or what am I missing?

Mostly that there should not be a doubling of memory usage. With the
mark bits fixed, memory usage would grow by, at most, 1/64th the heap
size on 64-bit systems, plus a constant. I think we can live with a
1.5% increase in memory usage if we get (effectively) zero-cost (or
zero latency) GC in exchange. (I'm assuming there'll be an unused CPU
core, which is usually true for me (unless I'm compiling something)).
With the mark bits not fixed, memory usage would increase by
approximately the size of the surviving heap, but not the size of the
discarded heap pages.

In addition to keeping the mark bits discontiguous from the marked
object, we should free entire pages without, as we do at present,
first faulting them in. That's part of the reason Emacs does so badly
when the system starts swapping.

Note that none of this is "real" GC: we still mark and sweep, just in
a slightly smarter way.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 10:44         ` Pip Cet
@ 2021-03-08 11:37           ` martin rudalics
  2021-03-08 12:27             ` Pip Cet
  2021-03-08 11:38           ` Stefan Kangas
  2021-03-08 14:49           ` Eli Zaretskii
  2 siblings, 1 reply; 61+ messages in thread
From: martin rudalics @ 2021-03-08 11:37 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, Stefan Monnier, emacs-devel

 >> And short-lived objects have to
 >> wait for the next cycle to get recovered.
 >
 > I'm not aware of any GC algorithm that recovers objects allocated
 > after the GC cycle started :-)

In the early phase of a collection cycle most objects are unmarked so it
would make sense to give a new object the mark status of the first
object referencing it.  But that's admittedly hard to do with ambiguous
roots.

 > Mostly that there should not be a doubling of memory usage. With the
 > mark bits fixed, memory usage would grow by, at most, 1/64th the heap
 > size on 64-bit systems, plus a constant. I think we can live with a
 > 1.5% increase in memory usage if we get (effectively) zero-cost (or
 > zero latency) GC in exchange. (I'm assuming there'll be an unused CPU
 > core, which is usually true for me (unless I'm compiling something)).
 > With the mark bits not fixed, memory usage would increase by
 > approximately the size of the surviving heap, but not the size of the
 > discarded heap pages.

But doesn't that depend on how many writes the Lisp thread performs in
the heap?  Each such write causes its associated page getting written
out to avoid that an old link gets lost.  And if the action was to drop
an old link, writing the page out doesn't even make sense.

 > Note that none of this is "real" GC: we still mark and sweep, just in
 > a slightly smarter way.

If with "real" you meant "real-time", then it might qualify as such.

martin



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 10:44         ` Pip Cet
  2021-03-08 11:37           ` martin rudalics
@ 2021-03-08 11:38           ` Stefan Kangas
  2021-03-08 12:55             ` Pip Cet
  2021-03-08 14:06             ` Andrea Corallo via Emacs development discussions.
  2021-03-08 14:49           ` Eli Zaretskii
  2 siblings, 2 replies; 61+ messages in thread
From: Stefan Kangas @ 2021-03-08 11:38 UTC (permalink / raw)
  To: Pip Cet, martin rudalics; +Cc: eliz, Stefan Monnier, emacs-devel

Pip Cet <pipcet@gmail.com> writes:

> Note that none of this is "real" GC: we still mark and sweep, just in
> a slightly smarter way.

Do you see any significant pros (or cons) to the approach you suggest as
compared to a generational GC?

Would we want/need to move to a generational GC if we had what you
describe?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 11:37           ` martin rudalics
@ 2021-03-08 12:27             ` Pip Cet
  2021-03-08 18:06               ` martin rudalics
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-08 12:27 UTC (permalink / raw)
  To: martin rudalics; +Cc: eliz, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 11:37 AM martin rudalics <rudalics@gmx.at> wrote:
>  >> And short-lived objects have to
>  >> wait for the next cycle to get recovered.
>  >
>  > I'm not aware of any GC algorithm that recovers objects allocated
>  > after the GC cycle started :-)
>
> In the early phase of a collection cycle most objects are unmarked so it
> would make sense to give a new object the mark status of the first
> object referencing it.  But that's admittedly hard to do with ambiguous
> roots.

Perhaps it's worth keeping in mind that this is a relatively simple
change, not a complete GC rewrite.

>  > Mostly that there should not be a doubling of memory usage. With the
>  > mark bits fixed, memory usage would grow by, at most, 1/64th the heap
>  > size on 64-bit systems, plus a constant. I think we can live with a
>  > 1.5% increase in memory usage if we get (effectively) zero-cost (or
>  > zero latency) GC in exchange. (I'm assuming there'll be an unused CPU
>  > core, which is usually true for me (unless I'm compiling something)).
>  > With the mark bits not fixed, memory usage would increase by
>  > approximately the size of the surviving heap, but not the size of the
>  > discarded heap pages.
>
> But doesn't that depend on how many writes the Lisp thread performs in
> the heap?

You're right, it does. Thanks for correcting me on this point.

I don't think it's going to matter in practice, though. GC, after all,
does not take that long, and the retained working set of most Lisp
programs is small, memory is cheap, and it is a factor of two in the
absolute worst conceivable case.

> Each such write causes its associated page getting written
> out to avoid that an old link gets lost.  And if the action was to drop
> an old link, writing the page out doesn't even make sense.

I'm not sure what you mean by "drop an old link".

>  > Note that none of this is "real" GC: we still mark and sweep, just in
>  > a slightly smarter way.
>
> If with "real" you meant "real-time", then it might qualify as such.

No, I meant "real" as opposed to mark-and-sweep style algorithms.
Generational would have been a better (but possibly slightly
narrower?) term.

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 11:38           ` Stefan Kangas
@ 2021-03-08 12:55             ` Pip Cet
  2021-03-08 14:06             ` Andrea Corallo via Emacs development discussions.
  1 sibling, 0 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08 12:55 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: martin rudalics, eliz, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 11:38 AM Stefan Kangas <stefankangas@gmail.com> wrote:
> Pip Cet <pipcet@gmail.com> writes:
> > Note that none of this is "real" GC: we still mark and sweep, just in
> > a slightly smarter way.
>
> Do you see any significant pros (or cons) to the approach you suggest as
> compared to a generational GC?

In general, I see significant cons in my approach of incrementally
improving GC a little here and there: a complete rewrite would be much
better (but, at least at this point in time, I'm not volunteering for
it). So I think any time sunk into this may be lost, eventually, and
we're only patching holes in our straw hut until we find time to
demolish it and build a skyscraper instead.

Other cons:
- requires POSIX
- requires efficient OS paging

Pros:
- it's cheap, typically (i.e. on GNU/Linux with a "typical" workload)
- it's easy to do (comparatively)
- there are going to be tangible and immediate benefits
- it's a first step towards making GC more hackable again, so a future
GCologist will find it easier to implement generational GC

> Would we want/need to move to a generational GC if we had what you
> describe?

Yes. Absolutely.

As I said above, I think this might actually help generational GC
implementations a lot: those are going to want a "tell me which
objects would be collected if I ran mark-and-sweep now" function for
checking, and that's what this is.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  7:20 ` Pip Cet
  2021-03-08  8:26   ` martin rudalics
@ 2021-03-08 14:01   ` Andrea Corallo via Emacs development discussions.
  2021-03-08 15:04     ` Pip Cet
  2021-03-08 14:45   ` Eli Zaretskii
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 61+ messages in thread
From: Andrea Corallo via Emacs development discussions. @ 2021-03-08 14:01 UTC (permalink / raw)
  To: Pip Cet; +Cc: Stefan Monnier, eliz, emacs-devel

Pip Cet <pipcet@gmail.com> writes:

> On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> I've been running with the code below recently to try and see if
>> opportunistic GC can be worth the trouble.
>
> Just a random idea: What if we exploit the fact most people have more
> than one CPU core these days, garbage-collect in a separate fork()ed
> process, then do only the sweeping in the main process?

It's an interesting concept, I thought about something similar in the
past but, I've two questions:

Are we really sure that a non trivial process can long survive
or work as expected after being forked?

Do we have (an efficient) fork on all supported systems?

Thanks

  Andrea



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 11:38           ` Stefan Kangas
  2021-03-08 12:55             ` Pip Cet
@ 2021-03-08 14:06             ` Andrea Corallo via Emacs development discussions.
  1 sibling, 0 replies; 61+ messages in thread
From: Andrea Corallo via Emacs development discussions. @ 2021-03-08 14:06 UTC (permalink / raw)
  To: Stefan Kangas; +Cc: Pip Cet, martin rudalics, eliz, Stefan Monnier, emacs-devel

Stefan Kangas <stefankangas@gmail.com> writes:

> Pip Cet <pipcet@gmail.com> writes:
>
>> Note that none of this is "real" GC: we still mark and sweep, just in
>> a slightly smarter way.
>
> Do you see any significant pros (or cons) to the approach you suggest as
> compared to a generational GC?
>
> Would we want/need to move to a generational GC if we had what you
> describe?

I think generational GC are generally great for efficiency, but for the
kind of application Emacs is and the multi-core world we live in I think
we are more interest in parallelism.

IOW we may be very happy to sacrifice efficiency if we gain in
parallelism and diminish or zero the GC pauses.

Regards

  Andrea



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  7:20 ` Pip Cet
  2021-03-08  8:26   ` martin rudalics
  2021-03-08 14:01   ` Andrea Corallo via Emacs development discussions.
@ 2021-03-08 14:45   ` Eli Zaretskii
  2021-03-08 14:57     ` Pip Cet
  2021-03-08 16:03   ` Concurrent GC via `fork` (was: Opportunistic GC) Stefan Monnier
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 14:45 UTC (permalink / raw)
  To: Pip Cet; +Cc: monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 07:20:28 +0000
> Cc: emacs-devel@gnu.org
> 
> My implementation proposal would be to pipe(), fork(), run GC, then
> send through the pipe the Lisp_Objects which can be collected by the
> original Emacs.

What happens when, on a 8GB machine, we have an Emacs session with a
5GB memory footprint, and you fork and start marking (which triggers
copy-on-write)?

Also, I quite frequently need to run Emacs on a system where I'm
forbidden to run more than 2 processes simultaneously ("make -j3"
aborts with an error message), and you propose to take those two
processes with 2 copies of Emacs?

IOW, I don't believe this scales well enough.

> Eli, would this qualify as a "small" GC change, and thus be vetoed?

I don't yet see how it is a good idea to begin with.  I'm probably
missing something.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 10:44         ` Pip Cet
  2021-03-08 11:37           ` martin rudalics
  2021-03-08 11:38           ` Stefan Kangas
@ 2021-03-08 14:49           ` Eli Zaretskii
  2021-03-08 15:02             ` Pip Cet
  2 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 14:49 UTC (permalink / raw)
  To: Pip Cet; +Cc: rudalics, monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 10:44:06 +0000
> Cc: eliz@gnu.org, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> And, yes, we should keep the mark bits separate from the data so we
> could avoid unsharing an entire page because a single object in it
> survives GC.

Ah, so not really an easy change after all.

And how do you propose to keep the mark bits separately and still
maintain coherency between the object and its bits?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 14:45   ` Eli Zaretskii
@ 2021-03-08 14:57     ` Pip Cet
  2021-03-08 15:16       ` Eli Zaretskii
  2021-03-09 13:01       ` Eli Zaretskii
  0 siblings, 2 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08 14:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 2:45 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 8 Mar 2021 07:20:28 +0000
> > Cc: emacs-devel@gnu.org
> >
> > My implementation proposal would be to pipe(), fork(), run GC, then
> > send through the pipe the Lisp_Objects which can be collected by the
> > original Emacs.
>
> What happens when, on a 8GB machine, we have an Emacs session with a
> 5GB memory footprint, and you fork and start marking (which triggers
> copy-on-write)?

That is, indeed, one of the cases in which we have to use the
synchronous GC or suffer the consequences.

Let me be clear about that: yes, it would use (up to) 10 GB of memory,
if you really have 5 GB of Lisp data.

> Also, I quite frequently need to run Emacs on a system where I'm
> forbidden to run more than 2 processes simultaneously ("make -j3"
> aborts with an error message), and you propose to take those two
> processes with 2 copies of Emacs?

That is, indeed, one of the cases in which we have to use synchronous GC.

> IOW, I don't believe this scales well enough.

I believe it has applications, including my own personal use of Emacs.
I don't believe it's universally better than synchronous GC.

> > Eli, would this qualify as a "small" GC change, and thus be vetoed?
> I don't yet see how it is a good idea to begin with.  I'm probably
> missing something.

Once in a while, every few days or so, GC happens at an inconvenient
time to me. I'd like to interrupt it, but I can't. I'd like for Emacs
not to become unresponsive while it's mark-and-sweeping, but it is.
Sometimes, I end up killing Emacs rather than waiting for GC to
complete.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 14:49           ` Eli Zaretskii
@ 2021-03-08 15:02             ` Pip Cet
  2021-03-08 15:23               ` Eli Zaretskii
  2021-03-08 15:46               ` Stefan Monnier
  0 siblings, 2 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08 15:02 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 2:49 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 8 Mar 2021 10:44:06 +0000
> > Cc: eliz@gnu.org, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> >
> > And, yes, we should keep the mark bits separate from the data so we
> > could avoid unsharing an entire page because a single object in it
> > survives GC.
>
> Ah, so not really an easy change after all.

For conses, it's easy. Instead of keeping the mark bits in the cons
block, we keep a pointer to the mark bits in the cons block, and
allocate the mark bits separately and (hopefully) on a different page.

> And how do you propose to keep the mark bits separately and still
> maintain coherency between the object and its bits?

Just as we do for pdumper, which already does this. In fact, I'm
pretty sure it was Daniel who said a while ago that he wants to extend
that to all GCable objects, not just those that live in the pdmp.

AFAIK, we never go from mark bit to object, only from object to mark
bit. And that would simply be behind a level of pointer indirection.

In each Emacs process, there'd still be a 1-to-1 relation between
objects and mark bits. It would just so happen that there are two
Emacs processes, one of which only checks which objects are
unreachable in it and tells the other one about them.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 14:01   ` Andrea Corallo via Emacs development discussions.
@ 2021-03-08 15:04     ` Pip Cet
  2021-03-08 15:50       ` Stefan Monnier
  2021-03-08 15:52       ` Andrea Corallo via Emacs development discussions.
  0 siblings, 2 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08 15:04 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: eliz, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 2:01 PM Andrea Corallo <akrl@sdf.org> wrote:
> Pip Cet <pipcet@gmail.com> writes:
>
> > On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> I've been running with the code below recently to try and see if
> >> opportunistic GC can be worth the trouble.
> >
> > Just a random idea: What if we exploit the fact most people have more
> > than one CPU core these days, garbage-collect in a separate fork()ed
> > process, then do only the sweeping in the main process?
>
> It's an interesting concept, I thought about something similar in the
> past but, I've two questions:
>
> Are we really sure that a non trivial process can long survive
> or work as expected after being forked?

I believe that, on all systems where bash works, we can expect fork()
to behave. We do need to block signals, obviously, and must not do any
real I/O in the subprocess (in particular, we must not
unblock_input(), which I confess I did until just now).

> Do we have (an efficient) fork on all supported systems?

I'm not sure. I do know some systems prefer using spawn() to fork() +
exec(), and I always suspected the main reason for that is because
they do not have an efficient fork().

I do think that there should always be a runtime option to get the
conventional synchronous GC, as that is bound to remain the faster
option.

As Eli points out, there are at least two significant use cases in
which we have to use synchronous GC: low memory machines with large
workloads, and limits on the number of processes we can launch. I
believe single-thread machines are also likely not to benefit from it.

But note that we already do launch quite a few asynchronous Emacs
instances, on the native-comp branch. This would be just one, a
short-lived one at that.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 14:57     ` Pip Cet
@ 2021-03-08 15:16       ` Eli Zaretskii
  2021-03-09 13:01       ` Eli Zaretskii
  1 sibling, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 15:16 UTC (permalink / raw)
  To: Pip Cet; +Cc: monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 14:57:06 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> Once in a while, every few days or so, GC happens at an inconvenient
> time to me. I'd like to interrupt it, but I can't. I'd like for Emacs
> not to become unresponsive while it's mark-and-sweeping, but it is.
> Sometimes, I end up killing Emacs rather than waiting for GC to
> complete.

I'm aware of the issues we have with the current GC, but those issues
don't automatically endorse _any_ alternative.  The alternative must
make sense, be scalable, and have other useful properties before we
decide it's worth our while.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:02             ` Pip Cet
@ 2021-03-08 15:23               ` Eli Zaretskii
  2021-03-08 15:39                 ` Pip Cet
  2021-03-08 15:46               ` Stefan Monnier
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 15:23 UTC (permalink / raw)
  To: Pip Cet; +Cc: rudalics, monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 15:02:18 +0000
> Cc: rudalics@gmx.at, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> For conses, it's easy. Instead of keeping the mark bits in the cons
> block, we keep a pointer to the mark bits in the cons block, and
> allocate the mark bits separately and (hopefully) on a different page.

And for others?

Also, doesn't it mean you increase the memory required for a cons by a
factor of 1.5, as it now needs one more member (3 words instead of 2)?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:23               ` Eli Zaretskii
@ 2021-03-08 15:39                 ` Pip Cet
  2021-03-08 16:27                   ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-08 15:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 3:23 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 8 Mar 2021 15:02:18 +0000
> > Cc: rudalics@gmx.at, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> >
> > For conses, it's easy. Instead of keeping the mark bits in the cons
> > block, we keep a pointer to the mark bits in the cons block, and
> > allocate the mark bits separately and (hopefully) on a different page.

> And for others?

Symbols: there's probably no need, as there are rarely sessions with
millions of symbols.
Floats: work just as conses do.
Strings: I don't think there are millions of short strings on the
heap, usually, but I'm not sure. Anyway, they're just like vectors.
Vectors: That's the difficult one; small vectors are kept in vector
blocks, and we'd need to have one bit per vector, which would mean one
bit for every two lisp words in the vector block, IIUC. So a storage
size increase of 1/128, 1/64 on 32-bit machines. Large vectors are,
well, large, so we can afford to keep an entire malloc() block for
their single mark bit, which would also come out to 1/128 using
typical numbers, if I got the math right.

But this would be mere optimization, of course.

> Also, doesn't it mean you increase the memory required for a cons by a
> factor of 1.5, as it now needs one more member (3 words instead of 2)?

There'd be one pointer per cons block, not one per cons, so the factor
is 1.008 on this machine.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:02             ` Pip Cet
  2021-03-08 15:23               ` Eli Zaretskii
@ 2021-03-08 15:46               ` Stefan Monnier
  2021-03-08 16:11                 ` Pip Cet
  1 sibling, 1 reply; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 15:46 UTC (permalink / raw)
  To: Pip Cet; +Cc: rudalics, Eli Zaretskii, emacs-devel

>> Ah, so not really an easy change after all.
>
> For conses, it's easy. Instead of keeping the mark bits in the cons
> block, we keep a pointer to the mark bits in the cons block, and
> allocate the mark bits separately and (hopefully) on a different page.

What you mean is that it's easy because the markbits are already
separate from the cons objects (same for floats).  We could also just
increase the size of the cons blocks such that their markbits area takes
up a whole page.  That means cons blocks of about (* 16B/cons 8b/B) =>
128 pages => 512KB.

>> And how do you propose to keep the mark bits separately and still
>> maintain coherency between the object and its bits?
> Just as we do for pdumper, which already does this.  In fact, I'm
> pretty sure it was Daniel who said a while ago that he wants to extend
> that to all GCable objects, not just those that live in the pdmp.

Yes, it's a good idea to do that in general.


        Stefan




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:04     ` Pip Cet
@ 2021-03-08 15:50       ` Stefan Monnier
  2021-03-08 15:52       ` Andrea Corallo via Emacs development discussions.
  1 sibling, 0 replies; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 15:50 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, emacs-devel, Andrea Corallo

> I'm not sure. I do know some systems prefer using spawn() to fork() +
> exec(), and I always suspected the main reason for that is because
> they do not have an efficient fork().

IIRC the reason is that `fork` can be unnecessarily costly if your
forked application is large and it just wants to spawn a small /bin/sh
process: while `fork` doesn't need to copy the application's pages, it
still has to copy its pagetable(!) and it may have to pre-allocate or
pre-reserve the pages that it may copy-on-write later.

        Stefan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:04     ` Pip Cet
  2021-03-08 15:50       ` Stefan Monnier
@ 2021-03-08 15:52       ` Andrea Corallo via Emacs development discussions.
  2021-03-08 16:14         ` Philipp Stephani
  2021-03-08 16:44         ` Stefan Monnier
  1 sibling, 2 replies; 61+ messages in thread
From: Andrea Corallo via Emacs development discussions. @ 2021-03-08 15:52 UTC (permalink / raw)
  To: Pip Cet; +Cc: Stefan Monnier, eliz, emacs-devel

Pip Cet <pipcet@gmail.com> writes:

> On Mon, Mar 8, 2021 at 2:01 PM Andrea Corallo <akrl@sdf.org> wrote:
>> Pip Cet <pipcet@gmail.com> writes:
>>
>> > On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> >> I've been running with the code below recently to try and see if
>> >> opportunistic GC can be worth the trouble.
>> >
>> > Just a random idea: What if we exploit the fact most people have more
>> > than one CPU core these days, garbage-collect in a separate fork()ed
>> > process, then do only the sweeping in the main process?
>>
>> It's an interesting concept, I thought about something similar in the
>> past but, I've two questions:
>>
>> Are we really sure that a non trivial process can long survive
>> or work as expected after being forked?
>
> I believe that, on all systems where bash works, we can expect fork()
> to behave.

I also expect fork will work, I'm more wondering about interaction with
sockets file descriptors or anything else in our codebase and/or in
libraries we have loaded.

I've read in the past this is an issue in practice in non trivial code
and that, because the typical use of fork () nowadays is just to run
afterward exec (), operating in the grey between the two was not really
a good idea.  But I don't have the reference now and if you say it works
I've no reasons not to trust you.

  Andrea



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Concurrent GC via `fork` (was: Opportunistic GC)
  2021-03-08  7:20 ` Pip Cet
                     ` (2 preceding siblings ...)
  2021-03-08 14:45   ` Eli Zaretskii
@ 2021-03-08 16:03   ` Stefan Monnier
  2021-03-08 16:25     ` Pip Cet
  2021-03-10 17:18   ` Opportunistic GC Matt Armstrong
  2022-07-05  1:47   ` Stefan Monnier
  5 siblings, 1 reply; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 16:03 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, emacs-devel

> Just a random idea: What if we exploit the fact most people have more
> than one CPU core these days, garbage-collect in a separate fork()ed
> process, then do only the sweeping in the main process?

Sounds like a good idea.
A concurrent GC which relies on `fork` to take the snapshot.

I don't think we'll be able to always use it, so the synchronous code
path will have to stay, but I think it's definitely worth working on it.

It will probably require moving the markbits out of the objects in order
to work well (which would probably be beneficial in the synchronous GC
case as well), but other than that I'd expect it should be possible to
get pretty good performance out of it in the "normal" case:
- most pages should stay shared between the two processes (because the
  GC process should basically only modify its stack and its markbit
  pages, and the parent process should hopefully not have enough time to
  modify a large fraction of its working set).
- the "normal" case I envision is one where there's a fair bit of extra
  RAM left in the system and a few CPU cores waiting to help Emacs.

> Eli, would this qualify as a "small" GC change, and thus be vetoed?

I like the idea, but its inclusion can't be discussed without
a proof-of-concept patch.


        Stefan




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:46               ` Stefan Monnier
@ 2021-03-08 16:11                 ` Pip Cet
  2021-03-08 16:37                   ` Stefan Monnier
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-08 16:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: rudalics, Eli Zaretskii, emacs-devel

On Mon, Mar 8, 2021 at 3:46 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> Ah, so not really an easy change after all.
> >
> > For conses, it's easy. Instead of keeping the mark bits in the cons
> > block, we keep a pointer to the mark bits in the cons block, and
> > allocate the mark bits separately and (hopefully) on a different page.
>
> What you mean is that it's easy because the markbits are already
> separate from the cons objects (same for floats).  We could also just
> increase the size of the cons blocks such that their markbits area takes
> up a whole page.  That means cons blocks of about (* 16B/cons 8b/B) =>
> 128 pages => 512KB.

I don't understand the benefits of doing that. The disadvantages would
be very significant: we can't free a cons block until all its conses
have been freed, and that's drastically more likely if we increase the
cons block size by a factor of 128.

Note that 4KB pages are no longer all there is. I hear even VAX uses a
different page size ;-)

> >> And how do you propose to keep the mark bits separately and still
> >> maintain coherency between the object and its bits?
> > Just as we do for pdumper, which already does this.  In fact, I'm
> > pretty sure it was Daniel who said a while ago that he wants to extend
> > that to all GCable objects, not just those that live in the pdmp.

> Yes, it's a good idea to do that in general.

I agree it's a good idea to use the same approach for pdumped objects
and regular old objects, but I don't think the current pdumper
approach extends to dynamically-allocated objects, and I'd rather
avoid searching a btree for each mark bit. That means pdumper has to
be modified to keep its objects in blocks, and those blocks need a
pointer to their mark bits, but we don't want to fault them in just to
set the mark bits pointer, so we'd need a relative pointer, and then
things start getting complicated...

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:52       ` Andrea Corallo via Emacs development discussions.
@ 2021-03-08 16:14         ` Philipp Stephani
  2021-03-08 16:18           ` Andrea Corallo via Emacs development discussions.
  2021-03-08 16:44         ` Stefan Monnier
  1 sibling, 1 reply; 61+ messages in thread
From: Philipp Stephani @ 2021-03-08 16:14 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Eli Zaretskii, Stefan Monnier, Pip Cet, Emacs developers

Am Mo., 8. März 2021 um 16:53 Uhr schrieb Andrea Corallo via Emacs
development discussions. <emacs-devel@gnu.org>:
>
> Pip Cet <pipcet@gmail.com> writes:
>
> > On Mon, Mar 8, 2021 at 2:01 PM Andrea Corallo <akrl@sdf.org> wrote:
> >> Pip Cet <pipcet@gmail.com> writes:
> >>
> >> > On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> >> I've been running with the code below recently to try and see if
> >> >> opportunistic GC can be worth the trouble.
> >> >
> >> > Just a random idea: What if we exploit the fact most people have more
> >> > than one CPU core these days, garbage-collect in a separate fork()ed
> >> > process, then do only the sweeping in the main process?
> >>
> >> It's an interesting concept, I thought about something similar in the
> >> past but, I've two questions:
> >>
> >> Are we really sure that a non trivial process can long survive
> >> or work as expected after being forked?
> >
> > I believe that, on all systems where bash works, we can expect fork()
> > to behave.
>
> I also expect fork will work, I'm more wondering about interaction with
> sockets file descriptors or anything else in our codebase and/or in
> libraries we have loaded.
>
> I've read in the past this is an issue in practice in non trivial code
> and that, because the typical use of fork () nowadays is just to run
> afterward exec (), operating in the grey between the two was not really
> a good idea.  But I don't have the reference now and if you say it works
> I've no reasons not to trust you.

The restriction is that you can only call async-signal-safe functions
in subprocesses before exec. See e.g. the man page for fork on
GNU/Linux.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:14         ` Philipp Stephani
@ 2021-03-08 16:18           ` Andrea Corallo via Emacs development discussions.
  2021-03-08 16:40             ` Philipp Stephani
  0 siblings, 1 reply; 61+ messages in thread
From: Andrea Corallo via Emacs development discussions. @ 2021-03-08 16:18 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: Pip Cet, Stefan Monnier, Eli Zaretskii, Emacs developers

Philipp Stephani <p.stephani2@gmail.com> writes:

[...]

>> I also expect fork will work, I'm more wondering about interaction with
>> sockets file descriptors or anything else in our codebase and/or in
>> libraries we have loaded.
>>
>> I've read in the past this is an issue in practice in non trivial code
>> and that, because the typical use of fork () nowadays is just to run
>> afterward exec (), operating in the grey between the two was not really
>> a good idea.  But I don't have the reference now and if you say it works
>> I've no reasons not to trust you.
>
> The restriction is that you can only call async-signal-safe functions
> in subprocesses before exec. See e.g. the man page for fork on
> GNU/Linux.

Can we enforce that for our case?

  Andrea



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Concurrent GC via `fork` (was: Opportunistic GC)
  2021-03-08 16:03   ` Concurrent GC via `fork` (was: Opportunistic GC) Stefan Monnier
@ 2021-03-08 16:25     ` Pip Cet
  2021-03-08 17:37       ` Concurrent GC via `fork` Stefan Monnier
  2021-03-08 19:21       ` Lars Ingebrigtsen
  0 siblings, 2 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-08 16:25 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, emacs-devel

On Mon, Mar 8, 2021 at 4:03 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > Just a random idea: What if we exploit the fact most people have more
> > than one CPU core these days, garbage-collect in a separate fork()ed
> > process, then do only the sweeping in the main process?
>
> Sounds like a good idea.

Thanks.

Sorry for derailing your thread, I should have chosen a new subject.

> A concurrent GC which relies on `fork` to take the snapshot.
> I don't think we'll be able to always use it, so the synchronous code
> path will have to stay,

Absolutely.

And I do want to mention weak hash tables: they're a problem, because
they can resurrect unreachable objects. Worse, while normal
weak-key-strong-value hash tables only resurrect objects when you
maphash, weak-value hash tables can do so for gethash as well, so that
code path would become more expensive and more difficult to translate
into native code.

> but I think it's definitely worth working on it.

I've got it working for conses, I think (as a proof of concept,
synchronously, with an extra waitpid()).

> It will probably require moving the markbits out of the objects in order
> to work well (which would probably be beneficial in the synchronous GC
> case as well), but other than that I'd expect it should be possible to
> get pretty good performance out of it in the "normal" case:

The normal case, for me, is Emacs freeing a lot of memory in a GC
cycle, and keeping only a relatively small working set. That should
work well with the way markbits are written only for objects that are
to be kept.

> - most pages should stay shared between the two processes (because the
>   GC process should basically only modify its stack and its markbit
>   pages, and the parent process should hopefully not have enough time to
>   modify a large fraction of its working set).
> - the "normal" case I envision is one where there's a fair bit of extra
>   RAM left in the system and a few CPU cores waiting to help Emacs.

Indeed.

Apart from hating threads and loving processes, I love nice levels,
and I think this would be a good use case for them. For example, I
hear there's a new CPU out there that has two separate sets of cores
for "efficiency" and "performance", and it would be great if Emacs ran
an "efficiency" garbage collection in the background rather than a
"performance" garbage collection synchronously.

> > Eli, would this qualify as a "small" GC change, and thus be vetoed?
>
> I like the idea, but its inclusion can't be discussed without
> a proof-of-concept patch.

I think that's the sensible approach, yes. I certainly wasn't hoping
to hear anything _positive_ before submitting a patch.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:39                 ` Pip Cet
@ 2021-03-08 16:27                   ` Eli Zaretskii
  2021-03-08 16:35                     ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 16:27 UTC (permalink / raw)
  To: Pip Cet; +Cc: rudalics, monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 15:39:07 +0000
> Cc: rudalics@gmx.at, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> > Also, doesn't it mean you increase the memory required for a cons by a
> > factor of 1.5, as it now needs one more member (3 words instead of 2)?
> 
> There'd be one pointer per cons block, not one per cons, so the factor
> is 1.008 on this machine.

And how do you make sure the mark bits will be on a different/separate
page?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:27                   ` Eli Zaretskii
@ 2021-03-08 16:35                     ` Pip Cet
  2021-03-08 17:06                       ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-08 16:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: rudalics, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 4:27 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 8 Mar 2021 15:39:07 +0000
> > Cc: rudalics@gmx.at, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> >
> > > Also, doesn't it mean you increase the memory required for a cons by a
> > > factor of 1.5, as it now needs one more member (3 words instead of 2)?
> >
> > There'd be one pointer per cons block, not one per cons, so the factor
> > is 1.008 on this machine.
>
> And how do you make sure the mark bits will be on a different/separate
> page?

We allocate cons blocks aligned to physical pages. The mark bits don't
live in the cons block. Therefore, they can't live on the same page as
a cons block.

What am I missing?

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:11                 ` Pip Cet
@ 2021-03-08 16:37                   ` Stefan Monnier
  0 siblings, 0 replies; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 16:37 UTC (permalink / raw)
  To: Pip Cet; +Cc: rudalics, Eli Zaretskii, emacs-devel

>> What you mean is that it's easy because the markbits are already
>> separate from the cons objects (same for floats).  We could also just
>> increase the size of the cons blocks such that their markbits area takes
>> up a whole page.  That means cons blocks of about (* 16B/cons 8b/B) =>
>> 128 pages => 512KB.
> I don't understand the benefits of doing that.

I'm not specifically arguing for it, actually.  Just pointing out that
there's fair bit of design space here.  I think we'd generally prefer to
avoid an extra level of indirection (the marking phase tends to "kill"
the cache and suffer from memory latency), so if we can compute the
address of the markbit directly from the object address, it's better.

> The disadvantages would be very significant: we can't free a cons
> block until all its conses have been freed, and that's drastically
> more likely if we increase the cons block size

Fully agreed.

> by a factor of 128.

[ IIRC the cons blocks are currently 16KB, so it would increase it by
  a slightly smaller factor.  Not that it matters, tho.  ]

> Note that 4KB pages are no longer all there is. I hear even VAX uses a
> different page size ;-)

Yes, we're *finally* slowly moving to slightly larger pages.

>> > Just as we do for pdumper, which already does this.  In fact, I'm
>> > pretty sure it was Daniel who said a while ago that he wants to extend
>> > that to all GCable objects, not just those that live in the pdmp.
>> Yes, it's a good idea to do that in general.
> I agree it's a good idea to use the same approach for pdumped objects
> and regular old objects, but I don't think the current pdumper
> approach extends to dynamically-allocated objects,

Indeed.

> and I'd rather avoid searching a btree for each mark bit. That means
> pdumper has to be modified to keep its objects in blocks, and those

Not just pdumper.  In the GC, the most important objects AFAIK are
strings and vector(like)s.  Vectors used to be handled fairly
inefficiently, but their importance has grown, so it's worth trying to
handle them very efficiently: there's even silver lining in that it
would allow us to treat more of the other types as vectors
(e.g. strings) and thus simplify the code.

E.g. It'd be worthwhile improving our vector allocation so they get
allocated in blocks of vectors of the same size.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:18           ` Andrea Corallo via Emacs development discussions.
@ 2021-03-08 16:40             ` Philipp Stephani
  0 siblings, 0 replies; 61+ messages in thread
From: Philipp Stephani @ 2021-03-08 16:40 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: Eli Zaretskii, Stefan Monnier, Pip Cet, Emacs developers

Am Mo., 8. März 2021 um 17:18 Uhr schrieb Andrea Corallo <akrl@sdf.org>:
>
> Philipp Stephani <p.stephani2@gmail.com> writes:
>
> [...]
>
> >> I also expect fork will work, I'm more wondering about interaction with
> >> sockets file descriptors or anything else in our codebase and/or in
> >> libraries we have loaded.
> >>
> >> I've read in the past this is an issue in practice in non trivial code
> >> and that, because the typical use of fork () nowadays is just to run
> >> afterward exec (), operating in the grey between the two was not really
> >> a good idea.  But I don't have the reference now and if you say it works
> >> I've no reasons not to trust you.
> >
> > The restriction is that you can only call async-signal-safe functions
> > in subprocesses before exec. See e.g. the man page for fork on
> > GNU/Linux.
>
> Can we enforce that for our case?
>

It would be useful to have compiler attributes to mark
async-signal-safe functions and other attributes that enforce
async-signal-safely within a region, but I'm not aware of the
existence of such attributes.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 15:52       ` Andrea Corallo via Emacs development discussions.
  2021-03-08 16:14         ` Philipp Stephani
@ 2021-03-08 16:44         ` Stefan Monnier
  2021-03-08 17:13           ` Eli Zaretskii
  2021-03-09  6:22           ` Richard Stallman
  1 sibling, 2 replies; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 16:44 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: eliz, Pip Cet, emacs-devel

> I also expect fork will work, I'm more wondering about interaction with
> sockets file descriptors or anything else in our codebase and/or in
> libraries we have loaded.

I suspect we wouldn't want to use `fork` literally, but rather something
like Linux's `clone` so as to have more control about what is
really shared.  Basically, we just want to spawn a thread, except it runs
in a snapshot of the heap.


        Stefan




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:35                     ` Pip Cet
@ 2021-03-08 17:06                       ` Eli Zaretskii
  0 siblings, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 17:06 UTC (permalink / raw)
  To: Pip Cet; +Cc: rudalics, monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 16:35:30 +0000
> Cc: rudalics@gmx.at, Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> > And how do you make sure the mark bits will be on a different/separate
> > page?
> 
> We allocate cons blocks aligned to physical pages.

What is a "physical page" for this purpose?  You want a page whose
size will make sure you don't trigger copy-on-write when you change
some bit -- are you sure that this size is always the same as the
"page size" we use for so-called "page-aligned" allocations?

And if the answer is YES, how future-proof is it?

> What am I missing?

I don't know whether we are missing something.  We are on new ground
here: we never cared in Emacs about physical layout of memory, at
least not so much.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:44         ` Stefan Monnier
@ 2021-03-08 17:13           ` Eli Zaretskii
  2021-03-10 20:25             ` Pip Cet
  2021-03-09  6:22           ` Richard Stallman
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-08 17:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, pipcet, akrl

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Pip Cet <pipcet@gmail.com>,  eliz@gnu.org,  emacs-devel@gnu.org
> Date: Mon, 08 Mar 2021 11:44:05 -0500
> 
> > I also expect fork will work, I'm more wondering about interaction with
> > sockets file descriptors or anything else in our codebase and/or in
> > libraries we have loaded.
> 
> I suspect we wouldn't want to use `fork` literally, but rather something
> like Linux's `clone` so as to have more control about what is
> really shared.  Basically, we just want to spawn a thread, except it runs
> in a snapshot of the heap.

There are other potential complications with this, related to
threads.  Posix says 'fork' only copies a single thread, the one that
called it, but what if that single thread is not the main thread?  In
Emacs nowadays a non-main Lisp thread can trigger GC.  What happens to
the mutexes we use in the threads machinery when we fork like that?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Concurrent GC via `fork`
  2021-03-08 16:25     ` Pip Cet
@ 2021-03-08 17:37       ` Stefan Monnier
  2021-03-08 19:21       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 17:37 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, emacs-devel

> And I do want to mention weak hash tables: they're a problem, because
> they can resurrect unreachable objects.

Indeed, weak pointers require special attention to handle them correctly
with concurrent GCs.  But nothing insurmountable.

> Worse, while normal weak-key-strong-value hash tables only resurrect
> objects when you maphash, weak-value hash tables can do so for gethash
> as well, so that code path would become more expensive

I think it can be made "cheap enough".

> and more difficult to translate into native code.

I don't know what this has to do with native code.

>> but I think it's definitely worth working on it.
> I've got it working for conses, I think (as a proof of concept,
> synchronously, with an extra waitpid()).

I think a real proof of concept is one where the GC is never
run synchronously.

> Apart from hating threads and loving processes, I love nice levels,
> and I think this would be a good use case for them. For example, I
> hear there's a new CPU out there that has two separate sets of cores
> for "efficiency" and "performance", and it would be great if Emacs ran
> an "efficiency" garbage collection in the background rather than a
> "performance" garbage collection synchronously.

To the extent that the GC normally only uses less than 50% of the CPU
time, indeed we don't need it to run as fast the main thread.
More generally, the OS scheduling of the GC process should focus on
maximizing throughput rather than minimizing latency.


        Stefan




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 12:27             ` Pip Cet
@ 2021-03-08 18:06               ` martin rudalics
  2021-03-08 18:24                 ` Concurrent GC via fork (was: Opportunistic GC) Stefan Monnier
  2021-03-10 20:35                 ` Opportunistic GC Pip Cet
  0 siblings, 2 replies; 61+ messages in thread
From: martin rudalics @ 2021-03-08 18:06 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, Stefan Monnier, emacs-devel

 > I don't think it's going to matter in practice, though. GC, after all,
 > does not take that long,

.. as long as we do not increase the heap size with the excuse that GC
is now no more intrusive anyway ...

 > and the retained working set of most Lisp
 > programs is small, memory is cheap, and it is a factor of two in the
 > absolute worst conceivable case.

We don't have to reserve that space and closing the fork automatically
returns it to the OS?  I have no idea how copy-on-write is implemented.

 >> Each such write causes its associated page getting written
 >> out to avoid that an old link gets lost.  And if the action was to drop
 >> an old link, writing the page out doesn't even make sense.
 >
 > I'm not sure what you mean by "drop an old link".

Setting a car or cdr to nil (as opposed to setting a car or cdr to some
other cons cell).  In this case, the copy on write will trigger a fault
on that object's page to make sure that the previous reference gets
traced by the collector.  Usually, concurrent collectors do not care.

 > No, I meant "real" as opposed to mark-and-sweep style algorithms.
 > Generational would have been a better (but possibly slightly
 > narrower?) term.

So you mean mark-and-sweep as opposed to copying.  In principle, a
generational collector could be mark-and-sweep too.  Just that its
paging behavior would be probably disastrous over time.

We never collected any evidence on how long Elisp objects live.  So it's
not clear how much we could gain from promoting long-lived objects.

martin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Concurrent GC via fork (was: Opportunistic GC)
  2021-03-08 18:06               ` martin rudalics
@ 2021-03-08 18:24                 ` Stefan Monnier
  2021-03-09  8:10                   ` Concurrent GC via fork martin rudalics
  2021-03-10 20:35                 ` Opportunistic GC Pip Cet
  1 sibling, 1 reply; 61+ messages in thread
From: Stefan Monnier @ 2021-03-08 18:24 UTC (permalink / raw)
  To: martin rudalics; +Cc: eliz, Pip Cet, emacs-devel

>> I don't think it's going to matter in practice, though. GC, after all,
>> does not take that long,
> .. as long as we do not increase the heap size with the excuse that GC
> is now no more intrusive anyway ...

That same argument could be used to lower the `gc-cons-percentage` ;-)

> So you mean mark-and-sweep as opposed to copying.  In principle, a
> generational collector could be mark-and-sweep too.

Not just in principle.  E.g. it's common to use stop&copy when
collecting the nursery and mark&sweep when collecting the old space.


        Stefan




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Concurrent GC via `fork`
  2021-03-08 16:25     ` Pip Cet
  2021-03-08 17:37       ` Concurrent GC via `fork` Stefan Monnier
@ 2021-03-08 19:21       ` Lars Ingebrigtsen
  1 sibling, 0 replies; 61+ messages in thread
From: Lars Ingebrigtsen @ 2021-03-08 19:21 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, Stefan Monnier, emacs-devel

Pip Cet <pipcet@gmail.com> writes:

> I think that's the sensible approach, yes. I certainly wasn't hoping
> to hear anything _positive_ before submitting a patch.

Yes, positive feedback would be very out-of-character for this mailing
list.

I think it sounds like a very intriguing idea.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 16:44         ` Stefan Monnier
  2021-03-08 17:13           ` Eli Zaretskii
@ 2021-03-09  6:22           ` Richard Stallman
  2021-03-09  8:11             ` Pip Cet
  1 sibling, 1 reply; 61+ messages in thread
From: Richard Stallman @ 2021-03-09  6:22 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, emacs-devel, pipcet, akrl

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > I suspect we wouldn't want to use `fork` literally, but rather something
  > like Linux's `clone` so as to have more control about what is
  > really shared.

I see a potential problem: maybe not all current or future supported
platforms have this feature.


-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)





^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Concurrent GC via fork
  2021-03-08 18:24                 ` Concurrent GC via fork (was: Opportunistic GC) Stefan Monnier
@ 2021-03-09  8:10                   ` martin rudalics
  0 siblings, 0 replies; 61+ messages in thread
From: martin rudalics @ 2021-03-09  8:10 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: eliz, Pip Cet, emacs-devel

 > Not just in principle.  E.g. it's common to use stop&copy when
 > collecting the nursery and mark&sweep when collecting the old space.

We have to leave the nursery in place anyway.  But we should be able to
compactify the geriatrics once in a while to avoid thrashing.

martin



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-09  6:22           ` Richard Stallman
@ 2021-03-09  8:11             ` Pip Cet
  2021-03-10  5:52               ` Richard Stallman
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-09  8:11 UTC (permalink / raw)
  To: rms; +Cc: eliz, emacs-devel, Stefan Monnier, Andrea Corallo

On Tue, Mar 9, 2021 at 6:22 AM Richard Stallman <rms@gnu.org> wrote:
>
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > I suspect we wouldn't want to use `fork` literally, but rather something
>   > like Linux's `clone` so as to have more control about what is
>   > really shared.
>
> I see a potential problem: maybe not all current or future supported
> platforms have this feature.

Do you mean `fork' or `clone'? If it's the latter, we can just use
`fork' and miss out on a very slight optimization. If you think there
are platforms which don't support an efficient `fork'  (WebAssembly
doesn't; you have to make a full eager copy of the heap yourself) but
where we want to do the `fork' trick anyway, I don't see how. I think
those platforms will just have to continue using synchronous GC.
Which, of course, is not going away: this feature is useful in some
situations, I hope, but it should by no means be mandatory.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 14:57     ` Pip Cet
  2021-03-08 15:16       ` Eli Zaretskii
@ 2021-03-09 13:01       ` Eli Zaretskii
  2021-03-10 20:21         ` Pip Cet
  1 sibling, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-09 13:01 UTC (permalink / raw)
  To: Pip Cet; +Cc: monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Mon, 8 Mar 2021 14:57:06 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> > What happens when, on a 8GB machine, we have an Emacs session with a
> > 5GB memory footprint, and you fork and start marking (which triggers
> > copy-on-write)?
> 
> That is, indeed, one of the cases in which we have to use the
> synchronous GC or suffer the consequences.

How do we know if/when we are in that case?  Because if we err, we
have a meeting with the OOM killer and a swift death.

> > Also, I quite frequently need to run Emacs on a system where I'm
> > forbidden to run more than 2 processes simultaneously ("make -j3"
> > aborts with an error message), and you propose to take those two
> > processes with 2 copies of Emacs?
> 
> That is, indeed, one of the cases in which we have to use synchronous GC.

Again, how do we know we are in that case?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-09  8:11             ` Pip Cet
@ 2021-03-10  5:52               ` Richard Stallman
  0 siblings, 0 replies; 61+ messages in thread
From: Richard Stallman @ 2021-03-10  5:52 UTC (permalink / raw)
  To: Pip Cet; +Cc: eliz, emacs-devel, monnier, akrl

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > I see a potential problem: maybe not all current or future supported
  > > platforms have this feature.

  > Do you mean `fork' or `clone'? If it's the latter, we can just use
  > `fork' and miss out on a very slight optimization.

I say this in reference to `clone'.  I was not sure what problem there
would be if we made Emacs use `clone' and the system's kernel does not
implement it.  The point is that people should not neglect to consider
what will happen on such systems.

-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  7:20 ` Pip Cet
                     ` (3 preceding siblings ...)
  2021-03-08 16:03   ` Concurrent GC via `fork` (was: Opportunistic GC) Stefan Monnier
@ 2021-03-10 17:18   ` Matt Armstrong
  2021-03-10 19:38     ` Pip Cet
  2022-07-05  1:47   ` Stefan Monnier
  5 siblings, 1 reply; 61+ messages in thread
From: Matt Armstrong @ 2021-03-10 17:18 UTC (permalink / raw)
  To: Pip Cet, Stefan Monnier, eliz; +Cc: emacs-devel

Pip Cet <pipcet@gmail.com> writes:

> On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> I've been running with the code below recently to try and see if
>> opportunistic GC can be worth the trouble.
>
> Just a random idea: What if we exploit the fact most people have more
> than one CPU core these days, garbage-collect in a separate fork()ed
> process, then do only the sweeping in the main process?
>
> Immediate advantage: we'd never mark objects in the "real" Emacs
> process, making debugging easier.
>
> My implementation proposal would be to pipe(), fork(), run GC, then
> send through the pipe the Lisp_Objects which can be collected by the
> original Emacs.
>
> For me, it was a bit difficult to see that this would indeed be safe,
> but I'm now pretty convinced it would be: objects that are unreachable
> in the child Emacs cannot become reachable in the parent Emacs (they
> might show up on the stack, but that's a false positive).
>
> We'd have to be careful not to run two "real" GC processes at the same
> time.

Hi Pip,

Neat idea.

Are there examples of other GC implementations using this approach?
Perhaps this has been tried before, elsewhere, and we could learn a lot
from that.

I looked for myself and found
https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/
which talk about this idea in the context of the D run time.  This page
mentions a different idea: doing the mark pass with multiple threads.
This might be worth exploring; the two ideas are composable with
each other.

Matt



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-10 17:18   ` Opportunistic GC Matt Armstrong
@ 2021-03-10 19:38     ` Pip Cet
  0 siblings, 0 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-10 19:38 UTC (permalink / raw)
  To: Matt Armstrong; +Cc: eliz, Stefan Monnier, emacs-devel

On Wed, Mar 10, 2021 at 5:18 PM Matt Armstrong <matt@rfc20.org> wrote:
> Pip Cet <pipcet@gmail.com> writes:
> > On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> >> I've been running with the code below recently to try and see if
> >> opportunistic GC can be worth the trouble.
> >
> > Just a random idea: What if we exploit the fact most people have more
> > than one CPU core these days, garbage-collect in a separate fork()ed
> > process, then do only the sweeping in the main process?
> >
> > Immediate advantage: we'd never mark objects in the "real" Emacs
> > process, making debugging easier.
> >
> > My implementation proposal would be to pipe(), fork(), run GC, then
> > send through the pipe the Lisp_Objects which can be collected by the
> > original Emacs.
> >
> > For me, it was a bit difficult to see that this would indeed be safe,
> > but I'm now pretty convinced it would be: objects that are unreachable
> > in the child Emacs cannot become reachable in the parent Emacs (they
> > might show up on the stack, but that's a false positive).
> >
> > We'd have to be careful not to run two "real" GC processes at the same
> > time.
>
> Hi Pip,
>
> Neat idea.

Thank you.

> Are there examples of other GC implementations using this approach?

I'm not aware of any, but that doesn't mean much.

> Perhaps this has been tried before, elsewhere, and we could learn a lot
> from that.

I'm certain it has.

> I looked for myself and found
> https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/
> which talk about this idea in the context of the D run time.  This page
> mentions a different idea: doing the mark pass with multiple threads.
> This might be worth exploring; the two ideas are composable with
> each other.

Absolutely. What they're not easily composable with is a copying
garbage collector, but Emacs might not require one of those as much as
other applications do.

Again, I think this is, most of all, cheap in terms of implementation
time. Not as cheap as I thought initially, I confess, but I'm willing
to argue that this might be a good thing: we need to be clearer about
what's GC and what isn't, and move other clean-up tasks out of the
garbage collector.

There's also an additional problem with running the GC "fully"
asynchronously, which I define as collecting only some of the objects
it returns as unreachable before starting the next cycle. I'm not sure
we ever really want to do that, but I thought it would be nice if we
could. The problem is that A and B may both be unreachable, A might
reference B, but only B might get collected, leaving a dangling
pointer in A. Then A might become maybe-reachable again, by being on
the stack as a false positive, and we end up seeing nasty corruption
bugs. This actually did happen to me while playing with this patch, so
at least I learned my "don't play with snakes" lesson.

That means we need to either assume that the forked thread will always
terminate properly, and abort otherwise, or we need to wait for it to
produce all of its output before starting to act on it. The latter is
problematic because GC should be about freeing memory, not allocating
more memory to begin with. The former is problematic because it makes
things like automated nice level adjustments for the worker process
much more difficult: instead of going "oh, we were being too nice,
kill the thread and start over", we'd have to un-nice it, probably to
a much higher priority, once we conclude it's taking too long.

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-09 13:01       ` Eli Zaretskii
@ 2021-03-10 20:21         ` Pip Cet
  2021-03-10 20:41           ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-10 20:21 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

On Tue, Mar 9, 2021 at 1:01 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Mon, 8 Mar 2021 14:57:06 +0000
> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> >
> > > What happens when, on a 8GB machine, we have an Emacs session with a
> > > 5GB memory footprint, and you fork and start marking (which triggers
> > > copy-on-write)?
> >
> > That is, indeed, one of the cases in which we have to use the
> > synchronous GC or suffer the consequences.
>
> How do we know if/when we are in that case?  Because if we err, we
> have a meeting with the OOM killer and a swift death.

Yes, but there's a good chance we would have had that anyway, since
it's "only" a factor of two, at most.

I see similarities to how we decide how many processes to use for
native compilation. The current rule in that case is "half the
system's reported number of CPUs", however that is determined.

I don't see why we can't do the same thing for memory: if there's a
chance the GC process plus the current Emacs process exceed half of
the lowest of
1. system memory
2. ulimit memory
3. a sensible upper limit,

we use synchronous GC by default. Of course this should be
customizable because users know best.

If you're arguing that there is a strong case that synchronous GC
should remain the default, I'm inclined to agree, because we haven't
even tried the other way or gained any experience with it so far. But
that does not mean it has no application whatsoever. I would gladly
trade doubled memory consumption for reliable, reliably interruptible,
zero-latency GC, particularly if it means never having to mask out
mark bits before inspecting objects after a crash again :-)

> > > Also, I quite frequently need to run Emacs on a system where I'm
> > > forbidden to run more than 2 processes simultaneously ("make -j3"
> > > aborts with an error message), and you propose to take those two
> > > processes with 2 copies of Emacs?
> >
> > That is, indeed, one of the cases in which we have to use synchronous GC.
>
> Again, how do we know we are in that case?

I assume you don't have a reliable getrlimit that lets you know?

Pip

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 17:13           ` Eli Zaretskii
@ 2021-03-10 20:25             ` Pip Cet
  2021-03-10 20:48               ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-10 20:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Stefan Monnier, Andrea Corallo

On Mon, Mar 8, 2021 at 5:13 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Pip Cet <pipcet@gmail.com>,  eliz@gnu.org,  emacs-devel@gnu.org
> > Date: Mon, 08 Mar 2021 11:44:05 -0500
> >
> > > I also expect fork will work, I'm more wondering about interaction with
> > > sockets file descriptors or anything else in our codebase and/or in
> > > libraries we have loaded.
> >
> > I suspect we wouldn't want to use `fork` literally, but rather something
> > like Linux's `clone` so as to have more control about what is
> > really shared.  Basically, we just want to spawn a thread, except it runs
> > in a snapshot of the heap.
>
> There are other potential complications with this, related to
> threads.  Posix says 'fork' only copies a single thread, the one that
> called it, but what if that single thread is not the main thread?

Why would that be a problem? I think everything would work just fine
in that case, but maybe I'm missing a problem that should be obvious
to me.

> In
> Emacs nowadays a non-main Lisp thread can trigger GC.  What happens to
> the mutexes we use in the threads machinery when we fork like that?

In the child, they're irrelevant, because we'll never return from the
function that triggered GC, or allow signal handlers to run; in the
parent, they remain valid. Everything should be fine...

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08 18:06               ` martin rudalics
  2021-03-08 18:24                 ` Concurrent GC via fork (was: Opportunistic GC) Stefan Monnier
@ 2021-03-10 20:35                 ` Pip Cet
  1 sibling, 0 replies; 61+ messages in thread
From: Pip Cet @ 2021-03-10 20:35 UTC (permalink / raw)
  To: martin rudalics; +Cc: eliz, Stefan Monnier, emacs-devel

On Mon, Mar 8, 2021 at 6:06 PM martin rudalics <rudalics@gmx.at> wrote:
>  > I don't think it's going to matter in practice, though. GC, after all,
>  > does not take that long,
>
> .. as long as we do not increase the heap size with the excuse that GC
> is now no more intrusive anyway ...

You're worried about this patch making it less likely someone will
actually improve the garbage collector? So am I, actually...

>  > and the retained working set of most Lisp
>  > programs is small, memory is cheap, and it is a factor of two in the
>  > absolute worst conceivable case.
>
> We don't have to reserve that space and closing the fork automatically
> returns it to the OS?  I have no idea how copy-on-write is implemented.

The space might be counted towards quota limits, but it won't actually
be backed by real memory until it's used. Yes, when the child
terminates, the memory will be freed.

>  >> Each such write causes its associated page getting written
>  >> out to avoid that an old link gets lost.  And if the action was to drop
>  >> an old link, writing the page out doesn't even make sense.
>  >
>  > I'm not sure what you mean by "drop an old link".
>
> Setting a car or cdr to nil (as opposed to setting a car or cdr to some
> other cons cell).  In this case, the copy on write will trigger a fault
> on that object's page to make sure that the previous reference gets
> traced by the collector.  Usually, concurrent collectors do not care.

Okay, I think I understand now. You're absolutely correct, that's
something the simplistic (but fast) kernel copy-on-write machinery
cannot handle.

> We never collected any evidence on how long Elisp objects live.  So it's
> not clear how much we could gain from promoting long-lived objects.

Ah, but let me try to turn around that argument: if you wanted to
collect such data, my fork-to-GC proposal would be precisely how I'd
do it :-)

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-10 20:21         ` Pip Cet
@ 2021-03-10 20:41           ` Eli Zaretskii
  2021-03-11  8:06             ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-10 20:41 UTC (permalink / raw)
  To: Pip Cet; +Cc: monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Wed, 10 Mar 2021 20:21:16 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> > > > What happens when, on a 8GB machine, we have an Emacs session with a
> > > > 5GB memory footprint, and you fork and start marking (which triggers
> > > > copy-on-write)?
> > >
> > > That is, indeed, one of the cases in which we have to use the
> > > synchronous GC or suffer the consequences.
> >
> > How do we know if/when we are in that case?  Because if we err, we
> > have a meeting with the OOM killer and a swift death.
> 
> Yes, but there's a good chance we would have had that anyway, since
> it's "only" a factor of two, at most.

I don't see how that helps to get out of the conundrum.  The
synchronous GC doesn't enlarge the memory footprint of the Emacs
process, so if we had enough memory when we started GC, there are good
chances we will have it throughout the GC.  By contrast, when you
fork, you enlarge the memory pressure in the system, and the amount of
the increment is unknown in advance (because it's determined by the
degree of nesting of the Lisp data you have in the session at that
moment).  You could enlarge the memory enough to cause OOM killer to
kick in.

So it's important to know when not to attempt the fork.  What is the
algorithm for that?  If you want to use the fraction of the memory the
Emacs process takes, and set the threshold at 50%, then we now need to
discuss how to know that fraction reliably, what with some of the
memory coming from sbrk and some from mmap -- this is why we all but
eliminated memory-limit as not being very useful, and you now want to
use the same technique to base a critical decision.

> If you're arguing that there is a strong case that synchronous GC
> should remain the default, I'm inclined to agree

No, I'm not arguing that yet, not until I see the implementation.  I'm
just thinking aloud about some pesky problems we'd need to solve on
the way.

> > > > Also, I quite frequently need to run Emacs on a system where I'm
> > > > forbidden to run more than 2 processes simultaneously ("make -j3"
> > > > aborts with an error message), and you propose to take those two
> > > > processes with 2 copies of Emacs?
> > >
> > > That is, indeed, one of the cases in which we have to use synchronous GC.
> >
> > Again, how do we know we are in that case?
> 
> I assume you don't have a reliable getrlimit that lets you know?

I don't know -- does getrlimit always report this stuff, with all the
virtualization that goes on nowadays?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-10 20:25             ` Pip Cet
@ 2021-03-10 20:48               ` Eli Zaretskii
  2021-03-11  7:55                 ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-10 20:48 UTC (permalink / raw)
  To: Pip Cet; +Cc: emacs-devel, monnier, akrl

> From: Pip Cet <pipcet@gmail.com>
> Date: Wed, 10 Mar 2021 20:25:04 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> 
> > There are other potential complications with this, related to
> > threads.  Posix says 'fork' only copies a single thread, the one that
> > called it, but what if that single thread is not the main thread?
> 
> Why would that be a problem?

Well, I mentioned one aspect that should be considered below.  There
are probably others, as we never intended for one of these threads to
be left alone.  E.g., what about error handling? non-main threads have
a very simplistic one.

> > In
> > Emacs nowadays a non-main Lisp thread can trigger GC.  What happens to
> > the mutexes we use in the threads machinery when we fork like that?
> 
> In the child, they're irrelevant

They are?  Don't you intend to send the results to the parent process?
or read some stuff from the parent?  Doesn't that mean you'd need to
call pselect?

> Everything should be fine...

Famous last words ;-)



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-10 20:48               ` Eli Zaretskii
@ 2021-03-11  7:55                 ` Pip Cet
  2021-03-11  8:37                   ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-11  7:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Stefan Monnier, Andrea Corallo

On Wed, Mar 10, 2021 at 8:48 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Wed, 10 Mar 2021 20:25:04 +0000
> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> >
> > > There are other potential complications with this, related to
> > > threads.  Posix says 'fork' only copies a single thread, the one that
> > > called it, but what if that single thread is not the main thread?
> >
> > Why would that be a problem?
>
> Well, I mentioned one aspect that should be considered below.  There
> are probably others, as we never intended for one of these threads to
> be left alone.  E.g., what about error handling? non-main threads have
> a very simplistic one.

GC doesn't throw, though, and we're in a non-returning function.

> > > In
> > > Emacs nowadays a non-main Lisp thread can trigger GC.  What happens to
> > > the mutexes we use in the threads machinery when we fork like that?
> >
> > In the child, they're irrelevant
>
> They are?  Don't you intend to send the results to the parent process?

I do intend to do that. I'm currently using write synchronously, and I
don't see a problem with that.

> or read some stuff from the parent?

Not right now, no.

> Doesn't that mean you'd need to
> call pselect?

Not at present, no. But even if I did, how would that make mutexes relevant?

> > Everything should be fine...
> Famous last words ;-)

Precisely.

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-10 20:41           ` Eli Zaretskii
@ 2021-03-11  8:06             ` Pip Cet
  2021-03-11  9:03               ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-11  8:06 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, emacs-devel

On Wed, Mar 10, 2021 at 8:41 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Wed, 10 Mar 2021 20:21:16 +0000
> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> >
> > > > > What happens when, on a 8GB machine, we have an Emacs session with a
> > > > > 5GB memory footprint, and you fork and start marking (which triggers
> > > > > copy-on-write)?
> > > >
> > > > That is, indeed, one of the cases in which we have to use the
> > > > synchronous GC or suffer the consequences.
> > >
> > > How do we know if/when we are in that case?  Because if we err, we
> > > have a meeting with the OOM killer and a swift death.
> >
> > Yes, but there's a good chance we would have had that anyway, since
> > it's "only" a factor of two, at most.
>
> I don't see how that helps to get out of the conundrum.  The
> synchronous GC doesn't enlarge the memory footprint of the Emacs
> process, so if we had enough memory when we started GC, there are good
> chances we will have it throughout the GC.  By contrast, when you
> fork, you enlarge the memory pressure in the system, and the amount of
> the increment is unknown in advance (because it's determined by the
> degree of nesting of the Lisp data you have in the session at that
> moment).  You could enlarge the memory enough to cause OOM killer to
> kick in.

All of these problems apply to all forks, not just GC ones.

> So it's important to know when not to attempt the fork.

I don't think "normal" GNU/Linux systems should ever be in that
situation, since they have virtual memory. Exotic systems may be, but
that's a system dependency which, as you correctly say, cannot be
easily resolved by generic code.

And, of course, it's possible even on GNU/Linux system for the fork()
not to succeed, which we need to handle.

> What is the
> algorithm for that?  If you want to use the fraction of the memory the
> Emacs process takes, and set the threshold at 50%, then we now need to
> discuss how to know that fraction reliably, what with some of the
> memory coming from sbrk and some from mmap -- this is why we all but
> eliminated memory-limit as not being very useful, and you now want to
> use the same technique to base a critical decision.

Ideally, we'd use the same (broken) technique as the (broken) OS uses
to determine whether to kill us.

> > If you're arguing that there is a strong case that synchronous GC
> > should remain the default, I'm inclined to agree
>
> No, I'm not arguing that yet, not until I see the implementation.  I'm
> just thinking aloud about some pesky problems we'd need to solve on
> the way.

Thanks for explaining.

I don't think we can answer, at this point, the question of what the
best circumstances are for enabling a feature that I haven't even
shown a prototype patch for (turns out sigblock is no longer
recommended! Who'd have known?)

It's possible the best answer is "only when the user tells us to
fork", but that's okay. Many features are disabled by default, and I
don't see why that shouldn't be true for a GC trick.

> > > > > Also, I quite frequently need to run Emacs on a system where I'm
> > > > > forbidden to run more than 2 processes simultaneously ("make -j3"
> > > > > aborts with an error message), and you propose to take those two
> > > > > processes with 2 copies of Emacs?
> > > >
> > > > That is, indeed, one of the cases in which we have to use synchronous GC.
> > >
> > > Again, how do we know we are in that case?
> >
> > I assume you don't have a reliable getrlimit that lets you know?
>
> I don't know -- does getrlimit always report this stuff, with all the
> virtualization that goes on nowadays?

It should, so it probably doesn't.

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-11  7:55                 ` Pip Cet
@ 2021-03-11  8:37                   ` Eli Zaretskii
  2021-03-11  9:03                     ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-11  8:37 UTC (permalink / raw)
  To: Pip Cet; +Cc: emacs-devel, monnier, akrl

> From: Pip Cet <pipcet@gmail.com>
> Date: Thu, 11 Mar 2021 07:55:30 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> 
> > Doesn't that mean you'd need to
> > call pselect?
> 
> Not at present, no. But even if I did, how would that make mutexes relevant?

See thread_select.  We don't call pselect directly.




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-11  8:37                   ` Eli Zaretskii
@ 2021-03-11  9:03                     ` Pip Cet
  2021-03-11  9:34                       ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-11  9:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Stefan Monnier, Andrea Corallo

On Thu, Mar 11, 2021 at 8:37 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Thu, 11 Mar 2021 07:55:30 +0000
> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> >
> > > Doesn't that mean you'd need to
> > > call pselect?
> >
> > Not at present, no. But even if I did, how would that make mutexes relevant?
>
> See thread_select.

I have no intention of calling thread_select.

> We don't call pselect directly.

In this case, we should. I don't see a problem with that, as the child
Emacs does not need to perform other I/O.

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-11  8:06             ` Pip Cet
@ 2021-03-11  9:03               ` Eli Zaretskii
  0 siblings, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-11  9:03 UTC (permalink / raw)
  To: Pip Cet; +Cc: monnier, emacs-devel

> From: Pip Cet <pipcet@gmail.com>
> Date: Thu, 11 Mar 2021 08:06:50 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
> 
> > I don't see how that helps to get out of the conundrum.  The
> > synchronous GC doesn't enlarge the memory footprint of the Emacs
> > process, so if we had enough memory when we started GC, there are good
> > chances we will have it throughout the GC.  By contrast, when you
> > fork, you enlarge the memory pressure in the system, and the amount of
> > the increment is unknown in advance (because it's determined by the
> > degree of nesting of the Lisp data you have in the session at that
> > moment).  You could enlarge the memory enough to cause OOM killer to
> > kick in.
> 
> All of these problems apply to all forks, not just GC ones.

You mean, when we start a sub-process?  In that case the memory of the
Emacs process is not cloned, because we immediately call 'exec'.  But
in the case of forking and running GC, some part of the Emacs memory
will have to be cloned.  Which part is that is hard to predict in
advance; it could be a large part.

> > So it's important to know when not to attempt the fork.
> 
> I don't think "normal" GNU/Linux systems should ever be in that
> situation, since they have virtual memory.

I don't think I follow.  Virtual memory can be exhausted as well,
right?  That's the danger I'm talking about.

> And, of course, it's possible even on GNU/Linux system for the fork()
> not to succeed, which we need to handle.

The problem is with the scenario where the trouble begins long after
the 'fork' call succeeded.  Or are you saying that the OS will fail
the 'fork' call right when we call it, and if it doesn't, there's no
chance the forked child will come close to exhausting VM?

> > What is the
> > algorithm for that?  If you want to use the fraction of the memory the
> > Emacs process takes, and set the threshold at 50%, then we now need to
> > discuss how to know that fraction reliably, what with some of the
> > memory coming from sbrk and some from mmap -- this is why we all but
> > eliminated memory-limit as not being very useful, and you now want to
> > use the same technique to base a critical decision.
> 
> Ideally, we'd use the same (broken) technique as the (broken) OS uses
> to determine whether to kill us.

Is that possible?  Does the OS expose enough APIs to glean this info?
If so, perhaps we could use that to resurrect the usefulness of
memory-limit?



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-11  9:03                     ` Pip Cet
@ 2021-03-11  9:34                       ` Eli Zaretskii
  2021-03-11 10:11                         ` Pip Cet
  0 siblings, 1 reply; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-11  9:34 UTC (permalink / raw)
  To: Pip Cet; +Cc: emacs-devel, monnier, akrl

> From: Pip Cet <pipcet@gmail.com>
> Date: Thu, 11 Mar 2021 09:03:23 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> 
> > See thread_select.
> 
> I have no intention of calling thread_select.

If you call wait_reading_process_output, or any of its callers (such
as accept-process-output or sleep-for), Emacs _will_ call
thread_select.

> > We don't call pselect directly.
> 
> In this case, we should. I don't see a problem with that, as the child
> Emacs does not need to perform other I/O.

If you want to do that for any of the above-mentioned calls, you will
have to modify the code of wait_reading_process_output.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-11  9:34                       ` Eli Zaretskii
@ 2021-03-11 10:11                         ` Pip Cet
  2021-03-11 11:35                           ` Eli Zaretskii
  0 siblings, 1 reply; 61+ messages in thread
From: Pip Cet @ 2021-03-11 10:11 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Stefan Monnier, Andrea Corallo

On Thu, Mar 11, 2021 at 9:34 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Thu, 11 Mar 2021 09:03:23 +0000
> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> >
> > > See thread_select.
> >
> > I have no intention of calling thread_select.
>
> If you call wait_reading_process_output, or any of its callers (such
> as accept-process-output or sleep-for), Emacs _will_ call
> thread_select.

Why would I do any of those things in the child process?

Note that they'll continue working fine in the parent process, which
is unaffected by the fork.

> > > We don't call pselect directly.
> >
> > In this case, we should. I don't see a problem with that, as the child
> > Emacs does not need to perform other I/O.
>
> If you want to do that for any of the above-mentioned calls, you will
> have to modify the code of wait_reading_process_output.

The child process's sole job is to inspect the heap, perform GC, and
return collectable objects to the parent via a pipe. It does not need
to call Lisp. It certainly shouldn't call thread_select, or
wait_reading_process_output.

What am I missing?

Pip



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-11 10:11                         ` Pip Cet
@ 2021-03-11 11:35                           ` Eli Zaretskii
  0 siblings, 0 replies; 61+ messages in thread
From: Eli Zaretskii @ 2021-03-11 11:35 UTC (permalink / raw)
  To: Pip Cet; +Cc: emacs-devel, monnier, akrl

> From: Pip Cet <pipcet@gmail.com>
> Date: Thu, 11 Mar 2021 10:11:41 +0000
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, Andrea Corallo <akrl@sdf.org>, emacs-devel@gnu.org
> 
> > If you call wait_reading_process_output, or any of its callers (such
> > as accept-process-output or sleep-for), Emacs _will_ call
> > thread_select.
> 
> Why would I do any of those things in the child process?

To communicate with the parent.  I don't know if you need to do that,
but just in case.

> The child process's sole job is to inspect the heap, perform GC, and
> return collectable objects to the parent via a pipe. It does not need
> to call Lisp. It certainly shouldn't call thread_select, or
> wait_reading_process_output.
> 
> What am I missing?

I'm just giving you heads-up for stuff you may wish doing.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2021-03-08  7:20 ` Pip Cet
                     ` (4 preceding siblings ...)
  2021-03-10 17:18   ` Opportunistic GC Matt Armstrong
@ 2022-07-05  1:47   ` Stefan Monnier
  2022-07-05 13:49     ` T.V Raman
  5 siblings, 1 reply; 61+ messages in thread
From: Stefan Monnier @ 2022-07-05  1:47 UTC (permalink / raw)
  To: emacs-devel

A bit more than a year ago I wrote:
> So, while these numbers don't look completely hopeless, they seem to
> indicate that opportunistic GCs aren't very useful at least with the
> tuning parameters I used.  To be more useful, maybe we'd need to
> increase the non-opportunistic `gc-cons-percentage` and/or reduce the
> idle-time before opportunistic GC (see "magic formula" below).

I've recently been playing with tweaks to that code, basically setting
`gc-cons-percentage` to 0.5 and then calling
(garbage-collect-maybe 10) from the timer.

IOW multiplying by about 5 the amount of memory we can allocate within
a command before we call the GC, but in return running an opportunistic GC as soon as there's more than about half as much allocation as the normal threshold (0.5 during the command and 0.05 for the opportunistic threshold compared to 0.1 all the time in the default settings).

The results are a lot more promising:

    ((jit 3 0 0) (cmd 7 3 8) (noncmd 23 3 10)
     (earlier-gcs . 79)
     (commands . 46027)
     (opportunistic-gcs . 576))

IOW, the vast majority of GCs are now opportunistic and very few commands
or jit-locks spent any time in the GC (I suspect most of the `noncmd`
GCs happened during auto-revert).

The higher `gc-cons-percentage` during commands should tend to increase
fragmentation a bit in the long run, but the lower `gc-cons-percentage`
used for the opportunistic GCs should have the opposite effect, so I'd
expect the overall long term health of the heap to be fairly similar.

        Stefan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Opportunistic GC
  2022-07-05  1:47   ` Stefan Monnier
@ 2022-07-05 13:49     ` T.V Raman
  0 siblings, 0 replies; 61+ messages in thread
From: T.V Raman @ 2022-07-05 13:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=gb18030, Size: 414 bytes --]

Perhaps install something like this via a custom hook where the user can
experiment with various commands that are wrapped by this
setting/settings?

In my case for instance, I render many *large* html pages AKA EBooks
with EWW/SHR, and wrapping shr/eww in my case looks like it could
definitely help.

-- 

Thanks,

--Raman(I Search, I Find, I Misplace, I Research)
7©4 Id: kg:/m/0285kf1  •0Ü8

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2022-07-05 13:49 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-08  3:35 Opportunistic GC Stefan Monnier
2021-03-08  7:20 ` Pip Cet
2021-03-08  8:26   ` martin rudalics
2021-03-08  9:02     ` Pip Cet
2021-03-08  9:51       ` martin rudalics
2021-03-08 10:44         ` Pip Cet
2021-03-08 11:37           ` martin rudalics
2021-03-08 12:27             ` Pip Cet
2021-03-08 18:06               ` martin rudalics
2021-03-08 18:24                 ` Concurrent GC via fork (was: Opportunistic GC) Stefan Monnier
2021-03-09  8:10                   ` Concurrent GC via fork martin rudalics
2021-03-10 20:35                 ` Opportunistic GC Pip Cet
2021-03-08 11:38           ` Stefan Kangas
2021-03-08 12:55             ` Pip Cet
2021-03-08 14:06             ` Andrea Corallo via Emacs development discussions.
2021-03-08 14:49           ` Eli Zaretskii
2021-03-08 15:02             ` Pip Cet
2021-03-08 15:23               ` Eli Zaretskii
2021-03-08 15:39                 ` Pip Cet
2021-03-08 16:27                   ` Eli Zaretskii
2021-03-08 16:35                     ` Pip Cet
2021-03-08 17:06                       ` Eli Zaretskii
2021-03-08 15:46               ` Stefan Monnier
2021-03-08 16:11                 ` Pip Cet
2021-03-08 16:37                   ` Stefan Monnier
2021-03-08 14:01   ` Andrea Corallo via Emacs development discussions.
2021-03-08 15:04     ` Pip Cet
2021-03-08 15:50       ` Stefan Monnier
2021-03-08 15:52       ` Andrea Corallo via Emacs development discussions.
2021-03-08 16:14         ` Philipp Stephani
2021-03-08 16:18           ` Andrea Corallo via Emacs development discussions.
2021-03-08 16:40             ` Philipp Stephani
2021-03-08 16:44         ` Stefan Monnier
2021-03-08 17:13           ` Eli Zaretskii
2021-03-10 20:25             ` Pip Cet
2021-03-10 20:48               ` Eli Zaretskii
2021-03-11  7:55                 ` Pip Cet
2021-03-11  8:37                   ` Eli Zaretskii
2021-03-11  9:03                     ` Pip Cet
2021-03-11  9:34                       ` Eli Zaretskii
2021-03-11 10:11                         ` Pip Cet
2021-03-11 11:35                           ` Eli Zaretskii
2021-03-09  6:22           ` Richard Stallman
2021-03-09  8:11             ` Pip Cet
2021-03-10  5:52               ` Richard Stallman
2021-03-08 14:45   ` Eli Zaretskii
2021-03-08 14:57     ` Pip Cet
2021-03-08 15:16       ` Eli Zaretskii
2021-03-09 13:01       ` Eli Zaretskii
2021-03-10 20:21         ` Pip Cet
2021-03-10 20:41           ` Eli Zaretskii
2021-03-11  8:06             ` Pip Cet
2021-03-11  9:03               ` Eli Zaretskii
2021-03-08 16:03   ` Concurrent GC via `fork` (was: Opportunistic GC) Stefan Monnier
2021-03-08 16:25     ` Pip Cet
2021-03-08 17:37       ` Concurrent GC via `fork` Stefan Monnier
2021-03-08 19:21       ` Lars Ingebrigtsen
2021-03-10 17:18   ` Opportunistic GC Matt Armstrong
2021-03-10 19:38     ` Pip Cet
2022-07-05  1:47   ` Stefan Monnier
2022-07-05 13:49     ` T.V Raman

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).