unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Lynn Winebarger <owinebar@gmail.com>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: emacs-devel <emacs-devel@gnu.org>
Subject: Re: GC mark stack
Date: Mon, 15 Aug 2022 10:51:56 -0400	[thread overview]
Message-ID: <CAM=F=bBfDfA9mM8agTbWo+rFWQmYK6b=PWcBn3+3CifWTy5bBg@mail.gmail.com> (raw)
In-Reply-To: <jwvy1vp6269.fsf-monnier+emacs@gnu.org>

On Mon, Aug 15, 2022 at 10:02 AM Stefan Monnier
<monnier@iro.umontreal.ca> wrote:
>
> Lynn Winebarger [2022-06-26 11:31:43] wrote:
> > I was reviewing alloc.c in the 28.1 source, and noted that it uses a
> > semi-naive computation of the transitive closure of the graph of live
> > data structures (weak hash tables is where I see it).
> > Since the fix to https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54698
> > (commit 7a8798de95a57c8ff85f070075e0a0176b458578) moved to using an
> > explicit stack, I was curious if you'd considered using a variant of
> > Tarjan's SCC algorithm, such as the one described in
> > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.9019&rep=rep1&type=pdf
>
> I must admit that I don't understand what you're referring to.
> And seeing the lack of response by other people, I suspect I'm not alone.
>
> Are you talking about the code in `mark_and_sweep_weak_table_contents`?
> Maybe the `do .. while` loop?

That is exactly it.  Some of my characterization of the mark-sweep
algorithm (for non-weak references) in terms of SCC calculation was
incorrect - I don't really think about this day-to-day anymore.  But,
you can certainly calculate the SCCs of the "weakly-references *"
(Kleene * operation intended) relation while recursing the entries of
a weak hash table in one pass, then perform the union on the SCC
representatives, then mark all elements of the live SCCs, rather than
traversing the graph over and over.  The academic treatment I
referenced provides a refinement of the algorithm used by DeRemer and
Pennello to optimize the time spent calculating unions.  You can
preallocate the memory required for bookkeeping (for determining
traversal ordering, done with a stack in Tarjan's formulation) by
adding a field to each weak reference recording the traversal order,
which is then used to identify the SCC representative (assuming you do
the calculation on the graph of weak references, to determine which
are actually strong, before doing the proper marking pass).  Since the
algorithm records the traversal order for each element, rather than
using an explicit stack, comparing elements (to determine the SCC
representative of a particular node) is constant-time rather than the
linear search of the stack suggested in Tarjan's formulation.
Ultimately you know there are only 2 sets in the SCC graph of "root
strongly-references*" image (hence only require 1 bit for the SCC
representative of each node in the heap), but you need to determine
the SCCs of the "weakly-references*" relation as an intermediate step,
and that can't be reduced to "live or dead" a priori.

Whether it is worth the effort to do so is another question.

Apologies to the extent I'm still getting some of the details wrong,
because I'm deliberately not working it out until I know whether my
employer will try to assert ownership of the result.

Lynn



  reply	other threads:[~2022-08-15 14:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-26 15:31 GC mark stack Lynn Winebarger
2022-06-26 15:50 ` Lynn Winebarger
2022-08-15 14:01 ` Stefan Monnier
2022-08-15 14:51   ` Lynn Winebarger [this message]
2022-08-15 16:28     ` Stefan Monnier
2022-08-15 17:52       ` Lynn Winebarger
2022-08-15 18:17         ` Stefan Monnier
2022-08-16  1:53           ` Lynn Winebarger
2022-08-16  9:50             ` Lynn Winebarger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAM=F=bBfDfA9mM8agTbWo+rFWQmYK6b=PWcBn3+3CifWTy5bBg@mail.gmail.com' \
    --to=owinebar@gmail.com \
    --cc=emacs-devel@gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).