From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Lynn Winebarger Newsgroups: gmane.emacs.devel Subject: Re: GC mark stack Date: Mon, 15 Aug 2022 10:51:56 -0400 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="6190"; mail-complaints-to="usenet@ciao.gmane.io" Cc: emacs-devel To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Mon Aug 15 16:53:45 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oNbTN-0001U6-4D for ged-emacs-devel@m.gmane-mx.org; Mon, 15 Aug 2022 16:53:45 +0200 Original-Received: from localhost ([::1]:35136 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oNbTM-0000fG-89 for ged-emacs-devel@m.gmane-mx.org; Mon, 15 Aug 2022 10:53:44 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:43420) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oNbRu-0007Qm-1y for emacs-devel@gnu.org; Mon, 15 Aug 2022 10:52:16 -0400 Original-Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]:41520) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oNbRs-0000We-Ab for emacs-devel@gnu.org; Mon, 15 Aug 2022 10:52:13 -0400 Original-Received: by mail-pg1-x52d.google.com with SMTP id 202so6674730pgc.8 for ; Mon, 15 Aug 2022 07:52:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc; bh=SaeGay0bbXIjWk5ApTyCfwbUIjSU5ZkBlSV3hX729Es=; b=EfGNSRgD2tqa5LX7NBHrcZWfORbgFDZ33jeGuDORzlQ9lUT58EbrOEW62eb83Yehz3 eqkhGYohX1blymcqgfK0q+C9uZ2bUoO2Sv/LILRWinuGRrj5upb6Pw/7jHNMDkReOa4h YNZq6S24gV/vSJwkbFAQcvrp4qyP9evO3lziC4qDIff70+2t6WKHsvJwk0yRk2SdfBOq JwOxGgqOW4fBhORO1hPBRjyFLVnwKyLZ1NaeBDWiPrgRLTA/EvhMwPX9BdsUVl+nkrKQ QBFG+QDHrovTL/aghG2BZpcr+hi3G1HVPyIfs6S/3ZP9AyZtjw1phsPufWGWnhOWpXq3 0mJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc; bh=SaeGay0bbXIjWk5ApTyCfwbUIjSU5ZkBlSV3hX729Es=; b=i9iB7TKqUQpeqXdcIIQQIbrL+TgdlF7DnaGRoI64+YEqhsRquMZCdmAKdfIp71VsJf r19TdSZtZ1YNr8odbPNXfpmtWJJz3F7TEdrOMfyjKbRZzCvRVic8aK9TD00g/g77t+vG N/+J94oJLkbxgT3N4lAWh8vXWAxTqUZqIu2dXy4877AbFLGLkPaXa4pgNMWuAI0G0/Sq e+UMH57xS1zi1iGxrOsOMABPiyx0e7PnyyEz/3hJVOr+ZEI5idJm02mZmKJt3X4Qh4Lq 60yjZzacEUqDrCaHWwKNaJNB2LXLu1m3YFCtVq36AXAUsZjRATBxfcNwUI/bE9GcJjFv gUMA== X-Gm-Message-State: ACgBeo21VlwpLt886VNp2J+cSuzAquUuSGs2NScOBQb0g7/j5yegbVDJ dY5gqkkvhX904VDqIVthuQgd93CeuwhnOnzP86V6CwDzrWg= X-Google-Smtp-Source: AA6agR66VXlv4CTtiFKjU8D4CxpbiFDaRHhlEnZAmA1WF0VACINO8ZIUr9K6GwNMwRsEzq+KuU5SoSROnUXOc9jv2Oo= X-Received: by 2002:aa7:86d4:0:b0:52e:fe71:77df with SMTP id h20-20020aa786d4000000b0052efe7177dfmr16349396pfo.35.1660575130266; Mon, 15 Aug 2022 07:52:10 -0700 (PDT) In-Reply-To: Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=owinebar@gmail.com; helo=mail-pg1-x52d.google.com X-Spam_score_int: -8 X-Spam_score: -0.9 X-Spam_bar: / X-Spam_report: (-0.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URI_DOTEDU=1.246 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:293482 Archived-At: On Mon, Aug 15, 2022 at 10:02 AM Stefan Monnier wrote: > > Lynn Winebarger [2022-06-26 11:31:43] wrote: > > I was reviewing alloc.c in the 28.1 source, and noted that it uses a > > semi-naive computation of the transitive closure of the graph of live > > data structures (weak hash tables is where I see it). > > Since the fix to https://debbugs.gnu.org/cgi/bugreport.cgi?bug=54698 > > (commit 7a8798de95a57c8ff85f070075e0a0176b458578) moved to using an > > explicit stack, I was curious if you'd considered using a variant of > > Tarjan's SCC algorithm, such as the one described in > > http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.9019&rep=rep1&type=pdf > > I must admit that I don't understand what you're referring to. > And seeing the lack of response by other people, I suspect I'm not alone. > > Are you talking about the code in `mark_and_sweep_weak_table_contents`? > Maybe the `do .. while` loop? That is exactly it. Some of my characterization of the mark-sweep algorithm (for non-weak references) in terms of SCC calculation was incorrect - I don't really think about this day-to-day anymore. But, you can certainly calculate the SCCs of the "weakly-references *" (Kleene * operation intended) relation while recursing the entries of a weak hash table in one pass, then perform the union on the SCC representatives, then mark all elements of the live SCCs, rather than traversing the graph over and over. The academic treatment I referenced provides a refinement of the algorithm used by DeRemer and Pennello to optimize the time spent calculating unions. You can preallocate the memory required for bookkeeping (for determining traversal ordering, done with a stack in Tarjan's formulation) by adding a field to each weak reference recording the traversal order, which is then used to identify the SCC representative (assuming you do the calculation on the graph of weak references, to determine which are actually strong, before doing the proper marking pass). Since the algorithm records the traversal order for each element, rather than using an explicit stack, comparing elements (to determine the SCC representative of a particular node) is constant-time rather than the linear search of the stack suggested in Tarjan's formulation. Ultimately you know there are only 2 sets in the SCC graph of "root strongly-references*" image (hence only require 1 bit for the SCC representative of each node in the heap), but you need to determine the SCCs of the "weakly-references*" relation as an intermediate step, and that can't be reduced to "live or dead" a priori. Whether it is worth the effort to do so is another question. Apologies to the extent I'm still getting some of the details wrong, because I'm deliberately not working it out until I know whether my employer will try to assert ownership of the result. Lynn