From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Lynn Winebarger Newsgroups: gmane.emacs.devel Subject: Re: GC mark stack Date: Mon, 15 Aug 2022 21:53:27 -0400 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="20176"; mail-complaints-to="usenet@ciao.gmane.io" Cc: emacs-devel To: Stefan Monnier Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Tue Aug 16 03:54:21 2022 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1oNlmf-00051Y-4E for ged-emacs-devel@m.gmane-mx.org; Tue, 16 Aug 2022 03:54:21 +0200 Original-Received: from localhost ([::1]:38502 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oNlmd-00064m-If for ged-emacs-devel@m.gmane-mx.org; Mon, 15 Aug 2022 21:54:19 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:45638) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oNlm3-0005PA-1j for emacs-devel@gnu.org; Mon, 15 Aug 2022 21:53:43 -0400 Original-Received: from mail-pf1-x432.google.com ([2607:f8b0:4864:20::432]:36789) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1oNlm1-0006Rs-9g for emacs-devel@gnu.org; Mon, 15 Aug 2022 21:53:42 -0400 Original-Received: by mail-pf1-x432.google.com with SMTP id a22so7548327pfg.3 for ; Mon, 15 Aug 2022 18:53:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc; bh=ZOxhC6wLWJAhtk6V1cXM2hlbu7W1skex49F11BFcq4k=; b=mtoBQi/RdFbgwBPWedG2L7ZIuzNFAvZ4IrOtXOAA41q/xUGgVq9WVePk9lHWCUgQLd KFI4hME5gADN7IZXjjFSO97d38ME2MNeufId7CQaBdXSadz2Fd/nMqEf7wdc2S6HeK0d zsnZIiltiCwencofoH50/xwhLyL9K8DjQ6nm/8hyPU1OJrG4nj4nDDXeYXDM6MjcOnbC XLk+3ldlFuUdksoGNdviXZBznYmUjY5FlU7csofRkgMwFxd7oPCbn2ecK4UFMIabDjXf xZG08JRuK8YoYh0aG5wLu9vF8L/2pbxkgTnH6umB97v4JAoHii4mHxyYJWGtK0XBk01P uRig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc; bh=ZOxhC6wLWJAhtk6V1cXM2hlbu7W1skex49F11BFcq4k=; b=IZG0mZ3fmPpF+9Yyd5A5n4yXpYvAlnhwlhdzjrrh0At0mKMrzol3iCPKO4fHnleBeR ORxQuQsJvt/zCAeBkAYUFuabJxSnMJXO+HDl9LATn6VLK0RlNrqu4og0NSgpiMmuyYOQ 4gm1lDvH2BxvU+CMPokfjt2YIOruKe4xdIVu2yt+uVfy5NpORB4eJAhaIF3VbksvZBR+ ugUeqeADwImoGO+XXTb7Nmgygpr2er54uFvQ9PYwPum4oeEQOjF81SZTR2Zcq/WrSxzL ftQOLoUpO6c6svKCJ9bflu8SCxCS6YW0d1m0q0nm2hhcMXdmM/lKZ9y0v90mgpLbHrCH cMow== X-Gm-Message-State: ACgBeo2ti3HVkxHPcSy5ilzHipF44Dw5mt0r/wPhsIAm5ruhEB6IShQn d+VIUJWs1scMfqT7+LI1dY/tLdchzTKo7GiuXPI4uJKBiaY= X-Google-Smtp-Source: AA6agR7V6Ildnc66JdFkUaYvxUnfEATKgOqGzdXe0RMmba93LEhA9zuyFA73VjbVTGzId4+Rhid5FrT/wFxapZtXNaM= X-Received: by 2002:aa7:86d4:0:b0:52e:fe71:77df with SMTP id h20-20020aa786d4000000b0052efe7177dfmr18563711pfo.35.1660614819853; Mon, 15 Aug 2022 18:53:39 -0700 (PDT) In-Reply-To: Received-SPF: pass client-ip=2607:f8b0:4864:20::432; envelope-from=owinebar@gmail.com; helo=mail-pf1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.io gmane.emacs.devel:293489 Archived-At: On Mon, Aug 15, 2022 at 2:17 PM Stefan Monnier w= rote: > > > If the loop can be expressed by a relational expression consisting of > > a concrete relation (like "references"), composition, union, and > > transitive closure, then there is a systematic way to construct an > > algorithm of the type I described. > > I think the core of the problem is the following: > > In a "key weak" hash table, if a reachable object points (transitively) t= o > an object that is a key in that table, then the value associated with > that key becomes reachable. So conceptually we have a reference (an > edge) from the key *object* to the value, but this reference is not > actually represented in the heap. Right, so the weak part(s) of the hash table entry essentially invert the strong reference relation graph - I wasn't clear on this part. As you surmised, explicit construction of the inverted relation (i.e. reifying the back-reference) is the most straightforward method for creating the graph to traverse with Tarjan's algorithm when inversion is added to the set of operators allowed in the relational expressions. I'm not 100% on the semantics of weak hash table reference types, so can you verify: * Key - key strongly-references entry, entry strongly-references value [ asserting the existence of 2 edges in the graph of the strong-references relation ] * Value - value strongly-references entry, entry strongly-references key * KeyOrValue - ( key strongly-references entry, entry strongly-references value ) OR ( value strongly-references entry, entry strongly-references key ) - maybe equivalent to introducing intermediate entries for each case: * (key strongly-references key-entry, key-entry strongly-references entry, key-entry strongly-references value), (value strongly-references value-entry, value-entry strongly-references entry, value-entry strongly-reference key) * KeyAndValue - (key, value) strongly-references entry, where (key, value) is a special node that is only live if both key and value are (or entry itself is such node) I'm really unsure about those last two. I'm also not sure what the right logic is if the hash function is for equal or eqv and not eq. Or, that matter, whether a fixnum or float as an eq key should always or never keep an entry live. I would also be concerned about adding costs to code that does not use weak references at all. The best I think you could do is add a simple bit flag to indicate whether an object has been used as the (or a) target of the weak part of a weak hash table entry, or the target of a weak reference in general. Then at least it's only a quick check in mark_object in general, with more involved lookup into the reified references if it has been set. I think there's probably room for such a bit in all types that would make sense as weak keys. The complexity is probably prohibitive for any savings, I don't know. I did a brief survey of the tree in correspondence with Mattias Engdeg=C3=A5rd for appearances of weak hash tables in the "core": * all_loaded_comp_units_h in src/comp.c * composition_hash_table in src/composite.c - comment is out of date, says it's weak, but then when actually created is not * advertised-signature-table in lisp/emacs-lisp/byte-run.el - used in byte compiler to detect obsolete calling convention * cl--generic-combined-method-memoization in lisp/emacs-lisp/cl-generic= .el * eieio--object-names in lisp/emacs-lisp/eieio.el * ert--test-buffers in lisp/emacs-lisp/ert-x.el * macro-exp--warned in lisp/emacs-lisp/macroexp.el * read-answer-map--memoize in lisp/emacs-lisp/map-ynp.el * pcase--memoize in lisp/emacs-lisp/pcase.el * undo-equiv-table in lisp/simple.el * add-to-ordered-list in lisp/subr.el uses a local variable ordering bound to a weak hash table What would be interesting on the first one is if, after purifying strings before dumping, the weak references from NCU values kept the filename strings used as keys alive, and those were also (possibly weak) keys in some other hash table. I have no idea if that's the case, it would just be a weird, probably unintended side effect of pure space. I'm not sure how many of the others are long-lived or potentially large enough to be a concern (with 2100+ loaded NCUs, the first one is neither short-lived nor particularly negligible in size) I'm not sure how you can determine whether it's an issue in practice. Lynn