From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pip Cet
Newsgroups: gmane.emacs.devel
Subject: Re: MPS: weak hash tables
Date: Wed, 03 Jul 2024 06:33:23 +0000
Message-ID: <1VNw6cPSIpKfxNRqQBpVleX2BDbQuUqwLQzo-C8N-_PRvNNLG3BnhbcWpUJkiJYnOogBvqRTcLApebjqdZel7CgXVx9T0CnPn6_go_AugDA=@protonmail.com>
References: <2syUQ04IbTWqDJjMfKSrtzWMWmFGq1GIOwSxv_r6BEyNDtk7ADADKjZk-90g9tSS9SKWppkiq6_zihUtsoE1spiopaOI6-v9inQrGxwMyCs=@protonmail.com> <87o77gvxz3.fsf@gmail.com>
To: Gerd Möllmann
Cc: Helmut Eller, Eli Zaretskii, Emacs Devel
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:321241

On Wednesday, July 3rd, 2024 at 06:11, Gerd Möllmann wrote:

> Helmut Eller eller.helmut@gmail.com writes:
> > On Tue, Jul 02 2024, Pip Cet wrote:
> >
> > > Also, there's the whole caution thing about weak objects containing
> > > only unaligned words or words pointing directly to a base object,
> > > which is only relevant on Unix/i386, IIRC.
> > > (MPS emulates instructions
> > > to simulate fine-grained barriers, which is a really cool idea; I'd
> > > still like an option to turn it off though...). That would mean we
> > > have to replace Lisp_Objects and use the ptr member of our union (and
> > > that's the reason I'm using fixnums rather than plain integers for the
> > > hash).
> >
> > Why do you think that the restriction only applies to 32-bit systems?
> > My interpretation of
> >
> > Section 7.4. Caution
> > ...
> > “Aligned pointer” means a word whose numeric value (that is, its value
> > when treated as an unsigned integer) is a multiple of the size of a
> > pointer. If you’re using a 64-bit architecture, that means that an
> > aligned pointer is a multiple of 8 and its bottom three bits are zero.
> > ...
> >
> > is that it applies to 64-bit machines as well.
>
>
> OTOH, when I see this in a bit broader context, namely
>
> 7.3
> ...
>
> Emulation of accesses to protected objects happens when all of the
> following are true:
>
> The object is a weak object allocated in an AWL pool.
>
> The MPS is running on Linux/IA-32 or Windows/IA-32. Extending this
> list to new (reasonable) operating systems should be tolerable (for
> example, macOS/IA-32). Extending this to new processor architectures
> requires more work.
>
> The processor instruction that is accessing the object is of a
> suitable simple form. The MPS doesn’t contain an emulator for all
> possible instructions that might access memory, so currently it only
> recognizes and emulates a simple MOV from memory to a register or
> vice-versa.
>
> Contact us if you need emulation of access to weak references for new
> operating systems, processor architectures, or memory access
> instructions.
>
> 7.4. Caution
>
> Because of the instruction emulation described in Protection faults
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> above, AWL places the following restriction on the format of objects
> allocated in it:
>
> Each slot in an object must either be a valid word-aligned reference,
> or else the bottom bits of the word must be non-zero so that it does
> not look like an aligned pointer.
>
> “Aligned pointer” means...
>
> I'd bet that these restrictions don't matter when emulation is not done,
> which is the case for 64 bit processors, not IA32 etc. And indeed, on my
> machine with arm64 splatting in the marker vectors works just fine.
>
> I can understand that Ravenbrook documents these restrictions, for
> future (paid) developments and so on, but, you know... implementing
> them gets pretty ugly pretty quickly. (And I wonder if the emulation
> brings enough to warrant the effort.)

Frankly, I wonder how they'd feel about a patch to make emulation optional on all architectures, so the restrictions would be optional as well. I'm playing with a qemu IA32 environment, but haven't been able to trigger the emulation code yet (even on IA32, only some very simple instructions are emulated).

I've run into another issue: finalization. MPS's take on that is rather unusual, in that an object can be "finalized" while weak references to it still exist (and destruction can be vetoed by the finalization code creating a new strong reference to it, IIUC). The upshot is that this code:

(setq bignum (1+ most-positive-fixnum))
(setq table (make-hash-table :test 'eq :weakness 'key))
(puthash bignum t table)
table
(setq bignum nil)
(setq values nil)
(igc--collect)
table

produces a table with a nonsensical/random bignum as key, because the memory has been freed and reused for something else.
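
As a rough illustration of that failure mode, here is a toy model in plain C. It is not MPS or Emacs code, and every name in it (toy_object, toy_weak_keys, toy_finalize, toy_count) is hypothetical. It assumes a table whose weak key slots keep pointing at an object after the object's finalizer has run, and it shows how retagging the finalized object as free lets an iteration-based count skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not MPS or Emacs API.  */
enum toy_type { TOY_FREE, TOY_BIGNUM };

struct toy_object
{
  enum toy_type type;
  int has_payload;   /* nonzero while the separately owned data is valid */
};

#define TOY_TABLE_SIZE 4

/* Weak key slots of a single table; NULL means "empty slot".  */
static struct toy_object *toy_weak_keys[TOY_TABLE_SIZE];

/* Finalize OBJ: its payload is released.  If SPLAT is nonzero, also
   retag the object as free so that later iteration can recognize it.  */
static void
toy_finalize (struct toy_object *obj, int splat)
{
  assert (obj != NULL);
  obj->has_payload = 0;
  if (splat)
    obj->type = TOY_FREE;
}

/* Count entries by walking the slots instead of keeping a counter,
   skipping empty slots and objects that were retagged as free.  */
static size_t
toy_count (void)
{
  size_t n = 0;
  for (size_t i = 0; i < TOY_TABLE_SIZE; i++)
    if (toy_weak_keys[i] != NULL && toy_weak_keys[i]->type != TOY_FREE)
      n++;
  return n;
}
```

With two keys stored, finalizing one without retagging leaves toy_count () at 2 even though that key's payload is gone, which is the stale-key situation from the example above. Finalizing the other with retagging drops the count to 1.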

I have a patch here which "splats" finalized pvecs so they become PVEC_FREE, ignores such objects during iteration, and gets rid of the "count" element, counting the elements on demand for (hash-table-count weak-table) instead. I'll install it after some more testing, unless a better solution occurs to someone.

> It would be nice if the ugliness could be encapsulated so that one
> doesn't have to see it all the time, as far as that is possible in C
> :-). Or conditionalized, maybe, because with Helmut's idea (which I find
> the right one), we're using an additional word for weak references.

How hard would it be to "just" add struct igc_headers to the remaining non-headered objects? I don't really want to reopen the "get rid of pure space" discussion, but that's probably the hard part?

Pip