On Sunday, June 30th, 2024 at 11:02, Gerd Möllmann wrote:

> Pip Cet <pipcet@protonmail.com> writes:
>
> > > > And since we can't resize objects that may be pinned by ambiguous
> > > > references, well, the weak hash table implementation would look
> > > > quite different from the strong hash tables we have).
> > >
> > > If we make-hash-table with weak keys and/or values, allocate the
> > > Lisp_Hash_Table from the AWL pool using the weak_strong allocation
> > > point.  If neither keys nor values are weak, allocate the hash
> > > table from the default pool.
> > >
> > > Allocate the key and the value vector according to the hash
> > > table's weakness either from the strong or weak allocation point,
> > > or from the default pool if the table isn't weak at all.  (I've
> > > split the key_and_value vector already in two, but you probably
> > > noticed that already.)
> > >
> > > Dependent object of the key and value vectors could be the hash
> > > table itself.
> >
> > Then we couldn't modify the value vector when the key gets splatted,
> > or vice versa, so the table wouldn't be properly weak.
>
> True.  I forgot to mention an important thing: When something is
> splat, set flag(s) in the dependent hash table indicating that
> something must be done because of that splatting.  In gethash and so,
> check the flag and do what's necessary.  (I did something similar for
> the weak hash tables in CMUCL, and it wasn't entirely bad.  And weak
> tables should be rare.)

Not necessarily rare, particularly not if we turn BUF_MARKERS into a
weak hash table (I still don't see why we shouldn't do that; maybe I
missed it).

But, yes, that's a good idea, and it requires far fewer changes to the
hash table code than I've now made locally.  However, I've decided to
go through with it and have just successfully splatted my first weak
hash table entry.

> > My understanding is we must allocate all strongly-referencing
> > objects together in one object, all weakly-referencing objects
> > together in another one, and make them depend on each other.
>
> The first 2 points I think are true, and are the reason I split the 1
> vector in master containing both keys and values into two in igc, so
> that keys and values can be weak or not, as necessary.
>
> The 3rd thing you wrote I'm not sure.  My understanding is that
> specifying a dependent object simply means that MPS makes it
> accessible while scanning the depending object.  I don't think MPS
> does anything to the dependent object by itself.  Hence the idea to
> make the hash table the dependent object so that one at least can
> "notify" it that something has happened, so that it can modify index
> and next vectors.

You're right: if we choose to invalidate hash tables asynchronously and
fix them up later, it's perfectly okay for the dependent object to be
the hash table.

> > And that means making the pseudovec a mere tuple of pointers to the
> > strong and weak parts, because the conglomerate objects might need
> > to be resized...
>
> I'm afraid I couldn't follow.  Could you please explain?

If we go with my original proposal (which I'm not at all sure about at
this point, as the code is really ugly), we'd have to have something
like:

struct Lisp_Weak_Hash_Table
{
  union vectorlike_header header;
  struct Lisp_Weak_Hash_Table_Strong_Part *strong;
  struct Lisp_Weak_Hash_Table_Weak_Part *weak;
};

and store everything relevant in ->strong, which would potentially be
resized and reallocated by Fputhash.
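To make that a bit more concrete, here is a rough sketch of how I
picture the two parts.  Field names and types are illustrative only,
not what the attached patch uses; the point is just that everything we
may want to rewrite or resize lives behind ->strong, while only the
splattable slots live behind ->weak:

struct Lisp_Weak_Hash_Table_Strong_Part
{
  /* Bucket index, collision chains, hash codes, counts -- everything
     that must survive a splat and that Fputhash may reallocate when
     the table grows.  */
  ptrdiff_t table_size;
  ptrdiff_t count;
  Lisp_Object index;
  Lisp_Object next;
  Lisp_Object hash;
  /* Set when a slot in the weak part gets splatted, along the lines
     of the flag idea quoted above.  (Made-up name; see the P.S. at
     the end of this message.)  */
  bool needs_rehash;
  /* The weak part holding the actual key/value slots.  */
  struct Lisp_Weak_Hash_Table_Weak_Part *weak;
};

struct Lisp_Weak_Hash_Table_Weak_Part
{
  /* Key (and/or value) slots, allocated from the AWL pool so MPS can
     splat individual references, with some dependent object (the
     table or its strong part) registered so a splat can be recorded.  */
  ptrdiff_t size;
  Lisp_Object entries[FLEXIBLE_ARRAY_MEMBER];
};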
Another disadvantage of my approach is that if we ever decide to
dynamically shrink hash tables, we'd have to have a notification
mechanism anyway.

VERY WIP patch attached.  It doesn't actually register the dependent
object yet or resize the hash tables or anything; it's just enough to
build "temacs" (not emacs, the dumper part isn't there yet) and run:

(setq table (make-hash-table :weakness 'key))
(setq key (cons 1 2))
(puthash key (cons 3 4) table)
(gethash key table t)
(hash-table-count table)
(igc--collect)
(setq key nil)
(setq values nil)
(igc--collect)
(hash-table-count table)

and get a 0 result.

Pip
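P.S.  For completeness, here is roughly how I picture Gerd's "set a
flag, fix up on access" scheme meeting the struct sketch above on the
lookup path.  weak_hash_table_rehash and needs_rehash are made-up names
for illustration; none of this is in the attached patch yet.

/* Hypothetical helper called at the start of Fgethash, Fputhash and
   Fremhash for weak tables.  The scan function for the weak part sets
   needs_rehash on the dependent object when it splats a slot; here we
   notice that lazily and rebuild index/next before trusting them.  */
static void
weak_hash_table_maybe_rehash (struct Lisp_Weak_Hash_Table *h)
{
  if (h->strong->needs_rehash)
    {
      /* Made-up function: walk the weak part's slots, drop entries
	 whose key (or value) has been splatted, and re-link the
	 buckets in h->strong->index and h->strong->next.  */
      weak_hash_table_rehash (h);
      h->strong->needs_rehash = false;
    }
}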