From: Pip Cet via "Emacs development discussions." <emacs-devel@gnu.org>
To: Stefan Kangas <stefankangas@gmail.com>
Cc: Helmut Eller <eller.helmut@gmail.com>,
Stefan Monnier <monnier@iro.umontreal.ca>,
emacs-devel@gnu.org
Subject: Re: Merging scratch/no-purespace to remove unexec and purespace
Date: Sun, 22 Dec 2024 13:13:50 +0000
Message-ID: <87a5cnhndr.fsf@protonmail.com>
In-Reply-To: <CADwFkm=9=Zr-KzC8mPFDkyJ8RpYEhNUFKK7TKvQHsGj+7FsiNA@mail.gmail.com>
Pip Cet <pipcet@protonmail.com> writes:
> However, I realize that (1) is currently a sheer guess. I haven't
> decided whether it's worth it to get an upper bound on the saved GC time
> by implementing a universal "tenured" set and performing a GC right
> after loading (which should be very fast, not marking any pdumped
> objects).
I did. This got long again. That's because I wanted to be really sure
that merging no-purespace isn't going to prevent worthwhile
optimizations in the future, and I am now. Feel free to skip the rest
:-)
My initial results are that simply "tenuring" the char tables in the
pdump seems to have such a drastic effect that it's hard to perform a
fair measurement: process_mark_stack is called (in emacs -Q, no --batch)
21384 times if we "tenure" the char tables, and 135345 times if we
don't.
(This suggests that char tables may be worth optimizing for the "old"
GC: simply keep a set of GC-relevant values in the char table, and scan
that rather than scanning the entire char table. However, we can't do
that with MPS, so I'm not overly interested in it. Also, I doubt the
optimization decisions required for char tables would be made the same
way if they were reimplemented today, so it may be more productive to
start over from scratch, with a particular focus on reducing the time
needed for GC rather than ordinary performance.)
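To make the "set of GC-relevant values" idea concrete, here is a minimal sketch in C. It is a toy, not Emacs code: `value_t`, `VALUE_IS_HEAP`, `struct toy_table` and the `toy_*` functions are all hypothetical names, and the tagging scheme (odd words are immediates) is an assumption made only for the example. The set is append-only, so an overwritten slot leaves a stale entry behind; that is conservative (it can keep an object alive too long) and a real design would need to handle removal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Lisp_Object: even, nonzero words are heap
   pointers; everything else is an immediate the GC can ignore.  */
typedef uintptr_t value_t;
#define VALUE_IS_HEAP(v) ((v) != 0 && ((v) & 1) == 0)

enum { TABLE_SLOTS = 4096, MAX_RELEVANT = 64 };

struct toy_table {
  value_t slots[TABLE_SLOTS];
  value_t relevant[MAX_RELEVANT]; /* distinct heap values ever stored */
  size_t n_relevant;
  int overflowed;                 /* set is incomplete: do a full scan */
};

/* Store V in slot I and record it in the side set if the GC cares.  */
void toy_table_set(struct toy_table *t, size_t i, value_t v)
{
  assert(i < TABLE_SLOTS);
  t->slots[i] = v;
  if (!VALUE_IS_HEAP(v) || t->overflowed)
    return;
  for (size_t k = 0; k < t->n_relevant; k++)
    if (t->relevant[k] == v)
      return;                     /* already recorded */
  if (t->n_relevant == MAX_RELEVANT) {
    t->overflowed = 1;            /* give up on the set, stay correct */
    return;
  }
  t->relevant[t->n_relevant++] = v;
}

/* Call MARK on every value the GC must trace.  Returns the number of
   words examined, which is what the optimization tries to shrink.  */
size_t toy_table_mark(const struct toy_table *t, void (*mark)(value_t))
{
  if (t->overflowed) {
    for (size_t i = 0; i < TABLE_SLOTS; i++)
      if (VALUE_IS_HEAP(t->slots[i]))
        mark(t->slots[i]);
    return TABLE_SLOTS;
  }
  for (size_t k = 0; k < t->n_relevant; k++)
    mark(t->relevant[k]);
  return t->n_relevant;
}

/* Trivial mark callback used to count calls in examples.  */
size_t toy_marked;
void toy_count_mark(value_t v) { (void) v; toy_marked++; }
```

With a table whose slots mostly hold immediates, `toy_table_mark` examines only the handful of recorded heap values instead of all `TABLE_SLOTS` words; the cost moves to `toy_table_set`, which is the trade-off mentioned above.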
Also, we need to add a few check_writable calls to avoid segfaults. I
should have expected that, I guess.
The good news is that few pdumped objects (256 once a non-batched Emacs
is started) actually appear to be written to, so it's not entirely
hopeless to identify those in one run and mark them non-tenured in the
real Emacs.
IOW, my tentative conclusion is that it's possible to perform such
optimizations after pure space is dropped, and there's no reason to
delay the merge.
Optimizing based on a *hint* that an object probably won't be mutated is
a potential way forward.
Optimizing based on a hard promise that an object won't be mutated, as
the old purespace code does, not so much. Even the old purespace code,
with the years of development it's seen, ended up losing the
optimization and causing preventable segfaults for valid-looking Elisp
code.
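A rough sketch of what I mean by a hint, as opposed to a promise, again as a toy in C. Everything here is hypothetical (`struct toy_obj`, the `toy_*` functions, the fixed-size remembered list); only the name check_writable is borrowed from the discussion, and this is not how the branch implements it. The assumption is that a tenured object initially refers only to other dumped objects, so the mark phase may skip it; the first write demotes it and remembers it as an extra root instead of signalling an error or crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy object with a single outgoing reference.  */
struct toy_obj {
  bool tenured;          /* hint: came from the dump, probably never written */
  bool marked;
  struct toy_obj *ref;   /* may be NULL */
};

/* Objects whose hint turned out to be wrong; scanned as extra roots.  */
enum { MAX_REMEMBERED = 16 };
struct toy_obj *toy_remembered[MAX_REMEMBERED];
size_t toy_n_remembered;

/* Must be called before any store into OBJ.  A wrong hint costs a
   demotion, not a segfault.  */
void toy_check_writable(struct toy_obj *obj)
{
  if (!obj->tenured)
    return;
  assert(toy_n_remembered < MAX_REMEMBERED);
  obj->tenured = false;
  toy_remembered[toy_n_remembered++] = obj;
}

void toy_set_ref(struct toy_obj *obj, struct toy_obj *target)
{
  toy_check_writable(obj);
  obj->ref = target;
}

/* Follow references from O, stopping at tenured or already-marked
   objects.  Returns the number of objects newly marked.  */
size_t toy_mark_from(struct toy_obj *o)
{
  size_t visited = 0;
  for (; o && !o->tenured && !o->marked; o = o->ref) {
    o->marked = true;
    visited++;
  }
  return visited;
}

/* Mark from ROOT plus every demoted object.  */
size_t toy_gc_mark(struct toy_obj *root)
{
  size_t visited = toy_mark_from(root);
  for (size_t i = 0; i < toy_n_remembered; i++)
    visited += toy_mark_from(toy_remembered[i]);
  return visited;
}
```

As long as nothing writes to a dumped object, `toy_gc_mark` never looks inside it. Once `toy_set_ref` stores a heap pointer into one, that object is traced like any other, so the new referent stays alive.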
I must confess I'm fundamentally opposed to having objects come in a
"read-only" and a "read-write" flavor. Either they should always be
immutable, as bignums and floats are now, or we should go to the
trouble of supporting the rare cases in which an object hinted or
guessed to be read-only turned out not to be. (This is independent of
the question of whether the characters in a string can be changed or
not.)
It's very hard even to define what constitutes mutation of an object and
what doesn't. Setting a symbol's global value is clearly a mutation in
the current code, but what if we keep those global values in a hash
table instead, and the struct Lisp_Symbol is never written to? Does
lexically (or dynamically) binding a symbol mean the entire symbol is no
longer read-only? If we ever implement hash-collision workarounds by
randomizing hash seeds, would re-seeding count as a mutation of the hash
table? What about (aset v 0 (aref v 0))? Hash table resizing? Removing
dead keys from weak hash tables? Pinning a string to use it in a byte
code object? Wouldn't it make sense to protect hash table (or obarray)
keys from mutation if that may result in irretrievable entries?
Most of these questions have two good answers, one which aids in
optimization, and one which Lisp programmers would expect. They're
often different.
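The (aset v 0 (aref v 0)) case is perhaps the easiest one to show in code. The following toy in C (hypothetical names throughout, nothing from the Emacs sources) tracks two candidate definitions of "mutated" side by side: one set by any store, one set only when a store changes the slot's contents. Storing a value back into the slot it came from makes them disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_VEC_LEN = 4 };

/* Hypothetical toy vector carrying both notions of "mutated".  */
struct toy_vec {
  int slots[TOY_VEC_LEN];
  bool written;   /* store-based: set by every store */
  bool changed;   /* content-based: set only if a slot's value changes */
};

void toy_aset(struct toy_vec *v, size_t i, int x)
{
  assert(i < TOY_VEC_LEN);
  v->written = true;
  if (v->slots[i] != x)
    v->changed = true;
  v->slots[i] = x;
}

int toy_aref(const struct toy_vec *v, size_t i)
{
  assert(i < TOY_VEC_LEN);
  return v->slots[i];
}
```

After `toy_aset (&v, 0, toy_aref (&v, 0))` on a fresh vector, `written` is true and `changed` is false. Which of the two flags a writability check should consult is exactly the kind of question we would have to agree on.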
To get back to the no-purespace branch, I think we should consider
reintroducing check_writable () calls (which would currently be no-ops
on the master branch) after the merge, if we can agree on precisely when
this macro should be called and how. The old locations of CHECK_IMPURE
can serve as a hint, but no more, so let's drop CHECK_IMPURE first and
start with a clean slate there.
Pip