unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Pip Cet <pipcet@gmail.com>
To: Daniel Colascione <dancol@dancol.org>
Cc: michael_heerdegen@web.de, npostavs@gmail.com,
	36447@debbugs.gnu.org, Stefan Monnier <monnier@iro.umontreal.ca>
Subject: bug#36447: 27.0.50; New "Unknown keyword" errors
Date: Wed, 10 Jul 2019 15:01:01 +0000	[thread overview]
Message-ID: <CAOqdjBfOYsXOVbH24D+fketPwH_iuO34Tk-oWVU6YTFRX7cDrw@mail.gmail.com> (raw)
In-Reply-To: <735465eebd8f96d3969d42e6f734f69e.squirrel@dancol.org>

On Wed, Jul 10, 2019 at 3:19 AM Daniel Colascione <dancol@dancol.org> wrote:
> Thanks for debugging this problem. It really is nasty. AIUI, the problem
> is that hash-consing the hash vectors might unify vectors that happen to
> have the same contents under one hashing scheme but different contents
> under a different hashing scheme, so when we rehash lazily, we correctly
> rehash one hash table and corrupt a different hash table's index array by
> side effect. There are two proposed solutions:
>
> 1) Copy the hash table internal vectors on rehash, and
> 2) "Freeze" hash tables by eliminating the index arrays and rebuilding
> them all eagerly on dump start.

That summary is correct, I think.

> #1 works, but it's somewhat inefficient.
>
> #2 is a variant of #1, in a sense. Instead of copying the hash table
> vectors and mutating them, we rebuild them from scratch. I don't
> understand why we have to do that eagerly.

I'm sorry if I suggested that we must do that. On the contrary, both
options are entirely viable, though on balance I prefer the eager
version.

The lazy approach makes the code more complicated, slightly slower,
and introduced what appears to me to be at least one bug
(Fhash_table_count returns a negative integer if called with a
non-rehashed hastable).

The eager approach slows down startup by a fraction of a millisecond
(it might be an entire millisecond if your Emacs session doesn't
access any of the dumped hash tables at all, but since it does tend to
do that, it'll be less in practice).

> #1 isn't as bad as you might think.

But it's not as good as "do #1 but only if PURE_P", which no longer works.

> It's the same work either way, except that if we copy, when we grow the
> hash table, we can actually free the original vectors.

I'd like to restrict #1 to hash tables which are somehow immutable,
either because they're pure or because we actually introduce immutable
objects, so they'd never grow. Mutable hash tables must not share
their index vectors anyway.

> IMHO, the right approach is to check in #1 for now and switch to a #2-like
> approach once master is stable. Noticing that we don't actually have to
> store the hash table internal arrays in the dump is a good catch.

I agree, but I think we should discuss the future of pure space, too.
Maybe we should have a separate bug for that, though.





  reply	other threads:[~2019-07-10 15:01 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-30 18:23 bug#36447: 27.0.50; New "Unknown keyword" errors Michael Heerdegen
2019-06-30 18:43 ` Eli Zaretskii
2019-06-30 21:44   ` Michael Heerdegen
2019-07-01 12:25     ` Noam Postavsky
2019-07-01 13:20       ` Pip Cet
2019-07-01 22:04       ` Michael Heerdegen
2019-07-02  1:59         ` Stefan Kangas
2019-07-02 14:17           ` Eli Zaretskii
2019-07-02 13:29         ` Pip Cet
2019-07-02 15:35           ` Michael Heerdegen
2019-07-02 16:20             ` Noam Postavsky
2019-07-02 22:50               ` Pip Cet
2019-07-03 11:57                 ` Pip Cet
2019-07-05  1:59                   ` Michael Heerdegen
2019-07-05  6:35                     ` Pip Cet
2019-07-05  7:50                       ` Eli Zaretskii
2019-07-05  8:12                         ` Pip Cet
2019-07-05  8:25                           ` Eli Zaretskii
2019-07-05  8:36                             ` Pip Cet
2019-07-05  8:41                               ` Eli Zaretskii
2019-07-05  9:09                                 ` Pip Cet
2019-07-05 12:23                                   ` Robert Pluim
2019-07-05 12:33                                   ` Eli Zaretskii
2019-07-05 13:41                                     ` Pip Cet
2019-07-05 18:00                                     ` Stefan Monnier
2019-07-05 18:07                                       ` Eli Zaretskii
2019-07-05 20:16                                         ` Stefan Monnier
2019-07-05 18:57                                       ` Pip Cet
2019-07-05 19:13                                         ` Eli Zaretskii
2019-07-05 20:21                                         ` Stefan Monnier
2019-07-05 21:52                                           ` Pip Cet
2019-07-05 22:10                                             ` Stefan Monnier
2019-07-06  6:45                                               ` Eli Zaretskii
2019-07-06 15:08                                                 ` Pip Cet
2019-07-09 21:05                                                   ` Stefan Monnier
2019-07-10  2:38                                                     ` Eli Zaretskii
2019-07-10  3:19                                                       ` Daniel Colascione
2019-07-10 15:01                                                         ` Pip Cet [this message]
2019-07-10 17:16                                                           ` Daniel Colascione
2019-07-10 20:14                                                             ` Pip Cet
2019-07-06 15:32                                             ` Michael Heerdegen
2019-07-08 17:30                                               ` Lars Ingebrigtsen
2019-07-08 17:58                                                 ` Pip Cet
2019-07-08 22:18                                                   ` Lars Ingebrigtsen
2019-07-08 22:25                                                     ` Noam Postavsky
2019-07-09 14:00                                                       ` Pip Cet
2019-07-10  3:01                                                         ` Daniel Colascione
2019-07-14 14:06                                                           ` Noam Postavsky
2019-07-08 23:22                                                     ` Stefan Monnier
2019-07-08 22:23                                                   ` Michael Heerdegen
2019-07-09 15:43                                                     ` Eli Zaretskii
2019-07-09 20:15                                                   ` Stefan Monnier
2019-07-05  7:55                       ` Katsumi Yamaoka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOqdjBfOYsXOVbH24D+fketPwH_iuO34Tk-oWVU6YTFRX7cDrw@mail.gmail.com \
    --to=pipcet@gmail.com \
    --cc=36447@debbugs.gnu.org \
    --cc=dancol@dancol.org \
    --cc=michael_heerdegen@web.de \
    --cc=monnier@iro.umontreal.ca \
    --cc=npostavs@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).