unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
From: Pip Cet <pipcet@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: michael_heerdegen@web.de, npostavs@gmail.com, 36447@debbugs.gnu.org
Subject: bug#36447: 27.0.50; New "Unknown keyword" errors
Date: Fri, 5 Jul 2019 09:09:13 +0000	[thread overview]
Message-ID: <CAOqdjBcgHD3N3eh=S1aHmou51nL=F3n1p=VvUD95dxPZr0sF9w@mail.gmail.com> (raw)
In-Reply-To: <83r274an61.fsf@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 3427 bytes --]

On Fri, Jul 5, 2019 at 8:41 AM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Pip Cet <pipcet@gmail.com>
> > Date: Fri, 5 Jul 2019 08:36:57 +0000
> > Cc: michael_heerdegen@web.de, npostavs@gmail.com, 36447@debbugs.gnu.org
> >
> > > > I don't think we can sensibly add tests for this bug, but the fix I
> > > > posted earlier still seems valid to me.
> > >
> > > Sorry, I'm not tracking this part of the discussion, as it lost me
> > > long ago.
> >
> > What's the best way of getting this fixed?
>
> Sorry, I don't think I know what "this bug" is about,

The bug:
Building emacs with "-O0 -g3 -ggdb" on current Linux will result in
binaries that sometimes, depending on the precise compiler version
used, will fail weirdly if you evaluate this emacs -Q recipe line by
line:

(custom-handle-keyword nil :group nil nil)
(y-or-n-p "prompt")
(custom-handle-keyword nil :group nil nil)

The error produced will be "unknown keyword :group", which is
nonsensical as :group is indeed a valid keyword.


The analysis:
It's not the byte code, which is fine and looks like this:

byte code for custom-handle-keyword:
  doc:  For customization option SYMBOL, handle KEYWORD with VALUE. ...
  args: (arg1 arg2 arg3 arg4)
0    varref      purify-flag
1    goto-if-nil 1
4    constant  purecopy
5    stack-ref 2
6    call      1
7    stack-set 2
9:1    stack-ref 2
10    constant  <jump-table-eq (:group 2 :version 3 :package-version 4
:link 5 :load 6 :tag 7 :set-after 8)>
11    switch
12    goto      9
...
52:9    constant  error
53    constant  "Unknown keyword %s"
54    stack-ref 4
55    call      2
56    return

Note that the code uses a jump table, which is a hash table mapping
keys to integers for the "switch" op.

This is where hash tables come in.

We can inspect the hash table `custom-handle-keyword' uses by evaluating

(aref (aref (symbol-function #'custom-handle-keyword) 2) 2)

The hash table prints fine.

But investigating its C in-memory representation, we find that the
hash collision chains, stored in the ->next vector, are corrupted.

It turns out that this is because hash_table_rehash was called on a
different hash table which had the same ->next vector, but different
contents.

That's the problem I fixed: for reasons explained below, we sometimes
see two hash tables with the same ->next vector, then try to rehash
both of them, obtaining different results. Last caller wins, first
caller gets the corruption (each hash table is rehashed at most once).

The reasons are this: when a hash table is purecopied, its ->next
vector is purecopied, which merges it with another, similar, hash
table's ->next vector if purify-flag is a (third) hash table. The
vectors are compared using `equal', but the pure copies are actually
`eq'.

This worked fine with the old dumper, because we never modified pure
storage. However, with the current pdumper code, we have to do that.

The (disappointingly trivial) fix:
call copy-sequence on h->next before rehashing the table. This will
make h->next impure, which is good since we're going to modify it.
While we're there, do the same for the other vectors used in the hash
table representation, except for h->key_and_value, which we need not
touch.

> and how is the issue with hash tables relevant.

The bytecode is executing incorrectly because it relies on a
purecopied hash table, which is effectively part of the compiled
function. The hash table has become corrupted.

[-- Attachment #2: 0002-Don-t-alter-shared-structure-in-dumped-purecopied-ha.patch --]
[-- Type: text/x-patch, Size: 964 bytes --]

From 70419f630f60919c8645a10aeef8d299f5098ff5 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Wed, 3 Jul 2019 11:48:22 +0000
Subject: [PATCH 2/2] Don't alter shared structure in dumped purecopied hash
 tables.

* src/fns.c (hash_table_rehash): Make sure we're operating on
fresh copies of ->next, ->index, ->hash.
---
 src/fns.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/fns.c b/src/fns.c
index 2fc000a7f4..44d2de523a 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4223,6 +4223,12 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
 {
   ptrdiff_t size = HASH_TABLE_SIZE (h);
 
+  /* These structures may have been purecopied and shared
+     (bug#36447).  */
+  h->next = Fcopy_sequence (h->next);
+  h->index = Fcopy_sequence (h->index);
+  h->hash = Fcopy_sequence (h->hash);
+
   /* Recompute the actual hash codes for each entry in the table.
      Order is still invalid.  */
   for (ptrdiff_t i = 0; i < size; ++i)
-- 
2.20.1


  reply	other threads:[~2019-07-05  9:09 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-30 18:23 bug#36447: 27.0.50; New "Unknown keyword" errors Michael Heerdegen
2019-06-30 18:43 ` Eli Zaretskii
2019-06-30 21:44   ` Michael Heerdegen
2019-07-01 12:25     ` Noam Postavsky
2019-07-01 13:20       ` Pip Cet
2019-07-01 22:04       ` Michael Heerdegen
2019-07-02  1:59         ` Stefan Kangas
2019-07-02 14:17           ` Eli Zaretskii
2019-07-02 13:29         ` Pip Cet
2019-07-02 15:35           ` Michael Heerdegen
2019-07-02 16:20             ` Noam Postavsky
2019-07-02 22:50               ` Pip Cet
2019-07-03 11:57                 ` Pip Cet
2019-07-05  1:59                   ` Michael Heerdegen
2019-07-05  6:35                     ` Pip Cet
2019-07-05  7:50                       ` Eli Zaretskii
2019-07-05  8:12                         ` Pip Cet
2019-07-05  8:25                           ` Eli Zaretskii
2019-07-05  8:36                             ` Pip Cet
2019-07-05  8:41                               ` Eli Zaretskii
2019-07-05  9:09                                 ` Pip Cet [this message]
2019-07-05 12:23                                   ` Robert Pluim
2019-07-05 12:33                                   ` Eli Zaretskii
2019-07-05 13:41                                     ` Pip Cet
2019-07-05 18:00                                     ` Stefan Monnier
2019-07-05 18:07                                       ` Eli Zaretskii
2019-07-05 20:16                                         ` Stefan Monnier
2019-07-05 18:57                                       ` Pip Cet
2019-07-05 19:13                                         ` Eli Zaretskii
2019-07-05 20:21                                         ` Stefan Monnier
2019-07-05 21:52                                           ` Pip Cet
2019-07-05 22:10                                             ` Stefan Monnier
2019-07-06  6:45                                               ` Eli Zaretskii
2019-07-06 15:08                                                 ` Pip Cet
2019-07-09 21:05                                                   ` Stefan Monnier
2019-07-10  2:38                                                     ` Eli Zaretskii
2019-07-10  3:19                                                       ` Daniel Colascione
2019-07-10 15:01                                                         ` Pip Cet
2019-07-10 17:16                                                           ` Daniel Colascione
2019-07-10 20:14                                                             ` Pip Cet
2019-07-06 15:32                                             ` Michael Heerdegen
2019-07-08 17:30                                               ` Lars Ingebrigtsen
2019-07-08 17:58                                                 ` Pip Cet
2019-07-08 22:18                                                   ` Lars Ingebrigtsen
2019-07-08 22:25                                                     ` Noam Postavsky
2019-07-09 14:00                                                       ` Pip Cet
2019-07-10  3:01                                                         ` Daniel Colascione
2019-07-14 14:06                                                           ` Noam Postavsky
2019-07-08 23:22                                                     ` Stefan Monnier
2019-07-08 22:23                                                   ` Michael Heerdegen
2019-07-09 15:43                                                     ` Eli Zaretskii
2019-07-09 20:15                                                   ` Stefan Monnier
2019-07-05  7:55                       ` Katsumi Yamaoka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOqdjBcgHD3N3eh=S1aHmou51nL=F3n1p=VvUD95dxPZr0sF9w@mail.gmail.com' \
    --to=pipcet@gmail.com \
    --cc=36447@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    --cc=michael_heerdegen@web.de \
    --cc=npostavs@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).