* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
@ 2019-07-11 14:05 Pip Cet
2019-07-14 14:39 ` Paul Eggert
0 siblings, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-11 14:05 UTC
To: 36597, dancol
[-- Attachment #1: Type: text/plain, Size: 1440 bytes --]
This is a follow-up to bug#36447, which has been fixed.
Lazy rehashing for hash tables should be removed. This patch does that.
Lazy rehashing makes all code that accesses hash tables a little more
complicated, because each access must first check whether a rehash is
pending; in at least one case we forgot that check, resulting in
bug#36596.
The sole benefit of lazy rehashing mentioned so far is that start-up
time is reduced a little (by less than a millisecond on this system),
and that for certain usage patterns (many short Emacs sessions that
don't do very much, I think), this might outweigh the negative
consequences of lazy rehashing.
Lazy rehashing means less maintainable code, more code, and, at
run-time, slightly slower code.
In particular, it means that we have to have a comment in lisp.h which
distracts from the code and has to be marked as "IMPORTANT!!!!!!!",
which I think is nearly as annoying as having to jump through special
hoops before accessing a hash table.
I'm posting this as a separate bug so we can have a (hopefully brief)
discussion about it and then either move away from lazy rehashing or
continue using it; if we decide to continue using it, it should be
documented in much more detail.
(We should keep in mind that conditional branches, even ones that are
well-predicted and don't cause cache misses, aren't free: they use
execution units, and preparing their arguments in registers increases
register pressure, and of course they increase code size.)
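
To illustrate the trade-off described above, here is a minimal,
self-contained sketch. It is not Emacs code: the toy_* names and the
fixed-size table are hypothetical and exist only for this example.
The one thing borrowed from the real code is the convention that a
negative count means "rehash pending". The lazy accessor carries the
guard branch on every call, while the eager variant pays once, up
front, and can then use the unguarded lookup.

```c
#include <assert.h>
#include <stddef.h>

enum { NSLOTS = 8 };

/* Hypothetical toy table.  index[] maps a bucket to its first entry,
   next[] chains entries, and a negative count marks a pending rehash.  */
struct toy_table
{
  ptrdiff_t count;
  int keys[NSLOTS];
  ptrdiff_t index[NSLOTS];
  ptrdiff_t next[NSLOTS];
};

static unsigned
toy_hash (int key)
{
  return (unsigned) key * 2654435761u;
}

/* Rebuild index[] and next[] from keys[], then mark the table valid.  */
static void
toy_rehash (struct toy_table *t)
{
  ptrdiff_t n = t->count < 0 ? -t->count : t->count;
  for (ptrdiff_t b = 0; b < NSLOTS; b++)
    t->index[b] = -1;
  for (ptrdiff_t i = 0; i < n; i++)
    {
      unsigned b = toy_hash (t->keys[i]) % NSLOTS;
      t->next[i] = t->index[b];
      t->index[b] = i;
    }
  t->count = n;
}

/* Unguarded lookup: only correct once the table has been rehashed.  */
static ptrdiff_t
toy_lookup_raw (const struct toy_table *t, int key)
{
  for (ptrdiff_t i = t->index[toy_hash (key) % NSLOTS]; 0 <= i;
       i = t->next[i])
    if (t->keys[i] == key)
      return i;
  return -1;
}

/* Lazy variant: every accessor has to remember this guard.  */
static ptrdiff_t
toy_lookup_lazy (struct toy_table *t, int key)
{
  if (t->count < 0)
    toy_rehash (t);
  return toy_lookup_raw (t, key);
}

/* Return 0 if the lazy and eager strategies agree on a lookup.  */
static int
toy_demo (void)
{
  struct toy_table lazy = { .count = -3, .keys = { 10, 20, 30 } };
  struct toy_table eager = lazy;
  toy_rehash (&eager);		/* Eager: done once, at "load" time.  */
  ptrdiff_t a = toy_lookup_lazy (&lazy, 20);
  ptrdiff_t b = toy_lookup_raw (&eager, 20);
  return (a == 1 && b == 1 && lazy.count == 3) ? 0 : 1;
}
```

Calling toy_lookup_raw on the table that has not been rehashed would
read stale index data, which is the kind of mistake a forgotten guard
produces.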
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-portable-.patch --]
[-- Type: text/x-patch, Size: 16498 bytes --]
From 2db3964f2d8290b905b2d20e2a62fb073fec1382 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Thu, 11 Jul 2019 12:06:59 +0000
Subject: [PATCH] Rehash hash tables eagerly after loading a portable dump
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 53 ++++-----------
src/lisp.h | 19 +-----
src/minibuf.c | 3 -
src/pdumper.c | 177 +++++++++++++++++++++++++-----------------------
src/pdumper.h | 1 +
8 files changed, 108 insertions(+), 148 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 29dff44f00..9c72429e0c 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1402,7 +1402,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index 183062de46..49a285cff0 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -654,7 +654,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
EMACS_UINT hash = h->test.hashfn (&h->test, header);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index 9c93748a0f..136236eb35 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1560,6 +1560,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index 7343556ac2..420d898b26 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4224,43 +4224,24 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
{
ptrdiff_t size = HASH_TABLE_SIZE (h);
- /* These structures may have been purecopied and shared
- (bug#36447). */
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
- h->hash = Fcopy_sequence (h->hash);
-
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- Lisp_Object key = HASH_KEY (h, i);
- EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
- set_hash_hash_slot (h, i, make_fixnum (hash_code));
- }
-
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
+ {
+ Lisp_Object key = HASH_KEY (h, i);
+ EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
+ set_hash_hash_slot (h, i, make_fixnum (hash_code));
+ }
/* Rebuild the collision chains. */
for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
-
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (h->count < 0);
- h->count = -h->count;
- eassert (!hash_rehash_needed_p (h));
+ {
+ EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
+ ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
+ }
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4273,8 +4254,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
EMACS_UINT hash_code;
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
hash_code = h->test.hashfn (&h->test, key);
eassert ((hash_code & ~INTMASK) == 0);
if (hash)
@@ -4303,8 +4282,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
eassert ((hash & ~INTMASK) == 0);
/* Increment count after resizing because resizing may fail. */
@@ -4338,8 +4315,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4417,9 +4392,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
diff --git a/src/lisp.h b/src/lisp.h
index fa57cad8a6..cc0e1bce51 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2236,11 +2236,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2353,19 +2349,6 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
void hash_table_rehash (struct Lisp_Hash_Table *h);
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return h->count < 0;
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
-
/* Default size for hash tables if not specified. */
enum DEFAULT_HASH_SIZE { DEFAULT_HASH_SIZE = 65 };
diff --git a/src/minibuf.c b/src/minibuf.c
index d9a6e15b05..e923ce2a43 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1203,9 +1203,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 3d8531c6a4..cfdeb5ec9b 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -392,6 +392,8 @@ dump_fingerprint (const char *label, unsigned char const *xfingerprint)
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -549,6 +551,8 @@ dump_fingerprint (const char *label, unsigned char const *xfingerprint)
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2621,68 +2625,64 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a list of (KEY . VALUE) pairs in the given hash table. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (hash, i)))
+ Lisp_Object contents = Qnil;
+ /* Make sure key_and_value ends up in the same order, charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
{
- Lisp_Object key = HASH_KEY (hash, i);
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal && STRINGP (key))
- || ((is_equal || is_eql) && FLOATP (key)));
- if (!key_stable)
- return false;
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
}
+ return CALLN (Fapply, Qvector, contents);
+}
- return true;
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
+{
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static void
+hash_table_freeze (struct Lisp_Hash_Table *h)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- if (!NILP (HASH_HASH (h, i)))
- dump_push (&contents, Fcons (HASH_KEY (h, i), HASH_VALUE (h, i)));
- return Fnreverse (contents);
+ h->key_and_value = hash_table_contents (h);
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->count = nkeys;
+ if (nkeys == 0)
+ nkeys = 1;
+ h->index = h->next = h->hash = make_fixnum (nkeys);
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- XHASH_TABLE (table_rehashed)->count *= -1;
- eassert (XHASH_TABLE (table_rehashed)->count <= 0);
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
+hash_table_thaw (struct Lisp_Hash_Table *h)
+{
+ Lisp_Object count = h->index;
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
+ Lisp_Object key_and_value = h->key_and_value;
+ h->next_free = -1;
+ if (XFIXNAT (count) <= 1)
{
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
+ h->key_and_value = Fmake_vector (make_fixnum (2 * XFIXNAT (count)), Qnil);
+ ptrdiff_t i = 0;
+ while (i < ASIZE (key_and_value))
+ {
+ ASET (h->key_and_value, i, AREF (key_and_value, i));
+ i++;
+ }
}
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+ hash_table_rehash (h);
}
static dump_off
@@ -2694,45 +2694,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- hash->count = -hash->count;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4142,6 +4108,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5433,6 +5412,13 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (Vpdumper_hash_tables, hash_tables);
+ }
+
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5504,10 +5490,31 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
\f
+static void thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = Vpdumper_hash_tables;
+ ptrdiff_t i = 0;
+ while (i < ASIZE (hash_tables))
+ {
+ hash_table_thaw (XHASH_TABLE (AREF (hash_tables, i)));
+ i++;
+ }
+ Vpdumper_hash_tables = zero_vector;
+}
+
+void
+init_pdumper_once (void)
+{
+ Vpdumper_hash_tables = zero_vector;
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
+
void
syms_of_pdumper (void)
{
#ifdef HAVE_PDUMPER
+ DEFVAR_LISP ("pdumper-hash-tables", Vpdumper_hash_tables,
+ doc: /* A list of hash tables that need to be thawed after loading the pdump. */);
defsubr (&Sdump_emacs_portable);
defsubr (&Sdump_emacs_portable__sort_predicate);
defsubr (&Sdump_emacs_portable__sort_predicate_copied);
diff --git a/src/pdumper.h b/src/pdumper.h
index ab2f426c1e..cfea06d33d 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -248,6 +248,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.20.1
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-11 14:05 bug#36597: 27.0.50; rehash hash tables eagerly in pdumper Pip Cet
@ 2019-07-14 14:39 ` Paul Eggert
2019-07-14 15:01 ` Pip Cet
2019-07-18 5:39 ` Eli Zaretskii
0 siblings, 2 replies; 37+ messages in thread
From: Paul Eggert @ 2019-07-14 14:39 UTC
To: Pip Cet; +Cc: 36597
[-- Attachment #1: Type: text/plain, Size: 824 bytes --]
Although I like the simplicity of eager rehashing, I'm not yet sold on the
performance implications. On my usual (cd lisp && make compile-always)
benchmark, the patch hurt user+system CPU time performance by 1.5%. Admittedly
just one benchmark, but still....
Also, must we expose Vpdumper_hash_tables to Lisp? Surely it'd be better to keep
it private to pdumper.c.
I'll CC Daniel to see whether he has any insights about improvements in this
area.
PS. I ran that benchmark on my home desktop, an Intel Xeon E3-1225 v2 running
Ubuntu 18.04.2. To run it, I rebased your patch and also removed the
no-longer-used PDUMPER_CHECK_REHASHING macro that my GCC complained about
(wonder why that didn't happen for you?), resulting in the attached patch
against current master 8ff09154a29a1151afb2902267ca35f89ebda73c.
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-portable-.patch --]
[-- Type: text/x-patch, Size: 17309 bytes --]
From 96a36697b2f472f3a6b3138444659babd0d73f32 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Sun, 14 Jul 2019 07:22:01 -0700
Subject: [PATCH] Rehash hash tables eagerly after loading a portable dump
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
(PDUMPER_CHECK_REHASHING): Remove.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 54 ++++----------
src/lisp.h | 19 +----
src/minibuf.c | 3 -
src/pdumper.c | 188 ++++++++++++++++++++++++------------------------
src/pdumper.h | 1 +
8 files changed, 108 insertions(+), 160 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 29dff44f00..9c72429e0c 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1402,7 +1402,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index 183062de46..49a285cff0 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -654,7 +654,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
EMACS_UINT hash = h->test.hashfn (&h->test, header);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index ad661a081b..855b2c6715 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1560,6 +1560,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index 0497588689..b6134a314c 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4241,43 +4241,24 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
{
ptrdiff_t size = HASH_TABLE_SIZE (h);
- /* These structures may have been purecopied and shared
- (bug#36447). */
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
- h->hash = Fcopy_sequence (h->hash);
-
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- Lisp_Object key = HASH_KEY (h, i);
- EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
- set_hash_hash_slot (h, i, make_fixnum (hash_code));
- }
-
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
+ {
+ Lisp_Object key = HASH_KEY (h, i);
+ EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
+ set_hash_hash_slot (h, i, make_fixnum (hash_code));
+ }
/* Rebuild the collision chains. */
for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
-
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (h->count < 0);
- h->count = -h->count;
- eassert (!hash_rehash_needed_p (h));
+ {
+ EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
+ ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
+ }
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4290,8 +4271,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
EMACS_UINT hash_code;
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
hash_code = h->test.hashfn (&h->test, key);
eassert ((hash_code & ~INTMASK) == 0);
if (hash)
@@ -4320,8 +4299,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
eassert ((hash & ~INTMASK) == 0);
/* Increment count after resizing because resizing may fail. */
@@ -4355,8 +4332,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4434,9 +4409,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
@@ -4881,7 +4854,6 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- hash_rehash_if_needed (h);
return make_fixnum (h->count);
}
diff --git a/src/lisp.h b/src/lisp.h
index 13014c82dc..d0e5c43c41 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2245,11 +2245,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2363,19 +2359,6 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
void hash_table_rehash (struct Lisp_Hash_Table *h);
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return h->count < 0;
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
-
/* Default size for hash tables if not specified. */
enum DEFAULT_HASH_SIZE { DEFAULT_HASH_SIZE = 65 };
diff --git a/src/minibuf.c b/src/minibuf.c
index d9a6e15b05..e923ce2a43 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1203,9 +1203,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 03c00bf27b..d35d296d32 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -107,17 +107,6 @@ #define VM_MS_WINDOWS 2
#define DANGEROUS 0
-/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
- check, for each hash table it dumps, that the hash table means the
- same thing after rehashing. */
-#ifndef PDUMPER_CHECK_REHASHING
-# if ENABLE_CHECKING
-# define PDUMPER_CHECK_REHASHING 1
-# else
-# define PDUMPER_CHECK_REHASHING 0
-# endif
-#endif
-
/* We require an architecture in which all pointers are the same size
and have the same layout, where pointers are either 32 or 64 bits
long, and where bytes have eight bits --- that is, a
@@ -393,6 +382,8 @@ dump_fingerprint (char const *label,
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -550,6 +541,8 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2622,68 +2615,64 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a list of (KEY . VALUE) pairs in the given hash table. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (hash, i)))
+ Lisp_Object contents = Qnil;
+ /* Make sure key_and_value ends up in the same order, charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
{
- Lisp_Object key = HASH_KEY (hash, i);
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal && STRINGP (key))
- || ((is_equal || is_eql) && FLOATP (key)));
- if (!key_stable)
- return false;
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
}
+ return CALLN (Fapply, Qvector, contents);
+}
- return true;
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
+{
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static void
+hash_table_freeze (struct Lisp_Hash_Table *h)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- if (!NILP (HASH_HASH (h, i)))
- dump_push (&contents, Fcons (HASH_KEY (h, i), HASH_VALUE (h, i)));
- return Fnreverse (contents);
+ h->key_and_value = hash_table_contents (h);
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->count = nkeys;
+ if (nkeys == 0)
+ nkeys = 1;
+ h->index = h->next = h->hash = make_fixnum (nkeys);
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- XHASH_TABLE (table_rehashed)->count *= -1;
- eassert (XHASH_TABLE (table_rehashed)->count <= 0);
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
+hash_table_thaw (struct Lisp_Hash_Table *h)
+{
+ Lisp_Object count = h->index;
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
+ Lisp_Object key_and_value = h->key_and_value;
+ h->next_free = -1;
+ if (XFIXNAT (count) <= 1)
{
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
+ h->key_and_value = Fmake_vector (make_fixnum (2 * XFIXNAT (count)), Qnil);
+ ptrdiff_t i = 0;
+ while (i < ASIZE (key_and_value))
+ {
+ ASET (h->key_and_value, i, AREF (key_and_value, i));
+ i++;
+ }
}
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+ hash_table_rehash (h);
}
static dump_off
@@ -2695,45 +2684,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- hash->count = -hash->count;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4140,6 +4095,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5431,6 +5399,13 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (Vpdumper_hash_tables, hash_tables);
+ }
+
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5502,10 +5477,31 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
\f
+static void thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = Vpdumper_hash_tables;
+ ptrdiff_t i = 0;
+ while (i < ASIZE (hash_tables))
+ {
+ hash_table_thaw (XHASH_TABLE (AREF (hash_tables, i)));
+ i++;
+ }
+ Vpdumper_hash_tables = zero_vector;
+}
+
+void
+init_pdumper_once (void)
+{
+ Vpdumper_hash_tables = zero_vector;
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
+
void
syms_of_pdumper (void)
{
#ifdef HAVE_PDUMPER
+ DEFVAR_LISP ("pdumper-hash-tables", Vpdumper_hash_tables,
+ doc: /* A list of hash tables that need to be thawed after loading the pdump. */);
defsubr (&Sdump_emacs_portable);
defsubr (&Sdump_emacs_portable__sort_predicate);
defsubr (&Sdump_emacs_portable__sort_predicate_copied);
diff --git a/src/pdumper.h b/src/pdumper.h
index ab2f426c1e..cfea06d33d 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -248,6 +248,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.17.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-14 14:39 ` Paul Eggert
@ 2019-07-14 15:01 ` Pip Cet
2019-07-14 15:49 ` Paul Eggert
2019-07-18 5:39 ` Eli Zaretskii
1 sibling, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-14 15:01 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597
On Sun, Jul 14, 2019 at 2:40 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
> Although I like the simplicity of eager rehashing, I'm not yet sold on the
> performance implications. On my usual (cd lisp && make compile-always)
> benchmark, the patch hurt user+system CPU time performance by 1.5%. Admittedly
> just one benchmark, but still....
Indeed, that's plenty of small Emacs processes not doing very much.
It's not the case we ought to be optimizing for, I think, but the
performance concerns should be taken seriously. One way to avoid the
performance problems entirely is preferred-address loading of hash
dumps, but that has security implications...
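To illustrate why load addresses matter here: an EQ-style hash is derived from an object's address, so an index built at dump time stays valid only if the objects land at the same addresses after loading. The following C sketch is a toy model, not Emacs code; `addr_bucket`, `object_address`, the shift by 3 and the bucket count are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identity hash: the bucket depends only on the
   object's address (low alignment bits are shifted away).  */
static size_t
addr_bucket (uintptr_t address, size_t nbuckets)
{
  assert (nbuckets > 0);
  return (address >> 3) % nbuckets;
}

/* Address of an object stored at OFFSET inside a dump that has
   been mapped at BASE.  */
static uintptr_t
object_address (uintptr_t base, uintptr_t offset)
{
  return base + offset;
}
```

With base 0x1000 and offset 0x40 the object falls into bucket 2 of 7; with the same dump mapped at 0x2000 it falls into bucket 3, so a saved index would be stale and the table has to be rehashed. Loading at a fixed preferred address would keep the bucket unchanged, presumably at the cost of working against address-space layout randomization, which may be the security concern alluded to.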
> Also, must we expose Vpdumper_hash_tables to Lisp? Surely it'd be better to keep
> it private to pdumper.c.
Oops, I agree absolutely. Will remove that.
> I'll CC: to Daniel to see whether he has any insights about improvements in this
> area.
Sure; I sent the original email to Daniel, too, of course.
> PS. I ran that benchmark on my home desktop, an Intel Xeon E3-1225 v2 running
> Ubuntu 18.04.2. To run it, I rebased your patch and also removed the
> no-longer-used PDUMPER_CHECK_REHASHING macro that my GCC complained about
> (wonder why that didn't happen for you?), resulting in the attached patch
> against current master 8ff09154a29a1151afb2902267ca35f89ebda73c.
Some GCC versions complain about it, some don't, I think.
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-14 15:01 ` Pip Cet
@ 2019-07-14 15:49 ` Paul Eggert
2019-07-14 16:54 ` Pip Cet
0 siblings, 1 reply; 37+ messages in thread
From: Paul Eggert @ 2019-07-14 15:49 UTC (permalink / raw)
To: Pip Cet; +Cc: 36597
Pip Cet wrote:
> Indeed, that's plenty of small Emacs processes not doing very much.
> It's not the case we ought to be optimizing for, I think, but the
> performance concerns should be taken seriously.
What's a good benchmark for what we should be optimizing for? Ideally something
somewhat-realistic as opposed to a microbenchmark.
It doesn't appear to be as simple as plenty of processes not doing very much.
This benchmark:
cd leim && time make -B ../lisp/leim/ja-dic/ja-dic.el
is dominated by a single CPU-intensive Emacs process and takes about 19 CPU
seconds on my home desktop. The proposed patch slows this benchmark down by
about 0.6%. (I ran the benchmark ten times after a warmup run, and took the
average of the ten user+system times.)
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-14 15:49 ` Paul Eggert
@ 2019-07-14 16:54 ` Pip Cet
2019-07-15 14:39 ` Pip Cet
0 siblings, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-14 16:54 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597
On Sun, Jul 14, 2019 at 3:49 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
> Pip Cet wrote:
> > Indeed, that's plenty of small Emacs processes not doing very much.
> > It's not the case we ought to be optimizing for, I think, but the
> > performance concerns should be taken seriously.
>
> What's a good benchmark for what we should be optimizing for? Ideally something
> somewhat-realistic as opposed to a microbenchmark.
I'd suggest something along the lines of:
perf stat -r 10 --table -e cycles:u,instructions:u,branches:u,branch-misses:u make -B ../lisp/leim/ja-dic/ja-dic.el
With my patch:
61,136,837,192 cycles:u ( +- 0.42% )
42,313,912,525 instructions:u # 0.69 insn per cycle ( +- 0.00% )
12,131,893,779 branches:u ( +- 0.00% )
47,602,747 branch-misses:u # 0.39% of all branches ( +- 1.11% )
without my patch:
61,460,927,899 cycles:u ( +- 0.44% )
42,358,289,131 instructions:u # 0.69 insn per cycle ( +- 0.00% )
12,134,582,441 branches:u ( +- 0.00% )
48,540,232 branch-misses:u # 0.40% of all branches ( +- 1.09% )
A 0.5% improvement.
By comparison,
perf stat -r 100 --table -e cycles:u,instructions:u,branches:u,branch-misses:u ~/git/emacs/src/emacs -Q --batch
With my patch:
80,749,425 cycles:u ( +- 0.81% )
146,770,045 instructions:u # 1.82 insn per cycle ( +- 0.00% )
29,218,226 branches:u ( +- 0.00% )
450,275 branch-misses:u # 1.54% of all branches ( +- 0.11% )
without my patch:
78,896,395 cycles:u ( +- 0.12% )
147,059,777 instructions:u # 1.86 insn per cycle ( +- 0.00% )
29,287,917 branches:u ( +- 0.00% )
450,194 branch-misses:u # 1.54% of all branches ( +- 0.09% )
About a 2% slowdown.
perf stat -r cycles:u,instructions:u,branches:u,missed-branches:u
> is dominated by a single CPU-intensive Emacs process and takes about 19 CPU
> seconds on my home desktop. The proposed patch slows this benchmark down by
> about 0.6%. (I ran the benchmark ten times after a warmup run, and took the
> average of the ten user+system times.)
Hmm. I'd like to know the reason for that, but I suspect it may simply
be thermal throttling. That's the reason I'm running tests in
parallel, though it might be better to compare instruction counts or
scheduled µ-ops rather than cycles...
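As a worked check of the percentages quoted in this message, the relative change can be computed directly from the cycle counts that perf reported. This is only an illustrative sketch; `relative_change_percent` is a hypothetical helper, not part of Emacs or perf.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change of PATCHED against BASELINE, in percent.
   A negative result means the patched build used fewer cycles.  */
static double
relative_change_percent (uint64_t patched, uint64_t baseline)
{
  assert (baseline > 0);
  return 100.0 * ((double) patched - (double) baseline) / (double) baseline;
}
```

Applied to the ja-dic.el run (61,136,837,192 cycles with the patch, 61,460,927,899 without) this gives roughly -0.53%; applied to the `emacs -Q --batch` run (80,749,425 with, 78,896,395 without) it gives roughly +2.35%. Note that the ja-dic.el difference is of the same order as the run-to-run variation perf reports for those cycle counts (about 0.4%).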
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-14 16:54 ` Pip Cet
@ 2019-07-15 14:39 ` Pip Cet
2019-07-19 7:23 ` Pip Cet
0 siblings, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-15 14:39 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597
On Sun, Jul 14, 2019 at 4:54 PM Pip Cet <pipcet@gmail.com> wrote:
> > What's a good benchmark for what we should be optimizing for? Ideally something
> > somewhat-realistic as opposed to a microbenchmark.
Here are the things I've tried so far (building a full histogram of
actual clock cycles per run in all cases):
1. ja-dic.el: my patch is slightly faster (on the order of 0.5%)
2. emacs -Q --batch: my patch is slightly slower (on the order of 2%)
3. emacs -Q --eval "(run-with-timer 1 nil #'kill-emacs)": my patch is
very slightly faster (on the order of 0.1%)
Test 3 was run using a dedicated Xvnc server; all tests were run in
parallel with and without the patch.
The main advantage of my patch appears to be a reduction in pdumper
image size, which somehow leads to the performance improvement. I
haven't benchmarked a hypothetical patch which reduces the pdumper
image size but continues rehashing lazily.
But I noticed that my patch may affect hashes more than it should,
because it makes the thawed hash have the same size as the number of
hash entries in it. That seems not to hurt performance...
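The freeze/thaw idea and the sizing side effect mentioned above can be sketched with a toy table in C. This is a simplified model under stated assumptions, not the code from the patch: `toy_table`, `toy_freeze`, `toy_thaw`, `toy_lookup` and the address-based hash are all hypothetical. Freezing drops everything that depends on key addresses; thawing reallocates the index and rebuilds the collision chains eagerly, sizing the index to the number of entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_table
{
  size_t count;       /* number of live entries, stored densely */
  const void **keys;  /* keys[0..count-1], compared by identity */
  size_t nbuckets;    /* length of index; 0 while frozen */
  ptrdiff_t *index;   /* bucket heads, -1 = empty; NULL while frozen */
  ptrdiff_t *next;    /* collision chains, -1 terminates; NULL while frozen */
};

/* Hypothetical identity hash reduced to a bucket number.  */
static size_t
toy_bucket (const struct toy_table *t, const void *key)
{
  return ((uintptr_t) key >> 3) % t->nbuckets;
}

/* Freeze: discard the address-dependent parts, keep only the keys.  */
static void
toy_freeze (struct toy_table *t)
{
  free (t->index);
  free (t->next);
  t->index = NULL;
  t->next = NULL;
  t->nbuckets = 0;
}

/* Thaw: allocate index and next sized to the entry count (at least 1)
   and rebuild every chain.  Returns false if allocation fails.  */
static bool
toy_thaw (struct toy_table *t)
{
  size_t n = t->count ? t->count : 1;
  t->index = malloc (n * sizeof *t->index);
  t->next = malloc (n * sizeof *t->next);
  if (!t->index || !t->next)
    {
      toy_freeze (t);
      return false;
    }
  t->nbuckets = n;
  for (size_t i = 0; i < n; i++)
    t->index[i] = t->next[i] = -1;
  for (size_t i = 0; i < t->count; i++)
    {
      size_t b = toy_bucket (t, t->keys[i]);
      t->next[i] = t->index[b];     /* push entry onto its bucket chain */
      t->index[b] = (ptrdiff_t) i;
    }
  return true;
}

/* Return the slot holding KEY, or -1.  The table must be thawed.  */
static ptrdiff_t
toy_lookup (const struct toy_table *t, const void *key)
{
  assert (t->index != NULL);
  for (ptrdiff_t i = t->index[toy_bucket (t, key)]; i >= 0; i = t->next[i])
    if (t->keys[i] == key)
      return i;
  return -1;
}
```

In this model a thawed table has no spare capacity: `nbuckets` equals `count`, so the first insertion after thawing would need a resize. That mirrors the concern raised above, although whether it matters in practice depends on how often dumped tables are modified.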
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-14 14:39 ` Paul Eggert
2019-07-14 15:01 ` Pip Cet
@ 2019-07-18 5:39 ` Eli Zaretskii
1 sibling, 0 replies; 37+ messages in thread
From: Eli Zaretskii @ 2019-07-18 5:39 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597, pipcet
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sun, 14 Jul 2019 07:39:08 -0700
> Cc: 36597@debbugs.gnu.org
>
> Although I like the simplicity of eager rehashing, I'm not yet sold on the
> performance implications. On my usual (cd lisp && make compile-always)
> benchmark, the patch hurt user+system CPU time performance by 1.5%. Admittedly
> just one benchmark, but still....
>
> Also, must we expose Vpdumper_hash_tables to Lisp? Surely it'd be better to keep
> it private to pdumper.c.
>
> I'll CC: to Daniel to see whether he has any insights about improvements in this
> area.
You didn't CC Daniel, AFAICT, so I did that now.
> PS. I ran that benchmark on my home desktop, an Intel Xeon E3-1225 v2 running
> Ubuntu 18.04.2. To run it, I rebased your patch and also removed the
> no-longer-used PDUMPER_CHECK_REHASHING macro that my GCC complained about
> (wonder why that didn't happen for you?), resulting in the attached patch
> against current master 8ff09154a29a1151afb2902267ca35f89ebda73c.
>
> From 96a36697b2f472f3a6b3138444659babd0d73f32 Mon Sep 17 00:00:00 2001
> From: Pip Cet <pipcet@gmail.com>
> Date: Sun, 14 Jul 2019 07:22:01 -0700
> Subject: [PATCH] Rehash hash tables eagerly after loading a portable dump
>
> * src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
> (hash_rehash_if_needed): Remove. All uses removed.
> (struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
> * src/pdumper.c (thaw_hash_tables): New function.
> (hash_table_thaw): New function.
> (hash_table_freeze): New function.
> (dump_hash_table): Simplify.
> (dump_hash_table_list): New function.
> (hash_table_contents): New function.
> (Fdump_emacs_portable): Handle hash tables by eager rehashing.
> (pdumper_load): Restore hash tables.
> (init_pdumper_once): New function.
> (PDUMPER_CHECK_REHASHING): Remove.
> ---
> src/bytecode.c | 1 -
> src/composite.c | 1 -
> src/emacs.c | 1 +
> src/fns.c | 54 ++++----------
> src/lisp.h | 19 +----
> src/minibuf.c | 3 -
> src/pdumper.c | 188 ++++++++++++++++++++++++------------------------
> src/pdumper.h | 1 +
> 8 files changed, 108 insertions(+), 160 deletions(-)
>
> diff --git a/src/bytecode.c b/src/bytecode.c
> index 29dff44f00..9c72429e0c 100644
> --- a/src/bytecode.c
> +++ b/src/bytecode.c
> @@ -1402,7 +1402,6 @@ #define DEFINE(name, value) LABEL (name) ,
> Lisp_Object v1 = POP;
> ptrdiff_t i;
> struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
> - hash_rehash_if_needed (h);
>
> /* h->count is a faster approximation for HASH_TABLE_SIZE (h)
> here. */
> diff --git a/src/composite.c b/src/composite.c
> index 183062de46..49a285cff0 100644
> --- a/src/composite.c
> +++ b/src/composite.c
> @@ -654,7 +654,6 @@ gstring_lookup_cache (Lisp_Object header)
> composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
> {
> struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
> - hash_rehash_if_needed (h);
> Lisp_Object header = LGSTRING_HEADER (gstring);
> EMACS_UINT hash = h->test.hashfn (&h->test, header);
> if (len < 0)
> diff --git a/src/emacs.c b/src/emacs.c
> index ad661a081b..855b2c6715 100644
> --- a/src/emacs.c
> +++ b/src/emacs.c
> @@ -1560,6 +1560,7 @@ main (int argc, char **argv)
> if (!initialized)
> {
> init_alloc_once ();
> + init_pdumper_once ();
> init_obarray_once ();
> init_eval_once ();
> init_charset_once ();
> diff --git a/src/fns.c b/src/fns.c
> index 0497588689..b6134a314c 100644
> --- a/src/fns.c
> +++ b/src/fns.c
> @@ -4241,43 +4241,24 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
> {
> ptrdiff_t size = HASH_TABLE_SIZE (h);
>
> - /* These structures may have been purecopied and shared
> - (bug#36447). */
> - h->next = Fcopy_sequence (h->next);
> - h->index = Fcopy_sequence (h->index);
> - h->hash = Fcopy_sequence (h->hash);
> -
> /* Recompute the actual hash codes for each entry in the table.
> Order is still invalid. */
> for (ptrdiff_t i = 0; i < size; ++i)
> - if (!NILP (HASH_HASH (h, i)))
> - {
> - Lisp_Object key = HASH_KEY (h, i);
> - EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
> - set_hash_hash_slot (h, i, make_fixnum (hash_code));
> - }
> -
> - /* Reset the index so that any slot we don't fill below is marked
> - invalid. */
> - Ffillarray (h->index, make_fixnum (-1));
> + {
> + Lisp_Object key = HASH_KEY (h, i);
> + EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
> + set_hash_hash_slot (h, i, make_fixnum (hash_code));
> + }
>
> /* Rebuild the collision chains. */
> for (ptrdiff_t i = 0; i < size; ++i)
> - if (!NILP (HASH_HASH (h, i)))
> - {
> - EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
> - ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
> - set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
> - set_hash_index_slot (h, start_of_bucket, i);
> - eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
> - }
> -
> - /* Finally, mark the hash table as having a valid hash order.
> - Do this last so that if we're interrupted, we retry on next
> - access. */
> - eassert (h->count < 0);
> - h->count = -h->count;
> - eassert (!hash_rehash_needed_p (h));
> + {
> + EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
> + ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
> + set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
> + set_hash_index_slot (h, start_of_bucket, i);
> + eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
> + }
> }
>
> /* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
> @@ -4290,8 +4271,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
> EMACS_UINT hash_code;
> ptrdiff_t start_of_bucket, i;
>
> - hash_rehash_if_needed (h);
> -
> hash_code = h->test.hashfn (&h->test, key);
> eassert ((hash_code & ~INTMASK) == 0);
> if (hash)
> @@ -4320,8 +4299,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
> {
> ptrdiff_t start_of_bucket, i;
>
> - hash_rehash_if_needed (h);
> -
> eassert ((hash & ~INTMASK) == 0);
>
> /* Increment count after resizing because resizing may fail. */
> @@ -4355,8 +4332,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
> ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
> ptrdiff_t prev = -1;
>
> - hash_rehash_if_needed (h);
> -
> for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
> 0 <= i;
> i = HASH_NEXT (h, i))
> @@ -4434,9 +4409,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
> for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
> {
> /* Follow collision chain, removing entries that don't survive
> - this garbage collection. It's okay if hash_rehash_needed_p
> - (h) is true, since we're operating entirely on the cached
> - hash values. */
> + this garbage collection. */
> ptrdiff_t prev = -1;
> ptrdiff_t next;
> for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
> @@ -4881,7 +4854,6 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
> (Lisp_Object table)
> {
> struct Lisp_Hash_Table *h = check_hash_table (table);
> - hash_rehash_if_needed (h);
> return make_fixnum (h->count);
> }
>
> diff --git a/src/lisp.h b/src/lisp.h
> index 13014c82dc..d0e5c43c41 100644
> --- a/src/lisp.h
> +++ b/src/lisp.h
> @@ -2245,11 +2245,7 @@ #define DEFSYM(sym, name) /* empty */
>
> struct Lisp_Hash_Table
> {
> - /* Change pdumper.c if you change the fields here.
> -
> - IMPORTANT!!!!!!!
> -
> - Call hash_rehash_if_needed() before accessing. */
> + /* Change pdumper.c if you change the fields here. */
>
> /* This is for Lisp; the hash table code does not refer to it. */
> union vectorlike_header header;
> @@ -2363,19 +2359,6 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
>
> void hash_table_rehash (struct Lisp_Hash_Table *h);
>
> -INLINE bool
> -hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
> -{
> - return h->count < 0;
> -}
> -
> -INLINE void
> -hash_rehash_if_needed (struct Lisp_Hash_Table *h)
> -{
> - if (hash_rehash_needed_p (h))
> - hash_table_rehash (h);
> -}
> -
> /* Default size for hash tables if not specified. */
>
> enum DEFAULT_HASH_SIZE { DEFAULT_HASH_SIZE = 65 };
> diff --git a/src/minibuf.c b/src/minibuf.c
> index d9a6e15b05..e923ce2a43 100644
> --- a/src/minibuf.c
> +++ b/src/minibuf.c
> @@ -1203,9 +1203,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
> bucket = AREF (collection, idx);
> }
>
> - if (HASH_TABLE_P (collection))
> - hash_rehash_if_needed (XHASH_TABLE (collection));
> -
> while (1)
> {
> /* Get the next element of the alist, obarray, or hash-table. */
> diff --git a/src/pdumper.c b/src/pdumper.c
> index 03c00bf27b..d35d296d32 100644
> --- a/src/pdumper.c
> +++ b/src/pdumper.c
> @@ -107,17 +107,6 @@ #define VM_MS_WINDOWS 2
>
> #define DANGEROUS 0
>
> -/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
> - check, for each hash table it dumps, that the hash table means the
> - same thing after rehashing. */
> -#ifndef PDUMPER_CHECK_REHASHING
> -# if ENABLE_CHECKING
> -# define PDUMPER_CHECK_REHASHING 1
> -# else
> -# define PDUMPER_CHECK_REHASHING 0
> -# endif
> -#endif
> -
> /* We require an architecture in which all pointers are the same size
> and have the same layout, where pointers are either 32 or 64 bits
> long, and where bytes have eight bits --- that is, a
> @@ -393,6 +382,8 @@ dump_fingerprint (char const *label,
> The start of the cold region is always aligned on a page
> boundary. */
> dump_off cold_start;
> +
> + dump_off hash_list;
> };
>
> /* Double-ended singly linked list. */
> @@ -550,6 +541,8 @@ dump_fingerprint (char const *label,
> heap objects. */
> Lisp_Object bignum_data;
>
> + Lisp_Object hash_tables;
> +
> unsigned number_hot_relocations;
> unsigned number_discardable_relocations;
> };
> @@ -2622,68 +2615,64 @@ dump_vectorlike_generic (struct dump_context *ctx,
> return offset;
> }
>
> -/* Determine whether the hash table's hash order is stable
> - across dump and load. If it is, we don't have to trigger
> - a rehash on access. */
> -static bool
> -dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
> +/* Return a list of (KEY . VALUE) pairs in the given hash table. */
> +static Lisp_Object
> +hash_table_contents (struct Lisp_Hash_Table *h)
> {
> - bool is_eql = hash->test.hashfn == hashfn_eql;
> - bool is_equal = hash->test.hashfn == hashfn_equal;
> - ptrdiff_t size = HASH_TABLE_SIZE (hash);
> - for (ptrdiff_t i = 0; i < size; ++i)
> - if (!NILP (HASH_HASH (hash, i)))
> + Lisp_Object contents = Qnil;
> + /* Make sure key_and_value ends up in the same order, charset.c
> + relies on it by expecting hash table indices to stay constant
> + across the dump. */
> + for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
> + if (!NILP (HASH_HASH (h, i)))
> {
> - Lisp_Object key = HASH_KEY (hash, i);
> - bool key_stable = (dump_builtin_symbol_p (key)
> - || FIXNUMP (key)
> - || (is_equal && STRINGP (key))
> - || ((is_equal || is_eql) && FLOATP (key)));
> - if (!key_stable)
> - return false;
> + dump_push (&contents, HASH_VALUE (h, i));
> + dump_push (&contents, HASH_KEY (h, i));
> }
> + return CALLN (Fapply, Qvector, contents);
> +}
>
> - return true;
> +static dump_off
> +dump_hash_table_list (struct dump_context *ctx)
> +{
> + if (CONSP (ctx->hash_tables))
> + return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
> + else
> + return 0;
> }
>
> -/* Return a list of (KEY . VALUE) pairs in the given hash table. */
> -static Lisp_Object
> -hash_table_contents (Lisp_Object table)
> +static void
> +hash_table_freeze (struct Lisp_Hash_Table *h)
> {
> - Lisp_Object contents = Qnil;
> - struct Lisp_Hash_Table *h = XHASH_TABLE (table);
> - for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
> - if (!NILP (HASH_HASH (h, i)))
> - dump_push (&contents, Fcons (HASH_KEY (h, i), HASH_VALUE (h, i)));
> - return Fnreverse (contents);
> + h->key_and_value = hash_table_contents (h);
> + ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
> + h->count = nkeys;
> + if (nkeys == 0)
> + nkeys = 1;
> + h->index = h->next = h->hash = make_fixnum (nkeys);
> }
>
> -/* Copy the given hash table, rehash it, and make sure that we can
> - look up all the values in the original. */
> static void
> -check_hash_table_rehash (Lisp_Object table_orig)
> -{
> - hash_rehash_if_needed (XHASH_TABLE (table_orig));
> - Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
> - eassert (XHASH_TABLE (table_rehashed)->count >= 0);
> - XHASH_TABLE (table_rehashed)->count *= -1;
> - eassert (XHASH_TABLE (table_rehashed)->count <= 0);
> - hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
> - eassert (XHASH_TABLE (table_rehashed)->count >= 0);
> - Lisp_Object expected_contents = hash_table_contents (table_orig);
> - while (!NILP (expected_contents))
> +hash_table_thaw (struct Lisp_Hash_Table *h)
> +{
> + Lisp_Object count = h->index;
> + h->index = Fmake_vector (h->index, make_fixnum (-1));
> + h->hash = Fmake_vector (h->hash, Qnil);
> + h->next = Fmake_vector (h->next, make_fixnum (-1));
> + Lisp_Object key_and_value = h->key_and_value;
> + h->next_free = -1;
> + if (XFIXNAT (count) <= 1)
> {
> - Lisp_Object key_value_pair = dump_pop (&expected_contents);
> - Lisp_Object key = XCAR (key_value_pair);
> - Lisp_Object expected_value = XCDR (key_value_pair);
> - Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
> - Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
> - eassert (EQ (expected_value, found_value));
> - Fremhash (key, table_rehashed);
> + h->key_and_value = Fmake_vector (make_fixnum (2 * XFIXNAT (count)), Qnil);
> + ptrdiff_t i = 0;
> + while (i < ASIZE (key_and_value))
> + {
> + ASET (h->key_and_value, i, AREF (key_and_value, i));
> + i++;
> + }
> }
>
> - eassert (EQ (Fhash_table_count (table_rehashed),
> - make_fixnum (0)));
> + hash_table_rehash (h);
> }
>
> static dump_off
> @@ -2695,45 +2684,11 @@ dump_hash_table (struct dump_context *ctx,
> # error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
> #endif
> const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
> - bool is_stable = dump_hash_table_stable_p (hash_in);
> - /* If the hash table is likely to be modified in memory (either
> - because we need to rehash, and thus toggle hash->count, or
> - because we need to assemble a list of weak tables) punt the hash
> - table to the end of the dump, where we can lump all such hash
> - tables together. */
> - if (!(is_stable || !NILP (hash_in->weak))
> - && ctx->flags.defer_hash_tables)
> - {
> - if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
> - {
> - eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
> - || offset == DUMP_OBJECT_NOT_SEEN);
> - /* We still want to dump the actual keys and values now. */
> - dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
> - /* We'll get to the rest later. */
> - offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
> - dump_remember_object (ctx, object, offset);
> - dump_push (&ctx->deferred_hash_tables, object);
> - }
> - return offset;
> - }
> -
> - if (PDUMPER_CHECK_REHASHING)
> - check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
> -
> struct Lisp_Hash_Table hash_munged = *hash_in;
> struct Lisp_Hash_Table *hash = &hash_munged;
>
> - /* Remember to rehash this hash table on first access. After a
> - dump reload, the hash table values will have changed, so we'll
> - need to rebuild the index.
> -
> - TODO: for EQ and EQL hash tables, it should be possible to rehash
> - here using the preferred load address of the dump, eliminating
> - the need to rehash-on-access if we can load the dump where we
> - want. */
> - if (hash->count > 0 && !is_stable)
> - hash->count = -hash->count;
> + hash_table_freeze (hash);
> + dump_push (&ctx->hash_tables, object);
>
> START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
> dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
> @@ -4140,6 +4095,19 @@ DEFUN ("dump-emacs-portable",
> || !NILP (ctx->deferred_hash_tables)
> || !NILP (ctx->deferred_symbols));
>
> + ctx->header.hash_list = ctx->offset;
> + dump_hash_table_list (ctx);
> +
> + do
> + {
> + dump_drain_deferred_hash_tables (ctx);
> + dump_drain_deferred_symbols (ctx);
> + dump_drain_normal_queue (ctx);
> + }
> + while (!dump_queue_empty_p (&ctx->dump_queue)
> + || !NILP (ctx->deferred_hash_tables)
> + || !NILP (ctx->deferred_symbols));
> +
> dump_sort_copied_objects (ctx);
>
> /* While we copy built-in symbols into the Emacs image, these
> @@ -5431,6 +5399,13 @@ pdumper_load (const char *dump_filename)
> for (int i = 0; i < ARRAYELTS (sections); ++i)
> dump_mmap_reset (&sections[i]);
>
> + if (header->hash_list)
> + {
> + struct Lisp_Vector *hash_tables =
> + ((struct Lisp_Vector *)(dump_base + header->hash_list));
> + XSETVECTOR (Vpdumper_hash_tables, hash_tables);
> + }
> +
> /* Run the functions Emacs registered for doing post-dump-load
> initialization. */
> for (int i = 0; i < nr_dump_hooks; ++i)
> @@ -5502,10 +5477,31 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
>
> \f
>
> +static void thaw_hash_tables (void)
> +{
> + Lisp_Object hash_tables = Vpdumper_hash_tables;
> + ptrdiff_t i = 0;
> + while (i < ASIZE (hash_tables))
> + {
> + hash_table_thaw (XHASH_TABLE (AREF (hash_tables, i)));
> + i++;
> + }
> + Vpdumper_hash_tables = zero_vector;
> +}
> +
> +void
> +init_pdumper_once (void)
> +{
> + Vpdumper_hash_tables = zero_vector;
> + pdumper_do_now_and_after_load (thaw_hash_tables);
> +}
> +
> void
> syms_of_pdumper (void)
> {
> #ifdef HAVE_PDUMPER
> + DEFVAR_LISP ("pdumper-hash-tables", Vpdumper_hash_tables,
> + doc: /* A list of hash tables that need to be thawed after loading the pdump. */);
> defsubr (&Sdump_emacs_portable);
> defsubr (&Sdump_emacs_portable__sort_predicate);
> defsubr (&Sdump_emacs_portable__sort_predicate_copied);
> diff --git a/src/pdumper.h b/src/pdumper.h
> index ab2f426c1e..cfea06d33d 100644
> --- a/src/pdumper.h
> +++ b/src/pdumper.h
> @@ -248,6 +248,7 @@ pdumper_clear_marks (void)
> file was loaded. */
> extern void pdumper_record_wd (const char *);
>
> +void init_pdumper_once (void);
> void syms_of_pdumper (void);
>
> INLINE_HEADER_END
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-15 14:39 ` Pip Cet
@ 2019-07-19 7:23 ` Pip Cet
2019-07-19 7:46 ` Eli Zaretskii
0 siblings, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-19 7:23 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597
[-- Attachment #1: Type: text/plain, Size: 380 bytes --]
On Mon, Jul 15, 2019 at 2:39 PM Pip Cet <pipcet@gmail.com> wrote:
> But I noticed that my patch may affect hashes more than it should,
> because it makes the thawed hash have the same size as the number of
> hash entries in it. That seems not to hurt performance...
But it should be fixed, anyway. The attached patch has no unexpected
performance benefits, as far as I can tell.
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-dump.patch --]
[-- Type: text/x-patch, Size: 18044 bytes --]
From 21ef5ded3741429837e903eb9ade88d593bc0d8f Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Fri, 19 Jul 2019 07:12:42 +0000
Subject: [PATCH] Rehash hash tables eagerly after loading a dump.
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 65 +++++------------
src/lisp.h | 19 +----
src/minibuf.c | 3 -
src/pdumper.c | 182 ++++++++++++++++++++++--------------------------
src/pdumper.h | 1 +
8 files changed, 103 insertions(+), 170 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 29dff44f00..9c72429e0c 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1402,7 +1402,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index 183062de46..49a285cff0 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -654,7 +654,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
EMACS_UINT hash = h->test.hashfn (&h->test, header);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index ad661a081b..855b2c6715 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1560,6 +1560,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index 0497588689..6f86d7f314 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4233,51 +4233,27 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
/* Recompute the hashes (and hence also the "next" pointers).
Normally there's never a need to recompute hashes.
- This is done only on first-access to a hash-table loaded from
- the "pdump", because the object's addresses may have changed, thus
+ This is done only on first access to a hash-table loaded from
+ the "pdump", because the objects' addresses may have changed, thus
affecting their hash. */
void
hash_table_rehash (struct Lisp_Hash_Table *h)
{
- ptrdiff_t size = HASH_TABLE_SIZE (h);
-
- /* These structures may have been purecopied and shared
- (bug#36447). */
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
- h->hash = Fcopy_sequence (h->hash);
-
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- Lisp_Object key = HASH_KEY (h, i);
- EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
- set_hash_hash_slot (h, i, make_fixnum (hash_code));
- }
-
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
-
- /* Rebuild the collision chains. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
+ for (ptrdiff_t i = 0; i < h->count; ++i)
+ {
+ Lisp_Object key = HASH_KEY (h, i);
+ EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
+ ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
+ set_hash_hash_slot (h, i, make_fixnum (hash_code));
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
+ }
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (h->count < 0);
- h->count = -h->count;
- eassert (!hash_rehash_needed_p (h));
+ for (ptrdiff_t i = h->count; i < ASIZE (h->next) - 1; i++)
+ set_hash_next_slot (h, i, i + 1);
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4290,8 +4266,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
EMACS_UINT hash_code;
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
hash_code = h->test.hashfn (&h->test, key);
eassert ((hash_code & ~INTMASK) == 0);
if (hash)
@@ -4320,8 +4294,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
eassert ((hash & ~INTMASK) == 0);
/* Increment count after resizing because resizing may fail. */
@@ -4355,8 +4327,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4434,9 +4404,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
@@ -4478,7 +4446,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
set_hash_hash_slot (h, i, Qnil);
eassert (h->count != 0);
- h->count += h->count > 0 ? -1 : 1;
+ h->count--;
}
else
{
@@ -4881,7 +4849,6 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- hash_rehash_if_needed (h);
return make_fixnum (h->count);
}
diff --git a/src/lisp.h b/src/lisp.h
index 13014c82dc..d0e5c43c41 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2245,11 +2245,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2363,19 +2359,6 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
void hash_table_rehash (struct Lisp_Hash_Table *h);
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return h->count < 0;
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
-
/* Default size for hash tables if not specified. */
enum DEFAULT_HASH_SIZE { DEFAULT_HASH_SIZE = 65 };
diff --git a/src/minibuf.c b/src/minibuf.c
index d9a6e15b05..e923ce2a43 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1203,9 +1203,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 03c00bf27b..e1be696748 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -107,17 +107,6 @@ #define VM_MS_WINDOWS 2
#define DANGEROUS 0
-/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
- check, for each hash table it dumps, that the hash table means the
- same thing after rehashing. */
-#ifndef PDUMPER_CHECK_REHASHING
-# if ENABLE_CHECKING
-# define PDUMPER_CHECK_REHASHING 1
-# else
-# define PDUMPER_CHECK_REHASHING 0
-# endif
-#endif
-
/* We require an architecture in which all pointers are the same size
and have the same layout, where pointers are either 32 or 64 bits
long, and where bytes have eight bits --- that is, a
@@ -393,6 +382,8 @@ dump_fingerprint (char const *label,
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -550,6 +541,8 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2622,68 +2615,58 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a vector of KEY, VALUE pairs in the given hash table H. The
+ first H->count pairs are valid, the rest is left as nil. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (hash, i)))
+ Lisp_Object contents = Qnil;
+ /* Make sure key_and_value ends up in the same order, charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h) - h->count; i++)
+ {
+ dump_push (&contents, Qnil);
+ dump_push (&contents, Qnil);
+ }
+
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
{
- Lisp_Object key = HASH_KEY (hash, i);
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal && STRINGP (key))
- || ((is_equal || is_eql) && FLOATP (key)));
- if (!key_stable)
- return false;
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
}
- return true;
+ return CALLN (Fapply, Qvector, contents);
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- if (!NILP (HASH_HASH (h, i)))
- dump_push (&contents, Fcons (HASH_KEY (h, i), HASH_VALUE (h, i)));
- return Fnreverse (contents);
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- XHASH_TABLE (table_rehashed)->count *= -1;
- eassert (XHASH_TABLE (table_rehashed)->count <= 0);
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
- {
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
- }
+hash_table_freeze (struct Lisp_Hash_Table *h)
+{
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->key_and_value = hash_table_contents (h);
+ h->next_free = (nkeys == h->count ? -1 : h->count);
+ h->index = Flength (h->index);
+ h->next = h->hash = make_fixnum (nkeys);
+}
+
+static void
+hash_table_thaw (struct Lisp_Hash_Table *h)
+{
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+ hash_table_rehash (h);
}
static dump_off
@@ -2695,45 +2678,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- hash->count = -hash->count;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4140,6 +4089,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5290,6 +5252,9 @@ dump_do_all_emacs_relocations (const struct dump_header *const header,
NUMBER_DUMP_SECTIONS,
};
+/* Pointer to a stack variable to avoid having to staticpro it. */
+static Lisp_Object *pdumper_hashes = &zero_vector;
+
/* Load a dump from DUMP_FILENAME. Return an error code.
N.B. We run very early in initialization, so we can't use lisp,
@@ -5431,6 +5396,15 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ Lisp_Object hashes = zero_vector;
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (hashes, hash_tables);
+ }
+
+ pdumper_hashes = &hashes;
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5501,6 +5475,18 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
#endif /* HAVE_PDUMPER */
\f
+static void thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = *pdumper_hashes;
+ for (ptrdiff_t i = 0; i < ASIZE (hash_tables); i++)
+ hash_table_thaw (XHASH_TABLE (AREF (hash_tables, i)));
+}
+
+void
+init_pdumper_once (void)
+{
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
void
syms_of_pdumper (void)
diff --git a/src/pdumper.h b/src/pdumper.h
index ab2f426c1e..cfea06d33d 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -248,6 +248,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.20.1
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-19 7:23 ` Pip Cet
@ 2019-07-19 7:46 ` Eli Zaretskii
2019-07-20 12:38 ` Pip Cet
0 siblings, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2019-07-19 7:46 UTC (permalink / raw)
To: Pip Cet; +Cc: eggert, 36597
> From: Pip Cet <pipcet@gmail.com>
> Date: Fri, 19 Jul 2019 07:23:50 +0000
> Cc: 36597@debbugs.gnu.org
>
> But it should be fixed, anyway. The attached patch has no unexpected
> performance benefits, as far as I can tell.
Thanks.
> +static void thaw_hash_tables (void)
> +{
This is not our style of defining a function. The name should begin
on a new line.
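For reference, a minimal illustration of that convention: the return type goes on its own line so that the function name starts in column zero. The helper below, count_nonzero, is a made-up example and not anything in Emacs.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.  The return type sits
   on its own line; the function name begins the next line.  */
static size_t
count_nonzero (const int *v, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (v[i] != 0)
      count++;
  return count;
}
```

Among other things, this layout lets a search for "^count_nonzero" find the definition rather than the callers.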
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-19 7:46 ` Eli Zaretskii
@ 2019-07-20 12:38 ` Pip Cet
2019-07-21 3:18 ` Paul Eggert
0 siblings, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-20 12:38 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: eggert, 36597
[-- Attachment #1: Type: text/plain, Size: 493 bytes --]
On Fri, Jul 19, 2019 at 7:46 AM Eli Zaretskii <eliz@gnu.org> wrote:
> > +static void thaw_hash_tables (void)
> > +{
>
> This is not our style of defining a function. The name should begin
> on a new line.
Thanks for pointing that out, revised patch attached.
I'm currently playing around with redefining hash tables not to have
internal freelists. That makes the hash table code a lot simpler
overall, but some of that simplicity would be lost trying to support
lazy hash table rehashing.
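To make the eager approach concrete, here is a toy, self-contained sketch (not Emacs code) of rebuilding the bucket index and the collision chains in a single pass over densely packed keys, which is roughly the shape hash_table_rehash takes in the attached patch. The names struct toy_table, toy_hash, toy_rehash and toy_lookup are invented for this sketch, keys are plain ints rather than Lisp objects, and the fixed sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_SLOTS = 8, TOY_BUCKETS = 5 };

/* Toy stand-in for struct Lisp_Hash_Table; all names are hypothetical.  */
struct toy_table
{
  ptrdiff_t count;               /* Live entries, packed at 0..count-1.  */
  ptrdiff_t next_free;           /* Head of the free list, or -1 if full.  */
  int key[TOY_SLOTS];            /* Keys; only the first COUNT are valid.  */
  unsigned hash[TOY_SLOTS];      /* Cached hash code per entry.  */
  ptrdiff_t next[TOY_SLOTS];     /* Collision chain / free list links.  */
  ptrdiff_t index[TOY_BUCKETS];  /* Bucket heads, -1 when empty.  */
};

/* Arbitrary hash function for the sketch.  */
static unsigned
toy_hash (int key)
{
  return (unsigned) key * 2654435761u;
}

/* Rebuild hash codes, bucket heads and chains from the packed keys.  */
static void
toy_rehash (struct toy_table *t)
{
  for (ptrdiff_t b = 0; b < TOY_BUCKETS; b++)
    t->index[b] = -1;

  for (ptrdiff_t i = 0; i < t->count; i++)
    {
      unsigned h = toy_hash (t->key[i]);
      ptrdiff_t bucket = h % TOY_BUCKETS;
      t->hash[i] = h;
      t->next[i] = t->index[bucket];   /* Push onto the bucket's chain.  */
      t->index[bucket] = i;
      assert (t->next[i] != i);        /* Stop loops.  */
    }

  /* Thread the unused slots into a free list.  */
  for (ptrdiff_t i = t->count; i < TOY_SLOTS; i++)
    t->next[i] = i + 1 < TOY_SLOTS ? i + 1 : -1;
  t->next_free = t->count < TOY_SLOTS ? t->count : -1;
}

/* Return the slot holding KEY, or -1 if absent.  */
static ptrdiff_t
toy_lookup (const struct toy_table *t, int key)
{
  unsigned h = toy_hash (key);
  for (ptrdiff_t i = t->index[h % TOY_BUCKETS]; 0 <= i; i = t->next[i])
    if (t->hash[i] == h && t->key[i] == key)
      return i;
  return -1;
}
```

Once toy_rehash has run (for example right after loading), toy_lookup needs no "is a rehash pending?" check, which is the simplification being argued for.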
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-dump.patch --]
[-- Type: text/x-patch, Size: 18045 bytes --]
From dd7743b50c58a5c9bb25b3ed05245a19b0dd19d7 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Fri, 19 Jul 2019 07:12:42 +0000
Subject: [PATCH] Rehash hash tables eagerly after loading a dump.
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 65 +++++------------
src/lisp.h | 19 +----
src/minibuf.c | 3 -
src/pdumper.c | 183 ++++++++++++++++++++++--------------------------
src/pdumper.h | 1 +
8 files changed, 104 insertions(+), 170 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 29dff44f00..9c72429e0c 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1402,7 +1402,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index 183062de46..49a285cff0 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -654,7 +654,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
EMACS_UINT hash = h->test.hashfn (&h->test, header);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index ad661a081b..855b2c6715 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1560,6 +1560,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index 0497588689..6f86d7f314 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4233,51 +4233,27 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
/* Recompute the hashes (and hence also the "next" pointers).
Normally there's never a need to recompute hashes.
- This is done only on first-access to a hash-table loaded from
- the "pdump", because the object's addresses may have changed, thus
+ This is done only on first access to a hash-table loaded from
+ the "pdump", because the objects' addresses may have changed, thus
affecting their hash. */
void
hash_table_rehash (struct Lisp_Hash_Table *h)
{
- ptrdiff_t size = HASH_TABLE_SIZE (h);
-
- /* These structures may have been purecopied and shared
- (bug#36447). */
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
- h->hash = Fcopy_sequence (h->hash);
-
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- Lisp_Object key = HASH_KEY (h, i);
- EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
- set_hash_hash_slot (h, i, make_fixnum (hash_code));
- }
-
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
-
- /* Rebuild the collision chains. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
+ for (ptrdiff_t i = 0; i < h->count; ++i)
+ {
+ Lisp_Object key = HASH_KEY (h, i);
+ EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
+ ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
+ set_hash_hash_slot (h, i, make_fixnum (hash_code));
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
+ }
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (h->count < 0);
- h->count = -h->count;
- eassert (!hash_rehash_needed_p (h));
+ for (ptrdiff_t i = h->count; i < ASIZE (h->next) - 1; i++)
+ set_hash_next_slot (h, i, i + 1);
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4290,8 +4266,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
EMACS_UINT hash_code;
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
hash_code = h->test.hashfn (&h->test, key);
eassert ((hash_code & ~INTMASK) == 0);
if (hash)
@@ -4320,8 +4294,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
eassert ((hash & ~INTMASK) == 0);
/* Increment count after resizing because resizing may fail. */
@@ -4355,8 +4327,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4434,9 +4404,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
@@ -4478,7 +4446,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
set_hash_hash_slot (h, i, Qnil);
eassert (h->count != 0);
- h->count += h->count > 0 ? -1 : 1;
+ h->count--;
}
else
{
@@ -4881,7 +4849,6 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- hash_rehash_if_needed (h);
return make_fixnum (h->count);
}
diff --git a/src/lisp.h b/src/lisp.h
index 13014c82dc..d0e5c43c41 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2245,11 +2245,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2363,19 +2359,6 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
void hash_table_rehash (struct Lisp_Hash_Table *h);
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return h->count < 0;
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
-
/* Default size for hash tables if not specified. */
enum DEFAULT_HASH_SIZE { DEFAULT_HASH_SIZE = 65 };
diff --git a/src/minibuf.c b/src/minibuf.c
index d9a6e15b05..e923ce2a43 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1203,9 +1203,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 03c00bf27b..697ff9767a 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -107,17 +107,6 @@ #define VM_MS_WINDOWS 2
#define DANGEROUS 0
-/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
- check, for each hash table it dumps, that the hash table means the
- same thing after rehashing. */
-#ifndef PDUMPER_CHECK_REHASHING
-# if ENABLE_CHECKING
-# define PDUMPER_CHECK_REHASHING 1
-# else
-# define PDUMPER_CHECK_REHASHING 0
-# endif
-#endif
-
/* We require an architecture in which all pointers are the same size
and have the same layout, where pointers are either 32 or 64 bits
long, and where bytes have eight bits --- that is, a
@@ -393,6 +382,8 @@ dump_fingerprint (char const *label,
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -550,6 +541,8 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2622,68 +2615,58 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a vector of KEY, VALUE pairs in the given hash table H. The
+ first H->count pairs are valid, the rest is left as nil. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (hash, i)))
+ Lisp_Object contents = Qnil;
+ /* Make sure key_and_value ends up in the same order, charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h) - h->count; i++)
+ {
+ dump_push (&contents, Qnil);
+ dump_push (&contents, Qnil);
+ }
+
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
{
- Lisp_Object key = HASH_KEY (hash, i);
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal && STRINGP (key))
- || ((is_equal || is_eql) && FLOATP (key)));
- if (!key_stable)
- return false;
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
}
- return true;
+ return CALLN (Fapply, Qvector, contents);
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- if (!NILP (HASH_HASH (h, i)))
- dump_push (&contents, Fcons (HASH_KEY (h, i), HASH_VALUE (h, i)));
- return Fnreverse (contents);
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- XHASH_TABLE (table_rehashed)->count *= -1;
- eassert (XHASH_TABLE (table_rehashed)->count <= 0);
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
- {
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
- }
+hash_table_freeze (struct Lisp_Hash_Table *h)
+{
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->key_and_value = hash_table_contents (h);
+ h->next_free = (nkeys == h->count ? -1 : h->count);
+ h->index = Flength (h->index);
+ h->next = h->hash = make_fixnum (nkeys);
+}
+
+static void
+hash_table_thaw (struct Lisp_Hash_Table *h)
+{
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+ hash_table_rehash (h);
}
static dump_off
@@ -2695,45 +2678,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- hash->count = -hash->count;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4140,6 +4089,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5290,6 +5252,9 @@ dump_do_all_emacs_relocations (const struct dump_header *const header,
NUMBER_DUMP_SECTIONS,
};
+/* Pointer to a stack variable to avoid having to staticpro it. */
+static Lisp_Object *pdumper_hashes = &zero_vector;
+
/* Load a dump from DUMP_FILENAME. Return an error code.
N.B. We run very early in initialization, so we can't use lisp,
@@ -5431,6 +5396,15 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ Lisp_Object hashes = zero_vector;
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (hashes, hash_tables);
+ }
+
+ pdumper_hashes = &hashes;
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5501,6 +5475,19 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
#endif /* HAVE_PDUMPER */
\f
+static void
+thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = *pdumper_hashes;
+ for (ptrdiff_t i = 0; i < ASIZE (hash_tables); i++)
+ hash_table_thaw (XHASH_TABLE (AREF (hash_tables, i)));
+}
+
+void
+init_pdumper_once (void)
+{
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
void
syms_of_pdumper (void)
diff --git a/src/pdumper.h b/src/pdumper.h
index ab2f426c1e..cfea06d33d 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -248,6 +248,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.22.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-20 12:38 ` Pip Cet
@ 2019-07-21 3:18 ` Paul Eggert
2019-07-21 5:34 ` Pip Cet
0 siblings, 1 reply; 37+ messages in thread
From: Paul Eggert @ 2019-07-21 3:18 UTC (permalink / raw)
To: Pip Cet, Eli Zaretskii; +Cc: 36597
[-- Attachment #1: Type: text/plain, Size: 699 bytes --]
Pip Cet wrote:
> I'm currently playing around with redefining hash tables not to have
> internal freelists. That makes the hash table code a lot simpler
> overall, but some of that simplicity would be lost trying to support
> lazy hash table rehashing.
While looking into this I discovered unlikely bugs in Emacs's hash table code
and GC that can make Emacs dump core, along with some other unlikely hash-table
bugs that can cause Emacs to report memory exhaustion when there should be
plenty of memory. I installed the attached patches to fix these problems and to
refactor to make this code easier to understand (at least for me :-). These
patches will probably affect performance analysis.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-hash-table-overallocation-etc.patch --]
[-- Type: text/x-patch; name="0001-Fix-hash-table-overallocation-etc.patch", Size: 5716 bytes --]
From b0908a0fe6dc4f878b05a8b26ed3ff0c702e26c7 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 20 Jul 2019 19:40:02 -0700
Subject: [PATCH 1/6] Fix hash table overallocation etc.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/fns.c (set_hash_key_and_value, set_hash_next)
(set_hash_hash, set_hash_index): Remove. All uses removed.
(maybe_resize_hash_table): Don’t update h->next until it’s
known that all the allocations succeeded, to avoid trashing
the hash table if memory is exhausted. Don’t overallocate the
other vectors. Don’t output growth message if the hash table
didn’t actually grow due to allocation failure. Assume C99
decls after statements.
---
src/fns.c | 87 +++++++++++++++++++------------------------------------
1 file changed, 29 insertions(+), 58 deletions(-)
diff --git a/src/fns.c b/src/fns.c
index 0497588689..4c99d974bd 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -3803,37 +3803,17 @@ CHECK_HASH_TABLE (Lisp_Object x)
CHECK_TYPE (HASH_TABLE_P (x), Qhash_table_p, x);
}
-static void
-set_hash_key_and_value (struct Lisp_Hash_Table *h, Lisp_Object key_and_value)
-{
- h->key_and_value = key_and_value;
-}
-static void
-set_hash_next (struct Lisp_Hash_Table *h, Lisp_Object next)
-{
- h->next = next;
-}
static void
set_hash_next_slot (struct Lisp_Hash_Table *h, ptrdiff_t idx, ptrdiff_t val)
{
gc_aset (h->next, idx, make_fixnum (val));
}
static void
-set_hash_hash (struct Lisp_Hash_Table *h, Lisp_Object hash)
-{
- h->hash = hash;
-}
-static void
set_hash_hash_slot (struct Lisp_Hash_Table *h, ptrdiff_t idx, Lisp_Object val)
{
gc_aset (h->hash, idx, val);
}
static void
-set_hash_index (struct Lisp_Hash_Table *h, Lisp_Object index)
-{
- h->index = index;
-}
-static void
set_hash_index_slot (struct Lisp_Hash_Table *h, ptrdiff_t idx, ptrdiff_t val)
{
gc_aset (h->index, idx, make_fixnum (val));
@@ -4159,10 +4139,8 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
if (h->next_free < 0)
{
ptrdiff_t old_size = HASH_TABLE_SIZE (h);
- EMACS_INT new_size, index_size, nsize;
- ptrdiff_t i;
+ EMACS_INT new_size;
double rehash_size = h->rehash_size;
- double index_float;
if (rehash_size < 0)
new_size = old_size - rehash_size;
@@ -4177,50 +4155,38 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
if (new_size <= old_size)
new_size = old_size + 1;
double threshold = h->rehash_threshold;
- index_float = new_size / threshold;
- index_size = (index_float < INDEX_SIZE_BOUND + 1
- ? next_almost_prime (index_float)
- : INDEX_SIZE_BOUND + 1);
- nsize = max (index_size, 2 * new_size);
- if (INDEX_SIZE_BOUND < nsize)
+ double index_float = new_size / threshold;
+ EMACS_INT index_size = (index_float < INDEX_SIZE_BOUND + 1
+ ? next_almost_prime (index_float)
+ : INDEX_SIZE_BOUND + 1);
+ if (INDEX_SIZE_BOUND < max (index_size, 2 * new_size))
error ("Hash table too large to resize");
-#ifdef ENABLE_CHECKING
- if (HASH_TABLE_P (Vpurify_flag)
- && XHASH_TABLE (Vpurify_flag) == h)
- message ("Growing hash table to: %"pI"d", new_size);
-#endif
-
- set_hash_key_and_value (h, larger_vector (h->key_and_value,
- 2 * (new_size - old_size), -1));
- set_hash_hash (h, larger_vector (h->hash, new_size - old_size, -1));
- set_hash_index (h, make_vector (index_size, make_fixnum (-1)));
- set_hash_next (h, larger_vecalloc (h->next, new_size - old_size, -1));
+ /* Allocate all the new vectors before updating *H, to
+ avoid problems if memory is exhausted. larger_vecalloc
+ finishes computing the size of the replacement vectors. */
+ Lisp_Object next = larger_vecalloc (h->next, new_size - old_size, -1);
+ ptrdiff_t next_size = ASIZE (next);
+ Lisp_Object key_and_value
+ = larger_vector (h->key_and_value, 2 * (next_size - old_size),
+ 2 * next_size);
+ Lisp_Object hash = larger_vector (h->hash, next_size - old_size,
+ next_size);
+ h->index = make_vector (index_size, make_fixnum (-1));
+ h->key_and_value = key_and_value;
+ h->hash = hash;
+ h->next = next;
/* Update the free list. Do it so that new entries are added at
the end of the free list. This makes some operations like
maphash faster. */
- for (i = old_size; i < new_size - 1; ++i)
+ for (ptrdiff_t i = old_size; i < next_size - 1; i++)
set_hash_next_slot (h, i, i + 1);
- set_hash_next_slot (h, i, -1);
-
- if (h->next_free < 0)
- h->next_free = old_size;
- else
- {
- ptrdiff_t last = h->next_free;
- while (true)
- {
- ptrdiff_t next = HASH_NEXT (h, last);
- if (next < 0)
- break;
- last = next;
- }
- set_hash_next_slot (h, last, old_size);
- }
+ set_hash_next_slot (h, next_size - 1, -1);
+ h->next_free = old_size;
/* Rehash. */
- for (i = 0; i < old_size; ++i)
+ for (ptrdiff_t i = 0; i < old_size; i++)
if (!NILP (HASH_HASH (h, i)))
{
EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
@@ -4228,6 +4194,11 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
set_hash_index_slot (h, start_of_bucket, i);
}
+
+#ifdef ENABLE_CHECKING
+ if (HASH_TABLE_P (Vpurify_flag) && XHASH_TABLE (Vpurify_flag) == h)
+ message ("Growing hash table to: %"pD"d", new_size);
+#endif
}
}
--
2.17.1
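
The resize fix above boils down to "allocate every replacement buffer
first, then update the table". A minimal sketch of that ordering follows,
assuming a made-up two-array table: toy_table and toy_table_grow are
hypothetical names, and plain malloc stands in for the Lisp vector
allocators, so this shows the pattern rather than the fns.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-array table; it stands in for the several vectors
   (next, hash, key_and_value) that back a Lisp hash table.  */
struct toy_table
{
  size_t size;
  int *keys;
  int *next;
};

/* Grow T to NEW_SIZE slots.  Return 1 on success and 0 if an
   allocation fails; on failure T is left exactly as it was.  */
static int
toy_table_grow (struct toy_table *t, size_t new_size)
{
  assert (t->size < new_size);

  /* Allocate every replacement buffer first...  */
  int *keys = malloc (new_size * sizeof *keys);
  int *next = malloc (new_size * sizeof *next);
  if (!keys || !next)
    {
      free (keys);
      free (next);
      return 0;
    }

  if (t->size != 0)
    {
      memcpy (keys, t->keys, t->size * sizeof *keys);
      memcpy (next, t->next, t->size * sizeof *next);
    }

  /* Chain the new slots into a free list terminated by -1.  */
  for (size_t i = t->size; i < new_size; i++)
    {
      keys[i] = 0;
      next[i] = i + 1 < new_size ? (int) (i + 1) : -1;
    }

  /* ...and only then update *T, so no field ever refers to a buffer
     whose sibling failed to allocate.  */
  free (t->keys);
  free (t->next);
  t->keys = keys;
  t->next = next;
  t->size = new_size;
  return 1;
}
```

If either malloc fails, the caller still holds a consistent table of the
old size, which is the property the patch restores for
maybe_resize_hash_table.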
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Rename-pure-to-purecopy.patch --]
[-- Type: text/x-patch; name="0002-Rename-pure-to-purecopy.patch", Size: 5390 bytes --]
From df5024dbaef5e1f7e39a2a8268523f9fc1af3118 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 20 Jul 2019 19:40:03 -0700
Subject: [PATCH 2/6] =?UTF-8?q?Rename=20=E2=80=98pure=E2=80=99=20to=20?=
=?UTF-8?q?=E2=80=98purecopy=E2=80=99?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/lisp.h (struct Lisp_Hash_Table): Rename ‘pure’ member to
‘purecopy’, as the old name was quite confusing (it did not
mean the hash table was pure). All uses changed.
---
src/alloc.c | 6 +++---
src/fns.c | 10 +++++-----
src/lisp.h | 2 +-
src/pdumper.c | 2 +-
src/print.c | 4 ++--
5 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/src/alloc.c b/src/alloc.c
index 7a0611dd3e..8649d4e0f4 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -5329,7 +5329,7 @@ make_pure_vector (ptrdiff_t len)
purecopy_hash_table (struct Lisp_Hash_Table *table)
{
eassert (NILP (table->weak));
- eassert (table->pure);
+ eassert (table->purecopy);
struct Lisp_Hash_Table *pure = pure_alloc (sizeof *pure, Lisp_Vectorlike);
struct hash_table_test pure_test = table->test;
@@ -5346,7 +5346,7 @@ purecopy_hash_table (struct Lisp_Hash_Table *table)
pure->index = purecopy (table->index);
pure->count = table->count;
pure->next_free = table->next_free;
- pure->pure = table->pure;
+ pure->purecopy = table->purecopy;
pure->rehash_threshold = table->rehash_threshold;
pure->rehash_size = table->rehash_size;
pure->key_and_value = purecopy (table->key_and_value);
@@ -5410,7 +5410,7 @@ purecopy (Lisp_Object obj)
/* Do not purify hash tables which haven't been defined with
:purecopy as non-nil or are weak - they aren't guaranteed to
not change. */
- if (!NILP (table->weak) || !table->pure)
+ if (!NILP (table->weak) || !table->purecopy)
{
/* Instead, add the hash table to the list of pinned objects,
so that it will be marked during GC. */
diff --git a/src/fns.c b/src/fns.c
index 4c99d974bd..d4f6842f27 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4055,7 +4055,7 @@ #define INDEX_SIZE_BOUND \
Lisp_Object
make_hash_table (struct hash_table_test test, EMACS_INT size,
float rehash_size, float rehash_threshold,
- Lisp_Object weak, bool pure)
+ Lisp_Object weak, bool purecopy)
{
struct Lisp_Hash_Table *h;
Lisp_Object table;
@@ -4094,7 +4094,7 @@ make_hash_table (struct hash_table_test test, EMACS_INT size,
h->next = make_vector (size, make_fixnum (-1));
h->index = make_vector (index_size, make_fixnum (-1));
h->next_weak = NULL;
- h->pure = pure;
+ h->purecopy = purecopy;
/* Set up the free list. */
for (i = 0; i < size - 1; ++i)
@@ -4748,7 +4748,7 @@ DEFUN ("make-hash-table", Fmake_hash_table, Smake_hash_table, 0, MANY, 0,
(ptrdiff_t nargs, Lisp_Object *args)
{
Lisp_Object test, weak;
- bool pure;
+ bool purecopy;
struct hash_table_test testdesc;
ptrdiff_t i;
USE_SAFE_ALLOCA;
@@ -4784,7 +4784,7 @@ DEFUN ("make-hash-table", Fmake_hash_table, Smake_hash_table, 0, MANY, 0,
/* See if there's a `:purecopy PURECOPY' argument. */
i = get_key_arg (QCpurecopy, nargs, args, used);
- pure = i && !NILP (args[i]);
+ purecopy = i && !NILP (args[i]);
/* See if there's a `:size SIZE' argument. */
i = get_key_arg (QCsize, nargs, args, used);
Lisp_Object size_arg = i ? args[i] : Qnil;
@@ -4835,7 +4835,7 @@ DEFUN ("make-hash-table", Fmake_hash_table, Smake_hash_table, 0, MANY, 0,
SAFE_FREE ();
return make_hash_table (testdesc, size, rehash_size, rehash_threshold, weak,
- pure);
+ purecopy);
}
diff --git a/src/lisp.h b/src/lisp.h
index 13014c82dc..8f60963eb7 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2287,7 +2287,7 @@ #define DEFSYM(sym, name) /* empty */
/* True if the table can be purecopied. The table cannot be
changed afterwards. */
- bool pure;
+ bool purecopy;
/* Resize hash table when number of entries / table size is >= this
ratio. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 03c00bf27b..206a196890 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2741,7 +2741,7 @@ dump_hash_table (struct dump_context *ctx,
them as close to the hash table as possible. */
DUMP_FIELD_COPY (out, hash, count);
DUMP_FIELD_COPY (out, hash, next_free);
- DUMP_FIELD_COPY (out, hash, pure);
+ DUMP_FIELD_COPY (out, hash, purecopy);
DUMP_FIELD_COPY (out, hash, rehash_threshold);
DUMP_FIELD_COPY (out, hash, rehash_size);
dump_field_lv (ctx, out, hash, &hash->key_and_value, WEIGHT_STRONG);
diff --git a/src/print.c b/src/print.c
index 6623244c59..cb34090514 100644
--- a/src/print.c
+++ b/src/print.c
@@ -1575,10 +1575,10 @@ print_vectorlike (Lisp_Object obj, Lisp_Object printcharfun, bool escapeflag,
print_object (Fhash_table_rehash_threshold (obj),
printcharfun, escapeflag);
- if (h->pure)
+ if (h->purecopy)
{
print_c_string (" purecopy ", printcharfun);
- print_object (h->pure ? Qt : Qnil, printcharfun, escapeflag);
+ print_object (h->purecopy ? Qt : Qnil, printcharfun, escapeflag);
}
print_c_string (" data ", printcharfun);
--
2.17.1
[-- Attachment #4: 0003-Simplify-maybe_gc-implementation.patch --]
[-- Type: text/x-patch, Size: 7183 bytes --]
From 26de2d42d0460c5b193456950a568cb04a29dc00 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 20 Jul 2019 19:40:03 -0700
Subject: [PATCH 3/6] Simplify maybe_gc implementation
* src/alloc.c (consing_until_gc): New variable, replacing the
combination of consing_since_gc and gc_relative_threshold.
All uses changed.
(byte_ct): Move decl here from lisp.h.
(memory_full_cons_threshold): Now an enum constant.
(free_cons): Check for integer overflow in
statistics calculation.
* src/lisp.h (object_ct): Move decl here from alloc.c.
(OBJECT_CT_MAX): New macro.
(maybe_gc): Simplify accordingly.
---
src/alloc.c | 68 ++++++++++++++++++++++++++---------------------------
src/lisp.h | 12 ++++------
2 files changed, 38 insertions(+), 42 deletions(-)
diff --git a/src/alloc.c b/src/alloc.c
index 8649d4e0f4..9d18fd918b 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -222,13 +222,9 @@ #define GC_DEFAULT_THRESHOLD (100000 * word_size)
/* Global variables. */
struct emacs_globals globals;
-/* Number of bytes of consing done since the last gc. */
+/* maybe_gc collects garbage if this goes negative. */
-byte_ct consing_since_gc;
-
-/* Similar minimum, computed from Vgc_cons_percentage. */
-
-byte_ct gc_relative_threshold;
+object_ct consing_until_gc;
#ifdef HAVE_PDUMPER
/* Number of finalizers run: used to loop over GC until we stop
@@ -240,10 +236,9 @@ #define GC_DEFAULT_THRESHOLD (100000 * word_size)
bool gc_in_progress;
-/* Type of object counts reported by GC. Unlike byte_ct, this can be
- signed, e.g., it is less than 2**31 on a typical 32-bit machine. */
+/* System byte counts reported by GC. */
-typedef intptr_t object_ct;
+typedef uintptr_t byte_ct;
/* Number of live and free conses etc. */
@@ -1373,7 +1368,7 @@ make_interval (void)
MALLOC_UNBLOCK_INPUT;
- consing_since_gc += sizeof (struct interval);
+ consing_until_gc -= sizeof (struct interval);
intervals_consed++;
gcstat.total_free_intervals--;
RESET_INTERVAL (val);
@@ -1745,7 +1740,7 @@ allocate_string (void)
gcstat.total_free_strings--;
gcstat.total_strings++;
++strings_consed;
- consing_since_gc += sizeof *s;
+ consing_until_gc -= sizeof *s;
#ifdef GC_CHECK_STRING_BYTES
if (!noninteractive)
@@ -1865,7 +1860,7 @@ allocate_string_data (struct Lisp_String *s,
old_data->string = NULL;
}
- consing_since_gc += needed;
+ consing_until_gc -= needed;
}
@@ -2471,7 +2466,7 @@ make_float (double float_value)
XFLOAT_INIT (val, float_value);
eassert (!XFLOAT_MARKED_P (XFLOAT (val)));
- consing_since_gc += sizeof (struct Lisp_Float);
+ consing_until_gc -= sizeof (struct Lisp_Float);
floats_consed++;
gcstat.total_free_floats--;
return val;
@@ -2521,7 +2516,7 @@ #define XUNMARK_CONS(fptr) \
/* Minimum number of bytes of consing since GC before next GC,
when memory is full. */
-byte_ct const memory_full_cons_threshold = sizeof (struct cons_block);
+enum { memory_full_cons_threshold = sizeof (struct cons_block) };
/* Current cons_block. */
@@ -2543,7 +2538,8 @@ free_cons (struct Lisp_Cons *ptr)
ptr->u.s.u.chain = cons_free_list;
ptr->u.s.car = dead_object ();
cons_free_list = ptr;
- consing_since_gc -= sizeof *ptr;
+ if (INT_ADD_WRAPV (consing_until_gc, sizeof *ptr, &consing_until_gc))
+ consing_until_gc = OBJECT_CT_MAX;
gcstat.total_free_conses++;
}
@@ -2594,7 +2590,7 @@ DEFUN ("cons", Fcons, Scons, 2, 2, 0,
XSETCAR (val, car);
XSETCDR (val, cdr);
eassert (!XCONS_MARKED_P (XCONS (val)));
- consing_since_gc += sizeof (struct Lisp_Cons);
+ consing_until_gc -= sizeof (struct Lisp_Cons);
gcstat.total_free_conses--;
cons_cells_consed++;
return val;
@@ -3176,7 +3172,7 @@ allocate_vectorlike (ptrdiff_t len)
if (find_suspicious_object_in_range (p, (char *) p + nbytes))
emacs_abort ();
- consing_since_gc += nbytes;
+ consing_until_gc -= nbytes;
vector_cells_consed += len;
MALLOC_UNBLOCK_INPUT;
@@ -3462,7 +3458,7 @@ DEFUN ("make-symbol", Fmake_symbol, Smake_symbol, 1, 1, 0,
MALLOC_UNBLOCK_INPUT;
init_symbol (val, name);
- consing_since_gc += sizeof (struct Lisp_Symbol);
+ consing_until_gc -= sizeof (struct Lisp_Symbol);
symbols_consed++;
gcstat.total_free_symbols--;
return val;
@@ -3862,6 +3858,7 @@ memory_full (size_t nbytes)
if (! enough_free_memory)
{
Vmemory_full = Qt;
+ consing_until_gc = memory_full_cons_threshold;
/* The first time we get here, free the spare memory. */
for (int i = 0; i < ARRAYELTS (spare_memory); i++)
@@ -5802,7 +5799,7 @@ garbage_collect_1 (struct gcstat *gcst)
/* In case user calls debug_print during GC,
don't let that cause a recursive GC. */
- consing_since_gc = 0;
+ consing_until_gc = OBJECT_CT_MAX;
/* Save what's currently displayed in the echo area. Don't do that
if we are GC'ing because we've run out of memory, since
@@ -5913,23 +5910,26 @@ garbage_collect_1 (struct gcstat *gcst)
unblock_input ();
- consing_since_gc = 0;
- if (gc_cons_threshold < GC_DEFAULT_THRESHOLD / 10)
- gc_cons_threshold = GC_DEFAULT_THRESHOLD / 10;
-
- gc_relative_threshold = 0;
- if (FLOATP (Vgc_cons_percentage))
- { /* Set gc_cons_combined_threshold. */
- double tot = total_bytes_of_live_objects ();
-
- tot *= XFLOAT_DATA (Vgc_cons_percentage);
- if (0 < tot)
+ if (!NILP (Vmemory_full))
+ consing_until_gc = memory_full_cons_threshold;
+ else
+ {
+ intptr_t threshold = min (max (GC_DEFAULT_THRESHOLD,
+ gc_cons_threshold >> 3),
+ OBJECT_CT_MAX);
+ if (FLOATP (Vgc_cons_percentage))
{
- if (tot < UINTPTR_MAX)
- gc_relative_threshold = tot;
- else
- gc_relative_threshold = UINTPTR_MAX;
+ double tot = (XFLOAT_DATA (Vgc_cons_percentage)
+ * total_bytes_of_live_objects ());
+ if (threshold < tot)
+ {
+ if (tot < OBJECT_CT_MAX)
+ threshold = tot;
+ else
+ threshold = OBJECT_CT_MAX;
+ }
}
+ consing_until_gc = threshold;
}
if (garbage_collection_messages && NILP (Vmemory_full))
diff --git a/src/lisp.h b/src/lisp.h
index 8f60963eb7..50a61cadd7 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3763,10 +3763,9 @@ #define CONS_TO_INTEGER(cons, type, var) \
extern void garbage_collect (void);
extern const char *pending_malloc_warning;
extern Lisp_Object zero_vector;
-typedef uintptr_t byte_ct; /* System byte counts reported by GC. */
-extern byte_ct consing_since_gc;
-extern byte_ct gc_relative_threshold;
-extern byte_ct const memory_full_cons_threshold;
+typedef intptr_t object_ct; /* Signed type of object counts reported by GC. */
+#define OBJECT_CT_MAX INTPTR_MAX
+extern object_ct consing_until_gc;
#ifdef HAVE_PDUMPER
extern int number_finalizers_run;
#endif
@@ -4993,10 +4992,7 @@ #define FOR_EACH_ALIST_VALUE(head_var, list_var, value_var) \
INLINE void
maybe_gc (void)
{
- if ((consing_since_gc > gc_cons_threshold
- && consing_since_gc > gc_relative_threshold)
- || (!NILP (Vmemory_full)
- && consing_since_gc > memory_full_cons_threshold))
+ if (consing_until_gc < 0)
garbage_collect ();
}
--
2.17.1
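
The idea behind consing_until_gc can be sketched as a single signed
countdown: allocation subtracts from it, freeing adds back with
saturation, and the GC check is one sign test. Every toy_* name below is
hypothetical, and the saturating add is written out by hand in place of
gnulib's INT_ADD_WRAPV, so treat this as an illustration under those
assumptions, not the alloc.c implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for object_ct and OBJECT_CT_MAX.  */
typedef intptr_t toy_ct;
#define TOY_CT_MAX INTPTR_MAX

/* Bytes that may still be allocated before the next collection.  */
static toy_ct toy_until_gc;
static int toy_gc_runs;

/* Pretend to collect, then re-arm the countdown with THRESHOLD.  */
static void
toy_collect (toy_ct threshold)
{
  toy_gc_runs++;
  toy_until_gc = threshold;
}

/* Allocation path: count down.  NBYTES is assumed to be small and
   nonnegative, so the subtraction cannot overflow in this sketch.  */
static void
toy_note_alloc (toy_ct nbytes)
{
  assert (0 <= nbytes);
  toy_until_gc -= nbytes;
}

/* Free path: count back up, saturating at TOY_CT_MAX rather than
   overflowing.  */
static void
toy_note_free (toy_ct nbytes)
{
  assert (0 <= nbytes);
  if (toy_until_gc > TOY_CT_MAX - nbytes)
    toy_until_gc = TOY_CT_MAX;
  else
    toy_until_gc += nbytes;
}

/* The whole check is a single comparison against zero.  */
static void
toy_maybe_gc (toy_ct threshold)
{
  if (toy_until_gc < 0)
    toy_collect (threshold);
}
```

Folding the thresholds into the value the counter is re-armed with is
what lets the hot-path test shrink to one comparison.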
[-- Attachment #5: 0004-Inhibit-GC-after-inhibit_garbage_collection.patch --]
[-- Type: text/x-patch, Size: 2626 bytes --]
From 5018b663c6c0d31f27fb44630a69d9e0bd73273d Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 20 Jul 2019 19:40:03 -0700
Subject: [PATCH 4/6] Inhibit GC after inhibit_garbage_collection
Without this patch, there are unlikely ways that garbage
collection could occur (sometimes causing undefined behavior)
even when inhibit_garbage_collection is in effect.
* src/alloc.c (garbage_collection_inhibited): New var.
(pure_alloc): Increment it if pure space is exhausted, so that
garbage_collect_1 no longer needs to inspect
pure_bytes_used_before_overflow.
(allow_garbage_collection): New function.
(inhibit_garbage_collection): Increment the new variable rather
than specbinding a user variable.
(garbage_collect_1): Do not garbage collect if the new variable
is set, rather than if pure_bytes_used_before_overflow is set.
---
src/alloc.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/src/alloc.c b/src/alloc.c
index 9d18fd918b..09b3a4ea7e 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -292,6 +292,10 @@ #define PUREBEG (char *) pure
static ptrdiff_t pure_bytes_used_non_lisp;
+/* If positive, garbage collection is inhibited. Otherwise, zero. */
+
+static intptr_t garbage_collection_inhibited;
+
/* If nonzero, this is a warning delivered by malloc and not yet
displayed. */
@@ -5120,6 +5124,10 @@ pure_alloc (size_t size, int type)
pure_bytes_used_before_overflow += pure_bytes_used - size;
pure_bytes_used = 0;
pure_bytes_used_lisp = pure_bytes_used_non_lisp = 0;
+
+ /* Can't GC if pure storage overflowed because we can't determine
+ if something is a pure object or not. */
+ garbage_collection_inhibited++;
goto again;
}
@@ -5486,12 +5494,19 @@ staticpro (Lisp_Object const *varaddress)
/* Temporarily prevent garbage collection. */
+static void
+allow_garbage_collection (void)
+{
+ garbage_collection_inhibited--;
+}
+
ptrdiff_t
inhibit_garbage_collection (void)
{
ptrdiff_t count = SPECPDL_INDEX ();
- specbind (Qgc_cons_threshold, make_fixnum (MOST_POSITIVE_FIXNUM));
+ record_unwind_protect_void (allow_garbage_collection);
+ garbage_collection_inhibited++;
return count;
}
@@ -5779,9 +5794,7 @@ garbage_collect_1 (struct gcstat *gcst)
eassert (weak_hash_tables == NULL);
- /* Can't GC if pure storage overflowed because we can't determine
- if something is a pure object or not. */
- if (pure_bytes_used_before_overflow)
+ if (garbage_collection_inhibited)
return false;
/* Record this function, so it appears on the profiler's backtraces. */
--
2.17.1
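
The counter-based inhibit scheme can be sketched as follows. The toy_*
names are hypothetical, and the sketch leaves out the specpdl unwinding
that the real patch registers with record_unwind_protect_void; it only
shows why a nesting counter keeps collection off until every inhibit has
been released.

```c
#include <assert.h>
#include <stdbool.h>

/* If positive, collection is inhibited.  Hypothetical stand-in for
   garbage_collection_inhibited.  */
static int toy_gc_inhibited;
static int toy_collections;

static void
toy_inhibit_gc (void)
{
  toy_gc_inhibited++;
}

/* In the patch the matching decrement runs from an unwind handler;
   here the caller invokes it by hand.  */
static void
toy_allow_gc (void)
{
  assert (0 < toy_gc_inhibited);
  toy_gc_inhibited--;
}

/* Return true if a collection actually ran.  */
static bool
toy_garbage_collect (void)
{
  if (toy_gc_inhibited)
    return false;
  toy_collections++;
  return true;
}
```

Unlike binding a user-visible threshold to a huge value, the counter
cannot be defeated by an allocation large enough to cross that threshold
anyway, and nested inhibits compose without extra bookkeeping.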
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #6: 0005-Simplify-hashfn-cmpfn-calling-convention.patch --]
[-- Type: text/x-patch; name="0005-Simplify-hashfn-cmpfn-calling-convention.patch", Size: 20741 bytes --]
From b6f194a0fb6dbd1b19aa01f95a955f5b8b23b40e Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 20 Jul 2019 19:40:03 -0700
Subject: [PATCH 5/6] Simplify hashfn/cmpfn calling convention
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/fns.c (cmpfn_eql, cmpfn_equal, cmpfn_user_defined)
(hashfn_eq, hashfn_equal, hashfn_eql, hashfn_user_defined):
* src/profiler.c (cmpfn_profiler, hashfn_profiler):
Use new calling convention where the return value is a fixnum
instead of EMACS_UINT. While we’re at it, put the hash table
at the end, since that’s a bit simpler and generates better
code (at least on the x86-64). All callers changed.
* src/fns.c (hash_lookup): Store fixnum rather than EMACS_UINT.
All callers changed.
(hash_put): Take a fixnum rather than an EMACS_UINT.
All callers changed. Remove unnecessary eassert (XUFIXNUM does it).
* src/lisp.h (struct hash_table_test):
Adjust signatures of cmpfn and hashfn.
---
src/bytecode.c | 8 ++--
src/category.c | 9 ++--
src/charset.c | 2 +-
src/composite.c | 5 +--
src/emacs-module.c | 3 +-
src/fns.c | 103 ++++++++++++++++++++-------------------------
src/image.c | 3 +-
src/json.c | 3 +-
src/lisp.h | 12 +++---
src/lread.c | 3 +-
src/macfont.m | 3 +-
src/profiler.c | 34 +++++++--------
12 files changed, 82 insertions(+), 106 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 29dff44f00..e82de026a8 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1409,16 +1409,16 @@ #define DEFINE(name, value) LABEL (name) ,
if (h->count <= 5)
{ /* Do a linear search if there are not many cases
FIXME: 5 is arbitrarily chosen. */
- Lisp_Object hash_code = h->test.cmpfn
- ? make_fixnum (h->test.hashfn (&h->test, v1)) : Qnil;
+ Lisp_Object hash_code
+ = h->test.cmpfn ? h->test.hashfn (v1, &h->test) : Qnil;
for (i = h->count; 0 <= --i; )
if (EQ (v1, HASH_KEY (h, i))
|| (h->test.cmpfn
&& EQ (hash_code, HASH_HASH (h, i))
- && h->test.cmpfn (&h->test, v1, HASH_KEY (h, i))))
+ && !NILP (h->test.cmpfn (v1, HASH_KEY (h, i),
+ &h->test))))
break;
-
}
else
i = hash_lookup (h, v1, NULL);
diff --git a/src/category.c b/src/category.c
index 132fae9d40..9e460cfc64 100644
--- a/src/category.c
+++ b/src/category.c
@@ -48,18 +48,15 @@ bset_category_table (struct buffer *b, Lisp_Object val)
static Lisp_Object
hash_get_category_set (Lisp_Object table, Lisp_Object category_set)
{
- struct Lisp_Hash_Table *h;
- ptrdiff_t i;
- EMACS_UINT hash;
-
if (NILP (XCHAR_TABLE (table)->extras[1]))
set_char_table_extras
(table, 1,
make_hash_table (hashtest_equal, DEFAULT_HASH_SIZE,
DEFAULT_REHASH_SIZE, DEFAULT_REHASH_THRESHOLD,
Qnil, false));
- h = XHASH_TABLE (XCHAR_TABLE (table)->extras[1]);
- i = hash_lookup (h, category_set, &hash);
+ struct Lisp_Hash_Table *h = XHASH_TABLE (XCHAR_TABLE (table)->extras[1]);
+ Lisp_Object hash;
+ ptrdiff_t i = hash_lookup (h, category_set, &hash);
if (i >= 0)
return HASH_KEY (h, i);
hash_put (h, category_set, Qnil, hash);
diff --git a/src/charset.c b/src/charset.c
index 85535e8bff..8c54381dc4 100644
--- a/src/charset.c
+++ b/src/charset.c
@@ -842,7 +842,7 @@ DEFUN ("define-charset-internal", Fdefine_charset_internal,
/* Charset attr vector. */
Lisp_Object attrs;
Lisp_Object val;
- EMACS_UINT hash_code;
+ Lisp_Object hash_code;
struct Lisp_Hash_Table *hash_table = XHASH_TABLE (Vcharset_hash_table);
int i, j;
struct charset charset;
diff --git a/src/composite.c b/src/composite.c
index 183062de46..c36663f8e9 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -164,11 +164,10 @@ #define MAX_AUTO_COMPOSITION_LOOKBACK 3
get_composition_id (ptrdiff_t charpos, ptrdiff_t bytepos, ptrdiff_t nchars,
Lisp_Object prop, Lisp_Object string)
{
- Lisp_Object id, length, components, key, *key_contents;
+ Lisp_Object id, length, components, key, *key_contents, hash_code;
ptrdiff_t glyph_len;
struct Lisp_Hash_Table *hash_table = XHASH_TABLE (composition_hash_table);
ptrdiff_t hash_index;
- EMACS_UINT hash_code;
enum composition_method method;
struct composition *cmp;
ptrdiff_t i;
@@ -656,7 +655,7 @@ composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
- EMACS_UINT hash = h->test.hashfn (&h->test, header);
+ Lisp_Object hash = h->test.hashfn (header, &h->test);
if (len < 0)
{
ptrdiff_t glyph_len = LGSTRING_GLYPH_LEN (gstring);
diff --git a/src/emacs-module.c b/src/emacs-module.c
index 8c09ea6bb6..4b991a1c74 100644
--- a/src/emacs-module.c
+++ b/src/emacs-module.c
@@ -362,8 +362,7 @@ module_make_global_ref (emacs_env *env, emacs_value ref)
{
MODULE_FUNCTION_BEGIN (NULL);
struct Lisp_Hash_Table *h = XHASH_TABLE (Vmodule_refs_hash);
- Lisp_Object new_obj = value_to_lisp (ref);
- EMACS_UINT hashcode;
+ Lisp_Object new_obj = value_to_lisp (ref), hashcode;
ptrdiff_t i = hash_lookup (h, new_obj, &hashcode);
if (i >= 0)
diff --git a/src/fns.c b/src/fns.c
index d4f6842f27..d9503c491e 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -2373,7 +2373,7 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
case Lisp_Cons: case Lisp_Vectorlike:
{
struct Lisp_Hash_Table *h = XHASH_TABLE (ht);
- EMACS_UINT hash;
+ Lisp_Object hash;
ptrdiff_t i = hash_lookup (h, o1, &hash);
if (i >= 0)
{ /* `o1' was seen already. */
@@ -3934,74 +3934,67 @@ HASH_INDEX (struct Lisp_Hash_Table *h, ptrdiff_t idx)
/* Ignore HT and compare KEY1 and KEY2 using 'eql'.
Value is true if KEY1 and KEY2 are the same. */
-static bool
-cmpfn_eql (struct hash_table_test *ht,
- Lisp_Object key1,
- Lisp_Object key2)
+static Lisp_Object
+cmpfn_eql (Lisp_Object key1, Lisp_Object key2, struct hash_table_test *ht)
{
- return !NILP (Feql (key1, key2));
+ return Feql (key1, key2);
}
/* Ignore HT and compare KEY1 and KEY2 using 'equal'.
Value is true if KEY1 and KEY2 are the same. */
-static bool
-cmpfn_equal (struct hash_table_test *ht,
- Lisp_Object key1,
- Lisp_Object key2)
+static Lisp_Object
+cmpfn_equal (Lisp_Object key1, Lisp_Object key2, struct hash_table_test *ht)
{
- return !NILP (Fequal (key1, key2));
+ return Fequal (key1, key2);
}
/* Given HT, compare KEY1 and KEY2 using HT->user_cmp_function.
Value is true if KEY1 and KEY2 are the same. */
-static bool
-cmpfn_user_defined (struct hash_table_test *ht,
- Lisp_Object key1,
- Lisp_Object key2)
+static Lisp_Object
+cmpfn_user_defined (Lisp_Object key1, Lisp_Object key2,
+ struct hash_table_test *ht)
{
- return !NILP (call2 (ht->user_cmp_function, key1, key2));
+ return call2 (ht->user_cmp_function, key1, key2);
}
-/* Ignore HT and return a hash code for KEY which uses 'eq' to compare keys.
- The hash code is at most INTMASK. */
+/* Ignore HT and return a hash code for KEY which uses 'eq' to compare
+ keys. */
-static EMACS_UINT
-hashfn_eq (struct hash_table_test *ht, Lisp_Object key)
+static Lisp_Object
+hashfn_eq (Lisp_Object key, struct hash_table_test *ht)
{
- return XHASH (key) ^ XTYPE (key);
+ return make_fixnum (XHASH (key) ^ XTYPE (key));
}
/* Ignore HT and return a hash code for KEY which uses 'equal' to compare keys.
The hash code is at most INTMASK. */
-EMACS_UINT
-hashfn_equal (struct hash_table_test *ht, Lisp_Object key)
+Lisp_Object
+hashfn_equal (Lisp_Object key, struct hash_table_test *ht)
{
- return sxhash (key, 0);
+ return make_fixnum (sxhash (key, 0));
}
/* Ignore HT and return a hash code for KEY which uses 'eql' to compare keys.
The hash code is at most INTMASK. */
-EMACS_UINT
-hashfn_eql (struct hash_table_test *ht, Lisp_Object key)
+Lisp_Object
+hashfn_eql (Lisp_Object key, struct hash_table_test *ht)
{
- return ((FLOATP (key) || BIGNUMP (key))
- ? hashfn_equal (ht, key)
- : hashfn_eq (ht, key));
+ return (FLOATP (key) || BIGNUMP (key) ? hashfn_equal : hashfn_eq) (key, ht);
}
/* Given HT, return a hash code for KEY which uses a user-defined
- function to compare keys. The hash code is at most INTMASK. */
+ function to compare keys. */
-static EMACS_UINT
-hashfn_user_defined (struct hash_table_test *ht, Lisp_Object key)
+static Lisp_Object
+hashfn_user_defined (Lisp_Object key, struct hash_table_test *ht)
{
Lisp_Object hash = call1 (ht->user_hash_function, key);
- return hashfn_eq (ht, hash);
+ return hashfn_eq (hash, ht);
}
struct hash_table_test const
@@ -4224,8 +4217,8 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
if (!NILP (HASH_HASH (h, i)))
{
Lisp_Object key = HASH_KEY (h, i);
- EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
- set_hash_hash_slot (h, i, make_fixnum (hash_code));
+ Lisp_Object hash_code = h->test.hashfn (key, &h->test);
+ set_hash_hash_slot (h, i, hash_code);
}
/* Reset the index so that any slot we don't fill below is marked
@@ -4256,25 +4249,23 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
matching KEY, or -1 if not found. */
ptrdiff_t
-hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
+hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object *hash)
{
- EMACS_UINT hash_code;
ptrdiff_t start_of_bucket, i;
hash_rehash_if_needed (h);
- hash_code = h->test.hashfn (&h->test, key);
- eassert ((hash_code & ~INTMASK) == 0);
+ Lisp_Object hash_code = h->test.hashfn (key, &h->test);
if (hash)
*hash = hash_code;
- start_of_bucket = hash_code % ASIZE (h->index);
+ start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
for (i = HASH_INDEX (h, start_of_bucket); 0 <= i; i = HASH_NEXT (h, i))
if (EQ (key, HASH_KEY (h, i))
|| (h->test.cmpfn
- && hash_code == XUFIXNUM (HASH_HASH (h, i))
- && h->test.cmpfn (&h->test, key, HASH_KEY (h, i))))
+ && EQ (hash_code, HASH_HASH (h, i))
+ && !NILP (h->test.cmpfn (key, HASH_KEY (h, i), &h->test))))
break;
return i;
@@ -4287,14 +4278,12 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash)
ptrdiff_t
hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
- EMACS_UINT hash)
+ Lisp_Object hash)
{
ptrdiff_t start_of_bucket, i;
hash_rehash_if_needed (h);
- eassert ((hash & ~INTMASK) == 0);
-
/* Increment count after resizing because resizing may fail. */
maybe_resize_hash_table (h);
h->count++;
@@ -4306,10 +4295,10 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
set_hash_value_slot (h, i, value);
/* Remember its hash code. */
- set_hash_hash_slot (h, i, make_fixnum (hash));
+ set_hash_hash_slot (h, i, hash);
/* Add new entry to its collision chain. */
- start_of_bucket = hash % ASIZE (h->index);
+ start_of_bucket = XUFIXNUM (hash) % ASIZE (h->index);
set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
set_hash_index_slot (h, start_of_bucket, i);
return i;
@@ -4321,9 +4310,8 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
void
hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
{
- EMACS_UINT hash_code = h->test.hashfn (&h->test, key);
- eassert ((hash_code & ~INTMASK) == 0);
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
+ Lisp_Object hash_code = h->test.hashfn (key, &h->test);
+ ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
ptrdiff_t prev = -1;
hash_rehash_if_needed (h);
@@ -4334,8 +4322,8 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
{
if (EQ (key, HASH_KEY (h, i))
|| (h->test.cmpfn
- && hash_code == XUFIXNUM (HASH_HASH (h, i))
- && h->test.cmpfn (&h->test, key, HASH_KEY (h, i))))
+ && EQ (hash_code, HASH_HASH (h, i))
+ && !NILP (h->test.cmpfn (key, HASH_KEY (h, i), &h->test))))
{
/* Take entry out of collision chain. */
if (prev < 0)
@@ -4685,7 +4673,7 @@ If (eq A B), then (= (sxhash-eq A) (sxhash-eq B)).
Hash codes are not guaranteed to be preserved across Emacs sessions. */)
(Lisp_Object obj)
{
- return make_fixnum (hashfn_eq (NULL, obj));
+ return hashfn_eq (obj, NULL);
}
DEFUN ("sxhash-eql", Fsxhash_eql, Ssxhash_eql, 1, 1, 0,
@@ -4695,7 +4683,7 @@ If (eql A B), then (= (sxhash-eql A) (sxhash-eql B)).
Hash codes are not guaranteed to be preserved across Emacs sessions. */)
(Lisp_Object obj)
{
- return make_fixnum (hashfn_eql (NULL, obj));
+ return hashfn_eql (obj, NULL);
}
DEFUN ("sxhash-equal", Fsxhash_equal, Ssxhash_equal, 1, 1, 0,
@@ -4705,7 +4693,7 @@ If (equal A B), then (= (sxhash-equal A) (sxhash-equal B)).
Hash codes are not guaranteed to be preserved across Emacs sessions. */)
(Lisp_Object obj)
{
- return make_fixnum (hashfn_equal (NULL, obj));
+ return hashfn_equal (obj, NULL);
}
DEFUN ("make-hash-table", Fmake_hash_table, Smake_hash_table, 0, MANY, 0,
@@ -4951,9 +4939,8 @@ DEFUN ("puthash", Fputhash, Sputhash, 3, 3, 0,
struct Lisp_Hash_Table *h = check_hash_table (table);
CHECK_IMPURE (table, h);
- ptrdiff_t i;
- EMACS_UINT hash;
- i = hash_lookup (h, key, &hash);
+ Lisp_Object hash;
+ ptrdiff_t i = hash_lookup (h, key, &hash);
if (i >= 0)
set_hash_value_slot (h, i, value);
else
diff --git a/src/image.c b/src/image.c
index b081d4b912..355c849491 100644
--- a/src/image.c
+++ b/src/image.c
@@ -4606,8 +4606,7 @@ xpm_put_color_table_h (Lisp_Object color_table,
Lisp_Object color)
{
struct Lisp_Hash_Table *table = XHASH_TABLE (color_table);
- EMACS_UINT hash_code;
- Lisp_Object chars = make_unibyte_string (chars_start, chars_len);
+ Lisp_Object chars = make_unibyte_string (chars_start, chars_len), hash_code;
hash_lookup (table, chars, &hash_code);
hash_put (table, chars, color, hash_code);
diff --git a/src/json.c b/src/json.c
index 21c4b946b4..d05f2c54e2 100644
--- a/src/json.c
+++ b/src/json.c
@@ -867,8 +867,7 @@ json_to_lisp (json_t *json, struct json_configuration *conf)
json_t *value;
json_object_foreach (json, key_str, value)
{
- Lisp_Object key = build_string_from_utf8 (key_str);
- EMACS_UINT hash;
+ Lisp_Object key = build_string_from_utf8 (key_str), hash;
ptrdiff_t i = hash_lookup (h, key, &hash);
/* Keys in JSON objects are unique, so the key can't
be present yet. */
diff --git a/src/lisp.h b/src/lisp.h
index 50a61cadd7..e5edb8fd12 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2237,10 +2237,10 @@ #define DEFSYM(sym, name) /* empty */
Lisp_Object user_cmp_function;
/* C function to compare two keys. */
- bool (*cmpfn) (struct hash_table_test *t, Lisp_Object, Lisp_Object);
+ Lisp_Object (*cmpfn) (Lisp_Object, Lisp_Object, struct hash_table_test *t);
/* C function to compute hash code. */
- EMACS_UINT (*hashfn) (struct hash_table_test *t, Lisp_Object);
+ Lisp_Object (*hashfn) (Lisp_Object, struct hash_table_test *t);
};
struct Lisp_Hash_Table
@@ -3591,13 +3591,13 @@ #define CONS_TO_INTEGER(cons, type, var) \
extern char *extract_data_from_object (Lisp_Object, ptrdiff_t *, ptrdiff_t *);
EMACS_UINT hash_string (char const *, ptrdiff_t);
EMACS_UINT sxhash (Lisp_Object, int);
-EMACS_UINT hashfn_eql (struct hash_table_test *ht, Lisp_Object key);
-EMACS_UINT hashfn_equal (struct hash_table_test *ht, Lisp_Object key);
+Lisp_Object hashfn_eql (Lisp_Object, struct hash_table_test *);
+Lisp_Object hashfn_equal (Lisp_Object, struct hash_table_test *);
Lisp_Object make_hash_table (struct hash_table_test, EMACS_INT, float, float,
Lisp_Object, bool);
-ptrdiff_t hash_lookup (struct Lisp_Hash_Table *, Lisp_Object, EMACS_UINT *);
+ptrdiff_t hash_lookup (struct Lisp_Hash_Table *, Lisp_Object, Lisp_Object *);
ptrdiff_t hash_put (struct Lisp_Hash_Table *, Lisp_Object, Lisp_Object,
- EMACS_UINT);
+ Lisp_Object);
void hash_remove_from_table (struct Lisp_Hash_Table *, Lisp_Object);
extern struct hash_table_test const hashtest_eq, hashtest_eql, hashtest_equal;
extern void validate_subarray (Lisp_Object, Lisp_Object, Lisp_Object,
diff --git a/src/lread.c b/src/lread.c
index 3152fcf867..eecb5e141d 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3161,8 +3161,7 @@ read1 (Lisp_Object readcharfun, int *pch, bool first_in_list)
Lisp_Object placeholder = Fcons (Qnil, Qnil);
struct Lisp_Hash_Table *h
= XHASH_TABLE (read_objects_map);
- EMACS_UINT hash;
- Lisp_Object number = make_fixnum (n);
+ Lisp_Object number = make_fixnum (n), hash;
ptrdiff_t i = hash_lookup (h, number, &hash);
if (i >= 0)
diff --git a/src/macfont.m b/src/macfont.m
index 301951f34a..7170e80140 100644
--- a/src/macfont.m
+++ b/src/macfont.m
@@ -986,8 +986,7 @@ static void mac_font_get_glyphs_for_variants (CFDataRef, UTF32Char,
{
struct Lisp_Hash_Table *h;
ptrdiff_t i;
- EMACS_UINT hash;
- Lisp_Object value;
+ Lisp_Object hash, value;
if (!HASH_TABLE_P (macfont_family_cache))
macfont_family_cache = CALLN (Fmake_hash_table, QCtest, Qeq);
diff --git a/src/profiler.c b/src/profiler.c
index 87be30acc3..e9b6a37d06 100644
--- a/src/profiler.c
+++ b/src/profiler.c
@@ -36,11 +36,9 @@ saturated_add (EMACS_INT a, EMACS_INT b)
typedef struct Lisp_Hash_Table log_t;
-static bool cmpfn_profiler (
- struct hash_table_test *, Lisp_Object, Lisp_Object);
-
-static EMACS_UINT hashfn_profiler (
- struct hash_table_test *, Lisp_Object);
+static Lisp_Object cmpfn_profiler (Lisp_Object, Lisp_Object,
+ struct hash_table_test *);
+static Lisp_Object hashfn_profiler (Lisp_Object, struct hash_table_test *);
static const struct hash_table_test hashtest_profiler =
{
@@ -165,7 +163,7 @@ record_backtrace (log_t *log, EMACS_INT count)
careful to avoid memory allocation since we're in a signal
handler, and we optimize the code to try and avoid computing the
hash+lookup twice. See fns.c:Fputhash for reference. */
- EMACS_UINT hash;
+ Lisp_Object hash;
ptrdiff_t j = hash_lookup (log, backtrace, &hash);
if (j >= 0)
{
@@ -529,30 +527,30 @@ DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
return res ? Qt : Qnil;
}
-static bool
-cmpfn_profiler (struct hash_table_test *t,
- Lisp_Object bt1, Lisp_Object bt2)
+static Lisp_Object
+cmpfn_profiler (Lisp_Object bt1, Lisp_Object bt2, struct hash_table_test *t)
{
if (VECTORP (bt1) && VECTORP (bt2))
{
ptrdiff_t l = ASIZE (bt1);
if (l != ASIZE (bt2))
- return false;
+ return Qnil;
for (ptrdiff_t i = 0; i < l; i++)
if (NILP (Ffunction_equal (AREF (bt1, i), AREF (bt2, i))))
- return false;
- return true;
+ return Qnil;
+ return Qt;
}
else
- return EQ (bt1, bt2);
+ return EQ (bt1, bt2) ? Qt : Qnil;
}
-static EMACS_UINT
-hashfn_profiler (struct hash_table_test *ht, Lisp_Object bt)
+static Lisp_Object
+hashfn_profiler (Lisp_Object bt, struct hash_table_test *ht)
{
+ EMACS_UINT hash;
if (VECTORP (bt))
{
- EMACS_UINT hash = 0;
+ hash = 0;
ptrdiff_t l = ASIZE (bt);
for (ptrdiff_t i = 0; i < l; i++)
{
@@ -563,10 +561,10 @@ hashfn_profiler (struct hash_table_test *ht, Lisp_Object bt)
? XHASH (XCDR (XCDR (f))) : XHASH (f));
hash = sxhash_combine (hash, hash1);
}
- return SXHASH_REDUCE (hash);
}
else
- return XHASH (bt);
+ hash = XHASH (bt);
+ return make_fixnum (SXHASH_REDUCE (hash));
}
static void syms_of_profiler_for_pdumper (void);
--
2.17.1
[-- Attachment #7: 0006-Fix-crash-if-user-test-munges-hash-table.patch --]
[-- Type: text/x-patch; name="0006-Fix-crash-if-user-test-munges-hash-table.patch", Size: 15190 bytes --]
From 515afc9c15870cd7bd6b96e2d8b89938116923ac Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 20 Jul 2019 19:40:03 -0700
Subject: [PATCH 6/6] Fix crash if user test munges hash table
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/fns.c (restore_mutability)
(hash_table_user_defined_call): New functions.
(cmpfn_user_defined, hashfn_user_defined): Use them.
(make_hash_table, copy_hash_table):
Mark new hash table as mutable.
(check_mutable_hash_table): New function.
(Fclrhash, Fputhash, Fremhash): Use it instead of CHECK_IMPURE.
* src/lisp.h (struct hash_table_test): User-defined functions
now take pointers to struct Lisp_Hash_Table, not to struct
hash_table_test. All uses changed.
(struct Lisp_Hash_Table): New member ‘mutable’.
* src/pdumper.c (dump_hash_table): Copy it.
* test/src/fns-tests.el (test-hash-function-that-mutates-hash-table):
New test, which tests for the bug.
---
src/alloc.c | 1 +
src/bytecode.c | 5 ++-
src/composite.c | 2 +-
src/fns.c | 74 ++++++++++++++++++++++++++++++++-----------
src/lisp.h | 15 ++++++---
src/pdumper.c | 1 +
src/profiler.c | 8 ++---
test/src/fns-tests.el | 12 +++++++
8 files changed, 87 insertions(+), 31 deletions(-)
diff --git a/src/alloc.c b/src/alloc.c
index 09b3a4ea7e..1718ce0faf 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -5352,6 +5352,7 @@ purecopy_hash_table (struct Lisp_Hash_Table *table)
pure->count = table->count;
pure->next_free = table->next_free;
pure->purecopy = table->purecopy;
+ eassert (!pure->mutable);
pure->rehash_threshold = table->rehash_threshold;
pure->rehash_size = table->rehash_size;
pure->key_and_value = purecopy (table->key_and_value);
diff --git a/src/bytecode.c b/src/bytecode.c
index e82de026a8..d668a9a6a1 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1410,14 +1410,13 @@ #define DEFINE(name, value) LABEL (name) ,
{ /* Do a linear search if there are not many cases
FIXME: 5 is arbitrarily chosen. */
Lisp_Object hash_code
- = h->test.cmpfn ? h->test.hashfn (v1, &h->test) : Qnil;
+ = h->test.cmpfn ? h->test.hashfn (v1, h) : Qnil;
for (i = h->count; 0 <= --i; )
if (EQ (v1, HASH_KEY (h, i))
|| (h->test.cmpfn
&& EQ (hash_code, HASH_HASH (h, i))
- && !NILP (h->test.cmpfn (v1, HASH_KEY (h, i),
- &h->test))))
+ && !NILP (h->test.cmpfn (v1, HASH_KEY (h, i), h))))
break;
}
else
diff --git a/src/composite.c b/src/composite.c
index c36663f8e9..a6606d5fc4 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -655,7 +655,7 @@ composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
- Lisp_Object hash = h->test.hashfn (header, &h->test);
+ Lisp_Object hash = h->test.hashfn (header, h);
if (len < 0)
{
ptrdiff_t glyph_len = LGSTRING_GLYPH_LEN (gstring);
diff --git a/src/fns.c b/src/fns.c
index d9503c491e..5f1ed07a12 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -3931,11 +3931,37 @@ HASH_INDEX (struct Lisp_Hash_Table *h, ptrdiff_t idx)
return XFIXNUM (AREF (h->index, idx));
}
+/* Restore a hash table's mutability after the critical section exits. */
+
+static void
+restore_mutability (void *ptr)
+{
+ struct Lisp_Hash_Table *h = ptr;
+ h->mutable = true;
+}
+
+/* Return the result of calling a user-defined hash or comparison
+ function ARGS[0] with arguments ARGS[1] through ARGS[NARGS - 1].
+ Signal an error if the function attempts to modify H, which
+ otherwise might lead to undefined behavior. */
+
+static Lisp_Object
+hash_table_user_defined_call (ptrdiff_t nargs, Lisp_Object *args,
+ struct Lisp_Hash_Table *h)
+{
+ if (!h->mutable)
+ return Ffuncall (nargs, args);
+ ptrdiff_t count = inhibit_garbage_collection ();
+ record_unwind_protect_ptr (restore_mutability, h);
+ h->mutable = false;
+ return unbind_to (count, Ffuncall (nargs, args));
+}
+
/* Ignore HT and compare KEY1 and KEY2 using 'eql'.
Value is true if KEY1 and KEY2 are the same. */
static Lisp_Object
-cmpfn_eql (Lisp_Object key1, Lisp_Object key2, struct hash_table_test *ht)
+cmpfn_eql (Lisp_Object key1, Lisp_Object key2, struct Lisp_Hash_Table *h)
{
return Feql (key1, key2);
}
@@ -3944,7 +3970,7 @@ cmpfn_eql (Lisp_Object key1, Lisp_Object key2, struct hash_table_test *ht)
Value is true if KEY1 and KEY2 are the same. */
static Lisp_Object
-cmpfn_equal (Lisp_Object key1, Lisp_Object key2, struct hash_table_test *ht)
+cmpfn_equal (Lisp_Object key1, Lisp_Object key2, struct Lisp_Hash_Table *h)
{
return Fequal (key1, key2);
}
@@ -3955,16 +3981,17 @@ cmpfn_equal (Lisp_Object key1, Lisp_Object key2, struct hash_table_test *ht)
static Lisp_Object
cmpfn_user_defined (Lisp_Object key1, Lisp_Object key2,
- struct hash_table_test *ht)
+ struct Lisp_Hash_Table *h)
{
- return call2 (ht->user_cmp_function, key1, key2);
+ Lisp_Object args[] = { h->test.user_cmp_function, key1, key2 };
+ return hash_table_user_defined_call (ARRAYELTS (args), args, h);
}
/* Ignore HT and return a hash code for KEY which uses 'eq' to compare
keys. */
static Lisp_Object
-hashfn_eq (Lisp_Object key, struct hash_table_test *ht)
+hashfn_eq (Lisp_Object key, struct Lisp_Hash_Table *h)
{
return make_fixnum (XHASH (key) ^ XTYPE (key));
}
@@ -3973,7 +4000,7 @@ hashfn_eq (Lisp_Object key, struct hash_table_test *ht)
The hash code is at most INTMASK. */
Lisp_Object
-hashfn_equal (Lisp_Object key, struct hash_table_test *ht)
+hashfn_equal (Lisp_Object key, struct Lisp_Hash_Table *h)
{
return make_fixnum (sxhash (key, 0));
}
@@ -3982,19 +4009,19 @@ hashfn_equal (Lisp_Object key, struct hash_table_test *ht)
The hash code is at most INTMASK. */
Lisp_Object
-hashfn_eql (Lisp_Object key, struct hash_table_test *ht)
+hashfn_eql (Lisp_Object key, struct Lisp_Hash_Table *h)
{
- return (FLOATP (key) || BIGNUMP (key) ? hashfn_equal : hashfn_eq) (key, ht);
+ return (FLOATP (key) || BIGNUMP (key) ? hashfn_equal : hashfn_eq) (key, h);
}
/* Given HT, return a hash code for KEY which uses a user-defined
function to compare keys. */
static Lisp_Object
-hashfn_user_defined (Lisp_Object key, struct hash_table_test *ht)
+hashfn_user_defined (Lisp_Object key, struct Lisp_Hash_Table *h)
{
- Lisp_Object hash = call1 (ht->user_hash_function, key);
- return hashfn_eq (hash, ht);
+ Lisp_Object args[] = { h->test.user_hash_function, key };
+ return hash_table_user_defined_call (ARRAYELTS (args), args, h);
}
struct hash_table_test const
@@ -4088,6 +4115,7 @@ make_hash_table (struct hash_table_test test, EMACS_INT size,
h->index = make_vector (index_size, make_fixnum (-1));
h->next_weak = NULL;
h->purecopy = purecopy;
+ h->mutable = true;
/* Set up the free list. */
for (i = 0; i < size - 1; ++i)
@@ -4113,6 +4141,7 @@ copy_hash_table (struct Lisp_Hash_Table *h1)
h2 = allocate_hash_table ();
*h2 = *h1;
+ h2->mutable = true;
h2->key_and_value = Fcopy_sequence (h1->key_and_value);
h2->hash = Fcopy_sequence (h1->hash);
h2->next = Fcopy_sequence (h1->next);
@@ -4217,7 +4246,7 @@ hash_table_rehash (struct Lisp_Hash_Table *h)
if (!NILP (HASH_HASH (h, i)))
{
Lisp_Object key = HASH_KEY (h, i);
- Lisp_Object hash_code = h->test.hashfn (key, &h->test);
+ Lisp_Object hash_code = h->test.hashfn (key, h);
set_hash_hash_slot (h, i, hash_code);
}
@@ -4255,7 +4284,7 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object *hash)
hash_rehash_if_needed (h);
- Lisp_Object hash_code = h->test.hashfn (key, &h->test);
+ Lisp_Object hash_code = h->test.hashfn (key, h);
if (hash)
*hash = hash_code;
@@ -4265,12 +4294,19 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object *hash)
if (EQ (key, HASH_KEY (h, i))
|| (h->test.cmpfn
&& EQ (hash_code, HASH_HASH (h, i))
- && !NILP (h->test.cmpfn (key, HASH_KEY (h, i), &h->test))))
+ && !NILP (h->test.cmpfn (key, HASH_KEY (h, i), h))))
break;
return i;
}
+static void
+check_mutable_hash_table (Lisp_Object obj, struct Lisp_Hash_Table *h)
+{
+ if (!h->mutable)
+ signal_error ("hash table test modifies table", obj);
+ eassert (!PURE_P (h));
+}
/* Put an entry into hash table H that associates KEY with VALUE.
HASH is a previously computed hash code of KEY.
@@ -4310,7 +4346,7 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
void
hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
{
- Lisp_Object hash_code = h->test.hashfn (key, &h->test);
+ Lisp_Object hash_code = h->test.hashfn (key, h);
ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
ptrdiff_t prev = -1;
@@ -4323,7 +4359,7 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
if (EQ (key, HASH_KEY (h, i))
|| (h->test.cmpfn
&& EQ (hash_code, HASH_HASH (h, i))
- && !NILP (h->test.cmpfn (key, HASH_KEY (h, i), &h->test))))
+ && !NILP (h->test.cmpfn (key, HASH_KEY (h, i), h))))
{
/* Take entry out of collision chain. */
if (prev < 0)
@@ -4912,7 +4948,7 @@ DEFUN ("clrhash", Fclrhash, Sclrhash, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- CHECK_IMPURE (table, h);
+ check_mutable_hash_table (table, h);
hash_clear (h);
/* Be compatible with XEmacs. */
return table;
@@ -4937,7 +4973,7 @@ DEFUN ("puthash", Fputhash, Sputhash, 3, 3, 0,
(Lisp_Object key, Lisp_Object value, Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- CHECK_IMPURE (table, h);
+ check_mutable_hash_table (table, h);
Lisp_Object hash;
ptrdiff_t i = hash_lookup (h, key, &hash);
@@ -4955,7 +4991,7 @@ DEFUN ("remhash", Fremhash, Sremhash, 2, 2, 0,
(Lisp_Object key, Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- CHECK_IMPURE (table, h);
+ check_mutable_hash_table (table, h);
hash_remove_from_table (h, key);
return Qnil;
}
diff --git a/src/lisp.h b/src/lisp.h
index e5edb8fd12..6d101fed90 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2225,6 +2225,8 @@ #define DEFSYM(sym, name) /* empty */
/* The structure of a Lisp hash table. */
+struct Lisp_Hash_Table;
+
struct hash_table_test
{
/* Name of the function used to compare keys. */
@@ -2237,10 +2239,10 @@ #define DEFSYM(sym, name) /* empty */
Lisp_Object user_cmp_function;
/* C function to compare two keys. */
- Lisp_Object (*cmpfn) (Lisp_Object, Lisp_Object, struct hash_table_test *t);
+ Lisp_Object (*cmpfn) (Lisp_Object, Lisp_Object, struct Lisp_Hash_Table *);
/* C function to compute hash code. */
- Lisp_Object (*hashfn) (Lisp_Object, struct hash_table_test *t);
+ Lisp_Object (*hashfn) (Lisp_Object, struct Lisp_Hash_Table *);
};
struct Lisp_Hash_Table
@@ -2289,6 +2291,11 @@ #define DEFSYM(sym, name) /* empty */
changed afterwards. */
bool purecopy;
+ /* True if the table is mutable. Ordinarily tables are mutable, but
+ pure tables are not, and while a table is being mutated it is
+ immutable for recursive attempts to mutate it. */
+ bool mutable;
+
/* Resize hash table when number of entries / table size is >= this
ratio. */
float rehash_threshold;
@@ -3591,8 +3598,8 @@ #define CONS_TO_INTEGER(cons, type, var) \
extern char *extract_data_from_object (Lisp_Object, ptrdiff_t *, ptrdiff_t *);
EMACS_UINT hash_string (char const *, ptrdiff_t);
EMACS_UINT sxhash (Lisp_Object, int);
-Lisp_Object hashfn_eql (Lisp_Object, struct hash_table_test *);
-Lisp_Object hashfn_equal (Lisp_Object, struct hash_table_test *);
+Lisp_Object hashfn_eql (Lisp_Object, struct Lisp_Hash_Table *);
+Lisp_Object hashfn_equal (Lisp_Object, struct Lisp_Hash_Table *);
Lisp_Object make_hash_table (struct hash_table_test, EMACS_INT, float, float,
Lisp_Object, bool);
ptrdiff_t hash_lookup (struct Lisp_Hash_Table *, Lisp_Object, Lisp_Object *);
diff --git a/src/pdumper.c b/src/pdumper.c
index 206a196890..2abac80a37 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2742,6 +2742,7 @@ dump_hash_table (struct dump_context *ctx,
DUMP_FIELD_COPY (out, hash, count);
DUMP_FIELD_COPY (out, hash, next_free);
DUMP_FIELD_COPY (out, hash, purecopy);
+ DUMP_FIELD_COPY (out, hash, mutable);
DUMP_FIELD_COPY (out, hash, rehash_threshold);
DUMP_FIELD_COPY (out, hash, rehash_size);
dump_field_lv (ctx, out, hash, &hash->key_and_value, WEIGHT_STRONG);
diff --git a/src/profiler.c b/src/profiler.c
index e9b6a37d06..ed0e9ddd88 100644
--- a/src/profiler.c
+++ b/src/profiler.c
@@ -37,8 +37,8 @@ saturated_add (EMACS_INT a, EMACS_INT b)
typedef struct Lisp_Hash_Table log_t;
static Lisp_Object cmpfn_profiler (Lisp_Object, Lisp_Object,
- struct hash_table_test *);
-static Lisp_Object hashfn_profiler (Lisp_Object, struct hash_table_test *);
+ struct Lisp_Hash_Table *);
+static Lisp_Object hashfn_profiler (Lisp_Object, struct Lisp_Hash_Table *);
static const struct hash_table_test hashtest_profiler =
{
@@ -528,7 +528,7 @@ DEFUN ("function-equal", Ffunction_equal, Sfunction_equal, 2, 2, 0,
}
static Lisp_Object
-cmpfn_profiler (Lisp_Object bt1, Lisp_Object bt2, struct hash_table_test *t)
+cmpfn_profiler (Lisp_Object bt1, Lisp_Object bt2, struct Lisp_Hash_Table *h)
{
if (VECTORP (bt1) && VECTORP (bt2))
{
@@ -545,7 +545,7 @@ cmpfn_profiler (Lisp_Object bt1, Lisp_Object bt2, struct hash_table_test *t)
}
static Lisp_Object
-hashfn_profiler (Lisp_Object bt, struct hash_table_test *ht)
+hashfn_profiler (Lisp_Object bt, struct Lisp_Hash_Table *h)
{
EMACS_UINT hash;
if (VECTORP (bt))
diff --git a/test/src/fns-tests.el b/test/src/fns-tests.el
index 9d4ae4fdf3..7d56da77cf 100644
--- a/test/src/fns-tests.el
+++ b/test/src/fns-tests.el
@@ -846,4 +846,16 @@ test-proper-list-p
(should (not (proper-list-p (make-bool-vector 0 nil))))
(should (not (proper-list-p (make-symbol "a")))))
+(ert-deftest test-hash-function-that-mutates-hash-table ()
+ (define-hash-table-test 'badeq 'eq 'bad-hash)
+ (let ((h (make-hash-table :test 'badeq :size 1 :rehash-size 1)))
+ (defun bad-hash (k)
+ (if (eq k 100)
+ (clrhash h))
+ (sxhash-eq k))
+ (should-error
+ (dotimes (k 200)
+ (puthash k k h)))
+ (should (= 100 (hash-table-count h)))))
+
(provide 'fns-tests)
--
2.17.1
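
For readers following the thread, the guard that patch 6/6 adds can be
sketched in isolation. This is only an illustration under stated
assumptions, not the Emacs code: struct toy_table, toy_clear,
toy_call_user_hash, bad_hash and toy_fill are hypothetical names, and
a plain boolean return stands in for the Lisp error that
check_mutable_hash_table signals. The real code also restores the
flag through record_unwind_protect_ptr so that a non-local exit
cannot leave the table frozen; the sketch has no non-local exits, so
it restores the flag directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct Lisp_Hash_Table; only the fields
   needed to show the guard are modelled.  */
struct toy_table
{
  size_t count;
  bool mutable;
};

typedef unsigned (*toy_hashfn) (int key, struct toy_table *t);

/* Rough analogue of Fclrhash after the patch: refuse to touch a table
   that is currently marked immutable.  Returns false where the real
   code would signal an error.  */
static bool
toy_clear (struct toy_table *t)
{
  if (!t->mutable)
    return false;
  t->count = 0;
  return true;
}

/* Rough analogue of hash_table_user_defined_call: mark the table
   immutable for the duration of the user-supplied function, then
   restore mutability.  A nested call leaves the flag alone.  */
static unsigned
toy_call_user_hash (toy_hashfn fn, int key, struct toy_table *t)
{
  if (!t->mutable)
    return fn (key, t);
  t->mutable = false;
  unsigned hash = fn (key, t);
  t->mutable = true;
  return hash;
}

/* Records whether the misbehaving hash function managed to clear the
   table.  */
static bool bad_hash_clear_succeeded;

/* A hash function that tries to clear the table it is hashing for,
   loosely modelled on bad-hash in the new ERT test.  */
static unsigned
bad_hash (int key, struct toy_table *t)
{
  if (key == 100)
    bad_hash_clear_succeeded = toy_clear (t);
  return (unsigned) key;
}

/* Insert N keys, hashing each one through the guarded call.  Only the
   entry count is tracked; storage is omitted.  */
static size_t
toy_fill (struct toy_table *t, int n)
{
  for (int k = 0; k < n; k++)
    {
      toy_call_user_hash (bad_hash, k, t);
      t->count++;
    }
  return t->count;
}
```

Unlike the ERT test, where the signaled error aborts the dotimes loop
and leaves 100 entries in the table, the sketch merely rejects the
clrhash attempt and carries on, so all insertions complete.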
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-21 3:18 ` Paul Eggert
@ 2019-07-21 5:34 ` Pip Cet
2019-07-21 6:32 ` Paul Eggert
2019-07-21 6:32 ` Pip Cet
0 siblings, 2 replies; 37+ messages in thread
From: Pip Cet @ 2019-07-21 5:34 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597
On Sun, Jul 21, 2019 at 3:18 AM Paul Eggert <eggert@cs.ucla.edu> wrote:
> Pip Cet wrote:
> > I'm currently playing around with redefining hash tables not to have
> > internal freelists. That makes the hash table code a lot simpler
> > overall, but some of that simplicity would be lost trying to support
> > lazy hash table rehashing.
>
> While looking into this I discovered unlikely bugs in Emacs's hash table code
> and GC that can make Emacs dump core, along with some other unlikely hash-table
> bugs that can cause Emacs to report memory exhaustion when there should be
> plenty of memory. I installed the attached patches to fix these problems and to
> refactor to make this code easier to understand (at least for me :-). These
> patches will probably affect performance analysis.
Well, at least they'll require rebasing, particularly of the
no-internal-freelists patch :-)
While your changes are extensive, I don't see anything in there that
would drastically affect performance or memory footprint. Maybe I'm
missing something, though.
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-21 5:34 ` Pip Cet
@ 2019-07-21 6:32 ` Paul Eggert
2019-07-21 6:32 ` Pip Cet
1 sibling, 0 replies; 37+ messages in thread
From: Paul Eggert @ 2019-07-21 6:32 UTC (permalink / raw)
To: Pip Cet; +Cc: 36597
Pip Cet wrote:
> I don't see anything in there that
> would drastically affect performance or memory footprint.
The drastic change to the memory footprint occurs because the old code
incorrectly computed the length of the hash table's vectors, and over-allocated
them in some cases. The over-allocation factor could get worse with each hash
table resize, following a Fibonacci-like sequence. I think this was a bug I
introduced in 2011-07-21T17:41:20!eggert@cs.ucla.edu.
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-21 5:34 ` Pip Cet
2019-07-21 6:32 ` Paul Eggert
@ 2019-07-21 6:32 ` Pip Cet
2020-08-09 19:27 ` Lars Ingebrigtsen
1 sibling, 1 reply; 37+ messages in thread
From: Pip Cet @ 2019-07-21 6:32 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597
[-- Attachment #1: Type: text/plain, Size: 1243 bytes --]
On Sun, Jul 21, 2019 at 5:34 AM Pip Cet <pipcet@gmail.com> wrote:
> On Sun, Jul 21, 2019 at 3:18 AM Paul Eggert <eggert@cs.ucla.edu> wrote:
> > Pip Cet wrote:
> > > I'm currently playing around with redefining hash tables not to have
> > > internal freelists. That makes the hash table code a lot simpler
> > > overall, but some of that simplicity would be lost trying to support
> > > lazy hash table rehashing.
> >
> > While looking into this I discovered unlikely bugs in Emacs's hash table code
> > and GC that can make Emacs dump core, along with some other unlikely hash-table
> > bugs that can cause Emacs to report memory exhaustion when there should be
> > plenty of memory. I installed the attached patches to fix these problems and to
> > refactor to make this code easier to understand (at least for me :-). These
> > patches will probably affect performance analysis.
>
> Well, at least they'll require rebasing, particularly of the
> no-internal-freelists patch :-)
Rebased patches attached. The performance measurements don't seem to
change significantly.
> While your changes are extensive, I don't see anything in there that
> would drastically affect performance or memory footprint. Maybe I'm
> missing something, though.
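
As a rough illustration of what eager rehashing amounts to, here is a
self-contained miniature. It is a sketch under assumptions, not the
code in the attached patch: struct toy_table, toy_hash, toy_rehash and
toy_lookup are hypothetical names, the table sizes are arbitrary, and
the sketch assumes that the live entries occupy slots 0 through
count - 1. The point is that once the chains are rebuilt a single time
after loading, the lookup path needs no "rehash if needed" check.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_ENTRIES = 8, TOY_BUCKETS = 4 };

/* Hypothetical miniature of a chained hash table with a dense entry
   array.  Slots 0 .. count - 1 are assumed to be in use.  */
struct toy_table
{
  ptrdiff_t count;
  int key[TOY_ENTRIES];
  unsigned hash[TOY_ENTRIES];
  ptrdiff_t next[TOY_ENTRIES];   /* collision chain or free list */
  ptrdiff_t index[TOY_BUCKETS];  /* head of each bucket, or -1 */
  ptrdiff_t next_free;           /* head of the free list, or -1 */
};

/* Arbitrary multiplicative hash, chosen only for the example.  */
static unsigned
toy_hash (int key)
{
  return (unsigned) key * 2654435761u;
}

/* Recompute every hash and rebuild the bucket chains in one pass, then
   thread the unused slots onto the free list.  Intended to be called
   once, right after the table has been loaded.  */
static void
toy_rehash (struct toy_table *t)
{
  for (ptrdiff_t b = 0; b < TOY_BUCKETS; b++)
    t->index[b] = -1;

  for (ptrdiff_t i = 0; i < t->count; i++)
    {
      unsigned h = toy_hash (t->key[i]);
      ptrdiff_t bucket = h % TOY_BUCKETS;
      t->hash[i] = h;
      t->next[i] = t->index[bucket];
      t->index[bucket] = i;
    }

  for (ptrdiff_t i = t->count; i < TOY_ENTRIES - 1; i++)
    t->next[i] = i + 1;
  if (t->count < TOY_ENTRIES)
    t->next[TOY_ENTRIES - 1] = -1;
  t->next_free = t->count < TOY_ENTRIES ? t->count : -1;
}

/* Return the slot holding KEY, or -1 if it is absent.  There is no
   rehash check here: the table is assumed valid after toy_rehash.  */
static ptrdiff_t
toy_lookup (const struct toy_table *t, int key)
{
  ptrdiff_t i = t->index[toy_hash (key) % TOY_BUCKETS];
  while (i >= 0 && t->key[i] != key)
    i = t->next[i];
  return i;
}
```

With lazy rehashing, every accessor such as toy_lookup would instead
have to begin with a check of the kind hash_rehash_if_needed performs;
the eager variant pays the rebuild cost once at load time.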
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-dump.patch --]
[-- Type: text/x-patch, Size: 17980 bytes --]
From b839d0993f4df666ba780fb338d2ac1800599114 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Sun, 21 Jul 2019 06:05:54 +0000
Subject: [PATCH] Rehash hash tables eagerly after loading a dump
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 65 +++++------------
src/lisp.h | 19 +----
src/minibuf.c | 3 -
src/pdumper.c | 183 ++++++++++++++++++++++--------------------------
src/pdumper.h | 1 +
8 files changed, 104 insertions(+), 170 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index d668a9a6a1..485a448b06 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1402,7 +1402,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index a6606d5fc4..5b59818d09 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -653,7 +653,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
Lisp_Object hash = h->test.hashfn (header, h);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index ad661a081b..855b2c6715 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1560,6 +1560,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index d7e123122d..06c006d747 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4223,51 +4223,27 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
/* Recompute the hashes (and hence also the "next" pointers).
Normally there's never a need to recompute hashes.
- This is done only on first-access to a hash-table loaded from
- the "pdump", because the object's addresses may have changed, thus
+ This is done only on first access to a hash-table loaded from
+ the "pdump", because the objects' addresses may have changed, thus
affecting their hash. */
void
hash_table_rehash (struct Lisp_Hash_Table *h)
{
- ptrdiff_t size = HASH_TABLE_SIZE (h);
-
- /* These structures may have been purecopied and shared
- (bug#36447). */
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
- h->hash = Fcopy_sequence (h->hash);
-
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- Lisp_Object key = HASH_KEY (h, i);
- Lisp_Object hash_code = h->test.hashfn (key, h);
- set_hash_hash_slot (h, i, hash_code);
- }
-
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
-
- /* Rebuild the collision chains. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (h, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (HASH_HASH (h, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
+ for (ptrdiff_t i = 0; i < h->count; ++i)
+ {
+ Lisp_Object key = HASH_KEY (h, i);
+ Lisp_Object hash_code = h->test.hashfn (key, h);
+ ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
+ set_hash_hash_slot (h, i, hash_code);
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
+ }
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (h->count < 0);
- h->count = -h->count;
- eassert (!hash_rehash_needed_p (h));
+ for (ptrdiff_t i = h->count; i < ASIZE (h->next) - 1; i++)
+ set_hash_next_slot (h, i, i + 1);
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4279,8 +4255,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object *hash)
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
Lisp_Object hash_code = h->test.hashfn (key, h);
if (hash)
*hash = hash_code;
@@ -4315,8 +4289,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
/* Increment count after resizing because resizing may fail. */
maybe_resize_hash_table (h);
h->count++;
@@ -4347,8 +4319,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4426,9 +4396,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
@@ -4470,7 +4438,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
set_hash_hash_slot (h, i, Qnil);
eassert (h->count != 0);
- h->count += h->count > 0 ? -1 : 1;
+ h->count--;
}
else
{
@@ -4873,7 +4841,6 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- hash_rehash_if_needed (h);
return make_fixnum (h->count);
}
diff --git a/src/lisp.h b/src/lisp.h
index 6d101fed90..ee11df7e34 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2247,11 +2247,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2370,19 +2366,6 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
void hash_table_rehash (struct Lisp_Hash_Table *h);
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return h->count < 0;
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
-
/* Default size for hash tables if not specified. */
enum DEFAULT_HASH_SIZE { DEFAULT_HASH_SIZE = 65 };
diff --git a/src/minibuf.c b/src/minibuf.c
index d9a6e15b05..e923ce2a43 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1203,9 +1203,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 2abac80a37..29b1adcba6 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -107,17 +107,6 @@ #define VM_MS_WINDOWS 2
#define DANGEROUS 0
-/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
- check, for each hash table it dumps, that the hash table means the
- same thing after rehashing. */
-#ifndef PDUMPER_CHECK_REHASHING
-# if ENABLE_CHECKING
-# define PDUMPER_CHECK_REHASHING 1
-# else
-# define PDUMPER_CHECK_REHASHING 0
-# endif
-#endif
-
/* We require an architecture in which all pointers are the same size
and have the same layout, where pointers are either 32 or 64 bits
long, and where bytes have eight bits --- that is, a
@@ -393,6 +382,8 @@ dump_fingerprint (char const *label,
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -550,6 +541,8 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2622,68 +2615,58 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a vector of KEY, VALUE pairs in the given hash table H. The
+ first H->count pairs are valid, the rest is left as nil. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (HASH_HASH (hash, i)))
+ Lisp_Object contents = Qnil;
+ /* Make sure key_and_value ends up in the same order, charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h) - h->count; i++)
+ {
+ dump_push (&contents, Qnil);
+ dump_push (&contents, Qnil);
+ }
+
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
{
- Lisp_Object key = HASH_KEY (hash, i);
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal && STRINGP (key))
- || ((is_equal || is_eql) && FLOATP (key)));
- if (!key_stable)
- return false;
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
}
- return true;
+ return CALLN (Fapply, Qvector, contents);
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- if (!NILP (HASH_HASH (h, i)))
- dump_push (&contents, Fcons (HASH_KEY (h, i), HASH_VALUE (h, i)));
- return Fnreverse (contents);
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- XHASH_TABLE (table_rehashed)->count *= -1;
- eassert (XHASH_TABLE (table_rehashed)->count <= 0);
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (XHASH_TABLE (table_rehashed)->count >= 0);
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
- {
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
- }
+hash_table_freeze (struct Lisp_Hash_Table *h)
+{
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->key_and_value = hash_table_contents (h);
+ h->next_free = (nkeys == h->count ? -1 : h->count);
+ h->index = Flength (h->index);
+ h->next = h->hash = make_fixnum (nkeys);
+}
+
+static void
+hash_table_thaw (struct Lisp_Hash_Table *h)
+{
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+ hash_table_rehash (h);
}
static dump_off
@@ -2695,45 +2678,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- hash->count = -hash->count;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4141,6 +4090,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5291,6 +5253,9 @@ dump_do_all_emacs_relocations (const struct dump_header *const header,
NUMBER_DUMP_SECTIONS,
};
+/* Pointer to a stack variable to avoid having to staticpro it. */
+static Lisp_Object *pdumper_hashes = &zero_vector;
+
/* Load a dump from DUMP_FILENAME. Return an error code.
N.B. We run very early in initialization, so we can't use lisp,
@@ -5432,6 +5397,15 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ Lisp_Object hashes = zero_vector;
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (hashes, hash_tables);
+ }
+
+ pdumper_hashes = &hashes;
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5502,6 +5476,19 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
#endif /* HAVE_PDUMPER */
\f
+static void
+thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = *pdumper_hashes;
+ for (ptrdiff_t i = 0; i < ASIZE (hash_tables); i++)
+ hash_table_thaw (XHASH_TABLE (AREF (hash_tables, i)));
+}
+
+void
+init_pdumper_once (void)
+{
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
void
syms_of_pdumper (void)
diff --git a/src/pdumper.h b/src/pdumper.h
index ab2f426c1e..cfea06d33d 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -248,6 +248,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.22.0
[-- Attachment #3: 0001-snapshot.patch.gz --]
[-- Type: application/gzip, Size: 30954 bytes --]
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2019-07-21 6:32 ` Pip Cet
@ 2020-08-09 19:27 ` Lars Ingebrigtsen
2020-08-10 11:51 ` Pip Cet
From: Lars Ingebrigtsen @ 2020-08-09 19:27 UTC (permalink / raw)
To: Pip Cet; +Cc: Paul Eggert, 36597
Pip Cet <pipcet@gmail.com> writes:
>> Well, at least they'll require rebasing, particularly of the
>> no-internal-freelists patch :-)
>
> Rebased patches attached. The performance measurements don't seem to
> change significantly.
And this was the final message in this thread. I think the general
consensus was that Pip's patches were a good idea... unless they had
any negative performance impact?
So I tried applying the patch now to Emacs 28 to do some benchmarking,
but it didn't apply cleanly, so I gave up.
Is this still something worth pursuing?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-09 19:27 ` Lars Ingebrigtsen
@ 2020-08-10 11:51 ` Pip Cet
2020-08-10 13:04 ` Lars Ingebrigtsen
From: Pip Cet @ 2020-08-10 11:51 UTC (permalink / raw)
To: Lars Ingebrigtsen; +Cc: Paul Eggert, 36597
[-- Attachment #1: Type: text/plain, Size: 2026 bytes --]
On Sun, Aug 9, 2020 at 7:28 PM Lars Ingebrigtsen <larsi@gnus.org> wrote:
> >> Well, at least they'll require rebasing, particularly of the
> >> no-internal-freelists patch :-)
> >
> > Rebased patches attached. The performance measurements don't seem to
> > change significantly.
>
> And this was the final message in this thread. I think the general
> consensus was that Pip's patches were a good idea... unless they had
> any negative performance impact?
I'm not aware of any remaining negative performance impact, but I do
seem to recall Daniel was somewhat opposed to the patch.
> So I tried applying the patch now to Emacs 28 to do some benchmarking,
> but it didn't apply cleanly, so I gave up.
I'll try to rebase them again. Does the attached work for you?
> Is this still something worth pursuing?
I think it is, at least in the case of the "rehash eagerly" patch.
As for the more general rewrite of hash tables, it might be a good
idea to summarize and discuss this idea some more. Here are some
initial notes, but I'll make a proper proposal once things are ready:
- there's a new lisp.h type for hash cells, which is four words long
and holds a hash, key, value, and a next link (either another hash
cell, or the hash table this cell belongs to, or nil if it no longer
belongs to any hash table)
- hash tables are essentially vectors of such hash cells
- hash cells are invisible to Lisp code, but C code can hold on to
hash cell Lisp Objects, guaranteeing they won't be collected (they can
still be removed from the hash table)
- in my current implementation, hash cells aren't pseudovectors,
they're cons-like objects with four cells rather than two.
The implications are:
- more work for the garbage collector
- a somewhat ambitious rewrite of the profiler to keep hash cells pre-allocated
- shrinkable hash tables
- genuinely unordered hash tables, since there's no more HASH_INDEX
- rehash thresholds would be interpreted differently
There's also a slight reduction in memory usage for hash tables.
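To make the notes above a little more concrete, here is a minimal sketch of the four-word hash cell idea. It is not the code from my tree: every name in it (struct hash_cell, struct hash_table, table_put, table_remove, cell_owner, NBUCKETS) is made up for illustration, plain uintptr_t words stand in for Lisp_Object, and tagging the table pointer with bit 0 is just an assumption about how the "next" link could tell a cell from a table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NBUCKETS = 8 };          /* Illustrative fixed bucket count.  */

/* A hypothetical four-word hash cell.  NEXT is 0 (nil, the cell is in
   no table), a pointer to another cell, or a pointer to the owning
   table with bit 0 set.  */
struct hash_cell
{
  uintptr_t hash;               /* Cached hash code.  */
  uintptr_t key;
  uintptr_t value;
  uintptr_t next;
};

/* A table is essentially a vector of chain heads.  */
struct hash_table
{
  struct hash_cell *bucket[NBUCKETS];
  size_t count;
};

static uintptr_t
tag_table (struct hash_table *t)
{
  assert (((uintptr_t) t & 1) == 0);    /* Alignment leaves bit 0 free.  */
  return (uintptr_t) t | 1;
}

static int
next_is_table (uintptr_t next)
{
  return (next & 1) != 0;
}

/* Push cell C onto the front of its bucket chain in T.  */
void
table_put (struct hash_table *t, struct hash_cell *c)
{
  size_t b = c->hash % NBUCKETS;
  c->next = t->bucket[b] ? (uintptr_t) t->bucket[b] : tag_table (t);
  t->bucket[b] = c;
  t->count++;
}

/* Return the table C belongs to, or NULL if it has been removed.  */
struct hash_table *
cell_owner (const struct hash_cell *c)
{
  uintptr_t next = c->next;
  while (next != 0 && !next_is_table (next))
    next = ((const struct hash_cell *) next)->next;
  return next ? (struct hash_table *) (next & ~(uintptr_t) 1) : NULL;
}

/* Unlink C from T.  The cell itself stays valid for whoever holds it.  */
void
table_remove (struct hash_table *t, struct hash_cell *c)
{
  size_t b = c->hash % NBUCKETS;
  struct hash_cell *prev = NULL;
  struct hash_cell *cur = t->bucket[b];
  while (cur && cur != c)
    {
      prev = cur;
      cur = next_is_table (cur->next) ? NULL : (struct hash_cell *) cur->next;
    }
  if (!cur)
    return;                     /* C is not in this table.  */
  struct hash_cell *after
    = next_is_table (c->next) ? NULL : (struct hash_cell *) c->next;
  if (prev)
    prev->next = after ? (uintptr_t) after : tag_table (t);
  else
    t->bucket[b] = after;
  c->next = 0;                  /* nil: no longer in any table.  */
  t->count--;
}
```

The point of ending each chain with the table itself is that C code holding a cell can find out which table it is in, or that it has been removed, without a separate back pointer word.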
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-dump.patch --]
[-- Type: text/x-patch, Size: 19743 bytes --]
From 3ae17db7d2123b5a0b7d25e590f9e29d7ff537fc Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Fri, 19 Jul 2019 07:12:42 +0000
Subject: [PATCH] Rehash hash tables eagerly after loading a dump.
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 65 ++++-----------
src/lisp.h | 21 +----
src/minibuf.c | 3 -
src/pdumper.c | 208 +++++++++++++++++++++---------------------------
src/pdumper.h | 1 +
8 files changed, 112 insertions(+), 189 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 1913a4812a..1c3b6eac0d 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1401,7 +1401,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index f96f0b7772..ec2b8328f7 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -652,7 +652,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
Lisp_Object hash = h->test.hashfn (header, h);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index 8e5eaf5e43..d31fa2cb28 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1536,6 +1536,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index 811d6e8200..41e26104f3 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4248,50 +4248,28 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
/* Recompute the hashes (and hence also the "next" pointers).
Normally there's never a need to recompute hashes.
- This is done only on first-access to a hash-table loaded from
- the "pdump", because the object's addresses may have changed, thus
+ This is done only on first access to a hash-table loaded from
+ the "pdump", because the objects' addresses may have changed, thus
affecting their hash. */
void
-hash_table_rehash (struct Lisp_Hash_Table *h)
+hash_table_rehash (Lisp_Object hash)
{
- ptrdiff_t size = HASH_TABLE_SIZE (h);
-
- /* These structures may have been purecopied and shared
- (bug#36447). */
- Lisp_Object hash = make_nil_vector (size);
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
-
+ struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
- for (ptrdiff_t i = 0; i < size; ++i)
+ for (ptrdiff_t i = 0; i < h->count; ++i)
{
Lisp_Object key = HASH_KEY (h, i);
- if (!EQ (key, Qunbound))
- ASET (hash, i, h->test.hashfn (key, h));
+ Lisp_Object hash_code = h->test.hashfn (key, h);
+ ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
+ set_hash_hash_slot (h, i, hash_code);
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
}
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
-
- /* Rebuild the collision chains. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (AREF (hash, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (AREF (hash, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
-
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (hash_rehash_needed_p (h));
- h->hash = hash;
- eassert (!hash_rehash_needed_p (h));
+ for (ptrdiff_t i = h->count; i < ASIZE (h->next) - 1; i++)
+ set_hash_next_slot (h, i, i + 1);
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4303,8 +4281,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object *hash)
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
Lisp_Object hash_code = h->test.hashfn (key, h);
if (hash)
*hash = hash_code;
@@ -4339,8 +4315,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
/* Increment count after resizing because resizing may fail. */
maybe_resize_hash_table (h);
h->count++;
@@ -4373,8 +4347,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4415,8 +4387,7 @@ hash_clear (struct Lisp_Hash_Table *h)
if (h->count > 0)
{
ptrdiff_t size = HASH_TABLE_SIZE (h);
- if (!hash_rehash_needed_p (h))
- memclear (xvector_contents (h->hash), size * word_size);
+ memclear (xvector_contents (h->hash), size * word_size);
for (ptrdiff_t i = 0; i < size; i++)
{
set_hash_next_slot (h, i, i < size - 1 ? i + 1 : -1);
@@ -4452,9 +4423,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
@@ -4499,7 +4468,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
set_hash_hash_slot (h, i, Qnil);
eassert (h->count != 0);
- h->count += h->count > 0 ? -1 : 1;
+ h->count--;
}
else
{
@@ -4923,7 +4892,7 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- eassert (h->count >= 0);
+
return make_fixnum (h->count);
}
diff --git a/src/lisp.h b/src/lisp.h
index 17b92a0414..00d237394c 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2275,11 +2275,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2398,20 +2394,7 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
return size;
}
-void hash_table_rehash (struct Lisp_Hash_Table *h);
-
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return NILP (h->hash);
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
+void hash_table_rehash (Lisp_Object h);
/* Default size for hash tables if not specified. */
diff --git a/src/minibuf.c b/src/minibuf.c
index 9d870ce364..cb302c5a60 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1212,9 +1212,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 63ee0fcb7f..efee001651 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -105,21 +105,10 @@ #define VM_MS_WINDOWS 2
# define VM_SUPPORTED 0
#endif
-/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
- check, for each hash table it dumps, that the hash table means the
- same thing after rehashing. */
-#ifndef PDUMPER_CHECK_REHASHING
-# if ENABLE_CHECKING
-# define PDUMPER_CHECK_REHASHING 1
-# else
-# define PDUMPER_CHECK_REHASHING 0
-# endif
-#endif
-
-/* Require an architecture in which pointers, ptrdiff_t and intptr_t
- are the same size and have the same layout, and where bytes have
- eight bits --- that is, a general-purpose computer made after 1990.
- Also require Lisp_Object to be at least as wide as pointers. */
+/* We require an architecture in which all pointers are the same size
+ and have the same layout, where pointers are either 32 or 64 bits
+ long, and where bytes have eight bits --- that is, a
+ general-purpose computer made after 1990. */
verify (sizeof (ptrdiff_t) == sizeof (void *));
verify (sizeof (intptr_t) == sizeof (ptrdiff_t));
verify (sizeof (void (*) (void)) == sizeof (void *));
@@ -401,6 +390,8 @@ dump_fingerprint (char const *label,
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -558,6 +549,8 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2616,78 +2609,61 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a vector of KEY, VALUE pairs in the given hash table H. The
+ first H->count pairs are valid, the rest is left as nil. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- if (hash->test.hashfn == hashfn_user_defined)
+ if (h->test.hashfn == hashfn_user_defined)
error ("cannot dump hash tables with user-defined tests"); /* Bug#36769 */
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
+ Lisp_Object contents = Qnil;
+ /* Make sure key_and_value ends up in the same order, charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h) - h->count; i++)
{
- Lisp_Object key = HASH_KEY (hash, i);
- if (!EQ (key, Qunbound))
- {
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal
- && (STRINGP (key) || BOOL_VECTOR_P (key)))
- || ((is_equal || is_eql)
- && (FLOATP (key) || BIGNUMP (key))));
- if (!key_stable)
- return false;
- }
+ dump_push (&contents, Qnil);
+ dump_push (&contents, Qunbound);
}
- return true;
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
+ {
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
+ }
+
+ return CALLN (Fapply, Qvector, contents);
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- {
- Lisp_Object key = HASH_KEY (h, i);
- if (!EQ (key, Qunbound))
- dump_push (&contents, Fcons (key, HASH_VALUE (h, i)));
- }
- return Fnreverse (contents);
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- ptrdiff_t count = XHASH_TABLE (table_orig)->count;
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (!hash_rehash_needed_p (XHASH_TABLE (table_rehashed)));
- XHASH_TABLE (table_rehashed)->hash = Qnil;
- eassert (count == 0 || hash_rehash_needed_p (XHASH_TABLE (table_rehashed)));
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (!hash_rehash_needed_p (XHASH_TABLE (table_rehashed)));
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
- {
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
- }
+hash_table_freeze (struct Lisp_Hash_Table *h)
+{
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->key_and_value = hash_table_contents (h);
+ h->next_free = (nkeys == h->count ? -1 : h->count);
+ h->index = Flength (h->index);
+ h->next = h->hash = make_fixnum (nkeys);
+}
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+static void
+hash_table_thaw (Lisp_Object hash)
+{
+ struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
+
+ hash_table_rehash (hash);
}
static dump_off
@@ -2699,51 +2675,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- /* Hash codes will have to be recomputed anyway, so let's not dump them.
- Also set `hash` to nil for hash_rehash_needed_p.
- We could also refrain from dumping the `next' and `index' vectors,
- except that `next' is currently used for HASH_TABLE_SIZE and
- we'd have to rebuild the next_free list as well as adjust
- sweep_weak_hash_table for the case where there's no `index'. */
- hash->hash = Qnil;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4151,6 +4087,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5302,6 +5251,9 @@ dump_do_all_emacs_relocations (const struct dump_header *const header,
NUMBER_DUMP_SECTIONS,
};
+/* Pointer to a stack variable to avoid having to staticpro it. */
+static Lisp_Object *pdumper_hashes = &zero_vector;
+
/* Load a dump from DUMP_FILENAME. Return an error code.
N.B. We run very early in initialization, so we can't use lisp,
@@ -5448,6 +5400,15 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ Lisp_Object hashes = zero_vector;
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (hashes, hash_tables);
+ }
+
+ pdumper_hashes = &hashes;
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5518,6 +5479,19 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
#endif /* HAVE_PDUMPER */
\f
+static void
+thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = *pdumper_hashes;
+ for (ptrdiff_t i = 0; i < ASIZE (hash_tables); i++)
+ hash_table_thaw (AREF (hash_tables, i));
+}
+
+void
+init_pdumper_once (void)
+{
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
void
syms_of_pdumper (void)
diff --git a/src/pdumper.h b/src/pdumper.h
index 6a99b511f2..c793fb4058 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -256,6 +256,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.28.0
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-10 11:51 ` Pip Cet
@ 2020-08-10 13:04 ` Lars Ingebrigtsen
2020-08-11 9:33 ` Paul Eggert
0 siblings, 1 reply; 37+ messages in thread
From: Lars Ingebrigtsen @ 2020-08-10 13:04 UTC (permalink / raw)
To: Pip Cet; +Cc: Paul Eggert, 36597
Pip Cet <pipcet@gmail.com> writes:
>> So I tried applying the patch now to Emacs 28 to do some benchmarking,
>> but it didn't apply cleanly, so I gave up.
>
> I'll try to rebase them again. Does the attached work for you?
Yup. I've now done some benchmarking with
time make -j32 compile-always
Without patch:
real 0m38.855s
real 0m40.295s
real 0m39.299s
real 0m39.864s
real 0m40.428s
real 0m40.012s
real 0m38.988s
real 0m39.807s
real 0m40.455s
real 0m37.341s
real 0m33.349s
real 0m34.379s
real 0m34.339s
real 0m33.139s
real 0m32.902s
real 0m33.755s
real 0m34.143s
real 0m34.598s
real 0m34.484s
real 0m34.342s
With patch:
real 0m36.064s
real 0m36.617s
real 0m34.502s
real 0m36.817s
real 0m31.782s
real 0m32.859s
real 0m29.779s
real 0m29.703s
real 0m30.313s
real 0m29.496s
real 0m29.585s
real 0m29.807s
real 0m30.235s
real 0m30.142s
real 0m29.960s
real 0m30.067s
real 0m30.114s
real 0m29.975s
real 0m30.388s
real 0m30.112s
Er... It's weird that there's so much difference in time between
runs -- this is running on a machine that does nothing else and has a
load of 0.0 if I'm not compiling Emacs.
So I don't know what can be concluded here... if we just take the mean
from these numbers, it seems that your patch is making compilation
faster. :-)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-10 13:04 ` Lars Ingebrigtsen
@ 2020-08-11 9:33 ` Paul Eggert
2020-08-11 9:40 ` Pip Cet
` (2 more replies)
0 siblings, 3 replies; 37+ messages in thread
From: Paul Eggert @ 2020-08-11 9:33 UTC (permalink / raw)
To: Lars Ingebrigtsen, Pip Cet; +Cc: 36597-done
[-- Attachment #1: Type: text/plain, Size: 958 bytes --]
On 8/10/20 6:04 AM, Lars Ingebrigtsen wrote:
> time make -j32 compile-always
> ...
> Er... It's weird that there's so much difference in time between
> runs
I think I get less variance if I do a sequential 'make' (i.e., without -j). Of
course this takes longer.
> if we just take the mean
> from these numbers, it seems that your patch is making compilation
> faster. :-)
It also simplifies the code a bit, so I took the liberty of installing it, after
updating its commit message a bit and changing it to keep a comment that was
updated recently (I figure this was a merge error). The patch I installed is the
first patch attached. While reviewing the patch I noticed some relatively minor
things in the neighborhood that could easily be fixed, so I did so by installing
some followup patches, also attached (the last patch is the only one that's at
all nontrivial). I'll mark the bug as done. Thanks to both of you for following
up on this.
[-- Attachment #2: 0001-Rehash-hash-tables-eagerly-after-loading-a-dump.patch --]
[-- Type: text/x-patch, Size: 19345 bytes --]
From 8a33944ee20f57a8d2efe6fe68aa0f4745aa2d03 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Tue, 11 Aug 2020 02:16:53 -0700
Subject: [PATCH 1/7] Rehash hash tables eagerly after loading a dump
This simplifies code, and helps performance in some cases (Bug#36597).
* src/lisp.h (hash_rehash_needed_p): Remove. All uses removed.
(hash_rehash_if_needed): Remove. All uses removed.
(struct Lisp_Hash_Table): Remove comment about rehashing hash tables.
* src/pdumper.c (thaw_hash_tables): New function.
(hash_table_thaw): New function.
(hash_table_freeze): New function.
(dump_hash_table): Simplify.
(dump_hash_table_list): New function.
(hash_table_contents): New function.
(Fdump_emacs_portable): Handle hash tables by eager rehashing.
(pdumper_load): Restore hash tables.
(init_pdumper_once): New function.
---
src/bytecode.c | 1 -
src/composite.c | 1 -
src/emacs.c | 1 +
src/fns.c | 65 ++++------------
src/lisp.h | 21 +----
src/minibuf.c | 3 -
src/pdumper.c | 201 +++++++++++++++++++++---------------------------
src/pdumper.h | 1 +
8 files changed, 109 insertions(+), 185 deletions(-)
diff --git a/src/bytecode.c b/src/bytecode.c
index 1913a4812a..1c3b6eac0d 100644
--- a/src/bytecode.c
+++ b/src/bytecode.c
@@ -1401,7 +1401,6 @@ #define DEFINE(name, value) LABEL (name) ,
Lisp_Object v1 = POP;
ptrdiff_t i;
struct Lisp_Hash_Table *h = XHASH_TABLE (jmp_table);
- hash_rehash_if_needed (h);
/* h->count is a faster approximation for HASH_TABLE_SIZE (h)
here. */
diff --git a/src/composite.c b/src/composite.c
index f96f0b7772..ec2b8328f7 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -652,7 +652,6 @@ gstring_lookup_cache (Lisp_Object header)
composition_gstring_put_cache (Lisp_Object gstring, ptrdiff_t len)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
- hash_rehash_if_needed (h);
Lisp_Object header = LGSTRING_HEADER (gstring);
Lisp_Object hash = h->test.hashfn (header, h);
if (len < 0)
diff --git a/src/emacs.c b/src/emacs.c
index 8e5eaf5e43..d31fa2cb28 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1536,6 +1536,7 @@ main (int argc, char **argv)
if (!initialized)
{
init_alloc_once ();
+ init_pdumper_once ();
init_obarray_once ();
init_eval_once ();
init_charset_once ();
diff --git a/src/fns.c b/src/fns.c
index 811d6e8200..41e26104f3 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4248,50 +4248,28 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
/* Recompute the hashes (and hence also the "next" pointers).
Normally there's never a need to recompute hashes.
- This is done only on first-access to a hash-table loaded from
- the "pdump", because the object's addresses may have changed, thus
+ This is done only on first access to a hash-table loaded from
+ the "pdump", because the objects' addresses may have changed, thus
affecting their hash. */
void
-hash_table_rehash (struct Lisp_Hash_Table *h)
+hash_table_rehash (Lisp_Object hash)
{
- ptrdiff_t size = HASH_TABLE_SIZE (h);
-
- /* These structures may have been purecopied and shared
- (bug#36447). */
- Lisp_Object hash = make_nil_vector (size);
- h->next = Fcopy_sequence (h->next);
- h->index = Fcopy_sequence (h->index);
-
+ struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
- for (ptrdiff_t i = 0; i < size; ++i)
+ for (ptrdiff_t i = 0; i < h->count; ++i)
{
Lisp_Object key = HASH_KEY (h, i);
- if (!EQ (key, Qunbound))
- ASET (hash, i, h->test.hashfn (key, h));
+ Lisp_Object hash_code = h->test.hashfn (key, h);
+ ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
+ set_hash_hash_slot (h, i, hash_code);
+ set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
+ set_hash_index_slot (h, start_of_bucket, i);
+ eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
}
- /* Reset the index so that any slot we don't fill below is marked
- invalid. */
- Ffillarray (h->index, make_fixnum (-1));
-
- /* Rebuild the collision chains. */
- for (ptrdiff_t i = 0; i < size; ++i)
- if (!NILP (AREF (hash, i)))
- {
- EMACS_UINT hash_code = XUFIXNUM (AREF (hash, i));
- ptrdiff_t start_of_bucket = hash_code % ASIZE (h->index);
- set_hash_next_slot (h, i, HASH_INDEX (h, start_of_bucket));
- set_hash_index_slot (h, start_of_bucket, i);
- eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
- }
-
- /* Finally, mark the hash table as having a valid hash order.
- Do this last so that if we're interrupted, we retry on next
- access. */
- eassert (hash_rehash_needed_p (h));
- h->hash = hash;
- eassert (!hash_rehash_needed_p (h));
+ for (ptrdiff_t i = h->count; i < ASIZE (h->next) - 1; i++)
+ set_hash_next_slot (h, i, i + 1);
}
/* Lookup KEY in hash table H. If HASH is non-null, return in *HASH
@@ -4303,8 +4281,6 @@ hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object *hash)
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
Lisp_Object hash_code = h->test.hashfn (key, h);
if (hash)
*hash = hash_code;
@@ -4339,8 +4315,6 @@ hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value,
{
ptrdiff_t start_of_bucket, i;
- hash_rehash_if_needed (h);
-
/* Increment count after resizing because resizing may fail. */
maybe_resize_hash_table (h);
h->count++;
@@ -4373,8 +4347,6 @@ hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key)
ptrdiff_t start_of_bucket = XUFIXNUM (hash_code) % ASIZE (h->index);
ptrdiff_t prev = -1;
- hash_rehash_if_needed (h);
-
for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
0 <= i;
i = HASH_NEXT (h, i))
@@ -4415,8 +4387,7 @@ hash_clear (struct Lisp_Hash_Table *h)
if (h->count > 0)
{
ptrdiff_t size = HASH_TABLE_SIZE (h);
- if (!hash_rehash_needed_p (h))
- memclear (xvector_contents (h->hash), size * word_size);
+ memclear (xvector_contents (h->hash), size * word_size);
for (ptrdiff_t i = 0; i < size; i++)
{
set_hash_next_slot (h, i, i < size - 1 ? i + 1 : -1);
@@ -4452,9 +4423,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
for (ptrdiff_t bucket = 0; bucket < n; ++bucket)
{
/* Follow collision chain, removing entries that don't survive
- this garbage collection. It's okay if hash_rehash_needed_p
- (h) is true, since we're operating entirely on the cached
- hash values. */
+ this garbage collection. */
ptrdiff_t prev = -1;
ptrdiff_t next;
for (ptrdiff_t i = HASH_INDEX (h, bucket); 0 <= i; i = next)
@@ -4499,7 +4468,7 @@ sweep_weak_table (struct Lisp_Hash_Table *h, bool remove_entries_p)
set_hash_hash_slot (h, i, Qnil);
eassert (h->count != 0);
- h->count += h->count > 0 ? -1 : 1;
+ h->count--;
}
else
{
@@ -4923,7 +4892,7 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
- eassert (h->count >= 0);
+
return make_fixnum (h->count);
}
diff --git a/src/lisp.h b/src/lisp.h
index 17b92a0414..d88038d91b 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -2275,11 +2275,7 @@ #define DEFSYM(sym, name) /* empty */
struct Lisp_Hash_Table
{
- /* Change pdumper.c if you change the fields here.
-
- IMPORTANT!!!!!!!
-
- Call hash_rehash_if_needed() before accessing. */
+ /* Change pdumper.c if you change the fields here. */
/* This is for Lisp; the hash table code does not refer to it. */
union vectorlike_header header;
@@ -2398,20 +2394,7 @@ HASH_TABLE_SIZE (const struct Lisp_Hash_Table *h)
return size;
}
-void hash_table_rehash (struct Lisp_Hash_Table *h);
-
-INLINE bool
-hash_rehash_needed_p (const struct Lisp_Hash_Table *h)
-{
- return NILP (h->hash);
-}
-
-INLINE void
-hash_rehash_if_needed (struct Lisp_Hash_Table *h)
-{
- if (hash_rehash_needed_p (h))
- hash_table_rehash (h);
-}
+void hash_table_rehash (Lisp_Object);
/* Default size for hash tables if not specified. */
diff --git a/src/minibuf.c b/src/minibuf.c
index 9d870ce364..cb302c5a60 100644
--- a/src/minibuf.c
+++ b/src/minibuf.c
@@ -1212,9 +1212,6 @@ DEFUN ("try-completion", Ftry_completion, Stry_completion, 2, 3, 0,
bucket = AREF (collection, idx);
}
- if (HASH_TABLE_P (collection))
- hash_rehash_if_needed (XHASH_TABLE (collection));
-
while (1)
{
/* Get the next element of the alist, obarray, or hash-table. */
diff --git a/src/pdumper.c b/src/pdumper.c
index 63ee0fcb7f..10dfa8737f 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -105,17 +105,6 @@ #define VM_MS_WINDOWS 2
# define VM_SUPPORTED 0
#endif
-/* PDUMPER_CHECK_REHASHING being true causes the portable dumper to
- check, for each hash table it dumps, that the hash table means the
- same thing after rehashing. */
-#ifndef PDUMPER_CHECK_REHASHING
-# if ENABLE_CHECKING
-# define PDUMPER_CHECK_REHASHING 1
-# else
-# define PDUMPER_CHECK_REHASHING 0
-# endif
-#endif
-
/* Require an architecture in which pointers, ptrdiff_t and intptr_t
are the same size and have the same layout, and where bytes have
eight bits --- that is, a general-purpose computer made after 1990.
@@ -401,6 +390,8 @@ dump_fingerprint (char const *label,
The start of the cold region is always aligned on a page
boundary. */
dump_off cold_start;
+
+ dump_off hash_list;
};
/* Double-ended singly linked list. */
@@ -558,6 +549,8 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ Lisp_Object hash_tables;
+
unsigned number_hot_relocations;
unsigned number_discardable_relocations;
};
@@ -2616,78 +2609,62 @@ dump_vectorlike_generic (struct dump_context *ctx,
return offset;
}
-/* Determine whether the hash table's hash order is stable
- across dump and load. If it is, we don't have to trigger
- a rehash on access. */
-static bool
-dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
+/* Return a vector of KEY, VALUE pairs in the given hash table H. The
+ first H->count pairs are valid, the rest is left as nil. */
+static Lisp_Object
+hash_table_contents (struct Lisp_Hash_Table *h)
{
- if (hash->test.hashfn == hashfn_user_defined)
+ if (h->test.hashfn == hashfn_user_defined)
error ("cannot dump hash tables with user-defined tests"); /* Bug#36769 */
- bool is_eql = hash->test.hashfn == hashfn_eql;
- bool is_equal = hash->test.hashfn == hashfn_equal;
- ptrdiff_t size = HASH_TABLE_SIZE (hash);
- for (ptrdiff_t i = 0; i < size; ++i)
+ Lisp_Object contents = Qnil;
+
+ /* Make sure key_and_value ends up in the same order; charset.c
+ relies on it by expecting hash table indices to stay constant
+ across the dump. */
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h) - h->count; i++)
{
- Lisp_Object key = HASH_KEY (hash, i);
- if (!EQ (key, Qunbound))
- {
- bool key_stable = (dump_builtin_symbol_p (key)
- || FIXNUMP (key)
- || (is_equal
- && (STRINGP (key) || BOOL_VECTOR_P (key)))
- || ((is_equal || is_eql)
- && (FLOATP (key) || BIGNUMP (key))));
- if (!key_stable)
- return false;
- }
+ dump_push (&contents, Qnil);
+ dump_push (&contents, Qunbound);
}
- return true;
+ for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ if (!NILP (HASH_HASH (h, i)))
+ {
+ dump_push (&contents, HASH_VALUE (h, i));
+ dump_push (&contents, HASH_KEY (h, i));
+ }
+
+ return CALLN (Fapply, Qvector, contents);
}
-/* Return a list of (KEY . VALUE) pairs in the given hash table. */
-static Lisp_Object
-hash_table_contents (Lisp_Object table)
+static dump_off
+dump_hash_table_list (struct dump_context *ctx)
{
- Lisp_Object contents = Qnil;
- struct Lisp_Hash_Table *h = XHASH_TABLE (table);
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
- {
- Lisp_Object key = HASH_KEY (h, i);
- if (!EQ (key, Qunbound))
- dump_push (&contents, Fcons (key, HASH_VALUE (h, i)));
- }
- return Fnreverse (contents);
+ if (CONSP (ctx->hash_tables))
+ return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
+ else
+ return 0;
}
-/* Copy the given hash table, rehash it, and make sure that we can
- look up all the values in the original. */
static void
-check_hash_table_rehash (Lisp_Object table_orig)
-{
- ptrdiff_t count = XHASH_TABLE (table_orig)->count;
- hash_rehash_if_needed (XHASH_TABLE (table_orig));
- Lisp_Object table_rehashed = Fcopy_hash_table (table_orig);
- eassert (!hash_rehash_needed_p (XHASH_TABLE (table_rehashed)));
- XHASH_TABLE (table_rehashed)->hash = Qnil;
- eassert (count == 0 || hash_rehash_needed_p (XHASH_TABLE (table_rehashed)));
- hash_rehash_if_needed (XHASH_TABLE (table_rehashed));
- eassert (!hash_rehash_needed_p (XHASH_TABLE (table_rehashed)));
- Lisp_Object expected_contents = hash_table_contents (table_orig);
- while (!NILP (expected_contents))
- {
- Lisp_Object key_value_pair = dump_pop (&expected_contents);
- Lisp_Object key = XCAR (key_value_pair);
- Lisp_Object expected_value = XCDR (key_value_pair);
- Lisp_Object arbitrary = Qdump_emacs_portable__sort_predicate_copied;
- Lisp_Object found_value = Fgethash (key, table_rehashed, arbitrary);
- eassert (EQ (expected_value, found_value));
- Fremhash (key, table_rehashed);
- }
+hash_table_freeze (struct Lisp_Hash_Table *h)
+{
+ ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ h->key_and_value = hash_table_contents (h);
+ h->next_free = (nkeys == h->count ? -1 : h->count);
+ h->index = Flength (h->index);
+ h->next = h->hash = make_fixnum (nkeys);
+}
- eassert (EQ (Fhash_table_count (table_rehashed),
- make_fixnum (0)));
+static void
+hash_table_thaw (Lisp_Object hash)
+{
+ struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
+ h->hash = Fmake_vector (h->hash, Qnil);
+ h->next = Fmake_vector (h->next, make_fixnum (-1));
+
+ hash_table_rehash (hash);
}
static dump_off
@@ -2699,51 +2676,11 @@ dump_hash_table (struct dump_context *ctx,
# error "Lisp_Hash_Table changed. See CHECK_STRUCTS comment in config.h."
#endif
const struct Lisp_Hash_Table *hash_in = XHASH_TABLE (object);
- bool is_stable = dump_hash_table_stable_p (hash_in);
- /* If the hash table is likely to be modified in memory (either
- because we need to rehash, and thus toggle hash->count, or
- because we need to assemble a list of weak tables) punt the hash
- table to the end of the dump, where we can lump all such hash
- tables together. */
- if (!(is_stable || !NILP (hash_in->weak))
- && ctx->flags.defer_hash_tables)
- {
- if (offset != DUMP_OBJECT_ON_HASH_TABLE_QUEUE)
- {
- eassert (offset == DUMP_OBJECT_ON_NORMAL_QUEUE
- || offset == DUMP_OBJECT_NOT_SEEN);
- /* We still want to dump the actual keys and values now. */
- dump_enqueue_object (ctx, hash_in->key_and_value, WEIGHT_NONE);
- /* We'll get to the rest later. */
- offset = DUMP_OBJECT_ON_HASH_TABLE_QUEUE;
- dump_remember_object (ctx, object, offset);
- dump_push (&ctx->deferred_hash_tables, object);
- }
- return offset;
- }
-
- if (PDUMPER_CHECK_REHASHING)
- check_hash_table_rehash (make_lisp_ptr ((void *) hash_in, Lisp_Vectorlike));
-
struct Lisp_Hash_Table hash_munged = *hash_in;
struct Lisp_Hash_Table *hash = &hash_munged;
- /* Remember to rehash this hash table on first access. After a
- dump reload, the hash table values will have changed, so we'll
- need to rebuild the index.
-
- TODO: for EQ and EQL hash tables, it should be possible to rehash
- here using the preferred load address of the dump, eliminating
- the need to rehash-on-access if we can load the dump where we
- want. */
- if (hash->count > 0 && !is_stable)
- /* Hash codes will have to be recomputed anyway, so let's not dump them.
- Also set `hash` to nil for hash_rehash_needed_p.
- We could also refrain from dumping the `next' and `index' vectors,
- except that `next' is currently used for HASH_TABLE_SIZE and
- we'd have to rebuild the next_free list as well as adjust
- sweep_weak_hash_table for the case where there's no `index'. */
- hash->hash = Qnil;
+ hash_table_freeze (hash);
+ dump_push (&ctx->hash_tables, object);
START_DUMP_PVEC (ctx, &hash->header, struct Lisp_Hash_Table, out);
dump_pseudovector_lisp_fields (ctx, &out->header, &hash->header);
@@ -4151,6 +4088,19 @@ DEFUN ("dump-emacs-portable",
|| !NILP (ctx->deferred_hash_tables)
|| !NILP (ctx->deferred_symbols));
+ ctx->header.hash_list = ctx->offset;
+ dump_hash_table_list (ctx);
+
+ do
+ {
+ dump_drain_deferred_hash_tables (ctx);
+ dump_drain_deferred_symbols (ctx);
+ dump_drain_normal_queue (ctx);
+ }
+ while (!dump_queue_empty_p (&ctx->dump_queue)
+ || !NILP (ctx->deferred_hash_tables)
+ || !NILP (ctx->deferred_symbols));
+
dump_sort_copied_objects (ctx);
/* While we copy built-in symbols into the Emacs image, these
@@ -5302,6 +5252,9 @@ dump_do_all_emacs_relocations (const struct dump_header *const header,
NUMBER_DUMP_SECTIONS,
};
+/* Pointer to a stack variable to avoid having to staticpro it. */
+static Lisp_Object *pdumper_hashes = &zero_vector;
+
/* Load a dump from DUMP_FILENAME. Return an error code.
N.B. We run very early in initialization, so we can't use lisp,
@@ -5448,6 +5401,15 @@ pdumper_load (const char *dump_filename)
for (int i = 0; i < ARRAYELTS (sections); ++i)
dump_mmap_reset (&sections[i]);
+ Lisp_Object hashes = zero_vector;
+ if (header->hash_list)
+ {
+ struct Lisp_Vector *hash_tables =
+ ((struct Lisp_Vector *)(dump_base + header->hash_list));
+ XSETVECTOR (hashes, hash_tables);
+ }
+
+ pdumper_hashes = &hashes;
/* Run the functions Emacs registered for doing post-dump-load
initialization. */
for (int i = 0; i < nr_dump_hooks; ++i)
@@ -5518,6 +5480,19 @@ DEFUN ("pdumper-stats", Fpdumper_stats, Spdumper_stats, 0, 0, 0,
#endif /* HAVE_PDUMPER */
\f
+static void
+thaw_hash_tables (void)
+{
+ Lisp_Object hash_tables = *pdumper_hashes;
+ for (ptrdiff_t i = 0; i < ASIZE (hash_tables); i++)
+ hash_table_thaw (AREF (hash_tables, i));
+}
+
+void
+init_pdumper_once (void)
+{
+ pdumper_do_now_and_after_load (thaw_hash_tables);
+}
void
syms_of_pdumper (void)
diff --git a/src/pdumper.h b/src/pdumper.h
index 6a99b511f2..c793fb4058 100644
--- a/src/pdumper.h
+++ b/src/pdumper.h
@@ -256,6 +256,7 @@ pdumper_clear_marks (void)
file was loaded. */
extern void pdumper_record_wd (const char *);
+void init_pdumper_once (void);
void syms_of_pdumper (void);
INLINE_HEADER_END
--
2.17.1
[-- Attachment #3: 0002-src-fns.c-hash_table_rehash-Help-the-compiler-a-bit.patch --]
[-- Type: text/x-patch, Size: 1688 bytes --]
From 5c9fa99d3f0cbb2a44d3e0507533f4ab5a13f906 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 02:16:54 -0700
Subject: [PATCH 2/7] * src/fns.c (hash_table_rehash): Help the compiler a bit.
---
src/fns.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/src/fns.c b/src/fns.c
index 41e26104f3..9199178212 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4250,14 +4250,16 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
Normally there's never a need to recompute hashes.
This is done only on first access to a hash-table loaded from
the "pdump", because the objects' addresses may have changed, thus
- affecting their hash. */
+ affecting their hashes. */
void
hash_table_rehash (Lisp_Object hash)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
+ ptrdiff_t i, count = h->count;
+
/* Recompute the actual hash codes for each entry in the table.
Order is still invalid. */
- for (ptrdiff_t i = 0; i < h->count; ++i)
+ for (i = 0; i < count; i++)
{
Lisp_Object key = HASH_KEY (h, i);
Lisp_Object hash_code = h->test.hashfn (key, h);
@@ -4268,7 +4270,8 @@ hash_table_rehash (Lisp_Object hash)
eassert (HASH_NEXT (h, i) != i); /* Stop loops. */
}
- for (ptrdiff_t i = h->count; i < ASIZE (h->next) - 1; i++)
+ ptrdiff_t size = ASIZE (h->next);
+ for (; i + 1 < size; i++)
set_hash_next_slot (h, i, i + 1);
}
@@ -4892,7 +4895,6 @@ DEFUN ("hash-table-count", Fhash_table_count, Shash_table_count, 1, 1, 0,
(Lisp_Object table)
{
struct Lisp_Hash_Table *h = check_hash_table (table);
-
return make_fixnum (h->count);
}
--
2.17.1
[-- Attachment #4: 0003-src-pdumper.c-pdumper_load-XSETVECTOR-make_lisp_ptr.patch --]
[-- Type: text/x-patch, Size: 1611 bytes --]
From 523b92c3e7b61e5e625101c7ec99273689f7a75a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 02:16:54 -0700
Subject: [PATCH 3/7] * src/pdumper.c (pdumper_load): XSETVECTOR ->
make_lisp_ptr.
---
src/pdumper.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/src/pdumper.c b/src/pdumper.c
index 10dfa8737f..c38cb2d34f 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -391,6 +391,7 @@ dump_fingerprint (char const *label,
boundary. */
dump_off cold_start;
+ /* Offset of a vector of the dumped hash tables. */
dump_off hash_list;
};
@@ -549,6 +550,7 @@ dump_fingerprint (char const *label,
heap objects. */
Lisp_Object bignum_data;
+ /* List of hash tables that have been dumped. */
Lisp_Object hash_tables;
unsigned number_hot_relocations;
@@ -2610,7 +2612,7 @@ dump_vectorlike_generic (struct dump_context *ctx,
}
/* Return a vector of KEY, VALUE pairs in the given hash table H. The
- first H->count pairs are valid, the rest is left as nil. */
+ first H->count pairs are valid, and the rest are unbound. */
static Lisp_Object
hash_table_contents (struct Lisp_Hash_Table *h)
{
@@ -5405,8 +5407,8 @@ pdumper_load (const char *dump_filename)
if (header->hash_list)
{
struct Lisp_Vector *hash_tables =
- ((struct Lisp_Vector *)(dump_base + header->hash_list));
- XSETVECTOR (hashes, hash_tables);
+ (struct Lisp_Vector *) (dump_base + header->hash_list);
+ hashes = make_lisp_ptr (hash_tables, Lisp_Vectorlike);
}
pdumper_hashes = &hashes;
--
2.17.1
[-- Attachment #5: 0004-Don-t-needlessly-convert-to-unsigned-in-pdumper.patch --]
[-- Type: text/x-patch, Size: 5982 bytes --]
From 1d50d5a039ca4d40473c862b021d2ed97279ffe8 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 02:16:54 -0700
Subject: [PATCH 4/7] =?UTF-8?q?Don=E2=80=99t=20needlessly=20convert=20to?=
=?UTF-8?q?=20=E2=80=98unsigned=E2=80=99=20in=20pdumper?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/pdumper.c (PRIdDUMP_OFF): New macro.
(EMACS_INT_XDIGITS): New constant.
(struct dump_context): Use dump_off for relocation counts.
All uses changed.
(dump_queue_enqueue, dump_queue_dequeue, Fdump_emacs_portable):
Don’t assume counts fit in ‘unsigned’ or ‘unsigned long’.
Use EMACS_INT_XDIGITS instead of assuming it’s 16.
---
src/pdumper.c | 58 +++++++++++++++++++++++++++------------------------
1 file changed, 31 insertions(+), 27 deletions(-)
diff --git a/src/pdumper.c b/src/pdumper.c
index c38cb2d34f..6d303af77d 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -141,6 +141,9 @@ divide_round_up (size_t x, size_t y)
typedef int_least32_t dump_off;
#define DUMP_OFF_MIN INT_LEAST32_MIN
#define DUMP_OFF_MAX INT_LEAST32_MAX
+#define PRIdDUMP_OFF PRIdLEAST32
+
+enum { EMACS_INT_XDIGITS = (EMACS_INT_WIDTH + 3) / 4 };
static void ATTRIBUTE_FORMAT ((printf, 1, 2))
dump_trace (const char *fmt, ...)
@@ -553,8 +556,8 @@ dump_fingerprint (char const *label,
/* List of hash tables that have been dumped. */
Lisp_Object hash_tables;
- unsigned number_hot_relocations;
- unsigned number_discardable_relocations;
+ dump_off number_hot_relocations;
+ dump_off number_discardable_relocations;
};
/* These special values for use as offsets in dump_remember_object and
@@ -1007,9 +1010,9 @@ dump_queue_enqueue (struct dump_queue *dump_queue,
if (NILP (weights))
{
/* Object is new. */
- dump_trace ("new object %016x weight=%u\n",
- (unsigned) XLI (object),
- (unsigned) weight.value);
+ EMACS_UINT uobj = XLI (object);
+ dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
+ weight.value);
if (weight.value == WEIGHT_NONE.value)
{
@@ -1224,17 +1227,15 @@ dump_queue_dequeue (struct dump_queue *dump_queue, dump_off basis)
+ dump_tailq_length (&dump_queue->one_weight_normal_objects)
+ dump_tailq_length (&dump_queue->one_weight_strong_objects)));
- bool dump_object_counts = true;
- if (dump_object_counts)
- dump_trace
- ("dump_queue_dequeue basis=%d fancy=%u zero=%u "
- "normal=%u strong=%u hash=%u\n",
- basis,
- (unsigned) dump_tailq_length (&dump_queue->fancy_weight_objects),
- (unsigned) dump_tailq_length (&dump_queue->zero_weight_objects),
- (unsigned) dump_tailq_length (&dump_queue->one_weight_normal_objects),
- (unsigned) dump_tailq_length (&dump_queue->one_weight_strong_objects),
- (unsigned) XFIXNUM (Fhash_table_count (dump_queue->link_weights)));
+ dump_trace
+ (("dump_queue_dequeue basis=%"PRIdDUMP_OFF" fancy=%"PRIdPTR
+ " zero=%"PRIdPTR" normal=%"PRIdPTR" strong=%"PRIdPTR" hash=%td\n"),
+ basis,
+ dump_tailq_length (&dump_queue->fancy_weight_objects),
+ dump_tailq_length (&dump_queue->zero_weight_objects),
+ dump_tailq_length (&dump_queue->one_weight_normal_objects),
+ dump_tailq_length (&dump_queue->one_weight_strong_objects),
+ XHASH_TABLE (dump_queue->link_weights)->count);
static const int nr_candidates = 3;
struct candidate
@@ -1307,10 +1308,10 @@ dump_queue_dequeue (struct dump_queue *dump_queue, dump_off basis)
else
emacs_abort ();
- dump_trace (" result score=%f src=%s object=%016x\n",
+ EMACS_UINT uresult = XLI (result);
+ dump_trace (" result score=%f src=%s object=%0*"pI"x\n",
best < 0 ? -1.0 : (double) candidates[best].score,
- src,
- (unsigned) XLI (result));
+ src, EMACS_INT_XDIGITS, uresult);
{
Lisp_Object weights = Fgethash (result, dump_queue->link_weights, Qnil);
@@ -4162,9 +4163,9 @@ DEFUN ("dump-emacs-portable",
of the dump. */
drain_reloc_list (ctx, dump_emit_dump_reloc, emacs_reloc_merger,
&ctx->dump_relocs, &ctx->header.dump_relocs);
- unsigned number_hot_relocations = ctx->number_hot_relocations;
+ dump_off number_hot_relocations = ctx->number_hot_relocations;
ctx->number_hot_relocations = 0;
- unsigned number_discardable_relocations = ctx->number_discardable_relocations;
+ dump_off number_discardable_relocations = ctx->number_discardable_relocations;
ctx->number_discardable_relocations = 0;
drain_reloc_list (ctx, dump_emit_dump_reloc, emacs_reloc_merger,
&ctx->object_starts, &ctx->header.object_starts);
@@ -4188,14 +4189,17 @@ DEFUN ("dump-emacs-portable",
dump_seek (ctx, 0);
dump_write (ctx, &ctx->header, sizeof (ctx->header));
+ dump_off
+ header_bytes = header_end - header_start,
+ hot_bytes = hot_end - hot_start,
+ discardable_bytes = discardable_end - ctx->header.discardable_start,
+ cold_bytes = cold_end - ctx->header.cold_start;
fprintf (stderr,
("Dump complete\n"
- "Byte counts: header=%lu hot=%lu discardable=%lu cold=%lu\n"
- "Reloc counts: hot=%u discardable=%u\n"),
- (unsigned long) (header_end - header_start),
- (unsigned long) (hot_end - hot_start),
- (unsigned long) (discardable_end - ctx->header.discardable_start),
- (unsigned long) (cold_end - ctx->header.cold_start),
+ "Byte counts: header=%"PRIdDUMP_OFF" hot=%"PRIdDUMP_OFF
+ " discardable=%"PRIdDUMP_OFF" cold=%"PRIdDUMP_OFF"\n"
+ "Reloc counts: hot=%"PRIdDUMP_OFF" discardable=%"PRIdDUMP_OFF"\n"),
+ header_bytes, hot_bytes, discardable_bytes, cold_bytes,
number_hot_relocations,
number_discardable_relocations);
--
2.17.1
[-- Attachment #6: 0005-In-pdumper-simplify-INT_MAX-computation.patch --]
[-- Type: text/x-patch, Size: 1634 bytes --]
From 597bb393156730e0f68b0b3e80098d977b8dbdb8 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 02:16:54 -0700
Subject: [PATCH 5/7] In pdumper, simplify INT_MAX computation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
* src/pdumper.c (dump_read_all): Avoid unnecessary cast.
Also, round down to page size, as sysdep.c does.
Also, don’t assume INT_MAX <= UINT_MAX (!).
---
src/pdumper.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/src/pdumper.c b/src/pdumper.c
index 6d303af77d..fcad5242df 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -5065,14 +5065,13 @@ dump_read_all (int fd, void *buf, size_t bytes_to_read)
{
/* We don't want to use emacs_read, since that relies on the lisp
world, and we're not in the lisp world yet. */
- eassert (bytes_to_read <= SSIZE_MAX);
size_t bytes_read = 0;
while (bytes_read < bytes_to_read)
{
- /* Some platforms accept only int-sized values to read. */
- unsigned chunk_to_read = INT_MAX;
- if (bytes_to_read - bytes_read < chunk_to_read)
- chunk_to_read = (unsigned) (bytes_to_read - bytes_read);
+ /* Some platforms accept only int-sized values to read.
+ Round this down to a page size (see MAX_RW_COUNT in sysdep.c). */
+ int max_rw_count = INT_MAX >> 18 << 18;
+ size_t chunk_to_read = min (bytes_to_read - bytes_read, max_rw_count);
ssize_t chunk = read (fd, (char *) buf + bytes_read, chunk_to_read);
if (chunk < 0)
return chunk;
--
2.17.1
[-- Attachment #7: 0006-pdumper-speed-tweeks-for-hash-tables.patch --]
[-- Type: text/x-patch, Size: 2685 bytes --]
From ba8bff6bfca9daefd3917d706b94fc55b7d93191 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 02:16:54 -0700
Subject: [PATCH 6/7] pdumper speed tweeks for hash tables
* src/pdumper.c (dump_queue_empty_p): Avoid unnecessary call
to Fhash_table_count on a known hash table.
(dump_hash_table_list): !NILP, not CONSP.
(hash_table_freeze, hash_table_thaw): ASIZE, not Flength, on vectors.
Initialize in same order as struct.
(hash_table_thaw): make_nil_vector, not Fmake_vector with nil.
---
src/pdumper.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/src/pdumper.c b/src/pdumper.c
index fcad5242df..bc9d197ca2 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -964,11 +964,9 @@ dump_queue_init (struct dump_queue *dump_queue)
static bool
dump_queue_empty_p (struct dump_queue *dump_queue)
{
- bool is_empty =
- EQ (Fhash_table_count (dump_queue->sequence_numbers),
- make_fixnum (0));
- eassert (EQ (Fhash_table_count (dump_queue->sequence_numbers),
- Fhash_table_count (dump_queue->link_weights)));
+ ptrdiff_t count = XHASH_TABLE (dump_queue->sequence_numbers)->count;
+ bool is_empty = count == 0;
+ eassert (count == XFIXNAT (Fhash_table_count (dump_queue->link_weights)));
if (!is_empty)
{
eassert (!dump_tailq_empty_p (&dump_queue->zero_weight_objects)
@@ -2643,7 +2641,7 @@ hash_table_contents (struct Lisp_Hash_Table *h)
static dump_off
dump_hash_table_list (struct dump_context *ctx)
{
- if (CONSP (ctx->hash_tables))
+ if (!NILP (ctx->hash_tables))
return dump_object (ctx, CALLN (Fapply, Qvector, ctx->hash_tables));
else
return 0;
@@ -2652,20 +2650,20 @@ dump_hash_table_list (struct dump_context *ctx)
static void
hash_table_freeze (struct Lisp_Hash_Table *h)
{
- ptrdiff_t nkeys = XFIXNAT (Flength (h->key_and_value)) / 2;
+ ptrdiff_t npairs = ASIZE (h->key_and_value) / 2;
h->key_and_value = hash_table_contents (h);
- h->next_free = (nkeys == h->count ? -1 : h->count);
- h->index = Flength (h->index);
- h->next = h->hash = make_fixnum (nkeys);
+ h->next = h->hash = make_fixnum (npairs);
+ h->index = make_fixnum (ASIZE (h->index));
+ h->next_free = (npairs == h->count ? -1 : h->count);
}
static void
hash_table_thaw (Lisp_Object hash)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
- h->index = Fmake_vector (h->index, make_fixnum (-1));
- h->hash = Fmake_vector (h->hash, Qnil);
+ h->hash = make_nil_vector (XFIXNUM (h->hash));
h->next = Fmake_vector (h->next, make_fixnum (-1));
+ h->index = Fmake_vector (h->index, make_fixnum (-1));
hash_table_rehash (hash);
}
--
2.17.1
[-- Attachment #8: 0007-pdumper-avoid-listing-hash-table-contents.patch --]
[-- Type: text/x-patch, Size: 1807 bytes --]
From 88d3e15f47b675d8d3fc922eb5a6ff7df8295b34 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 02:16:54 -0700
Subject: [PATCH 7/7] pdumper avoid listing hash table contents
* src/pdumper.c (hash_table_contents): Create a vector directly,
instead of creating a list and then converting that to a vector.
---
src/pdumper.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/src/pdumper.c b/src/pdumper.c
index bc9d197ca2..94921dc9ea 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2617,25 +2617,28 @@ hash_table_contents (struct Lisp_Hash_Table *h)
{
if (h->test.hashfn == hashfn_user_defined)
error ("cannot dump hash tables with user-defined tests"); /* Bug#36769 */
- Lisp_Object contents = Qnil;
+
+ ptrdiff_t size = HASH_TABLE_SIZE (h);
+ Lisp_Object key_and_value = make_uninit_vector (2 * size);
+ ptrdiff_t n = 0;
/* Make sure key_and_value ends up in the same order; charset.c
relies on it by expecting hash table indices to stay constant
across the dump. */
- for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h) - h->count; i++)
- {
- dump_push (&contents, Qnil);
- dump_push (&contents, Qunbound);
- }
-
- for (ptrdiff_t i = HASH_TABLE_SIZE (h) - 1; i >= 0; --i)
+ for (ptrdiff_t i = 0; i < size; i++)
if (!NILP (HASH_HASH (h, i)))
{
- dump_push (&contents, HASH_VALUE (h, i));
- dump_push (&contents, HASH_KEY (h, i));
+ ASET (key_and_value, n++, HASH_KEY (h, i));
+ ASET (key_and_value, n++, HASH_VALUE (h, i));
}
- return CALLN (Fapply, Qvector, contents);
+ while (n < 2 * size)
+ {
+ ASET (key_and_value, n++, Qunbound);
+ ASET (key_and_value, n++, Qnil);
+ }
+
+ return key_and_value;
}
static dump_off
--
2.17.1
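For readers skimming the patch above, here is a minimal standalone sketch of the same "fill a flat array directly" idea. It is not Emacs code: plain int arrays stand in for Lisp vectors, the names compact_pairs, EMPTY, PAD_KEY and PAD_VALUE are made up for illustration, and an empty slot is marked by its key rather than by a nil hash entry as in pdumper.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in sentinels (assumptions): EMPTY marks an unused slot,
   PAD_KEY/PAD_VALUE play the role of Qunbound/Qnil padding.  */
enum { EMPTY = -1, PAD_KEY = -2, PAD_VALUE = 0 };

/* Copy the live key/value pairs of a table with SIZE slots to the
   front of OUT (which must hold 2 * SIZE ints), preserving their
   order, then pad the remainder.  Return the number of live pairs.  */
static size_t
compact_pairs (const int *keys, const int *values, size_t size, int *out)
{
  size_t n = 0;
  for (size_t i = 0; i < size; i++)
    if (keys[i] != EMPTY)
      {
        out[n++] = keys[i];
        out[n++] = values[i];
      }
  size_t live = n / 2;
  while (n < 2 * size)
    {
      out[n++] = PAD_KEY;
      out[n++] = PAD_VALUE;
    }
  return live;
}
```

The point of the single forward pass is that no intermediate list is built and the relative order of live entries is kept, which is the property the charset.c comment in the patch relies on.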
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 9:33 ` Paul Eggert
@ 2020-08-11 9:40 ` Pip Cet
2020-08-11 11:50 ` Lars Ingebrigtsen
2020-08-11 14:52 ` Eli Zaretskii
2 siblings, 0 replies; 37+ messages in thread
From: Pip Cet @ 2020-08-11 9:40 UTC (permalink / raw)
To: Paul Eggert; +Cc: Lars Ingebrigtsen, 36597-done
On Tue, Aug 11, 2020 at 9:33 AM Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 8/10/20 6:04 AM, Lars Ingebrigtsen wrote:
> > time make -j32 compile-always
> > ...
> > Er... It's weird that there's so much difference in time between
> > runs
>
> I think I get less variance if I do a sequential 'make' (i.e., without -j). Of
> course this takes longer.
It's worse than just execution time variance: at least in the past,
byte code used to differ between runs if -jN was used. I'm not sure
whether that's still true.
> It also simplifies the code a bit, so I took the liberty of installing it, after
> updating its commit message a bit and changing it to keep a comment that was
> updated recently (I figure this was a merge error).
It was, thank you!
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 9:33 ` Paul Eggert
2020-08-11 9:40 ` Pip Cet
@ 2020-08-11 11:50 ` Lars Ingebrigtsen
2020-08-11 14:52 ` Eli Zaretskii
2 siblings, 0 replies; 37+ messages in thread
From: Lars Ingebrigtsen @ 2020-08-11 11:50 UTC (permalink / raw)
To: Paul Eggert; +Cc: 36597-done, Pip Cet
Paul Eggert <eggert@cs.ucla.edu> writes:
> It also simplifies the code a bit, so I took the liberty of installing
> it, after updating its commit message a bit and changing it to keep a
> comment that was updated recently (I figure this was a merge
> error).
Just for fun, I re-ran the "make compile-always" test 20 times
before/after pulling the new version, and the mean compilation speed is
slightly smaller after. But as before, the speeds vary ~10% between
runs. But at least there's no obvious negative performance impact here.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 9:33 ` Paul Eggert
2020-08-11 9:40 ` Pip Cet
2020-08-11 11:50 ` Lars Ingebrigtsen
@ 2020-08-11 14:52 ` Eli Zaretskii
2020-08-11 15:30 ` Paul Eggert
2020-08-11 15:59 ` Pip Cet
2 siblings, 2 replies; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-11 14:52 UTC (permalink / raw)
To: Paul Eggert; +Cc: larsi, 36597-done, pipcet
> Cc: Eli Zaretskii <eliz@gnu.org>, 36597-done@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 11 Aug 2020 02:33:34 -0700
>
> It also simplifies the code a bit, so I took the liberty of installing it
It doesn't compile here:
pdumper.c: In function 'dump_queue_enqueue':
pdumper.c:1012:19: warning: unknown conversion type character 'l' in format [-Wformat=]
1012 | dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
| ^~~~~~~~~~~~~~~~
In file included from character.h:27,
from buffer.h:27,
from pdumper.c:34:
lisp.h:108:17: note: format string is defined here
108 | # define pI "ll"
| ^
pdumper.c:1012:19: warning: format '%d' expects argument of type 'int', but argument 3 has type 'EMACS_UINT' {aka 'long long unsigned int'} [-Wformat=]
1012 | dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
| ^~~~~~~~~~~~~~~~ ~~~~
| |
| EMACS_UINT {aka long long unsigned int}
pdumper.c:1012:48: note: format string is defined here
1012 | dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
| ~^
| |
| int
| %I64d
pdumper.c:1012:19: warning: too many arguments for format [-Wformat-extra-args]
1012 | dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
| ^~~~~~~~~~~~~~~~
pdumper.c: In function 'dump_queue_dequeue':
pdumper.c:1229:6: warning: format '%d' expects argument of type 'int', but argument 3 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
1229 | (("dump_queue_dequeue basis=%"PRIdDUMP_OFF" fancy=%"PRIdPTR
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1230 | " zero=%"PRIdPTR" normal=%"PRIdPTR" strong=%"PRIdPTR" hash=%td\n"),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1231 | basis,
1232 | dump_tailq_length (&dump_queue->fancy_weight_objects),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| gl_intptr_t {aka long int}
pdumper.c:1229:6: warning: format '%d' expects argument of type 'int', but argument 4 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
pdumper.c:1229:6: warning: format '%d' expects argument of type 'int', but argument 5 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
pdumper.c:1229:6: warning: format '%d' expects argument of type 'int', but argument 6 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
pdumper.c:1229:6: warning: unknown conversion type character 't' in format [-Wformat=]
pdumper.c:1229:6: warning: too many arguments for format [-Wformat-extra-args]
pdumper.c:1310:15: warning: unknown conversion type character 'l' in format [-Wformat=]
1310 | dump_trace (" result score=%f src=%s object=%0*"pI"x\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from character.h:27,
from buffer.h:27,
from pdumper.c:34:
lisp.h:108:17: note: format string is defined here
108 | # define pI "ll"
| ^
pdumper.c:1310:15: warning: too many arguments for format [-Wformat-extra-args]
1310 | dump_trace (" result score=%f src=%s object=%0*"pI"x\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pdumper.c: In function 'hash_table_thaw':
pdumper.c:2667:30: error: conversion from 'EMACS_INT' {aka 'long long int'} to 'ptrdiff_t' {aka 'int'} may change value [-Werror=conversion]
2667 | h->hash = make_nil_vector (XFIXNUM (h->hash));
| ^~~~~~~~~~~~~~~~~
cc1.exe: some warnings being treated as errors
Makefile:401: recipe for target `pdumper.o' failed
make[1]: *** [pdumper.o] Error 1
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 14:52 ` Eli Zaretskii
@ 2020-08-11 15:30 ` Paul Eggert
2020-08-11 17:00 ` Eli Zaretskii
2020-08-11 15:59 ` Pip Cet
1 sibling, 1 reply; 37+ messages in thread
From: Paul Eggert @ 2020-08-11 15:30 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: larsi, 36597, pipcet
On 8/11/20 7:52 AM, Eli Zaretskii wrote:
> It doesn't compile here:
>
> pdumper.c: In function 'dump_queue_enqueue':
> pdumper.c:1012:19: warning: unknown conversion type character 'l' in format [-Wformat=]
> 1012 | dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
> | ^~~~~~~~~~~~~~~~
> In file included from character.h:27,
> from buffer.h:27,
> from pdumper.c:34:
> lisp.h:108:17: note: format string is defined here
> 108 | # define pI "ll"
<https://stackoverflow.com/questions/23718110/error-unknown-conversion-type-character-l-in-format-scanning-long-long>
suggests that this is a problem on MinGW, but pI is supposed to be "I64" on that
platform, not "ll".
What warnings does your compiler generate for the following?
#include <stdio.h>
int a;
long long b;
int main (void) {
printf ("x=%0*llx\n", a, b);
printf ("x=%0*I64x\n", a, b);
return 0;
}
and what are __MINGW32__, __USE_MINGW_ANSI_STDIO, MINGW_W64,
__MINGW32_MAJOR_VERSION, __GNUC__, and __GNUC_MINOR__ on your platform?
On my Fedora 31 platform, the above program causes 'gcc -Wall' to say:
t.c: In function ‘main’:
t.c:6:17: warning: unknown conversion type character ‘I’ in format [-Wformat=]
6 | printf ("x=%0*I64x\n", a, b);
| ^
t.c:6:11: warning: too many arguments for format [-Wformat-extra-args]
6 | printf ("x=%0*I64x\n", a, b);
| ^~~~~~~~~~~~~
which is what I'd expect on Fedora.
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 14:52 ` Eli Zaretskii
2020-08-11 15:30 ` Paul Eggert
@ 2020-08-11 15:59 ` Pip Cet
2020-08-11 17:00 ` Eli Zaretskii
1 sibling, 1 reply; 37+ messages in thread
From: Pip Cet @ 2020-08-11 15:59 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: larsi, Paul Eggert, 36597-done
[-- Attachment #1: Type: text/plain, Size: 551 bytes --]
On Tue, Aug 11, 2020 at 2:52 PM Eli Zaretskii <eliz@gnu.org> wrote:
> pdumper.c: In function 'hash_table_thaw':
> pdumper.c:2667:30: error: conversion from 'EMACS_INT' {aka 'long long int'} to 'ptrdiff_t' {aka 'int'} may change value [-Werror=conversion]
> 2667 | h->hash = make_nil_vector (XFIXNUM (h->hash));
> | ^~~~~~~~~~~~~~~~~
> cc1.exe: some warnings being treated as errors
I suggest going back to Fmake_vector (h->hash, Qnil), as in the
attached patch. It's shorter, and it actually compiles.
[-- Attachment #2: 0001-Fix-wide-int-compilation-issue-in-pdumper.c.patch --]
[-- Type: text/x-patch, Size: 867 bytes --]
From eaaefce564bcee1d76e36105caea9d7937743ad4 Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet@gmail.com>
Date: Tue, 11 Aug 2020 15:52:36 +0000
Subject: [PATCH] Fix wide-int compilation issue in pdumper.c
* src/pdumper.c (hash_table_thaw): Avoid make_nil_vector, which
requires additional casting.
---
src/pdumper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/pdumper.c b/src/pdumper.c
index 94921dc9ea..e132da1551 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2664,7 +2664,7 @@ hash_table_freeze (struct Lisp_Hash_Table *h)
hash_table_thaw (Lisp_Object hash)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
- h->hash = make_nil_vector (XFIXNUM (h->hash));
+ h->hash = Fmake_vector (h->hash, Qnil);
h->next = Fmake_vector (h->next, make_fixnum (-1));
h->index = Fmake_vector (h->index, make_fixnum (-1));
--
2.28.0
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 15:30 ` Paul Eggert
@ 2020-08-11 17:00 ` Eli Zaretskii
2020-08-11 18:11 ` Paul Eggert
0 siblings, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-11 17:00 UTC (permalink / raw)
To: Paul Eggert; +Cc: larsi, 36597, pipcet
> Cc: larsi@gnus.org, pipcet@gmail.com, 36597@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 11 Aug 2020 08:30:23 -0700
>
> > lisp.h:108:17: note: format string is defined here
> > 108 | # define pI "ll"
>
> <https://stackoverflow.com/questions/23718110/error-unknown-conversion-type-character-l-in-format-scanning-long-long>
> suggests that this is a problem on MinGW, but pI is supposed to be "I64" on that
> platform, not "ll".
No, it's supposed to be "ll". The problem is not in lisp.h, it's in
pdumper.c: its declaration of attributes of dump_trace was incorrect
for MinGW. I fixed that.
The warnings about %d vs gl_intptr_t should be fixed in Gnulib, I
think: why does it use 'long int' instead of 'int' on 32-bit
platforms? Or maybe the format in pdumper.c should use %ld instead, I
don't know.
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 15:59 ` Pip Cet
@ 2020-08-11 17:00 ` Eli Zaretskii
2020-08-11 17:31 ` Paul Eggert
0 siblings, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-11 17:00 UTC (permalink / raw)
To: Pip Cet; +Cc: larsi, eggert, 36597-done
> From: Pip Cet <pipcet@gmail.com>
> Date: Tue, 11 Aug 2020 15:59:12 +0000
> Cc: Paul Eggert <eggert@cs.ucla.edu>, larsi@gnus.org, 36597-done@debbugs.gnu.org
>
> I suggest going back to Fmake_vector (h->hash, Qnil), as in the
> attached patch. It's shorter, and it actually compiles.
Yes, I did that, thanks.
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 17:00 ` Eli Zaretskii
@ 2020-08-11 17:31 ` Paul Eggert
2020-08-11 18:27 ` Andy Moreton
2020-08-11 18:32 ` Eli Zaretskii
0 siblings, 2 replies; 37+ messages in thread
From: Paul Eggert @ 2020-08-11 17:31 UTC (permalink / raw)
To: Eli Zaretskii, Pip Cet; +Cc: larsi, 36597-done
[-- Attachment #1: Type: text/plain, Size: 472 bytes --]
On 8/11/20 10:00 AM, Eli Zaretskii wrote:
>> I suggest going back to Fmake_vector (h->hash, Qnil), as in the
>> attached patch. It's shorter, and it actually compiles.
> Yes, I did that, thanks.
The compilation issue was due to pdumper enabling -Wconversion, which causes
more trouble than it cures (but that is a different topic). I worked around the
glitch by installing the attached further patch, which should also help explain
the motivation for make_nil_vector.
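As a side note on the -Wconversion complaint itself, the following is a rough sketch of a different way to narrow a wide integer: check the range explicitly, then cast. This is not what the attached patch does, and narrow_to_ptrdiff is a hypothetical helper, not an Emacs function; 'long long' stands in for a wide EMACS_INT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Emacs API): store WIDE in *OUT if it
   fits in ptrdiff_t and return true; otherwise leave *OUT alone and
   return false.  The explicit cast after the range check is what
   keeps -Wconversion quiet.  */
static bool
narrow_to_ptrdiff (long long wide, ptrdiff_t *out)
{
  assert (out != NULL);
  if (wide < PTRDIFF_MIN || wide > PTRDIFF_MAX)
    return false;
  *out = (ptrdiff_t) wide;
  return true;
}
```

On platforms where ptrdiff_t is as wide as long long the range check can never fail, so a compiler may warn that the comparison is always false; that is harmless here.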
[-- Attachment #2: 0001-Prefer-make_nil_vector-to-make-vector-with-nil.patch --]
[-- Type: text/x-patch, Size: 1906 bytes --]
From 669aeafbd14b0ebb824bacba0a6b3daad30847a9 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 10:29:02 -0700
Subject: [PATCH] Prefer make_nil_vector to make-vector with nil
* src/pdumper.c (hash_table_thaw): Pacify -Wconversion so
we can use make_nil_vector again.
* src/timefns.c (syms_of_timefns): Prefer make_nil_vector
to make_vector with Qnil.
---
src/lisp.h | 3 ++-
src/pdumper.c | 4 +++-
src/timefns.c | 2 +-
3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/src/lisp.h b/src/lisp.h
index d88038d91b..2962babb4f 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3947,7 +3947,8 @@ make_uninit_sub_char_table (int depth, int min_char)
return v;
}
-/* Make a vector of SIZE nils. */
+/* Make a vector of SIZE nils - faster than make_vector (size, Qnil)
+ if the OS already cleared the new memory. */
INLINE Lisp_Object
make_nil_vector (ptrdiff_t size)
diff --git a/src/pdumper.c b/src/pdumper.c
index 6c581bcd0b..aaa760d70d 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2664,7 +2664,9 @@ hash_table_freeze (struct Lisp_Hash_Table *h)
hash_table_thaw (Lisp_Object hash)
{
struct Lisp_Hash_Table *h = XHASH_TABLE (hash);
- h->hash = Fmake_vector (h->hash, Qnil);
+ ALLOW_IMPLICIT_CONVERSION;
+ h->hash = make_nil_vector (XFIXNUM (h->hash));
+ DISALLOW_IMPLICIT_CONVERSION;
h->next = Fmake_vector (h->next, make_fixnum (-1));
h->index = Fmake_vector (h->index, make_fixnum (-1));
diff --git a/src/timefns.c b/src/timefns.c
index 7bcc37d7c1..94cfddf0da 100644
--- a/src/timefns.c
+++ b/src/timefns.c
@@ -2048,7 +2048,7 @@ syms_of_timefns (void)
defsubr (&Scurrent_time_zone);
defsubr (&Sset_time_zone_rule);
- flt_radix_power = make_vector (flt_radix_power_size, Qnil);
+ flt_radix_power = make_nil_vector (flt_radix_power_size);
staticpro (&flt_radix_power);
#ifdef NEED_ZTRILLION_INIT
--
2.17.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 17:00 ` Eli Zaretskii
@ 2020-08-11 18:11 ` Paul Eggert
2020-08-11 18:35 ` Eli Zaretskii
0 siblings, 1 reply; 37+ messages in thread
From: Paul Eggert @ 2020-08-11 18:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: larsi, 36597, pipcet
[-- Attachment #1: Type: text/plain, Size: 608 bytes --]
On 8/11/20 10:00 AM, Eli Zaretskii wrote:
> The warnings about %d vs gl_intptr_t should be fixed in Gnulib, I
> think: why does it use 'long int' instead of 'int' on 32-bit
> platforms? Or maybe the format in pdumper.c should use %ld instead, I
> don't know.
Ah, it's because Emacs uses C99 inttypes.h macros like PRIdPTR without also
using the Gnulib inttypes module which implements these macros on platforms like
MinGW where the macros don't work. This problem occurs elsewhere in Emacs in a
couple of places, we just never noticed it. I installed the attached patch,
which I hope fixes the glitch.
[-- Attachment #2: 0001-Use-Gnulib-inttypes-module.patch --]
[-- Type: text/x-patch, Size: 2150 bytes --]
From 39c16c1170fd8bd7035e6e265048dd371cde4609 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 11 Aug 2020 11:06:39 -0700
Subject: [PATCH] Use Gnulib inttypes module
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Needed for platforms like MinGW that don’t support C99 PRIdPTR.
* admin/merge-gnulib (GNULIB_MODULES): Add inttypes.
* m4/gnulib-comp.m4: Regenerate.
---
admin/merge-gnulib | 2 +-
lib/gnulib.mk.in | 1 +
m4/gnulib-comp.m4 | 2 ++
3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/admin/merge-gnulib b/admin/merge-gnulib
index 3f32536a62..98f7941bd8 100755
--- a/admin/merge-gnulib
+++ b/admin/merge-gnulib
@@ -36,7 +36,7 @@ GNULIB_MODULES=
fchmodat fcntl fcntl-h fdopendir
filemode filename filevercmp flexmember fpieee fstatat fsusage fsync futimens
getloadavg getopt-gnu getrandom gettime gettimeofday gitlog-to-changelog
- ieee754-h ignore-value intprops largefile libgmp lstat
+ ieee754-h ignore-value intprops inttypes largefile libgmp lstat
manywarnings memmem-simple mempcpy memrchr minmax mkostemp mktime nstrftime
pathmax pipe2 pselect pthread_sigmask
qcopy-acl readlink readlinkat regex
diff --git a/lib/gnulib.mk.in b/lib/gnulib.mk.in
index 92d0621c61..e7e9fbdc31 100644
--- a/lib/gnulib.mk.in
+++ b/lib/gnulib.mk.in
@@ -116,6 +116,7 @@
# ieee754-h \
# ignore-value \
# intprops \
+# inttypes \
# largefile \
# libgmp \
# lstat \
diff --git a/m4/gnulib-comp.m4 b/m4/gnulib-comp.m4
index 5bfa1473ed..1f8a87218e 100644
--- a/m4/gnulib-comp.m4
+++ b/m4/gnulib-comp.m4
@@ -113,6 +113,7 @@ AC_DEFUN
# Code from module ignore-value:
# Code from module include_next:
# Code from module intprops:
+ # Code from module inttypes:
# Code from module inttypes-incomplete:
# Code from module largefile:
AC_REQUIRE([AC_SYS_LARGEFILE])
@@ -342,6 +343,7 @@ AC_DEFUN
fi
gl_SYS_TIME_MODULE_INDICATOR([gettimeofday])
gl_IEEE754_H
+ gl_INTTYPES_H
gl_INTTYPES_INCOMPLETE
AC_REQUIRE([gl_LARGEFILE])
gl___INLINE
--
2.17.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 17:31 ` Paul Eggert
@ 2020-08-11 18:27 ` Andy Moreton
2020-08-11 18:32 ` Eli Zaretskii
1 sibling, 0 replies; 37+ messages in thread
From: Andy Moreton @ 2020-08-11 18:27 UTC (permalink / raw)
To: 36597
On Tue 11 Aug 2020, Paul Eggert wrote:
> On 8/11/20 10:00 AM, Eli Zaretskii wrote:
>>> I suggest going back to Fmake_vector (h->hash, Qnil), as in the
>>> attached patch. It's shorter, and it actually compiles.
>> Yes, I did that, thanks.
>
> The compilation issue was due to pdumper enabling -Wconversion, which causes
> more trouble than it cures (but that is a different topic). I worked around
> the glitch by installing the attached further patch, which should also help
> explain the motivation for make_nil_vector.
Eli's fixes have addressed issues with the MinGW toolchains, but Mingw64
64bit builds are still broken:
C:/emacs/git/emacs/master/src/pdumper.c: In function 'dump_read_all':
C:/emacs/git/emacs/master/src/pdumper.c:5078:60: error: conversion from 'size_t' {aka 'long long unsigned int'} to 'unsigned int' may change value [-Werror=conversion]
5078 | ssize_t chunk = read (fd, (char *) buf + bytes_read, chunk_to_read);
| ^~~~~~~~~~~~~
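One way this kind of error could be pacified, sketched here under the assumption that the platform's read takes an 'unsigned int' count as the message above indicates: clamp the remaining byte count to an int-sized limit first, and only then cast. The helper name clamp_chunk is invented for this sketch and is not in pdumper.c; the limit mirrors the "INT_MAX >> 18 << 18" expression from the earlier patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: return how many bytes to request from one
   read call, given REMAINING bytes still wanted.  The result always
   fits in 'unsigned int', so passing it to a read that takes an
   unsigned count involves no implicit narrowing.  */
static unsigned int
clamp_chunk (size_t remaining)
{
  /* Largest multiple of 2**18 that does not exceed INT_MAX.  */
  const unsigned int max_rw_count = (unsigned int) INT_MAX >> 18 << 18;
  unsigned int chunk = (remaining < max_rw_count
                        ? (unsigned int) remaining
                        : max_rw_count);
  assert (chunk <= remaining);
  return chunk;
}
```

The cast is only reached after the comparison has shown that the value fits, so it cannot change the value.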
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 17:31 ` Paul Eggert
2020-08-11 18:27 ` Andy Moreton
@ 2020-08-11 18:32 ` Eli Zaretskii
1 sibling, 0 replies; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-11 18:32 UTC (permalink / raw)
To: Paul Eggert; +Cc: larsi, 36597-done, pipcet
> Cc: larsi@gnus.org, 36597-done@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 11 Aug 2020 10:31:24 -0700
>
> The compilation issue was due to pdumper enabling -Wconversion, which causes
> more trouble than it cures (but that is a different topic). I worked around the
> glitch by installing the attached further patch, which should also help explain
> the motivation for make_nil_vector.
Is it really worth it? The code is now a kind of puzzle (what do
those ALLOW/DISALLOW_IMPLICIT_CONVERSION macros do?), and Fmake_vector
is a very thin wrapper around the C function it calls. So I think
this change is for the worse, sorry.
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 18:11 ` Paul Eggert
@ 2020-08-11 18:35 ` Eli Zaretskii
2020-08-11 18:55 ` Eli Zaretskii
2020-08-11 23:43 ` Paul Eggert
0 siblings, 2 replies; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-11 18:35 UTC (permalink / raw)
To: Paul Eggert; +Cc: larsi, 36597, pipcet
> Cc: larsi@gnus.org, pipcet@gmail.com, 36597@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 11 Aug 2020 11:11:20 -0700
>
> Ah, it's because Emacs uses C99 inttypes.h macros like PRIdPTR without also
> using the Gnulib inttypes module which implements these macros on platforms like
> MinGW where the macros don't work. This problem occurs elsewhere in Emacs in a
> couple of places, we just never noticed it. I installed the attached patch,
> which I hope fixes the glitch.
It doesn't, because we avoid the Gnulib inttypes module on MinGW.
I don't understand why it's needed; there's nothing wrong with MinGW's
inttypes.h header.
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 18:35 ` Eli Zaretskii
@ 2020-08-11 18:55 ` Eli Zaretskii
2020-08-11 23:43 ` Paul Eggert
1 sibling, 0 replies; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-11 18:55 UTC (permalink / raw)
To: eggert; +Cc: larsi, 36597, pipcet
> Date: Tue, 11 Aug 2020 21:35:52 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: larsi@gnus.org, 36597@debbugs.gnu.org, pipcet@gmail.com
>
> I don't understand why it's needed; there's nothing wrong with MinGW's
> inttypes.h header.
And, btw, Gnulib's inttypes.h does this:
#if !defined PRIdPTR
# ifdef INTPTR_MAX
# define PRIdPTR @PRIPTR_PREFIX@ "d"
# endif
#endif
But since MinGW's inttypes.h does provide PRIdPTR, this will do
nothing, so it cannot possibly help here. Am I missing something?
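To make the point concrete, here is a small sketch of such a guard in isolation. The names system_pridptr and fallback_was_noop are invented for the illustration, and the "ld" in the fallback is an arbitrary placeholder standing in for the configure-time substitution; it assumes a system <inttypes.h> that already defines PRIdPTR.

```c
#include <assert.h>
#include <inttypes.h>
#include <string.h>

/* Record what the system header provided before the fallback runs.  */
#ifdef PRIdPTR
static const char system_pridptr[] = PRIdPTR;
#else
static const char system_pridptr[] = "";
#endif

/* Fallback in the style of the guard quoted above.  "ld" is only a
   placeholder for this sketch.  */
#if !defined PRIdPTR
# define PRIdPTR "ld"
#endif

/* Return 1 if the system definition existed and the guarded fallback
   left it untouched, 0 otherwise.  */
static int
fallback_was_noop (void)
{
  return (system_pridptr[0] != '\0'
          && strcmp (system_pridptr, PRIdPTR) == 0);
}
```

With a header that supplies PRIdPTR, "assert (fallback_was_noop ());" should hold, whatever the fallback would have defined.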
^ permalink raw reply [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 18:35 ` Eli Zaretskii
2020-08-11 18:55 ` Eli Zaretskii
@ 2020-08-11 23:43 ` Paul Eggert
2020-08-12 14:10 ` Eli Zaretskii
1 sibling, 1 reply; 37+ messages in thread
From: Paul Eggert @ 2020-08-11 23:43 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: larsi, 36597, pipcet
[-- Attachment #1: Type: text/plain, Size: 1138 bytes --]
On 8/11/20 11:35 AM, Eli Zaretskii wrote:
> It doesn't, because we avoid the Gnulib inttypes module on MinGW.
In that case perhaps I should revert the change that added the Gnulib inttypes
module, as MS-Windows is the only currently-active platform with PRIdPTR etc.
problems that I've heard of.
> I don't understand why it's needed; there's nothing wrong with MinGW's
> inttypes.h header.
I don't know what the problems with MS-Windows are or were. Perhaps they're
fixed on all development environments we know about. That would suggest
reverting the inttypes change too.
Does the attached simplification pacify GCC on MinGW? If so, that could be
combined with reverting the inttypes change.
Does the following standalone program compile OK with 'gcc -Wall' on MinGW? If
so, why does the same thing not work when compiling Emacs? The error message you
quoted in Bug#36597#67 suggests that PRIdPTR is "d" whereas intptr_t is 'long'
which means the following program should run afoul of MinGW.
#include <inttypes.h>
#include <stdio.h>
char buf[1000];
intptr_t ip;
int main (void) {
return sprintf (buf, "%"PRIdPTR, ip);
}
[-- Attachment #2: pdumper.diff --]
[-- Type: text/x-patch, Size: 1293 bytes --]
diff --git a/src/pdumper.c b/src/pdumper.c
index 7708bc892f..0bc48aedbe 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -143,8 +143,6 @@ #define DUMP_OFF_MIN INT_LEAST32_MIN
#define DUMP_OFF_MAX INT_LEAST32_MAX
#define PRIdDUMP_OFF PRIdLEAST32
-enum { EMACS_INT_XDIGITS = (EMACS_INT_WIDTH + 3) / 4 };
-
static void ATTRIBUTE_FORMAT_PRINTF (1, 2)
dump_trace (const char *fmt, ...)
{
@@ -1008,9 +1006,7 @@ dump_queue_enqueue (struct dump_queue *dump_queue,
if (NILP (weights))
{
/* Object is new. */
- EMACS_UINT uobj = XLI (object);
- dump_trace ("new object %0*"pI"x weight=%d\n", EMACS_INT_XDIGITS, uobj,
- weight.value);
+ dump_trace ("new object %p weight=%d\n", XLP (object), weight.value);
if (weight.value == WEIGHT_NONE.value)
{
@@ -1306,10 +1302,9 @@ dump_queue_dequeue (struct dump_queue *dump_queue, dump_off basis)
else
emacs_abort ();
- EMACS_UINT uresult = XLI (result);
- dump_trace (" result score=%f src=%s object=%0*"pI"x\n",
+ dump_trace (" result score=%f src=%s object=%p\n",
best < 0 ? -1.0 : (double) candidates[best].score,
- src, EMACS_INT_XDIGITS, uresult);
+ src, XLP (result));
{
Lisp_Object weights = Fgethash (result, dump_queue->link_weights, Qnil);
^ permalink raw reply related [flat|nested] 37+ messages in thread
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-11 23:43 ` Paul Eggert
@ 2020-08-12 14:10 ` Eli Zaretskii
2020-08-12 14:46 ` Eli Zaretskii
2020-08-12 19:11 ` Paul Eggert
0 siblings, 2 replies; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-12 14:10 UTC (permalink / raw)
To: Paul Eggert; +Cc: larsi, 36597, pipcet
> Cc: larsi@gnus.org, pipcet@gmail.com, 36597@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 11 Aug 2020 16:43:16 -0700
>
> On 8/11/20 11:35 AM, Eli Zaretskii wrote:
> > It doesn't, because we avoid the Gnulib inttypes module on MinGW.
>
> In that case perhaps I should revert the change that added the Gnulib inttypes
> module, as MS-Windows is the only currently-active platform with PRIdPTR etc.
> problems that I've heard of.
If that module is in our repository only because of MS-Windows, then
it indeed isn't needed.
> > I don't understand why it's needed; there's nothing wrong with MinGW's
> > inttypes.h header.
>
> I don't know what the problems with MS-Windows are or were. Perhaps they're
> fixed on all development environments we know about. That would suggest
> reverting the inttypes change too.
I see no problems in MinGW headers with PRIdPTR nor with intptr_t.
> Does the attached simplification pacify GCC on MinGW? If so, that could be
> combined with reverting the inttypes change.
The warnings disappeared because you installed a change that no
longer uses the GCC warning options which triggered them. So I'm
unsure how you'd like me to test the patch, please elaborate.
> Does the following standalone program compile OK with 'gcc -Wall' on MinGW? If
> so, why does the same thing not work when compiling Emacs? The error message you
> quoted in Bug#36597#67 suggests that PRIdPTR is "d" whereas intptr_t is 'long'
> which means the following program should run afoul of MinGW.
The problem is not with MinGW's definition of intptr_t: it is
typedef'ed as 'int' in 32-bit builds in the MinGW headers. The
problem is not with intptr_t, it's with its Gnulib equivalent:
pdumper.c:1229:6: warning: format '%d' expects argument of type 'int', but argument 4 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
It complains about gl_intptr_t, not intptr_t. That's because Gnulib's
stdint.h does this:
# ifdef _WIN64
typedef long long int gl_intptr_t;
typedef unsigned long long int gl_uintptr_t;
# else
typedef long int gl_intptr_t;
typedef unsigned long int gl_uintptr_t;
# endif
# define intptr_t gl_intptr_t
# define uintptr_t gl_uintptr_t
I don't understand why it uses 'long int' on 32-bit platforms; it looks
gratuitous, especially since MinGW itself uses just 'int'. (Another
question is why Gnulib thinks it needs to redefine intptr_t, but if
the redefinition was correct, this would not be especially important.)
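
A minimal sketch of the invariant at stake, assuming a C11 compiler: the
conversion macro (PRIdPTR) and the type (intptr_t) have to come from
definitions that agree with each other. The helper names
intptr_underlying_type and format_intptr are hypothetical and appear
neither in Emacs nor in Gnulib. If a replacement header redefines
intptr_t as 'long int' while the platform's PRIdPTR stays "d", the first
helper would report "long" and -Wformat would flag the second.

```c
#include <assert.h>    /* static_assert (C11) */
#include <inttypes.h>  /* PRIdPTR */
#include <stdint.h>    /* intptr_t */
#include <stdio.h>     /* snprintf, size_t */

/* Sanity check only: intptr_t must be wide enough for a pointer value.  */
static_assert (sizeof (intptr_t) >= sizeof (void *),
               "intptr_t narrower than a pointer");

/* Name the underlying type of intptr_t as this translation unit sees it.
   Hypothetical helper, for illustration only.  */
static const char *
intptr_underlying_type (void)
{
  return _Generic ((intptr_t) 0,
                   int: "int",
                   long: "long",
                   long long: "long long",
                   default: "other");
}

/* Format V with the conversion macro that is supposed to match
   intptr_t.  This is only warning-free when PRIdPTR and intptr_t
   agree on the underlying type.  */
static int
format_intptr (char *buf, size_t size, intptr_t v)
{
  return snprintf (buf, size, "%" PRIdPTR, v);
}
```

Calling intptr_underlying_type () in a build that shows the warning is
one quick way to see which definition of intptr_t is actually in effect.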
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-12 14:10 ` Eli Zaretskii
@ 2020-08-12 14:46 ` Eli Zaretskii
2020-08-12 19:11 ` Paul Eggert
1 sibling, 0 replies; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-12 14:46 UTC (permalink / raw)
To: eggert; +Cc: larsi, 36597, pipcet
> Date: Wed, 12 Aug 2020 17:10:34 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: larsi@gnus.org, 36597@debbugs.gnu.org, pipcet@gmail.com
>
> > Does the attached simplification pacify GCC on MinGW? If so, that could be
> > combined with reverting the inttypes change.
>
> The warnings disappeared because you installed a change that no
> longer uses the GCC warning options which triggered them. So I'm
> unsure how you'd like me to test the patch, please elaborate.
Sorry, I wasn't paying attention: the warnings did not disappear.
I've now tried the proposed changes, but they don't affect the code
which is cited in the warning messages. The offending code is here:
dump_trace
(("dump_queue_dequeue basis=%"PRIdDUMP_OFF" fancy=%"PRIdPTR
" zero=%"PRIdPTR" normal=%"PRIdPTR" strong=%"PRIdPTR" hash=%td\n"),
basis,
dump_tailq_length (&dump_queue->fancy_weight_objects),
dump_tailq_length (&dump_queue->zero_weight_objects),
dump_tailq_length (&dump_queue->one_weight_normal_objects),
dump_tailq_length (&dump_queue->one_weight_strong_objects),
XHASH_TABLE (dump_queue->link_weights)->count);
And it triggers warnings because intptr_t is redefined to be 'long
int', whereas PRIdPTR is "d". Here are the warnings again:
pdumper.c: In function 'dump_queue_dequeue':
pdumper.c:1214:6: warning: format '%d' expects argument of type 'int', but argument 3 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
1214 | (("dump_queue_dequeue basis=%"PRIdDUMP_OFF" fancy=%"PRIdPTR
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1215 | " zero=%"PRIdPTR" normal=%"PRIdPTR" strong=%"PRIdPTR" hash=%td\n"),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1216 | basis,
1217 | dump_tailq_length (&dump_queue->fancy_weight_objects),
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| gl_intptr_t {aka long int}
pdumper.c:1214:6: warning: format '%d' expects argument of type 'int', but argument 4 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
pdumper.c:1214:6: warning: format '%d' expects argument of type 'int', but argument 5 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
pdumper.c:1214:6: warning: format '%d' expects argument of type 'int', but argument 6 has type 'gl_intptr_t' {aka 'long int'} [-Wformat=]
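
For illustration only, and not the fix adopted in this thread: a generic
way to make such a call independent of how intptr_t is typedef'ed is to
widen each argument to intmax_t and print it with PRIdMAX. The function
name format_queue_lengths is hypothetical. The sketch assumes that
intmax_t and PRIdMAX themselves agree, which a replacement header could
in principle break in the same way.

```c
#include <assert.h>    /* assert */
#include <inttypes.h>  /* PRIdMAX */
#include <stdint.h>    /* intptr_t, intmax_t */
#include <stdio.h>     /* snprintf, size_t, NULL */

/* Hypothetical helper: format two queue lengths without depending on
   whether intptr_t is 'int', 'long int' or 'long long int'.  */
static int
format_queue_lengths (char *buf, size_t size, intptr_t fancy, intptr_t zero)
{
  assert (buf != NULL && size > 0);
  /* The explicit casts make each argument exactly intmax_t, which is
     the type PRIdMAX is specified to print.  */
  return snprintf (buf, size, "fancy=%" PRIdMAX " zero=%" PRIdMAX,
                   (intmax_t) fancy, (intmax_t) zero);
}
```

The cost is a cast at every call site, which is why correcting the
typedef or the macro at its source is usually preferable.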
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-12 14:10 ` Eli Zaretskii
2020-08-12 14:46 ` Eli Zaretskii
@ 2020-08-12 19:11 ` Paul Eggert
2020-08-12 19:28 ` Eli Zaretskii
1 sibling, 1 reply; 37+ messages in thread
From: Paul Eggert @ 2020-08-12 19:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: larsi, 36597, pipcet
On 8/12/20 7:10 AM, Eli Zaretskii wrote:
> If that module is in our repository only because of MS-Windows, then
> it indeed isn't needed.
OK, I removed it.
> I don't understand why it uses 'long int' on 32-bit platforms; it looks
> gratuitous, especially since MinGW itself uses just 'int'. (Another
> question is why Gnulib thinks it needs to redefine intptr_t, but if
> the redefinition was correct, this would not be especially important.)
As I recall the idea was to not worry about the plethora of buggy intptr_t
implementations at the time, and just substitute Gnulib's own. Nowadays perhaps
that decision should be revisited.
I looked into the MinGW situation and the problem seems to be that MinGW defined
a macro _INTPTR_T_DEFINED that it no longer defines, and Gnulib was keying off
that no-longer-present macro. I installed a patch for that in Gnulib here:
https://lists.gnu.org/r/bug-gnulib/2020-08/msg00088.html
and migrated the patch into Emacs. Hope it fixes things.
As an aside, we're spending too much time on pdumper.c code that has no effect
because dump_trace never outputs anything. How about if I remove dump_trace and
its callers? Although dump_trace may have been useful when the portable dumper
got developed, it's just a developer time sink now.
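
On the dump_trace aside, a minimal sketch of why a tracing function that
prints nothing can still produce -Wformat warnings: once a function is
declared printf-like, GCC checks the format string against the arguments
at every call site, whether or not the body ever writes output. The names
sketch_trace, sketch_trace_enabled and SKETCH_PRINTF_LIKE are
hypothetical; this is not the definition used in pdumper.c.

```c
#include <assert.h>  /* assert */
#include <stdarg.h>  /* va_list, va_start, va_end */
#include <stdio.h>   /* vfprintf, stderr, NULL */

#if defined __GNUC__
# define SKETCH_PRINTF_LIKE __attribute__ ((format (printf, 1, 2)))
#else
# define SKETCH_PRINTF_LIKE
#endif

/* Hypothetical switch; zero means tracing is off.  */
static int sketch_trace_enabled = 0;

static int sketch_trace (const char *fmt, ...) SKETCH_PRINTF_LIKE;

/* Print a trace message to stderr when tracing is on, and return the
   number of characters written (0 when tracing is off).  The format
   attribute above makes the compiler check callers either way.  */
static int
sketch_trace (const char *fmt, ...)
{
  assert (fmt != NULL);
  if (!sketch_trace_enabled)
    return 0;

  va_list ap;
  va_start (ap, fmt);
  int n = vfprintf (stderr, fmt, ap);
  va_end (ap);
  return n;
}
```

With tracing off, a call such as sketch_trace ("n=%d\n", 7) returns 0 and
writes nothing, yet passing a 'long int' for that %d would still be
diagnosed at compile time.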
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-12 19:11 ` Paul Eggert
@ 2020-08-12 19:28 ` Eli Zaretskii
2020-08-12 20:41 ` Andy Moreton
0 siblings, 1 reply; 37+ messages in thread
From: Eli Zaretskii @ 2020-08-12 19:28 UTC (permalink / raw)
To: Paul Eggert, Daniel Colascione; +Cc: larsi, 36597, pipcet
> Cc: larsi@gnus.org, pipcet@gmail.com, 36597@debbugs.gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 12 Aug 2020 12:11:09 -0700
>
> I looked into the MinGW situation and the problem seems to be that MinGW defined
> a macro _INTPTR_T_DEFINED that it no longer defines, and Gnulib was keying off
> that no-longer-present macro.
I think _INTPTR_T_DEFINED is still being used, but only by MinGW64. I
use mingw.org's MinGW, where that macro was never used.
However, both MinGW flavors typedef intptr_t as 'int', not 'long int',
on 32-bit platforms.
> I installed a patch for that in Gnulib here:
>
> https://lists.gnu.org/r/bug-gnulib/2020-08/msg00088.html
>
> and migrated the patch into Emacs. Hope it fixes things.
It does here, thanks. I hope someone will be able to make sure
MinGW64 builds are not adversely affected (I don't think they should
be).
> As an aside, we're spending too much time on pdumper.c code that has no effect
> because dump_trace never outputs anything. How about if I remove dump_trace and
> its callers? Although dump_trace may have been useful when the portable dumper
> got developed, it's just a developer time sink now.
I have no opinion on this, but I'd like to hear from Daniel (CC'ed)
what he thinks.
* bug#36597: 27.0.50; rehash hash tables eagerly in pdumper
2020-08-12 19:28 ` Eli Zaretskii
@ 2020-08-12 20:41 ` Andy Moreton
0 siblings, 0 replies; 37+ messages in thread
From: Andy Moreton @ 2020-08-12 20:41 UTC (permalink / raw)
To: 36597
On Wed 12 Aug 2020, Eli Zaretskii wrote:
>> Cc: larsi@gnus.org, pipcet@gmail.com, 36597@debbugs.gnu.org
>> From: Paul Eggert <eggert@cs.ucla.edu>
>> Date: Wed, 12 Aug 2020 12:11:09 -0700
>>
>> I looked into the MinGW situation and the problem seems to be that MinGW defined
>> a macro _INTPTR_T_DEFINED that it no longer defines, and Gnulib was keying off
>> that no-longer-present macro.
>
> I think _INTPTR_T_DEFINED is still being used, but only by MinGW64. I
> use mingw.org's MinGW, where that macro was never used.
Yes, MinGW64 still defines this in corecrt.h, and in _cygwin.h (which I
think is to support cross-compiling to cygwin).
> However, both MinGW flavors typedef intptr_t as 'int', not 'long int',
> on 32-bit platforms.
Agreed.
>> I installed a patch for that in Gnulib here:
>>
>> https://lists.gnu.org/r/bug-gnulib/2020-08/msg00088.html
>>
>> and migrated the patch into Emacs. Hope it fixes things.
>
> It does here, thanks. I hope someone will be able to make sure
> MinGW64 builds are not adversely affected (I don't think they should
> be).
On msys2, 32bit mingw64 and 64bit mingw64 builds are both ok (using
commit fd6058b8).
AndyM
End of thread; last message 2020-08-12 20:41 UTC.
Thread overview: 37+ messages
2019-07-11 14:05 bug#36597: 27.0.50; rehash hash tables eagerly in pdumper Pip Cet
2019-07-14 14:39 ` Paul Eggert
2019-07-14 15:01 ` Pip Cet
2019-07-14 15:49 ` Paul Eggert
2019-07-14 16:54 ` Pip Cet
2019-07-15 14:39 ` Pip Cet
2019-07-19 7:23 ` Pip Cet
2019-07-19 7:46 ` Eli Zaretskii
2019-07-20 12:38 ` Pip Cet
2019-07-21 3:18 ` Paul Eggert
2019-07-21 5:34 ` Pip Cet
2019-07-21 6:32 ` Paul Eggert
2019-07-21 6:32 ` Pip Cet
2020-08-09 19:27 ` Lars Ingebrigtsen
2020-08-10 11:51 ` Pip Cet
2020-08-10 13:04 ` Lars Ingebrigtsen
2020-08-11 9:33 ` Paul Eggert
2020-08-11 9:40 ` Pip Cet
2020-08-11 11:50 ` Lars Ingebrigtsen
2020-08-11 14:52 ` Eli Zaretskii
2020-08-11 15:30 ` Paul Eggert
2020-08-11 17:00 ` Eli Zaretskii
2020-08-11 18:11 ` Paul Eggert
2020-08-11 18:35 ` Eli Zaretskii
2020-08-11 18:55 ` Eli Zaretskii
2020-08-11 23:43 ` Paul Eggert
2020-08-12 14:10 ` Eli Zaretskii
2020-08-12 14:46 ` Eli Zaretskii
2020-08-12 19:11 ` Paul Eggert
2020-08-12 19:28 ` Eli Zaretskii
2020-08-12 20:41 ` Andy Moreton
2020-08-11 15:59 ` Pip Cet
2020-08-11 17:00 ` Eli Zaretskii
2020-08-11 17:31 ` Paul Eggert
2020-08-11 18:27 ` Andy Moreton
2020-08-11 18:32 ` Eli Zaretskii
2019-07-18 5:39 ` Eli Zaretskii