unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* Re: [Emacs-diffs] master 515afc9 6/6: Fix crash if user test munges hash table
       [not found] ` <20190721031354.BA68420BE2@vcs0.savannah.gnu.org>
@ 2019-07-21  9:57   ` Pip Cet
  2019-07-21 18:39     ` Paul Eggert
  0 siblings, 1 reply; 4+ messages in thread
From: Pip Cet @ 2019-07-21  9:57 UTC (permalink / raw)
  To: emacs-devel, Paul Eggert; +Cc: emacs-diffs

On Sun, Jul 21, 2019 at 3:14 AM Paul Eggert <eggert@cs.ucla.edu> wrote:
> diff --git a/src/alloc.c b/src/alloc.c
> index 09b3a4e..1718ce0 100644
> --- a/src/alloc.c
> +++ b/src/alloc.c
> @@ -5352,6 +5352,7 @@ purecopy_hash_table (struct Lisp_Hash_Table *table)
>    pure->count = table->count;
>    pure->next_free = table->next_free;
>    pure->purecopy = table->purecopy;
> +  eassert (!pure->mutable);
>    pure->rehash_threshold = table->rehash_threshold;
>    pure->rehash_size = table->rehash_size;
>    pure->key_and_value = purecopy (table->key_and_value);

I don't think this eassert () is completely safe.  pure_alloc will
return uninitialized memory if pure space has overflowed, so it's
possible the new table is marked as mutable.

> diff --git a/test/src/fns-tests.el b/test/src/fns-tests.el
> index 9d4ae4f..7d56da7 100644
> --- a/test/src/fns-tests.el
> +++ b/test/src/fns-tests.el
> @@ -846,4 +846,16 @@
>    (should (not (proper-list-p (make-bool-vector 0 nil))))
>    (should (not (proper-list-p (make-symbol "a")))))
>
> +(ert-deftest test-hash-function-that-mutates-hash-table ()
> +  (define-hash-table-test 'badeq 'eq 'bad-hash)
> +  (let ((h (make-hash-table :test 'badeq :size 1 :rehash-size 1)))
> +    (defun bad-hash (k)
> +      (if (eq k 100)
> +         (clrhash h))
> +      (sxhash-eq k))
> +    (should-error
> +     (dotimes (k 200)
> +       (puthash k k h)))
> +    (should (= 100 (hash-table-count h)))))
> +
>  (provide 'fns-tests)

Is it really necessary to cater to code such as this? I thought the
general line was that it was okay for bad Lisp code to crash Emacs in
exceptional circumstances, such as by building bad bytecode objects or
by doing silly things in a user-defined hash function...



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Emacs-diffs] master 5018b66 4/6: Inhibit GC after inhibit_garbage_collection
       [not found] ` <20190721031353.DBCBC20BE2@vcs0.savannah.gnu.org>
@ 2019-07-21 14:01   ` Stefan Monnier
  2019-07-21 18:42     ` Paul Eggert
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Monnier @ 2019-07-21 14:01 UTC (permalink / raw)
  To: emacs-devel; +Cc: Paul Eggert

>     * src/alloc.c (garbage_collection_inhibited): New var.

While I like the idea, I worry about its impact on performance:
when garbage_collection_inhibited is true and

       consing_since_gc > gc_cons_threshold
       && consing_since_gc > gc_relative_threshold

`maybe_gc` will constantly call `garbage_collect`.  Maybe we could
circumvent this by temporarily bumping gc_relative_threshold?


        Stefan




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Emacs-diffs] master 515afc9 6/6: Fix crash if user test munges hash table
  2019-07-21  9:57   ` [Emacs-diffs] master 515afc9 6/6: Fix crash if user test munges hash table Pip Cet
@ 2019-07-21 18:39     ` Paul Eggert
  0 siblings, 0 replies; 4+ messages in thread
From: Paul Eggert @ 2019-07-21 18:39 UTC (permalink / raw)
  To: Pip Cet, emacs-devel; +Cc: emacs-diffs

[-- Attachment #1: Type: text/plain, Size: 1399 bytes --]

Pip Cet wrote:

> I don't think this eassert () is completely safe.  pure_alloc will
> return uninitialized memory if pure space has overflowed, so it's
> possible the new table is marked as mutable.

Good point. I wonder what other code has similar problems? Rather than spend 
time looking for that, I installed the first attached patch, which clears that 
uninitialized memory. If pure space is going away anyway I'm not sure it's worth 
the trouble to audit the code for similar problems elsewhere.

> Is it really necessary to cater to code such as this? I thought the
> general line was that it was okay for bad Lisp code to crash Emacs in
> exceptional circumstances, such as by building bad bytecode objects or
> by doing silly things in a user-defined hash function...

Generally speaking Emacs should not crash. There is currently an exception for 
bad bytecode objects, but we really should fix that (presumably by verifying the 
bytecode before executing it, to avoid slowdown during execution).

While we're on the topic, we should better document how user-defined hash 
functions should behave. I installed the second attached patch to do that. It 
also clarifies that hash codes are fixnums.

By code inspection I found some problems that I recently introduced, involving 
very large hash tables and/or 32-bit platforms --with-wide-int, and fixed them 
with the third attached patch.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-pure_alloc-returns-cleared-memory.patch --]
[-- Type: text/x-patch; name="0001-pure_alloc-returns-cleared-memory.patch", Size: 2100 bytes --]

From 4a1507b88e813e3d54614f4cb59211234e05334a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 21 Jul 2019 11:20:07 -0700
Subject: [PATCH 1/3] pure_alloc returns cleared memory
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/alloc.c (pure_alloc): Clear any heap-allocated storage.
This is simpler than auditing all the callers to make sure
they don’t assume pure memory is cleared memory, and the
performance implication is nonexistent except when Emacs
is misconfigured.  Also, add an assertion to catch
caller misuse when pure space is exhausted.
---
 src/alloc.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/alloc.c b/src/alloc.c
index 1718ce0faf..b7ba886482 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -5086,7 +5086,13 @@ valid_lisp_object_p (Lisp_Object obj)
 /* Allocate room for SIZE bytes from pure Lisp storage and return a
    pointer to it.  TYPE is the Lisp type for which the memory is
    allocated.  TYPE < 0 means it's not used for a Lisp object,
-   and that the result should have an alignment of -TYPE.  */
+   and that the result should have an alignment of -TYPE.
+
+   The bytes are initially zero.
+
+   If pure space is exhausted, allocate space from the heap.  This is
+   merely an expedient to let Emacs warn that pure space was exhausted
+   and that Emacs should be rebuilt with a larger pure space.  */
 
 static void *
 pure_alloc (size_t size, int type)
@@ -5119,8 +5125,10 @@ pure_alloc (size_t size, int type)
   /* Don't allocate a large amount here,
      because it might get mmap'd and then its address
      might not be usable.  */
-  purebeg = xmalloc (10000);
-  pure_size = 10000;
+  int small_amount = 10000;
+  eassert (size <= small_amount - LISP_ALIGNMENT);
+  purebeg = xzalloc (small_amount);
+  pure_size = small_amount;
   pure_bytes_used_before_overflow += pure_bytes_used - size;
   pure_bytes_used = 0;
   pure_bytes_used_lisp = pure_bytes_used_non_lisp = 0;
-- 
2.17.1


[-- Attachment #3: 0002-Improve-doc-for-hash-tables.patch --]
[-- Type: text/x-patch, Size: 6291 bytes --]

From cf285946bee56912286f75e4d1215214bc7c5b4b Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 21 Jul 2019 11:20:07 -0700
Subject: [PATCH 2/3] Improve doc for hash tables

* doc/lispref/hash.texi (Creating Hash, Defining Hash):
* src/fns.c (Fsxhash_eq, Fsxhash_eql, Fsxhash_equal):
Say that hashes are fixnums.
(Fmake_hash_table): Say that that an integer rehash-size
should be a fixnum.
* doc/lispref/hash.texi (Defining Hash): Say that hash and
comparison functions should be consistent and pure, and should
return quickly.
---
 doc/lispref/hash.texi | 27 ++++++++++++++++-----------
 src/fns.c             |  8 ++++----
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/doc/lispref/hash.texi b/doc/lispref/hash.texi
index 9b900e6309..051531491c 100644
--- a/doc/lispref/hash.texi
+++ b/doc/lispref/hash.texi
@@ -132,7 +132,7 @@ Creating Hash
 it grows automatically.  This value specifies how to make the hash table
 larger, at that time.
 
-If @var{rehash-size} is an integer, it should be positive, and the hash
+If @var{rehash-size} is a fixnum, it should be positive and the hash
 table grows by adding approximately that much to the nominal size.  If
 @var{rehash-size} is floating point, it had better be greater
 than 1, and the hash table grows by multiplying the old size by
@@ -239,14 +239,19 @@ Defining Hash
 
   You can think of a hash table conceptually as a large array of many
 slots, each capable of holding one association.  To look up a key,
-@code{gethash} first computes an integer, the hash code, from the key.
-It reduces this integer modulo the length of the array, to produce an
+@code{gethash} first computes a fixnum, the hash code, from the key.
+It reduces this fixnum modulo the length of the array, to produce an
 index in the array.  Then it looks in that slot, and if necessary in
 other nearby slots, to see if it has found the key being sought.
 
   Thus, to define a new method of key lookup, you need to specify both a
 function to compute the hash code from a key, and a function to compare
-two keys directly.
+two keys directly.  The two functions should be consistent with each
+other: that is, two keys' hash codes should be the same if the keys
+compare as equal.  Also, since the two functions can be called at any
+time (such as by the garbage collector), the functions should be free
+of side effects and should return quickly, and their behavior should
+depend on only on properties of the keys that do not change.
 
 @defun define-hash-table-test name test-fn hash-fn
 This function defines a new hash table test, named @var{name}.
@@ -260,9 +265,9 @@ Defining Hash
 return non-@code{nil} if they are considered the same.
 
 The function @var{hash-fn} should accept one argument, a key, and return
-an integer that is the hash code of that key.  For good results, the
-function should use the whole range of integers for hash codes,
-including negative integers.
+a fixnum that is the hash code of that key.  For good results, the
+function should use the whole range of fixnums for hash codes,
+including negative fixnums.
 
 The specified functions are stored in the property list of @var{name}
 under the property @code{hash-table-test}; the property value's form is
@@ -271,12 +276,12 @@ Defining Hash
 
 @defun sxhash-equal obj
 This function returns a hash code for Lisp object @var{obj}.
-This is an integer which reflects the contents of @var{obj}
+This is a fixnum that reflects the contents of @var{obj}
 and the other Lisp objects it points to.
 
 If two objects @var{obj1} and @var{obj2} are @code{equal}, then
 @code{(sxhash-equal @var{obj1})} and @code{(sxhash-equal @var{obj2})}
-are the same integer.
+are the same fixnum.
 
 If the two objects are not @code{equal}, the values returned by
 @code{sxhash-equal} are usually different, but not always; once in a
@@ -294,7 +299,7 @@ Defining Hash
 
 If two objects @var{obj1} and @var{obj2} are @code{eq}, then
 @code{(sxhash-eq @var{obj1})} and @code{(sxhash-eq @var{obj2})} are
-the same integer.
+the same fixnum.
 @end defun
 
 @defun sxhash-eql obj
@@ -305,7 +310,7 @@ Defining Hash
 
 If two objects @var{obj1} and @var{obj2} are @code{eql}, then
 @code{(sxhash-eql @var{obj1})} and @code{(sxhash-eql @var{obj2})} are
-the same integer.
+the same fixnum.
 @end defun
 
   This example creates a hash table whose keys are strings that are
diff --git a/src/fns.c b/src/fns.c
index d7e123122d..819eaec7c7 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4700,7 +4700,7 @@ sxhash (Lisp_Object obj, int depth)
  ***********************************************************************/
 
 DEFUN ("sxhash-eq", Fsxhash_eq, Ssxhash_eq, 1, 1, 0,
-       doc: /* Return an integer hash code for OBJ suitable for `eq'.
+       doc: /* Return a fixnum hash code for OBJ suitable for `eq'.
 If (eq A B), then (= (sxhash-eq A) (sxhash-eq B)).
 
 Hash codes are not guaranteed to be preserved across Emacs sessions.  */)
@@ -4710,7 +4710,7 @@ Hash codes are not guaranteed to be preserved across Emacs sessions.  */)
 }
 
 DEFUN ("sxhash-eql", Fsxhash_eql, Ssxhash_eql, 1, 1, 0,
-       doc: /* Return an integer hash code for OBJ suitable for `eql'.
+       doc: /* Return a fixnum hash code for OBJ suitable for `eql'.
 If (eql A B), then (= (sxhash-eql A) (sxhash-eql B)).
 
 Hash codes are not guaranteed to be preserved across Emacs sessions.  */)
@@ -4720,7 +4720,7 @@ Hash codes are not guaranteed to be preserved across Emacs sessions.  */)
 }
 
 DEFUN ("sxhash-equal", Fsxhash_equal, Ssxhash_equal, 1, 1, 0,
-       doc: /* Return an integer hash code for OBJ suitable for `equal'.
+       doc: /* Return a fixnum hash code for OBJ suitable for `equal'.
 If (equal A B), then (= (sxhash-equal A) (sxhash-equal B)).
 
 Hash codes are not guaranteed to be preserved across Emacs sessions.  */)
@@ -4744,7 +4744,7 @@ keys.  Default is `eql'.  Predefined are the tests `eq', `eql', and
 Default is 65.
 
 :rehash-size REHASH-SIZE - Indicates how to expand the table when it
-fills up.  If REHASH-SIZE is an integer, increase the size by that
+fills up.  If REHASH-SIZE is a fixnum, increase the size by that
 amount.  If it is a float, it must be > 1.0, and the new size is the
 old size multiplied by that factor.  Default is 1.5.
 
-- 
2.17.1


[-- Attachment #4: 0003-Avoid-integer-overflow-in-hash-table-size.patch --]
[-- Type: text/x-patch, Size: 2597 bytes --]

From c72e6328b408805953a5adf832b5c5cc9f3a75e7 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 21 Jul 2019 11:20:07 -0700
Subject: [PATCH 3/3] Avoid integer overflow in hash table size

* src/fns.c (INDEX_SIZE_BOUND): Use a tighter bound.
(maybe_resize_hash_table): Avoid integer overflow when
checking for hash table size overflow.  Fix confusion
between INDEX_SIZE_BOUND (which is for the index vector)
and hash table size.  Fix typo in debugging message
when ENABLE_CHECKING.
---
 src/fns.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/src/fns.c b/src/fns.c
index 819eaec7c7..8eefa7c72b 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -4042,9 +4042,14 @@ allocate_hash_table (void)
 }
 
 /* An upper bound on the size of a hash table index.  It must fit in
-   ptrdiff_t and be a valid Emacs fixnum.  */
+   ptrdiff_t and be a valid Emacs fixnum.  This is an upper bound on
+   VECTOR_ELTS_MAX (see alloc.c) and gets as close as we can without
+   violating modularity.  */
 #define INDEX_SIZE_BOUND \
-  ((ptrdiff_t) min (MOST_POSITIVE_FIXNUM, PTRDIFF_MAX / word_size))
+  ((ptrdiff_t) min (MOST_POSITIVE_FIXNUM, \
+		    ((min (PTRDIFF_MAX, SIZE_MAX) \
+		      - header_size - GCALIGNMENT) \
+		     / word_size)))
 
 /* Create and initialize a new hash table.
 
@@ -4169,11 +4174,13 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
       else
 	{
 	  double float_new_size = old_size * (rehash_size + 1);
-	  if (float_new_size < INDEX_SIZE_BOUND + 1)
+	  if (float_new_size < EMACS_INT_MAX)
 	    new_size = float_new_size;
 	  else
-	    new_size = INDEX_SIZE_BOUND + 1;
+	    new_size = EMACS_INT_MAX;
 	}
+      if (PTRDIFF_MAX < new_size)
+	new_size = PTRDIFF_MAX;
       if (new_size <= old_size)
 	new_size = old_size + 1;
 
@@ -4181,7 +4188,7 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
 	 avoid problems if memory is exhausted.  larger_vecalloc
 	 finishes computing the size of the replacement vectors.  */
       Lisp_Object next = larger_vecalloc (h->next, new_size - old_size,
-					  INDEX_SIZE_BOUND / 2);
+					  PTRDIFF_MAX / 2);
       ptrdiff_t next_size = ASIZE (next);
       for (ptrdiff_t i = old_size; i < next_size - 1; i++)
 	gc_aset (next, i, make_fixnum (i + 1));
@@ -4216,7 +4223,7 @@ maybe_resize_hash_table (struct Lisp_Hash_Table *h)
 
 #ifdef ENABLE_CHECKING
       if (HASH_TABLE_P (Vpurify_flag) && XHASH_TABLE (Vpurify_flag) == h)
-	message ("Growing hash table to: %"pD"d", new_size);
+	message ("Growing hash table to: %"pD"d", next_size);
 #endif
     }
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [Emacs-diffs] master 5018b66 4/6: Inhibit GC after inhibit_garbage_collection
  2019-07-21 14:01   ` [Emacs-diffs] master 5018b66 4/6: Inhibit GC after inhibit_garbage_collection Stefan Monnier
@ 2019-07-21 18:42     ` Paul Eggert
  0 siblings, 0 replies; 4+ messages in thread
From: Paul Eggert @ 2019-07-21 18:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 202 bytes --]

Stefan Monnier wrote:
> Maybe we could circumvent this by temporarily bumping gc_relative_threshold?

Sure, I installed the attached. It bumps consing_until_gc which is how the 
thresholding works now.

[-- Attachment #2: 0001-Speed-up-maybe_gc-when-GC-is-inhibited.patch --]
[-- Type: text/x-patch, Size: 1580 bytes --]

From d02c2f7f6507105605ed0596a7e26acd5b3b8122 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 21 Jul 2019 11:20:07 -0700
Subject: [PATCH] Speed up maybe_gc when GC is inhibited

* src/alloc.c (allow_garbage_collection)
(inhibit_garbage_collection): Temporarily bump
consing_until_gc, to improve performance of maybe_gc while
garbage collection is inhibited.  Suggested by Stefan Monnier in:
https://lists.gnu.org/r/emacs-devel/2019-07/msg00511.html
---
 src/alloc.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/alloc.c b/src/alloc.c
index b7ba886482..50015808e5 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -5501,11 +5501,14 @@ staticpro (Lisp_Object const *varaddress)
 			  Protection from GC
  ***********************************************************************/
 
-/* Temporarily prevent garbage collection.  */
+/* Temporarily prevent garbage collection.  Temporarily bump
+   consing_until_gc to speed up maybe_gc when GC is inhibited.  */
 
 static void
-allow_garbage_collection (void)
+allow_garbage_collection (void *ptr)
 {
+  object_ct *p = ptr;
+  consing_until_gc = *p;
   garbage_collection_inhibited--;
 }
 
@@ -5513,9 +5516,10 @@ ptrdiff_t
 inhibit_garbage_collection (void)
 {
   ptrdiff_t count = SPECPDL_INDEX ();
-
-  record_unwind_protect_void (allow_garbage_collection);
+  object_ct consing = consing_until_gc;
+  record_unwind_protect_ptr (allow_garbage_collection, &consing);
   garbage_collection_inhibited++;
+  consing_until_gc = OBJECT_CT_MAX;
   return count;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-07-21 18:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20190721031351.22349.23406@vcs0.savannah.gnu.org>
     [not found] ` <20190721031354.BA68420BE2@vcs0.savannah.gnu.org>
2019-07-21  9:57   ` [Emacs-diffs] master 515afc9 6/6: Fix crash if user test munges hash table Pip Cet
2019-07-21 18:39     ` Paul Eggert
     [not found] ` <20190721031353.DBCBC20BE2@vcs0.savannah.gnu.org>
2019-07-21 14:01   ` [Emacs-diffs] master 5018b66 4/6: Inhibit GC after inhibit_garbage_collection Stefan Monnier
2019-07-21 18:42     ` Paul Eggert

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).