Re: Bug#38708: eq vs eql in byte-compiled code

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

From: Paul Eggert <eggert@cs.ucla.edu>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: "Mattias Engdegård" <mattiase@acm.org>,
	"Pip Cet" <pipcet@gmail.com>,
	emacs-devel@gnu.org
Subject: Re: Bug#38708: eq vs eql in byte-compiled code
Date: Wed, 1 Jan 2020 23:52:12 -0800	[thread overview]
Message-ID: <1e0940ff-e418-bafc-66d3-72b562b2c65b@cs.ucla.edu> (raw)
In-Reply-To: <jwva777p4x2.fsf-monnier+emacs@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 1409 bytes --]

On 1/1/20 7:45 AM, Stefan Monnier wrote:
> We could/should do hash-consing of bignums, tho.  It won't affect code
> that doesn't use bignums, and should have a fairly minor performance
> cost for bignums while making their semantics more clean.

I did that, and surprise! it sped up 'make compile-always' by about 7% on the
two platforms I tried it on (Fedora 31 and Ubuntu 18.04.3, both x86-64).

Although hash-consing of bignums slows down computations that do nothing but
generate new bignums (a slowdown of 38% on my microbenchmark) such computations
are reasonably rare, and the benefit having 'eq' be more-compatible with
traditional Emacs (as well as the abovementioned speedup in more-typical code)
seems to make this tradeoff well worthwhile.

Code patch against Emacs master attached. Given the performance boost, at least
this should go into master; not so sure about emacs-27, though (pro: eq is more
compatible; con: it's a last-minute change to emacs-27).

Two or three issues:

* Should we document that eq == eql on bignums, or continue to leave this stuff
unspecified?

* Should we try hash-consing floats too? Maybe it wouldn't be as slow as we
thought, for typical computations anyway....

* The attached patch could probably be sped up a bit by supporting sets as well
as mappings at the low level, since bignum_map is really just a set of bignums.
Not sure it's worth the effort, though.

[-- Attachment #2: 0001-Hash-cons-bignums.txt --]
[-- Type: text/plain, Size: 10506 bytes --]

From 2cc2d34a4f0fe866714f062dde7bfcc485b3b9e4 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 1 Jan 2020 23:18:58 -0800
Subject: [PATCH] Hash-cons bignums
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Suggested by Stefan Monnier in:
https://lists.gnu.org/r/emacs-devel/2020-01/msg00010.html
This improves performance of ‘make compile-always’
by about 7% on my platform (x86-64 Ubuntu 18.04.3).
* src/alloc.c (make_pure_bignum): Remove, as we can’t copy (much
less purecopy) bignums any more.  Caller removed.
* src/bignum.c (make_bignum_bits): Make sure the result is unique.
(make_bignum_str): Let make_bignum do the work.  Also, this
checks against converting a string to a too-big bignum.
* src/fns.c (Fmemql, Feql, internal_equal, hashfn_eql):
Just use eq to compare bignums.
(sxhash_mpz): Rename from sxhash_bignum and change the argument
to mpz_t const *.  Caller changed.
(sxhash): No need to treat bignums specially any more.
(bignum_map): New static variable.
(make_unique_bignum): New function, which uses sxhash_mpz.
(syms_of_fns): Initialize bignum_map.
* src/pdumper.c (dump_hash_table_stable_p):
Bignums are stable keys now.
---
 src/alloc.c   |  31 ----------------
 src/bignum.c  |  14 ++-----
 src/fns.c     | 100 +++++++++++++++++++++++++++++++-------------------
 src/lisp.h    |   1 +
 src/pdumper.c |   5 +--
 5 files changed, 68 insertions(+), 83 deletions(-)

diff --git a/src/alloc.c b/src/alloc.c
index 1c6b664b22..6a2ae6f6d5 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -5279,35 +5279,6 @@ make_pure_float (double num)
   return new;
 }
 
-/* Value is a bignum object with value VALUE allocated from pure
-   space.  */
-
-static Lisp_Object
-make_pure_bignum (Lisp_Object value)
-{
-  mpz_t const *n = xbignum_val (value);
-  size_t i, nlimbs = mpz_size (*n);
-  size_t nbytes = nlimbs * sizeof (mp_limb_t);
-  mp_limb_t *pure_limbs;
-  mp_size_t new_size;
-
-  struct Lisp_Bignum *b = pure_alloc (sizeof *b, Lisp_Vectorlike);
-  XSETPVECTYPESIZE (b, PVEC_BIGNUM, 0, VECSIZE (struct Lisp_Bignum));
-
-  int limb_alignment = alignof (mp_limb_t);
-  pure_limbs = pure_alloc (nbytes, - limb_alignment);
-  for (i = 0; i < nlimbs; ++i)
-    pure_limbs[i] = mpz_getlimbn (*n, i);
-
-  new_size = nlimbs;
-  if (mpz_sgn (*n) < 0)
-    new_size = -new_size;
-
-  mpz_roinit_n (b->value, pure_limbs, new_size);
-
-  return make_lisp_ptr (b, Lisp_Vectorlike);
-}
-
 /* Return a vector with room for LEN Lisp_Objects allocated from
    pure space.  */
 
@@ -5449,8 +5420,6 @@ purecopy (Lisp_Object obj)
       /* Don't hash-cons it.  */
       return obj;
     }
-  else if (BIGNUMP (obj))
-    obj = make_pure_bignum (obj);
   else
     {
       AUTO_STRING (fmt, "Don't know how to purify: %S");
diff --git a/src/bignum.c b/src/bignum.c
index 51d90ffaef..0e3f5db8d9 100644
--- a/src/bignum.c
+++ b/src/bignum.c
@@ -89,12 +89,7 @@ make_bignum_bits (size_t bits)
      timestamps.  */
   if (integer_width < bits && 2 * max (INTMAX_WIDTH, UINTMAX_WIDTH) < bits)
     overflow_error ();
-
-  struct Lisp_Bignum *b = ALLOCATE_PLAIN_PSEUDOVECTOR (struct Lisp_Bignum,
-						       PVEC_BIGNUM);
-  mpz_init (b->value);
-  mpz_swap (b->value, mpz[0]);
-  return make_lisp_ptr (b, Lisp_Vectorlike);
+  return make_unique_bignum ();
 }
 
 /* Return a Lisp integer equal to mpz[0], which must not be in fixnum range.
@@ -424,10 +419,7 @@ bignum_to_string (Lisp_Object num, int base)
 Lisp_Object
 make_bignum_str (char const *num, int base)
 {
-  struct Lisp_Bignum *b = ALLOCATE_PLAIN_PSEUDOVECTOR (struct Lisp_Bignum,
-						       PVEC_BIGNUM);
-  mpz_init (b->value);
-  int check = mpz_set_str (b->value, num, base);
+  int check = mpz_set_str (mpz[0], num, base);
   eassert (check == 0);
-  return make_lisp_ptr (b, Lisp_Vectorlike);
+  return make_bignum ();
 }
diff --git a/src/fns.c b/src/fns.c
index 3b5feace52..80e0eac71c 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -1575,29 +1575,17 @@ DEFUN ("memql", Fmemql, Smemql, 2, 2, 0,
 The value is actually the tail of LIST whose car is ELT.  */)
   (Lisp_Object elt, Lisp_Object list)
 {
+  if (!FLOATP (elt))
+    return Fmemq (elt, list);
+
   Lisp_Object tail = list;
 
-  if (FLOATP (elt))
-    {
-      FOR_EACH_TAIL (tail)
-        {
-          Lisp_Object tem = XCAR (tail);
-          if (FLOATP (tem) && same_float (elt, tem))
-            return tail;
-        }
-    }
-  else if (BIGNUMP (elt))
+  FOR_EACH_TAIL (tail)
     {
-      FOR_EACH_TAIL (tail)
-        {
-          Lisp_Object tem = XCAR (tail);
-          if (BIGNUMP (tem)
-	      && mpz_cmp (*xbignum_val (elt), *xbignum_val (tem)) == 0)
-            return tail;
-        }
+      Lisp_Object tem = XCAR (tail);
+      if (FLOATP (tem) && same_float (elt, tem))
+	return tail;
     }
-  else
-    return Fmemq (elt, list);
 
   CHECK_LIST_END (tail, list);
   return Qnil;
@@ -2300,14 +2288,9 @@ DEFUN ("eql", Feql, Seql, 2, 2, 0,
 \(eql 0.0e+NaN 0.0e+NaN) returns t, whereas `=' does the opposite.  */)
   (Lisp_Object obj1, Lisp_Object obj2)
 {
-  if (FLOATP (obj1))
-    return FLOATP (obj2) && same_float (obj1, obj2) ? Qt : Qnil;
-  else if (BIGNUMP (obj1))
-    return ((BIGNUMP (obj2)
-	     && mpz_cmp (*xbignum_val (obj1), *xbignum_val (obj2)) == 0)
-	    ? Qt : Qnil);
-  else
-    return EQ (obj1, obj2) ? Qt : Qnil;
+  return ((EQ (obj1, obj2)
+	   || (FLOATP (obj1) && FLOATP (obj2) && same_float (obj1, obj2)))
+	  ? Qt : Qnil);
 }
 
 DEFUN ("equal", Fequal, Sequal, 2, 2, 0,
@@ -2433,8 +2416,6 @@ internal_equal (Lisp_Object o1, Lisp_Object o2, enum equal_kind equal_kind,
 	   same size.  */
 	if (ASIZE (o2) != size)
 	  return false;
-	if (BIGNUMP (o1))
-	  return mpz_cmp (*xbignum_val (o1), *xbignum_val (o2)) == 0;
 	if (OVERLAYP (o1))
 	  {
 	    if (!internal_equal (OVERLAY_START (o1), OVERLAY_START (o2),
@@ -4031,7 +4012,7 @@ hashfn_equal (Lisp_Object key, struct Lisp_Hash_Table *h)
 Lisp_Object
 hashfn_eql (Lisp_Object key, struct Lisp_Hash_Table *h)
 {
-  return (FLOATP (key) || BIGNUMP (key) ? hashfn_equal : hashfn_eq) (key, h);
+  return (FLOATP (key) ? hashfn_equal : hashfn_eq) (key, h);
 }
 
 /* Given HT, return a hash code for KEY which uses a user-defined
@@ -4655,16 +4636,15 @@ sxhash_bool_vector (Lisp_Object vec)
   return SXHASH_REDUCE (hash);
 }
 
-/* Return a hash for a bignum.  */
+/* Return a hash for an integer.  */
 
 static EMACS_UINT
-sxhash_bignum (Lisp_Object bignum)
+sxhash_mpz (mpz_t const *n)
 {
-  mpz_t const *n = xbignum_val (bignum);
-  size_t i, nlimbs = mpz_size (*n);
+  size_t nlimbs = mpz_size (*n);
   EMACS_UINT hash = 0;
 
-  for (i = 0; i < nlimbs; ++i)
+  for (size_t i = 0; i < nlimbs; i++)
     hash = sxhash_combine (hash, mpz_getlimbn (*n, i));
 
   return SXHASH_REDUCE (hash);
@@ -4698,9 +4678,7 @@ sxhash (Lisp_Object obj, int depth)
 
       /* This can be everything from a vector to an overlay.  */
     case Lisp_Vectorlike:
-      if (BIGNUMP (obj))
-	hash = sxhash_bignum (obj);
-      else if (VECTORP (obj) || RECORDP (obj))
+      if (VECTORP (obj) || RECORDP (obj))
 	/* According to the CL HyperSpec, two arrays are equal only if
 	   they are `eq', except for strings and bit-vectors.  In
 	   Emacs, this works differently.  We have to compare element
@@ -5066,6 +5044,47 @@ DEFUN ("define-hash-table-test", Fdefine_hash_table_test,
   return Fput (name, Qhash_table_test, list2 (test, hash));
 }
 
+\f
+/***********************************************************************
+			    Hashed bignums
+ ***********************************************************************/
+
+/* Weak hash table of all current bignum objects, to prevent duplicates.  */
+
+static Lisp_Object bignum_map;
+
+/* Return the unique bignum with value equal to mpz[0], which must not
+   be in fixnum range.  Set mpz[0] to a junk value.  */
+
+Lisp_Object
+make_unique_bignum (void)
+{
+  struct Lisp_Hash_Table *h = XHASH_TABLE (bignum_map);
+
+  /* If mpz[0]'s value is already hashed, return the corresponding bignum.
+     This is like hash_lookup, except with special hashing and
+     comparison for bignums.  hash_rehash_if_needed would be redundant
+     since sxhash_mpz is stable.  */
+  EMACS_UINT hash = sxhash_mpz (&mpz[0]);
+  Lisp_Object hash_code = make_ufixnum (hash);
+  ptrdiff_t start_of_bucket = hash % ASIZE (h->index);
+  for (ptrdiff_t i = HASH_INDEX (h, start_of_bucket);
+       0 <= i; i = HASH_NEXT (h, i))
+    if (EQ (hash_code, HASH_HASH (h, i)))
+      {
+	Lisp_Object num = HASH_KEY (h, i);
+	if (mpz_cmp (mpz[0], *xbignum_val (num)) == 0)
+	  return num;
+      }
+
+  struct Lisp_Bignum *b = ALLOCATE_PLAIN_PSEUDOVECTOR (struct Lisp_Bignum,
+						       PVEC_BIGNUM);
+  mpz_init (b->value);
+  mpz_swap (b->value, mpz[0]);
+  Lisp_Object bignum = make_lisp_ptr (b, Lisp_Vectorlike);
+  hash_put (h, bignum, Qnil, hash_code);
+  return bignum;
+}
 
 \f
 /************************************************************************
@@ -5665,4 +5684,9 @@ syms_of_fns (void)
   defsubr (&Ssecure_hash);
   defsubr (&Sbuffer_hash);
   defsubr (&Slocale_info);
+
+  bignum_map = make_hash_table (hashtest_eq, DEFAULT_HASH_SIZE,
+				DEFAULT_REHASH_SIZE, DEFAULT_REHASH_THRESHOLD,
+				Qkey, false);
+  staticpro (&bignum_map);
 }
diff --git a/src/lisp.h b/src/lisp.h
index 8674fe11a6..e066085514 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -3649,6 +3649,7 @@ #define CONS_TO_INTEGER(cons, type, var)				\
 extern EMACS_INT next_almost_prime (EMACS_INT) ATTRIBUTE_CONST;
 extern Lisp_Object larger_vector (Lisp_Object, ptrdiff_t, ptrdiff_t);
 extern bool sweep_weak_table (struct Lisp_Hash_Table *, bool);
+extern Lisp_Object make_unique_bignum (void);
 extern void hexbuf_digest (char *, void const *, int);
 extern char *extract_data_from_object (Lisp_Object, ptrdiff_t *, ptrdiff_t *);
 EMACS_UINT hash_string (char const *, ptrdiff_t);
diff --git a/src/pdumper.c b/src/pdumper.c
index 3ee1146040..ab0461ccb6 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -2632,11 +2632,10 @@ dump_hash_table_stable_p (const struct Lisp_Hash_Table *hash)
       if (!EQ (key, Qunbound))
         {
 	  bool key_stable = (dump_builtin_symbol_p (key)
-			     || FIXNUMP (key)
+			     || INTEGERP (key)
 			     || (is_equal
 			         && (STRINGP (key) || BOOL_VECTOR_P (key)))
-			     || ((is_equal || is_eql)
-			         && (FLOATP (key) || BIGNUMP (key))));
+			     || ((is_equal || is_eql) && FLOATP (key)));
           if (!key_stable)
             return false;
         }
-- 
2.17.1

next prev parent reply	other threads:[~2020-01-02  7:52 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-31 15:07 Bug#38708: eq vs eql in byte-compiled code Pip Cet
2019-12-31 15:51 ` Andrea Corallo
2019-12-31 16:05 ` Mattias Engdegård
2019-12-31 17:38 ` Paul Eggert
2020-01-01 12:38   ` Mattias Engdegård
2020-01-02  8:38     ` Paul Eggert
2020-01-02 17:26       ` Mattias Engdegård
2020-01-04 19:55         ` Stefan Monnier
2020-01-22 10:56       ` Mattias Engdegård
2020-01-25  0:59         ` Paul Eggert
2020-01-01 15:45   ` Stefan Monnier
2020-01-02  7:52     ` Paul Eggert [this message]
2020-01-02 12:27       ` Pip Cet
2020-01-02 23:12         ` Paul Eggert
2020-01-02 13:48       ` Eli Zaretskii
2020-01-04 18:54       ` Stefan Monnier
2020-01-04 19:33         ` Paul Eggert
2020-01-04 19:49           ` Stefan Monnier

find likely ancestor, descendant, or conflicting patches for this message:
dfblob:1c6b664b2 dfblob:51d90ffae dfblob:3b5feace5 dfblob:8674fe11a
dfblob:3ee114604 dfblob:6a2ae6f6d dfblob:0e3f5db8d dfblob:80e0eac71
dfblob:e06608551 dfblob:ab0461ccb
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1e0940ff-e418-bafc-66d3-72b562b2c65b@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=emacs-devel@gnu.org \
    --cc=mattiase@acm.org \
    --cc=monnier@iro.umontreal.ca \
    --cc=pipcet@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).