From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Paul Eggert Newsgroups: gmane.emacs.bugs Subject: bug#8771: Remove arbitrary 32-bit limit in Emacs hash tables Date: Mon, 30 May 2011 23:09:34 -0700 Organization: UCLA Computer Science Department Message-ID: <4DE4861E.709@cs.ucla.edu> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: dough.gmane.org 1306822221 30314 80.91.229.12 (31 May 2011 06:10:21 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Tue, 31 May 2011 06:10:21 +0000 (UTC) To: 8771@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Tue May 31 08:10:13 2011 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([140.186.70.17]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1QRI9c-00067n-BY for geb-bug-gnu-emacs@m.gmane.org; Tue, 31 May 2011 08:10:12 +0200 Original-Received: from localhost ([::1]:40508 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QRI9b-0007TA-SG for geb-bug-gnu-emacs@m.gmane.org; Tue, 31 May 2011 02:10:11 -0400 Original-Received: from eggs.gnu.org ([140.186.70.92]:40090) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QRI9W-0007Qv-OO for bug-gnu-emacs@gnu.org; Tue, 31 May 2011 02:10:09 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QRI9U-0000KD-0s for bug-gnu-emacs@gnu.org; Tue, 31 May 2011 02:10:06 -0400 Original-Received: from debbugs.gnu.org ([140.186.70.43]:35723) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QRI9T-0000JI-Sf for bug-gnu-emacs@gnu.org; Tue, 31 May 2011 02:10:03 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.69) (envelope-from ) id 1QRI9T-0004yF-Oi; Tue, 31 May 2011 02:10:03 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Paul Eggert Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 31 May 2011 06:10:03 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 8771 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Original-Received: via spool by submit@debbugs.gnu.org id=B.130682219719090 (code B ref -1); Tue, 31 May 2011 06:10:03 +0000 Original-Received: (at submit) by debbugs.gnu.org; 31 May 2011 06:09:57 +0000 Original-Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QRI9L-0004xr-PS for submit@debbugs.gnu.org; Tue, 31 May 2011 02:09:56 -0400 Original-Received: from eggs.gnu.org ([140.186.70.92]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QRI9I-0004xe-LL for submit@debbugs.gnu.org; Tue, 31 May 2011 02:09:54 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QRI9A-0000E1-HC for submit@debbugs.gnu.org; Tue, 31 May 2011 02:09:47 -0400 Original-Received: from lists.gnu.org ([140.186.70.17]:35726) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QRI9A-0000Dx-FR for submit@debbugs.gnu.org; Tue, 31 May 2011 02:09:44 -0400 Original-Received: from eggs.gnu.org ([140.186.70.92]:40041) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QRI97-0007OK-TS for bug-gnu-emacs@gnu.org; Tue, 31 May 2011 02:09:44 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QRI95-00009L-1M for bug-gnu-emacs@gnu.org; Tue, 31 May 2011 02:09:41 -0400 Original-Received: from smtp.cs.ucla.edu ([131.179.128.62]:52299) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QRI94-0008TB-B4 for bug-gnu-emacs@gnu.org; Tue, 31 May 2011 02:09:38 -0400 Original-Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 6B56A39E80F2 for ; Mon, 30 May 2011 23:09:36 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Original-Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id re6OB7cjKMV6 for ; Mon, 30 May 2011 23:09:34 -0700 (PDT) Original-Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id AB14639E80E0 for ; Mon, 30 May 2011 23:09:34 -0700 (PDT) User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.17) Gecko/20110424 Thunderbird/3.1.10 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list Resent-Date: Tue, 31 May 2011 02:10:03 -0400 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 140.186.70.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.bugs:46839 Archived-At: Currently the Emacs source code uses 'unsigned' for hashes and 'int' for hash tables, but on 64-bit hosts hash tables can in principle be larger than what can be shoehorned into 32 bits. Here's a proposed patch; most of it is pretty straightforward. Remove arbitrary limit of 2**31 entries in hash tables. * category.c (hash_get_category_set): Use 'EMACS_UINT' and 'EMACS_INT' for hashes and hash indexes, instead of 'unsigned' and 'int'. * ccl.c (ccl_driver): Likewise. * charset.c (Fdefine_charset_internal): Likewise. * charset.h (struct charset.hash_index): Likewise. * composite.c (get_composition_id, gstring_lookup_cache): (composition_gstring_put_cache): Likewise. * composite.h (struct composition.hash_index): Likewise. * dispextern.h (struct image.hash): Likewise. * fns.c (next_almost_prime, larger_vector, cmpfn_eql): (cmpfn_equal, cmpfn_user_defined, hashfn_eq, hashfn_eql): (hashfn_equal, hashfn_user_defined, make_hash_table): (maybe_resize_hash_table, hash_lookup, hash_put): (hash_remove_from_table, hash_clear, sweep_weak_table, SXHASH_COMBINE): (sxhash_string, sxhash_list, sxhash_vector, sxhash_bool_vector): (Fsxhash, Fgethash, Fputhash, Fmaphash): Likewise. * image.c (make_image, search_image_cache, lookup_image): (xpm_put_color_table_h): Likewise. * lisp.h (struct Lisp_Hash_Table): Likewise, for 'count', 'cmpfn', and 'hashfn' members. * minibuf.c (Ftry_completion, Fall_completions, Ftest_completion): Likewise. * print.c (print): Likewise. * alloc.c (allocate_vectorlike): Check for overflow in vector size calculations. * ccl.c (ccl_driver): Check for overflow when converting EMACS_INT to int. * fns.c, image.c: Remove unnecessary static decls that would otherwise need to be updated by these changes. * fns.c (make_hash_table, maybe_resize_hash_table): Check for integer overflow with large hash tables. (make_hash_table, maybe_resize_hash_table, Fmake_hash_table): Prefer the faster XFLOAT_DATA to XFLOATINT where either will do. (SXHASH_REDUCE): New macro. (sxhash_string, sxhash_list, sxhash_vector, sxhash_bool_vector): Use it instead of discarding useful hash info with large hash values. (sxhash_float): New function. (sxhash): Use it. No more need for "& INTMASK" due to above changes. * lisp.h (FIXNUM_BITS): New macro, useful for SXHASH_REDUCE etc. (MOST_NEGATIVE_FIXNUM, MOST_POSITIVE_FIXNUM, INTMASK): Rewrite to use FIXNUM_BITS, as this simplifies things. (next_almost_prime, larger_vector, sxhash, hash_lookup, hash_put): Adjust signatures to match updated version of code. (consing_since_gc): Now EMACS_INT, since a single hash table can use more than INT_MAX bytes. === modified file 'src/alloc.c' --- src/alloc.c 2011-05-31 05:15:34 +0000 +++ src/alloc.c 2011-05-31 05:52:26 +0000 @@ -157,7 +157,7 @@ /* Number of bytes of consing done since the last gc. */ -int consing_since_gc; +EMACS_INT consing_since_gc; /* Similar minimum, computed from Vgc_cons_percentage. */ @@ -2788,6 +2788,11 @@ { struct Lisp_Vector *p; size_t nbytes; + int header_size = offsetof (struct Lisp_Vector, contents); + int word_size = sizeof p->contents[0]; + + if ((SIZE_MAX - header_size) / word_size < len) + memory_full (); MALLOC_BLOCK_INPUT; @@ -2801,8 +2806,7 @@ /* This gets triggered by code which I haven't bothered to fix. --Stef */ /* eassert (!handling_signal); */ - nbytes = (offsetof (struct Lisp_Vector, contents) - + len * sizeof p->contents[0]); + nbytes = header_size + len * word_size; p = (struct Lisp_Vector *) lisp_malloc (nbytes, MEM_TYPE_VECTORLIKE); #ifdef DOUG_LEA_MALLOC === modified file 'src/category.c' --- src/category.c 2011-04-11 06:28:35 +0000 +++ src/category.c 2011-05-31 05:52:26 +0000 @@ -67,8 +67,8 @@ hash_get_category_set (Lisp_Object table, Lisp_Object category_set) { struct Lisp_Hash_Table *h; - int i; - unsigned hash; + EMACS_INT i; + EMACS_UINT hash; if (NILP (XCHAR_TABLE (table)->extras[1])) XCHAR_TABLE (table)->extras[1] === modified file 'src/ccl.c' --- src/ccl.c 2011-05-31 05:38:59 +0000 +++ src/ccl.c 2011-05-31 05:52:26 +0000 @@ -1307,15 +1307,15 @@ : -1)); h = GET_HASH_TABLE (eop); - op = hash_lookup (h, make_number (reg[RRR]), NULL); - if (op >= 0) + eop = hash_lookup (h, make_number (reg[RRR]), NULL); + if (eop >= 0) { Lisp_Object opl; - opl = HASH_VALUE (h, op); - if (! CHARACTERP (opl)) + opl = HASH_VALUE (h, eop); + if (! (IN_INT_RANGE (eop) && CHARACTERP (opl))) CCL_INVALID_CMD; reg[RRR] = charset_unicode; - reg[rrr] = op; + reg[rrr] = eop; reg[7] = 1; /* r7 true for success */ } else @@ -1334,11 +1334,11 @@ i = CCL_DECODE_CHAR (reg[RRR], reg[rrr]); h = GET_HASH_TABLE (eop); - op = hash_lookup (h, make_number (i), NULL); - if (op >= 0) + eop = hash_lookup (h, make_number (i), NULL); + if (eop >= 0) { Lisp_Object opl; - opl = HASH_VALUE (h, op); + opl = HASH_VALUE (h, eop); if (! (INTEGERP (opl) && IN_INT_RANGE (XINT (opl)))) CCL_INVALID_CMD; reg[RRR] = XINT (opl); === modified file 'src/charset.c' --- src/charset.c 2011-05-28 22:39:39 +0000 +++ src/charset.c 2011-05-31 05:52:26 +0000 @@ -849,7 +849,7 @@ /* Charset attr vector. */ Lisp_Object attrs; Lisp_Object val; - unsigned hash_code; + EMACS_UINT hash_code; struct Lisp_Hash_Table *hash_table = XHASH_TABLE (Vcharset_hash_table); int i, j; struct charset charset; === modified file 'src/charset.h' --- src/charset.h 2011-05-01 16:27:34 +0000 +++ src/charset.h 2011-05-31 05:52:26 +0000 @@ -146,7 +146,7 @@ int id; /* Index to Vcharset_hash_table. */ - int hash_index; + EMACS_INT hash_index; /* Dimension of the charset: 1, 2, 3, or 4. */ int dimension; === modified file 'src/composite.c' --- src/composite.c 2011-05-20 00:51:38 +0000 +++ src/composite.c 2011-05-31 05:52:26 +0000 @@ -179,8 +179,8 @@ Lisp_Object id, length, components, key, *key_contents; int glyph_len; struct Lisp_Hash_Table *hash_table = XHASH_TABLE (composition_hash_table); - int hash_index; - unsigned hash_code; + EMACS_INT hash_index; + EMACS_UINT hash_code; struct composition *cmp; EMACS_INT i; int ch; @@ -656,7 +656,7 @@ gstring_lookup_cache (Lisp_Object header) { struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table); - int i = hash_lookup (h, header, NULL); + EMACS_INT i = hash_lookup (h, header, NULL); return (i >= 0 ? HASH_VALUE (h, i) : Qnil); } @@ -665,7 +665,7 @@ composition_gstring_put_cache (Lisp_Object gstring, EMACS_INT len) { struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table); - unsigned hash; + EMACS_UINT hash; Lisp_Object header, copy; EMACS_INT i; === modified file 'src/composite.h' --- src/composite.h 2011-04-11 03:39:45 +0000 +++ src/composite.h 2011-05-31 05:52:26 +0000 @@ -186,7 +186,7 @@ enum composition_method method; /* Index to the composition hash table. */ - int hash_index; + EMACS_INT hash_index; /* For which font we have calculated the remaining members. The actual type is device dependent. */ === modified file 'src/dispextern.h' --- src/dispextern.h 2011-05-25 03:45:04 +0000 +++ src/dispextern.h 2011-05-31 05:52:26 +0000 @@ -2798,7 +2798,7 @@ } data; /* Hash value of image specification to speed up comparisons. */ - unsigned hash; + EMACS_UINT hash; /* Image id of this image. */ int id; === modified file 'src/fns.c' --- src/fns.c 2011-05-28 22:39:39 +0000 +++ src/fns.c 2011-05-31 05:52:26 +0000 @@ -3358,21 +3358,6 @@ static struct Lisp_Hash_Table *check_hash_table (Lisp_Object); static size_t get_key_arg (Lisp_Object, size_t, Lisp_Object *, char *); static void maybe_resize_hash_table (struct Lisp_Hash_Table *); -static int cmpfn_eql (struct Lisp_Hash_Table *, Lisp_Object, unsigned, - Lisp_Object, unsigned); -static int cmpfn_equal (struct Lisp_Hash_Table *, Lisp_Object, unsigned, - Lisp_Object, unsigned); -static int cmpfn_user_defined (struct Lisp_Hash_Table *, Lisp_Object, - unsigned, Lisp_Object, unsigned); -static unsigned hashfn_eq (struct Lisp_Hash_Table *, Lisp_Object); -static unsigned hashfn_eql (struct Lisp_Hash_Table *, Lisp_Object); -static unsigned hashfn_equal (struct Lisp_Hash_Table *, Lisp_Object); -static unsigned hashfn_user_defined (struct Lisp_Hash_Table *, - Lisp_Object); -static unsigned sxhash_string (unsigned char *, int); -static unsigned sxhash_list (Lisp_Object, int); -static unsigned sxhash_vector (Lisp_Object, int); -static unsigned sxhash_bool_vector (Lisp_Object); static int sweep_weak_table (struct Lisp_Hash_Table *, int); @@ -3395,8 +3380,8 @@ /* Value is the next integer I >= N, N >= 0 which is "almost" a prime number. */ -int -next_almost_prime (int n) +EMACS_INT +next_almost_prime (EMACS_INT n) { if (n % 2 == 0) n += 1; @@ -3436,10 +3421,10 @@ vector that are not copied from VEC are set to INIT. */ Lisp_Object -larger_vector (Lisp_Object vec, int new_size, Lisp_Object init) +larger_vector (Lisp_Object vec, EMACS_INT new_size, Lisp_Object init) { struct Lisp_Vector *v; - int i, old_size; + EMACS_INT i, old_size; xassert (VECTORP (vec)); old_size = ASIZE (vec); @@ -3463,7 +3448,9 @@ KEY2 are the same. */ static int -cmpfn_eql (struct Lisp_Hash_Table *h, Lisp_Object key1, unsigned int hash1, Lisp_Object key2, unsigned int hash2) +cmpfn_eql (struct Lisp_Hash_Table *h, + Lisp_Object key1, EMACS_UINT hash1, + Lisp_Object key2, EMACS_UINT hash2) { return (FLOATP (key1) && FLOATP (key2) @@ -3476,7 +3463,9 @@ KEY2 are the same. */ static int -cmpfn_equal (struct Lisp_Hash_Table *h, Lisp_Object key1, unsigned int hash1, Lisp_Object key2, unsigned int hash2) +cmpfn_equal (struct Lisp_Hash_Table *h, + Lisp_Object key1, EMACS_UINT hash1, + Lisp_Object key2, EMACS_UINT hash2) { return hash1 == hash2 && !NILP (Fequal (key1, key2)); } @@ -3487,7 +3476,9 @@ if KEY1 and KEY2 are the same. */ static int -cmpfn_user_defined (struct Lisp_Hash_Table *h, Lisp_Object key1, unsigned int hash1, Lisp_Object key2, unsigned int hash2) +cmpfn_user_defined (struct Lisp_Hash_Table *h, + Lisp_Object key1, EMACS_UINT hash1, + Lisp_Object key2, EMACS_UINT hash2) { if (hash1 == hash2) { @@ -3507,10 +3498,10 @@ `eq' to compare keys. The hash code returned is guaranteed to fit in a Lisp integer. */ -static unsigned +static EMACS_UINT hashfn_eq (struct Lisp_Hash_Table *h, Lisp_Object key) { - unsigned hash = XUINT (key) ^ XTYPE (key); + EMACS_UINT hash = XUINT (key) ^ XTYPE (key); xassert ((hash & ~INTMASK) == 0); return hash; } @@ -3520,10 +3511,10 @@ `eql' to compare keys. The hash code returned is guaranteed to fit in a Lisp integer. */ -static unsigned +static EMACS_UINT hashfn_eql (struct Lisp_Hash_Table *h, Lisp_Object key) { - unsigned hash; + EMACS_UINT hash; if (FLOATP (key)) hash = sxhash (key, 0); else @@ -3537,10 +3528,10 @@ `equal' to compare keys. The hash code returned is guaranteed to fit in a Lisp integer. */ -static unsigned +static EMACS_UINT hashfn_equal (struct Lisp_Hash_Table *h, Lisp_Object key) { - unsigned hash = sxhash (key, 0); + EMACS_UINT hash = sxhash (key, 0); xassert ((hash & ~INTMASK) == 0); return hash; } @@ -3550,7 +3541,7 @@ user-defined function to compare keys. The hash code returned is guaranteed to fit in a Lisp integer. */ -static unsigned +static EMACS_UINT hashfn_user_defined (struct Lisp_Hash_Table *h, Lisp_Object key) { Lisp_Object args[2], hash; @@ -3593,26 +3584,33 @@ { struct Lisp_Hash_Table *h; Lisp_Object table; - int index_size, i, sz; + EMACS_INT index_size, i, sz; + double index_float; /* Preconditions. */ xassert (SYMBOLP (test)); xassert (INTEGERP (size) && XINT (size) >= 0); xassert ((INTEGERP (rehash_size) && XINT (rehash_size) > 0) - || (FLOATP (rehash_size) && XFLOATINT (rehash_size) > 1.0)); + || (FLOATP (rehash_size) && 1 < XFLOAT_DATA (rehash_size))); xassert (FLOATP (rehash_threshold) - && XFLOATINT (rehash_threshold) > 0 - && XFLOATINT (rehash_threshold) <= 1.0); + && 0 < XFLOAT_DATA (rehash_threshold) + && XFLOAT_DATA (rehash_threshold) <= 1.0); if (XFASTINT (size) == 0) size = make_number (1); + sz = XFASTINT (size); + index_float = sz / XFLOAT_DATA (rehash_threshold); + index_size = (index_float < MOST_POSITIVE_FIXNUM + 1 + ? next_almost_prime (index_float) + : MOST_POSITIVE_FIXNUM + 1); + if (MOST_POSITIVE_FIXNUM < max (index_size, 2 * sz)) + error ("Hash table too large"); + /* Allocate a table and initialize it. */ h = allocate_hash_table (); /* Initialize hash table slots. */ - sz = XFASTINT (size); - h->test = test; if (EQ (test, Qeql)) { @@ -3644,8 +3642,6 @@ h->key_and_value = Fmake_vector (make_number (2 * sz), Qnil); h->hash = Fmake_vector (size, Qnil); h->next = Fmake_vector (size, Qnil); - /* Cast to int here avoids losing with gcc 2.95 on Tru64/Alpha... */ - index_size = next_almost_prime ((int) (sz / XFLOATINT (rehash_threshold))); h->index = Fmake_vector (make_number (index_size), Qnil); /* Set up the free list. */ @@ -3709,20 +3705,29 @@ { if (NILP (h->next_free)) { - int old_size = HASH_TABLE_SIZE (h); - int i, new_size, index_size; + EMACS_INT old_size = HASH_TABLE_SIZE (h); + EMACS_INT i, new_size, index_size; EMACS_INT nsize; + double index_float; if (INTEGERP (h->rehash_size)) new_size = old_size + XFASTINT (h->rehash_size); else - new_size = old_size * XFLOATINT (h->rehash_size); - new_size = max (old_size + 1, new_size); - index_size = next_almost_prime ((int) - (new_size - / XFLOATINT (h->rehash_threshold))); - /* Assignment to EMACS_INT stops GCC whining about limited range - of data type. */ + { + double float_new_size = old_size * XFLOAT_DATA (h->rehash_size); + if (float_new_size < MOST_POSITIVE_FIXNUM + 1) + { + new_size = float_new_size; + if (new_size <= old_size) + new_size = old_size + 1; + } + else + new_size = MOST_POSITIVE_FIXNUM + 1; + } + index_float = new_size / XFLOAT_DATA (h->rehash_threshold); + index_size = (index_float < MOST_POSITIVE_FIXNUM + 1 + ? next_almost_prime (index_float) + : MOST_POSITIVE_FIXNUM + 1); nsize = max (index_size, 2 * new_size); if (nsize > MOST_POSITIVE_FIXNUM) error ("Hash table too large to resize"); @@ -3756,8 +3761,8 @@ for (i = 0; i < old_size; ++i) if (!NILP (HASH_HASH (h, i))) { - unsigned hash_code = XUINT (HASH_HASH (h, i)); - int start_of_bucket = hash_code % ASIZE (h->index); + EMACS_UINT hash_code = XUINT (HASH_HASH (h, i)); + EMACS_INT start_of_bucket = hash_code % ASIZE (h->index); HASH_NEXT (h, i) = HASH_INDEX (h, start_of_bucket); HASH_INDEX (h, start_of_bucket) = make_number (i); } @@ -3769,11 +3774,11 @@ the hash code of KEY. Value is the index of the entry in H matching KEY, or -1 if not found. */ -int -hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, unsigned int *hash) +EMACS_INT +hash_lookup (struct Lisp_Hash_Table *h, Lisp_Object key, EMACS_UINT *hash) { - unsigned hash_code; - int start_of_bucket; + EMACS_UINT hash_code; + EMACS_INT start_of_bucket; Lisp_Object idx; hash_code = h->hashfn (h, key); @@ -3786,7 +3791,7 @@ /* We need not gcpro idx since it's either an integer or nil. */ while (!NILP (idx)) { - int i = XFASTINT (idx); + EMACS_INT i = XFASTINT (idx); if (EQ (key, HASH_KEY (h, i)) || (h->cmpfn && h->cmpfn (h, key, hash_code, @@ -3803,10 +3808,11 @@ HASH is a previously computed hash code of KEY. Value is the index of the entry in H matching KEY. */ -int -hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value, unsigned int hash) +EMACS_INT +hash_put (struct Lisp_Hash_Table *h, Lisp_Object key, Lisp_Object value, + EMACS_UINT hash) { - int start_of_bucket, i; + EMACS_INT start_of_bucket, i; xassert ((hash & ~INTMASK) == 0); @@ -3836,8 +3842,8 @@ static void hash_remove_from_table (struct Lisp_Hash_Table *h, Lisp_Object key) { - unsigned hash_code; - int start_of_bucket; + EMACS_UINT hash_code; + EMACS_INT start_of_bucket; Lisp_Object idx, prev; hash_code = h->hashfn (h, key); @@ -3848,7 +3854,7 @@ /* We need not gcpro idx, prev since they're either integers or nil. */ while (!NILP (idx)) { - int i = XFASTINT (idx); + EMACS_INT i = XFASTINT (idx); if (EQ (key, HASH_KEY (h, i)) || (h->cmpfn @@ -3886,7 +3892,7 @@ { if (h->count > 0) { - int i, size = HASH_TABLE_SIZE (h); + EMACS_INT i, size = HASH_TABLE_SIZE (h); for (i = 0; i < size; ++i) { @@ -3924,7 +3930,8 @@ static int sweep_weak_table (struct Lisp_Hash_Table *h, int remove_entries_p) { - int bucket, n, marked; + EMACS_INT bucket, n; + int marked; n = ASIZE (h->index) & ~ARRAY_MARK_FLAG; marked = 0; @@ -3938,7 +3945,7 @@ prev = Qnil; for (idx = HASH_INDEX (h, bucket); !NILP (idx); idx = next) { - int i = XFASTINT (idx); + EMACS_INT i = XFASTINT (idx); int key_known_to_survive_p = survives_gc_p (HASH_KEY (h, i)); int value_known_to_survive_p = survives_gc_p (HASH_VALUE (h, i)); int remove_p; @@ -4067,43 +4074,68 @@ #define SXHASH_MAX_LEN 7 -/* Combine two integers X and Y for hashing. */ +/* Combine two integers X and Y for hashing. The result might not fit + into a Lisp integer. */ #define SXHASH_COMBINE(X, Y) \ - ((((unsigned)(X) << 4) + (((unsigned)(X) >> 24) & 0x0fffffff)) \ - + (unsigned)(Y)) + ((((EMACS_UINT) (X) << 4) + ((EMACS_UINT) (X) >> (BITS_PER_EMACS_INT - 4))) \ + + (EMACS_UINT) (Y)) +/* Hash X, returning a value that fits into a Lisp integer. */ +#define SXHASH_REDUCE(X) \ + ((((X) ^ (X) >> (BITS_PER_EMACS_INT - FIXNUM_BITS))) & INTMASK) /* Return a hash for string PTR which has length LEN. The hash code returned is guaranteed to fit in a Lisp integer. */ -static unsigned -sxhash_string (unsigned char *ptr, int len) +static EMACS_UINT +sxhash_string (unsigned char *ptr, EMACS_INT len) { unsigned char *p = ptr; unsigned char *end = p + len; unsigned char c; - unsigned hash = 0; + EMACS_UINT hash = 0; while (p != end) { c = *p++; if (c >= 0140) c -= 40; - hash = ((hash << 4) + (hash >> 28) + c); + hash = SXHASH_COMBINE (hash, c); } - return hash & INTMASK; -} - + return SXHASH_REDUCE (hash); +} + +/* Return a hash for the floating point value VAL. */ + +static EMACS_INT +sxhash_float (double val) +{ + EMACS_UINT hash = 0; + enum { + WORDS_PER_DOUBLE = (sizeof val / sizeof hash + + (sizeof val % sizeof hash != 0)) + }; + union { + double val; + EMACS_UINT word[WORDS_PER_DOUBLE]; + } u; + int i; + u.val = val; + memset (&u.val + 1, 0, sizeof u - sizeof u.val); + for (i = 0; i < WORDS_PER_DOUBLE; i++) + hash = SXHASH_COMBINE (hash, u.word[i]); + return SXHASH_REDUCE (hash); +} /* Return a hash for list LIST. DEPTH is the current depth in the list. We don't recurse deeper than SXHASH_MAX_DEPTH in it. */ -static unsigned +static EMACS_UINT sxhash_list (Lisp_Object list, int depth) { - unsigned hash = 0; + EMACS_UINT hash = 0; int i; if (depth < SXHASH_MAX_DEPTH) @@ -4111,63 +4143,62 @@ CONSP (list) && i < SXHASH_MAX_LEN; list = XCDR (list), ++i) { - unsigned hash2 = sxhash (XCAR (list), depth + 1); + EMACS_UINT hash2 = sxhash (XCAR (list), depth + 1); hash = SXHASH_COMBINE (hash, hash2); } if (!NILP (list)) { - unsigned hash2 = sxhash (list, depth + 1); + EMACS_UINT hash2 = sxhash (list, depth + 1); hash = SXHASH_COMBINE (hash, hash2); } - return hash; + return SXHASH_REDUCE (hash); } /* Return a hash for vector VECTOR. DEPTH is the current depth in the Lisp structure. */ -static unsigned +static EMACS_UINT sxhash_vector (Lisp_Object vec, int depth) { - unsigned hash = ASIZE (vec); + EMACS_UINT hash = ASIZE (vec); int i, n; n = min (SXHASH_MAX_LEN, ASIZE (vec)); for (i = 0; i < n; ++i) { - unsigned hash2 = sxhash (AREF (vec, i), depth + 1); + EMACS_UINT hash2 = sxhash (AREF (vec, i), depth + 1); hash = SXHASH_COMBINE (hash, hash2); } - return hash; + return SXHASH_REDUCE (hash); } - /* Return a hash for bool-vector VECTOR. */ -static unsigned +static EMACS_UINT sxhash_bool_vector (Lisp_Object vec) { - unsigned hash = XBOOL_VECTOR (vec)->size; + EMACS_UINT hash = XBOOL_VECTOR (vec)->size; int i, n; n = min (SXHASH_MAX_LEN, XBOOL_VECTOR (vec)->header.size); for (i = 0; i < n; ++i) hash = SXHASH_COMBINE (hash, XBOOL_VECTOR (vec)->data[i]); - return hash; + return SXHASH_REDUCE (hash); } /* Return a hash code for OBJ. DEPTH is the current depth in the Lisp structure. Value is an unsigned integer clipped to INTMASK. */ -unsigned +EMACS_UINT sxhash (Lisp_Object obj, int depth) { - unsigned hash; + EMACS_UINT hash; if (depth > SXHASH_MAX_DEPTH) return 0; @@ -4211,20 +4242,14 @@ break; case Lisp_Float: - { - double val = XFLOAT_DATA (obj); - unsigned char *p = (unsigned char *) &val; - size_t i; - for (hash = 0, i = 0; i < sizeof val; i++) - hash = SXHASH_COMBINE (hash, p[i]); - break; - } + hash = sxhash_float (XFLOAT_DATA (obj)); + break; default: abort (); } - return hash & INTMASK; + return hash; } @@ -4238,7 +4263,7 @@ doc: /* Compute a hash code for OBJ and return it as integer. */) (Lisp_Object obj) { - unsigned hash = sxhash (obj, 0); + EMACS_UINT hash = sxhash (obj, 0); return make_number (hash); } @@ -4315,17 +4340,16 @@ /* Look for `:rehash-size SIZE'. */ i = get_key_arg (QCrehash_size, nargs, args, used); rehash_size = i ? args[i] : make_float (DEFAULT_REHASH_SIZE); - if (!NUMBERP (rehash_size) - || (INTEGERP (rehash_size) && XINT (rehash_size) <= 0) - || XFLOATINT (rehash_size) <= 1.0) + if (! ((INTEGERP (rehash_size) && 0 < XINT (rehash_size)) + || (FLOATP (rehash_size) && 1 < XFLOAT_DATA (rehash_size)))) signal_error ("Invalid hash table rehash size", rehash_size); /* Look for `:rehash-threshold THRESHOLD'. */ i = get_key_arg (QCrehash_threshold, nargs, args, used); rehash_threshold = i ? args[i] : make_float (DEFAULT_REHASH_THRESHOLD); - if (!FLOATP (rehash_threshold) - || XFLOATINT (rehash_threshold) <= 0.0 - || XFLOATINT (rehash_threshold) > 1.0) + if (! (FLOATP (rehash_threshold) + && 0 < XFLOAT_DATA (rehash_threshold) + && XFLOAT_DATA (rehash_threshold) <= 1)) signal_error ("Invalid hash table rehash threshold", rehash_threshold); /* Look for `:weakness WEAK'. */ @@ -4437,7 +4461,7 @@ (Lisp_Object key, Lisp_Object table, Lisp_Object dflt) { struct Lisp_Hash_Table *h = check_hash_table (table); - int i = hash_lookup (h, key, NULL); + EMACS_INT i = hash_lookup (h, key, NULL); return i >= 0 ? HASH_VALUE (h, i) : dflt; } @@ -4449,8 +4473,8 @@ (Lisp_Object key, Lisp_Object value, Lisp_Object table) { struct Lisp_Hash_Table *h = check_hash_table (table); - int i; - unsigned hash; + EMACS_INT i; + EMACS_UINT hash; i = hash_lookup (h, key, &hash); if (i >= 0) @@ -4479,7 +4503,7 @@ { struct Lisp_Hash_Table *h = check_hash_table (table); Lisp_Object args[3]; - int i; + EMACS_INT i; for (i = 0; i < HASH_TABLE_SIZE (h); ++i) if (!NILP (HASH_HASH (h, i))) === modified file 'src/image.c' --- src/image.c 2011-05-30 01:12:12 +0000 +++ src/image.c 2011-05-31 05:52:26 +0000 @@ -982,7 +982,6 @@ Image type independent image structures ***********************************************************************/ -static struct image *make_image (Lisp_Object spec, unsigned hash); static void free_image (struct frame *f, struct image *img); static int check_image_size (struct frame *f, int width, int height); @@ -991,7 +990,7 @@ SPEC. SPEC has a hash value of HASH. */ static struct image * -make_image (Lisp_Object spec, unsigned int hash) +make_image (Lisp_Object spec, EMACS_UINT hash) { struct image *img = (struct image *) xmalloc (sizeof *img); Lisp_Object file = image_spec_value (spec, QCfile, NULL); @@ -1388,7 +1387,6 @@ Image Cache ***********************************************************************/ -static struct image *search_image_cache (struct frame *, Lisp_Object, unsigned); static void cache_image (struct frame *f, struct image *img); static void postprocess_image (struct frame *, struct image *); @@ -1414,7 +1412,7 @@ /* Find an image matching SPEC in the cache, and return it. If no image is found, return NULL. */ static struct image * -search_image_cache (struct frame *f, Lisp_Object spec, unsigned int hash) +search_image_cache (struct frame *f, Lisp_Object spec, EMACS_UINT hash) { struct image *img; struct image_cache *c = FRAME_IMAGE_CACHE (f); @@ -1714,7 +1712,7 @@ lookup_image (struct frame *f, Lisp_Object spec) { struct image *img; - unsigned hash; + EMACS_UINT hash; EMACS_TIME now; /* F must be a window-system frame, and SPEC must be a valid image @@ -3751,7 +3749,7 @@ Lisp_Object color) { struct Lisp_Hash_Table *table = XHASH_TABLE (color_table); - unsigned hash_code; + EMACS_UINT hash_code; Lisp_Object chars = make_unibyte_string (chars_start, chars_len); hash_lookup (table, chars, &hash_code); === modified file 'src/lisp.h' --- src/lisp.h 2011-05-31 05:15:34 +0000 +++ src/lisp.h 2011-05-31 05:52:26 +0000 @@ -525,22 +525,20 @@ #define EQ(x, y) (XHASH (x) == XHASH (y)) +/* Number of bits in a fixnum, including the sign bit. */ +#ifdef USE_2_TAGS_FOR_INTS +# define FIXNUM_BITS (VALBITS + 1) +#else +# define FIXNUM_BITS VALBITS +#endif + +/* Mask indicating the significant bits of a fixnum. */ +#define INTMASK (((EMACS_INT) 1 << FIXNUM_BITS) - 1) + /* Largest and smallest representable fixnum values. These are the C values. */ - -#ifdef USE_2_TAGS_FOR_INTS -# define MOST_NEGATIVE_FIXNUM - ((EMACS_INT) 1 << VALBITS) -# define MOST_POSITIVE_FIXNUM (((EMACS_INT) 1 << VALBITS) - 1) -/* Mask indicating the significant bits of a Lisp_Int. - I.e. (x & INTMASK) == XUINT (make_number (x)). */ -# define INTMASK ((((EMACS_INT) 1) << (VALBITS + 1)) - 1) -#else -# define MOST_NEGATIVE_FIXNUM - ((EMACS_INT) 1 << (VALBITS - 1)) -# define MOST_POSITIVE_FIXNUM (((EMACS_INT) 1 << (VALBITS - 1)) - 1) -/* Mask indicating the significant bits of a Lisp_Int. - I.e. (x & INTMASK) == XUINT (make_number (x)). */ -# define INTMASK ((((EMACS_INT) 1) << VALBITS) - 1) -#endif +#define MOST_POSITIVE_FIXNUM (INTMASK / 2) +#define MOST_NEGATIVE_FIXNUM (-1 - MOST_POSITIVE_FIXNUM) /* Value is non-zero if I doesn't fit into a Lisp fixnum. It is written this way so that it also works if I is of unsigned @@ -1179,7 +1177,7 @@ a special way (e.g. because of weakness). */ /* Number of key/value entries in the table. */ - unsigned int count; + EMACS_INT count; /* Vector of keys and values. The key of item I is found at index 2 * I, the value is found at index 2 * I + 1. @@ -1191,11 +1189,12 @@ struct Lisp_Hash_Table *next_weak; /* C function to compare two keys. */ - int (* cmpfn) (struct Lisp_Hash_Table *, Lisp_Object, - unsigned, Lisp_Object, unsigned); + int (*cmpfn) (struct Lisp_Hash_Table *, + Lisp_Object, EMACS_UINT, + Lisp_Object, EMACS_UINT); /* C function to compute hash code. */ - unsigned (* hashfn) (struct Lisp_Hash_Table *, Lisp_Object); + EMACS_UINT (*hashfn) (struct Lisp_Hash_Table *, Lisp_Object); }; @@ -2093,7 +2092,7 @@ /* Number of bytes of structure consed since last GC. */ -extern int consing_since_gc; +extern EMACS_INT consing_since_gc; extern EMACS_INT gc_relative_threshold; @@ -2468,19 +2467,19 @@ /* Defined in fns.c */ extern Lisp_Object QCrehash_size, QCrehash_threshold; -extern int next_almost_prime (int); -extern Lisp_Object larger_vector (Lisp_Object, int, Lisp_Object); +extern EMACS_INT next_almost_prime (EMACS_INT); +extern Lisp_Object larger_vector (Lisp_Object, EMACS_INT, Lisp_Object); extern void sweep_weak_hash_tables (void); extern Lisp_Object Qcursor_in_echo_area; extern Lisp_Object Qstring_lessp; extern Lisp_Object QCsize, QCtest, QCweakness, Qequal, Qeq, Qeql; -unsigned sxhash (Lisp_Object, int); +EMACS_UINT sxhash (Lisp_Object, int); Lisp_Object make_hash_table (Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object, Lisp_Object); -int hash_lookup (struct Lisp_Hash_Table *, Lisp_Object, unsigned *); -int hash_put (struct Lisp_Hash_Table *, Lisp_Object, Lisp_Object, - unsigned); +EMACS_INT hash_lookup (struct Lisp_Hash_Table *, Lisp_Object, EMACS_UINT *); +EMACS_INT hash_put (struct Lisp_Hash_Table *, Lisp_Object, Lisp_Object, + EMACS_UINT); void init_weak_hash_tables (void); extern void init_fns (void); EXFUN (Fmake_hash_table, MANY); === modified file 'src/minibuf.c' --- src/minibuf.c 2011-05-31 03:03:38 +0000 +++ src/minibuf.c 2011-05-31 05:52:26 +0000 @@ -1209,7 +1209,7 @@ && (!SYMBOLP (XCAR (collection)) || NILP (XCAR (collection))))) ? list_table : function_table)); - int idx = 0, obsize = 0; + EMACS_INT idx = 0, obsize = 0; int matchcount = 0; int bindcount = -1; Lisp_Object bucket, zero, end, tem; @@ -1474,7 +1474,7 @@ : NILP (collection) || (CONSP (collection) && (!SYMBOLP (XCAR (collection)) || NILP (XCAR (collection)))); - int idx = 0, obsize = 0; + EMACS_INT idx = 0, obsize = 0; int bindcount = -1; Lisp_Object bucket, tem, zero; struct gcpro gcpro1, gcpro2, gcpro3, gcpro4; @@ -1770,7 +1770,7 @@ (Lisp_Object string, Lisp_Object collection, Lisp_Object predicate) { Lisp_Object regexps, tail, tem = Qnil; - int i = 0; + EMACS_INT i = 0; CHECK_STRING (string); === modified file 'src/print.c' --- src/print.c 2011-05-21 04:33:23 +0000 +++ src/print.c 2011-05-31 05:52:26 +0000 @@ -1082,7 +1082,7 @@ Maybe a better way to do that is to copy elements to a new hash table. */ struct Lisp_Hash_Table *h = XHASH_TABLE (Vprint_number_table); - int i; + EMACS_INT i; for (i = 0; i < HASH_TABLE_SIZE (h); ++i) if (!NILP (HASH_HASH (h, i))