bug#32463: 27.0.50; (logior -1) => 4611686018427387903

unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed

From: Paul Eggert <eggert@cs.ucla.edu>
To: Pip Cet <pipcet@gmail.com>
Cc: andrewjmoreton@gmail.com, 32463@debbugs.gnu.org
Subject: bug#32463: 27.0.50; (logior -1) => 4611686018427387903
Date: Tue, 21 Aug 2018 02:40:34 -0700	[thread overview]
Message-ID: <7b6c9c32-c301-09db-d23e-ce976afcb5c1@cs.ucla.edu> (raw)
In-Reply-To: <CAOqdjBdKsDi4VjARaW-4w8EkV+JJQ2Fg2uMEb6rBWB5DWipH6w@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 953 bytes --]

Pip Cet wrote:

> I think we can be clever and wrap calls to mpz_mul_2exp (which can
> create arbitrary bignums) and whatever Fexpt uses....  I'd suggest a much
> lower limit until/unless the C-g issue is fixed, perhaps overridable
> by a user preference if people really want to use big bignums.

Yes, this sounds good. After stressing Emacs with bignums for a bit, I found 
that it was too easy to get Emacs to abort or hang by creating large bignums.

> I'm not sure what a reasonable limit would be, but I think a global
> limit of bignum size to something that allows for "immediate"
> computations would be best.

I installed the attached patch to do that. It tentatively defaults to a limit of 
2↑↑5 (i.e., 2**65536) for bignums, overrideable by setting a new variable 
'integer-width' that defaults to 65536. This default should be big enough for 
almost all Emacs applications and should avoid issues of aborts and hangs.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Avoid-libgmp-aborts-by-imposing-limits.patch --]
[-- Type: text/x-patch; name="0001-Avoid-libgmp-aborts-by-imposing-limits.patch", Size: 32306 bytes --]

From d6a497dd887cdbb35c5b4e2929e83962ba708159 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 21 Aug 2018 02:16:50 -0700
Subject: [PATCH] Avoid libgmp aborts by imposing limits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

libgmp calls ‘abort’ when given numbers too big for its
internal data structures.  The numeric limit is large and
platform-dependent; with 64-bit GMP 6.1.2 it is around
2**2**37.  Work around the problem by refusing to call libgmp
functions with arguments that would cause an abort.  With luck
libgmp will have a better way to do this in the future.
Also, introduce a variable integer-width that lets the user
control how large bignums can be.  This currently defaults
to 2**16, i.e., it allows bignums up to 2**2**16.  This
should be enough for ordinary computation, and should
help Emacs to avoid thrashing or hanging.
Problem noted by Pip Cet (Bug#32463#71).
* doc/lispref/numbers.texi, etc/NEWS:
Document recent bignum changes, including this one.
Improve documentation for bitwise operations, in the light
of bignums.
* src/alloc.c (make_number): Enforce integer-width.
(integer_overflow): New function.
(xrealloc_for_gmp, xfree_for_gmp):
Move here from emacs.c, as it's memory allocation.
(init_alloc): Initialize GMP here, rather than in emacs.c.
(integer_width): New var.
* src/data.c (GMP_NLIMBS_MAX, NLIMBS_LIMIT): New constants.
(emacs_mpz_size, emacs_mpz_mul)
(emacs_mpz_mul_2exp, emacs_mpz_pow_ui): New functions.
(arith_driver, Fash, expt_integer): Use them.
(expt_integer): New function, containing integer code
that was out of place in floatfns.c.
(check_bignum_size, xmalloc_for_gmp): Remove.
* src/emacs.c (main): Do not initialize GMP here.
* src/floatfns.c (Fexpt): Use expt_integer, which
now contains integer code moved from here.
* src/lisp.h (GMP_NUMB_BITS): Define if gmp.h doesn’t.
---
 doc/lispref/numbers.texi | 314 ++++++++++++++++++---------------------
 etc/NEWS                 |   6 +
 src/alloc.c              |  73 ++++++---
 src/data.c               | 109 +++++++++++++-
 src/emacs.c              |  34 -----
 src/floatfns.c           |  24 +--
 src/lisp.h               |  11 +-
 7 files changed, 321 insertions(+), 250 deletions(-)

diff --git a/doc/lispref/numbers.texi b/doc/lispref/numbers.texi
index 209e9f139a..9c16b1a64c 100644
--- a/doc/lispref/numbers.texi
+++ b/doc/lispref/numbers.texi
@@ -34,13 +34,21 @@ Numbers
 @node Integer Basics
 @section Integer Basics
 
-  Integers in Emacs Lisp can have arbitrary precision.
+  Integers in Emacs Lisp are not limited to the machine word size.
 
   Under the hood, though, there are two kinds of integers: smaller
 ones, called @dfn{fixnums}, and larger ones, called @dfn{bignums}.
-Some functions in Emacs only accept fixnums.  Also, while fixnums can
-always be compared for equality with @code{eq}, bignums require the
-use of @code{eql}.
+Some functions in Emacs accept only fixnums.  Also, while fixnums can
+always be compared for numeric equality with @code{eq}, bignums
+require more-heavyweight equality predicates like @code{eql}.
+
+  The range of values for bignums is limited by the amount of main
+memory, by machine characteristics such as the size of the word used
+to represent a bignum's exponent, and by the @code{integer-width}
+variable.  These limits are typically much more generous than the
+limits for fixnums.  A bignum is never numerically equal to a fixnum;
+if Emacs computes an integer in fixnum range, it represents the
+integer as a fixnum, not a bignum.
 
   The range of values for a fixnum depends on the machine.  The
 minimum range is @minus{}536,870,912 to 536,870,911 (30 bits; i.e.,
@@ -97,33 +105,30 @@ Integer Basics
 #24r1k @result{} 44
 @end example
 
-  An integer is read as a fixnum if it is in the correct range.
-Otherwise, it will be read as a bignum.
-
   To understand how various functions work on integers, especially the
 bitwise operators (@pxref{Bitwise Operations}), it is often helpful to
 view the numbers in their binary form.
 
-  In 30-bit binary, the decimal integer 5 looks like this:
+  In binary, the decimal integer 5 looks like this:
 
 @example
-0000...000101 (30 bits total)
+...000101
 @end example
 
 @noindent
-(The @samp{...} stands for enough bits to fill out a 30-bit word; in
-this case, @samp{...} stands for twenty 0 bits.  Later examples also
-use the @samp{...} notation to make binary integers easier to read.)
+(The @samp{...} stands for a conceptually infinite number of bits that
+match the leading bit; here, an infinite number of 0 bits.  Later
+examples also use this @samp{...} notation.)
 
   The integer @minus{}1 looks like this:
 
 @example
-1111...111111 (30 bits total)
+...111111
 @end example
 
 @noindent
 @cindex two's complement
-@minus{}1 is represented as 30 ones.  (This is called @dfn{two's
+@minus{}1 is represented as all ones.  (This is called @dfn{two's
 complement} notation.)
 
   Subtracting 4 from @minus{}1 returns the negative integer @minus{}5.
@@ -131,14 +136,7 @@ Integer Basics
 @minus{}5 looks like this:
 
 @example
-1111...111011 (30 bits total)
-@end example
-
-  In this implementation, the largest 30-bit binary integer is
-536,870,911 in decimal.  In binary, it looks like this:
-
-@example
-0111...111111 (30 bits total)
+...111011
 @end example
 
   Many of the functions described in this chapter accept markers for
@@ -147,10 +145,10 @@ Integer Basics
 give these arguments the name @var{number-or-marker}.  When the argument
 value is a marker, its position value is used and its buffer is ignored.
 
-@cindex largest Lisp integer
-@cindex maximum Lisp integer
+@cindex largest fixnum
+@cindex maximum fixnum
 @defvar most-positive-fixnum
-The value of this variable is the largest ``small'' integer that Emacs
+The value of this variable is the greatest ``small'' integer that Emacs
 Lisp can handle.  Typical values are
 @ifnottex
 2**29 @minus{} 1
@@ -168,11 +166,11 @@ Integer Basics
 on 64-bit platforms.
 @end defvar
 
-@cindex smallest Lisp integer
-@cindex minimum Lisp integer
+@cindex smallest fixnum
+@cindex minimum fixnum
 @defvar most-negative-fixnum
-The value of this variable is the smallest small integer that Emacs
-Lisp can handle.  It is negative.  Typical values are
+The value of this variable is the numerically least ``small'' integer
+that Emacs Lisp can handle.  It is negative.  Typical values are
 @ifnottex
 @minus{}2**29
 @end ifnottex
@@ -187,6 +185,19 @@ Integer Basics
 @math{-2^{61}}
 @end tex
 on 64-bit platforms.
+@end defvar
+
+@cindex bignum range
+@cindex integer range
+@defvar integer-width
+The value of this variable is a nonnegative integer that is an upper
+bound on the number of bits in a bignum.  Integers outside the fixnum
+range are limited to absolute values less than 2@sup{@var{n}}, where
+@var{n} is this variable's value.  Attempts to create bignums outside
+this range result in integer overflow.  Setting this variable to zero
+disables creation of bignums; setting it to a large number can cause
+Emacs to consume large quantities of memory if a computation creates
+huge integers.
 @end defvar
 
   In Emacs Lisp, text characters are represented by integers.  Any
@@ -378,17 +389,17 @@ Comparison of Numbers
 comparison would return @code{nil} and vice versa.  @xref{Float
 Basics}.
 
-  In Emacs Lisp, each small integer is a unique Lisp object.
-Therefore, @code{eq} is equivalent to @code{=} where small integers are
-concerned.  It is sometimes convenient to use @code{eq} for comparing
-an unknown value with an integer, because @code{eq} does not report an
+  In Emacs Lisp, if two fixnums are numerically equal, they are the
+same Lisp object.  That is, @code{eq} is equivalent to @code{=} on
+fixnums.  It is sometimes convenient to use @code{eq} for comparing
+an unknown value with a fixnum, because @code{eq} does not report an
 error if the unknown value is not a number---it accepts arguments of
 any type.  By contrast, @code{=} signals an error if the arguments are
 not numbers or markers.  However, it is better programming practice to
 use @code{=} if you can, even for comparing integers.
 
-  Sometimes it is useful to compare numbers with @code{equal}, which
-treats two numbers as equal if they have the same data type (both
+  Sometimes it is useful to compare numbers with @code{eql} or @code{equal},
+which treat two numbers as equal if they have the same data type (both
 integers, or both floating point) and the same value.  By contrast,
 @code{=} can treat an integer and a floating-point number as equal.
 @xref{Equality Predicates}.
@@ -830,142 +841,113 @@ Bitwise Operations
 @cindex logical arithmetic
 
   In a computer, an integer is represented as a binary number, a
-sequence of @dfn{bits} (digits which are either zero or one).  A bitwise
+sequence of @dfn{bits} (digits which are either zero or one).
+Conceptually the bit sequence is infinite on the left, with the
+most-significant bits being all zeros or all ones.  A bitwise
 operation acts on the individual bits of such a sequence.  For example,
 @dfn{shifting} moves the whole sequence left or right one or more places,
 reproducing the same pattern moved over.
 
   The bitwise operations in Emacs Lisp apply only to integers.
 
-@defun lsh integer1 count
-@cindex logical shift
-@code{lsh}, which is an abbreviation for @dfn{logical shift}, shifts the
-bits in @var{integer1} to the left @var{count} places, or to the right
-if @var{count} is negative, bringing zeros into the vacated bits.  If
-@var{count} is negative, @code{lsh} shifts zeros into the leftmost
-(most-significant) bit, producing a nonnegative result even if
-@var{integer1} is negative fixnum.  (If @var{integer1} is a negative
-bignum, @var{count} must be nonnegative.)  Contrast this with
-@code{ash}, below.
-
-Here are two examples of @code{lsh}, shifting a pattern of bits one
-place to the left.  We show only the low-order eight bits of the binary
-pattern; the rest are all zero.
+@defun ash integer1 count
+@cindex arithmetic shift
+@code{ash} (@dfn{arithmetic shift}) shifts the bits in @var{integer1}
+to the left @var{count} places, or to the right if @var{count} is
+negative.  Left shifts introduce zero bits on the right; right shifts
+discard the rightmost bits.  Considered as an integer operation,
+@code{ash} multiplies @var{integer1} by 2@sup{@var{count}} and then
+converts the result to an integer by rounding downward, toward
+minus infinity.
+
+Here are examples of @code{ash}, shifting a pattern of bits one place
+to the left and to the right.  These examples show only the low-order
+bits of the binary pattern; leading bits all agree with the
+highest-order bit shown.  As you can see, shifting left by one is
+equivalent to multiplying by two, whereas shifting right by one is
+equivalent to dividing by two and then rounding toward minus infinity.
 
 @example
 @group
-(lsh 5 1)
-     @result{} 10
-;; @r{Decimal 5 becomes decimal 10.}
-00000101 @result{} 00001010
-
-(lsh 7 1)
-     @result{} 14
+(ash 7 1) @result{} 14
 ;; @r{Decimal 7 becomes decimal 14.}
-00000111 @result{} 00001110
-@end group
-@end example
-
-@noindent
-As the examples illustrate, shifting the pattern of bits one place to
-the left produces a number that is twice the value of the previous
-number.
-
-Shifting a pattern of bits two places to the left produces results
-like this (with 8-bit binary numbers):
-
-@example
-@group
-(lsh 3 2)
-     @result{} 12
-;; @r{Decimal 3 becomes decimal 12.}
-00000011 @result{} 00001100
+...000111
+     @result{}
+...001110
 @end group
-@end example
-
-On the other hand, shifting one place to the right looks like this:
 
-@example
 @group
-(lsh 6 -1)
-     @result{} 3
-;; @r{Decimal 6 becomes decimal 3.}
-00000110 @result{} 00000011
+(ash 7 -1) @result{} 3
+...000111
+     @result{}
+...000011
 @end group
 
 @group
-(lsh 5 -1)
-     @result{} 2
-;; @r{Decimal 5 becomes decimal 2.}
-00000101 @result{} 00000010
+(ash -7 1) @result{} -14
+...111001
+     @result{}
+...110010
 @end group
-@end example
-
-@noindent
-As the example illustrates, shifting one place to the right divides the
-value of a positive integer by two, rounding downward.
-@end defun
-
-@defun ash integer1 count
-@cindex arithmetic shift
-@code{ash} (@dfn{arithmetic shift}) shifts the bits in @var{integer1}
-to the left @var{count} places, or to the right if @var{count}
-is negative.
-
-@code{ash} gives the same results as @code{lsh} except when
-@var{integer1} and @var{count} are both negative.  In that case,
-@code{ash} puts ones in the empty bit positions on the left, while
-@code{lsh} puts zeros in those bit positions and requires
-@var{integer1} to be a fixnum.
 
-Thus, with @code{ash}, shifting the pattern of bits one place to the right
-looks like this:
-
-@example
 @group
-(ash -6 -1) @result{} -3
-;; @r{Decimal @minus{}6 becomes decimal @minus{}3.}
-1111...111010 (30 bits total)
+(ash -7 -1) @result{} -4
+...111001
      @result{}
-1111...111101 (30 bits total)
+...111100
 @end group
 @end example
 
-Here are other examples:
+Here are examples of shifting left or right by two bits:
 
-@c !!! Check if lined up in smallbook format!  XDVI shows problem
-@c     with smallbook but not with regular book! --rjc 16mar92
 @smallexample
 @group
-                   ;  @r{       30-bit binary values}
-
-(lsh 5 2)          ;   5  =  @r{0000...000101}
-     @result{} 20         ;      =  @r{0000...010100}
-@end group
-@group
-(ash 5 2)
-     @result{} 20
-(lsh -5 2)         ;  -5  =  @r{1111...111011}
-     @result{} -20        ;      =  @r{1111...101100}
-(ash -5 2)
-     @result{} -20
+                  ;  @r{       binary values}
+(ash 5 2)         ;   5  =  @r{...000101}
+     @result{} 20         ;      =  @r{...010100}
+(ash -5 2)        ;  -5  =  @r{...111011}
+     @result{} -20        ;      =  @r{...101100}
 @end group
 @group
-(lsh 5 -2)         ;   5  =  @r{0000...000101}
-     @result{} 1          ;      =  @r{0000...000001}
+(ash 5 -2)
+     @result{} 1          ;      =  @r{...000001}
 @end group
 @group
-(ash 5 -2)
-     @result{} 1
+(ash -5 -2)
+     @result{} -2         ;      =  @r{...111110}
 @end group
+@end smallexample
+@end defun
+
+@defun lsh integer1 count
+@cindex logical shift
+@code{lsh}, which is an abbreviation for @dfn{logical shift}, shifts the
+bits in @var{integer1} to the left @var{count} places, or to the right
+if @var{count} is negative, bringing zeros into the vacated bits.  If
+@var{count} is negative, then @var{integer1} must be either a fixnum
+or a positive bignum, and @code{lsh} treats a negative fixnum as if it
+were unsigned by subtracting twice @code{most-negative-fixnum} before
+shifting, producing a nonnegative result.  This quirky behavior dates
+back to when Emacs supported only fixnums; nowadays @code{ash} is a
+better choice.
+
+As @code{lsh} behaves like @code{ash} except when @var{integer1} and
+@var{count1} are both negative, the following examples focus on these
+exceptional cases.  These examples assume 30-bit fixnums.
+
+@smallexample
 @group
-(lsh -5 -2)        ;  -5  =  @r{1111...111011}
-     @result{} 268435454
-                   ;      =  @r{0011...111110}
+                 ; @r{     binary values}
+(ash -7 -1)      ; -7 = @r{...111111111111111111111111111001}
+     @result{} -4        ;    = @r{...111111111111111111111111111100}
+(lsh -7 -1)
+     @result{} 536870908 ;    = @r{...011111111111111111111111111100}
 @end group
 @group
-(ash -5 -2)        ;  -5  =  @r{1111...111011}
-     @result{} -2         ;      =  @r{1111...111110}
+(ash -5 -2)      ; -5 = @r{...111111111111111111111111111011}
+     @result{} -2        ;    = @r{...111111111111111111111111111110}
+(lsh -5 -2)
+     @result{} 268435454 ;    = @r{...001111111111111111111111111110}
 @end group
 @end smallexample
 @end defun
@@ -999,23 +981,23 @@ Bitwise Operations
 
 @smallexample
 @group
-                   ; @r{       30-bit binary values}
+                   ; @r{       binary values}
 
-(logand 14 13)     ; 14  =  @r{0000...001110}
-                   ; 13  =  @r{0000...001101}
-     @result{} 12         ; 12  =  @r{0000...001100}
+(logand 14 13)     ; 14  =  @r{...001110}
+                   ; 13  =  @r{...001101}
+     @result{} 12         ; 12  =  @r{...001100}
 @end group
 
 @group
-(logand 14 13 4)   ; 14  =  @r{0000...001110}
-                   ; 13  =  @r{0000...001101}
-                   ;  4  =  @r{0000...000100}
-     @result{} 4          ;  4  =  @r{0000...000100}
+(logand 14 13 4)   ; 14  =  @r{...001110}
+                   ; 13  =  @r{...001101}
+                   ;  4  =  @r{...000100}
+     @result{} 4          ;  4  =  @r{...000100}
 @end group
 
 @group
 (logand)
-     @result{} -1         ; -1  =  @r{1111...111111}
+     @result{} -1         ; -1  =  @r{...111111}
 @end group
 @end smallexample
 @end defun
@@ -1029,18 +1011,18 @@ Bitwise Operations
 
 @smallexample
 @group
-                   ; @r{       30-bit binary values}
+                   ; @r{       binary values}
 
-(logior 12 5)      ; 12  =  @r{0000...001100}
-                   ;  5  =  @r{0000...000101}
-     @result{} 13         ; 13  =  @r{0000...001101}
+(logior 12 5)      ; 12  =  @r{...001100}
+                   ;  5  =  @r{...000101}
+     @result{} 13         ; 13  =  @r{...001101}
 @end group
 
 @group
-(logior 12 5 7)    ; 12  =  @r{0000...001100}
-                   ;  5  =  @r{0000...000101}
-                   ;  7  =  @r{0000...000111}
-     @result{} 15         ; 15  =  @r{0000...001111}
+(logior 12 5 7)    ; 12  =  @r{...001100}
+                   ;  5  =  @r{...000101}
+                   ;  7  =  @r{...000111}
+     @result{} 15         ; 15  =  @r{...001111}
 @end group
 @end smallexample
 @end defun
@@ -1054,18 +1036,18 @@ Bitwise Operations
 
 @smallexample
 @group
-                   ; @r{       30-bit binary values}
+                   ; @r{       binary values}
 
-(logxor 12 5)      ; 12  =  @r{0000...001100}
-                   ;  5  =  @r{0000...000101}
-     @result{} 9          ;  9  =  @r{0000...001001}
+(logxor 12 5)      ; 12  =  @r{...001100}
+                   ;  5  =  @r{...000101}
+     @result{} 9          ;  9  =  @r{...001001}
 @end group
 
 @group
-(logxor 12 5 7)    ; 12  =  @r{0000...001100}
-                   ;  5  =  @r{0000...000101}
-                   ;  7  =  @r{0000...000111}
-     @result{} 14         ; 14  =  @r{0000...001110}
+(logxor 12 5 7)    ; 12  =  @r{...001100}
+                   ;  5  =  @r{...000101}
+                   ;  7  =  @r{...000111}
+     @result{} 14         ; 14  =  @r{...001110}
 @end group
 @end smallexample
 @end defun
@@ -1078,9 +1060,9 @@ Bitwise Operations
 @example
 (lognot 5)
      @result{} -6
-;;  5  =  @r{0000...000101} (30 bits total)
+;;  5  =  @r{...000101}
 ;; @r{becomes}
-;; -6  =  @r{1111...111010} (30 bits total)
+;; -6  =  @r{...111010}
 @end example
 @end defun
 
@@ -1095,9 +1077,9 @@ Bitwise Operations
 nonnegative.
 
 @example
-(logcount 43)     ; 43 = #b101011
+(logcount 43)     ;  43 = @r{...000101011}
      @result{} 4
-(logcount -43)    ; -43 = #b111...1010101
+(logcount -43)    ; -43 = @r{...111010101}
      @result{} 3
 @end example
 @end defun
diff --git a/etc/NEWS b/etc/NEWS
index a9f8ed2ef8..9a74164421 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -871,6 +871,12 @@ bignums.  However, note that unlike fixnums, bignums will not compare
 equal with 'eq', you must use 'eql' instead.  (Numerical comparison
 with '=' works on both, of course.)
 
++++
+** New variable 'integer-width'.
+It is a nonnegative integer specifying the maximum number of bits
+allowed in a bignum.  Integer overflow occurs if this limit is
+exceeded.
+
 ** define-minor-mode automatically documents the meaning of ARG
 
 +++
diff --git a/src/alloc.c b/src/alloc.c
index ddc0696ba9..24a24aab96 100644
--- a/src/alloc.c
+++ b/src/alloc.c
@@ -3746,33 +3746,33 @@ make_bignum_str (const char *num, int base)
 Lisp_Object
 make_number (mpz_t value)
 {
-  if (mpz_fits_slong_p (value))
-    {
-      long l = mpz_get_si (value);
-      if (!FIXNUM_OVERFLOW_P (l))
-	return make_fixnum (l);
-    }
-  else if (LONG_WIDTH < FIXNUM_BITS)
+  size_t bits = mpz_sizeinbase (value, 2);
+
+  if (bits <= FIXNUM_BITS)
     {
-      size_t bits = mpz_sizeinbase (value, 2);
+      EMACS_INT v = 0;
+      int i = 0, shift = 0;
 
-      if (bits <= FIXNUM_BITS)
-        {
-          EMACS_INT v = 0;
-	  int i = 0;
-	  for (int shift = 0; shift < bits; shift += mp_bits_per_limb)
-            {
-	      EMACS_INT limb = mpz_getlimbn (value, i++);
-	      v += limb << shift;
-            }
-	  if (mpz_sgn (value) < 0)
-            v = -v;
+      do
+	{
+	  EMACS_INT limb = mpz_getlimbn (value, i++);
+	  v += limb << shift;
+	  shift += GMP_NUMB_BITS;
+	}
+      while (shift < bits);
 
-          if (!FIXNUM_OVERFLOW_P (v))
-	    return make_fixnum (v);
-        }
+      if (mpz_sgn (value) < 0)
+	v = -v;
+
+      if (!FIXNUM_OVERFLOW_P (v))
+	return make_fixnum (v);
     }
 
+  /* The documentation says integer-width should be nonnegative, so
+     a single comparison suffices even though 'bits' is unsigned.  */
+  if (integer_width < bits)
+    integer_overflow ();
+
   struct Lisp_Bignum *b = ALLOCATE_PSEUDOVECTOR (struct Lisp_Bignum, value,
 						 PVEC_BIGNUM);
   /* We could mpz_init + mpz_swap here, to avoid a copy, but the
@@ -7200,6 +7200,26 @@ verify_alloca (void)
 
 #endif /* ENABLE_CHECKING && USE_STACK_LISP_OBJECTS */
 
+/* Memory allocation for GMP.  */
+
+void
+integer_overflow (void)
+{
+  error ("Integer too large to be represented");
+}
+
+static void *
+xrealloc_for_gmp (void *ptr, size_t ignore, size_t size)
+{
+  return xrealloc (ptr, size);
+}
+
+static void
+xfree_for_gmp (void *ptr, size_t ignore)
+{
+  xfree (ptr);
+}
+
 /* Initialization.  */
 
 void
@@ -7233,6 +7253,10 @@ init_alloc_once (void)
 void
 init_alloc (void)
 {
+  eassert (mp_bits_per_limb == GMP_NUMB_BITS);
+  integer_width = 1 << 16;
+  mp_set_memory_functions (xmalloc, xrealloc_for_gmp, xfree_for_gmp);
+
   Vgc_elapsed = make_float (0.0);
   gcs_done = 0;
 
@@ -7335,6 +7359,11 @@ The time is in seconds as a floating point value.  */);
   DEFVAR_INT ("gcs-done", gcs_done,
               doc: /* Accumulated number of garbage collections done.  */);
 
+  DEFVAR_INT ("integer-width", integer_width,
+	      doc: /* Maximum number of bits in bignums.
+Integers outside the fixnum range are limited to absolute values less
+than 2**N, where N is this variable's value.  N should be nonnegative.  */);
+
   defsubr (&Scons);
   defsubr (&Slist);
   defsubr (&Svector);
diff --git a/src/data.c b/src/data.c
index 8a6975da3a..4c6d33f294 100644
--- a/src/data.c
+++ b/src/data.c
@@ -2384,6 +2384,80 @@ bool-vector.  IDX starts at 0.  */)
   return newelt;
 }
 \f
+/* GMP tests for this value and aborts (!) if it is exceeded.
+   This is as of GMP 6.1.2 (2016); perhaps future versions will differ.  */
+enum { GMP_NLIMBS_MAX = min (INT_MAX, ULONG_MAX / GMP_NUMB_BITS) };
+
+/* An upper bound on limb counts, needed to prevent libgmp and/or
+   Emacs from aborting or otherwise misbehaving.  This bound applies
+   to estimates of mpz_t sizes before the mpz_t objects are created,
+   as opposed to integer-width which operates on mpz_t values after
+   creation and before conversion to Lisp bignums.  */
+enum
+  {
+   NLIMBS_LIMIT = min (min (/* libgmp needs to store limb counts.  */
+			    GMP_NLIMBS_MAX,
+
+			    /* Size calculations need to work.  */
+			    min (PTRDIFF_MAX, SIZE_MAX) / sizeof (mp_limb_t)),
+
+		       /* Emacs puts bit counts into fixnums.  */
+		       MOST_POSITIVE_FIXNUM / GMP_NUMB_BITS)
+  };
+
+/* Like mpz_size, but tell the compiler the result is a nonnegative int.  */
+
+static int
+emacs_mpz_size (mpz_t const op)
+{
+  mp_size_t size = mpz_size (op);
+  eassume (0 <= size && size <= INT_MAX);
+  return size;
+}
+
+/* Wrappers to work around GMP limitations.  As of GMP 6.1.2 (2016),
+   the library code aborts when a number is too large.  These wrappers
+   avoid the problem for functions that can return numbers much larger
+   than their arguments.  For slowly-growing numbers, the integer
+   width check in make_number should suffice.  */
+
+static void
+emacs_mpz_mul (mpz_t rop, mpz_t const op1, mpz_t const op2)
+{
+  if (NLIMBS_LIMIT - emacs_mpz_size (op1) < emacs_mpz_size (op2))
+    integer_overflow ();
+  mpz_mul (rop, op1, op2);
+}
+
+static void
+emacs_mpz_mul_2exp (mpz_t rop, mpz_t const op1, mp_bitcnt_t op2)
+{
+  /* Fudge factor derived from GMP 6.1.2, to avoid an abort in
+     mpz_mul_2exp (look for the '+ 1' in its source code).  */
+  enum { mul_2exp_extra_limbs = 1 };
+  enum { lim = min (NLIMBS_LIMIT, GMP_NLIMBS_MAX - mul_2exp_extra_limbs) };
+
+  mp_bitcnt_t op2limbs = op2 / GMP_NUMB_BITS;
+  if (lim - emacs_mpz_size (op1) < op2limbs)
+    integer_overflow ();
+  mpz_mul_2exp (rop, op1, op2);
+}
+
+static void
+emacs_mpz_pow_ui (mpz_t rop, mpz_t const base, unsigned long exp)
+{
+  /* This fudge factor is derived from GMP 6.1.2, to avoid an abort in
+     mpz_n_pow_ui (look for the '5' in its source code).  */
+  enum { pow_ui_extra_limbs = 5 };
+  enum { lim = min (NLIMBS_LIMIT, GMP_NLIMBS_MAX - pow_ui_extra_limbs) };
+
+  int nbase = emacs_mpz_size (base), n;
+  if (INT_MULTIPLY_WRAPV (nbase, exp, &n) || lim < n)
+    integer_overflow ();
+  mpz_pow_ui (rop, base, exp);
+}
+
+\f
 /* Arithmetic functions */
 
 Lisp_Object
@@ -2872,13 +2946,13 @@ arith_driver (enum arithop code, ptrdiff_t nargs, Lisp_Object *args)
 	  break;
 	case Amult:
 	  if (BIGNUMP (val))
-	    mpz_mul (accum, accum, XBIGNUM (val)->value);
+	    emacs_mpz_mul (accum, accum, XBIGNUM (val)->value);
 	  else if (! FIXNUMS_FIT_IN_LONG)
             {
 	      mpz_t tem;
 	      mpz_init (tem);
 	      mpz_set_intmax (tem, XFIXNUM (val));
-	      mpz_mul (accum, accum, tem);
+	      emacs_mpz_mul (accum, accum, tem);
 	      mpz_clear (tem);
             }
 	  else
@@ -3293,7 +3367,7 @@ In this case, the sign bit is duplicated.  */)
       mpz_t result;
       mpz_init (result);
       if (XFIXNUM (count) > 0)
-	mpz_mul_2exp (result, XBIGNUM (value)->value, XFIXNUM (count));
+	emacs_mpz_mul_2exp (result, XBIGNUM (value)->value, XFIXNUM (count));
       else
 	mpz_fdiv_q_2exp (result, XBIGNUM (value)->value, - XFIXNUM (count));
       val = make_number (result);
@@ -3319,7 +3393,7 @@ In this case, the sign bit is duplicated.  */)
       mpz_set_intmax (result, XFIXNUM (value));
 
       if (XFIXNUM (count) >= 0)
-	mpz_mul_2exp (result, result, XFIXNUM (count));
+	emacs_mpz_mul_2exp (result, result, XFIXNUM (count));
       else
 	mpz_fdiv_q_2exp (result, result, - XFIXNUM (count));
 
@@ -3330,6 +3404,33 @@ In this case, the sign bit is duplicated.  */)
   return val;
 }
 
+/* Return X ** Y as an integer.  X and Y must be integers, and Y must
+   be nonnegative.  */
+
+Lisp_Object
+expt_integer (Lisp_Object x, Lisp_Object y)
+{
+  unsigned long exp;
+  if (TYPE_RANGED_FIXNUMP (unsigned long, y))
+    exp = XFIXNUM (y);
+  else if (MOST_POSITIVE_FIXNUM < ULONG_MAX && BIGNUMP (y)
+	   && mpz_fits_ulong_p (XBIGNUM (y)->value))
+    exp = mpz_get_ui (XBIGNUM (y)->value);
+  else
+    integer_overflow ();
+
+  mpz_t val;
+  mpz_init (val);
+  emacs_mpz_pow_ui (val,
+		    (FIXNUMP (x)
+		     ? (mpz_set_intmax (val, XFIXNUM (x)), val)
+		     : XBIGNUM (x)->value),
+		    exp);
+  Lisp_Object res = make_number (val);
+  mpz_clear (val);
+  return res;
+}
+
 DEFUN ("1+", Fadd1, Sadd1, 1, 1, 0,
        doc: /* Return NUMBER plus one.  NUMBER may be a number or a marker.
 Markers are converted to integers.  */)
diff --git a/src/emacs.c b/src/emacs.c
index 11ee0b8118..7d07ec8502 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -673,38 +673,6 @@ close_output_streams (void)
     _exit (EXIT_FAILURE);
 }
 
-/* Memory allocation functions for GMP.  */
-
-static void
-check_bignum_size (size_t size)
-{
-  /* Do not create a bignum whose log base 2 could exceed fixnum range.
-     This way, functions like mpz_popcount return values in fixnum range.
-     It may also help to avoid other problems with outlandish bignums.  */
-  if (MOST_POSITIVE_FIXNUM / CHAR_BIT < size)
-    error ("Integer too large to be represented");
-}
-
-static void * ATTRIBUTE_MALLOC
-xmalloc_for_gmp (size_t size)
-{
-  check_bignum_size (size);
-  return xmalloc (size);
-}
-
-static void *
-xrealloc_for_gmp (void *ptr, size_t ignore, size_t size)
-{
-  check_bignum_size (size);
-  return xrealloc (ptr, size);
-}
-
-static void
-xfree_for_gmp (void *ptr, size_t ignore)
-{
-  xfree (ptr);
-}
-
 /* ARGSUSED */
 int
 main (int argc, char **argv)
@@ -803,8 +771,6 @@ main (int argc, char **argv)
   init_standard_fds ();
   atexit (close_output_streams);
 
-  mp_set_memory_functions (xmalloc_for_gmp, xrealloc_for_gmp, xfree_for_gmp);
-
   sort_args (argc, argv);
   argc = 0;
   while (argv[argc]) argc++;
diff --git a/src/floatfns.c b/src/floatfns.c
index 7c52a0a9a2..ea9000b90a 100644
--- a/src/floatfns.c
+++ b/src/floatfns.c
@@ -210,29 +210,7 @@ DEFUN ("expt", Fexpt, Sexpt, 2, 2, 0,
   /* Common Lisp spec: don't promote if both are integers, and if the
      result is not fractional.  */
   if (INTEGERP (arg1) && NATNUMP (arg2))
-    {
-      unsigned long exp;
-      if (TYPE_RANGED_FIXNUMP (unsigned long, arg2))
-	exp = XFIXNUM (arg2);
-      else if (MOST_POSITIVE_FIXNUM < ULONG_MAX && BIGNUMP (arg2)
-	       && mpz_fits_ulong_p (XBIGNUM (arg2)->value))
-	exp = mpz_get_ui (XBIGNUM (arg2)->value);
-      else
-	xsignal3 (Qrange_error, build_string ("expt"), arg1, arg2);
-
-      mpz_t val;
-      mpz_init (val);
-      if (FIXNUMP (arg1))
-	{
-	  mpz_set_intmax (val, XFIXNUM (arg1));
-	  mpz_pow_ui (val, val, exp);
-	}
-      else
-	mpz_pow_ui (val, XBIGNUM (arg1)->value, exp);
-      Lisp_Object res = make_number (val);
-      mpz_clear (val);
-      return res;
-    }
+    return expt_integer (arg1, arg2);
 
   return make_float (pow (XFLOATINT (arg1), XFLOATINT (arg2)));
 }
diff --git a/src/lisp.h b/src/lisp.h
index fe384d1844..8f48a33484 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -996,6 +996,14 @@ enum More_Lisp_Bits
 #define MOST_POSITIVE_FIXNUM (EMACS_INT_MAX >> INTTYPEBITS)
 #define MOST_NEGATIVE_FIXNUM (-1 - MOST_POSITIVE_FIXNUM)
 
+
+/* GMP-related limits.  */
+
+/* Number of data bits in a limb.  */
+#ifndef GMP_NUMB_BITS
+enum { GMP_NUMB_BITS = TYPE_WIDTH (mp_limb_t) };
+#endif
+
 #if USE_LSB_TAG
 
 INLINE Lisp_Object
@@ -3338,7 +3346,7 @@ extern void set_internal (Lisp_Object, Lisp_Object, Lisp_Object,
                           enum Set_Internal_Bind);
 extern void set_default_internal (Lisp_Object, Lisp_Object,
                                   enum Set_Internal_Bind bindflag);
-
+extern Lisp_Object expt_integer (Lisp_Object, Lisp_Object);
 extern void syms_of_data (void);
 extern void swap_in_global_binding (struct Lisp_Symbol *);
 
@@ -3700,6 +3708,7 @@ extern void display_malloc_warning (void);
 extern ptrdiff_t inhibit_garbage_collection (void);
 extern Lisp_Object build_overlay (Lisp_Object, Lisp_Object, Lisp_Object);
 extern void free_cons (struct Lisp_Cons *);
+extern _Noreturn void integer_overflow (void);
 extern void init_alloc_once (void);
 extern void init_alloc (void);
 extern void syms_of_alloc (void);
-- 
2.17.1

next prev parent reply	other threads:[~2018-08-21  9:40 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-17  3:29 bug#32463: 27.0.50; (logior -1) => 4611686018427387903 Katsumi Yamaoka
2018-08-17  5:59 ` Pip Cet
2018-08-17  7:40   ` Katsumi Yamaoka
2018-08-17  9:27   ` Andy Moreton
2018-08-17 11:36     ` Pip Cet
2018-08-17 11:53       ` Pip Cet
2018-08-17 13:27         ` Andy Moreton
2018-08-18 22:43         ` Paul Eggert
2018-08-17 13:24       ` Andy Moreton
2018-08-18 18:48       ` Paul Eggert
2018-08-18 18:59         ` Eli Zaretskii
2018-08-18 19:58           ` Pip Cet
2018-08-18 22:27             ` Paul Eggert
2018-08-19 15:03             ` Eli Zaretskii
2018-08-18 19:59           ` Paul Eggert
2018-08-18 19:45         ` Andy Moreton
2018-08-19 10:43           ` Live System User
2018-08-20  3:02       ` Richard Stallman
2018-08-20  3:47         ` Paul Eggert
2018-08-21  3:37           ` Richard Stallman
2018-08-18 22:56 ` Paul Eggert
2018-08-18 23:17   ` Paul Eggert
2018-08-19 10:34   ` Andy Moreton
2018-08-19 10:48   ` Pip Cet
2018-08-19 10:59     ` Paul Eggert
2018-08-19 11:32       ` Pip Cet
2018-08-21  9:40         ` Paul Eggert [this message]
2018-08-21 10:50           ` Andy Moreton
2018-08-21 14:36           ` Eli Zaretskii
2018-08-21 14:52             ` Andy Moreton
2018-08-21 17:24             ` Paul Eggert
2018-08-19 10:52   ` Paul Eggert
2018-08-22  2:29     ` Paul Eggert
2018-08-22 16:56   ` Tom Tromey
2018-08-22 17:52     ` Paul Eggert
2018-08-22 18:25       ` Eli Zaretskii
2018-08-23  0:28         ` Paul Eggert
2018-08-23  2:39           ` Eli Zaretskii
2018-08-19 18:00 ` Andy Moreton
2018-08-22  2:34 ` Paul Eggert
2018-08-22 23:27   ` Andy Moreton
2018-08-23 14:05     ` Eli Zaretskii
2018-08-22  2:56 ` Paul Eggert
2018-08-22  8:20   ` Andy Moreton
2018-08-22  8:39 ` Andy Moreton

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:209e9f139 dfblob:9c16b1a64 dfblob:a9f8ed2ef dfblob:9a7416442
dfblob:ddc0696ba dfblob:24a24aab9 dfblob:8a6975da3 dfblob:4c6d33f29
dfblob:11ee0b811 dfblob:7d07ec850 dfblob:7c52a0a9a dfblob:ea9000b90
dfblob:fe384d184 dfblob:8f48a3348 )
 OR (
bs:"Avoid libgmp aborts by imposing limits" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7b6c9c32-c301-09db-d23e-ce976afcb5c1@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=32463@debbugs.gnu.org \
    --cc=andrewjmoreton@gmail.com \
    --cc=pipcet@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).