From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Paul Eggert Newsgroups: gmane.emacs.bugs Subject: bug#8794: cons_to_long fixes; making 64-bit EMACS_INT the default Date: Fri, 03 Jun 2011 20:05:03 -0700 Organization: UCLA Computer Science Department Message-ID: <4DE9A0DF.2030801@cs.ucla.edu> References: <4DE89EB8.9020202@cs.ucla.edu> <83oc2fdw59.fsf@gnu.org> <4DE91FB3.80601@cs.ucla.edu> <83hb86em4k.fsf@gnu.org> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Trace: dough.gmane.org 1307156790 18627 80.91.229.12 (4 Jun 2011 03:06:30 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Sat, 4 Jun 2011 03:06:30 +0000 (UTC) Cc: 8794@debbugs.gnu.org To: Eli Zaretskii Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Sat Jun 04 05:06:26 2011 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([140.186.70.17]) by lo.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1QShBv-0008N1-Iw for geb-bug-gnu-emacs@m.gmane.org; Sat, 04 Jun 2011 05:06:23 +0200 Original-Received: from localhost ([::1]:52254 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QShBu-0006yo-DO for geb-bug-gnu-emacs@m.gmane.org; Fri, 03 Jun 2011 23:06:22 -0400 Original-Received: from eggs.gnu.org ([140.186.70.92]:55218) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QShBb-0006yh-UO for bug-gnu-emacs@gnu.org; Fri, 03 Jun 2011 23:06:05 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QShBa-0002L4-Dt for bug-gnu-emacs@gnu.org; Fri, 03 Jun 2011 23:06:03 -0400 Original-Received: from debbugs.gnu.org ([140.186.70.43]:45400) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QShBa-0002Kz-9u for bug-gnu-emacs@gnu.org; Fri, 03 Jun 2011 23:06:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.69) (envelope-from ) id 1QShBZ-0004u0-QB; Fri, 03 Jun 2011 23:06:01 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Paul Eggert Original-Sender: debbugs-submit-bounces@debbugs.gnu.org Resent-To: owner@debbugs.gnu.org Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 04 Jun 2011 03:06:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 8794 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 8794-submit@debbugs.gnu.org id=B8794.130715671918794 (code B ref 8794); Sat, 04 Jun 2011 03:06:01 +0000 Original-Received: (at 8794) by debbugs.gnu.org; 4 Jun 2011 03:05:19 +0000 Original-Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QShAt-0004t5-03 for submit@debbugs.gnu.org; Fri, 03 Jun 2011 23:05:19 -0400 Original-Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1QShAp-0004st-V5 for 8794@debbugs.gnu.org; Fri, 03 Jun 2011 23:05:17 -0400 Original-Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 38A3C39E80FF; Fri, 3 Jun 2011 20:05:10 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Original-Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8-ulA0Xmndm8; Fri, 3 Jun 2011 20:05:08 -0700 (PDT) Original-Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 5EF8639E80F0; Fri, 3 Jun 2011 20:05:08 -0700 (PDT) User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.17) Gecko/20110424 Thunderbird/3.1.10 In-Reply-To: <83hb86em4k.fsf@gnu.org> X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list Resent-Date: Fri, 03 Jun 2011 23:06:01 -0400 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 140.186.70.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.bugs:46945 Archived-At: On 06/03/11 12:43, Eli Zaretskii wrote: > maybe 2GB is not the limit, but 4GB surely is. Yes, that's right. A host with 32-bit addresses can address at most 4 GiB. >> Perhaps you're thinking of pointer subtraction? That often stops working on >> arrays larger than 2 GiB. But this is easy to program around. > > Well, then we need to program around that, _before_ we promise buffers > larger than 2GB on 32-bit hosts. OK, good point. Rather than embark on that further fix, let's just get this one done. The further patches below fix the documentation so that it talks about the 2 GiB limit on typical 32-bit hosts, and fix the code to enforce this limit. > What about ULONG_MAX in this patch to xselect.c: > ... > ? There are also USHRT_MAX, LONG_MAX, CHAR_MAX, and SHRT_MAX there, > but I see no limits.h being included. How did that compile for you? Thanks for catching that. It worked for me because my platform needed a replacement inttypes.h, which in turn included limits.h. More up-to-date platforms would use the system inttypes.h, which wouldn't do that. The patches below fix that too. One more patch (also below) catches a similar problem with 32-bit integer overflow that I noticed in image.c while I was fixing the other problems. It has no dependencies on the other patches. Thanks for the review. :::::::::::::: diff1 :::::::::::::: === modified file 'src/ChangeLog' --- src/ChangeLog 2011-06-03 19:02:25 +0000 +++ src/ChangeLog 2011-06-03 20:14:12 +0000 @@ -19,7 +19,8 @@ * undo.c (record_first_change): Use INTEGER_TO_CONS. (Fprimitive_undo): Use CONS_TO_INTEGER. * xfns.c (Fx_window_property): Likewise. - * xselect.c (x_own_selection, selection_data_to_lisp_data): + * xselect.c: Include . + (x_own_selection, selection_data_to_lisp_data): Use INTEGER_TO_CONS. (x_handle_selection_request, x_handle_selection_clear) (x_get_foreign_selection, Fx_disown_selection_internal) === modified file 'src/xselect.c' --- src/xselect.c 2011-06-03 19:02:25 +0000 +++ src/xselect.c 2011-06-03 20:14:12 +0000 @@ -20,6 +20,7 @@ /* Rewritten by jwz */ #include +#include #include /* termhooks.h needs this */ #include :::::::::::::: diff2 :::::::::::::: === modified file 'doc/emacs/buffers.texi' --- doc/emacs/buffers.texi 2011-06-03 18:47:14 +0000 +++ doc/emacs/buffers.texi 2011-06-03 23:21:13 +0000 @@ -43,8 +43,9 @@ A buffer's size cannot be larger than some maximum, which is defined by the largest buffer position representable by the @dfn{Emacs integer} data type. This is because Emacs tracks buffer positions -using that data type. For most machines, the maximum buffer size +using that data type. For 64-bit machines, the maximum buffer size enforced by the data types is @math{2^61 - 2} bytes, or about 2 EiB. +For most 32-bit machines, the maximum is @math{2^31 - 1} bytes, or about 2 GiB. For some older machines, the maximum is @math{2^29 - 2} bytes, or about 512 MiB. Buffer sizes are also limited by the size of Emacs's virtual memory. :::::::::::::: diff3 :::::::::::::: === modified file 'src/ChangeLog' --- src/ChangeLog 2011-06-03 20:14:12 +0000 +++ src/ChangeLog 2011-06-04 02:02:36 +0000 @@ -1,3 +1,11 @@ +2011-06-04 Paul Eggert + + Use ptrdiff_t, not int, for sizes. + * image.c (slurp_file): Switch from int to ptrdiff_t. + All uses changed. + (slurp_file, svg_load): Check that file size fits in both + size_t (for malloc) and ptrdiff_t (for sanity and safety). + 2011-06-03 Paul Eggert Check for overflow when converting integer to cons and back. === modified file 'src/image.c' --- src/image.c 2011-05-31 06:05:00 +0000 +++ src/image.c 2011-06-04 02:02:36 +0000 @@ -2112,9 +2112,6 @@ File Handling ***********************************************************************/ -static unsigned char *slurp_file (char *, int *); - - /* Find image file FILE. Look in data-directory/images, then x-bitmap-file-path. Value is the encoded full name of the file found, or nil if not found. */ @@ -2151,7 +2148,7 @@ occurred. *SIZE is set to the size of the file. */ static unsigned char * -slurp_file (char *file, int *size) +slurp_file (char *file, ptrdiff_t *size) { FILE *fp = NULL; unsigned char *buf = NULL; @@ -2159,6 +2156,7 @@ if (stat (file, &st) == 0 && (fp = fopen (file, "rb")) != NULL + && 0 <= st.st_size && st.st_size <= min (PTRDIFF_MAX, SIZE_MAX) && (buf = (unsigned char *) xmalloc (st.st_size), fread (buf, 1, st.st_size, fp) == st.st_size)) { @@ -2814,7 +2812,7 @@ { Lisp_Object file; unsigned char *contents; - int size; + ptrdiff_t size; file = x_find_image_file (file_name); if (!STRINGP (file)) @@ -4039,7 +4037,7 @@ { Lisp_Object file; unsigned char *contents; - int size; + ptrdiff_t size; file = x_find_image_file (file_name); if (!STRINGP (file)) @@ -5021,6 +5019,7 @@ if (stat (SDATA (file), &st) == 0 && (fp = fopen (SDATA (file), "rb")) != NULL + && 0 <= st.st_size && st.st_size <= min (PTRDIFF_MAX, SIZE_MAX) && (buf = (char *) xmalloc (st.st_size), fread (buf, 1, st.st_size, fp) == st.st_size)) { @@ -5055,7 +5054,7 @@ enum {PBM_MONO, PBM_GRAY, PBM_COLOR} type; unsigned char *contents = NULL; unsigned char *end, *p; - int size; + ptrdiff_t size; specified_file = image_spec_value (img->spec, QCfile, NULL); @@ -7869,7 +7868,7 @@ static int svg_load (struct frame *f, struct image *img); static int svg_load_image (struct frame *, struct image *, - unsigned char *, unsigned int); + unsigned char *, ptrdiff_t); /* The symbol `svg' identifying images of this type. */ @@ -8047,7 +8046,7 @@ { Lisp_Object file; unsigned char *contents; - int size; + ptrdiff_t size; file = x_find_image_file (file_name); if (!STRINGP (file)) @@ -8074,7 +8073,7 @@ Lisp_Object data; data = image_spec_value (img->spec, QCdata, NULL); - if (!STRINGP (data)) + if (! (STRINGP (data) && SBYTES (data) <= min (PTRDIFF_MAX, SIZE_MAX))) { image_error ("Invalid image data `%s'", data, Qnil); return 0; @@ -8096,7 +8095,7 @@ svg_load_image (struct frame *f, /* Pointer to emacs frame structure. */ struct image *img, /* Pointer to emacs image structure. */ unsigned char *contents, /* String containing the SVG XML data to be parsed. */ - unsigned int size) /* Size of data in bytes. */ + ptrdiff_t size) /* Size of data in bytes. */ { RsvgHandle *rsvg_handle; RsvgDimensionData dimension_data; :::::::::::::: diff4 :::::::::::::: === modified file 'src/ChangeLog' --- src/ChangeLog 2011-06-04 02:02:36 +0000 +++ src/ChangeLog 2011-06-04 02:49:51 +0000 @@ -1,5 +1,20 @@ 2011-06-04 Paul Eggert + Check for buffer and string overflow more precisely. + * buffer.h (BUF_BYTES_MAX): New macro. + * lisp.h (STRING_BYTES_MAX): New macro. + * alloc.c (Fmake_string): + * character.c (string_escape_byte8): + * coding.c (coding_alloc_by_realloc): + * doprnt.c (doprnt): + * editfns.c (Fformat): + * eval.c (verror): + Use STRING_BYTES_MAX, not MOST_POSITIVE_FIXNUM, + since they may not be the same number. + * editfns.c (Finsert_char): + * fileio.c (Finsert_file_contents): + Likewise for BUF_BYTES_MAX. + Use ptrdiff_t, not int, for sizes. * image.c (slurp_file): Switch from int to ptrdiff_t. All uses changed. === modified file 'src/alloc.c' --- src/alloc.c 2011-06-02 08:35:28 +0000 +++ src/alloc.c 2011-06-04 02:49:51 +0000 @@ -2205,7 +2205,7 @@ int len = CHAR_STRING (c, str); EMACS_INT string_len = XINT (length); - if (string_len > MOST_POSITIVE_FIXNUM / len) + if (string_len > STRING_BYTES_MAX / len) string_overflow (); nbytes = len * string_len; val = make_uninit_multibyte_string (string_len, nbytes); === modified file 'src/buffer.h' --- src/buffer.h 2011-06-02 06:15:15 +0000 +++ src/buffer.h 2011-06-04 02:49:51 +0000 @@ -306,6 +306,11 @@ } \ while (0) +/* Maximum number of bytes in a buffer. + A buffer cannot contain more bytes than a 1-origin fixnum can represent, + nor can it be so large that C pointer arithmetic stops working. */ +#define BUF_BYTES_MAX min (MOST_POSITIVE_FIXNUM - 1, min (SIZE_MAX, PTRDIFF_MAX)) + /* Return the address of byte position N in current buffer. */ #define BYTE_POS_ADDR(n) \ === modified file 'src/character.c' --- src/character.c 2011-06-02 06:17:35 +0000 +++ src/character.c 2011-06-04 02:49:51 +0000 @@ -838,7 +838,7 @@ if (multibyte) { if ((MOST_POSITIVE_FIXNUM - nchars) / 3 < byte8_count - || (MOST_POSITIVE_FIXNUM - nbytes) / 2 < byte8_count) + || (STRING_BYTES_MAX - nbytes) / 2 < byte8_count) string_overflow (); /* Convert 2-byte sequence of byte8 chars to 4-byte octal. */ @@ -847,7 +847,7 @@ } else { - if ((MOST_POSITIVE_FIXNUM - nchars) / 3 < byte8_count) + if ((STRING_BYTES_MAX - nchars) / 3 < byte8_count) string_overflow (); /* Convert 1-byte sequence of byte8 chars to 4-byte octal. */ === modified file 'src/coding.c' --- src/coding.c 2011-05-30 01:12:12 +0000 +++ src/coding.c 2011-06-04 02:49:51 +0000 @@ -1071,8 +1071,8 @@ static void coding_alloc_by_realloc (struct coding_system *coding, EMACS_INT bytes) { - if (coding->dst_bytes >= MOST_POSITIVE_FIXNUM - bytes) - error ("Maximum size of buffer or string exceeded"); + if (STRING_BYTES_MAX - coding->dst_bytes < bytes) + string_overflow (); coding->destination = (unsigned char *) xrealloc (coding->destination, coding->dst_bytes + bytes); coding->dst_bytes += bytes; === modified file 'src/doprnt.c' --- src/doprnt.c 2011-04-30 20:05:43 +0000 +++ src/doprnt.c 2011-06-04 02:49:51 +0000 @@ -329,7 +329,7 @@ minlen = atoi (&fmtcpy[1]); string = va_arg (ap, char *); tem = strlen (string); - if (tem > MOST_POSITIVE_FIXNUM) + if (tem > STRING_BYTES_MAX) error ("String for %%s or %%S format is too long"); width = strwidth (string, tem); goto doit1; @@ -338,7 +338,7 @@ doit: /* Coming here means STRING contains ASCII only. */ tem = strlen (string); - if (tem > MOST_POSITIVE_FIXNUM) + if (tem > STRING_BYTES_MAX) error ("Format width or precision too large"); width = tem; doit1: === modified file 'src/editfns.c' --- src/editfns.c 2011-06-03 18:14:49 +0000 +++ src/editfns.c 2011-06-04 02:49:51 +0000 @@ -2341,7 +2341,7 @@ len = CHAR_STRING (XFASTINT (character), str); else str[0] = XFASTINT (character), len = 1; - if (MOST_POSITIVE_FIXNUM / len < XINT (count)) + if (BUF_BYTES_MAX / len < XINT (count)) error ("Maximum buffer size would be exceeded"); n = XINT (count) * len; if (n <= 0) @@ -3588,7 +3588,7 @@ char initial_buffer[4000]; char *buf = initial_buffer; EMACS_INT bufsize = sizeof initial_buffer; - EMACS_INT max_bufsize = min (MOST_POSITIVE_FIXNUM + 1, SIZE_MAX); + EMACS_INT max_bufsize = STRING_BYTES_MAX + 1; char *p; Lisp_Object buf_save_value IF_LINT (= {0}); register char *format, *end, *format_start; === modified file 'src/eval.c' --- src/eval.c 2011-05-30 05:39:59 +0000 +++ src/eval.c 2011-06-04 02:49:51 +0000 @@ -1994,7 +1994,7 @@ { char buf[4000]; size_t size = sizeof buf; - size_t size_max = min (MOST_POSITIVE_FIXNUM + 1, SIZE_MAX); + size_t size_max = STRING_BYTES_MAX + 1; size_t mlen = strlen (m); char *buffer = buf; size_t used; === modified file 'src/fileio.c' --- src/fileio.c 2011-06-03 19:02:25 +0000 +++ src/fileio.c 2011-06-04 02:49:51 +0000 @@ -3248,7 +3248,7 @@ /* Check whether the size is too large or negative, which can happen on a platform that allows file sizes greater than the maximum off_t value. */ if (! not_regular - && ! (0 <= st.st_size && st.st_size <= MOST_POSITIVE_FIXNUM)) + && ! (0 <= st.st_size && st.st_size <= BUF_BYTES_MAX)) error ("Maximum buffer size exceeded"); /* Prevent redisplay optimizations. */ === modified file 'src/lisp.h' --- src/lisp.h 2011-06-03 19:02:25 +0000 +++ src/lisp.h 2011-06-04 02:49:51 +0000 @@ -766,6 +766,12 @@ #endif /* not GC_CHECK_STRING_BYTES */ +/* A string cannot contain more bytes than a fixnum can represent, + nor can it be so long that C pointer arithmetic stops working on + the string plus a terminating null. */ +#define STRING_BYTES_MAX \ + min (MOST_POSITIVE_FIXNUM, min (SIZE_MAX, PTRDIFF_MAX) - 1) + /* Mark STR as a unibyte string. */ #define STRING_SET_UNIBYTE(STR) \ do { if (EQ (STR, empty_multibyte_string)) \