From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#8794: cons_to_long fixes; making 64-bit EMACS_INT the default
Date: Fri, 03 Jun 2011 22:43:55 +0300
Message-ID: <83hb86em4k.fsf@gnu.org>
References: <4DE89EB8.9020202@cs.ucla.edu> <83oc2fdw59.fsf@gnu.org> <4DE91FB3.80601@cs.ucla.edu>
In-reply-to: <4DE91FB3.80601@cs.ucla.edu>
Reply-To: Eli Zaretskii
To: Paul Eggert
Cc: 8794@debbugs.gnu.org
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref:
 news.gmane.org gmane.emacs.bugs:46932
Archived-At:

> Date: Fri, 03 Jun 2011 10:53:55 -0700
> From: Paul Eggert
> CC: 8794@debbugs.gnu.org
>
>    int
>    main (void)
>    {
>      int big = 536870913;
>      int *p = malloc (big * sizeof *p);
>      if (!p)
>        return 1;
>      memset (p, 0xef, big * sizeof *p);
>      printf ("%x %x\n", p[0], p[big - 1]);
>      return 0;
>    }
>
> On my RHEL 5.6 host, built as a 32-bit executable, this outputs:
>
>    $ gcc -m32 t.c
>    $ ./a.out
>    efefefef efefefef

How does this work on the machine code level?  Doesn't the code need
to load a pointer to p into a 32-bit register, in order to reference
the array?  On Windows, I see that the GCC-produced code does this:

        movl   $0x20000001,0xfffffffc(%ebp)
        ...
        mov    0xfffffffc(%ebp),%eax
        shl    $0x2,%eax

and then uses EAX to reference the array elements.  That last left
shift by 2 bits will surely overflow for values of `big' that are
larger than 0x3fffffff (not 0x20000001, the value you used).  So maybe
2GB is not the limit, but 4GB surely is.  You promise much more.

> Perhaps you're thinking of pointer subtraction?  That often stops working on
> arrays larger than 2 GiB.  But this is easy to program around.

Well, then we need to program around that, _before_ we promise buffers
larger than 2GB on 32-bit hosts.  E.g., look how we address characters
in buffers:

  /* Address of beginning of buffer.  */
  #define BUF_BEG_ADDR(buf) ((buf)->text->beg)

  /* Return character code of multi-byte form at byte position POS in BUF.
     If POS doesn't point the head of valid multi-byte form, only the byte at
     POS is returned.  No range checking.  */

  #define BUF_FETCH_MULTIBYTE_CHAR(buf, pos)                            \
    (_fetch_multibyte_char_p                                            \
       = (((pos) >= BUF_GPT_BYTE (buf) ? BUF_GAP_SIZE (buf) : 0)        \
          + (pos) + BUF_BEG_ADDR (buf) - BEG_BYTE),                     \
     STRING_CHAR (_fetch_multibyte_char_p))

The pointer arithmetic will wrap around on 32-bit hosts here, because
a pointer is loaded into a 32-bit register before it's dereferenced.
Am I missing something?
> And anyway, even if we assume buffers and strings are all smaller
> than 2 GiB, an EMACS_INT wider than 32 bits is still needed for
> large buffers and strings, due to the tag bits.

I wasn't saying a 64-bit EMACS_INT wasn't an advantage.  It is.  But I
very much doubt that we could have buffers and strings larger than 4GB
on 32-bit hosts.  Your changes to the docs seem to promise much larger
buffers, which I don't think is feasible.

> > The *_MAX macros need limits.h, but I don't see it being included by
> > data.c.  Did I miss something?
>
> Those are OK because lisp.h includes inttypes.h.  INTMAX_MAX and
> UINTMAX_MAX are defined by inttypes.h (actually, stdint.h, but
> inttypes.h includes stdint.h).

What about ULONG_MAX in this patch to xselect.c:

> -      *data_ret = (unsigned char *) xmalloc (sizeof (long) + 1);
> -      (*data_ret) [sizeof (long)] = 0;
> -      (*(unsigned long **) data_ret) [0] = cons_to_long (obj);
> +      *data_ret = (unsigned char *) xmalloc (sizeof (unsigned long) + 1);
> +      (*data_ret) [sizeof (unsigned long)] = 0;
> +      (*(unsigned long **) data_ret) [0] = cons_to_unsigned (obj, ULONG_MAX);

?  There are also USHRT_MAX, LONG_MAX, CHAR_MAX, and SHRT_MAX there,
but I see no limits.h being included.  How did that compile for you?