From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Making --with-wide-int the default
Date: Fri, 16 Oct 2015 18:28:20 +0200
Message-ID: <876126srq3.fsf@fencepost.gnu.org>
References: <56117F37.9060808@dancol.org> <83oag087gs.fsf@gnu.org>
 <83oafz70im.fsf@gnu.org> <5620AF43.4050401@cs.ucla.edu>
 <8737xbusz1.fsf@fencepost.gnu.org> <83d1wf6v47.fsf@gnu.org>
 <87pp0ftbmg.fsf@fencepost.gnu.org> <831tcv6s6f.fsf@gnu.org>
 <87d1wft8g7.fsf@fencepost.gnu.org> <83vba754rq.fsf@gnu.org>
 <87zizisyg2.fsf@fencepost.gnu.org> <83io666bv3.fsf@gnu.org>
In-Reply-To: <83io666bv3.fsf@gnu.org> (Eli Zaretskii's message of
 "Fri, 16 Oct 2015 19:01:52 +0300")
To: Eli Zaretskii
Cc: lekktu@gmail.com, eggert@cs.ucla.edu, emacs-devel@gnu.org

Eli Zaretskii writes:

>> From: David Kastrup
>> Cc: eggert@cs.ucla.edu, lekktu@gmail.com, emacs-devel@gnu.org
>> Date: Fri, 16 Oct 2015 16:03:09 +0200
>>
>> >> Isn't that also changing the size of a Lisp cell? And of integer
>> >> arithmetic?
>> >
>> > Part of it, yes. But since a Lisp cell can hold buffer or string
>> > position, what else can you expect?
>>
>> That buffer or string positions not representable in the 29 bits or so
>> available for integers in a Lisp cell are instead represented using a
>> GMP number for which there is a reference in the Lisp cell.
>
> But a GMP number is nothing more or less than some C data type. It's
> not some magic wand that can solve problems just by being there.
>
>> How do you think Emacs managed 64-bit doubles "inside" of a 32-bit
>> integral type used for representing Lisp cells?
>
> So you are suggesting to make a Lisp integer represented like a Lisp
> float, i.e. accessible via a pointer?

For values not fitting into 29 bits, yes.

> But that in itself will slow down integer arithmetics, due to the need
> to dereference the pointer, won't it?

For values not fitting into 29 bits, yes.

>> GMP needs to be called only when leaving the range of "small
>> integers" (which is all we even have right now). 64-bit arithmetic
>> in your plan would be required for every single operation. Yes, when
>> GMP kicks in, it will be slower than operations using exactly 64-bit
>> (not more, not less). But it's the exception rather than the rule.
>
> But there has to be a test for when the exception happens, and that
> test is going to exert its price on every operation, whether it
> succeeds or fails the test. I'm not at all sure it will be a net win.

With regard to memory requirements and processing speed for values
fitting easily into 32 bits (including booleans, indirect types,
symbols, various other stuff), it will be a net win. Not everything in
Lisp is an integer.

>> So much the exception that we could entirely make do without it so
>> far. It _will_ occur frequently when editing files larger than 1GB
>> or so. But only in the _Lisp_ representations of those buffer
>> positions. Everything implemented in C will use the integral data
>> type we choose for that, throwing an error when it gets exceeded.
>
> So you are suggesting to give up the transparent exposure of members
> of Lisp objects that are integers?

I have no idea what you are talking about. Integers and Lisp cells are
not the same. Treating them as the same is a bug. Just ask Stefan. So
I have no idea what you call "transparent exposure" here.

>> Yes, not all of the possible offsets will be representable in one
>> Lisp cell.
>
> So you are suggesting that a buffer position will not always be a
> simple number, but will sometimes be a cons cell?

No. It will always be a Lisp integer, but Lisp integers will not always
be stored in a single cell. But GMP does not store its numbers in cons
cells.
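
To make the scheme above concrete, here is a rough sketch in Python,
not Emacs code. It assumes a 29-bit signed payload for in-cell
integers (the thread only says "29 bits or so"). The names
FIXNUM_BITS, Boxed, make_integer, to_int and add are made up for
illustration, and a Python int stands in for the GMP number that a
boxed value would reference.

```python
# Assumption: 29 bits of signed payload fit directly in a Lisp cell.
FIXNUM_BITS = 29
FIXNUM_MIN = -(1 << (FIXNUM_BITS - 1))
FIXNUM_MAX = (1 << (FIXNUM_BITS - 1)) - 1


class Boxed:
    """Hypothetical stand-in for a heap object holding a big integer."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


def make_integer(n):
    """Keep n in the cell if it fits, else return a reference to a box."""
    if FIXNUM_MIN <= n <= FIXNUM_MAX:
        return n
    return Boxed(n)


def to_int(x):
    """Dereference only when the value is actually boxed."""
    return x.value if isinstance(x, Boxed) else x


def add(a, b):
    """Add two Lisp integers; the range check runs on every operation."""
    return make_integer(to_int(a) + to_int(b))
```

Small results stay plain values, and only results outside the range
get boxed. The range check in make_integer is the per-operation test
being argued about above.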

Really, I have no idea what your problem here is. Smooth degradation to
fit rare large values compactly is not a particularly uncommon concept.
Emacs uses it, for example, in UTF-8 representations of 8-bit encodings.
It represents character codes 0-127 as-is, then switches to a multi-byte
representation for 128-255. Why don't we use 32-bit characters from the
get-go? Because they are rarely needed and we don't want to always pay
the price in memory for the rarely needed cases.

Doubling the size of Lisp cells does not just (roughly) double the size
of all integers.

-- 
David Kastrup
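
The UTF-8 comparison can be checked with a few lines of Python. This
is only an illustration of the encoding sizes, not of how Emacs stores
text internally, and utf8_len is a made-up helper name.

```python
def utf8_len(code_point):
    """Number of bytes UTF-8 needs for a single code point."""
    return len(chr(code_point).encode("utf-8"))


# Codes 0-127 take one byte each; 128-255 take two.
for cp in (0x41, 0x7F, 0x80, 0xFF):
    print(hex(cp), utf8_len(cp))

# A fixed 32-bit encoding pays four bytes even for plain ASCII.
print(len("A".encode("utf-32-le")))
```

The common case stays compact, and only the rarer values pay for the
extra bytes.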