From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Making --with-wide-int the default
Date: Fri, 16 Oct 2015 12:27:04 +0200
Message-ID: <87d1wft8g7.fsf@fencepost.gnu.org>
In-Reply-To: <831tcv6s6f.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 16 Oct 2015 13:09:28 +0300")
To: Eli Zaretskii
Cc: lekktu@gmail.com, eggert@cs.ucla.edu, emacs-devel@gnu.org

Eli Zaretskii writes:

>> From: David Kastrup
>> Cc: eggert@cs.ucla.edu, lekktu@gmail.com, emacs-devel@gnu.org
>> Date: Fri, 16 Oct 2015 11:18:31 +0200
>>
>> Eli Zaretskii writes:
>>
>> >> From: David Kastrup
>> >> Cc: Eli Zaretskii, Juanma Barranquero, emacs-devel@gnu.org
>> >> Date: Fri, 16 Oct 2015 10:18:26 +0200
>> >>
>> >> Instead of going to 64-bit unilaterally it would seem to make more sense
>> >> to me to degrade gracefully into gmp.
>> >
>> > How can GMP help extend the maximum size of buffers and strings beyond
>> > what a 32-bit EMACS_INT allows?
>>
>> By choosing an appropriate data type for representing buffer/string
>> sizes in C and converting back-and-forth from the Lisp type as needed.
>> Pretty much the same way we do it now.
>
> Sorry, I don't follow: what "appropriate data type"?  Would that be
> 'long long'?
> If so, that's exactly what we do now, which you say is
> "going to 64-bit unilaterally".  What am I missing?

A change in the size of the Lisp data type.

>> I think it would be a reasonable restriction to keep to 2GB size of
>> strings and buffers when working with a 32bit executable.  That's
>> what people expect on a 32-bit architecture.
>
> That's what we have now in 32-bit builds --with-wide-int.  So I'm not
> sure why you mention that as some kind of change related to this
> discussion.

Isn't that also changing the size of a Lisp cell?  And of integer
arithmetic?

>> When you are editing gigabyte files, at some point of time, the Lisp
>> representation of the respective offsets in the high part of the
>> buffer will become the responsibility of GMP, yes.  I'm not worried
>> about that.
>
> I don't understand.  C doesn't have dynamic types.

But Lisp does.

> If the variable that holds buffer positions needs to support 61-bit
> offsets, it will have to be a 64-bit integral data type from the
> get-go.

I repeat: we are talking about a 32-bit binary where restricting buffer
and string size to 32bit offsets would be reasonable and expected.  But
even if we assume that we want to support some larger size in the C
parts, that's just the matter of choosing a different type and a
different conversion from Lisp to C.
GUILE has all of

 -- C Function: char scm_to_char (SCM x)
 -- C Function: signed char scm_to_schar (SCM x)
 -- C Function: unsigned char scm_to_uchar (SCM x)
 -- C Function: short scm_to_short (SCM x)
 -- C Function: unsigned short scm_to_ushort (SCM x)
 -- C Function: int scm_to_int (SCM x)
 -- C Function: unsigned int scm_to_uint (SCM x)
 -- C Function: long scm_to_long (SCM x)
 -- C Function: unsigned long scm_to_ulong (SCM x)
 -- C Function: long long scm_to_long_long (SCM x)
 -- C Function: unsigned long long scm_to_ulong_long (SCM x)
 -- C Function: size_t scm_to_size_t (SCM x)
 -- C Function: ssize_t scm_to_ssize_t (SCM x)
 -- C Function: scm_t_ptrdiff scm_to_ptrdiff_t (SCM x)
 -- C Function: scm_t_int8 scm_to_int8 (SCM x)
 -- C Function: scm_t_uint8 scm_to_uint8 (SCM x)
 -- C Function: scm_t_int16 scm_to_int16 (SCM x)
 -- C Function: scm_t_uint16 scm_to_uint16 (SCM x)
 -- C Function: scm_t_int32 scm_to_int32 (SCM x)
 -- C Function: scm_t_uint32 scm_to_uint32 (SCM x)
 -- C Function: scm_t_int64 scm_to_int64 (SCM x)
 -- C Function: scm_t_uint64 scm_to_uint64 (SCM x)
 -- C Function: scm_t_intmax scm_to_intmax (SCM x)
 -- C Function: scm_t_uintmax scm_to_uintmax (SCM x)
     When X represents an exact integer that fits into the indicated C
     type, return that integer.  Else signal an error, either a
     ‘wrong-type’ error when X is not an exact integer, or an
     ‘out-of-range’ error when it doesn’t fit the given range.

     The functions ‘scm_to_long_long’, ‘scm_to_ulong_long’,
     ‘scm_to_int64’, and ‘scm_to_uint64’ are only available when the
     corresponding types are.

Pick a C type of your choosing for dealing with buffer offsets, create
aliases for its conversions, and you are good to go.  The Lisp side
supports arbitrary sizes, and the actual C-side limits (rather than
something like 26 bits or 29 bits or 30 bits or 61 bits) are established
when converting from Lisp types.
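
To make the "limits are established when converting" point concrete,
here is a rough sketch in C.  It is not Emacs or GUILE code: the names
lisp_int, buf_offset, lisp_to_int32 and lisp_to_buf_offset are made up
for illustration, int32_t is merely assumed as the offset type for a
32-bit binary, and intmax_t stands in for what would really be a fixnum
or a GMP integer.

```c
#include <stdint.h>

/* Hypothetical stand-in for a Lisp integer.  A real implementation
   would hold a tagged fixnum or a GMP mpz_t; intmax_t only plays the
   role of "wider than the C-side offset type" here.  */
typedef struct { intmax_t value; } lisp_int;

typedef enum { CONV_OK, CONV_OUT_OF_RANGE } conv_status;

/* Convert to int32_t, reporting out-of-range instead of truncating.
   This mirrors the scm_to_int32 contract quoted above, except that it
   returns a status code rather than signalling a Lisp error.  */
static conv_status
lisp_to_int32 (lisp_int x, int32_t *out)
{
  if (x.value < INT32_MIN || x.value > INT32_MAX)
    return CONV_OUT_OF_RANGE;
  *out = (int32_t) x.value;
  return CONV_OK;
}

/* The "alias": the C type assumed for buffer offsets, and its
   conversion.  Negative offsets are rejected as well.  */
typedef int32_t buf_offset;

static conv_status
lisp_to_buf_offset (lisp_int x, buf_offset *out)
{
  int32_t tmp;
  if (lisp_to_int32 (x, &tmp) != CONV_OK || tmp < 0)
    return CONV_OUT_OF_RANGE;
  *out = tmp;
  return CONV_OK;
}
```

Switching to a wider offset type would mean changing the typedef and the
conversion it wraps; the Lisp-side representation stays untouched.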
> And having 61-bit integers for integer arithmetics is also a valuable
> feature.

61-bit is some arbitrary junk number.  Transparently degrading from
29 bits to gmp numbers means that there are no arbitrary limits (or at
least you are unlikely to hit them) and the performance for the vast
majority of cases will be 29-bit performance.

> So the EMACS_INT type will have to be able to support that.

I don't see why when one can use GMP (which effectively uses the same
kind of arithmetic for 61-bit numbers as C does but does not stop there).

-- 
David Kastrup
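
A rough sketch of what "transparently degrading" from fixnums to bignums
could look like for addition.  Everything here is an assumption made for
illustration: the names lisp_num, make_num and lisp_add are invented,
the 29-bit fixnum width is taken from the figure used above, and
intmax_t stands in for the bignum so that the sketch needs no library.
A real implementation would call something like GMP's mpz_add on the
slow path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed fixnum width: 29 bits including the sign bit.  */
#define FIXNUM_BITS 29
#define FIXNUM_MAX ((INT32_C (1) << (FIXNUM_BITS - 1)) - 1)
#define FIXNUM_MIN (-FIXNUM_MAX - 1)

/* Hypothetical Lisp number: either a fixnum or a "bignum".  */
typedef struct
{
  bool is_big;     /* true when the value no longer fits a fixnum */
  intmax_t value;  /* stand-in for the fixnum payload or an mpz_t */
} lisp_num;

/* Build a number, choosing the representation from its magnitude.  */
static lisp_num
make_num (intmax_t v)
{
  lisp_num n = { v < FIXNUM_MIN || v > FIXNUM_MAX, v };
  return n;
}

static lisp_num
lisp_add (lisp_num a, lisp_num b)
{
  if (!a.is_big && !b.is_big)
    {
      /* Fast path: the sum of two 29-bit values always fits in
         int32_t, so plain machine arithmetic is safe here.  */
      int32_t sum = (int32_t) a.value + (int32_t) b.value;
      if (sum >= FIXNUM_MIN && sum <= FIXNUM_MAX)
        return (lisp_num) { false, sum };
    }
  /* Slow path: this is where a bignum library would take over.  The
     intmax_t stand-in can itself overflow, which GMP would not.  */
  return make_num (a.value + b.value);
}
```

The common case stays on the fast path, and only results that leave the
fixnum range pay for the wider representation.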