From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail From: Mark H Weaver Newsgroups: gmane.lisp.guile.devel Subject: Immediate doubles (up to 2^256) and rationals coming to Guile 3 Date: Thu, 06 Jun 2019 05:40:39 -0400 Message-ID: <87zhmvaw5p.fsf@netris.org> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="41083"; mail-complaints-to="usenet@blaine.gmane.org" To: guile-devel@gnu.org Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Thu Jun 06 11:43:21 2019 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([209.51.188.17]) by blaine.gmane.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1hYovY-000AYi-GX for guile-devel@m.gmane.org; Thu, 06 Jun 2019 11:43:20 +0200 Original-Received: from localhost ([127.0.0.1]:57367 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hYovX-0007OR-Gn for guile-devel@m.gmane.org; Thu, 06 Jun 2019 05:43:19 -0400 Original-Received: from eggs.gnu.org ([209.51.188.92]:42751) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hYovO-0007NS-DE for guile-devel@gnu.org; Thu, 06 Jun 2019 05:43:13 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hYovN-0006nJ-4d for guile-devel@gnu.org; Thu, 06 Jun 2019 05:43:10 -0400 Original-Received: from world.peace.net ([64.112.178.59]:60314) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hYovN-0006c7-11 for guile-devel@gnu.org; Thu, 06 Jun 2019 05:43:09 -0400 Original-Received: from mhw by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1hYovB-00068J-Ds; Thu, 06 Jun 2019 05:42:57 -0400 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 64.112.178.59 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: "guile-devel" Xref: news.gmane.org gmane.lisp.guile.devel:19943 Archived-At: I've found a way to efficiently support both immediate IEEE binary-64 doubles up to ~1.158e77 (with larger ones transparently allocated on the heap), and also immediate exact rationals with up to 54 binary digits (~16 decimal digits), without restricting the 64-bit pointer space at all, and without any loss of arithmetic precision. I have a working draft implementation that roughly doubles the speed of a simple "substract 1.0 until negative" loop for inexact reals less than 2^256, compared with current 'master' (near 2.9.2). The same loop for exact rationals runs in ~70% of the time compared with current master. More importantly, no heap allocation is required when working with these immediate values. More precisely, large finite doubles with magnitude >= 2^256 (1.157920892373162e77) must be heap allocated. All other doubles, including all Inf/NaNs, are immediate floats, or "iflos". I call this approach "high-exponent boxing", because instead of using the space of ~2^53 NaNs for non-flonum purposes, I use the space of finite doubles with binary exponent larger than 255. That's 3/8ths of the total space of IEEE doubles. This gives us a total of (3 * 2^61) SCM values to use for other things. To convert a double into an SCM, just add (2^60 + 2^52) to its unsigned 64-bit representation, and rotate left 4 bits. This maps the largest finite doubles to SCM values with 000, 110, or 111 in the low 3 bits. * * * I've also implemented immediate exact rationals, or "fixrats". The precise rule is that on a 64-bit system, an exact non-integer rational is immediate if and only if it can be written with 54 binary digits or less. In terms of decimal digits, any rational that can be written with 16 decimal digits is immediate, and some that require 17 or 18 decimal digits are immediate. Here's the format of fixrats on 64-bit systems: Srrrrrrxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0111 |\____/\___________________________________________________/\__/ | | | | | rank (6 bits) data (53 bits) tag sign The 53-bit data field contains both the numerator and denominator, with the denominator in the highest bits. The most significant bit of the denominator is implicit, i.e. not explicitly represented. That's why we can represent up to 54 binary digits in a 53-bit field. The remaining bits of the denominator occupy the most significant data bits, and the magnitude of the numerator occupy the least significant data bits. The 6-bit rank is 2 less than the integer length of the denominator, i.e. 1 less than the number of data bits allocated to the denominator. I chose this representation because it allows us to leverage existing floating-point hardware to efficiently pack fixrats. Simply convert the denominator to an IEEE double, and now the rank will be in the low bits of the IEEE exponent field, immediately adjacent to the denominator with its high bit removed. This simplifies the packing operation. There's also a nice way to extract the denominator from a fixrat: mask out the sign bit, shift right 5 bits, and interpret it as an IEEE double. The denominator will be the integer part of the resulting value, with the numerator in the fraction bits. Simply cast this double to an integer to discard the numerator bits. * * * Here are the tags used in my draft implementation: ;; /* with iflos: xxx: iflo (000 < xxx < 110) ;; (64-bit) 1111: fixnum ;; 0111: fixrat ;; ;; 000: heap object ;; tttt0110: immediate non-number ;; 1110: [NOT_SCM] ;; 11110: [NOT_SCM] struct tag ;; ttttt101110: [NOT_SCM] non-pair non-struct non-smob tag ;; ttttttttx1001110: [NOT_SCM] smob and here's my plan for 32-bit systems, not yet implemented: ;; without iflos: 1: fixnum ;; (32-bit) 10: fixrat ;; ;; 000: heap object ;; tttt0100: immediate non-number ;; 1100: [NOT_SCM] ;; 11100: [NOT_SCM] struct tag ;; ttttt101100: [NOT_SCM] non-pair non-struct non-smob tag ;; ttttttttx1001100: [NOT_SCM] smob */ I'll have more to say about all of this in future messages, and of course I welcome your comments and suggestions. For now, I've pushed my preliminary work to the 'wip-new-tagging' branch on Savannah. https://git.savannah.gnu.org/cgit/guile.git/log/?h=wip-new-tagging Give it a try and let me know what you think. Mark