From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Mark H Weaver <mhw@netris.org>
Newsgroups: gmane.lisp.guile.bugs
Subject: bug#18295: Radix points in non-decimal numbers
Date: Wed, 01 Oct 2014 01:19:39 -0400
Message-ID: <87y4t08iz8.fsf@yeeloong.lan>
References: <87egwc6eaa.fsf@kagami.home>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Trace: ger.gmane.org 1412140887 3512 80.91.229.3 (1 Oct 2014 05:21:27 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Wed, 1 Oct 2014 05:21:27 +0000 (UTC)
Cc: 18295@debbugs.gnu.org
To: Ian Price <ianprice90@googlemail.com>
Original-X-From: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org Wed Oct 01 07:21:20 2014
Return-path: <bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org>
Envelope-to: guile-bugs@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org>)
	id 1XZCLo-0007yg-18
	for guile-bugs@m.gmane.org; Wed, 01 Oct 2014 07:21:20 +0200
Original-Received: from localhost ([::1]:47808 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org>)
	id 1XZCLn-0004if-HG
	for guile-bugs@m.gmane.org; Wed, 01 Oct 2014 01:21:19 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:51207)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1XZCLf-0004hU-GR
	for bug-guile@gnu.org; Wed, 01 Oct 2014 01:21:16 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1XZCLX-0003Qz-3l
	for bug-guile@gnu.org; Wed, 01 Oct 2014 01:21:11 -0400
Original-Received: from debbugs.gnu.org ([140.186.70.43]:37005)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1XZCLW-0003Qv-Vj
	for bug-guile@gnu.org; Wed, 01 Oct 2014 01:21:03 -0400
Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.80)
	(envelope-from <Debian-debbugs@debbugs.gnu.org>) id 1XZCLW-0003H7-2R
	for bug-guile@gnu.org; Wed, 01 Oct 2014 01:21:02 -0400
X-Loop: help-debbugs@gnu.org
Resent-From: Mark H Weaver <mhw@netris.org>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces@debbugs.gnu.org>
Resent-CC: bug-guile@gnu.org
Resent-Date: Wed, 01 Oct 2014 05:21:01 +0000
Resent-Message-ID: <handler.18295.B18295.141214081212518@debbugs.gnu.org>
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 18295
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
Original-Received: via spool by 18295-submit@debbugs.gnu.org id=B18295.141214081212518
	(code B ref 18295); Wed, 01 Oct 2014 05:21:01 +0000
Original-Received: (at 18295) by debbugs.gnu.org; 1 Oct 2014 05:20:12 +0000
Original-Received: from localhost ([127.0.0.1]:56802 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces@debbugs.gnu.org>)
	id 1XZCKh-0003Fm-1R
	for submit@debbugs.gnu.org; Wed, 01 Oct 2014 01:20:12 -0400
Original-Received: from world.peace.net ([96.39.62.75]:34902)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <mhw@netris.org>) id 1XZCKZ-0003FY-5j
	for 18295@debbugs.gnu.org; Wed, 01 Oct 2014 01:20:05 -0400
Original-Received: from c-24-62-95-23.hsd1.ma.comcast.net ([24.62.95.23]
	helo=yeeloong.lan)
	by world.peace.net with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16)
	(Exim 4.72) (envelope-from <mhw@netris.org>)
	id 1XZCKM-00007G-K6; Wed, 01 Oct 2014 01:19:50 -0400
In-Reply-To: <87egwc6eaa.fsf@kagami.home> (Ian Price's message of "Tue, 19 Aug
	2014 10:04:45 +0100")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 140.186.70.43
X-BeenThere: bug-guile@gnu.org
List-Id: "Bug reports for GUILE,
	GNU's Ubiquitous Extension Language" <bug-guile.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/bug-guile>,
	<mailto:bug-guile-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/bug-guile>
List-Post: <mailto:bug-guile@gnu.org>
List-Help: <mailto:bug-guile-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/bug-guile>,
	<mailto:bug-guile-request@gnu.org?subject=subscribe>
Errors-To: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org
Original-Sender: bug-guile-bounces+guile-bugs=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.lisp.guile.bugs:7597
Archived-At: <http://permalink.gmane.org/gmane.lisp.guile.bugs/7597>

--=-=-=
Content-Type: text/plain

Hi Ian,

Ian Price <ianprice90@googlemail.com> writes:
> Occasionally it is handy to use a radix point in bases other than
> decimal. i.e. 1.1 in binary is 1.5 decimal.

I agree that this would be nice.
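
To make the arithmetic concrete, here is a minimal standalone sketch of
reading digits after a radix point in an arbitrary base.  It is not the
code in the attached patch: it uses plain doubles instead of exact
numbers, assumes ASCII input, and the names digit_value and
parse_radix_real are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): map an ASCII character to
   its digit value, or return 36, which is invalid for every radix <= 36. */
static unsigned int
digit_value (char c)
{
  if (c >= '0' && c <= '9')
    return (unsigned int) (c - '0');
  if (c >= 'a' && c <= 'z')
    return (unsigned int) (c - 'a') + 10U;
  if (c >= 'A' && c <= 'Z')
    return (unsigned int) (c - 'A') + 10U;
  return 36U;
}

/* Parse "<digits>[.<digits>]" in the given radix into *out.
   Returns 0 on success, -1 if the string is malformed. */
static int
parse_radix_real (const char *s, unsigned int radix, double *out)
{
  double value = 0.0;   /* all digits accumulated as one integer */
  double scale = 1.0;   /* radix^(number of digits after the point) */
  int seen_point = 0;
  size_t ndigits = 0;

  assert (radix >= 2 && radix <= 36);

  for (; *s != '\0'; s++)
    {
      unsigned int d;

      if (*s == '.' && !seen_point)
        {
          seen_point = 1;
          continue;
        }
      d = digit_value (*s);
      if (d >= radix)           /* non-digit or out-of-range digit */
        return -1;
      value = value * radix + d;
      if (seen_point)
        scale *= radix;
      ndigits++;
    }
  if (ndigits == 0)
    return -1;
  *out = value / scale;
  return 0;
}
```

With this sketch, "1.1" in radix 2 accumulates the integer 3 with a
scale of 2, giving 1.5, which matches your example.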

Here's a preliminary patch to implement it, against stable-2.0.  It
works, but is not yet ready to push.  It needs tests, but more
importantly I'm undecided on how best to limit the exponents.  The
numbers are represented exactly until just before returning from
string->number, so a very large exponent could exhaust the available
memory.  Of course, a long integer literal can already exhaust the
memory, but that's less likely to happen by accident when exponents are
not involved.  Also, if the result _is_ converted to inexact at the end
(which happens unless #e is given), a number with a large exponent will
become infinite or zero, which is not as nice as an error message,
although I think we already fail to signal an error in some such cases.
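
One possible shape for a radix-aware limit, as a rough sketch only: find
the largest exponent for which radix^exponent is still a finite double,
and reject anything above it.  The helper names below are hypothetical
and nothing like this is in the attached patch.  It covers only positive
exponents; the negative side (underflow to zero) would need a similar
bound.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: largest E such that RADIX^E is a finite double.
   Found by repeated multiplication, so it needs no logarithms. */
static int
max_finite_exponent (unsigned int radix)
{
  int e = 0;
  double p = 1.0;

  assert (radix >= 2 && radix <= 36);

  while (isfinite (p * (double) radix))
    {
      p *= (double) radix;
      e++;
    }
  return e;
}

/* Hypothetical check: nonzero if a positive EXPONENT is small enough
   that RADIX^EXPONENT does not overflow to infinity. */
static int
exponent_in_range (unsigned int radix, int exponent)
{
  return exponent <= max_finite_exponent (radix);
}
```

For radix 10 this gives 308, which is the usual DBL_MAX_10_EXP on IEEE
double platforms, so the decimal behaviour would stay where it is today
if the existing limit is based on that value (I have not checked this
against every platform).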

One easy solution would be to prohibit exponents unless radix == 10.
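
A side benefit of that restriction: with the digit mapping a=10, b=11,
and so on, the letter 'e' is itself a digit (14) in any radix of 15 or
more, so an exponent marker would be ambiguous there anyway.  A minimal
sketch of the policy, using a hypothetical helper name and handling only
'e' for brevity (R5RS also lists s, f, d and l as exponent markers):

```c
#include <assert.h>

/* Hypothetical policy helper: should C be treated as an exponent
   marker when reading a number in the given RADIX? */
static int
is_exponent_marker (char c, unsigned int radix)
{
  assert (radix >= 2 && radix <= 36);

  if (radix != 10)
    return 0;                   /* exponents prohibited outside base 10 */
  return c == 'e' || c == 'E';
}
```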

Thoughts?

If you want to work more on this, we could be co-authors of the commit.
I should probably focus on other things for a while.

     Best,
      Mark


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-PRELIMINARY-string-number-Support-digits-after-point.patch
Content-Description: [PATCH] PRELIMINARY string->number: Support digits
 after point for non-decimals

>From 9ed731c917d4dd9d73258d96e401da5a06bef77a Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw@netris.org>
Date: Mon, 29 Sep 2014 01:26:50 -0400
Subject: [PATCH] PRELIMINARY string->number: Support digits after point for
 non-decimals.

NOTE: This limits the radix to 36.  Previously, we would accept
      bogosities such as:
        (string->number "{" 37) => 36
        (string->number "|" 38) => 37

TODO: Generalize code that places reasonable limits on exponents when
      using non-decimal radices.  (search for XXX in the patch)

TODO: Add tests.

TODO: Split into multiple commits (use scm_t_wchar, avoid the word
      "decimal" except where base 10 is assumed, doc fixes, limit to
      radix <= 36, support digits after point for non-decimals)
---
 doc/ref/api-data.texi         |  6 +--
 libguile/numbers.c            | 94 +++++++++++++++++++++----------------------
 test-suite/tests/numbers.test |  6 +--
 3 files changed, 53 insertions(+), 53 deletions(-)

diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index acdf9ca..e5880d2 100644
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -1,7 +1,7 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007,
-@c   2008, 2009, 2010, 2011, 2012, 2013, 2014  Free Software Foundation, Inc.
+@c Copyright (C) 1996, 1997, 2000-2004, 2006-2014
+@c   Free Software Foundation, Inc.
 @c See the file guile.texi for copying conditions.
 
 @node Simple Data Types
@@ -1098,7 +1098,7 @@ inexact, a radix of 10 will be used.
 @deffnx {C Function} scm_string_to_number (string, radix)
 Return a number of the maximally precise representation
 expressed by the given @var{string}. @var{radix} must be an
-exact integer, either 2, 8, 10, or 16. If supplied, @var{radix}
+exact integer between 2 and 36. If supplied, @var{radix}
 is a default radix that may be overridden by an explicit radix
 prefix in @var{string} (e.g.@: "#o177"). If @var{radix} is not
 supplied, then the default radix is 10. If string is not a
diff --git a/libguile/numbers.c b/libguile/numbers.c
index c197eee..4748b51 100644
--- a/libguile/numbers.c
+++ b/libguile/numbers.c
@@ -1,6 +1,4 @@
-/* Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
- *   2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
- *   2014 Free Software Foundation, Inc.
+/* Copyright (C) 1995-2014 Free Software Foundation, Inc.
  *
  * Portions Copyright 1990, 1991, 1992, 1993 by AT&T Bell Laboratories
  * and Bellcore.  See scm_divide.
@@ -5806,7 +5804,7 @@ enum t_exactness {NO_EXACTNESS, INEXACT, EXACT};
 /* Caller is responsible for checking that the return value is in range
    for the given radix, which should be <= 36. */
 static unsigned int
-char_decimal_value (scm_t_uint32 c)
+char_digit_value (scm_t_wchar c)
 {
   /* uc_decimal_value returns -1 on error. When cast to an unsigned int,
      that's certainly above any valid decimal, so we take advantage of
@@ -5818,8 +5816,8 @@ char_decimal_value (scm_t_uint32 c)
   if (d >= 10U)
     {
       c = uc_tolower (c);
-      if (c >= (scm_t_uint32) 'a')
-        d = c - (scm_t_uint32)'a' + 10U;
+      if (c >= (scm_t_wchar)'a')
+        d = c - (scm_t_wchar)'a' + 10U;
     }
   return d;
 }
@@ -5837,14 +5835,14 @@ mem2uinteger (SCM mem, unsigned int *p_idx,
   scm_t_bits add = 0;
   unsigned int digit_value;
   SCM result;
-  char c;
+  scm_t_wchar c;
   size_t len = scm_i_string_length (mem);
 
   if (idx == len)
     return SCM_BOOL_F;
 
   c = scm_i_string_ref (mem, idx);
-  digit_value = char_decimal_value (c);
+  digit_value = char_digit_value (c);
   if (digit_value >= radix)
     return SCM_BOOL_F;
 
@@ -5862,9 +5860,9 @@ mem2uinteger (SCM mem, unsigned int *p_idx,
         break;
       else
         {
-          digit_value = char_decimal_value (c);
-          /* This check catches non-decimals in addition to out-of-range
-             decimals.  */
+          digit_value = char_digit_value (c);
+          /* This check catches non-digits in addition to out-of-range
+             digits.  */
           if (digit_value >= radix)
 	    break;
 	}
@@ -5899,18 +5897,17 @@ mem2uinteger (SCM mem, unsigned int *p_idx,
 }
 
 
-/* R5RS, section 7.1.1, lexical structure of numbers: <decimal 10>.  Only
- * covers the parts of the rules that start at a potential point.  The value
- * of the digits up to the point have been parsed by the caller and are given
- * in variable result.  The content of *p_exactness indicates, whether a hash
- * has already been seen in the digits before the point.
+/* R5RS, section 7.1.1, lexical structure of numbers: <decimal 10>.  We
+ * generalize this to support radices other than 10.  Only covers the
+ * parts of the rules that start at a potential point.  The value of the
+ * digits up to the point has been parsed by the caller and is given
+ * in variable result.  The content of *p_exactness indicates whether a
+ * hash has already been seen in the digits before the point.
  */
 
-#define DIGIT2UINT(d) (uc_numeric_value(d).numerator)
-
 static SCM
-mem2decimal_from_point (SCM result, SCM mem, 
-			unsigned int *p_idx, enum t_exactness *p_exactness)
+mem2real_from_point (SCM result, SCM mem, unsigned int *p_idx,
+                     unsigned int radix, enum t_exactness *p_exactness)
 {
   unsigned int idx = *p_idx;
   enum t_exactness x = *p_exactness;
@@ -5930,13 +5927,12 @@ mem2decimal_from_point (SCM result, SCM mem,
       while (idx != len)
 	{
 	  scm_t_wchar c = scm_i_string_ref (mem, idx);
-	  if (uc_is_property_decimal_digit ((scm_t_uint32) c))
+          digit_value = char_digit_value (c);
+          if (digit_value < radix)
 	    {
-	      if (x == INEXACT)
-		return SCM_BOOL_F;
-	      else
-		digit_value = DIGIT2UINT (c);
-	    }
+              if (x == INEXACT)
+                return SCM_BOOL_F;
+            }
 	  else if (c == '#')
 	    {
 	      x = INEXACT;
@@ -5946,20 +5942,20 @@ mem2decimal_from_point (SCM result, SCM mem,
 	    break;
 
 	  idx++;
-	  if (SCM_MOST_POSITIVE_FIXNUM / 10 < shift)
+	  if (SCM_MOST_POSITIVE_FIXNUM / radix < shift)
 	    {
 	      big_shift = scm_product (big_shift, SCM_I_MAKINUM (shift));
 	      result = scm_product (result, SCM_I_MAKINUM (shift));
 	      if (add > 0)
 		result = scm_sum (result, SCM_I_MAKINUM (add));
 	      
-	      shift = 10;
+	      shift = radix;
 	      add = digit_value;
 	    }
 	  else
 	    {
-	      shift = shift * 10;
-	      add = add * 10 + digit_value;
+	      shift = shift * radix;
+	      add = add * radix + digit_value;
 	    }
 	};
 
@@ -5981,6 +5977,7 @@ mem2decimal_from_point (SCM result, SCM mem,
       int sign = 1;
       unsigned int start;
       scm_t_wchar c;
+      unsigned int digit_value;
       int exponent;
       SCM e;
 
@@ -6020,24 +6017,29 @@ mem2decimal_from_point (SCM result, SCM mem,
 	  else
 	    sign = 1;
 
-	  if (!uc_is_property_decimal_digit ((scm_t_uint32) c))
+          digit_value = char_digit_value (c);
+	  if (digit_value >= radix)
 	    return SCM_BOOL_F;
 
 	  idx++;
-	  exponent = DIGIT2UINT (c);
+	  exponent = digit_value;
 	  while (idx != len)
 	    {
 	      scm_t_wchar c = scm_i_string_ref (mem, idx);
-	      if (uc_is_property_decimal_digit ((scm_t_uint32) c))
+              digit_value = char_digit_value (c);
+	      if (digit_value < radix)
 		{
 		  idx++;
+                  /* XXX FIXME: This logic is not sufficient for
+                     non-decimal numbers */
 		  if (exponent <= SCM_MAXEXP)
-		    exponent = exponent * 10 + DIGIT2UINT (c);
+		    exponent = exponent * radix + digit_value;
 		}
 	      else
 		break;
 	    }
 
+          /* XXX FIXME: This logic is not sufficient for non-decimal numbers */
 	  if (exponent > ((sign == 1) ? SCM_MAXEXP : SCM_MAXEXP + DBL_DIG + 1))
 	    {
 	      size_t exp_len = idx - start;
@@ -6046,7 +6048,7 @@ mem2decimal_from_point (SCM result, SCM mem,
 	      scm_out_of_range ("string->number", exp_num);
 	    }
 
-	  e = scm_integer_expt (SCM_I_MAKINUM (10), SCM_I_MAKINUM (exponent));
+	  e = scm_integer_expt (SCM_I_MAKINUM (radix), SCM_I_MAKINUM (exponent));
 	  if (sign == 1)
 	    result = scm_product (result, e);
 	  else
@@ -6138,15 +6140,14 @@ mem2ureal (SCM mem, unsigned int *p_idx,
 
   if (scm_i_string_ref (mem, idx) == '.')
     {
-      if (radix != 10)
-	return SCM_BOOL_F;
-      else if (idx + 1 == len)
+      if (idx + 1 == len)
 	return SCM_BOOL_F;
-      else if (!uc_is_property_decimal_digit ((scm_t_uint32) scm_i_string_ref (mem, idx+1)))
+      else if (char_digit_value (scm_i_string_ref (mem, idx+1))
+               >= radix)
 	return SCM_BOOL_F;
       else
-	result = mem2decimal_from_point (SCM_INUM0, mem,
-					 p_idx, &implicit_x);
+	result = mem2real_from_point (SCM_INUM0, mem, p_idx,
+                                      radix, &implicit_x);
     }
   else
     {
@@ -6173,14 +6174,13 @@ mem2ureal (SCM mem, unsigned int *p_idx,
 	  /* both are int/big here, I assume */
 	  result = scm_i_make_ratio (uinteger, divisor);
 	}
-      else if (radix == 10)
+      else
 	{
-	  result = mem2decimal_from_point (uinteger, mem, &idx, &implicit_x);
+	  result = mem2real_from_point (uinteger, mem, &idx,
+                                        radix, &implicit_x);
 	  if (scm_is_false (result))
 	    return SCM_BOOL_F;
 	}
-      else
-	result = uinteger;
 
       *p_idx = idx;
     }
@@ -6436,7 +6436,7 @@ SCM_DEFINE (scm_string_to_number, "string->number", 1, 1, 0,
             (SCM string, SCM radix),
 	    "Return a number of the maximally precise representation\n"
 	    "expressed by the given @var{string}. @var{radix} must be an\n"
-	    "exact integer, either 2, 8, 10, or 16. If supplied, @var{radix}\n"
+	    "exact integer between 2 and 36. If supplied, @var{radix}\n"
 	    "is a default radix that may be overridden by an explicit radix\n"
 	    "prefix in @var{string} (e.g. \"#o177\"). If @var{radix} is not\n"
 	    "supplied, then the default radix is 10. If string is not a\n"
@@ -6451,7 +6451,7 @@ SCM_DEFINE (scm_string_to_number, "string->number", 1, 1, 0,
   if (SCM_UNBNDP (radix))
     base = 10;
   else
-    base = scm_to_unsigned_integer (radix, 2, INT_MAX);
+    base = scm_to_unsigned_integer (radix, 2, 36);
 
   answer = scm_i_string_to_number (string, base);
   scm_remember_upto_here_1 (string);
diff --git a/test-suite/tests/numbers.test b/test-suite/tests/numbers.test
index 847f939..0acc3db 100644
--- a/test-suite/tests/numbers.test
+++ b/test-suite/tests/numbers.test
@@ -1,6 +1,6 @@
 ;;;; numbers.test --- tests guile's numbers     -*- scheme -*-
-;;;; Copyright (C) 2000, 2001, 2003, 2004, 2005, 2006, 2009, 2010, 2011,
-;;;;   2012, 2013 Free Software Foundation, Inc.
+;;;; Copyright (C) 2000, 2001, 2003-2006, 2009-2014
+;;;;   Free Software Foundation, Inc.
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -1575,7 +1575,7 @@
     (for-each (lambda (x) (if (string->number x) (throw 'fail)))
 	      '("" "q" "1q" "6+7iq" "8+9q" "10+11" "13+" "18@19q" "20@q" "23@"
 		"+25iq" "26i" "-q" "-iq" "i" "5#.0" "8/" "10#11" ".#" "."
-		"#o.2" "3.4q" "15.16e17q" "18.19e+q" ".q" ".17#18" "10q" "#b2"
+		"3.4q" "15.16e17q" "18.19e+q" ".q" ".17#18" "10q" "#b2"
 		"#b3" "#b4" "#b5" "#b6" "#b7" "#b8" "#b9" "#ba" "#bb" "#bc"
 		"#bd" "#be" "#bf" "#q" "#b#b1" "#o#o1" "#d#d1" "#x#x1" "#e#e1"
 		"#i#i1" "12@12+0i" "3/0" "0/0" "4+3/0i" "4/0-3i" "2+0/0i"
-- 
1.8.4


--=-=-=--