unofficial mirror of guile-user@gnu.org 
* Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value
@ 2015-11-03 13:56 Artyom Poptsov
  2015-11-03 14:44 ` Taylan Ulrich Bayırlı/Kammer
  0 siblings, 1 reply; 6+ messages in thread
From: Artyom Poptsov @ 2015-11-03 13:56 UTC (permalink / raw)
  To: Guile Users' Mailing List


[-- Attachment #1.1: Type: text/plain, Size: 3170 bytes --]

Hello Guilers,

it seems that currently there's no way to 'read' back an '*unspecified*'
value, but in some cases such a feature might be handy.  Here's the
description of the problem; a patch is attached as well.

To be more specific, this expression fails in GNU Guile 2.0.9 and
2.1.0.455-73f61-dirty (which I compiled from the master branch):

--8<---------------cut here---------------start------------->8---
(read (open-input-string (object->string *unspecified*)))
-| ERROR: In procedure read:
-| ERROR: In procedure scm_lreadr: #<unknown port>:1:3: Unknown # object: #\<
-|
-| Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
--8<---------------cut here---------------end--------------->8---

I ran into this problem when I tried to 'read' back a vector returned by
'make-vector' with the default filling (that is, '*unspecified*'
values).
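
A minimal sketch of the failure, assuming a stock Guile with no reader
extension installed for '#<'.  The helper name 'try-read' is made up for
this illustration; it turns any reader error into the symbol
'unreadable so the result can be inspected:

```scheme
;; Print forms involved in the failure described above.
(define unspecified-text (object->string *unspecified*))
(define vector-text (object->string (make-vector 2)))

;; Hypothetical helper: try to read TEXT back; return the datum, or the
;; symbol 'unreadable if the reader signals any error.
(define (try-read text)
  (catch #t
    (lambda () (read (open-input-string text)))
    (lambda (key . args) 'unreadable)))

(display unspecified-text) (newline)
(display (try-read unspecified-text)) (newline)
(display (try-read vector-text)) (newline)
```

Both attempts should come back as 'unreadable here, because the reader
stops at the '#<' sequence in each print form.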

Looking through the Guile sources I found that one can fix a problem
with reading a (custom) object by using the 'read-hash-extend' procedure.
But alas, the problem remains: I cannot return an unspecified
value from my Guile reader callback because 'scm_read_sharp' from
'libguile/read.c' treats such a value as an indication that the
procedure is unable to read an object.

With that said, I think the fix could be pretty simple: when a reader
callback needs to 'read' an unspecified value, it could return a
multiple values object whose 2nd value indicates whether the returned
unspecified value is *the* value or a sign that the callback could not
read the value.  As far as I understand, this solution is backward
compatible, so current callbacks will work as usual.
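
The convention can be modelled in plain Scheme, independent of the C
patch.  This is only a sketch of the idea: the procedure name
'interpret-callback-result' and the '(ok VALUE)' / 'cannot-read result
shapes are assumptions made for illustration, not part of Guile:

```scheme
;; Classify what a reader-extension callback returned, following the
;; convention proposed above (a Scheme model, not the C implementation):
;;   two values, second one true   -> (ok VALUE), even for *unspecified*
;;   one value, *unspecified*      -> 'cannot-read
;;   one value, anything else      -> (ok VALUE)
(define (interpret-callback-result thunk)
  (call-with-values thunk
    (lambda (value . rest)
      (cond ((and (pair? rest) (car rest)) (list 'ok value))
            ((unspecified? value) 'cannot-read)
            (else (list 'ok value))))))

;; An old-style callback that gives up, and a new-style one that really
;; means to return the unspecified value.
(define old-style (interpret-callback-result (lambda () *unspecified*)))
(define new-style (interpret-callback-result
                   (lambda () (values *unspecified* #t))))
```

With this model an existing callback that returns a single value is
classified exactly as before, which is the backward compatibility
argument in a nutshell.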

As an example:

--8<---------------cut here---------------start------------->8---
(read-hash-extend #\<
                  (lambda (c port)
                    (let ((str     "")
                          (pending 0)
                          (c       (read-char port)))
                      (while (not (and (char=? c #\>) (= pending 0)))
                        (and (char=? c #\<)
                             (set! pending (1+ pending)))
                        (and (char=? c #\>)
                             (set! pending (1- pending)))
                        (set! str (string-append str (string c)))
                        (set! c (read-char port)))
                      (if (string=? str "unspecified")
                          (values *unspecified* #t)
                          *unspecified*))))

(define (read-string str)
  (read (open-input-string str)))

(write (read-string (object->string (make-vector 2))))
;; => #(#<unspecified> #<unspecified>)

(write (read-string (object->string *unspecified*)))
;; => #<unspecified>
--8<---------------cut here---------------end--------------->8---

As I said, the patch is attached.  I'd love to hear any comments on the
patch, especially given that this patch is my first attempt to make a
contribution to GNU Guile.

Thanks,

- Artyom

-- 
Artyom V. Poptsov <poptsov.artyom@gmail.com>;  GPG Key: 0898A02F
Home page: http://poptsov-artyom.narod.ru/

[-- Attachment #1.2: [PATCH] Provide means to 'read' an unspecified object --]
[-- Type: text/x-diff, Size: 5648 bytes --]

From c593bdd8b7c62ea857fb196ceeb003bb87caf9fa Mon Sep 17 00:00:00 2001
From: "Artyom V. Poptsov" <poptsov.artyom@gmail.com>
Date: Tue, 6 Oct 2015 10:20:57 +0300
Subject: [PATCH] Provide means to 'read' an #<unspecified> object

Currently there's no way to 'read' back an unspecified value, but in
some cases such a feature might be handy (e.g. when reading a
vector filled with unspecified objects).  This patch changes the way the
reader treats the returned value of a reader extension callback set by
'read-hash-extend': now when the callback returns a multiple values
object and the second value is '#t', the 1st value is returned as is,
even if it is an unspecified object.

If a callback returns an unspecified object alone, it is considered
an indication that the callback could not read an object.  In other
words, the solution should be backward compatible with existing reader
extensions.

* libguile/read.c (scm_read_sharp): Check whether the value returned by
  'scm_read_sharp_extension' is a multiple values object and handle it
  appropriately if so.
  (read_inner_expression): Handle multiple values object returned by
  'scm_read_sharp'.
* test-suite/tests/reader.test: Add a test for 'read-hash-extend' to
  ensure that the callback can return an unspecified object.  Add a test
  for the same procedure that ensures that the callback can indicate an
  unknown object by returning an unspecified value alone.
* doc/ref/api-evaluation.texi: Update description of 'read-hash-extend'.
---
 doc/ref/api-evaluation.texi  |  8 ++++++++
 libguile/read.c              | 19 +++++++++++++------
 test-suite/tests/reader.test | 11 +++++++++++
 3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/doc/ref/api-evaluation.texi b/doc/ref/api-evaluation.texi
index 296f1da..8278ab9 100644
--- a/doc/ref/api-evaluation.texi
+++ b/doc/ref/api-evaluation.texi
@@ -298,6 +298,14 @@ starting with the character sequence @code{#} and @var{chr}.
 @var{proc} will be called with two arguments:  the character
 @var{chr} and the port to read further data from. The object
 returned will be the return value of @code{read}. 
+
+Returning of an unspecified value (value of the @code{*unspecified*}
+constant) alone from @var{proc} means that the procedure cannot parse
+the character sequence.  Despite of this, in some cases you may want to
+read and return an unspecified value as the correct value.  This can be
+achieved by returning of two values: an unspecified value as the first
+value and @code{#t} as the second, see (@pxref{Multiple Values}).
+
 Passing @code{#f} for @var{proc} will remove a previous setting. 
 
 @end deffn
diff --git a/libguile/read.c b/libguile/read.c
index ecf27ff..e9bb6f0 100644
--- a/libguile/read.c
+++ b/libguile/read.c
@@ -1656,8 +1656,15 @@ scm_read_sharp (scm_t_wchar chr, SCM port, scm_t_read_opts *opts,
   chr = scm_getc_unlocked (port);
 
   result = scm_read_sharp_extension (chr, port, opts);
-  if (!scm_is_eq (result, SCM_UNSPECIFIED))
-    return result;
+  if (! SCM_VALUESP (result))
+    {
+      if (! scm_is_eq (result, SCM_UNSPECIFIED))
+        return result;
+    }
+  else if (scm_is_true (scm_c_value_ref (result, 1)))
+    {
+      return result;
+    }
 
   switch (chr)
     {
@@ -1713,7 +1720,7 @@ scm_read_sharp (scm_t_wchar chr, SCM port, scm_t_read_opts *opts,
       return (scm_read_nil (chr, port, opts));
     default:
       result = scm_read_sharp_extension (chr, port, opts);
-      if (scm_is_eq (result, SCM_UNSPECIFIED))
+      if ((! SCM_VALUESP (result)) && scm_is_eq (result, SCM_UNSPECIFIED))
 	{
 	  /* To remain compatible with 1.8 and earlier, the following
 	     characters have lower precedence than `read-hash-extend'
@@ -1728,7 +1735,7 @@ scm_read_sharp (scm_t_wchar chr, SCM port, scm_t_read_opts *opts,
 	    }
 	}
       else
-	return result;
+	return scm_c_value_ref (result, 0);
     }
 
   return SCM_UNSPECIFIED;
@@ -1807,11 +1814,11 @@ read_inner_expression (SCM port, scm_t_read_opts *opts)
             long line  = SCM_LINUM (port);
             int column = SCM_COL (port) - 1;
 	    SCM result = scm_read_sharp (chr, port, opts, line, column);
-	    if (scm_is_eq (result, SCM_UNSPECIFIED))
+	    if ((! SCM_VALUESP (result)) && scm_is_eq (result, SCM_UNSPECIFIED))
 	      /* We read a comment or some such.  */
 	      break;
 	    else
-	      return result;
+	      return scm_c_value_ref (result, 0);
 	  }
 	case ')':
 	  scm_i_input_error (FUNC_NAME, port, "unexpected \")\"", SCM_EOL);
diff --git a/test-suite/tests/reader.test b/test-suite/tests/reader.test
index 5eb368d..b860abe 100644
--- a/test-suite/tests/reader.test
+++ b/test-suite/tests/reader.test
@@ -169,6 +169,17 @@
   (pass-if "square brackets are parens"
     (equal? '() (read-string "[]")))
 
+  (pass-if "reader callback can read an unspecified value"
+    (with-fluids ((%read-hash-procedures (fluid-ref %read-hash-procedures)))
+      (read-hash-extend #\< (lambda (c port) (values *unspecified* #t)))
+      (eq? *unspecified* (read-string (object->string *unspecified*)))))
+
+  (pass-if-exception "reader callback can indicate an unknown object"
+      exception:unknown-sharp-object
+    (with-fluids ((%read-hash-procedures (fluid-ref %read-hash-procedures)))
+      (read-hash-extend #\< (lambda (c port) *unspecified*))
+      (eq? *unspecified* (read-string (object->string *unspecified*)))))
+
   (pass-if-exception "paren mismatch" exception:unexpected-rparen
                      (read-string "'[)"))
 
-- 
2.4.6


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]


* Re: Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value
  2015-11-03 13:56 Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value Artyom Poptsov
@ 2015-11-03 14:44 ` Taylan Ulrich Bayırlı/Kammer
  2015-11-03 15:21   ` David Kastrup
  2015-11-03 19:37   ` Artyom Poptsov
  0 siblings, 2 replies; 6+ messages in thread
From: Taylan Ulrich Bayırlı/Kammer @ 2015-11-03 14:44 UTC (permalink / raw)
  To: Artyom Poptsov; +Cc: Guile Users' Mailing List

Artyom Poptsov <poptsov.artyom@gmail.com> writes:

> Hello Guilers,
>
> it seems that currently there's no way to 'read' back an '*unspecified*'
> value, but in some cases such a feature might be handy.  Here's the
> description of the problem; a patch is attached as well.

Just my opinion: I generally see code relying on the existence of the
*unspecified* value (let alone any specific semantics of it) to be
sub-optimal.

Guile documents the value, so I guess there are some guarantees regarding
its existence and semantics, but I think it's best not to rely on it
anyway, so that #1 Guile can decide to do something else in the future
where it currently returns *unspecified*, #2 code has clearer semantics,
#3 code can be ported more easily to other Scheme platforms (say GNU
Kawa), and possibly more such benefits.

In that vein, I actually find it beneficial when code relying on the
*unspecified* value fails as early as possible.  For instance in Guix
package recipes, some people (including me) occasionally accidentally
write recipes where a procedure returns *unspecified* where actually a
Boolean is expected.  This easily falls through the cracks because the
system accepts *unspecified* as a non-false Boolean value, when actually
it indicates that any arbitrary value could have been returned in its
place, including #f.
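
To illustrate the pitfall, here is a small sketch.  The procedures
'careless-step' and 'strict-boolean' are hypothetical names invented for
this example and are not taken from Guix:

```scheme
;; A hypothetical build step that forgets to return a Boolean: its body
;; is a one-armed 'if', so a false test yields *unspecified*.
(define (careless-step ok?)
  (if ok? #t))

(define step-result (careless-step #f))

;; *unspecified* is not #f, so a conditional treats it as true.
(define treated-as-success? (if step-result #t #f))

;; A defensive caller can reject anything that is not a real Boolean,
;; making the mistake fail early instead of slipping through.
(define (strict-boolean value)
  (if (boolean? value)
      value
      (error "expected a Boolean, got:" value)))
```

Here 'treated-as-success?' ends up true even though the step was meant
to report failure, while '(strict-boolean step-result)' signals an
error.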

Just my two cents.  The maintainers should decide what to do. :-)

Taylan




* Re: Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value
  2015-11-03 14:44 ` Taylan Ulrich Bayırlı/Kammer
@ 2015-11-03 15:21   ` David Kastrup
  2015-11-03 15:26     ` David Kastrup
  2015-11-03 19:37   ` Artyom Poptsov
  1 sibling, 1 reply; 6+ messages in thread
From: David Kastrup @ 2015-11-03 15:21 UTC (permalink / raw)
  To: guile-user

taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:

> Artyom Poptsov <poptsov.artyom@gmail.com> writes:
>
>> Hello Guilers,
>>
>> it seems that currently there's no way to 'read' back an
>> '*unspecified*' value,

*unspecified* works reasonably fine in most circumstances.

scheme@(guile-user)> (make-vector 3)
$1 = #(#<unspecified> #<unspecified> #<unspecified>)
scheme@(guile-user)> #(#<unspecified> #<unspecified> #<unspecified>)
While reading expression:
ERROR: In procedure scm_lreadr: #<unknown port>:2:3: Unknown # object: #\<
scheme@(guile-user)> #(*unspecified* *unspecified* *unspecified*)
$2 = #(*unspecified* *unspecified* *unspecified*)
scheme@(guile-user)> 

It seems like printing *unspecified* as #<unspecified> is not actually
doing anybody much of a favor, though.

>> but in some cases such a feature might be handy.  Here's the
>> description of the problem; a patch is attached as well.
>
> Just my opinion: I generally see code relying on the existence of the
> *unspecified* value (let alone any specific semantics of it) to be
> sub-optimal.

See <URL:http://debbugs.gnu.org/cgi/bugreport.cgi?bug=17474>.  Even
while Andy Wingo steadfastly refuses to acknowledge this patch, it is an
implementation of his comment

;;; {The Unspecified Value}
;;;
;;; Currently Guile represents unspecified values via one particular value,
;;; which may be obtained by evaluating (if #f #f). It would be nice in the
;;; future if we could replace this with a return of 0 values, though.

This patch renders *unspecified* and (values) identical (and equivalent
to SCM_UNSPECIFIED which thus is the C representation of the zero-values
object).

In spite of this patch being ignored perpetually, code relying on
*unspecified* being different from (values) in too many respects seems
imprudent.
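
For reference, a short sketch of how the two differ in a Guile without
that patch; 'count-values' is a helper name made up for this example:

```scheme
;; Hypothetical helper: count how many values THUNK returns.
(define (count-values thunk)
  (call-with-values thunk (lambda vals (length vals))))

;; (values) returns no values at all ...
(define zero-count (count-values (lambda () (values))))

;; ... whereas (if #f #f) returns exactly one value, *unspecified*.
(define one-count (count-values (lambda () (if #f #f))))
(define same-object? (eq? (if #f #f) *unspecified*))
```

Code that stores or compares the single *unspecified* object depends on
the second behaviour and would be affected if the two were unified.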

However, it is also clear that *unspecified* in GUILE, patch or not, has
a more tangible existence than what fundamental Scheme guarantees for
(values) or any other non-single-value object, and that this is not
going to change.

While for most instances of "the return value is unspecified" it seems
like a reasonably elegant way to implement this as "the return value is
*unspecified*", I think it was a bad idea to also use this plan for "the
initial value is unspecified" as in the case of make-vector.  That
significantly reduces the options for more rigid implementations of
*unspecified* since lots of code by now relies on being able to move
*unspecified* around as part of data structures.

> Guile documents the value, so I guess there's some guarantees
> regarding its existence and semantics, but I think it's best not to
> rely on it anyway, so that #1 Guile can decide to do something else in
> the future where it currently returns *unspecified*, #2 code has
> clearer semantics, #3 code can be ported more easily to other Scheme
> platforms (say GNU Kawa), and possibly more such benefits.

[...]

> Just my two cents.  The maintainers should decide what to do. :-)

Or not.

-- 
David Kastrup





* Re: Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value
  2015-11-03 15:21   ` David Kastrup
@ 2015-11-03 15:26     ` David Kastrup
  0 siblings, 0 replies; 6+ messages in thread
From: David Kastrup @ 2015-11-03 15:26 UTC (permalink / raw)
  To: guile-user

David Kastrup <dak@gnu.org> writes:

> taylanbayirli@gmail.com (Taylan Ulrich "Bayırlı/Kammer") writes:
>
>> Artyom Poptsov <poptsov.artyom@gmail.com> writes:
>>
>>> Hello Guilers,
>>>
>>> it seems that currently there's no way to 'read' back an
>>> '*unspecified*' value,
>
> *unspecified* works reasonably fine in most circumstances.
>
> scheme@(guile-user)> (make-vector 3)
> $1 = #(#<unspecified> #<unspecified> #<unspecified>)
> scheme@(guile-user)> #(#<unspecified> #<unspecified> #<unspecified>)
> While reading expression:
> ERROR: In procedure scm_lreadr: #<unknown port>:2:3: Unknown # object: #\<
> scheme@(guile-user)> #(*unspecified* *unspecified* *unspecified*)
> $2 = #(*unspecified* *unspecified* *unspecified*)
> scheme@(guile-user)> 
>
> It seems like printing *unspecified* as #<unspecified> is not actually
> doing anybody much of a favor, though.

Though #(*unspecified* *unspecified* *unspecified*) is not actually an
alternative, being a vector of three symbols.  Cough cough.

(vector *unspecified* *unspecified* *unspecified*)

does read back properly but of course that does not turn *unspecified*
into an actual print form of *unspecified*.  Sorry for this particular
brain fart.
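
The difference is easy to check.  A small sketch (the variable names are
chosen only for this example):

```scheme
;; A vector literal is not evaluated element by element, so its
;; elements are the symbols named *unspecified*.
(define literal-vector '#(*unspecified* *unspecified* *unspecified*))

;; 'vector' is an ordinary procedure call, so the variable
;; *unspecified* is evaluated and the elements are the actual value.
(define built-vector (vector *unspecified* *unspecified* *unspecified*))

(define literal-holds-symbols? (symbol? (vector-ref literal-vector 0)))
(define built-holds-value? (unspecified? (vector-ref built-vector 0)))
```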

-- 
David Kastrup





* Re: Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value
  2015-11-03 14:44 ` Taylan Ulrich Bayırlı/Kammer
  2015-11-03 15:21   ` David Kastrup
@ 2015-11-03 19:37   ` Artyom Poptsov
  2015-11-03 19:47     ` Artyom Poptsov
  1 sibling, 1 reply; 6+ messages in thread
From: Artyom Poptsov @ 2015-11-03 19:37 UTC (permalink / raw)
  To: Taylan Ulrich "Bayırlı/Kammer"
  Cc: Guile Users' Mailing List

[-- Attachment #1: Type: text/plain, Size: 999 bytes --]

Hello Taylan,

thanks for your comments.  I may agree that the *unspecified* value
should not be used explicitly in Scheme code, yet 'make-vector' uses it as
the default filling for newly created vectors.  Given that it's not
possible to 'read' them, it adds to the special cases when output of
'write' is not machine readable.

It's possible to work around this in a user program by treating it as a
special case: a vector cannot be 'read' back unless the user
specified a proper filling.  Nevertheless I thought this is a problem
worth thinking about, and maybe there's a solution
that requires minimal changes to Guile sources.
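
A sketch of that user-side workaround, assuming #f is an acceptable
filling for the application; 'round-trip' is a helper name made up for
this example:

```scheme
;; Hypothetical helper: write OBJ to a string and read it back.
(define (round-trip obj)
  (read (open-input-string (object->string obj))))

;; With an explicit, readable filling the vector survives the trip.
(define filled (make-vector 2 #f))
(define survives? (equal? filled (round-trip filled)))
```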

And note that the reader callback relies on *undefined* value anyways.

- Artyom

P.S. I was going to send the patch to the guile-devel ML, but by mistake
     I sent it to the guile-user ML.  Sorry for that.

-- 
Artyom V. Poptsov <poptsov.artyom@gmail.com>;  GPG Key: 0898A02F
Home page: http://poptsov-artyom.narod.ru/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]


* Re: Guile 2.0.9, reader: Cannot 'read' an '*unspecified*' value
  2015-11-03 19:37   ` Artyom Poptsov
@ 2015-11-03 19:47     ` Artyom Poptsov
  0 siblings, 0 replies; 6+ messages in thread
From: Artyom Poptsov @ 2015-11-03 19:47 UTC (permalink / raw)
  To: Taylan Ulrich "Bayırlı/Kammer"
  Cc: Guile Users' Mailing List

[-- Attachment #1: Type: text/plain, Size: 235 bytes --]

> And note that the reader callback relies on *undefined* value anyways.

s/*undefined*/*unspecified*/

- Artyom

-- 
Artyom V. Poptsov <poptsov.artyom@gmail.com>;  GPG Key: 0898A02F
Home page: http://poptsov-artyom.narod.ru/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]
