unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed
* bug#30237: Generalizing ‘and=>’
@ 2018-01-24 12:10 Mathieu Lirzin
  2018-01-24 15:01 ` Mark H Weaver
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Lirzin @ 2018-01-24 12:10 UTC (permalink / raw)
  To: 30237

Hello,

Here is a proposal for generalizing ‘and=>’ to a pipeline of procedures.
It acts like a “bind” operator in an ad-hoc “Maybe” monad which uses #f
to represent the absence of value.  Not sure if it is useful in
practice, but it feels like a natural generalization.

The current definition is the following:

  (define (and=> value procedure)
    (and value (procedure value)))

Here is my proposition:

  (define-syntax and=>
    (syntax-rules ()
      ((_) #t)
      ((_ val) val)
      ((_ val proc)
       (and val (proc val)))
      ((_ val proc proc* ...)
       (and=> (and val (proc val)) proc* ...))))

Let me know if such change is welcome or not, so I can provide a
complete patch including documentation.  Even if it's a small change, I
would like to assign copyright for future changes.

Thanks.

-- 
Mathieu Lirzin
GPG: F2A3 8D7E EB2B 6640 5761  070D 0ADE E100 9460 4D37





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30237: Generalizing ‘and=>’
  2018-01-24 12:10 bug#30237: Generalizing ‘and=>’ Mathieu Lirzin
@ 2018-01-24 15:01 ` Mark H Weaver
  2018-01-24 20:08   ` Mathieu Lirzin
  0 siblings, 1 reply; 5+ messages in thread
From: Mark H Weaver @ 2018-01-24 15:01 UTC (permalink / raw)
  To: Mathieu Lirzin; +Cc: 30237

Hi Mathieu,

Mathieu Lirzin <mthl@gnu.org> writes:

> Here is a proposal for generalizing ‘and=>’ to a pipeline of procedures.
> It acts like a “bind” operator in an ad-hoc “Maybe” monad which uses #f
> to represent the absence of value.  Not sure if it is useful in
> practice, but it feels like a natural generalization.
>
> The current definition is the following:
>
>   (define (and=> value procedure)
>     (and value (procedure value)))
>
> Here is my proposition:
>
>   (define-syntax and=>
>     (syntax-rules ()
>       ((_) #t)
>       ((_ val) val)
>       ((_ val proc)
>        (and val (proc val)))
>       ((_ val proc proc* ...)
>        (and=> (and val (proc val)) proc* ...))))

Be careful, macros are different than procedures!  Instead of binding
values to variables, they bind _unevaluated_ forms to pattern variables.
So 'val' is a misleading name for the first operand of the macro above.
In fact, what's being bound there is an _expression_, and you are then
expanding this into something that contains two copies of that
expression.  So, for example:

  (and=> (compile-webkitgtk)
         test
         install)

expands into:

  (and=> (and (compile-webkitgtk)
              (test (compile-webkitgtk)))
         install)

which expands into:

  (and (and (compile-webkitgtk)
            (test (compile-webkitgtk)))
       (install (and (compile-webkitgtk)
                     (test (compile-webkitgtk)))))

So you end up compiling webkitgtk 4 times, and testing it twice, before
finally installing it.  More generally, if you pass N+1 operands, the
first operand will be evaluated 2^N times.

The other problem is that 'and=>' is currently a procedure, and you're
replacing it with a macro.  There are a couple of issues with this.  One
is that procedures are first-class objects in Scheme, but macros aren't.
Procedures can be passed as arguments to procedures, stored in data
structures, etc, but macros cannot.

For example, if 'and=>' is a procedure, you can write:

  (map and=> vals procs)

but you can't do this if 'and=>' is a macro.

The other problem with changing 'and=>' to a macro is that this
effectively changes the ABI of Guile, which we can't do within a stable
release series (2.2.x).  Existing .go files that use 'and=>' were
compiled to generate a normal procedure call for 'and=>', and if that
procedure no longer exists then the code will break.  This change
requires all users of 'and=>' to be recompiled.

> Let me know if such change is welcome or not, so I can provide a
> complete patch including documentation.

I think it's worth considering something along these lines for the
'master' branch which will eventually become Guile 3, although in order
to support existing callers that expect 'and=>' to be a first-class
procedure, we might want to arrange for a bare 'and=>' to expand into a
reference to a first-class procedure that does the same job, similar to
what we do with 'define-inlinable'.

The other option would be to implement your enhanced 'and=>' as a normal
procedure using case-lambda, maybe something like this:

  (define and=>
    (case-lambda
      ((val proc) (and val (proc val)))
      ((val . procs)
       (let loop ((val val) (procs procs))
         (if (null? procs)
             val
             (and val (loop ((car procs) val)
                            (cdr procs))))))
      (() #t)))

The first case is not strictly needed, but is included as an
optimization to avoid heap-allocating a list for the 'procs' rest
argument in the common case of two arguments.

The second case is implemented as a loop instead of a recursive call to
'and=>', to prevent repeatedly heap-allocating the rest list on each
iteration, which would lead to O(N^2) allocations for N arguments.  It
would be nicer to use 'match' here, but since 'and=>' is defined early
in boot-9.scm, we must restrict ourselves to core functionality in its
implementation.

I'm not sure if it makes sense to include the final (nullary) case.
Unlike 'and', 'or', '+', '*' and similar operations where every argument
is treated uniformally, in this case the first argument is qualitatively
different than the others.  Therefore, it seems to me that this
procedure naturally generalizes down to 1 argument, but no further.

Finally, there still some question in my mind whether this
generalization would be useful in practice.  Have you found a real-world
use case where this generalized 'and=>' makes life easier?

Do you know about SRFI-2 (and-let*)?  How would that work for your use
case?

     Regards,
       Mark





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30237: Generalizing ‘and=>’
  2018-01-24 15:01 ` Mark H Weaver
@ 2018-01-24 20:08   ` Mathieu Lirzin
  2018-01-31  4:31     ` Mark H Weaver
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Lirzin @ 2018-01-24 20:08 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 30237

Hello Mark,

Mark H Weaver <mhw@netris.org> writes:

> Mathieu Lirzin <mthl@gnu.org> writes:
>
>> Here is a proposal for generalizing ‘and=>’ to a pipeline of procedures.
>> It acts like a “bind” operator in an ad-hoc “Maybe” monad which uses #f
>> to represent the absence of value.  Not sure if it is useful in
>> practice, but it feels like a natural generalization.
>>
>> The current definition is the following:
>>
>>   (define (and=> value procedure)
>>     (and value (procedure value)))
>>
>> Here is my proposition:
>>
>>   (define-syntax and=>
>>     (syntax-rules ()
>>       ((_) #t)
>>       ((_ val) val)
>>       ((_ val proc)
>>        (and val (proc val)))
>>       ((_ val proc proc* ...)
>>        (and=> (and val (proc val)) proc* ...))))
>
> Be careful, macros are different than procedures!  Instead of binding
> values to variables, they bind _unevaluated_ forms to pattern variables.
> So 'val' is a misleading name for the first operand of the macro above.
> In fact, what's being bound there is an _expression_, and you are then
> expanding this into something that contains two copies of that
> expression.  So, for example:
>
>   (and=> (compile-webkitgtk)
>          test
>          install)
>
> expands into:
>
>   (and=> (and (compile-webkitgtk)
>               (test (compile-webkitgtk)))
>          install)
>
> which expands into:
>
>   (and (and (compile-webkitgtk)
>             (test (compile-webkitgtk)))
>        (install (and (compile-webkitgtk)
>                      (test (compile-webkitgtk)))))
>
> So you end up compiling webkitgtk 4 times, and testing it twice, before
> finally installing it.  More generally, if you pass N+1 operands, the
> first operand will be evaluated 2^N times.

Indeed I overlooked that.

> The other problem is that 'and=>' is currently a procedure, and you're
> replacing it with a macro.  There are a couple of issues with this.  One
> is that procedures are first-class objects in Scheme, but macros aren't.
> Procedures can be passed as arguments to procedures, stored in data
> structures, etc, but macros cannot.
>
> For example, if 'and=>' is a procedure, you can write:
>
>   (map and=> vals procs)
>
> but you can't do this if 'and=>' is a macro.
>
> The other problem with changing 'and=>' to a macro is that this
> effectively changes the ABI of Guile, which we can't do within a stable
> release series (2.2.x).  Existing .go files that use 'and=>' were
> compiled to generate a normal procedure call for 'and=>', and if that
> procedure no longer exists then the code will break.  This change
> requires all users of 'and=>' to be recompiled.

I initially implemented it as a procedure but move to the macro side of
thing in an attempt to generate more efficient code while handling
procedures that return multiple values.  But this can't work given the
issue you described above.

>> Let me know if such change is welcome or not, so I can provide a
>> complete patch including documentation.
>
> I think it's worth considering something along these lines for the
> 'master' branch which will eventually become Guile 3, although in order
> to support existing callers that expect 'and=>' to be a first-class
> procedure, we might want to arrange for a bare 'and=>' to expand into a
> reference to a first-class procedure that does the same job, similar to
> what we do with 'define-inlinable'.

I don't think making it a macro worths the breaking change.

> The other option would be to implement your enhanced 'and=>' as a normal
> procedure using case-lambda, maybe something like this:
>
>   (define and=>
>     (case-lambda
>       ((val proc) (and val (proc val)))
>       ((val . procs)
>        (let loop ((val val) (procs procs))
>          (if (null? procs)
>              val
>              (and val (loop ((car procs) val)
>                             (cdr procs))))))
>       (() #t)))
>
> The first case is not strictly needed, but is included as an
> optimization to avoid heap-allocating a list for the 'procs' rest
> argument in the common case of two arguments.
>
> The second case is implemented as a loop instead of a recursive call to
> 'and=>', to prevent repeatedly heap-allocating the rest list on each
> iteration, which would lead to O(N^2) allocations for N arguments.  It
> would be nicer to use 'match' here, but since 'and=>' is defined early
> in boot-9.scm, we must restrict ourselves to core functionality in its
> implementation.

That's a clear explanation.  IMO It would be nice if multiple values
were handled too.  here is one solution inspired by ‘compose’.

(define and=>
  (case-lambda
    ((val proc) (and val (proc val)))
    ((val proc . procs)
     (let loop ((p proc) (ps procs))
       (if (null? ps)
	   (p val)
	   (loop (lambda args
		   (call-with-values (lambda () (apply p args))
		     (lambda (arg0 . args*)
		       (and arg0 (apply (car ps) arg0 args*)))))
		 (cdr ps)))))))

I am not sure if there is a way to avoid constructing the composition of
functions.

> I'm not sure if it makes sense to include the final (nullary) case.
> Unlike 'and', 'or', '+', '*' and similar operations where every argument
> is treated uniformally, in this case the first argument is qualitatively
> different than the others.  Therefore, it seems to me that this
> procedure naturally generalizes down to 1 argument, but no further.

I don't think so, I defined it only because I took inspiration from
‘and’.

> Finally, there still some question in my mind whether this
> generalization would be useful in practice.  Have you found a real-world
> use case where this generalized 'and=>' makes life easier?

Not really.  :-)

> Do you know about SRFI-2 (and-let*)?  How would that work for your use
> case?

Yes I know about ‘and-let*’ which is a proper way to do pipelining while
checking for #f.  My motivation for extending ‘and=>’ was to have
something similar but without the special syntax.

Thanks for the in-depth review.

-- 
Mathieu Lirzin
GPG: F2A3 8D7E EB2B 6640 5761  070D 0ADE E100 9460 4D37





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30237: Generalizing ‘and=>’
  2018-01-24 20:08   ` Mathieu Lirzin
@ 2018-01-31  4:31     ` Mark H Weaver
  2018-01-31 14:23       ` Mathieu Lirzin
  0 siblings, 1 reply; 5+ messages in thread
From: Mark H Weaver @ 2018-01-31  4:31 UTC (permalink / raw)
  To: Mathieu Lirzin; +Cc: 30237

Mathieu Lirzin <mthl@gnu.org> writes:

> IMO It would be nice if multiple values
> were handled too.  here is one solution inspired by ‘compose’.
>
> (define and=>
>   (case-lambda
>     ((val proc) (and val (proc val)))
>     ((val proc . procs)
>      (let loop ((p proc) (ps procs))
>        (if (null? ps)
> 	   (p val)
> 	   (loop (lambda args
> 		   (call-with-values (lambda () (apply p args))
> 		     (lambda (arg0 . args*)
> 		       (and arg0 (apply (car ps) arg0 args*)))))
> 		 (cdr ps)))))))

This generalization will inevitably slow it down, and it's not clear
that this is useful in practice.  It's also not particularly natural,
because these return value(s) somehow need to be interpreted as a
boolean.  There are multiple ways to do that, and it's not clear which
is the best way.

I think we should resist the temptation to make simple things more
complicated without good reason.  Making things more complicated always
has a cost.  I'm a big believer in keeping things simple, especially the
widely used primitive operations.

Thoughts?

      Mark





^ permalink raw reply	[flat|nested] 5+ messages in thread

* bug#30237: Generalizing ‘and=>’
  2018-01-31  4:31     ` Mark H Weaver
@ 2018-01-31 14:23       ` Mathieu Lirzin
  0 siblings, 0 replies; 5+ messages in thread
From: Mathieu Lirzin @ 2018-01-31 14:23 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 30237

Mark H Weaver <mhw@netris.org> writes:

> This generalization will inevitably slow it down, and it's not clear
> that this is useful in practice.  It's also not particularly natural,
> because these return value(s) somehow need to be interpreted as a
> boolean.  There are multiple ways to do that, and it's not clear which
> is the best way.
>
> I think we should resist the temptation to make simple things more
> complicated without good reason.  Making things more complicated always
> has a cost.  I'm a big believer in keeping things simple, especially the
> widely used primitive operations.

Generalizing the arity of things tends to make them more “simple” for
example thinking of ‘+’ as an arbitrary arity function instead of a
binary operator is simpler.

I am realizing that In this case, the semantics are indeed unclear for
multiple values.  For example if multiple values were to be handled then
‘(and=> (values 1 2) list)’ should work in the first place.  Moreover as
you pointed what should be done regarding the truthiness of multiple
values is not clear either.  As a consequence I withdraw my proposition
to generalize ‘and=>’ to multiple values.

Having said that I think the implementation you previously proposed
(without the arity 0 case) keep things simple:

  (define and=>
    (case-lambda
      ((val proc) (and val (proc val)))
      ((val . procs)
       (let loop ((val val) (procs procs))
         (if (null? procs)
             val
             (and val (loop ((car procs) val)
                            (cdr procs))))))))

> Finally, there still some question in my mind whether this
> generalization would be useful in practice.  Have you found a
> real-world use case where this generalized 'and=>' makes life easier?

I took some time to think about a pseudo-realistic use case.  Consider
some configuration variables that are set from the process environment.
We want to check and transform what the user provides with some slight
variations for each variable:

  ;;; Higher-order utilities.

  (define (ensure pred)
    (lambda (val)
      (and (pred val) val)))

  (define (check pred msg)
    (lambda (val)
      (unless (pred val)
        (display msg))
      val))

  ;;; Process and check variables (unrealistic but give an idea).

  (define (split-path str)
    (string-split str #\:))

  (define (no-empty-strings? lst)
    (not (any (lambda (str) (string=? "" str)) lst)))

  (define (keep-existing-files lst)
    (filter file-exists? lst))

  (define (warn-deprecated-path lst)
    (when (any (lambda (string-suffix? "/old-bar")))
       (display "Please don't refer to \"old-bar\" for XYZ reason")))

  ;;; Global variables.

  (define %foo
    ;; Apply procedures in a pipeline fashion...
    (and=> (and=> (and=> (and=> (getenv "FOO") split-path)
                         (ensure no-empty-strings?))
                  keep-existing-files)
           (ensure pair?)))

  (define %bar
    ;; ...or with more idiomatic procedure calls.
    (let ((lst (split-path (or (getenv "BAR")
                               "/etc/bar:/usr/share/bar"))))
      (when (no-empty-strings? lst)
        (warn-deprecated-path lst)
        lst)))

Handling arbitrary arities would allow rewriting those variables in a
more elegant way:

  (define %foo
    (and=> (getenv "FOO")
           split-path
           (ensure no-empty-strings?)
           keep-existing-files
           (ensure pair?)))

  (define (no-deprecated-dirs? lst)
    (not (any (lambda (str) (string-suffix? "/old-bar" str)) lst)))

  (define %bar
    (and=> (or (getenv "BAR") "/etc/bar:/usr/share/bar")
           split-path
           (ensure no-empty-strings?)
           (check no-deprecated-dirs?
                  "Please don't refer to \"old-bar\" for XYZ reason")))

‘and-let*’ would be a reasonable alternative but the benefit of this
form is that it favours the use of higher-order procedures in place of
special syntax.

WDYT?

-- 
Mathieu Lirzin
GPG: F2A3 8D7E EB2B 6640 5761  070D 0ADE E100 9460 4D37





^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-01-31 14:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-24 12:10 bug#30237: Generalizing ‘and=>’ Mathieu Lirzin
2018-01-24 15:01 ` Mark H Weaver
2018-01-24 20:08   ` Mathieu Lirzin
2018-01-31  4:31     ` Mark H Weaver
2018-01-31 14:23       ` Mathieu Lirzin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).