unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: Colin Woodbury <colin@fosskers.ca>
To: "Linus Björnstam" <linus.bjornstam@veryfast.biz>, guile-devel@gnu.org
Subject: Re: [PATCHv2] Extensions for SRFI-171 (Transducers)
Date: Sun, 25 Dec 2022 00:28:13 +0900	[thread overview]
Message-ID: <1158f7bf-3093-8c3b-dd74-267f1fbb6f4c@fosskers.ca> (raw)
In-Reply-To: <bd133eb0-8dcb-4818-ad13-72c9f40bc4cf@app.fastmail.com>

[-- Attachment #1: Type: text/plain, Size: 2214 bytes --]

Hi Linus, great to hear from you. Thanks for the feedback. I've attached 
new versions of the patches (with your suggested fix), and also an extra 
4th commit that adds a short guide for how to write one's own Reducers.

Merry Christmas!

Colin

On 12/21/22 19:00, Linus Björnstam wrote:
> As the author of both the SRFI and the guile code I am very happy you like it. I don't have a computer at the moment, but I looked through the code and it looked great.
>
> All additions should have been included in the original SRFI :)
>
> one comment: your code uses define-public, which the rest of SRFI-171 code does not.
>
> I am not in any position to sign code off for inclusion in guile proper, but if the define-public thing is fixed it very much has my blessing.
>
> Best regards
>    Linus Björnstam
>
> On Wed, 21 Dec 2022, at 01:48, Colin Woodbury wrote:
>> Happy holidays everyone, I hope everything is going well for you.
>>
>> Since discovering SRFI-171 (Transducers) I have fallen in love with it.
>> Transducers let me "talk the way I want to talk" while knowing that I'm
>> being efficient underneath w.r.t. to iteration and allocation. In using
>> Guile's implementation, I noticed a few common idioms missing that are
>> otherwise present in other languages, so I've added them in a series of
>> patches. I've been using these often for a number of weeks without
>> issue, but of course have added unit tests as well.
>>
>> The full details are in the commit messages, but here are the main highlights:
>>
>>   * rfold: The fundamental reducer. This allows the user to turn any
>> two-arg function into a valid reducer, so that they don't need to worry
>> about hand-writing reducers via case-lambda.
>>   * rfind: Yields the first item in the transduction that matches some
>> predicate. Nice for locating some specific value from a potentially
>> large data source (e.g. a port).
>>   * twindow: Like tsegment, but yields overlapping slices into the data.
>> Cheers, and have a great holiday.
>>
>> Colin
>>
>> Attachments:
>> * 0001-srfi-171-add-twindow-and-various-reducers.patch
>> * 0002-doc-add-new-SRFI-171-reducers-to-the-manual.patch
>> * 0003-srfi-171-add-unit-tests-for-new-functions.patch

[-- Attachment #2: 0001-srfi-171-add-twindow-and-various-reducers.patch --]
[-- Type: text/x-patch, Size: 4899 bytes --]

From 96856b184a507886db2c5c20323983ae125a6bdb Mon Sep 17 00:00:00 2001
From: Colin Woodbury <colin@fosskers.ca>
Date: Mon, 19 Dec 2022 09:39:37 +0900
Subject: [PATCH 1/4] srfi-171: add twindow and various reducers

This adds a number of reduction primitives often seen in other languages
to Guile's SRFI-171 extensions.

Most critical may be `rfold`, which could be called the fundamental
reducer, as it's likely that all other reducers could be defined in
terms of it (though not all are). While `tfold` already exists in
Guile's SRFI-171 extension as a transducer, folding is in essence a
reduction. Also without a primative like `rlast` (also introduced here),
the results of `tfold` are difficult to consume. This is avoided by
providing `rfold` directly as a generalised means to collapse an entire
transduction down into a single value (i.e. the whole point of
reducers). `rfold` is also useful for the creation of ad-hoc reducers,
as any 2-arg function can be passed to it to fold the stream of values.

`rfirst`, `rlast`, and `rfind` are common idioms and so have been added.

The equivalent of `rmax` and `rmin` are easy to write manually via
`rfold`, but they have been provided here as a convenience in the same
spirit as `rcons`.

`rfor-each` also cannot be forgotten as a classic adaptation of its
SRFI-1 cousin.

Also added is `twindow`, handy for analysing groups of adjacent items.
---
 module/srfi/srfi-171/gnu.scm | 87 +++++++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 2 deletions(-)

diff --git a/module/srfi/srfi-171/gnu.scm b/module/srfi/srfi-171/gnu.scm
index 45a4e19af..c41925e8a 100644
--- a/module/srfi/srfi-171/gnu.scm
+++ b/module/srfi/srfi-171/gnu.scm
@@ -15,10 +15,17 @@
 ;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 
 (define-module (srfi srfi-171 gnu)
+  #:use-module (ice-9 q)
   #:use-module (srfi srfi-171)
   #:use-module (srfi srfi-171 meta)
-  #:export (tbatch tfold))
-
+  #:export (tbatch
+            tfold
+            twindow
+            rfind
+            rfirst rlast
+            rfold
+            rfor-each
+            rmax rmin))
 
 (define tbatch
   (case-lambda
@@ -63,3 +70,79 @@
          (if (reduced? state)
              (reduced (reducer (unreduce state)))
              (r result state)))))))
+
+(define (twindow n)
+  "Yield @var{n}-length windows of overlapping values.  This is different from
+@code{tsegment} which yields non-overlapping windows. If there were
+fewer items in the input than @var{n}, then this yields nothing."
+  (when (not (and (integer? n) (positive? n)))
+        (error "argument to twindow must be a positive integer"))
+  (lambda (reducer)
+    (let ((i 0)
+          (q (make-q)))
+      (case-lambda
+        (() (reducer))
+        ((result) (reducer result))
+        ((result input)
+         (enq! q input)
+         (set! i (1+ i))
+         (cond ((< i n) result)
+               ((= i n) (reducer result (list-copy (car q))))
+               (else (deq! q)
+                     (reducer result (list-copy (car q))))))))))
+
+(define rfor-each
+  (case-lambda
+    "Run through every item in a transduction for their side effects but throw away
+all results."
+    (() *unspecified*)
+    ((acc) *unspecified*)
+    ((acc input) *unspecified*)))
+
+(define (rfirst seed)
+  "Yield the first value of the transduction, or the @var{seed} value if there is none."
+  (case-lambda
+    (() seed)
+    ((acc) acc)
+    ((_ input) (reduced input))))
+
+(define (rlast seed)
+  "Yield the final value of the transduction, or the @var{seed} value if there is none."
+  (case-lambda
+    (() seed)
+    ((acc) acc)
+    ((_ input) input)))
+
+(define (rfold f seed)
+  "The fundamental reducer. @code{rfold} creates an ad-hoc reducer based on
+a given 2-argument function. A @var{seed} is also required as the
+initial accumulator value, which also becomes the return value in case
+there was no input left in the transduction.
+
+Functions like @code{+} and @code{*} are automatically valid reducers,
+because they yield sane values even when given 0 or 1 arguments. Other
+functions like @code{max} cannot be used as-is as reducers since they
+require at least 2 arguments. For functions like this, @code{rfold} is
+appropriate."
+  (case-lambda
+    (() seed)
+    ((acc) acc)
+    ((acc input) (f acc input))))
+
+(define (rmax seed)
+  "Yield the maximum value of the transduction, or the @var{seed} value if
+there is none."
+  (rfold max seed))
+
+(define (rmin seed)
+  "Yield the minimum value of the transduction, or the @var{seed} value if
+there is none."
+  (rfold min seed))
+
+(define (rfind pred?)
+  "Find the first element in the transduction that satisfies a given predicate.
+Yields #f if no such element was found."
+  (case-lambda
+    (() #f)
+    ((acc) acc)
+    ((acc input) (if (pred? input) (reduced input) #f))))
-- 
2.39.0


[-- Attachment #3: 0002-doc-add-new-SRFI-171-reducers-to-the-manual.patch --]
[-- Type: text/x-patch, Size: 5704 bytes --]

From 58e7ca2718a860ca2fb5692684d6d128a7c1ae75 Mon Sep 17 00:00:00 2001
From: Colin Woodbury <colin@fosskers.ca>
Date: Tue, 20 Dec 2022 09:41:51 +0900
Subject: [PATCH 2/4] doc: add new SRFI-171 reducers to the manual

---
 doc/ref/srfi-modules.texi | 96 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 5 deletions(-)

diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi
index bce5b4eac..6eb1a563e 100644
--- a/doc/ref/srfi-modules.texi
+++ b/doc/ref/srfi-modules.texi
@@ -5836,7 +5836,7 @@ identity in the reduction.
 @cindex transducers reducers
 
 @deffn {Scheme Procedure} rcons
-a simple consing reducer. When called without values, it returns its
+A simple consing reducer. When called without values, it returns its
 identity, @code{'()}.  With one value, which will be a list, it reverses
 the list (using @code{reverse!}).  When called with two values, it conses
 the second value to the first.
@@ -5848,7 +5848,7 @@ the second value to the first.
 @end deffn
 
 @deffn {Scheme Procedure} reverse-rcons
-same as rcons, but leaves the values in their reversed order.
+The same as @code{rcons}, but leaves the values in their reversed order.
 @example
 (list-transduce (tmap (lambda (x) (+ x 1))) reverse-rcons (list 0 1 2 3))
 @result{} (4 3 2 1)
@@ -5856,7 +5856,7 @@ same as rcons, but leaves the values in their reversed order.
 @end deffn
 
 @deffn {Scheme Procedure} rany pred?
-The reducer version of any.  Returns @code{(reduced (pred? value))} if
+The reducer version of @code{any}.  Returns @code{(reduced (pred? value))} if
 any @code{(pred? value)} returns non-#f.  The identity is #f.
 
 @example
@@ -5869,7 +5869,7 @@ any @code{(pred? value)} returns non-#f.  The identity is #f.
 @end deffn
 
 @deffn {Scheme Procedure} revery pred?
-The reducer version of every.  Stops the transduction and returns
+The reducer version of @code{every}.  Stops the transduction and returns
 @code{(reduced #f)} if any @code{(pred? value)} returns #f.  If every
 @code{(pred? value)} returns true, it returns the result of the last
 invocation of @code{(pred? value)}.  The identity is #t.
@@ -5894,6 +5894,77 @@ transduction.
 @end example
 @end deffn
 
+@subheading Guile-specific reducers
+These reducers are available in the @code{(srfi srfi-171 gnu)} module,
+and are provided outside the standard described by the SRFI-171
+document.
+
+@deffn {Scheme Procedure} rfold proc seed
+The fundamental reducer. @code{rfold} creates an ad-hoc reducer based on
+a given 2-argument @var{proc}. A @var{seed} is required as the initial
+accumulator value, which also becomes the final return value in the case
+where there was no input left in the transduction.
+
+The first argument to the @var{proc} is the accumulating value, and the
+second is the current item of the transduction.
+
+Note that functions like @code{+} and @code{*} are automatically valid
+reducers, because they yield sane values even when given 0 or 1
+arguments. Other functions like @code{max} cannot be used as-is as
+reducers since they require at least 2 arguments. For functions like
+this, @code{rfold} is appropriate.
+
+@example
+;; Turning built-ins into reducers. Identical to rmax.
+(list-transduce (tfilter odd?) (rfold max 0) '(1 2 3 4 5))
+@result{} 5
+
+;; Custom lambdas into reducers. Identical to rlast.
+(list-transduce (tmap identity)
+                (rfold (lambda (_ input) input) #f)
+                '("abc" "def" "ghi"))
+@result{} "ghi"
+
+;; Track the 3 largest values in a transduction.
+(define (three-largest acc input)
+  (take (sort (cons input acc) >) 3))
+
+(list-transduce (tfilter odd?)
+                (rfold three-largest '(0 0 0))
+                '(7 1 4 2 13 5 9 2 8))
+@result{} (13 9 7)
+@end example
+@end deffn
+
+@deffn {Scheme Procedure} rfind pred?
+Find the first element in the transduction that satisfies a given
+predicate. Yields #f if no such element was found.
+
+@example
+(list-transduce (tmap identity)
+                (rfind string?)
+                '(1 c #t 4.12 "Jack" ()))
+@result{} "Jack"
+@end example
+@end deffn
+
+@deffn {Scheme Procedure} rfirst seed
+@deffnx {Scheme Procedure} rlast seed
+Yield the first (or last) value of the transduction, or the @var{seed}
+value if there is none.
+@end deffn
+
+@deffn {Scheme Procedure} rfor-each proc
+Apply @var{proc} for its side-effects to every value of the
+transduction, ignoring all results. Like its @ref{SRFI-1} cousin, yields
+@code{*unspecified*}.
+@end deffn
+
+@deffn {Scheme Procedure} rmax seed
+@deffnx {Scheme Procedure} rmin seed
+Yield the maximum (or minimum) value of the transduction, or the
+@var{seed} value if there is none.
+@end deffn
 
 @node SRFI-171 Transducers
 @subsubsection Transducers
@@ -6057,7 +6128,7 @@ Stateless.
 
 @subheading Guile-specific transducers
 These transducers are available in the @code{(srfi srfi-171 gnu)}
-library, and are provided outside the standard described by the SRFI-171
+module, and are provided outside the standard described by the SRFI-171
 document.
 
 @deffn {Scheme Procedure} tbatch reducer
@@ -6085,6 +6156,21 @@ value)}, saving it's result between iterations.
 @end example
 @end deffn
 
+@deffn {Scheme Procedure} twindow n
+
+Returns a transducer that yields @var{n}-length windows of overlapping
+values. This is different from @code{tsegment} which yields
+non-overlapping windows. If there were fewer items in the input than
+@var{n}, then this yields nothing.
+
+@example
+(list-transduce (twindow 3) rcons '(1 2 3 4 5))
+@result{} ((1 2 3) (2 3 4) (3 4 5))
+@end example
+
+Stateful.
+@end deffn
+
 
 @node SRFI-171 Helpers
 @subsubsection Helper functions for writing transducers
-- 
2.39.0


[-- Attachment #4: 0003-srfi-171-add-unit-tests-for-new-functions.patch --]
[-- Type: text/x-patch, Size: 2891 bytes --]

From 7b7538c61799fa0fa0e2fa18efba98b7de7da1ca Mon Sep 17 00:00:00 2001
From: Colin Woodbury <colin@fosskers.ca>
Date: Wed, 21 Dec 2022 09:30:50 +0900
Subject: [PATCH 3/4] srfi-171: add unit tests for new functions

These tests mainly match the examples shown in the docs.
---
 test-suite/tests/srfi-171.test | 66 ++++++++++++++++++++++++++++++----
 1 file changed, 60 insertions(+), 6 deletions(-)

diff --git a/test-suite/tests/srfi-171.test b/test-suite/tests/srfi-171.test
index 1ef7bc5f2..d1d54b2ec 100644
--- a/test-suite/tests/srfi-171.test
+++ b/test-suite/tests/srfi-171.test
@@ -207,15 +207,69 @@
             (list-transduce (tenumerate (- 1)) rcons numeric-list)))
 
   (pass-if "tbatch"
-    (equal?
-            '((0 1) (2 3) (4))
+    (equal? '((0 1) (2 3) (4))
             (list-transduce (tbatch (ttake 2) rcons) rcons numeric-list)))
 
   (pass-if "tfold"
-           (equal?
-            '(0 1 3 6 10)
-            (list-transduce (tfold +) rcons numeric-list))))
-
+    (equal? '(0 1 3 6 10)
+            (list-transduce (tfold +) rcons numeric-list)))
+
+  (pass-if "twindow: too wide of a window"
+    (equal? '()
+            (list-transduce (twindow 10) rcons '(1 2 3))))
+
+  (pass-if "twindow: acceptable window"
+    (equal? '((1 2 3) (2 3 4) (3 4 5))
+            (list-transduce (twindow 3) rcons '(1 2 3 4 5)))))
+
+(with-test-prefix "reducers"
+  (pass-if "rfold: builtin"
+    (equal? 5
+            (list-transduce (tfilter odd?) (rfold max 0) '(1 2 3 4 5))))
+
+  (pass-if "rfold: custom lambda"
+    (equal? "ghi"
+            (list-transduce (tmap identity)
+                            (rfold (lambda (_ input) input) #f)
+                            '("abc" "def" "ghi"))))
+
+  (pass-if "rfirst: empty"
+    (equal? 0
+            (list-transduce (tmap identity) (rfirst 0) '())))
+
+  (pass-if "rfirst: non-empty"
+    (equal? 1
+            (list-transduce (tmap identity) (rfirst 0) '(1 2 3))))
+
+  (pass-if "rlast: empty"
+    (equal? 0
+            (list-transduce (tfilter (lambda (_) #f)) (rlast 0) '(1 2 3))))
+
+  (pass-if "rlast: non-empty"
+    (equal? 5
+            (list-transduce (tmap identity) (rlast 0) '(1 2 3 4 5))))
+
+  (pass-if "rmax: empty"
+    (equal? 0
+            (list-transduce (tmap identity) (rmax 0) '())))
+
+  (pass-if "rmax: non-empty"
+    (equal? 31
+            (list-transduce (tmap identity) (rmax 0) '(1 2 31 4 5))))
+
+  (pass-if "rmin: empty"
+    (equal? 0
+            (list-transduce (tmap identity) (rmin 0) '())))
+
+  (pass-if "rmin: non-empty"
+    (equal? 1
+            (list-transduce (tmap identity) (rmin 1000) '(5 3 1 7 6))))
+
+  (pass-if "rfind"
+    (equal? "Jack"
+            (list-transduce (tmap identity)
+                            (rfind string?)
+                            '(1 c #t 4.12 "Jack" ())))))
 
 (with-test-prefix "x-transduce"
   (pass-if "list-transduce" 
-- 
2.39.0


[-- Attachment #5: 0004-doc-added-a-guide-for-writing-custom-reducers.patch --]
[-- Type: text/x-patch, Size: 3488 bytes --]

From 87a74d106f11680c4924befb664d7ef685c16b06 Mon Sep 17 00:00:00 2001
From: Colin Woodbury <colin@fosskers.ca>
Date: Thu, 22 Dec 2022 20:32:33 +0900
Subject: [PATCH 4/4] doc: added a guide for writing custom reducers

The guide previously explained what reducers were, but not the specifics
of how to write them yourself. This commits rectifies this.
---
 doc/ref/srfi-modules.texi | 76 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi
index 6eb1a563e..e3beb2ff5 100644
--- a/doc/ref/srfi-modules.texi
+++ b/doc/ref/srfi-modules.texi
@@ -5966,6 +5966,82 @@ Yield the maximum (or minimum) value of the transduction, or the
 @var{seed} value if there is none.
 @end deffn
 
+@subheading Writing your own reducers
+If you want to reduce some values via an ordinary function that you
+already have on-hand, then @code{rfold} is all you need:
+
+@example
+;; Same as rmax.
+(list-transduce (tmap identity) (rfold max 0) '(1 2 3 4 5))
+@result{} 5
+@end example
+
+However, if you want more customized behaviour (such as early
+termination and/or arbitrary manipulation of the input values) then
+you're free to write a reducer manually. To do so, we need to write a
+function that can accept one to three arguments. We do this with
+@code{case-lambda}. Let's create a simple reducer that adds all the
+numbers in its input:
+
+@example
+(define rsum
+  (case-lambda
+    ;; This is the first case called. It establishes the on-going
+    ;; accumulator value.
+    (() 0)
+    ;; This is the last case called. Its result is the final output
+    ;; of the reduction.
+    ((result-so-far) result-so-far)
+    ;; This is the normal case. We do something to the accumulator
+    ;; and the current input.
+    ((result-so-far input) (+ result-so-far input)))
+
+(list-transduce (tfilter odd?) rsum '(1 2 3 4))
+@result{} 4
+@end example
+
+You'll notice that @code{rsum} isn't actually included in SRFI-171,
+because @code{+} already behaves exactly this way.
+
+@subsubheading Higher-order reducers
+Of course, the top-level @code{define} can also take arguments, and we
+can use these within the @code{case-lambda}. Here's how @code{rlast} is
+implemented:
+
+@example
+(define (rlast seed)
+  (case-lambda
+    (() seed)
+    ((result-so-far) result-so-far)
+    ((_ input) input)))
+
+(list-transduce (tfilter odd?) (rlast 0) '(1 2 3 4 5 6))
+@result{} 5
+@end example
+
+The @code{seed} here acts as a default for when nothing was left in the
+transduction. You're free of course to pass in whatever arguments you
+wish (including other functions) and use them as you see fit.
+
+@subsubheading Early termination
+Importing @code{(srfi srfi-171 meta)} exposes the @code{reduced}
+function, which we can use to signal to the overall transduction process
+that we're done, and no further input needs to be pulled from the data
+source. Here's @code{rfind} (already included in Guile's SRFI-171
+extensions) which escapes the iteration when some condition is met:
+
+@example
+(define (rfind pred?)
+  (case-lambda
+    (() #f)
+    ((acc) acc)
+    ((acc input) (if (pred? input) (reduced input) #f))))
+@end example
+
+Important note: calling @code{reduced} yourself like this does activate
+the final results-only branch, so any extra effects or manipulations you
+have there will also occur.
+
 @node SRFI-171 Transducers
 @subsubsection Transducers
 @cindex transducers transducers
-- 
2.39.0


  parent reply	other threads:[~2022-12-24 15:28 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-21  0:48 [PATCH] Extensions for SRFI-171 (Transducers) Colin Woodbury
2022-12-21 10:00 ` Linus Björnstam
2022-12-22 14:33   ` Damien Mattei
2022-12-22 14:52     ` Damien Mattei
2022-12-22 17:32       ` Damien Mattei
2022-12-24 13:25         ` Damien Mattei
2022-12-24 15:28   ` Colin Woodbury [this message]
2023-01-14 15:09     ` [PATCHv2] " Ludovic Courtès
2023-01-21  1:48       ` [PATCHv3] " Colin Woodbury
2023-01-23 22:17         ` Ludovic Courtès
2023-01-24  0:21           ` Colin Woodbury
2023-01-24  9:07             ` Ludovic Courtès
2023-05-18  0:36               ` Colin Woodbury
2023-06-28 12:27                 ` Colin Woodbury
2023-08-12 19:37                   ` Linus Björnstam
2023-08-13  4:07                     ` Colin Woodbury
2023-08-18 10:03                   ` Ludovic Courtès
2023-08-18 10:10                     ` Colin Woodbury
2023-08-18 13:24                       ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1158f7bf-3093-8c3b-dd74-267f1fbb6f4c@fosskers.ca \
    --to=colin@fosskers.ca \
    --cc=guile-devel@gnu.org \
    --cc=linus.bjornstam@veryfast.biz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).