unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: akater <nuclearspace@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
Subject: Re: [PATCH] Add cl-map-into, revision 3
Date: Sat, 09 Oct 2021 02:46:17 +0000	[thread overview]
Message-ID: <87a6jidifa.fsf@gmail.com> (raw)
In-Reply-To: <83o8811ex5.fsf@gnu.org>


[-- Attachment #1.1: Type: text/plain, Size: 2730 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> +(cl-defmacro cl--do-seq-type-signature ((type-var signature &optional result)
>> +                                        &body body)
>> +  "With TYPE-VAR bound to sequence type, evaluate BODY forms.  Return RESULT.
>
> The first line of a doc string should be a single sentence, not longer
> than 79 characters.
>
>> +(defun cl--make-map-into-mapper (signature &optional do-not-compile)
>> +  "Return mapper for `cl-map-into' specialized on arglists of type
>> +encoded by SIGNATURE.
>
> Same here.

This one is not public interface though.  But I fixed this one
nevertheless.

>> +(defun cl-map-into (target function &rest sequences)
>> +  "Common Lisp's map-into.
>
> The first line of a doc string of a public interface should describe
> the arguments, at least the mandatory ones.

Done.

Changes:
- NEWS (29) entry
- entry in cl.texi
- supported are list, vector, bool-vector, string
- some more tests
- make-mapper code is simplified
- “target” renamed to “result-sequence” because that's the way it is in
  cl spec
- clean docstrings

Three points remain.

- Regarding “do-not-compile” argument in make-mapper.  It would be
  better to have “compile” argument instead, with 3 possible values:
  nil, byte-compile, native-compile.  Native-compile seems to work right
  now but I'm just getting acquainted with the feature, it's not going
  smooth, and I'm not sure whether native-compile can be used by default
  in cl-map-into.  If cl-map-into won't make it into Emacs 28, I suggest
  using native-compile right away, for ease of experimentation since
  nothing depends on cl-map-into right now.

- I prefer providing examples for functions, including “internal” ones;
  most of the time examples come naturally during development so it's
  better to use them than to let them go to waste.  Usually examples are
  nowhere to submit; I thus often leave them in docstrings, especially
  when it comes to “internal” functions.  This is the case with this
  patch.  While people do this sometimes, and there's even a Common Lisp
  library that addresses this technique in some way, I'm not sure if
  this is appropriate style.

- I left a comment block in the beginning.  Since the existing mapper in
  cl-extra is buggy, I'd rather have at least some of the comments
  remain.  It will get into sight of more people this way than a mere
  bug in the tracker, and implmenting new similar dispatchers seems
  straightforward.  I'll report the bug in coming days unless the bug is
  already there.  Also commented are (possible) type declarations.  I
  think they convey something useful as well.


[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 865 bytes --]

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Add cl-map-into --]
[-- Type: text/x-diff, Size: 15076 bytes --]

From cebb9a88e244457428385948eaf6bac24a7e5eb1 Mon Sep 17 00:00:00 2001
From: akater <nuclearspace@gmail.com>
Date: Wed, 15 Sep 2021 19:42:47 +0000
Subject: [PATCH] Add cl-map-into

map-into is a standard Common Lisp function that acts as cl-map, only
values are recorded into a preallocated sequence.

* lisp/emacs-lisp/cl-extra.el
(cl-map-into): New primary function
(cl--map-into-basic-call-arguments-limit,
cl--map-into-max-small-signature): New auxiliary constant
(cl--map-into-mappers-array, cl--map-into-mappers-alist): New variable
(cl--compute-map-into-signature, cl--make-map-into-mapper): New auxiliary function
(cl--do-seq-type-signature): New auxiliary macro
---
 doc/misc/cl.texi                       |   9 ++
 etc/NEWS                               |   7 +
 lisp/emacs-lisp/cl-extra.el            | 206 +++++++++++++++++++++++++
 test/lisp/emacs-lisp/cl-extra-tests.el |  61 ++++++++
 4 files changed, 283 insertions(+)

diff --git a/doc/misc/cl.texi b/doc/misc/cl.texi
index 04e2c71a2b..45cea26d4e 100644
--- a/doc/misc/cl.texi
+++ b/doc/misc/cl.texi
@@ -3301,6 +3301,15 @@ be one of the following symbols: @code{vector}, @code{string},
 thrown away and @code{cl-map} returns @code{nil}).
 @end defun
 
+@defun cl-map-into result-sequence function &rest sequences
+This function is like @code{cl-map}, except that the values returned by
+@var{function} are recorded into existing @var{result-sequence} rather
+than into a newly allocated sequence of specified type.  Note also that
+while @code{cl-map} requires the function to be mapped over to accept at
+least one argument, @code{cl-map-into} does not require that from
+@var{function}.  The return value is @var{result-sequence}.
+@end defun
+
 @defun cl-maplist function list &rest more-lists
 This function calls @var{function} on each of its argument lists,
 then on the @sc{cdr}s of those lists, and so on, until the
diff --git a/etc/NEWS b/etc/NEWS
index b91a5cbb72..131df75df4 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -157,6 +157,13 @@ Using 'make-obsolete' on a theme is now supported.  This will make
 ** New function 'define-keymap'.
 This function allows defining a number of keystrokes with one form.
 
++++
+** New function 'cl-map-into'.
+A counterpart of Common Lisp's 'map-into', this function destructively
+modifies a sequence passed to it to contain the results of applying a
+function passed to it to each element in the sequences passed to it,
+in turn.
+
 +++
 ** New macro 'defvar-keymap'.
 This macro allows defining keymap variables more conveniently.
diff --git a/lisp/emacs-lisp/cl-extra.el b/lisp/emacs-lisp/cl-extra.el
index 499d26b737..df712e3237 100644
--- a/lisp/emacs-lisp/cl-extra.el
+++ b/lisp/emacs-lisp/cl-extra.el
@@ -88,6 +88,212 @@ defun cl-equalp (x y)
 	(t (equal x y))))
 
 
+;;; map-into
+
+;; We implement a simple dispatcher for sequence types.
+;;
+;; cl-extra has cl--mapcar-many for similar purpose.
+;; The core issue with it, it goes through args pre-emptively
+;; to compute min length when there are more than 2 arguments
+;; which makes it and its reverse dependencies fail on circular lists
+;; unless there are <3 args.
+;; Other issues are
+;; - it performs type checks for sequences of known types at runtime
+;; - it may cons whole arglist thrice per invocation
+;; - looks like it's hard to extend.
+
+;; Our approach doesn't have these issues.
+
+(defconst cl--map-into-basic-call-arguments-limit 7
+  "Maximal reasonably expected number of arguments to `cl-map-into'.
+
+`cl-map-into' caches its code corresponding to various signature
+types of arglists supplied to `cl-map-into'.  Arglists may vary
+in length.
+
+Code corresponding to arglists of length less than
+`cl--map-into-basic-call-arguments-limit' is accessed via array.
+
+Code corresponding to arglists of length greater than or equal to
+`cl--map-into-basic-call-arguments-limit' is accessed via alist.")
+
+(defconst cl--map-into-max-small-signature
+  (expt 2 cl--map-into-basic-call-arguments-limit)
+  "Length of array to allocate for caching `cl-map-into' mappers
+corresponding to small arglists.
+
+Such mappers are accessed by their position in an array; position
+equals the signature.
+
+Consider `cl-map-into' arglist
+
+(result-seq f seq-1 seq-2)
+
+call-arguments-limit corresponding to arglists of this length or
+shorter, is 4 (as there are 4 arguments).  This leaves at most 3
+sequences to contribute to type signature.
+
+Hovewer, we have to store one additional bit for fixnum-based
+encoding to be unambiguous and simple.  So overall array length
+ends up being exactly (expt 2 call-arguments-limit).")
+
+(defvar cl--map-into-mappers-array
+  (make-vector cl--map-into-max-small-signature nil)
+  "Array holding mappers corresponding to small arglists of `cl-map-into'.
+
+Element type is (or function null).")
+
+(defvar cl--map-into-mappers-alist nil
+  "Alist holding mappers corresponding to large arglists of `cl-map-into'.")
+
+(defun cl--compute-map-into-signature (&rest all-sequences)
+  "Compute lookup key for `cl-map-into''s almost-arglist ALL-SEQUENCES.
+
+Namely: ALL-SEQUENCES would be (RESULT-SEQUENCE   . SEQUENCES)
+               for (cl-map-into RESULT-SEQUENCE f . SEQUENCES)
+
+As a side effect, it checks that ALL-SEQUENCES are of sequence
+types.
+
+Example:
+ELISP> (mapcar (lambda (arglist)
+                 (apply #'cl--compute-map-into-signature arglist))
+               '(( () () () )                ; signature #b1000
+                 ( () () [] )                ; signature #b1001
+                 ( () [] () )                ; signature #b1010
+                 ( () [] [] )                ; signature #b1011
+                 ( [] () () )                ; signature #b1100
+                 ))
+(8 9 10 11 12)"
+  ;; This is not `cl-map-into'-specific and could be used for other caches
+  ;; which is why we don't specify arglist as (result &rest sequences).
+  ;; For the time being (while this dispatch is not used widely),
+  ;; neither docstring nor name reflect this.
+  (let ((signature 1))
+    (dolist (s all-sequences signature)
+      (setq signature (ash signature 1))
+      (cl-etypecase s
+        (list)
+        ((or string vector bool-vector) (cl-incf signature))))))
+
+(cl-defmacro cl--do-seq-type-signature ((type-var signature &optional result)
+                                        &body body)
+  "With TYPE-VAR bound to a sequence type, evaluate BODY forms, return RESULT.
+
+TYPE-VAR goes across sequence types in an arglist corresponding
+to SIGNATURE that encodes sequence types in that arglist.
+
+Iteration goes from arglist's end to arglist's start.
+
+If :first is present at toplevel in BODY, all forms following
+it (and those forms only) are evaluated in order when TYPE-VAR is
+bound to the first sequence type in the arglist --- which would
+be the last sequence type derived from SIGNATURE: see the
+previous paragraph.  At other iteration steps, only forms
+preceding the first :first are evaluated.
+
+Subsequent instances of toplevel :first in BODY don't affect anything."
+  (declare (indent 1))
+  (let* ((main (cl-copy-list body))
+         (first (if (eq :first (car main)) (progn (setf main nil)
+                                                  (cdr main))
+                  (cl-loop with sublist = main
+                           while sublist do
+                           (when (eq :first (cadr sublist))
+                             (setf first (cddr sublist) (cdr sublist) nil)
+                             (cl-return first))
+                           (pop sublist)))))
+    (let ((sig (make-symbol "sig")))
+      `(let ((,sig ,signature) ,type-var)
+         ;; (declare (type (integer (1)) ,sig)
+         ;;          ;; Let's keep nil for now.
+         ;;          (type (member nil list vector) ,type-var))
+         (cl-check-type ,sig (integer (1)))
+         (cl-loop (cond
+                   ((or (when (= 2 ,sig) (setq ,type-var 'list))
+                        (when (= 3 ,sig) (setq ,type-var 'array)))
+                    ;; This duplicates main code sometimes.  Maybe,
+                    ;; there is elegant enough way to eliminate duplication.
+                    ,@(or first main) (cl-return ,result))
+                   (t (setq ,type-var (if (zerop (mod ,sig 2))
+                                          'list
+                                        'array))
+                      ,@main))
+                  (setf ,sig (floor ,sig 2)))))))
+
+(defun cl--make-map-into-mapper (signature &optional do-not-compile)
+  "Return mapper for `cl-map-into' specialized on arglists of given SIGNATURE.
+
+If DO-NOT-COMPILE is nil (default), return byte-compiled function.
+Otherwise, return lambda form.
+
+Example:
+ELISP> (cl--make-map-into-mapper #b10101 t)
+(lambda (f result-list array-3 list-2 array-1)
+  (cl-loop for elt in-ref result-list
+           for elt-3 across array-3
+           for elt-2 in list-2
+           for elt-1 across array-1
+           do (setf elt (funcall f elt-3 elt-2 elt-1))
+           finally return result-list))"
+  (let ((gensym-counter 1) f xs ss loop result-elt result-var)
+    (cl--do-seq-type-signature (type signature)
+      (setq loop (nconc (list 'for (let ((it (gensym "elt-")))
+                                     (push it xs)
+                                     (cl-decf gensym-counter)
+                                     it)
+                              (cl-case type
+                                (list 'in)
+                                (array 'across))
+                              (let ((it (gensym (format "%s-" type))))
+                                (push it ss)
+                                it))
+                        loop))
+      :first
+      (setq loop (nconc (list 'for (setq result-elt (make-symbol "elt"))
+                              (cl-case type
+                                (list 'in-ref)
+                                (array 'across-ref))
+                              (setq result-var
+                                    (make-symbol (format "result-%s" type))))
+                        loop)))
+    (funcall
+     (if do-not-compile #'identity #'byte-compile)
+     `(lambda ,(cons (setq f (make-symbol "f")) (cons result-var ss))
+        (cl-loop ,@(nconc loop `(do (setf ,result-elt (funcall ,f ,@xs))
+                                    ;; Bytecode looks better
+                                    ;; with finally return ..
+                                    ;; than with finally (cl-return ..).
+                                    finally return ,result-var)))))))
+
+(defun cl-map-into (result-sequence function &rest sequences)
+  "Map FUNCTION over SEQUENCES, recording values into RESULT-SEQUENCE.
+
+RESULT-SEQUENCE is destructively modified.
+
+RESULT-SEQUENCE and each element of SEQUENCES can each be either
+a list, a vector, a string, or a bool-vector.  If RESULT-SEQUENCE
+and each element of SEQUENCES are not all the same length, the
+iteration terminates when the shortest sequence (of any of the
+SEQUENCES or the RESULT-SEQUENCE) is exhausted.  If
+RESULT-SEQUENCE is longer than the shortest element of SEQUENCES,
+extra elements at the end of RESULT-SEQUENCE are left unchanged.
+
+Return RESULT-SEQUENCE."
+  (cl-check-type function function)
+  (apply
+   (let* ((sig (apply #'cl--compute-map-into-signature result-sequence
+                      sequences))
+          (small (< sig cl--map-into-max-small-signature)))
+     (with-memoization (if small (aref cl--map-into-mappers-array sig)
+                         ;; TODO: Order alist entries for faster lookup
+                         ;; (note that we'll have to abandon alist-get then).
+                         (alist-get sig cl--map-into-mappers-alist
+                                    nil nil #'=))
+                       (cl--make-map-into-mapper sig)))
+   function result-sequence sequences))
+
+
 ;;; Control structures.
 
 ;;;###autoload
diff --git a/test/lisp/emacs-lisp/cl-extra-tests.el b/test/lisp/emacs-lisp/cl-extra-tests.el
index 91f0a1e201..a4f21f2edf 100644
--- a/test/lisp/emacs-lisp/cl-extra-tests.el
+++ b/test/lisp/emacs-lisp/cl-extra-tests.el
@@ -35,6 +35,67 @@
     (should (eq (cl-getf plist 'y :none) nil))
     (should (eq (cl-getf plist 'z :none) :none))))
 
+(ert-deftest cl-map-into ()
+  (should (equal '(42 42 42)
+                 (cl-map-into (list 0 0 0) #'+ '(1 2 3) [41 40 39])))
+  (should (equal '(42 42 42)
+                 (cl-map-into (list 0 0 0) #'+ [41 40 39] '(1 2 3))))
+  (should (equal '(42 42 42)
+                 (cl-map-into (list 0 0 0) #'* '(1 2 3) [42 21 14])))
+  (should (equal '(42 42 42)
+                 (let ((s (list 0 0 0)))
+                   (cl-map-into s #'+ '(1 2 3) [41 40 39])
+                   s)))
+  (should (equal '(42 42 42)
+                 (let ((s (list 0 0 0)))
+                   (cl-map-into s #'+ s [41 40 39] '(1 2 3))
+                   s)))
+  (should (equal '(42 42 42)
+                 (let ((s (list 0 0 0)))
+                   (cl-map-into s #'+ '(1 2 3) s [41 40 39])
+                   s)))
+  (should (equal '(42 42 42)
+                 (let ((s (list 0 0 0)))
+                   (cl-map-into s #'+ '(1 2 3) [41 40 39] s)
+                   s)))
+  (should (equal '(42 42 42)
+                 (let ((s (list 18 19 20)))
+                   (cl-map-into s #'+ s '(6 4 2 1 not-even-a-number) s)
+                   s)))
+  (should (equal [42 42 42]
+                 (let ((s (vector 0 0 0)))
+                   (cl-map-into s #'+ '(1 2 3) [41 40 39])
+                   s)))
+  (should (equal [42 42 42]
+                 (let ((s (vector 0 0 0)))
+                   (cl-map-into s #'+ [41 40 39] '(1 2 3))
+                   s)))
+  (should (equal [42 42 42]
+                 (let ((s (vector 18 19 20)))
+                   (cl-map-into s #'+ s '(6 4 2 1 not-even-a-number) s)
+                   s)))
+  (should (equal "Lisp"
+                 (let ((s (copy-sequence "Loop")))
+                   (cl-map-into s (lambda (mask new old) (if mask new old))
+                                (bool-vector nil t t nil) "Disjoint" s)
+                   s)))
+  (should (equal '(1 2 3)
+                 (let ((s (list 'one 'two 'three)))
+                   (cl-map-into s (let ((n 0)) (lambda () (cl-incf n))))
+                   s)))
+  (should (equal (bool-vector t nil t nil t)
+                 (let ((s (bool-vector nil nil nil nil nil)))
+                   (cl-map-into s #'cl-evenp '#1=(0 1 . #1#))
+                   s)))
+  (should (equal "GNU GNU GNU GNU"
+                 (let ((cyclically '#1=(t nil . #1#))
+                       (glue '#1=(?G ?L ?U ?E . #1#))
+                       (ants '#1=(?A ?N ?T ?\s . #1#))
+                       (s (make-string 15 0)))
+                   (cl-map-into s (lambda (x y z) (if x y z))
+                                cyclically glue ants)
+                   s))))
+
 (ert-deftest cl-extra-test-mapc ()
   (let ((lst '(a b c))
         (lst2 '(d e f))
-- 
2.32.0


  parent reply	other threads:[~2021-10-09  2:46 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-23 23:35 [PATCH] Add cl-map-into akater
2021-09-27 17:18 ` Stefan Monnier
2021-09-29 19:30   ` akater
2021-09-29 20:26     ` Stefan Monnier
2021-09-30  6:38       ` Bozhidar Batsov
2021-09-30 13:03         ` Adam Porter
2021-09-30 13:09           ` Bozhidar Batsov
2021-09-30 13:21             ` Adam Porter
2021-09-30 15:00               ` akater
2021-10-01 18:40         ` Stefan Monnier
2021-10-01 18:51           ` Eli Zaretskii
2021-10-01 19:04             ` Tassilo Horn
2021-10-01 20:52               ` Stefan Monnier
2021-10-01 22:08                 ` Glenn Morris
2021-10-02  3:53                   ` Stefan Monnier
2021-10-06 23:35   ` [PATCH] Add cl-map-into, revision 2 akater
2021-10-07  7:18     ` Eli Zaretskii
2021-10-07  8:24       ` akater
2021-10-07  9:00         ` Eli Zaretskii
2021-10-09  2:46       ` akater [this message]
2021-10-13 22:32         ` [PATCH] Add cl-map-into, revision 3 Stefan Monnier
2021-10-26 12:52           ` akater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a6jidifa.fsf@gmail.com \
    --to=nuclearspace@gmail.com \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).