From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Shiro Kawai Newsgroups: gmane.lisp.guile.devel Subject: Re: [PATCH] add SRFI: srfi-121; generators Date: Fri, 22 Jan 2021 16:14:44 -1000 Message-ID: References: <87r1mcfozv.fsf@netris.org> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="000000000000f01b1305b987de21" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="10341"; mail-complaints-to="usenet@ciao.gmane.io" Cc: srfi , John Cowan , guile-devel@gnu.org To: Mark H Weaver Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Sat Jan 23 11:07:18 2021 Return-path: Envelope-to: guile-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1l3Fp8-0002Yl-0A for guile-devel@m.gmane-mx.org; Sat, 23 Jan 2021 11:07:18 +0100 Original-Received: from localhost ([::1]:57774 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l3Fp7-0001FP-0o for guile-devel@m.gmane-mx.org; Sat, 23 Jan 2021 05:07:17 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:53808) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l38S3-0000CL-La for guile-devel@gnu.org; Fri, 22 Jan 2021 21:14:59 -0500 Original-Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]:45498) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1l38S1-00018e-CJ for guile-devel@gnu.org; Fri, 22 Jan 2021 21:14:59 -0500 Original-Received: by mail-pl1-x635.google.com with SMTP id b8so4289824plh.12 for ; Fri, 22 Jan 2021 18:14:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=9foUBzyPvQ44g6XXew7G3Nzu7adP6LWD+30df+J4A5w=; b=OUYYOmpyBHuy72KoBAn1TC9Dh/5+ZyoXIctF6Eew+lCduyEw4QWYDXdppeh+JJISc9 Jnx450m7eQIYtwyaVwY5WWN6yRZtENaTs/9IJgeLAKFMobbCdsX/AhCudcu1wypcWqU9 7dOTboAvzFes1BfbtjD9jbS1ey1sWBm8UA3n59O22tpX+t/JNSlJ7b55PTYE1ljHt1ZF 8fCyaLkf2hqJWyTZ3SjCS184xb8qyvJNJLIFS9a3iY2ISS2su/UZe69Un0O3njP+nT3H 0ik/Z7s3bNBY++jp4Xz5Elg9I1LSBXQjde73h9BoJgCFe9cqfu/7LOonY8TOYDzLBBZq QwIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=9foUBzyPvQ44g6XXew7G3Nzu7adP6LWD+30df+J4A5w=; b=OuBylHZIohrl23HWcsE4ZlHjGBZ00vc9ba6TJEvtjsnKqEUuLlJeLQQoMgd+gcp9z/ K+hv05EiYMdA1CMr3Z+ekRMSwDln5UqNFLnUniQcq+0fzWmkeQTaJ4pEyymwG9360GlH ohAfERuh7fbAzwqKikH4ZjW1ZaEAO8akX6AgzclNo72G4xidT6hVAwwBTbXGh6Jd+Z+m OTRm5dYJMldzNxiV2V/twZPTDFjUmlB0qjp/vGPjpfVU1lIgY55p7zkogMIPs6PHmmhh pKYgVEyzSU53fC+ynwtCIUarNV/LzHIE2Xmguhx5QNsOH84ER2jG0sgUnP9lVW5CyEF1 DKRQ== X-Gm-Message-State: AOAM531yEKa2TxJeiwGrUQRSGtLIr1GvBM6lCbsZpf4akt3Qcxpzk30w jxo6TDxRey43SfmquyH97WsGYSeUnJR9SaFYdVg= X-Google-Smtp-Source: ABdhPJzTqEVUuSNJRWa04YdwAFy3WY2J3BKg0sBys32OK0Jpsx75SiTs6WCUIAwpC6/2qZBbraCkoYpTH22G0dfDvu4= X-Received: by 2002:a17:902:e5cc:b029:de:cdab:2da5 with SMTP id u12-20020a170902e5ccb02900decdab2da5mr7441933plf.32.1611368095553; Fri, 22 Jan 2021 18:14:55 -0800 (PST) In-Reply-To: <87r1mcfozv.fsf@netris.org> Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=shiro.kawai@gmail.com; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Sat, 23 Jan 2021 05:06:54 -0500 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Original-Sender: "guile-devel" Xref: news.gmane.io gmane.lisp.guile.devel:20649 Archived-At: --000000000000f01b1305b987de21 Content-Type: text/plain; charset="UTF-8" Hi Mark, I agree that the whole point of generators is performance. When combined with lazy sequences (srfi-127), it covers many typical use cases of streams much more efficiently. I've written Gauche version with performance in mind, but it depends on a few non-standard features and can not be used as-is. Besides, if we do go for performance, we need benchmark suites as well. I'm interested in adopting Gauche version as the reference implementation but don't know when I have time. If you have some code to improve the current reference implementation and can merge it to the repo, we can proceed gradually, I guess. --shiro On Fri, Jan 22, 2021 at 2:59 PM Mark H Weaver wrote: > Hi John, > > John Cowan writes: > > > Mark: I'm interested to know if you have a sketch of ideas for a more > > efficient implementation of SRFI 121/158. You say it requires specific > > knowledge of Guile internals, but are you willing to sketch how you might > > do it? Thanks. > > Here are a few suggestions off the top of my head, looking only at the > latest SRFI-121 reference implementation: > > * In 'gcombine', 'generator-fold', 'generator-for-each', and possibly > also 'generator-unfold', it would be helpful to use 'case-lambda' to > provide specialized code for cases where the 'dotted' tail in the > argument list consists of 1 or 2 arguments. When a procedure with a > dotted tail is called, it forces allocation of a new list, and last I > checked Guile does not include optimizations to avoid that allocation > where possible. Worse, the general case requires calling 'map' and > allocating a new list every time a new item is requested. It's a > great help to avoid these expenses in the most common cases. For > example, see the definitions of 'map', 'for-each', and 'fold' in > Guile: > > > https://git.savannah.gnu.org/cgit/guile.git/tree/module/ice-9/boot-9.scm?id=75b0db1a286f936a90683973efc2315a07c03b21#n214 > > https://git.savannah.gnu.org/cgit/guile.git/tree/module/srfi/srfi-1.scm?id=75b0db1a286f936a90683973efc2315a07c03b21#n451 > > * Avoid using 'define-values' in internal lexical contexts in Guile for > now. Hopefully some day Guile's implementation of 'define-values' > will be more efficient, but for now it's implemented as a simple macro > that expands into code that mutates variables, which prevents several > other optimizations that could otherwise be done by Guile's compiler. > In particular, in 'gcombine', better use 'call-with-values' or > 'receive'. > > * Several procedures are defined in terms of more general higher-order > procedures, or create intermediate lists/generators unnecessarily, for > the sake of making the code simpler. In most contexts I would applaud > this practice, but there will be a significant price to pay in > efficiency. For example, 'generator->reverse-list' with 1 argument is > implemented in terms of 'generator-fold', and with 2 arguments by > creating an intermediate generator using 'gtake'. > > * To make matters worse: 'gtake', as well as 'make-for-each-generator' > and 'make-unfold-generator', are defined in terms of coroutine > generators. It's good to have coroutine generators available, but > they are quite expensive, and best avoided where efficiency is > desirable. > > * 'generator->vector' is defined using 'generator->list' and > 'list->vector'. That's bad enough, but digging deeper we see that > 'generator->list' is implemented by reversing the result of > 'generator->reverse-list', which as I've already pointed out is > defined using 'generator-fold', which currently involves calling 'map' > and 'append' and allocating temporary lists for each item generated. > > In general, it's helpful to go through the mental exercise of tracing an > evaluation of each of these procedures, and more importantly to trace > calls to the produced generators, to get some idea of the expense > involved. > > This is just a sampling of the problems I noticed from a quick skim of > the code, by no means comprehensive. > > I acknowledge that following the suggestions above will make the code > much larger, uglier, and more difficult to understand. I would not > recommend such practices except in cases where efficiency is paramount. > > In this case, I think efficiency *is* paramount. My rationale is that, > like hash tables, generators inevitably force code that uses them to be > written in an imperative style, and therefore they are best avoided in > my opinion, *except* in cases where efficiency is paramount. > > To avoid being forced to write code in an imperative style, I suggest > that in most cases streams are preferable to generators. > > Regards, > Mark > --000000000000f01b1305b987de21 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Mark,
I agree that the whole point=C2=A0of generato= rs is performance.=C2=A0 When combined with lazy sequences (srfi-127), it c= overs many typical use cases of streams much more efficiently.
I&= #39;ve written Gauche version with performance in mind, but it depends on a= few non-standard features and can not be used as-is.=C2=A0 Besides, if we = do go for performance, we need benchmark suites as well.=C2=A0 I'm inte= rested in adopting Gauche version as the reference implementation but don&#= 39;t know when I have time.=C2=A0 If you have some code to improve the curr= ent reference implementation and can merge it to the repo, we can proceed g= radually, I guess.

--shiro


On = Fri, Jan 22, 2021 at 2:59 PM Mark H Weaver <mhw@netris.org> wrote:
Hi John,

John Cowan <cowan@cc= il.org> writes:

> Mark: I'm interested to know if you have a sketch of ideas for a m= ore
> efficient implementation of SRFI 121/158.=C2=A0 You say it requires sp= ecific
> knowledge of Guile internals, but are you willing to sketch how you mi= ght
> do it?=C2=A0 Thanks.

Here are a few suggestions off the top of my head, looking only at the
latest SRFI-121 reference implementation:

* In 'gcombine', 'generator-fold', 'generator-for-each&= #39;, and possibly
=C2=A0 also 'generator-unfold', it would be helpful to use 'cas= e-lambda' to
=C2=A0 provide specialized code for cases where the 'dotted' tail i= n the
=C2=A0 argument list consists of 1 or 2 arguments.=C2=A0 When a procedure w= ith a
=C2=A0 dotted tail is called, it forces allocation of a new list, and last = I
=C2=A0 checked Guile does not include optimizations to avoid that allocatio= n
=C2=A0 where possible.=C2=A0 Worse, the general case requires calling '= map' and
=C2=A0 allocating a new list every time a new item is requested.=C2=A0 It&#= 39;s a
=C2=A0 great help to avoid these expenses in the most common cases.=C2=A0 F= or
=C2=A0 example, see the definitions of 'map', 'for-each', a= nd 'fold' in
=C2=A0 Guile:

=C2=A0 https://git.savannah.gnu.org/cgit/guile.git/t= ree/module/ice-9/boot-9.scm?id=3D75b0db1a286f936a90683973efc2315a07c03b21#n= 214
=C2=A0 https://git.savannah.gnu.org/cgit/guile.git/tr= ee/module/srfi/srfi-1.scm?id=3D75b0db1a286f936a90683973efc2315a07c03b21#n45= 1

* Avoid using 'define-values' in internal lexical contexts in Guile= for
=C2=A0 now.=C2=A0 Hopefully some day Guile's implementation of 'def= ine-values'
=C2=A0 will be more efficient, but for now it's implemented as a simple= macro
=C2=A0 that expands into code that mutates variables, which prevents severa= l
=C2=A0 other optimizations that could otherwise be done by Guile's comp= iler.
=C2=A0 In particular, in 'gcombine', better use 'call-with-valu= es' or
=C2=A0 'receive'.

* Several procedures are defined in terms of more general higher-order
=C2=A0 procedures, or create intermediate lists/generators unnecessarily, f= or
=C2=A0 the sake of making the code simpler.=C2=A0 In most contexts I would = applaud
=C2=A0 this practice, but there will be a significant price to pay in
=C2=A0 efficiency.=C2=A0 For example, 'generator->reverse-list' = with 1 argument is
=C2=A0 implemented in terms of 'generator-fold', and with 2 argumen= ts by
=C2=A0 creating an intermediate generator using 'gtake'.

* To make matters worse: 'gtake', as well as 'make-for-each-gen= erator'
=C2=A0 and 'make-unfold-generator', are defined in terms of corouti= ne
=C2=A0 generators.=C2=A0 It's good to have coroutine generators availab= le, but
=C2=A0 they are quite expensive, and best avoided where efficiency is
=C2=A0 desirable.

* 'generator->vector' is defined using 'generator->list&#= 39; and
=C2=A0 'list->vector'.=C2=A0 That's bad enough, but digging = deeper we see that
=C2=A0 'generator->list' is implemented by reversing the result = of
=C2=A0 'generator->reverse-list', which as I've already poin= ted out is
=C2=A0 defined using 'generator-fold', which currently involves cal= ling 'map'
=C2=A0 and 'append' and allocating temporary lists for each item ge= nerated.

In general, it's helpful to go through the mental exercise of tracing a= n
evaluation of each of these procedures, and more importantly to trace
calls to the produced generators, to get some idea of the expense
involved.

This is just a sampling of the problems I noticed from a quick skim of
the code, by no means comprehensive.

I acknowledge that following the suggestions above will make the code
much larger, uglier, and more difficult to understand.=C2=A0 I would not recommend such practices except in cases where efficiency is paramount.

In this case, I think efficiency *is* paramount.=C2=A0 My rationale is that= ,
like hash tables, generators inevitably force code that uses them to be
written in an imperative style, and therefore they are best avoided in
my opinion, *except* in cases where efficiency is paramount.

To avoid being forced to write code in an imperative style, I suggest
that in most cases streams are preferable to generators.

=C2=A0 =C2=A0 =C2=A0 Regards,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 Mark
--000000000000f01b1305b987de21--