From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Dirk Herrmann <dirk@sallust.ida.ing.tu-bs.de>
Newsgroups: gmane.lisp.guile.devel
Subject: memoization and conditional defines
Date: Thu, 7 Nov 2002 18:52:28 +0100 (CET)
Sender: guile-devel-admin@gnu.org
Message-ID: <Pine.GSO.4.05.10211071839250.18458-100000@sallust.ida.ing.tu-bs.de>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Trace: main.gmane.org 1036692362 11082 80.91.224.249 (7 Nov 2002 18:06:02 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Thu, 7 Nov 2002 18:06:02 +0000 (UTC)
Return-path: <guile-devel-admin@gnu.org>
Original-Received: from monty-python.gnu.org ([199.232.76.173])
	by main.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 189r2V-0002sC-00
	for <guile-devel@m.gmane.org>; Thu, 07 Nov 2002 19:06:00 +0100
Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org)
	by monty-python.gnu.org with esmtp (Exim 4.10)
	id 189r0Q-0004P7-00; Thu, 07 Nov 2002 13:03:50 -0500
Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10)
	id 189qpd-0000n9-00
	for guile-devel@gnu.org; Thu, 07 Nov 2002 12:52:41 -0500
Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10)
	id 189qpW-0000i2-00
	for guile-devel@gnu.org; Thu, 07 Nov 2002 12:52:38 -0500
Original-Received: from sallust.ida.ing.tu-bs.de ([134.169.132.52])
	by monty-python.gnu.org with esmtp (Exim 4.10)
	id 189qpV-0000gl-00
	for guile-devel@gnu.org; Thu, 07 Nov 2002 12:52:33 -0500
Original-Received: from localhost (dirk@localhost)
	by sallust.ida.ing.tu-bs.de (8.9.3+Sun/8.9.1) with ESMTP id SAA18466
	for <guile-devel@gnu.org>; Thu, 7 Nov 2002 18:52:28 +0100 (CET)
Original-To: guile-devel@gnu.org
Errors-To: guile-devel-admin@gnu.org
X-BeenThere: guile-devel@gnu.org
X-Mailman-Version: 2.0.11
Precedence: bulk
List-Help: <mailto:guile-devel-request@gnu.org?subject=help>
List-Post: <mailto:guile-devel@gnu.org>
List-Subscribe: <http://mail.gnu.org/mailman/listinfo/guile-devel>,
	<mailto:guile-devel-request@gnu.org?subject=subscribe>
List-Id: Developers list for Guile,
	the GNU extensibility library <guile-devel.gnu.org>
List-Unsubscribe: <http://mail.gnu.org/mailman/listinfo/guile-devel>,
	<mailto:guile-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://mail.gnu.org/pipermail/guile-devel/>
Xref: main.gmane.org gmane.lisp.guile.devel:1657
X-Report-Spam: http://spam.gmane.org/gmane.lisp.guile.devel:1657

Hi folks,

I would again like to put your focus on the question, whether we should
support top level forms like the following:

  (if <condition> (define foo bar))

In the course of a former conversation on this list there was a consensus
that such commands should be supported.  However, I have now a better
understanding why allowing such a thing is problematic, and I would like
to make this obvious for everybody.  We should then rethink the decision,
and based on the extended knowledge either keep it (and know what this
will cost us) or redecide.


As an introduction, I would give an example of how memoization works, at 
least in principle.  Assume the following code:
  (if foo bar)
It is a list made of three symbols 'if, 'foo and 'bar.  The memoizer will
take the expression and try to pre-compute as much of it as possible.  
The result will be:
  (#@if #<variable: "foo"> #<variable: "bar"> #<unspecified>)
That is, the memoizer looks up 'if in the symbol table and finds that it
is bound to the corresponding builtin r5rs macro.  The memoizer then calls
the corresponding transformer function which is responsible to transform
if-expressions into the memoized format.  The transformer function for the
if-expression then recursively calls the memoizer to memoize the condition
form.  In this case, the memoizer finds that 'foo is bound on the top
level. It performs the corresponding variable lookup and returns the
variable object as the memoized code.  The transformer function for the
if-expression then calls the memoizer to memoize the then-part, and again
the corresponding variable object is returned.  Finally, the transformer
function for the if-expression adds the missing else-part, namely the
#<unspecified> value, which is exactly what guile returns for a missing
else-form.  The resulting memoized if-expression is prepended by the
immediate value #@if, which is a guile-internal value only used to
indicate a memoized if-expression to the executor.

For the later execution, this has the benefit, that a) the if-expression
is quickly recognizable because its first element is the immediate value
#@if, b) the executor does not have to distinguish between if-then and
if-then-else and c) the executor does not have to do any variable lookups
any more, since they are done once during memoization.


Now, getting one step closer to the problem:  Memoization for a
cond-expression is somewhat more tricky:  Assume the following code:
  (cond (#t => some-function))
will be memoized to:
  (#@cond (#t #@=> #<variable: "some-function">))
because in a typical environment there is no binding for the symbol '=>.
That means, during memoization it is detected that the '=> is used to
express the special syntactic form.  Since the memoized code contains the
immediate value #@=>, the executor will treat the cond-expression as
follows: The condition is found to be true, and that true value will be
passed as an argument to the value of 'some-function.

In contrast, if there was a binding for '=>, R5RS demands that the use of
the symbol '=> should indicate the use of the value to which '=> is bound.
That means, the memoizer should rather return code as follows:
  (#@cond (#t #<variable: "=>"> #<variable: "some-function">))
The executor will then treat the cond-expression as follows: The condition
is found to be true.  Then, the value of => is looked up.  Finally, the
value of some-function is looked up and returned as the result.  In other
words, the fact that there is a binding for '=> makes this cond-expression
look like
  (cond (#t some-expression some-function))
instead of
  (cond (#t => some-function))

This behaviour is demanded by R5RS for all kinds of macros.  Therefore,
the cond-macro is not a special case, but is used here only as an example
for a general rule:  The way syntax transformation is performed depends on
the set of bindings that are visible at the moment of the transformation.


Now, stepping directly into the problem:  How should the following
top-level form be memoized:
  (begin
    (if some-condition (define => #t))
    (define (foo x)
      (cond (x => some-function))))
There are two possibilities, what the memoized code could look like:
Variant a)
  (@#begin
    (#@if #<variable: "some-condition"> (#@define => #t) #<unspecified>)
    (#@define (foo x)
      (#@cond (#<local-ref: x> #@=> #<variable: "some-function">))))
Variant b)
  (@#begin
    (#@if #<variable: "some-condition"> (#@define => #t) #<unspecified>)
    (#@define (foo x)
      (#@cond (#<local-ref: x> #<variable: "=>"> #<variable: "some-function">))))
But: How should a memoizer that is performed _before_ execution know what
will be correct?  How should it know whether there will be a binding for
'=>?


R5RS simplifies this problem by only allowing definitions if they are
_really_ on the top-level.  The only thing allowed for top-level
definitions is to put them within 'begin forms.  But this is not actually
a problem, because it is easily possible to split up top-level 'begin
forms like the following
  (begin form1 form2 ...)
into the corresponding top level forms
  form1
  form2
  ...
and have each one of them separately memoized and executed.


The consequence of all the above is, that allowing things like
  (if <condition> (define foo bar))
makes it much more difficult to get memoization right.  That is, if we
really allow such forms, this means that we have the following choices:
 a) be incompatible with R5RS.  This will, however, require us to make it
    clear for users of guile where and how we deviate from R5RS.  Taking
    this option will also imply that our macros are not really hygienic.
    It also means that an interpreted system will behave different than a
    compiled or memoized system, even if there are no uses of eval or
    load.
 b) complicate the memoization process in order to get things right.  An
    option that we have here is the following:  Top-level forms are not
    memoized but interpreted.  Memoization is only done whenever a
    non-top-level scope is entered, like a let-form, or a
    lambda-expression. The cost of this solution is that in addition to
    the memoizer and the executor we would need an interpreter.


Conclusion:  We should rethink the decision to allow expressions like
  (if <condition> (define foo bar))


Best regards
Dirk Herrmann



_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel