From mboxrd@z Thu Jan 1 00:00:00 1970 Path: main.gmane.org!not-for-mail From: Dirk Herrmann Newsgroups: gmane.lisp.guile.devel Subject: memoization and conditional defines Date: Thu, 7 Nov 2002 18:52:28 +0100 (CET) Sender: guile-devel-admin@gnu.org Message-ID: NNTP-Posting-Host: main.gmane.org Mime-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Trace: main.gmane.org 1036692362 11082 80.91.224.249 (7 Nov 2002 18:06:02 GMT) X-Complaints-To: usenet@main.gmane.org NNTP-Posting-Date: Thu, 7 Nov 2002 18:06:02 +0000 (UTC) Return-path: Original-Received: from monty-python.gnu.org ([199.232.76.173]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 189r2V-0002sC-00 for ; Thu, 07 Nov 2002 19:06:00 +0100 Original-Received: from localhost ([127.0.0.1] helo=monty-python.gnu.org) by monty-python.gnu.org with esmtp (Exim 4.10) id 189r0Q-0004P7-00; Thu, 07 Nov 2002 13:03:50 -0500 Original-Received: from list by monty-python.gnu.org with tmda-scanned (Exim 4.10) id 189qpd-0000n9-00 for guile-devel@gnu.org; Thu, 07 Nov 2002 12:52:41 -0500 Original-Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.10) id 189qpW-0000i2-00 for guile-devel@gnu.org; Thu, 07 Nov 2002 12:52:38 -0500 Original-Received: from sallust.ida.ing.tu-bs.de ([134.169.132.52]) by monty-python.gnu.org with esmtp (Exim 4.10) id 189qpV-0000gl-00 for guile-devel@gnu.org; Thu, 07 Nov 2002 12:52:33 -0500 Original-Received: from localhost (dirk@localhost) by sallust.ida.ing.tu-bs.de (8.9.3+Sun/8.9.1) with ESMTP id SAA18466 for ; Thu, 7 Nov 2002 18:52:28 +0100 (CET) Original-To: guile-devel@gnu.org Errors-To: guile-devel-admin@gnu.org X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.0.11 Precedence: bulk List-Help: List-Post: List-Subscribe: , List-Id: Developers list for Guile, the GNU extensibility library List-Unsubscribe: , List-Archive: Xref: main.gmane.org gmane.lisp.guile.devel:1657 X-Report-Spam: http://spam.gmane.org/gmane.lisp.guile.devel:1657 Hi folks, I would again like to put your focus on the question, whether we should support top level forms like the following: (if (define foo bar)) In the course of a former conversation on this list there was a consensus that such commands should be supported. However, I have now a better understanding why allowing such a thing is problematic, and I would like to make this obvious for everybody. We should then rethink the decision, and based on the extended knowledge either keep it (and know what this will cost us) or redecide. As an introduction, I would give an example of how memoization works, at least in principle. Assume the following code: (if foo bar) It is a list made of three symbols 'if, 'foo and 'bar. The memoizer will take the expression and try to pre-compute as much of it as possible. The result will be: (#@if # # #) That is, the memoizer looks up 'if in the symbol table and finds that it is bound to the corresponding builtin r5rs macro. The memoizer then calls the corresponding transformer function which is responsible to transform if-expressions into the memoized format. The transformer function for the if-expression then recursively calls the memoizer to memoize the condition form. In this case, the memoizer finds that 'foo is bound on the top level. It performs the corresponding variable lookup and returns the variable object as the memoized code. The transformer function for the if-expression then calls the memoizer to memoize the then-part, and again the corresponding variable object is returned. Finally, the transformer function for the if-expression adds the missing else-part, namely the # value, which is exactly what guile returns for a missing else-form. The resulting memoized if-expression is prepended by the immediate value #@if, which is a guile-internal value only used to indicate a memoized if-expression to the executor. For the later execution, this has the benefit, that a) the if-expression is quickly recognizable because its first element is the immediate value #@if, b) the executor does not have to distinguish between if-then and if-then-else and c) the executor does not have to do any variable lookups any more, since they are done once during memoization. Now, getting one step closer to the problem: Memoization for a cond-expression is somewhat more tricky: Assume the following code: (cond (#t => some-function)) will be memoized to: (#@cond (#t #@=> #)) because in a typical environment there is no binding for the symbol '=>. That means, during memoization it is detected that the '=> is used to express the special syntactic form. Since the memoized code contains the immediate value #@=>, the executor will treat the cond-expression as follows: The condition is found to be true, and that true value will be passed as an argument to the value of 'some-function. In contrast, if there was a binding for '=>, R5RS demands that the use of the symbol '=> should indicate the use of the value to which '=> is bound. That means, the memoizer should rather return code as follows: (#@cond (#t #"> #)) The executor will then treat the cond-expression as follows: The condition is found to be true. Then, the value of => is looked up. Finally, the value of some-function is looked up and returned as the result. In other words, the fact that there is a binding for '=> makes this cond-expression look like (cond (#t some-expression some-function)) instead of (cond (#t => some-function)) This behaviour is demanded by R5RS for all kinds of macros. Therefore, the cond-macro is not a special case, but is used here only as an example for a general rule: The way syntax transformation is performed depends on the set of bindings that are visible at the moment of the transformation. Now, stepping directly into the problem: How should the following top-level form be memoized: (begin (if some-condition (define => #t)) (define (foo x) (cond (x => some-function)))) There are two possibilities, what the memoized code could look like: Variant a) (@#begin (#@if # (#@define => #t) #) (#@define (foo x) (#@cond (# #@=> #)))) Variant b) (@#begin (#@if # (#@define => #t) #) (#@define (foo x) (#@cond (# #"> #)))) But: How should a memoizer that is performed _before_ execution know what will be correct? How should it know whether there will be a binding for '=>? R5RS simplifies this problem by only allowing definitions if they are _really_ on the top-level. The only thing allowed for top-level definitions is to put them within 'begin forms. But this is not actually a problem, because it is easily possible to split up top-level 'begin forms like the following (begin form1 form2 ...) into the corresponding top level forms form1 form2 ... and have each one of them separately memoized and executed. The consequence of all the above is, that allowing things like (if (define foo bar)) makes it much more difficult to get memoization right. That is, if we really allow such forms, this means that we have the following choices: a) be incompatible with R5RS. This will, however, require us to make it clear for users of guile where and how we deviate from R5RS. Taking this option will also imply that our macros are not really hygienic. It also means that an interpreted system will behave different than a compiled or memoized system, even if there are no uses of eval or load. b) complicate the memoization process in order to get things right. An option that we have here is the following: Top-level forms are not memoized but interpreted. Memoization is only done whenever a non-top-level scope is entered, like a let-form, or a lambda-expression. The cost of this solution is that in addition to the memoizer and the executor we would need an interpreter. Conclusion: We should rethink the decision to allow expressions like (if (define foo bar)) Best regards Dirk Herrmann _______________________________________________ Guile-devel mailing list Guile-devel@gnu.org http://mail.gnu.org/mailman/listinfo/guile-devel