From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Mark H Weaver Newsgroups: gmane.lisp.guile.user Subject: Re: Multiple values passed as single argument to procedure Date: Mon, 12 Jun 2017 22:26:12 -0400 Message-ID: <87h8zk1v3v.fsf@netris.org> References: <87mv9fnejc.fsf@gmail.com> <87k24i2rev.fsf@netris.org> <87zidexdjw.fsf@gmail.com> <87a85d3k9n.fsf@netris.org> <87mv9dy5wb.fsf@gmail.com> <87mv9d7dfu.fsf@fencepost.gnu.org> <87wp8h1lyh.fsf@netris.org> <87h8zl7086.fsf@fencepost.gnu.org> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: blaine.gmane.org 1497320820 17330 195.159.176.226 (13 Jun 2017 02:27:00 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 13 Jun 2017 02:27:00 +0000 (UTC) User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) Cc: guile-user@gnu.org To: David Kastrup Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Tue Jun 13 04:26:54 2017 Return-path: Envelope-to: guile-user@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dKbXi-000431-8V for guile-user@m.gmane.org; Tue, 13 Jun 2017 04:26:54 +0200 Original-Received: from localhost ([::1]:40832 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dKbXl-0008IA-Ve for guile-user@m.gmane.org; Mon, 12 Jun 2017 22:26:57 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:53264) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dKbXK-0008I4-7c for guile-user@gnu.org; Mon, 12 Jun 2017 22:26:31 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dKbXH-0006G9-1G for guile-user@gnu.org; Mon, 12 Jun 2017 22:26:30 -0400 Original-Received: from world.peace.net ([50.252.239.5]:44369) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1dKbXG-0006G5-QV; Mon, 12 Jun 2017 22:26:26 -0400 Original-Received: from pool-72-93-33-54.bstnma.east.verizon.net ([72.93.33.54] helo=jojen) by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dKbXE-000886-6l; Mon, 12 Jun 2017 22:26:24 -0400 In-Reply-To: <87h8zl7086.fsf@fencepost.gnu.org> (David Kastrup's message of "Mon, 12 Jun 2017 16:24:25 +0200") X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 50.252.239.5 X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Original-Sender: "guile-user" Xref: news.gmane.org gmane.lisp.guile.user:13832 Archived-At: David Kastrup writes: > Mark H Weaver writes: > >> I'm sorry David, but _everything_ that you wrote below is incorrect. > > Well, let me try again. It's not all that easy to understand. You're obviously quite intelligent and knowledgeable, so it's probably my failure to explain it well more than anything else. We can try again together. >> David Kastrup writes: >> >>> Chris Marusich writes: >>> >>>> I think I'm missing something here. In (list (f)), the call to f >>>> certainly looks like it's happening at a position that one might >>>> intuitively call a "tail" position. >>> >>> It is, >> >> No it isn't. The reason that (f) is not in tail position with respect to (list (f)) is because there is still something left to do after calling (f) but before invoking the continuation of (list (f)). The thing remaining to do is to apply 'list' to the returned value. Putting it another way, if (f) evaluates to 5, you cannot proceed by reducing (list (f)) to 5. The continuation of (f) cannot be the same as the continuation of (list (f)), because that would skip the step of calling 'list'. * * * It might be helpful to be more precise about what continuations are. To a first approximation, a continuation can be visualized as an expression with a hole. For example, in the expression (list (f) (f) (f)), suppose that the compiler chooses to evaluate the second argument first. Then the continuation of the middle (f) is: (list (f) <> (f)) where <> is the hole. When 'f' returns its value (i.e. invokes its continuation), what remains to be done is this: evaluate the expression which is formed by putting the return value into the hole. So, if that middle (f) returns 5, then we proceed to evaluate: (list (f) 5 (f)) If the compiler chooses to evaluate the first argument next, then the continuation for the first (f) will be: (list <> 5 (f)) and so on. To make this a bit more precise, we could actually use normal Scheme procedures to model these continuations. The continuation above could be written like this: (lambda (x) (list x 5 (f))) Supposing that (f) returns 6 this time (maybe it's a counter?), the continuation of the last (f) will be: (lambda (z) (list 6 5 z)) In fact, in Scheme we can actually write code in such a way that makes these continuations explicit. Here's how we do it: we add an extra argument to every procedure: its continuation. Instead of returning values from procedures, we explicitly call the continuation, passing the return value as its argument. If you do this until every procedure call is in tail position, the resulting program is in "continuation passing style" or CPS. Here's what (lambda () (list (f 1) (f 2) (f 3))) looks like in CPS, using the same evaluation order as I chose above: (lambda (outer-k) (f^ 2 (lambda (y) (f^ 1 (lambda (x) (f^ 3 (lambda (z) (list^ x y z outer-k)))))))) Where 'f^' and 'list^' are variants of 'f' and 'list' that accept an explicit continuation argument. Continuation passing style makes more things explicit. It forces us to explicitly choose an evaluation order, and it entails assigning names to every intermediate value. The new variables x, y, and z will hold the intermediate values returned by (f 1), (f 2), and (f 3), respectively. So far, I've shown only unary continuations, but when a program is in continuation passing form, it becomes trivial to "return" an arbitrary number of values by simply adding more arguments to the continuations. For example, we could modify the CPS example above to receive two values from 'f', and to return a list of all six results: (lambda (outer-k) (f^ 2 (lambda (y1 y2) (f^ 1 (lambda (x1 x2) (f^ 3 (lambda (z1 z2) (list^ x1 x2 y1 y2 z1 z2 outer-k)))))))) In CPS, this is a very straightforward, and requires no additional syntax or primitives. However, in "direct style" (i.e. the normal way we write programs) we cannot add multiple return values so easily. The difficulty is essentially syntactic in nature. Here's one way to do the same job as above in direct style, and making the order of argument evaluation unspecified: (lambda () (let-values (((x1 x2) (f 1)) ((y1 y2) (f 2)) ((z1 z2) (f 3))) (list x1 x2 y1 y2 z1 z2))) compared with the earlier version that keeps only one argument from each call to 'f'. (lambda () (list (f 1) (f 2) (f 3))) Notice how similar the two CPS examples are, while the corresponding programs in direct style are strikingly different. >>> but list does not take multiple values >> >> Yes it does. 'list' accepts an arbitrary number of values >> (arguments). > > In my book, multiple values/arguments are different things As you can see above, in CPS, they are *exactly* the same thing. And in fact Guile 2.2 transforms programs into CPS as an integral part of compilation before code generation. > but I admit that this is a quagmire to me and strictly speaking what I > call "multiple values" (namely the unmitigated return value from > calling values) Your statement above, where you call "multiple values" "the unmitigated return value" (singular) reveals your confusion. I suspect you are still thinking of multiple value returns as if the values are being packaged up into a single object which is returned and then taken apart by the caller. This is a mistake. Nothing like this is happening inside Guile's VM. I suspect that you do not have any trouble envisioning multiple arguments being passed *to* a procedure without packaging them into a single object. You surely know that in C, the arguments are stored in separate registers, or placed in separate entries on the stack, before jumping to a C function. Something similar can be done to return multiple values, if the calling convention allows it. Guile's does. If you free your mind from the biases of direct style and C calling conventions, there's a remarkable symmetry between passing arguments *to* a procedure and returning values *from* a procedure. The apparent lack of symmetry is purely a syntactic artifact of the "direct style" that we generally prefer to work with. >>> and thus discards additional values returned by f. >> >> 'list' does not discard anything. The additional values are discarded >> before 'list' is called. > > Ok. Because a function call cannot take multiple values. It does take > multiple _arguments_. Let's take another look at Chris's example: (define (f . _) (values 1 2)) (list (f)) => (1) If you expected (1 2), then consider this: (list (f) (f) (f)) => (1 1 1) Would you expect (1 2 1 2 1 2)? If so, I guess you are looking for splicing behavior in argument lists. To implement this, we'd need to convert things to CPS differently. Instead of: (lambda (outer-k) (f^ 2 (lambda (y) (f^ 1 (lambda (x) (f^ 3 (lambda (z) (list^ x y z outer-k)))))))) You could do something like this: (lambda (outer-k) (f^ 2 (lambda ys (f^ 1 (lambda xs (f^ 3 (lambda zs (append^ xs ys zs (lambda (args) (apply^ list^ args outer-k)))))))))) I know that there are some languages that do this kind of splicing, but in my opinion it's a very bad idea. In Scheme, when you see the procedure call: (f (x) (y)) You know just from this, without any context, that (x) will be the first argument and (y) will be the second argument. This fact is important for both humans and compilers. For humans it is important because we can more easily and reliably reason about the code we are looking at. For compilers it is important for the same reason, actually. For example, if 'f' is known at compile time, and something useful is known about (y), we can inline the call to 'f', substituting (y) for its second argument, and use our knowledge of (y) to optimize the result. However, if we had splicing semantics for procedure calls, and if 'x' is not known at compile time (it usually isn't), then we would have _no_ idea which argument of 'f' (y) will be bound to, and we can't do much of anything to optimize this. So, I suppose you could consider it design choice to insist that in an arbitrary procedure call (f (x) (y)) that we know statically (i.e. at compile time) that no matter how (x) and (y) behave, two arguments will always be passed in this call, and the first argument will be (x) and the second argument will be (y). I should mention that if you consider examples where the procedure being called is 'list' or '+' or something else which accepts an arbitrary number of arguments and treats them all interchangably, then the splicing behavior seems more sensible. However, such procedures are quite rare in practice. The overwhelmingly common case is that a procedure's arguments have very different roles, where splicing is not at all useful and merely makes it difficult for humans (and usually impossible for the compiler) to know which argument a given operand will be assigned to. Does that make sense? Regards, Mark