From: Mattias Engdegård
Newsgroups: gmane.emacs.bugs
Subject: bug#51982: Erroneous handling of local variables in byte-compiled nested lambdas
Date: Wed, 1 Dec 2021 17:04:44 +0100
To: Stefan Monnier
Cc: Michael Heerdegen, Paul Pogonyshev, 51982@debbugs.gnu.org

On 30 Nov 2021, at 23:41, Stefan Monnier wrote:

> [ We could also force dynamically-scoped code to go through (a neutered
>   version of) cconv.el, so that bytecomp.el and byte-opt.el can presume
>   that `let*` doesn't exist any more.  ]

Yes, a dynbind frontend would be handy for other reasons (some syntactic
normalisation in case we can't do it in macroexpand-all).

> BTW, have you checked the impact on byte-code quality?

With respect to these patches? Yes: the B patch gives slightly better
code because materialising the accessor (internal-get-closed-var N) is
as cheap as or cheaper than even a stack variable access. But the
difference is small and since the case is rare it's probably
insignificant.

In fact, there is probably a way of making them produce identical code
by constant-propagating such forms in the optimiser. Who knows, it might
give unexpected improvements to existing code as well. Time for an
experiment!

>>> These two tests are identical aren't they?
>> No, they exercise different code paths (let and let*).
>
> Then that deserves a comment ;-)

Will do.

>>> Looks good (better than patch A).
>>
>> And here I was prepared to apply patch A since it's slightly more
>> conservative and it seems to be a rare problem anyway.
>> I've now split the patches in a more sensible (and easily reviewed) way: the
>> first corresponds to patch A, and the second is the diff to B. Take a second
>> look before making up your mind.
>>
>>> You say "On the other hand, patch B does abuse the cconv data structures
>>> a little (but it works!)" so the code should say something about
>>> this abuse. At least I failed to see where the abuse lies.
>>
>> There are comments and doc strings such as
>>
>>   EXTEND is a list of variables which might need to be accessed even
>>   from places where they are shadowed, because some part of ENV causes
>>   them to be used at places where they originally did not
>>   directly appear.
>>
>> but with the B patch we put things into `extend` that are not strictly
>> variables but (internal-get-closed-var N).
>
> See below, I think we don't need to put them there.
>
>> Similarly, `env` has entries like (VAR . (apply-partially F ARG1 ARG2 ..))
>> where the ARGi are always treated as variables but now they can be access
>> forms as well.
>
> I don't think the current code assumes that ARGs are vars here.
> You're probably right that it used to be the case and it's not any more,
> but that shouldn't cause problems. The risk I can see is if one of
> those ARGs is an expression which refers to a var which gets shadowed,
> in which case `cconv--remap-llv` won't rewrite it the way it should.
> But I think with your code ARG will either be a simple var or something
> of the form (internal-get-closed-var N) so we should be safe.
>
>> @@ -304,6 +304,22 @@ cconv--convert-funcbody
>>         `(,@(nreverse special-forms) ,@(macroexp-unprogn body))))
>>       funcbody)))
>>
>> +(defun cconv--lifted-arg (var env)
>> +  "The argument to use for VAR in λ-lifted calls according to ENV."
>> +  (let ((mapping (cdr (assq var env))))
>> +    (pcase-exhaustive mapping
>> +      (`(internal-get-closed-var . ,_)
>> +       ;; The variable is captured.
>> +       mapping)
>> +      (`(car-safe (internal-get-closed-var . ,_))
>> +       ;; The variable is mutably captured; skip
>> +       ;; the indirection step because the variable is
>> +       ;; passed "by reference" to the λ-lifted function.
>> +       (cadr mapping))
>> +      ((or '() `(car-safe ,(pred symbolp)))
>> +       ;; The variable is not captured; use the (shadowed) variable value.
>> +       var))))
>
> The docstring or comment at the beginning should mention this function
> is specifically for shadowed vars.

Right.

> Also, if mapping is of the form (car-safe SYMBOL) is `var` really the
> correct answer?  Shouldn't it still be (cadr mapping)?

Can there ever be a difference? I don't think so, but prove me wrong!
(If you manage to do that, you will have found a second bug in the
original code.)

For context, this is the case when we have a variable mutated by a
lambda-lifted inner function (that doesn't escape). The variable will be
wrapped in a cons but retain its name. Example:

(lambda (x)
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))
->
(lambda (x)
  (let ((x (list x)))
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      (let ((x 3) (closed-x x))
        (list x (funcall f closed-x))))))

> Side note: I don't understand why we `(cons closedsym`, since that
> `closedsym` can never appear in another binding (since it's fresh).

Maybe it's to satisfy the invariant checked by the assertion at the top?

> I don't much like this `symbolp` test (which fundamentally seems to
> be trying to recover the information about which branch of the `pcase`
> we're coming from in `cconv--lifted-arg`).

That's precisely what it is trying to do and no, I don't like it much
either.
I suppose cconv--lifted-arg could be made a local function; we could
then access and mutate local variables.
Something poetically self-referential about that, but I'm not overly
fond of the closure creation overhead (better than what it once was
but still too high).

> It at least deserves
> a comment explaining why it's doing the right thing.
> If we can remove this `symbolp` test recovering info about provenance of
> the result of `cconv--lifted-arg` then I think option B is better, but
> otherwise I prefer option A.

I don't see any alternative that is obviously better so I'm applying
patch A. We can still go with B later on if we want; the changes are
minor.

Good comments, thank you very much!
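
For anyone who wants to try the conversion example from earlier in this
message, here is a minimal sketch, assuming the forms are evaluated with
lexical binding enabled. The names `bug51982-original` and
`bug51982-converted` are made up for illustration and are not part of
cconv.el; the second function is a hand-written copy of the converted
form, not actual compiler output.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical names, for illustration only; nothing here is in cconv.el.

(defun bug51982-original (x)
  "Source form: F mutates the outer X while an inner X shadows it."
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))

(defun bug51982-converted (x)
  "Hand-written copy of the converted form: X is boxed in a cons cell."
  (let ((x (list x)))
    ;; F no longer captures anything; it receives the cons as an argument.
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      ;; `let' binds in parallel, so CLOSED-X refers to the outer (boxed) X.
      (let ((x 3) (closed-x x))
        (list x (funcall f closed-x))))))

;; Both calls should agree when evaluated with lexical binding.
(equal (bug51982-original 10) (bug51982-converted 10))
```

With lexical binding, both functions return (3 11) for the argument 10:
the inner x stays 3 while the outer one is incremented through f.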