From: "Mattias Engdegård" <mattiase@acm.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Michael Heerdegen <michael_heerdegen@web.de>,
Paul Pogonyshev <pogonyshev@gmail.com>,
51982@debbugs.gnu.org
Subject: bug#51982: Erroneous handling of local variables in byte-compiled nested lambdas
Date: Wed, 1 Dec 2021 17:04:44 +0100
Message-ID: <AF81C637-59D9-468B-917F-65AC37B13C75@acm.org>
In-Reply-To: <jwvlf152tqb.fsf-monnier+emacs@gnu.org>
On 30 Nov 2021 at 23:41, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> [ We could also force dynamically-scoped code to go through (a neutered
> version of) cconv.el , so that bytecomp.el and byte-opt.el can presume
> that `let*` doesn't exist any more. ]
Yes, a dynbind frontend would be handy for other reasons (some syntactic normalisation, in case we can't do it in macroexpand-all).
> BTW, have you checked the impact on byte-code quality?
With respect to these patches? Yes: the B patch gives slightly better code because materialising the accessor (internal-get-closed-var N) is as cheap or cheaper than even a stack variable access. But the difference is small and since the case is rare it's probably insignificant.
In fact, there is probably a way of making them produce identical code by constant-propagating such forms in the optimiser. Who knows, might give unexpected improvements to existing code as well. Time for an experiment!
>>> These two tests are identical aren't they?
>> No, they exercise different code paths (let and let*).
>
> Then that deserves a comment ;-)
Will do.
>>> Looks good (better than patch A).
>>
>> And here I was prepared to apply patch A since it's slightly more
>> conservative and it seems to be a rare problem anyway.
>> I've now split the patches in a more sensible (and easily reviewed) way: the
>> first corresponds to patch A, and the second is the diff to B. Take a second
>> look before making up your mind.
>>
>>> You say "On the other hand, patch B does abuse the cconv data structures
>>> a little (but it works!)" so the code should say something about
>>> this abuse. At least I failed to see where the abuse lies.
>>
>> There are comments and doc strings such as
>>
>> EXTEND is a list of variables which might need to be accessed even
>> from places where they are shadowed, because some part of ENV causes
>> them to be used at places where they originally did not
>> directly appear.
>>
>> but with the B patch we put things into `extend` that are not strictly
>> variables but (internal-get-closed-var N).
>
> See below, I think we don't need to put them there.
>
>> Similarly, `env` has entries like (VAR . (apply-partially F ARG1 ARG2 ..))
>> where the ARGi are always treated as variables but now they can be access
>> forms as well.
>
> I don't think the current code assumes that ARGs are vars here.
> You're probably right that it used to be the case and it's not any more,
> but that shouldn't cause problems. The risk I can see is if one of
> those ARGs is an expression which refers to a var which gets shadowed,
> in which case `cconv--remap-llv` won't rewrite it the way it should.
> But I think with your code ARG will either be a simple var or something
> of the form (internal-get-closed-var N) so we should be safe.
>
>> @@ -304,6 +304,22 @@ cconv--convert-funcbody
>> `(,@(nreverse special-forms) ,@(macroexp-unprogn body))))
>> funcbody)))
>>
>> +(defun cconv--lifted-arg (var env)
>> +  "The argument to use for VAR in λ-lifted calls according to ENV."
>> +  (let ((mapping (cdr (assq var env))))
>> +    (pcase-exhaustive mapping
>> +      (`(internal-get-closed-var . ,_)
>> +       ;; The variable is captured.
>> +       mapping)
>> +      (`(car-safe (internal-get-closed-var . ,_))
>> +       ;; The variable is mutably captured; skip
>> +       ;; the indirection step because the variable is
>> +       ;; passed "by reference" to the λ-lifted function.
>> +       (cadr mapping))
>> +      ((or '() `(car-safe ,(pred symbolp)))
>> +       ;; The variable is not captured; use the (shadowed) variable value.
>> +       var))))
>
> The docstring or comment at the beginning should mention this function
> is specifically for shadowed vars.
Right.
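To make the three cases of the quoted function easier to poke at, here is a minimal standalone sketch. `demo-lifted-arg` is a hypothetical copy of the dispatch in the patch, and the sample ENV alists are made up for illustration; they only imitate the (VAR . MAPPING) shape that cconv uses and are not taken from a real conversion run.

```elisp
;; Hypothetical standalone copy of the dispatch in `cconv--lifted-arg',
;; so the cases can be tried without loading cconv.el internals.
(defun demo-lifted-arg (var env)
  "Return the argument to pass for VAR according to the alist ENV."
  (let ((mapping (cdr (assq var env))))
    (pcase-exhaustive mapping
      ;; Captured variable: pass the access form itself.
      (`(internal-get-closed-var . ,_) mapping)
      ;; Mutably captured variable: drop the `car-safe' wrapper and
      ;; pass the cons cell, i.e. pass "by reference".
      (`(car-safe (internal-get-closed-var . ,_)) (cadr mapping))
      ;; No mapping, or a boxed variable that kept its name: pass VAR.
      ((or '() `(car-safe ,(pred symbolp))) var))))

;; Made-up sample environments, one per case.
(demo-lifted-arg 'x '((x . (internal-get-closed-var 0))))
;; => (internal-get-closed-var 0)
(demo-lifted-arg 'x '((x . (car-safe (internal-get-closed-var 1)))))
;; => (internal-get-closed-var 1)
(demo-lifted-arg 'x '((x . (car-safe x))))
;; => x
(demo-lifted-arg 'x '())
;; => x
```

Any other mapping shape makes `pcase-exhaustive` signal an error instead of silently returning nil, which is presumably why it is used here rather than plain `pcase`.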
> Also, if mapping is of the form (car-safe SYMBOL) is `var` really the
> correct answer? Shouldn't it still be (cadr mapping)?
Can there ever be a difference? I don't think so, but prove me wrong!
(If you manage to do that, you will have found a second bug in the original code.)
For context, this is the case when we have a variable mutated by a lambda lifted inner function (that doesn't escape). The variable will be wrapped in a cons but retain its name. Example:
(lambda (x)
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))
->
(lambda (x)
  (let ((x (list x)))
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      (let ((x 3)
            (closed-x x))
        (list x (funcall f closed-x))))))
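As a sanity check, the two forms above can be evaluated side by side. This is only a sketch: the names `demo-original-form` and `demo-converted-form` are hypothetical, the second form is the hand-written conversion from this message rather than real cconv output, and passing `t` as the second argument of `eval` is what selects lexical binding.

```elisp
;; Hypothetical names; the forms are copied from the example above.
(defconst demo-original-form
  '(lambda (x)
     (let ((f (lambda () (setq x (1+ x)))))
       (let ((x 3))
         (list x (funcall f))))))

(defconst demo-converted-form
  '(lambda (x)
     (let ((x (list x)))                 ; box the mutated variable
       (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
         (let ((x 3)
               (closed-x x))             ; parallel `let': this is the outer, boxed x
           (list x (funcall f closed-x)))))))

;; Evaluate both with lexical binding (second argument t) and call them.
(funcall (eval demo-original-form t) 10)   ; => (3 11)
(funcall (eval demo-converted-form t) 10)  ; => (3 11)
```

Note that the conversion relies on `let` binding in parallel: with `let*`, `closed-x` would see the inner x (3) instead of the cons cell, so the two binding forms need separate handling and separate tests.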
> Side note: I don't understand why we `(cons closedsym`, since that
> `closedsym` can never appear in another binding (since it's fresh).
Maybe it's to satisfy the invariant checked by the assertion at the top?
> I don't much like this `symbolp` test (which fundamentally seems to
> be trying to recover the information about which branch of the `pcase`
> we're coming from in `cconv--lifted-arg`).
That's precisely what it is trying to do and no, I don't like it much either.
I suppose cconv--lifted-arg could be made a local function; we could then access and mutate local variables. Something poetically self-referential about that, but I'm not overly fond of the closure creation overhead (better than what it once was but still too high).
> It at least deserves
> a comment explaining why it's doing the right thing.
> If we can remove this `symbolp` test recovering info about provenance of
> the result of `cconv--lifted-arg` then I think option B is better, but
> I prefer otherwise option A.
I don't see any alternative that is obviously better so I'm applying patch A. We can still go with B later on if we want; the changes are minor.
Good comments, thank you very much!