unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Mattias Engdegård" <mattiase@acm.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: Michael Heerdegen <michael_heerdegen@web.de>,
	Paul Pogonyshev <pogonyshev@gmail.com>,
	51982@debbugs.gnu.org
Subject: bug#51982: Erroneous handling of local variables in byte-compiled nested lambdas
Date: Wed, 1 Dec 2021 17:04:44 +0100
Message-ID: <AF81C637-59D9-468B-917F-65AC37B13C75@acm.org>
In-Reply-To: <jwvlf152tqb.fsf-monnier+emacs@gnu.org>

On 30 Nov 2021, at 23:41, Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> [ We could also force dynamically-scoped code to go through (a neutered
>  version of) cconv.el , so that bytecomp.el and byte-opt.el can presume
>  that `let*` doesn't exist any more.  ]

Yes, a dynbind frontend would be handy for other reasons (some syntactic normalisation in case we can't do it in macroexpand-all).

> BTW, have you checked the impact on byte-code quality?

With respect to these patches? Yes: the B patch gives slightly better code because materialising the accessor (internal-get-closed-var N) is as cheap as or cheaper than even a stack variable access. But the difference is small, and since the case is rare it's probably insignificant.

In fact, there is probably a way of making them produce identical code by constant-propagating such forms in the optimiser. Who knows, might give unexpected improvements to existing code as well. Time for an experiment!

>>> These two tests are identical aren't they?
>> No, they exercise different code paths (let and let*).
> 
> Then that deserves a comment ;-)

Will do.

>>> Looks good (better than patch A).
>> 
>> And here I was prepared to apply patch A since it's slightly more
>> conservative and it seems to be a rare problem anyway.
>> I've now split the patches in a more sensible (and easily reviewed) way: the
>> first corresponds to patch A, and the second is the diff to B. Take a second
>> look before making up your mind.
>> 
>>> You say "On the other hand, patch B does abuse the cconv data structures
>>> a little (but it works!)" so the code should say something about
>>> this abuse.  At least I failed to see where the abuse lies.
>> 
>> There are comments and doc strings such as
>> 
>>  EXTEND is a list of variables which might need to be accessed even
>>  from places where they are shadowed, because some part of ENV causes
>>  them to be used at places where they originally did not
>>  directly appear.
>> 
>> but with the B patch we put things into `extend` that are not strictly
>> variables but (internal-get-closed-var N).
> 
> See below, I think we don't need to put them there.
> 
>> Similarly, `env` has entries like (VAR . (apply-partially F ARG1 ARG2 ..))
>> where the ARGi are always treated as variables but now they can be access
>> forms as well.
> 
> I don't think the current code assumes that ARGs are vars here.
> You're probably right that it used to be the case and it's not any more,
> but that shouldn't cause problems.  The risk I can see is if one of
> those ARGs is an expression which refers to a var which gets shadowed,
> in which case `cconv--remap-llv` won't rewrite it the way it should.
> But I think with your code ARG will either be a simple var or something
> of the form (internal-get-closed-var N) so we should be safe.
> 
>> @@ -304,6 +304,22 @@ cconv--convert-funcbody
>>             `(,@(nreverse special-forms) ,@(macroexp-unprogn body))))
>>       funcbody)))
>> 
>> +(defun cconv--lifted-arg (var env)
>> +  "The argument to use for VAR in λ-lifted calls according to ENV."
>> +  (let ((mapping (cdr (assq var env))))
>> +    (pcase-exhaustive mapping
>> +      (`(internal-get-closed-var . ,_)
>> +       ;; The variable is captured.
>> +       mapping)
>> +      (`(car-safe (internal-get-closed-var . ,_))
>> +       ;; The variable is mutably captured; skip
>> +       ;; the indirection step because the variable is
>> +       ;; passed "by reference" to the λ-lifted function.
>> +       (cadr mapping))
>> +      ((or '() `(car-safe ,(pred symbolp)))
>> +       ;; The variable is not captured; use the (shadowed) variable value.
>> +       var))))
> 
> The docstring or comment at the beginning should mention this function
> is specifically for shadowed vars.

Right.

> Also, if mapping is of the form (car-safe SYMBOL) is `var` really the
> correct answer?  Shouldn't it still be (cadr mapping)?

Can there ever be a difference? I don't think so, but prove me wrong!
(If you manage to do that, you will have found a second bug in the original code.)
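
To make the three cases concrete, here is a small sketch that copies the function from the hunk quoted above under a hypothetical name (`demo-lifted-arg`) and applies it to a made-up `env`. The entries in `demo-env` are assumptions chosen for illustration, not necessarily what cconv builds for any real program:

```elisp
;; Sketch only: a copy of `cconv--lifted-arg' from the quoted hunk,
;; renamed so that it does not clash with the real cconv.el.
(require 'pcase)

(defun demo-lifted-arg (var env)
  "Return the argument to use for VAR in lambda-lifted calls according to ENV."
  (let ((mapping (cdr (assq var env))))
    (pcase-exhaustive mapping
      (`(internal-get-closed-var . ,_)
       ;; Captured: pass the access form itself.
       mapping)
      (`(car-safe (internal-get-closed-var . ,_))
       ;; Mutably captured: drop the `car-safe' indirection.
       (cadr mapping))
      ((or '() `(car-safe ,(pred symbolp)))
       ;; Not captured: pass the variable by name.
       var))))

;; Hypothetical environment, one entry per case (`d' has no entry).
(defvar demo-env
  '((a . (internal-get-closed-var 0))
    (b . (car-safe (internal-get-closed-var 1)))
    (c . (car-safe c))))

(demo-lifted-arg 'a demo-env)   ; => (internal-get-closed-var 0)
(demo-lifted-arg 'b demo-env)   ; => (internal-get-closed-var 1)
(demo-lifted-arg 'c demo-env)   ; => c
(demo-lifted-arg 'd demo-env)   ; => d
```

For the entry for `c`, `var` and `(cadr mapping)` are the same symbol, so the two answers coincide. They could only differ if an entry of the form (VAR . (car-safe OTHER)) with OTHER different from VAR were possible.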

For context, this is the case when we have a variable mutated by a lambda lifted inner function (that doesn't escape). The variable will be wrapped in a cons but retain its name. Example:

(lambda (x)
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))
->
(lambda (x)
  (let ((x (list x))) 
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      (let ((x 3)
            (closed-x x))
        (list x (funcall f closed-x))))))
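
A quick way to check that the two forms above agree is to evaluate both under lexical binding and compare the results. This is only a sketch: the names `demo-original` and `demo-converted` are made up, the converted form is the hand-written one shown above rather than actual cconv output, and `eval` with a non-nil second argument is used just to force lexical binding regardless of how the snippet is loaded:

```elisp
;; Source form: the inner lambda mutates the outer `x', which is
;; shadowed by the inner `let' at the point where `f' is called.
(defvar demo-original
  (eval '(lambda (x)
           (let ((f (lambda () (setq x (1+ x)))))
             (let ((x 3))
               (list x (funcall f)))))
        t))

;; Hand-converted form: the outer `x' is boxed in a cons and passed
;; explicitly to the lambda-lifted function via `closed-x'.
(defvar demo-converted
  (eval '(lambda (x)
           (let ((x (list x)))
             (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
               (let ((x 3)
                     (closed-x x))
                 (list x (funcall f closed-x))))))
        t))

(funcall demo-original 10)    ; => (3 11)
(funcall demo-converted 10)   ; => (3 11)
```

Note that the converted form depends on `let` binding in parallel: `closed-x` is initialised from the outer (boxed) `x`, not from the new binding to 3.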

> Side note: I don't understand why we `(cons closedsym`, since that
> `closedsym` can never appear in another binding (since it's fresh).

Maybe it's to satisfy the invariant checked by the assertion at the top?

> I don't much like this `symbolp` test (which fundamentally seems to
> be trying to recover the information about which branch of the `pcase`
> we're coming from in `cconv--lifted-arg`).

That's precisely what it is trying to do and no, I don't like it much either.

I suppose cconv--lifted-arg could be made a location function; we could then access and mutate local variables. Something poetically self-referential about that, but I'm not overly fond of the closure creation overhead (better than what it once was but still too high).

>  It at least deserves
> a comment explaining why it's doing the right thing.

> If we can remove this `symbolp` test recovering info about provenance of
> the result of `cconv--lifted-arg` then I think option B is better, but
> otherwise I prefer option A.

I don't see any alternative that is obviously better so I'm applying patch A. We can still go with B later on if we want; the changes are minor.

Good comments, thank you very much!






