unofficial mirror of emacs-devel@gnu.org 
* Question about native compilation (bug?)
@ 2023-07-26 21:32 Drew Adams
  2023-07-27  4:45 ` Tassilo Horn
  2023-07-27  9:53 ` Andrea Corallo
  0 siblings, 2 replies; 11+ messages in thread
From: Drew Adams @ 2023-07-26 21:32 UTC (permalink / raw)
  To: emacs-devel@gnu.org

I don't have native compilation for my laptop, but I received a bug report from a user of one of my libraries, and I get the feeling that native compilation (which this user has) is interfering and perhaps bugged.

1. My code, when loaded, saves the original definitions of some functions, such as `read-buffer', by creating a `defalias' for each of them, such as this one:

(defalias 'ORIG-read-buffer
          (symbol-function 'read-buffer))

2. I define a global minor mode that, when turned on, redefines those saved functions, such as `read-buffer', using `fset', to use definitions appropriate to the minor mode.

3. When the minor mode is turned off I reset those function definitions to the saved (`defalias'ed) versions, using `fset'.
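
A minimal sketch of steps 1-3, assuming a hypothetical function `demo-greet' in place of `read-buffer'.  The names `ORIG-demo-greet', `demo-greet-replacement' and `demo-swap-mode' are made up for illustration and are not taken from the actual library:

```elisp
;; Hypothetical stand-in for a function such as `read-buffer'.
(defun demo-greet (name &optional greeting)
  "Return a string greeting NAME, using GREETING or \"Hello\"."
  (format "%s, %s" (or greeting "Hello") name))

;; Step 1: save the original definition under another name.
(defalias 'ORIG-demo-greet (symbol-function 'demo-greet))

;; The definition to use while the mode is on (hypothetical).
(defun demo-greet-replacement (name &optional greeting)
  "Like the original `demo-greet', but upcase the result."
  (upcase (ORIG-demo-greet name greeting)))

;; Steps 2 and 3: swap definitions with `fset' when the mode is toggled.
(define-minor-mode demo-swap-mode
  "Toggle the replacement definition of `demo-greet'."
  :global t
  (fset 'demo-greet
        (symbol-function (if demo-swap-mode
                             'demo-greet-replacement
                           'ORIG-demo-greet))))
```

With this loaded, `(demo-swap-mode 1)' installs the replacement and `(demo-swap-mode -1)' restores the saved definition.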

(I do this dance because I want the library to work properly also with older Emacs versions.  No comments, please, about its fragility or other weaknesses.)
___

This is the problem, reported in Emacs 28.2 with native compilation:

Emacs 28 added an additional optional arg to function `read-buffer' (which is defined in C).  My substitute version of `read-buffer', which is used when my minor mode is turned on, has only 1 required arg and 2 optional, not 1 required and 3 optional.

Obviously, for any code that will try to pass 4 args, I need to update my code to accommodate the 4th arg.  I haven't done that yet.

But my question is about code that ostensibly passes 3 or fewer args.  An error is nevertheless raised, saying that 4 args were passed and only 3 were allowed.  The standard code, whose vanilla source definition (in `window.el') passes only 3 args, in actuality, at runtime, passes 4 args, the 4th one being explicitly nil.

Recipe: the minor mode is turned on, and `C-x 5 b' is tried.  An error is raised:

Debugger entered--Lisp error: (wrong-number-of-arguments #<subr my-read-buffer> 4)
  read-buffer("Switch to buffer in other frame: " #<buffer foo> confirm-after-completion nil)
  read-buffer-to-switch("Switch to buffer in other frame: ")
  byte-code("\300\301!C\207" [read-buffer-to-switch "Switch to buffer in other frame: "] 2)
  call-interactively(switch-to-buffer-other-frame nil nil)
  command-execute(switch-to-buffer-other-frame)

I wouldn't have a clue as to what is going on here from that backtrace, since function `my-read-buffer' accepts 3 args and the source code for `read-buffer-to-switch' passes 3 args.

I got a clue (I think) from this: For the reporting user `C-h k C-x 5 b' says this:

C-x 5 b runs the command switch-to-buffer-other-frame (found in global-map), which is an interactive native-compiled Lisp function in '/var/lib/snapd/snap/emacs/2031/usr/share/emacs/28.2/lisp/window.el'.

It's that "native-compiled" that makes me wonder.  If the user loads the Lisp source file, `window.el', there's no problem.  But it looks kinda like native compilation has "baked-in" the call to `read-buffer', passing an explicit 4th arg, nil, instead of passing only the 3 args that the source code says to pass.

Passing 4 args, the last of which is nil, isn't the same thing as passing 3 args, but it looks like native compilation takes a shortcut, assuming that it's the same thing.
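
The difference between omitting an optional arg and passing an explicit nil can be seen with an ordinary Lisp function.  A minimal sketch, using a hypothetical function `demo-three-args' (1 required arg and 2 optional, like the substitute described above):

```elisp
;; Hypothetical function with 1 required and 2 optional args.
(defun demo-three-args (a &optional b c)
  "Return a list of A, B and C."
  (list a b c))

;; Omitted optional args default to nil, so this call is fine.
(demo-three-args 1 2)                   ; → (1 2 nil)

;; `func-arity' reports the (MIN . MAX) argument counts.
(func-arity 'demo-three-args)           ; → (1 . 3)

;; An explicit 4th arg, even nil, exceeds the maximum and signals
;; `wrong-number-of-arguments'.
(condition-case err
    (funcall 'demo-three-args 1 2 3 nil)
  (wrong-number-of-arguments (car err))) ; → wrong-number-of-arguments
```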

Is this a native-compilation bug?  Is it some other Emacs bug?  Or is this just the way things are going to be from now on - "situation normale, rien à signaler"?

Let me know if my guesses are near the target, or if I'm missing something.




* Re: Question about native compilation (bug?)
  2023-07-26 21:32 Question about native compilation (bug?) Drew Adams
@ 2023-07-27  4:45 ` Tassilo Horn
  2023-07-27 15:25   ` [External] : " Drew Adams
  2023-07-27  9:53 ` Andrea Corallo
  1 sibling, 1 reply; 11+ messages in thread
From: Tassilo Horn @ 2023-07-27  4:45 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel

Drew Adams <drew.adams@oracle.com> writes:

> Is this a native-compilation bug?  Is it some other Emacs bug?  Or is
> this just the way things are going to be from now on - "situation
> normale, rien à signaler"?

I've seen such issues in the past, i.e., breakage when subrs change
signature and native compiled code still uses the old version, too.  I
think it should be solved in theory by the hash-number in the eln-cache
directory, i.e., such a change should change the hash and cause
everything to be native-compiled again.  But that doesn't help when you
change emacs-internal functions from an external package.

You said we shouldn't comment on your way of doing things, but maybe
using advice instead of redefinition would be more robust...
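
A minimal sketch of the advice-based alternative, assuming a hypothetical function `demo-read-name' as the target.  The names `demo-read-name-advice' and `demo-advice-mode' are made up for illustration.  Whether this avoids the native-compilation problem for a given primitive is not something the sketch demonstrates:

```elisp
;; Hypothetical stand-in for the function being customized.
(defun demo-read-name (prompt &optional default)
  "Return DEFAULT, or PROMPT if DEFAULT is nil."
  (or default prompt))

;; :around advice receives the original function plus whatever
;; arguments the caller passed, so it does not need to track the
;; original's exact signature.
(defun demo-read-name-advice (orig-fn &rest args)
  "Call ORIG-FN with ARGS and wrap the result in brackets."
  (format "[%s]" (apply orig-fn args)))

(define-minor-mode demo-advice-mode
  "Toggle advice on `demo-read-name'."
  :global t
  (if demo-advice-mode
      (advice-add 'demo-read-name :around #'demo-read-name-advice)
    (advice-remove 'demo-read-name #'demo-read-name-advice)))
```

Because the advice takes `&rest args', an optional arg added to the advised function later is passed through unchanged.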

Bye,
Tassilo




* Re: Question about native compilation (bug?)
  2023-07-26 21:32 Question about native compilation (bug?) Drew Adams
  2023-07-27  4:45 ` Tassilo Horn
@ 2023-07-27  9:53 ` Andrea Corallo
  2023-07-27 14:37   ` [External] : " Drew Adams
  1 sibling, 1 reply; 11+ messages in thread
From: Andrea Corallo @ 2023-07-27  9:53 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel@gnu.org

Drew Adams <drew.adams@oracle.com> writes:

> Is this a native-compilation bug?  Is it some other Emacs bug?  Or is this just the way things are going to be from now on - "situation normale, rien à signaler"?
>
> Let me know, if my guesses are near the target, or if I'm missing something.

When redefining a primitive (something that is not recommended but often
done), one must at least use the same signature as the original.  I don't
think we support the case where the new definition has a different
signature.  Native compilation might just be more sensitive to this
unsupported condition.
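
One way to guard against a signature mismatch is to compare arities before installing a replacement.  A minimal sketch, in which the helper `demo-same-signature-p' and the three `demo-' functions are hypothetical; it compares only argument counts, not argument names or meanings:

```elisp
(defun demo-same-signature-p (original replacement)
  "Return non-nil if ORIGINAL and REPLACEMENT accept the same arg counts."
  (equal (func-arity original) (func-arity replacement)))

;; Hypothetical original with 1 required and 3 optional args.
(defun demo-orig (prompt &optional def require-match predicate)
  (list prompt def require-match predicate))

;; Replacement written against an older, 3-arg signature.
(defun demo-old-replacement (prompt &optional def require-match)
  (list prompt def require-match))

;; Replacement updated to match the current signature.
(defun demo-new-replacement (prompt &optional def require-match predicate)
  (list prompt def require-match predicate))

(demo-same-signature-p 'demo-orig 'demo-old-replacement) ; → nil
(demo-same-signature-p 'demo-orig 'demo-new-replacement) ; → t
```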

  Andrea




* RE: [External] : Re: Question about native compilation (bug?)
  2023-07-27  9:53 ` Andrea Corallo
@ 2023-07-27 14:37   ` Drew Adams
  2023-07-27 16:15     ` Andrea Corallo
  0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2023-07-27 14:37 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: emacs-devel@gnu.org

> When redefining a primitive (something that is not recommended but often
> done), one must at least use the same signature as the original.  I don't
> think we support the case where the new definition has a different
> signature.  Native compilation might just be more sensitive to this
> unsupported condition.

I don't know whether the same problem arises if
the redefined function has a Lisp instead of a
C definition.  If not, then what you say might
have a bit more sway.

Is that the case - is this problem limited to
redefining functions defined in C?  Or does it
apply also to redefining functions defined in
Lisp?

Lisp itself has always been a dynamic language
and a dynamic programming environment.  That
has included redefining things on the fly.
Since Day One.

And without native compilation we _do_ support
redefining to a definition that has a different
signature.  The actual code in question doesn't
raise an error without native compilation, in
the described scenario (i.e., for a call whose
source code passes only 3 args).

To be clear, I'm not so much arguing about this
as raising a question about it.  I can guess
there are reasonable arguments on both sides of
the question.
___

Imagine this scenario, to take this away from
the actual case where I discovered the problem:

1. A function with two args, both optional, is
defined in Lisp.  But all existing calls to the
function pass only the first arg. 

And imagine that the code gets native-compiled.

2. Later the function is redefined in such a way
that it doesn't ever need the second arg - the
new definition has only one arg (optional).

In a "normal", traditional Lisp environment, no
code changes would be needed. No error would be
raised for any of the existing calls to that
function, because they all pass only one arg.
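
The scenario can be sketched as follows, with hypothetical names `demo-f' and `demo-caller'.  This shows the behavior of ordinary evaluated Lisp only; it says nothing about what natively compiled code does:

```elisp
;; Step 1: a function with two optional args...
(defun demo-f (&optional a b)
  (list a b))

;; ...and a caller that passes only the first.
(defun demo-caller ()
  (demo-f 1))

(demo-caller)                           ; → (1 nil)

;; Step 2: redefine the function with a single optional arg.
(defun demo-f (&optional a)
  (list a))

;; The existing caller still works, since it passes only one arg.
(demo-caller)                           ; → (1)
```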

I don't think this scenario is far-fetched.

Should it really be the case that just because
the code was natively compiled all of the calls
to that function should now raise an error,
because native compilation decided to pass two
args for each call, the second arg being nil?
In effect, native compilation ignores the
redefinition, when it comes to existing calls.
It has baked-in the idea that there are 2 args
and so it passes 2 args in each call.

To me, native compilation should ideally just
be "plumbing".  It shouldn't (again, ideally)
change the _behavior_ of code, other than wrt
performance.  The behavior should (ideally) be
the same as without native compilation.

I don't know the implementation of native
compilation, and I can't speak to whether it
could easily _not_ pass explicit nil for all
missing optional args.  If doing that would be
a game changer (impossible, hard, or with bad
consequences) then we should just live with
what we have, I guess.  But in that case maybe
we should at least document this behavior
difference with native-compilation?

But if native compilation could just as well
pass only the actual arguments that get passed
in the source code, instead of adding nil args
for all optional args, that would be better,
I think.  It would be faithful to the spirit
of Lisp, and faithful to the source code and
the byte-compiled code.

Would such a fix be feasible / easy?  If so,
would it drastically impact performance
negatively?  If not hard to fix and no bad
consequences from doing so, how about making
such a fix?


* RE: [External] : Re: Question about native compilation (bug?)
  2023-07-27  4:45 ` Tassilo Horn
@ 2023-07-27 15:25   ` Drew Adams
  0 siblings, 0 replies; 11+ messages in thread
From: Drew Adams @ 2023-07-27 15:25 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: emacs-devel@gnu.org

> You said we shouldn't comment on your way of
> doing things, but maybe using advice instead
> of redefinition would be more robust...

I know about advice, and I use it, thx.

This isn't really about what I do or
why.  How I discovered the problem is
independent from the problem itself.

Please see the scenario outlined in my
reply to Andrea.


* Re: [External] : Re: Question about native compilation (bug?)
  2023-07-27 14:37   ` [External] : " Drew Adams
@ 2023-07-27 16:15     ` Andrea Corallo
  2023-07-27 17:05       ` Drew Adams
  0 siblings, 1 reply; 11+ messages in thread
From: Andrea Corallo @ 2023-07-27 16:15 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel@gnu.org

Drew Adams <drew.adams@oracle.com> writes:

>> When redefining a primitive (something that is not recommended but often
>> done), one must at least use the same signature as the original.  I don't
>> think we support the case where the new definition has a different
>> signature.  Native compilation might just be more sensitive to this
>> unsupported condition.
>
> I don't know whether the same problem arises if
> the redefined function has a Lisp instead of a
> C definition.  If not, then what you say might
> have a bit more sway.

Should be only about primitives (C definitions).

  Andrea




* RE: [External] : Re: Question about native compilation (bug?)
  2023-07-27 16:15     ` Andrea Corallo
@ 2023-07-27 17:05       ` Drew Adams
  2023-07-27 17:51         ` Andrea Corallo
  0 siblings, 1 reply; 11+ messages in thread
From: Drew Adams @ 2023-07-27 17:05 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: emacs-devel@gnu.org

> >> When redefining a primitive (something that is not recommended but often
> >> done), one must at least use the same signature as the original.  I don't
> >> think we support the case where the new definition has a different
> >> signature.  Native compilation might just be more sensitive to this
> >> unsupported condition.
> >
> > I don't know whether the same problem arises if
> > the redefined function has a Lisp instead of a
> > C definition.  If not, then what you say might
> > have a bit more sway.
> 
> Should be only about primitives (C definitions).

OK, thx.

What about my other questions, e.g. wrt fixing this?




* Re: [External] : Re: Question about native compilation (bug?)
  2023-07-27 17:05       ` Drew Adams
@ 2023-07-27 17:51         ` Andrea Corallo
  2023-07-27 18:14           ` Drew Adams
  0 siblings, 1 reply; 11+ messages in thread
From: Andrea Corallo @ 2023-07-27 17:51 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel@gnu.org

Drew Adams <drew.adams@oracle.com> writes:

>> >> When redefining a primitive (something that is not recommended but often
>> >> done), one must at least use the same signature as the original.  I don't
>> >> think we support the case where the new definition has a different
>> >> signature.  Native compilation might just be more sensitive to this
>> >> unsupported condition.
>> >
>> > I don't know whether the same problem arises if
>> > the redefined function has a Lisp instead of a
>> > C definition.  If not, then what you say might
>> > have a bit more sway.
>> 
>> Should be only about primitives (C definitions).
>
> OK, thx.
>
> What about my other questions, e.g. wrt fixing this?

I haven't done any recent analysis; from what I remember, it is not
easily fixable.

That said, I don't think it is worth it: redefining primitives is already
discouraged by the manual and dangerous (more on that later), and doing it
with a different signature is just kamikaze behavior.

Note also that redefining primitives in Emacs is not only discouraged,
but really not guaranteed to work properly.  The redefinition will
not take effect when executing bytecode if the primitive has a dedicated
byte opcode, and it will *not* take effect either for any call to the
primitive made from C itself.

  Andrea




* RE: [External] : Re: Question about native compilation (bug?)
  2023-07-27 17:51         ` Andrea Corallo
@ 2023-07-27 18:14           ` Drew Adams
  2023-07-27 18:38             ` Andrea Corallo
  2023-07-27 19:11             ` Eli Zaretskii
  0 siblings, 2 replies; 11+ messages in thread
From: Drew Adams @ 2023-07-27 18:14 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: emacs-devel@gnu.org

> > What about my other questions, e.g. wrt fixing this?
> 
> I haven't done any recent analysis; from what I remember, it is not
> easily fixable.
>
> That said, I don't think it is worth it: redefining primitives is already
> discouraged by the manual and dangerous (more on that later), and doing it
> with a different signature is just kamikaze behavior.
>
> Note also that redefining primitives in Emacs is not only discouraged,
> but really not guaranteed to work properly.  The redefinition will
> not take effect when executing bytecode if the primitive has a dedicated
> byte opcode, and it will *not* take effect either for any call to the
> primitive made from C itself.

I don't claim to understand all of that, e.g.
primitives that do or don't have dedicated
byteopcodes etc.

The fact is that it does work for `read-buffer',
except when native compilation is turned on.
Except, of course, for calls from C itself -
that's understandable (nothing new about that).

The point is to have compatibility with what
happens with Lisp source code.

If this can't/won't be fixed, so be it.  But it
makes Elisp code with native compilation behave
differently from Elisp code that's either source
or byte-compiled.

How much do we care about native compilation
respecting the same behavior as source code?

If this won't be fixed, or until it is, shouldn't
such incompatibility be called out in the doc?
___

Even aside from imagining redefinitions, does
it make sense for a source-code function call
that passes N args to be changed to a call that
passes N args plus M nil args?  That's really
what this is about, it seems to me: reproducing
the actual call, as is, instead of adding
explicit nil optional args.  We don't do that
with byte compilation, right?  Why should we
need to do it with native compilation?




* Re: [External] : Re: Question about native compilation (bug?)
  2023-07-27 18:14           ` Drew Adams
@ 2023-07-27 18:38             ` Andrea Corallo
  2023-07-27 19:11             ` Eli Zaretskii
  1 sibling, 0 replies; 11+ messages in thread
From: Andrea Corallo @ 2023-07-27 18:38 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel@gnu.org

Drew Adams <drew.adams@oracle.com> writes:

>> > What about my other questions, e.g. wrt fixing this?
>> 
>> I haven't done any recent analysis; from what I remember, it is not
>> easily fixable.
>>
>> That said, I don't think it is worth it: redefining primitives is already
>> discouraged by the manual and dangerous (more on that later), and doing it
>> with a different signature is just kamikaze behavior.
>>
>> Note also that redefining primitives in Emacs is not only discouraged,
>> but really not guaranteed to work properly.  The redefinition will
>> not take effect when executing bytecode if the primitive has a dedicated
>> byte opcode, and it will *not* take effect either for any call to the
>> primitive made from C itself.
>
> I don't claim to understand all of that, e.g.
> primitives that do or don't have dedicated
> byteopcodes etc.
>
> The fact is that it does work for `read-buffer',
> except when native compilation is turned on.
> Except, of course, for calls from C itself -
> that's understandable (nothing new about that).
>
> The point is to have compatibility with what
> happens with Lisp source code.

As I said, that is already broken in multiple ways.  Try redefining + and
you'll discover that it works in interpreted code, it does not when
executing bytecode, it does again from native code, and it does not from C
calls.  Moreover, the byte compiler assumes that primitives are what they
are and have not been redefined, so when byte-compiling code you might get
spurious warnings or miscompiled output.
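
The bytecode half of this can be inspected without actually redefining anything.  A rough sketch, assuming that slot 2 of a byte-code function object holds its constants vector; the helper `demo-constants' is hypothetical:

```elisp
(require 'bytecomp)

(defun demo-constants (form)
  "Byte-compile the lambda FORM and return its constants as a list."
  (append (aref (byte-compile form) 2) nil))

;; `+' has a dedicated opcode, so the compiled code is not expected
;; to reference the symbol `+' at all; a redefinition installed with
;; `fset' would therefore never be consulted.
(memq '+ (demo-constants '(lambda (a b) (+ a b))))

;; A call to `read-buffer' goes through the symbol, which is why
;; byte-compiled callers do see a redefinition of it.
(memq 'read-buffer (demo-constants '(lambda (p) (read-buffer p))))
```

The first form is expected to return nil and the second a non-nil list, though the details of compiled output may differ between Emacs versions.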

You just touched the tip of an iceberg of something we don't support and
we ask the user not to do.

  Andrea




* Re: [External] : Re: Question about native compilation (bug?)
  2023-07-27 18:14           ` Drew Adams
  2023-07-27 18:38             ` Andrea Corallo
@ 2023-07-27 19:11             ` Eli Zaretskii
  1 sibling, 0 replies; 11+ messages in thread
From: Eli Zaretskii @ 2023-07-27 19:11 UTC (permalink / raw)
  To: Drew Adams; +Cc: acorallo, emacs-devel

> From: Drew Adams <drew.adams@oracle.com>
> CC: "emacs-devel@gnu.org" <emacs-devel@gnu.org>
> Date: Thu, 27 Jul 2023 18:14:15 +0000
> 
> > > What about my other questions, e.g. wrt fixing this?
> > 
> > I haven't done any recent analysis; from what I remember, it is not
> > easily fixable.
> >
> > That said, I don't think it is worth it: redefining primitives is already
> > discouraged by the manual and dangerous (more on that later), and doing it
> > with a different signature is just kamikaze behavior.
> >
> > Note also that redefining primitives in Emacs is not only discouraged,
> > but really not guaranteed to work properly.  The redefinition will
> > not take effect when executing bytecode if the primitive has a dedicated
> > byte opcode, and it will *not* take effect either for any call to the
> > primitive made from C itself.
> 
> I don't claim to understand all of that, e.g.
> primitives that do or don't have dedicated
> byteopcodes etc.
> 
> The fact is that it does work for `read-buffer',
> except when native compilation is turned on.

If we ever decide to give read-buffer a dedicated bytecode opcode, it
will stop working.  So applications using this are unreliable.

> The point is to have compatibility with what
> happens with Lisp source code.
> 
> If this can't/won't be fixed, so be it.  But it
> makes Elisp code with native compilation behave
> differently from Elisp code that's either source
> or byte-compiled.

By doing this you invoke "undefined behavior", whereby you cannot talk
about "different behavior" because the behavior is undefined.

> If this won't be fixed, or until it is, shouldn't
> such incompatibility be called out in the doc?

We already advise not to redefine existing functions:

     Be careful not to redefine existing functions unintentionally.
     ‘defun’ redefines even primitive functions such as ‘car’ without
     any hesitation or notification.  Emacs does not prevent you from
     doing this, because redefining a function is sometimes done
     deliberately, and there is no way to distinguish deliberate
     redefinition from unintentional redefinition.

> Even aside from imagining redefinitions, does
> it make sense for a source-code function call
> that passes N args to be changed to a call that
> passes N args plus M nil args?  That's really
> what this is about, it seems to me: reproducing
> the actual call, as is, instead of adding
> explicit nil optional args.  We don't do that
> with byte compilation, right?  Why should we
> need to do it with native compilation?

Because native code runs natively, not by our interpreter, and thus
must have a fixed number of arguments.
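
The effect of a fixed-arity call site can be imitated in plain Lisp.  This is only an illustration of the effect described in this thread, not a description of how the native compiler is implemented; `demo-pad-call', `demo-prim' and `demo-prim-max-args' are hypothetical names:

```elisp
(defun demo-pad-call (fn max-args &rest args)
  "Call FN with ARGS padded with nils up to MAX-ARGS arguments.
This imitates a call site compiled for a fixed argument count."
  (apply fn (append args
                    (make-list (max 0 (- max-args (length args))) nil))))

;; Hypothetical \"primitive\" taking up to 4 args.
(defun demo-prim (a &optional b c d)
  (list a b c d))

;; Maximum arity captured once, as a stand-in for compile time.
(defconst demo-prim-max-args (cdr (func-arity 'demo-prim)))

(demo-pad-call 'demo-prim demo-prim-max-args 1 2 3) ; → (1 2 3 nil)

;; Redefine with only 3 args; the padded call now passes one too many.
(defun demo-prim (a &optional b c)
  (list a b c))

(condition-case nil
    (demo-pad-call 'demo-prim demo-prim-max-args 1 2 3)
  (wrong-number-of-arguments 'too-many-args))       ; → too-many-args
```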


