unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken
@ 2020-09-04  1:06 Tom Gillespie
  2020-09-04  7:56 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 6+ messages in thread
From: Tom Gillespie @ 2020-09-04  1:06 UTC (permalink / raw)
  To: 43190; +Cc: Andrea Corallo

Hi Andrea,
   Found another one. I don't know what is going wrong on
the compiler side that induces it, but hopefully the repro
can get you pointed in the right direction. Also sent to the
bug tracker so it should show up there shortly. I have left
the section on issues with kill-buffer in the report so that
anyone encountering the effect of the bug won't have to spend
the same amount of time I did figuring out what is happening.
Best!
Tom


This issue occurs at 3023eb569213a3dd5183640f6e322acd00ea536a.
It does not happen in emacs-27 nor on master at
ae6daa680a5f5f5fb9c6a15296e5e88c97cd770a.

When trying to load json-mode for the first time it fails to compile
due to some issue in json-snatcher, the error that is displayed is
json-mode.el:33:1: Error: Symbol’s function definition is void:
jsons-remove-buffer

I suspect that something goes wrong inside json-snatcher.el and the
end result is that =(add-hook 'kill-buffer-hook 'jsons-remove-buffer)=
on line json-snatcher.el:85 manages to run, but then the defun for
that function is never called. The fact that that function cannot be
found means that anything that calls kill-buffer will be broken, which
includes a terrifyingly large portion of Emacs' debugging
functionality (see footnote 1).  On a bad day this also breaks things
like the ability to exit Emacs via =C-x C-c=.

Fortunately in the repro below this state is not triggered so you can
at least exit.  Just in case I included the escape hatches that can be
evaluated using =C-x C-e= in the scratch buffer most relevantly =(pop
kill-buffer-hook)=.

Here is a simple reproduction that is hopefully enough
to track down what is going wrong with native compile.
I have not been able to produce a simple visible repro
using --batch, but I think the problem is clear enough.
#+begin_src bash
[[ -d "/tmp/test/.emacs.d/" ]] && rm -r /tmp/test/.emacs.d/; \
emacs -q \
--eval "(setq user-emacs-directory \"/tmp/test/.emacs.d/\")" \
--eval "(require 'package)" \
--eval "(add-to-list 'package-archives '(\"melpa\" .
\"https://melpa.org/packages/\") t)" \
--eval "(package-refresh-contents)" \
--eval "(package-install 'json-mode)"
#+end_src

The repro for emacs-27 without the --eval options.
#+begin_src bash
[[ -d "/tmp/test/.emacs.d/" ]] && rm -r /tmp/test/.emacs.d/; \
emacs-27 -q \
#+end_src

The repro for the master branch without the --eval options.
#+begin_src bash
./autogen.sh
./configure
make -j 8
[[ -d "/tmp/test/.emacs.d/" ]] && rm -r /tmp/test/.emacs.d/; \
src/emacs -q \
#+end_src

A longer reproduction that controls for a number of variables and
shows the larger issues with kill-buffer is below.

Run this and hit enter in the ielm buffer and you will experience
and odd hang. Hit enter again and then switch to the scratch buffer
if you want to play around with the additional issues.
#+begin_src bash
[[ -d "/tmp/test/.emacs.d/" ]] && rm -r /tmp/test/.emacs.d/; \
emacs -q \
--eval "(setq user-emacs-directory \"/tmp/test/.emacs.d/\")" \
--eval "(when (boundp 'comp-eln-load-path) (pop comp-eln-load-path)
(add-to-list 'comp-eln-load-path (concat user-emacs-directory
\"eln-cache\")))" \
--eval "(require 'package)" \
--eval "(add-to-list 'package-archives '(\"melpa\" .
\"https://melpa.org/packages/\") t)" \
--eval "(package-refresh-contents)" \
--eval "(ignore-errors (package-install 'json-mode))" \
--eval "(ielm)" \
--eval "(with-current-buffer \"*ielm*\" (insert \"(setq asdf (+ 1 2))\"))" \
--eval "(with-current-buffer \"*scratch*\"
(insert \"(pop kill-buffer-hook) ; calling this will revert the massive breakage
asdf  ; C-x C-e this to see more brokeness with native comp or working with 27
(fmakunbound 'jsons-remove-buffer)  ; use to repro the effects without
native comp
C-x-C-e-me-i-am-not-bound  ; asdf will be bound if testing with 27
;; run these blocks to be able to safely enter the debugger
(setq debug-on-message \\\"break\\\")
(defvar break-message \\\"break\\\")
(defun jsons-remove-buffer ()
  (let ((bm break-message))
    (setq break-message \\\"\\\")
    (message bm)))\"))"
#+end_src

Backtraces.

#+begin_example elisp
Debugger entered--Lisp error: "break"
  message("break")
  (let ((bm break-message)) (setq break-message "") (message bm))
  jsons-remove-buffer()
  kill-buffer(#<buffer  *temp*-833840>)
  ielm-eval-input(#("(+ 1 2)" 0 7 (fontified t)) nil)
  ielm-send-input(nil)
  ielm-return()
  funcall-interactively(ielm-return)
  command-execute(ielm-return)
#+end_example

#+begin_example elisp
Debugger entered--Lisp error: "break"
  message("break")
  (let ((bm break-message)) (setq break-message "") (message bm))
  jsons-remove-buffer()
  kill-buffer(#<buffer  *temp*-61838>)
  #f(compiled-function () #<bytecode 0x11ac346e22dcf22>)()
  cl-print-to-string-with-limit(backtrace--print (void-variable asdf) 5000)
  backtrace--print-to-string((void-variable asdf) nil)
  backtrace-print-to-string((void-variable asdf))
  debugger--insert-header((error (void-variable asdf)))
  #f(compiled-function () #<bytecode 0x1f401d22beba>)()
  backtrace-print()
  debugger-setup-buffer((error (void-variable asdf)))
  debug(error (void-variable asdf))
  (progn asdf)
  elisp--eval-last-sexp(nil)
  eval-last-sexp(nil)
  funcall-interactively(eval-last-sexp nil)
  command-execute(eval-last-sexp)
#+end_example

Thoughts on the presence of a larger issue.

While this is clearly a bug in the way that json-snatcher is written
(the symbol should never have been added to the hook before the
function was defined), it is only causing an issue with the new native
comp code.  Furthermore this example reveals a number of systematic
issues with when and where kill-buffer is being called inside core
Emacs code, and with how hook failures on kill-buffer are being
handled. This was an absolute nightmare to debug and I think I will
submit a separate issue about the utterly undebuggable nature of
issues with kill-buffer-hook.

Thoughts on potential solutions while they are fresh in my mind. It
might be possible to add a check in add-hook that could be invoked for
hooks that want to require any symbol added to the hook list to
already be bound at the time that it is added to the hook list. This
would prevent the issue that we see with json-snatcher in the future.
However this does not prevent issues if someone was doing something
evil like using push to add to kill-buffer-hook. The error message
that is currently emitted for these kinds of failures is almost
good enough, I think that it just needs a note that the symbol was
originally found the kill-buffer-hook list and maybe provide
instructions on how to remove the symbol from the hook. For example
"kill-buffer: Symbol’s function definition is void: jsons-remove-buffer
Symbol jsons-remove-buffer in kill-buffer-hook list is void
you can run (remove-hook 'kill-buffer-hook 'jsons-remove-buffer)
to restore kill-buffer functionality"

Footnotes.

1. For example toggle-debug-on-error completely fails to function
   correctly in this case, as do things like ielm-return. Anything
   which uses with-temp-buffer and tries to print, such as
   cl-print-to-string-with-limit will have its output silenced or
   ignored due to the failure of kill-buffer in the unwind forms.
   Normal approaches to debugging nearly all wind up stuck in infinite
   loops since most of them make a call to kill-buffer somewhere.





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken
  2020-09-04  1:06 bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken Tom Gillespie
@ 2020-09-04  7:56 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-09-17 17:49   ` Tom Gillespie
  0 siblings, 1 reply; 6+ messages in thread
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-09-04  7:56 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: 43190

Hi Tom,

thanks for the detailed report.

Tom Gillespie <tgbugs@gmail.com> writes:

> While this is clearly a bug in the way that json-snatcher is written
> (the symbol should never have been added to the hook before the
> function was defined), it is only causing an issue with the new native
> comp code.

had no time to reproduce but to a very first look I'm quite unconvinced
this is a native-comp bug.  I think you'd get the same exact error
byte-compiling the file from shell using a vanilla Emacs.

The fact is that native async compilations are child processes so they
start with a clean environment.  This makes the native compiler a little
more picky on bugs when the compilation is relaying silently on some
definition sneaked in from the environment.

This lead already to a number of packages to fix their requires.

The proof is typically trying to byte-compile from shell, this give
systematically the same error, this because the front-end of the native
compiler *is* the byte-compiler :)

Each compilation unit should be consistent and compilable autonomously
without relaying on previous modifications of the environment.

Thanks

  Andrea





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken
  2020-09-04  7:56 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-09-17 17:49   ` Tom Gillespie
  2020-09-18 19:23     ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 6+ messages in thread
From: Tom Gillespie @ 2020-09-17 17:49 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: 43190

Hi Andrea,
    Having poked about a bit more I agree. I don't think it has
anything to do with the native compilation directly. I suspect that it
has to do with how or exactly when the native compiled code is swapped
in. When byte compiling on the command line I get an error at the same
place that I get one when native compiling stating that json-snatcher
could not be required. However when byte compiling the file, the code
to add to kill-buffer-hook is not run. I'm wondering whether the load
starts, native comp kicks in, succeeds on the first couple of forms,
then fails for the same reason as bytecomp, except that the first
forms were successfully loaded? This assumes that native-comp compiles
and loads single expressions at a time. The other possibility that
comes to mind is that maybe the regular el file forms are loaded, and
then there is a step where the forms that would be replaced are
unloaded? My ignorance of how you do the swap doesn't help here. Note
that if you try to repro this now you will have to install an older
version of json-snacher since the patch to fix the issue has been
merged. Best,
Tom




On Fri, Sep 4, 2020 at 12:56 AM Andrea Corallo <akrl@sdf.org> wrote:
>
> Hi Tom,
>
> thanks for the detailed report.
>
> Tom Gillespie <tgbugs@gmail.com> writes:
>
> > While this is clearly a bug in the way that json-snatcher is written
> > (the symbol should never have been added to the hook before the
> > function was defined), it is only causing an issue with the new native
> > comp code.
>
> had no time to reproduce but to a very first look I'm quite unconvinced
> this is a native-comp bug.  I think you'd get the same exact error
> byte-compiling the file from shell using a vanilla Emacs.
>
> The fact is that native async compilations are child processes so they
> start with a clean environment.  This makes the native compiler a little
> more picky on bugs when the compilation is relaying silently on some
> definition sneaked in from the environment.
>
> This lead already to a number of packages to fix their requires.
>
> The proof is typically trying to byte-compile from shell, this give
> systematically the same error, this because the front-end of the native
> compiler *is* the byte-compiler :)
>
> Each compilation unit should be consistent and compilable autonomously
> without relaying on previous modifications of the environment.
>
> Thanks
>
>   Andrea





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken
  2020-09-17 17:49   ` Tom Gillespie
@ 2020-09-18 19:23     ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2020-09-19  2:42       ` Tom Gillespie
  0 siblings, 1 reply; 6+ messages in thread
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-09-18 19:23 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: 43190

Hi Tom,

Tom Gillespie <tgbugs@gmail.com> writes:

> Hi Andrea,
>     Having poked about a bit more I agree. I don't think it has
> anything to do with the native compilation directly. I suspect that it
> has to do with how or exactly when the native compiled code is swapped
> in. When byte compiling on the command line I get an error at the same
> place that I get one when native compiling stating that json-snatcher
> could not be required.

As I mentioned what you have seen is 99% due to the fact that the
compiled package is missing to require some feature explicitly.  For
some reason this feature in a interactive session in loaded and the
byte-compiler is able to compile without complaining of this missing
definition.  The native compiler is picky as the byte compiler invoked
in the command line (actually because the byte-compiler is its
front-end).

If for example the native compiler compiles a file using a macro but
this is not defined it will assume this is a function.  As a result the
produced code is wrong and once loaded will not behave correctly.

If a require is missing the code is not correct and we cannot garantee
to compile such a code correctly, as the package has been fixed and
there's no evidence of a native compiler bug I'd be for closing this bug
:)

  Andrea





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken
  2020-09-18 19:23     ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2020-09-19  2:42       ` Tom Gillespie
  2020-09-19  6:38         ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 6+ messages in thread
From: Tom Gillespie @ 2020-09-19  2:42 UTC (permalink / raw)
  To: Andrea Corallo; +Cc: 43190

I think we are ok to close. If a similar issue shows up in the future
we will have this as a reference. Best!
Tom

On Fri, Sep 18, 2020 at 3:23 PM Andrea Corallo <akrl@sdf.org> wrote:
>
> Hi Tom,
>
> Tom Gillespie <tgbugs@gmail.com> writes:
>
> > Hi Andrea,
> >     Having poked about a bit more I agree. I don't think it has
> > anything to do with the native compilation directly. I suspect that it
> > has to do with how or exactly when the native compiled code is swapped
> > in. When byte compiling on the command line I get an error at the same
> > place that I get one when native compiling stating that json-snatcher
> > could not be required.
>
> As I mentioned what you have seen is 99% due to the fact that the
> compiled package is missing to require some feature explicitly.  For
> some reason this feature in a interactive session in loaded and the
> byte-compiler is able to compile without complaining of this missing
> definition.  The native compiler is picky as the byte compiler invoked
> in the command line (actually because the byte-compiler is its
> front-end).
>
> If for example the native compiler compiles a file using a macro but
> this is not defined it will assume this is a function.  As a result the
> produced code is wrong and once loaded will not behave correctly.
>
> If a require is missing the code is not correct and we cannot garantee
> to compile such a code correctly, as the package has been fixed and
> there's no evidence of a native compiler bug I'd be for closing this bug
> :)
>
>   Andrea





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken
  2020-09-19  2:42       ` Tom Gillespie
@ 2020-09-19  6:38         ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 6+ messages in thread
From: Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2020-09-19  6:38 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: 43190-done

Tom Gillespie <tgbugs@gmail.com> writes:

> I think we are ok to close. If a similar issue shows up in the future
> we will have this as a reference. Best!
> Tom

Great thanks!  closing

  Andrea





^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-09-19  6:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-04  1:06 bug#43190: feature/native-comp; failure to compile json-mode leaves Emacs broken Tom Gillespie
2020-09-04  7:56 ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
2020-09-17 17:49   ` Tom Gillespie
2020-09-18 19:23     ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors
2020-09-19  2:42       ` Tom Gillespie
2020-09-19  6:38         ` Andrea Corallo via Bug reports for GNU Emacs, the Swiss army knife of text editors

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).