unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Rob Browning <rlb@defaultvalue.org>,
	Andrea Corallo <akrl@sdf.org>, Paul Eggert <eggert@cs.ucla.edu>
Cc: 57789@debbugs.gnu.org
Subject: bug#57789: Emacs 28.1 clone build with native compilation crashes on s390x
Date: Thu, 15 Sep 2022 10:10:59 +0300
Message-ID: <83wna5yuws.fsf@gnu.org>
In-Reply-To: <87pmfxhfoz.fsf@trouble.defaultvalue.org> (message from Rob Browning on Wed, 14 Sep 2022 15:19:24 -0500)

> From: Rob Browning <rlb@defaultvalue.org>
> Cc: 57789@debbugs.gnu.org
> Date: Wed, 14 Sep 2022 15:19:24 -0500
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Please run the crashing command under GDB, and when it segfaults,
> > produce the C-level and Lisp-level backtrace, and post them here.
> 
> Starting from scratch with the emacs-28.1 commit I can reproduce the
> failure when building via
> 
>   ./configure --prefix=/home/rlb/opt/emacs-tmp --with-native-compilation
> 
> It crashes with the same segfault repeatably, i.e. if you run make
> again, it crashes again on the previously mentioned "... -l comp -f
> batch-byte+native-compile international/titdic-cnv.el" invocation.  That
> crash output is attached below.
> 
> After adjusting the Makefile.in invocation so I could run it with gdb in
> exactly the same environment once it's failing on that command, I
> captured the backtrace and included it below.

Thanks.  The backtrace indicates that the crash is in GC.  This
probably means we have some fundamental problem on that architecture.
Andrea, any advice for how to investigate?

Does the build of the same code with the same options sans
"--with-native-compilation" succeed, or does it also crash with
similar symptoms?  If the build without native-compilation succeeds,
my first question would be how mature and stable is libgccjit on that
platform?  Perhaps take this up with GCC's libgccjit developers.
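
One way to make that comparison is sketched below.  It is only a
sketch: it assumes SRC points at a pristine checkout where
./autogen.sh has already been run, and the build-native and
build-plain directory names are made up for illustration.  By default
it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: configure and build two trees that differ solely in
# --with-native-compilation, so a crash can be tied to that option.
# Assumptions: SRC is a pristine Emacs checkout with ./autogen.sh
# already run; the build-* directory names are hypothetical.
SRC=${SRC:-$PWD}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = run them

build_variant() {
    # $1 = build directory, remaining arguments = extra configure flags
    dir=$1; shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "mkdir -p $dir && cd $dir && $SRC/configure $* && make"
    else
        mkdir -p "$dir" && ( cd "$dir" && "$SRC/configure" "$@" && make )
    fi
}

build_variant build-native --with-native-compilation
build_variant build-plain
```

If only build-native crashes, that points towards the native
compilation path rather than the rest of the build.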

> With respect to the Lisp-level backtrace, I imagined you probably meant
> an xbacktrace?  If so (and assuming I'm guessing right about how I
> should do that), I haven't figured out how to arrange sourcing the
> src/.gdbinit from the src/Makefile.in command.

You can source it manually from the GDB prompt, when the segfault
happens, and then invoke xbacktrace manually, can't you?
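
If typing this at the prompt is inconvenient, the same steps can be
put in a GDB command file.  The sketch below is an assumption about
how that might look, not a tested recipe for this bug: the command
file name is made up, the path to src/.gdbinit depends on where GDB is
started, and the failing command itself has to be copied from the make
output.

```shell
#!/bin/sh
# Sketch only: generate a GDB command file that runs the program, then
# sources Emacs's src/.gdbinit (which defines xbacktrace) once it has
# stopped, and prints the C-level and Lisp-level backtraces.
write_gdb_commands() {
    # $1 = output file, $2 = path to src/.gdbinit in the Emacs tree
    cat > "$1" <<EOF
run
source $2
backtrace
xbacktrace
EOF
}

cmdfile=${TMPDIR:-/tmp}/emacs-crash.gdb     # hypothetical file name
write_gdb_commands "$cmdfile" src/.gdbinit
echo "gdb -batch -x $cmdfile --args <the failing emacs command from the make output>"
```

Sourcing the file only after the crash keeps the environment of the
failing run unchanged up to that point.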

> It looked like it might be because there were no debug symbols, so I
> tried adding a CFLAGS=-g3 to the end of the ./configure, but that caused
> the crash to disappear entirely.

Too bad, it means we have a heisenbug on our hands, which will make it
even harder to debug (as if debugging crashes in GC were not hard
enough already).

What happens if you modify this variable:

  (defcustom native-comp-debug (if (eq 'windows-nt system-type) 1 0)

to have the value 1 or even zero, and then rebuild from scratch?  Does
the build succeed then?

> Finally (and this was just a random guess based on previous experiences,
> particularly with programs like guile that play (normal, traditional)
> tricks with pointers/coercions/etc.) I noticed that emacs doesn't
> specify -fno-strict-aliasing, and unless all the C code has been written
> with that in mind, I assume that might open a window allowing the
> optimizer to introduce undesirable changes.  So I added a
> CFLAGS=-fno-strict-aliasing to the end of the ./configure command, and
> then the build and tests worked fine (twice in a row):
> 
>   ./configure --prefix=/home/rlb/opt/emacs-tmp --with-native-compilation \
>     CFLAGS=-fno-strict-aliasing
> 
> Of course that's not remotely conclusive, but if all of the C code
> wasn't written with strict-aliasing in mind, then I wondered if it might
> make sense to consider adding -fno-strict-aliasing as a default option.

I don't know enough about this.  Perhaps Andrea or Paul could comment.

> Also, even if that ends up being desirable, I'm not sure it'll be
> sufficient.  That is, I suspect I might want to run the full build/check
> with -fno-strict-aliasing in a loop for a bit to make sure the clean
> build/check is reliable, since I think I may have seen some test crashes
> (not the build crash) on one earlier run with that option, but I'm not
> sure that was a clean attempt.

Yes, running the full test suite would be the logical next step.
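
For running it in a loop, something along these lines might do.  It is
a minimal sketch: repeat_check is a made-up helper name, and the
make invocation in the comment is only an example of what could be
passed to it.

```shell
#!/bin/sh
# Sketch only: run a command up to N times and stop at the first
# failure, to check whether a build/check cycle is reliably clean.
repeat_check() {
    # $1 = number of runs, remaining arguments = command to run
    n=$1; shift
    i=1
    while [ "$i" -le "$n" ]; do
        echo "=== run $i of $n ==="
        "$@" || { echo "failed on run $i" >&2; return 1; }
        i=$((i + 1))
    done
    echo "all $n runs passed"
}

# Example (not executed here): rebuild and test five times in a row.
# repeat_check 5 sh -c 'make clean && make && make check'
```

A failure on any iteration stops the loop, so the state of the tree at
the time of the failure is left in place for inspection.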

> Program received signal SIGSEGV, Segmentation fault.
> mark_object (arg=<optimized out>) at alloc.c:6809
> 6809            if (symbol_marked_p (ptr))
> (gdb) backtrace
> #0  mark_object (arg=<optimized out>) at alloc.c:6809

Any idea what caused the SIGSEGV here?  Was 'ptr' an invalid pointer
for some reason, and if so, what exactly makes it invalid?

Thanks.




