* Emacs 25.1 build failures on Ubuntu arm64
From: Barry Warsaw @ 2017-03-29 0:30 UTC
To: emacs-devel
I'm not sure how much of this email will be actionable, but I wanted to at
least let y'all know about a problem we've seen building Emacs 25.1 on Ubuntu
17.04 arm64.
Quick background: In Ubuntu we carry a few minor changes on top of the Debian
package, which we otherwise inherit. One of those changes is that we build
with -j1 on arm64 only. There are a couple of other changes that shouldn't
affect the problem I'll describe, and the changelogs can be easily
inspected.
In any case, for several weeks now we've seen Emacs segfault on arm64 builders
consistently in what I think are the CEDET tests. The full traceback, when
we've been able to capture it, is here:
https://launchpadlibrarian.net/302409391/gdb-bt-full.txt
The full build log is here:
https://launchpadlibrarian.net/310046964/buildlog_ubuntu-zesty-arm64.emacs25_25.1+1-3ubuntu3~ppa0_BUILDING.txt.gz
(scroll to the bottom to see the crash)
and the bug that tracks this is here:
https://bugs.launchpad.net/ubuntu/+source/emacs25/+bug/1656474
We're using GCC based on 6.2.1/6.3.0 (depending on the build).
All the other architectures have been building fine, but arm64 crashes
consistently on our build machines. Attempts to reproduce the crash on
development machines, even in arm64 chroots and on bare metal, have not been
successful, so debugging the problem isn't easy. Based on some research into
other crashes, various other attempts to work around the problem, and some
discussions among Ubuntu developers on IRC, we wondered whether turning off
optimization would help. Indeed, adding -O0 to CFLAGS only on arm64 solved
the problem in a test build.
I've now uploaded a new build to the archive with -O0 on arm64 and I expect
this to build and get published once it's approved (Ubuntu is currently in
final beta freeze).
I'll watch this thread via Gmane and try to answer any additional questions
you might have. As I said, I'm not sure what the Emacs developers can do
about it, but I wanted you to know about the problem and how we think we've
solved it.
Cheers,
-Barry
* Re: Emacs 25.1 build failures on Ubuntu arm64
From: Eli Zaretskii @ 2017-03-29 15:03 UTC
To: Barry Warsaw; +Cc: emacs-devel
> From: Barry Warsaw <barry@python.org>
> Date: Tue, 28 Mar 2017 20:30:40 -0400
>
> I'm not sure how much of this email will be actionable, but I wanted to at
> least let y'all know about a problem we've seen building Emacs 25.1 on Ubuntu
> 17.04 arm64.
Did this configuration build correctly with Emacs 24.5?
> In any case, for several weeks now we've seen Emacs segfault on arm64 builders
> consistently in what I think are the CEDET tests. The full traceback, when
> we've been able to capture it, is here:
>
> https://launchpadlibrarian.net/302409391/gdb-bt-full.txt
I see this there:
#5 handle_sigsegv (sig=11, siginfo=<optimized out>, arg=<optimized out>) at sysdep.c:1695
fatal = <optimized out>
#6 <signal handler called>
No symbol table info available.
#7 unchain_marker (marker=marker@entry=0x130ad18) at marker.c:605
tail = <optimized out>
prev = 0x676e696e6e6967e5
b = 0x130acf0
#8 0x00000000005334a4 in free_marker (marker=marker@entry=19967257) at alloc.c:3850
No locals.
The value of 'prev' in frame #7 looks garbled: it's ASCII text
"ginning", probably was "beginning" at some point. This might
indicate memory corruption, either some dynamically allocated memory
or the stack got smashed.
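The reading of 'prev' as text can be checked by decoding the eight bytes of
the value. A minimal sketch in Python, assuming the little-endian byte order
that arm64 Linux uses; the variable names are illustrative and do not come
from the Emacs sources:

```python
# Value of 'prev' reported by gdb in frame #7.
prev = 0x676e696e6e6967e5

# Assumption: arm64 Linux is little-endian, so the lowest byte is first in memory.
raw = prev.to_bytes(8, byteorder="little")

# Show printable ASCII as characters and every other byte as \xNN.
decoded = "".join(chr(b) if 0x20 <= b < 0x7f else f"\\x{b:02x}" for b in raw)
print(decoded)  # \xe5ginning
```

Seven of the eight bytes are printable letters, which is unusual for a
pointer and fits the idea that string data overwrote part of the marker
chain.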
Does your build compile ralloc.c? If so, could you try configuring
with REL_ALLOC=no?
> and the bug that tracks this is here:
>
> https://bugs.launchpad.net/ubuntu/+source/emacs25/+bug/1656474
That bug mentions several fixes done by Debian; did you try them?
* Re: Emacs 25.1 build failures on Ubuntu arm64
From: Barry Warsaw @ 2017-03-29 20:46 UTC
Cc: emacs-devel
On Mar 29, 2017, at 06:03 PM, Eli Zaretskii wrote:
>> I'm not sure how much of this email will be actionable, but I wanted to at
>> least let y'all know about a problem we've seen building Emacs 25.1 on
>> Ubuntu 17.04 arm64.
>
>Did this configuration build correctly with Emacs 24.5?
It did. Ubuntu 17.04 does still have Emacs 24.5, and its arm64 build did
succeed. It was last built in the archive back in February, but I just tried
an experimental rebuild with today's toolchain and arm64 passed.
> prev = 0x676e696e6e6967e5
>
>The value of 'prev' in frame #7 looks garbled: it's ASCII text "ginning",
>probably was "beginning" at some point. This might indicate memory
>corruption, either some dynamically allocated memory or the stack got
>smashed.
Indeed, that's interesting.
>Does your build compile ralloc.c? If so, could you try configuring
>with REL_ALLOC=no?
We already build with REL_ALLOC=no for both emacs24 and emacs25.
>> and the bug that tracks this is here:
>>
>> https://bugs.launchpad.net/ubuntu/+source/emacs25/+bug/1656474
>
>That bug mentions several fixes done by Debian; did you try them?
I haven't; the list of Ubuntu deltas is below. It's possible of course that
one of these introduces a problem, but I'm more suspicious of the toolchain.
The last upload into Debian unstable did build successfully on arm64.
- build with parallel=1 on arm64
- Rebuild against new imagemagick 6.9.7.0. (no changes to emacs)
- Don't build-depend on gconf. emacs has supported gsettings for years
I'll see if I can do a build of the Debian version of 25.1 in an Ubuntu 17.04
PPA to see if there are any different results.
Cheers,
-Barry
* Re: Emacs 25.1 build failures on Ubuntu arm64
From: Barry Warsaw @ 2017-03-29 21:57 UTC
To: emacs-devel
On Mar 29, 2017, at 04:46 PM, Barry Warsaw wrote:
>I'll see if I can do a build of the Debian version of 25.1 in an Ubuntu 17.04
>PPA to see if there are any different results.
Indeed, Debian unstable's version of Emacs 25.1 also fails with the same
segfault in the Ubuntu 17.04 PPA, as expected. We can rule out any Ubuntu
deltas.
https://launchpadlibrarian.net/313465977/buildlog_ubuntu-zesty-arm64.emacs25_25.1+1-4~ppa0_BUILDING.txt.gz
Cheers,
-Barry
* Re: Emacs 25.1 build failures on Ubuntu arm64
From: Rob Browning @ 2017-08-01 14:55 UTC
To: Barry Warsaw, emacs-devel
Barry Warsaw <barry@python.org> writes:
> On Mar 29, 2017, at 04:46 PM, Barry Warsaw wrote:
>
>>I'll see if I can do a build of the Debian version of 25.1 in an Ubuntu 17.04
>>PPA to see if there are any different results.
>
> Indeed, Debian unstable's version of Emacs 25.1 also fails with the same
> segfault in the Ubuntu 17.04 PPA, as expected. We can rule out any Ubuntu
> deltas.
>
> https://launchpadlibrarian.net/313465977/buildlog_ubuntu-zesty-arm64.emacs25_25.1+1-4~ppa0_BUILDING.txt.gz
I spent a bit of time on one of the porterboxes and gathered a little
more information, though in the end, we just added -O0 on arm64 for now
as well.
I was able to reproduce this on asachi using a git checkout of
origin/emacs-25.2 (i.e. clean upstream tree, no debian adjustments),
though I did emulate our VPATH build (see below).
One notable oddity: while trying to narrow down the cause, I found
that the crash could be reliably triggered by adding the .git dir to the
(copied) build tree. For example, the following either builds or crashes,
depending on whether the .git dir is copied in.
rm -rf debian
mkdir -p debian/build-src
# Copy everything except .git, .pc, and debian into the build tree
cp -a $(ls -A | grep -Ev '^(\.git|\.pc|debian)$') debian/build-src
# If this line is removed, the build works fine, otherwise it crashes
cp -a .git debian/build-src/
# Generate configure inside the copied tree
pushd debian/build-src
./autogen.sh
popd
# Configure and build out of tree (VPATH build)
mkdir debian/build-x
cd debian/build-x
../build-src/configure ...
make -j
When the build crashes, it's always while trying to produce c-by.el, and
the crash looks similar to the one reported earlier in this thread,
i.e.:
Starting program: /home/rlb/git/emacs25-25.2+1/debian/build-x/src/emacs -batch --no-site-file --no-site-lisp -l semantic/bovine/grammar -f bovine-batch-make-parser -o /home/rlb/git/emacs25-25.2+1/debian/build-x/../../../emacs25-25.2+1/debian/build-src/lisp/cedet/semantic/bovine/c-by.el /home/rlb/git/emacs25-25.2+1/debian/build-x/../../../emacs25-25.2+1/debian/build-src/admin/grammars/c.by
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
[New Thread 0xffffb36dbe10 (LWP 22399)]
../../../build-src/lisp/emacs-lisp/eieio.el: `eieio-object-name-string' is an obsolete generic function (as of 25.1); use `eieio-named' instead.
../../../build-src/lisp/emacs-lisp/eieio-base.el: `eieio-object-name-string' is an obsolete generic function (as of 25.1); use `eieio-named' instead.
../../../build-src/lisp/cedet/semantic/db-ref.el: Obsolete name arg "DEBUG" to constructor semanticdb-ref-adebug
Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
unchain_marker (marker=marker@entry=0x188d850) at ./debian/build-src/src/marker.c:605
605 ./debian/build-src/src/marker.c: No such file or directory.
(gdb) where
#0 unchain_marker (marker=marker@entry=0x188d850) at ./debian/build-src/src/marker.c:605
#1 0x000000000053608c in free_marker (marker=marker@entry=25745489)
at ./debian/build-src/src/alloc.c:3850
#2 0x0000000000508688 in signal_before_change (preserve_ptr=0x0, end_int=9894256,
start_int=9890072) at ./debian/build-src/src/insdel.c:2041
...
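The two backtraces in this thread can be compared numerically. A small
Python sketch, using only the values printed by gdb; reading the difference
as low tag bits on a Lisp object is an assumption about Emacs internals, not
something stated in the thread, and the labels are illustrative:

```python
# (unchain_marker pointer, free_marker argument) from the two backtraces.
crashes = {
    "Ubuntu builder, 25.1": (0x130ad18, 19967257),
    "porterbox, emacs-25.2 checkout": (0x188d850, 25745489),
}

# Assumption: the free_marker argument is the same address plus low tag bits.
tags = {name: obj - ptr for name, (ptr, obj) in crashes.items()}

for name, (ptr, obj) in crashes.items():
    print(f"{name}: pointer={ptr:#x} object={obj:#x} difference={tags[name]}")
```

In both cases the argument to free_marker is the unchain_marker pointer plus
1, so the two crashes follow the same call path with consistent pointer
values. Nothing here identifies what corrupted the marker chain.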
I wondered if the presence of the .git dir was altering the behavior of
autogen.sh (or something else) in a way that exposes the problem.
Hope this helps
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
Thread overview: 6+ messages
2017-03-29 0:30 Emacs 25.1 build failures on Ubuntu arm64 Barry Warsaw
2017-03-29 15:03 ` Eli Zaretskii
2017-03-29 20:46 ` Barry Warsaw
2017-03-29 21:57 ` Barry Warsaw
2017-08-01 14:55 ` Rob Browning
2017-08-04 12:26 ` Byung-Hee HWANG (황병희, 黃炳熙)