* bug#39577: 27.0.60; Assertion failed during compilation
@ 2020-02-12 7:39 Henrik Grimler
2020-02-13 14:57 ` Eli Zaretskii
2020-02-17 20:53 ` Paul Eggert
0 siblings, 2 replies; 9+ messages in thread
From: Henrik Grimler @ 2020-02-12 7:39 UTC
To: 39577
Hi,
I am trying to debug a segmentation fault that happens on 32-bit ARM
Android. To do that I tried recompiling Emacs with
```
../configure --enable-checking=yes,glyphs \
--enable-check-lisp-object-type \
--without-makeinfo \
--without-selinux \
--prefix /data/data/com.termux/files/usr/local \
CFLAGS="-O0 -g3 -gdwarf-4"
```
but building the emacs-27 branch (commit 06c302d) fails with:
```
[...]
Loading /data/data/com.termux/files/home/projects/emacs/lisp/emacs-lisp/syntax.el (source)...
Loading /data/data/com.termux/files/home/projects/emacs/lisp/font-lock.el (source)...
Loading /data/data/com.termux/files/home/projects/emacs/lisp/jit-lock.el (source)...
../../src/fns.c:2856: Emacs fatal error: assertion failed: !FIXNUM_OVERFLOW_P (lisp_h_make_fixnum_n)
Fatal error 6: n
make[1]: *** [Makefile:817: bootstrap-emacs.pdmp] Aborted
make[1]: Leaving directory '/data/data/com.termux/files/home/projects/emacs/build/src'
make: *** [Makefile:424: src] Error 2
```
This (as well as the segfault) happens with both clang 9.0.1 and gcc 9.2.0.
Earlier in the build I also get the following warning several times, which might be related:
```
[...]
CC dispnew.o
In file included from ../../src/dispnew.c:29:
In file included from ../../src/termchar.h:23:
../../src/dispextern.h:1917:36: warning: signed shift result (0x3FFFFC00000) requires 43 bits to represent, but 'EMACS_INT' (aka 'int') only has 32 bits [-Wshift-overflow]
? ((EMACS_INT) MAX_FACE_ID << CHARACTERBITS) | MAX_CHAR
~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~
1 warning generated.
[...]
```
I have uploaded the full config.log and make output here:
https://grimler.se/emacs/config.log
https://grimler.se/emacs/make.log
If I remove --enable-checking=yes,glyphs it builds (I am sending this
bug report from that build) but gets segmentation faults every now and
then. Easiest way to trigger it is to scroll up and down in some file,
but it still happens randomly, maybe after 200 lines, maybe after 10
000.
Does anyone have any suggestions for how I can proceed debugging this?
Best regards,
Henrik Grimler
In GNU Emacs 27.0.60 (build 1, armv7l-unknown-linux-gnueabi)
of 2020-02-10 built on localhost
Repository revision: 06c302d425fc2093130479b8aed7da4507d43331
Repository branch: emacs-27
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Configured using:
'configure --enable-check-lisp-object-type --without-makeinfo
--without-selinux --prefix /data/data/com.termux/files/usr/local/
'CFLAGS=-O0 -g3 -gdwarf-4''
Configured features:
NOTIFY INOTIFY ACL GNUTLS LIBXML2 ZLIB MODULES THREADS PDUMPER LCMS2 GMP
Important settings:
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix
Major mode: Fundamental
Minor modes in effect:
show-paren-mode: t
tooltip-mode: t
global-eldoc-mode: t
electric-indent-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
column-number-mode: t
line-number-mode: t
transient-mark-mode: t
Load-path shadows:
None found.
Features:
(shadow regexp-opt sort mail-extr emacsbug message rmc puny dired
dired-loaddefs format-spec rfc822 mml mml-sec epa derived epg epg-config
gnus-util rmail rmail-loaddefs text-property-search time-date mm-decode
mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader
sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils image
term/xterm xterm edmacro kmacro tsdh-dark-theme paren finder-inf info
tool-bar package easymenu browse-url url-handlers url-parse auth-source
cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache json
subr-x map url-vars seq byte-opt gv bytecomp byte-compile cconv
cl-loaddefs cl-lib tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type tabulated-list replace newcomment text-mode elisp-mode
lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch
timer select mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads inotify lcms2 multi-tty make-network-process emacs)
Memory information:
((conses 8 77236 5768)
(symbols 24 9059 1)
(strings 16 27039 2319)
(string-bytes 1 881148)
(vectors 8 12044)
(vector-slots 4 135461 6406)
(floats 8 46 544)
(intervals 28 170 0)
(buffers 576 12))
* bug#39577: 27.0.60; Assertion failed during compilation
From: Eli Zaretskii @ 2020-02-13 14:57 UTC
To: Henrik Grimler; +Cc: 39577
> Date: Wed, 12 Feb 2020 08:39:58 +0100
> From: Henrik Grimler <henrik@grimler.se>
>
> ```
> ../configure --enable-checking=yes,glyphs \
> --enable-check-lisp-object-type \
> --without-makeinfo \
> --without-selinux \
> --prefix /data/data/com.termux/files/usr/local \
> CFLAGS="-O0 -g3 -gdwarf-4"
> ```
>
> but building the emacs-27 branch (commit 06c302d) this fails with:
>
> ```
> [...]
> Loading /data/data/com.termux/files/home/projects/emacs/lisp/emacs-lisp/syntax.el (source)...
> Loading /data/data/com.termux/files/home/projects/emacs/lisp/font-lock.el (source)...
> Loading /data/data/com.termux/files/home/projects/emacs/lisp/jit-lock.el (source)...
>
> ../../src/fns.c:2856: Emacs fatal error: assertion failed: !FIXNUM_OVERFLOW_P (lisp_h_make_fixnum_n)
> ```
This would mean that the values returned by getloadavg on that system
are preposterously large. Can you run the offending command under a
debugger, put a breakpoint on line 2856 of fns.c, and see what values
you get in the load_ave[] array?
> This (as well as the segfault) happens both if compiling with clang 9.0.1 and gcc 9.2.0.
> I get a warning earlier multiple times that might be related:
>
> ```
> [...]
> CC dispnew.o
> In file included from ../../src/dispnew.c:29:
> In file included from ../../src/termchar.h:23:
> ../../src/dispextern.h:1917:36: warning: signed shift result (0x3FFFFC00000) requires 43 bits to represent, but 'EMACS_INT' (aka 'int') only has 32 bits [-Wshift-overflow]
> ? ((EMACS_INT) MAX_FACE_ID << CHARACTERBITS) | MAX_CHAR
> ~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~
> 1 warning generated.
> ```
I think this warning is bogus: if your EMACS_INT is not wide
enough to hold MAX_FACE_ID shifted left by CHARACTERBITS bits, the code
will not perform the shift.
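To illustrate why the shift is never evaluated when the integer type is too narrow, here is a minimal C sketch of the guarded pattern. The DEMO_ constants are assumptions, chosen so that the shifted value matches the 0x3FFFFC00000 quoted in the warning. `glyph_code_limit` is a hypothetical helper, not code from dispextern.h; it takes the type maximum as a run-time argument so that both widths can be tried in one program.

```c
#include <stdint.h>

/* Assumed values for illustration only: 0xFFFFF << 22 == 0x3FFFFC00000. */
#define DEMO_CHARACTERBITS 22
#define DEMO_MAX_CHAR 0x3FFFFF
#define DEMO_MAX_FACE_ID 0xFFFFF

/* Return the largest packed glyph code that fits in a signed integer
   type whose maximum is TYPE_MAX.  The shift is only performed when
   the comparison proves that the result fits.  */
static int64_t
glyph_code_limit (int64_t type_max)
{
  if (DEMO_MAX_FACE_ID < (type_max >> DEMO_CHARACTERBITS))
    return ((int64_t) DEMO_MAX_FACE_ID << DEMO_CHARACTERBITS) | DEMO_MAX_CHAR;
  /* Too narrow: fall back to the type maximum, no shift happens.  */
  return type_max;
}
```

Called with INT32_MAX the comparison is false and the function returns INT32_MAX unchanged; called with INT64_MAX it returns the fully packed value. In the real macro the compiler presumably warns only because it still sees the constant shift in the branch that is not taken.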
> If I remove --enable-checking=yes,glyphs it builds (I am sending this
> bug report from that build) but gets segmentation faults every now and
> then. Easiest way to trigger it is to scroll up and down in some file,
> but it still happens randomly, maybe after 200 lines, maybe after 10
> 000.
Can you show a backtrace from the segfault?
* bug#39577: 27.0.60; Assertion failed during compilation
From: Henrik Grimler @ 2020-02-13 19:00 UTC
To: Eli Zaretskii; +Cc: 39577
Hi Eli,
On Thu, Feb 13, 2020 at 04:57:26PM +0200, Eli Zaretskii wrote:
> > Date: Wed, 12 Feb 2020 08:39:58 +0100
> > From: Henrik Grimler <henrik@grimler.se>
> >
> > ```
> > ../configure --enable-checking=yes,glyphs \
> > --enable-check-lisp-object-type \
> > --without-makeinfo \
> > --without-selinux \
> > --prefix /data/data/com.termux/files/usr/local \
> > CFLAGS="-O0 -g3 -gdwarf-4"
> > ```
> >
> > but building the emacs-27 branch (commit 06c302d) this fails with:
> >
> > ```
> > [...]
> > Loading /data/data/com.termux/files/home/projects/emacs/lisp/emacs-lisp/syntax.el (source)...
> > Loading /data/data/com.termux/files/home/projects/emacs/lisp/font-lock.el (source)...
> > Loading /data/data/com.termux/files/home/projects/emacs/lisp/jit-lock.el (source)...
> >
> > ../../src/fns.c:2856: Emacs fatal error: assertion failed: !FIXNUM_OVERFLOW_P (lisp_h_make_fixnum_n)
> > ```
>
> This would mean that the values returned by getloadavg on that system
> are preposterously large. Can you run the offending command under a
> debugger, put a breakpoint on line 2856 of fns.c, and see what values
> you get in the load_ave[] array?
It seems to be preposterously small:
```
Breakpoint 2, Fload_average (use_floats=XIL(0)) at ../../src/fns.c:2856
2856 ? make_fixnum (100.0 * load_ave[loads])
(gdb) print load_ave
$1 = {2.8900000000000001, 2.8752811112650786e-312, 2.7799999999999998}
```
This android version does not have getloadavg (so I guess
lib/getloadavg.c is used instead?)
> > If I remove --enable-checking=yes,glyphs it builds (I am sending this
> > bug report from that build) but gets segmentation faults every now and
> > then. Easiest way to trigger it is to scroll up and down in some file,
> > but it still happens randomly, maybe after 200 lines, maybe after 10
> > 000.
>
> Can you show a backtrace from the segfault?
After loading gdbinit from emacs src, starting emacs and scrolling up
and down a file a couple of times it crashes with:
```
Program received signal SIGSEGV, Segmentation fault.
0xb6995228 in sigsetjmp () from /system/lib/libc.so
```
A backtrace then unfortunately only shows:
```
#0 0xb6995228 in sigsetjmp () from /system/lib/libc.so
#1 0x62e31f80 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Program received signal SIGSEGV, Segmentation fault.
backtrace_top () at ../../src/eval.c:176
176 {
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(backtrace_top) will be abandoned.
When the function is done executing, GDB will silently stop.
```
I am fairly inexperienced with gdb, so please let me know if there is
anything else I can try.
It also seems that the segfault does not happen if running inside
tmux.
* bug#39577: 27.0.60; Assertion failed during compilation
From: Eli Zaretskii @ 2020-02-13 19:23 UTC
To: Henrik Grimler; +Cc: 39577
> Date: Thu, 13 Feb 2020 20:00:16 +0100
> From: Henrik Grimler <hgrimler@kth.se>
> Cc: 39577@debbugs.gnu.org
>
> > > ../../src/fns.c:2856: Emacs fatal error: assertion failed: !FIXNUM_OVERFLOW_P (lisp_h_make_fixnum_n)
> >
> > This would mean that the values returned by getloadavg on that system
> > are preposterously large. Can you run the offending command under a
> > debugger, put a breakpoint on line 2856 of fns.c, and see what values
> > you get in the load_ave[] array?
>
> It seems to be preposterously small:
>
> ```
> Breakpoint 2, Fload_average (use_floats=XIL(0)) at ../../src/fns.c:2856
> 2856 ? make_fixnum (100.0 * load_ave[loads])
> (gdb) print load_ave
> $1 = {2.8900000000000001, 2.8752811112650786e-312, 2.7799999999999998}
> ```
>
> This android version does not have getloadavg (so I guess
> lib/getloadavg.c is used instead?)
Looks like a bug in getloadavg, but you should be fine replacing that
small value with zero.
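The reported 2.8752811112650786e-312 lies below the smallest normal double, i.e. it is a subnormal value. A minimal sketch of the suggested replacement with zero could look like this; `sanitize_load` is a hypothetical helper for illustration, not something present in fns.c or lib/getloadavg.c.

```c
#include <float.h>

/* Hypothetical helper: report a load of zero when the reading is
   too small to be a normal double.  DBL_MIN is the smallest positive
   normal double, so any magnitude below it is zero or subnormal.
   A NaN compares false here and is passed through unchanged.  */
static double
sanitize_load (double load)
{
  double magnitude = load < 0 ? -load : load;
  if (magnitude < DBL_MIN)
    return 0.0;
  return load;
}
```

This only hides the symptom; it does not explain why the middle element of load_ave[] holds such a value in the first place.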
> > Can you show a backtrace from the segfault?
>
> After loading gdbinit from emacs src, starting emacs and scrolling up
> and down a file a couple of times it crashes with:
>
> ```
> Program received signal SIGSEGV, Segmentation fault.
> 0xb6995228 in sigsetjmp () from /system/lib/libc.so
> ```
>
> A backtrace then unfortunately only shows:
>
> ```
> #0 0xb6995228 in sigsetjmp () from /system/lib/libc.so
> #1 0x62e31f80 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> ```
Sounds like sigsetjmp is buggy on that platform?
> It also seems that the segfault does not happen if running inside
> tmux.
Does "inside tmux" mean you run a -nw session?
* bug#39577: 27.0.60; Assertion failed during compilation
From: Henrik Grimler @ 2020-02-13 20:04 UTC
To: Eli Zaretskii; +Cc: 39577
> > > Can you show a backtrace from the segfault?
> >
> > After loading gdbinit from emacs src, starting emacs and scrolling up
> > and down a file a couple of times it crashes with:
> >
> > ```
> > Program received signal SIGSEGV, Segmentation fault.
> > 0xb6995228 in sigsetjmp () from /system/lib/libc.so
> > ```
> >
> > A backtrace then unfortunately only shows:
> >
> > ```
> > #0 0xb6995228 in sigsetjmp () from /system/lib/libc.so
> > #1 0x62e31f80 in ?? ()
> > Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> > ```
>
> Sounds like sigsetjmp is buggy on that platform?
Could be. I have not seen any related bug reports or similar issues in
other programs, but Android on ARM is fairly rare these days. I am hoping
to find a workaround on the Emacs side, as I cannot modify the system
libraries (not easily, anyway). I suppose I need to learn more about
sigsetjmp and friends to have a chance at that.
> > It also seems that the segfault does not happen if running inside
> > tmux.
>
> Does "inside tmux" mean you run a -nw session?
Yes, sorry for not being clear. I am running inside a terminal
emulator for android called termux, all programs are linked against
android's libc, bionic. Compilers and other tools needed for compiling
emacs are available, but there is no x11 so emacs can only be run in -nw
mode.
* bug#39577: 27.0.60; Assertion failed during compilation
From: Paul Eggert @ 2020-02-17 20:53 UTC
To: Henrik Grimler; +Cc: 39577
[-- Attachment #1: Type: text/plain, Size: 717 bytes --]
I installed the attached patch into master, to work around the
getloadavg-related assertion failure. However, I don't think this fixes the
actual bug.
> This android version does not have getloadavg (so I guess
> lib/getloadavg.c is used instead?)
If so, you should be able to step through the replacement getloadavg and see why
it's reporting bogus values. I have the sneaking suspicion that floating point
isn't working properly, and that it's treating tiny numbers as NaNs or vice
versa. But this bug is relatively unimportant.
The main problem here seems to be the sigsetjmp-related bug. You might try
putting a breakpoint on handle_sigsegv before running Emacs; that might give you
a better backtrace.
[-- Attachment #2: 0001-Avoid-unlikely-load-average-bug.patch --]
[-- Type: text/x-patch, Size: 799 bytes --]
From 121f9bb14ab0abe618cabd24bd25ed328e36891c Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 17 Feb 2020 12:44:10 -0800
Subject: [PATCH] Avoid unlikely load-average bug
* src/fns.c (Fload_average): Do not crash or return nonsense
if the load average exceeds most-positive-fixnum/100 (Bug#39577).
---
src/fns.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/fns.c b/src/fns.c
index 436ef1c7b7..80012fa9d2 100644
--- a/src/fns.c
+++ b/src/fns.c
@@ -2843,7 +2843,7 @@ advisable. */)
while (loads-- > 0)
{
Lisp_Object load = (NILP (use_floats)
- ? make_fixnum (100.0 * load_ave[loads])
+ ? double_to_integer (100.0 * load_ave[loads])
: make_float (load_ave[loads]));
ret = Fcons (load, ret);
}
--
2.17.1
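The sketch below is not the double_to_integer function used in the patch. It only illustrates the kind of range check that keeps a scaled load average from tripping a fixnum-overflow assertion. The DEMO_ limits are assumptions (a 30-bit fixnum range, as one might expect on a 32-bit build), and `scaled_load_fits` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed fixnum limits for a 32-bit build; illustration only.  */
#define DEMO_MOST_POSITIVE_FIXNUM 536870911
#define DEMO_MOST_NEGATIVE_FIXNUM (-536870912)

/* Scale LOAD by 100 and store the truncated result in *OUT.
   Return false, leaving *OUT untouched, when the scaled value is
   NaN, infinite, or outside the assumed fixnum range.  */
static bool
scaled_load_fits (double load, int32_t *out)
{
  double scaled = 100.0 * load;
  /* Written so that a NaN fails the test: every comparison with
     NaN is false.  */
  if (!(scaled >= DEMO_MOST_NEGATIVE_FIXNUM
        && scaled <= DEMO_MOST_POSITIVE_FIXNUM))
    return false;
  *out = (int32_t) scaled;
  return true;
}
```

A subnormal reading such as the one reported earlier in the thread passes this check and truncates to 0, so a check of this kind does not by itself show which value triggered the original assertion.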
* bug#39577: 27.0.60; Assertion failed during compilation
From: Henrik Grimler @ 2020-02-18 15:49 UTC
To: Paul Eggert; +Cc: 39577
On Mon, 2020-02-17 at 12:53 -0800, Paul Eggert wrote:
> I installed the attached patch into master, to work around the
> getloadavg-related assertion failure. However, I don't think this
> fixes the
> actual bug.
Thanks, will try a new build tonight.
> > This android version does not have getloadavg (so I guess
> > lib/getloadavg.c is used instead?)
>
> If so, you should be able to step through the replacement getloadavg
> and see why
> it's reporting bogus values. I have the sneaking suspicion that
> floating point
> isn't working properly, and that it's treating tiny numbers as NaNs
> or vice
> versa. But this bug is relatively unimportant.
Yeah, I will investigate it more when I have some time and report back
here.
> The main problem here seems to be the sigsetjmp-related bug. You
> might try
> putting a breakpoint on handle_sigsegv before running Emacs; that
> might give you
> a better backtrace.
After Eli suggested that the problem is indeed in the sigsetjmp
function I configured emacs with
```
emacs_cv_func__setjmp=no
emacs_cv_func_sigsetjmp=no
```
and it seems to have helped (5 days without segfaults now). Setting
only one of the two does not help. This seems like an acceptable
workaround in my case, but maybe it causes some other side effects I am
yet to encounter(?).
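One way to picture why both cache variables might matter is a three-way fallback between _setjmp, sigsetjmp and plain setjmp, sketched below. This is an assumption about the shape of such a selection, not a copy of Emacs's code: the DEMO_HAVE_* macros and demo_* names are hypothetical stand-ins. With neither macro defined, as when both configure checks are forced to `no`, the sketch falls through to ISO C setjmp/longjmp.

```c
#include <setjmp.h>

#if defined DEMO_HAVE__SETJMP
typedef jmp_buf demo_jmp_buf;
# define demo_setjmp(j) _setjmp (j)
# define demo_longjmp(j, v) _longjmp (j, v)
#elif defined DEMO_HAVE_SIGSETJMP
typedef sigjmp_buf demo_jmp_buf;
# define demo_setjmp(j) sigsetjmp (j, 0)
# define demo_longjmp(j, v) siglongjmp (j, v)
#else
typedef jmp_buf demo_jmp_buf;
# define demo_setjmp(j) setjmp (j)
# define demo_longjmp(j, v) longjmp (j, v)
#endif

static demo_jmp_buf demo_env;
static volatile int demo_code;

/* Record CODE and unwind to the saved context.  */
static void
demo_fail (int code)
{
  demo_code = code;
  demo_longjmp (demo_env, 1);
}

/* Save a context, trigger a non-local exit, and return the code
   recorded by demo_fail.  */
static int
demo_roundtrip (int code)
{
  if (demo_setjmp (demo_env) != 0)
    return demo_code;
  demo_fail (code);
  return 0; /* not reached */
}
```

Under a chain like this, disabling only the first check would select sigsetjmp and disabling only the second would leave _setjmp in use, which would be consistent with the observation that setting just one variable does not help.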
Thanks for the hint about the breakpoint on handle_sigsegv, I will see if I
can learn more about what is actually happening.
* bug#39577: 27.0.60; Assertion failed during compilation
From: Stefan Kangas @ 2020-09-01 17:25 UTC
To: Henrik Grimler; +Cc: 39577, Paul Eggert
Henrik Grimler <henrik@grimler.se> writes:
> On Mon, 2020-02-17 at 12:53 -0800, Paul Eggert wrote:
>> I installed the attached patch into master, to work around the
>> getloadavg-related assertion failure. However, I don't think this
>> fixes the
>> actual bug.
>
> Thanks, will try a new build tonight.
>
>> > This android version does not have getloadavg (so I guess
>> > lib/getloadavg.c is used instead?)
>>
>> If so, you should be able to step through the replacement getloadavg
>> and see why
>> it's reporting bogus values. I have the sneaking suspicion that
>> floating point
>> isn't working properly, and that it's treating tiny numbers as NaNs
>> or vice
>> versa. But this bug is relatively unimportant.
>
> Yeah, I will investigate it more when I have some time and report back
> here.
>
>> The main problem here seems to be the sigsetjmp-related bug. You
>> might try
>> putting a breakpoint on handle_sigsegv before running Emacs; that
>> might give you
>> a better backtrace.
>
> After Eli suggested that the problem is indeed in the sigsetjmp
> function I configured emacs with
>
> ```
> emacs_cv_func__setjmp=no
> emacs_cv_func_sigsetjmp=no
> ```
>
> and it seems to have helped (5 days without segfaults now). Setting
> only one of the two does not help. This seems like an acceptable
> workaround in my case, but maybe it causes some other side effects I am
> yet to encounter(?).
>
> Thanks for the hint about breakpoint on handle_sigsegv, I will see if I
> can learn more about what is actually happening.
(That was 28 weeks ago.)
Any news here? Did you find anything out?
Thanks in advance.
* bug#39577: 27.0.60; Assertion failed during compilation
From: Lars Ingebrigtsen @ 2020-10-01 19:09 UTC
To: Stefan Kangas; +Cc: 39577, Paul Eggert, Henrik Grimler
Stefan Kangas <stefan@marxist.se> writes:
> (That was 28 weeks ago.)
>
> Any news here? Did you find anything out?
And this was four weeks ago, so it seems unlikely that there'll be
further progress in this bug report, and I'm closing it.
If the problem persists, please respond to the debbugs address and we'll
reopen.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no