From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#20862: 25.0.50; 32-bit Emacs configured --with-wide-int miscompiles CL
Date: Thu, 25 Jun 2015 17:30:16 +0300
Message-ID: <83h9pveu87.fsf@gnu.org>
References: <558B75FE.3010806@cs.ucla.edu>
In-reply-to: <558B75FE.3010806@cs.ucla.edu>
To: Paul Eggert
Cc: 20862@debbugs.gnu.org

> Date: Wed, 24 Jun 2015 20:31:10 -0700
> From: Paul Eggert
> CC: 20862@debbugs.gnu.org
>
> Thanks for reporting that. It appears to be a bug in the garbage collector, and is likely to be hard to reproduce. I couldn't reproduce it, but I did a 'make bootstrap' on Fedora x86-64 (configured with --with-wide-int and compiled with gcc -m32 so it's really x86), and got a core dump in a completely different area that (of course!) went away when I compiled without optimization. Do you remember which Lisp file was being compiled when you got a core dump?

I had something similar while trying to debug this: crashes while compiling cedet/srecode/proj-obj.el and sometimes also ibuffer.el. And yes, it's a very elusive crash: it only happens during a full bootstrap, and even repeating the exact same bootstrap in a different directory makes it disappear! But I was luckier than you, in that I did succeed in reproducing these crashes in an unoptimized build.

By slowly and painfully tracking these crashes, I found out that they are due to an un-interned symbol whose name is "THIS", created by this form:

  (let ((print-gensym nil) (print-quoted t))
    (format "%S" (cons 'fn (cl--make-usage-args orig-args))))

That symbol gets GC'ed (I see "DEAD" in its function cell) while it's still alive, and then Emacs crashes trying to print that symbol's name in the call to 'format', because GC recycles that symbol's name and replaces it with a NULL pointer. So your analysis:

> Rather than try to debug it directly, I thought about what might have caused the problem, and re-audited the garbage collector with the recent Qnil==0 changes in mind. This did uncover a bug, and the attached patch (which we will need anyway) allowed me to do a "make bootstrap" successfully in the same configuration.

and the changes in the patch related to symbols-as-offsets and their alignment on the stack, make perfect sense to me, because they explain why stack marking didn't mark this symbol, and thus allowed it to be GC'ed.
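
To illustrate only the alignment part of that explanation, here is a toy sketch in Python. It is not the code from the collector or from the patch, and it does not model symbols-as-offsets at all. It assumes a hypothetical 16-byte "stack" with 4-byte slots, a made-up 8-byte value standing in for the symbol, and a scanner (the `scan` helper is invented for this example) that reads one 8-byte value at every `stride` bytes. A value that sits on a 4-byte boundary is then invisible to a scan that only looks at 8-byte boundaries:

```python
import struct

WORD = 4   # assumed stack slot alignment on a 32-bit target
OBJ = 8    # assumed size of a wide (64-bit) Lisp value


def scan(stack: bytes, live: set, stride: int) -> set:
    """Return the live values seen when reading OBJ bytes every `stride` bytes."""
    found = set()
    for off in range(0, len(stack) - OBJ + 1, stride):
        (value,) = struct.unpack_from("<Q", stack, off)
        if value in live:
            found.add(value)
    return found


# Hypothetical 64-bit value standing in for the uninterned symbol.
SYMBOL = 0x0000_1234_0000_5678

# Toy stack: a 4-byte filler word, the 8-byte value at offset 4, more filler.
stack = (struct.pack("<I", 0xDEADBEEF)
         + struct.pack("<Q", SYMBOL)
         + struct.pack("<I", 0))

missed = scan(stack, {SYMBOL}, stride=OBJ)    # reads offsets 0 and 8 only
marked = scan(stack, {SYMBOL}, stride=WORD)   # reads offsets 0, 4 and 8

print("stride 8 finds the value:", SYMBOL in missed)
print("stride 4 finds the value:", SYMBOL in marked)
```

In this toy, the coarse scan straddles the value on both sides and never sees it whole, which is the kind of miss that would let a live object be collected.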
> I installed this into the master as commit 93f4f67ba93b78e8b31e498e8ce7bce4c8298b76; please give it a try in your setup when you have the time.

I did, and the crashes are gone, thanks. The cl-lib-tests also succeed.

However, there still seems to be some subtle problem, because the byte-compiled files don't all compare equal. (I've seen the same problem before your patches as well.) I used the following command to find the *.elc files that are different:

  diff -r -a -u -I"in Emacs version 25\.0\.50" ./lisp ../int32/lisp --exclude="*.el" --exclude="*.el~" | grep -a "^diff "

where "../int32/" is the directory where I built Emacs without "--with-wide-int". This reveals differences in the following files:

  cedet/semantic/texi.elc
  cedet/semantic/util.elc
  cedet/srecode/srt-wy.elc
  emacs-lisp/cl-generic.elc

Some of the differences are insignificant (different label numbers used, or different file offsets due to a longer Emacs version string), but others seem to be significant. For example, the byte code of cl--generic-struct-tag in cl-generic.elc has a few different bytes. Likewise with the byte code of semantic-texi-expand-tag in texi.elc, of semantic-something-to-tag-table in util.elc, and of srecode-template-wy--parse-table in srt-wy.elc.

The list of *.elc files that differ appears to depend on optimization level: the above list was obtained with -O0; compiling with -O1 leaves only cl-generic.elc and srt-wy.elc different, and compiling with -O2 brings util.elc back, and also adds differences in eshell/esh-proc.elc. Or maybe the actual factor is the specific order in which the files are compiled (I bootstrap with "make -j8"), which determines which other Lisp files are available as *.el or *.elc, because bootstrapping without parallel Make execution again leaves only cl-generic.elc and srt-wy.elc?

Do you see something similar on your system? How to go about debugging this?
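
For anyone who wants to repeat the comparison without diff, here is a rough Python sketch of the same idea. The helper names (`normalized`, `differing_elc`) and the toy trees it builds are made up for this example. Unlike diff's -I option, which ignores hunks whose changed lines all match, this simply drops every line that mentions the version string before comparing, so the results could differ in corner cases:

```python
import re
import tempfile
from pathlib import Path

# Same pattern as the -I argument in the diff command above.
VERSION_LINE = re.compile(rb"in Emacs version 25\.0\.50")


def normalized(path: Path) -> list:
    """Read a file as bytes and drop lines that mention the Emacs version."""
    lines = path.read_bytes().splitlines()
    return [ln for ln in lines if not VERSION_LINE.search(ln)]


def differing_elc(tree_a: Path, tree_b: Path) -> list:
    """List *.elc files (relative paths) present in both trees that differ."""
    result = []
    for a in sorted(tree_a.rglob("*.elc")):
        rel = a.relative_to(tree_a)
        b = tree_b / rel
        if b.is_file() and normalized(a) != normalized(b):
            result.append(rel.as_posix())
    return result


# Toy demonstration on two throwaway trees (not real build output).
with tempfile.TemporaryDirectory() as tmp:
    wide, narrow = Path(tmp, "wide"), Path(tmp, "int32")
    for tree in (wide, narrow):
        (tree / "emacs-lisp").mkdir(parents=True)
    # Only the version comment differs: should not be reported.
    (wide / "emacs-lisp/a.elc").write_bytes(b";;; in Emacs version 25.0.50.1\n(body)\n")
    (narrow / "emacs-lisp/a.elc").write_bytes(b";;; in Emacs version 25.0.50.2\n(body)\n")
    # The body differs: should be reported.
    (wide / "emacs-lisp/b.elc").write_bytes(b"(body 1)\n")
    (narrow / "emacs-lisp/b.elc").write_bytes(b"(body 2)\n")
    demo_result = differing_elc(wide, narrow)

print(demo_result)
```

On real build trees the call would be something like differing_elc(Path("./lisp"), Path("../int32/lisp")), with the same directory layout as in the diff command.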