From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#34655: 26.1.92; Segfault in module with --module-assertions
Date: Mon, 18 Mar 2019 18:21:25 +0200
Message-ID: <835zsgw3ui.fsf@gnu.org>
References: <874l8r1t3a.fsf@tcd.ie> <8336oamu3y.fsf@gnu.org> <87h8c1cv6l.fsf@tcd.ie> <83lg1dwhse.fsf@gnu.org> <87va0h12js.fsf@tcd.ie>
In-reply-to: <87va0h12js.fsf@tcd.ie> (contovob@tcd.ie)
To: "Basil L. Contovounesios"
Cc: 34655@debbugs.gnu.org, p.stephani2@gmail.com
Xref: news.gmane.org gmane.emacs.bugs:156453

> From: "Basil L. Contovounesios"
> Cc: <34655@debbugs.gnu.org>,
> Date: Sun, 17 Mar 2019 23:52:55 +0000
>
> > I tried at the time to reproduce your problem, and failed. But I did
> > that on Windows, where I needed to replace the non-existent realpath
> > by an equivalent function, so it's not a faithful reproduction. I
> > will see if I can find time to look at this on a GNU machine, unless
> > someone beats me to it.
>
> Replacing 'canonicalize_file_name' with 'strdup' still reproduces the
> issue for me. Perhaps increasing the number of calls to
> realpath-truename from 1000 to 5000 will also help.

Right, the strdup part did that for me. (My previous attempt also
used strdup as part of the replacement, but still failed to reproduce.
Memory corruption bugs are frequently weird that way.)

So I modified your recipe slightly, like this:

  (progn
    (module-load "/path/to/realpath.so")
    (setq garbage-collection-messages t)
    (dotimes (i 5000)
      (message "%d" i)
      (realpath-truename user-emacs-directory)))

put it in a file named rp.el, and then ran it under GDB:

  (gdb) r -batch --module-assertions -l ./rp.el

Here's what I see:

  [...]
  3077
  3078
  3079
  3080
  Garbage collecting...
  Garbage collecting...done

  Thread 1 received signal SIGSEGV, Segmentation fault.
  0x011e9918 in rpl_re_search_2 (bufp=0x17c0260 , str1=0x0, size1=0,
      str2=0x0, size2=20, startpos=0, range=20, regs=0x0, stop=20)
      at regex-emacs.c:3322
  3322 buf_ch = STRING_CHAR_AND_LENGTH (d, buf_charlen)

Looks like it indeed crashes after a GC, and on my system needs more
than 3000 iterations. So let's run it with a breakpoint at the
beginning of GC:

  (gdb) break alloc.c:6044
  Breakpoint 3 at 0x11fa1bb: file alloc.c, line 6044.
  (gdb) r -batch --module-assertions -l ./rp.el
  [...]
  3080

  Thread 1 hit Breakpoint 3, garbage_collect_1 (gcst=0x82bb84) at alloc.c:6044
  6044 record_in_backtrace (QAutomatic_GC, 0, 0);

The backtrace at this point:

  (gdb) bt
  #0 garbage_collect_1 (gcst=0x82bb84) at alloc.c:6044
  #1 0x011fa88e in garbage_collect () at alloc.c:6241
  #2 0x01149adc in maybe_gc () at lisp.h:5028
  #3 0x0123b012 in Ffuncall (nargs=2, args=0x82bcb0) at eval.c:2829
  #4 0x0128c3cf in module_funcall (env=0x6122730, fun=0x6122868, nargs=1,
      args=0x82bd50) at emacs-module.c:483
  #5 0x62d8136f in rp_funcall (env=0x6122730, value=0x82bd50,
      name=0x62d83060 "directory-name-p", nargs=1, args=0x82bd50)
      at realpath.c:62
  #6 0x62d815cc in Frealpath_truename (env=0x6122730, nargs=1,
      args=0x82bd90, data=0x0) at realpath.c:124
  [...]

  Lisp Backtrace:
  "directory-name-p" (0x82bcb8)
  "realpath-truename" (0x82bf80)
  "while" (0x82c2c8)
  "let" (0x82c538)
  "eval-buffer" (0x82cab0)
  "load-with-code-conversion" (0x82d0f0)
  "load" (0x82d9b8)
  "command-line-1" (0x82e3d0)
  "command-line" (0x82efe8)
  "normal-top-level" (0x82f690)

As you see, the call to Ffuncall is the one that triggers GC from time
to time. What happens with our 'file' at this point?

  (gdb) fr 6
  (gdb) p file
  $1 = (emacs_value) 0x6122848
  (gdb) p *file
  $2 =
  (gdb) p *(Lisp_Object *)file
  $3 = XIL(0x8000000006121ed0)
  (gdb) xtype
  Lisp_String
  (gdb) xstring
  $4 = (struct Lisp_String *) 0x6121ed0
  "d:/usr/eli/.emacs.d/"

Still valid. Now let's see who trashes it:

  (gdb) p *$4
  $5 = {
    u = {
      s = {
        size = 20,
        size_byte = 20,
        intervals = 0x0,
        data = 0x611e9fc "d:/usr/eli/.emacs.d/"
      },
      next = 0x14,
      gcaligned = 20 '\024'
    }
  }
  (gdb) watch -l $4->u.s.data
  Hardware watchpoint 4: -location $4->u.s.data
  (gdb) c
  Continuing.
  Garbage collecting...

  Thread 1 hit Hardware watchpoint 4: -location $4->u.s.data

  Old value = (unsigned char *) 0x611e9fc "\024"
  New value = (unsigned char *) 0x0
  sweep_strings () at alloc.c:2163
  2163 NEXT_FREE_LISP_STRING (s) = string_free_list;
  (gdb) list
  2158 /* Reset the strings's `data' member so that we
  2159 know it's free. */
  2160 s->u.s.data = NULL;
  2161
  2162 /* Put the string on the free-list. */
  2163 NEXT_FREE_LISP_STRING (s) = string_free_list;
  2164 string_free_list = ptr_bounds_clip (s, sizeof *s);
  2165 ++nfree;
  2166 }
  2167 }

Bingo! This is sweep_strings freeing our string, because it evidently
doesn't think it's a Lisp object that is still being referenced.

The culprit is this fragment from emacs-module.c, which is called each
time you receive a Lisp object from Emacs which you want to use in
your module:

  /* Convert O to an emacs_value. Allocate storage if needed; this can
     signal if memory is exhausted. Must be an injective function. */
  static emacs_value
  lisp_to_value (emacs_env *env, Lisp_Object o)
  {
    if (module_assertions)
      {
        /* Add the new value to the list of values allocated from this
           environment. The value is actually a pointer to the
           Lisp_Object cast to emacs_value. We make a copy of the
           object on the free store to guarantee unique addresses. */
        ATTRIBUTE_MAY_ALIAS Lisp_Object *optr = xmalloc (sizeof o);
        *optr = o;
        void *vptr = optr;
        ATTRIBUTE_MAY_ALIAS emacs_value ret = vptr;
        struct emacs_env_private *priv = env->private_members;
        priv->values = Fcons (make_mint_ptr (ret), priv->values);
        return ret;
      }

What this does is make a copy of each Lisp object you get from Emacs,
store that copy in memory allocated off the heap, and hand your module
a pointer to the copy instead of the original object. So when you
call, e.g., directory-name-p, an Emacs function, it gets that copy of
the object. But memory allocation by xmalloc doesn't record the
allocated memory in the red-black tree we maintain for the purposes of
detecting Lisp objects referenced by C stack-based variables. So when
GC comes to examine the C stack, it doesn't consider the variable
'file' in your module as being a pointer to a live Lisp object, and so
it doesn't mark it. Then the sweep phase of GC recycles your Lisp
object, which in this case involves setting the string's data to a
NULL pointer.
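
To make the failure mode concrete, here is a toy model in C. It is
only a sketch and not Emacs code: toy_string, toy_gc and
toy_string_survives are made-up names, and the "collector" is a
trivial mark-and-sweep over a one-element pool. It shows that an
object whose only reference sits in a malloc'd box is swept unless
the collector is told to look inside that box.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a Lisp string (hypothetical, not struct Lisp_String).  */
struct toy_string
{
  const char *data;   /* Reset to NULL by the sweep when the string is freed.  */
  bool marked;
};

/* Toy collector: mark every string in ROOTS, then sweep POOL.  Anything
   left unmarked gets its data reset to NULL, much like sweep_strings.  */
static void
toy_gc (struct toy_string **roots, size_t nroots,
        struct toy_string *pool, size_t npool)
{
  for (size_t i = 0; i < nroots; i++)
    roots[i]->marked = true;
  for (size_t i = 0; i < npool; i++)
    {
      if (!pool[i].marked)
        pool[i].data = NULL;
      pool[i].marked = false;
    }
}

/* Return true if a string whose only reference lives in a malloc'd box
   survives one collection.  BOX_IS_TRACED says whether the collector is
   told to follow the box (an assumption of this model).  */
bool
toy_string_survives (bool box_is_traced)
{
  struct toy_string pool[1] = { { "some string data", false } };

  /* The value handed to the module: a heap copy of the reference.  */
  struct toy_string **box = malloc (sizeof *box);
  assert (box != NULL);
  *box = &pool[0];

  struct toy_string *roots[1] = { NULL };
  size_t nroots = 0;
  if (box_is_traced)
    roots[nroots++] = *box;

  toy_gc (roots, nroots, pool, 1);

  bool alive = pool[0].data != NULL;
  free (box);
  return alive;
}
```

Calling toy_string_survives (false) corresponds to the situation
described above: the box exists, but nothing marks what it points to,
so the string's data ends up NULL.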

The patch to fix this is below; it simply marks these copied values by
hand, thus preventing them from being GCed. It ran successfully with
even 50,000 iterations.

Philipp, any comments/objections?

--- src/emacs-module.c~0 2019-02-25 10:18:35.000000000 +0200
+++ src/emacs-module.c 2019-03-18 08:33:00.564476000 +0200
@@ -1133,6 +1133,15 @@ mark_modules (void)
       mark_object (priv->non_local_exit_symbol);
       mark_object (priv->non_local_exit_data);
       mark_object (priv->values);
+      if (module_assertions)
+        {
+          for (Lisp_Object values = priv->values;
+               CONSP (values); values = XCDR (values))
+            {
+              Lisp_Object *p = xmint_pointer (XCAR (values));
+              mark_object (*p);
+            }
+        }
     }
 }
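
The effect of the added loop can be sketched with a self-contained toy
model in C. This is an illustration under simplifying assumptions,
not the Emacs implementation: toy_object, toy_cell, toy_push,
toy_mark_values and toy_count_survivors are hypothetical names, a
plain linked list stands in for priv->values, and each list cell owns
a malloc'd box holding the reference. Walking the list and marking
what each box points to is what keeps the objects alive across the
sweep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy heap object (hypothetical stand-in for a Lisp object).  */
struct toy_object
{
  bool marked;
  bool live;
};

/* One cell per value handed out; BOX is the heap copy of the reference,
   playing the role of the xmalloc'd Lisp_Object in lisp_to_value.  */
struct toy_cell
{
  struct toy_object **box;
  struct toy_cell *next;
};

static struct toy_cell *
toy_push (struct toy_cell *list, struct toy_object *obj)
{
  struct toy_cell *cell = malloc (sizeof *cell);
  struct toy_object **box = malloc (sizeof *box);
  assert (cell != NULL && box != NULL);
  *box = obj;
  cell->box = box;
  cell->next = list;
  return cell;
}

/* Analogue of the loop added to mark_modules: follow each box and mark
   the object it points to.  */
static void
toy_mark_values (struct toy_cell *list)
{
  for (struct toy_cell *c = list; c != NULL; c = c->next)
    (*c->box)->marked = true;
}

/* Hand out N objects through boxes, run one mark/sweep cycle, and
   return how many objects are still live afterwards.  */
size_t
toy_count_survivors (size_t n, bool mark_boxes)
{
  if (n == 0)
    return 0;
  struct toy_object *pool = calloc (n, sizeof *pool);
  assert (pool != NULL);

  struct toy_cell *values = NULL;
  for (size_t i = 0; i < n; i++)
    {
      pool[i].live = true;
      values = toy_push (values, &pool[i]);
    }

  if (mark_boxes)
    toy_mark_values (values);

  size_t survivors = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (!pool[i].marked)
        pool[i].live = false;   /* Swept.  */
      pool[i].marked = false;
      if (pool[i].live)
        survivors++;
    }

  while (values != NULL)
    {
      struct toy_cell *next = values->next;
      free (values->box);
      free (values);
      values = next;
    }
  free (pool);
  return survivors;
}
```

With mark_boxes set, every object handed out survives; without it,
all of them are swept, which is the toy equivalent of the crash seen
before the patch.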