From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#58042: 29.0.50; ASAN use-after-free in re_match_2_internal
Date: Wed, 05 Oct 2022 10:22:34 +0300
Message-ID: <83mtaau43p.fsf@gnu.org>
References: <83edvnv965.fsf@gnu.org> <83pmf6u76i.fsf@gnu.org>
In-Reply-To: (message from Gerd Möllmann on Wed, 05 Oct 2022 08:58:51 +0200)
To: Gerd Möllmann
Cc: 58042@debbugs.gnu.org

> From: Gerd Möllmann
> Cc: 58042@debbugs.gnu.org
> Date: Wed, 05 Oct 2022 08:58:51 +0200
>
> Eli Zaretskii writes:
>
> > The question that we should try answering is this: what variable holds
> > the C pointer to the data of a Lisp string that is being relocated
> > and/or compacted by GC between the time the C pointer is assigned and
> > the time its value is dereferenced?
>
> I think we can answer that question, at least with a good probability.
> If you look what the offending (I think) pointer points to:
>
>   frame #5: 0x0000000100582044 emacs`re_match_2_internal(bufp=0x000000010111ace8, string1=0x0000000000000000, size1=0, string2="/Users/gerd/.config/emacs.d.default/elpa/magit-section-20220901.331/puny.dylib", size2=78, pos=0, regs=0x0000000000000000, stop=78) at regex-emacs.c:4328:15
>      4325           DEBUG_PRINT ("EXECUTING anychar.\n");
>      4326
>      4327           PREFETCH ();
>   -> 4328           buf_ch = RE_STRING_CHAR_AND_LENGTH (d, buf_charlen,
>      4329                                               target_multibyte);
>      4330           buf_ch = TRANSLATE (buf_ch);
>      4331           if (buf_ch == '\n')
>   (lldb) p d
>   (re_char *) $285 = 0x000000011f90d0a1 "magit-section-20220901.331/puny.dylib"
>
> That looks like part of the filename here:
>
>   frame #10: 0x0000000100503cf4 emacs`Ffind_file_name_handler(filename=(struct Lisp_String *) $318 = 0x000000011f6ec4c0, operation=(struct Lisp_Symbol *) $321 = 0x00000001010ec310) at fileio.c:324:24
>      321            operations = Fget (handler, Qoperations);
>      322
>      323            if (STRINGP (string)
>   -> 324                && (match_pos = fast_string_match (string, filename)) > pos
>      325                && (NILP (operations) || ! NILP (Fmemq (operation, operations))))
>      326              {
>      327                Lisp_Object tem;
>   (lldb) p filename
>   (Lisp_Object) $322 = 0x000000011f6ec4c4 (struct Lisp_String *) $324 = 0x000000011f6ec4c0
>   (lldb) p *$324
>   (struct Lisp_String) $325 = {
>     u = {
>       s = {
>         size = 78
>         size_byte = -1
>         intervals = NULL
>         data = 0x000000011f5d2f38 "/Users/gerd/.config/emacs.d.default/elpa/magit-section-20220901.331/puny.dylib"
>       }
>       next = 0x000000000000004e
>       gcaligned = 'N'
>     }
>   }
>
> So, I'd say that the filename string data has been moved somewhere else
> during compaction. Which would mean GC somehow ran between the point
> where "d" in frame#5 was initially set up from the filename, and line
> 4328 where the problem is detected.

That part is clear, but the "GC somehow ran" part is not, and that is
the part which we must understand to fix the problem. The filename's
SSDATA is passed to re_search as a C string, under the assumption that
GC cannot happen while re_search runs. If that assumption is false,
we need to understand exactly how and in what cases, because without
that there's nothing we can do -- regex-emacs.c code deals explicitly
only with C strings.

IOW, this isn't the case like

   char *ptr = SSDATA (lisp_string);
   ...
   dereference (ptr);

where GC can happen as part of "...". Those cases are easy to fix.
But this is not that case.

> > I don't see how to answer
> > that question without understanding how redisplay was called in the
> > middle of what seems to be loading of a Lisp package, because none of
> > the items 1 and 3 show anything that could call redisplay.
>
> What I can see is that, apparently, redisplay got called because Emacs
> received a MacOS event, and did a prepare_menu_bars etc etc.

You mean, a macOS event can be received asynchronously, and will
interrupt some processing in C, like inside regex-emacs.c? If that
can happen, no code in Emacs is safe, ever.
I don't believe this is possible: we no longer process window-system
events asynchronously, AFAIK, and for this very reason. But maybe
macOS is different? In that case, either we should change the macOS
code to avoid doing that, or we should have some means of blocking
such "interrupts" around specific code fragments, akin to block_input.

> How that's possible, if it is, while Emacs is in between frame#10 and
> frame#5 I have not the slightest idea. And please note that this is all
> happening in the same thread T0, according to ASAN.

Yes, I've seen that it's the same thread. Having redisplay run from
another thread would be a larger disaster.

> Maybe someone knowing the Mac port has an idea if this can happen?

I hope so.
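
For readers less familiar with the "easy" case mentioned above (a raw
char * taken from a Lisp string and dereferenced after something that
may trigger GC), here is a minimal, self-contained C sketch. It is a
toy model and not Emacs code: toy_string, toy_make, toy_compact and
toy_char_after_gc are hypothetical names invented for illustration,
and toy_compact merely stands in for the collector relocating string
data. It shows the usual remedy, which is to keep an offset instead
of an address and re-derive the pointer afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Lisp string whose data block may be
   relocated by a compacting collector.  Not the Emacs definition.  */
struct toy_string
{
  size_t size;
  char *data;
};

/* Allocate a toy string holding a copy of S.  */
static struct toy_string
toy_make (const char *s)
{
  struct toy_string str;
  str.size = strlen (s);
  str.data = malloc (str.size + 1);
  if (!str.data)
    abort ();
  memcpy (str.data, s, str.size + 1);
  return str;
}

/* Simulate compaction: move the data to a fresh block and release the
   old one.  Any raw pointer into the old block dangles afterwards.  */
static void
toy_compact (struct toy_string *str)
{
  char *fresh = malloc (str->size + 1);
  if (!fresh)
    abort ();
  memcpy (fresh, str->data, str->size + 1);
  free (str->data);
  str->data = fresh;
}

/* Return the character at OFFSET even though the data is relocated in
   between.  The raw pointer is turned into an offset before the call
   that may move the data, and re-derived from the string afterwards.  */
static char
toy_char_after_gc (struct toy_string *str, size_t offset)
{
  assert (offset <= str->size);

  const char *d = str->data + offset;   /* valid only until compaction */
  ptrdiff_t saved = d - str->data;      /* a position, not an address */

  toy_compact (str);                    /* plays the role of "..." */

  d = str->data + saved;                /* re-derive from the new block */
  return *d;
}
```

With a string made by toy_make ("/elpa/puny.dylib"), calling
toy_char_after_gc (&s, 6) yields 'p' although the data block moved in
between. The case in this bug is harder precisely because the pointer
lives inside the regex matcher, which only sees a C string and has no
Lisp object from which to re-derive it.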