From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pip Cet via "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Newsgroups: gmane.emacs.bugs
Subject: bug#75322: SAFE_ALLOCA assumed to root Lisp_Objects/SSDATA(string)
Date: Sat, 04 Jan 2025 21:15:50 +0000
Message-ID: <87ikque0xp.fsf@protonmail.com>
References: <87jzbbke6u.fsf@protonmail.com> <86sepzf4h3.fsf@gnu.org>
 <87a5c6j0qn.fsf@protonmail.com> <86jzbad735.fsf@gnu.org>
 <877c7aha9n.fsf@protonmail.com> <86y0zqbgot.fsf@gnu.org>
 <87ttaee5qp.fsf@protonmail.com> <86a5c6b9sb.fsf@gnu.org>
In-Reply-To: <86a5c6b9sb.fsf@gnu.org>
Reply-To: Pip Cet
To: Eli Zaretskii
Cc: gerd.moellmann@gmail.com, 75322@debbugs.gnu.org

"Eli Zaretskii" writes:

>> > The safe thing is to re-initialize the pointer from the string
>> > whenever we think GC could happen, and otherwise make sure GC cannot
>> > happen.
>>
>> For the old GC, yes.  For the new GC, the safe thing would be to ensure
>> MPS has removed all memory barriers, take the arena lock, then call the
>> syscall and return.  Or use xstrdup.
>
> If this is indeed so (and I don't think it is), then we need to

Which part do you think is wrong?

> discuss it very thoroughly, because it basically means we cannot do
> anything with Lisp strings in C.  For example, the display iterator
> has a member that is a Lisp string, which is used when we produce
> glyphs for display from a string (such as a display property or an
> overlay before/after string) -- what you say here basically means that
> we cannot carry that string around, right?  If not, how is that
> different?

Of course we can use Lisp strings.
As long as there's an automatic variable pointing to the string data,
it'll stay there.  If there's a static variable pointing to the string
data, and GC knows about that variable, the data might move, but then
the static variable will be updated.

>> >> If a pointer to "old" data is ever exposed to Emacs, we lose, because
>> >> MPS will reuse the memory for new data, which might be behind a barrier.
>> >>
>> >> If we ever do:
>> >>
>> >>   static Lisp_Object unmarked;
>>      ^^^^^^
>> >>   unmarked = string;
>> >>   ... trigger GC here ...
>> >>   puts (SDATA (unmarked));
>> >>
>> >> the most likely outcome (thanks to Gerd for the hint) is that
>> >> nonsensical data is printed
>> >
>> > Are you sure?
>>
>> With the static keyword, yes.  (Assuming we don't staticpro the static
>> variable, of course).
>
> What does static have to do with this?

What matters is whether the value is visible to the garbage collector
(in which case it remains a valid pointer) or isn't (in which case the
memory it points to is used for something else).

Automatic variables, residing on the stack or in a register (which is
then spilled to the stack), protect the memory they point to.  Static
variables don't unless we tell GC about them.

> What matters is the value, not the storage.

I have no idea what you mean by that.  The value of the variable is a
tagged pointer.  It won't change during GC, because GC never alters
automatic variables.  The question is whether this pointer still points
to the right data area after GC.

Unless there happens to be another ambiguous reference to the string
data (which means MPS cannot move the string data, because it cannot
alter ambiguous references), an unprotected static Lisp_Object will most
likely point to invalid data after MPS GC runs.

> The value comes from 'string', a different variable.  It

Which might be in a register and not survive until GC is triggered.
Or it might be a global/static variable which is in an exact root, which
means the data can be moved: 'string' will be updated, 'unmarked' won't.

> points to string data, and it's that string data that is of interest
> to us in the above snippet.

>> > The below is indeed unsafe:
>> >
>> >   char *p = SDATA (unmarked);
>> >   ... trigger GC here ...
>> >   puts (p);
>>
>> Right now, that's safe for MPS, but not for the old GC, correct?
>
> If GC moves string data, then it is not safe, period.  Does MPS move
> string data?

MPS does not move string data if there is a stack variable pointing to
it.  It does in other cases.  This is why it's safe for MPS.  The old GC,
IIUC, is less forgiving.

>> >> > To clarify, I was trying to understand whether the error message
>> >> > reported by Ihor in another thread could have happened because of GC
>> >> > in this area of the code.
>> >>
>> >> I currently think that Ihor's test case calls execve () with nonsensical
>> >> "environment" strings a lot, and once in a while they'll even be behind
>> >> the barrier which causes an EFAULT.
>> >
>> > Before we agree that this could be the reason, we'd need to find the
>> > places where GC could reuse the memory of a live string, while that
>> > string appears in some live data structure, and as long as no GC can
>> > happen between the time we put the SDATA pointers into the environment
>> > array and the time posix_spawn returns.
>>
>> Calling igc_collect () right before the spawn results in corrupt data:
>
> But the code doesn't call igc_collect, so how is this relevant?

It is all that's needed to establish that the current code is not safe
for the intended behavior of scratch/igc, which is to allow garbage
collection at this point, equivalent to setting a breakpoint there and
calling igc_collect ().  That we have a specific code path which ends up
in MPS code is a bonus.

I honestly don't know whether hitting a memory barrier can result in
other data being moved by MPS right now.
I'm assuming it can, and we should not assume string data is protected
from being moved unless we root it explicitly.

(Just answering the GC questions for now, sorry).

Pip
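
The exact-root versus unprotected-static argument above can be sketched
as a toy model in plain C.  This is only an illustration under stated
assumptions: it is not Emacs or MPS code, the "collector" is a two-space
copier invented for the example, and every name in it (toy_collect,
toy_alloc_string, toy_demo, rooted, unrooted) is hypothetical.  It shows
a reference the collector knows about being updated when the data
moves, an alias the collector does not know about going stale, and a
private copy of the bytes (the xstrdup-style alternative) staying valid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: none of this is Emacs or MPS code.  */

enum { HEAP_SIZE = 64 };

/* Two semispaces; a collection copies live data from the active
   space into the other one.  */
static char spaces[2][HEAP_SIZE];
static int active;

/* Stand-in for an exact root: the toy collector knows this variable
   and updates it when the data it points to moves.  */
static char *rooted;

/* Stand-in for an unprotected static: the collector never sees it.  */
static char *unrooted;

/* Store S at the start of the active space (one object at a time is
   enough for this sketch).  Returns NULL if S does not fit.  */
static char *
toy_alloc_string (const char *s)
{
  size_t len = strlen (s) + 1;
  if (len > HEAP_SIZE)
    return NULL;
  memcpy (spaces[active], s, len);
  return spaces[active];
}

/* Move the string referenced by ROOTED into the other space, update
   the root, then scribble over the old space to simulate reuse.  */
static void
toy_collect (void)
{
  int from = active, to = 1 - active;
  size_t len = strlen (rooted) + 1;
  memcpy (spaces[to], rooted, len);
  rooted = spaces[to];                      /* known reference: fixed up */
  memset (spaces[from], '#', HEAP_SIZE - 1);
  spaces[from][HEAP_SIZE - 1] = '\0';       /* old data is gone */
  active = to;
}

/* Returns 1 if the rooted reference and the private copy survive the
   collection while the unrooted alias goes stale, 0 otherwise.  */
int
toy_demo (void)
{
  rooted = toy_alloc_string ("PATH=/bin");
  if (!rooted)
    return 0;
  unrooted = rooted;              /* alias the collector cannot see */

  /* Defensive copy taken before anything that might collect.  */
  char *copy = malloc (strlen (rooted) + 1);
  if (!copy)
    return 0;
  strcpy (copy, rooted);

  toy_collect ();

  int rooted_ok = strcmp (rooted, "PATH=/bin") == 0;
  int unrooted_stale = strcmp (unrooted, "PATH=/bin") != 0;
  int copy_ok = strcmp (copy, "PATH=/bin") == 0;
  free (copy);
  return rooted_ok && unrooted_stale && copy_ok;
}
```

Calling toy_demo () from a small main returns 1 in this model.  The
model deliberately leaves out ambiguous (stack) references, which pin
the data in place, and memory barriers.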