From: Daniel Colascione
Newsgroups: gmane.emacs.bugs
Subject: bug#75322: SAFE_ALLOCA assumed to root Lisp_Objects/SSDATA(string)
Date: Sun, 05 Jan 2025 16:15:47 -0500
Message-ID: <4B76EB57-AA29-40BC-8361-0906E00A3578@dancol.org>
In-Reply-To: <86h66d6pw1.fsf@gnu.org>
To: 75322@debbugs.gnu.org, eliz@gnu.org, gerd.moellmann@gmail.com
Cc: pipcet@protonmail.com

On January 5, 2025 2:07:26 PM EST, Eli Zaretskii wrote:
>> From: Gerd Möllmann
>> Cc: pipcet@protonmail.com, 75322@debbugs.gnu.org
>> Date: Sun, 05 Jan 2025 19:17:37 +0100
>>
>> Eli Zaretskii writes:
>>
>> > OK, but in most, if not all of these cases, the objects are referenced
>> > from the stack.  For example, in the above fragment, the args[] array
>> > is on the stack.  Right?
>>
>> That args is a parameter
>>
>>   call_process (ptrdiff_t nargs, Lisp_Object *args, int filefd,
>>
>> So just from this I see only args itself on the stack, not args[0],
>> args[1] and so on.  I would have to look at all callers to determine
>> that.  Not good enough in my book.
>
>So what, we will now need to copy every args[] into a Lisp vector
>created by SAFE_ALLOCA_LISP, or xstrdup all of them, and do it in
>each and every function that gets the args[] array, all the way down
>to where the array is finally used (because usually we have 3 or 4
>nested levels that pass args[] to one another)?  That's insane!
>
>> > What does it mean in detail "the object may move"?  A Lisp object is a
>> > tagged pointer.  Do you mean the pointer should no point to a
>> > different address, i.e. the value of a Lisp object as a number should
>> > change to still be valid?
>>
>> Exactly.  Unless an ambiguous reference prevents the copying that can
>> happen.
>
>How can we possibly make sure this works reliably and safely??  For
>each variable we have in every function, we will need to analyze
>whether the variable is
>
>  . an automatic variable
>  . a static variable that is protected by someone
>  . a global variable that is protected by someone
>  . a result of dereferencing a pointer that is somehow protected
>
>etc. etc., where "protected by someone" means that it is a descendant
>of some staticpro, or of some root, or...

Well, yeah. Every other GC program does this. Emacs can too. There's no
great burden: all Lisp objects get traced automatically. Everything on
the stack or in a register gets traced automatically, and, because the
scanning is conservative, pinned. You only have to take extra steps to
tell the GC about something when you're going out of your way to go
around the GC.

It's simply not true that to adopt a modern GC every line of code has to
change. I wrote a moving GC for Emacs myself years ago. Worked fine. No
rewrite.

>And if we cannot prove to ourselves that one of the above happens,
>then we'd need to force a copy of the variable to be on the stack?
>
>Does this sound practical?
>
>If this is the price of using MPS, and I'm not missing something
>obvious, then it sounds like we should run away from MPS, fast.
>Because we will sooner or later have to rewrite every single line of
>code we ever wrote.

No, you do it by adopting a rule that when a function receives a
pointer, the caller guarantees the validity of the pointer for the
duration of the call. This way, only the level of the stack that spills
the array to the heap has to take on the responsibility of keeping the
referenced objects alive, and making the spilled array a pointer to the
pinned guts of a Lisp vector is an adequate way to do this.

"Oh, but won't that kill performance?"

In a generational system, allocating small, short-lived objects is
actually cheap -- usually faster than malloc. I don't recall how MPS
does it, but in some garbage collectors, a gen0 allocation of this sort
is literally just incrementing a pointer.

Malloc has a substantial cost of its own too.

Yes, these objects make a gen0 GC happen sooner. So what? Young
generation GC is cheap. If you could observe a performance problem, you
could solve the reference problem by maintaining a thread-local linked
list of spilled array blocks and teaching the GC to scan them, but
that's likely overkill.

Anyway, every native code program anywhere that uses GC has to care
about lifetimes and references. Abandoning MPS doesn't magically make
the problem go away. The *existing* code is broken, and you don't see it
because we use alloca to allocate on the stack almost all the time.
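
To make the "just incrementing a pointer" remark concrete, here is a
minimal C sketch of a bump-pointer nursery. It is a toy written for
illustration, not how MPS or Emacs allocates: struct nursery,
nursery_reset, nursery_alloc, and both size constants are hypothetical
names and values, and a real collector would add object headers,
per-thread buffers, and an actual young-generation collection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy nursery (gen0); the names and sizes are assumptions
   made for this sketch, not part of MPS or Emacs.  */
enum { NURSERY_SIZE = 4096, NURSERY_ALIGN = 8 };

struct nursery
{
  _Alignas (8) unsigned char space[NURSERY_SIZE];
  size_t top;			/* Offset of the next free byte.  */
};

/* Discard everything in the nursery.  A real collector would first
   evacuate the surviving objects to an older generation.  */
static void
nursery_reset (struct nursery *n)
{
  n->top = 0;
}

/* Allocate NBYTES from N, or return NULL if the nursery is full.
   The fast path is a round-up, a bounds check, and an addition.  */
static void *
nursery_alloc (struct nursery *n, size_t nbytes)
{
  assert (n->top % NURSERY_ALIGN == 0);

  /* Round the request up to the alignment; the first comparison
     below catches wraparound of that addition.  */
  size_t size = (nbytes + NURSERY_ALIGN - 1) & ~(size_t) (NURSERY_ALIGN - 1);
  if (size < nbytes || size > NURSERY_SIZE - n->top)
    return NULL;		/* A real GC would collect gen0 here.  */

  void *p = n->space + n->top;
  n->top += size;		/* The "bump".  */
  return p;
}
```

A caller would run nursery_reset once and then call nursery_alloc for
each small object; a NULL return marks the point where a real collector
would run a young-generation collection and retry.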