From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pip Cet
Newsgroups: gmane.emacs.devel
Subject: Re: Opportunistic GC
Date: Wed, 10 Mar 2021 19:38:08 +0000
References: <87lfav6i6o.fsf@rfc20.org>
In-Reply-To: <87lfav6i6o.fsf@rfc20.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
To: Matt Armstrong
Cc: eliz@gnu.org, Stefan Monnier, emacs-devel@gnu.org
List-Id: "Emacs development discussions."
Xref: news.gmane.io gmane.emacs.devel:266297

On Wed, Mar 10, 2021 at 5:18 PM Matt Armstrong wrote:
> Pip Cet writes:
>
> > On Mon, Mar 8, 2021 at 3:37 AM Stefan Monnier wrote:
> >> I've been running with the code below recently to try and see if
> >> opportunistic GC can be worth the trouble.
> >
> > Just a random idea: What if we exploit the fact most people have more
> > than one CPU core these days, garbage-collect in a separate fork()ed
> > process, then do only the sweeping in the main process?
> >
> > Immediate advantage: we'd never mark objects in the "real" Emacs
> > process, making debugging easier.
> >
> > My implementation proposal would be to pipe(), fork(), run GC, then
> > send through the pipe the Lisp_Objects which can be collected by the
> > original Emacs.
> >
> > For me, it was a bit difficult to see that this would indeed be safe,
> > but I'm now pretty convinced it would be: objects that are unreachable
> > in the child Emacs cannot become reachable in the parent Emacs (they
> > might show up on the stack, but that's a false positive).
> >
> > We'd have to be careful not to run two "real" GC processes at the same
> > time.
>
> Hi Pip,
>
> Neat idea.

Thank you.

> Are there examples of other GC implementations using this approach?

I'm not aware of any, but that doesn't mean much.

> Perhaps this has been tried before, elsewhere, and we could learn a lot
> from that.

I'm certain it has.

> I looked for myself and found
> https://dlang.org/blog/2019/07/22/symmetry-autumn-of-code-experience-report-porting-a-fork-based-gc/
> which talk about this idea in the context of the D run time. This page
> mentions a different idea: doing the mark pass with multiple threads.
> This might be worth exploring; the two ideas are composable with
> each other.

Absolutely.
What they're not easily composable with is a copying garbage
collector, but Emacs might not require one of those as much as other
applications do.

Again, I think this is, most of all, cheap in terms of implementation
time. Not as cheap as I thought initially, I confess, but I'm willing
to argue that this might be a good thing: we need to be clearer about
what's GC and what isn't, and move other clean-up tasks out of the
garbage collector.

There's also an additional problem with running the GC "fully"
asynchronously, which I define as collecting only some of the objects
it reports as unreachable before starting the next cycle. I'm not sure
we ever really want to do that, but I thought it would be nice if we
could. The problem is that A and B may both be unreachable and A might
reference B, but only B might get collected, leaving a dangling
pointer in A. Then A might become maybe-reachable again, by being on
the stack as a false positive, and we end up seeing nasty corruption
bugs. This actually did happen to me while playing with this patch, so
at least I learned my "don't play with snakes" lesson.

That means we need to either assume that the forked process will
always terminate properly, and abort otherwise, or we need to wait for
it to produce all of its output before starting to act on it.

The latter is problematic because GC should be about freeing memory,
not allocating more memory to begin with. The former is problematic
because it makes things like automated nice level adjustments for the
worker process much more difficult: instead of going "oh, we were
being too nice, kill the process and start over", we'd have to un-nice
it, probably to a much higher priority, once we conclude it's taking
too long.

Pip
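
For readers following the thread, the pipe()/fork() scheme discussed
above can be modelled in a few lines. What follows is only a toy
sketch, not Emacs code: it is written in Python rather than C, the
"heap" is a plain dict mapping integer ids to lists of referenced ids,
and the names mark, forked_collect and mutate are made up for
illustration. It takes the "wait for all of the output" route: the
parent sweeps only after the child has closed the pipe and exited
cleanly, so a partial list can never leave a dangling reference behind.

```python
import os
import struct


def mark(heap, roots):
    """Return the set of object ids reachable from the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(heap[obj])
    return seen


def forked_collect(heap, roots, mutate=None):
    """Mark in a forked child, sweep in the parent; return the freed ids."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: sees a copy-on-write snapshot of the heap at fork time.
        status = 1
        try:
            os.close(read_fd)
            dead = sorted(set(heap) - mark(heap, roots))
            with os.fdopen(write_fd, "wb") as out:
                for obj in dead:
                    out.write(struct.pack("<q", obj))
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code

    # Parent: free to keep running while the child marks.
    os.close(write_fd)
    if mutate is not None:
        mutate(heap, roots)  # hypothetical stand-in for the mutator

    with os.fdopen(read_fd, "rb") as inp:
        data = inp.read()  # blocks until the child closes its end
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0 or len(data) % 8 != 0:
        return []  # child failed: discard the result, sweep nothing

    dead = [obj for (obj,) in struct.iter_unpack("<q", data)]
    for obj in dead:
        del heap[obj]  # the "sweep": only ids the child reported
    return dead
```

Objects allocated by the parent after the fork never appear in the
child's list, so they survive until the next cycle. Buffering the whole
list before sweeping is exactly the memory cost mentioned above; a real
implementation would have to weigh that against aborting when the child
does not finish.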