From: Dimitris Papavasiliou
Newsgroups: gmane.lisp.guile.user
Subject: Re: Need help embedding Guile
Date: Wed, 22 Dec 2021 22:05:05 +0000
To: Thien-Thi Nguyen, "olivier.dion@polymtl.ca", "maximedevos@telenet.be",
    "mikael@djurfeldt.com"
Cc: guile-user@gnu.org

Thanks to everybody for their suggestions.  I'll respond to all in this
single message to keep the discussion from spreading out too much.  Please
let me know if this is inconvenient for you.  I also apologize in advance
for my large messages.  There's a TL;DR of sorts in the last 3 paragraphs.

Let me start by noting that there are really two distinct, though
connected, problems:

1. Handling garbage collection

This problem is tractable, the only question being how best to handle it.
As I neglected to say explicitly, but as Olivier pointed out:

On Wednesday, December 22nd, 2021 at 4:46 PM, Olivier Dion wrote:
> Since you have a graph of all the primitives in the second phase, you're
> basicaly doing garbage collection there.

So, yes, for this class of foreign object, I can essentially just pass
plain pointers to Guile, and let it go about its business with garbage
collection as it sees fit.  I keep these objects in a graph anyway and
already handle freeing them after phase 2.  Then:

> If I understood, objects can be garbage before phase 2, thus not
> appearing in the final graph of operations.

This class of foreign objects is indeed used during the first phase and
needs to be finalized once that phase is finished.  These can be handled as
above, by passing plain pointers to Guile, and keeping tabs on them to
finalize them explicitly.  This is less than ideal, because all objects
will be kept live until the Scheme code terminates and, although these are
typically small, there's no guarantee that there won't be very many of
them.  But this might be alleviated to some extent by combining our own
collection with that of Guile, e.g. as Mikael suggests:

On Wednesday, December 22nd, 2021 at 7:37 PM, Mikael Djurfeldt wrote:
> For example, the C++ application could have a doubly linked list of the C++
> objects. When Guile collects an object, make it unlink it from the C++
> list. Then, when you want to enter your second phase, you can go through the
> list, which now only contains the objects not yet collected, and finalize
> them.

Although this would require relying on finalizers, it would no longer be
necessary that every single object has its finalizer called; just that most
do, so that not too much memory is wasted.  This seems to be the idea
behind finalizers in the BDW-GC, as far as I could see from its
documentation.

This then leaves the more substantial difficulty:

2. Making sure Guile has terminated after phase 1

First of all, this is related to the previous problem.  Although it *is*
true that the GC is conservative, this is not the ultimate reason why it is
not possible to deterministically ensure that all objects are collected and
finalized.  As far as I can see, the ultimate reason is that the GC in use
by Guile works under the assumption that it will be in charge until process
exit, at which point collection becomes unnecessary, as the OS will take
over.  If it were possible to tell Guile to shut down and clean up, the GC
would know that all tracked objects are now up for collection, as there's
no-one left to use them.  This is possible with Lua, for instance, another
language meant to be embedded, where the Lua state can be closed, at which
point all objects are collected.  This allowed me to embed Lua without much
trouble.

This also precludes this suggestion:

On Wednesday, December 22nd, 2021 at 3:52 PM, Thien-Thi Nguyen wrote:
> Do guardians help for this?

Alas no, because a) as far as I can tell guarded objects still rely on the
GC to decide whether they are collectable, so its conservatism will still
create problems, but more importantly b) I can see no documented way to
sever all references the Scheme code might have made to the foreign objects
(but see more below).

On this issue Olivier suggests:

On Wednesday, December 22nd, 2021 at 4:46 PM, Olivier Dion wrote:
> One way I think you could do this is to evaluate all the user operations
> in a sandbox environment.

> If SEVER-MODULE? is true (the default), the module will be unlinked
> from the global module tree after the evaluation returns, to allow MOD
> to be garbage-collected.

This is interesting, but sandboxed environments turn out to be too
restricting and not meant for this purpose.
As far as I could tell, I cannot even load code from within:

    scheme@(guile-user)> (use-modules (ice-9 sandbox))
    scheme@(guile-user)> (eval-in-sandbox '(load "test.scm"))
    ice-9/boot-9.scm:1669:16: In procedure raise-exception:
    Unbound variable: load

Inspired by the `sever-module?' argument I tried severing the default
module, as returned by `scm_current_module' and
`scm_interaction_environment', like this:

    scm_call_1(
        scm_variable_ref(
            scm_c_private_variable("ice-9 sandbox", "sever-module!")),
        env);

This seemed to work (the severing part), but didn't help in allowing
collection of e.g. foreign objects bound to global variables, presumably
because other references are kept on the default interaction environment.

I also tried creating and then severing a custom-built r5rs environment
(made with `scheme-report-environment'), but this couldn't even be severed.

More out of spite than anything, I tried to clear the default module.
Noting that a module is really a structure (although I have only a very
hazy idea what this really means), I tried:

    SCM env = scm_current_module();
    scm_struct_set_x(env, scm_from_int(0), SCM_EOL);
    scm_struct_set_x(env, scm_from_int(1), SCM_BOOL_F);

Lo and behold, this succeeded in allowing all objects to be collected!  But
one might say: what of it?  This is still a hack, depending on
implementation details.  But there's a bigger (in my view at least) issue
here.  As Mikael notes:

On Wednesday, December 22nd, 2021 at 7:37 PM, Mikael Djurfeldt wrote:
> This creates the following problem: What if some Guile code runs *after* you
> have finalized your remnant objects? [...] All of this indicates that it could
> be nice to have some kind of Guile shutdown call in the C API. Such a shutdown
> call could go through live objects and free them.

On the same matter, Maxime said:

On Wednesday, December 22nd, 2021 at 5:29 PM, Maxime Devos wrote:
> > This makes some sort of
> > forcing/ensuring that Guile has terminated desirable.
>
> ... but I don't see how this follows. The only benefit I see from
> ensuring Guile terminates, is freeing a little memory. But since the
> Guile is basically used as a fancy configuration language, I don't see
> the need. (Except for valgrind memory leak detection.)

(Again, this memory doesn't have to be a little.  As a simple illustration,
if the program makes a geometry out of a 3D grid of 10 * 10 * 10 cubes,
say, it will have to allocate 1000 transformation objects to translate the
cubes into place, which will be retained needlessly.  Worse situations are
easily imaginable, depending on how optimistic one is.)

But the real issue is the one brought up by Mikael.  Guile is quite large,
with many features and quite complex control flow mechanisms.  As long as
Guile is up and running, one can't be really sure that it won't somehow
interfere with the execution of the embedding program when it shouldn't (in
phase 2 in my case) and in ways that are not predictable.

This may be no more than a psychological problem, a mere pseudo-concern,
but I'm not certain of that.  Some user might have the main code start
threads, for instance, which persist past the point of its return, and
while that's easily fixed by joining all threads before phase 2, say, other
such issues may not be as tractable.  Embedding Guile requires effort, and
the possibility of discovering hard problems late in the game is not
entirely insignificant in this respect.

So TL;DR: I think the issue boils down to whether it is possible to shut
down Guile and have it clean up before process exit.  If this is not
currently possible, another interesting question might be how well such a
feature would fit into Guile's current design and whether it would be
desirable to implement it.  I would argue that, although perhaps not
indispensable, it would certainly not be unnecessary for a language
specifically designed to be embedded.
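
For reference, here is a minimal sketch of the bookkeeping Mikael describes
under problem 1: a doubly linked list of foreign objects, from which a GC
finalizer unlinks whatever Guile happens to collect, and which is swept
explicitly before phase 2.  All names (`foreign_object', `registry',
`registry_*') are hypothetical, nothing here calls into libguile, and the
finalizer is simply an ordinary function.  In a real embedding the
finalizer callback would most likely receive only the object pointer, so
the registry would have to be reachable some other way (a global, say).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical foreign object: intrusive list links plus a payload. */
typedef struct foreign_object {
    struct foreign_object *prev, *next;
    double matrix[16];          /* stand-in for e.g. a transformation */
} foreign_object;

/* Circular list with a sentinel head, holding objects not yet finalized. */
typedef struct {
    foreign_object head;
    size_t live;
} registry;

void registry_init(registry *r)
{
    r->head.prev = r->head.next = &r->head;
    r->live = 0;
}

/* Allocate an object and link it in right after the sentinel.  The
   returned pointer is what would be handed over to the Scheme side. */
foreign_object *registry_new(registry *r)
{
    foreign_object *o = calloc(1, sizeof *o);
    if (!o)
        return NULL;
    o->prev = &r->head;
    o->next = r->head.next;
    r->head.next->prev = o;
    r->head.next = o;
    r->live++;
    return o;
}

/* What a GC finalizer would do: unlink one object and free it. */
void registry_finalize(registry *r, foreign_object *o)
{
    assert(o != &r->head && r->live > 0);
    o->prev->next = o->next;
    o->next->prev = o->prev;
    r->live--;
    free(o);
}

/* Called before phase 2: finalize whatever the GC did not get to.
   Returns the number of objects finalized by this sweep. */
size_t registry_sweep(registry *r)
{
    size_t swept = 0;
    while (r->head.next != &r->head) {
        registry_finalize(r, r->head.next);
        swept++;
    }
    return swept;
}
```

With this arrangement it no longer matters whether every finalizer runs:
any object the collector misses is still on the list when
`registry_sweep' is called.  It does not address problem 2, since Scheme
code that is still running could hold pointers to objects that have
already been swept.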

Dimitris

PS: Let me know if you think I should start a new thread for this.