bug#19235: make-fresh-user-module procedure leaks memory

unofficial mirror of bug-guile@gnu.org 
 help / color / mirror / Atom feed

* bug#19235: make-fresh-user-module procedure leaks memory
@ 2014-11-30 23:28 Chris Vine
  2014-12-07  8:07 ` Mark H Weaver
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Vine @ 2014-11-30 23:28 UTC (permalink / raw)
  To: 19235

The make-fresh-user-module procedure leaks memory in guile-2.0.11 as
demonstrated by the attached test case.  This test case should be
invoked either with the "shared" or "fresh" option.  If invoked with
the "fresh" option, it will call make-fresh-user-module on each
iteration through the loop.  On my 32-bit machine it will steadily
accumulate a memory leak before running out of memory on consuming
approximately 2.2G memory, after about 180,000 iterations.  If called
with the "shared" option, it will accumulate no additional memory while
executing, and will execute normally to the end of its iterations.

The question which might be asked is "Would any sane person ever want
to invoke the make-fresh-user-module procedure more than a few times
in a practical program?".  The answer to this question is "Yes", if
guile is being used as an extension framework for a C or C++ program,
and it executes guile extensions as individual tasks, and it is
necessary that top levels should not to be shared.  The execution of
tasks concurrently is one such case, but there can be many cases where
isolated top levels are desirable for tasks executed serially also.

Test case:

----------------------------- snip -----------------------------

/* compile with 'gcc -O2 -Wall `pkg-config --cflags --libs guile-2.0` -o test-guile' */

#include <unistd.h>
#include <libguile.h>
#include <stdio.h>
#include <string.h>

int fresh;

void* func (void* data)
{

  switch (fresh)
    {
      case 0:
	scm_c_eval_string("");
	break;
      default:
	scm_c_eval_string("(set-current-module (make-fresh-user-module))");
    }

  return NULL;
}

int main (int argc, char *argv[])
{

  int count;

  if (argc != 2
      || (strcmp (argv[1], "shared") &&
	  strcmp (argv[1], "fresh")))
    {
      puts ("Usage: test-guile shared | fresh");
      exit (1);
    }

  if (!strcmp (argv[1], "fresh"))
    {
      puts("Invoking make-fresh-user-module");
      fresh = 1;
    }
  else
    puts("Using shared top level");

  for (count = 0; count < 256000; ++count)
    {
      scm_with_guile(func, NULL);

      if (!(count % 100)) {
	printf("%d ", count);
	fflush(stdout);
      }

      usleep(1);
    }

  puts("");
  return 0;
}

^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#19235: make-fresh-user-module procedure leaks memory
  2014-11-30 23:28 bug#19235: make-fresh-user-module procedure leaks memory Chris Vine
@ 2014-12-07  8:07 ` Mark H Weaver
  2014-12-07 14:19   ` Chris Vine
  0 siblings, 1 reply; 8+ messages in thread
From: Mark H Weaver @ 2014-12-07  8:07 UTC (permalink / raw)
  To: Chris Vine; +Cc: 19235

Chris Vine <chris@cvine.freeserve.co.uk> writes:

> The make-fresh-user-module procedure leaks memory in guile-2.0.11 as
> demonstrated by the attached test case.  [...]
>
> The question which might be asked is "Would any sane person ever want
> to invoke the make-fresh-user-module procedure more than a few times
> in a practical program?".  The answer to this question is "Yes", if
> guile is being used as an extension framework for a C or C++ program,
> and it executes guile extensions as individual tasks, and it is
> necessary that top levels should not to be shared.  The execution of
> tasks concurrently is one such case, but there can be many cases where
> isolated top levels are desirable for tasks executed serially also.

Unfortunately, Guile modules cannot be garbage collected.  The problem
is that modules are usually referenced by name, not by direct pointers.
Every module must have a name, due to the way our macro expander works.
Modules created by 'make-module' or 'make-fresh-user-module' are named
using gensyms.

We maintain a global map from module names to module objects, and we can
never safely delete from this map, because we cannot prove that the
module name won't be looked up in the future.

I agree that your use case is reasonable.  I'll think about how we might
allow unnamed modules to be collected, but I'm afraid it might be quite
difficult.

In the meantime, I wrote a procedure that uses undocumented interfaces
to forcefully delete a module from the name->module map.  However, I
must emphasize that this procedure is likely to break in a future
version of Guile.  However, it should work in the 2.0.x series.

--8<---------------cut here---------------start------------->8---
;; WARNING: Uses undocumented interfaces; NOT FUTURE PROOF!!
;; This needs (srfi srfi-1)
(define (delete-module! module)
  (let* ((name (module-name module))
         (last-name (last name))
         (parent-name (drop-right name 1))
         (parent (resolve-module parent-name #f))
         (var (module-variable parent last-name)))
    (when (and var
               (variable-bound? var)
               (eqv? (variable-ref var) module))
      (hashq-remove! (module-obarray parent) last-name))
    (hashq-remove! (module-submodules parent) last-name)
    #f))
--8<---------------cut here---------------end--------------->8---

I should mention that creating new modules with 'make-fresh-user-module'
is not thread safe, nor is the procedure above.  Both of them mutate the
same name->module map.  For now, I recommend protecting calls to both of
them with a mutex.

      Mark

^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#19235: make-fresh-user-module procedure leaks memory
  2014-12-07  8:07 ` Mark H Weaver
@ 2014-12-07 14:19   ` Chris Vine
  2014-12-26 18:26     ` Chris Vine
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Vine @ 2014-12-07 14:19 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 19235

On Sun, 07 Dec 2014 03:07:43 -0500
Mark H Weaver <mhw@netris.org> wrote:
> Chris Vine <chris@cvine.freeserve.co.uk> writes:
> 
> > The make-fresh-user-module procedure leaks memory in guile-2.0.11 as
> > demonstrated by the attached test case.  [...]
> >
> > The question which might be asked is "Would any sane person ever
> > want to invoke the make-fresh-user-module procedure more than a few
> > times in a practical program?".  The answer to this question is
> > "Yes", if guile is being used as an extension framework for a C or
> > C++ program, and it executes guile extensions as individual tasks,
> > and it is necessary that top levels should not to be shared.  The
> > execution of tasks concurrently is one such case, but there can be
> > many cases where isolated top levels are desirable for tasks
> > executed serially also.
> 
> Unfortunately, Guile modules cannot be garbage collected.  The problem
> is that modules are usually referenced by name, not by direct
> pointers. Every module must have a name, due to the way our macro
> expander works. Modules created by 'make-module' or
> 'make-fresh-user-module' are named using gensyms.
> 
> We maintain a global map from module names to module objects, and we
> can never safely delete from this map, because we cannot prove that
> the module name won't be looked up in the future.
> 
> I agree that your use case is reasonable.  I'll think about how we
> might allow unnamed modules to be collected, but I'm afraid it might
> be quite difficult.
> 
> In the meantime, I wrote a procedure that uses undocumented interfaces
> to forcefully delete a module from the name->module map.  However, I
> must emphasize that this procedure is likely to break in a future
> version of Guile.  However, it should work in the 2.0.x series.

[snip]

> I should mention that creating new modules with
> 'make-fresh-user-module' is not thread safe, nor is the procedure
> above.  Both of them mutate the same name->module map.  For now, I
> recommend protecting calls to both of them with a mutex.

Thanks for that.  Having make-fresh-user-module (and possibly
set-current-module ??) not thread safe should be relatively easy to
deal with in my use case, subject to the next paragraph.  I would need
to call these procedures in C/C++ code before calling scm_eval_string()
to execute a scheme task, but presumably I can obtain a C variable
reference for make-fresh-user-module by calling scm_c_lookup() -
set-current-module already has a C interface provided by libguile.  I
would also need to use a POSIX mutex, but that is fine as guile uses
native threads.

However, that would not be completely effective if guile might also
call make-fresh-user-module internally, since that would be
unprotected.  Are there any circumstances in which guile might do this?

I know from what you said some time ago that guile module loading is not
thread safe.  Without asking you to exercise powers of clairvoyance,
can you think of any other thread safety problems I should be on the
look out for?  The documentation says that "multiple threads can call
scm_with_guile concurrently" and whereas that might literally be true
there seem a number of other things that one might reasonably expect to
do with scm_with_guile that are not.

On the memory leak, the usage case involves a library so I do not think
I can include code which might be liable to breakage at some
indeterminate time.  It would also be quite difficult to implement by
reference to the same mutex as used to protect make-fresh-user-module
because that would mean reimplementing your suggested code on the C/C++
side.  So I think it best just to document the leakage in the
documentation for the library (my one, not guile), and hope that at
some time it will be possible to deal with it in guile.

Chris

^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#19235: make-fresh-user-module procedure leaks memory
  2014-12-07 14:19   ` Chris Vine
@ 2014-12-26 18:26     ` Chris Vine
  2016-06-22 17:52       ` Andy Wingo
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Vine @ 2014-12-26 18:26 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 19235

As far as I can tell the make-fresh-user-module procedure is not called
by guile itself, and providing a global mutex for it with a binding
enabling it to be called from scheme code seems to work fine.

This also makes it straightforward to incorporate in a thread-safe
way the code you suggested to free stale user modules.  However, as I
mentioned, I am a bit reluctant to incorporate code which might break
in the future.  Is there any possibility that a "delete-module!"
procedure could be included within the public guile API for the next
release of guile?  It seems like something that could be useful to
anyone using non-default user modules in their code.

Chris

^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#19235: make-fresh-user-module procedure leaks memory
  2014-12-26 18:26     ` Chris Vine
@ 2016-06-22 17:52       ` Andy Wingo
  2016-06-23 14:17         ` Mark H Weaver
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Wingo @ 2016-06-22 17:52 UTC (permalink / raw)
  To: Chris Vine; +Cc: ludo, 19235, 15602

In many ways I think Ludovic was right in #15602 -- we should allow
excursions to isolate changes to the module tree.  Sometimes you want an
excursion to never add a module to the tree.  Sometimes you do, but
maybe all in one go and with a mutex, to avoid races -- like, you could
load a file or evaluate some code in a private fork of the module tree,
but then commit it to the main tree afterwards.  Is that a sensible
thing?

Andy

On Fri 26 Dec 2014 19:26, Chris Vine <chris@cvine.freeserve.co.uk> writes:

> As far as I can tell the make-fresh-user-module procedure is not called
> by guile itself, and providing a global mutex for it with a binding
> enabling it to be called from scheme code seems to work fine.
>
> This also makes it straightforward to incorporate in a thread-safe
> way the code you suggested to free stale user modules.  However, as I
> mentioned, I am a bit reluctant to incorporate code which might break
> in the future.  Is there any possibility that a "delete-module!"
> procedure could be included within the public guile API for the next
> release of guile?  It seems like something that could be useful to
> anyone using non-default user modules in their code.
>
> Chris

^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#19235: make-fresh-user-module procedure leaks memory
  2016-06-22 17:52       ` Andy Wingo
@ 2016-06-23 14:17         ` Mark H Weaver
  2016-06-24  8:04           ` bug#15602: " Ludovic Courtès
  0 siblings, 1 reply; 8+ messages in thread
From: Mark H Weaver @ 2016-06-23 14:17 UTC (permalink / raw)
  To: Andy Wingo; +Cc: 15602, ludo, Chris Vine, 19235

Andy Wingo <wingo@pobox.com> writes:

> In many ways I think Ludovic was right in #15602 -- we should allow
> excursions to isolate changes to the module tree.  Sometimes you want an
> excursion to never add a module to the tree.  Sometimes you do, but
> maybe all in one go and with a mutex, to avoid races -- like, you could
> load a file or evaluate some code in a private fork of the module tree,
> but then commit it to the main tree afterwards.  Is that a sensible
> thing?

Yes, I agree.  In fact, I'd been thinking of something along those lines
to enable thread-safe module loading.  More specifically, I was thinking
that there should be a fluid variable that contains some additional
modules that are not yet committed to the global module tree.

Briefly, when a module is auto-loaded by a thread, the new module would
initially be visible only to that thread, and also to any threads
spawned by that thread during the auto-load.  Any attempts to access the
module from other threads would block until the module is either fully
loaded.

One potential issue that has been troubling me is that in Guile's model,
there's no guarantee that a module will _ever_ finish loading.  The main
program itself could simply run from the auto-load.  That's why I think
it's important to propagate permission to threads created during the
auto-load, but maybe there will still be problems.

    Thoughts?
      Mark

^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#15602: bug#19235: make-fresh-user-module procedure leaks memory
  2016-06-23 14:17         ` Mark H Weaver
@ 2016-06-24  8:04           ` Ludovic Courtès
  2016-06-26  0:50             ` Mark H Weaver
  0 siblings, 1 reply; 8+ messages in thread
From: Ludovic Courtès @ 2016-06-24  8:04 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: 15602, Chris Vine, 19235

Mark H Weaver <mhw@netris.org> skribis:

> Andy Wingo <wingo@pobox.com> writes:
>
>> In many ways I think Ludovic was right in #15602 -- we should allow
>> excursions to isolate changes to the module tree.  Sometimes you want an
>> excursion to never add a module to the tree.  Sometimes you do, but
>> maybe all in one go and with a mutex, to avoid races -- like, you could
>> load a file or evaluate some code in a private fork of the module tree,
>> but then commit it to the main tree afterwards.  Is that a sensible
>> thing?
>
> Yes, I agree.  In fact, I'd been thinking of something along those lines
> to enable thread-safe module loading.  More specifically, I was thinking
> that there should be a fluid variable that contains some additional
> modules that are not yet committed to the global module tree.
>
> Briefly, when a module is auto-loaded by a thread, the new module would
> initially be visible only to that thread, and also to any threads
> spawned by that thread during the auto-load.  Any attempts to access the
> module from other threads would block until the module is either fully
> loaded.

That sounds like a nice idea.

In the current state of things, perhaps this behavior could be emulated
by running ‘compile-file’ in a module excursion, and passing it a root
module that’s a copy of ‘the-root-module’, something like that (though
that would probably perform badly.)

> One potential issue that has been troubling me is that in Guile's model,
> there's no guarantee that a module will _ever_ finish loading.

I think the fact that we evaluate all the top-level forms is
problematic.  The R6RS phases were a great idea.  :-)

> The main program itself could simply run from the auto-load.  That's
> why I think it's important to propagate permission to threads created
> during the auto-load, but maybe there will still be problems.

I’m not sure what you mean by “propagate permission”?

Ludo’.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#15602: bug#19235: make-fresh-user-module procedure leaks memory
  2016-06-24  8:04           ` bug#15602: " Ludovic Courtès
@ 2016-06-26  0:50             ` Mark H Weaver
  0 siblings, 0 replies; 8+ messages in thread
From: Mark H Weaver @ 2016-06-26  0:50 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: 15602, Chris Vine, 19235

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver <mhw@netris.org> skribis:
>
>> One potential issue that has been troubling me is that in Guile's model,
>> there's no guarantee that a module will _ever_ finish loading.
>
> I think the fact that we evaluate all the top-level forms is
> problematic.  The R6RS phases were a great idea.  :-)
>
>> The main program itself could simply run from the auto-load.  That's
>> why I think it's important to propagate permission to threads created
>> during the auto-load, but maybe there will still be problems.
>
> I’m not sure what you mean by “propagate permission”?

I mean: propagate permission to access the not-yet-committed module.
For example, suppose a program loads a module that runs the main event
loop as a top-level form in its body.  This module will never be
committed to the global module table, because it never finishes loading.
Now suppose that it spawns some new threads.  Those threads should have
access to the module.

Similarly, if a module uses 'par-for-each' to initialize some tables,
the spawned threads should have access to the module being loaded.

      Mark

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-06-26  0:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-30 23:28 bug#19235: make-fresh-user-module procedure leaks memory Chris Vine
2014-12-07  8:07 ` Mark H Weaver
2014-12-07 14:19   ` Chris Vine
2014-12-26 18:26     ` Chris Vine
2016-06-22 17:52       ` Andy Wingo
2016-06-23 14:17         ` Mark H Weaver
2016-06-24  8:04           ` bug#15602: " Ludovic Courtès
2016-06-26  0:50             ` Mark H Weaver

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).