unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Re: Memory accounting in libgc
       [not found] <87k3c33awa.fsf@pobox.com>
@ 2014-03-12  6:57 ` Mark H Weaver
  2014-03-12  8:27   ` Andrew Gaylard
  2014-03-13 14:46   ` Neil Jerram
       [not found] ` <87k3c33awa.fsf-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
  1 sibling, 2 replies; 4+ messages in thread
From: Mark H Weaver @ 2014-03-12  6:57 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bdwgc, guile-devel

Andy Wingo <wingo@pobox.com> writes:

> How does this affect libgc?
>
> First of all, it gives an answer to the question of "how much memory
> does an object use" -- simply stop the world, mark the heap in two parts
> (the first time ignoring the object in question, the second time
> starting from the object), and subtract the live heap size of the former
> from the latter.  Libgc could do this without too much problem, it seems
> to me, on objects of any kind.  It would be a little extra code but it
> could be useful.  Or not?  Dunno.

This could be generalized to the far more useful question: "How much
memory does this set of objects use?", although that's a slippery
question that might better be formulated as "How much memory would be
freed if this set of objects were no longer needed?".

For example, suppose you have a large data structure that is referenced
from two small header objects, A and B.  If you ask "How much memory
does A use?", the answer will be the size of the small header, and ditto
for B.  Without being able to ask the more general question, there's no
way to find out how much would be freed by releasing both.

     Mark



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Memory accounting in libgc
  2014-03-12  6:57 ` Memory accounting in libgc Mark H Weaver
@ 2014-03-12  8:27   ` Andrew Gaylard
  2014-03-13 14:46   ` Neil Jerram
  1 sibling, 0 replies; 4+ messages in thread
From: Andrew Gaylard @ 2014-03-12  8:27 UTC (permalink / raw)
  To: guile-devel

On 03/12/14 08:57, Mark H Weaver wrote:
> Andy Wingo <wingo@pobox.com> writes:
>
>> How does this affect libgc?
>>
>> First of all, it gives an answer to the question of "how much memory
>> does an object use" -- simply stop the world, mark the heap in two parts
>> (the first time ignoring the object in question, the second time
>> starting from the object), and subtract the live heap size of the former
>> from the latter.  Libgc could do this without too much problem, it seems
>> to me, on objects of any kind.  It would be a little extra code but it
>> could be useful.  Or not?  Dunno.
> This could be generalized to the far more useful question: "How much
> memory does this set of objects use?", although that's a slippery
> question that might better be formulated as "How much memory would be
> freed if this set of objects were no longer needed?".
>
> For example, suppose you have a large data structure that is referenced
> from two small header objects, A and B.  If you ask "How much memory
> does A use?", the answer will be the size of the small header, and ditto
> for B.  Without being able to ask the more general question, there's no
> way to find out how much would be freed by releasing both.
>
>       Mark
Agreed.  In order to build industrial-strength applications in guile, it's
important to be able to answer questions such as "what is causing my
process' memory usage to grow?"

-- 
Andrew Gaylard




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Memory accounting in libgc
  2014-03-12  6:57 ` Memory accounting in libgc Mark H Weaver
  2014-03-12  8:27   ` Andrew Gaylard
@ 2014-03-13 14:46   ` Neil Jerram
  1 sibling, 0 replies; 4+ messages in thread
From: Neil Jerram @ 2014-03-13 14:46 UTC (permalink / raw)
  To: guile-devel

On 2014-03-12 06:57, Mark H Weaver wrote:
> Andy Wingo <wingo@pobox.com> writes:
> 
>> How does this affect libgc?
>> 
>> First of all, it gives an answer to the question of "how much memory
>> does an object use" -- simply stop the world, mark the heap in two 
>> parts
>> (the first time ignoring the object in question, the second time
>> starting from the object), and subtract the live heap size of the 
>> former
>> from the latter.  Libgc could do this without too much problem, it 
>> seems
>> to me, on objects of any kind.  It would be a little extra code but it
>> could be useful.  Or not?  Dunno.
> 
> This could be generalized to the far more useful question: "How much
> memory does this set of objects use?", although that's a slippery
> question that might better be formulated as "How much memory would be
> freed if this set of objects were no longer needed?".
> 
> For example, suppose you have a large data structure that is referenced
> from two small header objects, A and B.  If you ask "How much memory
> does A use?", the answer will be the size of the small header, and 
> ditto
> for B.  Without being able to ask the more general question, there's no
> way to find out how much would be freed by releasing both.
> 
>      Mark

Absolutely agree that this would be useful, but I suspect a problem in 
how far one can push libgc to simulate a set of objects being freed 
without them actually _being_ freed.  For example there could be 
guardians associated with the objects, and I think one can validly 
imagine them doing either of the possible extremes (or anywhere in 
between), namely:

- resurrecting the objects again

- freeing up a whole load more objects/memory that the guardians know to 
be associated with the original objects, but which wasn't (for some 
reason) simply referenced by them.

Hope that's a useful thought - interesting subject!

     Neil




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Gc] Memory accounting in libgc
       [not found] ` <87k3c33awa.fsf-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
@ 2014-03-18 15:10   ` Noah Lavine
  0 siblings, 0 replies; 4+ messages in thread
From: Noah Lavine @ 2014-03-18 15:10 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bdwgc, guile-devel


[-- Attachment #1.1: Type: text/plain, Size: 5879 bytes --]

Hello,

This reminds me of a related but distinct issue - region-based allocation.
Hypothetically, you could use regions to implement a limited form of this
accounting, if you could ask the question, "what is the size of this memory
region?"

It certainly wouldn't do as much as custodians, but it would serve for the
simple purpose of a sandbox. It would also have another nice effect,
because you could allocate all of the sandbox's objects together and get
memory locality when running in the sandbox.

I don't know if this is the right solution, but I am curious what other
people think.

Best,
Noah


On Sun, Mar 9, 2014 at 6:48 AM, Andy Wingo <wingo-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org> wrote:

> Hi,
>
> I was thinking this morning about memory accounting in libgc.  There are
> a couple of use cases for this.
>
> One is simply asking "how much memory does this object use?"
>
> Another case is the common "implement an IRC bot that people can use to
> evaluate expressions in a sandbox, but with a limit on the amount of
> time and space those programs can use".  Let's assume that we've already
> arranged that the program can't write to files or do other things we
> wouldn't like it to do, and also that we have given it a time limit
> through some means.
>
> Both of these cases should be handled by GC, since GC is what can see
> the whole object graph in a program.  The rest of this mail describes
> what Racket does in their GC, and how we might be able to do it in
> libgc.
>
> Racket has a type of objects called "custodians".  Custodians nest.
> There is one root custodian for the whole program, and creating a new
> custodian adds a new kid of the current custodian.  There is a separate
> form to make a custodian current.  New threads are created "within" the
> current custodian.  (Racket threads are not pthreads, but this concept
> generalizes to pthreads.)  There is also a form to "enter" a new
> custodian, effectively partitioning a thread's stack into a range owned
> by the old custodian, and a range owned by the new custodian.
>
> If there are multiple custodians in a program -- something that isn't
> always the case -- then after GC runs, their GC calls
> BTC_do_accounting() here:
>
>
> http://git.racket-lang.org/plt/blob/c27930f9399564497208eb31a44f7091d02ab7be:/racket/src/racket/gc2/mem_account.c#l417
>
> This does a mark of the program's objects, but in a different mode (the
> "accounting" flag is set).  Custodians and threads have a separate GC
> "kind", in BDW-GC terms.  When marking a custodian or a thread in
> accounting mode, the mark procedure only recurses into sub-objects if
> the custodian is "current".  First, all the program static roots are
> marked, excluding stacks, and memory use from those roots is attributed
> to the root custodian.  Then BTC_do_accounting traverses the custodian
> tree, in pre-order, for each custodian setting it "current" and running
> its mark procedure, and marking stack segments for any threads created
> by that custodian.  Any visited unmarked object is accounted to the
> current custodian (i.e., the size of the object is added to the current
> custodian's size counter).  Then the tree is traversed in post-order,
> and child memory use is also accounted to parents.  This is O(n) in live
> heap size and doesn't require any additional memory.
>
> Finally if any custodian is over its limit, an after-GC hook sets that
> custodian as scheduled for shutdown, which in Racket will close
> resources associated with the custodian (for example, files opened when
> the custodian was current), arrange to kill threads spawned by the
> custodian, arrange to longjmp() out of any entered custodian, etc.
>
> How does this affect libgc?
>
> First of all, it gives an answer to the question of "how much memory
> does an object use" -- simply stop the world, mark the heap in two parts
> (the first time ignoring the object in question, the second time
> starting from the object), and subtract the live heap size of the former
> from the latter.  Libgc could do this without too much problem, it seems
> to me, on objects of any kind.  It would be a little extra code but it
> could be useful.  Or not?  Dunno.
>
> On the question of limiting memory usage -- clearly, arranging to free
> memory is out of libgc's purview, as that's an application issue.  But
> allowing an application to impose a memory limitation on a part of the
> program does need libgc help, and it would be most usefully done via
> something like a custodian tree.  Custodians could be implemented by a
> new GC kind, so that they get a mark procedure; but then we would need a
> hook to be able to do accounting after GC is done, and really it seems
> this is best done in libgc somehow (even if its implementation is
> layered on top of gc kinds, mark procedures, and such).
>
> Accounting a thread's stack to multiple owners is tricky, but doable --
> we could do it with a new version of GC_call_with_gc_active or so.
>
> There is also the issue of objects reachable by custodians that are
> "peers" -- where one does not dominate the other.  In that case the
> object would be charged randomly to one or the other.  That's probably
> OK.
>
> Also, you would probably want to maintain a general set of per-custodian
> data -- file descriptors, for example.  One would also want to maintain
> an estimate of non-GC allocations per-custodian (for example for
> third-party image processing libraries), but I don't know how that could
> be done; perhaps it is only possible if you register a separate GC kind
> for those objects, perhaps with a mark procedure.
>
> Anyway, just some thoughts.  I don't know when I could implement this,
> but I thought I'd put this out there in case someone had any ideas about
> this, or pointers to literature.  Cheers.
>
> Regards,
>
> Andy
> --
> http://wingolog.org/
>
>

[-- Attachment #1.2: Type: text/html, Size: 7098 bytes --]

[-- Attachment #2: Type: text/plain, Size: 173 bytes --]

_______________________________________________
bdwgc mailing list
bdwgc-ZwoEplunGu1I4Lznb4ZCK0B+6BGkLq7r@public.gmane.org
https://lists.opendylan.org/mailman/listinfo/bdwgc

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-03-18 15:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <87k3c33awa.fsf@pobox.com>
2014-03-12  6:57 ` Memory accounting in libgc Mark H Weaver
2014-03-12  8:27   ` Andrew Gaylard
2014-03-13 14:46   ` Neil Jerram
     [not found] ` <87k3c33awa.fsf-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
2014-03-18 15:10   ` [Gc] " Noah Lavine

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).