unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* GUILE GC -- Write barrier for vectors
@ 2002-07-14 22:14 hanwen
  2002-07-15 10:56 ` Miroslav Silovic
  2002-07-15 16:49 ` Dirk Herrmann
  0 siblings, 2 replies; 13+ messages in thread
From: hanwen @ 2002-07-14 22:14 UTC (permalink / raw)
  Cc: jantien



Hi y'all,

I've recently benchmarked LilyPond for large scores, and to my
astonishment, I found that a large proportion of the time that Lily
runs, it spends doing GC. It might very well be that LilyPond is not
handling memory efficiently. Tips on improving this are welcome: in
large runs, lilypond allocates 30 to 300k smob objects, which probably
require a 20 to 50 cells each; most of these cells rarely change --
they are data representation, not runtime (i.e. often changing) data,
at least that is what I think: this particular run has cells-marked =
44M, cells-swept = 60M, meaning that 75% of the cells are not garbage,
right?

Linked against 1.7.0 without-threads, lily uses approximately 20% of
its time in GC on a 400 mhz PII/SDRAM. In another instance, the same
compile with lilypond linked against 1.5.6 on a 1Ghz PIII/DDR RDRAM,
uses approximately 50% of the total running time (!) for GC'ing.

Can anyone comment on the difference in running times?  I didn't see
much in the changelog that could explain this difference.  Could it be
that there is some anomaly that causes the timings to be skewed on the
fast machine? The total speed increase going from the 400mhz machine
to the 1ghz is consistent with the clock freqs.

Can anyone comment on how to improve the performance?

In any case, it seems that there is room for improvement on the GUILE
side, and one of these would be generational GC.

My plan was to have a generation counter for each CARD, and increase
that counter when no garbage was found during a sweep. When a card is
older than a certain threshold, all data in it is assumed to be live
and marked already.  Mutators (SETCDR, SET_VECTOR_ELT, etc.) reset the
generation counter of a card to 0.

Also, smobs and maybe some other types (which ones? structs?), are
always assumed to have been changed, since they are outside of GC's
control. It would probably be best to allocate them from a different
pool so that they won't poison the normal generations, but that would
require a slight extension of the GUILE API.

Of course: I don't know much about GC, but this is only a crude
try. My hope is that --when CARDS are small enough-- they will fill up
with "static" data that doesn't change all the time.

As a prelude, I tried to add a "soft" write barrier to the vector
code, by returning a const* in SCM_VELTS, and fixing where it goes
wrong. The cell codes already have this type of write barrier. I then
tried to fix up the remaining parts of the code. This is required by
any gen GC code, IIRC.

The patch is on
http://www.cs.uu.nl/~hanwen/public/software/guile-vector-wb, it is
against 1.7 CVS, checked out friday night.

* Print_state uses some kind of stack. This is a highly mutable thing,
right?

* Which other object types require a write barrier? eg. the subroutine
table, ports, structs?

* I introduced the SCM_VECTOR_SET macro, ie

	SCM_VECTOR_SET(vec,idx,value);

must be used for assignment.

* Direct write access to a vector must be done through the macro
SCM_WRITABLE_VELTS.

* I ran a GC test just to verify that it was working, but I would
really like to run the test-suite; I think that the following is a
disgrace:

	blauw:~/usr/src/guile/guile-core$ make -C test-suite/ test
	make: Entering directory `/home/hanwen/usr/src/guile/guile-core/test-suite'
	make: *** No rule to make target `test'.  Stop.
	make: Leaving directory `/home/hanwen/usr/src/guile/guile-core/test-suite'
	blauw:~/usr/src/guile/guile-core$ grep -i test-suite HACKING README  BUGS 
	blauw:~/usr/src/guile/guile-core$ make test
	make: *** No rule to make target `test'.  Stop.

after some twiddling, the following command does something. It remains unsure whether this is
good or a bad result:

	blauw:~/usr/src/guile/guile-core$ libguile/.libs/lt-guile -e main -s test-suite/guile-test 
	ERROR: In procedure lstat:
	ERROR: No such file or directory: "/home/hanwen/bogus-path/test-suite"
	blauw:~/usr/src/guile/guile-core$ echo $?
	2

* Will this patch be integrated into CVS?  The FSF already has
  disclaimers for me.

* Et ceteram censeo GUILE 1.6 releasem esse


-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl   |   http://www.cs.uu.nl/~hanwen 

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-14 22:14 GUILE GC -- Write barrier for vectors hanwen
@ 2002-07-15 10:56 ` Miroslav Silovic
  2002-07-15 11:00   ` Han-Wen Nienhuys
  2002-07-15 16:49 ` Dirk Herrmann
  1 sibling, 1 reply; 13+ messages in thread
From: Miroslav Silovic @ 2002-07-15 10:56 UTC (permalink / raw)
  Cc: guile-devel, jantien

hanwen@cs.uu.nl wrote:

>Linked against 1.7.0 without-threads, lily uses approximately 20% of
>its time in GC on a 400 mhz PII/SDRAM. In another instance, the same
>compile with lilypond linked against 1.5.6 on a 1Ghz PIII/DDR RDRAM,
>uses approximately 50% of the total running time (!) for GC'ing.
>
>Can anyone comment on the difference in running times?  I didn't see
>much in the changelog that could explain this difference.  Could it be
>that there is some anomaly that causes the timings to be skewed on the
>fast machine? The total speed increase going from the 400mhz machine
>to the 1ghz is consistent with the clock freqs.
>
>Can anyone comment on how to improve the performance?
>  
>
I had a similar problem when working with a large number of  vector-like 
things - basilly
my application would trigger GC more often than necessary. The reason 
was that I didn't
properly report bytes allocated to Guile (i.e. my use of scm_must_malloc 
was broken).

If you're reporting too much, the GC will trigger all the time. 
Alternatively, if you're reporting
more alloc than freed bytes, you'll get bytes counter mismatch (VERY 
similar to memory leak),
and this will completely confuse GC scheduler.

    Miro



_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 10:56 ` Miroslav Silovic
@ 2002-07-15 11:00   ` Han-Wen Nienhuys
  2002-07-15 11:36     ` Miroslav Silovic
  0 siblings, 1 reply; 13+ messages in thread
From: Han-Wen Nienhuys @ 2002-07-15 11:00 UTC (permalink / raw)
  Cc: guile-devel


miro@puremagic.com writes:
> >Can anyone comment on how to improve the performance?
> >  
> >
> I had a similar problem when working with a large number of  vector-like 
> things - basilly
> my application would trigger GC more often than necessary. The reason 
> was that I didn't
> properly report bytes allocated to Guile (i.e. my use of scm_must_malloc 
> was broken).

> If you're reporting too much, the GC will trigger all the time. 
> Alternatively, if you're reporting

I'm quite sure that free/malloc bytes reported are matched. However,
most counts are a little underestimated. This is impossible to
prevent, since we are using classes derived from smobbed C++
baseclasses (You can't get correct object sizes in destructors and
constructors).

-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl    | http://www.cs.uu.nl/~hanwen/


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 11:00   ` Han-Wen Nienhuys
@ 2002-07-15 11:36     ` Miroslav Silovic
  0 siblings, 0 replies; 13+ messages in thread
From: Miroslav Silovic @ 2002-07-15 11:36 UTC (permalink / raw)
  Cc: guile-devel

Han-Wen Nienhuys wrote:

>I'm quite sure that free/malloc bytes reported are matched. However,
>most counts are a little underestimated. This is impossible to
>prevent, since we are using classes derived from smobbed C++
>baseclasses (You can't get correct object sizes in destructors and
>constructors).
>
>  
>

AFAIK, this shouldn't be the problem (it should only cause your memory 
usage to be
larger than necessary). Did you try to monitor and compare GC 
frequencies on both
machines (and log the bytes allocated etc.) ?

    Miro



_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-14 22:14 GUILE GC -- Write barrier for vectors hanwen
  2002-07-15 10:56 ` Miroslav Silovic
@ 2002-07-15 16:49 ` Dirk Herrmann
  2002-07-15 18:22   ` Han-Wen Nienhuys
  1 sibling, 1 reply; 13+ messages in thread
From: Dirk Herrmann @ 2002-07-15 16:49 UTC (permalink / raw)
  Cc: guile-devel, jantien

On Mon, 15 Jul 2002 hanwen@cs.uu.nl wrote:

> Linked against 1.7.0 without-threads, lily uses approximately 20% of
> its time in GC on a 400 mhz PII/SDRAM. In another instance, the same
> compile with lilypond linked against 1.5.6 on a 1Ghz PIII/DDR RDRAM,
> uses approximately 50% of the total running time (!) for GC'ing.

Marius has cleaned up the memory interface.  During this step, a couple of
bugs were removed, where guile reports too much memory being allocated to
the gc.  This caused the gc to be called more often than necessary.  You
can find some details under workbook/gc/memory.text.  Possibly these
changes account for part of the difference between CVS (1.7.0 ?) and
1.5.6.

> In any case, it seems that there is room for improvement on the GUILE
> side, and one of these would be generational GC.

Right.  I think Michael Livshin has already thought about that issue.  You
should probably contact him to share ideas.  I think, summarizing all
thoughts in workbook will be quite helpful.

> Also, smobs and maybe some other types (which ones? structs?), are
> always assumed to have been changed, since they are outside of GC's
> control. It would probably be best to allocate them from a different
> pool so that they won't poison the normal generations, but that would
> require a slight extension of the GUILE API.

Do you have a suggestion for such an extension?

> As a prelude, I tried to add a "soft" write barrier to the vector
> code, by returning a const* in SCM_VELTS, and fixing where it goes
> wrong. The cell codes already have this type of write barrier. I then
> tried to fix up the remaining parts of the code. This is required by
> any gen GC code, IIRC.
[...]
> * I introduced the SCM_VECTOR_SET macro, ie
> 
> 	SCM_VECTOR_SET(vec,idx,value);
> 
> must be used for assignment.

Great.  This is a very good change, not just for the sake of gengc.

> * Direct write access to a vector must be done through the macro
> SCM_WRITABLE_VELTS.

Hmmm?  Where should this be necessary?  Do you want to modify the vector
cell itself, or are you using this for "speed ups"?  In the "speed-up"
case:  wouldn't SCM_VECTOR_SET do as well, given that the compiler can
extract constant expressions from within loops?

Btw:  We should also try to get rid of SCM_CARLOC and SCM_CDRLOC in this
context...

> * I ran a GC test just to verify that it was working, but I would
> really like to run the test-suite; I think that the following is a
> disgrace:

Well, can't tell you about this one, since I am personally not doing
"make check" but rather "./check-guile" from the guile-core directory.

> * Will this patch be integrated into CVS?  The FSF already has
>   disclaimers for me.

I have not looked into the patch yet, but I think that the SCM_VECTOR_SET
patch as you describe it is probably the right thing to do.  I am not sure
about the need for SCM_WRITABLE_VELTS, but I think we can discuss this
one.  In any case, it would be great if you could contact Michael Livshin.

> * Et ceteram censeo GUILE 1.6 releasem esse
:-)

Best regards,
Dirk Herrmann


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 16:49 ` Dirk Herrmann
@ 2002-07-15 18:22   ` Han-Wen Nienhuys
  2002-07-15 20:07     ` Dirk Herrmann
  0 siblings, 1 reply; 13+ messages in thread
From: Han-Wen Nienhuys @ 2002-07-15 18:22 UTC (permalink / raw)
  Cc: guile-devel, jantien

dirk@sallust.ida.ing.tu-bs.de writes:
> > Also, smobs and maybe some other types (which ones? structs?), are
> > always assumed to have been changed, since they are outside of GC's
> > control. It would probably be best to allocate them from a different
> > pool so that they won't poison the normal generations, but that would
> > require a slight extension of the GUILE API.
> 
> Do you have a suggestion for such an extension?

	SCM scm_changing_cell()
	SCM scm_changing_vector()

They would be allocated from a separate memory pool. Also if we
redefine the GC engine to use a different function internally
(i.e. scm_gc_internal_mark), then we can distinguish between data
cells that may be moved (only pointed to from within the heap), and
other cells (marked by scm_gc_mark: conservatively marked entries,
pointed to from smobs, etc.).

The copiable cells could be subjected to copying GC, although it would
require another scan of the entire heap.

>I think, summarizing all thoughts in workbook will be quite helpful.

I much prefer comments in the  source code. If you're serious about
hacking that's the first place where you go looking anyways.

> > * Direct write access to a vector must be done through the macro
> > SCM_WRITABLE_VELTS.
> 
> Hmmm?  Where should this be necessary?  Do you want to modify the vector
> cell itself, or are you using this for "speed ups"?  In the "speed-up"
> case:  wouldn't SCM_VECTOR_SET do as well, given that the compiler can
> extract constant expressions from within loops?

In some cases no (GC manipulations of weak vectors), and in some cases
yes, but it would introduce lots of redtape: long lists of
SCM_VECTOR_SET calls. I'll review the cases once more. 

> > * Will this patch be integrated into CVS?  The FSF already has
> >   disclaimers for me.
> 
> I have not looked into the patch yet, but I think that the SCM_VECTOR_SET
> patch as you describe it is probably the right thing to do.  I am not sure
> about the need for SCM_WRITABLE_VELTS, but I think we can discuss this
> one.  In any case, it would be great if you could contact Michael Livshin.

I assume he is still on the list, reading all my musings? Also I
noticed that Greg Harvey did a lot of work on gengc, but apparently
that didn't work very well?


--
Han-Wen Nienhuys   |   hanwen@cs.uu.nl    | http://www.cs.uu.nl/~hanwen/


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 18:22   ` Han-Wen Nienhuys
@ 2002-07-15 20:07     ` Dirk Herrmann
  2002-07-15 23:02       ` Han-Wen
  0 siblings, 1 reply; 13+ messages in thread
From: Dirk Herrmann @ 2002-07-15 20:07 UTC (permalink / raw)
  Cc: guile-devel, jantien

On Mon, 15 Jul 2002, Han-Wen Nienhuys wrote:

> > Do you have a suggestion for such an extension?
> 
> 	SCM scm_changing_cell()
> 	SCM scm_changing_vector()
> 
> They would be allocated from a separate memory pool. Also if we
> redefine the GC engine to use a different function internally
> (i.e. scm_gc_internal_mark), then we can distinguish between data
> cells that may be moved (only pointed to from within the heap), and
> other cells (marked by scm_gc_mark: conservatively marked entries,
> pointed to from smobs, etc.).
> 
> The copiable cells could be subjected to copying GC, although it would
> require another scan of the entire heap.

It's a good idea - unfortunately it is patented.  Don't ask me who has
this patented, but I also suggested this idea some time ago.  That's the
bad thing about patents:  Whenever there's a good idea, you can't be sure
it is patented.

But, it may make sense to investigate this patent issue again.  Maybe the
information above is outdated and the idea can actually be used in free
software?

In this case, I wouldn't like to have the API extended as you suggest
above:  Then it would be easy to re-arrange cells in cards such that
untouched cells are moved into common cards - no need to extend the API.

> >I think, summarizing all thoughts in workbook will be quite helpful.
> 
> I much prefer comments in the  source code. If you're serious about
> hacking that's the first place where you go looking anyways.

This is true if there is code.  As long as there is just a conceptual
phase, there is no code to comment on.  And, the current situation is a
counterexample to your suggestion:  If we had started the workbook idea
sooner, we would not have to go around asking for things about gc that
others have already thought about.

> > > * Direct write access to a vector must be done through the macro
> > > SCM_WRITABLE_VELTS.
> > 
> > Hmmm?  Where should this be necessary?  Do you want to modify the vector
> > cell itself, or are you using this for "speed ups"?  In the "speed-up"
> > case:  wouldn't SCM_VECTOR_SET do as well, given that the compiler can
> > extract constant expressions from within loops?
> 
> In some cases no (GC manipulations of weak vectors), and in some cases
> yes, but it would introduce lots of redtape: long lists of
> SCM_VECTOR_SET calls. I'll review the cases once more. 

GC manipulations of weak vectors?  These shouldn't require any write
barrier.  And, even if so, there's already SCM_SET_WVECT_GC_CHAIN.

> > > * Will this patch be integrated into CVS?  The FSF already has
> > >   disclaimers for me.
> > 
> > I have not looked into the patch yet, but I think that the SCM_VECTOR_SET
> > patch as you describe it is probably the right thing to do.  I am not sure
> > about the need for SCM_WRITABLE_VELTS, but I think we can discuss this
> > one.  In any case, it would be great if you could contact Michael Livshin.
> 
> I assume he is still on the list, reading all my musings? Also I
> noticed that Greg Harvey did a lot of work on gengc, but apparently
> that didn't work very well?

I think that Michael took over some stuff from Greg Harvey.  Again:
Please don't wait to write down your thoughts until there is code.

Best regards,
Dirk


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 20:07     ` Dirk Herrmann
@ 2002-07-15 23:02       ` Han-Wen
  2002-07-16  9:22         ` Han-Wen Nienhuys
  2002-07-16 23:04         ` Dirk Herrmann
  0 siblings, 2 replies; 13+ messages in thread
From: Han-Wen @ 2002-07-15 23:02 UTC (permalink / raw)
  Cc: guile-devel, jantien

dirk@sallust.ida.ing.tu-bs.de writes:
> > 	SCM scm_changing_cell()
> > 	SCM scm_changing_vector()
> > 
> > They would be allocated from a separate memory pool. Also if we
> > redefine the GC engine to use a different function internally
> > (i.e. scm_gc_internal_mark), then we can distinguish between data
> > cells that may be moved (only pointed to from within the heap), and
> > other cells (marked by scm_gc_mark: conservatively marked entries,
> > pointed to from smobs, etc.).
> > 
> > The copiable cells could be subjected to copying GC, although it would
> > require another scan of the entire heap.
> 
> It's a good idea - unfortunately it is patented.  Don't ask me who has
> this patented, but I also suggested this idea some time ago.  That's the
> bad thing about patents:  Whenever there's a good idea, you can't be sure
> it is patented.
> 
> But, it may make sense to investigate this patent issue again.  Maybe the
> information above is outdated and the idea can actually be used in free
> software?
> 
> In this case, I wouldn't like to have the API extended as you suggest
> above:  Then it would be easy to re-arrange cells in cards such that
> untouched cells are moved into common cards - no need to extend the API.

I don't know. Some form of the idea seems to be patented, but I can't
access the text of the patent, so I'm not sure (it's no. 251554).
(and even if I could -- you never know: the only person qualified to
judge wether a patent is applicable are IP lawyers and IP judges. With
those millions and millions of patents it is unlikely that none of
them are covering GUILE as it is).

Interestingly, his techreport
(ftp://gatekeeper.dec.com/pub/DEC/WRL/research-reports/WRL-TR-88.2.ps)
contains C source code for the garbage collector -- but that's no
permission to use it, right, and the "patents" page at DEC doesn't
mention the patent.

(some time later I dug it up anyway: 

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/search-bool.html&r=1&f=G&l=50&co1=AND&d=ft90&s1='digital+equipment'.ASNM.&s2=bartlett.INZZ.&OS=AN/%22digital+equipment%22+AND+IN/bartlett&RS=AN/%22digital+equipment%22+AND+IN/bartlett

)

I'm reading the patent (gosh, what verbosity) -- but interestingly,
the claim is only made for not copying entire pages (cards in Scheme
terms), that are marked conservatively. If you one uses a second set
of mark bits, you could leave alone exactly those objects that are
marked conservatively (iso. the whole page). The price is that you
have another mark-bit vector (1.5 % memory overhead), that the free
space in the conservatively marked pages is fragmented, and that there
is more overhead (checking the marked-conservatively-bit) during the
copy phase.

> This is true if there is code.  As long as there is just a
> conceptual phase, there is no code to comment on.  And, the current
> situation is a counterexample to your suggestion: If we had started
> the workbook idea sooner, we would not have to go around asking for
> things about gc that others have already thought about.

Ok. So how do I go about adding to the workbook? Do I send simply send
patches?

> > > Hmmm?  Where should this be necessary?  Do you want to modify the vector
> > > cell itself, or are you using this for "speed ups"?  In the "speed-up"
> > > case:  wouldn't SCM_VECTOR_SET do as well, given that the compiler can
> > > extract constant expressions from within loops?
> > 
> > In some cases no (GC manipulations of weak vectors), and in some cases
> > yes, but it would introduce lots of redtape: long lists of
> > SCM_VECTOR_SET calls. I'll review the cases once more. 
> 
> GC manipulations of weak vectors?  These shouldn't require any write
> barrier.  And, even if so, there's already SCM_SET_WVECT_GC_CHAIN.

No, exactly my point. If you add some kind of GC related code that
writes header from SCM_VECTOR_SET() (to flag object writes), then you
shouldn't use it in code that is related to the GC. Also flagging
object writes is overhead, so you typically want to move it out of
loops if possible.

-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl   |   http://www.cs.uu.nl/~hanwen 

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 23:02       ` Han-Wen
@ 2002-07-16  9:22         ` Han-Wen Nienhuys
  2002-07-16 23:04         ` Dirk Herrmann
  1 sibling, 0 replies; 13+ messages in thread
From: Han-Wen Nienhuys @ 2002-07-16  9:22 UTC (permalink / raw)


hanwen@cs.uu.nl writes:
> I'm reading the patent (gosh, what verbosity) -- but interestingly,
> the claim is only made for not copying entire pages (cards in Scheme
> terms), that are marked conservatively. If you one uses a second set
> of mark bits, you could leave alone exactly those objects that are
> marked conservatively (iso. the whole page). The price is that you
> have another mark-bit vector (1.5 % memory overhead), that the free
> space in the conservatively marked pages is fragmented, and that there
> is more overhead (checking the marked-conservatively-bit) during the
> copy phase.

Actually, I can top that. If you store the range of cells that is
marked conservatively, you can still beat the patent but avoid the 2nd
bitvector overhead. All this (bitvector ptr, conservative-cell range,
generation count and card) flags could even be stored in the 1st
header cell.

-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl    | http://www.cs.uu.nl/~hanwen/


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-15 23:02       ` Han-Wen
  2002-07-16  9:22         ` Han-Wen Nienhuys
@ 2002-07-16 23:04         ` Dirk Herrmann
  2002-07-16 23:26           ` Han-Wen
  2002-07-17 21:53           ` Han-Wen
  1 sibling, 2 replies; 13+ messages in thread
From: Dirk Herrmann @ 2002-07-16 23:04 UTC (permalink / raw)
  Cc: guile-devel, jantien

On Tue, 16 Jul 2002, Han-Wen wrote:

> I don't know. Some form of the idea seems to be patented, but I can't
> access the text of the patent, so I'm not sure (it's no. 251554).
> (and even if I could -- you never know: the only person qualified to
> judge wether a patent is applicable are IP lawyers and IP judges. With
> those millions and millions of patents it is unlikely that none of
> them are covering GUILE as it is).

Maybe it would make sense to ask the FSF about the issue:  If the patent
is not actually claimed or if there is an agreement to allow its use in
free software, there wouldn't be any restrictions at all.  Probably Marius
will know about how to contact the FSF legal department to get an official
statement about it.

> I'm reading the patent (gosh, what verbosity) -- but interestingly,
> the claim is only made for not copying entire pages (cards in Scheme
> terms), that are marked conservatively. If you one uses a second set
> of mark bits, you could leave alone exactly those objects that are
> marked conservatively (iso. the whole page). The price is that you
> have another mark-bit vector (1.5 % memory overhead), that the free
> space in the conservatively marked pages is fragmented, and that there
> is more overhead (checking the marked-conservatively-bit) during the
> copy phase.

Interesting.  But:  isn't the conservatively marked page fragmented
anyway?  At some time a few objects in a conservatively marked page will
be collected.  That is, wouldn't we need to keep track of conservatively
marked cells (in contrast to pages) anyway?

> > > > Hmmm?  Where should this be necessary?  Do you want to modify the vector
> > > > cell itself, or are you using this for "speed ups"?  In the "speed-up"
> > > > case:  wouldn't SCM_VECTOR_SET do as well, given that the compiler can
> > > > extract constant expressions from within loops?
> > > 
> > > In some cases no (GC manipulations of weak vectors), and in some cases
> > > yes, but it would introduce lots of redtape: long lists of
> > > SCM_VECTOR_SET calls. I'll review the cases once more. 
> > 
> > GC manipulations of weak vectors?  These shouldn't require any write
> > barrier.  And, even if so, there's already SCM_SET_WVECT_GC_CHAIN.
> 
> No, exactly my point. If you add some kind of GC related code that
> writes header from SCM_VECTOR_SET() (to flag object writes), then you
> shouldn't use it in code that is related to the GC. Also flagging
> object writes is overhead, so you typically want to move it out of
> loops if possible.

Wouldn't the compiler move the code out anyway?  (I don't know how
complicated the flagging code would look like, so it may be that this is
impossible...)

According to the point that during GC there shouldn't be any flagging
done:  We already do have special cell accessing macros for GC:
SCM_GC_CELL_WORD and friends.  The point is, that up to now these have had
a different intention:  They don't check if their cells are actually
cells - even in debug mode.  In contrast, SCM_CELL_WORD and friend do
exhaustive checking in debug mode.

Maybe we could re-define the meaning of SCM_GC_CELL_WORD and friends to
mean the following:  The macros are only to be used in code related to GC.  
The cells are never checked for being cells, and the cells are not flagged
for gengc.  For those cases, where cells are to be modified during GC but
still need to be flagged, there could be a special macro to explicitly
flag the cells - probably such a macro would exist anyway.

Best regards,
Dirk Herrmann


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-16 23:04         ` Dirk Herrmann
@ 2002-07-16 23:26           ` Han-Wen
  2002-07-17 21:53           ` Han-Wen
  1 sibling, 0 replies; 13+ messages in thread
From: Han-Wen @ 2002-07-16 23:26 UTC (permalink / raw)
  Cc: guile-devel, jantien

dirk@sallust.ida.ing.tu-bs.de writes:
> On Tue, 16 Jul 2002, Han-Wen wrote:
> 
> > I don't know. Some form of the idea seems to be patented, but I can't
> > access the text of the patent, so I'm not sure (it's no. 251554).
> > (and even if I could -- you never know: the only person qualified to
> > judge wether a patent is applicable are IP lawyers and IP judges. With
> > those millions and millions of patents it is unlikely that none of
> > them are covering GUILE as it is).
> 
> Maybe it would make sense to ask the FSF about the issue:  If the patent
> is not actually claimed or if there is an agreement to allow its use in
> free software, there wouldn't be any restrictions at all.  Probably Marius
> will know about how to contact the FSF legal department to get an official
> statement about it.

(short reply, I'm busy with something else right now.)

I talked to Michael Livshin, who says to have quit GUILE development.

He notes that CMU CL and its derivative SBCL have been using a mostly
copying gen GC for 6 years now. Squish (a C++ library) also uses
it. Compaq is not actively enforcing this patent, so it seems. (And
"only" 5 years until it is no longer valid)

> Wouldn't the compiler move the code out anyway?  (I don't know how
> complicated the flagging code would look like, so it may be that this is
> impossible...)

It depends on what code is in the loop. But I guess you're right. The
cases where it is surely safe to move the flagging, are also the ones
that a compiler could optimize.

-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl   |   http://www.cs.uu.nl/~hanwen 

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-16 23:04         ` Dirk Herrmann
  2002-07-16 23:26           ` Han-Wen
@ 2002-07-17 21:53           ` Han-Wen
  2002-07-19 23:48             ` Dirk Herrmann
  1 sibling, 1 reply; 13+ messages in thread
From: Han-Wen @ 2002-07-17 21:53 UTC (permalink / raw)
  Cc: guile-devel, jantien

dirk@sallust.ida.ing.tu-bs.de writes:
> > I'm reading the patent (gosh, what verbosity) -- but interestingly,
> > the claim is only made for not copying entire pages (cards in Scheme
> > terms), that are marked conservatively. If you one uses a second set
> > of mark bits, you could leave alone exactly those objects that are
> > marked conservatively (iso. the whole page). The price is that you
> > have another mark-bit vector (1.5 % memory overhead), that the free
> > space in the conservatively marked pages is fragmented, and that there
> > is more overhead (checking the marked-conservatively-bit) during the
> > copy phase.
> 
> Interesting.  But:  isn't the conservatively marked page fragmented
> anyway?  At some time a few objects in a conservatively marked page will
> be collected.  That is, wouldn't we need to keep track of conservatively
> marked cells (in contrast to pages) anyway?

Yes, but the point is that you can not move marked cells in a
conservatively marked page.  Somewhere along the line (if you do
copying GC),  you have to say

	if (conservatively_marked (cell))
	    ; 
	else
	   copy_cell_to_new_location ()
	   

for bartlett conservatively_marked() equals "this page contains a
conservatively_marked object". 

-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl   |   http://www.cs.uu.nl/~hanwen 

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: GUILE GC -- Write barrier for vectors
  2002-07-17 21:53           ` Han-Wen
@ 2002-07-19 23:48             ` Dirk Herrmann
  0 siblings, 0 replies; 13+ messages in thread
From: Dirk Herrmann @ 2002-07-19 23:48 UTC (permalink / raw)
  Cc: guile-devel, jantien

On Wed, 17 Jul 2002, Han-Wen wrote:

> dirk@sallust.ida.ing.tu-bs.de writes:
> > > I'm reading the patent (gosh, what verbosity) -- but interestingly,
> > > the claim is only made for not copying entire pages (cards in Scheme
> > > terms), that are marked conservatively. If you one uses a second set
> > > of mark bits, you could leave alone exactly those objects that are
> > > marked conservatively (iso. the whole page). The price is that you
> > > have another mark-bit vector (1.5 % memory overhead), that the free
> > > space in the conservatively marked pages is fragmented, and that there
> > > is more overhead (checking the marked-conservatively-bit) during the
> > > copy phase.
> > 
> > Interesting.  But:  isn't the conservatively marked page fragmented
> > anyway?  At some time a few objects in a conservatively marked page will
> > be collected.  That is, wouldn't we need to keep track of conservatively
> > marked cells (in contrast to pages) anyway?
> 
> Yes, but the point is that you can not move marked cells in a
> conservatively marked page.  Somewhere along the line (if you do
> copying GC),  you have to say
> 
> 	if (conservatively_marked (cell))
> 	    ; 
> 	else
> 	   copy_cell_to_new_location ()
> 	   
> 
> for bartlett conservatively_marked() equals "this page contains a
> conservatively_marked object". 

If I understand things right, then the technique to treat a whole page as
conservatively marked even if only a single one of its cells is actually
conservatively marked is a kind of optimization. Now, the question is,
whether this optimization is useful:  Every page that is marked
conservatively will be kept occupied as a whole, even if all cells except
for the one conservatively marked cell are collected.  That is, the free
space in that page is fragmented anyway.  To re-use the free cells in that
fragmented page would require handling of fragmented pages.

Best regards,
Dirk


_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2002-07-19 23:48 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-07-14 22:14 GUILE GC -- Write barrier for vectors hanwen
2002-07-15 10:56 ` Miroslav Silovic
2002-07-15 11:00   ` Han-Wen Nienhuys
2002-07-15 11:36     ` Miroslav Silovic
2002-07-15 16:49 ` Dirk Herrmann
2002-07-15 18:22   ` Han-Wen Nienhuys
2002-07-15 20:07     ` Dirk Herrmann
2002-07-15 23:02       ` Han-Wen
2002-07-16  9:22         ` Han-Wen Nienhuys
2002-07-16 23:04         ` Dirk Herrmann
2002-07-16 23:26           ` Han-Wen
2002-07-17 21:53           ` Han-Wen
2002-07-19 23:48             ` Dirk Herrmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).