unofficial mirror of guile-devel@gnu.org 
* substantial performance loss when running long-lived or computationally intensive programs
@ 2003-05-06  3:36 Michael A. Wells
  2003-05-14 11:14 ` Han-Wen Nienhuys
  0 siblings, 1 reply; 4+ messages in thread
From: Michael A. Wells @ 2003-05-06  3:36 UTC (permalink / raw)



The Guile interpreter comes to a near halt after making a sufficiently
large number of calls to guile built-ins which do not appropriately
maintain the 'scm_mallocated' variable (declared in "libguile/gc.c").

Although this problem is present in Guile-1.6.4, the problem dates
back to at least Guile 1.4.

There appears to be a widespread misunderstanding as to how
'scm_mallocated' should be maintained.

[Several other people have discussed this issue:
    http://mail.gnu.org/archive/html/bug-guile/2001-05/msg00024.html
    http://www.glug.org/snap/workbook/gc/memory.text ]

There is no comment in "libguile/gc.c" which describes what
'scm_mallocated' should contain, but by looking through the code, it
appears that 'scm_mallocated' should correspond to the total number of
bytes obtained through malloc currently held by the interpreter. 

Each time a block of memory of size S is allocated, 'scm_mallocated'
should be incremented by S.

Each time a block of memory of size S is garbage collected and/or
freed, 'scm_mallocated' should be decremented by S.
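
As a rough illustration of the accounting rule described above, here is a
minimal C sketch.  The names 'tracked_bytes', 'tracked_malloc' and
'tracked_free' are invented for this example and are not part of libguile;
the only point is that every increment on allocation has a matching
decrement on release.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'scm_mallocated': bytes currently held. */
static unsigned long tracked_bytes = 0;

/* Allocate 'size' bytes and add them to the running total. */
static void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        tracked_bytes += size;
    return p;
}

/* Free a block and subtract its size from the running total.
   The caller must pass the size the block was allocated with. */
static void tracked_free(void *p, size_t size)
{
    if (p == NULL)
        return;
    assert(tracked_bytes >= size);  /* the total must never go negative */
    free(p);
    tracked_bytes -= size;
}
```

With this pairing, allocating 100 bytes and 28 bytes and then freeing both
leaves 'tracked_bytes' back at its starting value.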

'scm_mallocated', in combination with 'scm_mtrigger', is used to
determine when certain calls to the garbage collector (scm_igc)
are triggered.

When 'scm_mallocated' is greater than 'scm_mtrigger', garbage
collection is triggered.  After garbage collection, if the yield is too
small, 'scm_mtrigger' is raised relative to its previous value.  (See
the body of the 'check_mtrigger' function.)
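
The trigger logic described above can be sketched roughly as follows.  This
is not the code from "libguile/gc.c": the names, the HYSTERESIS margin, the
1.5x growth step and the fake collector are all assumptions made for the
sake of a small self-contained example.

```c
#include <assert.h>

/* Hypothetical stand-ins for 'scm_mallocated' and 'scm_mtrigger'. */
static unsigned long mallocated = 0;
static unsigned long mtrigger = 100000;
static unsigned long gc_runs = 0;

/* Assumed margin: a collection should leave at least this much
   room below the trigger, otherwise the yield counts as too small. */
#define HYSTERESIS 40000UL

/* Placeholder collector: pretends to reclaim 'reclaimable' bytes. */
static void fake_gc(unsigned long reclaimable)
{
    assert(reclaimable <= mallocated);
    mallocated -= reclaimable;
    gc_runs++;
}

/* Collect when the byte count exceeds the trigger; if the collection
   did not free enough, raise the trigger by half of its old value. */
static void check_trigger(unsigned long reclaimable)
{
    if (mallocated > mtrigger) {
        fake_gc(reclaimable);
        if (mallocated + HYSTERESIS > mtrigger)
            mtrigger += mtrigger / 2;
    }
}
```

In this sketch a collection that reclaims most of the memory leaves the
trigger alone, while one that reclaims little pushes it upward, which is
the behaviour that goes wrong once the byte count only ever grows.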

[Garbage collections are triggered by functions other than
'check_mtrigger', which turns out to be a good thing.  If
'check_mtrigger' were the only caller of 'scm_igc', the size of the
guile image could grow very rapidly.]

Unfortunately, in some cases 'scm_mallocated' is incremented when the
memory is allocated, but 'scm_mallocated' is not decremented when the
memory is freed.

As 'scm_mallocated' approaches 2^32, the value of 'scm_mtrigger' may be
set to a value _intended_ to be higher than 'scm_mallocated', but ends
up wrapping around to a value lower than 'scm_mallocated'.
It is also possible that 'scm_mallocated' will wrap around as well.

Once a wraparound occurs, the interpreter comes to a near halt.
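
To make the wraparound concrete, here is a small C sketch using explicit
32-bit unsigned arithmetic.  The 1.5x growth rule and the function names
are assumptions for illustration only; the real update lives in
"libguile/gc.c" and may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed growth rule: set the trigger to 1.5x the current byte count.
   The result is truncated to 32 bits, so it wraps modulo 2^32. */
static uint32_t next_trigger(uint32_t mallocated)
{
    return mallocated + mallocated / 2;
}

/* True when the freshly computed trigger is already below the byte
   count, i.e. the very next check would ask for another collection. */
static int trigger_is_stale(uint32_t mallocated)
{
    return next_trigger(mallocated) < mallocated;
}
```

For a count of 1000000000 the new trigger is 1500000000, as intended.  For
a count of 3000000000 the sum 4500000000 does not fit in 32 bits and wraps
to 205032704, far below the count it was supposed to exceed.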

I've identified a _partial_ list of code which increments but does not
decrement 'scm_mallocated':
 
   libguile/filesys.c: getcwd, readlink
   libguile/fports.c: fport_close
   libguile/posix.c: gethostname
   libguile/ports.c: scm_remove_from_port_table
   libguile/regex-posix.c: scm_regexp_exec

      These functions allocate memory with 'scm_must_malloc' and
      'scm_must_realloc', then free memory with a call to
      'scm_must_free'.  The call to 'scm_must_free' is not made
      by the garbage collector.

      While 'scm_must_malloc' and 'scm_must_realloc' increment 
      'scm_mallocated',  'scm_must_free' does not decrement
      'scm_mallocated'.
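
The pattern in the functions listed above can be imitated in a few lines of
C.  This is only a model of the behaviour described here, with made-up
names ('counted_bytes', 'counting_malloc', 'uncounted_free'); it does not
call into libguile.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'scm_mallocated'. */
static unsigned long counted_bytes = 0;

/* Models the allocating side: the counter goes up. */
static void *counting_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        counted_bytes += size;
    return p;
}

/* Models the freeing side as described: the counter is left alone. */
static void uncounted_free(void *p)
{
    free(p);
}

/* One call of a built-in that needs a temporary buffer. */
static void builtin_with_temp_buffer(size_t size)
{
    assert(size > 0);
    char *buf = counting_malloc(size);
    if (buf != NULL) {
        buf[0] = '\0';          /* pretend to use the buffer */
        uncounted_free(buf);
    }
}

/* How far the counter drifts after 'calls' balanced alloc/free pairs. */
static unsigned long drift_after(unsigned long calls, size_t size)
{
    unsigned long before = counted_bytes;
    for (unsigned long i = 0; i < calls; i++)
        builtin_with_temp_buffer(size);
    return counted_bytes - before;
}
```

No memory is actually leaked in this model, yet after 1000 calls with a
256-byte buffer the counter has grown by 256000 bytes, and it keeps
growing for as long as the program runs.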

I have attached some scheme code which demonstrates how certain
guile built-ins increment but do not decrement 'scm_mallocated'.

Thanks,
Michael Wells


;---snip here

(use-modules (ice-9 format))

(define (tester how-many-times thunk)
  ; force garbage collection
  (gc)
  (gc)
  (gc)
  ; get initial value of 'scm_mallocated'
  ;   (available under the 'bytes-malloced key in the result of (gc-stats))
  (let ((initial-bytes-malloced (cdr (assoc 'bytes-malloced (gc-stats))))
	(starting-seconds (tms:clock (times))))
    ; run 'thunk' 'how-many-times'
    (do ((x 0 (+ x 1)))
	((= x how-many-times) #t)
      (cond ((= (modulo x 1000) 0)
	     (let ((now (tms:clock (times))))
	       (display (format "   ~A : ~A clock units~%" x (- now starting-seconds)))
	       (set! starting-seconds now))))	     
      (thunk))
    ; force garbage collection
    (gc)
    (gc)
    (gc)
    (let ((final-bytes-malloced (cdr (assoc 'bytes-malloced (gc-stats)))))
      (display 
       (format "  final-bytes-malloced: ~A~%" final-bytes-malloced))
      (display
       (format "initial-bytes-malloced: ~A~%" initial-bytes-malloced))
      (display
       (format "       final - initial: ~A~%" (- final-bytes-malloced
						 initial-bytes-malloced))))))
; libguile/posix.c -- "gethostname"
(tester 10000 (lambda () (gethostname)))

;
; Performance suffers when 'scm_mallocated' is very large.
; Note how quickly each 1000 calls to '(gethostname)'
; run at first; compare with how long each 1000 calls take
; near the end of the run.
; (For me, performance grinds to a halt after the
; first 14779000 or so iterations of (gethostname).)
;(tester 20000000 (lambda () (gethostname)))
         

; fports.c -- "fport_close"
; ports.c -- "scm_remove_from_port_table"
(tester 1000
	(lambda () 
	  (with-output-to-string (lambda () 
				   (display "1234567890")
				   (display "1234567890")))))
				   

; libguile/regex-posix.c -- "scm_regexp_exec"
(define *token* (make-regexp "^[A-Za-z0-9_]*" ))

(tester 1000 (lambda () (regexp-exec *token* "foo")))

; libguile/filesys.c "getcwd"
(tester 1000 (lambda () (getcwd)))



_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel



* substantial performance loss when running long-lived or computationally intensive programs
  2003-05-06  3:36 substantial performance loss when running long-lived or computationally intensive programs Michael A. Wells
@ 2003-05-14 11:14 ` Han-Wen Nienhuys
  2003-05-14 14:36   ` Mikael Djurfeldt
  0 siblings, 1 reply; 4+ messages in thread
From: Han-Wen Nienhuys @ 2003-05-14 11:14 UTC (permalink / raw)
  Cc: guile-devel

michaelawells@motorola.com writes:
> The Guile interpreter comes to a near halt after making a sufficiently
> large number of calls to guile built-ins which do not appropriately
> maintain the 'scm_mallocated' variable (declared in "libguile/gc.c").

> There appears to be a widespread misunderstanding as to how
> 'scm_mallocated' should be maintained.
> 
> There is no comment in "libguile/gc.c" which describes what
> 'scm_mallocated' should contain, but by looking through the code, it
> appears that 'scm_mallocated' should correspond to the total number of
> bytes obtained through malloc currently held by the interpreter. 

Your analysis is probably correct (I didn't check in detail), but I'm
not sure if it would be a good idea to fix all of the instances. It
would touch a lot of code for a stable release. The things you found
are fixed in 1.7.

> As 'scm_mallocated' approaches 2^32, the value of 'scm_mtrigger' may be
> set to a value _intended_ to be higher than 'scm_mallocated', but ends
> up wrapping around to a value lower than 'scm_mallocated'.
> It is also possible that 'scm_mallocated' will wrap around as well.

I've added checks in scm_gc_register_collectable_memory() in HEAD to
catch these corner-case mistakes.  (This fix might be eligible for
backporting to 1.6, although it is probably moot as long as the other
problems you indicated aren't fixed.)
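
For readers wondering what such a check might look like, here is a
minimal C sketch of overflow-guarded accounting.  It is not the code in
HEAD: 'registered_bytes', 'register_bytes' and 'unregister_bytes' are
hypothetical names, and the real scm_gc_register_collectable_memory()
may handle the situation differently.

```c
#include <limits.h>

/* Hypothetical stand-in for the malloc byte counter. */
static unsigned long registered_bytes = 0;

/* Add 'size' to the counter.  Returns 1 on success, or 0 (leaving the
   counter untouched) if the addition would wrap around. */
static int register_bytes(unsigned long size)
{
    if (size > ULONG_MAX - registered_bytes)
        return 0;
    registered_bytes += size;
    return 1;
}

/* Subtract 'size' from the counter.  Returns 1 on success, or 0
   (leaving the counter untouched) if it would drop below zero. */
static int unregister_bytes(unsigned long size)
{
    if (size > registered_bytes)
        return 0;
    registered_bytes -= size;
    return 1;
}
```

The comparison against ULONG_MAX is done before the addition, so the
check itself cannot overflow.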

One thing that does nag me, is code in srcprop.c that manipulates
scm_mallocated directly and doesn't call
scm_gc_register_collectable_memory(). Is there a good reason for this?


-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl   |   http://www.cs.uu.nl/~hanwen 


_______________________________________________
Bug-guile mailing list
Bug-guile@gnu.org
http://mail.gnu.org/mailman/listinfo/bug-guile



* Re: substantial performance loss when running long-lived or computationally intensive programs
  2003-05-14 11:14 ` Han-Wen Nienhuys
@ 2003-05-14 14:36   ` Mikael Djurfeldt
  2003-05-15 23:10     ` Han-Wen Nienhuys
  0 siblings, 1 reply; 4+ messages in thread
From: Mikael Djurfeldt @ 2003-05-14 14:36 UTC (permalink / raw)
  Cc: guile-devel

Han-Wen Nienhuys <hanwen@cs.uu.nl> writes:

> One thing that does nag me, is code in srcprop.c that manipulates
> scm_mallocated directly and doesn't call
> scm_gc_register_collectable_memory(). Is there a good reason for this?

Nope.  Only that scm_gc_register_collectable_memory() didn't exist
when srcprop.c was written in 1996.  In fact, at that time, direct
manipulation was probably the only possibility.

Please feel free to change this.

M




* Re: substantial performance loss when running long-lived or computationally intensive programs
  2003-05-14 14:36   ` Mikael Djurfeldt
@ 2003-05-15 23:10     ` Han-Wen Nienhuys
  0 siblings, 0 replies; 4+ messages in thread
From: Han-Wen Nienhuys @ 2003-05-15 23:10 UTC (permalink / raw)
  Cc: guile-devel

djurfeldt@nada.kth.se writes:
> Han-Wen Nienhuys <hanwen@cs.uu.nl> writes:
> 
> > One thing that does nag me, is code in srcprop.c that manipulates
> > scm_mallocated directly and doesn't call
> > scm_gc_register_collectable_memory(). Is there a good reason for this?
> 
> Nope.  Only that scm_gc_register_collectable_memory() didn't exist
> when srcprop.c was written in 1996.  In fact, at that time, direct
> manipulation was probably the only possibility.
> 
> Please feel free to change this.

Done.

-- 

Han-Wen Nienhuys   |   hanwen@cs.uu.nl   |   http://www.cs.uu.nl/~hanwen 

