From: "Michael A. Wells" <michaelawells@motorola.com>
Subject: substantial performance loss when running long-lived or computationally intensive programs
Date: Mon, 05 May 2003 22:36:26 -0500 (CDT)
Message-ID: <20030505.223626.84190828.wells@email.mot.com>


The Guile interpreter comes to a near halt after making a sufficiently
large number of calls to Guile built-ins which do not appropriately
maintain the 'scm_mallocated' variable (declared in "libguile/gc.c").

Although this problem is present in Guile-1.6.4, the problem dates
back to at least Guile 1.4.

There appears to be a widespread misunderstanding as to how
'scm_mallocated' should be maintained.

[Several other people have discussed this issue:
    http://mail.gnu.org/archive/html/bug-guile/2001-05/msg00024.html
    http://www.glug.org/snap/workbook/gc/memory.text ]

There is no comment in "libguile/gc.c" which describes what
'scm_mallocated' should contain, but by looking through the code, it
appears that 'scm_mallocated' should correspond to the total number of
bytes obtained through malloc currently held by the interpreter. 

Each time a block of memory of size S is allocated, 'scm_mallocated'
should be incremented by S.

Each time a block of memory of size S is garbage collected and/or
freed, 'scm_mallocated' should be decremented by S.
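
As a rough illustration of the accounting rule above (this is not libguile
code), balanced bookkeeping could look like the sketch below.  The names
'tracked_bytes', 'tracked_malloc' and 'tracked_free' are hypothetical
stand-ins, and passing the block size to the free routine is an assumption
made only to keep the example small.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'scm_mallocated': bytes currently held. */
static unsigned long tracked_bytes = 0;

/* Allocate 'size' bytes and add them to the running total. */
static void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        tracked_bytes += size;
    return p;
}

/* Free a block of 'size' bytes and subtract them from the total.
   The caller must pass the same size it allocated with. */
static void tracked_free(void *p, size_t size)
{
    if (p == NULL)
        return;
    assert(tracked_bytes >= size);
    free(p);
    tracked_bytes -= size;
}
```

With this pairing, an allocation followed by the matching free leaves the
counter where it started.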

'scm_mallocated', in combination with 'scm_mtrigger', is used to
determine when certain calls to the garbage collector ('scm_igc')
are triggered.

When 'scm_mallocated' is greater than 'scm_mtrigger', garbage
collection is triggered.  After garbage collection, if the yield is too
small, 'scm_mtrigger' is increased relative to its current value. (See
the body of the 'check_mtrigger' function.)
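
The trigger logic described above can be sketched as follows.  This is a
simplified model, not the actual body of 'check_mtrigger': the struct, the
'collect' stand-in, the 'min_yield' threshold and the "grow by half" policy
are all assumptions chosen for illustration.

```c
#include <assert.h>

/* Hypothetical collector state; names are illustrative, not libguile's. */
struct heap_state {
    unsigned long mallocated;   /* bytes currently accounted for */
    unsigned long mtrigger;     /* collect when mallocated exceeds this */
    unsigned long collections;  /* number of collections run so far */
};

/* Stand-in for a collection that reclaims 'reclaimable' bytes. */
static void collect(struct heap_state *h, unsigned long reclaimable)
{
    assert(reclaimable <= h->mallocated);
    h->mallocated -= reclaimable;
    h->collections++;
}

/* Run a collection when the counter passes the trigger.  If the
   collection yields less than 'min_yield' bytes, raise the trigger
   relative to its current value (assumed policy: grow by half). */
static void check_trigger(struct heap_state *h,
                          unsigned long reclaimable,
                          unsigned long min_yield)
{
    if (h->mallocated <= h->mtrigger)
        return;
    unsigned long before = h->mallocated;
    collect(h, reclaimable);
    if (before - h->mallocated < min_yield)
        h->mtrigger += h->mtrigger / 2;
}
```

In this model a counter that only ever grows forces the trigger upward after
every poor-yield collection, which is the situation described in the rest of
this report.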

[Garbage collections are triggered by functions other than
'check_mtrigger', which turns out to be a good thing.  If
'check_mtrigger' were the only caller of 'scm_igc', the size of the
Guile image could grow very rapidly.]

Unfortunately, in some cases 'scm_mallocated' is incremented when the
memory is allocated, but 'scm_mallocated' is not decremented when the
memory is freed.

As 'scm_mallocated' approaches 2^32, 'scm_mtrigger' may be set to a
value _intended_ to be higher than 'scm_mallocated' that instead
wraps around to a value lower than 'scm_mallocated'.
'scm_mallocated' itself may wrap around as well.

Once a wraparound occurs, the interpreter comes to a near halt.
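
The wraparound can be illustrated with 32-bit unsigned arithmetic.  In the
sketch below, 'uint32_t' models a 32-bit counter, and the policy "new
trigger = counter + counter / 2" is an assumption made for illustration; the
function names are hypothetical and do not appear in libguile.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed growth policy: new trigger = counter + counter / 2, computed
   in 32-bit unsigned arithmetic, which wraps modulo 2^32. */
static uint32_t raise_trigger(uint32_t mallocated)
{
    return (uint32_t)(mallocated + mallocated / 2);
}

/* Returns nonzero when the "raised" trigger ended up below the counter
   it was meant to exceed, i.e. the addition wrapped around. */
static int trigger_wrapped(uint32_t mallocated)
{
    return raise_trigger(mallocated) < mallocated;
}
```

For a counter of 3000000000, the intended trigger is 4500000000, which does
not fit in 32 bits; the stored value is 4500000000 - 2^32 = 205032704, far
below the counter, so the trigger condition stays true after every
collection.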

I've identified a _partial_ list of code which increments but does not
decrement 'scm_mallocated':
 
   libguile/filesys.c: getcwd, readlink
   libguile/fports.c: fport_close
   libguile/posix.c: gethostname
   libguile/ports.c: scm_remove_from_port_table
   libguile/regex-posix.c: scm_regexp_exec

      These functions allocate memory with 'scm_must_malloc' and
      'scm_must_realloc', then free memory with a call to
      'scm_must_free'.  The call to 'scm_must_free' is not made
      by the garbage collector.

      While 'scm_must_malloc' and 'scm_must_realloc' increment 
      'scm_mallocated',  'scm_must_free' does not decrement
      'scm_mallocated'.
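
A minimal model of this imbalance (not libguile code) shows how the counter
drifts even though no memory is actually leaked.  'counted_bytes',
'counting_malloc', 'uncounted_free' and 'drift_after' are hypothetical names
that only mimic the behaviour described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'scm_mallocated'. */
static unsigned long counted_bytes = 0;

/* Mimics the described allocation side: allocate and count the bytes. */
static void *counting_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        counted_bytes += size;
    return p;
}

/* Mimics the described free side: release the memory but leave the
   counter untouched. */
static void uncounted_free(void *p)
{
    free(p);
}

/* Allocate and immediately free 'size' bytes 'calls' times, and return
   how far the counter moved even though nothing is still held. */
static unsigned long drift_after(unsigned long calls, size_t size)
{
    unsigned long start = counted_bytes;
    for (unsigned long i = 0; i < calls; i++) {
        void *p = counting_malloc(size);
        assert(p != NULL);
        uncounted_free(p);
    }
    return counted_bytes - start;
}
```

Each call adds its full allocation size to the counter permanently, so the
drift grows linearly with the number of calls.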

I have attached some Scheme code which demonstrates how certain
Guile built-ins increment but do not decrement 'scm_mallocated'.

Thanks,
Michael Wells


;---snip here

(use-modules (ice-9 format))

(define (tester how-many-times thunk)
  ; force garbage collection
  (gc)
  (gc)
  (gc)
  ; get initial value of 'scm_mallocated'
  ;   (available under the 'bytes-malloced key in the result of (gc-stats))
  (let ((initial-bytes-malloced (cdr (assoc 'bytes-malloced (gc-stats))))
	(starting-seconds (tms:clock (times))))
    ; run 'thunk' 'how-many-times'
    (do ((x 0 (+ x 1)))
	((= x how-many-times) #t)
      (cond ((= (modulo x 1000) 0)
	     (let ((now (tms:clock (times))))
	       (display (format "   ~A : ~A clock units~%" x (- now starting-seconds)))
	       (set! starting-seconds now))))	     
      (thunk))
    ; force garbage collection
    (gc)
    (gc)
    (gc)
    (let ((final-bytes-malloced (cdr (assoc 'bytes-malloced (gc-stats)))))
      (display 
       (format "  final-bytes-malloced: ~A~%" final-bytes-malloced))
      (display
       (format "initial-bytes-malloced: ~A~%" initial-bytes-malloced))
      (display
       (format "       final - initial: ~A~%" (- final-bytes-malloced
						 initial-bytes-malloced))))))
; libguile/posix.c -- "gethostname"
(tester 10000 (lambda () (gethostname)))

;
; Performance suffers when 'scm_mallocated' is very large.
; Note how quickly each 1000 calls to '(gethostname)'
; run at first; compare with how long each 1000 calls take
; near the end of the run.
; (For me, performance grinds to a halt after the
; first 14779000 or so iterations of (gethostname).)
;(tester 20000000 (lambda () (gethostname)))
         

; fports.c -- "fport_close"
; ports.c -- "scm_remove_from_port_table"
(tester 1000
	(lambda () 
	  (with-output-to-string (lambda () 
				   (display "1234567890")
				   (display "1234567890")))))
				   

; libguile/regex-posix.c -- "scm_regexp_exec"
(define *token* (make-regexp "^[A-Za-z0-9_]*" ))

(tester 1000 (lambda () (regexp-exec *token* "foo")))

; libguile/filesys.c "getcwd"
(tester 1000 (lambda () (getcwd)))




