From: "Michael A. Wells" <michaelawells@motorola.com>
Subject: substantial performance loss when running long-lived or computationally intensive programs
Date: Mon, 05 May 2003 22:36:26 -0500 (CDT)
Message-ID: <20030505.223626.84190828.wells@email.mot.com>
The Guile interpreter comes to a near halt after making a sufficiently
large number of calls to Guile built-ins which do not appropriately
maintain the 'scm_mallocated' variable (declared in "libguile/gc.c").
This problem is present in Guile 1.6.4 and dates back to at least
Guile 1.4.
There appears to be a widespread misunderstanding as to how
'scm_mallocated' should be maintained.
[Several other people have discussed this issue:
http://mail.gnu.org/archive/html/bug-guile/2001-05/msg00024.html
http://www.glug.org/snap/workbook/gc/memory.text ]
There is no comment in "libguile/gc.c" which describes what
'scm_mallocated' should contain, but by looking through the code, it
appears that 'scm_mallocated' should correspond to the total number of
bytes obtained through malloc currently held by the interpreter.
Each time a block of memory of size S is allocated, 'scm_mallocated'
should be incremented by S.
Each time a block of memory of size S is garbage collected and/or
freed, 'scm_mallocated' should be decremented by S.
'scm_mallocated', in combination with 'scm_mtrigger', is used to
determine when certain calls to the garbage collector ('scm_igc')
are triggered.
When 'scm_mallocated' is greater than 'scm_mtrigger', garbage
collection is triggered. After garbage collection, if the yield is too
small, 'scm_mtrigger' is increased, relative to 'scm_mallocated'. (See
the body of the 'check_mtrigger' function.)
[Garbage collections are also triggered by functions other than
'check_mtrigger', which turns out to be a good thing. If
'check_mtrigger' were the only caller of 'scm_igc', the size of the
Guile image could grow very rapidly.]
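The accounting and trigger rules described above can be sketched as a
small toy model. This is not the code from "libguile/gc.c": every name
below is hypothetical, the starting trigger of 100000 bytes is
arbitrary, and both the "yield too small" test and the growth step
(total plus half) are assumptions made only for illustration.

```c
#include <stddef.h>

/* Toy model only; these are NOT the libguile variables or functions. */
static unsigned long model_mallocated  = 0;      /* bytes currently accounted for */
static unsigned long model_mtrigger    = 100000; /* arbitrary starting trigger */
static unsigned long model_collections = 0;      /* how many times the toy GC ran */

/* Stand-in for the collector: pretend it reclaims `reclaimable` bytes. */
void model_gc(unsigned long reclaimable)
{
    model_collections++;
    if (reclaimable > model_mallocated)
        reclaimable = model_mallocated;
    model_mallocated -= reclaimable;
}

/* Account for an allocation of `size` bytes; collect if over the trigger. */
void model_account_malloc(size_t size, unsigned long reclaimable)
{
    model_mallocated += size;
    if (model_mallocated > model_mtrigger) {
        model_gc(reclaimable);
        /* Assumed policy: if the collection did not get us back under
           the trigger, raise the trigger relative to the current total. */
        if (model_mallocated > model_mtrigger)
            model_mtrigger = model_mallocated + model_mallocated / 2;
    }
}

/* Account for a block of `size` bytes being freed. */
void model_account_free(size_t size)
{
    model_mallocated -= size;
}
```

As long as every call to model_account_malloc is eventually matched by
a model_account_free of the same size, the counter tracks the bytes
actually held and the trigger stays meaningful.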
Unfortunately, in some cases 'scm_mallocated' is incremented when the
memory is allocated, but 'scm_mallocated' is not decremented when the
memory is freed.
As 'scm_mallocated' approaches 2^32, the value of 'scm_mtrigger' may be
set to a value _intended_ to be higher than 'scm_mallocated', but ends
up wrapping around to a value lower than 'scm_mallocated'.
It is also possible for 'scm_mallocated' itself to wrap around.
Once a wraparound occurs, the interpreter comes to a near halt.
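The wraparound can be illustrated with plain 32-bit unsigned
arithmetic. The sketch below assumes the new trigger is computed as
"current total plus half of it"; that rule and the helper names are
assumptions for illustration, not the exact libguile code.

```c
#include <stdint.h>

/* Hypothetical helper: next trigger as "total plus half", computed in
   32-bit unsigned arithmetic, so the sum is reduced modulo 2^32. */
uint32_t next_trigger_32(uint32_t total)
{
    return total + total / 2;
}

/* Returns 1 when the freshly computed trigger ended up below the
   total it was meant to exceed, i.e. the addition wrapped around. */
int trigger_wrapped(uint32_t total)
{
    return next_trigger_32(total) < total;
}
```

Under these assumptions a total of 2000000000 bytes gives a trigger of
3000000000, which is fine. A total of 3000000000 bytes gives
4500000000 modulo 2^32, which is 205032704: the trigger is now far
below the counter, so every subsequent check sees the counter above
the trigger.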
I've identified a _partial_ list of code which increments but does not
decrement 'scm_mallocated':
libguile/filesys.c: getcwd, readlink
libguile/fports.c: fport_close
libguile/posix.c: gethostname
libguile/ports.c: scm_remove_from_port_table
libguile/regex-posix.c: scm_regexp_exec
These functions allocate memory with 'scm_must_malloc' and
'scm_must_realloc', then free the memory with a call to
'scm_must_free'. The call to 'scm_must_free' is not made
by the garbage collector.
While 'scm_must_malloc' and 'scm_must_realloc' increment
'scm_mallocated', 'scm_must_free' does not decrement
'scm_mallocated'.
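The effect of this mismatch can be reproduced with a toy allocator
pair. The functions below only mimic the behaviour described above;
they are hypothetical stand-ins, not the libguile functions, and the
size-aware free is just one possible way to keep the counter balanced.

```c
#include <stdlib.h>

static unsigned long acct_bytes = 0;  /* toy stand-in for 'scm_mallocated' */

/* Toy allocator that counts every byte it hands out. */
void *toy_must_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        acct_bytes += size;
    return p;
}

/* Toy free that releases the memory but leaves the counter alone. */
void toy_must_free(void *p)
{
    free(p);
}

/* Hypothetical balanced free: the caller passes the block size so the
   counter can be decremented. */
void toy_must_free_sized(void *p, size_t size)
{
    free(p);
    acct_bytes -= size;
}

/* Allocate and immediately release `n` buffers of `size` bytes.
   Returns how far the counter drifted, although no memory is held. */
unsigned long toy_churn(unsigned long n, size_t size, int balanced)
{
    unsigned long before = acct_bytes;
    unsigned long i;
    for (i = 0; i < n; i++) {
        void *p = toy_must_malloc(size);
        if (p == NULL)
            break;
        if (balanced)
            toy_must_free_sized(p, size);
        else
            toy_must_free(p);
    }
    return acct_bytes - before;
}
```

With the unbalanced free, 1000 rounds of 256 bytes leave the counter
256000 bytes too high even though nothing is still allocated; with the
size-aware free the drift is 0.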
I have attached some Scheme code which demonstrates how certain
Guile built-ins increment but do not decrement 'scm_mallocated'.
Thanks,
Michael Wells
;---snip here
(use-modules (ice-9 format))
(define (tester how-many-times thunk)
  ; force garbage collection
  (gc)
  (gc)
  (gc)
  ; get initial value of 'scm_mallocated'
  ; (available under the 'bytes-malloced key in the result of (gc-stats))
  (let ((initial-bytes-malloced (cdr (assoc 'bytes-malloced (gc-stats))))
        (starting-seconds (tms:clock (times)))) ;(current-time)))
    ; run 'thunk' 'how-many-times', reporting elapsed clock units
    ; after every 1000 calls
    (do ((x 0 (+ x 1)))
        ((= x how-many-times) #t)
      (cond ((= (modulo x 1000) 0)
             (let ((now (tms:clock (times))))
               (display (format " ~A : ~A clock units~%" x (- now starting-seconds)))
               (set! starting-seconds now))))
      (thunk))
    ; force garbage collection
    (gc)
    (gc)
    (gc)
    (let ((final-bytes-malloced (cdr (assoc 'bytes-malloced (gc-stats)))))
      (display
       (format " final-bytes-malloced: ~A~%" final-bytes-malloced))
      (display
       (format "initial-bytes-malloced: ~A~%" initial-bytes-malloced))
      (display
       (format " final - initial: ~A~%" (- final-bytes-malloced
                                            initial-bytes-malloced))))))
; libguile/posix.c -- "gethostname"
(tester 10000 (lambda () (gethostname)))
;
; Performance suffers when 'scm_mallocated' is very large.
; Note how quickly each 1000 calls to '(gethostname)'
; run at first; compare with how long each 1000 calls take
; near the end of the run.
; (For me, performance grinds to a halt after the
; first 14779000 or so iterations of (gethostname).)
;(tester 20000000 (lambda () (gethostname)))
; fports.c -- "fport_close"
; ports.c -- "scm_remove_from_port_table"
(tester 1000
        (lambda ()
          (with-output-to-string (lambda ()
                                   (display "1234567890")
                                   (display "1234567890")))))
; libguile/regex-posix.c -- "scm_regexp_exec"
(define *token* (make-regexp "^[A-Za-z0-9_]*" ))
(tester 1000 (lambda () (regexp-exec *token* "foo")))
; libguile/filesys.c "getcwd"
(tester 1000 (lambda () (getcwd)))