From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: GC
Date: Tue, 28 Jun 2005 00:55:00 -0400
Message-ID: <877jgfyu5t.fsf-monnier+emacs@gnu.org>
To: Eli Zaretskii
Cc: Juri Linkov, emacs-devel@gnu.org
In-Reply-To: (Eli Zaretskii's message of "Sat, 25 Jun 2005 11:48:54 +0200")

> think, but then again it might not: IIRC, Emacs always does a GC
> before it asks the OS for more heap.  So you might see the message

I don't think this is true.  GC should only ever be called from eval or
funcall.  Several parts of the code assume that Fcons cannot call the
GC, for example.
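
To make the relation between consing and GC frequency concrete, here is a
rough sketch in Emacs Lisp (not taken from the Emacs sources; the helper
my-count-gcs-while-consing is made up for illustration): it counts how many
GCs happen while allocating a fixed number of cons cells under a given
gc-cons-threshold.  The cells are allocated by Fcons, but the collections
themselves only get triggered from the eval/funcall machinery once the
consing counter crosses the threshold.

    (defun my-count-gcs-while-consing (threshold n)
      "Count the GCs that happen while consing N cells with THRESHOLD."
      (let ((gc-cons-threshold threshold)
            (before gcs-done))          ; `gcs-done' counts GCs so far
        (dotimes (i n)
          (cons nil nil))               ; allocation happens in Fcons
        (- gcs-done before)))

    ;; With the same amount of consing, halving the threshold should
    ;; roughly double the count:
    ;;   (my-count-gcs-while-consing 400000 1000000)
    ;;   (my-count-gcs-while-consing 200000 1000000)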
In any case, here is my understanding of the situation:

The time taken by a single GC is roughly proportional to the total
(live+dead) heap size: the mark phase is proportional to the live-data
size, but the subsequent sweep phase is proportional to the total heap
size.  Total heap size at the time of a GC is roughly equal to
live-data + gc-cons-threshold + unused-allocated.  The unused-allocated
part of the memory is basically due to fragmentation.  The frequency of
GC is inversely proportional to gc-cons-threshold.  So the total portion
of time used up by GC (the GC-overhead) is basically proportional to:

   live-data + gc-cons-threshold + fragmentation
   ---------------------------------------------
                 gc-cons-threshold

With a fixed gc-cons-threshold (as we have now), this boils down to

   live-data + fragmentation
   ------------------------- + 1
        gc-cons-threshold

So the GC-overhead currently grows with the live-data and with the
fragmentation.  This is one of the reasons why, with a large heap, Emacs
tends to slow down.

Looking at the above equation one might think "let's bump
gc-cons-threshold way up" to make GC much cheaper.  The problem with
that is that it tends to increase fragmentation by delaying the
reclaiming of memory (most serious studies of memory fragmentation with
non-moving memory allocators indicate that an important factor in
reducing fragmentation is prompt reclamation of memory).

Making gc-cons-threshold proportional to the installed RAM sounds like a
bad idea to me: it's bound to be too small for some cases and much too
large for others.

The normal way to keep GC-overhead under control is to grow
gc-cons-threshold together with the heap size, such that the GC-overhead
stays constant (by making GCs less frequent when they get more
time-consuming).  Of course this may not always be best, because by
growing gc-cons-threshold we may increase fragmentation, but "the best"
is simply not doable (not with a simple mark&sweep anyway).

I had already suggested a change to grow gc-cons-threshold as the heap
grows (a long time ago), and I see that XEmacs's gc-cons-percentage is a
clean interface to such a feature.  I think we should introduce this
variable and give it a good non-zero default value.


        Stefan
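
For concreteness, here is a rough sketch of what the semantics of such a
variable could look like, modeled on XEmacs's gc-cons-percentage (the
names my-gc-cons-percentage and my-effective-gc-threshold are made up for
illustration, and the real check would of course be done in the C
allocator, not in Lisp): the consing budget before the next GC becomes
the larger of gc-cons-threshold and a fixed fraction of the current heap
size, so GCs get less frequent exactly when they get more expensive and
the overhead ratio above stays roughly constant.

    (defvar my-gc-cons-percentage 0.1
      "Hypothetical fraction of the heap that may be consed between GCs.")

    (defun my-effective-gc-threshold (heap-size)
      "Return the consing budget before the next GC, given HEAP-SIZE bytes."
      (max gc-cons-threshold
           (floor (* my-gc-cons-percentage heap-size))))

    ;; For a 10MB heap this gives a budget of about 1MB, well above the
    ;; current fixed default, and it keeps growing with the heap:
    ;;   (my-effective-gc-threshold (* 10 1024 1024))  => 1048576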