From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ihor Radchenko
Newsgroups: gmane.emacs.devel
Subject: Re: Indentation and gc
Date: Sun, 12 Mar 2023 14:50:33 +0000
Message-ID: <87y1o2t45i.fsf@localhost>
References: <20230310110747.4hytasakomvdyf7i.ref@Ergus> <20230310110747.4hytasakomvdyf7i@Ergus> <87a60k657y.fsf@web.de> <838rg4zmg9.fsf@gnu.org> <87ttys4dge.fsf@web.de> <83sfebyepp.fsf@gnu.org> <87ttyru4zt.fsf@web.de> <83fsabyb41.fsf@gnu.org> <87mt4jtpqf.fsf@web.de> <83ilf7wi48.fsf@gnu.org> <878rg3wh2f.fsf@localhost> <87a60jtg0z.fsf@web.de> <877cvmumjq.fsf@localhost> <83356aukkh.fsf@gnu.org>
To: Eli Zaretskii
Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
In-Reply-To: <83356aukkh.fsf@gnu.org>

Eli Zaretskii writes:

>> Well. I do realize that there should be a limit, which is why I put it
>> as 100Mb.
>
> Yes, but what is that 100 MiB number based on? any measurements of the
> time it takes to run GC behind that number? or just a more-or-less
> arbitrary value that "seems right"?
That's what I tried to explain below. In the end, I used the Emacs Lisp
object usage divided by 10 and rounded down to hundreds. I did not try
to be precise - just accurate to orders of magnitude.

>> Strictly speaking, GC pauses scale with heap size.
>
> If by "heap size" you mean the total size of heap-allocated memory in
> the Emacs process, then this is inaccurate. GC traverses only the
> Lisp objects, whereas Emacs also allocates memory from the heap for
> other purposes. It also allocates memory from the VM outside of the
> "normal" heap -- that's where the buffer text memory usually comes
> from, as well as any large enough chunk of memory Emacs needs.

Thanks for the clarification.

>> Increasing GC threshold
>> will have two effects on the heap size: (1) thresholds larger than normal
>> heap size will dominate the GC time - Emacs will need to traverse all
>> the newly added data to be GCed;
>
> You seem to assume that GC traverses only the Lisp objects
> newly-allocated since the previous GC. This is incorrect: it
> traverses _all_ of the Lisp objects, both old and new.

No, I am aware that GC traverses all the Lisp objects. That's why I
said that a large threshold only increases GC time significantly when
the threshold is comparable to the heap size (the part of it containing
Lisp objects). Otherwise, the heap size mostly determines how long it
takes to complete a single GC.

>> (2) too large thresholds will cause heap fragmentation, also
>> increasing the GC times as the heap will expand.
>
> Not sure why do you think heap fragmentation increases monotonically
> with larger thresholds. Maybe you should explain what you consider
> "heap fragmentation" for the purposes of this discussion.

See my other reply with my measurements of memory-limit vs.
gc-cons-threshold. I assume that this scaling will not be drastically
different for other users, though we can ask others to repeat my
measurements.
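For anyone who wants to repeat the memory-limit vs. gc-cons-threshold
comparison, something like the following sketch should do (hypothetical
helper, not from the thread; the function name and the synthetic
consing workload are my own, and I bind gc-cons-percentage to 0 so that
the threshold, not the percentage, governs when GC runs):

```elisp
;; Hypothetical measurement helper - a sketch, not an exact replication
;; of the measurements discussed in this thread.
(defun my/gc-measure (threshold)
  "Cons ~200Mb of throw-away data under THRESHOLD; report GC stats.
Return a list (MEMORY-LIMIT GCS-DONE GC-ELAPSED), where MEMORY-LIMIT
is Emacs memory usage in kb, and the other two count only the GCs
performed during this run."
  (let ((gc-cons-threshold threshold)
        ;; Keep gc-cons-percentage from dominating the threshold.
        (gc-cons-percentage 0)
        (gcs-before gcs-done)
        (time-before gc-elapsed))
    (dotimes (_ 2000)
      ;; 100kb throw-away string per iteration.
      (ignore (make-string 100000 ?x)))
    (garbage-collect)
    (list (memory-limit)
          (- gcs-done gcs-before)
          (- gc-elapsed time-before))))

;; Compare in a fresh `emacs -Q', e.g.:
;; (my/gc-measure 800000)     ; default 800kb threshold
;; (my/gc-measure 250000000)  ; 250Mb threshold
```

Running each form once per session would avoid earlier runs' heap
growth skewing the later numbers.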
>> I think that (2) is the most important factor for real world scenarios
>
> Not sure why you think so. Maybe because I don't have a clear idea
> what kind of fragmentation you have in mind here.

I meant that as long as gc-cons-threshold is much lower (10x or so)
than the heap size (the Lisp object part), we do not need to worry
about (1). Only (2) remains a concern.

>> Emacs' default gives some lower safe bound on the threshold - it is
>> `gc-cons-percentage', defaulting to 1% of the heap size.
>
> Actually, the default value of gc-cons-percentage is 0.1, i.e. 10%.
> And it's 10% of the sum total of all live Lisp objects plus the number
> of bytes allocated for Lisp objects since the last GC. Not 10% of the
> heap size.

Interesting. I thought that the value was expressed in percent. Then I
have to mention that I intentionally reduced gc-cons-percentage in my
testing, which I detailed in my other message. With the Emacs defaults
(gc-cons-percentage = 0.1), I get:

memory-limit  gcs-done  gc-elapsed
526852        103       4.684100536

This is equivalent to a gc-cons-threshold between 4Mb and 8Mb.

10% also means that the 800kb gc-cons-threshold does not matter much
even with emacs -Q -- it uses over 8Mb of memory, and thus
gc-cons-percentage should dominate the GC, AFAIU.

Note that my proposed 100Mb gc-cons-threshold limit will correspond to
1Gb of live Lisp objects. For reference, this is what I have now (I got
the data using the memory-usage package):

Total in lisp objects: 1.33GB (live 1.18GB, dead 157MB)

Even if Emacs uses several hundred Mbs of Lisp objects (a typical
scenario with third-party packages), my suggested gc-cons-threshold
does not look too risky, yet it reduces GC while loading init.el (when
the heap size is still small).

> How large is what you call "heap size" in your production session, may
> I ask?

See the above.

>> AFAIU, routine throw-away memory allocation in Emacs is not directly
>> correlated with the memory usage - it rather depends on the usage
>> patterns and the packages being used. For example, it takes about 10
>> complex helm searches for me to trigger my 250Mb threshold - 25Mb per
>> helm command.
>
> This calculation is only valid if each of these 10 commands conses
> approximately the same amount of Lisp data. If that is not so, you
> cannot really divide 250 MiB by 10 and claim that each command used up
> that much Lisp memory. That's because GC is _not_ triggered as soon
> as Emacs crosses the threshold, it is triggered when Emacs _checks_
> how much was consed since last GC and discovers it consed more than
> the threshold. The trigger for testing is unrelated to crossing the
> threshold.

Sure. I ran exactly the same command repeatedly, just to get an idea
about what is possible. Do not try to interpret my results as precise -
they are only there to give some idea of the orders of magnitude of the
allocated memory.

>> To get some idea about the impact of gc-cons-threshold on memory
>> fragmentation, I compared the output of `memory-limit' with 250Mb vs.
>> default 800kb threshold:
>>
>> 250Mb threshold - 689520 kb memory
>> 800kb threshold - 531548 kb memory
>>
>> The memory usage is clearly increased, but not catastrophically,
>> despite using a rather large threshold.
>>
>> Of course, it is just init.el, which is loaded once.
>
> Correction: it is _your_ init.el. We need similar statistics from
> many users and many different usage patterns; only then we will be
> able to draw valid conclusions.

Sure. Should we formally try to call for such benchmarks?

>> Memory fragmentation as a result of routine Emacs usage may cause
>> more significant memory usage increase.
>
> Actually, Emacs tries very hard to avoid fragmentation. That's why it
> compacts buffers, and that's why it can relocate buffer text and
> string data.

Indeed. But despite all of the best efforts, fragmentation increases if
we delay GCs, right?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at , or support my work at