unofficial mirror of emacs-devel@gnu.org 
* Indentation and gc
       [not found] <20230310110747.4hytasakomvdyf7i.ref@Ergus>
@ 2023-03-10 11:07 ` Ergus
  2023-03-10 14:36   ` Dr. Arne Babenhauserheide
                     ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Ergus @ 2023-03-10 11:07 UTC (permalink / raw)
  To: emacs-devel@gnu.org

Hi:

Just today I enabled the garbage-collection-messages and I found that
indenting the buffer with `C-x h <tab>` in just ~150 C++ lines I get the
garbage-collection message printed about 4 or 5 times before the
indentation finishes.

So, two questions:

1) Is this intended? If so, what's the reason? Is the indentation code
forcing GC, or is it generating too much garbage?

2) If it doesn't impact performance... Is it possible somehow to improve
the GC message to carry more useful information, or at least to control
the no-log behavior so that some output lands in the *Messages* buffer?

Best,
Ergus



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-10 11:07 ` Indentation and gc Ergus
@ 2023-03-10 14:36   ` Dr. Arne Babenhauserheide
  2023-03-10 14:54     ` Eli Zaretskii
  2023-03-10 14:52   ` Eli Zaretskii
  2023-03-21  7:11   ` Jean Louis
  2 siblings, 1 reply; 99+ messages in thread
From: Dr. Arne Babenhauserheide @ 2023-03-10 14:36 UTC (permalink / raw)
  To: Ergus; +Cc: emacs-devel



Ergus <spacibba@aol.com> writes:

> Just today I enabled the garbage-collection-messages and I found that
> indenting the buffer with `C-x h <tab>` in just ~150 C++ lines I get the
> garbage-collection message printed about 4 or 5 times before the
> indentation finishes.


The gc-cons-threshold is very small when looking at modern workloads.

I have these set in my .emacs.d/init.el

;; at the start:

;; Make startup faster by reducing the frequency of garbage
;; collection.  The default is 800 kilobytes.  Measured in bytes.
(setq gc-cons-threshold (* 100 1024 1024))


;; at the end:

;; Make GC pauses shorter by lowering the threshold again (from the raised startup value).
(setq gc-cons-threshold (* 20 1024 1024))
;; original value: 800 000

;; speed up reading from external processes
(setq read-process-output-max (* 1024 1024)) ;; 1 MiB


Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de


* Re: Indentation and gc
  2023-03-10 11:07 ` Indentation and gc Ergus
  2023-03-10 14:36   ` Dr. Arne Babenhauserheide
@ 2023-03-10 14:52   ` Eli Zaretskii
  2023-03-10 21:30     ` Ergus
  2023-03-21  7:11   ` Jean Louis
  2 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-10 14:52 UTC (permalink / raw)
  To: Ergus; +Cc: emacs-devel

> Date: Fri, 10 Mar 2023 12:07:47 +0100
> From: Ergus <spacibba@aol.com>
> 
> Just today I enabled the garbage-collection-messages and I found that
> indenting the buffer with `C-x h <tab>` in just ~150 C++ lines I get the
> garbage-collection message printed about 4 or 5 times before the
> indentation finishes.
> 
> So, two questions:
> 
> 1) Is this intended? if so, what's the reason? the indentation code is
> forcing gc or is it generating too much garbage?

The latter.

> 2) IF it doesn't impact performance... Is it possible somehow to improve
> the gc message to have more useful information; or at least; to control
> the no-log in order to have some outputs in the *Message* buffer?

How will a more detailed GC message help?




* Re: Indentation and gc
  2023-03-10 14:36   ` Dr. Arne Babenhauserheide
@ 2023-03-10 14:54     ` Eli Zaretskii
  2023-03-10 19:23       ` Dr. Arne Babenhauserheide
  2023-03-11 10:54       ` Ihor Radchenko
  0 siblings, 2 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-10 14:54 UTC (permalink / raw)
  To: Dr. Arne Babenhauserheide; +Cc: spacibba, emacs-devel

> From: "Dr. Arne Babenhauserheide" <arne_bab@web.de>
> Cc: emacs-devel@gnu.org
> Date: Fri, 10 Mar 2023 15:36:06 +0100
> 
> I have these set in my .emacs.d/init.el
> 
> ;; at the start:
> 
> ;; Make startup faster by reducing the frequency of garbage
> ;; collection.  The default is 800 kilobytes.  Measured in bytes.
> (setq gc-cons-threshold (* 100 1024 1024))
> 
> 
> ;; at the end:
> 
> ;; Make gc pauses faster by decreasing the threshold again (from the increased initial).
> (setq gc-cons-threshold (* 20 1024 1024))
> ;; original value: 800 000
> 
> ;; speed up reading from external processes
> (setq read-process-output-max (* 1024 1024)) ;; 1mb

This can only be done around specific portions of code known in
advance to be long and GC-intensive.  I don't think this kind of
technique can be used in the situation described by the OP.
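
[Editor's sketch] For the case that is known in advance, a let-binding
keeps the raised threshold scoped to one piece of code.  The macro name
`my/with-raised-gc-threshold' and the 16 MiB figure are assumptions made
for illustration only; neither comes from this thread.

```elisp
;; Hypothetical helper, not part of Emacs.
(defmacro my/with-raised-gc-threshold (bytes &rest body)
  "Evaluate BODY with `gc-cons-threshold' temporarily at least BYTES."
  (declare (indent 1) (debug t))
  ;; `gc-cons-threshold' is a special variable, so this binding is
  ;; dynamic and is undone automatically when BODY exits.
  `(let ((gc-cons-threshold (max gc-cons-threshold ,bytes)))
     ,@body))

;; Usage with a stand-in workload that allocates many small strings.
(my/with-raised-gc-threshold (* 16 1024 1024)
  (let (acc)
    (dotimes (i 100000)
      (push (number-to-string i) acc))
    (length acc)))
```

Because the old value is restored on exit, the rest of the session keeps
whatever threshold the user configured.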




* Re: Indentation and gc
  2023-03-10 14:54     ` Eli Zaretskii
@ 2023-03-10 19:23       ` Dr. Arne Babenhauserheide
  2023-03-11  6:38         ` Eli Zaretskii
  2023-03-11 10:54       ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Dr. Arne Babenhauserheide @ 2023-03-10 19:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, emacs-devel



Eli Zaretskii <eliz@gnu.org> writes:

>> ;; Make gc pauses faster by decreasing the threshold again (from the increased initial).
>> (setq gc-cons-threshold (* 20 1024 1024))
>> ;; original value: 800 000
>> 
>> ;; speed up reading from external processes
>> (setq read-process-output-max (* 1024 1024)) ;; 1mb
>
> This can only be done around specific portions of code known in
> advance to be long and GC-intensive.  I don't think this kind of
> technique can be used in the situation described by the OP.

This is at the end of my init file, so it stays in effect: my Emacs
simply has a ~25x higher GC threshold than normal and allows more
caching of process output.

That helps a lot with lsp (language servers).

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de


* Re: Indentation and gc
  2023-03-10 14:52   ` Eli Zaretskii
@ 2023-03-10 21:30     ` Ergus
  2023-03-11  6:52       ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ergus @ 2023-03-10 21:30 UTC (permalink / raw)
  To: emacs-devel, Eli Zaretskii



On March 10, 2023 3:52:29 PM GMT+01:00, Eli Zaretskii <eliz@gnu.org> wrote:
>> Date: Fri, 10 Mar 2023 12:07:47 +0100
>> From: Ergus <spacibba@aol.com>
>> 
>> Just today I enabled the garbage-collection-messages and I found that
>> indenting the buffer with `C-x h <tab>` in just ~150 C++ lines I get the
>> garbage-collection message printed about 4 or 5 times before the
>> indentation finishes.
>> 
>> So, two questions:
>> 
>> 1) Is this intended? if so, what's the reason? the indentation code is
>> forcing gc or is it generating too much garbage?
>
>The latter.
>
>> 2) IF it doesn't impact performance... Is it possible somehow to improve
>> the gc message to have more useful information; or at least; to control
>> the no-log in order to have some outputs in the *Message* buffer?
>
>How will a more detailed GC message help?
>

Hi,

I explained it badly. These were two almost independent questions. A
more detailed GC message and a logging alternative would both be useful
for debugging, because (for example) I don't know exactly how many GCs
were executed during the indentation.

So they are related but not directly.
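
[Editor's sketch] The number of collections during a command can already
be read from the built-in variables `gcs-done' and `gc-elapsed'.  The
macro name `my/with-gc-report' is hypothetical; unlike the echo-area GC
notice, its output goes through `message' and so is logged in
*Messages*.

```elisp
;; Hypothetical helper, not part of Emacs.
(defmacro my/with-gc-report (&rest body)
  "Evaluate BODY, then log the GC count and GC time it caused."
  (declare (indent 0) (debug t))
  (let ((gcs (make-symbol "gcs"))
        (elapsed (make-symbol "elapsed")))
    `(let ((,gcs gcs-done)          ; collections so far in this session
           (,elapsed gc-elapsed))   ; seconds spent in GC so far
       (prog1 (progn ,@body)
         (message "GCs: %d, GC time: %.3fs"
                  (- gcs-done ,gcs)
                  (- gc-elapsed ,elapsed))))))

;; Usage: roughly what `C-x h <tab>' does in the current buffer.
(my/with-gc-report
  (indent-region (point-min) (point-max)))
```

The `benchmark-run' macro reports the same two figures (plus total
elapsed time) without any custom code.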


* Re: Indentation and gc
  2023-03-10 19:23       ` Dr. Arne Babenhauserheide
@ 2023-03-11  6:38         ` Eli Zaretskii
  2023-03-11  6:55           ` Dr. Arne Babenhauserheide
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11  6:38 UTC (permalink / raw)
  To: Dr. Arne Babenhauserheide; +Cc: spacibba, emacs-devel

> From: "Dr. Arne Babenhauserheide" <arne_bab@web.de>
> Cc: spacibba@aol.com, emacs-devel@gnu.org
> Date: Fri, 10 Mar 2023 20:23:18 +0100
> 
> > This can only be done around specific portions of code known in
> > advance to be long and GC-intensive.  I don't think this kind of
> > technique can be used in the situation described by the OP.
> 
> This is at the end: My emacs simply has a ~25x higher gc threshold than
> normal and allows more caching of process output.
> 
> That helps a lot with lsp (language servers).

The enlarged threshold should be carefully tuned to the user's Emacs
usage patterns and to the amount of available virtual memory, to avoid
applying too much memory pressure on the system, which could
potentially lead to OOM killer doing its gruesome job.

So, instead of advising random users to raise the GC threshold to
levels that are (perhaps) suitable for your configuration and usage
patterns, we should IMO teach them how to tune the threshold to
theirs, and leave the setting to them.




* Re: Indentation and gc
  2023-03-10 21:30     ` Ergus
@ 2023-03-11  6:52       ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11  6:52 UTC (permalink / raw)
  To: Ergus; +Cc: emacs-devel

> Date: Fri, 10 Mar 2023 22:30:42 +0100
> From: Ergus <spacibba@aol.com>
> 
> >> 2) IF it doesn't impact performance... Is it possible somehow to improve
> >> the gc message to have more useful information; or at least; to control
> >> the no-log in order to have some outputs in the *Message* buffer?
> >
> >How will a more detailed GC message help?
> >
> 
> Hi,
> 
> I explained wrongly. These were somehow two almost independent questions. The more detailed GC and having a logging alternative are useful for debugging purposes. Because (for example) I don't know exactly how many GC were executed during the indentation.
> 
> So they are related but not directly.

I think I understood, but my question still stands: how will a more
detailed GC report help you in what you are trying to do?

You can see the kind of data GC returns in the doc string of
garbage-collect, and you can see that in action if you invoke
"M-x garbage-collect RET" by hand.  Please tell how this data will
help you do what you want to do.
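
[Editor's sketch] The data in question is a list of entries of the form
(NAME SIZE USED [FREE]).  A minimal way to turn it into bytes in use per
object type, assuming that layout; the function name `my/gc-report' is
hypothetical:

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my/gc-report ()
  "Run a garbage collection and return an alist of (TYPE . BYTES-IN-USE)."
  (mapcar (lambda (entry)
            ;; ENTRY is (NAME SIZE USED [FREE]); SIZE is bytes per object.
            (cons (nth 0 entry)
                  (* (nth 1 entry) (nth 2 entry))))
          (garbage-collect)))

;; Usage: total bytes held by live Lisp objects after a collection.
(apply #'+ (mapcar #'cdr (my/gc-report)))
```

Note that calling this forces a full collection, so it measures the heap
rather than the GC activity of a particular command.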




* Re: Indentation and gc
  2023-03-11  6:38         ` Eli Zaretskii
@ 2023-03-11  6:55           ` Dr. Arne Babenhauserheide
  2023-03-11  7:56             ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Dr. Arne Babenhauserheide @ 2023-03-11  6:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, emacs-devel



Eli Zaretskii <eliz@gnu.org> writes:

>> From: "Dr. Arne Babenhauserheide" <arne_bab@web.de>
>> Cc: spacibba@aol.com, emacs-devel@gnu.org
>> Date: Fri, 10 Mar 2023 20:23:18 +0100
>> 
>> > This can only be done around specific portions of code known in
>> > advance to be long and GC-intensive.  I don't think this kind of
>> > technique can be used in the situation described by the OP.
>> 
>> This is at the end: My emacs simply has a ~25x higher gc threshold than
>> normal and allows more caching of process output.
>> 
>> That helps a lot with lsp (language servers).
>
> The enlarged threshold should be carefully tuned to the user's Emacs
> usage patterns and to the amount of available virtual memory, to avoid
> applying too much memory pressure on the system, which could
> potentially lead to OOM killer doing its gruesome job.
>
> So, instead of advising random users to raise the GC threshold to
> levels that are (perhaps) suitable for your configuration and usage
> patterns, we should IMO teach them how to tune the threshold to
> theirs, and leave the setting to them.

That’s true …

Do we have a process to re-evaluate the current settings? Emacs modes
have been getting more complex in the past decades — also because modern
CPUs can execute them — so how can we see whether 800k are still the
right setting for gc-cons-threshold?

Another question would be whether enlarging the gc cons threshold
during reading of the site and init file could be a default. It improves
startup times a lot.

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de


* Re: Indentation and gc
  2023-03-11  6:55           ` Dr. Arne Babenhauserheide
@ 2023-03-11  7:56             ` Eli Zaretskii
  2023-03-11 12:34               ` Dr. Arne Babenhauserheide
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11  7:56 UTC (permalink / raw)
  To: Dr. Arne Babenhauserheide; +Cc: spacibba, emacs-devel

> From: "Dr. Arne Babenhauserheide" <arne_bab@web.de>
> Cc: spacibba@aol.com, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 07:55:32 +0100
> 
> Do we have a process to re-evaluate the current settings? Emacs modes
> have been getting more complex in the past decades — also because modern
> CPUs can execute them — so how can we see whether 800k are still the
> right setting for gc-cons-threshold?

The "right setting" is very much dependent on the setup and resources
on each concrete system.  And people tend to run Emacs on some very
old systems, as well as on some very new ones.  Not sure how to
evaluate the default setting in these circumstances.

Perhaps the only thing we could do is enlarge the value slightly in
Emacs 30.

And please keep in mind that the value of the threshold does NOT
guarantee that Emacs will call GC as soon as it has generated Lisp
objects which take that much memory.  The test against the threshold is
done only when Emacs decides it might be a good time to do a GC, so you
could have a Lisp program that runs for a prolonged time and generates
many more objects than it takes to reach the threshold, before GC
happens.  Such programs are not very probable, but they are
definitely possible.  E.g., a single make-string call can generate a
very large string, which takes much more memory than the threshold,
without calling GC.

So raising the threshold indirectly raises the probability of having
your system run out of memory, even if the threshold value is way
below the amount of VM you have.

> Another question would be whether enlarging the gc cons threshold
> during reading of the site and init file could be a default. It improves
> startup times a lot.

See above: to what value will you enlarge it so that it's still safe?
The Emacs startup typically does a lot of non-trivial stuff, so could
consume large quantities of memory.




* Re: Indentation and gc
  2023-03-10 14:54     ` Eli Zaretskii
  2023-03-10 19:23       ` Dr. Arne Babenhauserheide
@ 2023-03-11 10:54       ` Ihor Radchenko
  2023-03-11 11:17         ` Ergus
  2023-03-11 12:37         ` Eli Zaretskii
  1 sibling, 2 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 10:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dr. Arne Babenhauserheide, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> ;; Make gc pauses faster by decreasing the threshold again (from the increased initial).
>> (setq gc-cons-threshold (* 20 1024 1024))
>> ;; original value: 800 000
>> 
>> ;; speed up reading from external processes
>> (setq read-process-output-max (* 1024 1024)) ;; 1mb
>
> This can only be done around specific portions of code known in
> advance to be long and GC-intensive.  I don't think this kind of
> technique can be used in the situation described by the OP.

Could it be done when loading init.el and early-init.el?
Init files are commonly known to be resource-intensive. They also tend
to trigger GC more because the heap size is not yet very large and thus
`gc-cons-percentage' does not yet take precedence over `gc-cons-threshold'.

As for "known in advance", could Emacs keep track of how many GCs are
triggered by user commands and then adjust GC dynamically?
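
[Editor's sketch] The interplay between `gc-cons-threshold' and
`gc-cons-percentage' mentioned above can be approximated as "whichever
allows more consing wins".  This is a simplified model, not the exact
computation Emacs performs internally; the function name and the
HEAP-BYTES argument are hypothetical.

```elisp
;; Hypothetical helper, not part of Emacs.
(defun my/effective-gc-threshold (heap-bytes)
  "Estimate how many bytes may be consed before GC, given HEAP-BYTES in use."
  (max gc-cons-threshold
       (if (floatp gc-cons-percentage)
           (floor (* gc-cons-percentage heap-bytes))
         0)))

;; With a threshold of 800 000 bytes and a percentage of 0.1, a 4 MB
;; heap is governed by the threshold and a 100 MB heap by the percentage.
(let ((gc-cons-threshold 800000)
      (gc-cons-percentage 0.1))
  (list (my/effective-gc-threshold (* 4 1000 1000))
        (my/effective-gc-threshold (* 100 1000 1000))))
;; => (800000 10000000)
```

Under this model the percentage only takes over once the heap exceeds
8 MB, which is why a fresh session is dominated by the fixed threshold.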

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 10:54       ` Ihor Radchenko
@ 2023-03-11 11:17         ` Ergus
  2023-03-11 11:23           ` Ihor Radchenko
  2023-03-11 12:31           ` Eli Zaretskii
  2023-03-11 12:37         ` Eli Zaretskii
  1 sibling, 2 replies; 99+ messages in thread
From: Ergus @ 2023-03-11 11:17 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, Dr. Arne Babenhauserheide, emacs-devel

On Sat, Mar 11, 2023 at 10:54:24AM +0000, Ihor Radchenko wrote:
>Eli Zaretskii <eliz@gnu.org> writes:
>
>>> ;; Make gc pauses faster by decreasing the threshold again (from the increased initial).
>>> (setq gc-cons-threshold (* 20 1024 1024))
>>> ;; original value: 800 000
>>>
>>> ;; speed up reading from external processes
>>> (setq read-process-output-max (* 1024 1024)) ;; 1mb
>>
>> This can only be done around specific portions of code known in
>> advance to be long and GC-intensive.  I don't think this kind of
>> technique can be used in the situation described by the OP.
>
>May it be done when loading init.el and early-init.el?
>Init files are commonly known to be resource-intensive. They also tend
>to trigger GC more because the heap size is not yet very large and thus
>`gc-cons-percentage' does not yet take precedence over `gc-cons-threshold'.
>
>As for "known in advance", may Emacs keep track of how many GCs are
>triggered by user commands and then adjust GC dynamically?
>

Hi Ihor:

I have had this in my early-init for a few years now.


```
(defconst my/file-name-handler-alist file-name-handler-alist)
(defconst my/gc-cons-threshold (* 2 gc-cons-threshold))

(setq-default file-name-handler-alist nil
               gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
               gc-cons-percentage 1.0
               message-log-max 16384)

(add-hook 'window-setup-hook
           (lambda ()
             (setq file-name-handler-alist my/file-name-handler-alist
                   gc-cons-threshold my/gc-cons-threshold
                   gc-cons-percentage 0.1)
             (let ((curtime (current-time)))
               (message "Times: init:%.06f total:%.06f gc-done:%d"
                        (float-time (time-subtract after-init-time before-init-time))
                        (float-time (time-subtract curtime before-init-time))
                        gcs-done)))
           90)
```

This reduced my load time by 50% on GNU/Linux (a bit more, in fact)... On
MS-Windows I am still above 7 seconds with exactly the same config (so
more than 14x slower than GNU/Linux... but I guess the problem is maybe
outside GNU/Linux... antivirus and MS process creation slowness.)


>-- 
>Ihor Radchenko // yantar92,
>Org mode contributor,
>Learn more about Org mode at <https://orgmode.org/>.
>Support Org development at <https://liberapay.com/org-mode>,
>or support my work at <https://liberapay.com/yantar92>
>




* Re: Indentation and gc
  2023-03-11 11:17         ` Ergus
@ 2023-03-11 11:23           ` Ihor Radchenko
  2023-03-11 12:31           ` Eli Zaretskii
  1 sibling, 0 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 11:23 UTC (permalink / raw)
  To: Ergus; +Cc: Eli Zaretskii, Dr. Arne Babenhauserheide, emacs-devel

Ergus <spacibba@aol.com> writes:

> (setq-default file-name-handler-alist nil
>                gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
>                gc-cons-percentage 1.0
>                message-log-max 16384)

Note that `gc-cons-percentage' 1.0 is the new default in Emacs 30.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 11:17         ` Ergus
  2023-03-11 11:23           ` Ihor Radchenko
@ 2023-03-11 12:31           ` Eli Zaretskii
  2023-03-11 12:39             ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 12:31 UTC (permalink / raw)
  To: Ergus; +Cc: yantar92, arne_bab, emacs-devel

> Date: Sat, 11 Mar 2023 12:17:30 +0100
> From: Ergus <spacibba@aol.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,
> 	"Dr. Arne Babenhauserheide" <arne_bab@web.de>, emacs-devel@gnu.org
> 
>                gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
                                   ^^^^^^^^^^^^^^^^^^^^
NEVER do that!




* Re: Indentation and gc
  2023-03-11  7:56             ` Eli Zaretskii
@ 2023-03-11 12:34               ` Dr. Arne Babenhauserheide
  2023-03-11 13:08                 ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Dr. Arne Babenhauserheide @ 2023-03-11 12:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, emacs-devel



Eli Zaretskii <eliz@gnu.org> writes:

>> From: "Dr. Arne Babenhauserheide" <arne_bab@web.de>
>> Cc: spacibba@aol.com, emacs-devel@gnu.org
>> Date: Sat, 11 Mar 2023 07:55:32 +0100
>> 
>> Do we have a process to re-evaluate the current settings? Emacs modes
>> have been getting more complex in the past decades — also because modern
>> CPUs can execute them — so how can we see whether 800k are still the
>> right setting for gc-cons-threshold?
>
> The "right setting" is very much dependent on the setup and resources
> on each concrete system.  And people tend to run Emacs on some very
> old systems, as well as on some very new ones.  Not sure how to
> evaluate the default setting in these circumstances.
>
> Perhaps the only thing we could do is enlarge the value slightly in
> Emacs 30.

That could already help, yes. Maybe not by a factor of 25 as I did
(that’s mostly for LSP work), but just adjusted to how much the lower
bound on systems that can run Emacs 30 has changed.

> So raising the threshold indirectly raises the probability of having
> your system run out of memory, even if the threshold value is way
> below the amount of VM you have.
> See above: to what value will you enlarge it so that it's still safe?
> The Emacs startup typically does a lot of non-trivial stuff, so could
> consume large quantities of memory.

With the main risk being that we could go OOM, could Emacs evaluate the
available memory on the system on systems that support that check?

If Emacs can give back memory to the OS (I expect that it can, but I am
not sure¹), then wrapping the init process into such a check by default
could resolve many startup time problems.

By reverting to a lower value after startup, it would avoid limiting
other processes on the system.

¹: Can Emacs give back memory to the OS?

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de


* Re: Indentation and gc
  2023-03-11 10:54       ` Ihor Radchenko
  2023-03-11 11:17         ` Ergus
@ 2023-03-11 12:37         ` Eli Zaretskii
  2023-03-11 13:10           ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 12:37 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: "Dr. Arne Babenhauserheide" <arne_bab@web.de>, spacibba@aol.com,
>  emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 10:54:24 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> ;; Make gc pauses faster by decreasing the threshold again (from the increased initial).
> >> (setq gc-cons-threshold (* 20 1024 1024))
> >> ;; original value: 800 000
> >> 
> >> ;; speed up reading from external processes
> >> (setq read-process-output-max (* 1024 1024)) ;; 1mb
> >
> > This can only be done around specific portions of code known in
> > advance to be long and GC-intensive.  I don't think this kind of
> > technique can be used in the situation described by the OP.
> 
> May it be done when loading init.el and early-init.el?

See my response, where I explain that it is not easy, AFAIU.

> As for "known in advance", may Emacs keep track of how many GCs are
> triggered by user commands and then adjust GC dynamically?

Adjust how?  If you mean enlarge under some conditions, then please
tell:

  . what are those conditions?
  . should the threshold also go down under some other conditions, and
    if so, how?
  . how to determine the ceiling for increasing the threshold?




* Re: Indentation and gc
  2023-03-11 12:31           ` Eli Zaretskii
@ 2023-03-11 12:39             ` Ihor Radchenko
  2023-03-11 12:40               ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 12:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ergus, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>>                gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
>                                    ^^^^^^^^^^^^^^^^^^^^
> NEVER do that!

That's what many people do and many people suggest.
And that's what Doom does.
https://github.com/doomemacs/doomemacs/blob/master/early-init.el#L29

That is not to say that it is safe, but it is commonly used in the wild
in the absence of a prescribed alternative.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 12:39             ` Ihor Radchenko
@ 2023-03-11 12:40               ` Eli Zaretskii
  2023-03-11 12:54                 ` Ihor Radchenko
  2023-03-11 13:00                 ` Po Lu
  0 siblings, 2 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 12:40 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Ergus <spacibba@aol.com>, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 12:39:04 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >>                gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
> >                                    ^^^^^^^^^^^^^^^^^^^^
> > NEVER do that!
> 
> That's what many people do and many people suggest.

And I take every opportunity to tell people not to do it.




* Re: Indentation and gc
  2023-03-11 12:40               ` Eli Zaretskii
@ 2023-03-11 12:54                 ` Ihor Radchenko
  2023-03-11 13:01                   ` Dr. Arne Babenhauserheide
  2023-03-11 13:14                   ` Eli Zaretskii
  2023-03-11 13:00                 ` Po Lu
  1 sibling, 2 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 12:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> That's what many people do and many people suggest.
>
> And I take every opportunity to tell people not to do.

I am afraid that it is not very helpful. People do not do it only
because they can, but also because there is a real problem that can be
solved by this dangerous practice.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 12:40               ` Eli Zaretskii
  2023-03-11 12:54                 ` Ihor Radchenko
@ 2023-03-11 13:00                 ` Po Lu
  1 sibling, 0 replies; 99+ messages in thread
From: Po Lu @ 2023-03-11 13:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ihor Radchenko, spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Ihor Radchenko <yantar92@posteo.net>
>> Cc: Ergus <spacibba@aol.com>, arne_bab@web.de, emacs-devel@gnu.org
>> Date: Sat, 11 Mar 2023 12:39:04 +0000
>> 
>> Eli Zaretskii <eliz@gnu.org> writes:
>> 
>> >>                gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
>> >                                    ^^^^^^^^^^^^^^^^^^^^
>> > NEVER do that!
>> 
>> That's what many people do and many people suggest.
>
> And I take every opportunity to tell people not to do.

+1.

For example, Android devices are much more strict wrt memory management
policy, and they have ``overcommit'' turned off, so Emacs has a chance
to signal out-of-memory errors.

Someone using the Android port reported that his Emacs reliably ran out
of memory after a couple of minutes of editing Org files.  It turns out
that his `gc-cons-threshold' was 512 MB!




* Re: Indentation and gc
  2023-03-11 12:54                 ` Ihor Radchenko
@ 2023-03-11 13:01                   ` Dr. Arne Babenhauserheide
  2023-03-11 13:14                   ` Eli Zaretskii
  1 sibling, 0 replies; 99+ messages in thread
From: Dr. Arne Babenhauserheide @ 2023-03-11 13:01 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, spacibba, emacs-devel



Ihor Radchenko <yantar92@posteo.net> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>>>>                gc-cons-threshold most-positive-fixnum   ;; Defer Garbage collection
>>> That's what many people do and many people suggest.
>>
>> And I take every opportunity to tell people not to do.
>
> I am afraid that it is not very helpful. People do not do it only
> because they can, but also because there is a real problem that can be
> solved by this dangerous practice.

Note that I use something similar, but just enlarging to something I
trust my system can handle:

;; Make startup faster by reducing the frequency of garbage
;; collection.  The default is 800 kilobytes.  Measured in bytes.
(setq gc-cons-threshold (* 100 1024 1024))


My system has a few tens of gigabytes of memory, so running GC only
after more than 100 MiB has been allocated does not risk OOM. Algorithms
may depend on garbage collection being active to avoid unbounded memory
usage.

But it causes GC pauses to be much longer — that does not hurt during
init (I don’t interact with Emacs anyway), but it would hurt during
regular usage.

Therefore at the end of the init file this reverts to a more sensible
value for continuous operation:


;; Make GC pauses shorter by lowering the threshold again (from the raised startup value).
(setq gc-cons-threshold (* 20 1024 1024))
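
[Editor's sketch] The two settings above can also live in one place by
restoring the lower value from `emacs-startup-hook' instead of from the
last line of the init file.  The 100 MiB and 20 MiB figures are the ones
from this message; they suit this particular machine and would need
tuning elsewhere.

```elisp
;; early-init.el, or the very top of init.el.
;; A finite ceiling, so GC still runs if startup allocates a lot.
(setq gc-cons-threshold (* 100 1024 1024))

;; Lower the threshold once startup has finished, even if a later
;; part of the init file is skipped or rearranged.
(add-hook 'emacs-startup-hook
          (lambda ()
            (setq gc-cons-threshold (* 20 1024 1024))))
```

Using the hook means the restore step cannot be lost when the end of the
init file is edited.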


Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de


* Re: Indentation and gc
  2023-03-11 12:34               ` Dr. Arne Babenhauserheide
@ 2023-03-11 13:08                 ` Eli Zaretskii
  2023-03-11 13:31                   ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 13:08 UTC (permalink / raw)
  To: Dr. Arne Babenhauserheide; +Cc: spacibba, emacs-devel

> From: "Dr. Arne Babenhauserheide" <arne_bab@web.de>
> Cc: spacibba@aol.com, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 13:34:54 +0100
> 
> > Perhaps the only thing we could do is enlarge the value slightly in
> > Emacs 30.
> 
> That could already help, yes. Maybe not by a factor of 25 as I did (that’s
> mostly for lsp work), but just adjusted to how much the low end of the
> systems that can run Emacs 30 has changed.

No, a factor of 25 is definitely too much.  I'd say maybe 4 or 5.

> > So raising the threshold indirectly raises the probability of having
> > your system run out of memory, even if the threshold value is way
> > below the amount of VM you have.
> …
> > See above: to what value will you enlarge it so that it's still safe?
> > The Emacs startup typically does a lot of non-trivial stuff, so could
> > consume large quantities of memory.
> 
> With the main risk being that we could go OOM, could Emacs evaluate the
> available memory on the system on systems that support that check?

It can, but what would you want to do with that value?

We cannot use it as the threshold, for the reasons I explained
earlier.  We could use some fraction of it, but what fraction?  The
answer depends on what other programs routinely run on that system.
For example, if the user is likely to run another full-fledged session
of Emacs (some people actually do that, e.g., to run Gnus in a
separate process), then using 1/2 of the amount of VM as the threshold
is out of the question, right?  And there are memory-hogging programs
out there which use much more than Emacs does.

> If Emacs can give back memory to the OS (I expect that it can, but I am
> not sure¹)

It depends...  In some situations (and some OSes), it doesn't.

> ¹: Can Emacs give back memory to the OS?

Depends on the implementation of malloc and on memory fragmentation.




* Re: Indentation and gc
  2023-03-11 12:37         ` Eli Zaretskii
@ 2023-03-11 13:10           ` Ihor Radchenko
  2023-03-11 13:38             ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 13:10 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> > This can only be done around specific portions of code known in
>> > advance to be long and GC-intensive.  I don't think this kind of
>> > technique can be used in the situation described by the OP.
>> 
>> May it be done when loading init.el and early-init.el?
>
> See my response, where I explain that it is not easy, AFAIU.

Are you referring to
https://yhetil.org/emacs-devel/83fsabyb41.fsf@gnu.org ?

If so, as you said, there is no guarantee that the threshold is never
exceeded. So, running Emacs when the available memory is extremely close
to the heap size is not safe anyway. Emacs may start exceeding memory
limits just from loading an extra package or something along those lines.
Memory-constrained users probably need to carefully search for a balance
anyway.

What about setting gc-cons-threshold (and leaving gc-cons-percentage
intact) as a fraction of the available memory as returned by
`memory-info'? At least, when loading init.el. This can be a custom
setting like `gc-cons-threshold-init' that may either be nil (fall back
to `gc-cons-threshold'), an integer (bytes), or a float representing a
fraction of free memory. The default can be a fraction of free memory,
if memory info is available for a given system. The default may even
disable this whole thing when the available system memory is small.
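
To make the proposal concrete, here is a rough sketch of how such a setting could be resolved. Nothing in it exists in Emacs today: `my-gc-cons-threshold-init' and `my-gc-resolve-init-threshold' are hypothetical names, and the 0.005 default fraction is an arbitrary placeholder. Only `memory-info' is real; it returns nil when the information is not available.

```elisp
(defvar my-gc-cons-threshold-init 0.005
  "Hypothetical setting for the threshold used while loading init files.
nil means use `gc-cons-threshold', an integer is a size in bytes,
and a float is a fraction of the currently free RAM.")

(defun my-gc-resolve-init-threshold (setting)
  "Turn SETTING into a byte count usable as `gc-cons-threshold'."
  (pcase setting
    ('nil gc-cons-threshold)
    ((pred integerp) setting)
    ((pred floatp)
     ;; `memory-info' returns (TOTAL-RAM FREE-RAM TOTAL-SWAP FREE-SWAP)
     ;; in kbytes, or nil if the system is not supported.
     (let ((free-kb (nth 1 (memory-info))))
       (if free-kb
           ;; Never go below the standard threshold on small systems.
           (max gc-cons-threshold (floor (* setting free-kb 1024)))
         gc-cons-threshold)))))

;; Example: bind the threshold only while loading a file.
;; (let ((gc-cons-threshold
;;        (my-gc-resolve-init-threshold my-gc-cons-threshold-init)))
;;   (load user-init-file))
```

The `let' binding keeps the larger value confined to the load, so nothing has to be restored by hand afterwards.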

>> As for "known in advance", may Emacs keep track of how many GCs are
>> triggered by user commands and then adjust GC dynamically?
>
> Adjust how?  If you mean enlarge under some conditions, then please
> tell:
>
>   . what are those conditions?

AFAIR, being too smart here does not work well. I experimented with
similar ideas in the past (by modifying gcmh package).

I'd introduce a custom setting `gc-cons-percentage-2'/`gc-cons-threshold-2'
that will define an alternative (larger) GC threshold that is used for a
command if the number of GCs exceeds `gcs-done-threshold'.

Upon finishing the command, GC thresholds are lowered back to standard.
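
A minimal sketch of that idea, using only existing hooks and the `gcs-done' counter. All `my-' names and both default values are hypothetical, and this is not how gcmh or Emacs itself does it.

```elisp
(defvar my-gc-threshold-2 (* 8 800000)
  "Hypothetical larger threshold used for GC-heavy commands.")

(defvar my-gcs-done-threshold 3
  "Hypothetical number of GCs within one command that triggers the switch.")

(defvar my-gc--standard-threshold gc-cons-threshold
  "Threshold to restore when the command finishes.")

(defvar my-gc--gcs-at-command-start 0
  "Value of `gcs-done' when the current command started.")

(defun my-gc--command-start ()
  "Remember how many GCs had happened before this command."
  (setq my-gc--gcs-at-command-start gcs-done))

(defun my-gc--after-gc ()
  "Switch to the larger threshold once the command has caused enough GCs."
  (when (> (- gcs-done my-gc--gcs-at-command-start) my-gcs-done-threshold)
    (setq gc-cons-threshold my-gc-threshold-2)))

(defun my-gc--command-end ()
  "Lower the threshold back to the standard value."
  (setq gc-cons-threshold my-gc--standard-threshold))

(add-hook 'pre-command-hook #'my-gc--command-start)
(add-hook 'post-gc-hook #'my-gc--after-gc)
(add-hook 'post-command-hook #'my-gc--command-end)
```

This covers interactive commands only. Code that runs from timers or process filters is not bracketed by `pre-command-hook' and `post-command-hook', so it would need separate handling.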

>   . should the threshold also go down under some other conditions, and
>     if so, how?

I do not think that the threshold should be lowered. This increased
value should not affect many commands to start with - just the most
resource-intensive ones.

>   . how to determine the ceiling for increasing the threshold?

That's a good question.
I'd start with trying 5-10x normal value of gc-cons-percentage for
commands that spend more than 0.1-0.2 sec (noticeable delay) in more
than a single GC. Then, I'd add some code that will collect statistics
of the impact of the new setting. After some time, I will ask users to
share the statistics in one of the Emacs surveys or just by asking on
the mailing list.
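
As a starting point for such statistics, the GC cost of a single operation can be read off the existing variables `gcs-done' and `gc-elapsed'. The helper name `my-gc-measure' is made up for this sketch.

```elisp
(defun my-gc-measure (fn)
  "Call FN with no arguments; return a list (GC-COUNT GC-SECONDS).
GC-COUNT is the number of garbage collections that happened during
the call, GC-SECONDS the time spent in them."
  (let ((gcs-before gcs-done)
        (elapsed-before gc-elapsed))
    (funcall fn)
    (list (- gcs-done gcs-before)
          (- gc-elapsed elapsed-before))))

;; Example: how much GC does re-indenting the current buffer cost?
;; (my-gc-measure (lambda () (indent-region (point-min) (point-max))))
```

Comparing the second element against the 0.1-0.2 sec figure above would show whether a given command belongs to the "noticeable delay" group.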

Of course, the experiment should happen on master. Not on the release branch.
Maybe with extra statistics collected after the release in Emacs survey.

I tried something similar in Org. See
https://list.orgmode.org/m2y1p22nfn.fsf@ioa48nv.localdomain/T/#t

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 12:54                 ` Ihor Radchenko
  2023-03-11 13:01                   ` Dr. Arne Babenhauserheide
@ 2023-03-11 13:14                   ` Eli Zaretskii
  2023-03-11 13:38                     ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 13:14 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 12:54:27 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> That's what many people do and many people suggest.
> >
> > And I take every opportunity to tell people not to do that.
> 
> I am afraid that it is not very helpful.

I'm sorry if I'm unhelpful, but did you ever read the posts that I
wrote on this issue?  E.g., here:

  https://old.reddit.com/r/emacs/comments/bg85qm/garbage_collector_magic_hack/




* Re: Indentation and gc
  2023-03-11 13:08                 ` Eli Zaretskii
@ 2023-03-11 13:31                   ` Ihor Radchenko
  2023-03-11 13:44                     ` Eli Zaretskii
  2023-03-11 16:19                     ` Dr. Arne Babenhauserheide
  0 siblings, 2 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 13:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Dr. Arne Babenhauserheide, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> With the main risk being that we could go OOM, could Emacs evaluate the
>> available memory on the system on systems that support that check?
>
> It can, but what would you want to do with that value?
>
> We cannot use it as the threshold, for the reasons I explained
> earlier.  We could use some fraction of it, but what fraction?  The
> answer depends on what other programs routinely run on that system.
> For example, if the user is likely to run another full-fledged session
> of Emacs (some people actually do that, e.g., to run Gnus in a
> separate process), then using 1/2 of the amount of VM as the threshold
> is out of the question, right?  And there are memory-hogging programs
> out there which use much more than Emacs does.

What is the smallest practical free RAM available to Emacs on low-end systems?
We can take that value and then use 800kb / (minimum free RAM in the wild) as
the base ratio for the threshold. On systems with more RAM the threshold will scale.

As a speculation, let's assume that the minimal sane memory we can
encounter is 128Mb. Then, 800kb correspond to ~0.7% RAM.

For systems with a lot of RAM, say 128Gb, 0.7% corresponds to about 896Mb.
That is probably a bit much and will cause memory fragmentation.
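
The arithmetic behind these two figures, spelled out in decimal units for anyone who wants to check it. The exact ratio comes out slightly below the 0.7% used here, which is a rounded-up figure.

```elisp
;; Ratio of the old default to the assumed smallest RAM size.
(/ 800e3 128e6)            ; 0.00625, i.e. 0.625%

;; The rounded-up 0.7% applied to a 128Gb machine, in bytes.
(round (* 0.7e-2 128e9))   ; 896000000, i.e. about 896Mb
```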

What about the default being:

(pcase (car (memory-info)) ; total RAM in kb, or nil if unavailable
  (`nil 800000) ; 800kb, old default
  (ram-kb
   (let ((scaled-threshold
          (* 0.7e-2 ; 800kb/128Mb for low-end systems.
             ram-kb 1000)))
     (floor ; the threshold must be an integer
      (min
       (* 100 1000 1000) ; upper limit to avoid fragmentation
       scaled-threshold)))))

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 13:10           ` Ihor Radchenko
@ 2023-03-11 13:38             ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 13:38 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 13:10:35 +0000
> 
> If so, as you said, there is no guarantee that the threshold is never
> exceeded. So, running Emacs when the available memory is extremely close
> to the heap size is not safe anyway.

What do you mean by "heap size" here?

We aren't talking about the heap, we are talking about the VM size of
the system, vs how much of that is typically consumed by programs.
The answer depends on the system and on the usage patterns.

> Emacs may start exceeding memory limits just from loading an extra
> package or something along those lines.

It depends on how large the memory consumption of that package is and how
much memory the system has.

> Memory-constrained users probably need to carefully search for a balance
> anyway.

The current GC threshold is unlikely to cause problems on systems with
even very little memory.  Emacs itself needs around 15 to 20 MiB even
when started with -Q, so if the system is capable of running Emacs, it
should be able to cope with relatively minor increases of the memory
footprint, such as what the default GC threshold produces.

You suggest increasing the threshold significantly, which could make
the danger of OOM much higher, especially on memory-challenged
systems.

> What about setting gc-cons-threshold (and leaving gc-cons-percentage
> intact) as a fraction of the available memory as returned by
> `memory-info'?

See my other message a few minutes ago, where I already responded to a
similar proposal.

> At least, when loading init.el. This can be a custom
> setting like `gc-cons-threshold-init' that may either be nil (fall back
> to `gc-cons-threshold'), an integer (bytes), or a float representing a
> fraction of free memory. The default can be a fraction of free memory,
> if memory info is available for a given system.

Which fraction?

If we leave determination of the fraction to the user, we might as
well not provide any new defcustoms at all, since the existing
variables cover that, and you are saying that everyone increases the
threshold to most-positive-fixnum anyway.

> I'd introduce a custom setting `gc-cons-percentage-2'/`gc-cons-threshold-2'
> that will define an alternative (larger) GC threshold that is used for a
> command if the number of GCs exceeds `gcs-done-threshold'.

How is this different from what we already have?

> Upon finishing the command, GC thresholds are lowered back to standard.

Why only "command"?  What about stuff that runs from post-command-hook
or from timers or from process filters?  Some of them could be as
performance critical as an interactive command, no?

> >   . how to determine the ceiling for increasing the threshold?
> 
> That's a good question.

That's the main question, from my POV.  Without a good answer, we
don't have any reason to introduce any new features related to this.

> I'd start with trying 5-10x normal value of gc-cons-percentage for
> commands that spend more than 0.1-0.2 sec (noticeable delay) in more
> than a single GC.

Again: why only commands?

And how to know which ones will take more than the times you consider
as significant?  (And why should everyone use these times, which
appear to be quite low?)

> Then, I'd add some code that will collect statistics of the impact
> of the new setting. After some time, I will ask users to share the
> statistics in one of the Emacs surveys or just by asking on the mailing
> list.

Then we should start with just this, and revisit the issue only after
we have some significant statistics.  Note that users will have to
tell much more than just threshold numbers: they will need to tell how
much memory they have and also the memory footprints of the programs
running on their systems.




* Re: Indentation and gc
  2023-03-11 13:14                   ` Eli Zaretskii
@ 2023-03-11 13:38                     ` Ihor Radchenko
  2023-03-11 13:46                       ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 13:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> > And I take every opportunity to tell people not to do that.
>> 
>> I am afraid that it is not very helpful.
>
> I'm sorry if I'm unhelpful, but did you ever read the posts that I
> wrote on this issue?  E.g., here:
>
>   https://old.reddit.com/r/emacs/comments/bg85qm/garbage_collector_magic_hack/

Not really. I do not follow reddit so closely that I would notice all
the comments buried in various posts.

What about expanding the relevant paragraph in the manual with examples
when increasing the threshold too much is dangerous?

     The initial threshold value is ‘GC_DEFAULT_THRESHOLD’, defined in
     ‘alloc.c’.  Since it’s defined in ‘word_size’ units, the value is
     400,000 for the default 32-bit configuration and 800,000 for the
     64-bit one.  If you specify a larger value, garbage collection will
     happen less often.  This reduces the amount of time spent garbage
     collecting, but increases total memory use.  You may want to do
     this when running a program that creates lots of Lisp data.
     However, we recommend against increasing the threshold for
     prolonged periods of time, and advise that you never set it higher
     than needed for the program to run in reasonable time.  Using
     thresholds higher than necessary could potentially cause
     system-wide memory pressure, and should therefore be avoided.
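
For reference, the pattern this paragraph recommends (a raised threshold for one program only, not for prolonged periods) can be sketched with a plain `let' binding. The macro name `my-with-raised-gc-threshold' is hypothetical.

```elisp
(defmacro my-with-raised-gc-threshold (bytes &rest body)
  "Evaluate BODY with `gc-cons-threshold' temporarily bound to BYTES.
The previous value is back in effect as soon as BODY returns or
exits non-locally."
  (declare (indent 1))
  `(let ((gc-cons-threshold ,bytes))
     ,@body))

;; Example: build a large list under a 16 MiB threshold.
;; (my-with-raised-gc-threshold (* 16 1024 1024)
;;   (length (make-list 1000000 nil)))
```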

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 13:31                   ` Ihor Radchenko
@ 2023-03-11 13:44                     ` Eli Zaretskii
  2023-03-11 13:53                       ` Ihor Radchenko
  2023-03-11 16:19                     ` Dr. Arne Babenhauserheide
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 13:44 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: "Dr. Arne Babenhauserheide" <arne_bab@web.de>, spacibba@aol.com,
>  emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 13:31:04 +0000
> 
> What is the smallest practical free RAM available to Emacs on low-end systems?
> We can take that value and then use 800kb / (minimum free RAM in the wild) as
> the base ratio for the threshold. On systems with more RAM the threshold will scale.
> 
> As a speculation, let's assume that the minimal sane memory we can
> encounter is 128Mb. Then, 800kb correspond to ~0.7% RAM.
> 
> For systems with a lot of RAM, say 128Gb, 0.7% corresponds to about 896Mb.
> That is probably a bit much and will cause memory fragmentation.
> 
> What about the default being:
> 
> (pcase (* (car (memory-info)) ; in kb
> 	  1000)
>   (`nil 800000) ; 800kb, old default
>   (ram
>    (let ((scaled-threshold
> 	  (* 0.7e-2 ; 800kb/128Mb for low-end systems.
> 	     ram)))
>      (min
>       (* 100 1000 1000) ; upper limit to avoid fragmentation
>       scaled-threshold))))

The above implicitly assumes that gc-cons-threshold is the absolute
ceiling of the memory Emacs can allocate.  But that is not what that
threshold means and how it is used.  Even with the default threshold
of 800K a running Emacs session can allocate much more than 800K
bytes.

Therefore, the reasoning about the value should be different.




* Re: Indentation and gc
  2023-03-11 13:38                     ` Ihor Radchenko
@ 2023-03-11 13:46                       ` Eli Zaretskii
  2023-03-11 13:54                         ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 13:46 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 13:38:54 +0000
> 
> What about expanding the relevant paragraph in the manual with examples
> when increasing the threshold too much is dangerous?
> 
>      The initial threshold value is ‘GC_DEFAULT_THRESHOLD’, defined in
>      ‘alloc.c’.  Since it’s defined in ‘word_size’ units, the value is
>      400,000 for the default 32-bit configuration and 800,000 for the
>      64-bit one.  If you specify a larger value, garbage collection will
>      happen less often.  This reduces the amount of time spent garbage
>      collecting, but increases total memory use.  You may want to do
>      this when running a program that creates lots of Lisp data.
>      However, we recommend against increasing the threshold for
>      prolonged periods of time, and advise that you never set it higher
>      than needed for the program to run in reasonable time.  Using
>      thresholds higher than necessary could potentially cause
>      system-wide memory pressure, and should therefore be avoided.

What is there to expand?  It already says all that people need to
understand before they play dangerous games with the threshold.




* Re: Indentation and gc
  2023-03-11 13:44                     ` Eli Zaretskii
@ 2023-03-11 13:53                       ` Ihor Radchenko
  2023-03-11 14:09                         ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 13:53 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> (pcase (* (car (memory-info)) ; in kb
>> 	  1000)
>>   (`nil 800000) ; 800kb, old default
>>   (ram
>>    (let ((scaled-threshold
>> 	  (* 0.7e-2 ; 800kb/128Mb for low-end systems.
>> 	     ram)))
>>      (min
>>       (* 100 1000 1000) ; upper limit to avoid fragmentation
>>       scaled-threshold))))
>
> The above implicitly assumes that gc-cons-threshold is the absolute
> ceiling of the memory Emacs can allocate.  But that is not what that
> threshold means and how it is used.  Even with the default threshold
> of 800K a running Emacs session can allocate much more than 800K
> bytes.

No. I took into account the fact that `gc-cons-threshold' is paired with
`gc-cons-percentage'. The latter scales with Emacs memory usage. For the
former, it makes sense to scale with available memory until
`gc-cons-percentage' takes over.
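
To illustrate the pairing: according to the Lisp reference manual both criteria must be satisfied before a collection happens, so the larger of the two decides. The function below is only a model of that rule with a hypothetical name; HEAP-BYTES is supplied by the caller and is not measured here.

```elisp
(defun my-effective-gc-threshold (threshold percentage heap-bytes)
  "Model of how much must be allocated before the next GC can happen.
THRESHOLD plays the role of `gc-cons-threshold', PERCENTAGE that of
`gc-cons-percentage', and HEAP-BYTES is the current heap size.
Both criteria have to be met, so the larger one wins."
  (max threshold (floor (* percentage heap-bytes))))

;; With the defaults (800000 and 0.1):
;; (my-effective-gc-threshold 800000 0.1 (* 4 1000 1000))
;;   small heap: the fixed threshold decides
;; (my-effective-gc-threshold 800000 0.1 (* 100 1000 1000))
;;   large heap: the percentage decides
```

With the defaults the percentage takes over once the heap exceeds about 8Mb. With a fixed threshold of 100Mb that point would move to a heap of about 1Gb.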

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 13:46                       ` Eli Zaretskii
@ 2023-03-11 13:54                         ` Ihor Radchenko
  2023-03-11 14:11                           ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 13:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> What is there to expand?  It already says all that people need to
> understand before they play dangerous games with the threshold.

Examples when increasing the threshold is dangerous.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 13:53                       ` Ihor Radchenko
@ 2023-03-11 14:09                         ` Eli Zaretskii
  2023-03-12 14:20                           ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 14:09 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 13:53:03 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > The above implicitly assumes that gc-cons-threshold is the absolute
> > ceiling of the memory Emacs can allocate.  But that is not what that
> > threshold means and how it is used.  Even with the default threshold
> > of 800K a running Emacs session can allocate much more than 800K
> > bytes.
> 
> No. I took into account the fact that `gc-cons-threshold' is paired with
> `gc-cons-percentage'. The latter scales with Emacs memory usage. For the
> former, it makes sense to scale with available memory until
> `gc-cons-percentage' takes over.

I'm talking about the basis for the 0.7% figure.

Anyway, how about if you try running with the threshold you think we
should adopt, and report back after a month or so, say?




* Re: Indentation and gc
  2023-03-11 13:54                         ` Ihor Radchenko
@ 2023-03-11 14:11                           ` Eli Zaretskii
  2023-03-11 14:18                             ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 14:11 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 13:54:00 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > What is there to expand?  It already says all that people need to
> > understand before they play dangerous games with the threshold.
> 
> Examples when increasing the threshold is dangerous.

Please describe such an example, because I don't think I understand
what it would say.




* Re: Indentation and gc
  2023-03-11 14:11                           ` Eli Zaretskii
@ 2023-03-11 14:18                             ` Ihor Radchenko
  2023-03-11 14:20                               ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 14:18 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Examples when increasing the threshold is dangerous.
>
> Please describe such an example, because I don't think I understand
> what it would say.

I was referring to your example in
https://old.reddit.com/r/emacs/comments/bg85qm/garbage_collector_magic_hack/emk013p/,
if I understand it correctly.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 14:18                             ` Ihor Radchenko
@ 2023-03-11 14:20                               ` Eli Zaretskii
  2023-03-11 14:31                                 ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 14:20 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 14:18:59 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Examples when increasing the threshold is dangerous.
> >
> > Please describe such an example, because I don't think I understand
> > what it would say.
> 
> I was referring to your example in
> https://old.reddit.com/r/emacs/comments/bg85qm/garbage_collector_magic_hack/emk013p/,
> if I understand it correctly.

That was meant just to demonstrate what could happen in principle, not
describe real-life uses that matter to most of us.  It even says so.

IMO, having such synthetic examples in the manual won't help much,
because people will reject them outright ("this can never happen to
me").




* Re: Indentation and gc
  2023-03-11 14:20                               ` Eli Zaretskii
@ 2023-03-11 14:31                                 ` Ihor Radchenko
  2023-03-11 15:32                                   ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-11 14:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I was referring to your example in
>> https://old.reddit.com/r/emacs/comments/bg85qm/garbage_collector_magic_hack/emk013p/,
>> if I understand it correctly.
>
> That was meant just to demonstrate what could happen in principle, not
> describe real-life uses that matter to most of us.  It even says so.
>
> IMO, having such synthetic examples in the manual won't help much,
> because people will reject them outright ("this can never happen to
> me").

Well. Currently, the paragraph reads like: we have 800kb default, but do
not increase it too much because Emacs will require more memory.

Given that modern computers typically have 4-16Gb RAM, the warning does
not look like an actual warning. 800kb is nothing. Surely, increasing it
to 80Mb or even a few hundred Mb is safe, right? Or not?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-11 14:31                                 ` Ihor Radchenko
@ 2023-03-11 15:32                                   ` Eli Zaretskii
  2023-03-11 15:52                                     ` Lynn Winebarger
                                                       ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 15:32 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 14:31:18 +0000
> 
> Well. Currently, the paragraph reads like: we have 800kb default, but do
> not increase it too much because Emacs will require more memory.

No, it says more.  In particular:

                  If you specify a larger value, garbage collection will
     happen less often.  This reduces the amount of time spent garbage
     collecting, but increases total memory use.  You may want to do
     this when running a program that creates lots of Lisp data.
     However, we recommend against increasing the threshold for
     prolonged periods of time, and advise that you never set it higher
     than needed for the program to run in reasonable time.  Using
     thresholds higher than necessary could potentially cause
     system-wide memory pressure, and should therefore be avoided.

> Given that modern computers typically have 4-16Gb RAM, the warning does
> not look like an actual warning. 800kb is nothing. Surely, increasing it
> to 80Mb or even a few hundred Mb is safe, right? Or not?

Again, you are reasoning about the value as if it were related to the
maximum memory footprint Emacs could have.  But in fact, it is related
only to the _increment_ of memory Emacs can have before it should stop
and consider how much of that is garbage.




* Re: Indentation and gc
  2023-03-11 15:32                                   ` Eli Zaretskii
@ 2023-03-11 15:52                                     ` Lynn Winebarger
  2023-03-11 16:24                                       ` Eli Zaretskii
  2023-03-11 17:10                                     ` Gregor Zattler
  2023-03-13 12:45                                     ` Ihor Radchenko
  2 siblings, 1 reply; 99+ messages in thread
From: Lynn Winebarger @ 2023-03-11 15:52 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Ihor Radchenko, spacibba, arne_bab, emacs-devel


On Sat, Mar 11, 2023, 10:32 AM Eli Zaretskii <eliz@gnu.org> wrote:

> Again, you are reasoning about the value as if it were related to the
> maximum memory footprint Emacs could have.  But in fact, it is related
> only to the _increment_ of memory Emacs can have before it should stop
> and consider how much of that is garbage.
>

So, should there be a parameter that controls the maximum amount of memory
Emacs is allowed to allocate (and not just in the Lisp heap), like an
internal ulimit?
The uncertainty of that limit on a given system appears to be what motivates
the calibration of these GC parameters.  That is combined with dire warnings
about the consequences of mis-setting those parameters, but with no apparent
way to get at the cause of those settings being unsafe.

Lynn


* Re: Indentation and gc
  2023-03-11 13:31                   ` Ihor Radchenko
  2023-03-11 13:44                     ` Eli Zaretskii
@ 2023-03-11 16:19                     ` Dr. Arne Babenhauserheide
  2023-03-12 13:27                       ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Dr. Arne Babenhauserheide @ 2023-03-11 16:19 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Eli Zaretskii, spacibba, emacs-devel



Ihor Radchenko <yantar92@posteo.net> writes:

> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> With the main risk being that we could go OOM, could Emacs evaluate the
>>> available memory on the system on systems that support that check?
>>
>> It can, but what would you want to do with that value?
>>
>> We cannot use it as the threshold, for the reasons I explained
>> earlier.  We could use some fraction of it, but what fraction?  The
>> answer depends on what other programs routinely run on that system.
>> For example, if the user is likely to run another full-fledged session
>> of Emacs (some people actually do that, e.g., to run Gnus in a
>> separate process), then using 1/2 of the amount of VM as the threshold
>> is out of the question, right?  And there are memory-hogging programs
>> out there which use much more than Emacs does.
>
> What is the smallest practical free RAM available to Emacs on low-end systems?
> We can take that value and then use 800kb/(min free RAM in the wild) as
> the base threshold. On systems with more RAM the threshold will scale.

It’s not that simple — at least outside loading the init file. If you
increase this fraction, then GC pauses will be longer and Emacs may feel
jittery or even become unresponsive for some time.

So there should likely be also a hard upper limit to ensure that the
pauses are unnoticeable.

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de



* Re: Indentation and gc
  2023-03-11 15:52                                     ` Lynn Winebarger
@ 2023-03-11 16:24                                       ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 16:24 UTC (permalink / raw)
  To: Lynn Winebarger; +Cc: yantar92, spacibba, arne_bab, emacs-devel

> From: Lynn Winebarger <owinebar@gmail.com>
> Date: Sat, 11 Mar 2023 10:52:32 -0500
> Cc: Ihor Radchenko <yantar92@posteo.net>, spacibba@aol.com, arne_bab@web.de, 
> 	emacs-devel <emacs-devel@gnu.org>
> 
> On Sat, Mar 11, 2023, 10:32 AM Eli Zaretskii <eliz@gnu.org> wrote:
> 
>  Again, you are reasoning about the value as if it were related to the
>  maximum memory footprint Emacs could have.  But in fact, it is related
>  only to the _increment_ of memory Emacs can have before it should stop
>  and consider how much of that is garbage.
> 
> So, should there be a parameter that controls the maximum amount of memory emacs is allowed to allocate
> (and not just in the lisp heap), like an internal ulimit? 

We already have that: the 85% of total memory, where we display a
warning.

However, on many modern systems, knowing how much memory Emacs
actually uses is not easy/reliable, I think.  So any such limitation
would be similarly unreliable.

> The uncertainty of that limit in a given system appears to be motivating the calibration of these gc
> parameters.

I don't think so. The issue at hand is not how much total memory a
given system can use, the issue is how much of it remains available,
after what the programs already running consumed.




* Re: Indentation and gc
  2023-03-11 15:32                                   ` Eli Zaretskii
  2023-03-11 15:52                                     ` Lynn Winebarger
@ 2023-03-11 17:10                                     ` Gregor Zattler
  2023-03-11 17:25                                       ` Eli Zaretskii
  2023-03-13 12:45                                     ` Ihor Radchenko
  2 siblings, 1 reply; 99+ messages in thread
From: Gregor Zattler @ 2023-03-11 17:10 UTC (permalink / raw)
  To: Eli Zaretskii, Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

Hi Eli, emacs developers,
* Eli Zaretskii <eliz@gnu.org> [2023-03-11; 17:32 +02]:
>> From: Ihor Radchenko <yantar92@posteo.net>
>> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
>> Date: Sat, 11 Mar 2023 14:31:18 +0000

>> Given that modern computers typically have 4-16Gb RAM, the warning does
>> not look like an actual warning. 800kb is nothing. Surely, increasing it
>> to 80Mb or even a few hundred Mb is safe, right? Or not?
>
> Again, you are reasoning about the value as if it were related to the
> maximum memory footprint Emacs could have.  But in fact, it is related
> only to the _increment_ of memory Emacs can have before it should stop
> and consider how much of that is garbage.

But isn't that the very reason why Ihor's gc-cons-threshold
calculation in mid:878rg3wh2f.fsf@localhost is on the
safe side memory-wise?  Because it's a fraction of Emacs's
overall memory consumption anyway, but scaled to
the total amount of memory?

To me, the problem with a big gc-cons-threshold, even on systems
with even more RAM, is that the (rare) garbage
collection then takes much more time and an uninformed
user might think Emacs hangs.

I played a lot recently with gc-cons-threshold settings because
Emacs was too sluggish with my old ones.  Now I:

- set gc-cons-threshold very high at the beginning of
  startup  (* 4096 40960)
- set it lower at the end of startup
  (/ (* 4096 4096) 1)
  - use gcmh with this value
- set it very high when entering the mini-buffer and
  lower again when exiting it
- force a gc when the frame loses focus
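
For anyone who wants to try something similar, the steps in the list
above could be sketched roughly as follows.  This is an illustrative
reconstruction and not the configuration actually used here: the
`my/' names are hypothetical, the minibuffer value is assumed to be
the same as the startup value (the list does not say), and the gcmh
part is left out.  `after-focus-change-function' and
`frame-focus-state' need Emacs 27 or later.

```elisp
;; Sketch only: every `my/' name below is hypothetical.
(require 'seq)

(defconst my/gc-threshold-high (* 4096 40960)
  "Threshold used during startup and minibuffer input (160 MiB).")

(defconst my/gc-threshold-normal (* 4096 4096)
  "Threshold used for normal editing (16 MiB).")

;; Beginning of startup: collect rarely while the init files load.
(setq gc-cons-threshold my/gc-threshold-high)

(defun my/gc-threshold-raise ()
  "Switch to the high threshold."
  (setq gc-cons-threshold my/gc-threshold-high))

(defun my/gc-threshold-restore ()
  "Switch back to the normal threshold."
  (setq gc-cons-threshold my/gc-threshold-normal))

;; End of startup: go back to the lower value.
(add-hook 'emacs-startup-hook #'my/gc-threshold-restore)

;; High while the minibuffer is active, lower again afterwards.
(add-hook 'minibuffer-setup-hook #'my/gc-threshold-raise)
(add-hook 'minibuffer-exit-hook #'my/gc-threshold-restore)

(defun my/gc-when-unfocused ()
  "Run a garbage collection when no frame is known to have focus."
  (unless (seq-some #'frame-focus-state (frame-list))
    (garbage-collect)))

(add-function :after after-focus-change-function #'my/gc-when-unfocused)
```

Collecting on focus loss moves some of the pause to a moment when
nobody is typing, which is the point of the last item in the list.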

The result is that with emacs-uptime being 7 hours, 21
minutes (and plenty of time away from the computer) I
have 103 messages regarding garbage collection, with
their elapsed times, in my *Messages* buffer.

Some statistics:

Minimal number	0.000	seconds
Maximal number	2.603	seconds
Sum	65.896	seconds
Average	0.63976699029126213592	seconds
Median	0.612	seconds
Variance	0.06970711075501932322
Standard deviation	0.26402104225803541665

Actually 0.6 seconds are already rather long I think.
But it's much better than before (on a ca. 9 years old
x240 with 8GB RAM)

Therefore I think some auto-adjustment of
gc-cons-threshold would be nice, which would try to
optimize for low number of garbage collection and short
times of actual gc runs.

Ciao; Gregor




* Re: Indentation and gc
  2023-03-11 17:10                                     ` Gregor Zattler
@ 2023-03-11 17:25                                       ` Eli Zaretskii
  2023-03-11 18:35                                         ` Gregor Zattler
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 17:25 UTC (permalink / raw)
  To: Gregor Zattler; +Cc: yantar92, spacibba, arne_bab, emacs-devel

> From: Gregor Zattler <telegraph@gmx.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 18:10:33 +0100
> 
> > Again, you are reasoning about the value as if it were related to the
> > maximum memory footprint Emacs could have.  But in fact, it is related
> > only to the _increment_ of memory Emacs can have before it should stop
> > and consider how much of that is garbage.
> 
> But isn't that the very reason why Ihor's gc-cons-threshold
> calculation in mid:878rg3wh2f.fsf@localhost is on the
> safe side memory-wise?  Because it's a fraction of Emacs's
> overall memory consumption anyway, but scaled to
> the total amount of memory?

No.  The limitation on the _increment_ should have nothing to do with
how much memory is already consumed or how much total memory is
available on the system.  Imagine an Emacs with N MiB of memory
footprint on a system that has N+1 MiB of memory available.

IOW, what matters is how much is _left_, not how much is already used
or totally available.

> To me, the problem with a big gc-cons-threshold, even on systems
> with even more RAM, is that the (rare) garbage
> collection then takes much more time and an uninformed
> user might think Emacs hangs.

That is another downside of a large GC threshold, yes.  Which again
tells us that making the threshold grow linearly with the available
total VM is not TRT; there should be a hard limit that is not just
arbitrary, but related to the time it takes to perform GC on that many
objects.

> I played a lot recently with gc-cons-threshold settings because
> Emacs was too sluggish with my old ones.  Now I:
> 
> - set gc-cons-threshold very high at the beginning of
>   startup  (* 4096 40960)
> - set it lower at the end of startup
>   (/ (* 4096 4096) 1)
>   - use gcmh with this value
> - set it very high when entering the mini-buffer and
>   lower again when exiting it
> - force a gc when the frame loses focus
> 
> The result is that with emacs-uptime being 7 hours, 21
> minutes (and plenty of time away from the computer) I
> have 103 messages regarding garbage collection, with
> their elapsed times, in my *Messages* buffer.
> 
> Some statistics:
> 
> Minimal number	0.000	seconds
> Maximal number	2.603	seconds
> Sum	65.896	seconds
> Average	0.63976699029126213592	seconds
> Median	0.612	seconds
> Variance	0.06970711075501932322
> Standard deviation	0.26402104225803541665
> 
> Actually 0.6 seconds are already rather long I think.
> But it's much better than before (on a ca. 9 years old
> x240 with 8GB RAM)

The above means you tuned the threshold to your system and personal
needs.  Which is what everyone should do if they are bothered by
frequent GC cycles and too long run times of some Lisp programs they
care about.

> Therefore I think some auto-adjustment of
> gc-cons-threshold would be nice, which would try to
> optimize for low number of garbage collection and short
> times of actual gc runs.

"Therefore"? how does this follow from what you did?  Your tuning is
static and is appropriate for your usage.  Others will most probably
come up with different numbers using the same procedure.  How do you
propose to make this into some kind of auto-adjustment, when how much
garbage is generated and the amount of slowdown this incurs depends on
the Lisp programs that typically run?




* Re: Indentation and gc
  2023-03-11 17:25                                       ` Eli Zaretskii
@ 2023-03-11 18:35                                         ` Gregor Zattler
  2023-03-11 18:49                                           ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Gregor Zattler @ 2023-03-11 18:35 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: yantar92, spacibba, arne_bab, emacs-devel

Hi Eli, emacs developers,
* Eli Zaretskii <eliz@gnu.org> [2023-03-11; 19:25 +02]:
>> From: Gregor Zattler <telegraph@gmx.net>
>> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
>> Date: Sat, 11 Mar 2023 18:10:33 +0100

>> But isn't that the very reason why Ihor's gc-cons-threshold
>> calculation in mid:878rg3wh2f.fsf@localhost is on the
>> safe side memory-wise?  Because it's a fraction of Emacs's
>> overall memory consumption anyway, but scaled to
>> the total amount of memory?
>
> No.  The limitation on the _increment_ should have nothing to do with
> how much memory is already consumed or how much total memory is
> available on the system.  Imagine an Emacs with N MiB of memory
> footprint on a system that has N+1 MiB of memory available.
>
> IOW, what matters is how much is _left_, not how much is already used
> or totally available.

At the moment Emacs does not adjust gc-cons-threshold
to how much memory is left but uses a static
gc-cons-threshold which is rather low.  Ihor's
calculation uses the same conservative, rather low value
but scales it with overall memory.

[...]
> The above means you tuned the threshold to your system and personal
> needs.  Which is what everyone should do if they are bothered by
> frequent GC cycles and too long run times of some Lisp programs they
> care about.
>
>> Therefore I think some auto-adjustment of
>> gc-cons-threshold would be nice, which would try to
>> optimize for low number of garbage collection and short
>> times of actual gc runs.
>
> "Therefore"? how does this follow from what you did?  Your tuning is
> static and is appropriate for your usage.  Others will most probably
> come up with different numbers using the same procedure.  How do you
> propose to make this into some kind of auto-adjustment, when how much
> garbage is generated and the amount of slowdown this incurs depends on
> the Lisp programs that typically run?

I did it statically because I lack the ability to
program an auto-adjusting solution.

But it would be nice if gc-cons-threshold would be
adjusted after each garbage collection in relation to
the amount of time consumed in the last garbage
collections.

I envision an over-engineered customizable variable
called gc-time-service-level-agreement which would
feature two values: one for the maximum of time in
seconds allowed to spend in garbage collection and
another for the percentage of cases where this promise
of maximum time for garbage collection must be kept.
Default: (0.5 . 80)

Or some such.  This wished-for auto-adjustment of
garbage collection would help cater to the needs of
users of different packages and major modes.
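
To make the idea a bit more concrete, here is a rough sketch of what
a time-driven adjustment could look like.  Everything in it is an
assumption made for illustration: the `my/gc-sla-' names, the
halve-or-double rule, the 60-second timer and the bounds (800000
bytes and 100 MiB, figures that come up elsewhere in this thread).
It only covers the "maximum time" half of the pair described above
and ignores the percentage.  It relies on the built-in variables
`gcs-done' and `gc-elapsed'.

```elisp
;; Sketch only: all `my/gc-sla-' names and constants are hypothetical.

(defun my/gc-sla-next-threshold (current mean-pause budget lo hi)
  "Return a new threshold given CURRENT and the recent MEAN-PAUSE.
Halve CURRENT when MEAN-PAUSE exceeds BUDGET seconds, otherwise
double it.  The result is clamped to the range LO..HI (bytes)."
  (if (> mean-pause budget)
      (max lo (/ current 2))
    (min hi (* current 2))))

(defvar my/gc-sla-last (cons gcs-done gc-elapsed)
  "Values of `gcs-done' and `gc-elapsed' at the previous check.")

(defun my/gc-sla-adjust ()
  "Adjust `gc-cons-threshold' from the GC pauses since the last check."
  (let ((count (- gcs-done (car my/gc-sla-last)))
        (seconds (- gc-elapsed (cdr my/gc-sla-last))))
    (setq my/gc-sla-last (cons gcs-done gc-elapsed))
    ;; Do nothing if no collection happened since the last check.
    (when (> count 0)
      (setq gc-cons-threshold
            (my/gc-sla-next-threshold gc-cons-threshold
                                      (/ seconds count)
                                      0.5
                                      800000
                                      (* 100 1024 1024))))))

;; Re-evaluate once a minute.
(run-with-timer 60 60 #'my/gc-sla-adjust)
```

A caveat on the design: a smaller threshold does not necessarily give
proportionally shorter pauses, because each collection still has to
traverse all live Lisp objects, so a rule like this may converge
poorly in sessions with a lot of long-lived data.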

Ciao; Gregor




* Re: Indentation and gc
  2023-03-11 18:35                                         ` Gregor Zattler
@ 2023-03-11 18:49                                           ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-11 18:49 UTC (permalink / raw)
  To: Gregor Zattler; +Cc: yantar92, spacibba, arne_bab, emacs-devel

> From: Gregor Zattler <telegraph@gmx.net>
> Cc: yantar92@posteo.net, spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Sat, 11 Mar 2023 19:35:13 +0100
> 
> > No.  The limitation on the _increment_ should have nothing to do with
> > how much memory is already consumed or how much total memory is
> > available on the system.  Imagine an Emacs with N MiB of memory
> > footprint on a system that has N+1 MiB of memory available.
> >
> > IOW, what matters is how much is _left_, not how much is already used
> > or totally available.
> 
> At the moment Emacs does not adjust gc-cons-threshold
> to how much memory is left but uses a static
> gc-cons-threshold which is rather low.  Ihor's
> calculation uses the same conservative, rather low value
> but scales it with overall memory.

No, Ihor proposed to _enlarge_ the threshold if the total amount of
memory is large enough.  So the proposal does not keep the same
conservative low value.

> > "Therefore"? how does this follow from what you did?  Your tuning is
> > static and is appropriate for your usage.  Others will most probably
> > come up with different numbers using the same procedure.  How do you
> > propose to make this into some kind of auto-adjustment, when how much
> > garbage is generated and the amount of slowdown this incurs depends on
> > the Lisp programs that typically run?
> 
> I did it statically because I lack the ability to
> program an auto-adjusting solution.
> 
> But it would be nice if gc-cons-threshold would be
> adjusted after each garbage collection in relation to
> the amount of time consumed in the last garbage
> collections.

We are discussing how to do that.  I don't think there's a
disagreement that if we can find a reasonable way of doing that, it
will be nice to have it.




* Re: Indentation and gc
  2023-03-11 16:19                     ` Dr. Arne Babenhauserheide
@ 2023-03-12 13:27                       ` Ihor Radchenko
  2023-03-12 14:10                         ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-12 13:27 UTC (permalink / raw)
  To: Dr. Arne Babenhauserheide; +Cc: Eli Zaretskii, spacibba, emacs-devel

"Dr. Arne Babenhauserheide" <arne_bab@web.de> writes:

>> What is the smallest practical free RAM available to Emacs on low-end systems?
>> We can take that value and then use 800kb/(min free RAM in the wild) as
>> the base threshold. On systems with more RAM the threshold will scale.
>
> It’s not that simple — at least outside loading the init file. If you
> increase this fraction, then GC pauses will be longer and Emacs may feel
> jittery or even become unresponsive for some time.
>
> So there should likely be also a hard upper limit to ensure that the
> pauses are unnoticeable.

Well. I do realize that there should be a limit, which is why I put it
as 100Mb.

Strictly speaking, GC pauses scale with heap size. Increasing the GC threshold
will have two effects on the heap size: (1) thresholds larger than the normal
heap size will dominate the GC time - Emacs will need to traverse all
the newly added data to be GCed; (2) thresholds that are too large will cause
heap fragmentation, also increasing the GC times as the heap will expand.

I think that (2) is the most important factor for real world scenarios
(unless we set the threshold insanely high, higher than memory usage in
"heavy" Emacs sessions).

Emacs' default gives some lower safe bound on the threshold - it is
`gc-cons-percentage', defaulting to 1% of the heap size.

However, being safe is unfortunately not enough - apparently 1% of the heap
size is too low in real-world scenarios, where 1% is routinely and
frequently allocated, triggering GC rescans too often.

Especially so, when loading init.el. The heap size is yet smaller than
normal and GC is triggered even more frequently.

AFAIU, routine throw-away memory allocation in Emacs is not directly
correlated with the memory usage - it rather depends on the usage
patterns and the packages being used. For example, it takes about 10
complex helm searches for me to trigger my 250Mb threshold - 25Mb per
helm command. The GC frequency often depends on how heavily I use helm
completion.

To get some idea about the impact of gc-cons-threshold on memory
fragmentation, I compared the output of `memory-limit' with 250Mb vs.
default 800kb threshold:

 250Mb threshold - 689520 kb memory
 800kb threshold - 531548 kb memory

The memory usage is clearly increased, but not catastrophically, despite
using rather large threshold.

Of course, it is just init.el, which is loaded once. Memory
fragmentation as a result of routine Emacs usage may cause more
significant memory usage increase. Not multiple times though (not even
twice in my setup).

From the above, I expect a 100x increase of gc-cons-threshold to
affect memory usage far less than proportionally - by as little as 1.1x to 2x.
As long as gc-cons-threshold is significantly (~10x) lower than normal
Emacs heap size, I expect the GC frequency to scale down the same 100x
at very little cost.

For me, Emacs memory usage typically settles around 1Gb, which is why I
chose 100Mb as an upper limit (1Gb/10).

If we want to be even more safe, as I proposed, we can only increase the
gc-cons-threshold when loading init.el and possibly for certain, most
GC-heavy commands. This will minimize the impact of memory
fragmentation - only a small set of commands (having more regular memory
allocation pattern) will cause the fragmentation.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-12 13:27                       ` Ihor Radchenko
@ 2023-03-12 14:10                         ` Eli Zaretskii
  2023-03-12 14:50                           ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-12 14:10 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sun, 12 Mar 2023 13:27:53 +0000
> 
> > So there should likely be also a hard upper limit to ensure that the
> > pauses are unnoticeable.
> 
> Well. I do realize that there should be a limit, which is why I put it
> as 100Mb.

Yes, but what is that 100 MiB number based on? any measurements of the
time it takes to run GC behind that number? or just a more-or-less
arbitrary value that "seems right"?

> Strictly speaking, GC pauses scale with heap size.

If by "heap size" you mean the total size of heap-allocated memory in
the Emacs process, then this is inaccurate.  GC traverses only the
Lisp objects, whereas Emacs also allocates memory from the heap for
other purposes.  It also allocates memory from the VM outside of the
"normal" heap -- that's where the buffer text memory usually comes
from, as well as any large enough chunk of memory Emacs needs.

> Increasing GC threshold
> will have two effects on the heap size: (1) thresholds larger than normal
> heap size will dominate the GC time - Emacs will need to traverse all
> the newly added data to be GCed;

You seem to assume that GC traverses only the Lisp objects
newly-allocated since the previous GC.  This is incorrect: it
traverses _all_ of the Lisp objects, both old and new.

> (2) too large thresholds will cause heap fragmentation, also
> increasing the GC times as the heap will expand.

Not sure why you think heap fragmentation increases monotonically
with larger thresholds.  Maybe you should explain what you consider
"heap fragmentation" for the purposes of this discussion.

> I think that (2) is the most important factor for real world scenarios

Not sure why you think so.  Maybe because I don't have a clear idea
what kind of fragmentation you have in mind here.

> Emacs' default gives some lower safe bound on the threshold - it is
> `gc-cons-percentage', defaulting to 1% of the heap size.

Actually, the default value of gc-cons-percentage is 0.1, i.e. 10%.
And it's 10% of the sum total of all live Lisp objects plus the number
of bytes allocated for Lisp objects since the last GC.  Not 10% of the
heap size.
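
Put differently, the two variables act together: as the Emacs Lisp
manual describes it, both criteria have to be satisfied, so a
collection becomes due only after consing exceeds the larger of
`gc-cons-threshold' and `gc-cons-percentage' times the Lisp data size.
A small sketch of that arithmetic follows.  The helper name and the
example sizes are made up, and the function only paraphrases the rule;
it does not call into the real allocator.

```elisp
;; Sketch only: `my/effective-gc-threshold' is a hypothetical helper.
(defun my/effective-gc-threshold (lisp-bytes &optional threshold percentage)
  "Return the number of consed bytes after which GC becomes due.
LISP-BYTES is the size of the Lisp data the percentage applies to.
THRESHOLD and PERCENTAGE default to the current values of
`gc-cons-threshold' and `gc-cons-percentage'."
  (max (or threshold gc-cons-threshold)
       (floor (* (or percentage gc-cons-percentage) lisp-bytes))))

;; With the default values discussed here (800000 bytes and 0.1):
;; small session, the fixed threshold is the larger term.
(my/effective-gc-threshold (* 1000 1000) 800000 0.1)
;; about 1 GB of Lisp data, the percentage is the larger term.
(my/effective-gc-threshold (* 1000 1000 1000) 800000 0.1)
```

This is why a fixed threshold of 800000 bytes stops mattering once a
session holds more than about 8 MB of Lisp data.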

> However, being safe is unfortunately not enough - apparently 1% of the heap
> size is too low in real-world scenarios, where 1% is routinely and
> frequently allocated, triggering GC rescans too often.

How large is what you call "heap size" in your production session, may
I ask?

> AFAIU, routine throw-away memory allocation in Emacs is not directly
> correlated with the memory usage - it rather depends on the usage
> patterns and the packages being used. For example, it takes about 10
> complex helm searches for me to trigger my 250Mb threshold - 25Mb per
> helm command.

This calculation is only valid if each of these 10 commands conses
approximately the same amount of Lisp data.  If that is not so, you
cannot really divide 250 MiB by 10 and claim that each command used up
that much Lisp memory.  That's because GC is _not_ triggered as soon
as Emacs crosses the threshold, it is triggered when Emacs _checks_
how much was consed since last GC and discovers it consed more than
the threshold.  The trigger for testing is unrelated to crossing the
threshold.

> To get some idea about the impact of gc-cons-threshold on memory
> fragmentation, I compared the output of `memory-limit' with 250Mb vs.
> default 800kb threshold:
> 
>  250Mb threshold - 689520 kb memory
>  800kb threshold - 531548 kb memory
> 
> The memory usage is clearly increased, but not catastrophically, despite
> using rather large threshold.
> 
> Of course, it is just init.el, which is loaded once.

Correction: it is _your_ init.el.  We need similar statistics from
many users and many different usage patterns; only then we will be
able to draw valid conclusions.

> Memory fragmentation as a result of routine Emacs usage may cause
> more significant memory usage increase.

Actually, Emacs tries very hard to avoid fragmentation.  That's why it
compacts buffers, and that's why it can relocate buffer text and
string data.

> As long as gc-cons-threshold is significantly (~10x) lower than normal
> Emacs heap size, I expect the GC frequency to scale down the same 100x
> at very little cost.

What is the "normal heap size" in your production sessions?  And how
did you measure it?

> For me, Emacs memory usage typically settles around 1Gb, which is why I
> chose 100Mb as an upper limit (1Gb/10).

Once again, the threshold value is not necessarily directly derived
from the total memory footprint of the Emacs process.




* Re: Indentation and gc
  2023-03-11 14:09                         ` Eli Zaretskii
@ 2023-03-12 14:20                           ` Ihor Radchenko
  2023-03-12 14:40                             ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-12 14:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1365 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

> I'm talking about basis for the 0.7% figure.

I used 0.7%*RAM because total RAM is the only reasonable metric. What
else can we use to avoid memory over-consumption on low-end machines?

Of course, I implicitly assumed that memory usage will scale with
gc-cons-threshold linearly. IMHO, it is a safe assumption - the real
memory usage increase is slower than linear. For example, see my Emacs
loading data for different threshold values:

| gc-cons-threshold | memory-limit (kb) | gcs-done | gc-elapsed (s) | gc time (s/GC) |
|-------------------+-------------------+----------+----------------+----------------|
| 1Mb               |            523704 |      394 |   25.809423617 | 0.065506151    |
| 2Mb               |             +9624 |      210 |    13.41456755 | 0.063878893    |
| 4Mb               |             +1224 |      109 |    6.400488833 | 0.058720081    |
| 8Mb               |             +3164 |       63 |    3.223383144 | 0.051164812    |
| 16Mb              |             +5532 |       37 |    1.757097776 | 0.047489129    |
| 32Mb              |            +20264 |       25 |    0.995694149 | 0.039827766    |
| 64Mb              |            +59860 |       19 |    0.624039941 | 0.032844207    |
| 128Mb             |           +115356 |       16 |     0.42626893 | 0.026641808    |
| 256Mb             |           +171176 |       14 |    0.277912281 | 0.019850877    |
| 512Mb             |           +332148 |       12 |    0.122461442 | 0.010205120    |

Also, see the attached graph.


[-- Attachment #2: benchmark-gc.png --]
[-- Type: image/png, Size: 33637 bytes --]

[-- Attachment #3: Type: text/plain, Size: 751 bytes --]


The 0.7% is to ensure a safe 800kb lower bound on low-end computers.
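
The scaling under discussion amounts to taking a fraction of RAM and
clamping it.  What follows is a hedged reconstruction from the
numbers mentioned in this thread (0.7% of RAM, an 800kb floor, a
100Mb cap) and not the original calculation, which is in another
message; the function name is hypothetical.  `memory-info' is a
built-in that reports sizes in kilobytes, or nil on systems where
the information is not available.

```elisp
;; Sketch only: `my/gc-threshold-from-ram' is a hypothetical helper.
(defun my/gc-threshold-from-ram (total-ram-kb)
  "Return 0.7% of TOTAL-RAM-KB as bytes, clamped to 800000..100 MiB."
  (let ((scaled (floor (* 0.007 total-ram-kb 1024))))
    (min (* 100 1024 1024)
         (max 800000 scaled))))

;; Only computes a candidate value; it does not change any setting.
(let ((info (memory-info)))
  (when info
    (my/gc-threshold-from-ram (car info))))
```

With these constants the cap is already reached at roughly 14 GiB of
RAM, so on larger machines the result no longer scales at all.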

> Anyway, how about if you try running with the threshold you think we
> should adopt, and report back after a month or so, say?

I have been using a 250Mb threshold for the last 3 years or so.
GCs are sometimes noticeable, but not annoying:

- gc-elapsed 297 sec / gcs-done 290 -> ~1 sec per GC
- Emacs uptime 2 days 5 hours 21 minutes -> 1 GC per 10 minutes
- memory-limit 6,518,516, stable
  37x from Emacs -Q memory-limit
  10x from Emacs loading with my init.el

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


* Re: Indentation and gc
  2023-03-12 14:20                           ` Ihor Radchenko
@ 2023-03-12 14:40                             ` Eli Zaretskii
  2023-03-12 15:04                               ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-12 14:40 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sun, 12 Mar 2023 14:20:33 +0000
> 
> > I'm talking about basis for the 0.7% figure.
> 
> I used 0.7%*RAM because total RAM is the only reasonable metric. What
> else can we use to avoid memory over-consumption on low-end machines?

It could be the amount of VM on low-memory machines, but something
else on high-memory machines.  No one said it has to be derived from
the total VM on both systems with 2 GiB and systems with 128 GiB.

> Of course, I implicitly assumed that memory usage will scale with
> gc-cons-threshold linearly. IMHO, it is a safe assumption - the real
> memory usage increase is slower than linear. For example, see my Emacs
> loading data for different threshold values:

We are talking about changing the threshold for the session itself,
not just for the initial load.  So the statistics of the initial load
is not what is needed.

> The 0.7% is to ensure safe 800kb lower bound on low-end computers.

I don't see why it would be the value we need to adhere to.  That it's
the current default doesn't make it sacred, and using it as basis for
relative figures such as 0.7% has no real basis.

> > Anyway, how about if you try running with the threshold you think we
> > should adopt, and report back after a month or so, say?
> 
> I have been using a 250Mb threshold for the last 3 years or so.
> GCs are sometimes noticeable, but not annoying:
> 
> - gc-elapsed 297 sec / gcs-done 290 -> ~1 sec per GC

IMO, 1 sec per GC is pretty annoying.  It's around 0.165 sec in my
production session, and it is still quite noticeable.  I'd be interested
to hear what others think.

> - Emacs uptime 2 days 5 hours 21 minutes -> 1 GC per 10 minutes

I'm running with gc-cons-threshold of 1.8 MiB, and get about 1 GC per
minute.  (Actually, it's about twice that, because Emacs stays up at
night, but does nothing.)

> - memory-limit 6,518,516, stable

??? That's 6 GiB.  Didn't you say your memory footprint stabilizes at
1 GiB?

Anyway, we need such statistics from many people and many different
values of the threshold, and then we will be in a position to decide
on better default values, and perhaps also on some more dynamic
adjustments to it.  We are not ready for that yet.
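
For anyone who wants to contribute numbers, a snippet along these
lines would gather the figures quoted in this exchange.  The function
name and the plist keys are made up for this sketch; `gcs-done',
`gc-elapsed', `memory-limit' and `before-init-time' are standard.
Time spent idle (for example overnight) inflates the interval between
collections, so it is worth mentioning that when reporting.

```elisp
;; Sketch only: `my/gc-report' and its plist keys are hypothetical.
(defun my/gc-report ()
  "Return a plist summarizing GC activity in this session."
  (let* ((uptime (float-time (time-since before-init-time)))
         (count (max gcs-done 1)))      ; avoid dividing by zero
    (list :gc-cons-threshold gc-cons-threshold
          :gc-cons-percentage gc-cons-percentage
          :memory-limit-kb (memory-limit)
          :gcs-done gcs-done
          :gc-elapsed gc-elapsed
          :seconds-per-gc (/ gc-elapsed count)
          :seconds-between-gcs (/ uptime count))))

(message "%S" (my/gc-report))
```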




* Re: Indentation and gc
  2023-03-12 14:10                         ` Eli Zaretskii
@ 2023-03-12 14:50                           ` Ihor Radchenko
  2023-03-12 15:13                             ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-12 14:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Well. I do realize that there should be a limit, which is why I put it
>> as 100Mb.
>
> Yes, but what is that 100 MiB number based on? any measurements of the
> time it takes to run GC behind that number? or just a more-or-less
> arbitrary value that "seems right"?

That's what I tried to explain below. At the end, I used Emacs Lisp
object usage divided by 10 and rounded down to hundreds. I did not try
to be precise - just accurate to orders of magnitude.

>> Strictly speaking, GC pauses scale with heap size.
>
> If by "heap size" you mean the total size of heap-allocated memory in
> the Emacs process, then this is inaccurate.  GC traverses only the
> Lisp objects, whereas Emacs also allocates memory from the heap for
> other purposes.  It also allocates memory from the VM outside of the
> "normal" heap -- that's where the buffer text memory usually comes
> from, as well as any large enough chunk of memory Emacs needs.

Thanks for the clarification.

>> Increasing GC threshold
>> will have two effects on the heap size: (1) thresholds larger than normal
>> heap size will dominate the GC time - Emacs will need to traverse all
>> the newly added data to be GCed;
>
> You seem to assume that GC traverses only the Lisp objects
> newly-allocated since the previous GC.  This is incorrect: it
> traverses _all_ of the Lisp objects, both old and new.

No, I am aware that GC traverses all the Lisp objects.
That's why I said that large threshold only increases GC time
significantly when the threshold is comparable to the heap size (part of
it containing Lisp objects). Otherwise, heap size mostly determines how
long it takes to complete a single GC.

>> (2) too large thresholds will cause heap fragmentation, also
>> increasing the GC times as the heap will expand.
>
> Not sure why you think heap fragmentation increases monotonically
> with larger thresholds.  Maybe you should explain what you consider
> "heap fragmentation" for the purposes of this discussion.

See my other reply with my measurements of memory-limit vs.
gc-cons-threshold. I assume that this scaling will not be drastically
different even for different users. We can ask others to repeat my
measurements though.

>> I think that (2) is the most important factor for real world scenarios
>
> Not sure why you think so.  Maybe because I don't have a clear idea
> what kind of fragmentation you have in mind here.

I meant that as long as gc-cons-threshold is much lower (10x or so) than
heap size (Lisp object part), we do not need to worry about (1). Only
(2) remains a concern.

>> Emacs' default gives some lower safe bound on the threshold - it is
>> `gc-cons-percentage', defaulting to 1% of the heap size.
>
> Actually, the default value of gc-cons-percentage is 0.1, i.e. 10%.
> And it's 10% of the sum total of all live Lisp objects plus the number
> of bytes allocated for Lisp objects since the last GC.  Not 10% of the
> heap size.

Interesting. I thought that it is in percents.
Then, I have to mention that I intentionally reduced gc-cons-percentage
in my testing, which I detailed in my other message.

With Emacs defaults (0.1 gc-cons-percentage), I get:

memory-limit gcs-done gc-elapsed
526852 103 4.684100536

An equivalent of gc-cons-threshold = between 4Mb and 8Mb

10% also means that 800k gc-cons-threshold does not matter much even
with emacs -Q -- it uses over 8Mb memory and thus gc-cons-percentage
should dominate the GC, AFAIU.
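
A rough sketch of how the two variables interact, as I understand the
documented behaviour: the next GC becomes due after consing the larger
of gc-cons-threshold and gc-cons-percentage times the live Lisp data.
The function name is made up for illustration, and the real
bookkeeping in alloc.c is more involved than this model.

```elisp
;; Simplified model (an assumption, not the actual C code): the amount
;; of consing allowed before the next GC is the larger of the absolute
;; threshold and a fraction of the live Lisp data.
(defun my/effective-gc-threshold (live-bytes &optional threshold percentage)
  "Return the approximate number of bytes consed before the next GC.
LIVE-BYTES is the size of live Lisp objects.  THRESHOLD and
PERCENTAGE default to the current values of `gc-cons-threshold'
and `gc-cons-percentage'."
  (let ((threshold (or threshold gc-cons-threshold))
        (percentage (or percentage gc-cons-percentage)))
    (if (floatp percentage)
        (max threshold (floor (* percentage live-bytes)))
      threshold)))

;; 1.18 GB of live objects with the defaults: the percentage wins.
(my/effective-gc-threshold 1180000000 800000 0.1)

;; 2 MB of live objects with the defaults: the absolute threshold wins.
(my/effective-gc-threshold 2000000 800000 0.1)
```

Under this model the 800k default only matters while the live Lisp
data stays below 8 MB.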

Note that my proposed 100Mb gc-cons-threshold limit will correspond to
1Gb live Lisp objects. For reference, this is what I have now (I got the
data using memory-usage package):

   Total in lisp objects: 1.33GB (live 1.18GB, dead  157MB)

Even if Emacs uses several hundred Mb of Lisp objects (a typical
scenario with third-party packages), my suggested gc-cons-threshold does
not look too risky, yet it reduces GC when loading init.el (when heap
size is still small).

> How large is what you call "heap size" in your production session, may
> I ask?

See the above.

>> AFAIU, routine throw-away memory allocation in Emacs is not directly
>> correlated with the memory usage - it rather depends on the usage
>> patterns and the packages being used. For example, it takes about 10
>> complex helm searches for me to trigger my 250Mb threshold - 25Mb per
>> helm command.
>
> This calculation is only valid if each of these 10 commands conses
> approximately the same amount of Lisp data.  If that is not so, you
> cannot really divide 250 MiB by 10 and claim that each command used up
> that much Lisp memory.  That's because GC is _not_ triggered as soon
> as Emacs crosses the threshold, it is triggered when Emacs _checks_
> how much was consed since last GC and discovers it consed more than
> the threshold.  The trigger for testing is unrelated to crossing the
> threshold.

Sure. I ran exactly same command repeatedly. Just to get an idea about
what is possible. Do not try to interpret my results as precise - they
are just there to provide some idea about the orders of magnitude for
the allocated memory.
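
For anyone who wants to repeat this kind of rough measurement, one
option is `benchmark-run', which reports the number of GCs and the
time spent in them next to the total run time.  The helper name below
is hypothetical; it only wraps the macro.

```elisp
(require 'benchmark)

(defun my/gc-cost-of (fn)
  "Call FN once and return a list (SECONDS GC-COUNT GC-SECONDS).
SECONDS is the total elapsed time, GC-COUNT the number of garbage
collections that ran, and GC-SECONDS the time spent in them."
  (benchmark-run 1 (funcall fn)))

;; Example: how much GC work does allocating a long list cause?
(my/gc-cost-of (lambda () (make-list 100000 nil)))
```

This counts collections rather than bytes, so it gives only an
indirect idea of how much a command conses.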

>> To get some idea about the impact of gc-cons-threshold on memory
>> fragmentation, I compared the output of `memory-limit' with 250Mb vs.
>> default 800kb threshold:
>> 
>>  250Mb threshold - 689520 kb memory
>>  800kb threshold - 531548 kb memory
>> 
>> The memory usage is clearly increased, but not catastrophically, despite
>> using rather large threshold.
>> 
>> Of course, it is just init.el, which is loaded once.
>
> Correction: it is _your_ init.el.  We need similar statistics from
> many users and many different usage patterns; only then we will be
> able to draw valid conclusions.

Sure. Should we formally try to call for such benchmarks?

>> Memory fragmentation as a result of routine Emacs usage may cause
>> more significant memory usage increase.
>
> Actually, Emacs tries very hard to avoid fragmentation.  That's why it
> compacts buffers, and that's why it can relocate buffer text and
> string data.

Indeed. But despite all of the best efforts, fragmentation increases if
we delay GCs, right?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-12 14:40                             ` Eli Zaretskii
@ 2023-03-12 15:04                               ` Ihor Radchenko
  2023-03-12 15:26                                 ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-12 15:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I used 0.7%*RAM because total RAM is the only reasonable metrics. What
>> else can we use to avoid memory over-consumption on low-end machines?
>
> It could be the amount of VM on low-memory machines, but something
> else on high-memory machines.  No one said it has to be derived from
> the total VM on both systems with 2 GiB and systems with 128 GiB.

For high-memory machines, the aim is using the limit (100Mb, in what I
proposed), or something close to it. I see no point trying to be
precise here - even what I propose is better when memory is not low, if
we aim for increased, yet safe gc-cons-threshold.
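
To make the proposal concrete, here is a minimal sketch of the rule as
discussed here: 0.7% of RAM, capped at 100Mb.  The explicit 800kb
floor and the function name are my assumptions for illustration, not
something Emacs implements.

```elisp
(defun my/proposed-gc-threshold (ram-bytes)
  "Return 0.7% of RAM-BYTES, clamped between 800 kB and 100 MB.
The clamping bounds are assumptions taken from this discussion."
  (min 100000000
       (max 800000
            (floor (* 0.007 ram-bytes)))))

;; A 2 GiB machine gets about 15 MB; a 128 GiB machine hits the cap.
(my/proposed-gc-threshold (* 2 1024 1024 1024))
(my/proposed-gc-threshold (* 128 1024 1024 1024))

;; `memory-info' reports total RAM in KiB as its first element, or
;; returns nil where the information is unavailable.
(let ((info (memory-info)))
  (when info
    (my/proposed-gc-threshold (* 1024 (car info)))))
```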

>> Of course, I used implicit assumption that memory usage will scale with
>> gc-cons-threshold linearly. IMHO, it is a safe assumption - the real
>> memory usage increase is slower than linear. For example, see my Emacs
>> loading data for different threshold values:
>
> We are talking about changing the threshold for the session itself,
> not just for the initial load.  So the statistics of the initial load
> is not what is needed.

Session statistics is much harder to gather.
It is more realistic to ask people about benchmarking their init.el if
we want to be serious about bumping gc-cons-threshold.

At least, we can then get a good limit for init.el.

>> The 0.7% is to ensure safe 800kb lower bound on low-end computers.
>
> I don't see why it would be the value we need to adhere to.  That it's
> the current default doesn't make it sacred, and using it as basis for
> relative figures such as 0.7% has no real basis.

We do not have to adhere to this value. But we know for sure that it is
100% safe. And we probably cannot easily get the data from low-end
machines - much fewer users are using those.

So, instead of arguing about lower limit as well, let's just use the
safe one. We can always bump it later, if we wish to bother.

>> > Anyway, how about if you try running with the threshold you think we
>> > should adopt, and report back after a month or so, say?
>> 
>> I am using 250Mb threshold for the last 3 years or so.
>> GCs are sometimes noticeable, but not annoying:
>> 
>> - gc-elapsed 297 sec / gcs-done 290 -> ~1 sec per GC
>
> IMO, 1 sec per GC is pretty annoying.  It's around 0.165 sec in my
> production session, and it is still quite noticeable.  I'd be interested
> to hear what others think.

1 sec has little to do with my gc-cons-threshold, I am afraid. It is the
combination of packages I use. I have also seen worse memory
consumption. In particular, when using spell-fu, which copies over local
dictionary in every single buffer.

That's why I am seeing reducing the frequency of GCs as more important
than trying to reduce GC time, which cannot be even halved easily.
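
The per-GC figure above is simply gc-elapsed divided by gcs-done.  A
small helper (hypothetical name) that computes it for the running
session or for given numbers:

```elisp
(defun my/average-gc-pause (&optional elapsed count)
  "Return the average seconds per GC.
ELAPSED and COUNT default to `gc-elapsed' and `gcs-done'."
  (let ((elapsed (or elapsed gc-elapsed))
        (count (or count gcs-done)))
    (if (> count 0)
        (/ (float elapsed) count)
      0.0)))

;; The numbers quoted above: 297 seconds over 290 collections,
;; i.e. a little over one second per GC.
(my/average-gc-pause 297 290)
```

An average hides outliers, so a few very long pauses can sit behind
an acceptable-looking mean.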

>> - memory-limit 6,518,516, stable
>
> ??? That's 6 GiB.  Didn't you say your memory footprint stabilizes at
> 1 GiB?

memory-limit is a natively compiled function defined in subr.el.

Signature
(memory-limit)

Documentation
Return an estimate of Emacs virtual memory usage, divided by 1024.

It is different from Lisp object storage size, which is about 1.3Gb.
And Emacs memory usage in system monitor is about 1.7Gb.
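
Since `memory-limit' reports units of 1024 bytes, the number above
converts as follows (the helper name is made up for illustration):

```elisp
(defun my/kib-to-gib (kib)
  "Convert KIB, a size in units of 1024 bytes, to GiB as a float."
  (/ kib (* 1024.0 1024.0)))

;; The memory-limit value quoted above comes to roughly 6.2 GiB.
(my/kib-to-gib 6518516)
```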

> Anyway, we need such statistics from many people and many different
> values of the threshold, and then we will be in a position to decide
> on better default values, and perhaps also on some more dynamic
> adjustments to it.  We are not ready for that yet.

Shall we ask about benchmarking init.el with different gc-cons-threshold
values as a start?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-12 14:50                           ` Ihor Radchenko
@ 2023-03-12 15:13                             ` Eli Zaretskii
  2023-03-12 17:15                               ` Gregor Zattler
  2023-03-13 15:01                               ` Ihor Radchenko
  0 siblings, 2 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-12 15:13 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sun, 12 Mar 2023 14:50:33 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I think that (2) is the most important factor for real world scenarios
> >
> > Not sure why you think so.  Maybe because I don't have a clear idea
> > what kind of fragmentation you have in mind here.
> 
> I meant that as long as gc-cons-threshold is much lower (10x or so) than
> heap size (Lisp object part), we do not need to worry about (1). Only
> (2) remains a concern.

Your gc-cons-threshold is 250 MiB.  If it's below 0.1 of the size of
Lisp data, does that mean you have 2.5 GiB or more of Lisp data in your
sessions?  That's a lot.  I have less than 300 MiB, for comparison,
and this is a session that runs for 25 days non-stop, and has a
520 MiB memory footprint.

> 10% also means that 800k gc-cons-threshold does not matter much even
> with emacs -Q -- it uses over 8Mb memory and thus gc-cons-percentage
> should dominate the GC, AFAIU.

How did you measure those 8Mb?  On my system, memory-report in
"emacs -Q" shows less than 2 MiB in object memory.

> Note that my proposed 100Mb gc-cons-threshold limit will correspond to
> 1Gb live Lisp objects. For reference, this is what I have now (I got the
> data using memory-usage package):
> 
>    Total in lisp objects: 1.33GB (live 1.18GB, dead  157MB)

We need to align and calibrate our measurement means: what does
memory-report tell about object memory in that session?  1.18 GiB
sounds a lot.

> > Correction: it is _your_ init.el.  We need similar statistics from
> > many users and many different usage patterns; only then we will be
> > able to draw valid conclusions.
> 
> Sure. Should we formally try to call for such benchmarks?

Yes!

> > Actually, Emacs tries very hard to avoid fragmentation.  That's why it
> > compacts buffers, and that's why it can relocate buffer text and
> > string data.
> 
> Indeed. But despite all of the best efforts, fragmentation increases if
> we delay GCs, right?

Not IME, no.  That's why the memory footprint of a typical
long-running session levels out.




* Re: Indentation and gc
  2023-03-12 15:04                               ` Ihor Radchenko
@ 2023-03-12 15:26                                 ` Eli Zaretskii
  2023-03-13 15:09                                   ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-12 15:26 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sun, 12 Mar 2023 15:04:30 +0000
> 
> > We are talking about changing the threshold for the session itself,
> > not just for the initial load.  So the statistics of the initial load
> > is not what is needed.
> 
> Session statistics is much harder to gather.
> It is more realistic to ask people about benchmarking their init.el if
> we want to be serious about bumping gc-cons-threshold.

We should ask them to report statistics after running a session for
enough time.  What exactly qualifies as "enough time" depends on the
usage patterns: there are people, like me, who run a single session
for weeks on end, and there are others who start a new session every
day or even more frequently.  What is interesting is the statistics
near the end of the session.

Statistics from loading init.el is much less interesting, mainly
because reducing GC impact during such short intervals is easy and
well-understood.  We could collect such statistics as well, but it is
not the main goal from where I stand.
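
As an illustration of the short-interval case, a minimal sketch that
raises the threshold only around one operation by let-binding it.  The
macro name and the 50 MB figure are arbitrary choices for the example,
not recommendations.

```elisp
(defmacro my/with-gc-threshold (threshold &rest body)
  "Evaluate BODY with `gc-cons-threshold' temporarily bound to THRESHOLD.
The previous value is restored when BODY exits, even on error."
  (declare (indent 1))
  `(let ((gc-cons-threshold ,threshold))
     ,@body))

;; Example use, e.g. around re-indenting a whole buffer:
;; (my/with-gc-threshold (* 50 1000 1000)
;;   (indent-region (point-min) (point-max)))
```

Because the binding is dynamic and scoped, the higher value cannot
leak into the rest of the session.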

> That's why I am seeing reducing the frequency of GCs as more important
> than trying to reduce GC time, which cannot be even halved easily.

A single very long GC will be annoying even if it happens rarely.  So
I don't agree the frequency of GCs is so much more important than the
time it takes to perform a GC.

> >> - memory-limit 6,518,516, stable
> >
> > ??? That's 6 GiB.  Didn't you say your memory footprint stabilizes at
> > 1 GiB?
> 
> memory-limit is a natively compiled function defined in subr.el.
> 
> Signature
> (memory-limit)
> 
> Documentation
> Return an estimate of Emacs virtual memory usage, divided by 1024.
> 
> It is different from Lisp object storage size, which is about 1.3Gb.
> And Emacs memory usage in system monitor is about 1.7Gb.

If the Emacs memory usage in system monitor is 1.7 GiB, how come
memory-limit says it's 6.5 GiB?

> Shall we ask about benchmarking init.el with different gc-cons-threshold
> values as a start?

See above: I'm more interested in statistics of the session than in
statistics of the initial load.




* Re: Indentation and gc
  2023-03-12 15:13                             ` Eli Zaretskii
@ 2023-03-12 17:15                               ` Gregor Zattler
  2023-03-12 20:07                                 ` Eli Zaretskii
  2023-03-13 15:01                               ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Gregor Zattler @ 2023-03-12 17:15 UTC (permalink / raw)
  To: emacs-devel

Hi Eli, Ihor, emacs developers,
* Eli Zaretskii <eliz@gnu.org> [2023-03-12; 17:13 +02]:
>> From: Ihor Radchenko <yantar92@posteo.net>
>> Date: Sun, 12 Mar 2023 14:50:33 +0000
>> Sure. Should we formally try to call for such benchmarks?
>
> Yes!

at this point you have two different ideas of when to
measure (at the end of init.el or at the end of a usage
session); both could be asked for.

And it's not clear, to me at least, which numbers
should be collected.

It would be great if there were an agreement on
what and when to measure and where to send the
measurements.

A function which as a side effect outputs the numbers
of interest would be helpful.

I think it would be useful if they included
emacs-uptime at the moment of measurement and (if
possible) the total idle time.
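
Something along these lines might do as a starting point.  It is only
a sketch with a made-up name; which numbers to include is exactly what
would need agreeing on, and total idle time is left out because I do
not know a built-in that reports it.

```elisp
(defun my/gc-stats-snapshot ()
  "Return an alist of GC-related numbers for the current session."
  (list (cons 'gc-cons-threshold gc-cons-threshold)
        (cons 'gc-cons-percentage gc-cons-percentage)
        (cons 'gcs-done gcs-done)
        (cons 'gc-elapsed gc-elapsed)
        (cons 'memory-limit (memory-limit))
        (cons 'emacs-uptime (emacs-uptime))
        (cons 'emacs-version emacs-version)))

;; Print the snapshot in a form that can be pasted into a mail.
(message "%S" (my/gc-stats-snapshot))
```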

Ciao; Gregor




* Re: Indentation and gc
  2023-03-12 17:15                               ` Gregor Zattler
@ 2023-03-12 20:07                                 ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-12 20:07 UTC (permalink / raw)
  To: Gregor Zattler; +Cc: emacs-devel

> From: Gregor Zattler <telegraph@gmx.net>
> Date: Sun, 12 Mar 2023 18:15:06 +0100
> 
> Hi Eli, Ihor, emacs developers,
> * Eli Zaretskii <eliz@gnu.org> [2023-03-12; 17:13 +02]:
> >> From: Ihor Radchenko <yantar92@posteo.net>
> >> Date: Sun, 12 Mar 2023 14:50:33 +0000
> >> Sure. Should we formally try to call for such benchmarks?
> >
> > Yes!
> 
> at this point you have two different ideas of when to
> measure (at the end of init.el or at the end of a usage
> session); both could be asked for.
> 
> And it's not clear, to me at least, which numbers
> should be collected.

That's okay, since we didn't ask yet.  When we do, the questions will
be detailed and clear.

> It would be great if there were an agreement on
> what and when to measure and where to send the
> measurements.
> 
> A function which as a side effect outputs the numbers
> of interest would be helpful.
> 
> I think it would be useful if they included
> emacs-uptime at the moment of measurement and (if
> possible) the total idle time.

Your points are well taken!




* Re: Indentation and gc
  2023-03-11 15:32                                   ` Eli Zaretskii
  2023-03-11 15:52                                     ` Lynn Winebarger
  2023-03-11 17:10                                     ` Gregor Zattler
@ 2023-03-13 12:45                                     ` Ihor Radchenko
  2023-03-13 12:51                                       ` Eli Zaretskii
  2 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 12:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Well. Currently, the paragraph reads like: we have 800kb default, but do
>> not increase it too much because Emacs will require more memory.
>
> No, it says more.  In particular:
>
>                   If you specify a larger value, garbage collection will
>      happen less often.  This reduces the amount of time spent garbage
>      collecting, but increases total memory use.  You may want to do
>      this when running a program that creates lots of Lisp data.
>      However, we recommend against increasing the threshold for
>      prolonged periods of time, and advise that you never set it higher
>      than needed for the program to run in reasonable time.  Using
>      thresholds higher than necessary could potentially cause
>      system-wide memory pressure, and should therefore be avoided.

Unfortunately, it is not very clear how much increasing the threshold
affects memory usage. What if I increase the threshold twice? Is it
safe? Dangerous? Maybe 10%? 10x?

I guess we can give an answer if we collect usage statistics.

>> Given that modern computer typically have 4-16Gb RAM, the warning does
>> not look like an actual warning. 800kb is nothing. Surely, increasing it
>> to 80Mb to even few hundreds Mb is safe, right? Or not?
>
> Again, you are reasoning about the value as if it were related to the
> maximum memory footprint Emacs could have.  But in fact, it is related
> only to the _increment_ of memory Emacs can have before it should stop
> and consider how much of that is garbage.

But how else should I interpret "memory pressure"? In practical terms,
it looks like increasing the threshold will make Emacs GC less - a good
thing if GCs are a problem. But then there is a warning about memory
pressure, but it does not look too scary if you have plenty of RAM,
especially looking at common advice to increase the threshold across
the internet.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-13 12:45                                     ` Ihor Radchenko
@ 2023-03-13 12:51                                       ` Eli Zaretskii
  2023-06-14 14:16                                         ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 12:51 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 12:45:34 +0000
> 
> >                   If you specify a larger value, garbage collection will
> >      happen less often.  This reduces the amount of time spent garbage
> >      collecting, but increases total memory use.  You may want to do
> >      this when running a program that creates lots of Lisp data.
> >      However, we recommend against increasing the threshold for
> >      prolonged periods of time, and advise that you never set it higher
> >      than needed for the program to run in reasonable time.  Using
> >      thresholds higher than necessary could potentially cause
> >      system-wide memory pressure, and should therefore be avoided.
> 
> Unfortunately, it is not very clear how much increasing the threshold
> affects memory usage. What if I increase the threshold twice? Is it
> safe? Dangerous? Maybe 10%? 10x?

How should I know?  Does anyone know?

> I guess we can give an answer if we collect usage statistics.

Exactly.

> > Again, you are reasoning about the value as if it were related to the
> > maximum memory footprint Emacs could have.  But in fact, it is related
> > only to the _increment_ of memory Emacs can have before it should stop
> > and consider how much of that is garbage.
> 
> But how else should I interpret "memory pressure"?

Increasing the threshold indeed increases the danger of memory
pressure, but how exactly is yet unknown.  So simplistic
interpretations like this are IMO premature.

> But then there is a warning about memory
> pressure, but it does not look too scary if you have plenty of RAM,
> especially looking at common advice to increase the threshold across
> the internet.

Using most-positive-fixnum as the threshold _should_ scare people.




* Re: Indentation and gc
  2023-03-12 15:13                             ` Eli Zaretskii
  2023-03-12 17:15                               ` Gregor Zattler
@ 2023-03-13 15:01                               ` Ihor Radchenko
  2023-03-13 15:33                                 ` Eli Zaretskii
  2023-03-13 15:41                                 ` Eli Zaretskii
  1 sibling, 2 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 15:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I meant that as long as gc-cons-threshold is much lower (10x or so) than
>> heap size (Lisp object part), we do not need to worry about (1). Only
>> (2) remains a concern.
>
> Your gc-cons-threshold is 250 MiB.  If it's below 0.1 of the size of
> Lisp data, does that mean you have 2.5 GiB or more of Lisp data in your
> sessions?  That's a lot.  I have less than 300 MiB, for comparison,
> and this is a session that runs for 25 days non-stop, and has a
> 520 MiB memory footprint.

I said nothing about my settings. Because my settings are tailored to my
usage.

In this discussion, I am trying to deduce something reasonable based on
logic, not just on what works for me personally.

>> 10% also means that 800k gc-cons-threshold does not matter much even
>> with emacs -Q -- it uses over 8Mb memory and thus gc-cons-percentage
>> should dominate the GC, AFAIU.
>
> How did you measure those 8Mb?  On my system, memory-report in
> "emacs -Q" shows less than 2 MiB in object memory.

emacs -Q M-x memory-report on my side gives

Estimated Emacs Memory Usage

   3.2 MiB  Overall Object Memory Usage
   2.2 MiB  Memory Used By Global Variables
   1.3 MiB  Memory Used By Symbol Plists
   334 KiB  Reserved (But Unused) Object Memory
    66 KiB  Total Image Cache Size
    21 KiB  Total Buffer Memory Usage

A sum of all the above is 7.121, which I rounded up.

>> Note that my proposed 100Mb gc-cons-threshold limit will correspond to
>> 1Gb live Lisp objects. For reference, this is what I have now (I got the
>> data using memory-usage package):
>> 
>>    Total in lisp objects: 1.33GB (live 1.18GB, dead  157MB)
>
> We need to align and calibrate our measurement means: what does
> memory-report tell about object memory in that session?  1.18 GiB
> sounds a lot.

Unfortunately, memory-report is not usable in my sessions. M-x
memory-report takes 10+ minutes and then fails with max nesting error
for me. That's why I use memory-usage third-party package, which
produces some info and does it in reasonable time.

>> > Actually, Emacs tries very hard to avoid fragmentation.  That's why it
>> > compacts buffers, and that's why it can relocate buffer text and
>> > string data.
>> 
>> Indeed. But despite all of the best efforts, fragmentation increases if
>> we delay GCs, right?
>
> Not IME, no.  That's why the memory footprint of a typical
> long-running session levels out.

Then what is the mechanism of gc-cons-threshold affecting the Emacs
memory footprint?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-12 15:26                                 ` Eli Zaretskii
@ 2023-03-13 15:09                                   ` Ihor Radchenko
  2023-03-13 15:37                                     ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 15:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2162 bytes --]

Eli Zaretskii <eliz@gnu.org> writes:

>> Session statistics is much harder to gather.
>> It is more realistic to ask people about benchmarking their init.el if
>> we want to be serious about bumping gc-cons-threshold.
>
> We should ask them to report statistics after running a session for
> enough time.  What exactly qualifies as "enough time" depends on the
> usage patterns: there are people, like me, who run a single session
> for weeks on end, and there are others who start a new session every
> day or even more frequently.  What is interesting is the statistics
> near the end of the session.
>
> Statistics from loading init.el is much less interesting, mainly
> because reducing GC impact during such short intervals is easy and
> well-understood.  We could collect such statistics as well, but it is
> not the main goal from where I stand.

I think we can ask interested users to install a package like
https://git.sr.ht/~yantar92/emacs-gc-stats/tree/main/item/emacs-gc-stats.el

Then, they can share the results after running Emacs with the package
for some time.

See the attached statistics data example.

WDYT?

>> That's why I am seeing reducing the frequency of GCs as more important
>> than trying to reduce GC time, which cannot be even halved easily.
>
> A single very long GC will be annoying even if it happens rarely.  So
> I don't agree the frequency of GCs is so much more important than the
> time it takes to perform a GC.

I do not really mean that it is not important.
But there is little Emacs can do about the memory consumption. It
mainly depends on the third-party packages being used.

In contrast, GC frequency is something that can be tweaked easily on
Emacs side by altering GC thresholds.

>> >> - memory-limit 6,518,516, stable
>> >
>> > ??? That's 6 GiB.  Didn't you say your memory footprint stabilizes at
>> > 1 GiB?
>> 
>> memory-limit is a natively compiled function defined in subr.el.
>
> If the Emacs memory usage in system monitor is 1.7 GiB, how come
> memory-limit says it's 6.5 GiB?

6.5Gib is virtual memory. 1.7Gib is actually used memory - the value
people usually mean when considering memory footprint.
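
For reading records like the attached ones, a small sketch that turns
consecutive (gcs-done . gc-elapsed) samples into the time taken by the
collections in each interval.  The function name is hypothetical; the
sample values are copied from the attachment.

```elisp
(defun my/gc-pause-deltas (samples)
  "Return per-interval GC costs from SAMPLES.
SAMPLES is a chronological list of (GCS-DONE . GC-ELAPSED) pairs.
The result is a list of (GCS . SECONDS) pairs, one for each
interval in which at least one GC ran."
  (let ((prev (car samples))
        result)
    (dolist (cur (cdr samples) (nreverse result))
      (let ((gcs (- (car cur) (car prev))))
        (when (> gcs 0)
          (push (cons gcs (- (cdr cur) (cdr prev))) result)))
      (setq prev cur))))

;; Samples from the attached data; each interval contains one GC.
;; The first three collections take well under a second each, while
;; the last two take more than two seconds each.
(my/gc-pause-deltas
 '((14 . 0.130179026) (15 . 0.277816909) (16 . 0.425213479)
   (17 . 0.643833105) (18 . 2.788064769) (19 . 5.784780114)))
```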


[-- Attachment #2: emacs-gc-stats.eld --]
[-- Type: application/octet-stream, Size: 2000 bytes --]

((("Initial stats" "Mon Mar 13 15:50:59 2023"
   (gc-cons-threshold . 800000)
   (gc-cons-percentage . 0.1)
   (memory-limit . 198924)
   (emacs-version . "30.0.50")
   (memory-info 15996152 7760168 20971516 16568504))
  ("Mon Mar 13 15:51:08 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   gcmh-mode
   (gc-elapsed . 0.130179026)
   (gcs-done . 14)
   (this-command)
   (memory-limit . 748360))
  ("Mon Mar 13 15:51:08 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   gcmh-mode
   (gc-elapsed . 0.130179026)
   (gcs-done . 14)
   (this-command)
   (memory-limit . 748360))
  ("Init.el stats" "Mon Mar 13 15:51:10 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   (gc-elapsed . 0.277816909)
   (gcs-done . 15)
   (memory-limit . 757600)
   (emacs-uptime . "14 seconds"))
  ("Mon Mar 13 15:51:33 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   gcmh-mode
   (gc-elapsed . 0.277816909)
   (gcs-done . 15)
   (this-command . org-agenda)
   (memory-limit . 963332))
  ("Mon Mar 13 15:51:37 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   gcmh-mode
   (gc-elapsed . 0.425213479)
   (gcs-done . 16)
   (this-command . org-agenda)
   (memory-limit . 1209092))
  ("Mon Mar 13 15:54:07 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   gcmh-mode
   (gc-elapsed . 0.643833105)
   (gcs-done . 17)
   (this-command . org-save-all-org-buffers)
   (memory-limit . 1572008))
  ("Mon Mar 13 16:00:09 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   gcmh-mode
   (gc-elapsed . 2.788064769)
   (gcs-done . 18)
   (this-command . self-insert-command)
   (memory-limit . 1483484))
  ("Session end stats" "Mon Mar 13 16:02:38 2023"
   (gc-cons-threshold . 250000000)
   (gc-cons-percentage . 0.0001)
   (gc-elapsed . 5.784780114)
   (gcs-done . 19)
   (memory-limit . 1615352)
   (emacs-uptime . "11 minutes, 42 seconds"))))

[-- Attachment #3: Type: text/plain, Size: 224 bytes --]


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


* Re: Indentation and gc
  2023-03-13 15:01                               ` Ihor Radchenko
@ 2023-03-13 15:33                                 ` Eli Zaretskii
  2023-03-13 15:39                                   ` Ihor Radchenko
  2023-03-13 15:41                                 ` Eli Zaretskii
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 15:33 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 15:01:47 +0000
> 
> >> Indeed. But despite all of the best efforts, fragmentation increases if
> >> we delay GCs, right?
> >
> > Not IME, no.  That's why the memory footprint of a typical
> > long-running session levels out.
> 
> Then what is the mechanism of gc-cons-threshold affecting the Emacs
> memory footprint?

Because a higher threshold increases the probability that some free
memory couldn't be released to the OS.




* Re: Indentation and gc
  2023-03-13 15:09                                   ` Ihor Radchenko
@ 2023-03-13 15:37                                     ` Eli Zaretskii
  2023-03-13 15:45                                       ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 15:37 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 15:09:50 +0000
> 
> I think we can ask interested users to install a package like
> https://git.sr.ht/~yantar92/emacs-gc-stats/tree/main/item/emacs-gc-stats.el
> 
> Then, they can share the results after running Emacs with the package
> for some time.
> 
> See the attached statistics data example.
> 
> WDYT?

Looks useful, thanks.

> >> >> - memory-limit 6,518,516, stable
> >> >
> >> > ??? That's 6 GiB.  Didn't you say your memory footprint stabilizes at
> >> > 1 GiB?
> >> 
> >> memory-limit is a natively compiled function defined in subr.el.
> >
> > If the Emacs memory usage in system monitor is 1.7 GiB, how come
> > memory-limit says it's 6.5 GiB?
> 
> 6.5Gib is virtual memory. 1.7Gib is actually used memory - the value
> people usually mean when considering memory footprint.

You mean, the process has a 6.5 GiB footprint, out of which only 1.7
GiB are being used, and the rest is free?  That'd mean an awfully
inefficient libc implementation of malloc.



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:39                                   ` Ihor Radchenko
@ 2023-03-13 15:39                                     ` Eli Zaretskii
  2023-03-13 16:04                                       ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 15:39 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 15:39:44 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> >> Indeed. But despite all of the best efforts, fragmentation increases if
> >> >> we delay GCs, right?
> >> >
> >> > Not IME, no.  That's why the memory footprint of a typical
> >> > long-running session levels out.
> >> 
> >> Then what is the mechanism of gc-cons-threshold affecting the Emacs
> >> memory footprint?
> >
> > Because higher threshold increases the probability that some free
> > memory couldn't be released to the OS.
> 
> So, fragmentation? Or do we mis-communicate?
> For me, memory fragmentation is when memory cannot be released to OS
> and/or cannot be re-used to store new objects.

Fragmentation is the latter, not the former.



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:33                                 ` Eli Zaretskii
@ 2023-03-13 15:39                                   ` Ihor Radchenko
  2023-03-13 15:39                                     ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 15:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> >> Indeed. But despite all of the best efforts, fragmentation increases if
>> >> we delay GCs, right?
>> >
>> > Not IME, no.  That's why the memory footprint of a typical
>> > long-running session levels out.
>> 
>> Then what is the mechanism of gc-cons-threshold affecting the Emacs
>> memory footprint?
>
> Because higher threshold increases the probability that some free
> memory couldn't be released to the OS.

So, fragmentation? Or do we mis-communicate?
For me, memory fragmentation is when memory cannot be released to OS
and/or cannot be re-used to store new objects.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:01                               ` Ihor Radchenko
  2023-03-13 15:33                                 ` Eli Zaretskii
@ 2023-03-13 15:41                                 ` Eli Zaretskii
  2023-03-14 13:01                                   ` Ihor Radchenko
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 15:41 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 15:01:47 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > We need to align and calibrate our measurement means: what does
> > memory-report tell about object memory in that session?  1.18 GiB
> > sounds a lot.
> 
> Unfortunately, memory-report is not usable in my sessions. M-x
> memory-report takes 10+ minutes and then fails with max nesting error
> for me.

That is worth a bug report, I think.  That it takes a long time could
be just a (mis)feature, but that it errors out is a bug that needs to
be solved.



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:37                                     ` Eli Zaretskii
@ 2023-03-13 15:45                                       ` Ihor Radchenko
  2023-03-13 16:58                                         ` Eli Zaretskii
  2023-03-13 18:14                                         ` Gregor Zattler
  0 siblings, 2 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 15:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> See the attached statistics data example.
>> 
>> WDYT?
>
> Looks useful, thanks.

Feel free to share the ideas on what else we could record there.
Once we decide what kind of data we want to collect, I can announce the
package, asking people to report the statistics.

>> > If the Emacs memory usage in system monitor is 1.7 GiB, how come
>> > memory-limit says it's 6.5 GiB?
>> 
>> 6.5 GiB is virtual memory. 1.7 GiB is actually used memory - the value
>> people usually mean when considering memory footprint.
>
> You mean, the process has a 6.5 GiB footprint, out of which only 1.7
> GiB are being used, and the rest is free?  That'd mean awfully
> inefficient libc implementation of malloc.

I mean the following output of the "top" command:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
31838 yantar92  20   0 1843744 800648 121684 S   0.3   5.0   1:12.48 emacs                   

VIRT is virtual memory and RES (or %MEM) is actually used.

Except that this one is at 20 minutes of uptime, whereas the 6.5 GiB
figure is from after 2 days, having stabilized earlier.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:39                                     ` Eli Zaretskii
@ 2023-03-13 16:04                                       ` Ihor Radchenko
  2023-03-13 16:52                                         ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 16:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> So, fragmentation? Or do we mis-communicate?
>> For me, memory fragmentation is when memory cannot be released to OS
>> and/or cannot be re-used to store new objects.
>
> Fragmentation is the latter, not the former.

Noted. Can we measure fragmentation from Elisp? Does memory-limit
include the fragmented memory segments?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 16:04                                       ` Ihor Radchenko
@ 2023-03-13 16:52                                         ` Eli Zaretskii
  2023-03-14 12:47                                           ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 16:52 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 16:04:49 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> So, fragmentation? Or do we mis-communicate?
> >> For me, memory fragmentation is when memory cannot be released to OS
> >> and/or cannot be re-used to store new objects.
> >
> > Fragmentation is the latter, not the former.
> 
> Noted. Can we measure fragmentation from Elisp?

I think only on glibc platforms, where we have malloc-info.  The
output will need to be post-processed by some code, based on expert
knowledge of what the numbers mean.

> Does memory-limit include the fragmented memory segments?

I think it does, although it's system-dependent.  It's basically what
'top' shows as VSIZE.



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:45                                       ` Ihor Radchenko
@ 2023-03-13 16:58                                         ` Eli Zaretskii
  2023-03-13 18:04                                           ` Ihor Radchenko
  2023-03-13 18:14                                         ` Gregor Zattler
  1 sibling, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-13 16:58 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 15:45:36 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> See the attached statistics data example.
> >> 
> >> WDYT?
> >
> > Looks useful, thanks.
> 
> Feel free to share the ideas on what else we could record there.

I think we should start with that and add stuff as we go if needed.

> >> 6.5 GiB is virtual memory. 1.7 GiB is actually used memory - the value
> >> people usually mean when considering memory footprint.
> >
> > You mean, the process has a 6.5 GiB footprint, out of which only 1.7
> > GiB are being used, and the rest is free?  That'd mean awfully
> > inefficient libc implementation of malloc.
> 
> I mean the following output for "top" bash command
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> 31838 yantar92  20   0 1843744 800648 121684 S   0.3   5.0   1:12.48 emacs                   
> 
> VIRT is virtual memory and RES (or %MEM) is actually used.

Ah, you mean RES.  That's the "resident" part of the memory, i.e. what
the OS decided to keep in physical memory at this point; the rest is
swapped out.  Basically, RES is not interesting, only the total
virtual memory of the process (VIRT) is, because that's what is
counted towards the total VM of the system.  Although the complication
is that VIRT also includes the so-called "reserved" memory, which is
not necessarily in-use yet.
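
To make the VIRT/RES comparison concrete, here is a minimal Python
sketch (an illustration only, not part of Emacs) that parses one
process row of `top` output like the one quoted above and converts the
memory columns to MiB.  It assumes the default procps column order
shown above and that unsuffixed values are in KiB; `to_kib` and
`parse_top_row` are hypothetical helper names.

```python
# Hypothetical helpers for illustration; assumes the default procps columns:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
SUFFIX_TO_KIB = {"k": 1, "m": 1024, "g": 1024**2, "t": 1024**3}


def to_kib(field):
    """Convert a top memory field ("1843744", "1.1g", "28.2g") to KiB."""
    suffix = field[-1].lower()
    if suffix in SUFFIX_TO_KIB:
        return float(field[:-1]) * SUFFIX_TO_KIB[suffix]
    return float(field)  # assumption: unsuffixed values are KiB


def parse_top_row(row):
    """Return VIRT, RES and their difference, in KiB, for one process row."""
    fields = row.split()
    virt, res = to_kib(fields[4]), to_kib(fields[5])
    return {
        "command": fields[11],
        "virt_kib": virt,
        "res_kib": res,
        "not_resident_kib": virt - res,
    }


if __name__ == "__main__":
    row = ("31838 yantar92  20   0 1843744 800648 121684 S"
           "   0.3   5.0   1:12.48 emacs")
    info = parse_top_row(row)
    for key in ("virt_kib", "res_kib", "not_resident_kib"):
        print(f"{key}: {info[key] / 1024:.0f} MiB")
```

The difference between the two columns is only "mapped but not
currently resident"; by itself it does not say whether those pages are
swapped out or were merely reserved and never touched.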



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 16:58                                         ` Eli Zaretskii
@ 2023-03-13 18:04                                           ` Ihor Radchenko
  2023-03-14 12:19                                             ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-13 18:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Feel free to share the ideas on what else we could record there.
>
> I think we should start with that and add stuff as we go if needed.

I am not sure.
The package will require users to share the data manually.
Asking for such activity more than once will generate fewer replies than
a single ask.

So, I'd prefer to carefully discuss first what exactly we want to know
to decide about changing the thresholds.

>> I mean the following output for "top" bash command
>> 
>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>> 31838 yantar92  20   0 1843744 800648 121684 S   0.3   5.0   1:12.48 emacs                   
>> 
>> VIRT is virtual memory and RES (or %MEM) is actually used.
>
> Ah, you mean RES.  That's the "resident" part of the memory, i.e. what
> the OS decided to keep in physical memory at this point; the rest is
> swapped out.  Basically, RES is not interesting, only the total
> virtual memory of the process (VIRT) is, because that's what is
> counted towards the total VM of the system.  Although the complication
> is that VIRT also includes the so-called "reserved" memory, which is
> not necessarily in-use yet.

This is a bit confusing then. From my experience, RES is often closer to
the memory-report results.

Moreover, VIRT can exceed Memory + Swap combined.

For example, emacs -Q gives

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
28887 yantar92  20   0  187964  76440  46816 S   0.0   0.5   0:00.90 emacs
 2234 yantar92  20   0   28.2g 686492 138972 S   0.0   4.3  29:45.66 QtWebEngineProc

(also, note VIRT for QtWebEngineProc, while I only have Mem 2.7GiB used
+ Swap 4.2GiB used).

Estimated Emacs Memory Usage

   3.2 MiB  Overall Object Memory Usage
   2.2 MiB  Memory Used By Global Variables
   1.3 MiB  Memory Used By Symbol Plists
   370 KiB  Reserved (But Unused) Object Memory
    66 KiB  Total Image Cache Size
    21 KiB  Total Buffer Memory Usage

Object Storage

   1.9 MiB  Vectors
   598 KiB  Conses
   492 KiB  Strings
   189 KiB  Symbols
   6.7 KiB  Buffer-Objects
   2.8 KiB  Intervals
     160 B  Floats

Largest Buffers

    11 KiB  *scratch*
   3.2 KiB  *Messages*
   2.4 KiB   *Echo Area 1*
   1.7 KiB   *Minibuf-1*
   1.3 KiB  *Memory Report*
   1.2 KiB   *Minibuf-0*
     170 B   *Echo Area 0*

Largest Variables

   273 KiB  load-history
   236 KiB  obarray
   156 KiB  definition-prefixes
    93 KiB  global-map
    80 KiB  coding-system-alist
    71 KiB  input-method-alist
    64 KiB  color-name-rgb-alist
    59 KiB  language-info-alist
    47 KiB  face--new-frame-defaults
    46 KiB  easy-menu-converted-items-table
    42 KiB  key-translation-map
    42 KiB  x-colors
    40 KiB  comp-known-type-specifiers
    38 KiB  menu-bar-options-menu
    34 KiB  comp-known-func-cstr-h
    31 KiB  comp-loaded-comp-units-h
    25 KiB  comp-eln-to-el-h
    22 KiB  iso-transl-char-map
    21 KiB  comp-subr-list
    20 KiB  auto-mode-alist



-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:45                                       ` Ihor Radchenko
  2023-03-13 16:58                                         ` Eli Zaretskii
@ 2023-03-13 18:14                                         ` Gregor Zattler
  2023-03-14 12:30                                           ` Eli Zaretskii
  1 sibling, 1 reply; 99+ messages in thread
From: Gregor Zattler @ 2023-03-13 18:14 UTC (permalink / raw)
  To: emacs-devel

Hi Ihor,
* Ihor Radchenko <yantar92@posteo.net> [2023-03-13; 15:45 GMT]:
>>> See the attached statistics data example.

> Feel free to share the ideas on what else we could record there.

Since you provide emacs-uptime at session end, I think
it would be useful to know total Emacs idle time also.
I have no idea if that's feasible, though.

Ciao; Gregor



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 18:04                                           ` Ihor Radchenko
@ 2023-03-14 12:19                                             ` Eli Zaretskii
  2023-03-15 10:28                                               ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-14 12:19 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Mon, 13 Mar 2023 18:04:43 +0000
> 
> > Ah, you mean RES.  That's the "resident" part of the memory, i.e. what
> > the OS decided to keep in physical memory at this point; the rest is
> > swapped out.  Basically, RES is not interesting, only the total
> > virtual memory of the process (VIRT) is, because that's what is
> > counted towards the total VM of the system.  Although the complication
> > is that VIRT also includes the so-called "reserved" memory, which is
> > not necessarily in-use yet.
> 
> This is a bit confusing then.

It definitely is.

> From my experience, RES is often closer to the memory-report
> results.

Not surprisingly, since Emacs touches Lisp objects very frequently
(every GC), so they are likely to be in the resident set.

> Moreover, VIRT can exceed Memory + Swap combined.

Yes, because VIRT includes the "reserved" memory, which is memory not
yet in-use, but which the application "reserved" for itself and
generally intends to use at some point, and so it cannot be used by
other processes.  See

  https://stackoverflow.com/questions/2440434/whats-the-difference-between-reserved-and-committed-memory
  https://www.baeldung.com/linux/resident-set-vs-virtual-memory-size
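
The effect described above can be observed directly.  The following
Python sketch is a rough illustration under stated assumptions: it is
Linux-only, it reads `VmSize` and `VmRSS` from `/proc/self/status`,
and it treats an anonymous mapping that is never touched as the
closest analogue of "reserved" memory.  The helper names `vm_kib` and
`reserve_without_touching` are made up for this example.

```python
import mmap

MIB = 1024 * 1024


def vm_kib():
    """Return (VmSize, VmRSS) of this process in KiB (Linux /proc only)."""
    values = {}
    with open("/proc/self/status") as status:
        for line in status:
            key, _, rest = line.partition(":")
            if key in ("VmSize", "VmRSS"):
                values[key] = int(rest.split()[0])  # the kernel reports kB
    return values["VmSize"], values["VmRSS"]


def reserve_without_touching(size_bytes):
    """Map anonymous memory, never touch it, and return the VIRT/RES deltas."""
    virt_before, res_before = vm_kib()
    region = mmap.mmap(-1, size_bytes,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    virt_after, res_after = vm_kib()
    region.close()
    return virt_after - virt_before, res_after - res_before


if __name__ == "__main__":
    virt_delta, res_delta = reserve_without_touching(64 * MIB)
    print(f"VIRT grew by {virt_delta / 1024:.0f} MiB, "
          f"RES by {res_delta / 1024:.0f} MiB")
```

Because the mapping is never written to, the kernel has no reason to
back it with physical pages, so the virtual size grows while the
resident set should stay roughly where it was.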



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 18:14                                         ` Gregor Zattler
@ 2023-03-14 12:30                                           ` Eli Zaretskii
  2023-03-14 15:19                                             ` Gregor Zattler
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-14 12:30 UTC (permalink / raw)
  To: Gregor Zattler; +Cc: emacs-devel

> From: Gregor Zattler <telegraph@gmx.net>
> Date: Mon, 13 Mar 2023 19:14:28 +0100
> 
> Since you provide emacs-uptime at session end, I think
> it would be useful to know total Emacs idle time also.
> I have no idea if that's feasible, though.

Why would it be useful, and what kind of information can we glean from
that?

I'm guessing that you assume when Emacs is idle it cannot possibly
cons any memory (which is true), but my point is different: did you
really see any production Emacs session that stays idle for prolonged
times?  E.g., in the session where I'm typing this I have 10 timers, 5
of which run at least once every 10 sec, and 3 run 2 or more times a
second.



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 16:52                                         ` Eli Zaretskii
@ 2023-03-14 12:47                                           ` Ihor Radchenko
  2023-03-14 13:09                                             ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-14 12:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Noted. Can we measure fragmentation from Elisp?
>
> I think only on glibc platforms, where we have malloc-info.  The
> output will need to be post-processed by some code, based on expert
> knowledge of what the numbers mean.

But can we get the output from Elisp? `malloc-info' docstring says that
the output is done directly to stderr, which we cannot (?) access from
Elisp.

>> Does memory-limit include the fragmented memory segments?
>
> I think it does, although it's system-dependent.  It's basically what
> 'top' shows as VSIZE.

At least, it can indirectly demonstrate the impact of the GC threshold on
Emacs memory footprint. I guess that is what we worry about in the end.
Or does the fragmentation cause other severe effects in addition to
higher memory usage?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-13 15:41                                 ` Eli Zaretskii
@ 2023-03-14 13:01                                   ` Ihor Radchenko
  0 siblings, 0 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-14 13:01 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> > We need to align and calibrate our measurement means: what does
>> > memory-report tell about object memory in that session?  1.18 GiB
>> > sounds a lot.
>> 
>> Unfortunately, memory-report is not usable in my sessions. M-x
>> memory-report takes 10+ minutes and then fails with max nesting error
>> for me.
>
> That is worth a bug report, I think.  That it takes a long time could
> be just a (mis)feature, but that it errors out is a bug that needs to
> be solved.

Well. I cannot reproduce it now, though I do see this problem from time to
time.

Here is my current memory report for 17 hours uptime:

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
28551 yantar92  20   0 3798452   1.1g  54972 S   4.7   7.2  43:13.57 emacs

%MEM is typical and stabilizes around a similar value.

Estimated Emacs Memory Usage

   393 MiB  Overall Object Memory Usage
   298 MiB  Total Buffer Memory Usage
   138 MiB  Reserved (But Unused) Object Memory
    87 MiB  Memory Used By Global Variables
    14 MiB  Memory Used By Symbol Plists
   9.4 MiB  Total Image Cache Size

Object Storage

   164 MiB  Conses
   114 MiB  Vectors
    75 MiB  Strings
    34 MiB  Intervals
   4.8 MiB  Symbols
   948 KiB  Floats
   175 KiB  Buffer-Objects

Largest Buffers

   216 MiB  notes.org (note: there are many eq text property values there, so 216 MiB looks like an overestimation)
    23 MiB   *code-conversion-work*
   7.6 MiB  *Org Agenda(h)*
   4.9 MiB   *helm info temp buffer*
   4.9 MiB  *info*
   2.9 MiB  config.org
   2.6 MiB  *cfw-calendar*
   2.3 MiB  TODO.org
   1.9 MiB   *helm candidates:Info Index: elisp*
   1.9 MiB  *Org Agenda(s:)*
   1.7 MiB  Thesis.org
   1.4 MiB  *xref*
   1.3 MiB  system-config.org
   1.3 MiB  *Org Agenda(d:)*
   1.3 MiB   *helm candidates:helpful-callable*
   1.2 MiB  articles.org
     1 MiB  schedule.org
   951 KiB  *notmuch-id:835yb4u0aa.fsf@gnu.org*
   910 KiB  *Org Agenda(i)*
   906 KiB   *DOC*

Largest Variables

    23 MiB  elfeed-db
    18 MiB  elfeed-db-entries
   7.4 MiB  emojify-emojis
   4.7 MiB  elfeed-db-index
   3.9 MiB  org-id-locations
   2.5 MiB  load-history
   1.5 MiB  byte-compile-form-stack
   1.3 MiB  straight--autoloads-cache
     1 MiB  ucs-normalize-hangul-translation-alist
   864 KiB  straight--build-cache-text
   765 KiB  easy-menu-converted-items-table
   694 KiB  url-domsuf-domains
   671 KiB  yas--tables
   622 KiB  info-lookup-cache
   576 KiB  face--new-frame-defaults
   563 KiB  modus-themes-faces
   472 KiB  kill-ring-yank-pointer
   472 KiB  kill-ring
   463 KiB  org-ql-node-value-cache
   458 KiB  winner-ring-alist

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-14 12:47                                           ` Ihor Radchenko
@ 2023-03-14 13:09                                             ` Eli Zaretskii
  2023-03-15 10:29                                               ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-14 13:09 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Tue, 14 Mar 2023 12:47:29 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Noted. Can we measure fragmentation from Elisp?
> >
> > I think only on glibc platforms, where we have malloc-info.  The
> > output will need to be post-processed by some code, based on expert
> > knowledge of what the numbers mean.
> 
> But can we get the output from Elisp? `malloc-info' docstring says that
> the output is done directly to stderr, which we cannot (?) access from
> Elisp.

Yes.  But as I said, I'm not sure this information is useful, unless
you are a glibc memory-allocation expert.  When we needed to
understand these reports, we asked glibc developers for help.

> >> Does memory-limit include the fragmented memory segments?
> >
> > I think it does, although it's system-dependent.  It's basically what
> > 'top' shows as VSIZE.
> 
> At least, it can indirectly demonstrate the impact of GC threshold onto
> Emacs memory footprint. I guess it is what we worry about at the end.
> Or does the fragmentation cause other severe effects in addition to
> higher memory usage?

_Real_ memory fragmentation, if it happens in Emacs, should cause the
memory footprint to grow all the time without leveling out, and
malloc-info should then show that most of the memory is in small
chunks that cannot be spliced together.

However, I have yet to see a platform where Emacs causes memory
fragmentation.  Where the system malloc cannot be trusted, we use
gmalloc (and in the past used ralloc).  Most modern platforms have
reliable malloc these days (the single known exception is MSDOS), so
this problem largely doesn't exist.
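
As a rough illustration of the post-processing mentioned above, the
Python sketch below computes what share of free-list bytes sits in
small chunks, given XML in the shape that glibc's malloc_info()
writes.  It rests on several assumptions: the report has been captured
from the stderr of the Emacs process (for example by redirecting
stderr to a file at startup), the embedded sample is hand-written with
made-up numbers rather than taken from a real session, and the 64-byte
"small" limit and the name `small_chunk_share` are arbitrary choices
for this example.  Reading real reports properly still needs the
allocator expertise described above.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of glibc's malloc_info() XML output.
# The numbers are invented for illustration; they are not real measurements.
SAMPLE = """<malloc version="1">
<heap nr="0">
<sizes>
  <size from="17" to="32" total="3200" count="100"/>
  <size from="33" to="64" total="6400" count="100"/>
  <size from="1025" to="2048" total="20480" count="10"/>
</sizes>
<total type="fast" count="0" size="0"/>
<total type="rest" count="210" size="30080"/>
<system type="current" size="135168"/>
<system type="max" size="135168"/>
</heap>
</malloc>"""


def small_chunk_share(xml_text, small_limit=64):
    """Fraction of free-list bytes held in chunks of at most small_limit bytes."""
    root = ET.fromstring(xml_text)
    small = total = 0
    for size in root.iter("size"):
        chunk_bytes = int(size.get("total"))
        total += chunk_bytes
        if int(size.get("to")) <= small_limit:
            small += chunk_bytes
    return small / total if total else 0.0


if __name__ == "__main__":
    print(f"share of free bytes in small chunks: {small_chunk_share(SAMPLE):.0%}")
```

A share that keeps rising across reports taken hours apart, together
with a footprint that never levels out, would be a hint of the pattern
described above; a single snapshot proves little.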



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-14 12:30                                           ` Eli Zaretskii
@ 2023-03-14 15:19                                             ` Gregor Zattler
  0 siblings, 0 replies; 99+ messages in thread
From: Gregor Zattler @ 2023-03-14 15:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hi Eli,
* Eli Zaretskii <eliz@gnu.org> [2023-03-14; 14:30 +02]:
>> From: Gregor Zattler <telegraph@gmx.net>
>> Date: Mon, 13 Mar 2023 19:14:28 +0100
>>
>> Since you provide emacs-uptime at session end, I think
>> it would be useful to know total Emacs idle time also.
>> I have no idea if that's feasible, though.
>
> Why would it be useful, and what kind of information can we glean from
> that?

I thought about what you wrote in an earlier email: XXX
garbage collections, meaning say one every minute, since
you were using it only half the time.

> I'm guessing that you assume when Emacs is idle it cannot possibly
> cons any memory (which is true), but my point is different: did you
> really see any production Emacs session that stays idle for prolonged
> times?  E.g., in the session where I'm typing this I have 10 timers, 5
> of which run at least once every 10 sec, and 3 run 2 or more times a
> second.

OK, if Emacs is never really idle in the above-mentioned
sense, then it makes no sense.



Ciao; Gregor



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-14 12:19                                             ` Eli Zaretskii
@ 2023-03-15 10:28                                               ` Ihor Radchenko
  2023-03-15 12:54                                                 ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-15 10:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Moreover, VIRT can exceed Memory + Swap combined.
>
> Yes, because VIRT includes the "reserved" memory, which is memory not
> yet in-use, but which the application "reserved" for itself and
> generally intends to use at some point, and so it cannot be used by
> other processes.  See
>
>   https://stackoverflow.com/questions/2440434/whats-the-difference-between-reserved-and-committed-memory
>   https://www.baeldung.com/linux/resident-set-vs-virtual-memory-size

Then, from what I can read, it does not look like VIRT truly represents
how much memory Emacs is going to use. It is safe to assume that only a
fraction of VIRT will be used in practice.

That said, we can probably rely on relative changes in VIRT data, if it
is recorded.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-14 13:09                                             ` Eli Zaretskii
@ 2023-03-15 10:29                                               ` Ihor Radchenko
  0 siblings, 0 replies; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-15 10:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> At least, it can indirectly demonstrate the impact of GC threshold onto
>> Emacs memory footprint. I guess it is what we worry about at the end.
>> Or does the fragmentation cause other severe effects in addition to
>> higher memory usage?
>
> _Real_ memory fragmentation, if it happens in Emacs, should cause the
> memory footprint grow all the time without leveling out, and
> malloc-info should then show that most of the memory is in small
> chunks that cannot be spliced together.
>
> However, I have yet to see a platform where Emacs causes memory
> fragmentation.  Where the system malloc cannot be trusted, we use
> gmalloc (and in the past used ralloc).  Most modern platforms have
> reliable malloc these days (the single known exception is MSDOS), so
> this problem largely doesn't exist.

In other words, we should not worry about gc-cons-threshold causing
_real_ memory fragmentation. Just about increasing the memory footprint.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-15 10:28                                               ` Ihor Radchenko
@ 2023-03-15 12:54                                                 ` Eli Zaretskii
  2023-03-15 12:59                                                   ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-15 12:54 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Wed, 15 Mar 2023 10:28:22 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Moreover, VIRT can exceed Memory + Swap combined.
> >
> > Yes, because VIRT includes the "reserved" memory, which is memory not
> > yet in-use, but which the application "reserved" for itself and
> > generally intends to use at some point, and so it cannot be used by
> > other processes.  See
> >
> >   https://stackoverflow.com/questions/2440434/whats-the-difference-between-reserved-and-committed-memory
> >   https://www.baeldung.com/linux/resident-set-vs-virtual-memory-size
> 
> Then, from what I can read, it does not look like VIRT truly represents
> how much memory Emacs is going to use. It is safe to assume that only a
> fraction of VIRT will be used in practice.

I don't think I follow your logic.

Emacs can have more memory in use than the physically installed
amount, it just means each GC will be painfully slow, because it will
need to swap in parts of memory and swap out other parts.  In theory,
Emacs can use all the VM there is, minus what other processes and the
OS use.

If your concern is about the "reserved" part, that is not supposed to
be large, but maybe glibc has its own ideas about that, because your
report about VIRT vs RES surprised me.



^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: Indentation and gc
  2023-03-15 12:54                                                 ` Eli Zaretskii
@ 2023-03-15 12:59                                                   ` Ihor Radchenko
  2023-03-15 14:20                                                     ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-15 12:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Then, from what I can read, it does not look like VIRT truly represents
>> how much memory Emacs is going to use. It is safe to assume that only a
>> fraction of VIRT will be used in practice.
>
> I don't think I follow your logic.
>
> Emacs can have more memory in use than the physically installed
> amount, it just means each GC will be painfully slow, because it will
> need to swap in parts of memory and swap out other parts.  In theory,
> Emacs can use all the VM there is, minus what other processes and the
> OS use.
>
> If your bother is about the "reserved" part, that is not supposed to
> be large, but maybe glibc has its own ideas about that, because your
> report about VIRT vs RES surprised me.

Then what should we collect from a user session?
memory-limit represents VIRT, but it does not look terribly useful.

What exactly should we use as a metric for the effects of
gc-cons-threshold changes?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




* Re: Indentation and gc
  2023-03-15 12:59                                                   ` Ihor Radchenko
@ 2023-03-15 14:20                                                     ` Eli Zaretskii
  2023-03-16 10:27                                                       ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-03-15 14:20 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Wed, 15 Mar 2023 12:59:05 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Then, from what I can read, it does not look like VIRT truly represents
> >> how much memory Emacs is going to use. It is safe to assume that only a
> >> fraction of VIRT will be used in practice.
> >
> > I don't think I follow your logic.
> >
> > Emacs can have more memory in use than the physically installed
> > amount, it just means each GC will be painfully slow, because it will
> > need to swap in parts of memory and swap out other parts.  In theory,
> > Emacs can use all the VM there is, minus what other processes and the
> > OS use.
> >
> > If your bother is about the "reserved" part, that is not supposed to
> > be large, but maybe glibc has its own ideas about that, because your
> > report about VIRT vs RES surprised me.
> 
> Then what should we collect from a user session?
> memory-limit represents VIRT, but it does not look terribly useful.
> 
> What exactly should we use as a metric for the effects of
> gc-cons-threshold changes?

I guess the output of memory-info should be good, in addition to
memory-limit.  I do think memory-limit is useful, albeit we should
take it with a grain of salt sometimes.  A too-large difference
between memory-limit and the resident set size is perhaps already an
important indication of some trouble, although I'd like to see more
examples to make up my mind about that.

But that's the possible downside of higher GC thresholds; other
important indications are the time spent in GC and the average
duration of a GC cycle.
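
As a rough illustration of the memory figures discussed here, the
sketch below gathers them in one place.  The helper name
`my-gc-memory-snapshot' is hypothetical and does not come from this
thread.  It assumes that `process-attributes' reports `vsize' and `rss'
on the platform in use (it does not everywhere), and it guards the call
to `memory-limit' in case that function is unavailable in a given Emacs
version.

```elisp
;; Hypothetical helper for illustration; not part of Emacs or of any
;; package mentioned in this thread.
(defun my-gc-memory-snapshot ()
  "Return a plist of memory figures for the current Emacs process.
Sizes are in KiB where the platform reports them, otherwise nil."
  (let ((attrs (process-attributes (emacs-pid))))
    (list
     ;; System-wide view: (TOTAL-RAM FREE-RAM TOTAL-SWAP FREE-SWAP),
     ;; or nil on unsupported systems.
     :memory-info (memory-info)
     ;; Emacs's own estimate, guarded in case the function is absent.
     :memory-limit (and (fboundp 'memory-limit) (memory-limit))
     ;; Virtual size (VIRT) and resident set size (RES) of this process.
     :vsize (cdr (assq 'vsize attrs))
     :rss (cdr (assq 'rss attrs)))))
```

Comparing the `:vsize' (or `:memory-limit') value with `:rss' gives the
kind of difference mentioned above; what counts as "too large" is still
an open question.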




* Re: Indentation and gc
  2023-03-15 14:20                                                     ` Eli Zaretskii
@ 2023-03-16 10:27                                                       ` Ihor Radchenko
  2023-04-06  9:13                                                         ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-03-16 10:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> What exactly should we use as a metric for the effects of
>> gc-cons-threshold changes?
>
> I guess the output of memory-info should be good, in addition to
> memory-limit.

I am not sure what it will convey to us. System memory state greatly
depends on other programs running alongside Emacs. I do not see much use
in tracking (memory-info) at every GC; recording it once at the start,
to capture system memory stats, should be enough.

> But that's the possible downside of higher GC thresholds; other
> important indications are the time spent in GC and the average
> duration of a GC cycle.

Sure. gcs-done and gc-elapsed can be recorded after every GC. gc-elapsed
directly gives the total GC duration, and the running difference of
gc-elapsed gives detailed statistics on the duration of each GC cycle.
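
A minimal sketch of such recording, assuming `post-gc-hook' is used to
take the samples.  The names `my-gc-log', `my-gc--last-elapsed' and
`my-gc-record' are made up for illustration; this is not the code of
any existing package.

```elisp
;; Illustrative names only; not taken from an existing package.
(defvar my-gc-log nil
  "List of (GCS-DONE GC-ELAPSED LAST-DURATION) entries, newest first.")

(defvar my-gc--last-elapsed gc-elapsed
  "Value of `gc-elapsed' seen at the previous recording.")

(defun my-gc-record ()
  "Record the GC count, total GC time, and duration of the latest cycle."
  ;; The running difference of `gc-elapsed' is the time spent in the
  ;; cycle(s) since the previous recording.
  (push (list gcs-done gc-elapsed (- gc-elapsed my-gc--last-elapsed))
        my-gc-log)
  (setq my-gc--last-elapsed gc-elapsed))

;; Take one sample after every garbage collection.
(add-hook 'post-gc-hook #'my-gc-record)
```

The average duration of a cycle over the whole session can then be
estimated as (/ gc-elapsed (max gcs-done 1)).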

I will now try to record the data on my sessions and share it later.
We can then see what useful information can actually be extracted.


* Re: Indentation and gc
  2023-03-10 11:07 ` Indentation and gc Ergus
  2023-03-10 14:36   ` Dr. Arne Babenhauserheide
  2023-03-10 14:52   ` Eli Zaretskii
@ 2023-03-21  7:11   ` Jean Louis
  2023-03-21  7:27     ` Emanuel Berg
  2 siblings, 1 reply; 99+ messages in thread
From: Jean Louis @ 2023-03-21  7:11 UTC (permalink / raw)
  To: Ergus; +Cc: emacs-devel@gnu.org

* Ergus <spacibba@aol.com> [2023-03-10 14:10]:
> Hi:
> 
> Just today I enabled the garbage-collection-messages and I found that
> indenting the buffer with `C-x h <tab>` in just ~150 C++ lines I get the
> garbage-collection message printed about 4 or 5 times before the
> indentation finishes.

Sure, I have `garbage-collection-messages' turned on all the time, and
gc is engaged too many times, giving me the feeling that it is blocking
execution more than it is helping. But my feeling may not be fact.

-- 
Jean

Take action in Free Software Foundation campaigns:
https://www.fsf.org/campaigns

In support of Richard M. Stallman
https://stallmansupport.org/




* Re: Indentation and gc
  2023-03-21  7:11   ` Jean Louis
@ 2023-03-21  7:27     ` Emanuel Berg
  0 siblings, 0 replies; 99+ messages in thread
From: Emanuel Berg @ 2023-03-21  7:27 UTC (permalink / raw)
  To: emacs-devel

Jean Louis wrote:

>> Just today I enabled the garbage-collection-messages and
>> I found that indenting the buffer with `C-x h <tab>` in
>> just ~150 C++ lines I get the garbage-collection message
>> printed about 4 or 5 times before the indentation finishes.
>
> Sure, I have `garbage-collection-messages' turned on all the
> time, and gc is engaged too many times, giving me the feeling
> that it is blocking execution more than it is helping. But my
> feeling may not be fact.

(setq garbage-collection-messages nil) ; do not announce each GC
(setq gc-cons-threshold 3000000)       ; collect after about 3 MB of consing

-- 
underground experts united
https://dataswamp.org/~incal





* Re: Indentation and gc
  2023-03-16 10:27                                                       ` Ihor Radchenko
@ 2023-04-06  9:13                                                         ` Ihor Radchenko
  2023-04-08  8:04                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-04-06  9:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

Ihor Radchenko <yantar92@posteo.net> writes:

> I will now try to record the data on my sessions and share it later.
> We can then see what useful information can actually be extracted.

I am attaching some stats I recorded by varying gc-cons-threshold and
using Emacs.

Some observations:
1. GC time increases over time
2. Total time spent in GC decreases with increasing gc-cons-threshold
3. Average time spent in a single GC is roughly the same, except for
   very high gc-cons-threshold values
4. Memory limit has no obvious correlation with gc-cons-threshold

Of course, we need data from more users to draw meaningful conclusions.


[Attachments (PNG plots): gc-time-stats.png, GC-count-stats.png,
gc-spacing-stats.png, gc-time-acc-stats.png, gc-time-avg-stats.png,
memory-limit-stats.png]


The raw data is at https://0x0.st/HXPu.eld; it is a bit large: 3 MB.


* Re: Indentation and gc
  2023-04-06  9:13                                                         ` Ihor Radchenko
@ 2023-04-08  8:04                                                           ` Eli Zaretskii
  2023-04-08  8:15                                                             ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-04-08  8:04 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Thu, 06 Apr 2023 09:13:13 +0000
> 
> I am attaching some stats I recorded by varying gc-cons-threshold and
> using Emacs.

Thanks.  This is just for 14 to 25 hours of uptime.  I think we need
statistics for longer periods of time.

> Some observations:
> 1. GC time increases over time
> 2. Total time spent in GC decreases with increasing gc-cons-threshold
> 3. Average time spent in a single GC is roughly the same, except for
>    very high gc-cons-threshold values
> 4. Memory limit has no obvious correlation with gc-cons-threshold

I'm not sure I agree with the last conclusion.  It seems like higher
thresholds allow the memory limit to grow almost twofold.  But maybe
the relatively short uptime prevents the data from being truly
representative, as I see in some cases the memory limit going down
after some time.




* Re: Indentation and gc
  2023-04-08  8:04                                                           ` Eli Zaretskii
@ 2023-04-08  8:15                                                             ` Ihor Radchenko
  2023-04-08 10:03                                                               ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-04-08  8:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I am attaching some stats I recorded by varying gc-cons-threshold and
>> using Emacs.
>
> Thanks.  This is just for 14 to 25 hours of uptime.  I think we need
> statistics for longer periods of time.

Do you have a particular number of hours in mind?
For me, Emacs uptime is usually several days and sometimes less when I
am doing Elisp development.

There is a possible argument to be made that we should only look into
the real uptime people use rather than trying to measure some
unreasonably long uptime. It will also be more natural to collect
statistics then.

>> Some observations:
> ...
> I'm not sure I agree with the last conclusion.  It seems like higher
> thresholds allow the memory limit to grow almost twofold.  But maybe
> the relatively short uptime prevents the data from being truly
> representative, as I see in some cases the memory limit going down
> after some time.

These are not the conclusions we should take seriously yet. Not from a
single user data point.

Maybe you have thoughts about what else can be deduced from the data I
provided. Or about what else should be recorded. (For some context, I
realized that it is necessary to subtract idle time when recording Emacs
uptime. My overnight graphs show flat lines when Emacs is idle).


* Re: Indentation and gc
  2023-04-08  8:15                                                             ` Ihor Radchenko
@ 2023-04-08 10:03                                                               ` Eli Zaretskii
  2023-04-14 17:07                                                                 ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-04-08 10:03 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Sat, 08 Apr 2023 08:15:32 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I am attaching some stats I recorded by varying gc-cons-threshold and
> >> using Emacs.
> >
> > Thanks.  This is just for 14 to 25 hours of uptime.  I think we need
> > statistics for longer periods of time.
> 
> Do you have a particular number of hours in mind?
> For me, Emacs uptime is usually several days and sometimes less when I
> am doing Elisp development.

Well, several days would be a good time frame, I think.

> There is a possible argument to be made that we should only look into
> the real uptime people use rather than trying to measure some
> unreasonably long uptime. It will also be more natural to collect
> statistics then.

I agree.  However, my production sessions normally run for several
weeks, and sometimes for several months.  I think at least some part
of the Emacs users, particularly those who don't track development
branches, have long-lasting sessions.  We should have statistics from
these long sessions as well.

> Maybe you have thoughts about what else can be deduced from the data I
> provided. Or about what else should be recorded. (For some context, I
> realized that it is necessary to subtract idle time when recording Emacs
> uptime. My overnight graphs show flat lines when Emacs is idle).

I think your data is sufficient for now, we just need it from more
users and for longer times.  Maybe later we will have insights that
will help us identify additional data to be collected.

Thanks.




* Re: Indentation and gc
  2023-04-08 10:03                                                               ` Eli Zaretskii
@ 2023-04-14 17:07                                                                 ` Ihor Radchenko
  2023-04-14 17:56                                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-04-14 17:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: arne_bab, spacibba, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

> I think your data is sufficient for now, we just need it from more
> users and for longer times.  Maybe later we will have insights that
> will help us identify additional data to be collected.

Then, the first step is publishing the package:
https://yhetil.org/emacs-devel/87ttxil7k3.fsf@localhost

Once done, we need to decide how we want to ask users to share the data.
Should it be a reply in this thread?


* Re: Indentation and gc
  2023-04-14 17:07                                                                 ` Ihor Radchenko
@ 2023-04-14 17:56                                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-04-14 17:56 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: arne_bab, spacibba, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: arne_bab@web.de, spacibba@aol.com, emacs-devel@gnu.org
> Date: Fri, 14 Apr 2023 17:07:41 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I think your data is sufficient for now, we just need it from more
> > users and for longer times.  Maybe later we will have insights that
> > will help us identify additional data to be collected.
> 
> Then, the first step is publishing the package:
> https://yhetil.org/emacs-devel/87ttxil7k3.fsf@localhost
> 
> Once done, we need to decide how we want to ask users to share the data.
> Should it be a reply in this thread?

Maybe a new issue on our issue tracker would be better?




* Re: Indentation and gc
  2023-03-13 12:51                                       ` Eli Zaretskii
@ 2023-06-14 14:16                                         ` Ihor Radchenko
  2023-06-14 15:36                                           ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-06-14 14:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I guess we can give an answer if we collect usage statistics.
>
> Exactly.

Now, we have https://elpa.gnu.org/devel/emacs-gc-stats.html

Should we first try to collect existing usage stats, before asking
people to try non-standard GC settings?


* Re: Indentation and gc
  2023-06-14 14:16                                         ` Ihor Radchenko
@ 2023-06-14 15:36                                           ` Eli Zaretskii
  2023-06-14 15:58                                             ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-06-14 15:36 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Wed, 14 Jun 2023 14:16:43 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> I guess we can give an answer if we collect usage statistics.
> >
> > Exactly.
> 
> Now, we have https://elpa.gnu.org/devel/emacs-gc-stats.html
> 
> Should we first try to collect existing usage stats, before asking
> people to try non-standard GC settings?

We could do both.  But I don't think I understand what you mean by
"existing usage stats".




* Re: Indentation and gc
  2023-06-14 15:36                                           ` Eli Zaretskii
@ 2023-06-14 15:58                                             ` Ihor Radchenko
  2023-06-14 16:07                                               ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-06-14 15:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Now, we have https://elpa.gnu.org/devel/emacs-gc-stats.html
>> 
>> Should we first try to collect existing usage stats, before asking
>> people to try non-standard GC settings?
>
> We could do both.  But I don't think I understand what you mean by
> "existing usage stats".

I meant to ask users to install emacs-gc-stats and enable
emacs-gc-stats-mode, leaving all other Emacs GC settings as they are
now. After several weeks, we will ask these users to share the results.
This will provide the baseline.

If we want to ask about trying non-standard GC settings, then we first
need to decide which GC settings we should ask about:
`gc-cons-threshold' values; `gc-cons-percentage' values; the code from
<https://yhetil.org/emacs-devel/jwvjzwbxp4p.fsf-monnier+emacs@gnu.org>;
maybe something else.


* Re: Indentation and gc
  2023-06-14 15:58                                             ` Ihor Radchenko
@ 2023-06-14 16:07                                               ` Eli Zaretskii
  2023-06-16 10:00                                                 ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-06-14 16:07 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Wed, 14 Jun 2023 15:58:37 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Now, we have https://elpa.gnu.org/devel/emacs-gc-stats.html
> >> 
> >> Should we first try to collect existing usage stats, before asking
> >> people to try non-standard GC settings?
> >
> > We could do both.  But I don't think I understand what you mean by
> > "existing usage stats".
> 
> I meant to ask users to install emacs-gc-stats and enable
> emacs-gc-stats-mode, leaving all other Emacs GC settings as they are
> now. After several weeks, we will ask these users to share the results.
> This will provide the baseline.

Yes, we definitely need a baseline.  But the baseline should be based
on the default values of the GC parameters, so people who share their
statistics should take care to reset them to the default values.
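
For anyone who has customized these variables, a minimal sketch of
putting them back by hand might look like the following.  The numbers
are an assumption: they are the stock values of a typical 64-bit build
(800 kB and 10%), so it is worth checking them against `emacs -Q' on
your own system before relying on them.

```elisp
;; Assumed stock defaults of a 64-bit build; verify with `emacs -Q',
;; since the threshold default depends on the word size.
(setq gc-cons-threshold 800000   ; bytes of consing between collections
      gc-cons-percentage 0.1)    ; fraction of the heap size
```

Any code in the init file that raises the threshold during startup and
restores it later would need to be disabled as well.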

> If we want to ask about trying non-standard GC settings, then we first
> need to decide which GC settings we should ask about:
> `gc-cons-threshold' values; `gc-cons-percentage' values; the code from
> <https://yhetil.org/emacs-devel/jwvjzwbxp4p.fsf-monnier+emacs@gnu.org>;
> maybe something else.

I don't think it matters; we just need people to report the
values they used.




* Re: Indentation and gc
  2023-06-14 16:07                                               ` Eli Zaretskii
@ 2023-06-16 10:00                                                 ` Ihor Radchenko
  2023-06-16 10:33                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-06-16 10:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> I meant to ask users to install emacs-gc-stats and enable
>> emacs-gc-stats-mode, leaving all other Emacs GC settings as they are
>> now. After several weeks, we will ask these users to share the results.
>> This will provide the baseline.
>
> Yes, we definitely need a baseline.  But the baseline should be based
> on the default values of the GC parameters, so people who share their
> statistics should take care to reset them to the default values.

Ok. I updated the README and the package as follows:

(require 'emacs-gc-stats)
(setq emacs-gc-stats-gc-defaults 'emacs-defaults) ; optional
(emacs-gc-stats-mode +1)

This will optionally reset Emacs GC settings to defaults for Emacs
master.

Now, how should we collect the data?
I see several options:
1. I can create a dedicated mailing list to have an email to get all the
   replies.
2. We can collect directly in emacs-devel, but it will not be in a
   single email thread.
3. I can use In-Reply-To for this thread, but only people with Emacs
   email setup will be able to reply properly.
4. I can ask to create a dedicated survey on https://emacssurvey.org/

WDYT?


* Re: Indentation and gc
  2023-06-16 10:00                                                 ` Ihor Radchenko
@ 2023-06-16 10:33                                                   ` Eli Zaretskii
  2023-06-16 11:03                                                     ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-06-16 10:33 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Fri, 16 Jun 2023 10:00:28 +0000
> 
> Now, how should we collect the data?
> I see several options:
> 1. I can create a dedicated mailing list to have an email to get all the
>    replies.
> 2. We can collect directly in emacs-devel, but it will not be in a
>    single email thread.
> 3. I can use In-Reply-To for this thread, but only people with Emacs
>    email setup will be able to reply properly.
> 4. I can ask to create a dedicated survey on https://emacssurvey.org/
> 
> WDYT?

Option 1 looks the best to me.  Let me know if you want me to create
the mailing list on Savannah.




* Re: Indentation and gc
  2023-06-16 10:33                                                   ` Eli Zaretskii
@ 2023-06-16 11:03                                                     ` Ihor Radchenko
  2023-06-16 11:34                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-06-16 11:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Now, how should we collect the data?
>> I see several options:
>> 1. I can create a dedicated mailing list to have an email to get all the
>>    replies.
>> ... 
>> WDYT?
>
> Option 1 looks the best to me.  Let me know if you want me to create
> the mailing list on Savannah.

Sure. A list on GNU servers will be the most reliable for future reference.


* Re: Indentation and gc
  2023-06-16 11:03                                                     ` Ihor Radchenko
@ 2023-06-16 11:34                                                       ` Eli Zaretskii
  2023-06-21 10:37                                                         ` Ihor Radchenko
  0 siblings, 1 reply; 99+ messages in thread
From: Eli Zaretskii @ 2023-06-16 11:34 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Fri, 16 Jun 2023 11:03:28 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Now, how should we collect the data?
> >> I see several options:
> >> 1. I can create a dedicated mailing list to have an email to get all the
> >>    replies.
> >> ... 
> >> WDYT?
> >
> > Option 1 looks the best to me.  Let me know if you want me to create
> > the mailing list on Savannah.
> 
> Sure. A list on GNU servers will be the most reliable for future reference.

Done.  The list's address is emacs-gc-stats@gnu.org.




* Re: Indentation and gc
  2023-06-16 11:34                                                       ` Eli Zaretskii
@ 2023-06-21 10:37                                                         ` Ihor Radchenko
  2023-06-21 11:11                                                           ` Eli Zaretskii
  0 siblings, 1 reply; 99+ messages in thread
From: Ihor Radchenko @ 2023-06-21 10:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: spacibba, arne_bab, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Sure. A list on GNU servers will be the most reliable for future reference.
>
> Done.  The list's address is emacs-gc-stats@gnu.org.

I have announced the package and a call to action on Reddit, Mastodon,
and in Matrix. Now, let's see the turnout.

One concern I have about the mailing list is limits on the attachment
size. I expect up to a few MB of data. Will it cause any issues?


* Re: Indentation and gc
  2023-06-21 10:37                                                         ` Ihor Radchenko
@ 2023-06-21 11:11                                                           ` Eli Zaretskii
  0 siblings, 0 replies; 99+ messages in thread
From: Eli Zaretskii @ 2023-06-21 11:11 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: spacibba, arne_bab, emacs-devel

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: spacibba@aol.com, arne_bab@web.de, emacs-devel@gnu.org
> Date: Wed, 21 Jun 2023 10:37:18 +0000
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >> Sure. A list on GNU servers will be the most reliable for future reference.
> >
> > Done.  The list's address is emacs-gc-stats@gnu.org.
> 
> I have announced the package and a call to action on Reddit, Mastodon,
> and in Matrix. Now, let's see the turnout.

Thanks.

> One concern I have about the mailing list is limits on the attachment
> size. I expect up to a few MB of data. Will it cause any issues?

Larger posts will require manual approval.  But most people don't
subscribe to the list, so they need approval anyway.




end of thread, other threads:[~2023-06-21 11:11 UTC | newest]

Thread overview: 99+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20230310110747.4hytasakomvdyf7i.ref@Ergus>
2023-03-10 11:07 ` Indentation and gc Ergus
2023-03-10 14:36   ` Dr. Arne Babenhauserheide
2023-03-10 14:54     ` Eli Zaretskii
2023-03-10 19:23       ` Dr. Arne Babenhauserheide
2023-03-11  6:38         ` Eli Zaretskii
2023-03-11  6:55           ` Dr. Arne Babenhauserheide
2023-03-11  7:56             ` Eli Zaretskii
2023-03-11 12:34               ` Dr. Arne Babenhauserheide
2023-03-11 13:08                 ` Eli Zaretskii
2023-03-11 13:31                   ` Ihor Radchenko
2023-03-11 13:44                     ` Eli Zaretskii
2023-03-11 13:53                       ` Ihor Radchenko
2023-03-11 14:09                         ` Eli Zaretskii
2023-03-12 14:20                           ` Ihor Radchenko
2023-03-12 14:40                             ` Eli Zaretskii
2023-03-12 15:04                               ` Ihor Radchenko
2023-03-12 15:26                                 ` Eli Zaretskii
2023-03-13 15:09                                   ` Ihor Radchenko
2023-03-13 15:37                                     ` Eli Zaretskii
2023-03-13 15:45                                       ` Ihor Radchenko
2023-03-13 16:58                                         ` Eli Zaretskii
2023-03-13 18:04                                           ` Ihor Radchenko
2023-03-14 12:19                                             ` Eli Zaretskii
2023-03-15 10:28                                               ` Ihor Radchenko
2023-03-15 12:54                                                 ` Eli Zaretskii
2023-03-15 12:59                                                   ` Ihor Radchenko
2023-03-15 14:20                                                     ` Eli Zaretskii
2023-03-16 10:27                                                       ` Ihor Radchenko
2023-04-06  9:13                                                         ` Ihor Radchenko
2023-04-08  8:04                                                           ` Eli Zaretskii
2023-04-08  8:15                                                             ` Ihor Radchenko
2023-04-08 10:03                                                               ` Eli Zaretskii
2023-04-14 17:07                                                                 ` Ihor Radchenko
2023-04-14 17:56                                                                   ` Eli Zaretskii
2023-03-13 18:14                                         ` Gregor Zattler
2023-03-14 12:30                                           ` Eli Zaretskii
2023-03-14 15:19                                             ` Gregor Zattler
2023-03-11 16:19                     ` Dr. Arne Babenhauserheide
2023-03-12 13:27                       ` Ihor Radchenko
2023-03-12 14:10                         ` Eli Zaretskii
2023-03-12 14:50                           ` Ihor Radchenko
2023-03-12 15:13                             ` Eli Zaretskii
2023-03-12 17:15                               ` Gregor Zattler
2023-03-12 20:07                                 ` Eli Zaretskii
2023-03-13 15:01                               ` Ihor Radchenko
2023-03-13 15:33                                 ` Eli Zaretskii
2023-03-13 15:39                                   ` Ihor Radchenko
2023-03-13 15:39                                     ` Eli Zaretskii
2023-03-13 16:04                                       ` Ihor Radchenko
2023-03-13 16:52                                         ` Eli Zaretskii
2023-03-14 12:47                                           ` Ihor Radchenko
2023-03-14 13:09                                             ` Eli Zaretskii
2023-03-15 10:29                                               ` Ihor Radchenko
2023-03-13 15:41                                 ` Eli Zaretskii
2023-03-14 13:01                                   ` Ihor Radchenko
2023-03-11 10:54       ` Ihor Radchenko
2023-03-11 11:17         ` Ergus
2023-03-11 11:23           ` Ihor Radchenko
2023-03-11 12:31           ` Eli Zaretskii
2023-03-11 12:39             ` Ihor Radchenko
2023-03-11 12:40               ` Eli Zaretskii
2023-03-11 12:54                 ` Ihor Radchenko
2023-03-11 13:01                   ` Dr. Arne Babenhauserheide
2023-03-11 13:14                   ` Eli Zaretskii
2023-03-11 13:38                     ` Ihor Radchenko
2023-03-11 13:46                       ` Eli Zaretskii
2023-03-11 13:54                         ` Ihor Radchenko
2023-03-11 14:11                           ` Eli Zaretskii
2023-03-11 14:18                             ` Ihor Radchenko
2023-03-11 14:20                               ` Eli Zaretskii
2023-03-11 14:31                                 ` Ihor Radchenko
2023-03-11 15:32                                   ` Eli Zaretskii
2023-03-11 15:52                                     ` Lynn Winebarger
2023-03-11 16:24                                       ` Eli Zaretskii
2023-03-11 17:10                                     ` Gregor Zattler
2023-03-11 17:25                                       ` Eli Zaretskii
2023-03-11 18:35                                         ` Gregor Zattler
2023-03-11 18:49                                           ` Eli Zaretskii
2023-03-13 12:45                                     ` Ihor Radchenko
2023-03-13 12:51                                       ` Eli Zaretskii
2023-06-14 14:16                                         ` Ihor Radchenko
2023-06-14 15:36                                           ` Eli Zaretskii
2023-06-14 15:58                                             ` Ihor Radchenko
2023-06-14 16:07                                               ` Eli Zaretskii
2023-06-16 10:00                                                 ` Ihor Radchenko
2023-06-16 10:33                                                   ` Eli Zaretskii
2023-06-16 11:03                                                     ` Ihor Radchenko
2023-06-16 11:34                                                       ` Eli Zaretskii
2023-06-21 10:37                                                         ` Ihor Radchenko
2023-06-21 11:11                                                           ` Eli Zaretskii
2023-03-11 13:00                 ` Po Lu
2023-03-11 12:37         ` Eli Zaretskii
2023-03-11 13:10           ` Ihor Radchenko
2023-03-11 13:38             ` Eli Zaretskii
2023-03-10 14:52   ` Eli Zaretskii
2023-03-10 21:30     ` Ergus
2023-03-11  6:52       ` Eli Zaretskii
2023-03-21  7:11   ` Jean Louis
2023-03-21  7:27     ` Emanuel Berg
