On Mon, Jun 20, 2022 at 10:01 PM Lynn Winebarger wrote:
> On Sun, Jun 19, 2022, 7:05 AM Lars Ingebrigtsen wrote:
>
>> Eli Zaretskii writes:
>>
>> > AFAIK, that's not really true. We call 'free' and glibc does release
>> > to the OS when it can. And don't forget that Emacs calls 'malloc' a
>> > lot not just for conses and other Lisp data.
>>
>> Wasn't there a huge discussion about this a couple months ago and the
>> conclusion was that glibc rarely releases memory back to the OS? That's
>> why we added `malloc-trim'.
>>
>
> Was there? I see:
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=38345#71
> but it's a little more than a couple of months ago.
>
> I'm pretty curious because if I accumulate a large buffer of trace output
> (running memory up to 100s of MB), killing the buffer doesn't seem to
> impact gc time substantially.
>
> I would think using mmap-based allocation would make release of memory to
> the system more likely. But mmap is only used for blocks of 64 KiB and
> larger, and the trim threshold is 128 KiB (shouldn't it be lower than the
> allocation size?).
>
> I'm also interested in the improvement in performance reported in that
> thread from using the jemalloc implementation. Has that been investigated
> further?

I tried jemalloc per Ihor's description, using LD_PRELOAD to have it
replace the system malloc without having to change or even recompile any
Emacs code. My test was to start a shell in a buffer and then cat about
32 MB of CSV data into it. That seemed like a reasonable approximation of
generating large buffers of tracing output.

I started Emacs 28.1 from a build directory with

  MALLOC_CONF=background_thread:true,retain:false ./src/emacs

As expected, the Emacs process slowly added about 50 MB to its RSS. After
the cat had completed, I killed the buffer and ran (garbage-collect). The
RSS and VSZ of the process dropped by about 40 MB. That's not complete
reclamation, but I expect it would solve the problem of a permanent bump
in gc pause time.

For some reason, the configure script chooses not to use mmap for malloc.
jemalloc is designed to use mmap as much as possible, so it isn't limited
to freeing only the uppermost regions. I can't tell what (if any) option
would let me overrule the configure script's decision to use sbrk instead
of mmap. That might help make the mallopt knobs more effective.

Are there any formal memory-allocation/reclamation benchmarks I should
use instead of this ad hoc test?

Lynn
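
A sketch of the full launch command described above, for anyone who wants
to reproduce the experiment: the jemalloc library path below is an
assumption (it varies by distribution), and the ps invocation is just one
way to watch the process's RSS and VSZ.

  # Preload jemalloc in place of the system malloc; adjust the path to
  # wherever your distribution installs libjemalloc.so.2.
  LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 \
  MALLOC_CONF=background_thread:true,retain:false \
    ./src/emacs

  # In another terminal, watch the resident and virtual size (in KiB) of
  # the most recently started emacs process:
  ps -o rss=,vsz= -p $(pgrep -n emacs)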
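
For the remark about the mallopt knobs: a rough sketch of lowering
glibc's trim and mmap thresholds without rebuilding, using GLIBC_TUNABLES
(this affects glibc's malloc only, not the jemalloc run, and the 64 KiB
values are illustrative rather than a recommendation):

  # Lower glibc malloc's trim and mmap thresholds (values in bytes) and
  # start the freshly built Emacs:
  GLIBC_TUNABLES=glibc.malloc.trim_threshold=65536:glibc.malloc.mmap_threshold=65536 \
    ./src/emacs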