From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Daniel Colascione
Newsgroups: gmane.emacs.devel
Subject: Re: (heap 1024 82721 1933216)
Date: Sat, 18 Jan 2014 19:31:55 -0800
Message-ID: <52DB472B.5060805@dancol.org>
References: <52DA8412.2080009@dancol.org>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------090308020109070201020803"
X-Trace: ger.gmane.org 1390102336 3522 80.91.229.3 (19 Jan 2014 03:32:16 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sun, 19 Jan 2014 03:32:16 +0000 (UTC)
Cc: Emacs developers
To: Stefan Monnier
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sun Jan 19 04:32:23 2014
Return-path:
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
 by plane.gmane.org with esmtp (Exim 4.69)
 (envelope-from )
 id 1W4j7W-0007S8-TC
 for ged-emacs-devel@m.gmane.org; Sun, 19 Jan 2014 04:32:23 +0100
Original-Received: from localhost ([::1]:45015 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from )
 id 1W4j7W-0000h9-7w
 for ged-emacs-devel@m.gmane.org; Sat, 18 Jan 2014 22:32:22 -0500
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:60566)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from )
 id 1W4j7M-0000gr-FF
 for emacs-devel@gnu.org; Sat, 18 Jan 2014 22:32:18 -0500
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from )
 id 1W4j7G-00068m-8c
 for emacs-devel@gnu.org; Sat, 18 Jan 2014 22:32:12 -0500
Original-Received: from dancol.org ([2600:3c01::f03c:91ff:fedf:adf3]:48335)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from )
 id 1W4j7F-00068V-QD
 for emacs-devel@gnu.org; Sat, 18 Jan 2014 22:32:06 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=dancol.org; s=x;
 h=Content-Type:In-Reply-To:References:Subject:CC:To:MIME-Version:From:Date:Message-ID;
 bh=/6JjEHE5AISV/dxBJGZGg7fCdfcoZVzWnAFbEtas6Og=;
 b=BhtPuKy56xz0XxJ9n/PFk+ZDqy6Ox6kOWnjlrnEQFstctNbVtakSAnVr5Wxunr/jvN/Vk9fXNOV8chrZM1wrhF12CCXUkOM/AbJ8k6TmQ9Gjzb/eVQm76OBYkAJubRmpnVuk1zOz1Kakf1OBUgf5DfrTDn3h2SLWDaOH9CYJZmGy1NVhrtpF/XSTEvTF8wXKDnCSsgowdc6LAQbDEKBhTAJKY/bmaN0LtE7qLB6Wrt4SsEwqCc35sZN/svwhAEsX4iyU5FJJLGaLHpQ9lGenw2T/respwkmB44pOnA9clfakpfgpf2E61vK6U7knZhk23cXaTbfR0iax8vDJlPUMIQ==;
Original-Received: from c-76-104-210-106.hsd1.wa.comcast.net ([76.104.210.106] helo=[192.168.1.50])
 by dancol.org with esmtpsa (TLS1.0:DHE_RSA_CAMELLIA_256_CBC_SHA1:256)
 (Exim 4.82)
 (envelope-from )
 id 1W4j78-0002XY-Tr; Sat, 18 Jan 2014 19:31:59 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
In-Reply-To:
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value).
X-Received-From: 2600:3c01::f03c:91ff:fedf:adf3
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:168716
Archived-At:

This is a multi-part message in MIME format.
--------------090308020109070201020803
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 01/18/2014 06:53 PM, Stefan Monnier wrote:
>> value. dlmalloc's free memory retention seems a bit severe here.
>
> There are several levels at which the memory is "returned to the other level":
> - if a single cons cell is in use in a "cons cell block", that block
>   can't be freed.
> - those blocks are themselves allocated in groups of 16 IIRC, so those
>   groups can only be freed once all 16 of them have been freed at the
>   previous level.
> - malloc/free can itself decide to keep those "freed" blocks for later
>   use, or to return them to the OS.
>   At this level, the behavior depends
>   on the malloc library in use, which depends on the OS.
>   IIUC there are malloc libraries in use which never return memory back
>   to the OS.
>
>> Are we just badly fragmenting the heap?
>
> Could be.  For an Emacs that grew to 6GB, I don't find it worrisome
> if it doesn't shrink back below 2GB.

I have no idea what contributed to that 6GB. Shared mappings count
toward virtsize. Of this 6GB, though, dlmalloc has 2GB in its free
lists. This figure is worrisome because this memory waste isn't coming
from a simple leak we can plug. In the debugger, before I killed Emacs,
I called malloc_trim, which didn't seem to have any effect. (Not that I
expected it to.)

dlmalloc is an sbrk-based allocator. It can only return memory to the
system by reducing the data segment size. It can almost never do that in
programs with typical allocation patterns, so in effect, the heap grows
forever. dlmalloc does have code for using mmap for large allocations,
but we've rendered that code inoperative in alloc.c by forcing sbrk
allocation for all lisp objects, however large.

If we allocate a 40MB vector and a cons block (or anything else), then
GC the vector but keep at least one cons cell in that block live, we can
never get that 40MB back. Ordinarily, dlmalloc would have just allocated
that 40MB vector using mmap and expanded the heap only slightly for the
cons block.

We forbid mmap allocation of lisp objects because unexec doesn't restore
the contents of mmaped regions, leaving some lisp objects out of the
dump.

One simple thing we can do to reduce fragmentation is to relax this
restriction. If we know Emacs is already dumped, we can allow malloc to
use mmap to allocate some lisp objects since we know emacs won't be
dumped again. Today, Emacs technically supports being dumped multiple
times, but we can safely kill this feature because it is broken on
several major platforms already and almost certainly goes unused.
On Cygwin and NS, dumping an already-dumped Emacs is explicitly
forbidden. On my GTK3 GNU/Linux Emacs, attempting to dump a dumped Emacs
results in a segfault. I haven't tried it in NT Emacs, but I wouldn't be
surprised if the feature were also broken there.

The attached patch allows mmap allocation of large lisp objects in an
Emacs that has been dumped (or that cannot ever be dumped). It could use
more polish (e.g., enforcing the dump-once restriction for all
platforms), but it shows that the basic idea works.

Another simple thing we can do is switch malloc implementations.
jemalloc is a modern mmap-based allocator available on many systems. It
should be close to a drop-in replacement for dlmalloc. Conveniently, it
has both sbrk and mmap modes. We could use it in sbrk mode before
dumping and mmap mode afterward.

Longer-term, it would be nice to be able to compact objects. We could
move objects during the unmark phase of GC by looking for forwarding
pointers to new object locations. (Of course, objects found through
conservative scanning would have to be considered pinned.)

> I'm much more worried about: how
> on earth did it grow to 6GB?

I have no idea --- I was just doing normal editing over a few dozen files.

--------------090308020109070201020803
Content-Type: text/x-patch;
 name="memfrag.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="memfrag.patch"

=== modified file 'src/alloc.c'
--- src/alloc.c 2014-01-03 06:42:23 +0000
+++ src/alloc.c 2014-01-19 03:12:44 +0000
@@ -95,6 +95,11 @@
 
 #define MMAP_MAX_AREAS 100000000
 
+/* Specify the allocation size over which to request bytes from mmap
+   directly.  */
+
+#define MMAP_THRESHOLD (64*1024)
+
 #endif /* not DOUG_LEA_MALLOC */
 
 /* Mark, unmark, query mark bit of a Lisp string.  S must be a pointer
@@ -204,6 +209,13 @@
 static char *stack_copy;
 static ptrdiff_t stack_copy_size;
 
+/* True if we need to preserve memory regions for dumping.  */
+#ifdef CANNOT_DUMP
+#define might_dump 0
+#else
+static bool might_dump = true;
+#endif
+
 /* Copy to DEST a block of memory from SRC of size SIZE bytes,
    avoiding any address sanitization.  */
 
@@ -963,21 +975,10 @@
 #endif
 
 /* BLOCK_ALIGN has to be a power of 2.  */
-#define BLOCK_ALIGN (1 << 10)
+#define BLOCK_ALIGN (1 << 16)
 
-/* Padding to leave at the end of a malloc'd block.  This is to give
-   malloc a chance to minimize the amount of memory wasted to alignment.
-   It should be tuned to the particular malloc library used.
-   On glibc-2.3.2, malloc never tries to align, so a padding of 0 is best.
-   aligned_alloc on the other hand would ideally prefer a value of 4
-   because otherwise, there's 1020 bytes wasted between each ablocks.
-   In Emacs, testing shows that those 1020 can most of the time be
-   efficiently used by malloc to place other objects, so a value of 0 can
-   still preferable unless you have a lot of aligned blocks and virtually
-   nothing else.  */
-#define BLOCK_PADDING 0
 #define BLOCK_BYTES \
-  (BLOCK_ALIGN - sizeof (struct ablocks *) - BLOCK_PADDING)
+  (BLOCK_ALIGN - sizeof (struct ablocks *))
 
 /* Internal data structures and constants.  */
 
@@ -1001,11 +1002,6 @@
      (if not, the word before the first ablock holds a pointer to the
      real base).  */
   struct ablocks *abase;
-  /* The padding of all but the last ablock is unused.  The padding of
-     the last ablock in an ablocks is not allocated.  */
-#if BLOCK_PADDING
-  char padding[BLOCK_PADDING];
-#endif
 };
 
 /* A bunch of consecutive aligned blocks.  */
@@ -1015,7 +1011,7 @@
 };
 
 /* Size of the block requested from malloc or aligned_alloc.  */
-#define ABLOCKS_BYTES (sizeof (struct ablocks) - BLOCK_PADDING)
+#define ABLOCKS_BYTES (sizeof (struct ablocks))
 
 #define ABLOCK_ABASE(block) \
   (((uintptr_t) (block)->abase) <= (1 + 2 * ABLOCKS_SIZE) \
@@ -1062,7 +1058,8 @@
       /* Prevent mmap'ing the chunk.  Lisp data may not be mmap'ed
          because mapped region contents are not preserved in
          a dumped Emacs.  */
-      mallopt (M_MMAP_MAX, 0);
+      if (might_dump)
+        mallopt (M_MMAP_MAX, 0);
 #endif
 
 #ifdef USE_ALIGNED_ALLOC
@@ -1084,7 +1081,8 @@
 
 #ifdef DOUG_LEA_MALLOC
       /* Back to a reasonable maximum of mmap'ed areas.  */
-      mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
+      if (might_dump)
+        mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
 #endif
 
 #if ! USE_LSB_TAG
@@ -1728,14 +1726,16 @@
          mmap'ed data typically have an address towards the top of the
          address space, which won't fit into an EMACS_INT (at least on
          32-bit systems with the current tagging scheme).  --fx  */
-      mallopt (M_MMAP_MAX, 0);
+      if (might_dump)
+        mallopt (M_MMAP_MAX, 0);
 #endif
 
       b = lisp_malloc (size + GC_STRING_EXTRA, MEM_TYPE_NON_LISP);
 
 #ifdef DOUG_LEA_MALLOC
       /* Back to a reasonable maximum of mmap'ed areas.  */
-      mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
+      if (might_dump)
+        mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
 #endif
 
       b->next_free = b->data;
@@ -3039,7 +3039,8 @@
       /* Prevent mmap'ing the chunk.  Lisp data may not be mmap'ed
          because mapped region contents are not preserved in
          a dumped Emacs.  */
-      mallopt (M_MMAP_MAX, 0);
+      if (might_dump)
+        mallopt (M_MMAP_MAX, 0);
 #endif
 
       if (nbytes <= VBLOCK_BYTES_MAX)
@@ -3057,7 +3058,8 @@
 
 #ifdef DOUG_LEA_MALLOC
       /* Back to a reasonable maximum of mmap'ed areas.  */
-      mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
+      if (might_dump)
+        mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
 #endif
 
       consing_since_gc += nbytes;
@@ -6777,9 +6779,9 @@
 #endif
 
 #ifdef DOUG_LEA_MALLOC
-  mallopt (M_TRIM_THRESHOLD, 128 * 1024); /* Trim threshold.  */
-  mallopt (M_MMAP_THRESHOLD, 64 * 1024);  /* Mmap threshold.  */
-  mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);   /* Max. number of mmap'ed areas.  */
+  mallopt (M_TRIM_THRESHOLD, 2 * MMAP_THRESHOLD);
+  mallopt (M_MMAP_THRESHOLD, MMAP_THRESHOLD);
+  mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
 #endif
   init_strings ();
   init_vectors ();
@@ -6804,6 +6806,11 @@
 #if USE_VALGRIND
   valgrind_p = RUNNING_ON_VALGRIND != 0;
 #endif
+
+#ifndef CANNOT_DUMP
+  if (initialized)
+    might_dump = false;
+#endif
 }
 
 void
--------------090308020109070201020803--
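
The fragmentation scenario described in the message (a freed 40MB vector
pinned beneath a live cons block) can be illustrated with a toy model.
What follows is only a sketch under loud assumptions: it is neither
dlmalloc nor Emacs code, the names toy_heap, toy_alloc, toy_free and
toy_heap_bytes are hypothetical, and the model never reuses free chunks
in the middle of the heap, which a real allocator would. It shows only
why an sbrk-style heap cannot shrink past its topmost live chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CHUNKS 16

/* Toy brk-style heap (hypothetical, not dlmalloc): chunks are stacked
   in allocation order, and memory goes back to the OS only when the
   top of the stack is lowered.  */
struct toy_heap
{
  size_t size[TOY_MAX_CHUNKS];
  bool live[TOY_MAX_CHUNKS];
  int nchunks;
};

/* Push a chunk of NBYTES on top of the heap.  Return its index, or -1
   if the toy heap is full.  */
int
toy_alloc (struct toy_heap *h, size_t nbytes)
{
  if (h->nchunks == TOY_MAX_CHUNKS)
    return -1;
  h->size[h->nchunks] = nbytes;
  h->live[h->nchunks] = true;
  return h->nchunks++;
}

/* Mark chunk IDX free, then drop every free chunk sitting at the top
   of the heap: the only shrinking a brk-style heap can do.  */
void
toy_free (struct toy_heap *h, int idx)
{
  assert (0 <= idx && idx < h->nchunks);
  h->live[idx] = false;
  while (h->nchunks > 0 && !h->live[h->nchunks - 1])
    h->nchunks--;
}

/* Bytes the heap still holds from the OS, live or not.  */
size_t
toy_heap_bytes (const struct toy_heap *h)
{
  size_t total = 0;
  for (int i = 0; i < h->nchunks; i++)
    total += h->size[i];
  return total;
}
```

Starting from a zero-initialized struct toy_heap, allocating a 40MB
chunk and then a 1KB chunk, and freeing only the 40MB chunk, leaves
toy_heap_bytes at the full 40MB + 1KB. The total drops to zero only once
the 1KB chunk on top is freed as well. Serving the large chunk from a
separate mapping, as the patch permits after dumping, would avoid
stacking it under the small one in the first place.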