From mboxrd@z Thu Jan 1 00:00:00 1970
From: Trevor Bentley
Newsgroups: gmane.emacs.bugs
Subject: bug#43389: 28.0.50; Emacs memory leaks
Date: Wed, 11 Nov 2020 22:15:21 +0100
Message-ID: <871rgzvbme.fsf@mail.trevorbentley.com>
References: <87r1r5428d.fsf@web.de> <874kmcvlbj.fsf@mail.trevorbentley.com> <83imasb0te.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
To: Eli Zaretskii
Cc: 43389@debbugs.gnu.org
In-Reply-To: <83imasb0te.fsf@gnu.org>
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:193136

> Thanks.  This trace doesn't show how many bytes were allocated, does it?
> Without that it is hard to judge whether these GnuTLS calls could
> be the culprit.  Because the full trace shows other calls to
> malloc, for example this:

It doesn't show the size of the individual allocations, but it
indirectly shows the size of the heap.  Each brk() line like this
one is the start of an entry:

    0.000000 brk(0x55f5ed93e000) = 0x55f5ed93e000

The first field is the relative time since the last brk() call, and
the argument in parentheses is the requested new end of the heap
(the program break), not a size.  Subtracting the argument of the
previous call from the argument of the current call shows how much
the heap has been extended.  In this capture, subtracting the first
from the last shows that the heap grew by 8,683,520 bytes, and
summing the relative timestamps shows that this happened in 90.71
seconds.  It's growing at a bit under 100KB/sec at this point.

Also, keep in mind that this is brk().  There could have been any
number of malloc() calls in between, zero or millions, but these are
the ones that couldn't find any unused blocks and had to extend the
heap.

> I'm not sure how Emacs could be the culprit here.  If GnuTLS is
> the culprit (and as explained above, this is not certain at this
> point), perhaps upgrading to a newer GnuTLS version or reporting
> this to GnuTLS developers would allow some progress.

I think you are right: GnuTLS was probably a symptom, not a cause.

I took a while to respond because I tried running Emacs under
Valgrind's Massif heap profiling tool, and it took forever.  Some
results are in now, and it looks like GnuTLS wasn't present in the
leak this time around.

First of all, if you aren't familiar with Massif (as I wasn't), it
captures occasional snapshots of the whole heap and all allocations,
and lets you dump a tree view of those allocations later with the
"ms_print" tool.  The timestamps are fairly useless, as they are in
"number of instructions executed."
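Returning to the brk() arithmetic described earlier in this message,
here is a minimal sketch of how the growth and rate could be computed
from a trace.  It assumes each relevant line has the shape shown
above (relative timestamp, then a brk() call with a hex argument and
hex result).  The function name heap_growth and the second and third
sample lines are made up for illustration; only the first sample line
comes from the actual trace.

```python
import re

# Matches lines such as "0.000000 brk(0x55f5ed93e000) = 0x55f5ed93e000".
# brk(NULL) queries do not match and are skipped, since they only read
# the current break and never move it.
BRK_LINE = re.compile(
    r"^\s*(\d+\.\d+)\s+brk\((0x[0-9a-fA-F]+)\)\s*=\s*(0x[0-9a-fA-F]+)"
)


def heap_growth(lines):
    """Return (bytes_grown, seconds_elapsed, bytes_per_second)."""
    breaks = []
    elapsed = 0.0
    for line in lines:
        m = BRK_LINE.match(line)
        if not m:
            continue
        # The first entry's timestamp covers time before the capture
        # window, so only later entries are added to the total.
        if breaks:
            elapsed += float(m.group(1))
        # Use the returned break, which is what the kernel actually set.
        breaks.append(int(m.group(3), 16))
    if len(breaks) < 2:
        return 0, 0.0, 0.0
    grown = breaks[-1] - breaks[0]
    rate = grown / elapsed if elapsed > 0 else 0.0
    return grown, elapsed, rate


# First line is from the trace; the other two are hypothetical.
sample = [
    "     0.000000 brk(0x55f5ed93e000) = 0x55f5ed93e000",
    "     1.500000 brk(0x55f5ed95f000) = 0x55f5ed95f000",
    "     0.500000 brk(0x55f5ed980000) = 0x55f5ed980000",
]

grown, elapsed, rate = heap_growth(sample)
print(f"heap grew {grown} bytes in {elapsed:.2f}s ({rate:.0f} bytes/sec)")
```

Applied to the figures quoted above, 8,683,520 bytes over 90.71
seconds works out to roughly 95.7 KB/sec.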
Here are three files from my investigation:

The raw massif output:

    http://trevorbentley.com/massif.out.3364630

The *full* tree output:

    http://trevorbentley.com/ms_print.3364630.txt

The tree output showing only entries above 10% usage:

    http://trevorbentley.com/ms_print.thresh10.3364630.txt

What you can see from the handy ASCII graph at the top is that
memory usage was chugging along, growing upwards for a couple of
days, and then spiked very quickly up to just over 4GB over a few
hours.

If you scroll down to the very last checkpoint (the 10% threshold
file is better for this), you can see where most of the memory is
used.  The sums are very large, but they come from different
sources:

    1.7GB from lisp_align_malloc (nearly all from Fcons)
    1.4GB from lmalloc (half from allocate_vector_block)
    700MB from lrealloc (mostly from enlarge_buffer_text)

There were no large buffers open, but there were long-lived network
sockets and plenty of timers.  I didn't check, but I'd say the
largest buffer was up to a couple of megabytes, since emacs-slack
logs fairly heavily.

I'm not sure what to make of this, really.  It seems like a general,
sudden-onset, intense craving for more memory while not particularly
doing much.  I could blindly suggest extreme memory fragmentation
problems, but that doesn't seem very likely.

It's trivial to reproduce, but takes 3-5 days, so not exactly handy
to debug.  Let me know if you have any requests for the next
iteration before I kill it.  It's running in Valgrind again.

Thanks,

-Trevor
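For anyone who would rather read the per-snapshot totals straight
from the raw massif output than from the ms_print graph, here is a
rough sketch.  It assumes the usual massif.out layout, where each
snapshot starts with a "snapshot=N" line followed by key=value lines
such as time, mem_heap_B, mem_heap_extra_B and mem_stacks_B.  The
function names and the sample data are hypothetical and are not
taken from the files linked above.

```python
# Numeric per-snapshot fields to keep; tree lines and heap_tree= are ignored.
FIELDS = ("time", "mem_heap_B", "mem_heap_extra_B", "mem_stacks_B")


def parse_snapshots(lines):
    """Return one dict per snapshot with its numeric fields."""
    snapshots = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("snapshot="):
            current = {"snapshot": int(line.split("=", 1)[1])}
            snapshots.append(current)
        elif current is not None:
            key, sep, value = line.partition("=")
            if sep and key in FIELDS:
                current[key] = int(value)
    return snapshots


def peak_snapshot(snapshots):
    """Return the snapshot with the largest mem_heap_B, or None if empty."""
    if not snapshots:
        return None
    return max(snapshots, key=lambda s: s.get("mem_heap_B", 0))


# Made-up sample in the assumed format, not real measurements.
sample = """desc: (none)
cmd: emacs
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=1000
mem_heap_B=4096
mem_heap_extra_B=16
mem_stacks_B=0
heap_tree=detailed
n0: 4096 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
#-----------
snapshot=2
#-----------
time=2500
mem_heap_B=2048
mem_heap_extra_B=8
mem_stacks_B=0
heap_tree=empty
""".splitlines()

snaps = parse_snapshots(sample)
peak = peak_snapshot(snaps)
print(f"{len(snaps)} snapshots, peak heap {peak['mem_heap_B']} bytes "
      f"at snapshot {peak['snapshot']}")
```

With a real file, the lines would come from something like
open("massif.out.3364630"), and the time field would be in
instructions executed, as noted above.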