From: Pip Cet
Newsgroups: gmane.emacs.devel
Subject: Re: MPS: out-of-memory
Date: Mon, 08 Jul 2024 19:44:21 +0000
References: <87msmu1uy5.fsf@gmail.com> <865xtg14hd.fsf@gnu.org>
To: Gerd Möllmann
Cc: Eli Zaretskii, acorallo@gnu.org, eller.helmut@gmail.com, emacs-devel@gnu.org

On Monday, July 8th, 2024 at 18:26, Gerd Möllmann wrote:
> Pip Cet pipcet@protonmail.com writes:
>
> > > Yes, I think that's important, and also handling the case of getting
> > > error codes from MPS in general.
> >
> > I think we might want to distinguish the two cases: MPS might die,
> > sure, but it looks like we need an out-of-memory mechanism separate
> > from MPS to avoid assertion violations.
>
> Could you please elaborate which assertions these are?

I meant the MPS assertion which fails when we run out of memory and there's a commit limit.
As that appears to happen independently of how the commit limit was set, I don't think we can have a commit limit at this point.

> > > As one can see, igc as a whole is in a certain state, igc_state. The
> > > idea is that the state changes to IGC_STATE_DEAD when something fatal
> > > happens. In that state, malloc is used for allocating Lisp
> > > objects instead of MPS. That lets Emacs shut down gracefully without
> > > entering MPS recursively as it was before I added the state.
> >
> > I've seen that work, and I've seen it not work. Better than nothing
> > :-)
>
> Right, if you have such a case that I can reproduce, please holler :-)

Sure. What I think I've seen is my code misbehaving and MPS aborting, while leaving memory in an unusable state. I'm not sure that can be avoided, to be honest.

> > > Such state changes are currently done when checking the return code of
> > > an MPS API function and when assertions fail.
> > >
> > > Seems to work as expected, as I could see in a couple of backtraces from
> > > Ihor I believe showing the set_state (IGC_STATE_DEAD), and from my own
> > > experience. But maybe someone has an idea how to improve it.
> >
> > If I understood Helmut correctly, he wants a mechanism to avoid
> > thrashing after exceeding the memory limit. Maybe we need a special
> > state for that, in which we stop the GC but continue using our memory
> > while the user quits and goes to buy more RAM?
>
> Could be. One could for example allocate a block from MPS that we make
> available to MPS as a first aid. I think that would be relatively easy.

I know that people have different opinions on swap space, particularly in these days of SSDs which can be worn out easily. I'm not sure there's a good way to detect thrashing even when using the Linux kernel, never mind across all the OSes we want to support.

> Using xzalloc in IGC_STATE_DEAD was just easiest to implement for me.
> I was mainly trying to make the error handling in igc understandable. I
> think introducing the states was not too bad for that. Also for that
> non-error state because of the staticpros, which becomes much clearer
> that way, I believe. Anyway.

I think it's great! I'd just like to extend it by one state so I don't have to go out and buy a new SSD when I mess up :-)

> > Speaking of memory issues in general, I'm currently seeing pure space
> > overflows, after changes which should affect weak objects exclusively.
> > Which is really odd, because weak objects aren't purecopied! Still
> > investigating that one, but it's possible I'll just give up and bump
> > pure space.
> >
> > In any case, we end up segfaulting because we still try to run GC
> > after pure space overflowed. I'll push a fix for that, but I can't
> > promise to test the case of overflowing pure space very much.
>
> Once more glad that I don't have pure space here :-).

Maybe we could at least drop it for MPS builds? :-) IIRC the arguments for keeping it were mostly about GC performance.

> (I think I've picked what I want to play with next today, BTW.
> https://github.com/actor-framework/actor-framework :-)

Cool, I'll try not to break the branch too much while you're distracted by something else. No promises though.

Pip
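
A minimal sketch of the state idea discussed in this thread, not the actual igc.c code. Every identifier prefixed with sketch_ or SKETCH_ is hypothetical; calloc stands in both for the MPS allocation path and for xzalloc; and the middle state is an assumption modelling the extra "stop the GC but keep using our memory" state proposed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical states, ordered from healthy to dead.  Only the idea of a
   "dead" state (IGC_STATE_DEAD) comes from the thread itself.  */
enum sketch_state
{
  SKETCH_STATE_USABLE, /* normal operation, collection allowed */
  SKETCH_STATE_NO_GC,  /* proposed: over the limit, collection stopped */
  SKETCH_STATE_DEAD    /* fatal error, the collector must not be entered */
};

static enum sketch_state sketch_current = SKETCH_STATE_USABLE;

/* State changes only move towards DEAD; a dead collector never revives.  */
static void
sketch_set_state (enum sketch_state next)
{
  if (next > sketch_current)
    sketch_current = next;
}

/* Collection is only allowed in the fully usable state.  */
static bool
sketch_may_collect (void)
{
  return sketch_current == SKETCH_STATE_USABLE;
}

/* Stand-in for the collector's allocator.  Returning NULL simulates an
   error code from the underlying memory manager.  */
static void *
sketch_collector_alloc (size_t size, bool simulate_failure)
{
  return simulate_failure ? NULL : calloc (1, size);
}

/* Allocate through the collector while it is alive.  On failure, switch
   to DEAD and fall back to plain zeroed heap memory, so shutdown code can
   still allocate without re-entering the collector.  */
static void *
sketch_alloc (size_t size, bool simulate_failure)
{
  if (sketch_current != SKETCH_STATE_DEAD)
    {
      void *p = sketch_collector_alloc (size, simulate_failure);
      if (p != NULL)
        return p;
      sketch_set_state (SKETCH_STATE_DEAD);
    }
  return calloc (1, size);
}
```

The one-way ordering of the enum is a design choice of this sketch: it makes "once dead, always dead" a one-line check, and it lets an intermediate no-GC state sit between usable and dead without special cases.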