From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: MPS: a random backtrace while toying with gdb
Date: Wed, 03 Jul 2024 18:41:25 +0300
Message-ID: <865xtmbhzu.fsf@gnu.org>
References: <87bk3jh8bt.fsf@localhost> <86h6d9dlyg.fsf@gnu.org> <86h6d8c52h.fsf@gnu.org> <86sewrc057.fsf@gnu.org> <868qyibsp4.fsf@gnu.org>
In-Reply-To: (message from Pip Cet on Wed, 03 Jul 2024 14:35:16 +0000)
Cc: eller.helmut@gmail.com, gerd.moellmann@gmail.com, yantar92@posteo.net, emacs-devel@gnu.org
To: Pip Cet
Xref: news.gmane.io gmane.emacs.devel:321264

> Date: Wed, 03 Jul 2024 14:35:16 +0000
> From: Pip Cet
> Cc: eller.helmut@gmail.com, gerd.moellmann@gmail.com, yantar92@posteo.net, emacs-devel@gnu.org
>
> > I'd start with the first half of (1).  It is not clear to me that the
> > other part is needed, and in any case we need a reproducer for it
> > first.
>
> Ihor's original SIGPROF-based reproducer works if you revert the (SIGPROF-specific) workaround by Helmut. What makes you think it can't happen with other signals (which, naturally, aren't as frequent or badly-timed as SIGPROF, which strikes precisely when the CPU is active)?
>
> Obviously a reproducer is highly desirable in this case, but we shouldn't leave known and understood bugs in the code.

I agree, but you yourself said that without a test case we cannot test
any fixes in that area, and I agree with that as well.  So let's have
the test case first and analyze it, before discussing solutions.

> > Most of the crashes we've seen until now were not when the MPS
> > handler was running.
> > Also, didn't someone say that when the MPS
>
> Ihor's first crash backlog (MPS: profiler) clearly shows that was the case: https://lists.gnu.org/archive/html/emacs-devel/2024-06/msg00568.html
>
> > SIGSEGV handler is active, we could detect that from our code and
> > return doing nothing?
>
> Dropping the signal in the process.

No, setting a flag to 'raise' the same signal when we are back in our
code and in a safe environment (i.e., not called from the MPS SIGSEGV
handler).

Btw, an alternative would be to block the signals we care about while
in the MPS SIGSEGV handler in some way.  The way they install the
handler is very simple, see protsgix.c:ProtSetup.  Their code masks no
signals; we could instead mask the signals we care about.  As a POC,
I'd simply modify their code, but if that works, we could later
override their handler setup with ours or something.

> > Doing (2) adds a whole lot of complexity to Emacs.
>
> You're right.
>
> > Most importantly,
> > we will be unable to access Lisp data safely, unwind-protect and the
> > entire specpdl stuff generally cannot be used, and signaling an error
> > would be fatal.  So I'd rather avoid that.
>
> Very good points, though I wonder to what extent our current code is safe...

Well, I know from the MS-Windows experience that this is fraught with
pitfalls which took us some years to find and fix.  (On Windows, Emacs
uses additional threads for GUI I/O and for emulating SIGALRM and
SIGPROF.)
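
To illustrate the masking alternative mentioned above, here is a minimal
sketch using plain POSIX sigaction.  It is not the MPS code and not a
proposed patch: SIGUSR1 stands in for the SIGSEGV that MPS handles,
SIGUSR2 stands in for SIGPROF, and every function and variable name
below is hypothetical.  The only point is that a signal listed in
sa_mask stays pending while the handler runs and is delivered once the
handler has returned.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Assumption: SIGUSR1 plays the role of the SIGSEGV handled by MPS,
   SIGUSR2 plays the role of SIGPROF.  All names are illustrative.  */
static volatile sig_atomic_t in_barrier_handler = 0;
static volatile sig_atomic_t prof_ran_inside = 0;
static volatile sig_atomic_t prof_count = 0;

/* Stand-in for the SIGPROF handler: records whether it was entered
   while the barrier handler was still active.  */
static void
prof_handler (int sig)
{
  (void) sig;
  if (in_barrier_handler)
    prof_ran_inside = 1;
  prof_count++;
}

/* Stand-in for the MPS SIGSEGV handler.  The raise below simulates a
   badly-timed SIGPROF; because SIGUSR2 is in this handler's sa_mask,
   the signal only becomes pending here.  */
static void
barrier_handler (int sig)
{
  (void) sig;
  in_barrier_handler = 1;
  raise (SIGUSR2);
  in_barrier_handler = 0;
}

/* Install both handlers; the barrier handler masks SIGUSR2.  */
static int
install_handlers (void)
{
  struct sigaction sa;

  sa.sa_handler = prof_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction (SIGUSR2, &sa, NULL) != 0)
    return -1;

  sa.sa_handler = barrier_handler;
  sigemptyset (&sa.sa_mask);
  sigaddset (&sa.sa_mask, SIGUSR2);  /* block "SIGPROF" in the handler */
  sa.sa_flags = 0;
  return sigaction (SIGUSR1, &sa, NULL);
}

/* Return 0 if the masked signal ran exactly once and only after the
   barrier handler had finished, 1 otherwise, -1 on setup failure.  */
int
masked_handler_demo (void)
{
  if (install_handlers () != 0)
    return -1;
  raise (SIGUSR1);
  return (prof_count == 1 && !prof_ran_inside) ? 0 : 1;
}
```

Calling masked_handler_demo from a small main is enough to try this.
Whether the same trick is sufficient for the real MPS handler is
exactly what the POC would have to show.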