From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Colascione
Newsgroups: gmane.emacs.devel
Subject: Re: Dynamic modules: MODULE_HANDLE_SIGNALS etc.
Date: Sun, 3 Jan 2016 13:28:24 -0800
Message-ID: <56899278.9000007@dancol.org>
References: <83mvu1x6t3.fsf@gnu.org> <567841A6.4090408@cs.ucla.edu>
 <567844B9.2050308@dancol.org> <5678CD07.8080209@cs.ucla.edu>
 <5678D3AF.7030101@dancol.org> <5678D620.6070000@cs.ucla.edu>
 <83mvt2qxm1.fsf@gnu.org> <56797CD9.8010706@cs.ucla.edu>
 <8337uuqsux.fsf@gnu.org> <5679DC83.70405@cs.ucla.edu>
 <83oadhp2mj.fsf@gnu.org> <567AD556.6020202@cs.ucla.edu>
 <567AD766.3060608@dancol.org> <567B5DAB.2000900@cs.ucla.edu>
 <83fuyromig.fsf@gnu.org> <567C25B1.3020101@dancol.org>
 <56892FD6.8040708@dancol.org> <568988EE.3010205@dancol.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="Nk3okN0jafT3IFtv9otJ40oRK9vwgjStp"
To: Eli Zaretskii, Paul Eggert, Emacs-devel@gnu.org
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:197506

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--Nk3okN0jafT3IFtv9otJ40oRK9vwgjStp
Content-Type: text/plain; charset=windows-1252

On 01/03/2016 01:07 PM, John Wiegley wrote:
>>>>>> Daniel Colascione writes:
>
>> It's not just a theoretical problem: I've spent lots of late nights staring
>> at stack traces, trying to figure out how a certain deadlock could be
>> possible, only to realize that the program had already crashed --- or would
>> have, if a seldom-tested bit of code hadn't checked for NULL and returned
>> without releasing a lock, causing a hang half an hour later.
>
> I see. Isn't what you describe an argument against error handling in general,
> though? It too can mask the origin of serious problems.

It is. There's a difference, however, between trying to paper over
undefined behavior generally and reporting well-defined errors using a
safe mechanism. (The former invalidates the system's own invariants,
while the latter invalidates only the application's invariants.) But
yes, error handling in general can paper over bugs, and I've certainly
seen Emacs bugs similarly exacerbated by attempting to ignore errors.

> What if we do this:
>
>   1. When a serious error occurs that engages crash recovery, we pop up a
>      window in Emacs describing that a serious error occurred that would have
>      crashed Emacs --and that *nothing* should be trusted now. All the user
>      should do is save critical buffers and exit immediately.

The call to Fdo_auto_save tries to do that already.
Fdo_auto_save isn't async-signal-safe, so I'd rather fork a child
process; in the child, call Fdo_auto_save and exit; have the parent wait
500ms for the child (not forever, in case the child deadlocks); kill the
child; and continue crashing. That, or provide a less elaborate,
async-signal-safe, pure C auto-save facility. In any case, control flow
shouldn't leave the signal handler when the application is in an
unpredictable state.

>   2. When in such a state, M-x report-emacs-bug automatically includes a trace
>      for the location where the crash occurred. Of course, this assumes Emacs
>      is still functional enough to send e-mail.
>
>> You're right that under Linux, programs need to prepare for the possibility
>> that they might suddenly cease to exist. We're talking about something
>> different here, which is the possibility that a program can *keep running*,
>> but in a damaged and undefined state.

Ideally, Emacs would, on crash (and after auto-save), spawn a copy of
itself with an error report pre-filled. Fork and exec work perfectly
fine in signal handlers.

> I was thinking the system itself is now running in a damaged and undefined
> state. When that happens, I often reboot since I can't really trust it
> anymore.
>
>> I'm worried that it'll be hard to know if it bites us, particularly since
>> the problems I'm imagining are infrequent, unreproducible, and carry no
>> obvious signature that would show up in a user crash report.
>
> If we use a window to pop up an alarm indicating, boldly, that Emacs is now
> UNSTABLE and should only be used to save files and exit -- maybe even noting
> how to abort Emacs to avoid typical cleanup actions -- we can start getting
> feedback on whether this feature really helps or hurts.

I think we need better crash reporting generally. Stack overflow is only
one instance of the general class of things that can go wrong.
But in any case, if we put Emacs into a state where the only thing a
user can do is save files, why not just save the files? There's no
guarantee that after a crash we can even display something.

> I understand error handlers can mask problems, and that they've made your life
> more difficult as an engineer concerned with uncovering such causes. However,
> I'm disinclined to accept, a priori, that it will hurt before trying it out.

We have no information on how often Emacs crashes in the hands of real
users or how it crashes. A wait-and-see approach is just blind faith.

Nobody has brought up why other programs don't work this way, either.
Other programs avoid this kind of hackery for good reasons, which I've
detailed. We shouldn't ignore the lessons of everyone else. It's not for
lack of inspiration that nobody else does this.

One question that neither you, nor Eli, nor Paul has answered is why we
would try to recover from stack overflow and not NULL dereferences.
Exactly the same arguments apply to both situations.

> When Emacs isn't being run under gdb (which it almost never is) it also
> doesn't give much useful information about what happened, and loses data. With
> the crash recovery logic, we should at least be able to provide a trace of
> where we were when the crash was detected, plus give the user a chance of
> reporting that data back to us. I see this as possibly *increasing* the amount
> of error information we receive, and not just masking or eliminating it.

Emacs should report its own crashes somehow *generally*, probably with
Breakpad.
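
For concreteness, here is a rough C sketch of the fork-and-timed-wait
auto-save described earlier in this message. It is an illustration under
assumptions, not Emacs code: emergency_auto_save, stuck_auto_save,
save_in_child_with_deadline and crash_handler are all hypothetical
names, emergency_auto_save merely stands in for a call to Fdo_auto_save,
and the 10ms polling interval is arbitrary. The parent side sticks to
calls that POSIX lists as async-signal-safe.

```c
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real auto-save routine (assumption:
   in Emacs this would be a call to Fdo_auto_save). */
void emergency_auto_save(void) { }

/* Hypothetical task that never returns; simulates a child that
   deadlocks while trying to save. */
void stuck_auto_save(void) { for (;;) pause(); }

/* Run SAVE in a forked child and give it roughly TIMEOUT_MS to finish.
   Returns 0 if the child exited on its own, 1 if it had to be killed,
   and -1 if fork failed. */
int save_in_child_with_deadline(void (*save)(void), int timeout_ms)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: the save itself need not be async-signal-safe; if it
           deadlocks, the parent kills us after the deadline. */
        save();
        _exit(0);
    }
    /* Parent: poll for the child's exit in 10 ms steps, never forever. */
    for (int waited = 0; waited < timeout_ms; waited += 10) {
        if (waitpid(child, NULL, WNOHANG) == child)
            return 0;
        poll(NULL, 0, 10);  /* sleep about 10 ms */
    }
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);  /* reap the killed child */
    return 1;
}

/* Fatal-signal handler: attempt the save, then continue crashing.
   Control flow never resumes the interrupted code. */
void crash_handler(int sig)
{
    save_in_child_with_deadline(emergency_auto_save, 500);
    signal(sig, SIG_DFL);  /* restore the default (fatal) action */
    raise(sig);            /* die with the original signal */
}
```

A handler like this could be installed with sigaction for SIGSEGV or
SIGABRT; catching stack overflow in particular would also need an
alternate signal stack set up with sigaltstack.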

--Nk3okN0jafT3IFtv9otJ40oRK9vwgjStp--