On 01/03/2016 11:15 AM, Eli Zaretskii wrote:
>> Cc: eggert@cs.ucla.edu, Emacs-devel@gnu.org
>> From: Daniel Colascione
>> Date: Sun, 3 Jan 2016 11:04:08 -0800
>>
>>> Yes, it is. You would like us to crash rather than try recovering.
>>> That is a very heavy price in Emacs.
>>
>> Why is it uniquely unacceptable in Emacs? Why do other programs that
>> fill the same niche not employ this strategy?
>
> Not many other programs run for so long and have so much precious data
> for their users. Besides, who says there are no other programs that
> do this? libsigsegv wasn't written as an academic exercise.

Many other programs run as long. One example is the Linux kernel, which
panics on stack overflow.

>> Why do we not try to mitigate NULL pointer dereferences (to which
>> all the same arguments apply)?
>
> We do: we catch SIGSEGV and try to save what can be salvaged.

Invoking auto-save after resetting SIGSEGV is a good application of that
approach. (We should make sure that control flow can't leave the sigsegv
handler.) What's dangerous is allowing Emacs to continue running after
we've detected that it's entered a bad state. I'm not against installing
a sigsegv handler: I'm against returning control flow to toplevel.

>>>> My point isn't that memory leaks are disastrous. It's that the
>>>> consequences of this code weren't given due consideration at the time it
>>>> was committed.
>>>
>>> You have absolutely no evidence that this wasn't considered. It's
>>> factually incorrect. You don't have to know that it's incorrect, but
>>> I would expect you to give more credit to our collective knowledge and
>>> experience than you evidently do.
>>
>> I searched the mailing list and saw no discussion of the points I
>> raised.
>
> Who said that considerations must be in public discussions? On the
> contrary, I'd rather take the lack of discussions as an indication
> that this was considered and no one saw any problem with it.

The absence of discussion is consistent with both my view and
widespread, sagacious approval. Given the concerns I raised, the more
parsimonious explanation is that the code went in without review,
because even if you and Paul are right, it's worth having a conversation
about the dangers of the code, and AFAICT, there was none.

>>>>> You are not objective, so you exaggerate the risks and dismiss the
>>>>> benefits.
>>>>
>>>> I disagree that there *are* significant benefits.
>>>
>>> Of course, you do. Like I said: your bias affects your judgment.
>>
>> So does yours.
>
> No, I acknowledge the risks. You don't acknowledge the benefits.

The benefit is that returning control to toplevel allows the user to
save data in buffers where autosave is not enabled. I think the benefit
is slight. Autosave is the only mechanism that protects against other
failure modes, like the OOM killer, NULL pointer dereferences, and
sudden power loss. Consequently, I strongly suspect that any truly
precious data is in autosave buffers and that this stack overflow
mitigation in practice allows the recovery of nothing important.

>>> It's not undefined behavior, not in practice. We know quite well what
>>> can and cannot happen.
>>
>> No you don't, because we can longjmp out of third-party code
>
> FUD. What "third-party code"? Any code we use in Emacs has its
> sources open for scrutiny.

First of all, it's perfectly legal to update libc to a version that
wasn't around for a particular Emacs release, and this libc (which is
perfectly conforming under _legitimate_ API use) might have problems
with the Emacs recovery scheme that we didn't and couldn't anticipate.

Also, third-party libraries are generally written under the assumption
that control isn't yanked from under them partway through delicate
operations. I don't think it's reasonable to expect that every library
Emacs uses be robust under this kind of abuse.

>>> Anyway, saying that "unpleasant things can happen" _is_ FUD.
>>> I want to see a single bug report about these unpleasant things
>>> happening in real use, then I'll start thinking whether I should
>>> reconsider.
>>
>> And I want to see a real bug report about the stack overflow we're
>> trying to defend against.
>
> We've been through that already: if stack overflow never happens, the
> recovery code can never cause any problems.

Given that stack overflow is rare, we won't get to test the scenario
much. We should err on the side of making Emacs behave predictably
instead of trying to recover using undefined behavior, because if the
recovery causes problems, it'll be hard to tell.

>> The failure mode here wouldn't be obvious either: Emacs could just
>> silently crash, hang, or write a wrong byte or two to a file.
>
> Neither of which is a disaster.

Neither of which will produce a bug report blaming this code, so the
lack of bug reports is not positive evidence that this code is harmless.

>> You have no idea what might happen, which is especially concerning
>> because Emacs is frequently an internet-facing network program parsing
>> untrusted data.
>
> All I want is to take every measure to avoid losing work. Every other
> problem was already there before stack-overflow recovery was added.

I agree that we should avoid losing work. The way to do that is to beef
up autosave so that after a crash, we can recover quickly. That's the
approach other long-running programs with precious user data, like
Office, Visual Studio, Firefox, and vim, use.
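
For concreteness, here is a rough sketch of the "save what can be
salvaged, then die" handler shape argued for above. It is not Emacs's
actual code: every name in it (alt_stack, on_segv, install_crash_saver,
crash_demo) is hypothetical, and the "buffer" is just a string. It
assumes a POSIX system. The handler runs on an alternate stack so it
still works when the main stack is exhausted, uses only
async-signal-safe calls, and never returns control to the interrupted
code.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for editor state; none of these are Emacs names. */
static char alt_stack[64 * 1024];   /* handler stack, separate from the main stack */
static const char *save_path;       /* where the emergency copy is written */
static const char *unsaved_text;    /* the "buffer" contents to salvage */
static size_t unsaved_len;

/* Write the emergency copy using only async-signal-safe calls, then die.
   Control never goes back to the code that faulted. */
static void on_segv(int sig)
{
    int fd = open(save_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd >= 0) {
        ssize_t ignored = write(fd, unsaved_text, unsaved_len);
        (void) ignored;
        close(fd);
    }
    /* SA_RESETHAND has already restored the default action and SA_NODEFER
       leaves the signal unblocked, so re-raising terminates the process. */
    raise(sig);
    _exit(1);   /* fallback in case the re-raise somehow did not kill us */
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_crash_saver(const char *path, const char *text)
{
    save_path = path;
    unsaved_text = text;
    unsaved_len = strlen(text);

    stack_t ss;
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    /* One-shot handler on the alternate stack: a second fault, even one
       inside the handler itself, gets the default (fatal) action. */
    sa.sa_flags = SA_ONSTACK | SA_RESETHAND | SA_NODEFER;
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Fork a child that installs the handler and then "crashes".
   Returns the signal that killed the child, or -1 on any other outcome. */
int crash_demo(const char *path, const char *text)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_crash_saver(path, text) != 0)
            _exit(2);
        raise(SIGSEGV);   /* simulated fault, to keep the demo deterministic */
        _exit(3);         /* reached only if the handler failed to terminate us */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling crash_demo("/tmp/demo.sav", "precious data") should report that
the child died from SIGSEGV and leave the text in the save file. The
fault is simulated with raise() rather than real runaway recursion so
the demo behaves the same regardless of stack limits; the alternate
stack only matters for a genuine overflow. The point of the shape is
that the process still dies after the save, so nothing keeps running in
a state already known to be bad.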