On Mon, Aug 31, 2015 at 2:31 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> Yes for this particular segfault.

Can you show a patch that fixes the original segfault in your use
case?

Attached. Note that either one of those changes should work. I'll test this patch some more using my original code and see whether it blows up.

I'm afraid I lost track, what with all the different scenarios
and potential solutions being thrown at this.  We should install the
fix, assuming it's clean.

I think we should fix three things:
 - concat shouldn't rely on its argument remaining unchanged in length
 - the timer list should be copied inside a block_input/unblock_input pair
 - we shouldn't call do_pending_window_change from QUIT [already installed. Thanks, martin!]

Any one of these is enough to prevent the original segfault. All but the second also prevent the bizarre-elisp-induced segfault I came up with later. And I'm perfectly happy for today with the number of hooks called from QUIT reduced by one, rather than insisting on reducing them to zero right away.

> No for similar segfaults that I think pose equally severe problems: if any other function calls concat/copy-sequence on data that is modified by window-configuration-change-hook, it should still be possible to produce the segfault.

Emacs gives you enough rope to hang yourself; there's nothing new
here.  We should strive to protect the Emacs internals so that they
won't cause segfaults, but in user code all bets are off, and "don't
do that" is a valid response to whoever does such things.

It's always good to know what the philosophy is behind the way the code works, so thank you for that, really.

> So it wouldn't even be safe for window-configuration-change-hook to add a timer to the timer list, because the outer frame might be in the middle of creating a copy of the timer list for some Lisp code that hasn't blocked input. (As in my example below)

Futzing with timers from within some hooks is indeed fundamentally
dangerous.

Well, doing anything from window-configuration-change-hook is dangerous. My idea was to schedule an immediate timer from it to get out of the danger zone to do the actual work, but that backfired...
 
But we should still try to minimize the probability of a
crash, especially when it's Emacs itself that makes the offending copy,
because people do dangerous things all the time, and expect them to
work.  In this case, blocking input should do, I think.

I agree.
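
To make sure we mean the same thing by "blocking input" around the copy, here is a toy C model of the idea. It is only a sketch, not the Emacs code: the toy_ functions, the timers array and run_handler are all made up for illustration, and the real block_input/unblock_input do more than this. The point it shows is that an event arriving in the middle of the copy is deferred until the block is released, so the list cannot change length under the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names are Emacs's real internals.  */
enum { MAX_TIMERS = 8 };

static int timers[MAX_TIMERS];   /* stand-in for the timer list */
static size_t n_timers;

static int input_blocked;        /* nesting depth of toy_block_input */
static bool event_pending;       /* an event arrived while blocked */

/* Stand-in for a hook that adds a timer when an event is handled.  */
static void
run_handler (void)
{
  if (n_timers < MAX_TIMERS)
    timers[n_timers++] = 99;
}

static void
toy_block_input (void)
{
  input_blocked++;
}

/* Release one level of blocking; run a deferred event once fully unblocked.  */
static void
toy_unblock_input (void)
{
  assert (input_blocked > 0);
  if (--input_blocked == 0 && event_pending)
    {
      event_pending = false;
      run_handler ();
    }
}

/* An event arrives: handle it now, or defer it if input is blocked.  */
static void
toy_event_arrives (void)
{
  if (input_blocked > 0)
    event_pending = true;
  else
    run_handler ();
}

/* Copy the timers into DST (capacity MAX_TIMERS) and return the count.
   If SIMULATE_EVENT is true, an event is injected in the middle of the copy.  */
static size_t
copy_timers_blocked (int *dst, bool simulate_event)
{
  toy_block_input ();
  size_t n = n_timers;            /* length measured before copying */
  if (simulate_event)
    toy_event_arrives ();         /* deferred, so the list stays unchanged */
  memcpy (dst, timers, n * sizeof *dst);
  toy_unblock_input ();           /* the deferred handler runs only here */
  return n;
}
```

With two timers in the array and an event injected mid-copy, the snapshot still holds exactly the two original entries, and the handler's new timer shows up only after the copy has finished.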

> I really don't think QUIT should run any Lisp hooks, to be honest.

I don't think this limitation could fly.  It will disable a lot of
useful use patterns, and the outcry will be loud and clear.

Okay.

> If I'm wrong and QUIT should be able to run Lisp hooks, concat needs to be fixed not to rely on its argument's size being unchanged after the make_sequence call.

That can't do any harm, so let's do it, too.

Cool.
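
For the record, this is roughly the shape of the concat change I have in mind, written as a standalone toy in C rather than against fns.c. Everything here is hypothetical: struct cell stands in for a Lisp list, and the hook argument stands in for whatever Lisp might run between measuring the argument and copying it. The copy loop is bounded both by the length measured earlier and by the actual end of the list, so a list that shrinks or grows in between cannot make it read or write out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a Lisp list; not Emacs's real Lisp_Object.  */
struct cell
{
  int value;
  struct cell *next;
};

/* Hypothetical hook that may mutate the list between measuring and copying.  */
typedef void (*hook_fn) (struct cell **list);

static size_t
list_length (const struct cell *c)
{
  size_t n = 0;
  for (; c; c = c->next)
    n++;
  return n;
}

/* Copy *LIST into a freshly allocated array and store the number of
   elements actually copied in *COPIED.  HOOK, if non-null, runs after
   the length has been measured and the result allocated.  */
static int *
copy_list_defensively (struct cell **list, hook_fn hook, size_t *copied)
{
  size_t measured = list_length (*list);
  int *out = calloc (measured ? measured : 1, sizeof *out);
  *copied = 0;
  if (!out)
    return NULL;

  if (hook)
    hook (list);                  /* the list may now be shorter or longer */

  /* Stop at whichever comes first: the end of the list as it is now,
     or the capacity that was allocated from the earlier measurement.  */
  size_t i = 0;
  for (const struct cell *c = *list; c && i < measured; c = c->next)
    out[i++] = c->value;
  assert (i <= measured);
  *copied = i;
  return out;
}

/* Example hook: drop the first element, shortening the list by one.  */
static void
drop_first (struct cell **list)
{
  if (*list)
    *list = (*list)->next;
}
```

Calling copy_list_defensively on a three-element list with drop_first as the hook copies the two remaining elements and reports a count of 2, instead of walking off the end of the list. The caller has to use the returned count rather than the length it measured itself.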
 
> As far as I can tell, that should be reproducible. Also as far as I can tell, it's merely a matter of luck that an X resize doesn't happen at the point where I interrupted the program to artificially trigger the segfault. However, I admit that it is a separate issue, less likely to occur in practice, and I'll open another bug for it if that's the way you prefer things.

But if input is blocked, as it would be in the case of copying
timer-list inside timer_check, the X events will not be acted upon,
and the problem will not happen, right?

Indeed, that relies on bizarre elisp code deliberately doing silly things...
 
IOW, the above situation is a case of a user shooting herself in the
foot by having that particular function in the hook and that
particular code that copies timer-list (which is an internal variable
unwise users should not touch).  Am I right?

I think you are. I'm not sure whether the timer code in timer.el does anything to the timer list that might count as dangerous, but that's possibly the only legitimate Lisp user of timer-list.