unofficial mirror of bug-gnu-emacs@gnu.org 
From: "Sheng Yang" <yangsheng6810@gmail.com>
To: Noam Postavsky <npostavs@gmail.com>
Cc: Paul Eggert <eggert@cs.ucla.edu>, 31995@debbugs.gnu.org
Subject: bug#31995: 26.1; Condition-case failed to catch error
Date: Thu, 12 Jul 2018 17:29:44 -0700
Message-ID: <6be07045-d79a-26a9-cd63-e2c294cd0187@gmail.com>
In-Reply-To: <53dc622c-b09f-2251-0a9f-854f55a5642d@gmail.com>

@Paul Eggert: I am cc-ing you because you are the author of commit
f0a1e9ec and may be more familiar with this topic.

Please ignore my previous email. I thought condition-case was able to
catch C stack overflow before commit f0a1e9ec, but that does not seem to
be the case, or at least it is not related to this bug.

After some code reading and debugging, I found the problem: in commit
f0a1e9ec, the read_buffer for read1 was moved from a static variable to
an array, stackbuf, of size MAX_ALLOCA located on the stack. MAX_ALLOCA
is defined to be 16 * 1024, so every recursion of read1 eats up 16KB of
stack, and thousands of recursions (not uncommon for a deeply nested
structure) quickly use up the whole stack and cause a stack overflow.
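
To put rough numbers on this, here is a minimal sketch of the
arithmetic. The helper names and the 8 MiB stack limit used in the note
below are my own assumptions for illustration (the real limit depends on
the platform and ulimit settings), and the estimate ignores everything
else that lives in each read1 frame.

```c
#include <stddef.h>

/* Per-frame buffer size quoted above (MAX_ALLOCA = 16 * 1024). */
enum { MAX_ALLOCA_BYTES = 16 * 1024 };

/* Lower bound on the stack consumed by `depth` nested calls when each
   frame carries a local buffer of `buffer_bytes` bytes.  */
size_t
stack_bytes_needed (size_t depth, size_t buffer_bytes)
{
  return depth * buffer_bytes;
}

/* Upper bound on the nesting depth that fits in `stack_limit` bytes,
   counting only the per-frame buffer.  */
size_t
max_depth (size_t stack_limit, size_t buffer_bytes)
{
  return stack_limit / buffer_bytes;
}
```

Under the assumed 8 MiB limit, max_depth (8 * 1024 * 1024,
MAX_ALLOCA_BYTES) is 512, while a 16-byte buffer would allow 524288
levels by the same estimate.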

One solution is to make stackbuf much smaller. I set it to 16, and this
bug disappeared. Though 16 may be too aggressive, 16 * 1024 is far too
big for a stack-based buffer in a function that may recurse thousands of
times. To make things worse, the buffer is a complete waste of space
when read1 is dealing with anything ("[", "]", "(", ")", "#", "=",
numbers, etc.) other than the name of a symbol (usually tens of
characters) or a string, which are the only cases where we would need a
really long buffer. A conservative choice would be a size somewhat above
40 or 80, making the buffer long enough to hold almost any symbol, as
people usually do not have symbols longer than half the width of a
terminal. A more aggressive choice is to remove the stack buffer
entirely and allocate it only on the heap. This comes at the cost of a
possible slowdown, because memory allocation on the heap is usually
slower than on the stack. None of this was a problem before commit
f0a1e9ec because the buffer was static and therefore reused by every
recursion of read1.
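
The "small stack buffer, heap only when needed" idea could look roughly
like the sketch below. This is not the code in lread.c: read_token,
SMALL_BUF and the used_heap flag are hypothetical names I made up, and
the function only copies a run of non-whitespace characters from a
string.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical small per-frame buffer, large enough for typical symbols. */
enum { SMALL_BUF = 64 };

/* Copy the leading run of non-whitespace characters of SRC into a
   freshly malloc'd, NUL-terminated string (the caller frees it).
   *USED_HEAP is set to 1 if the token outgrew the stack buffer.
   Returns NULL if memory allocation fails.  */
char *
read_token (const char *src, int *used_heap)
{
  char small[SMALL_BUF];
  char *buf = small;
  size_t cap = sizeof small, len = 0;
  *used_heap = 0;

  for (; *src != '\0' && !isspace ((unsigned char) *src); src++)
    {
      if (len == cap)
        {
          /* Token outgrew the current buffer: move to (or grow on) the heap. */
          size_t new_cap = cap * 2;
          char *grown = buf == small ? malloc (new_cap) : realloc (buf, new_cap);
          if (grown == NULL)
            {
              if (buf != small)
                free (buf);
              return NULL;
            }
          if (buf == small)
            memcpy (grown, small, len);
          buf = grown;
          cap = new_cap;
          *used_heap = 1;
        }
      buf[len++] = *src;
    }

  char *result = malloc (len + 1);
  if (result != NULL)
    {
      memcpy (result, buf, len);
      result[len] = '\0';
    }
  if (buf != small)
    free (buf);
  return result;
}
```

With this shape, each frame costs SMALL_BUF bytes of stack no matter how
long the token is, and only unusually long symbols or strings pay for a
heap allocation.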

For reference, MAX_ALLOCA is defined in src/lisp.h for SAFE_ALLOCA,
which allocates memory on the stack if the size is less than MAX_ALLOCA
and on the heap otherwise. The usage of SAFE_ALLOCA and its preparation
macro USE_SAFE_ALLOCA seems pretty complicated, and I was not able to
figure it out.
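
As I understand the general idea only, a stack-or-heap allocator needs
some per-function bookkeeping so that a heap allocation can be released
later, which would explain the need for a preparation macro. The sketch
below is a hypothetical analog, not the Emacs macros: USE_SCRATCH,
SCRATCH_ALLOC, SCRATCH_FREE and STACK_LIMIT are names I invented, it
supports only one allocation per function, and it relies on the
non-standard alloca from <alloca.h>.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: requests above this size go to the heap. */
enum { STACK_LIMIT = 16 * 1024 };

/* Declare the bookkeeping variable; must appear before SCRATCH_ALLOC. */
#define USE_SCRATCH void *scratch_heap_ = NULL

/* Small requests come from the stack, large ones from the heap.
   This must be a macro: memory obtained with alloca in a helper
   function would vanish when that helper returned.  */
#define SCRATCH_ALLOC(size) \
  ((size) <= STACK_LIMIT ? alloca (size) : (scratch_heap_ = malloc (size)))

/* Release the heap block, if any; free (NULL) is a no-op. */
#define SCRATCH_FREE() free (scratch_heap_)

/* Example user: copy TEXT into scratch memory and return its length.
   *ON_HEAP reports which allocator served the request.  */
size_t
copy_and_measure (const char *text, int *on_heap)
{
  USE_SCRATCH;
  size_t size = strlen (text) + 1;
  char *copy = SCRATCH_ALLOC (size);
  if (copy == NULL)
    return 0;
  memcpy (copy, text, size);
  size_t len = strlen (copy);
  *on_heap = scratch_heap_ != NULL;
  SCRATCH_FREE ();
  return len;
}
```

Note that even with this pattern, a request just under the threshold
still lands on the stack in every frame, so it would not by itself fix
the recursion problem described above.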

On 07/11/2018 10:46 PM, Sheng Yang (杨圣) wrote:
> condition-case was able to catch C stack overflow before commit
> f0a1e9ec. I understand that recovering from C stack overflow is
> magical and can be tricky, but emacs is capable of this thanks to all
> of your efforts. The only part missing is re-throwing this as a lisp
> exception, which should not be as hard as recovering from C stack
> overflow.
>
> Here is why this feature can be important. When we open a file,
> find-file-hook will call many functions, including but not limited to
> undo-tree. These functions read additional files (undo-tree, project
> file, dir-local, etc.) and perform tasks. To guard against file
> corruption and other problems, all reads are wrapped in some try-catch
> clause. However, the trust in these try-catch clauses is let down,
> and a single file corruption (or a file that can cause C stack
> overflow) ruins the whole process of loading a file with a mysterious
> message of "Recovered from C stack overflow". I don't think this is
> acceptable.
>
> From a lisp programmer's perspective, if exceptions should occur, they
> should be caught. This is exactly the behavior that condition-case and
> other try-catch clause promise.
>
> I am not an expert in C, and debugging the C part of emacs can be
> painful for me. Therefore I bisected and found the offending commits
> (see my original bug report). I hope this can help you pinpoint the
> problem and fix the bug.
>
> On 07/11/2018 02:48 PM, Noam Postavsky wrote:
>> retitle 31995 Condition-case can't catch C stack overflow
>> tags 31995 + wontfix
>> quit
>>
>> Sheng Yang (杨圣) <yangsheng6810@gmail.com> writes:
>>
>>> It seems that the function call ~(read (current-buffer))~ causes C stack
>>> overflow. Though I personally believe the undo-tree file is not
>>> corrupted, I assume this error should be caught by condition-case even
>>> if the file to read is indeed corrupted.
>> The file is not corrupted, it's just that the recursion goes too deep
>> during reading.  However, I don't think condition-case can reasonably
>> catch C stack overflow.  As it is, recovering from C stack overflow at
>> all is a bit controversial, which is why we have the
>> attempt-stack-overflow-recovery variable which you can set to nil in
>> order to reliably segfault instead.
>
> -- 
> Sheng Yang(杨圣)
> PhD student
> Computer Science Department
> University of Maryland, College Park
> E-mail:yangsheng6810@gmail.com

-- 
Sheng Yang(杨圣)
PhD student
Computer Science Department
University of Maryland, College Park
E-mail:yangsheng6810@gmail.com



Thread overview: 5+ messages
2018-06-28 16:54 bug#31995: 26.1; Condition-case failed to catch error Sheng Yang
2018-07-11 21:48 ` Noam Postavsky
2018-07-12  5:46   ` Sheng Yang
2018-07-13  0:29     ` Sheng Yang [this message]
2018-07-13  3:43       ` Paul Eggert
