From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.bugs Subject: bug#23019: parse-partial-sexp doesn't output the full state needed for its continuance. Date: Thu, 17 Mar 2016 21:49:34 +0000 Message-ID: <20160317214934.GB9038@acm.fritz.box> References: <20160315091355.GA2263@acm.fritz.box> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1458251246 5694 80.91.229.3 (17 Mar 2016 21:47:26 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 17 Mar 2016 21:47:26 +0000 (UTC) Cc: 23019@debbugs.gnu.org To: Stefan Monnier Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Thu Mar 17 22:47:16 2016 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1agflD-00037J-Ik for geb-bug-gnu-emacs@m.gmane.org; Thu, 17 Mar 2016 22:47:15 +0100 Original-Received: from localhost ([::1]:40069 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1agflD-00073s-5J for geb-bug-gnu-emacs@m.gmane.org; Thu, 17 Mar 2016 17:47:15 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:32909) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1agfl6-0006vy-1k for bug-gnu-emacs@gnu.org; Thu, 17 Mar 2016 17:47:11 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1agfl0-0007xw-S6 for bug-gnu-emacs@gnu.org; Thu, 17 Mar 2016 17:47:07 -0400 Original-Received: from debbugs.gnu.org ([208.118.235.43]:54385) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1agfl0-0007xr-NS for bug-gnu-emacs@gnu.org; Thu, 17 Mar 2016 17:47:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1agfl0-0006XE-BX for bug-gnu-emacs@gnu.org; Thu, 17 Mar 2016 17:47:02 -0400 X-Loop: help-debbugs@gnu.org Resent-From: Alan Mackenzie Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 17 Mar 2016 21:47:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 23019 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: Original-Received: via spool by 23019-submit@debbugs.gnu.org id=B23019.145825121625106 (code B ref 23019); Thu, 17 Mar 2016 21:47:02 +0000 Original-Received: (at 23019) by debbugs.gnu.org; 17 Mar 2016 21:46:56 +0000 Original-Received: from localhost ([127.0.0.1]:51512 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1agfkt-0006Wr-G0 for submit@debbugs.gnu.org; Thu, 17 Mar 2016 17:46:56 -0400 Original-Received: from mail.muc.de ([193.149.48.3]:25683) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1agfkr-0006Wj-GC for 23019@debbugs.gnu.org; Thu, 17 Mar 2016 17:46:54 -0400 Original-Received: (qmail 43228 invoked by uid 3782); 17 Mar 2016 21:46:52 -0000 Original-Received: from acm.muc.de (p548A5932.dip0.t-ipconnect.de [84.138.89.50]) by colin.muc.de (tmda-ofmipd) with ESMTP; Thu, 17 Mar 2016 22:46:47 +0100 Original-Received: (qmail 26919 invoked by uid 1000); 17 Mar 2016 21:49:34 -0000 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Delivery-Agent: TMDA/1.1.12 (Macallan) X-Primary-Address: acm@muc.de X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 208.118.235.43 X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Original-Sender: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.bugs:115013 Archived-At: On Thu, Mar 17, 2016 at 08:58:27AM -0400, Stefan Monnier wrote: > > Proposed solution: Add an extra element to the parser state, recording the > > syntax of the last character passed over before the end of the parse. > > This would be used by parse-partial-sexp to initialise its parse. > Another option is to record "the start of current element" (in case we > were in the middle of an element). This could potentially reuse (nth > 5 ppss) by generalizing it, or it could use a new entry. > The choice probably doesn't matter much and will probably be more > a question of "what's easier to implement". > > Also: the existing element 9 (the list of currently open parens) and the > > new element should be explicitly documented in the Elisp manual, together > > with a statement that there may be further elements in the parse state > > used internally by parse-partial-sexp (for future expansion). > Indeed. OK, I've got a patch ready. It's bigger than anticipated, purely because it also does some refactoring. It actually adds two elements to the parser state, and I believe that makes the parser state complete. Here's the patch: Enhance parse-partial-sexp correctly to handle two character commit delimiters Do this by adding two new fields to the parser state: the syntax of the last character scanned, and the last end of comment scanned. This should make the parser state complete. Also document element 9 of the parser state. Also refactor the code a bit. * src/syntax.c (struct lisp_parse_state): Add two new fields. (internalize_parse_state): New function, extracted from scan_sexps_forward. (back_comment): Call internalize_parse_state. (forw_comment): Return the syntax of the last character scanned to the caller. (Fforward_comment, scan_lists): New dummy variables, passed to forw_comment. (scan_sexps_forward): Remove a redundant state parameter. Access all `state' information via the address parameter `state'. Remove the code which converts from external to internal form of `state'. Access buffer contents only from `from' onwards. Reformulate code at the top of the main loop correctly to recognize comment openers when starting in the middle of one. Call forw_comment with extra argument (for return of final syntax value). (Fparse_partial_sexp): Document elements 9, 10, 11 of the parser state in the doc string. Clarify the doc string in general. Call internalize_parse_state. Take account of the new elements when consing up the output parser state. * doc/lispref/syntax.texi: (Parser State): Document element 9 and the new elements 10 and 11. Minor wording corrections (remove reference to "trivial cases"). (Low Level Parsing): Minor corrections diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi index d5a7eba..67a00d7 100644 --- a/doc/lispref/syntax.texi +++ b/doc/lispref/syntax.texi @@ -791,10 +791,10 @@ Parser State @subsection Parser State @cindex parser state - A @dfn{parser state} is a list of ten elements describing the state -of the syntactic parser, after it parses the text between a specified -starting point and a specified end point in the buffer. Parsing -functions such as @code{syntax-ppss} + A @dfn{parser state} is a list of (currently) twelve elements +describing the state of the syntactic parser, after it parses the text +between a specified starting point and a specified end point in the +buffer. Parsing functions such as @code{syntax-ppss} @ifnottex (@pxref{Position Parse}) @end ifnottex @@ -851,15 +851,21 @@ Parser State this element is @code{nil}. @item -Internal data for continuing the parsing. The meaning of this -data is subject to change; it is used if you pass this list -as the @var{state} argument to another call. +The list of the positions of the currently open parentheses, starting +with the outermost. + +@item +The @var{syntax-code} (@pxref{Syntax Table Internals}) of the last +buffer position scanned, or @code{nil} if no scanning has happened. + +@item +The position after the previous end of comment, or @code{nil} if the +scanning has not passed a comment end. @end enumerate Elements 1, 2, and 6 are ignored in a state which you pass as an -argument to continue parsing, and elements 8 and 9 are used only in -trivial cases. Those elements are mainly used internally by the -parser code. +argument to continue parsing. Elements 9 to 11 are mainly used +internally by the parser code. One additional piece of useful information is available from a parser state using this function: @@ -898,11 +904,11 @@ Low-Level Parsing If the fourth argument @var{stop-before} is non-@code{nil}, parsing stops when it comes to any character that starts a sexp. If -@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the -start of an unnested comment. If @var{stop-comment} is the symbol +@var{stop-comment} is non-@code{nil}, parsing stops after the start of +an unnested comment. If @var{stop-comment} is the symbol @code{syntax-table}, parsing stops after the start of an unnested -comment or a string, or the end of an unnested comment or a string, -whichever comes first. +comment or a string, or after the end of an unnested comment or a +string, whichever comes first. If @var{state} is @code{nil}, @var{start} is assumed to be at the top level of parenthesis structure, such as the beginning of a function diff --git a/src/syntax.c b/src/syntax.c index 249d0d5..11b1ff0 100644 --- a/src/syntax.c +++ b/src/syntax.c @@ -153,6 +153,9 @@ struct lisp_parse_state ptrdiff_t comstr_start; /* Position of last comment/string starter. */ Lisp_Object levelstarts; /* Char numbers of starts-of-expression of levels (starting from outermost). */ + int prev_syntax; /* Syntax of previous character scanned, or Smax. */ + ptrdiff_t prev_comment_end; /* Position after end of last closed + comment, or -1. */ }; /* These variables are a cache for finding the start of a defun. @@ -176,7 +179,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object); static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool); static void scan_sexps_forward (struct lisp_parse_state *, ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT, - bool, Lisp_Object, int); + bool, int); +static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *); static bool in_classes (int, Lisp_Object); static void parse_sexp_propertize (ptrdiff_t charpos); @@ -911,10 +915,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop, } do { + internalize_parse_state (Qnil, &state); scan_sexps_forward (&state, defun_start, defun_start_byte, comment_end, TYPE_MINIMUM (EMACS_INT), - 0, Qnil, 0); + 0, 0); defun_start = comment_end; if (!adjusted) { @@ -2314,7 +2319,9 @@ in_classes (int c, Lisp_Object iso_classes) into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR. Else, return false and store the charpos STOP into *CHARPOS_PTR, the corresponding bytepos into *BYTEPOS_PTR and the current nesting - (as defined for state.incomment) in *INCOMMENT_PTR. + (as defined for state->incomment) in *INCOMMENT_PTR. The + SYNTAX_WITH_FLAGS of the last character scanned in the comment is + stored into *last_syntax_ptr. The comment end is the last character of the comment rather than the character just after the comment. @@ -2326,7 +2333,7 @@ static bool forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop, EMACS_INT nesting, int style, int prev_syntax, ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr, - EMACS_INT *incomment_ptr) + EMACS_INT *incomment_ptr, int *last_syntax_ptr) { register int c, c1; register enum syntaxcode code; @@ -2346,6 +2353,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop, *incomment_ptr = nesting; *charpos_ptr = from; *bytepos_ptr = from_byte; + *last_syntax_ptr = syntax; return 0; } c = FETCH_CHAR_AS_MULTIBYTE (from_byte); @@ -2415,6 +2423,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop, } *charpos_ptr = from; *bytepos_ptr = from_byte; + *last_syntax_ptr = syntax; return 1; } @@ -2436,6 +2445,7 @@ between them, return t; otherwise return nil. */) EMACS_INT count1; ptrdiff_t out_charpos, out_bytepos; EMACS_INT dummy; + int dummy2; CHECK_NUMBER (count); count1 = XINT (count); @@ -2499,7 +2509,7 @@ between them, return t; otherwise return nil. */) } /* We're at the start of a comment. */ found = forw_comment (from, from_byte, stop, comnested, comstyle, 0, - &out_charpos, &out_bytepos, &dummy); + &out_charpos, &out_bytepos, &dummy, &dummy2); from = out_charpos; from_byte = out_bytepos; if (!found) { @@ -2659,6 +2669,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag) ptrdiff_t from_byte; ptrdiff_t out_bytepos, out_charpos; EMACS_INT dummy; + int dummy2; bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol; if (depth > 0) min_depth = 0; @@ -2755,7 +2766,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag) UPDATE_SYNTAX_TABLE_FORWARD (from); found = forw_comment (from, from_byte, stop, comnested, comstyle, 0, - &out_charpos, &out_bytepos, &dummy); + &out_charpos, &out_bytepos, &dummy, + &dummy2); from = out_charpos, from_byte = out_bytepos; if (!found) { @@ -3119,7 +3131,7 @@ the prefix syntax flag (p). */) } /* Parse forward from FROM / FROM_BYTE to END, - assuming that FROM has state OLDSTATE (nil means FROM is start of function), + assuming that FROM has state STATE (nil means FROM is start of function), and return a description of the state of the parse at END. If STOPBEFORE, stop at the start of an atom. If COMMENTSTOP is 1, stop at the start of a comment. @@ -3127,12 +3139,11 @@ the prefix syntax flag (p). */) after the beginning of a string, or after the end of a string. */ static void -scan_sexps_forward (struct lisp_parse_state *stateptr, +scan_sexps_forward (struct lisp_parse_state *state, ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end, EMACS_INT targetdepth, bool stopbefore, - Lisp_Object oldstate, int commentstop) + int commentstop) { - struct lisp_parse_state state; enum syntaxcode code; int c1; bool comnested; @@ -3148,7 +3159,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr, Lisp_Object tem; ptrdiff_t prev_from; /* Keep one character before FROM. */ ptrdiff_t prev_from_byte; - int prev_from_syntax; + int prev_from_syntax, prev_prev_from_syntax; bool boundary_stop = commentstop == -1; bool nofence; bool found; @@ -3165,6 +3176,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr, do { prev_from = from; \ prev_from_byte = from_byte; \ temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte); \ + prev_prev_from_syntax = prev_from_syntax; \ prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \ INC_BOTH (from, from_byte); \ if (from < end) \ @@ -3174,88 +3186,38 @@ do { prev_from = from; \ immediate_quit = 1; QUIT; - if (NILP (oldstate)) - { - depth = 0; - state.instring = -1; - state.incomment = 0; - state.comstyle = 0; /* comment style a by default. */ - state.comstr_start = -1; /* no comment/string seen. */ - } - else - { - tem = Fcar (oldstate); - if (!NILP (tem)) - depth = XINT (tem); - else - depth = 0; - - oldstate = Fcdr (oldstate); - oldstate = Fcdr (oldstate); - oldstate = Fcdr (oldstate); - tem = Fcar (oldstate); - /* Check whether we are inside string_fence-style string: */ - state.instring = (!NILP (tem) - ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE) - : -1); - - oldstate = Fcdr (oldstate); - tem = Fcar (oldstate); - state.incomment = (!NILP (tem) - ? (INTEGERP (tem) ? XINT (tem) : -1) - : 0); - - oldstate = Fcdr (oldstate); - tem = Fcar (oldstate); - start_quoted = !NILP (tem); + depth = state->depth; + start_quoted = state->quoted; + prev_prev_from_syntax = Smax; + prev_from_syntax = state->prev_syntax; - /* if the eighth element of the list is nil, we are in comment - style a. If it is non-nil, we are in comment style b */ - oldstate = Fcdr (oldstate); - oldstate = Fcdr (oldstate); - tem = Fcar (oldstate); - state.comstyle = (NILP (tem) - ? 0 - : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE) - ? XINT (tem) - : ST_COMMENT_STYLE)); - - oldstate = Fcdr (oldstate); - tem = Fcar (oldstate); - state.comstr_start = - RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1; - oldstate = Fcdr (oldstate); - tem = Fcar (oldstate); - while (!NILP (tem)) /* >= second enclosing sexps. */ - { - Lisp_Object temhd = Fcar (tem); - if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX)) - curlevel->last = XINT (temhd); - if (++curlevel == endlevel) - curlevel--; /* error ("Nesting too deep for parser"); */ - curlevel->prev = -1; - curlevel->last = -1; - tem = Fcdr (tem); - } + tem = state->levelstarts; + while (!NILP (tem)) /* >= second enclosing sexps. */ + { + Lisp_Object temhd = Fcar (tem); + if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX)) + curlevel->last = XINT (temhd); + if (++curlevel == endlevel) + curlevel--; /* error ("Nesting too deep for parser"); */ + curlevel->prev = -1; + curlevel->last = -1; + tem = Fcdr (tem); } - state.quoted = 0; - mindepth = depth; - curlevel->prev = -1; curlevel->last = -1; - SETUP_SYNTAX_TABLE (prev_from, 1); - temp = FETCH_CHAR (prev_from_byte); - prev_from_syntax = SYNTAX_WITH_FLAGS (temp); - UPDATE_SYNTAX_TABLE_FORWARD (from); + state->quoted = 0; + mindepth = depth; + + SETUP_SYNTAX_TABLE (from, 1); /* Enter the loop at a place appropriate for initial state. */ - if (state.incomment) + if (state->incomment) goto startincomment; - if (state.instring >= 0) + if (state->instring >= 0) { - nofence = state.instring != ST_STRING_STYLE; + nofence = state->instring != ST_STRING_STYLE; if (start_quoted) goto startquotedinstring; goto startinstring; @@ -3266,10 +3228,10 @@ do { prev_from = from; \ while (from < end) { int syntax; - INC_FROM; - code = prev_from_syntax & 0xff; if (from < end + && (state->prev_comment_end == -1 + || prev_from >= state->prev_comment_end) && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax) && (c1 = FETCH_CHAR (from_byte), syntax = SYNTAX_WITH_FLAGS (c1), @@ -3280,32 +3242,37 @@ do { prev_from = from; \ /* Record the comment style we have entered so that only the comment-end sequence of the same style actually terminates the comment section. */ - state.comstyle + state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax); comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) | SYNTAX_FLAGS_COMMENT_NESTED (syntax)); - state.incomment = comnested ? 1 : -1; - state.comstr_start = prev_from; + state->incomment = comnested ? 1 : -1; + state->comstr_start = prev_from; INC_FROM; code = Scomment; } - else if (code == Scomment_fence) - { - /* Record the comment style we have entered so that only - the comment-end sequence of the same style actually - terminates the comment section. */ - state.comstyle = ST_COMMENT_STYLE; - state.incomment = -1; - state.comstr_start = prev_from; - code = Scomment; - } - else if (code == Scomment) - { - state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0); - state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ? - 1 : -1); - state.comstr_start = prev_from; - } + else + { + INC_FROM; + code = prev_from_syntax & 0xff; + if (code == Scomment_fence) + { + /* Record the comment style we have entered so that only + the comment-end sequence of the same style actually + terminates the comment section. */ + state->comstyle = ST_COMMENT_STYLE; + state->incomment = -1; + state->comstr_start = prev_from; + code = Scomment; + } + else if (code == Scomment) + { + state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0); + state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ? + 1 : -1); + state->comstr_start = prev_from; + } + } if (SYNTAX_FLAGS_PREFIX (prev_from_syntax)) continue; @@ -3357,18 +3324,21 @@ do { prev_from = from; \ middle of it. We don't want to do that if we're just at the beginning of the comment (think of (*) ... (*)). */ found = forw_comment (from, from_byte, end, - state.incomment, state.comstyle, - (from == BEGV || from < state.comstr_start + 3) + state->incomment, state->comstyle, + (from == BEGV || from < state->comstr_start + 3) ? 0 : prev_from_syntax, - &out_charpos, &out_bytepos, &state.incomment); + &out_charpos, &out_bytepos, &state->incomment, + &prev_from_syntax); from = out_charpos; from_byte = out_bytepos; - /* Beware! prev_from and friends are invalid now. - Luckily, the `done' doesn't use them and the INC_FROM - sets them to a sane value without looking at them. */ + /* Beware! prev_from and friends (except prev_from_syntax) + are invalid now. Luckily, the `done' doesn't use them + and the INC_FROM sets them to a sane value without + looking at them. */ if (!found) goto done; INC_FROM; - state.incomment = 0; - state.comstyle = 0; /* reset the comment style */ + state->incomment = 0; + state->comstyle = 0; /* reset the comment style */ + state->prev_comment_end = from; if (boundary_stop) goto done; break; @@ -3396,16 +3366,16 @@ do { prev_from = from; \ case Sstring: case Sstring_fence: - state.comstr_start = from - 1; + state->comstr_start = from - 1; if (stopbefore) goto stop; /* this arg means stop at sexp start */ curlevel->last = prev_from; - state.instring = (code == Sstring + state->instring = (code == Sstring ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte)) : ST_STRING_STYLE); if (boundary_stop) goto done; startinstring: { - nofence = state.instring != ST_STRING_STYLE; + nofence = state->instring != ST_STRING_STYLE; while (1) { @@ -3419,7 +3389,7 @@ do { prev_from = from; \ /* Check C_CODE here so that if the char has a syntax-table property which says it is NOT a string character, it does not end the string. */ - if (nofence && c == state.instring && c_code == Sstring) + if (nofence && c == state->instring && c_code == Sstring) break; switch (c_code) @@ -3442,7 +3412,7 @@ do { prev_from = from; \ } } string_end: - state.instring = -1; + state->instring = -1; curlevel->prev = curlevel->last; INC_FROM; if (boundary_stop) goto done; @@ -3461,25 +3431,99 @@ do { prev_from = from; \ stop: /* Here if stopping before start of sexp. */ from = prev_from; /* We have just fetched the char that starts it; */ from_byte = prev_from_byte; + prev_from_syntax = prev_prev_from_syntax; goto done; /* but return the position before it. */ endquoted: - state.quoted = 1; + state->quoted = 1; done: - state.depth = depth; - state.mindepth = mindepth; - state.thislevelstart = curlevel->prev; - state.prevlevelstart + state->depth = depth; + state->mindepth = mindepth; + state->thislevelstart = curlevel->prev; + state->prevlevelstart = (curlevel == levelstart) ? -1 : (curlevel - 1)->last; - state.location = from; - state.location_byte = from_byte; - state.levelstarts = Qnil; + state->location = from; + state->location_byte = from_byte; + state->levelstarts = Qnil; while (curlevel > levelstart) - state.levelstarts = Fcons (make_number ((--curlevel)->last), - state.levelstarts); + state->levelstarts = Fcons (make_number ((--curlevel)->last), + state->levelstarts); + state->prev_syntax = prev_from_syntax; immediate_quit = 0; +} + +/* Convert a (lisp) parse state to the internal form used in + scan_sexps_forward. */ +static void +internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state) +{ + Lisp_Object tem; + + if (NILP (external)) + { + state->depth = 0; + state->instring = -1; + state->incomment = 0; + state->quoted = 0; + state->comstyle = 0; /* comment style a by default. */ + state->comstr_start = -1; /* no comment/string seen. */ + state->levelstarts = Qnil; + state->prev_syntax = Smax; + state->prev_comment_end = -1; + } + else + { + tem = Fcar (external); + if (!NILP (tem)) + state->depth = XINT (tem); + else + state->depth = 0; + + external = Fcdr (external); + external = Fcdr (external); + external = Fcdr (external); + tem = Fcar (external); + /* Check whether we are inside string_fence-style string: */ + state->instring = (!NILP (tem) + ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE) + : -1); + + external = Fcdr (external); + tem = Fcar (external); + state->incomment = (!NILP (tem) + ? (INTEGERP (tem) ? XINT (tem) : -1) + : 0); - *stateptr = state; + external = Fcdr (external); + tem = Fcar (external); + state->quoted = !NILP (tem); + + /* if the eighth element of the list is nil, we are in comment + style a. If it is non-nil, we are in comment style b */ + external = Fcdr (external); + external = Fcdr (external); + tem = Fcar (external); + state->comstyle = (NILP (tem) + ? 0 + : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE) + ? XINT (tem) + : ST_COMMENT_STYLE)); + + external = Fcdr (external); + tem = Fcar (external); + state->comstr_start = + RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1; + external = Fcdr (external); + tem = Fcar (external); + state->levelstarts = tem; + + external = Fcdr (external); + tem = Fcar (external); + state->prev_syntax = NILP (tem) ? Smax : XINT (tem); + external = Fcdr (external); + tem = Fcar (external); + state->prev_comment_end = NILP (tem) ? -1 : XINT (tem); + } } DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0, @@ -3488,6 +3532,7 @@ Parsing stops at TO or when certain criteria are met; point is set to where parsing stops. If fifth arg OLDSTATE is omitted or nil, parsing assumes that FROM is the beginning of a function. + Value is a list of elements describing final state of parsing: 0. depth in parens. 1. character address of start of innermost containing list; nil if none. @@ -3501,16 +3546,20 @@ Value is a list of elements describing final state of parsing: 6. the minimum paren-depth encountered during this scan. 7. style of comment, if any. 8. character address of start of comment or string; nil if not in one. - 9. Intermediate data for continuation of parsing (subject to change). + 9. List of positions of currently open parens, outermost first. +10. Syntax of last character scanned, or nil if no scanning has happened. +11. Position after end of previous comment scanned, or nil. +12..... Possible further internal information used by `parse-partial-sexp'. + If third arg TARGETDEPTH is non-nil, parsing stops if the depth in parentheses becomes equal to TARGETDEPTH. -Fourth arg STOPBEFORE non-nil means stop when come to +Fourth arg STOPBEFORE non-nil means stop when we come to any character that starts a sexp. Fifth arg OLDSTATE is a list like what this function returns. It is used to initialize the state of the parse. Elements number 1, 2, 6 are ignored. -Sixth arg COMMENTSTOP non-nil means stop at the start of a comment. - If it is symbol `syntax-table', stop after the start of a comment or a +Sixth arg COMMENTSTOP non-nil means stop after the start of a comment. + If it is the symbol `syntax-table', stop after the start of a comment or a string, or after end of a comment or a string. */) (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth, Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop) @@ -3527,15 +3576,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment. target = TYPE_MINIMUM (EMACS_INT); /* We won't reach this depth. */ validate_region (&from, &to); + internalize_parse_state (oldstate, &state); scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)), XINT (to), - target, !NILP (stopbefore), oldstate, + target, !NILP (stopbefore), (NILP (commentstop) ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1))); SET_PT_BOTH (state.location, state.location_byte); - return Fcons (make_number (state.depth), + return + Fcons (make_number (state.depth), Fcons (state.prevlevelstart < 0 ? Qnil : make_number (state.prevlevelstart), Fcons (state.thislevelstart < 0 @@ -3553,11 +3604,18 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment. ? Qsyntax_table : make_number (state.comstyle)) : Qnil), - Fcons (((state.incomment - || (state.instring >= 0)) - ? make_number (state.comstr_start) - : Qnil), - Fcons (state.levelstarts, Qnil)))))))))); + Fcons (((state.incomment + || (state.instring >= 0)) + ? make_number (state.comstr_start) + : Qnil), + Fcons (state.levelstarts, + Fcons (state.prev_syntax == Smax + ? Qnil + : make_number (state.prev_syntax), + Fcons (state.prev_comment_end == -1 + ? Qnil + : make_number (state.prev_comment_end), + Qnil)))))))))))); } void > Stefan -- Alan Mackenzie (Nuremberg, Germany).