all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: npostavs@users.sourceforge.net
To: Alan Mackenzie <acm@muc.de>
Cc: 24870@debbugs.gnu.org, Matt Armstrong <marmstrong@google.com>
Subject: bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
Date: Sun, 18 Dec 2016 00:39:11 -0500	[thread overview]
Message-ID: <8737hldce8.fsf@users.sourceforge.net> (raw)
In-Reply-To: <20161215164458.GA2437@acm.fritz.box> (Alan Mackenzie's message of "Thu, 15 Dec 2016 16:44:58 +0000")

[-- Attachment #1: Type: text/plain, Size: 1484 bytes --]

Alan Mackenzie <acm@muc.de> writes:

> Hello, Noam.
>
> On Thu, Dec 15, 2016 at 11:33:36AM -0500, Noam Postavsky wrote:
>> On Wed, Dec 14, 2016 at 4:58 PM, Alan Mackenzie <acm@muc.de> wrote:
>
>> > What is new here is characters with paren syntax also being components of
>> > 2-char comment delimiters.  I recently fixed a similar problem when
>> > characters with word syntax were also flagged as 2-char comment delimiter
>> > parts.  I think a similar patch at case label Sopen: (Line ~3322), which
>> > would peek ahead at the next character to check for "{-" before
>> > recognising the "{" as an open paren would be the best fix.
>
>> I don't think special casing Sopen makes sense, shouldn't the check
>> apply to all syntax classes?
>
> Maybe.  I'm too tired to work this out at the moment (it was the office
> Glühwein day).
>
>> (the special case for word syntax does make sense, because it has its
>> own inner loop already)
>
> As does comment syntax, of course.
>
>> > Do you want to make this fix, or should I do it?  If you want to do it,
>> > I'm willing (indeed, eager) to review it for you.
>
>> I'll have a patch ready in a day or two.
>
> Excellent!  Or even in three or four days.  Take your time, do it well
> (at least, better than I managed last time round) and enjoy doing it.

Okay, here it is.  I think I've made this function a bit less twisty,
and hopefully haven't broken anything new (make check is still passing).


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 11354 bytes --]

From e3bc1d563a251ede1cb7649b97523161eb352694 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 18 Dec 2016 00:00:30 -0500
Subject: [PATCH v1] Fix comment detection on open parens

Characters having both open paren syntax and comment start syntax were
being detected as open parens even when they should have been part a
comment starter (Bug#24870).

* src/syntax.c (check_comment_start): New function, extracted from
`scan_sexps_forward'.
(scan_sexps_forward): Add check for a 2-char comment starter before the
loop.  Inside the loop, do that check after incrementing the 'from'
character index.  Move the single char comment syntax cases into the
switch isntead of special casing them before.
* test/src/syntax-tests.el (parse-partial-sexp-paren-comments):
(parse-partial-sexp-continue-over-comment-marker): New tests.
---
 src/syntax.c             | 114 ++++++++++++++++++++++++-----------------------
 test/src/syntax-tests.el |  83 ++++++++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+), 55 deletions(-)
 create mode 100644 test/src/syntax-tests.el

diff --git a/src/syntax.c b/src/syntax.c
index 338dd85..df28c91 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -3094,6 +3094,33 @@ DEFUN ("backward-prefix-chars", Fbackward_prefix_chars, Sbackward_prefix_chars,
   return Qnil;
 }
 \f
+
+static bool
+check_comment_start (struct lisp_parse_state *state,
+                     int prev_from_syntax,
+                     ptrdiff_t prev_from,
+                     ptrdiff_t from_byte)
+{
+  int c1, syntax;
+  if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      && (c1 = FETCH_CHAR_AS_MULTIBYTE (from_byte),
+          syntax = SYNTAX_WITH_FLAGS (c1),
+          SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
+    {
+      /* Record the comment style we have entered so that only
+         the comment-end sequence of the same style actually
+         terminates the comment section.  */
+      state->comstyle
+        = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
+      bool comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
+                        | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
+      state->incomment = comnested ? 1 : -1;
+      state->comstr_start = prev_from;
+      return true;
+    }
+  return false;
+}
+
 /* Parse forward from FROM / FROM_BYTE to END,
    assuming that FROM has state STATE,
    and return a description of the state of the parse at END.
@@ -3109,8 +3136,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
 		    int commentstop)
 {
   enum syntaxcode code;
-  int c1;
-  bool comnested;
   struct level { ptrdiff_t last, prev; };
   struct level levelstart[100];
   struct level *curlevel = levelstart;
@@ -3124,7 +3149,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
   ptrdiff_t prev_from;		/* Keep one character before FROM.  */
   ptrdiff_t prev_from_byte;
   int prev_from_syntax, prev_prev_from_syntax;
-  int syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3189,53 +3213,31 @@ scan_sexps_forward (struct lisp_parse_state *state,
     }
   else if (start_quoted)
     goto startquoted;
+  else if ((from < end)
+           && (check_comment_start (state, prev_from_syntax,
+                                    prev_from, from_byte)))
+    {
+      INC_FROM;
+      prev_from_syntax = Smax; /* the syntax has already been "used up". */
+      goto atcomment;
+    }
 
   while (from < end)
     {
-      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
-	  && (c1 = FETCH_CHAR (from_byte),
-	      syntax = SYNTAX_WITH_FLAGS (c1),
-	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
-	{
-	  /* Record the comment style we have entered so that only
-	     the comment-end sequence of the same style actually
-	     terminates the comment section.  */
-	  state->comstyle
-	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
-	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
-		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-	  state->incomment = comnested ? 1 : -1;
-	  state->comstr_start = prev_from;
-	  INC_FROM;
-          prev_from_syntax = Smax; /* the syntax has already been
-                                      "used up". */
-	  code = Scomment;
-	}
-      else
+      INC_FROM;
+
+      if ((from < end)
+          && (check_comment_start (state, prev_from_syntax,
+                                   prev_from, from_byte)))
         {
           INC_FROM;
-          code = prev_from_syntax & 0xff;
-          if (code == Scomment_fence)
-            {
-              /* Record the comment style we have entered so that only
-                 the comment-end sequence of the same style actually
-                 terminates the comment section.  */
-              state->comstyle = ST_COMMENT_STYLE;
-              state->incomment = -1;
-              state->comstr_start = prev_from;
-              code = Scomment;
-            }
-          else if (code == Scomment)
-            {
-              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-                                 1 : -1);
-              state->comstr_start = prev_from;
-            }
+          prev_from_syntax = Smax; /* the syntax has already been "used up". */
+          goto atcomment;
         }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
 	continue;
+      code = prev_from_syntax & 0xff;
       switch (code)
 	{
 	case Sescape:
@@ -3254,24 +3256,15 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	symstarted:
 	  while (from < end)
 	    {
-	      int symchar = FETCH_CHAR_AS_MULTIBYTE (from_byte);
-
-              if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
-                  && (syntax = SYNTAX_WITH_FLAGS (symchar),
-                      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
+              if (check_comment_start (state, prev_from_syntax,
+                                       prev_from, from_byte))
                 {
-                  state->comstyle
-                    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
-                  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
-                               | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-                  state->incomment = comnested ? 1 : -1;
-                  state->comstr_start = prev_from;
                   INC_FROM;
-                  prev_from_syntax = Smax;
-                  code = Scomment;
+                  prev_from_syntax = Smax; /* the syntax has already been "used up". */
                   goto atcomment;
                 }
 
+	      int symchar = FETCH_CHAR_AS_MULTIBYTE (from_byte);
               switch (SYNTAX (symchar))
 		{
 		case Scharquote:
@@ -3292,8 +3285,19 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	  curlevel->prev = curlevel->last;
 	  break;
 
-	case Scomment_fence: /* Can't happen because it's handled above.  */
+	case Scomment_fence:
+          /* Record the comment style we have entered so that only
+             the comment-end sequence of the same style actually
+             terminates the comment section.  */
+          state->comstyle = ST_COMMENT_STYLE;
+          state->incomment = -1;
+          state->comstr_start = prev_from;
+          goto atcomment;
 	case Scomment:
+          state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+          state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                              1 : -1);
+          state->comstr_start = prev_from;
         atcomment:
           if (commentstop || boundary_stop) goto done;
 	startincomment:
diff --git a/test/src/syntax-tests.el b/test/src/syntax-tests.el
new file mode 100644
index 0000000..eac88e8
--- /dev/null
+++ b/test/src/syntax-tests.el
@@ -0,0 +1,83 @@
+;;; syntax-tests.el --- tests for syntax.c functions -*- lexical-binding: t -*-
+
+;; Copyright (C) 2016 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'ert)
+
+(ert-deftest parse-partial-sexp-continue-over-comment-marker ()
+  "Continue a parse that stopped in the middle of a comment marker."
+  (with-temp-buffer
+    (let ((table (make-syntax-table)))
+      (modify-syntax-entry ?/ ". 124")
+      (modify-syntax-entry ?* ". 23b")
+      (set-syntax-table table))
+    (insert "/*C*/\nX")
+    (goto-char (point-min))
+    (let* ((pointC (progn (search-forward "C") (1- (point))))
+           (preC (1- pointC))
+           (pointX (progn (search-forward "X") (1- (point))))
+           (aftC (+ 2 pointC))
+           (ppsC (parse-partial-sexp (point-min) pointC))
+           (pps-preC (parse-partial-sexp (point-min) preC))
+           (pps-aftC (parse-partial-sexp (point-min) aftC))
+           (ppsX (parse-partial-sexp (point-min) pointX)))
+      ;; C should be inside comment.
+      (should (= (nth 0 ppsC) 0))
+      (should (eq (nth 4 ppsC) t))
+      (should (= (nth 8 ppsC) (- pointC 2)))
+      ;; X should not be in comment or list.
+      (should (= (nth 0 ppsX) 0))
+      (should-not (nth 4 ppsX))
+      ;; Try using OLDSTATE.
+      (should (equal (parse-partial-sexp preC pointC nil nil pps-preC)
+                     ppsC))
+      (should (equal (parse-partial-sexp pointC aftC nil nil ppsC)
+                     pps-aftC))
+      (should (equal (parse-partial-sexp preC aftC nil nil pps-preC)
+                     pps-aftC)))))
+
+(ert-deftest parse-partial-sexp-paren-comments ()
+  "Test syntax parsing with paren comment markers.
+Specifically, where the first character of the comment marker is
+also has open paren syntax (see Bug#24870)."
+  (with-temp-buffer
+    (let ((table (make-syntax-table)))
+      (modify-syntax-entry ?\{  "(}1nb" table)
+      (modify-syntax-entry ?\}  "){4nb" table)
+      (modify-syntax-entry ?-  ". 123" table)
+      (set-syntax-table table))
+    (insert "{-C-}\nX")
+    (goto-char (point-min))
+    (let* ((pointC (progn (search-forward "C") (1- (point))))
+           (pointX (progn (search-forward "X") (1- (point))))
+           (ppsC (parse-partial-sexp (point-min) pointC))
+           (ppsX (parse-partial-sexp (point-min) pointX)))
+      ;; C should be inside nestable comment, not list.
+      (should (= (nth 0 ppsC) 0))
+      (should (= (nth 4 ppsC) 1))
+      (should (= (nth 8 ppsC) (- pointC 2)))
+      ;; X should not be in comment or list.
+      (should (= (nth 0 ppsX) 0))
+      (should-not (nth 4 ppsX))
+      ;; Try using OLDSTATE.
+      (should (equal (parse-partial-sexp pointC pointX nil nil ppsC)
+                     ppsX)))))
+
+;;; syntax-tests.el ends here
-- 
2.9.3


  reply	other threads:[~2016-12-18  5:39 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-03 19:38 bug#24870: 26.0.50; parse-partial-sexp ignores comment-end Andreas Röhler
2016-11-30  9:10 ` Matt Armstrong
2016-11-30 12:37   ` Andreas Röhler
2016-11-30 23:02     ` Matt Armstrong
2016-12-01  1:17       ` npostavs
2016-12-01  8:24         ` Andreas Röhler
2016-12-14  3:00           ` npostavs
2016-12-14  4:04             ` npostavs
2016-12-14  6:45               ` Andreas Röhler
2016-12-14 19:56               ` Alan Mackenzie
2016-12-15  8:18                 ` Andreas Röhler
2016-12-15 16:01                   ` Eli Zaretskii
2016-12-15 16:50                   ` Alan Mackenzie
2016-12-15 17:59                     ` Andreas Röhler
2016-12-14 21:58               ` Alan Mackenzie
2016-12-15 16:33                 ` Noam Postavsky
2016-12-15 16:44                   ` Alan Mackenzie
2016-12-18  5:39                     ` npostavs [this message]
2016-12-29 11:14                       ` Alan Mackenzie
2016-12-30  1:55                         ` npostavs
2017-01-13  2:07                           ` npostavs
2017-01-23 20:12                             ` Alan Mackenzie
2017-01-24  0:30                               ` npostavs
2016-12-01  8:33       ` Andreas Röhler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8737hldce8.fsf@users.sourceforge.net \
    --to=npostavs@users.sourceforge.net \
    --cc=24870@debbugs.gnu.org \
    --cc=acm@muc.de \
    --cc=marmstrong@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.