unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
@ 2016-11-03 19:38 Andreas Röhler
  2016-11-30  9:10 ` Matt Armstrong
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Röhler @ 2016-11-03 19:38 UTC (permalink / raw)
  To: 24870

[-- Attachment #1: Type: text/plain, Size: 216 bytes --]

haskell-mode, at EOB:

---

{- To explore this file: -}

asdf =

---

parse-partial-sexp thinks being inside a paren - see attachment.

GNU Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-10-11






[-- Attachment #2: after-haskell-comment.png --]
[-- Type: image/png, Size: 113835 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-11-03 19:38 bug#24870: 26.0.50; parse-partial-sexp ignores comment-end Andreas Röhler
@ 2016-11-30  9:10 ` Matt Armstrong
  2016-11-30 12:37   ` Andreas Röhler
  0 siblings, 1 reply; 24+ messages in thread
From: Matt Armstrong @ 2016-11-30  9:10 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 24870

Andreas Röhler <andreas.roehler@easy-emacs.de> writes:

> haskell-mode, at EOB:
>
> ---
>
> {- To explore this file: -}
>
> asdf =
>
> ---
>
> parse-partial-sexp thinks being inside a paren - see attachment.
>
> GNU Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-10-11

Hi Andreas,

Emacs does not have a haskell-mode, so this bug is difficult to reproduce.

It may be more appropriate to report this to the haskell-mode
maintainers for triage.  They can figure out if it is a problem that
should be fixed in haskell-mode itself, or a problem with Emacs.

Alternatively, can you provide a series of clear instructions to
reproduce the problem in a fresh Emacs started without your
customizations?  For example, begin by running "emacs -Q" and go from
there.  Your attached .png presents a buffer called
*parse-partial-sexp-output*, but it is not clear how this was generated.

Thanks





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-11-30  9:10 ` Matt Armstrong
@ 2016-11-30 12:37   ` Andreas Röhler
  2016-11-30 23:02     ` Matt Armstrong
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Röhler @ 2016-11-30 12:37 UTC (permalink / raw)
  To: Matt Armstrong; +Cc: 24870

[-- Attachment #1: Type: text/plain, Size: 1564 bytes --]


On 30.11.2016 10:10, Matt Armstrong wrote:
> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>> haskell-mode, at EOB: --- {- To explore this file: -} asdf = --- 
>> parse-partial-sexp thinks being inside a paren - see attachment. GNU 
>> Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-10-11 
> Hi Andreas, Emacs does not have a haskell-mode, so this bug is 
> difficult to reproduce. It may be more appropriate to report this to 
> the haskell-mode maintainers for triage. They can figure out if it is 
> a problem that should be fixed in haskell-mode itself, or a problem 
> with Emacs. Alternatively, can you provide a series of clear 
> instructions to reproduce the problem in a fresh Emacs started without 
> your customizations? For example, begin by running "emacs -Q" and go 
> from there. Your attached .png presents a buffer called 
> *parse-partial-sexp-output*, but it is not clear how this was 
> generated. Thanks
Hi Matt,

checked that with help of the haskell-mode folks already.
https://github.com/haskell/haskell-mode/issues/1459

As it's nice at current pretest Emacs, concluded a bug in trunk.
Here a shortened recipe. Put code below in a buffer:
||
|{- Just a comment: -}|

M-x haskell-mode RET

As after a comment, there should be no nesting.
Than evaluate one of the forms below

|(insert (format "%S" (syntax-ppss)))|
|(insert (format "%S" (parse-partial-sexp (point-min) (point))))|

Result both: (1 1 22 nil nil nil 0 nil nil (1) nil)

GNU Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-15

[-- Attachment #2: Type: text/html, Size: 2865 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-11-30 12:37   ` Andreas Röhler
@ 2016-11-30 23:02     ` Matt Armstrong
  2016-12-01  1:17       ` npostavs
  2016-12-01  8:33       ` Andreas Röhler
  0 siblings, 2 replies; 24+ messages in thread
From: Matt Armstrong @ 2016-11-30 23:02 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 24870

Andreas Röhler <andreas.roehler@easy-emacs.de> writes:

> On 30.11.2016 10:10, Matt Armstrong wrote:
>> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>>> haskell-mode, at EOB: --- {- To explore this file: -} asdf = --- 
>>> parse-partial-sexp thinks being inside a paren - see attachment. GNU 
>>> Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-10-11 
>> Hi Andreas, Emacs does not have a haskell-mode, so this bug is 
>> difficult to reproduce. It may be more appropriate to report this to 
>> the haskell-mode maintainers for triage. They can figure out if it is 
>> a problem that should be fixed in haskell-mode itself, or a problem 
>> with Emacs. Alternatively, can you provide a series of clear 
>> instructions to reproduce the problem in a fresh Emacs started without 
>> your customizations? For example, begin by running "emacs -Q" and go 
>> from there. Your attached .png presents a buffer called 
>> *parse-partial-sexp-output*, but it is not clear how this was 
>> generated. Thanks
> Hi Matt,
>
> checked that with help of the haskell-mode folks already.
> https://github.com/haskell/haskell-mode/issues/1459

Thanks.  For reference, this is what you have said on the github bug:

Seems a bug of
GNU Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-15
Does not exist at
GNU Emacs 25.1.90.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-29


> As it's nice at current pretest Emacs, concluded a bug in trunk.
> Here a shortened recipe. Put code below in a buffer:
> ||
> |{- Just a comment: -}|
>
> M-x haskell-mode RET

Note that haskell-mode is not part of Emacs.  Ideally, your steps to
reproduce that begin with running "emacs -Q".  Emacs maintainers that
might not also be Haskell hackers will appreciate it.  :)

Also, can you describe the visible symptom that caused you to begin
looking at syntax-ppss and parse-partial-sexp?  That description may
help me or others spot similarities with other reported bugs.

(I must say that I am not an Emacs expert, and I do not usually reply to
Emacs bugs.  I looked at a few bugs last night as a way to help
maintainers triage the "easy" bugs.  It does not look like this bug is
easy!)





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-11-30 23:02     ` Matt Armstrong
@ 2016-12-01  1:17       ` npostavs
  2016-12-01  8:24         ` Andreas Röhler
  2016-12-01  8:33       ` Andreas Röhler
  1 sibling, 1 reply; 24+ messages in thread
From: npostavs @ 2016-12-01  1:17 UTC (permalink / raw)
  To: Matt Armstrong; +Cc: 24870

Matt Armstrong <marmstrong@google.com> writes:

> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>
>> On 30.11.2016 10:10, Matt Armstrong wrote:
>>> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>>>> haskell-mode, at EOB: --- {- To explore this file: -} asdf = --- 
>>>> parse-partial-sexp thinks being inside a paren - see attachment. GNU 
>>>> Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-10-11 
>>> Hi Andreas, Emacs does not have a haskell-mode, so this bug is 
>>> difficult to reproduce. It may be more appropriate to report this to 
>>> the haskell-mode maintainers for triage. They can figure out if it is 
>>> a problem that should be fixed in haskell-mode itself, or a problem 
>>> with Emacs. Alternatively, can you provide a series of clear 
>>> instructions to reproduce the problem in a fresh Emacs started without 
>>> your customizations? For example, begin by running "emacs -Q" and go 
>>> from there. Your attached .png presents a buffer called 
>>> *parse-partial-sexp-output*, but it is not clear how this was 
>>> generated. Thanks
>> Hi Matt,
>>
>> checked that with help of the haskell-mode folks already.
>> https://github.com/haskell/haskell-mode/issues/1459
>
> Thanks.  For reference, this is what you have said on the github bug:
>
> Seems a bug of
> GNU Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-15
> Does not exist at
> GNU Emacs 25.1.90.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-29
>
>
>> As it's nice at current pretest Emacs, concluded a bug in trunk.
>> Here a shortened recipe. Put code below in a buffer:
>> ||
>> |{- Just a comment: -}|
>>
>> M-x haskell-mode RET
>
> Note that haskell-mode is not part of Emacs.  Ideally, your steps to
> reproduce that begin with running "emacs -Q".  Emacs maintainers that
> might not also be Haskell hackers will appreciate it.  :)
>
> Also, can you describe the visible symptom that caused you to begin
> looking at syntax-ppss and parse-partial-sexp?  That description may
> help me or others spot similarities with other reported bugs.
>
> (I must say that I am not an Emacs expert, and I do not usually reply to
> Emacs bugs.  I looked at a few bugs last night as a way to help
> maintainers triage the "easy" bugs.  It does not look like this bug is
> easy!)

This seems similar to #24767.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-01  1:17       ` npostavs
@ 2016-12-01  8:24         ` Andreas Röhler
  2016-12-14  3:00           ` npostavs
  0 siblings, 1 reply; 24+ messages in thread
From: Andreas Röhler @ 2016-12-01  8:24 UTC (permalink / raw)
  To: npostavs, Matt Armstrong; +Cc: 24870

[-- Attachment #1: Type: text/plain, Size: 2923 bytes --]



On 01.12.2016 02:17, npostavs@users.sourceforge.net wrote:
> Matt Armstrong <marmstrong@google.com> writes:
>
>> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>>
>>> On 30.11.2016 10:10, Matt Armstrong wrote:
>>>> Andreas Röhler <andreas.roehler@easy-emacs.de> writes:
>>>>> haskell-mode, at EOB: --- {- To explore this file: -} asdf = ---
>>>>> parse-partial-sexp thinks being inside a paren - see attachment. GNU
>>>>> Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-10-11
>>>> Hi Andreas, Emacs does not have a haskell-mode, so this bug is
>>>> difficult to reproduce. It may be more appropriate to report this to
>>>> the haskell-mode maintainers for triage. They can figure out if it is
>>>> a problem that should be fixed in haskell-mode itself, or a problem
>>>> with Emacs. Alternatively, can you provide a series of clear
>>>> instructions to reproduce the problem in a fresh Emacs started without
>>>> your customizations? For example, begin by running "emacs -Q" and go
>>>> from there. Your attached .png presents a buffer called
>>>> *parse-partial-sexp-output*, but it is not clear how this was
>>>> generated. Thanks
>>> Hi Matt,
>>>
>>> checked that with help of the haskell-mode folks already.
>>> https://github.com/haskell/haskell-mode/issues/1459
>> Thanks.  For reference, this is what you have said on the github bug:
>>
>> Seems a bug of
>> GNU Emacs 26.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-15
>> Does not exist at
>> GNU Emacs 25.1.90.1 (i686-pc-linux-gnu, GTK+ Version 3.14.5) of 2016-11-29
>>
>>
>>> As it's nice at current pretest Emacs, concluded a bug in trunk.
>>> Here a shortened recipe. Put code below in a buffer:
>>> ||
>>> |{- Just a comment: -}|
>>>
>>> M-x haskell-mode RET
>> Note that haskell-mode is not part of Emacs.  Ideally, your steps to
>> reproduce that begin with running "emacs -Q".  Emacs maintainers that
>> might not also be Haskell hackers will appreciate it.  :)
>>
>> Also, can you describe the visible symptom that caused you to begin
>> looking at syntax-ppss and parse-partial-sexp?  That description may
>> help me or others spot similarities with other reported bugs.
>>
>> (I must say that I am not an Emacs expert, and I do not usually reply to
>> Emacs bugs.  I looked at a few bugs last night as a way to help
>> maintainers triage the "easy" bugs.  It does not look like this bug is
>> easy!)
> This seems similar to #24767.

There is said:

> The string "(* hello *)" should be highlighted as a comment, and is
> indeed correctly highlighted this way in Emacs-25 (and Emacs-24), but
> not in "master".

Here in "master" the syntax-highlighting is correct. So the cases might be related, but not the same.

BTW here is the recipe with a new setup.

{- asdfasd -}
-- (parse-partial-sexp (point-min) (point))
-- ==> (1 1 nil nil t nil 0 nil 15 (1) nil)

absolut n | n >= 0 = n
           | otherwise = -n



[-- Attachment #2: Type: text/html, Size: 4218 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-11-30 23:02     ` Matt Armstrong
  2016-12-01  1:17       ` npostavs
@ 2016-12-01  8:33       ` Andreas Röhler
  1 sibling, 0 replies; 24+ messages in thread
From: Andreas Röhler @ 2016-12-01  8:33 UTC (permalink / raw)
  To: Matt Armstrong; +Cc: 24870



On 01.12.2016 00:02, Matt Armstrong wrote:
> Also, can you describe the visible symptom that caused you to begin
> looking at syntax-ppss and parse-partial-sexp?  That description may
> help me or others spot similarities with other reported bugs.
>
> (I must say that I am not an Emacs expert, and I do not usually reply to
> Emacs bugs.  I looked at a few bugs last night as a way to help
> maintainers triage the "easy" bugs.  It does not look like this bug is
> easy!)

It broke some code relying on results of parse-partial-sexp
This , resp. syntax-ppss, is an internally widely used function.






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-01  8:24         ` Andreas Röhler
@ 2016-12-14  3:00           ` npostavs
  2016-12-14  4:04             ` npostavs
  0 siblings, 1 reply; 24+ messages in thread
From: npostavs @ 2016-12-14  3:00 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 24870, Matt Armstrong

[-- Attachment #1: Type: text/plain, Size: 288 bytes --]

tags 24870 = confirmed
merge 24870 25063
quit

This is the same as your report in 25063, as you noted there, the
comment starter is being counted as a list opener (although the comment
closer is not being recognized as a list closer).

Here's a recipe that doesn't require haskell-mode:


[-- Attachment #2: bug reproducer --]
[-- Type: text/plain, Size: 499 bytes --]

(defconst 24870-syntax-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?\{  "(}1nb" table)
    (modify-syntax-entry ?\}  "){4nb" table)
    (modify-syntax-entry ?-  ". 123" table)
    table))

(defun 24870-test ()
  (interactive)
  (with-current-buffer (get-buffer-create "*24870 test*")
    (set-syntax-table 24870-syntax-table)
    (insert "{-C-}\nX")
    (message "pps nesting: %d" (nth 0 (parse-partial-sexp (point-min) (point-max))))
    (display-buffer (current-buffer))))

[-- Attachment #3: Type: text/plain, Size: 65 bytes --]


I have tracked the issue down to scan_sexps_forward in syntax.c

^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-14  3:00           ` npostavs
@ 2016-12-14  4:04             ` npostavs
  2016-12-14  6:45               ` Andreas Röhler
                                 ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: npostavs @ 2016-12-14  4:04 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 24870, Alan Mackenzie, Matt Armstrong

[-- Attachment #1: Type: text/plain, Size: 182 bytes --]

npostavs@users.sourceforge.net writes:
>
>
> I have tracked the issue down to scan_sexps_forward in syntax.c

Applying this change which reverts part of [1] seems to fix the issue:


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: diff that seems to fix this bug --]
[-- Type: text/x-diff, Size: 761 bytes --]

--- i/src/syntax.c
+++ w/src/syntax.c
@@ -3192,7 +3192,11 @@ scan_sexps_forward (struct lisp_parse_state *state,
 
   while (from < end)
     {
-      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      INC_FROM;
+      code = prev_from_syntax & 0xff;
+
+      if (from < end
+          && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
 	  && (c1 = FETCH_CHAR (from_byte),
 	      syntax = SYNTAX_WITH_FLAGS (c1),
 	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
@@ -3213,8 +3217,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	}
       else
         {
-          INC_FROM;
-          code = prev_from_syntax & 0xff;
           if (code == Scomment_fence)
             {
               /* Record the comment style we have entered so that only


[-- Attachment #3: Type: text/plain, Size: 385 bytes --]


Alan, can you explain what the idea behind that change was?

I think it might correspond to this part of the commit message:

    Reformulate code at the top of the main loop correctly to
    recognize comment openers when starting in the middle of one.

[1]: 9dcf5998935c Sun Mar 20 13:19:48 2016 +0000
"Amend parse-partial-sexp correctly to handle two character comment delimiters"

^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-14  4:04             ` npostavs
@ 2016-12-14  6:45               ` Andreas Röhler
  2016-12-14 19:56               ` Alan Mackenzie
  2016-12-14 21:58               ` Alan Mackenzie
  2 siblings, 0 replies; 24+ messages in thread
From: Andreas Röhler @ 2016-12-14  6:45 UTC (permalink / raw)
  To: npostavs; +Cc: 24870, Alan Mackenzie, Matt Armstrong

[-- Attachment #1: Type: text/plain, Size: 742 bytes --]



On 14.12.2016 05:04, npostavs@users.sourceforge.net wrote:
> npostavs@users.sourceforge.net writes:
>>
>> I have tracked the issue down to scan_sexps_forward in syntax.c
> Applying this change which reverts part of [1] seems to fix the issue:
>
>
>
> Alan, can you explain what the idea behind that change was?
>
> I think it might correspond to this part of the commit message:
>
>      Reformulate code at the top of the main loop correctly to
>      recognize comment openers when starting in the middle of one.
>
> [1]: 9dcf5998935c Sun Mar 20 13:19:48 2016 +0000
> "Amend parse-partial-sexp correctly to handle two character comment delimiters"

Thanks a lot! BTW that message puzzles me too, AFAIK comments in C,C++ 
can't be nested.

[-- Attachment #2: Type: text/html, Size: 1493 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-14  4:04             ` npostavs
  2016-12-14  6:45               ` Andreas Röhler
@ 2016-12-14 19:56               ` Alan Mackenzie
  2016-12-15  8:18                 ` Andreas Röhler
  2016-12-14 21:58               ` Alan Mackenzie
  2 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2016-12-14 19:56 UTC (permalink / raw)
  To: npostavs; +Cc: 24870, Matt Armstrong

Hello, Noam.

On Tue, Dec 13, 2016 at 11:04:45PM -0500, npostavs@users.sourceforge.net wrote:
> npostavs@users.sourceforge.net writes:


> > I have tracked the issue down to scan_sexps_forward in syntax.c

> Applying this change which reverts part of [1] seems to fix the issue:


> --- i/src/syntax.c
> +++ w/src/syntax.c
> @@ -3192,7 +3192,11 @@ scan_sexps_forward (struct lisp_parse_state *state,

>    while (from < end)
>      {
> -      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
> +      INC_FROM;
> +      code = prev_from_syntax & 0xff;
> +
> +      if (from < end
> +          && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
>  	  && (c1 = FETCH_CHAR (from_byte),
>  	      syntax = SYNTAX_WITH_FLAGS (c1),
>  	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
> @@ -3213,8 +3217,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
>  	}
>        else
>          {
> -          INC_FROM;
> -          code = prev_from_syntax & 0xff;
>            if (code == Scomment_fence)
>              {
>                /* Record the comment style we have entered so that only



> Alan, can you explain what the idea behind that change was?

We're talking about 9dcf5998935c8aaa846d7585b81f0dcfe1935b3d from Sun
Mar 20 13:19:48 2016 +0000, still?

The idea is that in a (parse-partial-sexp from to), the end position
might be in the middle of a two character comment marker, such as "/*".
Before this change, it was impossible successfully to use the result of
that operation as the old state for continuing parse-partial-sexp from
that position, since it did not contain enough info to see it was in a
comment after passing the "*"

The change 9dcf599 added an extra element onto the parse state which was
non-nil when we end up after a "/", etc.

> I think it might correspond to this part of the commit message:

>     Reformulate code at the top of the main loop correctly to
>     recognize comment openers when starting in the middle of one.

> [1]: 9dcf5998935c Sun Mar 20 13:19:48 2016 +0000
> "Amend parse-partial-sexp correctly to handle two character comment delimiters"

With the above in mind, I think we should both check your proposed patch
to make sure it doesn't break the working of 9dcf599.

Thanks for Cc'ing me in on this.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-14  4:04             ` npostavs
  2016-12-14  6:45               ` Andreas Röhler
  2016-12-14 19:56               ` Alan Mackenzie
@ 2016-12-14 21:58               ` Alan Mackenzie
  2016-12-15 16:33                 ` Noam Postavsky
  2 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2016-12-14 21:58 UTC (permalink / raw)
  To: npostavs; +Cc: 24870, Matt Armstrong

Hello again, Noam.

On Tue, Dec 13, 2016 at 11:04:45PM -0500, npostavs@users.sourceforge.net wrote:
> npostavs@users.sourceforge.net writes:


> > I have tracked the issue down to scan_sexps_forward in syntax.c

> Applying this change which reverts part of [1] seems to fix the issue:


> --- i/src/syntax.c
> +++ w/src/syntax.c
> @@ -3192,7 +3192,11 @@ scan_sexps_forward (struct lisp_parse_state *state,

>    while (from < end)
>      {
> -      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
> +      INC_FROM;
> +      code = prev_from_syntax & 0xff;
> +
> +      if (from < end
> +          && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
>  	  && (c1 = FETCH_CHAR (from_byte),
>  	      syntax = SYNTAX_WITH_FLAGS (c1),
>  	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
> @@ -3213,8 +3217,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
>  	}
>        else
>          {
> -          INC_FROM;
> -          code = prev_from_syntax & 0xff;
>            if (code == Scomment_fence)
>              {
>                /* Record the comment style we have entered so that only

Alas, that patch won't do.  The very first thing that must be done in the
loop is to check for a two-character comment delimiter, of which the
first character might have been recorded in OLDSTATE element 9 rather
than having been scanned by scan_sexps_forward on the previous loop
iteration.

My analysis of what's happening in the bug recipe you posted one or two
posts previously, here in scan_sexps_forward is as follows.  (In that
recipe, "{-C-}\nX" was parse-partial-sexp'd over, and the syntax table
had been set to recognise "{-" and "-}" as matching comment delimiters.)
(i) On the first iteration of the main loop, the "{" is read.  It is
  recognised as an opening paren, and causes the "current paren depth" to
  be incremented.
(ii) On the second iteration of the loop, the "-" is read.  The function
  now recognises the two-character comment opener, and proceeds to read
  the innards of the comment together with its closing delimiter (the
  "-}").
(iii) On the third and fourth iterations, the function reads "\n" and
  "X".  It then terminates, having reached point-max.
(iv) The paren depth counter remains at 1.

What is new here is characters with paren syntax also being components of
2-char comment delimiters.  I recently fixed a similar problem when
characters with word syntax were also flagged as 2-char comment delimiter
parts.  I think a similar patch at case label Sopen: (Line ~3322), which
would peek ahead at the next character to check for "{-" before
recognising the "{" as an open paren would be the best fix.

Do you want to make this fix, or should I do it?  If you want to do it,
I'm willing (indeed, eager) to review it for you.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-14 19:56               ` Alan Mackenzie
@ 2016-12-15  8:18                 ` Andreas Röhler
  2016-12-15 16:01                   ` Eli Zaretskii
  2016-12-15 16:50                   ` Alan Mackenzie
  0 siblings, 2 replies; 24+ messages in thread
From: Andreas Röhler @ 2016-12-15  8:18 UTC (permalink / raw)
  To: Alan Mackenzie, npostavs; +Cc: 24870, Matt Armstrong



On 14.12.2016 20:56, Alan Mackenzie wrote:
>
> We're talking about 9dcf5998935c8aaa846d7585b81f0dcfe1935b3d from Sun
> Mar 20 13:19:48 2016 +0000, still?
>
> The idea is that in a (parse-partial-sexp from to), the end position
> might be in the middle of a two character comment marker, such as "/*".
> Before this change, it was impossible successfully to use the result of
> that operation as the old state for continuing parse-partial-sexp from
> that position, since it did not contain enough info to see it was in a
> comment after passing the "*"
>
> The change 9dcf599 added an extra element onto the parse state which was
> non-nil when we end up after a "/", etc.
>
>

Hi Alan,

sounds like a classical mistake for me.

You commented lately on the effect of narrowing and how simply to 
respect its results. Nothing further to say here.

OTOH: do you have a use-case, a bug, which propelled the amendment?

Thanks,

Andreas





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-15  8:18                 ` Andreas Röhler
@ 2016-12-15 16:01                   ` Eli Zaretskii
  2016-12-15 16:50                   ` Alan Mackenzie
  1 sibling, 0 replies; 24+ messages in thread
From: Eli Zaretskii @ 2016-12-15 16:01 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 24870, acm, marmstrong, npostavs

> From: Andreas Röhler <andreas.roehler@easy-emacs.de>
> Date: Thu, 15 Dec 2016 09:18:01 +0100
> Cc: 24870@debbugs.gnu.org, Matt Armstrong <marmstrong@google.com>
> 
> Hi Alan,
> 
> sounds like a classical mistake for me.

Most people here don't make "classical mistakes", so it's best to
assume none of them do, and never say such things aloud.

Just a $0.02-worth advice.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-14 21:58               ` Alan Mackenzie
@ 2016-12-15 16:33                 ` Noam Postavsky
  2016-12-15 16:44                   ` Alan Mackenzie
  0 siblings, 1 reply; 24+ messages in thread
From: Noam Postavsky @ 2016-12-15 16:33 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 24870, Matt Armstrong

On Wed, Dec 14, 2016 at 4:58 PM, Alan Mackenzie <acm@muc.de> wrote:
>
> Alas, that patch won't do.

I thought that might be the case.

>
> What is new here is characters with paren syntax also being components of
> 2-char comment delimiters.  I recently fixed a similar problem when
> characters with word syntax were also flagged as 2-char comment delimiter
> parts.  I think a similar patch at case label Sopen: (Line ~3322), which
> would peek ahead at the next character to check for "{-" before
> recognising the "{" as an open paren would be the best fix.

I don't think special casing Sopen makes sense, shouldn't the check
apply to all syntax classes?

(the special case for word syntax does make sense, because it has its
own inner loop already)

>
> Do you want to make this fix, or should I do it?  If you want to do it,
> I'm willing (indeed, eager) to review it for you.

I'll have a patch ready in a day or two.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-15 16:33                 ` Noam Postavsky
@ 2016-12-15 16:44                   ` Alan Mackenzie
  2016-12-18  5:39                     ` npostavs
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2016-12-15 16:44 UTC (permalink / raw)
  To: Noam Postavsky; +Cc: 24870, Matt Armstrong

Hello, Noam.

On Thu, Dec 15, 2016 at 11:33:36AM -0500, Noam Postavsky wrote:
> On Wed, Dec 14, 2016 at 4:58 PM, Alan Mackenzie <acm@muc.de> wrote:

> > Alas, that patch won't do.

> I thought that might be the case.


> > What is new here is characters with paren syntax also being components of
> > 2-char comment delimiters.  I recently fixed a similar problem when
> > characters with word syntax were also flagged as 2-char comment delimiter
> > parts.  I think a similar patch at case label Sopen: (Line ~3322), which
> > would peek ahead at the next character to check for "{-" before
> > recognising the "{" as an open paren would be the best fix.

> I don't think special casing Sopen makes sense, shouldn't the check
> apply to all syntax classes?

Maybe.  I'm too tired to work this out at the moment (it was the office
Glühwein day).

> (the special case for word syntax does make sense, because it has its
> own inner loop already)

As does comment syntax, of course.

> > Do you want to make this fix, or should I do it?  If you want to do it,
> > I'm willing (indeed, eager) to review it for you.

> I'll have a patch ready in a day or two.

Excellent!  Or even in three or four days.  Take your time, do it well
(at least, better than I managed last time round) and enjoy doing it.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-15  8:18                 ` Andreas Röhler
  2016-12-15 16:01                   ` Eli Zaretskii
@ 2016-12-15 16:50                   ` Alan Mackenzie
  2016-12-15 17:59                     ` Andreas Röhler
  1 sibling, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2016-12-15 16:50 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 24870, Matt Armstrong, npostavs

Hello, Andreas.

On Thu, Dec 15, 2016 at 09:18:01AM +0100, Andreas Röhler wrote:


> On 14.12.2016 20:56, Alan Mackenzie wrote:

> > We're talking about 9dcf5998935c8aaa846d7585b81f0dcfe1935b3d from Sun
> > Mar 20 13:19:48 2016 +0000, still?

> > The idea is that in a (parse-partial-sexp from to), the end position
> > might be in the middle of a two character comment marker, such as "/*".
> > Before this change, it was impossible successfully to use the result of
> > that operation as the old state for continuing parse-partial-sexp from
> > that position, since it did not contain enough info to see it was in a
> > comment after passing the "*"

> > The change 9dcf599 added an extra element onto the parse state which was
> > non-nil when we end up after a "/", etc.



> Hi Alan,

> sounds like a classical mistake for me.

Quite possibly.

> You commented lately on the effect of narrowing and how simply to 
> respect its results. Nothing further to say here.

I don't see what your meaning is here, but never mind.

> OTOH: do you have a use-case, a bug, which propelled the amendment?

Yes.  It was quite a few years ago, but a bug in CC Mode was caused by
parse-partial-sexp terminating at a critical place, and the next
invocation of parse-partial-sexp thus going wrong.  I programmed round it
awkwardly at the time.

Also syntax-ppss would be falling into the trap quite a lot, I think.  I
don't think it checked specially for the critical case.  Now it doesn't
have to bother - at least, it won't as soon as Noam has corrected the
current bug.  ;-)

> Thanks,

> Andreas

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-15 16:50                   ` Alan Mackenzie
@ 2016-12-15 17:59                     ` Andreas Röhler
  0 siblings, 0 replies; 24+ messages in thread
From: Andreas Röhler @ 2016-12-15 17:59 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 24870, Matt Armstrong, npostavs



On 15.12.2016 17:50, Alan Mackenzie wrote:
> You commented lately on the effect of narrowing and how simply to
>> respect its results. Nothing further to say here.
> I don't see what your meaning is here, but never mind.

Referred to your post at emacs-devel 11.12.2016 11:17:

At the moment, narrowing is a strong, direct, simple facility - it does 
what it says it does and no more.






^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-15 16:44                   ` Alan Mackenzie
@ 2016-12-18  5:39                     ` npostavs
  2016-12-29 11:14                       ` Alan Mackenzie
  0 siblings, 1 reply; 24+ messages in thread
From: npostavs @ 2016-12-18  5:39 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 24870, Matt Armstrong

[-- Attachment #1: Type: text/plain, Size: 1484 bytes --]

Alan Mackenzie <acm@muc.de> writes:

> Hello, Noam.
>
> On Thu, Dec 15, 2016 at 11:33:36AM -0500, Noam Postavsky wrote:
>> On Wed, Dec 14, 2016 at 4:58 PM, Alan Mackenzie <acm@muc.de> wrote:
>
>> > What is new here is characters with paren syntax also being components of
>> > 2-char comment delimiters.  I recently fixed a similar problem when
>> > characters with word syntax were also flagged as 2-char comment delimiter
>> > parts.  I think a similar patch at case label Sopen: (Line ~3322), which
>> > would peek ahead at the next character to check for "{-" before
>> > recognising the "{" as an open paren would be the best fix.
>
>> I don't think special casing Sopen makes sense, shouldn't the check
>> apply to all syntax classes?
>
> Maybe.  I'm too tired to work this out at the moment (it was the office
> Glühwein day).
>
>> (the special case for word syntax does make sense, because it has its
>> own inner loop already)
>
> As does comment syntax, of course.
>
>> > Do you want to make this fix, or should I do it?  If you want to do it,
>> > I'm willing (indeed, eager) to review it for you.
>
>> I'll have a patch ready in a day or two.
>
> Excellent!  Or even in three or four days.  Take your time, do it well
> (at least, better than I managed last time round) and enjoy doing it.

Okay, here it is.  I think I've made this function a bit less twisty,
and hopefully haven't broken anything new (make check is still passing).


[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 11354 bytes --]

From e3bc1d563a251ede1cb7649b97523161eb352694 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 18 Dec 2016 00:00:30 -0500
Subject: [PATCH v1] Fix comment detection on open parens

Characters having both open paren syntax and comment start syntax were
being detected as open parens even when they should have been part a
comment starter (Bug#24870).

* src/syntax.c (check_comment_start): New function, extracted from
`scan_sexps_forward'.
(scan_sexps_forward): Add check for a 2-char comment starter before the
loop.  Inside the loop, do that check after incrementing the 'from'
character index.  Move the single char comment syntax cases into the
switch isntead of special casing them before.
* test/src/syntax-tests.el (parse-partial-sexp-paren-comments):
(parse-partial-sexp-continue-over-comment-marker): New tests.
---
 src/syntax.c             | 114 ++++++++++++++++++++++++-----------------------
 test/src/syntax-tests.el |  83 ++++++++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+), 55 deletions(-)
 create mode 100644 test/src/syntax-tests.el

diff --git a/src/syntax.c b/src/syntax.c
index 338dd85..df28c91 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -3094,6 +3094,33 @@ DEFUN ("backward-prefix-chars", Fbackward_prefix_chars, Sbackward_prefix_chars,
   return Qnil;
 }
 \f
+
+static bool
+check_comment_start (struct lisp_parse_state *state,
+                     int prev_from_syntax,
+                     ptrdiff_t prev_from,
+                     ptrdiff_t from_byte)
+{
+  int c1, syntax;
+  if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      && (c1 = FETCH_CHAR_AS_MULTIBYTE (from_byte),
+          syntax = SYNTAX_WITH_FLAGS (c1),
+          SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
+    {
+      /* Record the comment style we have entered so that only
+         the comment-end sequence of the same style actually
+         terminates the comment section.  */
+      state->comstyle
+        = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
+      bool comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
+                        | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
+      state->incomment = comnested ? 1 : -1;
+      state->comstr_start = prev_from;
+      return true;
+    }
+  return false;
+}
+
 /* Parse forward from FROM / FROM_BYTE to END,
    assuming that FROM has state STATE,
    and return a description of the state of the parse at END.
@@ -3109,8 +3136,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
 		    int commentstop)
 {
   enum syntaxcode code;
-  int c1;
-  bool comnested;
   struct level { ptrdiff_t last, prev; };
   struct level levelstart[100];
   struct level *curlevel = levelstart;
@@ -3124,7 +3149,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
   ptrdiff_t prev_from;		/* Keep one character before FROM.  */
   ptrdiff_t prev_from_byte;
   int prev_from_syntax, prev_prev_from_syntax;
-  int syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3189,53 +3213,31 @@ scan_sexps_forward (struct lisp_parse_state *state,
     }
   else if (start_quoted)
     goto startquoted;
+  else if ((from < end)
+           && (check_comment_start (state, prev_from_syntax,
+                                    prev_from, from_byte)))
+    {
+      INC_FROM;
+      prev_from_syntax = Smax; /* the syntax has already been "used up". */
+      goto atcomment;
+    }
 
   while (from < end)
     {
-      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
-	  && (c1 = FETCH_CHAR (from_byte),
-	      syntax = SYNTAX_WITH_FLAGS (c1),
-	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
-	{
-	  /* Record the comment style we have entered so that only
-	     the comment-end sequence of the same style actually
-	     terminates the comment section.  */
-	  state->comstyle
-	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
-	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
-		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-	  state->incomment = comnested ? 1 : -1;
-	  state->comstr_start = prev_from;
-	  INC_FROM;
-          prev_from_syntax = Smax; /* the syntax has already been
-                                      "used up". */
-	  code = Scomment;
-	}
-      else
+      INC_FROM;
+
+      if ((from < end)
+          && (check_comment_start (state, prev_from_syntax,
+                                   prev_from, from_byte)))
         {
           INC_FROM;
-          code = prev_from_syntax & 0xff;
-          if (code == Scomment_fence)
-            {
-              /* Record the comment style we have entered so that only
-                 the comment-end sequence of the same style actually
-                 terminates the comment section.  */
-              state->comstyle = ST_COMMENT_STYLE;
-              state->incomment = -1;
-              state->comstr_start = prev_from;
-              code = Scomment;
-            }
-          else if (code == Scomment)
-            {
-              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-                                 1 : -1);
-              state->comstr_start = prev_from;
-            }
+          prev_from_syntax = Smax; /* the syntax has already been "used up". */
+          goto atcomment;
         }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
 	continue;
+      code = prev_from_syntax & 0xff;
       switch (code)
 	{
 	case Sescape:
@@ -3254,24 +3256,15 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	symstarted:
 	  while (from < end)
 	    {
-	      int symchar = FETCH_CHAR_AS_MULTIBYTE (from_byte);
-
-              if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
-                  && (syntax = SYNTAX_WITH_FLAGS (symchar),
-                      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
+              if (check_comment_start (state, prev_from_syntax,
+                                       prev_from, from_byte))
                 {
-                  state->comstyle
-                    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
-                  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
-                               | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-                  state->incomment = comnested ? 1 : -1;
-                  state->comstr_start = prev_from;
                   INC_FROM;
-                  prev_from_syntax = Smax;
-                  code = Scomment;
+                  prev_from_syntax = Smax; /* the syntax has already been "used up". */
                   goto atcomment;
                 }
 
+	      int symchar = FETCH_CHAR_AS_MULTIBYTE (from_byte);
               switch (SYNTAX (symchar))
 		{
 		case Scharquote:
@@ -3292,8 +3285,19 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	  curlevel->prev = curlevel->last;
 	  break;
 
-	case Scomment_fence: /* Can't happen because it's handled above.  */
+	case Scomment_fence:
+          /* Record the comment style we have entered so that only
+             the comment-end sequence of the same style actually
+             terminates the comment section.  */
+          state->comstyle = ST_COMMENT_STYLE;
+          state->incomment = -1;
+          state->comstr_start = prev_from;
+          goto atcomment;
 	case Scomment:
+          state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+          state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                              1 : -1);
+          state->comstr_start = prev_from;
         atcomment:
           if (commentstop || boundary_stop) goto done;
 	startincomment:
diff --git a/test/src/syntax-tests.el b/test/src/syntax-tests.el
new file mode 100644
index 0000000..eac88e8
--- /dev/null
+++ b/test/src/syntax-tests.el
@@ -0,0 +1,83 @@
+;;; syntax-tests.el --- tests for syntax.c functions -*- lexical-binding: t -*-
+
+;; Copyright (C) 2016 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'ert)
+
+(ert-deftest parse-partial-sexp-continue-over-comment-marker ()
+  "Continue a parse that stopped in the middle of a comment marker."
+  (with-temp-buffer
+    (let ((table (make-syntax-table)))
+      (modify-syntax-entry ?/ ". 124")
+      (modify-syntax-entry ?* ". 23b")
+      (set-syntax-table table))
+    (insert "/*C*/\nX")
+    (goto-char (point-min))
+    (let* ((pointC (progn (search-forward "C") (1- (point))))
+           (preC (1- pointC))
+           (pointX (progn (search-forward "X") (1- (point))))
+           (aftC (+ 2 pointC))
+           (ppsC (parse-partial-sexp (point-min) pointC))
+           (pps-preC (parse-partial-sexp (point-min) preC))
+           (pps-aftC (parse-partial-sexp (point-min) aftC))
+           (ppsX (parse-partial-sexp (point-min) pointX)))
+      ;; C should be inside comment.
+      (should (= (nth 0 ppsC) 0))
+      (should (eq (nth 4 ppsC) t))
+      (should (= (nth 8 ppsC) (- pointC 2)))
+      ;; X should not be in comment or list.
+      (should (= (nth 0 ppsX) 0))
+      (should-not (nth 4 ppsX))
+      ;; Try using OLDSTATE.
+      (should (equal (parse-partial-sexp preC pointC nil nil pps-preC)
+                     ppsC))
+      (should (equal (parse-partial-sexp pointC aftC nil nil ppsC)
+                     pps-aftC))
+      (should (equal (parse-partial-sexp preC aftC nil nil pps-preC)
+                     pps-aftC)))))
+
+(ert-deftest parse-partial-sexp-paren-comments ()
+  "Test syntax parsing with paren comment markers.
+Specifically, where the first character of the comment marker is
+also has open paren syntax (see Bug#24870)."
+  (with-temp-buffer
+    (let ((table (make-syntax-table)))
+      (modify-syntax-entry ?\{  "(}1nb" table)
+      (modify-syntax-entry ?\}  "){4nb" table)
+      (modify-syntax-entry ?-  ". 123" table)
+      (set-syntax-table table))
+    (insert "{-C-}\nX")
+    (goto-char (point-min))
+    (let* ((pointC (progn (search-forward "C") (1- (point))))
+           (pointX (progn (search-forward "X") (1- (point))))
+           (ppsC (parse-partial-sexp (point-min) pointC))
+           (ppsX (parse-partial-sexp (point-min) pointX)))
+      ;; C should be inside nestable comment, not list.
+      (should (= (nth 0 ppsC) 0))
+      (should (= (nth 4 ppsC) 1))
+      (should (= (nth 8 ppsC) (- pointC 2)))
+      ;; X should not be in comment or list.
+      (should (= (nth 0 ppsX) 0))
+      (should-not (nth 4 ppsX))
+      ;; Try using OLDSTATE.
+      (should (equal (parse-partial-sexp pointC pointX nil nil ppsC)
+                     ppsX)))))
+
+;;; syntax-tests.el ends here
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-18  5:39                     ` npostavs
@ 2016-12-29 11:14                       ` Alan Mackenzie
  2016-12-30  1:55                         ` npostavs
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2016-12-29 11:14 UTC (permalink / raw)
  To: npostavs; +Cc: 24870, Matt Armstrong

Hello, Noam.

Sorry this has taken me more time than I anticipated; it's a busy time
of the year.  :-(

On Sun, Dec 18, 2016 at 12:39:11AM -0500, npostavs@users.sourceforge.net wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > Hello, Noam.

> > On Thu, Dec 15, 2016 at 11:33:36AM -0500, Noam Postavsky wrote:

[ .... ]

> >> > Do you want to make this fix, or should I do it?  If you want to
> >> > do it, I'm willing (indeed, eager) to review it for you.

> >> I'll have a patch ready in a day or two.

[ .... ]

> Okay, here it is.  I think I've made this function a bit less twisty,
> and hopefully haven't broken anything new (make check is still passing).

syntax.c looks good.  It looks very good.  I've just got one trivial
comment:

(i) The new function `check_comment_start' doesn't have a comment saying
what its return value means.  Possibly you could instead rename it so
that the name implies what it returns.  Maybe something like
`in_double_comment_opener'.

I'll admit I haven't actually tried out the code, mainly because you've
written a test file.  One comment about the test file:

(ii) In `parse-partial-sexp-continue-over-comment-marker', variable aftC
is the position in the middle of the comment closer "*/".  I don't think
you are testing in any way that element 10 (nil, or the syntax of the
position just before the end point when that position might be the first
character of a two-character construct, i.e. an escape or first char of a
double-char comment delimiter) is correct.  This element 10 was newly
introduced this year.  Would you please consider adding such a test to
this new test file.  Thanks!

Otherwise, excellent!

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-29 11:14                       ` Alan Mackenzie
@ 2016-12-30  1:55                         ` npostavs
  2017-01-13  2:07                           ` npostavs
  0 siblings, 1 reply; 24+ messages in thread
From: npostavs @ 2016-12-30  1:55 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 24870, Matt Armstrong

[-- Attachment #1: Type: text/plain, Size: 1142 bytes --]

Alan Mackenzie <acm@muc.de> writes:
>
> (i) The new function `check_comment_start' doesn't have a comment saying
> what its return value means.  Possibly you could instead rename it so
> that the name implies what it returns.  Maybe something like
> `in_double_comment_opener'.

Good point, I've updated the patch.

>
> I'll admit I haven't actually tried out the code, mainly because you've
> written a test file.

Oh dear, maybe I should have withheld the test file ;)

>
> (ii) In `parse-partial-sexp-continue-over-comment-marker', variable aftC
> is the position in the middle of the comment closer "*/".  I don't think
> you are testing in any way that element 10 (nil, or the syntax of the
> position just before the end point when that position might be the first
> character of a two-character construct, i.e. an escape or first char of a
> double-char comment delimiter) is correct.

My idea was that its effect would be tested by using pps-preC as
OLDSTATE, which avoids having to encode the specifics in the test.  I
added another clause which uses pps-aftC to cover parsing from the
middle of a comment closer as well as opener.


[-- Attachment #2: patch v2 --]
[-- Type: text/plain, Size: 11642 bytes --]

From 50773c5194591526797eddf70a3cce5f4ccf6f25 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs@gmail.com>
Date: Sun, 18 Dec 2016 00:00:30 -0500
Subject: [PATCH v2] Fix comment detection on open parens

Characters having both open paren syntax and comment start syntax were
being detected as open parens even when they should have been part a
comment starter (Bug#24870).

* src/syntax.c (in_2char_comment_start): New function, extracted from
`scan_sexps_forward'.
(scan_sexps_forward): Add check for a 2-char comment starter before the
loop.  Inside the loop, do that check after incrementing the 'from'
character index.  Move the single char comment syntax cases into the
switch instead of special casing them before.
* test/src/syntax-tests.el (parse-partial-sexp-paren-comments):
(parse-partial-sexp-continue-over-comment-marker): New tests.
---
 src/syntax.c             | 117 +++++++++++++++++++++++++----------------------
 test/src/syntax-tests.el |  85 ++++++++++++++++++++++++++++++++++
 2 files changed, 147 insertions(+), 55 deletions(-)
 create mode 100644 test/src/syntax-tests.el

diff --git a/src/syntax.c b/src/syntax.c
index 7c15e77..d291890 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -3092,6 +3092,36 @@ DEFUN ("backward-prefix-chars", Fbackward_prefix_chars, Sbackward_prefix_chars,
   return Qnil;
 }
 \f
+
+/* If the character at FROM_BYTE is the second part of a 2-character
+   comment opener based on PREV_FROM_SYNTAX, update STATE and return
+   true.  */
+static bool
+in_2char_comment_start (struct lisp_parse_state *state,
+                        int prev_from_syntax,
+                        ptrdiff_t prev_from,
+                        ptrdiff_t from_byte)
+{
+  int c1, syntax;
+  if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      && (c1 = FETCH_CHAR_AS_MULTIBYTE (from_byte),
+          syntax = SYNTAX_WITH_FLAGS (c1),
+          SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
+    {
+      /* Record the comment style we have entered so that only
+         the comment-end sequence of the same style actually
+         terminates the comment section.  */
+      state->comstyle
+        = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
+      bool comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
+                        | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
+      state->incomment = comnested ? 1 : -1;
+      state->comstr_start = prev_from;
+      return true;
+    }
+  return false;
+}
+
 /* Parse forward from FROM / FROM_BYTE to END,
    assuming that FROM has state STATE,
    and return a description of the state of the parse at END.
@@ -3107,8 +3137,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
 		    int commentstop)
 {
   enum syntaxcode code;
-  int c1;
-  bool comnested;
   struct level { ptrdiff_t last, prev; };
   struct level levelstart[100];
   struct level *curlevel = levelstart;
@@ -3122,7 +3150,6 @@ scan_sexps_forward (struct lisp_parse_state *state,
   ptrdiff_t prev_from;		/* Keep one character before FROM.  */
   ptrdiff_t prev_from_byte;
   int prev_from_syntax, prev_prev_from_syntax;
-  int syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3187,53 +3214,31 @@ scan_sexps_forward (struct lisp_parse_state *state,
     }
   else if (start_quoted)
     goto startquoted;
+  else if ((from < end)
+           && (in_2char_comment_start (state, prev_from_syntax,
+                                       prev_from, from_byte)))
+    {
+      INC_FROM;
+      prev_from_syntax = Smax; /* the syntax has already been "used up". */
+      goto atcomment;
+    }
 
   while (from < end)
     {
-      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
-	  && (c1 = FETCH_CHAR (from_byte),
-	      syntax = SYNTAX_WITH_FLAGS (c1),
-	      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
-	{
-	  /* Record the comment style we have entered so that only
-	     the comment-end sequence of the same style actually
-	     terminates the comment section.  */
-	  state->comstyle
-	    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
-	  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
-		       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-	  state->incomment = comnested ? 1 : -1;
-	  state->comstr_start = prev_from;
-	  INC_FROM;
-          prev_from_syntax = Smax; /* the syntax has already been
-                                      "used up". */
-	  code = Scomment;
-	}
-      else
+      INC_FROM;
+
+      if ((from < end)
+          && (in_2char_comment_start (state, prev_from_syntax,
+                                      prev_from, from_byte)))
         {
           INC_FROM;
-          code = prev_from_syntax & 0xff;
-          if (code == Scomment_fence)
-            {
-              /* Record the comment style we have entered so that only
-                 the comment-end sequence of the same style actually
-                 terminates the comment section.  */
-              state->comstyle = ST_COMMENT_STYLE;
-              state->incomment = -1;
-              state->comstr_start = prev_from;
-              code = Scomment;
-            }
-          else if (code == Scomment)
-            {
-              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-                                 1 : -1);
-              state->comstr_start = prev_from;
-            }
+          prev_from_syntax = Smax; /* the syntax has already been "used up". */
+          goto atcomment;
         }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
 	continue;
+      code = prev_from_syntax & 0xff;
       switch (code)
 	{
 	case Sescape:
@@ -3252,24 +3257,15 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	symstarted:
 	  while (from < end)
 	    {
-	      int symchar = FETCH_CHAR_AS_MULTIBYTE (from_byte);
-
-              if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
-                  && (syntax = SYNTAX_WITH_FLAGS (symchar),
-                      SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
+              if (in_2char_comment_start (state, prev_from_syntax,
+                                          prev_from, from_byte))
                 {
-                  state->comstyle
-                    = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
-                  comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
-                               | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-                  state->incomment = comnested ? 1 : -1;
-                  state->comstr_start = prev_from;
                   INC_FROM;
-                  prev_from_syntax = Smax;
-                  code = Scomment;
+                  prev_from_syntax = Smax; /* the syntax has already been "used up". */
                   goto atcomment;
                 }
 
+	      int symchar = FETCH_CHAR_AS_MULTIBYTE (from_byte);
               switch (SYNTAX (symchar))
 		{
 		case Scharquote:
@@ -3290,8 +3286,19 @@ scan_sexps_forward (struct lisp_parse_state *state,
 	  curlevel->prev = curlevel->last;
 	  break;
 
-	case Scomment_fence: /* Can't happen because it's handled above.  */
+	case Scomment_fence:
+          /* Record the comment style we have entered so that only
+             the comment-end sequence of the same style actually
+             terminates the comment section.  */
+          state->comstyle = ST_COMMENT_STYLE;
+          state->incomment = -1;
+          state->comstr_start = prev_from;
+          goto atcomment;
 	case Scomment:
+          state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+          state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                              1 : -1);
+          state->comstr_start = prev_from;
         atcomment:
           if (commentstop || boundary_stop) goto done;
 	startincomment:
diff --git a/test/src/syntax-tests.el b/test/src/syntax-tests.el
new file mode 100644
index 0000000..776ce5b
--- /dev/null
+++ b/test/src/syntax-tests.el
@@ -0,0 +1,85 @@
+;;; syntax-tests.el --- tests for syntax.c functions -*- lexical-binding: t -*-
+
+;; Copyright (C) 2016 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.
+
+;;; Code:
+
+(require 'ert)
+
+(ert-deftest parse-partial-sexp-continue-over-comment-marker ()
+  "Continue a parse that stopped in the middle of a comment marker."
+  (with-temp-buffer
+    (let ((table (make-syntax-table)))
+      (modify-syntax-entry ?/ ". 124")
+      (modify-syntax-entry ?* ". 23b")
+      (set-syntax-table table))
+    (insert "/*C*/\nX")
+    (goto-char (point-min))
+    (let* ((pointC (progn (search-forward "C") (1- (point))))
+           (preC (1- pointC))
+           (pointX (progn (search-forward "X") (1- (point))))
+           (aftC (+ 2 pointC))
+           (ppsC (parse-partial-sexp (point-min) pointC))
+           (pps-preC (parse-partial-sexp (point-min) preC))
+           (pps-aftC (parse-partial-sexp (point-min) aftC))
+           (ppsX (parse-partial-sexp (point-min) pointX)))
+      ;; C should be inside comment.
+      (should (= (nth 0 ppsC) 0))
+      (should (eq (nth 4 ppsC) t))
+      (should (= (nth 8 ppsC) (- pointC 2)))
+      ;; X should not be in comment or list.
+      (should (= (nth 0 ppsX) 0))
+      (should-not (nth 4 ppsX))
+      ;; Try using OLDSTATE.
+      (should (equal (parse-partial-sexp preC pointC nil nil pps-preC)
+                     ppsC))
+      (should (equal (parse-partial-sexp pointC aftC nil nil ppsC)
+                     pps-aftC))
+      (should (equal (parse-partial-sexp preC aftC nil nil pps-preC)
+                     pps-aftC))
+      (should (equal (parse-partial-sexp aftC pointX nil nil pps-aftC)
+                     ppsX)))))
+
+(ert-deftest parse-partial-sexp-paren-comments ()
+  "Test syntax parsing with paren comment markers.
+Specifically, where the first character of the comment marker is
+also has open paren syntax (see Bug#24870)."
+  (with-temp-buffer
+    (let ((table (make-syntax-table)))
+      (modify-syntax-entry ?\{  "(}1nb" table)
+      (modify-syntax-entry ?\}  "){4nb" table)
+      (modify-syntax-entry ?-  ". 123" table)
+      (set-syntax-table table))
+    (insert "{-C-}\nX")
+    (goto-char (point-min))
+    (let* ((pointC (progn (search-forward "C") (1- (point))))
+           (pointX (progn (search-forward "X") (1- (point))))
+           (ppsC (parse-partial-sexp (point-min) pointC))
+           (ppsX (parse-partial-sexp (point-min) pointX)))
+      ;; C should be inside nestable comment, not list.
+      (should (= (nth 0 ppsC) 0))
+      (should (= (nth 4 ppsC) 1))
+      (should (= (nth 8 ppsC) (- pointC 2)))
+      ;; X should not be in comment or list.
+      (should (= (nth 0 ppsX) 0))
+      (should-not (nth 4 ppsX))
+      ;; Try using OLDSTATE.
+      (should (equal (parse-partial-sexp pointC pointX nil nil ppsC)
+                     ppsX)))))
+
+;;; syntax-tests.el ends here
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2016-12-30  1:55                         ` npostavs
@ 2017-01-13  2:07                           ` npostavs
  2017-01-23 20:12                             ` Alan Mackenzie
  0 siblings, 1 reply; 24+ messages in thread
From: npostavs @ 2017-01-13  2:07 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 24870, Matt Armstrong

npostavs@users.sourceforge.net writes:

> Alan Mackenzie <acm@muc.de> writes:
>>
>> (ii) In `parse-partial-sexp-continue-over-comment-marker', variable aftC
>> is the position in the middle of the comment closer "*/".  I don't think
>> you are testing in any way that element 10 (nil, or the syntax of the
>> position just before the end point when that position might be the first
>> character of a two-character construct, i.e. an escape or first char of a
>> double-char comment delimiter) is correct.
>
> My idea was that its effect would be tested by using pps-preC as
> OLDSTATE, which avoids having to encode the specifics in the test.  I
> added another clause which uses pps-aftC to cover parsing from the
> middle of a comment closer as well as opener.

Ping?  Agree/Disagree?





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2017-01-13  2:07                           ` npostavs
@ 2017-01-23 20:12                             ` Alan Mackenzie
  2017-01-24  0:30                               ` npostavs
  0 siblings, 1 reply; 24+ messages in thread
From: Alan Mackenzie @ 2017-01-23 20:12 UTC (permalink / raw)
  To: npostavs; +Cc: 24870, Matt Armstrong

Hello, Noam.

Sorry this is very late, but better that than not at all.

On Thu, Jan 12, 2017 at 21:07:49 -0500, npostavs@users.sourceforge.net wrote:
> npostavs@users.sourceforge.net writes:

> > Alan Mackenzie <acm@muc.de> writes:

> >> (ii) In `parse-partial-sexp-continue-over-comment-marker', variable aftC
> >> is the position in the middle of the comment closer "*/".  I don't think
> >> you are testing in any way that element 10 (nil, or the syntax of the
> >> position just before the end point when that position might be the first
> >> character of a two-character construct, i.e. an escape or first char of a
> >> double-char comment delimiter) is correct.

> > My idea was that its effect would be tested by using pps-preC as
> > OLDSTATE, which avoids having to encode the specifics in the test.  I
> > added another clause which uses pps-aftC to cover parsing from the
> > middle of a comment closer as well as opener.

> Ping?  Agree/Disagree?

I've just had another fairly intensive look at the patches, and I agree.

I think it's time to commit these.  What do you say?

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 24+ messages in thread

* bug#24870: 26.0.50; parse-partial-sexp ignores comment-end
  2017-01-23 20:12                             ` Alan Mackenzie
@ 2017-01-24  0:30                               ` npostavs
  0 siblings, 0 replies; 24+ messages in thread
From: npostavs @ 2017-01-24  0:30 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 24870, Matt Armstrong

tags 24870 fixed
close 24870 
quit

Alan Mackenzie <acm@muc.de> writes:
>
> On Thu, Jan 12, 2017 at 21:07:49 -0500, npostavs@users.sourceforge.net wrote:
>> npostavs@users.sourceforge.net writes:
>
>> > Alan Mackenzie <acm@muc.de> writes:
>
>> >> (ii) In `parse-partial-sexp-continue-over-comment-marker', variable aftC
>> >> is the position in the middle of the comment closer "*/".  I don't think
>> >> you are testing in any way that element 10 (nil, or the syntax of the
>> >> position just before the end point when that position might be the first
>> >> character of a two-character construct, i.e. an escape or first char of a
>> >> double-char comment delimiter) is correct.
>
>> > My idea was that its effect would be tested by using pps-preC as
>> > OLDSTATE, which avoids having to encode the specifics in the test.  I
>> > added another clause which uses pps-aftC to cover parsing from the
>> > middle of a comment closer as well as opener.
>
>
> I've just had another fairly intensive look at the patches, and I
> agree.

Thanks for the review.

>
> I think it's time to commit these.  What do you say?

Yep, pushed to master [1: 201dfe3].

1: 2017-01-23 19:28:30 -0500 201dfe311868932d10da146808fcdd681948ba53
  Fix comment detection on open parens





^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-01-24  0:30 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-03 19:38 bug#24870: 26.0.50; parse-partial-sexp ignores comment-end Andreas Röhler
2016-11-30  9:10 ` Matt Armstrong
2016-11-30 12:37   ` Andreas Röhler
2016-11-30 23:02     ` Matt Armstrong
2016-12-01  1:17       ` npostavs
2016-12-01  8:24         ` Andreas Röhler
2016-12-14  3:00           ` npostavs
2016-12-14  4:04             ` npostavs
2016-12-14  6:45               ` Andreas Röhler
2016-12-14 19:56               ` Alan Mackenzie
2016-12-15  8:18                 ` Andreas Röhler
2016-12-15 16:01                   ` Eli Zaretskii
2016-12-15 16:50                   ` Alan Mackenzie
2016-12-15 17:59                     ` Andreas Röhler
2016-12-14 21:58               ` Alan Mackenzie
2016-12-15 16:33                 ` Noam Postavsky
2016-12-15 16:44                   ` Alan Mackenzie
2016-12-18  5:39                     ` npostavs
2016-12-29 11:14                       ` Alan Mackenzie
2016-12-30  1:55                         ` npostavs
2017-01-13  2:07                           ` npostavs
2017-01-23 20:12                             ` Alan Mackenzie
2017-01-24  0:30                               ` npostavs
2016-12-01  8:33       ` Andreas Röhler

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).