unofficial mirror of emacs-devel@gnu.org 
* [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
@ 2018-05-22  7:42 Tino Calancha
  2018-05-22 17:40 ` Alan Mackenzie
  2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie
  0 siblings, 2 replies; 93+ messages in thread
From: Tino Calancha @ 2018-05-22  7:42 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Tino Calancha, Emacs developers



Hi Alan,

Since this commit one test 
(electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings)
in electric-tests.el is failing with error:

(scan-error "Unbalanced parentheses" 6 1)

This is the backtrace:
   scan-sexps(7 -1)
   forward-sexp(-1)
   backward-sexp()
   c-before-change-check-unbalanced-strings(6 6)
   #f(compiled-function (fn) #<bytecode 0x1171d85>)(c-before-change-che
   mapc(#f(compiled-function (fn) #<bytecode 0x1171d85>) (c-extend-regi
   c-before-change(6 6)
   self-insert-command(1)
   electric-pair--insert(34)
   electric-pair-post-self-insert-function()
   self-insert-command(1)
   funcall-interactively(self-insert-command 1)
   call-interactively(self-insert-command)
   #f(compiled-function () #<bytecode 0x484725>)()
   funcall(#f(compiled-function () #<bytecode 0x484725>))
   (let nil (funcall '#f(compiled-function () #<bytecode 0x484725>)))
   eval((let nil (funcall '#f(compiled-function () #<bytecode 0x484725>
   #f(compiled-function () #<bytecode 0x48475d>)()
   call-with-saved-electric-modes(#f(compiled-function () #<bytecode 0x
   electric-pair-test-for("\"foo\"" 2 34 "\"\"foo\"\"" 3 c++-mode nil #
   #f(compiled-function () #<bytecode 0x514bbd>)()
   ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
   ert-run-test(#s(ert-test :name electric-pair-autowrapping-5-at-point
   ert-run-or-rerun-test(#s(ert--stats :selector (not (or (tag :expensi
   ert-run-tests((not (or (tag :expensive-test) (tag :unstable))) #f(co
   ert-run-tests-batch((not (or (tag :expensive-test) (tag :unstable)))
   ert-run-tests-batch-and-exit((not (or (tag :expensive-test) (tag :un
   eval((ert-run-tests-batch-and-exit '(not (or (tag :expensive-test) (
   command-line-1(("-L" ":." "-l" "ert" "-l" "lisp/electric-tests" "--e
   command-line()
   normal-top-level()




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22  7:42 [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings Tino Calancha
@ 2018-05-22 17:40 ` Alan Mackenzie
  2018-05-22 19:21   ` João Távora
  2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-05-22 17:40 UTC (permalink / raw)
  To: Tino Calancha, João Távora; +Cc: Emacs developers

Hello, Tino and João.

On Tue, May 22, 2018 at 16:42:46 +0900, Tino Calancha wrote:


> Hi Alan,

> Since this commit one test 
> (electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings)
> in electric-tests.el is failing with error:

> (scan-error "Unbalanced parentheses" 6 1)

Sorry about that.

> This is the backtrace:
>    scan-sexps(7 -1)
>    forward-sexp(-1)
>    backward-sexp()
>    c-before-change-check-unbalanced-strings(6 6)
>    #f(compiled-function (fn) #<bytecode 0x1171d85>)(c-before-change-che
>    mapc(#f(compiled-function (fn) #<bytecode 0x1171d85>) (c-extend-regi
>    c-before-change(6 6)
>    self-insert-command(1)
>    electric-pair--insert(34)
>    electric-pair-post-self-insert-function()
>    self-insert-command(1)
>    funcall-interactively(self-insert-command 1)
>    call-interactively(self-insert-command)
>    #f(compiled-function () #<bytecode 0x484725>)()
>    funcall(#f(compiled-function () #<bytecode 0x484725>))
>    (let nil (funcall '#f(compiled-function () #<bytecode 0x484725>)))
>    eval((let nil (funcall '#f(compiled-function () #<bytecode 0x484725>
>    #f(compiled-function () #<bytecode 0x48475d>)()
>    call-with-saved-electric-modes(#f(compiled-function () #<bytecode 0x
>    electric-pair-test-for("\"foo\"" 2 34 "\"\"foo\"\"" 3 c++-mode nil #
>    #f(compiled-function () #<bytecode 0x514bbd>)()
>    ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
>    ert-run-test(#s(ert-test :name electric-pair-autowrapping-5-at-point
>    ert-run-or-rerun-test(#s(ert--stats :selector (not (or (tag :expensi
>    ert-run-tests((not (or (tag :expensive-test) (tag :unstable))) #f(co
>    ert-run-tests-batch((not (or (tag :expensive-test) (tag :unstable)))
>    ert-run-tests-batch-and-exit((not (or (tag :expensive-test) (tag :un
>    eval((ert-run-tests-batch-and-exit '(not (or (tag :expensive-test) (
>    command-line-1(("-L" ":." "-l" "ert" "-l" "lisp/electric-tests" "--e
>    command-line()
>    normal-top-level()

João, the file electric-tests.el is anything but straightforward to
read.  The test referred to above is generated by a nest of two or three
macros, somehow, and it is not obvious what buffer operations were
generated by these macros, and how they triggered a newly introduced bug
in C++ Mode.  The comments in the file are too sparse to help.

How do I extract these essential details from
"electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings"?

Thanks!

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22 17:40 ` Alan Mackenzie
@ 2018-05-22 19:21   ` João Távora
  2018-05-22 19:34     ` Eli Zaretskii
                       ` (2 more replies)
  0 siblings, 3 replies; 93+ messages in thread
From: João Távora @ 2018-05-22 19:21 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Hi Alan,

Alan Mackenzie <acm@muc.de> writes:

> electric-tests.el is anything but straightforward to read.

lol, sorry I couldn't quite make the 20000+LOC standards of cc-mode.el
:p.

No really, kidding, <wipes tears>, that file needs macros because it
defines almost 500 tests with very subtle variations between them. You
tripped one of them, and I think we're both glad you did (regardless of
who is at fault: test or c++-mode)

> The test referred to above is generated by a nest of two or three
> macros, somehow, and it is not obvious what buffer operations were
> generated by these macros, and how they triggered a newly introduced bug
> in C++ Mode.  The comments in the file are too sparse to help.

Here's how it works: There's only one macro for electric-pair tests,
aptly named define-electric-pair-test.

In that file, find the `define-electric-pair-test' that most closely
matches the test failure, in this case it's:

   (define-electric-pair-test autowrapping-5
     "foo" "\"" :expected-string "\"foo\"" :expected-point 2
     :fixture-fn #'(lambda ()
                     (electric-pair-mode 1)
                     (mark-sexp 1)))

Now go to the end of the expression and type M-x
pp-macroexpand-last-sexp. It should be easy to find your failing test,
defined in terms of `ert-deftest', in a list of 6 tests. Here it is:

  (ert-deftest electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings nil
  "With |\"foo\"|, try input \" at point 2. Should become |\"\"foo\"\"| and point at 3"
               (electric-pair-test-for "\"foo\"" 2 34 "\"\"foo\"\"" 3 'c++-mode nil
                                       #'(lambda nil
                                           (electric-pair-mode 1)
                                           (mark-sexp 1))))

Now M-x edebug-defun this form straight in the *Pp Macroexpand Output*
buffer, and M-x edebug-defun the `electric-pair-test-for' defun,
too. Now run the test:

  M-x ert RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings RET

As you step through the code, you'll eventually land on that lambda
which calls mark-sexp and hints, along with the name, that this is a
region-autowrapping test. This is why it expects, for a single character
of input, that two quotes are inserted in the buffer instead of
one. The test passes in my 26.1 as you probably already knew.
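
For anyone who prefers to replay the scenario outside ERT, here is a minimal sketch of what the expanded test roughly does. It is an approximation written for illustration, not a copy of `electric-pair-test-for': the name `my/replay-autowrapping-5' is hypothetical, and binding `transient-mark-mode' so that `mark-sexp' yields an active region is an assumption about what the harness needs.

```elisp
(require 'cc-mode)
(require 'elec-pair)

(defun my/replay-autowrapping-5 ()
  "Replay the autowrapping-5 scenario by hand (hypothetical helper).
Return a list (BUFFER-TEXT POINT) describing the buffer afterwards."
  (let ((was-on electric-pair-mode))
    (unwind-protect
        (with-temp-buffer
          (c++-mode)
          (insert "\"foo\"")            ; buffer is |"foo"|
          (goto-char 2)                 ; point just after the opening quote
          ;; Assumption: the region must be active for wrapping to happen.
          (let ((transient-mark-mode 'lambda)
                (last-command-event ?\"))
            (electric-pair-mode 1)
            (mark-sexp 1)               ; the fixture lambda from the test
            ;; Simulate pressing the `"' key.
            (call-interactively (key-binding (vector last-command-event))))
          (list (buffer-substring-no-properties (point-min) (point-max))
                (point)))
      ;; Restore the global mode to whatever it was before.
      (electric-pair-mode (if was-on 1 -1)))))
```

Evaluating (my/replay-autowrapping-5) under M-x toggle-debug-on-error should land in the same code path as the failing test, without any of the macro machinery in the way.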

Good luck hunting the bug and let me know if you have more problems.

Thanks,
João

PS: you could also have used:

   M-x ert-describe-test RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings

Which would have rendered a nice docstring

    electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
    test defined in `electric-tests.elc'.
     
    With |"foo"|, try input " at point 2. Should become |""foo""| and point at 3
     
    [back]

Though, admittedly, this is misleading for the "autowrapping" tests,
since it doesn't tell you about the "mark-sexp" region-making
command. Also perhaps not immediately clear is that the | are buffer
boundaries.

Ideally, it should read

  With |"foo"|, at point 2, (mark-sexp 1) and try input ".
  Should become |""foo""| and point at 3

I will try to fix this in master.

Also M-x ert-find-test-other-window could have helped you, but it
doesn't (brings me to the beginning of the file, which isn't helpful). I
don't know why, does anyone?





* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification,  etc., of unterminated strings.
  2018-05-22 19:21   ` João Távora
@ 2018-05-22 19:34     ` Eli Zaretskii
  2018-05-22 20:25       ` João Távora
  2018-05-23 20:46     ` Alan Mackenzie
  2018-05-23 23:21     ` Michael Welsh Duggan
  2 siblings, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-05-22 19:34 UTC (permalink / raw)
  To: João Távora; +Cc: acm, tino.calancha, emacs-devel

> From: joaotavora@gmail.com (João Távora)
> Date: Tue, 22 May 2018 20:21:25 +0100
> Cc: Emacs developers <emacs-devel@gnu.org>,
> 	Tino Calancha <tino.calancha@gmail.com>
> 
> Here's how it works:

Thanks.  How about adding this information to the test file(s), as
comments, so that others won't need to search high and low for it?




* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22 19:34     ` Eli Zaretskii
@ 2018-05-22 20:25       ` João Távora
  2018-05-22 22:17         ` João Távora
  0 siblings, 1 reply; 93+ messages in thread
From: João Távora @ 2018-05-22 20:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, tino.calancha, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: joaotavora@gmail.com (João Távora)
>> Date: Tue, 22 May 2018 20:21:25 +0100
>> Cc: Emacs developers <emacs-devel@gnu.org>,
>> 	Tino Calancha <tino.calancha@gmail.com>
>> 
>> Here's how it works:
>
> Thanks.  How about adding this information to the test file(s), as
> comments, so that others won't need to search high and low for it?

Sure, but there are two more important things to do before:

1. fix the generated docstring so that M-x
ert-describe-test returns useful information. I can do this.

2. fix M-x ert-find-test-other-window so that it finds the generating
form. Who can help?

Emacs is a self-documenting editor: I would need to explain very little
to a seasoned Emacs user if just one of these things were working, let
alone two.

That said, adding comments is seldom a bad idea (because when they
become outdated, it's a terrible idea.)

João






* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22 20:25       ` João Távora
@ 2018-05-22 22:17         ` João Távora
  2018-05-23 14:52           ` Eli Zaretskii
  0 siblings, 1 reply; 93+ messages in thread
From: João Távora @ 2018-05-22 22:17 UTC (permalink / raw)
  To: Eli Zaretskii, acm; +Cc: tino.calancha, emacs-devel


Eli Zaretskii <eliz@gnu.org> writes:
>> Thanks.  How about adding this information to the test file(s), as
>> comments, so that others won't need to search high and low for it?
>
> 1. fix the generated docstring so that M-x
> ert-describe-test returns useful information. I can do this.

I just pushed this change to master and though it's not perfect, I'm
pretty happy with the result.  Here's readable output of M-x
ert-describe-test for the test that Alan is investigating:

  electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
  test defined in `electric-tests.elc'.
   
  Electricity test in a `c++-mode' buffer.
   
  Start with point at 2 in a 5-char-long buffer
  like this one:
   
    |"foo"|   (buffer start and end are denoted by `|')
   
  Now call this:
   
  #'(lambda nil
      (electric-pair-mode 1)
      (mark-sexp 1))
   
  Now press the key for: "
   
  The buffer's contents should become:
   
    |""foo""|
   
  , and point should be at 3.


João




* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22 22:17         ` João Távora
@ 2018-05-23 14:52           ` Eli Zaretskii
  0 siblings, 0 replies; 93+ messages in thread
From: Eli Zaretskii @ 2018-05-23 14:52 UTC (permalink / raw)
  To: João Távora; +Cc: acm, tino.calancha, emacs-devel

> From: joaotavora@gmail.com (João Távora)
> Cc: emacs-devel@gnu.org,  tino.calancha@gmail.com
> Date: Tue, 22 May 2018 23:17:22 +0100
> 
> 
> I just pushed this change to master and though it's not perfect, I'm
> pretty happy with the result.  Here's readable output of M-x
> ert-describe-test for the test that Alan is investigating:
> 
>   electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
>   test defined in `electric-tests.elc'.
>    
>   Electricity test in a `c++-mode' buffer.
>    
>   Start with point at 2 in a 5-char-long buffer
>   like this one:
>    
>     |"foo"|   (buffer start and end are denoted by `|')
>    
>   Now call this:
>    
>   #'(lambda nil
>       (electric-pair-mode 1)
>       (mark-sexp 1))
>    
>   Now press the key for: "
>    
>   The buffer's contents should become:
>    
>     |""foo""|
>    
>   , and point should be at 3.

Thanks, LGTM.




* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22 19:21   ` João Távora
  2018-05-22 19:34     ` Eli Zaretskii
@ 2018-05-23 20:46     ` Alan Mackenzie
  2018-05-23 21:12       ` João Távora
  2018-05-23 23:21     ` Michael Welsh Duggan
  2 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-05-23 20:46 UTC (permalink / raw)
  To: João Távora; +Cc: Emacs developers, Tino Calancha

Hello, João.

On Tue, May 22, 2018 at 20:21:25 +0100, João Távora wrote:
> Hi Alan,

> Alan Mackenzie <acm@muc.de> writes:

> > electric-tests.el is anything but straightforward to read.

> lol, sorry I couldn't quite make the 20000+LOC standards of cc-mode.el
> :p.

:-)

> No really, kidding, <wipes tears>, that file needs macros because it
> defines almost 500 tests with very subtle variations between them.

Yes, I understand this.

> You tripped one of them, and I think we're both glad you did
> (regardless of who is at fault: test or c++-mode)

Well, I'm glad about it, so thanks for this test file!  The bug is most
definitely in CC Mode - I'd neglected to handle properly an open string
lacking an EOL to "terminate" it.  This will be easy to fix.

> > The test referred to above is generated by a nest of two or three
> > macros, somehow, and it is not obvious what buffer operations were
> > generated by these macros, and how they triggered a newly introduced bug
> > in C++ Mode.  The comments in the file are too sparse to help.

> Here's how it works: There's only one macro for electric-pair tests,
> aptly named define-electric-pair-test.

> In that file, find the `define-electric-pair-test' that most closely
> matches the test failure, in this case it's:

>    (define-electric-pair-test autowrapping-5
>      "foo" "\"" :expected-string "\"foo\"" :expected-point 2
>      :fixture-fn #'(lambda ()
>                      (electric-pair-mode 1)
>                      (mark-sexp 1)))

> now go to the end of the expression and type M-x
> pp-macroexpand-last-sexp. It should be easy to find your failing test,
> defined in terms of `ert-deftest', in a list of 6 tests. Here it is:

>   (ert-deftest electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings nil
>   "With |\"foo\"|, try input \" at point 2. Should become |\"\"foo\"\"| and point at 3"
>                (electric-pair-test-for "\"foo\"" 2 34 "\"\"foo\"\"" 3 'c++-mode nil
>                                        #'(lambda nil
>                                            (electric-pair-mode 1)
>                                            (mark-sexp 1))))

> Now M-x edebug-defun this form straight in the *Pp Macroexpand Output*
> buffer, and M-x edebug-defun the `electric-pair-test-for' defun,
> too. Now run the test:

>   M-x ert RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings RET

Thanks for this recipe.  It wasn't self evident.

> As you step through the code, you'll eventually land on that lambda
> which calls mark-sexp and hints, along with the name, that this is a
> region-autowrapping test. This is why it expects, for a single character
> of input, that two quotes are inserted in the buffer instead of
> one. The test passes in my 26.1 as you probably already knew.

> Good luck hunting the bug and let me know if you have more problems.

Thanks for the help.

> Thanks,
> João

> PS: you could also have used:

>    M-x ert-describe-test RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings

> Which would have rendered a nice docstring

>     electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
>     test defined in `electric-tests.elc'.
     
>     With |"foo"|, try input " at point 2. Should become |""foo""| and point at 3
     
>     [back]

> Though, admittedly, this is misleading for the "autowrapping" tests,
> since it doesn't tell you about the "mark-sexp" region-making
> command. Also perhaps not immediately clear is that the | are buffer
> boundaries.

Just to be pedantic, it also doesn't say where " number 4 is inserted
(whether at EOL, or before " number 2), but that can be resolved by the
application of intelligence.  :-)

> Ideally, it should read

>   With |"foo"|, at point 2, (mark-sexp 1) and try input ".
>   Should become |""foo""| and point at 3

> I will try to fix this in master.

> Also M-x ert-find-test-other-window could have helped you, but it
> doesn't (brings me to the beginning of the file, which isn't helpful). I
> don't know why, does anyone?

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-23 20:46     ` Alan Mackenzie
@ 2018-05-23 21:12       ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-05-23 21:12 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Alan Mackenzie <acm@muc.de> writes:

>>   M-x ert RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings RET
>
> Thanks for this recipe.  It wasn't self evident.

I know, I added a better docstring now.

I'd like to get rid of the automatically generated test names so you could find
that long symbol in the file, but it seems difficult. The best solution
would be a good find-definition that placed you somewhere where
pp-macroexpand would reveal it to you.

> Just to be pedantic, it also doesn't say where " number 4 is inserted
> (whether at EOL, or before " number 2), but that can be resolved by the
> application of intelligence.  :-)

You raise an existential point: If I put all your atoms together in the
same configuration but in a different order will I have a different
Alan?  ;-)

João






* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
  2018-05-22 19:21   ` João Távora
  2018-05-22 19:34     ` Eli Zaretskii
  2018-05-23 20:46     ` Alan Mackenzie
@ 2018-05-23 23:21     ` Michael Welsh Duggan
  2 siblings, 0 replies; 93+ messages in thread
From: Michael Welsh Duggan @ 2018-05-23 23:21 UTC (permalink / raw)
  To: emacs-devel

joaotavora@gmail.com (João Távora) writes:

> PS: you could also have used:
>
>    M-x ert-describe-test RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings
>
> Which would have rendered a nice docstring
>
>     electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
>     test defined in `electric-tests.elc'.
>      
>     With |"foo"|, try input " at point 2. Should become |""foo""| and point at 3
>      
>     [back]
>
> Though, admittedly, this is misleading for the "autowrapping" tests,
> since it doesn't tell you about the "mark-sexp" region-making
> command. Also perhaps not immediately clear is that the | are buffer
> boundaries.
>
> Ideally, it should read
>
>   With |"foo"|, at point 2, (mark-sexp 1) and try input ".
>   Should become |""foo""| and point at 3
>
> I will try to fix this in master.
>
> Also M-x ert-find-test-other-window could have helped you, but it
> doesn't (brings me to the beginning of the file, which isn't helpful). I
> don't know why, does anyone?

Could the macros that generate the ert-deftest call not set
`definition-name' in the generated function's symbol property list that
leads back to `define-electric-pair-test`'s name element?  That might
help `find-function-search-for-symbol' find the correct location.
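
A minimal sketch of the idea, assuming the generated test symbol is the right place to hang the property. `definition-name' is a standard symbol property, but whether ert's own search regexp would then match a `define-electric-pair-test' form is not established here, so treat this as an illustration only:

```elisp
;; Hypothetical: what the test-generating macro could emit next to each
;; `ert-deftest', pointing the long generated name back at the short
;; name that actually appears in the source file.
(put 'electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings
     'definition-name
     'autowrapping-5)

;; Reading the property back:
(get 'electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings
     'definition-name)
;; → autowrapping-5
```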

-- 
Michael Welsh Duggan
(md5i@md5i.com)




* CC Mode and electric-pair "problem".  (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.)
  2018-05-22  7:42 [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings Tino Calancha
  2018-05-22 17:40 ` Alan Mackenzie
@ 2018-05-31 12:37 ` Alan Mackenzie
  2018-05-31 16:07   ` CC Mode and electric-pair "problem" João Távora
  2018-06-17 16:58   ` Glenn Morris
  1 sibling, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-05-31 12:37 UTC (permalink / raw)
  To: Tino Calancha, João Távora; +Cc: Emacs developers

Hello again, Tino and João.

On Tue, May 22, 2018 at 16:42:46 +0900, Tino Calancha wrote:


> Hi Alan,

> Since this commit one test 
> (electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings)
> in electric-tests.el is failing with error:

> (scan-error "Unbalanced parentheses" 6 1)

This should now have been fixed.

However, the test suite (make check) threw up another discrepancy, in a
test called
electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings.

That's with electric-pair mode enabled, and
electric-pair-skip-whitespace set to 'chomp.  The buffer, at the start
of the test, looks something like:

    " (

      ) "

.  With point just after the (, type a ).  The expected result is that
everything up to and including the existing ) gets "chomped", leaving
the buffer looking like:

    " () "

.  This no longer happens in C++ mode, and it is not clear that it
should.  In the original buffer, ( and ) are not in the same string,
since the opening string ends at EOL, there being no backslash to
continue it.

If there were escaped newlines in the buffer, I don't think the "chomp"
would work, because elec-pair.el doesn't recognise escaped newlines as
whitespace.

Comments?

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie
@ 2018-05-31 16:07   ` João Távora
  2018-05-31 17:28     ` Alan Mackenzie
  2018-06-17 16:58   ` Glenn Morris
  1 sibling, 1 reply; 93+ messages in thread
From: João Távora @ 2018-05-31 16:07 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Hi again, Alan

Alan Mackenzie <acm@muc.de> writes:

>     " (
>
>       ) "
>
> .  With point just after the (, type a ).  The expected result is that
> everything up to and including the existing ) gets "chomped", leaving
> the buffer looking like:
>
>     " () "
>
> .  This no longer happens in C++ mode, and it is not clear that it
> should.  In the original buffer, ( and ) are not in the same string,
> since the opening string ends at EOL, there being no backslash to
> continue it.
>
> If there were escaped newlines in the buffer, I don't think the "chomp"
> would work, because elec-pair.el doesn't recognise escaped newlines as
> whitespace.
>
> Comments?

I can reproduce this, even without turning "chomping" on: 26.1 skips to
the closing parens, master doesn't.

But it's tricky. From elec-pair.el's perspective, skipping whitespace
means skipping whitespace characters *and* not crossing string/comment
boundaries.  To analyse a test case very similar to yours I wrote a
simple function (attached after my sig) to analyse just 5 characters and
an end-of-file.

   ( " \n " ) EOF

In Emacs 26.1 I get

  ((:character 34 :formatted "\"" :syntax
               (7)
               :depth 0 :string nil :last-open-parens nil)
   (:character 40 :formatted "(" :syntax
               (4 . 41)
               :depth 0 :string 34 :last-open-parens 1)
   (:character 10 :formatted "\n" :syntax
               (0)
               :depth 0 :string 34 :last-open-parens 1)
   (:character 41 :formatted ")" :syntax
               (5 . 40)
               :depth 0 :string 34 :last-open-parens 1)
   (:character 34 :formatted "\"" :syntax
               (7)
               :depth 0 :string 34 :last-open-parens 1)
   (:character nil :formatted "EOF" :syntax nil :depth 0 :string nil
               :last-open-parens nil))
               

In Emacs master, I get

  ((:character 34 :formatted "\"" :syntax
               (15)
               :depth 0 :string nil :last-open-parens nil)
   (:character 40 :formatted "(" :syntax
               (4 . 41)
               :depth 0 :string t :last-open-parens 1)
   (:character 10 :formatted "\n" :syntax
               (15)
               :depth 0 :string t :last-open-parens 1)
   (:character 41 :formatted ")" :syntax
               (5 . 40)
               :depth 0 :string nil :last-open-parens nil)
   (:character 34 :formatted "\"" :syntax
               (15)
               :depth -1 :string nil :last-open-parens nil)
   (:character nil :formatted "EOF" :syntax nil :depth -1 :string t
               :last-open-parens 5))

Note that the newline character changed its syntax from (0), which is
whitespace, to (15) which is generic string. But more importantly, the
closing paren after it no longer declares to be inside a string
according to syntax-ppss.
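
For readers not used to raw syntax descriptors, the numbers in the :syntax fields above can be reproduced with `string-to-syntax'. This is only a decoding aid for the two dumps, not a description of how CC Mode applies the properties:

```elisp
;; Each call returns the raw syntax descriptor for a syntax-table string.
(string-to-syntax " ")    ; whitespace               → (0)
(string-to-syntax "\"")   ; string quote             → (7)
(string-to-syntax "|")    ; generic string delimiter → (15)
(string-to-syntax "()")   ; open paren matching )    → (4 . 41)
(string-to-syntax ")(")   ; close paren matching (   → (5 . 40)
```

So in 26.1 the newline is plain whitespace (0) and both quotes are ordinary string quotes (7), while in master the quotes and the newline all carry the generic string delimiter (15).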

Is this what you and (the majority of) cc-mode users expect? If it is,
then this test (and probably many other ones) must be changed to reflect
that.

As a data-point, as an occasional c++-mode user, I'd much rather have
Emacs 26's behaviour.  When faced with such admittedly invalid C, I at
most expect M-x compile or Flymake to tell me about it, but would like
Emacs to treat it as whitespace so electric-pair keeps functioning
correctly.  That is, I expect Emacs to not choke my editing tools
because I've temporarily produced syntactically incorrect code while
editing, particularly tools designed to correct such situations.

I've also noted that whitespace-fixing tools aren't tripped by your
change. But that's because they don't care about comment and string
boundaries, although they could/should.  This suggests we could make
elec-pair.el also not care about them in c++ mode, but it would only
take us so far, because I fear worse problems would come in more basic
elec-pair.el functionality.

In general, I think you should review the recent c++-mode changes. To
illustrate, here's a new bug report without any newlines.

1. emacs-master/src/emacs -Q
2. M-x erase-buffer RET !
3. M-x c++-mode
4. M-x electric-pair-mode
5. insert a double quote (this inserts a closer)
6. insert an opening parens (this inserts a closer)
7. insert a double quote (this inserts a closer, but...)

... it additionally pops up an error:

   c-append-to-state-cache: Scan error: "Unbalanced parentheses", 5, 1

The last quote becomes red. If I erase the buffer again and do the whole
thing again, no error happens and no red quote, which is what I expect
it to do (and Emacs 26 behaviour).

Actually, electric-pair-mode doesn't even need to be on:

1. emacs-master/src/emacs -Q
2. M-x erase-buffer RET !
3. M-x c++-mode
5a. insert a double quote
5b. insert the closer quote
5c. go back one char
6a. insert an opening parens
6b. insert the closer, go back one char
7a. insert a double quote
7b. try to insert the closer quote

You get the same c-append-to-state-cache error.

João


(require 'cl-lib)   ; for `cl-loop' and `cl-return'

(defun joaot/analyse (&optional int)
  (interactive "p")
  (cl-loop for p = (point-min)
           then (1+ p)
           while (<= p (point-max))
           for (depth _ _ string comment _ _ _ open-parens _) = (syntax-ppss p)
           for char = (char-after p)
           collect (list :character char
                         :formatted (if char (format "%c" char) "EOF")
                         :syntax (syntax-after p)
                         :depth depth :string string
                         :last-open-parens open-parens)
           into retval
           finally (when int (message "%s" (pp-to-string retval)))
           (cl-return retval)))




* Re: CC Mode and electric-pair "problem".
  2018-05-31 16:07   ` CC Mode and electric-pair "problem" João Távora
@ 2018-05-31 17:28     ` Alan Mackenzie
  2018-05-31 18:37       ` João Távora
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-05-31 17:28 UTC (permalink / raw)
  To: João Távora; +Cc: Emacs developers, Tino Calancha

Hello, João

On Thu, May 31, 2018 at 17:07:43 +0100, João Távora wrote:
> Hi again, Alan

> Alan Mackenzie <acm@muc.de> writes:

> >     " (
> >
> >       ) "
> >
> > .  With point just after the (, type a ).  The expected result is that
> > everything up to and including the existing ) gets "chomped", leaving
> > the buffer looking like:
> >
> >     " () "
> >
> > .  This no longer happens in C++ mode, and it is not clear that it
> > should.  In the original buffer, ( and ) are not in the same string,
> > since the opening string ends at EOL, there being no backslash to
> > continue it.
> >
> > If there were escaped newlines in the buffer, I don't think the "chomp"
> > would work, because elec-pair.el doesn't recognise escaped newlines as
> > whitespace.
> >
> > Comments?

> I can reproduce this, even without turning "chomping" on: 26.1 skips to
> the closing parens, master doesn't.

> But it's tricky. From elec-pair.el's perspective, skipping whitespace
> means skipping whitespace characters *and* not crossing string/comment
> boundaries.  To analyse a test case very similar to yours I wrote a
> simple function (attached after my sig) to analyse just 5 characters and
> an end-of-file.

>    ( " \n " ) EOF

I think you mean " ( \n ) " EOF.  :-)

> In Emacs 26.1 I get

>   ((:character 34 :formatted "\"" :syntax
>                (7)
>                :depth 0 :string nil :last-open-parens nil)
>    (:character 40 :formatted "(" :syntax
>                (4 . 41)
>                :depth 0 :string 34 :last-open-parens 1)
>    (:character 10 :formatted "\n" :syntax
>                (0)
>                :depth 0 :string 34 :last-open-parens 1)
>    (:character 41 :formatted ")" :syntax
>                (5 . 40)
>                :depth 0 :string 34 :last-open-parens 1)
>    (:character 34 :formatted "\"" :syntax
>                (7)
>                :depth 0 :string 34 :last-open-parens 1)
>    (:character nil :formatted "EOF" :syntax nil :depth 0 :string nil
>                :last-open-parens nil))
               

> In Emacs master, I get

>   ((:character 34 :formatted "\"" :syntax
>                (15)
>                :depth 0 :string nil :last-open-parens nil)
>    (:character 40 :formatted "(" :syntax
>                (4 . 41)
>                :depth 0 :string t :last-open-parens 1)
>    (:character 10 :formatted "\n" :syntax
>                (15)
>                :depth 0 :string t :last-open-parens 1)
>    (:character 41 :formatted ")" :syntax
>                (5 . 40)
>                :depth 0 :string nil :last-open-parens nil)
>    (:character 34 :formatted "\"" :syntax
>                (15)
>                :depth -1 :string nil :last-open-parens nil)
>    (:character nil :formatted "EOF" :syntax nil :depth -1 :string t
>                :last-open-parens 5))

> Note that the newline character changed its syntax from (0), which is
> whitespace, to (15) which is generic string. But more importantly, the
> closing paren after it no longer declares to be inside a string
> according to syntax-ppss.

> Is this what you and (the majority of) cc-mode users expect? If it is,
> then this test (and probably many other ones) must be changed to reflect
> that.

Yes.  A string in C(++) mode extending over several lines is only valid
when the newlines are escaped.  The generic string syntax is partly an
artifice to get font-lock-warning-face, but is also deliberately
intended to cut the opener of the invalid string off from any subsequent
double quote.
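
The effect of that generic string syntax can be modelled outside CC Mode.
What follows is only a sketch, not CC Mode's code: the function name
`my-string-state-at' is made up, and the (15) text properties are applied
by hand to the positions described above (the opening quote and the
newline).

```elisp
(defun my-string-state-at (text pos &optional fence-positions)
  "Return the string state (element 3 of the parse state) at POS in TEXT.
Each position in FENCE-POSITIONS first receives a `syntax-table' text
property of (15), i.e. generic string fence.  Hypothetical helper."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      ;; Spell out the syntax assumed here rather than rely on defaults.
      (modify-syntax-entry ?\" "\"" table)
      (modify-syntax-entry ?\n " " table)
      (modify-syntax-entry ?\( "()" table)
      (modify-syntax-entry ?\) ")(" table)
      (set-syntax-table table))
    (insert text)
    ;; Let the parser honour `syntax-table' text properties.
    (setq-local parse-sexp-lookup-properties t)
    (dolist (p fence-positions)
      (put-text-property p (1+ p) 'syntax-table '(15)))
    (nth 3 (syntax-ppss pos))))

;; Positions in the text: 1 = ", 2 = (, 3 = newline, 4 = ), 5 = "
;; Plain syntax: the ) at position 4 is still inside the string.
(my-string-state-at "\"(\n)\"" 4)
;; Fences on the opening quote and the newline: the string stops at the
;; newline, so position 4 is outside it.
(my-string-state-at "\"(\n)\"" 4 '(1 3))
```

The first call models the Emacs 26.1 state and the second the state on
master, under the assumptions stated above.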

> As a data-point, as an occasional c++- mode user, I'd much rather have
> Emacs 26's behaviour.  When faced with such admittedly invalid C, I at
> most expect M-x compile or Flymake to tell me about it, but would like
> Emacs to treat it as whitespace so electric-pair keeps functioning
> correctly.  That is, I expect Emacs to not choke my editing tools
> because I've temporarily produced syntactically incorrect code while
> editing, particularly tools designed to correct such situations.

OK.  I'll need to mull this over.

> I've also noted that whitespace-fixing tools aren't tripped by your
> change. But that's because they don't care about comment and string
> boundaries, although they could/should.  This suggests we could make
> elec-pair.el also not care about them in c++ mode, but it would only
> take us so far, because I fear worse problems would come in more basic
> elec-pair.el functionality.

> In general, I think you should review the recent c++-mode changes. To
> illustrate, here's a new bug report without any newlines.

> 1. emacs-master/src/emacs -Q
> 2. M-x erase-buffer RET !
> 3. M-x c++-mode
> 4. M-x electric-pair-mode
> 5. insert a double quote (this inserts a closer)
> 6. insert an opening parens (this inserts a closer)
> 7. insert a double quote (this inserts a closer, but...)

> ... it additionally pops up an error:

>    c-append-to-state-cache: Scan error: "Unbalanced parentheses", 5, 1

I don't see this at all.  For me, that sequence of actions simply works,
without signalling an error.  This was on the master branch as I
committed my change today.

> The last quote becomes red. If I erase the buffer again and do the whole
> thing again, no error happens and no red quote, which is what I expect
> it to do (and Emacs 26 behaviour).

> Actually, electric-pair-mode doesn't even need to be on:

> 1. emacs-master/src/emacs -Q
> 2. M-x erase-buffer RET !
> 3. M-x c++-mode
> 5a. insert a double quote
> 5b. insert the closer quote
> 5.c go back one char
> 6a. insert an opening parens
> 6b. insert the closer, go back one char
> 7a. insert a double quote
> 7b. try to insert the closer quote

> You get the same c-append-to-state-cache error

I don't see this either.  And we both started with -Q, so it's not
something in .emacs.  Are you sure you've downloaded and built that
latest patch of mine?

> João

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-05-31 17:28     ` Alan Mackenzie
@ 2018-05-31 18:37       ` João Távora
  2018-06-02 13:02         ` Alan Mackenzie
  0 siblings, 1 reply; 93+ messages in thread
From: João Távora @ 2018-05-31 18:37 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Alan Mackenzie <acm@muc.de> writes:

> Hello, João
>
> On Thu, May 31, 2018 at 17:07:43 +0100, João Távora wrote:

>>    ( " \n " ) EOF
>
> I think you mean " ( \n ) " EOF.  :-)

Right :)

> Yes.  A string in C(++) mode extending over several lines is only valid
> when the newlines are escaped.  The generic string syntax is partly an
> artifice to get font-lock-warning-face, but is also deliberately
> intended to cut the opener of the invalid string off from any subsequent
> double quote.

But is there another goal here, apart from the goal of visually
annotating the error?

If the intent is only to annotate the error visually, I'd rather leave
that to something like Flymake.  Admittedly, it's not practical now,
since Flymake usually works by running the whole buffer through an
external syntax check tool, which may take ages compared to using syntax
hints from within emacs.

But that could be changed, my goal is to let Flymake call backends with
only recently changed parts of the buffer, and a much faster
syntax-checking backend could be devised.

Which reminds me, I never did get an answer to

   https://lists.gnu.org/archive/html/emacs-devel/2017-10/msg00448.html

did I?

> OK.  I'll need to mull this over.

OK, do. If you come to the conclusion that it is very important, and
when the code becomes stable, I can increase the complexity of
elec-pair.el a bit to make it work in c++-mode.

BTW do all cc-based modes "forbid" multi-line strings?

>>    c-append-to-state-cache: Scan error: "Unbalanced parentheses", 5, 1
>
> I don't see this at all.  For me, that sequence of actions simply works,
> without signalling an error.  This was on the master branch as I
> committed my change today.

Things move fast :-) I was running a master without your commit from around
noon. I can't reproduce it now either, good job.

João




* Re: CC Mode and electric-pair "problem".
  2018-05-31 18:37       ` João Távora
@ 2018-06-02 13:02         ` Alan Mackenzie
  2018-06-03  3:00           ` João Távora
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-02 13:02 UTC (permalink / raw)
  To: João Távora; +Cc: Emacs developers, Tino Calancha

Hello, João.

On Thu, May 31, 2018 at 19:37:22 +0100, João Távora wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > Hello, João
> >
> > On Thu, May 31, 2018 at 17:07:43 +0100, João Távora wrote:

[ .... ]

> > Yes.  A string in C(++) mode extending over several lines is only
> > valid when the newlines are escaped.  The generic string syntax is
> > partly an artifice to get font-lock-warning-face, but is also
> > deliberately intended to cut the opener of the invalid string off
> > from any subsequent double quote.

> But is there another goal here, apart from the goal of visually
> annotating the error?

Well, "error" might be putting it a bit strongly.  Mainly, it's a
reminder to somebody typing in a string to close it off properly.

> If the intent is only to annotate the error visually, I'd rather leave
> that to something like Flymake.  Admittedly, it's not practical now,
> since Flymake usually works by running the whole buffer through an
> external syntax check tool, which may take ages compared to using syntax
> hints from within emacs.

> But that could be changed, my goal is to let Flymake call backends with
> only recently changed parts of the buffer, and a much faster
> syntax-checking backend could be devised.

All I can reply with, at the moment, is ... Hmmmm.  :-)

> Which reminds me, I never did get an answer to

>    https://lists.gnu.org/archive/html/emacs-devel/2017-10/msg00448.html

> did I?

No, but you've got one now.  :-)

> > OK.  I'll need to mull this over.

> OK, do. If you come to the conclusion that it is very important, and
> when the code becomes stable, I can increase the complexity of
> elec-pair.el a bit to make it work in c++-mode.

I think the increase in complexity would be quite small, and very local.

> BTW do all cc-based modes "forbid" multi-line strings?

No.  Pike Mode has a special feature whereby a string starting with #" is
a multiline string.  I think in D Mode (not maintained here), strings
simply are multiline, and there is no such thing as an escaped EOL.

The writer of the mode sets the CC Mode "language variable"
c-multiline-string-start-char to the character # for Pike Mode, or some
non-character non-nil value for D Mode (usually t, of course).

[ .... ]

> João

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-02 13:02         ` Alan Mackenzie
@ 2018-06-03  3:00           ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-03  3:00 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Hi Alan,

Alan Mackenzie <acm@muc.de> writes:

> Well, "error" might be putting it a bit strongly.  Mainly, it's a
> reminder to somebody typing in a string to close it off properly.

Yeah, precisely. Hence it would be ideal to annotate it, but not break
whitespace autoskip over it.

>> only recently changed parts of the buffer, and a much faster
>> syntax-checking backend could be devised.
>
> All I can reply with, at the moment, is ... Hmmmm.  :-)

That's a more than reasonable reply to my vaporware. I'll let you know
when I have something more concrete.

>> when the code becomes stable, I can increase the complexity of
>> elec-pair.el a bit to make it work in c++-mode.
> I think the increase in complexity would be quite small, and very
> local.

That remains to be seen... Nevertheless, if we're talking about this one
test, my offer stands: I can add a variable to control this (which
c-mode will have to set since it's not going to be the default).

>> BTW do all cc-based modes "forbid" multi-line strings?
>
> No.  Pike Mode has a special feature whereby a string starting with #" is

So Pike Mode keeps the whitespace skip, right?

João




* Re: CC Mode and electric-pair "problem".
  2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie
  2018-05-31 16:07   ` CC Mode and electric-pair "problem" João Távora
@ 2018-06-17 16:58   ` Glenn Morris
  2018-06-17 20:13     ` Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Glenn Morris @ 2018-06-17 16:58 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers, João Távora, Tino Calancha

Alan Mackenzie wrote:

> However, the test suite (make check) threw up another discrepancy, in a
> test called
> electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings.

Hello, is this still being worked on?
The test continues to fail on RHEL 7 and hydra.nixos.org.




* Re: CC Mode and electric-pair "problem".
  2018-06-17 16:58   ` Glenn Morris
@ 2018-06-17 20:13     ` Alan Mackenzie
  2018-06-17 21:07       ` Stefan Monnier
  2018-06-17 21:27       ` João Távora
  0 siblings, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-17 20:13 UTC (permalink / raw)
  To: Glenn Morris; +Cc: Emacs developers, João Távora, Tino Calancha

Hello, Glenn.

On Sun, Jun 17, 2018 at 12:58:53 -0400, Glenn Morris wrote:
> Alan Mackenzie wrote:

> > However, the test suite (make check) threw up another discrepancy, in a
> > test called
> > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings.

> Hello, is this still being worked on?
> The test continues to fail on RHEL 7 and hydra.nixos.org.

From my point of view, the bug is not being worked on this very day, but
has by no means been forgotten.  It has needed a period of mulling over.
I think João sees it the same way.

Although it won't be difficult to fix, this bug is an awkward thing, and
will need decisions (smallish ones) to be taken.

My favoured method would be to alter electric-pair--skip-whitespace such
that a NL terminating a string (as contrasted with a NL terminating a
comment) would be allowed to be scanned over.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-17 20:13     ` Alan Mackenzie
@ 2018-06-17 21:07       ` Stefan Monnier
  2018-06-17 21:27       ` João Távora
  1 sibling, 0 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-06-17 21:07 UTC (permalink / raw)
  To: emacs-devel

> My favoured method would be to alter electric-pair--skip-whitespace such
> that a NL terminating a string (as contrasted with a NL terminating a
> comment) would be allowed to be scanned over.

AFAIK no language currently offers "NL terminating strings".  So, we
should indeed behave as if this NL doesn't terminate the string (IIUC
the problem is that CC-mode marks NL-inside-string as if it terminates
a string, but that's just an internal detail which shouldn't have such
visible side-effects.  Personally I'd vote to just not treat
NL-inside-string in such a special way: it's a lot of trouble on the
implementation side for very little benefit to the end user since the
way strings are font-locked makes it trivially obvious to the user
what's going on).


        Stefan





* Re: CC Mode and electric-pair "problem".
  2018-06-17 20:13     ` Alan Mackenzie
  2018-06-17 21:07       ` Stefan Monnier
@ 2018-06-17 21:27       ` João Távora
  2018-06-18 10:36         ` Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: João Távora @ 2018-06-17 21:27 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

[-- Attachment #1: Type: text/plain, Size: 2079 bytes --]

On Sun, Jun 17, 2018 at 9:13 PM, Alan Mackenzie <acm@muc.de> wrote:

> Hello, Glenn.
>
> On Sun, Jun 17, 2018 at 12:58:53 -0400, Glenn Morris wrote:
> > Alan Mackenzie wrote:
>
> > > However, the test suite (make check) threw up another discrepancy, in a
> > > test called
> > > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings.
>
> > Hello, is this still being worked on?
> > The test continues to fail on RHEL 7 and hydra.nixos.org.
>
> From my point of view, the bug is not being worked on this very day, but
> has by no means been forgotten.  It has needed a period of mulling over.
> I think João sees it the same way.
>

Yes, while mulling over things is generally good, I believe the problem
from Glenn's perspective is the nuisance of checking whether every
test failure is something to worry about or just the thing being
mulled over.

So I suggest taking a quick temporary action to make the test pass
and then think about how to do it properly.  This action could be
disabling the test temporarily but IME that invariably buries the
issue ad aeternum. So it's better to do it in cc-mode.

> Although it won't be difficult to fix, this bug is an awkward thing, and
> will need decisions (smallish ones) to be taken.
>
> My favoured method would be to alter electric-pair--skip-whitespace such
> that a NL terminating a string (as contrasted with a NL terminating a
> comment) would be allowed to be scanned over.
>

I'm OK with adding a customization point to
electric-pair--skip-whitespace that c-mode can customize.  But I also
wonder whether the benefit to end-users of handling NL-terminated
strings is worth it.  Perhaps there are indeed benefits, it's just that
I haven't seen them argued.  But more importantly perhaps there are
ways to reap these benefits in a way that doesn't require changes
to e-p-m, or even better, in a way that benefits all of Emacs,
not just c-mode.

So, in practice, is the advantage here that the user is visually
warned of an invalid NL-terminated string?

João

[-- Attachment #2: Type: text/html, Size: 3022 bytes --]


* Re: CC Mode and electric-pair "problem".
  2018-06-17 21:27       ` João Távora
@ 2018-06-18 10:36         ` Alan Mackenzie
  2018-06-18 13:24           ` João Távora
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-18 10:36 UTC (permalink / raw)
  To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Hello, João.

On Sun, Jun 17, 2018 at 22:27:20 +0100, João Távora wrote:
> On Sun, Jun 17, 2018 at 9:13 PM, Alan Mackenzie <acm@muc.de> wrote:

> > Hello, Glenn.

> > On Sun, Jun 17, 2018 at 12:58:53 -0400, Glenn Morris wrote:
> > > Alan Mackenzie wrote:

> > > > However, the test suite (make check) threw up another discrepancy, in a
> > > > test called
> > > > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings.

> > > Hello, is this still being worked on?
> > > The test continues to fail on RHEL 7 and hydra.nixos.org.

> > From my point of view, the bug is not being worked on this very day, but
> > has by no means been forgotten.  It has needed a period of mulling over.
> > I think João sees it the same way.


> Yes, while mulling over things is generally good, I believe the problem
> from Glenn's perspective is the nuisance of checking whether every
> test failure is something to worry about or just the thing being
> mulled over.

Yes.  But it is the master branch, where not everything can be expected
to work all the time.  I think the main thing is, we're _going_ to fix
this bug.

> So I suggest taking a quick temporary action to make the test pass
> and then think about how to do it properly.  This action could be
> disabling the test temporarily but IME that invariably buries the
> issue ad aeternum. So it's better to do it in cc-mode.

Hmm.  To modify CC Mode temporarily to make 'chomp in electric-pair-mode
work would be an order of magnitude more work than "simply" to fix the
bug.  That's without disabling the handling of " in CC Mode entirely.

> > Although it won't be difficult to fix, this bug is an awkward thing, and
> > will need decisions (smallish ones) to be taken.

> > My favoured method would be to alter electric-pair--skip-whitespace such
> > that a NL terminating a string (as contrasted with a NL terminating a
> > comment) would be allowed to be scanned over.


> I'm OK with adding a customization point to
> electric-pair--skip-whitespace that c-mode can customize.  But I also
> wonder whether the benefit to end-users of handling NL-terminated
> strings is worth it.  Perhaps there are indeed benefits, it's just that
> I haven't seen them argued.

OK, here goes.  Why should major modes tie themselves in knots, just so
that electric-pair-mode can work?  What CC Mode is doing is natural, and
matches the reality.  A C(++) compiler regards an unterminated string as
ending at the (first unescaped) linefeed.  It will then regard the next
line as code (not string).  If there is a subsequent ", the compiler
won't see that as a terminator for the unbalanced opening ".  CC Mode now
matches this reality, which is a Good Thing.

electric-pair-mode's chomp facility could be more rigorously coded -
sometimes it is dealing with visible whitespace, sometimes it is dealing
with syntactic properties.  Surely it should be working with visible
whitespace all the time?
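
The mix of the two can be illustrated with a small model of the
behaviour João described earlier in the thread: skip whitespace
characters, then refuse the move if it crossed a string or comment
boundary.  This is a sketch under stated assumptions, not the code in
elec-pair.el; the name `my-skip-whitespace-demo' is hypothetical, and the
(15) fences are put by hand on the opening quote and the newline.

```elisp
(defun my-skip-whitespace-demo (text start &optional fence-positions)
  "Return where point ends after a guarded whitespace skip from START in TEXT.
Positions in FENCE-POSITIONS get generic string fence syntax (15) via
the `syntax-table' text property.  Hypothetical helper."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\" "\"" table)
      (modify-syntax-entry ?\n " " table)
      (set-syntax-table table))
    (insert text)
    (setq-local parse-sexp-lookup-properties t)
    (dolist (p fence-positions)
      (put-text-property p (1+ p) 'syntax-table '(15)))
    (goto-char start)
    (let ((saved (point))
          ;; Start of the enclosing string or comment, or nil.
          (context (nth 8 (syntax-ppss))))
      ;; Step 1: purely visual, skip space, tab and newline characters.
      (skip-chars-forward " \t\n")
      ;; Step 2: syntactic, undo the move if the context changed.
      (unless (eq context (nth 8 (syntax-ppss)))
        (goto-char saved))
      (point))))

;; Positions: 1 = ", 2 = (, 3 = space, 4 = newline, 5 = space, 6 = ), 7 = "
;; Without fences the skip from 3 reaches the ) at 6.
(my-skip-whitespace-demo "\"( \n )\"" 3)
;; With fences at 1 and 4 the ) is outside the string, so point stays at 3.
(my-skip-whitespace-demo "\"( \n )\"" 3 '(1 4))
```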

I've attempted a bit of debugging.  In addition to
electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction
ended-prematurely-fn of function electric-pair--balance-info, which
snagged on the end of string at EOL.

> But more importantly perhaps there are ways to reap these benefits in a
> way that doesn't require changes to e-p-m, or even better, in a way
> that benefits all of Emacs, not just c-mode.

We are talking about a corner case in e-p-m, namely where e-p-m attempts
to chomp space between parens inside an invalid string.  This surely
won't come up in practice very much.  Is it worth fixing?  (I would say
yes.)

> So, in practice, is the advantage here that the user is visually
> warned of an invalid NL-terminated string?

The user is visually informed of the reality: that one or more strings
are unterminated, and where the "breakage" is (where the
font-lock-string-face stops).  This is an improvement over the previous
handling, where the opening invalid " merely got warning-face, but the
following unterminated string flowed on indefinitely.

The disadvantage is that e-p-m is constraining major modes in how they
can use syntax-table text properties.  I think this is a problem in
electric-pair-mode, not in CC Mode.

> João

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-18 10:36         ` Alan Mackenzie
@ 2018-06-18 13:24           ` João Távora
  2018-06-18 15:18             ` Eli Zaretskii
  2018-06-18 15:42             ` Alan Mackenzie
  0 siblings, 2 replies; 93+ messages in thread
From: João Távora @ 2018-06-18 13:24 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Alan Mackenzie <acm@muc.de> writes:

> Hello, João.

>> Yes, while mulling over things is generally good, I believe the problem
>> from Glenn's perspective is the nuisance of checking whether every
>> test failure is something to worry about or just the thing being
>> mulled over.
> Yes.  But it is the master branch, where not everything can be expected
> to work all the time.  I think the main thing is, we're _going_ to fix
> this bug.

Well, I respectfully and totally disagree.  The reason we have automated
tests in Hydra is to catch unintentional breakage, not intentional
breakage.  And, IIUC that test is the only one preventing a successful
"make check".

For development temporarily unhampered by tests, I think a separate
branch is a much better alternative.  It's a very easy thing to do in
git (and in your case, trivial to merge from and back to master, given
you have near-total control over that area of code).

> Hmm.  To modify CC Mode temporarily to make 'chomp in electric-pair-mode
> work would be an order of magnitude more work than "simply" to fix the
> bug.  That's without disabling the handling of " in CC Mode entirely.

How so? Can't you just revert the commit that broke it? 

>> strings is worth it.  Perhaps there are indeed benefits, it's just that
>> I haven't seen them argued.
> OK, here goes.  Why should major modes tie themselves in knots, just so
> that electric-pair-mode can work?  What CC Mode is doing is natural, and
> matches the reality.

I think you mean "mode", in the singular form :-).

Also, it doesn't "match reality": if you open a line in a string, it
syntax highlights the remaining string as C statements, but the C parser
doesn't see C statements.  IOW, newline doesn't *really* terminate a
string in C.

> electric-pair-mode's chomp facility could be more rigorously coded -
> sometimes it is dealing with visible whitespace, sometimes it is dealing
> with syntactic properties.  Surely it should be working with visible
> whitespace all the time?

No.  If it did so, it would chomp parenthesis from non-comment regions
into comment regions, for example.

That doesn't make sense, not according to show-paren-mode, for example.

By the way, after your change, very basic commands which fall completely
outside electric-pair-mode have fundamentally changed their behaviour in
cc-mode. Here are a few, out of Emacs -Q:

* Open a line in a string, using C-o.  Sexp-navigation is now messed up
  in the whole buffer, i.e. C-M-*.  Most commands error or produce
  surprising results.  This happens even if the intent is to eventually
  add a backslash escaping the newline, or to make it two adjacent
  strings by typing two quotes (something perfectly allowed by C).

* Inside the string, `forward-sexp' in a parenthesis of a NL-terminated
  string now errors where it would previously do its job of jumping to
  the closer;

* Also inside the string, `blink-matching-paren', on by default, also
  doesn't work as before: closing a paren on a NL-started string doesn't
  match the opener.
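
The second item above can be modelled without CC Mode.  The sketch below
is an approximation under stated assumptions: `my-scan-over-paren' is a
made-up name, and the generic string fences (15) are placed by hand on
the opening quote and on the newline rather than by CC Mode itself.

```elisp
(defun my-scan-over-paren (text pos &optional fence-positions)
  "Scan one sexp forward from POS in TEXT and return the end position.
Return the symbol `scan-error' if the scan fails.  Positions in
FENCE-POSITIONS get a `syntax-table' property of (15) first.
Hypothetical helper."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\" "\"" table)
      (modify-syntax-entry ?\n " " table)
      (modify-syntax-entry ?\( "()" table)
      (modify-syntax-entry ?\) ")(" table)
      (set-syntax-table table))
    (insert text)
    (setq-local parse-sexp-lookup-properties t)
    (dolist (p fence-positions)
      (put-text-property p (1+ p) 'syntax-table '(15)))
    (condition-case nil
        (scan-sexps pos 1)
      (scan-error 'scan-error))))

;; Positions: 1 = ", 2 = (, 3 = newline, 4 = ), 5 = "
;; Newline as plain whitespace: the scan from ( finds the matching ).
(my-scan-over-paren "\"(\n)\"" 2)
;; Newline as a string fence: the scan treats it as opening a string
;; that never closes, and signals a scan error.
(my-scan-over-paren "\"(\n)\"" 2 '(1 3))
```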

There are no automated tests for these things, otherwise you could be
seeing test breakage here too (and, with higher probability, you may be
seeing breakage in user's expectations later on).

> I've attempted a bit of debugging.  In addition to
> electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction
> ended-prematurely-fn of function electric-pair--balance-info, which
> snagged on the end of string at EOL.

I don't understand how this matters to the problem at hand, but
regardless, can you make a bug report demonstrating the presumed bug and
its impact so I can follow up?

> We are talking about a corner case in e-p-m, namely where e-p-m attempts
> to chomp space between parens inside an invalid string.  This surely
> won't come up in practice very much.  Is it worth fixing?  (I would say
> yes.)

Don't forget that the particular piece of e-p-m we're talking about is
one of the ways (arguably the easiest way) to actually fix the specific
C/C++ problem at hand for the user.  IOW it's not some random whimsical
useless thing.

>> So, in practice, is the advantage here that the user is visually
>> warned of an invalid NL-terminated string?
> The user is visually informed of the reality: that one or more strings
> are unterminated, and where the "breakage" is (where the
> font-lock-string-face stops).  This is an improvement over the previous
> handling, where the opening invalid " merely got warning-face, but the
> following unterminated string flowed on indefinitely.

I suppose that's a "yes".  In that case, the face `warning`, which
defaults to a very bright red, would be fine for me personally (and I'm
confident it could be made even more evident).  Also, the fact that the
remaining string is now syntax-highlighted as C statements is extremely
confusing.

> The disadvantage is that e-p-m is constraining major modes in how they
> can use syntax-table text properties.  I think this is a problem in
> electric-pair-mode, not in CC Mode.

Again, AFAIK, "mode", singular.  And, obviously, I'm not going to
special-case cc-mode in elec-pair.el: after doing some of my own
mulling, I may open a customization point for cc-mode.el to use.  So at
the very least, it's going to require some (potentially trivial) fix in
cc-mode.el, for sure.

But now that I've understood the non-e-p-m implications of your change,
I urge you to at least make this configurable (if it is already
configurable, then don't mind me).

João




* Re: CC Mode and electric-pair "problem".
  2018-06-18 13:24           ` João Távora
@ 2018-06-18 15:18             ` Eli Zaretskii
  2018-06-18 15:37               ` João Távora
  2018-06-18 15:42             ` Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-06-18 15:18 UTC (permalink / raw)
  To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm

> From: João Távora <joaotavora@gmail.com>
> Date: Mon, 18 Jun 2018 14:24:39 +0100
> Cc: Glenn Morris <rgm@gnu.org>, Emacs developers <emacs-devel@gnu.org>,
> 	Tino Calancha <tino.calancha@gmail.com>
> 
> > Yes.  But it is the master branch, where not everything can be expected
> > to work all the time.  I think the main thing is, we're _going_ to fix
> > this bug.
> 
> Well, I respectfully and totally disagree.  The reason we have automated
> tests in Hydra is to catch unintentional breakage, not intentional
> breakage.  And, IIUC that test is the only one preventing a successful
> "make check".

Isn't there a way to mark a test as expected to fail?




* Re: CC Mode and electric-pair "problem".
  2018-06-18 15:18             ` Eli Zaretskii
@ 2018-06-18 15:37               ` João Távora
  2018-06-18 16:46                 ` Eli Zaretskii
  2018-06-18 20:24                 ` Glenn Morris
  0 siblings, 2 replies; 93+ messages in thread
From: João Távora @ 2018-06-18 15:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Alan Mackenzie, emacs-devel, tino.calancha, Glenn Morris

[-- Attachment #1: Type: text/plain, Size: 1178 bytes --]

Yes, and I'll probably do that. But in my experience, this has a very high
probability of burying the problem, i.e. the incentive for actually fixing
the problem is reduced dramatically.   It's better to do test-breaking
things on separate branches when possible.  IMO expected failures are for
when a feature is being designed and still incomplete, not when it was
already working.
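
For reference, ERT spells an expected failure with the `:expected-result'
keyword.  A minimal sketch follows; the test name and body are made up
for illustration and have nothing to do with the real test in
electric-tests.el.

```elisp
(require 'ert)

;; Hypothetical test: it fails on purpose, and ERT is told to expect that.
(ert-deftest my-example-known-failure ()
  "A test that is currently known to fail."
  :expected-result :failed
  (should (= 1 2)))
```

With this marking the failure is reported as expected, so it no longer
breaks the run, which is exactly why it can bury the problem.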

João

On Mon, Jun 18, 2018, 16:18 Eli Zaretskii <eliz@gnu.org> wrote:

> > From: João Távora <joaotavora@gmail.com>
> > Date: Mon, 18 Jun 2018 14:24:39 +0100
> > Cc: Glenn Morris <rgm@gnu.org>, Emacs developers <emacs-devel@gnu.org>,
> >       Tino Calancha <tino.calancha@gmail.com>
> >
> > > Yes.  But it is the master branch, where not everything can be expected
> > > to work all the time.  I think the main thing is, we're _going_ to fix
> > > this bug.
> >
> > Well, I respectfully and totally disagree.  The reason we have automated
> > tests in Hydra is to catch unintentional breakage, not intentional
> > breakage.  And, IIUC that test is the only one preventing a successful
> > "make check".
>
> Isn't there a way to mark a test as expected to fail?
>

[-- Attachment #2: Type: text/html, Size: 1902 bytes --]


* Re: CC Mode and electric-pair "problem".
  2018-06-18 13:24           ` João Távora
  2018-06-18 15:18             ` Eli Zaretskii
@ 2018-06-18 15:42             ` Alan Mackenzie
  2018-06-18 17:01               ` João Távora
  1 sibling, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-18 15:42 UTC (permalink / raw)
  To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Hello, João.

On Mon, Jun 18, 2018 at 14:24:39 +0100, João Távora wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > Yes.  But it is the master branch, where not everything can be expected
> > to work all the time.  I think the main thing is, we're _going_ to fix
> > this bug.

> Well, I respectfully and totally disagree.  The reason we have automated
> tests in Hydra is to catch unintentional breakage, not intentional
> breakage.

This breakage is unintentional, and we're working out how to fix it.

> For development temporarily unhampered by tests, I think a separate
> branch is a much better alternative.  It's a very easy thing to do in
> git (and in your case, trivial to merge from and back to master, given
> you have near-total control over that area of code).

It's possible, but it's a hassle; it's outside of normal workflow,
therefore involves getting into git's execrable documentation.

> Can't you just revert the commit that broke it? 

It was three (or maybe four) successive commits.  If I revert them, it
will postpone indefinitely the bug that they've fixed.

> > OK, here goes.  Why should major modes tie themselves in knots, just so
> > that electric-pair-mode can work?  What CC Mode is doing is natural, and
> > matches the reality.

> I think you mean "mode", in the singular form :-).

No.  CC Mode comprises lots of modes, not all of them maintained by me.
But even aside from that, CC Mode has often been a pioneer, developing
new techniques, which the rest of Emacs has then followed.  Examples are
hungry deletion and electric indentation.

> Also, it doesn't "match reality": if you open a line in a string, it
> syntax highlights the remaining string as C statements, but the C parser
> doesn't see C statements.  IOW, newline doesn't *really* terminate a
> string in C.

We could argue about words like "terminate" indefinitely.  What I think
is incontrovertible is if you open a line in a string, the portion after
that opening is not part of the string opened on the line above.  The
new fontification reflects this fact.

> > electric-pair-mode's chomp facility could be more rigorously coded -
> > sometimes it is dealing with visible whitespace, sometimes it is dealing
> > with syntactic properties.  Surely it should be working with visible
> > whitespace all the time?

> No.  If it did so, it would chomp parenthesis from non-comment regions
> into comment regions, for example.

But it could use the strategy of determining the end of any comment,
then using non-syntax facilities for traversing the space up to that
end.  Or something like that.

> That doesn't make sense, not according to show-paren-mode, for example.

> By the way, after your change, very basic commands which fall completely
> outside electric-pair-mode have fundamentally changed their behaviour in
> cc-mode. Here are a few, out of Emacs -Q:

> * Open a line in a string, using C-o.  Sexp-navigation is now messed up
>   in the whole buffer, i.e. C-M-*.  Most commands error or produce
>   surprising results.  So even if the intent is to eventually add a
>   backslash escaping the newline, or make it two adjacent strings by
>   typing two quotes (something perfectly allowed by C).

I've tried this, obviously, but as far as I'm aware, the operation of
C-M-* is correct for the (now syntactically incorrect) buffer.  If you
can give me a concrete example, I can look at it and correct it.

> * Inside the string, `forward-sexp' in a parenthesis of a NL-terminated
>   string now errors where it would previously do its job of jumping to
>   the closer;

It works more or less the same as C-M-n always has from a parenthesis
inside a string, which isn't matched in that string.  Just that the
notion of "inside a string" is now more exact than it used to be.

> * Also inside the string, `blink-matching-paren', on by default, also
>   doesn't work as before: closing a paren on a NL-started string doesn't
>   match the opener.

Do you mean a NL-ENDED string?  I see matching here.  If you can be more
precise about the failure, I can look at it.

> There are no automated tests for these things, otherwise you could be
> seeing test breakage here too (and, with higher probability, you may be
> seeing breakage in user's expectations later on).

No, these things are not all intended functionality of Emacs, they're
just side effects of the way the real functionality was implemented.

> > I've attempted a bit of debugging.  In addition to
> > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction
> > ended-prematurely-fn of function electric-pair--balance-info, which
> > snagged on the end of string at EOL.

> I don't understand how this matters to the problem at hand, but
> regardless, can you make a bug report demonstrating the presumed bug and
> its impact so I can follow up?

I attempted to see how difficult it would be to modify elec-pair.el to
cope with unconstrained text properties in buffers.  This was the second
problem I came up against.

> > We are talking about a corner case in e-p-m, namely where e-p-m attempts
> > to chomp space between parens inside an invalid string.  This surely
> > won't come up in practice very much.  Is it worth fixing?  (I would say
> > yes.)

> Don't forget that the particular piece of e-p-m we're talking about is
> one of the ways (arguably the easiest way) to actually fix the specific
> C/C++ problem at hand for the user.  IOW it's not some random whimsical
> useless thing.

It's not useless, but it's rare - it's three things happening all at the
same time, namely a broken string, pseudo-matching parens and space
between them.  This isn't going to happen very often.  I'd wager that
broken strings (two "s with non-escaped NLs between them) in themselves
are quite rare.  But I still think it should be fixed.  :-)

> > The user is visually informed of the reality: that one or more
> > strings are unterminated, and where the "breakage" is (where the
> > font-lock-string-face stops).  This is an improvement over the
> > previous handling, where the opening invalid " merely got
> > warning-face, but the following unterminated string flowed on
> > indefinitely.

> I suppose that's a "yes".  In that case, the face `warning`, which
> defaults to a very bright red, would be fine for me personally (and I'm
> confident it could be made even more evident).  Also, the fact that the
> remaining string is now syntax-highlighted as C statements is extremely
> confusing.

Why?  They are now C statements, and would be handled by the compiler as
such.  Having them fontified as strings (as they previously were) was
confusing.

> > The disadvantage is that e-p-m is constraining major modes in how they
> > can use syntax-table text properties.  I think this is a problem in
> > electric-pair-mode, not in CC Mode.

> Again, AFAIK, "mode", singular.

See above.  Perhaps it's worth noting that AWK-Mode has used this method
of indicating invalid strings for around 15 years, now.  There have
never been any complaints about this from users.

> And, obviously, I'm not going to special-case cc-mode in elec-pair.el:
> after doing some of my own mulling, I may open a customization point
> for cc-mode.el to use.

I think it's a general case, that of having non-neutral syntax-table
text properties on visual space characters.  What do you see as a
customisation option here?

> So at the very least, it's going to require some (potentially trivial)
> fix in cc-mode.el, for sure.

:-)

> But now that I've understood the non-e-p-m implications of your change,
> I urge you to at least make this configurable (if it is already
> configurable, then don't mind me).

Make correct fontification configurable?

To sum up my viewpoint, I regard the way CC Mode now fontifies broken
strings as correct (aside from any remaining bugs, of course).  I think
elec-pair.el's assumption that whitespace always has "neutral" syntax is
unwarranted, and is the root of the current bug.

There remains the problem of making chomping parens inside a broken
string work.  I honestly think that modifying elec-pair.el is the way to
go, but I'm open to suggestions of alternative strategies that CC Mode
could follow to get the same fontification, that wouldn't require
modifying elec-pair.el.

> João

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-18 15:37               ` João Távora
@ 2018-06-18 16:46                 ` Eli Zaretskii
  2018-06-18 17:21                   ` Eli Zaretskii
  2018-06-18 23:49                   ` João Távora
  2018-06-18 20:24                 ` Glenn Morris
  1 sibling, 2 replies; 93+ messages in thread
From: Eli Zaretskii @ 2018-06-18 16:46 UTC (permalink / raw)
  To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm

> From: João Távora <joaotavora@gmail.com>
> Date: Mon, 18 Jun 2018 16:37:33 +0100
> Cc: Alan Mackenzie <acm@muc.de>, Glenn Morris <rgm@gnu.org>, emacs-devel@gnu.org, 
> 	tino.calancha@gmail.com
> 
> Yes, and I'll probably do that. But in my experience, this has a very high probability of burying the problem, i.e.
> the incentive for actually fixing the problem is reduced dramatically.

But putting the problematic code on a branch reduces the incentive
even more, doesn't it?  At least with the expected failure, you will
see when it unexpectedly starts to succeed; on a branch, the code
is easily forgotten forever...

Of course, it's better to fix breakage fats, but we all have our lives.




* Re: CC Mode and electric-pair "problem".
  2018-06-18 15:42             ` Alan Mackenzie
@ 2018-06-18 17:01               ` João Távora
  2018-06-18 18:07                 ` Yuri Khan
                                   ` (3 more replies)
  0 siblings, 4 replies; 93+ messages in thread
From: João Távora @ 2018-06-18 17:01 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Alan Mackenzie <acm@muc.de> writes:

>> For development temporarily unhampered by tests, I think a separate
>> branch is a much better alternative.  It's a very easy thing to do in
>> git (and in your case, trivial to merge from and back to master, given
>> you have near-total control over that area of code).
>
> It's possible, but it's a hassle; it's outside of normal workflow,
> therefore involves getting into git's execrable documentation.

hehe. lol.  OK, but really you should check out branches, they're all
the rage these days :-)

>> > OK, here goes.  Why should major modes tie themselves in knots, just so
>> > that electric-pair-mode can work?  What CC Mode is doing is natural, and
>> > matches the reality.
>
>> I think you mean "mode", in the singular form :-).
>
> No.  CC Mode comprises lots of modes, not all of them maintained by me.
> But even aside from that, CC Mode has often been a pioneer, developing
> new techniques, which the rest of Emacs has then followed.  Examples are
> hungry deletion and electric indentation.

But they are all children of cc-mode.el right?  I meant singular as in,
afaik, nobody else independently thought of doing that besides you.

>> Also, it doesn't "match reality": if you open a line in a string, it
>> syntax highlights the remaining string as C statements, but the C parser
>> doesn't see C statements.  IOW, newline doesn't *really* terminate a
>> string in C.
>
> We could argue about words like "terminate" indefinitely.  What I think
> is incontrovertible is if you open a line in a string, the portion after
> that opening is not part of the string opened on the line above.  The
> new fontification reflects this fact.

OK, but now it reflects something that is also wrong (they're
not statements either), but to a much greater degree. And on top of
that with many more adverse side effects, of which only one is breaking
e-p-m mode.

>> > electric-pair-mode's chomp facility could be more rigorously coded -
>> > sometimes it is dealing with visible whitespace, sometimes it is dealing
>> > with syntactic properties.  Surely it should be working with visible
>> > whitespace all the time?
>
>> No.  If it did so, it would chomp parenthesis from non-comment regions
>> into comment regions, for example.
> But it could use the strategy of determining the end of any comment,
> then using non-syntax facilities for traversing the space up to that
> end.  Or something like that.

I'll look into "something like that".

> I've tried this, obviously, but as far as I'm aware, the operation of
> C-M-* is correct for the (now syntactically incorrect) buffer.  If you
> can give me a concrete example, I can look at it and correct it.

It's now much harder to select the whole invalid string.  It used to be a
matter of C-M-u C-M-SPC.  To use query-replace in the region, for
example.

>> * Also inside the string, `blink-matching-paren', on by default, also
>>   doesn't work as before: closing a paren on a NL-started string doesn't
>>   match the opener.
>
> Do you mean a NL-ENDED string?  I see matching here.  If you can be more
> precise about the failure, I can look at it.

No, I mean the closer.  You and the mode don't consider that a string
anymore, but you used to, and I still want to.

>> There are no automated tests for these things, otherwise you could be
>> seeing test breakage here too (and, with higher probability, you may be
>> seeing breakage in user's expectations later on).
> No, these things are not all intended functionality of Emacs, they're
> just side effects of the way the real functionality was implemented.

These accidents, as you have them, work just fine in just about any
other mode I can imagine.  And they worked just fine in c-mode up until
your change.

>> > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction
>> > ended-prematurely-fn of function electric-pair--balance-info, which
>> > snagged on the end of string at EOL.
>> I don't understand how this matters to the problem at hand, but
>> regardless, can you make a bug report demonstrating the presumed bug and
>> its impact so I can follow up?
> I attempted to see how difficult it would be to modify elec-pair.el to
> cope with unconstrained text properties in buffers.  This was the second
> problem I came up against.

Well, programming is a continuous problem in general.  If I understand
correctly, the thing you're trying to change is an implementation detail
of electric-pair-mode, not part of its contract, right?  If, on the
contrary, you think it is a bug, let me know.

>> > We are talking about a corner case in e-p-m, namely where e-p-m attempts
>> > to chomp space between parens inside an invalid string.  This surely
>> > won't come up in practice very much.  Is it worth fixing?  (I would say
>> > yes.)
>> Don't forget that the particular piece of e-p-m we're talking about is
>> one of the ways (arguably the easiest way) to actually fix the specific
>> C/C++ problem at hand for the user.  IOW it's not some random whimsical
>> useless thing.
> It's not useless, but it's rare - it's three things happening all at the
> same time, namely a broken string, pseudo-matching parens and space
> between them.  This isn't going to happen very often.  I'd wager that
> broken strings (two "s with non-escaped NLs between them) in themselves
> are quite rare.  But I still think it should be fixed.  :-)

Well, it's handling the rarities that makes Emacs stand out.

>> > The user is visually informed of the reality: that one or more
>> > strings are unterminated, and where the "breakage" is (where the
>> > font-lock-string-face stops).  This is an improvement over the
>> > previous handling, where the opening invalid " merely got
>> > warning-face, but the following unterminated string flowed on
>> > indefinitely.
>
>> I suppose that's a "yes".  In that case, the face `warning`, which
>> defaults to a very bright red, would be fine for me personally (and I'm
>> confident it could be made even more evident).  Also, the fact that the
>> remaining string is now syntax-highlighted as C statements is extremely
>> confusing.
>
> Why?  They are now C statements, and would be handled by the compiler as
> such.

Clarify "would". Because this doesn't compile.  My compiler doesn't even
seem to look at anything after the unterminated string:
    
   int main () {
      printf("foo
             ); 
      printf("bar");
      return 0;
     }


>> > The disadvantage is that e-p-m is constraining major modes in how they
>> > can use syntax-table text properties.  I think this is a problem in
>> > electric-pair-mode, not in CC Mode.
>
>> Again, AFAIK, "mode", singular.
>
> See above.  Perhaps it's worth noting that AWK-Mode has used this method
> of indicating invalid strings for around 15 years, now.  There have
> never been any complaints about this from users.

But they weren't ever exposed to the previous behaviour, right?  And
also, I believe that there is some discrepancy between the number of users
of AWK and C, the complexity of the average program, etc...

>> But now that I've understood the non-e-p-m implications of your change,
>> I urge you to at least make this configurable (if it is already
>> configurable, then don't mind me).
> Make correct fontification configurable?

For some newfound value of "correct", surely...

> There remains the problem of making chomping parens inside a broken
> string work.  I honestly think that modifying elec-pair.el is the way to
> go, but I'm open to suggestions of alternative strategies that CC Mode
> could follow to get the same fontification, that wouldn't require
> modifying elec-pair.el.

As I said, I will look into providing an entry point in elec-pair.el for
this.

Didn't you mention earlier pike-mode and d-mode? Quoting your earlier
message:

    > Pike Mode has a special feature whereby a string starting with #"
    > is a multiline string.  I think in D Mode (not maintained here),
    > strings simply are multiline, and there is no such thing as an
    > escaped EOL.

    > The writer of the mode sets the CC Mode "language variable"
    > c-multiline-string-start-char to the character # for Pike Mode, or
    > some non-character non-nil value for D Mode (usually t, of
    > course).

Can't I do this to my c/c++ mode?  Wouldn't this be a way to get the old
behaviour back?  Perhaps it could be let-bound in tests, also.

João





* Re: CC Mode and electric-pair "problem".
  2018-06-18 16:46                 ` Eli Zaretskii
@ 2018-06-18 17:21                   ` Eli Zaretskii
  2018-06-18 23:49                   ` João Távora
  1 sibling, 0 replies; 93+ messages in thread
From: Eli Zaretskii @ 2018-06-18 17:21 UTC (permalink / raw)
  To: joaotavora; +Cc: acm, tino.calancha, rgm, emacs-devel

> Date: Mon, 18 Jun 2018 19:46:15 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: acm@muc.de, emacs-devel@gnu.org, tino.calancha@gmail.com, rgm@gnu.org
> 
> Of course, it's better to fix breakage fats, but we all have our lives.
                                         ^^^^
I meant "fast", of course...




* Re: CC Mode and electric-pair "problem".
  2018-06-18 17:01               ` João Távora
@ 2018-06-18 18:07                 ` Yuri Khan
  2018-06-18 22:52                   ` João Távora
  2018-06-18 18:08                 ` Alan Mackenzie
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 93+ messages in thread
From: Yuri Khan @ 2018-06-18 18:07 UTC (permalink / raw)
  To: João Távora
  Cc: Alan Mackenzie, Emacs developers, Tino Calancha, Glenn Morris

On Tue, Jun 19, 2018 at 12:25 AM João Távora <joaotavora@gmail.com> wrote:
> Alan Mackenzie <acm@muc.de> writes:
> > Why?  They are now C statements, and would be handled by the compiler as
> > such.
>
> Clarify "would". Because this doesn't compile.  My compiler doesn't even
> seem to look at anything after the unterminated string:
>
>    int main () {
>       printf("foo
>              );
>       printf("bar");
>       return 0;
>      }

Mine does. After finding a syntax error, a typical C compiler
continues scanning the source, attempting to diagnose more errors.
See:

```
int main() {
    printf("foo
           );
    return baz;
}

$ gcc test1.c

test1.c: In function ‘main’:
test1.c:2:5: warning: implicit declaration of function ‘printf’
[-Wimplicit-function-declaration]
     printf("foo
     ^
test1.c:2:5: warning: incompatible implicit declaration of built-in
function ‘printf’
test1.c:2:5: note: include ‘<stdio.h>’ or provide a declaration of ‘printf’
test1.c:2:12: warning: missing terminating " character
     printf("foo
            ^
test1.c:2:5: error: missing terminating " character
     printf("foo
     ^
test1.c:2:5: error: too few arguments to function ‘printf’
test1.c:4:12: error: ‘baz’ undeclared (first use in this function)
     return baz;
            ^
test1.c:4:12: note: each undeclared identifier is reported only once
for each function it appears in
```

Observe that the compiler first complains about the unclosed string
literal, then too few arguments, and then undeclared identifier ‘baz’.
If the compiler thought the string literal continued, it would skip
everything until the end of file silently.


Now add a few characters:

```
int main() {
    printf("foo
           "bar");
    return baz;
}

$ gcc test2.c

test2.c: In function ‘main’:
test2.c:2:5: warning: implicit declaration of function ‘printf’
[-Wimplicit-function-declaration]
     printf("foo
     ^
test2.c:2:5: warning: incompatible implicit declaration of built-in
function ‘printf’
test2.c:2:5: note: include ‘<stdio.h>’ or provide a declaration of ‘printf’
test2.c:2:12: warning: missing terminating " character
     printf("foo
            ^
test2.c:2:5: error: missing terminating " character
     printf("foo
     ^
test2.c:4:12: error: ‘baz’ undeclared (first use in this function)
     return baz;
            ^
test2.c:4:12: note: each undeclared identifier is reported only once
for each function it appears in
```

Observe the compiler no longer complains about too few arguments to
‘printf’. This is consistent with the hypothesis that it discarded the
unterminated literal at the newline, and took "bar" as the required
format string argument.
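
For comparison, a minimal sketch of the two well-formed ways to carry a
literal across a line break in C: adjacent literals, which the compiler
concatenates, and a backslash-newline splice.  The function names are
illustrative assumptions, not taken from the test files above.

```c
#include <string.h>

/* Adjacent string literals are concatenated by the compiler,
   so this is the single well-formed literal "foobar". */
const char *adjacent(void) {
    return "foo"
           "bar";
}

/* A backslash immediately before the newline splices the two
   source lines, so the literal continues at column 0 below. */
const char *spliced(void) {
    return "foo\
bar";
}

/* Returns 1 when both spellings yield the same string. */
int same_string(void) {
    return strcmp(adjacent(), spliced()) == 0;
}
```

Neither form leaves a bare newline inside the quotes, which is the case
the diagnostics above are about.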




* Re: CC Mode and electric-pair "problem".
  2018-06-18 17:01               ` João Távora
  2018-06-18 18:07                 ` Yuri Khan
@ 2018-06-18 18:08                 ` Alan Mackenzie
  2018-06-18 23:43                   ` João Távora
  2018-06-19  1:48                   ` Stefan Monnier
  2018-06-18 22:41                 ` CC Mode and electric-pair "problem" Stephen Leake
  2018-06-19  5:02                 ` Alan Mackenzie
  3 siblings, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-18 18:08 UTC (permalink / raw)
  To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Hello again, João.

On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote:
> Alan Mackenzie <acm@muc.de> writes:

> > No.  CC Mode comprises lots of modes, not all of them maintained by
> > me.  But even aside from that, CC Mode has often been a pioneer,
> > developing new techniques, which the rest of Emacs has then followed.
> > Examples are hungry deletion and electric indentation.

> But they are all children of cc-mode.el right?  I meant singular as in,
> afaik, nobody else independently thought of doing that besides you.

Probably other people have thought of it.  The actual doing was quite
involved.  But maybe we'll see whether or not the idea spreads.

> > We could argue about words like "terminate" indefinitely.  What I
> > think is incontrovertible is if you open a line in a string, the
> > portion after that opening is not part of the string opened on the
> > line above.  The new fontification reflects this fact.

> OK, but now it reflects something that is also wrong (they're
> not statements either), but to a much greater degree. And on top of
> that with many more adverse side effects, of which only one is breaking
> e-p-m mode.

How adverse are they really?  I mean, I think you are currently in the
"looking for flaws" mode, which is essential, worthwhile, and
appreciated, but if you were just using, say, C++ mode, how bad would
these side effects actually be?  That's not a rhetorical question.  It's
about deciding whether to invest the work to make the "correct" behaviour
optional.

> > I've tried this, obviously, but as far as I'm aware, the operation of
> > C-M-* is correct for the (now syntactically incorrect) buffer.  If you
> > can give me a concrete example, I can look at it and correct it.

> It's now much harder to select the whole invalid string.  It used to be a
> matter of C-M-u C-M-SPC.  To use query-replace in the region, for
> example.

OK, thanks.  But how often does this happen?

> >> * Also inside the string, `blink-matching-paren', on by default, also
> >>   doesn't work as before: closing a paren on a NL-started string doesn't
> >>   match the opener.

> > Do you mean a NL-ENDED string?  I see matching here.  If you can be more
> > precise about the failure, I can look at it.

> No, I mean the closer.  You and the mode don't consider that a string
> anymore, but you used to, and I still want to.

OK.

> >> There are no automated tests for these things, otherwise you could be
> >> seeing test breakage here too (and, with higher probability, you may be
> >> seeing breakage in user's expectations later on).
> > No, these things are not all intended functionality of Emacs, they're
> > just side effects of the way the real functionality was implemented.

> These accidents, as you have them, work just fine in just about any
> other mode I can imagine.  And they worked just fine in c-mode up until
> your change.

I suspect it is more that people have got so used to them, that any
change will appear to be bad.  Maybe.

> Well, programming is a continuous problem in general.  If I understand
> correctly, the thing you're trying to change is an implementation detail
> of electric-pair-mode, not part of its contract, right?  If, on the
> contrary, you think it is a bug, let me know.

It is what I said at the end of my previous post.  e-p-m assumes that
whitespace has "neutral" syntax.  When it doesn't (like here, with a
string-fence property), the scan-sexps doesn't work as desired.  I'm
convinced this could be changed.

> >> > We are talking about a corner case in e-p-m, namely where e-p-m attempts
> >> > to chomp space between parens inside an invalid string.  This surely
> >> > won't come up in practice very much.  Is it worth fixing?  (I would say
> >> > yes.)
> >> Don't forget that the particular piece of e-p-m we're talking about is
> >> one of the ways (arguably the easiest way) to actually fix the specific
> >> C/C++ problem at hand for the user.  IOW it's not some random whimsical
> >> useless thing.
> > It's not useless, but it's rare - it's three things happening all at the
> > same time, namely a broken string, pseudo-matching parens and space
> > between them.  This isn't going to happen very often.  I'd wager that
> > broken strings (two "s with non-escaped NLs between them) in themselves
> > are quite rare.  But I still think it should be fixed.  :-)

> Well, it's handling the rarities that makes Emacs stand out.

Indeed!  Let's carry on doing this.

> >> > The user is visually informed of the reality: that one or more
> >> > strings are unterminated, and where the "breakage" is (where the
> >> > font-lock-string-face stops).  This is an improvement over the
> >> > previous handling, where the opening invalid " merely got
> >> > warning-face, but the following unterminated string flowed on
> >> > indefinitely.

> >> I suppose that's a "yes".  In that case, the face `warning`, which
> >> defaults to a very bright red, would be fine for me personally (and I'm
> >> confident it could be made even more evident).  Also, the fact that the
> >> remaining string is now syntax-highlighted as C statements is extremely
> >> confusing.

> > Why?  They are now C statements, and would be handled by the compiler as
> > such.

> Clarify "would". Because this doesn't compile.  My compiler doesn't even
> seem to look at anything after the unterminated string:
    
>    int main () {
>       printf("foo
>              ); 
>       printf("bar");
>       return 0;
>      }

Maybe the compiler has the same bug as the old CC Mode.  ;-)

But to see my point of view, type the following into a C Mode buffer in
Emacs-26.1, the last two lines first, then type in the first line above
them:

char *foo = "foo;
int bar = 5;
char *baz = "baz";

The entire second line, and the third line, up to the first ", get string
face.  We've been used to this for so long that we've lost sight of just
how bad and amateurish it really is.

Now do the same in master.  The fontification of the last two lines
remains unaffected by typing in the first line, as it should.

> > See above.  Perhaps it's worth noting that AWK-Mode has used this
> > method of indicating invalid strings for around 15 years, now.  There
> > have never been any complaints about this from users.

> But they weren't ever exposed to the previous behaviour, right?  And
> also, I believe that there is some discrepancy between the number of users
> of AWK and C, the complexity of the average program, etc...

Most AWK programmers will also be using C, shell-script, whatever.  And
while there aren't that many of them, they aren't as rare as all that.
And when I say no complaints, I mean none whatsoever; not a single one.

> >> But now that I've understood the non-e-p-m implications of your change,
> >> I urge you to at least make this configurable (if it is already
> >> configurable, then don't mind me).
> > Make correct fontification configurable?

> For some newfound value of "correct", surely...

Yes.  ;-)

> > There remains the problem of making chomping parens inside a broken
> > string work.  I honestly think that modifying elec-pair.el is the way to
> > go, but I'm open to suggestions of alternative strategies that CC Mode
> > could follow to get the same fontification, that wouldn't require
> > modifying elec-pair.el.

> As I said, I will look into providing an entry point in elec-pair.el for
> this.

Thanks.

> Didn't you mention earlier pike-mode and d-mode? Quoting your earlier
> message:

>     > Pike Mode has a special feature whereby a string starting with #"
>     > is a multiline string.  I think in D Mode (not maintained here),
>     > strings simply are multiline, and there is no such thing as an
>     > escaped EOL.

>     > The writer of the mode sets the CC Mode "language variable"
>     > c-multiline-string-start-char to the character # for Pike Mode, or
>     > some non-character non-nil value for D Mode (usually t, of
>     > course).

> Can't I do this to my c/c++ mode?  Wouldn't this be a way to get the old
> behaviour back?  Perhaps it could be let-bound in tests, also.

These are intended as language variables (i.e. variables which define a
language), not user configuration variables.  I can't immediately see any
adverse effects to binding them, but I can't guarantee there'll be none.

As for let binding them for tests, that should be for a short time only.

> João

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-18 15:37               ` João Távora
  2018-06-18 16:46                 ` Eli Zaretskii
@ 2018-06-18 20:24                 ` Glenn Morris
  2018-06-19  2:03                   ` João Távora
  1 sibling, 1 reply; 93+ messages in thread
From: Glenn Morris @ 2018-06-18 20:24 UTC (permalink / raw)
  To: João Távora
  Cc: Alan Mackenzie, Eli Zaretskii, tino.calancha, emacs-devel

> On Mon, Jun 18, 2018, 16:18 Eli Zaretskii <eliz@gnu.org> wrote:

>> Isn't there a way to mark a test as expected to fail?

João Távora wrote:

> Yes, and I'll probably do that.


Please do. Long term test failures are a problem for automated building,
testing, merging, etc. Thanks in advance!

> But in my experience, this has a very high probability of burying the
> problem, i.e. the incentive for actually fixing the problem is reduced
> dramatically.




* Re: CC Mode and electric-pair "problem".
  2018-06-18 17:01               ` João Távora
  2018-06-18 18:07                 ` Yuri Khan
  2018-06-18 18:08                 ` Alan Mackenzie
@ 2018-06-18 22:41                 ` Stephen Leake
  2018-06-19  0:02                   ` João Távora
  2018-06-19  3:15                   ` Clément Pit-Claudel
  2018-06-19  5:02                 ` Alan Mackenzie
  3 siblings, 2 replies; 93+ messages in thread
From: Stephen Leake @ 2018-06-18 22:41 UTC (permalink / raw)
  To: emacs-devel

João Távora <joaotavora@gmail.com> writes:

> Alan Mackenzie <acm@muc.de> writes:
>
>>> > OK, here goes.  Why should major modes tie themselves in knots, just so
>>> > that electric-pair-mode can work?  What CC Mode is doing is natural, and
>>> > matches the reality.
>>
>>> I think you mean "mode", in the singular form :-).
>>
>> No.  CC Mode comprises lots of modes, not all of them maintained by me.
>> But even aside from that, CC Mode has often been a pioneer, developing
>> new techniques, which the rest of Emacs has then followed.  Examples are
>> hungry deletion and electric indentation.
>
> But they are all children of cc-mode.el right?  I meant singular as in,
> afaik, nobody else independently thought of doing that besides you.

For what it's worth, I'm planning on adding "new line terminates string"
to ada-mode. As Alan says, that is the way the compiler works. I was
initially inspired independently, while working on an error-correcting
parser, and found it in cc-mode while looking for ways to implement it.

If electric-pair mode wants to support users splitting a string across
lines, it should insert " before and after the newline; that's what I
would expect from it.

For me, it's more common to forget the closing " (possibly due to
copy/paste), in which case terminating the string at the new line is
more friendly.
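
For concreteness, the "new line terminates string" rule can be sketched
as a tiny scanner in C.  This is only an illustration: the name
scan_string and its signature are made up for this note, and it is not
how CC Mode implements the feature (the thread indicates CC Mode works
through syntax-table properties inside Emacs).

```c
#include <stddef.h>

/* Given src[open] == '"', return the index one past the end of the
   string token.  *terminated is set to 1 if a closing quote was found,
   and to 0 if the token was cut off at an unescaped newline or at the
   end of the input. */
size_t scan_string(const char *src, size_t open, int *terminated)
{
    size_t i = open + 1;

    *terminated = 0;
    while (src[i] != '\0') {
        if (src[i] == '\\' && src[i + 1] != '\0') {
            i += 2;        /* skip an escaped char, incl. backslash-newline */
            continue;
        }
        if (src[i] == '"') {
            *terminated = 1;
            return i + 1;  /* properly closed string */
        }
        if (src[i] == '\n')
            return i;      /* unescaped newline ends the token */
        i++;
    }
    return i;              /* input ended inside the string */
}
```

With this rule, a forgotten closing quote affects only the rest of its
own line; whatever follows the newline is scanned as ordinary code again.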

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 18:07                 ` Yuri Khan
@ 2018-06-18 22:52                   ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-18 22:52 UTC (permalink / raw)
  To: Yuri Khan; +Cc: Alan Mackenzie, Emacs developers, Tino Calancha, Glenn Morris

Yuri Khan <yurivkhan@gmail.com> writes:

> On Tue, Jun 19, 2018 at 12:25 AM João Távora <joaotavora@gmail.com> wrote:
>> Alan Mackenzie <acm@muc.de> writes:
>> > Why?  They are now C statements, and would be handled by the compiler as
>> > such.
>>
>> Clarify "would". Because this doesn't compile.  My compiler doesn't even
>> seem to look at anything after the unterminated string:
>>
>>    int main () {
>>       printf("foo
>>              );
>>       printf("bar");
>>       return 0;
>>      }
>
> Mine does. After finding a syntax error, a typical C compiler
> continues scanning the source, attempting to diagnose more errors.

You're right.  The example I fed this particular compiler (clang)
didn't contain anything further for it to complain about.

João





^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 18:08                 ` Alan Mackenzie
@ 2018-06-18 23:43                   ` João Távora
  2018-06-19  1:35                     ` João Távora
  2018-06-19  1:48                   ` Stefan Monnier
  1 sibling, 1 reply; 93+ messages in thread
From: João Távora @ 2018-06-18 23:43 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Alan Mackenzie <acm@muc.de> writes:

> Hello again, João.

Hello again, Alan

> On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote:
> Probably other people have thought of it.  The actual doing was quite
> involved.  But maybe we'll see whether or not the idea spreads.
>
>> > We could argue about words like "terminate" indefinitely.  What I
>> > think is incontrovertible is if you open a line in a string, the
>> > portion after that opening is not part of the string opened on the
>> > line above.  The new fontification reflects this fact.
>
>> OK, but now it reflects something that is also wrong (they're
>> not statements either), but to a much greater degree. And on top of
>> that with many more adverse side effects, of which only one is breaking
>> e-p-m mode.
>
> How adverse are they really?  I mean, I think you are currently in the
> "looking for flaws" mode, which is essential, worthwhile, and
> appreciated, but if you were just using, say, C++ mode, how bad would
> these side effects actually be?  That's not a rhetorical question.  It's
> about deciding whether to invest the work to make the "correct" behaviour
> optional.

Honestly, I don't know.  You're right I am indeed in "looking for flaws"
mode.  "Once you see it, you can't unsee it".

After reading your example below, I think the feature you are adding is
good for people who inadvertently delete the terminating quote, but not
for those who once in a while add a newline inside a string.  I am in
the latter group: I almost never unbalance my buffer (thanks mostly to
e-p-m) and I like to navigate by sexp, even in languages other than lisp
(even in message-mode, for instance).  So I get sad when something
breaks this balance.

>> > I've tried this, obviously, but as far as I'm aware, the operation of
>> > C-M-* is correct for the (now syntactically incorrect) buffer.  If you
>> > can give me a concrete example, I can look at it and correct it.
>
>> It's now much harder to select the whole invalid string.  It used to be a
>> matter of C-M-u C-M-SPC.  To use query-replace in the region, for
>> example.
>
> OK, thanks.  But how often does this happen?

Well, there's obviously the case of actually writing a multi-line string,
such as when writing a "usage:" blurb.  Here I believe most users, like
me, will first draft out the string visually and then add "\n\" to every
line, perhaps by selecting the string where point is in, which is now
much harder.
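
To make the "\n\" workflow concrete, here is a minimal sketch of the two
usual ways such a blurb ends up in valid C.  The names usage_spliced,
usage_concat and usage_styles_match, and the option text, are invented
for illustration.

```c
#include <string.h>

/* Style 1: "\n\" at the end of each line.  The backslash-newline splices
   the next physical line into the same literal, so continuation lines
   must start in column 0 (their leading spaces are part of the string). */
static const char usage_spliced[] = "usage: prog [options] FILE\n\
  -h  show this help\n\
  -v  be verbose\n";

/* Style 2: one literal per line; adjacent string literals are
   concatenated by the compiler, so the source can be indented freely. */
static const char usage_concat[] =
    "usage: prog [options] FILE\n"
    "  -h  show this help\n"
    "  -v  be verbose\n";

/* Returns 1 if both spellings produce the same string. */
int usage_styles_match(void)
{
    return strcmp(usage_spliced, usage_concat) == 0;
}
```

Either way, the intermediate state while drafting is a string with raw
newlines in it, which is exactly the state under discussion here.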

While it's true I don't write many of those lately, it will probably bug
me much more often in another situation: I'm in the habit of C-o'ing
a lot (everywhere, not just in strings, obviously) to "open space" for
my thoughts, i.e. for the thing I am going to write next.  And this new
behaviour breaks that.

But now I've tested a bit more and can be specific: it breaks *some* of
that. C-M-u C-M-SPC is indeed broken.  That's what I use, for example,
just before deciding to replace the string with a variable.  But
curiously, chomping to the end quote is working, which is nice.  And if
I can somehow make it to the closer quote, C-M-b works, though C-M-f at
the opener doesn't.  In any case, things were more predictable before.

>> These accidents, as you have them, work just fine in just about any
>> other mode I can imagine.  And they worked just fine in c-mode up until
>> your change.
> I suspect it is more that people have got so used to them, that any
> change will appear to be bad.  Maybe.

As we all know, Emacs is fertile in this regard. For example, I can
think of a certain version control system, rhymes with "knit"...

>> Well, programming is a continuous problem in general.  If I understand
>> correctly, the thing you're trying to change is an implementation detail
>> of electric-pair-mode, not part of its contract, right?  If, on the
>> contrary, you think it is a bug, let me know.
> It is what I said at the end of my previous post.  e-p-m assumes that
> whitespace has "neutral" syntax.  When it doesn't (like here, with a
> string-fence property), the scan-sexps doesn't work as desired.  I'm
> convinced this could be changed.

OK. I'll have a look (I admit to not having looked at that code in depth
since I wrote it 5 years ago).  I was pretty much convinced it was
flawless :-)

>> Well, it's handling the rarities that makes Emacs stand out.
> Indeed!  Let's carry on doing this.
>>    int main () {
>>       printf("foo
>>              ); 
>>       printf("bar");
>>       return 0;
>>      }
>
> Maybe the compiler has the same bug as the old CC Mode.  ;-)

No, I passed it a silly example.  It does indeed look past the
unterminated string.

> But to see my point of view, type the following into a C Mode buffer in
> Emacs-26.1, the last two lines first, then type in the first line above
> them:
>
> char *foo = "foo;
> int bar = 5;
> char *baz = "baz";
>
> The entire second line, and the third line, up to the first ", get string
> face.  We've been used to this for so long that we've lost sight of just
> how bad and amateurish it really is.
>
> Now do the same in master.  The fontification of the last two lines
> remains unaffected by typing in the first line, as it should.

Indeed, I admit this is better.  I very rarely get a buffer like this,
though.

João



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 16:46                 ` Eli Zaretskii
  2018-06-18 17:21                   ` Eli Zaretskii
@ 2018-06-18 23:49                   ` João Távora
  2018-06-19  2:37                     ` Eli Zaretskii
  1 sibling, 1 reply; 93+ messages in thread
From: João Távora @ 2018-06-18 23:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, tino.calancha, rgm

Eli Zaretskii <eliz@gnu.org> writes:

>> From: João Távora <joaotavora@gmail.com>
>> Date: Mon, 18 Jun 2018 16:37:33 +0100
>> Cc: Alan Mackenzie <acm@muc.de>, Glenn Morris <rgm@gnu.org>, emacs-devel@gnu.org, 
>> 	tino.calancha@gmail.com
>> 
>> Yes, and I'll probably do that. But in my experience, this has a very high probability of burying the problem, i.e.
>> the incentive for actually fixing the problem is reduced dramatically.
>
> But putting the problematic code on a branch reduces the incentive
> even more, doesn't it?

I don't follow.  I would answer "no", assuming the person developing the
temporarily misbehaving code is motivated to do it in the first place.
Develop and break things at will in a branch, merge them to master when
they're clean.  No?

João



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 22:41                 ` CC Mode and electric-pair "problem" Stephen Leake
@ 2018-06-19  0:02                   ` João Távora
  2018-06-19  3:15                   ` Clément Pit-Claudel
  1 sibling, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-19  0:02 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

Stephen Leake <stephen_leake@stephe-leake.org> writes:

> If electric-pair mode wants to support users splitting a string across
> lines, it should insert " before and after the newline; that's what I
> would expect from it.

Well, I don't understand the specific relevance of your example, e-p-m
doesn't "support" that, Emacs does. But FWIW your expectation is exactly
what it does now (you insert one quote, get two, then you enter the
newline).  So that's not the problem; the problem in e-p-m is a
corner case of whitespace "chomping" that hopefully shouldn't be very
hard to fix.

My objections are beyond electric-pair-mode.  I was telling Alan how
this breaks sexp-based navigation, for instance.

> For me, it's more common to forget the closing " (possibly due to
> copy/paste), in which case terminating the string at the new line is
> more friendly.

Indeed, if you frequently do this, it's somewhat nicer not to paint the
rest of the buffer purple (or font-lock-string-face).  But now you know
about the drawbacks.

João



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 23:43                   ` João Távora
@ 2018-06-19  1:35                     ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-19  1:35 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

João Távora <joaotavora@gmail.com> writes:

>> It is what I said at the end of my previous post.  e-p-m assumes that
>> whitespace has "neutral" syntax.  When it doesn't (like here, with a
>> string-fence property), the scan-sexps doesn't work as desired.  I'm
>> convinced this could be changed.
> OK. I'll have a look (I admit to not having looked at that code in depth
> since I wrote it 5 years ago).  I was pretty much convinced it was
> flawless :-)

I looked at this code and decided it's better to leave it.  I don't need
it to fix the problem at hand.  I may change this opinion, so I started
by pushing this change that we need regardless:

   commit 6353387835f6cb34765ac525ac3e9edf3239e589
    
       Electric-pair-mode lets modes choose how to skip whitespace
       
       * lisp/elec-pair.el (electric-pair-skip-whitespace-function): New buffer-local variable.
       (electric-pair-post-self-insert-function): Call it.

Then, I defined this function

    (defun c-mode-electric-skip-whitespace ()
      "CC-mode's way of skipping whitespace."
      (let ((saved (point))
            (in-comment (nth 4 (syntax-ppss))))
        ;; Actually, if you also skip backslash here, you'll skip/chomp
        ;; over newline escapes, which may be nice.
        (skip-chars-forward (apply #'string electric-pair-skip-whitespace-chars))
        ;; If we started inside a comment but skipped out of it, go back.
        (unless (or (not in-comment)
                    (nth 4 (syntax-ppss)))
          (goto-char saved))))

And added this to c-mode-common-hook:

   (add-hook 'c-mode-common-hook
             (lambda ()
               (setq-local electric-pair-skip-whitespace-function
                           #'c-mode-electric-skip-whitespace)
               (add-function :around
                             (local 'electric-pair-skip-self)
                             (lambda (&rest r)
                               (let (terminator)
                                 (if (and (setq terminator
                                                (nth 3 (syntax-ppss)))
                                          (save-excursion
                                            (goto-char (1- (line-end-position)))
                                            (and (eq terminator
                                                     (nth 3 (syntax-ppss)))
                                                 (not (eq terminator
                                                          (char-after))))))
                                     t
                                   (apply r)))))))

All e-p-m tests pass, though the detection of NL-terminated strings is
very shady (but you probably have much better ways inside CC-mode to
detect them).  If your "fix-scan-sexps" idea above works (I don't
understand it) then the add-function won't be needed at all.

Let me know what you think,
João




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 18:08                 ` Alan Mackenzie
  2018-06-18 23:43                   ` João Távora
@ 2018-06-19  1:48                   ` Stefan Monnier
  2018-06-19  3:52                     ` Clément Pit-Claudel
  2018-06-26 16:08                     ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie
  1 sibling, 2 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-06-19  1:48 UTC (permalink / raw)
  To: emacs-devel

> char *foo = "foo;
> int bar = 5;
> char *baz = "baz";
>
> The entire second line, and the third line, up to the first ", get string
> face.  We've been used to this for so long that we've lost sight of just
> how bad and amateurish it really is.

But what about when you write

    char *thedoc = "Here it is:
    - First do this
    - Then do that
    And that's it!";

?

Both cases are valid transient states.  Which one will occur more often
depends a lot on the particular kind of code you write and your
coding habits.

Emacs can't reliably distinguish the two cases, so whichever behavior it
chooses it will look "amateurish" in some cases.

I think the better option here is to focus on the following:
1- Make sure the programmer is aware there's a problem in their code.
   I.e. highlight the opening quote or the non-escaped end-of-line or
   something in bright red or something like that.
2- Don't try to guess what the user intended to do.
   Instead keep our code as simple as possible: the C code we're handed
   is broken, so there's no real clear "right behavior" anyway.
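
As a rough sketch of point 1, the check below reports every opening
quote that is not closed before an unescaped newline, i.e. the positions
one might show in a warning face.  The function name and interface are
hypothetical, and the sketch deliberately ignores comments and character
literals; it is written in C purely for illustration and is not taken
from any Emacs code.

```c
#include <stddef.h>

/* Store up to max offsets of '"' characters that open a string which is
   not closed before an unescaped newline (or the end of the input).
   Returns the number of such openers found, which may exceed max. */
size_t find_unterminated_openers(const char *src, size_t *out, size_t max)
{
    size_t count = 0;
    size_t opener = 0;
    int in_string = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];

        if (in_string && c == '\\' && src[i + 1] != '\0') {
            i++;                   /* skip the escaped character */
        } else if (c == '"') {
            if (!in_string)
                opener = i;        /* remember where this string began */
            in_string = !in_string;
        } else if (c == '\n' && in_string) {
            if (count < max)
                out[count] = opener;
            count++;
            in_string = 0;         /* the next line starts fresh */
        }
    }
    if (in_string) {               /* input ended inside a string */
        if (count < max)
            out[count] = opener;
        count++;
    }
    return count;
}
```

On the three-line foo/bar/baz example this reports only the quote before
foo.  On the thedoc example it reports two positions: the real opener
and the intended closing quote, which a purely line-based rule sees as
opening a second unterminated string.  That is the ambiguity described
above; the check can flag a problem but cannot tell which of the two
transient states the user is in.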


-- Stefan




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 20:24                 ` Glenn Morris
@ 2018-06-19  2:03                   ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-19  2:03 UTC (permalink / raw)
  To: Glenn Morris; +Cc: Alan Mackenzie, Eli Zaretskii, tino.calancha, emacs-devel

Glenn Morris <rgm@gnu.org> writes:
>> On Mon, Jun 18, 2018, 16:18 Eli Zaretskii <eliz@gnu.org> wrote:
>>> Isn't there a way to mark a test as expected to fail?
>> Yes, and I'll probably do that.
> Please do. 

Done in d37d30cef5bbbdf8d17315835126d76d4681b22a

João





^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 23:49                   ` João Távora
@ 2018-06-19  2:37                     ` Eli Zaretskii
  2018-06-19  8:13                       ` João Távora
  0 siblings, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-06-19  2:37 UTC (permalink / raw)
  To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm

> From: João Távora <joaotavora@gmail.com>
> Cc: acm@muc.de,  rgm@gnu.org,  emacs-devel@gnu.org,  tino.calancha@gmail.com
> Date: Tue, 19 Jun 2018 00:49:17 +0100
> 
> > But putting the problematic code on a branch reduces the incentive
> > even more, doesn't it?
> 
> I don't follow.

Code on a branch gets less testing by others, and therefore less
reminders about the failing test.

> I would answer "no", assuming the person developing the
> temporarily misbehaving code is motivated to do it in the first place.
> Develop and break things at will in a branch, merge them to master when
> they're clean.  No?

If the code is used, its breakage on a branch hurts like it does on
master.  If it's unused, then what is it doing in the repository?



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 22:41                 ` CC Mode and electric-pair "problem" Stephen Leake
  2018-06-19  0:02                   ` João Távora
@ 2018-06-19  3:15                   ` Clément Pit-Claudel
  2018-06-19  8:16                     ` João Távora
  1 sibling, 1 reply; 93+ messages in thread
From: Clément Pit-Claudel @ 2018-06-19  3:15 UTC (permalink / raw)
  To: emacs-devel

On 2018-06-18 18:41, Stephen Leake wrote:
> For what it's worth, I'm planning on adding "new line terminates string"
> to ada-mode. As Alan says, that is the way the compiler works. I was
> initially inspired independently, while working on an error-correcting
> parser, and found it in cc-mode while looking for ways to implement it.

Sorry for jumping in a bit late.  Does that mean that after the change an unclosed quote will only cause refontification up to the end of the line?  That would be a very nice improvement.  I don't use electric-pair-mode, and as things currently stand inserting an unmatched quote applies font-lock-string-face to the entire buffer, which is a bit annoying.

Clément.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19  1:48                   ` Stefan Monnier
@ 2018-06-19  3:52                     ` Clément Pit-Claudel
  2018-06-19  6:38                       ` Stefan Monnier
  2018-06-26 16:08                     ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Clément Pit-Claudel @ 2018-06-19  3:52 UTC (permalink / raw)
  To: emacs-devel

On 2018-06-18 21:48, Stefan Monnier wrote:
> 1- Make sure the programmer is aware there's a problem in its code.
>    I.e. highlight the opening quote or the non-escaped end-of-line or
>    something in bright red or something like that.

Agreed.  Given this criterion, the patch is an improvement: making sure that lines past the first one are not highlighted suppresses the risk of misleading the programmer into thinking that they have a multiline-string.

(This happens to me from time to time in Python, actually: I write "abc
def" instead of """abc
def""", and the highlighting doesn't immediately reveal the error.  Simply not highlighting the second line would help a lot.)

> 2- Don't try to guess what the user intended to do.
>    Instead keep our code as simple as possible: the C code we're handed
>    is broken, so there's no real clear "right behavior" anyway.

I'm not sure whether we can afford to bail out like that — for people who don't use some form of structured editing, most of the code that the IDE ends up seeing is broken in some way (unmatched { or ", incomplete declarations, incorrect numbers of arguments, undeclared identifiers, etc.)

Modeling our error recovery behaviors on the one used by relevant compilers seems like a pretty good approach (ultimately, for the modes I maintain, I'd like to delegate fontification to a language server provided by the compiler).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-18 17:01               ` João Távora
                                   ` (2 preceding siblings ...)
  2018-06-18 22:41                 ` CC Mode and electric-pair "problem" Stephen Leake
@ 2018-06-19  5:02                 ` Alan Mackenzie
  2018-06-20 14:16                   ` Stefan Monnier
  2018-06-26 18:52                   ` Alan Mackenzie
  3 siblings, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-19  5:02 UTC (permalink / raw)
  To: João Távora; +Cc: Glenn Morris, Tino Calancha, Emacs developers

Hello, João.

On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote:

[ .... ]

Maybe we're looking at this the wrong way.

How about this idea: we add a new syntax flag to Emacs, ", which
terminates any open string, the same way the syntax > terminates any
open comment.  We could then set this syntax flag on newline.

This would have the disadvantage (for CC Mode) that it wouldn't work
with older Emacsen.  But it might solve the various problems we've
stumbled over in the last few days.

> João

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19  3:52                     ` Clément Pit-Claudel
@ 2018-06-19  6:38                       ` Stefan Monnier
  2018-06-20 13:48                         ` Clément Pit-Claudel
  0 siblings, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-06-19  6:38 UTC (permalink / raw)
  To: emacs-devel

>> 1- Make sure the programmer is aware there's a problem in its code.
>>    I.e. highlight the opening quote or the non-escaped end-of-line or
>>    something in bright red or something like that.
> Agreed.  Given this criterion, the patch is an improvement: making sure that
> lines past the first one are not highlighted suppresses the risk of
> misleading the programmer into thinking that they have a multiline-string.

The old behavior highlighted the opening (and not-closed on the same
line) quote in font-lock-warning-face, which seemed perfectly adequate.

> (This happens to me from time to time in Python, actually: I write "abc
> def" instead of """abc
> def""", and the highlighting doesn't immediately reveal the error.
> Simply not highlighting the second line would help a lot.

It's easier to highlight the unmatched opener than to try and prevent
the second line from being highlighted (and you want to highlight that
opener in any case).

>> 2- Don't try to guess what the user intended to do.
>>    Instead keep our code as simple as possible: the C code we're handed
>>    is broken, so there's no real clear "right behavior" anyway.
>
> I'm not sure whether we can afford to bail out like that — for people who
> don't use some form of structured editing, most of the code that the IDE
> ends up seeing is broken in some way (unmatched { or ", incomplete
> declarations, incorrect numbers of arguments, undeclared identifiers, etc.)

Not sure what you mean by "bail out".  Point 1 has added highlighting to
warn the user about the presence of a problem.  Short of changing the
actual code behind the user's back, there's really not much more we can
do to prevent the compiler/IDE from seeing that broken code.

> Modeling our error recovery behaviors on the one used by relevant compilers
> seems like a pretty good approach (ultimately, for the modes I maintain, I'd
> like to delegate fontification to a language server provided by the
> compiler).

Point 2 suggests going with the simplest implementation (i.e. let the
behavior be dictated by the implementation), so if your highlighting is
provided by LSP (say), then point 2 would suggest that there's no point
trying to provide a different behavior from the one provided by the
LSP server.


        Stefan




^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19  2:37                     ` Eli Zaretskii
@ 2018-06-19  8:13                       ` João Távora
  2018-06-19 16:59                         ` Eli Zaretskii
  0 siblings, 1 reply; 93+ messages in thread
From: João Távora @ 2018-06-19  8:13 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, tino.calancha, rgm

Eli Zaretskii <eliz@gnu.org> writes:

>> From: João Távora <joaotavora@gmail.com>
>> Cc: acm@muc.de,  rgm@gnu.org,  emacs-devel@gnu.org,  tino.calancha@gmail.com
>> Date: Tue, 19 Jun 2018 00:49:17 +0100
>> 
>> > But putting the problematic code on a branch reduces the incentive
>> > even more, doesn't it?
>> 
>> I don't follow.
>
> Code on a branch gets less testing by others, and therefore less
> reminders about the failing test.

But surely the programmer who broke the test, who is the person
technically (and morally) best suited to fix the problem, has
all the original incentive to merge his work.

For me this is very clear: only merge if there are 0 failing tests (or
rather, if you've increased the number of failing tests by 0).  Perhaps
CVS used to make this impractical, but nowadays git branches make this
very easy.

BTW, why does CONTRIBUTE tell us to "make check" at all?

>> I would answer "no", assuming the person developing the
>> temporarily misbehaving code is motivated to do it in the first place.
>> Develop and break things at will in a branch, merge them to master when
>> they're clean.  No?
> If the code is used, its breakage on a branch hurts like it does on
> master.

Not at all, no, it hurts only the people interested in trying out the
feature.  On master it hurts everyone, including Hydra's continuous
integration, for example, which is the issue at hand.  But also other
automated things like automated bug bisections etc...

> If it's unused, then what is it doing in the repository?

To save it.  To show it to others for comments.  This seems rather
obvious to me, so perhaps we are misunderstanding each other.  I'm also
pretty sure I've seen branches prescribed in this list for unstable
features.





^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19  3:15                   ` Clément Pit-Claudel
@ 2018-06-19  8:16                     ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-19  8:16 UTC (permalink / raw)
  To: Clément Pit-Claudel; +Cc: emacs-devel

Clément Pit-Claudel <cpitclaudel@gmail.com> writes:

> On 2018-06-18 18:41, Stephen Leake wrote:
>> For what it's worth, I'm planning on adding "new line terminates string"
>> to ada-mode. As Alan says, that is the way the compiler works. I was
>> initially inspired independently, while working on an error-correcting
>> parser, and found it in cc-mode while looking for ways to implement it.
> Sorry for jumping in a bit late.  Does that mean that after the
> change an unclosed quote will only cause refontification up to the
> end of the line?  That would be a very nice improvement.  I don't use
> electric-pair-mode, and as things currently stand inserting an

Again, see my reply to Stephen.  This has very little to do with
electric-pair-mode now.  If you use sexp-based navigation,
blink-matching-paren, or show-paren-mode you will see considerable
differences in behaviour.

FWIW I like the non-fontification part, too.  But it comes with a hefty
price.

João



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19  8:13                       ` João Távora
@ 2018-06-19 16:59                         ` Eli Zaretskii
  2018-06-19 19:40                           ` João Távora
  0 siblings, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-06-19 16:59 UTC (permalink / raw)
  To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm

> From: João Távora <joaotavora@gmail.com>
> Cc: acm@muc.de,  rgm@gnu.org,  emacs-devel@gnu.org,  tino.calancha@gmail.com
> Date: Tue, 19 Jun 2018 09:13:10 +0100
> 
> >> > But putting the problematic code on a branch reduces the incentive
> >> > even more, doesn't it?
> >> 
> >> I don't follow.
> >
> > Code on a branch gets less testing by others, and therefore less
> > reminders about the failing test.
> 
> But surely, the programmer who broke the test, who is the person
> technically (and morally) most well suited to fix the problem has
> all the original incentive to merge his work.

Of course.  But this is not affected by whether the code is on a
branch or on master.

> For me this is very clear: only merge if there are 0 failing tests (or
> rather, if you've increased the number of failing tests by 0).  Perhaps
> CVS used to make this impractical, but nowadays git branches make this
> very easy.

That's a good policy.

> BTW, why does CONTRIBUTE tell us to "make check" at all?

Is this a tricky question?  Because I think the answer is clear to
all.

> >> I would answer "no", assuming the person developing the
> >> temporarily misbehaving code is motivated to do it in the first place.
> >> Develop and break things at will in a branch, merge them to master when
> >> they're clean.  No?
> > If the code is used, its breakage on a branch hurts like it does on
> > master.
> 
> Not at all, no, it hurts only the people interested in trying out the
> feature.  On master it hurts everyone

It hurts those who try the feature on master as well.

> including Hydra's continuous integration, for example, which is the
> issue at hand.  But also other automated things like automated bug
> bisections etc...
> 
> > If it's unused, then what is it doing in the repository?
> 
> To save it.  To show it to others for comments.  This seems rather
> obvious to me, so perhaps we are misunderstanding each other.  I'm also
> pretty sure I've seen branches prescribed in this list for unstable
> features.

OK, I think it's time to stop this dispute.  It isn't going anywhere,
and we basically agree on most aspects of this.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19 16:59                         ` Eli Zaretskii
@ 2018-06-19 19:40                           ` João Távora
  0 siblings, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-19 19:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, emacs-devel, tino.calancha, rgm

Eli Zaretskii <eliz@gnu.org> writes:

>> BTW, why does CONTRIBUTE tell us to "make check" at all?
> Is this a tricky question?  Because I think the answer is clear to
> all.

Sorry if it sounded like a gotcha, but my point was precisely that no
clear policy exists.  I notice a couple of paragraphs above it says
"please test your changes before committing to the master branch".  But
for me it's still not clear if that means "don't commit if you've broken
any tests".

I do the latter, and try to influence others to work like this, but
perhaps the phrasing is purposely vague so other workflows can be
accommodated.  Urgent fixes may justify breaking some tests (and that's
why I asked what Alan's change did).

> OK, I think it's time to stop this dispute.  It isn't going anywhere,
> and we basically agree on most aspects of this.

OK, let's.

João



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-19  6:38                       ` Stefan Monnier
@ 2018-06-20 13:48                         ` Clément Pit-Claudel
  0 siblings, 0 replies; 93+ messages in thread
From: Clément Pit-Claudel @ 2018-06-20 13:48 UTC (permalink / raw)
  To: emacs-devel

On 2018-06-19 02:38, Stefan Monnier wrote:
> It's easier to highlight the unmatched opener than to try and prevent
> the second line from being highlighted (and you want to highlight that
> opener in any case).

Maybe, but I find it much more pleasant if the second line isn't highlighted.

> Not sure what you mean by "bail out".  Point 1 has added highlighting to
> warn the user about the presence of a problem.  Short of changing the
> actual code behind the user's back, there's really not much more we can
> do to prevent the compiler/IDE from seeing that broken code.

We want the compiler and IDE to see the broken code, but we also want to do as much as we can to make the experience pleasant (and I find it unpleasant that inserting an unmatched '"' breaks syntax highlighting for the rest of the buffer).

As an example, Merlin does a great job at handling broken OCaml code.

> Point 2 suggest to go with the simplest implementation (i.e. let the
> behavior be dictated by the implementation), so if your highlighting is
> provided by LSP (say), then point 2 would suggest that there's no point
> trying to provide a different behavior from the one provided by the
> LSP server.

Yes, I agree.  In the meantime, approximating that at the cost of a bit of complexity in the Emacs mode seems good.

Clément.




* Re: CC Mode and electric-pair "problem".
  2018-06-19  5:02                 ` Alan Mackenzie
@ 2018-06-20 14:16                   ` Stefan Monnier
  2018-06-26 18:23                     ` Alan Mackenzie
  2018-06-27 18:27                     ` Alan Mackenzie
  2018-06-26 18:52                   ` Alan Mackenzie
  1 sibling, 2 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-06-20 14:16 UTC (permalink / raw)
  To: emacs-devel

> How about this idea: we add a new syntax flag to Emacs, ", which
> terminates any open string, the same way the syntax > terminates any
> open comment.  We could then set this syntax flag on newline.

To me this looks like adding a hack to patch over another.

I don't think the new behavior of unclosed strings in CC-mode is worse
than the old one, but I don't think it's really better either: it's just
different (in some cases it's better, in others it's worse).

So the problem I see with it is that it brings complexity in the code
with no real improvement in terms of behavior.  The bad-interaction with
electric-pair shows that this complexity has a real immediate cost.
The suggestion above suggests that this complexity may bring in yet
more complexity.

Me not happy.

If the purpose of the change is to address use cases such as Clément's:
> Sorry for jumping in a bit late.  Does that mean that after the change an
> unclosed quote will only cause refontification up to the end of the line?
> That would be a very nice improvement.  I don't use electric-pair-mode, and
> as things currently stand inserting an unmatched quote applies
> font-lock-string-face to the entire buffer, which is a bit annoying.

How 'bout taking an approach that will have much fewer side-effects:
Instead of adding the complexity at the low-level of syntax-tables
to make strings "magically" terminate at EOL, hook into
self-insert-command:

    when inserting a ", add a matching " at EOL if needed, or remove
    the " that we added at EOL earlier.

Something like (guaranteed 100% tested, of course.  No animals were harmed):

    (add-hook 'post-self-insert-hook
        (lambda ()
          (when (memq last-command-event '(?\" ?\'))
            (save-excursion
              (let ((pos (point))
                    (ppss (syntax-ppss (line-end-position))))
                (when (and (nth 3 ppss)        ;; EOL within a string
                           (not (nth 5 ppss))) ;; EOL not escaped
                  (if (and (> (point) pos)
                           (eq last-command-event (char-before)))
                      ;; Remove extraneous unmatched " at EOL.
                      (delete-region (1- (point)) (point))
                    (insert last-command-event)))))))
        'append 'local)

I used `append` to try and make it interact better with electric-pair-mode.


        Stefan





* Fontifying unterminated strings [was: CC Mode and electric-pair "problem".]
  2018-06-19  1:48                   ` Stefan Monnier
  2018-06-19  3:52                     ` Clément Pit-Claudel
@ 2018-06-26 16:08                     ` Alan Mackenzie
  2018-06-26 20:02                       ` João Távora
  2018-06-28 23:56                       ` Stefan Monnier
  1 sibling, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-26 16:08 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Clément Pit-Claudel, João Távora, emacs-devel

Hello, Stefan.

On Mon, Jun 18, 2018 at 21:48:41 -0400, Stefan Monnier wrote:
> > char *foo = "foo;
> > int bar = 5;
> > char *baz = "baz";

> > The entire second line, and the third line, up to the first ", get string
> > face.  We've been used to this for so long that we've lost sight of just
> > how bad and amateurish it really is.

> But what about when you write

>     char *thedoc = "Here it is:
>     - First do this
>     - Then do that
>     And that's it!";

> ?

> Both cases are valid transient states.  Which one will occur more often
> depends a lot on the particular kind of code you write and your
> coding habits.

I suggest the most common case by far will be writing

    char *foo = "foo....

in the middle of an existing buffer.

> Emacs can't reliably distinguish the two cases, so whichever behavior it
> chooses it will look "amateurish" in some cases.

No, you've misunderstood my point.  It is not the aesthetic "niceness",
the lack of which is amateurish; it is fontifying as a string something
which isn't a string (as defined by the compiler's error messages).
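
To make this concrete, here is a minimal sketch (not CC Mode's code)
that asks the ordinary syntax scanner about the three-line example
quoted above.  The helper `my-demo-in-string-p`, the variable
`my-demo-text` and the hand-built syntax table are assumptions made
purely for illustration.

```elisp
(defun my-demo-in-string-p (text needle)
  "Return non-nil if the first NEEDLE in TEXT starts inside a string.
Hypothetical helper: it uses a small hand-built syntax table, not
CC Mode's."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\" "\"" table) ; " is a string quote
      (modify-syntax-entry ?\\ "\\" table) ; \ is the escape character
      (set-syntax-table table))
    (insert text)
    (goto-char (point-min))
    (search-forward needle)
    ;; Element 3 of the parse state is non-nil inside a string.
    (nth 3 (parse-partial-sexp (point-min) (match-beginning 0)))))

;; The example from this thread: line 1 has no closing quote.
(defvar my-demo-text
  "char *foo = \"foo;\nint bar = 5;\nchar *baz = \"baz\";\n")
```

With plain string-quote syntax, `(my-demo-in-string-p my-demo-text
"int bar")` is non-nil, whereas the same query for the text `baz"` on
the third line is nil: the scanner sees line 2 as string contents and
the real string on line 3 as code.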

> I think the better option here is to focus on the following:
> 1- Make sure the programmer is aware there's a problem in their code.
>    I.e. highlight the opening quote or the non-escaped end-of-line or
>    something in bright red or something like that.
> 2- Don't try to guess what the user intended to do.
>    Instead keep our code as simple as possible: the C code we're handed
>    is broken, so there's no real clear "right behavior" anyway.

How about the following suggestion - instead of having permanent
string-fence syntax-table text properties to define the ends of
unterminated strings:
(i) We leave the syntax of the string opener and EOL alone;
(ii) we amend font-{lock,core}.el to apply the desired fontification, to
  be like the new fontification in CC Mode?

This could be done straightforwardly in font-lock by temporarily putting
the string-fence properties on these strings, applying all the face
properties, then removing these properties again.  It would need a few
new customisation variables to specify what counts as an open string,
and so on.
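
As a rough sketch of the mechanism only (not of the proposed font-lock
change itself), the following puts string-fence syntax on an opening
quote and on the newline that ends its line, then asks the scanner
where strings are.  The helper names are hypothetical, and the sample
text is the three-line example from earlier in this thread.

```elisp
(defun my-demo-fence-line (open-pos)
  "Put string-fence syntax on the quote at OPEN-POS and on its line's newline.
Hypothetical helper, for illustration only."
  (let ((fence (string-to-syntax "|")))
    (put-text-property open-pos (1+ open-pos) 'syntax-table fence)
    (save-excursion
      (goto-char open-pos)
      (end-of-line)
      (unless (eobp)
        (put-text-property (point) (1+ (point)) 'syntax-table fence)))))

(defun my-demo-in-string-after-fencing-p (needle)
  "Fence the unterminated string in a sample buffer, then test NEEDLE.
Return non-nil if the first NEEDLE starts inside a string."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\" "\"" table) ; " is a string quote
      (set-syntax-table table))
    (insert "char *foo = \"foo;\nint bar = 5;\nchar *baz = \"baz\";\n")
    (goto-char (point-min))
    (search-forward "\"")
    (my-demo-fence-line (1- (point)))
    (goto-char (point-min))
    (search-forward needle)
    ;; `syntax-table' text properties are only honoured when
    ;; `parse-sexp-lookup-properties' is non-nil.
    (let ((parse-sexp-lookup-properties t))
      (nth 3 (parse-partial-sexp (point-min) (match-beginning 0))))))
```

After fencing, `int bar` is no longer inside a string, while `foo;` on
line 1 and the `baz"` on line 3 are.  A font-lock implementation
would, under this suggestion, remove the two properties again once the
faces had been applied.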

> -- Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-20 14:16                   ` Stefan Monnier
@ 2018-06-26 18:23                     ` Alan Mackenzie
  2018-06-27 13:37                       ` João Távora
  2018-06-29  3:42                       ` Stefan Monnier
  2018-06-27 18:27                     ` Alan Mackenzie
  1 sibling, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-26 18:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Wed, Jun 20, 2018 at 10:16:05 -0400, Stefan Monnier wrote:

[ .... ]

> If the purpose of the change is to address use cases such as Clément's:
> > Sorry for jumping in a bit late.  Does that mean that after the change an
> > unclosed quote will only cause refontification up to the end of the line?
> > That would be a very nice improvement.  I don't use electric-pair-mode, and
> > as things currently stand inserting an unmatched quote applies
> > font-lock-string-face to the entire buffer, which is a bit annoying.

> How 'bout taking an approach that will have much fewer side-effects:
> Instead of adding the complexity at the low-level of syntax-tables
> to make strings "magically" terminate at EOL, hook into
> self-insert-command:

>     when inserting a ", add a matching " at EOL if needed, or remove
>     the " that we added at EOL earlier.

> Something like (guaranteed 100% tested, of course.  No animals were harmed):

>     (add-hook 'post-self-insert-hook
>         (lambda ()
>           (when (memq last-command-event '(?\" ?\'))
>             (save-excursion
>               (let ((pos (point))
>                     (ppss (syntax-ppss (line-end-position))))
>                 (when (and (nth 3 ppss)        ;; EOL within a string
>                            (not (nth 5 ppss))) ;; EOL not escaped
>                   (if (and (> (point) pos)
>                            (eq last-command-event (char-before)))
>                       ;; Remove extraneous unmatched " at EOL.
>                       (delete-region (1- (point)) (point))
>                     (insert last-command-event)))))))
>         'append 'local)

This is effectively electric-pair-mode, which if enabled, already
inserts two "s when you type ".

Not everybody likes electric-pair-mode.  I don't think your suggestion
is any better than mine (snipped) to which you replied.

> I used `append` to try and make it interact better with electric-pair-mode.


>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-19  5:02                 ` Alan Mackenzie
  2018-06-20 14:16                   ` Stefan Monnier
@ 2018-06-26 18:52                   ` Alan Mackenzie
  2018-06-26 19:45                     ` João Távora
  1 sibling, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-26 18:52 UTC (permalink / raw)
  To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Hello, João.

On Tue, Jun 19, 2018 at 05:02:44 +0000, Alan Mackenzie wrote:
> On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote:

> [ .... ]

> Maybe we're looking at this the wrong way.

> How about this idea: we add a new syntax flag to Emacs, ", which
> terminates any open string, the same way the syntax > terminates any
> open comment.  We could then set this syntax flag on newline.

This isn't a sensible idea, because it wouldn't solve any of the
problems we have with the string-fence syntax.

Instead, maybe we should add a new syntactic symbol to Emacs, "one-line
string quote".  A string opened by such a delimiter would be terminated
either by the same quote again, or a newline.

This would have the advantage of making fontification easy, whilst still
allowing syntactic operations within an invalid string.  For example, in

    char *foo = "(
    )"

, the "s would have "one-line string quote" syntax and be fontified with
warning face, but a C-M-n from the ( would still move point to after the
), and all the electric-pair-mode stuff would still work.
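
For comparison, a minimal sketch of how the same example behaves with
string-fence properties, which I take to be the kind of failure
reported at the start of this thread.  The helper name is
hypothetical, and the sketch relies on the standard syntax table that
a fresh temporary buffer uses.

```elisp
(defun my-demo-scan-from-paren (fenced)
  "Scan forward over the list starting at the ( in a sample buffer.
Return the position reached, or the symbol `scan-error' if the scan
fails.  If FENCED is non-nil, first put string-fence syntax on the
opening quote and on the newline.  Hypothetical helper."
  (with-temp-buffer
    (insert "\"(\n)\"")                 ; positions: "=1 (=2 NL=3 )=4 "=5
    (when fenced
      (let ((fence (string-to-syntax "|")))
        (put-text-property 1 2 'syntax-table fence)   ; opening "
        (put-text-property 3 4 'syntax-table fence))) ; newline
    (let ((parse-sexp-lookup-properties t))
      (condition-case nil
          ;; `forward-list' (C-M-n) is built on `scan-lists'.
          (scan-lists 2 1 0)
        (scan-error 'scan-error)))))
```

Without the fences the scan ends just after the `)`.  With them, the
fenced newline looks like the start of a new string which is never
closed, so the scan signals an error.  A "one-line string quote"
syntax as described above would be expected to behave like the
unfenced case here.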

> This would have the disadvantage (for CC Mode) that it wouldn't work
> with older Emacsen.  But it might solve the various problems we've
> stumbled over in the last few days.

This paragraph would still hold.

> > João

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-26 18:52                   ` Alan Mackenzie
@ 2018-06-26 19:45                     ` João Távora
  2018-06-26 20:09                       ` Alan Mackenzie
  0 siblings, 1 reply; 93+ messages in thread
From: João Távora @ 2018-06-26 19:45 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

> Hello, João.

Hi Alan,

Alan Mackenzie <acm@muc.de> writes:

>> [ .... ]
>> Maybe we're looking at this the wrong way.
>> How about this idea: we add a new syntax flag to Emacs, ", which
>> terminates any open string, the same way the syntax > terminates any
>> open comment.  We could then set this syntax flag on newline.
> This isn't a sensible idea, because it wouldn't solve any of the
> problems we have with the string-fence syntax.

You realize you're replying to your own suggestion, right? (just
checking...)

> This would have the advantage of making fontification easy, whilst still
> allowing syntactic operations within an invalid string.  For example, in
>
>     char *foo = "(
>     )"
>
> , the "s would have "one-line string quote" syntax and be fontified with
> warning face, but a C-M-n from the ( would still move point to after the
> ), and all the electric-pair-mode stuff would still work.

Ignoring any complications or complexity that would arise from it, that
sounds great (though more important than supporting e-p-m is having
C-M-u work from inside the string, which I suppose is included).

João




* Re: Fontifying unterminated strings [was: CC Mode and electric-pair "problem".]
  2018-06-26 16:08                     ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie
@ 2018-06-26 20:02                       ` João Távora
  2018-06-28 23:56                       ` Stefan Monnier
  1 sibling, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-26 20:02 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Clément Pit-Claudel, Stefan Monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

>> But what about when you write
>>     char *thedoc = "Here it is:
>>     - First do this
>>     - Then do that
>>     And that's it!";
>> Both cases are valid transient states.  Which one will occur more often
>> depends a lot on the particular kind of code you write and your
>> coding habits.
> I suggest the most common case by far will be writing
>     char *foo = "foo....
> in the middle of an existing buffer.

You can't know that, really.  It's not just the users of
electric-pair-mode, but users of other popular autopairing packages, or
those who autopair manually.  Or users who mostly edit existing code.  One
could even speculate (just as tremulously) that more C code gets
maintained than written these days.

>> Emacs can't reliably distinguish the two cases, so whichever behavior it
>> chooses it will look "amateurish" in some cases.
> No, you've misunderstood my point.  It is not the aesthetic "niceness",
> the lack of which is amateurish; it is fontifying as a string something
> which isn't a string (as defined by the compiler's error messages).

It is just as wrong as fontifying something as C statements that isn't a
C statement.






* Re: CC Mode and electric-pair "problem".
  2018-06-26 19:45                     ` João Távora
@ 2018-06-26 20:09                       ` Alan Mackenzie
  0 siblings, 0 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-26 20:09 UTC (permalink / raw)
  To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Hello, João.

On Tue, Jun 26, 2018 at 20:45:44 +0100, João Távora wrote:
> Hi Alan,

> Alan Mackenzie <acm@muc.de> writes:

> >> [ .... ]
> >> Maybe we're looking at this the wrong way.
> >> How about this idea: we add a new syntax flag to Emacs, ", which
> >> terminates any open string, the same way the syntax > terminates any
> >> open comment.  We could then set this syntax flag on newline.
> > This isn't a sensible idea, because it wouldn't solve any of the
> > problems we have with the string-fence syntax.

> You realize you're replying to your own suggestion, right? (just
> checking...)

I do, yes.  :-)

> > This would have the advantage of making fontification easy, whilst still
> > allowing syntactic operations within an invalid string.  For example, in

> >     char *foo = "(
> >     )"

> > , the "s would have "one-line string quote" syntax and be fontified with
> > warning face, but a C-M-n from the ( would still move point to after the
> > ), and all the electric-pair-mode stuff would still work.

> Ignoring any complications or complexity that would arise from it, that
> sounds great (though more important than supporting e-p-m is having
> C-M-u work from inside the string, which I suppose is included).

Indeed.  The whole point is that if the syntax scanning starts outside
the one-line string, the newline acts as a terminator.  If it starts
inside the string, the newline doesn't act as anything special.

The complications would come with things like scan-sexps, which when
starting after a newline and scanning backward, would have to check for a
one-line " in the line.  I don't see such complications as being
unmanageable.

> João

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: CC Mode and electric-pair "problem".
  2018-06-26 18:23                     ` Alan Mackenzie
@ 2018-06-27 13:37                       ` João Távora
  2018-06-29  3:42                       ` Stefan Monnier
  1 sibling, 0 replies; 93+ messages in thread
From: João Távora @ 2018-06-27 13:37 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

Hi Alan

>> > unclosed quote will only cause refontification up to the end of the line?
>> > That would be a very nice improvement.  I don't use electric-pair-mode, and
>> > as things currently stand inserting an unmatched quote applies
>> > font-lock-string-face to the entire buffer, which is a bit annoying.
>>     (add-hook 'post-self-insert-hook
>>         (lambda ()
>>    ... 
>>                     (insert last-command-event)))))))
>>         'append 'local)
>
> This is effectively electric-pair-mode, which if enabled, already
> inserts two "s when you type ".
>
> Not everybody likes electric-pair-mode.  I don't think your suggestion
> is any better than mine (snipped) to which you replied.

To be perfectly honest, I got confused by Stefan's suggestion, too.  If
the goal is to have electric-pair-mode-like behaviour, just turn on
electric-pair-mode.

I'd just like to point out, however, that automatically pairing quotes
and parens extends far beyond electric-pair-mode.  Of course I think
it's the best of the bunch, but there are other popular packages like
smartparens, paredit, wrap-region, textmate (and even my previous
autopair which some insist on using for some reason).

João




* Re: CC Mode and electric-pair "problem".
  2018-06-20 14:16                   ` Stefan Monnier
  2018-06-26 18:23                     ` Alan Mackenzie
@ 2018-06-27 18:27                     ` Alan Mackenzie
  2018-06-29  4:11                       ` Stefan Monnier
  1 sibling, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-27 18:27 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Clément Pit-Claudel, Stephen Leake, João Távora,
	emacs-devel

Hello, yet again, Stefan.

On Wed, Jun 20, 2018 at 10:16:05 -0400, Stefan Monnier wrote:
> > How about this idea: we add a new syntax flag to Emacs, ", which
> > terminates any open string, the same way the syntax > terminates any
> > open comment.  We could then set this syntax flag on newline.

I've been making negative comments about this suggestion of mine over
the last day or two.  I now believe, again, that the proposal is sound;
it would allow the desired fontification (an unterminated string being
fontified only up to the next unescaped NL) easily, without interfering
with the 'chomp facility in electric-pair-mode.

> To me this looks like adding a hack to patch over another.

> I don't think the new behavior of unclosed strings in CC-mode is worse
> than the old one, but I don't think it's really better either: it's just
> different (in some cases it's better, in others it's worse).

This new fontification in CC Mode is much better.  When the terminating
" is missing, it no longer fontifies spuriously an unbounded piece of
the buffer as a string.  Clément and Stephen Leake have responded
positively to this possibility.  I think we should enhance Emacs such
that it is easy to fontify in this new way.

> So the problem I see with it is that it brings complexity in the code
> with no real improvement in terms of behavior.  The bad-interaction with
> electric-pair shows that this complexity has a real immediate cost.
> The suggestion above suggests that this complexity may bring in yet
> more complexity.

The desired facility _is_ complicated.  I am not fond of the code in CC
Mode which implements it.  The question is, where do we put this
complexity?  With my suggestion, it would be confined mainly to the
scanning routines in syntax.c rather than being spread (and duplicated)
amongst several major modes.

Adapting the forward scanning functionality would be straightforward.
Things like (scan-sexps POS -1) would indeed become more difficult.  For
example, starting at BONL, (scan-sexps BONL -1) in

    "foo bar

would need to find the ", but the same in

    "foo"bar

would need to find the start of bar.  In other words, we would have to
pair off quotes from the beginning of the line we were scanning
backwards over.  There may well be difficulties in a NL potentially
acting as the terminator of both a string and a comment.  I think these
are the sorts of complexity you're wary of.
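
A small sketch of what the scanner does today with these two lines,
taking BONL to mean the position just after the newline.  The helper
name is hypothetical, and the sketch relies on the standard syntax
table that a fresh temporary buffer uses.

```elisp
(defun my-demo-scan-back-from-bonl (line)
  "Insert LINE and a newline, then scan one sexp backward from the end.
Return the position `scan-sexps' reaches.  Hypothetical helper."
  (with-temp-buffer
    (insert line "\n")
    ;; (point-max) is just after the newline.
    (scan-sexps (point-max) -1)))
```

As things stand, both `(my-demo-scan-back-from-bonl "\"foo bar")` and
`(my-demo-scan-back-from-bonl "\"foo\"bar")` stop at the `b` of bar,
since a backward scan knows nothing about the unmatched quote earlier
in the line.  Under the proposal the first call would instead have to
reach the `"` at position 1, which is why the quotes would need to be
paired off from the start of the line.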

font-lock-fontify-syntactically-region could then be amended
straightforwardly to apply warning-face to the opening unbalanced "
(controlled, of course, by a customisation option).

> Me not happy.

My suggestion has the strong advantage that it will benefit Emacs as a
whole, and there won't need to be separate implementations in CC Mode,
Python Mode, Ada Mode, .....  The need for a multiline string to have
escaped NLs between its lines is actually a common pattern in the
languages Emacs handles.  Why can we not handle it in syntax.c?

[ .... ]

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Fontifying unterminated strings [was: CC Mode and electric-pair "problem".]
  2018-06-26 16:08                     ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie
  2018-06-26 20:02                       ` João Távora
@ 2018-06-28 23:56                       ` Stefan Monnier
  2018-06-29  0:43                         ` Stefan Monnier
  1 sibling, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-06-28 23:56 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Clément Pit-Claudel, João Távora, emacs-devel

>> I think the better option here is to focus on the following:
>> 1- Make sure the programmer is aware there's a problem in their code.
>>    I.e. highlight the opening quote or the non-escaped end-of-line or
>>    something in bright red or something like that.
>> 2- Don't try to guess what the user intended to do.
>>    Instead keep our code as simple as possible: the C code we're handed
>>    is broken, so there's no real clear "right behavior" anyway.
>
> How about the following suggestion - instead of having permanent
> string-fence syntax-table text properties to define the ends of
> unterminated strings:

My suggestion has no "string-fence syntax-table" or any such thing, so
I'm not sure what you're saying here.

Before suggesting something else, could you clarify the downside you see
with my proposal?

> (ii) we amend font-{lock,core}.el to apply the desired fontification, to
>   be like the new fontification in CC Mode?

This problem is not specific to C but it's not common to all programming
languages either, so I think that modifying font-(lock|core).el
for that would be a gross hack.

We could do it somewhat cleanly by having font-lock.el provide some
"standard" function that major modes could opt to use in their
font-lock-* settings, of course, but I'm having a hard time imagining
a solution with a nice semantics on that side: if we do it only by
tweaking faces, then we get inconsistent behavior between highlighting
and C-M-f, and if we do it by tweaking syntax-tables, then we get weird
differences between the case where font-lock is used and where it's not
used (e.g. between not-yet-displayed code and already displayed code, or
between the case where font-lock-mode is enabled or not).

`syntax-table` properties should be applied via
syntax-propertize-function in order to be reliably available.


        Stefan




* Re: Fontifying unterminated strings [was: CC Mode and electric-pair "problem".]
  2018-06-28 23:56                       ` Stefan Monnier
@ 2018-06-29  0:43                         ` Stefan Monnier
  0 siblings, 0 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-06-29  0:43 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Clément Pit-Claudel, João Távora, emacs-devel

> My suggestion has no "string-fence syntax-table" or any such thing, so
> I'm not sure what you're saying here.
> Before suggesting something else, could you clarify the downside you see
> with my proposal?

Hmm... sorry: I'm catching up with some mail backlog and didn't read
carefully.  I noticed too late that I misread: your message was not in
reply to my last suggestion but to some earlier discussion.  So, please
disregard the above text.

I'll finish reading my mail before replying further.

I think the following part of my answer is still correct, tho maybe
out-of-date with subsequent discussion I'm about to discover.


        Stefan


>> (ii) we amend font-{lock,core}.el to apply the desired fontification, to
>>   be like the new fontification in CC Mode?
>
> This problem is not specific to C but it's not common to all programming
> languages either, so I think that modifying font-(lock|core).el
> for that would be a gross hack.
>
> We could do it somewhat cleanly by having font-lock.el provide some
> "standard" function that major modes could opt to use in their
> font-lock-* settings, of course, but I'm having a hard time imagining
> a solution with a nice semantics on that side: if we do it only by
> tweaking faces, then we get inconsistent behavior between highlighting
> and C-M-f, and if we do it by tweaking syntax-tables, then we get weird
> differences between the case where font-lock is used and where it's not
> used (e.g. between not-yet-displayed code and already displayed code, or
> between the case where font-lock-mode is enabled or not).
>
> `syntax-table` properties should be applied via
> syntax-propertize-function in order to be reliably available.
>
>
>         Stefan




* Re: CC Mode and electric-pair "problem".
  2018-06-26 18:23                     ` Alan Mackenzie
  2018-06-27 13:37                       ` João Távora
@ 2018-06-29  3:42                       ` Stefan Monnier
  2018-06-30 18:09                         ` Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-06-29  3:42 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> This is effectively electric-pair-mode, which if enabled, already
> inserts two "s when you type ".

It's very different: it inserts/removes the second " at the end of the
line, so it ends up behaving very much like your current code, except:
- it only affects self-insert-command.
- it uses an explicit " character rather than a syntax-table text-property.

So OT1H it provides a behavior closer to current `master` than to
electric-pair-mode, but like electric-pair-mode it has a fairly focus'd
effect, so is less likely to have unexpected interactions.

> Not everybody likes electric-pair-mode.  I don't think your suggestion
> is any better than mine (snipped) to which you replied.

Its main benefit is that it's very superficial with a narrow focus.
No need to change any core API like syntax-tables with a feature which
will need to be supported for the next very many years.


        Stefan




* Re: CC Mode and electric-pair "problem".
  2018-06-27 18:27                     ` Alan Mackenzie
@ 2018-06-29  4:11                       ` Stefan Monnier
  2018-06-30 19:03                         ` Alan Mackenzie
  0 siblings, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-06-29  4:11 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Clément Pit-Claudel, Stephen Leake, João Távora,
	emacs-devel

>> > How about this idea: we add a new syntax flag to Emacs, ", which
>> > terminates any open string, the same way the syntax > terminates any
>> > open comment.  We could then set this syntax flag on newline.
> I've been making negative comments about this suggestion of mine over
> the last day or two.  I now believe, again, that the proposal is sound;

It's definitely sound.  And I very much agree that it could be cleaner
than the current code on `master`.  I dislike this solution mainly
because it requires changes to Emacs's core API, so it bumps against my
feeling that the need is not clearly documented: you think the new
behavior is more often beneficial than the old behavior but we have no
actual data to verify it.  FWIW, I do not know that the old behavior is
more often beneficial either, but I'm definitely not convinced that the
new behavior is often enough more beneficial to justify such changes to
syntax-tables.

But that's for Eli to judge.

So let's look at the technical issues:
You suggest introducing a new syntax-table thingy similar to > but for
strings.  Let's call it ]
- This implies we'll need a new C-level function `back_string` to jump
  backward over such a ]-terminated string, corresponding to
  back_comment.  `back_comment` has proved to be rather nasty, so while
  we can learn from it, part of what we learn is that jumping backward
  over such things is much harder than jumping forward, so this
  innocuous ] will be more costly than might meet the eye.
- In CC-mode, \n already has syntax > so it can't also have syntax ]
  How do you intend to deal with that: will you mark those few \n that
  terminate strings with syntax-table text-properties?
  If so, what's the benefit over using string-fences?
- Another approach would be to make it possible to mark \n as both ] and
  > at the same time, which would make the CC-mode feature much cleaner
  (no need to muck with syntax-table text-properties) but at the cost of
  yet more complexity in the syntax.c code.
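
To illustrate the constraint with today's primitives (a sketch only;
the ] class above does not exist, so string-fence stands in for it,
and the helper names are hypothetical): a syntax table gives each
character a single class, and a per-position override needs a
`syntax-table` text property.

```elisp
(defun my-demo-newline-classes ()
  "Return newline's syntax class after each of two successive settings."
  (let ((table (make-syntax-table)))
    (list
     (progn (modify-syntax-entry ?\n ">" table)  ; comment ender
            (with-syntax-table table (char-syntax ?\n)))
     ;; The second setting replaces the first; it is not added to it.
     (progn (modify-syntax-entry ?\n "\"" table) ; string quote
            (with-syntax-table table (char-syntax ?\n))))))

(defun my-demo-newline-override ()
  "Return the syntax class codes of two newlines, one overridden.
The first newline gets a string-fence text property; the second keeps
the comment-ender class from the syntax table."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\n ">" table)
      (set-syntax-table table))
    (insert "a\nb\n")                   ; newlines at positions 2 and 4
    (put-text-property 2 3 'syntax-table (string-to-syntax "|"))
    (let ((parse-sexp-lookup-properties t))
      (list (syntax-class (syntax-after 2))
            (syntax-class (syntax-after 4))))))
```

The first function returns the characters `>` and `"` in that order,
and the second returns class codes 15 (string fence) and 12 (comment
end).  Marking individual newlines therefore already works with text
properties, which is the comparison the second bullet above asks
about.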

> would need to find the start of bar.  In other words, we would have to
> pair off quotes from the beginning of the line we were scanning
> backwards over.  There may well be difficulties in a NL potentially
> acting as the terminator of both a string and a comment.  I think these
> are the sorts of complexity you're wary of.

Yes.

> My suggestion has the strong advantage that it will benefit Emacs as a
> whole, and there won't need to be separate implementations in CC Mode,
> Python Mode, Ada Mode, .....  The need for a multiline string to have
> escaped NLs between its lines is actually a common pattern in the
> languages Emacs handles.  Why can we not handle it in syntax.c?

Emacs has handled it for the last 30 years or so.  You just want to
handle it in a different way.  I agree that Emacs's core should ideally
make it easy for a major mode to choose this "different way".

But the way I see it, your suggestion is just adding one more wart to
syntax-tables whereas we should instead work on "syntax-tables NG".

IOW, I think that we should introduce a brand new replacement for
syntax-tables (tho I don't really know what it should look like,
otherwise I'd have coded it up already); something much more powerful
and generic (probably based on a mix of a DFA at one level and some kind
of push-down automata on top of it), and such a thing could/should
easily accommodate such a feature without even needing any
ad-hoc support.


        Stefan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-29  3:42                       ` Stefan Monnier
@ 2018-06-30 18:09                         ` Alan Mackenzie
  2018-07-01  3:37                           ` Stefan Monnier
  2018-07-01 15:57                           ` Paul Eggert
  0 siblings, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-30 18:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Jun 28, 2018 at 23:42:34 -0400, Stefan Monnier wrote:
> > This is effectively electric-pair-mode, which if enabled, already
> > inserts two "s when you type ".

> It's very different: it inserts/removes the second " at the end of the
> line, so it ends up behaving very much like your current code, except:
> - it only affects self-insert-command.
> - it uses an explicit " character rather than a syntax-table text-property.

Some people, including me, find the insertion of characters they haven't
typed (aside from tabs/spaces for indentation) annoying.  It's good that
there are minor modes that can do this, but it's not the way to solve
the current difficulty.

> So OT1H it provides a behavior closer to current `master` than to
> electric-pair-mode, but like electric-pair-mode it has a fairly focus'd
> effect, so is less likely to have unexpected interactions.

> > Not everybody likes electric-pair-mode.  I don't think your suggestion
> > is any better than mine (snipped) to which you replied.

> Its main benefit is that it's very superficial with a narrow focus.
> No need to change any core API like syntax-tables with a feature which
> will need to be supported for the next very many years.

But it doesn't really address the problem.  That problem is how to
fontify unterminated strings (in both senses of the word "how").  Up
till now, Emacs hasn't bothered - it just allows these strings, and the
subsequent buffer portion, to be fontified randomly.

I think such a string should have string face up till the first
unescaped newline (in modes where escaped NLs are required for multiline
strings).  I can't see any other way anybody would want such a construct
to be fontified.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-29  4:11                       ` Stefan Monnier
@ 2018-06-30 19:03                         ` Alan Mackenzie
  2018-06-30 19:29                           ` Eli Zaretskii
  2018-07-01  4:02                           ` CC Mode and electric-pair "problem" Stefan Monnier
  0 siblings, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-30 19:03 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Clément Pit-Claudel, Stephen Leake, João Távora,
	emacs-devel

Hello, Stefan.

On Fri, Jun 29, 2018 at 00:11:24 -0400, Stefan Monnier wrote:
> >> > How about this idea: we add a new syntax flag to Emacs, ", which
> >> > terminates any open string, the same way the syntax > terminates any
> >> > open comment.  We could then set this syntax flag on newline.
> > I've been making negative comments about this suggestion of mine over
> > the last day or two.  I now believe, again, that the proposal is sound;

> It's definitely sound.  And I very much agree that it could be cleaner
> than the current code on `master`.  I dislike this solution mainly
> because it requires changes to Emacs's core API, so it bumps against my
> feeling that the need is not clearly documented: you think the new
> behavior is more often beneficial than the old behavior but we have no
> actual data to verify it.

No, what I think is much less nuanced: that the old behaviour is simply
wrong, and the new behaviour simply correct.  If one were to design an
editor's functionality from scratch, nobody would advocate the old
behaviour - it happened because it needed no implementation effort.

> FWIW, I do not know that the old behavior is more often beneficial
> either, but I'm definitely not convinced that the new behavior is
> often enough more beneficial to justify such changes to syntax-tables.

I am in the middle of writing a trial implementation (code speaks louder
than words).  Thus far, it has already worked in shell-script-mode
(which required a one-line change, this:

    -       ?\n ">#"
    +       ?\n ">#s"

the new `s' flag is how I've constructed it, so far).

> But that's for Eli to judge.

> So let's look at the technical issues:
> You suggest introducing a new syntax-table thingy similar to > but for
> strings.  Let's call it ]

As I noted above, I have implemented it as another flag, `s'.

> - This implies we'll need a new C-level function `back_string` to jump
>   backward over such a ]-terminated string, corresponding to
>   back_comment.

Yes.

>  `back_comment` has proved to be rather nasty, so while
>   we can learn from it, part of what we learn is that jumping backward
>   over such things is much easier ....

much less easy.  :-)

>   .... than jumping forward, so this
>   innocuous ] will be more costly than might meet the eye.

It requires the new function, which at the moment seems somewhat less
complicated than back_comment, and which needs to be called from
scan_lists.
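
As a rough illustration of the "pair off quotes from the beginning of the
line" idea, here is a toy model in C.  It is not the trial
back_maybe_string: the names `escaped' and `open_string_before_newline'
are hypothetical, it works on a plain char array rather than a buffer
with a syntax table, and it ignores comments entirely.

```c
#include <stddef.h>

/* Return nonzero if buf[i] is preceded by an odd number of backslashes,
   i.e. if the character at I is escaped.  */
static int
escaped (const char *buf, size_t i)
{
  size_t n = 0;
  while (i > n && buf[i - n - 1] == '\\')
    n++;
  return n % 2 == 1;
}

/* Toy model only.  NL is the index of a newline in BUF.  Find the start
   of the logical line (stepping back over escaped newlines), then pair
   off double quotes from there.  Return the index of the quote opening a
   string that is still unterminated at NL, or -1 if there is none, or if
   NL is itself escaped and so cannot terminate anything.  */
ptrdiff_t
open_string_before_newline (const char *buf, size_t nl)
{
  if (escaped (buf, nl))
    return -1;

  size_t start = nl;
  /* Move back to just after the previous unescaped newline.  */
  while (start > 0
         && !(buf[start - 1] == '\n' && !escaped (buf, start - 1)))
    start--;

  ptrdiff_t open = -1;
  for (size_t i = start; i < nl; i++)
    {
      if (buf[i] == '\\' && i + 1 < nl)
        i++;                    /* Skip the escaped character.  */
      else if (buf[i] == '"')
        open = (open < 0) ? (ptrdiff_t) i : -1;
    }
  return open;
}
```

Even in this simplified form, a backward scan has to go to the start of
the logical line and come forward again, which is where the extra cost
for each newline comes from.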

> - In CC-mode, \n already has syntax > so it can't also have syntax ]
>   How do you intend to deal with that: will you mark those few \n that
>   terminate strings with syntax-table text-properties?

This is simple with the flag `s'.  NL would thus have end-comment syntax
_and_ the `s' flag.  In scan_lists, back_comment will be tried before
what I'm calling `back_maybe_string', since being a comment ender must have
precedence over being a string terminator.

>   If so, what's the benefit over using string-fences?

String-fence stopped the 'chomp facility of electric-pair-mode working
properly (for the currently accepted value of "properly").

> - Another approach would be to make it possible to mark \n as both ] and
>   > at the same time, which would make the CC-mode feature much cleaner
>   (no need to muck with syntax-table text-properties) but the cost of
>   yet more complexity in the syntax.c code.

That's what I'm doing with `s'.  The extra complexity in syntax.c
doesn't seem all that bad at the moment.  back_maybe_string is currently
137 lines long (including a macro analogous to INC_FROM, and a lossage:
clause modelled on the one in back_comment), compared with
back_comment's 289 lines.  I'm planning on committing this new code to a
branch in the next few days, then you can judge better whether the new
facility is worth it.

[ .... ]

> > My suggestion has the strong advantage that it will benefit Emacs as a
> > whole, and there won't need to be separate implementations in CC Mode,
> > Python Mode, Ada Mode, .....  The need for a multiline string to have
> > escaped NLs between its lines is actually a common pattern in the
> > languages Emacs handles.  Why can we not handle it in syntax.c?

> Emacs has handled it for the last 30 years or so.  You just want to
> handle it in a different way.  I agree that Emacs's core should ideally
> make it easy for a major mode to choose this "different way".

> But the way I see it, your suggestion is just adding one more wart to
> syntax-tables whereas we should instead work on "syntax-tables NG".

> IOW, I think that we should introduce a brand new replacement for
> syntax-tables (tho I don't really know what it should look like,
> otherwise I'd have coded it up already); something much more powerful
> and generic (probably based on a mix of a DFA at one level and some kind
> of push-down automata on top of it), and such a thing could/should
> easily accommodate such a feature without even needing any
> ad-hoc support.

"S-T-NG" may be fine for Emacs 28 or 29, but the syntax table is what we
have, and what we must work with in the short term.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 19:03                         ` Alan Mackenzie
@ 2018-06-30 19:29                           ` Eli Zaretskii
  2018-06-30 20:14                             ` Alan Mackenzie
  2018-07-01  4:02                           ` CC Mode and electric-pair "problem" Stefan Monnier
  1 sibling, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-06-30 19:29 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: cpitclaudel, emacs-devel, stephen_leake, monnier, joaotavora

> Date: Sat, 30 Jun 2018 19:03:27 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: Clément Pit-Claudel <cpitclaudel@gmail.com>,
> 	Stephen Leake <stephen_leake@stephe-leake.org>,
> 	João Távora <joaotavora@gmail.com>,
> 	emacs-devel@gnu.org
> 
> I am in the middle of writing a trial implementation (code speaks louder
> than words).  Thus far, it has already worked in shell-script-mode
> (which required a one-line change, this:
> 
>     -       ?\n ">#"
>     +       ?\n ">#s"
> 
> the new `s' flag is how I've constructed it, so far).
> 
> > But that's for Eli to judge.
> 
> > So let's look at the technical issues:
> > You suggest introducing a new syntax-table thingy similar to > but for
> > strings.  Let's call it ]
> 
> As I noted above, I have implemented it as another flag, `s'.
> 
> > - This implies we'll need a new C-level function `back_string` to jump
> >   backward over such a ]-terminated string, corresponding to
> >   back_comment.
> 
> Yes.
> 
> >  `back_comment` has proved to be rather nasty, so while
> >   we can learn from it, part of what we learn is that jumping backward
> >   over such things is much easier ....
> 
> much less easy.  :-)
> 
> >   .... than jumping forward, so this
> >   innocuous ] will be more costly than might meet the eye.
> 
> It requires the new function, which at the moment seems somewhat less
> complicated than back_comment, and this requires to be called from
> scan_lists.
> 
> > - In CC-mode, \n already has syntax > so it can't also have syntax ]
> >   How do you intend to deal with that: will you mark those few \n that
> >   terminate strings with syntax-table text-properties?
> 
> This is simple with the flag `s'.  NL would thus have end-comment syntax
> _and_ the `s' flag.  In scan_lists, back_comment will be tried before
> what I'm calling `back_maybe_string', since being a comment ender must have
> precedence over being a string terminator.
> 
> >   If so, what's the benefit over using string-fences?
> 
> String-fence stopped the 'chomp facility of electric-pair-mode working
> properly (for the currently accepted value of "properly").
> 
> > - Another approach would be to make it possible to mark \n as both ] and
> >   > at the same time, which would make the CC-mode feature much cleaner
> >   (no need to muck with syntax-table text-properties) but the cost of
> >   yet more complexity in the syntax.c code.
> 
> That's what I'm doing with `s'.  The extra complexity in syntax.c
> doesn't seem all that bad at the moment.  back_maybe_string is currently
> 137 lines long (including a macro analogous to INC_FROM, and a lossage:
> clause modelled on the one in back_comment)), compared with
> back_comment's 289 lines.  I'm planning on committing this new code to a
> branch in the next few days, then you can judge better whether the new
> facility is worth it.

Could you please recap what problem(s) you are trying to fix with
these changes?  (I'm sorry for not following, but this thread spans
two months and many long messages with several days in-between.  It's
hard to keep focused on the main issues.)

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 19:29                           ` Eli Zaretskii
@ 2018-06-30 20:14                             ` Alan Mackenzie
  2018-07-01  3:50                               ` Stefan Monnier
                                                 ` (2 more replies)
  0 siblings, 3 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-06-30 20:14 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: joaotavora, cpitclaudel, stephen_leake, monnier, emacs-devel

Hello, Eli.

On Sat, Jun 30, 2018 at 22:29:12 +0300, Eli Zaretskii wrote:
> > Date: Sat, 30 Jun 2018 19:03:27 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: Clément Pit-Claudel <cpitclaudel@gmail.com>,
> > 	Stephen Leake <stephen_leake@stephe-leake.org>,
> > 	João Távora <joaotavora@gmail.com>,
> > 	emacs-devel@gnu.org

> > I am in the middle of writing a trial implementation (code speaks louder
> > than words).  Thus far, it has already worked in shell-script-mode
> > (which required a one-line change, this:

> >     -       ?\n ">#"
> >     +       ?\n ">#s"

> > the new `s' flag is how I've constructed it, so far).

> > > But that's for Eli to judge.

> > > So let's look at the technical issues:
> > > You suggest introducing a new syntax-table thingy similar to > but for
> > > strings.  Let's call it ]

> > As I noted above, I have implemented it as another flag, `s'.

> > > - This implies we'll need a new C-level function `back_string` to jump
> > >   backward over such a ]-terminated string, corresponding to
> > >   back_comment.

> > Yes.

> > >  `back_comment` has proved to be rather nasty, so while
> > >   we can learn from it, part of what we learn is that jumping backward
> > >   over such things is much easier ....

> > much less easy.  :-)

> > >   .... than jumping forward, so this
> > >   innocuous ] will be more costly than might meet the eye.

> > It requires the new function, which at the moment seems somewhat less
> > complicated than back_comment, and this requires to be called from
> > scan_lists.

> > > - In CC-mode, \n already has syntax > so it can't also have syntax ]
> > >   How do you intend to deal with that: will you mark those few \n that
> > >   terminate strings with syntax-table text-properties?

> > This is simple with the flag `s'.  NL would thus have end-comment syntax
> > _and_ the `s' flag.  In scan_lists, back_comment will be tried before
> > what I'm calling `back_maybe_string', since being a comment ender must have
> > precedence over being a string terminator.

> > >   If so, what's the benefit over using string-fences?

> > String-fence stopped the 'chomp facility of electric-pair-mode working
> > properly (for the currently accepted value of "properly").

> > > - Another approach would be to make it possible to mark \n as both ] and
> > >   > at the same time, which would make the CC-mode feature much cleaner
> > >   (no need to muck with syntax-table text-properties) but the cost of
> > >   yet more complexity in the syntax.c code.

> > That's what I'm doing with `s'.  The extra complexity in syntax.c
> > doesn't seem all that bad at the moment.  back_maybe_string is currently
> > 137 lines long (including a macro analogous to INC_FROM, and a lossage:
> > clause modelled on the one in back_comment)), compared with
> > back_comment's 289 lines.  I'm planning on committing this new code to a
> > branch in the next few days, then you can judge better whether the new
> > facility is worth it.

> Could you please recap what problem(s) you are trying to fix with
> these changes?  (I'm sorry for not following, but this thread spans
> two months and many long messages with several days in-between.  It's
> hard to keep focused on the main issues.)

Sorry.  That's just the way things go, sometimes.

The initial problem I tried to solve was for CC Mode source files with
things like:

    char foo[] = "foo
    char bar[] = "bar";

Historically, the missing " on "foo has caused subsequent lines to have
their string quoting reversed.  This is not good.

A recent series of CC Mode commits "solved" this by putting string-fence
syntax-table text properties on the " and the NL around foo.  This caused
a "make check" test to fail.  With electric-pair-mode enabled and
electric-pair-skip-whitespace set to 'chomp, in the following:

    "  (
       )  "

, typing ) on line 1 should replace the ) on line 2, "chomping" the space
between ) and ).  However the string-fence property on L1's NL prevented
electric-pair-mode from functioning correctly.  João and I have discussed
at length ways of fixing this.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Also, the CC Mode solution has the disadvantage that other languages
cannot get the same fontification advantages, namely that the "foo gets
warning-face on the ", and string face extends ONLY to EOL.

What I'm now proposing, and implementing as a trial, is to enhance the
syntax table facilities to support unterminated strings.  There will be
an extra syntax flag `s' on newlines meaning "terminate any open string".
This is straightforward for forward scanning, but somewhat complicated
for backward scanning.  However, it does enable unterminated strings to
be easily fontified to EOL in any language, with minimal effort.
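
To make the difference between the two behaviours concrete, here is a
toy model in C of the forward scan.  It is only a sketch under stated
assumptions, not code from syntax.c: the function name `in_string_at'
and its `nl_terminates' parameter are hypothetical, and only double
quotes and backslash escapes are recognised (no comments, no character
literals).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only.  Report whether buf[pos] lies inside a double-quoted
   string when scanning forward from the start of BUF.  With
   NL_TERMINATES set, an unescaped newline closes any open string, which
   is the effect proposed for the `s' flag.  */
bool
in_string_at (const char *buf, size_t pos, bool nl_terminates)
{
  bool in_string = false;
  for (size_t i = 0; i < pos && buf[i] != '\0'; i++)
    {
      char c = buf[i];
      if (c == '\\' && buf[i + 1] != '\0')
        i++;                    /* Skip the escaped character, NL included.  */
      else if (c == '"')
        in_string = !in_string;
      else if (c == '\n' && nl_terminates)
        in_string = false;
    }
  return in_string;
}
```

Applied to the two-line foo/bar example above, the start of the second
line counts as inside a string when newlines do not terminate strings,
and the text between the quotes of "bar" counts as outside one.  With
nl_terminates set, both answers flip to what one would expect.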

It should allow the desired fontification without causing problems for
electric-pair-mode.

Stefan is concerned that the extra functionality may not justify the
increase in complexity in syntax.c.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 18:09                         ` Alan Mackenzie
@ 2018-07-01  3:37                           ` Stefan Monnier
  2018-07-01 15:24                             ` Eli Zaretskii
  2018-07-06 21:58                             ` Stephen Leake
  2018-07-01 15:57                           ` Paul Eggert
  1 sibling, 2 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-07-01  3:37 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Some people, including me, find the insertion of characters they haven't
> typed (aside from tabs/spaces for indentation) annoying.

Don't think of it as a character, think of it as "a special syntax-table
string-closer that's not really in the buffer": it will be automatically
removed when you fix the unterminated string anyway.

For that reason it's very different from electric-pair-mode: the " char
it adds is not meant to save you from typing the closing string
delimiter, it's just a technical device to bound temporarily the extent
of the string until the time you close it.

> It's good that there are minor modes that can do this, but it's not
> the way to solve the current difficulty.

Maybe the sample implementation I provided is not quite right, but
I think the approach of temporarily inserting a " instead of messing
with syntax-table properties is actually much closer to the current
CC-mode behavior than to electric-pair-mode.

> But it doesn't really address the problem.  That problem is how to
> fontify unterminated strings (in both senses of the word "how").

An unterminated string can only occur in an invalid piece of code.
To the extent that invalid code has no clear meaning, there's no way
to know what is really the "right" behavior.

My point of view is that Emacs should focus on behaving as correctly as
possible for valid code.  The only effort worth making w.r.t. invalid code
is to avoid doing something clearly harmful and to help the user make
the code valid again.  Anything further than that is time that would be
better spent improving the handling of valid code.

I don't see any concrete benefit (for the user) of the new behavior over
the old (or the reverse for that matter).  Either behavior is equally
good and which behavior is better will depend on things which Emacs
cannot know unless the user explicitly tells us.

> Up till now, Emacs hasn't bothered - it just allows these strings, and the
> subsequent buffer portion, to be fontified randomly.

It's not random: it's arbitrary.  The new behavior is also arbitrary.

AFAIK you have no statistical data to claim that your new behavior is
more often better than worse (and even less data to claim that the
difference is significant).  So it's mostly different.

> I think such a string should have string face up till the first
> unescaped newline (in modes where escaped NLs are required for
> multiline strings).

Yes, we saw that.  Some other users agree.  Yet others disagree.

Personally, as a user, I don't really care which behavior I get: it's
a rare transient situation which I'll fix soon anyway, whether Emacs
tells me about it or not.

OTOH, there is very concrete evidence that the new behavior is worse in
the sense that it adds complexity to the code and (as expected)
introduces bugs.

To me, this is a bad tradeoff.

> I can't see any other way anybody would want such a construct
> to be fontified.

That's just a lack of imagination on your part.  Tho it also means you
haven't made the effort to appreciate some of the scenarios people have
presented here where the old behavior is preferable.


        Stefan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 20:14                             ` Alan Mackenzie
@ 2018-07-01  3:50                               ` Stefan Monnier
  2018-07-01  9:58                                 ` Alan Mackenzie
  2018-07-01 11:22                               ` João Távora
  2018-07-01 15:22                               ` Eli Zaretskii
  2 siblings, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-07-01  3:50 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: joaotavora, Eli Zaretskii, stephen_leake, cpitclaudel,
	emacs-devel

> The initial problem I tried to solve was for CC Mode source files with
> things like:
>
>     char foo[] = "foo
>     char bar[] = "bar";
>
> Historically, the missing " on "foo has caused subsequent lines to have
> their string quoting reversed.  This is not good.

Of course, as mentioned elsewhere, the new functionality is not good
either when you have

    char foo[] = "first line
    second line\
    third line";

Unless we find a magic way to distinguish those cases, both behaviors
will be sometimes right and sometimes wrong (and of course, neither
really matters since the code is invalid and will be inevitably fixed
soon by the user).

> A recent series of CC Mode commits "solved" this by putting string-fence
> syntax-table text properties on the " and the NL around foo.  This caused
> a "make check" test to fail.  With electric-pair-mode enabled and
> electric-pair-skip-whitespace set to 'chomp, in the following:

Complexity brings bugs, indeed.

> Also, the CC Mode solution has the disadvantage that other languages
> cannot get the same fontification advantages, namely that the "foo gets
> warning-face on the ", and string face extends ONLY to EOL.

    "foo gets warning-face on the "

is completely unrelated to the discussion at hand: you can have it both
with the new code and the old code, and indeed CC-mode had it with the
old code as well.

> What I'm now proposing, and implementing as a trial, is to enhance the
> syntax table facilities to support unterminated strings.

Oh indeed, complexity calls for yet more complexity.

> Stefan is concerned that the extra functionality may not justify the
> increase in complexity in syntax.c.

Yes.


        Stefan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 19:03                         ` Alan Mackenzie
  2018-06-30 19:29                           ` Eli Zaretskii
@ 2018-07-01  4:02                           ` Stefan Monnier
  2018-07-01 10:58                             ` Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-07-01  4:02 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Clément Pit-Claudel, Stephen Leake, João Távora,
	emacs-devel

>> So let's look at the technical issues:
>> You suggest introducing a new syntax-table thingy similar to > but for
>> strings.  Let's call it ]
> As I noted above, I have implemented it as another flag, `s'.

Better, yes.

> This is simple with the flag `s'.  NL would thus have end-comment syntax
> _and_ the `s' flag.  In scan_lists, back_comment will be tried before
> what I'm calling `back_maybe_string', since being a comment ender must have
> precedence over being a string terminator.

Why?    How 'bout:

    char foo[] = "some unterminated // string

>>   If so, what's the benefit over using string-fences?
> String-fence stopped the 'chomp facility of electric-pair-mode working
> properly (for the currently accepted value of "properly").

I suspect that it'll be easier to fix electric-pair-mode.

So the right answer was that you won't need syntax-table text-properties.

But the downside is that every time we scan backwards over a newline
we'll have to pay the extra cost of checking whether it's maybe closing
an unterminated string.

I think such a "string terminator" thingy would be valuable if it were
used/needed for *valid* code.  But introducing such complexity just to
tweak the handling of invalid code doesn't seem like a good tradeoff
at all.

> That's what I'm doing with `s'.  The extra complexity in syntax.c
> doesn't seem all that bad at the moment.  back_maybe_string is currently
> 137 lines long (including a macro analogous to INC_FROM, and a lossage:
> clause modelled on the one in back_comment)), compared with
> back_comment's 289 lines.  I'm planning on committing this new code to a
> branch in the next few days, then you can judge better whether the new
> facility is worth it.

I can't imagine how seeing the code could change my opinion on whether
it's worth it.

> "S-T-NG" may be fine for Emacs 28 or 29, but the syntax table is what we
> have, and what we must work with in the short term.

We'll never get to "S-T-NG" if we keep it for the future.


        Stefan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01  3:50                               ` Stefan Monnier
@ 2018-07-01  9:58                                 ` Alan Mackenzie
  0 siblings, 0 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-01  9:58 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Eli Zaretskii, stephen_leake, joaotavora, cpitclaudel,
	emacs-devel

Hello, Stefan.

On Sat, Jun 30, 2018 at 23:50:29 -0400, Stefan Monnier wrote:

[ .... ]

> > What I'm now proposing, and implementing as a trial, is to enhance the
> > syntax table facilities to support unterminated strings.

> Oh indeed, complexity calls for yet more complexity.

New features call for new code.  How can you disparage the new code as
"(unacceptable) complexity" when you haven't even seen it?

A good question is who should decide how these strings should be
fontified.  Three possible answers are: an individual on the Emacs core
team, the major mode author, or the user.

Over this entire thread you've been exceedingly negative.  You have
disparaged at least two ways of doing what's wanted, without suggesting
any other, better, way.  You seem to be saying "this is
difficult/complicated, so we'll just work around the problem/pretend it
isn't really a problem, rather than solving it".

So, please let's have your technical proposal for how to fontify
unterminated strings in the "new way".

[ .... ]

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01  4:02                           ` CC Mode and electric-pair "problem" Stefan Monnier
@ 2018-07-01 10:58                             ` Alan Mackenzie
  2018-07-01 11:46                               ` João Távora
  2018-07-01 16:13                               ` Stefan Monnier
  0 siblings, 2 replies; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-01 10:58 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Clément Pit-Claudel, Stephen Leake, João Távora,
	emacs-devel

Hello, Stefan.

On Sun, Jul 01, 2018 at 00:02:56 -0400, Stefan Monnier wrote:
> >> So let's look at the technical issues:
> >> You suggest introducing a new syntax-table thingy similar to > but for
> >> strings.  Let's call it ]
> > As I noted above, I have implemented it as another flag, `s'.

> Better, yes.

> > This is simple with the flag `s'.  NL would thus have end-comment syntax
> > _and_ the `s' flag.  In scan_lists, back_comment will be tried before
> > what I'm calling `back_maybe_string', since being a comment ender must have
> > precedence over being a string terminator.

> Why?    How 'bout:

>     char foo[] = "some unterminated // string

Bug compatibility with the current scan-sexps.

> > String-fence stopped the 'chomp facility of electric-pair-mode
> > working properly (for the currently accepted value of "properly").

> I suspect that it'll be easier to fix electric-pair-mode.

This would be my preferred option too, but it's not easy.

> But the downside is that every time we scan backwards over a newline
> we'll have to pay the extra cost of checking whether it's maybe
> closing an unterminated string.

Hmmm.  Yes, this could increase the backward scanning time quite
substantially, though we already do this for back_comment.  It
might be unacceptable.

A possibility would be to apply the `s' flag only in a syntax-table text
property applied to the newlines of unterminated strings.

> I think such a "string terminator" thingy would be valuable if it were
> used/needed for *valid* code.  But introducing such complexity just to
> tweak the handling of invalid code doesn't seem like a good tradeoff
> at all.

I disagree.  Whilst editing code, it is in an invalid state nearly all
the time.  It is our job to present the user with the best possible
display for this dominant state.

> > That's what I'm doing with `s'.  The extra complexity in syntax.c
> > doesn't seem all that bad at the moment.  back_maybe_string is currently
> > 137 lines long (including a macro analogous to INC_FROM, and a lossage:
> > clause modelled on the one in back_comment)), compared with
> > back_comment's 289 lines.  I'm planning on committing this new code to a
> > branch in the next few days, then you can judge better whether the new
> > facility is worth it.

> I can't imagine how seeing the code could change my opinion on whether
> it's worth it.

I would hope you would weigh up the small additional complexity against
the new features it brings, and reach a balanced judgment, rather than
dismissing the new idea without consideration.

> > "S-T-NG" may be fine for Emacs 28 or 29, but the syntax table is what we
> > have, and what we must work with in the short term.

> We'll never get to "S-T-NG" if we keep it for the future.

You see the need for it, and have at least some vague notion of what it
should look like.  I don't.  Get hacking!

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 20:14                             ` Alan Mackenzie
  2018-07-01  3:50                               ` Stefan Monnier
@ 2018-07-01 11:22                               ` João Távora
  2018-07-01 15:25                                 ` Eli Zaretskii
  2018-07-01 15:22                               ` Eli Zaretskii
  2 siblings, 1 reply; 93+ messages in thread
From: João Távora @ 2018-07-01 11:22 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Eli Zaretskii, stephen_leake, cpitclaudel, monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

>> Could you please recap what problem(s) you are trying to fix with
> Sorry.  That's just the way things go, sometimes.

I'm not sure how far into "final allegations" we are, but below is my
summary.

> electric-pair-mode from functioning correctly.  João and I have discussed
> at length ways of fixing this.

... in particular, a few weeks ago I provided, in electric-pair-mode,
means for CC mode to declare that it has this particular behaviour.

Though I'm still waiting for Alan's comments on this, I'd say the
electric-pair-mode test failure is effectively fixed if Alan agrees to
use that customization point.

But, in my view, electric-pair-mode was just the canary in the mine:
after Alan's changes much more basic things such as C-M-* sexp
navigation stop working like they did.  I am actually more worried about
these.

To recap, I like that Alan's change in syntactically incorrect code is
better "50% of the time":

   char *c="an incomplete string
   int a = 0;
   ...
  }<EOB>

by not fontifying "int a" as a string, does indeed exhibit some
intelligence.  But this doesn't (where it previously did):

  int main () {
    int a = 0;
    char *c = "here's me editing a
      multi-line\n\
      string";
    puts(c);
    return 0;
  }

If this switch was all, I wouldn't mind at all.  Unfortunately it comes
with a very big trade-off: the underlying syntactic changes break
e.g. C-M-u C-M-SPC inside the multi-line string being edited (which is
precisely something I could use to fix the string).

I just noticed that in 26.1 indentation of the "puts(c)" wasn't affected
by the temporary editing of the string.  Now it is, so another downside,
IMO.

João







^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01 10:58                             ` Alan Mackenzie
@ 2018-07-01 11:46                               ` João Távora
  2018-07-01 16:13                               ` Stefan Monnier
  1 sibling, 0 replies; 93+ messages in thread
From: João Távora @ 2018-07-01 11:46 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: Clément Pit-Claudel, Stephen Leake, Stefan Monnier,
	emacs-devel

Alan Mackenzie <acm@muc.de> writes:

Hi Alan,

>> I suspect that it'll be easier to fix electric-pair-mode.
> This would be my preferred option too, but it's not easy.

I don't follow: I'm still waiting on comments on

  https://lists.gnu.org/archive/html/emacs-devel/2018-06/msg00606.html

Where, at your request, I changed electric-pair-mode to provide a way to
fix the immediate problems (test failures/chomp thing).

(Obviously, it doesn't fix the overarching issue as I already explained
elsewhere.)

João



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 20:14                             ` Alan Mackenzie
  2018-07-01  3:50                               ` Stefan Monnier
  2018-07-01 11:22                               ` João Távora
@ 2018-07-01 15:22                               ` Eli Zaretskii
  2018-07-01 16:38                                 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
  2 siblings, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-07-01 15:22 UTC (permalink / raw)
  To: Alan Mackenzie
  Cc: joaotavora, cpitclaudel, stephen_leake, monnier, emacs-devel

> Date: Sat, 30 Jun 2018 20:14:47 +0000
> Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org,
>   stephen_leake@stephe-leake.org, monnier@IRO.UMontreal.CA,
>   joaotavora@gmail.com
> From: Alan Mackenzie <acm@muc.de>
> 
> > Could you please recap what problem(s) you are trying to fix with
> > these changes?  (I'm sorry for not following, but this thread spans
> > two months and many long messages with several days in-between.  It's
> > hard to keep focused on the main issues.)
> 
> Sorry.  That's just the way things go, sometimes.

Not your fault.  Thanks for taking the time to recap.

> The initial problem I tried to solve was for CC Mode source files with
> things like:
> 
>     char foo[] = "foo
>     char bar[] = "bar";
> 
> Historically, the missing " on "foo has caused subsequent lines to have
> their string quoting reversed.  This is not good.

But not really a catastrophe, IMO.

> What I'm now proposing, and implementing as a trial, is to enhance the
> syntax table facilities to support unterminated strings.  There will be
> an extra syntax flag `s' on newlines meaning "terminate any open string".
> This is straightforward for forward scanning, but somewhat complicated
> for backward scanning.  However, it does enable unterminated strings to
> be easily fontified to EOL in any language, with minimal effort.
> 
> It should allow the desired fontification without causing problems for
> electric-pair-mode.
> 
> Stefan is concerned that the extra functionality may not justify the
> increase in complexity in syntax.c.

So am I.  I'm also concerned that introducing this will slow down
various syntax-related features, only to cater to what I consider a
minor improvement at best.

Of course, if the extra functionality turns out to be not as complex
as Stefan fears and won't cause any significant slowdown that concerns
me, then perhaps we should have it.  But is that a reasonable
assumption?

Thanks.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01  3:37                           ` Stefan Monnier
@ 2018-07-01 15:24                             ` Eli Zaretskii
  2018-07-06 21:58                             ` Stephen Leake
  1 sibling, 0 replies; 93+ messages in thread
From: Eli Zaretskii @ 2018-07-01 15:24 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Date: Sat, 30 Jun 2018 23:37:33 -0400
> Cc: emacs-devel@gnu.org
> 
> My point of view is that Emacs should focus on behaving as correctly as
> possible for valid code.  The only effort worth doing w.r.t invalid code
> is to avoid doing something clearly harmful and to help the user make
> the code valid again.  Anything further than that is time that would be
> better spent improving the handling of valid code.
> 
> I don't see any concrete benefit (for the user) of the new behavior over
> the old (or the reverse for that matter).  Either behavior is equally
> good and which behavior is better will depend on things which Emacs
> cannot know unless the user explicitly tells us.
> [...]
> Personally, as a user, I don't really care which behavior I get: it's
> a rare transient situation which I'll fix soon anyway, whether Emacs
> tells me about it or not.

This reflects my opinions as well.



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01 11:22                               ` João Távora
@ 2018-07-01 15:25                                 ` Eli Zaretskii
  0 siblings, 0 replies; 93+ messages in thread
From: Eli Zaretskii @ 2018-07-01 15:25 UTC (permalink / raw)
  To: João Távora
  Cc: acm, cpitclaudel, stephen_leake, monnier, emacs-devel

> From: João Távora <joaotavora@gmail.com>
> Cc: Eli Zaretskii <eliz@gnu.org>,  cpitclaudel@gmail.com,  emacs-devel@gnu.org,  stephen_leake@stephe-leake.org,  monnier@IRO.UMontreal.CA
> Date: Sun, 01 Jul 2018 12:22:11 +0100
> 
> I'm not sure how far into "final allegations" we are, but below is my
> summary.

Thanks.

> If this switch was all, I wouldn't mind at all.  Unfortunately it comes
> with a very big trade-off: the underlying syntactic changes break
> e.g. C-M-u C-M-SPC inside the multi-line string being edited (which is
> precisely something I could use to fix the string).
> 
> I just noticed that in 26.1 indentation of the "puts(c)" wasn't affected
> by the temporary editing of the string.  Now it is, so another downside,
> IMO.

Yes, if fixing a minor annoyance introduces much more serious issues,
we shouldn't install such a "fix".



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-06-30 18:09                         ` Alan Mackenzie
  2018-07-01  3:37                           ` Stefan Monnier
@ 2018-07-01 15:57                           ` Paul Eggert
  1 sibling, 0 replies; 93+ messages in thread
From: Paul Eggert @ 2018-07-01 15:57 UTC (permalink / raw)
  To: Alan Mackenzie, Stefan Monnier; +Cc: emacs-devel

Alan Mackenzie wrote:
> Some people, including me, find the insertion of characters they haven't
> typed (aside from tabs/spaces for indentation) annoying.

Hey, I often find the insertion of tabs and spaces annoying! And it's gotten 
worse recently, in CC-mode. It's often reindenting when I don't want it to, and 
for C99 constructs like designated initializers it indents in ways that are 
increasingly unhelpful. (Yes, I know, I know, I should file bug reports....)



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01 10:58                             ` Alan Mackenzie
  2018-07-01 11:46                               ` João Távora
@ 2018-07-01 16:13                               ` Stefan Monnier
  2018-07-01 18:18                                 ` Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-07-01 16:13 UTC (permalink / raw)
  To: emacs-devel

>> Why?    How 'bout:
>>     char foo[] = "some unterminated // string
> Bug compatibility with the current scan-sexps.

I don't see why: currently, scan-sexps skips over the comment, but
that's not a bug: it's exactly what it is documented to do.

When you change the syntax property of ?\n to be "> s", it changes the
behavior expected based on the documentation, so in the above case it
should treat the \n as closing the string rather than closing
the comment.

It needs to work reliably for those languages where strings
are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA).

> Hmmm.  Yes, this could increase the backward scanning time quite
> substantially, but we already do this for back_comment, though.

I expect the impact will be less than that of back_comment, but I think
we'd want actual measurements anyway.

> A possibility would be to apply the `s' flag only in a syntax-table text
> property applied to the newlines of unterminated strings.

But that brings us back to "why not use string-fence?".

> I disagree.  Whilst editing code, it is in an invalid state nearly all
> the time.

But we usually don't make any effort to guess what the intended closest
valid state might be, except where the user is actively editing the text
(e.g. by proposing completion candidates for identifiers).

>> I can't imagine how seeing the code could change my opinion on whether
>> it's worth it.
> I would hope you would weigh up the small additional complexity against
> the new features it brings, and reach a balanced judgment, rather than
> dismissing the new idea without consideration.

I did consider it.  I just know syntax.c well enough that I'd be very
surprised if the actual patch (as opposed to my guess at what the
patch would look like) makes me change my mind.


        Stefan




^ permalink raw reply	[flat|nested] 93+ messages in thread

* scratch/fontify-open-string.  [Was: CC Mode and electric-pair "problem".]
  2018-07-01 15:22                               ` Eli Zaretskii
@ 2018-07-01 16:38                                 ` Alan Mackenzie
  2018-07-08  8:29                                   ` Stephen Leake
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-01 16:38 UTC (permalink / raw)
  To: Eli Zaretskii
  Cc: joaotavora, cpitclaudel, stephen_leake, monnier, emacs-devel

Hello, Eli.

On Sun, Jul 01, 2018 at 18:22:48 +0300, Eli Zaretskii wrote:
> > Date: Sat, 30 Jun 2018 20:14:47 +0000
> > Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org,
> >   stephen_leake@stephe-leake.org, monnier@IRO.UMontreal.CA,
> >   joaotavora@gmail.com
> > From: Alan Mackenzie <acm@muc.de>

[ .... ]

> > The initial problem I tried to solve was for CC Mode source files with
> > things like:

> >     char foo[] = "foo
> >     char bar[] = "bar";

> > Historically, the missing " on "foo has caused subsequent lines to have
> > their string quoting reversed.  This is not good.

> But not really a catastrophe, IMO.

Perhaps not, but it is nevertheless bad.  That it is so difficult to do
anything about is also bad.

> > What I'm now proposing, and implementing as a trial, is to enhance the
> > syntax table facilities to support unterminated strings.  There will be
> > an extra syntax flag `s' on newlines meaning "terminate any open string".
> > This is straightforward for forward scanning, but somewhat complicated
> > for backward scanning.  However, it does enable unterminated strings to
> > be easily fontified to EOL in any language, with minimal effort.

> > It should allow the desired fontification without causing problems for
> > electric-pair-mode.

> > Stefan is concerned that the extra functionality may not justify the
> > increase in complexity in syntax.c.

> So am I.  I'm also concerned that introducing this will slow down
> various syntax-related features, only to cater to what I consider a
> minor improvement at best.

> Of course, if the extra functionality turns out to be not as complex
> as Stefan fears and won't cause any significant slowdown that concerns
> me, then perhaps we should have it.  But is that a reasonable
> assumption?

It's no longer a matter of assumption.  Earlier on this afternoon, I
committed a preliminary working version of this change to the branch
scratch/fontify-open-string.

The most complicated part of the change is the new function
back_maybe_string in syntax.c.  This is a mere 137 lines long.  Even if
perhaps not fully fleshed out, it's not far off.  By contrast,
back_comment (which is also called at every newline when there're line
comments) is 289 lines long.

I have amended shell-script-mode to use this new strategy.  This required
changing just one line in sh-script.el.  To font-lock.el I have added an
optional feature to put warning-face on the opening ".

I think it is notable just how easy this new feature is to use.
Essentially any mode[*] can use it with a one line change (to the
syntax table code for \n).

[*] Except, currently, CC Mode.  ;-(
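
To make the forward-scanning case concrete, here is a minimal sketch in
C.  It is not the code on the branch: mark_strings and its parameters
are made-up names for illustration, and backslash escapes are ignored.
It only shows how letting a newline close an open string stops the
quoting on later lines from being reversed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from syntax.c.  Scan TEXT forward and
   store in IN_STRING[i] whether character i lies inside a string.
   When NEWLINE_ENDS_STRING is true, a newline closes any open string,
   which is the effect described for the `s' flag.  Backslash escapes
   are deliberately ignored to keep the sketch short.  */
void
mark_strings (const char *text, size_t len, bool newline_ends_string,
              bool *in_string)
{
  bool open = false;

  for (size_t i = 0; i < len; i++)
    {
      if (text[i] == '"')
        {
          in_string[i] = true;   /* Delimiters count as string.  */
          open = !open;
        }
      else if (text[i] == '\n' && open && newline_ends_string)
        {
          in_string[i] = false;  /* The newline itself is not string.  */
          open = false;
        }
      else
        in_string[i] = open;
    }
}
```

On the two-line foo/bar example above, passing false marks the text
between the quotes of "bar" as non-string and the start of the second
line as string; passing true marks only foo and bar as string.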

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01 16:13                               ` Stefan Monnier
@ 2018-07-01 18:18                                 ` Alan Mackenzie
  2018-07-01 23:16                                   ` Stefan Monnier
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-01 18:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Jul 01, 2018 at 12:13:32 -0400, Stefan Monnier wrote:
> >> Why?    How 'bout:
> >>     char foo[] = "some unterminated // string
> > Bug compatibility with the current scan-sexps.

> I don't see why: currently, scan-sexps skips over the comment, but
> that's not a bug: it's exactly what it is documented to do.

There is no comment there, but scan-sexps skips to it nevertheless.  As
you know, I solved these anomalies some while ago with the comment-cache
branch.

> When you change the syntax property of ?\n to be "> s", it changes the
> behavior expected based on the documentation, ....

Er, documentation?  This new flag isn't documented yet, or at least not
in any permanent fashion.

> .... so in the above case it should treat the \n as closing the string
> rather than closing the comment.

I agree.

> It needs to work reliably for those languages where strings
> are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA).

You mean, jgraph-mode is another use-case for `s'?  (I'm not familiar
with it.)

> > Hmmm.  Yes, this could increase the backward scanning time quite
> > substantially, but we already do this for back_comment, though.

> I expect the impact will be less than that of back_comment, but I think
> we'd want actual measurements anyway.

Yes.

> > A possibility would be to apply the `s' flag only in a syntax-table
> > text property applied to the newlines of unterminated strings.

> But that brings us back to "why not use string-fence?".

Yes.  String-fence interferes with syntactical stuff "inside" the
invalid string, whereas the `s' flag won't.

> > I disagree.  Whilst editing code, it is in an invalid state nearly
> > all the time.

> But we usually don't make any effort to guess what the intended
> closest valid state might be, except where the user is actively
> editing the text (e.g. by proposing completion candidates for
> identifiers).

There's no need to guess.  The compiler defines the state, namely that
the (invalid) string ends at the EOL, and what follows is non-string.

> > I would hope you would weigh up the small additional complexity against
> > the new features it brings, and reach a balanced judgment, rather than
> > dismissing the new idea without consideration.

> I did consider it.  I just know syntax.c well enough that I'd be very
> surprised if the actual patch (as opposed to my guess at what the
> patch would look like) makes me change my mind.

There's no need to guess.  back_maybe_string is in the new
scratch/fontify-open-string branch.  It is NOT that complicated.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01 18:18                                 ` Alan Mackenzie
@ 2018-07-01 23:16                                   ` Stefan Monnier
  2018-07-02 19:18                                     ` Alan Mackenzie
  0 siblings, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-07-01 23:16 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> >> Why?    How 'bout:
>> >>     char foo[] = "some unterminated // string
>> > Bug compatibility with the current scan-sexps.
>> I don't see why: currently, scan-sexps skips over the comment, but
>> that's not a bug: it's exactly what it is documented to do.
> There is no comment there, but scan-sexps skips to it nevertheless.

The starting point is within the string (not according to the C language
rules, of course, but according to the syntax-tables settings), and
operations like scan-sexps are documented to work under the assumption
that the starting point is outside of strings/comments, so it is very
much correct for it to consider this "// string\n" to be a comment.
I agree that it would be OK for scan-sexps in this case to consider that
\n terminates the string rather than the comment, tho.

>> When you change the syntax property of ?\n to be "> s", it changes the
>> behavior expected based on the documentation, ....
> Er, documentation?  This new flag isn't documented yet, or at least not
> in any permanent fashion.

Well, I was talking hypothetically under the assumption that "s" is
documented to mean something like "closes a string if there's one to
close".

>> .... so in the above case it should treat the \n as closing the string
>> rather than closing the comment.
> I agree.

OK, sorry 'bout the above, then, I see we agree.

>> It needs to work reliably for those languages where strings
>> are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA).
> You mean, jgraph-mode is another use-case for `s'?  (I'm not familiar
> with it.)

I looked for existing use-cases and I indeed found one.  It's very much
not high-profile, tho.  Also this use-case is slightly different in that
the \n is really the normal/only way to terminate the string in jgraph.
In case you're interested:

    http://web.eecs.utk.edu/~plank/plank/jgraph/jgraph.html

>> But that brings us back to "why not use string-fence?".
> Yes.  String-fence interferes with syntactical stuff "inside" the
> invalid string, whereas the `s' flag won't.

Not sure how serious this "interferes with syntactical stuff" is
in practice.

>> But we usually don't make any effort to guess what the intended
>> closest valid state might be, except where the user is actively
>> editing the text (e.g. by proposing completion candidates for
>> identifiers).
> There's no need to guess.  The compiler defines the state, namely that
> the (invalid) string ends at the EOL, and what follows is non-string.

The compiler just makes an arbitrary choice, just like we do and that
has no bearing on what the intended valid state is (which is not
something the compiler can discover either: it's only available in the
head of the coder).

> There's no need to guess.  back_maybe_string is in the new
> scratch/fontify-open-string branch.  It is NOT that complicated.

Unsurprisingly it introduces a complexity which I find unjustified by
the presented benefits.

But it now occurs to me that maybe we can do better: have you tried to
merge back_maybe_string into back_comment?  After all, back_comment
already pays attention to strings (in order to try and correctly
handle comment openers appearing within strings), so there's
a possibility that back_comment might be able to handle your use case
with much fewer changes (and in that case, the performance cost would
be pretty close to 0, I think).


        Stefan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01 23:16                                   ` Stefan Monnier
@ 2018-07-02 19:18                                     ` Alan Mackenzie
  2018-07-03  2:10                                       ` Stefan Monnier
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-02 19:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On Sun, Jul 01, 2018 at 19:16:06 -0400, Stefan Monnier wrote:
> >> >> Why?    How 'bout:
> >> >>     char foo[] = "some unterminated // string
> >> > Bug compatibility with the current scan-sexps.
> >> I don't see why: currently, scan-sexps skips over the comment, but
> >> that's not a bug: it's exactly what it is documented to do.
> > There is no comment there, but scan-sexps skips to it nevertheless.

> The starting point is within the string (not according to the C language
> rules, of course, but according to the syntax-tables settings), and
> operations like scan-sexps are documented to work under the assumption
> that the starting point is outside of strings/comments, so it is very
> much correct for it to consider this "// string\n" to be a comment.

Yes.  Apologies for my misunderstanding.

[ .... ]

> ..... I see we agree.

Yes.

> >> It needs to work reliably for those languages where strings
> >> are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA).
> > You mean, jgraph-mode is another use-case for `s'?  (I'm not familiar
> > with it.)

> I looked for existing use-cases and I indeed found one.  It's very much
> not high-profile, tho.  Also this use-case is slightly different in that
> the \n is really the normal/only way to terminate the string in jgraph.
> In case you're interested:

>     http://web.eecs.utk.edu/~plank/plank/jgraph/jgraph.html

> >> But that brings us back to "why not use string-fence?".
> > Yes.  String-fence interferes with syntactical stuff "inside" the
> > invalid string, whereas the `s' flag won't.

> Not sure how serious this "interferes with syntactical stuff" is
> in practice.

Maybe not very.

> >> But we usually don't make any effort to guess what the intended
> >> closest valid state might be, except where the user is actively
> >> editing the text (e.g. by proposing completion candidates for
> >> identifiers).
> > There's no need to guess.  The compiler defines the state, namely that
> > the (invalid) string ends at the EOL, and what follows is non-string.

> The compiler just makes an arbitrary choice, ....

No.  The compiler has no choice here.  Or does it?  Can you identify any
other sensible strategy a compiler could follow?

> .... just like we do and that has no bearing on what the intended
> valid state is (which is not something the compiler can discover
> either: it's only available in the head of the coder).

There may or may not be a unique "intended valid state".  I don't think
it's a helpful concept - it suggests that the states a buffer is in most
of the time are in some way unimportant.  I reaffirm my view that Emacs
should present optimal information about these normal (invalid) states,
and that they are very important indeed.

> > There's no need to guess.  back_maybe_string is in the new
> > scratch/fontify-open-string branch.  It is NOT that complicated.

> Unsurprisingly it introduces a complexity which I find unjustified by
> the presented benefits.

> But it now occurs to me that maybe we can do better: have you tried to
> merge back_maybe_string into back_comment?  After all, back_comment
> already pays attention to strings (in order to try and correctly
> handle comment openers appearing within strings), so there's
> a possibility that back_comment might be able to handle your use case
> with much fewer changes (and in that case, the performance cost would
> be pretty close to 0, I think).

That's a good idea.  I think it's clear that such a merge could be done.
But it would need a lot of detailed painstaking work.  It's optimisation
(as in "don't do it yet!").  Once we decide to adopt the idea is the
time to do this merge, I think.  That's assuming some measurements show
it's worthwhile (which I think it would be).

In fact, in my modified shell-script-mode I timed (scan-sexps BONL -1) a
million times on the following text:

    "string" at the start of a line.

With the `s' flag in place: 1.9489 seconds.
Without the `s' flag:       1.3003 seconds.

This is an overhead of almost exactly 50%.
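
For the record, the arithmetic behind that figure, as a small C
sketch.  The function name is made up for illustration; the two
timings are the ones quoted above.

```c
/* Relative overhead of the slower timing over the baseline, expressed
   as a fraction of the baseline (0.5 means 50% slower).  */
double
relative_overhead (double with_flag, double without_flag)
{
  return (with_flag - without_flag) / without_flag;
}
```

relative_overhead (1.9489, 1.3003) is 0.6486 / 1.3003, a little under
0.5.  Spread over the million calls, that is about 0.65 microseconds
extra per call.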

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-02 19:18                                     ` Alan Mackenzie
@ 2018-07-03  2:10                                       ` Stefan Monnier
  0 siblings, 0 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-07-03  2:10 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> >> But we usually don't make any effort to guess what the intended
>> >> closest valid state might be, except where the user is actively
>> >> editing the text (e.g. by proposing completion candidates for
>> >> identifiers).
>> > There's no need to guess.  The compiler defines the state, namely that
>> > the (invalid) string ends at the EOL, and what follows is non-string.
>> The compiler just makes an arbitrary choice, ....
> No.  The compiler has no choice here.  Or does it?

Of course it does.

> Can you identify any other sensible strategy a compiler could follow?

It could look for the next (closing) " and if it's not on the same line
signal an error about "invalid multiline string" (or "unterminated
string" if it bumps into EOF).  GCC used to do just that (without even
signaling an error) IIRC.
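
A minimal sketch of that alternative strategy, in C.  This is not GCC
code: classify_string and the enum are hypothetical names, and the
only escape handling is to skip the character after a backslash (so
an escaped newline does not count as crossing a line).

```c
#include <stdbool.h>
#include <stddef.h>

enum string_status { STRING_OK, STRING_MULTILINE, STRING_UNTERMINATED };

/* TEXT[START] is an opening '"'.  Search for the next unescaped '"'.
   Store its index in *END, or LEN if none is found before the end of
   the input.  The result says whether the string closed on the same
   line, closed on a later line, or never closed.  */
enum string_status
classify_string (const char *text, size_t len, size_t start, size_t *end)
{
  bool crossed_newline = false;

  for (size_t i = start + 1; i < len; i++)
    {
      if (text[i] == '\\' && i + 1 < len)
        i++;                      /* Skip the escaped character.  */
      else if (text[i] == '\n')
        crossed_newline = true;
      else if (text[i] == '"')
        {
          *end = i;
          return crossed_newline ? STRING_MULTILINE : STRING_OK;
        }
    }

  *end = len;
  return STRING_UNTERMINATED;
}
```

A compiler following this strategy would report "invalid multiline
string" for STRING_MULTILINE and "unterminated string" for
STRING_UNTERMINATED, rather than ending the string at the EOL.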

> There may or may not be a unique "intended valid state".  I don't think
> it's a helpful concept - it suggests that the states a buffer is in most
> of the time are in some way unimportant.  I reaffirm my view that Emacs
> should present optimal information about these normal (invalid) states,
> and that they are very important indeed.

I'm not sure you can define "optimal" without defining "intended valid
state" in this case.

> That's a good idea.  I think it's clear that such a merge could be done.
> But it would need a lot of detailed painstaking work.

From what I remember of back_comment (not very fresh, to be honest),
I think there's a good chance it would be pretty easy, actually (at
least easy in terms of the resulting patch being short: it may take
some time to come up with the patch, OTOH).

> With the `s' flag in place: 1.9489 seconds.
> Without the `s' flag:       1.3003 seconds.

Wow, I must say I expected a significantly lower overhead.


        Stefan



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: CC Mode and electric-pair "problem".
  2018-07-01  3:37                           ` Stefan Monnier
  2018-07-01 15:24                             ` Eli Zaretskii
@ 2018-07-06 21:58                             ` Stephen Leake
  1 sibling, 0 replies; 93+ messages in thread
From: Stephen Leake @ 2018-07-06 21:58 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> An unterminated string can only occur in an invalid piece of code.
> To the extent that invalid code has no clear meaning, there's no way
> to know what is really the "right" behavior.

True, but I still agree with Alan; treat the newline as a string
terminator for fontification.

> My point of view is that Emacs should focus on behaving as correctly as
> possible for valid code.  The only effort worth doing w.r.t invalid code
> is to avoid doing something clearly harmful and to help the user make
> the code valid again.  Anything further than that is time that would be
> better spent improving the handling of valid code.

I disagree. When we are editing code, it has incorrect syntax most of the
time, yet we still ask Emacs to fontify and indent it. So it is a strong
requirement that Emacs work "acceptably well" in this context.

I'm working on adding robust error correction to my Ada parser,
precisely for this purpose.

> I don't see any concrete benefit (for the user) of the new behavior over
> the old (or the reverse for that matter).  Either behavior is equally
> good and which behavior is better will depend on things which Emacs
> cannot know unless the user explicitly tells us.

Right. It would be nice to have the "terminate string on newline"
behavior as an option.

>> Up till now, Emacs hasn't bothered - it just allows these strings, and the
>> subsequent buffer portion, to be fontified randomly.
>
> It's not random: it's arbitrary.  The new behavior is also arbitrary.

Right.

> OTOH, there is very concrete evidence that the new behavior is worse in
> the sense that it adds complexity to the code and (as expected)
> introduces bugs.
>
> To me, this is a bad tradeoff.

Ok. I'm hoping for a coding solution that is not as complex.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]
  2018-07-01 16:38                                 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
@ 2018-07-08  8:29                                   ` Stephen Leake
  2018-07-15  9:00                                     ` Stephen Leake
  0 siblings, 1 reply; 93+ messages in thread
From: Stephen Leake @ 2018-07-08  8:29 UTC (permalink / raw)
  To: emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> It's no longer a matter of assumption.  Earlier on this afternoon, I
> committed a preliminary working version of this change to the branch
> scratch/fontify-open-string.

I've just tried this in ada-mode, and it works nicely. I like the red
face on an unbalanced string quote.

No noticeable slowdown in anything I've tried so far.

Let me know if there's some experiment you'd like me to run.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 93+ messages in thread

* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]
  2018-07-08  8:29                                   ` Stephen Leake
@ 2018-07-15  9:00                                     ` Stephen Leake
  2018-07-15 15:13                                       ` Eli Zaretskii
  2018-07-15 16:56                                       ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
  0 siblings, 2 replies; 93+ messages in thread
From: Stephen Leake @ 2018-07-15  9:00 UTC (permalink / raw)
  To: emacs-devel

An update on this; I just had several missing quotes in a buffer, due to
a copy/multiple paste that had a quote error. I did lots of editing with
the quote errors present.

I didn't even notice them until the compiler complained, just like any
other syntax error.

In my opinion, that is far preferable to the previous behavior of
fontifying large parts of the buffer as string, which forced me to pay
attention to a trivial syntax error instead of what I was actually doing.

This is in Ada, which does not have the option of escaping a newline to
create a multiline string, so treating a newline as string terminator is
always correct.

Anything I can do to help merge this to main?

Stephen Leake <stephen_leake@stephe-leake.org> writes:

> Alan Mackenzie <acm@muc.de> writes:
>
>> It's no longer a matter of assumption.  Earlier on this afternoon, I
>> committed a preliminary working version of this change to the branch
>> scratch/fontify-open-string.
>
> I've just tried this in ada-mode, and it works nicely. I like the red
> face on an unbalanced string quote.
>
> No noticeable slowdown in anything I've tried so far.
>
> Let me know if there's some experiment you'd like me to run.

-- 
-- Stephe




* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]
  2018-07-15  9:00                                     ` Stephen Leake
@ 2018-07-15 15:13                                       ` Eli Zaretskii
  2018-07-15 18:45                                         ` Alan Mackenzie
  2018-07-15 16:56                                       ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
  1 sibling, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-07-15 15:13 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

> From: Stephen Leake <stephen_leake@stephe-leake.org>
> Date: Sun, 15 Jul 2018 04:00:23 -0500
> 
> Anything I can do to help merge this to main?

A few things:

 . NEWS
 . Updates for the relevant parts in the manual(s)
 . Minor nits below:

> +(defcustom font-lock-warn-open-string t
> +  "Fontify the opening quote of an unterminated string with warning face?
> +This is done when this variable is non-nil.

We use a slightly different style for such options (slightly rephrased
to fit on one line):

  "Non-nil means show opening quotes of unterminated strings with warning face."

> +This works only when the syntax-table entry for newline contains the flag `s'
> +\(see page \"xxx\" in the Elisp manual)."

Please replace "xxx" with an actual value.  Also, we don't refer to
our manuals as "pages", that is a relic from the "man pages" era.

> +#define DEC_AT                                                  \

Please #undef DEC_AT when you are done using it (at function's end).
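
As a minimal sketch of that convention (the function and the macro
STEP_BACK are hypothetical stand-ins, not the DEC_AT from the patch),
a helper macro scoped to one function might look like this:

```c
#include <stddef.h>

/* Return the index of the nearest '"' at or before AT, never going
   below STOP.  Return STOP if no quote is found on the way.
   Hypothetical example, not code from src/syntax.c.  */
static size_t
find_quote_backward (const char *text, size_t at, size_t stop)
{
/* Helper macro that is only meaningful inside this function.  */
#define STEP_BACK (at--)

  while (at > stop && text[at] != '"')
    STEP_BACK;

/* Drop the macro at function's end so it cannot leak into later code.  */
#undef STEP_BACK
  return at;
}
```

After the #undef, a later function in the same file can define its own
STEP_BACK without a redefinition warning.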

> +  /* Find the alleged string opener. */

Please leave 2 spaces between the end of the comment and "*/" (here
and elsewhere in the patch).

> +  while ((at > stop)
> +         && (code != Sstring)
> +         && (!SYNTAX_FLAGS_CLOSE_STRING (syntax)))
> +    {
> +      DEC_AT;
> +    }

A single line doesn't need braces.

> +      /* Search back for a terminating string delimiter: */
> +      while ((at > stop)
> +             && (code != Sstring)
> +             && (code != Sstring_fence)
> +             && (!SYNTAX_FLAGS_CLOSE_STRING (syntax)))
> +        {
> +          DEC_AT;
> +          /* Check for comment and "other" strings. */
> +        }

Is the last comment at its correct place?  It doesn't seem to refer to
any code.

> + lose:
> +  UPDATE_SYNTAX_TABLE_FORWARD (*from);
> +  return false;
> +
> + lossage:
> +  /* We've encountered possible comments or strings with mixed
> +     delimiters.  Bail out and scan forward from a safe position. */

"lose" and "lossage" are too similar.  Can we have a better name for
the latter?

> +  {
> +    struct lisp_parse_state state;
> +    bool adjusted = true;

Why did you need the braces here?  C99 allows mixing declarations and
statements, so we no longer need such braces.

> +        find_start_value
> +          = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
> +          : state.thislevelstart >= 0 ? state.thislevelstart
> +          : find_start_value;

Please use parentheses here for better readability (to clearly show
which parts belong to which condition).

> -Comments are ignored if `parse-sexp-ignore-comments' is non-nil.
> +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil.

We nowadays prefer to quote 'like this' in comments and plain text.

> -Comments are ignored if `parse-sexp-ignore-comments' is non-nil.
> +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil.

Likewise.

Thanks again for working on this.




* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]
  2018-07-15  9:00                                     ` Stephen Leake
  2018-07-15 15:13                                       ` Eli Zaretskii
@ 2018-07-15 16:56                                       ` Alan Mackenzie
  2018-07-17  3:41                                         ` Stephen Leake
  1 sibling, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-15 16:56 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

Hello, Stephen.

Many thanks for trying out and testing this branch.

I'm afraid I've found a rather large snag - there are backward moving
commands and functions in Emacs which bypass proper syntax checking.
For example, in the following in (the modified) shell-script-mode:

1.    foo="Foo"
2.    bar="Bar
3.

, with point at BOL3, a C-M-b moves to the F, rather than "Bar.

This is because (forward-comment -1) crashes into the "whitespace" at
the end of L2 (the newline) rather than taking account of its syntax
(the string closing flag).

At the very least, the function back_comment (in src/syntax.c) will need
to be modified to take account of such things, and in doing so, might as
well become a function that also goes back over EOL-terminated strings,
as Stefan suggested.  This will be a lot of work.

I fear there may be several, or even many, lisp functions in Emacs which
may likewise need modifying.

The root cause of this problem, in the abstract, is that Emacs attempts
to scan backwards over strings and comments, which is only heuristically
possible, rather than scanning forwards over the same constructs and
remembering the endpoints.

Right at the moment, I don't know how to proceed.  Sorry.
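
To illustrate the forward-scanning idea in the abstract, here is a
minimal C sketch.  It is not the code on the branch and not what
src/syntax.c does.  The names str_span, scan_strings_forward and
string_start_before are hypothetical, and the sketch assumes that only
'"' delimits strings, that a newline closes an unterminated string, and
that there are no backslash escapes or comments.

```c
#include <stdbool.h>
#include <stddef.h>

/* One string found by the forward scan.  END is one past the character
   that closed the string: the closing '"', or the newline for an
   unterminated string.  */
struct str_span
{
  size_t start;
  size_t end;
  bool terminated;
};

/* Scan TEXT (LEN characters) forwards and record up to MAX strings in
   SPANS.  Return the number of strings recorded.  */
static size_t
scan_strings_forward (const char *text, size_t len,
                      struct str_span *spans, size_t max)
{
  size_t n = 0;
  size_t i = 0;

  while (i < len && n < max)
    {
      if (text[i] != '"')
        {
          i++;
          continue;
        }

      size_t start = i++;
      /* Run to the closing quote, a newline, or the end of the text.  */
      while (i < len && text[i] != '"' && text[i] != '\n')
        i++;

      spans[n].start = start;
      spans[n].terminated = i < len && text[i] == '"';
      spans[n].end = i < len ? i + 1 : len;
      i = spans[n].end;
      n++;
    }
  return n;
}

/* Return the start of the recorded string that ends exactly at POS, or
   POS itself if no recorded string ends there.  No backward scanning
   of TEXT is needed; the remembered endpoints are enough.  */
static size_t
string_start_before (const struct str_span *spans, size_t n, size_t pos)
{
  for (size_t k = n; k > 0; k--)
    if (spans[k - 1].end == pos)
      return spans[k - 1].start;
  return pos;
}
```

Taking the three-line example above as the text foo="Foo" newline
bar="Bar newline (0-based offsets, no leading indentation), the scan
records two strings, and asking for the string that ends at BOL3
(offset 19) gives offset 14, the opening quote of "Bar.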

-- 
Alan Mackenzie (Nuremberg, Germany).


On Sun, Jul 15, 2018 at 04:00:23 -0500, Stephen Leake wrote:
> An update on this; I just had several missing quotes in a buffer, due to
> a copy/multiple paste that had a quote error. I did lots of editing with
> the quote errors present.

> I didn't even notice them until the compiler complained, just like any
> other syntax error.

> In my opinion, that is far preferable to the previous behavior of
> fontifying large parts of the buffer as string, which forced me to pay
> attention to a trivial syntax error instead of what I was actually doing.

> This is in Ada, which does not have the option of escaping a newline to
> create a multiline string, so treating a newline as string terminator is
> always correct.

> Anything I can do to help merge this to main?

> Stephen Leake <stephen_leake@stephe-leake.org> writes:

> > Alan Mackenzie <acm@muc.de> writes:
> >
> >> It's no longer a matter of assumption.  Earlier on this afternoon, I
> >> committed a preliminary working version of this change to the branch
> >> scratch/fontify-open-string.
> >
> > I've just tried this in ada-mode, and it works nicely. I like the red
> > face on an unbalanced string quote.
> >
> > No noticeable slowdown in anything I've tried so far.
> >
> > Let me know if there's some experiment you'd like me to run.

> -- 
> -- Stephe




* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]
  2018-07-15 15:13                                       ` Eli Zaretskii
@ 2018-07-15 18:45                                         ` Alan Mackenzie
  2018-07-16  2:23                                           ` Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) Stefan Monnier
  0 siblings, 1 reply; 93+ messages in thread
From: Alan Mackenzie @ 2018-07-15 18:45 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stephen Leake, emacs-devel

Hello, Eli,

thanks for the review.  The code is still preliminary, and is missing
quite a lot of comments, still.

I have had doubts about the mechanism (e.g. C-M-b will take a lot of
work to make it functional), see my reply to Stephen.

On Sun, Jul 15, 2018 at 18:13:15 +0300, Eli Zaretskii wrote:
> > From: Stephen Leake <stephen_leake@stephe-leake.org>
> > Date: Sun, 15 Jul 2018 04:00:23 -0500

> > Anything I can do to help merge this to main?

> A few things:

>  . NEWS
>  . Updates for the relevant parts in the manual(s)
>  . Minor nits below:

> > +(defcustom font-lock-warn-open-string t
> > +  "Fontify the opening quote of an unterminated string with warning face?
> > +This is done when this variable is non-nil.

> We use a slightly different style for such options (slightly rephrased
> to fit on one line):

Well done for the compression!

>   "Non-nil means show opening quotes of unterminated strings with warning face."

> > +This works only when the syntax-table entry for newline contains the flag `s'
> > +\(see page \"xxx\" in the Elisp manual)."

> Please replace "xxx" with an actual value.  Also, we don't refer to
> our manuals as "pages", that is a relic from the "man pages" era.

Yes, thanks.  Just "see \"<page name>\"", without the "page"?

> > +#define DEC_AT                                                  \

> Please #undef DEC_AT when you are done using it (at function's end).

OK.

> > +  /* Find the alleged string opener. */

> Please leave 2 spaces between the end of the comment and "*/" (here
> and elsewhere in the patch)

OK.  As a matter of interest, what is the reason for this?  I've seen
it all over the Emacs C code.  Is it something to do with filling?

> > +  while ((at > stop)
> > +         && (code != Sstring)
> > +         && (!SYNTAX_FLAGS_CLOSE_STRING (syntax)))
> > +    {
> > +      DEC_AT;
> > +    }

> A single line doesn't need braces.

I'm intending to put more code in there.

> > +      /* Search back for a terminating string delimiter: */
> > +      while ((at > stop)
> > +             && (code != Sstring)
> > +             && (code != Sstring_fence)
> > +             && (!SYNTAX_FLAGS_CLOSE_STRING (syntax)))
> > +        {
> > +          DEC_AT;
> > +          /* Check for comment and "other" strings. */
> > +        }

> Is the last comment at its correct place?  It doesn't seem to refer to
> any code.

It's a FIXME: "Put in code here to check for comment and 'other'
strings."

> > + lose:
> > +  UPDATE_SYNTAX_TABLE_FORWARD (*from);
> > +  return false;
> > +
> > + lossage:
> > +  /* We've encountered possible comments or strings with mixed
> > +     delimiters.  Bail out and scan forward from a safe position. */

> "lose" and "lossage" are too similar.  Can we have a better name for
> the latter?

OK.  I took the names from, I think, back_comment.

> > +  {
> > +    struct lisp_parse_state state;
> > +    bool adjusted = true;

> Why did you need the braces here?  C99 allows mixing declarations and
> statements, so we no longer need such braces.

OK.

> > +        find_start_value
> > +          = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
> > +          : state.thislevelstart >= 0 ? state.thislevelstart
> > +          : find_start_value;

> Please use parentheses here for better readability (to clearly show
> which parts belong to which condition).

Yes, it didn't indent well by itself.  Maybe I should raise this with
the CC Mode maintainer.  But yes, I'll put parens in.

> > -Comments are ignored if `parse-sexp-ignore-comments' is non-nil.
> > +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil.

> We nowadays prefer to quote 'like this' in comments and plain text.

OK.

> > -Comments are ignored if `parse-sexp-ignore-comments' is non-nil.
> > +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil.

> Likewise.

> Thanks again for working on this.

I'll make the stylistic corrections, then get working on it again in
earnest.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".])
  2018-07-15 18:45                                         ` Alan Mackenzie
@ 2018-07-16  2:23                                           ` Stefan Monnier
  2018-07-16 14:18                                             ` Eli Zaretskii
  0 siblings, 1 reply; 93+ messages in thread
From: Stefan Monnier @ 2018-07-16  2:23 UTC (permalink / raw)
  To: emacs-devel

>> > +        find_start_value
>> > +          = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
>> > +          : state.thislevelstart >= 0 ? state.thislevelstart
>> > +          : find_start_value;
>> Please use parentheses here for better readability (to clearly show
>> which parts belong to which condition).
> Yes, it didn't indent well by itself.  Maybe I should raise this with
> the CC Mode maintainer.  But yes, I'll put parens in.

This is one of those rare cases where sm-c-mode handles it better:

    find_start_value
      = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
        : state.thislevelstart >= 0 ? state.thislevelstart
        : find_start_value;

This said, I don't see either indentation as problematic and I'm not
sure what would be "better for readability".


        Stefan





* Re: Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".])
  2018-07-16  2:23                                           ` Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) Stefan Monnier
@ 2018-07-16 14:18                                             ` Eli Zaretskii
  2018-07-16 15:54                                               ` Indentation of ?: in C-mode Stefan Monnier
  0 siblings, 1 reply; 93+ messages in thread
From: Eli Zaretskii @ 2018-07-16 14:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sun, 15 Jul 2018 22:23:49 -0400
> 
> >> > +        find_start_value
> >> > +          = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
> >> > +          : state.thislevelstart >= 0 ? state.thislevelstart
> >> > +          : find_start_value;
> >> Please use parentheses here for better readability (to clearly show
> >> which parts belong to which condition).
> > Yes, it didn't indent well by itself.  Maybe I should raise this with
> > the CC Mode maintainer.  But yes, I'll put parens in.
> 
> This is one of those rare cases where sm-c-mode handles it better:
> 
>     find_start_value
>       = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
>         : state.thislevelstart >= 0 ? state.thislevelstart
>         : find_start_value;
> 
> This said, I don't see either indentation as problematic and I'm not
> sure what would be "better for readability".

This:

    find_start_value = CONSP (state.levelstarts)
                       ? XINT (XCAR (state.levelstarts))
                       : (state.thislevelstart >= 0
                          ? state.thislevelstart
                          : find_start_value);

Or maybe even this:

    find_start_value
      = CONSP (state.levelstarts)
        ? XINT (XCAR (state.levelstarts))
        : (state.thislevelstart >= 0 ? state.thislevelstart : find_start_value);




* Re: Indentation of ?: in C-mode
  2018-07-16 14:18                                             ` Eli Zaretskii
@ 2018-07-16 15:54                                               ` Stefan Monnier
  0 siblings, 0 replies; 93+ messages in thread
From: Stefan Monnier @ 2018-07-16 15:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

>> >> > +        find_start_value
>> >> > +          = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
>> >> > +          : state.thislevelstart >= 0 ? state.thislevelstart
>> >> > +          : find_start_value;
>> >> Please use parentheses here for better readability (to clearly show
>> >> which parts belong to which condition).
>> > Yes, it didn't indent well by itself.  Maybe I should raise this with
>> > the CC Mode maintainer.  But yes, I'll put parens in.
>> This is one of those rare cases where sm-c-mode handles it better:
>> 
>>     find_start_value
>>       = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
>>         : state.thislevelstart >= 0 ? state.thislevelstart
>>         : find_start_value;
>> 
>> This said, I don't see either indentation as problematic and I'm not
>> sure what would be "better for readability".
> This:
>
>     find_start_value = CONSP (state.levelstarts)
>                        ? XINT (XCAR (state.levelstarts))
>                        : (state.thislevelstart >= 0
>                           ? state.thislevelstart
>                           : find_start_value);

Interesting: I find this one to be (ever so slightly) less readable.
Basically, I read

     find_start_value
       = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts))
         : state.thislevelstart >= 0 ? state.thislevelstart
         : find_start_value;

as a C version of

     (setq find_start_value
           (cond
            ((consp state.levelstarts) (XINT (XCAR (state.levelstarts))))
            ((>= state.thislevelstart 0) state.thislevelstart)
            (t find_start_value)));

so I find the ": condition ? value" lines to be very natural (the only
odd line is really the first one because it doesn't start with ":").
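
A small C sketch of why the layouts under discussion are interchangeable
as far as the compiler is concerned: ?: is right-associative, so the
chained form, the parenthesized form and an if/else-if chain all pick
the same value.  The function and parameter names are hypothetical
stand-ins for the state.levelstarts and state.thislevelstart
expressions in the patch.

```c
#include <stdbool.h>

/* Chained form: each ": condition ? value" line reads like one clause
   of a Lisp cond.  */
static int
pick_chained (bool have_levelstarts, int first_levelstart,
              int thislevelstart, int fallback)
{
  return have_levelstarts ? first_levelstart
         : thislevelstart >= 0 ? thislevelstart
         : fallback;
}

/* Parenthesized form: the nesting is spelled out explicitly.  */
static int
pick_parenthesized (bool have_levelstarts, int first_levelstart,
                    int thislevelstart, int fallback)
{
  return have_levelstarts
         ? first_levelstart
         : (thislevelstart >= 0 ? thislevelstart : fallback);
}

/* The same decision written as an if/else-if chain.  */
static int
pick_if_chain (bool have_levelstarts, int first_levelstart,
               int thislevelstart, int fallback)
{
  if (have_levelstarts)
    return first_levelstart;
  else if (thislevelstart >= 0)
    return thislevelstart;
  else
    return fallback;
}
```

So the choice between the layouts is purely about how the reader
groups the lines, not about what the expression computes.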


        Stefan


PS: This was the opportunity to see that sm-c-mode misindents the last
line of the middle example if you remove the parens:

  find_start_value = CONSP (state.levelstarts)
                     ? XINT (XCAR (state.levelstarts))
                     : state.thislevelstart >= 0
                       ? state.thislevelstart
                     : find_start_value;

although now that I see it, I wonder if sm-c-mode.el read my mind and
took ": condition ? value" as a logical entity, just to try and show me
where this idea breaks down.




* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]
  2018-07-15 16:56                                       ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
@ 2018-07-17  3:41                                         ` Stephen Leake
  0 siblings, 0 replies; 93+ messages in thread
From: Stephen Leake @ 2018-07-17  3:41 UTC (permalink / raw)
  To: emacs-devel

Alan Mackenzie <acm@muc.de> writes:

> Hello, Stephen.
>
> Many thanks for trying out and testing this branch.

You're welcome; thanks for implementing it.

> I'm afraid I've found a rather large snag - there are backward moving
> commands and functions in Emacs which bypass proper syntax checking.
> For example, in the following in (the modified) shell-script-mode:
>
> 1.    foo="Foo"
> 2.    bar="Bar
> 3.
>
> , with point at BOL3, a C-M-b moves to the F, rather than "Bar.
>
> This is because (forward-comment -1) crashes into the "whitespace" at
> the end of L2 (the newline) rather than taking account of its syntax
> (the string closing flag).
>
> At the very least, the function back_comment (in src/syntax.c) will need
> to be modified to take account of such things, and in doing so, might as
> well become a function that also goes back over EOL-terminated strings,
> as Stefan suggested.  This will be a lot of work.
>
> I fear there may be several, or even many, lisp functions in Emacs which
> may likewise need modifying.
>
> The root cause of this problem, in the abstract, is that Emacs attempts
> to scan backwards over strings and comments, which is only heuristically
> possible, rather than scanning forwards over the same constructs and
> remembering the endpoints.
>
> Right at the moment, I don't know how to proceed.  Sorry.

I think you are opposed to syntax-ppss, but that does scan forward and
remember things; can we use that here?

-- 
-- Stephe





Thread overview: 93+ messages
2018-05-22  7:42 [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings Tino Calancha
2018-05-22 17:40 ` Alan Mackenzie
2018-05-22 19:21   ` João Távora
2018-05-22 19:34     ` Eli Zaretskii
2018-05-22 20:25       ` João Távora
2018-05-22 22:17         ` João Távora
2018-05-23 14:52           ` Eli Zaretskii
2018-05-23 20:46     ` Alan Mackenzie
2018-05-23 21:12       ` João Távora
2018-05-23 23:21     ` Michael Welsh Duggan
2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie
2018-05-31 16:07   ` CC Mode and electric-pair "problem" João Távora
2018-05-31 17:28     ` Alan Mackenzie
2018-05-31 18:37       ` João Távora
2018-06-02 13:02         ` Alan Mackenzie
2018-06-03  3:00           ` João Távora
2018-06-17 16:58   ` Glenn Morris
2018-06-17 20:13     ` Alan Mackenzie
2018-06-17 21:07       ` Stefan Monnier
2018-06-17 21:27       ` João Távora
2018-06-18 10:36         ` Alan Mackenzie
2018-06-18 13:24           ` João Távora
2018-06-18 15:18             ` Eli Zaretskii
2018-06-18 15:37               ` João Távora
2018-06-18 16:46                 ` Eli Zaretskii
2018-06-18 17:21                   ` Eli Zaretskii
2018-06-18 23:49                   ` João Távora
2018-06-19  2:37                     ` Eli Zaretskii
2018-06-19  8:13                       ` João Távora
2018-06-19 16:59                         ` Eli Zaretskii
2018-06-19 19:40                           ` João Távora
2018-06-18 20:24                 ` Glenn Morris
2018-06-19  2:03                   ` João Távora
2018-06-18 15:42             ` Alan Mackenzie
2018-06-18 17:01               ` João Távora
2018-06-18 18:07                 ` Yuri Khan
2018-06-18 22:52                   ` João Távora
2018-06-18 18:08                 ` Alan Mackenzie
2018-06-18 23:43                   ` João Távora
2018-06-19  1:35                     ` João Távora
2018-06-19  1:48                   ` Stefan Monnier
2018-06-19  3:52                     ` Clément Pit-Claudel
2018-06-19  6:38                       ` Stefan Monnier
2018-06-20 13:48                         ` Clément Pit-Claudel
2018-06-26 16:08                     ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie
2018-06-26 20:02                       ` João Távora
2018-06-28 23:56                       ` Stefan Monnier
2018-06-29  0:43                         ` Stefan Monnier
2018-06-18 22:41                 ` CC Mode and electric-pair "problem" Stephen Leake
2018-06-19  0:02                   ` João Távora
2018-06-19  3:15                   ` Clément Pit-Claudel
2018-06-19  8:16                     ` João Távora
2018-06-19  5:02                 ` Alan Mackenzie
2018-06-20 14:16                   ` Stefan Monnier
2018-06-26 18:23                     ` Alan Mackenzie
2018-06-27 13:37                       ` João Távora
2018-06-29  3:42                       ` Stefan Monnier
2018-06-30 18:09                         ` Alan Mackenzie
2018-07-01  3:37                           ` Stefan Monnier
2018-07-01 15:24                             ` Eli Zaretskii
2018-07-06 21:58                             ` Stephen Leake
2018-07-01 15:57                           ` Paul Eggert
2018-06-27 18:27                     ` Alan Mackenzie
2018-06-29  4:11                       ` Stefan Monnier
2018-06-30 19:03                         ` Alan Mackenzie
2018-06-30 19:29                           ` Eli Zaretskii
2018-06-30 20:14                             ` Alan Mackenzie
2018-07-01  3:50                               ` Stefan Monnier
2018-07-01  9:58                                 ` Alan Mackenzie
2018-07-01 11:22                               ` João Távora
2018-07-01 15:25                                 ` Eli Zaretskii
2018-07-01 15:22                               ` Eli Zaretskii
2018-07-01 16:38                                 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
2018-07-08  8:29                                   ` Stephen Leake
2018-07-15  9:00                                     ` Stephen Leake
2018-07-15 15:13                                       ` Eli Zaretskii
2018-07-15 18:45                                         ` Alan Mackenzie
2018-07-16  2:23                                           ` Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) Stefan Monnier
2018-07-16 14:18                                             ` Eli Zaretskii
2018-07-16 15:54                                               ` Indentation of ?: in C-mode Stefan Monnier
2018-07-15 16:56                                       ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie
2018-07-17  3:41                                         ` Stephen Leake
2018-07-01  4:02                           ` CC Mode and electric-pair "problem" Stefan Monnier
2018-07-01 10:58                             ` Alan Mackenzie
2018-07-01 11:46                               ` João Távora
2018-07-01 16:13                               ` Stefan Monnier
2018-07-01 18:18                                 ` Alan Mackenzie
2018-07-01 23:16                                   ` Stefan Monnier
2018-07-02 19:18                                     ` Alan Mackenzie
2018-07-03  2:10                                       ` Stefan Monnier
2018-06-26 18:52                   ` Alan Mackenzie
2018-06-26 19:45                     ` João Távora
2018-06-26 20:09                       ` Alan Mackenzie

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
