* [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
@ 2018-05-22  7:42 Tino Calancha
  2018-05-22 17:40 ` Alan Mackenzie
  2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie
  0 siblings, 2 replies; 93+ messages in thread
From: Tino Calancha @ 2018-05-22  7:42 UTC
To: Alan Mackenzie; +Cc: Tino Calancha, Emacs developers

Hi Alan,

Since this commit, one test
(electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings)
in electric-tests.el is failing with this error:
(scan-error "Unbalanced parentheses" 6 1)

This is the backtrace:
  scan-sexps(7 -1)
  forward-sexp(-1)
  backward-sexp()
  c-before-change-check-unbalanced-strings(6 6)
  #f(compiled-function (fn) #<bytecode 0x1171d85>)(c-before-change-che
  mapc(#f(compiled-function (fn) #<bytecode 0x1171d85>) (c-extend-regi
  c-before-change(6 6)
  self-insert-command(1)
  electric-pair--insert(34)
  electric-pair-post-self-insert-function()
  self-insert-command(1)
  funcall-interactively(self-insert-command 1)
  call-interactively(self-insert-command)
  #f(compiled-function () #<bytecode 0x484725>)()
  funcall(#f(compiled-function () #<bytecode 0x484725>))
  (let nil (funcall '#f(compiled-function () #<bytecode 0x484725>)))
  eval((let nil (funcall '#f(compiled-function () #<bytecode 0x484725>
  #f(compiled-function () #<bytecode 0x48475d>)()
  call-with-saved-electric-modes(#f(compiled-function () #<bytecode 0x
  electric-pair-test-for("\"foo\"" 2 34 "\"\"foo\"\"" 3 c++-mode nil #
  #f(compiled-function () #<bytecode 0x514bbd>)()
  ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
  ert-run-test(#s(ert-test :name electric-pair-autowrapping-5-at-point
  ert-run-or-rerun-test(#s(ert--stats :selector (not (or (tag :expensi
  ert-run-tests((not (or (tag :expensive-test) (tag :unstable))) #f(co
  ert-run-tests-batch((not (or (tag :expensive-test) (tag :unstable)))
  ert-run-tests-batch-and-exit((not (or (tag :expensive-test) (tag :un
  eval((ert-run-tests-batch-and-exit '(not (or (tag :expensive-test) (
  command-line-1(("-L" ":." "-l" "ert" "-l" "lisp/electric-tests" "--e
  command-line()
  normal-top-level()

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: Alan Mackenzie @ 2018-05-22 17:40 UTC
To: Tino Calancha, João Távora; +Cc: Emacs developers

Hello, Tino and João.

On Tue, May 22, 2018 at 16:42:46 +0900, Tino Calancha wrote:
> Hi Alan,

> Since this commit one test
> (electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings)
> in electric-tests.el is failing with error:
> (scan-error "Unbalanced parentheses" 6 1)

Sorry about that.

> This is the backtrace:
>   scan-sexps(7 -1)
>   forward-sexp(-1)
>   backward-sexp()
>   c-before-change-check-unbalanced-strings(6 6)
>   #f(compiled-function (fn) #<bytecode 0x1171d85>)(c-before-change-che
>   mapc(#f(compiled-function (fn) #<bytecode 0x1171d85>) (c-extend-regi
>   c-before-change(6 6)
>   self-insert-command(1)
>   electric-pair--insert(34)
>   electric-pair-post-self-insert-function()
>   self-insert-command(1)
>   funcall-interactively(self-insert-command 1)
>   call-interactively(self-insert-command)
>   #f(compiled-function () #<bytecode 0x484725>)()
>   funcall(#f(compiled-function () #<bytecode 0x484725>))
>   (let nil (funcall '#f(compiled-function () #<bytecode 0x484725>)))
>   eval((let nil (funcall '#f(compiled-function () #<bytecode 0x484725>
>   #f(compiled-function () #<bytecode 0x48475d>)()
>   call-with-saved-electric-modes(#f(compiled-function () #<bytecode 0x
>   electric-pair-test-for("\"foo\"" 2 34 "\"\"foo\"\"" 3 c++-mode nil #
>   #f(compiled-function () #<bytecode 0x514bbd>)()
>   ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
>   ert-run-test(#s(ert-test :name electric-pair-autowrapping-5-at-point
>   ert-run-or-rerun-test(#s(ert--stats :selector (not (or (tag :expensi
>   ert-run-tests((not (or (tag :expensive-test) (tag :unstable))) #f(co
>   ert-run-tests-batch((not (or (tag :expensive-test) (tag :unstable)))
>   ert-run-tests-batch-and-exit((not (or (tag :expensive-test) (tag :un
>   eval((ert-run-tests-batch-and-exit '(not (or (tag :expensive-test) (
>   command-line-1(("-L" ":." "-l" "ert" "-l" "lisp/electric-tests" "--e
>   command-line()
>   normal-top-level()

João, the file electric-tests.el is anything but straightforward to
read.  The test referred to above is generated by a nest of two or three
macros, somehow, and it is not obvious what buffer operations were
generated by these macros, and how they triggered a newly introduced bug
in C++ Mode.  The comments in the file are too sparse to help.

How do I extract these essential details from
"electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings"?

Thanks!

--
Alan Mackenzie (Nuremberg, Germany).

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: João Távora @ 2018-05-22 19:21 UTC
To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Hi Alan,

Alan Mackenzie <acm@muc.de> writes:

> electric-tests.el is anything but straightforward to read.

lol, sorry I couldn't quite make the 20000+LOC standards of cc-mode.el
:p.

No really, kidding, <wipes tears>, that file needs macros because it
defines almost 500 tests with very subtle variations between them.  You
tripped one of them, and I think we're both glad you did (regardless of
who is at fault: test or c++-mode)

> The test referred to above is generated by a nest of two or three
> macros, somehow, and it is not obvious what buffer operations were
> generated by these macros, and how they triggered a newly introduced bug
> in C++ Mode.  The comments in the file are too sparse to help.

Here's how it works: There's only one macro for electric-pair tests,
aptly named define-electric-pair-test.

In that file, find the `define-electric-pair-test' that most closely
matches the test failure, in this case it's:

   (define-electric-pair-test autowrapping-5
     "foo" "\"" :expected-string "\"foo\"" :expected-point 2
     :fixture-fn #'(lambda ()
                     (electric-pair-mode 1)
                     (mark-sexp 1)))

now go to the end of the expression and type M-x
pp-macroexpand-last-sexp.  It should be easy to find your failing test,
defined in terms of `ert-deftest', in a list of 6 tests.  Here it is:

   (ert-deftest electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings nil
     "With |\"foo\"|, try input \" at point 2. Should become |\"\"foo\"\"| and point at 3"
     (electric-pair-test-for "\"foo\"" 2 34 "\"\"foo\"\"" 3 'c++-mode nil
                             #'(lambda nil
                                 (electric-pair-mode 1)
                                 (mark-sexp 1))))

Now M-x edebug-defun this form straight in the *Pp Macroexpand Output*
buffer, and M-x edebug-defun the `electric-pair-test-for' defun,
too.  Now run the test:

   M-x ert RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings RET

As you step through the code, you'll eventually land on that lambda
which calls mark-sexp and hints, along with the name, that this is a
region-autowrapping test.  This is why it expects, for a single character
of input, that two quotes are inserted in the buffer instead of
one.  The test passes in my 26.1 as you probably already knew.

Good luck hunting the bug and let me know if you have more problems.

Thanks,
João

PS: you could also have used:

   M-x ert-describe-test RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings

Which would have rendered a nice docstring

   electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
   test defined in `electric-tests.elc'.

   With |"foo"|, try input " at point 2. Should become |""foo""| and point at 3

   [back]

Though, admittedly, this is misleading for the "autowrapping" tests,
since it doesn't tell you about the "mark-sexp" region-making
command.  Also not immediately clear, perhaps, is that the | are buffer
boundaries.

Ideally, it should read

   With |"foo"|, at point 2, (mark-sexp 1) and try input ".
   Should become |""foo""| and point at 3

I will try to fix this in master.

Also M-x ert-find-test-other-window could have helped you, but it
doesn't (brings me to the beginning of the file, which isn't helpful).  I
don't know why, does anyone?

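The buffer operations behind the failing test can also be replayed by hand, outside ERT. The following is a minimal sketch, assuming `electric-pair-test-for' does roughly this (set up a temporary buffer in the given mode, run the fixture function, then invoke whatever command the input key is bound to); the function name `my/replay-autowrapping-5' is hypothetical and not part of electric-tests.el. The test itself expects the buffer to end up as ""foo"" with point at 3.

```elisp
(require 'elec-pair)
(require 'cc-mode)

;; Hypothetical helper, not part of electric-tests.el.  It replays the
;; steps described for the autowrapping-5 test and returns the result
;; instead of asserting on it.
(defun my/replay-autowrapping-5 (&optional mode)
  "Replay the autowrapping-5 scenario in MODE (default `c++-mode').
Return a cons (BUFFER-CONTENTS . POINT)."
  (let ((was-on electric-pair-mode))
    (unwind-protect
        (with-temp-buffer
          (funcall (or mode #'c++-mode))
          (insert "\"foo\"")
          (let ((last-command-event ?\")
                ;; Assumption: the region must count as active for the
                ;; wrapping behaviour, so enable it temporarily.
                (transient-mark-mode 'lambda))
            (goto-char 2)
            ;; The fixture function from the test definition.
            (electric-pair-mode 1)
            (mark-sexp 1)
            ;; "Type" the double quote via its current key binding.
            (call-interactively (key-binding (vector last-command-event))))
          (cons (buffer-substring-no-properties (point-min) (point-max))
                (point)))
      ;; `electric-pair-mode' is global, so restore its previous state.
      (unless was-on (electric-pair-mode -1)))))
```

Evaluating (my/replay-autowrapping-5) in an affected build should signal the same scan-error as the ERT run, which makes it convenient to wrap in `debug-on-error' or to instrument with edebug.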
* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: Eli Zaretskii @ 2018-05-22 19:34 UTC
To: João Távora; +Cc: acm, tino.calancha, emacs-devel

> From: joaotavora@gmail.com (João Távora)
> Date: Tue, 22 May 2018 20:21:25 +0100
> Cc: Emacs developers <emacs-devel@gnu.org>,
> 	Tino Calancha <tino.calancha@gmail.com>
>
> Here's how it works:

Thanks.  How about adding this information to the test file(s), as
comments, so that others won't need to search high and low for it?

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: João Távora @ 2018-05-22 20:25 UTC
To: Eli Zaretskii; +Cc: acm, tino.calancha, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: joaotavora@gmail.com (João Távora)
>> Date: Tue, 22 May 2018 20:21:25 +0100
>> Cc: Emacs developers <emacs-devel@gnu.org>,
>> 	Tino Calancha <tino.calancha@gmail.com>
>>
>> Here's how it works:
>
> Thanks.  How about adding this information to the test file(s), as
> comments, so that others won't need to search high and low for it?

Sure, but there are two more important things to do before:

1. fix the generated docstring so that M-x
   ert-describe-test returns useful information.  I can do this.

2. fix M-x ert-find-test-other-window so that it finds the generating
   form.  Who can help?

Emacs is a self-documenting editor: I would need to explain very little
to a seasoned Emacs user if just one of these things were working, let
alone two.

That said, adding comments is seldom a bad idea (because when they
become outdated, it's a terrible idea.)

João

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: João Távora @ 2018-05-22 22:17 UTC
To: Eli Zaretskii, acm; +Cc: tino.calancha, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> Thanks.  How about adding this information to the test file(s), as
>> comments, so that others won't need to search high and low for it?
>
> 1. fix the generated docstring so that M-x
>    ert-describe-test returns useful information.  I can do this.

I just pushed this change to master and though it's not perfect, I'm
pretty happy with the result.  Here's readable output of M-x
ert-describe-test for the test that Alan is investigating:

   electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
   test defined in `electric-tests.elc'.

   Electricity test in a `c++-mode' buffer.

   Start with point at 2 in a 5-char-long buffer
   like this one:

     |"foo"|   (buffer start and end are denoted by `|')

   Now call this:

   #'(lambda nil
       (electric-pair-mode 1)
       (mark-sexp 1))

   Now press the key for: "

   The buffer's contents should become:

     |""foo""|

   , and point should be at 3.

João

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: Eli Zaretskii @ 2018-05-23 14:52 UTC
To: João Távora; +Cc: acm, tino.calancha, emacs-devel

> From: joaotavora@gmail.com (João Távora)
> Cc: emacs-devel@gnu.org,  tino.calancha@gmail.com
> Date: Tue, 22 May 2018 23:17:22 +0100
>
> I just pushed this change to master and though it's not perfect, I'm
> pretty happy with the result.  Here's readable output of M-x
> ert-describe-test for the test that Alan is investigating:
>
>    electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
>    test defined in `electric-tests.elc'.
>
>    Electricity test in a `c++-mode' buffer.
>
>    Start with point at 2 in a 5-char-long buffer
>    like this one:
>
>      |"foo"|   (buffer start and end are denoted by `|')
>
>    Now call this:
>
>    #'(lambda nil
>        (electric-pair-mode 1)
>        (mark-sexp 1))
>
>    Now press the key for: "
>
>    The buffer's contents should become:
>
>      |""foo""|
>
>    , and point should be at 3.

Thanks, LGTM.

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: Alan Mackenzie @ 2018-05-23 20:46 UTC
To: João Távora; +Cc: Emacs developers, Tino Calancha

Hello, João.

On Tue, May 22, 2018 at 20:21:25 +0100, João Távora wrote:
> Hi Alan,

> Alan Mackenzie <acm@muc.de> writes:

> > electric-tests.el is anything but straightforward to read.

> lol, sorry I couldn't quite make the 20000+LOC standards of cc-mode.el
> :p.

:-)

> No really, kidding, <wipes tears>, that file needs macros because it
> defines almost 500 tests with very subtle variations between them.

Yes, I understand this.

> You tripped one of them, and I think we're both glad you did
> (regardless of who is at fault: test or c++-mode)

Well, I'm glad about it, so thanks for this test file!  The bug is most
definitely in CC Mode - I'd neglected to handle properly an open string
lacking an EOL to "terminate" it.  This will be easy to fix.

> > The test referred to above is generated by a nest of two or three
> > macros, somehow, and it is not obvious what buffer operations were
> > generated by these macros, and how they triggered a newly introduced bug
> > in C++ Mode.  The comments in the file are too sparse to help.

> Here's how it works: There's only one macro for electric-pair tests,
> aptly named define-electric-pair-test.

> In that file, find the `define-electric-pair-test' that most closely
> matches the test failure, in this case it's:

>    (define-electric-pair-test autowrapping-5
>      "foo" "\"" :expected-string "\"foo\"" :expected-point 2
>      :fixture-fn #'(lambda ()
>                      (electric-pair-mode 1)
>                      (mark-sexp 1)))

> now go to the end of the expression and type M-x
> pp-macroexpand-last-sexp.  It should be easy to find your failing test,
> defined in terms of `ert-deftest', in a list of 6 tests.  Here it is:

>    (ert-deftest electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings nil
>      "With |\"foo\"|, try input \" at point 2. Should become |\"\"foo\"\"| and point at 3"
>      (electric-pair-test-for "\"foo\"" 2 34 "\"\"foo\"\"" 3 'c++-mode nil
>                              #'(lambda nil
>                                  (electric-pair-mode 1)
>                                  (mark-sexp 1))))

> Now M-x edebug-defun this form straight in the *Pp Macroexpand Output*
> buffer, and M-x edebug-defun the `electric-pair-test-for' defun,
> too.  Now run the test:

>    M-x ert RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings RET

Thanks for this recipe.  It wasn't self-evident.

> As you step through the code, you'll eventually land on that lambda
> which calls mark-sexp and hints, along with the name, that this is a
> region-autowrapping test.  This is why it expects, for a single character
> of input, that two quotes are inserted in the buffer instead of
> one.  The test passes in my 26.1 as you probably already knew.

> Good luck hunting the bug and let me know if you have more problems.

Thanks for the help.

> Thanks,
> João

> PS: you could also have used:

>    M-x ert-describe-test RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings

> Which would have rendered a nice docstring

>    electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
>    test defined in `electric-tests.elc'.

>    With |"foo"|, try input " at point 2. Should become |""foo""| and point at 3

>    [back]

> Though, admittedly, this is misleading for the "autowrapping" tests,
> since it doesn't tell you about the "mark-sexp" region-making
> command.  Also not immediately clear, perhaps, is that the | are buffer
> boundaries.

Just to be pedantic, it also doesn't say where " number 4 is inserted
(whether at EOL, or before " number 2), but that can be resolved by the
application of intelligence.  :-)

> Ideally, it should read

>    With |"foo"|, at point 2, (mark-sexp 1) and try input ".
>    Should become |""foo""| and point at 3

> I will try to fix this in master.

> Also M-x ert-find-test-other-window could have helped you, but it
> doesn't (brings me to the beginning of the file, which isn't helpful).  I
> don't know why, does anyone?

--
Alan Mackenzie (Nuremberg, Germany).

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: João Távora @ 2018-05-23 21:12 UTC
To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Alan Mackenzie <acm@muc.de> writes:

>>    M-x ert RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings RET
>
> Thanks for this recipe.  It wasn't self evident.

I know, I added a better docstring now.  I'd like to get rid of the
automatically generated function names so you could find that long
symbol in the file, but it seems difficult.  The best solution would be
a good find-definition that placed you somewhere where pp-macroexpand
would reveal it to you.

> Just to be pedantic, it also doesn't say where " number 4 is inserted
> (whether at EOL, or before " number 2), but that can be resolved by the
> application of intelligence.  :-)

You raise an existential point: If I put all your atoms together in the
same configuration but in a different order will I have a different
Alan? ;-)

João

* Re: [Emacs-diffs] master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.
From: Michael Welsh Duggan @ 2018-05-23 23:21 UTC
To: emacs-devel

joaotavora@gmail.com (João Távora) writes:

> PS: you could also have used:
>
>    M-x ert-describe-test RET electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings
>
> Which would have rendered a nice docstring
>
>    electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings is a
>    test defined in `electric-tests.elc'.
>
>    With |"foo"|, try input " at point 2. Should become |""foo""| and point at 3
>
>    [back]
>
> Though, admittedly, this is misleading for the "autowrapping" tests,
> since it doesn't tell you about the "mark-sexp" region-making
> command.  Also not immediately clear, perhaps, is that the | are buffer
> boundaries.
>
> Ideally, it should read
>
>    With |"foo"|, at point 2, (mark-sexp 1) and try input ".
>    Should become |""foo""| and point at 3
>
> I will try to fix this in master.
>
> Also M-x ert-find-test-other-window could have helped you, but it
> doesn't (brings me to the beginning of the file, which isn't helpful).  I
> don't know why, does anyone?

Could the macros that generate the ert-deftest call not set
`definition-name' in the generated function's symbol property list, so
that it leads back to the name element of `define-electric-pair-test'?
That might help `find-function-search-for-symbol' find the correct
location.

--
Michael Welsh Duggan
(md5i@md5i.com)

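The `definition-name' suggestion can be illustrated with plain symbol properties. This is a minimal sketch with hypothetical symbol names, not a patch to electric-tests.el: the generating macro would `put' the property on each generated test name, and a lookup would follow the chain back to the name written in the source file. As far as I can tell, `find-function-search-for-symbol' follows such a chain in a similar loop, but whether that is enough for `ert-find-test-other-window' to land on the `define-electric-pair-test' form is untested here.

```elisp
;; Hypothetical names: a generated test symbol pointing back at the
;; name that literally appears in the defining form.
(put 'my-generated-test-in-c++-mode 'definition-name 'my-base-test)

(defun my/resolve-definition-name (symbol)
  "Follow `definition-name' properties from SYMBOL to the defining name."
  (while (and (symbolp symbol) (get symbol 'definition-name))
    (setq symbol (get symbol 'definition-name)))
  symbol)

(my/resolve-definition-name 'my-generated-test-in-c++-mode) ; → my-base-test
```

A symbol without the property resolves to itself, so hand-written tests would be unaffected.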
* CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.)
From: Alan Mackenzie @ 2018-05-31 12:37 UTC
To: Tino Calancha, João Távora; +Cc: Emacs developers

Hello again, Tino and João.

On Tue, May 22, 2018 at 16:42:46 +0900, Tino Calancha wrote:
> Hi Alan,

> Since this commit one test
> (electric-pair-autowrapping-5-at-point-2-in-c++-mode-in-strings)
> in electric-tests.el is failing with error:
> (scan-error "Unbalanced parentheses" 6 1)

This should now have been fixed.

However, the test suite (make check) threw up another discrepancy, in a
test called
electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings.
That's with electric-pair mode enabled, and electric-pair-skip-whitespace
set to 'chomp.  The buffer, at the start of the test, looks something
like:

    " (

    ) "

.  With point just after the (, type a ).  The expected result is that
everything up to and including the existing ) gets "chomped", leaving
the buffer looking like:

    " () "

.  This no longer happens in C++ mode, and it is not clear that it
should.  In the original buffer, ( and ) are not in the same string,
since the opening string ends at EOL, there being no backslash to
continue it.

If there were escaped newlines in the buffer, I don't think the "chomp"
would work, because elec-pair.el doesn't recognise escaped newlines as
whitespace.

Comments?

--
Alan Mackenzie (Nuremberg, Germany).

* Re: CC Mode and electric-pair "problem".
From: João Távora @ 2018-05-31 16:07 UTC
To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha

Hi again, Alan

Alan Mackenzie <acm@muc.de> writes:

>     " (
>
>     ) "
>
> .  With point just after the (, type a ).  The expected result is that
> everything up to and including the existing ) gets "chomped", leaving
> the buffer looking like:
>
>     " () "
>
> .  This no longer happens in C++ mode, and it is not clear that it
> should.  In the original buffer, ( and ) are not in the same string,
> since the opening string ends at EOL, there being no backslash to
> continue it.
>
> If there were escaped newlines in the buffer, I don't think the "chomp"
> would work, because elec-pair.el doesn't recognise escaped newlines as
> whitespace.
>
> Comments?

I can reproduce this, even without turning "chomping" on: 26.1 skips to
the closing parens, master doesn't.

But it's tricky.  From elec-pair.el's perspective, skipping whitespace
means skipping whitespace characters *and* not crossing string/comment
boundaries.  To analyse a test case very similar to yours I wrote a
simple function (attached after my sig) to analyse just 5 characters and
an end-of-file.

   ( " \n " ) EOF

In Emacs 26.1 I get

   ((:character 34 :formatted "\"" :syntax (7) :depth 0 :string nil :last-open-parens nil)
    (:character 40 :formatted "(" :syntax (4 . 41) :depth 0 :string 34 :last-open-parens 1)
    (:character 10 :formatted "\n" :syntax (0) :depth 0 :string 34 :last-open-parens 1)
    (:character 41 :formatted ")" :syntax (5 . 40) :depth 0 :string 34 :last-open-parens 1)
    (:character 34 :formatted "\"" :syntax (7) :depth 0 :string 34 :last-open-parens 1)
    (:character nil :formatted "EOF" :syntax nil :depth 0 :string nil :last-open-parens nil))

In Emacs master, I get

   ((:character 34 :formatted "\"" :syntax (15) :depth 0 :string nil :last-open-parens nil)
    (:character 40 :formatted "(" :syntax (4 . 41) :depth 0 :string t :last-open-parens 1)
    (:character 10 :formatted "\n" :syntax (15) :depth 0 :string t :last-open-parens 1)
    (:character 41 :formatted ")" :syntax (5 . 40) :depth 0 :string nil :last-open-parens nil)
    (:character 34 :formatted "\"" :syntax (15) :depth -1 :string nil :last-open-parens nil)
    (:character nil :formatted "EOF" :syntax nil :depth -1 :string t :last-open-parens 5))

Note that the newline character changed its syntax from (0), which is
whitespace, to (15) which is generic string.  But more importantly, the
closing paren after it no longer declares to be inside a string
according to syntax-ppss.

Is this what you and (the majority of) cc-mode users expect?  If it is,
then this test (and probably many other ones) must be changed to reflect
that.

As a data-point, as an occasional c++-mode user, I'd much rather have
Emacs 26's behaviour.  When faced with such admittedly invalid C, I at
most expect M-x compile or Flymake to tell me about it, but would like
Emacs to treat it as whitespace so electric-pair keeps functioning
correctly.  That is, I expect Emacs to not choke my editing tools
because I've temporarily produced syntactically incorrect code while
editing, particularly tools designed to correct such situations.

I've also noted that whitespace-fixing tools aren't tripped by your
change.  But that's because they don't care about comment and string
boundaries, although they could/should.  This suggests we could make
elec-pair.el also not care about them in c++ mode, but it would only
take us so far, because I fear worse problems would come in more basic
elec-pair.el functionality.

In general, I think you should review the recent c++-mode changes.  To
illustrate, here's a new bug report without any newlines.

  1. emacs-master/src/emacs -Q
  2. M-x erase-buffer RET !
  3. M-x c++-mode
  4. M-x electric-pair-mode
  5. insert a double quote (this inserts a closer)
  6. insert an opening parens (this inserts a closer)
  7. insert a double quote (this inserts a closer, but...)

... it additionally pops up an error:

   c-append-to-state-cache: Scan error: "Unbalanced parentheses", 5, 1

The last quote becomes red.  If I erase the buffer again and do the whole
thing again, no error happens and no red quote, which is what I expect
it to do (and Emacs 26 behaviour).

Actually, electric-pair-mode doesn't even need to be on:

  1. emacs-master/src/emacs -Q
  2. M-x erase-buffer RET !
  3. M-x c++-mode
  5a. insert a double quote
  5b. insert the closer quote
  5c. go back one char
  6a. insert an opening parens
  6b. insert the closer, go back one char
  7a. insert a double quote
  7b. try to insert the closer quote

You get the same c-append-to-state-cache error

João

   (defun joaot/analyse (&optional int)
     (interactive "p")
     (cl-loop for p = (point-min) then (1+ p)
              while (<= p (point-max))
              for (depth _ _ string comment _ _ _ open-parens _) = (syntax-ppss p)
              for char = (char-after p)
              collect (list :character char
                            :formatted (if char (format "%c" char) "EOF")
                            :syntax (syntax-after p)
                            :depth depth
                            :string string
                            :last-open-parens open-parens)
              into retval
              finally
              (when int (message "%s" (pp-to-string retval)))
              (cl-return retval)))

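The "not crossing string boundaries" condition discussed above can be expressed directly in terms of `syntax-ppss'. The following is a minimal sketch with a hypothetical helper name; it is not elec-pair.el's actual code, only an illustration of the kind of check whose answer differs between the two outputs listed above (the closing paren reports `:string 34' in 26.1 but `:string nil' on master).

```elisp
;; Hypothetical helper, not taken from elec-pair.el.
(defun my/same-string-p (pos1 pos2)
  "Return non-nil if POS1 and POS2 lie inside the same string.
The decision is based purely on `syntax-ppss', so it reflects whatever
syntax-table text properties the major mode has applied."
  (let ((s1 (save-excursion (syntax-ppss pos1)))
        (s2 (save-excursion (syntax-ppss pos2))))
    (and (nth 3 s1)                      ; POS1 is inside a string
         (nth 3 s2)                      ; POS2 is inside a string
         (eql (nth 8 s1) (nth 8 s2)))))  ; both strings start at the same place
```

In a buffer containing the five-character case from this message, calling it on the positions of the ( and the ) gives one boolean per Emacs build to compare, instead of a full per-character dump.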
* Re: CC Mode and electric-pair "problem". 2018-05-31 16:07 ` CC Mode and electric-pair "problem" João Távora @ 2018-05-31 17:28 ` Alan Mackenzie 2018-05-31 18:37 ` João Távora 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-05-31 17:28 UTC (permalink / raw) To: João Távora; +Cc: Emacs developers, Tino Calancha Hello, João On Thu, May 31, 2018 at 17:07:43 +0100, João Távora wrote: > Hi again, Alan > Alan Mackenzie <acm@muc.de> writes: > > " ( > > > > ) " > > > > . With point just after the (, type a ). The expected result is that > > everything up to and including the existing ) gets "chomped", leaving > > the buffer looking like: > > > > " () " > > > > . This no longer happens in C++ mode, and it is not clear that it > > should. In the original buffer, ( and ) are not in the same string, > > since the opening string ends at EOL, there being no backslash to > > continue it. > > > > If there were escaped newlines in the buffer, I don't think the "chomp" > > would work, because elec-pair.el doesn't recognise escaped newlines as > > whitespace. > > > > Comments? > I can reproduce this, even without turning "chomping" on: 26.1 skips to > the closing parens, master doesn't. > But it's tricky. From elec-pair.el's perspective, skipping whitespace > means skipping whitespace characters *and* not crossing string/comment > boundaries. To analyse a test case very similar to yours I wrote a > simple function (attached after my sig) to analyse just 5 characters and > an end-of-file. > ( " \n " ) EOF I think you mean " ( \n ) " EOF. :-) > In Emacs 26.1 I get > ((:character 34 :formatted "\"" :syntax > (7) > :depth 0 :string nil :last-open-parens nil) > (:character 40 :formatted "(" :syntax > (4 . 41) > :depth 0 :string 34 :last-open-parens 1) > (:character 10 :formatted "\n" :syntax > (0) > :depth 0 :string 34 :last-open-parens 1) > (:character 41 :formatted ")" :syntax > (5 . 
40) > :depth 0 :string 34 :last-open-parens 1) > (:character 34 :formatted "\"" :syntax > (7) > :depth 0 :string 34 :last-open-parens 1) > (:character nil :formatted "EOF" :syntax nil :depth 0 :string nil > :last-open-parens nil)) > In Emacs master, I get > ((:character 34 :formatted "\"" :syntax > (15) > :depth 0 :string nil :last-open-parens nil) > (:character 40 :formatted "(" :syntax > (4 . 41) > :depth 0 :string t :last-open-parens 1) > (:character 10 :formatted "\n" :syntax > (15) > :depth 0 :string t :last-open-parens 1) > (:character 41 :formatted ")" :syntax > (5 . 40) > :depth 0 :string nil :last-open-parens nil) > (:character 34 :formatted "\"" :syntax > (15) > :depth -1 :string nil :last-open-parens nil) > (:character nil :formatted "EOF" :syntax nil :depth -1 :string t > :last-open-parens 5)) > Note that the newline character changed its syntax from (0), which is > whitespace, to (15) which is generic string. But more importantly, the > closing paren after it no longer declares to be inside a string > according to syntax-ppss. > Is this what you and (the majority of) cc-mode users expect? If it is, > then this test (and probably many other ones) must be changed to reflect > that. Yes. A string in C(++) mode extending over several lines is only valid when the newlines are escaped. The generic string syntax is partly an artifice to get font-lock-warning-face, but is also deliberately intended to cut the opener of the invalid string off from any subsequent double quote. > As a data-point, as an occasional c++- mode user, I'd much rather have > Emacs 26's behaviour. When faced with such admittedly invalid C, I at > most expect M-x compile or Flymake to tell me about it, but would like > Emacs to treat it as whitespace so electric-pair keeps functioning > correctly. That is, I expect Emacs to not choke my editing tools > because I've temporarily produced syntactically incorrect code while > editing, particularly tools designed to correct such situations. 
OK. I'll need to mull this over. > I've also noted that whitespace-fixing tools aren't tripped by your > change. But that's because they don't care about comment and string > boundaries, although they could/should. This suggests we could make > elec-pair.el also not care about them in c++ mode, but it would only > take us so far, because I fear worse problems would come in more basic > elec-pair.el functionality. > In general, I think you should review the recent c++-mode changes. To > illustrate, here's a new bug report without any newlines. > 1. emacs-master/src/emacs -Q > 2. M-x erase-buffer RET ! > 3. M-x c++-mode > 4. M-x electric-pair-mode > 5. insert a double quote (this inserts a closer) > 6. insert an opening parens (this inserts a closer) > 7. insert a double quote (this inserts a closer, but...) > ... it additionally pops up an error: > c-append-to-state-cache: Scan error: "Unbalanced parentheses", 5, 1 I don't see this at all. For me, that sequence of actions simply works, without signalling an error. This was on the master branch as I committed my change today. > The last quote becomes red. If I erase the buffer again and do the whole > thing again, no error happens and no red quote, which is what I expect > it to do (and Emacs 26 behaviour). > Actually, electric-pair-mode doesn't even need to be on: > 1. emacs-master/src/emacs -Q > 2. M-x erase-buffer RET ! > 3. M-x c++-mode > 5a. insert a double quote > 5b. insert the closer quote > 5c. go back one char > 6a. insert an opening parens > 6b. insert the closer, go back one char > 7a. insert a double quote > 7b. try to insert the closer quote > You get the same c-append-to-state-cache error I don't see this either. And we both started with -Q, so it's not something in .emacs. Are you sure you've downloaded and built that latest patch of mine? > João [ .... ] -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
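For readers who want to reproduce the per-character dumps quoted in the message above: the helper João mentions was attached to his mail and is not part of this excerpt. A minimal sketch of such a helper, assuming it simply walks the buffer and records `syntax-after` plus a few `syntax-ppss` fields at each position, might look like the following. The function name `my/analyse-syntax` and the `:string-start` key are hypothetical and not taken from the original attachment.

```elisp
;; Hypothetical helper, NOT the original attachment: it records, for each
;; position in the current buffer, the character there and the parser state
;; just before that character.
(defun my/analyse-syntax ()
  "Return a list of plists, one per character, plus a final EOF entry."
  (save-excursion
    (goto-char (point-min))
    (let (result done)
      (while (not done)
        (let ((ppss (syntax-ppss))
              (char (char-after)))
          (push (list :character char
                      :formatted (if char (string char) "EOF")
                      :syntax (syntax-after (point)) ; raw descriptor, nil at EOF
                      :depth (nth 0 ppss)            ; paren depth
                      :string (nth 3 ppss)           ; non-nil inside a string
                      :string-start (nth 8 ppss))    ; start of string/comment
                result)
          (if char (forward-char 1) (setq done t))))
      (nreverse result))))
```

Evaluating `(my/analyse-syntax)` in a c++-mode buffer that contains only the five characters `"`, `(`, newline, `)`, `"` should produce a list comparable in shape to the two dumps above, though the exact values depend on the Emacs and CC Mode version in use.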
* Re: CC Mode and electric-pair "problem". 2018-05-31 17:28 ` Alan Mackenzie @ 2018-05-31 18:37 ` João Távora 2018-06-02 13:02 ` Alan Mackenzie 0 siblings, 1 reply; 93+ messages in thread From: João Távora @ 2018-05-31 18:37 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha Alan Mackenzie <acm@muc.de> writes: > Hello, João > > On Thu, May 31, 2018 at 17:07:43 +0100, João Távora wrote: >> ( " \n " ) EOF > > I think you mean " ( \n ) " EOF. :-) Right :) > Yes. A string in C(++) mode extending over several lines is only valid > when the newlines are escaped. The generic string syntax is partly an > artifice to get font-lock-warning-face, but is also deliberately > intended to cut the opener of the invalid string off from any subsequent > double quote. But is there another goal here, apart from the goal of visually annotating the error? If the intent is only to annotate the error visually, I'd rather leave that to something like Flymake. Admittedly, it's not practical now, since Flymake usually works by running the whole buffer through an external syntax check tool, which may take ages compared to using syntax hints from within Emacs. But that could be changed: my goal is to let Flymake call backends with only recently changed parts of the buffer, and a much faster syntax-checking backend could be devised. Which reminds me, I never did get an answer to https://lists.gnu.org/archive/html/emacs-devel/2017-10/msg00448.html did I? > OK. I'll need to mull this over. OK, do. If you come to the conclusion that it is very important, and when the code becomes stable, I can increase the complexity of elec-pair.el a bit to make it work in c++-mode. BTW do all cc-based modes "forbid" multi-line strings? >> c-append-to-state-cache: Scan error: "Unbalanced parentheses", 5, 1 > > I don't see this at all. For me, that sequence of actions simply works, > without signalling an error. This was on the master branch as I > committed my change today.
Things move fast :-) I was running a master without your commit from around noon. I can't reproduce it now either, good job. João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-05-31 18:37 ` João Távora @ 2018-06-02 13:02 ` Alan Mackenzie 2018-06-03 3:00 ` João Távora 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-06-02 13:02 UTC (permalink / raw) To: João Távora; +Cc: Emacs developers, Tino Calancha Hello, João. On Thu, May 31, 2018 at 19:37:22 +0100, João Távora wrote: > Alan Mackenzie <acm@muc.de> writes: > > Hello, João > > > > On Thu, May 31, 2018 at 17:07:43 +0100, João Távora wrote: [ .... ] > > Yes. A string in C(++) mode extending over several lines is only > > valid when the newlines are escaped. The generic string syntax is > > partly an artifice to get font-lock-warning-face, but is also > > deliberately intended to cut the opener of the invalid string off > > from any subsequent double quote. > But is there another goal here, apart from the goal of visually > annotating the error? Well, "error" might be putting it a bit strongly. Mainly, it's a reminder to somebody typing in a string to close it off properly. > If the intent is only to annotate the error visually, I'd rather leave > that to something like Flymake. Admiteddly, it's not practical now, > since Flymake usually works by running the whole buffer through an > external syntax check tool, which may take ages compared to using syntax > hints from within emacs. > But that could be changed, my goal is to let Flymake call backends with > only recently changed parts of the buffer, and a much faster > syntax-checking backend could be devised. All I can reply with, at the moment, is ... Hmmmm. :-) > Which reminds me, I never did get an answer to > https://lists.gnu.org/archive/html/emacs-devel/2017-10/msg00448.html > did I? No, but you've got one now. :-) > > OK. I'll need to mull this over. > OK, do. If you come to the conclusion that it is very important, and > when the code becomes stable, I can can increase the complexity of > elec-pair.el a bit to make it work in c++-mode. 
I think the increase in complexity would be quite small, and very local. > BTW do all cc-based modes "forbid" multi-line strings? No. Pike Mode has a special feature whereby a string starting with #" is a multiline string. I think in D Mode (not maintained here), strings simply are multiline, and there is no such thing as an escaped EOL. The writer of the mode sets the CC Mode "language variable" c-multiline-string-start-char to the character # for Pike Mode, or some non-character non-nil value for D Mode (usually t, of course). [ .... ] > João -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-02 13:02 ` Alan Mackenzie @ 2018-06-03 3:00 ` João Távora 0 siblings, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-03 3:00 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Emacs developers, Tino Calancha Hi Alan, Alan Mackenzie <acm@muc.de> writes: > Well, "error" might be putting it a bit strongly. Mainly, it's a > reminder to somebody typing in a string to close it off properly. Yeah, precisely. Hence it would be ideal to annotate it, but not break whitespace autoskip over it. >> only recently changed parts of the buffer, and a much faster >> syntax-checking backend could be devised. > > All I can reply with, at the moment, is ... Hmmmm. :-) That's a more than reasonable reply to my vaporware. I'll let you know when I have something more concrete. >> when the code becomes stable, I can can increase the complexity of >> elec-pair.el a bit to make it work in c++-mode. > I think the increase in complexity would be quite small, and very > local. That remains to be seen... Nevertheless, if we're talking about this one test, my offer stands: I can add a variable to control this (which c-mode will have to set since it's not going to be the default). >> BTW do all cc-based modes "forbid" multi-line strings? > > No. Pike Mode has a special feature whereby a string starting with #" is So Pike Mode keeps the whitespace skip, right? João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-05-31 12:37 ` CC Mode and electric-pair "problem". (Was: ... master bb591f139f: Enhance CC Mode's fontification, etc., of unterminated strings.) Alan Mackenzie 2018-05-31 16:07 ` CC Mode and electric-pair "problem" João Távora @ 2018-06-17 16:58 ` Glenn Morris 2018-06-17 20:13 ` Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: Glenn Morris @ 2018-06-17 16:58 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Emacs developers, João Távora, Tino Calancha Alan Mackenzie wrote: > However, the test suite (make check) threw up another discrepancy, in a > test called > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings. Hello, is this still being worked on? The test continues to fail on RHEL 7 and hydra.nixos.org. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-17 16:58 ` Glenn Morris @ 2018-06-17 20:13 ` Alan Mackenzie 2018-06-17 21:07 ` Stefan Monnier 2018-06-17 21:27 ` João Távora 0 siblings, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-17 20:13 UTC (permalink / raw) To: Glenn Morris; +Cc: Emacs developers, João Távora, Tino Calancha Hello, Glenn. On Sun, Jun 17, 2018 at 12:58:53 -0400, Glenn Morris wrote: > Alan Mackenzie wrote: > > However, the test suite (make check) threw up another discrepancy, in a > > test called > > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings. > Hello, is this still being worked on? > The test continues to fail on RHEL 7 and hydra.nixos.org. From my point of view, the bug is not being worked on this very day, but has by no means been forgotten. It has needed a period of mulling over. I think João sees it the same way. Although it won't be difficult to fix, this bug is an awkward thing, and will need decisions (smallish ones) to be taken. My favoured method would be to alter electric-pair--skip-whitespace such that a NL terminating a string (as contrasted with a NL terminating a comment) would be allowed to be scanned over. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
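To make the proposal in the message above concrete, here is a rough sketch of a whitespace skip that normally refuses to cross string or comment boundaries, but can be told to leave a string that ends at a newline. It is not the code in elec-pair.el; the names `my/skip-whitespace` and `my/skip-cross-string-newline` are hypothetical, and the hard-coded whitespace set is an assumption made only to keep the example self-contained.

```elisp
;; Hypothetical names throughout; this is not the code in elec-pair.el.
(defvar my/skip-cross-string-newline nil
  "When non-nil, a whitespace skip may leave a string that ends at a newline.")

(defun my/skip-whitespace ()
  "Skip spaces, tabs and newlines forward from point.
Return to the starting position if the skip crossed a string or
comment boundary, unless `my/skip-cross-string-newline' permits
leaving a string."
  (let* ((saved (point))
         (state (syntax-ppss))
         (in-string (nth 3 state))        ; non-nil when starting in a string
         (start-context (nth 8 state)))   ; start of enclosing string/comment
    (skip-chars-forward " \t\n")
    (let ((end-context (nth 8 (syntax-ppss))))
      (unless (or (eq start-context end-context)
                  ;; Optional relaxation: started inside a string, ended in
                  ;; plain code (neither string nor comment).
                  (and my/skip-cross-string-newline
                       in-string
                       (null end-context)))
        (goto-char saved)))))
```

A major mode that treats an unescaped newline as the end of a string could then bind the switch buffer-locally, which is one way to read the "customization point" idea discussed in this thread. Because the relaxation only applies when the skip starts inside a string, a newline ending a comment still blocks the skip.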
* Re: CC Mode and electric-pair "problem". 2018-06-17 20:13 ` Alan Mackenzie @ 2018-06-17 21:07 ` Stefan Monnier 2018-06-17 21:27 ` João Távora 1 sibling, 0 replies; 93+ messages in thread From: Stefan Monnier @ 2018-06-17 21:07 UTC (permalink / raw) To: emacs-devel > My favoured method would be to alter electric-pair--skip-whitespace such > that a NL terminating a string (as contrasted with a NL terminating a > comment) would be allowed to be scanned over. AFAIK no language currently offers "NL terminating strings". So, we should indeed behave as if this NL doesn't terminate the string (IIUC the problem is that CC-mode marks NL-inside-string as if it terminates a string, but that's just an internal detail which shouldn't have such visible side-effects. Personally I'd vote to just not treat NL-inside-string in such a special way: it's a lot of trouble on the implementation side for very little benefit to the end user since the way strings are font-locked makes it trivially obvious to the user what's going on). Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-17 20:13 ` Alan Mackenzie 2018-06-17 21:07 ` Stefan Monnier @ 2018-06-17 21:27 ` João Távora 2018-06-18 10:36 ` Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: João Távora @ 2018-06-17 21:27 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha [-- Attachment #1: Type: text/plain, Size: 2079 bytes --] On Sun, Jun 17, 2018 at 9:13 PM, Alan Mackenzie <acm@muc.de> wrote: > Hello, Glenn. > > On Sun, Jun 17, 2018 at 12:58:53 -0400, Glenn Morris wrote: > > Alan Mackenzie wrote: > > > > However, the test suite (make check) threw up another discrepancy, in a > > > test called > > > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings. > > > Hello, is this still being worked on? > > The test continues to fail on RHEL 7 and hydra.nixos.org. > > From my point of view, the bug is not being worked on this very day, but > has by no means been forgotten. It has needed a period of mulling over. > I think João sees it the same way. > Yes, while mulling over things is generally good, I believe the problem from Glenn's perspective is the nuisance of checking whether every test failure is something to worry about or just the thing being mulled over. So I suggest taking a quick temporary action to make the test pass and then think about how to do it properly. This action could be disabling the test temporarily but IME that invariably buries the issue ad eternum. So it's better to do it in cc-mode. Although it won't be difficult to fix, this bug is an awkward thing, and > will need decisions (smallish ones) to be taken. > > My favoured method would be to alter electric-pair--skip-whitespace such > that a NL terminating a string (as contrasted with a NL terminating a > comment) would be allowed to be scanned over. > I'm OK with adding an customization point to electric-pair--skip-whitespace that c-mode can customize. 
But I also wonder whether the benefit to end-users of handling NL-terminated strings is worth it. Perhaps there are indeed benefits, it's just that I haven't seen them argued. But more importantly perhaps there are ways to reap these benefits in a way that doesn't require changes to e-p-m, or even better, in a way that benefits all of Emacs, not just c-mode. So, in practice, is the advantage here that the user is visually warned of an invalid NL-terminated string? João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-17 21:27 ` João Távora @ 2018-06-18 10:36 ` Alan Mackenzie 2018-06-18 13:24 ` João Távora 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-06-18 10:36 UTC (permalink / raw) To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha Hello, João. On Sun, Jun 17, 2018 at 22:27:20 +0100, João Távora wrote: > On Sun, Jun 17, 2018 at 9:13 PM, Alan Mackenzie <acm@muc.de> wrote: > > Hello, Glenn. > > On Sun, Jun 17, 2018 at 12:58:53 -0400, Glenn Morris wrote: > > > Alan Mackenzie wrote: > > > > However, the test suite (make check) threw up another discrepancy, in a > > > > test called > > > > electric-pair-whitespace-chomping-2-at-point-4-in-c++-mode-in-strings. > > > Hello, is this still being worked on? > > > The test continues to fail on RHEL 7 and hydra.nixos.org. > > From my point of view, the bug is not being worked on this very day, but > > has by no means been forgotten. It has needed a period of mulling over. > > I think João sees it the same way. > Yes, while mulling over things is generally good, I believe the problem > from Glenn's perspective is the nuisance of checking whether every > test failure is something to worry about or just the thing being > mulled over. Yes. But it is the master branch, where not everything can be expected to work all the time. I think the main thing is, we're _going_ to fix this bug. > So I suggest taking a quick temporary action to make the test pass > and then think about how to do it properly. This action could be > disabling the test temporarily but IME that invariably buries the > issue ad eternum. So it's better to do it in cc-mode. Hmm. To modify CC Mode temporarily to make 'chomp in electric-pair-mode work would be an order of magnitude more work than "simply" to fix the bug. That's without disabling the handling of " in CC Mode entirely. 
> Although it won't be difficult to fix, this bug is an awkward thing, and > > will need decisions (smallish ones) to be taken. > > My favoured method would be to alter electric-pair--skip-whitespace such > > that a NL terminating a string (as contrasted with a NL terminating a > > comment) would be allowed to be scanned over. > I'm OK with adding an customization point to > electric-pair--skip-whitespace that c-mode can customize. But I also > wonder whether the benefit to end-users of handling NL-terminated > strings are worth it. Perhaps there are indeed benefits, it's just that > I haven't seen them argued. OK, here goes. Why should major modes tie themselves in knots, just so that electric-pair-mode can work? What CC Mode is doing is natural, and matches the reality. A C(++) compiler regards an unterminated string as ending at the (first unescaped) linefeed. It will then regard the next line as code (not string). If there is a subsequent ", the compiler won't see that as a terminator for the unbalanced opening ". CC Mode now matches this reality, which is a Good Thing. electric-pair-mode's chomp facility could be more rigorously coded - sometimes it is dealing with visible whitespace, sometimes it is dealing with syntactic properties. Surely it should be working with visible whitespace all the time? I've attempted a bit of debugging. In addition to electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction ended-prematurely-fn of function electric-pair--balance-info, which snagged on the end of string at EOL. > But more importantly perhaps there are ways to reap these benefits in a > way that doesn't require changes to e-p-m, or even better, in a way > that benefits all of Emacs, not just c-mode. We are talking about a corner case in e-p-m, namely where e-p-m attempts to chomp space between parens inside an invalid string. This surely won't come up in practice very much. Is it worth fixing? (I would say yes.) 
> So, in practice, is the advantage here that the user is visually > warned of an invalid NL-terminated string? The user is visually informed of the reality: that one or more strings are unterminated, and where the "breakage" is (where the font-lock-string-face stops). This is an improvement over the previous handling, where the opening invalid " merely got warning-face, but the following unterminated string flowed on indefinitely. The disadvantage is that e-p-m is constraining major modes in how they can use syntax-table text properties. I think this is a problem in electric-pair-mode, not in CC Mode. > João -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 10:36 ` Alan Mackenzie @ 2018-06-18 13:24 ` João Távora 2018-06-18 15:18 ` Eli Zaretskii 2018-06-18 15:42 ` Alan Mackenzie 0 siblings, 2 replies; 93+ messages in thread From: João Távora @ 2018-06-18 13:24 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha Alan Mackenzie <acm@muc.de> writes: > Hello, João. >> Yes, while mulling over things is generally good, I believe the problem >> from Glenn's perspective is the nuisance of checking whether every >> test failure is something to worry about or just the thing being >> mulled over. > Yes. But it is the master branch, where not everything can be expected > to work all the time. I think the main thing is, we're _going_ to fix > this bug. Well, I respectfully and totally disagree. The reason we have automated tests in Hydra is to catch unintentional breakage, not intentional breakage. And, IIUC that test is the only one preventing a successful "make check". For development temporarily unhampered by tests, I think a separate branch is a much better alternative. It's a very easy thing to do in git (and in your case, trivial to merge from and back to master, given you have near-total control over that area of code). > Hmm. To modify CC Mode temporarily to make 'chomp in electric-pair-mode > work would be an order of magnitude more work than "simply" to fix the > bug. That's without disabling the handling of " in CC Mode entirely. How so? Can't you just revert the commit that broke it? >> strings are worth it. Perhaps there are indeed benefits, it's just that >> I haven't seen them argued. > OK, here goes. Why should major modes tie themselves in knots, just so > that electric-pair-mode can work? What CC Mode is doing is natural, and > matches the reality. I think you mean "mode", in the singular form :-). 
Also, it doesn't "match reality": if you open a line in a string, it syntax highlights the remaining string as C statements, but the C parser doesn't see C statements. IOW, newline doesn't *really* terminate a string in C. > electric-pair-mode's chomp facility could be more rigorously coded - > sometimes it is dealing with visible whitespace, sometimes it is dealing > with syntactic properties. Surely it should be working with visible > whitespace all the time? No. If it did so, it would chomp parenthesis from non-comment regions into comment regions, for example. That doesn't make sense, not according to show-paren-mode, for example. By the way, after your change, very basic commands which fall completely outside electric-pair-mode have fundamentally changed their behaviour in cc-mode. Here are a few, out of Emacs -Q: * Open a line in a string, using C-o. Sexp-navigation is now messed up in the whole buffer, i.e. C-M-*. Most commands error or produce surprising results. This happens even if the intent is to eventually add a backslash escaping the newline, or make it two adjacent strings by typing two quotes (something perfectly allowed by C). * Inside the string, `forward-sexp' in a parenthesis of a NL-terminated string now errors where it would previously do its job of jumping to the closer; * Also inside the string, `blink-matching-paren', on by default, also doesn't work as before: closing a paren on a NL-started string doesn't match the opener. There are no automated tests for these things, otherwise you could be seeing test breakage here too (and, with higher probability, you may be seeing breakage in users' expectations later on). > I've attempted a bit of debugging. In addition to > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction > ended-prematurely-fn of function electric-pair--balance-info, which > snagged on the end of string at EOL.
I don't understand how this matters to the problem at hand, but regardless, can you make a bug report demonstrating the presumed bug and its impact so I can follow up? > We are talking about a corner case in e-p-m, namely where e-p-m attempts > to chomp space between parens inside an invalid string. This surely > won't come up in practice very much. Is it worth fixing? (I would say > yes.) Don't forget that the particular piece of e-p-m we're talking about is one of the ways (arguably the easiest way) to actually fix the specific C/C++ problem at hand for the user. IOW it's not some random whimsical useless thing. >> So, in practice, is the advantage here that the user is visually >> warned of an invalid NL-terminated string? > The user is visually informed of the reality: that one or more strings > are unterminated, and where the "breakage" is (where the > font-lock-string-face stops). This is an improvement over the previous > handling, where the opening invalid " merely got warning-face, but the > following unterminated string flowed on indefinitely. I suppose that's a "yes". In that case, the face `warning`, which defaults to a very bright red, would be fine for me personally (and I'm confident it could be made even more evident). Also, the fact that the remaining string is now syntax-highlighted as C statements is extremely confusing. > The disadvantage is that e-p-m is constraining major modes in how they > can use syntax-table text properties. I think this is a problem in > electric-pair-mode, not in CC Mode. Again, AFAIK, "mode", singular. And, obviously, I'm not going to special-case cc-mode in elec-pair.el: after doing some of my own mulling, I may open a customization point for cc-mode.el to use. So at the very least, it's going to require some (potentially trivial) fix in cc-mode.el, for sure.
But now that I've understood the non-e-p-m implications of your change, I urge you to at least make this configurable (if it is already configurable, then don't mind me). João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 13:24 ` João Távora @ 2018-06-18 15:18 ` Eli Zaretskii 2018-06-18 15:37 ` João Távora 2018-06-18 15:42 ` Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-06-18 15:18 UTC (permalink / raw) To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm > From: João Távora <joaotavora@gmail.com> > Date: Mon, 18 Jun 2018 14:24:39 +0100 > Cc: Glenn Morris <rgm@gnu.org>, Emacs developers <emacs-devel@gnu.org>, > Tino Calancha <tino.calancha@gmail.com> > > > Yes. But it is the master branch, where not everything can be expected > > to work all the time. I think the main thing is, we're _going_ to fix > > this bug. > > Well, I respectfully and totally disagree. The reason we have automated > tests in Hydra is to catch unintentional breakage, not intentional > breakage. And, IIUC that test is the only one preventing a successful > "make check". Isn't there a way to mark a test as expected to fail? ^ permalink raw reply [flat|nested] 93+ messages in thread
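On Eli's question: ERT does offer this, through the `:expected-result` keyword of `ert-deftest`. Below is a minimal sketch with a hypothetical test name and a deliberately failing body. How the generated tests in electric-tests.el would actually be marked is not shown here and is left as an assumption.

```elisp
(require 'ert)

;; Hypothetical test: only the `:expected-result' keyword is the point here.
;; With `:expected-result :failed', ERT reports a failure of this test as
;; expected, and reports a pass as unexpected.
(ert-deftest my/known-broken-example ()
  "A stand-in for a test that is known to fail for the time being."
  :expected-result :failed
  (should (= (+ 1 1) 3)))
```

The trade-off raised in this thread still applies: an expected failure keeps the test suite green, and ERT will flag the test once it unexpectedly starts to pass, but nothing forces anyone to revisit it in the meantime.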
* Re: CC Mode and electric-pair "problem". 2018-06-18 15:18 ` Eli Zaretskii @ 2018-06-18 15:37 ` João Távora 2018-06-18 16:46 ` Eli Zaretskii 2018-06-18 20:24 ` Glenn Morris 0 siblings, 2 replies; 93+ messages in thread From: João Távora @ 2018-06-18 15:37 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Alan Mackenzie, emacs-devel, tino.calancha, Glenn Morris [-- Attachment #1: Type: text/plain, Size: 1178 bytes --] Yes, and I'll probably do that. But in my experience, this has a very high probability of burying the problem, i.e. the incentive for actually fixing the problem is reduced dramatically. It's better to do test-breaking things on separate branches when possible. IMO expected failures are for when a feature is being designed and still incomplete, not when it was already working. João On Mon, Jun 18, 2018, 16:18 Eli Zaretskii <eliz@gnu.org> wrote: > > From: João Távora <joaotavora@gmail.com> > > Date: Mon, 18 Jun 2018 14:24:39 +0100 > > Cc: Glenn Morris <rgm@gnu.org>, Emacs developers <emacs-devel@gnu.org>, > > Tino Calancha <tino.calancha@gmail.com> > > > > > Yes. But it is the master branch, where not everything can be expected > > > to work all the time. I think the main thing is, we're _going_ to fix > > > this bug. > > > > Well, I respectfully and totally disagree. The reason we have automated > > tests in Hydra is to catch unintentional breakage, not intentional > > breakage. And, IIUC that test is the only one preventing a successful > > "make check". > > Isn't there a way to mark a test as expected to fail? > [-- Attachment #2: Type: text/html, Size: 1902 bytes --] ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 15:37 ` João Távora @ 2018-06-18 16:46 ` Eli Zaretskii 2018-06-18 17:21 ` Eli Zaretskii 2018-06-18 23:49 ` João Távora 2018-06-18 20:24 ` Glenn Morris 1 sibling, 2 replies; 93+ messages in thread From: Eli Zaretskii @ 2018-06-18 16:46 UTC (permalink / raw) To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm > From: João Távora <joaotavora@gmail.com> > Date: Mon, 18 Jun 2018 16:37:33 +0100 > Cc: Alan Mackenzie <acm@muc.de>, Glenn Morris <rgm@gnu.org>, emacs-devel@gnu.org, > tino.calancha@gmail.com > > Yes, and I'll probably do that. But in my experience, this has a very high probability of burying the problem, i.e. > the incentive for actually fixing the problem is reduced dramatically. But putting the problematic code on a branch reduces the incentive even more, doesn't it? At least with the expected failure, you will see when it unexpectedly starts to succeed; on a branch, the code is easily forgotten forever... Of course, it's better to fix breakage fats, but we all have our lives. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 16:46 ` Eli Zaretskii @ 2018-06-18 17:21 ` Eli Zaretskii 2018-06-18 23:49 ` João Távora 1 sibling, 0 replies; 93+ messages in thread From: Eli Zaretskii @ 2018-06-18 17:21 UTC (permalink / raw) To: joaotavora; +Cc: acm, tino.calancha, rgm, emacs-devel > Date: Mon, 18 Jun 2018 19:46:15 +0300 > From: Eli Zaretskii <eliz@gnu.org> > Cc: acm@muc.de, emacs-devel@gnu.org, tino.calancha@gmail.com, rgm@gnu.org > > Of course, it's better to fix breakage fats, but we all have our lives. ^^^^ I meant "fast", of course... ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 16:46 ` Eli Zaretskii 2018-06-18 17:21 ` Eli Zaretskii @ 2018-06-18 23:49 ` João Távora 2018-06-19 2:37 ` Eli Zaretskii 1 sibling, 1 reply; 93+ messages in thread From: João Távora @ 2018-06-18 23:49 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, emacs-devel, tino.calancha, rgm Eli Zaretskii <eliz@gnu.org> writes: >> From: João Távora <joaotavora@gmail.com> >> Date: Mon, 18 Jun 2018 16:37:33 +0100 >> Cc: Alan Mackenzie <acm@muc.de>, Glenn Morris <rgm@gnu.org>, emacs-devel@gnu.org, >> tino.calancha@gmail.com >> >> Yes, and I'll probably do that. But in my experience, this has a very high probability of burying the problem, i.e. >> the incentive for actually fixing the problem is reduced dramatically. > > But putting the problematic code on a branch reduces the incentive > even more, doesn't it? I don't follow. I would answer "no", assuming the person developing the temporarily misbehaving code is motivated to do it in the first place. Develop and break things at will in a branch, merge them to master when they're clean. No? João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 23:49 ` João Távora @ 2018-06-19 2:37 ` Eli Zaretskii 2018-06-19 8:13 ` João Távora 0 siblings, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-06-19 2:37 UTC (permalink / raw) To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm > From: João Távora <joaotavora@gmail.com> > Cc: acm@muc.de, rgm@gnu.org, emacs-devel@gnu.org, tino.calancha@gmail.com > Date: Tue, 19 Jun 2018 00:49:17 +0100 > > > But putting the problematic code on a branch reduces the incentive > > even more, doesn't it? > > I don't follow. Code on a branch gets less testing by others, and therefore less reminders about the failing test. > I would answer "no", assuming the person developing the > temporarily misbehaving code is motivated to do it in the first place. > Develop and break things at will in a branch, merge them to master when > they're clean. No? If the code is used, its breakage on a branch hurts like it does on master. If it's unused, then what is it doing in the repository? ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-19 2:37 ` Eli Zaretskii @ 2018-06-19 8:13 ` João Távora 2018-06-19 16:59 ` Eli Zaretskii 0 siblings, 1 reply; 93+ messages in thread From: João Távora @ 2018-06-19 8:13 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, emacs-devel, tino.calancha, rgm Eli Zaretskii <eliz@gnu.org> writes: >> From: João Távora <joaotavora@gmail.com> >> Cc: acm@muc.de, rgm@gnu.org, emacs-devel@gnu.org, tino.calancha@gmail.com >> Date: Tue, 19 Jun 2018 00:49:17 +0100 >> >> > But putting the problematic code on a branch reduces the incentive >> > even more, doesn't it? >> >> I don't follow. > > Code on a branch gets less testing by others, and therefore less > reminders about the failing test. But surely, the programmer who broke the test, who is the person technically (and morally) best suited to fix the problem, has all the original incentive to merge his work. For me this is very clear: only merge if there are 0 failing tests (or rather, if you've increased the number of failing tests by 0). Perhaps CVS used to make this impractical, but nowadays git branches make this very easy. BTW, why does CONTRIBUTE tell us to "make check" at all? >> I would answer "no", assuming the person developing the >> temporarily misbehaving code is motivated to do it in the first place. >> Develop and break things at will in a branch, merge them to master when >> they're clean. No? > If the code is used, its breakage on a branch hurts like it does on > master. Not at all, no, it hurts only the people interested in trying out the feature. On master it hurts everyone, including Hydra's continuous integration, for example, which is the issue at hand. But also other automated things like automated bug bisections etc... > If it's unused, then what is it doing in the repository? To save it. To show it to others for comments. This seems rather obvious to me, so perhaps we are misunderstanding each other.
I'm also pretty sure I've seen branches prescribed in this list for unstable features. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-19 8:13 ` João Távora @ 2018-06-19 16:59 ` Eli Zaretskii 2018-06-19 19:40 ` João Távora 0 siblings, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-06-19 16:59 UTC (permalink / raw) To: João Távora; +Cc: acm, emacs-devel, tino.calancha, rgm > From: João Távora <joaotavora@gmail.com> > Cc: acm@muc.de, rgm@gnu.org, emacs-devel@gnu.org, tino.calancha@gmail.com > Date: Tue, 19 Jun 2018 09:13:10 +0100 > > >> > But putting the problematic code on a branch reduces the incentive > >> > even more, doesn't it? > >> > >> I don't follow. > > > > Code on a branch gets less testing by others, and therefore less > > reminders about the failing test. > > But surely, the programmer who broke the test, who is the person > technically (and morally) most well suited to fix the problem has the > all the original incentive to merge his work. Of course. But this is not affected by whether the code is on a branch or on master. > For me this is very clear: only merge if there are 0 failing tests (or > rather, if you've increased the number of failing tests by 0). Perhaps > CVS used to make this impractival, but nowadays git branches make this > very easy. That's a good policy. > BTW, why does CONTRIBUTE tell us to "make check" at all? Is this a tricky question? Because I think the answer is clear to all. > >> I would answer "no", assuming the person developing the > >> temporarily misbehaving code is motivated to do it in the first place. > >> Develop and break things at will in a branch, merge them to master when > >> they're clean. No? > > If the code is used, its breakage on a branch hurts like it does on > > master. > > Not at all, no, it hurts only the people interested in trying out the > feature. On master it hurts everyone It hurts those who try the feature on master as well. > including Hydra's continuous integration, for example, which is the > issue at hand. 
But also other automated things like automated bug > bisections etc... > > > If it's unused, then what is it doing in the repository? > > To save it. To show it to others for comments. This seems rather > obvious to me, so perhaps we are misunderstanding each other. I'm also > pretty sure I've seen branches prescribed in this list for unstable > features. OK, I think it's time to stop this dispute. It isn't going anywhere, and we basically agree on most aspects of this. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-19 16:59 ` Eli Zaretskii @ 2018-06-19 19:40 ` João Távora 0 siblings, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-19 19:40 UTC (permalink / raw) To: Eli Zaretskii; +Cc: acm, emacs-devel, tino.calancha, rgm Eli Zaretskii <eliz@gnu.org> writes: >> BTW, why does CONTRIBUTE tell us to "make check" at all? > Is this a tricky question? Because I think the answer is clear to > all. Sorry if it sounded like a gotcha, but my point was precisely that no clear policy exists. I notice a couple of paragraphs above it says "please test your changes before committing to the master branch". But for me it's still not clear if that means "don't commit if you've broken any tests". I do the latter, and try to influence others to work like this, but perhaps the phrasing is purposely vague so other workflows can be accommodated. Urgent fixes may justify breaking some tests (and that's why I asked what Alan's change did). > OK, I think it's time to stop this dispute. It isn't going anywhere, > and we basically agree on most aspects of this. OK, let's. João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 15:37 ` João Távora 2018-06-18 16:46 ` Eli Zaretskii @ 2018-06-18 20:24 ` Glenn Morris 2018-06-19 2:03 ` João Távora 1 sibling, 1 reply; 93+ messages in thread From: Glenn Morris @ 2018-06-18 20:24 UTC (permalink / raw) To: João Távora Cc: Alan Mackenzie, Eli Zaretskii, tino.calancha, emacs-devel > On Mon, Jun 18, 2018, 16:18 Eli Zaretskii <eliz@gnu.org> wrote: >> Isn't there a way to mark a test as expected to fail? João Távora wrote: > Yes, and I'll probably do that. Please do. Long term test failures are a problem for automated building, testing, merging, etc. Thanks in advance! > But in my experience, this has a very high probability of burying the > problem, i.e. the incentive for actually fixing the problem is reduced > dramatically. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 20:24 ` Glenn Morris @ 2018-06-19 2:03 ` João Távora 0 siblings, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-19 2:03 UTC (permalink / raw) To: Glenn Morris; +Cc: Alan Mackenzie, Eli Zaretskii, tino.calancha, emacs-devel Glenn Morris <rgm@gnu.org> writes: >> On Mon, Jun 18, 2018, 16:18 Eli Zaretskii <eliz@gnu.org> wrote: >>> Isn't there a way to mark a test as expected to fail? >> Yes, and I'll probably do that. > Please do. Done in d37d30cef5bbbdf8d17315835126d76d4681b22a João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 13:24 ` João Távora 2018-06-18 15:18 ` Eli Zaretskii @ 2018-06-18 15:42 ` Alan Mackenzie 2018-06-18 17:01 ` João Távora 1 sibling, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-06-18 15:42 UTC (permalink / raw) To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha Hello, João. On Mon, Jun 18, 2018 at 14:24:39 +0100, João Távora wrote: > Alan Mackenzie <acm@muc.de> writes: > > Yes. But it is the master branch, where not everything can be expected > > to work all the time. I think the main thing is, we're _going_ to fix > > this bug. > Well, I respectfully and totally disagree. The reason we have automated > tests in Hydra is to catch unintentional breakage, not intentional > breakage. This breakage is unintentional, and we're working out how to fix it. > For development temporarily unhampered by tests, I think a separate > branch is a much better alternative. It's a very easy thing to do in > git (and in your case, trivial to merge from and back to master, given > you have near-total control over that area of code). It's possible, but it's a hassle; it's outside of normal workflow, therefore involves getting into git's execrable documentation. > Can't you just revert the commit that broke it? It was three (or maybe four) successive commits. If I revert them, it will postpone indefinitely the bug that they've fixed. > > OK, here goes. Why should major modes tie themselves in knots, just so > > that electric-pair-mode can work? What CC Mode is doing is natural, and > > matches the reality. > I think you mean "mode", in the singular form :-). No. CC Mode comprises lots of modes, not all of them maintained by me. But even aside from that, CC Mode has often been a pioneer, developing new techniques, which the rest of Emacs has then followed. Examples are hungry deletion and electric indentation. 
> Also, it doesn't "match reality": if you open a line in a string, it > syntax highlights the remaining string as C statements, but the C parser > doesn't see C statements. IOW, newline doesn't *really* terminate a > string in C. We could argue about words like "terminate" indefinitely. What I think is incontrovertible is if you open a line in a string, the portion after that opening is not part of the string opened on the line above. The new fontification reflects this fact. > > electric-pair-mode's chomp facility could be more rigorously coded - > > sometimes it is dealing with visible whitespace, sometimes it is dealing > > with syntactic properties. Surely it should be working with visible > > whitespace all the time? > No. If it did so, it would chomp parenthesis from non-comment regions > into comment regions, for example. But it could use the strategy of determining the end of any comment, then using non-syntax facilities for traversing the space up to that end. Or something like that. > That doesn't make sense, not according to show-paren-mode, for example. > By the way, after your change, very basic commands which fall completely > outside electric-pair-mode have fundamentally changed their behaviour in > cc-mode. Here are a few, out of Emacs -Q: > * Open a line in a string, using C-o. Sexp-navigation is now messed up > in the whole buffer, i.e. C-M-*. Most commads error or produce > surprising result. So even if the intent is to eventually add a > backslash escaping the newline, or make it two adjacent strings by > typing two quotes (something perfectly allowed by C). I've tried this, obviously, but as far as I'm aware, the operation of C-M-* is correct for the (now syntactically incorrect) buffer. If you can give me a concrete example, I can look at it and correct it. 
> * Inside the string, `forward-sexp' in a parenthesis of a NL-terminated > string now errors where it would previously do its job of jumping to > the closer; It works more or less the same as C-M-n always has from a parenthesis inside a string, which isn't matched in that string. Just that the notion of "inside a string" is now more exact than it used to be. > * Also inside the string, `blink-matching-paren', on by default, also > doesn't work as before: closing a paren on a NL-started string doesn't > match the opener. Do you mean a NL-ENDED string? I see matching here. If you can be more precise about the failure, I can look at it. > There are no automated tests for these things, otherwise you could be > seeing test breakage here too (and, with higher probably, you may be > seeing breakage in user's expectations later on). No, these things are not all intended functionality of Emacs, they're just side effects of the way the real functionality was implemented. > > I've attempted a bit of debugging. In addition to > > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction > > ended-prematurely-fn of function electric-pair--balance-info, which > > snagged on the end of string at EOL. > I don't understand how this matters to the problem at hand, but > regardless, can you make a bug report demonstrating the presumed bug and > its impact so I can follow up? I attempted to see how difficult it would be to modify elec-pair.el to cope with unconstrained text properties in buffers. This was the second problem I came up against. > > We are talking about a corner case in e-p-m, namely where e-p-m attempts > > to chomp space between parens inside an invalid string. This surely > > won't come up in practice very much. Is it worth fixing? (I would say > > yes.) > Don't forget that the particular piece of e-p-m we're talking about is > one of the ways (arguably the easiest way) to actually fix the specific > C/C++ problem at hand for the user. 
IOW it's not some random whimsical > useless thing. It's not useless, but it's rare - it's three things happening all at the same time, namely a broken string, pseudo-matching parens and space between them. This isn't going to happen very often. I'd wager that broken strings (two "s with non-escaped NLs between them) in themselves are quite rare. But I still think it should be fixed. :-) > > The user is visually informed of the reality: that one or more > > strings are unterminated, and where the "breakage" is (where the > > font-lock-string-face stops). This is an improvement over the > > previous handling, where the opening invalid " merely got > > warning-face, but the following unterminated string flowed on > > indefinitely. > I suppose that's a "yes". In that case, the face `warning`, which > defaults to a very bright red, would be fine for me personally (and I'm > confident if could be made even more evident). Also, the fact that the > remaining string is now syntax-highlighted as C statements is extremely > confusing. Why? They are now C statements, and would be handled by the compiler as such. Having them fontified as strings (as they previously were) was confusing. > > The disadvantage is that e-p-m is constraining major modes in how they > > can use syntax-table text properties. I think this is a problem in > > electric-pair-mode, not in CC Mode. > Again, AFAIK, "mode", singular. See above. Perhaps it's worth noting that AWK-Mode has used this method of indicating invalid strings for around 15 years, now. There have never been any complaints about this from users. > And, obviously, I'm not going to special-case cc-mode in elec-pair.el: > after doing some of my own mulling, I may open a customization point > for cc-mode.el to use. I think it's a general case, that of having non-neutral syntax-table text properties on visual space characters. What do you see as a customisation option here? 
> So at the very least, it's going to require some (potentially trivial) > fix in cc-mode.el, for sure. :-) > But now that I've understood the non-e-p-m implications of your change, > I urge to at least make this configurable (if it is already > configurable, then don't mind me). Make correct fontification configurable? To sum up my viewpoint, I regard the way CC Mode now fontifies broken strings as correct (aside from any remaining bugs, of course). I think elec-pair.el's assumption that whitespace always has "neutral" syntax is unwarranted, and is the root of the current bug. There remains the problem of making chomping parens inside a broken string work. I honestly think that modifying elec-pair.el is the way to go, but I'm open to suggestions of alternative strategies that CC Mode could follow to get the same fontification, that wouldn't require modifying elec-pair.el. > João -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
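For readers following the C-o discussion in the message above: the two repairs João mentions for a string that has been broken across lines (a backslash before the newline, or closing and reopening the quotes) are both standard C. A minimal sketch, with hypothetical function names that come neither from CC Mode nor from this thread:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: a literal split with backslash-newline.
   The continuation line must start in column 0, because any
   leading whitespace there would become part of the string. */
const char *joined_by_continuation(void)
{
    return "foo\
bar";
}

/* Hypothetical helper: two adjacent literals, which the compiler
   concatenates into a single string during translation. */
const char *joined_by_adjacency(void)
{
    return "foo"
           "bar";
}

/* Aborts if the two spellings do not yield the same string. */
void check_spellings(void)
{
    assert(strcmp(joined_by_continuation(), "foobar") == 0);
    assert(strcmp(joined_by_adjacency(), "foobar") == 0);
}
```

Calling `check_spellings()` from a `main` should return silently. In both cases the newline in the source never becomes part of the string, so neither spelling is the "broken string" case the thread is arguing about.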
* Re: CC Mode and electric-pair "problem". 2018-06-18 15:42 ` Alan Mackenzie @ 2018-06-18 17:01 ` João Távora 2018-06-18 18:07 ` Yuri Khan ` (3 more replies) 0 siblings, 4 replies; 93+ messages in thread From: João Távora @ 2018-06-18 17:01 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha Alan Mackenzie <acm@muc.de> writes: >> For development temporarily unhampered by tests, I think a separate >> branch is a much better alternative. It's a very easy thing to do in >> git (and in your case, trivial to merge from and back to master, given >> you have near-total control over that area of code). > > It's possible, but it's a hassle; it's outside of normal workflow, > therefore involves getting into git's execrable documentation. hehe. lol. OK, but really you should checkout out branches, they're all the rage these days :-) >> > OK, here goes. Why should major modes tie themselves in knots, just so >> > that electric-pair-mode can work? What CC Mode is doing is natural, and >> > matches the reality. > >> I think you mean "mode", in the singular form :-). > > No. CC Mode comprises lots of modes, not all of them maintained by me. > But even aside from that, CC Mode has often been a pioneer, developing > new techniques, which the rest of Emacs has then followed. Examples are > hungry deletion and electric indentation. But they are all children of cc-mode.el right? I meant singular as in, afaik, nobody else independently thought of doing that besides you. >> Also, it doesn't "match reality": if you open a line in a string, it >> syntax highlights the remaining string as C statements, but the C parser >> doesn't see C statements. IOW, newline doesn't *really* terminate a >> string in C. > > We could argue about words like "terminate" indefinitely. What I think > is incontrovertible is if you open a line in a string, the portion after > that opening is not part of the string opened on the line above. 
The > new fontification reflects this fact. OK, but now it reflects something that is also wrong (they're not statements either), but to a much greater degree. And on top of that with many more adverse side effects, of which only one is breaking e-p-m mode. >> > electric-pair-mode's chomp facility could be more rigorously coded - >> > sometimes it is dealing with visible whitespace, sometimes it is dealing >> > with syntactic properties. Surely it should be working with visible >> > whitespace all the time? > >> No. If it did so, it would chomp parenthesis from non-comment regions >> into comment regions, for example. > But it could use the strategy of determining the end of any comment, > then using non-syntax facilities for traversing the space up to that > end. Or something like that. I'll look into "something like that". > I've tried this, obviously, but as far as I'm aware, the operation of > C-M-* is correct for the (now syntactically incorrect) buffer. If you > can give me a concrete example, I can look at it and correct it. It's now much harder to select the whole invalid string. It used to be a matter of C-M-u C-M-SPC. To use query-replace in the region, for example. >> * Also inside the string, `blink-matching-paren', on by default, also >> doesn't work as before: closing a paren on a NL-started string doesn't >> match the opener. > > Do you mean a NL-ENDED string? I see matching here. If you can be more > precise about the failure, I can look at it. No, I mean the closer. You and the mode don't consider that a string anymore, but you used to, and I still want to. >> There are no automated tests for these things, otherwise you could be >> seeing test breakage here too (and, with higher probably, you may be >> seeing breakage in user's expectations later on). > No, these things are not all intended functionality of Emacs, they're > just side effects of the way the real functionality was implemented. 
These accidents, as you have them, work just fine in just about any other mode I can imagine. And they worked just fine in c-mode up until your change. >> > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction >> > ended-prematurely-fn of function electric-pair--balance-info, which >> > snagged on the end of string at EOL. >> I don't understand how this matters to the problem at hand, but >> regardless, can you make a bug report demonstrating the presumed bug and >> its impact so I can follow up? > I attempted to see how difficult it would be to modify elec-pair.el to > cope with unconstrained text properties in buffers. This was the second > problem I came up against. Well, programming is a continuous problem in general. If I understand correctly, the thing you're trying to change is an implementation detail of electric-pair-mode, not part of its contract, right? If, on the contrary, you think it is a bug, let me know. >> > We are talking about a corner case in e-p-m, namely where e-p-m attempts >> > to chomp space between parens inside an invalid string. This surely >> > won't come up in practice very much. Is it worth fixing? (I would say >> > yes.) >> Don't forget that the particular piece of e-p-m we're talking about is >> one of the ways (arguably the easiest way) to actually fix the specific >> C/C++ problem at hand for the user. IOW it's not some random whimsical >> useless thing. > It's not useless, but it's rare - it's three things happening all at the > same time, namely a broken string, pseudo-matching parens and space > between them. This isn't going to happen very often. I'd wager that > broken strings (two "s with non-escaped NLs between them) in themselves > are quite rare. But I still think it should be fixed. :-) Well, it's handling the rarities that makes Emacs stand out. 
>> > The user is visually informed of the reality: that one or more >> > strings are unterminated, and where the "breakage" is (where the >> > font-lock-string-face stops). This is an improvement over the >> > previous handling, where the opening invalid " merely got >> > warning-face, but the following unterminated string flowed on >> > indefinitely. > >> I suppose that's a "yes". In that case, the face `warning`, which >> defaults to a very bright red, would be fine for me personally (and I'm >> confident if could be made even more evident). Also, the fact that the >> remaining string is now syntax-highlighted as C statements is extremely >> confusing. > > Why? They are now C statements, and would be handled by the compiler as > such. Clarify "would". Because this doesn't compile. My compiler doesn't even seem to look at anything after the unterminated string:

  int main () {
    printf("foo
    );
    printf("bar");
    return 0;
  }

>> > The disadvantage is that e-p-m is constraining major modes in how they >> > can use syntax-table text properties. I think this is a problem in >> > electric-pair-mode, not in CC Mode. > >> Again, AFAIK, "mode", singular. > > See above. Perhaps it's worth noting that AWK-Mode has used this method > of indicating invalid strings for around 15 years, now. There have > never been any complaints about this from users. But they weren't ever exposed to the previous behaviour, right? And also, I believe that there is some discrepancy between the number of users of AWK and C, the complexity of the average program, etc... >> But now that I've understood the non-e-p-m implications of your change, >> I urge to at least make this configurable (if it is already >> configurable, then don't mind me). > Make correct fontification configurable? For some newfound value of "correct", surely... > There remains the problem of making chomping parens inside a broken > string work. 
I honestly think that modifying elec-pair.el is the way to > go, but I'm open to suggestions of alternative strategies that CC Mode > could follow to get the same fontification, that wouldn't require > modifying elec-pair.el. As I said, I will look into providing an entry point in elec-pair.el for this. Didn't you mention earlier pike-mode and d-mode? Quoting your earlier message: > Pike Mode has a special feature whereby a string starting with #" > is a multiline string. I think in D Mode (not maintained here), > strings simply are multiline, and there is no such thing as an > escaped EOL. > The writer of the mode sets the CC Mode "language variable" > c-multiline-string-start-char to the character # for Pike Mode, or > some non-character non-nil value for D Mode (usually t, of > course). Can't I do this to my c/c++ mode? Wouldn't this be a way to get the old behaviour back? Perhaps it could be let-bound in tests, also. João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 17:01 ` João Távora @ 2018-06-18 18:07 ` Yuri Khan 2018-06-18 22:52 ` João Távora 2018-06-18 18:08 ` Alan Mackenzie ` (2 subsequent siblings) 3 siblings, 1 reply; 93+ messages in thread From: Yuri Khan @ 2018-06-18 18:07 UTC (permalink / raw) To: João Távora Cc: Alan Mackenzie, Emacs developers, Tino Calancha, Glenn Morris

On Tue, Jun 19, 2018 at 12:25 AM João Távora <joaotavora@gmail.com> wrote:
> Alan Mackenzie <acm@muc.de> writes:
> > Why? They are now C statements, and would be handled by the compiler as
> > such.
>
> Clarify "would". Because this doesn't compile. My compiler doesn't even
> seem to look at anything after the unterminated string:
>
> int main () {
>   printf("foo
>   );
>   printf("bar");
>   return 0;
> }

Mine does. After finding a syntax error, a typical C compiler continues scanning the source, attempting to diagnose more errors. See:

```
int main() {
    printf("foo
    );
    return baz;
}

$ gcc test1.c
test1.c: In function ‘main’:
test1.c:2:5: warning: implicit declaration of function ‘printf’ [-Wimplicit-function-declaration]
     printf("foo
     ^
test1.c:2:5: warning: incompatible implicit declaration of built-in function ‘printf’
test1.c:2:5: note: include ‘<stdio.h>’ or provide a declaration of ‘printf’
test1.c:2:12: warning: missing terminating " character
     printf("foo
            ^
test1.c:2:5: error: missing terminating " character
     printf("foo
     ^
test1.c:2:5: error: too few arguments to function ‘printf’
test1.c:4:12: error: ‘baz’ undeclared (first use in this function)
     return baz;
            ^
test1.c:4:12: note: each undeclared identifier is reported only once for each function it appears in
```

Observe that the compiler first complains about the unclosed string literal, then too few arguments, and then undeclared identifier ‘baz’. If the compiler thought the string literal continued, it would skip everything until the end of file silently. 
Now add a few characters:

```
int main() {
    printf("foo
    "bar");
    return baz;
}

$ gcc test2.c
test2.c: In function ‘main’:
test2.c:2:5: warning: implicit declaration of function ‘printf’ [-Wimplicit-function-declaration]
     printf("foo
     ^
test2.c:2:5: warning: incompatible implicit declaration of built-in function ‘printf’
test2.c:2:5: note: include ‘<stdio.h>’ or provide a declaration of ‘printf’
test2.c:2:12: warning: missing terminating " character
     printf("foo
            ^
test2.c:2:5: error: missing terminating " character
     printf("foo
     ^
test2.c:4:12: error: ‘baz’ undeclared (first use in this function)
     return baz;
            ^
test2.c:4:12: note: each undeclared identifier is reported only once for each function it appears in
```

Observe the compiler no longer complains about too few arguments to ‘printf’. This is consistent with the hypothesis that it discarded the unterminated literal at the newline, and took "bar" as the required format string argument. ^ permalink raw reply [flat|nested] 93+ messages in thread
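Yuri's hypothesis, that the compiler abandons an unterminated literal at the newline and resumes scanning on the next line, can be sketched as a toy scanner. This is an illustration only: it is neither GCC's lexer nor CC Mode's code, and the names `scan_string_literal` and `literal_is_terminated` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy scanner.  SRC[START] must be an opening double
   quote.  Returns the index one past the end of the literal and sets
   *TERMINATED to 1 if a closing quote was found, or to 0 if the
   literal was cut off by an unescaped newline or by end of input. */
size_t scan_string_literal(const char *src, size_t start, int *terminated)
{
    size_t i = start + 1;           /* skip the opening quote */

    while (src[i] != '\0') {
        if (src[i] == '\\' && src[i + 1] != '\0') {
            i += 2;                 /* escape, including backslash-newline */
        } else if (src[i] == '"') {
            *terminated = 1;
            return i + 1;           /* one past the closing quote */
        } else if (src[i] == '\n') {
            *terminated = 0;
            return i;               /* literal is abandoned at the newline */
        } else {
            i++;
        }
    }
    *terminated = 0;
    return i;                       /* ran off the end of input */
}

/* Convenience wrapper: 1 if the literal starting at START is closed. */
int literal_is_terminated(const char *src, size_t start)
{
    int terminated;
    (void)scan_string_literal(src, start, &terminated);
    return terminated;
}
```

Under this sketch, scanning `printf("foo` stops at the end of the line and reports the literal as unterminated, so a `"bar"` on the following line is scanned as a fresh, complete literal. That matches the behaviour inferred from the test2.c diagnostics, though the real compiler's recovery rules may differ.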
* Re: CC Mode and electric-pair "problem". 2018-06-18 18:07 ` Yuri Khan @ 2018-06-18 22:52 ` João Távora 0 siblings, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-18 22:52 UTC (permalink / raw) To: Yuri Khan; +Cc: Alan Mackenzie, Emacs developers, Tino Calancha, Glenn Morris Yuri Khan <yurivkhan@gmail.com> writes: > On Tue, Jun 19, 2018 at 12:25 AM João Távora <joaotavora@gmail.com> wrote: >> Alan Mackenzie <acm@muc.de> writes: >> > Why? They are now C statements, and would be handled by the compiler as >> > such. >> >> Clarify "would". Because this doesn't compile. My compiler doesn't even >> seem to look at anything after the unterminated string: >> >> int main () { >> printf("foo >> ); >> printf("bar"); >> return 0; >> } > > Mine does. After finding a syntax error, a typical C compiler > continues scanning the source, attempting to diagnose more errors. You're right. That example I fed this particular compiler (clang) didn't have anything for it to complain. João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 17:01 ` João Távora 2018-06-18 18:07 ` Yuri Khan @ 2018-06-18 18:08 ` Alan Mackenzie 2018-06-18 23:43 ` João Távora 2018-06-19 1:48 ` Stefan Monnier 2018-06-18 22:41 ` CC Mode and electric-pair "problem" Stephen Leake 2018-06-19 5:02 ` Alan Mackenzie 3 siblings, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-18 18:08 UTC (permalink / raw) To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha Hello again, João. On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote: > Alan Mackenzie <acm@muc.de> writes: > > No. CC Mode comprises lots of modes, not all of them maintained by > > me. But even aside from that, CC Mode has often been a pioneer, > > developing new techniques, which the rest of Emacs has then followed. > > Examples are hungry deletion and electric indentation. > But they are all children of cc-mode.el right? I meant singular as in, > afaik, nobody else independently thought of doing that besides you. Probably other people have thought of it. The actual doing was quite involved. But maybe we'll see whether or not the idea spreads. > > We could argue about words like "terminate" indefinitely. What I > > think is incontrovertible is if you open a line in a string, the > > portion after that opening is not part of the string opened on the > > line above. The new fontification reflects this fact. > OK, but now reflects it reflects something that is also wrong (they're > not statements either), but to a much greater degress. And on top of > that with many more adverse side effects, of which only one is breaking > e-p-m mode. How adverse are they really? I mean, I think you are currently in the "looking for flaws" mode, which is essential, worthwhile, and appreciated, but if you were just using, say, C++ mode, how bad would these side effects actually be? That's not a rhetorical question. It's about deciding whether to invest the work to make the "correct" behaviour optional. 
> > I've tried this, obviously, but as far as I'm aware, the operation of > > C-M-* is correct for the (now syntactically incorrect) buffer. If you > > can give me a concrete example, I can look at it and correct it. > It's now much hard to select the whole invalid string. It used to be a > matter of C-M-u C-M-SPC. To use query-replace in the region, for > example. OK, thanks. But how often does this happen? > >> * Also inside the string, `blink-matching-paren', on by default, also > >> doesn't work as before: closing a paren on a NL-started string doesn't > >> match the opener. > > Do you mean a NL-ENDED string? I see matching here. If you can be more > > precise about the failure, I can look at it. > No, I mean the closer. You and the mode don't consider that a string > anymore, but you used to, and I still want do. OK. > >> There are no automated tests for these things, otherwise you could be > >> seeing test breakage here too (and, with higher probably, you may be > >> seeing breakage in user's expectations later on). > > No, these things are not all intended functionality of Emacs, they're > > just side effects of the way the real functionality was implemented. > These accidents, as you have them, work just fine in just about any > other mode I can imagine. And they worked just fine in c-mode up until > your change. I suspect it is more that people have got so used to them, that any change will appear to be bad. Maybe. > Well, programming is a continuous problem in general. If I understand > correctly, the thing you're trying to change is an implementation detail > of electric-pair-mode, not part of its contract, right? If, on the > contrary, you think it is a bug, let me know. It is what I said at the end of my previous post. e-p-m assumes that whitespace has "neutral" syntax. When it doesn't (like here, with a string-fence property), the scan-sexps doesn't work as desired. I'm convinced this could be changed. 
> >> > We are talking about a corner case in e-p-m, namely where e-p-m attempts > >> > to chomp space between parens inside an invalid string. This surely > >> > won't come up in practice very much. Is it worth fixing? (I would say > >> > yes.) > >> Don't forget that the particular piece of e-p-m we're talking about is > >> one of the ways (arguably the easiest way) to actually fix the specific > >> C/C++ problem at hand for the user. IOW it's not some random whimsical > >> useless thing. > > It's not useless, but it's rare - it's three things happening all at the > > same time, namely a broken string, pseudo-matching parens and space > > between them. This isn't going to happen very often. I'd wager that > > broken strings (two "s with non-escaped NLs between them) in themselves > > are quite rare. But I still think it should be fixed. :-) > Well, it's handling the rarities that makes Emacs stand out. Indeed! Let's carry on doing this. > >> > The user is visually informed of the reality: that one or more > >> > strings are unterminated, and where the "breakage" is (where the > >> > font-lock-string-face stops). This is an improvement over the > >> > previous handling, where the opening invalid " merely got > >> > warning-face, but the following unterminated string flowed on > >> > indefinitely. > >> I suppose that's a "yes". In that case, the face `warning`, which > >> defaults to a very bright red, would be fine for me personally (and I'm > >> confident if could be made even more evident). Also, the fact that the > >> remaining string is now syntax-highlighted as C statements is extremely > >> confusing. > > Why? They are now C statements, and would be handled by the compiler as > > such. > Clarify "would". Because this doesn't compile. My compiler doesn't even > seem to look at anything after the unterminated string: > int main () { > printf("foo > ); > printf("bar"); > return 0; > } Maybe the compiler has the same bug as the old CC Mode. 
;-)

But to see my point of view, type the following into a C Mode buffer in
Emacs-26.1, the last two lines first, then type in the first line above
them:

    char *foo = "foo;
    int bar = 5;
    char *baz = "baz";

The entire second line, and the third line, up to the first ", get string
face. We've been used to this for so long that we've lost sight of just
how bad and amateurish it really is.

Now do the same in master. The fontification of the last two lines
remains unaffected by typing in the first line, as it should.

> > See above. Perhaps it's worth noting that AWK-Mode has used this
> > method of indicating invalid strings for around 15 years, now. There
> > have never been any complaints about this from users.

> But they weren't ever exposed to the previous behaviour, right? And
> also, I believe that there is some discrepancy between the number of
> users of AWK and C, the complexity of the average program, etc...

Most AWK programmers will also be using C, shell-script, whatever. And
while there aren't that many of them, they aren't as rare as all that.
And when I say no complaints, I mean none whatsoever; not a single one.

> >> But now that I've understood the non-e-p-m implications of your change,
> >> I urge you to at least make this configurable (if it is already
> >> configurable, then don't mind me).

> > Make correct fontification configurable?

> For some newfound value of "correct", surely...

Yes. ;-)

> > There remains the problem of making chomping parens inside a broken
> > string work. I honestly think that modifying elec-pair.el is the way to
> > go, but I'm open to suggestions of alternative strategies that CC Mode
> > could follow to get the same fontification, that wouldn't require
> > modifying elec-pair.el.

> As I said, I will look into providing an entry point in elec-pair.el for
> this.

Thanks.

> Didn't you mention earlier pike-mode and d-mode?
Quoting your earlier > message: > > Pike Mode has a special feature whereby a string starting with #" > > is a multiline string. I think in D Mode (not maintained here), > > strings simply are multiline, and there is no such thing as an > > escaped EOL. > > The writer of the mode sets the CC Mode "language variable" > > c-multiline-string-start-char to the character # for Pike Mode, or > > some non-character non-nil value for D Mode (usually t, of > > course). > Can't I do this to my c/c++ mode? Would't this be a way to get the old > behaviour back. Perhaps it could be be let-bound in tests, also. These are intended as language variables (i.e. variables which define a language), not user configuration variables. I can't immediately see any adverse effects to binding them, but I can't guarantee there'll be none. As for let binding them for tests, that should be for a short time only. > João -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 18:08 ` Alan Mackenzie @ 2018-06-18 23:43 ` João Távora 2018-06-19 1:35 ` João Távora 2018-06-19 1:48 ` Stefan Monnier 1 sibling, 1 reply; 93+ messages in thread From: João Távora @ 2018-06-18 23:43 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha Alan Mackenzie <acm@muc.de> writes: > Hello again, João. Hello again, Alan > On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote: > Probably other people have thought of it. The actual doing was quite > involved. But maybe we'll see whether or not the idea spreads. > >> > We could argue about words like "terminate" indefinitely. What I >> > think is incontrovertible is if you open a line in a string, the >> > portion after that opening is not part of the string opened on the >> > line above. The new fontification reflects this fact. > >> OK, but now reflects it reflects something that is also wrong (they're >> not statements either), but to a much greater degress. And on top of >> that with many more adverse side effects, of which only one is breaking >> e-p-m mode. > > How adverse are they really? I mean, I think you are currently in the > "looking for flaws" mode, which is essential, worthwhile, and > appreciated, but if you were just using, say, C++ mode, how bad would > these side effects actually be? That's not a rhetorical question. It's > about deciding whether to invest the work to make the "correct" behaviour > optional. Honestly, I don't know. You're right I am indeed in "looking for flaws" mode. "Once you see it, you can't unsee it". After reading your example below, I think the feature you are adding is good for people who inadvertently delete the terminating quote, but not for those who once in a while add a newline inside a string. 
I am in the latter group: I almost never unbalance my buffer (thanks
mostly to e-p-m) and I like to navigate by sexp, even in languages other
than lisp (even in message-mode, for instance). So I get sad when
something breaks this balance.

>> > I've tried this, obviously, but as far as I'm aware, the operation of
>> > C-M-* is correct for the (now syntactically incorrect) buffer. If you
>> > can give me a concrete example, I can look at it and correct it.
>
>> It's now much harder to select the whole invalid string. It used to be a
>> matter of C-M-u C-M-SPC. To use query-replace in the region, for
>> example.
>
> OK, thanks. But how often does this happen?

Well, there's obviously the case of actually writing a multi-line string,
such as when writing a "usage:" blurb. Here I believe most users, like
me, will first draft out the string visually and then add "\n\" to every
line, perhaps by selecting the string where point is in, which is now
much harder.

While it's true I don't write many of those lately, it will probably bug
me much more often in another situation: I'm in the habit of C-o'ing a
lot (everywhere, not just in strings, obviously) to "open space" for my
thoughts, i.e. for the thing I am going to write next. And this new
behaviour breaks that.

But now I've tested a bit more and can be specific: it breaks *some* of
that. C-M-u C-M-SPC is indeed broken. That's what I use, for example,
just before deciding to replace the string with a variable. But
curiously, chomping to the end quote is working, which is nice. And if I
can somehow make it to the closer quote, C-M-b works, though C-M-f at the
opener doesn't. In any case, things were more predictable before.

>> These accidents, as you have them, work just fine in just about any
>> other mode I can imagine. And they worked just fine in c-mode up until
>> your change.
> I suspect it is more that people have got so used to them, that any
> change will appear to be bad.

Maybe.
As we all know, Emacs is fertile in this regard. For example, I can
think of a certain version control system, rhymes with "knit"...

>> Well, programming is a continuous problem in general. If I understand
>> correctly, the thing you're trying to change is an implementation detail
>> of electric-pair-mode, not part of its contract, right? If, on the
>> contrary, you think it is a bug, let me know.
> It is what I said at the end of my previous post. e-p-m assumes that
> whitespace has "neutral" syntax. When it doesn't (like here, with a
> string-fence property), the scan-sexps doesn't work as desired. I'm
> convinced this could be changed.

OK. I'll have a look (I admit to not having looked at that code in depth
since I wrote it 5 years ago). I was pretty much convinced it was
flawless :-)

>> Well, it's handling the rarities that makes Emacs stand out.
> Indeed! Let's carry on doing this.

>> int main () {
>>   printf("foo
>> );
>>   printf("bar");
>>   return 0;
>> }
>
> Maybe the compiler has the same bug as the old CC Mode. ;-)

No, I passed it a silly example. It does indeed look past the
unterminated string.

> But to see my point of view, type the following into a C Mode buffer in
> Emacs-26.1, the last two lines first, then type in the first line above
> them:
>
> char *foo = "foo;
> int bar = 5;
> char *baz = "baz";
>
> The entire second line, and the third line, up to the first ", get string
> face. We've been used to this for so long that we've lost sight of just
> how bad and amateurish it really is.
>
> Now do the same in master. The fontification of the last two lines
> remains unaffected by typing in the first line, as it should.

Indeed, I admit this is better. I very rarely get a buffer like this,
though.

João
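To put rough numbers on the three-line `char *foo = "foo;` example quoted in this message, here is a small C sketch that counts how many characters would be shown in string face under each policy. It is an illustration only, not code from CC Mode or font-lock: the name count_string_chars and its newline_terminates flag are made up for this sketch, and the scan assumes plain double-quoted C strings, ignoring comments, character literals and raw strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the characters a highlighter would give string face, quote
   delimiters included.

   newline_terminates == true  models the new behaviour discussed in
   the thread: an unescaped newline ends an open string and is itself
   not counted.
   newline_terminates == false models the old behaviour: the string
   runs on until the next '"' or the end of the buffer.

   Hypothetical helper; it ignores comments and character literals. */
size_t count_string_chars(const char *src, bool newline_terminates)
{
    assert(src != NULL);
    size_t count = 0;
    bool in_string = false;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];

        if (!in_string) {
            if (c == '"') {
                in_string = true;
                count++;            /* the opening quote */
            }
            continue;
        }

        /* From here on we are inside a string. */
        if (c == '\\' && src[i + 1] != '\0') {
            count += 2;             /* escape pair, e.g. \" or backslash-newline */
            i++;
        } else if (c == '"') {
            count++;                /* the closing quote */
            in_string = false;
        } else if (c == '\n' && newline_terminates) {
            in_string = false;      /* unescaped newline ends the string */
        } else {
            count++;
        }
    }
    return count;
}
```

For the three-line buffer above (each line ending in a newline), the newline-terminated policy marks 10 characters (`"foo;` and `"baz"`), while the run-on policy marks 35: everything from the first quote up to the quote before baz, plus the tail starting at the last quote.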
* Re: CC Mode and electric-pair "problem".
From: João Távora @ 2018-06-19 1:35 UTC
To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha

João Távora <joaotavora@gmail.com> writes:

>> It is what I said at the end of my previous post. e-p-m assumes that
>> whitespace has "neutral" syntax. When it doesn't (like here, with a
>> string-fence property), the scan-sexps doesn't work as desired. I'm
>> convinced this could be changed.
> OK. I'll have a look (I admit to not having looked at that code in depth
> since I wrote it 5 years ago). I was pretty much convinced it was
> flawless :-)

I looked at this code and decided it's better to leave it. I don't need
it to fix the problem at hand. I may change this opinion, so I started
by pushing this change that we need regardless:

    commit 6353387835f6cb34765ac525ac3e9edf3239e589

    Electric-pair-mode lets modes choose how to skip whitespace

    * lisp/elec-pair.el (electric-pair-skip-whitespace-function): New
    buffer-local variable.
    (electric-pair-post-self-insert-function): Call it.

Then, I defined this function

    (defun c-mode-electric-skip-whitespace ()
      "CC-mode's way of skipping whitespace."
      (let ((saved (point))
            (in-comment (nth 4 (syntax-ppss))))
        ;; actually if you also skip backslash here, you'll skip/chomp
        ;; over newline escapes, which may be nice.
        (skip-chars-forward (apply #'string electric-pair-skip-whitespace-chars))
        (unless (or (not in-comment) (nth 4 (syntax-ppss)))
          (goto-char saved))))

And added this to c-mode-common-hook:

    (add-hook 'c-mode-common-hook
              (lambda ()
                (setq-local electric-pair-skip-whitespace-function
                            #'c-mode-electric-skip-whitespace)
                (add-function :around (local 'electric-pair-skip-self)
                              (lambda (&rest r)
                                (let (terminator)
                                  (if (and (setq terminator (nth 3 (syntax-ppss)))
                                           (save-excursion
                                             (goto-char (1- (line-end-position)))
                                             (and (eq terminator (nth 3 (syntax-ppss)))
                                                  (not (eq terminator (char-after))))))
                                      t
                                    (apply r)))))))

All e-p-m tests pass, though the detection of NL-terminated string is
very shady (but you probably have much better ways inside CC-mode to
detect them).

If your "fix-scan-sexps" idea above works (I don't understand it) then
the add-function won't be needed at all.

Let me know what you think,
João
* Re: CC Mode and electric-pair "problem".
From: Stefan Monnier @ 2018-06-19 1:48 UTC
To: emacs-devel

> char *foo = "foo;
> int bar = 5;
> char *baz = "baz";
>
> The entire second line, and the third line, up to the first ", get string
> face. We've been used to this for so long that we've lost sight of just
> how bad and amateurish it really is.

But what about when you write

    char *thedoc = "Here it is:
    - First do this
    - Then do that
    And that's it!";

?

Both cases are valid transient states. Which one will occur more often
depends a lot on the particular kind of code you write and your
coding habits.

Emacs can't reliably distinguish the two cases, so whichever behavior it
chooses it will look "amateurish" in some cases.

I think the better option here is to focus on the following:
1- Make sure the programmer is aware there's a problem in its code.
   I.e. highlight the opening quote or the non-escaped end-of-line or
   something in bright red or something like that.
2- Don't try to guess what the user intended to do.
   Instead keep our code as simple as possible: the C code we're handed
   is broken, so there's no real clear "right behavior" anyway.

-- Stefan
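Point 1 in the message above (flag the opener of a string that is not closed on its own line) can be sketched outside Emacs. The C function below is only an illustration under stated assumptions, not how CC Mode or font-lock detect such strings: first_unclosed_opener is a made-up name, and the scan handles only double-quoted C strings with backslash escapes, ignoring comments, character literals and raw strings.

```c
#include <assert.h>
#include <stddef.h>

/* Return the offset of the first '"' that opens a string literal which
   reaches an unescaped newline (or the end of the input) before its
   closing quote, or -1 if every literal is closed on its own line.
   This is the character a highlighter could paint in a warning face.

   Hypothetical helper; it ignores comments and character literals. */
long first_unclosed_opener(const char *src)
{
    assert(src != NULL);
    long opener = -1;               /* offset of the currently open quote, if any */

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];

        if (opener < 0) {
            if (c == '"')
                opener = (long)i;   /* a string starts here */
        } else if (c == '\\' && src[i + 1] != '\0') {
            i++;                    /* skip the escaped char, e.g. \" or backslash-newline */
        } else if (c == '"') {
            opener = -1;            /* literal closed normally */
        } else if (c == '\n') {
            return opener;          /* end of line reached while still open */
        }
    }
    return opener;                  /* still open at end of input, or -1 */
}
```

Both transient states from the message are reported the same way: for `char *foo = "foo;` and for the `char *thedoc = "Here it is:` block the function returns the offset of the opening quote and makes no attempt to guess which of the two the user meant.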
* Re: CC Mode and electric-pair "problem".
From: Clément Pit-Claudel @ 2018-06-19 3:52 UTC
To: emacs-devel

On 2018-06-18 21:48, Stefan Monnier wrote:
> 1- Make sure the programmer is aware there's a problem in its code.
>    I.e. highlight the opening quote or the non-escaped end-of-line or
>    something in bright red or something like that.

Agreed. Given this criterion, the patch is an improvement: making sure
that lines past the first one are not highlighted suppresses the risk of
misleading the programmer into thinking that they have a
multiline-string.

(This happens to me from time to time in Python, actually: I write "abc
def" instead of """abc
def""", and the highlighting doesn't immediately reveal the error.
Simply not highlighting the second line would help a lot.)

> 2- Don't try to guess what the user intended to do.
>    Instead keep our code as simple as possible: the C code we're handed
>    is broken, so there's no real clear "right behavior" anyway.

I'm not sure whether we can afford to bail out like that: for people who
don't use some form of structured editing, most of the code that the IDE
ends up seeing is broken in some way (unmatched { or ", incomplete
declarations, incorrect numbers of arguments, undeclared identifiers,
etc.)

Modeling our error recovery behaviors on the one used by relevant
compilers seems like a pretty good approach (ultimately, for the modes I
maintain, I'd like to delegate fontification to a language server
provided by the compiler).
* Re: CC Mode and electric-pair "problem". 2018-06-19 3:52 ` Clément Pit-Claudel @ 2018-06-19 6:38 ` Stefan Monnier 2018-06-20 13:48 ` Clément Pit-Claudel 0 siblings, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-06-19 6:38 UTC (permalink / raw) To: emacs-devel >> 1- Make sure the programmer is aware there's a problem in its code. >> I.e. highlight the opening quote or the non-escaped end-of-line or >> something in bright red or something like that. > Agreed. Given this criterion, the patch is an improvement: making sure that > lines past the first one are not highlighted suppresses the risk of > misleading the programmer into thinking that they have a multiline-string. The old behavior highlighted the opening (and not-closed on the same line) quote in font-lock-warning-face, which seemed perfectly adequate. > (This happens to me from time to time in Python, actually: I write "abc > def" instead of """abc > def""", and the highlighting doesn't immediately reveal the error. > Simply not highlighting the second line would help a lot. It's easier to highlight the unmatched opener than to try and prevent the second line from being highlighted (and you want to highlight that opener in any case). >> 2- Don't try to guess what the user intended to do. >> Instead keep our code as simple as possible: the C code we're handed >> is broken, so there's no real clear "right behavior" anyway. > > I'm not sure whether we can afford to bail out like that — for people who > don't use some form of structured editing, most of the code that the IDE > ends up seeing is broken in some way (unmatched { or ", incomplete > declarations, incorrect numbers of arguments, undeclared identifiers, etc.) Not sure what you mean by "bail out". Point 1 has added highlighting to warn the user about the presence of a problem. Short of changing the actual code behind the user's back, there's really not much more we can do to prevent the compiler/IDE from seeing that broken code. 
> Modeling our error recovery behaviors on the one used by relevant compilers > seems like a pretty good approach (ultimately, for the modes I maintain, I'd > like to delegate fontification to a language server provided by the > compiler). Point 2 suggest to go with the simplest implementation (i.e. let the behavior be dictated by the implementation), so if your highlighting is provided by LSP (say), then point 2 would suggest that there's no point trying to provide a different behavior from the one provided by the LSP server. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: Clément Pit-Claudel @ 2018-06-20 13:48 UTC
To: emacs-devel

On 2018-06-19 02:38, Stefan Monnier wrote:
> It's easier to highlight the unmatched opener than to try and prevent
> the second line from being highlighted (and you want to highlight that
> opener in any case).

Maybe, but I find it much more pleasant if the second line isn't
highlighted.

> Not sure what you mean by "bail out". Point 1 has added highlighting to
> warn the user about the presence of a problem. Short of changing the
> actual code behind the user's back, there's really not much more we can
> do to prevent the compiler/IDE from seeing that broken code.

We want the compiler and IDE to see the broken code, but we also want to
do as much as we can to make the experience pleasant (and I find it
unpleasant that inserting an unmatched '"' breaks syntax highlighting
for the rest of the buffer). As an example, Merlin does a great job at
handling broken OCaml code.

> Point 2 suggest to go with the simplest implementation (i.e. let the
> behavior be dictated by the implementation), so if your highlighting is
> provided by LSP (say), then point 2 would suggest that there's no point
> trying to provide a different behavior from the one provided by the
> LSP server.

Yes, I agree. In the meantime, approximating that at the cost of a bit
of complexity in the Emacs mode seems good.

Clément.
* Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] 2018-06-19 1:48 ` Stefan Monnier 2018-06-19 3:52 ` Clément Pit-Claudel @ 2018-06-26 16:08 ` Alan Mackenzie 2018-06-26 20:02 ` João Távora 2018-06-28 23:56 ` Stefan Monnier 1 sibling, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-26 16:08 UTC (permalink / raw) To: Stefan Monnier Cc: Clément Pit-Claudel, João Távora, emacs-devel Hello, Stefan. On Mon, Jun 18, 2018 at 21:48:41 -0400, Stefan Monnier wrote: > > char *foo = "foo; > > int bar = 5; > > char *baz = "baz"; > > The entire second line, and the third line, up to the first ", get string > > face. We've been used to this for so long that we've lost sight of just > > how bad and amateurish it really is. > But what about when you write > char *thedoc = "Here it is: > - First do this > - Then do that > And that's it!"; > ? > Both cases are valid transient states. Which one will occur more often > depends a lot on the particular kind of code you write and your > coding habits. I suggest the most common case by far will be writing char *foo = "foo.... in the middle of an existing buffer. > Emacs can't reliably distinguish the two cases, so whichever behavior it > chooses it will look "amateurish" in some cases. No, you've misunderstood my point. It is not the aesthetic "niceness", the lack of which is amateurish; it is fontifying as a string something which isn't a string (as defined by the compiler's error messages). > I think the better option here is to focus on the following: > 1- Make sure the programmer is aware there's a problem in its code. > I.e. highlight the opening quote or the non-escaped end-of-line or > something in bright red or something like that. > 2- Don't try to guess what the user intended to do. > Instead keep our code as simple as possible: the C code we're handed > is broken, so there's no real clear "right behavior" anyway. 
How about the following suggestion - instead of having permanent
string-fence syntax-table text properties to define the ends of
unterminated strings:
(i) We leave the syntax of the string opener and EOL alone;
(ii) we amend font-{lock,core}.el to apply the desired fontification, to
  be like the new fontification in CC Mode?

This could be done straightforwardly in font-lock by temporarily putting
the string-fence properties on these strings, applying all the face
properties, then removing these properties again. It would need a few
new customisation variables to specify what counts as an open string,
and so on.

> -- Stefan

--
Alan Mackenzie (Nuremberg, Germany).
* Re: Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] 2018-06-26 16:08 ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie @ 2018-06-26 20:02 ` João Távora 2018-06-28 23:56 ` Stefan Monnier 1 sibling, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-26 20:02 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Clément Pit-Claudel, Stefan Monnier, emacs-devel Alan Mackenzie <acm@muc.de> writes: >> But what about when you write >> char *thedoc = "Here it is: >> - First do this >> - Then do that >> And that's it!"; >> Both cases are valid transient states. Which one will occur more often >> depends a lot on the particular kind of code you write and your >> coding habits. > I suggest the most common case by far will be writing > char *foo = "foo.... > in the middle of an existing buffer. You can't know that, really. It's not just the users of electric-pair-mode, but users of other popular autopairing packages, or those autopair manually. Or users who mostly edit existing code. One could even speculate (just as tremulously) that more C code gets maintained than written these days. >> Emacs can't reliably distinguish the two cases, so whichever behavior it >> chooses it will look "amateurish" in some cases. > No, you've misunderstood my point. It is not the aesthetic "niceness", > the lack of which is amateurish; it is fontifying as a string something > which isn't a string (as defined by the compiler's error messages). It is just as wrong as fontifying something as C statements that isn't a C statement. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] 2018-06-26 16:08 ` Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] Alan Mackenzie 2018-06-26 20:02 ` João Távora @ 2018-06-28 23:56 ` Stefan Monnier 2018-06-29 0:43 ` Stefan Monnier 1 sibling, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-06-28 23:56 UTC (permalink / raw) To: Alan Mackenzie Cc: Clément Pit-Claudel, João Távora, emacs-devel >> I think the better option here is to focus on the following: >> 1- Make sure the programmer is aware there's a problem in its code. >> I.e. highlight the opening quote or the non-escaped end-of-line or >> something in bright red or something like that. >> 2- Don't try to guess what the user intended to do. >> Instead keep our code as simple as possible: the C code we're handed >> is broken, so there's no real clear "right behavior" anyway. > > How about the following suggestion - instead of having permanent > string-fence syntax-table text properties to define the ends of > unterminated strings: My suggestion has no "string-fence syntax-table" or any such thing, so I'm not sure what you're saying here. Before suggesting something else, could you clarify the downside you see with my proposal? > (ii) we amend font-{lock,core}.el to apply the desired fontification, to > be like the new fontification in CC Mode? This problem is not specific to C but it's not common to all programming languages either, so I think that modifying font-(lock|core).el for that would be a gross hack. 
We could do it somewhat cleanly by having font-lock.el provide some "standard" function that major modes could opt to use in their font-lock-* settings, of course, but I'm having a hard time imagining a solution with a nice semantics on that side: if we do it only by tweaking faces, then we get inconsistent behavior between highlighting and C-M-f, and if we do it by tweaking syntax-tables, then we get weird differences between the case where font-lock is used and where it's not used (e.g. between not-yet-displayed code and already displayed code, or between the case where font-lock-mode is enabled or not). `syntax-table` properties should be applied via syntax-propertize-function in order to be reliably available. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: Fontifying unterminated strings [was: CC Mode and electric-pair "problem".] 2018-06-28 23:56 ` Stefan Monnier @ 2018-06-29 0:43 ` Stefan Monnier 0 siblings, 0 replies; 93+ messages in thread From: Stefan Monnier @ 2018-06-29 0:43 UTC (permalink / raw) To: Alan Mackenzie Cc: Clément Pit-Claudel, João Távora, emacs-devel > My suggestion has no "string-fence syntax-table" or any such thing, so > I'm not sure what you're saying here. > Before suggesting something else, could you clarify the downside you see > with my proposal? Hmm... sorry: I'm catching up with some mail backlog and didn't read carefully. I noticed too late that I misread: your message was not in reply to my last suggestion but to some earlier discussion. So, please disregard the above text. I'll finish reading my mail before replying further. I think the following part of my answer is still correct, tho maybe out-of-date with subsequent discussion I'm about to discover. Stefan >> (ii) we amend font-{lock,core}.el to apply the desired fontification, to >> be like the new fontification in CC Mode? > > This problem is not specific to C but it's not common to all programming > languages either, so I think that modifying font-(lock|core).el > for that would be a gross hack. > > We could do it somewhat cleanly by having font-lock.el provide some > "standard" function that major modes could opt to use in their > font-lock-* settings, of course, but I'm having a hard time imagining > a solution with a nice semantics on that side: if we do it only by > tweaking faces, then we get inconsistent behavior between highlighting > and C-M-f, and if we do it by tweaking syntax-tables, then we get weird > differences between the case where font-lock is used and where it's not > used (e.g. between not-yet-displayed code and already displayed code, or > between the case where font-lock-mode is enabled or not). 
> > `syntax-table` properties should be applied via > syntax-propertize-function in order to be reliably available. > > > Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 17:01 ` João Távora 2018-06-18 18:07 ` Yuri Khan 2018-06-18 18:08 ` Alan Mackenzie @ 2018-06-18 22:41 ` Stephen Leake 2018-06-19 0:02 ` João Távora 2018-06-19 3:15 ` Clément Pit-Claudel 2018-06-19 5:02 ` Alan Mackenzie 3 siblings, 2 replies; 93+ messages in thread From: Stephen Leake @ 2018-06-18 22:41 UTC (permalink / raw) To: emacs-devel João Távora <joaotavora@gmail.com> writes: > Alan Mackenzie <acm@muc.de> writes: > >>> > OK, here goes. Why should major modes tie themselves in knots, just so >>> > that electric-pair-mode can work? What CC Mode is doing is natural, and >>> > matches the reality. >> >>> I think you mean "mode", in the singular form :-). >> >> No. CC Mode comprises lots of modes, not all of them maintained by me. >> But even aside from that, CC Mode has often been a pioneer, developing >> new techniques, which the rest of Emacs has then followed. Examples are >> hungry deletion and electric indentation. > > But they are all children of cc-mode.el right? I meant singular as in, > afaik, nobody else independently thought of doing that besides you. For what it's worth, I'm planning on adding "new line terminates string" to ada-mode. As Alan says, that is the way the compiler works. I was initially inspired independently, while working on an error-correcting parser, and found it in cc-mode while looking for ways to implement it. If electric-pair mode wants to support users splitting a string across lines, it should insert " before and after the newline; that's what I would expect from it. For me, it's more common to forget the closing " (possibly due to copy/paste), in which case terminating the string at the new line is more friendly. -- -- Stephe ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-18 22:41 ` CC Mode and electric-pair "problem" Stephen Leake @ 2018-06-19 0:02 ` João Távora 2018-06-19 3:15 ` Clément Pit-Claudel 1 sibling, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-19 0:02 UTC (permalink / raw) To: Stephen Leake; +Cc: emacs-devel Stephen Leake <stephen_leake@stephe-leake.org> writes: > If electric-pair mode wants to support users splitting a string across > lines, it should insert " before and after the newline; that's what I > would expect from it. Well, I don't understand the specific relevance of your example, e-p-m doesn't "support" that, Emacs does. But FWIW your expectation is exactly what it does now (you insert one quote, get two, then you enter the newline). So that's not the problem, the problem in e-p-m is a corner-case of whitespace "chomping", that shouldn't hopefully be very hard to fix. My objections are beyond electric-pair-mode. I was telling Alan how this breaks sexp-based navigation, for instance. > For me, it's more common to forget the closing " (possibly due to > copy/paste), in which case terminating the string at the new line is > more friendly. Indeed, if you frequently do this, it's somewhat nicer not to paint the rest of the buffer purple (or font-lock-string-face). But now you know about the drawbacks. João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: Clément Pit-Claudel @ 2018-06-19 3:15 UTC
To: emacs-devel

On 2018-06-18 18:41, Stephen Leake wrote:
> For what it's worth, I'm planning on adding "new line terminates string"
> to ada-mode. As Alan says, that is the way the compiler works. I was
> initially inspired independently, while working on an error-correcting
> parser, and found it in cc-mode while looking for ways to implement it.

Sorry for jumping in a bit late. Does that mean that after the change an
unclosed quote will only cause refontification up to the end of the
line? That would be a very nice improvement. I don't use
electric-pair-mode, and as things currently stand inserting an unmatched
quote applies font-lock-string-face to the entire buffer, which is a bit
annoying.

Clément.
* Re: CC Mode and electric-pair "problem".
From: João Távora @ 2018-06-19 8:16 UTC
To: Clément Pit-Claudel; +Cc: emacs-devel

Clément Pit-Claudel <cpitclaudel@gmail.com> writes:

> On 2018-06-18 18:41, Stephen Leake wrote:
>> For what it's worth, I'm planning on adding "new line terminates string"
>> to ada-mode. As Alan says, that is the way the compiler works. I was
>> initially inspired independently, while working on an error-correcting
>> parser, and found it in cc-mode while looking for ways to implement it.
> Sorry for jumping in a bit late. Does that mean that after the
> change an unclosed quote will only cause refontification up to the
> end of the line? That would be a very nice improvement. I don't use
> electric-pair-mode, and as things currently stand inserting an

Again, see my reply to Stephen. This has very little to do with
electric-pair-mode now. If you use sexp-based navigation,
blink-matching-paren, or show-paren-mode you will see considerable
differences in behaviour.

FWIW I like the non-fontification part, too. But it comes with a hefty
price.

João
* Re: CC Mode and electric-pair "problem". 2018-06-18 17:01 ` João Távora ` (2 preceding siblings ...) 2018-06-18 22:41 ` CC Mode and electric-pair "problem" Stephen Leake @ 2018-06-19 5:02 ` Alan Mackenzie 2018-06-20 14:16 ` Stefan Monnier 2018-06-26 18:52 ` Alan Mackenzie 3 siblings, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-19 5:02 UTC (permalink / raw) To: João Távora; +Cc: Glenn Morris, Tino Calancha, Emacs developers Hello, João. On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote: [ .... ] Maybe we're looking at this the wrong way. How about this idea: we add a new syntax flag to Emacs, ", which terminates any open string, the same way the syntax > terminates any open comment. We could then set this syntax flag on newline. This would have the disadvantage (for CC Mode) that it wouldn't work with older Emacsen. But it might solve the various problems we've stumbled over in the last few days. > João -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
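[Editorial illustration] The asymmetry Alan refers to can be seen in a small sketch. It is an illustration only: the function name demo-newline-syntax and the toy syntax table are assumptions made for this example, not CC Mode's actual setup. It shows that a newline with comment-end syntax `>' closes an open comment, whereas nothing closes an open string at the end of a line.

```elisp
;; Illustration only: a toy syntax table, not the one CC Mode uses.
(defun demo-newline-syntax ()
  "Return (COMMENT-OPEN-AFTER-NL . STRING-OPEN-AFTER-NL) for a toy buffer."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      ;; `#' starts a comment; newline ends it (syntax class `>').
      (modify-syntax-entry ?# "<" table)
      (modify-syntax-entry ?\n ">" table)
      (set-syntax-table table)
      (insert "# a comment\n")
      (let ((comment-open (nth 4 (parse-partial-sexp (point-min) (point)))))
        ;; An opening quote with no closing quote on its line.
        (insert "\"unterminated\nnext line")
        (cons comment-open
              (nth 3 (parse-partial-sexp (point-min) (point))))))))
```

The car of the result is the comment state just after the newline, and the cdr is the string state at the end of the buffer; the latter stays non-nil because the parser still considers itself inside the string on the following line.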
* Re: CC Mode and electric-pair "problem". 2018-06-19 5:02 ` Alan Mackenzie @ 2018-06-20 14:16 ` Stefan Monnier 2018-06-26 18:23 ` Alan Mackenzie 2018-06-27 18:27 ` Alan Mackenzie 2018-06-26 18:52 ` Alan Mackenzie 1 sibling, 2 replies; 93+ messages in thread From: Stefan Monnier @ 2018-06-20 14:16 UTC (permalink / raw) To: emacs-devel > How about this idea: we add a new syntax flag to Emacs, ", which > terminates any open string, the same way the syntax > terminates any > open comment. We could then set this syntax flag on newline. To me this looks like adding a hack to patch over another. I don't think the new behavior of unclosed strings in CC-mode is worse than the old one, but I don't think it's really better either: it's just different (in some cases it's better in others it's worse). So the problem I see with it is that it brings complexity in the code with no real improvement in terms of behavior. The bad-interaction with electric-pair shows that this complexity has a real immediate cost. The suggestion above suggests that this complexity may bring in yet more complexity. Me not happy. If the purpose of the change is to address use cases such as Clément's: > Sorry for jumping in a bit late. Does that mean that after the changed an > unclosed quote will only cause refontification up to the end of the line? > That would be a very nice improvement. I don't use electric-pair-mode, and > as things currently stand inserting an unmatched quote applies > font-lock-string-face to the entire buffer, which is a bit annoying. How 'bout taking an approach that will have much fewer side-effects: Instead of adding the complexity at the low-level of syntax-tables to make strings "magically" terminate at EOL, hook into self-insert-command: when inserting a ", add a matching " at EOL if needed, or remove the " that we added at EOL earlier. Something like (guaranteed 100% tested, of course. 
No animals were harmed):

    (add-hook 'post-self-insert-hook
              (lambda ()
                (when (memq last-command-event '(?\" ?\'))
                  (save-excursion
                    (let ((pos (point))
                          (ppss (syntax-ppss (line-end-position))))
                      (when (and (nth 3 ppss)        ;; EOL within a string
                                 (not (nth 5 ppss))) ;; EOL not escaped
                        (if (and (> (point) pos)
                                 (eq last-command-event (char-before)))
                            ;; Remove extraneous unmatched " at EOL.
                            (delete-region (1- (point)) (point))
                          (insert last-command-event)))))))
              'append 'local)

I used `append` to try and make it interact better with electric-pair-mode.

        Stefan
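[Editorial illustration] To make the intended effect of Stefan's hook concrete, here is a small harness that calls the same logic directly in a temporary buffer. The names demo-eol-string-closer, demo-type-quote and demo-eol-string-closer-run are hypothetical, and the harness simulates typing by inserting the character and binding last-command-event, rather than going through self-insert-command. It is a sketch under those assumptions, not a tested replacement for anything in CC Mode.

```elisp
;; Hypothetical named version of the hook body, so that it can be
;; called directly and removed again with `remove-hook'.
(defun demo-eol-string-closer ()
  "Add or remove a string delimiter at EOL after a quote was typed."
  (when (memq last-command-event '(?\" ?\'))
    (save-excursion
      (let ((pos (point))
            ;; `syntax-ppss' with an argument moves point to that position.
            (ppss (syntax-ppss (line-end-position))))
        (when (and (nth 3 ppss)          ; EOL within a string
                   (not (nth 5 ppss)))   ; EOL not escaped
          (if (and (> (point) pos)
                   (eq last-command-event (char-before)))
              ;; Remove the extraneous unmatched delimiter at EOL.
              (delete-region (1- (point)) (point))
            (insert last-command-event)))))))

(defun demo-type-quote ()
  "Simulate typing a double quote at point, then run the closer."
  (insert "\"")
  (let ((last-command-event ?\"))
    (demo-eol-string-closer)))

(defun demo-eol-string-closer-run ()
  "Return the buffer text after typing the opener, then the closer."
  (with-temp-buffer
    (insert "foo")
    (goto-char (point-min))
    (demo-type-quote)                   ; type the opening quote
    (let ((after-open (buffer-string)))
      (goto-char (1- (point-max)))      ; just before the added quote
      (demo-type-quote)                 ; type the real closing quote
      (list after-open (buffer-string)))))
```

The first element of the result is the buffer after the opening quote was typed (a delimiter has been added at EOL); the second is the buffer after the user types the real closing quote, at which point the temporary delimiter has been removed again.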
* Re: CC Mode and electric-pair "problem". 2018-06-20 14:16 ` Stefan Monnier @ 2018-06-26 18:23 ` Alan Mackenzie 2018-06-27 13:37 ` João Távora 2018-06-29 3:42 ` Stefan Monnier 2018-06-27 18:27 ` Alan Mackenzie 1 sibling, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-26 18:23 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Hello, Stefan. On Wed, Jun 20, 2018 at 10:16:05 -0400, Stefan Monnier wrote: [ .... ] > If the purpose of the change is to address use cases such as Clément's: > > Sorry for jumping in a bit late. Does that mean that after the changed an > > unclosed quote will only cause refontification up to the end of the line? > > That would be a very nice improvement. I don't use electric-pair-mode, and > > as things currently stand inserting an unmatched quote applies > > font-lock-string-face to the entire buffer, which is a bit annoying. > How 'bout taking an approach that will have much fewer side-effects: > Instead of adding the complexity at the low-level of syntax-tables > to make strings "magically" terminate at EOL, hook into > self-insert-command: > when inserting a ", add a matching " at EOL if needed, or remove > the " that we added at EOL earlier. > Something like (guaranteed 100% tested, of course. No animals were harmed): > (add-hook 'post-self-insert-hook > (lambda () > (when (memq last-command-event '(?\" ?\')) > (save-excursion > (let ((pos (point)) > (ppss (syntax-ppss (line-end-position)))) > (when (and (nth 3 ppss) ;; EOL within a string > (not (nth 5 ppss))) ;; EOL not escaped > (if (and (> (point) pos) > (eq last-command-event (char-before))) > ;; Remove extraneous unmatched " at EOL. > (delete-region (1- (point)) (point)) > (insert last-command-event))))))) > 'append 'local) This is effectively electric-pair-mode, which if enabled, already inserts two "s when you type ". Not everybody likes electric-pair-mode. I don't think your suggestion is any better than mine (snipped) to which you replied. 
> I used `append` to try and make it interact better with electric-pair-mode. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-26 18:23 ` Alan Mackenzie @ 2018-06-27 13:37 ` João Távora 2018-06-29 3:42 ` Stefan Monnier 1 sibling, 0 replies; 93+ messages in thread From: João Távora @ 2018-06-27 13:37 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel Alan Mackenzie <acm@muc.de> writes: Hi Alan >> > unclosed quote will only cause refontification up to the end of the line? >> > That would be a very nice improvement. I don't use electric-pair-mode, and >> > as things currently stand inserting an unmatched quote applies >> > font-lock-string-face to the entire buffer, which is a bit annoying. >> (add-hook 'post-self-insert-hook >> (lambda () >> ... >> (insert last-command-event))))))) >> 'append 'local) > > This is effectively electric-pair-mode, which if enabled, already > inserts two "s when you type ". > > Not everybody likes electric-pair-mode. I don't think your suggestion > is any better than mine (snipped) to which you replied. To be perfectly honest, I got confused by Stefan's suggestion, too. If the goal is to have electric-pair-mode-like behaviour, just turn on electric-pair-mode. I'd just like to point out, however, that automatically pairing quotes and parens extends far beyond electric-pair-mode. Of course I think it's the best of the bunch, but there are other popular packages like smartparens, paredit, wrap-region, textmate (and even my previous autopair which some insist on using for some reason). João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-26 18:23 ` Alan Mackenzie 2018-06-27 13:37 ` João Távora @ 2018-06-29 3:42 ` Stefan Monnier 2018-06-30 18:09 ` Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-06-29 3:42 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel > This is effectively electric-pair-mode, which if enabled, already > inserts two "s when you type ". It's very different: it inserts/removes the second " at the end of the line, so it ends up behaving very much like your current code, except: - it only affects self-insert-command. - it uses an explicit " character rather than a syntax-table text-property. So OT1H it provides a behavior closer to current `master` than to electric-pair-mode, but like electric-pair-mode it has a fairly focus'd effect, so is less likely to have unexpected interactions. > Not everybody likes electric-pair-mode. I don't think your suggestion > is any better than mine (snipped) to which you replied. Its main benefit is that it's very superficial with a narrow focus. No need to change any core API like syntax-tables with a feature which will need to be supported for the next very many years. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-29 3:42 ` Stefan Monnier @ 2018-06-30 18:09 ` Alan Mackenzie 2018-07-01 3:37 ` Stefan Monnier 2018-07-01 15:57 ` Paul Eggert 0 siblings, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-30 18:09 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Hello, Stefan. On Thu, Jun 28, 2018 at 23:42:34 -0400, Stefan Monnier wrote: > > This is effectively electric-pair-mode, which if enabled, already > > inserts two "s when you type ". > It's very different: it inserts/removes the second " at the end of the > line, so it ends up behaving very much like your current code, except: > - it only affects self-insert-command. > - it uses an explicit " character rather than a syntax-table text-property. Some people, including me, find the insertion of characters they haven't typed (aside from tabs/spaces for indentation) annoying. It's good that there are minor modes that can do this, but it's not the way to solve the current difficulty. > So OT1H it provides a behavior closer to current `master` than to > electric-pair-mode, but like electric-pair-mode it has a fairly focus'd > effect, so is less likely to have unexpected interactions. > > Not everybody likes electric-pair-mode. I don't think your suggestion > > is any better than mine (snipped) to which you replied. > Its main benefit is that it's very superficial with a narrow focus. > No need to change any core API like syntax-tables with a feature which > will need to be supported for the next very many years. But it doesn't really address the problem. That problem is how to fontify unterminated strings (in both senses of the word "how"). Up till now, Emacs hasn't bothered - it just allows these strings, and the subsequent buffer portion, to be fontified randomly. I think such a string should have string face up till the first unescaped newline (in modes where escaped NLs are required for multiline strings). 
I can't see any other way anybody would want such a construct to be fontified. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: Stefan Monnier @ 2018-07-01 3:37 UTC
To: Alan Mackenzie; +Cc: emacs-devel

> Some people, including me, find the insertion of characters they haven't
> typed (aside from tabs/spaces for indentation) annoying.

Don't think of it as a character, think of it as "a special
syntax-table string-closer that's not really in the buffer": it will be
automatically removed when you fix the unterminated string anyway.

For that reason it's very different from electric-pair-mode: the " char
it adds is not meant to save you from typing the closing string
delimiter, it's just a technical device to bound temporarily the extent
of the string until the time you close it.

> It's good that there are minor modes that can do this, but it's not
> the way to solve the current difficulty.

Maybe the sample implementation I provided is not quite right, but
I think the approach of temporarily inserting a " instead of messing
with syntax-table properties is actually much closer to the current
CC-mode behavior than to electric-pair-mode.

> But it doesn't really address the problem. That problem is how to
> fontify unterminated strings (in both senses of the word "how").

An unterminated string can only occur in an invalid piece of code.
To the extent that invalid code has no clear meaning, there's no way
to know what is really the "right" behavior.

My point of view is that Emacs should focus on behaving as correctly as
possible for valid code. The only effort worth doing w.r.t invalid code
is to avoid doing something clearly harmful and to help the user make
the code valid again. Anything further than that is time that would be
better spent improving the handling of valid code.
I don't see any concrete benefit (for the user) of the new behavior over the old (or the reverse for that matter). Either behavior is equally good and which behavior is better will depend on things which Emacs cannot know unless the user explicitly tells us. > Up till now, Emacs hasn't bothered - it just allows these strings, and the > subsequent buffer portion, to be fontified randomly. It's not random: it's arbitrary. The new behavior is also arbitrary. AFAIK you have no statistical data to claim that your new behavior is more often better than worse (and even less data to claim that the difference is significant). So it's mostly different. > I think such a string should have string face up till the first > unescaped newline (in modes where escaped NLs are required for > multiline strings). Yes, we saw that. Some other users agree. Yet others disagree. Personally, as a user, I don't really care which behavior I get: it's a rare transient situation which I'll fix soon anyway, whether Emacs tells me about it or not. OTOH, there is very concrete evidence that the new behavior is worse in the sense that it adds complexity to the code and (as expected) introduces bugs. To me, this is a bad tradeoff. > I can't see any other way anybody would want such a construct > to be fontified. That's just a lack of imagination on your part. Tho it also means you haven't made the effort to appreciate some of the scenarios people have presented here where the old behavior is preferable. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 3:37 ` Stefan Monnier @ 2018-07-01 15:24 ` Eli Zaretskii 2018-07-06 21:58 ` Stephen Leake 1 sibling, 0 replies; 93+ messages in thread From: Eli Zaretskii @ 2018-07-01 15:24 UTC (permalink / raw) To: Stefan Monnier; +Cc: acm, emacs-devel > From: Stefan Monnier <monnier@IRO.UMontreal.CA> > Date: Sat, 30 Jun 2018 23:37:33 -0400 > Cc: emacs-devel@gnu.org > > My point of view is that Emacs should focus on behaving as correctly as > possible for valid code. The only effort worth doing w.r.t invalid code > is to avoid doing something clearly harmful and to help the user make > the code valid again. Anything further than that is time that would be > better spent improving the handling of valid code. > > I don't see any concrete benefit (for the user) of the new behavior over > the old (or the reverse for that matter). Either behavior is equally > good and which behavior is better will depend on things which Emacs > cannot know unless the user explicitly tells us. > [...] > Personally, as a user, I don't really care which behavior I get: it's > a rare transient situation which I'll fix soon anyway, whether Emacs > tells me about it or not. This reflects my opinions as well. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: Stephen Leake @ 2018-07-06 21:58 UTC
To: emacs-devel

Stefan Monnier <monnier@IRO.UMontreal.CA> writes:

> An unterminated string can only occur in an invalid piece of code.
> To the extent that invalid code has no clear meaning, there's no way
> to know what is really the "right" behavior.

True, but I still agree with Alan; treat the newline as a string
terminator for fontification.

> My point of view is that Emacs should focus on behaving as correctly as
> possible for valid code. The only effort worth doing w.r.t invalid code
> is to avoid doing something clearly harmful and to help the user make
> the code valid again. Anything further than that is time that would be
> better spent improving the handling of valid code.

I disagree. When we are editing code, it has incorrect syntax most of
the time, yet we still ask Emacs to fontify and indent it. So it is a
strong requirement that Emacs work "acceptably well" in this context.

I'm working on adding robust error correction to my Ada parser,
precisely for this purpose.

> I don't see any concrete benefit (for the user) of the new behavior over
> the old (or the reverse for that matter). Either behavior is equally
> good and which behavior is better will depend on things which Emacs
> cannot know unless the user explicitly tells us.

Right. It would be nice to have the "terminate string on newline"
behavior as an option.

>> Up till now, Emacs hasn't bothered - it just allows these strings, and the
>> subsequent buffer portion, to be fontified randomly.
>
> It's not random: it's arbitrary. The new behavior is also arbitrary.

Right.

> OTOH, there is very concrete evidence that the new behavior is worse in
> the sense that it adds complexity to the code and (as expected)
> introduces bugs.
> > To me, this is a bad tradeoff. Ok. I'm hoping for a coding solution that is not as complex. -- -- Stephe ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-30 18:09 ` Alan Mackenzie 2018-07-01 3:37 ` Stefan Monnier @ 2018-07-01 15:57 ` Paul Eggert 1 sibling, 0 replies; 93+ messages in thread From: Paul Eggert @ 2018-07-01 15:57 UTC (permalink / raw) To: Alan Mackenzie, Stefan Monnier; +Cc: emacs-devel Alan Mackenzie wrote: > Some people, including me, find the insertion of characters they haven't > typed (aside from tabs/spaces for indentation) annoying. Hey, I often find the insertion of tabs and spaces annoying! And it's gotten worse recently, in CC-mode. It's often reindenting when I don't want it to, and for C99 constructs like designated initializers it indents in ways that are increasingly unhelpful. (Yes, I know, I know, I should file bug reports....) ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: Alan Mackenzie @ 2018-06-27 18:27 UTC
To: Stefan Monnier
Cc: Clément Pit-Claudel, Stephen Leake, João Távora, emacs-devel

Hello, yet again, Stefan.

On Wed, Jun 20, 2018 at 10:16:05 -0400, Stefan Monnier wrote:
> > How about this idea: we add a new syntax flag to Emacs, ", which
> > terminates any open string, the same way the syntax > terminates any
> > open comment. We could then set this syntax flag on newline.

I've been making negative comments about this suggestion of mine over
the last day or two. I now believe, again, that the proposal is sound;
it would allow the desired fontification (an unterminated string being
fontified only up to the next unescaped NL) easily, without interfering
with the 'chomp facility in electric-pair-mode.

> To me this looks like adding a hack to patch over another.

> I don't think the new behavior of unclosed strings in CC-mode is worse
> than the old one, but I don't think it's really better either: it's just
> different (in some cases it's better in others it's worse).

This new fontification in CC Mode is much better. When the terminating
" is missing, it no longer fontifies spuriously an unbounded piece of
the buffer as a string. Clément and Stephen Leake have responded
positively to this possibility. I think we should enhance Emacs such
that it is easy to fontify in this new way.

> So the problem I see with it is that it brings complexity in the code
> with no real improvement in terms of behavior. The bad-interaction with
> electric-pair shows that this complexity has a real immediate cost.
> The suggestion above suggests that this complexity may bring in yet
> more complexity.

The desired facility _is_ complicated. I am not fond of the code in CC
Mode which implements it.
The question is, where do we put this complexity? With my suggestion,
it would be confined mainly to the scanning routines in syntax.c rather
than being spread (and duplicated) amongst several major modes.

Adapting the forward scanning functionality would be straightforward.
Things like (scan-sexps POS -1) would indeed become more difficult. For
example, starting at BONL, (scan-sexps BONL -1) in

    "foo bar

would need to find the ", but the same in

    "foo"bar

would need to find the start of bar. In other words, we would have to
pair off quotes from the beginning of the line we were scanning
backwards over. There may well be difficulties in a NL potentially
acting as the terminator of both a string and a comment. I think these
are the sorts of complexity you're wary of.

font-lock-fontify-syntactically-region could then be amended
straightforwardly to apply warning-face to the opening unbalanced "
(controlled, of course, by a customisation option).

> Me not happy.

My suggestion has the strong advantage that it will benefit Emacs as a
whole, and there won't need to be separate implementations in CC Mode,
Python Mode, Ada Mode, ..... The need for a multiline string to have
escaped NLs between its lines is actually a common pattern in the
languages Emacs handles. Why can we not handle it in syntax.c?

[ .... ]

> Stefan

--
Alan Mackenzie (Nuremberg, Germany).
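[Editorial illustration] For comparison, this is how the existing backward scanner treats the two example lines. The helper demo-backward-scan is a hypothetical name, and the sketch uses a temporary buffer with the standard syntax table rather than a CC Mode buffer, scanning back from the end of the text instead of from BONL.

```elisp
;; Illustration only: plain `scan-sexps' in a temporary buffer using
;; the standard syntax table, not a CC Mode buffer.
(defun demo-backward-scan (text)
  "Return the position `scan-sexps' reaches going back from the end of TEXT."
  (with-temp-buffer
    (insert text)
    (scan-sexps (point-max) -1)))

(list (demo-backward-scan "\"foo bar")    ; unterminated string
      (demo-backward-scan "\"foo\"bar"))  ; terminated string, then bar
```

With the scanner as it stands, both calls should stop at the start of bar, since a backward scan over a word has no way of knowing whether the line began inside a string. As I read the proposal, the first call would instead have to reach the opening " at position 1, which is where the quote-pairing work described above comes from.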
* Re: CC Mode and electric-pair "problem". 2018-06-27 18:27 ` Alan Mackenzie @ 2018-06-29 4:11 ` Stefan Monnier 2018-06-30 19:03 ` Alan Mackenzie 0 siblings, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-06-29 4:11 UTC (permalink / raw) To: Alan Mackenzie Cc: Clément Pit-Claudel, Stephen Leake, João Távora, emacs-devel >> > How about this idea: we add a new syntax flag to Emacs, ", which >> > terminates any open string, the same way the syntax > terminates any >> > open comment. We could then set this syntax flag on newline. > I've been making negative comments about this suggestion of mine over > the last day or two. I now believe, again, that the proposal is sound; It's definitely sound. And I very much agree that it could be cleaner than the current code on `master`. I dislike this solution mainly because it requires changes to Emacs's core API, so it bumps against my feeling that the need is not clearly documented: you think the new behavior is more often beneficial than the old behavior but we have no actual data to verify it. FWIW, I do not know that the old behavior is more often beneficial either, but I'm definitely not convinced that the new behavior is often enough more beneficial to justify such changes to syntax-tables. But that's for Eli to judge. So let's look at the technical issues: You suggest introducing a new syntax-table thingy similar to > but for strings. Let's call it ] - This implies we'll need a new C-level function `back_string` to jump backward over such a ]-terminated string, corresponding to back_comment. `back_comment` has proved to be rather nasty, so while we can learn from it, part of what we learn is that jumping backward over such things is much easier than jumping forward, so this innocuous ] will be more costly than might meet the eye. 
- In CC-mode, \n already has syntax > so it can't also have syntax ]
  How do you intend to deal with that: will you mark those few \n that
  terminate strings with syntax-table text-properties?
  If so, what's the benefit over using string-fences?

- Another approach would be to make it possible to mark \n as both ] and
  > at the same time, which would make the CC-mode feature much cleaner
  (no need to muck with syntax-table text-properties) but at the cost of
  yet more complexity in the syntax.c code.

> would need to find the start of bar. In other words, we would have to
> pair off quotes from the beginning of the line we were scanning
> backwards over. There may well be difficulties in a NL potentially
> acting as the terminator of both a string and a comment. I think these
> are the sorts of complexity you're wary of.

Yes.

> My suggestion has the strong advantage that it will benefit Emacs as a
> whole, and there won't need to be separate implementations in CC Mode,
> Python Mode, Ada Mode, ..... The need for a multiline string to have
> escaped NLs between its lines is actually a common pattern in the
> languages Emacs handles. Why can we not handle it in syntax.c?

Emacs has handled it for the last 30 years or so. You just want to
handle it in a different way. I agree that Emacs's core should ideally
make it easy for a major mode to choose this "different way".

But the way I see it, your suggestion is just adding one more wart to
syntax-tables whereas we should instead work on "syntax-tables NG".

IOW, I think that we should introduce a brand new replacement for
syntax-tables (tho I don't really know what it should look like,
otherwise I'd have coded it up already); something much more powerful
and generic (probably based on a mix of a DFA at one level and some kind
of push-down automata on top of it), and such a thing could/should
easily accommodate such a feature without even needing any
ad-hoc support.

        Stefan
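[Editorial illustration] The string-fence alternative Stefan mentions can be sketched as follows. This shows the general technique of marking characters with a `syntax-table' text property; the function name demo-string-fence is hypothetical, and the code is an assumption-laden sketch, not what CC Mode on master actually does.

```elisp
;; Sketch of the general string-fence technique; not CC Mode's code.
(defun demo-string-fence ()
  "Return (WITHOUT-FENCES WITH-FENCES): string state at end of buffer."
  (with-temp-buffer
    (insert "\"foo\nbar")
    (let* ((parse-sexp-lookup-properties t)
           ;; Syntax class `|' is the generic string delimiter (fence).
           (fence (string-to-syntax "|"))
           (without (nth 3 (parse-partial-sexp (point-min) (point-max)))))
      ;; Mark the opening quote (position 1) and the newline (position 5)
      ;; as string fences via the `syntax-table' text property.
      (put-text-property 1 2 'syntax-table fence)
      (put-text-property 5 6 'syntax-table fence)
      (list without
            (nth 3 (parse-partial-sexp (point-min) (point-max)))))))
```

Without the properties the parser is still inside the string at the end of the buffer; with them, the fenced newline closes the string, so the second line is parsed as ordinary code. The price is that the opening " no longer has ordinary string-quote syntax, which is the kind of side effect on other packages discussed in this thread.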
* Re: CC Mode and electric-pair "problem".
From: Alan Mackenzie @ 2018-06-30 19:03 UTC
To: Stefan Monnier
Cc: Clément Pit-Claudel, Stephen Leake, João Távora, emacs-devel

Hello, Stefan.

On Fri, Jun 29, 2018 at 00:11:24 -0400, Stefan Monnier wrote:
> >> > How about this idea: we add a new syntax flag to Emacs, ", which
> >> > terminates any open string, the same way the syntax > terminates any
> >> > open comment. We could then set this syntax flag on newline.
> > I've been making negative comments about this suggestion of mine over
> > the last day or two. I now believe, again, that the proposal is sound;

> It's definitely sound. And I very much agree that it could be cleaner
> than the current code on `master`. I dislike this solution mainly
> because it requires changes to Emacs's core API, so it bumps against my
> feeling that the need is not clearly documented: you think the new
> behavior is more often beneficial than the old behavior but we have no
> actual data to verify it.

No, what I think is much less nuanced: that the old behaviour is simply
wrong; the new behaviour is likewise correct. If one were to design an
editor's functionality from scratch, nobody would advocate the old
behaviour - it happened because it needed no implementation effort.

> FWIW, I do not know that the old behavior is more often beneficial
> either, but I'm definitely not convinced that the new behavior is
> often enough more beneficial to justify such changes to syntax-tables.

I am in the middle of writing a trial implementation (code speaks louder
than words). Thus far, it has already worked in shell-script-mode
(which required a one-line change, this:

    - ?\n ">#"
    + ?\n ">#s"

the new `s' flag is how I've constructed it, so far).
> But that's for Eli to judge. > So let's look at the technical issues: > You suggest introducing a new syntax-table thingy similar to > but for > strings. Let's call it ] As I noted above, I have implemented it as another flag, `s'. > - This implies we'll need a new C-level function `back_string` to jump > backward over such a ]-terminated string, corresponding to > back_comment. Yes. > `back_comment` has proved to be rather nasty, so while > we can learn from it, part of what we learn is that jumping backward > over such things is much easier .... much less easy. :-) > .... than jumping forward, so this > innocuous ] will be more costly than might meet the eye. It requires the new function, which at the moment seems somewhat less complicated than back_comment, and this requires to be called from scan_lists. > - In CC-mode, \n already has syntax > so it can't also have syntax ] > How do you intend to deal with that: will you mark those few \n that > terminate strings with syntax-table text-properties? This is simple with the flag `s'. NL would thus have end-comment syntax _and_ the `s' flag. In scan_lists, back_comment will be tried before what I'm calling `back_maybe_string', since being a comment ender must have precedence over being a string terminator. > If so, what's the benefit over using string-fences? String-fence stopped the 'chomp facility of electric-pair-mode working properly (for the currently accepted value of "properly"). > - Another approach would be to make it possible to mark \n as both ] and > > at the same time, which would make the CC-mode feature much cleaner > (no need to muck with syntax-table text-properties) but the cost of > yet more complexity in the syntax.c code. That's what I'm doing with `s'. The extra complexity in syntax.c doesn't seem all that bad at the moment. 
back_maybe_string is currently 137 lines long (including a macro analogous to INC_FROM, and a lossage: clause modelled on the one in back_comment)), compared with back_comment's 289 lines. I'm planning on committing this new code to a branch in the next few days, then you can judge better whether the new facility is worth it. [ .... ] > > My suggestion has the strong advantage that it will benefit Emacs as a > > whole, and there won't need to be separate implementations in CC Mode, > > Python Mode, Ada Mode, ..... The need for a multilinne string to have > > escaped NLs between its lines is actually a common pattern in the > > languages Emacs handles. Why can we not handle it in syntax.c? > Emacs has handled it for the last 30 years or so. You just want to > handle it in a different way. I agree that Emacs's core should ideally > make it easy for a major mode to choose this "different way". > But the way I see it, your suggestion is just adding one more wart to > syntax-tables whereas we should instead work on "syntax-tables NG". > IOW, I think that we should introduce a brand new replacement for > syntax-tables (tho I don't really know what it should look like, > otherwise I'd have coded it up already); something much more powerful > and generic (probably based on a mix of a DFA at one level and some kind > of push-down automata on top of it), and such a thing could/should > easily accommodate such a feature without even needing any > ad-hoc support. "S-T-NG" may be fine for Emacs 28 or 29, but the syntax table is what we have, and what we must work with in the short term. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-30 19:03 ` Alan Mackenzie @ 2018-06-30 19:29 ` Eli Zaretskii 2018-06-30 20:14 ` Alan Mackenzie 2018-07-01 4:02 ` CC Mode and electric-pair "problem" Stefan Monnier 1 sibling, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-06-30 19:29 UTC (permalink / raw) To: Alan Mackenzie Cc: cpitclaudel, emacs-devel, stephen_leake, monnier, joaotavora > Date: Sat, 30 Jun 2018 19:03:27 +0000 > From: Alan Mackenzie <acm@muc.de> > Cc: Clément Pit-Claudel <cpitclaudel@gmail.com>, > Stephen Leake <stephen_leake@stephe-leake.org>, > João Távora <joaotavora@gmail.com>, > emacs-devel@gnu.org > > I am in the middle of writing a trial implementation (code speaks louder > than words). Thus far, it has already worked in shell-script-mode > (which required a one-line change, this: > > - ?\n ">#" > + ?\n ">#s" > > the new `s' flag is how I've constructed it, so far). > > > But that's for Eli to judge. > > > So let's look at the technical issues: > > You suggest introducing a new syntax-table thingy similar to > but for > > strings. Let's call it ] > > As I noted above, I have implemented it as another flag, `s'. > > > - This implies we'll need a new C-level function `back_string` to jump > > backward over such a ]-terminated string, corresponding to > > back_comment. > > Yes. > > > `back_comment` has proved to be rather nasty, so while > > we can learn from it, part of what we learn is that jumping backward > > over such things is much easier .... > > much less easy. :-) > > > .... than jumping forward, so this > > innocuous ] will be more costly than might meet the eye. > > It requires the new function, which at the moment seems somewhat less > complicated than back_comment, and this requires to be called from > scan_lists. > > > - In CC-mode, \n already has syntax > so it can't also have syntax ] > > How do you intend to deal with that: will you mark those few \n that > > terminate strings with syntax-table text-properties? 
> > This is simple with the flag `s'. NL would thus have end-comment syntax > _and_ the `s' flag. In scan_lists, back_comment will be tried before > what I'm calling `back_maybe_string', since being a comment ender must have > precedence over being a string terminator. > > > If so, what's the benefit over using string-fences? > > String-fence stopped the 'chomp facility of electric-pair-mode working > properly (for the currently accepted value of "properly"). > > > - Another approach would be to make it possible to mark \n as both ] and > > > at the same time, which would make the CC-mode feature much cleaner > > (no need to muck with syntax-table text-properties) but the cost of > > yet more complexity in the syntax.c code. > > That's what I'm doing with `s'. The extra complexity in syntax.c > doesn't seem all that bad at the moment. back_maybe_string is currently > 137 lines long (including a macro analogous to INC_FROM, and a lossage: > clause modelled on the one in back_comment)), compared with > back_comment's 289 lines. I'm planning on committing this new code to a > branch in the next few days, then you can judge better whether the new > facility is worth it. Could you please recap what problem(s) you are trying to fix with these changes? (I'm sorry for not following, but this thread spans two months and many long messages with several days in-between. It's hard to keep focused on the main issues.) Thanks. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-30 19:29 ` Eli Zaretskii @ 2018-06-30 20:14 ` Alan Mackenzie 2018-07-01 3:50 ` Stefan Monnier ` (2 more replies) 0 siblings, 3 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-06-30 20:14 UTC (permalink / raw) To: Eli Zaretskii Cc: joaotavora, cpitclaudel, stephen_leake, monnier, emacs-devel Hello, Eli. On Sat, Jun 30, 2018 at 22:29:12 +0300, Eli Zaretskii wrote: > > Date: Sat, 30 Jun 2018 19:03:27 +0000 > > From: Alan Mackenzie <acm@muc.de> > > Cc: Clément Pit-Claudel <cpitclaudel@gmail.com>, > > Stephen Leake <stephen_leake@stephe-leake.org>, > > João Távora <joaotavora@gmail.com>, > > emacs-devel@gnu.org > > I am in the middle of writing a trial implementation (code speaks louder > > than words). Thus far, it has already worked in shell-script-mode > > (which required a one-line change, this: > > - ?\n ">#" > > + ?\n ">#s" > > the new `s' flag is how I've constructed it, so far). > > > But that's for Eli to judge. > > > So let's look at the technical issues: > > > You suggest introducing a new syntax-table thingy similar to > but for > > > strings. Let's call it ] > > As I noted above, I have implemented it as another flag, `s'. > > > - This implies we'll need a new C-level function `back_string` to jump > > > backward over such a ]-terminated string, corresponding to > > > back_comment. > > Yes. > > > `back_comment` has proved to be rather nasty, so while > > > we can learn from it, part of what we learn is that jumping backward > > > over such things is much easier .... > > much less easy. :-) > > > .... than jumping forward, so this > > > innocuous ] will be more costly than might meet the eye. > > It requires the new function, which at the moment seems somewhat less > > complicated than back_comment, and this requires to be called from > > scan_lists. 
> > > - In CC-mode, \n already has syntax > so it can't also have syntax ]
> > >   How do you intend to deal with that: will you mark those few \n that
> > >   terminate strings with syntax-table text-properties?

> > This is simple with the flag `s'. NL would thus have end-comment syntax
> > _and_ the `s' flag. In scan_lists, back_comment will be tried before
> > what I'm calling `back_maybe_string', since being a comment ender must have
> > precedence over being a string terminator.

> > >   If so, what's the benefit over using string-fences?

> > String-fence stopped the 'chomp facility of electric-pair-mode working
> > properly (for the currently accepted value of "properly").

> > > - Another approach would be to make it possible to mark \n as both ] and
> > >   > at the same time, which would make the CC-mode feature much cleaner
> > >   (no need to muck with syntax-table text-properties) but the cost of
> > >   yet more complexity in the syntax.c code.

> > That's what I'm doing with `s'. The extra complexity in syntax.c
> > doesn't seem all that bad at the moment. back_maybe_string is currently
> > 137 lines long (including a macro analogous to INC_FROM, and a lossage:
> > clause modelled on the one in back_comment), compared with
> > back_comment's 289 lines. I'm planning on committing this new code to a
> > branch in the next few days, then you can judge better whether the new
> > facility is worth it.

> Could you please recap what problem(s) you are trying to fix with
> these changes? (I'm sorry for not following, but this thread spans
> two months and many long messages with several days in-between. It's
> hard to keep focused on the main issues.)

Sorry. That's just the way things go, sometimes.

The initial problem I tried to solve was for CC Mode source files with
things like:

    char foo[] = "foo
    char bar[] = "bar";

Historically, the missing " on "foo has caused subsequent lines to have
their string quoting reversed. This is not good.

A recent series of CC Mode commits "solved" this by putting string-fence
syntax-table text properties on the " and the NL around foo. This caused
a "make check" test to fail. With electric-pair-mode enabled and
electric-pair-skip-whitespace set to 'chomp, in the following:

    " (
      ) "

, typing ) on line 1 should replace the ) on line 2, "chomping" the
space between ) and ). However the string-fence property on L1's NL
prevented electric-pair-mode from functioning correctly. João and I have
discussed at length ways of fixing this.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Also, the CC Mode solution has the disadvantage that other languages
cannot get the same fontification advantages, namely that the "foo gets
warning-face on the ", and string face extends ONLY to EOL.

What I'm now proposing, and implementing as a trial, is to enhance the
syntax table facilities to support unterminated strings. There will be
an extra syntax flag `s' on newlines meaning "terminate any open string".
This is straightforward for forward scanning, but somewhat complicated
for backward scanning. However, it does enable unterminated strings to
be easily fontified to EOL in any language, with minimal effort.

It should allow the desired fontification without causing problems for
electric-pair-mode.

Stefan is concerned that the extra functionality may not justify the
increase in complexity in syntax.c.

> Thanks.

--
Alan Mackenzie (Nuremberg, Germany).
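[Editorial illustration, not part of the original message.] The forward-scanning half of the proposal can be pictured with a toy model. The sketch below is not the code in syntax.c or on any Emacs branch; the function name scan_string_forward and its nl_terminates parameter are hypothetical, with nl_terminates standing in for the proposed `s' flag on newline. It only shows the intended rule: an unescaped newline ends an open string, while a backslash-escaped newline does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only (hypothetical helper, not Emacs code).
   TEXT[OPEN] must be a double quote that opens a string.
   Returns the index just past the closing quote, or, when
   NL_TERMINATES is true and an unescaped newline comes first,
   the index of that newline.  If neither is found, returns the
   index of the terminating NUL.  */
size_t
scan_string_forward (const char *text, size_t open, bool nl_terminates)
{
  assert (text != NULL && text[open] == '"');

  size_t i = open + 1;
  while (text[i] != '\0')
    {
      if (text[i] == '\\' && text[i + 1] != '\0')
        {
          /* An escaped character, including an escaped newline,
             never ends the string.  */
          i += 2;
          continue;
        }
      if (text[i] == '"')
        return i + 1;           /* Properly terminated string.  */
      if (text[i] == '\n' && nl_terminates)
        return i;               /* Unterminated: stop at end of line.  */
      i++;
    }
  return i;
}
```

With the two-line example from the message, passing true for nl_terminates stops the string at the end of the "foo line, whereas passing false runs on into the opening quote of "bar", which is the "reversed quoting" effect described above.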
* Re: CC Mode and electric-pair "problem".
From: Stefan Monnier @ 2018-07-01 3:50 UTC
To: Alan Mackenzie
Cc: joaotavora, Eli Zaretskii, stephen_leake, cpitclaudel, emacs-devel

> The initial problem I tried to solve was for CC Mode source files with
> things like:
>
>     char foo[] = "foo
>     char bar[] = "bar";
>
> Historically, the missing " on "foo has caused subsequent lines to have
> their string quoting reversed. This is not good.

Of course, as mentioned elsewhere, the new functionality is not good
either when you have

    char foo[] = "first line
    second line\
    third line";

Unless we find a magic way to distinguish those cases, both behaviors
will be sometimes right and sometimes wrong (and of course, neither
really matters since the code is invalid and will be inevitably fixed
soon by the user).

> A recent series of CC Mode commits "solved" this by putting string-fence
> syntax-table text properties on the " and the NL around foo. This caused
> a "make check" test to fail. With electric-pair-mode enabled and
> electric-pair-skip-whitespace set to 'chomp, in the following:

Complexity brings bugs, indeed.

> Also, the CC Mode solution has the disadvantage that other languages
> cannot get the same fontification advantages, namely that the "foo gets
> warning-face on the ", and string face extends ONLY to EOL.

"foo gets warning-face on the " is completely unrelated to the
discussion at hand: you can have it both with the new code and the old
code, and indeed CC-mode had it with the old code as well.

> What I'm now proposing, and implementing as a trial, is to enhance the
> syntax table facilities to support unterminated strings.

Oh indeed, complexity calls for yet more complexity.

> Stefan is concerned that the extra functionality may not justify the
> increase in complexity in syntax.c.

Yes.

        Stefan
* Re: CC Mode and electric-pair "problem". 2018-07-01 3:50 ` Stefan Monnier @ 2018-07-01 9:58 ` Alan Mackenzie 0 siblings, 0 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-07-01 9:58 UTC (permalink / raw) To: Stefan Monnier Cc: Eli Zaretskii, stephen_leake, joaotavora, cpitclaudel, emacs-devel Hello, Stefan. On Sat, Jun 30, 2018 at 23:50:29 -0400, Stefan Monnier wrote: [ .... ] > > What I'm now proposing, and implementing as a trial, is to enhance the > > syntax table facilities to support unterminated strings. > Oh indeed, complexity calls for yet more complexity. New features call for new code. How can you disparage the new code as "(unacceptable) complexity" when you haven't even seen it? A good point is, who should decide how these strings should be fontified? Three possible answers are an individual on the Emacs core team, the major mode author, the user. Over this entire thread you've been exceedingly negative. You have disparaged at least two ways of doing what's wanted, without suggesting any other, better, way. You seem to be saying "this is difficult/complicated, so we'll just work around the problem/pretend it isn't really a problem, rather than solving it". So, please let's have your technical proposal for how to fontify unterminated strings in the "new way". [ .... ] > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: João Távora @ 2018-07-01 11:22 UTC
To: Alan Mackenzie
Cc: Eli Zaretskii, stephen_leake, cpitclaudel, monnier, emacs-devel

Alan Mackenzie <acm@muc.de> writes:

>> Could you please recap what problem(s) you are trying to fix with
> Sorry. That's just the way things go, sometimes.

I'm not sure how far into "final allegations" we are, but below is my
summary.

> electric-pair-mode from functioning correctly. João and I have discussed
> at length ways of fixing this.

... in particular, a few weeks ago I provided, in electric-pair-mode,
means for CC mode to declare that it has this particular behaviour.
Though I'm still waiting for Alan's comments on this, I'd say the
electric-pair-mode test failure is effectively fixed if Alan agrees to
use that customization point.

But, in my view, electric-pair-mode was just the canary in the mine:
after Alan's changes much more basic things such as C-M-* sexp
navigation stop working like they did. I am actually more worried about
these.

To recap, I like that Alan's change in syntactically incorrect code is
better "50% of the time":

    char *c="an incomplete string
    int a = 0;
    ...
    }<EOB>

by not fontifying "int a" as a string, does indeed exhibit some
intelligence. But this doesn't (where it previously did):

    int main () {
      int a = 0;
      char *c = "here's me editing a multi-line\n\
    string";
      puts(c);
      return 0;
    }

If this switch was all, I wouldn't mind at all. Unfortunately it comes
with a very big trade-off: the underlying syntactic changes break
e.g. C-M-u C-M-SPC inside the multi-line string being edited (which is
precisely something I could use to fix the string).

I just noticed that in 26.1 indentation of the "puts(c)" wasn't affected
by the temporary editing of the string. Now it is, so another downside,
IMO.

João
* Re: CC Mode and electric-pair "problem". 2018-07-01 11:22 ` João Távora @ 2018-07-01 15:25 ` Eli Zaretskii 0 siblings, 0 replies; 93+ messages in thread From: Eli Zaretskii @ 2018-07-01 15:25 UTC (permalink / raw) To: João Távora Cc: acm, cpitclaudel, stephen_leake, monnier, emacs-devel > From: João Távora <joaotavora@gmail.com> > Cc: Eli Zaretskii <eliz@gnu.org>, cpitclaudel@gmail.com, emacs-devel@gnu.org, stephen_leake@stephe-leake.org, monnier@IRO.UMontreal.CA > Date: Sun, 01 Jul 2018 12:22:11 +0100 > > I'm not sure how far into "final allegations" we are, but below is my > summary. Thanks. > If this switch was all, I wouldn't mind at all. Unfortunately it comes > with a very big trade-off: the underlying syntactic changes break > e.g. C-M-u C-M-SPC inside the multi-line string being edited (which is > precisely something I could use to fix the string). > > I just noticed that in 26.1 indentation of the "puts(c)" wasn't affected > by the temporary editing of the string. Now it is, so another downside, > IMO. Yes, if fixing a minor annoyance introduces much more serious issues, we shouldn't install such a "fix". ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-30 20:14 ` Alan Mackenzie 2018-07-01 3:50 ` Stefan Monnier 2018-07-01 11:22 ` João Távora @ 2018-07-01 15:22 ` Eli Zaretskii 2018-07-01 16:38 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie 2 siblings, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-07-01 15:22 UTC (permalink / raw) To: Alan Mackenzie Cc: joaotavora, cpitclaudel, stephen_leake, monnier, emacs-devel > Date: Sat, 30 Jun 2018 20:14:47 +0000 > Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org, > stephen_leake@stephe-leake.org, monnier@IRO.UMontreal.CA, > joaotavora@gmail.com > From: Alan Mackenzie <acm@muc.de> > > > Could you please recap what problem(s) you are trying to fix with > > these changes? (I'm sorry for not following, but this thread spans > > two months and many long messages with several days in-between. It's > > hard to keep focused on the main issues.) > > Sorry. That's just the way things go, sometimes. Not your fault. Thanks for taking the time to recap. > The initial problem I tried to solve was for CC Mode source files with > things like: > > char foo[] = "foo > char bar[] = "bar"; > > Historically, the missing " on "foo has caused subsequent lines to have > their string quoting reversed. This is not good. But not really a catastrophe, IMO. > What I'm now proposing, and implementing as a trial, is to enhance the > syntax table facilities to support unterminated strings. There will be > an extra syntax flag `s' on newlines meaning "terminate any open string". > This is straightforward for forward scanning, but somewhat complicated > for backward scanning. However, it does enable unterminated strings to > be easily fontified to EOL in any language, with minimal effort. > > It should allow the desired fontification without causing problems for > electric-pair-mode. > > Stefan is concerned that the extra functionality may not justify the > increase in complexity in syntax.c. So am I. 
I'm also concerned that introducing this will slow down various syntax-related features, only to cater to what I consider a minor improvement at best. Of course, if the extra functionality turns out to be not as complex as Stefan fears and won't cause any significant slowdown that concerns me, then perhaps we should have it. But is that a reasonable assumption? Thanks. ^ permalink raw reply [flat|nested] 93+ messages in thread
* scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-01 15:22 ` Eli Zaretskii @ 2018-07-01 16:38 ` Alan Mackenzie 2018-07-08 8:29 ` Stephen Leake 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-07-01 16:38 UTC (permalink / raw) To: Eli Zaretskii Cc: joaotavora, cpitclaudel, stephen_leake, monnier, emacs-devel Hello, Eli. On Sun, Jul 01, 2018 at 18:22:48 +0300, Eli Zaretskii wrote: > > Date: Sat, 30 Jun 2018 20:14:47 +0000 > > Cc: cpitclaudel@gmail.com, emacs-devel@gnu.org, > > stephen_leake@stephe-leake.org, monnier@IRO.UMontreal.CA, > > joaotavora@gmail.com > > From: Alan Mackenzie <acm@muc.de> [ .... ] > > The initial problem I tried to solve was for CC Mode source files with > > things like: > > char foo[] = "foo > > char bar[] = "bar"; > > Historically, the missing " on "foo has caused subsequent lines to have > > their string quoting reversed. This is not good. > But not really a catastrophe, IMO. Perhaps not, but it is nevertheless bad. That it is so difficult to do anything about is also bad. > > What I'm now proposing, and implementing as a trial, is to enhance the > > syntax table facilities to support unterminated strings. There will be > > an extra syntax flag `s' on newlines meaning "terminate any open string". > > This is straightforward for forward scanning, but somewhat complicated > > for backward scanning. However, it does enable unterminated strings to > > be easily fontified to EOL in any language, with minimal effort. > > It should allow the desired fontification without causing problems for > > electric-pair-mode. > > Stefan is concerned that the extra functionality may not justify the > > increase in complexity in syntax.c. > So am I. I'm also concerned that introducing this will slow down > various syntax-related features, only to cater to what I consider a > minor improvement at best. 
> Of course, if the extra functionality turns out to be not as complex > as Stefan fears and won't cause any significant slowdown that concerns > me, then perhaps we should have it. But is that a reasonable > assumption? It's no longer a matter of assumption. Earlier on this afternoon, I committed a preliminary working version of this change to the branch scratch/fontify-open-string. The most complicated part of the change is the new function back_maybe_string in syntax.c. This is a mere 137 lines long. Even if perhaps not fully fleshed out, it's not far off. By contrast, back_comment (which is also called at every newline when there're line comments) is 289 lines long. I have amended shell-script-mode to use this new strategy. This required changing just one line in sh-script.el. To font-lock.el I have added an optional feature to put warning-face on the opening ". I think it is notable just how easy this new feature is to use. Essentially any mode[*] can use it with a one line change (to the syntax table code for \n). [*] Except, currently, CC Mode. ;-( > Thanks. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-01 16:38 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie @ 2018-07-08 8:29 ` Stephen Leake 2018-07-15 9:00 ` Stephen Leake 0 siblings, 1 reply; 93+ messages in thread From: Stephen Leake @ 2018-07-08 8:29 UTC (permalink / raw) To: emacs-devel Alan Mackenzie <acm@muc.de> writes: > It's no longer a matter of assumption. Earlier on this afternoon, I > committed a preliminary working version of this change to the branch > scratch/fontify-open-string. I've just tried this in ada-mode, and it works nicely. I like the red face on an unbalanced string quote. No noticeable slowdown in anything I've tried so far. Let me know if there's some experiment you'd like me to run. -- -- Stephe ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-08 8:29 ` Stephen Leake @ 2018-07-15 9:00 ` Stephen Leake 2018-07-15 15:13 ` Eli Zaretskii 2018-07-15 16:56 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie 0 siblings, 2 replies; 93+ messages in thread From: Stephen Leake @ 2018-07-15 9:00 UTC (permalink / raw) To: emacs-devel An update on this; I just had several missing quotes in a buffer, due to a copy/multiple paste that had a quote error. I did lots of editing with the quote errors present. I didn't even notice them until the compiler complained, just like any other syntax error. In my opinion, that is far preferable to the previous behavior of fontifying large parts of the buffer as string, which forced me to pay attention to a trivial syntax error instead of what I was actually doing. This is in Ada, that does not have the option of escaping a newline to create a multiline string, so treating a newline as string terminator is always correct. Anything I can do to help merge this to main? Stephen Leake <stephen_leake@stephe-leake.org> writes: > Alan Mackenzie <acm@muc.de> writes: > >> It's no longer a matter of assumption. Earlier on this afternoon, I >> committed a preliminary working version of this change to the branch >> scratch/fontify-open-string. > > I've just tried this in ada-mode, and it works nicely. I like the red > face on an unbalanced string quote. > > No noticeable slowdown in anything I've tried so far. > > Let me know if there's some experiment you'd like me to run. -- -- Stephe ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-15 9:00 ` Stephen Leake @ 2018-07-15 15:13 ` Eli Zaretskii 2018-07-15 18:45 ` Alan Mackenzie 2018-07-15 16:56 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-07-15 15:13 UTC (permalink / raw) To: Stephen Leake; +Cc: emacs-devel > From: Stephen Leake <stephen_leake@stephe-leake.org> > Date: Sun, 15 Jul 2018 04:00:23 -0500 > > Anything I can do to help merge this to main? A few things: . NEWS . Updates for the relevant parts in the manual(s) . Minor nits below: > +(defcustom font-lock-warn-open-string t > + "Fontify the opening quote of an unterminated string with warning face? > +This is done when this variable is non-nil. We use a slightly different style for such options (slightly rephrased to fit on one line): "Non-nil means show opening quotes of unterminated strings with warning face." > +This works only when the syntax-table entry for newline contains the flag `s' > +\(see page \"xxx\" in the Elisp manual)." Please replace "xxx" with an actual value. Also, we don't refer to our manuals as "pages", that is a relic from the "man pages" era. > +#define DEC_AT \ Please #undef DEC_AT when you are done using it (at function's end). > + /* Find the alleged string opener. */ Please leave 2 spaces between the end of the comment and "*/" (here and elsewhere in the patch) > + while ((at > stop) > + && (code != Sstring) > + && (!SYNTAX_FLAGS_CLOSE_STRING (syntax))) > + { > + DEC_AT; > + } A single line doesn't need braces. > + /* Search back for a terminating string delimiter: */ > + while ((at > stop) > + && (code != Sstring) > + && (code != Sstring_fence) > + && (!SYNTAX_FLAGS_CLOSE_STRING (syntax))) > + { > + DEC_AT; > + /* Check for comment and "other" strings. */ > + } Is the last comment at its correct place? It doesn't seem to refer to any code. 
> + lose: > + UPDATE_SYNTAX_TABLE_FORWARD (*from); > + return false; > + > + lossage: > + /* We've encountered possible comments or strings with mixed > + delimiters. Bail out and scan forward from a safe position. */ "lose" and "lossage" are too similar. Can we have a better name for the latter? > + { > + struct lisp_parse_state state; > + bool adjusted = true; Why did you need the braces here? C99 allows to mix declarations and statements, so we no longer need such braces. > + find_start_value > + = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) > + : state.thislevelstart >= 0 ? state.thislevelstart > + : find_start_value; Please use parentheses here for better readability (to clearly show which parts belong to which condition). > -Comments are ignored if `parse-sexp-ignore-comments' is non-nil. > +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil. We nowadays prefer to quote 'like this' in comments and plain text. > -Comments are ignored if `parse-sexp-ignore-comments' is non-nil. > +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil. Likewise. Thanks again for working on this. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-15 15:13 ` Eli Zaretskii @ 2018-07-15 18:45 ` Alan Mackenzie 2018-07-16 2:23 ` Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) Stefan Monnier 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-07-15 18:45 UTC (permalink / raw) To: Eli Zaretskii; +Cc: Stephen Leake, emacs-devel Hello, Eli, thanks for the review. The code is still preliminary, and is missing quite a lot of comments, still. I have had doubts about the mechanism (e.g. C-M-b will take a lot of work to make it functional), see my reply to Stephen. On Sun, Jul 15, 2018 at 18:13:15 +0300, Eli Zaretskii wrote: > > From: Stephen Leake <stephen_leake@stephe-leake.org> > > Date: Sun, 15 Jul 2018 04:00:23 -0500 > > Anything I can do to help merge this to main? > A few things: > . NEWS > . Updates for the relevant parts in the manual(s) > . Minor nits below: > > +(defcustom font-lock-warn-open-string t > > + "Fontify the opening quote of an unterminated string with warning face? > > +This is done when this variable is non-nil. > We use a slightly different style for such options (slightly rephrased > to fit on one line): Well done for the compression! > "Non-nil means show opening quotes of unterminated strings with warning face." > > +This works only when the syntax-table entry for newline contains the flag `s' > > +\(see page \"xxx\" in the Elisp manual)." > Please replace "xxx" with an actual value. Also, we don't refer to > our manuals as "pages", that is a relic from the "man pages" era. Yes, thanks. Just "see \"<page name>\"", without the "page"? > > +#define DEC_AT \ > Please #undef DEC_AT when you are done using it (at function's end). OK. > > + /* Find the alleged string opener. */ > Please leave 2 spaces between the end of the comment and "*/" (here > and elsewhere in the patch) OK. As a matter of interest, what is the reason for this? 
I've seen it all over the Emacs C code. Is it something to do with filling? > > + while ((at > stop) > > + && (code != Sstring) > > + && (!SYNTAX_FLAGS_CLOSE_STRING (syntax))) > > + { > > + DEC_AT; > > + } > A single line doesn't need braces. I'm intending to put more code in there. > > + /* Search back for a terminating string delimiter: */ > > + while ((at > stop) > > + && (code != Sstring) > > + && (code != Sstring_fence) > > + && (!SYNTAX_FLAGS_CLOSE_STRING (syntax))) > > + { > > + DEC_AT; > > + /* Check for comment and "other" strings. */ > > + } > Is the last comment at its correct place? It doesn't seem to refer to > any code. It's a FIXME: "Put in code here to check for comment and "other" strings.". > > + lose: > > + UPDATE_SYNTAX_TABLE_FORWARD (*from); > > + return false; > > + > > + lossage: > > + /* We've encountered possible comments or strings with mixed > > + delimiters. Bail out and scan forward from a safe position. */ > "lose" and "lossage" are too similar. Can we have a better name for > the latter? OK. I took the names from, I think, back_comment. > > + { > > + struct lisp_parse_state state; > > + bool adjusted = true; > Why did you need the braces here? C99 allows to mix declarations and > statements, so we no longer need such braces. OK. > > + find_start_value > > + = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) > > + : state.thislevelstart >= 0 ? state.thislevelstart > > + : find_start_value; > Please use parentheses here for better readability (to clearly show > which parts belong to which condition). Yes, it didn't indent well by itself. Maybe I should raise this with the CC Mode maintainer. But yes, I'll put parens in. > > -Comments are ignored if `parse-sexp-ignore-comments' is non-nil. > > +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil. > We nowadays prefer to quote 'like this' in comments and plain text. OK. > > -Comments are ignored if `parse-sexp-ignore-comments' is non-nil. 
> > +Comments are skipped over if `parse-sexp-ignore-comments' is non-nil. > Likewise. > Thanks again for working on this. I'll make the stylistic corrections, then get working on it again in earnest. -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) 2018-07-15 18:45 ` Alan Mackenzie @ 2018-07-16 2:23 ` Stefan Monnier 2018-07-16 14:18 ` Eli Zaretskii 0 siblings, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-07-16 2:23 UTC (permalink / raw) To: emacs-devel >> > + find_start_value >> > + = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) >> > + : state.thislevelstart >= 0 ? state.thislevelstart >> > + : find_start_value; >> Please use parentheses here for better readability (to clearly show >> which parts belong to which condition). > Yes, it didn't indent well by itself. Maybe I should raise this with > the CC Mode maintainer. But yes, I'll put parens in. This is one of those rare cases where sm-c-mode handles it better: find_start_value = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) : state.thislevelstart >= 0 ? state.thislevelstart : find_start_value; This said, I don't see either indentation as problematic and I'm not sure what would be "better for readability". Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) 2018-07-16 2:23 ` Indentation of ?: in C-mode (was: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".]) Stefan Monnier @ 2018-07-16 14:18 ` Eli Zaretskii 2018-07-16 15:54 ` Indentation of ?: in C-mode Stefan Monnier 0 siblings, 1 reply; 93+ messages in thread From: Eli Zaretskii @ 2018-07-16 14:18 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel > From: Stefan Monnier <monnier@iro.umontreal.ca> > Date: Sun, 15 Jul 2018 22:23:49 -0400 > > >> > + find_start_value > >> > + = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) > >> > + : state.thislevelstart >= 0 ? state.thislevelstart > >> > + : find_start_value; > >> Please use parentheses here for better readability (to clearly show > >> which parts belong to which condition). > > Yes, it didn't indent well by itself. Maybe I should raise this with > > the CC Mode maintainer. But yes, I'll put parens in. > > This is one of those rare cases where sm-c-mode handles it better: > > find_start_value > = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) > : state.thislevelstart >= 0 ? state.thislevelstart > : find_start_value; > > This said, I don't see either indentation as problematic and I'm not > sure what would be "better for readability". This: find_start_value = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) : (state.thislevelstart >= 0 ? state.thislevelstart : find_start_value); Or maybe even this: find_start_value = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) : (state.thislevelstart >= 0 ? state.thislevelstart : find_start_value); ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: Indentation of ?: in C-mode 2018-07-16 14:18 ` Eli Zaretskii @ 2018-07-16 15:54 ` Stefan Monnier 0 siblings, 0 replies; 93+ messages in thread From: Stefan Monnier @ 2018-07-16 15:54 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel >> >> > + find_start_value >> >> > + = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) >> >> > + : state.thislevelstart >= 0 ? state.thislevelstart >> >> > + : find_start_value; >> >> Please use parentheses here for better readability (to clearly show >> >> which parts belong to which condition). >> > Yes, it didn't indent well by itself. Maybe I should raise this with >> > the CC Mode maintainer. But yes, I'll put parens in. >> This is one of those rare cases where sm-c-mode handles it better: >> >> find_start_value >> = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) >> : state.thislevelstart >= 0 ? state.thislevelstart >> : find_start_value; >> >> This said, I don't see either indentation as problematic and I'm not >> sure what would be "better for readability". > This: > > find_start_value = CONSP (state.levelstarts) > ? XINT (XCAR (state.levelstarts)) > : (state.thislevelstart >= 0 > ? state.thislevelstart > : find_start_value); Interesting: I find this one to be (ever so slightly) less readable. Basically, I read find_start_value = CONSP (state.levelstarts) ? XINT (XCAR (state.levelstarts)) : state.thislevelstart >= 0 ? state.thislevelstart : find_start_value; as a C version of (setq find_start_value (cond ((consp state.levelstarts) (XINT (XCAR (state.levelstarts)))) ((>= state.thislevelstart 0) state.thislevelstart) (t find_start_value))); so I find the ": condition ? value" lines to be very natural (the only odd line is really the first one because it doesn't start with ":"). Stefan PS: This was the opportunity to see that sm-c-mode misindents the last line of the middle example if you remove the parens: find_start_value = CONSP (state.levelstarts) ? 
XINT (XCAR (state.levelstarts)) : state.thislevelstart >= 0 ? state.thislevelstart : find_start_value; although now that I see it, I wonder if sm-c-mode.el read my mind and took ": condition ? value" as a logical entity, just to try and show me where this idea breaks down. ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-15 9:00 ` Stephen Leake 2018-07-15 15:13 ` Eli Zaretskii @ 2018-07-15 16:56 ` Alan Mackenzie 2018-07-17 3:41 ` Stephen Leake 1 sibling, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-07-15 16:56 UTC (permalink / raw) To: Stephen Leake; +Cc: emacs-devel Hello, Stephen. Many thanks for trying out and testing this branch. I'm afraid I've found a rather large snag - there are backward moving commands and functions in Emacs which bypass proper syntax checking. For example, in the following in (the modified) shell-script-mode: 1. foo="Foo" 2. bar="Bar 3. , with point at BOL3, a C-M-b moves to the F, rather than "Bar. This is because (forward-comment -1) crashes into the "whitespace" at the end of L2 (the newline) rather than taking account of its syntax (the string closing flag). At the very least, the function back_comment (in src/syntax.c) will need to be modified to take account of such things, and in doing so, might as well become a function that also goes back over EOL-terminated strings, as Stefan suggested. This will be a lot of work. I fear there may be several, or even many, lisp functions in Emacs which may likewise need modifying. The root cause of this problem, in the abstract, is that Emacs attempts to scan backwards over strings and comments, which is only heuristically possible, rather than scanning forwards over the same constructs and remembering the endpoints. Right at the moment, I don't know how to proceed. Sorry. -- Alan Mackenzie (Nuremberg, Germany). On Sun, Jul 15, 2018 at 04:00:23 -0500, Stephen Leake wrote: > An update on this; I just had several missing quotes in a buffer, due to > a copy/multiple paste that had a quote error. I did lots of editing with > the quote errors present. > I didn't even notice them until the compiler complained, just like any > other syntax error. 
> In my opinion, that is far preferable to the previous behavior of > fontifying large parts of the buffer as string, which forced me to pay > attention to a trivial syntax error instead of what I was actually doing. > This is in Ada, that does not have the option of escaping a newline to > create a multiline string, so treating a newline as string terminator is > always correct. > Anything I can do to help merge this to main? > Stephen Leake <stephen_leake@stephe-leake.org> writes: > > Alan Mackenzie <acm@muc.de> writes: > > > >> It's no longer a matter of assumption. Earlier on this afternoon, I > >> committed a preliminary working version of this change to the branch > >> scratch/fontify-open-string. > > > > I've just tried this in ada-mode, and it works nicely. I like the red > > face on an unbalanced string quote. > > > > No noticeable slowdown in anything I've tried so far. > > > > Let me know if there's some experiment you'd like me to run. > -- > -- Stephe ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] 2018-07-15 16:56 ` scratch/fontify-open-string. [Was: CC Mode and electric-pair "problem".] Alan Mackenzie @ 2018-07-17 3:41 ` Stephen Leake 0 siblings, 0 replies; 93+ messages in thread From: Stephen Leake @ 2018-07-17 3:41 UTC (permalink / raw) To: emacs-devel Alan Mackenzie <acm@muc.de> writes: > Hello, Stephen. > > Many thanks for trying out and testing this branch. You're welcome; thanks for implementing it. > I'm afraid I've found a rather large snag - there are backward moving > commands and functions in Emacs which bypass proper syntax checking. > For example, in the following in (the modified) shell-script-mode: > > 1. foo="Foo" > 2. bar="Bar > 3. > > , with point at BOL3, a C-M-b moves to the F, rather than "Bar. > > This is because (forward-comment -1) crashes into the "whitespace" at > the end of L2 (the newline) rather than taking account of its syntax > (the string closing flag). > > At the very least, the function back_comment (in src/syntax.c) will need > to be modified to take account of such things, and in doing so, might as > well become a function that also goes back over EOL-terminated strings, > as Stefan suggested. This will be a lot of work. > > I fear there may be several, or even many, lisp functions in Emacs which > may likewise need modifying. > > The root cause of this problem, in the abstract, is that Emacs attempts > to scan backwards over strings and comments, which is only heuristically > possible, rather than scanning forwards over the same constructs and > remembering the endpoints. > > Right at the moment, I don't know how to proceed. Sorry. I think you are opposed to syntax-ppss, but that does scan forward and remember things; can we use that here? -- -- Stephe ^ permalink raw reply [flat|nested] 93+ messages in thread
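The contrast between scanning backward heuristically and scanning forward while remembering the endpoints can be sketched in a few lines of C. This is not how syntax.c or syntax-ppss is implemented; the function name `mark_strings` and the flat boolean array are assumptions chosen for illustration. The sketch also assumes that a newline always ends an open string (as in Ada) and that a backslash escapes the next character unless that character is a newline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scan TEXT forward once and record, for every position, whether the
   character there is string content.  Delimiters (the quotes and a
   terminating newline) are recorded as not being string content.  */
void mark_strings(const char *text, size_t len, bool *in_string)
{
    assert(text != NULL && in_string != NULL);
    bool inside = false;
    for (size_t i = 0; i < len; i++) {
        char c = text[i];
        if (!inside) {
            in_string[i] = false;
            if (c == '"')
                inside = true;          /* opening quote */
        } else if (c == '\n' || c == '"') {
            in_string[i] = false;       /* newline or closing quote ends the string */
            inside = false;
        } else {
            in_string[i] = true;
            if (c == '\\' && i + 1 < len && text[i + 1] != '\n')
                in_string[++i] = true;  /* escaped character is string content */
        }
    }
}
```

With the array filled in, a backward-moving command would only need a lookup at the position before point instead of guessing what the preceding newline means.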
* Re: CC Mode and electric-pair "problem". 2018-06-30 19:03 ` Alan Mackenzie 2018-06-30 19:29 ` Eli Zaretskii @ 2018-07-01 4:02 ` Stefan Monnier 2018-07-01 10:58 ` Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-07-01 4:02 UTC (permalink / raw) To: Alan Mackenzie Cc: Clément Pit-Claudel, Stephen Leake, João Távora, emacs-devel >> So let's look at the technical issues: >> You suggest introducing a new syntax-table thingy similar to > but for >> strings. Let's call it ] > As I noted above, I have implemented it as another flag, `s'. Better, yes. > This is simple with the flag `s'. NL would thus have end-comment syntax > _and_ the `s' flag. In scan_lists, back_comment will be tried before > what I'm calling `back_maybe_string', since being a comment ender must have > precedence over being a string terminator. Why? How 'bout: char foo[] = "some unterminated // string >> If so, what's the benefit over using string-fences? > String-fence stopped the 'chomp facility of electric-pair-mode working > properly (for the currently accepted value of "properly"). I suspect that it'll be easier to fix electric-pair-mode. So the right answer was that you won't need syntax-table text-properties. But the downside is that every time we scan backwards over a newline we'll have to pay the extra cost of checking whether it's maybe closing an unterminated string. I think such a "string terminator" thingy would be valuable if it were used/needed for *valid* code. But introducing such complexity just to tweak the handling of invalid code doesn't seem like a good tradeoff at all. > That's what I'm doing with `s'. The extra complexity in syntax.c > doesn't seem all that bad at the moment. back_maybe_string is currently > 137 lines long (including a macro analogous to INC_FROM, and a lossage: > clause modelled on the one in back_comment)), compared with > back_comment's 289 lines. 
I'm planning on committing this new code to a > branch in the next few days, then you can judge better whether the new > facility is worth it. I can't imagine how seeing the code could change my opinion on whether it's worth it. > "S-T-NG" may be fine for Emacs 28 or 29, but the syntax table is what we > have, and what we must work with in the short term. We'll never get to "S-T-NG" if we keep it for the future. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 4:02 ` CC Mode and electric-pair "problem" Stefan Monnier @ 2018-07-01 10:58 ` Alan Mackenzie 2018-07-01 11:46 ` João Távora 2018-07-01 16:13 ` Stefan Monnier 0 siblings, 2 replies; 93+ messages in thread From: Alan Mackenzie @ 2018-07-01 10:58 UTC (permalink / raw) To: Stefan Monnier Cc: Clément Pit-Claudel, Stephen Leake, João Távora, emacs-devel Hello, Stefan. On Sun, Jul 01, 2018 at 00:02:56 -0400, Stefan Monnier wrote: > >> So let's look at the technical issues: > >> You suggest introducing a new syntax-table thingy similar to > but for > >> strings. Let's call it ] > > As I noted above, I have implemented it as another flag, `s'. > Better, yes. > > This is simple with the flag `s'. NL would thus have end-comment syntax > > _and_ the `s' flag. In scan_lists, back_comment will be tried before > > what I'm calling `back_maybe_string', since being a comment ender must have > > precedence over being a string terminator. > Why? How 'bout: > char foo[] = "some unterminated // string Bug compatibility with the current scan-sexps. > > String-fence stopped the 'chomp facility of electric-pair-mode > > working properly (for the currently accepted value of "properly"). > I suspect that it'll be easier to fix electric-pair-mode. This would be my preferred option too, but it's not easy. > But the downside is that every time we scan backwards over a newline > we'll have to pay the extra cost of checking whether it's maybe > closing an unterminated string. Hmmm. Yes, this could increase the backward scanning time quite substantially, but we already do this for back_comment, though. It might be unacceptable. A possibility would be to apply the `s' flag only in a syntax-table text property applied to the newlines of unterminated strings. > I think such a "string terminator" thingy would be valuable if it were > used/needed for *valid* code. 
But introducing such complexity just to > tweak the handling of invalid code doesn't seem like a good tradeoff > at all. I disagree. Whilst editing code, it is in an invalid state nearly all the time. It is our job to present the user with the best possible display for this dominant state. > > That's what I'm doing with `s'. The extra complexity in syntax.c > > doesn't seem all that bad at the moment. back_maybe_string is currently > > 137 lines long (including a macro analogous to INC_FROM, and a lossage: > > clause modelled on the one in back_comment)), compared with > > back_comment's 289 lines. I'm planning on committing this new code to a > > branch in the next few days, then you can judge better whether the new > > facility is worth it. > I can't imagine how seeing the code could change my opinion on whether > it's worth it. I would hope you would weigh up the small additional complexity against the new features it brings, and reach a balanced judgment, rather than dismissing the new idea without consideration. > > "S-T-NG" may be fine for Emacs 28 or 29, but the syntax table is what we > > have, and what we must work with in the short term. > We'll never get to "S-T-NG" if we keep it for the future. You see the need for it, and have at least some vague notion of what it should look like. I don't. Get hacking! > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 10:58 ` Alan Mackenzie @ 2018-07-01 11:46 ` João Távora 2018-07-01 16:13 ` Stefan Monnier 1 sibling, 0 replies; 93+ messages in thread From: João Távora @ 2018-07-01 11:46 UTC (permalink / raw) To: Alan Mackenzie Cc: Clément Pit-Claudel, Stephen Leake, Stefan Monnier, emacs-devel Alan Mackenzie <acm@muc.de> writes: Hi Alan, >> I suspect that it'll be easier to fix electric-pair-mode. > This would be my preferred option too, but it's not easy. I don't follow: i'm still waiting on comments on https://lists.gnu.org/archive/html/emacs-devel/2018-06/msg00606.html Where, at your request, I changed electric-pair-mode to provide a way to fix the immediate problems (test failures/chomp thing). (Obviously, it doesn't fix the overarching issue as I already explained elsewhere.) João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 10:58 ` Alan Mackenzie 2018-07-01 11:46 ` João Távora @ 2018-07-01 16:13 ` Stefan Monnier 2018-07-01 18:18 ` Alan Mackenzie 1 sibling, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-07-01 16:13 UTC (permalink / raw) To: emacs-devel >> Why? How 'bout: >> char foo[] = "some unterminated // string > Bug compatibility with the current scan-sexps. I don't see why: currently, scan-sexps skips over the comment, but that's not a bug: it's exactly what it is documented to do. When you change the syntax property of ?\n to be "> s", it changes the behavior expected based on the documentation, so in the above case it should treat the \n as closing the string rather than closing the comment. It needs to work reliably for those languages where strings are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA). > Hmmm. Yes, this could increase the backward scanning time quite > substantially, but we already do this for back_comment, though. I expect the impact will be less than that of back_comment, but I think we'd want actual measurements anyway. > A possibility would be to apply the `s' flag only in a syntax-table text > property applied to the newlines of unterminated strings. But that brings us back to "why not use string-fence?". > I disagree. Whilst editing code, it is in an invalid state nearly all > the time. But we usually don't make any effort to guess what the intended closest valid state might be, except where the user is actively editing the text (e.g. by proposing completion candidates for identifiers). >> I can't imagine how seeing the code could change my opinion on whether >> it's worth it. > I would hope you would weigh up the small additional complexity against > the new features it brings, and reach a balanced judgment, rather than > dismissing the new idea without consideration. I did consider it. 
I just know syntax.c well enough that I'd be very surprised if the actual patch (as opposed to my guess at what the patch would look like) makes me change my mind. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 16:13 ` Stefan Monnier @ 2018-07-01 18:18 ` Alan Mackenzie 2018-07-01 23:16 ` Stefan Monnier 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-07-01 18:18 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel Hello, Stefan. On Sun, Jul 01, 2018 at 12:13:32 -0400, Stefan Monnier wrote: > >> Why? How 'bout: > >> char foo[] = "some unterminated // string > > Bug compatibility with the current scan-sexps. > I don't see why: currently, scan-sexps skips over the comment, but > that's not a bug: it's exactly what it is documented to do. There is no comment there, but scan-sexps skips to it nevertheless. As you know, I solved these anomalies some while ago with the comment-cache branch. > When you change the syntax property of ?\n to be "> s", it changes the > behavior expected based on the documentation, .... Er, documentation? This new flag isn't documented yet, or at least not in any permanent fashion. > .... so in the above case it should treat the \n as closing the string > rather than closing the comment. I agree. > It needs to work reliably for those languages where strings > are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA). You mean, jgraph-mode is another use-case for `s'? (I'm not familiar with it.) > > Hmmm. Yes, this could increase the backward scanning time quite > > substantially, but we already do this for back_comment, though. > I expect the impact will be less than that of back_comment, but I think > we'd want actual measurements anyway. Yes. > > A possibility would be to apply the `s' flag only in a syntax-table > > text property applied to the newlines of unterminated strings. > But that brings us back to "why not use string-fence?". Yes. String-fence interferes with syntactical stuff "inside" the invalid string, whereas the `s' flag won't. > > I disagree. Whilst editing code, it is in an invalid state nearly > > all the time. 
> But we usually don't make any effort to guess what the intended > closest valid state might be, except where the user is actively > editing the text (e.g. by proposing completion candidates for > identifiers). There's no need to guess. The compiler defines the state, namely that the (invalid) string ends at the EOL, and what follows is non-string. > > I would hope you would weigh up the small additional complexity against > > the new features it brings, and reach a balanced judgment, rather than > > dismissing the new idea without consideration. > I did consider it. I just know syntax.c well enough that I'd be very > surprised if the actual patch (as opposed to by guess at the what the > patch would look like) makes me change my mind. There's no need to guess. back_maybe_comment is in the new scratch/fontify-open-string branch. It is NOT that complicated. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 18:18 ` Alan Mackenzie @ 2018-07-01 23:16 ` Stefan Monnier 2018-07-02 19:18 ` Alan Mackenzie 0 siblings, 1 reply; 93+ messages in thread From: Stefan Monnier @ 2018-07-01 23:16 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel >> >> Why? How 'bout: >> >> char foo[] = "some unterminated // string >> > Bug compatibility with the current scan-sexps. >> I don't see why: currently, scan-sexps skips over the comment, but >> that's not a bug: it's exactly what it is documented to do. > There is no comment there, but scan-sexps skips to it nevertheless. The starting point is within the string (not according to the C language rules, of course, but according to the syntax-tables settings), and operations like scan-sexps are documented to work under the assumption that the starting point is outside of strings/comments, so it is very much correct for it to consider this "// string\n" to be a comment. I agree that it would be OK for scan-sexps in this case to consider that \n terminates the string rather than the comment, tho. >> When you change the syntax property of ?\n to be "> s", it changes the >> behavior expected based on the documentation, .... > Er, documentation? This new flag isn't documented yet, or at least not > in any permanent fashion. Well, I was talking hypothetically under the assumption that "s" is documented to mean something like "closes a string if there's one to close". >> .... so in the above case it should treat the \n as closing the string >> rather than closing the comment. > I agree. OK, sorry 'bout the above, then, I see we agree. >> It needs to work reliably for those languages where strings >> are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA). > You mean, jgraph-mode is another use-case for `s'? (I'm not familiar > with it.) I looked for existing use-cases and I indeed found one. It's very much not high-profile, tho. 
Also this use-case is slightly different in that the \n is really the normal/only way to terminate the string in jgraph. In case you're interested: http://web.eecs.utk.edu/~plank/plank/jgraph/jgraph.html >> But that brings us back to "why not use string-fence?". > Yes. String-fence interferes with syntactical stuff "inside" the > invalid string, whereas the `s' flag won't. Not sure how serious this "interferes with syntactical stuff" is in practice. >> But we usually don't make any effort to guess what the intended >> closest valid state might be, except where the user is actively >> editing the text (e.g. by proposing completion candidates for >> identifiers). > There's no need to guess. The compiler defines the state, namely that > the (invalid) string ends at the EOL, and what follows is non-string. The compiler just makes an arbitrary choice, just like we do and that has no bearing on what the intended valid state is (which is not something the compiler can discover either: it's only available in the head of the coder). > There's no need to guess. back_maybe_comment is in the new > scratch/fontify-open-string branch. It is NOT that complicated. Unsurprisingly it introduces a complexity which I find unjustified by the presented benefits. But it now occurs to me that maybe we can do better: have you tried to merge back_maybe_comment into back_comment? After all, back_comment already pays attention to strings (in order to try and correctly handle comment openers appearing within strings), so there's a possibility that back_comment might be able to handle your use case with much fewer changes (and in that case, the performance cost would be pretty close to 0, I think). Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-07-01 23:16 ` Stefan Monnier @ 2018-07-02 19:18 ` Alan Mackenzie 2018-07-03 2:10 ` Stefan Monnier 0 siblings, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-07-02 19:18 UTC (permalink / raw) To: Stefan Monnier; +Cc: emacs-devel On Sun, Jul 01, 2018 at 19:16:06 -0400, Stefan Monnier wrote: > >> >> Why? How 'bout: > >> >> char foo[] = "some unterminated // string > >> > Bug compatibility with the current scan-sexps. > >> I don't see why: currently, scan-sexps skips over the comment, but > >> that's not a bug: it's exactly what it is documented to do. > > There is no comment there, but scan-sexps skips to it nevertheless. > The starting point is within the string (not according to the C language > rules, of course, but according to the syntax-tables settings), and > operations like scan-sexps are documented to work under the assumption > that the starting point is outside of strings/comments, so it is very > much correct for it to consider this "// string\n" to be a comment. Yes. Apologies for my misunderstanding. [ .... ] > ..... I see we agree. Yes. > >> It needs to work reliably for those languages where strings > >> are indeed terminated by newline (e.g. jgraph-mode in GNU ELPA). > > You mean, jgraph-mode is another use-case for `s'? (I'm not familiar > > with it.) > I looked for existing use-cases and I indeed found one. It's very much > not high-profile, tho. Also this use-case is slightly different in that > the \n is really the normal/only way to terminate the string in jgraph. > In case you're interested: > http://web.eecs.utk.edu/~plank/plank/jgraph/jgraph.html > >> But that brings us back to "why not use string-fence?". > > Yes. String-fence interferes with syntactical stuff "inside" the > > invalid string, whereas the `s' flag won't. > Not sure how serious this "interferes with syntactical stuff" is > in practice. Maybe not very. 
> >> But we usually don't make any effort to guess what the intended > >> closest valid state might be, except where the user is actively > >> editing the text (e.g. by proposing completion candidates for > >> identifiers). > > There's no need to guess. The compiler defines the state, namely that > > the (invalid) string ends at the EOL, and what follows is non-string. > The compiler just makes an arbitrary choice, .... No. The compiler has no choice here. Or does it? Can you identify any other sensible strategy a compiler could follow? > .... just like we do and that has no bearing on what the intended > valid state is (which is not something the compiler can discover > either: it's only available in the head of the coder). There may or may not be a unique "intended valid state". I don't think it's a helpful concept - it suggests that the states a buffer is in most of the time are in some way unimportant. I reaffirm my view that Emacs should present optimal information about these normal (invalid) states, and that they are very important indeed. > > There's no need to guess. back_maybe_comment is in the new > > scratch/fontify-open-string branch. It is NOT that complicated. > Unsurprisingly it introduces a complexity which I find unjustified by > the presented benefits. > But it now occurs to me that maybe we can do better: have you tried to > merge back_maybe_string into back_comment? After all, back_comment > already pays attention to strings (in order to try and correctly > handle comment openers appearing within strings), so there's > a possibility that back_comment might be able to handle your use case > with much fewer changes (and in that case, the performance cost would > be pretty close to 0, I think). That's a good idea. I think it's clear that such a merge could be done. But it would need a lot of detailed painstaking work. It's optimisation (as in "don't do it yet!"). Once we decide to adopt the idea is the time to do this merge, I think. 
That's assuming some measurements show it's worthwhile (which I think it would be). In fact, in my modified shell-script-mode I timed (scan-sexps BONL -1) a million times on the following text: "string" at the start of a line. With the `s' flag in place: 1.9489 seconds. Without the `s' flag: 1.3003 seconds. This is an overhead of almost exactly 50%. > Stefan -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
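The "almost exactly 50%" figure follows directly from the two timings quoted above. A minimal sketch of the arithmetic, with a hypothetical helper name:

```c
#include <assert.h>

/* Relative overhead of WITH_FLAG over WITHOUT_FLAG, in percent. */
double overhead_percent(double with_flag, double without_flag)
{
    assert(without_flag > 0.0);
    return (with_flag - without_flag) / without_flag * 100.0;
}
```

Calling `overhead_percent(1.9489, 1.3003)` gives a value a little under 50, which matches the figure in the message.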
* Re: CC Mode and electric-pair "problem". 2018-07-02 19:18 ` Alan Mackenzie @ 2018-07-03 2:10 ` Stefan Monnier 0 siblings, 0 replies; 93+ messages in thread From: Stefan Monnier @ 2018-07-03 2:10 UTC (permalink / raw) To: Alan Mackenzie; +Cc: emacs-devel >> >> But we usually don't make any effort to guess what the intended >> >> closest valid state might be, except where the user is actively >> >> editing the text (e.g. by proposing completion candidates for >> >> identifiers). >> > There's no need to guess. The compiler defines the state, namely that >> > the (invalid) string ends at the EOL, and what follows is non-string. >> The compiler just makes an arbitrary choice, .... > No. The compiler has no choice here. Or does it? Of course it does. > Can you identify any other sensible strategy a compiler could follow? It could look for the next (closing) " and if it's not on the same line signal an error about "invalid multiline string" (or "unterminated string" if it bumps into EOF). GCC used to do just that (without even signaling an error) IIRC. > There may or may not be a unique "intended valid state". I don't think > it's a helpful concept - it suggests that the states a buffer is in most > of the time are in some way unimportant. I reaffirm my view that Emacs > should present optimal information about these normal (invalid) states, > and that they are very important indeed. I'm not sure you can define "optimal" without defining "intended valid state" in this case. > That's a good idea. I think it's clear that such a merge could be done. > But it would need a lot of detailed painstaking work. From what I remember of back_comment (not very fresh, to be honest), I think there's a good chance it would be pretty easy, actually (at least easy in terms of the resulting patch being short: it may take some time to come up with the patch, OTOH). > With the `s' flag in place: 1.9489 seconds. > Without the `s' flag: 1.3003 seconds. 
Wow, I must say I expected a significantly lower overhead. Stefan ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-19 5:02 ` Alan Mackenzie 2018-06-20 14:16 ` Stefan Monnier @ 2018-06-26 18:52 ` Alan Mackenzie 2018-06-26 19:45 ` João Távora 1 sibling, 1 reply; 93+ messages in thread From: Alan Mackenzie @ 2018-06-26 18:52 UTC (permalink / raw) To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha Hello, João. On Tue, Jun 19, 2018 at 05:02:44 +0000, Alan Mackenzie wrote: > On Mon, Jun 18, 2018 at 18:01:18 +0100, João Távora wrote: > [ .... ] > Maybe we're looking at this the wrong way. > How about this idea: we add a new syntax flag to Emacs, ", which > terminates any open string, the same way the syntax > terminates any > open comment. We could then set this syntax flag on newline. This isn't a sensible idea, because it wouldn't solve any of the problems we have with the string-fence syntax. Instead, maybe we should add a new syntactic symbol to Emacs, "one-line string quote". A string opened by such a delimiter would be terminated either by the same quote again, or a newline. This would have the advantage of making fontification easy, whilst still allowing syntactic operations within an invalid string. For example, in char *foo = "( )" , the "s would have "one-line string quote" syntax and be fontified with warning face, but a C-M-n from the ( would still move point to after the ), and all the electric-pair-mode stuff would still work. > This would have the disadvantage (for CC Mode) that it wouldn't work > with older Emacsen. But it might solve the various problems we've > stumbled over in the last few days. This paragraph would still hold. > > João -- Alan Mackenzie (Nuremberg, Germany). ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem". 2018-06-26 18:52 ` Alan Mackenzie @ 2018-06-26 19:45 ` João Távora 2018-06-26 20:09 ` Alan Mackenzie 0 siblings, 1 reply; 93+ messages in thread From: João Távora @ 2018-06-26 19:45 UTC (permalink / raw) To: Alan Mackenzie; +Cc: Glenn Morris, Emacs developers, Tino Calancha > Hello, João. Hi Alan, Alan Mackenzie <acm@muc.de> writes: >> [ .... ] >> Maybe we're looking at this the wrong way. >> How about this idea: we add a new syntax flag to Emacs, ", which >> terminates any open string, the same way the syntax > terminates any >> open comment. We could then set this syntax flag on newline. > This isn't a sensible idea. because it wouldn't solve any of the > problems we have with the string-fence syntax. You realize you're replying to your own suggestion, right? (just checking...) > This would have the advantage of making fontification easy, whilst still > allowing syntactic operations within an invalid string. For example, in > > char *foo = "( > )" > > , the "s would have "one-line string quote" syntax and be fontified with > warning face, but a C-M-n from the ( would still move point to after the > ), and all the electric-pair-mode stuff would still work. Ignoring any complications or complexity that would arise from it, that sounds great (though more important than supporting e-p-m is having C-M-u work from inside the string, which I suppose is included). João ^ permalink raw reply [flat|nested] 93+ messages in thread
* Re: CC Mode and electric-pair "problem".
From: Alan Mackenzie @ 2018-06-26 20:09 UTC
To: João Távora; +Cc: Glenn Morris, Emacs developers, Tino Calancha

Hello, João.

On Tue, Jun 26, 2018 at 20:45:44 +0100, João Távora wrote:

> Hi Alan,

> Alan Mackenzie <acm@muc.de> writes:

> >> [ .... ]
> >> Maybe we're looking at this the wrong way.
> >> How about this idea: we add a new syntax flag to Emacs, ", which
> >> terminates any open string, the same way the syntax > terminates any
> >> open comment. We could then set this syntax flag on newline.

> > This isn't a sensible idea, because it wouldn't solve any of the
> > problems we have with the string-fence syntax.

> You realize you're replying to your own suggestion, right? (just
> checking...)

I do, yes. :-)

> > This would have the advantage of making fontification easy, whilst still
> > allowing syntactic operations within an invalid string. For example, in

> >     char *foo = "(
> >     )"

> > , the "s would have "one-line string quote" syntax and be fontified with
> > warning face, but a C-M-n from the ( would still move point to after the
> > ), and all the electric-pair-mode stuff would still work.

> Ignoring any complications or complexity that would arise from it, that
> sounds great (though more important than supporting e-p-m is having
> C-M-u work from inside the string, which I suppose is included).

Indeed. The whole point is that if the syntax scanning starts outside
the one-line string, the newline acts as a terminator. If it starts
inside the string, the newline doesn't act as anything special.

The complications would come with things like scan-sexps, which when
starting after a newline and scanning backward, would have to check for
a one-line " in the line. I don't see such complications as being
unmanageable.

> João

--
Alan Mackenzie (Nuremberg, Germany).
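The asymmetry described above (a newline terminates the string only
when the scan starts outside it) and the extra check needed when
scanning backward can be sketched as a toy model. This is Python, not
Emacs Lisp or the real scan-sexps; both function names are
hypothetical, and the model assumes that a backslash escapes the next
character and that comments and character literals do not occur.

```python
def forward_list(text, pos):
    """Toy model of C-M-n started on the '(' at `pos` inside an open string.

    Because the scan starts inside the string, a newline is treated as
    ordinary text.  Returns the index just after the matching ')', or
    None if the parentheses are unbalanced.
    """
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    return None


def open_quote_in_line(text, newline_pos, quote='"'):
    """Toy model of the check a backward scan would need at a newline.

    Looks only at the line that ends at `newline_pos` and returns the
    index of a quote that is still open when the newline is reached,
    or None if every quote on that line is closed.
    """
    line_start = text.rfind("\n", 0, newline_pos) + 1
    open_at = None
    i = line_start
    while i < newline_pos:
        if text[i] == "\\":
            i += 2              # assumption: backslash escapes the next char
            continue
        if text[i] == quote:
            # A quote either opens a string or closes the open one.
            open_at = i if open_at is None else None
        i += 1
    return open_at


src = 'char *foo = "(\n)"\n'
print(forward_list(src, 13))        # scan starts on the '(' inside the string
print(open_quote_in_line(src, 14))  # newline that cuts off the first string
```

For the example from the thread, forward_list returns 16, the position
just after the ), and open_quote_in_line returns 12, the position of
the opening " on the first line. A real implementation would work on
syntax tables and text properties rather than raw characters.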
End of thread: 93 messages, 2018-05-22 7:42 to 2018-07-17 3:41 UTC.