bug#26217: 25.2; shell syntax does not know for i do

unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed

* bug#26217: 25.2; shell syntax does not know for i do
@ 2017-03-22  9:10 Martin Vath
  2023-10-13 12:07 ` bug#26217: bug#2910: 23.0.60; Shell-script coloring bug Mauro Aranda
  0 siblings, 1 reply; 9+ messages in thread
From: Martin Vath @ 2017-03-22  9:10 UTC (permalink / raw)
  To: 26217

1. M-x shell-script-mode
2. Type: "
for i do echo 1; done
for i; do echo 1; done
"

Despite both lines are correct full loops according to POSIX
and in fact understood by all current shells (see below),
this does not correspond to the indentation of emacs and
coloring of "do" in the first line.

That the first line is indeed valid by POSIX and understood
by the shells was discussed recently in the German Usegroup
de.comp.os.unix.shell (OT under "Funktion aus find aufrufen"),
but you can also check the grammer for "for_clause" in
http://pubs.opengroup.org/onlinepubs/000095399/utilities/xcu_chap02.html
noting that it is valid that "linebreak" expands to the empty
string. Another hint that this is intentional in POSIX is the
explicit mentioning of "do" in Section 2.4 of the above page
as the _third_ word in a for command.

Tested on GNU Emacs 25.2.1 and GNU Emacs 24.4.1

(Removing further build data, since the issue was reproduced
on different machines and distributions before reporting.)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2017-03-22  9:10 bug#26217: 25.2; shell syntax does not know for i do Martin Vath
@ 2023-10-13 12:07 ` Mauro Aranda
  2023-10-13 16:06   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 9+ messages in thread
From: Mauro Aranda @ 2023-10-13 12:07 UTC (permalink / raw)
  To: 26217; +Cc: Martin Vath, Stefan Monnier

Martin Vath <martin@mvath.de> writes:

 > 1. M-x shell-script-mode
 > 2. Type: "
 > for i do echo 1; done
 > for i; do echo 1; done
 > "
 >
 > Despite both lines are correct full loops according to POSIX
 > and in fact understood by all current shells (see below),
 > this does not correspond to the indentation of emacs and
 > coloring of "do" in the first line.

This Bug Report (and the other ones merged with it) have two parts:
- No highlight for "do" in:
for i do echo 1; done

This works if using bash-ts-mode, which claims in its docstring it can
support Bash or sh.

I don't know what's the decision (if there is one) about situations like
these.  If the bugs in lang-mode are fixed by lang-ts-mode, will these
bug reports be treated as: wontfix? fixed? open in case someone wants to
spend time in lang-mode?

- Wrong indentation for lines after:
for i do echo 1; done

I took a look at this, and ISTM that giving "do" a special treatment
like sh-smie--sh-keyword-p gives to "in" might fix this, perhaps by
reusing sh-smie--sh-keyword-in-p.

I know very little about SMIE, so I'm CCing Stefan M.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-13 12:07 ` bug#26217: bug#2910: 23.0.60; Shell-script coloring bug Mauro Aranda
@ 2023-10-13 16:06   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-14 12:44     ` Mauro Aranda
  2023-10-14 16:43     ` Dmitry Gutov
  0 siblings, 2 replies; 9+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-10-13 16:06 UTC (permalink / raw)
  To: Mauro Aranda; +Cc: Martin Vath, 26217

> I don't know what's the decision (if there is one) about situations like
> these.  If the bugs in lang-mode are fixed by lang-ts-mode, will these
> bug reports be treated as: wontfix? fixed? open in case someone wants to
> spend time in lang-mode?

AFAIK we don't consider the `foo-ts-mode` to obsolete the other modes.
Maybe we will, but we don't yet.  IMO I think we'd first need to have
a good long-term strategy about what we'll do when tree-sitter becomes
unmaintained/obsolete.  IOW I think we need to develop our own layer of
abstraction above tree-sitter so that we can accommodate other
parser backends.

FWIW, it's not clear at all what such a layer would look like, so we're
pretty far from it.  I'd welcome people start thinking about it, maybe
by looking at existing alternatives like our own `wisi` (in GNU ELPA),
SMIE, maybe LSP (assuming there are servers out there which can provide
that kind of functionality), etc...

> - Wrong indentation for lines after:
> for i do echo 1; done
>
> I took a look at this, and ISTM that giving "do" a special treatment
> like sh-smie--sh-keyword-p gives to "in" might fix this, perhaps by
> reusing sh-smie--sh-keyword-in-p.

Sounds about right.

        Stefan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-13 16:06   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-10-14 12:44     ` Mauro Aranda
  2023-10-14 14:58       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-14 16:43     ` Dmitry Gutov
  1 sibling, 1 reply; 9+ messages in thread
From: Mauro Aranda @ 2023-10-14 12:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Martin Vath, 26217

[-- Attachment #1: Type: text/plain, Size: 1654 bytes --]

tags 26217 patch
quit


On 13/10/23 13:06, Stefan Monnier wrote:
 >> I don't know what's the decision (if there is one) about situations like
 >> these.  If the bugs in lang-mode are fixed by lang-ts-mode, will these
 >> bug reports be treated as: wontfix? fixed? open in case someone wants to
 >> spend time in lang-mode?
 >
 > AFAIK we don't consider the `foo-ts-mode` to obsolete the other modes.
 > Maybe we will, but we don't yet.  IMO I think we'd first need to have
 > a good long-term strategy about what we'll do when tree-sitter becomes
 > unmaintained/obsolete.  IOW I think we need to develop our own layer of
 > abstraction above tree-sitter so that we can accommodate other
 > parser backends.
 >
 > FWIW, it's not clear at all what such a layer would look like, so we're
 > pretty far from it.  I'd welcome people start thinking about it, maybe
 > by looking at existing alternatives like our own `wisi` (in GNU ELPA),
 > SMIE, maybe LSP (assuming there are servers out there which can provide
 > that kind of functionality), etc...

Thank you for your response.  I don't know if this has been raised in
emacs-devel, but IMO it should be.

 >> - Wrong indentation for lines after:
 >> for i do echo 1; done
 >>
 >> I took a look at this, and ISTM that giving "do" a special treatment
 >> like sh-smie--sh-keyword-p gives to "in" might fix this, perhaps by
 >> reusing sh-smie--sh-keyword-in-p.
 >
 > Sounds about right.

I attach a patch that should also handle the fontification issue. It
also comes with some tests, and I've did some manual testing on my own.
I haven't found problems, but I'm suspicious because it seems too easy.

[-- Attachment #2: 0001-Fix-indentation-and-fontification-in-shell-script-Bu.patch --]
[-- Type: text/x-patch, Size: 3912 bytes --]

From 420fbfc1b2651839f214edbd809a663f42dcf080 Mon Sep 17 00:00:00 2001
From: Mauro Aranda <maurooaranda@gmail.com>
Date: Sat, 14 Oct 2023 09:05:35 -0300
Subject: [PATCH] Fix indentation and fontification in shell-script (Bug#26217)

* lisp/progmodes/sh-script.el (sh-smie--sh-keyword-p): Treat "do" as
special, like we treat "in".
(sh-smie--sh-keyword-in-p): Change signature.  Take the token to
decide correctly if it's a keyword.
(sh-font-lock-keywords-var-1): Add do.

* test/lisp/progmodes/sh-script-resources/sh-indents.erts: New test.
* test/lisp/progmodes/sh-script-tests.el
(sh-script-test-do-fontification): New test.
---
 lisp/progmodes/sh-script.el                         | 13 ++++++++-----
 .../progmodes/sh-script-resources/sh-indents.erts   |  7 +++++++
 test/lisp/progmodes/sh-script-tests.el              | 11 +++++++++++
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
index cc521cb0591..de76e175a10 100644
--- a/lisp/progmodes/sh-script.el
+++ b/lisp/progmodes/sh-script.el
@@ -869,7 +869,7 @@ sh-font-lock-keywords-var
   "Default expressions to highlight in Shell Script modes.  See `sh-feature'.")
 
 (defvar sh-font-lock-keywords-var-1
-  '((sh "[ \t]in\\>"))
+  '((sh "[ \t]\\(in\\|do\\)\\>"))
   "Subdued level highlighting for Shell Script modes.")
 
 (defvar sh-font-lock-keywords-var-2 ()
@@ -1809,8 +1809,8 @@ sh-smie--sh-operators-back-re
   (concat "\\(?:^\\|[^\\]\\)\\(?:\\\\\\\\\\)*"
           "\\(" sh-smie--sh-operators-re "\\)"))
 
-(defun sh-smie--sh-keyword-in-p ()
-  "Assuming we're looking at \"in\", return non-nil if it's a keyword.
+(defun sh-smie--sh-keyword-in/do-p (tok)
+  "When looking at TOK (either \"in\" or \"do\"), non-nil if TOK is a keyword.
 Does not preserve point."
   (let ((forward-sexp-function nil)
         (words nil)                     ;We've seen words.
@@ -1832,7 +1832,10 @@ sh-smie--sh-keyword-in-p
        ((equal prev ";")
         (if words (setq newline t)
           (setq res 'keyword)))
-       ((member prev '("case" "for" "select")) (setq res 'keyword))
+       ((member prev (if (string= tok "in")
+                         '("case" "for" "select")
+                       '("for" "select")))
+        (setq res 'keyword))
        ((assoc prev smie-grammar) (setq res 'word))
        (t
         (if newline
@@ -1844,7 +1847,7 @@ sh-smie--sh-keyword-p
   "Non-nil if TOK (at which we're looking) really is a keyword."
   (cond
    ((looking-at "[[:alnum:]_]+=") nil)
-   ((equal tok "in") (sh-smie--sh-keyword-in-p))
+   ((member tok '("in" "do")) (sh-smie--sh-keyword-in/do-p tok))
    (t (sh-smie--keyword-p))))
 
 (defun sh-smie--default-forward-token ()
diff --git a/test/lisp/progmodes/sh-script-resources/sh-indents.erts b/test/lisp/progmodes/sh-script-resources/sh-indents.erts
index 1f92610b3aa..36f4e4c22ab 100644
--- a/test/lisp/progmodes/sh-script-resources/sh-indents.erts
+++ b/test/lisp/progmodes/sh-script-resources/sh-indents.erts
@@ -38,3 +38,10 @@ if test ;then
 fi
 other
 =-=-=
+
+Name: sh-indents5
+
+=-=
+for i do echo 1; done
+for i; do echo 1; done
+=-=-=
diff --git a/test/lisp/progmodes/sh-script-tests.el b/test/lisp/progmodes/sh-script-tests.el
index 52c1303c414..135d7afe3fe 100644
--- a/test/lisp/progmodes/sh-script-tests.el
+++ b/test/lisp/progmodes/sh-script-tests.el
@@ -87,4 +87,15 @@ test-backward-token
   (should-not (test-sh-back "foo;bar"))
   (should (test-sh-back "foo#zot")))
 
+(ert-deftest sh-script-test-do-fontification ()
+  "Test that \"do\" gets fontified correctly, even with no \";\"."
+  (with-temp-buffer
+    (shell-script-mode)
+    (insert "for i do echo 1; done")
+    (font-lock-ensure)
+    (goto-char (point-min))
+    (search-forward "do")
+    (forward-char -1)
+    (should (equal (get-text-property (point) 'face) 'font-lock-keyword-face))))
+
 ;;; sh-script-tests.el ends here
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-14 12:44     ` Mauro Aranda
@ 2023-10-14 14:58       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-14 15:01         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-10-14 14:58 UTC (permalink / raw)
  To: Mauro Aranda; +Cc: Martin Vath, 26217

> I haven't found problems, but I'm suspicious because it seems too easy.

Welcome to the world of Emacs, where "if it's easy it must be right".


        Stefan






^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-14 14:58       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-10-14 15:01         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-10-14 15:01 UTC (permalink / raw)
  To: Mauro Aranda; +Cc: Martin Vath, 2910-done, 26217-done

>> I haven't found problems, but I'm suspicious because it seems too easy.
> Welcome to the world of Emacs, where "if it's easy it must be right".

BTW, pushed to `master`, thank you.


        Stefan






^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-13 16:06   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-14 12:44     ` Mauro Aranda
@ 2023-10-14 16:43     ` Dmitry Gutov
  2023-10-14 19:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  1 sibling, 1 reply; 9+ messages in thread
From: Dmitry Gutov @ 2023-10-14 16:43 UTC (permalink / raw)
  To: Stefan Monnier, Mauro Aranda; +Cc: Martin Vath, 26217

On 13/10/2023 19:06, Stefan Monnier via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
>> I don't know what's the decision (if there is one) about situations like
>> these.  If the bugs in lang-mode are fixed by lang-ts-mode, will these
>> bug reports be treated as: wontfix? fixed? open in case someone wants to
>> spend time in lang-mode?
> 
> AFAIK we don't consider the `foo-ts-mode` to obsolete the other modes.
> Maybe we will, but we don't yet.  IMO I think we'd first need to have
> a good long-term strategy about what we'll do when tree-sitter becomes
> unmaintained/obsolete.  IOW I think we need to develop our own layer of
> abstraction above tree-sitter so that we can accommodate other
> parser backends.
> 
> FWIW, it's not clear at all what such a layer would look like, so we're
> pretty far from it.  I'd welcome people start thinking about it, maybe
> by looking at existing alternatives like our own `wisi` (in GNU ELPA),
> SMIE, maybe LSP (assuming there are servers out there which can provide
> that kind of functionality), etc...

I don't know how feasible that would be, given that the ts major modes 
we write have to reference fairly low level concerns (such as node 
names, different across all grammars).

Maybe porting Lezer (https://lezer.codemirror.net/) could become a 
replacement in such a scenario, but then we're back to maintaining our 
own grammars again, and with lower performance by an order of a magnitude.





^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-14 16:43     ` Dmitry Gutov
@ 2023-10-14 19:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2023-10-14 19:49         ` Dmitry Gutov
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2023-10-14 19:07 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Martin Vath, Mauro Aranda, 26217

>> FWIW, it's not clear at all what such a layer would look like, so we're
>> pretty far from it.  I'd welcome people start thinking about it, maybe
>> by looking at existing alternatives like our own `wisi` (in GNU ELPA),
>> SMIE, maybe LSP (assuming there are servers out there which can provide
>> that kind of functionality), etc...
> I don't know how feasible that would be,

That's the wrong way to look at it.  There is no doubt that it
is feasible.  What is under question is what it would take.

> given that the ts major modes we write have to reference fairly low
> level concerns (such as node names, different across all grammars).

That just means that a given set of highlighting/indentation rules would
not necessarily work with all possible parser-backends.  But maybe we
could bridge the gap by allowing some intermediate layer that could do
things like translate node names (could be useful even within the
tree-sitter context to deal with evolving grammars).

I'm not saying this is the way to go, mind you.  I don't know how it
could/should work.

But I do think we could do worse than start thinking about it, because
tree-sitter is the kind of technology that's on the "treadmill", whereas
Emacs' evolution has a different pace: how well will Emacs-29's TS modes
work with 2028's tree-sitter grammars?

> Maybe porting Lezer (https://lezer.codemirror.net/) could become

Interesting, thanks.

> a replacement in such a scenario, but then we're back to maintaining
> our own grammars again, and with lower performance by an order of
> a magnitude.

We could look into Lezer support just to help us guide the design of an
intermediate layer API.  No need to maintain lots of our own grammars or
take the performance impact :-)

        Stefan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* bug#26217: bug#2910: 23.0.60; Shell-script coloring bug
  2023-10-14 19:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2023-10-14 19:49         ` Dmitry Gutov
  0 siblings, 0 replies; 9+ messages in thread
From: Dmitry Gutov @ 2023-10-14 19:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Martin Vath, Mauro Aranda, 26217

On 14/10/2023 22:07, Stefan Monnier wrote:
>>> FWIW, it's not clear at all what such a layer would look like, so we're
>>> pretty far from it.  I'd welcome people start thinking about it, maybe
>>> by looking at existing alternatives like our own `wisi` (in GNU ELPA),
>>> SMIE, maybe LSP (assuming there are servers out there which can provide
>>> that kind of functionality), etc...
>> I don't know how feasible that would be,
> 
> That's the wrong way to look at it.  There is no doubt that it
> is feasible.  What is under question is what it would take.

And whether it would be in any way more economical than just waiting for 
the next comparable solution to come along and writing either 
compatibility shims inside each ts mode, or copying code and adapting 
into the next set of *-ts<N>-mode modes.

So for now we could be better off just coming up with an easier way to 
migrate user configurations across major modes for a given language than 
what we have now.

>> given that the ts major modes we write have to reference fairly low
>> level concerns (such as node names, different across all grammars).
> 
> That just means that a given set of highlighting/indentation rules would
> not necessarily work with all possible parser-backends.  But maybe we
> could bridge the gap by allowing some intermediate layer that could do
> things like translate node names (could be useful even within the
> tree-sitter context to deal with evolving grammars).

That's not impossible, but IME the sets of rules are fairly uniformly 
lower-level.

> I'm not saying this is the way to go, mind you.  I don't know how it
> could/should work.
> 
> But I do think we could do worse than start thinking about it, because
> tree-sitter is the kind of technology that's on the "treadmill", whereas
> Emacs' evolution has a different pace: how well will Emacs-29's TS modes
> work with 2028's tree-sitter grammars?

1. That's why I'm sure we'll start using some grammar pinning - either 
to the commit hash, or to the version range.

2. Overall it's a solid point: either tree-sitter "stabilizes" after a 
while, or goes off the map.

>> Maybe porting Lezer (https://lezer.codemirror.net/) could become
> 
> Interesting, thanks.
> 
>> a replacement in such a scenario, but then we're back to maintaining
>> our own grammars again, and with lower performance by an order of
>> a magnitude.
> 
> We could look into Lezer support just to help us guide the design of an
> intermediate layer API.  No need to maintain lots of our own grammars or
> take the performance impact :-)

It's as good an approach as any, though we have no guarantee that the 
"next tree-sitter" will have a similar enough shape to either of TS or 
Lezer, for an easy future migration.





^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-10-14 19:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-03-22  9:10 bug#26217: 25.2; shell syntax does not know for i do Martin Vath
2023-10-13 12:07 ` bug#26217: bug#2910: 23.0.60; Shell-script coloring bug Mauro Aranda
2023-10-13 16:06   ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-10-14 12:44     ` Mauro Aranda
2023-10-14 14:58       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-10-14 15:01         ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-10-14 16:43     ` Dmitry Gutov
2023-10-14 19:07       ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2023-10-14 19:49         ` Dmitry Gutov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).