* SMIE grammar for C-style function expressions
From: Nikolay Kudryavtsev @ 2021-09-27 14:39 UTC (permalink / raw)
To: help-gnu-emacs@gnu.org; +Cc: Stefan Monnier
Hello.
Recently, I've been working on a major mode with an indentation
implementation based on SMIE. I've also been collecting some notes with
the plan of writing a post on reddit, describing the experience gained.
Since the language I'm working on is kind of a nightmare to indent, some
part of my post is going to be SMIE-related.
With this in mind, I still have a very basic case that I don't think I
100% grok, so I've decided to ask here. The case is this: what's the
proper way to describe an SMIE grammar for C-style functions? I remember
Stefan mentioning that he had a prototype SMIE C implementation, but I
don't think he ever published it.
To explain, a C-style function definition grammar looks something like this:
f n ( a ) { s }
'f' here is a keyword, which in practice would be "function" or, say,
"def" as in Python, so let's mark it like this:
"f" n ( a ) { s }
Both '( a )' and '{ s }' are lists, and SMIE grammars generally prefer to
abstract those, so that the parens are still treated as openers and
closers. So now our grammar looks like this:
args = ( a )
stmts = { s }
"f" n args stmts
Now the grammar we ended up with is invalid in SMIE because two
nonterminals appear consecutively in it. The simplest solution would
probably be to use virtual keywords to separate our nonterminals:
"f" n " : " args " : " stmts
This of course puts a bit of a strain on the lexer, since it has to
recognize when to give out those tokens.
So, is this the recommended solution for such cases, or maybe there's
some other preferred way?
* Re: SMIE grammar for C-style function expressions
From: Filipp Gunbin @ 2021-09-27 18:41 UTC (permalink / raw)
To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org, Stefan Monnier
On 27/09/2021 17:39 +0300, Nikolay Kudryavtsev wrote:
> I remember Stefan talking that he has a prototype SMIE C
> implementation, but I don't think he ever published it.
sm-c-mode in GNU ELPA?
* Re: SMIE grammar for C-style function expressions
From: Nikolay Kudryavtsev @ 2021-09-27 19:36 UTC (permalink / raw)
To: Filipp Gunbin; +Cc: help-gnu-emacs@gnu.org, Stefan Monnier
Oh! Thank you. My bad for not looking enough on ELPA.
Looking at it, it seems that it decided not to bother tackling this
problem, which is understandable, especially for a prototype mode. But
this still leaves my question open.
* Re: SMIE grammar for C-style function expressions
From: Stefan Monnier @ 2021-09-28 2:34 UTC (permalink / raw)
To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org
> With this in mind, I still have a very basic case that I don't think I 100%
> grok, so I've decided to ask here. The case is this: what's the proper way
> to describe SMIE grammar for C-style functions? I remember Stefan talking
> that he has a prototype SMIE C implementation, but I don't think he ever
> published it.
>
> To explain, a C-style function definition grammar looks something like this:
>
> f n ( a ) { s }
Could you clarify what you mean by "describe"?
SMIE does not concern itself with trying to detect syntax errors, so
nothing of the above seems hard at all: it should "just work" without
any special effort (just make sure SMIE knows that "(" matches ")" and
"{" matches "}" and that's it).
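A minimal sketch of one way to tell SMIE about the matching delimiters is through the mode's syntax table. The name `foo-mode-syntax-table` is hypothetical, and in many modes the inherited standard syntax table already has these entries, so this may be redundant:

```elisp
;; Sketch only: `foo-mode-syntax-table' is a hypothetical name, not taken
;; from any existing mode.
(defvar foo-mode-syntax-table
  (let ((st (make-syntax-table)))
    ;; "()" means: open delimiter whose matching closer is ")".
    (modify-syntax-entry ?\( "()" st)
    ;; ")(" means: close delimiter whose matching opener is "(".
    (modify-syntax-entry ?\) ")(" st)
    (modify-syntax-entry ?\{ "(}" st)
    (modify-syntax-entry ?\} "){" st)
    st)
  "Syntax table that pairs up \"(\" with \")\" and \"{\" with \"}\".")
```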
[ I'd expect more difficulty when you consider sequences of such
declarations, as in:
f n1 ( a1 ) { s1 }
f n2 ( a2 ) { s2 }
where there's no clear separator. In such cases the better option
usually involves treating `f` as an infix separator (rather than
a prefix). ]
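As a rough illustration of the "infix separator" idea, the grammar could declare `f` the same way one would usually declare ";" between statements. This is only a sketch under that assumption; `foo-decls-grammar` and the `decls` nonterminal are made-up names:

```elisp
(require 'smie)

;; Sketch only: `foo-decls-grammar' and `decls' are hypothetical names.
(defconst foo-decls-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    ;; "f" sits between declarations, the way ";" usually sits
    ;; between statements.
    '((decls (decls "f" decls)))
    ;; Resolve the "f" vs "f" precedence conflict by making it associative.
    '((assoc "f"))))
  "Grammar sketch in which \"f\" is an infix separator of declarations.")
```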
Maybe the problem you're seeing has to do with having specific
indentation rules inside the `s` part? If so, the problem is
better solved in the SMIE indentation rules than in the SMIE grammar.
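For illustration, indentation tweaks of that kind would live in the function installed as `smie-rules-function`. The sketch below is an assumption about what such a function might look like; `foo-indent-offset` and `foo-smie-rules` are hypothetical names, and the rule for "{" is just an example, not a recommendation:

```elisp
(require 'smie)

(defvar foo-indent-offset 4
  "Hypothetical basic indentation step for this sketch.")

(defun foo-smie-rules (kind token)
  "Hypothetical `smie-rules-function' for this sketch.
KIND and TOKEN are passed by SMIE; returning nil defers to its defaults."
  (pcase (cons kind token)
    ;; Basic indentation step.
    (`(:elem . basic) foo-indent-offset)
    ;; Example rule: offset for what follows an opening brace.
    (`(:after . "{") foo-indent-offset)))
```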
Stefan
* Re: SMIE grammar for C-style function expressions
From: Nikolay Kudryavtsev @ 2021-09-28 11:47 UTC (permalink / raw)
To: Stefan Monnier; +Cc: help-gnu-emacs@gnu.org
I understand that SMIE does not concern itself with the AST, but I think
it's reasonable to start by describing the language grammar for it in
the most naive way, until we have a good reason not to. "f n ( a ) { s }"
is a single expression in the language, so someone implementing a
grammar for such a language may first try to declare it in SMIE as a
single expression.
To try and keep the problem stated in practical terms, let's say the
language grammar looks like this:
args = ( a )
stmts = { s }
"f" n args stmts "f-end"
With the point after "f-end", I want backward-sexp to put me past "f"
and for this:
(A) I have to keep the entire expression within a single SMIE declaration.
(B) My lexer has to add some virtual keywords to separate nonterminals:
"f" n " : " args " : " stmts "f-end"
Is this the proper approach? Are my assumptions A and B correct?
P.S. I've added "f-end" here to sidestep the ambiguity caused by the last
token in a real C-style expression closing both the main expression and
the child expression, since that problem, while real, is not the issue here.
* Re: SMIE grammar for C-style function expressions
From: Stefan Monnier @ 2021-09-28 13:07 UTC (permalink / raw)
To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org
> I understand that SMIE does not concern itself with the AST, but I think
> it's reasonable to start with describing the language grammar for it in the
> most naive way, until we have a good reason not to. "f n ( a ) { s }" is
> a single expression in the language, so someone implementing a grammar for
> such language may first try to declare it in SMIE as a single expression.
No, the better way to work with SMIE is to fix the problems that occur
rather than to try and first reproduce all the details of the grammar:
the SMIE grammars are almost always an approximation of the real grammar
that accepts meaningless programs as well, but we don't care.
> To try and keep the problem stated in practical terms, lets say the language
> grammar looks like this:
>
> args = ( a )
>
> stmts = { s }
>
> "f" n args stmts "f-end"
Any sequence of "atomic" elements is automatically handled by SMIE
without having to specify it in the grammar. If SMIE knows that "("
matches ")" then "( a )" is such an "atomic" element (similarly, if
SMIE is told that "f" matches "f-end", then "f ... f-end" is also an
atomic element).
So for the above example, the grammar just needs something like:
    (defvar foo-grammar
      '((exp ("(" exp ")")
             ("{" exp "}")
             ("f" exp "f-end"))))
And you may even skip the first two entries because they're probably
already described in the syntax-table.
Now the above will understand
f n (a) {s} f-end
like you want it to. It will also "understand"
f {a} n f (haha) f-end f-end
even though it's non-sensical, but we usually don't care about what SMIE
does with such nonsensical code.
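To make the above concrete, here is a minimal sketch of how such a grammar might be compiled and installed. It assumes the standard syntax table already pairs up the parens and braces and treats "-" as a symbol constituent (so the default lexer reads "f-end" as one token). `foo-smie-grammar` and `foo-mode` are hypothetical names, and `ignore` stands in for a real rules function:

```elisp
(require 'smie)

;; Sketch only: `foo-smie-grammar' and `foo-mode' are hypothetical names.
(defconst foo-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    ;; Only the keyword pair is declared; "(" ")" and "{" "}" are
    ;; assumed to be handled through the syntax table.
    '((exp ("f" exp "f-end")))))
  "Grammar sketch declaring that \"f\" is closed by \"f-end\".")

(define-derived-mode foo-mode prog-mode "Foo"
  "Hypothetical major mode used only to install the grammar sketch."
  ;; `ignore' always returns nil, i.e. use SMIE's default indentation.
  (smie-setup foo-smie-grammar #'ignore))
```

With this in place, `backward-sexp` from just after "f-end" should land before the matching "f", which is the navigation asked about earlier in the thread.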
Stefan
* Re: SMIE grammar for C-style function expressions
From: Nikolay Kudryavtsev @ 2021-09-28 18:12 UTC (permalink / raw)
To: Stefan Monnier; +Cc: help-gnu-emacs@gnu.org
I see. I was aware of this option, but letting anything happen in the
middle of statements for which we know the proper grammar just feels
intuitively wrong to me, hence I didn't even mention it. But I guess
it's good enough to solve the issues SMIE concerns itself with.
Thanks for explaining.
* Re: SMIE grammar for C-style function expressions
From: Stefan Monnier @ 2021-09-28 18:45 UTC (permalink / raw)
To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org
> I see. I was aware of this option, but letting anything happen in the middle
> of statements for which we know the proper grammar just feels intuitively
> wrong to me,
;-)
This design decision is at the core of SMIE. I won't argue that it can
be odd at first (and limiting at times), but it is the key design
element that allows the system to work reasonably well despite
its simplicity.
Stefan