unofficial mirror of help-gnu-emacs@gnu.org
* SMIE grammar for C-style function expressions
@ 2021-09-27 14:39 Nikolay Kudryavtsev
  2021-09-27 18:41 ` Filipp Gunbin
  2021-09-28  2:34 ` Stefan Monnier
  0 siblings, 2 replies; 8+ messages in thread
From: Nikolay Kudryavtsev @ 2021-09-27 14:39 UTC (permalink / raw)
  To: help-gnu-emacs@gnu.org; +Cc: Stefan Monnier

Hello.

Recently, I've been working on a major mode with an indentation 
implementation based on SMIE. I've also been collecting some notes, with 
the plan of writing a post on reddit describing the experience gained. 
Since the language I'm working on is kind of a nightmare to indent, 
part of my post is going to be SMIE-related.

With this in mind, I still have a very basic case that I don't think I 
100% grok, so I've decided to ask here. The case is this: what's the 
proper way to describe SMIE grammar for C-style functions? I remember 
Stefan mentioning that he has a prototype SMIE C implementation, but I 
don't think he ever published it.

To explain, a C-style function definition grammar looks something like this:

f n ( a ) { s }

'f' here is a keyword, which in practice would be "function" or, say, 
"def" like in Python, so let's mark it like this:

"f" n ( a ) { s }

Both '( a )' and '{ s }' are lists, and SMIE grammars generally prefer to 
abstract those so that parens are still treated as openers and closers. 
So now our grammar looks like this:

args = ( a )

stmts = { s }

"f" n args stmts

Now the grammar we ended up with is invalid in SMIE, because 
nonterminals appear consecutively in it. The simplest solution would 
probably be using virtual keywords to separate our nonterminals:

"f" n " : " args " : " stmts

This of course puts a bit of a strain on the lexer, since it has to 
recognize when to emit those tokens.

So, is this the recommended solution for such cases, or maybe there's 
some other preferred way?





* Re: SMIE grammar for C-style function expressions
  2021-09-27 14:39 SMIE grammar for C-style function expressions Nikolay Kudryavtsev
@ 2021-09-27 18:41 ` Filipp Gunbin
  2021-09-27 19:36   ` Nikolay Kudryavtsev
  2021-09-28  2:34 ` Stefan Monnier
  1 sibling, 1 reply; 8+ messages in thread
From: Filipp Gunbin @ 2021-09-27 18:41 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org, Stefan Monnier

On 27/09/2021 17:39 +0300, Nikolay Kudryavtsev wrote:

> I remember Stefan mentioning that he has a prototype SMIE C
> implementation, but I don't think he ever published it.

sm-c-mode in GNU ELPA?




* Re: SMIE grammar for C-style function expressions
  2021-09-27 18:41 ` Filipp Gunbin
@ 2021-09-27 19:36   ` Nikolay Kudryavtsev
  0 siblings, 0 replies; 8+ messages in thread
From: Nikolay Kudryavtsev @ 2021-09-27 19:36 UTC (permalink / raw)
  To: Filipp Gunbin; +Cc: help-gnu-emacs@gnu.org, Stefan Monnier

Oh! Thank you. My bad for not looking hard enough on ELPA.

Looking at it, it seems that it decided not to bother tackling this 
problem, which is understandable, especially for a prototype mode. But 
this still leaves my question open.





* Re: SMIE grammar for C-style function expressions
  2021-09-27 14:39 SMIE grammar for C-style function expressions Nikolay Kudryavtsev
  2021-09-27 18:41 ` Filipp Gunbin
@ 2021-09-28  2:34 ` Stefan Monnier
  2021-09-28 11:47   ` Nikolay Kudryavtsev
  1 sibling, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2021-09-28  2:34 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org

> With this in mind, I still have a very basic case that I don't think I 100%
> grok, so I've decided to ask here. The case is this: what's the proper way
> to describe SMIE grammar for C-style functions? I remember Stefan mentioning
> that he has a prototype SMIE C implementation, but I don't think he ever
> published it.
>
> To explain, a C-style function definition grammar looks something like this:
>
> f n ( a ) { s }

Could you clarify what you mean by "describe"?

SMIE does not concern itself with trying to detect syntax errors, so
nothing of the above seems hard at all: it should "just work" without
any special effort (just make sure SMIE knows that "(" matches ")" and
"{" matches "}" and that's it).

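A minimal sketch of what "SMIE knows that ( matches )" could look like at
the syntax-table level.  The names `demo-syntax-table' and
`demo-last-sexp-start' are hypothetical and only serve the illustration;
the standard syntax table already pairs these characters, so the entries
are spelled out purely for clarity.

```elisp
;; Hypothetical names, for illustration only.
(defvar demo-syntax-table
  (let ((st (make-syntax-table)))
    ;; Declare each bracket pair as matching open/close delimiters.
    (modify-syntax-entry ?\( "()" st)
    (modify-syntax-entry ?\) ")(" st)
    (modify-syntax-entry ?\{ "(}" st)
    (modify-syntax-entry ?\} "){" st)
    st)
  "Syntax table in which ( matches ) and { matches }.")

(defun demo-last-sexp-start (text)
  "Return the position `backward-sexp' reaches from the end of TEXT."
  (with-temp-buffer
    (set-syntax-table demo-syntax-table)
    (insert text)
    (goto-char (point-max))
    ;; Plain syntax-table navigation; no SMIE grammar is involved here.
    (backward-sexp)
    (point)))
```

With such a table, (demo-last-sexp-start "f n ( a ) { s }") jumps over the
whole "{ s }" group in one step, which is the "atomic" treatment the
grammar can then rely on.
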
[ I'd expect more difficulty when you consider sequences of such
declarations, as in:

    f n1 ( a1 ) { s1 }
    f n2 ( a2 ) { s2 }

where there's no clear separator.  In such cases the better option
usually involves treating `f` as an infix separator (rather than
a prefix).  ]
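
One way the infix treatment might be written, as a sketch only: the
nonterminal names `decl' and `decls' and the variable name
`demo-infix-grammar' are assumptions, and the shape follows the
`insts ";" insts' pattern from the sample grammar in the SMIE manual.

```elisp
(require 'smie)

;; Hypothetical grammar: "f" separates declarations instead of opening one.
(defvar demo-infix-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((decl)                              ; left abstract: any atomic elements
      (decls (decls "f" decls) (decl)))   ; "f" acts like ";" between decls
    ;; Resolve the ambiguity of chained "f" by making it associative.
    '((assoc "f")))))
```

Under this reading each `f' sits between two declarations, so the text
between one `f' and the next is what gets treated as a unit.
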

Maybe the problem you're seeing has to do with having specific
indentation rules inside the `s` part?  If so, the problem is
better solved in the SMIE indentation rules than in the SMIE grammar.
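
As a rough sketch of handling this on the indentation-rules side, a rules
function could look like the one below.  The names `demo-indent-basic' and
`demo-smie-rules' and the offset of 4 are assumptions made for the example,
as is the choice to indent relative to the opening brace.

```elisp
(defvar demo-indent-basic 4
  "Hypothetical basic indentation step for this sketch.")

(defun demo-smie-rules (kind token)
  "Hypothetical SMIE rules function.
KIND and TOKEN are the arguments SMIE passes to `smie-rules-function'."
  (pcase (cons kind token)
    ;; Default step used when no more specific rule applies.
    (`(:elem . basic) demo-indent-basic)
    ;; Offset what follows an opening brace by one basic step.
    (`(:after . "{") demo-indent-basic)))
```

Such a function would be passed as the second argument to `smie-setup';
returning nil for the other cases leaves them to SMIE's defaults.
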


        Stefan





* Re: SMIE grammar for C-style function expressions
  2021-09-28  2:34 ` Stefan Monnier
@ 2021-09-28 11:47   ` Nikolay Kudryavtsev
  2021-09-28 13:07     ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolay Kudryavtsev @ 2021-09-28 11:47 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: help-gnu-emacs@gnu.org

I understand that SMIE does not concern itself with the AST, but I think 
it's reasonable to start with describing the language grammar for it in 
the most naive way, until we have a good reason not to. "f n ( a ) { s 
}" is a single expression in the language, so someone implementing a 
grammar for such language may first try to declare it in SMIE as a 
single expression.

To try and keep the problem stated in practical terms, let's say the 
language grammar looks like this:

args = ( a )

stmts = { s }

"f" n args stmts "f-end"

With point after "f-end", I want backward-sexp to move me back past "f", 
and for this:

(A) I have to keep the entire expression within a single SMIE declaration.

(B) My lexer has to add some virtual keywords to separate nonterminals:

"f" n " : " args " : " stmts "f-end"

Is this the proper approach? Are my assumptions A and B correct?

P. S. I've added "f-end" here to sidestep the ambiguity caused by the last 
token in a real C-style expression closing both the main and the child 
expression; that problem, while real, is not the issue here.





* Re: SMIE grammar for C-style function expressions
  2021-09-28 11:47   ` Nikolay Kudryavtsev
@ 2021-09-28 13:07     ` Stefan Monnier
  2021-09-28 18:12       ` Nikolay Kudryavtsev
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2021-09-28 13:07 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org

> I understand that SMIE does not concern itself with the AST, but I think
> it's reasonable to start with describing the language grammar for it in the
> most naive way, until we have a good reason not to.   "f n ( a ) { s }" is
> a single expression in the language, so someone implementing a grammar for
> such language may first try to declare it in SMIE as a single expression.

No, the better way to work with SMIE is to fix the problems that occur
rather than to try and first reproduce all the details of the grammar:
the SMIE grammars are almost always an approximation of the real grammar
that accepts meaningless programs as well, but we don't care.

> To try and keep the problem stated in practical terms, let's say the language
> grammar looks like this:
>
> args = ( a )
>
> stmts = { s }
>
> "f" n args stmts "f-end"

Any sequence of "atomic" elements is automatically handled by SMIE
without having to specify it in the grammar.  If SMIE knows that "("
matches ")" then "( a )" is such an "atomic" element (similarly, if
SMIE is told that "f" matches "f-end", then "f ... f-end" is also an
atomic element).
So for the above example, the grammar just needs something like:

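    (require 'smie)
    (defvar foo-grammar
      ;; Convert the BNF rules into the precedence table SMIE uses.
      (smie-prec2->grammar
       (smie-bnf->prec2
        '((exp ("(" exp ")")
               ("{" exp "}")
               ("f" exp "f-end"))))))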

And you may even skip the first two entries because they're probably
already described in the syntax-table.

Now the above will understand

   f n (a) {s} f-end

like you want it to.  It will also "understand"

   f {a} n f (haha) f-end f-end

even though it's nonsensical, but we usually don't care about what SMIE
does with such nonsensical code.
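
To try this out, a sketch along the following lines could be evaluated.
The names `demo-grammar' and `demo-backward-from-end' are hypothetical, the
rules function is a do-nothing placeholder, and only the f ... f-end
pairing is declared, leaving the parens to the syntax table.

```elisp
(require 'smie)

(defvar demo-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((exp ("f" exp "f-end")))))
  "Hypothetical grammar: only the f ... f-end pairing is declared.")

(defun demo-backward-from-end (text)
  "Insert TEXT in a scratch buffer and return where `backward-sexp' lands."
  (with-temp-buffer
    ;; A fresh table inherits the standard one, where ( ) and { } already
    ;; pair up and `-' is a symbol constituent, so "f-end" is one token.
    (set-syntax-table (make-syntax-table))
    (insert text)
    ;; `ignore' stands in for a real indentation rules function.
    (smie-setup demo-grammar #'ignore)
    (goto-char (point-max))
    (backward-sexp)
    (point)))
```

If the grammar behaves as described, both
(demo-backward-from-end "f n (a) {s} f-end") and
(demo-backward-from-end "f {a} n f (haha) f-end f-end") should return the
buffer position just before the leading `f'.
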


        Stefan





* Re: SMIE grammar for C-style function expressions
  2021-09-28 13:07     ` Stefan Monnier
@ 2021-09-28 18:12       ` Nikolay Kudryavtsev
  2021-09-28 18:45         ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Nikolay Kudryavtsev @ 2021-09-28 18:12 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: help-gnu-emacs@gnu.org

I see. I was aware of this option, but letting anything happen in the 
middle of statements for which we know the proper grammar just feels 
intuitively wrong to me, hence I didn't even mention it. But I guess 
it's good enough to solve the issues SMIE concerns itself with.

Thanks for explaining.





* Re: SMIE grammar for C-style function expressions
  2021-09-28 18:12       ` Nikolay Kudryavtsev
@ 2021-09-28 18:45         ` Stefan Monnier
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2021-09-28 18:45 UTC (permalink / raw)
  To: Nikolay Kudryavtsev; +Cc: help-gnu-emacs@gnu.org

> I see. I was aware of this option, but letting anything happen in the middle
> of statements for which we know the proper grammar just feels intuitively
> wrong to me,

;-)

This design decision is at the core of SMIE.  I won't argue that it can
be odd at first (and limiting at times), but it is the key design
element that allows the system to work reasonably well despite
its simplicity.


        Stefan



