* SMIE: defining a build.ninja grammar
From: Konstantin Kharlamov @ 2023-04-07 14:18 UTC (permalink / raw)
  To: help-gnu-emacs

I initially posted this on Emacs Stack Exchange, but in the absence of replies I decided to re-post it on the mailing list, since that's where people who know how SMIE works usually hang out.

I'm trying to write a mode using SMIE, both to figure out how it works and to create some documentation¹.

`build.ninja` (the input file of Ninja, a build system used as a backend by Meson and others) is a perfect candidate due to its very simple syntax. So despite there already being a `ninja-mode`, I decided to create one based on SMIE and possibly to submit it for inclusion in upstream Emacs.

Syntax showcase (barring that there are a few more keywords and that `rule` only allows special variables):

    rule my_rule_title
      local_var_rule  = some text
      command         = cc -c $in -o $out

    global_var = some text

    build path/obj.o: my_rule_title path/obj.c
      local_var_build = some text

Basically, `rule` and `build` accept a few parameters and have a body. The body may only contain variable assignments and is characterized by a non-zero indentation level. So you can see that `local_var_rule` is inside a `rule` region, but `global_var` is outside it.

-------------

I have spent some time studying other SMIE-based modes, reading documentation, and writing code. At this point I've monkey-typed something that works, though not properly, and I think the main reason is that I don't know whether my grammar is correct (it probably isn't). My current grammar is attached at the bottom.

So, here are the questions I couldn't find answers to:

1. Does a grammar have to cover the complete buffer, or only the interesting parts?

   To give an example: the `build.ninja` sample above has `rule` and `build` paragraphs. Obviously that means I have to write at least two SMIE rules: one to cover a `rule` and another to cover a `build`. But once that's done, do I also need a rule that connects the two at the level of the entire buffer, i.e. one that says "the buffer is expected to be composed of `rule`s and `build`s"? Or is having just the two enough?
2. How do I define which characters an identifier may contain? For example, a `build` title may contain slashes and escaped spaces, but variable and `rule` names are not allowed to have them.
3. How do I define newline as a separator? E.g. a `build` line ends with a newline, and a region of assignments follows. I tried using `"\n"`, but I'm not sure whether SMIE interprets the backslash, or whether `\n` will work with other newline types.
   * Sub-question: how do I define that a line may continue on the next one if it ends with a `$` (which escapes the newline)? I guess if `"\n"` works, then I just have to create a separate rule for `"$\n"`. But I decided to ask explicitly in case the answer to question 3 is more complicated than that.
4. How do I define a non-zero indentation token, i.e. express that a variable assignment belongs to the preceding `build` or `rule`?

-------------

My latest attempt is the grammar below. I had some other variants that worked incorrectly, but they were incomplete as well. For this post I created a more complete version, but it does not compile for me because it doesn't like the `text` definition: it throws `Adjacent non-terminals: id text`.

    (defvar test-mode-smie-grammar
      (smie-prec2->grammar
       (smie-bnf->prec2
        '((id)
          (path) ;; TODO: define how it's different from `id'
          (statements (statement)
                      (statement "\n" statements))
          (statement (top_decls) (variable))
          (text (id text)
                (text "\n"))
          (variable (id "=" text))
          (build_title (path build_title)
                       (path ":"))
          (top_decls
           ("rule" id)
           ("build" build_title ":" text)
           )
          ))))
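
For comparison, here is a minimal sketch of a flatter grammar that avoids placing two non-terminals side by side, which is what `smie-bnf->prec2` rejects in `(id text)` (and would reject in `(path build_title)` too). It is an untested assumption about how the pieces could fit, not a known-good answer: the name `ninja-sketch-smie-grammar` is made up, every value or path is collapsed into a single `id`, and the `"\n"` token only means something if the lexer actually returns it.

```elisp
(require 'smie)

;; Hypothetical sketch: every non-terminal is separated from the next
;; one by a terminal (a string), as operator-precedence grammars require.
(defvar ninja-sketch-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      ;; The whole buffer is a newline-separated list of statements.
      (statements (statements "\n" statements)
                  (statement))
      (statement ("rule" id)          ; rule my_rule_title
                 ("build" id ":" id)  ; build outputs: rule-and-inputs
                 (id "=" id)))        ; variable = value
    ;; Resolve the ambiguity of `statements "\n" statements'.
    '((assoc "\n"))))
  "Hypothetical SMIE grammar sketch for build.ninja files.")
```

The result is an alist of tokens with their left and right precedence levels, so `(assoc "build" ninja-sketch-smie-grammar)` is a quick way to check that a keyword made it into the table.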

1: https://emacs.stackexchange.com/questions/20264/is-there-any-smie-documentation-that-is-clear





* Re: SMIE: defining a build.ninja grammar
From: tomas @ 2023-04-09  4:51 UTC (permalink / raw)
  To: help-gnu-emacs

On Fri, Apr 07, 2023 at 05:18:27PM +0300, Konstantin Kharlamov wrote:
> I initially posted this on Emacs Stack Exchange, but in the absence of replies I decided to re-post it on the mailing list, since that's where people who know how SMIE works usually hang out.

I know nothing about SMIE, but I think I can answer one of
your questions -- perhaps in a surprising way:

[...]

> 3. How do I define newline as a separator? E.g. a `build` line ends with a newline, and a region of assignments follows. I tried using `"\n"`, but I'm not sure whether SMIE interprets the backslash, or whether `\n` will work with other newline types.

[...]

> My latest attempt is the grammar below. I had some other variants that worked incorrectly, but they were incomplete as well. For this post I created a more complete version, but it does not compile for me because it doesn't like the `text` definition: it throws `Adjacent non-terminals: id text`.
> 
>     (defvar test-mode-smie-grammar
>       (smie-prec2->grammar
>        (smie-bnf->prec2
>         '((id)
>           (path) ;; TODO: define how it's different from `id'
>           (statements (statement)
>                       (statement "\n" statements))
>           (statement (top_decls) (variable))
>           (text (id text)
>                 (text "\n"))
>           (variable (id "=" text))
>           (build_title (path build_title)
>                        (path ":"))
>           (top_decls
>            ("rule" id)
>            ("build" build_title ":" text)
>            )
>           ))))

Those "\n" you have there are translated by the Lisp reader
(i.e. "early") into code point 0x0A. So this is what SMIE is
going to see.
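
This is easy to check in `M-x ielm` or the scratch buffer. A small sketch (the variable name is made up for illustration):

```elisp
;; The reader has already turned the escape into a single character
;; by the time any function (SMIE included) receives the string.
(defvar newline-demo-string "\n"
  "Hypothetical name; the same literal as used in the grammar.")

(length newline-demo-string)                ; 1: one character, not two
(string-to-char newline-demo-string)        ; 10, i.e. #x0A
(string= newline-demo-string (string 10))   ; t

;; A doubled backslash is what would keep the two characters `\' and `n'.
(length "\\n")                              ; 2
```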

Now, no idea whether it special-cases this into "whatever counts
as a newline"; I'd venture a guess that it doesn't. I'd expect
that if "your" newlines are more complex than a simple "\n",
you'll have to extend the lexer, as detailed in "Defining
Tokens" [1] in the manual.
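
To give an idea of what such a lexer extension might look like, here is a rough sketch of a forward-token function. It rests on several assumptions: the name `ninja-sketch-forward-token` is made up, only `:`, `=` and newline are treated as special, and a `$` that directly follows other text without whitespace is not handled. It relies on Emacs normally converting CRLF and CR line endings to LF when it decodes a file, so the buffer text itself usually contains only "\n".

```elisp
(defun ninja-sketch-forward-token ()
  "Hypothetical lexer sketch: return the token after point and skip it.
Newlines are returned as the token \"\\n\"; a `$' right before a
newline is treated as a line continuation and produces no token."
  ;; Skip horizontal whitespace only, so newlines stay visible.
  (skip-chars-forward " \t")
  (cond
   ;; "$" + newline: skip both and lex whatever comes next.
   ((looking-at "\\$\n")
    (goto-char (match-end 0))
    (ninja-sketch-forward-token))
   ;; A plain newline becomes a separator token.
   ((looking-at "\n")
    (goto-char (match-end 0))
    "\n")
   ;; Single-character operators.
   ((looking-at "[:=]")
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   ;; Anything else up to whitespace or an operator is one word.
   ;; At end of buffer this returns the empty string.
   (t
    (buffer-substring-no-properties
     (point)
     (progn (skip-chars-forward "^ \t\n:=") (point))))))
```

A function like this would be passed to `smie-setup` through its `:forward-token` keyword, together with a matching function for `:backward-token` that does the same thing in the opposite direction.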

Cheers

[1] Info menu "SMIE Lexer" or this URL, if you prefer the intertubes:
   https://www.gnu.org/software/emacs/manual/html_node/elisp/SMIE-Lexer.html

-- 
t
