unofficial mirror of help-gnu-emacs@gnu.org
 help / color / mirror / Atom feed
* Working around the limitations of SMIE
@ 2022-11-09 21:25 Philip Kaludercic
  2022-11-10  5:00 ` Stefan Monnier via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 5+ messages in thread
From: Philip Kaludercic @ 2022-11-09 21:25 UTC (permalink / raw)
  To: help-gnu-emacs


Hi,

I am writing a major mode for a little language I am using at
university, and wanted to try using SMIE for indentation and all the
other things.  The issue I find myself confronted with is that functions
are defined as in the following example:

    func funktion(x : int): float
      x := x * x;
      return x;
    end

where there is no delimiter between the return type (float), and the
rest of the body (such as "begin" or something like that).  My naive
approach at defining a grammar was as follows:

    (defconst e2-grammar
      (smie-prec2->grammar
       (smie-bnf->prec2
        '((id) (exp)
          (inst ("func" insts "end") ; <--
                ("var" id ":" id)
                ("if" exp "then" insts "else" insts "end")
                ("if" exp "then" insts "end")
                ("while" exp "do" insts "end")
                ("return" exp)
                (id ":=" exp))
          (insts (inst ";" insts) (inst)))))
      "SMIE grammar for e2.")

Cannot express the fact that a return type, a non-terminal is followed
by the rest of the body, another non-terminal.

Another issue I ran into with the above definition is that instructions
are not indented correctly, as the above grammar doesn't express that in
this language doesn't expect a semicolon after an end (just like C
doesn't expect one after a "}").  So the result is that

instead of:

  while y >= y1 do
    dummy := zeile(x1, x2, xstep, y);
    y := y - ystep;
  end
  return 0;

I get:

  while y >= y1 do
    dummy := zeile(x1, x2, xstep, y);
    y := y - ystep;
  end
    return 0;

because it tries to parse "while ... end return 0" as an instruction.

I've been looking around the web for other major modes that use SMIE but
couldn't find anything satisfying.  I wonder if it is necessary to
define a special token-function that generates whitespace tokens?  I
hope there is a better way.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Working around the limitations of SMIE
  2022-11-09 21:25 Working around the limitations of SMIE Philip Kaludercic
@ 2022-11-10  5:00 ` Stefan Monnier via Users list for the GNU Emacs text editor
  2022-11-11 16:20   ` Philip Kaludercic
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Monnier via Users list for the GNU Emacs text editor @ 2022-11-10  5:00 UTC (permalink / raw)
  To: help-gnu-emacs

> I am writing a major mode for a little language I am using at
> university, and wanted to try using SMIE for indentation and all the
> other things.  The issue I find myself confronted with is that functions
> are defined as in the following example:
>
>     func funktion(x : int): float
>       x := x * x;
>       return x;
>     end
>
> where there is no delimiter between the return type (float), and the
> rest of the body (such as "begin" or something like that).

How is the separation between the function's return type and the
function's body defined?  Is it based on the newline that follows the
type, or is the language constrained to have types that are
a single identifiers?

> Another issue I ran into with the above definition is that instructions
> are not indented correctly, as the above grammar doesn't express that in
> this language doesn't expect a semicolon after an end (just like C
> doesn't expect one after a "}").  So the result is that
>
> instead of:
>
>   while y >= y1 do
>     dummy := zeile(x1, x2, xstep, y);
>     y := y - ystep;
>   end
>   return 0;
>
> I get:
>
>   while y >= y1 do
>     dummy := zeile(x1, x2, xstep, y);
>     y := y - ystep;
>   end
>     return 0;

Based on my experience, I suspect that the simplest solution for this is
to make "end" return 2 tokens (the "end" and then a ghost ";").

You might check how I do it in `sml-mode` (I include the last commit
below my sig, which is where I used this solution instead of another
hack I used earlier), for a similar tho different situation (there it's
the "begin" token which plays double role, but the same approach should
work).


        Stefan


diff --git a/sml-mode.el b/sml-mode.el
index e128832bb0..172ffe59c7 100644
--- a/sml-mode.el
+++ b/sml-mode.el
@@ -1,9 +1,9 @@
 ;;; sml-mode.el --- Major mode for editing (Standard) ML  -*- lexical-binding: t; coding: utf-8 -*-
 
-;; Copyright (C) 1989,1999,2000,2004,2007,2010-2018  Free Software Foundation, Inc.
+;; Copyright (C) 1989-2020  Free Software Foundation, Inc.
 
 ;; Maintainer: Stefan Monnier <monnier@iro.umontreal.ca>
-;; Version: 6.9
+;; Version: 6.10
 ;; Keywords: SML
 ;; Author:	Lars Bo Nielsen
 ;;		Olin Shivers
@@ -12,7 +12,7 @@
 ;;		Matthew Morley <mjm@scs.leeds.ac.uk>
 ;;		Matthias Blume <blume@cs.princeton.edu>
 ;;		(Stefan Monnier) <monnier@iro.umontreal.ca>
-;; Package-Requires: ((emacs "24") (cl-lib "0.5"))
+;; Package-Requires: ((emacs "24.3") (cl-lib "0.5"))
 
 ;; This file is part of GNU Emacs.
 
@@ -55,6 +55,7 @@
 
 ;; - new var sml-abbrev-skeletons to control whether to include skeletons
 ;;   in the main abbrev table.
+;; - change parsing rule of "local"
 
 ;;;;; Changes since 5.0:
 
@@ -563,22 +564,6 @@ Regexp match data 0 points to the chars."
      (cond
       ((smie-rule-parent-p "datatype") (if (smie-rule-bolp) 2))
       ((smie-rule-parent-p "structure" "signature" "functor") 0)))
-    ;; Indent an expression starting with "local" as if it were starting
-    ;; with "fun".
-    (`(:before . "local") (smie-indent-keyword "fun"))
-    ;; FIXME: type/val/fun/... are separators but "local" is not, even though
-    ;; it appears in the same list.  Try to fix up the problem by hand.
-    ;; ((or (equal token "local")
-    ;;      (equal (cdr (assoc token smie-grammar))
-    ;;             (cdr (assoc "fun" smie-grammar))))
-    ;;  (let ((parent (save-excursion (smie-backward-sexp))))
-    ;;    (when (or (and (equal (nth 2 parent) "local")
-    ;;                   (null (car parent)))
-    ;;              (progn
-    ;;                (setq parent (save-excursion (smie-backward-sexp "fun")))
-    ;;                (eq (car parent) (nth 1 (assoc "fun" smie-grammar)))))
-    ;;      (goto-char (nth 1 parent))
-    ;;      (cons 'column (smie-indent-virtual)))))
     ))
 
 (defun sml-smie-definitional-equal-p ()
@@ -635,16 +620,37 @@ Assumes point is right before the | symbol."
          (skip-syntax-forward ".'"))
      (point))))
 
+;; "local...end" acts as a kind of parentheses, but it *also* terminates
+;; any preceding declaration: SMIE doesn't really know how to handle such
+;; a token correctly.  In the past we solved this problem in the
+;; `sml-smie-rules' with an ad-hoc indentation rule for `local', but instead
+;; we now just emit two tokens ("val" and "local") when we encounter
+;; a "local" keyword.
+(defvar-local sml-smie--pending-token nil
+  "Tokens already parsed but not yet emitted.
+Used when several tokens should be emitted at the same buffer position.
+Takes the form (DIR POS . TOKENS).")
+
 (defun sml-smie-forward-token ()
-  (let ((sym (sml-smie-forward-token-1)))
-    (cond
-     ((equal "op" sym)
-      (concat "op " (sml-smie-forward-token-1)))
-     ((member sym '("|" "of" "="))
-      ;; The important lexer for indentation's performance is the backward
-      ;; lexer, so for the forward lexer we delegate to the backward one.
-      (save-excursion (sml-smie-backward-token)))
-     (t sym))))
+  (or
+   (when sml-smie--pending-token
+     (if (and (eq 'forward (car sml-smie--pending-token))
+              (eql (point) (nth 1 sml-smie--pending-token))
+              (nthcdr 2 sml-smie--pending-token))
+         (pop (nthcdr 2 sml-smie--pending-token))
+       (setq sml-smie--pending-token nil)))
+   (let ((sym (sml-smie-forward-token-1)))
+     (cond
+      ((equal "op" sym)
+       (concat "op " (sml-smie-forward-token-1)))
+      ((member sym '("|" "of" "="))
+       ;; The important lexer for indentation's performance is the backward
+       ;; lexer, so for the forward lexer we delegate to the backward one.
+       (save-excursion (sml-smie-backward-token)))
+      ((string= sym "local")
+       (setq sml-smie--pending-token `(forward ,(point) ,sym))
+       "val")
+      (t sym)))))
 
 (defun sml-smie-backward-token-1 ()
   (forward-comment (- (point)))
@@ -656,18 +662,28 @@ Assumes point is right before the | symbol."
      (point))))
 
 (defun sml-smie-backward-token ()
-  (let ((sym (sml-smie-backward-token-1)))
-    (unless (zerop (length sym))
-      ;; FIXME: what should we do if `sym' = "op" ?
-      (let ((point (point)))
-	(if (equal "op" (sml-smie-backward-token-1))
-	    (concat "op " sym)
-	  (goto-char point)
-	  (cond
-	   ((string= sym "=") (if (sml-smie-definitional-equal-p) "d=" "="))
-	   ((string= sym "of") (if (sml-smie-non-nested-of-p) "=of" "of"))
-           ((string= sym "|") (if (sml-smie-datatype-|-p) "d|" "|"))
-	   (t sym)))))))
+  (or
+   (when sml-smie--pending-token
+     (if (and (eq 'backward (car sml-smie--pending-token))
+              (eql (point) (nth 1 sml-smie--pending-token))
+              (nthcdr 2 sml-smie--pending-token))
+         (pop (nthcdr 2 sml-smie--pending-token))
+       (setq sml-smie--pending-token nil)))
+   (let ((sym (sml-smie-backward-token-1)))
+     (unless (zerop (length sym))
+       ;; FIXME: what should we do if `sym' = "op" ?
+       (let ((point (point)))
+	 (if (equal "op" (sml-smie-backward-token-1))
+	     (concat "op " sym)
+	   (goto-char point)
+	   (cond
+	    ((string= sym "=") (if (sml-smie-definitional-equal-p) "d=" "="))
+	    ((string= sym "of") (if (sml-smie-non-nested-of-p) "=of" "of"))
+            ((string= sym "|") (if (sml-smie-datatype-|-p) "d|" "|"))
+            ((string= sym "local")
+             (setq sml-smie--pending-token `(backward ,(point) "val"))
+             sym)
+	    (t sym))))))))
 
 ;;;;
 ;;;; Imenu support
@@ -722,9 +738,8 @@ Assumes point is right before the | symbol."
     map)
   "Keymap for `sml-prog-proc-mode'.")
 
-(defvar sml-prog-proc--buffer nil
+(defvar-local sml-prog-proc--buffer nil
   "The inferior-process buffer to which to send code.")
-(make-variable-buffer-local 'sml-prog-proc--buffer)
 
 (cl-defstruct (sml-prog-proc-descriptor
             (:constructor sml-prog-proc-make)
@@ -764,7 +779,7 @@ Assumes point is right before the | symbol."
             (buf (sml-prog-proc--call run)))
         (with-current-buffer buf
           (if (and ppd (null sml-prog-proc-descriptor))
-              (set (make-local-variable 'sml-prog-proc-descriptor) ppd)))
+              (setq-local sml-prog-proc-descriptor ppd)))
         (setq sml-prog-proc--buffer buf)
         (get-buffer-process sml-prog-proc--buffer))))
 
@@ -929,7 +944,7 @@ Prefix arg AND-GO also means to switch to the read-eval-loop buffer afterwards."
      ;;     (setq dir (unless (equal newdir dir) newdir))))
      ;; (setq dir (or dir default-directory))
      ;; (list cmd dir)))
-  (set (make-local-variable 'sml-prog-proc--compile-command) command)
+  (setq-local sml-prog-proc--compile-command command)
   (save-some-buffers (not compilation-ask-about-save) nil)
   (let ((dir default-directory))
     (when (string-match "^\\s-*cd\\s-+\"\\([^\"]+\\)\"\\s-*;" command)
@@ -1237,16 +1252,14 @@ TAB file name completion, as in shell-mode, etc.."
   (add-hook 'next-error-hook 'inferior-sml-next-error-hook)
 
   ;; Make TAB add a " rather than a space at the end of a file name.
-  (set (make-local-variable 'comint-completion-addsuffix) '("/" . "\""))
+  (setq-local comint-completion-addsuffix '("/" . "\""))
 
-  (set (make-local-variable 'font-lock-defaults)
-       inferior-sml-font-lock-defaults)
+  (setq-local font-lock-defaults inferior-sml-font-lock-defaults)
 
   ;; Compilation support (used for `next-error').
-  (set (make-local-variable 'compilation-error-regexp-alist)
-       sml-error-regexp-alist)
+  (setq-local compilation-error-regexp-alist sml-error-regexp-alist)
   ;; FIXME: move it to sml-mode?
-  (set (make-local-variable 'compilation-error-screen-columns) nil)
+  (setq-local compilation-error-screen-columns nil)
 
   (setq mode-line-process '(": %s")))
 
@@ -1273,52 +1286,48 @@ TAB file name completion, as in shell-mode, etc.."
 
 ;;;###autoload
 (define-derived-mode sml-mode sml-prog-proc-mode "SML"
-  "\\<sml-mode-map>Major mode for editing Standard ML code.
+  "Major mode for editing Standard ML code.
 This mode runs `sml-mode-hook' just before exiting.
-See also (info \"(sml-mode)Top\").
-\\{sml-mode-map}"
-  (set (make-local-variable 'font-lock-defaults) sml-font-lock-defaults)
-  (set (make-local-variable 'prettify-symbols-alist)
-       sml-font-lock-symbols-alist)
-  (set (make-local-variable 'outline-regexp) sml-outline-regexp)
-  (set (make-local-variable 'imenu-create-index-function)
-       'sml-imenu-create-index)
-  (set (make-local-variable 'add-log-current-defun-function)
-       'sml-current-fun-name)
+See also (info \"(sml-mode)Top\")."
+  (setq-local font-lock-defaults sml-font-lock-defaults)
+  (setq-local prettify-symbols-alist sml-font-lock-symbols-alist)
+  (setq-local outline-regexp sml-outline-regexp)
+  (setq-local imenu-create-index-function #'sml-imenu-create-index)
+  (setq-local add-log-current-defun-function #'sml-current-fun-name)
   ;; Treat paragraph-separators in comments as paragraph-separators.
-  (set (make-local-variable 'paragraph-separate)
-       (concat "\\([ \t]*\\*)?\\)?\\(" paragraph-separate "\\)"))
-  (set (make-local-variable 'require-final-newline) t)
-  (set (make-local-variable 'electric-indent-chars)
-       (cons ?\; (if (boundp 'electric-indent-chars)
-                     electric-indent-chars '(?\n))))
-  (set (make-local-variable 'electric-layout-rules)
-       `((?\; . ,(lambda ()
-                   (save-excursion
-                     (skip-chars-backward " \t;")
-                     (unless (or (bolp)
-                                 (progn (skip-chars-forward " \t;")
-                                        (eolp)))
-                       'after))))))
+  (setq-local paragraph-separate
+              (concat "\\([ \t]*\\*)?\\)?\\(" paragraph-separate "\\)"))
+  (setq-local require-final-newline t)
+  (setq-local electric-indent-chars
+              (cons ?\; (if (boundp 'electric-indent-chars)
+                            electric-indent-chars '(?\n))))
+  (setq-local electric-layout-rules
+              `((?\; . ,(lambda ()
+                          (save-excursion
+                            (skip-chars-backward " \t;")
+                            (unless (or (bolp)
+                                        (progn (skip-chars-forward " \t;")
+                                               (eolp)))
+                              'after))))))
   (when sml-electric-pipe-mode
     (add-hook 'post-self-insert-hook #'sml-post-self-insert-pipe nil t))
   (sml-mode-variables))
 
 (defun sml-mode-variables ()
-  (set (make-local-variable 'sml-prog-proc-descriptor) sml-pp-functions)
+  (setq-local sml-prog-proc-descriptor sml-pp-functions)
   (set-syntax-table sml-mode-syntax-table)
   (setq local-abbrev-table sml-mode-abbrev-table)
   ;; Setup indentation and sexp-navigation.
   (smie-setup sml-smie-grammar #'sml-smie-rules
               :backward-token #'sml-smie-backward-token
               :forward-token #'sml-smie-forward-token)
-  (set (make-local-variable 'parse-sexp-ignore-comments) t)
-  (set (make-local-variable 'comment-start) "(* ")
-  (set (make-local-variable 'comment-end) " *)")
-  (set (make-local-variable 'comment-start-skip) "(\\*+\\s-*")
-  (set (make-local-variable 'comment-end-skip) "\\s-*\\*+)")
+  (setq-local parse-sexp-ignore-comments t)
+  (setq-local comment-start "(* ")
+  (setq-local comment-end " *)")
+  (setq-local comment-start-skip "(\\*+\\s-*")
+  (setq-local comment-end-skip "\\s-*\\*+)")
   ;; No need to quote nested comments markers.
-  (set (make-local-variable 'comment-quote-nested) nil))
+  (setq-local comment-quote-nested nil))
 
 (defun sml-funname-of-and ()
   "Name of the function this `and' defines, or nil if not a function.
@@ -1734,9 +1743,8 @@ MAINFILE is the top level file of the project."
 ;;;###autoload
 (define-derived-mode sml-cm-mode fundamental-mode "SML-CM"
   "Major mode for SML/NJ's Compilation Manager configuration files."
-  (set (make-local-variable 'sml-prog-proc-descriptor) sml-pp-functions)
-  (set (make-local-variable 'font-lock-defaults)
-       '(sml-cm-font-lock-keywords nil t nil nil)))
+  (setq-local sml-prog-proc-descriptor sml-pp-functions)
+  (setq-local font-lock-defaults '(sml-cm-font-lock-keywords nil t nil nil)))
 
 ;;;
 ;;; ML-Lex support
@@ -1753,7 +1761,7 @@ MAINFILE is the top level file of the project."
 ;;;###autoload
 (define-derived-mode sml-lex-mode sml-mode "SML-Lex"
   "Major Mode for editing ML-Lex files."
-  (set (make-local-variable 'font-lock-defaults) sml-lex-font-lock-defaults))
+  (setq-local font-lock-defaults sml-lex-font-lock-defaults))
 
 ;;;
 ;;; ML-Yacc support
@@ -1841,8 +1849,8 @@ If nil, align it with previous cases."
 ;;;###autoload
 (define-derived-mode sml-yacc-mode sml-mode "SML-Yacc"
   "Major Mode for editing ML-Yacc files."
-  (set (make-local-variable 'indent-line-function) 'sml-yacc-indent-line)
-  (set (make-local-variable 'font-lock-defaults) sml-yacc-font-lock-defaults))
+  (setq-local indent-line-function 'sml-yacc-indent-line)
+  (setq-local font-lock-defaults sml-yacc-font-lock-defaults))
 
 \f
 (provide 'sml-mode)




^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Working around the limitations of SMIE
  2022-11-10  5:00 ` Stefan Monnier via Users list for the GNU Emacs text editor
@ 2022-11-11 16:20   ` Philip Kaludercic
  2022-11-12 19:06     ` Stefan Monnier
  0 siblings, 1 reply; 5+ messages in thread
From: Philip Kaludercic @ 2022-11-11 16:20 UTC (permalink / raw)
  To: Stefan Monnier via Users list for the GNU Emacs text editor
  Cc: Stefan Monnier

Stefan Monnier via Users list for the GNU Emacs text editor
<help-gnu-emacs@gnu.org> writes:

>> I am writing a major mode for a little language I am using at
>> university, and wanted to try using SMIE for indentation and all the
>> other things.  The issue I find myself confronted with is that functions
>> are defined as in the following example:
>>
>>     func funktion(x : int): float
>>       x := x * x;
>>       return x;
>>     end
>>
>> where there is no delimiter between the return type (float), and the
>> rest of the body (such as "begin" or something like that).
>
> How is the separation between the function's return type and the
> function's body defined?  Is it based on the newline that follows the
> type, or is the language constrained to have types that are
> a single identifiers?

The latter.  This is the grammar production:

functionDeclaration: ' func ' identifier '( ' ( parameterDeclaration ( ', ' parameterDeclaration ) * ) ? ') '
    ( ': ' typeName ) ? block ' end ' ;

>> Another issue I ran into with the above definition is that instructions
>> are not indented correctly, as the above grammar doesn't express that in
>> this language doesn't expect a semicolon after an end (just like C
>> doesn't expect one after a "}").  So the result is that
>>
>> instead of:
>>
>>   while y >= y1 do
>>     dummy := zeile(x1, x2, xstep, y);
>>     y := y - ystep;
>>   end
>>   return 0;
>>
>> I get:
>>
>>   while y >= y1 do
>>     dummy := zeile(x1, x2, xstep, y);
>>     y := y - ystep;
>>   end
>>     return 0;
>
> Based on my experience, I suspect that the simplest solution for this is
> to make "end" return 2 tokens (the "end" and then a ghost ";").

Funnily enough I had tried this out too, but I must have messed up
somewhere because the result wasn't what I had intended.

Thanks.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Working around the limitations of SMIE
  2022-11-11 16:20   ` Philip Kaludercic
@ 2022-11-12 19:06     ` Stefan Monnier
  2022-11-13  0:12       ` Philip Kaludercic
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Monnier @ 2022-11-12 19:06 UTC (permalink / raw)
  To: Philip Kaludercic
  Cc: Stefan Monnier via Users list for the GNU Emacs text editor

>>> I am writing a major mode for a little language I am using at
>>> university, and wanted to try using SMIE for indentation and all the
>>> other things.  The issue I find myself confronted with is that functions
>>> are defined as in the following example:
>>>
>>>     func funktion(x : int): float
>>>       x := x * x;
>>>       return x;
>>>     end
>>>
>>> where there is no delimiter between the return type (float), and the
>>> rest of the body (such as "begin" or something like that).
>>
>> How is the separation between the function's return type and the
>> function's body defined?  Is it based on the newline that follows the
>> type, or is the language constrained to have types that are
>> a single identifiers?
>
> The latter.  This is the grammar production:
>
> functionDeclaration: ' func ' identifier '( ' ( parameterDeclaration ( ', '
> parameterDeclaration ) * ) ? ') '
>     ( ': ' typeName ) ? block ' end ' ;

Hmm... so the only "reliable" separator token is the close parenthesis, huh?
I think I'd go with a hack in the lexer which checks if this is "the
close paren of a function definition" and make it include the subsequent
type annotation (if present).  I.e. that new token would cover the whole of

    ')' ( ':' typeName ) ?

Then again, that wouldn't work with the usual handling of parens in SMIE
(IOW, you couldn't rely on syntax tables for them any more) :-(

Maybe instead you can try and make the lexer recognize just ": typeName"
(treating it as a special token) and then tweak the indentation rules so
as to align the subsequent instruction with it.

>>> Another issue I ran into with the above definition is that instructions
>>> are not indented correctly, as the above grammar doesn't express that in
>>> this language doesn't expect a semicolon after an end (just like C
>>> doesn't expect one after a "}").  So the result is that
>>>
>>> instead of:
>>>
>>>   while y >= y1 do
>>>     dummy := zeile(x1, x2, xstep, y);
>>>     y := y - ystep;
>>>   end
>>>   return 0;
>>>
>>> I get:
>>>
>>>   while y >= y1 do
>>>     dummy := zeile(x1, x2, xstep, y);
>>>     y := y - ystep;
>>>   end
>>>     return 0;
>>
>> Based on my experience, I suspect that the simplest solution for this is
>> to make "end" return 2 tokens (the "end" and then a ghost ";").
>
> Funnily enough I had tried this out too, but I must have messed up
> somewhere because the result wasn't what I had intended.

IIRC it's easier to get it not to work than to get it to work, the
"messed up" doesn't carry the right connotation.


        Stefan




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Working around the limitations of SMIE
  2022-11-12 19:06     ` Stefan Monnier
@ 2022-11-13  0:12       ` Philip Kaludercic
  0 siblings, 0 replies; 5+ messages in thread
From: Philip Kaludercic @ 2022-11-13  0:12 UTC (permalink / raw)
  To: Stefan Monnier
  Cc: Stefan Monnier via Users list for the GNU Emacs text editor

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>>> I am writing a major mode for a little language I am using at
>>>> university, and wanted to try using SMIE for indentation and all the
>>>> other things.  The issue I find myself confronted with is that functions
>>>> are defined as in the following example:
>>>>
>>>>     func funktion(x : int): float
>>>>       x := x * x;
>>>>       return x;
>>>>     end
>>>>
>>>> where there is no delimiter between the return type (float), and the
>>>> rest of the body (such as "begin" or something like that).
>>>
>>> How is the separation between the function's return type and the
>>> function's body defined?  Is it based on the newline that follows the
>>> type, or is the language constrained to have types that are
>>> a single identifiers?
>>
>> The latter.  This is the grammar production:
>>
>> functionDeclaration: ' func ' identifier '( ' ( parameterDeclaration ( ', '
>> parameterDeclaration ) * ) ? ') '
>>     ( ': ' typeName ) ? block ' end ' ;
>
> Hmm... so the only "reliable" separator token is the close parenthesis, huh?
> I think I'd go with a hack in the lexer which checks if this is "the
> close paren of a function definition" and make it include the subsequent
> type annotation (if present).  I.e. that new token would cover the whole of
>
>     ')' ( ':' typeName ) ?
>
> Then again, that wouldn't work with the usual handling of parens in SMIE
> (IOW, you couldn't rely on syntax tables for them any more) :-(
>
> Maybe instead you can try and make the lexer recognize just ": typeName"
> (treating it as a special token) and then tweak the indentation rules so
> as to align the subsequent instruction with it.

I have managed to solve by making the tokenizer distinguish if it is
looking at the function type with or without any whitespace inbetween.
If there is, a ghost token is generated, if not the type token is
returned.  This appears to work well.

Thanks for the hint, once again!



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-11-13  0:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-09 21:25 Working around the limitations of SMIE Philip Kaludercic
2022-11-10  5:00 ` Stefan Monnier via Users list for the GNU Emacs text editor
2022-11-11 16:20   ` Philip Kaludercic
2022-11-12 19:06     ` Stefan Monnier
2022-11-13  0:12       ` Philip Kaludercic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).