Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> if A then
>>    B;
>> elsif C then
>>    D;
>> elsif
>>   E then
>>    F;
>> else
>>   G;
>> end if;
>
> I'm not completely sure how to interpret the above.
> Is it an example of good indentation, or does it present a problem
> case?

This is the preferred indentation. I'm using it as an example of a case
where calling smie-backward-sexp is necessary: computing the
indentation of F.

That's what modula-2-mode does as well; smie-indent-after-keyword calls
smie-indent-virtual to compute the indentation for "then"; that calls
smie-indent-keyword, which calls (smie-backward-sexp token), with
`token' = "then". This parses all the way back to "if".

> In which way is it related to the select...do case?

smie-backward-sexp ought to treat them similarly, as you agree below.
But it does not, because smie-backward-sexp treats associative keywords
specially.

>> Here the internal keywords "then", "elsif" are not associative; "else" is
>> associative.
>
> That doesn't sound right.  "then" should not be associative,
> but "elseif" should be.

There's a comment on smie--associative-p that says tokens have the same
left and right levels if they are optional; that makes sense, and
applies in this case. "else" is optional, and the "elsif ... then" pair
is optional, but "elsif" on its own is not.

> More specifically, with point right after an "elsif", I'd expect C-M-b
> to stop right after the previous "elsif".

How is that useful for indentation?

What about C-M-b after "else"; should that stop on the preceding "then",
or on "elsif"? (In fact, it does not move at all.) I don't understand
the general rule you are expressing here.

The attached example_grammars.el has several grammars for
"if/then/else/endif"; in the comments, I show what smie-backward-sexp
does at several points in the "if" statement, with different values of
halfsexp. C-M-b calls (smie-backward-sexp 'halfsexp). In most cases, it
does move to just after the previous keyword, but not in all, so I
don't see how it is generally useful. On the other hand, the behavior
of (smie-backward-sexp token) is quite reliable.

>> In the analogous situation with "select or", (smie-backward-sexp "or")
>> stops at the previous "or".
>
> The "or" in select should largely behave like the "elsif" above.

That is precisely what I'd like to have, but it is not happening. You
are saying it is the "elsif" that's behaving incorrectly, but the main
point is that they are different. In addition, the way I've defined
"elsif" in Ada mode is the same as "elsif" in modula-2-mode.

> You might like to use rules like:
>
>    (exp ("if" expthenexp "end")
>         ("if" expthenexp "else" exp "end")
>         ("if" expthenexp "elsif" expthenexp "end")
>         ("if" expthenexp "elsif" expthenexp "end")
>    (expthenexp (exp "then" exp))

I tried that (see the attached example_grammars.el, grammar-3). It does
change the levels assigned by the grammar compiler. But "then" does not
have equal left and right levels, and the behavior of smie-backward-sexp
is no more useful, as far as I can tell.

>> Which suggests another way to state my core complaint; since
>> smie-next-sexp will skip an arbitrary number of "transitive pairs",
>
> That's annoying.
>
>> it should also skip an arbitrary number of "transitive singles".  Or at
>> least, provide an option to do so.
>
> No, you got it backwards: you want indentation to walk back as little
> as possible.  E.g. you don't want to walk all the way back to "if" if
> you can indent relative to the closest "elsif".

But walking back to "if" is what modula-2-mode does, via
smie-indent-keyword.

Looking at the various grammars in example_grammars.el, I don't see any
consistent patterns, except that (smie-backward-sexp token) always goes
to "if" when token is one of the statement keywords. That's why I use
it, and I assume that's why smie-indent-keyword uses it.

I could fuss with the grammar of each statement to get slightly shorter
parses in some cases. I don't have the time, and that would be premature
optimization.

It comes down to trading indentation code complexity and programmer
time against CPU time. The modula-2 grammar has 63 keywords, including
many operators; only a few are refined. So far, my Ada grammar has 124
keywords, most of them refined, and no operators. modula2.el is 617
lines for the entire mode. ada-indent.el is 3153 lines and counting,
just for indentation (mostly refinement code); Ada is a much bigger
language.

So I need reliable, simple patterns to make this practical.
(smie-backward-sexp token) is very reliable, except for associative
keywords. That behaviour should be controlled by the language
implementor, not dictated by smie.

I've verified that the patch shown below allows me to make
smie-backward-sexp behave the same for the Ada "select/or" and
"if/elsif" statements. 

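For illustration, here is roughly how an indentation function could use
the new variable. This is only a sketch, assuming the patch below is
applied; `my-ada-backward-to-statement-start' is a hypothetical name,
not a function in ada-indent.el or smie.el.

    ;; Sketch only: relies on `smie-skip-associative' from the patch below.
    (defun my-ada-backward-to-statement-start (token)
      "Move back over the statement containing keyword TOKEN.
    Associative keywords are skipped rather than stopping the parse."
      (let ((smie-skip-associative t))    ; bound only for this call
        (smie-backward-sexp token)))

Binding the variable with `let' leaves the default behavior unchanged
for C-M-b and for other modes.
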
--
-- Stephe

--- /Apps/emacs-24.2/lisp/emacs-lisp/smie.el	2012-09-14 04:44:57.521167100 -0400
+++ smie.el	2012-10-17 20:33:06.616977700 -0400
@@ -672,6 +672,8 @@
   ;; has to be careful to distinguish those different cases.
   (eq (smie-op-left toklevels) (smie-op-right toklevels)))

+(defvar smie-skip-associative nil)
+
 (defun smie-next-sexp (next-token next-sexp op-forw op-back halfsexp)
   "Skip over one sexp.
 NEXT-TOKEN is a function of no argument that moves forward by one
@@ -770,7 +772,8 @@
                    ;; The new operator is associative.  Two cases:
                    ;; - it's really just an associative operator (like + or ;)
                    ;;   in which case we should have stopped right before.
-                   ((and lastlevels
+                   ((and (not smie-skip-associative)
+			 lastlevels
                          (smie--associative-p (car lastlevels)))
                     (throw 'return
                            (prog1 (list (or (car toklevels) t) (point) token)