unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* smie-next-sexp vs associative operators
@ 2012-10-14  9:16 Stephen Leake
  2012-10-14 15:54 ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Leake @ 2012-10-14  9:16 UTC (permalink / raw)
  To: emacs-devel

I'm having a problem with the way smie-next-sexp handles associative
operators.

The problem comes when I'm trying to indent a line in this statement;

         select
            accept BILLY;
            null;
         or
            accept SILLY;
            null;
         or
            accept LILLY do
               null;
            end LILLY;
         end select;

I'm indenting "accept LILLY". It should be indented relative to
"select". 

(I could put in a special case rule for "or", but I want to use the
general purpose rule, that uses smie-backward-sexp to find the statement
start, and indents relative to that.)

In this case, smie-backward-sexp is called with "or"; 

(smie-backward-sexp "or") 

However, this stops with point on "accept SILLY", not accept.

The reason is this code:

                   ((not (smie--associative-p toklevels))
                    (push toklevels levels))
                   ;; The new operator is associative.  Two cases:
                   ;; - it's really just an associative operator (like + or ;)
                   ;;   in which case we should have stopped right before.
                   ((and lastlevels
                         (smie--associative-p (car lastlevels)))
                    (throw 'return
                           (prog1 (list (or (car toklevels) t) (point) token)
                             (goto-char pos))))
                   ;; - it's an associative operator within a larger construct
                   ;;   (e.g. an "elsif"), so we should just ignore it and keep
                   ;;   looking for the closing element.
                   (t (setq levels lastlevels))))))))

Here `lastlevels' is "or" (the one we started with), and `toklevels' is
the next "or". "or" is associative; the right and left levels are the
same. So (smie--associative-p toklevels) and (smie--associative-p (car
lastlevels)) are both true.

That means the code decides it's like +, and stops, and the higher level
indents relative to "accept", which is wrong.

But this is exactly like the "elsif" case the comments are talking
about, so I think the code is broken. In general, there is no way for
SMIE to distinguish an "associative operator" from an "associative
keyword", because at this level everything is an operator.

So I'd like to add an option `smie-next-sexp-stop-on-associative', that
controls whether this stops or keeps going. That way, the user can
let-bind this appropriately, depending on the circumstances. For Ada,
I'll probably set it to nil always, but maybe not.

Somewhat apropos of this topic; I ended up leaving all of the math
operators out of my Ada SMIE grammar; I don't make indentation decisions
based on them, so they don't need to be in the grammar. So this "or" is
the only associative operator I have.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-14  9:16 smie-next-sexp vs associative operators Stephen Leake
@ 2012-10-14 15:54 ` Stefan Monnier
  2012-10-14 18:44   ` Stephen Leake
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2012-10-14 15:54 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

> I'm indenting "accept LILLY".  It should be indented relative to
> "select".

That's odd.  I'd expect it to be indented relative to "or".

To put it more strongly: you should really try to indent relative to
"or" and if you bump into trouble while doing that, then come back here
for help ;-)

> However, this stops with point on "accept SILLY", not accept.
                                                        ^^^^^^
I presume you meant "select".

> But this is exactly like the "elsif" case the comments are talking
> about, so I think the code is broken.

The comment is imprecise, it's meant for "if ... elsif ... end" where
the BNF said ("if" ... "elsif" ... "end") rather than ("if"
... list-of-elseif ... "end") and then list-of-elseif defined as
(list-of-elseif "elsif" list-of-elseif).

I should fix it, thank you for pointing it out.

I.e. this rule is to avoid stopping at "elsif" when scanning from "if" to
"end" or from "end" to "if".  It will still stop when scanning from an
"elsif" and bumping into another "elsif".


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-14 15:54 ` Stefan Monnier
@ 2012-10-14 18:44   ` Stephen Leake
  2012-10-14 21:48     ` Stephen Leake
  2012-10-15  1:47     ` Stefan Monnier
  0 siblings, 2 replies; 19+ messages in thread
From: Stephen Leake @ 2012-10-14 18:44 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I'm indenting "accept LILLY".  It should be indented relative to
>> "select".
>
> That's odd.  I'd expect it to be indented relative to "or".

The reason is that's how "if/then/else" works:

if A then
   B;

elsif C then
   D;

elsif 
  E then
   F;

else
  G;

end if;

Here the internal keywords "then", "elsif" are not transitive; "else" is
transitive. Here we are assuming "E" is a complex multi-line expression,
so we can't use it as an indentation reference; we have to get back to
"if" or "elsif". We call (smie-backward-sexp "then") to do that; it
stops at "if".

In the analogous situation with "select or", (smie-backward-sexp "or")
stops at the previous "or".

Why should a select statement be different? I guess you'd say "because
it is different" :). Still, I'd like one rule that handles both, and
it's quite easy to achieve. The fewer special cases I have, the easier
it is to maintain in the long run.

>> But this is exactly like the "elsif" case the comments are talking
>> about, so I think the code is broken.
>
> The comment is imprecise, it's meant for "if ... elsif ... end" where
> the BNF said ("if" ... "elsif" ... "end")

That structure matches this comment and code (as does the Ada "if"):

           ;; If the new operator is not the last in the BNF rule,
           ;; and is not associative, it's one of the inner operators
           ;; (like the "in" in "let .. in .. end"), so keep looking.
           ((not (smie--associative-p toklevels))
            (push toklevels levels))

However, what determines whether an operator is "transitive" in this
SMIE sense is more subtle; it depends on what the surrounding
non-terminals are. When the two non-terminals are the same, the operator
is transitive. When they are different, it is not. 

For Ada, we have:

"if" expression "then" statements "elsif" expression "then" statements
"end" "if"

"declare" declarations "begin" statements "end"

These keywords are all non-transitive, because they have different
non-terminals to the right and left; declarations, expressions, and
statements are all distinct. "else" above is transitive because it has
'statements' on both sides.

"select" statements "or" statements "else" statements "end" "select"

Here "or" and "else" are transitive, because they have 'statements' on
both sides. In fact have the same levels; both are (25 25) (illustrating
your earlier point that the operator table is not invertible in general).

Note that the "elsif then" pair also has the same levels on the left and
right - they form a "transitive pair". In this case, smie-next-sexp will
skip as many as appear.

Which suggests another way to state my core complaint; since
smie-next-sexp will skip an arbitrary number of "transitive pairs", it
should also skip an arbitrary number of "transitive singles". Or at
least, provide an option to do so.

> rather than ("if" ... list-of-elseif ... "end") and then
> list-of-elseif defined as (list-of-elseif "elsif" list-of-elseif).

adding the missing nonterminals (and remembering that in C-like
languages there are no "statements", only "expressions"):

("if" expressions "elseif" expressions "elseif" expressions "end")

These "elseif" are transitive. But again, it is the surrounding
non-terminals that determine this. In a language that distinguishes
between expressions and statements, "elsif" will never be transitive,
but "elsif expression then" will be.

> I should fix it, thank you for pointing it out.
>
> I.e. this rule is to avoid stopping at "elsif" when scanning from "if" to
> "end" or from "end" to "if".  It will still stop when scanning from an
> "elsif" and bumping into another "elsif".

Hmm. Calling (smie-forward-sexp "if") on the above "if" statement gives
the following toklevel sequence (obtained by setting a break in
smie-next-sexp):

((184) 3)   if
nil         A
(3 25)      then
nil         B
(37 36)     ;
(25 3)      elsif
nil         C
(3 25)      then
nil         D
(37 36)     ;
(25 3)      elsif
nil         E
(3 25)      then
nil         F
(37 36)     ;
(25 25)     else
nil         G
(37 36)     ;
(25 127)    end
(127 (174)) if

Now I see; the checks for associative in smie-next-sexp handle the case
where there is _one_ associative operator (here "else") in a statement.

And with the equivalent sequence in a C-like language that doesn't
have "then", it would stop at the second elsif. I don't see why Ada and
C-like should be different in this.

Since "else" will be transitive in most languages, and appear only once,
that would be a better keyword to use as an example in the current
smie-next-sexp code comments.

And I would still prefer having more control over this, to unify the
behavior of smie-next-sexp on Ada "if" and "select", and between Ada and
C-like "elsif". Similarly for "+"; if I had kept "+" in the grammar, I
would not want smie-next-sexp to stop at it. That choice should be up to
the smie user, not dictated by smie.

As usual, this conversation has been educational - I learned more about
what "transitive" means in SMIE :).

--
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-14 18:44   ` Stephen Leake
@ 2012-10-14 21:48     ` Stephen Leake
  2012-10-15  1:47     ` Stefan Monnier
  1 sibling, 0 replies; 19+ messages in thread
From: Stephen Leake @ 2012-10-14 21:48 UTC (permalink / raw)
  To: emacs-devel

I just realized I used "transitive" in this post when I should have used
"associative"; I don't know why. I apologize; I've got a head cold, and
between the direct effects and the drugs, I seem to be not thinking
quite straight ...

Stephen Leake <stephen_leake@member.fsf.org> writes:

> Stefan Monnier <monnier@iro.umontreal.ca> writes:
>
>>> I'm indenting "accept LILLY".  It should be indented relative to
>>> "select".
>>
>> That's odd.  I'd expect it to be indented relative to "or".
>
> The reason is that's how "if/then/else" works:
>
> if A then
>    B;
>
> elsif C then
>    D;
>
> elsif 
>   E then
>    F;
>
> else
>   G;
>
> end if;
>
> Here the internal keywords "then", "elsif" are not transitive; "else" is
> transitive. Here we are assuming "E" is a complex multi-line expression,
> so we can't use it as an indentation reference; we have to get back to
> "if" or "elsif". We call (smie-backward-sexp "then") to do that; it
> stops at "if".
>
> In the analogous situation with "select or", (smie-backward-sexp "or")
> stops at the previous "or".
>
> Why should a select statement be different? I guess you'd say "because
> it is different" :). Still, I'd like one rule that handles both, and
> it's quite easy to achieve. The fewer special cases I have, the easier
> it is to maintain in the long run.
>
>>> But this is exactly like the "elsif" case the comments are talking
>>> about, so I think the code is broken.
>>
>> The comment is imprecise, it's meant for "if ... elsif ... end" where
>> the BNF said ("if" ... "elsif" ... "end")
>
> That structure matches this comment and code (as does the Ada "if"):
>
>            ;; If the new operator is not the last in the BNF rule,
>            ;; and is not associative, it's one of the inner operators
>            ;; (like the "in" in "let .. in .. end"), so keep looking.
>            ((not (smie--associative-p toklevels))
>             (push toklevels levels))
>
> However, what determines whether an operator is "transitive" in this
> SMIE sense is more subtle; it depends on what the surrounding
> non-terminals are. When the two non-terminals are the same, the operator
> is transitive. When they are different, it is not. 
>
> For Ada, we have:
>
> "if" expression "then" statements "elsif" expression "then" statements
> "end" "if"
>
> "declare" declarations "begin" statements "end"
>
> These keywords are all non-transitive, because they have different
> non-terminals to the right and left; declarations, expressions, and
> statements are all distinct. "else" above is transitive because it has
> 'statements' on both sides.
>
> "select" statements "or" statements "else" statements "end" "select"
>
> Here "or" and "else" are transitive, because they have 'statements' on
> both sides. In fact have the same levels; both are (25 25) (illustrating
> your earlier point that the operator table is not invertible in general).
>
> Note that the "elsif then" pair also has the same levels on the left and
> right - they form a "transitive pair". In this case, smie-next-sexp will
> skip as many as appear.
>
> Which suggests another way to state my core complaint; since
> smie-next-sexp will skip an arbitrary number of "transitive pairs", it
> should also skip an arbitrary number of "transitive singles". Or at
> least, provide an option to do so.
>
>> rather than ("if" ... list-of-elseif ... "end") and then
>> list-of-elseif defined as (list-of-elseif "elsif" list-of-elseif).
>
> adding the missing nonterminals (and remembering that in C-like
> languages there are no "statements", only "expressions"):
>
> ("if" expressions "elseif" expressions "elseif" expressions "end")
>
> These "elseif" are transitive. But again, it is the surrounding
> non-terminals that determine this. In a language that distinguishes
> between expressions and statements, "elsif" will never be transitive,
> but "elsif expression then" will be.
>
>> I should fix it, thank you for pointing it out.
>>
>> I.e. this rule is to avoid stopping at "elsif" when scanning from "if" to
>> "end" or from "end" to "if".  It will still stop when scanning from an
>> "elsif" and bumping into another "elsif".
>
> Hmm. Calling (smie-forward-sexp "if") on the above "if" statement gives
> the following toklevel sequence (obtained by setting a break in
> smie-next-sexp):
>
> ((184) 3)   if
> nil         A
> (3 25)      then
> nil         B
> (37 36)     ;
> (25 3)      elsif
> nil         C
> (3 25)      then
> nil         D
> (37 36)     ;
> (25 3)      elsif
> nil         E
> (3 25)      then
> nil         F
> (37 36)     ;
> (25 25)     else
> nil         G
> (37 36)     ;
> (25 127)    end
> (127 (174)) if
>
> Now I see; the checks for associative in smie-next-sexp handle the case
> where there is _one_ associative operator (here "else") in a statement.
>
> And with the equivalent sequence in a C-like language that doesn't
> have "then", it would stop at the second elsif. I don't see why Ada and
> C-like should be different in this.
>
> Since "else" will be transitive in most languages, and appear only once,
> that would be a better keyword to use as an example in the current
> smie-next-sexp code comments.
>
> And I would still prefer having more control over this, to unify the
> behavior of smie-next-sexp on Ada "if" and "select", and between Ada and
> C-like "elsif". Similarly for "+"; if I had kept "+" in the grammar, I
> would not want smie-next-sexp to stop at it. That choice should be up to
> the smie user, not dictated by smie.
>
> As usual, this conversation has been educational - I learned more about
> what "transitive" means in SMIE :).
>
> --
> -- Stephe

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-14 18:44   ` Stephen Leake
  2012-10-14 21:48     ` Stephen Leake
@ 2012-10-15  1:47     ` Stefan Monnier
  2012-10-15 12:09       ` Stephen Leake
  2012-10-20  9:15       ` Stephen Leake
  1 sibling, 2 replies; 19+ messages in thread
From: Stefan Monnier @ 2012-10-15  1:47 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

>>> I'm indenting "accept LILLY".  It should be indented relative to
>>> "select".
>> That's odd.  I'd expect it to be indented relative to "or".
> The reason is that's how "if/then/else" works:

That's not a good reason: the indentation of an expression that follows
a keyword should almost always be relative to that keyword.

> if A then
>    B;
> elsif C then
>    D;
> elsif 
>   E then
>    F;
> else
>   G;
> end if;

I'm not completely sure how to interpret the above.
Is it an example of good indentation, or does it present a problem case?
In which way is it related to the select...do case?

> Here the internal keywords "then", "elsif" are not transitive; "else" is
> transitive.

That doesn't sound right.  "then" should not be associative,
but "elseif" should be.
More specifically, with point right after an "elsif", I'd expect C-M-b
to stop right after the previous "elsif".

> Here we are assuming "E" is a complex multi-line expression,
> so we can't use it as an indentation reference; we have to get back to
> "if" or "elsif". We call (smie-backward-sexp "then") to do that; it
> stops at "if".

I'm not sure of which "then" you're talking, but when point is just
before F, C-M-b should stop with point in front of E (and returning
the previous "elsif"'s data, relative to which it should be indented).
If it doesn't, your grammar is wrong.

> In the analogous situation with "select or", (smie-backward-sexp "or")
> stops at the previous "or".

The "or" in select should largely behave like the "elsif" above.

> However, what determines whether an operator is "transitive" in this
> SMIE sense is more subtle; it depends on what the surrounding
> non-terminals are. When the two non-terminals are the same, the operator
> is transitive. When they are different, it is not. 

No, an operator is associative simply if it has the same left and
right precedence.  What is more subtle is the choice of when to stop,
which depends on where we come from.

> For Ada, we have:
> "if" expression "then" statements "elsif" expression "then" statements
> "end" "if"

> "declare" declarations "begin" statements "end"

> These keywords are all non-transitive, because they have different
> non-terminals to the right and left; declarations, expressions, and
> statements are all distinct. "else" above is transitive because it has
> 'statements' on both sides.

No: the precedence depends on the preceding/next terminal.  I.e. the
left precedence of "then" will be equal to the right precedence of
"elsif" because of the above rule.

You might like to use rules like:

   (exp ("if" expthenexp "end")
        ("if" expthenexp "else" exp "end")
        ("if" expthenexp "elsif" expthenexp "end")
        ("if" expthenexp "elsif" expthenexp "end")
   (expthenexp (exp "then" exp))

So as to make it clear, that when jumping backward from an "elsif" (or
"else") you want to skip the "then" and stop at the previous "if" or
"elsif".
   
> Which suggests another way to state my core complaint; since
> smie-next-sexp will skip an arbitrary number of "transitive pairs",

That's annoying.

> it should also skip an arbitrary number of "transitive singles".  Or at
> least, provide an option to do so.

No, you got it backwards: you want indentation to walk back as little
as possible.  E.g. you don't want to walk all the way back to "if" if
you can indent relative to the closest "elsif".

> These "elseif" are transitive. But again, it is the surrounding
> non-terminals that determine this. In a language that distinguishes
> between expressions and statements, "elsif" will never be transitive,
> but "elsif expression then" will be.

What was the reason for separating expressions from instructions in
your grammar?  Maybe removing this distinction will help (tho it might
introduce other problems, of course).

Of course the grammar below will still "distinguish" expressions and
instructions while giving you an associative "elsif":

   (exp ("if" exptheninst "end")
        ("if" exptheninst "else" inst "end")
        ("if" exptheninst "elsif" exptheninst "end")
        ("if" exptheninst "elsif" exptheninst "end")
   (exptheninst (exp "then" inst))

> And with the equivalent sequence in a C-like language that doesn't
> have "then", it would stop at the second elsif. I don't see why Ada and
> C-like should be different in this.

Hence the "expthenexp" non-terminal.

> As usual, this conversation has been educational - I learned more about
> what "transitive" means in SMIE :).

Including the difference between transitive and associative?


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-15  1:47     ` Stefan Monnier
@ 2012-10-15 12:09       ` Stephen Leake
  2012-10-20  9:15       ` Stephen Leake
  1 sibling, 0 replies; 19+ messages in thread
From: Stephen Leake @ 2012-10-15 12:09 UTC (permalink / raw)
  To: emacs-devel

Just a note to say I'll get back to this later. 

I compared the "if" from modula-2-mode and the "if" from my Ada mode;
smie-backward-sexp treats them significantly differently, even though
the grammar declarations are nearly identical. I need to figure out why
in detail.

I also uncovered a major flaw in my cache algorithm that I need to fix.

And I still have my head cold, so things are slow ...


-- 
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-15  1:47     ` Stefan Monnier
  2012-10-15 12:09       ` Stephen Leake
@ 2012-10-20  9:15       ` Stephen Leake
  2012-10-21 14:58         ` Stephen Leake
  2012-10-23 18:07         ` Stefan Monnier
  1 sibling, 2 replies; 19+ messages in thread
From: Stephen Leake @ 2012-10-20  9:15 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 5876 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> if A then
>>    B;
>> elsif C then
>>    D;
>> elsif
>>   E then
>>    F;
>> else
>>   G;
>> end if;
>
> I'm not completely sure how to interpret the above.
> Is it an example of good indentation, or does it present a problem
> case?

This is the prefered indentation. I'm using it as an example of a case
when calling smie-backward-sexp is necessary, when computing the
indentation of F.

That's what modula-2-mode does as well; smie-indent-after-keyword calls
smie-indent-virtual to compute the indentation for "then"; that calls
smie-indent-keyword, which calls (smie-backward-sexp token), with
`token' = "then". This parses all the way back to "if".

> In which way is it related to the select...do case?

smie-backward-sexp ought to treat them similarly, as you agree below.
But it does not, because smie-backward-sexp treats associative keywords
specially.

>> Here the internal keywords "then", "elsif" are not associative; "else" is
>> associative.
>
> That doesn't sound right.  "then" should not be associative,
> but "elseif" should be.

There's a comment on smie--associative-p that says tokens have the same
left and right levels if they are optional; that makes sense, and
applies in this case. "else" is optional, "elsif  then" is optional, but
"elsif" on its own is not optional.

> More specifically, with point right after an "elsif", I'd expect C-M-b
> to stop right after the previous "elsif".

How is that useful for indentation?

What about C-M-b after "else"; should that stop on the preceding "then"?
or "elsif"? (In fact, it goes nowhere). I don't understand the general
rule you are expressing here.

The attached example_grammars.el, has several grammars for
"if/then/else/endif", and in the comments, I show what
smie-backward-sexp does for several points in the "if" statement, with
different values of halfsexp. C-M-b calls (smie-backward-sexp
'halfsexp). In most cases, it does move to after the previous keyword.
But not in all, so I don't see how it is generally useful. On the other
hand, the behavior of (smie-backward-sexp token) is quite reliable.

>> In the analogous situation with "select or", (smie-backward-sexp "or")
>> stops at the previous "or".
>
> The "or" in select should largely behave like the "elsif" above.

That is precisely what I'd like to have, but it is not happening. You
are saying it is the "elsif" that's behaving incorrectly, but the main
point is that they are different. In addition, the way I've defined
"elsif" in Ada mode is the same as "elsif" in modula-2-mode.

> You might like to use rules like:
>
>    (exp ("if" expthenexp "end")
>         ("if" expthenexp "else" exp "end")
>         ("if" expthenexp "elsif" expthenexp "end")
>         ("if" expthenexp "elsif" expthenexp "end")
>    (expthenexp (exp "then" exp))

I tried that (see the attached example_grammars.el, grammar-3). It does
change the levels assigned by the grammar compiler. But "then" does not
have equal left and right levels, and the behavior of smie-backward-sexp
is no more useful, as far as I can tell.

>> Which suggests another way to state my core complaint; since
>> smie-next-sexp will skip an arbitrary number of "transitive pairs",
>
> That's annoying.
>
>> it should also skip an arbitrary number of "transitive singles".  Or at
>> least, provide an option to do so.
>
> No, you got it backwards: you want indentation to walk back as little
> as possible.  E.g. you don't want to walk all the way back to "if" if
> you can indent relative to the closest "elsif".

But walking back to "if" is what modula-2-mode does, via
smie-indent-keyword.

Looking at the various grammars in example_grammars.el, I don't see any
consistent patterns, except that (smie-backward-sexp token) always goes
to "if" when token is one of the statement keywords. That's why I use
it, and I assume that's why smie-indent-keyword uses it.

I could fuss with the grammar of each statement to get slightly shorter
parses in some cases. I don't have the time, and that would be premature
optimization.

It comes down to trading indentation code complexity & programmer time vs CPU time.
The modula-2 grammar has 63 keywords, only a few refined, including many operators.
So far, my Ada grammar has 124 keywords, most refined, no operators.
modula2.el is 617 lines for the entire mode. ada-indent.el is 3153 lines
and counting just for indentation (mostly refinement code); Ada is a
much bigger language.

So I need reliable, simple patterns to make this practical.
(smie-backward-sexp token) is very reliable, except for associative
keywords. That behaviour should be controlled by the language
implementor, not dictated by smie.

I've verified that the patch shown below allows me to make
smie-backward-sexp behave the same for the Ada "select/or" and
"if/elsif" statements. 

--
-- Stephe

--- /Apps/emacs-24.2/lisp/emacs-lisp/smie.el	2012-09-14 04:44:57.521167100 -0400
+++ smie.el	2012-10-17 20:33:06.616977700 -0400
@@ -672,6 +672,8 @@
   ;; has to be careful to distinguish those different cases.
   (eq (smie-op-left toklevels) (smie-op-right toklevels)))

+(defvar smie-skip-associative nil)
+
 (defun smie-next-sexp (next-token next-sexp op-forw op-back halfsexp)
   "Skip over one sexp.
 NEXT-TOKEN is a function of no argument that moves forward by one
@@ -770,7 +772,8 @@
                    ;; The new operator is associative.  Two cases:
                    ;; - it's really just an associative operator (like + or ;)
                    ;;   in which case we should have stopped right before.
-                   ((and lastlevels
+                   ((and (not smie-skip-associative)
+			 lastlevels
                          (smie--associative-p (car lastlevels)))
                     (throw 'return
                            (prog1 (list (or (car toklevels) t) (point) token)

[-- Attachment #2: example_grammars.el --]
[-- Type: application/emacs-lisp, Size: 15051 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-20  9:15       ` Stephen Leake
@ 2012-10-21 14:58         ` Stephen Leake
  2012-10-23 17:37           ` Stefan Monnier
  2012-10-23 18:07         ` Stefan Monnier
  1 sibling, 1 reply; 19+ messages in thread
From: Stephen Leake @ 2012-10-21 14:58 UTC (permalink / raw)
  To: emacs-devel

Stephen Leake <stephen_leake@member.fsf.org> writes:

> I've verified that the patch shown below allows me to make
> smie-backward-sexp behave the same for the Ada "select/or" and
> "if/elsif" statements.

I just encountered another case where I need smie-skip-associative t:

declare
   <declarations>
begin
   <statements>
exception
   <exception handlers>
end;

"exception" is optional. With smie-skip-associative nil,
(smie-backward-sexp "exception") stops with point after "begin". But
(smie-backward-sexp "end") stops with point before "declare".

That's not consistent, so it complicates my code, meaning more time
spent testing.

With smie-skip-associative t, (smie-backward-sexp "exception") stops
with point before "declare", simplifying the code.

--
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-21 14:58         ` Stephen Leake
@ 2012-10-23 17:37           ` Stefan Monnier
  2012-10-23 22:43             ` Stephen Leake
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2012-10-23 17:37 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

> I just encountered another case where I need smie-skip-associative t:
> declare
>    <declarations>
> begin
>    <statements>
> exception
>    <exception handlers>
> end;
> "exception" is optional.

OK, so it is associative.  Admittedly, it's a not good, since
"declare/begin/exception/exception/exception/end" doesn't make sense, so
it's not really associative, but it's not too terrible.

You can try to make it non-associative with something like

   (exp ("declare" decls "begin" stmts+exceptions "end")
        ...)
   (stmts+exceptions (stmts) (stmts "exception" exc-handlers))

but it still won't behave like `begin' since the whole stmts+exceptions
is now a sub-tree.
   
> With smie-skip-associative nil, (smie-backward-sexp "exception") stops
> with point after "begin".  But (smie-backward-sexp "end") stops with
> point before "declare".

Starting from where?

> That's not consistent, so it complicates my code, meaning more time
> spent testing.

Can you give me some detail about the complication?

> With smie-skip-associative t, (smie-backward-sexp "exception") stops
> with point before "declare", simplifying the code.

But such a smie-skip-associative also suffers from inconvenients
(e.g. it will make you look further back than needed for some
indentation decisions and will prevent your users from jumping from one
elseif to the next with C-M-f).

My intuition tells me there's a better solution to your particular
problem (one where you don't have to skip over such associative
siblings), but if you really want to do that, I'd rather do it with
a new function (ideally written on top of
smie-next/backward/forward-sexp) than with a configuration variable.


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-20  9:15       ` Stephen Leake
  2012-10-21 14:58         ` Stephen Leake
@ 2012-10-23 18:07         ` Stefan Monnier
  2012-10-23 23:14           ` Stephen Leake
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2012-10-23 18:07 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

> This is the prefered indentation. I'm using it as an example of a case
> when calling smie-backward-sexp is necessary, when computing the
> indentation of F.

Yes, smie-backward-sexp is often needed, but very rarely directly in
indentation rules.

> That's what modula-2-mode does as well; smie-indent-after-keyword calls
> smie-indent-virtual to compute the indentation for "then"; that calls
> smie-indent-keyword, which calls (smie-backward-sexp token), with
> `token' = "then". This parses all the way back to "if".

That's actually a bug: the THEN should be indented relative to its
corresponding ELSIF (if any) rather than to the top IF.

>> More specifically, with point right after an "elsif", I'd expect C-M-b
>> to stop right after the previous "elsif".
> How is that useful for indentation?

It's useful because it lets the indentation work by looking at a smaller
part of the buffer.  This is good for performance, and it is good for
graceful degradation (when we misparse some part of the code).
It's also useful when the user doesn't like the auto-indentation's
choice, since we use the user's choice of where to indent the previous
`elsif'.

To my mind, if...elsif...end, is very much like Elisp's `cond', so every
"exp THEN stmt" is a "branch", and those branches are connected at
a higher level via "IF branch ELSIF branch ... ELSIF branch ... END".

> What about C-M-b after "else"; should that stop on the preceding "then"?

No, since "then" is in a lower-level of the parse tree.

> or "elsif"?

That would make sense, yes.

>> (exp ("if" expthenexp "end")
>> ("if" expthenexp "else" exp "end")
>> ("if" expthenexp "elsif" expthenexp "end")
>> ("if" expthenexp "elsif" expthenexp "end")
>> (expthenexp (exp "then" exp))
> I tried that (see the attached example_grammars.el, grammar-3). It does
> change the levels assigned by the grammar compiler. But "then" does not
> have equal left and right levels, and the behavior of smie-backward-sexp
> is no more useful, as far as I can tell.

I don't know what would be "useful" to you and/or why the changes
aren't "useful".

> I've verified that the patch shown below allows me to make
> smie-backward-sexp behave the same for the Ada "select/or" and
> "if/elsif" statements. 

Can you tell me in terms of C-M-f or C-M-b in which circumstance they
don't behave identically (and show me the corresponding grammar rules
you're using)?


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-23 17:37           ` Stefan Monnier
@ 2012-10-23 22:43             ` Stephen Leake
  2012-10-24 14:24               ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Leake @ 2012-10-23 22:43 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I just encountered another case where I need smie-skip-associative t:
>> declare
>>    <declarations>
>> begin
>>    <statements>
>> exception
>>    <exception handlers>
>> end;
>> "exception" is optional.
>
>> With smie-skip-associative nil, (smie-backward-sexp "exception") stops
>> with point after "begin".  But (smie-backward-sexp "end") stops with
>> point before "declare".
>
> Starting from where?

Starting with point before the keyword passed to smie-backward-sexp.

>> That's not consistent, so it complicates my code, meaning more time
>> spent testing.
>
> Can you give me some detail about the complication?

I'd have to add special cases in the code, which means each special case
needs to be tested, and maintained.

In this particular situation, doing 

(goto-char (nth 1 (smie-backward-sexp (smie-default-forward-token)) 

would give consistent behavior for indentation; it would leave point on
the right column to use for indenting the keyword we started at.

That's not a lot of code. But there might be different code required for
the next grammar fragment that encounters this issue. The point is I
should not have to worry about it in the first place; smie-backward-sexp
should not treat associative operators specially, for the language I'm
implementing.

>> With smie-skip-associative t, (smie-backward-sexp "exception") stops
>> with point before "declare", simplifying the code.
>
> But such a smie-skip-associative also suffers from inconvenients
> (e.g. it will make you look further back than needed for some
> indentation decisions 

Not _me_, just the CPU. That's ok, that's what CPUs are for! as long as
it takes less than noticable time (which it does, so far, on my
admittedly very fast machine). And it gives that gain only for a very
small set of keywords (two out of 130 for Ada mode); from most of the
keywords in statements, smie-backward-sexp will parse back to the
statement start.

From my point of view, setting smie-skip-associative t makes things
simpler; (smie-backward-sexp keyword) always goes to the _first_ keyword
in a statement. I don't have to worry about any special cases. That's
the optimization I need to make right now.

> and will prevent your users from jumping from one elseif to the next
> with C-M-f).

True. But that's not very consistent behavior anyway. If I did want to
offer something like that as a feature, it would have to be a keystroke
for "move from any keyword in a statement to the previous/next keyword
of the same statement". C-M-f does _not_ do that, on its own; it usually
jumps all the way to the end of the sexp, it's only for associative
operators that it stops anywhere else. I'd have to write an entirely new
movement function (which I may do; it could be useful for navigating
across largish function bodies). So I don't see that as a significant
gain of smie-skip-associative nil.

Do you have any examples of when smie-skip-associative nil is useful for
actual indentation issues? The comments imply that's why the special
treatment was introduced in the first place.

> My intuition tells me there's a better solution to your particular
> problem (one where you don't have to skip over such associative
> siblings), 

The only _problem_ is the time it takes me to recognize and fix each
case. I agree I could handle the current behavior for each particular
case in some innovative way. And that might even be fun, if there wasn't
so much _else_ to do to get Ada mode 5.0 into useful shape.

> but if you really want to do that, I'd rather do it with a new
> function (ideally written on top of smie-next/backward/forward-sexp)
> than with a configuration variable.

Can you explain a little more about the downside of having this choice
provided as a configuration option? It's not a _user_ option, just a
language programmer option. It doesn't significantly complicate smie!

Providing another function that does almost exactly the same thing as
smie-backward-sexp would be more confusing to language programmers, in
my opinion (what would we call it?
smie-backward-sexp-skip-associative-nil? :). There are lots of functions
in Emacs whose behavior is affected by variables in ways much more
significant than this.

I don't think I can build it on top of smie-backward-sexp, because the
crucial stack information is thrown away. I might be able to factor out
a simpler smie-backward-sexp-basic, and then implement the two behaviors
on top of that. That would be significantly more complex code than the
patch I have now. 

And that would _not_ give the language implementor the same level of
control; there are higher level smie functions that call
smie-backward-sexp, so we'd have to provide two of each of those? 

Since the right value for smie-skip-associative is tied to the language,
it would make sense to incorporate it into the grammar structure in some
way. That might be better than providing it as a separate variable.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-23 18:07         ` Stefan Monnier
@ 2012-10-23 23:14           ` Stephen Leake
  2012-10-24 14:44             ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Leake @ 2012-10-23 23:14 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 4278 bytes --]

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> This is the prefered indentation. I'm using it as an example of a case
>> when calling smie-backward-sexp is necessary, when computing the
>> indentation of F.
>
> Yes, smie-backward-sexp is often needed, but very rarely directly in
> indentation rules.

Right; it is buried lower than that in my code as well. The point is
that it is called, not how directly it is called.

>> That's what modula-2-mode does as well; smie-indent-after-keyword calls
>> smie-indent-virtual to compute the indentation for "then"; that calls
>> smie-indent-keyword, which calls (smie-backward-sexp token), with
>> `token' = "then". This parses all the way back to "if".
>
> That's actually a bug: the THEN should be indented relative to its
> corresponding ELSIF (if any) rather than to the top IF.

_all_ of the keywords in the if/then/elsif/else/endif, and in every
other multi-keyword statement in modula-2 and Ada, work this way; they
_all_ jump to the first keyword in the statement. Except for the ones
that happen to be associative. You'd have to totally rewrite
smie-backward-sexp, to fix this "bug".

>>> More specifically, with point right after an "elsif", I'd expect C-M-b
>>> to stop right after the previous "elsif".
>> How is that useful for indentation?
>
> It's useful because it lets the indentation work by looking at a smaller
> part of the buffer.  This is good for performance, and it is good for
> graceful degradation (when we misparse some part of the code).

Optimizing for performance is premature at this point in my project;
simplicity of design is much preferable. I can change this setting
later, if I need the performance gain.

I can believe it might be useful for graceful degradation; I have not done
much testing on Ada mode for that yet.

Have people complained about modula-2-mode for this?

> It's also useful when the user doesn't like the auto-indentation's
> choice, since we use the user's choice of where to indent the previous
> `elsif'.

I admit there are times when I override the indentation engine's choice
with Ada mode 4.01. But they are extremely rare, and that's part of why
I'm rewriting it :).

But all of this just goes to say that the current behavior of
smie-backward-sexp for _non_-associative operators is wrong in the first
place; that's a _major_ redesign!

>>> (exp ("if" expthenexp "end")
>>> ("if" expthenexp "else" exp "end")
>>> ("if" expthenexp "elsif" expthenexp "end")
>>> ("if" expthenexp "elsif" expthenexp "end")
>>> (expthenexp (exp "then" exp))
>> I tried that (see the attached example_grammars.el, grammar-3). It does
>> change the levels assigned by the grammar compiler. But "then" does not
>> have equal left and right levels, and the behavior of smie-backward-sexp
>> is no more useful, as far as I can tell.
>
> I don't know what would be "useful" to you and/or why the changes
> aren't "useful".

That is pretty much the crux of the matter; we have different visions
about how indentation engines should work, partly influenced by the
different languages we are implementing.

As a tool intended to be shared by many languages, smie should
accomodate both visions, and it does so quite nicely, except for this
one minor point.

>> I've verified that the patch shown below allows me to make
>> smie-backward-sexp behave the same for the Ada "select/or" and
>> "if/elsif" statements. 
>
> Can you tell me in terms of C-M-f or C-M-b in which circumstance they
> don't behave identically (and show me the corresponding grammar rules
> you're using)?

Not directly, no, because C-M-f doesn't call (smie-backward-sexp
keyword), it calls (smie-backward-sexp 'halfsexp), which gives different
behavior.

Which is why I wrote the attached example_grammars.el. It contains
several simple grammars all implementing the same if statement, and
defines movement functions that call smie-backward-sexp with each of the
three possible options, and keys to invoke those functions. See the
comments there for the details. Playing with that was quite educational!

grammar-1a (which is a superset of grammar-1, and also implements the
select statement), is my prefered grammar for Ada mode for these two
statements, and is closest to the current modula-2 grammar.

-- 
-- Stephe

[-- Attachment #2: example_grammars.el --]
[-- Type: application/emacs-lisp, Size: 15051 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-23 22:43             ` Stephen Leake
@ 2012-10-24 14:24               ` Stefan Monnier
  2012-10-24 23:45                 ` Stephen Leake
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2012-10-24 14:24 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

>>> That's not consistent, so it complicates my code, meaning more time
>>> spent testing.
>> Can you give me some detail about the complication?
> I'd have to add special cases in the code, which means each special case
> needs to be tested, and maintained.

Without the specifics I can't see what you're referring to.

> (goto-char (nth 1 (smie-backward-sexp (smie-default-forward-token)) 

I've almost never needed to write a call to smie-backward-sexp in SMIE
indentation rules (yes, there are a few exceptions, but only for
constructs that SMIE is unable to parse properly, or for indentation
rules that do not follow the parse-tree structure, such as indentation
of monadic-style code in OCaml).
There's something wrong if you need to write that for indentation of the
if..end construct.

>>> With smie-skip-associative t, (smie-backward-sexp "exception") stops
>>> with point before "declare", simplifying the code.
>> But such a smie-skip-associative also suffers from inconvenients
>> (e.g. it will make you look further back than needed for some
>> indentation decisions
> Not _me_, just the CPU. That's ok, that's what CPUs are for! as long as
> it takes less than noticable time (which it does, so far, on my
> admittedly very fast machine). And it gives that gain only for a very
> small set of keywords (two out of 130 for Ada mode); from most of the
> keywords in statements, smie-backward-sexp will parse back to the
> statement start.

Indentation of an ELSIF with your approach takes time proportional to
the whole IF construct, thus leading to O(n^2) complexity when
reindenting the whole IF construct, whereas if you only go back to the
previous ELSIF, the total complexity stays at O(n).

But admittedly, the problem is not really performance.

> Do you have any examples of when smie-skip-associative nil is useful for
> actual indentation issues?

As mentioned:

   It is good for graceful degradation (when we misparse some part of
   the code).  It's also useful when the user doesn't like the
   auto-indentation's choice, since we use the user's choice of where to
   indent the previous `elsif'.

E.g., we have the following code:

     case foo of
          | 1 => toto
          | 2 => titi
     | 3 => tata

Assuming that the indentation rule for "| in case" says to align | with "case"
when hitting TAB on the last line, should we leave the line unchanged or
should align it with the previous line?

SMIE will usually align it with the previous line, on the principle that
the user's choice on previous lines takes precedence over the
indentation rules.  After all, if the user doesn't like it, she can
first reindent the other lines, so there's no loss of functionality.

> The comments imply that's why the special treatment was introduced in
> the first place.

It's also handy to hit M-C-t with point on a | to swap two branches of
the above "case" (admittedly, it won't work for your ELSIFs because
M-C-t currently makes unwarranted assumptions and hence only behaves
that way for infix operators that have "punctuation syntax" in the
syntax table).  Or to hit M-C-SPC to select a branch (or a few
branches, if you repeat it).

>> but if you really want to do that, I'd rather do it with a new
>> function (ideally written on top of smie-next/backward/forward-sexp)
>> than with a configuration variable.
> Can you explain a little more about the downside of having this choice
> provided as a configuration option? It's not a _user_ option, just a
> language programmer option. It doesn't significantly complicate smie!

Because it breaks the above desirable properties.  So if you need it for
particular indentation rules, that's OK, but it should not affect
general movement behavior.

> Since the right value for smie-skip-associative is tied to the language,

I still don't believe it to be the case.  I think if you should start
from "smie-skip-associative is not an option", you'll soon see that life
isn't that bad after all.


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-23 23:14           ` Stephen Leake
@ 2012-10-24 14:44             ` Stefan Monnier
  2012-10-25  0:22               ` Stephen Leake
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2012-10-24 14:44 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

>> That's actually a bug: the THEN should be indented relative to its
>> corresponding ELSIF (if any) rather than to the top IF.
> _all_ of the keywords in the if/then/elsif/else/endif, and in every
> other multi-keyword statement in modula-2 and Ada, work this way; they
> _all_ jump to the first keyword in the statement. Except for the ones
> that happen to be associative. You'd have to totally rewrite
> smie-backward-sexp, to fix this "bug".

SMIE likes to have more structure in its parse tree: instead of an IF
construct having a single level

    (IF_THEN_ELSIF_THEN_ELSE_END exp1 stmt1 exp2 stmt2 stmt3)

it likes to have something deeper, maybe even as deep as

    (IF_END (_ELSIF/ELSE_ (_THEN_ exp1 stmt1) (_THEN_ exp2 stmt2) (stmt3))

One of the advantages being that if a THEN is missing or there's an
extra one somewhere, it should still be able to match the END with
its IF.

> Have people complained about modula-2-mode for this?

Modula-2-mode did not have any indentation until I added the SMIE
indentation (that was actually one reason I chose it: there was no risk
of regression) and noone requested it, which makes me think noone used
this mode.  That's consistent with the fact that Modula-2 is pretty
much dead.  So "no complaint" is not a good indicator.

> But all of this just goes to say that the current behavior of
> smie-backward-sexp for _non_-associative operators is wrong in the first
> place; that's a _major_ redesign!

I don't see why.  If you make your parse tree deeper, you'll get the
same "stop at the closest token" behavior.

BTW, this discussion is really useful to make me figure out some of the
guidelines and design rules that we should document in
SMIE's documentation.

> That is pretty much the crux of the matter; we have different visions
> about how indentation engines should work, partly influenced by the
> different languages we are implementing.

I don't think it's a question of language (I've implemented SMIE support
for Pascal, sh, Modula-2, Prolog, SML/OCaml/Coq, and Octave, which
I think covers a fair variety of syntaxes).
I've just learned to use SMIE to its advantage, whereas you're trying to
fight it.

>> Can you tell me in terms of C-M-f or C-M-b in which circumstance they
>> don't behave identically (and show me the corresponding grammar rules
>> you're using)?
> Not directly, no, because C-M-f doesn't call (smie-backward-sexp
> keyword), it calls (smie-backward-sexp 'halfsexp), which gives different
> behavior.

No, C-M-b gives the same behavior, you just have to start from another
position (i.e. from right after the keyword rather than from before it).


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-24 14:24               ` Stefan Monnier
@ 2012-10-24 23:45                 ` Stephen Leake
  2012-10-25  3:01                   ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Leake @ 2012-10-24 23:45 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>>> That's not consistent, so it complicates my code, meaning more time
>>>> spent testing.
>>> Can you give me some detail about the complication?
>> I'd have to add special cases in the code, which means each special case
>> needs to be tested, and maintained.
>
> Without the specifics I can't see what you're referring to.
>
>> (goto-char (nth 1 (smie-backward-sexp (smie-default-forward-token))
>
> I've almost never needed to write a call to smie-backward-sexp in SMIE
> indentation rules (yes, there are a few exceptions, but only for
> constructs that SMIE is unable to parse properly, or for indentation
> rules that do not follow the parse-tree structure, such as indentation
> of monadic-style code in OCaml).

I don't understand this.

As I pointed out back near the start of this thread, when modula-2-mode
is indenting the token after "then", smie-indent-after-keyword calls
smie-indent-virtual to compute the indentation for "then"; that calls
smie-indent-keyword. If m2-smie-rules doesn't handle it,
smie-indent-keyword calls calls (smie-backward-sexp token), with `token'
= "then". This parses all the way back to "if".

So yes, smie-backward-sexp is called in indentation functions. And it is
the typical case; there are very few overrides in m2-smie-rules.

But perhaps modula-2 is not a good example, and I should look at the
others.

> There's something wrong if you need to write that for indentation of
> the if..end construct.

You keep saying this, and I keep pointing out that the smie engine does
it all the time.

>> Do you have any examples of when smie-skip-associative nil is useful for
>> actual indentation issues?
>
> As mentioned:
>
>    It is good for graceful degradation (when we misparse some part of
>    the code).  It's also useful when the user doesn't like the
>    auto-indentation's choice, since we use the user's choice of where to
>    indent the previous `elsif'.

Yes. But thought there might be some legal language construct that
required it as well. For example, indenting math expressions, since the
comment in smie--associative-p talks about "+".

> E.g., we have the following code:
>
>      case foo of
>           | 1 => toto
>           | 2 => titi
>      | 3 => tata
>
> Assuming that the indentation rule for "| in case" says to align | with "case"
> when hitting TAB on the last line, should we leave the line unchanged or
> should align it with the previous line?
>
> SMIE will usually align it with the previous line, on the principle that
> the user's choice on previous lines takes precedence over the
> indentation rules.  After all, if the user doesn't like it, she can
> first reindent the other lines, so there's no loss of functionality.

That's a reasonable heuristic.

It's not what modula-2 mode does.

Which language does that? I'd like to study the code.

>>> but if you really want to do that, I'd rather do it with a new
>>> function (ideally written on top of smie-next/backward/forward-sexp)
>>> than with a configuration variable.
>> Can you explain a little more about the downside of having this choice
>> provided as a configuration option? It's not a _user_ option, just a
>> language programmer option. It doesn't significantly complicate smie!
>
> Because it breaks the above desirable properties.

I don't understand. The language implementor can set it to nil by
default, to get the above properties. Then during indentation, it can be
let-bound to t.

I guess your point is that if the t functionality is provided in a
different function, that function can be called from the indentation
rules, while the nil function can be called from the motion code.

But that duplicates a _lot_ of code, for one very small change. A
let-bound option is a much simpler solution.

> So if you need it for particular indentation rules, that's OK, but it
> should not affect general movement behavior.

Right; a let-bound option satisfies that.

>> Since the right value for smie-skip-associative is tied to the language,
>
> I still don't believe it to be the case.

It is for Ada.

--
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-24 14:44             ` Stefan Monnier
@ 2012-10-25  0:22               ` Stephen Leake
  0 siblings, 0 replies; 19+ messages in thread
From: Stephen Leake @ 2012-10-25  0:22 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> That's actually a bug: the THEN should be indented relative to its
>>> corresponding ELSIF (if any) rather than to the top IF.
>> _all_ of the keywords in the if/then/elsif/else/endif, and in every
>> other multi-keyword statement in modula-2 and Ada, work this way; they
>> _all_ jump to the first keyword in the statement. Except for the ones
>> that happen to be associative. You'd have to totally rewrite
>> smie-backward-sexp, to fix this "bug".
>
> SMIE likes to have more structure in its parse tree: instead of an IF
> construct having a single level
>
>     (IF_THEN_ELSIF_THEN_ELSE_END exp1 stmt1 exp2 stmt2 stmt3)
>
> it likes to have something deeper, maybe even as deep as
>
>     (IF_END (_ELSIF/ELSE_ (_THEN_ exp1 stmt1) (_THEN_ exp2 stmt2) (stmt3))
>
> One of the advantages being that if a THEN is missing or there's an
> extra one somewhere, it should still be able to match the END with
> its IF.

Ok, that's clearly a design choice.

It is completely orthogonal to the issue of special treatment for
associative operators; none of these will be associative.

My initial reaction is "if you want to move from one keyword to the
previous, write motion code that does that, don't mess with the
grammar".

Such motion code is possible; another choice for the termination
condition of the loop in smie-next-sexp should do it - stop at the first
keyword with a matching level. That would require a flatter grammar, so
the keywords in one sexp do have matching levels.

>> But all of this just goes to say that the current behavior of
>> smie-backward-sexp for _non_-associative operators is wrong in the first
>> place; that's a _major_ redesign!
>
> I don't see why.  If you make your parse tree deeper, you'll get the
> same "stop at the closest token" behavior.

Ok, that makes sense.

But it means spending time optimizing the grammar so it has these
properties. As I've said, I've got way to much work to do, so keeping it
simple is more important at the moment.

> BTW, this discussion is really useful to make me figure out some of the
> guidelines and design rules that we should document in
> SMIE's documentation.

Well, good. I've been learning more, as well.

>> That is pretty much the crux of the matter; we have different visions
>> about how indentation engines should work, partly influenced by the
>> different languages we are implementing.
>
> I don't think it's a question of language (I've implemented SMIE support
> for Pascal, sh, Modula-2, Prolog, SML/OCaml/Coq, and Octave, which

Are those in emacs trunk? I've only looked in 24.2 so far, and there
only modula and octave use smie. 

All of those are far smaller than Ada, by refined keyword count if no
other measure. Prolog and the SML family are certainly very different
languages. 

> I think covers a fair variety of syntaxes).

Yes.

So I'll fall back on "the default value for smie-skip-associative is
tied to the language implementor". Still seems valid.

> I've just learned to use SMIE to its advantage, whereas you're trying
> to fight it.

That's fair. 

It's also fair to keep things very simple at first, and only optimize
when it becomes necessary, and add features like single-keyword motion
when you have the time.

>>> Can you tell me in terms of C-M-f or C-M-b in which circumstance they
>>> don't behave identically (and show me the corresponding grammar rules
>>> you're using)?
>> Not directly, no, because C-M-f doesn't call (smie-backward-sexp
>> keyword), it calls (smie-backward-sexp 'halfsexp), which gives different
>> behavior.
>
> No, C-M-b gives the same behavior, you just have to start from another
> position (i.e. from right after the keyword rather than from before
> it).

True. 

Given this code (Ada except "end if" replaced by "end fi" to avoid
refining, similarly for "end select"):

   if A then     -- then-1
      B;

   elsif C then  -- elsif-1, then-2
      D;

   elsif         -- elsif-2
     E then      -- then-3
      F;

   else          -- else-1
     G;
   end fi;

   select
       accept A1;
       B1;
    or           -- or-1
       accept A2;
       B2;
    or           -- or-2
       accept A3;

    else         -- else-2
      B4;
    end tceles;

Using grammar-1a, smie-skip-associative nil: 

With point after "fi", C-M-b leaves point before "if"; similarly for
"tceles" and "select".

With point after the "else" labeled else-1, C-M-b leaves point before
"if". For else-2, point is left before "accept A3". For or-2, point is
left before "accept A2".

or-2, else-2 are the only exceptions; after all the other keywords,
C-M-b stops with point before "if" or "select".

Setting smie-skip-associative t removes those exceptions; C-M-b gives
the same behavior after all keywords.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-24 23:45                 ` Stephen Leake
@ 2012-10-25  3:01                   ` Stefan Monnier
  2012-10-25 19:17                     ` Stephen Leake
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2012-10-25  3:01 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

> I don't understand this.

Yes, there's a misunderstanding: I meant that the smie-rule function
almost never *directly* calls smie-backward-sexp.  Of course, the
generic part of the smie indentation relies heavily on
smie-backward-sexp.

I insisted on it, because you seem to pay a lot of attention to how
smie-backward-sexp behaves depending on the way it's called, and
unless you call it directly, you don't have much control over how it's
called.

>> There's something wrong if you need to write that for indentation of
>> the if..end construct.
> You keep saying this, and I keep pointing out that the smie engine does
> it all the time.

Yes the engine does it, but you didn't write that code, right?

> Yes. But thought there might be some legal language construct that
> required it as well. For example, indenting math expressions, since the
> comment in smie--associative-p talks about "+".

No, IIRC a left-associative infix operator gets indented in the same
way, so you don't need the "associative special case" since you could
just give those operators a left-associative parsing rule and you'd get
that "skip to the very beginning for indentation" behavior.

> It's not what modula-2 mode does.

Not for ELSIF, but it does for all other associative operators such as
commas in arg lists.

> I don't understand.  The language implementor can set it to nil by
> default, to get the above properties.  Then during indentation, it can
> be let-bound to t.

That would bring back some of the benefits, but not those for
indentation and I still haven't seen a concrete case where such
a setting in indentation would be useful (I'm not saying your case isn't
one such example, but I still haven't seen your concrete case).

> But that duplicates a _lot_ of code, for one very small change. A
> let-bound option is a much simpler solution.

I don't think it can be that much code.  I think it can be implemented
on top of smie-backward-sexp along the lines of:

  (let ((first-token (smie-backward-token))
        data)
    (while (progn (setq data (smie-backward-sexp first-token))
                  (and (smie-associative-p data)
                       (same-precedence first-token data)))))

but still I don't think you need it.

>> I still don't believe it to be the case.
> It is for Ada.

I lack the evidence to be convinced.

>> (IF_END (_ELSIF/ELSE_ (_THEN_ exp1 stmt1) (_THEN_ exp2 stmt2) (stmt3))
[...]
> It is completely orthogonal to the issue of special treatment for
> associative operators; none of these will be associative.

Yes, ELSIF and ELSE will be.

> My initial reaction is "if you want to move from one keyword to the
> previous, write motion code that does that, don't mess with the
> grammar".

I'm just telling you what grammars worked well in my experience.
I'm pretty sure the modula-2 grammar would move further in that
direction if I were to actually use m2-mode and start fixing the
misbehaviors I bump into.

> But it means spending time optimizing the grammar so it has these
> properties. As I've said, I've got way to much work to do, so keeping it
> simple is more important at the moment.

Fine.

>> I don't think it's a question of language (I've implemented SMIE support
>> for Pascal, sh, Modula-2, Prolog, SML/OCaml/Coq, and Octave, which
> Are those in Emacs trunk? I've only looked in 24.2 so far, and there
> only modula and octave use SMIE.

Not all: the pascal, sh, modula-2, prolog, and octave, yes, the sml one
is in the `elpa' branch, and the other two are external (the coq one is
in the ProofGeneral package, and the ocaml one is in the Tuareg-mode
package).

> All of those are far smaller than Ada, by refined keyword count if no
> other measure.

Indeed.

> It's also fair to keep things very simple at first, and only optimize
> when it becomes necessary, and add features like single-keyword motion
> when you have the time.

Fair.  Then let's move on to a more concrete discussion: show me
concrete indentation problems you bumped into.  And I'll try to fix it
without changing either of your grammar or smie.el.
[ I guess this will require me using your code.  ]

> With point after the "else" labeled else-1, C-M-b leaves point before
> "if". For else-2, point is left before "accept A3". For or-2, point is
> left before "accept A2".
> or-2, else-2 are the only exceptions; after all the other keywords,
> C-M-b stops with point before "if" or "select".

Right, because "or" is associative whereas your "elsif" isn't (because
you haven't moved "then" to a sub-tree).

> Setting smie-skip-associative t removes those exceptions; C-M-b gives
> the same behavior after all keywords.

But since all of your "or", "elsif", "else", "select", and "if" are
indented in the same column, it doesn't matter which one is used,
does it?


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-25  3:01                   ` Stefan Monnier
@ 2012-10-25 19:17                     ` Stephen Leake
  2012-10-25 21:01                       ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Stephen Leake @ 2012-10-25 19:17 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I don't understand.  The language implementor can set it to nil by
>> default, to get the above properties.  Then during indentation, it can
>> be let-bound to t.
>
> That would bring back some of the benefits, but not those for
> indentation and I still haven't seen a concrete case where such
> a setting in indentation would be useful (I'm not saying your case isn't
> one such example, but I still haven't seen your concrete case).

I started this thread with a specific example. Your solution is to
abandon the generic "find the statement start" solution, and put in
special cases for keywords that happen to be associative.

> but still I don't think you need it.

Of course there are other ways to solve any particular problem; I was
looking for a more general solution.

>> But it means spending time optimizing the grammar so it has these
>> properties. As I've said, I've got way to much work to do, so keeping it
>> simple is more important at the moment.
>
> Fine.

I will continue using my patched version of smie-backward-sexp until
I've got a functioning ada-mode; then I think about how to move forward
on this.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: smie-next-sexp vs associative operators
  2012-10-25 19:17                     ` Stephen Leake
@ 2012-10-25 21:01                       ` Stefan Monnier
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2012-10-25 21:01 UTC (permalink / raw)
  To: Stephen Leake; +Cc: emacs-devel

>> That would bring back some of the benefits, but not those for
>> indentation and I still haven't seen a concrete case where such
>> a setting in indentation would be useful (I'm not saying your case isn't
>> one such example, but I still haven't seen your concrete case).
> I started this thread with a specific example.

No.  By specific example, I meant also the corresponding
smie-indent-rules, which I haven't seen (if you included them, I missed
them).


        Stefan



^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-10-25 21:01 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-10-14  9:16 smie-next-sexp vs associative operators Stephen Leake
2012-10-14 15:54 ` Stefan Monnier
2012-10-14 18:44   ` Stephen Leake
2012-10-14 21:48     ` Stephen Leake
2012-10-15  1:47     ` Stefan Monnier
2012-10-15 12:09       ` Stephen Leake
2012-10-20  9:15       ` Stephen Leake
2012-10-21 14:58         ` Stephen Leake
2012-10-23 17:37           ` Stefan Monnier
2012-10-23 22:43             ` Stephen Leake
2012-10-24 14:24               ` Stefan Monnier
2012-10-24 23:45                 ` Stephen Leake
2012-10-25  3:01                   ` Stefan Monnier
2012-10-25 19:17                     ` Stephen Leake
2012-10-25 21:01                       ` Stefan Monnier
2012-10-23 18:07         ` Stefan Monnier
2012-10-23 23:14           ` Stephen Leake
2012-10-24 14:44             ` Stefan Monnier
2012-10-25  0:22               ` Stephen Leake

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).