From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Stefan Monnier <monnier@iro.umontreal.ca>
Newsgroups: gmane.emacs.devel
Subject: Re: SMIE
Date: Thu, 10 Jul 2014 09:32:15 -0400
Message-ID: <jwvzjgho04f.fsf-monnier+emacs@gnu.org>
References: <CAPLdYOhnp353s3LTM9EORWbzmiH2JXVjFNX1sp07tQYe2Q4MPA@mail.gmail.com>
	<jwvha2qorue.fsf-monnier+emacs@gnu.org>
	<CAPLdYOjRntb-ivOM1xpR0_JH=-U6-n18nEfZ8KTJM38tKPqmGg@mail.gmail.com>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Trace: ger.gmane.org 1404999178 17603 80.91.229.3 (10 Jul 2014 13:32:58 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Thu, 10 Jul 2014 13:32:58 +0000 (UTC)
Cc: emacs-devel@gnu.org
To: Matt DeBoard <matt.deboard@gmail.com>
In-Reply-To: <CAPLdYOjRntb-ivOM1xpR0_JH=-U6-n18nEfZ8KTJM38tKPqmGg@mail.gmail.com>
	(Matt DeBoard's message of "Wed, 9 Jul 2014 23:53:04 -0400")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux)
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
Xref: news.gmane.org gmane.emacs.devel:172930
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/172930>

> In general I’m having a hard time connecting the dots between the BNF
> grammar table creation, the smie-rules (i.e. :before, :after, etc.),
> tokenization, indentation, and so forth, and how it all comes together
> to make this indentation machine work.

The tokenizer turns the text (sequence of chars) into a sequence of
"tokens", which should be thought of as "words".  This step gets rid of
things like comments and (syntactically insignificant) whitespace.
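
As a minimal sketch (the name `my-smie-forward-token' is made up, and
":=" is simply the operator from the sample grammar further down), a
forward tokenizer can look like this:

```elisp
(defun my-smie-forward-token ()
  "Skip whitespace and comments, then return the next token as a string."
  ;; Drop comments and insignificant whitespace before the token.
  (forward-comment (point-max))
  (cond
   ;; A multi-character operator has to be recognized explicitly.
   ((looking-at ":=")
    (goto-char (match-end 0))
    ":=")
   ;; Otherwise take a run of word/symbol characters as one "word".
   (t
    (buffer-substring-no-properties
     (point)
     (progn (skip-syntax-forward "w_") (point))))))
```

A real mode also needs the mirror-image backward tokenizer; the two are
handed to `smie-setup' via its :forward-token and :backward-token
arguments.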

The BNF then describes which of those tokens are special (people usually
call them "keywords" if they look like human words and "operators" if
they look like math entities) and how they relate to each other to
provide structure.

These two together are sufficient to get C-M-f and C-M-b to do something
useful such as skip from "begin" to the matching "end" and vice-versa.
C-M-f/C-M-b also do something useful when called right before/after an
"infix" keyword: they skip over the corresponding right/left element.
E.g. C-M-b starting from right after "then" should jump to the matching
"if" (it may jump to either just before it or just after it, depending
on the particular way the BNF is specified), and C-M-f from right before
"+" should jump over the whole expression that's being added.

It's generally very useful to debug the BNF and tokenizer using C-M-f
and C-M-b: if these don't jump to the right place, then the indentation
rules won't work well anyway.  Note that these C-M-f and C-M-b may
sometimes stop before and sometimes after the "destination keyword",
depending on details of the BNF grammar, but usually either choice is
fine in the sense that it's "close enough" that the indentation rules
can then be made to work.
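
For batch experiments, something along these lines may help.  It is
only a sketch: `my-smie-grammar', `my-smie-rules' and
`my-smie-try-forward' are made-up names, the grammar is a toy
BEGIN/END/:= one, and SMIE's default tokenizer is assumed to be good
enough for it:

```elisp
(require 'smie)

(defvar my-smie-grammar
  (smie-prec2->grammar
   (smie-bnf->prec2
    '((id)
      (exp ("BEGIN" exp "END")
           (id ":=" exp)))))
  "Toy grammar, used only to try out navigation.")

(defun my-smie-rules (_kind _token)
  "Placeholder: no indentation rules are needed to test navigation."
  nil)

(defun my-smie-try-forward (text)
  "Insert TEXT in a scratch buffer and return where C-M-f lands.
The jump starts from the beginning of TEXT."
  (with-temp-buffer
    (insert text)
    ;; Installs the grammar and makes `forward-sexp' use SMIE.
    (smie-setup my-smie-grammar #'my-smie-rules)
    (goto-char (point-min))
    (forward-sexp 1)
    (point)))
```

E.g. (my-smie-try-forward "BEGIN a b END") returns the buffer position
where C-M-f stopped, which you can compare with where you expected it
to stop.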

The indentation algorithm then works by looking at the immediately
surrounding tokens (the one before and the one after), asking the
rules-function for the indentation rule that applies to those tokens,
and, when needed, jumping to the "parent" with (more or less) C-M-b.
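
To make that a bit more concrete, here is a hedged sketch of what such
a rules-function can look like.  The name `my-smie-rules', the tokens
and the offsets of 2 are all assumptions chosen for illustration, not
rules from any existing mode:

```elisp
(require 'smie)

(defun my-smie-rules (kind token)
  "Hypothetical indentation rules for a BEGIN/END/:= toy language.
KIND and TOKEN are supplied by SMIE for the token next to point."
  (pcase (cons kind token)
    ;; Basic indentation step.
    (`(:elem . basic) 2)
    ;; Offset requested for what follows ":=".
    (`(:after . ":=") 2)
    ;; A BEGIN that is the last token on its line is aligned with
    ;; its parent, found by jumping back over the enclosing construct.
    (`(:before . "BEGIN")
     (when (smie-rule-hanging-p) (smie-rule-parent)))))
```

Returning nil (as happens for every token not listed) tells SMIE to
fall back on its default behavior for that token.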

> there’s that. I’m slowly starting to pare things away. The bit you
> wrote about :list-intro is interesting. When you say that it sees two
> or more concatenated expressions, how does that tie in to the BNF
> grammar definitions?

When you have "exp1 exp2 exp3", there is no keyword/operator involved
(tho keyword/operators may be involved inside exp[123], of course), so
the BNF definition doesn't say anything about it.

When the BNF says ("BEGIN" exp "END"), SMIE interprets it as "we can
have any number of `exp' or any other non-special token between BEGIN
and END".  And that is true of all BNF rules in SMIE (SMIE always
accepts any number of non-special tokens anywhere between the special
tokens, and any repetition).

IOW it doesn't tie in to the BNF grammar definition at all, because this
is imposed implicitly for all grammars by SMIE's underlying algorithm.

E.g. the grammar

   (defvar my-smie-bnf '((id)
                         (exp ("BEGIN" exp "END")
                              (id ":=" exp))))

will happily consider "a b c := BEGIN d e END" as a valid `exp'.
Actually, it will also accept "BEGIN a b c END := BEGIN d e END",
because as soon as you have such a "parenthesis-like" element, SMIE will
also accept it anywhere (it is a really dumb parsing algorithm which
pays very little attention to the context).

> One final question. In the case of e.g.
> > Can't resolve the precedence cycle: .do < else:. < .do
> What does the placement of the dots (left of "do", right of "else:") mean?

SMIE reduces the BNF grammar into a simple table of precedences.
Each token gets assigned 2 precedences: a left-precedence and
a right-precedence.  So the "." indicates which precedence is affected:
the above says that there's a cycle between the left precedence of "do"
and the right precedence of "else:".
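
If you want to look at those two precedences yourself, a small helper
like the following can dump the table.  This is a sketch:
`my-smie-show-precedences' is a made-up name and the BNF is a toy
BEGIN/END/:= one; the only thing relied upon is that each entry of the
table has the shape (TOKEN LEFT RIGHT):

```elisp
(require 'smie)

(defvar my-smie-bnf '((id)
                      (exp ("BEGIN" exp "END")
                           (id ":=" exp)))
  "Toy BNF, used only to inspect the resulting precedence table.")

(defun my-smie-show-precedences (grammar)
  "Return one \"LEFT TOKEN RIGHT\" string per token of GRAMMAR.
For parenthesis-like tokens one side is typically not a plain number."
  (delq nil
        (mapcar (lambda (entry)
                  ;; Skip bookkeeping entries whose key is not a token.
                  (when (stringp (car entry))
                    (format "%S %s %S"
                            (nth 1 entry) (car entry) (nth 2 entry))))
                grammar)))
```

Calling it as (my-smie-show-precedences (smie-prec2->grammar
(smie-bnf->prec2 my-smie-bnf))) gives one line per token, with the left
precedence on the left and the right precedence on the right, which
matches the placement of the "." in the error message.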

A good way to think about those < and > is as "parentheses":

The rule "else:. < .do" means that SMIE thinks (because of some part of
the BNF grammar) that if the code looks like:

   ... else: .. .. do ...

it should be parsed as

   ... else: ..(.. do ...

rather than as

   ... else: ..).. do ...

And the rule ".do < else:." (or better said "else:. > .do") says just
the reverse, i.e. that if the code looks like:

   ... else: .. .. do ...

it should be parsed as

   ... else: ..).. do ...

rather than as

   ... else: ..(.. do ...

So, SMIE doesn't know which is true.  Sadly, the code I wrote does not
keep track of where those inequalities come from, so it's unable to
point you to the particular part of the BNF rules which made it think
that "else:. < .do" or that ".do < else:.".

> As regarding inclusion in GNU ELPA, I'm just a caretaker for the
> project on behalf of the Elixir-lang people, but as it's already in
> MELPA I'm sure it's fine.

Inclusion into GNU ELPA is slightly different in that it's a half-way
inclusion in Emacs: on the technical side, it means that there's a Git
branch of the code kept alongside Emacs's repository (and to which
Emacs maintainers can write) from which the GNU ELPA package gets built,
and on the legal side it means that the code has to have the same
copyright status as Emacs, which concretely means that all non-trivial
contributors need to have signed some copyright paperwork.


        Stefan