From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: A vision for multiple major modes [was: Re: [Emacs-diffs] widen-limits c331b66:]
Date: Tue, 5 Apr 2016 16:29:28 +0000
Message-ID: <20160405162928.GD3463@acm.fritz.box>
References: <56F242E0.7060004@online.de> <877fgtpfrw.fsf@gmail.com>
 <20160323211605.GA5324@acm.fritz.box> <20160324183835.GB2721@acm.fritz.box>
 <20160327120919.GA2682@acm.fritz.box>
 <8d18ba1e-8252-1e6b-2bea-3a0e5e68b052@yandex.ru>
 <20160329000720.GC5095@acm.fritz.box>
 <654b8ea3-84ea-60d9-6fe1-b088d52579c3@yandex.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Vitalie Spinu, Andreas Röhler, emacs-devel@gnu.org, Stefan Monnier,
 Eli Zaretskii, Drew Adams
To: Dmitry Gutov
In-Reply-To: <654b8ea3-84ea-60d9-6fe1-b088d52579c3@yandex.ru>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Primary-Address: acm@muc.de
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:202747

Hello, Dmitry.

On Fri, Apr 01, 2016 at 04:15:18AM +0300, Dmitry Gutov wrote:
> On 03/29/2016 03:07 AM, Alan Mackenzie wrote:

> >> E.g. (goto-char (match-end n)) is a common
> >> idiom for syntax highlighting and indentation code.

> > Yes, but that match is going to be in the same island, or at the very
> > least, in the chain of islands containing the current one.

> Would looking-at skip over intermediate islands?

Yes, somehow.  That sort of looking-at is going to contain something
which matches whitespace in its regexp.  An intermediate island would
count as whitespace, one way or another.

> Basically, I wonder if

>    (= (+ (match-beginning 0) (length (match-string 0)))
>       (match-end 0))

> is going to always hold in that new world.

Yes, certainly.  But (match-string 0) would include the island inside
it.

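A rough way to see that identity with today's Emacs, offered only as a
sketch: the text "<%= x %>" stands in for an intermediate island, and
the regexp counts it as whitespace by hand, since no island mechanism
exists yet.  The function name `my-island-match-demo' and the sample
buffer text are made up for illustration.

```elisp
;; Sketch only: the "island" is plain text, and the regexp is told
;; explicitly to treat <%...%> as one more kind of whitespace.
(defun my-island-match-demo ()
  "Return (IDENTITY-HOLDS MATCHED-TEXT) for a match spanning an island."
  (with-temp-buffer
    (insert "foo <%= x %> bar")
    (goto-char (point-min))
    (when (looking-at "foo\\(?:\\s-\\|<%.*?%>\\)+bar")
      (list (= (+ (match-beginning 0) (length (match-string 0)))
               (match-end 0))
            (match-string 0)))))
```

The first element of the result says whether the identity held; the
second is the matched text, which carries the stand-in island along
with it.
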
> > Are you thinking more that `forward-char' should move from the end of an
> > island to the beginning of the next island in a chain?

> As one option, yes. Let's call it option A, and it entails renumbering
> buffer positions to avoid gaps.

I'm solidly against using alternative buffer positions.  The unforeseen
side effects we'd have to cope with would be excessive.  The machinery
we'd have to set up to convert from "real" offsets to "no gaps" offsets
would be horrendous.

There's no reason forward-char shouldn't jump to the next island in a
chain without renumbering buffer positions.  (Assuming
`restrict-to-island' has been bound to non-nil by the super mode.)

> >> True. I'm more worried about gaps. But they could be treated like
> >> whitespace, I suppose.

> > For example, by `forward-char'?  What primitives were you thinking of,
> > here?

> This would be option B. forward-char doesn't care if the characters are
> whitespace or not. I don't know if e.g. search-forward would see the
> contents of the intermediate islands as a bunch of actual space
> characters (it's something to consider).

When the key variable `restrict-to-island' is bound to non-nil, things
like search-forward would skip over islands.  Otherwise they would see
the insides of islands.  The latter would be what a user doing C-s would
normally want to happen.

> Most importantly, the contents of intermediate islands would match
> "\\s-", and parse-partial-sexp would similarly skip over them.

Yes, with `restrict-to-island' non-nil.  parse-partial-sexp would always
save its state on encountering the start of an island, and restore it at
the end of the island.  For the island itself, it would set the state to
that at the end of the previous island in the chain, or to "nil".  One of
the main points of islands is to isolate them syntactically from their
surroundings - a bit like a "super comment".

> >> Right.
> >> So then, some yet-unknown body of code has to become
> >> island-aware, and the improvement is not that seamless anymore.

> > The way I see it, the super modes would have to be totally aware of the
> > island mechanism, but the major modes would be largely, if not totally,
> > unaware of it.  `syntax-ppss' seems more part of the super mode
> > mechanism.

> At the very least, that does sound somewhat complicated.

I don't agree.  The essence of being a super mode is to create, organise
and coordinate the islands the buffer contains.

> >> What if var has an island-local binding? And my function has this:

> >> (setq var 1)
> >> (goto-char xx) ; where xx is in a different island

> >> will its value change mid-program? What if xx is in the same island?
> >> Will the value change then?

> > A different island local binding would become current, so yes its value
> > would thereby change.  Much like if var had a buffer local binding, and
> > you do (set-buffer "foo"), the value of var you'd just set would no
> > longer be in the current binding.  If xx is in the same island, the
> > binding just setq'd would remain the same.

> That might be something to look out for. Changing a buffer is an
> operation with big consequences, we know that already, but using
> goto-char didn't have too many implications until now (aside from
> point-entered hooks, I guess, which are being phased out).

Most (all?) goto-chars in a major mode are going to be within an island
(or chain of islands).  Only the super mode is going to have to be aware
of this effect.

I think thinking of it as "changing the value of a variable" is a clumsy
way to regard it.  "Accessing the correct binding for the current buffer
position" is more how I would describe it.

> >> I don't think either should be true. Then, the "adequate" model would
> >> amount to changing them in post-command-hook anyway.

> > What sort of variables are you thinking about here?
> > [ Some time later: ]
> > Could it be that it might be better to have "island chain local"
> > variables rather than merely "island local" ones?

> mmm-mode has both kinds.

Maybe, but having two extra kinds of locals is more than twice as
complicated as only having one.

> > So that the three ruby
> > lines in your example below would be three islands in a chain, and would
> > thus all share a set of "island chain local" bindings rather than each
> > line having its own binding?

> Yes, probably. But that doesn't solve the above concern.

I think "the above concern" was about the lack of newlines in the three
ruby lines of the example, and the difficulties this would cause for
indentation.  One thing we could do is to make an island syntactically
equivalent to a newline for the enclosing code.  Maybe.

> >> So making it seem, to the Lisp code, like the multiple related islands
> >> don't have anything (for some definition of "anything") between them is
> >> not a priority?

> > By "related", I think you mean the same thing I mean by "chained" - the
> > three ruby mode lines enclosed in <%...%> in your example below, would be
> > chained together by the super mode.

> Yes, we can call them chained. As long as it doesn't imply anything
> about any presence of islands of other types between them.

> >> Here's an example of ERB code Vitalie brought up recently:

> >> <% if foo %>
> >>   <%= bar(boo) %>
> >> <% end %>

> >> The parts of the buffer between %'s would be Ruby islands. The
> >> suggestion was to use the value returned by the indentation function in
> >> the second island (second line) to indent the "real" buffer contents.

> > The question seems to be, are we indenting in the super mode or in each
> > island?  The answer seems to be you'd want to indent in the super mode
> > here, I think.

> Indentation in the super mode would look like this:

> <% if foo %>
> <%= bar(boo) %>
> <% end %>

> which is not what we'd really want.

But we'd surely be happy enough with

    <% if foo %>
    <% = bar(boo) %>
    <% end %>

, if that is the way the user typed in the code.

> But yes, a good solution would take html-mode's indentation logic as a
> base, and modify it with the indentation offsets provided by the
> islands.

As a matter of interest, is it really likely that users will type in
embedded ruby code with <%...%> delimiters around each line, rather than
one pair around, say, a function or an entire program?  Is this ruby
snippet normal coding, or was it brought up to illustrate an awkward
situation we might have to handle?

> >> But: neither of the islands contains a newline. If they are combined in
> >> the most straightforward way, the Ruby line will be 'if foo bar(boo)',
> >> and its indentation would be not what we're looking for. I think the
> >> current ways to look for a newline,

> >> (progn (skip-syntax-backward " ")
> >>        (eq (char-before) ?\n))

> >> or

> >> (looking-at "\\s-$")

> >> will fail to work. Also note that neither of the two islands in this
> >> example contains the newline in question.

> > For example, in the three ruby line scenario above, is there any variable
> > you can think of for which it would be advantageous to have a binding
> > for each island rather than a shared binding for the set of three (?two)?

> Not off the top of my head. Maybe there's none.

I would guess there aren't any.

> > I'm kind of envisioning successive `forward-char's moving from the end
> > of "<% if foo %>" to the beginning of "<%= bar(boo) %>", or even from
> > the end of "if foo" to the beginning of "bar(boo)".  How does this sound?

> Sounds like what I've been calling option A above. Basically, my vague
> concerns are:

> - Performance (checking islands boundaries in each primitive must incur
> overhead, even when there are no islands).

There would clearly be some overhead.  This would be negligible for the
most common case of `restrict-to-island' being nil.

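To make the shape of that check concrete, here is a minimal Lisp-level
sketch.  It is not the proposed implementation, which would presumably
live in the C primitives.  Everything named `my-...' is invented for
illustration: `my-restrict-to-island' stands in for the proposed
`restrict-to-island', and a `my-island' text property, whose value names
the chain, stands in for whatever would really mark an island.

```elisp
(defvar my-restrict-to-island nil
  "Hypothetical stand-in for the proposed `restrict-to-island'.")

(defun my-island-forward-char ()
  "Move forward one character, hopping over gaps between chained islands.
The hop only happens when `my-restrict-to-island' is non-nil.  Islands
are marked with the made-up text property `my-island'."
  ;; When the variable is nil, CHAIN is nil and the only extra cost is
  ;; this one test.
  (let ((chain (and my-restrict-to-island
                    (get-text-property (point) 'my-island))))
    (forward-char 1)
    (when (and chain
               (not (eobp))
               (not (eq (get-text-property (point) 'my-island) chain)))
      ;; We stepped off the island: jump to the next one in the chain,
      ;; if there is one.  Buffer positions are not renumbered.
      (let ((next (text-property-any (point) (point-max)
                                     'my-island chain)))
        (when next (goto-char next))))))

(defun my-island-demo (restrict)
  "Step off the end of \"if foo\" with RESTRICT; return the text after point."
  (with-temp-buffer
    (insert "<% " (propertize "if foo" 'my-island 'ruby) " %>\n"
            "<%= " (propertize "bar(boo)" 'my-island 'ruby) " %>")
    (goto-char (point-min))
    (search-forward "if fo")            ; point is now before the last "o"
    (let ((my-restrict-to-island restrict))
      (my-island-forward-char))
    (buffer-substring-no-properties (point) (point-max))))
```

With RESTRICT non-nil, point lands at the start of "bar(boo)"; with
RESTRICT nil it simply moves one character, onto the space before the
first "%>".
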
> - Code complexity: new code paths that might be exercised not very often
> in the future. Hence, they could be prone to breakage. A dedicated test
> suite would help with that, though.

If the abstraction we're talking about is sound, then the code
complexity will be manageable.  People cope with buffer local variables,
people cope with the regexp engine searching for syntactic things.

> Also, think back to the problem of the absence of newlines in the Ruby
> islands in the above example. The indentation engine needs them, and it
> might be harder to make newlines appear both inside and outside of the
> Ruby islands (the html-mode indentation code needs them too, I'd wager).

> This is where option B has another advantage, aside from (probably)
> being easier to implement.

-- 
Alan Mackenzie (Nuremberg, Germany).