From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: A vision for multiple major modes: some design notes
Date: Thu, 21 Apr 2016 21:33:23 +0000
Message-ID: <20160421213323.GD1775@acm.fritz.box>
References: <20160420194450.GA3457@acm.fritz.box> <8360vb6o7u.fsf@gnu.org>
In-Reply-To: <8360vb6o7u.fsf@gnu.org>
To: Eli Zaretskii
Cc: emacs-devel@gnu.org, dgutov@yandex.ru

Hello, Eli.

I'll get a fuller reply to you later. But for now....

On Thu, Apr 21, 2016 at 05:17:09PM +0300, Eli Zaretskii wrote:
> > Date: Wed, 20 Apr 2016 19:44:50 +0000
> > From: Alan Mackenzie
> >
> > This post describes my notion of how multiple major modes {c,sh}ould be
> > implemented. Key notions are "islands", "island chains", and "chain
> > local" variable bindings.

> Thank you for publishing this. A few comments and questions below.
> Please keep in mind that I never had to write any Lisp that deals with
> these issues, so apologies in advance for possibly silly questions and
> misunderstandings.

[ .... ]

> More generally, perhaps it will help if you publish the rationale for
> at least the main points of this design, discussing possible
> alternatives and explaining why you ended up with the one you present
> as the design decision. This could help us see the main issues that
> are to be dealt with, and perhaps suggest better ways of dealing with
> them. Seeing just the final product of the design tends to limit the
> discussions to low-level details, which could easily miss the broader
> picture and issues.

It would be nice if Emacs supported several major modes in a buffer, not
just by awkward workarounds, but fully and natively. There's no magic
involved in the emergence of the design - it's basically a naive vision
of how things should be, given the current state of Emacs.

The essence of major mode support is buffer local variables. (Things
like the syntax table and local key map are basically buffer local
variables, even though they are not accessible as such from Lisp.) So,
at first sight, each "island" in the buffer needs its own set of "buffer
local" variables.

However, a set of variable bindings is a big overhead in terms of RAM,
so it would make sense, wherever possible, to share these bindings
between islands with the same major mode. Furthermore, in some use
cases, there are sequences of islands which are in essence a single
stream of text. It thus makes sense to have "chains of islands", all
islands in a chain sharing the "chain local" variable bindings.

There might be a need for actual "island local" variables, with a
separate value in each island. However, Dmitry and I were unable to
identify any such variables in an earlier thread on emacs-devel. If any
such variables became apparent, then would be the time to work out how
to implement them.

The parts of a buffer which are not in any island (we won't call these
"the ocean" ;-) also need their own variable bindings. It seems to make
sense to use the standard buffer local bindings for these, since there
would otherwise be no use for them. An alternative would be to construe
these regions as being islands in their own right, in their own island
chain. However, that would fit badly with the syntactic delimiters for
islands (see below).

The above applies to most variables which are currently buffer local.
However, there are some such variables which are intrinsically to do
with the whole buffer, not individual islands within it. These include
`buffer-undo-list', the mark, `mark-ring', .....
They must be marked as belonging to the whole buffer, and handled as
such, hence the `entire-buffer' property applied to their symbols.

How do we implement chain local variable bindings? Why not base them on
the implementation of buffer local bindings? Some buffer local
variables are fixed slots in the struct buffer, the rest are elements in
an association list in the struct buffer. Until there's a better idea,
we copy this scheme for chain local variables; the fixed slot variables,
currently accessed by the BVAR macro, could instead get a somewhat more
involved macro called "CVAR" which will somehow use the current position
(whatever that means) to select the pertinent struct chain or the
familiar struct buffer.

Given a buffer position, we need to be able to find the corresponding
island chain. "Obviously", we do this with a text property, which we
might as well call `island', or possibly `chain'. Since successive
accesses to chain local variables are very likely to be in the same
chain most of the time, we will cache the "current" chain in buffer
local variables.

We want `parse-partial-sexp' and friends to work "properly" wrt islands.
It is immediately clear that the syntactic context of each island chain
is independent of other chains and of the regions outside islands. It
is also clear that the syntactic context at the end of an island should
be preserved and used as the starting value at the start of the next
island in the same chain. It thus seems sensible to introduce new
syntactic classes "open island" and "close island" to facilitate this.
Why not give them the characters "{" and "}", which are currently
unused?

This method of delimiting islands does, however, force us to deal with
nested islands. Clearly, our parser state must be amended to deal with
these stacked and suspended states. It is currently unclear whether
`syntax-ppss' needs to return this amended state, or whether the simple
"state within the chain" would be adequate.

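
The chain local lookup described a few paragraphs up can be modelled
very roughly as follows. This is a toy sketch in Python, not Emacs
code: the names `Chain', `Buffer', `chain_at', `cvar' and
`ENTIRE_BUFFER' are all hypothetical, the islands are held in a plain
list rather than in a text property, and falling back to the buffer
local value when a chain has no binding of its own is an assumption,
not something these notes specify.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical stand-in for symbols carrying the `entire-buffer' property.
ENTIRE_BUFFER = {"buffer-undo-list", "mark-ring"}


@dataclass
class Chain:
    """One island chain: a major mode plus its chain local bindings."""
    mode: str
    local_vars: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Buffer:
    """Toy buffer: buffer local bindings plus islands as (start, end, chain)."""
    local_vars: Dict[str, Any] = field(default_factory=dict)
    islands: List[Tuple[int, int, Chain]] = field(default_factory=list)
    # Last island that a lookup landed in, used as a one-entry cache.
    _cache: Optional[Tuple[int, int, Chain]] = field(default=None, repr=False)

    def chain_at(self, pos: int) -> Optional[Chain]:
        """Return the chain whose island covers POS, or None outside islands."""
        if self._cache is not None and self._cache[0] <= pos < self._cache[1]:
            return self._cache[2]
        for start, end, chain in self.islands:
            if start <= pos < end:
                self._cache = (start, end, chain)
                return chain
        return None

    def cvar(self, name: str, pos: int) -> Any:
        """Look NAME up at POS, loosely in the spirit of the proposed CVAR."""
        if name not in ENTIRE_BUFFER:
            chain = self.chain_at(pos)
            if chain is not None and name in chain.local_vars:
                return chain.local_vars[name]
        # Assumption: outside islands, for entire-buffer variables, and for
        # names the chain does not bind, the buffer local value is used.
        return self.local_vars[name]


# Two islands at 10-20 and 50-60 share one chain; 30-40 is a different chain.
js = Chain("js-mode", {"comment-start": "//"})
css = Chain("css-mode", {"comment-start": "/*"})
buf = Buffer(
    local_vars={"comment-start": "<!--", "mark-ring": []},
    islands=[(10, 20, js), (30, 40, css), (50, 60, js)],
)
```

With this model, `buf.cvar("comment-start", 15)' and
`buf.cvar("comment-start", 55)' both see the js chain's value, position
35 sees the css chain's value, and position 5 (outside any island) sees
the buffer local one. An entire-buffer variable such as `mark-ring'
ignores the chains altogether.
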
It is clear that syntactic commands such as `forward-list' (C-M-n) must
confine their operation to a single island chain.

When it comes to movement and search primitives, we want to adapt these
so that the impact on existing major modes is minimised. Ideally, we
would want major modes to "see" only their own islands (or lack
thereof). Thus we treat irrelevant islands as blocks of whitespace. It
seems to make sense to have such islands matched by subexpressions in
regexps which match spaces. This would obviate the need to amend a
great number of regexps currently coded in major modes.

On the other hand, when a user does C-s or C-M-s, the Right Thing is
surely to search the buffer as a whole, without regard to islands. We
therefore need a flag which instructs the primitives how to behave when
there are islands. We might as well call this flag `in-islands', for
want of a better name.

The user will, from time to time, delete the delimiters which define
islands, and will insert other ones. The super mode needs to be able to
react to these actions, amending its island chains appropriately. I
have not been able to come up with an adequate scheme for this using
only before/after-change-functions. These variables are going to be
chain local, and the buffer local values will hold functions for the
buffer regions not in islands. So we introduce
`island-before/after-change-function', entire-buffer local variables,
each of which will hold a single function intended for adjusting island
chains. Their return values will direct Emacs which islands need
`before/after-change-functions' invoking on them.

To minimise changes to major modes, quite a few primitives (such as
`skip-syntax-forward' and `next-single-property-change') will be amended
to restrict themselves to island chains when `in-islands' is bound to
non-nil.

Several Emacs subsystems will need enhancement, in particular redisplay
and font-lock.

Sorry this has turned out so long, so pedestrian, and so boring.
:-( As promised, I have had no magic insights, no sparkling innovations
in drawing up these notes - just a sequence of humdrum decisions, one
after the other. If I've missed out anything relevant, please say so,
then I can try and fill in the gap.

It's also clear that what I'm proposing can't be implemented in a couple
of weekends - it would be a long hard grind. But it would enable super
modes to be written with comparative ease.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).