From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: A vision for multiple major modes: some design notes
Date: Thu, 21 Apr 2016 09:05:23 -0700 (PDT)
Message-ID: <64f1d39a-dfd0-44ca-86c1-b4d6104b5702@default>
References: <20160420194450.GA3457@acm.fritz.box> <05d5bd7e-1cea-4336-a37c-fe6bd6752558@default> <20160421124325.GC1775@acm.fritz.box>
In-Reply-To: <20160421124325.GC1775@acm.fritz.box>
To: Alan Mackenzie
Cc: emacs-devel@gnu.org

> This is a good point. Maybe it would be better to match an island or
> the gap between two chained islands with any regexp element which
> matches the space (the good old 0x20 character).

See also Eli's feedback about this. I think I agree with him that
trying to repurpose whitespace matching for this is maybe not the
best approach. A separate matching should perhaps be used - nothing
to do with whitespace per se, even if the matching used might take
whitespace (also) into account.

> > I'm pretty sure I would want to be able to do things throughout
> > a chain that spans different buffers.
> > If it were I, I would
> > think about defining all that you are doing using a structure
> > that is multi-buffer.
>
> I don't envisage that the island chains will really be that useful for
> (user initiated) searching, etc. The idea is that, to the user, such a
> buffer will look much like it already does, except that the font locking
> will be appropriate for each island, the major mode key map will be
> right for each island, and so on.

I see it differently. I think you see it that way because for you
the major mode thing is an essential part of the feature you want
to implement - it is primary.

To me, chains of islands should be the primary, and a very general,
thing, and one (important) use of them would be to apply a mode to
them ("multi-modes").

IOW, I see (lots of) possible uses for chains of islands that go
beyond (i.e., do not necessarily involve) the application of a
particular mode to them. And in the general case I see no reason
to limit chains to a single buffer.

That doesn't mean that there wouldn't be important cases that do
limit the use to either (a) applying a given major mode or (b) a
single buffer. I just don't see why we would build such limits
into the design (i.e., hardcoded, making it hard to extend to
either (a) mode-agnostic or (b) multi-buffer).

> > [That is what I did for zones.el, for instance - sets of such
> > text zones are delimited by markers, which automatically record
> > the buffer they pertain to. And they can be persistent, as well.
> > Have you considered the possibility of persisting island chains?]

Persistence?

> > And I would probably want user-level operations, to combine
> > chains (append, intersect, union/coalesce, difference).
> > And why not be able to do that for chains that cross buffers?
>
> The chains will be disjoint, so intersection/difference wouldn't be
> useful.

I understand that the islands in a chain would be disjoint.
But why would chains necessarily be disjoint?
Why shouldn't chains be independent (at least be able to be
independent)? Why would defining one chain impose limits on
defining other chains (any new chains would need to be disjoint
from existing ones)?

See above, regarding the utility of being able to ignore a chain's
mode for certain operations (and the ability for a chain to not
even have an associated mode).

I suspect that you are not seeing the use cases I am, which
involve doing all kinds of things to/with the text in a chain
of islands.

As Eli suggested, think of a chain of islands as an extension of
narrowing. Now think of the many different kinds of things you
(or code) do to a narrowed region.

This should be a more general feature, I think, than what is
available in something like MuMaMo or mmm. "Multi-modes" is a
subcase.

Again, I see a chain of (ordered) text regions as the primary,
general feature, and the mapping (restriction) of a major mode to
such a chain as a subsidiary feature.

> Given that the essential feature of a chain is its major mode,

That is where we differ, and that explains, I think, the narrower
focus you have. I wouldn't limit the feature to being coupled to
a mode. That should be a possibility but not a requirement.

> it wouldn't make sense to combine chains (which will usually
> have different major modes).

It would make sense, depending on what kind of operation you
wanted to apply to the text in chains. And chains with the same
mode could also be combined, whether in the same buffer or not.

> I'm still trying to think through the idea of a
> chain having islands in several buffers.

Think of the chains first as just buffer narrowings that are
multi-region, i.e., ignoring all the syntax and major-mode
features that you are thinking about. (You can still think of
those, but they come in at a different level - a specific
subfeature or set of use cases.)

> > Being able to add (e.g. append) a chain in one buffer to a chain
> > in another buffer is one simple example.
> > Anything you might want
> > to do with one chain you will likely want to be able to do with
> > a set of chains, or at least with a chain that results from
> > composing a set of chains in various ways.
>
> > Also, I'm guessing/hoping, but I'm not sure I saw this explicitly,
> > that you can have multiple chains (e.g. in the same buffer) that
> > use the same major mode.
>
> Indeed, yes.
>
> > Being associated with a major mode is only one possible attribute of a
> > chain - it is not required, and other attributes and uses of a chain
> > are not dependent on it, right? IOW, it is not necessary to think of
> > chains as mode-related - that is just one (albeit common) use &
> > interpretation, right?
>
> Not right, sorry. The major mode is an essential attribute of an
> island chain.

Why? What's necessarily essential about it? That's a design
choice, no? Would you consider dropping it as a requirement and
keeping it as an option (for any given chain)?

> There will be a slot for it in the structure which holds chain
> data, just as there is currently a slot for it in the (C) buffer
> structure.

Must the slot be filled? Always? (Why?)

> There will likewise be slots for the syntax table, major
> mode key map, and so on. None of these slots would work well with a
> null value.

Why not optional? Of course if such a slot is not used then it,
and anything that depends on it, would not "work well". But that
should not prevent other, non-mode-related uses of a chain from
working OK.

> > > o - An island will be delimited in two complementary ways:
> > >   * - It will be enclosed syntactically by characters with
> > >     "open island" and "close island" syntax (see section (v)).
> > >     Both of these syntactic markers will include a flag "chain"
> > >     indicating whether there is a previous/next island in the
> > >     chain. The cdr of the syntax value will be
> > >     the island chain to which the island belongs.
> > >   * - It will be covered by the text property `island', whose
> > >     value will be the pertinent island or island chain
>
> > Are both always required, or is either sufficient for most
> > purposes?
>
> Both are required, yes. They will both be used.

Why required? Why can't the design tolerate not having
syntax-based delimiting?

I would prefer to see what you're envisaging placed within the
context of a more general feature. I see 3 possible levels, in
fact:

1. Arbitrary sets of text zones. Not necessarily ordered (e.g. by
   buffer position). Not necessarily without overlap.

2. #1, but as chains: ordered, non-overlapping.

3. #2, but with an associated major mode per chain. This is
   essentially what you have in mind, I think.

For all 3 levels I can see use cases for chains that cross buffers
and use cases for chain-combining operations. I can also imagine
using some chain-local variables that are not buffer-specific or
mode-specific. (You already allow for that, IIUC.)

> > I'm thinking that in many contexts I would not care about
> > delimiting by syntax, and I might not even care about
> > associating a given chain with a mode. Would I be able to
> > use such chains nevertheless (e.g. search/replace across them)?
>
> I'm not sure this island mechanism is the right tool for doing what
> you're suggesting.

Depends on what it ends up being. ;-)

> For searching/replacing at the user level, some
> extra option meaning "only in the current chain" would need to be
> added to the user interface.

FWIW, I've done this for arbitrary sets of zones (including across
buffers). The code is in `isearch-prop.el' (which depends on
`zones.el' for this feature).

Also, wrt "the current chain": You might want to look at the
zones.el code for the use of variables (which can be buffer-local,
but need not be) that hold sets of zones (including sets that are
"chains") - how users can create them, choose among them, clone
them, persist them, etc.
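
To make the three levels above a bit more concrete, here is a minimal
sketch of the data model. It is written in Python purely for
illustration: it is not Emacs Lisp, and it is neither the zones.el code
nor the proposed island API. The names `Zone', `Chain', `coalesce' and
`intersect' are hypothetical, buffers are stood in for by plain
strings, and positions are plain integers rather than markers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True, order=True)
class Zone:
    """A half-open text range [start, end) in a named buffer (assumption:
    a buffer is identified by a string, a position by an integer)."""
    buffer: str
    start: int
    end: int


def coalesce(zones: List[Zone]) -> List[Zone]:
    """Level 1 -> level 2: sort the zones and merge any that overlap or
    touch, giving an ordered, non-overlapping list."""
    merged: List[Zone] = []
    for z in sorted(zones):  # sorts by (buffer, start, end)
        if merged and merged[-1].buffer == z.buffer and z.start <= merged[-1].end:
            last = merged[-1]
            merged[-1] = Zone(last.buffer, last.start, max(last.end, z.end))
        else:
            merged.append(z)
    return merged


def intersect(a: List[Zone], b: List[Zone]) -> List[Zone]:
    """Text covered by both zone lists; zones in different buffers
    never intersect."""
    out: List[Zone] = []
    for x in a:
        for y in b:
            if x.buffer == y.buffer:
                lo, hi = max(x.start, y.start), min(x.end, y.end)
                if lo < hi:
                    out.append(Zone(x.buffer, lo, hi))
    return coalesce(out)


@dataclass
class Chain:
    """Level 2 chain. The mode slot is optional: level 3 is simply a
    chain whose `mode` is not None."""
    zones: List[Zone]
    mode: Optional[str] = None

    def __post_init__(self) -> None:
        self.zones = coalesce(self.zones)


# A chain that crosses two buffers and carries a mode ...
js = Chain([Zone("a.html", 10, 20), Zone("a.html", 15, 30),
            Zone("b.js", 0, 5)], mode="js-mode")
# ... and a mode-less chain that overlaps it.
plain = Chain([Zone("a.html", 25, 40)])
common = intersect(js.zones, plain.zones)
```

Nothing in `coalesce' or `intersect' looks at the mode, which is the
point of keeping it an optional attribute: the combining operations
work the same for chains with a mode, without one, or across buffers.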
> > A priori, I would like to have a chain data structure, and
> > as much of the rest of the features as possible, be available
> > and manipulable from Lisp. Something like this has lots of
> > enhancement possibilities and use cases that we are unlikely
> > to imagine at the outset. Implementing more than an absolute
> > minimum in C hampers that exploration and improvement.
>
> One idea would be to implement a chain feature, one of whose uses would
> be the major mode islands I've been trying to specify.

That's what I've been trying to suggest: chains of zones are
more general than the feature you've described. That doesn't
take away from the importance of the use case you have in mind.

> A significant
> part of this would have to be implemented at the C level for speed -
> chain local variables are already going to be slower to access than
> buffer local variables. We must keep that difference to a minimum.

I have no problem with stuff being in C for performance reasons.
When that is not critical, keeping stuff in Lisp is good.
Especially for a new and very general feature: let folks play
with it and experiment with new possibilities. We can later
optimize any parts we like. We should avoid doing that
prematurely, as always - but especially for Emacs, where Lisp
enhancement by users is really the name of the game.

Thanks again for opening this discussion and providing a detailed
first proposal.