From: Drew Adams
To: Alan Mackenzie, emacs-devel@gnu.org
Newsgroups: gmane.emacs.devel
Subject: RE: A vision for multiple major modes: some design notes
Date: Wed, 20 Apr 2016 14:06:37 -0700 (PDT)
Message-ID: <05d5bd7e-1cea-4336-a37c-fe6bd6752558@default>
In-Reply-To: <20160420194450.GA3457@acm.fritz.box>

Sounds very good, a priori. And I commend you for actually putting
together a clear and comprehensive design proposal for discussion,
instead of just implementing something. Especially for something
that is likely to lead to new uses and further possibilities, it
is good to open up the big picture for discussion (regardless of
the outcome).

Some feedback, mostly minor -

> * - For regexps which recognise whitespace, the regexp must contain
>     "\\s-" or "\\s " or "[[:space:]]" so that the regexp engine will
>     handle "foreign" islands and gaps between chained islands as whitespace.

I understand the motivation (you explain it further on). But this
hardcoding of what can constitute a "whitespace-matching" pattern
seems a bit rigid.
No way to flexibly allow for different meanings of whitespace here?
What if some code wants to handle \n or \t or \f etc. differently,
or to even treat some set of (normally non-whitespace) chars as if
they too were whitespace for island purposes?

> o - A @dfn{chain} of islands is a canonically ordered chain of islands in
>     a single buffer.

Why limit it necessarily to a single buffer? It is common to want
to do things (search etc.) across multiple buffers, and sometimes
regardless of mode. That doesn't diminish just because one might
want to use chains of non-contiguous text zones.

I'm pretty sure I would want to be able to do things throughout a
chain that spans different buffers. If it were I, I would think
about defining all that you are doing using a structure that is
multi-buffer.

[That is what I did for zones.el, for instance - sets of such
text zones are delimited by markers, which automatically record
the buffer they pertain to. And they can be persistent, as well.
Have you considered the possibility of persisting island chains?]

And I would probably want user-level operations, to combine chains
(append, intersect, union/coalesce, difference).

And why not be able to do that for chains that cross buffers?
Being able to add (e.g. append) a chain in one buffer to a chain
in another buffer is one simple example. Anything you might want
to do with one chain you will likely want to be able to do with a
set of chains, or at least with a chain that results from composing
a set of chains in various ways.

Also, I'm guessing/hoping, but I'm not sure I saw this explicitly,
that you can have multiple chains (e.g. in the same buffer) that
use the same major mode. Being associated with a major mode is
only one possible attribute of a chain - it is not required, and
other attributes and uses of a chain are not dependent on it,
right? IOW, it is not necessary to think of chains as mode-related -
that is just one (albeit common) use & interpretation, right?

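
To make the chain-combining operations mentioned above a bit more
concrete, here is a minimal sketch. It is written in Python rather
than Emacs Lisp purely to stay self-contained, and it is neither the
zones.el API nor anything from the proposal. It assumes a zone is just
a (buffer, start, end) triple with a half-open range; `Zone`,
`coalesce` and `intersect` are hypothetical names chosen for
illustration. Append is plain list concatenation; union/coalesce and
intersection work buffer by buffer, so a chain may span buffers.

```python
from collections import defaultdict
from typing import NamedTuple


class Zone(NamedTuple):
    """Hypothetical text zone: half-open range [start, end) in a named buffer."""
    buffer: str
    start: int
    end: int


def coalesce(chain):
    """Union of a chain: merge overlapping or adjacent zones, buffer by buffer."""
    by_buffer = defaultdict(list)
    for zone in chain:
        by_buffer[zone.buffer].append(zone)
    merged = []
    for buffer in sorted(by_buffer):
        for zone in sorted(by_buffer[buffer], key=lambda z: z.start):
            last = merged[-1] if merged else None
            if last and last.buffer == buffer and zone.start <= last.end:
                # Overlaps or touches the previous zone: extend it.
                merged[-1] = Zone(buffer, last.start, max(last.end, zone.end))
            else:
                merged.append(zone)
    return merged


def intersect(chain_a, chain_b):
    """Zones covered by both chains; zones in different buffers never meet."""
    result = []
    for a in coalesce(chain_a):
        for b in coalesce(chain_b):
            if a.buffer != b.buffer:
                continue
            start, end = max(a.start, b.start), min(a.end, b.end)
            if start < end:
                result.append(Zone(a.buffer, start, end))
    return result


# Two chains, each spanning two (made-up) buffers.
chain_a = [Zone("a.html", 10, 20), Zone("b.html", 5, 15)]
chain_b = [Zone("a.html", 15, 30), Zone("b.html", 40, 50)]

appended = chain_a + chain_b           # append: plain concatenation
print(coalesce(appended))              # union/coalesce across both buffers
print(intersect(chain_a, chain_b))     # only the overlap in a.html survives
```

Because each zone carries its buffer, none of the operations needs to
know whether a chain lives in one buffer or several; in Emacs the
markers that delimit a zone would carry that information themselves.
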
> o - An island will be delimited in two complementary ways:
>   * - It will be enclosed syntactically by characters with
>       "open island" and "close island" syntax (see section (v)).
>       Both of these syntactic markers will include a flag "chain"
>       indicating whether there is a previous/next island in the
>       chain. The cdr of the syntax value will be
>       the island chain to which the island belongs.
>   * - It will be covered by the text property `island', whose
>       value will be the pertinent island or island chain

Are both always required, or is either sufficient for most
purposes? Is the syntax one needed only when you need to
take advantage of it? Can you do most things using either,
so that a given operation (that is not specific to only one
of them, e.g. not specific to syntax) can be done regardless
of which is available?

I'm thinking that in many contexts I would not care about
delimiting by syntax, and I might not even care about
associating a given chain with a mode. Would I be able to
use such chains nevertheless (e.g. search/replace across them)?

> Note that if islands are enclosed inside other islands,

Maybe you can elaborate on overlapping islands and chains?
What caveats or use cases do you see?

A priori, I would like to have a chain data structure, and as
much of the rest of the features as possible, be available and
manipulable from Lisp. Something like this has lots of
enhancement possibilities and use cases that we are unlikely
to imagine at the outset. Implementing more than an absolute
minimum in C hampers that exploration and improvement.

HTH. I don't claim to have grasped all of what you envisage.
It's great food for thought, in any case.

(I asked a couple of times, in the bug thread(s) and here, for
just this sort of top-level picture of what was envisaged. I
gave up hoping that someone might actually make clear what the
question/project/plan is. This is a welcome, if unexpected,
development.)