From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Drew Adams <drew.adams@oracle.com>
Newsgroups: gmane.emacs.devel
Subject: RE: A vision for multiple major modes: some design notes
Date: Wed, 20 Apr 2016 14:06:37 -0700 (PDT)
Message-ID: <05d5bd7e-1cea-4336-a37c-fe6bd6752558@default>
References: <20160420194450.GA3457@acm.fritz.box>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
X-Trace: ger.gmane.org 1461186440 16194 80.91.229.3 (20 Apr 2016 21:07:20 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Wed, 20 Apr 2016 21:07:20 +0000 (UTC)
To: Alan Mackenzie <acm@muc.de>, emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Apr 20 23:07:08 2016
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1aszKx-0001kS-Fn
	for ged-emacs-devel@m.gmane.org; Wed, 20 Apr 2016 23:07:03 +0200
Original-Received: from localhost ([::1]:56451 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1aszKw-0003je-5E
	for ged-emacs-devel@m.gmane.org; Wed, 20 Apr 2016 17:07:02 -0400
Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:59558)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <drew.adams@oracle.com>) id 1aszKh-0003gU-Ty
	for emacs-devel@gnu.org; Wed, 20 Apr 2016 17:06:49 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <drew.adams@oracle.com>) id 1aszKc-0002Gp-So
	for emacs-devel@gnu.org; Wed, 20 Apr 2016 17:06:47 -0400
Original-Received: from aserp1040.oracle.com ([141.146.126.69]:23839)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <drew.adams@oracle.com>) id 1aszKc-0002GE-Kp
	for emacs-devel@gnu.org; Wed, 20 Apr 2016 17:06:42 -0400
Original-Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233])
	by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with
	ESMTP id u3KL6e1q014526
	(version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Wed, 20 Apr 2016 21:06:41 GMT
Original-Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72])
	by aserv0021.oracle.com (8.13.8/8.13.8) with ESMTP id u3KL6d34021435
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Wed, 20 Apr 2016 21:06:40 GMT
Original-Received: from abhmp0013.oracle.com (abhmp0013.oracle.com [141.146.116.19])
	by userv0121.oracle.com (8.13.8/8.13.8) with ESMTP id u3KL6cGk028674;
	Wed, 20 Apr 2016 21:06:39 GMT
In-Reply-To: <20160420194450.GA3457@acm.fritz.box>
X-Priority: 3
X-Mailer: Oracle Beehive Extensions for Outlook 2.0.1.9  (901082) [OL
	12.0.6744.5000 (x86)]
X-Source-IP: aserv0021.oracle.com [141.146.126.233]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic]
X-Received-From: 141.146.126.69
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel/>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: "Emacs-devel" <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Xref: news.gmane.org gmane.emacs.devel:203137
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/203137>

Sounds very good, a priori.  And I commend you for actually
putting together a clear and comprehensive design proposal
for discussion, instead of just implementing something.
Especially for something that is likely to lead to new uses
and further possibilities, it is good to open up the big
picture for discussion (regardless of the outcome).

Some feedback, mostly minor:

>     * - For regexps which recognise whitespace, the regexp must contain
>         "\\s-" or "\\s " or "[[:space:]]" so that the regexp engine will
>         handle "foreign" islands and gaps between chained islands as
>         whitespace.

I understand the motivation (you explain it further on).  But this
hardcoding of what can constitute a "whitespace-matching" pattern
seems a bit rigid.  No way to flexibly allow for different meanings
of whitespace here?  What if some code wants to handle \n or \t or
\f etc. differently, or to even treat some set of (normally
non-whitespace) chars as if they too were whitespace for island
purposes?
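
To make the question concrete, here is a minimal sketch of what I
mean by a configurable notion of whitespace.  It is in Python only
for brevity, and it is not part of the proposal: the names
`DEFAULT_WHITESPACE', `whitespace_regex' and `split_on_whitespace'
are hypothetical, invented for this illustration.

```python
import re

# Hypothetical default: the characters treated as whitespace unless
# the caller says otherwise.
DEFAULT_WHITESPACE = " \t\n\f"


def whitespace_regex(chars=DEFAULT_WHITESPACE, extra=""):
    """Return a compiled regex matching a run of "whitespace".

    CHARS is the base set of whitespace characters; EXTRA adds
    characters that are not normally whitespace but should be
    treated as such.
    """
    # dict.fromkeys removes duplicates while keeping order.
    members = "".join(re.escape(c) for c in dict.fromkeys(chars + extra))
    return re.compile("[" + members + "]+")


def split_on_whitespace(text, pattern):
    """Split TEXT on runs matched by PATTERN, dropping empty pieces."""
    return [piece for piece in pattern.split(text) if piece]
```

With something like this, code that wants \t to count as ordinary
text would pass chars=" \n\f", and code that wants commas treated as
whitespace would pass extra=",".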

>   o - A @dfn{chain} of islands is a canonically ordered chain of islands in
>       a single buffer.

Why limit it necessarily to a single buffer?  It is common to
want to do things (search etc.) across multiple buffers, and
sometimes regardless of mode.  That doesn't diminish just
because one might want to use chains of non-contiguous text
zones.

I'm pretty sure I would want to be able to do things throughout
a chain that spans different buffers.  If it were I, I would
think about defining all that you are doing using a structure
that is multi-buffer.

[That is what I did for zones.el, for instance - sets of such
text zones are delimited by markers, which automatically record
the buffer they pertain to.  And they can be persistent, as well.
Have you considered the possibility of persisting island chains?]

And I would probably want user-level operations, to combine
chains (append, intersect, union/coalesce, difference).
And why not be able to do that for chains that cross buffers?
Being able to add (e.g. append) a chain in one buffer to a chain
in another buffer is one simple example.  Anything you might want
to do with one chain you will likely want to be able to do with
a set of chains, or at least with a chain that results from
composing a set of chains in various ways.
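
As a rough illustration of the kind of operations I have in mind,
here is a small sketch.  It is Python rather than Lisp, and it is
neither the zones.el code nor anything from the proposal.  It
assumes a hypothetical model in which a zone is a tuple
(BUFFER-NAME, START, END) with END exclusive, and a chain is just a
list of such zones, possibly from several buffers.

```python
from collections import defaultdict

# Assumed model: a zone is (buffer_name, start, end), end exclusive.
# A chain is a list of zones and may span several buffers.


def coalesce(zones):
    """Merge overlapping or touching zones, buffer by buffer."""
    by_buffer = defaultdict(list)
    for buf, start, end in zones:
        if start < end:  # ignore empty zones
            by_buffer[buf].append((start, end))
    result = []
    for buf in sorted(by_buffer):
        merged = []
        for start, end in sorted(by_buffer[buf]):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        result.extend((buf, s, e) for s, e in merged)
    return result


def union(a, b):
    """Zones covered by chain A or chain B."""
    return coalesce(list(a) + list(b))


def intersect(a, b):
    """Zones covered by both chain A and chain B."""
    out = []
    for buf_a, s_a, e_a in coalesce(a):
        for buf_b, s_b, e_b in coalesce(b):
            if buf_a == buf_b:
                s, e = max(s_a, s_b), min(e_a, e_b)
                if s < e:
                    out.append((buf_a, s, e))
    return coalesce(out)


def difference(a, b):
    """Zones covered by chain A but not by chain B."""
    out = []
    cut = coalesce(b)
    for buf, start, end in coalesce(a):
        pos = start
        for buf_b, s_b, e_b in cut:
            if buf_b != buf or e_b <= pos or s_b >= end:
                continue  # other buffer, or no overlap with what is left
            if s_b > pos:
                out.append((buf, pos, s_b))
            pos = max(pos, e_b)
        if pos < end:
            out.append((buf, pos, end))
    return out
```

Plain append is then just list concatenation.  The point is that
none of these operations cares whether the zones are in one buffer
or in several.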

Also, I'm guessing/hoping, but I'm not sure I saw this explicitly,
that you can have multiple chains (e.g. in the same buffer) that
use the same major mode.  Being associated with a major mode is
only one possible attribute of a chain - it is not required, and
other attributes and uses of a chain are not dependent on it, right?
IOW, it is not necessary to think of chains as mode-related - that
is just one (albeit common) use & interpretation, right?

>   o - An island will be delimited in two complementary ways:
>     * - It will be enclosed syntactically by characters with
>       "open island" and "close island" syntax (see section (v)).
>       Both of these syntactic markers will include a flag "chain"
>       indicating whether there is a previous/next island in the
>       chain.  The cdr of the syntax value will be
>       the island chain to which the island belongs.
>     * - It will be covered by the text property `island', whose
>       value will be the pertinent island or island chain

Are both always required, or is either sufficient for most
purposes?  Is the syntax one needed only when you need to
take advantage of it?  Can you do most things using either,
so that a given operation (that is not specific to only one
of them, e.g. not specific to syntax) can be done regardless
of which is available?

I'm thinking that in many contexts I would not care about
delimiting by syntax, and I might not even care about
associating a given chain with a mode.  Would I be able to
use such chains nevertheless (e.g. search/replace across them)?

>       Note that if islands are enclosed inside other islands,

Maybe you can elaborate on overlapping islands and chains?
What caveats or use cases do you see?

A priori, I would like to have a chain data structure, and
as much of the rest of the features as possible, be available
and manipulable from Lisp.  Something like this has lots of
enhancement possibilities and use cases that we are unlikely
to imagine at the outset.  Implementing more than an absolute
minimum in C hampers that exploration and improvement.

HTH.  I don't claim to have grasped all of what you envisage.
It's great food for thought, in any case.

(I asked a couple of times, in the bug thread(s) and here,
for just this sort of top-level picture of what was envisaged.
I gave up hoping that someone might actually make clear what
the question/project/plan is.  This is a welcome, if unexpected,
development.)