From: Alan Mackenzie <acm@muc.de>
To: emacs-devel@gnu.org, Dmitry Gutov <dgutov@yandex.ru>
Subject: A vision for multiple major modes: some design notes
Date: Wed, 20 Apr 2016 19:44:50 +0000
Message-ID: <20160420194450.GA3457@acm.fritz.box>
Hello, Dmitry and Emacs.
This post describes my notion of how multiple major modes {c,sh}ould be
implemented. Key notions are "islands", "island chains", and "chain
local" variable bindings.
In this scheme, "super modes" will not have to do anything to swap in/out
local variable bindings pertinent to islands; this will be done by the
underlying C code. Narrowing/widening will not be (ab)used by the super
mode mechanism. Major modes will continue to be able to use the entire
range of Emacs facilities.
Here are some design notes:
(i) Overview and motivation.
o - The aim is to support several major modes simultaneously in a single
buffer.
o - The "super mode" will set up "chains of islands" (see below).
* - Each chain will have its own major mode, key map, syntax table, etc.
* - In each chain, "chain local" variable bindings will exist. Such a
binding will be current when point is within an island in the chain.
* - The coordination of these bindings will be carried out by the
mechanisms described below, without explicit coding in the super mode.
o - To the user, the current major mode will be that of the island where
point is. All familiar commands will work without restriction.
o - To the writer of major modes, a minimal set of restrictions will apply:
* - For some major mode commands, the mode will have to bind the variable
`in-islands' (see below) to non-nil.
* - For regexps which recognise whitespace, the regexp must contain "\\s-"
or "\\s " or "[[:space:]]" so that the regexp engine will handle
"foreign" islands and gaps between chained islands as whitespace.
* - All other Emacs facilities will be available for use, being adapted as
necessary for the island mechanism.
(ii) Definitions and concepts.
o - An @dfn{island} is a contiguous portion of a buffer marked at each end.
Its attributes are those of the chain of islands of which it is an
element.
o - A @dfn{chain} of islands is a canonically ordered chain of islands in a
single buffer. An island chain has its own major mode; it has its own
syntax table, abbreviation table, font lock settings, etc. It has its own
bindings of (most) "buffer" local variables.
o - An island chain will have @dfn{chain local} variable bindings. Such a
binding will become current and accessible when point is within one of the
chain's islands. When point is not in an island, the buffer local binding
of the variable will be current. Most variables which are currently
buffer local in Emacs 25 will become chain local. Those (relatively few)
variables which must retain a single value over an entire buffer will be
marked as such with a non-nil value of the `entire-buffer' property.
o - The variable `using-islands' will be set non-nil to indicate the current
buffer is using the island mechanism.
o - The variable `in-islands' will control island and island chain
facilities. When this variable is bound to non-nil, the facilities
described here (such as chain local variables) are active. When the
variable is nil, (most of) the new facilities are inactive, and Emacs
behaves as Emacs 25.
(iii) Island Chains.
o - An island chain will be a Lisp object which is a C struct similar to
struct buffer. In particular, it will contain slots for common chain
local variables, and an association list for bindings of other chain local
variables.
o - An island chain might contain pointers to the first and last of its
islands (still to be decided).
(iv) Islands.
o - An island will be delimited in two complementary ways:
* - It will be enclosed syntactically by characters with "open island" and
"close island" syntax (see section (v)). Both of these syntactic
markers will include a flag "chain" indicating whether there is a
previous/next island in the chain. The cdr of the syntax value will be
the island chain to which the island belongs.
* - It will be covered by the text property `island', whose value will be
the pertinent island or island chain (see section (ii)) (not yet
decided). Note that if islands are enclosed inside other islands, the
value is the innermost island. There is the possibility of using an
interval tree independent of the one for text properties to increase
performance.
o - An island might be represented by a C or Lisp structure, or it might
not (not yet decided). If it is, the structure would hold the containing chain,
markers pointing to the start and end of the chain, and the previous and
next islands in the chain.
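The "innermost island" rule for the `island' text property can be modelled in a few lines. This is only a sketch in Python, not Emacs code: the Island record, its field names and innermost_island are hypothetical, and positions are plain integers rather than markers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Island:
    chain: str  # name of the owning island chain (hypothetical identifier)
    start: int  # first position inside the island
    end: int    # position just past the island


def innermost_island(islands: List[Island], pos: int) -> Optional[Island]:
    """Return the smallest island covering POS, or None outside all islands."""
    covering = [i for i in islands if i.start <= pos < i.end]
    # With proper nesting, the shortest covering island is the innermost one.
    return min(covering, key=lambda i: i.end - i.start, default=None)
```

A linear search like this is where a dedicated interval tree, as mentioned above, would pay off in large buffers.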
(v) Syntax, etc.
o - Two new syntax classes, "open island" and "close island" will be
introduced. These will be designated by the characters "{" and "}". Their
"matching character" slots will contain the island's chain. There will be
an extra flag "chain" (denoted by "i") indicating whether there is a
previous/next island in the chain.
o - `scan-lists', `scan-sexps', etc. will treat a "foreign" island as
whitespace, much as they do comments. They will also treat as whitespace
the gap between two islands in a chain.
o - The (currently 11 element) parser state will be enhanced to support
islands as follows:
* - A twelfth element (elt 11, counting from zero) will be introduced.
This will contain an
association list whose elements will have the form (island-chain
. 12-element parse state); each element will contain the suspended state
of parsing in the island chain which is the car of the element. An
element with a car of nil will represent the suspended parsing state of
the buffer outside of islands.
* - Elements 12, 13, ... (again counting from zero) will be the island chains of the enclosing islands,
elt 12 being that of the innermost enclosing island, etc. An element
with a value of nil indicates being outside all islands.
o - `parse-partial-sexp' will create and use an enhanced parser state as
described above. Note that a two character construct (such as a C comment
opener) can not enclose an island, and special handling will be required
to exclude this. The syntax table in use will change as the current
position passes between islands.
o - `syntax-ppss' will do the right thing with the extended parser state.
Alternatively, `syntax-ppss' will have an independent 12-element state in
each island chain, where elt. 11 is always nil. Its cache mechanism will
be enhanced such that buffer changes outside of an island chain need not
invalidate the stored cache pertaining to the chain.
o - The facilities in this section are active even when `in-islands' is
nil.
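The suspended-state idea behind the twelfth element can be illustrated with a toy scanner. This is a Python model under loud assumptions: the whole parse state is reduced to a single paren depth, scan_depths and chain_at are hypothetical names, and the key None stands for the buffer outside all islands.

```python
def scan_depths(text, chain_at):
    """Track paren depth separately per chain while scanning TEXT.

    chain_at(pos) returns the chain owning POS, or None outside islands.
    The returned dict plays the role of the alist of suspended states.
    """
    suspended = {}             # chain -> suspended depth
    current, depth = None, 0   # scanning starts outside all islands
    for pos, ch in enumerate(text):
        chain = chain_at(pos)
        if chain != current:
            # Crossing an island boundary: park the old state and
            # resume (or freshly start) the state of the new chain.
            suspended[current] = depth
            depth = suspended.pop(chain, 0)
            current = chain
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
    suspended[current] = depth
    return suspended
```

A paren opened in one island of a chain and closed in a later island of the same chain balances, while the surrounding buffer text is counted on its own.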
(vi) Regexps.
o - The regexp engine will be enhanced such that the regexps "\\s-", "\\s ",
and "[[:space:]]" will match an entire island.
o - The gap between two islands in a chain will also be matched by the above
regexps.
o - This treatment of an island, and a gap between two islands, as whitespace will
occur only when `in-islands' is non-nil.
o - When `in-islands' is nil, there will be no reliable way of scanning over
an island by regexps, since it is a potentially nested structure, and FSMs
don't recognise arbitrarily nested structures.
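The intended effect on whitespace matching can be sketched outside the regexp engine. The Python function below is a hypothetical model, not the proposed implementation: it assumes the foreign islands and inter-island gaps are already known as (start, end) spans, and it steps over each such span as though it were a single blank.

```python
def skip_whitespace(text, pos, foreign):
    """Advance POS over whitespace in TEXT.

    FOREIGN is a list of (start, end) spans (foreign islands or gaps
    between chained islands), each of which is skipped as a whole.
    """
    moved = True
    while moved and pos < len(text):
        moved = False
        for start, end in foreign:
            if start <= pos < end:
                pos, moved = end, True   # jump over the entire span
                break
        if not moved and text[pos].isspace():
            pos, moved = pos + 1, True   # ordinary whitespace character
    return pos
```

With no spans supplied, the function behaves like a plain whitespace skip, which corresponds to `in-islands' being nil.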
(vii) Variables.
o - Island chain local variable bindings will come into existence. These
bindings depend on the island point is in. There will be lower level
routines that will have "position" parameters as an alternative to using
point.
o - All variables which are currently buffer local will become chain local
except for those whose symbols are given a non-nil `entire-buffer'
property. There will be no new functions like
`make-chain-local-variable'.
o - When the `entire-buffer' property is nil, the buffer local binding of a
variable will hold the value pertinent to the areas of the buffer outside
of islands. When that property is non-nil, the binding holds the value
for the entire buffer.
o - When `in-islands' is nil, the chain local mechanism described here is
not used - instead the familiar buffer local binding is used.
o - The current binding for a local variable will be the chain local binding
of the island chain of the island containing point. If point is not in an
island, the buffer local binding is current.
o - If a chain local binding is current, and its value is unbound, the
binding of an enclosing scope is NOT used in its place. Probably the
variable's default-value should be used when reading.
o - In buffer.h, a new macro CVAR ("island chain variable") analogous to
BVAR will be introduced. It will use BVAR as a fall back. Most
invocations of BVAR will be changed to CVAR.
o - In data.c, the mechanism for accessing local variable bindings
(e.g. `swap_in_symval_forwarding') will be enhanced to test `in-islands'
and handle chain local bindings appropriately.
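The lookup rules in this section can be summarised as a small decision procedure. The Python below is a sketch with assumed names (current_value and its parameters are hypothetical); dicts stand in for binding tables, and an unbound chain local value reads the default, following the "probably" above.

```python
def current_value(var, chain, buffer_local, defaults,
                  in_islands=True, entire_buffer=()):
    """Return the value of VAR that would be current.

    CHAIN is the binding table of the island chain containing point,
    or None when point is outside all islands.
    """
    use_chain = in_islands and chain is not None and var not in entire_buffer
    if use_chain:
        # An unbound chain local value does NOT fall back to the
        # buffer local binding; the default value is read instead.
        return chain.get(var, defaults.get(var))
    return buffer_local.get(var, defaults.get(var))
```

For example, with a default `tab-width' of 8 and a buffer local value of 4, a chain that never sets the variable reads 8, not 4.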
(viii) Change hooks.
o - There will be two additional abnormal hooks,
`island-before-change-function' and `island-after-change-function', which
will each hold a single function or nil. These will take the same
parameters as `before-change-functions' and `after-change-functions'
respectively.
o - The return value of these functions will be an association list with
members whose car is an island chain (or nil, meaning "outside all
islands") and whose cdr is the list of parameters to supply to
`before/after-change-functions' for that chain. Usually, the alist will
have just one member containing BEG, END, and (for
`after-change-functions') OLD-LEN unchanged.
o - After calling each of these functions, Emacs will invoke
`before/after-change-functions' on each chain in the returned alist. This
will be in place of the standard calls to `before/after-change-functions'.
o - The intention of these hooks is that super modes will use them to detect
the deletion and insertion of islands, and to do the "de-islandification"
and "islandification" as needed.
o - `before/after-change-functions' will be normal chain local variables.
A chain local binding will hold functions for the individual chain. The
buffer local binding will hold functions for the parts of the buffer
outside of islands.
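The dispatch described in this section might look roughly as follows. This is a Python model with hypothetical names: island_fn stands for the super mode's `island-after-change-function', its return value is a list of (chain, args) pairs standing in for the alist, and hooks maps each chain (or None for outside all islands) to that chain's change functions.

```python
def dispatch_after_change(island_fn, hooks, beg, end, old_len):
    """Run the per-chain change functions selected by ISLAND_FN.

    Returns the chains that were dispatched to, in order.
    """
    dispatched = []
    for chain, args in island_fn(beg, end, old_len):
        # Only the chains named in the returned alist get called,
        # each with the parameters the super mode chose for it.
        for fn in hooks.get(chain, []):
            fn(*args)
        dispatched.append(chain)
    return dispatched
```

The case where the island hook is nil is not modelled here, since the notes above do not spell it out.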
(ix) Miscellaneous commands and functions.
o - `point-min' and `point-max' will, when `in-islands' is non-nil, return
the min/max point in the visible region in the same chain of islands as
point.
o - `search-\(forward\|backward\)\(-regexp\)?' will restrict themselves to
the current island chain when `in-islands' is non-nil.
o - `skip-\(chars\|syntax\)-\(forward\|backward\)' will likewise operate in
the current island chain (how?) when `in-islands' is non-nil.
o - `\(next\|previous\)-\(single\|char\)-property-change', etc., will do the
Right Thing in island chains when `in-islands' is non-nil.
o - New functions `island-min', `island-max', `island-chain-min' and
`island-chain-max' will do what their names say.
o - There will be no restrictions on the use of widening/narrowing, as have
been proposed for other support engines for multiple major modes.
o - New commands like `beginning-of-island', `narrow-to-island', etc. will
be wanted. Harder still, key bindings will have to be found for them.
o - ??? Other commands to be amended.
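How `point-min' and `point-max' would combine the current chain with narrowing can be sketched as a clipping computation. The Python below is a model under assumptions: islands are (chain, start, end) tuples, the visible region is given by two integers, and chain_point_limits is a hypothetical name.

```python
def chain_point_limits(islands, chain, visible_min, visible_max):
    """Return (min, max) positions of CHAIN inside the visible region.

    Returns None if no island of CHAIN overlaps the visible region.
    """
    clipped = [
        (max(start, visible_min), min(end, visible_max))
        for c, start, end in islands
        if c == chain and start < visible_max and end > visible_min
    ]
    if not clipped:
        return None
    return min(s for s, _ in clipped), max(e for _, e in clipped)
```

With the whole buffer visible, the result coincides with what `island-chain-min' and `island-chain-max' would presumably return.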
(x) Emacs subsystems and `in-islands'.
o - Redisplay will bind `in-islands' to non-nil, but will successfully
display all islands wholly or partially in windows being displayed.
o - Font Lock will bind `in-islands' to non-nil, but will successfully
fontify all pertinent islands.
o - `island-before/after-change-function' will be called with `in-islands'
nil.
o - `before/after-change-functions' will be called with `in-islands' bound
to non-nil.
o - Major modes will need to bind `in-islands' to non-nil for such things as
indentation.
o - For normal user interaction, `in-islands' will be nil.
--
Alan Mackenzie (Nuremberg, Germany).