From: phillip.lord@russet.org.uk (Phillip Lord)
Newsgroups: gmane.emacs.devel
Subject: Re: A vision for multiple major modes: some design notes
Date: Wed, 20 Apr 2016 23:27:34 +0100
Message-ID: <87vb3blxux.fsf@russet.org.uk>
References: <20160420194450.GA3457@acm.fritz.box>
To: Alan Mackenzie
Cc: Dmitry Gutov, emacs-devel@gnu.org

A few comments, rather than an in-depth analysis, I'm afraid.

Alan Mackenzie writes:

> (iv) Islands.
>   o - An island will be delimited in two complementary ways:
>     * - It will be enclosed syntactically by characters with "open island" and
>       "close island" syntax (see section (v)). Both of these syntactic
>       markers will include a flag "chain" indicating whether there is a
>       previous/next island in the chain. The cdr of the syntax value will be
>       the island chain to which the island belongs.
>     * - It will be covered by the text property `island', whose value will be
>       the pertinent island or island chain (see section (ii)) (not yet
>       decided). Note that if islands are enclosed inside other islands, the
>       value is the innermost island. There is the possibility of using an
>       interval tree independent of the one for text properties to increase
>       performance.

When you say "complementary", do you mean alternative or simultaneous?
That is, will an island always be enclosed by syntax markers and always
have a text property? Or can it have either?

I'm still not understanding how the chain of islands is set up. Is this
entirely the super mode's responsibility?

The use of "syntax" suggests that the islands can be detected *purely*
syntactically. But there are many places where this is not true;
consider org-mode:

    #+begin_src emacs-lisp
    (message "hello world")
    #+end_src

We cannot assume that "#+end_src" is the end of an island.

Also, how will the regexp engine work when it spans an island? I ask
because, if we use the regexp engine to match delimiters, then which
syntax do we use if there are multiple modes in the buffer?

>   o - An island might be represented by a C or Lisp structure, it might not
>     (not yet decided). This structure would hold the containing chain,
>     markers pointing to the start and end of the chain, and the previous and
>     next islands in the chain.
>
> (v) Syntax, etc.
>   o - Two new syntax classes, "open island" and "close island" will be
>     introduced. These will be designated by the characters "{" and "}".
>     Their "matching character" slots will contain the island's chain. There
>     will be an extra flag "chain" (denoted by "i") indicating whether there
>     is a previous/next island in the chain.
>   o - `scan-lists', `scan-sexps', etc. will treat a "foreign" island as
>     whitespace, much as they do comments. They will also treat as whitespace
>     the gap between two islands in a chain.

Difficult to say, but this might produce some counterintuitive
behaviour. So, for example, consider some text like so:

=== Example
(here is some lisp)
;; This is a long and tedious piece of documentation in my lisp program.
(here is some more lisp)
=== End Example

Now moving backward a paragraph will have a significant difference in
behaviour: on the "(" of "(here is some more lisp)", we move to
"(here is some lisp)", while on the char before, we move to "This is a
long".

Good, bad, expected? Don't know.

>   o - The (currently 11 element) parser state will be enhanced to support
>     islands as follows:
>     * - A twelfth element will be introduced. This will contain an
>       association list whose elements will have the form (island-chain
>       . 12-element parse state); each element will contain the suspended state
>       of parsing in the island chain which is the car of the element. An
>       element with a car of nil will represent the suspended parsing state of
>       the buffer outside of islands.
>     * - Elements 12, 13, .... will be island chains of the enclosing islands,
>       elt 12 being that of the innermost enclosing island, etc. An element
>       with a value of nil indicates being outside all islands.
>   o - `parse-partial-sexp' will create and use an enhanced parser state as
>     described above. Note that a two character construct (such as a C comment
>     opener) can not enclose an island, and special handling will be required
>     to exclude this. The syntax table in use will change as the current
>     position passes between islands.
>   o - `syntax-ppss' will do the right thing with the extended parser state.
>     Alternatively, `syntax-ppss' will have an independent 12-element state in
>     each island chain, where elt. 11 is always nil. Its cache mechanism will
>     be enhanced such that buffer changes outside of an island chain need not
>     invalidate the stored cache pertaining to the chain.
>   o - The facilities in this section are active even when `in-islands' is
>     nil.
>
> (vi) Regexps.
>   o - The regexp engine will be enhanced such that the regexps "\\s-", "\\s ",
>     and "[[:space:]]" will match an entire island.
>   o - The gap between two islands in a chain will also be matched by the above
>     regexps.
>   o - This treatment of an island, and a gap between two islands, as WS will
>     occur only when `in-islands' is non-nil.
>   o - When `in-islands' is nil, there will be no reliable way of scanning over
>     an island by regexps, since it is a potentially nested structure, and FSMs
>     don't recognise arbitrarily nested structures.
>
> (vii) Variables.
>   o - Island chain local variable bindings will come into existence. These
>     bindings depend on the island point is in. There will be lower level
>     routines that will have "position" parameters as an alternative to using
>     point.
>   o - All variables which are currently buffer local will become chain local
>     except for those whose symbols are given a non-nil `entire-buffer'
>     property. There will be no new functions like
>     `make-chain-local-variable'.

What is the default value of a chain-local variable, if the variable is
also buffer-local? Will we need functions for setting all chains in a
certain mode in a single buffer?

Phil
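
P.S. To make the default-value question concrete, here is a minimal
sketch of how the lookup works today with only two levels (global
default, then buffer-local). The names `demo-indent' and `demo-values'
are made up for this illustration; nothing here is part of the
proposal. Presumably chain-local bindings would add a third level to
this lookup, and it is not obvious to me which level `default-value'
should then report.

```elisp
;; Sketch of the existing two-level lookup (global default, then
;; buffer-local).  `demo-indent' and `demo-values' are hypothetical
;; names for this illustration, not part of Emacs or of the proposal.
(defvar demo-indent 2
  "Hypothetical variable; its global default is 2.")

(defun demo-values ()
  "Return (DEFAULT BUFFER-VALUE LOCAL-P) as seen from a temporary buffer."
  (with-temp-buffer
    ;; Give only this temporary buffer its own binding.
    (setq-local demo-indent 4)
    (list (default-value 'demo-indent)   ; global default, unchanged
          demo-indent                    ; value seen in this buffer
          (local-variable-p 'demo-indent))))

(demo-values) ; => (2 4 t)
```

With chain-local bindings, the middle element would depend on the
island that point is in, not just on the buffer.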