From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: A vision for multiple major modes: some design notes
Date: Fri, 22 Apr 2016 11:48:52 +0300
Message-ID: <83a8km58qz.fsf@gnu.org>
References: <20160420194450.GA3457@acm.fritz.box> <8360vb6o7u.fsf@gnu.org> <20160421221943.GE1775@acm.fritz.box>
In-reply-to: <20160421221943.GE1775@acm.fritz.box> (message from Alan Mackenzie on Thu, 21 Apr 2016 22:19:43 +0000)
To: Alan Mackenzie
Cc: emacs-devel@gnu.org, dgutov@yandex.ru

> Date: Thu, 21 Apr 2016 22:19:43 +0000
> Cc: dgutov@yandex.ru, emacs-devel@gnu.org
> From: Alan Mackenzie
>
> > A more subtle issue is with point movements that are not shown to the
> > user (those done by Lisp code of some command, before redisplay kicks
> > in) -- what will be the effect of those?  do they trigger redisplay,
> > for example?
>
> They shouldn't trigger redisplay, no.

But if that code calls sit-for or somesuch, they will, and the result
will be flickering.  But that's not a very important issue.

> > > * - [Island] will be covered by the text property `island', whose value will be
> > >   the pertinent island or island chain (see section (ii)) (not yet
> > >   decided).  Note that if islands are enclosed inside other islands, the
> > >   value is the innermost island.  There is the possibility of using an
> > >   interval tree independent of the one for text properties to increase
> > >   performance.
>
> > I don't understand the notion of "enclosed" islands: wouldn't such
> > "enclosing" simply break the "outer" island into two separate islands?
>
> If we mark island start and end with the syntax-table text properties
> "{" and "}", we're going to have something like
>
>     { a{ }b }
>
> Simply to break the outer island into two pieces, we'd really need to
> apply delimiters at a and b, giving:
>
>     { }{ }{ }
>
> This would overwrite the previous syntaxes at a and b, and this might
> be a Bad Thing.

We could design the stuff so that Bad Things won't happen.  I consider
this nesting of islands a (possibly unnecessary) complication that we
shouldn't accept unless we have a very good reason.  Nesting
immediately requires a plethora of operations that are otherwise not
necessary.

> > > o - `scan-lists', `scan-sexps', etc. will treat a "foreign" island as
> > >   whitespace, much as they do comments.  They will also treat as whitespace
> > >   the gap between two islands in a chain.
>
> > Why whitespace?  why not some new category?  By overloading whitespace,
> > you make things harder on the underlying infrastructure, like regexp
> > search and matching.
>
> I think it's clear that the "foreign" island's syntax has no interaction
> with the current island.

This does not contradict what I suggested.  The new category could be
treated the same as whitespace, in its effect on syntax-related
issues.  By contrast, having the whitespace regexp class be
indistinguishable from an island probably means complications on a
very low level of matching regular expressions and syntax constructs,
something that I fear will get in the way.

> If we treat it as whitespace, that should minimise the amount of
> adapting we need to do to existing major modes.

We need to consider the amount of adaptation in the low-level
infrastructure code as well, not only on the application level.

> I envisage that a regexp element will match the "foreign" island if that
> element would match a space.  I know this sounds horrible, but I haven't
> come up with a scenario where this wouldn't work well.

And I say this is a bomb waiting to go off.
It is relatively easy to add a new regexp construct for an island
(e.g., we already support categories in regexps, so just defining a
category is one easy way), and treat that as whitespace, while still
keeping our options open to make it behave slightly differently if
needed, and still allowing the applications to specify one, but not
the other.  By contrast, if we decide that whitespace matches an
island, we are opening a giant can of worms.  Here's one worm out of
that can: some low-level operations need to search the buffer using
regexps disregarding any narrowing -- what you suggest means these
operations cannot safely use whitespace in their regexps.  This is
something to stay away from, IMO.

> > Extending [:space:] that way seems to be an implementation detail
> > leaking to user level.  I think we should avoid that at all costs.
>
> Why?  I don't understand your last paragraph.

See above.  [:space:] is something used a lot in Lisp applications, so
we leak the implementation of islands to that level: from now on, each
Lisp application will need to consider the possibility that searching
for [:space:] will find an island, something that might have no
relation to whitespace.

> > I'm not sure I understand the details.  E.g., where will the
> > island-chain local values be stored?
>
> In a C struct chain, analogous to struct buffer, using much the same
> mechanisms.

What object(s) will that chain be rooted at?  And how will it be
related to its buffer?

> > To remind you, buffer-local variables have a special object in their
> > symbol value cell, and BVAR only works for the few buffer-local
> > variables that are stored in the buffer object itself.  I'm not sure I
> > understand how CVAR could solve the problem you need to solve, which
> > is keeping multiple chains per buffer, each one with its values of
> > these variables.
>
> CVAR would get the current chain from the `island' (or `chain') text
> property at the position.

If it is stored in the text property, then you will have to decide
what happens when text is copied and yanked elsewhere.

> If this is nil, it would do what BVAR does.

Once again, BVAR only handles variables that are part of the buffer
object itself.  The other buffer-local variables (which are the
majority) are handled as part of switching the buffer, and the C code
simply refers to them by name.  So BVAR is not necessarily the correct
model for what you are designing.

> Otherwise it would access the appropriate named element in the struct
> chain.  I think CVAR would take three parameters: the variable name, the
> buffer, and the buffer position.

Can you show pseudo-code for CVAR?  I'm afraid I'm missing something
here, because I don't see clearly what you have in mind.

> Other chain local variables would be accessed through an alist in the
> struct chain holding miscellaneous variables, exactly as is done for
> the other buffer local variables in struct buffer.

There's no such alist in how we access buffer-local variables, not
AFAIK.  Again, I must be missing something here.

> > This actually sounds like a simple extension of narrowing, so I wonder
> > why do we need so many new object types and notions.
>
> I think it's more like a complicated extension of narrowing.  :-)

It's simple because instead of one region you have more than one, and
the user-level commands don't affect them.  All the other changes are
an exact reproduction of what narrowing does.

> I think that chain local variables are essential to multiple major
> modes - you can't have m.m.m. without some sort of chain locality.

What is "chain locality"?

> I also think that for a major mode to work transparently over
> several chained islands, all the irrelevant stuff between the
> islands needs to be made, er, transparent.

Yes, but how is that related to my comment about extending narrowing?

> > I don't see any discussion of how redisplay will deal with islands.
> > To remind you, redisplay moves through portions of the buffer, without
> > moving point, and access buffer-local variables for its job.  You need
> > to augment the design with something that will allow redisplay see the
> > correct values of variables depending on the buffer position it is at.
> > The same problem exists for any features that use display simulation
> > for making decisions about movement and layout, e.g. vertical-motion.
>
> I think redisplay is mostly controlled by variables (such as
> `scroll-margin') accessed by BVAR.  These calls could be replaced by
> CVAR.

That's not the whole story; once again, you forget about buffer-local
variables that are not part of the buffer object; BVAR is not used for
those.  I gave an example of one such variable: face-remapping-alist,
and I selected that variable for a reason.  Here's how the display
engine refers to it in the current codebase:

      base_face_id = it->string_from_prefix_prop_p
        ? (!NILP (Vface_remapping_alist)
           ? lookup_basic_face (it->f, DEFAULT_FACE_ID)
           : DEFAULT_FACE_ID)
        : underlying_face_id (it);

Another example (which I also mentioned) is standard-display-table:

      /* Use the standard display table for displaying strings.  */
      if (DISP_TABLE_P (Vstandard_display_table))
        it->dp = XCHAR_TABLE (Vstandard_display_table);

See?  No BVAR anywhere in sight.

> Problems will arise if redisplay reads the variable once, and
> fails to read it again when its current position moves into or out of an
> island.  Redisplay would have to be aware of island boundaries, and
> re-read the controlling variables on passing a boundary.  Other than
> that, I can't see any big problems.  Not yet, anyway.

To remind you, the display engine works by examining characters from
the buffer text one by one.  Are you saying that it will have, for
each character it examines, to look up the island chain for possible
changes?  That would make it abysmally slow, I think.
IOW, part of your design needs to provide some efficient means for
redisplay to "be aware of island boundaries, and re-read the
controlling variables on passing a boundary".

There's one more complication, which is related to redisplay, but not
only to it.  You write:

> (ix) Miscellaneous commands and functions.
> o - `point-min' and `point-max' will, when `in-islands' is non-nil, return
>   the max/min point in the visible region in the same chain of islands as
>   point.
> o - `search-\(forward\|backward\)\(-regexp\)?' will restrict themselves to
>   the current island chain when `in-islands' is non-nil.
> o - `skip-\(chars\|syntax\)-\(forward\|backward\)' will likewise operate in
>   the current island chain (how?) when `in-islands' is non-nil.
> o - `\(next\|previous\)-\(single\|char\)-property-change', etc., will do the
>   Right Thing in island chains when `in-islands' is non-nil.
> o - New functions `island-min', `island-max', `island-chain-min' and
>   `island-chain-max' will do what their names say.
> o - There will be no restrictions on the use of widening/narrowing, as have
>   been proposed for other support engines for multiple major modes.
> o - New commands like `beginning-of-island', `narrow-to-island', etc. will
>   be wanted.  More difficultly, bindings for them will be needed.

Something bothers me there.  What will "M-<" and "M->" do, if
point-min and point-max are limited to the current island?  Likewise
the search commands -- they cannot be limited to the current island,
unless the user explicitly says so (and personally, I don't envision
users asking to be so limited).  There's a dichotomy here, between the
underlying C-level variables that currently are set to the limits of
the narrowed region, and affect all user commands and internal
operations (e.g., the display engine never looks beyond these limits);
and the multi-mode functionality that needs to narrow the view even
more.
If you propagate the island-level limitations too deep, they
will affect user commands and features (like display) that have
nothing to do with the reason for which islands are being designed.
E.g., a naïve replacement of C macros BEGV and ZV with something that
returns the beginning and end of the current island will cause the
display to show only the current island, as if you narrowed the buffer
to that island.  I'm sure that's not what we want.
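
To make the dichotomy concrete, here is a minimal, self-contained
sketch in plain C.  It is not Emacs code, and every name in it
(toy_buffer, island_at, display_min, mode_min, and so on) is
hypothetical, invented only for illustration.  It assumes islands are
sorted, non-overlapping, half-open ranges, and it simplifies an island
chain to a single island.  The point it tries to show: the limits used
by display and user commands stay tied to narrowing alone, while the
tighter island limits are consulted only by code that explicitly asks
for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; none of these exist in Emacs.  */
typedef ptrdiff_t pos_t;

struct island
{
  pos_t beg, end;               /* Half-open range [beg, end).  */
};

struct toy_buffer
{
  pos_t begv, zv;               /* Accessible region set by narrowing.  */
  const struct island *islands; /* Sorted, non-overlapping islands.  */
  size_t n_islands;
};

/* Return the island containing POS, or NULL if POS is in no island.  */
const struct island *
island_at (const struct toy_buffer *b, pos_t pos)
{
  assert (b->begv <= b->zv);
  for (size_t i = 0; i < b->n_islands; i++)
    if (pos >= b->islands[i].beg && pos < b->islands[i].end)
      return &b->islands[i];
  return NULL;
}

/* Limits for display and user-level commands: narrowing only.
   Islands never influence these.  */
pos_t
display_min (const struct toy_buffer *b)
{
  return b->begv;
}

pos_t
display_max (const struct toy_buffer *b)
{
  return b->zv;
}

/* Limits for mode-level code.  When IN_ISLANDS is true and POS lies in
   an island, return that island's start clipped to the narrowing;
   otherwise fall back to the narrowing limit.  */
pos_t
mode_min (const struct toy_buffer *b, pos_t pos, bool in_islands)
{
  const struct island *is = in_islands ? island_at (b, pos) : NULL;
  if (!is)
    return b->begv;
  return is->beg > b->begv ? is->beg : b->begv;
}

/* Likewise for the upper limit.  */
pos_t
mode_max (const struct toy_buffer *b, pos_t pos, bool in_islands)
{
  const struct island *is = in_islands ? island_at (b, pos) : NULL;
  if (!is)
    return b->zv;
  return is->end < b->zv ? is->end : b->zv;
}
```

With islands [10, 20) and [30, 40) in a buffer narrowed to [5, 35), the
display limits stay 5 and 35 wherever point is, whereas mode-level code
at position 12 with the flag set sees 10 and 20.  Whether the real
design should clip island limits to the narrowing this way is an
assumption of the sketch, not something the proposal states.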