From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: A vision for multiple major modes: some design notes
Date: Sat, 23 Apr 2016 10:39:55 +0300
Message-ID: <83d1pg6aes.fsf@gnu.org>
To: Alan Mackenzie
Cc: dgutov@yandex.ru, emacs-devel@gnu.org
In-reply-to: <20160422223507.GD1873@acm.fritz.box> (message from Alan Mackenzie on Fri, 22 Apr 2016 22:35:08 +0000)

> Date: Fri, 22 Apr 2016 22:35:08 +0000
> Cc: emacs-devel@gnu.org, dgutov@yandex.ru
> From: Alan Mackenzie
>
> > > > Why whitespace? why not some new category? By overloading whitespace,
> > > > you make things harder on the underlying infrastructure, like regexp
> > > > search and matching.
>
> > > I think it's clear that the "foreign" island's syntax has no interaction
> > > with the current island.
>
> > This is not a contradiction to what I suggested. The new category
> > could be treated the same as whitespace, in its effect on
> > syntax-related issues. By contrast, having whitespace regexp class be
> > indistinguishable from an island probably means complications on a
> > very low level of matching regular expressions and syntax constructs,
> > something that I fear will get in the way.
>
> > > If we treat it as whitespace, that should minimise the amount of
> > > adapting we need to do to existing major modes.
>
> > We need to consider the amount of adaptations in the low-level
> > infrastructure code as well, not only on the application level.
>
> I think the adaptations to the regexp engine would be far less work than
> adapting many thousands of regexps in major modes we want to use as
> sub-modes.
> For example there are 115 occurrences in CC Mode of just the
> exact string "[ \t".

Please let's not forget that regexps are used in many places that have
no relation whatsoever to major modes, and searching for whitespace is
a very common operation using regular expressions. Infecting all
those with this new meaning of whitespace, which is totally alien to any
code that doesn't deal with major modes, is IMO plain wrong.

More generally, I think we should first and foremost make it our goal to
have a clean and reasonably simple design, and only care about the
amount of changes in major mode code as a secondary goal. Thinking
about the changes in major modes first could easily lead us astray.

> Bear in mind that this matching of an island by a whitespace regexp
> element would happen ONLY whilst `in-islands' was bound to non-nil, i.e.
> when a major mode is working in its own island chain.

I understand, but I don't think this goes far enough to address my
concerns. And my suggestion to have a separate class/category will
serve your needs just as well, so I'm unsure why we need to piggyback
on [:space:].

> Are there any circumstances in which we would not want the major
> mode to see the gap between its islands as WS?

Who says that every major mode necessarily treats whitespace as you
assume? Most (or even all) of those you know about might, but this is
not written anywhere as a limitation of a major mode. By hard-wiring
this special meaning of [:space:] into your design, you are limiting
future (and possibly some rare extant) major modes.

> When `in-islands' is nil (i.e. when the super mode's code is
> running, or the user is typing commands) the islands would NOT match
> a WS regexp.

Are you sure that none of the background processing will ever need to
treat islands as such? I'm talking about stuff like timers, process
filters and sentinels, hook functions run by redisplay and the command
loop, etc.
If any of these might need to observe the island rules and
restrictions, the design which builds on in-islands being bound to
non-nil _only_ when the major mode is running its own code is
unreliable, and will cause unrelated code to find itself dealing with
island peculiarities. E.g., JIT font-lock runs off an idle timer, but
clearly needs to observe islands, so it sounds like the problem I'm
worried about is pretty much in our faces.

> > By contrast, if we decide that whitespace matches an island, we are
> > opening a giant can of worms. Here's one worm out of that can: some
> > low-level operations need to search the buffer using regexps
> > disregarding any narrowing -- what you suggest means these operations
> > cannot safely use whitespace in their regexps. This is something to
> > stay away from, IMO.
>
> It depends on whether these low level operations are working within an
> island chain (`in-islands' non-nil) or on the buffer as a whole
> (`in-islands' nil). I think such operations would typically be run with
> `in-islands' nil, hence would not run up against these problems.

"Typically" is not good enough, IMO. We must convince ourselves that
this happens _always_, and there will _never_ be a reasonably
justifiable need to search the entire buffer for whitespace when
in-islands is non-nil, i.e. in any of the code that is running as a
side-effect of performing some major-mode related operation.

> > > CVAR would get the current chain from the `island' (or `chain') text
> > > property at the position.
>
> > If it is stored in the text property, then you will have to decide
> > what happens when text is copied and yanked elsewhere.
>
> It would be the job of the `island-after-change-function' to strip the
> unwanted text properties (both the `island' and `syntax-table' ones) and
> to apply any needed new ones to the yanked region.

The problem is the decision whether they are unwanted or not.
It's usually not simple to make that decision for text properties that
change the way text is displayed, when surrounding text also affects
that.

> > > Otherwise it would access the appropriate named element in the struct
> > > chain. I think CVAR would take three parameters: the variable name, the
> > > buffer, and the buffer position.
>
> > Can you show a pseudo-code of CVAR? I'm afraid I'm missing something
> > here, because I don't see clearly what you have in mind.
>
> I'll try. Something like this:
>
>   #define CVAR(var, buf, position)                         \
>       chain = read_text_property (Qisland, buf, position), \
>       chain ? chain.var                                    \
>             : BVAR (var, buf)
>
> , but I don't think that would be a valid Lvalue in C. :-(

Didn't you talk about some alist to look up? I see no alist look up
in this pseudo-code. And 'chain.var' sounds wrong, since 'chain' is
definitely a Lisp object, not a C struct. Or maybe I don't understand
what hides behind read_text_property.

> > > Other chain local variables would be accessed through an alist in the
> > > struct chain holding miscellaneous variables, exactly as is done for
> > > the other buffer local variables in struct buffer.
>
> > There's no such alist in how we access buffer-local variables, not
> > AFAIK. Again, I must be missing something here.
>
> Or, maybe I am. I thought that the slot `local_var_alist_' in the struct
> buffer held the bindings of all the non-BVAR local variables, as an
> alist.

Ah, you were talking about local_var_alist_... OK, but then I don't
see anything like that in CVAR above.

> I'm not at all clear on when and how buffer local variable
> bindings get swapped in and out of, say, C variables like Vfoo.

This happens when we switch buffers, see set_buffer_internal_1. But
that function is driven by an explicit event of switching buffers,
while in your design you need to do something similar when point
crosses some buffer position, which is a much more subtle event.
E.g., think about all the save-excursion and save-restriction code out
there.

> > > > This actually sounds like a simple extension of narrowing, so I wonder
> > > > why do we need so many new object types and notions.
>
> > > I think it's more like a complicated extension of narrowing. :-)
>
> > It's simple because instead of one region you have more than one, and
> > the user-level commands don't affect them. All the other changes are
> > exact reproduction of what narrowing does.
>
> > > I think that chain local variables are essential to multiple major
> > > modes - you can't have m.m.m. without some sort of chain locality.
>
> > What is "chain locality"?
>
> Having things (variables) which are local to a chain, as opposed to
> global variables or buffer local variables or frame local variables.

OK, but no one said that applying a restriction and making
island-specific bindings of variables must be parts of the same
feature. They could be 2 separate features instead.

> >       base_face_id = it->string_from_prefix_prop_p
> >         ? (!NILP (Vface_remapping_alist)
> >            ? lookup_basic_face (it->f, DEFAULT_FACE_ID)
> >            : DEFAULT_FACE_ID)
> >         : underlying_face_id (it);
>
> > Another example (which I also mentioned) is standard-display-table:
>
> >   /* Use the standard display table for displaying strings. */
> >   if (DISP_TABLE_P (Vstandard_display_table))
> >     it->dp = XCHAR_TABLE (Vstandard_display_table);
>
> > See? no BVAR anywhere in sight.
>
> OK. But `face-remapping-alist' can definitely be made buffer local, and
> `standard-display-table' most probably can.

They both are.

> There will be some mechanism (which I don't currently understand) by
> which buffer local values are swapped into and out of
> Vface_remapping_alist when the current buffer changes.

See above: that mechanism is part of the function that switches to
another buffer.

> Surely a similar mechanism could be created for when the current
> island changes.
The issue is to make it as cheap as possible, because redisplay code
is at liberty to move around the buffer at will, and the location
where it examines buffer text is not directly related to point.

> > Something bothers me there. What will "M-<" and "M->" do, if
> > point-min and point-max are limited to the current island? Likewise
> > the search commands -- they cannot be limited to the current island,
> > unless the user explicitly says so (and personally, I don't envision
> > users to ask to be so limited).
>
> Those restrictions will only apply when `in-islands' is bound to non-nil,
> i.e. when major mode code is running. It will be nil when the user types
> in M-<, hence point will move to the beginning of the (visible region of
> the) buffer.

See above: there might be some situations, like JIT font-lock, where
you will want to have in-islands non-nil while running async code, and
that might make the islands visible to code that is not strictly part
of any major mode, like the infrastructure which invokes these async
parts of Emacs code. So I think you need to consider the effects of
those on more than just major modes.
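
To make the CVAR discussion above more concrete, here is a rough,
self-contained sketch of the lookup order that was described in words:
find the chain covering a buffer position, consult that chain's alist
of local variables, and only then fall back to the buffer's own alist
(the role played by local_var_alist_). This is a toy model and not
Emacs source: struct binding, struct chain, struct island, chain_at and
cvar_lookup are all hypothetical names invented for illustration,
values are plain ints instead of Lisp objects, and the linear scan over
islands merely stands in for reading the `island' text property.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins only; none of these types or functions exist in Emacs.  */
struct binding
{
  const char *name;
  int value;
  struct binding *next;		/* next alist entry, or NULL */
};

struct chain
{
  struct binding *local_vars;	/* hypothetical chain-local alist */
};

struct island
{
  long beg, end;		/* covers positions [beg, end) */
  struct chain *chain;
};

struct buffer
{
  struct binding *local_vars;	/* models the role of local_var_alist_ */
  struct island *islands;
  size_t n_islands;
};

/* Return the alist entry for NAME, or NULL if there is none.  */
static const struct binding *
alist_get (const struct binding *b, const char *name)
{
  for (; b; b = b->next)
    if (strcmp (b->name, name) == 0)
      return b;
  return NULL;
}

/* Stand-in for reading the `island' property at POS: return the
   chain of the island covering POS, or NULL between islands.  */
static struct chain *
chain_at (const struct buffer *buf, long pos)
{
  for (size_t i = 0; i < buf->n_islands; i++)
    if (buf->islands[i].beg <= pos && pos < buf->islands[i].end)
      return buf->islands[i].chain;
  return NULL;
}

/* Look NAME up at POS: chain-local alist first, then the buffer's
   alist.  Return 1 and store the value in *OUT if found, else 0.  */
static int
cvar_lookup (const struct buffer *buf, long pos, const char *name, int *out)
{
  assert (buf && name && out);
  const struct chain *chain = chain_at (buf, pos);
  const struct binding *b = chain ? alist_get (chain->local_vars, name) : NULL;
  if (!b)
    b = alist_get (buf->local_vars, name);
  if (!b)
    return 0;
  *out = b->value;
  return 1;
}
```

With a single island at [10, 20) whose chain binds "tab-width" to 2,
in a buffer that binds it to 8, a lookup at position 15 yields 2 and a
lookup at position 5 yields 8. Note that the sketch returns a value
rather than an Lvalue, and it pays for a position lookup on every
access, so it does not by itself answer either the Lvalue problem or
the cost concern raised above.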