From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: Syntax ambiguities in narrowed buffers and multiple major modes: a proposed solution.
Date: Tue, 28 Feb 2017 18:58:34 +0000
Message-ID: <20170228185834.GA2248@acm>
References: <20170225135355.GA2592@acm> <20170225212236.GD2592@acm>
 <20170226120656.GA3811@acm> <20170226163724.GD3811@acm>
 <20170227190558.GA2921@acm>
To: Stefan Monnier
Cc: emacs-devel@gnu.org

Hello, Stefan.

On Mon, Feb 27, 2017 at 15:52:19 -0500, Stefan Monnier wrote:

[ .... ]

> All I said was:

>    [ I like to consider that strings and comments are also a form of
>    "island", although we're probably better off supporting them in
>    a special way like we do now. ]

> which I wrote mostly so as to give some background in how I think about
> the problem so you can hopefully understand better my position.

> You then started to argue that it's wrong to think this way.

Sorry, that's not what I meant to say.  I simply wanted, in this thread,
to keep the meaning of "island" unambiguous.  That's the word, nothing
more.  This was to prevent the thread degenerating into confusion
between the concept "island" I have just defined, and the more general
uses and connotations attached to the word.

[ .... ]

> > It [a parse state] is determined by (parse-partial-sexp start end
> > ...).

> Yes, the state depends on where we start parsing.  But there is
> a privileged state which is the one rendered visible via font-lock, and
> that's the one syntax-ppss intends to cache.
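
To make the quoted point concrete, here is a minimal sketch of how the
state at one position depends on where the parse starts.  The function
name my-demo-parse-states, the buffer text and the positions are made up
for illustration; only parse-partial-sexp, syntax-ppss and
emacs-lisp-mode-syntax-table are existing Emacs facilities.

```elisp
;; Illustration only: hypothetical function name, made-up text and positions.
(defun my-demo-parse-states ()
  "Return the string state at one position for three different parses.
Each element is the value of (nth 3 STATE): non-nil inside a string."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    ;; Position 6 is the opening quote; position 10 lies inside the string.
    (insert "(foo \"bar baz\" qux)")
    (let* ((pos 10)
           ;; Parse from the beginning of the buffer.
           (from-bob (parse-partial-sexp (point-min) pos))
           ;; Parse from position 8, which is already inside the string,
           ;; so the parser never sees the opening quote.
           (from-mid (parse-partial-sexp 8 pos))
           ;; syntax-ppss: the state as parsed from the buffer start.
           (cached (syntax-ppss pos)))
      (list (nth 3 from-bob) (nth 3 from-mid) (nth 3 cached)))))
```

Evaluating (my-demo-parse-states) should give a non-nil first and third
element (position 10 is seen as inside the string) and nil for the
second, where the parse started inside the string.
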
That sounds like an intended resolution of the current ambiguity of the
starting position of syntax-ppss's cache.  Or am I reading too much into
the sentence?

[ .... ]

> > Here's that code fragment again, for reference:

> >    (save-restriction
> >      (narrow-to-region ...)
> >      (with-syntax-table ...
> >        (backward-sexp 1)))

[ .... ]

> > That code appears to be from .../lisp/obsolete/complete.el, function
> > PC-lisp-complete-symbol.

> If so, it's a complete accident.  The fragment came straight out of
> my imagination.  The situations I have in mind are more like in
> perl-mode's syntax-propertize function where we need to find the
> matching braces in regexp operations (where the matching rules are
> slightly different from the ones in normal code) or in sgml-mode where
> we jump from < to > and vice-versa using a specialized syntax-table, or
> in sm-c-mode where I parse the C code within a CPP directive (itself
> treated from the outside as a kind of comment).

OK.  But any time the current syntax-table is changed, the cache becomes
invalid.  For such operations, there really needs to be a means of
isolating the cache from the syntactic operations, and vice versa.

> > :-)  There will be situations where things like backward-sexp will call
> > back_comment (which is why it is important that back_comment be fast)
> > but that code fragment isn't one of them.  And even if it did (which
> > will be rare), it is not doing it inside a tight loop.

> I'm saying that the code fragment can be inside a tight loop (e.g. as
> part of a backward lexer used for indentation purposes).

OK, accepted.

> > I think you've chosen a bad example for making that point.  The
> > syntax-ppss cache in the above code will need flushing anyway due to the
> > with-syntax-table.  The comment-cache cache might or might not need
> > flushing for that reason (depending on the differences between the two
> > syntax tables).

> Alan, I'm not trying to pimp syntax-ppss here or to put down
> comment-cache, here.
> I'm just pointing out a real existing problem
> which I think the new design should take into account.  Indeed the
> current syntax-ppss treatment is incorrect (and in sm-c-mode
> I work around it by meddling in syntax-ppss's internals, which is
> clearly a bad idea).

Also accepted, thanks.

> IOW, instead of trying to come up with ad-hoc ways to treat
> narrow-to-region as something that places island markers, which are then
> somehow removed by something else (presumably at the end of
> save-restriction, tho that makes for ugly semantics, IMO) and then
> additionally handle the with-syntax-table thingy, I think we should
> design a new macro specifically for that kind of "temporarily work on
> a region as if it was its own buffer, with its own syntax"
> (i.e. combining the kind of effect usually obtained with
> narrow-to-region+with-syntax-table+save-restriction).

> Then we can implement it by adding island markers (and flush the cache)
> if we want, and if that proves inefficient later on, we can change it to
> use another implementation strategy.

The prime motivator for islands is syntactically to mark the various
regions of a multi-major-mode buffer which have their own syntax tables,
and to do this in a manner which doesn't impose any restrictions or
complications on programs' or users' use of narrowing.

It is evident that that same mechanism could be useful for marking a
"neutral point" for syntactic analysis.  Up to now I've concentrated on
the pure mechanism without much consideration of the interface with the
rest of Lisp, though that is clearly important, too.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).
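
For reference, the interface of the macro proposed in the quoted text
might look roughly like the sketch below.  The names
my-with-isolated-region and my-demo-backward-over-angle are hypothetical;
nothing in Emacs is claimed to exist under them.  The sketch only
combines save-restriction, narrow-to-region and with-syntax-table; it
does not place island markers and does nothing to isolate the
syntax-ppss cache, which is exactly the part still under discussion.

```elisp
;; Hypothetical macro: interface sketch only, no island markers and
;; no handling of the syntax-ppss cache.
(defmacro my-with-isolated-region (beg end table &rest body)
  "Run BODY with the buffer narrowed to BEG..END and TABLE as syntax table.
The previous restriction and syntax table are restored afterwards."
  (declare (indent 3) (debug t))
  `(save-restriction
     (narrow-to-region ,beg ,end)
     (with-syntax-table ,table
       ,@body)))

;; Hypothetical usage: treat < and > as a paren pair inside a small region.
(defun my-demo-backward-over-angle ()
  "Return the position reached by `backward-sexp' over \"<a b>\"."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?< "(>" table)
    (modify-syntax-entry ?> ")<" table)
    (with-temp-buffer
      ;; "<" is at position 3, the position just after ">" is 8.
      (insert "x <a b> y")
      (my-with-isolated-region 3 8 table
        (goto-char (point-max))
        (backward-sexp 1))
      (point))))
```

With such an interface in place, the implementation behind it (island
markers, cache flushing, or something else) could be changed later
without touching the callers.
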