From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: Syntax ambiguities in narrowed buffers and multiple major modes: a proposed solution.
Date: Mon, 27 Feb 2017 15:52:19 -0500
References: <20170225135355.GA2592@acm> <20170225212236.GD2592@acm> <20170226120656.GA3811@acm> <20170226163724.GD3811@acm> <20170227190558.GA2921@acm>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)
To: emacs-devel@gnu.org
Xref: news.gmane.org gmane.emacs.devel:212628

>> In order to design a solution, we need to think about the problem somehow.
> That's all very well, but we could spend weeks (or months) talking about
> the problem without getting anywhere.

You think whichever way you want.  All I said was:

    [ I like to consider that strings and comments are also a form of
      "island", although we're probably better off supporting them in
      a special way like we do now.  ]

which I wrote mostly to give some background on how I think about the
problem, so that you can hopefully understand my position better.
You then started to argue that it's wrong to think this way.

>> >> But the issue is that the syntax beginning in the above example should be
>> >> point-min, not 1.
> Incidentally, a parse state (like a voltage) is always a _difference_
> between two points.

Here I disagree.  When you say "difference" it makes it sound like we
could take two differences and combine them.  That would be great: we
could cache, every N chars, the cumulative difference applied by those
N chars and then efficiently compute the state at any position by only
combining those pre-computed cumulative diffs.
But we can't do that, because the "parse state" is really a *state*.
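
To make the "state, not difference" point concrete, here is a minimal
sketch.  The function name `demo-parse-states' and the sample text are
made up for illustration; only `parse-partial-sexp' and its OLDSTATE
argument are real.  It parses the same stretch of text twice: once
resuming from the state that precedes it, and once in isolation.

```elisp
;; Illustrative sketch only; `demo-parse-states' is a hypothetical helper.
(defun demo-parse-states ()
  "Return (TRUTH RESUMED NAIVE) for the sample text a \"b c\" d.
Each element says whether position 9 is considered inside a string."
  (with-temp-buffer
    ;; Make sure ?\" is a string delimiter, whatever the defaults are.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\" "\"" table)
      (set-syntax-table table))
    (insert "a \"b c\" d")
    (let* ((mid     (parse-partial-sexp 1 5))   ; state inside the string
           (truth   (parse-partial-sexp 1 9))   ; parsed from the start
           ;; Resume at 5, handing over the incoming state as OLDSTATE.
           (resumed (parse-partial-sexp 5 9 nil nil mid))
           ;; Parse 5..9 in isolation, with no incoming state.
           (naive   (parse-partial-sexp 5 9)))
      (list (and (nth 3 truth) t)
            (and (nth 3 resumed) t)
            (and (nth 3 naive) t)))))
```

The isolated parse of 5..9 sees the closing quote as an opening one, so
its result cannot simply be added to whatever was computed for 1..5:
the scan of a chunk depends on the state it starts in.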

> It is determined by (parse-partial-sexp start end ...).

Yes, the state depends on where we start parsing.  But there is
a privileged state which is the one rendered visible via font-lock, and
that's the one syntax-ppss intends to cache.

> It is an error in thinking to think that there is any a priori
> syntax beginning in any buffer situation.

I'm glad we agree about this.

> Here's that code fragment again, for reference:
>   (save-restriction
>     (narrow-to-region ...)
>     (with-syntax-table ...
>       (backward-sexp 1)))
>> It's not really inherent in the code, but it's how it currently behaves
>> (mostly), and in some cases that is what the author wants.  In other
>> cases, the author wants something else.
> That code appears to be from .../lisp/obsolete/complete.el, function
> PC-lisp-complete-symbol.

If so, it's a complete accident.  The fragment came straight out of
my imagination.  The situations I have in mind are more like in
perl-mode's syntax-propertize function where we need to find the
matching braces in regexp operations (where the matching rules are
slightly different from the ones in normal code) or in sgml-mode where
we jump from < to > and vice-versa using a specialized syntax-table, or
in sm-c-mode where I parse the C code within a CPP directive (itself
treated from the outside as a kind of comment).

> :-)  There will be situations where things like backward-sexp will call
> back_comment (which is why it is important that back_comment be fast)
> but that code fragment isn't one of them.  And even if it did (which
> will be rare), it is not doing it inside a tight loop.

I'm saying that the code fragment can be inside a tight loop (e.g. as
part of a backward lexer used for indentation purposes).

> I think you've chosen a bad example for making that point.  The
> syntax-ppss cache in the above code will need flushing anyway due to the
> with-syntax-table.
> The comment-cache cache might or might not need
> flushing for that reason (depending on the differences between the two
> syntax tables).

Alan, I'm not trying to pimp syntax-ppss or to put down comment-cache
here.  I'm just pointing out a real existing problem which I think the
new design should take into account.  Indeed the current syntax-ppss
treatment is incorrect (and in sm-c-mode I work around it by meddling in
syntax-ppss's internals, which is clearly a bad idea).

IOW, instead of trying to come up with ad-hoc ways to treat
narrow-to-region as something that places island markers, which are then
somehow removed by something else (presumably at the end of
save-restriction, tho that makes for ugly semantics, IMO) and then
additionally handle the with-syntax-table thingy, I think we should
design a new macro specifically for that kind of "temporarily work on
a region as if it was its own buffer, with its own syntax"
(i.e. combining the kind of effect usually obtained with
narrow-to-region+with-syntax-table+save-restriction).

Then we can implement it by adding island markers (and flushing the
cache) if we want, and if that proves inefficient later on, we can
change it to use another implementation strategy.


        Stefan
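
A rough sketch of what such a macro could look like follows.  It is
only one possible shape, not an existing Emacs facility: the names
`demo-with-syntax-island' and `demo-island-check' are hypothetical, and
flushing the syntax-ppss cache on entry and exit is an assumed (and
deliberately crude) implementation strategy that island markers or
anything else could replace later.

```elisp
;; Hypothetical sketch: `demo-with-syntax-island' is NOT an existing macro.
(defmacro demo-with-syntax-island (beg end table &rest body)
  "Run BODY with the buffer narrowed to BEG..END, using syntax TABLE.
The `syntax-ppss' cache is flushed on entry and on exit, so states
computed under one view of the buffer are not reused under the other."
  (declare (indent 3) (debug t))
  `(save-restriction
     (narrow-to-region ,beg ,end)
     (syntax-ppss-flush-cache (point-min))
     (unwind-protect
         (with-syntax-table ,table ,@body)
       ;; Still narrowed here: drop whatever BODY cached for the island.
       (syntax-ppss-flush-cache (point-min)))))

(defun demo-island-check ()
  "Return in-string flags seen before, inside, and after the island."
  (with-temp-buffer
    (insert "\"x (y) z\"")
    (list
     ;; Whole buffer: position 9 sits inside the string.
     (and (nth 3 (syntax-ppss 9)) t)
     ;; Island covering only the string contents, parsed on its own.
     (demo-with-syntax-island 2 9 (standard-syntax-table)
       (and (nth 3 (syntax-ppss (point-max))) t))
     ;; Whole buffer again, after the island has been left.
     (and (nth 3 (syntax-ppss 9)) t))))
```

Callers only see the macro, so the flushing could be swapped for
another strategy without touching them.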