From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!.POSTED!not-for-mail From: Alan Mackenzie Newsgroups: gmane.emacs.devel Subject: Re: CC Mode and electric-pair "problem". Date: Mon, 18 Jun 2018 15:42:27 +0000 Message-ID: <20180618154227.GB3973@ACM> References: <20180531123747.GA24752@ACM> <20180617201351.GA4580@ACM> <20180618103654.GA9771@ACM> NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Trace: blaine.gmane.org 1529336833 12649 195.159.176.226 (18 Jun 2018 15:47:13 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Mon, 18 Jun 2018 15:47:13 +0000 (UTC) User-Agent: Mutt/1.9.4 (2018-02-28) Cc: Glenn Morris , Emacs developers , Tino Calancha To: =?iso-8859-1?Q?Jo=E3o_T=E1vora?= Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Mon Jun 18 17:47:08 2018 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1fUwN2-00039v-NW for ged-emacs-devel@m.gmane.org; Mon, 18 Jun 2018 17:47:08 +0200 Original-Received: from localhost ([::1]:35430 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fUwP9-00038Y-T3 for ged-emacs-devel@m.gmane.org; Mon, 18 Jun 2018 11:49:19 -0400 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:34271) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fUwOq-00034y-RT for emacs-devel@gnu.org; Mon, 18 Jun 2018 11:49:02 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fUwOn-0006xl-M7 for emacs-devel@gnu.org; Mon, 18 Jun 2018 11:49:00 -0400 Original-Received: from colin.muc.de ([193.149.48.1]:26754 helo=mail.muc.de) by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1fUwOn-0006xC-9n for emacs-devel@gnu.org; Mon, 18 Jun 2018 11:48:57 -0400 Original-Received: (qmail 11181 invoked by uid 3782); 18 Jun 2018 15:48:55 -0000 Original-Received: from acm.muc.de (p5B146420.dip0.t-ipconnect.de [91.20.100.32]) by colin.muc.de (tmda-ofmipd) with ESMTP; Mon, 18 Jun 2018 17:48:54 +0200 Original-Received: (qmail 4510 invoked by uid 1000); 18 Jun 2018 15:42:27 -0000 Content-Disposition: inline In-Reply-To: X-Delivery-Agent: TMDA/1.1.12 (Macallan) X-Primary-Address: acm@muc.de X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x [fuzzy] X-Received-From: 193.149.48.1 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: "Emacs-devel" Xref: news.gmane.org gmane.emacs.devel:226460 Archived-At: Hello, Joćo. On Mon, Jun 18, 2018 at 14:24:39 +0100, Joćo Tįvora wrote: > Alan Mackenzie writes: > > Yes. But it is the master branch, where not everything can be expected > > to work all the time. I think the main thing is, we're _going_ to fix > > this bug. > Well, I respectfully and totally disagree. The reason we have automated > tests in Hydra is to catch unintentional breakage, not intentional > breakage. This breakage is unintentional, and we're working out how to fix it. > For development temporarily unhampered by tests, I think a separate > branch is a much better alternative. It's a very easy thing to do in > git (and in your case, trivial to merge from and back to master, given > you have near-total control over that area of code). It's possible, but it's a hassle; it's outside of normal workflow, therefore involves getting into git's execrable documentation. > Can't you just revert the commit that broke it? It was three (or maybe four) successive commits. If I revert them, it will postpone indefinitely the bug that they've fixed. > > OK, here goes. Why should major modes tie themselves in knots, just so > > that electric-pair-mode can work? What CC Mode is doing is natural, and > > matches the reality. > I think you mean "mode", in the singular form :-). No. CC Mode comprises lots of modes, not all of them maintained by me. But even aside from that, CC Mode has often been a pioneer, developing new techniques, which the rest of Emacs has then followed. Examples are hungry deletion and electric indentation. > Also, it doesn't "match reality": if you open a line in a string, it > syntax highlights the remaining string as C statements, but the C parser > doesn't see C statements. IOW, newline doesn't *really* terminate a > string in C. We could argue about words like "terminate" indefinitely. What I think is incontrovertible is if you open a line in a string, the portion after that opening is not part of the string opened on the line above. The new fontification reflects this fact. > > electric-pair-mode's chomp facility could be more rigorously coded - > > sometimes it is dealing with visible whitespace, sometimes it is dealing > > with syntactic properties. Surely it should be working with visible > > whitespace all the time? > No. If it did so, it would chomp parenthesis from non-comment regions > into comment regions, for example. But it could use the strategy of determining the end of any comment, then using non-syntax facilities for traversing the space up to that end. Or something like that. > That doesn't make sense, not according to show-paren-mode, for example. > By the way, after your change, very basic commands which fall completely > outside electric-pair-mode have fundamentally changed their behaviour in > cc-mode. Here are a few, out of Emacs -Q: > * Open a line in a string, using C-o. Sexp-navigation is now messed up > in the whole buffer, i.e. C-M-*. Most commads error or produce > surprising result. So even if the intent is to eventually add a > backslash escaping the newline, or make it two adjacent strings by > typing two quotes (something perfectly allowed by C). I've tried this, obviously, but as far as I'm aware, the operation of C-M-* is correct for the (now syntactically incorrect) buffer. If you can give me a concrete example, I can look at it and correct it. > * Inside the string, `forward-sexp' in a parenthesis of a NL-terminated > string now errors where it would previously do its job of jumping to > the closer; It works more or less the same as C-M-n always has from a parenthesis inside a string, which isn't matched in that string. Just that the notion of "inside a string" is now more exact than it used to be. > * Also inside the string, `blink-matching-paren', on by default, also > doesn't work as before: closing a paren on a NL-started string doesn't > match the opener. Do you mean a NL-ENDED string? I see matching here. If you can be more precise about the failure, I can look at it. > There are no automated tests for these things, otherwise you could be > seeing test breakage here too (and, with higher probably, you may be > seeing breakage in user's expectations later on). No, these things are not all intended functionality of Emacs, they're just side effects of the way the real functionality was implemented. > > I've attempted a bit of debugging. In addition to > > electric-pair--skip-whitespace, I encountered a scan-sexps in subfunction > > ended-prematurely-fn of function electric-pair--balance-info, which > > snagged on the end of string at EOL. > I don't understand how this matters to the problem at hand, but > regardless, can you make a bug report demonstrating the presumed bug and > its impact so I can follow up? I attempted to see how difficult it would be to modify elec-pair.el to cope with unconstrained text properties in buffers. This was the second problem I came up against. > > We are talking about a corner case in e-p-m, namely where e-p-m attempts > > to chomp space between parens inside an invalid string. This surely > > won't come up in practice very much. Is it worth fixing? (I would say > > yes.) > Don't forget that the particular piece of e-p-m we're talking about is > one of the ways (arguably the easiest way) to actually fix the specific > C/C++ problem at hand for the user. IOW it's not some random whimsical > useless thing. It's not useless, but it's rare - it's three things happening all at the same time, namely a broken string, pseudo-matching parens and space between them. This isn't going to happen very often. I'd wager that broken strings (two "s with non-escaped NLs between them) in themselves are quite rare. But I still think it should be fixed. :-) > > The user is visually informed of the reality: that one or more > > strings are unterminated, and where the "breakage" is (where the > > font-lock-string-face stops). This is an improvement over the > > previous handling, where the opening invalid " merely got > > warning-face, but the following unterminated string flowed on > > indefinitely. > I suppose that's a "yes". In that case, the face `warning`, which > defaults to a very bright red, would be fine for me personally (and I'm > confident if could be made even more evident). Also, the fact that the > remaining string is now syntax-highlighted as C statements is extremely > confusing. Why? They are now C statements, and would be handled by the compiler as such. Having them fontified as strings (as they previously were) was confusing. > > The disadvantage is that e-p-m is constraining major modes in how they > > can use syntax-table text properties. I think this is a problem in > > electric-pair-mode, not in CC Mode. > Again, AFAIK, "mode", singular. See above. Perhaps it's worth noting that AWK-Mode has used this method of indicating invalid strings for around 15 years, now. There have never been any complaints about this from users. > And, obviously, I'm not going to special-case cc-mode in elec-pair.el: > after doing some of my own mulling, I may open a customization point > for cc-mode.el to use. I think it's a general case, that of having non-neutral syntax-table text properties on visual space characters. What do you see as a customisation option here? > So at the very least, it's going to require some (potentially trivial) > fix in cc-mode.el, for sure. :-) > But now that I've understood the non-e-p-m implications of your change, > I urge to at least make this configurable (if it is already > configurable, then don't mind me). Make correct fontification configurable? To sum up my viewpoint, I regard the way CC Mode now fontifies broken strings as correct (aside from any remaining bugs, of course). I think elec-pair.el's assumption that whitespace always has "neutral" syntax is unwarranted, and is the root of the current bug. There remains the problem of making chomping parens inside a broken string work. I honestly think that modifying elec-pair.el is the way to go, but I'm open to suggestions of alternative strategies that CC Mode could follow to get the same fontification, that wouldn't require modifying elec-pair.el. > Joćo -- Alan Mackenzie (Nuremberg, Germany).