Bug #25608 and the comment-cache branch

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

* Bug #25608 and the comment-cache branch
@ 2017-02-02 20:24 Alan Mackenzie
  2017-02-02 20:47 ` Eli Zaretskii
                   ` (3 more replies)
  0 siblings, 4 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-02 20:24 UTC (permalink / raw)
  To: emacs-devel, Eli Zaretskii

Hello Eli and Emacs.

With bug #25608 (....

/*-----------------------------------------------------------------------------
(c) Copyright notice containing open parentheses
-----------------------------------------------------------------------------*/

/*---------------------------------------------------------------------------*/

....), the last line spuriously indents c-basic-offset columns
rightwards.  The cause of this is the open paren at column zero inside
the comment.

This is just the latest manifestation of this bug (which surely it is)
to hit bug-gnu-emacs and bug-cc-mode.  There will surely be more to come
if we don't fix it.

In the comment-cache branch, the above scenario isn't a bug.  It
analyses and indents correctly in that branch.

I think we are all agreed that Emacs should handle correctly formed
comments in C.  comment-cache does correctly handle comments, and it has
been shown to be essentially no slower than master.

I would like to merge comment-cache into master, finally fixing this bug
once and for all.  What do you say, Eli?

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 20:24 Bug #25608 and the comment-cache branch Alan Mackenzie
@ 2017-02-02 20:47 ` Eli Zaretskii
  2017-02-02 21:51   ` Alan Mackenzie
  2017-02-02 22:14 ` Dmitry Gutov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-02 20:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Date: Thu, 2 Feb 2017 20:24:18 +0000
> From: Alan Mackenzie <acm@muc.de>
> 
> I would like to merge comment-cache into master, finally fixing this bug
> once and for all.  What do you say, Eli?

I say there's too much resistance to doing that from people whose
opinions I respect and trust.  Each time this issue comes up, I see
that resistance being expressed again.

I hope it's possible to find some kind of compromise or a different
solution that leaves people less unhappy.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 20:47 ` Eli Zaretskii
@ 2017-02-02 21:51   ` Alan Mackenzie
  2017-02-02 22:15     ` Dmitry Gutov
  2017-02-03  7:41     ` Eli Zaretskii
  0 siblings, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-02 21:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Thu, Feb 02, 2017 at 22:47:24 +0200, Eli Zaretskii wrote:
> > Date: Thu, 2 Feb 2017 20:24:18 +0000
> > From: Alan Mackenzie <acm@muc.de>

> > I would like to merge comment-cache into master, finally fixing this bug
> > once and for all.  What do you say, Eli?

> I say there's too much resistance to doing that from people whose
> opinions I respect and trust.  Each time this issue comes up, I see
> that resistance being expressed again.

Primarily from Stefan.  The issue was discussed just a few weeks ago,
and the resistance expressed was philosophical rather than practical:
for example, it would be nice if the solution was less complicated, or
it would be nice if it also cached the rest of the syntax mechanism.

That criticism did not identify concrete difficulties which
comment-cache might cause.  (I do not deny there might be such
difficulties, but they can surely be fixed, whatever they turn out to
be.)  There was no argument with comment-cache's algorithms, no
non-vague suggestions as to how they might be improved.  In fact the
only technical part of the discussion concerned comment-cache's speed.

The identified resistance was expressed in a form which didn't give me
feedback as to how to make improvements.

> I hope it's possible to find some kind of compromise or a different
> solution that leaves people less unhappy.

Compromise with what?  There is no alternative solution on the table at
the moment.  I would really love to understand what, in concrete terms,
the objections to comment-cache are.

And in the meantime, it's me that has to keep fielding all these paren
in column zero bugs, and some of them (like Paul's bug from last March)
require strenuous debugging.  It's me that has to keep apologising for
this deficiency in Emacs to those raising the bugs.

None of this is fun.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 20:24 Bug #25608 and the comment-cache branch Alan Mackenzie
  2017-02-02 20:47 ` Eli Zaretskii
@ 2017-02-02 22:14 ` Dmitry Gutov
  2017-02-03 16:44   ` Alan Mackenzie
  2017-02-02 23:57 ` Stefan Monnier
  2017-02-03  7:49 ` Yuri Khan
  3 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-02 22:14 UTC (permalink / raw)
  To: Alan Mackenzie, emacs-devel, Eli Zaretskii

On 02.02.2017 22:24, Alan Mackenzie wrote:

> I think we are all agreed that Emacs should handle correctly formed
> comments in C.  comment-cache does correctly handle comments, and it has
> been shown to be essentially no slower than master.

Alan, you seem to have abandoned the previous discussion. Why don't we 
finish it first?

You have been asked for some extra measurements, including the ones 
using the alternative patch. I still haven't seen those yet.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 21:51   ` Alan Mackenzie
@ 2017-02-02 22:15     ` Dmitry Gutov
  2017-02-03  7:41     ` Eli Zaretskii
  1 sibling, 0 replies; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-02 22:15 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: emacs-devel

On 02.02.2017 23:51, Alan Mackenzie wrote:

> Compromise with what?  There is no alternative solution on the table at
> the moment.

Yes, there is.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 20:24 Bug #25608 and the comment-cache branch Alan Mackenzie
  2017-02-02 20:47 ` Eli Zaretskii
  2017-02-02 22:14 ` Dmitry Gutov
@ 2017-02-02 23:57 ` Stefan Monnier
  2017-02-03 16:19   ` Alan Mackenzie
  2017-02-03  7:49 ` Yuri Khan
  3 siblings, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-02 23:57 UTC (permalink / raw)
  To: emacs-devel

> ....), the last line spuriously indents c-basic-offset columns
> rightwards.  The cause of this is the open paren at column zero inside
> the comment.

I think it's important to remember that this problem dates back to
Emacs-17 or so, so it's not super urgent to install a quick fix.


        Stefan




^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 21:51   ` Alan Mackenzie
  2017-02-02 22:15     ` Dmitry Gutov
@ 2017-02-03  7:41     ` Eli Zaretskii
  2017-02-03 17:29       ` Alan Mackenzie
  2017-02-05 22:00       ` Alan Mackenzie
  1 sibling, 2 replies; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-03  7:41 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Date: Thu, 2 Feb 2017 21:51:54 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
> > I say there's too much resistance to doing that from people whose
> > opinions I respect and trust.  Each time this issue comes up, I see
> > that resistance being expressed again.
> 
> Primarily from Stefan.

Not only Stefan.  Also Dmitry.

> > I hope it's possible to find some kind of compromise or a different
> > solution that leaves people less unhappy.
> 
> Compromise with what?

With the objections, ideas, and suggestions expressed in those
discussions.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 20:24 Bug #25608 and the comment-cache branch Alan Mackenzie
                   ` (2 preceding siblings ...)
  2017-02-02 23:57 ` Stefan Monnier
@ 2017-02-03  7:49 ` Yuri Khan
  2017-02-03 18:30   ` Andreas Röhler
  3 siblings, 1 reply; 75+ messages in thread
From: Yuri Khan @ 2017-02-03  7:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, Emacs developers

On Fri, Feb 3, 2017 at 3:24 AM, Alan Mackenzie <acm@muc.de> wrote:

> With bug #25608 (....
>
> /*-----------------------------------------------------------------------------
> (c) Copyright notice containing open parentheses
> -----------------------------------------------------------------------------*/
[…]
> ....), the last line spuriously indents c-basic-offset columns
> rightwards.  The cause of this is the open paren at column zero inside
> the comment.

This problem would be obviated by using the proper © copyright sign in
the comment.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 23:57 ` Stefan Monnier
@ 2017-02-03 16:19   ` Alan Mackenzie
  2017-02-04  9:06     ` Andreas Röhler
  2017-02-04 18:18     ` Stefan Monnier
  0 siblings, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-03 16:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Feb 02, 2017 at 18:57:52 -0500, Stefan Monnier wrote:
> > ....), the last line spuriously indents c-basic-offset columns
> > rightwards.  The cause of this is the open paren at column zero inside
> > the comment.

> I think it's important to remember that this problem dates back to
> Emacs-17 or so, so it's not super urgent to install a quick fix.

There's no need to be so disparaging.  comment-cache is NOT in any sense
a "quick fix".  It's precisely the opposite.  It's a rigorous rewrite of
back_comment which eliminates "quick fixes", for example
open-paren-in-column-0-is-defun-start.  I think you know this.

And given how long this problem's been around for, it's high time it was
finally fixed.  It's an embarrassment, and it causes pain, repeatedly
and predictably.

Instead, why don't you criticise comment-cache in a constructive
fashion?  Such as by pointing out potential problems it might cause.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-02 22:14 ` Dmitry Gutov
@ 2017-02-03 16:44   ` Alan Mackenzie
  2017-02-03 21:53     ` Dmitry Gutov
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-03 16:44 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Fri, Feb 03, 2017 at 00:14:12 +0200, Dmitry Gutov wrote:
> On 02.02.2017 22:24, Alan Mackenzie wrote:

> > I think we are all agreed that Emacs should handle correctly formed
> > comments in C.  comment-cache does correctly handle comments, and it has
> > been shown to be essentially no slower than master.

> Alan, you seem to have abandoned the previous discussion. Why don't we 
> finish it first?

> You have been asked for some extra measurements, including the ones 
> using the alternative patch.

Perhaps, for clarity's sake, you could post this alternative patch here,
or if it's big, put it into a scratch branch.  Then, at least we'll all
know that we're talking about the same thing.

> I still haven't seen those yet.

I'm not sure what you want them for.  The "alternative patch" didn't
scan comments correctly all the time when I looked at it, just as the
current back_comment doesn't.  But, post the patch, remind me precisely
what you want tested, and I'll do it.

Constructive criticism of comment-cache would be most welcome.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03  7:41     ` Eli Zaretskii
@ 2017-02-03 17:29       ` Alan Mackenzie
  2017-02-03 22:08         ` Dmitry Gutov
  2017-02-05 22:00       ` Alan Mackenzie
  1 sibling, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-03 17:29 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

On Fri, Feb 03, 2017 at 09:41:23 +0200, Eli Zaretskii wrote:
> > Date: Thu, 2 Feb 2017 21:51:54 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > > I say there's too much resistance to doing that from people whose
> > > opinions I respect and trust.  Each time this issue comes up, I see
> > > that resistance being expressed again.

> > Primarily from Stefan.

> Not only Stefan.  Also Dmitry.

I would hope that the substance of these objections would carry more
weight than their authorship.

> > > I hope it's possible to find some kind of compromise or a different
> > > solution that leaves people less unhappy.

> > Compromise with what?

> With the objections, ideas, and suggestions expressed in those
> discussions.

With all due respect, none of these objections and ideas leave room for
compromise.  comment-cache scans comments forwards, the "alternative
patch" Dmitry talks about tries to scan them backwards.  Where is the
scope for compromise?

The objectors do not seem to want compromise - they want comment-cache
to be wholly abandoned.  They object to it for reasons I don't
understand, despite the fact that it elegantly solves a long standing
problem that continues to cause pain on a frequent basis.

If you (or anybody else) could summarize what these objections are, I'd
be very grateful.

Note that there has been NO constructive criticism of comment-cache.
Nobody is pointing out problems it causes or might cause.  Nobody has
looked at the source code to point out potential difficulties or places
where improvements might be made.  Instead, we have ....

And while this is going on, I'm having to deal with perhaps one or two
of these bugs a year in CC Mode, which are time consuming, demoralising
and embarrassing to have to explain to the OPs.  Emacs should be able to
handle parentheses in comments.

What is the problem with comment-cache?

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03  7:49 ` Yuri Khan
@ 2017-02-03 18:30   ` Andreas Röhler
  0 siblings, 0 replies; 75+ messages in thread
From: Andreas Röhler @ 2017-02-03 18:30 UTC (permalink / raw)
  To: emacs-devel; +Cc: Yuri Khan



On 03.02.2017 08:49, Yuri Khan wrote:
> On Fri, Feb 3, 2017 at 3:24 AM, Alan Mackenzie <acm@muc.de> wrote:
>
>> With bug #25608 (....
>>
>> /*-----------------------------------------------------------------------------
>> (c) Copyright notice containing open parentheses
>> -----------------------------------------------------------------------------*/
> […]
>> ....), the last line spuriously indents c-basic-offset columns
>> rightwards.  The cause of this is the open paren at column zero inside
>> the comment.
> This problem would be obviated by using the proper © copyright sign in
> the comment.
>

Isn't that just an example to illustrate the bug?



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03 16:44   ` Alan Mackenzie
@ 2017-02-03 21:53     ` Dmitry Gutov
  2017-02-04 11:02       ` Alan Mackenzie
  0 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-03 21:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

On 03.02.2017 18:44, Alan Mackenzie wrote:

> Perhaps, for clarity's sake, you could post this alternative patch here,
> or if it's big, put it into a scratch branch.  Then, at least we'll all
> know that we're talking about the same thing.

I've already posted the url. The path is in the comments of the bug 
you're purportedly trying to fix. So here is the message you unlimately 
ignored: http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg01075.html

If the patch is not good enough for some reasons, please post those, 
with specific examples. And I'm sure we can improve it.

> I'm not sure what you want them for.

To see how they compare performance-wise, at least. "syntax-ppss cache 
is slow" was one of the big reasons for introducing the text property 
cache implemented via text properties, written in C, IIRC.

So you should be able to demonstrate this stark difference in performance.

> The "alternative patch" didn't
> scan comments correctly all the time when I looked at it, just as the
> current back_comment doesn't.

Please remind us of the specific problems it has.

> But, post the patch, remind me precisely
> what you want tested,

Can you read the message archive (that I've linked to above), or should 
I copy the past messages here?

> and I'll do it.
> 
> Constructive criticism of comment-cache would be most welcome.

Just look up the previous threads on the subject. Surely you don't 
expect people to rehash the arguments time and time again.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03 17:29       ` Alan Mackenzie
@ 2017-02-03 22:08         ` Dmitry Gutov
  2017-02-04 10:24           ` Alan Mackenzie
  0 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-03 22:08 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: emacs-devel

On 03.02.2017 19:29, Alan Mackenzie wrote:

> The objectors do not seem to want compromise - they want comment-cache
> to be wholly abandoned.

It's silly to seek a compromise between implementations. Rather, we 
should discuss hard requirements (with some test cases).

And then we should seek the simplest solution that satisfies all of our 
requirements.

> They object to it for reasons I don't
> understand, despite the fact that it elegantly solves a long standing
> problem that continues to cause pain on a frequent basis.

Elegance is in the eye of the beholder. It certainly doesn't seem 
elegant to me, design-wise.

> If you (or anybody else) could summarize what these objections are, I'd
> be very grateful.

"It introduces a second source of truth" seems like a concise summary.

At best, it'll use more memory than it has to. At worst, we risk 
divergence in the information contained in those sources (so functions 
depending on one or the other will behave in incompatible fashion). That 
means nasty bugs that aren't easy to track down.

> Note that there has been NO constructive criticism of comment-cache.

That's insulting, Alan.

> Nobody is pointing out problems it causes or might cause.

And that's false.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03 16:19   ` Alan Mackenzie
@ 2017-02-04  9:06     ` Andreas Röhler
  2017-02-04 18:18     ` Stefan Monnier
  1 sibling, 0 replies; 75+ messages in thread
From: Andreas Röhler @ 2017-02-04  9:06 UTC (permalink / raw)
  To: emacs-devel; +Cc: Alan Mackenzie



On 03.02.2017 17:19, Alan Mackenzie wrote:
> Hello, Stefan.
>
> On Thu, Feb 02, 2017 at 18:57:52 -0500, Stefan Monnier wrote:
>>> ....), the last line spuriously indents c-basic-offset columns
>>> rightwards.  The cause of this is the open paren at column zero inside
>>> the comment.
>> I think it's important to remember that this problem dates back to
>> Emacs-17 or so, so it's not super urgent to install a quick fix.
> There's no need to be so disparaging.  comment-cache is NOT in any sense
> a "quick fix".  It's precisely the opposite.  It's a rigorous rewrite of
> back_comment which eliminates "quick fixes", for example
> open-paren-in-column-0-is-defun-start.  I think you know this.
>
> And given how long this problem's been around for, it's high time it was
> finally fixed.  It's an embarrassment, and it causes pain, repeatedly
> and predictably.
>
> Instead, why don't you criticise comment-cache in a constructive
> fashion?  Such as by pointing out potential problems it might cause.
>
>>          Stefan

Hi Alan,

IIUC open-paren-in-column-0-is-defun-start is is the underlying issue.
It was told there was an essay to solve this.
May someone point me at it?

Cheers,
Andreas




^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03 22:08         ` Dmitry Gutov
@ 2017-02-04 10:24           ` Alan Mackenzie
  2017-02-06  2:09             ` Dmitry Gutov
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-04 10:24 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Sat, Feb 04, 2017 at 00:08:41 +0200, Dmitry Gutov wrote:
> On 03.02.2017 19:29, Alan Mackenzie wrote:

> > The objectors do not seem to want compromise - they want comment-cache
> > to be wholly abandoned.

> It's silly to seek a compromise between implementations. Rather, we 
> should discuss hard requirements (with some test cases).

You want comment-cache to be wholly abandoned.

The requirements are simple.  First, and foremost, (forward-comment -1)
must work.  Secondly, it should do so fast, preferably at least as fast
as the current (buggy) implementation.

Here is a test case from months gone by.  Put point-min at the indicated
position, put point at EOL, then do M-: (forward-comment -1).

    char foo[] = "asdf asdf" "asdf"; /* "asdf" */ /*  */  /*   '"'"  */
                      ^

This test works in comment-cache.

> And then we should seek the simplest solution that satisfies all of our 
> requirements.

As simple as possible, but definitely not simpler.  The "solution" you
favour is too simple.  It doesn't work all the time.

> > They object to it for reasons I don't
> > understand, despite the fact that it elegantly solves a long standing
> > problem that continues to cause pain on a frequent basis.

> Elegance is in the eye of the beholder. It certainly doesn't seem 
> elegant to me, design-wise.

> > If you (or anybody else) could summarize what these objections are, I'd
> > be very grateful.

> "It introduces a second source of truth" seems like a concise summary.

So what?  There are any number of "sources of truth" in Emacs.  If one
of them turns out to be a "source of untruth" we call that a bug, and we
fix it.

> At best, it'll use more memory than it has to.

The thing to do here is measure this extra memory.  I did this back in
spring last year, and the number of extra conses used for the cache was
not inordinately high.  Especially not for a 64-bit machine with several
gigabytes of RAM.

> At worst, we risk divergence in the information contained in those
> sources (so functions depending on one or the other will behave in
> incompatible fashion). That means nasty bugs that aren't easy to track
> down.

I think you're seeing something that's not there.  You're picturing some
imagined process where two alternative ways of storing information have
great difficulty staying together, and somehow, over time, are destined
to drift apart.  Sort of like two national currencies trying to stay
pegged to eachother, or something like that.

That's not how computer programs work.  If those two ways end up
differing, we have a bug, which can be fixed like any other bug.  Heck,
even a single "source of truth" can be buggy, with just as severe
consequences.  We get bugs, we fix them.

Note, in this context, that syntax-ppss is broken (bug #22983) and
doesn't look like getting fixed any time soon, yet the world hasn't come
to an end.

> > Note that there has been NO constructive criticism of comment-cache.

> That's insulting, Alan.

It might be, but I think it's true.  You want comment-cache to be wholly
abandoned.  You are not suggesting ways to make it better.  You haven't
tried it, that I'm aware of.  You haven't looked for flaws, with the
intention of getting them fixed.  Instead you are putting forward
reasons, not all of them good, for abandoning comment-cache.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03 21:53     ` Dmitry Gutov
@ 2017-02-04 11:02       ` Alan Mackenzie
  2017-02-06  1:28         ` Dmitry Gutov
  2017-02-06  2:08         ` Stefan Monnier
  0 siblings, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-04 11:02 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Fri, Feb 03, 2017 at 23:53:31 +0200, Dmitry Gutov wrote:
> On 03.02.2017 18:44, Alan Mackenzie wrote:

> > Perhaps, for clarity's sake, you could post this alternative patch here,
> > or if it's big, put it into a scratch branch.  Then, at least we'll all
> > know that we're talking about the same thing.

> I've already posted the url.

You did, indeed.  Apologies.

> The path is in the comments of the bug you're purportedly trying to
> fix. So here is the message you unlimately ignored:
> http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg01075.html

> If the patch is not good enough for some reasons, please post those, 
> with specific examples. And I'm sure we can improve it.

I think it would be useful to post the actual patch here, so it can be
more easily discussed, and to be easier for people who want to try it
out to get to it.

> > I'm not sure what you want them for.

> To see how they compare performance-wise, at least. "syntax-ppss cache 
> is slow" was one of the big reasons for introducing the text property 
> cache implemented via text properties, written in C, IIRC.

syntax-ppss being too slow was its use in a specific circumstance.  That
was trying to use it in place of comment-cache's cache mechanism, but
otherwise using comment-cache.  That would result in ~2 orders of
magnitude slowdown in backward_comment.

> So you should be able to demonstrate this stark difference in performance.

That would involve hacking comment-cache, and as I've said before, would
be a fruitless waste of time.  With syntax-ppss we'd end up having to
scan forward 10,000 characters (on average) with parse-partial-sexp just
to be able to scan back over an 80 character comment.  That's obvious,
and not worth timing.

> > The "alternative patch" didn't scan comments correctly all the time
> > when I looked at it, just as the current back_comment doesn't.

> Please remind us of the specific problems it has.

In the following test case (same as in my other post) the "alternative
patch" doesn't work.  Narrow the buffer with point-min at the indicated
position.  Put point at EOL.  Try M-: (forward-comment -1).  This fails.

    char foo[] = "asdf asdf" "asdf"; /* "asdf" */ /*  */  /*   '"'"  */
                      ^

.

> > and I'll do it.

Using M;- (time-scroll) from the start of xdisp.c, and (time-scroll t)
from its end (having cleared caches by typing a character at BOB), I get
these timings

                      forward              backward
master                 34.51s               36.43s
comment-cache          33.68s               32.81s
"alternative patch"    35.49s               36.05s


(defmacro time-it (&rest forms)
  "Time the running of a sequence of forms using `float-time'.
Call like this: \"M-: (time-it (foo ...) (bar ...) ...)\"."
  `(let ((start (float-time)))
    ,@forms
    (- (float-time) start)))

(defun time-scroll (&optional arg)
  (interactive "P")
  (message "%s"
           (time-it
            (condition-case nil
                (while t
                  (if arg (scroll-down) (scroll-up))
                  (sit-for 0))
              (error nil)))))


It would seem that differences in speed are not big enough to make any
decision on that basis.

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03 16:19   ` Alan Mackenzie
  2017-02-04  9:06     ` Andreas Röhler
@ 2017-02-04 18:18     ` Stefan Monnier
  2017-02-04 18:28       ` Alan Mackenzie
  1 sibling, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-04 18:18 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Instead, why don't you criticise comment-cache in a constructive
> fashion?  Such as by pointing out potential problems it might cause.

Don't be disingenous: we've been through that several times already.


        Stefan



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-04 18:18     ` Stefan Monnier
@ 2017-02-04 18:28       ` Alan Mackenzie
  0 siblings, 0 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-04 18:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello Stefan.

On Sat, Feb 04, 2017 at 13:18:56 -0500, Stefan Monnier wrote:
> > Instead, why don't you criticise comment-cache in a constructive
> > fashion?  Such as by pointing out potential problems it might cause.

> Don't be disingenous: we've been through that several times already.

Yes we have, but no potential problems comment-cache might cause have
been identified.  There's been generalized abstract philosophy on why
comment-cache is supposedly bad, but no real problems.  Nothing which
would cause Emacs to malfunction.

The fact is, comment-cache enables the proper functioning of
backward_comment.  The current master and the "alternative patch" are
both buggy.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-03  7:41     ` Eli Zaretskii
  2017-02-03 17:29       ` Alan Mackenzie
@ 2017-02-05 22:00       ` Alan Mackenzie
  2017-02-06  1:12         ` Stefan Monnier
  2017-02-08 17:20         ` Eli Zaretskii
  1 sibling, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-05 22:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello again, Eli.

On Fri, Feb 03, 2017 at 09:41:23 +0200, Eli Zaretskii wrote:
> > Date: Thu, 2 Feb 2017 21:51:54 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > > I say there's too much resistance to doing that from people whose
> > > opinions I respect and trust.  Each time this issue comes up, I see
> > > that resistance being expressed again.

> > Primarily from Stefan.

> Not only Stefan.  Also Dmitry.

> > > I hope it's possible to find some kind of compromise or a different
> > > solution that leaves people less unhappy.

> > Compromise with what?

> With the objections, ideas, and suggestions expressed in those
> discussions.

(forward-comment -1), implemented by backward_comment in syntax.c is an
essential part of Emacs.  There are currently four contending ways for
how this should be done:

(i) The comment-cache branch ("CC").
(ii) The current master with open-paren-in-column-0-is-defun-start set
  to t (its default) ("M-t").
(iii) As (ii) but with o-p-i-c-0-i-d-s set to nil ("M-nil").
(iv) The "alternative patch" proposed by Stefan and advocated by Dmitry
  ("AP").

These four ways have the following characteristics:

        |      Speed       Direction of scanning     Correct parsing
------------------------------------------------------------------------
CC      |       OK             forwards                   yes
M-t     |       OK             backwards                  no [1]
M-nil   |       Slow           backwards                  probably [2]
AP      |       OK             backwards                  no [3]

[1] M-t fails on comments containing parens in column zero.
[2] M-nil depends on scanning comments backwards.  It is believed to be
  correct, but it is difficult to be sure.
[3] AP depends on syntax-ppss, which returns spurious values for narrowed
  buffers (bug #22983).  A test case exists for which AP fails.

By the above criteria, CC is the clear winner.

CC is opposed by Stefan and Dmitry, if I understand correctly, because
they think the type of action performed by CC should be done using
syntax-ppss and no other way.  Additionally, Dmitry has expressed some
minor concern at the extra RAM used by CC, and Stefan has expressed some
concern at how CC might affect multiple major modes in a single buffer.

Right now, I am facing a tedious and quite difficult merge of master
into comment-cache, necessitated by extensive changes in syntax.c since
I last merged, back in December.  Should I bother?

I strongly believe that comment-cache is the best way for Emacs to do
back_comment.  As already said, I am not enthusiastic at continually
having to field bugs with parens in column 0 inside comments, caused by
the current buggy backward_comment.

But I need your acceptance of comment-cache to go any further.  It has
taken a lot of my time to develop, and I am still hopeful of merging it
into master.  If there is a sound technical reason why it should be
abandoned, that is fair enough.  If it is rejected without such a
reason, I will need to reconsider my relationship with Emacs.  I am
currently working (or "working") on several ambitious changes in Emacs.
One of them is restructuring the byte compiler so that error and warning
messages get the correct line number (bug #22288, etc.).  If there is
the prospect of these being rejected without good reason, I am not
willing to take the risk of wasting my time on them.  I would restrict
my participation in Emacs to CC Mode and simple changes in the non-C
part of Emacs which can be done in at most a very few hours.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-05 22:00       ` Alan Mackenzie
@ 2017-02-06  1:12         ` Stefan Monnier
  2017-02-06 18:37           ` Alan Mackenzie
  2017-02-08 17:20         ` Eli Zaretskii
  1 sibling, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-06  1:12 UTC (permalink / raw)
  To: emacs-devel

>         |      Speed       Direction of scanning     Correct parsing
> ------------------------------------------------------------------------
> CC      |       OK             forwards                   yes
> M-t     |       OK             backwards                  no [1]
> M-nil   |       Slow           backwards                  probably [2]
> AP      |       OK             backwards                  no [3]

And its parsing is correct if you assume that syntax-ppss works
correctly (and if it doesn't, it's something that needs to be fixed
anyway because it has much worse consequences than just messing
(forward-comment -1)).


        Stefan




^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-04 11:02       ` Alan Mackenzie
@ 2017-02-06  1:28         ` Dmitry Gutov
  2017-02-06 19:37           ` Alan Mackenzie
  2017-02-06  2:08         ` Stefan Monnier
  1 sibling, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-06  1:28 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

Hi Alan,

On 04.02.2017 13:02, Alan Mackenzie wrote:

> I think it would be useful to post the actual patch here, so it can be
> more easily discussed, and to be easier for people who want to try it
> out to get to it.

I'd rather it stays in one place, along with any further revisions, if 
needed. That should cause less confusion in the long run.

For any casual observers in this discussion who don't want to follow the 
links: the patch touches src/syntax.c, and it's 20 lines long.

> syntax-ppss being too slow was its use in a specific circumstance.  That
> was trying to use it in place of comment-cache's cache mechanism, but
> otherwise using comment-cache.  That would result in ~2 orders of
> magnitude slowdown in backward_comment.

Ah, so that's what you were arguing against?

Does comment-cache code contain some other functionality that we'd want 
to retain while using the syntax-ppss cache? Something that makes 
performance overhead of syntax-ppss a problem still?

>>> The "alternative patch" didn't scan comments correctly all the time
>>> when I looked at it, just as the current back_comment doesn't.
> 
>> Please remind us of the specific problems it has.
> 
> In the following test case (same as in my other post) the "alternative
> patch" doesn't work.  Narrow the buffer with point-min at the indicated
> position.  Put point at EOL.  Try M-: (forward-comment -1).  This fails.
> 
>      char foo[] = "asdf asdf" "asdf"; /* "asdf" */ /*  */  /*   '"'"  */
>                        ^
> 
> .

Thank you for the reminder. But do you have any examples that do not 
involve narrowing?

Reconciling syntax-ppss with narrowing is a subject of a separate thread 
(one that's regrettably stalled for a while, but I'll get back to it 
soon). As soon as it's resolved, the Alternative Patch should not have 
this problem anymore either.

> Using M;- (time-scroll) from the start of xdisp.c, and (time-scroll t)
> from its end (having cleared caches by typing a character at BOB), I get
> these timings
> 
>                        forward              backward
> master                 34.51s               36.43s
> comment-cache          33.68s               32.81s
> "alternative patch"    35.49s               36.05s

Thanks!

> It would seem that differences in speed are not big enough to make any
> decision on that basis.

Does that just leave the narrowing issues, or is there something else?

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-04 11:02       ` Alan Mackenzie
  2017-02-06  1:28         ` Dmitry Gutov
@ 2017-02-06  2:08         ` Stefan Monnier
  2017-02-06 20:01           ` Alan Mackenzie
  1 sibling, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-06  2:08 UTC (permalink / raw)
  To: emacs-devel

>> > The "alternative patch" didn't scan comments correctly all the time
>> > when I looked at it, just as the current back_comment doesn't.

Of course, there's an alternative way to look at this reality: your
comment-cache changes the behavior in some cases where the AP patch doesn't.
You claim that the new behavior is "correct" and the other one "wrong",
but AFAIK these are borderline cases where both interpretations can be
correct or wrong depending on what narrowing was used for.

So while I don't claim that comment cache's behavior is *wrong*, it
might break existing code.

> In the following test case (same as in my other post) the "alternative
> patch" doesn't work.  Narrow the buffer with point-min at the indicated
> position.  Put point at EOL.  Try M-: (forward-comment -1).  This fails.

>     char foo[] = "asdf asdf" "asdf"; /* "asdf" */ /*  */  /*   '"'"  */
>                       ^

For example here, your intention for narrowing is clear, but Emacs
currently doesn't keep track of the fact that the user put this
narrowing (rather than some code like mmm-mode), so while in this case
your comment-cache is probably right, in other cases it might give the
wrong answer.  E.g. maybe in a case such as

     char foo[] = "for (x = 0; x < n; x++) /* Loop header */\n";
                   ^                                        ^

where the user narrows to the string, then goes to EOL and does
M-: (forward-comment -1)

Really your comment-cache (just like the existing code) currently can't
do much better, because to do better we need to fix the narrowing
problem.

So really, the problem to be solved is the problem of narrowing.
Once that one is solved, AP and comment-cache should both be able to
behave correctly in both cases (in the case of AP, this will happen
without any changes to AP itself, because the fix will be in
syntax-ppss).

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-04 10:24           ` Alan Mackenzie
@ 2017-02-06  2:09             ` Dmitry Gutov
  2017-02-06 19:24               ` Alan Mackenzie
  2017-02-12  2:53               ` John Wiegley
  0 siblings, 2 replies; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-06  2:09 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

On 04.02.2017 12:24, Alan Mackenzie wrote:

> You want comment-cache to be wholly abandoned.

At least the part that maintains a separate cache. I'm not sure if 
there's anything else there.

It's not because I enjoy disagreeing with you, though.

>> And then we should seek the simplest solution that satisfies all of our
>> requirements.
> 
> As simple as possible, but definitely not simpler.  The "solution" you
> favour is too simple.  It doesn't work all the time.

I concede it's not ideal. However, I strongly believe "fixing" the 
narrowing problem in syntax-ppss with take care of this example, *and* 
will result in lower overall complexity and maintenance burden.

Consider the problems you've had merging master into the comment-cache 
branch. If there were conflicts, that means the new code touches a 
changing area, and it will need to be considered and taken care of by 
the maintainers, probably on an ongoing basis. The AP, on the other 
hand, still applies cleanly.

>> "It introduces a second source of truth" seems like a concise summary.
> 
> So what?  There are any number of "sources of truth" in Emacs.  If one
> of them turns out to be a "source of untruth" we call that a bug, and we
> fix it.

One normally adds an alternative source of truth (i.e. a "cache") to fix 
a significant performance problem, when one really can't do so otherwise.

It seems we agree now that comment-cache's existence can't be justified 
by performance considerations.

Cache invalidation is a known hard problem in CS, so we generally don't 
want to have extra caches.

>> At best, it'll use more memory than it has to.
> 
> The thing to do here is measure this extra memory.  I did this back in
> spring last year, and the number of extra conses used for the cache was
> not inordinately high.  Especially not for a 64-bit machine with several
> gigabytes of RAM.

Maybe it's not bad, without a direct link it's hard for me to comment on 
that now. But "no extra memory usage" would be a better outcome anyway.

> I think you're seeing something that's not there.  You're picturing some
> imagined process where two alternative ways of storing information have
> great difficulty staying together, and somehow, over time, are destined
> to drift apart.  Sort of like two national currencies trying to stay
> pegged to eachother, or something like that.

I'm picturing weird syntax highlighting/defun navigation/etc behavior 
that comes and goes seemingly randomly, and which forces us to debug 
both cache mechanisms to see which one is getting something wrong.

They don't even have to drift far apart functionality-wise, as long as 
their implementations are largely independent.

> That's not how computer programs work.  If those two ways end up
> differing, we have a bug, which can be fixed like any other bug.  Heck,
> even a single "source of truth" can be buggy, with just as severe
> consequences.  We get bugs, we fix them.

And the more sources of truth we have, we more places we might end up 
having to fix.

> Note, in this context, that syntax-ppss is broken (bug #22983) and
> doesn't look like getting fixed any time soon, yet the world hasn't come
> to an end.

A consistently "wrong" behavior is better than having some standard 
library functions work "correctly", and some otherwise.

Consider this again: as long as syntax-ppss continues to have problems 
in the cases you imagine, the caches _will_ diverge in those cases.

Honestly, my head hurts when I start thinking up problem examples, but 
I'm sure the users and authors of modes that define 
syntax-propertize-function and/or use syntax-ppss won't like them.

>>> Note that there has been NO constructive criticism of comment-cache.
> 
>> That's insulting, Alan.
> 
> It might be, but I think it's true.  You want comment-cache to be wholly
> abandoned.  You are not suggesting ways to make it better.  You haven't
> tried it, that I'm aware of.  You haven't looked for flaws, with the
> intention of getting them fixed.

You seem to argue that a high-level criticism can't be constructive, and 
that any good one has to discuss lower-level implementation details.

 > Instead you are putting forward
 > reasons, not all of them good, for abandoning comment-cache.

Aside from "two sources of truth", the other reason is that we have a 
much-simpler patch that gives us (or will eventually give) the same 
benefits.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06  1:12         ` Stefan Monnier
@ 2017-02-06 18:37           ` Alan Mackenzie
  0 siblings, 0 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-06 18:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Feb 05, 2017 at 20:12:13 -0500, Stefan Monnier wrote:
> >         |      Speed       Direction of scanning     Correct parsing
> > ------------------------------------------------------------------------
> > CC      |       OK             forwards                   yes
> > M-t     |       OK             backwards                  no [1]
> > M-nil   |       Slow           backwards                  probably [2]
> > AP      |       OK             backwards                  no [3]

> And its parsing is correct if you assume that syntax-ppss works
> correctly ....

syntax-ppss doesn't work correctly, as you know full well, having
snipped the bug number (#22983) for the breakage from what you've cited.
That bug has been open for almost a year, and there was a whole thread
on emacs-devel about asking for it to be fixed back last summer.  I've
asked several times since then for it to be fixed, to no avail.

> .... (and if it doesn't, it's something that needs to be fixed
> anyway because it has much worse consequences than just messing
> (forward-comment -1)).

All the signs are that it will never be fixed.  If I'm wrong here, set
yourself a deadline for fixing it and let us know what that deadline is.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06  2:09             ` Dmitry Gutov
@ 2017-02-06 19:24               ` Alan Mackenzie
  2017-02-07  1:42                 ` Dmitry Gutov
  2017-02-12  2:53               ` John Wiegley
  1 sibling, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-06 19:24 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Mon, Feb 06, 2017 at 04:09:42 +0200, Dmitry Gutov wrote:
> On 04.02.2017 12:24, Alan Mackenzie wrote:

> > You want comment-cache to be wholly abandoned.

> At least the part that maintains a separate cache. I'm not sure if 
> there's anything else there.

The essence of comment-cache is scanning comments only in the forward
direction.  This is impractical without a good cache.  The syntax-ppss
cache is wholly inadequate here (and would be even if it worked in the
general case).

> >> And then we should seek the simplest solution that satisfies all of our
> >> requirements.

> > As simple as possible, but definitely not simpler.  The "solution" you
> > favour is too simple.  It doesn't work all the time.

> I concede it's not ideal. However, I strongly believe "fixing" the 
> narrowing problem in syntax-ppss with take care of this example, *and* 
> will result in lower overall complexity and maintenance burden.

There's no sign of syntax-ppss being fixed.  Bug #22983 has been open
for almost a year, and despite repeated requests from me, there has been
no movement on it.

Anyways, there are other problems with the "alternative patch".  It
doesn't clear it's caches when syntax-table properties are applied to or
removed from a buffer.  It doesn't clear its caches when a "literal
relevant" change is made to the current syntax table, or a different
syntax-table is made current.  comment-cache handles these situations
correctly - that's where its perceived complexity scores.

> Consider the problems you've had merging master into the comment-cache 
> branch. If there were conflicts, that means the new code touches a 
> changing area, and it will need to be considered and taken care of by 
> the maintainers, probably on an ongoing basis.

comment-cache has rewriten backward_comment entirely, hence the
troublesome merge.  It's no more difficult for maintainers than the
current version of Emacs.

> The AP, on the other hand, still applies cleanly.

Not surprisingly.  It's simplistic, too simplistic.

> >> "It introduces a second source of truth" seems like a concise summary.

> > So what?  There are any number of "sources of truth" in Emacs.  If one
> > of them turns out to be a "source of untruth" we call that a bug, and we
> > fix it.

> One normally adds an alternative source of truth (i.e. a "cache") to fix 
> a significant performance problem, when one really can't do so otherwise.

So far, there's no fully satisfactory alternative to comment-cache on
the table.

> It seems we agree now that comment-cache's existence can't be justified 
> by performance considerations.

> Cache invalidation is a known hard problem in CS, so we generally don't 
> want to have extra caches.

It might be a difficult problem but it's not NP-complete, or anything
like that.  comment-cache solves the cache invalidation.  syntax-ppss,
used in the "alternative patch" doesn't.  (See above.)

> >> At best, it'll use more memory than it has to.

> > The thing to do here is measure this extra memory.  I did this back in
> > spring last year, and the number of extra conses used for the cache was
> > not inordinately high.  Especially not for a 64-bit machine with several
> > gigabytes of RAM.

> Maybe it's not bad, without a direct link it's hard for me to comment on 
> that now. But "no extra memory usage" would be a better outcome anyway.

It would, but nobody's come up with a satisfactory way to achieve this.

> > I think you're seeing something that's not there.  You're picturing some
> > imagined process where two alternative ways of storing information have
> > great difficulty staying together, and somehow, over time, are destined
> > to drift apart.  Sort of like two national currencies trying to stay
> > pegged to eachother, or something like that.

> I'm picturing weird syntax highlighting/defun navigation/etc behavior 
> that comes and goes seemingly randomly, and which forces us to debug 
> both cache mechanisms to see which one is getting something wrong.

Oh, I've had plenty of practice at this sort of thing.  Open parens at
column 0 in comments have been a frequent trigger for these problems.
comment-cache's cache is simple, and should thus be easy to verify.

> They don't even have to drift far apart functionality-wise, as long as 
> their implementations are largely independent.

They shouldn't drift apart at all.  But drifting apart is no worse a
problem than a single cache being wrong.

[ .... ]

> > Note, in this context, that syntax-ppss is broken (bug #22983) and
> > doesn't look like getting fixed any time soon, yet the world hasn't come
> > to an end.

> A consistently "wrong" behavior is better than having some standard 
> library functions work "correctly", and some otherwise.

A consistently wrong behaviour in a cache handler is not better.

> Consider this again: as long as syntax-ppss continues to have problems 
> in the cases you imagine, the caches _will_ diverge in those cases.

Yes they will.  In those cases, it would still be better if
backward_comment functioned correctly.

> Honestly, my head hurts when I start thinking up problem examples, but 
> I'm sure the users and authors of modes that define 
> syntax-propertize-function and/or use syntax-ppss won't like them.

They won't see them.

> >>> Note that there has been NO constructive criticism of comment-cache.

> >> That's insulting, Alan.

> > It might be, but I think it's true.  You want comment-cache to be wholly
> > abandoned.  You are not suggesting ways to make it better.  You haven't
> > tried it, that I'm aware of.  You haven't looked for flaws, with the
> > intention of getting them fixed.

> You seem to argue that a high-level criticism can't be constructive, and 
> that any good one has to discuss lower-level implementation details.

Arguing for complete abandonment is not constructive criticism.

>  > Instead you are putting forward
>  > reasons, not all of them good, for abandoning comment-cache.

> Aside from "two sources of truth", the other reason is that we have a 
> much-simpler patch that gives us (or will eventually give) the same 
> benefits.

It doesn't.  It doesn't clear its caches when it ought to because of
changes in syntax-table text properties, changes in the current syntax
table, or swapping to a different syntax table.  comment-cache handles
all of these things.

I'm not saying the "alternative patch" couldn't be enhanced to do these
things properly, but it would then no longer be a 20-line patch.  It
would also likely be much slower.  Why bother, when comment-cache exists
and works?

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06  1:28         ` Dmitry Gutov
@ 2017-02-06 19:37           ` Alan Mackenzie
  0 siblings, 0 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-06 19:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Mon, Feb 06, 2017 at 03:28:44 +0200, Dmitry Gutov wrote:
> Hi Alan,

> On 04.02.2017 13:02, Alan Mackenzie wrote:

[ .... ]

> For any casual observers in this discussion who don't want to follow the 
> links: the patch touches src/syntax.c, and it's 20 lines long.

> > syntax-ppss being too slow was its use in a specific circumstance.  That
> > was trying to use it in place of comment-cache's cache mechanism, but
> > otherwise using comment-cache.  That would result in ~2 orders of
> > magnitude slowdown in backward_comment.

> Ah, so that's what you were arguing against?

Yes.  I thought at the time that that's what you were advocating.

> Does comment-cache code contain some other functionality that we'd want 
> to retain while using the syntax-ppss cache? Something that makes 
> performance overhead of syntax-ppss a problem still?

There's the failure to clear the syntax-ppss cache (e.g. after applying
syntax table properties) that I outlined in detail in my other post.
comment-cache's cache is cleared correctly in these circumstances.

> >>> The "alternative patch" didn't scan comments correctly all the time
> >>> when I looked at it, just as the current back_comment doesn't.

> >> Please remind us of the specific problems it has.

> > In the following test case (same as in my other post) the "alternative
> > patch" doesn't work.  Narrow the buffer with point-min at the indicated
> > position.  Put point at EOL.  Try M-: (forward-comment -1).  This fails.

> >      char foo[] = "asdf asdf" "asdf"; /* "asdf" */ /*  */  /*   '"'"  */
> >                        ^

> > .

> Thank you for the reminder. But do you have any examples that do not 
> involve narrowing?

No (other than cache clearing problems).  The bug on a narrowed buffer
is serious enough not to require "support" from other bugs.

> Reconciling syntax-ppss with narrowing is a subject of a separate thread 
> (one that's regrettably stalled for a while, but I'll get back to it 
> soon). As soon as it's resolved, the Alternative Patch should not have 
> this problem anymore either.

Not this one, no.

> > Using M;- (time-scroll) from the start of xdisp.c, and (time-scroll t)
> > from its end (having cleared caches by typing a character at BOB), I get
> > these timings

> >                        forward              backward
> > master                 34.51s               36.43s
> > comment-cache          33.68s               32.81s
> > "alternative patch"    35.49s               36.05s

> Thanks!

> > It would seem that differences in speed are not big enough to make any
> > decision on that basis.

> Does that just leave the narrowing issues, or is there something else?

See above and my other post from this evening for the "something else".

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06  2:08         ` Stefan Monnier
@ 2017-02-06 20:01           ` Alan Mackenzie
  2017-02-06 22:33             ` Stefan Monnier
  2017-02-07 15:29             ` Eli Zaretskii
  0 siblings, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-06 20:01 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Feb 05, 2017 at 21:08:15 -0500, Stefan Monnier wrote:
> >> > The "alternative patch" didn't scan comments correctly all the time
> >> > when I looked at it, just as the current back_comment doesn't.

> Of course, there's an alternative way to look at this reality: your
> comment-cache changes the behavior in some cases where the AP patch doesn't.
> You claim that the new behavior is "correct" and the other one "wrong",
> but AFAIK these are borderline cases where both interpretations can be
> correct or wrong depending on what narrowing was used for.

I think that is a fundamental mistake in thinking.  The syntactic
significance of a buffer position is not changed by any narrowing in
force, no matter what the narrowing is "used for".  Any other
interpretation leads to the inconsistencies you've identified.

That one or more multiple-mode attempts attempt to use narrowing that
way is a fundamental problem in those modes which we should solve by
providing a better method.  (I suggested one such method last spring.)

> So while I don't claim that comment cache's behavior is *wrong*, it
> might break existing code.

> > In the following test case (same as in my other post) the "alternative
> > patch" doesn't work.  Narrow the buffer with point-min at the indicated
> > position.  Put point at EOL.  Try M-: (forward-comment -1).  This fails.

> >     char foo[] = "asdf asdf" "asdf"; /* "asdf" */ /*  */  /*   '"'"  */
> >                       ^

> For example here, your intention for narrowing is clear, ....

There's no "intention" involved here.  There's just narrowing.

> .... but Emacs currently doesn't keep track of the fact that the user
> put this narrowing (rather than some code like mmm-mode), so while in
> this case your comment-cache is probably right, in other cases it
> might give the wrong answer.  E.g. maybe in a case such as

>      char foo[] = "for (x = 0; x < n; x++) /* Loop header */\n";
>                    ^                                        ^

> where the user narrows to the string, then goes to EOL and does
> M-: (forward-comment -1)

Even if the user narrows to the string, it's still a string.  It's not a
comment, and can't be one.

Even if, traditionally, Emacs has treated this string portion as a
comment, that was merely for simplicity of implementation.  This is not
an important point, however, because moving back over comments is not a
user command, and major modes will have checked for a "safe place"
before attempting (forward-comment -1) or a backwards scan-lists.

> Really your comment-cache (just like the existing code) currently can't
> do much better, because to do better we need to fix the narrowing
> problem.

I don't think there is such a problem.

> So really, the problem to be solved is the problem of narrowing.
> Once that one is solved, AP and comment-cache should both be able to
> behave correctly in both cases (in the case of AP, this will happen
> without any changes to AP itself, because the fix will be in
> syntax-ppss).

As I pointed out to Dmitry, AP fails to clear the syntax-ppss cache when
syntax-table properties in a buffer are changed (which is _always_ done
with the change hooks disabled) or the current syntax table is changed,
or a different syntax table is made current.  comment-cache clears its
cache correctly in these scenarios.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06 20:01           ` Alan Mackenzie
@ 2017-02-06 22:33             ` Stefan Monnier
  2017-02-07 21:24               ` Alan Mackenzie
  2017-02-07 15:29             ` Eli Zaretskii
  1 sibling, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-06 22:33 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> char foo[] = "for (x = 0; x < n; x++) /* Loop header */\n";
>> ^                                        ^

>> where the user narrows to the string, then goes to EOL and does
>> M-: (forward-comment -1)

> Even if the user narrows to the string, it's still a string.  It's not a
> comment, and can't be one.

As the user who did the above operation I beg to differ: I narrowed
specifically because I wanted to treat this as the chunk of C code
it is.

It would be arrogant for Emacs to claim it knows better than the user.


        Stefan



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06 19:24               ` Alan Mackenzie
@ 2017-02-07  1:42                 ` Dmitry Gutov
  2017-02-07 19:21                   ` Alan Mackenzie
  0 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-07  1:42 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

Hey Alan,

On 06.02.2017 21:24, Alan Mackenzie wrote:

> The essence of comment-cache is scanning comments only in the forward
> direction.  This is impractical without a good cache.  The syntax-ppss
> cache is wholly inadequate here (and would be even if it worked in the
> general case).

How come the "alternative patch" works well, then? The only bugs you've 
outlined so far are related to narrowing and syntax table change, but 
not any static complex syntactic situations, which is where I would 
expect scanning direction to have an impact.

> There's no sign of syntax-ppss being fixed.  Bug #22983 has been open
> for almost a year, and despite repeated requests from me, there has been
> no movement on it.

You didn't show any enthusiasm about the initial proposed fix, which was 
rather simple. Now we've had more discussions, and the bar for a 
solution has been raised. I'm thinking about it again. Let's not give up.

> Anyways, there are other problems with the "alternative patch".  It
> doesn't clear it's caches when syntax-table properties are applied to or
> removed from a buffer.  It doesn't clear its caches when a "literal
> relevant" change is made to the current syntax table, or a different
> syntax-table is made current.

Tracking changes inside a syntax table is possible (at the expense of 
some performance, as usual), but kinda pointless, I think. Most issues 
related to that, if they ever come up, could be answered with "don't do 
that".

Tracking the used syntax table is also a problem which we need to solve 
for syntax-ppss. A good design could handle it and narrowing together.

> comment-cache handles these situations
> correctly - that's where its perceived complexity scores.

And it does that in a pretty inflexible way.

> comment-cache has rewriten backward_comment entirely, hence the
> troublesome merge.  It's no more difficult for maintainers than the
> current version of Emacs.

But surely it is more complex, with cache handling logic.

> They shouldn't drift apart at all.  But drifting apart is no worse a
> problem than a single cache being wrong.

Yes, it is worse. You have more code to debug. And comment-cache adds 
quite a bit of code.

> Arguing for complete abandonment is not constructive criticism.

When an alternative approach is recommended, yes, it is.

> I'm not saying the "alternative patch" couldn't be enhanced to do these
> things properly, but it would then no longer be a 20-line patch.

I think it would be. The enhancements you're referring to will most 
likely be implemented on the Lisp level, and they are needed anyway.

So the "speed up forward-comment" patch would still come out to 20 lines.

> It would also likely be much slower.
I wouldn't be so sure. A syntax table comparison, for instance, would be 
pretty cheap compared to what syntax-ppss does already.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06 20:01           ` Alan Mackenzie
  2017-02-06 22:33             ` Stefan Monnier
@ 2017-02-07 15:29             ` Eli Zaretskii
  2017-02-07 21:09               ` Alan Mackenzie
  1 sibling, 1 reply; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-07 15:29 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Mon, 6 Feb 2017 20:01:16 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
> The syntactic significance of a buffer position is not changed by
> any narrowing in force, no matter what the narrowing is "used for".

If that's what you think, you are not talking about Emacs.  Emacs
always behaved as if nothing existed outside of the current
narrowing.  Even the display engine behaves like that: e.g., by
suitable narrowing of bidirectional text you can completely change how
the accessible portion is displayed.

> That one or more multiple-mode attempts attempt to use narrowing that
> way is a fundamental problem in those modes which we should solve by
> providing a better method.

That's true, but it doesn't affect the basic fact that Emacs behaves
differently, and you cannot change that without significant changes on
levels below applications.

> Even if the user narrows to the string, it's still a string.

Emacs currently doesn't have any means of knowing that, because the
portions of the buffer outside the accessible region are simply not
accessible.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-07  1:42                 ` Dmitry Gutov
@ 2017-02-07 19:21                   ` Alan Mackenzie
  2017-02-14 15:28                     ` Dmitry Gutov
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-07 19:21 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Tue, Feb 07, 2017 at 03:42:50 +0200, Dmitry Gutov wrote:
> Hey Alan,

> On 06.02.2017 21:24, Alan Mackenzie wrote:

> > The essence of comment-cache is scanning comments only in the forward
> > direction.  This is impractical without a good cache.  The syntax-ppss
> > cache is wholly inadequate here (and would be even if it worked in the
> > general case).

> How come the "alternative patch" works well, then?

Well, aside from the fact that it doesn't (IMAO), it is only consulted
relatively rarely, in certain cases of back_coment where the backward
scanning hits something it doesn't want to handle.  The AP is marginally
slower than comment-cache.  If such awkward comments were prominent in a
file, it would be noticeably slower.  In comment-cache, the cache is
used for every back_comment.

> The only bugs you've outlined so far are related to narrowing and
> syntax table change, but not any static complex syntactic situations,
> which is where I would expect scanning direction to have an impact.

Those bugs are enough, aren't they?  (forward-comment -1) etc., should
work correctly in any circumstances.  There might be the sort of bugs
you're looking for, but I suspect not.  The backward scanning code is
very complicated.

> > There's no sign of syntax-ppss being fixed.  Bug #22983 has been open
> > for almost a year, and despite repeated requests from me, there has been
> > no movement on it.

> You didn't show any enthusiasm about the initial proposed fix, which was 
> rather simple. Now we've had more discussions, and the bar for a 
> solution has been raised. I'm thinking about it again. Let's not give up.

I wasn't enthusiastic about your proposed fix because I found it ugly.

> > Anyways, there are other problems with the "alternative patch".  It
> > doesn't clear it's caches when syntax-table properties are applied to or
> > removed from a buffer.  It doesn't clear its caches when a "literal
> > relevant" change is made to the current syntax table, or a different
> > syntax-table is made current.

> Tracking changes inside a syntax table is possible (at the expense of 
> some performance, as usual), but kinda pointless, I think. Most issues 
> related to that, if they ever come up, could be answered with "don't do 
> that".

That sounds like you've decided you want to use syntax-ppss no matter
what, and the bugs this will cause will just be relabeled as features.
As I've said before, the aim should be for back_comment always to work.

> Tracking the used syntax table is also a problem which we need to solve 
> for syntax-ppss. A good design could handle it and narrowing together.

You should now be able to see why I dislike syntax-ppss so much.  As
well as being incompatible with narrowing (which should be sort of
fixable), there is an essential lack of cache invalidating (which would
only be fixable by a radically different design).  There is no sign that
much thought was given to cache invalidation in the design of
syntax-ppss.  This probably cannot be fixed, or if it can, will involve
lots of programming at the C level, and will slow Emacs down quite a
bit.

> > comment-cache handles these situations correctly - that's where its
> > perceived complexity scores.

> And it does that in a pretty inflexible way.

It works.  Other ways (apart from M-nil (master with
open-paren-in-column-0-is-defun-start set to nil)) don't.  The sort of
flexibility I recall you wanting is simply not possible in
comment-cache, though its role could be expanded for other uses which
need the literality of a position.

> > comment-cache has rewriten back_comment entirely, hence the
> > troublesome merge.  It's no more difficult for maintainers than the
> > current version of Emacs.

> But surely it is more complex, with cache handling logic.

It's differently complicated.  master's back_comment, which attempts to
scan comments backwards is more complicated than comment-cache's
back_comment (including its cacheing logic).

> > They shouldn't drift apart at all.  But drifting apart is no worse a
> > problem than a single cache being wrong.

> Yes, it is worse. You have more code to debug. And comment-cache adds 
> quite a bit of code.

How have you quantified "quite a bit"?

> > Arguing for complete abandonment is not constructive criticism.

> When an alternative approach is recommended, yes, it is.

There is nothing to indicate you've even looked at comment-cache.  All
the criticisms you've made have been from a distance, based on rumour
(even if the source of that rumour has been me).  These criticisms have
been entirely destructive.  I repeat, you want comment-cache to be
wholly abandoned, apparently because you like syntax-ppss so much.  The
alternative "recommended" approach has documented deficiencies, yet you
still advocate it.

> > I'm not saying the "alternative patch" couldn't be enhanced to do these
> > things properly, but it would then no longer be a 20-line patch.

> I think it would be. The enhancements you're referring to will most 
> likely be implemented on the Lisp level, and they are needed anyway.

They can't be implemented at the Lisp level.  The tools Emacs Lisp
provides for cache invalidation (basically,
before/after-change-functions) aren't up to the job.

> So the "speed up forward-comment" patch would still come out to 20 lines.

Well, if you get a decent bug fix involving, say a 700 line patch which
includes those 20 lines, I suppose you could still call it a 20 line
patch, somehow.

> > It would also likely be much slower.

> I wouldn't be so sure. A syntax table comparison, for instance, would be 
> pretty cheap compared to what syntax-ppss does already.

Full syntax-table comparisons are slow, even when written in C.  I tried
it back in December.  CC Mode regularly switches syntax-tables.  My
usual time-scroll function on xdisp.c ran at about half the speed when a
comparison was done at every set-syntax-table.  The results had to be
cached, after which it ran at normal speed again.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-07 15:29             ` Eli Zaretskii
@ 2017-02-07 21:09               ` Alan Mackenzie
  2017-02-08 17:28                 ` Eli Zaretskii
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-07 21:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: monnier, emacs-devel

Hello, Eli.

On Tue, Feb 07, 2017 at 17:29:40 +0200, Eli Zaretskii wrote:
> > Date: Mon, 6 Feb 2017 20:01:16 +0000
> > From: Alan Mackenzie <acm@muc.de>
> > Cc: emacs-devel@gnu.org

> > The syntactic significance of a buffer position is not changed by
> > any narrowing in force, no matter what the narrowing is "used for".

> If that's what you think, you are not talking about Emacs.  Emacs
> always behaved as if nothing existed outside of the current
> narrowing.

Not consistently.  Font lock, in all the modes I'm aware of, does not
invert its "stringiness" when point-min lies within a string.

> Even the display engine behaves like that: e.g., by suitable narrowing
> of bidirectional text you can completely change how the accessible
> portion is displayed.

Is this a deliberate design decision, or is it simply what tumbled out
after bidi was implemented in the easiest and most natural fashion?

> > That one or more multiple-mode attempts attempt to use narrowing that
> > way is a fundamental problem in those modes which we should solve by
> > providing a better method.

> That's true, but it doesn't affect the basic fact that Emacs behaves
> differently, and you cannot change that without significant changes on
> levels below applications.

> > Even if the user narrows to the string, it's still a string.

> Emacs currently doesn't have any means of knowing that, because the
> portions of the buffer outside the accessible region are simply not
> accessible.

As you know, I've implemented a scheme by which Emacs can know this.

Up till now, recognition of literals has been done solely by the local
context, probably because it was easier to implement this way rather
than any deep design decision.  Or am I wrong here?

Is there any part of Emacs which depends on this way of recognising
literals, and would that be badly hurt if literals came to be recognised
by their global context (as syntax-ppss currently sort of does)?

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06 22:33             ` Stefan Monnier
@ 2017-02-07 21:24               ` Alan Mackenzie
  2017-02-08 12:54                 ` Stefan Monnier
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-07 21:24 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Mon, Feb 06, 2017 at 17:33:29 -0500, Stefan Monnier wrote:
> >> char foo[] = "for (x = 0; x < n; x++) /* Loop header */\n";
> >> ^                                        ^

> >> where the user narrows to the string, then goes to EOL and does
> >> M-: (forward-comment -1)

> > Even if the user narrows to the string, it's still a string.  It's not a
> > comment, and can't be one.

> As the user who did the above operation I beg to differ: I narrowed
> specifically because I wanted to treat this as the chunk of C code
> it is.

It would likely have been less work to have temporarily deleted the
first string quote.

> It would be arrogant for Emacs to claim it knows better than the user.

More arrogant than a user expecting C syntax to be superseeded?

As a matter of interest, what was the real use case for this, how often
do you do it, and how big would the loss be if you couldn't do it any
more?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-07 21:24               ` Alan Mackenzie
@ 2017-02-08 12:54                 ` Stefan Monnier
  0 siblings, 0 replies; 75+ messages in thread
From: Stefan Monnier @ 2017-02-08 12:54 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> As a matter of interest, what was the real use case for this, how often
> do you do it, and how big would the loss be if you couldn't do it any
> more?

I'm not worried about breaking users's expectations in this regard.
The problem is with packages's expectations.  We need packages to be able
to get "the other" behavior.


        Stefan



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-05 22:00       ` Alan Mackenzie
  2017-02-06  1:12         ` Stefan Monnier
@ 2017-02-08 17:20         ` Eli Zaretskii
  2017-02-11 23:25           ` Alan Mackenzie
  1 sibling, 1 reply; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-08 17:20 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> Date: Sun, 5 Feb 2017 22:00:45 +0000
> Cc: emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> But I need your acceptance of comment-cache to go any further.  It has
> taken a lot of my time to develop, and I am still hopeful of merging it
> into master.  If there is a sound technical reason why it should be
> abandoned, that is fair enough.  If it is rejected without such a
> reason, I will need to reconsider my relationship with Emacs.  I am
> currently working (or "working") on several ambitious changes in Emacs.
> One of them is restructuring the byte compiler so that error and warning
> messages get the correct line number (bug #22288, etc.).  If there is
> the prospect of these being rejected without good reason, I am not
> willing to take the risk of wasting my time on them.  I would restrict
> my participation in Emacs to CC Mode and simple changes in the non-C
> part of Emacs which can be done in at most a very few hours.

Alan,

I hear you, and I'm sorry that you feel such frustration over your
efforts whose results might not end up in the Emacs sources.  Please
understand my position: I'm not an expert on the underlying issues,
neither syntax.c in general, nor syntax-ppss, and not the particular
application of these to CC Mode.  So when two of our best experts on
these issues unanimously disagree with your proposal, I cannot dismiss
their opinions and approve the merge.  Their arguments are technical
and sound, even though they are about the general principles of your
design and not about specific details of your implementation.  But
that doesn't make their arguments invalid or less sound.

So please don't perceive this as "rejection without sound technical
reasons".

As for your other work on changes in Emacs: I see no reasons to
believe their review or prospects of acceptance will be related to the
present issue in any way.  They will be treated completely
independently of this one.

I can understand your fears of having those other changes rejected
because of some aspect of the design or the implementation.  I had my
share of that when I worked on the bidi display engine.  I can tell
what I did to lower the probability of such an outcome: when I made
major design decisions, I published them here and asked for (and
received) comments.  May I suggest that you try that technique as
well?  Doing that will IME go a long way towards identifying the
problematic issues long before they are cast in written and debugged
code, and thus allow you to avoid unnecessary refactoring and grief.

Hoping to see many of your patches in Emacs in the years to come.

TIA

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-07 21:09               ` Alan Mackenzie
@ 2017-02-08 17:28                 ` Eli Zaretskii
  0 siblings, 0 replies; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-08 17:28 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Tue, 7 Feb 2017 21:09:42 +0000
> Cc: monnier@iro.umontreal.ca, emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > If that's what you think, you are not talking about Emacs.  Emacs
> > always behaved as if nothing existed outside of the current
> > narrowing.
> 
> Not consistently.  Font lock, in all the modes I'm aware of, does not
> invert its "stringiness" when point-min lies within a string.

There are a few exceptions, yes.  But mostly what I described is
accurate.

> > Even the display engine behaves like that: e.g., by suitable narrowing
> > of bidirectional text you can completely change how the accessible
> > portion is displayed.
> 
> Is this a deliberate design decision, or is it simply what tumbled out
> after bidi was implemented in the easiest and most natural fashion?

It isn't related to bidi in any way, it's how the display engine
behaved since day one, long before I started coding the bidirectional
support.  I just left that aspect alone and didn't change it.

> > Emacs currently doesn't have any means of knowing that, because the
> > portions of the buffer outside the accessible region are simply not
> > accessible.
> 
> As you know, I've implemented a scheme by which Emacs can know this.

Dmitry's point is exactly that a solution to these issues will also
resolve some bugs related to CC mode, which you tried to solve in your
branch.  I tend to agree with Dmitry that narrowing and its effect on
Emacs internals is a separate problem that needs to be solved in a
more general way than just in CC mode or thereabouts.

> Up till now, recognition of literals has been done solely by the local
> context, probably because it was easier to implement this way rather
> than any deep design decision.  Or am I wrong here?

I think it isn't an accident.  There's a deeper issue here: if some
portion of the buffer is inaccessible to user-level commands, it might
be confusing if some features would internally behave as if the
restriction didn't exist, at least in general.

Finding a solution for this which doesn't introduce the confusion is a
challenge.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-08 17:20         ` Eli Zaretskii
@ 2017-02-11 23:25           ` Alan Mackenzie
  2017-02-12  0:55             ` Stefan Monnier
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-11 23:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

Hello, Eli.

Thanks for the reply.

On Wed, Feb 08, 2017 at 19:20:41 +0200, Eli Zaretskii wrote:
> > Date: Sun, 5 Feb 2017 22:00:45 +0000
> > Cc: emacs-devel@gnu.org
> > From: Alan Mackenzie <acm@muc.de>

> > But I need your acceptance of comment-cache to go any further.  It has
> > taken a lot of my time to develop, and I am still hopeful of merging it
> > into master.  If there is a sound technical reason why it should be
> > abandoned, that is fair enough.  If it is rejected without such a
> > reason, I will need to reconsider my relationship with Emacs.  I am
> > currently working (or "working") on several ambitious changes in Emacs.
> > One of them is restructuring the byte compiler so that error and warning
> > messages get the correct line number (bug #22288, etc.).  If there is
> > the prospect of these being rejected without good reason, I am not
> > willing to take the risk of wasting my time on them.  I would restrict
> > my participation in Emacs to CC Mode and simple changes in the non-C
> > part of Emacs which can be done in at most a very few hours.

> Alan,

> I hear you, and I'm sorry that you feel such frustration over your
> efforts whose results might not end up in the Emacs sources.  Please
> understand my position: I'm not an expert on the underlying issues,
> neither syntax.c in general, nor syntax-ppss, and not the particular
> application of these to CC Mode.  So when two of our best experts on
> these issues unanimously disagree with your proposal, I cannot dismiss
> their opinions and approve the merge.

I am something of an expert myself on syntax.c, having enhanced it to
handle comments continued by escaped newlines, to handle a scan stopping
in the middle of a two-character comment delimiter, having refactored
important bits of it and fixed several bugs in it.

> Their arguments are technical and sound, even though they are about
> the general principles of your design and not about specific details
> of your implementation.  But that doesn't make their arguments invalid
> or less sound.

> So please don't perceive this as "rejection without sound technical
> reasons".

Yet an important bug remains unfixed.  comment-cache would fix that bug.
I would expect you to take this into account when weighing up the
arguments for and against.

I would expect you to take into account all the technical arguments both
for and against, and to place less importance on who is arguing than the
substance of their arguments.  You say you are "not an expert" on the
issues, yet I don't think this is strictly true.  You know easily enough
about syntax to understand the arguments about it.

> As for your other work on changes in Emacs: I see no reasons to
> believe their review or prospects of acceptance will be related to the
> present issue in any way.  They will be treated completely
> independently of this one.

As I say, I an unhappy about the way the comment-cache issue has been
handled.  I asked on three separate occasions to merge it into master.
On the first two (2016-03 and 2016-12), I received no clear answer.
This third time there is at last a "no", but the reason given is not
technical but on others' personal authority: "when two of our best
experts ... disagree" their opinion holds sway over mine, seemingly
regardless of the strength of the technical points.  Two against one, I
suppose.

> I can understand your fears of having those other changes rejected
> because of some aspect of the design or the implementation.

My fear is that speculative changes might well not be evaluated on
technical grounds, as I feel comment-cache has not been.  None of the
posts opposing comment-cache have even acknowleged that it fixes a bug,
and none of them have attempted to weigh comment-cache's alleged
disadvantages against the fact of the bug fix.

In the current situation I think that both Stefan and Dmitry have an
emotional attachment to syntax-ppss despite its manifest flaws, and it
is this which is behind their opposition to comment-cache, which they
see as some sort of "competitor".  (Here, I don't hide the fact that I
strongly dislike syntax-ppss.)

> I had my share of that when I worked on the bidi display engine.  I
> can tell what I did to lower the probability of such an outcome: when
> I made major design decisions, I published them here and asked for
> (and received) comments.  May I suggest that you try that technique as
> well?

I announced my intention to cache the literal state in text properties
before starting work on it, and even had a brief exchange with yourself
about this.  I think this was in the context of bug #22884 (Paul E.'s
bug about slowness in config.h, which was quickly tracked down to an
open paren in column 0 inside a comment).

> Doing that will IME go a long way towards identifying the problematic
> issues long before they are cast in written and debugged code, and
> thus allow you to avoid unnecessary refactoring and grief.

> Hoping to see many of your patches in Emacs in the years to come.

> TIA

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-11 23:25           ` Alan Mackenzie
@ 2017-02-12  0:55             ` Stefan Monnier
  2017-02-12 12:05               ` Alan Mackenzie
  0 siblings, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-12  0:55 UTC (permalink / raw)
  To: emacs-devel

> In the current situation I think that both Stefan and Dmitry have an
> emotional attachment to syntax-ppss despite its manifest flaws, and it

Of course, I have an emotional attachment to syntax-ppss, since I wrote
it and used it all over the place.  And of course you have an emotional
attachment to comment-cache since you wrote it.

But using words like "flaw" to describe a simple shortcoming of the
current implementation, is really not helping.  I hope I never wrote
something about your comment-cache that was similarly aimed at just
putting it down.

BTW, your comment-cache doesn't fix that "flaw" and hence won't help any
of those users of syntax-ppss which can't be changed to use your
comment-cache.  Which is why I said many months ago that it'd be fine to
use something like your comment-cache *if* you extend it to provide the
functionality of syntax-ppss.

But that's also why I think this whole discussion is pointless: we first
need to focus on that "flaw" which comes to the problems of narrowing
and whether tools like syntax-ppss, comment-cache, font-lock, etc... can
and should widen and if so when and up to where.

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-06  2:09             ` Dmitry Gutov
  2017-02-06 19:24               ` Alan Mackenzie
@ 2017-02-12  2:53               ` John Wiegley
  2017-02-12  8:20                 ` Elias Mårtenson
                                   ` (2 more replies)
  1 sibling, 3 replies; 75+ messages in thread
From: John Wiegley @ 2017-02-12  2:53 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Eli Zaretskii, emacs-devel

>>>>> "DG" == Dmitry Gutov <dgutov@yandex.ru> writes:

GD> One normally adds an alternative source of truth (i.e. a "cache") to fix a
DG> significant performance problem, when one really can't do so otherwise.

DG> It seems we agree now that comment-cache's existence can't be justified by
GD> performance considerations.

DG> Cache invalidation is a known hard problem in CS, so we generally don't
GD> want to have extra caches.

This argument right here is why I would vote against comment-cache: I'd rather
have parens-in-comments-at-column-0 parsed incorrectly -- at least, until
syntax-ppss is fixed -- than to add another cache just to fix this problem.
Unless I've missed something...

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12  2:53               ` John Wiegley
@ 2017-02-12  8:20                 ` Elias Mårtenson
  2017-02-12 10:47                 ` Alan Mackenzie
  2017-02-12 11:14                 ` martin rudalics
  2 siblings, 0 replies; 75+ messages in thread
From: Elias Mårtenson @ 2017-02-12  8:20 UTC (permalink / raw)
  To: emacs-devel, Alan Mackenzie, Dmitry Gutov, Eli Zaretskii

[-- Attachment #1: Type: text/plain, Size: 1030 bytes --]

On 12 Feb 2017 10:55, "John Wiegley" <jwiegley@gmail.com> wrote:

This argument right here is why I would vote against comment-cache: I'd
rather
have parens-in-comments-at-column-0 parsed incorrectly -- at least, until
syntax-ppss is fixed -- than to add another cache just to fix this problem.
Unless I've missed something...

I'm sorry for butting in on this discussion, but I've been following it
with great interest.

During the course of this thread, it has been mentioned that merging
comment-cache would create two different systems: the new one used to track
comments, and syntax-ppss for everything else. Is there a technical reason
why this isn't being considered?

From what I've understood, the way comment-cache solves the problem is
clearly superior so I'm wondering why, when it was implemented, it was
restricted to tracking comments only. Wouldn't this mechanism be useful as
a complete replacement for the current implementation of syntax-ppss.

It seems as though Stefan is also thinking along those lines?

[-- Attachment #2: Type: text/html, Size: 1831 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12  2:53               ` John Wiegley
  2017-02-12  8:20                 ` Elias Mårtenson
@ 2017-02-12 10:47                 ` Alan Mackenzie
  2017-02-12 11:14                 ` martin rudalics
  2 siblings, 0 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-12 10:47 UTC (permalink / raw)
  To: Dmitry Gutov, Eli Zaretskii, emacs-devel

Hello, John.

On Sat, Feb 11, 2017 at 18:53:58 -0800, John Wiegley wrote:
> >>>>> "DG" == Dmitry Gutov <dgutov@yandex.ru> writes:

> GD> One normally adds an alternative source of truth (i.e. a "cache") to fix a
> DG> significant performance problem, when one really can't do so otherwise.

> DG> It seems we agree now that comment-cache's existence can't be justified by
> GD> performance considerations.

> DG> Cache invalidation is a known hard problem in CS, so we generally don't
> GD> want to have extra caches.

> This argument right here is why I would vote against comment-cache: I'd rather
> have parens-in-comments-at-column-0 parsed incorrectly -- at least, until
> syntax-ppss is fixed -- than to add another cache just to fix this problem.
> Unless I've missed something...

What you've missed is that the cache invalidation in comment-cache is
rock solid - with the exception that it doesn't watch
parse-sexp-lookup-properties and parse-sexp-ignore-comments, variables
that are typically used only in initialisation.  If this were deemed a
flaw, it could be fixed very easily.

The other thing is that syntax-ppss doesn't look like getting fixed.
Bug #22983 has been open for almost a year, despite requests to have it
fixed.  Also syntax-ppss's cache invalidation is less than rigorous.

Yet another thing is that it is me that is having to field the
open-paren-in-column-0-in-comment bugs, which typically happen in CC
Mode, and it is a demoralising waste of time each time it happens.
Recently, I've been telling the bug raisers that there's a fix which
should hopefully appear in Emacs 26.  With all honesty, I don't think I
can say that any more.

> -- 
> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12  2:53               ` John Wiegley
  2017-02-12  8:20                 ` Elias Mårtenson
  2017-02-12 10:47                 ` Alan Mackenzie
@ 2017-02-12 11:14                 ` martin rudalics
  2017-02-12 15:05                   ` Andreas Röhler
  2017-02-12 15:39                   ` Eli Zaretskii
  2 siblings, 2 replies; 75+ messages in thread
From: martin rudalics @ 2017-02-12 11:14 UTC (permalink / raw)
  To: Dmitry Gutov, Alan Mackenzie, Eli Zaretskii, emacs-devel

 > This argument right here is why I would vote against comment-cache: I'd rather
 > have parens-in-comments-at-column-0 parsed incorrectly -- at least, until
 > syntax-ppss is fixed -- than to add another cache just to fix this problem.
 > Unless I've missed something...

It makes me rather sad that this discussion does not consider ecological
consequences at all.  IIUC it started because of a "(c)" copyright
characters sequence in the comment of some C code.  Doesn't it strike
anyone as the ultimate irony to consider this an issue in the context of
copylefted software?

Also IIUC we still adhere to the GNU coding standards which clearly say
that

   It is important to put the open-brace that starts the body of a C
   function in column one, so that they will start a defun. Several tools
   look for open-braces in column one to find the beginnings of C
   functions. These tools will not work on code not formatted that way.

   Avoid putting open-brace, open-parenthesis or open-bracket in column
   one when they are inside a function, so that they won’t start a
   defun. The open-brace that starts a struct body can go in column one
   if you find it useful to treat that definition as a defun.

The continuous attempts to deceive this standard's rules have been
harassing me for many years now.  If people do like copyrighted code and
code written according to non-GNU standards, then we should provide at
least one single option that respects an open paren in column zero where
it belongs to: At the beginning of a defun and nowhere else.

In Emacs this option is called `open-paren-in-column-0-is-defun-start'.
Emacs code should obey this option in the sense that if it is non-nil,
it should behave ecologically in terms of consumption of CPU cycles and
memory space.

martin

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12  0:55             ` Stefan Monnier
@ 2017-02-12 12:05               ` Alan Mackenzie
  2017-02-12 13:13                 ` Juanma Barranquero
                                   ` (2 more replies)
  0 siblings, 3 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-12 12:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sat, Feb 11, 2017 at 19:55:46 -0500, Stefan Monnier wrote:
> > In the current situation I think that both Stefan and Dmitry have an
> > emotional attachment to syntax-ppss despite its manifest flaws, and it

> Of course, I have an emotional attachment to syntax-ppss, since I wrote
> it and used it all over the place.  And of course you have an emotional
> attachment to comment-cache since you wrote it.

I also have an attachment to it because it works, and would save me
demoralizing work debugging bugs caused by open parens in column zero in
comments.

> But using words like "flaw" to describe a simple shortcoming of the
> current implementation, is really not helping.  I hope I never wrote
> something about your comment-cache that was similarly aimed at just
> putting it down.

Bug #22983 is a flaw.  It has been open for nearly a year, yet for some
reason isn't being fixed.  Also the cache invalidation in syntax-ppss is
less than rigorous.  For example, the cache isn't invalidated when
syntax-table text properties are applied or removed.

> BTW, your comment-cache doesn't fix that "flaw" and hence won't help any
> of those users of syntax-ppss which can't be changed to use your
> comment-cache.

That's incoherent.  comment-cache was never intended to help those other
uses, though it appears it could do so for most of them.  That
particular flaw we're talking about doesn't appear in comment cache, so
there's nothing to fix there.

> Which is why I said many months ago that it'd be fine to use something
> like your comment-cache *if* you extend it to provide the
> functionality of syntax-ppss.

Can't be done, as I keep telling you.  comment-cache is solely for
handling literals.

> But that's also why I think this whole discussion is pointless: we first
> need to focus on that "flaw" which comes to the problems of narrowing
> and whether tools like syntax-ppss, comment-cache, font-lock, etc... can
> and should widen and if so when and up to where.

Maybe sometime.  In the meantime, the bug with open parens in column
zero in comments should be fixed.

The question of "widening" is not difficult.  Narrowing a buffer should
not change the syntax of the characters in it.  Doing so leads to
inconsistencies.

If I understand correctly, the problem is that multiple-major-mode modes
are trying to use narrowing to get a null syntactic context.  They are
trying this because we don't provide anything better.  We should provide
something better.  I suggested such a something last spring ("islands").
If each buffer position has an unambiguous syntactic context the
question of "widening" simply evaporates.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 12:05               ` Alan Mackenzie
@ 2017-02-12 13:13                 ` Juanma Barranquero
  2017-02-12 15:57                 ` Dmitry Gutov
  2017-02-12 17:49                 ` Stefan Monnier
  2 siblings, 0 replies; 75+ messages in thread
From: Juanma Barranquero @ 2017-02-12 13:13 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stefan Monnier, Emacs developers

[-- Attachment #1: Type: text/plain, Size: 1483 bytes --]

On Sun, Feb 12, 2017 at 1:05 PM, Alan Mackenzie <acm@muc.de> wrote:

> The question of "widening" is not difficult.  Narrowing a buffer should
> not change the syntax of the characters in it.  Doing so leads to
> inconsistencies.
>
> If I understand correctly, the problem is that multiple-major-mode modes
> are trying to use narrowing to get a null syntactic context.  They are
> trying this because we don't provide anything better.  We should provide
> something better.  I suggested such a something last spring ("islands").
> If each buffer position has an unambiguous syntactic context the
> question of "widening" simply evaporates.

I have no opinion on this thread's issue, so no personal attachment to
either side.

But I find curious that you complain that people refuses (so to speak) to
judge comment-cache based on its technical merits, and yet you refuse to
acknowledge that the narrow/widen issue is not technical but social.
There's not a technical reason to prefer that narrowing doesn't change the
syntax of characters, and it's not clear at all that it is what the users
prefer. Indeed, as Eli and Stefan already pointed out, narrowing has been
used in both senses inside and outside Emacs proper. Perhaps you're right
and both behaviors should be separated and clearly defined as two different
facilities, but insisting that your view is the right one is not, per se,
an argument for comment-cache.

Just my 0,00002€.

   Juanma

[-- Attachment #2: Type: text/html, Size: 1752 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 11:14                 ` martin rudalics
@ 2017-02-12 15:05                   ` Andreas Röhler
  2017-02-12 15:39                   ` Eli Zaretskii
  1 sibling, 0 replies; 75+ messages in thread
From: Andreas Röhler @ 2017-02-12 15:05 UTC (permalink / raw)
  To: emacs-devel; +Cc: martin rudalics, John Wiegley

[-- Attachment #1: Type: text/plain, Size: 2445 bytes --]



On 12.02.2017 12:14, martin rudalics wrote:
> > This argument right here is why I would vote against comment-cache: 
> I'd rather
> > have parens-in-comments-at-column-0 parsed incorrectly -- at least, 
> until
> > syntax-ppss is fixed -- than to add another cache just to fix this 
> problem.
> > Unless I've missed something...
>
> It makes me rather sad that this discussion does not consider ecological
> consequences at all.  IIUC it started because of a "(c)" copyright
> characters sequence in the comment of some C code.  Doesn't it strike
> anyone as the ultimate irony to consider this an issue in the context of
> copylefted software?
>
> Also IIUC we still adhere to the GNU coding standards which clearly say
> that
>
>   It is important to put the open-brace that starts the body of a C
>   function in column one, so that they will start a defun. Several tools
>   look for open-braces in column one to find the beginnings of C
>   functions. These tools will not work on code not formatted that way.
>
>   Avoid putting open-brace, open-parenthesis or open-bracket in column
>   one when they are inside a function, so that they won’t start a
>   defun. The open-brace that starts a struct body can go in column one
>   if you find it useful to treat that definition as a defun.
>
> The continuous attempts to deceive this standard's rules have been
> harassing me for many years now.  If people do like copyrighted code and
> code written according to non-GNU standards, then we should provide at
> least one single option that respects an open paren in column zero where
> it belongs to: At the beginning of a defun and nowhere else.
>
> In Emacs this option is called `open-paren-in-column-0-is-defun-start'.
> Emacs code should obey this option in the sense that if it is non-nil,
> it should behave ecologically in terms of consumption of CPU cycles and
> memory space.
>
> martin
>
>

Hi Martin,

thanks. IMO you addressed two core-issues of Emacs' future: a political 
and a technical one.

As for the copyright, conceive it as contradicting to the idea of free 
software - whilst the paperworks reached out here rely on it.

open-paren-in-column-0-is-defun-start ignores the fact functions 
commonly might be nested.
That way Emacs can't handle nested definitions reliably.

Why not have a purely GPL-based Emacs with the 
open-paren-in-column-0-is-defun-start hampering removed?

Make Emacs still greater,

Andreas


[-- Attachment #2: Type: text/html, Size: 3546 bytes --]

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 11:14                 ` martin rudalics
  2017-02-12 15:05                   ` Andreas Röhler
@ 2017-02-12 15:39                   ` Eli Zaretskii
  1 sibling, 0 replies; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-12 15:39 UTC (permalink / raw)
  To: martin rudalics; +Cc: acm, emacs-devel, dgutov

> Date: Sun, 12 Feb 2017 12:14:27 +0100
> From: martin rudalics <rudalics@gmx.at>
> 
> It makes me rather sad that this discussion does not consider ecological
> consequences at all.  IIUC it started because of a "(c)" copyright
> characters sequence in the comment of some C code.  Doesn't it strike
> anyone as the ultimate irony to consider this an issue in the context of
> copylefted software?

I believe this is because we want Emacs to support non-GNU C/C++
source code reasonably well.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 12:05               ` Alan Mackenzie
  2017-02-12 13:13                 ` Juanma Barranquero
@ 2017-02-12 15:57                 ` Dmitry Gutov
  2017-02-12 17:29                   ` Alan Mackenzie
  2017-02-12 17:49                 ` Stefan Monnier
  2 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-12 15:57 UTC (permalink / raw)
  To: Alan Mackenzie, Stefan Monnier; +Cc: emacs-devel

On 12.02.2017 14:05, Alan Mackenzie wrote:

> That's incoherent.  comment-cache was never intended to help those other
> uses, though it appears it could do so for most of them.  That
> particular flaw we're talking about doesn't appear in comment cache, so
> there's nothing to fix there.

You're changing a low-level primitive to adhere to a non-flexible view 
of the world that is incompatible with syntax-ppss.

> Maybe sometime.  In the meantime, the bug with open parens in column
> zero in comments should be fixed.

If you're willing to give up narrowing support, that bug can be fixed in 
no time, with the 20-line patch we all know about.

> The question of "widening" is not difficult.  Narrowing a buffer should
> not change the syntax of the characters in it.  Doing so leads to
> inconsistencies.

Yeah, you really want narrowing to be interpreted the way that is more 
convenient for your usage habits. I want it to be interpreted that's 
more convenient for the code I've written and ended up maintaining. 
Resolving this conflict requires some thought.

> If I understand correctly, the problem is that multiple-major-mode modes
> are trying to use narrowing to get a null syntactic context.  They are
> trying this because we don't provide anything better.  We should provide
> something better.  I suggested such a something last spring ("islands").

You suggested implementing a big, ambiguously defined feature.

We basically have no way to determine whether it would work out. I've 
spent some time on that discussion helping you narrow down the specs, 
but my personal takeaway is that it's too complex. Maybe I'm too 
unimaginative and lazy, though, so please go ahead and work on a 
prototype if you're confident.

In the meantime, however, we need to keep Emacs compatible with 
multiple-major-mode modes some other way.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 15:57                 ` Dmitry Gutov
@ 2017-02-12 17:29                   ` Alan Mackenzie
  2017-02-12 20:35                     ` Dmitry Gutov
  2017-02-13  1:47                     ` zhanghj
  0 siblings, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-12 17:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel

Hello, Dmitry.

On Sun, Feb 12, 2017 at 17:57:24 +0200, Dmitry Gutov wrote:
> On 12.02.2017 14:05, Alan Mackenzie wrote:

> > That's incoherent.  comment-cache was never intended to help those other
> > uses, though it appears it could do so for most of them.  That
> > particular flaw we're talking about doesn't appear in comment cache, so
> > there's nothing to fix there.

> You're changing a low-level primitive to adhere to a non-flexible view 
> of the world that is incompatible with syntax-ppss.

No.  comment-cache and syntax-ppss are independent of each other and
thus not incompatible.

> > Maybe sometime.  In the meantime, the bug with open parens in column
> > zero in comments should be fixed.

> If you're willing to give up narrowing support, that bug can be fixed in 
> no time, with the 20-line patch we all know about.

I'm not willing to give up narrowing support.  Neither are lots of other
people.  It's a fundamental feature of Emacs, widely used.

> > The question of "widening" is not difficult.  Narrowing a buffer should
> > not change the syntax of the characters in it.  Doing so leads to
> > inconsistencies.

> Yeah, you really want narrowing to be interpreted the way that is more 
> convenient for your usage habits.

I want it to be handled correctly and consistently.

> I want it to be interpreted that's more convenient for the code I've
> written and ended up maintaining.  Resolving this conflict requires
> some thought.

Multiple-major-mode code?  Narrowing is not a good way of doing this,
and I propose a better way.

> > If I understand correctly, the problem is that multiple-major-mode modes
> > are trying to use narrowing to get a null syntactic context.  They are
> > trying this because we don't provide anything better.  We should provide
> > something better.  I suggested such a something last spring ("islands").

> You suggested implementing a big, ambiguously defined feature.

It was big, yes, but reasonably well defined.  What I really meant in my
last paragraph was that the syntax bits of "islands" should be used in
place of what is now done with narrowing.  This would introduce two new
syntax classes "open island" and "close island".  "Open island" would
stack the current syntactic state and start anew, with a new syntax
table.  "Close island" would pop this stack, restoring the previous
state and syntax table.

> We basically have no way to determine whether it would work out. I've 
> spent some time on that discussion helping you narrow down the specs, 
> but my personal takeaway is that it's too complex.

It's not complicated.  It's just big.  It's intention is to be the
simplest possible natural implementation of multiple-buffer-modes which
doesn't need nasty workarounds.

> Maybe I'm too unimaginative and lazy, though, so please go ahead and
> work on a prototype if you're confident.

I'm confident it could work, but there's much more work involved than
I'm capable of doing on my own in a reasonable time.  It would need a
team to work on it.  That's not really the way Emacs gets developed.

> In the meantime, however, we need to keep Emacs compatible with 
> multiple-major-mode modes some other way.

See above.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 12:05               ` Alan Mackenzie
  2017-02-12 13:13                 ` Juanma Barranquero
  2017-02-12 15:57                 ` Dmitry Gutov
@ 2017-02-12 17:49                 ` Stefan Monnier
  2017-02-13 18:09                   ` Alan Mackenzie
  2 siblings, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-12 17:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> I also have an attachment to it because it works, and would save me
                                                 ^^^
                                               for you

> demoralizing work debugging bugs caused by open parens in column zero in
> comments.

Is that really the only reason?  It seems like an amazingly complicated
way to go about it.  Let's see some alternatives:
- set open-paren-in-column-0-is-defun-start to nil.
- add a font-lock rule which highlights open-paren-in-column-0 inside
  comments in bright red.
- use my syntax-ppss-based patch.

> Bug #22983 is a flaw.

Great!  We're trying to have a reasoned argument; I tell you that using
this term to describe this problem is not helping and you insist on
using it.  From where I stand, this qualifies as provocation.

> Also the cache invalidation in syntax-ppss is less than rigorous.

Yup, syntax-ppss's implementation is not perfect.  That can be improved.

> For example, the cache isn't invalidated when syntax-table text
> properties are applied or removed.

This is not a correct characterization of the most common
cache-invalidation problem with syntax-ppss: there's a correlation
between the problem and syntax-table text properties, but that's all: it
also affects all other properties, but it doesn't affect all changes to
the syntax-table text properties.

>> BTW, your comment-cache doesn't fix that "flaw" and hence won't help any
>> of those users of syntax-ppss which can't be changed to use your
>> comment-cache.
> That's incoherent.  comment-cache was never intended to help those other
> uses, though it appears it could do so for most of them.

It's only incoherent if you refuse to see the larger picture.

> Can't be done, as I keep telling you.  comment-cache is solely for
> handling literals.

Then it's useless, AFAIC:
- we need syntax-ppss's data for lots of things.
- your code can't replace all those uses.
- so we're stuck with syntax-ppss, no matter how much you think it sucks.
- then we might as well use it in back_comment instead of your code,
  since it's there and is cheap.

> The question of "widening" is not difficult.  Narrowing a buffer should
> not change the syntax of the characters in it.  Doing so leads to
> inconsistencies.

I can agree with that.  But currently that's not how Emacs behaves, so
it's an incompatible change (which I'm quite willing to make, BTW), and
needs to come with some way to recover the other behavior when that one
is needed.

> If I understand correctly, the problem is that multiple-major-mode modes
> are trying to use narrowing to get a null syntactic context.

That's the typical example, but not the only one.

> They are trying this because we don't provide anything better.
> We should provide something better.

Agreed.

> I suggested such a something last spring ("islands").  If each buffer
> position has an unambiguous syntactic context the question of
> "widening" simply evaporates.

I think this is too specialized to a particular application (multiple
major modes).  We also need to accommodate other cases.  For that we
need to provide something equivalent to

    (save-restriction
      (narrow-to-region BEG END)
      ...)

but where syntax-ppss and friends will know that we shouldn't widen past
BEG/END and that BEG should be taken as "the (temporary) beginning of
the buffer".  Let's call it `with-region-as-subbuffer`.  Most likely,
this new functionality should also make it possible to temporarily
provide a different syntax-table.  Such things are used in various
circumstances where the author simply wants to reuse Emacs's syntax.c
code to avoid writing some ad-hoc parsing.

IOW, we need to provide something on top of which we can build this
`with-region-as-subbuffer` macro as well as your islands.

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 17:29                   ` Alan Mackenzie
@ 2017-02-12 20:35                     ` Dmitry Gutov
  2017-02-13  1:47                     ` zhanghj
  1 sibling, 0 replies; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-12 20:35 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Stefan Monnier, emacs-devel

On 12.02.2017 19:29, Alan Mackenzie wrote:

> comment-cache and syntax-ppss are independent of each other

No, they are not. forward-comment and syntax-ppss are and will be used 
from the same codebases.

>> In the meantime, however, we need to keep Emacs compatible with
>> multiple-major-mode modes some other way.
> 
> See above.

I'm not seeing anything "above" that satisfies the condition of "in the 
meantime".

There's no patch for the islands feature proposed, but you want us to 
accept the comment-cache now-ish.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 17:29                   ` Alan Mackenzie
  2017-02-12 20:35                     ` Dmitry Gutov
@ 2017-02-13  1:47                     ` zhanghj
  2017-02-13  5:50                       ` Stefan Monnier
  1 sibling, 1 reply; 75+ messages in thread
From: zhanghj @ 2017-02-13  1:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: netjune, emacs-devel, Stefan Monnier, Dmitry Gutov

Alan Mackenzie <acm@muc.de> writes:

>
> Multiple-major-mode code?  Narrowing is not a good way of doing this,
> and I propose a better way.
>
>> > If I understand correctly, the problem is that multiple-major-mode modes
>> > are trying to use narrowing to get a null syntactic context.  They are
>> > trying this because we don't provide anything better.  We should provide
>> > something better.  I suggested such a something last spring ("islands").
>
>> You suggested implementing a big, ambiguously defined feature.
>
> It was big, yes, but reasonably well defined.  What I really meant in my
> last paragraph was that the syntax bits of "islands" should be used in
> place of what is now done with narrowing.  This would introduce two new
> syntax classes "open island" and "close island".  "Open island" would
> stack the current syntactic state and start anew, with a new syntax
> table.  "Close island" would pop this stack, restoring the previous
> state and syntax table.
>

How about adding two text properties like island-major-mode and
island-variables? All chars in the same island have the same values of
the two text properties.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  1:47                     ` zhanghj
@ 2017-02-13  5:50                       ` Stefan Monnier
  2017-02-13  6:45                         ` zhanghj
                                           ` (2 more replies)
  0 siblings, 3 replies; 75+ messages in thread
From: Stefan Monnier @ 2017-02-13  5:50 UTC (permalink / raw)
  To: emacs-devel

> How about adding two text properties like island-major-mode and
> island-variables? All chars in the same island have the same values of
> the two text properties.

A multi-major-mode package could use such a strategy, but I don't think
we want to hard-code such a thing directly in font-lock and syntax-ppss.
Instead, we should focus on an intermediate API that syntax-ppss and
font-lock can use on one side and which a new island-mode mmm can use on
the other.

E.g. sgml-mode may want to occasionally treat a tag as "an island"
(i.e. parse it using a special syntax-table and ignoring the surrounding
context), during some internal processing (e.g. within a limited dynamic
scope), but it wouldn't want to have to place text-properties for that:
let-binding vars would be a lot more convenient.

Similarly, it would be a lot more convenient for syntax-ppss to consult
some dynamically-scoped variable to find the "beginning of (sub)buffer",
rather than having to scan text properties.

So, I think something along the lines of prog-indentation-context would
be more appropriate (and an island-mode could still consult
text-properties to then temporarily set some dynamically scoped variable).

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  5:50                       ` Stefan Monnier
@ 2017-02-13  6:45                         ` zhanghj
  2017-02-13  7:24                           ` Stefan Monnier
  2017-02-13 16:14                           ` Drew Adams
  2017-02-13  7:05                         ` zhanghj
  2017-02-13  7:16                         ` zhanghj
  2 siblings, 2 replies; 75+ messages in thread
From: zhanghj @ 2017-02-13  6:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: netjune, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Similarly, it would be a lot more convenient for syntax-ppss to consult
> some dynamically-scoped variable to find the "beginning of (sub)buffer",
> rather than having to scan text properties.
>
Every island may has two varibles like island-begin-marker and
island-end-marker in island-variables. Then we don't need to scan text
property. Just use the two markers to identify the region of the island.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  5:50                       ` Stefan Monnier
  2017-02-13  6:45                         ` zhanghj
@ 2017-02-13  7:05                         ` zhanghj
  2017-02-13  7:16                         ` zhanghj
  2 siblings, 0 replies; 75+ messages in thread
From: zhanghj @ 2017-02-13  7:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: netjune, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> So, I think something along the lines of prog-indentation-context would
> be more appropriate (and an island-mode could still consult
> text-properties to then temporarily set some dynamically scoped variable).
>
Or just add one text property: island-context. Then we can get the
context easily.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  5:50                       ` Stefan Monnier
  2017-02-13  6:45                         ` zhanghj
  2017-02-13  7:05                         ` zhanghj
@ 2017-02-13  7:16                         ` zhanghj
  2017-02-13 14:57                           ` Dmitry Gutov
  2 siblings, 1 reply; 75+ messages in thread
From: zhanghj @ 2017-02-13  7:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: netjune, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> A multi-major-mode package could use such a strategy, but I don't think
> we want to hard-code such a thing directly in font-lock and syntax-ppss.
> Instead, we should focus on an intermediate API that syntax-ppss and
> font-lock can use on one side and which a new island-mode mmm can use on
> the other.
>
If font-lock and syntax-ppss can be sub-buffer/island oriented, mmm-mode
will be easy to write. It just have to setup and maintain the island
related text propertis.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  6:45                         ` zhanghj
@ 2017-02-13  7:24                           ` Stefan Monnier
  2017-02-13  7:59                             ` zhanghj
  2017-02-13 16:14                           ` Drew Adams
  1 sibling, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-13  7:24 UTC (permalink / raw)
  To: emacs-devel

> Every island may has two varibles like island-begin-marker and
> island-end-marker in island-variables.

I don't know what that means concretely.  How can you have "variables"
in `island-variables`?  How would syntax-ppss know which island's
variables to use and find them?


        Stefan




^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  7:24                           ` Stefan Monnier
@ 2017-02-13  7:59                             ` zhanghj
  2017-02-13  9:25                               ` Stefan Monnier
  0 siblings, 1 reply; 75+ messages in thread
From: zhanghj @ 2017-02-13  7:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: netjune, emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> Every island may has two varibles like island-begin-marker and
>> island-end-marker in island-variables.
>
> I don't know what that means concretely.  How can you have "variables"
> in `island-variables`?  How would syntax-ppss know which island's
> variables to use and find them?
>
The island-varibles may be an association list, which contains basic
information, local variables and cache data of the island.



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  7:59                             ` zhanghj
@ 2017-02-13  9:25                               ` Stefan Monnier
  0 siblings, 0 replies; 75+ messages in thread
From: Stefan Monnier @ 2017-02-13  9:25 UTC (permalink / raw)
  To: zhanghj; +Cc: netjune, emacs-devel

> The island-varibles may be an association list, which contains basic
> information, local variables and cache data of the island.

Sorry, I still don't see how that would work.

"Association list" with what kind of keys?  It can't contain variables,
since association lists contain values, not variables (they can contain
symbols, tho, but I'm not sure why you'd want to use such indirections).

How would syntax-ppss know which island's
variables to use and find them?

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13  7:16                         ` zhanghj
@ 2017-02-13 14:57                           ` Dmitry Gutov
  0 siblings, 0 replies; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-13 14:57 UTC (permalink / raw)
  To: zhanghj, Stefan Monnier; +Cc: netjune, emacs-devel

On 13.02.2017 09:16, zhanghj wrote:

> If font-lock and syntax-ppss can be sub-buffer/island oriented, mmm-mode
> will be easy to write.

Unlikely. It will still require the modes to implement their stuff 
"correctly" (and that would still be hard to verify without using them 
in multi-mode context, I believe).

Further, mmm-mode is not just about font-lock and indentation, though 
they are surely a big part of it.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* RE: Bug #25608 and the comment-cache branch
  2017-02-13  6:45                         ` zhanghj
  2017-02-13  7:24                           ` Stefan Monnier
@ 2017-02-13 16:14                           ` Drew Adams
  1 sibling, 0 replies; 75+ messages in thread
From: Drew Adams @ 2017-02-13 16:14 UTC (permalink / raw)
  To: zhanghj, Stefan Monnier; +Cc: netjune, emacs-devel

> > Similarly, it would be a lot more convenient for syntax-ppss to consult
> > some dynamically-scoped variable to find the "beginning of (sub)buffer",
> > rather than having to scan text properties.
> >
> Every island may has two varibles like island-begin-marker and
> island-end-marker in island-variables. Then we don't need to scan text
> property. Just use the two markers to identify the region of the island.

FWIW: This is exactly what I do in `zones.el'.  You can have any
number of such "island" (or zones) variables.  Each is a list of
such marker pairs.

Actually, each pair can have form (ID POSITION1 POSITION2 . EXTRA),
where ID is a natural-number zone identifier, the POSITIONS are
natural numbers, markers for the same buffer, or "readable markers"
for the same buffer.  EXTRA is a list of anything (typically nil).

A "readable marker" is a list (marker BUFFER POSITION), where
BUFFER is a buffer name (string) and POSITION is a buffer position
(number only).  Readable markers let you save zones persistently
(e.g., as bookmarks) and restore them.

https://www.emacswiki.org/emacs/Zones

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-12 17:49                 ` Stefan Monnier
@ 2017-02-13 18:09                   ` Alan Mackenzie
  2017-02-13 19:34                     ` Eli Zaretskii
  2017-02-13 21:21                     ` Stefan Monnier
  0 siblings, 2 replies; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-13 18:09 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Sun, Feb 12, 2017 at 12:49:29 -0500, Stefan Monnier wrote:
> > I also have an attachment to it because it works, and would save me
>                                                  ^^^
>                                                for you

> > demoralizing work debugging bugs caused by open parens in column zero in
> > comments.

> Is that really the only reason?  It seems like an amazingly complicated
> way to go about it.

comment-cache is a gross simplification of syntax.c, scanning comments
only in the forward direction and totally eliminating
open-paren-in-column-0-is-defun-start from syntax.c.  To achieve this
simplification took a lot of work.

> Let's see some alternatives:
> - set open-paren-in-column-0-is-defun-start to nil.

Too slow.

> - add a font-lock rule which highlights open-paren-in-column-0 inside
>   comments in bright red.

Totally irrelevant to what's required.  By the way, did you ever get
around to looking at that bug which noted that mechanism no longer
working?

> - use my syntax-ppss-based patch.

Doesn't work all the time, in particular in narrowed buffers.

> > Bug #22983 is a flaw.

> Great!  We're trying to have a reasoned argument; I tell you that using
> this term to describe this problem is not helping and you insist on
> using it.  From where I stand, this qualifies as provocation.

Of all the words in English which mean "imperfection", that's one of the
milder ones.  Would you prefer I used "defect" or "fault" or something
else?  #22983 _is_ a flaw/defect/fault/whatever else you want to call
it.  In particular, syntax-ppss gives out different results for a buffer
position depending on its internal state.

If that word would provoke you into actually fixing #22983, after all
this time, I would use it again for that purpose.

> > Also the cache invalidation in syntax-ppss is less than rigorous.

> Yup, syntax-ppss's implementation is not perfect.  That can be improved.

> > For example, the cache isn't invalidated when syntax-table text
> > properties are applied or removed.

> This is not a correct characterization of the most common
> cache-invalidation problem with syntax-ppss: there's a correlation
> between the problem and syntax-table text properties, but that's all: it
> also affects all other properties, but it doesn't affect all changes to
> the syntax-table text properties.

syntax-ppss is a cache of the syntax-table text property.  Not
invalidating the cache when a syntax-table text property is changed is
an imperfection.  Will it be fixed?

By contrast, there are no known bugs in the cacheing in comment-cache.

> >> BTW, your comment-cache doesn't fix that "flaw" and hence won't help any
> >> of those users of syntax-ppss which can't be changed to use your
> >> comment-cache.
> > That's incoherent.  comment-cache was never intended to help those other
> > uses, though it appears it could do so for most of them.

> It's only incoherent if you refuse to see the larger picture.

The larger picture is that comment-cache can work alongside syntax-ppss
pefectly happily without any contention.

> > Can't be done, as I keep telling you.  comment-cache is solely for
> > handling literals.

> Then it's useless, AFAIC:
> - we need syntax-ppss's data for lots of things.
> - your code can't replace all those uses.
> - so we're stuck with syntax-ppss, no matter how much you think it sucks.
> - then we might as well use it in back_comment instead of your code,
>   since it's there and is cheap.

But it doesn't work properly.  comment-cache is also cheap, having
already been written and debugged.

> > The question of "widening" is not difficult.  Narrowing a buffer should
> > not change the syntax of the characters in it.  Doing so leads to
> > inconsistencies.

> I can agree with that.  But currently that's not how Emacs behaves, so
> it's an incompatible change (which I'm quite willing to make, BTW), and
> needs to come with some way to recover the other behavior when that one
> is needed.

> > If I understand correctly, the problem is that multiple-major-mode modes
> > are trying to use narrowing to get a null syntactic context.

> That's the typical example, but not the only one.

> > They are trying this because we don't provide anything better.
> > We should provide something better.

> Agreed.

> > I suggested such a something last spring ("islands").  If each buffer
> > position has an unambiguous syntactic context the question of
> > "widening" simply evaporates.

> I think this is too specialized to a particular application (multiple
> major modes).  We also need to accommodate other cases.

Could you identify these other cases?

> For that we need to provide something equivalent to

>     (save-restriction
>       (narrow-to-region BEG END)
>       ...)

> but where syntax-ppss and friends will know that we shouldn't widen past
> BEG/END ....

I thorougly dislike the conceptualization of handling syntax as
"widening".  Both the Lisp and C parts of Emacs use narrowing and
widening "all the time", and if we try to express semantics in terms of
"widening" and "narrowing" we're going to create confusion.

> .... and that BEG should be taken as "the (temporary) beginning of the
> buffer".  Let's call it `with-region-as-subbuffer`.  Most likely, this
> new functionality should also make it possible to temporarily provide
> a different syntax-table.  Such things are used in various
> circumstances where the author simply wants to reuse Emacs's syntax.c
> code to avoid writing some ad-hoc parsing.

> IOW, we need to provide something on top of which we can build this
> `with-region-as-subbuffer` macro as well as your islands.

Introducing the new syntactic symbols "island start/end" would cater for
with-region-as-subbuffer admirably, without having to resort to
confusing narrowing.  Every buffer position would continue to have its
unique (global) syntactic context.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13 18:09                   ` Alan Mackenzie
@ 2017-02-13 19:34                     ` Eli Zaretskii
  2017-02-13 21:21                     ` Stefan Monnier
  1 sibling, 0 replies; 75+ messages in thread
From: Eli Zaretskii @ 2017-02-13 19:34 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

> Date: Mon, 13 Feb 2017 18:09:19 +0000
> From: Alan Mackenzie <acm@muc.de>
> Cc: emacs-devel@gnu.org
> 
> Both the Lisp and C parts of Emacs use narrowing and widening "all
> the time"

Are you sure?  In C code, I see exactly 2 calls to 'widen' and 4 calls
to 'narrow-to-region', something that doesn't really fit "all the time"
description.  (Most of the above few calls are to support auto-saving
and process-filters, and the rest are in set-buffer-multibyte -- it
should be clear to anyone that all of the above must override any
narrowing.)

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-13 18:09                   ` Alan Mackenzie
  2017-02-13 19:34                     ` Eli Zaretskii
@ 2017-02-13 21:21                     ` Stefan Monnier
  1 sibling, 0 replies; 75+ messages in thread
From: Stefan Monnier @ 2017-02-13 21:21 UTC (permalink / raw)
  To: emacs-devel

> comment-cache is a gross simplification of syntax.c, scanning comments
> only in the forward direction and totally eliminating
> open-paren-in-column-0-is-defun-start from syntax.c.

BTW, let me help you: eliminating open-paren-in-column-0-is-defun-start
is not terribly important and doesn't justify all those changes (my
syntax-ppss-based patch does that in a much simpler way).

The real upside to your code is the elimination of back_comment.
If I were you, that's how I'd try to sell it.

> To achieve this simplification took a lot of work.

I don't doubt it.  syntax-ppss OTOH didn't take much work at all.

>> - add a font-lock rule which highlights open-paren-in-column-0 inside
>> comments in bright red.
> Totally irrelevant to what's required.

Very relevant to:

    demoralizing work debugging bugs caused by open parens in column
    zero in comments.

since even if that highlighting is something personal in your ~/.emacs
that should make such debugging trivial, since you can easily arrange to
burp very loudly as soon as such an open paren occurs, so when you get
a bug report about it, as soon as you try to reproduce the problem with
the OP's file your hack will scream bloody murder and your debugging
will be immediately done for you.

> By the way, did you ever get around to looking at that bug which noted
> that mechanism no longer working?

No, I became more interested in using that syntax-ppss-based patch to
get rid of open-paren-in-column-0-is-defun-start in syntax.c ;-)

>> - use my syntax-ppss-based patch.
> Doesn't work all the time, in particular in narrowed buffers.

Works just fine in narrowed buffers for me, and I can't remember any bug
report about it other than yours so the problem doesn't seem nearly as
serious as you make it out to be.

BTW, your code also breaks down miserably in some narrowed buffers
(i.e. in those narrowed buffers where the narrowing semantics expected
is not the one you decided is The Only Choice).

> Of all the words in English which mean "imperfection", that's one of the
> milder ones.  Would you prefer I used "defect" or "fault" or something
> else?

How 'bout "bug"?

> If that word would provoke you into actually fixing #22983, after all
> this time, I would use it again for that purpose.

As you know, fixing this depends on figuring out how to solve the
narrowing-semantics issue.  Once a solution is chosen, fixing the bug
will be very easy.

> syntax-ppss is a cache of the syntax-table text property.
> Not invalidating the cache when a syntax-table text property is changed is
> an imperfection.  Will it be fixed?

Patch welcome.  The lack of bug-reports about it makes it a rather
low-priority issue.

>> >> BTW, your comment-cache doesn't fix that "flaw" and hence won't help any
>> >> of those users of syntax-ppss which can't be changed to use your
>> >> comment-cache.
>> > That's incoherent.  comment-cache was never intended to help those other
>> > uses, though it appears it could do so for most of them.
>> It's only incoherent if you refuse to see the larger picture.
> The larger picture is that comment-cache can work alongside syntax-ppss
> pefectly happily without any contention.

Looks like you still haven't seen the larger picture.

>> > Can't be done, as I keep telling you.  comment-cache is solely for
>> > handling literals.
>> Then it's useless, AFAIC:
>> - we need syntax-ppss's data for lots of things.
>> - your code can't replace all those uses.
>> - so we're stuck with syntax-ppss, no matter how much you think it sucks.
>> - then we might as well use it in back_comment instead of your code,
>> since it's there and is cheap.
> But it doesn't work properly.  comment-cache is also cheap, having
> already been written and debugged.

Which part of "comment-cache is solely for handling literals" don't
you understand?

>> For that we need to provide something equivalent to
>>
>> (save-restriction
>> (narrow-to-region BEG END)
>> ...)
>>
>> but where syntax-ppss and friends will know that we shouldn't widen past
>> BEG/END ....
> I thorougly dislike the conceptualization of handling syntax as
> "widening".

Yes, sorry.  Rather than "widen" above, you should read "look".

> Introducing the new syntactic symbols "island start/end" would cater for
> with-region-as-subbuffer admirably, without having to resort to
> confusing narrowing.

What about those cases where the "subbuffer region" on which the
code wants to operate shouldn't have any special syntax-table properties
in general (the code just wants to temporarily operate on it in
a particular way).

E.g. some minor-mode may want to treat some C strings as "sub-buffer
written in some other programming language" for some specific commands,
while still keeping them as "plain old strings" in general.
If I understand correctly your suggestion of islands markers, the minor
mode would have to add those markers temporarily, then operate on the
sub-buffer and then remove them.  And along the way this causes the
whole comment-cache/syntax-ppss/font-lock state to be flushed because of
the temporary change.

In contrast, currently that minor mode could do

    (save-restriction
      (narrow-to-region BEG END)
      (with-syntax-table other-mode-syntax-table)
      ...)

which doesn't need to modify the buffer at all, and hence doesn't
invalidate any cache.

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-07 19:21                   ` Alan Mackenzie
@ 2017-02-14 15:28                     ` Dmitry Gutov
  2017-02-14 16:38                       ` Stefan Monnier
  2017-02-14 21:14                       ` Alan Mackenzie
  0 siblings, 2 replies; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-14 15:28 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Eli Zaretskii, emacs-devel

Hi Alan,

On 07.02.2017 21:21, Alan Mackenzie wrote:

>> How come the "alternative patch" works well, then?
> 
> Well, aside from the fact that it doesn't (IMAO), it is only consulted
> relatively rarely, in certain cases of back_coment where the backward
> scanning hits something it doesn't want to handle.

What is "it"? I would imagine that to be sure that point is not e.g. 
inside a string, the patch would have to consult the cache (or call 
syntax-ppss) at least once per forward-comment call.

 From there, I don't really see a real need for backward comment 
scanning. So if you rewrote some code to use forward scanning, that 
approach should be applicable on top of the AP as well.

>>> There's no sign of syntax-ppss being fixed.  Bug #22983 has been open
>>> for almost a year, and despite repeated requests from me, there has been
>>> no movement on it.
> 
>> You didn't show any enthusiasm about the initial proposed fix, which was
>> rather simple. Now we've had more discussions, and the bar for a
>> solution has been raised. I'm thinking about it again. Let's not give up.
> 
> I wasn't enthusiastic about your proposed fix because I found it ugly.

Thank you for clearing that up.

> That sounds like you've decided you want to use syntax-ppss no matter
> what, and the bugs this will cause will just be relabeled as features.
> As I've said before, the aim should be for back_comment always to work.

More importantly, I want to keep as much logic in Lisp as feasible, 
which is the currently recommended approach anyway.

Problems like this could be solved in different ways without avoiding 
that goal. We can provide new faster primitives if manipulating some 
data structure in Lisp is not enough (but we need benchmarks first, and 
so far, speed is not a problem). We can also add new hooks if 
before-change-functions is not up to snuff.

>> Tracking the used syntax table is also a problem which we need to solve
>> for syntax-ppss. A good design could handle it and narrowing together.
> 
> You should now be able to see why I dislike syntax-ppss so much.  As
> well as being incompatible with narrowing (which should be sort of
> fixable), there is an essential lack of cache invalidating (which would
> only be fixable by a radically different design).

Why wouldn't it be fixable with a moderate change in design? The problem 
you are referring to (which is almost entirely theoretical at this 
point, in the absence of bug reports) is cause by syntax-ppss usage 
inside with-silent-modifications.

>> And it does that in a pretty inflexible way.
> 
> It works.  Other ways (apart from M-nil (master with
> open-paren-in-column-0-is-defun-start set to nil)) don't.  The sort of
> flexibility I recall you wanting is simply not possible in
> comment-cache,

Why isn't it? It could adhere to narrowing bounds, or not, just as well 
as syntax-ppss. The problems with cache invalidation when narrowing 
changes should be very similar.

> It's differently complicated.  master's back_comment, which attempts to
> scan comments backwards is more complicated than comment-cache's
> back_comment (including its cacheing logic).

Ideally we'd have the best of both worlds, of course. Like mentioned 
above, I see no hard need for backward scanning anymore.

>> Yes, it is worse. You have more code to debug. And comment-cache adds
>> quite a bit of code.
> 
> How have you quantified "quite a bit"?

771 insertions(+), 402 deletions? Admittedly, this is not a lot by C 
standards.

> There is nothing to indicate you've even looked at comment-cache.  All

I've looked at it now. Since it's implemented in C, I have little 
ability to judge the quality of the code, or the low-level nuances.

And yet, I've managed to provide coherent comments, haven't I?

> the criticisms you've made have been from a distance, based on rumour
> (even if the source of that rumour has been me).

Discussing design on a high level is a normal practice, and we often do 
so even when the code is available, in the interest of saving time.

> I repeat, you want comment-cache to be
> wholly abandoned, apparently because you like syntax-ppss so much.  The
> alternative "recommended" approach has documented deficiencies, yet you
> still advocate it.

Both approaches have documented deficiencies.

>> So the "speed up forward-comment" patch would still come out to 20 lines.
> 
> Well, if you get a decent bug fix involving, say a 700 line patch which
> includes those 20 lines, I suppose you could still call it a 20 line
> patch, somehow.

Even if that takes 700 lines, in the end it will be 700 + 20 lines 
versus 700 + 370 lines that comment-cache takes.

>>> It would also likely be much slower.
> 
>> I wouldn't be so sure. A syntax table comparison, for instance, would be
>> pretty cheap compared to what syntax-ppss does already.
> 
> Full syntax-table comparisons are slow, even when written in C.

Really? How do you quantify that? In c++-mode,

(benchmark 1000 '(equal (syntax-table) (syntax-table)))

outputs "Elapsed time: 0.004712s". Which is an order of magnitude less 
than (benchmark 1000 '(syntax-ppss)) outputs, in an empty buffer with a 
warmed-up cache.

> I tried
> it back in December.  CC Mode regularly switches syntax-tables.  My
> usual time-scroll function on xdisp.c ran at about half the speed when a
> comparison was done at every set-syntax-table.  The results had to be
> cached, after which it ran at normal speed again.

That doesn't tell me a lot, unfortunately. Maybe it was a design 
problem, e.g. invalidating cache eagerly and too often, instead of doing 
it lazily like syntax-ppss does.

Although CC Mode would have to change syntax tables a lot, for it to 
even show up on the radar.

It's possible that your "compare syntax tables" routine does a lot, of 
course. But if we really need that kind of fuzzy comparison, we can 
implement that function in C and export for using in Lisp.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-14 15:28                     ` Dmitry Gutov
@ 2017-02-14 16:38                       ` Stefan Monnier
  2017-02-22  2:25                         ` Dmitry Gutov
  2017-02-14 21:14                       ` Alan Mackenzie
  1 sibling, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-14 16:38 UTC (permalink / raw)
  To: emacs-devel

> What is "it"? I would imagine that to be sure that point is not
> e.g. inside a string, the patch would have to consult the cache (or call
> syntax-ppss) at least once per forward-comment call.

Like all the sexp movement functions, `forward-comment` is allowed to
assume that the starting position is outside of comments/strings, so it
doesn't need to consult the cache to see if it's within a string.

In the case we do scan forward (e.g. the case where we end up using
parse-partial-sexp (or syntax-ppss in my patch)), we actually manually
re-introduce that behavior: if the forward parse says that the
end-comment-marker in inside a string (or inside another comment), we
re-parse from the beginning of that string (or comment) to try and see
if that end-comment-marker could be considered to close a comment nested
within the string (or the other comment).

> From there, I don't really see a real need for backward comment scanning.
> So if you rewrote some code to use forward scanning, that approach should be
> applicable on top of the AP as well.

Calling syntax-ppss every time back_comment is invoked would probably
result in bad performance currently: when parsing backward
(e.g. backward-sexp), the syntax-ppss-last optimization is ineffective,
so we'd fallback on syntax-ppss-cache which ends up scanning on the
average syntax-ppss-max-span/2 (i.e. 10K) chars.  When \n is a comment
ender (i.e. in most programming language modes), it would imply
a forward scan of 10K for every line.

IOW, for such an approach to work, we'd have to rework syntax-ppss to be
faster when scanning backward (e.g. reduce syntax-ppss-max-span, which
would have other repercussions).

>> That sounds like you've decided you want to use syntax-ppss no matter

There's no alternative to syntax-ppss on the table, AFAIK.

>> what, and the bugs this will cause will just be relabeled as features.

Care to back up this claim?

Where/when have we claimed that what you consider as a syntax-ppss bug
is a feature?

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-14 15:28                     ` Dmitry Gutov
  2017-02-14 16:38                       ` Stefan Monnier
@ 2017-02-14 21:14                       ` Alan Mackenzie
  2017-02-16 14:10                         ` Stefan Monnier
  1 sibling, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-14 21:14 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

Hello, Dmitry.

On Tue, Feb 14, 2017 at 17:28:58 +0200, Dmitry Gutov wrote:
> Hi Alan,

> On 07.02.2017 21:21, Alan Mackenzie wrote:

> >> How come the "alternative patch" works well, then?

> > Well, aside from the fact that it doesn't (IMAO), it is only consulted
> > relatively rarely, in certain cases of back_coment where the backward
> > scanning hits something it doesn't want to handle.

> What is "it"?

Respectively, the "alternative patch", the syntax-ppss cache mechanism,
and the backward scanning, in the three uses in that sentence.  Sorry it
wasn't clear.

> I would imagine that to be sure that point is not e.g.  inside a
> string, the patch would have to consult the cache (or call syntax-ppss)
> at least once per forward-comment call.

Indeed.

>  From there, I don't really see a real need for backward comment 
> scanning. So if you rewrote some code to use forward scanning, that 
> approach should be applicable on top of the AP as well.

The back_comment function needs to use backward scanning unless it has a
robust enough and fast enough cache giving it the literality at any
point.

[ .... ]

> More importantly, I want to keep as much logic in Lisp as feasible, 
> which is the currently recommended approach anyway.

I sometimes think you might be trying to keep more in Lisp than is
feasible.

> Problems like this could be solved in different ways without avoiding 
> that goal. We can provide new faster primitives if manipulating some 
> data structure in Lisp is not enough (but we need benchmarks first, and 
> so far, speed is not a problem). We can also add new hooks if 
> before-change-functions is not up to snuff.

In other words, implementing new logic in C.  One thing which cannot be
done in lisp (without new facilities in C) is invalidating the cache when
syntax-table text properties are applied and removed (which is always
done when the change hooks are inhibited).  You can do it directly in C,
you can write new facilities in C to allow it to be done in lisp, or you
can pretend it doesn't need doing.  comment-cache takes the first
approach, syntax-ppss takes the last at the moment.

> >> Tracking the used syntax table is also a problem which we need to solve
> >> for syntax-ppss. A good design could handle it and narrowing together.

> > You should now be able to see why I dislike syntax-ppss so much.  As
> > well as being incompatible with narrowing (which should be sort of
> > fixable), there is an essential lack of cache invalidating (which would
> > only be fixable by a radically different design).

> Why wouldn't it be fixable with a moderate change in design? The problem 
> you are referring to (which is almost entirely theoretical at this 
> point, in the absence of bug reports) ....

Here I disagree with you and Stephan profoundly - A flaw is a flaw
whether or not it has yet provoked a bug report.  And just because
something hasn't yet had a bug report on it doesn't mean it's OK.  If we
see a way non-rigorous coding _can_ lead to a bug, then that is a bug and
needs fixing, particularly if it's in a primitive.

> .... is caused by syntax-ppss usage inside with-silent-modifications.

Yes.

> > It's differently complicated.  master's back_comment, which attempts to
> > scan comments backwards is more complicated than comment-cache's
> > back_comment (including its cacheing logic).

> Ideally we'd have the best of both worlds, of course. Like mentioned 
> above, I see no hard need for backward scanning anymore.

But for reasonable execution speeds you either need backward scanning of
comments, or comment-cache (or something like it).

> >> Yes, it is worse. You have more code to debug. And comment-cache adds
> >> quite a bit of code.

> > How have you quantified "quite a bit"?

> 771 insertions(+), 402 deletions? Admittedly, this is not a lot by C 
> standards.

I don't think it is, either.  A good deal of that is the wholesale
replacement of back_comment with the simpler new version.

> > There is nothing to indicate you've even looked at comment-cache.  All

> I've looked at it now. Since it's implemented in C, I have little 
> ability to judge the quality of the code, or the low-level nuances.

> And yet, I've managed to provide coherent comments, haven't I?

You have, yes.  Thanks.

[ .... ]

> >> So the "speed up forward-comment" patch would still come out to 20 lines.

> > Well, if you get a decent bug fix involving, say a 700 line patch which
> > includes those 20 lines, I suppose you could still call it a 20 line
> > patch, somehow.

> Even if that takes 700 lines, in the end it will be 700 + 20 lines 
> versus 700 + 370 lines that comment-cache takes.

I think it is the result that counts, not the number of changed lines.

> >>> It would also likely be much slower.

> >> I wouldn't be so sure. A syntax table comparison, for instance, would be
> >> pretty cheap compared to what syntax-ppss does already.

> > Full syntax-table comparisons are slow, even when written in C.

> Really? How do you quantify that? In c++-mode,

> (benchmark 1000 '(equal (syntax-table) (syntax-table)))

> outputs "Elapsed time: 0.004712s". Which is an order of magnitude less 
> than (benchmark 1000 '(syntax-ppss)) outputs, in an empty buffer with a 
> warmed-up cache.

> > I tried
> > it back in December.  CC Mode regularly switches syntax-tables.  My
> > usual time-scroll function on xdisp.c ran at about half the speed when a
> > comparison was done at every set-syntax-table.  The results had to be
> > cached, after which it ran at normal speed again.

> That doesn't tell me a lot, unfortunately. Maybe it was a design 
> problem, e.g. invalidating cache eagerly and too often, instead of doing 
> it lazily like syntax-ppss does.

It was a case of seeing if two distinct syntax tables were "the same"
from the point of view of literals.  In other words, they could parse
parentheses, whitespace and so on however they liked, but comments and
strings had to be parsed identically by both tables for them to count
"the same".

This is an instance where syntax-ppss's ambitions count against it - on
any set-syntax-table syntax-ppss's caches should really be cleared,
strictly speaking.  But they're less effective as caches if this is done.
Perhaps the major mode should give its input.

> Although CC Mode would have to change syntax tables a lot, for it to 
> even show up on the radar.

It does.  For example, there's a `c-make-no-parens-syntax-table' in which
parens/braces/brackets are not paren characters, used for parsing
template/generic delimiters in C++ and Java.

> It's possible that your "compare syntax tables" routine does a lot, of 
> course. But if we really need that kind of fuzzy comparison, we can 
> implement that function in C and export for using in Lisp.

I think that's what I did.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-14 21:14                       ` Alan Mackenzie
@ 2017-02-16 14:10                         ` Stefan Monnier
  2017-02-18 10:44                           ` Alan Mackenzie
  0 siblings, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-16 14:10 UTC (permalink / raw)
  To: emacs-devel

> It was a case of seeing if two distinct syntax tables were "the same"
> from the point of view of literals.  In other words, they could parse
> parentheses, whitespace and so on however they liked, but comments and
> strings had to be parsed identically by both tables for them to count
> "the same".

Interesting.  Indeed, given that syntax-ppss has to pay attention to
more than comments and strings, equivalence between syntax-tables is
never something I considered.

> This is an instance where syntax-ppss's ambitions count against it - on
> any set-syntax-table syntax-ppss's caches should really be cleared,
> strictly speaking.

As you know, syntax-ppss's caching is fairly naive currently and doesn't
make enough checks to give correct results in some cases.  Changes in
the syntax-tables and in point-min being two examples discussed here.

I already suggested to fix the issue w.r.t point-min by replacing
syntax-ppss-cache with a table indexed by the value of point-min.
The same idea could be used for syntax-tables.  I.e. make
syntax-ppss-cache indexed by the combination of syntax-table and
point-min.

Another option is to provide a `with-temp-syntactic-context` macro,
which would locally bind syntax-ppss-cache to nil.  So code which needs
to temporarily use a different point-min and/or syntax-table for some
parsing&navigation work could use this macro to avoid being affected by
the normal cache as well as polluting the cache.

I use this approach of let-binding syntax-ppss-cache is sm-c-mode, for
example (and yes: it's a dirty hack since sm-c-mode shouldn't mess with
syntax-ppss's internals).

Which approach is best depends on the use: If that same syntax-table
will be reused many times (so caching between uses would be beneficial),
then indexing by syntax-table in syntax-ppss-cache is likely the better
choice, otherwise with-temp-syntactic-context is probably all you need.

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-16 14:10                         ` Stefan Monnier
@ 2017-02-18 10:44                           ` Alan Mackenzie
  2017-02-18 13:49                             ` Stefan Monnier
  0 siblings, 1 reply; 75+ messages in thread
From: Alan Mackenzie @ 2017-02-18 10:44 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hello, Stefan.

On Thu, Feb 16, 2017 at 09:10:44 -0500, Stefan Monnier wrote:
> > It was a case of seeing if two distinct syntax tables were "the same"
> > from the point of view of literals.  In other words, they could parse
> > parentheses, whitespace and so on however they liked, but comments and
> > strings had to be parsed identically by both tables for them to count
> > "the same".

> Interesting.  Indeed, given that syntax-ppss has to pay attention to
> more than comments and strings, equivalence between syntax-tables is
> never something I considered.

For syntax-ppss, two syntax tables are either `equal' or not.  There's
probably no other useful standard of equivalence here.

> > This is an instance where syntax-ppss's ambitions count against it - on
> > any set-syntax-table syntax-ppss's caches should really be cleared,
> > strictly speaking.

> As you know, syntax-ppss's caching is fairly naive currently and doesn't
> make enough checks to give correct results in some cases.  Changes in
> the syntax-tables and in point-min being two examples discussed here.

Another example is modify-syntax-entry, though this is surely less
important, since it will almost always be done at initialisation only.
Zapping the syntax-ppss cache is probably a good way of handling it.

> I already suggested to fix the issue w.r.t point-min by replacing
> syntax-ppss-cache with a table indexed by the value of point-min.
> The same idea could be used for syntax-tables.  I.e. make
> syntax-ppss-cache indexed by the combination of syntax-table and
> point-min.

We'd need to be careful not to fill up too much RAM with these caches,
particularly for different values of point-min.

> Another option is to provide a `with-temp-syntactic-context` macro,
> which would locally bind syntax-ppss-cache to nil.  So code which needs
> to temporarily use a different point-min and/or syntax-table for some
> parsing&navigation work could use this macro to avoid being affected by
> the normal cache as well as polluting the cache.

I'm not too keen on the "using a different point-min for some parsing"
bit.  I suggest, again, using island-start and island-end syntactic
markers (these optionally supply a different syntax table).  These would
enable things like temporarily "narrowing to (what looks like) a
comment" and permanently marking a region as an island (e.g. for
multiple major modes), yet the syntax at any position would be rigorous
and unique throughout the buffer.

> I use this approach of let-binding syntax-ppss-cache in sm-c-mode, for
> example (and yes: it's a dirty hack since sm-c-mode shouldn't mess with
> syntax-ppss's internals).

> Which approach is best depends on the use: If that same syntax-table
> will be reused many times (so caching between uses would be beneficial),
> then indexing by syntax-table in syntax-ppss-cache is likely the better
> choice, otherwise with-temp-syntactic-context is probably all you need.


>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-18 10:44                           ` Alan Mackenzie
@ 2017-02-18 13:49                             ` Stefan Monnier
  0 siblings, 0 replies; 75+ messages in thread
From: Stefan Monnier @ 2017-02-18 13:49 UTC (permalink / raw)
  To: emacs-devel

> For syntax-ppss, two syntax tables are either `equal' or not.  There's
> probably no other useful standard of equivalence here.

You can also ignore difference between word/symbol/whitespace, I guess.
That would properly handle the common situation where the syntax-table
is changed during in font-lock to make all symbol-syntax chars into
word-syntax chars.
But I'm far from convinced it's worth the trouble.

>> I already suggested to fix the issue w.r.t point-min by replacing
>> syntax-ppss-cache with a table indexed by the value of point-min.
>> The same idea could be used for syntax-tables.  I.e. make
>> syntax-ppss-cache indexed by the combination of syntax-table and
>> point-min.
> We'd need to be careful not to fill up too much RAM with these caches,
> particularly for different values of point-min.

Given that it's flushed past any buffer modification and is only filled
lazily I'm not too worried.

Additionally, for multi-major-mode uses, the various "branches" of the
cache (each with a different point-min and syntax-table) would probably
end up with no overlap at all, so it wouldn't take up more space than now.
And with `with-temp-syntactic-context` the "other cache" would only
live temporarily.

> I'm not too keen on the "using a different point-min for some parsing"
> bit.  I suggest, again, using island-start and island-end syntactic

I say `point-min` because that's what we currently have.  What I mean by
that is "the logical beginning of the (sub)buffer".  So it could be
island-start, or prog-indentation-context, or ...

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-14 16:38                       ` Stefan Monnier
@ 2017-02-22  2:25                         ` Dmitry Gutov
  2017-02-22  3:53                           ` Stefan Monnier
  0 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-22  2:25 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

On 14.02.2017 18:38, Stefan Monnier wrote:

> Like all the sexp movement functions, `forward-comment` is allowed to
> assume that the starting position is outside of comments/strings, so it
> doesn't need to consult the cache to see if it's within a string.

I see, thanks. And I think that means that, ideally, it would work 
without the caller having to adjust the syntax visibility bounds, or the 
like, as long as the syntax table is correct and the beginning (or the 
end) of the currently navigated comment is within view.

> In the case we do scan forward (e.g. the case where we end up using
> parse-partial-sexp (or syntax-ppss in my patch)), we actually manually
> re-introduce that behavior: if the forward parse says that the
> end-comment-marker in inside a string (or inside another comment), we
> re-parse from the beginning of that string (or comment) to try and see
> if that end-comment-marker could be considered to close a comment nested
> within the string (or the other comment).

That indeed sounds complex.

> Calling syntax-ppss every time back_comment is invoked would probably
> result in bad performance currently: when parsing backward
> (e.g. backward-sexp), the syntax-ppss-last optimization is ineffective,
> so we'd fallback on syntax-ppss-cache which ends up scanning on the
> average syntax-ppss-max-span/2 (i.e. 10K) chars.  When \n is a comment
> ender (i.e. in most programming language modes), it would imply
> a forward scan of 10K for every line.

You're probably right, but I wonder what the benchmarks would say.

(parse-partial-sexp 1 10000) takes 0.0005 seconds here, so it'd still 
require some intensive usage to show up on user's radar.

Previously, we started from the beginning of the current defun, as 
delineated by an open paren in the first column, right?

I've seen function definitions longer than 10000 chars.

> IOW, for such an approach to work, we'd have to rework syntax-ppss to be
> faster when scanning backward (e.g. reduce syntax-ppss-max-span, which
> would have other repercussions).

Perhaps we could use the "generic comment bounds" syntax-table property 
to delineate such difficult comments. If that idea sounds similar to 
comment-cache, that is no accident.

But we should try to limit the incompatibility with mixed modes by only 
caching the beginnings of comments which contain strings, nested 
comments, etc. Better suggestion welcome (use a tree data structure 
instead of in-buffer text-properties?).

I've only recently come to the realization that our usage of the 
syntax-table text property has the same general incompatibility with 
mixed mode buffers as comment-cache does. The only reasons why it 
doesn't show as much is because we use them relatively rarely. But we 
couldn't, for instance, apply a "generic string" syntax to some literal 
in a subregion that is inside a "generic string" belonging to the 
primary major mode. Not sure what to do about that.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-22  2:25                         ` Dmitry Gutov
@ 2017-02-22  3:53                           ` Stefan Monnier
  2017-02-23 14:23                             ` Dmitry Gutov
  0 siblings, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-22  3:53 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> I see, thanks. And I think that means that, ideally, it would work without
> the caller having to adjust the syntax visibility bounds, or the like, as
> long as the syntax table is correct and the beginning (or the end) of the
> currently navigated comment is within view.

Right, but not reliably so: very often we need to parse backward not
just until the matching starter but until the previous closer (to make
sure the starter we saw was not itself within an earlier comment), and
in other cases the mix of comment markers and string markers make it
impossible to guess if we were really inside a comment, so we end up
falling back on the forward-parse code.

>> In the case we do scan forward (e.g. the case where we end up using
>> parse-partial-sexp (or syntax-ppss in my patch)), we actually manually
>> re-introduce that behavior: if the forward parse says that the
>> end-comment-marker in inside a string (or inside another comment), we
>> re-parse from the beginning of that string (or comment) to try and see
>> if that end-comment-marker could be considered to close a comment nested
>> within the string (or the other comment).
> That indeed sounds complex.

Actually, it's very straightforward: the forward parse already gives us
the beginning of the surrounding element, so we just re-do the forward
parse from that spot.  It's just a matter of wrapping the code inside
a loop.

>> Calling syntax-ppss every time back_comment is invoked would probably
>> result in bad performance currently: when parsing backward
>> (e.g. backward-sexp), the syntax-ppss-last optimization is ineffective,
>> so we'd fallback on syntax-ppss-cache which ends up scanning on the
>> average syntax-ppss-max-span/2 (i.e. 10K) chars.  When \n is a comment
>> ender (i.e. in most programming language modes), it would imply
>> a forward scan of 10K for every line.

> You're probably right, but I wonder what the benchmarks would say.

> (parse-partial-sexp 1 10000) takes 0.0005 seconds here, so it'd still
> require some intensive usage to show up on user's radar.

> Previously, we started from the beginning of the current defun, as
> delineated by an open paren in the first column, right?

No.  "Previously", we typically scan the line backward and stop as soon
as we hit the previous \n (which tells us that no comment can start
earlier than that if it finishes with a \n).

In a few cases, we do fallback on the forward parse code, in which case
indeed we'll take longer, but those are normally rare (which is why this
comment-cache and my syntax-ppss-patch haven't been installed yet: they
improve the performance of a case that's somewhat infrequent).

> Perhaps we could use the "generic comment bounds" syntax-table property to
> delineate such difficult comments. If that idea sounds similar to
> comment-cache, that is no accident.

Maybe.  Obviously, my syntax-ppss hammer makes me think that such
alternate solutions aren't needed: syntax-ppss solves this case without
having to try and come out with a clever way to detect which comments
are tricky nor how to mark them.

> I've only recently come to the realization that our usage of the
> syntax-table text property has the same general incompatibility with mixed
> mode buffers as comment-cache does. The only reasons why it doesn't show as
> much is because we use them relatively rarely. But we couldn't, for
> instance, apply a "generic string" syntax to some literal in a subregion
> that is inside a "generic string" belonging to the primary major mode.

Indeed.

> Not sure what to do about that.

Not completely sure either.  I've had vague ideas of adding some kind of
hook to syntax-tables, i.e. add a new kind of syntax element which ends
up calling an Elisp function of your choice so you can make it "do the
right thing" for the particular construct.

So when scanning (forward or backward), if we bump into an element with
that syntax (typically applied as a syntax-table text-property), we call
the function which will know how to jump over the sub-region or will
signal an "end of sub-region" error.

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-22  3:53                           ` Stefan Monnier
@ 2017-02-23 14:23                             ` Dmitry Gutov
  2017-02-23 14:48                               ` Stefan Monnier
  0 siblings, 1 reply; 75+ messages in thread
From: Dmitry Gutov @ 2017-02-23 14:23 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 22.02.2017 05:53, Stefan Monnier wrote:
>> I see, thanks. And I think that means that, ideally, it would work without
>> the caller having to adjust the syntax visibility bounds, or the like, as
>> long as the syntax table is correct and the beginning (or the end) of the
>> currently navigated comment is within view.
> 
> Right, but not reliably so: very often we need to parse backward not
> just until the matching starter but until the previous closer (to make
> sure the starter we saw was not itself within an earlier comment), and
> in other cases the mix of comment markers and string markers make it
> impossible to guess if we were really inside a comment, so we end up
> falling back on the forward-parse code.

Naturally, we'd need to save more information to be able to do that. 
E.g. propertize the end of a complex comment with the position of its 
beginning. Since the first time we go through a buffer is in the forward 
direction, getting that info would be inexpensive.

> Actually, it's very straightforward: the forward parse already gives us
> the beginning of the surrounding element, so we just re-do the forward
> parse from that spot.  It's just a matter of wrapping the code inside
> a loop.

You're likely a better judge of that. It does sound a bit convoluted to 
me (and having to deal with different kinds of comments adds its 
complexity), but not something that having a handful of tests wouldn't 
keep straight.

> No.  "Previously", we typically scan the line backward and stop as soon
> as we hit the previous \n (which tells us that no comment can start
> earlier than that if it finishes with a \n).
> 
> In a few cases, we do fallback on the forward parse code, in which case
> indeed we'll take longer, but those are normally rare (which is why this
> comment-cache and my syntax-ppss-patch haven't been installed yet: they
> improve the performance of a case that's somewhat infrequent).

I see, thanks.

>> Perhaps we could use the "generic comment bounds" syntax-table property to
>> delineate such difficult comments. If that idea sounds similar to
>> comment-cache, that is no accident.
> 
> Maybe.  Obviously, my syntax-ppss hammer makes me think that such
> alternate solutions aren't needed: syntax-ppss solves this case without
> having to try and come out with a clever way to detect which comments
> are tricky nor how to mark them.

The alternative tweak I had in mind would be applied somewhere around 
syntax-propertize. So it would be a matter of trading off one bit of 
complexity for another, still staying within the framework of syntax-ppss.

> Not completely sure either.  I've had vague ideas of adding some kind of
> hook to syntax-tables, i.e. add a new kind of syntax element which ends
> up calling an Elisp function of your choice so you can make it "do the
> right thing" for the particular construct.

I think just having paired syntactic elements would suffice. Or just 
propertizing the whole subregion with one text property span. Whichever 
would be easier to process.

Not sure about using the syntax-table property for this. In some weird 
cases there won't be a space of a newline to put these syntax-table 
values on. And a newline staying a newline might be syntactically 
important for the primary major mode somewhere.

Another thing to consider is that we would probably want to fontify the 
contents of all subregions normally, even when inside comments belonging 
to the outer mode. So the primitives used in 
font-lock-fontify-syntactically-region would need to be able to stop at 
those boundaries instead of automatically skipping over.

> So when scanning (forward or backward), if we bump into an element with
> that syntax (typically applied as a syntax-table text-property), we call
> the function which will know how to jump over the sub-region or will
> signal an "end of sub-region" error.

Just having those hooks won't be enough, we still don't have enough info 
how to syntax-propertize the subregion contents, for instance. So I'm 
not sure what the flexibility of using the functions here would buy us.

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-23 14:23                             ` Dmitry Gutov
@ 2017-02-23 14:48                               ` Stefan Monnier
  2017-02-24  7:46                                 ` Tom Tromey
  0 siblings, 1 reply; 75+ messages in thread
From: Stefan Monnier @ 2017-02-23 14:48 UTC (permalink / raw)
  To: emacs-devel

> Naturally, we'd need to save more information to be able to do
> that. E.g. propertize the end of a complex comment with the position of its
> beginning. Since the first time we go through a buffer is in the forward
> direction, getting that info would be inexpensive.

Actually, back_comment may very well parse backward "the first time",
since it doesn't use any cache (currently).

>> Maybe.  Obviously, my syntax-ppss hammer makes me think that such
>> alternate solutions aren't needed: syntax-ppss solves this case without
>> having to try and come out with a clever way to detect which comments
>> are tricky nor how to mark them.
> The alternative tweak I had in mind would be applied somewhere around
> syntax-propertize.

But you still have to decide what info to save and when.  I.e. there's
a design problem, with non-trivial tradeoffs.
Using syntax-ppss saves me the trouble.

> Another thing to consider is that we would probably want to fontify the
> contents of all subregions normally, even when inside comments belonging to
> the outer mode. So the primitives used in
> font-lock-fontify-syntactically-region would need to be able to stop at
> those boundaries instead of automatically skipping over.

That's right.

> Just having those hooks won't be enough, we still don't have enough info how
> to syntax-propertize the subregion contents, for instance. So I'm not sure
> what the flexibility of using the functions here would buy us.

Me neither.

Maybe the real issue is that starting from syntax-table +
syntax-propertize + font-lock is a losing proposition, and instead we
should first come up with a completely different design (from scratch)
that would be able to parse and highlight code and to accommodate
multiple major modes.  Then we can maybe see if there's a way to evolve
the current design of syntax-table + syntax-propertize + font-lock to
that new design.

        Stefan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: Bug #25608 and the comment-cache branch
  2017-02-23 14:48                               ` Stefan Monnier
@ 2017-02-24  7:46                                 ` Tom Tromey
  0 siblings, 0 replies; 75+ messages in thread
From: Tom Tromey @ 2017-02-24  7:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

Stefan> Maybe the real issue is that starting from syntax-table +
Stefan> syntax-propertize + font-lock is a losing proposition, and instead we
Stefan> should first come up with a completely different design (from scratch)
Stefan> that would be able to parse and highlight code and to accommodate
Stefan> multiple major modes.  Then we can maybe see if there's a way to evolve
Stefan> the current design of syntax-table + syntax-propertize + font-lock to
Stefan> that new design.

As you know I'm interested in this area via my work on html/js/css.
However I am not really aware of what the problems might be; syntax-ppss
seems just fine.

(I did read the comment cache thread but my takeaway from that was just
that there's a disagreement about the correct behavior various things in
the face of narrowing; which seems like a pretty limited sort of thing.)

Anyway, I'm writing to say that it would be illuminating to have someone
in the know (i.e., you) write up a description of the actual problems;
then from there proceed to what the solutions might be.

Tom

^ permalink raw reply	[flat|nested] 75+ messages in thread

end of thread, other threads:[~2017-02-24  7:46 UTC | newest]

Thread overview: 75+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-02 20:24 Bug #25608 and the comment-cache branch Alan Mackenzie
2017-02-02 20:47 ` Eli Zaretskii
2017-02-02 21:51   ` Alan Mackenzie
2017-02-02 22:15     ` Dmitry Gutov
2017-02-03  7:41     ` Eli Zaretskii
2017-02-03 17:29       ` Alan Mackenzie
2017-02-03 22:08         ` Dmitry Gutov
2017-02-04 10:24           ` Alan Mackenzie
2017-02-06  2:09             ` Dmitry Gutov
2017-02-06 19:24               ` Alan Mackenzie
2017-02-07  1:42                 ` Dmitry Gutov
2017-02-07 19:21                   ` Alan Mackenzie
2017-02-14 15:28                     ` Dmitry Gutov
2017-02-14 16:38                       ` Stefan Monnier
2017-02-22  2:25                         ` Dmitry Gutov
2017-02-22  3:53                           ` Stefan Monnier
2017-02-23 14:23                             ` Dmitry Gutov
2017-02-23 14:48                               ` Stefan Monnier
2017-02-24  7:46                                 ` Tom Tromey
2017-02-14 21:14                       ` Alan Mackenzie
2017-02-16 14:10                         ` Stefan Monnier
2017-02-18 10:44                           ` Alan Mackenzie
2017-02-18 13:49                             ` Stefan Monnier
2017-02-12  2:53               ` John Wiegley
2017-02-12  8:20                 ` Elias Mårtenson
2017-02-12 10:47                 ` Alan Mackenzie
2017-02-12 11:14                 ` martin rudalics
2017-02-12 15:05                   ` Andreas Röhler
2017-02-12 15:39                   ` Eli Zaretskii
2017-02-05 22:00       ` Alan Mackenzie
2017-02-06  1:12         ` Stefan Monnier
2017-02-06 18:37           ` Alan Mackenzie
2017-02-08 17:20         ` Eli Zaretskii
2017-02-11 23:25           ` Alan Mackenzie
2017-02-12  0:55             ` Stefan Monnier
2017-02-12 12:05               ` Alan Mackenzie
2017-02-12 13:13                 ` Juanma Barranquero
2017-02-12 15:57                 ` Dmitry Gutov
2017-02-12 17:29                   ` Alan Mackenzie
2017-02-12 20:35                     ` Dmitry Gutov
2017-02-13  1:47                     ` zhanghj
2017-02-13  5:50                       ` Stefan Monnier
2017-02-13  6:45                         ` zhanghj
2017-02-13  7:24                           ` Stefan Monnier
2017-02-13  7:59                             ` zhanghj
2017-02-13  9:25                               ` Stefan Monnier
2017-02-13 16:14                           ` Drew Adams
2017-02-13  7:05                         ` zhanghj
2017-02-13  7:16                         ` zhanghj
2017-02-13 14:57                           ` Dmitry Gutov
2017-02-12 17:49                 ` Stefan Monnier
2017-02-13 18:09                   ` Alan Mackenzie
2017-02-13 19:34                     ` Eli Zaretskii
2017-02-13 21:21                     ` Stefan Monnier
2017-02-02 22:14 ` Dmitry Gutov
2017-02-03 16:44   ` Alan Mackenzie
2017-02-03 21:53     ` Dmitry Gutov
2017-02-04 11:02       ` Alan Mackenzie
2017-02-06  1:28         ` Dmitry Gutov
2017-02-06 19:37           ` Alan Mackenzie
2017-02-06  2:08         ` Stefan Monnier
2017-02-06 20:01           ` Alan Mackenzie
2017-02-06 22:33             ` Stefan Monnier
2017-02-07 21:24               ` Alan Mackenzie
2017-02-08 12:54                 ` Stefan Monnier
2017-02-07 15:29             ` Eli Zaretskii
2017-02-07 21:09               ` Alan Mackenzie
2017-02-08 17:28                 ` Eli Zaretskii
2017-02-02 23:57 ` Stefan Monnier
2017-02-03 16:19   ` Alan Mackenzie
2017-02-04  9:06     ` Andreas Röhler
2017-02-04 18:18     ` Stefan Monnier
2017-02-04 18:28       ` Alan Mackenzie
2017-02-03  7:49 ` Yuri Khan
2017-02-03 18:30   ` Andreas Röhler

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).