unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#22983: syntax-ppss returns wrong result.
@ 2016-03-11 15:15 Alan Mackenzie
  2016-03-11 20:31 ` Dmitry Gutov
                   ` (3 more replies)
  0 siblings, 4 replies; 65+ messages in thread
From: Alan Mackenzie @ 2016-03-11 15:15 UTC (permalink / raw)
  To: 22983

Hello, Emacs.

The fundamental contract in syntax-ppss is that (syntax-ppss POS)
returns the same value as (parse-partial-sexp (point-min) POS) (with the
exception of elements 2 and 6).  This is currently not always the case.

In the master branch, emacs -Q and visit xdisp.c with C-x C-f.  Follow
this recipe:

    M-: (syntax-ppss-flush-cache 1)
    M-: (setq ppss-0 (syntax-ppss 40000))
    M-<
    C-s #include " <CR>
    M->
    C-x n n
    M-: (setq ppss-1 (syntax-ppss 40000))
    M-: (setq parse (parse-partial-sexp (point-min) 40000))

At this point, `ppss-1' and `parse' should match (apart from elements 2
and 6).  What we actually have is:

    ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992))
    parse:  (0 nil 15674 34 nil nil 0 nil 15675 nil)

.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
@ 2016-03-11 20:31 ` Dmitry Gutov
  2016-03-11 21:24   ` Alan Mackenzie
  2016-03-13 18:52 ` Andreas Röhler
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-11 20:31 UTC (permalink / raw)
  To: Alan Mackenzie, 22983

On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> At this point, `ppss-1' and `parse' should match (apart from elements 2
> and 6).  What we actually have is:
>
>     ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992))
>     parse:  (0 nil 15674 34 nil nil 0 nil 15675 nil)

I think you mean that ppss-0 and ppss-1 must match independent of 
narrowing, and also match (parse-partial-sexp 1 40000).

Considering narrowing can change point-min arbitrarily, specifying 
(syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing 
proposition if you want consistency.

Alas, we have some code out there that implements multiple-major-mode 
functionality using narrowing and some hacking of syntax-ppss-last 
syntax-ppss-cache values.

Changing syntax-ppss to be independent of narrowing will break it, and 
we'll need to provide some alternative first.

We could introduce a syntax-ppss-dont-widen variable, though. Similar to 
font-lock-dont-widen.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 20:31 ` Dmitry Gutov
@ 2016-03-11 21:24   ` Alan Mackenzie
  2016-03-11 21:35     ` Dmitry Gutov
  2016-03-13 17:32     ` Stefan Monnier
  0 siblings, 2 replies; 65+ messages in thread
From: Alan Mackenzie @ 2016-03-11 21:24 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983

Hello, Dmitry.

On Fri, Mar 11, 2016 at 10:31:50PM +0200, Dmitry Gutov wrote:
> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> > At this point, `ppss-1' and `parse' should match (apart from elements 2
> > and 6).  What we actually have is:

> >     ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992))
> >     parse:  (0 nil 15674 34 nil nil 0 nil 15675 nil)

> I think you mean that ppss-0 and ppss-1 must match independent of 
> narrowing, and also match (parse-partial-sexp 1 40000).

Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
that of (parse-partial-sexp (point-min) pos).  I think ppss-0 and ppss-1
did actually match (but I can't quite remember).

> Considering narrowing can change point-min arbitrarily, specifying 
> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing 
> proposition if you want consistency.

Indeed.  But that is how syntax-ppss is specified, and (partially) how
it is implemented.

> Alas, we have some code out there that implements multiple-major-mode 
> functionality using narrowing and some hacking of syntax-ppss-last 
> syntax-ppss-cache values.

> Changing syntax-ppss to be independent of narrowing will break it, and 
> we'll need to provide some alternative first.

syntax-ppss is broken, and can't be fixed.  The only sensible fix would
be to specify that (syntax-ppss pos) is the same as (parse-partial-sexp
1 pos).  But that is then a totally different function, and there are
around 200 uses in the Emacs sources to check and fix, to say nothing of
external code.

> We could introduce a syntax-ppss-dont-widen variable, though. Similar to 
> font-lock-dont-widen.

I'm trying to figure that out.  Wouldn't that still leave you with
problems when point-min is inside a string?

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 21:24   ` Alan Mackenzie
@ 2016-03-11 21:35     ` Dmitry Gutov
  2016-03-11 22:15       ` Alan Mackenzie
  2016-03-13 17:32     ` Stefan Monnier
  1 sibling, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-11 21:35 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983

On 03/11/2016 11:24 PM, Alan Mackenzie wrote:

>> I think you mean that ppss-0 and ppss-1 must match independent of
>> narrowing, and also match (parse-partial-sexp 1 40000).
>
> Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
> that of (parse-partial-sexp (point-min) pos).  I think ppss-0 and ppss-1
> did actually match (but I can't quite remember).

I imagine they didn't. I got the same value in all three cases, though, 
so your scenario could use some revising.

>> Considering narrowing can change point-min arbitrarily, specifying
>> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing
>> proposition if you want consistency.
>
> Indeed.  But that is how syntax-ppss is specified, and (partially) how
> it is implemented.

That part of specification can be rephrased.

>> Alas, we have some code out there that implements multiple-major-mode
>> functionality using narrowing and some hacking of syntax-ppss-last
>> syntax-ppss-cache values.
>
>> Changing syntax-ppss to be independent of narrowing will break it, and
>> we'll need to provide some alternative first.
>
> syntax-ppss is broken, and can't be fixed.

It's used ubiquitously, so it must be working.

> The only sensible fix would
> be to specify that (syntax-ppss pos) is the same as (parse-partial-sexp
> 1 pos).  But that is then a totally different function, and there are
> around 200 uses in the Emacs sources to check and fix, to say nothing of
> external code.

Not entirely different, no. AFAIK, these are the semantics the vast 
majority of its usages expect. Except the multiple-major-mode case, 
which we'd ideally try to accommodate, too.

>> We could introduce a syntax-ppss-dont-widen variable, though. Similar to
>> font-lock-dont-widen.
>
> I'm trying to figure that out.  Wouldn't that still leave you with
> problems when point-min is inside a string?

syntax-ppss-dont-widen would be nil by default, it would be an escape 
hatch toward the current semantics, for when the caller knows how to 
manage narrowings, etc.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 21:35     ` Dmitry Gutov
@ 2016-03-11 22:15       ` Alan Mackenzie
  2016-03-11 22:38         ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2016-03-11 22:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983

Hello, Dmitry.

On Fri, Mar 11, 2016 at 11:35:08PM +0200, Dmitry Gutov wrote:
> On 03/11/2016 11:24 PM, Alan Mackenzie wrote:

> > Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
> > that of (parse-partial-sexp (point-min) pos).  I think ppss-0 and ppss-1
> > did actually match (but I can't quite remember).

> I imagine they didn't. I got the same value in all three cases, though, 
> so your scenario could use some revising.

Sorry about that.

> >> Considering narrowing can change point-min arbitrarily, specifying
> >> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing
> >> proposition if you want consistency.

> > Indeed.  But that is how syntax-ppss is specified, and (partially) how
> > it is implemented.

> That part of specification can be rephrased.

It's more than the specification which needs redoing.  The implementation
(sometimes) returns the equivalent of (parse-partial-sexp (point-min)
pos)), when point-min is not in a "safe place".

> >> Alas, we have some code out there that implements multiple-major-mode
> >> functionality using narrowing and some hacking of syntax-ppss-last
> >> syntax-ppss-cache values.

> >> Changing syntax-ppss to be independent of narrowing will break it, and
> >> we'll need to provide some alternative first.

> > syntax-ppss is broken, and can't be fixed.

> It's used ubiquitously, so it must be working.

It might well be ubiquitous, but it's broken.  Consider this: syntax-ppss
will return the result of a parse based at point-min.  In general, the
caller does not know whether point-min is in a string or not.  Therefore
the result is of little value, UNLESS the caller takes special action,
such as widening the buffer before every call to syntax-ppss.

> > The only sensible fix would be to specify that (syntax-ppss pos) is
> > the same as (parse-partial-sexp 1 pos).  But that is then a totally
> > different function, and there are around 200 uses in the Emacs
> > sources to check and fix, to say nothing of external code.

> Not entirely different, no. AFAIK, these are the semantics the vast 
> majority of its usages expect.

But it's not the semantics these .el files get.  What's probably keeping
them functional is the rarity with which buffers are narrowed to an
"awkward" point-min.

> Except the multiple-major-mode case, which we'd ideally try to
> accommodate, too.

How does this code handle the changeover of syntax tables at a mode
boundary?

> >> We could introduce a syntax-ppss-dont-widen variable, though. Similar to
> >> font-lock-dont-widen.

> > I'm trying to figure that out.  Wouldn't that still leave you with
> > problems when point-min is inside a string?

> syntax-ppss-dont-widen would be nil by default, it would be an escape 
> hatch toward the current semantics, for when the caller knows how to 
> manage narrowings, etc.

Ah, OK.  I think I see that now.  Maybe.  Surely the trouble is that
either ALL calls or NONE must have s-p-dont-widen set.  When that flag is
toggled, all the caches have to be cleared.  Maybe there should be some
initialisation flag in some initialisation function.  Or something like
that.  (It's getting late!).

It strikes me that the multiple major mode stuff could do with a
substantially enhanced version of syntax-ppss which would smoothly handle
going over a mode boundary.  But I don't know how you're implementing
that.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 22:15       ` Alan Mackenzie
@ 2016-03-11 22:38         ` Dmitry Gutov
  2016-03-13 17:37           ` Stefan Monnier
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-11 22:38 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983

On 03/12/2016 12:15 AM, Alan Mackenzie wrote:

>> That part of specification can be rephrased.
>
> It's more than the specification which needs redoing.  The implementation
> (sometimes) returns the equivalent of (parse-partial-sexp (point-min)
> pos)), when point-min is not in a "safe place".

Sure. I just meant that we shouldn't get hung up on that element of the 
specification.

>>>> Changing syntax-ppss to be independent of narrowing will break it, and
>>>> we'll need to provide some alternative first.
>
>>> syntax-ppss is broken, and can't be fixed.
>
>> It's used ubiquitously, so it must be working.
>
> It might well be ubiquitous, but it's broken.

And yet, it can be fixed.

> Consider this: syntax-ppss
> will return the result of a parse based at point-min.  In general, the
> caller does not know whether point-min is in a string or not.  Therefore
> the result is of little value, UNLESS the caller takes special action,
> such as widening the buffer before every call to syntax-ppss.

You can say that.

>> Not entirely different, no. AFAIK, these are the semantics the vast
>> majority of its usages expect.
>
> But it's not the semantics these .el files get.  What's probably keeping
> them functional is the rarity with which buffers are narrowed to an
> "awkward" point-min.

Another thing that keeps it together, is that narrowing, as a user-level 
operator, is not that popular.

Personally, I consider it an anti-feature.

>> Except the multiple-major-mode case, which we'd ideally try to
>> accommodate, too.
>
> How does this code handle the changeover of syntax tables at a mode
> boundary?

The "inner" regions start with an "empty", top-level state. This is 
actually fine, because these are usually small enough not to benefit 
from the syntax-ppss cache too much (and syntax-ppss-last still helps).

The parts of the outer region following a subregion with different 
syntax table... rely on a few hacks, and a manual application of a 
`syntax-table' property when necessary. We need a better solution there, 
but it's probably out of scope for this discussion.

>> syntax-ppss-dont-widen would be nil by default, it would be an escape
>> hatch toward the current semantics, for when the caller knows how to
>> manage narrowings, etc.
>
> Ah, OK.  I think I see that now.  Maybe.  Surely the trouble is that
> either ALL calls or NONE must have s-p-dont-widen set.

Hmm, you're right. This variable still seems essential, but to be safe, 
mmm-mode and friends should probably also advise syntax-ppss, to always 
perform narrowing as appropriate.

> When that flag is
> toggled, all the caches have to be cleared.  Maybe there should be some
> initialisation flag in some initialisation function.  Or something like
> that.  (It's getting late!).

Is the syntax-ppss-dont-widen really relevant for your comment cache? It 
would be used only by certain major modes, and worst comes to worst, you 
could disable the cache in those buffers.

> It strikes me that the multiple major mode stuff could do with a
> substantially enhanced version of syntax-ppss which would smoothly handle
> going over a mode boundary.  But I don't know how you're implementing
> that.

So far, we're just wrapping the font-lock and indentation code, and 
otherwise hope for the best.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 21:24   ` Alan Mackenzie
  2016-03-11 21:35     ` Dmitry Gutov
@ 2016-03-13 17:32     ` Stefan Monnier
  1 sibling, 0 replies; 65+ messages in thread
From: Stefan Monnier @ 2016-03-13 17:32 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

> Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
> that of (parse-partial-sexp (point-min) pos).

That's what the docstring says, but is it the result you're looking for?


        Stefan





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 22:38         ` Dmitry Gutov
@ 2016-03-13 17:37           ` Stefan Monnier
  2016-03-13 18:57             ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Stefan Monnier @ 2016-03-13 17:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

>> But it's not the semantics these .el files get.  What's probably keeping
>> them functional is the rarity with which buffers are narrowed to an
>> "awkward" point-min.
> Another thing that keeps it together, is that narrowing, as a user-level
> operator, is not that popular.

Luckily, yes.

> Personally, I consider it an anti-feature.

Same here.  Luckily also, as pointed out elsewhere, the semantics of it
is unclear, so that in several important cases, whichever behavior we
end up choosing will be both correct for some users and incorrect
for others.

Hence, so far, I didn't make any effort to try and "do the right thing"
for user-activated narrowing, since these are just not well defined
enough to even determine what is "the right thing".


        Stefan





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
  2016-03-11 20:31 ` Dmitry Gutov
@ 2016-03-13 18:52 ` Andreas Röhler
  2016-03-13 18:56   ` Dmitry Gutov
  2016-03-18  0:49 ` Dmitry Gutov
       [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
  3 siblings, 1 reply; 65+ messages in thread
From: Andreas Röhler @ 2016-03-13 18:52 UTC (permalink / raw)
  To: 22983

[-- Attachment #1: Type: text/plain, Size: 532 bytes --]



On 11.03.2016 16:15, Alan Mackenzie wrote:
> Hello, Emacs.
>
> The fundamental contract in syntax-ppss is that (syntax-ppss POS)
> returns the same value as (parse-partial-sexp (point-min) POS) (with the
> exception of elements 2 and 6).  This is currently not always the case.
>
> In the master branch, emacs -Q and visit xdisp.c with C-x C-f.  Follow
> this recipe:
>
>      M-: (syntax-ppss-flush-cache 1)
>      M-: (setq ppss-0 (syntax-ppss 40000))

(setq ppss-0 (syntax-ppss 40000)

moved point - see attachment. Should it?

[-- Attachment #2: moves-point.png --]
[-- Type: image/png, Size: 125904 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 18:52 ` Andreas Röhler
@ 2016-03-13 18:56   ` Dmitry Gutov
  0 siblings, 0 replies; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-13 18:56 UTC (permalink / raw)
  To: Andreas Röhler, 22983

On 03/13/2016 08:52 PM, Andreas Röhler wrote:

> (setq ppss-0 (syntax-ppss 40000)
>
> moved point - see attachment. Should it?

Yes. See the last sentence in its docstring.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 17:37           ` Stefan Monnier
@ 2016-03-13 18:57             ` Alan Mackenzie
  2016-03-14  0:47               ` Dmitry Gutov
  2016-03-14  1:49               ` Stefan Monnier
  0 siblings, 2 replies; 65+ messages in thread
From: Alan Mackenzie @ 2016-03-13 18:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 22983, Dmitry Gutov

Hello, Stefan.

On Sun, Mar 13, 2016 at 01:37:27PM -0400, Stefan Monnier wrote:
> >> But it's not the semantics these .el files get.  What's probably keeping
> >> them functional is the rarity with which buffers are narrowed to an
> >> "awkward" point-min.
> > Another thing that keeps it together, is that narrowing, as a user-level
> > operator, is not that popular.

> Luckily, yes.

I happen to use it frequently.  I expect other users do, to.  It's
useful.

> > Personally, I consider it an anti-feature.

> Same here.  Luckily also, as pointed out elsewhere, the semantics of it
> is unclear, so that in several important cases, whichever behavior we
> end up choosing will be both correct for some users and incorrect
> for others.

That's pure sophistry.  The semantics needed are quite clear:  What were
strings and comments before narrowing should remain strings and comments
after narrowing.  Otherwise, nothing would work in such a narrowed
buffer.  font-locking, for example, behaves properly in a narrowed
buffer.

> Hence, so far, I didn't make any effort to try and "do the right thing"
> for user-activated narrowing, since these are just not well defined
> enough to even determine what is "the right thing".

Lets define them as I said in the previous paragraph.  Or can you
conceive of a use case where one would want narrowing to invert strings
and non-strings, leaving comments totally random?

Do you have any views on how the bug should be resolved?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 18:57             ` Alan Mackenzie
@ 2016-03-14  0:47               ` Dmitry Gutov
  2016-03-14  1:04                 ` Drew Adams
  2016-03-14  1:49               ` Stefan Monnier
  1 sibling, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-14  0:47 UTC (permalink / raw)
  To: Alan Mackenzie, Stefan Monnier; +Cc: 22983

On 03/13/2016 08:57 PM, Alan Mackenzie wrote:

> I happen to use it frequently.  I expect other users do, to.  It's
> useful.

It might be, but it's not very well-designed. If you only want to hide 
some parts of buffer from being displayed, changing point-min and 
point-max, which affect quite a lot of Lisp functions, seems unnecessary.

Introducing a couple of global variables that would only be read by the 
display code, seems like a better approach. I don't think that 
narrow-to-region should be a user-level function. Introducing a new 
function, using a different mechanism shouldn't be too hard though, if 
we reuse the existing binding.

> What were
> strings and comments before narrowing should remain strings and comments
> after narrowing.  Otherwise, nothing would work in such a narrowed
> buffer.  font-locking, for example, behaves properly in a narrowed
> buffer.

It behaves like we tell it to behave. If I bind font-lock-dont-widen to 
t, font-lock won't look beyond the narrowing.

>> Hence, so far, I didn't make any effort to try and "do the right thing"
>> for user-activated narrowing, since these are just not well defined
>> enough to even determine what is "the right thing".
>
> Lets define them as I said in the previous paragraph.  Or can you
> conceive of a use case where one would want narrowing to invert strings
> and non-strings, leaving comments totally random?

At risk of inviting further confusion, yes, mmm-mode and polymode (new 
example!) use narrowing to persuade font-lock and indentation code that 
there's nothing beyond the narrowed region. We might declare such usages 
invalid, and that's a possible choice, but I think keeping support for 
them wouldn't be too hard, at least for a while.

Note that if your comment cache always widens the buffer before 
calculating the values to save, its result might conflict with 
syntax-ppss in mmm-mode and polymode (right?). Leading to font-lock, 
indentation and certain commands behaving in different, conflicting 
ways. That's just conjecture at this point, of course.

> Do you have any views on how the bug should be resolved?

Stefan probably has another opinion, but I'd either ignore the issue of 
narrowing, or introduce syntax-ppss-dont-widen like proposed (and thus 
make syntax-ppss widen by default). Together with adding a command that 
would replace interactive use of narrow-to-region.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-14  0:47               ` Dmitry Gutov
@ 2016-03-14  1:04                 ` Drew Adams
  2016-04-03 22:55                   ` John Wiegley
  0 siblings, 1 reply; 65+ messages in thread
From: Drew Adams @ 2016-03-14  1:04 UTC (permalink / raw)
  To: Dmitry Gutov, Alan Mackenzie, Stefan Monnier; +Cc: 22983

> > I happen to use it frequently.  I expect other users do, to.  It's
> > useful.
> 
> It might be, but it's not very well-designed. If you only want to hide
> some parts of buffer from being displayed, changing point-min and
> point-max, which affect quite a lot of Lisp functions, seems unnecessary.

Well, well, well.  All of this is likely OT for this thread.

But no - narrowing is in fact explicitly _about_ changing
`point-min' and `point-max', so you can act on a particular
section of a buffer.  It is not only about "hid[ing] some
parts of a buffer from being displayed".  This is true for
both interactive use and in code.

> I don't think that narrow-to-region should be a user-level
> function.

Is this a joke?  Maybe you think that because you think
it is only about hiding text?

> Introducing a new function, using a different mechanism
> shouldn't be too hard though, if we reuse the existing binding.

Please don't.  Please don't even think about it.

And if you really think you have something to say about
it, then please bring it up in emacs-devel, not in a bug
thread that is not especially related to it.

> adding a command that would replace interactive use of
> narrow-to-region.

Ridiculous (IMHO).  Add whatever commands you like, but
please do not think about replacing `narrow-to-region'
willy nilly.  It is one of the most useful Emacs commands.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 18:57             ` Alan Mackenzie
  2016-03-14  0:47               ` Dmitry Gutov
@ 2016-03-14  1:49               ` Stefan Monnier
  1 sibling, 0 replies; 65+ messages in thread
From: Stefan Monnier @ 2016-03-14  1:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

> That's pure sophistry.  The semantics needed are quite clear:

For your use case, yes.  It's quite clear *in your mind*.  There are
other use cases.  Worse yet: Elisp doesn't generally know if the
narrowing was setup by the user or by some Elisp caller up the stack.
So even if we were to pretend that the use-case is clear when the
narrowing is set by the user, we'd still have to figure out if that's
the case.

> Lets define them as I said in the previous paragraph.  Or can you
> conceive of a use case where one would want narrowing to invert strings
> and non-strings, leaving comments totally random?

There's the case where some Elisp code does

    (save-restriction
      (narrow-to-region beg end
        (with-syntax-table ...)))

to parse a sub-part of your buffer in a different way.  Of course this
completely breaks syntax-ppss and friends.  I need to do exactly that in
sm-c-mode (when parsing the C code inside CPP directives, since those
directives are marked as comments), for example and had to use

      (let ((syntax-propertize-function nil)
            (syntax-ppss-cache nil)
            (syntax-ppss-last nil))
        ...)

to deal with it.  It would be easy/natural to add a binding of
syntax-ppss-dont-widen in there (and/or literal-cache-dont-widen for
that matter).

> Do you have any views on how the bug should be resolved?

Look up some past discussions of how to number lines in a narrowed
buffer (same basic issue), where we discussed this.

We basically need to add information about which kind of narrowing is
in effect.  IIRC one way suggested was to have 2 narrowing states at the
same time: the current one, plus a new one which is a kind of "hard
narrowing" (the current narrowing would have to be "narrower" than the
"hard narrowing"), with corresponding new kind of "widen".


        Stefan





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
  2016-03-11 20:31 ` Dmitry Gutov
  2016-03-13 18:52 ` Andreas Röhler
@ 2016-03-18  0:49 ` Dmitry Gutov
  2016-03-19 12:27   ` Alan Mackenzie
  2016-03-19 23:00   ` Vitalie Spinu
       [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
  3 siblings, 2 replies; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-18  0:49 UTC (permalink / raw)
  To: Alan Mackenzie, 22983

On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

This patch should make ppss-0 and ppss-1 match:

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index e20a210..c1b9d84 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -371,6 +371,11 @@ syntax-ppss-max-span
  We try to make sure that cache entries are at least this far apart
  from each other, to avoid keeping too much useless info.")

+(defvar syntax-ppss-dont-widen nil
+  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
+The code that uses this should create local bindings for
+`syntax-ppss-cache' and `syntax-ppss-last' too.")
+
  (defvar syntax-begin-function nil
    "Function to move back outside of any comment/string/paren.
  This function should move the cursor back to some syntactically safe
@@ -423,12 +428,21 @@ syntax-ppss
  in the returned list (counting from 0) cannot be relied upon.
  Point is at POS when this function returns.

+IF `syntax-ppss-dont-widen' is nil, the buffer is temporarily
+widened.
+
  It is necessary to call `syntax-ppss-flush-cache' explicitly if
  this function is called while `before-change-functions' is
  temporarily let-bound, or if the buffer is modified without
  running the hook."
    ;; Default values.
    (unless pos (setq pos (point)))
+  (save-restriction
+    (unless syntax-ppss-dont-widen
+      (widen))
+    (syntax-pps--at pos)))
+
+(defun syntax-ppss--at (pos)
    (syntax-propertize pos)
    ;;
    (let ((old-ppss (cdr syntax-ppss-last))






^ permalink raw reply related	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-18  0:49 ` Dmitry Gutov
@ 2016-03-19 12:27   ` Alan Mackenzie
  2016-03-19 18:47     ` Dmitry Gutov
  2016-03-19 23:16     ` Vitalie Spinu
  2016-03-19 23:00   ` Vitalie Spinu
  1 sibling, 2 replies; 65+ messages in thread
From: Alan Mackenzie @ 2016-03-19 12:27 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983

Hello, Dmitry.

On Fri, Mar 18, 2016 at 02:49:34AM +0200, Dmitry Gutov wrote:
> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> This patch should make ppss-0 and ppss-1 match:

OK, no bad thing!

But seeing that the function is a new function (its specification has
changed), it will need new test cases, fresh new attempts to break it.

> diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
> index e20a210..c1b9d84 100644
> --- a/lisp/emacs-lisp/syntax.el
> +++ b/lisp/emacs-lisp/syntax.el
> @@ -371,6 +371,11 @@ syntax-ppss-max-span
>   We try to make sure that cache entries are at least this far apart
>   from each other, to avoid keeping too much useless info.")

> +(defvar syntax-ppss-dont-widen nil
> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
> +The code that uses this should create local bindings for
> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
> +

I'm against this bit.  If syntax-ppss-dont-widen is non-nil, the buffer
is narrowed, and the local cache variables are correctly bound and
filled, then something at a low level is going to widen the buffer (and
call back_comment) without knowing to restore the global bindings for
those cache variables.  This could easily give the wrong result and
corrupt the locally bound cache.

I think the only sensible functionality for syntax-ppss is to be
equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
they stand.  Those pieces of code which actually need a ppss cache with
origin other than 1 could then use a more appropriate specialized
function whose cache wouldn't get mixed up with syntax-ppss's.  (It
could share a lot of code with syntax-ppss).

>   (defvar syntax-begin-function nil
>     "Function to move back outside of any comment/string/paren.
>   This function should move the cursor back to some syntactically safe
> @@ -423,12 +428,21 @@ syntax-ppss
>   in the returned list (counting from 0) cannot be relied upon.
>   Point is at POS when this function returns.

> +IF `syntax-ppss-dont-widen' is nil, the buffer is temporarily
> +widened.
> +
>   It is necessary to call `syntax-ppss-flush-cache' explicitly if
>   this function is called while `before-change-functions' is
>   temporarily let-bound, or if the buffer is modified without
>   running the hook."
>     ;; Default values.
>     (unless pos (setq pos (point)))
> +  (save-restriction
> +    (unless syntax-ppss-dont-widen
> +      (widen))
> +    (syntax-pps--at pos)))
> +
> +(defun syntax-ppss--at (pos)
>     (syntax-propertize pos)
>     ;;
>     (let ((old-ppss (cdr syntax-ppss-last))

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 12:27   ` Alan Mackenzie
@ 2016-03-19 18:47     ` Dmitry Gutov
  2016-03-27  0:51       ` John Wiegley
  2016-03-19 23:16     ` Vitalie Spinu
  1 sibling, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-19 18:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983

On 03/19/2016 02:27 PM, Alan Mackenzie wrote:

> OK, no bad thing!

Good.

> But seeing that the function is a new function (its specification has
> changed), it will need new test cases, fresh new attempts to break it.

Sure, please go ahead.

It needed new test cases even before this miraculous transformation.

>> +(defvar syntax-ppss-dont-widen nil
>> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
>> +The code that uses this should create local bindings for
>> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
>> +
>
> I'm against this bit.

I'm not married to it, but at least it would provide a backward 
compatibility escape hatch for a while. If a new way of handling mixed 
modes is added and turns out to be satisfactory, we can remove this 
variable later.

> If syntax-ppss-dont-widen is non-nil, the buffer
> is narrowed, and the local cache variables are correctly bound and
> filled, then something at a low level is going to widen the buffer (and
> call back_comment) without knowing to restore the global bindings for
> those cache variables.

When and why would that happen? I do not recall that happening before.

Since the "low level" is a bounded set, we should be able to make sure 
that the primitives do not, in fact, widen before calling syntax-ppss.

I suppose some could widen afterward.

> This could easily give the wrong result and
> corrupt the locally bound cache.

Even so, that would only affect the local cache, and as such, only the 
subregions, in the case of mixed-mode usage. In the general case, it 
would only affect the consumers of syntax-ppss that bound 
syntax-ppss-dont-widen, as long as they bound the cache variables as 
well, which we tell them to.

That lowers the damage area considerably.

> I think the only sensible functionality for syntax-ppss is to be
> equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
> they stand.  Those pieces of code which actually need a ppss cache with
> origin other than 1 could then use a more appropriate specialized
> function whose cache wouldn't get mixed up with syntax-ppss's.  (It
> could share a lot of code with syntax-ppss).

They already use syntax-ppss. I imagine Emacs's backward compatibility 
policy has something to say about that.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-18  0:49 ` Dmitry Gutov
  2016-03-19 12:27   ` Alan Mackenzie
@ 2016-03-19 23:00   ` Vitalie Spinu
  2016-03-19 23:20     ` Dmitry Gutov
  1 sibling, 1 reply; 65+ messages in thread
From: Vitalie Spinu @ 2016-03-19 23:00 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983


Thanks for this. This is a step in right direction IMHO.

One side note. `parsep-ppss` has a condition-case for args-out-of-range which
could be easily optimized out. You already know that you are calling
parse-partial-sexp with out of range arguments if narrowing is in place. The
current error check obfuscates the logic and makes debugging harder. Would it be
possible for you to have a look once you are on it? Not a big deal though.

Thanks,

  Vitalie

>> On Fri, Mar 18 2016 02:49, Dmitry Gutov wrote:

> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> This patch should make ppss-0 and ppss-1 match:

> diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
> index e20a210..c1b9d84 100644
> --- a/lisp/emacs-lisp/syntax.el
> +++ b/lisp/emacs-lisp/syntax.el
> @@ -371,6 +371,11 @@ syntax-ppss-max-span
>  We try to make sure that cache entries are at least this far apart
>  from each other, to avoid keeping too much useless info.")

> +(defvar syntax-ppss-dont-widen nil
> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
> +The code that uses this should create local bindings for
> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
> +
>  (defvar syntax-begin-function nil
>    "Function to move back outside of any comment/string/paren.
>  This function should move the cursor back to some syntactically safe
> @@ -423,12 +428,21 @@ syntax-ppss
>  in the returned list (counting from 0) cannot be relied upon.
>  Point is at POS when this function returns.

> +IF `syntax-ppss-dont-widen' is nil, the buffer is temporarily
> +widened.
> +
>  It is necessary to call `syntax-ppss-flush-cache' explicitly if
>  this function is called while `before-change-functions' is
>  temporarily let-bound, or if the buffer is modified without
>  running the hook."
>    ;; Default values.
>    (unless pos (setq pos (point)))
> +  (save-restriction
> +    (unless syntax-ppss-dont-widen
> +      (widen))
> +    (syntax-pps--at pos)))
> +
> +(defun syntax-ppss--at (pos)
>    (syntax-propertize pos)
>    ;;
>    (let ((old-ppss (cdr syntax-ppss-last))





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 12:27   ` Alan Mackenzie
  2016-03-19 18:47     ` Dmitry Gutov
@ 2016-03-19 23:16     ` Vitalie Spinu
  1 sibling, 0 replies; 65+ messages in thread
From: Vitalie Spinu @ 2016-03-19 23:16 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov



>> On Sat, Mar 19 2016 12:27, Alan Mackenzie wrote:

> I think the only sensible functionality for syntax-ppss is to be
> equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
> they stand.  

This would not work for multi modes. Till there is a feasible way to advice
parse-partial-sexp there will be no way to ensure the above contract is
satisfied in multi-modes.

  Vitalie





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 23:00   ` Vitalie Spinu
@ 2016-03-19 23:20     ` Dmitry Gutov
  0 siblings, 0 replies; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-19 23:20 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, 22983

On 03/20/2016 01:00 AM, Vitalie Spinu wrote:
>
> Thanks for this. This is a step in right direction IMHO.
>
> One side note. `parsep-ppss` has a condition-case for args-out-of-range which
> could be easily optimized out. You already know that you are calling
> parse-partial-sexp with out of range arguments if narrowing is in place.

That seems like it might make the code more complex: there are several 
parse-partial-sexp calls inside condition-case (for different situations 
with the existing cache), and we may have to add a comparison near each 
of them.

> The
> current error check obfuscates the logic and makes debugging harder. Would it be
> possible for you to have a look once you are on it? Not a big deal though.

I think you can still follow the execution flow with edebug, can't you?

If you're debugging a problem with args-out-of-range, another option is 
to replace `condition-case' with `condition-case-unless-debug' and 
re-evaluate the definition (but restore it when you're done, otherwise 
the args-out-of-range handler won't fire, I think).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 18:47     ` Dmitry Gutov
@ 2016-03-27  0:51       ` John Wiegley
  2016-03-27  1:14         ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: John Wiegley @ 2016-03-27  0:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

[-- Attachment #1: Type: text/plain, Size: 1571 bytes --]

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

>> I think the only sensible functionality for syntax-ppss is to be equivalent
>> to (parse-partial-sexp 1 pos). Then everybody knows where they stand. Those
>> pieces of code which actually need a ppss cache with origin other than 1
>> could then use a more appropriate specialized function whose cache wouldn't
>> get mixed up with syntax-ppss's. (It could share a lot of code with
>> syntax-ppss).

> They already use syntax-ppss. I imagine Emacs's backward compatibility
> policy has something to say about that.

There are times when our backward compatibility policy must bend, and even
break.

Specifically, we have a few existing cases where incomplete code has or will
be shipped in a release. The argument for doing so has often been, "So we can
see what users think." But if we *also* say that once it is released and
people start using, we can't change it, then it's a Catch-22.

syntax-ppss needs more work, that seems to be fairly clear based on the volume
of discussion around this feature, and bugs like this one. Therefore, since it
is not solid yet I'm not willing to let existing dependencies prevent us from
fixing its flaws.

When a feature becomes solid and true, like lexical-binding, that's when I
become incredibly reticent to make any changes whatsoever -- without the
convergence of all the planets and the moons.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 629 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-27  0:51       ` John Wiegley
@ 2016-03-27  1:14         ` Dmitry Gutov
  2016-04-03 22:58           ` John Wiegley
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-03-27  1:14 UTC (permalink / raw)
  To: John Wiegley; +Cc: Alan Mackenzie, 22983

On 03/27/2016 02:51 AM, John Wiegley wrote:

> syntax-ppss needs more work, that seems to be fairly clear based on the volume
> of discussion around this feature, and bugs like this one.

Bugs, plural? Alan has filed just one so far, and I've posted the 
trivial patch.

> Therefore, since it
> is not solid yet I'm not willing to let existing dependencies prevent us from
> fixing its flaws.

The aforementioned patch both fixes the bug and allows syntax-ppss to 
continue to be used in the fashion I've mentioned previously.

The question that's holding it, as far as I'm concerned, if whether the 
"hard narrowing" discussion reaches a satisfying conclusion. If it does, 
we won't really need syntax-ppss-dont-widen.

> When a feature becomes solid and true, like lexical-binding, that's when I
> become incredibly reticent to make any changes whatsoever -- without the
> convergence of all the planets and the moons.

I've never said anything about avoiding making changes to it. But when 
we do that, we usually try to accommodate the existing uses (the 
importance of which depends on how many uses there are out there).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-14  1:04                 ` Drew Adams
@ 2016-04-03 22:55                   ` John Wiegley
  0 siblings, 0 replies; 65+ messages in thread
From: John Wiegley @ 2016-04-03 22:55 UTC (permalink / raw)
  To: Drew Adams; +Cc: 22983, Dmitry Gutov, Stefan Monnier, Alan Mackenzie

>>>>> Drew Adams <drew.adams@oracle.com> writes:

>> Introducing a new function, using a different mechanism shouldn't be too
>> hard though, if we reuse the existing binding.

> Please don't. Please don't even think about it.

> And if you really think you have something to say about it, then please
> bring it up in emacs-devel, not in a bug thread that is not especially
> related to it.

+1!

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-27  1:14         ` Dmitry Gutov
@ 2016-04-03 22:58           ` John Wiegley
  2016-04-03 23:15             ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: John Wiegley @ 2016-04-03 22:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

[-- Attachment #1: Type: text/plain, Size: 1035 bytes --]

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> The aforementioned patch both fixes the bug and allows syntax-ppss to
> continue to be used in the fashion I've mentioned previously.

> The question that's holding it, as far as I'm concerned, if whether the
> "hard narrowing" discussion reaches a satisfying conclusion. If it does, we
> won't really need syntax-ppss-dont-widen.

Have things reached a satisfactory conclusion now?  It was hard for me to tell
by the end of this thread.

> I've never said anything about avoiding making changes to it. But when we do
> that, we usually try to accommodate the existing uses (the importance of
> which depends on how many uses there are out there).

Sure, though it's experimental nature does get taken into account. If a thing
is wrong, I'm not interested in accommodating existing workarounds to its
wrongness.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 629 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-04-03 22:58           ` John Wiegley
@ 2016-04-03 23:15             ` Dmitry Gutov
  2017-09-02 13:12               ` Eli Zaretskii
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2016-04-03 23:15 UTC (permalink / raw)
  To: John Wiegley; +Cc: Alan Mackenzie, 22983

On 04/04/2016 01:58 AM, John Wiegley wrote:

> Have things reached a satisfactory conclusion now?  It was hard for me to tell
> by the end of this thread.

It's a separate discussion, see 
http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html

> Sure, though it's experimental nature does get taken into account. If a thing
> is wrong, I'm not interested in accommodating existing workarounds to its
> wrongness.

What experimental nature?





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-04-03 23:15             ` Dmitry Gutov
@ 2017-09-02 13:12               ` Eli Zaretskii
  2017-09-02 17:40                 ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Eli Zaretskii @ 2017-09-02 13:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: jwiegley, acm, 22983

unblock 24655 by 22983
thanks

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 4 Apr 2016 02:15:50 +0300
> Cc: Alan Mackenzie <acm@muc.de>, 22983@debbugs.gnu.org
> 
> On 04/04/2016 01:58 AM, John Wiegley wrote:
> 
> > Have things reached a satisfactory conclusion now?  It was hard for me to tell
> > by the end of this thread.
> 
> It's a separate discussion, see 
> http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
> 
> > Sure, though it's experimental nature does get taken into account. If a thing
> > is wrong, I'm not interested in accommodating existing workarounds to its
> > wrongness.
> 
> What experimental nature?

It doesn't sound like this discussion is leading anywhere, and since
almost 1.5 years has passed with no comments, I guess this bug doesn't
need to block the release of Emacs 26.1, at least.

Thanks.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 13:12               ` Eli Zaretskii
@ 2017-09-02 17:40                 ` Alan Mackenzie
  2017-09-02 17:53                   ` Eli Zaretskii
                                     ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-02 17:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jwiegley, Dmitry Gutov, 22983

Hello, Eli.

On Sat, Sep 02, 2017 at 16:12:48 +0300, Eli Zaretskii wrote:
> unblock 24655 by 22983
> thanks

> > From: Dmitry Gutov <dgutov@yandex.ru>
> > Date: Mon, 4 Apr 2016 02:15:50 +0300
> > Cc: Alan Mackenzie <acm@muc.de>, 22983@debbugs.gnu.org

> > On 04/04/2016 01:58 AM, John Wiegley wrote:

> > > Have things reached a satisfactory conclusion now?  It was hard for me to tell
> > > by the end of this thread.

> > It's a separate discussion, see 
> > http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html

> > > Sure, though it's experimental nature does get taken into account. If a thing
> > > is wrong, I'm not interested in accommodating existing workarounds to its
> > > wrongness.

> > What experimental nature?

> It doesn't sound like this discussion is leading anywhere, and since
> almost 1.5 years has passed with no comments, I guess this bug doesn't
> need to block the release of Emacs 26.1, at least.

I'm not happy about this.  22983 is a serious design flaw, which has had
deleterious effects deep within Emacs.  One recorded example, resulting
in an infinite loop, is:

#########################################################################
From: Philipp Stephani <p.stephani2@gmail.com>
To: emacs-devel@gnu.org
Subject: [PATCH] Protect against an infloop in python-mode
Date: Tue, 28 Feb 2017 22:31:49 +0100

There appears to be an edge case caused by using `syntax-ppss' in a
narrowed buffer during JIT lock inside of Python triple-quote strings.
Unfortunately it is impossible to reproduce without manually
destroying the syntactic information in the Python buffer, but it has
been observed in practice.  In that case it can happen that the syntax
caches get sufficiently out of whack so that there appear to be
overlapping strings in the buffer.  As Python has no nested strings,
this situation is impossible and leads to an infloop in
`python-nav-end-of-statement'.  Protect against this by checking
whether the search for the end of the current string makes progress.
#########################################################################

In this case, Philipp had to apply a workaround.

Seeing as how Stefan is not prepared to take responsibility for his own
bugs, I suggest that I fix it, something I really don't want to spend
time on.  Before I do start spending time on it, I would like some
assurance that my fix will not be blocked or reverted (both have happened
to other things in the core I've worked on), and that I will have a
reasonable amount of time to get the job done (a few weeks) before any
freeze for Emacs 25.3 or 26 comes into force.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 17:40                 ` Alan Mackenzie
@ 2017-09-02 17:53                   ` Eli Zaretskii
  2017-09-03 20:44                   ` John Wiegley
  2017-09-04 23:34                   ` Dmitry Gutov
  2 siblings, 0 replies; 65+ messages in thread
From: Eli Zaretskii @ 2017-09-02 17:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: jwiegley, dgutov, 22983

> Date: Sat, 2 Sep 2017 17:40:27 +0000
> Cc: Dmitry Gutov <dgutov@yandex.ru>, jwiegley@gmail.com, 22983@debbugs.gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > It doesn't sound like this discussion is leading anywhere, and since
> > almost 1.5 years has passed with no comments, I guess this bug doesn't
> > need to block the release of Emacs 26.1, at least.
> 
> I'm not happy about this.  22983 is a serious design flaw, which has had
> deleterious effects deep within Emacs.

I didn't close the bug, mind you.  I just removed it from the list of
those blocking the impending release.  You, or anyone else, are free
to work on fixing it and/or discuss the various approaches to dealing
with this issue.

> Before I do start spending time on it, I would like some assurance
> that my fix will not be blocked or reverted (both have happened to
> other things in the core I've worked on)

I doubt that anyone could give you such a promise without seeing the
proposed changes.  Especially since this and related issues, and
solutions proposed for them, already have some history of being
controversial.

> and that I will have a reasonable amount of time to get the job done
> (a few weeks) before any freeze for Emacs 25.3 or 26 comes into
> force.

That I can promise you.  Feature freeze doesn't affect bugfixes, and
Emacs 26.1 is not going to be released tomorrow or the next week.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 17:40                 ` Alan Mackenzie
  2017-09-02 17:53                   ` Eli Zaretskii
@ 2017-09-03 20:44                   ` John Wiegley
  2017-09-04 23:34                   ` Dmitry Gutov
  2 siblings, 0 replies; 65+ messages in thread
From: John Wiegley @ 2017-09-03 20:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

>>>>> Alan Mackenzie <acm@muc.de> writes:

> Seeing as how Stefan is not prepared to take responsibility for his own bugs

Let's not use language like this if avoidable, please. I'm certain Stefan
would do so, he just may not see this issue the same way you do (which is what
I recall from the extensive discussions on syntax-ppss). To characterize it as
a fault is only discouraging or frustrating; it doesn't help Emacs.

Thanks,
-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 17:40                 ` Alan Mackenzie
  2017-09-02 17:53                   ` Eli Zaretskii
  2017-09-03 20:44                   ` John Wiegley
@ 2017-09-04 23:34                   ` Dmitry Gutov
  2017-09-05  6:57                     ` Andreas Röhler
                                       ` (2 more replies)
  2 siblings, 3 replies; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-04 23:34 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: jwiegley, Philipp Stephani, 22983

On 9/2/17 8:40 PM, Alan Mackenzie wrote:
> I'm not happy about this.  22983 is a serious design flaw, which has had
> deleterious effects deep within Emacs.

I'm sure we want to fix design flaws. As long as there is a solid plan 
that does not swap one flaw for another.

> One recorded example, resulting
> in an infinite loop, is:
> 
> #########################################################################
> From: Philipp Stephani <p.stephani2@gmail.com>
> To: emacs-devel@gnu.org
> Subject: [PATCH] Protect against an infloop in python-mode
> Date: Tue, 28 Feb 2017 22:31:49 +0100
> 
> There appears to be an edge case caused by using `syntax-ppss' in a
> narrowed buffer during JIT lock inside of Python triple-quote strings.
> Unfortunately it is impossible to reproduce without manually
> destroying the syntactic information in the Python buffer, but it has
> been observed in practice.  In that case it can happen that the syntax
> caches get sufficiently out of whack so that there appear to be
> overlapping strings in the buffer.  As Python has no nested strings,
> this situation is impossible and leads to an infloop in
> `python-nav-end-of-statement'.  Protect against this by checking
> whether the search for the end of the current string makes progress.
> #########################################################################
> 
> In this case, Philipp had to apply a workaround.

The problem manifested during jit-lock. Do we understand why the (widen) 
call inside font-lock-default-fontify-region didn't help?





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-04 23:34                   ` Dmitry Gutov
@ 2017-09-05  6:57                     ` Andreas Röhler
  2017-09-05 12:28                     ` John Wiegley
  2017-09-07 17:56                     ` Alan Mackenzie
  2 siblings, 0 replies; 65+ messages in thread
From: Andreas Röhler @ 2017-09-05  6:57 UTC (permalink / raw)
  To: 22983

[-- Attachment #1: Type: text/plain, Size: 1740 bytes --]



On 05.09.2017 01:34, Dmitry Gutov wrote:
> On 9/2/17 8:40 PM, Alan Mackenzie wrote:
>> I'm not happy about this.  22983 is a serious design flaw, which has had
>> deleterious effects deep within Emacs.
>
> I'm sure we want to fix design flaws. As long as there is a solid plan 
> that does not swap one flaw for another.
>
>> One recorded example, resulting
>> in an infinite loop, is:
>>
>> ######################################################################### 
>>
>> From: Philipp Stephani <p.stephani2@gmail.com>
>> To: emacs-devel@gnu.org
>> Subject: [PATCH] Protect against an infloop in python-mode
>> Date: Tue, 28 Feb 2017 22:31:49 +0100
>>
>> There appears to be an edge case caused by using `syntax-ppss' in a
>> narrowed buffer during JIT lock inside of Python triple-quote strings.
>> Unfortunately it is impossible to reproduce without manually
>> destroying the syntactic information in the Python buffer, but it has
>> been observed in practice.  In that case it can happen that the syntax
>> caches get sufficiently out of whack so that there appear to be
>> overlapping strings in the buffer.  As Python has no nested strings,
>> this situation is impossible and leads to an infloop in
>> `python-nav-end-of-statement'.  Protect against this by checking
>> whether the search for the end of the current string makes progress.
>> ######################################################################### 
>>
>>
>> In this case, Philipp had to apply a workaround.
>
> The problem manifested during jit-lock. Do we understand why the 
> (widen) call inside font-lock-default-fontify-region didn't help?
>
>
>


IIRC its about dissolving circular dependencies notably between 
syntax-propertize-function and syntax-ppss.

[-- Attachment #2: Type: text/html, Size: 2904 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-04 23:34                   ` Dmitry Gutov
  2017-09-05  6:57                     ` Andreas Röhler
@ 2017-09-05 12:28                     ` John Wiegley
  2017-09-07 20:45                       ` Alan Mackenzie
  2017-09-07 17:56                     ` Alan Mackenzie
  2 siblings, 1 reply; 65+ messages in thread
From: John Wiegley @ 2017-09-05 12:28 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Philipp Stephani, 22983

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> I'm sure we want to fix design flaws. As long as there is a solid plan that
> does not swap one flaw for another.

Can we have a summary of the current proposal(s) on the table? It would help
to clarify, rather than navigating past discussions. Alan has told me that
this issue is affecting people and has been outstanding for some time; I'd
like to get a better idea of its seriousness/scope, and what effect the
available solutions would have (as Dmitry says, we don't want to replace one
flaw with another).

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-04 23:34                   ` Dmitry Gutov
  2017-09-05  6:57                     ` Andreas Röhler
  2017-09-05 12:28                     ` John Wiegley
@ 2017-09-07 17:56                     ` Alan Mackenzie
  2017-09-07 20:36                       ` Dmitry Gutov
  2 siblings, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-07 17:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: jwiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 05, 2017 at 02:34:15 +0300, Dmitry Gutov wrote:
> On 9/2/17 8:40 PM, Alan Mackenzie wrote:
> > I'm not happy about this.  22983 is a serious design flaw, which has had
> > deleterious effects deep within Emacs.

> I'm sure we want to fix design flaws. As long as there is a solid plan 
> that does not swap one flaw for another.

Plan or not, it should be fixed.

> > One recorded example, resulting
> > in an infinite loop, is:
> > 
> > #########################################################################
> > From: Philipp Stephani <p.stephani2@gmail.com>
> > To: emacs-devel@gnu.org
> > Subject: [PATCH] Protect against an infloop in python-mode
> > Date: Tue, 28 Feb 2017 22:31:49 +0100
> > 
> > There appears to be an edge case caused by using `syntax-ppss' in a
> > narrowed buffer during JIT lock inside of Python triple-quote strings.
> > Unfortunately it is impossible to reproduce without manually
> > destroying the syntactic information in the Python buffer, but it has
> > been observed in practice.  In that case it can happen that the syntax
> > caches get sufficiently out of whack so that there appear to be
> > overlapping strings in the buffer.  As Python has no nested strings,
> > this situation is impossible and leads to an infloop in
> > `python-nav-end-of-statement'.  Protect against this by checking
> > whether the search for the end of the current string makes progress.
> > #########################################################################
> > 
> > In this case, Philipp had to apply a workaround.

> The problem manifested during jit-lock. Do we understand why the (widen) 
> call inside font-lock-default-fontify-region didn't help?

I don't, not in detail, no.  Philipp might know.  But if syntax-ppss was
used whilst the buffer was narrowed, it likely corrupted its cache, and
that corruption remained after widening the buffer.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-07 17:56                     ` Alan Mackenzie
@ 2017-09-07 20:36                       ` Dmitry Gutov
  0 siblings, 0 replies; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-07 20:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: jwiegley, Philipp Stephani, 22983

On 9/7/17 8:56 PM, Alan Mackenzie wrote:

>> The problem manifested during jit-lock. Do we understand why the (widen)
>> call inside font-lock-default-fontify-region didn't help?
> 
> I don't, not in detail, no.  Philipp might know.  But if syntax-ppss was
> used whilst the buffer was narrowed, it likely corrupted its cache, and
> that corruption remained after widening the buffer.

Details matter.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-05 12:28                     ` John Wiegley
@ 2017-09-07 20:45                       ` Alan Mackenzie
  2017-09-08 16:04                         ` Andreas Röhler
  2017-09-09  9:44                         ` Dmitry Gutov
  0 siblings, 2 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-07 20:45 UTC (permalink / raw)
  To: John Wiegley; +Cc: 22983, Philipp Stephani, Dmitry Gutov

Hello, John.

On Tue, Sep 05, 2017 at 13:28:52 +0100, John Wiegley wrote:
> >>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> > I'm sure we want to fix design flaws. As long as there is a solid plan that
> > does not swap one flaw for another.

> Can we have a summary of the current proposal(s) on the table? It would help
> to clarify, rather than navigating past discussions. Alan has told me that
> this issue is affecting people and has been outstanding for some time; I'd
> like to get a better idea of its seriousness/scope, and what effect the
> available solutions would have (as Dmitry says, we don't want to replace one
> flaw with another).

First, I think it's worthwhile emphasising what the function purports to
do:

    syntax-ppss is a compiled Lisp function in `syntax.el'.

    (syntax-ppss &optional POS)

    Parse-Partial-Sexp State at POS, defaulting to point.
    The returned value is the same as that of `parse-partial-sexp'
    run from `point-min' to POS except that values at positions 2 and 6
    in the returned list (counting from 0) cannot be relied upon.
    Point is at POS when this function returns.

The solution I propose is to introduce a second cache into syntax-ppss,
and this cache would be used whenever (not (eq (point-min) 1)).
Whenever point-min changes, and isn't 1, this second cached would be
calculated again from scratch.

This proposal has these advantages:

(i) It would make the function deliver what its unchanged doc string
says.  This is important, given that syntax-ppss has been very widely
used within Emacs, and likely by external packages too; these will
typically have assumed the advertised behaviour of the function, without
having tested it in narrowed buffers.

(i) In the case which currently works, namely a non-narrowed buffer,
there would be only a minute slow-down (basically, there would be extra
code to check point-min and select the cache to use).

(ii) The cache for use in a narrowed buffer might well be sufficiently
fast in normal use.  If it is not, it could be enhanced readily.

I think Dmitry also proposed a method of solution some months ago,
though I don't remember in detail what it was.  Dmitry, do you still
think your solution would work?  If so, please elaborate on it.

> -- 
> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-07 20:45                       ` Alan Mackenzie
@ 2017-09-08 16:04                         ` Andreas Röhler
  2017-09-10 18:26                           ` Alan Mackenzie
  2017-09-09  9:44                         ` Dmitry Gutov
  1 sibling, 1 reply; 65+ messages in thread
From: Andreas Röhler @ 2017-09-08 16:04 UTC (permalink / raw)
  To: 22983; +Cc: Alan Mackenzie, John Wiegley

[-- Attachment #1: Type: text/plain, Size: 2958 bytes --]



On 07.09.2017 22:45, Alan Mackenzie wrote:
> Hello, John.
>
> On Tue, Sep 05, 2017 at 13:28:52 +0100, John Wiegley wrote:
>>>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:
>>> I'm sure we want to fix design flaws. As long as there is a solid plan that
>>> does not swap one flaw for another.
>> Can we have a summary of the current proposal(s) on the table? It would help
>> to clarify, rather than navigating past discussions. Alan has told me that
>> this issue is affecting people and has been outstanding for some time; I'd
>> like to get a better idea of its seriousness/scope, and what effect the
>> available solutions would have (as Dmitry says, we don't want to replace one
>> flaw with another).
> First, I think it's worthwhile emphasising what the function purports to
> do:
>
>      syntax-ppss is a compiled Lisp function in `syntax.el'.
>
>      (syntax-ppss &optional POS)
>
>      Parse-Partial-Sexp State at POS, defaulting to point.
>      The returned value is the same as that of `parse-partial-sexp'
>      run from `point-min' to POS except that values at positions 2 and 6
>      in the returned list (counting from 0) cannot be relied upon.
>      Point is at POS when this function returns.
>
> The solution I propose is to introduce a second cache into syntax-ppss,
> and this cache would be used whenever (not (eq (point-min) 1)).
> Whenever point-min changes, and isn't 1, this second cached would be
> calculated again from scratch.
>
> This proposal has these advantages:
>
> (i) It would make the function deliver what its unchanged doc string
> says.  This is important, given that syntax-ppss has been very widely
> used within Emacs, and likely by external packages too; these will
> typically have assumed the advertised behaviour of the function, without
> having tested it in narrowed buffers.
>
> (i) In the case which currently works, namely a non-narrowed buffer,
> there would be only a minute slow-down (basically, there would be extra
> code to check point-min and select the cache to use).
>
> (ii) The cache for use in a narrowed buffer might well be sufficiently
> fast in normal use.  If it is not, it could be enhanced readily.
>
> I think Dmitry also proposed a method of solution some months ago,
> though I don't remember in detail what it was.  Dmitry, do you still
> think your solution would work?  If so, please elaborate on it.
>
>> -- 
>> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
>> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

Hi Alan and all,

assume a complex matter behind, a bunch of bugs resp. design issues, not 
a single one.
Fixing this would affect syntax-propertize, parse-partial-sexp, 
syntax-ppss and font-lock stuff at once.

http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
points at some spot. There should be more.

As a first step listing referential tests including benchmarks should be 
helpful.




[-- Attachment #2: Type: text/html, Size: 4540 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-07 20:45                       ` Alan Mackenzie
  2017-09-08 16:04                         ` Andreas Röhler
@ 2017-09-09  9:44                         ` Dmitry Gutov
  2017-09-09 10:20                           ` Alan Mackenzie
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  1 sibling, 2 replies; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-09  9:44 UTC (permalink / raw)
  To: Alan Mackenzie, John Wiegley; +Cc: 22983, Philipp Stephani

Hi Alan,

On 9/7/17 11:45 PM, Alan Mackenzie wrote:

> The solution I propose is to introduce a second cache into syntax-ppss,
> and this cache would be used whenever (not (eq (point-min) 1)).
> Whenever point-min changes, and isn't 1, this second cached would be
> calculated again from scratch.

Thanks for writing this up. I think it's a good step, and since it 
follow the current wording of the docstring, it should be highly 
compatible with the existing code.

> This proposal has these advantages:
> 
> (i) It would make the function deliver what its unchanged doc string
> says.  This is important, given that syntax-ppss has been very widely
> used within Emacs, and likely by external packages too; these will
> typically have assumed the advertised behaviour of the function, without
> having tested it in narrowed buffers.

It will also continue to function as expected in mmm-mode, AFAICT, 
without the need for an "escape hatch" we discussed before.

> (i) In the case which currently works, namely a non-narrowed buffer,
> there would be only a minute slow-down (basically, there would be extra
> code to check point-min and select the cache to use).
> 
> (ii) The cache for use in a narrowed buffer might well be sufficiently
> fast in normal use.  If it is not, it could be enhanced readily.

And since the API doesn't change, and the observable behavior doesn't 
either (in the vast majority of cases; probably all except the broken 
ones), we can refine this solution easily, or even swap it for something 
else, with little cost.

> I think Dmitry also proposed a method of solution some months ago,
> though I don't remember in detail what it was.  Dmitry, do you still
> think your solution would work?  If so, please elaborate on it.

There is a simple patch at 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22983#47, but I after some 
consideration, I now prefer your proposed approach. We've also had some 
grander ideas about enhancing things further, but those can be added 
later, after we finally decide.

I do want to know what Stefan thinks of this subject now, though.

Caveats:

- This solves the dependency on point-min, but does nothing about the 
dependency on the current syntax-table (which can change). I'm not 
necessarily suggesting we try to solve that now, though.

- Before this change is pushed to master, or shortly after, I'd like to 
know that it actually fixed the problem Philipp experienced with 
python-mode, so we can revert 4fbd330. If it was caused by e.g. 
syntax-table changing, we've not improved much.

All the best,
Dmitry.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-09  9:44                         ` Dmitry Gutov
@ 2017-09-09 10:20                           ` Alan Mackenzie
  2017-09-09 12:18                             ` Dmitry Gutov
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  1 sibling, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-09 10:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Sat, Sep 09, 2017 at 12:44:02 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/7/17 11:45 PM, Alan Mackenzie wrote:

> > The solution I propose is to introduce a second cache into syntax-ppss,
> > and this cache would be used whenever (not (eq (point-min) 1)).
> > Whenever point-min changes, and isn't 1, this second cached would be
> > calculated again from scratch.

> Thanks for writing this up. I think it's a good step, and since it 
> follow the current wording of the docstring, it should be highly 
> compatible with the existing code.

Thanks.

> > This proposal has these advantages:

> > (i) It would make the function deliver what its unchanged doc string
> > says.  This is important, given that syntax-ppss has been very widely
> > used within Emacs, and likely by external packages too; these will
> > typically have assumed the advertised behaviour of the function, without
> > having tested it in narrowed buffers.

> It will also continue to function as expected in mmm-mode, AFAICT, 
> without the need for an "escape hatch" we discussed before.

> > (i) In the case which currently works, namely a non-narrowed buffer,
> > there would be only a minute slow-down (basically, there would be extra
> > code to check point-min and select the cache to use).

> > (ii) The cache for use in a narrowed buffer might well be sufficiently
> > fast in normal use.  If it is not, it could be enhanced readily.

> And since the API doesn't change, and the observable behavior doesn't 
> either (in the vast majority of cases; probably all except the broken 
> ones), we can refine this solution easily, or even swap it for something 
> else, with little cost.

Yes.  I now have a provisional implementation of this new strategy,
which works on the test case for xdisp.c with which I opened the bug.
It seems to be working, generally.  I need to test it more thoroughly.

In the implementation, I have left the function `syntax-ppss' untouched
except for adding a function call to set up the cache right at the
start.  I have refactored syntax-ppss-flush-cache, extracting a function
which is called directly from the cache-selecting code.  Other than
that, there is one new function (which switches the current cache in
use) and a few new variables to keep track of the caches.

> > I think Dmitry also proposed a method of solution some months ago,
> > though I don't remember in detail what it was.  Dmitry, do you still
> > think your solution would work?  If so, please elaborate on it.

> There is a simple patch at 
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22983#47, but I after some 
> consideration, I now prefer your proposed approach. We've also had some 
> grander ideas about enhancing things further, but those can be added 
> later, after we finally decide.

Yes, I agree.

> I do want to know what Stefan thinks of this subject now, though.

Yes.

> Caveats:

> - This solves the dependency on point-min, but does nothing about the 
> dependency on the current syntax-table (which can change). I'm not 
> necessarily suggesting we try to solve that now, though.

I had some ideas on this back in the spring (about having "indirect
variables") which could be used quickly to "swap out" the current
syntax-table text properties, and (more importantly) quickly swap them
back in.  But that's for another day.

> - Before this change is pushed to master, or shortly after, I'd like to 
> know that it actually fixed the problem Philipp experienced with 
> python-mode, so we can revert 4fbd330. If it was caused by e.g. 
> syntax-table changing, we've not improved much.

I am naturally interested in this, too.  If my patch doesn't fix this
bug, at least it will have removed a layer of fog inhibiting its
investigation.

> All the best,
> Dmitry.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-09 10:20                           ` Alan Mackenzie
@ 2017-09-09 12:18                             ` Dmitry Gutov
  2017-09-10 11:42                               ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-09 12:18 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

On 9/9/17 1:20 PM, Alan Mackenzie wrote:

> In the implementation, I have left the function `syntax-ppss' untouched
> except for adding a function call to set up the cache right at the
> start.  I have refactored syntax-ppss-flush-cache, extracting a function
> which is called directly from the cache-selecting code.  Other than
> that, there is one new function (which switches the current cache in
> use) and a few new variables to keep track of the caches.

Not sure I understand. If you call (syntax-ppss) with significantly 
different narrowings without flushing the cache (e.g. without modifying 
the buffer), sounds like it'll have to return the same results under the 
described implementation.

If so, it doesn't sound strict enough.

>> Caveats:
> 
>> - This solves the dependency on point-min, but does nothing about the
>> dependency on the current syntax-table (which can change). I'm not
>> necessarily suggesting we try to solve that now, though.
> 
> I had some ideas on this back in the spring (about having "indirect
> variables") which could be used quickly to "swap out" the current
> syntax-table text properties, and (more importantly) quickly swap them
> back in.  But that's for another day.

I admit I'm not sure what all this implies.

>> - Before this change is pushed to master, or shortly after, I'd like to
>> know that it actually fixed the problem Philipp experienced with
>> python-mode, so we can revert 4fbd330. If it was caused by e.g.
>> syntax-table changing, we've not improved much.
> 
> I am naturally interested in this, too.  If my patch doesn't fix this
> bug, at least it will have removed a layer of fog inhibiting its
> investigation.

Let's hope so.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-09  9:44                         ` Dmitry Gutov
  2017-09-09 10:20                           ` Alan Mackenzie
@ 2017-09-10 11:36                           ` Alan Mackenzie
  2017-09-10 22:53                             ` Stefan Monnier
                                               ` (2 more replies)
  1 sibling, 3 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-10 11:36 UTC (permalink / raw)
  To: Dmitry Gutov, Philipp Stephani; +Cc: John Wiegley, 22983

Hello, Dmitry and Philipp.

On Sat, Sep 09, 2017 at 12:44:02 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/7/17 11:45 PM, Alan Mackenzie wrote:

> > The solution I propose is to introduce a second cache into syntax-ppss,
> > and this cache would be used whenever (not (eq (point-min) 1)).
> > Whenever point-min changes, and isn't 1, this second cached would be
> > calculated again from scratch.

Here is a patch implementing this.  Comments about it would be welcome.

[ .... ]

> And since the API doesn't change, and the observable behavior doesn't 
> either (in the vast majority of cases; probably all except the broken 
> ones), we can refine this solution easily, or even swap it for something 
> else, with little cost.

[ .... ]

> Caveats:

> - This solves the dependency on point-min, but does nothing about the 
> dependency on the current syntax-table (which can change). I'm not 
> necessarily suggesting we try to solve that now, though.

> - Before this change is pushed to master, or shortly after, I'd like to 
> know that it actually fixed the problem Philipp experienced with 
> python-mode, so we can revert 4fbd330. If it was caused by e.g. 
> syntax-table changing, we've not improved much.

Philipp, any chance of you trying out python mode with this patch but
without 4fbd330?



diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..952ea8bb83 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -386,11 +386,103 @@ syntax-ppss-cache
 (defvar-local syntax-ppss-last nil
   "Cache of (LAST-POS . LAST-PPSS).")
 
-(defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
-(defun syntax-ppss-flush-cache (beg &rest ignored)
-  "Flush the cache of `syntax-ppss' starting at position BEG."
-  ;; Set syntax-propertize to refontify anything past beg.
-  (setq syntax-propertize--done (min beg syntax-propertize--done))
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;;
+;; The implementation which follows uses three caches, the current one
+;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
+;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
+;; former state of the active cache as it was used in widened and
+;; narrowed buffers respectively.  There are also the variables
+;; `syntax-ppss-max-valid-{wide,narrow}' which hold the maximum
+;; position where the caches are valid, due to buffer changes.
+;;
+;; At the first call to `syntax-ppss' after a widening or narrowing of
+;; the buffer, the pertinent inactive cache is swapped into the
+;; current cache by calling `syntax-ppss-set-cache'.  Note that there
+;; is currently just one inactive cache for narrowed buffers, so only
+;; one inactive narrowed cache can be stored at a time.
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-cache-wide nil
+  "Holds the value of `syntax-ppss-cache' for a widened buffer.")
+(defvar-local syntax-ppss-last-wide nil
+  "Holds the value of `syntax-ppss-last' for a widened buffer.")
+(defvar-local syntax-ppss-max-valid-wide most-positive-fixnum
+  "The buffer position after which `syntax-ppss-cache-wide' is invalid.")
+
+(defvar-local syntax-ppss-cache-narrow nil
+  "Holds the value of `syntax-ppss-cache' for a narrowed buffer.")
+(defvar-local syntax-ppss-last-narrow nil
+  "Holds the value of `syntax-ppss-last' for a narrowed buffer.")
+(defvar-local syntax-ppss-max-valid-narrow most-positive-fixnum
+  "The buffer position after which `syntax-ppss-cache-narrow' is invalid.")
+
+(defvar-local syntax-ppss-narrow-point-min 1
+  "Value of `point-min' for which the stored \"narrow\" cache is valid.")
+
+(defvar-local syntax-ppss-supremum most-positive-fixnum
+  "Lowest change position since previous restriction change.")
+
+(defvar-local syntax-ppss-cache-point-min 1
+  "Value of `point-min' for which the current cache is valid.")
+
+(defun syntax-ppss-set-cache ()
+  "Swap in and out the cache pertinent to the new point-min."
+  (unless (eq (point-min) syntax-ppss-cache-point-min)
+    ;; Update the stored `...max-valid' values.
+    (setq syntax-ppss-max-valid-wide
+          (if (eq syntax-ppss-cache-point-min 1)
+              (or (caar syntax-ppss-cache)
+                  1)
+            (min syntax-ppss-max-valid-wide syntax-ppss-supremum)))
+    (setq syntax-ppss-max-valid-narrow
+          (if (eq syntax-ppss-cache-point-min syntax-ppss-narrow-point-min)
+              (or (caar syntax-ppss-cache)
+                  syntax-ppss-cache-point-min)
+            (min syntax-ppss-max-valid-narrow syntax-ppss-supremum)))
+    (setq syntax-ppss-supremum most-positive-fixnum)
+
+    ;; Store away the current values of the cache.
+    (cond
+     ((eq syntax-ppss-cache-point-min 1)
+      (setq syntax-ppss-cache-wide syntax-ppss-cache
+            syntax-ppss-last-wide syntax-ppss-last))
+     ((eq syntax-ppss-cache-point-min syntax-ppss-narrow-point-min)
+      (setq syntax-ppss-cache-narrow syntax-ppss-cache
+            syntax-ppss-last-narrow syntax-ppss-last))
+     (syntax-ppss-cache
+      (setq syntax-ppss-narrow-point-min syntax-ppss-cache-point-min
+            syntax-ppss-cache-narrow syntax-ppss-cache
+            syntax-ppss-last-narrow syntax-ppss-last))
+     (t nil))
+
+    ;; Restore/initialize the cache for the new point-min.
+    (cond
+     ((eq (point-min) 1)
+      (setq syntax-ppss-cache syntax-ppss-cache-wide
+            syntax-ppss-last syntax-ppss-last-wide)
+      (save-restriction
+        (widen)
+        (syntax-ppss-invalidate-cache syntax-ppss-max-valid-wide)))
+     ((eq (point-min) syntax-ppss-narrow-point-min)
+      (setq syntax-ppss-cache syntax-ppss-cache-narrow
+            syntax-ppss-last syntax-ppss-last-narrow)
+      (save-restriction
+        (widen)
+        (syntax-ppss-invalidate-cache syntax-ppss-max-valid-narrow)))
+     (t
+      (setq syntax-ppss-cache nil
+            syntax-ppss-last nil)))
+    (setq syntax-ppss-cache-point-min (point-min))))
+
+(defun syntax-ppss-invalidate-cache (beg &rest ignored)
+  "Invalidate the cache of `syntax-ppss' starting at position BEG."
   ;; Flush invalid cache entries.
   (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
     (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
@@ -411,6 +503,16 @@ syntax-ppss-flush-cache
   ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
   )
 
+;; Retain the following two for compatibility reasons.
+(defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
+(defun syntax-ppss-flush-cache (beg &rest ignored)
+  "Flush the `syntax-ppss' caches and set `syntax-propertize--done'."
+  (setq syntax-ppss-supremum (min beg syntax-ppss-supremum))
+  ;; Ensure the appropriate cache is active.
+  (syntax-ppss-set-cache)
+  (setq syntax-propertize--done (min beg syntax-propertize--done))
+  (syntax-ppss-invalidate-cache beg ignored))
+
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
 (defun syntax-ppss-stats ()
@@ -434,6 +536,8 @@ syntax-ppss
 this function is called while `before-change-functions' is
 temporarily let-bound, or if the buffer is modified without
 running the hook."
+  ;; Ensure the appropriate cache is active.
+  (syntax-ppss-set-cache)
   ;; Default values.
   (unless pos (setq pos (point)))
   (syntax-propertize pos)


> All the best,
> Dmitry.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply related	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-09 12:18                             ` Dmitry Gutov
@ 2017-09-10 11:42                               ` Alan Mackenzie
  0 siblings, 0 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-10 11:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Sat, Sep 09, 2017 at 15:18:11 +0300, Dmitry Gutov wrote:
> On 9/9/17 1:20 PM, Alan Mackenzie wrote:

> > In the implementation, I have left the function `syntax-ppss' untouched
> > except for adding a function call to set up the cache right at the
> > start.  I have refactored syntax-ppss-flush-cache, extracting a function
> > which is called directly from the cache-selecting code.  Other than
> > that, there is one new function (which switches the current cache in
> > use) and a few new variables to keep track of the caches.

> Not sure I understand. If you call (syntax-ppss) with significantly 
> different narrowings without flushing the cache (e.g. without modifying 
> the buffer), sounds like it'll have to return the same results under the 
> described implementation.

> If so, it doesn't sound strict enough.

On changing from one narrowing to another narrowing (more precisely, when
point-min is changed, neither value being 1), the cache is flushed, even
though the buffer has not been modified.

Anyhow, I've posted a patch elsewhere on this thread.  Comments on it
would be welcome.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-08 16:04                         ` Andreas Röhler
@ 2017-09-10 18:26                           ` Alan Mackenzie
  0 siblings, 0 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-10 18:26 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 22983

Hello, Andreas.

On Fri, Sep 08, 2017 at 18:04:37 +0200, Andreas Röhler wrote:

> On 07.09.2017 22:45, Alan Mackenzie wrote:

> > The solution I propose is to introduce a second cache into syntax-ppss,
> > and this cache would be used whenever (not (eq (point-min) 1)).
> > Whenever point-min changes, and isn't 1, this second cached would be
> > calculated again from scratch.

> > This proposal has these advantages:

> > (i) It would make the function deliver what its unchanged doc string
> > says.  This is important, given that syntax-ppss has been very widely
> > used within Emacs, and likely by external packages too; these will
> > typically have assumed the advertised behaviour of the function, without
> > having tested it in narrowed buffers.

> > (i) In the case which currently works, namely a non-narrowed buffer,
> > there would be only a minute slow-down (basically, there would be extra
> > code to check point-min and select the cache to use).

> > (ii) The cache for use in a narrowed buffer might well be sufficiently
> > fast in normal use.  If it is not, it could be enhanced readily.

> Hi Alan and all,

> assume a complex matter behind, a bunch of bugs resp. design issues, not 
> a single one.

I don't think this bug is _that_ complex, and even if it has associated
bugs, I think we can fix it on its own.

> Fixing this would affect syntax-propertize, parse-partial-sexp, 
> syntax-ppss and font-lock stuff at once.

I'll give you one out of four.  ;-)  syntax-ppss will definitely be
affected, parse-partial-sexp definitely not, and the other two possibly
in corner cases, but hopefully not.

> http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
> points at some spot. There should be more.

I think, at least I hope, that is an orthoganol issue.

> As a first step listing referential tests including benchmarks should be 
> helpful.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
@ 2017-09-10 22:53                             ` Stefan Monnier
  2017-09-10 23:36                               ` Dmitry Gutov
  2017-09-11 19:42                               ` Alan Mackenzie
  2017-09-11  0:11                             ` Dmitry Gutov
  2017-09-17 11:12                             ` Philipp Stephani
  2 siblings, 2 replies; 65+ messages in thread
From: Stefan Monnier @ 2017-09-10 22:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

> +;; Several caches.
> +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
> +;; (POINT-MIN) x), we need either to empty the cache when we narrow
> +;; the buffer, which is suboptimal, or we need to use several caches.

I think that (parse-partial-sexp 1 x) is more often what the caller
wants than (parse-partial-sexp (point-min) x), but if you're happy with
the behavior described by the docstring, then that's fine.

> +;; The implementation which follows uses three caches, the current one
> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
> +;; former state of the active cache as it was used in widened and
> +;; narrowed buffers respectively.

Earlier in the thread, I suggested to use a single cache indexed by the
position of point-min (or by the position and point-min and by the
current syntax-table, so as to also handle changes in the syntax-table),
i.e. a list of (POINT-MIN-POS . CACHE-DATA) or
((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA).  I think it would lead to
less code duplication than your patch which only handles 2 different
POINT-MIN-POS (and one of the two has to be equal to 1), but existing
code trumps hypothetical designs.

So, I have no objections to the patch.  But I think (parse-partial-sexp
(point-min) x) is a design bug in syntax-ppss which we will need to fix
sooner or later, which is why I never bothered to implement something
like your patch, which only makes the code do what its doc says rather
than what the caller needs.


        Stefan





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 22:53                             ` Stefan Monnier
@ 2017-09-10 23:36                               ` Dmitry Gutov
  2017-09-11 11:10                                 ` Stefan Monnier
  2017-09-11 19:42                               ` Alan Mackenzie
  1 sibling, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-10 23:36 UTC (permalink / raw)
  To: Stefan Monnier, Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

On 9/11/17 1:53 AM, Stefan Monnier wrote:

> I think that (parse-partial-sexp 1 x) is more often what the caller
> wants than (parse-partial-sexp (point-min) x), but if you're happy with
> the behavior described by the docstring, then that's fine.

And yet, I struggle to find such callers. But those that do, can 
(save-restriction (widen) (syntax-ppss)) anyway.

>> +;; The implementation which follows uses three caches, the current one
>> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
>> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
>> +;; former state of the active cache as it was used in widened and
>> +;; narrowed buffers respectively.
> 
> Earlier in the thread, I suggested to use a single cache indexed by the
> position of point-min

That would lead to clobbering the global cache when we use syntax-ppss 
for some local parsing. E.g. if ruby-syntax-propertize-percent-literal 
didn't bind parse-sexp-lookup-properties to nil, it might clobber the 
cache unnecessarily.

I don't have the data on whether this would be a frequent problem, though.

> i.e. a list of (POINT-MIN-POS . CACHE-DATA) or
> ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA).  I think it would lead to
> less code duplication than your patch which only handles 2 different
> POINT-MIN-POS (and one of the two has to be equal to 1), but existing
> code trumps hypothetical designs.

I also think there's a way to implement this behavior with less code and 
new variables, albeit with extra indirection.

> So, I have no objections to the patch.  But I think (parse-partial-sexp
> (point-min) x) is a design bug in syntax-ppss which we will need to fix
> sooner or later, which is why I never bothered to implement something
> like your patch, which only makes the code do what its doc says rather
> than what the caller needs.

I'm considering the idea now that syntax-ppss should stay a caching 
wrapper around parse-partial-sexp, and the responsibility to widen 
should always be the caller's. This way, it can be used for different 
purposes that we've discussed before many times.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  2017-09-10 22:53                             ` Stefan Monnier
@ 2017-09-11  0:11                             ` Dmitry Gutov
  2017-09-11 20:12                               ` Alan Mackenzie
  2017-09-17 11:12                             ` Philipp Stephani
  2 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-11  0:11 UTC (permalink / raw)
  To: Alan Mackenzie, Philipp Stephani; +Cc: John Wiegley, 22983

On 9/10/17 2:36 PM, Alan Mackenzie wrote:

>>> The solution I propose is to introduce a second cache into syntax-ppss,
>>> and this cache would be used whenever (not (eq (point-min) 1)).
>>> Whenever point-min changes, and isn't 1, this second cached would be
>>> calculated again from scratch.
> 
> Here is a patch implementing this.  Comments about it would be welcome.

Thank you. It seems to hold up to the main test scenario I had in mind, 
so I don't have any complaints behavior-wise.

It looks pretty big, though. With lots of new global variables.

Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 
new ones.

I propose two avenues for simplification:

1) Use a cons structure for the (PPSS-CACHE . PPSS-LAST) structure. We 
will have three global variables total: syntax-ppss-data-wide, 
syntax-ppss-data-narrow, syntax-ppss-data-narrow-point-min. syntax-ppss 
would bind a local variable syntax-ppss-data to one of the first two 
depending on the value of the third (and then modify its car and cdr 
during the course of execution).

2) Some extra vars serve to delay the actual clearing of the unused 
cache until it's used again. It's a valid idea, but what if we try 
without it at first? So syntax-ppss-flush-cache would always clear both 
caches eagerly.

The advantages:

- Less code, easier to reason about.

- Any package than advises syntax-ppss will have to juggle fewer global 
variables. So Vatalie's polymode will have an easier time of it. It 
could even reuse some of the cache-while-narrowed logic by substituting 
the values of syntax-ppss-data-narrow and 
syntax-ppss-data-narrow-point-min as appropriate.

The obvious downside is, of course, extra indirection, which translates 
to extra overhead. We don't know how significant it will be, though.

Would you like to see the code?





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 23:36                               ` Dmitry Gutov
@ 2017-09-11 11:10                                 ` Stefan Monnier
  2017-09-12  0:11                                   ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: Stefan Monnier @ 2017-09-11 11:10 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Philipp Stephani, John Wiegley, 22983

>> I think that (parse-partial-sexp 1 x) is more often what the caller
>> wants than (parse-partial-sexp (point-min) x), but if you're happy with
>> the behavior described by the docstring, then that's fine.
> And yet, I struggle to find such callers.  But those that do, can
> (save-restriction (widen) (syntax-ppss)) anyway.

Good point.

>>> +;; The implementation which follows uses three caches, the current one
>>> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
>>> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
>>> +;; former state of the active cache as it was used in widened and
>>> +;; narrowed buffers respectively.
>> Earlier in the thread, I suggested to use a single cache indexed by the
>> position of point-min
> That would lead to clobbering the global cache when we use syntax-ppss for
> some local parsing.

My suggestion is to have a list of N caches, instead of having exactly
2 caches.  I can't see how that could lead to more clobbering.

> I'm considering the idea now that syntax-ppss should stay a caching wrapper
> around parse-partial-sexp, and the responsibility to widen should always be
> the caller's.  This way, it can be used for different purposes that we've
> discussed before many times.

It does have the advantage of circumventing the discussion of
"up-to-where should we widen" ;-)


        Stefan





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 22:53                             ` Stefan Monnier
  2017-09-10 23:36                               ` Dmitry Gutov
@ 2017-09-11 19:42                               ` Alan Mackenzie
  2017-09-11 20:20                                 ` Stefan Monnier
  1 sibling, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-11 19:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

Hello, Stefan.

On Sun, Sep 10, 2017 at 18:53:53 -0400, Stefan Monnier wrote:
> > +;; Several caches.
> > +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
> > +;; (POINT-MIN) x), we need either to empty the cache when we narrow
> > +;; the buffer, which is suboptimal, or we need to use several caches.

> I think that (parse-partial-sexp 1 x) is more often what the caller
> wants than (parse-partial-sexp (point-min) x), but if you're happy with
> the behavior described by the docstring, then that's fine.

I've never been happy with the specification, partly for that reason,
but we are where we are, with lots of use of syntax-ppss, so I think it
needs fixing according to that spec.

> > +;; The implementation which follows uses three caches, the current one
> > +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
> > +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
> > +;; former state of the active cache as it was used in widened and
> > +;; narrowed buffers respectively.

> Earlier in the thread, I suggested to use a single cache indexed by the
> position of point-min (or by the position and point-min and by the
> current syntax-table, so as to also handle changes in the syntax-table),
> i.e. a list of (POINT-MIN-POS . CACHE-DATA) or
> ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA).  I think it would lead to
> less code duplication than your patch which only handles 2 different
> POINT-MIN-POS (and one of the two has to be equal to 1), but existing
> code trumps hypothetical designs.

I deliberately kept the patch simple, avoiding even an alist with the
point-min position as key.  This would necessitate having an arbitrary
maximum length of alist, and continual manipulation of this list.  Not
difficult, I agree, but do we need it?  How often are there going to be
nested or alternating narrowing with enough calls to syntax-ppss to
cause the establishment of syntax-ppss-cache (as opposed to merely
syntax-ppss-last, which my patch doesn't consider sufficient reason to
store a new narrow-cache)?  (These aren't rhetorical questions, by the
way, but real ones.  Which is the best way forward?)

However, the patch was deliberately contructed to make the replacement
of the two-cache cache by an arbitrary length alist simple.

> So, I have no objections to the patch.  But I think (parse-partial-sexp
> (point-min) x) is a design bug in syntax-ppss which we will need to fix
> sooner or later, which is why I never bothered to implement something
> like your patch, which only makes the code do what its doc says rather
> than what the caller needs.

I couldn't agree more.  However, syntax-ppss is established and there
are callers that depend on its literal specification.  Maybe a way
forward would be to introduce a new function equivalent to
(parse-partial-sexp 1 x) and deprecate syntax-ppss.  However, a name
would need to be found for this new function, not an easy task.  ;-)
(syntax-ppss is a very good name, but couldn't be reused.)

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11  0:11                             ` Dmitry Gutov
@ 2017-09-11 20:12                               ` Alan Mackenzie
  2017-09-12  0:24                                 ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-11 20:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Mon, Sep 11, 2017 at 03:11:22 +0300, Dmitry Gutov wrote:
> On 9/10/17 2:36 PM, Alan Mackenzie wrote:

> >>> The solution I propose is to introduce a second cache into syntax-ppss,
> >>> and this cache would be used whenever (not (eq (point-min) 1)).
> >>> Whenever point-min changes, and isn't 1, this second cached would be
> >>> calculated again from scratch.

> > Here is a patch implementing this.  Comments about it would be welcome.

> Thank you. It seems to hold up to the main test scenario I had in mind, 
> so I don't have any complaints behavior-wise.

Thanks.

> It looks pretty big, though. With lots of new global variables.

> Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 
> new ones.

Yes.  But each one has a very single purpose, and there are no loops in
the new code, which makes it easier to be sure it is correct.

> I propose two avenues for simplification:

> 1) Use a cons structure for the (PPSS-CACHE . PPSS-LAST) structure. We 
> will have three global variables total: syntax-ppss-data-wide, 
> syntax-ppss-data-narrow, syntax-ppss-data-narrow-point-min. syntax-ppss 
> would bind a local variable syntax-ppss-data to one of the first two 
> depending on the value of the third (and then modify its car and cdr 
> during the course of execution).

I'm in favour rather of setting syntax-ppss-{cache,last} to the
appropriate stored cache.  This will avoid needing to change the
function syntax-ppss much.

A disadvantage of using such a cons is in debugging.  It is more
difficult to understand a cons like this when it is printed out, than
the two component lists (which are difficult enough themselves).

> 2) Some extra vars serve to delay the actual clearing of the unused 
> cache until it's used again. It's a valid idea, but what if we try 
> without it at first? So syntax-ppss-flush-cache would always clear both 
> caches eagerly.

When there's a lot of buffer changing going on, it is an overhead having
to clear both (or several) caches continually.  (I'm thinking about the
possible extension to using an alist of caches, which could be quite
long.)

Also clearing both caches at the same time would be a bigger change to
syntax-ppss-flush-cache than it's suffered so far.

But I'm really not sure which way is better.

> The advantages:

> - Less code, easier to reason about.

> - Any package than advises syntax-ppss will have to juggle fewer global 
> variables.

I was intending that the new variables be purely internal, and that no
external elisp would need to access them.  I suppose I really ought to
have put "--" in the middle of their names.

> So Vatalie's polymode will have an easier time of it. It could even
> reuse some of the cache-while-narrowed logic by substituting the
> values of syntax-ppss-data-narrow and
> syntax-ppss-data-narrow-point-min as appropriate.

That sounds a little dangerous.  

> The obvious downside is, of course, extra indirection, which translates 
> to extra overhead. We don't know how significant it will be, though.

I wouldn't be keen on seeing lots of (car compound-variable) and (cdr
compound-variable) throughout the syntax-ppss function.  I think it
would make it significantly more difficult to understand.

> Would you like to see the code?

Yes, why not?

But just to make my position clear, I'm not particularly fixed on my
patch as submitted.  It was optimised for simplicity and correctness
rather than elegance, though I don't think it's too bad.  I'm fairly
open on whether we use your suggestions or Stefan's suggestion of having
an alist of caches.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11 19:42                               ` Alan Mackenzie
@ 2017-09-11 20:20                                 ` Stefan Monnier
  0 siblings, 0 replies; 65+ messages in thread
From: Stefan Monnier @ 2017-09-11 20:20 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

> difficult, I agree, but do we need it?  How often are there going to be
> nested or alternating narrowing with enough calls to syntax-ppss to
> cause the establishment of syntax-ppss-cache (as opposed to merely
> syntax-ppss-last, which my patch doesn't consider sufficient reason to
> store a new narrow-cache)?  (These aren't rhetorical questions, by the
> way, but real ones.  Which is the best way forward?)

I agree that it probably doesn't make much difference in practice.

> I couldn't agree more.  However, syntax-ppss is established and there
> are callers that depend on its literal specification.  Maybe a way
> forward would be to introduce a new function equivalent to
> (parse-partial-sexp 1 x) and deprecate syntax-ppss.  However, a name
> would need to be found for this new function, not an easy task.  ;-)
> (syntax-ppss is a very good name, but couldn't be reused.)

Let's go with your patch for now, and then see if Dmitry's impression
that adding a call to `widen` before the call works even better.


        Stefan





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11 11:10                                 ` Stefan Monnier
@ 2017-09-12  0:11                                   ` Dmitry Gutov
  2017-09-12 22:12                                     ` Richard Stallman
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-12  0:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, Philipp Stephani, John Wiegley, 22983

On 9/11/17 2:10 PM, Stefan Monnier wrote:

> My suggestion is to have a list of N caches, instead of having exactly
> 2 caches.  I can't see how that could lead to more clobbering.

Um, sorry I misunderstood. I interpreted that as only keeping one pair.

But here are some other issues:

1) If we maintain a cache for all narrowings that have ever been used in 
the buffer, we adopt the idea that all of them are "real" and e.g. 
correspond to chunks in different major modes in a multi-mode context. 
Switching to a different syntax table and parsing a segment of text like 
ruby-syntax-propertize-percent-literal does falls outside of this 
concept. But of course, we can index by syntax table as well... overall, 
things become much complex than when changing the narrowing bounds 
implies just throwing away that cache.

2) If there are a lot of elements inside the cache alist, we have to get 
rid of them from time to time. Not sure what the rules will be. Again, 
if they correspond to multi-mode chunks, we can at least be confident 
that the number of items in the alist will be finite. Not necessarily so 
if narrowing+spss is used for arbitrary purposes.

3) As the number of elements in the alist grows, flushing each value 
inside syntax-ppss-flush-cache eagerly will become slower and slower, I 
expect. And a lazy strategy of the kind proposed by Alan will become 
necessary.

>> I'm considering the idea now that syntax-ppss should stay a caching wrapper
>> around parse-partial-sexp, and the responsibility to widen should always be
>> the caller's.  This way, it can be used for different purposes that we've
>> discussed before many times.
> 
> It does have the advantage of circumventing the discussion of
> "up-to-where should we widen" ;-)

Indeed. :)





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11 20:12                               ` Alan Mackenzie
@ 2017-09-12  0:24                                 ` Dmitry Gutov
  2017-09-17 10:29                                   ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-12  0:24 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

On 9/11/17 11:12 PM, Alan Mackenzie wrote:

>> Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8
>> new ones.
> 
> Yes.  But each one has a very single purpose, and there are no loops in
> the new code, which makes it easier to be sure it is correct.

On the one hand, yes, on the other hand, the more code you have (or the 
more vars you have to juggle), the harder it is to keep track.

> I'm in favour rather of setting syntax-ppss-{cache,last} to the
> appropriate stored cache.  This will avoid needing to change the
> function syntax-ppss much.

My proposal will change syntax-ppss, yes. So, unfortunately, the patch 
will be more difficult to read. But not the resulting code, hopefully.

But I think I see what you mean. The disadvantage is that we'll need 
code that will ferry those values back to the appropriate variables as 
well (which we see in your patch). We can discuss that option after.

> A disadvantage of using such a cons is in debugging.  It is more
> difficult to understand a cons like this when it is printed out, than
> the two component lists (which are difficult enough themselves).

You win some, you lose some. We could use structs, if you like, but 
overall, the values are already complex, so consing won't make that much 
worse.

> When there's a lot of buffer changing going on, it is an overhead having
> to clear both (or several) caches continually.  (I'm thinking about the
> possible extension to using an alist of caches, which could be quite
> long.)

Both caches - yes, but shouldn't be too bad. The "alist of caches" 
approach would most likely require that laziness, but I'm not sure we 
really want to go there (see another email).

> Also clearing both caches at the same time would be a bigger change to
> syntax-ppss-flush-cache than it's suffered so far.

True.

>> - Any package than advises syntax-ppss will have to juggle fewer global
>> variables.
> 
> I was intending that the new variables be purely internal, and that no
> external elisp would need to access them.  I suppose I really ought to
> have put "--" in the middle of their names.

Yes, but if we can make life easier for some, why not? Sometimes 
third-party author can life with breakage between Emacs versions.

>> So Vatalie's polymode will have an easier time of it. It could even
>> reuse some of the cache-while-narrowed logic by substituting the
>> values of syntax-ppss-data-narrow and
>> syntax-ppss-data-narrow-point-min as appropriate.
> 
> That sounds a little dangerous.

Not much worse than what multi-mode packages already do, though.

>> The obvious downside is, of course, extra indirection, which translates
>> to extra overhead. We don't know how significant it will be, though.
> 
> I wouldn't be keen on seeing lots of (car compound-variable) and (cdr
> compound-variable) throughout the syntax-ppss function.  I think it
> would make it significantly more difficult to understand.

Hopefully there will be only several such places. But again, we can use 
structs.

>> Would you like to see the code?
> 
> Yes, why not?

Please give me until the end of the week.

> But just to make my position clear, I'm not particularly fixed on my
> patch as submitted.  It was optimised for simplicity and correctness
> rather than elegance, though I don't think it's too bad.  I'm fairly
> open on whether we use your suggestions or Stefan's suggestion of having
> an alist of caches.

Cool.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-12  0:11                                   ` Dmitry Gutov
@ 2017-09-12 22:12                                     ` Richard Stallman
  0 siblings, 0 replies; 65+ messages in thread
From: Richard Stallman @ 2017-09-12 22:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983, p.stephani2, jwiegley, monnier, acm

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

There is probably some optimal number of caches to remember.
If the code can handle any number of caches, it can discard
all but the last N, and then we could try adjusting N to get
the best performance.  I expect we don't want N to be more
than 4.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.






^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-12  0:24                                 ` Dmitry Gutov
@ 2017-09-17 10:29                                   ` Alan Mackenzie
  2017-09-17 23:43                                     ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-17 10:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 12, 2017 at 03:24:08 +0300, Dmitry Gutov wrote:
> On 9/11/17 11:12 PM, Alan Mackenzie wrote:

[ .... ]

> > I wouldn't be keen on seeing lots of (car compound-variable) and (cdr
> > compound-variable) throughout the syntax-ppss function.  I think it
> > would make it significantly more difficult to understand.

> Hopefully there will be only several such places. But again, we can use 
> structs.

I don't know anything about these things.  But seeing as how syntax.el is
preloaded, the definition of structs would need to be preloaded earlier.

> >> Would you like to see the code?

> > Yes, why not?

> Please give me until the end of the week.

The end of the week has arrived.  Are you still intending to propose an
alternative formulation of the new cache manipulation for syntax-ppss?

> > But just to make my position clear, I'm not particularly fixed on my
> > patch as submitted.  It was optimised for simplicity and correctness
> > rather than elegance, though I don't think it's too bad.  I'm fairly
> > open on whether we use your suggestions or Stefan's suggestion of having
> > an alist of caches.

> Cool.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  2017-09-10 22:53                             ` Stefan Monnier
  2017-09-11  0:11                             ` Dmitry Gutov
@ 2017-09-17 11:12                             ` Philipp Stephani
  2017-09-19 20:50                               ` Alan Mackenzie
  2 siblings, 1 reply; 65+ messages in thread
From: Philipp Stephani @ 2017-09-17 11:12 UTC (permalink / raw)
  To: Alan Mackenzie, Dmitry Gutov; +Cc: John Wiegley, 22983

[-- Attachment #1: Type: text/plain, Size: 845 bytes --]

Alan Mackenzie <acm@muc.de> schrieb am So., 10. Sep. 2017 um 13:42 Uhr:

>
> > - Before this change is pushed to master, or shortly after, I'd like to
> > know that it actually fixed the problem Philipp experienced with
> > python-mode, so we can revert 4fbd330. If it was caused by e.g.
> > syntax-table changing, we've not improved much.
>
> Philipp, any chance of you trying out python mode with this patch but
> without 4fbd330?


Unfortunately the problem wasn't easily reproducible back then. The problem
would occur from time to time, but I never found a way to trigger it
reproducibly. Therefore the unit test I've added in the commit artificially
generates the symptom. The root cause is still unknown; while syntax-ppss
and narrowing might be a potential root cause (the fontification code uses
both), it might also be something else.

[-- Attachment #2: Type: text/html, Size: 1157 bytes --]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-17 10:29                                   ` Alan Mackenzie
@ 2017-09-17 23:43                                     ` Dmitry Gutov
  2017-09-18 19:08                                       ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-17 23:43 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

Hi Alan,

On 9/17/17 1:29 PM, Alan Mackenzie wrote:

> I don't know anything about these things.  But seeing as how syntax.el is
> preloaded, the definition of structs would need to be preloaded earlier.

OK, let's do without that for now. The result doesn't look too bad to my 
eyes, at least.

>>>> Would you like to see the code?
> 
>>> Yes, why not?
> 
>> Please give me until the end of the week.
> 
> The end of the week has arrived.  Are you still intending to propose an
> alternative formulation of the new cache manipulation for syntax-ppss?

Thanks for the reminder. The patch is attached. I've tested it 
minimally, any feedback is welcome.

(It reads much better in Emacs with diff-auto-refine-mode).


[-- Attachment #2: alt-ppss-fix.diff --]
[-- Type: text/x-patch, Size: 7392 bytes --]

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..a77589f1b7 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -381,10 +381,26 @@ syntax-begin-function
 point (where the PPSS is equivalent to nil).")
 (make-obsolete-variable 'syntax-begin-function nil "25.1")
 
-(defvar-local syntax-ppss-cache nil
-  "List of (POS . PPSS) pairs, in decreasing POS order.")
-(defvar-local syntax-ppss-last nil
-  "Cache of (LAST-POS . LAST-PPSS).")
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;; We use two of them, one for widened buffer, and one for narrowing.
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-wide nil
+  "Cons of two elements (CACHE . LAST).
+Where CACHE is a list of (POS . PPSS) pairs, in decreasing POS order,
+and LAST is a pair (LAST-POS . LAST-PPS) caching the last invocation.
+These are valid when the buffer has no restriction.")
+
+(defvar-local syntax-ppss-narrow nil
+  "Same as `syntax-ppss-wide' but for a narrowed buffer.")
+
+(defvar-local syntax-ppss-narrow-start nil
+  "Start position of the narrowing for `syntax-ppss-narrow'.")
 
 (defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
 (defun syntax-ppss-flush-cache (beg &rest ignored)
@@ -392,24 +408,29 @@ syntax-ppss-flush-cache
   ;; Set syntax-propertize to refontify anything past beg.
   (setq syntax-propertize--done (min beg syntax-propertize--done))
   ;; Flush invalid cache entries.
-  (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
-    (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
-  ;; Throw away `last' value if made invalid.
-  (when (< beg (or (car syntax-ppss-last) 0))
-    ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
-    ;; depend on the text after BEG (which is presumably changed).  So if
-    ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
-    ;; assumed nil state at BEG may not be valid any more.
-    (if (<= beg (or (syntax-ppss-toplevel-pos (cdr syntax-ppss-last))
-                    (nth 3 syntax-ppss-last)
-                    0))
-	(setq syntax-ppss-last nil)
-      (setcar syntax-ppss-last nil)))
-  ;; Unregister if there's no cache left.  Sadly this doesn't work
-  ;; because `before-change-functions' is temporarily bound to nil here.
-  ;; (unless syntax-ppss-cache
-  ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
-  )
+  (dolist (cell (list syntax-ppss-wide syntax-ppss-narrow))
+    (pcase cell
+      (`(,cache . ,last)
+       (while (and cache (> (caar cache) beg))
+         (setq cache (cdr cache)))
+       ;; Throw away `last' value if made invalid.
+       (when (< beg (or (car last) 0))
+         ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
+         ;; depend on the text after BEG (which is presumably changed).  So if
+         ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
+         ;; assumed nil state at BEG may not be valid any more.
+         (if (<= beg (or (syntax-ppss-toplevel-pos (cdr last))
+                         (nth 3 last)
+                         0))
+	     (setq last nil)
+           (setcar last nil)))
+       ;; Unregister if there's no cache left.  Sadly this doesn't work
+       ;; because `before-change-functions' is temporarily bound to nil here.
+       ;; (unless cache
+       ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
+       (setcar cell cache)
+       (setcdr cell last)))
+    ))
 
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
@@ -423,6 +444,17 @@ syntax-ppss-stats
 (defvar-local syntax-ppss-table nil
   "Syntax-table to use during `syntax-ppss', if any.")
 
+(defun syntax-ppss--data ()
+  (if (eq (point-min) 1)
+      (progn
+        (unless syntax-ppss-wide
+          (setq syntax-ppss-wide (cons nil nil)))
+        syntax-ppss-wide)
+    (unless (eq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow (cons nil nil)))
+    syntax-ppss-narrow))
+
 (defun syntax-ppss (&optional pos)
   "Parse-Partial-Sexp State at POS, defaulting to point.
 The returned value is the same as that of `parse-partial-sexp'
@@ -439,10 +471,13 @@ syntax-ppss
   (syntax-propertize pos)
   ;;
   (with-syntax-table (or syntax-ppss-table (syntax-table))
-  (let ((old-ppss (cdr syntax-ppss-last))
-	(old-pos (car syntax-ppss-last))
-	(ppss nil)
-	(pt-min (point-min)))
+  (let* ((cell (syntax-ppss--data))
+         (ppss-cache (car cell))
+         (ppss-last (cdr cell))
+         (old-ppss (cdr ppss-last))
+         (old-pos (car ppss-last))
+         (ppss nil)
+         (pt-min (point-min)))
     (if (and old-pos (> old-pos pos)) (setq old-pos nil))
     ;; Use the OLD-POS if usable and close.  Don't update the `last' cache.
     (condition-case nil
@@ -475,7 +510,7 @@ syntax-ppss
 	   ;; The OLD-* data can't be used.  Consult the cache.
 	   (t
 	    (let ((cache-pred nil)
-		  (cache syntax-ppss-cache)
+		  (cache ppss-cache)
 		  (pt-min (point-min))
 		  ;; I differentiate between PT-MIN and PT-BEST because
 		  ;; I feel like it might be important to ensure that the
@@ -491,7 +526,7 @@ syntax-ppss
 	      (if cache (setq pt-min (caar cache) ppss (cdar cache)))
 
 	      ;; Setup the before-change function if necessary.
-	      (unless (or syntax-ppss-cache syntax-ppss-last)
+	      (unless (or ppss-cache ppss-last)
 		(add-hook 'before-change-functions
 			  'syntax-ppss-flush-cache t t))
 
@@ -541,7 +576,7 @@ syntax-ppss
 			      pt-min (setq pt-min (/ (+ pt-min pos) 2))
 			      nil nil ppss))
                   (push (cons pt-min ppss)
-                        (if cache-pred (cdr cache-pred) syntax-ppss-cache)))
+                        (if cache-pred (cdr cache-pred) ppss-cache)))
 
 		;; Compute the actual return value.
 		(setq ppss (parse-partial-sexp pt-min pos nil nil ppss))
@@ -562,13 +597,15 @@ syntax-ppss
 		      (if (> (- (caar cache-pred) pos) syntax-ppss-max-span)
 			  (push pair (cdr cache-pred))
 			(setcar cache-pred pair))
-		    (if (or (null syntax-ppss-cache)
-			    (> (- (caar syntax-ppss-cache) pos)
+		    (if (or (null ppss-cache)
+			    (> (- (caar ppss-cache) pos)
 			       syntax-ppss-max-span))
-			(push pair syntax-ppss-cache)
-		      (setcar syntax-ppss-cache pair)))))))))
+			(push pair ppss-cache)
+		      (setcar ppss-cache pair)))))))))
 
-	  (setq syntax-ppss-last (cons pos ppss))
+	  (setq ppss-last (cons pos ppss))
+          (setcar cell ppss-cache)
+          (setcdr cell ppss-last)
 	  ppss)
       (args-out-of-range
        ;; If the buffer is more narrowed than when we built the cache,
@@ -582,7 +619,7 @@ syntax-ppss
 (defun syntax-ppss-debug ()
   (let ((pt nil)
 	(min-diffs nil))
-    (dolist (x (append syntax-ppss-cache (list (cons (point-min) nil))))
+    (dolist (x (append (car (syntax-ppss--data)) (list (cons (point-min) nil))))
       (when pt (push (- pt (car x)) min-diffs))
       (setq pt (car x)))
     min-diffs))

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-17 23:43                                     ` Dmitry Gutov
@ 2017-09-18 19:08                                       ` Alan Mackenzie
  2017-09-19  0:02                                         ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-18 19:08 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry

On Mon, Sep 18, 2017 at 02:43:05 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/17/17 1:29 PM, Alan Mackenzie wrote:

> > I don't know anything about these things.  But seeing as how syntax.el is
> > preloaded, the definition of structs would need to be preloaded earlier.

> OK, let's do without that for now. The result doesn't look too bad to my 
> eyes, at least.

> >>>> Would you like to see the code?

> >>> Yes, why not?

> >> Please give me until the end of the week.

> > The end of the week has arrived.  Are you still intending to propose an
> > alternative formulation of the new cache manipulation for syntax-ppss?

> Thanks for the reminder. The patch is attached. I've tested it 
> minimally, any feedback is welcome.

Thanks for this.  I'm impressed.  Your syntax-ppss--data is far more
elegant than my syntax-ppss-set-cache.  The burden of carrying around
the caches in cons cells is much less than I had feared.  The amendments
to syntax-ppss are also less than I had feared, amounting to little more
than substituting "syntax-ppss-cache" with "ppss-cache" etc., and making
a few bindings to support that.

I notice you flush both caches eagerly, as you said you would.  No harm
in that.

So, I'm willing to go with your version.  I haven't tried actually
running it, yet.

But there's one small change I would ask you to consider making - that
is, in the cache conses, to put ppss-last in the car and ppss-cache in
the cdr.  That way, while debugging, ppss-last will be easy to find
(it's the first element of the list) and ppss-cache will also be easy to
find (the second element onwards).  

> (It reads much better in Emacs with diff-auto-refine-mode).

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-18 19:08                                       ` Alan Mackenzie
@ 2017-09-19  0:02                                         ` Dmitry Gutov
  2017-09-19 20:47                                           ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-19  0:02 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

[-- Attachment #1: Type: text/plain, Size: 1080 bytes --]

On 9/18/17 10:08 PM, Alan Mackenzie wrote:

> Thanks for this.  I'm impressed.  Your syntax-ppss--data is far more
> elegant than my syntax-ppss-set-cache.  The burden of carrying around
> the caches in cons cells is much less than I had feared.  The amendments
> to syntax-ppss are also less than I had feared, amounting to little more
> than substituting "syntax-ppss-cache" with "ppss-cache" etc., and making
> a few bindings to support that.

Thanks!

> I notice you flush both caches eagerly, as you said you would.  No harm
> in that.
> 
> So, I'm willing to go with your version.  I haven't tried actually
> running it, yet.

Please do.

> But there's one small change I would ask you to consider making - that
> is, in the cache conses, to put ppss-last in the car and ppss-cache in
> the cdr.  That way, while debugging, ppss-last will be easy to find
> (it's the first element of the list) and ppss-cache will also be easy to
> find (the second element onwards).

Sure, that makes a lot of sense, since ppss-last is a smaller structure. 
The modified patch is attached.

[-- Attachment #2: alt-ppss-fix-2.diff --]
[-- Type: text/x-patch, Size: 7391 bytes --]

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..c44e754ac0 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -381,10 +381,26 @@ syntax-begin-function
 point (where the PPSS is equivalent to nil).")
 (make-obsolete-variable 'syntax-begin-function nil "25.1")
 
-(defvar-local syntax-ppss-cache nil
-  "List of (POS . PPSS) pairs, in decreasing POS order.")
-(defvar-local syntax-ppss-last nil
-  "Cache of (LAST-POS . LAST-PPSS).")
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;; We use two of them, one for widened buffer, and one for narrowing.
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-wide nil
+  "Cons of two elements (LAST . CACHE).
+Where LAST is a pair (LAST-POS . LAST-PPS) caching the last invocation
+and CACHE is a list of (POS . PPSS) pairs, in decreasing POS order.
+These are valid when the buffer has no restriction.")
+
+(defvar-local syntax-ppss-narrow nil
+  "Same as `syntax-ppss-wide' but for a narrowed buffer.")
+
+(defvar-local syntax-ppss-narrow-start nil
+  "Start position of the narrowing for `syntax-ppss-narrow'.")
 
 (defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
 (defun syntax-ppss-flush-cache (beg &rest ignored)
@@ -392,24 +408,29 @@ syntax-ppss-flush-cache
   ;; Set syntax-propertize to refontify anything past beg.
   (setq syntax-propertize--done (min beg syntax-propertize--done))
   ;; Flush invalid cache entries.
-  (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
-    (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
-  ;; Throw away `last' value if made invalid.
-  (when (< beg (or (car syntax-ppss-last) 0))
-    ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
-    ;; depend on the text after BEG (which is presumably changed).  So if
-    ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
-    ;; assumed nil state at BEG may not be valid any more.
-    (if (<= beg (or (syntax-ppss-toplevel-pos (cdr syntax-ppss-last))
-                    (nth 3 syntax-ppss-last)
-                    0))
-	(setq syntax-ppss-last nil)
-      (setcar syntax-ppss-last nil)))
-  ;; Unregister if there's no cache left.  Sadly this doesn't work
-  ;; because `before-change-functions' is temporarily bound to nil here.
-  ;; (unless syntax-ppss-cache
-  ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
-  )
+  (dolist (cell (list syntax-ppss-wide syntax-ppss-narrow))
+    (pcase cell
+      (`(,last . ,cache)
+       (while (and cache (> (caar cache) beg))
+         (setq cache (cdr cache)))
+       ;; Throw away `last' value if made invalid.
+       (when (< beg (or (car last) 0))
+         ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
+         ;; depend on the text after BEG (which is presumably changed).  So if
+         ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
+         ;; assumed nil state at BEG may not be valid any more.
+         (if (<= beg (or (syntax-ppss-toplevel-pos (cdr last))
+                         (nth 3 last)
+                         0))
+	     (setq last nil)
+           (setcar last nil)))
+       ;; Unregister if there's no cache left.  Sadly this doesn't work
+       ;; because `before-change-functions' is temporarily bound to nil here.
+       ;; (unless cache
+       ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
+       (setcar cell last)
+       (setcdr cell cache)))
+    ))
 
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
@@ -423,6 +444,17 @@ syntax-ppss-stats
 (defvar-local syntax-ppss-table nil
   "Syntax-table to use during `syntax-ppss', if any.")
 
+(defun syntax-ppss--data ()
+  (if (eq (point-min) 1)
+      (progn
+        (unless syntax-ppss-wide
+          (setq syntax-ppss-wide (cons nil nil)))
+        syntax-ppss-wide)
+    (unless (eq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow (cons nil nil)))
+    syntax-ppss-narrow))
+
 (defun syntax-ppss (&optional pos)
   "Parse-Partial-Sexp State at POS, defaulting to point.
 The returned value is the same as that of `parse-partial-sexp'
@@ -439,10 +471,13 @@ syntax-ppss
   (syntax-propertize pos)
   ;;
   (with-syntax-table (or syntax-ppss-table (syntax-table))
-  (let ((old-ppss (cdr syntax-ppss-last))
-	(old-pos (car syntax-ppss-last))
-	(ppss nil)
-	(pt-min (point-min)))
+  (let* ((cell (syntax-ppss--data))
+         (ppss-last (car cell))
+         (ppss-cache (cdr cell))
+         (old-ppss (cdr ppss-last))
+         (old-pos (car ppss-last))
+         (ppss nil)
+         (pt-min (point-min)))
     (if (and old-pos (> old-pos pos)) (setq old-pos nil))
     ;; Use the OLD-POS if usable and close.  Don't update the `last' cache.
     (condition-case nil
@@ -475,7 +510,7 @@ syntax-ppss
 	   ;; The OLD-* data can't be used.  Consult the cache.
 	   (t
 	    (let ((cache-pred nil)
-		  (cache syntax-ppss-cache)
+		  (cache ppss-cache)
 		  (pt-min (point-min))
 		  ;; I differentiate between PT-MIN and PT-BEST because
 		  ;; I feel like it might be important to ensure that the
@@ -491,7 +526,7 @@ syntax-ppss
 	      (if cache (setq pt-min (caar cache) ppss (cdar cache)))
 
 	      ;; Setup the before-change function if necessary.
-	      (unless (or syntax-ppss-cache syntax-ppss-last)
+	      (unless (or ppss-cache ppss-last)
 		(add-hook 'before-change-functions
 			  'syntax-ppss-flush-cache t t))
 
@@ -541,7 +576,7 @@ syntax-ppss
 			      pt-min (setq pt-min (/ (+ pt-min pos) 2))
 			      nil nil ppss))
                   (push (cons pt-min ppss)
-                        (if cache-pred (cdr cache-pred) syntax-ppss-cache)))
+                        (if cache-pred (cdr cache-pred) ppss-cache)))
 
 		;; Compute the actual return value.
 		(setq ppss (parse-partial-sexp pt-min pos nil nil ppss))
@@ -562,13 +597,15 @@ syntax-ppss
 		      (if (> (- (caar cache-pred) pos) syntax-ppss-max-span)
 			  (push pair (cdr cache-pred))
 			(setcar cache-pred pair))
-		    (if (or (null syntax-ppss-cache)
-			    (> (- (caar syntax-ppss-cache) pos)
+		    (if (or (null ppss-cache)
+			    (> (- (caar ppss-cache) pos)
 			       syntax-ppss-max-span))
-			(push pair syntax-ppss-cache)
-		      (setcar syntax-ppss-cache pair)))))))))
+			(push pair ppss-cache)
+		      (setcar ppss-cache pair)))))))))
 
-	  (setq syntax-ppss-last (cons pos ppss))
+	  (setq ppss-last (cons pos ppss))
+          (setcar cell ppss-last)
+          (setcdr cell ppss-cache)
 	  ppss)
       (args-out-of-range
        ;; If the buffer is more narrowed than when we built the cache,
@@ -582,7 +619,7 @@ syntax-ppss
 (defun syntax-ppss-debug ()
   (let ((pt nil)
 	(min-diffs nil))
-    (dolist (x (append syntax-ppss-cache (list (cons (point-min) nil))))
+    (dolist (x (append (cdr (syntax-ppss--data)) (list (cons (point-min) nil))))
       (when pt (push (- pt (car x)) min-diffs))
       (setq pt (car x)))
     min-diffs))

^ permalink raw reply related	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-19  0:02                                         ` Dmitry Gutov
@ 2017-09-19 20:47                                           ` Alan Mackenzie
  2017-09-22 14:09                                             ` Dmitry Gutov
  0 siblings, 1 reply; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-19 20:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 19, 2017 at 03:02:06 +0300, Dmitry Gutov wrote:
> On 9/18/17 10:08 PM, Alan Mackenzie wrote:

[ .... ]

> > So, I'm willing to go with your version.  I haven't tried actually
> > running it, yet.

> Please do.

I have done now, without the slightest cause for concern (see below).

> > But there's one small change I would ask you to consider making - that
> > is, in the cache conses, to put ppss-last in the car and ppss-cache in
> > the cdr.  That way, while debugging, ppss-last will be easy to find
> > (it's the first element of the list) and ppss-cache will also be easy to
> > find (the second element onwards).

> Sure, that makes a lot of sense, since ppss-last is a smaller structure. 
> The modified patch is attached.

Thanks.

I've done some semi-formal testing on it.  My semi-formal test log is:

(ii) Do some testing, using xdisp.c as test file.  A file.c will not have
  other calls to syntax-ppss interfering with the tests.
  o - 1. Normal working: check both caches stay empty.  They don't, because
    syntax-ppss is used, I think, by font locking.
  o - 2. Normal work in a narrowed buffer.  Seems OK.
  o - 3. Switch back to widened.  Seems OK.
  o - 4. Switch back to narrowed, same point-min.  Check the caches.  They
    look OK.
  o - 5. Switch to a different narrowing and (syntax-ppss (point-min)).  This
    does indeed empty the syntax-ppss-narrow, as it should.  s-p-wide looks
    unchanged.  Good.
  o - 6. Get well filled caches for both narrow and wide regions.  With the
    buffer wide, make a buffer change early in the buffer.  Check both caches
    are properly trimmed.  They are.
  o - 7. Repeat 6, but trim with the buffer narrow.  Both caches look OK, the
    narrow cache being (nil).

Maybe I should also try some heavy hacking in, say, Emacs Lisp mode as a
kind of soak test, since elisp mode uses syntax-ppss quite a bit, I
believe.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-17 11:12                             ` Philipp Stephani
@ 2017-09-19 20:50                               ` Alan Mackenzie
  0 siblings, 0 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-19 20:50 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: John Wiegley, Dmitry Gutov, 22983

Hello, Philipp.

On Sun, Sep 17, 2017 at 11:12:25 +0000, Philipp Stephani wrote:
> Alan Mackenzie <acm@muc.de> schrieb am So., 10. Sep. 2017 um 13:42 Uhr:


> > > - Before this change is pushed to master, or shortly after, I'd like to
> > > know that it actually fixed the problem Philipp experienced with
> > > python-mode, so we can revert 4fbd330. If it was caused by e.g.
> > > syntax-table changing, we've not improved much.

> > Philipp, any chance of you trying out python mode with this patch but
> > without 4fbd330?


> Unfortunately the problem wasn't easily reproducible back then. The problem
> would occur from time to time, but I never found a way to trigger it
> reproducibly. Therefore the unit test I've added in the commit artificially
> generates the symptom. The root cause is still unknown; while syntax-ppss
> and narrowing might be a potential root cause (the fontification code uses
> both), it might also be something else.

OK, I understand.  Cache effects are the very devil to debug.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-19 20:47                                           ` Alan Mackenzie
@ 2017-09-22 14:09                                             ` Dmitry Gutov
  2017-09-24 11:26                                               ` Alan Mackenzie
  0 siblings, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-22 14:09 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

Hi Alan,

On 9/19/17 11:47 PM, Alan Mackenzie wrote:

> I have done now, without the slightest cause for concern (see below).

Thank you. Should you commit the patch (with any documentation tweaks 
you deem necessary), or should I?

> I've done some semi-formal testing on it.  My semi-formal test log is:
> 
> (ii) Do some testing, using xdisp.c as test file.  A file.c will not have
>    other calls to syntax-ppss interfering with the tests.
>    o - 1. Normal working: check both caches stay empty.  They don't, because
>      syntax-ppss is used, I think, by font locking.
>    o - 2. Normal work in a narrowed buffer.  Seems OK.
>    o - 3. Switch back to widened.  Seems OK.
>    o - 4. Switch back to narrowed, same point-min.  Check the caches.  They
>      look OK.
>    o - 5. Switch to a different narrowing and (syntax-ppss (point-min)).  This
>      does indeed empty the syntax-ppss-narrow, as it should.  s-p-wide looks
>      unchanged.  Good.
>    o - 6. Get well filled caches for both narrow and wide regions.  With the
>      buffer wide, make a buffer change early in the buffer.  Check both caches
>      are properly trimmed.  They are.
>    o - 7. Repeat 6, but trim with the buffer narrow.  Both caches look OK, the
>      narrow cache being (nil).

Yes, this sounds fine. I've tried out most of those myself too, except 
usually without checking the cache contents. Just the syntax-ppss results.

It would be nice to have 2 or 3 of those added as automated tests, BTW.

> Maybe I should also try some heavy hacking in, say, Emacs Lisp mode as a
> kind of soak test, since elisp mode uses syntax-ppss quite a bit, I
> believe.

Sure, except emacs-lisp-mode seems to still retain certain 
indentation-related problems, even without this change.

I don't really expect to uncover problems from this patch much later. 
That's been the point of making the change as simple as possible.





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-22 14:09                                             ` Dmitry Gutov
@ 2017-09-24 11:26                                               ` Alan Mackenzie
  2017-09-25 23:53                                                 ` Dmitry Gutov
  2017-10-04 20:07                                                 ` Johan Bockgård
  0 siblings, 2 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-09-24 11:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Fri, Sep 22, 2017 at 17:09:03 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/19/17 11:47 PM, Alan Mackenzie wrote:

> > I have done now, without the slightest cause for concern (see below).

> Thank you. Should you commit the patch (with any documentation tweaks 
> you deem necessary), or should I?

Could I ask you to do it, please?  I'm somewhat exhausted from debating
another basic Emacs change.

Ah yes, the documentation.  I checked the doc in the elisp manual, and
twice the phrase "from the beginning of the buffer" was used.  I've
clarified that with "from the beginning of the visible portion of the
buffer".  I've also amended "a cache" to "caches", though this doesn't
seem too important.  What do you think:


diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index e3ae53536f..b37f2b22b8 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -751,7 +751,8 @@ Position Parse
 
 @defun syntax-ppss &optional pos
 This function returns the parser state that the parser would reach at
-position @var{pos} starting from the beginning of the buffer.
+position @var{pos} starting from the beginning of the visible portion
+of the buffer.
 @iftex
 See the next section for
 @end iftex
@@ -762,11 +763,11 @@ Position Parse
 
 The return value is the same as if you call the low-level parsing
 function @code{parse-partial-sexp} to parse from the beginning of the
-buffer to @var{pos} (@pxref{Low-Level Parsing}).  However,
-@code{syntax-ppss} uses a cache to speed up the computation.  Due to
-this optimization, the second value (previous complete subexpression)
-and sixth value (minimum parenthesis depth) in the returned parser
-state are not meaningful.
+visible portion of the buffer to @var{pos} (@pxref{Low-Level
+Parsing}).  However, @code{syntax-ppss} uses caches to speed up the
+computation.  Due to this optimization, the second value (previous
+complete subexpression) and sixth value (minimum parenthesis depth) in
+the returned parser state are not meaningful.
 
 This function has a side effect: it adds a buffer-local entry to
 @code{before-change-functions} (@pxref{Change Hooks}) for


-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply related	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-24 11:26                                               ` Alan Mackenzie
@ 2017-09-25 23:53                                                 ` Dmitry Gutov
  2017-10-01 16:36                                                   ` Alan Mackenzie
  2017-10-04 20:07                                                 ` Johan Bockgård
  1 sibling, 1 reply; 65+ messages in thread
From: Dmitry Gutov @ 2017-09-25 23:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

Hi Alan,

On 9/24/17 2:26 PM, Alan Mackenzie wrote:

> Could I ask you to do it, please?  I'm somewhat exhausted from debating
> another basic Emacs change.

Pushed to emacs-26, thanks.

> Ah yes, the documentation.  I checked the doc in the elisp manual, and
> twice the phrase "from the beginning of the buffer" was used.  I've
> clarified that with "from the beginning of the visible portion of the
> buffer".  I've also amended "a cache" to "caches", though this doesn't
> seem too important.  What do you think:

LGTM. I think you can push it and finally close this bug. ;-)





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
       [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
@ 2017-10-01 16:31   ` Alan Mackenzie
  0 siblings, 0 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-10-01 16:31 UTC (permalink / raw)
  To: 22983-done

The bug has been fixed by patches to the emacs-26 branch.

-- 
Alan Mackenzie (Nuremberg, Germany).






^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-25 23:53                                                 ` Dmitry Gutov
@ 2017-10-01 16:36                                                   ` Alan Mackenzie
  0 siblings, 0 replies; 65+ messages in thread
From: Alan Mackenzie @ 2017-10-01 16:36 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 26, 2017 at 02:53:52 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/24/17 2:26 PM, Alan Mackenzie wrote:

> > Could I ask you to do it, please?  I'm somewhat exhausted from debating
> > another basic Emacs change.

> Pushed to emacs-26, thanks.

> > Ah yes, the documentation.  I checked the doc in the elisp manual, and
> > twice the phrase "from the beginning of the buffer" was used.  I've
> > clarified that with "from the beginning of the visible portion of the
> > buffer".  I've also amended "a cache" to "caches", though this doesn't
> > seem too important.  What do you think:

> LGTM. I think you can push it and finally close this bug. ;-)

Thanks.  I've just done both of these things.  Phew!

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 65+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-24 11:26                                               ` Alan Mackenzie
  2017-09-25 23:53                                                 ` Dmitry Gutov
@ 2017-10-04 20:07                                                 ` Johan Bockgård
  1 sibling, 0 replies; 65+ messages in thread
From: Johan Bockgård @ 2017-10-04 20:07 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

Alan Mackenzie <acm@muc.de> writes:

> Ah yes, the documentation.  I checked the doc in the elisp manual, and
> twice the phrase "from the beginning of the buffer" was used.  I've
> clarified that with "from the beginning of the visible portion of the
> buffer".

The manual uses the term "accessible portion" for this. ("Visible"
usually refers to text in a window.)





^ permalink raw reply	[flat|nested] 65+ messages in thread

end of thread, other threads:[~2017-10-04 20:07 UTC | newest]

Thread overview: 65+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
2016-03-11 20:31 ` Dmitry Gutov
2016-03-11 21:24   ` Alan Mackenzie
2016-03-11 21:35     ` Dmitry Gutov
2016-03-11 22:15       ` Alan Mackenzie
2016-03-11 22:38         ` Dmitry Gutov
2016-03-13 17:37           ` Stefan Monnier
2016-03-13 18:57             ` Alan Mackenzie
2016-03-14  0:47               ` Dmitry Gutov
2016-03-14  1:04                 ` Drew Adams
2016-04-03 22:55                   ` John Wiegley
2016-03-14  1:49               ` Stefan Monnier
2016-03-13 17:32     ` Stefan Monnier
2016-03-13 18:52 ` Andreas Röhler
2016-03-13 18:56   ` Dmitry Gutov
2016-03-18  0:49 ` Dmitry Gutov
2016-03-19 12:27   ` Alan Mackenzie
2016-03-19 18:47     ` Dmitry Gutov
2016-03-27  0:51       ` John Wiegley
2016-03-27  1:14         ` Dmitry Gutov
2016-04-03 22:58           ` John Wiegley
2016-04-03 23:15             ` Dmitry Gutov
2017-09-02 13:12               ` Eli Zaretskii
2017-09-02 17:40                 ` Alan Mackenzie
2017-09-02 17:53                   ` Eli Zaretskii
2017-09-03 20:44                   ` John Wiegley
2017-09-04 23:34                   ` Dmitry Gutov
2017-09-05  6:57                     ` Andreas Röhler
2017-09-05 12:28                     ` John Wiegley
2017-09-07 20:45                       ` Alan Mackenzie
2017-09-08 16:04                         ` Andreas Röhler
2017-09-10 18:26                           ` Alan Mackenzie
2017-09-09  9:44                         ` Dmitry Gutov
2017-09-09 10:20                           ` Alan Mackenzie
2017-09-09 12:18                             ` Dmitry Gutov
2017-09-10 11:42                               ` Alan Mackenzie
2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
2017-09-10 22:53                             ` Stefan Monnier
2017-09-10 23:36                               ` Dmitry Gutov
2017-09-11 11:10                                 ` Stefan Monnier
2017-09-12  0:11                                   ` Dmitry Gutov
2017-09-12 22:12                                     ` Richard Stallman
2017-09-11 19:42                               ` Alan Mackenzie
2017-09-11 20:20                                 ` Stefan Monnier
2017-09-11  0:11                             ` Dmitry Gutov
2017-09-11 20:12                               ` Alan Mackenzie
2017-09-12  0:24                                 ` Dmitry Gutov
2017-09-17 10:29                                   ` Alan Mackenzie
2017-09-17 23:43                                     ` Dmitry Gutov
2017-09-18 19:08                                       ` Alan Mackenzie
2017-09-19  0:02                                         ` Dmitry Gutov
2017-09-19 20:47                                           ` Alan Mackenzie
2017-09-22 14:09                                             ` Dmitry Gutov
2017-09-24 11:26                                               ` Alan Mackenzie
2017-09-25 23:53                                                 ` Dmitry Gutov
2017-10-01 16:36                                                   ` Alan Mackenzie
2017-10-04 20:07                                                 ` Johan Bockgård
2017-09-17 11:12                             ` Philipp Stephani
2017-09-19 20:50                               ` Alan Mackenzie
2017-09-07 17:56                     ` Alan Mackenzie
2017-09-07 20:36                       ` Dmitry Gutov
2016-03-19 23:16     ` Vitalie Spinu
2016-03-19 23:00   ` Vitalie Spinu
2016-03-19 23:20     ` Dmitry Gutov
     [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
2017-10-01 16:31   ` Alan Mackenzie

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).