bug#22983: syntax-ppss returns wrong result.

* bug#22983: syntax-ppss returns wrong result.
@ 2016-03-11 15:15 Alan Mackenzie
  2016-03-11 20:31 ` Dmitry Gutov
                   ` (3 more replies)
  0 siblings, 4 replies; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-11 15:15 UTC (permalink / raw)
  To: 22983

Hello, Emacs.

The fundamental contract in syntax-ppss is that (syntax-ppss POS)
returns the same value as (parse-partial-sexp (point-min) POS) (with the
exception of elements 2 and 6).  This is currently not always the case.

In the master branch, emacs -Q and visit xdisp.c with C-x C-f.  Follow
this recipe:

    M-: (syntax-ppss-flush-cache 1)
    M-: (setq ppss-0 (syntax-ppss 40000))
    M-<
    C-s #include " <CR>
    M->
    C-x n n
    M-: (setq ppss-1 (syntax-ppss 40000))
    M-: (setq parse (parse-partial-sexp (point-min) 40000))

At this point, `ppss-1' and `parse' should match (apart from elements 2
and 6).  What we actually have is:

    ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992))
    parse:  (0 nil 15674 34 nil nil 0 nil 15675 nil)

.
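
The comparison in the recipe above can be automated.  What follows is a
minimal sketch, not part of Emacs: `my-ppss-equal-p' and
`my-check-ppss-contract' are hypothetical helper names, and only the
first 10 elements of each parse state are compared, skipping elements 2
and 6 as the contract allows.

```elisp
(defun my-ppss-equal-p (a b)
  "Return non-nil if parse states A and B agree, ignoring elements 2 and 6.
Hypothetical helper, not part of Emacs.  Only the first 10 elements
are compared."
  (let ((i 0) (same t))
    (while (and same (< i 10))
      (unless (memq i '(2 6))
        (setq same (equal (nth i a) (nth i b))))
      (setq i (1+ i)))
    same))

(defun my-check-ppss-contract (pos)
  "Return non-nil if `syntax-ppss' at POS matches a parse from `point-min'.
Both calls are wrapped in `save-excursion' because they move point."
  (let ((cached (save-excursion (syntax-ppss pos)))
        (fresh  (save-excursion (parse-partial-sexp (point-min) pos))))
    (my-ppss-equal-p cached fresh)))
```

Applied to the two values reported above, (my-ppss-equal-p ppss-1 parse)
returns nil, since element 0 already differs (2 versus 0).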

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
@ 2016-03-11 20:31 ` Dmitry Gutov
  2016-03-11 21:24   ` Alan Mackenzie
  2016-03-13 18:52 ` Andreas Röhler
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-11 20:31 UTC (permalink / raw)
  To: Alan Mackenzie, 22983

On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> At this point, `ppss-1' and `parse' should match (apart from elements 2
> and 6).  What we actually have is:
>
>     ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992))
>     parse:  (0 nil 15674 34 nil nil 0 nil 15675 nil)

I think you mean that ppss-0 and ppss-1 must match independent of 
narrowing, and also match (parse-partial-sexp 1 40000).

Considering narrowing can change point-min arbitrarily, specifying 
(syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing 
proposition if you want consistency.

Alas, we have some code out there that implements multiple-major-mode 
functionality using narrowing and some hacking of syntax-ppss-last 
and syntax-ppss-cache values.

Changing syntax-ppss to be independent of narrowing will break it, and 
we'll need to provide some alternative first.

We could introduce a syntax-ppss-dont-widen variable, though. Similar to 
font-lock-dont-widen.

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 20:31 ` Dmitry Gutov
@ 2016-03-11 21:24   ` Alan Mackenzie
  2016-03-11 21:35     ` Dmitry Gutov
  2016-03-13 17:32     ` bug#22983: syntax-ppss returns wrong result Stefan Monnier
  0 siblings, 2 replies; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-11 21:24 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983

Hello, Dmitry.

On Fri, Mar 11, 2016 at 10:31:50PM +0200, Dmitry Gutov wrote:
> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> > At this point, `ppss-1' and `parse' should match (apart from elements 2
> > and 6).  What we actually have is:

> >     ppss-1: (2 39992 nil nil nil nil 2 nil nil (39975 39992))
> >     parse:  (0 nil 15674 34 nil nil 0 nil 15675 nil)

> I think you mean that ppss-0 and ppss-1 must match independent of 
> narrowing, and also match (parse-partial-sexp 1 40000).

Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
that of (parse-partial-sexp (point-min) pos).  I think ppss-0 and ppss-1
did actually match (but I can't quite remember).

> Considering narrowing can change point-min arbitrarily, specifying 
> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing 
> proposition if you want consistency.

Indeed.  But that is how syntax-ppss is specified, and (partially) how
it is implemented.

> Alas, we have some code out there that implements multiple-major-mode 
> functionality using narrowing and some hacking of syntax-ppss-last 
> and syntax-ppss-cache values.

> Changing syntax-ppss to be independent of narrowing will break it, and 
> we'll need to provide some alternative first.

syntax-ppss is broken, and can't be fixed.  The only sensible fix would
be to specify that (syntax-ppss pos) is the same as (parse-partial-sexp
1 pos).  But that is then a totally different function, and there are
around 200 uses in the Emacs sources to check and fix, to say nothing of
external code.

> We could introduce a syntax-ppss-dont-widen variable, though. Similar to 
> font-lock-dont-widen.

I'm trying to figure that out.  Wouldn't that still leave you with
problems when point-min is inside a string?

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 21:24   ` Alan Mackenzie
@ 2016-03-11 21:35     ` Dmitry Gutov
  2016-03-11 22:15       ` Alan Mackenzie
  2016-03-13 17:32     ` bug#22983: syntax-ppss returns wrong result Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-11 21:35 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983

On 03/11/2016 11:24 PM, Alan Mackenzie wrote:

>> I think you mean that ppss-0 and ppss-1 must match independent of
>> narrowing, and also match (parse-partial-sexp 1 40000).
>
> Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
> that of (parse-partial-sexp (point-min) pos).  I think ppss-0 and ppss-1
> did actually match (but I can't quite remember).

I imagine they didn't. I got the same value in all three cases, though, 
so your scenario could use some revising.

>> Considering narrowing can change point-min arbitrarily, specifying
>> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing
>> proposition if you want consistency.
>
> Indeed.  But that is how syntax-ppss is specified, and (partially) how
> it is implemented.

That part of the specification can be rephrased.

>> Alas, we have some code out there that implements multiple-major-mode
>> functionality using narrowing and some hacking of syntax-ppss-last
>> and syntax-ppss-cache values.
>
>> Changing syntax-ppss to be independent of narrowing will break it, and
>> we'll need to provide some alternative first.
>
> syntax-ppss is broken, and can't be fixed.

It's used ubiquitously, so it must be working.

> The only sensible fix would
> be to specify that (syntax-ppss pos) is the same as (parse-partial-sexp
> 1 pos).  But that is then a totally different function, and there are
> around 200 uses in the Emacs sources to check and fix, to say nothing of
> external code.

Not entirely different, no. AFAIK, these are the semantics the vast 
majority of its usages expect. Except the multiple-major-mode case, 
which we'd ideally try to accommodate, too.

>> We could introduce a syntax-ppss-dont-widen variable, though. Similar to
>> font-lock-dont-widen.
>
> I'm trying to figure that out.  Wouldn't that still leave you with
> problems when point-min is inside a string?

syntax-ppss-dont-widen would be nil by default, it would be an escape 
hatch toward the current semantics, for when the caller knows how to 
manage narrowings, etc.
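
A minimal sketch of how such an escape hatch could look, written as a
wrapper rather than as a change to syntax-ppss itself.  Both
`my-syntax-ppss-dont-widen' and `my-syntax-ppss' are hypothetical names
used for illustration; neither exists in Emacs, and the question of
which caches must be flushed when the flag changes is deliberately left
out.

```elisp
(defvar my-syntax-ppss-dont-widen nil
  "Hypothetical flag: if non-nil, `my-syntax-ppss' respects narrowing.
When nil (the default), the buffer is widened before parsing.")

(defun my-syntax-ppss (&optional pos)
  "Return the parse state at POS (default point) without moving point.
Unless `my-syntax-ppss-dont-widen' is non-nil, any narrowing is
temporarily removed, so the parse is based at position 1."
  (let ((pos (or pos (point))))
    (save-excursion
      (if my-syntax-ppss-dont-widen
          ;; Escape hatch: the caller manages narrowing itself.
          (syntax-ppss pos)
        (save-restriction
          (widen)
          (syntax-ppss pos))))))
```

A caller that manages its own narrowing would let-bind the flag to t
around its calls; everyone else gets the widened behaviour.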





* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 21:35     ` Dmitry Gutov
@ 2016-03-11 22:15       ` Alan Mackenzie
  2016-03-11 22:38         ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-11 22:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983

Hello, Dmitry.

On Fri, Mar 11, 2016 at 11:35:08PM +0200, Dmitry Gutov wrote:
> On 03/11/2016 11:24 PM, Alan Mackenzie wrote:

> > Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
> > that of (parse-partial-sexp (point-min) pos).  I think ppss-0 and ppss-1
> > did actually match (but I can't quite remember).

> I imagine they didn't. I got the same value in all three cases, though, 
> so your scenario could use some revising.

Sorry about that.

> >> Considering narrowing can change point-min arbitrarily, specifying
> >> (syntax-ppss pos) as (parse-partial-sexp (point-min) pos) is a losing
> >> proposition if you want consistency.

> > Indeed.  But that is how syntax-ppss is specified, and (partially) how
> > it is implemented.

> That part of the specification can be rephrased.

It's more than the specification which needs redoing.  The implementation
(sometimes) returns the equivalent of (parse-partial-sexp (point-min)
pos), when point-min is not in a "safe place".

> >> Alas, we have some code out there that implements multiple-major-mode
> >> functionality using narrowing and some hacking of syntax-ppss-last
> >> and syntax-ppss-cache values.

> >> Changing syntax-ppss to be independent of narrowing will break it, and
> >> we'll need to provide some alternative first.

> > syntax-ppss is broken, and can't be fixed.

> It's used ubiquitously, so it must be working.

It might well be ubiquitous, but it's broken.  Consider this: syntax-ppss
will return the result of a parse based at point-min.  In general, the
caller does not know whether point-min is in a string or not.  Therefore
the result is of little value, UNLESS the caller takes special action,
such as widening the buffer before every call to syntax-ppss.
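
To make the failure mode concrete, here is a small sketch using
parse-partial-sexp directly in a scratch buffer with the standard syntax
table.  The helper names `my-string-state-from' and `my-narrowing-demo'
are made up for this illustration.

```elisp
(defun my-string-state-from (start pos)
  "Return element 3 (the string state) at POS for a parse begun at START."
  (save-excursion
    (nth 3 (parse-partial-sexp start pos))))

(defun my-narrowing-demo ()
  "Return string states at two positions, before and after narrowing."
  (with-temp-buffer
    ;; The double quotes sit at positions 5 and 13 in this text.
    (insert "foo \"bar baz\" qux")
    (let ((wide-9  (my-string-state-from (point-min) 9))
          (wide-15 (my-string-state-from (point-min) 15)))
      ;; Narrow so that point-min (7) falls inside the string.
      (narrow-to-region 7 (point-max))
      (list wide-9 wide-15
            (my-string-state-from (point-min) 9)
            (my-string-state-from (point-min) 15)))))
```

(my-narrowing-demo) returns (34 nil nil 34): once point-min lies inside
the string, position 9 is no longer reported as being in a string, and
position 15, which is outside it, is.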

> > The only sensible fix would be to specify that (syntax-ppss pos) is
> > the same as (parse-partial-sexp 1 pos).  But that is then a totally
> > different function, and there are around 200 uses in the Emacs
> > sources to check and fix, to say nothing of external code.

> Not entirely different, no. AFAIK, these are the semantics the vast 
> majority of its usages expect.

But it's not the semantics these .el files get.  What's probably keeping
them functional is the rarity with which buffers are narrowed to an
"awkward" point-min.

> Except the multiple-major-mode case, which we'd ideally try to
> accommodate, too.

How does this code handle the changeover of syntax tables at a mode
boundary?

> >> We could introduce a syntax-ppss-dont-widen variable, though. Similar to
> >> font-lock-dont-widen.

> > I'm trying to figure that out.  Wouldn't that still leave you with
> > problems when point-min is inside a string?

> syntax-ppss-dont-widen would be nil by default, it would be an escape 
> hatch toward the current semantics, for when the caller knows how to 
> manage narrowings, etc.

Ah, OK.  I think I see that now.  Maybe.  Surely the trouble is that
either ALL calls or NONE must have s-p-dont-widen set.  When that flag is
toggled, all the caches have to be cleared.  Maybe there should be some
initialisation flag in some initialisation function.  Or something like
that.  (It's getting late!).

It strikes me that the multiple major mode stuff could do with a
substantially enhanced version of syntax-ppss which would smoothly handle
going over a mode boundary.  But I don't know how you're implementing
that.

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 22:15       ` Alan Mackenzie
@ 2016-03-11 22:38         ` Dmitry Gutov
  2016-03-13 17:37           ` Stefan Monnier
  2016-03-14 15:16           ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie
  0 siblings, 2 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-11 22:38 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983

On 03/12/2016 12:15 AM, Alan Mackenzie wrote:

>> That part of the specification can be rephrased.
>
> It's more than the specification which needs redoing.  The implementation
> (sometimes) returns the equivalent of (parse-partial-sexp (point-min)
> pos), when point-min is not in a "safe place".

Sure. I just meant that we shouldn't get hung up on that element of the 
specification.

>>>> Changing syntax-ppss to be independent of narrowing will break it, and
>>>> we'll need to provide some alternative first.
>
>>> syntax-ppss is broken, and can't be fixed.
>
>> It's used ubiquitously, so it must be working.
>
> It might well be ubiquitous, but it's broken.

And yet, it can be fixed.

> Consider this: syntax-ppss
> will return the result of a parse based at point-min.  In general, the
> caller does not know whether point-min is in a string or not.  Therefore
> the result is of little value, UNLESS the caller takes special action,
> such as widening the buffer before every call to syntax-ppss.

You can say that.

>> Not entirely different, no. AFAIK, these are the semantics the vast
>> majority of its usages expect.
>
> But it's not the semantics these .el files get.  What's probably keeping
> them functional is the rarity with which buffers are narrowed to an
> "awkward" point-min.

Another thing that keeps it together, is that narrowing, as a user-level 
operator, is not that popular.

Personally, I consider it an anti-feature.

>> Except the multiple-major-mode case, which we'd ideally try to
>> accommodate, too.
>
> How does this code handle the changeover of syntax tables at a mode
> boundary?

The "inner" regions start with an "empty", top-level state. This is 
actually fine, because these are usually small enough not to benefit 
from the syntax-ppss cache too much (and syntax-ppss-last still helps).

The parts of the outer region following a subregion with different 
syntax table... rely on a few hacks, and a manual application of a 
`syntax-table' property when necessary. We need a better solution there, 
but it's probably out of scope for this discussion.

>> syntax-ppss-dont-widen would be nil by default, it would be an escape
>> hatch toward the current semantics, for when the caller knows how to
>> manage narrowings, etc.
>
> Ah, OK.  I think I see that now.  Maybe.  Surely the trouble is that
> either ALL calls or NONE must have s-p-dont-widen set.

Hmm, you're right. This variable still seems essential, but to be safe, 
mmm-mode and friends should probably also advise syntax-ppss, to always 
perform narrowing as appropriate.

> When that flag is
> toggled, all the caches have to be cleared.  Maybe there should be some
> initialisation flag in some initialisation function.  Or something like
> that.  (It's getting late!).

Is syntax-ppss-dont-widen really relevant for your comment cache? It 
would be used only by certain major modes, and worst comes to worst, you 
could disable the cache in those buffers.

> It strikes me that the multiple major mode stuff could do with a
> substantially enhanced version of syntax-ppss which would smoothly handle
> going over a mode boundary.  But I don't know how you're implementing
> that.

So far, we're just wrapping the font-lock and indentation code, and 
otherwise hope for the best.

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 22:38         ` Dmitry Gutov
@ 2016-03-13 17:37           ` Stefan Monnier
  2016-03-13 18:57             ` Alan Mackenzie
  2016-03-14 15:16           ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie
  1 sibling, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-13 17:37 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

>> But it's not the semantics these .el files get.  What's probably keeping
>> them functional is the rarity with which buffers are narrowed to an
>> "awkward" point-min.
> Another thing that keeps it together, is that narrowing, as a user-level
> operator, is not that popular.

Luckily, yes.

> Personally, I consider it an anti-feature.

Same here.  Luckily also, as pointed out elsewhere, the semantics of it
is unclear, so that in several important cases, whichever behavior we
end up choosing will be both correct for some users and incorrect
for others.

Hence, so far, I didn't make any effort to try and "do the right thing"
for user-activated narrowing, since these are just not well defined
enough to even determine what is "the right thing".

        Stefan

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 17:37           ` Stefan Monnier
@ 2016-03-13 18:57             ` Alan Mackenzie
  2016-03-14  0:47               ` Dmitry Gutov
  2016-03-14  1:49               ` Stefan Monnier
  0 siblings, 2 replies; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-13 18:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 22983, Dmitry Gutov

Hello, Stefan.

On Sun, Mar 13, 2016 at 01:37:27PM -0400, Stefan Monnier wrote:
> >> But it's not the semantics these .el files get.  What's probably keeping
> >> them functional is the rarity with which buffers are narrowed to an
> >> "awkward" point-min.
> > Another thing that keeps it together, is that narrowing, as a user-level
> > operator, is not that popular.

> Luckily, yes.

I happen to use it frequently.  I expect other users do, too.  It's
useful.

> > Personally, I consider it an anti-feature.

> Same here.  Luckily also, as pointed out elsewhere, the semantics of it
> is unclear, so that in several important cases, whichever behavior we
> end up choosing will be both correct for some users and incorrect
> for others.

That's pure sophistry.  The semantics needed are quite clear:  What were
strings and comments before narrowing should remain strings and comments
after narrowing.  Otherwise, nothing would work in such a narrowed
buffer.  font-locking, for example, behaves properly in a narrowed
buffer.

> Hence, so far, I didn't make any effort to try and "do the right thing"
> for user-activated narrowing, since these are just not well defined
> enough to even determine what is "the right thing".

Let's define them as I said in the previous paragraph.  Or can you
conceive of a use case where one would want narrowing to invert strings
and non-strings, leaving comments totally random?

Do you have any views on how the bug should be resolved?

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 18:57             ` Alan Mackenzie
@ 2016-03-14  0:47               ` Dmitry Gutov
  2016-03-14  1:04                 ` Drew Adams
  2016-03-14  1:49               ` Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-14  0:47 UTC (permalink / raw)
  To: Alan Mackenzie, Stefan Monnier; +Cc: 22983

On 03/13/2016 08:57 PM, Alan Mackenzie wrote:

> I happen to use it frequently.  I expect other users do, too.  It's
> useful.

It might be, but it's not very well-designed. If you only want to hide 
some parts of the buffer from being displayed, changing point-min and 
point-max, which affect quite a lot of Lisp functions, seems unnecessary.

Introducing a couple of global variables that would only be read by the 
display code, seems like a better approach. I don't think that 
narrow-to-region should be a user-level function. Introducing a new 
function, using a different mechanism shouldn't be too hard though, if 
we reuse the existing binding.

> What were
> strings and comments before narrowing should remain strings and comments
> after narrowing.  Otherwise, nothing would work in such a narrowed
> buffer.  font-locking, for example, behaves properly in a narrowed
> buffer.

It behaves like we tell it to behave. If I bind font-lock-dont-widen to 
t, font-lock won't look beyond the narrowing.

>> Hence, so far, I didn't make any effort to try and "do the right thing"
>> for user-activated narrowing, since these are just not well defined
>> enough to even determine what is "the right thing".
>
> Let's define them as I said in the previous paragraph.  Or can you
> conceive of a use case where one would want narrowing to invert strings
> and non-strings, leaving comments totally random?

At risk of inviting further confusion, yes, mmm-mode and polymode (new 
example!) use narrowing to persuade font-lock and indentation code that 
there's nothing beyond the narrowed region. We might declare such usages 
invalid, and that's a possible choice, but I think keeping support for 
them wouldn't be too hard, at least for a while.

Note that if your comment cache always widens the buffer before 
calculating the values to save, its result might conflict with 
syntax-ppss in mmm-mode and polymode (right?). Leading to font-lock, 
indentation and certain commands behaving in different, conflicting 
ways. That's just conjecture at this point, of course.

> Do you have any views on how the bug should be resolved?

Stefan probably has another opinion, but I'd either ignore the issue of 
narrowing, or introduce syntax-ppss-dont-widen as proposed (and thus 
make syntax-ppss widen by default). Together with adding a command that 
would replace interactive use of narrow-to-region.

* bug#22983: syntax-ppss returns wrong result.
  2016-03-14  0:47               ` Dmitry Gutov
@ 2016-03-14  1:04                 ` Drew Adams
  2016-04-03 22:55                   ` John Wiegley
  0 siblings, 1 reply; 155+ messages in thread
From: Drew Adams @ 2016-03-14  1:04 UTC (permalink / raw)
  To: Dmitry Gutov, Alan Mackenzie, Stefan Monnier; +Cc: 22983

> > I happen to use it frequently.  I expect other users do, too.  It's
> > useful.
> 
> It might be, but it's not very well-designed. If you only want to hide
> some parts of the buffer from being displayed, changing point-min and
> point-max, which affect quite a lot of Lisp functions, seems unnecessary.

Well, well, well.  All of this is likely OT for this thread.

But no - narrowing is in fact explicitly _about_ changing
`point-min' and `point-max', so you can act on a particular
section of a buffer.  It is not only about "hid[ing] some
parts of a buffer from being displayed".  This is true for
both interactive use and in code.

> I don't think that narrow-to-region should be a user-level
> function.

Is this a joke?  Maybe you think that because you think
it is only about hiding text?

> Introducing a new function, using a different mechanism
> shouldn't be too hard though, if we reuse the existing binding.

Please don't.  Please don't even think about it.

And if you really think you have something to say about
it, then please bring it up in emacs-devel, not in a bug
thread that is not especially related to it.

> adding a command that would replace interactive use of
> narrow-to-region.

Ridiculous (IMHO).  Add whatever commands you like, but
please do not think about replacing `narrow-to-region'
willy nilly.  It is one of the most useful Emacs commands.

* bug#22983: syntax-ppss returns wrong result.
  2016-03-14  1:04                 ` Drew Adams
@ 2016-04-03 22:55                   ` John Wiegley
  0 siblings, 0 replies; 155+ messages in thread
From: John Wiegley @ 2016-04-03 22:55 UTC (permalink / raw)
  To: Drew Adams; +Cc: 22983, Dmitry Gutov, Stefan Monnier, Alan Mackenzie

>>>>> Drew Adams <drew.adams@oracle.com> writes:

>> Introducing a new function, using a different mechanism shouldn't be too
>> hard though, if we reuse the existing binding.

> Please don't. Please don't even think about it.

> And if you really think you have something to say about it, then please
> bring it up in emacs-devel, not in a bug thread that is not especially
> related to it.

+1!

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2





* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 18:57             ` Alan Mackenzie
  2016-03-14  0:47               ` Dmitry Gutov
@ 2016-03-14  1:49               ` Stefan Monnier
  1 sibling, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-14  1:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

> That's pure sophistry.  The semantics needed are quite clear:

For your use case, yes.  It's quite clear *in your mind*.  There are
other use cases.  Worse yet: Elisp doesn't generally know if the
narrowing was set up by the user or by some Elisp caller up the stack.
So even if we were to pretend that the use-case is clear when the
narrowing is set by the user, we'd still have to figure out if that's
the case.

> Let's define them as I said in the previous paragraph.  Or can you
> conceive of a use case where one would want narrowing to invert strings
> and non-strings, leaving comments totally random?

There's the case where some Elisp code does

    (save-restriction
      (narrow-to-region beg end)
      (with-syntax-table ...))

to parse a sub-part of your buffer in a different way.  Of course this
completely breaks syntax-ppss and friends.  I need to do exactly that in
sm-c-mode (when parsing the C code inside CPP directives, since those
directives are marked as comments), for example and had to use

      (let ((syntax-propertize-function nil)
            (syntax-ppss-cache nil)
            (syntax-ppss-last nil))
        ...)

to deal with it.  It would be easy/natural to add a binding of
syntax-ppss-dont-widen in there (and/or literal-cache-dont-widen for
that matter).
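
The narrow-and-reparse pattern can be sketched as a small function.
This is only an illustration under stated assumptions: the name
`my-parse-subregion' is hypothetical, it calls parse-partial-sexp
directly (so no syntax-ppss cache is involved and none needs to be
rebound), and it binds parse-sexp-lookup-properties to nil on the
assumption that any `syntax-table' text properties in the region should
be ignored for the inner parse.

```elisp
(defun my-parse-subregion (beg end table)
  "Parse BEG..END in isolation using syntax TABLE; return the state at END.
Hypothetical helper.  Narrowing makes BEG the start of the parse, and
point and the restriction are restored afterwards."
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      ;; Assumption: text properties set by the outer mode (for
      ;; example, marking the region as a comment) must not apply here.
      (let ((parse-sexp-lookup-properties nil))
        (with-syntax-table table
          (parse-partial-sexp (point-min) (point-max)))))))
```

Because nothing is cached, this costs a full parse of the sub-region on
every call, which is acceptable only when the region is small.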

> Do you have any views on how the bug should be resolved?

Look up some past discussions of how to number lines in a narrowed
buffer (same basic issue), where we discussed this.

We basically need to add information about which kind of narrowing is
in effect.  IIRC one way suggested was to have 2 narrowing states at the
same time: the current one, plus a new one which is a kind of "hard
narrowing" (the current narrowing would have to be "narrower" than the
"hard narrowing"), with corresponding new kind of "widen".

        Stefan

* Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-11 22:38         ` Dmitry Gutov
  2016-03-13 17:37           ` Stefan Monnier
@ 2016-03-14 15:16           ` Alan Mackenzie
  2016-03-14 17:34             ` Andreas Röhler
  2016-03-14 20:06             ` Dmitry Gutov
  1 sibling, 2 replies; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-14 15:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Hello, Dmitry.

On Sat, Mar 12, 2016 at 12:38:49AM +0200, Dmitry Gutov wrote:
> On 03/12/2016 12:15 AM, Alan Mackenzie wrote:

> >> Except the multiple-major-mode case, which we'd ideally try to
> >> accommodate, too.

> > How does this code handle the changeover of syntax tables at a mode
> > boundary?

> The "inner" regions start with an "empty", top-level state. This is 
> actually fine, because these are usually small enough not to benefit 
> from the syntax-ppss cache too much (and syntax-ppss-last still helps).

> The parts of the outer region following a subregion with different 
> syntax table... rely on a few hacks, and a manual application of a 
> `syntax-table' property when necessary. We need a better solution there, 
> but it's probably out of scope for this discussion.

How about an extension to the parse-partial-sexp (etc.) code?  For
example, a feature I would call an "island", where a character could be
marked with the "island start" syntax-table property, and another
character lower down could be marked with the "island end" character.
parse-partial-sexp, on encountering an island start syntax would somehow
stack the current parse state and start a new one with a different syntax
table.  The parse state, instead of consisting of the 10 elements
currently, would in general have 10n elements, where n is the depth of
nesting.
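
A rough Lisp-level sketch of the idea, for a single island only and
without any new syntax class: the island boundaries and its syntax table
are passed in explicitly instead of being read from text properties.
The function name `my-parse-with-island' is hypothetical, and this is
not how parse-partial-sexp works today.

```elisp
(defun my-parse-with-island (pos island-beg island-end island-table)
  "Return a stack (list) of parse states at POS, innermost first.
Text in ISLAND-BEG..ISLAND-END is parsed with ISLAND-TABLE from a
fresh state; everything else uses the current syntax table.
Hypothetical sketch handling exactly one island."
  (save-excursion
    (if (<= pos island-beg)
        ;; Before the island: an ordinary parse, stack depth 1.
        (list (parse-partial-sexp (point-min) pos))
      (let ((outer (parse-partial-sexp (point-min) island-beg)))
        (if (< pos island-end)
            ;; Inside the island: the inner state sits on top of the
            ;; suspended outer state, so the stack has depth 2.
            (list (with-syntax-table island-table
                    (parse-partial-sexp island-beg pos))
                  outer)
          ;; Past the island: resume the outer parse from the saved
          ;; state, skipping the island's text entirely.
          (list (parse-partial-sexp island-end pos nil nil outer)))))))
```

The length of the returned list is the nesting depth n, so the result
carries n parse states rather than one.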

-- 
Alan Mackenzie (Nuremberg, Germany).

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-14 15:16           ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie
@ 2016-03-14 17:34             ` Andreas Röhler
  2016-03-14 20:06             ` Dmitry Gutov
  1 sibling, 0 replies; 155+ messages in thread
From: Andreas Röhler @ 2016-03-14 17:34 UTC (permalink / raw)
  To: emacs-devel; +Cc: Alan Mackenzie



On 14.03.2016 16:16, Alan Mackenzie wrote:
> Hello, Dmitry.
>
> On Sat, Mar 12, 2016 at 12:38:49AM +0200, Dmitry Gutov wrote:
>> On 03/12/2016 12:15 AM, Alan Mackenzie wrote:
>>>> Except the multiple-major-mode case, which we'd ideally try to
>>>> accommodate, too.
>>> How does this code handle the changeover of syntax tables at a mode
>>> boundary?
>> The "inner" regions start with an "empty", top-level state. This is
>> actually fine, because these are usually small enough not to benefit
>> from the syntax-ppss cache too much (and syntax-ppss-last still helps).
>> The parts of the outer region following a subregion with different
>> syntax table... rely on a few hacks, and a manual application of a
>> `syntax-table' property when necessary. We need a better solution there,
>> but it's probably out of scope for this discussion.
> How about an extension to the parse-partial-sexp (etc.) code?  For
> example, a feature I would call an "island", where a character could be
> marked with the "island start" syntax-table property, and another
> character lower down could be marked with the "island end" character.
> parse-partial-sexp, on encountering an island start syntax would somehow
> stack the current parse state and start a new one with a different syntax
> table.  The parse state, instead of consisting of the 10 elements
> currently, would in general have 10n elements, where n is the depth of
> nesting.

AFAIU narrowing would already provide that WRT parse-partial-sexp, 
maybe combined with some markup like folding-mode. What remains is to 
hand these sections over to font-lock.



* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-14 15:16           ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie
  2016-03-14 17:34             ` Andreas Röhler
@ 2016-03-14 20:06             ` Dmitry Gutov
  2016-03-19 22:51               ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-14 20:06 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Hi Alan,

On 03/14/2016 05:16 PM, Alan Mackenzie wrote:

> How about an extension to the parse-partial-sexp (etc.) code?  For
> example, a feature I would call an "island", where a character could be
> marked with the "island start" syntax-table property, and another
> character lower down could be marked with the "island end" character.

Something like that might help, although I hesitate asking for that 
change because it's a relatively big one, and it would still solve only 
one multiple-mode-related issue.

What it would help with is fooling the "outer" major mode into ignoring 
the preceding submode regions, in the return value of syntax-ppss. But 
we could have pretty much that already by advising syntax-ppss. That 
leaves out parse-partial-sexp, but it's not used that often directly in 
major mode code (though sgml-mode uses it).

> parse-partial-sexp, on encountering an island start syntax would somehow
> stack the current parse state and start a new one with a different syntax
> table.  The parse state, instead of consisting of the 10 elements
> currently, would in general have 10n elements, where n is the depth of
> nesting.

To be able to parse across different regions, it would need to know the 
syntax table for each one (using the syntax-table text property?), as 
well as to be able to apply the appropriate syntax-propertize-function 
in each region. The latter is handled by mmm-mode, though, in a 
seemingly adequate fashion (it installs a composite function that knows 
how to dispatch to mode-specific ones).

Maybe it's worth a try. Though I don't know how Stefan uses narrowing in 
sm-c-mode, and whether this proposal is appropriate to replace narrowing 
in syntax-ppss for this use case.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-14 20:06             ` Dmitry Gutov
@ 2016-03-19 22:51               ` Vitalie Spinu
  2016-03-20  2:19                 ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-19 22:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel

>> On Mon, Mar 14 2016 22:06, Dmitry Gutov wrote:

> Hi Alan,

> On 03/14/2016 05:16 PM, Alan Mackenzie wrote:

>> How about an extension to the parse-partial-sexp (etc.) code?  For
>> example, a feature I would call an "island", where a character could be
>> marked with the "island start" syntax-table property, and another
>> character lower down could be marked with the "island end" character.

> Something like that might help, although I hesitate asking for that change
> because it's a relatively big one, and it would still solve only one
> multiple-mode-related issue.

[...]

> But we could have pretty much that already by advising syntax-ppss. That
> leaves out parse-partial-sexp, but it's not used that often directly in major
> mode code (though sgml-mode uses it).

You can simulate islands by marking inner spans as comments with comment classes
(11 and 12). I used those in polymode in the past, but not anymore. It's not
that useful. Most of the parsing that modes do is regex based. So if a mode
author decides to regexp-search for a wiki link on a full buffer after widening it,
islands won't help.
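
For illustration, a minimal sketch of the comment-class trick, assuming the span boundaries are already known. The function name `my-mark-span-as-comment` is hypothetical and this is not polymode's actual code. The first character of the span gets comment-start syntax (class 11) and the last one comment-end syntax (class 12), so the parser skips whatever lies in between.

```elisp
(defun my-mark-span-as-comment (beg end)
  "Make the syntax parser treat the text between BEG and END as a comment.
The first character gets comment-start syntax (class 11) and the
last character gets comment-end syntax (class 12)."
  (with-silent-modifications
    (put-text-property beg (1+ beg) 'syntax-table '(11))
    (put-text-property (1- end) end 'syntax-table '(12))))
```

The properties only take effect while `parse-sexp-lookup-properties` is non-nil, for example:

```elisp
(with-temp-buffer
  (insert "foo <% \"oops %> bar")
  ;; The span "<% ... %>" occupies positions 5 to 16.
  (my-mark-span-as-comment 5 16)
  (let ((parse-sexp-lookup-properties t))
    ;; The unbalanced quote inside the span no longer opens a string.
    (nth 3 (parse-partial-sexp (point-min) (point-max)))))
```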

As Dmitry mentioned, there is little a multi-mode cannot do by advising
syntax-ppss. The issue is still that parse-partial-sexp blows up on narrowed code
and you cannot advise it.

IMO the most useful direction for multi-modes is to add a hard narrowing that
Stephen mentioned in the other thread. `syntax-ppss-dont-widen` goes in that
direction, but it doesn't address the issue of distinguishing between user
narrowing and "hard narrowing" in multi modes.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-19 22:51               ` Vitalie Spinu
@ 2016-03-20  2:19                 ` Dmitry Gutov
  2016-03-20 12:15                   ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-20  2:19 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel

On 03/20/2016 12:51 AM, Vitalie Spinu wrote:

> You can simulate islands by marking inner spans as comments with comment classes
> (11 and 12).

Ooh, that's a solid idea. Should be more generic than my "propertize <> 
as punctuation" approach.

> I used those in polymode in the past, but not anymore. It's not
> that useful. Most of the parsing that modes do is regex based.

I'd say it's still useful. Without the above, I've had indentation 
problems with sgml-mode.

A good mode would use syntax-ppss to check that point is not inside a 
string or comment. Maybe that's not often done in font-lock, but it's at 
least common in syntax-propertize and indentation functions.
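
A minimal sketch of that kind of check, assuming a mode wants to search for a regexp but ignore matches that begin inside a string or comment. The name `my-search-forward-code` is hypothetical.

```elisp
(defun my-search-forward-code (regexp &optional bound)
  "Search forward for REGEXP, skipping matches inside strings or comments.
Return the position after the match, or nil if there is none."
  (let (found)
    (while (and (not found)
                (re-search-forward regexp bound t))
      ;; Element 8 of the parse state is non-nil inside a string or
      ;; comment.  `syntax-ppss' moves point when given a position,
      ;; hence the `save-excursion'.
      (unless (save-match-data
                (nth 8 (save-excursion
                         (syntax-ppss (match-beginning 0)))))
        (setq found (point))))
    found))
```

The match data still describes the accepted match when the function returns, so callers can use (match-beginning 0) as usual.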

Example: sgml-lexical-context. It performs a search at first, but in the 
end uses parse-partial-sexp, and returns a value based on that status.

> So if a mode
> author decides to regexp-search for a wiki link on a full buffer after widening it,
> islands won't help.

Where does widening happen in this case? First, we have 
font-lock-dont-widen.

For indentation, we've introduced prog-indentation-context recently. And 
indentation functions in programming modes are supposed to call 
prog-widen instead of widen now.

syntax-propertize-function's aren't supposed to call widen at all, I think.

> IMO the most useful direction for multi-modes is to add a hard narrowing that
> Stephen mentioned in the other thread. `syntax-ppss-dont-widen` goes in that
> direction, but it doesn't address the issue of distinguishing between user
> narrowing and "hard narrowing" in multi modes.

syntax-ppss-dont-widen and prog-indentation-context will be the 
indicators of the "hard narrowing", I guess.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-20  2:19                 ` Dmitry Gutov
@ 2016-03-20 12:15                   ` Vitalie Spinu
  2016-03-20 15:58                     ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-20 12:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, emacs-devel

>> On Sun, Mar 20 2016 04:19, Dmitry Gutov wrote:

>> So if a mode author decides to regexp-search for a wiki link on a full buffer after
>> widening it, islands won't help.

> Where does widening happen in this case? First, we have font-lock-dont-widen.

Well, font-lock-dont-widen is not respected even in c-mode. Look at
c-before-context-fl-expand-region and semi-safe-place which are called directly
or indirectly from c-font-lock-fontify-region.

> For indentation, we've introduced prog-indentation-context recently. And
> indentation functions in programming modes are supposed to call prog-widen
> instead of widen now.

I was not aware of that. Not sure if it is a step in the right direction though.

`prog-indentation-context` looks fine to me but multi-modes already have their
own wrappers for indentation which do just that according to their own semantics
of modes/submodes/chunks/headers etc.

The primary intent of `prog-indentation-context` is to be used in
`prog-widen`. This part seems like a major complication. All mode authors now
have to understand what prog-widen, prog-first-column and
prog-indentation-context are. Why burden prog-mode authors with notions that
multi-mode engines can take care of themselves?

It is also not clear to me why should prog-widen be used in indentation context
only? It makes perfect sense for this function to be used in font-locking and
syntax-propertize-function as well.

It's essentially a half-baked implementation of "hard widening" discussed
earlier. Why not impose the widening restriction directly in `widen` then? Maybe
bring widen to elisp and rename C widen into widen-internal. Then add generic
`prog-hard-widen-limits` which would be checked along prog-indentation-context
limits.

Multi-mode engines can then impose those hard limits whenever they need to and
adjust indentation accordingly. It's not that hard in my experience. Polymode
has a few lines to wrap indentation and it works reasonably well in pretty much
all contexts I have tried. All other problems can be solved with hard narrowing.

  https://github.com/vspinu/polymode/blob/master/polymode-methods.el#L715-L809

Unless I miss something essential it's really not worth imposing such
complexities on mode authors. Judging from the python.el, which is the only mode
using prog-first-column so far, it's not a trivial task. Each mode author will
basically have to implement indentation logic that mmm-mode or polymode already
implement. And even then, multi-mode engines will probably need to overwrite
that because the semantics of submode spans is either emacs-mode or
multi-mode-engine specific.

> syntax-propertize-function's aren't supposed to call widen at all, I think.

This should probably be in the docs then. Mode authors can decide to do loads of
work in there. One instance is `markdown-mode` which caches all font-lock
properties in syntax-propertize-function. While markdown-mode is clean and
doesn't use widen anywhere, that might not be the case for other modes.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-20 12:15                   ` Vitalie Spinu
@ 2016-03-20 15:58                     ` Dmitry Gutov
  2016-03-21  1:05                       ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-20 15:58 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

On 03/20/2016 02:15 PM, Vitalie Spinu wrote:

> Well, font-lock-dont-widen is not respected even in c-mode. Look at
> c-before-context-fl-expand-region and semi-safe-place which are called directly
> or indirectly from c-font-lock-fontify-region.

Well, yes. c-mode is special, as usual. That should be workable if CC 
Mode starts using prog-widen instead of widen, though.

>> For indentation, we've introduced prog-indentation-context recently. And
>> indentation functions in programming modes are supposed to call prog-widen
>> instead of widen now.
>
> I was not aware of that. Not sure if it is a step in the right direction though.

I'm not 100% happy with it either.

> `prog-indentation-context` looks fine to me but multi-modes already have their
> own wrappers for indentation which do just that according to their own semantics
> of modes/submodes/chunks/headers etc.

Too bad you were not around when this addition was discussed.

> The primary intent of `prog-indentation-context` is to be used in
> `prog-widen`. This part seems like a major complication. All mode authors now
> have to understand what prog-widen, prog-first-column and
> prog-indentation-context are. Why burden prog-mode authors with notions that
> multi-mode engines can take care of themselves?

IIRC, using first-column is fairly justified, the outer mode can't add 
extra indentation to the submode in a reliable, sane way (though I've 
also been hacking around that quite successfully). Here's the full 
discussion:

http://lists.gnu.org/archive/html/emacs-devel/2015-01/msg00431.html
http://lists.gnu.org/archive/html/emacs-devel/2015-02/msg00290.html

with my messages further down.

> It is also not clear to me why should prog-widen be used in indentation context
> only? It makes perfect sense for this function to be used in font-locking and
> syntax-propertize-function as well.

Indeed. In js-mode's case, the offending code is called from 
font-lock-keywords, for example.

> It's essentially a half-baked implementation of "hard widening" discussed
> earlier. Why not impose the widening restriction directly in `widen` then? Maybe
> bring widen to elisp and rename C widen into widen-internal. Then add generic
> `prog-hard-widen-limits` which would be checked along prog-indentation-context
> limits.

Right! At the very least, we should extract the second element of 
prog-indentation-context into a separate variable, and make prog-widen 
more prominent.

But a proper implementation of hard-widen would be even better in my 
book. Although someone would need to comb through all low-level 
functions, at least, and decide which of them need to call 
widen-internal, and which will be fine with just widen.

Are you interested in working on a patch? Also Cc'ing Stefan.

Looking back on it, it seems prog-indentation-context was merged too 
early: it only has one usage so far, so it's not clear if the approach 
is generally viable.

Christoph sort of promised to add support in CC Mode, but then 
disappeared. Which is not so surprising, that stuff is difficult.

> Unless I miss something essential it's really not worth imposing such
> complexities on mode authors. Judging from the python.el, which is the only mode
> using prog-first-column so far, it's not a trivial task. Each mode author will
> basically have to implement indentation logic that mmm-mode or polymode already
> implement. And even then, multi-mode engines will probably need to overwrite
> that because the semantics of submode spans is either emacs-mode or
> multi-mode-engine specific.

This is not too different from what I was saying, I think. That discussion is 
fairly long, though, and it veered off to the side.

AFAICT, though, the ultimate justification for having first-column is 
Python's indentation cycling behavior: 
http://lists.gnu.org/archive/html/emacs-devel/2015-02/msg01096.html

Which is not that convincing, but makes some things cleaner anyway.

But the last element, previous-chunks, is still not used anywhere in 
Emacs. I think including it turned out to be a mistake, or at least 
premature.

>> syntax-propertize-function's aren't supposed to call widen at all, I think.
>
> This should probably be in the docs then.

Probably.

> Mode authors can decide to do loads of
> work in there. One instance is `markdown-mode` which caches all font-lock
> properties in syntax-propertize-function. While markdown-mode is clean and
> doesn't use widen anywhere, that might not be the case for other modes.

ruby-syntax-propertize also does some involved parsing, but as long as 
there's no `widen' there, we should be fine.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-20 15:58                     ` Dmitry Gutov
@ 2016-03-21  1:05                       ` Vitalie Spinu
  2016-03-21  3:11                         ` Stefan Monnier
                                           ` (2 more replies)
  0 siblings, 3 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21  1:05 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

>> On Sun, Mar 20 2016 17:58, Dmitry Gutov wrote:

> On 03/20/2016 02:15 PM, Vitalie Spinu wrote:

> IIRC, using first-column is fairly justified, the outer mode can't add extra
> indentation to the submode in a reliable, sane way

The inner mode cannot often make that decision either. Same inner mode can be
used in very different multi-mode contexts, each with their own semantics for
chunks/headers/indentation. Reducing all that to a simple (first-column
. previous-chunk) pair and letting inner mode do the job is surely not
enough. The only actor to make that decision should be multi-mode engine itself.

Instead of teaching modes about multi-modes, a much better idea is to introduce
`calculate-indent-function` which would accept POS and optional STRING-AFTER and
STRING-BEFORE. This function will return the indentation of STRING-AFTER at POS
assuming there is a virtual STRING-BEFORE just before POS.

This way, a multi-mode engine can call inner-mode's calculate-indent-function at
the end of previous chunk with STRING-AFTER being the line at point and
STRING-BEFORE being the content of current chunk. Most modes indent reliably
based on one previous line, so in 99% of the cases STRING-BEFORE can be nil and
multi-mode engine can call calculate-indent-function only on first line of the
current chunk (and that only for continuation chunks, which are a minority out
there). Then a lot of modes don't even care about what's in the current line, so
STRING-AFTER will be irrelevant as well. Thus most modes will not even need an
implementation of calculate-indent-function.

This is both more general than prog-indentation-context and doesn't require
teaching major-modes about multi-modes. Moreover, a lot of major-modes already
have such a "calculator" in place.
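
A minimal sketch of what such a calculator could look like, assuming the proposed (POS &optional STRING-AFTER STRING-BEFORE) signature. The name `my-calculate-indent` and the toy rule (reuse the indentation of the nearest preceding non-blank line) are illustrative assumptions only; a real mode would plug in its own logic.

```elisp
(defun my-calculate-indent (pos &optional _string-after string-before)
  "Return the column for the line at POS without modifying the buffer.
Toy rule: reuse the indentation of the nearest preceding non-blank line.
When STRING-BEFORE has non-blank text, it is treated as if it were
inserted just before POS.  STRING-AFTER is accepted to match the
proposed signature but is not used by this toy rule."
  (if (and string-before (string-match-p "[^ \t\n]" string-before))
      ;; The virtual text decides: compute on a scratch copy of it.
      (with-temp-buffer
        (insert string-before "\n")
        (my-calculate-indent (point-max)))
    (save-excursion
      (goto-char pos)
      (beginning-of-line)
      (skip-chars-backward " \t\n")
      (if (bobp) 0 (current-indentation)))))
```

A multi-mode engine could then call it at the start of a continuation chunk and pass the tail of the previous chunk as STRING-BEFORE, without the major mode knowing anything about chunks.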

>> It's essentially a half-baked implementation of "hard widening" discussed
>> earlier. Why not impose the widening restriction directly in `widen` then?
>> Maybe bring widen to elisp and rename C widen into widen-internal. Then add
>> generic `prog-hard-widen-limits` which would be checked along
>> prog-indentation-context limits.

> Right! At the very least, we should extract the second element of
> prog-indentation-context into a separate variable, and make prog-widen more
> prominent.

Not sure about removing second element. Good thing about keeping all of them in
one place is for the indentation engine to be concerned with a single variable.

BTW, third argument should be renamed into PREVIOUS-CHUNK. The function returns
one chunk.

> But a proper implementation of hard-widen would be even better in my
> book. Although someone would need to comb through all low-level functions, at
> least, and decide which of them need to call widen-internal, and which will be
> fine with just widen.

No need to decide on widen-internal. All functions are free to call widen just
as they did before. It's 100% backward compatible. The only reason to use
`widen-internal` is to bring `widen` to elisp in order to allow for advise and
better debugging. Actually, with hard-widen-limits, there will be no need for
advice, so it can be kept in C.

Only consumers of `hard-widen-limits` should be concerned with its side
effects. But that's uniformly better than current situation when you cannot do
much about restricting widen.

In my experience hard-widen and parse-partial-sexp are the only hurdle in the
way of proper multi-modes. I don't remember a single problem that would occur
for a different reason.

BTW, parse-partial-sexp must abide by hard-widen-limits as well. This way the
request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be
automatically satisfied. You won't need syntax-ppss-dont-widen either.

> Are you interested in working on a patch? Also Cc'ing Stefan.

My knowledge of emacs C internals is close to 0. Elisp side (and probably C
side) of this is trivial. I will look into it but I don't think I am the best
person for that.

> Looking back on it, it seems prog-indentation-context was merged too early: it
> only has one usage so far, so it's not clear if the approach is generally
> viable.

> Christoph sort of promised to add support in CC Mode, but then
> disappeared. Which is not so surprising, that stuff is difficult.

A patch that would require hunting every single mode out there and implementing
multi-modes locally should have been more carefully considered IMO. Emacs 25 is
not yet there, so it's not late to reconsider that decision.

> AFAICT, though, the ultimate justification for having first-column is Python's
> indentation cycling behavior:
> http://lists.gnu.org/archive/html/emacs-devel/2015-02/msg01096.html

> Which is not that convincing, but makes some things cleaner anyway.

It's not convincing to me either. I use Christoph's indentation-0 trick and
it indeed works reliably for all modes I have tried except python. But python's
issue can be fixed with a simple advice of python-indent-line-function, no need
to overhaul python indentation because of that. This is how it's now done in
polymode:

  https://github.com/vspinu/polymode/blob/master/polymode-compat.el#L189-L199

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21  1:05                       ` Vitalie Spinu
@ 2016-03-21  3:11                         ` Stefan Monnier
  2016-03-21  5:05                           ` Vitalie Spinu
  2016-03-21 11:56                           ` Dmitry Gutov
  2016-03-21  5:08                         ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu
  2016-03-21 11:47                         ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov
  2 siblings, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21  3:11 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov

> BTW, parse-partial-sexp must abide by hard-widen-limits as well.

I don't understand what this means.  parse-partial-sexp is passed
2 locations and it works between them.  There's not much opportunity
for widening.

But yes, syntax-ppss should obey hard-widen-limits.

> A patch that would require hunting every single mode out there and
> implementing multi-modes locally should have been more carefully
> considered IMO.

I must say I don't understand how what we have is so very different from
what you suggest.  Of course, I fully agree on the need to deprecate
indent-line-function and use a side-effect free replacement which
returns the desired indentation (instead of performing the indentation).

I think both suggestions require changes to every mode, and in both
cases the changes can be reduced to a one-liner or close enough (for
the simple case).  Admittedly, for it to be a one-liner, we'll need to
provide a standard helper function.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21  3:11                         ` Stefan Monnier
@ 2016-03-21  5:05                           ` Vitalie Spinu
  2016-03-21  7:13                             ` Andreas Röhler
  2016-03-21 12:26                             ` Stefan Monnier
  2016-03-21 11:56                           ` Dmitry Gutov
  1 sibling, 2 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21  5:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov



>> On Sun, Mar 20 2016 23:11, Stefan Monnier wrote:

>> BTW, parse-partial-sexp must abide by hard-widen-limits as well.

> I don't understand what this means.  parse-partial-sexp is passed
> 2 locations and it works between them.  There's not much opportunity
> for widening.

parse-partial-sexp should work between hard limits (at least the lower
bound). It should operate as if hard-narrowed buffer is the real buffer.

So ideally it should take (max FROM (car hard-widen-limits)) as the starting
position. This will give the desired consistency between parse-partial-sexp and
syntax-ppss with the price of slightly modifying the semantics of
parse-partial-sexp in a backward compatible way.
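
A minimal Lisp-level sketch of that clamping, under the assumption that the limits live in a variable holding a (BEG . END) pair. Both `my-hard-widen-limits` and `my-parse-partial-sexp` are hypothetical names; the actual proposal would put this logic inside parse-partial-sexp itself.

```elisp
(defvar-local my-hard-widen-limits nil
  "Hypothetical (BEG . END) pair set by a multi-mode engine, or nil.")

(defun my-parse-partial-sexp (from to)
  "Like `parse-partial-sexp' from FROM to TO, clamped to the hard limits.
User narrowing is ignored; point and the restriction are left untouched."
  (save-excursion
    (save-restriction
      (widen)
      (let ((beg (or (car my-hard-widen-limits) (point-min)))
            (end (or (cdr my-hard-widen-limits) (point-max))))
        ;; Start no earlier than the lower hard limit, as proposed.
        (parse-partial-sexp (max from beg) (min to end))))))
```

With the limits unset, this behaves like parse-partial-sexp on the widened buffer.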

>> A patch that would require hunting every single mode out there and
>> implementing multi-modes locally should have been more carefully
>> considered IMO.

> I must say I don't understand how what we have is so very different from
> what you suggest.  

It's quite a bit different:

  - Major mode authors won't need to know about multi-modes. That means not
    dealing with chunks/spans/headers etc. These concepts are not even uniformly
    defined between existing multi-mode engines.
  
  - Major mode authors won't need to re-implement the indentation logic already
    there in multi-modes. The logic is likely to be too simplistic and major
    mode authors will have to re-do it anyways.

  - Setup is more general. multi-mode engine decides where to call
    calculate-indent-function and with what parameters and with what narrowing.

  - Arguably calculate-indent-function is a simpler concept to grasp

  - calculate-indent-function is needed anyways

> I think both suggestions require changes to every mode, and in both cases the
> changes can be reduced to a one-liner or close enough (for the simple
> case). Admittedly, for it to be a one-liner, we'll need to provide a standard
> helper function.

Judging from python.el it might be quite hard to provide a generic one liner to
deal with all those 3 elements. For calculate-indent-function instead you can
provide a straightforward one line assuming that STRING-BEFORE/AFTER do not
matter.


  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21  5:05                           ` Vitalie Spinu
@ 2016-03-21  7:13                             ` Andreas Röhler
  2016-03-21 12:26                             ` Stefan Monnier
  1 sibling, 0 replies; 155+ messages in thread
From: Andreas Röhler @ 2016-03-21  7:13 UTC (permalink / raw)
  To: emacs-devel



On 21.03.2016 06:05, Vitalie Spinu wrote:
>
>>> On Sun, Mar 20 2016 23:11, Stefan Monnier wrote:
>>> BTW, parse-partial-sexp must abide by hard-widen-limits as well.
>> I don't understand what this means.  parse-partial-sexp is passed
>> 2 locations and it works between them.  There's not much opportunity
>> for widening.
> parse-partial-sexp should work between hard limits (at least the lower
> bound). It should operate as if hard-narrowed buffer is the real buffer.
>
> So ideally it should take (max FROM (car hard-widen-limits)) as the starting
> position. This will give the desired consistency between parse-partial-sexp and
> syntax-ppss with the price of slightly modifying the semantics of
> parse-partial-sexp in a backward compatible way.
>
>>> A patch that would require hunting every single mode out there and
>>> implementing multi-modes locally should have been more carefully
>>> considered IMO.
>> I must say I don't understand how what we have is so very different from
>> what you suggest.
> It's quite a bit different:
>
>    - Major mode authors won't need to know about multi-modes. That means not
>      dealing with chunks/spans/headers etc. These concepts are not even uniformly
>      defined between existing multi-mode engines.
>    
>    - Major mode authors won't need to re-implement the indentation logic already
>      there in multi-modes. The logic is likely to be too simplistic and major
>      mode authors will have to re-do it anyways.
>
>    - Setup is more general. multi-mode engine decides where to call
>      calculate-indent-function and with what parameters and with what narrowing.
>
>    - Arguably calculate-indent-function is a simpler concept to grasp
>
>    - calculate-indent-function is needed anyways
>
>> I think both suggestions require changes to every mode, and in both cases the
>> changes can be reduced to a one-liner or close enough (for the simple
>> case). Admittedly, for it to be a one-liner, we'll need to provide a standard
>> helper function.
> Judging from python.el it might be quite hard to provide a generic one liner to
> deal with all those 3 elements. For calculate-indent-function instead you can
> provide a straightforward one line assuming that STRING-BEFORE/AFTER do not
> matter.
>
>
>    Vitalie
>

+1



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21  5:05                           ` Vitalie Spinu
  2016-03-21  7:13                             ` Andreas Röhler
@ 2016-03-21 12:26                             ` Stefan Monnier
  2016-03-21 14:13                               ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 12:26 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov

>>> BTW, parse-partial-sexp must abide by hard-widen-limits as well.
>> I don't understand what this means.  parse-partial-sexp is passed
>> 2 locations and it works between them.  There's not much opportunity
>> for widening.
> parse-partial-sexp should work between hard limits (at least the lower
> bound). It should operate as if hard-narrowed buffer is the real buffer.

You mean it should ignore the current (user)narrowing?  Why?
I'd think that if something needs to ignore the (user)narrowing it'd be
parse-partial-sexp's *caller* but not parse-partial-sexp itself.

> So ideally it should take (max FROM (car hard-widen-limits)) as the starting
> position.

You mean: as opposed to (max FROM (point-min))?

I disagree.  Functions should usually not accept to talk about positions
outside of the point-min/max range.

Notice how syntax-ppss is different in this regard: since it doesn't
receive FROM, that same rule doesn't prevent syntax-ppss from widening
to (car hard-widen-limits).

> This will give the desired consistency between parse-partial-sexp and
> syntax-ppss with the price of slightly modifying the semantics of
> parse-partial-sexp in a backward compatible way.

I'd be curious to know in which circumstances (i.e. specific code in
specific packages) this would make a difference.  As mentioned above,
I think these cases would be better fixed by changing the calling code
to perform widening before calling parse-partial-sexp.
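
As a sketch of the caller-side alternative: the calling code widens temporarily and then parses from the real beginning of the buffer. The helper name `my-in-string-or-comment-p` is hypothetical.

```elisp
(defun my-in-string-or-comment-p (pos)
  "Return non-nil if POS is inside a string or comment.
Any narrowing in effect is ignored and restored afterwards."
  (save-excursion              ; `parse-partial-sexp' moves point
    (save-restriction
      (widen)
      (let ((state (parse-partial-sexp (point-min) pos)))
        (or (nth 3 state) (nth 4 state))))))
```

Here parse-partial-sexp itself never sees a position outside point-min/point-max, because the caller has already widened.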

>>> A patch that would require hunting every single mode out there and
>>> implementing multi-modes locally should have been more carefully
>>> considered IMO.

>   - Major mode authors won't need to know about multi-modes. That
>     means not dealing with chunks/spans/headers etc.  These concepts are
>     not even uniformly defined between existing multi-mode engines.

I understand that's your claim, but I don't understand why/how this is
different between the two proposals.

>   - Major mode authors won't need to re-implement the indentation
>   logic already there in multi-modes. The logic is likely to be too
>   simplistic and major mode authors will have to re-do it anyways.
>
>   - Setup is more general. multi-mode engine decides where to call
>   calculate-indent-function and with what parameters and with
>   what narrowing.

Same here.

>   - Arguably calculate-indent-function is a simpler concept to grasp

As mentioned, I fully agree with the need to replace
indent-line-function with calculate-indent-function (tho I like to name
it prog-indent-function).  So the difference is w.r.t your
STRING-BEFORE/AFTER: which code provides them, which code obeys them,
and how that compares to the way prog-indentation-context is provided
and obeyed.

>> I think both suggestions require changes to every mode, and in both
>> cases the changes can be reduced to a one-liner or close enough (for
>> the simple case).  Admittedly, for it to be a one-liner, we'll need to
>> provide a standard helper function.
> Judging from python.el it might be quite hard to provide a generic one
> liner to deal with all those 3 elements.  For calculate-indent-function
> instead you can provide a straightforward one line assuming that
> STRING-BEFORE/AFTER do not matter.

My hunch is that if STRING-BEFORE/AFTER don't matter, then it should not
be hard to come up with a generic function in prog.el which can be
invoked with a one liner in the major mode (assuming the major mode
sets (prog|calculate)-indent-function).
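
A minimal sketch of what such a generic helper might look like, assuming a side-effect-free calculator that takes a position and returns a column. Both `my-calculate-indent-function` and `my-indent-line-from-calculator` are hypothetical names, not anything that exists in prog-mode.el.

```elisp
(defvar-local my-calculate-indent-function nil
  "Hypothetical: function taking POS and returning a column, or nil.")

(defun my-indent-line-from-calculator ()
  "Indent the current line to the column the calculator returns."
  (let ((col (and my-calculate-indent-function
                  (funcall my-calculate-indent-function
                           (line-beginning-position)))))
    (cond ((not (integerp col)) 'noindent)
          ;; Point within the indentation: move it to the new indentation.
          ((<= (current-column) (current-indentation))
           (indent-line-to col))
          ;; Otherwise keep point on the same text.
          (t (save-excursion (indent-line-to col))))))
```

The one-liner in a major mode would then be something like (setq-local indent-line-function #'my-indent-line-from-calculator), next to the line that sets the calculator.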

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 12:26                             ` Stefan Monnier
@ 2016-03-21 14:13                               ` Vitalie Spinu
  2016-03-21 14:43                                 ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 14:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel

>> On Mon, Mar 21 2016 08:26, Stefan Monnier wrote:

>> parse-partial-sexp should work between hard limits (at least the lower
>> bound). It should operate as if hard-narrowed buffer is the real buffer.

> You mean it should ignore the current (user)narrowing?  Why? I'd think that if
> something needs to ignore the (user)narrowing it'd be parse-partial-sexp's
> *caller* but not parse-partial-sexp itself.

Currently it just throws out-of-range errors. So in that sense it does ignore
user narrowing in a very inconvenient way.

parse-partial-sexp is called from code exclusively and it just happens that in
multi-modes it is called outside of narrow region quite often. That's a major
inconvenience. Why on earth would one need to take user narrowing into account for
syntax parsing? If parse-partial-sexp could be made to always widen to hard
limits it will automatically solve a bunch of problems. bug#22983 being one of
them, condition-case awkwardness in syntax-ppss being another one, and the
ubiquitous out-of-range errors in font-lock in multi-modes being the most
important one.

>> So ideally it should take (max FROM (car hard-widen-limits)) as the starting
>> position.

> You mean: as opposed to (max FROM (point-min))?

Yes.

> I disagree.  Functions should usually not accept to talk about positions
> outside of the point-min/max range.

Depends on the function. point-max/min is mostly user level. Why would syntax
parsing need to respect that? Bug#22983 illustrates that clearly. If the user
narrows in the middle of a string, it creates huge problems.

Note that with Dmitry's new syntax-ppss-dont-widen proposal syntax-ppss widens
first.

Can I ask you the reverse? What do you gain by respecting user narrowing in
syntax parsing?

> Notice how syntax-ppss is different in this regard: since it doesn't
> receive FROM, that same rule doesn't prevent syntax-ppss from widening
> to (car hard-widen-limits).

Well, not quite different. It has POS which might be outside of user narrowed
range.

>> This will give the desired consistency between parse-partial-sexp and
>> syntax-ppss with the price of slightly modifying the semantics of
>> parse-partial-sexp in a backward compatible way.

> I'd be curious to know in which circumstances (i.e. specific code in specific
> packages) this would make a difference.  As mentioned above, I think these
> cases would be better fixed by changing the calling code to perform widening
> before calling parse-partial-sexp.

I think bug#22983 is illustrative enough. Multi-mode code is a nightmare because
of out-of-range errors in parsing. `syntax-ppss` is protected but that
condition-case is triggered 99.99% of the time in multi-modes.

In multi-modes you really want to keep narrowing because most of the major-mode
functionality works well on narrowed code. Pretty much all of it except
syntactic parsing and font-locking. Occasional property lookup outside of the
narrowed region could be dealt with on a case by case basis or, hopefully, with
the new hard-narrowed-limits at the core of it.

>>>> A patch that would require hunting every single mode out there and
>>>> implementing multi-modes locally should have been more carefully
>>>> considered IMO.

>>   - Major mode authors won't need to know about multi-modes. That
>>     means not dealing with chunks/spans/headers etc.  These concepts are
>>     not even uniformly defined between existing multi-mode engines.

> I understand that's your claim, but I don't understand why/how this is
> different between the two proposals.

Major mode author has to deal with the span explicitly as defined in
previous-chunk in prog-indentation-context. Cognitively this is a more demanding
task. Ask a new person to go and read the doc of prog-indentation-context and
ask how much he or she understands of it. I read it and I think I understand
most of it, but looking at all the usages of prog-widen and prog-first-column in
python.el my brain gives up. Previous-chunk is not even used in python.el!

The prog-calculate-indent-function is more general. You can call it on any
buffer position (need not be last point in the previous span). It can be called
with whatever STRING-BEFORE and STRING-AFTER (these can, but need not be, actual
strings in the buffer). Current prog-indentation-context allows for the
possibility of a string being inserted before the beginning of the current
chunk. STRING-BEFORE is more general than that because of the arbitrary POS that
it can be applied to.

My claim is that we can achieve much higher generality and not bother mode
authors with all those concepts like current/previous span/chunk, starting/end
position etc. Only the multi-mode engine can take proper care of those anyway.

Here is a simple example when inner mode cannot decide by itself on the
indentation. Assume for concreteness a noweb header with some code immediately
following the header:

  <<foo, some_arg=4>>= some_call(blabla) 
      some_other_call(blabla) ## indented by offset 2 with respect to header or prev_chunk

How do you indent the some_call(blabla) after the header? The most meaningful
way is to keep it untouched just as user defined it. If inner mode would indent
it by itself it would give offset of 4. This is a simple example of header
dependence.

You can easily imagine more complex cases when not only one previous span needs
to be considered but a range of previous spans of the same inner mode. Moreover
there might be nested inner chunks. Which chunk/span will you include in
prog-indentation-context? The entire previous code chunk or only the last
homogeneous span after the most recent inner-inner chunk?

Indentation of a span is commonly dependent on the header of the chunk (note the
terminology distinction). You can imagine having a parameter in the header that
would determine the indentation of the chunk's body. Header-dependence is a
simple and common case of inter-span dependence. It's not hard to imagine
complex cases when indentation in current span will depend not only on the
previous span of the same mode but on other spans of host mode or even other
inner (nested or not) modes.

IMO the best way is to leave all these complexities to multi-mode authors to
deal with on a case by case basis. You never know what sort of complexities and
chunk dependencies new multi-modes will impose. Better keep things
generic. prog-calculate-indent-function seems like a multi-mode agnostic
solution. I am not sure if it will solve all problems, but it surely solves
more than prog-indentation-context does, and in a cleaner way.

Note on terminology. I put quite some effort to sort things out in
polymode. Glossary of terms is here:

  https://github.com/vspinu/polymode/tree/master/modes#glossary-of-terms

For many reasons it's important to distinguish between portions of code that
include header/tails and homogeneous portions of the same mode. Former portions
I call `chunks` and those can include other chunks of different sub-modes. The
latter, homogeneous portions, I call `spans`.

The fact that core emacs is now starting to build pieces of multi-mode
functionality here and there and thus entrenching a somewhat naive
interpretation of a "chunk" doesn't make me happy. Not a big deal though.

> My hunch is that if STRING-BEFORE/AFTER don't matter,

It will actually matter for quite some modes in continuation chunks. I was too
optimistic.

  Vitalie

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:13                               ` Vitalie Spinu
@ 2016-03-21 14:43                                 ` Stefan Monnier
  2016-03-21 16:42                                   ` Vitalie Spinu
  2016-03-21 16:45                                   ` Vitalie Spinu
  0 siblings, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 14:43 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel

> parse-partial-sexp is called from code exclusively and it just happens
> that in multi-modes it is called outside of narrow region quite often.

How/why?  Can you give some concrete scenario?

> That's a major inconvenience. Why on earth would one need to
> take user narrowing into account for syntax parsing?

Because you need it for *everything* that looks at the buffer.
Why should parse-partial-sexp be different from (say) scan-sexps?

> If parse-partial-sexp could be made to always widen to hard limits it
> will automatically solve a bunch of problems. bug#22983 being one of them,

Bug#22983 should be fixed by widening, indeed, but it should be done in
syntax.el.  Widening in parse-partial-sexp would only address some cases
but not all (e.g. the syntax-begin-function cases or the
syntax-propertize-function cases).  Those other cases can only be fixed
in syntax.el.

> the ubiquitous out-of-range errors in font-lock in multi-modes being
> the most important one.

I'm not familiar with those, so if you could give some examples it
would help us judge if they would indeed benefit from a fix in
parse-partial-sexp rather than elsewhere.

>> Notice how syntax-ppss is different in this regard: since it doesn't
>> receive FROM, that same rule doesn't prevent syntax-ppss from widening
>> to (car hard-widen-limits).
> Well, not quite different. It has POS which might be outside of user narrowed
> range.

No: POS should be within point-min/max.

> Major mode author has to deal with the span explicitly as defined in
> previous-chunk in prog-indentation-context. Cognitively this is a more
> demanding task. Ask a new person to go and read the doc of
> prog-indentation-context and ask how much he or she understands of
> it. I read it and I think I understand most of it, but looking at all
> the usages of prog-widen and prog-first-column in python.el my brain
> gives up. Previous-chunk is not even used in python.el!

Replace all your widen calls with calls to `prog-widen' and you get
the same result (since (nth 1 prog-indentation-context) is basically
another name for your hard-widen-limit).  So I don't think prog-widen is
that very complicated.
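
As a rough sketch of what such a helper boils down to (this is not the
prog-mode.el source; `demo-chunk-limits', `demo-widen' and
`demo-accessible-after-widen' are hypothetical names standing in for the
chunk bounds a multi-mode engine would bind):

```elisp
;; Hypothetical stand-in for the chunk bounds a multi-mode engine binds.
(defvar demo-chunk-limits nil
  "Cons (BEG . END) of the current chunk, or nil outside a multi-mode.")

(defun demo-widen ()
  "Widen to `demo-chunk-limits' if set, else to the whole buffer."
  (if demo-chunk-limits
      (narrow-to-region (car demo-chunk-limits) (cdr demo-chunk-limits))
    (widen)))

(defun demo-accessible-after-widen ()
  "Return (MIN . MAX) of the region a mode sees after `demo-widen'.
The caller's own restriction is restored on exit."
  (save-restriction
    (demo-widen)
    (cons (point-min) (point-max))))
```

A major mode would then call the helper wherever it currently calls
`widen', and the multi-mode engine let-binds the limits around the call.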

As for prog-first-column the local major mode can just ignore it in
which case the multi-mode can do the same that you do.  It's only useful
if you need/want to provide a more complex behavior than what
polymode supports.  So, of course, it's more complex.

> The prog-calculate-indent-function is more general. You can call it on
> any buffer position (need not be last point in the previous span).

[ Note: In my mind, the "natural main case" for multi-mode indentation
  is when you call the indentation function on the *first position* of
  a span.  But you seem to look at it from the other end, where you call
  the indentation function on the *last position* of the previous span.
  I think I'm beginning to see why.  ]

Note that "is more general" here means that the major mode's function
has to handle more cases, so it would seem to fundamentally require more
work on the major mode's side.

> Current prog-indentation-context allows for the possibility of a string
> being inserted before the beginning of the current chunk.  STRING-BEFORE
> is more general than that because of the arbitrary POS that it can be
> applied to.

Good point.  I didn't think of that.  Do you make use of that
possibility, and/or can you give an example where it's useful?

> Here is a simple example when inner mode cannot decide by itself on
> the indentation.  Assume for concreteness a noweb header with some code
> immediately following the header:
>
>   <<foo, some_arg=4>>= some_call(blabla) 
>       some_other_call(blabla) ## indented by offset 2 with respect to header or prev_chunk
>
> How do you indent the some_call(blabla) after the header? The most
> meaningful way is to keep it untouched just as user defined it. If
> inner mode would indent it by itself it would give offset of 4. This
> is a simple example of header dependence.

Maybe it's because I'm not familiar with noweb, but I didn't understand
this example.  It looks like a very interesting example, so could you go
over it again in much more detail?

        Stefan

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:43                                 ` Stefan Monnier
@ 2016-03-21 16:42                                   ` Vitalie Spinu
  2016-03-21 18:31                                     ` Stefan Monnier
  2016-03-21 20:33                                     ` Alan Mackenzie
  2016-03-21 16:45                                   ` Vitalie Spinu
  1 sibling, 2 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 16:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov

>> On Mon, Mar 21 2016 10:43, Stefan Monnier wrote:

>> parse-partial-sexp is called from code exclusively and it just happens
>> that in multi-modes it is called outside of narrow region quite often.

> How/why?  Can you give some concrete scenario?

The MM engine narrows to the span region for a lot of tasks, most importantly
font-lock. If the inner mode's fontification functions misbehave (ignoring
font-lock-dont-widen for example), as c-mode's do, this leads to trouble. So to
avoid those troubles you would advise individual functions and narrow them
properly or apply other tricks like overwriting the output value or input
args.  It all works fine until that function calls parse-partial-sexp (or some
other low level function) and blows up with an args-out-of-range error.

To be frank, the issue of parse-partial-sexp is fading because modes are now
using syntax-ppss more extensively. Most of the problems with parse-partial-sexp
from the past are now internalized in condition-case inside syntax-ppss. That
condition-case is triggered very often (almost always) from inside polymode
chunk narrowing.

>> That's a major inconvenience. Why on earth would one need to
>> take user narrowing into account for syntax parsing?

> Because you need it for *everything* that looks at the buffer.
> Why should parse-partial-sexp be different from (say) scan-sexps?

I think parse-partial-sexp, syntax-ppss and maybe some others, are special in
the sense that in order to return a correct value they need to be aware of the
whole buffer. I don't see this as an inconsistency but I might be too naive.

>> If parse-partial-sexp could be made to always widen to hard limits it
>> will automatically solve a bunch of problems. bug#22983 being one of them,

> Bug#22983 should be fixed by widening, indeed, but it should be done in
> syntax.el.  Widening in parse-partial-sexp would only address some cases
> but not all (e.g. the syntax-begin-function cases or the
> syntax-propertize-function cases).  Those other cases can only be fixed
> in syntax.el.

>> the ubiquitous out-of-range errors in font-lock in multi-modes being
>> the most important one.

> I'm not familiar with those, so if you could give some examples it
> would help us judge if they would indeed benefit from a fix in
> parse-partial-sexp rather than elsewhere.

c-mode provides an example. I don't remember exactly where and how, but it has
to do with c-before-context-fl-expand-region and c-state-semi-safe-place,
because I am currently advising these two functions.

The logic is roughly like this: the c-mode engine doesn't respect
font-lock-dont-widen, widens stuff in some of its functions, then calls its
parsing, gets back some points outside the font-lock range and blows up when
trying to access those points from the narrowed region.

I was not collecting these cases carefully but I will start doing it and will
get back with more concrete examples in the following weeks.

Another direction of problems is syntax-propertize. It can be called with POS
outside of the current narrowed region, particularly from
internal--syntax-propertize. But again I don't recall exactly how that was
happening.

> Replace all your widen calls with calls to `prog-widen' and you get the same
> result (since (nth 1 prog-indentation-context) is basically another name for
> your hard-widen-limit).  So I don't think prog-widen is that very complicated.

It's not but you have to enforce that in all known modes.

> Note that "is more general" here means that the major mode's function has to
> handle more cases, so it would seem to fundamentally require more work on the
> major mode's side.

I don't agree. Work must be done only in the generic multi-mode engines
(mmm-mode, polymode etc). Other modes should re-use that generic infrastructure,
or even better, do nothing, and leave to someone else to define a new polymode
with host chunk being *the* mode. That every mode with basic needs for inner
sub-modes tries to re-invent the wheel is a dead end.

  Vitalie

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 16:42                                   ` Vitalie Spinu
@ 2016-03-21 18:31                                     ` Stefan Monnier
  2016-03-21 19:16                                       ` Vitalie Spinu
  2016-03-21 20:33                                     ` Alan Mackenzie
  1 sibling, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 18:31 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov

> To be frank, the issue of parse-partial-sexp is fading because modes
> are now using syntax-ppss more extensively.

Aha!  So we agree: there's no reason to worry about parse-partial-sexp,
since at this point pretty much all modes rely on syntax-ppss (other
than CC-modes, obviously).

> Most of the problems with parse-partial-sexp from the past are now
> internalized in condition-case inside syntax-ppss. That condition-case
> is triggered very often (almost always) from inside polymode
> chunk narrowing.

Right.  But I don't see it as a problem in parse-partial-sexp, rather
than a problem in syntax.el.

>> Because you need it for *everything* that looks at the buffer.
>> Why should parse-partial-sexp be different from (say) scan-sexps?
> I think parse-partial-sexp, syntax-ppss and maybe some others, are special in
> the sense that in order to return a correct value they need to be aware of the
> whole buffer. I don't see this as an inconsistency but I might be too naive.

scan-sexps will complain about unmatched parens and things like that if
it bumps into point-min/max.  Same for re-search-*.  I think you've just
been too often exposed to the use of (parse-partial-sexp 1 ...) where
the resulting signal bites you right away, whereas many other functions
won't signal an error and will instead do *something* (which may not
always be incorrect, but may often enough still result in acceptable
behavior).

> c-mode provides an example. I don't remember exactly where and how, but
> it has to do with c-before-context-fl-expand-region and
> c-state-semi-safe-place, because I am currently advising these two
> functions.

CC-modes is definitely a very special case here.
We should aim to limit the amount of changes in most major modes, so
better not pay too much attention to cc-mode from that point of view.

> The logic is roughly like this: the c-mode engine doesn't respect
> font-lock-dont-widen, widens stuff in some of its functions, then
> calls its parsing, gets back some points outside the font-lock range and
> blows up when trying to access those points from the narrowed region.

Sounds like a problem in cc-mode, which will require changes in
cc-mode.  The generic code shouldn't worry about that.

> I was not collecting these cases carefully but I will start doing it and will
> get with more concrete examples in the following weeks.

Thanks.

> Another direction of problems is syntax-propertize.  It can be called
> with POS outside of the current narrowed region, particularly from
> internal--syntax-propertize.

That would definitely be an error, so if you bump into such a case
please report it.

>> Replace all your widen calls with calls to `prog-widen' and you get
>> the same result (since (nth 1 prog-indentation-context) is basically
>> another name for your hard-widen-limit).  So I don't think prog-widen
>> is that very complicated.
> It's not but you have to enforce that in all known modes.

I prefer to say that "any major mode which wants to play with the new
snazzy multi-mode feature needs to be adjusted (e.g. with prog-widen)".
It's perfectly fine if some major modes don't play along correctly until
they're fixed.

"Try to get multi-mode working without touching anyone's code"
(e.g. using advice) is great, but we already have packages which do that.

>> Note that "is more general" here means that the major mode's function has to
>> handle more cases, so it would seem to fundamentally require more work on the
>> major mode's side.

> I don't agree. Work must be done only in the generic multi-mode engines
> (mmm-mode, polymode etc).

The "is more general" I was quoting was talking about the ways the
generic code can call the major-mode-specific code.  If this is more
generic, it means the major-mode-specific code needs to handle more
situations (e.g. STRING-BEFORE appearing not just at the beginning of
a chunk).

> Other modes should re-use that generic infrastructure, or even better,
> do nothing, and leave to someone else to define a new polymode with
> host chunk being *the* mode. That every mode with basic needs for
> inner sub-modes tries to re-invent the wheel is a dead end.

I don't understand: every major mode's indentation code will have to pay
attention to the STRING-BEFORE/AFTER that it receives from the generic
code and will have to do something with it (it can ignore it but at the
cost of sub-optimal results).  And AFAIK this can only be done by the
major mode's code, not the generic mode's code.
[ I feel like I must be missing something.  ]

        Stefan

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 18:31                                     ` Stefan Monnier
@ 2016-03-21 19:16                                       ` Vitalie Spinu
  2016-03-21 20:47                                         ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 19:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel



>> On Mon, Mar 21 2016 14:31, Stefan Monnier wrote:

>> Other modes should re-use that generic infrastructure, or even better,
>> do nothing, and leave to someone else to define a new polymode with
>> host chunk being *the* mode. That every mode with basic needs for
>> inner sub-modes tries to re-invent the wheel is a dead end.

> I don't understand: every major mode's indentation code will have to pay
> attention to the STRING-BEFORE/AFTER that it receives from the generic
> code and will have to do something with it (it can ignore it but at the
> cost of sub-optimal results).  And AFAIK this can only be done by the
> major mode's code, not the generic mode's code.
> [ I feel like I must be missing something.  ]

The hope is that most modes will need only the default implementation. Maybe
prog-indentation-function need not even know about those arguments. Along the
lines of what Dmitry proposed, there could be an optional extra piece,
prog-indentation-with-virtual-context-function, that modes might choose to set.

I think it might be good to try this as part of a multi-mode engine first. I can
try it with polymode. If it's generic enough it could be ported into emacs in a
later stage to be re-used with other multi-mode setups.

  Vitalie



* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 19:16                                       ` Vitalie Spinu
@ 2016-03-21 20:47                                         ` Stefan Monnier
  0 siblings, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 20:47 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel

> The hope is that most modes will need only the default implementation. Maybe
> prog-indentation-function need not even know about those arguments. Along the
> lines of what Dmitry proposed, there could be an optional extra piece,
> prog-indentation-with-virtual-context-function, that modes might choose to set.

That was the idea behind prog-indentation-context: indentation functions
can choose to use it if they wish, but by default they don't have to.


        Stefan



* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 16:42                                   ` Vitalie Spinu
  2016-03-21 18:31                                     ` Stefan Monnier
@ 2016-03-21 20:33                                     ` Alan Mackenzie
  2016-03-21 20:49                                       ` Stefan Monnier
                                                         ` (2 more replies)
  1 sibling, 3 replies; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-21 20:33 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Dmitry Gutov, Stefan Monnier, emacs-devel

Hello, Vitalie.

On Mon, Mar 21, 2016 at 05:42:51PM +0100, Vitalie Spinu wrote:

> >> On Mon, Mar 21 2016 10:43, Stefan Monnier wrote:

> >> parse-partial-sexp is called from code exclusively and it just happens
> >> that in multi-modes it is called outside of narrow region quite often.

> > How/why?  Can you give some concrete scenario?

> The MM engine narrows to the span region for a lot of tasks, most importantly
> font-lock. If the inner mode's fontification functions misbehave (ignoring
> font-lock-dont-widen for example), as c-mode's do, this leads to trouble.

That's a misunderstanding of what `font-lock-dont-widen' is.  It's
purely a signal to font-lock.  Its doc string makes clear that it's
intended for use by major modes.  It is for a major mode to set this
flag, not to act on it.

CC Mode absolutely needs to widen, to get the context necessary for
correct fontification and indentation (which can be an arbitrary depth).  

> So to avoid those troubles you would advise individual functions and
> narrow them properly or apply other tricks like overwriting the output
> value or input args.  It all works fine until that function calls
> parse-partial-sexp (or some other low level function) and blows up with
> an args-out-of-range error.

Reading some of the posts on emacs-devel today, it strikes me that
narrowing might be the wrong tool for marking the boundaries of distinct
regions where different major modes are in effect.  It seems to cause
nothing but trouble.  I don't know what the right tool is, and it may
not currently exist in Emacs.  But it might be a good use of time to
work out what properties such boundary markers ought to have, and if
necessary, to implement them.

[ .... ]

>   Vitalie

-- 
Alan Mackenzie (Nuremberg, Germany).

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 20:33                                     ` Alan Mackenzie
@ 2016-03-21 20:49                                       ` Stefan Monnier
  2016-03-21 21:03                                       ` Drew Adams
  2016-03-21 21:12                                       ` Dmitry Gutov
  2 siblings, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 20:49 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Vitalie Spinu, Dmitry Gutov, emacs-devel

> Reading some of the posts on emacs-devel today, it strikes me that
> narrowing might be the wrong tool for marking the boundaries of distinct
> regions where different major modes are in effect.  It seems to cause
> nothing but trouble.

That's why prog-indentation-context just provides the boundaries of the
current chunk but without narrowing.


        Stefan



* RE: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 20:33                                     ` Alan Mackenzie
  2016-03-21 20:49                                       ` Stefan Monnier
@ 2016-03-21 21:03                                       ` Drew Adams
  2016-03-21 21:12                                       ` Dmitry Gutov
  2 siblings, 0 replies; 155+ messages in thread
From: Drew Adams @ 2016-03-21 21:03 UTC (permalink / raw)
  To: Alan Mackenzie, Vitalie Spinu; +Cc: emacs-devel, Stefan Monnier, Dmitry Gutov

> Reading some of the posts on emacs-devel today, it strikes me that
> narrowing might be the wrong tool for marking the boundaries of distinct
> regions where different major modes are in effect.  It seems to cause
> nothing but trouble.
>
> I don't know what the right tool is, and it may not currently exist
> in Emacs.  But it might be a good use of time to work out what
> properties such boundary markers ought to have, and if necessary,
> to implement them.

Indeed.  Just what properties do you need for "such boundary
markers"?

What's wrong with using _markers_ to mark area boundaries?
(That's what I use in library zones.el, for example.)

What's wrong with using text properties to mark areas?
(That's what I use in library isearch-prop.el, for example.)
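
A minimal sketch of the text-property approach, assuming a hypothetical
property name `demo-chunk-mode' and helper `demo-chunk-bounds' (neither
comes from zones.el or isearch-prop.el): the area around a position is
recovered from the property itself, with no narrowing involved.

```elisp
;; Sketch only: `demo-chunk-mode' is a hypothetical text property whose
;; value names the mode in effect for that stretch of text.
(defun demo-chunk-bounds (pos)
  "Return (BEG . END) of the run around POS sharing `demo-chunk-mode'.
The run is the maximal stretch of text whose property value equals the
one on the character at POS."
  (cons
   ;; Scan backward from just after the character at POS.
   (previous-single-property-change (min (1+ pos) (point-max))
                                    'demo-chunk-mode nil (point-min))
   ;; Scan forward from POS to where the property value changes.
   (next-single-property-change pos 'demo-chunk-mode nil (point-max))))
```

For instance, after (put-text-property 5 9 'demo-chunk-mode 'ruby) the
call (demo-chunk-bounds 6) covers exactly the marked characters.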

I haven't been following this thread except for scanning it,
so I don't really know what the need is for messing with
narrowing (or for defining another, "harder" narrowing).

I'm just hoping that at the end of the day our age-old
narrowing feature will at least remain as it has been, for
both programs and interactive use.

(It doesn't give me confidence, a priori, to have seen that
at least one developer involved expressed little understanding
of how users and programs actually use narrowing, thinking
that narrowing is only for hiding text you do not want to see.)

How about a bit of a spec (description, summary) of what you
really need?  It's not even clear what the problem is that you
are trying to solve.

Look at the Subject line of this thread, and that of the bug
thread that this one derived from, and that of the other
derived thread, for the patch "hard-widen-limits".  Do they
really characterize what this is all about: the problem to be
solved?

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 20:33                                     ` Alan Mackenzie
  2016-03-21 20:49                                       ` Stefan Monnier
  2016-03-21 21:03                                       ` Drew Adams
@ 2016-03-21 21:12                                       ` Dmitry Gutov
  2 siblings, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 21:12 UTC (permalink / raw)
  To: Alan Mackenzie, Vitalie Spinu; +Cc: Stefan Monnier, emacs-devel

Hi Alan,

On 03/21/2016 10:33 PM, Alan Mackenzie wrote:

>> The MM engine narrows to the span region for a lot of tasks, most importantly
>> font-lock. If the inner mode's fontification functions misbehave (ignoring
>> font-lock-dont-widen for example), as c-mode's do, this leads to trouble.
>
> That's a misunderstanding of what `font-lock-dont-widen' is.  It's
> purely a signal to font-lock.  Its doc string makes clear that it's
> intended for use by major modes.

It does not. The docstring gives examples of the modes where it can be 
useful. It does not say that the variable can only be set by a major mode.



* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:43                                 ` Stefan Monnier
  2016-03-21 16:42                                   ` Vitalie Spinu
@ 2016-03-21 16:45                                   ` Vitalie Spinu
  2016-03-21 22:55                                     ` Dmitry Gutov
  2016-03-22 14:51                                     ` Stefan Monnier
  1 sibling, 2 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 16:45 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov

I have split the answer in two separate conceptually distinct parts. This one is
about indentation complexities of generic multi-modes.

>> On Mon, Mar 21 2016 10:43, Stefan Monnier wrote:

>> Current prog-indentation-context allows for the possibility of a string
>> being inserted before the beginning of the current chunk.  STRING-BEFORE is
>> more general than that because of the arbitrary POS that it can be applied to.

> Good point.  I didn't think of that.  Do you make use of that
> possibility, and/or can you give an example where it's useful?

Please see example of erb mode below.

>> Here is a simple example when inner mode cannot decide by itself on
>> the indentation.  Assume for concreteness a noweb header with some code
>> immediately following the header:
>>
>>   <<foo, some_arg=4>>= some_call(blabla) 
>>       some_other_call(blabla) ## indented by offset 2 with respect to header or prev_chunk
>>
>> How do you indent the some_call(blabla) after the header? The most
>> meaningful way is to keep it untouched just as user defined it. If
>> inner mode would indent it by itself it would give offset of 4. This
>> is a simple example of header dependence.

> Maybe it's because I'm not familiar with noweb, but I didn't understand this
> example.  It looks like a very interesting example, so could you go over it
> again in much more detail?

Noweb is not essential here. The story will hold for pretty much all multi-modes
with non-full-line headers. In noweb `<<foo, some_arg=4>>=` is a header of a
chunk. Polymode places heads and tails in their own modes because they are not
conceptually part of either the host or the sub-mode. You can specify arbitrary
parameters in the head which might even instruct how to indent the chunk. The
first code line `some_call(blabla)` is placed on the same line with the head.
This is uncommon but it's the simplest real case I can think of.

There are two issues here.

First one is how do you indent the head itself? Let's assume the point is after
`foo`. If you follow the naive prog-indentation-context the indentation should
be handled by the mode in the head chunk, right? Let's call it
noweb-head-mode. This mode is the same for many noweb host-mode/inner-mode
combinations and defaults to poly-head-tail-mode. Host mode is commonly LaTeX
but it can be anything. One reasonable way to indent it is to use the host mode
indentation engine. Note that this is in contrast to the
prog-indentation-context assumption for which PREVIOUS-CHUNK is assumed to be of
the same mode type as the current type.

The second issue is with respect to the first line immediately after the
header. If you naively call inner mode indentation engine on that line in a
narrowed buffer starting after >>= you will get it indented to FIRST-COLUMN,
which in the above case is the indentation of the head, plus noweb chunk offset
which is a polymode specific thing and it is customizable per inner mode. Do you
really want to insert that space after >>=? Probably not. So the code following
the header is special. That means that you will either have to take care of that
in multi-mode engine or extend prog-indentation-context.

Think of markdown simple code spans `this = is(a, codes, span)` which can occur
anywhere in the buffer. Indentation is not meaningful within the span at all,
the whole chunk should be indented by the outer mode just before the opening `.

Real trouble comes with continuation chunks. You might need to have a completely
reversed indentation logic - in outer/host spans MM engine needs to call inner
mode for indentation. Consider this example of erb mode taken from
https://github.com/fxbois/web-mode/blob/master/tests/demo.erb.

    <div id='header'>
      <% if signed_in? -%>
        <%= link_to t('.sign_out'), sign_out_path, :method => :delete %>
      <% else -%>
        <%= link_to t('.sign_in'), sign_in_path %>
      <% end -%>
    </div>

One meaningful approach here is to indent if-else-end block using inner mode
rules, right? This is what web-mode seems to be currently doing. Assume you are
just in front of `<%= link_to`. This is host html mode. But you need to indent
according to inner mode construct. So what do you do? You go to the end of
previous code chunk and call prog-indentation-function of inner mode with
STRING-BEFORE = "\n" and STRING-AFTER="link_to t('.sign_out'), sign_out_path,
:method => :delete". Simple, isn't it? That's precisely my proposal.

The message is that whatever you try you will not be able to completely leave
all the work to inner mode or capture it with naive constructs like
prog-indentation-context. Quite the opposite, new complexities are likely to
make multi-mode authors' lives harder.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 16:45                                   ` Vitalie Spinu
@ 2016-03-21 22:55                                     ` Dmitry Gutov
  2016-03-22 14:51                                     ` Stefan Monnier
  1 sibling, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 22:55 UTC (permalink / raw)
  To: Vitalie Spinu, Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel

On 03/21/2016 06:45 PM, Vitalie Spinu wrote:

> Real trouble comes with continuation chunks. You might need to have a completely
> reversed indentation logic - in outer/host spans MM engine needs to call inner
> mode for indentation. Consider this example of erb mode taken from
> https://github.com/fxbois/web-mode/blob/master/tests/demo.erb.
>
>     <div id='header'>
>       <% if signed_in? -%>
>         <%= link_to t('.sign_out'), sign_out_path, :method => :delete %>
>       <% else -%>
>         <%= link_to t('.sign_in'), sign_in_path %>
>       <% end -%>
>     </div>

FWIW, I've successfully implemented indentation for ERB (and EJS) files 
without delegating to ruby-mode and js-mode indentation code:

https://github.com/purcell/mmm-mode/blob/c9a857a638701482931ffaaee262b61ce53489f3/mmm-erb.el#L157-L225

(it indents web-mode's example almost identically)

It should be easy to add support for similar types of files with almost 
the same code. But yes, it implies adding support for each new template 
format manually.

> One meaningful approach here is to indent if-else-end block using inner mode
> rules, right? This is what web-mode seems to be currently doing. Assume you are
> just in front of `<%= link_to`. This is host html mode. But you need to indent
> according to inner mode construct. So what do you do? You go to the end of
> previous code chunk and call prog-indentation-function of inner mode with
> STRING-BEFORE = "\n" and STRING-AFTER="link_to t('.sign_out'), sign_out_path,
> :method => :delete". Simple, isn't it? That's precisely my proposal.

It's simpler for the caller, but I'm having a hard time imagining how to
implement it properly on a major mode's side without inserting those 
strings into the buffer, or using a temporary buffer.

If STRING-BEFORE and STRING-AFTER are not allowed to contain newlines, 
yes, that becomes easier to handle, but that also loses in generality.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 16:45                                   ` Vitalie Spinu
  2016-03-21 22:55                                     ` Dmitry Gutov
@ 2016-03-22 14:51                                     ` Stefan Monnier
  2016-03-22 18:17                                       ` Vitalie Spinu
  2016-03-22 18:26                                       ` Vitalie Spinu
  1 sibling, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-22 14:51 UTC (permalink / raw)
  To: emacs-devel

> Noweb is not essential here.  The story will hold for pretty much all
> multi-modes with non-full-line headers.  In noweb `<<foo,
> some_arg=4>>=` is a header of a chunk.  Polymode places heads and tails
> in their own modes because they are not conceptually part of either the
> host or the sub-mode.  You can specify arbitrary parameters in the head which
> might even instruct how to indent the chunk.  The first code line
> `some_call(blabla)` is placed on the same line with the head.  This is
> uncommon but it's the simplest real case I can think of.

OK.

> There are two issues here.

Haha!

> First one is how do you indent the head itself?  Let's assume point is
> after `foo`.  If you follow the naive prog-indentation-context the
> indentation should be handled by the mode in the head chunk, right?

If it's got its own "chunk", then yes.

> Let's call it noweb-head-mode.  This mode is the same for many noweb
> host-mode/inner-mode combinations and defaults to poly-head-tail-mode.

OK.

> Host mode is commonly LaTeX but it can be anything.

OK.

> One reasonable way to indent it is to use the host mode
> indentation engine.

Right, since the beginning of line is still in latex-mode, the "<<foo,
some_arg=4>>= some_call(blabla)" line would be indented by latex-mode.
I.e. the generic code would go to BOL, and call latex-mode's indentation
while setting prog-indentation-context with an "end of chunk" that's
at point.

> Note that this is in contrast to the prog-indentation-context
> assumption for which PREVIOUS-CHUNK is assumed to be of the same mode
> type as the current type.

Not sure what PREVIOUS-CHUNK has to do with it.

> The second issue is with respect to the first line immediately after the
> header.

Since it's not on its own line, I don't see why it would be an issue
for indentation.

> If you naively call inner mode indentation engine on that line in a
> narrowed buffer starting after >>= you will get it indented to FIRST-COLUMN,

That's also one of the reasons why I didn't want to impose narrowing in
prog-indentation-context.

> mode for indentation. Consider this example of erb mode taken from
> https://github.com/fxbois/web-mode/blob/master/tests/demo.erb.
>
>     <div id='header'>
>       <% if signed_in? -%>
>         <%= link_to t('.sign_out'), sign_out_path, :method => :delete %>
>       <% else -%>
>         <%= link_to t('.sign_in'), sign_in_path %>
>       <% end -%>
>     </div>

IIUC, we have here a tight interleaving of lots of little chunks,
alternating between HTML and ..[according to duckduckgo].. Ruby.

> One meaningful approach here is to indent if-else-end block using inner mode
> rules, right?

Another approach would be to consider it as a sequence of chunks, rather
than as chunks of one mode nested in another.  So each chunk controls
the FIRST-COLUMN of the next chunk.

In any case, this seems messy.

> This is what web-mode seems to be currently doing. Assume you are
> just in front of `<%= link_to`. This is host html mode. But you need to indent
> according to inner mode construct. So what do you do? You go to the end of
> previous code chunk and call prog-indentation-function of inner mode with
> STRING-BEFORE = "\n" and STRING-AFTER="link_to t('.sign_out'), sign_out_path,
> :method => :delete". Simple, isn't it? That's precisely my proposal.

Ah, now I see a use of STRING-BEFORE and STRING-AFTER, thanks.

This case of STRING-BEFORE being "\n" is very special: in SMIE, the core
indentation function (smie-indent-calculate) basically behaves as if
it's always called with STRING-BEFORE="\n".

IOW, we could define prog-indent-function as always behaving "as if
STRING-BEFORE was \n".  In the normal case, the foo-calculate-indent
function is called at the beginning of line anyway, so adding
a STRING-BEFORE="\n" won't affect its behavior.

As for STRING-AFTER, the example is compelling, but I don't yet
understand really how it would all work out overall.
I'm thinking of cases like:

    <% 3.times do %>
      <li>
        some text
        <% if signed_in? -%>
          <%= link_to t('.sign_out'), sign_out_path, :method => :delete %>
        <% else -%>
          <%= link_to t('.sign_in'), sign_in_path %>
        <% end -%>
      </li>
    <% end %>

How should the "generic" code that links HTML and Ruby know when to
indent using the HTML indentation code and when to use the Ruby
indentation rules?

Maybe my suggestion of considering it as a sequence of chunks (where each
chunk controls the FIRST-COLUMN of the next chunk) could work, but it's
far from obvious.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-22 14:51                                     ` Stefan Monnier
@ 2016-03-22 18:17                                       ` Vitalie Spinu
  2016-03-23  1:18                                         ` Dmitry Gutov
  2016-03-23 13:18                                         ` Stefan Monnier
  2016-03-22 18:26                                       ` Vitalie Spinu
  1 sibling, 2 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22 18:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>> On Tue, Mar 22 2016 10:51, Stefan Monnier wrote:

>> The second issue is with respect to the first line immediately after the
>> header.

> Since it's not on its own line, I don't see why it would be an issue
> for indentation.

It's a problem if you narrow to current span and allow inner mode to indent
first line. So one way or another, multi-mode has to interfere in this case
beyond FIRST-COLUMN hint.

Without narrowing it's not clear what is the contract that inner mode should
respect to handle previous chunk locations. It's not even clear if previous
locations should be of the same modes chunk, previous head span or maybe a set
of heterogeneous chunks.

In any case once the inner mode gets locations of previous chunks it all becomes
a very messy open question. Modes can decide to do whatever they see fit. The
STRING-BEFORE/AFTER system is not ideal of course, but it keeps the mode within
its own world and doesn't leave much space for "improvisation".

>> mode for indentation. Consider this example of erb mode taken from
>> https://github.com/fxbois/web-mode/blob/master/tests/demo.erb.
>>
>>     <div id='header'>
>>       <% if signed_in? -%>
>>         <%= link_to t('.sign_out'), sign_out_path, :method => :delete %>
>>       <% else -%>
>>         <%= link_to t('.sign_in'), sign_in_path %>
>>       <% end -%>
>>     </div>

>> One meaningful approach here is to indent if-else-end block using inner mode
>> rules, right?

> Another approach would be to consider it as a sequence of chunks, rather
> than as chunks of one mode nested in another.  So each chunk controls
> the FIRST-COLUMN of the next chunk.

This will not work in above case. <%else-%> chunk needs to know about where <%if
signed_in? -%> was indented which is not an immediately preceding chunk.

It's hard to think of a better solution than collecting all relevant previous
chunks in one place and indenting according to inner mode. In order to indent
"<%else-%>", STRING-BEFORE should be the full "link_to ..." line. So basically
STRING-BEFORE must consist of all ruby spans in between "if" and "else" chunks.

> In any case, this seems messy.

Yeh. Very much.

> As for STRING-AFTER, the example is compelling, but I don't yet
> understand really how it would all work out overall.

Neither do I. Strings are hard to process in emacs and the mode will need to
either modify current buffer by inserting it in a special region or use a
separate buffer for that.

I tend to agree with Dmitry: if you decide not to pass chunk locations to inner
modes then there is not much point in getting complicated with passing
BEFORE/AFTER strings. Multi-mode engine can take care of that satisfactorily.

> How should the "generic" code that links HTML and Ruby know when to indent
> using the HTML indentation code and when to use the Ruby indentation rules?

No idea. Dmitry should have an answer for that. He implemented mmm-erb.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-22 18:17                                       ` Vitalie Spinu
@ 2016-03-23  1:18                                         ` Dmitry Gutov
  2016-03-23 13:18                                         ` Stefan Monnier
  1 sibling, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-23  1:18 UTC (permalink / raw)
  To: Vitalie Spinu, Stefan Monnier; +Cc: emacs-devel

On 03/22/2016 08:17 PM, Vitalie Spinu wrote:

> This will not work in above case. <%else-%> chunk needs to know about where <%if
> signed_in? -%> was indented which is not an immediately preceding chunk.
>
> It's hard to think of a better solution than collecting all relevant previous
> chunks in one place and indenting according to inner mode. In order to indent
> "<%else-%>", STRING-BEFORE should be the full "link_to ..." line. So basically
> STRING-BEFORE must consist of all ruby spans in between "if" and "else" chunks.

...and the multi-mode package would have to know, somehow, that the "if" 
chunk is special in this regard, and know which "if" matches which 
"end", etc. Or simply always include all previous chunks in the given 
mode in STRING-BEFORE.
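
To make the "always include all previous chunks" option concrete, here
is a minimal sketch. It is Python rather than Emacs Lisp, and it is not
how mmm-mode or polymode work; `ERB_TAG` and `string_before` are
hypothetical names, and the regexp is an assumption that only covers
the simple tags from the demo.erb example.

```python
import re

# Assumed tag shape: "<%" or "<%=" ... "%>" or "-%>".  Captures the Ruby code.
ERB_TAG = re.compile(r"<%[=-]?\s*(.*?)\s*-?%>", re.S)


def string_before(template: str, pos: int) -> str:
    """Join the Ruby code of every ERB chunk that ends at or before POS."""
    chunks = [m.group(1) for m in ERB_TAG.finditer(template) if m.end() <= pos]
    return "\n".join(chunks)


TEMPLATE = """<div id='header'>
  <% if signed_in? -%>
    <%= link_to t('.sign_out'), sign_out_path, :method => :delete %>
  <% else -%>
    <%= link_to t('.sign_in'), sign_in_path %>
  <% end -%>
</div>
"""

# Text a Ruby indenter would be handed when asked to indent the "else" chunk.
else_pos = TEMPLATE.index("<% else")
print(string_before(TEMPLATE, else_pos))
```

The cost is visible even here: the string grows with every earlier
chunk, whether or not it matters for the line being indented.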

>> How should the "generic" code that links HTML and Ruby know when to indent
>> using the HTML indentation code and when to use the Ruby indentation rules?
>
> No idea. Dmitry should have an answer for that. He implemented mmm-erb.

Again, mmm-erb is written to support a limited set of template languages 
(currently, two, though supporting JSP would be trivial-ish, were 
java-mode not a part of CC Mode, with associated pitfalls).

So IME, the multi-mode package needs to hardcode, in some form or
another, the knowledge of which file formats use this approach to indentation.

And since we're doing that anyway, using a simpler indentation code in 
those particular files doesn't seem like a bad idea either. (For 
non-continuation hunks, at least).

BTW, web-mode doesn't seem to dispatch to inner modes' functions at
all: 
https://github.com/fxbois/web-mode/blob/c5aacacb8f4c233844306806a102405c8e9671c9/web-mode.el#L7164-L7198. 
I'm not a fan of this approach in general, but that clearly means that 
it can work for indentation.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-22 18:17                                       ` Vitalie Spinu
  2016-03-23  1:18                                         ` Dmitry Gutov
@ 2016-03-23 13:18                                         ` Stefan Monnier
  1 sibling, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-23 13:18 UTC (permalink / raw)
  To: emacs-devel

>> Since it's not on its own line, I don't see why it would be an issue
>> for indentation.
> It's a problem if you narrow to current span and allow inner mode to indent
> first line.

I guess it's an issue if the buffer is "always narrowed", in which case
the "first" line might get indented accidentally, indeed.
But otherwise, there's no reason for the generic mode to go through the
trouble of "narrow + indent" this partial line.

> Without narrowing it's not clear what is the contract that inner mode should
> respect to handle previous chunk locations.

For prog-indentation-context we provide (START . END), as well
as PREVIOUS-CHUNKS, where the contract is that the major mode's
indentation code should only look at those parts of the buffer.
It's up to the mode to decide whether it does that via narrowing, or
some other way.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-22 14:51                                     ` Stefan Monnier
  2016-03-22 18:17                                       ` Vitalie Spinu
@ 2016-03-22 18:26                                       ` Vitalie Spinu
  2016-03-23  2:07                                         ` Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22 18:26 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>> On Tue, Mar 22 2016 10:51, Stefan Monnier wrote:

> As for STRING-AFTER, the example is compelling, but I don't yet
> understand really how it would all work out overall.

How about passing signatures to indentation-function?

Assume that there is a way to represent previous indentation context with a
simple data structure, akin to parse-partial-sexp but for indentation. Then you
can compute indentation context of a span by passing to the indentation-function
the content of the previous location. 

Of course the indentation context data structure should be mode specific and
modes must be constructing it themselves. But some useful degree of uniformity
is surely possible. For example FIRST-COLUMN is a very simple one dimensional
signature.

Besides being useful for incremental indentation within a mode, it can be
directly leveraged by multi-modes. Just pick the context from previous chunk,
modify usefully and pass to the next chunk. (Of course locations of previous
chunks should not be part of the signature).

 Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-22 18:26                                       ` Vitalie Spinu
@ 2016-03-23  2:07                                         ` Stefan Monnier
  2016-03-23 10:56                                           ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-23  2:07 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: emacs-devel

> Of course the indentation context data structure should be mode specific and
> modes must be constructing it themselves. But some useful degree of uniformity
> is surely possible. For example FIRST-COLUMN is a very simple one dimensional
> signature.

Yes, it's an attractive idea.  But for example in the case of SMIE we
never compute this context directly, instead we discover it as we parse
the text backward from point.

But I guess we could represent the context as an integer (the position
from which to parse backward).

Still, in the ERB case we'd need to mix the HTML context with the Ruby
context, so the representation of the context can't be "internal to the
major mode".

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23  2:07                                         ` Stefan Monnier
@ 2016-03-23 10:56                                           ` Vitalie Spinu
  2016-03-23 11:41                                             ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-23 10:56 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>> On Tue, Mar 22 2016 22:07, Stefan Monnier wrote:

> Still, in the ERB case we'd need to mix the HTML context with the Ruby
> context, so the representation of the context can't be "internal to the
> major mode".

The signature is internal to the inner mode but will have a generic part which
multi-mode can understand and modify. Multi-mode will take care of the
interleaving and mixing when needed. I think there is absolutely no way to avoid
a "supervisor", but each inner mode need not know about the big picture.

In erb case each ruby line will be asked for an indentation offset in a narrowed
buffer with context from previous ruby span. Then multi-mode will indent the
whole <% ... %> according to this inner offset plus an offset derived from
parent html element. The last step is to ask the ruby mode to produce an
indentation context at the end of this ruby line and cache it.

> But I guess we could represent the context as an integer (the position from
> which to parse backward).

No, no. Context should not have absolute positions in it. That would ruin the
whole thing. It should contain information about the nesting of language
constructs sufficient to be able to indent first line of an inner span without
any other positional knowledge.

For example, if current line is directly part of IF block, most languages don't
care what precedes IF head at all, only the offset of IF. So an entry in the
context data structure might look like (IF . IF-OFFSET). If line number within
the block is important then you can pass (IF IF-OFFSET . RELATIVE-LINE). My
hunch is that for most languages you would be able to reduce indentation to a
small number of block-continuation constructs (less than 10).
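
A rough model of such a position-free context, written in Python only
to keep it short. Everything here is an assumption made for
illustration: the `Frame` type, the keyword sets, and the fixed step of
2 are hypothetical, and no existing mode exposes anything like this.

```python
from typing import NamedTuple, Tuple


class Frame(NamedTuple):
    construct: str  # block opener, e.g. "if"
    offset: int     # column of the opener; no buffer positions are stored


Context = Tuple[Frame, ...]

OPENERS = ("if", "while", "def")  # illustrative subset
MIDDLES = ("else", "elsif")
CLOSER = "end"


def first_word(line: str) -> str:
    words = line.split()
    return words[0] if words else ""


def indent_for(context: Context, line: str, step: int = 2) -> int:
    """Column for LINE, given only the nesting context before it."""
    if not context:
        return 0
    innermost = context[-1]
    if first_word(line) == CLOSER or first_word(line) in MIDDLES:
        return innermost.offset      # align with the opener
    return innermost.offset + step   # body of the block


def advance(context: Context, line: str, column: int) -> Context:
    """Context after LINE, to be cached and handed to the next span."""
    word = first_word(line)
    if word in OPENERS:
        return context + (Frame(word, column),)
    if word == CLOSER and context:
        return context[:-1]
    return context


context: Context = ()
columns = []
for line in ["if signed_in?", "link_to sign_out_path", "else",
             "link_to sign_in_path", "end"]:
    column = indent_for(context, line)
    columns.append(column)
    context = advance(context, line, column)
print(columns)
```

The multi-mode would add the offset derived from the parent html
element to each of these columns; that part is left out here.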

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23 10:56                                           ` Vitalie Spinu
@ 2016-03-23 11:41                                             ` Stefan Monnier
  2016-03-23 12:39                                               ` Vitalie Spinu
  2016-03-24  7:30                                               ` Andreas Röhler
  0 siblings, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-23 11:41 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: emacs-devel

> No, no. Context should not have absolute positions in it. That would ruin the
> whole thing. It should contain information about the nesting of language
> constructs sufficient to be able to indent first line of an inner span without
> any other positional knowledge.

As explained, the SMIE indentation has no such "summary of context".
The only context it could use is either a position or the complete text
before that position.

> For example, if current line is directly part of IF block, most
> languages don't care what precedes IF head at all, only the offset
> of IF.

Yes, there are simple cases we know how to handle.  The problem is the
general case, along with the work to modify the existing indentation
codes to be able to generate and use that data.

> So an entry in the context data structure might look like (IF
> . IF-OFFSET). If line number within the block is important then you
> can pass (IF IF-OFFSET . RELATIVE-LINE). My hunch is that for most
> languages you would be able to reduce indentation to a small number of
> block-continuation constructs (less than 10).

Consider

    x = a << b + c * d

the code after this line can be indented in various different ways:

                 * e
or
             + e
or
        == e
or
  ; e

And of course, in this example, I put all relevant operators, but in
practice they'll generally be on different lines.  And SMIE (which
supports that kind of indentation, e.g. in sm-c-mode) doesn't
pre-compute that context: it's only when it sees the "+" at the
beginning of line that it moves back over higher-precedence operators to
find the matching alignment spot.
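
As a toy model of that backward walk (Python, and emphatically not
SMIE's actual algorithm or API): the precedence table, `TOKEN` and
`align_column` are assumptions made up for this sketch, equal
precedence is treated as left-associative, and the ";" case is left
out.

```python
import re

# C-like binary operator precedences; a higher number binds tighter.
PRECEDENCE = {"=": 1, "==": 2, "<<": 3, "+": 4, "*": 5}

TOKEN = re.compile(r"<<|==|[=+*]|\w+")


def align_column(previous_line: str, operator: str) -> int:
    """Column where a continuation line starting with OPERATOR aligns.

    Walk backward from the end of PREVIOUS_LINE, skipping operands and
    operators that bind at least as tightly as OPERATOR.  Stop at the
    first operator that binds less tightly: the left operand of
    OPERATOR starts right after it.
    """
    tokens = [(m.group(), m.start()) for m in TOKEN.finditer(previous_line)]
    column = tokens[0][1]
    for text, start in reversed(tokens):
        if text in PRECEDENCE:
            if PRECEDENCE[text] < PRECEDENCE[operator]:
                break
        else:
            column = start  # leftmost operand seen so far
    return column


line = "x = a << b + c * d"
for op in ("*", "+", "=="):
    print(op, align_column(line, op))
```

Nothing is computed until the operator at the beginning of the next
line is known, which is why there is no ready-made summary to hand
over to another mode.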

In theory, SMIE could try to create the kind of context you're thinking
of, but that would amount to a complete rewrite (and it would likely be
very difficult if not impossible to make it work with existing
smie-rules-functions, so it'd break backward compatibility).

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23 11:41                                             ` Stefan Monnier
@ 2016-03-23 12:39                                               ` Vitalie Spinu
  2016-03-23 13:23                                                 ` Stefan Monnier
  2016-03-24  7:30                                               ` Andreas Röhler
  1 sibling, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-23 12:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel



>> On Wed, Mar 23 2016 07:41, Stefan Monnier wrote:

> In theory, SMIE could try to create the kind of context you're thinking of,
> but that would amount to a complete rewrite (and it would likely be very
> difficult if not impossible to make it work with existing
> smie-rules-functions, so it'd break backward compatibility).

You are the best to judge what is possible or not. I had in mind that each mode
will build such a context, but tackling it at SMIE level seems like a much
better start. Will be back when I have a better understanding of it.

SMIE is relatively new; 7 modes in emacs and 5 in ELPA are already using it, but
it probably can be still molded here and there.


  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23 12:39                                               ` Vitalie Spinu
@ 2016-03-23 13:23                                                 ` Stefan Monnier
  2016-03-23 15:28                                                   ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-23 13:23 UTC (permalink / raw)
  To: emacs-devel

> SMIE is relatively new; 7 modes in emacs and 5 in ELPA are already
> using it, but it probably can be still molded here and there.

Sure.  I'm just pointing out that it's a difficult modification to make,
at least for some indentation codes (and probably for many of them).

Maybe it's OK to design a multi-mode system which requires every major
mode that wants to play with it well (e.g. well enough to get the kind
of behavior we want for ERB) to basically rewrite its indentation code.

But this won't fly unless we also make it possible to use major modes
which haven't been rewritten in that way.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23 13:23                                                 ` Stefan Monnier
@ 2016-03-23 15:28                                                   ` Dmitry Gutov
  2016-03-23 21:51                                                     ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-23 15:28 UTC (permalink / raw)
  To: Stefan Monnier, emacs-devel

On 03/23/2016 03:23 PM, Stefan Monnier wrote:

> Maybe it's OK to design a multi-mode system which requires every major
> mode that wants to play with it well (e.g. well enough to get the kind
> of behavior we want for ERB) to basically rewrite its indentation code.
>
> But this won't fly unless we also make it possible to use major modes
> which haven't been rewritten in that way.

Supporting both approaches would also require some feature discovery 
mechanism, or hardcoding a list of modes that support the "advanced" way,
somewhere.

Can we agree to shelve the PREVIOUS-CHUNKS/STRING-BEFORE/etc discussion 
until someone comes up with a patch that shows a convincing usage of it, in
multiple modes?

Preferably with some performance numbers, showing a corresponding 
improvement when used together with some multi-mode package.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23 15:28                                                   ` Dmitry Gutov
@ 2016-03-23 21:51                                                     ` Vitalie Spinu
  0 siblings, 0 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-23 21:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel



>> On Wed, Mar 23 2016 17:28, Dmitry Gutov wrote:

> On 03/23/2016 03:23 PM, Stefan Monnier wrote:

> Can we agree to shelve the PREVIOUS-CHUNKS/STRING-BEFORE/etc discussion until
> someone comes up with a patch that shows a convincing usage of it, in multiple
> modes?

> Preferably with some performance numbers, showing a corresponding improvement
> when used together with some multi-mode package.

Yeps. The topic has been exhausted for now.

  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-23 11:41                                             ` Stefan Monnier
  2016-03-23 12:39                                               ` Vitalie Spinu
@ 2016-03-24  7:30                                               ` Andreas Röhler
  1 sibling, 0 replies; 155+ messages in thread
From: Andreas Röhler @ 2016-03-24  7:30 UTC (permalink / raw)
  To: emacs-devel; +Cc: Vitalie Spinu, Stefan Monnier



On 23.03.2016 12:41, Stefan Monnier wrote:
>> No, no. Context should not have absolute positions in it. That would ruin the
>> whole thing. It should contain information about the nesting of language
>> constructs sufficient to be able to indent first line of an inner span without
>> any other positional knowledge.
> As explained, the SMIE indentation has no such "summary of context".
> The only context it could use is either a position or the complete text
> before that position.
>
>> For example, if current line is directly part of IF block, most
>> languages don't care what precedes IF head at all, only the offset
>> of IF.
> Yes, there are simple cases we know how to handle.  The problem is the
> general case, along with the work to modify the existing indentation
> codes to be able to generate and use that data.
>
>> So an entry in the context data structure might look like (IF
>> . IF-OFFSET). If line number within the block is important then you
>> can pass (IF IF-OFFSET . RELATIVE-LINE). My hunch is that for most
>> languages you would be able to reduce indentation to a small number of
>> block-continuation constructs (less than 10).
> Consider
>
>      x = a << b + c * d
>
> the code after this line can be indented in various different ways:
>
>                   * e
> or
>               + e
> or
>          == e
> or
>    ; e
>
> And of course, in this example, I put all relevant operators, but in
> practice they'll generally be on different lines.  And SMIE (which
> supports that kind of indentation, e.g. in sm-c-mode) doesn't
> pre-compute that context: it's only when it sees the "+" at the
> beginning of line that it moves back over higher-precedence operators to
> find the matching alignment spot.
>
> In theory, SMIE could try to create the kind of context you're thinking
> of, but that would amount to a complete rewrite (and it would likely be
> very difficult if not impossible to make it work with existing
> smie-rules-functions, so it'd break backward compatibility).
>
>
>          Stefan
>
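
The walk-back described in the quoted text can be illustrated with a toy
sketch.  This is not SMIE's actual code: the precedence table, the
(TEXT . COLUMN) token format and every `sketch-' name are assumptions made up
for illustration.  Starting from the end of the preceding code, it skips
operators that bind at least as tightly as the operator beginning the new
line, and aligns with the operand that follows the first weaker operator.

```elisp
;; Toy precedence table (higher number binds tighter); an assumption for
;; illustration, loosely following C.
(defconst sketch-precedence
  '(("=" . 1) ("==" . 2) ("<<" . 3) ("+" . 4) ("*" . 5)))

(defun sketch-align-column (tokens op)
  "Return the column where a line starting with operator OP should align.
TOKENS is the preceding code as a list of (TEXT . COLUMN) in buffer order."
  (let ((prec (or (cdr (assoc op sketch-precedence)) 0))
        (rest (reverse tokens))
        (operand-col nil)
        (result nil))
    (while (and rest (not result))
      (let* ((tok (car rest))
             (tok-prec (cdr (assoc (car tok) sketch-precedence))))
        (cond
         ;; Not in the table: an operand.  Remember where it starts.
         ((null tok-prec) (setq operand-col (cdr tok)))
         ;; A weaker operator: OP's left operand starts just after it.
         ((< tok-prec prec) (setq result operand-col))))
      ;; Operators binding at least as tightly as OP are simply skipped.
      (setq rest (cdr rest)))
    (or result operand-col)))

;; Tokens of "x = a << b + c * d" at the columns used in the example.
(defconst sketch-line
  '(("x" . 5) ("=" . 7) ("a" . 9) ("<<" . 11) ("b" . 14)
    ("+" . 16) ("c" . 18) ("*" . 20) ("d" . 22)))

(sketch-align-column sketch-line "*")   ; => 18
```

With these tokens a leading "*" aligns under "c", a leading "+" under "b"
and a leading "==" under "a", matching the example.  Nothing is computed
ahead of time: the answer depends on which operator starts the new line.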


Stefan, I came across SMIE because it looked like an interesting abstraction, 
and I appreciate your efforts. Nonetheless, I think that path has turned out 
to be the wrong one. SMIE-based indentation will not get easier; it will run 
into more and more complexity instead. The reasons deserve to be discussed 
elsewhere.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21  3:11                         ` Stefan Monnier
  2016-03-21  5:05                           ` Vitalie Spinu
@ 2016-03-21 11:56                           ` Dmitry Gutov
  1 sibling, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 11:56 UTC (permalink / raw)
  To: Stefan Monnier, Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel

On 03/21/2016 05:11 AM, Stefan Monnier wrote:

> I must say I don't understand how what we have is so very different from
> what you suggest.  Of course, I fully agree on the need to deprecate
> indent-line-function and use a side-effect free replacement which
> returns the desired indentation (instead of performing the indentation).
>
> I think both suggestions require changes to every mode, and in both
> cases the changes can be reduced to a one-liner or close enough (for
> the simple case).  Admittedly, for it to be a one-liner, we'll need to
> provide a standard helper function.

It also sounds like we should revert the changes that brought in 
prog-indentation-context in emacs-25, and proceed with the results of 
this discussion on master. Provided we reach an agreement here, of course.
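
For illustration, the "standard helper function" mentioned in the quoted
text could look roughly like the sketch below.  It is an assumption, not an
existing Emacs API: `sketch-indent-line-with' and
`my-mode-calculate-indentation' are hypothetical names.  The helper takes a
side-effect-free function that returns a column and performs the actual
indentation itself.

```elisp
(defun sketch-indent-line-with (calc-fn)
  "Indent the current line to the column returned by CALC-FN.
CALC-FN is called with point at the first non-blank character of the
line and must only compute a column, without modifying the buffer."
  (let ((col (save-excursion
               (back-to-indentation)
               (funcall calc-fn))))
    (when (integerp col)
      (if (<= (current-column) (current-indentation))
          ;; Point is inside the indentation: move it to the new indentation.
          (indent-line-to col)
        ;; Point is in the text: keep it on the same character.
        (save-excursion (indent-line-to col))))))

;; A mode could then define its indent command as a one-liner, e.g.
;; (defun my-mode-indent-line ()
;;   (sketch-indent-line-with #'my-mode-calculate-indentation))
```

Under this assumption a multi-mode package could call the column-returning
function directly and adjust the result, without touching the buffer.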



^ permalink raw reply	[flat|nested] 155+ messages in thread

* [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21  1:05                       ` Vitalie Spinu
  2016-03-21  3:11                         ` Stefan Monnier
@ 2016-03-21  5:08                         ` Vitalie Spinu
  2016-03-21 12:39                           ` Stefan Monnier
  2016-03-21 11:47                         ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov
  2 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21  5:08 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 385 bytes --]


>> On Mon, Mar 21 2016 02:05, Vitalie Spinu wrote:

>> Are you interested in working on a patch? Also Cc'ing Stefan.

> My knowledge of emacs C internals is close to 0. Elisp side (and probably C
> side) of this is trivial. I will look into it but I don't think I am the best
> person for that.

The widen part turned out to be easy.  Will look at parse-partial-sexp tomorrow.

 Vitalie


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-diff, Size: 2049 bytes --]

From eafcff9e72499e3bb6dc462d406dd33a885e3d49 Mon Sep 17 00:00:00 2001
From: Vitalie Spinu <spinuvit@gmail.com>
Date: Mon, 21 Mar 2016 05:41:55 +0100
Subject: [PATCH] Implement hard-narrowing

 `widen` now respects restrictions imposed by new variable `hard-widen-limits`
---
 src/buffer.c  |  5 +++++
 src/editfns.c | 14 +++++++++++++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/buffer.c b/src/buffer.c
index f06d7e0..5232c49 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -6219,6 +6219,11 @@ and disregard a `read-only' text property if the property value
 is a member of the list.  */);
   Vinhibit_read_only = Qnil;
 
+  DEFVAR_LISP ("hard-widen-limits", Vhard_widen_limits,
+	       doc: /* When non-nil `widen` will widen to these limits.
+Must be a cons of the form (MIN . MAX) where MIN and MAX are integers or markers.  */);
+  Vhard_widen_limits = Qnil;
+
   DEFVAR_PER_BUFFER ("cursor-type", &BVAR (current_buffer, cursor_type), Qnil,
 		     doc: /* Cursor to use when this buffer is in the selected window.
 Values are interpreted as follows:
diff --git a/src/editfns.c b/src/editfns.c
index 2ac0537..fb1f652 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -3480,12 +3480,24 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
     return empty_unibyte_string;
   return del_range_1 (XINT (start), XINT (end), 1, 1);
 }
+
 \f
 DEFUN ("widen", Fwiden, Swiden, 0, 0, "",
        doc: /* Remove restrictions (narrowing) from current buffer.
-This allows the buffer's full text to be seen and edited.  */)
+This allows the buffer's full text to be seen and edited.
+If `hard-widen-limits` is non-nil, widen only to those limits.  */)
   (void)
 {
+
+  if (! NILP (Vhard_widen_limits))
+    {
+      CHECK_CONS(Vhard_widen_limits);
+      Lisp_Object hbeg = XCAR(Vhard_widen_limits);
+      Lisp_Object hend = XCDR(Vhard_widen_limits);
+      Fnarrow_to_region(hbeg, hend);
+      return Qnil;
+    }
+
   if (BEG != BEGV || Z != ZV)
     current_buffer->clip_changed = 1;
   BEGV = BEG;
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21  5:08                         ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu
@ 2016-03-21 12:39                           ` Stefan Monnier
  2016-03-21 12:54                             ` Vitalie Spinu
  2016-03-21 14:04                             ` Stefan Monnier
  0 siblings, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 12:39 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, emacs-devel, Dmitry Gutov

> +  DEFVAR_LISP ("hard-widen-limits", Vhard_widen_limits,
> +	       doc: /* When non-nil `widen` will widen to these limits.
> +Must be a cons of the form (MIN . MAX) where MIN and MAX are integers or markers.  */);
> +  Vhard_widen_limits = Qnil;

Sorry to nitpick, but I'm not completely happy with this API.  As an
implementation it might be OK, but I can imagine wanting to change the
implementation in the future but being stuck by the exposed internals.

So I suggest we instead expose only a new primitive
"call-with-hard-narrowing" which could look like:

    (defun call-with-hard-narrowing (from to func)
      (make-local-variable 'internal--hard-widen-limits)
      (let ((internal--hard-widen-limits (cons from to)))
        (funcall func)))

which could be supplemented with a corresponding macro

    (defmacro with-hard-narrowing (from to &rest body)
      `(call-with-hard-narrowing ,from ,to (lambda () ,@body)))


-- Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 12:39                           ` Stefan Monnier
@ 2016-03-21 12:54                             ` Vitalie Spinu
  2016-03-21 14:07                               ` Stefan Monnier
  2016-03-21 14:04                             ` Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 12:54 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, Dmitry Gutov, emacs-devel


Sounds reasonable. But whatever the internal implementation is, shouldn't
hard-widen-limits be there anyway?

Why bother with call-with-hard-narrowing rather than putting all the logic in
with-hard-narrowing directly? It looks to me that it's better to expose hard
narrowing to elisp only. If possible, it should be transparent to low-level code.

  Vitalie

>> On Mon, Mar 21 2016 08:39, Stefan Monnier wrote:

>> +  DEFVAR_LISP ("hard-widen-limits", Vhard_widen_limits,
>> +	       doc: /* When non-nil `widen` will widen to these limits.
>> +Must be a cons of the form (MIN . MAX) where MIN and MAX are integers or markers.  */);
>> +  Vhard_widen_limits = Qnil;

> Sorry to nitpick, but I'm not completely happy with this API.  As an
> implementation it might be OK, but I can imagine wanting to change the
> implementation in the future but being stuck by the exposed internals.

> So I suggest we instead expose only a new primitive
> "call-with-hard-narrowing" which could look like:

>     (defun call-with-hard-narrowing (from to func)
>       (make-local-variable 'internal--hard-widen-limits)
>       (let ((internal--hard-widen-limits (cons from to)))
>         (funcall func)))

> which could be supplemented with a corresponding macro

>     (defmacro with-hard-narrowing (from to &rest body)
>       `(call-with-hard-narrowing ,from ,to (lambda () ,@body)))

> -- Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 12:54                             ` Vitalie Spinu
@ 2016-03-21 14:07                               ` Stefan Monnier
  2016-03-21 14:14                                 ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 14:07 UTC (permalink / raw)
  To: emacs-devel

> Why bother with call-with-hard-narrowing rather than putting all the logic
> in with-hard-narrowing directly? It looks to me that it's better to
> expose hard narrowing to elisp only. If possible, it should be
> transparent to low-level code.

The macro-expanded code will be written into the .elc files, and people
will expect those files to keep working in the future without recompiling.

So any API used by the macro-expanded code ends up being sufficiently
exposed that we can't easily get rid of it.
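
A small sketch of why the function/macro split matters.  The `sketch-' names
are hypothetical, and the function body is only a stand-in built on
`save-restriction' and `narrow-to-region'; it does not make the narrowing
"hard".  The point is what `macroexpand' returns: the only symbol a compiled
caller depends on is the function, so its body can change later without
recompiling callers.

```elisp
(defun sketch-call-with-hard-narrowing (from to func)
  "Call FUNC with the buffer narrowed to FROM..TO (stand-in implementation)."
  (save-restriction
    (narrow-to-region from to)
    (funcall func)))

(defmacro sketch-with-hard-narrowing (from to &rest body)
  "Evaluate BODY with the buffer narrowed to FROM..TO."
  (declare (indent 2) (debug t))
  `(sketch-call-with-hard-narrowing ,from ,to (lambda () ,@body)))

(macroexpand '(sketch-with-hard-narrowing 1 10 (point-max)))
;; => (sketch-call-with-hard-narrowing 1 10 (lambda nil (point-max)))
```

Had the macro expanded into a `let' of an internal variable instead, that
variable name would be baked into every compiled caller.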


        Stefan




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 14:07                               ` Stefan Monnier
@ 2016-03-21 14:14                                 ` Vitalie Spinu
  0 siblings, 0 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 14:14 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


Aha. Clear.

>> On Mon, Mar 21 2016 10:07, Stefan Monnier wrote:

>> Why bother with call-with-hard-narrowing rather than putting all the logic
>> in with-hard-narrowing directly? It looks to me that it's better to
>> expose hard narrowing to elisp only. If possible, it should be
>> transparent to low-level code.

> The macro-expanded code will be written into the .elc files, and people
> will expect those files to keep working in the future without recompiling.

> So any API used by the macro-expanded code ends up being sufficiently
> exposed that we can't easily get rid of it.

>         Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 12:39                           ` Stefan Monnier
  2016-03-21 12:54                             ` Vitalie Spinu
@ 2016-03-21 14:04                             ` Stefan Monnier
  2016-03-21 14:33                               ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 14:04 UTC (permalink / raw)
  To: emacs-devel

>     (defun call-with-hard-narrowing (from to func)
>       (make-local-variable 'internal--hard-widen-limits)
>       (let ((internal--hard-widen-limits (cons from to)))
>         (funcall func)))

Hmm... I now realize that this won't handle the case of info-mode
buffers (and similarly rmail buffers) where the hard-narrowing is not
scoped.


        Stefan




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 14:04                             ` Stefan Monnier
@ 2016-03-21 14:33                               ` Vitalie Spinu
  2016-03-21 14:54                                 ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 14:33 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel



>> On Mon, Mar 21 2016 10:04, Stefan Monnier wrote:

>>     (defun call-with-hard-narrowing (from to func)
>>       (make-local-variable 'internal--hard-widen-limits)
>>       (let ((internal--hard-widen-limits (cons from to)))
>>         (funcall func)))

> Hmm... I now realize that this won't handle the case of info-mode
> buffers (and similarly rmail buffers) where the hard-narrowing is not
> scoped.

What does this mean in plain English?


  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 14:33                               ` Vitalie Spinu
@ 2016-03-21 14:54                                 ` Stefan Monnier
  2016-03-21 17:16                                   ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 14:54 UTC (permalink / raw)
  To: emacs-devel

>>> (defun call-with-hard-narrowing (from to func)
>>> (make-local-variable 'internal--hard-widen-limits)
>>> (let ((internal--hard-widen-limits (cons from to)))
>>> (funcall func)))
>> Hmm... I now realize that this won't handle the case of info-mode
>> buffers (and similarly rmail buffers) where the hard-narrowing is not
>> scoped.
> What does this mean in plain English?

That it sets narrowing and leaves it there.  So it can't use
call-with-hard-narrowing for that.


        Stefan




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 14:54                                 ` Stefan Monnier
@ 2016-03-21 17:16                                   ` Vitalie Spinu
  2016-03-21 18:36                                     ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 17:16 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel



>> On Mon, Mar 21 2016 10:54, Stefan Monnier wrote:

> That it sets narrowing and leaves it there.  So it can't use
> call-with-hard-narrowing for that.

What would it need it for?

This is outside of use cases that I have in mind. with-hard-narrowing should be
used in limited, transient, prog-only contexts; almost exclusively for advices
in multi-mode engines.

  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 17:16                                   ` Vitalie Spinu
@ 2016-03-21 18:36                                     ` Stefan Monnier
  2016-03-21 19:18                                       ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 18:36 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: emacs-devel

>> That it sets narrowing and leaves it there.  So it can't use
>> call-with-hard-narrowing for that.
> What would it need it for?

Because an info file is made up of various nodes, and Emacs only shows
one node at a time by narrowing.  This narrowing should be "hard"
because font-lock and such should only operate on a node at a time.

The same was true for Rmail which used to show the content of each email
simply by narrowing the mailbox file to the specific email.  Not sure if
it still does that (clearly it can't be so simple with MIME's base64
and attachments).

> This is outside of use cases that I have in mind.

Indeed, it's a different case, but one where the narrowing should be
hard as well.
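
The difference between the two cases can be sketched in plain Lisp.  Both
`sketch-hard-limits' and `sketch-respecting-widen' are hypothetical stand-ins
for the proposed variable and for a `widen' that honors it; nothing here
exists in Emacs.  A `let' binding ends with its scope, so it fits a transient
caller, while an Info-style display has to set the limits and leave them.

```elisp
(defvar-local sketch-hard-limits nil
  "Hypothetical stand-in for the proposed variable: nil or (MIN . MAX).")

(defun sketch-respecting-widen ()
  "Like `widen', but stop at `sketch-hard-limits' when it is non-nil."
  (if sketch-hard-limits
      (narrow-to-region (car sketch-hard-limits) (cdr sketch-hard-limits))
    (widen)))

(defun sketch-scoped-extent (from to)
  "Return the accessible extent seen while limits are let-bound to FROM..TO.
The limits and the narrowing are both gone again on return."
  (save-restriction
    (let ((sketch-hard-limits (cons from to)))
      (sketch-respecting-widen)
      (cons (point-min) (point-max)))))

(defun sketch-show-node (from to)
  "Info-style display: set the limits and leave them in place."
  (setq sketch-hard-limits (cons from to))
  (sketch-respecting-widen))
```

After `sketch-show-node' returns, later calls to `sketch-respecting-widen'
still stop at the node boundaries, which a scoped `let' cannot provide.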

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 18:36                                     ` Stefan Monnier
@ 2016-03-21 19:18                                       ` Vitalie Spinu
  2016-03-22  3:17                                         ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 19:18 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel



>> On Mon, Mar 21 2016 14:36, Stefan Monnier wrote:

>> This is outside of use cases that I have in mind.

> Indeed, it's a different case, but one where the narrowing should be
> hard as well.

Ok. This part is trickier but it might not be that hard. Will keep this in mind.

  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-21 19:18                                       ` Vitalie Spinu
@ 2016-03-22  3:17                                         ` Vitalie Spinu
  2016-03-22  9:57                                           ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22  3:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 618 bytes --]



>> On Mon, Mar 21 2016 20:18, Vitalie Spinu wrote:

>>> This is outside of use cases that I have in mind.

>> Indeed, it's a different case, but one where the narrowing should be
>> hard as well.

> Ok. This part is trickier but it might not be that hard. Will keep this in mind.

I have pushed the proposed change to the `widen-limits` branch. The C-level
consequences are fairly innocuous. There are only 3 calls to Fwiden in the
whole of Emacs.

Given that there seem to be use cases for permanent limits, I kept it as a
buffer-local variable. I also switched to the milder name buffer-widen-limits.

  Vitalie



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-diff, Size: 5854 bytes --]

3 files changed, 48 insertions(+), 3 deletions(-)
src/buffer.c  | 19 +++++++++++++++++--
src/buffer.h  | 17 +++++++++++++++++
src/editfns.c | 15 ++++++++++++++-

modified   src/buffer.c
@@ -329,6 +329,11 @@ bset_scroll_up_aggressively (struct buffer *b, Lisp_Object val)
   b->scroll_up_aggressively_ = val;
 }
 static void
+bset_widen_limits (struct buffer *b, Lisp_Object val)
+{
+  b->widen_limits_ = val;
+}
+static void
 bset_selective_display (struct buffer *b, Lisp_Object val)
 {
   b->selective_display_ = val;
@@ -847,6 +852,7 @@ CLONE nil means the indirect buffer's state is reset to default values.  */)
       bset_display_count (b, make_number (0));
       bset_backed_up (b, Qnil);
       bset_auto_save_file_name (b, Qnil);
+      bset_widen_limits (b, b->base_buffer->widen_limits_);
       set_buffer_internal_1 (b);
       Fset (intern ("buffer-save-without-query"), Qnil);
       Fset (intern ("buffer-file-number"), Qnil);
@@ -961,6 +967,7 @@ reset_buffer_local_variables (struct buffer *b, bool permanent_too)
      things that depend on the major mode.
      default-major-mode is handled at a higher level.
      We ignore it here.  */
+  bset_widen_limits(b, Qnil);
   bset_major_mode (b, Qfundamental_mode);
   bset_keymap (b, Qnil);
   bset_mode_name (b, QSFundamental);
@@ -2167,7 +2174,7 @@ so the buffer is truly empty after this.  */)
 {
   Fwiden ();
 
-  del_range (BEG, Z);
+  del_range (BEGWL, ZWL);
 
   current_buffer->last_window_start = 1;
   /* Prevent warnings, or suspension of auto saving, that would happen
@@ -5037,6 +5044,7 @@ init_buffer_once (void)
   bset_display_count (&buffer_local_flags, make_number (-1));
   bset_display_time (&buffer_local_flags, make_number (-1));
   bset_enable_multibyte_characters (&buffer_local_flags, make_number (-1));
+  bset_widen_limits (&buffer_local_flags, make_number (-1));
 
   /* These used to be stuck at 0 by default, but now that the all-zero value
      means Qnil, we have to initialize them explicitly.  */
@@ -5160,6 +5168,7 @@ init_buffer_once (void)
   bset_cursor_type (&buffer_defaults, Qt);
   bset_extra_line_spacing (&buffer_defaults, Qnil);
   bset_cursor_in_non_selected_windows (&buffer_defaults, Qt);
+  bset_widen_limits (&buffer_defaults, Qnil);
 
   bset_enable_multibyte_characters (&buffer_defaults, Qt);
   bset_buffer_file_coding_system (&buffer_defaults, Qnil);
@@ -5367,7 +5376,6 @@ defvar_per_buffer (struct Lisp_Buffer_Objfwd *bo_fwd, const char *namestring,
     emacs_abort ();
 }
 
-
 /* Initialize the buffer routines.  */
 void
 syms_of_buffer (void)
@@ -5796,6 +5804,13 @@ If you set this to -2, that means don't turn off auto-saving in this buffer
 if its text size shrinks.   If you use `buffer-swap-text' on a buffer,
 you probably should set this to -2 in that buffer.  */);
 
+  DEFVAR_PER_BUFFER ("buffer-widen-limits", &BVAR (current_buffer, widen_limits),
+                     Qnil,
+                     doc: /* When non-nil `widen` will widen to these limits.
+Must be a cons of the form (MIN . MAX) where MIN and MAX are integers
+of hard widen limits in this buffer. This is an experimental variable
+intended primarily for multi-mode engines.  */);
+
   DEFVAR_PER_BUFFER ("selective-display", &BVAR (current_buffer, selective_display),
 		     Qnil,
 		     doc: /* Non-nil enables selective display.
modified   src/buffer.h
@@ -59,6 +59,10 @@ INLINE_HEADER_BEGIN
 #define Z (current_buffer->text->z)
 #define Z_BYTE (current_buffer->text->z_byte)
 
+/* Positions that take into account widen limits.  */
+#define BEGWL (BUF_BEGWL (current_buffer))
+#define ZWL (BUF_ZWL(current_buffer))
+
 /* Macros for the addresses of places in the buffer.  */
 
 /* Address of beginning of buffer.  */
@@ -128,6 +132,15 @@ INLINE_HEADER_BEGIN
     : NILP (BVAR (buf, begv_marker)) ? buf->begv_byte	\
     : marker_byte_position (BVAR (buf, begv_marker)))
 
+/* Hard positions in buffer. */
+#define BUF_BEGWL(buf)  	                                \
+  ((NILP (BVAR (buf, widen_limits))) ?  BUF_BEG (buf)    \
+   : XINT( XCAR (BVAR (buf, widen_limits))))
+
+#define BUF_ZWL(buf)  	                                \
+  ((NILP (BVAR (buf, widen_limits))) ?  BUF_Z (buf)      \
+   : XINT( XCDR (BVAR (buf, widen_limits))))
+
 /* Position of point in buffer.  */
 #define BUF_PT(buf)					\
    (buf == current_buffer ? PT				\
@@ -150,6 +163,7 @@ INLINE_HEADER_BEGIN
     : NILP (BVAR (buf, zv_marker)) ? buf->zv_byte	\
     : marker_byte_position (BVAR (buf, zv_marker)))
 
+
 /* Position of gap in buffer.  */
 #define BUF_GPT(buf) ((buf)->text->gpt)
 #define BUF_GPT_BYTE(buf) ((buf)->text->gpt_byte)
@@ -748,6 +762,9 @@ struct buffer
      See `cursor-type' for other values.  */
   Lisp_Object cursor_in_non_selected_windows_;
 
+  /* Cons of hard widen limits */
+  Lisp_Object widen_limits_;
+
   /* No more Lisp_Object beyond this point.  Except undo_list,
      which is handled specially in Fgarbage_collect.  */
 
modified   src/editfns.c
@@ -3480,12 +3480,25 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
     return empty_unibyte_string;
   return del_range_1 (XINT (start), XINT (end), 1, 1);
 }
+
 \f
 DEFUN ("widen", Fwiden, Swiden, 0, 0, "",
        doc: /* Remove restrictions (narrowing) from current buffer.
-This allows the buffer's full text to be seen and edited.  */)
+This allows the buffer's full text to be seen and edited.
+If `buffer-widen-limits` is non-nil, widen only to those limits.  */)
   (void)
 {
+
+  if (!NILP (BVAR(current_buffer, widen_limits)))
+    {
+      Lisp_Object hl = BVAR(current_buffer,  widen_limits);
+      CHECK_CONS(hl);
+      CHECK_NUMBER(XCAR(hl));
+      CHECK_NUMBER(XCDR(hl));
+      Fnarrow_to_region(XCAR(hl), XCDR(hl));
+      return Qnil;
+    }
+
   if (BEG != BEGV || Z != ZV)
     current_buffer->clip_changed = 1;
   BEGV = BEG;

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22  3:17                                         ` Vitalie Spinu
@ 2016-03-22  9:57                                           ` Vitalie Spinu
  2016-03-22 10:05                                             ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22  9:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


I think having a local variable for this is not the best idea.

In the proposed implementation you first set buffer-widen-limits, but its effect
shows only on the next invocation of widen. So the consumer has to set this
variable and always follow it with a call to widen. If widen is not called, text
outside the limits stays accessible. In particular, narrow-to-region doesn't
know about the hard limits, so you can narrow to a region outside them.

I think a better implementation would be a `set-widen-limits` function that
sets the limits and narrows the buffer, taking the current narrowing into
account. The supporting `with-widen-limits` macro would call `set-widen-limits`
inside unwind-protect.

I guess this comes very close to the implementation that you suggested earlier
for hiding the internals.
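
A rough sketch of that idea in plain Lisp, under loud assumptions: the
`sketch-' names and the buffer-local variable are hypothetical, and
`sketch-widen' only stands in for a `widen' that honors the limits (the real
change would be in C).  The setter clamps the current narrowing right away,
and the scoped form restores both the limits and the narrowing on exit.

```elisp
(defvar-local sketch-widen-limits nil
  "Hypothetical limits, nil or a cons (BEG . END), honored by `sketch-widen'.")

(defun sketch-widen ()
  "Stand-in for a limit-aware `widen'."
  (if sketch-widen-limits
      (narrow-to-region (car sketch-widen-limits) (cdr sketch-widen-limits))
    (widen)))

(defun sketch-set-widen-limits (beg end)
  "Record BEG..END as limits and immediately clamp the current narrowing."
  (setq sketch-widen-limits (cons beg end))
  ;; Intersect the limits with the narrowing already in effect, so the
  ;; caller does not have to remember to call `widen' afterwards.
  (let ((b (max beg (point-min)))
        (e (min end (point-max))))
    (narrow-to-region b (max b e))))

(defun sketch-call-with-widen-limits (beg end func)
  "Call FUNC with limits BEG..END, then restore old limits and narrowing."
  (let ((old sketch-widen-limits))
    (save-restriction
      (unwind-protect
          (progn (sketch-set-widen-limits beg end)
                 (funcall func))
        (setq sketch-widen-limits old)))))

(defmacro sketch-with-widen-limits (beg end &rest body)
  "Evaluate BODY with limits BEG..END in effect."
  (declare (indent 2) (debug t))
  `(sketch-call-with-widen-limits ,beg ,end (lambda () ,@body)))
```

In a buffer narrowed to 20..60, `(sketch-with-widen-limits 10 40 ...)' first
sees 20..40; a `sketch-widen' inside the body widens to 10..40 and no
further, and the 20..60 narrowing is back once the form returns.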

  Vitalie

>> On Tue, Mar 22 2016 04:17, Vitalie Spinu wrote:

>>> On Mon, Mar 21 2016 20:18, Vitalie Spinu wrote:

>>>> This is outside of use cases that I have in mind.

>>> Indeed, it's a different case, but one where the narrowing should be
>>> hard as well.

>> Ok. This part is trickier but it might not be that hard. Will keep this in mind.

> I have pushed the proposed change to the `widen-limits` branch. The C-level
> consequences are fairly innocuous. There are only 3 calls to Fwiden in the
> whole of Emacs.

> Given that there seem to be use cases for permanent limits, I kept it as a
> buffer-local variable. I also switched to the milder name buffer-widen-limits.

>   Vitalie


> 3 files changed, 48 insertions(+), 3 deletions(-)
> src/buffer.c  | 19 +++++++++++++++++--
> src/buffer.h  | 17 +++++++++++++++++
> src/editfns.c | 15 ++++++++++++++-

> modified   src/buffer.c
> @@ -329,6 +329,11 @@ bset_scroll_up_aggressively (struct buffer *b, Lisp_Object val)
>    b->scroll_up_aggressively_ = val;
>  }
>  static void
> +bset_widen_limits (struct buffer *b, Lisp_Object val)
> +{
> +  b->widen_limits_ = val;
> +}
> +static void
>  bset_selective_display (struct buffer *b, Lisp_Object val)
>  {
>    b->selective_display_ = val;
> @@ -847,6 +852,7 @@ CLONE nil means the indirect buffer's state is reset to default values.  */)
>        bset_display_count (b, make_number (0));
>        bset_backed_up (b, Qnil);
>        bset_auto_save_file_name (b, Qnil);
> +      bset_widen_limits (b, b->base_buffer->widen_limits_);
>        set_buffer_internal_1 (b);
>        Fset (intern ("buffer-save-without-query"), Qnil);
>        Fset (intern ("buffer-file-number"), Qnil);
> @@ -961,6 +967,7 @@ reset_buffer_local_variables (struct buffer *b, bool permanent_too)
>       things that depend on the major mode.
>       default-major-mode is handled at a higher level.
>       We ignore it here.  */
> +  bset_widen_limits(b, Qnil);
>    bset_major_mode (b, Qfundamental_mode);
>    bset_keymap (b, Qnil);
>    bset_mode_name (b, QSFundamental);
> @@ -2167,7 +2174,7 @@ so the buffer is truly empty after this.  */)
>  {
>    Fwiden ();
>  
> -  del_range (BEG, Z);
> +  del_range (BEGWL, ZWL);
>  
>    current_buffer->last_window_start = 1;
>    /* Prevent warnings, or suspension of auto saving, that would happen
> @@ -5037,6 +5044,7 @@ init_buffer_once (void)
>    bset_display_count (&buffer_local_flags, make_number (-1));
>    bset_display_time (&buffer_local_flags, make_number (-1));
>    bset_enable_multibyte_characters (&buffer_local_flags, make_number (-1));
> +  bset_widen_limits (&buffer_local_flags, make_number (-1));
>  
>    /* These used to be stuck at 0 by default, but now that the all-zero value
>       means Qnil, we have to initialize them explicitly.  */
> @@ -5160,6 +5168,7 @@ init_buffer_once (void)
>    bset_cursor_type (&buffer_defaults, Qt);
>    bset_extra_line_spacing (&buffer_defaults, Qnil);
>    bset_cursor_in_non_selected_windows (&buffer_defaults, Qt);
> +  bset_widen_limits (&buffer_defaults, Qnil);
>  
>    bset_enable_multibyte_characters (&buffer_defaults, Qt);
>    bset_buffer_file_coding_system (&buffer_defaults, Qnil);
> @@ -5367,7 +5376,6 @@ defvar_per_buffer (struct Lisp_Buffer_Objfwd *bo_fwd, const char *namestring,
>      emacs_abort ();
>  }
>  
> -
>  /* Initialize the buffer routines.  */
>  void
>  syms_of_buffer (void)
> @@ -5796,6 +5804,13 @@ If you set this to -2, that means don't turn off auto-saving in this buffer
>  if its text size shrinks.   If you use `buffer-swap-text' on a buffer,
>  you probably should set this to -2 in that buffer.  */);
>  
> +  DEFVAR_PER_BUFFER ("buffer-widen-limits", &BVAR (current_buffer, widen_limits),
> +                     Qnil,
> +                     doc: /* When non-nil `widen` will widen to these limits.
> +Must be a cons of the form (MIN . MAX) where MIN and MAX are integers
> +of hard widen limits in this buffer. This is an experimental variable
> +intended primarily for multi-mode engines.  */);
> +
>    DEFVAR_PER_BUFFER ("selective-display", &BVAR (current_buffer, selective_display),
>  		     Qnil,
>  		     doc: /* Non-nil enables selective display.
> modified   src/buffer.h
> @@ -59,6 +59,10 @@ INLINE_HEADER_BEGIN
>  #define Z (current_buffer->text->z)
>  #define Z_BYTE (current_buffer->text->z_byte)
>  
> +/* Positions that take into account widen limits.  */
> +#define BEGWL (BUF_BEGWL (current_buffer))
> +#define ZWL (BUF_ZWL(current_buffer))
> +
>  /* Macros for the addresses of places in the buffer.  */
>  
>  /* Address of beginning of buffer.  */
> @@ -128,6 +132,15 @@ INLINE_HEADER_BEGIN
>      : NILP (BVAR (buf, begv_marker)) ? buf->begv_byte	\
>      : marker_byte_position (BVAR (buf, begv_marker)))
>  
> +/* Hard positions in buffer. */
> +#define BUF_BEGWL(buf)  	                                \
> +  ((NILP (BVAR (buf, widen_limits))) ?  BUF_BEG (buf)    \
> +   : XINT( XCAR (BVAR (buf, widen_limits))))
> +
> +#define BUF_ZWL(buf)  	                                \
> +  ((NILP (BVAR (buf, widen_limits))) ?  BUF_Z (buf)      \
> +   : XINT( XCDR (BVAR (buf, widen_limits))))
> +
>  /* Position of point in buffer.  */
>  #define BUF_PT(buf)					\
>     (buf == current_buffer ? PT				\
> @@ -150,6 +163,7 @@ INLINE_HEADER_BEGIN
>      : NILP (BVAR (buf, zv_marker)) ? buf->zv_byte	\
>      : marker_byte_position (BVAR (buf, zv_marker)))
>  
> +
>  /* Position of gap in buffer.  */
>  #define BUF_GPT(buf) ((buf)->text->gpt)
>  #define BUF_GPT_BYTE(buf) ((buf)->text->gpt_byte)
> @@ -748,6 +762,9 @@ struct buffer
>       See `cursor-type' for other values.  */
>    Lisp_Object cursor_in_non_selected_windows_;
>  
> +  /* Cons of hard widen limits */
> +  Lisp_Object widen_limits_;
> +
>    /* No more Lisp_Object beyond this point.  Except undo_list,
>       which is handled specially in Fgarbage_collect.  */
>  
> modified   src/editfns.c
> @@ -3480,12 +3480,25 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
>      return empty_unibyte_string;
>    return del_range_1 (XINT (start), XINT (end), 1, 1);
>  }
> +
>  \f
>  DEFUN ("widen", Fwiden, Swiden, 0, 0, "",
>         doc: /* Remove restrictions (narrowing) from current buffer.
> -This allows the buffer's full text to be seen and edited.  */)
> +This allows the buffer's full text to be seen and edited.
> +If `buffer-widen-limits` is non-nil, widen only to those limits.  */)
>    (void)
>  {
> +
> +  if (!NILP (BVAR(current_buffer, widen_limits)))
> +    {
> +      Lisp_Object hl = BVAR(current_buffer,  widen_limits);
> +      CHECK_CONS(hl);
> +      CHECK_NUMBER(XCAR(hl));
> +      CHECK_NUMBER(XCDR(hl));
> +      Fnarrow_to_region(XCAR(hl), XCDR(hl));
> +      return Qnil;
> +    }
> +
>    if (BEG != BEGV || Z != ZV)
>      current_buffer->clip_changed = 1;
>    BEGV = BEG;




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22  9:57                                           ` Vitalie Spinu
@ 2016-03-22 10:05                                             ` Vitalie Spinu
  2016-03-22 11:57                                               ` Stefan Monnier
  2016-03-22 20:08                                               ` Richard Stallman
  0 siblings, 2 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22 10:05 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


>> On Tue, Mar 22 2016 10:57, Vitalie Spinu wrote:

> So the consumer will need to set this variable and always follow it by widen.

Hm. This also implies that each consumer will need to take care of current
narrowing and re-narrow to new limits. This doesn't sound right.

I am also not sure what the behavior of save-restriction should be. Should
save-restriction unwind hard limits as well?

  Vitalie


   



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 10:05                                             ` Vitalie Spinu
@ 2016-03-22 11:57                                               ` Stefan Monnier
  2016-03-22 16:28                                                 ` Vitalie Spinu
  2016-04-28 13:29                                                 ` Vitalie Spinu
  2016-03-22 20:08                                               ` Richard Stallman
  1 sibling, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-22 11:57 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: emacs-devel

>> So the consumer will need to set this variable and always follow it by widen.
> Hm. This also implies that each consumer will need to take care of current
> narrowing and re-narrow to new limits. This doesn't sound right.
> I am also not sure what the behavior of save-restriction should be. Should
> save-restriction unwind hard limits as well?

IIRC past discussions on this issue, one option was to merge your
set-widen-limits into narrow-to-region by adding an optional argument
`hard'.  And yes, I think save-restriction should unwind hard limits.
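To make the save-restriction question concrete, here is a minimal sketch in plain C of what "unwinding hard limits as well" could mean. It is a toy model, not Emacs code: `toy_buffer`, `toy_restriction` and every `toy_*` function are hypothetical names, and positions are plain integers, whereas the real save-restriction tracks markers.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Emacs.  */
struct toy_buffer
{
  ptrdiff_t beg, z;     /* bounds of the whole text */
  ptrdiff_t begv, zv;   /* ordinary (visual) narrowing */
  ptrdiff_t begh, zh;   /* hypothetical hard limits */
};

/* Snapshot of both kinds of limits.  */
struct toy_restriction
{
  ptrdiff_t begv, zv, begh, zh;
};

static ptrdiff_t
toy_clamp (ptrdiff_t x, ptrdiff_t lo, ptrdiff_t hi)
{
  return x < lo ? lo : (x > hi ? hi : x);
}

/* Record the limits, as a save-restriction-like form would on entry.  */
struct toy_restriction
toy_save_restriction (const struct toy_buffer *b)
{
  struct toy_restriction r = { b->begv, b->zv, b->begh, b->zh };
  return r;
}

/* Impose hard limits and pull the visible region inside them.  */
void
toy_narrow_hard (struct toy_buffer *b, ptrdiff_t start, ptrdiff_t end)
{
  assert (b->beg <= start && start <= end && end <= b->z);
  b->begh = start;
  b->zh = end;
  b->begv = toy_clamp (b->begv, start, end);
  b->zv = toy_clamp (b->zv, start, end);
}

/* On exit, unwind the hard limits together with the visual ones.  */
void
toy_restore_restriction (struct toy_buffer *b, struct toy_restriction r)
{
  b->begh = r.begh;
  b->zh = r.zh;
  b->begv = r.begv;
  b->zv = r.zv;
}
```

Under this reading, code that imposes hard limits inside such a form cannot leak them to its caller, which is what a multi-mode engine setting limits transiently would want.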


        Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 11:57                                               ` Stefan Monnier
@ 2016-03-22 16:28                                                 ` Vitalie Spinu
  2016-03-22 16:44                                                   ` Stefan Monnier
  2016-04-28 13:29                                                 ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22 16:28 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>> On Tue, Mar 22 2016 07:57, Stefan Monnier wrote:

> IIRC past discussions on this issue, one option was to merge your
> set-widen-limits into narrow-to-region by adding an optional argument `hard'.

If narrowing is already in place, set-widen-limits will not touch it unless the
visible region expands beyond the hard limits. I think widen limits is
fundamentally about widening and only indirectly about narrowing.

Mixing hard limit into a common user level function is a bad marketing
strategy. We don't want to encourage major modes to use it in funny ways. If
users and major modes decide to use hard limits we might end up in the same
situation as now when narrow/widen is not perceived as a good tool for
multi-modes.

Static usage of hard widening, like in Info example, is not really a
problem. Multi modes need to impose hard limits transiently, in specific
contexts like indentation, syntax parsing or font lock, and will restore the
limits at the end. Problems will occur if major modes start using hard limits in
such contexts directly.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 16:28                                                 ` Vitalie Spinu
@ 2016-03-22 16:44                                                   ` Stefan Monnier
  2016-03-22 19:36                                                     ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-22 16:44 UTC (permalink / raw)
  To: emacs-devel

> Mixing hard limit into a common user level function is a bad marketing
> strategy.

`narrow-to-region' is not only a user-level command.
It's also a low-level primitive.

The narrow-to-region command can't set the optional argument unless we
take extra steps to let it, so the "hard narrowing" would only be
available from Elisp, not interactively.

> If users and major modes decide to use hard limits we might end up in
> the same situation as now when narrow/widen is not perceived as a good
> tool for multi-modes.

Could be.  Maybe there are more "kinds of narrowing" than just 2, indeed.

But for me, the main consideration is whether the text before/after
point-min can be taken into account as a kind of context, or whether the
text between point-min/max should be treated (even if temporarily) as
being the whole&sole truth.

That's what "hard narrowing" means to me.  I don't think I'd be able to
design something that can take into account finer distinctions of
narrowings right now, for lack of understanding about what those finer
distinctions could be and what kind of problems they lead to.

> limits at the end. Problems will occur if major modes start using hard
> limits in such contexts directly.

I don't see any reason why problems *will* occur in that case (tho, of
course, Murphy could be that reason).  So until such problems do show up,
I wouldn't worry.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 16:44                                                   ` Stefan Monnier
@ 2016-03-22 19:36                                                     ` Vitalie Spinu
  2016-03-23  2:22                                                       ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22 19:36 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>> On Tue, Mar 22 2016 12:44, Stefan Monnier wrote:

> "hard narrowing" would only be available from Elisp, not interactively.

Interactive or not, it doesn't matter. The danger is whatever elisp is used within
hard-narrowed regions.

>> If users and major modes decide to use hard limits we might end up in
>> the same situation as now when narrow/widen is not perceived as a good
>> tool for multi-modes.

> Could be.  Maybe there are more "kinds of narrowing" than just 2, indeed.

> But for me, the main consideration is whether the text before/after
> point-min can be taken into account as a kind of context, or whether the
> text between point-min/max should be treated (even if temporarily) as
> being the whole&sole truth.

I agree completely. But I think defining a "whole&sole" universe need not
involve current implementation of narrowing. It's about inability to widen not
ability to narrow. My patch didn't even touch `narrow` because that's not
needed.

There is no real need to invent extra type of narrowing. It's a lot of extra
work with no additional benefit. It's simply enough to define hard limits that
none of the standard functions can lift.

In order to define a different type of narrowing you would need to introduce
alternatives to BEGV, BEGV_BYTE, ZV and ZV_BYTE, and then hunt them down
everywhere BEGV or BEGV_BYTE are used right now.

What concrete semantics do you have in mind? If a user or elisp already narrowed
the buffer, will hard narrowing re-narrow it? If the user types within the
hard-narrowed region, will the upper hard limit expand just as ZV does?

My approach is simpler and leaves current narrowing functionality alone. You set
the limits and allow narrowing happening inside those limits normally. Even
widen cannot lift those limits. You create a small universe within the buffer
with only one exit (set-widen-limits nil nil).

You might end up losing text outside of the bounds if you modify the buffer and
then call widen, but that's by design and this is how it's different from visual
narrowing. Hard limits stay the same irrespective of what happens to the buffer.
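The integer-limit behaviour described above can be sketched as a toy model in plain C. This is an illustration under stated assumptions, not the patch itself: `toy_buffer` and the `toy_*` functions are hypothetical names, and byte positions are ignored. It shows how an insertion grows the visible region while the limits stay put, so a later widen re-narrows to the old bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Emacs.  */
struct toy_buffer
{
  ptrdiff_t beg, z;             /* bounds of the whole text */
  ptrdiff_t begv, zv;           /* ordinary (visual) narrowing */
  int has_limits;               /* nonzero when widen limits are set */
  ptrdiff_t lim_min, lim_max;   /* plain integers, never adjusted */
};

static ptrdiff_t
toy_clamp (ptrdiff_t x, ptrdiff_t lo, ptrdiff_t hi)
{
  return x < lo ? lo : (x > hi ? hi : x);
}

/* Set the limits; existing narrowing is touched only where it
   sticks out beyond them.  */
void
toy_set_widen_limits (struct toy_buffer *b, ptrdiff_t min, ptrdiff_t max)
{
  assert (b->beg <= min && min <= max && max <= b->z);
  b->has_limits = 1;
  b->lim_min = min;
  b->lim_max = max;
  b->begv = toy_clamp (b->begv, min, max);
  b->zv = toy_clamp (b->zv, min, max);
}

/* The single exit from the restricted universe.  */
void
toy_clear_widen_limits (struct toy_buffer *b)
{
  b->has_limits = 0;
}

/* Widen, but only as far as the limits allow.  */
void
toy_widen (struct toy_buffer *b)
{
  b->begv = b->has_limits ? b->lim_min : b->beg;
  b->zv = b->has_limits ? b->lim_max : b->z;
}

/* Insert N characters at POS: the text and the visible region grow,
   the integer limits do not follow.  */
void
toy_insert (struct toy_buffer *b, ptrdiff_t pos, ptrdiff_t n)
{
  assert (b->begv <= pos && pos <= b->zv && n >= 0);
  b->z += n;
  b->zv += n;
}
```

With limits 20..40, inserting five characters at 30 leaves the visible region at 20..45; a subsequent widen pulls it back to 20..40, so the last five characters of the region fall outside the limits until they are cleared.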

>> limits at the end. Problems will occur if major modes start using hard
>> limits in such contexts directly.

> I don't see any reason why problems *will* occur in that case (tho, of
> course, Murphy could be that reason).  So until such problems do show up,
> I wouldn't worry.

The problem is not hypothetical. It's occurring right now. If you impose limits
in order to do font-lock and font-lock-fontify-region-function changes those
limits, that breaks your multi-mode. That's what is happening with the current
narrowing/widening mechanism and that's precisely the reason for extra widen
limits in the first place.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 19:36                                                     ` Vitalie Spinu
@ 2016-03-23  2:22                                                       ` Stefan Monnier
  2016-03-23 11:41                                                         ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-23  2:22 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: emacs-devel

> There is no real need to invent extra type of narrowing. It's a lot of extra
> work with no additional benefit.

I don't see any extra work.  (narrow-to-region BEG END 'hard) would just
be the API used to set your hard limits, and that's all there is to it.

> If the user types within the hard-narrowed region, will the
> upper hard limit expand just as ZV does?

This is indispensable, yes.  No matter whether the hard limits are
folded into narrow-to-region or any other way: the upper limit has to be
a marker, and unless we strictly enforce that the hard limits can't be
circumvented at all, the lower limit would probably have to be a marker
as well.
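A minimal sketch of why markers help, again as a toy model in plain C rather than Emacs code. `toy_marker`, `toy_limits` and the functions below are hypothetical; the `advances` flag is meant to mirror the marker insertion type, which the patches in this thread set to 1 for zv_marker.

```c
#include <assert.h>
#include <stddef.h>

/* Toy marker: a position that follows edits.  Not Emacs's
   struct Lisp_Marker.  */
struct toy_marker
{
  ptrdiff_t charpos;
  int advances;   /* nonzero: move when text is inserted exactly here */
};

/* Adjust one marker for an insertion of N characters at POS.  */
void
toy_marker_note_insert (struct toy_marker *m, ptrdiff_t pos, ptrdiff_t n)
{
  if (m->charpos > pos || (m->charpos == pos && m->advances))
    m->charpos += n;
}

/* Hard limits kept as a pair of markers.  */
struct toy_limits
{
  struct toy_marker lower;   /* stays put on insertion at the limit */
  struct toy_marker upper;   /* advances on insertion at the limit */
};

/* Adjust both limits for an insertion of N characters at POS.  */
void
toy_limits_note_insert (struct toy_limits *l, ptrdiff_t pos, ptrdiff_t n)
{
  toy_marker_note_insert (&l->lower, pos, n);
  toy_marker_note_insert (&l->upper, pos, n);
}
```

With limits 20..40, inserting five characters at 30 moves the upper limit to 45 while the lower one stays at 20, so text typed inside the region remains inside it. An insertion before the region shifts both limits by the same amount.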

> My approach is simpler and leaves current narrowing functionality
> alone.  You set the limits and allow narrowing happening inside those
> limits normally.

That's also how I imagine (narrow-to-region BEG END 'hard) working.
It just won't allow widening outside of those hard limits.

> You might end up losing text outside of the bounds if you modify the
> buffer and then call widen, but that's by design and this is how it's
> different from visual narrowing.  Hard limits stay the same
> irrespective of what happens to the buffer.

Sounds like a wart.  What's the benefit?

>>> limits at the end. Problems will occur if major modes start using hard
>>> limits in such contexts directly.
>> I don't see any reason why problems *will* occur in that case (tho, of
>> course, Murphy could be that reason).  So until such problems do show up,
>> I wouldn't worry.
> The problem is not hypothetical. It's occurring right now.

It can't because we don't have hard limits right now.


        Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-23  2:22                                                       ` Stefan Monnier
@ 2016-03-23 11:41                                                         ` Vitalie Spinu
  2016-03-23 12:34                                                           ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-23 11:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

>> On Tue, Mar 22 2016 22:22, Stefan Monnier wrote:

>> If the user types within the hard-narrowed region, will the
>> upper hard limit expand just as ZV does?

> This is indispensable, yes.  No matter whether the hard limits are
> folded into narrow-to-region or any other way: the upper limit has to be
> a marker, and unless we strictly enforce that the hard limits can't be
> circumvented at all, the lower limit would probably have to be a marker
> as well.

Ok. So we agree that there is work involved in tracking an extra
marker. Whenever the buffer is modified by low-level code, it must track the new ZH
marker and respect the relationship between ZH and ZV. There are 544 occurrences
of ZV in emacs source. In order to add this extra marker one would need to go
through all of those cases and enforce the semantics of ZH.

It might be that adjusting ZV macros might do the job, but I cannot judge
because I am not yet familiar with buffer modification code.

>> You might end up losing text outside of the bounds if you modify the
>> buffer and then call widen, but that's by design and this is how it's
>> different from visual narrowing.  Hard limits stay the same
>> irrespective of what happens to the buffer.

> Sounds like a wart.  What's the benefit?

True, but it's almost a direct implementation of the restriction in
prog-widen. It has same limitations and multi-modes are completely fine with
those. A with-widen-limits macro will suffice for multi-mode use case. But you
proposed to extend it to permanent set-widen-limits or (narrow-to-region
.. 'hard). I see the benefit of it in info mode but I think it's pretty
marginal.

The proposed non-marker implementation will deter usage of widen-limits in
contexts that involve buffer modification. But it will work just fine with
multi-modes and with read-only info use cases. It also works fine with editing
as long as it's not followed by widen. If widen is used the buffer will be
re-narrowed to old limits.

I will look into the ZH marker this weekend. Maybe it's not as hard as I imagine.

>>>> limits at the end. Problems will occur if major modes start using hard
>>>> limits in such contexts directly.
>>> I don't see any reason why problems *will* occur in that case (tho, of
>>> course, Murphy could be that reason).  So until such problems do show up,
>>> I wouldn't worry.
>> The problem is not hypothetical. It's occurring right now.

> It can't because we don't have hard limits right now.

Oh, come on. You know I was referring to the current widen/narrow mechanism. It's one
step to extrapolate to hard narrowing from there.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-23 11:41                                                         ` Vitalie Spinu
@ 2016-03-23 12:34                                                           ` Stefan Monnier
  2016-03-23 12:41                                                             ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-23 12:34 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: emacs-devel

> Ok. So we agree that there is work involved in tracking an extra
> marker. Whenever the buffer is modified by low-level code, it must track the new ZH
> marker and respect the relationship between ZH and ZV. There are 544 occurrences
> of ZV in emacs source. In order to add this extra marker one would need to go
> through all of those cases and enforce the semantics of ZH.

I wouldn't want to touch Z* and BEG*, indeed.
I'm just suggesting to keep the limits as markers rather than
as integers.  It's a trivial change.


        Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-23 12:34                                                           ` Stefan Monnier
@ 2016-03-23 12:41                                                             ` Vitalie Spinu
  2016-03-29 21:43                                                               ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-23 12:41 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel



>> On Wed, Mar 23 2016 08:34, Stefan Monnier wrote:

> I wouldn't want to touch Z* and BEG*, indeed. I'm just suggesting to keep the
> limits as markers rather than as integers.  It's a trivial change.

Hm. That might work quite well actually.

   Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-23 12:41                                                             ` Vitalie Spinu
@ 2016-03-29 21:43                                                               ` Vitalie Spinu
  2016-04-22 14:34                                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-29 21:43 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2046 bytes --]



>> On Wed, Mar 23 2016 12:41, Vitalie Spinu wrote:

>> I'm just suggesting to keep the limits as markers rather than as integers.

Attaching a patch for this. AFAICS it is shaping up pretty nicely. There are two types
of narrowing, visual and hard. The imposition or lifting of these is done
through narrow-to-region and widen depending on the value of optional HARD
argument.
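The two kinds of narrowing can be sketched as a toy model in plain C. This is a hedged illustration, not the attached patch: `toy_buffer` and the `toy_*` functions are hypothetical, the hard limits are plain integers here rather than markers, and the visual widen follows the doc string (widen up to the hard limits), whereas the posted hunk still assigns BEG and Z in that branch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Emacs.  */
struct toy_buffer
{
  ptrdiff_t beg, z;     /* bounds of the whole text */
  ptrdiff_t begv, zv;   /* visual narrowing */
  int has_hard;         /* nonzero when hard limits are in place */
  ptrdiff_t begh, zh;   /* hard limits (markers in the patch) */
};

/* Effective hard limits: fall back to the whole text when unset.  */
ptrdiff_t
toy_begh (const struct toy_buffer *b)
{
  return b->has_hard ? b->begh : b->beg;
}

ptrdiff_t
toy_zh (const struct toy_buffer *b)
{
  return b->has_hard ? b->zh : b->z;
}

/* Narrow to START..END; with HARD nonzero, also record hard limits.
   Returns 0 on success, -1 if the arguments are out of range.  */
int
toy_narrow_to_region (struct toy_buffer *b, ptrdiff_t start,
                      ptrdiff_t end, int hard)
{
  if (!(b->beg <= start && start <= end && end <= b->z))
    return -1;
  if (hard)
    {
      b->has_hard = 1;
      b->begh = start;
      b->zh = end;
      /* Visual narrowing already within the hard limits: keep it.  */
      if (b->begv >= start && b->zv <= end)
        return 0;
    }
  b->begv = start;
  b->zv = end;
  return 0;
}

/* With HARD nonzero, lift the hard limits and leave the visual
   narrowing alone; otherwise widen only up to the hard limits.  */
void
toy_widen (struct toy_buffer *b, int hard)
{
  if (hard)
    {
      b->has_hard = 0;
      return;
    }
  b->begv = toy_begh (b);
  b->zv = toy_zh (b);
}
```

In this model a full widen takes two steps: one call with HARD nonzero to lift the limits, then an ordinary widen to expose the whole text.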

I haven't tested it yet because of the following build problem with loaddefs:

    ...    
    Finding pointers to doc strings...
    Finding pointers to doc strings...done
    Dumping under the name emacs
    91970 pure bytes used
    : paxctl -zex emacs
    mv -f emacs bootstrap-emacs
    make -C ../lisp compile-first EMACS="../src/bootstrap-emacs"
    make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp'
      ELC      emacs-lisp/macroexp.elc
      ELC      emacs-lisp/cconv.elc
      ELC      emacs-lisp/byte-opt.elc
      ELC      emacs-lisp/bytecomp.elc
      ELC      emacs-lisp/autoload.elc
    make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp'
    make -C ../lisp autoloads EMACS="../src/bootstrap-emacs"
    make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp'
      GEN      calendar/cal-loaddefs.el
    Loading macroexp.elc...
    appt.el:0:0: error: wrong-type-argument: (markerp /home/vspinu/bin/emacs-test/lisp/calendar/appt.el)
    Makefile:402: recipe for target 'calendar/cal-loaddefs.el' failed
    make[3]: *** [calendar/cal-loaddefs.el] Error 255
    make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp'
    Makefile:727: recipe for target '../lisp/loaddefs.el' failed
    make[2]: *** [../lisp/loaddefs.el] Error 2
    make[2]: Leaving directory '/home/vspinu/bin/emacs-test/src'
    Makefile:398: recipe for target 'src' failed
    make[1]: *** [src] Error 2
    make[1]: Leaving directory '/home/vspinu/bin/emacs-test'
    Makefile:1091: recipe for target 'bootstrap' failed
    make: *** [bootstrap] Error 2


Any ideas of why this is happening?

The relevant branch is scratch/hard-narrow.


  Vitalie


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-diff, Size: 10728 bytes --]

scratch/hard-narrow origin/scratch/hard-narrow 7068e4c811f7530e14d2684fea68499418642b33
Author:     Vitalie Spinu <spinuvit@gmail.com>
AuthorDate: Mon Mar 21 05:41:55 2016 +0100
Commit:     Vitalie Spinu <spinuvit@gmail.com>
CommitDate: Tue Mar 29 23:29:54 2016 +0200

Parent:     f99b512 In M-%, avoid making buffer-local binding of text-property-default-nonsticky
Merged:     emacs-24 master scratch/hard-narrow
Containing: scratch/hard-narrow
Follows:    emacs-25.0.92 (138)

Hard narrowing

Idem

modified   src/buffer.c
@@ -571,6 +571,9 @@ even if it is dead.  The return value is never nil.  */)
   bset_begv_marker (b, Qnil);
   bset_zv_marker (b, Qnil);
 
+  bset_begh_marker (b, Qnil);
+  bset_zh_marker (b, Qnil);
+
   name = Fcopy_sequence (buffer_or_name);
   set_string_intervals (name, NULL);
   bset_name (b, name);
@@ -835,6 +838,7 @@ CLONE nil means the indirect buffer's state is reset to default values.  */)
       bset_pt_marker (b, build_marker (b, b->pt, b->pt_byte));
       bset_begv_marker (b, build_marker (b, b->begv, b->begv_byte));
       bset_zv_marker (b, build_marker (b, b->zv, b->zv_byte));
+
       XMARKER (BVAR (b, zv_marker))->insertion_type = 1;
     }
   else
@@ -2165,9 +2169,9 @@ Any narrowing restriction in effect (see `narrow-to-region') is removed,
 so the buffer is truly empty after this.  */)
   (void)
 {
-  Fwiden ();
+  Fwiden (Qnil);
 
-  del_range (BEG, Z);
+  del_range (BEGV, ZV);
 
   current_buffer->last_window_start = 1;
   /* Prevent warnings, or suspension of auto saving, that would happen
@@ -2310,6 +2314,8 @@ DEFUN ("buffer-swap-text", Fbuffer_swap_text, Sbuffer_swap_text,
   swapfield_ (pt_marker, Lisp_Object);
   swapfield_ (begv_marker, Lisp_Object);
   swapfield_ (zv_marker, Lisp_Object);
+  swapfield_ (begh_marker, Lisp_Object);
+  swapfield_ (zh_marker, Lisp_Object);
   bset_point_before_scroll (current_buffer, Qnil);
   bset_point_before_scroll (other_buffer, Qnil);
 
@@ -2490,7 +2496,7 @@ current buffer is cleared.  */)
 	    }
 	}
       if (narrowed)
-	Fnarrow_to_region (make_number (begv), make_number (zv));
+	Fnarrow_to_region (make_number (begv), make_number (zv), Qnil);
     }
   else
     {
@@ -2571,7 +2577,7 @@ current buffer is cleared.  */)
 	TEMP_SET_PT (pt);
 
       if (narrowed)
-	Fnarrow_to_region (make_number (begv), make_number (zv));
+	Fnarrow_to_region (make_number (begv), make_number (zv), Qnil);
 
       /* Do this first, so that chars_in_text asks the right question.
 	 set_intervals_multibyte needs it too.  */
@@ -5053,6 +5059,8 @@ init_buffer_once (void)
   bset_pt_marker (&buffer_local_flags, make_number (0));
   bset_begv_marker (&buffer_local_flags, make_number (0));
   bset_zv_marker (&buffer_local_flags, make_number (0));
+  bset_begh_marker (&buffer_local_flags, make_number (0));
+  bset_zh_marker (&buffer_local_flags, make_number (0));
   bset_last_selected_window (&buffer_local_flags, make_number (0));
 
   idx = 1;
modified   src/buffer.h
@@ -416,6 +416,26 @@ extern void enlarge_buffer_text (struct buffer *, ptrdiff_t);
 
 #define BUF_FETCH_BYTE(buf, n) \
   *(BUF_BYTE_ADDRESS ((buf), (n)))
+
+\f
+/* Macros for setting and accessing hard-narrow markers */
+
+/* Position of beginning of hard-narrowed range of buffer. */
+#define BEGH (BUF_BEGH (current_buffer))
+#define BUF_BEGH(buf)                                   \
+  ((NILP (BVAR (buf, begh_marker))) ? BUF_BEG (buf)     \
+   : marker_position (BVAR (buf, begh_marker)))
+#define SET_BUF_BEGH(buf, charpos)                               \
+  (bset_begh_marker (buf, build_marker(buf, charpos, buf_charpos_to_bytepos(buf, charpos))))
+
+/* Position of end of hard-narrowed range of buffer. */
+#define ZH (BUF_ZH(current_buffer))
+#define BUF_ZH(buf)                                     \
+  ((NILP (BVAR (buf, zh_marker))) ? BUF_Z (buf)         \
+   : marker_position (BVAR (buf, zh_marker)))
+#define SET_BUF_ZH(buf, charpos)                                 \
+  (bset_zh_marker (buf, build_marker(buf, charpos, buf_charpos_to_bytepos(buf, charpos))))
+
 \f
 /* Define the actual buffer data structures.  */
 
@@ -666,6 +686,12 @@ struct buffer
      ZV for this buffer when the buffer is not current.  */
   Lisp_Object zv_marker_;
 
+  /* Lower hard limit of the buffer.*/
+  Lisp_Object begh_marker_;
+
+  /* Upper hard limit of the buffer.*/
+  Lisp_Object zh_marker_;
+
   /* This holds the point value before the last scroll operation.
      Explicitly setting point sets this to nil.  */
   Lisp_Object point_before_scroll_;
@@ -984,6 +1010,16 @@ bset_width_table (struct buffer *b, Lisp_Object val)
 {
   b->width_table_ = val;
 }
+INLINE void
+bset_begh_marker (struct buffer *b, Lisp_Object val)
+{
+  b->begh_marker_ = val;
+}
+INLINE void
+bset_zh_marker (struct buffer *b, Lisp_Object val)
+{
+  b->zh_marker_ = val;
+}
 
 /* Number of Lisp_Objects at the beginning of struct buffer.
    If you add, remove, or reorder Lisp_Objects within buffer
modified   src/bytecode.c
@@ -1682,17 +1682,18 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 
 	CASE (Bnarrow_to_region):
 	  {
-	    Lisp_Object v1;
+	    Lisp_Object v1, v2;
 	    BEFORE_POTENTIAL_GC ();
 	    v1 = POP;
-	    TOP = Fnarrow_to_region (TOP, v1);
+	    v2 = POP;
+	    TOP = Fnarrow_to_region (TOP, v2, v1);
 	    AFTER_POTENTIAL_GC ();
 	    NEXT;
 	  }
 
 	CASE (Bwiden):
 	  BEFORE_POTENTIAL_GC ();
-	  PUSH (Fwiden ());
+	  TOP = Fwiden (TOP);
 	  AFTER_POTENTIAL_GC ();
 	  NEXT;
 
modified   src/editfns.c
@@ -3480,33 +3480,54 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
     return empty_unibyte_string;
   return del_range_1 (XINT (start), XINT (end), 1, 1);
 }
+
 \f
-DEFUN ("widen", Fwiden, Swiden, 0, 0, "",
+DEFUN ("widen", Fwiden, Swiden, 0, 1, "",
        doc: /* Remove restrictions (narrowing) from current buffer.
-This allows the buffer's full text to be seen and edited.  */)
-  (void)
+If HARD is non-nil, remove the hard restriction imposed by a previous
+call to \\[narrow-to-region].  If HARD is nil, remove visual
+restriction up to the previously imposed hard limit (if any).  */)
+  (Lisp_Object hard)
 {
-  if (BEG != BEGV || Z != ZV)
-    current_buffer->clip_changed = 1;
-  BEGV = BEG;
-  BEGV_BYTE = BEG_BYTE;
-  SET_BUF_ZV_BOTH (current_buffer, Z, Z_BYTE);
-  /* Changing the buffer bounds invalidates any recorded current column.  */
-  invalidate_current_column ();
+
+  if(!NILP (hard))
+    {
+      bset_begh_marker(current_buffer, Qnil);
+      bset_zh_marker(current_buffer, Qnil);
+    }
+  else
+    {
+      if (BEG != BEGV || Z != ZV)
+        current_buffer->clip_changed = 1;
+      BEGV = BEG;
+      BEGV_BYTE = BEG_BYTE;
+      SET_BUF_ZV_BOTH (current_buffer, Z, Z_BYTE);
+      /* Changing the buffer bounds invalidates any recorded current column.  */
+      invalidate_current_column ();
+    }
+
   return Qnil;
 }
 
-DEFUN ("narrow-to-region", Fnarrow_to_region, Snarrow_to_region, 2, 2, "r",
-       doc: /* Restrict editing in this buffer to the current region.
-The rest of the text becomes temporarily invisible and untouchable
-but is not deleted; if you save the buffer in a file, the invisible
-text is included in the file.  \\[widen] makes all visible again.
-See also `save-restriction'.
+DEFUN ("narrow-to-region", Fnarrow_to_region, Snarrow_to_region, 2, 3, "r",
+       doc: /* Restrict editing in this buffer to the current
+region. START and END are positions (integers or markers) bounding the
+text that should be restricted. There can be two types of restrictions,
+visual and hard. If HARD is nil, impose visual restriction, otherwise
+a hard one.
 
-When calling from a program, pass two arguments; positions (integers
-or markers) bounding the text that should remain visible.  */)
-  (register Lisp_Object start, Lisp_Object end)
+When visual restriction is in place, the rest of the text is invisible
+and untouchable but is not deleted; if you save the buffer in a file,
+the invisible text is included in the file. \\[widen] with nil
+optional argument makes it all visible again.
+
+When hard restriction is in place, invocations of (visual) \\[widen]
+with nil argument remove visual narrowing up to the hard
+restriction. In order to lift hard restriction,  call \\[widen] with
+non-nil HARD argument.  */)
+  (register Lisp_Object start, Lisp_Object end, Lisp_Object hard)
 {
+
   CHECK_NUMBER_COERCE_MARKER (start);
   CHECK_NUMBER_COERCE_MARKER (end);
 
@@ -3519,6 +3540,15 @@ or markers) bounding the text that should remain visible.  */)
   if (!(BEG <= XINT (start) && XINT (start) <= XINT (end) && XINT (end) <= Z))
     args_out_of_range (start, end);
 
+  if (!NILP (hard))
+    {
+      SET_BUF_BEGH (current_buffer, XFASTINT (start));
+      SET_BUF_ZH (current_buffer, XFASTINT (end));
+      if (BEGV >= XFASTINT (start) && ZV <= XFASTINT (end))
+        /* Visual narrowing within hard limits.  */
+        return Qnil;
+    }
+
   if (BEGV != XFASTINT (start) || ZV != XFASTINT (end))
     current_buffer->clip_changed = 1;
 
@@ -3533,6 +3563,7 @@ or markers) bounding the text that should remain visible.  */)
   return Qnil;
 }
 
+
 Lisp_Object
 save_restriction_save (void)
 {
modified   src/fileio.c
@@ -4764,7 +4764,7 @@ write_region (Lisp_Object start, Lisp_Object end, Lisp_Object filename,
 	 This is useful in tar-mode.  --Stef
       XSETFASTINT (start, BEG);
       XSETFASTINT (end, Z); */
-      Fwiden ();
+      Fwiden (Qnil);
     }
 
   record_unwind_protect (build_annotations_unwind,
modified   src/lread.c
@@ -1850,7 +1850,7 @@ readevalloop (Lisp_Object readcharfun,
 	  /* Set point and ZV around stuff to be read.  */
 	  Fgoto_char (start);
 	  if (!NILP (end))
-	    Fnarrow_to_region (make_number (BEGV), end);
+	    Fnarrow_to_region (make_number (BEGV), end, Qnil);
 
 	  /* Just for cleanliness, convert END to a marker
 	     if it is an integer.  */
modified   src/process.c
@@ -5514,7 +5514,7 @@ Otherwise it discards the output.  */)
       /* If the output marker is outside of the visible region, save
 	 the restriction and widen.  */
       if (! (BEGV <= PT && PT <= ZV))
-	Fwiden ();
+	Fwiden (Qnil);
 
       /* Adjust the multibyteness of TEXT to that of the buffer.  */
       if (NILP (BVAR (current_buffer, enable_multibyte_characters))
@@ -5558,7 +5558,7 @@ Otherwise it discards the output.  */)
 
       /* If the restriction isn't what it should be, set it.  */
       if (old_begv != BEGV || old_zv != ZV)
-	Fnarrow_to_region (make_number (old_begv), make_number (old_zv));
+	Fnarrow_to_region (make_number (old_begv), make_number (old_zv), Qnil);
 
       bset_read_only (current_buffer, old_read_only);
       SET_PT_BOTH (opoint, opoint_byte);

[back]
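
The doc string and the `hard` branch above can be modelled with a toy
buffer that tracks only positions.  This is a sketch under assumptions,
not the patch's code: `struct toy_buffer`, `toy_init`, `toy_narrow` and
`toy_widen` are hypothetical names, and the behaviour of a hard widen
(resetting both the hard and the visible limits to the whole buffer) is
guessed from the doc string, since that part of the patch is not shown
here.

```c
/* Toy buffer: positions only, no text.  Field names mirror the macros
   used in the patch (BEG/Z, BEGH/ZH, BEGV/ZV); the struct itself is
   hypothetical.  */
struct toy_buffer
{
  int beg, z;       /* whole buffer */
  int begh, zh;     /* hard restriction */
  int begv, zv;     /* visible (visual) restriction */
};

static void
toy_init (struct toy_buffer *b, int beg, int z)
{
  b->beg = b->begh = b->begv = beg;
  b->z = b->zh = b->zv = z;
}

/* Return 0 on success, -1 if START/END are out of range.  */
static int
toy_narrow (struct toy_buffer *b, int start, int end, int hard)
{
  if (!(b->beg <= start && start <= end && end <= b->z))
    return -1;

  if (hard)
    {
      b->begh = start;
      b->zh = end;
      if (b->begv >= start && b->zv <= end)
        return 0;   /* visual narrowing already within hard limits */
    }

  b->begv = start;
  b->zv = end;
  return 0;
}

/* Assumption: a hard widen lifts the hard limits first; either way the
   visible region then grows to the (possibly new) hard limits.  */
static void
toy_widen (struct toy_buffer *b, int hard)
{
  if (hard)
    {
      b->begh = b->beg;
      b->zh = b->z;
    }
  b->begv = b->begh;
  b->zv = b->zh;
}
```

In this model, after a hard narrow to 10..50 and a visual narrow to
20..30, a plain widen only goes back to 10..50; a hard widen is needed
to reach the whole buffer again.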

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-29 21:43                                                               ` Vitalie Spinu
@ 2016-04-22 14:34                                                                 ` Dmitry Gutov
  2016-04-24  7:22                                                                   ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-04-22 14:34 UTC (permalink / raw)
  To: Vitalie Spinu, Stefan Monnier; +Cc: emacs-devel

Hi Vitalie,

On 03/30/2016 12:43 AM, Vitalie Spinu wrote:
> I haven't tested it yet because of the following build problem with loaddefs:

It actually builds fine here. Have you tried 'make bootstrap'?



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-22 14:34                                                                 ` Dmitry Gutov
@ 2016-04-24  7:22                                                                   ` Vitalie Spinu
  2016-04-24  7:28                                                                     ` Achim Gratz
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-04-24  7:22 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel


Hm. I have reported that error with `make bootstrap`.

I got stuck on that silly error and didn't have time to figure it out. Will get
back to it next week after my deadlines are over.

>> On Fri, Apr 22 2016 17:34, Dmitry Gutov wrote:

> Hi Vitalie,

> On 03/30/2016 12:43 AM, Vitalie Spinu wrote:
>> I haven't tested it yet because of the following build problem with loaddefs:

> It actually builds fine here. Have you tried 'make bootstrap'?



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24  7:22                                                                   ` Vitalie Spinu
@ 2016-04-24  7:28                                                                     ` Achim Gratz
  2016-04-24 11:33                                                                       ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Achim Gratz @ 2016-04-24  7:28 UTC (permalink / raw)
  To: emacs-devel

Vitalie Spinu writes:
> Hm. I have reported that error with `make bootstrap`.
>
> I got stuck on that silly error and didn't have time to figure it out. Will get
> back to it next week after my deadlines are over.

Do a 'make extraclean' and try again.


Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24  7:28                                                                     ` Achim Gratz
@ 2016-04-24 11:33                                                                       ` Vitalie Spinu
  2016-04-24 13:20                                                                         ` Andreas Schwab
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-04-24 11:33 UTC (permalink / raw)
  To: Achim Gratz; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 211 bytes --]



>> On Sun, Apr 24 2016 09:28, Achim Gratz wrote:
> Do a 'make extraclean' and try again.

Doesn't help either.


I have narrowed it down to adding an extra arg to primitives. This is how I do
it right now.


[-- Attachment #2: Type: text/x-diff, Size: 1918 bytes --]

5 files changed, 7 insertions(+), 6 deletions(-)
src/buffer.c   | 2 +-
src/bytecode.c | 2 +-
src/editfns.c  | 5 +++--
src/fileio.c   | 2 +-
src/process.c  | 2 +-

modified   src/buffer.c
@@ -2165,7 +2165,7 @@ Any narrowing restriction in effect (see `narrow-to-region') is removed,
 so the buffer is truly empty after this.  */)
   (void)
 {
-  Fwiden ();
+  Fwiden (Qnil);
 
   del_range (BEG, Z);
 
modified   src/bytecode.c
@@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 
 	CASE (Bwiden):
 	  BEFORE_POTENTIAL_GC ();
-	  PUSH (Fwiden ());
+	  TOP = Fwiden (TOP);
 	  AFTER_POTENTIAL_GC ();
 	  NEXT;
 
modified   src/editfns.c
@@ -3483,11 +3483,12 @@ DEFUN ("delete-and-extract-region", Fdelete_and_extract_region,
     return empty_unibyte_string;
   return del_range_1 (XINT (start), XINT (end), 1, 1);
 }
+
 \f
-DEFUN ("widen", Fwiden, Swiden, 0, 0, "",
+DEFUN ("widen", Fwiden, Swiden, 0, 1, "",
        doc: /* Remove restrictions (narrowing) from current buffer.
 This allows the buffer's full text to be seen and edited.  */)
-  (void)
+  (Lisp_Object hard)
 {
   if (BEG != BEGV || Z != ZV)
     current_buffer->clip_changed = 1;
modified   src/fileio.c
@@ -4764,7 +4764,7 @@ write_region (Lisp_Object start, Lisp_Object end, Lisp_Object filename,
 	 This is useful in tar-mode.  --Stef
       XSETFASTINT (start, BEG);
       XSETFASTINT (end, Z); */
-      Fwiden ();
+      Fwiden (Qnil);
     }
 
   record_unwind_protect (build_annotations_unwind,
modified   src/process.c
@@ -5514,7 +5514,7 @@ Otherwise it discards the output.  */)
       /* If the output marker is outside of the visible region, save
 	 the restriction and widen.  */
       if (! (BEGV <= PT && PT <= ZV))
-	Fwiden ();
+	Fwiden (Qnil);
 
       /* Adjust the multibyteness of TEXT to that of the buffer.  */
       if (NILP (BVAR (current_buffer, enable_multibyte_characters))

[-- Attachment #3: Type: text/plain, Size: 999 bytes --]



And this is the error which `make bootstrap` gives:

    make -C ../lisp autoloads EMACS="../src/bootstrap-emacs"
    make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp'
      GEN      calendar/cal-loaddefs.el
    Loading macroexp.elc...
    appt.el:0:0: error: wrong-type-argument: (markerp /home/vspinu/bin/emacs-test/lisp/calendar/appt.el)
    Makefile:406: recipe for target 'calendar/cal-loaddefs.el' failed
    make[3]: *** [calendar/cal-loaddefs.el] Error 255
    make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp'
    Makefile:727: recipe for target '../lisp/loaddefs.el' failed
    make[2]: *** [../lisp/loaddefs.el] Error 2
    make[2]: Leaving directory '/home/vspinu/bin/emacs-test/src'
    Makefile:398: recipe for target 'src' failed
    make[1]: *** [src] Error 2
    make[1]: Leaving directory '/home/vspinu/bin/emacs-test'
    Makefile:1091: recipe for target 'bootstrap' failed
    make: *** [bootstrap] Error 2
    

What am I doing wrong here?

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 11:33                                                                       ` Vitalie Spinu
@ 2016-04-24 13:20                                                                         ` Andreas Schwab
  2016-04-24 16:11                                                                           ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Andreas Schwab @ 2016-04-24 13:20 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel

Vitalie Spinu <spinuvit@gmail.com> writes:

> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
>  
>  	CASE (Bwiden):
>  	  BEFORE_POTENTIAL_GC ();
> -	  PUSH (Fwiden ());
> +	  TOP = Fwiden (TOP);

You are clobbering the stack here.  Instead of pushing a new value you
are overwriting an unrelated value on the stack.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 13:20                                                                         ` Andreas Schwab
@ 2016-04-24 16:11                                                                           ` Vitalie Spinu
  2016-04-24 16:19                                                                             ` Andreas Schwab
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-04-24 16:11 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Achim Gratz, emacs-devel



>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote:

> Vitalie Spinu <spinuvit@gmail.com> writes:

>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
>>  
>>  	CASE (Bwiden):
>>  	  BEFORE_POTENTIAL_GC ();
>> -	  PUSH (Fwiden ());
>> +	  TOP = Fwiden (TOP);

> You are clobbering the stack here.  Instead of pushing a new value you
> are overwriting an unrelated value on the stack.

I don't think so. I pick an argument from the stack and put the return value in
its place. This is what all one-arg functions do in bytecode.c.


  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 16:11                                                                           ` Vitalie Spinu
@ 2016-04-24 16:19                                                                             ` Andreas Schwab
  2016-04-24 16:41                                                                               ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Andreas Schwab @ 2016-04-24 16:19 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel

Vitalie Spinu <spinuvit@gmail.com> writes:

>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote:
>
>> Vitalie Spinu <spinuvit@gmail.com> writes:
>
>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
>>>  
>>>  	CASE (Bwiden):
>>>  	  BEFORE_POTENTIAL_GC ();
>>> -	  PUSH (Fwiden ());
>>> +	  TOP = Fwiden (TOP);
>
>> You are clobbering the stack here.  Instead of pushing a new value you
>> are overwriting an unrelated value on the stack.
>
> I don't think so. I pick an argument from the stack and put the return value in
> its place. This is what all one-arg functions do in bytecode.c.

But nobody is pushing that argument.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 16:19                                                                             ` Andreas Schwab
@ 2016-04-24 16:41                                                                               ` Vitalie Spinu
  2016-04-24 16:48                                                                                 ` Andreas Schwab
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-04-24 16:41 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Achim Gratz, emacs-devel



>> On Sun, Apr 24 2016 18:19, Andreas Schwab wrote:

> Vitalie Spinu <spinuvit@gmail.com> writes:

>>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote:
>>
>>> Vitalie Spinu <spinuvit@gmail.com> writes:
>>
>>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
>>>>  
>>>>  	CASE (Bwiden):
>>>>  	  BEFORE_POTENTIAL_GC ();
>>>> -	  PUSH (Fwiden ());
>>>> +	  TOP = Fwiden (TOP);
>>
>>> You are clobbering the stack here.  Instead of pushing a new value you
>>> are overwriting an unrelated value on the stack.
>>
>> I don't think so. I pick an argument from the stack and put the return value in
>> its place. This is what all one-arg functions do in bytecode.c.

> But nobody is pushing that argument.

I am not pushing anything. I am just overwriting. PUSH in the above diff is a
deleted line.

  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 16:41                                                                               ` Vitalie Spinu
@ 2016-04-24 16:48                                                                                 ` Andreas Schwab
  2016-04-24 18:01                                                                                   ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Andreas Schwab @ 2016-04-24 16:48 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel

Vitalie Spinu <spinuvit@gmail.com> writes:

>>> On Sun, Apr 24 2016 18:19, Andreas Schwab wrote:
>
>> Vitalie Spinu <spinuvit@gmail.com> writes:
>
>>>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote:
>>>
>>>> Vitalie Spinu <spinuvit@gmail.com> writes:
>>>
>>>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
>>>>>  
>>>>>  	CASE (Bwiden):
>>>>>  	  BEFORE_POTENTIAL_GC ();
>>>>> -	  PUSH (Fwiden ());
>>>>> +	  TOP = Fwiden (TOP);
>>>
>>>> You are clobbering the stack here.  Instead of pushing a new value you
>>>> are overwriting an unrelated value on the stack.
>>>
>>> I don't think so. I pick an argument from the stack and put the return value in
>>> its place. This is what all one-arg functions do in bytecode.c.
>
>> But nobody is pushing that argument.
>
> I am not pushing anything.

Exactly.  This opcode takes no argument, and you cannot change that.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 16:48                                                                                 ` Andreas Schwab
@ 2016-04-24 18:01                                                                                   ` Vitalie Spinu
  2016-04-24 19:05                                                                                     ` Andreas Schwab
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-04-24 18:01 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Achim Gratz, emacs-devel



>> On Sun, Apr 24 2016 18:48, Andreas Schwab wrote:

> Vitalie Spinu <spinuvit@gmail.com> writes:

>>>> On Sun, Apr 24 2016 18:19, Andreas Schwab wrote:
>>
>>> Vitalie Spinu <spinuvit@gmail.com> writes:
>>
>>>>>> On Sun, Apr 24 2016 15:20, Andreas Schwab wrote:
>>>>
>>>>> Vitalie Spinu <spinuvit@gmail.com> writes:
>>>>
>>>>>> @@ -1692,7 +1692,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
>>>>>>  
>>>>>>  	CASE (Bwiden):
>>>>>>  	  BEFORE_POTENTIAL_GC ();
>>>>>> -	  PUSH (Fwiden ());
>>>>>> +	  TOP = Fwiden (TOP);
>>>>
>>>>> You are clobbering the stack here.  Instead of pushing a new value you
>>>>> are overwriting an unrelated value on the stack.
>>>>
>>>> I don't think so. I pick an argument from the stack and put the return value in
>>>> its place. This is what all one-arg functions do in bytecode.c.
>>
>>> But nobody is pushing that argument.
>>
>> I am not pushing anything.

> Exactly.  This opcode takes no argument, and you cannot change that.

Could you please elaborate a bit where you think the problem is?

What exactly can I not change? I am changing the number of arguments of widen
(from zero to one) and adjusting the byte code table. Are you saying that's not
the right way to do it? How should I do it then?


  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-24 18:01                                                                                   ` Vitalie Spinu
@ 2016-04-24 19:05                                                                                     ` Andreas Schwab
  0 siblings, 0 replies; 155+ messages in thread
From: Andreas Schwab @ 2016-04-24 19:05 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Achim Gratz, emacs-devel

Vitalie Spinu <spinuvit@gmail.com> writes:

> What exactly can I not change? I am changing the number of arguments of widen
> (from zero to one) and adjusting the byte code table. Are you saying that's not
> the right way to do it? How should I do it then?

All bytecode ops are part of the ABI.  You cannot just change them
without breaking existing elc files.  Either introduce a new op or leave
the one-argument form as is.
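
To make the stack argument concrete, here is a minimal sketch of a toy
stack machine in C.  It is not Emacs's interpreter: the opcode names,
`run`, `stub_widen` and `old_prog` are all made up for illustration.
`old_prog` plays the role of an existing .elc file that was compiled
when the widen op took no argument.

```c
/* Toy stack machine; opcodes and layout are hypothetical and are NOT
   Emacs's real bytecode.  */
enum { OP_CONST, OP_WIDEN, OP_DISCARD, OP_ADD, OP_END };

/* Stand-in for Fwiden: always returns 0 ("nil").  */
static int
stub_widen (void)
{
  return 0;
}

/* "Already compiled" code: computes 40 + 2 and calls widen in between,
   discarding its result.  The leading 1 stands for whatever the caller
   had left on the stack.  */
static const int old_prog[] = {
  OP_CONST, 1,
  OP_CONST, 40,
  OP_CONST, 2,
  OP_WIDEN,
  OP_DISCARD,
  OP_ADD,
  OP_END
};

/* Run CODE and return the value on top of the stack at OP_END.
   WIDEN_TAKES_ARG selects how OP_WIDEN is interpreted:
   0 = original behaviour, push the result;
   1 = changed behaviour, replace the top of the stack.  */
static int
run (const int *code, int widen_takes_arg)
{
  int stack[16];
  int sp = 0;                   /* number of values on the stack */

  for (;;)
    switch (*code++)
      {
      case OP_CONST:
        stack[sp++] = *code++;
        break;
      case OP_WIDEN:
        if (widen_takes_arg)
          stack[sp - 1] = stub_widen ();   /* TOP = Fwiden (TOP) */
        else
          stack[sp++] = stub_widen ();     /* PUSH (Fwiden ()) */
        break;
      case OP_DISCARD:
        sp--;
        break;
      case OP_ADD:
        sp--;
        stack[sp - 1] += stack[sp];
        break;
      case OP_END:
        return stack[sp - 1];
      default:
        return -1;              /* unknown opcode */
      }
}
```

With the original interpretation, `run (old_prog, 0)` yields 42.  With
the changed one, `run (old_prog, 1)` yields 41, because the op
overwrote the 2 that the already-compiled code had pushed for its own
use.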

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 11:57                                               ` Stefan Monnier
  2016-03-22 16:28                                                 ` Vitalie Spinu
@ 2016-04-28 13:29                                                 ` Vitalie Spinu
  2016-04-30 14:06                                                   ` Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-04-28 13:29 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 402 bytes --]



>> On Tue, Mar 22 2016 06:57, Stefan Monnier wrote:

> IIRC past discussions on this issue, one option was to merge your
> set-widen-limits into narrow-to-region by adding an optional argument `hard'.

Stefan, adding an extra argument turned out to be a train wreck. I am afraid
that, if I cannot get help on how to extend primitives, I am giving up at this
point.


Adding a dummy argument to Fbobp like this:


[-- Attachment #2: Type: text/x-diff, Size: 770 bytes --]

2 files changed, 3 insertions(+), 3 deletions(-)
src/bytecode.c | 2 +-
src/editfns.c  | 4 ++--

modified   src/bytecode.c
@@ -1589,7 +1589,7 @@ exec_byte_code (Lisp_Object bytestr, Lisp_Object vector, Lisp_Object maxdepth,
 	  NEXT;
 
 	CASE (Bbobp):
-	  PUSH (Fbobp ());
+	  TOP = Fbobp (TOP);
 	  NEXT;
 
 	CASE (Bcurrent_buffer):
modified   src/editfns.c
@@ -1164,10 +1164,10 @@ At the beginning of the buffer or accessible region, return 0.  */)
   return temp;
 }
 
-DEFUN ("bobp", Fbobp, Sbobp, 0, 0, 0,
+DEFUN ("bobp", Fbobp, Sbobp, 0, 1, 0,
        doc: /* Return t if point is at the beginning of the buffer.
 If the buffer is narrowed, this means the beginning of the narrowed part.  */)
-  (void)
+  (Lisp_Object dummy)
 {
   if (PT == BEGV)
     return Qt;


[-- Attachment #3: Type: text/plain, Size: 1605 bytes --]


then

  make extraclean && git clean -f && make bootstrap

gives "Wrong type argument" during byte compilation:

    make[3]: Entering directory '/home/vspinu/bin/emacs-test/lisp'
      ELC      ../lisp/international/eucjp-ms.elc
    Reloading stale loaddefs.el
    Loading /home/vspinu/bin/emacs-test/lisp/loaddefs.el (source)...
    make[3]: Leaving directory '/home/vspinu/bin/emacs-test/lisp'
    make -C ../admin/unidata all EMACS="../../src/bootstrap-emacs"
    make[3]: Entering directory '/home/vspinu/bin/emacs-test/admin/unidata'
      GEN      ../../src/macuvs.h
      GEN      ../../lisp/international/charprop.el
    Wrong type argument: char-or-string-p, #<EMACS BUG: INVALID DATATYPE (MISC 0x48ff) Save your buffers immediately and please report this bug>
    Makefile:87: recipe for target '../../lisp/international/charprop.el' failed
    make[3]: *** [../../lisp/international/charprop.el] Error 255
    make[3]: Leaving directory '/home/vspinu/bin/emacs-test/admin/unidata'
    Makefile:498: recipe for target '../lisp/international/charprop.el' failed
    make[2]: *** [../lisp/international/charprop.el] Error 2
    make[2]: Leaving directory '/home/vspinu/bin/emacs-test/src'
    Makefile:398: recipe for target 'src' failed
    make[1]: *** [src] Error 2
    make[1]: Leaving directory '/home/vspinu/bin/emacs-test'
    GNUmakefile:79: recipe for target 'bootstrap' failed


I am defining Bbobp, Fbobp, Sbobp just as it's done with any other primitive
with an optional argument. `Fbobp` is never used at C level, so the above diff
is complete and could be run as it is.

 Vitalie


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-04-28 13:29                                                 ` Vitalie Spinu
@ 2016-04-30 14:06                                                   ` Stefan Monnier
  0 siblings, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-04-30 14:06 UTC (permalink / raw)
  To: emacs-devel

> Stefan, adding an extra argument turned out to be a train wreck. I am afraid
> that, if I cannot get help on how to extend primitives, I am giving up at this
> point.

Indeed, I didn't consider the fact that some (most?) of those functions
are also implemented as bytecode.  Hmm...

>  	CASE (Bbobp):
> -	  PUSH (Fbobp ());
> +	  TOP = Fbobp (TOP);
>  	  NEXT;

You can't change the bytecode's behavior, since existing .elc files would
otherwise break down completely: the stack would be modified in a way
they don't expect.  [ For new .elc files you could make it work by
changing bytecomp.el to update the byte compiler's understanding of how
the bytecode works.  ]

So for functions that have a corresponding bytecode, you can either stop
using the bytecode when the new arg is used (this requires changing
bytecomp.el accordingly, e.g. by removing the corresponding code such as
"(byte-defop 125 -1 byte-narrow-to-region)"), or you leave the function
unchanged and introduce another function instead.

Having a bytecode for `narrow-to-region` is not very useful (the
main/only benefit is speed of executing narrow-to-region, but the
difference shouldn't be significant) so it'd be perfectly OK to stop
using this bytecode.  For `bobp` I'm not completely sure of the
potential performance impact, OTOH.
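
The first option (keep the dedicated bytecode only for the old calling
convention) can be sketched as a lookup a compiler might do before
emitting code.  This is a toy model in C, not bytecomp.el: `op_table`,
`choose_emit` and the argument counts listed in the table are
illustrative assumptions.

```c
#include <string.h>

/* What a toy compiler could emit for a call.  */
enum emit_kind { EMIT_DEDICATED_OP, EMIT_GENERIC_CALL };

/* Primitives that have a dedicated opcode, together with the number of
   arguments that opcode was frozen with.  Entries are illustrative.  */
struct op_entry
{
  const char *name;
  int op_nargs;
};

static const struct op_entry op_table[] = {
  { "bobp", 0 },
  { "widen", 0 },
  { "narrow-to-region", 2 },
};

/* Use the dedicated opcode only when the call has exactly the argument
   count the opcode expects; otherwise fall back to a generic function
   call, which leaves the opcode's stack effect untouched.  */
static enum emit_kind
choose_emit (const char *fn, int nargs)
{
  size_t i;

  for (i = 0; i < sizeof op_table / sizeof op_table[0]; i++)
    if (strcmp (op_table[i].name, fn) == 0
        && op_table[i].op_nargs == nargs)
      return EMIT_DEDICATED_OP;

  return EMIT_GENERIC_CALL;
}
```

Under this scheme a call to `widen` with no arguments still uses the
old op, so existing .elc files keep working, while a call with the new
argument goes through the generic call path.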

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 10:05                                             ` Vitalie Spinu
  2016-03-22 11:57                                               ` Stefan Monnier
@ 2016-03-22 20:08                                               ` Richard Stallman
  2016-03-22 22:45                                                 ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Richard Stallman @ 2016-03-22 20:08 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: monnier, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > > So the consumer will need to set this variable and always follow it by widen.

In the context of Emacs -- or software, generally -- what does
"consumer" mean?  One of the nice things about installing a program
in your computer is that running it does not use it up.

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]]
  2016-03-22 20:08                                               ` Richard Stallman
@ 2016-03-22 22:45                                                 ` Vitalie Spinu
  0 siblings, 0 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-22 22:45 UTC (permalink / raw)
  To: Richard Stallman; +Cc: monnier, emacs-devel



>> On Tue, Mar 22 2016 16:08, Richard Stallman wrote:

> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]

>   > > So the consumer will need to set this variable and always follow it by widen.

> In the context of Emacs -- or software, generally -- what does
> "consumer" mean?  One of the nice things about installing a program
> in your computer is that running it does not use it up.

By "consumer" of a function or variable I meant any elisp code that uses (aka
consumes) that function or variable.

  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21  1:05                       ` Vitalie Spinu
  2016-03-21  3:11                         ` Stefan Monnier
  2016-03-21  5:08                         ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu
@ 2016-03-21 11:47                         ` Dmitry Gutov
  2016-03-21 12:40                           ` Vitalie Spinu
  2 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 11:47 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

On 03/21/2016 03:05 AM, Vitalie Spinu wrote:
>
>>> On Sun, Mar 20 2016 17:58, Dmitry Gutov wrote:
>
>> On 03/20/2016 02:15 PM, Vitalie Spinu wrote:
>
>> IIRC, using first-column is fairly justified, the outer mode can't add extra
>> indentation to the submode in a reliable, sane way
>
> The inner mode cannot often make that decision either.

What decision? One case where the mode cannot return its proposed 
indentation at all, is when the resulting column would be negative. 
Using first-column can make it positive again via simple addition.

Using calculate-indent-function, which returns a numeric value, would
solve that as well, of course, at the expense of having to update all 
major modes out there, and documentation. And making whatever 
third-party guides are out there obsolete in this regard. I'm not really 
against that, mind you.

> Same inner mode can be
> used in very different multi-mode contexts, each with their own semantics for
> chunks/headers/indentation. Reducing all that to a simple (first-column
> . previous-chunk) pair and letting inner mode do the job is surely not
> enough. The only actor to make that decision should be multi-mode engine itself.

I'm not claiming that using previous-chunk is good.

> Instead of teaching modes about multi-modes, a much better idea is to introduce
> `calculate-indent-function` which would accept POS and optional STRING-AFTER and
> STRING-BEFORE. This function will return the indentation of STRING-AFTER at POS
> assuming there is a virtual STRING-BEFORE just before POS.

Strings? Indentation engines do not deal with strings, they deal with 
buffer contents. Having them handle this possibility would also amount 
to sharing a part of multi-mode logic.

Instead, if you want to know what indentation an inner mode would return 
if STRING-BEFORE was before it, insert that string into the buffer 
(while inhibiting undo history). Call the indentation function, then 
remove the string. Any performance concerns with that?

> Most modes indent reliably
> based on one previous line,

Ruby doesn't. Most modes based on SMIE will need more than the previous 
line in the general case, too.

> Then a lot of modes don't even care about what's in the current line, so
> STRING-AFTER will be irrelevant as well.

Almost all of them care whether the current line contains }, or `end', 
or `else', and so on.

>>> It's essentially a half-baked implementation of "hard widening" discussed
>>> earlier. Why not impose the widening restriction directly in `widen` then?
>>> Maybe bring widen to elisp and rename C widen into widen-internal. Then add
>>> generic `prog-hard-widen-limits` which would be checked along
>>> prog-indentation-context limits.
>
>> Right! At the very least, we should extract the second element of
>> prog-indentation-context into a separate variable, and make prog-widen more
>> prominent.
>
> Not sure about removing second element. Good thing about keeping all of them in
> one place is for the indentation engine to be concerned with a single variable.

Didn't you mention font-lock and syntax-propertize yourself? Why would 
they call a function that's solely dependent on an indentation variable?

In any case, your hard-narrowing proposal is very similar. Surely you 
don't want to keep the second element of prog-indentation-context after 
hard-narrowing becomes available?

> Only consumers of `hard-widen-limits` should be concerned with its side
> effects. But that's uniformly better than current situation when you cannot do
> much about restricting widen.

OK, so *every* consumer of widen will have to obey the hard limits. That 
might work, if there's no low-level code that absolutely has to always 
be able to widen to the whole buffer.

> BTW, I think parse-partial-sexp must abide by hard-widen-limits as well.

If you want parse-partial-sexp to obey limits, you narrow the buffer 
around it.

> This way the
> request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be
> automatically satisfied. You won't need syntax-ppss-dont-widen either.

That doesn't seem relevant. That bug is about stale cache values between 
different narrowing bounds.

> A patch that would require hunting every single mode out there and implementing
> multi-modes locally should have been more carefully considered IMO. Emacs 25 is
> not yet there, so it's not late to reconsider that decision.

I concur.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 11:47                         ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov
@ 2016-03-21 12:40                           ` Vitalie Spinu
  2016-03-21 13:07                             ` Dmitry Gutov
  2016-03-21 14:02                             ` Stefan Monnier
  0 siblings, 2 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 12:40 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

>> On Mon, Mar 21 2016 13:47, Dmitry Gutov wrote:

> On 03/21/2016 03:05 AM, Vitalie Spinu wrote:
>>
>>>> On Sun, Mar 20 2016 17:58, Dmitry Gutov wrote:
>>
>>
>> The inner mode cannot often make that decision either.

> What decision? 

Decision of how much to indent. Inner mode just doesn't have a complete picture
of what is going on. Just having access to previous chunk is not enough.

Note that I don't mind FIRST-COLUMN functionality. I think it's harmless and
probably useful. I mostly mind the last two arguments of
prog-indentation-context.

> I'm not claiming that using previous-chunk is good.

Good ;)

>> Instead of teaching modes about multi-modes, a much better idea is to introduce
>> `calculate-indent-function` which would accept POS and optional STRING-AFTER and
>> STRING-BEFORE. This function will return the indentation of STRING-AFTER at POS
>> assuming there is a virtual STRING-BEFORE just before POS.

> Strings? Indentation engines do not deal with strings, they deal with buffer
> contents. Having them handle this possibility would also amount to sharing a
> part of multi-mode logic.

Yeh. That's the sucky part. My hope is that BEFORE-STRING will be seldom
used. Given that this case applies only to continuation chunks and assuming that
multi-mode engine can identify those (at least at multi-mode level) this is a
reasonable trade off IMO. In polymode I haven't even got down to indentation of
continuation chunks yet. They are not that common in literate programming.

Performance is not a primary concern for indentation. Correctness and conceptual
cleanness are at much higher stake here. My hope is that generic helper
functions can be optimized to re-use same temp buffer for multiple invocations
of calculate-indent-function.

>> Then a lot of modes don't even care about what's in the current line, so
>> STRING-AFTER will be irrelevant as well.

> Almost all of them care whether the current line contains }, or `end', or
> `else', and so on.

Indeed. But this information is trivial to retrieve from STRING-AFTER.

> In any case, your hard-narrowing proposal is very similar. Surely you don't want
> to keep the second element of prog-indentation-context after hard-narrowing
> becomes available?

Indeed. I was not thinking about algorithmic complexities.

AFAIK if second element is removed, the third one should go as well. That leaves
only FIRST-COLUMN then, which I personally don't mind.

>> Only consumers of `hard-widen-limits` should be concerned with its side
>> effects. But that's uniformly better than current situation when you cannot do
>> much about restricting widen.

> OK, so *every* consumer of widen will have to obey the hard limits. That might
> work, if there's no low-level code that absolutely has to always be able to
> widen to the whole buffer.

I think as long as low level code uses BEGV and ZV instead of BEG and Z
everything should be fine. That is with an implicit assumption that hard limits
are always wider than the current visual narrowing which is a reasonable
contract IMO.

Even better, as long as low level routines use BEG and Z consistently (and it
looks like they do) BEG and Z can be modified to take care of
hard-widen-limits. This might be the easiest solution. In any case going through
all C code and checking usage of widen is not such an insurmountable task.

>> BTW, I think parse-partial-sexp must abide by hard-widen-limits as well.

> If you want parse-partial-sexp to obey limits, you narrow the buffer around it.

>> This way the
>> request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be
>> automatically satisfied. You won't need syntax-ppss-dont-widen either.

> That doesn't seem relevant. That bug is about stale cache values between
> different narrowing bounds.

Right. Those stale values won't occur in multi-modes because both syntax-ppss
and parse-partial-sexp will always operate on same hard-narrowed regions.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 12:40                           ` Vitalie Spinu
@ 2016-03-21 13:07                             ` Dmitry Gutov
  2016-03-21 14:20                               ` Vitalie Spinu
  2016-03-21 14:02                             ` Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 13:07 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

On 03/21/2016 02:40 PM, Vitalie Spinu wrote:

>> Strings? Indentation engines do not deal with strings, they deal with buffer
>> contents. Having them handle this possibility would also amount to sharing a
>> part of multi-mode logic.
>
> Yeh. That's the sucky part. My hope is that BEFORE-STRING will be seldom
> used.

Then let's not add that to the API until we see a concrete need for it.

> Performance is not a primary concern for indentation. Correctness and conceptual
> cleanness is at a much higher stake here. My hope is that generic helper
> functions can be optimized to re-use same temp buffer for multiple invocations
> of calculate-indent-function.

So, how about trying my alternative proposal first?

>>> Then a lot of modes don't even care about what's in the current line, so
>>> STRING-AFTER will be irrelevant as well.
>
>> Almost all of them care whether the current line contains }, or `end', or
>> `else', and so on.
>
> Indeed. But this information is trivial to retrieve from STRING-AFTER.

Feeding it to each particular indentation engine is not going to be trivial.

>> In any case, your hard-narrowing proposal is very similar. Surely you don't want
>> to keep the second element of prog-indentation-context after hard-narrowing
>> becomes available?
>
> Indeed. I was not thinking about algorithmic complexities.
>
> AFAIK if second element is removed, the third one should go as well. That leaves
> only FIRST-COLUMN then, which I personally don't mind.

OK. And that one could be replaced with the introduction of 
prog-indentation-function. Though that might be getting ahead of ourselves.

>>> This way the
>>> request aired in bug#22983 of parse-partial-sexp == syntax-ppss will be
>>> automatically satisfied. You won't need syntax-ppss-dont-widen either.
>
>> That doesn't seem relevant. That bug is about stale cache values between
>> different narrowing bounds.
>
> Right. Those stale values won't occur in multi-modes because both syntax-ppss
> and parse-partial-sexp will always operate on same hard-narrowed regions.

We could only be sure of that for syntax-ppss calls in facilities that 
the multi-mode handles specially, like font-lock, syntax-propertize and 
indentation. Not so with any other functions the user could call.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 13:07                             ` Dmitry Gutov
@ 2016-03-21 14:20                               ` Vitalie Spinu
  2016-03-21 14:29                                 ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 14:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel



>> On Mon, Mar 21 2016 15:07, Dmitry Gutov wrote:

> On 03/21/2016 02:40 PM, Vitalie Spinu wrote:

>>> Strings? Indentation engines do not deal with strings, they deal with buffer
>>> contents. Having them handle this possibility would also amount to sharing a
>>> part of multi-mode logic.
>>
>> Yeh. That's the sucky part. My hope is that BEFORE-STRING will be seldom
>> used.

> Then let's not add that to the API until we see a concrete need for it.

It might be good to not include these (prog-indentation-context included) in
emacs 25 release.

>> Performance is not a primary concern for indentation. Correctness and conceptual
>> cleanness is at a much higher stake here. My hope is that generic helper
>> functions can be optimized to re-use same temp buffer for multiple invocations
>> of calculate-indent-function.

> So, how about trying my alternative proposal first?

Sorry. What proposal do you mean?

>> Right. Those stale values won't occur in multi-modes because both syntax-ppss
>> and parse-partial-sexp will always operate on same hard-narrowed regions.

> We could only be sure of that for syntax-ppss calls in facilities that the
> multi-mode handles specially, like font-lock, syntax-propertize and
> indentation. Not so with any other functions the user could call.

I assume that multi-mode engine is advising syntax-ppss which I think it should.

  Vitalie



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:20                               ` Vitalie Spinu
@ 2016-03-21 14:29                                 ` Dmitry Gutov
  2016-03-21 14:42                                   ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 14:29 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

On 03/21/2016 04:20 PM, Vitalie Spinu wrote:

> It might be good to not include these (prog-indentation-context included) in
> emacs 25 release.

Of course, none of them. But nor should we put BEFORE-STRING into master 
until we understand that we really need it, and how to use it.

>>> Performance is not a primary concern for indentation. Correctness and conceptual
>>> cleanness is at a much higher stake here. My hope is that generic helper
>>> functions can be optimized to re-use same temp buffer for multiple invocations
>>> of calculate-indent-function.
>
>> So, how about trying my alternative proposal first?
>
> Sorry. What proposal do you mean?

"""
Instead, if you want to know what indentation an inner mode would return 
if STRING-BEFORE was before it, insert that string into the buffer 
(while inhibiting undo history). Call the indentation function, then 
remove the string.
"""

Same with AFTER-STRING. The multi-mode package itself would do that.

>>> Right. Those stale values won't occur in multi-modes because both syntax-ppss
>>> and parse-partial-sexp will always operate on same hard-narrowed regions.
>
>> We could only be sure of that for syntax-ppss calls in facilities that the
>> multi-mode handles specially, like font-lock, syntax-propertize and
>> indentation. Not so with any other functions the user could call.
>
> I assume that multi-mode engine is advising syntax-ppss which I think it should.

Very well, that's an option. Having syntax-ppss-dont-widen (or making 
syntax-ppss respect hard-widen-limits) should be sufficient for it.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:29                                 ` Dmitry Gutov
@ 2016-03-21 14:42                                   ` Vitalie Spinu
  2016-03-21 14:56                                     ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 14:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

>> On Mon, Mar 21 2016 16:29, Dmitry Gutov wrote:

>> Sorry. What proposal do you mean?

> """
> Instead, if you want to know what indentation an inner mode would return if
> STRING-BEFORE was before it, insert that string into the buffer (while
> inhibiting undo history). Call the indentation function, then remove the string.
> """

Inner mode might decide to operate on string directly, or put stuff in a temp
buffer, work on last line only, or simply ignore it. Why hard-wire the usage
of STRING-BEFORE so badly?

My gut feeling is to avoid modifying buffer context in indentation engine at all
costs. In the future, if performance with temp buffers will be a real issue, we
can add more low level functions for fast operation on string to do some common
parsing tasks. We can even extend parse-ppss to deal with BEFORE-STRING.

  Vitalie

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:42                                   ` Vitalie Spinu
@ 2016-03-21 14:56                                     ` Dmitry Gutov
  2016-03-21 16:52                                       ` Vitalie Spinu
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 14:56 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

On 03/21/2016 04:42 PM, Vitalie Spinu wrote:

>> """
>> Instead, if you want to know what indentation an inner mode would return if
>> STRING-BEFORE was before it, insert that string into the buffer (while
>> inhibiting undo history). Call the indentation function, then remove the string.
>> """
>
> Inner mode might decide to operate on string directly, or put stuff in a temp
> buffer, work on last line only, or simply ignore it.

Yes, each major mode would have to make all of these choices.

Why burden them with that concern? Wouldn't that become a part of the 
same problem that you yourself mentioned, "teaching modes about 
multi-modes"?

> Why to hard-wire the usage
> of STRING-BEFORE so badly?

What hard-wiring?

STRING-BEFORE is not a tangible part of my proposal. There's no API 
change tied to it.

> My gut feeling is to avoid modifying buffer context in indentation engine at all
> costs.

Why? That's worked out okay for me.

Alternatively, you can create a temp buffer each time, compose pieces of 
inner mode text in it, and call the indentation function. Again, in 
multi-mode code.
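
A rough sketch of that temp-buffer variant, just to show the shape. The
helper name and its arguments are made up for illustration; nothing
like it exists in Emacs or in any multi-mode package today:

```elisp
;; Hypothetical helper for illustration only; the name and the argument
;; list are assumptions, not existing Emacs or multi-mode API.
(defun my-chunk-indentation (mode text-before line)
  "Return the column MODE's indentation code picks for LINE after TEXT-BEFORE.
The pieces are composed in a temporary buffer, so the real buffer and
its undo history are never touched."
  (with-temp-buffer
    ;; Set up the inner major mode without running user mode hooks.
    (delay-mode-hooks (funcall mode))
    (insert text-before)
    ;; LINE has to start on a line of its own.
    (unless (bolp) (insert "\n"))
    (insert line)
    ;; Point is now at the end of LINE; indent that line the usual way.
    (indent-according-to-mode)
    (current-indentation)))
```

A multi-mode package could call it as (my-chunk-indentation
#'emacs-lisp-mode "(foo" "bar)") and apply the returned column to the
real line with `indent-line-to'. Setting up the major mode on every call
is the obvious cost; reusing one buffer per inner mode would avoid it.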

> In the future, if performance with temp buffers will be a real issue, we
> can add more low level functions for fast operation on string to do some common
> parsing tasks. We can even extend parse-ppss to deal with BEFORE-STRING.

Performance is a distant concern, complexity is the immediate one. If 
modifying buffers turns out to be a problem, then we can do all the 
stuff you mention above.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:56                                     ` Dmitry Gutov
@ 2016-03-21 16:52                                       ` Vitalie Spinu
  2016-03-21 21:30                                         ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 16:52 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel


Ok, so the alternative proposal is not to do anything. I like that. The only
reason to have STRING-AFTER and STRING-BEFORE is potential mode specific
optimization. If that's not a concern, no need for that.

  Vitalie

>> On Mon, Mar 21 2016 16:56, Dmitry Gutov wrote:

> On 03/21/2016 04:42 PM, Vitalie Spinu wrote:

>>> """
>>> Instead, if you want to know what indentation an inner mode would return if
>>> STRING-BEFORE was before it, insert that string into the buffer (while
>>> inhibiting undo history). Call the indentation function, then remove the string.
>>> """
>>
>> Inner mode might decide to operate on string directly, or put stuff in a temp
>> buffer, work on last line only, or simply ignore it.

> Yes, each major mode would have to make all of these choices.

> Why burden them with that concern? Wouldn't that become a part of the same
> problem that you yourself mentioned, "teaching modes about multi-modes"?

>> Why to hard-wire the usage
>> of STRING-BEFORE so badly?

> What hard-wiring?

> STRING-BEFORE is not a tangible part of my proposal. There's no API change tied
> to it.

>> My gut feeling is to avoid modifying buffer context in indentation engine at all
>> costs.

> Why? That's worked out okay for me.

> Alternatively, you can create a temp buffer each time, compose pieces of inner
> mode text in it, and call the indentation function. Again, in multi-mode code.

>> In the future, if performance with temp buffers will be a real issue, we
>> can add more low level functions for fast operation on string to do some common
>> parsing tasks. We can even extend parse-ppss to deal with BEFORE-STRING.

> Performance is a distant concern, complexity is the immediate one. If modifying
> buffers turns out to be a problem, then we can do all the stuff you mention
> above.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 16:52                                       ` Vitalie Spinu
@ 2016-03-21 21:30                                         ` Dmitry Gutov
  2016-04-03 23:34                                           ` John Wiegley
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-21 21:30 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, Stefan Monnier, emacs-devel

On 03/21/2016 06:52 PM, Vitalie Spinu wrote:
>
> Ok, so the alternative proposal is not to do anything. I like that.

Rather, wait and see, instead of hurrying to put those into the API.

> The only
> reason to have STRING-AFTER and STRING-BEFORE is potential mode specific
> optimization. If that's not a concern, no need for that.

Performance may be a concern, but we don't know that yet. As long as 
they're not required for correctness, let's not get ahead of ourselves.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 21:30                                         ` Dmitry Gutov
@ 2016-04-03 23:34                                           ` John Wiegley
  0 siblings, 0 replies; 155+ messages in thread
From: John Wiegley @ 2016-04-03 23:34 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Vitalie Spinu, Stefan Monnier, emacs-devel

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> Performance may be a concern, but we don't know that yet. As long as they're
> not required for correctness, let's not get ahead of ourselves.

Yes, much agreed.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 12:40                           ` Vitalie Spinu
  2016-03-21 13:07                             ` Dmitry Gutov
@ 2016-03-21 14:02                             ` Stefan Monnier
  2016-03-21 14:31                               ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 14:02 UTC (permalink / raw)
  To: emacs-devel

> Note that I don't mind FIRST-COLUMN functionality. I think it's harmless and
> probably useful. I mostly mind the last two arguments of
> prog-indentation-context.

OK, so you're OK with FIRST-COLUMN.  The last two args are:
- (START . END), which you actually do want, except you want to store it
  in hard-widen-limit.  I'm OK with storing it elsewhere.
- PREVIOUS-CHUNKS.  It can be a string, in which case it's just like your
  STRING-BEFORE.  So your main issues with it are either that you don't
  want to allow it to be a function, or that you want to store/pass it in
  a different way, right?

>> Almost all of them care whether the current line contains }, or `end', or
>> `else', and so on.
> Indeed. But this information is trivial to retrieve from STRING-AFTER.

In the case of SMIE, it would probably not be too difficult to adjust it
so it can work with STRING-AFTER, tho I definitely wouldn't call it
trivial to implement the case of "END END END aligns with the matching
outer BEGIN" which is currently supported (and was default until 24.5 or
so).

But I must say that I don't understand why you need this
STRING-AFTER thingy.  Isn't that text already right there in the buffer?

E.g. in prog-indentation-context, we do have something equivalent to
hard-widen-limit and to STRING-BEFORE but we have nothing like
STRING-AFTER: the indentation code is expected to get that info by
looking at the buffer after point.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:02                             ` Stefan Monnier
@ 2016-03-21 14:31                               ` Vitalie Spinu
  2016-03-21 15:06                                 ` Stefan Monnier
  0 siblings, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-21 14:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel


I have elaborated on all these in my other long reply. I would just conclude
here that because both calculate-indent-function and prog-indentation-context
try to solve same problem, they are bound to have overlapping parts. It's just
that calculate-indent-function is more general, easier to understand for prog
authors and it solves three problems at once - replacement of
indent-line-function, removing extra prog-indentation-context/prog-widen and not
exposing multi-mode complexities.

Note also that the intention of the `hard-widen-limit` is to make it work
seamlessly for all existing code that uses widen, whereas prog-indentation-context
requires teaching every mode out there to use prog-widen instead of widen. This
doesn't sound right at all.

  Vitalie


>> On Mon, Mar 21 2016 10:02, Stefan Monnier wrote:

>> Note that I don't mind FIRST-COLUMN functionality. I think it's harmless and
>> probably useful. I mostly mind the last two arguments of
>> prog-indentation-context.

> OK, so you're OK with FIRST-COLUMN.  The last two args are:
> - (START . END), which you actually do want, except you want to store it
>   in hard-widen-limit.  I'm OK with storing it elsewhere.
> - PREVIOUS-CHUNKS.  It can be a string, in which case it's just like your
>   STRING-BEFORE.  So your main issues with it are either that you don't
>   want to allow it to be a function, or that you want to store/pass it in
>   a different way, right?

>>> Almost all of them care whether the current line contains }, or `end', or
>>> `else', and so on.
>> Indeed. But this information is trivial to retrieve from STRING-AFTER.

> In the case of SMIE, it would probably not be too difficult to adjust it
> so it can work with STRING-AFTER, tho I definitely wouldn't call it
> trivial to implement the case of "END END END aligns with the matching
> outer BEGIN" which is currently supported (and was default until 24.5 or
> so).

> But I must say that I don't understand why you need this
> STRING-AFTER thingy.  Isn't that text already right there in the buffer?

> E.g. in prog-indentation-context, we do have something equivalent to
> hard-widen-limit and to STRING-BEFORE but we have nothing like
> STRING-AFTER: the indentation code is expected to get that info by
> looking at the buffer after point.

>         Stefan



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 14:31                               ` Vitalie Spinu
@ 2016-03-21 15:06                                 ` Stefan Monnier
  2016-03-21 17:15                                   ` Andreas Röhler
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2016-03-21 15:06 UTC (permalink / raw)
  To: emacs-devel

> I have elaborated on all these in my other long reply. I would just
> conclude here that because both calculate-indent-function and
> prog-indentation-context try to solve same problem, they are bound to
> have overlapping parts. It's just that calculate-indent-function is
> more general, easier to understand for prog authors and it solves
> three problems at once - replacement of indent-line-function, removing
> extra prog-indentation-context/prog-widen and not exposing
> multi-mode complexities.

STRING-BEFORE/STRING-AFTER/PREVIOUS-CHUNKS look like the main complexity
(from smie.el's point of view, they all seem pretty painful to support).

> Note also that the intention of the `hard-widen-limit` is to make it
> work seamlessly for all existing code that use widen.  While
> prog-indentation-context requires to teach every mode out there to use
> prog-widen instead of widen. This doesn't sound right at all.

The reason is that your suggestion risks breaking code since it changes
the behavior of `widen'.

Maybe the breakage would be extremely limited or even not exist at all,
in which case the tradeoff is probably worth it.  My gut feeling is that
it's too risky, but that's just my gut feeling.

Note also that most modes don't bother using widen, and search&replace
is pretty easy to do.  But if my fear is unfounded, then indeed it's
better to just change `widen' directly.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]
  2016-03-21 15:06                                 ` Stefan Monnier
@ 2016-03-21 17:15                                   ` Andreas Röhler
  0 siblings, 0 replies; 155+ messages in thread
From: Andreas Röhler @ 2016-03-21 17:15 UTC (permalink / raw)
  To: emacs-devel



On 21.03.2016 16:06, Stefan Monnier wrote:
>> I have elaborated on all these in my other long reply. I would just
>> conclude here that because both calculate-indent-function and
>> prog-indentation-context try to solve same problem, they are bound to
>> have overlapping parts. It's just that calculate-indent-function is
>> more general, easier to understand for prog authors and it solves
>> three problems at once - replacement of indent-line-function, removing
>> extra prog-indentation-context/prog-widen and not exposing
>> multi-mode complexities.
> STRING-BEFORE/STRING-AFTER/PREVIOUS-CHUNKS look like the main complexity
> (from smie.el's point of view, they all seem pretty painful to support).

Would expect that.


>
>> Note also that the intention of the `hard-widen-limit` is to make it
>> work seamlessly for all existing code that use widen.  While
>> prog-indentation-context requires to teach every mode out there to use
>> prog-widen instead of widen. This doesn't sound right at all.
> The reason is that your suggestion risks breaking code since it changes
> the behavior of `widen'.
>
> Maybe the breakage would be extremely limited or even not exist at all,
> in which case the tradeoff is probably worth it.  My gut feeling is that
> it's too risky, but that's just my gut feeling.
>
> Note also that most modes don't bother using widen, and search&replace
> is pretty easy to do.  But if my fear is unfounded, then indeed it's
> better to just change `widen' directly.
>
>
>          Stefan
>
>

What about going back and reflecting from the starting point: wouldn't a
well-defined result of narrowing be best, without changing code in core
at all?

When narrowing starts in the middle of a sexp, its beginning is missing, okay.
Where is the problem?
If this wasn't wanted, the narrowing was wrong. No need to fix this on
the Emacs side.

Sorry, should I not have understood what's at stake...








^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 21:24   ` Alan Mackenzie
  2016-03-11 21:35     ` Dmitry Gutov
@ 2016-03-13 17:32     ` Stefan Monnier
  1 sibling, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2016-03-13 17:32 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

> Er no, I meant what I wrote: the result of (syntax-ppss pos) must match
> that of (parse-partial-sexp (point-min) pos).

That's what the docstring says, but is it the result you're looking for?


        Stefan





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
  2016-03-11 20:31 ` Dmitry Gutov
@ 2016-03-13 18:52 ` Andreas Röhler
  2016-03-13 18:56   ` Dmitry Gutov
  2016-03-18  0:49 ` Dmitry Gutov
       [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
  3 siblings, 1 reply; 155+ messages in thread
From: Andreas Röhler @ 2016-03-13 18:52 UTC (permalink / raw)
  To: 22983

[-- Attachment #1: Type: text/plain, Size: 532 bytes --]



On 11.03.2016 16:15, Alan Mackenzie wrote:
> Hello, Emacs.
>
> The fundamental contract in syntax-ppss is that (syntax-ppss POS)
> returns the same value as (parse-partial-sexp (point-min) POS) (with the
> exception of elements 2 and 6).  This is currently not always the case.
>
> In the master branch, emacs -Q and visit xdisp.c with C-x C-f.  Follow
> this recipe:
>
>      M-: (syntax-ppss-flush-cache 1)
>      M-: (setq ppss-0 (syntax-ppss 40000))

(setq ppss-0 (syntax-ppss 40000))

moved point - see attachment. Should it?

[-- Attachment #2: moves-point.png --]
[-- Type: image/png, Size: 125904 bytes --]

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-13 18:52 ` Andreas Röhler
@ 2016-03-13 18:56   ` Dmitry Gutov
  0 siblings, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-13 18:56 UTC (permalink / raw)
  To: Andreas Röhler, 22983

On 03/13/2016 08:52 PM, Andreas Röhler wrote:

> (setq ppss-0 (syntax-ppss 40000))
>
> moved point - see attachment. Should it?

Yes. See the last sentence in its docstring.





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
  2016-03-11 20:31 ` Dmitry Gutov
  2016-03-13 18:52 ` Andreas Röhler
@ 2016-03-18  0:49 ` Dmitry Gutov
  2016-03-19 12:27   ` Alan Mackenzie
  2016-03-19 23:00   ` Vitalie Spinu
       [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
  3 siblings, 2 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-18  0:49 UTC (permalink / raw)
  To: Alan Mackenzie, 22983

On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

This patch should make ppss-0 and ppss-1 match:

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index e20a210..c1b9d84 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -371,6 +371,11 @@ syntax-ppss-max-span
  We try to make sure that cache entries are at least this far apart
  from each other, to avoid keeping too much useless info.")

+(defvar syntax-ppss-dont-widen nil
+  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
+The code that uses this should create local bindings for
+`syntax-ppss-cache' and `syntax-ppss-last' too.")
+
  (defvar syntax-begin-function nil
    "Function to move back outside of any comment/string/paren.
  This function should move the cursor back to some syntactically safe
@@ -423,12 +428,21 @@ syntax-ppss
  in the returned list (counting from 0) cannot be relied upon.
  Point is at POS when this function returns.

+If `syntax-ppss-dont-widen' is nil, the buffer is temporarily
+widened.
+
  It is necessary to call `syntax-ppss-flush-cache' explicitly if
  this function is called while `before-change-functions' is
  temporarily let-bound, or if the buffer is modified without
  running the hook."
    ;; Default values.
    (unless pos (setq pos (point)))
+  (save-restriction
+    (unless syntax-ppss-dont-widen
+      (widen))
+    (syntax-ppss--at pos)))
+
+(defun syntax-ppss--at (pos)
    (syntax-propertize pos)
    ;;
    (let ((old-ppss (cdr syntax-ppss-last))






^ permalink raw reply related	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-18  0:49 ` Dmitry Gutov
@ 2016-03-19 12:27   ` Alan Mackenzie
  2016-03-19 18:47     ` Dmitry Gutov
  2016-03-19 23:16     ` Vitalie Spinu
  2016-03-19 23:00   ` Vitalie Spinu
  1 sibling, 2 replies; 155+ messages in thread
From: Alan Mackenzie @ 2016-03-19 12:27 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983

Hello, Dmitry.

On Fri, Mar 18, 2016 at 02:49:34AM +0200, Dmitry Gutov wrote:
> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> This patch should make ppss-0 and ppss-1 match:

OK, no bad thing!

But seeing that the function is a new function (its specification has
changed), it will need new test cases, fresh new attempts to break it.

> diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
> index e20a210..c1b9d84 100644
> --- a/lisp/emacs-lisp/syntax.el
> +++ b/lisp/emacs-lisp/syntax.el
> @@ -371,6 +371,11 @@ syntax-ppss-max-span
>   We try to make sure that cache entries are at least this far apart
>   from each other, to avoid keeping too much useless info.")

> +(defvar syntax-ppss-dont-widen nil
> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
> +The code that uses this should create local bindings for
> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
> +

I'm against this bit.  If syntax-ppss-dont-widen is non-nil, the buffer
is narrowed, and the local cache variables are correctly bound and
filled, then something at a low level is going to widen the buffer (and
call back_comment) without knowing to restore the global bindings for
those cache variables.  This could easily give the wrong result and
corrupt the locally bound cache.

I think the only sensible functionality for syntax-ppss is to be
equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
they stand.  Those pieces of code which actually need a ppss cache with
origin other than 1 could then use a more appropriate specialized
function whose cache wouldn't get mixed up with syntax-ppss's.  (It
could share a lot of code with syntax-ppss).
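
The equivalence described above can be checked mechanically.  What
follows is a minimal sketch, assuming only `syntax-ppss' and
`parse-partial-sexp'; the helper name my-ppss-contract-holds-p is
hypothetical and not part of syntax.el.  It uses `point-min' as the
origin, as the doc string states, which is position 1 in a widened
buffer, and it skips elements 2 and 6.

```elisp
;; Hypothetical helper for illustration; not part of syntax.el.
(defun my-ppss-contract-holds-p (pos)
  "Return non-nil if `syntax-ppss' at POS agrees with a fresh parse.
The fresh parse runs from `point-min' to POS.  Elements 2 and 6 are
skipped, since the doc string of `syntax-ppss' excludes them."
  (let ((cached (save-excursion (syntax-ppss pos)))
        (fresh (save-excursion (parse-partial-sexp (point-min) pos)))
        (ok t))
    ;; Compare the documented elements 0..9, except 2 and 6.
    (dolist (i '(0 1 3 4 5 7 8 9))
      (unless (equal (nth i cached) (nth i fresh))
        (setq ok nil)))
    ok))
```

Calling it at the same position before and after narrowing, as in the
original recipe, would show whether the two results diverge.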

>   (defvar syntax-begin-function nil
>     "Function to move back outside of any comment/string/paren.
>   This function should move the cursor back to some syntactically safe
> @@ -423,12 +428,21 @@ syntax-ppss
>   in the returned list (counting from 0) cannot be relied upon.
>   Point is at POS when this function returns.

> +If `syntax-ppss-dont-widen' is nil, the buffer is temporarily
> +widened.
> +
>   It is necessary to call `syntax-ppss-flush-cache' explicitly if
>   this function is called while `before-change-functions' is
>   temporarily let-bound, or if the buffer is modified without
>   running the hook."
>     ;; Default values.
>     (unless pos (setq pos (point)))
> +  (save-restriction
> +    (unless syntax-ppss-dont-widen
> +      (widen))
> +    (syntax-ppss--at pos)))
> +
> +(defun syntax-ppss--at (pos)
>     (syntax-propertize pos)
>     ;;
>     (let ((old-ppss (cdr syntax-ppss-last))

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 12:27   ` Alan Mackenzie
@ 2016-03-19 18:47     ` Dmitry Gutov
  2016-03-27  0:51       ` John Wiegley
  2016-03-19 23:16     ` Vitalie Spinu
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-19 18:47 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983

On 03/19/2016 02:27 PM, Alan Mackenzie wrote:

> OK, no bad thing!

Good.

> But seeing that the function is a new function (its specification has
> changed), it will need new test cases, fresh new attempts to break it.

Sure, please go ahead.

It needed new test cases even before this miraculous transformation.

>> +(defvar syntax-ppss-dont-widen nil
>> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
>> +The code that uses this should create local bindings for
>> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
>> +
>
> I'm against this bit.

I'm not married to it, but at least it would provide a backward 
compatibility escape hatch for a while. If a new way of handling mixed 
modes is added and turns out to be satisfactory, we can remove this 
variable later.

> If syntax-ppss-dont-widen is non-nil, the buffer
> is narrowed, and the local cache variables are correctly bound and
> filled, then something at a low level is going to widen the buffer (and
> call back_comment) without knowing to restore the global bindings for
> those cache variables.

When and why would that happen? I do not recall that happening before.

Since the "low level" is a bounded set, we should be able to make sure 
that the primitives do not, in fact, widen before calling syntax-ppss.

I suppose some could widen afterward.

> This could easily give the wrong result and
> corrupt the locally bound cache.

Even so, that would only affect the local cache, and as such, only the 
subregions, in the case of mixed-mode usage. In the general case, it 
would only affect the consumers of syntax-ppss that bound 
syntax-ppss-dont-widen, as long as they bound the cache variables as 
well, which we tell them to.

That lowers the damage area considerably.

> I think the only sensible functionality for syntax-ppss is to be
> equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
> they stand.  Those pieces of code which actually need a ppss cache with
> origin other than 1 could then use a more appropriate specialized
> function whose cache wouldn't get mixed up with syntax-ppss's.  (It
> could share a lot of code with syntax-ppss).

They already use syntax-ppss. I imagine Emacs's backward compatibility 
policy has something to say about that.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 18:47     ` Dmitry Gutov
@ 2016-03-27  0:51       ` John Wiegley
  2016-03-27  1:14         ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: John Wiegley @ 2016-03-27  0:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

>> I think the only sensible functionality for syntax-ppss is to be equivalent
>> to (parse-partial-sexp 1 pos). Then everybody knows where they stand. Those
>> pieces of code which actually need a ppss cache with origin other than 1
>> could then use a more appropriate specialized function whose cache wouldn't
>> get mixed up with syntax-ppss's. (It could share a lot of code with
>> syntax-ppss).

> They already use syntax-ppss. I imagine Emacs's backward compatibility
> policy has something to say about that.

There are times when our backward compatibility policy must bend, and even
break.

Specifically, we have a few existing cases where incomplete code has or will
be shipped in a release. The argument for doing so has often been, "So we can
see what users think." But if we *also* say that once it is released and
people start using, we can't change it, then it's a Catch-22.

syntax-ppss needs more work, that seems to be fairly clear based on the volume
of discussion around this feature, and bugs like this one. Therefore, since it
is not solid yet I'm not willing to let existing dependencies prevent us from
fixing its flaws.

When a feature becomes solid and true, like lexical-binding, that's when I
become incredibly reticent to make any changes whatsoever -- without the
convergence of all the planets and the moons.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-27  0:51       ` John Wiegley
@ 2016-03-27  1:14         ` Dmitry Gutov
  2016-04-03 22:58           ` John Wiegley
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-27  1:14 UTC (permalink / raw)
  To: John Wiegley; +Cc: Alan Mackenzie, 22983

On 03/27/2016 02:51 AM, John Wiegley wrote:

> syntax-ppss needs more work, that seems to be fairly clear based on the volume
> of discussion around this feature, and bugs like this one.

Bugs, plural? Alan has filed just one so far, and I've posted the 
trivial patch.

> Therefore, since it
> is not solid yet I'm not willing to let existing dependencies prevent us from
> fixing its flaws.

The aforementioned patch both fixes the bug and allows syntax-ppss to 
continue to be used in the fashion I've mentioned previously.
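
A minimal sketch of the widening half of that patch, written as a
wrapper around the existing function rather than as a change to
syntax.el.  The name my-ppss-widened is hypothetical, and the escape
hatch `syntax-ppss-dont-widen' from the patch is left out.

```elisp
;; Hypothetical wrapper for illustration; not part of syntax.el.
(defun my-ppss-widened (pos)
  "Return the parse state at POS as seen from the start of the buffer.
Any narrowing in effect is lifted for the call and restored afterwards."
  (save-restriction
    (widen)
    (syntax-ppss pos)))
```

With this, the result no longer depends on the narrowing in effect at
the time of the call, which is the property ppss-0 and ppss-1 were
meant to share.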

The question that's holding it, as far as I'm concerned, is whether the 
"hard narrowing" discussion reaches a satisfying conclusion. If it does, 
we won't really need syntax-ppss-dont-widen.

> When a feature becomes solid and true, like lexical-binding, that's when I
> become incredibly reticent to make any changes whatsoever -- without the
> convergence of all the planets and the moons.

I've never said anything about avoiding making changes to it. But when 
we do that, we usually try to accommodate the existing uses (the 
importance of which depends on how many uses there are out there).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-03-27  1:14         ` Dmitry Gutov
@ 2016-04-03 22:58           ` John Wiegley
  2016-04-03 23:15             ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: John Wiegley @ 2016-04-03 22:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> The aforementioned patch both fixes the bug and allows syntax-ppss to
> continue to be used in the fashion I've mentioned previously.

> The question that's holding it, as far as I'm concerned, is whether the
> "hard narrowing" discussion reaches a satisfying conclusion. If it does, we
> won't really need syntax-ppss-dont-widen.

Have things reached a satisfactory conclusion now?  It was hard for me to tell
by the end of this thread.

> I've never said anything about avoiding making changes to it. But when we do
> that, we usually try to accommodate the existing uses (the importance of
> which depends on how many uses there are out there).

Sure, though its experimental nature does get taken into account. If a thing
is wrong, I'm not interested in accommodating existing workarounds to its
wrongness.

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-04-03 22:58           ` John Wiegley
@ 2016-04-03 23:15             ` Dmitry Gutov
  2017-09-02 13:12               ` Eli Zaretskii
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2016-04-03 23:15 UTC (permalink / raw)
  To: John Wiegley; +Cc: Alan Mackenzie, 22983

On 04/04/2016 01:58 AM, John Wiegley wrote:

> Have things reached a satisfactory conclusion now?  It was hard for me to tell
> by the end of this thread.

It's a separate discussion, see 
http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html

> Sure, though its experimental nature does get taken into account. If a thing
> is wrong, I'm not interested in accommodating existing workarounds to its
> wrongness.

What experimental nature?





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2016-04-03 23:15             ` Dmitry Gutov
@ 2017-09-02 13:12               ` Eli Zaretskii
  2017-09-02 17:40                 ` Alan Mackenzie
  0 siblings, 1 reply; 155+ messages in thread
From: Eli Zaretskii @ 2017-09-02 13:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: jwiegley, acm, 22983

unblock 24655 by 22983
thanks

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 4 Apr 2016 02:15:50 +0300
> Cc: Alan Mackenzie <acm@muc.de>, 22983@debbugs.gnu.org
> 
> On 04/04/2016 01:58 AM, John Wiegley wrote:
> 
> > Have things reached a satisfactory conclusion now?  It was hard for me to tell
> > by the end of this thread.
> 
> It's a separate discussion, see 
> http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
> 
> > Sure, though its experimental nature does get taken into account. If a thing
> > is wrong, I'm not interested in accommodating existing workarounds to its
> > wrongness.
> 
> What experimental nature?

It doesn't sound like this discussion is leading anywhere, and since
almost 1.5 years have passed with no comments, I guess this bug doesn't
need to block the release of Emacs 26.1, at least.

Thanks.





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 13:12               ` Eli Zaretskii
@ 2017-09-02 17:40                 ` Alan Mackenzie
  2017-09-02 17:53                   ` Eli Zaretskii
                                     ` (2 more replies)
  0 siblings, 3 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-02 17:40 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: jwiegley, Dmitry Gutov, 22983

Hello, Eli.

On Sat, Sep 02, 2017 at 16:12:48 +0300, Eli Zaretskii wrote:
> unblock 24655 by 22983
> thanks

> > From: Dmitry Gutov <dgutov@yandex.ru>
> > Date: Mon, 4 Apr 2016 02:15:50 +0300
> > Cc: Alan Mackenzie <acm@muc.de>, 22983@debbugs.gnu.org

> > On 04/04/2016 01:58 AM, John Wiegley wrote:

> > > Have things reached a satisfactory conclusion now?  It was hard for me to tell
> > > by the end of this thread.

> > It's a separate discussion, see 
> > http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html

> > > Sure, though its experimental nature does get taken into account. If a thing
> > > is wrong, I'm not interested in accommodating existing workarounds to its
> > > wrongness.

> > What experimental nature?

> It doesn't sound like this discussion is leading anywhere, and since
> almost 1.5 years have passed with no comments, I guess this bug doesn't
> need to block the release of Emacs 26.1, at least.

I'm not happy about this.  22983 is a serious design flaw, which has had
deleterious effects deep within Emacs.  One recorded example, resulting
in an infinite loop, is:

#########################################################################
From: Philipp Stephani <p.stephani2@gmail.com>
To: emacs-devel@gnu.org
Subject: [PATCH] Protect against an infloop in python-mode
Date: Tue, 28 Feb 2017 22:31:49 +0100

There appears to be an edge case caused by using `syntax-ppss' in a
narrowed buffer during JIT lock inside of Python triple-quote strings.
Unfortunately it is impossible to reproduce without manually
destroying the syntactic information in the Python buffer, but it has
been observed in practice.  In that case it can happen that the syntax
caches get sufficiently out of whack so that there appear to be
overlapping strings in the buffer.  As Python has no nested strings,
this situation is impossible and leads to an infloop in
`python-nav-end-of-statement'.  Protect against this by checking
whether the search for the end of the current string makes progress.
#########################################################################

In this case, Philipp had to apply a workaround.
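
The kind of progress check that message describes can be sketched as
follows.  This is not the python.el code: the function name is
hypothetical, and it assumes ordinary single-delimiter strings that
`forward-sexp' can skip, not Python triple quotes.

```elisp
;; Hypothetical helper for illustration; this is not the python.el code.
(defun my-skip-string-at-point ()
  "Move point past the string containing point, if any.
Signal an error instead of looping when `syntax-ppss' reports a
string that starts before the end of the previously skipped one."
  (let ((last-end (point-min))
        state)
    (while (and (not (eobp))
                (nth 3 (setq state (syntax-ppss))))
      (let ((start (nth 8 state)))
        ;; Strings cannot overlap, so a START before LAST-END means the
        ;; cached syntactic information is inconsistent.
        (when (< start last-end)
          (error "Overlapping strings detected (start=%d, last-end=%d)"
                 start last-end))
        (goto-char start)
        ;; Jump over the string; fall back to the end of the accessible
        ;; region if it is unterminated.
        (condition-case nil
            (forward-sexp 1)
          (scan-error (goto-char (point-max))))
        (setq last-end (point))))
    (point)))
```

Such a guard only turns an infinite loop into an error; it does not
repair the inconsistent cache that caused it.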

Seeing as how Stefan is not prepared to take responsibility for his own
bugs, I suggest that I fix it, something I really don't want to spend
time on.  Before I do start spending time on it, I would like some
assurance that my fix will not be blocked or reverted (both have happened
to other things in the core I've worked on), and that I will have a
reasonable amount of time to get the job done (a few weeks) before any
freeze for Emacs 25.3 or 26 comes into force.

> Thanks.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 17:40                 ` Alan Mackenzie
@ 2017-09-02 17:53                   ` Eli Zaretskii
  2017-09-03 20:44                   ` John Wiegley
  2017-09-04 23:34                   ` Dmitry Gutov
  2 siblings, 0 replies; 155+ messages in thread
From: Eli Zaretskii @ 2017-09-02 17:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: jwiegley, dgutov, 22983

> Date: Sat, 2 Sep 2017 17:40:27 +0000
> Cc: Dmitry Gutov <dgutov@yandex.ru>, jwiegley@gmail.com, 22983@debbugs.gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > It doesn't sound like this discussion is leading anywhere, and since
> > almost 1.5 years have passed with no comments, I guess this bug doesn't
> > need to block the release of Emacs 26.1, at least.
> 
> I'm not happy about this.  22983 is a serious design flaw, which has had
> deleterious effects deep within Emacs.

I didn't close the bug, mind you.  I just removed it from the list of
those blocking the impending release.  You, or anyone else, are free
to work on fixing it and/or discuss the various approaches to dealing
with this issue.

> Before I do start spending time on it, I would like some assurance
> that my fix will not be blocked or reverted (both have happened to
> other things in the core I've worked on)

I doubt that anyone could give you such a promise without seeing the
proposed changes.  Especially since this and related issues, and
solutions proposed for them, already have some history of being
controversial.

> and that I will have a reasonable amount of time to get the job done
> (a few weeks) before any freeze for Emacs 25.3 or 26 comes into
> force.

That I can promise you.  Feature freeze doesn't affect bugfixes, and
Emacs 26.1 is not going to be released tomorrow or the next week.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 17:40                 ` Alan Mackenzie
  2017-09-02 17:53                   ` Eli Zaretskii
@ 2017-09-03 20:44                   ` John Wiegley
  2017-09-04 23:34                   ` Dmitry Gutov
  2 siblings, 0 replies; 155+ messages in thread
From: John Wiegley @ 2017-09-03 20:44 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov

>>>>> Alan Mackenzie <acm@muc.de> writes:

> Seeing as how Stefan is not prepared to take responsibility for his own bugs

Let's not use language like this if avoidable, please. I'm certain Stefan
would do so, he just may not see this issue the same way you do (which is what
I recall from the extensive discussions on syntax-ppss). To characterize it as
a fault is only discouraging or frustrating; it doesn't help Emacs.

Thanks,
-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-02 17:40                 ` Alan Mackenzie
  2017-09-02 17:53                   ` Eli Zaretskii
  2017-09-03 20:44                   ` John Wiegley
@ 2017-09-04 23:34                   ` Dmitry Gutov
  2017-09-05  6:57                     ` Andreas Röhler
                                       ` (2 more replies)
  2 siblings, 3 replies; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-04 23:34 UTC (permalink / raw)
  To: Alan Mackenzie, Eli Zaretskii; +Cc: jwiegley, Philipp Stephani, 22983

On 9/2/17 8:40 PM, Alan Mackenzie wrote:
> I'm not happy about this.  22983 is a serious design flaw, which has had
> deleterious effects deep within Emacs.

I'm sure we want to fix design flaws. As long as there is a solid plan 
that does not swap one flaw for another.

> One recorded example, resulting
> in an infinite loop, is:
> 
> #########################################################################
> From: Philipp Stephani <p.stephani2@gmail.com>
> To: emacs-devel@gnu.org
> Subject: [PATCH] Protect against an infloop in python-mode
> Date: Tue, 28 Feb 2017 22:31:49 +0100
> 
> There appears to be an edge case caused by using `syntax-ppss' in a
> narrowed buffer during JIT lock inside of Python triple-quote strings.
> Unfortunately it is impossible to reproduce without manually
> destroying the syntactic information in the Python buffer, but it has
> been observed in practice.  In that case it can happen that the syntax
> caches get sufficiently out of whack so that there appear to be
> overlapping strings in the buffer.  As Python has no nested strings,
> this situation is impossible and leads to an infloop in
> `python-nav-end-of-statement'.  Protect against this by checking
> whether the search for the end of the current string makes progress.
> #########################################################################
> 
> In this case, Philipp had to apply a workaround.

The problem manifested during jit-lock. Do we understand why the (widen) 
call inside font-lock-default-fontify-region didn't help?





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-04 23:34                   ` Dmitry Gutov
@ 2017-09-05  6:57                     ` Andreas Röhler
  2017-09-05 12:28                     ` John Wiegley
  2017-09-07 17:56                     ` Alan Mackenzie
  2 siblings, 0 replies; 155+ messages in thread
From: Andreas Röhler @ 2017-09-05  6:57 UTC (permalink / raw)
  To: 22983

On 05.09.2017 01:34, Dmitry Gutov wrote:
> On 9/2/17 8:40 PM, Alan Mackenzie wrote:
>> I'm not happy about this.  22983 is a serious design flaw, which has had
>> deleterious effects deep within Emacs.
>
> I'm sure we want to fix design flaws. As long as there is a solid plan 
> that does not swap one flaw for another.
>
>> One recorded example, resulting
>> in an infinite loop, is:
>>
>> ######################################################################### 
>>
>> From: Philipp Stephani <p.stephani2@gmail.com>
>> To: emacs-devel@gnu.org
>> Subject: [PATCH] Protect against an infloop in python-mode
>> Date: Tue, 28 Feb 2017 22:31:49 +0100
>>
>> There appears to be an edge case caused by using `syntax-ppss' in a
>> narrowed buffer during JIT lock inside of Python triple-quote strings.
>> Unfortunately it is impossible to reproduce without manually
>> destroying the syntactic information in the Python buffer, but it has
>> been observed in practice.  In that case it can happen that the syntax
>> caches get sufficiently out of whack so that there appear to be
>> overlapping strings in the buffer.  As Python has no nested strings,
>> this situation is impossible and leads to an infloop in
>> `python-nav-end-of-statement'.  Protect against this by checking
>> whether the search for the end of the current string makes progress.
>> ######################################################################### 
>>
>>
>> In this case, Philipp had to apply a workaround.
>
> The problem manifested during jit-lock. Do we understand why the 
> (widen) call inside font-lock-default-fontify-region didn't help?
>
>
>


IIRC it's about resolving circular dependencies, notably between
syntax-propertize-function and syntax-ppss.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-04 23:34                   ` Dmitry Gutov
  2017-09-05  6:57                     ` Andreas Röhler
@ 2017-09-05 12:28                     ` John Wiegley
  2017-09-07 20:45                       ` Alan Mackenzie
  2017-09-07 17:56                     ` Alan Mackenzie
  2 siblings, 1 reply; 155+ messages in thread
From: John Wiegley @ 2017-09-05 12:28 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Philipp Stephani, 22983

>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> I'm sure we want to fix design flaws. As long as there is a solid plan that
> does not swap one flaw for another.

Can we have a summary of the current proposal(s) on the table? It would help
to clarify, rather than navigating past discussions. Alan has told me that
this issue is affecting people and has been outstanding for some time; I'd
like to get a better idea of its seriousness/scope, and what effect the
available solutions would have (as Dmitry says, we don't want to replace one
flaw with another).

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-05 12:28                     ` John Wiegley
@ 2017-09-07 20:45                       ` Alan Mackenzie
  2017-09-08 16:04                         ` Andreas Röhler
  2017-09-09  9:44                         ` Dmitry Gutov
  0 siblings, 2 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-07 20:45 UTC (permalink / raw)
  To: John Wiegley; +Cc: 22983, Philipp Stephani, Dmitry Gutov

Hello, John.

On Tue, Sep 05, 2017 at 13:28:52 +0100, John Wiegley wrote:
> >>>>> Dmitry Gutov <dgutov@yandex.ru> writes:

> > I'm sure we want to fix design flaws. As long as there is a solid plan that
> > does not swap one flaw for another.

> Can we have a summary of the current proposal(s) on the table? It would help
> to clarify, rather than navigating past discussions. Alan has told me that
> this issue is affecting people and has been outstanding for some time; I'd
> like to get a better idea of its seriousness/scope, and what effect the
> available solutions would have (as Dmitry says, we don't want to replace one
> flaw with another).

First, I think it's worthwhile emphasising what the function purports to
do:

    syntax-ppss is a compiled Lisp function in `syntax.el'.

    (syntax-ppss &optional POS)

    Parse-Partial-Sexp State at POS, defaulting to point.
    The returned value is the same as that of `parse-partial-sexp'
    run from `point-min' to POS except that values at positions 2 and 6
    in the returned list (counting from 0) cannot be relied upon.
    Point is at POS when this function returns.

The solution I propose is to introduce a second cache into syntax-ppss,
and this cache would be used whenever (not (eq (point-min) 1)).
Whenever point-min changes, and isn't 1, this second cache would be
calculated again from scratch.

This proposal has these advantages:

(i) It would make the function deliver what its unchanged doc string
says.  This is important, given that syntax-ppss has been very widely
used within Emacs, and likely by external packages too; these will
typically have assumed the advertised behaviour of the function, without
having tested it in narrowed buffers.

(ii) In the case which currently works, namely a non-narrowed buffer,
there would be only a minute slow-down (basically, there would be extra
code to check point-min and select the cache to use).

(iii) The cache for use in a narrowed buffer might well be sufficiently
fast in normal use.  If it is not, it could be enhanced readily.
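
The two-cache idea can be sketched as follows.  This is a minimal
illustration under stated assumptions, not the code in syntax.el: all
names are hypothetical, each cache holds only the last position and
state, invalidation on text changes is done crudely via
`buffer-chars-modified-tick', and `syntax-propertize' is not called.

```elisp
;; All names below are hypothetical; this is not the code in syntax.el.
(defvar-local my-ppss-wide-last nil
  "Last (POS . STATE) computed while `point-min' was 1.")
(defvar-local my-ppss-narrow-last nil
  "Last (POS . STATE) computed for the narrowing at `my-ppss-narrow-start'.")
(defvar-local my-ppss-narrow-start nil
  "Value of `point-min' that `my-ppss-narrow-last' belongs to.")
(defvar-local my-ppss-tick nil
  "Value of `buffer-chars-modified-tick' when the caches were filled.")

(defun my-ppss (pos)
  "Return the parse state at POS, equivalent to parsing from `point-min'."
  (let ((start (point-min))
        (tick (buffer-chars-modified-tick)))
    ;; Crude invalidation: drop both caches after any text change.
    (unless (eq tick my-ppss-tick)
      (setq my-ppss-tick tick
            my-ppss-wide-last nil
            my-ppss-narrow-last nil))
    ;; A new, non-1 `point-min' means the second cache starts from scratch.
    (unless (or (eq start 1) (eq start my-ppss-narrow-start))
      (setq my-ppss-narrow-start start
            my-ppss-narrow-last nil))
    (let* ((cached (if (eq start 1) my-ppss-wide-last my-ppss-narrow-last))
           (state (save-excursion
                    (if (and cached (<= (car cached) pos))
                        ;; Resume from the cached position and state.
                        (parse-partial-sexp (car cached) pos
                                            nil nil (cdr cached))
                      (parse-partial-sexp start pos))))
           (entry (cons pos state)))
      (if (eq start 1)
          (setq my-ppss-wide-last entry)
        (setq my-ppss-narrow-last entry))
      state)))
```

Because the widened cache is never touched while the buffer is
narrowed, a call made under narrowing cannot corrupt later calls made
in the widened buffer, and the reverse holds too.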

I think Dmitry also proposed a method of solution some months ago,
though I don't remember in detail what it was.  Dmitry, do you still
think your solution would work?  If so, please elaborate on it.

> -- 
> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-07 20:45                       ` Alan Mackenzie
@ 2017-09-08 16:04                         ` Andreas Röhler
  2017-09-10 18:26                           ` Alan Mackenzie
  2017-09-09  9:44                         ` Dmitry Gutov
  1 sibling, 1 reply; 155+ messages in thread
From: Andreas Röhler @ 2017-09-08 16:04 UTC (permalink / raw)
  To: 22983; +Cc: Alan Mackenzie, John Wiegley

On 07.09.2017 22:45, Alan Mackenzie wrote:
> Hello, John.
>
> On Tue, Sep 05, 2017 at 13:28:52 +0100, John Wiegley wrote:
>>>>>>> Dmitry Gutov <dgutov@yandex.ru> writes:
>>> I'm sure we want to fix design flaws. As long as there is a solid plan that
>>> does not swap one flaw for another.
>> Can we have a summary of the current proposal(s) on the table? It would help
>> to clarify, rather than navigating past discussions. Alan has told me that
>> this issue is affecting people and has been outstanding for some time; I'd
>> like to get a better idea of its seriousness/scope, and what effect the
>> available solutions would have (as Dmitry says, we don't want to replace one
>> flaw with another).
> First, I think it's worthwhile emphasising what the function purports to
> do:
>
>      syntax-ppss is a compiled Lisp function in `syntax.el'.
>
>      (syntax-ppss &optional POS)
>
>      Parse-Partial-Sexp State at POS, defaulting to point.
>      The returned value is the same as that of `parse-partial-sexp'
>      run from `point-min' to POS except that values at positions 2 and 6
>      in the returned list (counting from 0) cannot be relied upon.
>      Point is at POS when this function returns.
>
> The solution I propose is to introduce a second cache into syntax-ppss,
> and this cache would be used whenever (not (eq (point-min) 1)).
> Whenever point-min changes, and isn't 1, this second cache would be
> calculated again from scratch.
>
> This proposal has these advantages:
>
> (i) It would make the function deliver what its unchanged doc string
> says.  This is important, given that syntax-ppss has been very widely
> used within Emacs, and likely by external packages too; these will
> typically have assumed the advertised behaviour of the function, without
> having tested it in narrowed buffers.
>
> (ii) In the case which currently works, namely a non-narrowed buffer,
> there would be only a minute slow-down (basically, there would be extra
> code to check point-min and select the cache to use).
>
> (iii) The cache for use in a narrowed buffer might well be sufficiently
> fast in normal use.  If it is not, it could be enhanced readily.
>
> I think Dmitry also proposed a method of solution some months ago,
> though I don't remember in detail what it was.  Dmitry, do you still
> think your solution would work?  If so, please elaborate on it.
>
>> -- 
>> John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
>> http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2

Hi Alan and all,

I assume there is a complex matter behind this: a bunch of bugs or
design issues, not a single one.
Fixing this would affect syntax-propertize, parse-partial-sexp, 
syntax-ppss and font-lock stuff at once.

http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
points at some spot. There should be more.

As a first step, listing reference tests, including benchmarks, should
be helpful.




^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-08 16:04                         ` Andreas Röhler
@ 2017-09-10 18:26                           ` Alan Mackenzie
  0 siblings, 0 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-10 18:26 UTC (permalink / raw)
  To: Andreas Röhler; +Cc: 22983

Hello, Andreas.

On Fri, Sep 08, 2017 at 18:04:37 +0200, Andreas Röhler wrote:

> On 07.09.2017 22:45, Alan Mackenzie wrote:

> > The solution I propose is to introduce a second cache into syntax-ppss,
> > and this cache would be used whenever (not (eq (point-min) 1)).
> > Whenever point-min changes, and isn't 1, this second cache would be
> > calculated again from scratch.

> > This proposal has these advantages:

> > (i) It would make the function deliver what its unchanged doc string
> > says.  This is important, given that syntax-ppss has been very widely
> > used within Emacs, and likely by external packages too; these will
> > typically have assumed the advertised behaviour of the function, without
> > having tested it in narrowed buffers.

> > (ii) In the case which currently works, namely a non-narrowed buffer,
> > there would be only a minute slow-down (basically, there would be extra
> > code to check point-min and select the cache to use).

> > (iii) The cache for use in a narrowed buffer might well be sufficiently
> > fast in normal use.  If it is not, it could be enhanced readily.

> Hi Alan and all,

> I assume there is a complex matter behind this: a bunch of bugs or design
> issues, not a single one.

I don't think this bug is _that_ complex, and even if it has associated
bugs, I think we can fix it on its own.

> Fixing this would affect syntax-propertize, parse-partial-sexp, 
> syntax-ppss and font-lock stuff at once.

I'll give you one out of four.  ;-)  syntax-ppss will definitely be
affected, parse-partial-sexp definitely not, and the other two possibly
in corner cases, but hopefully not.

> http://lists.gnu.org/archive/html/emacs-devel/2016-03/msg01576.html
> points at some spot. There should be more.

I think, at least I hope, that is an orthogonal issue.

> As a first step, listing reference tests, including benchmarks, should be
> helpful.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-07 20:45                       ` Alan Mackenzie
  2017-09-08 16:04                         ` Andreas Röhler
@ 2017-09-09  9:44                         ` Dmitry Gutov
  2017-09-09 10:20                           ` Alan Mackenzie
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  1 sibling, 2 replies; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-09  9:44 UTC (permalink / raw)
  To: Alan Mackenzie, John Wiegley; +Cc: 22983, Philipp Stephani

Hi Alan,

On 9/7/17 11:45 PM, Alan Mackenzie wrote:

> The solution I propose is to introduce a second cache into syntax-ppss,
> and this cache would be used whenever (not (eq (point-min) 1)).
> Whenever point-min changes, and isn't 1, this second cache would be
> calculated again from scratch.

Thanks for writing this up. I think it's a good step, and since it 
follows the current wording of the docstring, it should be highly
compatible with the existing code.

> This proposal has these advantages:
> 
> (i) It would make the function deliver what its unchanged doc string
> says.  This is important, given that syntax-ppss has been very widely
> used within Emacs, and likely by external packages too; these will
> typically have assumed the advertised behaviour of the function, without
> having tested it in narrowed buffers.

It will also continue to function as expected in mmm-mode, AFAICT, 
without the need for an "escape hatch" we discussed before.

> (ii) In the case which currently works, namely a non-narrowed buffer,
> there would be only a minute slow-down (basically, there would be extra
> code to check point-min and select the cache to use).
> 
> (iii) The cache for use in a narrowed buffer might well be sufficiently
> fast in normal use.  If it is not, it could be enhanced readily.

And since the API doesn't change, and the observable behavior doesn't 
either (in the vast majority of cases; probably all except the broken 
ones), we can refine this solution easily, or even swap it for something 
else, with little cost.

> I think Dmitry also proposed a method of solution some months ago,
> though I don't remember in detail what it was.  Dmitry, do you still
> think your solution would work?  If so, please elaborate on it.

There is a simple patch at 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22983#47, but after some
consideration, I now prefer your proposed approach. We've also had some 
grander ideas about enhancing things further, but those can be added 
later, after we finally decide.

I do want to know what Stefan thinks of this subject now, though.

Caveats:

- This solves the dependency on point-min, but does nothing about the 
dependency on the current syntax-table (which can change). I'm not 
necessarily suggesting we try to solve that now, though.

- Before this change is pushed to master, or shortly after, I'd like to 
know that it actually fixed the problem Philipp experienced with 
python-mode, so we can revert 4fbd330. If it was caused by e.g. 
syntax-table changing, we've not improved much.
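
To illustrate the syntax-table dependency in the first caveat, here is a
minimal sketch.  `my-in-string-p-with-table' is a hypothetical helper, not
something in syntax.el.  It parses the same text under two syntax tables
and can give different answers, which a cache keyed only on point-min
would not notice:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-in-string-p-with-table (table pos)
  "Return non-nil if POS is inside a string when parsed under TABLE."
  (save-excursion
    (with-syntax-table table
      ;; Element 3 of the parse state is non-nil inside a string.
      (nth 3 (parse-partial-sexp (point-min) pos)))))
```

With a table in which `"' is punctuation rather than a string delimiter,
the same position stops being "inside a string", so any cached state
computed under the other table would be stale.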

All the best,
Dmitry.





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-09  9:44                         ` Dmitry Gutov
@ 2017-09-09 10:20                           ` Alan Mackenzie
  2017-09-09 12:18                             ` Dmitry Gutov
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  1 sibling, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-09 10:20 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Sat, Sep 09, 2017 at 12:44:02 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/7/17 11:45 PM, Alan Mackenzie wrote:

> > The solution I propose is to introduce a second cache into syntax-ppss,
> > and this cache would be used whenever (not (eq (point-min) 1)).
> > Whenever point-min changes, and isn't 1, this second cache would be
> > calculated again from scratch.

> Thanks for writing this up. I think it's a good step, and since it 
> follows the current wording of the docstring, it should be highly
> compatible with the existing code.

Thanks.

> > This proposal has these advantages:

> > (i) It would make the function deliver what its unchanged doc string
> > says.  This is important, given that syntax-ppss has been very widely
> > used within Emacs, and likely by external packages too; these will
> > typically have assumed the advertised behaviour of the function, without
> > having tested it in narrowed buffers.

> It will also continue to function as expected in mmm-mode, AFAICT, 
> without the need for an "escape hatch" we discussed before.

> > (ii) In the case which currently works, namely a non-narrowed buffer,
> > there would be only a minute slow-down (basically, there would be extra
> > code to check point-min and select the cache to use).

> > (iii) The cache for use in a narrowed buffer might well be sufficiently
> > fast in normal use.  If it is not, it could be enhanced readily.

> And since the API doesn't change, and the observable behavior doesn't 
> either (in the vast majority of cases; probably all except the broken 
> ones), we can refine this solution easily, or even swap it for something 
> else, with little cost.

Yes.  I now have a provisional implementation of this new strategy,
which works on the test case for xdisp.c with which I opened the bug.
It seems to be working, generally.  I need to test it more thoroughly.

In the implementation, I have left the function `syntax-ppss' untouched
except for adding a function call to set up the cache right at the
start.  I have refactored syntax-ppss-flush-cache, extracting a function
which is called directly from the cache-selecting code.  Other than
that, there is one new function (which switches the current cache in
use) and a few new variables to keep track of the caches.

> > I think Dmitry also proposed a method of solution some months ago,
> > though I don't remember in detail what it was.  Dmitry, do you still
> > think your solution would work?  If so, please elaborate on it.

> There is a simple patch at 
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22983#47, but after some
> consideration, I now prefer your proposed approach. We've also had some 
> grander ideas about enhancing things further, but those can be added 
> later, after we finally decide.

Yes, I agree.

> I do want to know what Stefan thinks of this subject now, though.

Yes.

> Caveats:

> - This solves the dependency on point-min, but does nothing about the 
> dependency on the current syntax-table (which can change). I'm not 
> necessarily suggesting we try to solve that now, though.

I had some ideas on this back in the spring (about having "indirect
variables") which could be used quickly to "swap out" the current
syntax-table text properties, and (more importantly) quickly swap them
back in.  But that's for another day.

> - Before this change is pushed to master, or shortly after, I'd like to 
> know that it actually fixed the problem Philipp experienced with 
> python-mode, so we can revert 4fbd330. If it was caused by e.g. 
> syntax-table changing, we've not improved much.

I am naturally interested in this, too.  If my patch doesn't fix this
bug, at least it will have removed a layer of fog inhibiting its
investigation.

> All the best,
> Dmitry.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-09 10:20                           ` Alan Mackenzie
@ 2017-09-09 12:18                             ` Dmitry Gutov
  2017-09-10 11:42                               ` Alan Mackenzie
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-09 12:18 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

On 9/9/17 1:20 PM, Alan Mackenzie wrote:

> In the implementation, I have left the function `syntax-ppss' untouched
> except for adding a function call to set up the cache right at the
> start.  I have refactored syntax-ppss-flush-cache, extracting a function
> which is called directly from the cache-selecting code.  Other than
> that, there is one new function (which switches the current cache in
> use) and a few new variables to keep track of the caches.

Not sure I understand. If you call (syntax-ppss) with significantly 
different narrowings without flushing the cache (e.g. without modifying 
the buffer), sounds like it'll have to return the same results under the 
described implementation.

If so, it doesn't sound strict enough.

>> Caveats:
> 
>> - This solves the dependency on point-min, but does nothing about the
>> dependency on the current syntax-table (which can change). I'm not
>> necessarily suggesting we try to solve that now, though.
> 
> I had some ideas on this back in the spring (about having "indirect
> variables") which could be used quickly to "swap out" the current
> syntax-table text properties, and (more importantly) quickly swap them
> back in.  But that's for another day.

I admit I'm not sure what all this implies.

>> - Before this change is pushed to master, or shortly after, I'd like to
>> know that it actually fixed the problem Philipp experienced with
>> python-mode, so we can revert 4fbd330. If it was caused by e.g.
>> syntax-table changing, we've not improved much.
> 
> I am naturally interested in this, too.  If my patch doesn't fix this
> bug, at least it will have removed a layer of fog inhibiting its
> investigation.

Let's hope so.





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: syntax-ppss returns wrong result.
  2017-09-09 12:18                             ` Dmitry Gutov
@ 2017-09-10 11:42                               ` Alan Mackenzie
  0 siblings, 0 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-10 11:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Sat, Sep 09, 2017 at 15:18:11 +0300, Dmitry Gutov wrote:
> On 9/9/17 1:20 PM, Alan Mackenzie wrote:

> > In the implementation, I have left the function `syntax-ppss' untouched
> > except for adding a function call to set up the cache right at the
> > start.  I have refactored syntax-ppss-flush-cache, extracting a function
> > which is called directly from the cache-selecting code.  Other than
> > that, there is one new function (which switches the current cache in
> > use) and a few new variables to keep track of the caches.

> Not sure I understand. If you call (syntax-ppss) with significantly 
> different narrowings without flushing the cache (e.g. without modifying 
> the buffer), sounds like it'll have to return the same results under the 
> described implementation.

> If so, it doesn't sound strict enough.

On changing from one narrowing to another narrowing (more precisely, when
point-min is changed, neither value being 1), the cache is flushed, even
though the buffer has not been modified.
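
The property being aimed at can be checked with a small sketch like the
following.  `my-ppss-matches-parse-p' is a hypothetical name used only for
illustration; it compares `syntax-ppss' with a fresh parse from point-min,
ignoring elements 2 and 6:

```elisp
;; Hypothetical helper, not part of syntax.el.
(defun my-ppss-matches-parse-p (pos)
  "Return non-nil if (syntax-ppss POS) agrees with a parse from `point-min'.
Elements 2 and 6 of the two states are ignored."
  (save-excursion
    (let ((ppss (copy-sequence (syntax-ppss pos)))
          (parse (copy-sequence (parse-partial-sexp (point-min) pos))))
      ;; Blank out the two elements which cannot be relied upon.
      (dolist (i '(2 6))
        (setf (nth i ppss) nil
              (nth i parse) nil))
      (equal ppss parse))))
```

Calling it once in a widened buffer, then again after narrowing so that
point-min falls inside a string, should return non-nil both times if the
cache is being switched correctly.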

Anyhow, I've posted a patch elsewhere on this thread.  Comments on it
would be welcome.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-09  9:44                         ` Dmitry Gutov
  2017-09-09 10:20                           ` Alan Mackenzie
@ 2017-09-10 11:36                           ` Alan Mackenzie
  2017-09-10 22:53                             ` Stefan Monnier
                                               ` (2 more replies)
  1 sibling, 3 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-10 11:36 UTC (permalink / raw)
  To: Dmitry Gutov, Philipp Stephani; +Cc: John Wiegley, 22983

Hello, Dmitry and Philipp.

On Sat, Sep 09, 2017 at 12:44:02 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/7/17 11:45 PM, Alan Mackenzie wrote:

> > The solution I propose is to introduce a second cache into syntax-ppss,
> > and this cache would be used whenever (not (eq (point-min) 1)).
> > Whenever point-min changes, and isn't 1, this second cache would be
> > calculated again from scratch.

Here is a patch implementing this.  Comments about it would be welcome.

[ .... ]

> And since the API doesn't change, and the observable behavior doesn't 
> either (in the vast majority of cases; probably all except the broken 
> ones), we can refine this solution easily, or even swap it for something 
> else, with little cost.

[ .... ]

> Caveats:

> - This solves the dependency on point-min, but does nothing about the 
> dependency on the current syntax-table (which can change). I'm not 
> necessarily suggesting we try to solve that now, though.

> - Before this change is pushed to master, or shortly after, I'd like to 
> know that it actually fixed the problem Philipp experienced with 
> python-mode, so we can revert 4fbd330. If it was caused by e.g. 
> syntax-table changing, we've not improved much.

Philipp, any chance of you trying out python mode with this patch but
without 4fbd330?



diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..952ea8bb83 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -386,11 +386,103 @@ syntax-ppss-cache
 (defvar-local syntax-ppss-last nil
   "Cache of (LAST-POS . LAST-PPSS).")
 
-(defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
-(defun syntax-ppss-flush-cache (beg &rest ignored)
-  "Flush the cache of `syntax-ppss' starting at position BEG."
-  ;; Set syntax-propertize to refontify anything past beg.
-  (setq syntax-propertize--done (min beg syntax-propertize--done))
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;;
+;; The implementation which follows uses three caches, the current one
+;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
+;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
+;; former state of the active cache as it was used in widened and
+;; narrowed buffers respectively.  There are also the variables
+;; `syntax-ppss-max-valid-{wide,narrow}' which hold the maximum
+;; position where the caches are valid, due to buffer changes.
+;;
+;; At the first call to `syntax-ppss' after a widening or narrowing of
+;; the buffer, the pertinent inactive cache is swapped into the
+;; current cache by calling `syntax-ppss-set-cache'.  Note that there
+;; is currently just one inactive cache for narrowed buffers, so only
+;; one inactive narrowed cache can be stored at a time.
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-cache-wide nil
+  "Holds the value of `syntax-ppss-cache' for a widened buffer.")
+(defvar-local syntax-ppss-last-wide nil
+  "Holds the value of `syntax-ppss-last' for a widened buffer.")
+(defvar-local syntax-ppss-max-valid-wide most-positive-fixnum
+  "The buffer position after which `syntax-ppss-cache-wide' is invalid.")
+
+(defvar-local syntax-ppss-cache-narrow nil
+  "Holds the value of `syntax-ppss-cache' for a narrowed buffer.")
+(defvar-local syntax-ppss-last-narrow nil
+  "Holds the value of `syntax-ppss-last' for a narrowed buffer.")
+(defvar-local syntax-ppss-max-valid-narrow most-positive-fixnum
+  "The buffer position after which `syntax-ppss-cache-narrow' is invalid.")
+
+(defvar-local syntax-ppss-narrow-point-min 1
+  "Value of `point-min' for which the stored \"narrow\" cache is valid.")
+
+(defvar-local syntax-ppss-supremum most-positive-fixnum
+  "Lowest change position since previous restriction change.")
+
+(defvar-local syntax-ppss-cache-point-min 1
+  "Value of `point-min' for which the current cache is valid.")
+
+(defun syntax-ppss-set-cache ()
+  "Swap in and out the cache pertinent to the new point-min."
+  (unless (eq (point-min) syntax-ppss-cache-point-min)
+    ;; Update the stored `...max-valid' values.
+    (setq syntax-ppss-max-valid-wide
+          (if (eq syntax-ppss-cache-point-min 1)
+              (or (caar syntax-ppss-cache)
+                  1)
+            (min syntax-ppss-max-valid-wide syntax-ppss-supremum)))
+    (setq syntax-ppss-max-valid-narrow
+          (if (eq syntax-ppss-cache-point-min syntax-ppss-narrow-point-min)
+              (or (caar syntax-ppss-cache)
+                  syntax-ppss-cache-point-min)
+            (min syntax-ppss-max-valid-narrow syntax-ppss-supremum)))
+    (setq syntax-ppss-supremum most-positive-fixnum)
+
+    ;; Store away the current values of the cache.
+    (cond
+     ((eq syntax-ppss-cache-point-min 1)
+      (setq syntax-ppss-cache-wide syntax-ppss-cache
+            syntax-ppss-last-wide syntax-ppss-last))
+     ((eq syntax-ppss-cache-point-min syntax-ppss-narrow-point-min)
+      (setq syntax-ppss-cache-narrow syntax-ppss-cache
+            syntax-ppss-last-narrow syntax-ppss-last))
+     (syntax-ppss-cache
+      (setq syntax-ppss-narrow-point-min syntax-ppss-cache-point-min
+            syntax-ppss-cache-narrow syntax-ppss-cache
+            syntax-ppss-last-narrow syntax-ppss-last))
+     (t nil))
+
+    ;; Restore/initialize the cache for the new point-min.
+    (cond
+     ((eq (point-min) 1)
+      (setq syntax-ppss-cache syntax-ppss-cache-wide
+            syntax-ppss-last syntax-ppss-last-wide)
+      (save-restriction
+        (widen)
+        (syntax-ppss-invalidate-cache syntax-ppss-max-valid-wide)))
+     ((eq (point-min) syntax-ppss-narrow-point-min)
+      (setq syntax-ppss-cache syntax-ppss-cache-narrow
+            syntax-ppss-last syntax-ppss-last-narrow)
+      (save-restriction
+        (widen)
+        (syntax-ppss-invalidate-cache syntax-ppss-max-valid-narrow)))
+     (t
+      (setq syntax-ppss-cache nil
+            syntax-ppss-last nil)))
+    (setq syntax-ppss-cache-point-min (point-min))))
+
+(defun syntax-ppss-invalidate-cache (beg &rest ignored)
+  "Invalidate the cache of `syntax-ppss' starting at position BEG."
   ;; Flush invalid cache entries.
   (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
     (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
@@ -411,6 +503,16 @@ syntax-ppss-flush-cache
   ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
   )
 
+;; Retain the following two for compatibility reasons.
+(defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
+(defun syntax-ppss-flush-cache (beg &rest ignored)
+  "Flush the `syntax-ppss' caches and set `syntax-propertize--done'."
+  (setq syntax-ppss-supremum (min beg syntax-ppss-supremum))
+  ;; Ensure the appropriate cache is active.
+  (syntax-ppss-set-cache)
+  (setq syntax-propertize--done (min beg syntax-propertize--done))
+  (syntax-ppss-invalidate-cache beg ignored))
+
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
 (defun syntax-ppss-stats ()
@@ -434,6 +536,8 @@ syntax-ppss
 this function is called while `before-change-functions' is
 temporarily let-bound, or if the buffer is modified without
 running the hook."
+  ;; Ensure the appropriate cache is active.
+  (syntax-ppss-set-cache)
   ;; Default values.
   (unless pos (setq pos (point)))
   (syntax-propertize pos)


> All the best,
> Dmitry.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply related	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
@ 2017-09-10 22:53                             ` Stefan Monnier
  2017-09-10 23:36                               ` Dmitry Gutov
  2017-09-11 19:42                               ` Alan Mackenzie
  2017-09-11  0:11                             ` Dmitry Gutov
  2017-09-17 11:12                             ` Philipp Stephani
  2 siblings, 2 replies; 155+ messages in thread
From: Stefan Monnier @ 2017-09-10 22:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

> +;; Several caches.
> +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
> +;; (POINT-MIN) x), we need either to empty the cache when we narrow
> +;; the buffer, which is suboptimal, or we need to use several caches.

I think that (parse-partial-sexp 1 x) is more often what the caller
wants than (parse-partial-sexp (point-min) x), but if you're happy with
the behavior described by the docstring, then that's fine.

> +;; The implementation which follows uses three caches, the current one
> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
> +;; former state of the active cache as it was used in widened and
> +;; narrowed buffers respectively.

Earlier in the thread, I suggested to use a single cache indexed by the
position of point-min (or by the position of point-min and by the
current syntax-table, so as to also handle changes in the syntax-table),
i.e. a list of (POINT-MIN-POS . CACHE-DATA) or
((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA).  I think it would lead to
less code duplication than your patch which only handles 2 different
POINT-MIN-POS (and one of the two has to be equal to 1), but existing
code trumps hypothetical designs.
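
A rough sketch of the indexed design, assuming an alist keyed on the value
of point-min only; the names `my-ppss-caches' and
`my-ppss-cache-for-current-restriction' are made up for illustration and do
not exist in syntax.el:

```elisp
;; Hypothetical sketch: one cache cell per value of `point-min'.
(defvar-local my-ppss-caches nil
  "Alist of (POINT-MIN-POS . CACHE-DATA), illustrative only.")

(defun my-ppss-cache-for-current-restriction ()
  "Return the cache cell for the current `point-min', creating it if needed."
  (or (assq (point-min) my-ppss-caches)
      ;; No cell yet for this restriction: start one with empty CACHE-DATA.
      (let ((cell (list (point-min))))
        (push cell my-ppss-caches)
        cell)))
```

The CACHE-DATA part would be read with `cdr' and stored with `setcdr' on
the returned cell.  Keying on the syntax table as well would need a
compound key, which this sketch does not attempt.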

So, I have no objections to the patch.  But I think (parse-partial-sexp
(point-min) x) is a design bug in syntax-ppss which we will need to fix
sooner or later, which is why I never bothered to implement something
like your patch, which only makes the code do what its doc says rather
than what the caller needs.

        Stefan

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 22:53                             ` Stefan Monnier
@ 2017-09-10 23:36                               ` Dmitry Gutov
  2017-09-11 11:10                                 ` Stefan Monnier
  2017-09-11 19:42                               ` Alan Mackenzie
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-10 23:36 UTC (permalink / raw)
  To: Stefan Monnier, Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

On 9/11/17 1:53 AM, Stefan Monnier wrote:

> I think that (parse-partial-sexp 1 x) is more often what the caller
> wants than (parse-partial-sexp (point-min) x), but if you're happy with
> the behavior described by the docstring, then that's fine.

And yet, I struggle to find such callers. But those that do, can 
(save-restriction (widen) (syntax-ppss)) anyway.
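
Spelled out as a helper, that would look something like this
(`my-syntax-ppss-from-bob' is a hypothetical name, used here only for
illustration):

```elisp
;; Hypothetical wrapper for callers that want a parse from position 1.
(defun my-syntax-ppss-from-bob (&optional pos)
  "Return the parse state at POS as seen from position 1, ignoring narrowing."
  (save-excursion
    (save-restriction
      (widen)
      (syntax-ppss pos))))
```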

>> +;; The implementation which follows uses three caches, the current one
>> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
>> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
>> +;; former state of the active cache as it was used in widened and
>> +;; narrowed buffers respectively.
> 
> Earlier in the thread, I suggested to use a single cache indexed by the
> position of point-min

That would lead to clobbering the global cache when we use syntax-ppss 
for some local parsing. E.g. if ruby-syntax-propertize-percent-literal 
didn't bind parse-sexp-lookup-properties to nil, it might clobber the 
cache unnecessarily.

I don't have the data on whether this would be a frequent problem, though.

> i.e. a list of (POINT-MIN-POS . CACHE-DATA) or
> ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA).  I think it would lead to
> less code duplication than your patch which only handles 2 different
> POINT-MIN-POS (and one of the two has to be equal to 1), but existing
> code trumps hypothetical designs.

I also think there's a way to implement this behavior with less code and 
fewer new variables, albeit with extra indirection.

> So, I have no objections to the patch.  But I think (parse-partial-sexp
> (point-min) x) is a design bug in syntax-ppss which we will need to fix
> sooner or later, which is why I never bothered to implement something
> like your patch, which only makes the code do what its doc says rather
> than what the caller needs.

I'm considering the idea now that syntax-ppss should stay a caching 
wrapper around parse-partial-sexp, and the responsibility to widen 
should always be the caller's. This way, it can be used for different 
purposes that we've discussed before many times.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 23:36                               ` Dmitry Gutov
@ 2017-09-11 11:10                                 ` Stefan Monnier
  2017-09-12  0:11                                   ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Stefan Monnier @ 2017-09-11 11:10 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, Philipp Stephani, John Wiegley, 22983

>> I think that (parse-partial-sexp 1 x) is more often what the caller
>> wants than (parse-partial-sexp (point-min) x), but if you're happy with
>> the behavior described by the docstring, then that's fine.
> And yet, I struggle to find such callers.  But those that do, can
> (save-restriction (widen) (syntax-ppss)) anyway.

Good point.

>>> +;; The implementation which follows uses three caches, the current one
>>> +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
>>> +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
>>> +;; former state of the active cache as it was used in widened and
>>> +;; narrowed buffers respectively.
>> Earlier in the thread, I suggested to use a single cache indexed by the
>> position of point-min
> That would lead to clobbering the global cache when we use syntax-ppss for
> some local parsing.

My suggestion is to have a list of N caches, instead of having exactly
2 caches.  I can't see how that could lead to more clobbering.

> I'm considering the idea now that syntax-ppss should stay a caching wrapper
> around parse-partial-sexp, and the responsibility to widen should always be
> the caller's.  This way, it can be used for different purposes that we've
> discussed before many times.

It does have the advantage of circumventing the discussion of
"up-to-where should we widen" ;-)


        Stefan





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11 11:10                                 ` Stefan Monnier
@ 2017-09-12  0:11                                   ` Dmitry Gutov
  2017-09-12 22:12                                     ` Richard Stallman
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-12  0:11 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Alan Mackenzie, Philipp Stephani, John Wiegley, 22983

On 9/11/17 2:10 PM, Stefan Monnier wrote:

> My suggestion is to have a list of N caches, instead of having exactly
> 2 caches.  I can't see how that could lead to more clobbering.

Um, sorry I misunderstood. I interpreted that as only keeping one pair.

But here are some other issues:

1) If we maintain a cache for all narrowings that have ever been used in 
the buffer, we adopt the idea that all of them are "real" and e.g. 
correspond to chunks in different major modes in a multi-mode context. 
Switching to a different syntax table and parsing a segment of text like 
ruby-syntax-propertize-percent-literal does falls outside of this 
concept. But of course, we can index by syntax table as well... overall, 
things become much more complex than when changing the narrowing bounds
implies just throwing away that cache.

2) If there are a lot of elements inside the cache alist, we have to get 
rid of them from time to time. Not sure what the rules will be. Again, 
if they correspond to multi-mode chunks, we can at least be confident 
that the number of items in the alist will be finite. Not necessarily so 
if narrowing+syntax-ppss is used for arbitrary purposes.

3) As the number of elements in the alist grows, flushing each value 
inside syntax-ppss-flush-cache eagerly will become slower and slower, I 
expect. And a lazy strategy of the kind proposed by Alan will become 
necessary.
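
A minimal sketch of what such a lazy strategy could look like, assuming
each stored cache is a list of (POS . PPSS) entries with the highest POS
first.  All names here are hypothetical, and a real implementation would
need to track validity per stored cache, as Alan's patch does with its
max-valid variables:

```elisp
;; Hypothetical sketch of lazy invalidation.
(defvar-local my-ppss-lowest-change most-positive-fixnum
  "Lowest buffer position changed since the caches were last pruned.")

(defun my-ppss-note-change (beg &rest _ignored)
  "Record BEG cheaply instead of pruning every stored cache now."
  (setq my-ppss-lowest-change (min beg my-ppss-lowest-change)))

(defun my-ppss-prune (cache)
  "Return CACHE without the entries past `my-ppss-lowest-change'.
CACHE is a list of (POS . PPSS) entries, highest POS first."
  (while (and cache (> (caar cache) my-ppss-lowest-change))
    (setq cache (cdr cache)))
  cache)
```

The change hook then costs one comparison per edit, and the pruning work
is only paid for a cache when it is about to be used.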

>> I'm considering the idea now that syntax-ppss should stay a caching wrapper
>> around parse-partial-sexp, and the responsibility to widen should always be
>> the caller's.  This way, it can be used for different purposes that we've
>> discussed before many times.
> 
> It does have the advantage of circumventing the discussion of
> "up-to-where should we widen" ;-)

Indeed. :)

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-12  0:11                                   ` Dmitry Gutov
@ 2017-09-12 22:12                                     ` Richard Stallman
  0 siblings, 0 replies; 155+ messages in thread
From: Richard Stallman @ 2017-09-12 22:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 22983, p.stephani2, jwiegley, monnier, acm

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

There is probably some optimal number of caches to remember.
If the code can handle any number of caches, it can discard
all but the last N, and then we could try adjusting N to get
the best performance.  I expect we don't want N to be more
than 4.
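
As a sketch of the "keep only the last N" idea, assuming the caches live in
an alist of (POINT-MIN-POS . CACHE-DATA) ordered most recently used first;
`my-ppss-touch-cache' is a hypothetical name, not an existing function:

```elisp
(require 'seq)

;; Hypothetical sketch: CACHES is an alist of (POINT-MIN-POS . CACHE-DATA),
;; with the most recently used entry first.
(defun my-ppss-touch-cache (key caches n)
  "Return CACHES with KEY's entry moved to the front, keeping at most N.
A new empty entry is created if KEY has none.  CACHES is not modified."
  (let* ((cell (or (assq key caches) (list key)))
         (others (delq cell (copy-sequence caches))))
    (seq-take (cons cell others) n)))
```

N is passed as a parameter here so that different values can be tried
without touching the function.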

-- 
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 22:53                             ` Stefan Monnier
  2017-09-10 23:36                               ` Dmitry Gutov
@ 2017-09-11 19:42                               ` Alan Mackenzie
  2017-09-11 20:20                                 ` Stefan Monnier
  1 sibling, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-11 19:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

Hello, Stefan.

On Sun, Sep 10, 2017 at 18:53:53 -0400, Stefan Monnier wrote:
> > +;; Several caches.
> > +;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
> > +;; (POINT-MIN) x), we need either to empty the cache when we narrow
> > +;; the buffer, which is suboptimal, or we need to use several caches.

> I think that (parse-partial-sexp 1 x) is more often what the caller
> wants than (parse-partial-sexp (point-min) x), but if you're happy with
> the behavior described by the docstring, then that's fine.

I've never been happy with the specification, partly for that reason,
but we are where we are, with lots of use of syntax-ppss, so I think it
needs fixing according to that spec.

> > +;; The implementation which follows uses three caches, the current one
> > +;; (in `syntax-ppss-cache' and `syntax-ppss-last') and two inactive
> > +;; ones (in `syntax-ppss-{cache,last}-{wide,narrow}'), which store the
> > +;; former state of the active cache as it was used in widened and
> > +;; narrowed buffers respectively.

> Earlier in the thread, I suggested to use a single cache indexed by the
> position of point-min (or by the position of point-min and by the
> current syntax-table, so as to also handle changes in the syntax-table),
> i.e. a list of (POINT-MIN-POS . CACHE-DATA) or
> ((POINT-MIN-POS . SYNTAX-TABLE) . CACHE-DATA).  I think it would lead to
> less code duplication than your patch which only handles 2 different
> POINT-MIN-POS (and one of the two has to be equal to 1), but existing
> code trumps hypothetical designs.

I deliberately kept the patch simple, avoiding even an alist with the
point-min position as key.  This would necessitate having an arbitrary
maximum length of alist, and continual manipulation of this list.  Not
difficult, I agree, but do we need it?  How often are there going to be
nested or alternating narrowing with enough calls to syntax-ppss to
cause the establishment of syntax-ppss-cache (as opposed to merely
syntax-ppss-last, which my patch doesn't consider sufficient reason to
store a new narrow-cache)?  (These aren't rhetorical questions, by the
way, but real ones.  Which is the best way forward?)

However, the patch was deliberately constructed to make the replacement
of the two-cache cache by an arbitrary length alist simple.

> So, I have no objections to the patch.  But I think (parse-partial-sexp
> (point-min) x) is a design bug in syntax-ppss which we will need to fix
> sooner or later, which is why I never bothered to implement something
> like your patch, which only makes the code do what its doc says rather
> than what the caller needs.

I couldn't agree more.  However, syntax-ppss is established and there
are callers that depend on its literal specification.  Maybe a way
forward would be to introduce a new function equivalent to
(parse-partial-sexp 1 x) and deprecate syntax-ppss.  However, a name
would need to be found for this new function, not an easy task.  ;-)
(syntax-ppss is a very good name, but couldn't be reused.)
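
[Editorial sketch, not part of the original message.]  A function
"equivalent to (parse-partial-sexp 1 x)" could, as a first
approximation, simply widen around the existing call.  The name
`my-syntax-ppss-widened' is made up for illustration; no such function
was proposed under that name in this thread, and this is only one
possible shape for it.

```elisp
;; Hypothetical wrapper; the name is an assumption, not an Emacs API.
(defun my-syntax-ppss-widened (&optional pos)
  "Return the parser state at POS (default point), ignoring narrowing.
The parse conceptually starts at position 1 rather than at
`point-min'.  Any narrowing in effect is restored afterwards."
  (save-restriction
    (widen)
    (syntax-ppss pos)))
```

The difference shows up when the narrowed region starts inside a
string or comment: a parse starting at point-min treats the text as
top-level code, while the widened parse still knows about the
enclosing construct.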

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11 19:42                               ` Alan Mackenzie
@ 2017-09-11 20:20                                 ` Stefan Monnier
  0 siblings, 0 replies; 155+ messages in thread
From: Stefan Monnier @ 2017-09-11 20:20 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

> difficult, I agree, but do we need it?  How often are there going to be
> nested or alternating narrowing with enough calls to syntax-ppss to
> cause the establishment of syntax-ppss-cache (as opposed to merely
> syntax-ppss-last, which my patch doesn't consider sufficient reason to
> store a new narrow-cache)?  (These aren't rhetorical questions, by the
> way, but real ones.  Which is the best way forward?)

I agree that it probably doesn't make much difference in practice.

> I couldn't agree more.  However, syntax-ppss is established and there
> are callers that depend on its literal specification.  Maybe a way
> forward would be to introduce a new function equivalent to
> (parse-partial-sexp 1 x) and deprecate syntax-ppss.  However, a name
> would need to be found for this new function, not an easy task.  ;-)
> (syntax-ppss is a very good name, but couldn't be reused.)

Let's go with your patch for now, and then see whether Dmitry's impression
holds that adding a call to `widen` before the call works even better.


        Stefan





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  2017-09-10 22:53                             ` Stefan Monnier
@ 2017-09-11  0:11                             ` Dmitry Gutov
  2017-09-11 20:12                               ` Alan Mackenzie
  2017-09-17 11:12                             ` Philipp Stephani
  2 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-11  0:11 UTC (permalink / raw)
  To: Alan Mackenzie, Philipp Stephani; +Cc: John Wiegley, 22983

On 9/10/17 2:36 PM, Alan Mackenzie wrote:

>>> The solution I propose is to introduce a second cache into syntax-ppss,
>>> and this cache would be used whenever (not (eq (point-min) 1)).
>>> Whenever point-min changes, and isn't 1, this second cache would be
>>> calculated again from scratch.
> 
> Here is a patch implementing this.  Comments about it would be welcome.

Thank you. It seems to hold up to the main test scenario I had in mind, 
so I don't have any complaints behavior-wise.

It looks pretty big, though. With lots of new global variables.

Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 
new ones.

I propose two avenues for simplification:

1) Use a cons structure for the (PPSS-CACHE . PPSS-LAST) structure. We 
will have three global variables total: syntax-ppss-data-wide, 
syntax-ppss-data-narrow, syntax-ppss-data-narrow-point-min. syntax-ppss 
would bind a local variable syntax-ppss-data to one of the first two 
depending on the value of the third (and then modify its car and cdr 
during the course of execution).

2) Some extra vars serve to delay the actual clearing of the unused 
cache until it's used again. It's a valid idea, but what if we try 
without it at first? So syntax-ppss-flush-cache would always clear both 
caches eagerly.

The advantages:

- Less code, easier to reason about.

- Any package that advises syntax-ppss will have to juggle fewer global 
variables. So Vitalie's polymode will have an easier time of it. It 
could even reuse some of the cache-while-narrowed logic by substituting 
the values of syntax-ppss-data-narrow and 
syntax-ppss-data-narrow-point-min as appropriate.

The obvious downside is, of course, extra indirection, which translates 
to extra overhead. We don't know how significant it will be, though.

Would you like to see the code?

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11  0:11                             ` Dmitry Gutov
@ 2017-09-11 20:12                               ` Alan Mackenzie
  2017-09-12  0:24                                 ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-11 20:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Mon, Sep 11, 2017 at 03:11:22 +0300, Dmitry Gutov wrote:
> On 9/10/17 2:36 PM, Alan Mackenzie wrote:

> >>> The solution I propose is to introduce a second cache into syntax-ppss,
> >>> and this cache would be used whenever (not (eq (point-min) 1)).
> >>> Whenever point-min changes, and isn't 1, this second cache would be
> >>> calculated again from scratch.

> > Here is a patch implementing this.  Comments about it would be welcome.

> Thank you. It seems to hold up to the main test scenario I had in mind, 
> so I don't have any complaints behavior-wise.

Thanks.

> It looks pretty big, though. With lots of new global variables.

> Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8 
> new ones.

Yes.  But each one has a single purpose, and there are no loops in
the new code, which makes it easier to be sure it is correct.

> I propose two avenues for simplification:

> 1) Use a cons structure for the (PPSS-CACHE . PPSS-LAST) structure. We 
> will have three global variables total: syntax-ppss-data-wide, 
> syntax-ppss-data-narrow, syntax-ppss-data-narrow-point-min. syntax-ppss 
> would bind a local variable syntax-ppss-data to one of the first two 
> depending on the value of the third (and then modify its car and cdr 
> during the course of execution).

I'm in favour rather of setting syntax-ppss-{cache,last} to the
appropriate stored cache.  This will avoid needing to change the
function syntax-ppss much.

A disadvantage of using such a cons is in debugging.  It is more
difficult to understand a cons like this when it is printed out, than
the two component lists (which are difficult enough themselves).

> 2) Some extra vars serve to delay the actual clearing of the unused 
> cache until it's used again. It's a valid idea, but what if we try 
> without it at first? So syntax-ppss-flush-cache would always clear both 
> caches eagerly.

When there's a lot of buffer changing going on, it is an overhead having
to clear both (or several) caches continually.  (I'm thinking about the
possible extension to using an alist of caches, which could be quite
long.)

Also clearing both caches at the same time would be a bigger change to
syntax-ppss-flush-cache than it's suffered so far.

But I'm really not sure which way is better.

> The advantages:

> - Less code, easier to reason about.

> - Any package that advises syntax-ppss will have to juggle fewer global 
> variables.

I was intending that the new variables be purely internal, and that no
external elisp would need to access them.  I suppose I really ought to
have put "--" in the middle of their names.

> So Vitalie's polymode will have an easier time of it. It could even
> reuse some of the cache-while-narrowed logic by substituting the
> values of syntax-ppss-data-narrow and
> syntax-ppss-data-narrow-point-min as appropriate.

That sounds a little dangerous.  

> The obvious downside is, of course, extra indirection, which translates 
> to extra overhead. We don't know how significant it will be, though.

I wouldn't be keen on seeing lots of (car compound-variable) and (cdr
compound-variable) throughout the syntax-ppss function.  I think it
would make it significantly more difficult to understand.

> Would you like to see the code?

Yes, why not?

But just to make my position clear, I'm not particularly fixed on my
patch as submitted.  It was optimised for simplicity and correctness
rather than elegance, though I don't think it's too bad.  I'm fairly
open on whether we use your suggestions or Stefan's suggestion of having
an alist of caches.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-11 20:12                               ` Alan Mackenzie
@ 2017-09-12  0:24                                 ` Dmitry Gutov
  2017-09-17 10:29                                   ` Alan Mackenzie
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-12  0:24 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

On 9/11/17 11:12 PM, Alan Mackenzie wrote:

>> Before, we had syntax-ppss-cache and syntax-ppss-last. The patch adds 8
>> new ones.
> 
> Yes.  But each one has a single purpose, and there are no loops in
> the new code, which makes it easier to be sure it is correct.

On the one hand, yes, on the other hand, the more code you have (or the 
more vars you have to juggle), the harder it is to keep track.

> I'm in favour rather of setting syntax-ppss-{cache,last} to the
> appropriate stored cache.  This will avoid needing to change the
> function syntax-ppss much.

My proposal will change syntax-ppss, yes. So, unfortunately, the patch 
will be more difficult to read. But not the resulting code, hopefully.

But I think I see what you mean. The disadvantage is that we'll need 
code that will ferry those values back to the appropriate variables as 
well (which we see in your patch). We can discuss that option after.

> A disadvantage of using such a cons is in debugging.  It is more
> difficult to understand a cons like this when it is printed out, than
> the two component lists (which are difficult enough themselves).

You win some, you lose some. We could use structs, if you like, but 
overall, the values are already complex, so consing won't make that much 
worse.

> When there's a lot of buffer changing going on, it is an overhead having
> to clear both (or several) caches continually.  (I'm thinking about the
> possible extension to using an alist of caches, which could be quite
> long.)

Both caches - yes, but shouldn't be too bad. The "alist of caches" 
approach would most likely require that laziness, but I'm not sure we 
really want to go there (see another email).

> Also clearing both caches at the same time would be a bigger change to
> syntax-ppss-flush-cache than it's suffered so far.

True.

>> - Any package that advises syntax-ppss will have to juggle fewer global
>> variables.
> 
> I was intending that the new variables be purely internal, and that no
> external elisp would need to access them.  I suppose I really ought to
> have put "--" in the middle of their names.

Yes, but if we can make life easier for some, why not? Sometimes 
third-party authors can live with breakage between Emacs versions.

>> So Vitalie's polymode will have an easier time of it. It could even
>> reuse some of the cache-while-narrowed logic by substituting the
>> values of syntax-ppss-data-narrow and
>> syntax-ppss-data-narrow-point-min as appropriate.
> 
> That sounds a little dangerous.

Not much worse than what multi-mode packages already do, though.

>> The obvious downside is, of course, extra indirection, which translates
>> to extra overhead. We don't know how significant it will be, though.
> 
> I wouldn't be keen on seeing lots of (car compound-variable) and (cdr
> compound-variable) throughout the syntax-ppss function.  I think it
> would make it significantly more difficult to understand.

Hopefully there will be only several such places. But again, we can use 
structs.

>> Would you like to see the code?
> 
> Yes, why not?

Please give me until the end of the week.

> But just to make my position clear, I'm not particularly fixed on my
> patch as submitted.  It was optimised for simplicity and correctness
> rather than elegance, though I don't think it's too bad.  I'm fairly
> open on whether we use your suggestions or Stefan's suggestion of having
> an alist of caches.

Cool.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-12  0:24                                 ` Dmitry Gutov
@ 2017-09-17 10:29                                   ` Alan Mackenzie
  2017-09-17 23:43                                     ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-17 10:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 12, 2017 at 03:24:08 +0300, Dmitry Gutov wrote:
> On 9/11/17 11:12 PM, Alan Mackenzie wrote:

[ .... ]

> > I wouldn't be keen on seeing lots of (car compound-variable) and (cdr
> > compound-variable) throughout the syntax-ppss function.  I think it
> > would make it significantly more difficult to understand.

> Hopefully there will be only several such places. But again, we can use 
> structs.

I don't know anything about these things.  But seeing as how syntax.el is
preloaded, the definition of structs would need to be preloaded earlier.

> >> Would you like to see the code?

> > Yes, why not?

> Please give me until the end of the week.

The end of the week has arrived.  Are you still intending to propose an
alternative formulation of the new cache manipulation for syntax-ppss?

> > But just to make my position clear, I'm not particularly fixed on my
> > patch as submitted.  It was optimised for simplicity and correctness
> > rather than elegance, though I don't think it's too bad.  I'm fairly
> > open on whether we use your suggestions or Stefan's suggestion of having
> > an alist of caches.

> Cool.

-- 
Alan Mackenzie (Nuremberg, Germany).





^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-17 10:29                                   ` Alan Mackenzie
@ 2017-09-17 23:43                                     ` Dmitry Gutov
  2017-09-18 19:08                                       ` Alan Mackenzie
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-17 23:43 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

[-- Attachment #1: Type: text/plain, Size: 715 bytes --]

Hi Alan,

On 9/17/17 1:29 PM, Alan Mackenzie wrote:

> I don't know anything about these things.  But seeing as how syntax.el is
> preloaded, the definition of structs would need to be preloaded earlier.

OK, let's do without that for now. The result doesn't look too bad to my 
eyes, at least.

>>>> Would you like to see the code?
> 
>>> Yes, why not?
> 
>> Please give me until the end of the week.
> 
> The end of the week has arrived.  Are you still intending to propose an
> alternative formulation of the new cache manipulation for syntax-ppss?

Thanks for the reminder. The patch is attached. I've tested it 
minimally, any feedback is welcome.

(It reads much better in Emacs with diff-auto-refine-mode).


[-- Attachment #2: alt-ppss-fix.diff --]
[-- Type: text/x-patch, Size: 7392 bytes --]

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..a77589f1b7 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -381,10 +381,26 @@ syntax-begin-function
 point (where the PPSS is equivalent to nil).")
 (make-obsolete-variable 'syntax-begin-function nil "25.1")
 
-(defvar-local syntax-ppss-cache nil
-  "List of (POS . PPSS) pairs, in decreasing POS order.")
-(defvar-local syntax-ppss-last nil
-  "Cache of (LAST-POS . LAST-PPSS).")
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;; We use two of them, one for widened buffer, and one for narrowing.
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-wide nil
+  "Cons of two elements (CACHE . LAST).
+Where CACHE is a list of (POS . PPSS) pairs, in decreasing POS order,
+and LAST is a pair (LAST-POS . LAST-PPS) caching the last invocation.
+These are valid when the buffer has no restriction.")
+
+(defvar-local syntax-ppss-narrow nil
+  "Same as `syntax-ppss-wide' but for a narrowed buffer.")
+
+(defvar-local syntax-ppss-narrow-start nil
+  "Start position of the narrowing for `syntax-ppss-narrow'.")
 
 (defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
 (defun syntax-ppss-flush-cache (beg &rest ignored)
@@ -392,24 +408,29 @@ syntax-ppss-flush-cache
   ;; Set syntax-propertize to refontify anything past beg.
   (setq syntax-propertize--done (min beg syntax-propertize--done))
   ;; Flush invalid cache entries.
-  (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
-    (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
-  ;; Throw away `last' value if made invalid.
-  (when (< beg (or (car syntax-ppss-last) 0))
-    ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
-    ;; depend on the text after BEG (which is presumably changed).  So if
-    ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
-    ;; assumed nil state at BEG may not be valid any more.
-    (if (<= beg (or (syntax-ppss-toplevel-pos (cdr syntax-ppss-last))
-                    (nth 3 syntax-ppss-last)
-                    0))
-	(setq syntax-ppss-last nil)
-      (setcar syntax-ppss-last nil)))
-  ;; Unregister if there's no cache left.  Sadly this doesn't work
-  ;; because `before-change-functions' is temporarily bound to nil here.
-  ;; (unless syntax-ppss-cache
-  ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
-  )
+  (dolist (cell (list syntax-ppss-wide syntax-ppss-narrow))
+    (pcase cell
+      (`(,cache . ,last)
+       (while (and cache (> (caar cache) beg))
+         (setq cache (cdr cache)))
+       ;; Throw away `last' value if made invalid.
+       (when (< beg (or (car last) 0))
+         ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
+         ;; depend on the text after BEG (which is presumably changed).  So if
+         ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
+         ;; assumed nil state at BEG may not be valid any more.
+         (if (<= beg (or (syntax-ppss-toplevel-pos (cdr last))
+                         (nth 3 last)
+                         0))
+	     (setq last nil)
+           (setcar last nil)))
+       ;; Unregister if there's no cache left.  Sadly this doesn't work
+       ;; because `before-change-functions' is temporarily bound to nil here.
+       ;; (unless cache
+       ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
+       (setcar cell cache)
+       (setcdr cell last)))
+    ))
 
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
@@ -423,6 +444,17 @@ syntax-ppss-stats
 (defvar-local syntax-ppss-table nil
   "Syntax-table to use during `syntax-ppss', if any.")
 
+(defun syntax-ppss--data ()
+  (if (eq (point-min) 1)
+      (progn
+        (unless syntax-ppss-wide
+          (setq syntax-ppss-wide (cons nil nil)))
+        syntax-ppss-wide)
+    (unless (eq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow (cons nil nil)))
+    syntax-ppss-narrow))
+
 (defun syntax-ppss (&optional pos)
   "Parse-Partial-Sexp State at POS, defaulting to point.
 The returned value is the same as that of `parse-partial-sexp'
@@ -439,10 +471,13 @@ syntax-ppss
   (syntax-propertize pos)
   ;;
   (with-syntax-table (or syntax-ppss-table (syntax-table))
-  (let ((old-ppss (cdr syntax-ppss-last))
-	(old-pos (car syntax-ppss-last))
-	(ppss nil)
-	(pt-min (point-min)))
+  (let* ((cell (syntax-ppss--data))
+         (ppss-cache (car cell))
+         (ppss-last (cdr cell))
+         (old-ppss (cdr ppss-last))
+         (old-pos (car ppss-last))
+         (ppss nil)
+         (pt-min (point-min)))
     (if (and old-pos (> old-pos pos)) (setq old-pos nil))
     ;; Use the OLD-POS if usable and close.  Don't update the `last' cache.
     (condition-case nil
@@ -475,7 +510,7 @@ syntax-ppss
 	   ;; The OLD-* data can't be used.  Consult the cache.
 	   (t
 	    (let ((cache-pred nil)
-		  (cache syntax-ppss-cache)
+		  (cache ppss-cache)
 		  (pt-min (point-min))
 		  ;; I differentiate between PT-MIN and PT-BEST because
 		  ;; I feel like it might be important to ensure that the
@@ -491,7 +526,7 @@ syntax-ppss
 	      (if cache (setq pt-min (caar cache) ppss (cdar cache)))
 
 	      ;; Setup the before-change function if necessary.
-	      (unless (or syntax-ppss-cache syntax-ppss-last)
+	      (unless (or ppss-cache ppss-last)
 		(add-hook 'before-change-functions
 			  'syntax-ppss-flush-cache t t))
 
@@ -541,7 +576,7 @@ syntax-ppss
 			      pt-min (setq pt-min (/ (+ pt-min pos) 2))
 			      nil nil ppss))
                   (push (cons pt-min ppss)
-                        (if cache-pred (cdr cache-pred) syntax-ppss-cache)))
+                        (if cache-pred (cdr cache-pred) ppss-cache)))
 
 		;; Compute the actual return value.
 		(setq ppss (parse-partial-sexp pt-min pos nil nil ppss))
@@ -562,13 +597,15 @@ syntax-ppss
 		      (if (> (- (caar cache-pred) pos) syntax-ppss-max-span)
 			  (push pair (cdr cache-pred))
 			(setcar cache-pred pair))
-		    (if (or (null syntax-ppss-cache)
-			    (> (- (caar syntax-ppss-cache) pos)
+		    (if (or (null ppss-cache)
+			    (> (- (caar ppss-cache) pos)
 			       syntax-ppss-max-span))
-			(push pair syntax-ppss-cache)
-		      (setcar syntax-ppss-cache pair)))))))))
+			(push pair ppss-cache)
+		      (setcar ppss-cache pair)))))))))
 
-	  (setq syntax-ppss-last (cons pos ppss))
+	  (setq ppss-last (cons pos ppss))
+          (setcar cell ppss-cache)
+          (setcdr cell ppss-last)
 	  ppss)
       (args-out-of-range
        ;; If the buffer is more narrowed than when we built the cache,
@@ -582,7 +619,7 @@ syntax-ppss
 (defun syntax-ppss-debug ()
   (let ((pt nil)
 	(min-diffs nil))
-    (dolist (x (append syntax-ppss-cache (list (cons (point-min) nil))))
+    (dolist (x (append (car (syntax-ppss--data)) (list (cons (point-min) nil))))
       (when pt (push (- pt (car x)) min-diffs))
       (setq pt (car x)))
     min-diffs))

^ permalink raw reply related	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-17 23:43                                     ` Dmitry Gutov
@ 2017-09-18 19:08                                       ` Alan Mackenzie
  2017-09-19  0:02                                         ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-18 19:08 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry

On Mon, Sep 18, 2017 at 02:43:05 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/17/17 1:29 PM, Alan Mackenzie wrote:

> > I don't know anything about these things.  But seeing as how syntax.el is
> > preloaded, the definition of structs would need to be preloaded earlier.

> OK, let's do without that for now. The result doesn't look too bad to my 
> eyes, at least.

> >>>> Would you like to see the code?

> >>> Yes, why not?

> >> Please give me until the end of the week.

> > The end of the week has arrived.  Are you still intending to propose an
> > alternative formulation of the new cache manipulation for syntax-ppss?

> Thanks for the reminder. The patch is attached. I've tested it 
> minimally, any feedback is welcome.

Thanks for this.  I'm impressed.  Your syntax-ppss--data is far more
elegant than my syntax-ppss-set-cache.  The burden of carrying around
the caches in cons cells is much less than I had feared.  The amendments
to syntax-ppss are also less than I had feared, amounting to little more
than substituting "syntax-ppss-cache" with "ppss-cache" etc., and making
a few bindings to support that.

I notice you flush both caches eagerly, as you said you would.  No harm
in that.

So, I'm willing to go with your version.  I haven't tried actually
running it, yet.

But there's one small change I would ask you to consider making - that
is, in the cache conses, to put ppss-last in the car and ppss-cache in
the cdr.  That way, while debugging, ppss-last will be easy to find
(it's the first element of the list) and ppss-cache will also be easy to
find (the second element onwards).  
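
[Editorial sketch, not part of the original message.]  The effect of
the requested ordering can be seen with made-up data.  The variable
names and the parser-state values below are invented purely for
illustration; only the (LAST . CACHE) shape is taken from the
discussion.

```elisp
;; Hypothetical demo data; positions and states are made up.
(defvar my-demo-last '(120 0 nil 118 nil nil nil 0 nil nil nil)
  "A made-up (LAST-POS . LAST-PPSS) pair.")

(defvar my-demo-cache '((100 0 nil 95 nil nil nil 0 nil nil nil)
                        (50 0 nil 47 nil nil nil 0 nil nil nil))
  "A made-up list of (POS . PPSS) pairs, in decreasing POS order.")

;; With LAST in the car, the printed cell reads as one flat list:
;; its first element is LAST, the remaining elements are CACHE.
(defvar my-demo-cell (cons my-demo-last my-demo-cache)
  "A made-up (LAST . CACHE) cell.")
```

Printing `my-demo-cell' then shows LAST as the first element and the
cache entries as the elements that follow it, which is the debugging
convenience asked for above.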

> (It reads much better in Emacs with diff-auto-refine-mode).

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 155+ messages in thread

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-18 19:08                                       ` Alan Mackenzie
@ 2017-09-19  0:02                                         ` Dmitry Gutov
  2017-09-19 20:47                                           ` Alan Mackenzie
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-19  0:02 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

[-- Attachment #1: Type: text/plain, Size: 1080 bytes --]

On 9/18/17 10:08 PM, Alan Mackenzie wrote:

> Thanks for this.  I'm impressed.  Your syntax-ppss--data is far more
> elegant than my syntax-ppss-set-cache.  The burden of carrying around
> the caches in cons cells is much less than I had feared.  The amendments
> to syntax-ppss are also less than I had feared, amounting to little more
> than substituting "syntax-ppss-cache" with "ppss-cache" etc., and making
> a few bindings to support that.

Thanks!

> I notice you flush both caches eagerly, as you said you would.  No harm
> in that.
> 
> So, I'm willing to go with your version.  I haven't tried actually
> running it, yet.

Please do.

> But there's one small change I would ask you to consider making - that
> is, in the cache conses, to put ppss-last in the car and ppss-cache in
> the cdr.  That way, while debugging, ppss-last will be easy to find
> (it's the first element of the list) and ppss-cache will also be easy to
> find (the second element onwards).

Sure, that makes a lot of sense, since ppss-last is a smaller structure. 
The modified patch is attached.

[-- Attachment #2: alt-ppss-fix-2.diff --]
[-- Type: text/x-patch, Size: 7391 bytes --]

diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
index d1d5176944..c44e754ac0 100644
--- a/lisp/emacs-lisp/syntax.el
+++ b/lisp/emacs-lisp/syntax.el
@@ -381,10 +381,26 @@ syntax-begin-function
 point (where the PPSS is equivalent to nil).")
 (make-obsolete-variable 'syntax-begin-function nil "25.1")
 
-(defvar-local syntax-ppss-cache nil
-  "List of (POS . PPSS) pairs, in decreasing POS order.")
-(defvar-local syntax-ppss-last nil
-  "Cache of (LAST-POS . LAST-PPSS).")
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Several caches.
+;;
+;; Because `syntax-ppss' is equivalent to (parse-partial-sexp
+;; (POINT-MIN) x), we need either to empty the cache when we narrow
+;; the buffer, which is suboptimal, or we need to use several caches.
+;; We use two of them, one for widened buffer, and one for narrowing.
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(defvar-local syntax-ppss-wide nil
+  "Cons of two elements (LAST . CACHE).
+Where LAST is a pair (LAST-POS . LAST-PPS) caching the last invocation
+and CACHE is a list of (POS . PPSS) pairs, in decreasing POS order.
+These are valid when the buffer has no restriction.")
+
+(defvar-local syntax-ppss-narrow nil
+  "Same as `syntax-ppss-wide' but for a narrowed buffer.")
+
+(defvar-local syntax-ppss-narrow-start nil
+  "Start position of the narrowing for `syntax-ppss-narrow'.")
 
 (defalias 'syntax-ppss-after-change-function 'syntax-ppss-flush-cache)
 (defun syntax-ppss-flush-cache (beg &rest ignored)
@@ -392,24 +408,29 @@ syntax-ppss-flush-cache
   ;; Set syntax-propertize to refontify anything past beg.
   (setq syntax-propertize--done (min beg syntax-propertize--done))
   ;; Flush invalid cache entries.
-  (while (and syntax-ppss-cache (> (caar syntax-ppss-cache) beg))
-    (setq syntax-ppss-cache (cdr syntax-ppss-cache)))
-  ;; Throw away `last' value if made invalid.
-  (when (< beg (or (car syntax-ppss-last) 0))
-    ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
-    ;; depend on the text after BEG (which is presumably changed).  So if
-    ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
-    ;; assumed nil state at BEG may not be valid any more.
-    (if (<= beg (or (syntax-ppss-toplevel-pos (cdr syntax-ppss-last))
-                    (nth 3 syntax-ppss-last)
-                    0))
-	(setq syntax-ppss-last nil)
-      (setcar syntax-ppss-last nil)))
-  ;; Unregister if there's no cache left.  Sadly this doesn't work
-  ;; because `before-change-functions' is temporarily bound to nil here.
-  ;; (unless syntax-ppss-cache
-  ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
-  )
+  (dolist (cell (list syntax-ppss-wide syntax-ppss-narrow))
+    (pcase cell
+      (`(,last . ,cache)
+       (while (and cache (> (caar cache) beg))
+         (setq cache (cdr cache)))
+       ;; Throw away `last' value if made invalid.
+       (when (< beg (or (car last) 0))
+         ;; If syntax-begin-function jumped to BEG, then the old state at BEG can
+         ;; depend on the text after BEG (which is presumably changed).  So if
+         ;; BEG=(car (nth 10 syntax-ppss-last)) don't reuse that data because the
+         ;; assumed nil state at BEG may not be valid any more.
+         (if (<= beg (or (syntax-ppss-toplevel-pos (cdr last))
+                         (nth 3 last)
+                         0))
+	     (setq last nil)
+           (setcar last nil)))
+       ;; Unregister if there's no cache left.  Sadly this doesn't work
+       ;; because `before-change-functions' is temporarily bound to nil here.
+       ;; (unless cache
+       ;;   (remove-hook 'before-change-functions 'syntax-ppss-flush-cache t))
+       (setcar cell last)
+       (setcdr cell cache)))
+    ))
 
 (defvar syntax-ppss-stats
   [(0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (0 . 0.0) (1 . 2500.0)])
@@ -423,6 +444,17 @@ syntax-ppss-stats
 (defvar-local syntax-ppss-table nil
   "Syntax-table to use during `syntax-ppss', if any.")
 
+(defun syntax-ppss--data ()
+  (if (eq (point-min) 1)
+      (progn
+        (unless syntax-ppss-wide
+          (setq syntax-ppss-wide (cons nil nil)))
+        syntax-ppss-wide)
+    (unless (eq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow-start (point-min))
+      (setq syntax-ppss-narrow (cons nil nil)))
+    syntax-ppss-narrow))
+
 (defun syntax-ppss (&optional pos)
   "Parse-Partial-Sexp State at POS, defaulting to point.
 The returned value is the same as that of `parse-partial-sexp'
@@ -439,10 +471,13 @@ syntax-ppss
   (syntax-propertize pos)
   ;;
   (with-syntax-table (or syntax-ppss-table (syntax-table))
-  (let ((old-ppss (cdr syntax-ppss-last))
-	(old-pos (car syntax-ppss-last))
-	(ppss nil)
-	(pt-min (point-min)))
+  (let* ((cell (syntax-ppss--data))
+         (ppss-last (car cell))
+         (ppss-cache (cdr cell))
+         (old-ppss (cdr ppss-last))
+         (old-pos (car ppss-last))
+         (ppss nil)
+         (pt-min (point-min)))
     (if (and old-pos (> old-pos pos)) (setq old-pos nil))
     ;; Use the OLD-POS if usable and close.  Don't update the `last' cache.
     (condition-case nil
@@ -475,7 +510,7 @@ syntax-ppss
 	   ;; The OLD-* data can't be used.  Consult the cache.
 	   (t
 	    (let ((cache-pred nil)
-		  (cache syntax-ppss-cache)
+		  (cache ppss-cache)
 		  (pt-min (point-min))
 		  ;; I differentiate between PT-MIN and PT-BEST because
 		  ;; I feel like it might be important to ensure that the
@@ -491,7 +526,7 @@ syntax-ppss
 	      (if cache (setq pt-min (caar cache) ppss (cdar cache)))
 
 	      ;; Setup the before-change function if necessary.
-	      (unless (or syntax-ppss-cache syntax-ppss-last)
+	      (unless (or ppss-cache ppss-last)
 		(add-hook 'before-change-functions
 			  'syntax-ppss-flush-cache t t))
 
@@ -541,7 +576,7 @@ syntax-ppss
 			      pt-min (setq pt-min (/ (+ pt-min pos) 2))
 			      nil nil ppss))
                   (push (cons pt-min ppss)
-                        (if cache-pred (cdr cache-pred) syntax-ppss-cache)))
+                        (if cache-pred (cdr cache-pred) ppss-cache)))
 
 		;; Compute the actual return value.
 		(setq ppss (parse-partial-sexp pt-min pos nil nil ppss))
@@ -562,13 +597,15 @@ syntax-ppss
 		      (if (> (- (caar cache-pred) pos) syntax-ppss-max-span)
 			  (push pair (cdr cache-pred))
 			(setcar cache-pred pair))
-		    (if (or (null syntax-ppss-cache)
-			    (> (- (caar syntax-ppss-cache) pos)
+		    (if (or (null ppss-cache)
+			    (> (- (caar ppss-cache) pos)
 			       syntax-ppss-max-span))
-			(push pair syntax-ppss-cache)
-		      (setcar syntax-ppss-cache pair)))))))))
+			(push pair ppss-cache)
+		      (setcar ppss-cache pair)))))))))
 
-	  (setq syntax-ppss-last (cons pos ppss))
+	  (setq ppss-last (cons pos ppss))
+          (setcar cell ppss-last)
+          (setcdr cell ppss-cache)
 	  ppss)
       (args-out-of-range
        ;; If the buffer is more narrowed than when we built the cache,
@@ -582,7 +619,7 @@ syntax-ppss
 (defun syntax-ppss-debug ()
   (let ((pt nil)
 	(min-diffs nil))
-    (dolist (x (append syntax-ppss-cache (list (cons (point-min) nil))))
+    (dolist (x (append (cdr (syntax-ppss--data)) (list (cons (point-min) nil))))
       (when pt (push (- pt (car x)) min-diffs))
       (setq pt (car x)))
     min-diffs))
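
As a quick sanity check of the contract this patch is meant to restore,
something like the following can be evaluated in an Emacs that already
includes it.  It is only a sketch: the names `my-ppss-strip',
`my-ppss-contract-holds-p' and `my-ppss-contract-demo' and the sample text
are made up for illustration and are not part of syntax.el.  Elements 2
and 6 are blanked out before comparing because they are documented as
unreliable in the value of `syntax-ppss'.

```elisp
(defun my-ppss-strip (state)
  "Return a copy of parser STATE with elements 2 and 6 set to nil.
Hypothetical helper, not part of syntax.el.  The copy matters:
`syntax-ppss' may hand back a cached object that must not be modified."
  (let ((copy (copy-sequence state)))
    (setcar (nthcdr 2 copy) nil)
    (setcar (nthcdr 6 copy) nil)
    copy))

(defun my-ppss-contract-holds-p (pos)
  "Non-nil if `syntax-ppss' at POS agrees with a parse from `point-min'."
  (equal (my-ppss-strip (syntax-ppss pos))
         (my-ppss-strip (parse-partial-sexp (point-min) pos))))

(defun my-ppss-contract-demo ()
  "Check the contract widened, narrowed, and widened again."
  (with-temp-buffer
    (insert "(a \"x y\" b) (c d)")
    (let (results)
      (push (my-ppss-contract-holds-p 15) results) ; widened
      (narrow-to-region 5 (point-max))             ; point-min inside "x y"
      (push (my-ppss-contract-holds-p 15) results) ; narrowed
      (widen)
      (push (my-ppss-contract-holds-p 15) results) ; widened again
      (nreverse results))))
```

The narrowing deliberately starts inside a string, as in the original
recipe, so that the widened and narrowed parses genuinely differ.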

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-19  0:02                                         ` Dmitry Gutov
@ 2017-09-19 20:47                                           ` Alan Mackenzie
  2017-09-22 14:09                                             ` Dmitry Gutov
  0 siblings, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-19 20:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 19, 2017 at 03:02:06 +0300, Dmitry Gutov wrote:
> On 9/18/17 10:08 PM, Alan Mackenzie wrote:

[ .... ]

> > So, I'm willing to go with your version.  I haven't tried actually
> > running it, yet.

> Please do.

I have done now, without the slightest cause for concern (see below).

> > But there's one small change I would ask you to consider making - that
> > is, in the cache conses, to put ppss-last in the car and ppss-cache in
> > the cdr.  That way, while debugging, ppss-last will be easy to find
> > (it's the first element of the list) and ppss-cache will also be easy to
> > find (the second element onwards).

> Sure, that makes a lot of sense, since ppss-last is a smaller structure. 
> The modified patch is attached.

Thanks.

I've done some semi-formal testing on it.  My semi-formal test log is:

(ii) Do some testing, using xdisp.c as test file.  A file.c will not have
  other calls to syntax-ppss interfering with the tests.
  o - 1. Normal working: check both caches stay empty.  They don't, because
    syntax-ppss is used, I think, by font locking.
  o - 2. Normal work in a narrowed buffer.  Seems OK.
  o - 3. Switch back to widened.  Seems OK.
  o - 4. Switch back to narrowed, same point-min.  Check the caches.  They
    look OK.
  o - 5. Switch to a different narrowing and (syntax-ppss (point-min)).  This
    does indeed empty the syntax-ppss-narrow, as it should.  s-p-wide looks
    unchanged.  Good.
  o - 6. Get well filled caches for both narrow and wide regions.  With the
    buffer wide, make a buffer change early in the buffer.  Check both caches
    are properly trimmed.  They are.
  o - 7. Repeat 6, but trim with the buffer narrow.  Both caches look OK, the
    narrow cache being (nil).
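
Steps 4 and 5 above can be approximated in a scratch session along these
lines.  This is only a sketch: `my-ppss-narrow-starts' is a made-up name,
the sample text is arbitrary, and the code peeks at `syntax-ppss-wide' and
`syntax-ppss-narrow-start', which are internal variables introduced by the
patch and may change without notice.

```elisp
(defun my-ppss-narrow-starts ()
  "Return (START-A START-B WIDE-LAST-POS) after two different narrowings.
Hypothetical helper; relies on internal variables from the patch."
  (with-temp-buffer
    (insert "(a \"x y\" b) (c d)")
    (syntax-ppss 15)                      ; fill the wide cache
    (let (start-a start-b)
      (narrow-to-region 5 (point-max))
      (syntax-ppss 15)
      (setq start-a syntax-ppss-narrow-start)
      (widen)
      (narrow-to-region 9 (point-max))    ; a different point-min
      (syntax-ppss 15)
      (setq start-b syntax-ppss-narrow-start)
      (widen)
      ;; The wide cache's LAST entry should still be the one from the
      ;; first call, untouched by the narrowed calls.
      (list start-a start-b (car (car syntax-ppss-wide))))))
```

If the patch behaves as described, the recorded narrowing start follows
the second narrowing while the wide cache's last position stays put.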

Maybe I should also try some heavy hacking in, say, Emacs Lisp mode as a
kind of soak test, since elisp mode uses syntax-ppss quite a bit, I
believe.

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-19 20:47                                           ` Alan Mackenzie
@ 2017-09-22 14:09                                             ` Dmitry Gutov
  2017-09-24 11:26                                               ` Alan Mackenzie
  0 siblings, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-22 14:09 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

Hi Alan,

On 9/19/17 11:47 PM, Alan Mackenzie wrote:

> I have done now, without the slightest cause for concern (see below).

Thank you. Should you commit the patch (with any documentation tweaks 
you deem necessary), or should I?

> I've done some semi-formal testing on it.  My semi-formal test log is:
> 
> (ii) Do some testing, using xdisp.c as test file.  A file.c will not have
>    other calls to syntax-ppss interfering with the tests.
>    o - 1. Normal working: check both caches stay empty.  They don't, because
>      syntax-ppss is used, I think, by font locking.
>    o - 2. Normal work in a narrowed buffer.  Seems OK.
>    o - 3. Switch back to widened.  Seems OK.
>    o - 4. Switch back to narrowed, same point-min.  Check the caches.  They
>      look OK.
>    o - 5. Switch to a different narrowing and (syntax-ppss (point-min)).  This
>      does indeed empty the syntax-ppss-narrow, as it should.  s-p-wide looks
>      unchanged.  Good.
>    o - 6. Get well filled caches for both narrow and wide regions.  With the
>      buffer wide, make a buffer change early in the buffer.  Check both caches
>      are properly trimmed.  They are.
>    o - 7. Repeat 6, but trim with the buffer narrow.  Both caches look OK, the
>      narrow cache being (nil).

Yes, this sounds fine. I've tried out most of those myself too, except 
usually without checking the cache contents. Just the syntax-ppss results.

It would be nice to have 2 or 3 of those added as automated tests, BTW.
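
A rough idea of what such tests might look like, using ERT.  This is a
sketch under assumptions rather than anything that was committed: the
test names, the helper `my-ppss-depth-and-string' and the sample buffers
are invented here, and only the depth and the string element of the
state are compared.

```elisp
(require 'ert)

(defun my-ppss-depth-and-string (pos)
  "Return (DEPTH STRING-CHAR) from `syntax-ppss' at POS.
Hypothetical helper used by the sketch tests below."
  (let ((state (syntax-ppss pos)))
    (list (nth 0 state) (nth 3 state))))

(ert-deftest my-ppss-narrow-then-widen ()
  "Results follow the current restriction (steps 2 to 4 of the log)."
  (with-temp-buffer
    (insert "(a \"x y\" b) (c d)")
    (should (equal (my-ppss-depth-and-string 15) '(1 nil)))
    (narrow-to-region 5 (point-max))      ; point-min is now inside "x y"
    (should (equal (my-ppss-depth-and-string 15) '(0 ?\")))
    (widen)
    (should (equal (my-ppss-depth-and-string 15) '(1 nil)))))

(ert-deftest my-ppss-flush-on-change ()
  "An edit near the start invalidates later cached states (step 6)."
  (with-temp-buffer
    (insert "(c d)")
    (should (equal (my-ppss-depth-and-string 3) '(1 nil)))
    (goto-char (point-min))
    (insert "\"")                         ; the rest is now inside a string
    (should (equal (my-ppss-depth-and-string 4) '(0 ?\")))))
```

Both tests check only `syntax-ppss' results, not cache contents, so they
should not depend on how the caches are laid out internally.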

> Maybe I should also try some heavy hacking in, say, Emacs Lisp mode as a
> kind of soak test, since elisp mode uses syntax-ppss quite a bit, I
> believe.

Sure, except emacs-lisp-mode seems to still retain certain 
indentation-related problems, even without this change.

I don't really expect to uncover problems from this patch much later. 
That's been the point of making the change as simple as possible.

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-22 14:09                                             ` Dmitry Gutov
@ 2017-09-24 11:26                                               ` Alan Mackenzie
  2017-09-25 23:53                                                 ` Dmitry Gutov
  2017-10-04 20:07                                                 ` Johan Bockgård
  0 siblings, 2 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-24 11:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Fri, Sep 22, 2017 at 17:09:03 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/19/17 11:47 PM, Alan Mackenzie wrote:

> > I have done now, without the slightest cause for concern (see below).

> Thank you. Should you commit the patch (with any documentation tweaks 
> you deem necessary), or should I?

Could I ask you to do it, please?  I'm somewhat exhausted from debating
another basic Emacs change.

Ah yes, the documentation.  I checked the doc in the elisp manual, and
twice the phrase "from the beginning of the buffer" was used.  I've
clarified that with "from the beginning of the visible portion of the
buffer".  I've also amended "a cache" to "caches", though this doesn't
seem too important.  What do you think:


diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index e3ae53536f..b37f2b22b8 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -751,7 +751,8 @@ Position Parse
 
 @defun syntax-ppss &optional pos
 This function returns the parser state that the parser would reach at
-position @var{pos} starting from the beginning of the buffer.
+position @var{pos} starting from the beginning of the visible portion
+of the buffer.
 @iftex
 See the next section for
 @end iftex
@@ -762,11 +763,11 @@ Position Parse
 
 The return value is the same as if you call the low-level parsing
 function @code{parse-partial-sexp} to parse from the beginning of the
-buffer to @var{pos} (@pxref{Low-Level Parsing}).  However,
-@code{syntax-ppss} uses a cache to speed up the computation.  Due to
-this optimization, the second value (previous complete subexpression)
-and sixth value (minimum parenthesis depth) in the returned parser
-state are not meaningful.
+visible portion of the buffer to @var{pos} (@pxref{Low-Level
+Parsing}).  However, @code{syntax-ppss} uses caches to speed up the
+computation.  Due to this optimization, the second value (previous
+complete subexpression) and sixth value (minimum parenthesis depth) in
+the returned parser state are not meaningful.
 
 This function has a side effect: it adds a buffer-local entry to
 @code{before-change-functions} (@pxref{Change Hooks}) for


-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-24 11:26                                               ` Alan Mackenzie
@ 2017-09-25 23:53                                                 ` Dmitry Gutov
  2017-10-01 16:36                                                   ` Alan Mackenzie
  2017-10-04 20:07                                                 ` Johan Bockgård
  1 sibling, 1 reply; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-25 23:53 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, 22983

Hi Alan,

On 9/24/17 2:26 PM, Alan Mackenzie wrote:

> Could I ask you to do it, please?  I'm somewhat exhausted from debating
> another basic Emacs change.

Pushed to emacs-26, thanks.

> Ah yes, the documentation.  I checked the doc in the elisp manual, and
> twice the phrase "from the beginning of the buffer" was used.  I've
> clarified that with "from the beginning of the visible portion of the
> buffer".  I've also amended "a cache" to "caches", though this doesn't
> seem too important.  What do you think:

LGTM. I think you can push it and finally close this bug. ;-)

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-25 23:53                                                 ` Dmitry Gutov
@ 2017-10-01 16:36                                                   ` Alan Mackenzie
  0 siblings, 0 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-10-01 16:36 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: John Wiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 26, 2017 at 02:53:52 +0300, Dmitry Gutov wrote:
> Hi Alan,

> On 9/24/17 2:26 PM, Alan Mackenzie wrote:

> > Could I ask you to do it, please?  I'm somewhat exhausted from debating
> > another basic Emacs change.

> Pushed to emacs-26, thanks.

> > Ah yes, the documentation.  I checked the doc in the elisp manual, and
> > twice the phrase "from the beginning of the buffer" was used.  I've
> > clarified that with "from the beginning of the visible portion of the
> > buffer".  I've also amended "a cache" to "caches", though this doesn't
> > seem too important.  What do you think:

> LGTM. I think you can push it and finally close this bug. ;-)

Thanks.  I've just done both of these things.  Phew!

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-24 11:26                                               ` Alan Mackenzie
  2017-09-25 23:53                                                 ` Dmitry Gutov
@ 2017-10-04 20:07                                                 ` Johan Bockgård
  1 sibling, 0 replies; 155+ messages in thread
From: Johan Bockgård @ 2017-10-04 20:07 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: John Wiegley, Philipp Stephani, Dmitry Gutov, 22983

Alan Mackenzie <acm@muc.de> writes:

> Ah yes, the documentation.  I checked the doc in the elisp manual, and
> twice the phrase "from the beginning of the buffer" was used.  I've
> clarified that with "from the beginning of the visible portion of the
> buffer".

The manual uses the term "accessible portion" for this. ("Visible"
usually refers to text in a window.)
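
To illustrate the distinction, the accessible portion is what
`point-min' and `point-max' report under narrowing, independent of any
window.  A small sketch, with `my-accessible-portion-demo' being a
made-up name and the buffer text arbitrary:

```elisp
(defun my-accessible-portion-demo ()
  "Return the accessible bounds while narrowed, then while widened."
  (with-temp-buffer
    (insert "0123456789")
    (narrow-to-region 4 8)
    (list (list (point-min) (point-max) (buffer-narrowed-p))
          ;; `save-restriction' restores the narrowing afterwards.
          (save-restriction
            (widen)
            (list (point-min) (point-max) (buffer-narrowed-p))))))
```

No window is involved at any point here, which is why "accessible"
rather than "visible" is the fitting term.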





* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
  2017-09-10 22:53                             ` Stefan Monnier
  2017-09-11  0:11                             ` Dmitry Gutov
@ 2017-09-17 11:12                             ` Philipp Stephani
  2017-09-19 20:50                               ` Alan Mackenzie
  2 siblings, 1 reply; 155+ messages in thread
From: Philipp Stephani @ 2017-09-17 11:12 UTC (permalink / raw)
  To: Alan Mackenzie, Dmitry Gutov; +Cc: John Wiegley, 22983

[-- Attachment #1: Type: text/plain, Size: 845 bytes --]

Alan Mackenzie <acm@muc.de> schrieb am So., 10. Sep. 2017 um 13:42 Uhr:

>
> > - Before this change is pushed to master, or shortly after, I'd like to
> > know that it actually fixed the problem Philipp experienced with
> > python-mode, so we can revert 4fbd330. If it was caused by e.g.
> > syntax-table changing, we've not improved much.
>
> Philipp, any chance of you trying out python mode with this patch but
> without 4fbd330?


Unfortunately the problem wasn't easily reproducible back then. The problem
would occur from time to time, but I never found a way to trigger it
reproducibly. Therefore the unit test I've added in the commit artificially
generates the symptom. The root cause is still unknown; while syntax-ppss
and narrowing might be a potential root cause (the fontification code uses
both), it might also be something else.

* bug#22983: [ Patch ] Re: bug#22983: syntax-ppss returns wrong result.
  2017-09-17 11:12                             ` Philipp Stephani
@ 2017-09-19 20:50                               ` Alan Mackenzie
  0 siblings, 0 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-19 20:50 UTC (permalink / raw)
  To: Philipp Stephani; +Cc: John Wiegley, Dmitry Gutov, 22983

Hello, Philipp.

On Sun, Sep 17, 2017 at 11:12:25 +0000, Philipp Stephani wrote:
> Alan Mackenzie <acm@muc.de> schrieb am So., 10. Sep. 2017 um 13:42 Uhr:


> > > - Before this change is pushed to master, or shortly after, I'd like to
> > > know that it actually fixed the problem Philipp experienced with
> > > python-mode, so we can revert 4fbd330. If it was caused by e.g.
> > > syntax-table changing, we've not improved much.

> > Philipp, any chance of you trying out python mode with this patch but
> > without 4fbd330?


> Unfortunately the problem wasn't easily reproducible back then. The problem
> would occur from time to time, but I never found a way to trigger it
> reproducibly. Therefore the unit test I've added in the commit artificially
> generates the symptom. The root cause is still unknown; while syntax-ppss
> and narrowing might be a potential root cause (the fontification code uses
> both), it might also be something else.

OK, I understand.  Cache effects are the very devil to debug.

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: syntax-ppss returns wrong result.
  2017-09-04 23:34                   ` Dmitry Gutov
  2017-09-05  6:57                     ` Andreas Röhler
  2017-09-05 12:28                     ` John Wiegley
@ 2017-09-07 17:56                     ` Alan Mackenzie
  2017-09-07 20:36                       ` Dmitry Gutov
  2 siblings, 1 reply; 155+ messages in thread
From: Alan Mackenzie @ 2017-09-07 17:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: jwiegley, Philipp Stephani, 22983

Hello, Dmitry.

On Tue, Sep 05, 2017 at 02:34:15 +0300, Dmitry Gutov wrote:
> On 9/2/17 8:40 PM, Alan Mackenzie wrote:
> > I'm not happy about this.  22983 is a serious design flaw, which has had
> > deleterious effects deep within Emacs.

> I'm sure we want to fix design flaws. As long as there is a solid plan 
> that does not swap one flaw for another.

Plan or not, it should be fixed.

> > One recorded example, resulting
> > in an infinite loop, is:
> > 
> > #########################################################################
> > From: Philipp Stephani <p.stephani2@gmail.com>
> > To: emacs-devel@gnu.org
> > Subject: [PATCH] Protect against an infloop in python-mode
> > Date: Tue, 28 Feb 2017 22:31:49 +0100
> > 
> > There appears to be an edge case caused by using `syntax-ppss' in a
> > narrowed buffer during JIT lock inside of Python triple-quote strings.
> > Unfortunately it is impossible to reproduce without manually
> > destroying the syntactic information in the Python buffer, but it has
> > been observed in practice.  In that case it can happen that the syntax
> > caches get sufficiently out of whack so that there appear to be
> > overlapping strings in the buffer.  As Python has no nested strings,
> > this situation is impossible and leads to an infloop in
> > `python-nav-end-of-statement'.  Protect against this by checking
> > whether the search for the end of the current string makes progress.
> > #########################################################################
> > 
> > In this case, Philipp had to apply a workaround.

> The problem manifested during jit-lock. Do we understand why the (widen) 
> call inside font-lock-default-fontify-region didn't help?

I don't, not in detail, no.  Philipp might know.  But if syntax-ppss was
used whilst the buffer was narrowed, it likely corrupted its cache, and
that corruption remained after widening the buffer.

-- 
Alan Mackenzie (Nuremberg, Germany).

* bug#22983: syntax-ppss returns wrong result.
  2017-09-07 17:56                     ` Alan Mackenzie
@ 2017-09-07 20:36                       ` Dmitry Gutov
  0 siblings, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2017-09-07 20:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: jwiegley, Philipp Stephani, 22983

On 9/7/17 8:56 PM, Alan Mackenzie wrote:

>> The problem manifested during jit-lock. Do we understand why the (widen)
>> call inside font-lock-default-fontify-region didn't help?
> 
> I don't, not in detail, no.  Philipp might know.  But if syntax-ppss was
> used whilst the buffer was narrowed, it likely corrupted its cache, and
> that corruption remained after widening the buffer.

Details matter.

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 12:27   ` Alan Mackenzie
  2016-03-19 18:47     ` Dmitry Gutov
@ 2016-03-19 23:16     ` Vitalie Spinu
  1 sibling, 0 replies; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-19 23:16 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 22983, Dmitry Gutov



>> On Sat, Mar 19 2016 12:27, Alan Mackenzie wrote:

> I think the only sensible functionality for syntax-ppss is to be
> equivalent to (parse-partial-sexp 1 pos).  Then everybody knows where
> they stand.  

This would not work for multi-modes. Until there is a feasible way to advise
parse-partial-sexp, there will be no way to ensure that the above contract is
satisfied in multi-modes.

  Vitalie

* bug#22983: syntax-ppss returns wrong result.
  2016-03-18  0:49 ` Dmitry Gutov
  2016-03-19 12:27   ` Alan Mackenzie
@ 2016-03-19 23:00   ` Vitalie Spinu
  2016-03-19 23:20     ` Dmitry Gutov
  1 sibling, 1 reply; 155+ messages in thread
From: Vitalie Spinu @ 2016-03-19 23:00 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Alan Mackenzie, 22983


Thanks for this. This is a step in the right direction IMHO.

One side note. `syntax-ppss` has a condition-case for args-out-of-range which
could easily be optimized out. You already know that you are calling
parse-partial-sexp with out-of-range arguments if narrowing is in place. The
current error check obfuscates the logic and makes debugging harder. Would it be
possible for you to have a look while you are at it? Not a big deal though.

Thanks,

  Vitalie

>> On Fri, Mar 18 2016 02:49, Dmitry Gutov wrote:

> On 03/11/2016 05:15 PM, Alan Mackenzie wrote:

> This patch should make ppss-0 and ppss-1 match:

> diff --git a/lisp/emacs-lisp/syntax.el b/lisp/emacs-lisp/syntax.el
> index e20a210..c1b9d84 100644
> --- a/lisp/emacs-lisp/syntax.el
> +++ b/lisp/emacs-lisp/syntax.el
> @@ -371,6 +371,11 @@ syntax-ppss-max-span
>  We try to make sure that cache entries are at least this far apart
>  from each other, to avoid keeping too much useless info.")

> +(defvar syntax-ppss-dont-widen nil
> +  "If non-nil, `syntax-ppss' will work on the non-widened buffer.
> +The code that uses this should create local bindings for
> +`syntax-ppss-cache' and `syntax-ppss-last' too.")
> +
>  (defvar syntax-begin-function nil
>    "Function to move back outside of any comment/string/paren.
>  This function should move the cursor back to some syntactically safe
> @@ -423,12 +428,21 @@ syntax-ppss
>  in the returned list (counting from 0) cannot be relied upon.
>  Point is at POS when this function returns.

> +If `syntax-ppss-dont-widen' is nil, the buffer is temporarily
> +widened.
> +
>  It is necessary to call `syntax-ppss-flush-cache' explicitly if
>  this function is called while `before-change-functions' is
>  temporarily let-bound, or if the buffer is modified without
>  running the hook."
>    ;; Default values.
>    (unless pos (setq pos (point)))
> +  (save-restriction
> +    (unless syntax-ppss-dont-widen
> +      (widen))
> +    (syntax-ppss--at pos)))
> +
> +(defun syntax-ppss--at (pos)
>    (syntax-propertize pos)
>    ;;
>    (let ((old-ppss (cdr syntax-ppss-last))

* bug#22983: syntax-ppss returns wrong result.
  2016-03-19 23:00   ` Vitalie Spinu
@ 2016-03-19 23:20     ` Dmitry Gutov
  0 siblings, 0 replies; 155+ messages in thread
From: Dmitry Gutov @ 2016-03-19 23:20 UTC (permalink / raw)
  To: Vitalie Spinu; +Cc: Alan Mackenzie, 22983

On 03/20/2016 01:00 AM, Vitalie Spinu wrote:
>
> Thanks for this. This is a step in the right direction IMHO.
>
> One side note. `syntax-ppss` has a condition-case for args-out-of-range which
> could easily be optimized out. You already know that you are calling
> parse-partial-sexp with out-of-range arguments if narrowing is in place.

That seems like it might make the code more complex: there are several 
parse-partial-sexp calls inside condition-case (for different situations 
with the existing cache), and we may have to add a comparison near each 
of them.

> The
> current error check obfuscates the logic and makes debugging harder. Would it be
> possible for you to have a look while you are at it? Not a big deal though.

I think you can still follow the execution flow with edebug, can't you?

If you're debugging a problem with args-out-of-range, another option is 
to replace `condition-case' with `condition-case-unless-debug' and 
re-evaluate the definition (but restore it when you're done, otherwise 
the args-out-of-range handler won't fire, I think).
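
For anyone unfamiliar with the two forms, here is a small sketch of the
difference, outside of syntax.el.  The function names and the sample
buffer are invented for illustration; only `condition-case',
`condition-case-unless-debug' and `parse-partial-sexp' are real.

```elisp
(defun my-parse-or-nil (from to)
  "Parse FROM..TO, returning nil if either is outside the restriction."
  (condition-case nil
      (parse-partial-sexp from to)
    (args-out-of-range nil)))

(defun my-parse-or-nil-debuggable (from to)
  "Like `my-parse-or-nil', but let the debugger see the error.
With `debug-on-error' non-nil the handler below is skipped."
  (condition-case-unless-debug nil
      (parse-partial-sexp from to)
    (args-out-of-range nil)))

(defun my-parse-demo ()
  "Call both variants with a FROM below `point-min', then a valid range."
  (with-temp-buffer
    (insert "(a b) (c d)")
    (narrow-to-region 7 (point-max))
    ;; With `debug-on-error' nil, both forms behave identically.
    (let ((debug-on-error nil))
      (list (my-parse-or-nil 1 8)
            (my-parse-or-nil-debuggable 1 8)
            (car (my-parse-or-nil 7 8))))))
```

With `debug-on-error' set to t, the second variant would drop into the
debugger at the out-of-range call instead of quietly returning nil,
which is what makes it handy while investigating.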

[parent not found: <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>]

* bug#22983: syntax-ppss returns wrong result.
       [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
@ 2017-10-01 16:31   ` Alan Mackenzie
  0 siblings, 0 replies; 155+ messages in thread
From: Alan Mackenzie @ 2017-10-01 16:31 UTC (permalink / raw)
  To: 22983-done

The bug has been fixed by patches to the emacs-26 branch.

-- 
Alan Mackenzie (Nuremberg, Germany).

end of thread, other threads:[~2017-10-04 20:07 UTC | newest]

Thread overview: 155+ messages
2016-03-11 15:15 bug#22983: syntax-ppss returns wrong result Alan Mackenzie
2016-03-11 20:31 ` Dmitry Gutov
2016-03-11 21:24   ` Alan Mackenzie
2016-03-11 21:35     ` Dmitry Gutov
2016-03-11 22:15       ` Alan Mackenzie
2016-03-11 22:38         ` Dmitry Gutov
2016-03-13 17:37           ` Stefan Monnier
2016-03-13 18:57             ` Alan Mackenzie
2016-03-14  0:47               ` Dmitry Gutov
2016-03-14  1:04                 ` Drew Adams
2016-04-03 22:55                   ` John Wiegley
2016-03-14  1:49               ` Stefan Monnier
2016-03-14 15:16           ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Alan Mackenzie
2016-03-14 17:34             ` Andreas Röhler
2016-03-14 20:06             ` Dmitry Gutov
2016-03-19 22:51               ` Vitalie Spinu
2016-03-20  2:19                 ` Dmitry Gutov
2016-03-20 12:15                   ` Vitalie Spinu
2016-03-20 15:58                     ` Dmitry Gutov
2016-03-21  1:05                       ` Vitalie Spinu
2016-03-21  3:11                         ` Stefan Monnier
2016-03-21  5:05                           ` Vitalie Spinu
2016-03-21  7:13                             ` Andreas Röhler
2016-03-21 12:26                             ` Stefan Monnier
2016-03-21 14:13                               ` Vitalie Spinu
2016-03-21 14:43                                 ` Stefan Monnier
2016-03-21 16:42                                   ` Vitalie Spinu
2016-03-21 18:31                                     ` Stefan Monnier
2016-03-21 19:16                                       ` Vitalie Spinu
2016-03-21 20:47                                         ` Stefan Monnier
2016-03-21 20:33                                     ` Alan Mackenzie
2016-03-21 20:49                                       ` Stefan Monnier
2016-03-21 21:03                                       ` Drew Adams
2016-03-21 21:12                                       ` Dmitry Gutov
2016-03-21 16:45                                   ` Vitalie Spinu
2016-03-21 22:55                                     ` Dmitry Gutov
2016-03-22 14:51                                     ` Stefan Monnier
2016-03-22 18:17                                       ` Vitalie Spinu
2016-03-23  1:18                                         ` Dmitry Gutov
2016-03-23 13:18                                         ` Stefan Monnier
2016-03-22 18:26                                       ` Vitalie Spinu
2016-03-23  2:07                                         ` Stefan Monnier
2016-03-23 10:56                                           ` Vitalie Spinu
2016-03-23 11:41                                             ` Stefan Monnier
2016-03-23 12:39                                               ` Vitalie Spinu
2016-03-23 13:23                                                 ` Stefan Monnier
2016-03-23 15:28                                                   ` Dmitry Gutov
2016-03-23 21:51                                                     ` Vitalie Spinu
2016-03-24  7:30                                               ` Andreas Röhler
2016-03-21 11:56                           ` Dmitry Gutov
2016-03-21  5:08                         ` [Patch] hard-widen-limits [was Re: Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.]] Vitalie Spinu
2016-03-21 12:39                           ` Stefan Monnier
2016-03-21 12:54                             ` Vitalie Spinu
2016-03-21 14:07                               ` Stefan Monnier
2016-03-21 14:14                                 ` Vitalie Spinu
2016-03-21 14:04                             ` Stefan Monnier
2016-03-21 14:33                               ` Vitalie Spinu
2016-03-21 14:54                                 ` Stefan Monnier
2016-03-21 17:16                                   ` Vitalie Spinu
2016-03-21 18:36                                     ` Stefan Monnier
2016-03-21 19:18                                       ` Vitalie Spinu
2016-03-22  3:17                                         ` Vitalie Spinu
2016-03-22  9:57                                           ` Vitalie Spinu
2016-03-22 10:05                                             ` Vitalie Spinu
2016-03-22 11:57                                               ` Stefan Monnier
2016-03-22 16:28                                                 ` Vitalie Spinu
2016-03-22 16:44                                                   ` Stefan Monnier
2016-03-22 19:36                                                     ` Vitalie Spinu
2016-03-23  2:22                                                       ` Stefan Monnier
2016-03-23 11:41                                                         ` Vitalie Spinu
2016-03-23 12:34                                                           ` Stefan Monnier
2016-03-23 12:41                                                             ` Vitalie Spinu
2016-03-29 21:43                                                               ` Vitalie Spinu
2016-04-22 14:34                                                                 ` Dmitry Gutov
2016-04-24  7:22                                                                   ` Vitalie Spinu
2016-04-24  7:28                                                                     ` Achim Gratz
2016-04-24 11:33                                                                       ` Vitalie Spinu
2016-04-24 13:20                                                                         ` Andreas Schwab
2016-04-24 16:11                                                                           ` Vitalie Spinu
2016-04-24 16:19                                                                             ` Andreas Schwab
2016-04-24 16:41                                                                               ` Vitalie Spinu
2016-04-24 16:48                                                                                 ` Andreas Schwab
2016-04-24 18:01                                                                                   ` Vitalie Spinu
2016-04-24 19:05                                                                                     ` Andreas Schwab
2016-04-28 13:29                                                 ` Vitalie Spinu
2016-04-30 14:06                                                   ` Stefan Monnier
2016-03-22 20:08                                               ` Richard Stallman
2016-03-22 22:45                                                 ` Vitalie Spinu
2016-03-21 11:47                         ` Syntax tables for multiple modes [was: bug#22983: syntax-ppss returns wrong result.] Dmitry Gutov
2016-03-21 12:40                           ` Vitalie Spinu
2016-03-21 13:07                             ` Dmitry Gutov
2016-03-21 14:20                               ` Vitalie Spinu
2016-03-21 14:29                                 ` Dmitry Gutov
2016-03-21 14:42                                   ` Vitalie Spinu
2016-03-21 14:56                                     ` Dmitry Gutov
2016-03-21 16:52                                       ` Vitalie Spinu
2016-03-21 21:30                                         ` Dmitry Gutov
2016-04-03 23:34                                           ` John Wiegley
2016-03-21 14:02                             ` Stefan Monnier
2016-03-21 14:31                               ` Vitalie Spinu
2016-03-21 15:06                                 ` Stefan Monnier
2016-03-21 17:15                                   ` Andreas Röhler
2016-03-13 17:32     ` bug#22983: syntax-ppss returns wrong result Stefan Monnier
2016-03-13 18:52 ` Andreas Röhler
2016-03-13 18:56   ` Dmitry Gutov
2016-03-18  0:49 ` Dmitry Gutov
2016-03-19 12:27   ` Alan Mackenzie
2016-03-19 18:47     ` Dmitry Gutov
2016-03-27  0:51       ` John Wiegley
2016-03-27  1:14         ` Dmitry Gutov
2016-04-03 22:58           ` John Wiegley
2016-04-03 23:15             ` Dmitry Gutov
2017-09-02 13:12               ` Eli Zaretskii
2017-09-02 17:40                 ` Alan Mackenzie
2017-09-02 17:53                   ` Eli Zaretskii
2017-09-03 20:44                   ` John Wiegley
2017-09-04 23:34                   ` Dmitry Gutov
2017-09-05  6:57                     ` Andreas Röhler
2017-09-05 12:28                     ` John Wiegley
2017-09-07 20:45                       ` Alan Mackenzie
2017-09-08 16:04                         ` Andreas Röhler
2017-09-10 18:26                           ` Alan Mackenzie
2017-09-09  9:44                         ` Dmitry Gutov
2017-09-09 10:20                           ` Alan Mackenzie
2017-09-09 12:18                             ` Dmitry Gutov
2017-09-10 11:42                               ` Alan Mackenzie
2017-09-10 11:36                           ` bug#22983: [ Patch ] " Alan Mackenzie
2017-09-10 22:53                             ` Stefan Monnier
2017-09-10 23:36                               ` Dmitry Gutov
2017-09-11 11:10                                 ` Stefan Monnier
2017-09-12  0:11                                   ` Dmitry Gutov
2017-09-12 22:12                                     ` Richard Stallman
2017-09-11 19:42                               ` Alan Mackenzie
2017-09-11 20:20                                 ` Stefan Monnier
2017-09-11  0:11                             ` Dmitry Gutov
2017-09-11 20:12                               ` Alan Mackenzie
2017-09-12  0:24                                 ` Dmitry Gutov
2017-09-17 10:29                                   ` Alan Mackenzie
2017-09-17 23:43                                     ` Dmitry Gutov
2017-09-18 19:08                                       ` Alan Mackenzie
2017-09-19  0:02                                         ` Dmitry Gutov
2017-09-19 20:47                                           ` Alan Mackenzie
2017-09-22 14:09                                             ` Dmitry Gutov
2017-09-24 11:26                                               ` Alan Mackenzie
2017-09-25 23:53                                                 ` Dmitry Gutov
2017-10-01 16:36                                                   ` Alan Mackenzie
2017-10-04 20:07                                                 ` Johan Bockgård
2017-09-17 11:12                             ` Philipp Stephani
2017-09-19 20:50                               ` Alan Mackenzie
2017-09-07 17:56                     ` Alan Mackenzie
2017-09-07 20:36                       ` Dmitry Gutov
2016-03-19 23:16     ` Vitalie Spinu
2016-03-19 23:00   ` Vitalie Spinu
2016-03-19 23:20     ` Dmitry Gutov
     [not found] ` <mailman.7307.1457709188.843.bug-gnu-emacs@gnu.org>
2017-10-01 16:31   ` Alan Mackenzie
