unofficial mirror of emacs-devel@gnu.org 
* Problems with syntax-ppss
@ 2008-04-04 17:26 Alan Mackenzie
  2008-04-04 17:29 ` Lennart Borgman (gmail)
  2008-04-04 20:24 ` Stefan Monnier
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Mackenzie @ 2008-04-04 17:26 UTC
  To: emacs-devel

Hi, Emacs!

I've just encountered a rather knotty problem in CC Mode for which
syntax-ppss ought to be a solution; I need to find out, RAPIDLY, whether
a particular buffer position is inside a string or comment.
Unfortunately (for me), ......

syntax-ppss does its parsing from (point-min), not from BOB.

So if the buffer is currently narrowed, this function will return a
meaningless value for the envisaged use.

But if I widen the buffer first, what happens to syntax-ppss's cache?
Is this just discarded, or are perhaps two caches maintained (one from
BOB, the other from the current (or most recent) (point-min))?

Advice, please!

Forgive me at this point for not reading the fine source code - it's
over 150 lines and looks rather forbidding.

It would be nice if the Elisp manual could be more explicit on such
points.  (Hey, tell me how it is, and I'll expand the manual!)

I think the doc-string for the function is inadequate - it fails to
state that parsing starts at (point-min) rather than BOB.

Thanks in advance!

-- 
Alan Mackenzie (Nuremberg, Germany).





* Re: Problems with syntax-ppss
  2008-04-04 17:26 Problems with syntax-ppss Alan Mackenzie
@ 2008-04-04 17:29 ` Lennart Borgman (gmail)
  2008-04-04 20:24 ` Stefan Monnier
  1 sibling, 0 replies; 11+ messages in thread
From: Lennart Borgman (gmail) @ 2008-04-04 17:29 UTC
  To: Alan Mackenzie; +Cc: emacs-devel

Alan Mackenzie wrote:
> Hi, Emacs!
> 
> I've just encountered a rather knotty problem in CC Mode for which
> syntax-ppss ought to be a solution; I need to find out, RAPIDLY, whether
> a particular buffer position is inside a string or comment.
> Unfortunately (for me), ......
> 
> syntax-ppss does its parsing from (point-min), not from BOB.
> 
> So if the buffer is currently narrowed, this function will return a
> meaningless value for the envisaged use.
> 
> But if I widen the buffer first, what happens to syntax-ppss's cache?
> Is this just discarded, or are perhaps two caches maintained (one from
> BOB, the other from the current (or most recent) (point-min))?

I believe nothing happens to the cache. The cache is just a list of
(position . state) pairs. There is no reason to change this when
widening the buffer.

The cache is flushed from before-change-functions. All entries after
the first changed position are removed.
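
The flushing rule described above can be illustrated with a toy model. This is only a sketch, not the code in syntax.el: `toy-ppss-flush' and the sample cache are hypothetical, and the cache is assumed to be a list of (POSITION . STATE) pairs in decreasing position order.

```elisp
;; Toy model only: not the real implementation in syntax.el.
(defun toy-ppss-flush (cache beg)
  "Return CACHE without the entries whose position is after BEG.
CACHE is assumed to be a list of (POSITION . STATE) pairs sorted by
decreasing POSITION.  Entries at or before BEG describe text that a
change starting at BEG does not touch, so they stay valid."
  (while (and cache (> (car (car cache)) beg))
    (setq cache (cdr cache)))
  cache)

;; A change starting at position 150 invalidates the entries at 300 and 200.
(toy-ppss-flush '((300 . state-c) (200 . state-b) (100 . state-a)) 150)
;; => ((100 . state-a))
```

Nothing in this model depends on the narrowing, which fits the point that widening by itself has no reason to touch the cache.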

> Advice, please!
> 
> Forgive me at this point for not reading the fine source code - it's
> over 150 lines and looks rather forbidding.
> 
> It would be nice if the Elisp manual could be more explicit on such
> points.  (Hey, tell me how it is, and I'll expand the manual!)
> 
> I think the doc-string for the function is inadequate - it fails to
> state that parsing starts at (point-min) rather than BOB.
> 
> Thanks in advance!
> 





* Re: Problems with syntax-ppss
  2008-04-04 17:26 Problems with syntax-ppss Alan Mackenzie
  2008-04-04 17:29 ` Lennart Borgman (gmail)
@ 2008-04-04 20:24 ` Stefan Monnier
  2008-04-04 21:14   ` martin rudalics
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Monnier @ 2008-04-04 20:24 UTC
  To: Alan Mackenzie; +Cc: emacs-devel

> But if I widen the buffer first, what happens to syntax-ppss's cache?
> Is this just discarded, or are perhaps two caches maintained (one from
> BOB, the other from the current (or most recent) (point-min))?

> Advice, please!

I strongly recommend to always call syntax-ppss in a widened buffer.


        Stefan


PS: Except of course for buffers like RMAIL or Info which are really
    a combination of sub-documents.
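
A minimal sketch of the recommendation above, for the original use case of testing whether a position is inside a string or comment. The name `my-in-string-or-comment-p' is hypothetical; only `syntax-ppss', `widen', `save-restriction' and `save-excursion' are real Emacs functions.

```elisp
(defun my-in-string-or-comment-p (pos)
  "Return non-nil if POS is inside a string or comment.
The buffer is widened temporarily so that `syntax-ppss' parses from
the real beginning of the buffer, not from a narrowed `point-min'."
  (save-excursion                       ; `syntax-ppss' may move point
    (save-restriction
      (widen)
      (let ((state (syntax-ppss pos)))
        ;; Element 3 is non-nil inside a string, element 4 inside a comment.
        (or (nth 3 state) (nth 4 state))))))
```

The caller's narrowing is restored on exit, so a mode can use this from code that runs inside a narrowed region.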





* Re: Problems with syntax-ppss
  2008-04-04 20:24 ` Stefan Monnier
@ 2008-04-04 21:14   ` martin rudalics
  2008-04-05 14:46     ` Alan Mackenzie
  0 siblings, 1 reply; 11+ messages in thread
From: martin rudalics @ 2008-04-04 21:14 UTC
  To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel

> I strongly recommend to always call syntax-ppss in a widened buffer.

... and with match-data saved.
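
Combining both pieces of advice might look like the sketch below. The wrapper name `my-syntax-ppss-keeping-match-data' is hypothetical; `save-match-data' is the standard macro for this.

```elisp
(defun my-syntax-ppss-keeping-match-data (pos)
  "Like `syntax-ppss' at POS, but widened and preserving the match data."
  (save-match-data
    (save-excursion
      (save-restriction
        (widen)
        (syntax-ppss pos)))))
```

This matters when the call sits between a search and a later use of `match-beginning' or `match-end', for example when checking whether a regexp match starts inside a string before acting on it.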






* Re: Problems with syntax-ppss
  2008-04-04 21:14   ` martin rudalics
@ 2008-04-05 14:46     ` Alan Mackenzie
  2008-04-05 18:37       ` Stefan Monnier
  0 siblings, 1 reply; 11+ messages in thread
From: Alan Mackenzie @ 2008-04-05 14:46 UTC
  To: martin rudalics; +Cc: Stefan Monnier, emacs-devel

Hi, S and M!

On Fri, Apr 04, 2008 at 11:14:55PM +0200, martin rudalics wrote:
> >I strongly recommend to always call syntax-ppss in a widened buffer.
> 
> ... and with match-data saved.

Er, your replies don't exactly radiate an aura of confidence about
syntax-ppss.  ;-(

I think you (Stefan) 're saying that the function isn't 100% defined for
a narrowed buffer.  Will calling s-ppss on a narrowed buffer corrupt the
cache at all, for example?

As a matter of interest, are there any benchmark figures for s-ppss?
Like, how many characters do you have to scan before s-ppss
(an interpreted lisp function) becomes faster than
(parse-partial-sexp 1 (point)) (a fast function written in C)?

-- 
Alan.





* Re: Problems with syntax-ppss
  2008-04-05 14:46     ` Alan Mackenzie
@ 2008-04-05 18:37       ` Stefan Monnier
  2008-04-06 14:07         ` Alan Mackenzie
  0 siblings, 1 reply; 11+ messages in thread
From: Stefan Monnier @ 2008-04-05 18:37 UTC
  To: Alan Mackenzie; +Cc: martin rudalics, emacs-devel

>> >I strongly recommend to always call syntax-ppss in a widened buffer.
>> ... and with match-data saved.

> Er, your replies don't exactly radiate an aura of confidence about
> syntax-ppss.  ;-(

> I think you (Stefan) 're saying that the function isn't 100% defined for
> a narrowed buffer.

Indeed.

> Will calling s-ppss on a narrowed buffer corrupt the
> cache at all, for example?

Yes it can.  Similarly the cache does not keep track of the syntax-table
so if you switch syntax-table between calls you may get unexpected results.

> As a matter of interest, are there any benchmark figures for s-ppss?

I did time it in various circumstances when writing it (so as to tune
its algorithm).

> Like, how many characters do you have to scan before s-ppss
> (an interpreted lisp function) becomes faster than
> (parse-partial-sexp 1 (point)) (a fast function written in C)?

I can't remember exactly, but syntax-ppss-max-span was set based on
these measurements, so it gives you an idea.  Note that there are two
different caches: there's syntax-ppss-cache which is affected by
syntax-ppss-max-span and is only really useful for large buffers, and
there's syntax-ppss-last which benefits from spatial locality.
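
Since no figures are given here, one way to get numbers for a particular buffer is to time both approaches directly. This is a rough sketch, not the measurement used to choose syntax-ppss-max-span: `my-time-calls' and `my-compare-ppss-timings' are hypothetical helpers, and wall-clock timings from `float-time' will vary with machine, buffer contents and cache state.

```elisp
(defun my-time-calls (fn repetitions)
  "Return the seconds taken to call FN REPETITIONS times."
  (let ((start (float-time)))
    (dotimes (_ repetitions)
      (funcall fn))
    (- (float-time) start)))

(defun my-compare-ppss-timings (pos &optional repetitions)
  "Time a full `parse-partial-sexp' scan against `syntax-ppss' up to POS.
Return a list (FULL-SCAN-SECONDS SYNTAX-PPSS-SECONDS) measured over
REPETITIONS calls each (default 100), with the buffer widened."
  (let ((reps (or repetitions 100)))
    (save-excursion
      (save-restriction
        (widen)
        (list
         ;; Rescans from the start of the buffer on every call.
         (my-time-calls (lambda () (parse-partial-sexp (point-min) pos)) reps)
         ;; May reuse cached parse states between calls.
         (my-time-calls (lambda () (syntax-ppss pos)) reps))))))
```

Calling `(my-compare-ppss-timings (point))' at various depths in a large file gives a feel for where the crossover lies. Repeated calls at the same position mostly exercise syntax-ppss-last, so varying the position between calls would be needed to see the effect of syntax-ppss-cache.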


        Stefan





* Re: Problems with syntax-ppss
  2008-04-05 18:37       ` Stefan Monnier
@ 2008-04-06 14:07         ` Alan Mackenzie
  2008-04-07 14:57           ` Stefan Monnier
  2008-04-07 14:59           ` Stefan Monnier
  0 siblings, 2 replies; 11+ messages in thread
From: Alan Mackenzie @ 2008-04-06 14:07 UTC
  To: Stefan Monnier; +Cc: martin rudalics, emacs-devel

Hi, Stefan,

On Sat, Apr 05, 2008 at 02:37:39PM -0400, Stefan Monnier wrote:
> >> >I strongly recommend to always call syntax-ppss in a widened
> >> >buffer.
> >> ... and with match-data saved.

> > Er, your replies don't exactly radiate an aura of confidence about
> > syntax-ppss.  ;-(

> > I think you (Stefan) 're saying that the function isn't 100% defined
> > for a narrowed buffer.

> Indeed.

That's not good.

> > Will calling s-ppss on a narrowed buffer corrupt the cache at all,
> > for example?

> Yes it can.  Similarly the cache does not keep track of the
> syntax-table so if you switch syntax-table between calls you may get
> unexpected results.

That's also not good.

Would it not be a good idea to (i) redefine syntax-ppss as calculating
the syntax from BOB (as opposed to (point-min)); (ii) have several
caches, each associated with a particular syntax table (how many modes
are going to use more than 2 or, perhaps, 3?); (iii) put a
`save-match-data' round the function?

[ Stuff about performance read and appreciated. ]

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).





* Re: Problems with syntax-ppss
  2008-04-06 14:07         ` Alan Mackenzie
@ 2008-04-07 14:57           ` Stefan Monnier
  2008-04-07 15:14             ` Lennart Borgman (gmail)
  2008-04-07 14:59           ` Stefan Monnier
  1 sibling, 1 reply; 11+ messages in thread
From: Stefan Monnier @ 2008-04-07 14:57 UTC
  To: Alan Mackenzie; +Cc: martin rudalics, emacs-devel

>> > Will calling s-ppss on a narrowed buffer corrupt the cache at all,
>> > for example?
>> Yes it can.  Similarly the cache does not keep track of the
>> syntax-table so if you switch syntax-table between calls you may get
>> unexpected results.

> That's also not good.

> Would it not be a good idea to (i) redefine syntax-ppss as calculating
> the syntax from BOB (as opposed to (point-min));

That's already what it does, when it can.  Maybe it could be improved as
follows: signal an error if it needs to recompute from (point-min) and
(point-min) is not the beginning of the buffer, unless the caller sets
a `syntax-ppss-narrowed-is-ok' variable (for use in modes like Info and
Rmail).  Maybe that variable should be shared with font-lock-dont-widen.
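
The proposal can be approximated from the caller's side with a wrapper. This is a simplified sketch of the idea, not an existing Emacs feature: `my-syntax-ppss-narrowed-is-ok' and `my-checked-syntax-ppss' are hypothetical names, and the check fires whenever the buffer start is hidden, not only when syntax-ppss would actually have to recompute from (point-min).

```elisp
(defvar my-syntax-ppss-narrowed-is-ok nil
  "Hypothetical flag: non-nil means narrowing is intentional here.
A mode such as Info or Rmail would bind or set this.")

(defun my-checked-syntax-ppss (&optional pos)
  "Call `syntax-ppss' at POS, refusing to run in an unexpected narrowing.
Simplification: signal an error whenever `point-min' is not the start
of the buffer, unless `my-syntax-ppss-narrowed-is-ok' is non-nil."
  (when (and (> (point-min) 1)
             (not my-syntax-ppss-narrowed-is-ok))
    (error "syntax-ppss called in a narrowed buffer"))
  (syntax-ppss pos))
```

A mode whose buffer really is a sequence of sub-documents would opt out with `(let ((my-syntax-ppss-narrowed-is-ok t)) ...)' around its calls.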

> (ii) have several caches, each associated with a particular syntax
> table (how many modes are going to use more than 2 or, perhaps, 3?);

This may incur a significant cost.  I much prefer to let the caller
decide when a change of syntax-table requires flushing the cache.
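
Leaving the decision to the caller could look like the following sketch. The function name `my-ppss-with-syntax-table' is hypothetical; `syntax-ppss-flush-cache' and `with-syntax-table' are real. It assumes the cached states were produced under the buffer's usual syntax table and that no other mechanism overrides the table that syntax-ppss parses with.

```elisp
(defun my-ppss-with-syntax-table (table pos)
  "Return the parse state at POS as seen under syntax TABLE.
The `syntax-ppss' caches are flushed before and after the call, since
they do not record which syntax table produced them."
  (save-excursion
    (save-restriction
      (widen)
      (with-syntax-table table
        ;; Drop states computed under the previous syntax table.
        (syntax-ppss-flush-cache (point-min))
        (prog1 (syntax-ppss pos)
          ;; Drop states computed under TABLE before returning.
          (syntax-ppss-flush-cache (point-min)))))))
```

Flushing twice throws away useful cached work, which is the cost being traded against keeping one cache per syntax table.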

Another approach might be to introduce a `syntax-ppss-syntax-table'.

> (iii) put a `save-match-data' round the function?

This is a definite "no": it is much better to let the caller do it in
the very few cases where it's needed, than to pay the cost needlessly
for all the other cases.


        Stefan





* Re: Problems with syntax-ppss
  2008-04-06 14:07         ` Alan Mackenzie
  2008-04-07 14:57           ` Stefan Monnier
@ 2008-04-07 14:59           ` Stefan Monnier
  1 sibling, 0 replies; 11+ messages in thread
From: Stefan Monnier @ 2008-04-07 14:59 UTC
  To: Alan Mackenzie; +Cc: martin rudalics, emacs-devel

> (iii) put a `save-match-data' round the function?

Actually, let me qualify the previous "no": around the whole function,
this is a definite "no", but I suspect that the match data can only be
affected via syntax-begin-function, so maybe we should document that
syntax-begin-function should preserve the match-data, and then we can
document that syntax-ppss does preserve the match-data.


        Stefan





* Re: Problems with syntax-ppss
  2008-04-07 14:57           ` Stefan Monnier
@ 2008-04-07 15:14             ` Lennart Borgman (gmail)
  2008-04-07 16:43               ` Stefan Monnier
  0 siblings, 1 reply; 11+ messages in thread
From: Lennart Borgman (gmail) @ 2008-04-07 15:14 UTC
  To: Stefan Monnier; +Cc: Alan Mackenzie, emacs-devel, martin rudalics

Stefan Monnier wrote:
>> Would it not be a good idea to (i) redefine syntax-ppss as calculating
>> the syntax from BOB (as opposed to (point-min));
> 
> That's already what it does, when it can.  Maybe it could be improved as
> follows: signal an error if it needs to recompute from (point-min) and
> (point-min) is not the beginning of the buffer, unless the caller sets
> a `syntax-ppss-narrowed-is-ok' variable (for use in modes like Info and
> Rmail).  Maybe that variable should be shared with font-lock-dont-widen.

Until I had to start reading the code I thought that 
font-lock-dont-widen already took care of this. I think there has to be 
a way to tell syntax-ppss not to widen otherwise I can't see what 
purpose font-lock-dont-widen could have (but I guess I am missing 
something there?).

Adding a new variable syntax-ppss-narrowed-is-ok might be good, but 
should not font-lock-dont-widen be honored regardless of this? Or should 
perhaps font-lock set the new variable as needed?


>> (ii) have several caches, each associated with a particular syntax
>> table (how many modes are going to use more than 2 or, perhaps, 3?);
> 
> This may incur a significant cost.  I much prefer to let the caller
> decide when a change of syntax-table requires flushing the cache.

Looks better to me since flushing is done based on the position in the 
buffer.

> Another approach might be to introduce a `syntax-ppss-syntax-table'.






* Re: Problems with syntax-ppss
  2008-04-07 15:14             ` Lennart Borgman (gmail)
@ 2008-04-07 16:43               ` Stefan Monnier
  0 siblings, 0 replies; 11+ messages in thread
From: Stefan Monnier @ 2008-04-07 16:43 UTC
  To: Lennart Borgman (gmail); +Cc: Alan Mackenzie, emacs-devel, martin rudalics

>>> Would it not be a good idea to (i) redefine syntax-ppss as calculating
>>> the syntax from BOB (as opposed to (point-min));
>> 
>> That's already what it does, when it can.  Maybe it could be improved as
>> follows: signal an error if it needs to recompute from (point-min) and
>> (point-min) is not the beginning of the buffer, unless the caller sets
>> a `syntax-ppss-narrowed-is-ok' variable (for use in modes like Info and
>> Rmail).  Maybe that variable should be shared with font-lock-dont-widen.

> Until I had to start reading the code I thought that font-lock-dont-widen
> already took care of this. I think there has to be a way to tell
> syntax-ppss not to widen otherwise I can't see what purpose
> font-lock-dont-widen could have (but I guess I am missing something there?).

All I meant is that the two variables would probably always be set
together: a mode that requires font-lock-dont-widen would also need
syntax-ppss-narrowed-is-ok, so it might be better to link them somehow.


        Stefan




