unofficial mirror of emacs-devel@gnu.org 
* generic buffer parsing cache data
@ 2007-06-30 21:38 Paul Pogonyshev
  2007-07-01 11:48 ` martin rudalics
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Pogonyshev @ 2007-06-30 21:38 UTC
  To: emacs-devel

[ I'm not sure whether this topic has been discussed before.  Still, I
  haven't seen this in any of the modes. ]

I propose adding a generic way of caching different kinds of parse
data in a buffer.  The purpose is to speed up parsing when it requires
a long trip back in the buffer, by reusing previous results, and to
make this generic enough that modes don't have to reinvent the wheel.

For instance, in `python-indentation-levels' I see:

	  (while (python-beginning-of-block)
	    ...)

This means each time `python-indentation-levels' is called, it will
temporarily travel back in the buffer until it reaches a block
starting in the first column (a toplevel block).  This is not exactly
fast, especially in major modes that need to parse something more
difficult than Python syntax.

I propose that each point position could have "cached parsing data".
This would be an alist indexed by cache data identifier.  For
instance, Python mode could add something like

	'python-mode . (def "foo" (8 4 0))

after each line starting a block.  This means block type (def, class,
maybe other types if interesting to the mode), block name if
applicable, and indentation levels.  Then `python-indentation-levels'
could be like this (in pseudocode):

	python-beginning-of-block
	forward-line
	python-ensure-cache-data
	return 3rd element of cache-data

where `python-ensure-cache-data' would be like

	if there is cache data, just return
	else:
	    travel to previous block
	    python-ensure-cache-data
	    build cache data for this block based on the previous
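
As a rough illustration of the recursion above, here is a minimal
sketch in Python rather than Emacs Lisp.  It is not the code of
python.el: the buffer is modelled as a plain list of lines, any line
ending in ":" is assumed to start a block, and the names `indent_of',
`ensure_cache_data' and the dict used as a cache are all hypothetical.

```python
# Hypothetical model: the buffer is a list of lines, and a block starts
# on any line whose text ends with ":" (an assumption for this sketch).
def indent_of(line):
    return len(line) - len(line.lstrip(" "))

def ensure_cache_data(lines, n, cache):
    """Return the indentation levels for the block starting at line N.

    Levels are listed innermost first, e.g. [8, 4, 0].  Results are
    memoized in CACHE, a dict mapping line number -> levels.
    """
    if n in cache:                      # cache hit: just return
        return cache[n]
    indent = indent_of(lines[n])
    # Travel to the previous block start, if there is one.
    prev = n - 1
    while prev >= 0 and not lines[prev].rstrip().endswith(":"):
        prev -= 1
    if prev < 0:
        levels = [indent]
    else:
        prev_levels = ensure_cache_data(lines, prev, cache)
        # Keep only the levels that still enclose us, then add our own.
        levels = [indent] + [l for l in prev_levels if l < indent]
    cache[n] = levels
    return levels

buffer_lines = [
    "class A:",
    "    class B:",
    "        def foo(self):",
    "            pass",
    "    def bar(self):",
    "        pass",
]
cache = {}
print(ensure_cache_data(buffer_lines, 2, cache))  # [8, 4, 0]
print(ensure_cache_data(buffer_lines, 4, cache))  # [4, 0]
```

The second call stops at the cached entry for `foo' and never walks
back to the top of the buffer, which is the saving the proposal is
after.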

We can either reuse the text property machinery or invent something else
for storing cache data.  What distinguishes cache data is that it should
be automatically invalidated (by Emacs core, without major mode
interaction) from point position X onwards when text at X changes.
Thus, modes can be confident that if there is some cached data at
some point Y, then it was computed with exactly the same text from
points 0 to Y.

With a normal flow of work, when you navigate to some function and
start typing or editing code in it, there will be cache hits for
everything above that function.  So only the function itself will
need reparsing, not anything above it.

Does this sound like a good idea?  Is it worth developing in
more detail?  Or even starting with sample code?

Paul


* Re: generic buffer parsing cache data
  2007-06-30 21:38 generic buffer parsing cache data Paul Pogonyshev
@ 2007-07-01 11:48 ` martin rudalics
  2007-07-01 12:16   ` Paul Pogonyshev
  0 siblings, 1 reply; 19+ messages in thread
From: martin rudalics @ 2007-07-01 11:48 UTC
  To: Paul Pogonyshev; +Cc: emacs-devel

> I propose that each point position could have "cached parsing data".
> This would be an alist indexed with cache data identifier.

Have you experimented with `syntax-ppss'?


* Re: generic buffer parsing cache data
  2007-07-01 11:48 ` martin rudalics
@ 2007-07-01 12:16   ` Paul Pogonyshev
  2007-07-01 12:38     ` martin rudalics
                       ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Paul Pogonyshev @ 2007-07-01 12:16 UTC
  To: emacs-devel; +Cc: martin rudalics

martin rudalics wrote:
> > I propose that each point position could have "cached parsing data".
> > This would be an alist indexed with cache data identifier.
> 
> Have you experimented with `syntax-ppss'?

I propose to add something generic.  For instance, Python mode needs to
know indentation level of blocks.  It seems that `syntax-ppss` doesn't
return it at all.  And adding everything that might ever be needed by
some XYZ mode seems counter-productive and complicates an already complex
function and its return value.

I just mean that major modes can have needs beyond those served by
`syntax-ppss`.  And as far as I can see, they can either parse half of
the buffer each time they need something, or invent some ad-hoc custom
code for caching such data.

As a side note, I was once told that `parse-partial-sexp` is not
limited to Lisp syntax.  How about this amendment to the documentation?

Paul


*** syntax.c	23 Jun 2007 12:18:14 +0300	1.206
--- syntax.c	01 Jul 2007 15:13:54 +0300	
***************
*** 3021,3028 ****
  }
  
  DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
!        doc: /* Parse Lisp syntax starting at FROM until TO; return status of parse at TO.
! Parsing stops at TO or when certain criteria are met;
   point is set to where parsing stops.
  If fifth arg OLDSTATE is omitted or nil,
   parsing assumes that FROM is the beginning of a function.
--- 3021,3029 ----
  }
  
  DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
!        doc: /* Parse syntax starting at FROM until TO; return status of parse at TO.
! Exact rules are determined by buffer's major mode.  Parsing stops at
! TO or when certain criteria are met;
   point is set to where parsing stops.
  If fifth arg OLDSTATE is omitted or nil,
   parsing assumes that FROM is the beginning of a function.


* Re: generic buffer parsing cache data
  2007-07-01 12:16   ` Paul Pogonyshev
@ 2007-07-01 12:38     ` martin rudalics
  2007-07-01 13:41       ` Paul Pogonyshev
  2007-07-01 12:45     ` Thien-Thi Nguyen
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: martin rudalics @ 2007-07-01 12:38 UTC
  To: Paul Pogonyshev; +Cc: emacs-devel

 > I propose to add something generic.  For instance, Python mode needs to
 > know indentation level of blocks.  It seems that `syntax-ppss` doesn't
 > return it at all.  And adding everything that might ever be needed by
 > some XYZ mode seems counter-productive and complicates an already complex
 > function and its return value.
 >
 > I just mean that major modes can have needs beyond that suited by
 > `syntax-ppss`.  And as far as I can see, they can either parse half of
 > the buffer each time they need something, or invent some ad-hoc custom
 > code for caching such data.

Like `c-state-cache'.  Well, `syntax-ppss' can only do whatever
`parse-partial-sexp' does.  Occasionally, that's not even sufficient for
the Elisp case (look how `lisp-font-lock-syntactic-face-function'
strives for detecting doc-strings).  I'd appreciate if you came up with
something more "generic" (if you just could give a clear description of
that term).


* Re: generic buffer parsing cache data
  2007-07-01 12:16   ` Paul Pogonyshev
  2007-07-01 12:38     ` martin rudalics
@ 2007-07-01 12:45     ` Thien-Thi Nguyen
  2007-07-01 13:10       ` Paul Pogonyshev
  2007-07-01 13:28       ` joakim
  2007-07-01 12:52     ` Stefan Monnier
                       ` (2 subsequent siblings)
  4 siblings, 2 replies; 19+ messages in thread
From: Thien-Thi Nguyen @ 2007-07-01 12:45 UTC
  To: Paul Pogonyshev; +Cc: emacs-devel

() Paul Pogonyshev <pogonyshev@gmx.net>
() Sun, 1 Jul 2007 15:16:31 +0300

   I propose to add something generic.

why not look at getting CEDET to work w/ python?

thi


* Re: generic buffer parsing cache data
  2007-07-01 12:16   ` Paul Pogonyshev
  2007-07-01 12:38     ` martin rudalics
  2007-07-01 12:45     ` Thien-Thi Nguyen
@ 2007-07-01 12:52     ` Stefan Monnier
  2007-07-01 13:49       ` Paul Pogonyshev
  2007-07-01 16:32     ` Richard Stallman
  2007-07-01 16:32     ` Richard Stallman
  4 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2007-07-01 12:52 UTC
  To: Paul Pogonyshev; +Cc: martin rudalics, emacs-devel

>> > I propose that each point position could have "cached parsing data".
>> > This would be an alist indexed with cache data identifier.
>> Have you experimented with `syntax-ppss'?
> I propose to add something generic.  For instance, Python mode needs to
> know indentation level of blocks.  It seems that `syntax-ppss` doesn't
> return it at all.  And adding everything that might ever be needed by
> some XYZ mode seems counter-productive and complicates an already complex
> function and its return value.

100% agreement.

This said, I think it might make sense to combine the two so that
syntax-ppss returns not just the parse-partial-sexp state but also some
mode-specific data.  At least it's been in my TODO list for a while now.

> As a side note, I was once told that `parse-partial-sexp` is not
> limited to Lisp syntax.  How about this amendment to the documentation?

Sounds good,


        Stefan


* Re: generic buffer parsing cache data
  2007-07-01 12:45     ` Thien-Thi Nguyen
@ 2007-07-01 13:10       ` Paul Pogonyshev
  2007-07-01 13:16         ` Lennart Borgman (gmail)
  2007-07-01 13:28       ` joakim
  1 sibling, 1 reply; 19+ messages in thread
From: Paul Pogonyshev @ 2007-07-01 13:10 UTC
  To: emacs-devel; +Cc: Thien-Thi Nguyen

Thien-Thi Nguyen wrote:
> () Paul Pogonyshev <pogonyshev@gmx.net>
> () Sun, 1 Jul 2007 15:16:31 +0300
> 
>    I propose to add something generic.
> 
> why not look at getting CEDET to work w/ python?

That might be an option.  However, I'm interested not only in
Python mode and that is a little perpendicular to my proposal.
Besides, I think CEDET will benefit from building a standard
way of caching right into Emacs core.  (OK, after some years,
because large packages tend to have large lag, since they need
to support older versions, etc.)

Paul


* Re: generic buffer parsing cache data
  2007-07-01 13:10       ` Paul Pogonyshev
@ 2007-07-01 13:16         ` Lennart Borgman (gmail)
  2007-07-01 13:43           ` Paul Pogonyshev
  0 siblings, 1 reply; 19+ messages in thread
From: Lennart Borgman (gmail) @ 2007-07-01 13:16 UTC
  To: Paul Pogonyshev; +Cc: Thien-Thi Nguyen, emacs-devel

Paul Pogonyshev wrote:
> Thien-Thi Nguyen wrote:
>> () Paul Pogonyshev <pogonyshev@gmx.net>
>> () Sun, 1 Jul 2007 15:16:31 +0300
>>
>>    I propose to add something generic.
>>
>> why not look at getting CEDET to work w/ python?
> 
> That might be an option.  However, I'm interested not only in
> Python mode and that is a little perpendicular to my proposal.
> Besides, I think CEDET will benefit from building a standard
> way of caching right into Emacs core.  (OK, after some years,
> because large packages tend to have large lag, since they need
> to support older versions, etc.)


Is there a caching mechanism in CEDET?


* Re: generic buffer parsing cache data
  2007-07-01 12:45     ` Thien-Thi Nguyen
  2007-07-01 13:10       ` Paul Pogonyshev
@ 2007-07-01 13:28       ` joakim
  1 sibling, 0 replies; 19+ messages in thread
From: joakim @ 2007-07-01 13:28 UTC
  To: emacs-devel

Thien-Thi Nguyen <ttn@gnuvola.org> writes:

> () Paul Pogonyshev <pogonyshev@gmx.net>
> () Sun, 1 Jul 2007 15:16:31 +0300
>
>    I propose to add something generic.
>
> why not look at getting CEDET to work w/ python?

CEDET does work with python. Do you have any particular feature in
mind, that you feel is missing?

>
> thi

-- 
Joakim Verona


* Re: generic buffer parsing cache data
  2007-07-01 12:38     ` martin rudalics
@ 2007-07-01 13:41       ` Paul Pogonyshev
  2007-07-01 15:20         ` martin rudalics
  2007-07-01 20:40         ` Richard Stallman
  0 siblings, 2 replies; 19+ messages in thread
From: Paul Pogonyshev @ 2007-07-01 13:41 UTC
  To: emacs-devel; +Cc: martin rudalics

martin rudalics wrote:
>  > I propose to add something generic.  For instance, Python mode needs to
>  > know indentation level of blocks.  It seems that `syntax-ppss` doesn't
>  > return it at all.  And adding everything that might ever be needed by
>  > some XYZ mode seems counter-productive and complicates an already complex
>  > function and its return value.
>  >
>  > I just mean that major modes can have needs beyond that suited by
>  > `syntax-ppss`.  And as far as I can see, they can either parse half of
>  > the buffer each time they need something, or invent some ad-hoc custom
>  > code for caching such data.
> 
> Like `c-state-cache'.  Well, `syntax-ppss' can only do whatever
> `parse-partial-sexp' does.  Occasionally, that's not even sufficient for
> the Elisp case (look how `lisp-font-lock-syntactic-face-function'
> strives for detecting doc-strings).  I'd appreciate if you came up with
> something more "generic" (if you just could give a clear description of
> that term).

For instance, something like this:

    Function: put-cache-data key data &optional pos

	Store cache DATA with given KEY in the current buffer, at position
	POS (if not specified, then where point currently is.)

    Function: get-cache-data key &optional pos

	Return cache data associated with given KEY in the current buffer
	at position POS (if not specified, then where point currently is.)
	If there is no data with that KEY stored at position, or if it has
	been invalidated, return nil.

Internally, Emacs core (at C level) automatically invalidates cache data
starting from X onwards when buffer text from X to Y (Y >= X) changes in
some way.  Whether cache data is actively removed from internal storage,
or just somehow marked invalid is implementation detail and irrelevant for
Elisp level.
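
To make the intended semantics concrete, here is a toy model in
Python (not Emacs Lisp, and not an existing Emacs API).  The class
`BufferCache' and its method `text_changed' are hypothetical; the
latter stands in for the hook that Emacs core would call on every
edit.  Treating data stored exactly at the change position as
invalid is an assumption of this sketch.

```python
class BufferCache:
    """Toy model of the proposed per-position cache (hypothetical)."""

    def __init__(self):
        self._data = {}   # position -> {key: data}

    def put_cache_data(self, key, data, pos):
        """Store DATA under KEY at position POS."""
        self._data.setdefault(pos, {})[key] = data

    def get_cache_data(self, key, pos):
        """Return the data stored under KEY at POS, or None."""
        return self._data.get(pos, {}).get(key)

    def text_changed(self, start):
        """Drop every entry at or after START (called on each edit)."""
        for pos in [p for p in self._data if p >= start]:
            del self._data[pos]

cache = BufferCache()
cache.put_cache_data("python-mode", ("def", "foo", [8, 4, 0]), pos=120)
cache.put_cache_data("python-mode", ("class", "Bar", [0]), pos=300)
cache.text_changed(200)                           # an edit at position 200
print(cache.get_cache_data("python-mode", 120))   # ('def', 'foo', [8, 4, 0])
print(cache.get_cache_data("python-mode", 300))   # None
```

Because everything at or after the change is dropped, the surviving
entries all lie before the edit and their positions never need
adjusting in this model.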

It is unclear whether changes in any text properties should lead to cache
invalidation.  Probably not, at least by default.

It also makes sense to define some `anchors'.  Those would be ways of
partitioning buffers into parts, where changes in one part don't cause
invalidation of cache data in other parts.  For instance, in Python mode
anchors would be set wherever a toplevel block is defined, since it stops
parsing on reaching a toplevel anyway.  However, this can be added later.
For instance, it is not clear when and how to remove anchors.  (I.e. in
Python mode if toplevel is indented to another level, it should stop
being an anchor.)

A major mode must store its cache data at some logical position, so
that it can find the data again later.  Maybe it also makes sense to add

    Function: find-cache-data key &optional pos

	Find and return cache data at POS (or point position) or _before
	it_.  Return nil if there is no (valid) cached data at pos or
	anywhere before with that KEY.
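
One possible reading of this lookup, sketched in Python under stated
assumptions: `entries' is a hypothetical dict mapping position to
{key: data}, and the function returns the position together with the
data, which goes slightly beyond the description above.

```python
import bisect

def find_cache_data(entries, key, pos):
    """Return (position, data) for the nearest entry at or before POS.

    ENTRIES is a hypothetical dict mapping position -> {key: data}.
    Return None when no entry with KEY exists at or before POS.
    """
    positions = sorted(p for p, d in entries.items() if key in d)
    i = bisect.bisect_right(positions, pos)   # entries with position <= POS
    if i == 0:
        return None
    found = positions[i - 1]
    return found, entries[found][key]

entries = {
    10: {"python-mode": "A"},
    50: {"other-mode": "x"},
    80: {"python-mode": "B"},
}
print(find_cache_data(entries, "python-mode", 60))  # (10, 'A')
print(find_cache_data(entries, "python-mode", 5))   # None
```

Sorting on every call keeps the sketch short; a real implementation
would presumably keep the positions ordered all the time.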

However, I don't see any obvious ways of using it.  As far as I can see,
modes should access cache data like this (in pseudocode):

	mode-get-cache-data:
	    data = (get-cache-data mode-key)
	    if data is nil:
		data = (mode-compute-cache-data)
		(put-cache-data mode-key data)
	    return data

	mode-compute-cache-data:
	    save-excursion:
		travel-to-higher-level-cache-point
		higher-level-data = (mode-get-cache-data)
	    data = (mode-compute-data-from-higher-level higher-level-data)
	    return data

Here `higher-level' is not the same as `previous'.  For instance, in
Python mode it makes sense to compute indentation from the block this one
is nested in, not just the previous block:

    class X:
        class Y:  # <-- higher-level block for the current block
            class Z:
                def bla():  # <-- previous block (with cached data)
                    pass
            def __init__(self):  # <-- current block
                pass
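
The same example can be run through a small Python model of the
"higher-level" lookup.  This is only a sketch: the helper names
`enclosing_block' and `mode_get_cache_data' are hypothetical, the
buffer is a list of lines, and a block is assumed to start on any
line ending in ":".

```python
def indent_of(line):
    return len(line) - len(line.lstrip(" "))

def enclosing_block(lines, n):
    """Index of the nearest preceding block header with a smaller
    indentation than line N, or None at toplevel."""
    indent = indent_of(lines[n])
    for i in range(n - 1, -1, -1):
        if lines[i].rstrip().endswith(":") and indent_of(lines[i]) < indent:
            return i
    return None

def mode_get_cache_data(lines, n, cache):
    """Return indentation levels for line N, computing them from the
    enclosing (higher-level) block and memoizing them in CACHE."""
    if n not in cache:
        parent = enclosing_block(lines, n)
        parent_levels = ([] if parent is None
                         else mode_get_cache_data(lines, parent, cache))
        cache[n] = [indent_of(lines[n])] + parent_levels
    return cache[n]

lines = [
    "class X:",
    "    class Y:",
    "        class Z:",
    "            def bla():",
    "                pass",
    "        def __init__(self):",
    "            pass",
]
cache = {}
print(mode_get_cache_data(lines, 5, cache))  # [8, 4, 0]
print(sorted(cache))                         # [0, 1, 5]
```

Only the lines for X, Y and __init__ end up in the cache: Z and bla
are skipped over because they do not enclose the current block.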

Paul


* Re: generic buffer parsing cache data
  2007-07-01 13:16         ` Lennart Borgman (gmail)
@ 2007-07-01 13:43           ` Paul Pogonyshev
  0 siblings, 0 replies; 19+ messages in thread
From: Paul Pogonyshev @ 2007-07-01 13:43 UTC
  To: emacs-devel; +Cc: Lennart Borgman (gmail), Thien-Thi Nguyen

Lennart Borgman (gmail) wrote:
> Paul Pogonyshev wrote:
> > Thien-Thi Nguyen wrote:
> >> () Paul Pogonyshev <pogonyshev@gmx.net>
> >> () Sun, 1 Jul 2007 15:16:31 +0300
> >>
> >>    I propose to add something generic.
> >>
> >> why not look at getting CEDET to work w/ python?
> > 
> > That might be an option.  However, I'm interested not only in
> > Python mode and that is a little perpendicular to my proposal.
> > Besides, I think CEDET will benefit from building a standard
> > way of caching right into Emacs core.  (OK, after some years,
> > because large packages tend to have large lag, since they need
> > to support older versions, etc.)
> 
> Is there a caching mechanism in CEDET?

I dunno, that is more of a question for Thien-Thi Nguyen.  However,
from my experience, most non-trivial modes would benefit from some
way of caching.

Paul


* Re: generic buffer parsing cache data
  2007-07-01 12:52     ` Stefan Monnier
@ 2007-07-01 13:49       ` Paul Pogonyshev
  2007-07-01 20:08         ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Pogonyshev @ 2007-07-01 13:49 UTC
  To: emacs-devel; +Cc: martin rudalics, Stefan Monnier

Stefan Monnier wrote:
> >> > I propose that each point position could have "cached parsing data".
> >> > This would be an alist indexed with cache data identifier.
> >> Have you experimented with `syntax-ppss'?
> > I propose to add something generic.  For instance, Python mode needs to
> > know indentation level of blocks.  It seems that `syntax-ppss` doesn't
> > return it at all.  And adding everything that might ever be needed by
> > some XYZ mode seems counter-productive and complicates an already complex
> > function and its return value.
> 
> 100% agreement.
> 
> This said, I think it might make sense to combine the two so that
> syntax-ppss returns not just the parse-partial-sexp state but also some
> mode-specific data.  At least it's been in my TODO list for a while now.

That would be nice, but I see one possible non-trivial problem here.  I'm
not sure that `parse-partial-sexp' stores cached data where it is most
logical for the current mode.  So it might lead to worse cache performance,
because `parse-partial-sexp' might choose to store data in more sparse
positions in a buffer than its mode would prefer.  (However, I might be
wrong; there may be a way to influence this.)

Also, relying on `parse-partial-sexp' makes caching impossible or very
difficult for minor modes, because they don't generally have a say in
determining buffer syntax.

Paul


* Re: generic buffer parsing cache data
  2007-07-01 13:41       ` Paul Pogonyshev
@ 2007-07-01 15:20         ` martin rudalics
  2007-07-01 20:40         ` Richard Stallman
  1 sibling, 0 replies; 19+ messages in thread
From: martin rudalics @ 2007-07-01 15:20 UTC
  To: Paul Pogonyshev; +Cc: emacs-devel

 > It is unclear whether changes in any text properties should lead to cache
 > invalidation.  Probably no, at least by default.

Maybe for changes in syntax-table text properties.

 > It also makes sense to define some `anchors'.  Those would be ways of
 > partitioning buffers into parts, where changes in one part don't cause
 > invalidation of cache data in other parts.  For instance, in Python mode
 > anchors would be set wherever a toplevel block is defined, since it stops
 > parsing on reaching a toplevel anyway.  However, this can be added later.
 > For instance, it is not clear when and how to remove anchors.  (I.e. in
 > Python mode if toplevel is indented to another level, it should stop
 > being an anchor.)

An open-paren-in-column-0 would be such an anchor in many modes.

 > It is required that major mode stores cache data at some logical position,
 > so it can later find them again.  Maybe it also makes sense to add
 >
 >     Function: find-cache-data key &optional pos
 >
 > 	Find and return cache data at POS (or point position) or _before
 > 	it_.  Return nil if there is no (valid) cached data at pos or
 > 	anywhere before with that KEY.

The classic method based on the entire information to get out of a
nested block.

 > However, I don't see any obvious ways of using it.  As I can see, modes
 > should access cache data like this (in pseudocode):
 >
 > 	mode-get-cache-data:
 > 	    data = (get-cache-data mode-key)
 > 	    if data is nil:
 > 		data = (mode-compute-cache-data)
 > 		(put-cache-data mode-key data)
 > 	    return data
 >
 > 	mode-compute-cache-data:
 > 	    save-excursion:
 > 		travel-to-higher-level-cache-point
 > 		higher-level-data = (mode-get-cache-data)
 > 	    data = (mode-compute-data-from-higher-level higher-level-data)
 > 	    return data
 >
 > Here `higher-level' is not the same as `previous'.  For instance, in
 > Python mode it makes sense to compute indentation from the block this one
 > is nested in, not just previous block:
 >
 >     class X:
 >         class Y:  # <-- higher-level block for the current block
 >             class Z:
 >                 def bla():  # <-- previous block (with cached data)
 >                     pass
 >             def __init__(self):  # <-- current block
 >                 pass

I don't see why it should be difficult to keep all the necessary
information at "bla".  After all, you'd avoid computing your data from
"class Y".  Moreover, in other contexts it's not clear whether a thing
like "class Y" does constitute a "higher-level" block than the current
one.

Anyway, I understand that you want some routines for cache management
where the major (or minor) mode would fill in the necessary information
itself.  I don't see any problems adding such functionality to
`syntax-ppss' as Stefan suggested.


* Re: generic buffer parsing cache data
  2007-07-01 12:16   ` Paul Pogonyshev
                       ` (2 preceding siblings ...)
  2007-07-01 12:52     ` Stefan Monnier
@ 2007-07-01 16:32     ` Richard Stallman
  2007-07-01 16:32     ` Richard Stallman
  4 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2007-07-01 16:32 UTC
  To: Paul Pogonyshev; +Cc: rudalics, emacs-devel

    !        doc: /* Parse syntax starting at FROM until TO; return status of parse at TO.
    ! Exact rules are determined by buffer's major mode.

If this refers to the fact that the syntax table depends
on the major mode, that is so basic that I think it doesn't
need to be repeated here.  However, if we did want to remind
people of it here, we should say that parsing depends on the syntax table.


* Re: generic buffer parsing cache data
  2007-07-01 12:16   ` Paul Pogonyshev
                       ` (3 preceding siblings ...)
  2007-07-01 16:32     ` Richard Stallman
@ 2007-07-01 16:32     ` Richard Stallman
  4 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2007-07-01 16:32 UTC
  To: Paul Pogonyshev; +Cc: rudalics, emacs-devel

    I propose to add something generic.  For instance, Python mode needs to
    know indentation level of blocks.  It seems that `syntax-ppss` doesn't
    return it at all.  And adding everything that might ever be needed by
    some XYZ mode seems counter-productive and complicates an already complex
    function and its return value.

I would be glad to see this mechanism made more powerful such that it can
serve the needs of a wider range of languages.  The challenge is to
design a mechanism that is clean and simple and yet can do the job.
Would you like to work on it?


* Re: generic buffer parsing cache data
  2007-07-01 13:49       ` Paul Pogonyshev
@ 2007-07-01 20:08         ` Stefan Monnier
  2007-07-01 20:44           ` Paul Pogonyshev
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Monnier @ 2007-07-01 20:08 UTC
  To: Paul Pogonyshev; +Cc: martin rudalics, emacs-devel

>> This said, I think it might make sense to combine the two so that
>> syntax-ppss returns not just the parse-partial-sexp state but also some
>> mode-specific data.  At least it's been in my TODO list for a while now.

> That would be nice, but I see one possible non-trivial problem here.  I'm
> not sure that `parse-partial-sexp' stores cached data where it is most
> logical for the current mode.  So it might lead to worse cache performance,
> because `parse-partial-sexp' might choose to store data in more sparse
> positions in a buffer than its mode would prefer.  (However, I might be
> wrong; there may be a way to influence this.)

You mean that mode data might benefit from being cached at more regular
intervals because it costs more to compute it?  You might be right, here,
I don't know.

> Also, relying on `parse-partial-sexp' makes caching impossible or very
> difficult for minor modes, because they don't generally have a say in
> determining buffer syntax.

I don't think parse-partial-sexp would be involved at all: syntax-ppss would
call the mode-specific code on one side and parse-partial-sexp on the
other.  So the issue of minor modes is orthogonal (tho relevant as well).

IIUC you're not proposing a standard generic cache, but a generic library to
manage one's own cache.  So the caching policy (at what kind of interval to
keep records of it, etc...) can be chosen freely, and also so that it can be
used indifferently by major modes, minor modes, etc...


        Stefan


* Re: generic buffer parsing cache data
  2007-07-01 13:41       ` Paul Pogonyshev
  2007-07-01 15:20         ` martin rudalics
@ 2007-07-01 20:40         ` Richard Stallman
  1 sibling, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2007-07-01 20:40 UTC
  To: Paul Pogonyshev; +Cc: rudalics, emacs-devel

    For instance, something like this:

	Function: put-cache-data key data &optional pos
	...

This is a clean interface for a totally new feature, but is it a
feasible feature?  Does anyone know of an efficient algorithm to
implement the proposed feature?  It seems like a hard problem to me.

Thus, I would suggest looking for a way to extend the existing
feature, one that is straightforward to implement.  The challenge here
is to find such an extension that would be useful in various modes to
speed up parsing.


* Re: generic buffer parsing cache data
  2007-07-01 20:08         ` Stefan Monnier
@ 2007-07-01 20:44           ` Paul Pogonyshev
  2007-07-01 21:23             ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Paul Pogonyshev @ 2007-07-01 20:44 UTC
  To: emacs-devel; +Cc: martin rudalics, Stefan Monnier

Stefan Monnier wrote:
> IIUC you're not proposing a standard generic cache, but a generic library to
> manage one's own cache.  So the caching policy (at what kind of interval to
> keep records of it, etc...) can be chosen freely, and also so that it can be
> used indifferently by major modes, minor modes, etc...

More or less.  What is `standard' about it is that invalidation should be
done in an efficient manner by Emacs core.

Is there any way to easily check where `parse-partial-sexp' caches its data?
Just to easily see if it is already suitable for what modes would need.

Paul


* Re: generic buffer parsing cache data
  2007-07-01 20:44           ` Paul Pogonyshev
@ 2007-07-01 21:23             ` Stefan Monnier
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2007-07-01 21:23 UTC
  To: Paul Pogonyshev; +Cc: martin rudalics, emacs-devel

> Is there any way to easily check where `parse-partial-sexp' caches its
> data?  Just to easily see if it is already suitable for what modes
> would need.

parse-partial-sexp doesn't cache anything.  `syntax-ppss' caches its data in
an association list.


        Stefan

