C Mode: acceleration in brace deserts.

unofficial mirror of emacs-devel@gnu.org 

* C Mode: acceleration in brace deserts.
@ 2009-12-03 16:21 Alan Mackenzie
  2009-12-03 16:26 ` Lennart Borgman
  2009-12-03 17:09 ` Stefan Monnier
  0 siblings, 2 replies; 19+ messages in thread
From: Alan Mackenzie @ 2009-12-03 16:21 UTC (permalink / raw)
  To: Richard Stallman, Chong Yidong, emacs-devel

Hi, Richard, Yidong and Emacs!

I have just committed my enhancements to CC Mode's `c-parse-state'.
This is a central and critical part of Emacs which maintains a cache of
the positions of certain braces, and fewer parens of other types.  As
the position of interest in a buffer moves, this cache is updated to
reflect the new position.

In the previous version, the cache handling was severely suboptimal in
"brace deserts", large areas of buffers entirely lacking in braces.  In
the extreme, the algorithm was scanning from BOB every time
c-parse-state was called.

An example of such an extreme can be downloaded from
<http://www.muc.de/~acm/AT91SAM9263_inc.h>.  To see the effect, visit
it, do M-> and try scrolling back a page at a time.  Please compare the
previous version with the new.

The difference is noticeable even in our own .../src/lisp.h.  It is now
noticeably less sluggish when scrolling backwards from EOB.

I have tested these changes thoroughly, principally by calculating
c-parse-state successively at an infinite number (1000) of random points
in the buffers lisp.h and xdisp.c, then comparing "old" results with
"new" results.  Many thanks to Stefan for `diff-refine-hunk' :-).

If anybody notices anything odd, please let me know.

-- 
Alan Mackenzie (Nuremberg, Germany).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: C Mode: acceleration in brace deserts.
  2009-12-03 16:21 C Mode: acceleration in brace deserts Alan Mackenzie
@ 2009-12-03 16:26 ` Lennart Borgman
  2009-12-03 16:59   ` Alan Mackenzie
  2009-12-03 17:09 ` Stefan Monnier
  1 sibling, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-03 16:26 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Chong Yidong, Richard Stallman, emacs-devel

Hi Alan,

Can you tell me how the cache is implemented so I can support it in MuMaMo?

Best wishes,
L


On Thu, Dec 3, 2009 at 5:21 PM, Alan Mackenzie <acm@muc.de> wrote:
> Hi, Richard, Yidong and Emacs!
>
> I have just committed my enhancements to CC Mode's `c-parse-state'.
> This is a central and critical part of Emacs which maintains a cache of
> the positions of certain braces, and fewer parens of other types.  As
> the position of interest in a buffer moves, this cache is updated to
> reflect the new position.
>
> In the previous version, the cache handling was severely suboptimal in
> "brace deserts", large areas of buffers entirely lacking in braces.  In
> the extreme, the algorithm was scanning from BOB every time
> c-parse-state was called.
>
> An example of such an extreme can be downloaded from
> <http://www.muc.de/~acm/AT91SAM9263_inc.h>.  To see the effect, visit
> it, do M-> and try scrolling back a page at a time.  Please compare the
> previous version with the new.
>
> The difference is noticeable even in our own .../src/lisp.h.  It is now
> noticeably less sluggish when scrolling backwards from EOB.
>
> I have tested these changes thoroughly, principally by calculating
> c-parse-state successively at an infinite number (1000) of random points
> in the buffers lisp.h and xdisp.c, then comparing "old" results with
> "new" results.  Many thanks to Stefan for `diff-refine-hunk' :-).
>
> If anybody notices anything odd, please let me know.
>
> --
> Alan Mackenzie (Nuremberg, Germany).
>
>
>





* Re: C Mode: acceleration in brace deserts.
  2009-12-03 16:26 ` Lennart Borgman
@ 2009-12-03 16:59   ` Alan Mackenzie
  2009-12-03 17:22     ` Lennart Borgman
  2009-12-04  5:31     ` Richard Stallman
  0 siblings, 2 replies; 19+ messages in thread
From: Alan Mackenzie @ 2009-12-03 16:59 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: emacs-devel

Hi, Lennart!

On Thu, Dec 03, 2009 at 05:26:57PM +0100, Lennart Borgman wrote:
> Hi Alan,

> Can you tell me how the cache [c-parse-state] is implemented so I can
> support it in MuMaMo?

Short brusque answer: no - it's ~1300 lines of code, much of it arcane.

Slightly longer answer: The cache's structure is a list of the positions
of each successively enclosing brace/bracket/paren from point going back
to the top level.  Additionally, if there is a brace pair preceding such
a b/b/p, it is recorded as a cons.  Also there is a "good position", a
position where the cache is known to be valid.

When c-parse-state is called from a new position, the cache is usually
updated rather than being calculated from scratch.  This involves
removing entries from the cache which are now later than point, removing
other entries which aren't relevant, since they've been "closed off",
etc. - then scanning forward pairs of parens at a time successively
entering deeper levels (see `c-append-to-state-cache').  Lots of dirty
tricks are used to speed up the process as much as possible.
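
To make the description above concrete, here is a minimal sketch of what such a cache value could look like and of the "remove entries later than point" step.  The positions and the `my-demo-' names are invented for illustration; this is not CC Mode's code, and it assumes the list is kept innermost entry first.

```emacs-lisp
;; Hypothetical cache value (positions invented), innermost entry first:
;;   120        an open paren that still encloses point
;;   (40 . 95)  a closed brace pair preceding it: (OPEN . AFTER-CLOSE)
;;   10         the open brace of the enclosing block
(defvar my-demo-state-cache '(120 (40 . 95) 10))

(defun my-demo-trim-cache (cache pos)
  "Return CACHE with the entries that start at or after POS removed.
A brace pair that opens before POS but closes after it is turned
back into a bare open-brace position."
  (let ((elt (car cache)))
    ;; Drop entries which are now later than POS.
    (while (and cache (>= (if (consp elt) (car elt) elt) pos))
      (setq cache (cdr cache)
            elt (car cache)))
    ;; A pair straddling POS is no longer "closed off" as seen from POS.
    (if (and (consp elt) (> (cdr elt) pos))
        (cons (car elt) (cdr cache))
      cache)))
```

With the example value, trimming to position 100 leaves ((40 . 95) 10), and trimming to position 50 leaves (40 10).  The real code then scans forward from the last valid position to add new entries, which this sketch leaves out.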

Sorry I can't be more help, here.  But if you've any specific questions,
let me know.

> Best wishes,
> L

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: C Mode: acceleration in brace deserts.
  2009-12-03 16:21 C Mode: acceleration in brace deserts Alan Mackenzie
  2009-12-03 16:26 ` Lennart Borgman
@ 2009-12-03 17:09 ` Stefan Monnier
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2009-12-03 17:09 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Chong Yidong, Richard Stallman, emacs-devel

> The difference is noticeable even in our own .../src/lisp.h.  It is now
> noticeably less sluggish when scrolling backwards from EOB.

Thank you.

> Many thanks to Stefan for `diff-refine-hunk' :-).

You've also thanked me many times for it ;-)


        Stefan





* Re: C Mode: acceleration in brace deserts.
  2009-12-03 16:59   ` Alan Mackenzie
@ 2009-12-03 17:22     ` Lennart Borgman
  2009-12-03 19:39       ` Alan Mackenzie
  2009-12-04  5:31     ` Richard Stallman
  1 sibling, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-03 17:22 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Hi Alan,

On Thu, Dec 3, 2009 at 5:59 PM, Alan Mackenzie <acm@muc.de> wrote:
> Hi, Lennart!
>
> On Thu, Dec 03, 2009 at 05:26:57PM +0100, Lennart Borgman wrote:
>> Hi Alan,
>
>> Can you tell me how the cache [c-parse-state] is implemented so I can
>> support it in MuMaMo?
>
> Short brusque answer: no - it's ~1300 lines of code, much of it arcane.


I need only a very short answer. I need to know how you store this
data so I do not destroy it when switching major mode in MuMaMo.

In the current situation I can only try to do that since the different
major modes may stamp on each other (I need some more Emacs support to
avoid that). But I can try. In some situations it is needed (for
example php may be split up in several parts (with html code between)
where the indentation in the next part should be aligned to that in
the prev part).

If you store it in a buffer local variable I am happy since all I have
to do then is to make that survive major mode switching. If you store
it in text properties I will be a bit more sad.


> Slightly longer answer: The cache's structure is a list of the positions
> of each successively enclosing brace/bracket/paren from point going back
> to the top level.  Additionally, if there is a brace pair preceding such
> a b/b/p, it is recorded as a cons.  Also there is a "good position", a
> position where the cache is known to be valid.
>
> When c-parse-state is called from a new position, the cache is usually
> updated rather than being calculated from scratch.  This involves
> removing entries from the cache which are now later than point, removing
> other entries which aren't relevant, since they've been "closed off",
> etc. - then scanning forward pairs of parens at a time successively
> entering deeper levels (see `c-append-to-state-cache').  Lots of dirty
> tricks are used to speed up the process as much as possible.
>
> Sorry I can't be more help, here.  But if you've any specific questions,
> let me know.
>
>> Best wishes,
>> L
>
> --
> Alan Mackenzie (Nuremberg, Germany).
>





* Re: C Mode: acceleration in brace deserts.
  2009-12-03 17:22     ` Lennart Borgman
@ 2009-12-03 19:39       ` Alan Mackenzie
  2009-12-03 19:57         ` Lennart Borgman
  0 siblings, 1 reply; 19+ messages in thread
From: Alan Mackenzie @ 2009-12-03 19:39 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: emacs-devel

Hi, Lennart!

On Thu, Dec 03, 2009 at 06:22:02PM +0100, Lennart Borgman wrote:
> Hi Alan,

> On Thu, Dec 3, 2009 at 5:59 PM, Alan Mackenzie <acm@muc.de> wrote:
> > Hi, Lennart!

> > On Thu, Dec 03, 2009 at 05:26:57PM +0100, Lennart Borgman wrote:
> >> Hi Alan,

> >> Can you tell me how the cache [c-parse-state] is implemented so I can
> >> support it in MuMaMo?

> > Short brusque answer: no - it's ~1300 lines of code, much of it arcane.


> I need only a very short answer. I need to know how you store this
> data so I do not destroy it when switching major mode in MuMaMo.

Sorry, I misunderstood you.  I thought you were wanting to copy the
algorithm.

The state of this cache is held entirely in the variables (all of them
buffer local) initialised thusly:

(defun c-state-cache-init ()
  (setq c-state-cache nil
        c-state-cache-good-pos 1
        c-state-nonlit-pos-cache nil
        c-state-nonlit-pos-cache-limit 1
        c-state-brace-pair-desert nil
        c-state-point-min 1
        c-state-point-min-lit-type nil
        c-state-point-min-lit-start nil
        c-state-min-scan-pos 1
        c-state-old-cpp-beg nil
        c-state-old-cpp-end nil)
  (c-state-mark-point-min-literal))

, where `c-state-mark-point-min-literal' merely sets 3 variables already
named.  I don't honestly see a way MuMaMo could disturb this state by
accident.

> In the current situation I can only try to do that since the different
> major modes may stamp on each other (I need some more Emacs support to
> avoid that). But I can try. In some situations it is needed (for
> example php may be split up in several parts (with html code between)
> where the indentation in the next part should be aligned to that in
> the prev part).

> If you store it in a buffer local variable I am happy since all I have
> to do then is to make that survive major mode switching. If you store
> it in text properties I will be a bit more sad.

Ah, yes.  I use text properties, too.  On each C macro, #if, etc., I set
a category property on the "#" and one on (usually) the newline that
terminates it.  I also put category properties on "<" and ">" to mark
them as C++ template or Java generic delimiters.  Does this cause you
problems at all?
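
As a rough illustration of the mechanism (not CC Mode's actual symbols or values): a `category' text property names a symbol, and that symbol's property list supplies default properties such as `syntax-table' for the character.  The names `my-demo-punct', `my-demo-mark-as-punct' and `my-demo-syntax-class-at' below are hypothetical.

```emacs-lisp
;; Hypothetical category symbol, not one of CC Mode's.  Its property list
;; supplies default text properties for characters carrying the category.
(put 'my-demo-punct 'syntax-table '(1))   ; (1) is the punctuation class

(defun my-demo-mark-as-punct (pos)
  "Give the character at POS the category `my-demo-punct'."
  (put-text-property pos (1+ pos) 'category 'my-demo-punct))

(defun my-demo-syntax-class-at (pos)
  "Return the syntax class code at POS, honouring text properties."
  ;; The syntax routines only consult `syntax-table' properties when
  ;; `parse-sexp-lookup-properties' is non-nil.
  (let ((parse-sexp-lookup-properties t))
    (syntax-class (syntax-after pos))))
```

Because the syntax lives on the category symbol rather than on each character, changing the symbol's `syntax-table' property changes every marked character at once.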

--
Alan Mackenzie (Nuremberg, Germany).





* Re: C Mode: acceleration in brace deserts.
  2009-12-03 19:39       ` Alan Mackenzie
@ 2009-12-03 19:57         ` Lennart Borgman
  2009-12-04 10:34           ` Lennart Borgman
  0 siblings, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-03 19:57 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Hi Alan,

On Thu, Dec 3, 2009 at 8:39 PM, Alan Mackenzie <acm@muc.de> wrote:
> Hi, Lennart!
>
> The state of this cache is held entirely in the variables (all of them
> buffer local) initialised thusly:
>
> (defun c-state-cache-init ()
>  (setq c-state-cache nil
>        c-state-cache-good-pos 1
>        c-state-nonlit-pos-cache nil
>        c-state-nonlit-pos-cache-limit 1
>        c-state-brace-pair-desert nil
>        c-state-point-min 1
>        c-state-point-min-lit-type nil
>        c-state-point-min-lit-start nil
>        c-state-min-scan-pos 1
>        c-state-old-cpp-beg nil
>        c-state-old-cpp-end nil)
>  (c-state-mark-point-min-literal))
>
> , where `c-state-mark-point-min-literal' merely sets 3 variables already
> named.  I don't honestly see a way MuMaMo could disturb this state by
> accident.


Thanks. Mumamo needs to know because it switches major mode and that
normally kills buffer local variables.


>> In the current situation I can only try to do that since the different
>> major modes may stamp on each other (I need some more Emacs support to
>> avoid that). But I can try. In some situations it is needed (for
>> example php may be split up in several parts (with html code between)
>> where the indentation in the next part should be aligned to that in
>> the prev part).
>
>> If you store it in a buffer local variable I am happy since all I have
>> to do then is to make that survive major mode switching. If you store
>> it in text properties I will be a bit more sad.
>
> Ah, yes.  I use text properties, too.  On each C macro, #if, etc., I set
> a category property on the "#" and one on (usually) the newline that
> terminates it.  I also put category properties on "<" and ">" to mark
> them as C++ template or Java generic delimiters.  Does this cause you
> problems at all?


If they are properly named so that no other mode uses them then I do
not think there is any problem.





* Re: C Mode: acceleration in brace deserts.
  2009-12-03 16:59   ` Alan Mackenzie
  2009-12-03 17:22     ` Lennart Borgman
@ 2009-12-04  5:31     ` Richard Stallman
  2009-12-04 11:37       ` Alan Mackenzie
  1 sibling, 1 reply; 19+ messages in thread
From: Richard Stallman @ 2009-12-04  5:31 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: lennart.borgman, emacs-devel

    Short brusque answer: no - it's ~1300 lines of code, much of it arcane.

Could it be rewritten into a modular facility?





* Re: C Mode: acceleration in brace deserts.
  2009-12-03 19:57         ` Lennart Borgman
@ 2009-12-04 10:34           ` Lennart Borgman
  2009-12-04 11:03             ` Lennart Borgman
  2009-12-04 13:54             ` Alan Mackenzie
  0 siblings, 2 replies; 19+ messages in thread
From: Lennart Borgman @ 2009-12-04 10:34 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Hi again Alan,

On Thu, Dec 3, 2009 at 8:57 PM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> Hi Alan,
>
> On Thu, Dec 3, 2009 at 8:39 PM, Alan Mackenzie <acm@muc.de> wrote:
>> Hi, Lennart!
>>
>> The state of this cache is held entirely in the variables (all of them
>> buffer local) initialised thusly:
>>
>> (defun c-state-cache-init ()
>>  (setq c-state-cache nil
>>        c-state-cache-good-pos 1
>>        c-state-nonlit-pos-cache nil
>>        c-state-nonlit-pos-cache-limit 1
>>        c-state-brace-pair-desert nil
>>        c-state-point-min 1
>>        c-state-point-min-lit-type nil
>>        c-state-point-min-lit-start nil
>>        c-state-min-scan-pos 1
>>        c-state-old-cpp-beg nil
>>        c-state-old-cpp-end nil)
>>  (c-state-mark-point-min-literal))
>>
>> , where `c-state-mark-point-min-literal' merely sets 3 variables already
>> named.  I don't honestly see a way MuMaMo could disturb this state by
>> accident.
>
>
> Thanks. Mumamo needs to know because it switches major mode and that
> normally kills buffer local variables.

I have a bit of trouble with this. I believe there is a simple solution,
but it requires some low level changes to Emacs. Your changes here
illustrate very well why such a change may be desirable to support
multiple major modes.

You are parsing the buffer from the beginning to find a state at a
point (this state is here "in literal or not"). This of course breaks
if there are chunks with different major modes in the buffer.

All parsers naturally behave like this (unless they are
specifically taught about multi major modes and their implementation).
js2, semantic, font-lock are other examples.

I think the easiest cure for this is to let them just see the parts of
the buffers that are in the programming language they know of at the
moment. (This is perhaps not enough but a good start that covers most
possibilities - and can be used for all parsers.)

This must however be implemented on a low level. All C primitives
reading the buffer must know about it. It is probably in most cases
straightforward to implement it. A level between the buffer reading
primitives and the buffer content is needed.  This hides the parts
that should not be seen.

It is probably possible to support your changes in MuMaMo now, but it
is not easy and the result would probably break easily. I have done
something similar for syntax-ppss. I wish we could have the low level
change instead.
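
For a single contiguous chunk, ordinary narrowing already gives a parser this kind of restricted view.  The sketch below only illustrates the effect; `my-demo-in-string-p' is a hypothetical helper, and it does not cover a chunk split into several parts, which is the case that would need the low level support.

```emacs-lisp
(defun my-demo-in-string-p (pos &optional beg end)
  "Return non-nil if POS is inside a string.
The parse starts at BEG (default: beginning of buffer) and sees
nothing outside BEG..END, so text in other chunks cannot affect it."
  (save-excursion
    (save-restriction
      (widen)
      (narrow-to-region (or beg (point-min)) (or end (point-max)))
      ;; Element 3 of the parse state is non-nil inside a string.
      (nth 3 (parse-partial-sexp (point-min) pos)))))
```

With an unbalanced double quote in the text before a chunk, a parse from the beginning of the buffer reports "in string" for every position in the chunk, whereas the same call with the chunk's boundaries as BEG and END does not.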


* Re: C Mode: acceleration in brace deserts.
  2009-12-04 10:34           ` Lennart Borgman
@ 2009-12-04 11:03             ` Lennart Borgman
  2009-12-04 11:56               ` Alan Mackenzie
  2009-12-04 13:54             ` Alan Mackenzie
  1 sibling, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-04 11:03 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On Fri, Dec 4, 2009 at 11:34 AM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> Hi again Alan,
>
> On Thu, Dec 3, 2009 at 8:57 PM, Lennart Borgman
> <lennart.borgman@gmail.com> wrote:
>> Hi Alan,
>>
>> On Thu, Dec 3, 2009 at 8:39 PM, Alan Mackenzie <acm@muc.de> wrote:
>>> Hi, Lennart!
>>>
>>> The state of this cache is held entirely in the variables (all of them
>>> buffer local) initialised thusly:
>>>
>>> (defun c-state-cache-init ()
>>>  (setq c-state-cache nil
>>>        c-state-cache-good-pos 1
>>>        c-state-nonlit-pos-cache nil
>>>        c-state-nonlit-pos-cache-limit 1
>>>        c-state-brace-pair-desert nil
>>>        c-state-point-min 1
>>>        c-state-point-min-lit-type nil
>>>        c-state-point-min-lit-start nil
>>>        c-state-min-scan-pos 1
>>>        c-state-old-cpp-beg nil
>>>        c-state-old-cpp-end nil)
>>>  (c-state-mark-point-min-literal))
>>>
>>> , where `c-state-mark-point-min-literal' merely sets 3 variables already
>>> named.  I don't honestly see a way MuMaMo could disturb this state by
>>> accident.
>>
>>
>> Thanks. Mumamo needs to know because it switches major mode and that
>> normally kills buffer local variables.
>
>
> I have a bit of trouble with this. I believe there is a simple solution,
> but it requires some low level changes to Emacs. Your changes here
> illustrate very well why such a change may be desirable to support
> multiple major modes.
>
> You are parsing the buffer from the beginning to find a state at a
> point (this state is here "in literal or not"). This of course breaks
> if there are chunks with different major modes in the buffer.
>
> All parsers naturally behave like this (unless they are
> specifically taught about multi major modes and their implementation).
> js2, semantic, font-lock are other examples.
>
> I think the easiest cure for this is to let them just see the parts of
> the buffers that are in the programming language they know of at the
> moment. (This is perhaps not enough but a good start that covers most
> possibilities - and can be used for all parsers.)
>
> This must however be implemented on a low level. All C primitives
> reading the buffer must know about it. It is probably in most cases
> straightforward to implement it. A level between the buffer reading
> primitives and the buffer content is needed.  This hides the parts
> that should not be seen.
>
>
> It is probably possible to support your changes in MuMaMo now, but it
> is not easy and the result would probably break easily. I have done
> something similar for syntax-ppss. I wish we could have the low level
> change instead.


Just a question I forgot: Why are you not using syntax-ppss here? Are
you not looking for the same thing (i.e. inside a comment or string)?

If you did it that way then it would already be supported by MuMaMo.
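
For reference, the "inside a comment or string" test with syntax-ppss can be sketched like this.  `my-demo-literal-start' is a hypothetical name, not a CC Mode or MuMaMo function.

```emacs-lisp
(defun my-demo-literal-start (pos)
  "Return the start of the string or comment containing POS, or nil."
  (save-excursion                       ; `syntax-ppss' moves point to POS
    (let ((state (syntax-ppss pos)))
      ;; Element 3 is non-nil in a string, element 4 in a comment;
      ;; element 8 is where that string or comment starts.
      (and (or (nth 3 state) (nth 4 state))
           (nth 8 state)))))
```

Finding the end of the literal needs a further forward scan, since the parse state only records where the literal begins.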





* Re: C Mode: acceleration in brace deserts.
  2009-12-04  5:31     ` Richard Stallman
@ 2009-12-04 11:37       ` Alan Mackenzie
  2009-12-05  6:50         ` Richard Stallman
  0 siblings, 1 reply; 19+ messages in thread
From: Alan Mackenzie @ 2009-12-04 11:37 UTC (permalink / raw)
  To: Richard Stallman; +Cc: lennart.borgman, emacs-devel

Hi, Richard,

On Fri, Dec 04, 2009 at 12:31:10AM -0500, Richard Stallman wrote:
>     Short brusque answer: no - it[enhancement of c-parse-state]'s ~1300
>     lines of code, much of it arcane.

> Could it be rewritten into a modular facility?

I'm not sure what you mean here.  c-parse-state deals essentially with
the braces and any "lesser" parens in "brace-block" languages, and it's
tightly optimised, as tight as I could make it.  It makes extensive use
of brace/paren syntax (parse-partial-sexp and scan-lists) and plays dirty
tricks with category text properties.

Are you thinking of somehow parametrising it so that it could deal with
"braces" which are, say, keyword tokens like Pascal's BEGIN and END?  I
think this would be possible, but not worthwhile - it would probably be
better to write this from scratch, possibly along similar lines.

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: C Mode: acceleration in brace deserts.
  2009-12-04 11:03             ` Lennart Borgman
@ 2009-12-04 11:56               ` Alan Mackenzie
  2009-12-04 12:03                 ` Lennart Borgman
  0 siblings, 1 reply; 19+ messages in thread
From: Alan Mackenzie @ 2009-12-04 11:56 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: emacs-devel

On Fri, Dec 04, 2009 at 12:03:27PM +0100, Lennart Borgman wrote:
> On Fri, Dec 4, 2009 at 11:34 AM, Lennart Borgman

> Just a question I forgot: Why are you not using syntax-ppss here
> [keeping track of comments/strings in c-parse-state's supporting
> functions]?  Are you not looking for the same thing (i.e. inside a comment
> or string)?

I think at the time I did it, it was just less work to write it from
scratch.  syntax-ppss isn't a well encapsulated system, and it almost
requires its users to read its source code to see exactly what it does.
Also, c-parse-state (in effect) changes the syntax table in use by
setting category properties.  At the time, I was considering actually
changing the syntax table (for reasons I'm not entirely clear about any
more).  Does syntax-ppss exist in XEmacs, and if so, since when?  Some of
syntax-ppss's supporting infrastructure only came into existence after
syntax-ppss itself.

> If you did it that way then it would already be supported by MuMaMo.

Ah.  That sounds like a good reason to change my new code to use
syntax-ppss.  :-(  But not before the first pretest release next week.

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: C Mode: acceleration in brace deserts.
  2009-12-04 11:56               ` Alan Mackenzie
@ 2009-12-04 12:03                 ` Lennart Borgman
  2009-12-04 12:18                   ` Lennart Borgman
  0 siblings, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-04 12:03 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On Fri, Dec 4, 2009 at 12:56 PM, Alan Mackenzie <acm@muc.de> wrote:
> On Fri, Dec 04, 2009 at 12:03:27PM +0100, Lennart Borgman wrote:
>> On Fri, Dec 4, 2009 at 11:34 AM, Lennart Borgman
>
>> Just a question I forgot: Why are you not using syntax-ppss here
>> [keeping track of comments/strings in c-parse-state's supporting
>> functions]?  Are you not looking for the same thing (i.e. inside a comment
>> or string)?
>
> I think at the time I did it, it was just less work to write it from
> scratch.  syntax-ppss isn't a well encapsulated system, and it almost
> requires its users to read its source code to see exactly what it does.
> Also, c-parse-state (in effect) changes the syntax table in use by
> setting category properties.  At the time, I was considering actually
> changing the syntax table (for reasons I'm not entirely clear about any
> more).  Does syntax-ppss exist in XEmacs, and if so, since when?  Some of
> syntax-ppss's supporting infrastructure only came into existence after
> syntax-ppss itself.
>
>> If you did it that way then it would already be supported by MuMaMo.
>
> Ah.  That sounds like a good reason to change my new code to use
> syntax-ppss.  :-(  But not before the first pretest release next week.

Thanks. ;-)

Would it be enough to change c-state-literal-at? Perhaps you could
send me a new version of that so I could test it in that case. I could
just defadvice c-state-literal-at.





* Re: C Mode: acceleration in brace deserts.
  2009-12-04 12:03                 ` Lennart Borgman
@ 2009-12-04 12:18                   ` Lennart Borgman
  0 siblings, 0 replies; 19+ messages in thread
From: Lennart Borgman @ 2009-12-04 12:18 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On Fri, Dec 4, 2009 at 1:03 PM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> On Fri, Dec 4, 2009 at 12:56 PM, Alan Mackenzie <acm@muc.de> wrote:
>> On Fri, Dec 04, 2009 at 12:03:27PM +0100, Lennart Borgman wrote:
>>> On Fri, Dec 4, 2009 at 11:34 AM, Lennart Borgman
>>
>>> Just a question I forgot: Why are you not using syntax-ppss here
>>> [keeping track of comments/strings in c-parse-state's supporting
>>> functions]?  Are you not looking for the same thing (i.e. inside a comment
>>> or string)?
>>
>> I think at the time I did it, it was just less work to write it from
>> scratch.  syntax-ppss isn't a well encapsulated system, and it almost
>> requires its users to read its source code to see exactly what it does.
>> Also, c-parse-state (in effect) changes the syntax table in use by
>> setting category properties.  At the time, I was considering actually
>> changing the syntax table (for reasons I'm not entirely clear about any
>> more).  Does syntax-ppss exist in XEmacs, and if so, since when?  Some of
>> syntax-ppss's supporting infrastructure only came into existence after
>> syntax-ppss itself.
>>
>>> If you did it that way then it would already be supported by MuMaMo.
>>
>> Ah.  That sounds like a good reason to change my new code to use
>> syntax-ppss.  :-(  But not before the first pretest release next week.
>
> Thanks. ;-)
>
> Would it be enough to change c-state-literal-at? Perhaps you could
> send me a new version of that so I could test it in that case. I could
> just defadvice c-state-literal-at.

Just to show how little I understand of this: Would this do the
c-state-literal-at job?

(defun mumamo-c-state-literal-at (here)
  ;; If position HERE is inside a literal, return (START . END), the
  ;; boundaries of the literal (which may be outside the accessible bit of the
  ;; buffer).  Otherwise, return nil.
  ;;
  ;; This function is almost the same as `c-literal-limits'.  It differs in
  ;; that it is a lower level function, and that it rigorously follows the
  ;; syntax from BOB, whereas `c-literal-limits' uses a "local" safe position.
  (let* ((is-here (point))
         (s (syntax-ppss here))
         (ret (when (or (nth 3 s) (nth 4 s))	; in a string or comment
                (parse-partial-sexp (point) (point-max)
                                    nil			 ; TARGETDEPTH
                                    nil			 ; STOPBEFORE
                                    s			 ; OLDSTATE
                                    'syntax-table)	 ; stop at end of literal
                (cons (nth 8 s) (point)))))
    (goto-char is-here)
    ret))





* Re: C Mode: acceleration in brace deserts.
  2009-12-04 10:34           ` Lennart Borgman
  2009-12-04 11:03             ` Lennart Borgman
@ 2009-12-04 13:54             ` Alan Mackenzie
  2009-12-04 19:03               ` Lennart Borgman
  1 sibling, 1 reply; 19+ messages in thread
From: Alan Mackenzie @ 2009-12-04 13:54 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: emacs-devel

Hi again, Lennart!

On Fri, Dec 04, 2009 at 11:34:11AM +0100, Lennart Borgman wrote:
> Hi again Alan,

> >> , where `c-state-mark-point-min-literal' merely sets 3 variables already
> >> named.  I don't honestly see a way MuMaMo could disturb this state by
> >> accident.

> > Thanks. Mumamo needs to know because it switches major mode and that
> > normally kills buffer local variables.

OK.  But the buffer local variables c-parse-state and
c-parse-state-good-pos have existed since shortly after 4004 BC anyway.
Does MuMaMo have a list of such variables it handles specially?

> I have a bit of trouble with this. I believe there is a simple solution,
> but it requires some low level changes to Emacs. Your changes here
> illustrate very well why such a change may be desirable to support
> multiple major modes.

Yes.  MuMaMo (or something like it) should go to the core of Emacs.  It
could enable a gross simplification of CC Mode if there were to be some
automatic switchover to "C preprocessor mode".  I think there should be
a special type of overlay ("extent" in XEmacs) which is a "syntactic
island" to the syntax routines, and possibly (say, by binding some
variable to non-nil) for movement commands too, i.e. these routines
would "simply" jump over the island.  Such an island could have its own
syntax table, keymaps and (even) major mode.  There would, of course, be
numerous details to sort out.  Given how common mixed modes are (C
preprocessor stuff, "literate programming", here documents in shell
scripts, all sorts of things embedded in HTML pages, ....), it's a
wonder we don't already have the tools in the Emacs core.

> You are parsing the buffer from the beginning to find a state at a
> point (this state is here "in literal or not"). This of course breaks
> if there are chunks with different major modes in the buffer.

Yes.  Sorry.

> All parsers naturally behave like this (unless they are
> specifically taught about multi major modes and their implementation).
> js2, semantic, font-lock are other examples.

Is there a set of guidelines anywhere as to how to make a mode
MuMaMoable?

> I think the easiest cure for this is to let them just see the parts of
> the buffers that are in the programming language they know of at the
> moment. (This is perhaps not enough but a good start that covers most
> possibilities - and can be used for all parsers.)

> This must however be implemented on a low level.

"Must be implemented" is in the passive.  ;-)  It's a pity programming
in C is such a dreary business.

> All C primitives reading the buffer must know about it. It is probably
> in most cases straightforward to implement it. A level between the
> buffer reading primitives and the buffer content is needed.  This
> hides the parts that should not be seen.

Agreed.

> It is probably possible to support your changes in MuMaMo now, but it
> is not easy and the result would probably break easily. I have done
> something similar for syntax-ppss. I wish we could have the low level
> change instead.

Me too!

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: C Mode: acceleration in brace deserts.
  2009-12-04 13:54             ` Alan Mackenzie
@ 2009-12-04 19:03               ` Lennart Borgman
  2009-12-05  2:11                 ` Lennart Borgman
  0 siblings, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-04 19:03 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On Fri, Dec 4, 2009 at 2:54 PM, Alan Mackenzie <acm@muc.de> wrote:
> Hi again, Lennart!
>
> On Fri, Dec 04, 2009 at 11:34:11AM +0100, Lennart Borgman wrote:
>> Hi again Alan,
>
>> >> , where `c-state-mark-point-min-literal' merely sets 3 variables already
>> >> named.  I don't honestly see a way MuMaMo could disturb this state by
>> >> accident.
>
>> > Thanks. Mumamo needs to know because it switches major mode and that
>> > normally kills buffer local variables.
>
> OK.  But the buffer local variables c-parse-state and
> c-parse-state-good-pos have existed since shortly after 4004 BC anyway.
> Does MuMaMo have a list of such variables it handles specially?


Several. Some of them are trivial changes that should go into Emacs
core, like putting (put var 'permanent-local t) on major mode independent
variables, for example those of editor emulation minor modes. (Those
are not in doubt, but may need a bit of work for turning off these
modes.)

Others are non-trivial, trying to fill semantic gaps in variable
locality. (Local in major mode, local in chunk etc.) I am unsure which
ones I have implemented at the moment. MuMaMo has been much of a
testing adventure.



>> I have a bit trouble with this. I believe there is a simple solution,
>> but it requires some low level changes to Emacs. Your changes here
>> illustrates very well why such a change may be desireable to support
>> mult major modes.
>
> Yes.  MuMaMo (or something like it) should go to the core of Emacs.  It
> could enable a gross simplification of CC Mode if there were to be some
> automatic switchover to "C preprocessor mode".  I think there should be
> a special type of overlay ("extent" in XEmacs) which is a "syntactic
> island" to the syntax routines, and possibly (say, by binding some
> variable to non-nil) for movement commands too, i.e. these routines
> would "simply" jump over the island.  Such an island could have its own
> syntax table, keymaps and (even) major mode.


Yes. This is what MuMaMo is about of course. (Even though it started
out with more special cases and much of it is not thought out yet.)


> There would, of course, be
> numerous details to sort out.  Given how common mixed modes are (C
> preprocessor stuff, "literate programming", here documents in shell
> scripts, all sorts of things embedded in HTML pages, ....), it's a
> wonder we don't already have the tools in the Emacs core.
>
>> You are parsing the buffer from the beginning to find a state at a
>> point (this state is here "in literal or not"). This of course breaks
>> if there are chunks with different major modes in the buffer.
>
> Yes.  Sorry.
>
>> All parsers naturally behave like this (unless they are not
>> specifically taught about multi major modes and its implementation).
>> js2, semantic, font-lock are other examples.
>
> Is there a set of guidelines anywhere as to how to make a mode
> MuMaMoable?


For parsers: No.

I thought about it, but came to the conclusion that the only
reasonable approach is changing the low level buffer reading
primitives.

Too much is otherwise difficult to implement in the parsers. At least
that is what I believe. I could implement structures in MuMaMo that
are lists of buffer chunks in special modes (I started to do that),
but they have to be accessed at low level parts of the code, and that
essentially means that code similar to what I suggested needs to be
partly written in every parser (or in the low level reading
primitives).

Without the low level changes, all the old parsers would also have to be rewritten.


>> I think the easiest cure for this is to let them just see the parts of
>> the buffers that are in the programming language they know of at the
>> moment. (This is perhaps not enough but a good start that covers most
>> possibilities - and can be used for all parsers.)
>
>> This must however be implemented on a low level.
>
> "Must be implemented" is in the passive.  ;-)  It's a pity programming
> in C is such a dreary business.
>
>> All C primitives reading the buffer must know about it. It is probably
>> in most cases straightforward to implement it. A level between the
>> buffer reading primitives and the buffer content is needed.  This
>> hides the parts that should not be seen.
>
> Agreed.
>
>> It is probably possible to support your changes in MuMaMo now, but it
>> is not easy while it will perhaps break easily instead. I have done
>> something similar to syntax-ppss. I wish we could have the low level
>> change instead.
>
> Me too!
>
> --
> Alan Mackenzie (Nuremberg, Germany).
>





* Re: C Mode: acceleration in brace deserts.
  2009-12-04 19:03               ` Lennart Borgman
@ 2009-12-05  2:11                 ` Lennart Borgman
  2009-12-05  4:49                   ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Lennart Borgman @ 2009-12-05  2:11 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Hi Alan,

>> Is there a set of guidelines anywhere as to how to make a mode
>> MuMaMoable?
>
>
> For parsers: No.


Since we do not have the low level routines now here is an idea to get
things working now:

- Maybe use syntax-ppss. It will cache the information and MuMaMo has
support for making this per chunk + major mode. This is needed instead
of directly calling (parse-partial-sexp 0 to).

- Instead of just calling (widen), call a function held in a variable.
The initial value for this should of course be the function symbol
'widen. This will allow MuMaMo to restrict (widen) to just the current
mumamo chunk, and that is probably what you want when parsing. That
function will then narrow the buffer instead of widening it most of
the time. You could then use it to get the boundaries for the mumamo
chunk at point too, if you want to.

But maybe that restricted (widen) is not sufficient? Maybe you have to
somehow jump back to the previous mumamo chunk with your major mode?
How would you want to do that in that case? (I suggest just another
variable holding a function that will give you the end point of the
previous mumamo chunk, or nil if there is none. This variable should
of course be nil by default.)





* Re: C Mode: acceleration in brace deserts.
  2009-12-05  2:11                 ` Lennart Borgman
@ 2009-12-05  4:49                   ` Stefan Monnier
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2009-12-05  4:49 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Alan Mackenzie, emacs-devel

> Since we do not have the low level routines now here is an idea to get
> things working now:
> - Maybe use syntax-ppss. It will cache the information and MuMaMo has
> support for making this per chunk + major mode. This is needed instead
> of directly calling (parse-partial-sexp 0 to).

Yes, I think the best way is to integrate it into syntax-ppss, but the
problem is that syntax-ppss is currently fairly limited.  A first
extension would be to add to syntax-ppss something similar to
font-lock-syntactic-keywords.  That should be fairly easy to do.

But it would still be restricted to parse-partial-sexp (possibly
tweaked by syntax-table text-properties), which is not sufficient in
general, so we should either provide a replacement for or extend
parse-partial-sexp to be more powerful and flexible.

        Stefan


* Re: C Mode: acceleration in brace deserts.
  2009-12-04 11:37       ` Alan Mackenzie
@ 2009-12-05  6:50         ` Richard Stallman
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Stallman @ 2009-12-05  6:50 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: lennart.borgman, emacs-devel

    I'm not sure what you mean here.  c-parse-state deals essentially with
    the braces and any "lesser" parens in "brace-block" languages, and it's
    tightly optimised, as tight as I could make it.  It makes extensive use
    of brace/paren syntax (parse-partial-sexp and scan-lists) and plays dirty
    tricks with category text properties.

    Are you thinking of somehow parametrising it so that it could deal with
    "braces" which are, say, keyword tokens like Pascal's BEGIN and END?

I hadn't got that specific in thinking about it yet.  It just seems to me
that there are other languages where indentation is based on some
sort of tokens and they might have "deserts" too.





Thread overview: 19+ messages
2009-12-03 16:21 C Mode: acceleration in brace deserts Alan Mackenzie
2009-12-03 16:26 ` Lennart Borgman
2009-12-03 16:59   ` Alan Mackenzie
2009-12-03 17:22     ` Lennart Borgman
2009-12-03 19:39       ` Alan Mackenzie
2009-12-03 19:57         ` Lennart Borgman
2009-12-04 10:34           ` Lennart Borgman
2009-12-04 11:03             ` Lennart Borgman
2009-12-04 11:56               ` Alan Mackenzie
2009-12-04 12:03                 ` Lennart Borgman
2009-12-04 12:18                   ` Lennart Borgman
2009-12-04 13:54             ` Alan Mackenzie
2009-12-04 19:03               ` Lennart Borgman
2009-12-05  2:11                 ` Lennart Borgman
2009-12-05  4:49                   ` Stefan Monnier
2009-12-04  5:31     ` Richard Stallman
2009-12-04 11:37       ` Alan Mackenzie
2009-12-05  6:50         ` Richard Stallman
2009-12-03 17:09 ` Stefan Monnier
