Idea for syntax-ppss. Is it new? Could it be any good?

unofficial mirror of emacs-devel@gnu.org 

* Idea for syntax-ppss.  Is it new?  Could it be any good?
@ 2008-07-26 21:44 Alan Mackenzie
  2008-07-27  0:36 ` Lennart Borgman (gmail)
  2008-07-27  1:34 ` Stefan Monnier
  0 siblings, 2 replies; 14+ messages in thread
From: Alan Mackenzie @ 2008-07-26 21:44 UTC (permalink / raw)
  To: emacs-devel

Hi, Emacs,

Looking at the doc string for syntax-ppss, it seems this could be _very_
useful in a certain body of code I'm responsible for.  That body of code
has a lot of heuristics that determine whether point is within a
string/comment, and some of these are not watertight (such as hard-coded
limits on comment sizes to achieve speed).  Basically, they're a PITA.
syntax-ppss, if it was guaranteed watertight, could remove the gnawing
uncertainty from much of the code.

However, the manual documents limitations on syntax-ppss's functionality.

How about reimplementing it thusly?:  The current syntax would be cached
for positions at every N bytes (where N would be, perhaps 1024, possibly
8192).  A call to syntax-ppss would simply call parse-partial-sexp from
the latest valid cached position, filling out the cache as it goes.  Any
buffer change would invalidate cached values for N > POS.
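
A minimal sketch of the scheme described above, in Emacs Lisp rather than C for brevity.  Every `my-ppss-' name is hypothetical, the interval is counted in characters rather than bytes, and this is not how the real syntax-ppss is written; it only illustrates "parse from the latest valid checkpoint, fill the cache on the way, drop checkpoints after a change".

```elisp
;; Sketch only.  Every `my-ppss-' name is hypothetical; this is not the
;; real syntax-ppss implementation.

(defvar my-ppss-interval 1024
  "Distance in characters between cached checkpoints.")

(defvar-local my-ppss-cache nil
  "List of (POS . STATE) checkpoints for this buffer, highest POS first.")

(defun my-ppss-invalidate (beg &rest _ignored)
  "Drop every checkpoint at or after BEG.
Suitable for `before-change-functions', which passes BEG and END."
  (while (and my-ppss-cache (>= (caar my-ppss-cache) beg))
    (pop my-ppss-cache)))

(defun my-ppss (pos)
  "Return the parse state at POS, parsing from the nearest checkpoint."
  (save-excursion
    (save-restriction
      (widen)
      ;; Skip checkpoints that lie beyond POS.
      (let ((tail my-ppss-cache))
        (while (and tail (> (caar tail) pos))
          (setq tail (cdr tail)))
        (let ((from (if tail (caar tail) (point-min)))
              (state (cdr (car tail))))
          ;; Advance one interval at a time, recording new checkpoints.
          (while (<= (+ from my-ppss-interval) pos)
            (let ((to (+ from my-ppss-interval)))
              (setq state (parse-partial-sexp from to nil nil state)
                    from to)
              (when (or (null my-ppss-cache) (> from (caar my-ppss-cache)))
                (push (cons from state) my-ppss-cache))))
          ;; Finish with a short parse from the last checkpoint to POS.
          (parse-partial-sexp from pos nil nil state))))))
```

A buffer would opt in with (add-hook 'before-change-functions #'my-ppss-invalidate nil t).  The sketch ignores syntax-table changes entirely, which is one of the complications mentioned below.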

I envisage coding this in C rather than Lisp.  There would be some
complications to do with making sure the syntax table isn't tampered
with, and so on.  This code would surely be fast and reliable.

Obviously I'm not proposing this for the pending release, but what do
people think about the idea?

-- 
Alan Mackenzie (Nuremberg, Germany)


* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-26 21:44 Idea for syntax-ppss. Is it new? Could it be any good? Alan Mackenzie
@ 2008-07-27  0:36 ` Lennart Borgman (gmail)
  2008-07-27  1:34 ` Stefan Monnier
  1 sibling, 0 replies; 14+ messages in thread
From: Lennart Borgman (gmail) @ 2008-07-27  0:36 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

Alan Mackenzie wrote:
> Hi, Emacs,
> 
> Looking at the doc string for syntax-ppss, it seems this could be _very_
> useful in a certain body of code I'm responsible for.  That body of code
> has a lot of heuristics that determine whether point is within a
> string/comment, and some of these are not watertight (such as hard-coded
> limits on comment sizes to achieve speed).  Basically, they're a PITA.
> syntax-ppss, if it was guaranteed watertight, could remove the gnawing
> uncertainty from much of the code.
> 
> However, the manual documents limitations on syntax-ppss's functionality.
> 
> How about reimplementing it thusly?:  The current syntax would be cached
> for positions at every N bytes (where N would be, perhaps 1024, possibly
> 8192).  A call to syntax-ppss would simply call parse-partial-sexp from
> the latest valid cached position, filling out the cache as it goes.  Any
> buffer change would invalidate cached values for N > POS.

There are some defadvices for syntax-ppss and cousins in mumamo.el that 
do something like this. This is needed in multi major mode buffers if 
the normal font lock routines are used there.

I do not exactly do what you propose and a bit more is needed (for 
mumamo). I think however that any solution should take the need for 
multi major modes into account.

> I envisage coding this in C rather than Lisp.  There would be some
> complications to do with making sure the syntax table isn't tampered
> with, and so on.  This code would surely be fast and reliable.
> 
> Obviously I'm not proposing this for the pending release, but what do
> people think about the idea?
> 


* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-26 21:44 Idea for syntax-ppss. Is it new? Could it be any good? Alan Mackenzie
  2008-07-27  0:36 ` Lennart Borgman (gmail)
@ 2008-07-27  1:34 ` Stefan Monnier
  2008-07-27 14:50   ` Alan Mackenzie
  1 sibling, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2008-07-27  1:34 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

> How about reimplementing it thusly?:  The current syntax would be cached
> for positions at every N bytes (where N would be, perhaps 1024, possibly
> 8192).  A call to syntax-ppss would simply call parse-partial-sexp from
> the latest valid cached position, filling out the cache as it goes.  Any
> buffer change would invalidate cached values for N > POS.

Isn't that what syntax-ppss does?


        Stefan





* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-27  1:34 ` Stefan Monnier
@ 2008-07-27 14:50   ` Alan Mackenzie
  2008-07-27 15:51     ` Stefan Monnier
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Alan Mackenzie @ 2008-07-27 14:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hi, Stefan!

On Sat, Jul 26, 2008 at 09:34:45PM -0400, Stefan Monnier wrote:
> > How about reimplementing it thusly?:  The current syntax would be cached
> > for positions at every N bytes (where N would be, perhaps 1024, possibly
> > 8192).  A call to syntax-ppss would simply call parse-partial-sexp from
> > the latest valid cached position, filling out the cache as it goes.  Any
> > buffer change would invalidate cached values for N > POS.

> Isn't that what syntax-ppss does?

It caches the state for several positions, but I don't think they're at
regular positions.  I don't understand the detailed workings of the
routine at the moment.  I suspect that the slowness of all the lisp
manipulation will outweigh the benefit of the caching, but I would
confirm or refute that with the profiler before doing anything serious.

parse-partial-sexp is blindingly fast.  To scan an entire 3 MB C buffer
on my elderly 1.2 GHz Athlon takes 0.27s.  That is why I suspect that
the lisping in syntax-ppss might need severe optimisation.  But again,
it's only a hunch.
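
For anyone wanting to repeat that measurement, a minimal sketch follows.  It assumes the standard `benchmark' library; the function name `my-time-full-parse' is hypothetical, and the figure will vary with hardware and buffer contents.

```elisp
(require 'benchmark)

(defun my-time-full-parse ()
  "Return the seconds taken by one full parse of the current buffer.
Hypothetical helper: times a single `parse-partial-sexp' pass from the
start to the end of the widened buffer."
  (save-excursion
    (save-restriction
      (widen)
      ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
      (car (benchmark-run 1
             (parse-partial-sexp (point-min) (point-max)))))))
```

Timing the same buffer with repeated (syntax-ppss POS) calls at scattered positions would give the comparison that matters here.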

What I think really needs doing is to make this function bulletproof: It
should work on narrowed buffers, it should give reliable elements 2 and
6, its cache should be cleared when functions like `modify-syntax-entry'
are called or parse-sexp-lookup-properties is changed, and the cache
should be bound to nil on `with-syntax-table'.  I actually think it
could be useful to maintain several parallel caches, each for a
different syntax-table (or an equivalence class of syntax tables).  And
so on.  Basically, I would like `(syntax-ppss)' to tell me with 100%
reliability, no ifs, no buts, whether I am at top-level, in a comment,
or in a string.
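
The kind of query meant here can be sketched as follows.  The function name `my-syntactic-context' is hypothetical; elements 0, 3 and 4 of the returned state are the paren depth, the string flag and the comment flag, as documented for parse-partial-sexp.

```elisp
(defun my-syntactic-context (&optional pos)
  "Classify POS (default point) using `syntax-ppss'.
Return `string', `comment', `top-level', or `nested'.
Hypothetical helper, only as reliable as `syntax-ppss' itself."
  (let ((ppss (save-excursion (syntax-ppss (or pos (point))))))
    (cond ((nth 3 ppss) 'string)            ; inside a string
          ((nth 4 ppss) 'comment)           ; inside a comment
          ((zerop (nth 0 ppss)) 'top-level) ; paren depth 0
          (t 'nested))))
```

Whether the answer deserves the "100%" depends on the narrowing and syntax-table issues listed above; the sketch does nothing about either.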

Also, Lennart is asking for it to work nicely with multiple major modes.
Surely this would be a Good Thing.  Files containing several major modes
are commonplace (awk or sed embedded within a shell script, html
embedded within php, ....).

At the moment, CC Mode applies a heuristic maximum size of strings and
comments, for performance reasons.  Checking for strings and comments is
done so frequently that the mode uses elaborate internal caches.  It
would be nice if this caching could move to the Emacs core.

Again, this isn't something which can be implemented in a weekend, but I
think it would be worthwhile for Emacs 24.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-27 14:50   ` Alan Mackenzie
@ 2008-07-27 15:51     ` Stefan Monnier
  2008-07-27 19:20       ` Alan Mackenzie
  2008-07-28  2:27     ` Richard M Stallman
  2008-08-31  8:37     ` Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss) Daniel Colascione
  2 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2008-07-27 15:51 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> Isn't that what syntax-ppss does?
> It caches the state for several positions, but I don't think they're at
> regular positions.

C-h v syntax-ppss-max-span

It's not exactly perfectly regular, but I don't think the
difference matters.

> I don't understand the detailed workings of the routine at the moment.
> I suspect that the slowness of all the lisp manipulation will outweigh
> the benefit of the caching, but I would confirm or refute that with
> the profiler before doing anything serious.

> parse-partial-sexp is blindingly fast.  To scan an entire 3 MB C buffer
> on my elderly 1.2 GHz Athlon takes 0.27s.  That is why I suspect that
> the lisping in syntax-ppss might need severe optimisation.  But again,
> it's only a hunch.

When I wrote syntax-ppss, my main goal was to never be significantly
slower than parse-partial-sexp.  Even if it's not as fast as it could be
if written in C (which is pretty much obviously true), that's not
a reason to recode it in C.

> What I think really needs doing is to make this function bulletproof: It
> should work on narrowed buffers,

That can be done, tho it needs extra info in order to know how to
interpret the fact that it's narrowed.

> it should give reliable elements 2 and 6,

If you really care about them, then I recommend you fix it in
parse-partial-sexp.

> its cache should be cleared when functions like `modify-syntax-entry'
> are called or parse-sexp-lookup-properties is changed, and the cache
> should be bound to nil on `with-syntax-table'.  I actually think it
> could be useful to maintain several parallel caches, each for a
> different syntax-table (or an equivalence class of syntax tables).  And
> so on.  Basically, I would like `(syntax-ppss)' to tell me with 100%
> reliability, no ifs, no buts, whether I am at top-level, in a comment,
> or in a string.

I think this will result in too many cache flushes and will make the
code too intrusive or too ad-hoc.  I'd rather have
a syntax-ppss-syntax-table (and force parse-sexp-lookup-properties to t)
if you want more reliable results.

> Also, Lennart is asking for it to work nicely with multiple major modes.
> Surely this would be a Good Thing.  Files containing several major modes
> are commonplace (awk or sed embedded within a shell script, html
> embedded within php, ....).

Yes, that's a desirable extension.

> At the moment, CC Mode applies a heuristic maximum size of strings and
> comments, for performance reasons.  Checking for strings and comments is
> done so frequently that the mode uses elaborate internal caches.  It
> would be nice if this caching could move to the Emacs core.

You can do it today.  Have you even tried to use syntax-ppss before
asking for it to be improved?

> Again, this isn't something which can be implemented in a weekend, but I
> think it would be worthwhile for Emacs 24.

Other than the multi-major-mode part, it all sounds like very
minor changes.


        Stefan





* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-27 15:51     ` Stefan Monnier
@ 2008-07-27 19:20       ` Alan Mackenzie
  2008-07-27 20:17         ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Alan Mackenzie @ 2008-07-27 19:20 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Hi, Stefan,

On Sun, Jul 27, 2008 at 11:51:36AM -0400, Stefan Monnier wrote:
> >> Isn't that what syntax-ppss does?
> > It caches the state for several positions, but I don't think they're at
> > regular positions.

> C-h v syntax-ppss-max-span

> It's not exactly perfectly regular, but I don't think the difference
> matters.

I was looking at my 3 MB buffer, and it seemed they were at wildly
irregular positions.  But I think I was seeing something which wasn't
really there.

> > parse-partial-sexp is blindingly fast.  To scan an entire 3 MB C
> > buffer on my elderly 1.2 GHz Athlon takes 0.27s.  That is why I
> > suspect that the lisping in syntax-ppss might need severe
> > optimisation.  But again, it's only a hunch.

> When I wrote syntax-ppss, my main goal was to never be significantly
> slower than parse-partial-sexp.  Even if it's not as fast as it could
> be if written in C (which is pretty much obviously true), that's not a
> reason to recode it in C.

Surely the goal should be to be significantly faster most of the time.
Presumably it achieves this in practice.  The reason to recode in C
would be to make it fast enough, or to couple it up to things which
couldn't be done in lisp.  But probably neither of these things is
needed.

> > What I think really needs doing is to make this function
> > bulletproof: It should work on narrowed buffers,

> That can be done, tho it needs extra info in order to know how to
> interpret the fact that it's narrowed.

Don't understand.  The function is defined as the equivalent of
(parse-partial-sexp (point-min) pos)?  You've said before that the
function is best not called when a buffer is narrowed.  Couldn't we just
redefine it as (parse-partial-sexp 1 pos)?  Then we could just put
(save-restriction (widen ..... )) into syntax-ppss.
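
A minimal sketch of that wrapper, written outside syntax-ppss rather than inside it.  The name `my-syntax-ppss-widened' is hypothetical, and the sketch assumes that parsing from position 1 is what the caller wants, which is not true for buffers whose narrowing is meaningful.

```elisp
(defun my-syntax-ppss-widened (&optional pos)
  "Return the parse state at POS (default point), ignoring narrowing.
Hypothetical helper: the result is relative to position 1, not to
`point-min'.  Point and the current restriction are left untouched."
  (save-excursion
    (save-restriction
      (widen)
      (syntax-ppss (or pos (point))))))
```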

[ .... ]

> I think this will result in too many cache flushes and will make the
> code too intrusive or too ad-hoc.  I'd rather have a
> syntax-ppss-syntax-table (and force parse-sexp-lookup-properties to t)
> if you want more reliable results.

Hey, syntax-ppss-syntax-table is a brilliant idea!  In its doc string one
could say "after setting this, clear the cache by calling ...
(syntax-ppss-flush-cache 1)".
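
As a sketch of the "change the syntax, then flush" discipline: the helper name `my-set-syntax-and-flush' is hypothetical, while `modify-syntax-entry' and `syntax-ppss-flush-cache' are the existing functions.

```elisp
(defun my-set-syntax-and-flush (char descriptor)
  "Give CHAR the syntax DESCRIPTOR, then discard cached parse states.
Hypothetical helper.  Note that `modify-syntax-entry' changes the
current syntax table, which may be shared with other buffers; only the
current buffer's syntax-ppss cache is flushed here."
  (modify-syntax-entry char descriptor)
  ;; Everything from the start of the buffer may now parse differently.
  (syntax-ppss-flush-cache (point-min)))
```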

> > Also, Lennart is asking for it to work nicely with multiple major modes.
> > Surely this would be a Good Thing.  Files containing several major modes
> > are commonplace (awk or sed embedded within a shell script, html
> > embedded within php, ....).

> Yes, that's a desirable extension.

> > At the moment, CC Mode applies a heuristic maximum size of strings and
> > comments, for performance reasons.  Checking for strings and comments is
> > done so frequently that the mode uses elaborate internal caches.  It
> > would be nice if this caching could move to the Emacs core.

> You can do it today.  Have you even tried to use syntax-ppss before
> asking for it to be improved?

No.  I think I've been scared by its vagueness (about narrowed regions)
more than anything.  It's defined in the elisp manual as equivalent to
(pps (point-min) pos) rather than (pps 1 pos).  It also uses
syntax-begin-function, which doesn't seem right, and wouldn't work well
in CC Mode; the only way s-b-f can give a cast-iron result is by calling
parse-partial-sexp, or syntax-ppss.  In fact, if syntax-ppss was
bulletproof, syntax-begin-function would be redundant.  I don't think
syntax-ppss is quite the right function for what I want to do.  I need
something like it, but not identical.

Maybe I should test syntax-ppss by coding it up inside a macro which
widens.  And I've been less than convinced it's actually faster.  In
fact, I'll go and do some speed tests and report back.

> > Again, this isn't something which can be implemented in a weekend,
> > but I think it would be worthwhile for Emacs 24.

> Other than the multi-major-mode part, it all sounds like very
> minor changes.

Maybe.

>         Stefan

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-27 19:20       ` Alan Mackenzie
@ 2008-07-27 20:17         ` Stefan Monnier
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2008-07-27 20:17 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

>> When I wrote syntax-ppss, my main goal was to never be significantly
>> slower than parse-partial-sexp.  Even if it's not as fast as it could
>> be if written in C (which is pretty much obviously true), that's not a
>> reason to recode it in C.
> Surely the goal should be to be significantly faster most of the time.

Of course, but that's the easy part.

> Presumably it achieves this in practice.  The reason to recode in C
> would be to make it fast enough, or to couple it up to things which
> couldn't be done in lisp.  But probably neither of these things is
> needed.

It may prove necessary to reimplement it in C at some point, but for now
I haven't found any need for it.

>> That can be done, tho it needs extra info in order to know how to
>> interpret the fact that it's narrowed.

> Don't understand.  The function is defined as the equivalent of
> (parse-partial-sexp (point-min) pos)?  You've said before that the
> function is best not called when a buffer is narrowed.  Couldn't we just
> redefine it as (parse-partial-sexp 1 pos)?  Then we could just put
> (save-restriction (widen ..... )) into syntax-ppss.

We need the equivalent of font-lock-dont-widen for cases such as Rmail or
Info.  Probably this could be considered as a special case of
multi-mode buffers.

>> I think this will result in too many cache flushes and will make the
>> code too intrusive or too ad-hoc.  I'd rather have a
>> syntax-ppss-syntax-table (and force parse-sexp-lookup-properties to t)
>> if you want more reliable results.

> Hey, syntax-ppss-syntax-table is a brilliant idea!  In its doc string one
> could say "after setting this, clear the cache by calling ...
> (syntax-ppss-flush-cache 1)".

Tho, it'd then make sense to let (syntax-ppss-flush-cache) do the job,
so as to avoid the need to hardcode this ugly 1.
[ Note: my own buffers all start at position 12345678. ]

> No.  I think I've been scared by its vagueness (about narrowed regions)
> more than anything.  It's defined in the elisp manual as equivalent to
> (pps (point-min) pos) rather than (pps 1 pos).  It also uses
> syntax-begin-function, which doesn't seem right, and wouldn't work well
> in CC Mode; the only way s-b-f can give a cast-iron result is by calling
> parse-partial-sexp, or syntax-ppss.  In fact, if syntax-ppss was
> bulletproof, syntax-begin-function would be redundant.

Note that its default value is nil.  I.e. it's only provided for those
cases where the major mode can provide the info more cheaply than by
running parse-partial-sexp.

> Maybe I should test syntax-ppss by coding it up inside a macro which
> widens.  And I've been less than convinced it's actually faster.  In
> fact, I'll go and do some speed tests and report back.

I'm sure you can come up with tests where it's consistently slower than
<writeyouralternativehere>.  The specific use I had in mind of
syntax-ppss is when you call it very freely, even several times within
the same command in nearby locations.  And as mentioned, it mostly tries
to avoid pathological behaviors rather than trying to get the very best
possible speed in "the usual case".

Also the question is not "is it faster", but "is it fast enough".

        Stefan


* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-27 14:50   ` Alan Mackenzie
  2008-07-27 15:51     ` Stefan Monnier
@ 2008-07-28  2:27     ` Richard M Stallman
  2008-07-28  4:08       ` Stefan Monnier
  2008-08-31  8:37     ` Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss) Daniel Colascione
  2 siblings, 1 reply; 14+ messages in thread
From: Richard M Stallman @ 2008-07-28  2:27 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: monnier, emacs-devel

    It caches the state for several positions, but I don't think they're at
    regular positions.  I don't understand the detailed workings of the
    routine at the moment.

If we have a cache, it might be more efficient to cache the locations
where beginnings of defuns were found.





* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-28  2:27     ` Richard M Stallman
@ 2008-07-28  4:08       ` Stefan Monnier
  2008-07-28 21:47         ` Richard M Stallman
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2008-07-28  4:08 UTC (permalink / raw)
  To: rms; +Cc: Alan Mackenzie, emacs-devel

>     It caches the state for several positions, but I don't think they're at
>     regular positions.  I don't understand the detailed workings of the
>     routine at the moment.

> If we have a cache, it might be more efficient to cache the locations
> where beginnings of defuns were found.

Actually, it's not, because a 10MB file might be composed of
a single defun.


        Stefan





* Re: Idea for syntax-ppss.  Is it new?  Could it be any good?
  2008-07-28  4:08       ` Stefan Monnier
@ 2008-07-28 21:47         ` Richard M Stallman
  0 siblings, 0 replies; 14+ messages in thread
From: Richard M Stallman @ 2008-07-28 21:47 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: acm, emacs-devel

    > If we have a cache, it might be more efficient to cache the locations
    > where beginnings of defuns were found.

    Actually, it's not, because a 10MB file might be composed of
    a single defun.

That case is so rare that it isn't very important.

The benefit of caching beginnings of defuns is that you don't have to
invalidate the cache when there's a change in the buffer.
Subsequent defuns are still valid.





* Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss)
  2008-07-27 14:50   ` Alan Mackenzie
  2008-07-27 15:51     ` Stefan Monnier
  2008-07-28  2:27     ` Richard M Stallman
@ 2008-08-31  8:37     ` Daniel Colascione
  2008-08-31 15:02       ` Better parse-partial-sexp; multiple major modes Lennart Borgman (gmail)
  2008-09-01  6:10       ` Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss) Richard M. Stallman
  2 siblings, 2 replies; 14+ messages in thread
From: Daniel Colascione @ 2008-08-31  8:37 UTC (permalink / raw)
  To: emacs-devel

On Jul 27, 2008, at 10:50 AM, Alan Mackenzie wrote:
> What I think really needs doing is to make this function bulletproof: It
> should work on narrowed buffers, it should give reliable elements 2 and
> 6, its cache should be cleared when functions like `modify-syntax-entry'
> are called or parse-sexp-lookup-properties is changed, and the cache
> should be bound to nil on `with-syntax-table'.  I actually think it
> could be useful to maintain several parallel caches, each for a
> different syntax-table (or an equivalence class of syntax tables).  And
> so on.  Basically, I would like `(syntax-ppss)' to tell me with 100%
> reliability, no ifs, no buts, whether I am at top-level, in a comment,
> or in a string.

Surely, such a creature would have to live on the C side of things, if  
only for practical reasons. (With the proliferation of with-this and  
inhibit-that options available to Lisp, I don't see how one can easily  
and robustly catch all buffer modification. Not to mention that no  
matter which of before-change-functions and after-change-functions you  
used, you could still race against other functions using the same  
facility.)

If this perfectly caching parse-partial-sexp lives in C anyway, why  
not just call it parse-partial-sexp? Optimize parse-partial-sexp for  
the case of start being 1 or (point-min). syntax-ppss becomes a simple  
wrapper. Not only would it be possible to robustly catch all buffer  
and context modification, but by optimizing the existing function, all  
existing users would automatically win. I'd offer to write a patch,  
but I don't know the core well enough to know how to "easily and  
robustly catch all buffer modification".

>
> Also, Lennart is asking for it to work nicely with multiple major modes.
> Surely this would be a Good Thing.  Files containing several major modes
> are commonplace (awk or sed embedded within a shell script, html
> embedded within php, ....).

After several attempts at using and understanding multiple major mode  
facilities, I'm convinced the only way forward is core support for the  
concept. Lennart's done a fine job with what's in Emacs currently. But  
anything involving multiple major modes today is a quivering mound of  
hacks. All the work Lennart's had to do to get modes playing nice with  
each other is a testament to that.

Maybe a core solution could be something like this: in a given buffer,
each character has a chunk-name character property. You'd buffer-locally
map chunk names to major modes. For each chunk name, create a buffer
containing just the text assigned to that chunk. Make the major-mode the
major mode for the chunk buffer, and let that major-mode handle
fontification, keybindings, and so on. In the main buffer, assemble the
various bits from the chunk-buffers and allow the user to navigate the
combined buffer normally.
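
The lookup half of this idea can already be sketched with ordinary text properties.  Everything named here is hypothetical: the `chunk-name' property, the `my-chunk-mode-alist' variable and the `my-chunk-mode-at' function.  The sketch covers only the mapping from a character to a major mode, not the chunk buffers or their reassembly.

```elisp
(defvar-local my-chunk-mode-alist nil
  "Alist mapping hypothetical chunk names (symbols) to major mode functions.")

(defun my-chunk-mode-at (pos)
  "Return the major mode mapped to the `chunk-name' property at POS.
Return nil when the character at POS has no chunk name or the name has
no entry in `my-chunk-mode-alist'."
  (cdr (assq (get-text-property pos 'chunk-name) my-chunk-mode-alist)))
```

For example, after (put-text-property 7 14 'chunk-name 'awk) and (setq my-chunk-mode-alist '((awk . awk-mode))), calling (my-chunk-mode-at 8) returns awk-mode.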

Keybindings with point at a particular character would just be the
keybindings present in that character's chunk-buffer. If you need
special keybindings common across all chunk buffers, just bind the key
in all the chunk buffers. If a given chunk needs placeholder text to
represent text of some other chunk, it should be possible to add it to
that chunk buffer without affecting any of the others.

Anyway, this scheme is:

1) Robust - no messing around with variables, no tweaking fontification
2) Backwards compatible - a major-mode doesn't need to know it's being  
used this way
3) Versatile - you can compose arbitrary modes this way, even  
recursively
4) Conceptually simple (I hope)

Any thoughts?


* Re: Better parse-partial-sexp; multiple major modes
  2008-08-31  8:37     ` Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss) Daniel Colascione
@ 2008-08-31 15:02       ` Lennart Borgman (gmail)
  2008-09-01  6:10       ` Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss) Richard M. Stallman
  1 sibling, 0 replies; 14+ messages in thread
From: Lennart Borgman (gmail) @ 2008-08-31 15:02 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

Daniel Colascione wrote:
>> Also, Lennart is asking for it to work nicely with multiple major modes.
>> Surely this would be a Good Thing.  Files containing several major modes
>> are commonplace (awk or sed embedded within a shell script, html
>> embedded within php, ....).
> 
> After several attempts at using and understanding multiple major mode
> facilities, I'm convinced the only way forward is core support for the
> concept. Lennart's done a fine job with what's in Emacs currently. But
> anything involving multiple major modes today is a quivering mound of
> hacks. All the work Lennart's had to do to get modes playing nice with
> each other is a testament to that.

Are you suggesting that you have problems using MuMaMo today? If so then
please report it as a bug.

> Maybe a core solution could be something like this: in a given buffer,
> each character has a chunk-name character property. You'd buffer-locally
> map chunk names to major modes. For each chunk name, create a buffer
> containing just the text assigned to that chunk. Make the major-mode the
> major mode for the chunk buffer, and let that major-mode handle
> fontification, keybindings, and so on. In the main buffer, assemble the
> various bits from the chunk-buffers and allow the user to navigate the
> combined buffer normally.

This was an idea I played with at the beginning when I wrote mumamo.el.

I am afraid I think the concepts involved (like buffer local etc) must
be considered quite a bit more first before going in this direction. It
is unclear to me now whether chunk-buffers really would be of any help.
They might be, but I am not sure.

> Keybindings with point at a particular character would just be the
> keybindings present in that character's chunk-buffer. If you need
> special keybindings common across all chunk buffers, just bind the key
> in all the chunk buffers. If a given chunk needs placeholder text to
> represent text of some other chunk, it should be possible to add it to that
> chunk buffer without affecting any of the others.
> 
> Anyway, this scheme is:
> 
> 1) Robust - no messing around with variables, no tweaking fontification

Unfortunately I do not think that will hold.

I think the way to go is "interface style". font-lock has some good
support for making multiple major modes possible.

> 2) Backwards compatible - a major-mode doesn't need to know it's being
> used this way

Are you aware that a major mode does not need to know anything when
used in MuMaMo?

> 3) Versatile - you can compose arbitrary modes this way, even recursively

The main difficulty with sub-modes in sub-modes is stability. How do you
find the sub-modes in sub-modes after buffer changes?

I think I know how to implement this in MuMaMo, but I do not have time
right now. (And it is not very much needed.)

> 4) Conceptually simple (I hope)
> 
> Any thoughts?
> 
> 
> 


* Re: Better parse-partial-sexp; multiple major modes
  2008-08-31  9:39 Daniel Colascione
@ 2008-08-31 18:17 ` Stefan Monnier
  0 siblings, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2008-08-31 18:17 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

> Such a thing would have to live on the C side of things, right? (With the
> proliferation of with-this and inhibit-that options available to Lisp,
> I don't see how one can easily and robustly catch all buffer
> modification.  Not to mention that no matter which of
> before-change-functions and after-change-functions you used, you could
> still race against other functions using the same facility.)

In theory, that might be true, but it's not been that big of a deal
for now.  And of course moving it to C would help some but it won't
solve all those problems magically.

> Anyway, this scheme is:

> 1) Robust - no messing around with variables, no tweaking fontification
> 2) Backwards compatible - a major-mode doesn't need to know it's being used
> this way
> 3) Versatile - you can compose arbitrary modes this way, even recursively
> 4) Conceptually simple (I hope)

I don't think it's possible to get 1 and 2 together, really.


        Stefan





* Re: Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss)
  2008-08-31  8:37     ` Better parse-partial-sexp; multiple major modes (was: Idea for syntax-ppss) Daniel Colascione
  2008-08-31 15:02       ` Better parse-partial-sexp; multiple major modes Lennart Borgman (gmail)
@ 2008-09-01  6:10       ` Richard M. Stallman
  1 sibling, 0 replies; 14+ messages in thread
From: Richard M. Stallman @ 2008-09-01  6:10 UTC (permalink / raw)
  To: Daniel Colascione; +Cc: emacs-devel

    Maybe a core solution could be something like this: in a given buffer,
    each character has a chunk-name character property. You'd buffer-locally
    map chunk names to major modes. For each chunk name, create a buffer
    containing just the text assigned to that chunk.

That is elegant but very expensive.  Making so many buffers display as
one buffer, or keeping them in sync, would mean major changes at a low
level.  Even coming up with a spec for how insertion and deletion should
work in such a scheme is hard.

We are much better off with MuMaMo.





Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
