PHP mode and mmm-mode

unofficial mirror of emacs-devel@gnu.org 

* PHP mode and mmm-mode
@ 2006-05-02  4:24 Michael Shulman
  2006-05-02  8:10 ` martin rudalics
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Michael Shulman @ 2006-05-02  4:24 UTC
  Cc: emacs-devel

(I don't normally read this list, but Lennart kindly alerted me to this thread.)

To preface this with a disclaimer: it's been some years since I wrote
the first version of mmm-mode (which was itself based, at least
conceptually, on an earlier and much smaller package called mmm.el by
Gongquan Chen).  Since then I've become busy with other things and
mostly neglected it, so I'm not entirely on top of how things stand
now.  I should also add a disclaimer that I don't have a huge amount
of programming experience, so it's quite possible that my thoughts and
impressions are very wrong.  However, here are my two cents anyway;
feel free to ignore them or tell me what's wrong with them.

mmm-mode is a hack.  Actually, a collection of many hacks.  That it
more or less works, much of the time, is, I believe, a testament to
the flexibility and extensibility of Emacs.  However, it has many
flaws, and there are certain situations that I've never been able to
make it handle smoothly, so I would not really suggest including it in
Emacs.  For example, in certain modes (which I don't remember off the
top of my head) indentation in submode regions is broken, while in
others, quotation marks in one place can adversely affect the
font-locking somewhere where they really shouldn't.

It's quite possible that someone with a better understanding of the
internals of Emacs' syntax-parsing and font-lock systems could make
mmm-mode do a better job.  However, my current feeling is that a
different approach entirely might produce better results.

I came to believe while working on mmm-mode that in an ideal world,
the major-mode architecture would be designed to support being told to
manage only parts of a buffer.  Then when a mode is specifying its
syntax, it could specify that certain delimiters are to be used to
block off chunks of a buffer and hand them off to some other mode to
be processed.  Flags could specify whether these chunks should be
considered as sections of the same file, or unrelated code snippets,
or more generally which should be which.  Then the syntax-parsing,
indentation, and font-lock code would automatically hand those chunks
off to the other mode, which would parse them all together as required
and treat all the text in between, which it isn't handling, as a
string, or ignore it entirely, depending.  Moreover, that sub-mode
could in turn specify nested submodes, and the whole thing would Just
Work.

Then one could design, say, a php-mode that would hand parts of the
buffer off to an html-mode, xml-mode, or whatever (depending perhaps on
the beginning of the file, a local variable, or the file name) and to a
php-code-mode (or it might handle the code itself).  The html-mode
could then in turn be designed to hand chunks
of its section of the file off to javascript-mode or css-mode.  And so
on.  There are all sorts of templating tools in common use today (e.g.
ASP, JSP, Mason, Zope, etc.) which intersperse various kinds of code
with various kinds of text, so I feel that a general system for
combining major modes, like mmm-mode but more reliable, would be
significantly better and more useful than just "a PHP mode."

Now, a rewrite of all the major modes to support a design change like
this seems quite unlikely; certainly in the near future.  But it's
possible that such a system might be implemented "on top of" the
existing system, the way mmm-mode is, but in a different way.  For
instance, mmm-mode uses overlays to keep track of which text should be
in which mode.  This is fine as far as it goes, but it makes it hard
to completely conceal extraneous parts of the buffer from modes that
should not be paying attention to them, producing the above-mentioned
problems with font-lock and indentation.  Perhaps an approach based on
narrowing, or the creation of auxiliary buffers, might work better; I
haven't really explored these possibilities.  I would be interested to
hear if others have.

Michael

On 5/1/06, Lennart Borgman <lennart.borgman.073@student.lu.se> wrote:
> Lars Magne Ingebrigtsen wrote:
> > Richard Stallman <rms@gnu.org> writes:
> >
> >
> >> I do not know PHP.  All I can say is that I would welcome addition of
> >> a good PHP mode.
> >>
> >
> > Has the inclusion of mmm-mode (written by Michael Shulman and others)
> > been discussed previously?  It looks like the approach taken to this
> > problem (multi-mode programming languages) is promising.
> >
> I do not know whether it has been discussed, but it seems to me that
> there are unfortunately problems with this approach at present. I believe
> a number of things have to be addressed in Emacs before something like
> mmm-mode can work for all modes. But I am not sure; it is quite complex.
> My thoughts about it are below.
>
> I think the idea of having different modes in different regions in a
> buffer is very good. So I tried it with nxml-mode (actually
> nxhtml-mode), but without any real success. Dean Scarff says he had some
> success, but still had problems with nxml-mode re-fontifying regions
> that belong to another submode (see
> http://www.emacswiki.org/cgi-bin/wiki/NxmlModeForXHTML). This is also
> what I found.
>
> I can really see no way to stop different modes from stomping on each
> other with mmm-mode. And I cannot see that it is possible to do that
> with Emacs (even the CVS version) today. To do that there must be some
> way to tell a mode to care only about a region or a list of regions, and
> I cannot see how to do that now. (Maybe some more things are required,
> but this is a minimum.)
>
> For the moment I would therefore suggest either html-script.el or
> html-inlined.el. They use narrowing and change the buffer's major mode
> temporarily during narrowing.
>
> For the future I would suggest investigating ways to do what mmm-mode
> tries to do.
>


* Re: PHP mode and mmm-mode
  2006-05-02  4:24 PHP mode and mmm-mode Michael Shulman
@ 2006-05-02  8:10 ` martin rudalics
  2006-05-02 14:57 ` Stefan Monnier
  2006-05-02 20:29 ` Lars Magne Ingebrigtsen
  2 siblings, 0 replies; 8+ messages in thread
From: martin rudalics @ 2006-05-02  8:10 UTC
  Cc: emacs-devel

The simplest approach would be to mark regions reserved for "other"
modes by turning them into generic strings.  Usually, major modes don't
touch strings during indentation, font-lock colors them uniformly, and
parse-partial-sexp and syntax-ppss handle them.  Hence not much would
have to be done to implement this: point-entered/-left hooks would
switch to the appropriate mode and set the appropriate syntax-table
properties.  Obviously, a region reserved for a particular mode would
have to maintain its point-entered/-left properties appropriately, which
is likely the more expensive part of the overhead.

With foo-mode active, bar-mode and baz-mode would be barred off as

foo-mode
|bar-mode|
foo-mode
|baz-mode
bar-mode|

where text between two matching bars is a generic string.  Entering
bar-mode text would require reinstalling the syntax-table text properties

|foo-mode|
bar-mode
|foo-mode
baz-mode|
bar-mode

and perform a redisplay.  Things like `indent-buffer' in foo-mode would
use `inhibit-point-motion-hooks' to avoid switching to another mode
intermittently.  In any case, the original philosophy that "each buffer
has only one major mode at a time" would remain intact.
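
The masking step described above can be sketched in Emacs Lisp.  This is
only an illustration under assumptions: `my-mask-region-as-string' is a
hypothetical helper (it is not part of Emacs or mmm-mode), it relies on
`with-silent-modifications' from later Emacs releases, and it leaves out
the point-entered/-left hooks and the mode switch entirely.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-mask-region-as-string (beg end)
  "Make the text between BEG and END parse as one generic string.
Only the first and last characters receive the generic string
fence syntax; the text between them then lies inside the string."
  (with-silent-modifications
    (put-text-property beg (1+ beg) 'syntax-table (string-to-syntax "|"))
    (put-text-property (1- end) end 'syntax-table (string-to-syntax "|"))))

(with-temp-buffer
  ;; The parser honors `syntax-table' text properties only when
  ;; this variable is non-nil.
  (setq-local parse-sexp-lookup-properties t)
  (insert "(foo) <?php echo 1; ?> (bar)")
  ;; Positions 7..22 hold the "<?php ... ?>" chunk.
  (my-mask-region-as-string 7 23)
  ;; Element 3 of the parse state is non-nil inside a string.
  (nth 3 (syntax-ppss 15)))
```

The sketch overwrites whatever `syntax-table' properties were already on
the two fence characters, so a real implementation would have to save
and restore anything the major mode had put there.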

AFAICT only three major modes - cc-awk, cperl, and perl - currently use
generic strings for their purposes.  Whether and how to accommodate them
is beyond my comprehension.

Just an idea - with blatant disregard for the particular needs of PHP
and friends.


* Re: PHP mode and mmm-mode
  2006-05-02  4:24 PHP mode and mmm-mode Michael Shulman
  2006-05-02  8:10 ` martin rudalics
@ 2006-05-02 14:57 ` Stefan Monnier
  2006-05-02 20:29 ` Lars Magne Ingebrigtsen
  2 siblings, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2006-05-02 14:57 UTC
  Cc: Lennart Borgman, emacs-devel

> I came to believe while working on mmm-mode that in an ideal world,
> the major-mode architecture would be designed to support being told to
> manage only parts of a buffer.

I think we all agree on that.  But in order to know what that should look
like, I expect several iterations will be needed.  And in order to know how
to get from here to there, we need experimentation.  That's what mmm-mode
is/was for, as far as I'm concerned.
And I believe it is possible to go from here to there progressively.

What I'd like to see from you is not a "let's start over with a great second
system", because we all know what happens with second systems, but a list of
very concrete problems you've encountered, in order of importance.  By "very
concrete", I really mean *very*: specific to particular situations with
particular major modes.  Something where there's a hope we can fix this
particular case by adding one simple convention and changing mmm-mode and
the respective major modes to obey or take advantage of that convention.

These should have been sent via M-x report-emacs-bug, years ago already when
working on mmm-mode.  I've always been surprised/disappointed not to see any
feature request from the mmm-mode guy(s).

        Stefan


* Re: PHP mode and mmm-mode
  2006-05-02  4:24 PHP mode and mmm-mode Michael Shulman
  2006-05-02  8:10 ` martin rudalics
  2006-05-02 14:57 ` Stefan Monnier
@ 2006-05-02 20:29 ` Lars Magne Ingebrigtsen
  2006-05-02 23:44   ` Nic
                     ` (3 more replies)
  2 siblings, 4 replies; 8+ messages in thread
From: Lars Magne Ingebrigtsen @ 2006-05-02 20:29 UTC
  Cc: Lennart Borgman, emacs-devel

"Michael Shulman" <shulman@math.uchicago.edu> writes:

> For example, in certain modes (which I don't remember off the
> top of my head) indentation in submode regions is broken, while in
> others, quotation marks in one place can adversely affect the
> font-locking somewhere where they really shouldn't.

Well, that's to be expected, but if there were a convention for telling
modes which regions belong to which modes, then these things could be
fixed in the relevant major modes.

We're not talking about hundreds of modes, either -- the number of
modes where mixing types is likely is probably pretty low.  Say
10 to 20.

> This is fine as far as it goes, but it makes it hard
> to completely conceal extraneous parts of the buffer from modes that
> should not be paying attention to them, producing the above-mentioned
> problems with font-lock and indentation.

I just had a gross idea.  Before calling any of the major-mode
functions (in response to, say, `TAB'), you'd make all the text that's
not in the current major mode invisible and intangible.  Then each
major mode function would believe there was nothing but its own type
of text in the buffer.

The mmm minor mode would basically install a keymap that does

...
(make-other-text-invisible)
(call-real-function)
(make-other-text-visible-again)
...

The major mode would probably need a way to tell mmm which functions
would need this treatment. 
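
The wrapper outlined above could be filled in roughly as follows.  This
is a sketch under assumptions, not mmm-mode code: the function name and
its arguments are hypothetical, and it uses temporary overlays (rather
than text properties) so that the buffer text and its existing
properties are left alone.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-call-with-other-text-hidden (beg end fn)
  "Call FN while the text outside BEG..END is invisible and intangible.
The temporary overlays are removed again even if FN signals an error.
Return the value of FN."
  (let ((before (make-overlay (point-min) beg))
        (after (make-overlay end (point-max))))
    (unwind-protect
        (progn
          (dolist (ov (list before after))
            (overlay-put ov 'invisible t)
            (overlay-put ov 'intangible t))
          (funcall fn))
      (delete-overlay before)
      (delete-overlay after))))

;; Example call: run the usual TAB command with only the chunk
;; between positions 7 and 23 left "visible" to it.
;; (my-call-with-other-text-hidden 7 23 #'indent-for-tab-command)
```

The `unwind-protect' form is what guarantees the "make visible again"
step; whether a given major-mode function is actually fooled by the
hidden text is a separate question that the sketch does not answer.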

> Perhaps an approach based on narrowing, or the creation of auxiliary
> buffers, might work better; I haven't really explored these
> possibilities.

I think using narrowing and auxiliary buffers would both be less than
optimal.  When you program, you need to see the context.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen


* Re: PHP mode and mmm-mode
  2006-05-02 20:29 ` Lars Magne Ingebrigtsen
@ 2006-05-02 23:44   ` Nic
  2006-05-03  3:11   ` David Hansen
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Nic @ 2006-05-02 23:44 UTC
  Cc: Lennart Borgman, Michael Shulman, emacs-devel

Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> "Michael Shulman" <shulman@math.uchicago.edu> writes:
>
>> For example, in certain modes (which I don't remember off the
>> top of my head) indentation in submode regions is broken, while in
>> others, quotation marks in one place can adversely affect the
>> font-locking somewhere where they really shouldn't.
>
> Well, that's to be expected, but if there were a convention for telling
> modes which regions belong to which modes, then these things could be
> fixed in the relevant major modes.
>
> We're not talking about hundreds of modes, either -- the number of
> modes where mixing types is likely is probably pretty low.  Say
> 10 to 20.

Actually, I don't think that's true.

I often type XML in mail messages, Java buffers, C buffers, LISP
buffers, Javascript buffers, not to mention XSLT buffers.

I often type Javascript in XML buffers, Java buffers, C buffers, LISP
buffers and mail messages.

I think a solution that offered me syntax colouring and completion and
some of the other facilities when in a foreign mode would be a real
boon.



Nic Ferrier


* Re: PHP mode and mmm-mode
  2006-05-02 20:29 ` Lars Magne Ingebrigtsen
  2006-05-02 23:44   ` Nic
@ 2006-05-03  3:11   ` David Hansen
  2006-05-03  3:43   ` Stefan Monnier
  2007-01-01  2:11   ` Richard Stallman
  3 siblings, 0 replies; 8+ messages in thread
From: David Hansen @ 2006-05-03  3:11 UTC


On Tue, 02 May 2006 22:29:04 +0200 Lars Magne Ingebrigtsen wrote:

> I think using narrowing and auxiliary buffers would both be less than
> optimal.  When you program, you need to see the context.

Have a look at Dave Love's multi-mode.el:

http://www.loveshack.ukfsn.org/emacs/multi-mode.el

It uses indirect buffers and narrows to the region only for
indentation and font-lock code.  You always see the whole
buffer.

It's far from perfect, but at least indentation within one
region works well.  It can't, however, deal with mixed code
like

<%
if (a) {
%>
  <h3>blah</h3>
<%
  // cc-mode thinks this is a top level funcall
  // and will indent it to `point-at-bol'
  foo ();
}
%>
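
The combination of an indirect buffer and narrowing might look roughly
like this.  It is a sketch only and not multi-mode.el's actual code:
the function name is hypothetical, point is assumed to be inside the
chunk, and a fresh indirect buffer is created on every call where a
real package would keep one per mode.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-indent-line-via-indirect-buffer (chunk-beg chunk-end mode)
  "Indent the current line as MODE would, seeing only one chunk.
MODE is a major mode function.  It is run in a temporary indirect
buffer narrowed to CHUNK-BEG..CHUNK-END, so the major mode of the
base buffer is left untouched."
  (let ((pos (point))
        (indirect (make-indirect-buffer
                   (current-buffer)
                   (generate-new-buffer-name " *chunk*"))))
    (unwind-protect
        (with-current-buffer indirect
          (funcall mode)
          (narrow-to-region chunk-beg chunk-end)
          (goto-char pos)
          ;; Text changes made here are shared with the base buffer.
          (indent-according-to-mode))
      (kill-buffer indirect))))
```

Because the mode sees only the narrowed chunk, it cannot know about the
`if (a) {' opened in an earlier chunk, which is the failure shown in
the example above.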

David


* Re: PHP mode and mmm-mode
  2006-05-02 20:29 ` Lars Magne Ingebrigtsen
  2006-05-02 23:44   ` Nic
  2006-05-03  3:11   ` David Hansen
@ 2006-05-03  3:43   ` Stefan Monnier
  2007-01-01  2:11   ` Richard Stallman
  3 siblings, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2006-05-03  3:43 UTC
  Cc: Lennart Borgman, Michael Shulman, emacs-devel

> Well, that's to be expected, but if there were a convention for telling
> modes which regions belong to which modes, then these things could be
> fixed in the relevant major modes.

> We're not talking about hundreds of modes, either -- the number of
> modes where mixing types is likely is probably pretty low.  Say
> 10 to 20.

Even if that's not true, it's not a problem: there is simply no other way
than to fix each mode one by one anyway, and every mode that's adjusted is
a step forward.

> I just had a gross idea.  Before calling any of the major-mode
> functions (in response to, say, `TAB'), you'd make all the text that's
> not in the current major mode invisible and intangible.  Then each
> major mode function would believe there was nothing but its own type
> of text in the buffer.

I'd rather play with syntax-tables to mark them as comments.  Or use
narrowing (together with some convention for how to transfer
parse-state-info from one chunk to another).  The `intangible' property is
just extremely difficult to live with.
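
One possible shape for "transferring parse-state-info from one chunk to
another" uses the OLDSTATE argument of `parse-partial-sexp'.  The
sketch below is an assumption about how such a convention could work,
not an existing one: `my-parse-chunks' and `my-brace-syntax-table' are
hypothetical names, and the chunk positions are written out by hand.

```elisp
;; Hypothetical names throughout, for illustration only.
(defvar my-brace-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?{ "(}" st)
    (modify-syntax-entry ?} "){" st)
    st)
  "Syntax table in which braces are a matching pair of parentheses.")

(defun my-parse-chunks (chunks)
  "Parse each (BEG . END) pair in CHUNKS in order.
The state returned for one chunk is passed as OLDSTATE to the
next, so nesting depth survives the gaps between chunks.
Return the final state, as from `parse-partial-sexp'."
  (let ((state nil))
    (dolist (chunk chunks state)
      (setq state (parse-partial-sexp (car chunk) (cdr chunk)
                                      nil nil state)))))

(with-temp-buffer
  (insert "<% if (a) { %><h3>blah</h3><% foo (); } %>")
  (with-syntax-table my-brace-syntax-table
    (list
     ;; Depth just before "foo" when the second chunk is parsed alone.
     (car (parse-partial-sexp 30 31))
     ;; The same depth when state is carried over from the first chunk.
     (car (my-parse-chunks '((3 . 13) (30 . 31)))))))
;; => (0 1)
```

Parsed alone, the second chunk looks like top-level code; with the state
carried over, the parser knows it is still inside the open brace.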


        Stefan


* Re: PHP mode and mmm-mode
  2006-05-02 20:29 ` Lars Magne Ingebrigtsen
                     ` (2 preceding siblings ...)
  2006-05-03  3:43   ` Stefan Monnier
@ 2007-01-01  2:11   ` Richard Stallman
  3 siblings, 0 replies; 8+ messages in thread
From: Richard Stallman @ 2007-01-01  2:11 UTC
  Cc: lennart.borgman.073, shulman, emacs-devel

I am going through the old mail that I failed to deal with during the
past year.  Please forgive me for taking so long to respond.

    I just had a gross idea.  Before calling any of the major-mode
    functions (in response to, say, `TAB'), you'd make all the text that's
    not in the current major mode invisible and intangible.  Then each
    major mode function would believe there was nothing but its own type
    of text in the buffer.

This is an interesting idea, but I think it won't work unless we rewrite
major modes specifically to work with it.

Making it invisible is pointless since it would only affect
redisplay.

Making it intangible would stop point from staying inside it.  That
could influence the various major modes.  But I don't think it would
influence them enough.  The classic case of mixing two languages is
Bison input.

Would making everything except the C code intangible cause the C code
to be handled and indented properly by CC mode?  I tend to think the
answer is no, but you could try it easily and see.

    > Perhaps an approach based on narrowing, or the creation of auxiliary
    > buffers, might work better; I haven't really explored these
    > possibilities.

    I think using narrowing and auxiliary buffers would both be less than
    optimal.  When you program, you need to see the context.

I agree with you that that isn't likely to work.

Perhaps we could have a text property `language' and a variable
`current-language'.  If the `language' property of a character doesn't
equal the value of `current-language', then the character would be
treated syntactically as whitespace in syntax.c, and ignored _in the
appropriate way_ by various other primitives.
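
The proposal above is a change to syntax.c, which cannot be shown in
Lisp.  As a rough approximation under stated assumptions, the same
effect on the syntax parser can be imitated with `syntax-table' text
properties.  `my-current-language' and `my-mask-other-languages' are
hypothetical names, the `language' property is assumed to have been put
on the text by something else, and the other primitives mentioned above
are not covered at all.

```elisp
;; Hypothetical stand-ins for the proposed variable and behavior.
(defvar-local my-current-language nil
  "Stand-in for the proposed `current-language' variable.")

(defun my-mask-other-languages ()
  "Give whitespace syntax to text in a language other than the current one.
Text whose `language' property is `eq' to `my-current-language'
gets a nil `syntax-table' property, i.e. its normal syntax.  Any
`syntax-table' properties already present are overwritten."
  (with-silent-modifications
    (let ((pos (point-min)))
      (while (< pos (point-max))
        ;; NEXT is the end of the run with a uniform `language' value.
        (let ((next (next-single-property-change
                     pos 'language nil (point-max))))
          (put-text-property
           pos next 'syntax-table
           (unless (eq (get-text-property pos 'language)
                       my-current-language)
             (string-to-syntax " ")))
          (setq pos next))))))
```

For the masking to take effect, `parse-sexp-lookup-properties' must be
non-nil in the buffer, and the function has to be called again whenever
`my-current-language' changes.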
