eLisp fontlock with mmm-mode

* eLisp fontlock with mmm-mode
@ 2003-08-12 11:18 Sam Vilain
  0 siblings, 0 replies; 9+ messages in thread
From: Sam Vilain @ 2003-08-12 11:18 UTC (permalink / raw)


Hi all,

When I try to mix two modes using mmm-ify-by-regexp, there are
problems when the context of one interferes with the other.

  see http://www.vilain.net/emacs/

     00-control.png is a screenshot of the buffer in fundamental mode
     01-html-mode.png is the buffer in HTML mode
     02-tt-mode.png is the buffer in TT mode, below
     03-html[tt]-mmm.png is the buffer mmm-ify'd using the elisp
        expression on the fourth line.
     tt-mode.el is the source code for the TT mode (included below)

Before taking each screenshot, I used font-lock-fontify-buffer to
re-do the highlighting.

I would like the mode within the mmm-mode to ignore the current
highlighting context of the base mode.  Is this possible?

If it is more likely a bug in the tt-mode, is there a simple problem
with the following syntax highlighting definition that would cause
this to happen?

I'll include it, because it's quite short:

(require 'font-lock)

(defvar tt-mode-hook nil
  "List of functions to call when entering TT mode")

(defvar tt-keywords
  (concat "\\bGET\\b\\|\\bCALL\\b\\|\\bSET\\b\\|\\bDEFAULT\\b\\|"
          "\\bINSERT\\b\\|\\bINCLUDE\\b\\|\\bBLOCK\\b\\|\\bEND\\b\\|"
          "\\bPROCESS\\b\\|\\bWRAPPER\\b\\|\\bIF\\b\\|\\bUNLESS\\b\\|\\bELSIF\\b\\|"
          "\\bELSE\\b\\|\\bSWITCH\\b\\|\\bCASE\\b\\|\\bFOREACH\\b\\|"
          "\\bWHILE\\b\\|\\bFILTER\\b\\|\\bUSE\\b\\|\\bMACRO\\b\\|\\bPERL\\b\\|"
          "\\bRAWPERL\\b\\|\\bTRY\\b\\|\\bTHROW\\b\\|\\bCATCH\\b\\|"
          "\\bFINAL\\b\\|\\bLAST\\b\\|\\bRETURN\\b\\|\\bSTOP\\b\\|\\bCLEAR\\b\\|"
          "\\bMETA\\b\\|\\bTAGS\\b")
  "Regexp matching Template Toolkit directive keywords.")

(defvar tt-font-lock-keywords 
   (list
    ;; Fontify [% ... %] expressions
    '("\\(\\[%[-+]?\\)\\(.+?\\)\\([-+]?%\\]\\)"  
      (1 font-lock-string-face t)
      (2 font-lock-variable-name-face t)
      (3 font-lock-string-face t))
    ;; Look for keywords within those expressions
    (list (concat
	   "\\[%[-+]? *\\("
	   tt-keywords 
	   "\\)") 
	  1 font-lock-keyword-face t)
    )
  "Expressions to font-lock in tt-mode.")

(defun tt-mode ()
  "Major mode for editing Template Toolkit files"
  (interactive)
  (kill-all-local-variables)
  (setq major-mode 'tt-mode)
  (setq mode-name "TT")
  (if (string-match "XEmacs" emacs-version)
      (progn
	(make-local-variable 'font-lock-keywords)
	(setq font-lock-keywords tt-font-lock-keywords))
    ;; Emacs
    (make-local-variable 'font-lock-defaults)
    (setq font-lock-defaults '(tt-font-lock-keywords nil t))
    )
  (font-lock-mode 1)
  (run-hooks 'tt-mode-hook))

(provide 'tt-mode)
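
A hand-written alternation like tt-keywords is easy to break when it is
wrapped or edited.  As a minimal sketch, the same keyword set could
instead be generated with regexp-opt.  The names demo-tt-keyword-list
and demo-tt-keywords-regexp are made up for this illustration and are
not part of tt-mode.el:

```elisp
;; Sketch only: both variable names below are illustrative assumptions,
;; not part of tt-mode.el.
(defvar demo-tt-keyword-list
  '("GET" "CALL" "SET" "DEFAULT" "INSERT" "INCLUDE" "BLOCK" "END"
    "PROCESS" "WRAPPER" "IF" "UNLESS" "ELSIF" "ELSE" "SWITCH" "CASE"
    "FOREACH" "WHILE" "FILTER" "USE" "MACRO" "PERL" "RAWPERL" "TRY"
    "THROW" "CATCH" "FINAL" "LAST" "RETURN" "STOP" "CLEAR" "META" "TAGS")
  "Template Toolkit directive names, as listed in `tt-keywords'.")

(defvar demo-tt-keywords-regexp
  ;; The `words' argument wraps the alternation in \\< ... \\>, so each
  ;; keyword only matches as a whole word.
  (regexp-opt demo-tt-keyword-list 'words)
  "Regexp matching any one directive name as a whole word.")
```

Note that regexp-opt with the words argument adds its own group around
the alternation, which matters if later groups in a surrounding regexp
are referenced by number.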

Much appreciated,
-- 
Sam Vilain, sam@vilain.net

"This is an object-oriented system.
 If we change anything, the users object." 

* Re: eLisp fontlock with mmm-mode
       [not found] <mailman.34.1062553939.18171.help-gnu-emacs@gnu.org>
@ 2003-09-03 14:59 ` Joe Kelsey
  2003-09-03 17:02   ` Kevin Rodgers
  0 siblings, 1 reply; 9+ messages in thread
From: Joe Kelsey @ 2003-09-03 14:59 UTC (permalink / raw)


The mmm SourceForge site has links to the mmm-discuss mailing list:
http://mmm-mode.sourceforge.net/

Aside from that, support for mixed-mode buffers suffers in Emacs due
to limitations on the ability of using syntax tables for multiple
purposes in a buffer.   The design of syntax tables implies that a
single syntax table controls an entire buffer in a single style. 
mmm-mode attempts to get around this by "dynamically" switching syntax
tables as the point moves through various areas of a buffer.  One very
noticeable side effect involves the fact that when you set up the
syntax table for a particular sub-buffer, it changes the entire buffer
view.  Until someone comes up with a way to regionalize syntax tables,
you just have to live with the "bleeding" of syntax table-based
font-locks between buffer regions.

/Joe

* Re: eLisp fontlock with mmm-mode
  2003-09-03 14:59 ` Joe Kelsey
@ 2003-09-03 17:02   ` Kevin Rodgers
  2003-09-05 15:02     ` Joe Kelsey
  0 siblings, 1 reply; 9+ messages in thread
From: Kevin Rodgers @ 2003-09-03 17:02 UTC (permalink / raw)

Joe Kelsey wrote:

> Aside from that, support for mixed-mode buffers suffers in Emacs due
> to limitations on the ability of using syntax tables for multiple
> purposes in a buffer.   The design of syntax tables implies that a
> single syntax table controls an entire buffer in a single style. 
> mmm-mode attempts to get around this by "dynamically" switching syntax
> tables as the point moves through various areas of a buffer.  One very
> noticeable side effect involves the fact that when you set up the
> syntax table for a particular sub-buffer, it changes the entire buffer
> view.  Until someone comes up with a way to regionalize syntax tables,
> you just have to live with the "bleeding" of syntax table-based
> font-locks between buffer regions.

I thought that had already been done; from the Special Properties node
of the Emacs Lisp manual:

| Properties with Special Meanings
| --------------------------------
| 
|    Here is a table of text property names that have special built-in
| meanings.  The following sections list a few additional special property
| names that control filling and property inheritance.  All other names
| have no standard meaning, and you can use them as you like.
...

| `syntax-table'
|      The `syntax-table' property overrides what the syntax table says
|      about this particular character.  *Note Syntax Properties::.
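
To make the effect of that property concrete, here is a minimal sketch.
The function name demo-in-string-at-end-p is made up for this
illustration.  It gives the apostrophe string-quote syntax, as C-like
modes do, and then optionally neutralizes each apostrophe with a
syntax-table text property.  The parsing functions only consult the
property when parse-sexp-lookup-properties is non-nil:

```elisp
;; Sketch only: `demo-in-string-at-end-p' is an illustrative name.
(defun demo-in-string-at-end-p (text &optional neutralize)
  "Return non-nil if a parse of TEXT ends inside a string.
Apostrophe is given string-quote syntax, as in C-like modes.
When NEUTRALIZE is non-nil, every apostrophe first receives a
`syntax-table' text property giving it punctuation syntax."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?' "\"" table)
      (set-syntax-table table))
    (insert text)
    (when neutralize
      (goto-char (point-min))
      (while (search-forward "'" nil t)
        (put-text-property (1- (point)) (point)
                           'syntax-table (string-to-syntax "."))))
    ;; The property is ignored unless this variable is non-nil.
    (let ((parse-sexp-lookup-properties (and neutralize t)))
      (nth 3 (parse-partial-sexp (point-min) (point-max))))))
```

Calling it on "don't stop" reports an open string without the property
and no open string once the apostrophe has been neutralized.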

-- 
Kevin Rodgers

* Re: eLisp fontlock with mmm-mode
  2003-09-03 17:02   ` Kevin Rodgers
@ 2003-09-05 15:02     ` Joe Kelsey
  2003-09-11 22:28       ` Alan Mackenzie
  0 siblings, 1 reply; 9+ messages in thread
From: Joe Kelsey @ 2003-09-05 15:02 UTC (permalink / raw)

Kevin Rodgers <ihs_4664@yahoo.com> wrote in message news:<3F561EAE.3030506@yahoo.com>...
> Joe Kelsey wrote:
> 
> > Aside from that, support for mixed-mode buffers suffers in Emacs due
> > to limitations on the ability of using syntax tables for multiple
> > purposes in a buffer.   The design of syntax tables implies that a
> > single syntax table controls an entire buffer in a single style. 
> > mmm-mode attempts to get around this by "dynamically" switching syntax
> > tables as the point moves through various areas of a buffer.  One very
> > noticeable side effect involves the fact that when you set up the
> > syntax table for a particular sub-buffer, it changes the entire buffer
> > view.  Until someone comes up with a way to regionalize syntax tables,
> > you just have to live with the "bleeding" of syntax table-based
> > font-locks between buffer regions.
> 
> 
> I thought that had already been done; from the Special Properties node
> of the Emacs Lisp manual:

Text properties apply to portions of the buffer and constitute the
basis of font-lock mode.  The interaction between the global
syntax-table and text properties allows font-lock to operate in a
specific buffer.

mmm-mode works by segregating the buffer into overlay sections.  As
the cursor moves out of one overlay and into another, it switches the
global syntax-table.
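
The idea of picking a syntax table from the overlay around a position
can be sketched as follows.  This is not mmm-mode's actual code: the
overlay property demo-syntax-table and both function names are
assumptions made up for the illustration.

```elisp
;; Sketch only: the overlay property `demo-syntax-table' and both
;; function names are illustrative, not taken from mmm-mode.
(defun demo-syntax-table-at (pos)
  "Return the syntax table that should govern POS.
An overlay carrying the property `demo-syntax-table' wins;
otherwise fall back to the buffer's current syntax table."
  (or (get-char-property pos 'demo-syntax-table)
      (syntax-table)))

(defun demo-char-syntax-at (pos)
  "Return the syntax class character of the character at POS.
The lookup uses whichever table `demo-syntax-table-at' selects."
  (with-syntax-table (demo-syntax-table-at pos)
    (char-syntax (char-after pos))))
```

Because only one table is in effect at a time, any scan that starts in
one overlay and runs into another still sees the foreign text through
the wrong table, which is the "bleeding" described earlier in the
thread.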

The syntax-table text property works differently from the global
syntax table in that it applies to a specific section of the buffer. 
However, applying a syntax-table property to a specific section of
text also involves a lot of extra overhead and thus it doesn't come
cheaply.

I have experimented in mmm-mode with using the syntax-table text
property to make inactive overlays have specific properties to try to
make indenting work better in multi-mode buffers.  Basically, nothing
works perfectly.  The global syntax-table doesn't work completely
satisfactorily in multi-mode buffers.  Syntax-table text properties
involve enormous overhead and also do not work well enough.

The real problem involves resolving the dichotomy between linear
editing and the discontinuous nature of multi-mode files.  I don't
really have a perfect solution right now.

/Joe

* Re: eLisp fontlock with mmm-mode
  2003-09-05 15:02     ` Joe Kelsey
@ 2003-09-11 22:28       ` Alan Mackenzie
  2003-09-12 15:46         ` Joe Kelsey
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Mackenzie @ 2003-09-11 22:28 UTC (permalink / raw)


Joe Kelsey <joe-gg@zircon.seattle.wa.us> wrote on 5 Sep 2003 08:02:02
-0700:
> Kevin Rodgers <ihs_4664@yahoo.com> wrote in message
> news:<3F561EAE.3030506@yahoo.com>...
>> Joe Kelsey wrote:

>> > Aside from that, support for mixed-mode buffers suffers in Emacs due
>> > to limitations on the ability of using syntax tables for multiple
>> > purposes in a buffer.   The design of syntax tables implies that a
>> > single syntax table controls an entire buffer in a single style.
>> > mmm-mode attempts to get around this by "dynamically" switching
>> > syntax tables as the point moves through various areas of a buffer.
>> > One very noticeable side effect involves the fact that when you set
>> > up the syntax table for a particular sub-buffer, it changes the
>> > entire buffer view.  Until someone comes up with a way to
>> > regionalize syntax tables, you just have to live with the "bleeding"
>> > of syntax table-based font-locks between buffer regions.

>> I thought that had already been done; from the Special Properties node
>> of the Emacs Lisp manual:

> Text properties apply to portions of the buffer and constitute the
> basis of font-lock mode.  The interaction between the global
> syntax-table and text properties allows font-lock to operate in a
> specific buffer.

> mmm-mode works by segregating the buffer into overlay sections.  As
> the cursor moves out of one overlay and into another, it switches the
> global syntax-table.

> The syntax-table text property works differently from the global
> syntax table in that it applies to a specific section of the buffer. 
> However, applying a syntax-table property to a specific section of
> text also involves a lot of extra overhead and thus it doesn't come
> cheaply.

Joe, I take it the value you are giving to the syntax-table text property
is a syntax-table, and you're giving this to the whole mmm section of the
buffer in a single operation.  What do you mean by "a lot of extra
overhead"?  Do you mean extra coding or sluggish performance?  If the
latter, do you have any quantitative feel for how bad the hit is?

> I have experimented in mmm-mode with using the syntax-table text
> property to make inactive overlays have specific properties to try to
> make indenting work better in multi-mode buffers.

You mean, you are setting the STTP to a harmless value (say the WS code)?
In that case, how do you go about restoring the STTP value to what the
major mode might have set it to?

> Basically, nothing works perfectly.  The global syntax-table doesn't
> work completely satisfactorily in multi-mode buffers.  Syntax-table
> text properties involve enormous overhead and also do not work well
> enough.

Isn't this sort of thing one of the reasons the syntax-table TP was
invented?  Surely the overhead can't be that bad (at least, not in GNU
Emacs - I've heard that it can cause significant performance hits in the
other Emacs, though).

> The real problem involves resolving the dichotomy between linear
> editing and the discontinuous nature of multi-mode files.  I don't
> really have a perfect solution right now.

It feels like we could do with some sort of support in the core for
multiple sections.  Something a bit like a region, or a narrowed section,
but independent of them, inside of which font-locking, indentation and so
on would be calculated.

> /Joe

-- 
Alan Mackenzie (Munich, Germany)
Email: aacm@muuc.dee; to decode, wherever there is a repeated letter
(like "aa"), remove half of them (leaving, say, "a").

* Re: eLisp fontlock with mmm-mode
  2003-09-11 22:28       ` Alan Mackenzie
@ 2003-09-12 15:46         ` Joe Kelsey
  2003-09-12 21:55           ` Alan Mackenzie
  0 siblings, 1 reply; 9+ messages in thread
From: Joe Kelsey @ 2003-09-12 15:46 UTC (permalink / raw)

Alan Mackenzie<none@example.invalid> wrote in message news:<7vsqjb.r5.ln@acm.acm>...
> Joe Kelsey <joe-gg@zircon.seattle.wa.us> wrote on 5 Sep 2003 08:02:02
> -0700:
> > Kevin Rodgers <ihs_4664@yahoo.com> wrote in message
> > news:<3F561EAE.3030506@yahoo.com>...
> > Joe Kelsey wrote:
> > The syntax-table text property works differently from the global
> > syntax table in that it applies to a specific section of the buffer. 
> > However, applying a syntax-table property to a specific section of
> > text also involves a lot of extra overhead and thus it doesn't come
> > cheaply.
> 
> Joe, I take it the value you are giving to the syntax-table text property
> is a syntax-table, and you're giving this to the whole mmm section of the
> buffer in a single operation.  What do you mean by "a lot of extra
> overhead"?  Do you mean extra coding or sluggish performance?  If the
> latter, do you have any quantitative feel for how bad the hit is?

The maintainer of mmm-mode felt that turning on
parse-sexp-lookup-properties might impose unacceptable overhead on
buffer activities.  I have no direct evidence of any performance
penalties due to turning on parse-sexp-lookup-properties, but I bow to
the owner of mmm-mode in his personal decisions.

> > I have experimented in mmm-mode with using the syntax-table text
> > property to make inactive overlays have specific properties to try to
> > make indenting work better in multi-mode buffers.
> 
> You mean, you are setting the STTP to a harmless value (say the WS code)?
> In that case, how do you go about restoring the STTP value to what the
> major mode might have set it to?

I created a function to apply a syntax-table property to a set of
regions like this:

(defun mmm-space-other-regions ()
  "Give all other regions space syntax."
  (interactive)
  (mmm-syntax-other-regions (string-to-syntax " ") nil)
  (setq parse-sexp-lookup-properties t))

which eventually calls the following on each region (the definition of
mmm-syntax-other-regions does not matter to this discussion):

(defun mmm-syntax-region (start stop syntax)
  "Apply a syntax description from START to STOP using the syntax cell
SYNTAX.
Sets the local text property syntax-table to SYNTAX in the region.
If SYNTAX is nil, then remove local syntax-table property."
  (if syntax
      (add-text-properties start stop (list 'syntax-table syntax))
    (remove-text-properties start stop '(syntax-table nil))))

> > The real problem involves resolving the dichotomy between linear
> > editing and the discontinuous nature of multi-mode files.  I don't
> > really have a perfect solution right now.
> 
> It feels like we could do with some sort of support in the core for
> multiple sections.  Something a bit like a region, or a narrowed section,
> but independent of them, inside of which font-locking, indentation and so
> on would be calculated.

You seem to feel that some sort of "narrowing" function might work. 
However, in reality, when you look at multi-mode buffers as supported
by mmm-mode, narrowing by itself does not provide enough context. 
Unfortunately, indentation engines and font-lock engines, at least as
implemented by cc-mode, rely on a combination of syntax-table
properties and regular expression searching to accomplish their tasks.

For example, take a noweb file.  This consists of a literate program,
actually a mixture of LaTeX and code in what Norman Ramsey calls
"chunks".  Each documentation chunk describes ideas behind surrounding
code chunks.  Part of the literate programming style involves the
tangle and weave processes.  Tangling means reassembling the disjoint
code chunks from a web file into a coherent whole for presentation to
the compiler.  Weaving involves processing the entire web to markup
sections appropriately, applying pretty-printing markup to the code
and adjusting the web syntax markup to fit the documentation
processor, such as LaTeX or HTML.

While editing such buffers, you want to present different views of the
buffer in different sections.  For instance, if you want to edit the
documentation, then you want latex-mode to treat the code chunks as a
single unmovable piece, essentially a syntactic word and not attempt
to reformat it using its own peculiar ideas of paragraph formatting.

Meanwhile, while editing code, you want cc-mode to completely ignore
the documentation chunks.  One of the most frustrating parts of this
comes when cc-mode spots an unterminated apostrophe in a preceding
documentation chunk and treats it as an unterminated string, thus
completely screwing up the indentation of the code.  Also, you may
want to consider disjoint code sections as virtually appended to each
other in order to carry forward indentation from one to the other. 
Also, you may want to ignore other code sections depending on how
related they are to each other, since the tangle process may move them
around into different places, including separating them into
completely different files (.c versus .h).

I want to have "virtual views" of a buffer imposed upon major modes in
order to restrict how far afield their regular expression linear
searches can carry them.  Thus, I want to specify a set of regions
which cc-mode can consider virtually catenated in order to restrict it
to only looking at characters in that view.  Then it can use all of
the regular-expression optimizations to supplement syntax-table
properties it wants in order to work correctly in multi-mode buffers. 
Something like narrow-to-region, but given a list of disjoint regions
to narrow to.

/Joe

* Re: eLisp fontlock with mmm-mode
  2003-09-12 15:46         ` Joe Kelsey
@ 2003-09-12 21:55           ` Alan Mackenzie
  2003-09-12 22:39             ` Stefan Monnier
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Mackenzie @ 2003-09-12 21:55 UTC (permalink / raw)


Joe Kelsey <joe-gg@zircon.seattle.wa.us> wrote on 12 Sep 2003 08:46:51
-0700:
> Alan Mackenzie<none@example.invalid> wrote in message
> news:<7vsqjb.r5.ln@acm.acm>...
>> Joe Kelsey <joe-gg@zircon.seattle.wa.us> wrote on 5 Sep 2003 08:02:02
>> -0700:
>> > Kevin Rodgers <ihs_4664@yahoo.com> wrote in message
>> > news:<3F561EAE.3030506@yahoo.com>...  Joe Kelsey wrote: The
>> > syntax-table text property works differently from the global syntax
>> > table in that it applies to a specific section of the buffer.
>> > However, applying a syntax-table property to a specific section of
>> > text also involves a lot of extra overhead and thus it doesn't come
>> > cheaply.

>> Joe, I take it the value you are giving to the syntax-table text
>> property is a syntax-table, and you're giving this to the whole mmm
>> section of the buffer in a single operation.  What do you mean by "a
>> lot of extra overhead"?  Do you mean extra coding or sluggish
>> performance?  If the latter, do you have any quantitative feel for how
>> bad the hit is?

> The maintainer of mmm-mode felt that turning on
> parse-sexp-lookup-properties might impose unacceptable overhead on
> buffer activities.  I have no direct evidence of any performance
> penalties due to turning on parse-sexp-lookup-properties, but I bow to
> the owner of mmm-mode in his personal decisions.

syntax-table properties are in constant use in AWK Mode (also part of CC
Mode).  I've never felt they impacted the performance significantly, even
on my 166 MHz dinosaur.  But, then again, large AWK buffers are rare.  My
feel (and it's not more than that) is that ST properties will impact the
performance, but by an acceptable degree on a slow (< 200 MHz) machine,
and barely noticeably on a fast (> 1 GHz) machine.

>> > I have experimented in mmm-mode with using the syntax-table text
>> > property to make inactive overlays have specific properties to try
>> > to make indenting work better in multi-mode buffers.

>> You mean, you are setting the STTP to a harmless value (say the WS
>> code)?  In that case, how do you go about restoring the STTP value to
>> what the major mode might have set it to?

> I created a function to apply a syntax-table property to a set of
> regions like this:

[snipped ...]

OK.  This approach rules out the use of the syntax-table property by
major modes, if they are to be used in MMM Mode.  :-(
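
A minimal sketch of why: masking a region with a blanket syntax-table
property and later removing it does not bring back a value that a major
mode had placed there earlier.  The function name demo-mask-and-unmask
is made up for this illustration and only mirrors the add/remove calls
quoted earlier in the thread.

```elisp
;; Sketch only: `demo-mask-and-unmask' is an illustrative name.
(defun demo-mask-and-unmask (start stop)
  "Mask START..STOP with whitespace syntax, then unmask it.
Return the `syntax-table' property found at START afterwards."
  ;; Masking overwrites whatever value the property had before.
  (add-text-properties start stop
                       (list 'syntax-table (string-to-syntax " ")))
  ;; Unmasking removes the property outright; nothing is restored.
  (remove-text-properties start stop '(syntax-table nil))
  (get-text-property start 'syntax-table))
```

If a character carried, say, a generic string fence property before the
call, the function returns nil for it afterwards, so the mode's own
marking is gone.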

Maybe it would be possible to adapt the core to support several ST text
properties simultaneously (e.g. syntax-table, syntax-table-cc,
syntax-table-mason, ....), and to setq parse-sexp-lookup-properties to
one of these symbols rather than simply t.

>> > The real problem involves resolving the dichotomy between linear
>> > editing and the discontinuous nature of multi-mode files.  I don't
>> > really have a perfect solution right now.

>> It feels like we could do with some sort of support in the core for
>> multiple sections.  Something a bit like a region, or a narrowed
>> section, but independent of them, inside of which font-locking,
>> indentation and so on would be calculated.

> You seem to feel that some sort of "narrowing" function might work.
> However, in reality, when you look at multi-mode buffers as supported
> by mmm-mode, narrowing by itself does not provide enough context.
> Unfortunately, indentation engines and font-lock engines, at least as
> implemented by cc-mode, rely on a combination of syntax-table
> properties and regular expression searching to accomplish their tasks.

"Unfortunately"?  How else could CC Mode do it?

> For example, take a noweb file.  This consists of a literate program,

As an aside, could you explain what a "literate program" is, exactly?
What it's for, who uses it, and so on.

> actually a mixture of LaTeX and code in what Norman Ramsey calls
> "chunks".  Each documentation chunk describes ideas behind surrounding
> code chunks.  Part of the literate programming style involves the
> tangle and weave processes.  Tangling means reassembling the disjoint
> code chunks from a web file into a coherent whole for presentation to
> the compiler.  Weaving involves processing the entire web to markup
> sections appropriately, applying pretty-printing markup to the code and
> adjusting the web syntax markup to fit the documentation processor,
> such as LaTeX or HTML.

> While editing such buffers, you want to present different views of the
> buffer in different sections.  For instance, if you want to edit the
> documentation, then you want latex-mode to treat the code chunks as a
> single unmovable piece, essentially a syntactic word and not attempt
> to reformat it using its own peculiar ideas of paragraph formatting.

> Meanwhile, while editing code, you want cc-mode to completely ignore
> the documentation chunks.

How about setting something like c-doc-comment-start-regexp and so on to
something which would transform the documentation chunks into comments
(as far as CC Mode is concerned)?

> One of the most frustrating parts of this comes when cc-mode spots an
> unterminated apostrophe in a preceding documentation chunk and treats
> it as an unterminated string, thus completely screwing up the
> indentation of the code.

This is sort of semi-intentional in CC Mode, so that if you miss out a
required terminator (such as a " or a ; or a ') it fouls up the
indentation of the next line, thus drawing your attention to the error
before you get a compiler syntax message.  But what else could CC Mode
do?  C, C++, and friends are syntactically ghastly languages, and
analyzing them in the backwards direction (necessary for doing the
indentation) is even harder than in the forwards direction (like a
compiler does).

> Also, you may want to consider disjoint code sections as virtually
> appended to each other in order to carry forward indentation from one
> to the other.   Also, you may want to ignore other code sections
> depending on how related they are to each other, since the tangle
> process may move them around into different places, including
> separating them into completely different files (.c versus .h).

You actually need a computer to produce tangled code?  :-)

> I want to have "virtual views" of a buffer imposed upon major modes in
> order to restrict how far afield their regular expression linear
> searches can carry them.  Thus, I want to specify a set of regions
> which cc-mode can consider virtually catenated in order to restrict it
> to only looking at characters in that view.  Then it can use all of
> the regular-expression optimizations to supplement syntax-table
> properties it wants in order to work correctly in multi-mode buffers. 
> Something like narrow-to-region, but given a list of disjoint regions
> to narrow to.

Phew!  Just that, eh?  This certainly goes well beyond the boundaries of
anything CC Mode was ever intended to do.  It sounds to me rather like a
major enhancement to the Emacs core.  

To start the ball rolling, here are some ideas for functions which could
offer this sort of functionality:

(virtualize-to-regions '((1 . 2019) (3350 . 6003) (6252 . 6290)))
(virtual-widen)
(save-virtual-restriction ....)    ; like save-restriction.

These would manipulate "virtual views".  All standard emacs functions
would then work on the view as though it were a single contiguous buffer.
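
None of the three functions above exists in Emacs; they are proposals.
As a rough approximation of the idea with what is available today, the
disjoint regions can be copied into a temporary buffer and a function
run on the catenated text.  The name demo-call-on-regions is made up
for this sketch:

```elisp
;; Sketch only: `demo-call-on-regions' is an illustrative name, and it
;; works on a copy rather than on a true view of the buffer.
(defun demo-call-on-regions (regions fn)
  "Call FN in a temporary buffer holding REGIONS catenated.
REGIONS is a list of (START . END) pairs in the current buffer.
Return whatever FN returns."
  (let ((source (current-buffer)))
    (with-temp-buffer
      (dolist (region regions)
        (insert-buffer-substring source (car region) (cdr region)))
      (goto-char (point-min))
      (funcall fn))))
```

This only helps read-only analysis: positions in the copy differ from
those in the original, and edits made by FN are not carried back, which
is exactly the gap a real virtual view in the core would have to close.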

> /Joe

-- 
Alan Mackenzie (Munich, Germany)
Email: aacm@muuc.dee; to decode, wherever there is a repeated letter
(like "aa"), remove half of them (leaving, say, "a").

* Re: eLisp fontlock with mmm-mode
  2003-09-12 21:55           ` Alan Mackenzie
@ 2003-09-12 22:39             ` Stefan Monnier
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2003-09-12 22:39 UTC (permalink / raw)


> syntax-table properties are in constant use in AWK Mode (also part of CC
> Mode).  I've never felt they impacted the performance significantly, even
> on my 166 MHz dinosaur.  But, then again, large AWK buffers are rare.  My

They are also heavily used in CPerl-mode and also (tho less heavily) in many
other major modes.  The performance impact should indeed be small in
general.  In typical uses, the main performance impact is the time taken to
compute/add the properties themselves, not the time to look them up
during parsing.

> OK.  This approach rules out the use of the syntax-table property by
> major modes, if they are to be used in MMM Mode.  :-(

Unless the major mode uses font-lock to add those properties and mmm-mode is
careful to tell font-lock to re-adorn the syntax-table properties
when needed.

> Maybe it would be possible to adapt the core to support several ST text
> properties simultaneously (e.g. syntax-table, syntax-table-cc,
> syntax-table-mason, ....), and to setq parse-sexp-lookup-properties to
> one of these symbols rather than simply t.

It's probably easier to use overlays.

>> Unfortunately, indentation engines and font-lock engines, at least as
>> implemented by cc-mode, rely on a combination of syntax-table
>> properties and regular expression searching to accomplish their tasks.

> "Unfortunately"?  How else could CC Mode do it?

Using syntax-tables only.  But yes, that would be somewhere between
impossible and very painful.

>> For example, take a noweb file.  This consists of a literate program,

> As an aside, could you explain what a "literate program" is, exactly?
> What it's for, who uses it, and so on.

It is a coding style that thinks of a source file as a "description of the program,
interspersed with the actual code" rather than "the code interspersed with
comments".
You can typically run the file through TeX to get a beautifully typeset
description of your code, or you run it through some other filter to
extract the actual code and then compile it.

It might look like:

  ...
  \section{Parsing the input}
  To parse the input we define a \emph{function} \kw{foo}:
  \begin{code}
  void foo (int a, int b, char *c)
  {
     ...
  }
  \end{code}
  ...

So you need both latex-mode and c-mode in the same buffer.
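
As a minimal sketch of how the code chunks in such a file could be
located, the function below collects the text between each
\begin{code} line and the following \end{code} line.  The name
demo-code-chunk-regions is made up for this illustration, and the
delimiters are simply the ones used in the example above:

```elisp
;; Sketch only: `demo-code-chunk-regions' is an illustrative name.
(defun demo-code-chunk-regions ()
  "Return a list of (START . END) pairs, one per code chunk.
A chunk is the text between a \\begin{code} line and the next
\\end{code} line, with the delimiter lines themselves excluded."
  (let (regions)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "^[ \t]*\\\\begin{code}[ \t]*\n" nil t)
        (let ((start (point)))
          ;; An unterminated chunk is skipped rather than guessed at.
          (when (re-search-forward "^[ \t]*\\\\end{code}" nil t)
            (push (cons start (match-beginning 0)) regions)))))
    (nreverse regions)))
```

The resulting list of regions is the kind of information a multi-mode
package needs in order to decide where c-mode applies and where
latex-mode does.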

> analyzing them in the backwards direction (necessary for doing the
> indentation) is even harder than in the forwards direction (like a
> compiler does).

I see you've learned the secret ;-).


        Stefan
