unofficial mirror of emacs-devel@gnu.org 
* cc-mode help - c-basic-matchers-after
From: D Chiesa @ 2010-04-24 13:37 UTC (permalink / raw)
  To: emacs-devel

Hi,

I'm working on improving the fontification of C# in csharp-mode, which
depends on cc-mode.

In most cases in cc-mode, references to the c-lang-const symbols are made
through a variable of the same name, rather than directly through the
(c-lang-const ...) macro.  The pattern for defining the variable looks like
this:

    (c-lang-defconst c-brace-list-key
      ;; Regexp matching the start of declarations where the following
      ;; block is a brace list.
      t (c-make-keywords-re t (c-lang-const c-brace-list-decl-kwds)))
    (c-lang-defvar c-brace-list-key (c-lang-const c-brace-list-key))

And then the code in cc-mode references the value either via the variable, 
or (c-lang-const ...).  This is nice because it offers the chance for a 
cc-mode language to set its own regex into that symbol, and that regex may 
or may not be the result of a simple call to c-make-keywords-re.

But, in some cases this approach is not used faithfully.  One case in 
particular causes problems for fontification of C#.

In c-basic-matchers-after (defined in cc-fonts.el), the first entry deals
with identifiers inside enum lists, and hard-codes the regex used to
recognize brace lists.  See the call to concat below, and specifically the
comment that begins with "Disallow".

     (c-lang-defconst c-basic-matchers-after
       "Font lock matchers for various things that should be fontified after
     generic casts and declarations are fontified.  Used on level 2 and
     higher."

       t `(;; Fontify the identifiers inside enum lists.  (The enum type
           ;; name is handled by `c-simple-decl-matchers' or
           ;; `c-complex-decl-matchers' below.
           ,@(when (c-lang-const c-brace-id-list-kwds)
               `((,(c-make-font-lock-search-function
                    (concat
                     "\\<\\("
                     (c-make-keywords-re nil
                       (c-lang-const c-brace-id-list-kwds))
                     "\\)\\>"
                     ;; Disallow various common punctuation chars that
                     ;; can't come before the '{' of the enum list, to
                     ;; avoid searching too far.
                     "[^\]\[{}();,/#=]*"
                     "{")
                    '((c-font-lock-declarators limit t nil)
                      (save-match-data
                        (goto-char (match-end 0))
                        (c-put-char-property (1- (point)) 'c-type
                                             'c-decl-id-start)
                        (c-forward-syntactic-ws))
                      (goto-char (match-end 0)))))))

This works in many languages, but it does not work in C#, specifically for 
the case of object initializers, which take this form:

     var x = new MyType(arg1, arg2, ...) {
         Field1 = "foo",
         Field2 = "bar",
     };

This syntax creates a new instance using the given constructor, and then 
sets public fields or properties on that instance to the given values.
When I say "it does not work", I mean that the regex in the matcher
doesn't match, so the c-type char property value c-decl-id-start is not
applied to the open curly.  As a result, the assignment statements inside
the curlies are not fontified properly.

C# 3.0 also allows this simpler syntax:

     var x = new MyType {
         Field1 = "foo",
         Field2 = "bar",
     };

...which invokes the default constructor, and then performs the assignments.
This syntax is fontified correctly.  The difference is the absence of the
(), which the hard-coded regex disallows.
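
To make the difference concrete, here is a minimal sketch that applies the
same regex shape to both forms.  It assumes "new" is one of the keywords in
c-brace-id-list-kwds for C#, and substitutes a literal for the output of
c-make-keywords-re; my-enum-style-match is a hypothetical helper name, not
part of cc-mode.

```elisp
;; Hypothetical helper, for illustration only.  The literal "new" stands
;; in for (c-make-keywords-re nil (c-lang-const c-brace-id-list-kwds)).
(defun my-enum-style-match (line)
  "Return the index where the enum-style brace regexp matches LINE, or nil."
  (let ((re (concat "\\<\\(new\\)\\>"
                    ;; Same character class as in cc-fonts.el, written
                    ;; without the redundant backslashes.
                    "[^][{}();,/#=]*"
                    "{")))
    (string-match-p re line)))

(my-enum-style-match "var x = new MyType {")              ; => 8
(my-enum-style-match "var x = new MyType(arg1, arg2) {")  ; => nil
```

The second call fails because the character class cannot step over the
"(" that follows the type name, so the "{" is never reached.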

The difference is shown here:
http://i40.tinypic.com/29qo0go.jpg

What I'd like is for the regex in c-basic-matchers-after to be a pure
c-lang-const.  Rather than building that regex deep inside the matcher,
where it stipulates that () is disallowed in that context, the matcher
could refer to an unadorned c-lang-const.  Then any mode that depends on
cc-mode could set the appropriate regex for the matcher in its own
c-lang-defconst.

In other words, change the code for c-basic-matchers-after to

       t `(;; Fontify the identifiers inside enum lists.  (The enum type
           ;; name is handled by `c-simple-decl-matchers' or
           ;; `c-complex-decl-matchers' below.
           ,@(when (c-lang-const c-brace-id-list-kwds)
               `((,(c-make-font-lock-search-function
                    (c-lang-const c-brace-id-list-beginning-re)
                    '((c-font-lock-declarators limit t nil)


and introduce c-brace-id-list-beginning-re as:

    (c-lang-defconst c-brace-id-list-beginning-re
      ;; Regexp matching the start of a brace list, including the opening
      ;; brace.
      t (concat
         "\\<\\("
         (c-make-keywords-re nil (c-lang-const c-brace-id-list-kwds))
         "\\)\\>"
         ;; Disallow various common punctuation chars that can't come
         ;; before the '{' of the enum list, to avoid searching too far.
         "[^\]\[{}();,/#=]*"
         "{"))

Have I understood this properly?  Does this request make sense?


-Dino Chiesa

* Re: cc-mode help - c-basic-matchers-after
From: D Chiesa @ 2010-04-24 14:35 UTC (permalink / raw)
  To: emacs-devel

Actually, thinking about this a bit further: although I think that change
might be sensible in theory, it won't satisfy the need presented by C#.
Even if I had a way to plug a C# regex into that matcher, I don't think
there's a way to construct a single, simple regex in Emacs that will handle
the general case of C# syntax.

Remember, the case I'm trying to fontify is the object initializer that 
follows a type constructor.  In C#, the arglist passed to the type 
constructor may have arbitrarily deep nesting of parens and braces.  Like 
this:

    var x = new MyType( new Foo { V1 = 7, V2 = 54},  new Bar(true) )
    {
        Field1 = "foo"
    };

The current implementation of c-basic-matchers-after rejects this because it
hard-codes a regex that disallows parens and braces.  But as far as I know,
there's no way in Emacs to construct a regex that matches balanced parens.
Correct me if I'm wrong on that.  If I'm right, a simple regex won't
suffice, and the matcher needs to be more intelligent to handle that case.
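
For what it's worth, here is a rough sketch of what "more intelligent"
might mean: use a regex only to find the keyword and type name, then let
the sexp scanner skip the balanced argument list.  This is not how cc-mode
does it; my-object-initializer-brace is a hypothetical helper that works
on a plain string with the standard syntax table, and it ignores comments,
generics, and "new" appearing inside string literals.

```elisp
;; Hypothetical helper, for illustration only; not part of cc-mode.
(defun my-object-initializer-brace (text)
  "Return the 1-based position in TEXT of the `{' that follows
`new TYPE' or `new TYPE(ARGS)', or nil if there is none."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward
           "\\<new\\>[ \t\n]+[[:alnum:]_.]+[ \t\n]*" nil t)
      ;; Skip one balanced argument list, however deeply nested.
      (when (eq (char-after) ?\()
        (condition-case nil
            (forward-sexp)
          ;; Unbalanced parens: give up by moving to the end.
          (scan-error (goto-char (point-max)))))
      (skip-chars-forward " \t\n")
      (and (eq (char-after) ?{) (point)))))

(my-object-initializer-brace
 "var x = new MyType( new Foo { V1 = 7 }, new Bar(true) )\n{\n};")
```

A real matcher would have to do this at the buffer position font-lock is
working on, and respect the search limit, rather than copying text into a
temporary buffer.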

That leaves me without a solid justification for the change I suggested 
previously.

So this is all for naught.  OK, back to your regularly scheduled
programming...

-Dino Chiesa





