From: Alan Mackenzie <acm@muc.de>
To: "Marshall, Simon" <simon.marshall@misys.com>,
	'Chong Yidong' <cyd@MIT.EDU>,
	Stefan Monnier <monnier@iro.umontreal.ca>
Cc: "'bug-cc-mode@gnu.org'" <bug-cc-mode@gnu.org>,
	"'emacs-devel@gnu.org'" <emacs-devel@gnu.org>
Subject: Re: Font-lock decides function call is function declaration in C+ + - embryonic solution.
Date: 23 Feb 2007 00:47:54 +0100
Message-ID: <20070223010309.GA3981@muc.de>
In-Reply-To: <81CCA6588E60BB42BE68BD029ED4826011AB3F79@wimex2.wim.midas-kapiti.com>

Hi, Simon, Chong, Stefan!

On Mon, Feb 05, 2007 at 04:46:32PM -0000, Marshall, Simon wrote:

[ .... ]

> 1.  The goal is to write the code snippet:

> int main() {
>   foo();
>   bar();
> }

> emacs -Q foo.cpp
> int SPC main() SPC { RET } RET C-p C-o bar();

> OK so far.  Now to insert the "foo();" line:

> C-a C-o foo

> At this point, "foo" is fontified as a type, and "bar" as a variable.  OK.
> Now:

> ()

> The fontification of "foo" and "bar" disappears.  OK.  Now complete the
> snippet:

> ;

> Now "foo" is fontified as a variable.  This is wrong.

I've got an embryonic solution for the problem.  The basic idea is to
create a before-change function which looks for certain `c-type'
properties in the vicinity of the change - these indicate that "foo" is a
member of `c-found-types'.  An after-change function can then remove
"foo" from this cache.
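
For readers unfamiliar with the hook pair, the handshake described above
can be sketched roughly as follows.  This is only an illustration, not
the patch: every `my-' name is hypothetical, a hash table stands in for
`c-found-types', and the "is this identifier suspect?" test is reduced
to "does a cached identifier end just before the change?".

```elisp
;; Hypothetical names throughout (my-...); this is not the cc-mode code.

(defvar my-type-cache (make-hash-table :test 'equal)
  "Toy stand-in for `c-found-types': identifiers believed to be types.")

(defvar-local my-suspect nil
  "Identifier noted before a change that may have stopped being a type.")

(defun my-before-change (beg _end)
  "Note the cached identifier ending just before BEG, if there is one."
  (setq my-suspect nil)
  (save-excursion
    (goto-char beg)
    (skip-chars-backward " \t")
    (let ((id-end (point)))
      (skip-syntax-backward "w_")
      (let ((id (buffer-substring-no-properties (point) id-end)))
        (when (gethash id my-type-cache)
          (setq my-suspect id))))))

(defun my-after-change (_beg _end _old-len)
  "Drop the identifier noted by `my-before-change' from the cache."
  (when my-suspect
    (remhash my-suspect my-type-cache)
    (setq my-suspect nil)))

(defun my-install-hooks ()
  "Install both functions buffer-locally in the current buffer."
  (add-hook 'before-change-functions #'my-before-change nil t)
  (add-hook 'after-change-functions #'my-after-change nil t))

;; Usage: turn "foo \n bar();" into "foo() \n bar();" and look at the cache.
(with-temp-buffer
  (insert "foo\nbar();")
  (puthash "foo" t my-type-cache)
  (my-install-hooks)
  (goto-char 4)                       ; just after "foo"
  (insert "()")
  (gethash "foo" my-type-cache))      ; "foo" should no longer be cached
```

The point of splitting the work is that only the before-change function
can see the text as it was, while the decision to evict is best taken
once the change has actually happened.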

To get an idea of what's going on, do
   M-: (c-list-found-types)
or
   M-: c-maybe-stale-found-type
The former lists the current contents of the `c-found-types' cache; the
latter is the data structure passed between the
{before,after}-change-functions.
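
As the patch context shows, the cache is an obarray driven with
`intern' and `unintern'.  A stand-alone sketch of that kind of cache,
with adding, removing and listing, might look like this.  The `my-'
names are hypothetical, and the old-style `make-vector' obarray is an
assumption chosen for compatibility with older Emacsen.

```elisp
;; Hypothetical stand-in for `c-found-types'; the my- names are illustrative.

(defvar my-found-types (make-vector 53 0)
  "Obarray whose interned symbol names are the cached type names.")

(defun my-add-type (name)
  "Record the string NAME as a type."
  (intern name my-found-types))

(defun my-unfind-type (name)
  "Remove the string NAME from the cache, if present."
  (unintern name my-found-types))

(defun my-list-found-types ()
  "Return the cached type names as a sorted list of strings."
  (let (names)
    (mapatoms (lambda (sym) (push (symbol-name sym) names))
              my-found-types)
    (sort names #'string-lessp)))

;; Usage: "foo" is found as a type, then ceases to be one.
(my-add-type "foo")
(my-add-type "bar")
(my-unfind-type "foo")
(my-list-found-types)                 ; only "bar" should remain
```

Removing a name that was never cached is harmless here, which is why
erring on the side of eviction is cheap.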

Don't get worried by the size/complexity of this new code.  The worst
thing it's going to do is wrongly take "foo" out of this cache - this
will slow Emacs, but won't cause it to crash.

THIS ISN'T PRODUCTION QUALITY CODE, or anywhere near it, so please don't
"debug" it or "tidy it up" for me!  I'm posting it mainly to give
credibility to the notion that I'm making headway with this problem.  In
particular, it only solves Simon's first bug recipe.  It doesn't yet
solve the second one (which will probably be quite easy to fix), and it
doesn't yet deal with template types in `c-found-types', or with
comments, strings, macros, narrowed regions, .....

With that said, here is the embryonic patch to cc-engine.el and
cc-mode.el.



*** cc-engine.220207.el	2007-02-03 00:17:53.000000000 +0000
--- cc-engine.el	2007-02-23 00:09:24.096985896 +0000
***************
*** 2491,2514 ****
    ;; Move to the beginning of the current token.  Do not move if not
    ;; in the middle of one.  BACK-LIMIT may be used to bound the
    ;; backward search; if given it's assumed to be at the boundary
!   ;; between two tokens.
    ;;
    ;; This function might do hidden buffer changes.
-   (if (looking-at "\\w\\|\\s_")
-       (skip-syntax-backward "w_" back-limit)
      (let ((start (point)))
!       (when (< (skip-syntax-backward ".()" back-limit) 0)
! 	(while (let ((pos (or (and (looking-at c-nonsymbol-token-regexp)
! 				   (match-end 0))
! 			      ;; `c-nonsymbol-token-regexp' should always match
! 			      ;; since we've skipped backward over punctuator
! 			      ;; or paren syntax, but consume one char in case
! 			      ;; it doesn't so that we don't leave point before
! 			      ;; some earlier incorrect token.
! 			      (1+ (point)))))
! 		 (if (<= pos start)
! 		     (goto-char pos))
! 		 (< pos start)))))))
  
  (defun c-end-of-current-token (&optional back-limit)
    ;; Move to the end of the current token.  Do not move if not in the
--- 2491,2515 ----
    ;; Move to the beginning of the current token.  Do not move if not
    ;; in the middle of one.  BACK-LIMIT may be used to bound the
    ;; backward search; if given it's assumed to be at the boundary
!   ;; between two tokens.  Return non-nil if point is moved, nil
!   ;; otherwise.
    ;;
    ;; This function might do hidden buffer changes.
      (let ((start (point)))
!       (if (looking-at "\\w\\|\\s_")
! 	  (skip-syntax-backward "w_" back-limit)
! 	(when (< (skip-syntax-backward ".()" back-limit) 0)
! 	  (while (let ((pos (or (and (looking-at c-nonsymbol-token-regexp)
! 				     (match-end 0))
! 				;; `c-nonsymbol-token-regexp' should always match
! 				;; since we've skipped backward over punctuator
! 				;; or paren syntax, but consume one char in case
! 				;; it doesn't so that we don't leave point before
! 				;; some earlier incorrect token.
! 				(1+ (point)))))
! 		   (if (<= pos start)
! 		       (goto-char pos))))))
!       (< (point) start)))
  
  (defun c-end-of-current-token (&optional back-limit)
    ;; Move to the end of the current token.  Do not move if not in the
***************
*** 3957,3962 ****
--- 3958,3966 ----
  ;; file, and we only use this as a last resort in ambiguous cases (see
  ;; `c-forward-decl-or-cast-1').
  ;;
+ ;; Not every type need be in this cache.  However, things which have
+ ;; ceased to be types must be removed from it.
+ ;;
  ;; Template types in C++ are added here too but with the template
  ;; arglist replaced with "<>" in references or "<" for the one in the
  ;; primary type.  E.g. the type "Foo<A,B>::Bar<C>" is stored as
***************
*** 3990,3995 ****
--- 3994,4003 ----
        (unintern (substring type 0 -1) c-found-types)
        (intern type c-found-types))))
  
+ (defsubst c-unfind-type (name)
+   ;; Remove the "NAME" from c-found-types, if present.
+   (unintern name c-found-types))
+ 
  (defsubst c-check-type (from to)
    ;; Return non-nil if the given region contains a type in
    ;; `c-found-types'.
***************
*** 4008,4013 ****
--- 4016,4038 ----
  	      c-found-types)
      (sort type-list 'string-lessp)))
  
+ (defun c-clean-found-types (beg end old-len)
+   ;; An after change function which, in conjunction with the info in
+   ;; c-maybe-stale-found-type (set in c-before-change), removes a type
+   ;; from `c-found-types', should this type have become stale.  For
+   ;; example, this happens to "foo" when "foo \n bar();" becomes
+   ;; "foo(); \n bar();".  Such stale types, if not removed, foul up
+   ;; the fontification.
+   (if c-maybe-stale-found-type ; e.g. (c-decl-id-start "foo" 97 107 " (* ooka) " "o")
+       (cond
+       ;; Some cases which don't disrupt the string: don't needlessly
+       ;; remove "foo"
+        (nil)				; code this up.  FIXME!!!
+        ((eq (car c-maybe-stale-found-type) 'c-decl-id-start)
+ 	(c-unfind-type (cadr c-maybe-stale-found-type))))))
+ 
+ 
+ 
  \f
  ;; Handling of small scale constructs like types and names.
  
*** cc-mode.220207.el	2007-01-01 21:18:31.000000000 +0000
--- cc-mode.el	2007-02-23 00:34:17.569943496 +0000
***************
*** 412,419 ****
  ;; temporary changes in some font lock support modes, causing extra
  ;; unnecessary work and font lock glitches due to interactions between
  ;; various text properties.
  
! (defun c-after-change (beg end len)
    ;; Function put on `after-change-functions' to adjust various caches
    ;; etc.  Prefer speed to finesse here, since there will be an order
    ;; of magnitude more calls to this function than any of the
--- 412,525 ----
  ;; temporary changes in some font lock support modes, causing extra
  ;; unnecessary work and font lock glitches due to interactions between
  ;; various text properties.
+ ;; 
+ ;; (2007-02-12): The macro `combine-after-change-calls' ISN'T used any
+ ;; more.
+ 
+ ;; c-maybe-stale-found-type records a place near the region being
+ ;; changed where an element of `c-found-types' might become stale.  It
+ ;; is set in c-before-change and is either nil, or has the form:
+ ;;
+ ;;   (c-decl-id-start "foo" 97 107 " (* ooka) " "o"), where
+ ;;
+ ;; o - `c-decl-id-start' is the c-type text property value at buffer
+ ;;   pos 96.
+ ;; o - "foo" is the identifier, if any, that this property is on.
+ ;; o - 97 107 is the region potentially containing the stale type -
+ ;;   this is delimited by a non-nil c-type text property at 96 and
+ ;;   either another one or a ";", "{", or "}" at 107.
+ ;;
+ ;; o - " (* ooka) " is the (before change) buffer portion containing
+ ;;   the suspect type (here "ooka").
+ ;;
+ ;; o - "o" is the buffer contents which is about to be deleted.  This
+ ;;   would be the empty string for an insertion.
+ 
+ (defvar c-maybe-stale-found-type nil)
+ (make-variable-buffer-local 'c-maybe-stale-found-type)
+ (defun c-before-change (beg end)
+   ;; Function to be put on `before-change-functions'.  Currently
+   ;; (2007-02) it is used only to remove stale entries from the
+   ;; `c-found-types' cache, and to record entries which a
+   ;; `c-after-change' function might confirm as stale.
+   ;; 
+   ;; Note that this function must be FAST rather than accurate.  Note
+   ;; also that it only has any effect when font locking is enabled.
+   ;; We exploit this by checking for font-lock-*-face instead of doing
+   ;; rigorous syntactic analysis.
+ 
+   ;; If either change boundary is wholly inside an identifier, delete
+   ;; it/them from the cache.  Don't worry about being inside a string
+   ;; or a comment - "wrongly" removing a symbol from `c-found-types'
+   ;; isn't critical.
+   (setq c-maybe-stale-found-type nil)
+   (save-excursion
+     ;; Are we inserting/deleting stuff in the middle of an identifier?
+     (cond
+      ((let (tok-beg tok-end)
+ 	(not (equal
+ 	      (mapcar
+ 	       (lambda (pos)
+ 		 (goto-char pos)
+ 		 (setq tok-beg (and (c-beginning-of-current-token) (point)))
+ 		 (goto-char pos)
+ 		 (setq tok-end (and (c-end-of-current-token) (point)))
+ 		 (if (and tok-beg tok-end)
+ 		     (c-unfind-type (buffer-substring-no-properties tok-beg tok-end))))
+ 	       `(,beg ,end))
+ 	      '(nil nil)))))
+ 
+     ;; Are we (potentially) disrupting the syntactic context which
+     ;; makes a type a type?  E.g. by inserting stuff after "foo" in
+     ;; "foo bar;", or before "foo" in "typedef foo *bar;"?
+     ;;
+     ;; We search for appropriate c-type properties "near" the change.
+     ;; First, find an appropriate boundary for this property search.
+      ((let (lim
+ 	   type type-pos
+ 	   marked-id term-pos
+ 	   (end1
+ 	    (if (eq (get-text-property end 'face) 'font-lock-comment-face)
+ 		(previous-single-property-change end 'face)
+ 	      end)))
+        (when (>= end1 beg) ; Don't hassle about changes entirely in comments.
+ 	 (skip-chars-backward "^;{}") ; FIXME!!!  loop for comment, maybe
+ 	 (setq lim (max (point-min) (1- (point))))
+ 	 (when (and (> end1 1)
+ 		    (setq type-pos
+ 			  (if (get-text-property (1- end1) 'c-type)
+ 			      end1
+ 			    (previous-single-property-change end1 'c-type nil lim))))
+ 	   (setq type (get-text-property (max (1- type-pos) lim) 'c-type))
+ 	   (cond
+ 	    ((memq type '(c-decl-id-start c-decl-type-start))
+ 	     ;; Get the identifier, if any, that the property is on.
+ 	     (goto-char (1- type-pos))
+ 	     (setq marked-id
+ 		   (when (looking-at "\\(\\sw\\|\\s_\\)")
+ 		     (c-beginning-of-current-token)
+ 		     (buffer-substring-no-properties (point) type-pos)))
+ 
+ 	     (goto-char end1)
+ 	     (skip-chars-forward "^;{}") ; FIXME!!!  loop for comment, maybe
+ 	     (setq lim (point))
+ 	     (setq term-pos
+ 		   (or (next-single-property-change end 'c-type nil lim) lim))
+ 	     (setq c-maybe-stale-found-type
+ 		   (list type marked-id
+ 			 type-pos term-pos
+ 			 (buffer-substring-no-properties type-pos term-pos)
+ 			 (buffer-substring-no-properties beg end))))
+ 
+ 	    ;; 	   ((eq type 'c-decl-type-start)
+ 	    ;; 	    (################
+ 
+ 	    (type (message "Unhandled c-type at %s" type-pos)))
+ 
+ 	   )))))))
+   
  
! (defun c-after-change (beg end old-len)
    ;; Function put on `after-change-functions' to adjust various caches
    ;; etc.  Prefer speed to finesse here, since there will be an order
    ;; of magnitude more calls to this function than any of the
***************
*** 441,446 ****
--- 547,553 ----
  	  (when (> beg end)
  	    (setq beg end)))
  
+ 	(c-clean-found-types beg end old-len) ; maybe we don't need all of these.
  	(c-invalidate-sws-region-after beg end)
  	(c-invalidate-state-cache beg)
  	(c-invalidate-find-decl-cache beg)
***************
*** 577,582 ****
--- 684,691 ----
  
    ;; Install the functions that ensure that various internal caches
    ;; don't become invalid due to buffer changes.
+   (make-local-hook 'before-change-functions)
+   (add-hook 'before-change-functions 'c-before-change nil t)
    (make-local-hook 'after-change-functions)
    (add-hook 'after-change-functions 'c-after-change nil t))



-- 
Alan Mackenzie (Ittersbach, Germany).
