From: Alan Mackenzie
Newsgroups: gmane.emacs.bugs
Subject: bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
Date: Sun, 27 Jan 2013 18:59:06 +0000
Message-ID: <20130127185906.GA16161@acm.acm>
References: <20130125175057.GA3345@acm.acm>
To: Leo Liu
Cc: 13541@debbugs.gnu.org

Hi, Leo.

On Sat, Jan 26, 2013 at 07:14:49PM +0800, Leo Liu wrote:
> On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> > Could you please try out, fairly thoroughly, the following patch, and let
> > me know how it goes.  It aims to fontify a /regexp/ wherever one might
> > occur.

> The second regexp is not font-locked in this case:
>
>   /a/ { print /abc/ }

Yes, thanks for spotting this.  The situation was more complicated than I
thought.  I think this replacement patch fixes that case (together with a
few others).  Would you try it out again, please.



=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el	2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el	2013-01-27 18:23:59 +0000
***************
*** 127,148 ****
  ;; escaped EOL.
  
  ;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;; Matches any character but a _, #, /, ", \, or newline.  N.B. _" starts a
- ;; localization string in gawk 3.1
  (defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
  ;; Matches an underline NOT followed by ".
  (defconst c-awk-harmless-string*-re
    (concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
  (defconst c-awk-harmless-string*-here-re
    (concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
  (defconst c-awk-harmless-line-re
!   (concat c-awk-harmless-string*-re
!           "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
  ;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
  ;; " or /.  "logical" means "possibly containing escaped newlines".  A comment
  ;; is matched as part of the line even if it contains a " or a /.  The End of
--- 127,155 ----
  ;; escaped EOL.
  
  ;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
  (defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
  ;; Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
+ ;; Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
  (defconst c-awk-harmless-string*-re
    (concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
  (defconst c-awk-harmless-string*-here-re
    (concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
! 
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;; Matches any character but a _, #, /, ", \, or newline.  N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
!   (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
  (defconst c-awk-harmless-line-re
!   (concat c-awk-harmless-line-string*-re
!           "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
  ;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
  ;; " or /.  "logical" means "possibly containing escaped newlines".  A comment
  ;; is matched as part of the line even if it contains a " or a /.  The End of
***************
*** 211,217 ****
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
  ;; A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
--- 218,224 ----
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
  ;; A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
***************
*** 231,238 ****
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\(&=:!><,?;'~|]")
  ;; Matches an opening BRAcket, round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
  ;; and "--".
--- 238,245 ----
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\({&=:!><,?;'~|]")
  ;; Matches an opening BRAcket, round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
  ;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
  ;; bracket, in a context where an immediate / would be a division sign.  This
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+   (concat "\\(^\\|[^_\n\r]\\)\\<"
+           (regexp-opt '("print" "return" "case") t)
+           "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+   (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
  
  ;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
  (defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
!             (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
!           (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
!         ;; A division sign.
          (progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
!             (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
!           (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
!                (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
!         ;; A division sign.
          (progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
                      (< (point) lim))
          (setq anchor (point))
          (search-forward-regexp c-awk-harmless-string*-here-re nil t)
!         ;; We are now looking at either a " or a /.
!         ;; Do our thing on the string, regexp or division sign.
          (setq anchor-state-/div
!               (if (looking-at "_?\"")
!                   (c-awk-syntax-tablify-string)
!                 (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
        nil))
  
  ;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
                      (< (point) lim))
          (setq anchor (point))
          (search-forward-regexp c-awk-harmless-string*-here-re nil t)
!         ;; We are now looking at either a " or a / or a brace/paren/semicolon.
!         ;; Do our thing on the string, regexp or division sign or update our state.
          (setq anchor-state-/div
!               (cond
!                ((looking-at "_?\"")
!                 (c-awk-syntax-tablify-string))
!                ((eq (char-after) ?/)
!                 (c-awk-syntax-tablify-/ anchor anchor-state-/div))
!                ((memq (char-after) '(?{ ?} ?\( ?\;))
!                 (forward-char)
!                 nil)
!                (t                       ; ?\)
!                 (forward-char)
!                 t))))
        nil))
  
  ;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set


> Leo

--
Alan Mackenzie (Nuremberg, Germany).
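
For anyone trying the patch out, here is a minimal sketch (not part of the
patch) of the keyword check it adds: a / that directly follows print, return
or case is taken to open a regexp rather than to be a division sign.  The
names my-awk-kwd-before-slash-re and my-awk-slash-after-keyword-p are
hypothetical, and the [ \t]* part is an assumed simplification standing in
for c-awk-neutrals*-re, so this only approximates what cc-awk.el does.

```elisp
;; Illustrative sketch only, not part of the patch.  Both names below are
;; hypothetical; [ \t]* is an assumed simplification of c-awk-neutrals*-re.
(defconst my-awk-kwd-before-slash-re
  (concat "\\(^\\|[^_\n\r]\\)\\<"
          (regexp-opt '("print" "return" "case") t)
          "\\>[ \t]*/")
  "Match a keyword that can precede an expression, then blanks and a /.")

(defun my-awk-slash-after-keyword-p (line)
  "Return t if LINE contains a / directly after print, return or case."
  (and (string-match-p my-awk-kwd-before-slash-re line) t))

;; The case from the bug report: the / after `print' opens a regexp.
(my-awk-slash-after-keyword-p "/a/ { print /abc/ }")   ; => t

;; No keyword precedes the /, so this check does not fire.
(my-awk-slash-after-keyword-p "x = total / count")     ; => nil
```

Evaluating the two calls in a scratch buffer gives a quick feel for which
lines the new keyword test would catch before the division-sign test gets a
chance to run.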