unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
@ 2013-01-24 11:43 Leo Liu
  2013-01-24 18:28 ` Glenn Morris
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-24 11:43 UTC
  To: 13541; +Cc: bug-cc-mode

[-- Attachment #1: Type: text/plain, Size: 214 bytes --]

In an awk buffer having the following text:

#--BEGIN--
NF { /xyz/ }

NF {
    /xyz/
}
#--END--

I have the second regexp properly font-locked but not the first one.

(tested in GNU Emacs 24.2.92.1 of 2013-01-13)


[-- Attachment #2: awk-mode-bug.png --]
[-- Type: image/png, Size: 6820 bytes --]

[-- Attachment #3: Type: text/plain, Size: 5 bytes --]


Leo


* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
@ 2013-01-24 18:28 ` Glenn Morris
       [not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Glenn Morris @ 2013-01-24 18:28 UTC
  To: Leo Liu; +Cc: 13541

Leo Liu wrote:

> In an awk buffer having the following text:
>
> #--BEGIN--
> NF { /xyz/ }
>
> NF {
>     /xyz/
> }
> #--END--
>
> I have the second regexp properly font-locked but not the first one.

Do you have an example of an actual useful awk script showing the issue,
because this one seems like a pointless no-op?






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
       [not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
@ 2013-01-24 22:16   ` Alan Mackenzie
  2013-01-25  1:20     ` Leo Liu
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-24 22:16 UTC
  To: Glenn Morris; +Cc: sdl.web, 13541

Hi, Glenn,

On Thu, Jan 24, 2013 at 01:28:33PM -0500, Glenn Morris wrote:
> Leo Liu wrote:

> > In an awk buffer having the following text:

> > #--BEGIN--
> > NF { /xyz/ }

> > NF {
> >     /xyz/
> > }
> > #--END--

> > I have the second regexp properly font-locked but not the first one.

> Do you have an example of an actual useful awk script showing the issue,
> because this one seems like a pointless no-op?

This is a real bug, perhaps not a difficult one.  "/regexp/" is an
expression with value 1 iff the current input line matches the regexp.
So a line like

    NF { print /xyz/ }

is perfectly legitimate, printing 1 if there's an "xyz" on the line.
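
The claim above can be checked from a shell. This is a minimal sketch, assuming a POSIX awk is on the PATH; the sample input lines are invented for illustration and are not from the report.

```shell
# Each non-empty input line prints 1 if it contains "xyz", else 0.
# The trailing empty line is skipped because NF is 0 there.
out=$(printf 'abc xyz\nfoo bar\n\n' | awk 'NF { print /xyz/ }')
printf '%s\n' "$out"
```

This prints 1 for the first line and 0 for the second, so a /regexp/ in statement position is an ordinary expression and deserves the same fontification as one in pattern position.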

I'm looking at this bug at the moment.

-- 
Alan Mackenzie (Nuremberg, Germany).






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-24 22:16   ` Alan Mackenzie
@ 2013-01-25  1:20     ` Leo Liu
  2013-01-25  1:33       ` Glenn Morris
  2013-01-25  8:44       ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
  0 siblings, 2 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-25  1:20 UTC
  To: Alan Mackenzie; +Cc: 13541

On 2013-01-25 06:16 +0800, Alan Mackenzie wrote:
> This is a real bug, perhaps not a difficult one.  "/regexp/" is an
> expression with value 1 iff the current input line matches the regexp.
> So a line like
>
>     NF { print /xyz/ }
>
> is perfectly legitimate, printing 1 if there's an "xyz" on the line.
>
> I'm looking at this bug at the moment.

Thanks to all for chiming in.

Alan, I also have another seemingly buglet about indentation.

Every line after a pattern-action pair like the following one (where
action is omitted) is indented to column 4, i.e. it doesn't recognise a
newline terminates a pattern.

$0 == "Emacs"
    |
    all following lines indented here
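
For reference, the AWK behaviour the indentation engine has to model is that a pattern with no action is a complete rule, ended by the newline, whose default action is to print the record. A minimal sketch, assuming a POSIX awk; the second rule and the input lines are invented for illustration.

```shell
# Two rules separated only by a newline. The first has no action,
# so awk applies the default action (print the record).
out=$(printf 'Emacs\nvi\nEmacs\n' | awk '
$0 == "Emacs"
$0 == "vi" { print "other" }
')
printf '%s\n' "$out"
```

The second rule is parsed as a new top-level rule, not as a continuation of the first, so it should be indented at column 0.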

(this might be regression, I seem to recall reporting something along
these lines some while ago.)

Leo






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-25  1:20     ` Leo Liu
@ 2013-01-25  1:33       ` Glenn Morris
  2013-01-25  1:44         ` Glenn Morris
  2013-01-25  8:44       ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
  1 sibling, 1 reply; 18+ messages in thread
From: Glenn Morris @ 2013-01-25  1:33 UTC
  To: Leo Liu; +Cc: Alan Mackenzie, 13541

Leo Liu wrote:

> (this might be regression, I seem to recall reporting something along
> these lines some while ago.)

No, it is the never addressed

http://debbugs.gnu.org/12274






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-25  1:33       ` Glenn Morris
@ 2013-01-25  1:44         ` Glenn Morris
  2013-01-25 21:32           ` Richard Stallman
  0 siblings, 1 reply; 18+ messages in thread
From: Glenn Morris @ 2013-01-25  1:44 UTC
  To: Leo Liu; +Cc: Alan Mackenzie, 13541


> Leo Liu wrote:
>
>> (this might be regression,

PS henceforth it is prohibited to use the word "regression" except in
the form "this is a regression against Emacs XX.YY, where it works as
desired".






* bug#12274: 24.2; awk-mode indentation failure
  2013-01-25  1:20     ` Leo Liu
  2013-01-25  1:33       ` Glenn Morris
@ 2013-01-25  8:44       ` Alan Mackenzie
  2013-01-25 12:58         ` Stefan Monnier
                           ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-25  8:44 UTC
  To: Leo Liu; +Cc: 12274

Hello, Leo.

On Fri, Jan 25, 2013 at 09:20:19AM +0800, Leo Liu wrote:

[ .... ]

> Alan, I also have another seemingly buglet about indentation.

As Glenn remarked, this is bug #12274.

> Every line after a pattern-action pair like the following one (where
> action is omitted) is indented to column 4, i.e. it doesn't recognise a
> newline terminates a pattern.

> $0 == "Emacs"
>     |
>     all following lines indented here

> (this might be regression, I seem to recall reporting something along
> these lines some while ago.)

No, not a regression, rather a bug which has been there since 4004 BC.
It's actually the "=" sign which triggers it, confusing the parsing
algorithm into thinking it's a C initialisation statement.

The solution is to move the pertinent AWK parsing clause earlier on in
the enclosing cond form.
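
Why clause order matters can be shown with a toy analogy. This is only a sketch in shell, not the code in cc-engine.el: the function names and the two conditions are invented stand-ins, chosen to mimic a cond form where the first matching clause wins.

```shell
# Toy model only: these conditions are hypothetical stand-ins,
# not the real tests performed by cc-engine.el.
classify_old_order() {   # $1 = major mode, $2 = source line
  case "$2" in
    *=*) echo "initialisation"; return ;;   # general clause tried first
  esac
  if [ "$1" = awk ]; then echo "topmost-intro"; return; fi
  echo "other"
}

classify_new_order() {   # same clauses, AWK clause moved earlier
  if [ "$1" = awk ]; then echo "topmost-intro"; return; fi
  case "$2" in
    *=*) echo "initialisation"; return ;;
  esac
  echo "other"
}

line='$0 == "Emacs"'
old=$(classify_old_order awk "$line")
new=$(classify_new_order awk "$line")
printf 'old order: %s\nnew order: %s\n' "$old" "$new"
```

With the general clause first, the "=" in the AWK pattern is claimed before the AWK clause is ever reached; with the AWK clause first, the line is classified as a top-level construct.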

Glenn, this is not a regression.  Should I nevertheless commit it to the
emacs-24 branch?

Here's the patch:


diff -r 0d641a4d3e7c cc-engine.el
--- a/cc-engine.el	Wed Jan 23 18:17:40 2013 +0000
+++ b/cc-engine.el	Fri Jan 25 08:27:12 2013 +0000
@@ -9880,6 +9880,18 @@
 	    ;; contains any class offset
 	    )))
 
+	 ;; CASE 5P: AWK pattern or function or continuation
+	 ;; thereof.
+	 ((c-major-mode-is 'awk-mode)
+	  (setq placeholder (point))
+	  (c-add-stmt-syntax
+	   (if (and (eq (c-beginning-of-statement-1) 'same)
+		    (/= (point) placeholder))
+	       'topmost-intro-cont
+	     'topmost-intro)
+	   nil nil
+	   containing-sexp paren-state))
+
 	 ;; CASE 5D: this could be a top-level initialization, a
 	 ;; member init list continuation, or a template argument
 	 ;; list continuation.
@@ -10039,18 +10051,6 @@
 	      (goto-char (point-min)))
 	  (c-add-syntax 'objc-method-intro (c-point 'boi)))
 
-	 ;; CASE 5P: AWK pattern or function or continuation
-	 ;; thereof.
-	 ((c-major-mode-is 'awk-mode)
-	  (setq placeholder (point))
-	  (c-add-stmt-syntax
-	   (if (and (eq (c-beginning-of-statement-1) 'same)
-		    (/= (point) placeholder))
-	       'topmost-intro-cont
-	     'topmost-intro)
-	   nil nil
-	   containing-sexp paren-state))
-
 	 ;; CASE 5N: At a variable declaration that follows a class
 	 ;; definition or some other block declaration that doesn't
 	 ;; end at the closing '}'.  C.f. case 5D.5.

> Leo

-- 
Alan Mackenzie (Nuremberg, Germany).






* bug#12274: 24.2; awk-mode indentation failure
  2013-01-25  8:44       ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
@ 2013-01-25 12:58         ` Stefan Monnier
  2013-01-25 17:33         ` Glenn Morris
  2013-01-25 19:17         ` Alan Mackenzie
  2 siblings, 0 replies; 18+ messages in thread
From: Stefan Monnier @ 2013-01-25 12:58 UTC
  To: Alan Mackenzie; +Cc: Leo Liu, 12274

> Glenn, this is not a regression.  Should I nevertheless commit it to the
> emacs-24 branch?

No, this should go to trunk,


        Stefan







* bug#12274: 24.2; awk-mode indentation failure
  2013-01-25  8:44       ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
  2013-01-25 12:58         ` Stefan Monnier
@ 2013-01-25 17:33         ` Glenn Morris
  2013-01-25 19:17         ` Alan Mackenzie
  2 siblings, 0 replies; 18+ messages in thread
From: Glenn Morris @ 2013-01-25 17:33 UTC
  To: Alan Mackenzie; +Cc: Leo Liu, 12274

Alan Mackenzie wrote:

> Glenn, this is not a regression.  Should I nevertheless commit it to the
> emacs-24 branch?

No, trunk please.






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
  2013-01-24 18:28 ` Glenn Morris
       [not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
@ 2013-01-25 17:50 ` Alan Mackenzie
  2013-01-26 11:14   ` Leo Liu
  2013-01-29 20:58 ` Alan Mackenzie
  3 siblings, 1 reply; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-25 17:50 UTC
  To: Leo Liu; +Cc: 13541

Hi, Leo.

On Thu, Jan 24, 2013 at 07:43:06PM +0800, Leo Liu wrote:
> In an awk buffer having the following text:

> #--BEGIN--
> NF { /xyz/ }

> NF {
>     /xyz/
> }
> #--END--

> I have the second regexp properly font-locked but not the first one.

Yes.

Could you please try out, fairly thoroughly, the following patch, and let
me know how it goes.  It aims to fontify a /regexp/ wherever one might
occur.




=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el	2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el	2013-01-25 17:47:38 +0000
***************
*** 211,217 ****
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
  ;;   A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
--- 211,217 ----
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
  ;;   A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
***************
*** 231,237 ****
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\(&=:!><,?;'~|]")
  ;;   Matches an opening BRAcket, round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
--- 231,237 ----
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\({&=:!><,?;'~|]")
  ;;   Matches an opening BRAcket, round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
***************
*** 242,247 ****
--- 242,257 ----
  ;; bracket, in a context where an immediate / would be a division sign.  This
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+   (concat "\\(^\\|[^_\n\r]\\)\\<"
+ 	  (regexp-opt '("print" "return" "case") t)
+ 	  "\\>\\([^_\n\r]\\|$\\)"))
+ ;;   Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+   (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;;   Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
  
  ;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
  (defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
!             (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
!           (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
!         ;; A division sign.
  	(progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
--- 731,740 ----
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
! 	    (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! 	  (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! 	       (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! 	;; A division sign.
  	(progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.


> Leo

-- 
Alan Mackenzie (Nuremberg, Germany).






* bug#12274: 24.2; awk-mode indentation failure
  2013-01-25  8:44       ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
  2013-01-25 12:58         ` Stefan Monnier
  2013-01-25 17:33         ` Glenn Morris
@ 2013-01-25 19:17         ` Alan Mackenzie
  2 siblings, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-25 19:17 UTC
  To: 12274-done

Bug fixed.

-- 
Alan Mackenzie (Nuremberg, Germany).






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-25  1:44         ` Glenn Morris
@ 2013-01-25 21:32           ` Richard Stallman
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Stallman @ 2013-01-25 21:32 UTC
  To: Glenn Morris; +Cc: acm, sdl.web, 13541

If a nasty and cruel bug appears just before the release, is that
"regression to the mean"?

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use Ekiga or an ordinary phone call







* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-25 17:50 ` bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Alan Mackenzie
@ 2013-01-26 11:14   ` Leo Liu
  2013-01-27 18:59     ` Alan Mackenzie
       [not found]     ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
  0 siblings, 2 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-26 11:14 UTC
  To: Alan Mackenzie; +Cc: 13541

On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> Could you please try out, fairly thoroughly, the following patch, and let
> me know how it goes.  It aims to fontify a /regexp/ wherever one might
> occur.

The second regexp is not font-locked in this case:

/a/ { print /abc/ }

Leo






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-26 11:14   ` Leo Liu
@ 2013-01-27 18:59     ` Alan Mackenzie
       [not found]     ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
  1 sibling, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-27 18:59 UTC
  To: Leo Liu; +Cc: 13541

Hi, Leo.

On Sat, Jan 26, 2013 at 07:14:49PM +0800, Leo Liu wrote:
> On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> > Could you please try out, fairly thoroughly, the following patch, and let
> > me know how it goes.  It aims to fontify a /regexp/ wherever one might
> > occur.

> The second regexp is not font-locked in this case:

> /a/ { print /abc/ }

Yes, thanks for spotting this.  The situation was more complicated than I
thought.  I think this replacement patch fixes that case (together with a
few others).  Would you try it out again, please.



=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el	2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el	2013-01-27 18:23:59 +0000
***************
*** 127,148 ****
  ;; escaped EOL.
  
  ;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;;   Matches any character but a _, #, /, ", \, or newline.  N.B. _" starts a
- ;; localization string in gawk 3.1
  (defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
  ;;   Matches an underline NOT followed by ".
  (defconst c-awk-harmless-string*-re
    (concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;;   Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
  (defconst c-awk-harmless-string*-here-re
    (concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
  (defconst c-awk-harmless-line-re
!   (concat c-awk-harmless-string*-re
!           "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
  ;;   Matches (the tail of) an AWK \"logical\" line not containing an unescaped
  ;; " or /.  "logical" means "possibly containing escaped newlines".  A comment
  ;; is matched as part of the line even if it contains a " or a /.  The End of
--- 127,155 ----
  ;; escaped EOL.
  
  ;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
  (defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
  ;;   Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
+ ;;   Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
  (defconst c-awk-harmless-string*-re
    (concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;;   Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
  (defconst c-awk-harmless-string*-here-re
    (concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
! 
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;;   Matches any character but a _, #, /, ", \, or newline.  N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
!   (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;;   Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
  (defconst c-awk-harmless-line-re
!   (concat c-awk-harmless-line-string*-re
! 	  "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
  ;;   Matches (the tail of) an AWK \"logical\" line not containing an unescaped
  ;; " or /.  "logical" means "possibly containing escaped newlines".  A comment
  ;; is matched as part of the line even if it contains a " or a /.  The End of
***************
*** 211,217 ****
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
  ;;   A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
--- 218,224 ----
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
  ;;   A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
***************
*** 231,238 ****
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\(&=:!><,?;'~|]")
! ;;   Matches an opening BRAcket, round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
  ;; and "--".
--- 238,245 ----
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\({&=:!><,?;'~|]")
! ;;   Matches an openeing BRAcket ,round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
  ;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
  ;; bracket, in a context where an immediate / would be a division sign.  This
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+   (concat "\\(^\\|[^_\n\r]\\)\\<"
+ 	  (regexp-opt '("print" "return" "case") t)
+ 	  "\\>\\([^_\n\r]\\|$\\)"))
+ ;;   Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+   (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;;   Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
  
  ;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
  (defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
!             (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
!           (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
!         ;; A division sign.
  	(progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
! 	    (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! 	  (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! 	       (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! 	;; A division sign.
  	(progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
               (< (point) lim))
        (setq anchor (point))
        (search-forward-regexp c-awk-harmless-string*-here-re nil t)
!       ;; We are now looking at either a " or a /.
!       ;; Do our thing on the string, regexp or division sign.
        (setq anchor-state-/div
!             (if (looking-at "_?\"")
!                 (c-awk-syntax-tablify-string)
!               (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
      nil))
  
  ;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
               (< (point) lim))
        (setq anchor (point))
        (search-forward-regexp c-awk-harmless-string*-here-re nil t)
!       ;; We are now looking at either a " or a / or a brace/paren/semicolon.
!       ;; Do our thing on the string, regexp or divsion sign or update our state.
        (setq anchor-state-/div
! 	    (cond
! 	     ((looking-at "_?\"")
! 	      (c-awk-syntax-tablify-string))
! 	     ((eq (char-after) ?/)
! 	      (c-awk-syntax-tablify-/ anchor anchor-state-/div))
! 	     ((memq (char-after) '(?{ ?} ?\( ?\;))
! 	      (forward-char)
! 	      nil)
! 	     (t 			; ?\)
! 	      (forward-char)
! 	      t))))
      nil))
  
  ;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set


> Leo

-- 
Alan Mackenzie (Nuremberg, Germany).






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
       [not found]     ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
@ 2013-01-28  1:12       ` Leo Liu
  2013-01-28 11:14         ` Alan Mackenzie
       [not found]         ` <20130128111417.GA3330@acm.acm>
  0 siblings, 2 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-28  1:12 UTC
  To: Alan Mackenzie; +Cc: 13541

On 2013-01-28 02:59 +0800, Alan Mackenzie wrote:
> Yes, thanks for spotting this.  The situation was more complicated than I
> thought.  I think this replacement patch fixes that case (together with a
> few others).  Would you try it out again, please.

Still fails with:

/a/ { (print /abc/) }

or

/a/ { p /abc/ } # incorrect awk so not sure a bug or feature

Leo






* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-28  1:12       ` Leo Liu
@ 2013-01-28 11:14         ` Alan Mackenzie
       [not found]         ` <20130128111417.GA3330@acm.acm>
  1 sibling, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-28 11:14 UTC
  To: Leo Liu; +Cc: 13541

Hi, Leo.

On Mon, Jan 28, 2013 at 09:12:01AM +0800, Leo Liu wrote:
> On 2013-01-28 02:59 +0800, Alan Mackenzie wrote:
> > Yes, thanks for spotting this.  The situation was more complicated than I
> > thought.  I think this replacement patch fixes that case (together with a
> > few others).  Would you try it out again, please.

> Still fails with:

> /a/ { (print /abc/) }

Whoops!  There's a slight glitch in one of the regexps in cc-awk.el.  If
there were a space before "print", it would be "all right".  I've sent a
corrected patch below.

> or

> /a/ { p /abc/ } # incorrect awk so not sure a bug or feature

That "/abc/" is two division signs with a variable between them.  :-)
Compare your text with this:

BEGIN { a = 1 }
/a/ { print a /a/ a }
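
That reading can be confirmed from a shell. A minimal sketch, assuming a POSIX awk; the value 4 and the single input line are chosen here only to make the arithmetic visible.

```shell
# After an identifier, "/" is the division operator,
# so "a /a/ a" is evaluated as (a / a) / a.
out=$(printf 'a\n' | awk 'BEGIN { a = 4 } /a/ { print a /a/ a }')
printf '%s\n' "$out"
```

The rule's leading /a/ is a regexp and selects the input line, while the slashes after "print a" are divisions, giving 4 / 4 / 4 = 0.25.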

At the moment, after an alphanumeric token, /regexp/ is only a regexp
when the token is one of the keywords ("print" "case" "return").  There
might be more such keywords (I've not found any).  In a way, "printf"
could be one too, except its first argument is always the format string,
so that wouldn't be useful.

Here's the amended patch:



=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el	2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el	2013-01-28 10:57:52 +0000
***************
*** 127,148 ****
  ;; escaped EOL.
  
  ;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;;   Matches any character but a _, #, /, ", \, or newline.  N.B. _" starts a
- ;; localization string in gawk 3.1
  (defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
  ;;   Matches an underline NOT followed by ".
  (defconst c-awk-harmless-string*-re
    (concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;;   Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
  (defconst c-awk-harmless-string*-here-re
    (concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
  (defconst c-awk-harmless-line-re
!   (concat c-awk-harmless-string*-re
!           "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
  ;;   Matches (the tail of) an AWK \"logical\" line not containing an unescaped
  ;; " or /.  "logical" means "possibly containing escaped newlines".  A comment
  ;; is matched as part of the line even if it contains a " or a /.  The End of
--- 127,155 ----
  ;; escaped EOL.
  
  ;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
  (defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
  ;;   Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
+ ;;   Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
  (defconst c-awk-harmless-string*-re
    (concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;;   Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
  (defconst c-awk-harmless-string*-here-re
    (concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
! 
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;;   Matches any character but a _, #, /, ", \, or newline.  N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
!   (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;;   Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
  (defconst c-awk-harmless-line-re
!   (concat c-awk-harmless-line-string*-re
! 	  "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
  ;;   Matches (the tail of) an AWK \"logical\" line not containing an unescaped
  ;; " or /.  "logical" means "possibly containing escaped newlines".  A comment
  ;; is matched as part of the line even if it contains a " or a /.  The End of
***************
*** 211,217 ****
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
  ;;   A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
--- 218,224 ----
  ;; division sign.
  (defconst c-awk-neutral-re
  ;  "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
!   "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
  ;;   A "neutral" char(pair).  Doesn't change the "state" of a subsequent /.
  ;; This is space/tab, braces, an auto-increment/decrement operator or an
  ;; escaped character.  Or one of the (invalid) characters @ or `.  But NOT an
***************
*** 231,238 ****
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\(&=:!><,?;'~|]")
! ;;   Matches an opening BRAcket, round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
  ;; and "--".
--- 238,245 ----
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
  (defconst c-awk-non-arith-op-bra-re
!   "[[\({&=:!><,?;'~|]")
! ;;   Matches an openeing BRAcket ,round or square, or any operator character
  ;; apart from +,-,/,*,%.  For the purpose at hand (detecting a / which is a
  ;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
  ;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
  ;; bracket, in a context where an immediate / would be a division sign.  This
  ;; will only work when there won't be a preceding " or / before the sought /
  ;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+   (concat "\\(^\\|\\=\\|[^_\n\r]\\)\\<"
+ 	  (regexp-opt '("print" "return" "case") t)
+ 	  "\\>\\([^_\n\r]\\|$\\)"))
+ ;;   Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+   (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;;   Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
  
  ;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
  (defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
!             (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
!           (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
!         ;; A division sign.
  	(progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
      (goto-char anchor)
      ;; Analyze the line to find out what the / is.
      (if (if anchor-state-/div
! 	    (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! 	  (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! 	       (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! 	;; A division sign.
  	(progn (goto-char (1+ /point)) nil)
        ;; A regexp opener
        ;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
               (< (point) lim))
        (setq anchor (point))
        (search-forward-regexp c-awk-harmless-string*-here-re nil t)
!       ;; We are now looking at either a " or a /.
!       ;; Do our thing on the string, regexp or division sign.
        (setq anchor-state-/div
!             (if (looking-at "_?\"")
!                 (c-awk-syntax-tablify-string)
!               (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
      nil))
  
  ;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
               (< (point) lim))
        (setq anchor (point))
        (search-forward-regexp c-awk-harmless-string*-here-re nil t)
!       ;; We are now looking at either a " or a / or a brace/paren/semicolon.
!       ;; Do our thing on the string, regexp or division sign or update our state.
        (setq anchor-state-/div
! 	    (cond
! 	     ((looking-at "_?\"")
! 	      (c-awk-syntax-tablify-string))
! 	     ((eq (char-after) ?/)
! 	      (c-awk-syntax-tablify-/ anchor anchor-state-/div))
! 	     ((memq (char-after) '(?{ ?} ?\( ?\;))
! 	      (forward-char)
! 	      nil)
! 	     (t 			; ?\)
! 	      (forward-char)
! 	      t))))
      nil))
  
  ;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set



> Leo

-- 
Alan Mackenzie (Nuremberg, Germany).


* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
       [not found]         ` <20130128111417.GA3330@acm.acm>
@ 2013-01-28 12:11           ` Leo Liu
  0 siblings, 0 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-28 12:11 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: 13541

On 2013-01-28 19:14 +0800, Alan Mackenzie wrote:
> Whoops!  There's a slight glitch in one of the regexps in cc-awk.el.  If
> there were a space before "print", it would be "all right".  I've sent a
> corrected patch below.

OK, I have no further complaints ;)

Leo


* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
  2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
                   ` (2 preceding siblings ...)
  2013-01-25 17:50 ` bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Alan Mackenzie
@ 2013-01-29 20:58 ` Alan Mackenzie
  3 siblings, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-29 20:58 UTC (permalink / raw)
  To: 13541-done

Bug fixed.

-- 
Alan Mackenzie (Nuremberg, Germany).


end of thread, other threads:[~2013-01-29 20:58 UTC | newest]

Thread overview: 18+ messages
2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
2013-01-24 18:28 ` Glenn Morris
     [not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
2013-01-24 22:16   ` Alan Mackenzie
2013-01-25  1:20     ` Leo Liu
2013-01-25  1:33       ` Glenn Morris
2013-01-25  1:44         ` Glenn Morris
2013-01-25 21:32           ` Richard Stallman
2013-01-25  8:44       ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
2013-01-25 12:58         ` Stefan Monnier
2013-01-25 17:33         ` Glenn Morris
2013-01-25 19:17         ` Alan Mackenzie
2013-01-25 17:50 ` bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Alan Mackenzie
2013-01-26 11:14   ` Leo Liu
2013-01-27 18:59     ` Alan Mackenzie
     [not found]     ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
2013-01-28  1:12       ` Leo Liu
2013-01-28 11:14         ` Alan Mackenzie
     [not found]         ` <20130128111417.GA3330@acm.acm>
2013-01-28 12:11           ` Leo Liu
2013-01-29 20:58 ` Alan Mackenzie
