* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
@ 2013-01-24 11:43 Leo Liu
2013-01-24 18:28 ` Glenn Morris
` (3 more replies)
0 siblings, 4 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-24 11:43 UTC (permalink / raw)
To: 13541; +Cc: bug-cc-mode
In an awk buffer having the following text:
#--BEGIN--
NF { /xyz/ }
NF {
/xyz/
}
#--END--
I have the second regexp properly font-locked but not the first one.
(tested in GNU Emacs 24.2.92.1 of 2013-01-13)
[-- Attachment #2: awk-mode-bug.png --]
[-- Type: image/png, Size: 6820 bytes --]
Leo
^ permalink raw reply [flat|nested] 18+ messages in thread
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
@ 2013-01-24 18:28 ` Glenn Morris
[not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
` (2 subsequent siblings)
3 siblings, 0 replies; 18+ messages in thread
From: Glenn Morris @ 2013-01-24 18:28 UTC (permalink / raw)
To: Leo Liu; +Cc: 13541
Leo Liu wrote:
> In an awk buffer having the following text:
>
> #--BEGIN--
> NF { /xyz/ }
>
> NF {
> /xyz/
> }
> #--END--
>
> I have the second regexp properly font-locked but not the first one.
Do you have an example of an actual useful awk script showing the issue,
because this one seems like a pointless no-op?
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
[not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
@ 2013-01-24 22:16 ` Alan Mackenzie
2013-01-25 1:20 ` Leo Liu
0 siblings, 1 reply; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-24 22:16 UTC (permalink / raw)
To: Glenn Morris; +Cc: sdl.web, 13541
Hi, Glenn,
On Thu, Jan 24, 2013 at 01:28:33PM -0500, Glenn Morris wrote:
> Leo Liu wrote:
> > In an awk buffer having the following text:
> > #--BEGIN--
> > NF { /xyz/ }
> > NF {
> > /xyz/
> > }
> > #--END--
> > I have the second regexp properly font-locked but not the first one.
> Do you have an example of an actual useful awk script showing the issue,
> because this one seems like a pointless no-op?
This is a real bug, perhaps not a difficult one. "/regexp/" is an
expression with value 1 iff the current input line matches the regexp.
So a line like
NF { print /xyz/ }
is perfectly legitimate, printing 1 if there's an "xyz" on the line.
I'm looking at this bug at the moment.
--
Alan Mackenzie (Nuremberg, Germany).
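
A minimal sketch of the behaviour described in the message above, written
in Python rather than AWK so that it runs stand-alone. The function name
awk_print_match is made up for illustration, and Python's re syntax only
approximates AWK's extended regular expressions. It models the program
"NF { print /xyz/ }": for every record that has at least one field, emit 1
if the record matches the regexp and 0 otherwise.

```python
import re


def awk_print_match(pattern, lines):
    """Model of the AWK program:  NF { print /pattern/ }

    Hypothetical helper, not part of AWK or Emacs.
    """
    out = []
    for line in lines:
        # NF is the number of whitespace-separated fields; a record with
        # no fields makes the pattern "NF" false, so the action is skipped.
        if len(line.split()) > 0:
            # A bare /regexp/ is an expression: 1 on a match, 0 otherwise.
            out.append(1 if re.search(pattern, line) else 0)
    return out


print(awk_print_match(r"xyz", ["abc xyz", "", "no match here"]))
```

The empty record is skipped by the NF pattern, so only two values are
produced for the three input lines.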
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-24 22:16 ` Alan Mackenzie
@ 2013-01-25 1:20 ` Leo Liu
2013-01-25 1:33 ` Glenn Morris
2013-01-25 8:44 ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
0 siblings, 2 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-25 1:20 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: 13541
On 2013-01-25 06:16 +0800, Alan Mackenzie wrote:
> This is a real bug, perhaps not a difficult one. "/regexp/" is an
> expression with value 1 iff the current input line matches the regexp.
> So a line like
>
> NF { print /xyz/ }
>
> is perfectly legitimate, printing 1 if there's an "xyz" on the line.
>
> I'm looking at this bug at the moment.
Thanks to all for chiming in.
Alan, I also have another apparent buglet, this one about indentation.
Every line after a pattern-action pair like the following one (where the
action is omitted) is indented to column 4, i.e. it doesn't recognise that
a newline terminates a pattern.
$0 == "Emacs"
|
all following lines indented here
(this might be a regression, I seem to recall reporting something along
these lines some while ago.)
Leo
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-25 1:20 ` Leo Liu
@ 2013-01-25 1:33 ` Glenn Morris
2013-01-25 1:44 ` Glenn Morris
2013-01-25 8:44 ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
1 sibling, 1 reply; 18+ messages in thread
From: Glenn Morris @ 2013-01-25 1:33 UTC (permalink / raw)
To: Leo Liu; +Cc: Alan Mackenzie, 13541
Leo Liu wrote:
> (this might be a regression, I seem to recall reporting something along
> these lines some while ago.)
No, it is the never-addressed
http://debbugs.gnu.org/12274
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-25 1:33 ` Glenn Morris
@ 2013-01-25 1:44 ` Glenn Morris
2013-01-25 21:32 ` Richard Stallman
0 siblings, 1 reply; 18+ messages in thread
From: Glenn Morris @ 2013-01-25 1:44 UTC (permalink / raw)
To: Leo Liu; +Cc: Alan Mackenzie, 13541
> Leo Liu wrote:
>
>> (this might be a regression,
PS henceforth it is prohibited to use the word "regression" except in
the form "this is a regression against Emacs XX.YY, where it works as
desired".
* bug#12274: 24.2; awk-mode indentation failure
2013-01-25 1:20 ` Leo Liu
2013-01-25 1:33 ` Glenn Morris
@ 2013-01-25 8:44 ` Alan Mackenzie
2013-01-25 12:58 ` Stefan Monnier
` (2 more replies)
1 sibling, 3 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-25 8:44 UTC (permalink / raw)
To: Leo Liu; +Cc: 12274
Hello, Leo.
On Fri, Jan 25, 2013 at 09:20:19AM +0800, Leo Liu wrote:
[ .... ]
> Alan, I also have another apparent buglet, this one about indentation.
As Glenn remarked, this is bug #12274.
> Every line after a pattern-action pair like the following one (where the
> action is omitted) is indented to column 4, i.e. it doesn't recognise that
> a newline terminates a pattern.
> $0 == "Emacs"
> |
> all following lines indented here
> (this might be a regression, I seem to recall reporting something along
> these lines some while ago.)
No, not a regression, rather a bug which has been there since 4004 BC.
It's actually the "=" sign which triggers it, confusing the parsing
algorithm into thinking it's a C initialisation statement.
The solution is to move the pertinent AWK parsing clause earlier on in
the enclosing cond form.
Glenn, this is not a regression. Should I nevertheless commit it to the
emacs-24 branch?
Here's the patch:
diff -r 0d641a4d3e7c cc-engine.el
--- a/cc-engine.el Wed Jan 23 18:17:40 2013 +0000
+++ b/cc-engine.el Fri Jan 25 08:27:12 2013 +0000
@@ -9880,6 +9880,18 @@
;; contains any class offset
)))
+ ;; CASE 5P: AWK pattern or function or continuation
+ ;; thereof.
+ ((c-major-mode-is 'awk-mode)
+ (setq placeholder (point))
+ (c-add-stmt-syntax
+ (if (and (eq (c-beginning-of-statement-1) 'same)
+ (/= (point) placeholder))
+ 'topmost-intro-cont
+ 'topmost-intro)
+ nil nil
+ containing-sexp paren-state))
+
;; CASE 5D: this could be a top-level initialization, a
;; member init list continuation, or a template argument
;; list continuation.
@@ -10039,18 +10051,6 @@
(goto-char (point-min)))
(c-add-syntax 'objc-method-intro (c-point 'boi)))
- ;; CASE 5P: AWK pattern or function or continuation
- ;; thereof.
- ((c-major-mode-is 'awk-mode)
- (setq placeholder (point))
- (c-add-stmt-syntax
- (if (and (eq (c-beginning-of-statement-1) 'same)
- (/= (point) placeholder))
- 'topmost-intro-cont
- 'topmost-intro)
- nil nil
- containing-sexp paren-state))
-
;; CASE 5N: At a variable declaration that follows a class
;; definition or some other block declaration that doesn't
;; end at the closing '}'. C.f. case 5D.5.
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
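
The fix in the message above relies on the fact that a cond form takes the
first clause whose test succeeds, so moving a clause earlier changes which
one wins. The following Python sketch models only that ordering effect. The
names analyse, looks_like_c_init and is_awk are hypothetical, and the two
predicates are toy stand-ins: the real tests in cc-engine.el are far more
involved than "the line contains an = sign".

```python
def looks_like_c_init(line, mode):
    # Toy stand-in for the CASE 5D test: an "=" makes the line look like
    # a C initialisation. This is an assumption made for illustration.
    return "=" in line


def is_awk(line, mode):
    # Toy stand-in for (c-major-mode-is 'awk-mode).
    return mode == "awk-mode"


def analyse(line, mode, clauses):
    """Return the label of the first clause whose test succeeds."""
    for label, test in clauses:
        if test(line, mode):
            return label
    return "unknown"


# Clause order before the patch: the initialisation case is tried first.
before = [("5D: initialization", looks_like_c_init), ("5P: AWK pattern", is_awk)]
# Clause order after the patch: the AWK case has been moved earlier.
after = [("5P: AWK pattern", is_awk), ("5D: initialization", looks_like_c_init)]

line = '$0 == "Emacs"'
print(analyse(line, "awk-mode", before))
print(analyse(line, "awk-mode", after))
```

In this model a C buffer is unaffected by the reordering, because the AWK
test never succeeds there and the initialisation clause is still reached.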
* bug#12274: 24.2; awk-mode indentation failure
2013-01-25 8:44 ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
@ 2013-01-25 12:58 ` Stefan Monnier
2013-01-25 17:33 ` Glenn Morris
2013-01-25 19:17 ` Alan Mackenzie
2 siblings, 0 replies; 18+ messages in thread
From: Stefan Monnier @ 2013-01-25 12:58 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: Leo Liu, 12274
> Glenn, this is not a regression. Should I nevertheless commit it to the
> emacs-24 branch?
No, this should go to trunk,
Stefan
* bug#12274: 24.2; awk-mode indentation failure
2013-01-25 8:44 ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
2013-01-25 12:58 ` Stefan Monnier
@ 2013-01-25 17:33 ` Glenn Morris
2013-01-25 19:17 ` Alan Mackenzie
2 siblings, 0 replies; 18+ messages in thread
From: Glenn Morris @ 2013-01-25 17:33 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: Leo Liu, 12274
Alan Mackenzie wrote:
> Glenn, this is not a regression. Should I nevertheless commit it to the
> emacs-24 branch?
No, trunk please.
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
2013-01-24 18:28 ` Glenn Morris
[not found] ` <b5ehhajuzy.fsf@fencepost.gnu.org>
@ 2013-01-25 17:50 ` Alan Mackenzie
2013-01-26 11:14 ` Leo Liu
2013-01-29 20:58 ` Alan Mackenzie
3 siblings, 1 reply; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-25 17:50 UTC (permalink / raw)
To: Leo Liu; +Cc: 13541
Hi, Leo.
On Thu, Jan 24, 2013 at 07:43:06PM +0800, Leo Liu wrote:
> In an awk buffer having the following text:
> #--BEGIN--
> NF { /xyz/ }
> NF {
> /xyz/
> }
> #--END--
> I have the second regexp properly font-locked but not the first one.
Yes.
Could you please try out, fairly thoroughly, the following patch, and let
me know how it goes. It aims to fontify a /regexp/ wherever one might
occur.
=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el 2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el 2013-01-25 17:47:38 +0000
***************
*** 211,217 ****
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
--- 211,217 ----
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
***************
*** 231,237 ****
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\(&=:!><,?;'~|]")
;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
--- 231,237 ----
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\({&=:!><,?;'~|]")
;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
***************
*** 242,247 ****
--- 242,257 ----
;; bracket, in a context where an immediate / would be a division sign. This
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+ (concat "\\(^\\|[^_\n\r]\\)\\<"
+ (regexp-opt '("print" "return" "case") t)
+ "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+ (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
(defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
--- 731,740 ----
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
* bug#12274: 24.2; awk-mode indentation failure
2013-01-25 8:44 ` bug#12274: 24.2; awk-mode indentation failure Alan Mackenzie
2013-01-25 12:58 ` Stefan Monnier
2013-01-25 17:33 ` Glenn Morris
@ 2013-01-25 19:17 ` Alan Mackenzie
2 siblings, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-25 19:17 UTC (permalink / raw)
To: 12274-done
Bug fixed.
--
Alan Mackenzie (Nuremberg, Germany).
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-25 1:44 ` Glenn Morris
@ 2013-01-25 21:32 ` Richard Stallman
0 siblings, 0 replies; 18+ messages in thread
From: Richard Stallman @ 2013-01-25 21:32 UTC (permalink / raw)
To: Glenn Morris; +Cc: acm, sdl.web, 13541
If a nasty and cruel bug appears just before the release, is that
"regression to the mean"?
--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use Ekiga or an ordinary phone call
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-25 17:50 ` bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Alan Mackenzie
@ 2013-01-26 11:14 ` Leo Liu
2013-01-27 18:59 ` Alan Mackenzie
[not found] ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
0 siblings, 2 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-26 11:14 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: 13541
On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> Could you please try out, fairly thoroughly, the following patch, and let
> me know how it goes. It aims to fontify a /regexp/ wherever one might
> occur.
The second regexp is not font-locked in this case:
/a/ { print /abc/ }
Leo
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-26 11:14 ` Leo Liu
@ 2013-01-27 18:59 ` Alan Mackenzie
[not found] ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
1 sibling, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-27 18:59 UTC (permalink / raw)
To: Leo Liu; +Cc: 13541
Hi, Leo.
On Sat, Jan 26, 2013 at 07:14:49PM +0800, Leo Liu wrote:
> On 2013-01-26 01:50 +0800, Alan Mackenzie wrote:
> > Could you please try out, fairly thoroughly, the following patch, and let
> > me know how it goes. It aims to fontify a /regexp/ wherever one might
> > occur.
> The second regexp is not font-locked in this case:
> /a/ { print /abc/ }
Yes, thanks for spotting this. The situation was more complicated than I
thought. I think this replacement patch fixes that case (together with a
few others). Would you try it out again, please.
=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el 2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el 2013-01-27 18:23:59 +0000
***************
*** 127,148 ****
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
- ;; localization string in gawk 3.1
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
--- 127,155 ----
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
+ ;; Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
!
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
! (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-line-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
***************
*** 211,217 ****
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
--- 218,224 ----
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
***************
*** 231,238 ****
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\(&=:!><,?;'~|]")
! ;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
--- 238,245 ----
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\({&=:!><,?;'~|]")
! ;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
;; bracket, in a context where an immediate / would be a division sign. This
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+ (concat "\\(^\\|[^_\n\r]\\)\\<"
+ (regexp-opt '("print" "return" "case") t)
+ "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+ (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
(defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a /.
! ;; Do our thing on the string, regexp or division sign.
(setq anchor-state-/div
! (if (looking-at "_?\"")
! (c-awk-syntax-tablify-string)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a / or a brace/paren/semicolon.
! ;; Do our thing on the string, regexp or division sign or update our state.
(setq anchor-state-/div
! (cond
! ((looking-at "_?\"")
! (c-awk-syntax-tablify-string))
! ((eq (char-after) ?/)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))
! ((memq (char-after) '(?{ ?} ?\( ?\;))
! (forward-char)
! nil)
! (t ; ?\)
! (forward-char)
! t))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
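
The last hunk of the patch above adds a small state update: after one of
"{", "}", "(" or ";" the state is set to nil, and after ")" it is set to t.
Reading the surrounding code, the state appears to mean "a following /
would be a division sign". The Python sketch below models only that
delimiter rule. The name div_state_after is hypothetical, and the sketch
deliberately ignores strings, regexp bodies, operators and keywords, all of
which the real code in cc-awk.el also takes into account.

```python
def div_state_after(text, state=False):
    """Return True if a '/' following `text` would default to division.

    Simplified model of the delimiter clauses only; the input is assumed
    to contain no strings, comments or regexp literals.
    """
    for ch in text:
        if ch in "{}(;":
            # Start of a block, group or statement: an expression is
            # expected next, so a '/' would open a regexp.
            state = False
        elif ch == ")":
            # End of a parenthesised expression: a '/' would divide it.
            state = True
        # Any other character leaves the state unchanged in this model.
    return state


print(div_state_after("{ (a + b)"))
print(div_state_after("{ (a + b);"))
```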
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
[not found] ` <20130127185906.GA16161__1271.15463042191$1359313643$gmane$org@acm.acm>
@ 2013-01-28 1:12 ` Leo Liu
2013-01-28 11:14 ` Alan Mackenzie
[not found] ` <20130128111417.GA3330@acm.acm>
0 siblings, 2 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-28 1:12 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: 13541
On 2013-01-28 02:59 +0800, Alan Mackenzie wrote:
> Yes, thanks for spotting this. The situation was more complicated than I
> thought. I think this replacement patch fixes that case (together with a
> few others). Would you try it out again, please.
Still fails with:
/a/ { (print /abc/) }
or
/a/ { p /abc/ } # incorrect awk, so not sure whether this is a bug or a feature
Leo
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-28 1:12 ` Leo Liu
@ 2013-01-28 11:14 ` Alan Mackenzie
[not found] ` <20130128111417.GA3330@acm.acm>
1 sibling, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-28 11:14 UTC (permalink / raw)
To: Leo Liu; +Cc: 13541
Hi, Leo.
On Mon, Jan 28, 2013 at 09:12:01AM +0800, Leo Liu wrote:
> On 2013-01-28 02:59 +0800, Alan Mackenzie wrote:
> > Yes, thanks for spotting this. The situation was more complicated than I
> > thought. I think this replacement patch fixes that case (together with a
> > few others). Would you try it out again, please.
> Still fails with:
> /a/ { (print /abc/) }
Whoops! There's a slight glitch in one of the regexps in cc-awk.el. If
there were a space before "print", it would be "all right". I've sent a
corrected patch below.
> or
> /a/ { p /abc/ } # incorrect awk, so not sure whether this is a bug or a feature
That "/abc/" is two division signs with a variable between them. :-)
Compare your text with this:
BEGIN { a = 1 }
/a/ { print a /a/ a }
At the moment, after an alphanumeric token, /regexp/ is only a regexp
when the token is one of the keywords ("print" "case" "return"). There
might be more such keywords (I've not found any). In a way, "printf"
could be one too, except its first argument is always the format string,
so that wouldn't be useful.
Here's the amended patch:
=== modified file 'lisp/progmodes/cc-awk.el'
*** lisp/progmodes/cc-awk.el 2013-01-01 09:11:05 +0000
--- lisp/progmodes/cc-awk.el 2013-01-28 10:57:52 +0000
***************
*** 127,148 ****
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
- (defconst c-awk-harmless-char-re "[^_#/\"\\\\\n\r]")
- ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
- ;; localization string in gawk 3.1
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of chars without unescaped /, ", \,
! ;; at point.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
--- 127,155 ----
;; escaped EOL.
;; REGEXPS FOR "HARMLESS" STRINGS/LINES.
(defconst c-awk-harmless-_ "_\\([^\"]\\|\\'\\)")
;; Matches an underline NOT followed by ".
+ (defconst c-awk-harmless-char-re "[^_#/\"{}();\\\\\n\r]")
+ ;; Matches any character not significant in the state machine applying
+ ;; syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-re
(concat "\\(" c-awk-harmless-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of characters insignificant in the
! ;; state machine applying syntax-table properties to "s and /s.
(defconst c-awk-harmless-string*-here-re
(concat "\\=" c-awk-harmless-string*-re))
! ;; Matches the (possibly empty) sequence of "insignificant" chars at point.
!
! (defconst c-awk-harmless-line-char-re "[^_#/\"\\\\\n\r]")
! ;; Matches any character but a _, #, /, ", \, or newline. N.B. _" starts a
! ;; localisation string in gawk 3.1
! (defconst c-awk-harmless-line-string*-re
! (concat "\\(" c-awk-harmless-line-char-re "\\|" c-awk-esc-pair-re "\\|" c-awk-harmless-_ "\\)*"))
! ;; Matches a (possibly empty) sequence of chars without unescaped /, ", \,
! ;; #, or newlines.
(defconst c-awk-harmless-line-re
! (concat c-awk-harmless-line-string*-re
! "\\(" c-awk-comment-without-nl "\\)?" c-awk-nl-or-eob))
;; Matches (the tail of) an AWK \"logical\" line not containing an unescaped
;; " or /. "logical" means "possibly containing escaped newlines". A comment
;; is matched as part of the line even if it contains a " or a /. The End of
***************
*** 211,217 ****
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
--- 218,224 ----
;; division sign.
(defconst c-awk-neutral-re
; "\\([{}@` \t]\\|\\+\\+\\|--\\|\\\\.\\)+") ; changed, 2003/6/7
! "\\([}@` \t]\\|\\+\\+\\|--\\|\\\\\\(.\\|[\n\r]\\)\\)")
;; A "neutral" char(pair). Doesn't change the "state" of a subsequent /.
;; This is space/tab, braces, an auto-increment/decrement operator or an
;; escaped character. Or one of the (invalid) characters @ or `. But NOT an
***************
*** 231,238 ****
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\(&=:!><,?;'~|]")
! ;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
--- 238,245 ----
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
(defconst c-awk-non-arith-op-bra-re
! "[[\({&=:!><,?;'~|]")
! ;; Matches an opening BRAcket, round or square, or any operator character
;; apart from +,-,/,*,%. For the purpose at hand (detecting a / which is a
;; regexp bracket) these arith ops are unnecessary and a pain, because of "++"
;; and "--".
***************
*** 242,247 ****
--- 249,264 ----
;; bracket, in a context where an immediate / would be a division sign. This
;; will only work when there won't be a preceding " or / before the sought /
;; to foul things up.
+ (defconst c-awk-pre-exp-alphanum-kwd-re
+ (concat "\\(^\\|\\=\\|[^_\n\r]\\)\\<"
+ (regexp-opt '("print" "return" "case") t)
+ "\\>\\([^_\n\r]\\|$\\)"))
+ ;; Matches all AWK keywords which can precede expressions (including
+ ;; /regexp/).
+ (defconst c-awk-kwd-regexp-sign-re
+ (concat c-awk-pre-exp-alphanum-kwd-re c-awk-neutrals*-re "/"))
+ ;; Matches a piece of AWK buffer ending in <kwd> /, where <kwd> is a keyword
+ ;; which can precede an expression.
;; REGEXPS USED FOR FINDING THE POSITION OF A "virtual semicolon"
(defconst c-awk-_-harmless-nonws-char-re "[^#/\"\\\\\n\r \t]")
***************
*** 721,729 ****
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
--- 738,747 ----
(goto-char anchor)
;; Analyze the line to find out what the / is.
(if (if anchor-state-/div
! (not (search-forward-regexp c-awk-regexp-sign-re (1+ /point) t))
! (and (not (search-forward-regexp c-awk-kwd-regexp-sign-re (1+ /point) t))
! (search-forward-regexp c-awk-div-sign-re (1+ /point) t)))
! ;; A division sign.
(progn (goto-char (1+ /point)) nil)
;; A regexp opener
;; Jump over the regexp innards, setting the match data.
***************
*** 776,787 ****
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a /.
! ;; Do our thing on the string, regexp or division sign.
(setq anchor-state-/div
! (if (looking-at "_?\"")
! (c-awk-syntax-tablify-string)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
--- 794,813 ----
(< (point) lim))
(setq anchor (point))
(search-forward-regexp c-awk-harmless-string*-here-re nil t)
! ;; We are now looking at either a " or a / or a brace/paren/semicolon.
! ;; Do our thing on the string, regexp or division sign or update our state.
(setq anchor-state-/div
! (cond
! ((looking-at "_?\"")
! (c-awk-syntax-tablify-string))
! ((eq (char-after) ?/)
! (c-awk-syntax-tablify-/ anchor anchor-state-/div))
! ((memq (char-after) '(?{ ?} ?\( ?\;))
! (forward-char)
! nil)
! (t ; ?\)
! (forward-char)
! t))))
nil))
;; ACM, 2002/07/21: Thoughts: We need an AWK Mode after-change function to set
> Leo
--
Alan Mackenzie (Nuremberg, Germany).
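
The rule stated in the message above, that a / after an alphanumeric token
opens a regexp only when the token is one of "print", "return" or "case",
can be sketched as follows. This is a Python model, not the Emacs Lisp from
the patch: the function name slash_after_word_is_regexp and the identifier
pattern are assumptions made for illustration, and only spaces and tabs are
treated as neutral characters between the token and the slash.

```python
import re

# Keywords listed in the patch as able to precede an expression.
PRE_EXP_KEYWORDS = {"print", "return", "case"}

# An identifier followed only by spaces or tabs up to the end of the text.
_TRAILING_WORD = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)[ \t]*$")


def slash_after_word_is_regexp(text_before_slash):
    """Classify a '/' that follows `text_before_slash`.

    Returns True for a regexp opener, False for a division sign, and None
    when the slash is not preceded by an alphanumeric token (other rules,
    not modelled here, would then decide).
    """
    m = _TRAILING_WORD.search(text_before_slash)
    if m is None:
        return None
    return m.group(1) in PRE_EXP_KEYWORDS


print(slash_after_word_is_regexp("/a/ { print "))    # keyword
print(slash_after_word_is_regexp("/a/ { p "))        # variable p
print(slash_after_word_is_regexp("/a/ { print a "))  # variable a
```

The whole token is compared, so "printf" and a variable such as "myprint"
are not mistaken for the keyword "print".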
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
[not found] ` <20130128111417.GA3330@acm.acm>
@ 2013-01-28 12:11 ` Leo Liu
0 siblings, 0 replies; 18+ messages in thread
From: Leo Liu @ 2013-01-28 12:11 UTC (permalink / raw)
To: Alan Mackenzie; +Cc: 13541
On 2013-01-28 19:14 +0800, Alan Mackenzie wrote:
> Whoops! There's a slight glitch in one of the regexps in cc-awk.el. If
> there were a space before "print", it would be "all right". I've sent a
> corrected patch below.
OK, I have no further complaints ;)
Leo
* bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals
2013-01-24 11:43 bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Leo Liu
` (2 preceding siblings ...)
2013-01-25 17:50 ` bug#13541: 24.2.92; awk-mode: wrong font locking regexp literals Alan Mackenzie
@ 2013-01-29 20:58 ` Alan Mackenzie
3 siblings, 0 replies; 18+ messages in thread
From: Alan Mackenzie @ 2013-01-29 20:58 UTC (permalink / raw)
To: 13541-done
Bug fixed.
--
Alan Mackenzie (Nuremberg, Germany).