unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#46889: cperl-mode: Fix indentation issues [PATCH]
@ 2021-03-03 14:43 Harald Jörg
  2021-03-04 17:31 ` Lars Ingebrigtsen
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Harald Jörg @ 2021-03-03 14:43 UTC (permalink / raw)
  To: 46889

[-- Attachment #1: Type: text/plain, Size: 1218 bytes --]

Hello Emacs-Maintainers,

this started out as working on Bug#42169 but it soon became clear that
there's more lurking in CPerl mode's indentation code, so I decided to
open a new issue here.

In 7b2448ae (cperl-mode: Improve detection of index entries for imenu),
committed on Feb 17, the imenu part of Bug#42169 has been fixed.  This
commit now fixes the indentation after a package which brings its own
block.

Therefore, the patch in this report should finally fix Bug#42169.

As a by-catch, the new code fixes the ancient Bug#8077.

This patch also fixes the first part of Bug#11733.  The "other
indentation bug" reported in Bug#11733 (using a colon as a Perl regexp
terminator) is a different story and not covered by this patch.

Also fixed now (without known bug number): labels with a space before,
or a comment after the colon are now correctly indented and fontified.

With this patch in effect, CPerl mode should finally have caught up with
Perl 5.30!

Many thanks to Mattias Engdegård for his patient explanations how rx
expressions can be used without breaking compatibility to Emacs 26.1.
Until now, this allowed two ugly regexp literals to be eliminated.
-- 
Cheers,
haj

[-- Attachment #2: cperl-mode: Fix indentation issues --]
[-- Type: text/x-diff, Size: 33256 bytes --]

From 792f23db44a04db179921e72b11634e03d04ebb0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Harald=20J=C3=B6rg?= <haj@posteo.de>
Date: Wed, 3 Mar 2021 14:07:44 +0100
Subject: [PATCH] ; cperl-mode: fix indentation issues
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* lisp/progmodes/cperl-mode.el (cperl-core-named-block-keywords),
(cperl-bare-block-keywords): New variables to allow adding new
keywords without mangling regular expressions..
(cperl--label-rx),
(cperl--false-label-rx) and more: String regexps are replaced
by rx forms, errors reported by relint are fixed. (thanks
Mattias Engdegård!)
(cperl-block-declaration-p): New function to replace several
inconsistent regexps.
(cperl-beginning-of-defun): New function to capture lexical
subs and functions where the opening brace is on a line of its
own (or even separated by attributes, signatures etc.)
(cperl-maybe-white-and-comment-rex),
(cperl-white-and-comment-rex): Now implemented in terms of
rx-forms.
(cperl-after-sub-regexp): Deleted, no more used.
(cperl-imenu--function-name-regexp-perl): Deleted, no more used.
(cperl-indent-line): Use the new rx-forms.
(cperl-sniff-for-indent): Use the new rx-formxs.
(cperl-after-block-p): Use the new keyword lists.
(cperl-init-faces): Replace literal regexps by rx forms.

* test/lisp/progmodes/cperl-mode-tests.el
(cperl-test-indent-region): New indentation test for lexical
subroutines and packages with version and/or blocks.
(cperl-test-ws+-rx): Test the rx S-expression instead of the
string regexp.

* test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl:
New testcases for indenting labels and "new style" packages.

* test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl:
New file for indentation tests spanning more than one expression.
---
 lisp/progmodes/cperl-mode.el                  | 396 ++++++++++--------
 .../cperl-mode-resources/cperl-indent-exp.pl  |  76 ++++
 .../cperl-indent-region.pl                    | 118 ++++++
 test/lisp/progmodes/cperl-mode-tests.el       |  14 +-
 4 files changed, 430 insertions(+), 174 deletions(-)
 create mode 100644 test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl

diff --git a/lisp/progmodes/cperl-mode.el b/lisp/progmodes/cperl-mode.el
index 6b22228397..d2788da625 100644
--- a/lisp/progmodes/cperl-mode.el
+++ b/lisp/progmodes/cperl-mode.el
@@ -1212,73 +1212,98 @@ cperl-menu
 ;; minimalistic Perl grammar, to be used instead of individual (and
 ;; not always consistent) literal regular expressions.
 
-(defconst cperl--basic-identifier-regexp
-  (rx (sequence (or alpha "_") (* (or word "_"))))
-  "A regular expression for the name of a \"basic\" Perl variable.
+;; Perl Keyword sets
+;; These might be extended by extension modules or future Perl
+;; versions.  Therefore it should be possible to change the values
+;; at runtime, or even: for one CPerl mode buffer.
+(defconst cperl-core-named-block-keywords
+  '("BEGIN" "CHECK" "END" "INIT" "UNITCHECK"
+    ;; These two _should_ be following a sub statement, but are
+    ;; actually valid Perl without.
+    "DESTROY" "AUTOLOAD")
+  "These keywords introduce a block which ends a statement
+   without 'sub', and without a semicolon.  This affects
+   indentation.")
+
+(defconst cperl-bare-block-keywords
+  (append cperl-core-named-block-keywords
+          '("else"
+            ;; FIXME: The next keywords have been taken over from
+            ;; history.  Do they really belong here? - haj 2021-02-25
+            "grep" "map"
+            ;; FIXME: The following keywords are not Perl core
+            ;; and there are subtle syntactical differences between
+            ;; different extensions providing them - haj 2021-02-25
+            "try" "catch" "continue"))
+  "Keywords which are directly followed by a block.")
+
+(eval-and-compile
+  (defconst cperl--ws-rx
+    '(sequence (or space "\n"))
+    "Regular expression for a single whitespace in Perl.")
+
+  (defconst cperl--eol-comment-rx
+    '(sequence "#" (0+ (not (in "\n"))) "\n")
+    "Regular expression for a single end-of-line comment in Perl")
+
+  (defconst cperl--ws-or-comment-rx
+    '(or (eval cperl--ws-rx)
+         (eval cperl--eol-comment-rx))
+    "A regular expression for either whitespace or comment")
+
+  (defconst cperl--ws*-rx
+    '(0+ (eval cperl--ws-or-comment-rx))
+    "Regular expression for optional whitespaces or comments in Perl")
+
+  (defconst cperl--ws+-rx
+    '(1+ (eval cperl--ws-or-comment-rx))
+    "Regular expression for a sequence of whitespace and comments in Perl.")
+
+  (defconst cperl--basic-identifier-rx
+    '(sequence (or alpha "_") (* (or word "_")))
+    "A regular expression for the name of a \"basic\" Perl variable.
 Neither namespace separators nor sigils are included.  As is,
 this regular expression applies to labels,subroutine calls where
 the ampersand sigil is not required, and names of subroutine
 attributes.")
 
-(defconst cperl--label-regexp
-  (rx-to-string
-   `(sequence
-     symbol-start
-     (regexp ,cperl--basic-identifier-regexp)
-     (0+ space)
-     ":"))
-  "A regular expression for a Perl label.
+  (defconst cperl--label-rx
+    '(sequence symbol-start
+               (eval cperl--basic-identifier-rx)
+               (0+ space)
+               ":")
+    "A regular expression for a Perl label.
 By convention, labels are uppercase alphabetics, but this isn't
 enforced.")
 
-(defconst cperl--normal-identifier-regexp
-  (rx-to-string
-   `(or
-     (sequence
-      (1+ (sequence
-           (opt (regexp ,cperl--basic-identifier-regexp))
-           "::"))
-      (opt (regexp ,cperl--basic-identifier-regexp)))
-     (regexp ,cperl--basic-identifier-regexp)))
-  "A regular expression for a Perl variable name with optional namespace.
+  (defconst cperl--false-label-rx
+    '(sequence (or (in "sym") "tr") (0+ space) ":")
+    "A regular expression which is similar to a label, but might as
+  well be a quote-like operator with a colon as delimiter.")
+
+  (defconst cperl--normal-identifier-rx
+    '(or (sequence (1+ (sequence
+                        (opt (eval cperl--basic-identifier-rx))
+                        "::"))
+                   (opt (eval cperl--basic-identifier-rx)))
+         (eval cperl--basic-identifier-rx))
+    "A regular expression for a Perl variable name with optional namespace.
 Examples are `foo`, `Some::Module::VERSION`, and `::` (yes, that
 is a legal variable name).")
 
-(defconst cperl--special-identifier-regexp
-  (rx-to-string
-   `(or
-     (1+ digit)                          ; $0, $1, $2, ...
-     (sequence "^" (any "A-Z" "]^_?\\")) ; $^V
-     (sequence "{" (0+ space)            ; ${^MATCH}
-               "^" (any "A-Z" "]^_?\\")
-               (0+ (any "A-Z" "_" digit))
-               (0+ space) "}")
-     (in "!\"$%&'()+,-./:;<=>?@\\]^_`|~")))   ; $., $|, $", ... but not $^ or ${
-  "The list of Perl \"punctuation\" variables, as listed in perlvar.")
-
-(defconst cperl--ws-regexp
-  (rx-to-string
-   '(or space "\n"))
-  "Regular expression for a single whitespace in Perl.")
-
-(defconst cperl--eol-comment-regexp
-  (rx-to-string
-   '(sequence "#" (0+ (not (in "\n"))) "\n"))
-  "Regular expression for a single end-of-line comment in Perl")
-
-(defconst cperl--ws-or-comment-regexp
-  (rx-to-string
-   `(1+
-     (or
-      (regexp ,cperl--ws-regexp)
-      (regexp ,cperl--eol-comment-regexp))))
-  "Regular expression for a sequence of whitespace and comments in Perl.")
-
-(defconst cperl--ows-regexp
-  (rx-to-string
-   `(opt (regexp ,cperl--ws-or-comment-regexp)))
-  "Regular expression for optional whitespaces or comments in Perl")
-
+  (defconst cperl--special-identifier-rx
+    '(or
+      (1+ digit)                          ; $0, $1, $2, ...
+      (sequence "^" (any "A-Z" "]^_?\\")) ; $^V
+      (sequence "{" (0+ space)            ; ${^MATCH}
+                "^" (any "A-Z" "]^_?\\")
+                (0+ (any "A-Z" "_" digit))
+                (0+ space) "}")
+      (in "!\"$%&'()+,-./:;<=>?@\\]^_`|~"))   ; $., $|, $", ... but not $^ or ${
+    "The list of Perl \"punctuation\" variables, as listed in perlvar."))
+
+;; This is left as a string regexp.  There are many version schemes in
+;; the wild, so people might want to fiddle with this variable.
 (defconst cperl--version-regexp
   (rx-to-string
    `(or
@@ -1296,11 +1321,11 @@ cperl--package-regexp
    `(sequence
      "package" ; FIXME: the "class" and "role" keywords need to be
                ; recognized soon...ish.
-     (regexp ,cperl--ws-or-comment-regexp)
-     (group (regexp ,cperl--normal-identifier-regexp))
+     (eval cperl--ws+-rx)
+     (group (eval cperl--normal-identifier-rx))
      (opt
       (sequence
-       (regexp ,cperl--ws-or-comment-regexp)
+       (eval cperl--ws+-rx)
        (group (regexp ,cperl--version-regexp))))))
   "A regular expression for package NAME VERSION in Perl.
 Contains two groups for the package name and version.")
@@ -1309,7 +1334,7 @@ cperl--package-for-imenu-regexp
   (rx-to-string
    `(sequence
      (regexp ,cperl--package-regexp)
-     (regexp ,cperl--ows-regexp)
+     (eval cperl--ws*-rx)
      (group (or ";" "{"))))
   "A regular expression to collect package names for `imenu`.
 Catches \"package NAME;\", \"package NAME VERSION;\", \"package
@@ -1321,19 +1346,28 @@ cperl--sub-name-regexp
   (rx-to-string
    `(sequence
      (optional (sequence (group (or "my" "state" "our"))
-			 (regexp ,cperl--ws-or-comment-regexp)))
+			 (eval cperl--ws+-rx)))
      "sub" ; FIXME: the "method" and maybe "fun" keywords need to be
            ; recognized soon...ish.
-     (regexp ,cperl--ws-or-comment-regexp)
-     (group (regexp ,cperl--normal-identifier-regexp))))
+     (eval cperl--ws+-rx)
+     (group (eval cperl--normal-identifier-rx))))
   "A regular expression to detect a subroutine start.
 Contains two groups: One for to distinguish lexical from
 \"normal\" subroutines and one for the subroutine name.")
 
-(defconst cperl--pod-heading-regexp
+(defconst cperl--block-declaration-regexp
   (rx-to-string
    `(sequence
-     line-start "=head"
+     (or "package" "sub")  ; "class" and "method" coming soon
+     (1+ (eval cperl--ws-or-comment-rx))
+     (eval cperl--normal-identifier-rx)))
+  "A regular expression to find a declaration for a named block.
+Used for indentation.  These declarations introduce a block which
+does not need a semicolon to terminate the statement.")
+
+(defconst cperl--pod-heading-regexp
+  (rx-to-string
+   `(sequence line-start "=head"
      (group (in "1-4"))
      (1+ (in " \t"))
      (group (1+ (not (in "\n"))))
@@ -1346,61 +1380,84 @@ cperl--imenu-entries-regexp
   (rx-to-string
    `(or
      (regexp ,cperl--package-for-imenu-regexp) ; 1..3
-     (regexp ,cperl--sub-name-regexp)	       ; 4..5
+     (regexp ,cperl--sub-name-regexp)          ; 4..5
      (regexp ,cperl--pod-heading-regexp)))     ; 6..7
   "A regular expression to collect stuff that goes into the `imenu` index.
 Covers packages, subroutines, and POD headings.")
 
 \f
+(defun cperl-block-declaration-p ()
+  "Tests whether the following ?\\{ opens a declaration block.
+Returns the column where the declarating keyword is found, or nil
+if this isn't a declaration block.  Declaration blocks are named
+subroutines, packages and the like.  They start with a keyword
+and a name, to be followed by various descriptive items which are
+just skipped over for our purpose."
+  ;; A scan error means that none of the declarators has been found
+  (condition-case nil
+      (let ((is-block-declaration nil)
+            (continue-searching t))
+        (while (and continue-searching (not (bobp)))
+          (forward-sexp -1)
+          (cond
+           ((looking-at cperl--block-declaration-regexp)
+            (setq is-block-declaration (current-column)
+                  continue-searching nil))
+           ;; Another brace means this is no block declaration
+           ((looking-at "{")
+            (setq continue-searching nil))
+           (t
+            (cperl-backward-to-noncomment (point-min))
+            ;; A semicolon or an opening brace prevent this block from
+            ;; being a block declaration
+            (when (or (eq (preceding-char) ?\;)
+                      (eq (preceding-char) ?{))
+              (setq continue-searching nil)))))
+        is-block-declaration)
+    (error nil)))
+
+(defun cperl-beginning-of-defun (&optional arg)
+  "Move backward to the beginning of a subroutine or package ARG times.
+If point is within a function, then go to the line where its
+definition starts.  If point is outside of a function but within
+a package, go to the first line of the package.  Returns t if the
+search succeeded.  If there's neither a package nor a subroutine
+around point, leaves point undhanged and returns nil.  This
+function ignores `open-paren-in-column-0-is-defun-start',
+and also ignores ARG.
+
+This is the substitute for `beginning-of-defun-raw', so it does
+not jump to the beginning of the line where the target was
+found."
+  (interactive "^p")
+  (unless arg (setq arg 1))
+  (let ((state (syntax-ppss (point))))
+    (catch 'done
+      (dolist (paren-pos (reverse (nth 9 state)))
+        (when (char-equal (char-after paren-pos) ?{)
+          (goto-char paren-pos)
+          (when (cperl-block-declaration-p)
+            (throw 'done t))))
+      (re-search-backward
+       (rx-to-string `(sequence line-start
+                                (regexp ,cperl--package-regexp)))
+       nil 'move))))
+
+\f
 ;; These two must be unwound, otherwise take exponential time
-(defconst cperl-maybe-white-and-comment-rex "[ \t\n]*\\(#[^\n]*\n[ \t\n]*\\)*"
+(defconst cperl-maybe-white-and-comment-rex
+  (rx-to-string `(group (eval cperl--ws*-rx)))
+  ;; was: "[ \t\n]*\\(#[^\n]*\n[ \t\n]*\\)*"
 "Regular expression to match optional whitespace with interspersed comments.
 Should contain exactly one group.")
 
 ;; This one is tricky to unwind; still very inefficient...
-(defconst cperl-white-and-comment-rex "\\([ \t\n]\\|#[^\n]*\n\\)+"
+(defconst cperl-white-and-comment-rex
+  (rx-to-string `(group (eval cperl--ws+-rx)))
+  ;; was: "\\([ \t\n]\\|#[^\n]*\n\\)+"
 "Regular expression to match whitespace with interspersed comments.
 Should contain exactly one group.")
 
-
-;; Is incorporated in `cperl-outline-regexp', `defun-prompt-regexp'.
-;; Details of groups in this may be used in several functions; see comments
-;; near mentioned above variable(s)...
-;; sub($$):lvalue{}  sub:lvalue{} Both allowed...
-(defsubst cperl-after-sub-regexp (named attr) ; 9 groups without attr...
-  "Match the text after `sub' in a subroutine declaration.
-If NAMED is nil, allows anonymous subroutines.  Matches up to the first \":\"
-of attributes (if present), or end of the name or prototype (whatever is
-the last)."
-  (concat				; Assume n groups before this...
-   "\\("				; n+1=name-group
-     cperl-white-and-comment-rex	; n+2=pre-name
-     "\\(::[a-zA-Z_0-9:']+\\|[a-zA-Z_'][a-zA-Z_0-9:']*\\)" ; n+3=name
-   "\\)"				; END n+1=name-group
-   (if named "" "?")
-   "\\("				; n+4=proto-group
-     cperl-maybe-white-and-comment-rex	; n+5=pre-proto
-     "\\(([^()]*)\\)"			; n+6=prototype
-   "\\)?"				; END n+4=proto-group
-   "\\("				; n+7=attr-group
-     cperl-maybe-white-and-comment-rex	; n+8=pre-attr
-     "\\("				; n+9=start-attr
-        ":"
-	(if attr (concat
-		  "\\("
-		     cperl-maybe-white-and-comment-rex ; whitespace-comments
-		     "\\(\\sw\\|_\\)+"	; attr-name
-		     ;; attr-arg (1 level of internal parens allowed!)
-		     "\\((\\(\\\\.\\|[^\\()]\\|([^\\()]*)\\)*)\\)?"
-		     "\\("		; optional : (XXX allows trailing???)
-		        cperl-maybe-white-and-comment-rex ; whitespace-comments
-		     ":\\)?"
-		  "\\)+")
-	  "[^:]")
-     "\\)"
-   "\\)?"				; END n+6=proto-group
-   ))
-
 ;; Tired of editing this in 8 places every time I remember that there
 ;; is another method-defining keyword
 (defvar cperl-sub-keywords
@@ -1414,28 +1471,8 @@ cperl-char-ends-sub-keyword-p
            when (eq char (aref keyword (1- (length keyword))))
            return t))
 
-;; Details of groups in this are used in `cperl-imenu--create-perl-index'
-;;  and `cperl-outline-level'.
-;; Was: 2=sub|package; now 2=package-group, 5=package-name 8=sub-name (+3)
-(defvar cperl-imenu--function-name-regexp-perl
-  (concat
-   "^\\("				; 1 = all
-       "\\([ \t]*package"		; 2 = package-group
-          "\\("				; 3 = package-name-group
-	    cperl-white-and-comment-rex ; 4 = pre-package-name
-	       "\\([a-zA-Z_0-9:']+\\)\\)?\\)" ; 5 = package-name
-       "\\|"
-          "[ \t]*"
-          cperl-sub-regexp
-	  (cperl-after-sub-regexp 'named nil) ; 8=name 11=proto 14=attr-start
-	  cperl-maybe-white-and-comment-rex	; 15=pre-block
-   "\\|"
-     "=head\\([1-4]\\)[ \t]+"		; 16=level
-     "\\([^\n]+\\)$"			; 17=text
-   "\\)"))
-
 (defvar cperl-outline-regexp
-  (concat cperl-imenu--function-name-regexp-perl "\\|" "\\`"))
+  (concat "^[ \t]*" cperl--imenu-entries-regexp "\\|" "\\`"))
 
 (defvar cperl-mode-syntax-table nil
   "Syntax table in use in CPerl mode buffers.")
@@ -1713,20 +1750,11 @@ cperl-mode
   (setq-local comment-end "")
   (setq-local comment-column cperl-comment-column)
   (setq-local comment-start-skip "#+ *")
-
-;;       "[ \t]*sub"
-;;	  (cperl-after-sub-regexp 'named nil) ; 8=name 11=proto 14=attr-start
-;;	  cperl-maybe-white-and-comment-rex	; 15=pre-block
-  (setq-local defun-prompt-regexp
-              (concat "^[ \t]*\\("
-                      cperl-sub-regexp
-                      (cperl-after-sub-regexp 'named 'attr-groups)
-                      "\\|"			; per toke.c
-                      "\\(BEGIN\\|UNITCHECK\\|CHECK\\|INIT\\|END\\|AUTOLOAD\\|DESTROY\\)"
-                      "\\)"
-                      cperl-maybe-white-and-comment-rex))
   (setq-local comment-indent-function #'cperl-comment-indent)
+
+  (setq-local beginning-of-defun-function #'cperl-beginning-of-defun)
   (setq-local fill-paragraph-function #'cperl-fill-paragraph)
+
   (setq-local parse-sexp-ignore-comments t)
   (setq-local indent-region-function #'cperl-indent-region)
   ;;(setq auto-fill-function #'cperl-do-auto-fill) ; Need to switch on and off!
@@ -2523,8 +2551,9 @@ cperl-indent-line
 	  (t
 	   (skip-chars-forward " \t")
 	   (if (listp indent) (setq indent (car indent)))
-	   (cond ((and (looking-at "[A-Za-z_][A-Za-z_0-9]*:[^:]")
-		       (not (looking-at "[smy]:\\|tr:")))
+	   (cond ((and (looking-at (rx (sequence (eval cperl--label-rx)
+                                                 (not (in ":")))))
+                       (not (looking-at (rx (eval cperl--false-label-rx)))))
 		  (and (> indent 0)
 		       (setq indent (max cperl-min-label-indent
 					 (+ indent cperl-label-offset)))))
@@ -2716,6 +2745,8 @@ cperl-sniff-for-indent
 			     (and (eq (preceding-char) ?\})
 				  (cperl-after-block-and-statement-beg
 				   (point-min))) ; Was start - too close
+                             (and char-after (char-equal char-after ?{)
+                                  (save-excursion (cperl-block-declaration-p)))
 			     (memq char-after (append ")]}" nil))
 			     (and (eq (preceding-char) ?\:) ; label
 				  (progn
@@ -2759,12 +2790,10 @@ cperl-sniff-for-indent
 		   ;; Back up over label lines, since they don't
 		   ;; affect whether our line is a continuation.
 		   ;; (Had \, too)
-		   (while;;(or (eq (preceding-char) ?\,)
-		       (and (eq (preceding-char) ?:)
-			    (or;;(eq (char-after (- (point) 2)) ?\') ; ????
-			     (memq (char-syntax (char-after (- (point) 2)))
-				   '(?w ?_))))
-		     ;;)
+                   (while (and (eq (preceding-char) ?:)
+                                 (re-search-backward
+                                  (rx (sequence (eval cperl--label-rx) point))
+                                  nil t))
 		     ;; This is always FALSE?
 		     (if (eq (preceding-char) ?\,)
 			 ;; Will go to beginning of line, essentially.
@@ -2776,6 +2805,7 @@ cperl-sniff-for-indent
 		   (if (not (or (eq (1- (point)) containing-sexp)
                                 (and cperl-indent-parens-as-block
                                      (not is-block))
+                                (save-excursion (cperl-block-declaration-p))
 				(memq (preceding-char)
 				      (append (if is-block " ;{" " ,;{") '(nil)))
 				(and (eq (preceding-char) ?\})
@@ -2804,10 +2834,17 @@ cperl-sniff-for-indent
 			(forward-char 1)
 			(let ((colon-line-end 0))
 			  (while
-			      (progn (skip-chars-forward " \t\n")
-				     ;; s: foo : bar :x is NOT label
-				     (and (looking-at "#\\|\\([a-zA-Z0-9_$]+\\):[^:]\\|=[a-zA-Z]")
-					  (not (looking-at "[sym]:\\|tr:"))))
+			      (progn
+                                (skip-chars-forward " \t\n")
+				;; s: foo : bar :x is NOT label
+                                (and (looking-at
+                                      (rx
+                                       (or "#"
+                                           (sequence (eval cperl--label-rx)
+                                                     (not (in ":")))
+                                           (sequence "=" (in "a-zA-Z")))))
+				     (not (looking-at
+                                           (rx (eval cperl--false-label-rx))))))
 			    ;; Skip over comments and labels following openbrace.
 			    (cond ((= (following-char) ?\#)
 				   (forward-line 1))
@@ -3066,7 +3103,10 @@ cperl-calculate-indent
 		 ;; If line starts with label, calculate label indentation
 		 (if (save-excursion
 		       (beginning-of-line)
-		       (looking-at "[ \t]*[a-zA-Z_][a-zA-Z_0-9]*:[^:]"))
+                       (looking-at (rx
+                                    (sequence (0+ space)
+                                              (eval cperl--label-rx)
+                                              (not (in ":"))))))
 		     (if (> (current-indentation) cperl-min-label-indent)
 			 (- (current-indentation) cperl-label-offset)
 		       ;; Do not move `parse-data', this should
@@ -4645,15 +4685,19 @@ cperl-backward-to-noncomment
 	    (if (< p (point)) (goto-char p))
 	    (setq stop t))))))
 
-;; Used only in `cperl-calculate-indent'...
+;; Used only in `cperl-sniff-for-indent'...
 (defun cperl-block-p ()
-  "Point is before ?\\{.  Checks whether it starts a block."
+  "Point is before ?\\{.  Returns true if it starts a block."
   ;; No save-excursion!  This is more a distinguisher of a block/hash ref...
   (cperl-backward-to-noncomment (point-min))
   (or (memq (preceding-char) (append ";){}$@&%\C-@" nil)) ; Or label!  \C-@ at bobp
 					; Label may be mixed up with `$blah :'
       (save-excursion (cperl-after-label))
+      ;; text with the 'attrib-group property is also covered by the
+      ;; next clause.  We keep it because it is faster (for
+      ;; subroutines with attributes).
       (get-text-property (cperl-1- (point)) 'attrib-group)
+      (save-excursion (cperl-block-declaration-p))
       (and (memq (char-syntax (preceding-char)) '(?w ?_))
 	   (progn
 	     (backward-sexp)
@@ -4691,20 +4735,16 @@ cperl-after-block-p
 	      (save-excursion (cperl-after-label))
 	      ;; sub :attr {}
 	      (get-text-property (cperl-1- (point)) 'attrib-group)
+              (save-excursion (cperl-block-declaration-p))
 	      (if (memq (char-syntax (preceding-char)) '(?w ?_)) ; else {}
 		  (save-excursion
 		    (forward-sexp -1)
 		    ;; else {}     but not    else::func {}
-		    (or (and (looking-at "\\(else\\|catch\\|try\\|continue\\|grep\\|map\\|BEGIN\\|END\\|UNITCHECK\\|CHECK\\|INIT\\)\\>")
-			     (not (looking-at "\\(\\sw\\|_\\)+::")))
-			;; sub f {}
-			(progn
-			  (cperl-backward-to-noncomment lim)
-			  (and (cperl-char-ends-sub-keyword-p (preceding-char))
-			       (progn
-				 (forward-sexp -1)
-				 (looking-at
-                                  (concat cperl-sub-regexp "[ \t\n\f#]")))))))
+                    (or (and (looking-at
+                              (rx-to-string `(sequence
+                                   (eval (cons 'or cperl-bare-block-keywords))
+                                   word-end)))
+			     (not (looking-at "\\(\\sw\\|_\\)+::")))))
 		;; What precedes is not word...  XXXX Last statement in sub???
 		(cperl-after-expr-p lim))))
       (error nil))))
@@ -5326,9 +5366,9 @@ cperl-imenu--create-perl-index
 	      (setq current-package-end (save-excursion
 					  (goto-char (match-beginning 3))
 					  (forward-sexp)
-					  (point)))
+					  (point))))
 	    (push (cons name marker) index-package-alist)
-	    (push (cons (concat "package " name) marker) index-unsorted-alist))))
+	    (push (cons (concat "package " name) marker) index-unsorted-alist)))
 	 ((match-string 5)              ; found a sub name!
 	  (unless (nth 4 state)         ; skip if in a comment
 	    (setq name (match-string-no-properties 5)
@@ -5592,10 +5632,24 @@ cperl-init-faces
 		       2 font-lock-string-face t)))
 	    '("[[ \t{,(]\\(-?[a-zA-Z0-9_:]+\\)[ \t]*=>" 1
 	      font-lock-string-face t)
-	    '("^[ \t]*\\([a-zA-Z0-9_]+[ \t]*:\\)[ \t]*\\($\\|{\\|\\<\\(until\\|while\\|for\\(each\\)?\\|do\\)\\>\\)" 1
-	      font-lock-constant-face)	; labels
-	    '("\\<\\(continue\\|next\\|last\\|redo\\|break\\|goto\\)\\>[ \t]+\\([a-zA-Z0-9_:]+\\)" ; labels as targets
-	      2 font-lock-constant-face)
+            ;; labels
+            `(,(rx-to-string
+                `(sequence
+                  (0+ space)
+                  (group (eval cperl--label-rx))
+                  (0+ space)
+                  (or line-end "#" "{"
+                      (sequence word-start
+                                (or "until" "while" "for" "foreach" "do")
+                                word-end))))
+              1 font-lock-constant-face)
+            ;; labels as targets (no trailing colon!)
+            `(,(rx-to-string
+                '(sequence
+                 (or "continue" "next" "last" "redo" "break" "goto")
+                 (0+ space)
+                 (group (eval cperl--basic-identifier-rx))))
+              1 font-lock-constant-face)
 	    ;; Uncomment to get perl-mode-like vars
             ;;; '("[$*]{?\\(\\sw+\\)" 1 font-lock-variable-name-face)
             ;;; '("\\([@%]\\|\\$#\\)\\(\\sw+\\)"
diff --git a/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl
index 8c1883a10f..33d0d7942e 100644
--- a/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl
+++ b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl
@@ -50,3 +50,79 @@
   quuux;
 }
 # -------- if-then-else: end --------
+
+# -------- labels: input --------
+{
+LABEL:
+for (0..1) {
+next LABEL if $_;
+unless ($_) last LABEL;
+}
+LABEL_WITH_SPACE 	:
+while (1) {
+last LABEL_WITH_SPACE;
+}
+LABEL_WITH_COMMENT : # explain that label
+say "This label has a comment."
+}
+# -------- labels: expected output --------
+{
+ LABEL:
+  for (0..1) {
+    next LABEL if $_;
+    unless ($_) last LABEL;
+  }
+ LABEL_WITH_SPACE 	:
+  while (1) {
+    last LABEL_WITH_SPACE;
+  }
+ LABEL_WITH_COMMENT :		# explain that label
+  say "This label has a comment."
+}
+# -------- labels: end --------
+
+# -------- packages: input --------
+package Foo {
+package Foo::Inner::One {
+...;
+}
+package Foo::Inner::Two {
+...;
+}
+}
+# -------- packages: expected output --------
+package Foo {
+  package Foo::Inner::One {
+    ...;
+  }
+  package Foo::Inner::Two {
+    ...;
+  }
+}
+# -------- packages: end --------
+
+# -------- braces after newline: input --------
+package Outer {
+package Inner
+{
+sub foo
+{
+}
+sub bar
+{
+}
+}
+}
+# -------- braces after newline: expected output --------
+package Outer {
+  package Inner
+  {
+    sub foo
+    {
+    }
+    sub bar
+    {
+    }
+  }
+}
+# -------- braces after newline: end --------
diff --git a/test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl
new file mode 100644
index 0000000000..219d5292ca
--- /dev/null
+++ b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl
@@ -0,0 +1,118 @@
+#!/usr/bin/env perl
+use strict;
+use warnings;
+use 5.020;
+
+# This file contains test input and expected output for the tests in
+# cperl-mode-tests.el, cperl-test-indent-region.  The code is
+# syntactically valid, but doesn't make much sense.
+
+# (Bug#11733) https://debbugs.gnu.org/cgi/bugreport.cgi?bug=11733
+# -------- sub \n brace: input --------
+sub foo
+  {
+  }
+
+sub bar
+  {
+  }
+# -------- sub \n brace: expected output --------
+sub foo
+{
+}
+
+sub bar
+{
+}
+# -------- sub \n brace: end --------
+
+# -------- map \n brace: input --------
+{
+my %hash = map
+{
+$_ = 1
+}
+@_;
+}
+# -------- map \n brace: expected output --------
+{
+  my %hash = map
+    {
+      $_ = 1
+    }
+    @_;
+}
+# -------- map \n brace: end --------
+
+# -------- if \n brace: input --------
+if (1 == @_)
+  {
+    say "one argument";
+  } else {
+    say "No arguments, or plenty";
+  }
+# -------- if \n brace: expected output --------
+if (1 == @_) {
+  say "one argument";
+} else {
+  say "No arguments, or plenty";
+}
+# -------- if \n brace: end --------
+
+# -------- simple sub: input --------
+sub middle {
+  my ($left,$right) = @_;
+  return  0.5 * ($left + $right);
+}
+$this_should_be_justified = 'left';
+# -------- simple sub: expected output --------
+sub middle {
+  my ($left,$right) = @_;
+  return  0.5 * ($left + $right);
+}
+$this_should_be_justified = 'left';
+# -------- simple sub: end --------
+
+# ------- hash reference: input --------
+my $result = { ok => 'true' }
+  if ($this_should_be_indented == 'cperl-continued-statement-offset');
+# ------- hash reference: expected output --------
+my $result = { ok => 'true' }
+  if ($this_should_be_indented == 'cperl-continued-statement-offset');
+# ------- hash reference: end --------
+
+# The following syntax was introduced in Perl v5.14:
+# ------- package BLOCK: input --------
+package Puck {
+our $cperl_indent_level = 4;
+}
+our $good_indentation = 0;
+# ------- package BLOCK: expected output --------
+package Puck {
+  our $cperl_indent_level = 4;
+}
+our $good_indentation = 0;
+# ------- package BLOCK: end --------
+
+# as of Perl 5.18, subroutines can be lexical
+# -------- simple sub: input --------
+package Pick 1.23 {
+  my sub middle {
+    my ($left,$right) = @_;
+    return  0.5 * ($left + $right);
+    $this_should_be_indented =
+      $cperl_indent_level;
+  }
+}
+$this_should_be_justified = 'left';
+# -------- simple sub: expected output --------
+package Pick 1.23 {
+  my sub middle {
+    my ($left,$right) = @_;
+    return  0.5 * ($left + $right);
+    $this_should_be_indented =
+      $cperl_indent_level;
+  }
+}
+$this_should_be_justified = 'left';
+# -------- simple sub: end --------
diff --git a/test/lisp/progmodes/cperl-mode-tests.el b/test/lisp/progmodes/cperl-mode-tests.el
index 61e4ece49b..1bcc41ab9f 100644
--- a/test/lisp/progmodes/cperl-mode-tests.el
+++ b/test/lisp/progmodes/cperl-mode-tests.el
@@ -103,6 +103,14 @@ cperl-test-indent-exp
    (ert-resource-file "cperl-indent-exp.pl")
    (cperl-indent-exp))) ; here we go!
 
+(ert-deftest cperl-test-indent-region ()
+  "Test indenting of regions which can span more than one
+expression."
+  (skip-unless (eq cperl-test-mode #'cperl-mode))
+  (cperl--run-test-cases
+   (ert-resource-file "cperl-indent-region.pl")
+   (indent-region (point-min) (point-max))))
+
 (ert-deftest cperl-test-indent-styles ()
   (skip-unless (eq cperl-test-mode #'cperl-mode))
   (cperl--run-test-cases
@@ -189,17 +197,17 @@ cperl-test-ws-regexp
 	 '(" " "\t" "\n"))
 	(invalid
 	 '("a" "  " "")))
-    (cperl-test--validate-regexp cperl--ws-regexp
+    (cperl-test--validate-regexp (rx (eval cperl--ws-rx))
 				 valid invalid)))
 
-(ert-deftest cperl-test-ws-or-comment-regexp ()
+(ert-deftest cperl-test-ws+-rx ()
   "Tests sequences of whitespace and comment lines."
   (let ((valid
 	 `(" " "\t#\n" "\n# \n"
 	   ,(concat "# comment\n" "# comment\n" "\n" "#comment\n")))
 	(invalid
 	 '("=head1 NAME\n" )))
-    (cperl-test--validate-regexp cperl--ws-or-comment-regexp
+    (cperl-test--validate-regexp (rx (eval cperl--ws+-rx))
 				 valid invalid)))
 
 (ert-deftest cperl-test-version-regexp ()
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-05-18 13:47 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-03 14:43 bug#46889: cperl-mode: Fix indentation issues [PATCH] Harald Jörg
2021-03-04 17:31 ` Lars Ingebrigtsen
2021-03-04 18:07   ` Mattias Engdegård
2021-03-04 18:16     ` Lars Ingebrigtsen
2021-03-04 18:23       ` Mattias Engdegård
2021-03-04 18:34         ` Lars Ingebrigtsen
2021-03-05 10:30       ` bug#46889: Help version guesser bug (was: bug#46889: cperl-mode: Fix indentation issues [PATCH]) Stephen Berman
2021-03-05 13:06         ` bug#46889: Help version guesser bug Lars Ingebrigtsen
2021-03-04 18:19   ` bug#46889: cperl-mode: Fix indentation issues [PATCH] Harald Jörg
2021-03-09 17:06 ` Harald Jörg
2021-03-09 17:18 ` Harald Jörg
2021-03-10 14:42   ` Lars Ingebrigtsen
2021-05-17 15:28     ` Lars Ingebrigtsen
2021-05-17 21:31       ` Harald Jörg
2021-05-18 13:47         ` Lars Ingebrigtsen

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).