From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: haj@posteo.de (Harald =?UTF-8?Q?J=C3=B6rg?=) Newsgroups: gmane.emacs.bugs Subject: bug#46889: cperl-mode: Fix indentation issues [PATCH] Date: Wed, 03 Mar 2021 15:43:10 +0100 Message-ID: <87im68uwld.fsf@hajtower> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="35530"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) To: 46889@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Wed Mar 03 15:44:29 2021 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lHSjk-00092W-D3 for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 03 Mar 2021 15:44:28 +0100 Original-Received: from localhost ([::1]:51162 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lHSjj-0005Z3-Fb for geb-bug-gnu-emacs@m.gmane-mx.org; Wed, 03 Mar 2021 09:44:27 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:40032) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lHSjL-0005Yj-2m for bug-gnu-emacs@gnu.org; Wed, 03 Mar 2021 09:44:03 -0500 Original-Received: from debbugs.gnu.org ([209.51.188.43]:43928) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lHSjK-0001Or-Rn for bug-gnu-emacs@gnu.org; Wed, 03 Mar 2021 09:44:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1lHSjK-0005ts-L1 for bug-gnu-emacs@gnu.org; Wed, 03 Mar 2021 09:44:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: haj@posteo.de (Harald =?UTF-8?Q?J=C3=B6rg?=) Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 03 Mar 2021 14:44:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 46889 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Original-Received: via spool by submit@debbugs.gnu.org id=B.161478260722624 (code B ref -1); Wed, 03 Mar 2021 14:44:02 +0000 Original-Received: (at submit) by debbugs.gnu.org; 3 Mar 2021 14:43:27 +0000 Original-Received: from localhost ([127.0.0.1]:55474 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lHSij-0005sp-GH for submit@debbugs.gnu.org; Wed, 03 Mar 2021 09:43:26 -0500 Original-Received: from lists.gnu.org ([209.51.188.17]:52720) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lHSih-0005sf-86 for submit@debbugs.gnu.org; Wed, 03 Mar 2021 09:43:24 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:39978) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lHSih-0005Vz-0z for bug-gnu-emacs@gnu.org; Wed, 03 Mar 2021 09:43:23 -0500 Original-Received: from mout02.posteo.de ([185.67.36.66]:41413) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lHSiZ-0001BJ-2D for bug-gnu-emacs@gnu.org; Wed, 03 Mar 2021 09:43:22 -0500 Original-Received: from submission (posteo.de [89.146.220.130]) by mout02.posteo.de (Postfix) with ESMTPS id D861C240100 for ; Wed, 3 Mar 2021 15:43:11 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.de; s=2017; t=1614782591; bh=5TL0+U3ra8aIaV+Nvv1Re4k/FZE6KLWBsP6COKZllA4=; h=From:To:Subject:Date:From; b=ZvtLOOvkY3rVreOsj/Iokq8m0GerO5UlRe/KBiN1qESTyYkSa6oIcpasW5NcEhhQz 7Dp7bBuKE5N13K60sd7vQQcO/HrsHivZHhE2PV7df9ee/mfuINWUP//pAP7dbB1+1/ p0sMqmJVo4tPh8CMdC6uTbXn3iHeAiRCNn4hWsmhVuQWjaTv8LzarfGvdMXiK04Ykn bEyINVoQ0mQ3FzWSPb/AbqIvOfUBV3kJvxBTf8s6m2bNnvETC4TSyIVE58HLMqau84 fasytoyBsEr6Ls92MnV/B16PivyUhNC4lu1OmDKjJB12i1mCj/lzwnER5pCyzBZ6cG 6e6UjOhploAgg== Original-Received: from customer (localhost [127.0.0.1]) by submission (posteo.de) with ESMTPSA id 4DrGvM0pr3z9rxM for ; Wed, 3 Mar 2021 15:43:10 +0100 (CET) Received-SPF: pass client-ip=185.67.36.66; envelope-from=haj@posteo.de; helo=mout02.posteo.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:201276 Archived-At: --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hello Emacs-Maintainers, this started out as working on Bug#42169 but it soon became clear that there's more lurking in CPerl mode's indentation code, so I decided to open a new issue here. In 7b2448ae (cperl-mode: Improve detection of index entries for imenu), committed on Feb 17, the imenu part of Bug#42169 has been fixed. This commit now fixes the indentation after a package which brings its own block. Therefore, the patch in this report should finally fix Bug#42169. As a by-catch, the new code fixes the ancient Bug#8077. This patch also fixes the first part of Bug#11733. The "other indentation bug" reported in Bug#11733 (using a colon as a Perl regexp terminator) is a different story and not covered by this patch. Also fixed now (without known bug number): labels with a space before, or a comment after the colon are now correctly indented and fontified. With this patch in effect, CPerl mode should finally have caught up with Perl 5.30! Many thanks to Mattias Engdeg=C3=A5rd for his patient explanations how rx expressions can be used without breaking compatibility to Emacs 26.1. Until now, this allowed two ugly regexp literals to be eliminated. --=20 Cheers, haj --=-=-= Content-Type: text/x-diff; charset=utf-8 Content-Disposition: attachment; filename=0001-cperl-mode-fix-indentation-issues.patch Content-Transfer-Encoding: quoted-printable Content-Description: cperl-mode: Fix indentation issues >From 792f23db44a04db179921e72b11634e03d04ebb0 Mon Sep 17 00:00:00 2001 From: =3D?UTF-8?q?Harald=3D20J=3DC3=3DB6rg?=3D Date: Wed, 3 Mar 2021 14:07:44 +0100 Subject: [PATCH] ; cperl-mode: fix indentation issues MIME-Version: 1.0 Content-Type: text/plain; charset=3DUTF-8 Content-Transfer-Encoding: 8bit * lisp/progmodes/cperl-mode.el (cperl-core-named-block-keywords), (cperl-bare-block-keywords): New variables to allow adding new keywords without mangling regular expressions.. (cperl--label-rx), (cperl--false-label-rx) and more: String regexps are replaced by rx forms, errors reported by relint are fixed. (thanks Mattias Engdeg=C3=A5rd!) (cperl-block-declaration-p): New function to replace several inconsistent regexps. (cperl-beginning-of-defun): New function to capture lexical subs and functions where the opening brace is on a line of its own (or even separated by attributes, signatures etc.) (cperl-maybe-white-and-comment-rex), (cperl-white-and-comment-rex): Now implemented in terms of rx-forms. (cperl-after-sub-regexp): Deleted, no more used. (cperl-imenu--function-name-regexp-perl): Deleted, no more used. (cperl-indent-line): Use the new rx-forms. (cperl-sniff-for-indent): Use the new rx-formxs. (cperl-after-block-p): Use the new keyword lists. (cperl-init-faces): Replace literal regexps by rx forms. * test/lisp/progmodes/cperl-mode-tests.el (cperl-test-indent-region): New indentation test for lexical subroutines and packages with version and/or blocks. (cperl-test-ws+-rx): Test the rx S-expression instead of the string regexp. * test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl: New testcases for indenting labels and "new style" packages. * test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl: New file for indentation tests spanning more than one expression. --- lisp/progmodes/cperl-mode.el | 396 ++++++++++-------- .../cperl-mode-resources/cperl-indent-exp.pl | 76 ++++ .../cperl-indent-region.pl | 118 ++++++ test/lisp/progmodes/cperl-mode-tests.el | 14 +- 4 files changed, 430 insertions(+), 174 deletions(-) create mode 100644 test/lisp/progmodes/cperl-mode-resources/cperl-indent-r= egion.pl diff --git a/lisp/progmodes/cperl-mode.el b/lisp/progmodes/cperl-mode.el index 6b22228397..d2788da625 100644 --- a/lisp/progmodes/cperl-mode.el +++ b/lisp/progmodes/cperl-mode.el @@ -1212,73 +1212,98 @@ cperl-menu ;; minimalistic Perl grammar, to be used instead of individual (and ;; not always consistent) literal regular expressions. =20 -(defconst cperl--basic-identifier-regexp - (rx (sequence (or alpha "_") (* (or word "_")))) - "A regular expression for the name of a \"basic\" Perl variable. +;; Perl Keyword sets +;; These might be extended by extension modules or future Perl +;; versions. Therefore it should be possible to change the values +;; at runtime, or even: for one CPerl mode buffer. +(defconst cperl-core-named-block-keywords + '("BEGIN" "CHECK" "END" "INIT" "UNITCHECK" + ;; These two _should_ be following a sub statement, but are + ;; actually valid Perl without. + "DESTROY" "AUTOLOAD") + "These keywords introduce a block which ends a statement + without 'sub', and without a semicolon. This affects + indentation.") + +(defconst cperl-bare-block-keywords + (append cperl-core-named-block-keywords + '("else" + ;; FIXME: The next keywords have been taken over from + ;; history. Do they really belong here? - haj 2021-02-25 + "grep" "map" + ;; FIXME: The following keywords are not Perl core + ;; and there are subtle syntactical differences between + ;; different extensions providing them - haj 2021-02-25 + "try" "catch" "continue")) + "Keywords which are directly followed by a block.") + +(eval-and-compile + (defconst cperl--ws-rx + '(sequence (or space "\n")) + "Regular expression for a single whitespace in Perl.") + + (defconst cperl--eol-comment-rx + '(sequence "#" (0+ (not (in "\n"))) "\n") + "Regular expression for a single end-of-line comment in Perl") + + (defconst cperl--ws-or-comment-rx + '(or (eval cperl--ws-rx) + (eval cperl--eol-comment-rx)) + "A regular expression for either whitespace or comment") + + (defconst cperl--ws*-rx + '(0+ (eval cperl--ws-or-comment-rx)) + "Regular expression for optional whitespaces or comments in Perl") + + (defconst cperl--ws+-rx + '(1+ (eval cperl--ws-or-comment-rx)) + "Regular expression for a sequence of whitespace and comments in Perl.= ") + + (defconst cperl--basic-identifier-rx + '(sequence (or alpha "_") (* (or word "_"))) + "A regular expression for the name of a \"basic\" Perl variable. Neither namespace separators nor sigils are included. As is, this regular expression applies to labels,subroutine calls where the ampersand sigil is not required, and names of subroutine attributes.") =20 -(defconst cperl--label-regexp - (rx-to-string - `(sequence - symbol-start - (regexp ,cperl--basic-identifier-regexp) - (0+ space) - ":")) - "A regular expression for a Perl label. + (defconst cperl--label-rx + '(sequence symbol-start + (eval cperl--basic-identifier-rx) + (0+ space) + ":") + "A regular expression for a Perl label. By convention, labels are uppercase alphabetics, but this isn't enforced.") =20 -(defconst cperl--normal-identifier-regexp - (rx-to-string - `(or - (sequence - (1+ (sequence - (opt (regexp ,cperl--basic-identifier-regexp)) - "::")) - (opt (regexp ,cperl--basic-identifier-regexp))) - (regexp ,cperl--basic-identifier-regexp))) - "A regular expression for a Perl variable name with optional namespace. + (defconst cperl--false-label-rx + '(sequence (or (in "sym") "tr") (0+ space) ":") + "A regular expression which is similar to a label, but might as + well be a quote-like operator with a colon as delimiter.") + + (defconst cperl--normal-identifier-rx + '(or (sequence (1+ (sequence + (opt (eval cperl--basic-identifier-rx)) + "::")) + (opt (eval cperl--basic-identifier-rx))) + (eval cperl--basic-identifier-rx)) + "A regular expression for a Perl variable name with optional namespace. Examples are `foo`, `Some::Module::VERSION`, and `::` (yes, that is a legal variable name).") =20 -(defconst cperl--special-identifier-regexp - (rx-to-string - `(or - (1+ digit) ; $0, $1, $2, ... - (sequence "^" (any "A-Z" "]^_?\\")) ; $^V - (sequence "{" (0+ space) ; ${^MATCH} - "^" (any "A-Z" "]^_?\\") - (0+ (any "A-Z" "_" digit)) - (0+ space) "}") - (in "!\"$%&'()+,-./:;<=3D>?@\\]^_`|~"))) ; $., $|, $", ... but not = $^ or ${ - "The list of Perl \"punctuation\" variables, as listed in perlvar.") - -(defconst cperl--ws-regexp - (rx-to-string - '(or space "\n")) - "Regular expression for a single whitespace in Perl.") - -(defconst cperl--eol-comment-regexp - (rx-to-string - '(sequence "#" (0+ (not (in "\n"))) "\n")) - "Regular expression for a single end-of-line comment in Perl") - -(defconst cperl--ws-or-comment-regexp - (rx-to-string - `(1+ - (or - (regexp ,cperl--ws-regexp) - (regexp ,cperl--eol-comment-regexp)))) - "Regular expression for a sequence of whitespace and comments in Perl.") - -(defconst cperl--ows-regexp - (rx-to-string - `(opt (regexp ,cperl--ws-or-comment-regexp))) - "Regular expression for optional whitespaces or comments in Perl") - + (defconst cperl--special-identifier-rx + '(or + (1+ digit) ; $0, $1, $2, ... + (sequence "^" (any "A-Z" "]^_?\\")) ; $^V + (sequence "{" (0+ space) ; ${^MATCH} + "^" (any "A-Z" "]^_?\\") + (0+ (any "A-Z" "_" digit)) + (0+ space) "}") + (in "!\"$%&'()+,-./:;<=3D>?@\\]^_`|~")) ; $., $|, $", ... but not = $^ or ${ + "The list of Perl \"punctuation\" variables, as listed in perlvar.")) + +;; This is left as a string regexp. There are many version schemes in +;; the wild, so people might want to fiddle with this variable. (defconst cperl--version-regexp (rx-to-string `(or @@ -1296,11 +1321,11 @@ cperl--package-regexp `(sequence "package" ; FIXME: the "class" and "role" keywords need to be ; recognized soon...ish. - (regexp ,cperl--ws-or-comment-regexp) - (group (regexp ,cperl--normal-identifier-regexp)) + (eval cperl--ws+-rx) + (group (eval cperl--normal-identifier-rx)) (opt (sequence - (regexp ,cperl--ws-or-comment-regexp) + (eval cperl--ws+-rx) (group (regexp ,cperl--version-regexp)))))) "A regular expression for package NAME VERSION in Perl. Contains two groups for the package name and version.") @@ -1309,7 +1334,7 @@ cperl--package-for-imenu-regexp (rx-to-string `(sequence (regexp ,cperl--package-regexp) - (regexp ,cperl--ows-regexp) + (eval cperl--ws*-rx) (group (or ";" "{")))) "A regular expression to collect package names for `imenu`. Catches \"package NAME;\", \"package NAME VERSION;\", \"package @@ -1321,19 +1346,28 @@ cperl--sub-name-regexp (rx-to-string `(sequence (optional (sequence (group (or "my" "state" "our")) - (regexp ,cperl--ws-or-comment-regexp))) + (eval cperl--ws+-rx))) "sub" ; FIXME: the "method" and maybe "fun" keywords need to be ; recognized soon...ish. - (regexp ,cperl--ws-or-comment-regexp) - (group (regexp ,cperl--normal-identifier-regexp)))) + (eval cperl--ws+-rx) + (group (eval cperl--normal-identifier-rx)))) "A regular expression to detect a subroutine start. Contains two groups: One for to distinguish lexical from \"normal\" subroutines and one for the subroutine name.") =20 -(defconst cperl--pod-heading-regexp +(defconst cperl--block-declaration-regexp (rx-to-string `(sequence - line-start "=3Dhead" + (or "package" "sub") ; "class" and "method" coming soon + (1+ (eval cperl--ws-or-comment-rx)) + (eval cperl--normal-identifier-rx))) + "A regular expression to find a declaration for a named block. +Used for indentation. These declarations introduce a block which +does not need a semicolon to terminate the statement.") + +(defconst cperl--pod-heading-regexp + (rx-to-string + `(sequence line-start "=3Dhead" (group (in "1-4")) (1+ (in " \t")) (group (1+ (not (in "\n")))) @@ -1346,61 +1380,84 @@ cperl--imenu-entries-regexp (rx-to-string `(or (regexp ,cperl--package-for-imenu-regexp) ; 1..3 - (regexp ,cperl--sub-name-regexp) ; 4..5 + (regexp ,cperl--sub-name-regexp) ; 4..5 (regexp ,cperl--pod-heading-regexp))) ; 6..7 "A regular expression to collect stuff that goes into the `imenu` index. Covers packages, subroutines, and POD headings.") =20 +(defun cperl-block-declaration-p () + "Tests whether the following ?\\{ opens a declaration block. +Returns the column where the declarating keyword is found, or nil +if this isn't a declaration block. Declaration blocks are named +subroutines, packages and the like. They start with a keyword +and a name, to be followed by various descriptive items which are +just skipped over for our purpose." + ;; A scan error means that none of the declarators has been found + (condition-case nil + (let ((is-block-declaration nil) + (continue-searching t)) + (while (and continue-searching (not (bobp))) + (forward-sexp -1) + (cond + ((looking-at cperl--block-declaration-regexp) + (setq is-block-declaration (current-column) + continue-searching nil)) + ;; Another brace means this is no block declaration + ((looking-at "{") + (setq continue-searching nil)) + (t + (cperl-backward-to-noncomment (point-min)) + ;; A semicolon or an opening brace prevent this block from + ;; being a block declaration + (when (or (eq (preceding-char) ?\;) + (eq (preceding-char) ?{)) + (setq continue-searching nil))))) + is-block-declaration) + (error nil))) + +(defun cperl-beginning-of-defun (&optional arg) + "Move backward to the beginning of a subroutine or package ARG times. +If point is within a function, then go to the line where its +definition starts. If point is outside of a function but within +a package, go to the first line of the package. Returns t if the +search succeeded. If there's neither a package nor a subroutine +around point, leaves point undhanged and returns nil. This +function ignores `open-paren-in-column-0-is-defun-start', +and also ignores ARG. + +This is the substitute for `beginning-of-defun-raw', so it does +not jump to the beginning of the line where the target was +found." + (interactive "^p") + (unless arg (setq arg 1)) + (let ((state (syntax-ppss (point)))) + (catch 'done + (dolist (paren-pos (reverse (nth 9 state))) + (when (char-equal (char-after paren-pos) ?{) + (goto-char paren-pos) + (when (cperl-block-declaration-p) + (throw 'done t)))) + (re-search-backward + (rx-to-string `(sequence line-start + (regexp ,cperl--package-regexp))) + nil 'move)))) + + ;; These two must be unwound, otherwise take exponential time -(defconst cperl-maybe-white-and-comment-rex "[ \t\n]*\\(#[^\n]*\n[ \t\n]*\= \)*" +(defconst cperl-maybe-white-and-comment-rex + (rx-to-string `(group (eval cperl--ws*-rx))) + ;; was: "[ \t\n]*\\(#[^\n]*\n[ \t\n]*\\)*" "Regular expression to match optional whitespace with interspersed comment= s. Should contain exactly one group.") =20 ;; This one is tricky to unwind; still very inefficient... -(defconst cperl-white-and-comment-rex "\\([ \t\n]\\|#[^\n]*\n\\)+" +(defconst cperl-white-and-comment-rex + (rx-to-string `(group (eval cperl--ws+-rx))) + ;; was: "\\([ \t\n]\\|#[^\n]*\n\\)+" "Regular expression to match whitespace with interspersed comments. Should contain exactly one group.") =20 - -;; Is incorporated in `cperl-outline-regexp', `defun-prompt-regexp'. -;; Details of groups in this may be used in several functions; see comments -;; near mentioned above variable(s)... -;; sub($$):lvalue{} sub:lvalue{} Both allowed... -(defsubst cperl-after-sub-regexp (named attr) ; 9 groups without attr... - "Match the text after `sub' in a subroutine declaration. -If NAMED is nil, allows anonymous subroutines. Matches up to the first \"= :\" -of attributes (if present), or end of the name or prototype (whatever is -the last)." - (concat ; Assume n groups before this... - "\\(" ; n+1=3Dname-group - cperl-white-and-comment-rex ; n+2=3Dpre-name - "\\(::[a-zA-Z_0-9:']+\\|[a-zA-Z_'][a-zA-Z_0-9:']*\\)" ; n+3=3Dname - "\\)" ; END n+1=3Dname-group - (if named "" "?") - "\\(" ; n+4=3Dproto-group - cperl-maybe-white-and-comment-rex ; n+5=3Dpre-proto - "\\(([^()]*)\\)" ; n+6=3Dprototype - "\\)?" ; END n+4=3Dproto-group - "\\(" ; n+7=3Dattr-group - cperl-maybe-white-and-comment-rex ; n+8=3Dpre-attr - "\\(" ; n+9=3Dstart-attr - ":" - (if attr (concat - "\\(" - cperl-maybe-white-and-comment-rex ; whitespace-comments - "\\(\\sw\\|_\\)+" ; attr-name - ;; attr-arg (1 level of internal parens allowed!) - "\\((\\(\\\\.\\|[^\\()]\\|([^\\()]*)\\)*)\\)?" - "\\(" ; optional : (XXX allows trailing???) - cperl-maybe-white-and-comment-rex ; whitespace-comments - ":\\)?" - "\\)+") - "[^:]") - "\\)" - "\\)?" ; END n+6=3Dproto-group - )) - ;; Tired of editing this in 8 places every time I remember that there ;; is another method-defining keyword (defvar cperl-sub-keywords @@ -1414,28 +1471,8 @@ cperl-char-ends-sub-keyword-p when (eq char (aref keyword (1- (length keyword)))) return t)) =20 -;; Details of groups in this are used in `cperl-imenu--create-perl-index' -;; and `cperl-outline-level'. -;; Was: 2=3Dsub|package; now 2=3Dpackage-group, 5=3Dpackage-name 8=3Dsub-n= ame (+3) -(defvar cperl-imenu--function-name-regexp-perl - (concat - "^\\(" ; 1 =3D all - "\\([ \t]*package" ; 2 =3D package-group - "\\(" ; 3 =3D package-name-group - cperl-white-and-comment-rex ; 4 =3D pre-package-name - "\\([a-zA-Z_0-9:']+\\)\\)?\\)" ; 5 =3D package-name - "\\|" - "[ \t]*" - cperl-sub-regexp - (cperl-after-sub-regexp 'named nil) ; 8=3Dname 11=3Dproto 14=3Dattr-sta= rt - cperl-maybe-white-and-comment-rex ; 15=3Dpre-block - "\\|" - "=3Dhead\\([1-4]\\)[ \t]+" ; 16=3Dlevel - "\\([^\n]+\\)$" ; 17=3Dtext - "\\)")) - (defvar cperl-outline-regexp - (concat cperl-imenu--function-name-regexp-perl "\\|" "\\`")) + (concat "^[ \t]*" cperl--imenu-entries-regexp "\\|" "\\`")) =20 (defvar cperl-mode-syntax-table nil "Syntax table in use in CPerl mode buffers.") @@ -1713,20 +1750,11 @@ cperl-mode (setq-local comment-end "") (setq-local comment-column cperl-comment-column) (setq-local comment-start-skip "#+ *") - -;; "[ \t]*sub" -;; (cperl-after-sub-regexp 'named nil) ; 8=3Dname 11=3Dproto 14=3Dattr-s= tart -;; cperl-maybe-white-and-comment-rex ; 15=3Dpre-block - (setq-local defun-prompt-regexp - (concat "^[ \t]*\\(" - cperl-sub-regexp - (cperl-after-sub-regexp 'named 'attr-groups) - "\\|" ; per toke.c - "\\(BEGIN\\|UNITCHECK\\|CHECK\\|INIT\\|END\\|AUTOLOA= D\\|DESTROY\\)" - "\\)" - cperl-maybe-white-and-comment-rex)) (setq-local comment-indent-function #'cperl-comment-indent) + + (setq-local beginning-of-defun-function #'cperl-beginning-of-defun) (setq-local fill-paragraph-function #'cperl-fill-paragraph) + (setq-local parse-sexp-ignore-comments t) (setq-local indent-region-function #'cperl-indent-region) ;;(setq auto-fill-function #'cperl-do-auto-fill) ; Need to switch on and= off! @@ -2523,8 +2551,9 @@ cperl-indent-line (t (skip-chars-forward " \t") (if (listp indent) (setq indent (car indent))) - (cond ((and (looking-at "[A-Za-z_][A-Za-z_0-9]*:[^:]") - (not (looking-at "[smy]:\\|tr:"))) + (cond ((and (looking-at (rx (sequence (eval cperl--label-rx) + (not (in ":"))))) + (not (looking-at (rx (eval cperl--false-label-rx)))= )) (and (> indent 0) (setq indent (max cperl-min-label-indent (+ indent cperl-label-offset))))) @@ -2716,6 +2745,8 @@ cperl-sniff-for-indent (and (eq (preceding-char) ?\}) (cperl-after-block-and-statement-beg (point-min))) ; Was start - too close + (and char-after (char-equal char-after ?{) + (save-excursion (cperl-block-declaration= -p))) (memq char-after (append ")]}" nil)) (and (eq (preceding-char) ?\:) ; label (progn @@ -2759,12 +2790,10 @@ cperl-sniff-for-indent ;; Back up over label lines, since they don't ;; affect whether our line is a continuation. ;; (Had \, too) - (while;;(or (eq (preceding-char) ?\,) - (and (eq (preceding-char) ?:) - (or;;(eq (char-after (- (point) 2)) ?\') ; ???? - (memq (char-syntax (char-after (- (point) 2))) - '(?w ?_)))) - ;;) + (while (and (eq (preceding-char) ?:) + (re-search-backward + (rx (sequence (eval cperl--label-rx) poi= nt)) + nil t)) ;; This is always FALSE? (if (eq (preceding-char) ?\,) ;; Will go to beginning of line, essentially. @@ -2776,6 +2805,7 @@ cperl-sniff-for-indent (if (not (or (eq (1- (point)) containing-sexp) (and cperl-indent-parens-as-block (not is-block)) + (save-excursion (cperl-block-declaration-p= )) (memq (preceding-char) (append (if is-block " ;{" " ,;{") '(nil))) (and (eq (preceding-char) ?\}) @@ -2804,10 +2834,17 @@ cperl-sniff-for-indent (forward-char 1) (let ((colon-line-end 0)) (while - (progn (skip-chars-forward " \t\n") - ;; s: foo : bar :x is NOT label - (and (looking-at "#\\|\\([a-zA-Z0-9_$]+\\):[^:]\\|=3D[a-zA-Z]") - (not (looking-at "[sym]:\\|tr:")))) + (progn + (skip-chars-forward " \t\n") + ;; s: foo : bar :x is NOT label + (and (looking-at + (rx + (or "#" + (sequence (eval cperl--label-rx) + (not (in ":"))) + (sequence "=3D" (in "a-zA-Z")))= )) + (not (looking-at + (rx (eval cperl--false-label-rx= )))))) ;; Skip over comments and labels following openbrace. (cond ((=3D (following-char) ?\#) (forward-line 1)) @@ -3066,7 +3103,10 @@ cperl-calculate-indent ;; If line starts with label, calculate label indentation (if (save-excursion (beginning-of-line) - (looking-at "[ \t]*[a-zA-Z_][a-zA-Z_0-9]*:[^:]")) + (looking-at (rx + (sequence (0+ space) + (eval cperl--label-rx) + (not (in ":")))))) (if (> (current-indentation) cperl-min-label-indent) (- (current-indentation) cperl-label-offset) ;; Do not move `parse-data', this should @@ -4645,15 +4685,19 @@ cperl-backward-to-noncomment (if (< p (point)) (goto-char p)) (setq stop t)))))) =20 -;; Used only in `cperl-calculate-indent'... +;; Used only in `cperl-sniff-for-indent'... (defun cperl-block-p () - "Point is before ?\\{. Checks whether it starts a block." + "Point is before ?\\{. Returns true if it starts a block." ;; No save-excursion! This is more a distinguisher of a block/hash ref.= .. (cperl-backward-to-noncomment (point-min)) (or (memq (preceding-char) (append ";){}$@&%\C-@" nil)) ; Or label! \C-= @ at bobp ; Label may be mixed up with `$blah :' (save-excursion (cperl-after-label)) + ;; text with the 'attrib-group property is also covered by the + ;; next clause. We keep it because it is faster (for + ;; subroutines with attributes). (get-text-property (cperl-1- (point)) 'attrib-group) + (save-excursion (cperl-block-declaration-p)) (and (memq (char-syntax (preceding-char)) '(?w ?_)) (progn (backward-sexp) @@ -4691,20 +4735,16 @@ cperl-after-block-p (save-excursion (cperl-after-label)) ;; sub :attr {} (get-text-property (cperl-1- (point)) 'attrib-group) + (save-excursion (cperl-block-declaration-p)) (if (memq (char-syntax (preceding-char)) '(?w ?_)) ; else {} (save-excursion (forward-sexp -1) ;; else {} but not else::func {} - (or (and (looking-at "\\(else\\|catch\\|try\\|continue\\|grep\\|map\= \|BEGIN\\|END\\|UNITCHECK\\|CHECK\\|INIT\\)\\>") - (not (looking-at "\\(\\sw\\|_\\)+::"))) - ;; sub f {} - (progn - (cperl-backward-to-noncomment lim) - (and (cperl-char-ends-sub-keyword-p (preceding-char)) - (progn - (forward-sexp -1) - (looking-at - (concat cperl-sub-regexp "[ \t\n\f#]")))= )))) + (or (and (looking-at + (rx-to-string `(sequence + (eval (cons 'or cperl-bare-block-keywor= ds)) + word-end))) + (not (looking-at "\\(\\sw\\|_\\)+::"))))) ;; What precedes is not word... XXXX Last statement in sub??? (cperl-after-expr-p lim)))) (error nil)))) @@ -5326,9 +5366,9 @@ cperl-imenu--create-perl-index (setq current-package-end (save-excursion (goto-char (match-beginning 3)) (forward-sexp) - (point))) + (point)))) (push (cons name marker) index-package-alist) - (push (cons (concat "package " name) marker) index-unsorted-alist)))) + (push (cons (concat "package " name) marker) index-unsorted-alist))) ((match-string 5) ; found a sub name! (unless (nth 4 state) ; skip if in a comment (setq name (match-string-no-properties 5) @@ -5592,10 +5632,24 @@ cperl-init-faces 2 font-lock-string-face t))) '("[[ \t{,(]\\(-?[a-zA-Z0-9_:]+\\)[ \t]*=3D>" 1 font-lock-string-face t) - '("^[ \t]*\\([a-zA-Z0-9_]+[ \t]*:\\)[ \t]*\\($\\|{\\|\\<\\(until\\|wh= ile\\|for\\(each\\)?\\|do\\)\\>\\)" 1 - font-lock-constant-face) ; labels - '("\\<\\(continue\\|next\\|last\\|redo\\|break\\|goto\\)\\>[ \t]+\\([= a-zA-Z0-9_:]+\\)" ; labels as targets - 2 font-lock-constant-face) + ;; labels + `(,(rx-to-string + `(sequence + (0+ space) + (group (eval cperl--label-rx)) + (0+ space) + (or line-end "#" "{" + (sequence word-start + (or "until" "while" "for" "foreach" "do") + word-end)))) + 1 font-lock-constant-face) + ;; labels as targets (no trailing colon!) + `(,(rx-to-string + '(sequence + (or "continue" "next" "last" "redo" "break" "goto") + (0+ space) + (group (eval cperl--basic-identifier-rx)))) + 1 font-lock-constant-face) ;; Uncomment to get perl-mode-like vars ;;; '("[$*]{?\\(\\sw+\\)" 1 font-lock-variable-name-face) ;;; '("\\([@%]\\|\\$#\\)\\(\\sw+\\)" diff --git a/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl b= /test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl index 8c1883a10f..33d0d7942e 100644 --- a/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl +++ b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-exp.pl @@ -50,3 +50,79 @@ quuux; } # -------- if-then-else: end -------- + +# -------- labels: input -------- +{ +LABEL: +for (0..1) { +next LABEL if $_; +unless ($_) last LABEL; +} +LABEL_WITH_SPACE : +while (1) { +last LABEL_WITH_SPACE; +} +LABEL_WITH_COMMENT : # explain that label +say "This label has a comment." +} +# -------- labels: expected output -------- +{ + LABEL: + for (0..1) { + next LABEL if $_; + unless ($_) last LABEL; + } + LABEL_WITH_SPACE : + while (1) { + last LABEL_WITH_SPACE; + } + LABEL_WITH_COMMENT : # explain that label + say "This label has a comment." +} +# -------- labels: end -------- + +# -------- packages: input -------- +package Foo { +package Foo::Inner::One { +...; +} +package Foo::Inner::Two { +...; +} +} +# -------- packages: expected output -------- +package Foo { + package Foo::Inner::One { + ...; + } + package Foo::Inner::Two { + ...; + } +} +# -------- packages: end -------- + +# -------- braces after newline: input -------- +package Outer { +package Inner +{ +sub foo +{ +} +sub bar +{ +} +} +} +# -------- braces after newline: expected output -------- +package Outer { + package Inner + { + sub foo + { + } + sub bar + { + } + } +} +# -------- braces after newline: end -------- diff --git a/test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.p= l b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl new file mode 100644 index 0000000000..219d5292ca --- /dev/null +++ b/test/lisp/progmodes/cperl-mode-resources/cperl-indent-region.pl @@ -0,0 +1,118 @@ +#!/usr/bin/env perl +use strict; +use warnings; +use 5.020; + +# This file contains test input and expected output for the tests in +# cperl-mode-tests.el, cperl-test-indent-region. The code is +# syntactically valid, but doesn't make much sense. + +# (Bug#11733) https://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D11733 +# -------- sub \n brace: input -------- +sub foo + { + } + +sub bar + { + } +# -------- sub \n brace: expected output -------- +sub foo +{ +} + +sub bar +{ +} +# -------- sub \n brace: end -------- + +# -------- map \n brace: input -------- +{ +my %hash =3D map +{ +$_ =3D 1 +} +@_; +} +# -------- map \n brace: expected output -------- +{ + my %hash =3D map + { + $_ =3D 1 + } + @_; +} +# -------- map \n brace: end -------- + +# -------- if \n brace: input -------- +if (1 =3D=3D @_) + { + say "one argument"; + } else { + say "No arguments, or plenty"; + } +# -------- if \n brace: expected output -------- +if (1 =3D=3D @_) { + say "one argument"; +} else { + say "No arguments, or plenty"; +} +# -------- if \n brace: end -------- + +# -------- simple sub: input -------- +sub middle { + my ($left,$right) =3D @_; + return 0.5 * ($left + $right); +} +$this_should_be_justified =3D 'left'; +# -------- simple sub: expected output -------- +sub middle { + my ($left,$right) =3D @_; + return 0.5 * ($left + $right); +} +$this_should_be_justified =3D 'left'; +# -------- simple sub: end -------- + +# ------- hash reference: input -------- +my $result =3D { ok =3D> 'true' } + if ($this_should_be_indented =3D=3D 'cperl-continued-statement-offset'); +# ------- hash reference: expected output -------- +my $result =3D { ok =3D> 'true' } + if ($this_should_be_indented =3D=3D 'cperl-continued-statement-offset'); +# ------- hash reference: end -------- + +# The following syntax was introduced in Perl v5.14: +# ------- package BLOCK: input -------- +package Puck { +our $cperl_indent_level =3D 4; +} +our $good_indentation =3D 0; +# ------- package BLOCK: expected output -------- +package Puck { + our $cperl_indent_level =3D 4; +} +our $good_indentation =3D 0; +# ------- package BLOCK: end -------- + +# as of Perl 5.18, subroutines can be lexical +# -------- simple sub: input -------- +package Pick 1.23 { + my sub middle { + my ($left,$right) =3D @_; + return 0.5 * ($left + $right); + $this_should_be_indented =3D + $cperl_indent_level; + } +} +$this_should_be_justified =3D 'left'; +# -------- simple sub: expected output -------- +package Pick 1.23 { + my sub middle { + my ($left,$right) =3D @_; + return 0.5 * ($left + $right); + $this_should_be_indented =3D + $cperl_indent_level; + } +} +$this_should_be_justified =3D 'left'; +# -------- simple sub: end -------- diff --git a/test/lisp/progmodes/cperl-mode-tests.el b/test/lisp/progmodes/= cperl-mode-tests.el index 61e4ece49b..1bcc41ab9f 100644 --- a/test/lisp/progmodes/cperl-mode-tests.el +++ b/test/lisp/progmodes/cperl-mode-tests.el @@ -103,6 +103,14 @@ cperl-test-indent-exp (ert-resource-file "cperl-indent-exp.pl") (cperl-indent-exp))) ; here we go! =20 +(ert-deftest cperl-test-indent-region () + "Test indenting of regions which can span more than one +expression." + (skip-unless (eq cperl-test-mode #'cperl-mode)) + (cperl--run-test-cases + (ert-resource-file "cperl-indent-region.pl") + (indent-region (point-min) (point-max)))) + (ert-deftest cperl-test-indent-styles () (skip-unless (eq cperl-test-mode #'cperl-mode)) (cperl--run-test-cases @@ -189,17 +197,17 @@ cperl-test-ws-regexp '(" " "\t" "\n")) (invalid '("a" " " ""))) - (cperl-test--validate-regexp cperl--ws-regexp + (cperl-test--validate-regexp (rx (eval cperl--ws-rx)) valid invalid))) =20 -(ert-deftest cperl-test-ws-or-comment-regexp () +(ert-deftest cperl-test-ws+-rx () "Tests sequences of whitespace and comment lines." (let ((valid `(" " "\t#\n" "\n# \n" ,(concat "# comment\n" "# comment\n" "\n" "#comment\n"))) (invalid '("=3Dhead1 NAME\n" ))) - (cperl-test--validate-regexp cperl--ws-or-comment-regexp + (cperl-test--validate-regexp (rx (eval cperl--ws+-rx)) valid invalid))) =20 (ert-deftest cperl-test-version-regexp () --=20 2.20.1 --=-=-=--