From mboxrd@z Thu Jan 1 00:00:00 1970
From: haj@posteo.de (Harald Jörg)
Newsgroups: gmane.emacs.bugs
Subject: bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
Date: Wed, 23 Dec 2020 03:19:15 +0100
Message-ID: <87sg7xxo0s.fsf@hajtower>
References: <87a8k4tdxp.fsf@jidanni.org>
In-Reply-To: <87a8k4tdxp.fsf@jidanni.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
To: 23461@debbugs.gnu.org
Cc: Stefan Monnier
X-GNU-PR-Message: followup 23461
X-GNU-PR-Package: emacs
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.io gmane.emacs.bugs:196597

--=-=-=
Content-Type: text/plain

This is a detour from my work on CPerl mode bugs ... while trying to
steal some syntax concepts from perl-mode.el, I stumbled over this old
report. I guess I can explain what's going on.

Short story: Perl mode marks HERE-docs syntactically as c-style
comments, hence font-lock-mode selects the comments face.

Investigating how to fix this leads to the longer story. There are two
possible approaches:

 1) Use a string-style syntax (generic string) instead of c-style
    comments to flag HERE-documents. That way, font-lock picks up the
    correct face automagically.

 2) Keep HERE-docs as c-style comments, but change the face mapping by
    injecting a function into font-lock-defaults which applies the
    string face to c-style comments.

Both approaches work, but both are a bit whacky.

For 1), changing the syntax code is easy but opens a can of worms:
Indentation after the HERE-doc doesn't work any more. The reason is
that Perl mode needs to go "back" to find out whether a statement is a
continuation line. "Back" includes skipping back over comments, but
that HERE-doc is no longer a comment, so it blocks the way to find
whether the line before the HERE-doc ends a statement. To fix that,
every call to perl-backward-to-noncomment must be checked to see
whether it needs to skip backward over HERE-docs, too. I added a
function perl-backward-to-noncomment-nonhere, and eventually it turned
out that the simple perl-backward-to-noncomment seems to be superfluous.

For 2), it feels wrong to have strings marked as comments, and it is a
bit of a hack to insert a function into font-lock-keywords which
doesn't even search for keywords.
CPerl mode uses a similar trick, but CPerl mode is renowned for being
whacky. Also, __DATA__ sections in Perl mode are marked generic
strings, so there ought to be some disambiguation.

The patch uses the first approach, and also adds tests which are
independent of the chosen solution. As a bycatch, it also fixes the
case where the line starting a HERE-doc ends in a comment, which was
messed up by perl-mode. I could not find a bug report for that but
test cases are included.

Perhaps Stefan has an opinion on this, and chances are good that he can
point to a better solution...
-- 
Happy winter solstice,
haj

--=-=-=
Content-Type: text/x-diff
Content-Disposition: attachment;
 filename=0001-perl-mode-Display-here-docs-as-strings-instead-of-co.patch
Content-Description: perl-mode: Treat HERE-docs as strings

>From a17f2323d9018fa312b6721fa7ea5744edc79039 Mon Sep 17 00:00:00 2001
From: Harald Jörg
Date: Wed, 23 Dec 2020 02:34:33 +0100
Subject: [PATCH] ; perl-mode: Display here-docs as strings instead of
 comments.

* lisp/progmodes/perl-mode.el (perl-syntax-propertize-function):
Make HERE-doc start a generic string instead of a c-style comment.
Handle the case where the line starting a HERE-doc ends with a
comment.
(perl--beginning-of-here-doc): New function.
(perl-backward-to-noncomment-nonhere): New function.
(perl-syntax-propertize-special-constructs): Make HERE-terminators
end generic strings instead of c-style comments, using the new
functions.

* test/lisp/progmodes/cperl-mode-tests.el (cperl-test-heredocs): New
test (30 should-forms) for various aspects of HERE-documents. Works
for CPerl mode, and with the patch also for Perl mode.

* test/lisp/progmodes/cperl-mode-resources/here-docs.pl: New file
with test cases.
---
 lisp/progmodes/perl-mode.el                   |  56 +++++++--
 .../cperl-mode-resources/here-docs.pl         | 111 ++++++++++++++++++
 test/lisp/progmodes/cperl-mode-tests.el       |  29 +++++
 3 files changed, 187 insertions(+), 9 deletions(-)
 create mode 100644 test/lisp/progmodes/cperl-mode-resources/here-docs.pl

diff --git a/lisp/progmodes/perl-mode.el b/lisp/progmodes/perl-mode.el
index fd8a51b5a5..a961f723d5 100644
--- a/lisp/progmodes/perl-mode.el
+++ b/lisp/progmodes/perl-mode.el
@@ -324,14 +324,29 @@ perl-syntax-propertize-function
         ;; disambiguate with the left-bitshift operator.
         "\\|" perl--syntax-exp-intro-regexp "<<\\(?2:\\sw+\\)\\)"
         ".*\\(\n\\)")
-       (4 (let* ((st (get-text-property (match-beginning 4) 'syntax-table))
+       (4 (let* ((eol (match-beginning 4))
+                 (st (get-text-property eol 'syntax-table))
                  (name (match-string 2))
                  (indented (match-beginning 1)))
             (goto-char (match-end 2))
             (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
+                ;; '<<' occurred in a string, or in a comment.
                 ;; Leave the property of the newline unchanged.
                 st
-              (cons (car (string-to-syntax "< c"))
+              ;; Before changing the syntax to generic string, let's
+              ;; check whether we are in an end-of-line comment, and
+              ;; if so, cheat by shifting the comment markers one char
+              ;; to the left.
+              (when (nth 4 (save-excursion (syntax-ppss eol)))
+                (when (equal (car (syntax-after (1- eol)))
+                             (car (string-to-syntax "<")))
+                  ;; yet another edge case: "#" is the last character
+                  ;; in that line, so there's actually no comment.
+                  (put-text-property (- eol 2) (1- eol)
+                                     'syntax-table (string-to-syntax "<")))
+                (put-text-property (1- eol) eol
+                                   'syntax-table (string-to-syntax ">")))
+              (cons (car (string-to-syntax "|"))
                     ;; Remember the names of heredocs found on this line.
                     (cons (cons (pcase (aref name 0)
                                   (?\\ (substring name 1))
@@ -342,7 +357,7 @@ perl-syntax-propertize-function
       ;; We don't call perl-syntax-propertize-special-constructs directly
       ;; from the << rule, because there might be other elements (between
       ;; the << and the \n) that need to be propertized.
-      ("\\(?:$\\)\\s<"
+      ("\\(?:$\\)\\s|"
        (0 (ignore (perl-syntax-propertize-special-constructs end))))
       )
      (point) end)))
@@ -364,12 +379,24 @@ perl-quote-syntax-table
       (modify-syntax-entry close ")" st))
     st))
 
+(defun perl--beginning-of-here-doc (state)
+  "If STATE describes a here-document, return its start, else return nil."
+  ;; We need to distinguish here-docs from normal strings, and from
+  ;; quote-like constructs like q//.
+  (let ((in-string-p (nth 3 state))
+        (string-start (nth 8 state)))
+    (and in-string-p
+         (= (syntax-class (syntax-after string-start)) 15) ; generic string
+         ;; here-doc strings have a syntax table cdr for the terminator(s)
+         (cdr-safe (get-text-property string-start 'syntax-table))
+         string-start))) ; return the start position if all other tests are t
+
 (defun perl-syntax-propertize-special-constructs (limit)
   "Propertize special constructs like regexps and formats."
   (let ((state (syntax-ppss))
         char)
     (cond
-     ((eq 2 (nth 7 state))
+     ((perl--beginning-of-here-doc state)
       ;; A Here document.
       (let ((names (cdr (get-text-property (nth 8 state) 'syntax-table))))
         (when (cdr names)
@@ -386,7 +413,7 @@ perl-syntax-propertize-special-constructs
                              limit 'move))
         (unless names
           (put-text-property (1- (point)) (point) 'syntax-table
-                             (string-to-syntax "> c"))))))
+                             (string-to-syntax "|"))))))
      ((or (null (setq char (nth 3 state)))
           (and (characterp char)
                (null (get-text-property (nth 8 state) 'syntax-table))))
@@ -910,14 +937,14 @@ perl-continuation-line-p
   "Move to end of previous line and return non-nil if continued."
   ;; Statement level. Is it a continuation or a new statement?
   ;; Find previous non-comment character.
-  (perl-backward-to-noncomment)
+  (perl-backward-to-noncomment-nonhere)
   ;; Back up over label lines, since they don't
   ;; affect whether our line is a continuation.
   (while (and (eq (preceding-char) ?:)
               (memq (char-syntax (char-after (- (point) 2)))
                     '(?w ?_)))
     (beginning-of-line)
-    (perl-backward-to-noncomment))
+    (perl-backward-to-noncomment-nonhere))
   ;; Now we get the answer.
   (unless (memq (preceding-char) '(?\; ?\} ?\{))
     (preceding-char)))
@@ -959,7 +986,7 @@ perl-calculate-indent
            (state (syntax-ppss))
            (containing-sexp (nth 1 state))
            ;; Don't auto-indent in a quoted string or a here-document.
-           (unindentable (or (nth 3 state) (eq 2 (nth 7 state)))))
+           (unindentable (or (nth 3 state) (perl--beginning-of-here-doc state))))
       (when (and (eq t (nth 3 state))
                  (save-excursion
                    (goto-char (nth 8 state))
@@ -976,7 +1003,7 @@ perl-calculate-indent
                          (if perl-indent-parens-as-block '(?\{ ?\( ?\[) '(?\{)))
                    0 ; move to beginning of line if it starts a function body
                  ;; indent a little if this is a continuation line
-                 (perl-backward-to-noncomment)
+                 (perl-backward-to-noncomment-nonhere)
                  (if (or (bobp)
                          (memq (preceding-char) '(?\; ?\})))
                      0 perl-continued-statement-offset)))
@@ -1076,6 +1103,17 @@ perl-backward-to-noncomment
   "Move point backward to after the first non-white-space, skipping comments."
   (forward-comment (- (point-max))))
 
+(defun perl-backward-to-noncomment-nonhere ()
+  "Move point backward, skipping comments and here-docs."
+  ;; Comments can appear after a here-doc, but also at the end of the
+  ;; line containing the here-doc delimiter(s).
+  (forward-comment (- (point-max)))
+  (unless (equal (point) (point-min))
+    (let ((here-start (perl--beginning-of-here-doc
+                       (save-excursion (syntax-ppss (1- (point)))))))
+      (when here-start (goto-char here-start)))
+    (forward-comment (- (point-max)))))
+
 (defun perl-backward-to-start-of-continued-exp ()
   (while
       (let ((c (preceding-char)))
diff --git a/test/lisp/progmodes/cperl-mode-resources/here-docs.pl b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
new file mode 100644
index 0000000000..39e4fe06ba
--- /dev/null
+++ b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
@@ -0,0 +1,111 @@
+use 5.020;
+
+=head1 NAME
+
+here-docs.pl - resource file for cperl-test-here-docs
+
+=head1 DESCRIPTION
+
+This file holds a couple of HERE documents, with a variety of normal
+and edge cases. For a formatted view of this description, run:
+
+   (cperl-perldoc "here-docs.pl")
+
+For each of the HERE documents, the following checks will be done:
+
+=over 4
+
+=item *
+
+All occurrences of the string "look-here" are fontified as
+'font-lock-string-face. Note that we deliberately test the face, not
+the syntax property: Users won't care for the syntax property, but
+they see the face. Different implementations with different syntax
+properties have been seen in the past.
+
+=item *
+
+Indentation of the line(s) containing "look-here" is 0, i.e. there are no
+leading spaces.
+
+=item *
+
+Indentation of the following perl statement containing "indent" should
+be 0 if the statement contains "noindent", and according to the mode's
+continued-statement-offset otherwise.
+
+=back
+
+=cut
+
+# Prologue to make the test file valid without warnings
+
+my $text;
+my $any;
+my $indentation;
+my $anywhere = 'back again';
+
+=head1 The Tests
+
+=head2 Test Case 1
+
+We have two HERE documents in one line with different quoting styles.
+
+=cut
+
+## test case
+
+$text = <<"HERE" . <<'THERE' . $any;
+#look-here and
+HERE
+$tlook-here and
+THERE
+
+my $noindent = "This should be left-justified";
+
+=head2 Test case 2
+
+A HERE document followed by a continuation line
+
+=cut
+
+## test case
+
+$text = <