unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#23461: perl mode uses same color for comments and here documents
@ 2016-05-05 21:03 積丹尼 Dan Jacobson
  2019-10-09  5:52 ` Lars Ingebrigtsen
  2020-12-23  2:19 ` bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH] Harald Jörg
  0 siblings, 2 replies; 14+ messages in thread
From: 積丹尼 Dan Jacobson @ 2016-05-05 21:03 UTC (permalink / raw)
  To: 23461

perl mode uses same color for comments and here documents.
Isn't there some other color that can be chosen?

print <<'EOF';
 <meta name="viewport" content="width=device-width">
EOF
# use XML::TreePP;

emacs-version "24.5.1"





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl mode uses same color for comments and here documents
  2016-05-05 21:03 bug#23461: perl mode uses same color for comments and here documents 積丹尼 Dan Jacobson
@ 2019-10-09  5:52 ` Lars Ingebrigtsen
  2020-12-23  2:19 ` bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH] Harald Jörg
  1 sibling, 0 replies; 14+ messages in thread
From: Lars Ingebrigtsen @ 2019-10-09  5:52 UTC (permalink / raw)
  To: 積丹尼 Dan Jacobson; +Cc: 23461

積丹尼 Dan Jacobson <jidanni@jidanni.org> writes:

> perl mode uses same color for comments and here documents.
> Isn't there some other color that can be chosen?
>
> print <<'EOF';
>  <meta name="viewport" content="width=device-width">
> EOF
> # use XML::TreePP;

Yes, that doesn't seem optimal...

But as usual when trying to investigate what's going on with these
syntax things, I have no idea what's going on.  perl-mode detects these
correctly, but I don't know where the mapping from whatever this does to
faces happens, and I can't edebug through syntax-propertize-rules, and
putting a `debug' into the thing is just ignored.

Does anybody know where I should be poking around to find out what's
going on here?

      ;; Here documents.
      ((concat
        "\\(?:"
        ;; << "EOF", << 'EOF', or << \EOF
        "<<\\(~\\)?[ \t]*\\('[^'\n]*'\\|\"[^\"\n]*\"\\|\\\\[[:alpha:]][[:alnum:]]*\\)"
        ;; The <<EOF case which needs perl--syntax-exp-intro-regexp, to
        ;; disambiguate with the left-bitshift operator.
        "\\|" perl--syntax-exp-intro-regexp "<<\\(?2:\\sw+\\)\\)"
        ".*\\(\n\\)")
       (4 (let* ((st (get-text-property (match-beginning 4) 'syntax-table))
                 (name (match-string 2))
                 (indented (match-beginning 1)))
            (goto-char (match-end 2))
            (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
                ;; Leave the property of the newline unchanged.
                st

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2016-05-05 21:03 bug#23461: perl mode uses same color for comments and here documents 積丹尼 Dan Jacobson
  2019-10-09  5:52 ` Lars Ingebrigtsen
@ 2020-12-23  2:19 ` Harald Jörg
  2020-12-23  4:00   ` Stefan Monnier
  1 sibling, 1 reply; 14+ messages in thread
From: Harald Jörg @ 2020-12-23  2:19 UTC (permalink / raw)
  To: 23461; +Cc: Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 2277 bytes --]

This is a detour from my work on CPerl mode bugs ... while trying to
steal some syntax concepts from perl-mode.pl, I stumbled over this old
report.

I guess I can explain what's going on.  Short story: Perl mode marks
HERE-docs syntactically as c-style comments, hence font-lock-mode
selects the comments face.

Investigating how to fix this leads to the longer story.  There are two
possible approaches:

  1) use a string-style syntax (generic string) instead of c-style
     comments to flag HERE-documents.  That way, font-lock picks up the
     correct face automagically.

  2) Keep HERE_docs as c-style comments, but change the face mapping by
     injecting a function into font-lock-defaults which applies the
     string face to c-style comments.

Both approaches work, but both are a bit whacky.  For 1), changing the
syntax code is easy but opens a can of worms: Indentation after the
HERE-doc doesn't work any more.  The reason is that Perl mode needs to
go "back" to find out whether a statement is a continuation line.
"Back" includes skipping back over comments, but that HERE-doc is no
longer a comment, so it blocks the way to find whether the line before
the HERE-doc ends a statement.  To fix that, all calls to
perl-backward-to-noncomment must be checked whether they need to skip
backward over HERE-docs, too.  I added a function
perl-backward-to-noncomment-nonhere, and eventually it turned out that
the simple perl-backward-to-noncomment seems to be superfluous.

For 2), it feels wrong to have strings marked as comments, and it is a
bit of a hack to insert a function into font-lock-keywords which doesn't
even search for keywords.  CPerl mode uses a similar trick, but CPerl
mode is renowned for being whacky.  Also, __DATA__ sections in Perl mode
are marked generic strings, so there ought to be some disambiguation.

The patch uses the first approach, and also adds tests which are
independent of the chosen solution.


As a bycatch, it also fixes the case where the line starting a HERE-doc
ends in a comment, which was messed up by perl-mode.  I could not find a
bug report for that but test cases are included.

Perhaps Stefan has an opinion on this, and chances are good that he can
point to a better solution...
-- 
Happy winter solstice,
haj

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: perl-mode: Treat HERE-docs as strings --]
[-- Type: text/x-diff, Size: 11766 bytes --]

From a17f2323d9018fa312b6721fa7ea5744edc79039 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Harald=20J=C3=B6rg?= <haj@posteo.de>
Date: Wed, 23 Dec 2020 02:34:33 +0100
Subject: [PATCH] ; perl-mode: Display here-docs as strings instead of
 comments.

* lisp/progmodes/perl-mode.el (perl-syntax-propertize-function):
Make HERE-doc start a generic string instead of a c-style comment.
Handle the case where the line starting a HERE-doc ends with a
comment.
(perl--beginning-of-here-doc): New function.
(perl-backward-to-noncomment-nonhere): New function.
(perl-syntax-propertize-special-constructs): Make HERE-terminators
end generic strings instead of c-style comments, using the new
functions.

* test/lisp/progmodes/cperl-mode-tests.el (cperl-test-heredocs):
New test (30 should-forms) for various aspects of HERE-documents.
Works for CPerl mode, and with the patch also for Perl mode.

* test/lisp/progmodes/cperl-mode-resources/here-docs.pl: New file
with test cases.
---
 lisp/progmodes/perl-mode.el                   |  56 +++++++--
 .../cperl-mode-resources/here-docs.pl         | 111 ++++++++++++++++++
 test/lisp/progmodes/cperl-mode-tests.el       |  29 +++++
 3 files changed, 187 insertions(+), 9 deletions(-)
 create mode 100644 test/lisp/progmodes/cperl-mode-resources/here-docs.pl

diff --git a/lisp/progmodes/perl-mode.el b/lisp/progmodes/perl-mode.el
index fd8a51b5a5..a961f723d5 100644
--- a/lisp/progmodes/perl-mode.el
+++ b/lisp/progmodes/perl-mode.el
@@ -324,14 +324,29 @@ perl-syntax-propertize-function
         ;; disambiguate with the left-bitshift operator.
         "\\|" perl--syntax-exp-intro-regexp "<<\\(?2:\\sw+\\)\\)"
         ".*\\(\n\\)")
-       (4 (let* ((st (get-text-property (match-beginning 4) 'syntax-table))
+       (4 (let* ((eol (match-beginning 4))
+                 (st (get-text-property eol 'syntax-table))
                  (name (match-string 2))
                  (indented (match-beginning 1)))
             (goto-char (match-end 2))
             (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
+                ;; '>>' occurred in a string, or in a comment.
                 ;; Leave the property of the newline unchanged.
                 st
-              (cons (car (string-to-syntax "< c"))
+              ;; Before changing the syntax to generic string, let's
+              ;; check whether we are in an end-of-line comment, and
+              ;; if so, cheat by shifting the comment markers one char
+              ;; to the left.
+              (when (nth 4 (save-excursion (syntax-ppss eol)))
+                (when (equal (car (syntax-after (1- eol)))
+                             (car (string-to-syntax "<")))
+                  ;; yet another edge case: "#" is the last character
+                  ;; in that line, so there's actually no comment.
+                  (put-text-property (- eol 2) (1- eol)
+                                     'syntax-table (string-to-syntax "<")))
+                (put-text-property (1- eol) eol
+                                   'syntax-table (string-to-syntax ">")))
+              (cons (car (string-to-syntax "|"))
                     ;; Remember the names of heredocs found on this line.
                     (cons (cons (pcase (aref name 0)
                                   (?\\ (substring name 1))
@@ -342,7 +357,7 @@ perl-syntax-propertize-function
       ;; We don't call perl-syntax-propertize-special-constructs directly
       ;; from the << rule, because there might be other elements (between
       ;; the << and the \n) that need to be propertized.
-      ("\\(?:$\\)\\s<"
+      ("\\(?:$\\)\\s|"
        (0 (ignore (perl-syntax-propertize-special-constructs end))))
       )
      (point) end)))
@@ -364,12 +379,24 @@ perl-quote-syntax-table
       (modify-syntax-entry close ")" st))
     st))
 
+(defun perl--beginning-of-here-doc (state)
+  "If STATE describes a here-document, return its start, else return nil."
+  ;; We need to distinguish here-docs from normal strings, and from
+  ;; quote-like constructs like q//.
+  (let ((in-string-p (nth 3 state))
+        (string-start (nth 8 state)))
+  (and in-string-p
+       (= (syntax-class (syntax-after string-start)) 15) ; generic string
+       ;; here-doc strings have a syntax table cdr for the terminator(s)
+       (cdr-safe (get-text-property string-start 'syntax-table))
+       string-start))) ; return the start position if all other tests are t
+
 (defun perl-syntax-propertize-special-constructs (limit)
   "Propertize special constructs like regexps and formats."
   (let ((state (syntax-ppss))
         char)
     (cond
-     ((eq 2 (nth 7 state))
+     ((perl--beginning-of-here-doc state)
       ;; A Here document.
       (let ((names (cdr (get-text-property (nth 8 state) 'syntax-table))))
         (when (cdr names)
@@ -386,7 +413,7 @@ perl-syntax-propertize-special-constructs
                      limit 'move))
           (unless names
             (put-text-property (1- (point)) (point) 'syntax-table
-                               (string-to-syntax "> c"))))))
+                               (string-to-syntax "|"))))))
      ((or (null (setq char (nth 3 state)))
           (and (characterp char)
                (null (get-text-property (nth 8 state) 'syntax-table))))
@@ -910,14 +937,14 @@ perl-continuation-line-p
   "Move to end of previous line and return non-nil if continued."
   ;; Statement level.  Is it a continuation or a new statement?
   ;; Find previous non-comment character.
-  (perl-backward-to-noncomment)
+  (perl-backward-to-noncomment-nonhere)
   ;; Back up over label lines, since they don't
   ;; affect whether our line is a continuation.
   (while (and (eq (preceding-char) ?:)
               (memq (char-syntax (char-after (- (point) 2)))
                     '(?w ?_)))
     (beginning-of-line)
-    (perl-backward-to-noncomment))
+    (perl-backward-to-noncomment-nonhere))
   ;; Now we get the answer.
   (unless (memq (preceding-char) '(?\; ?\} ?\{))
     (preceding-char)))
@@ -959,7 +986,7 @@ perl-calculate-indent
 	   (state (syntax-ppss))
 	   (containing-sexp (nth 1 state))
 	   ;; Don't auto-indent in a quoted string or a here-document.
-	   (unindentable (or (nth 3 state) (eq 2 (nth 7 state)))))
+	   (unindentable (or (nth 3 state) (perl--beginning-of-here-doc state))))
       (when (and (eq t (nth 3 state))
                  (save-excursion
                    (goto-char (nth 8 state))
@@ -976,7 +1003,7 @@ perl-calculate-indent
                   (if perl-indent-parens-as-block '(?\{ ?\( ?\[) '(?\{)))
             0          ; move to beginning of line if it starts a function body
           ;; indent a little if this is a continuation line
-          (perl-backward-to-noncomment)
+          (perl-backward-to-noncomment-nonhere)
           (if (or (bobp)
                   (memq (preceding-char) '(?\; ?\})))
               0 perl-continued-statement-offset)))
@@ -1076,6 +1103,17 @@ perl-backward-to-noncomment
   "Move point backward to after the first non-white-space, skipping comments."
   (forward-comment (- (point-max))))
 
+(defun perl-backward-to-noncomment-nonhere ()
+  "Move point backward, skipping comments and here-docs."
+  ;; Comments can appear after a here-doc, but also at the end of the
+  ;; line containing the here-doc delimiter(s).
+  (forward-comment (- (point-max)))
+  (unless (equal (point) (point-min))
+    (let ((here-start (perl--beginning-of-here-doc
+                       (save-excursion (syntax-ppss (1- (point)))))))
+      (when here-start (goto-char here-start)))
+    (forward-comment (- (point-max)))))
+
 (defun perl-backward-to-start-of-continued-exp ()
   (while
       (let ((c (preceding-char)))
diff --git a/test/lisp/progmodes/cperl-mode-resources/here-docs.pl b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
new file mode 100644
index 0000000000..39e4fe06ba
--- /dev/null
+++ b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
@@ -0,0 +1,111 @@
+use 5.020;
+
+=head1 NAME
+
+here-docs.pl - resource file for cperl-test-here-docs
+
+=head1 DESCRIPTION
+
+This file holds a couple of HERE documents, with a variety of normal
+and edge cases.  For a formatted view of this description, run:
+
+   (cperl-perldoc "here-docs.pl")
+
+For each of the HERE documents, the following checks will done:
+
+=over 4
+
+=item *
+
+All occurrences of the string "look-here" are fontified as
+'font-lock-string-face.  Note that we deliberately test the face, not
+the syntax property: Users won't care for the syntax property, but
+they see the face.  Different implementations with different syntax
+properties have been seen in the past.
+
+=item *
+
+Indentation of the line(s) containing "look-here" is 0, i.e. there are no
+leading spaces.
+
+=item *
+
+Indentation of the following perl statement containing "indent" should
+be 0 if the statement contains "noindent", and according to the mode's
+continued-statement-offset otherwise.
+
+=back
+
+=cut
+
+# Prologue to make the test file valid without warnings
+
+my $text;
+my $any;
+my $indentation;
+my $anywhere = 'back again';
+
+=head1 The Tests
+
+=head2 Test Case 1
+
+We have two HERE documents in one line with different quoting styles.
+
+=cut
+
+## test case
+
+$text = <<"HERE" . <<'THERE' . $any;
+#look-here and
+HERE
+$tlook-here and
+THERE
+
+my $noindent = "This should be left-justified";
+
+=head2 Test case 2
+
+A HERE document followed by a continuation line
+
+=cut
+
+## test case
+
+$text = <<HERE
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+=head2 Test case 3
+
+A here document with a line-end comment in the starter line,
+after a complete statement
+
+=cut
+
+## test case
+
+$text = <<HERE; # start here
+look-here
+HERE
+
+my $noindent = "New statement in this line";
+
+=head2 Test case 4
+
+A HERE document with a to-be-continued statement and a comment in the
+starter line.
+
+=cut
+
+## test case
+
+$text = <<HERE # start here
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+
+__END__
diff --git a/test/lisp/progmodes/cperl-mode-tests.el b/test/lisp/progmodes/cperl-mode-tests.el
index cb2d067a61..d9b090896d 100644
--- a/test/lisp/progmodes/cperl-mode-tests.el
+++ b/test/lisp/progmodes/cperl-mode-tests.el
@@ -135,6 +135,35 @@ cperl-test-fontify-punct-vars
         (should (equal (nth 3 (syntax-ppss)) nil))
         (should (equal (nth 4 (syntax-ppss)) t))))))
 
+(ert-deftest cperl-test-heredocs ()
+  "Test that HERE-docs are fontified as strings."
+  (let ((file (ert-resource-file "here-docs.pl"))
+        (cperl-continued-statement-offset perl-continued-statement-offset)
+        (case-fold-search nil))
+    (with-temp-buffer
+      (insert-file-contents file)
+      (goto-char (point-min))
+      (funcall cperl-test-mode)
+      (indent-region (point-min) (point-max))
+      (font-lock-ensure (point-min) (point-max))
+      (while (search-forward "## test case" nil t)
+        (save-excursion
+          (while (search-forward "look-here" nil t)
+            (should (equal
+                     (get-text-property (match-beginning 0) 'face)
+                     'font-lock-string-face))
+            (beginning-of-line)
+            (should (null (looking-at "[ \t]")))
+            (forward-line 1)))
+        (should (re-search-forward
+                 (concat "^\\([ \t]*\\)" ; the actual indentation amount
+                         "\\([^ \t\n].*?\\)\\(no\\)?indent")
+                 nil t))
+        (should (equal (- (match-end 1) (match-beginning 1))
+                       (if (match-beginning 3) 0
+                         perl-indent-level))))
+      )))
+
 ;;; Tests for issues reported in the Bug Tracker
 
 (defun cperl-test--run-bug-10483 ()
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23  2:19 ` bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH] Harald Jörg
@ 2020-12-23  4:00   ` Stefan Monnier
  2020-12-23 14:37     ` Harald Jörg
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2020-12-23  4:00 UTC (permalink / raw)
  To: Harald Jörg; +Cc: 23461

> Investigating how to fix this leads to the longer story.  There are two
> possible approaches:
>
>   1) use a string-style syntax (generic string) instead of c-style
>      comments to flag HERE-documents.  That way, font-lock picks up the
>      correct face automagically.
>
>   2) Keep HERE_docs as c-style comments, but change the face mapping by
>      injecting a function into font-lock-defaults which applies the
>      string face to c-style comments.

3) Use `font-lock-syntactic-face-function`?


        Stefan






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23  4:00   ` Stefan Monnier
@ 2020-12-23 14:37     ` Harald Jörg
  2020-12-23 16:34       ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Harald Jörg @ 2020-12-23 14:37 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 23461

[-- Attachment #1: Type: text/plain, Size: 1048 bytes --]

Stefan Monnier writes:

>> Investigating how to fix this leads to the longer story.  There are two
>> possible approaches:
>>
>>   1) use a string-style syntax (generic string) instead of c-style
>>      comments to flag HERE-documents.  That way, font-lock picks up the
>>      correct face automagically.
>>
>>   2) Keep HERE_docs as c-style comments, but change the face mapping by
>>      injecting a function into font-lock-defaults which applies the
>>      string face to c-style comments.
>
> 3) Use `font-lock-syntactic-face-function`?

Ah - thanks for this pointer!  I wasn't aware of this function, though
this feature is already in use in perl-mode.el.

This looks like to be an improved variation of 2): HERE-docs remain
marked as c-style comments, and `font-lock-syntactic-face-function` is
used to display them as strings.

A patch for this variation is attached.  Tests and test resources are
the same as with the first patch, and this patch also contains the same
fix for HERE-doc starters with a trailing comment.
-- 
Cheers,
haj

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Display HERE-docs as strings --]
[-- Type: text/x-diff, Size: 7480 bytes --]

From f5025280db960b956e64bd9c1a7049c0fa294c79 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Harald=20J=C3=B6rg?= <haj@posteo.de>
Date: Wed, 23 Dec 2020 15:17:27 +0100
Subject: [PATCH] ; perl-mode: Display here-docs as strings instead of
 comments.

* lisp/progmodes/perl-mode.el
(perl-font-lock-syntactic-face-function): Declare HERE-docs to be
fontified as string.
(perl-syntax-propertize-function): Handle comments after a
HERE-doc starter line.

* test/lisp/progmodes/cperl-mode-tests.el (cperl-test-heredocs):
New test (30 should-forms) for various aspects of HERE-documents.
Works for CPerl mode, and with the patch also for Perl mode.

* test/lisp/progmodes/cperl-mode-resources/here-docs.pl: New file
with test cases.
---
 lisp/progmodes/perl-mode.el                   |  20 +++-
 .../cperl-mode-resources/here-docs.pl         | 111 ++++++++++++++++++
 test/lisp/progmodes/cperl-mode-tests.el       |  29 +++++
 3 files changed, 159 insertions(+), 1 deletion(-)
 create mode 100644 test/lisp/progmodes/cperl-mode-resources/here-docs.pl

diff --git a/lisp/progmodes/perl-mode.el b/lisp/progmodes/perl-mode.el
index fd8a51b5a5..cc3eb4948a 100644
--- a/lisp/progmodes/perl-mode.el
+++ b/lisp/progmodes/perl-mode.el
@@ -324,13 +324,28 @@ perl-syntax-propertize-function
         ;; disambiguate with the left-bitshift operator.
         "\\|" perl--syntax-exp-intro-regexp "<<\\(?2:\\sw+\\)\\)"
         ".*\\(\n\\)")
-       (4 (let* ((st (get-text-property (match-beginning 4) 'syntax-table))
+       (4 (let* ((eol (match-beginning 4))
+                 (st (get-text-property eol 'syntax-table))
                  (name (match-string 2))
                  (indented (match-beginning 1)))
             (goto-char (match-end 2))
             (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
+                ;; '>>' occurred in a string, or in a comment.
                 ;; Leave the property of the newline unchanged.
                 st
+              ;; Before changing the syntax to c-style comment, let's
+              ;; check whether we are in an end-of-line comment, and
+              ;; if so, cheat by shifting the comment markers one char
+              ;; to the left.
+              (when (nth 4 (save-excursion (syntax-ppss eol)))
+                (when (equal (car (syntax-after (1- eol)))
+                             (car (string-to-syntax "<")))
+                  ;; yet another edge case: "#" is the last character
+                  ;; in that line, so there's actually no comment.
+                  (put-text-property (- eol 2) (1- eol)
+                                     'syntax-table (string-to-syntax "<")))
+                (put-text-property (1- eol) eol
+                                   'syntax-table (string-to-syntax ">")))
               (cons (car (string-to-syntax "< c"))
                     ;; Remember the names of heredocs found on this line.
                     (cons (cons (pcase (aref name 0)
@@ -485,6 +500,9 @@ perl-syntax-propertize-special-constructs
 
 (defun perl-font-lock-syntactic-face-function (state)
   (cond
+   ((and (eq 2 (nth 7 state)) ; c-style comment
+         (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
+    'font-lock-string-face)
    ((and (nth 3 state)
          (eq ?e (cdr-safe (get-text-property (nth 8 state) 'syntax-table)))
          ;; This is a second-arg of s{..}{...} form; let's check if this second
diff --git a/test/lisp/progmodes/cperl-mode-resources/here-docs.pl b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
new file mode 100644
index 0000000000..39e4fe06ba
--- /dev/null
+++ b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
@@ -0,0 +1,111 @@
+use 5.020;
+
+=head1 NAME
+
+here-docs.pl - resource file for cperl-test-here-docs
+
+=head1 DESCRIPTION
+
+This file holds a couple of HERE documents, with a variety of normal
+and edge cases.  For a formatted view of this description, run:
+
+   (cperl-perldoc "here-docs.pl")
+
+For each of the HERE documents, the following checks will done:
+
+=over 4
+
+=item *
+
+All occurrences of the string "look-here" are fontified as
+'font-lock-string-face.  Note that we deliberately test the face, not
+the syntax property: Users won't care for the syntax property, but
+they see the face.  Different implementations with different syntax
+properties have been seen in the past.
+
+=item *
+
+Indentation of the line(s) containing "look-here" is 0, i.e. there are no
+leading spaces.
+
+=item *
+
+Indentation of the following perl statement containing "indent" should
+be 0 if the statement contains "noindent", and according to the mode's
+continued-statement-offset otherwise.
+
+=back
+
+=cut
+
+# Prologue to make the test file valid without warnings
+
+my $text;
+my $any;
+my $indentation;
+my $anywhere = 'back again';
+
+=head1 The Tests
+
+=head2 Test Case 1
+
+We have two HERE documents in one line with different quoting styles.
+
+=cut
+
+## test case
+
+$text = <<"HERE" . <<'THERE' . $any;
+#look-here and
+HERE
+$tlook-here and
+THERE
+
+my $noindent = "This should be left-justified";
+
+=head2 Test case 2
+
+A HERE document followed by a continuation line
+
+=cut
+
+## test case
+
+$text = <<HERE
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+=head2 Test case 3
+
+A here document with a line-end comment in the starter line,
+after a complete statement
+
+=cut
+
+## test case
+
+$text = <<HERE; # start here
+look-here
+HERE
+
+my $noindent = "New statement in this line";
+
+=head2 Test case 4
+
+A HERE document with a to-be-continued statement and a comment in the
+starter line.
+
+=cut
+
+## test case
+
+$text = <<HERE # start here
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+
+__END__
diff --git a/test/lisp/progmodes/cperl-mode-tests.el b/test/lisp/progmodes/cperl-mode-tests.el
index cb2d067a61..d9b090896d 100644
--- a/test/lisp/progmodes/cperl-mode-tests.el
+++ b/test/lisp/progmodes/cperl-mode-tests.el
@@ -135,6 +135,35 @@ cperl-test-fontify-punct-vars
         (should (equal (nth 3 (syntax-ppss)) nil))
         (should (equal (nth 4 (syntax-ppss)) t))))))
 
+(ert-deftest cperl-test-heredocs ()
+  "Test that HERE-docs are fontified as strings."
+  (let ((file (ert-resource-file "here-docs.pl"))
+        (cperl-continued-statement-offset perl-continued-statement-offset)
+        (case-fold-search nil))
+    (with-temp-buffer
+      (insert-file-contents file)
+      (goto-char (point-min))
+      (funcall cperl-test-mode)
+      (indent-region (point-min) (point-max))
+      (font-lock-ensure (point-min) (point-max))
+      (while (search-forward "## test case" nil t)
+        (save-excursion
+          (while (search-forward "look-here" nil t)
+            (should (equal
+                     (get-text-property (match-beginning 0) 'face)
+                     'font-lock-string-face))
+            (beginning-of-line)
+            (should (null (looking-at "[ \t]")))
+            (forward-line 1)))
+        (should (re-search-forward
+                 (concat "^\\([ \t]*\\)" ; the actual indentation amount
+                         "\\([^ \t\n].*?\\)\\(no\\)?indent")
+                 nil t))
+        (should (equal (- (match-end 1) (match-beginning 1))
+                       (if (match-beginning 3) 0
+                         perl-indent-level))))
+      )))
+
 ;;; Tests for issues reported in the Bug Tracker
 
 (defun cperl-test--run-bug-10483 ()
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23 14:37     ` Harald Jörg
@ 2020-12-23 16:34       ` Stefan Monnier
  2020-12-23 18:46         ` Harald Jörg
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2020-12-23 16:34 UTC (permalink / raw)
  To: Harald Jörg; +Cc: 23461

> This looks like to be an improved variation of 2): HERE-docs remain
> marked as c-style comments, and `font-lock-syntactic-face-function` is
> used to display them as strings.
>
> A patch for this variation is attached.

Looks good, thanks.
See nitpicks below,


        Stefan


>              (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
> +                ;; '>>' occurred in a string, or in a comment.
>                  ;; Leave the property of the newline unchanged.

Is think this `>>` is a type for `<<`, or am I missing something?

> +              ;; Before changing the syntax to c-style comment, let's
> +              ;; check whether we are in an end-of-line comment, and
> +              ;; if so, cheat by shifting the comment markers one char
> +              ;; to the left.

I jump straight to reading the code before reading your email's text and
it took me a bit of time to understand what this was about.
I think part of the reason is the "we are in an end-of-line comment"
since this is actually not about the case where the "<<" (which is
where "we" are at this moment, in my mind) is inside a comment.
So, I think the comment would be better if it just gave a straight
example, like

                 ;; Beware of `foo <<'BAR' #baz` because
                 ;; the newline needs to close the comment
                 ;; and can't be used to start the here-doc.

Also rather than "shifting the comment markers one char to the left"
I'd just say that "we terminate the comment *just before* the newline".

> +              (when (nth 4 (save-excursion (syntax-ppss eol)))
> +                (when (equal (car (syntax-after (1- eol)))
> +                             (car (string-to-syntax "<")))

This is a pessimistic test, because it will misfire when you have

     foo <<'BAR' #baz#

I think we should compare (1- eol) with (nth 8 (syntax-ppss eol)) instead.

> +                  ;; yet another edge case: "#" is the last character
> +                  ;; in that line, so there's actually no comment.
> +                  (put-text-property (- eol 2) (1- eol)
> +                                     'syntax-table (string-to-syntax "<")))

Indeed, terminating the comment just before the newline is a problem if
"just before the newline" is the comment starter.  I see that in that
case, you mark the char before the # but that can also be a problem with
things like:

    foo <<'BAR' "baz"#

An alternative is to leave the comment alone and start the heredoc just
after the newline instead (that approach suffers from the fact that we
need to be careful we don't accidentally throw away that `syntax-table`
text property when the next line is edited.  We can do that using the
`syntax-multiline` text property).

>  (defun perl-font-lock-syntactic-face-function (state)
>    (cond
> +   ((and (eq 2 (nth 7 state)) ; c-style comment
> +         (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
> +    'font-lock-string-face)

I think some people won't like the string-face property for it.
How 'bout we (require 'sh-script) and use the `sh-heredoc` face?


        Stefan






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23 16:34       ` Stefan Monnier
@ 2020-12-23 18:46         ` Harald Jörg
  2020-12-23 19:04           ` Stefan Monnier
  0 siblings, 1 reply; 14+ messages in thread
From: Harald Jörg @ 2020-12-23 18:46 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 23461

Stefan Monnier writes:

>> This looks like to be an improved variation of 2): HERE-docs remain
>> marked as c-style comments, and `font-lock-syntactic-face-function` is
>> used to display them as strings.
>>
>> A patch for this variation is attached.
>
> Looks good, thanks.
> See nitpicks below,

Many thanks for your review!

>>              (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
>> +                ;; '>>' occurred in a string, or in a comment.
>>                  ;; Leave the property of the newline unchanged.
>
> Is think this `>>` is a type for `<<`, or am I missing something?

Ouch.  That is a typo, of course.  *I* have missed something.

>> +              ;; Before changing the syntax to c-style comment, let's
>> +              ;; check whether we are in an end-of-line comment, and
>> +              ;; if so, cheat by shifting the comment markers one char
>> +              ;; to the left.
>
> I jump straight to reading the code before reading your email's text and
> it took me a bit of time to understand what this was about.
> I think part of the reason is the "we are in an end-of-line comment"
> since this is actually not about the case where the "<<" (which is
> where "we" are at this moment, in my mind) is inside a comment.
> So, I think the comment would be better if it just gave a straight
> example, like
>
>                  ;; Beware of `foo <<'BAR' #baz` because
>                  ;; the newline needs to close the comment
>                  ;; and can't be used to start the here-doc.

That's better, thank you!

> Also rather than "shifting the comment markers one char to the left"
> I'd just say that "we terminate the comment *just before* the newline".

>> +              (when (nth 4 (save-excursion (syntax-ppss eol)))
>> +                (when (equal (car (syntax-after (1- eol)))
>> +                             (car (string-to-syntax "<")))
>
> This is a pessimistic test, because it will misfire when you have
>
>      foo <<'BAR' #baz#
>
> I think we should compare (1- eol) with (nth 8 (syntax-ppss eol))
> instead.

You are right.  I'm still not very experienced with this syntax stuff,
as it turnes out.  In hindsight it is clear that a "#" will still be a
comment starter, even if it is already in a comment.

>> +                  ;; yet another edge case: "#" is the last character
>> +                  ;; in that line, so there's actually no comment.
>> +                  (put-text-property (- eol 2) (1- eol)
>> +                                     'syntax-table (string-to-syntax "<")))
>
> Indeed, terminating the comment just before the newline is a problem if
> "just before the newline" is the comment starter.  I see that in that
> case, you mark the char before the # but that can also be a problem with
> things like:
>
>     foo <<'BAR' "baz"#

...or, as I've discovered in the meantime, also bad,

      foo << BAR#

      foo << BAR;#

(The latter breaks indentation after the HERE-doc because the ";"
became a comment)

So, back to the drawing board.  If that edge case of an "empty comment"
is just ignored, then the single # loses its comment-face (that is, by
the way, the treatment of HERE-docs in sh-script.el), which is ugly but
as far as I can tell harmless.  Maybe this can be covered by assigning a
font-lock-face property to these single comment starters... I'll check
that.

> An alternative is to leave the comment alone and start the heredoc just
> after the newline instead (that approach suffers from the fact that we
> need to be careful we don't accidentally throw away that `syntax-table`
> text property when the next line is edited.  We can do that using the
> `syntax-multiline` text property).

This is similar to how CPerl mode does it, and the fact that text
properties can get lost when text is edited brought me here.

The bugs Bug#14343 and Bug#28962 in CPerl mode come from lost text
properties, and Bug#24101 is a consequence of starting a syntax property
with the first character.  So, I wanted to check how Perl mode covers
that.  I'll check whether `syntax-multiline` can add some more
robustness.

This points to the deeper problem: In Perl, we have some occasions where
one character has two different roles.  A line end ends a comment *and*
starts a here-doc, and in s/foo/bar/ge the middle slash ends the search
string *and* starts the replacement string.  The programming modes seem
to treat this by distributing the roles on two neighbouring characters,
which comes with some ... inaccuracies if the characters nearby have
roles of their own.

>>  (defun perl-font-lock-syntactic-face-function (state)
>>    (cond
>> +   ((and (eq 2 (nth 7 state)) ; c-style comment
>> +         (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
>> +    'font-lock-string-face)
>
> I think some people won't like the string-face property for it.
> How 'bout we (require 'sh-script) and use the `sh-heredoc` face?

I'd hesitate to do that.  I think that in shell-script mode it is well
justified to use a different face for HERE-docs, but in Perl it isn't.
In Perl, a HERE-doc is just a string and can be used wherever a string
can be used.  So, string-face seems quite appropriate.  In Shell, a
HERE-doc is sort of an I/O redirection.  You can't, for example, assign
a HERE-doc to a shell variable.  So, a different face seems appropriate.

BTW: CPerl also mode uses string-face for HERE-docs.
-- 
Cheers,
haj





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23 18:46         ` Harald Jörg
@ 2020-12-23 19:04           ` Stefan Monnier
  2020-12-23 19:06             ` Stefan Monnier
  2020-12-24 15:29             ` Harald Jörg
  0 siblings, 2 replies; 14+ messages in thread
From: Stefan Monnier @ 2020-12-23 19:04 UTC (permalink / raw)
  To: Harald Jörg; +Cc: 23461

>> Indeed, terminating the comment just before the newline is a problem if
>> "just before the newline" is the comment starter.  I see that in that
>> case, you mark the char before the # but that can also be a problem with
>> things like:
>>
>>     foo <<'BAR' "baz"#
>
> ...or, as I've discovered in the meantime, also bad,
>
>       foo << BAR#
>
>       foo << BAR;#
>
> (The latter breaks indentation after the HERE-doc because the ";"
> became a comment)
>
> So, back to the drawing board.  If that edge case of an "empty comment"
> is just ignored, then the single # loses its comment-face (that is, by
> the way, the treatment of HERE-docs in sh-script.el), which is ugly but
> as far as I can tell harmless.  Maybe this can be covered by assigning a
> font-lock-face property to these single comment starters... I'll check
> that.

I suspect that shifting the "here-doc start" to the next line (in the
"#\n" case only) and placing a `syntax-multiline` property on all three
chars (#, \n, and the first char of the next line) will be our least
bad option.

> This points to the deeper problem: In Perl, we have some occasions where
> one character has two different roles.  A line end ends a comment *and*
> starts a here-doc, and in s/foo/bar/ge the middle slash ends the search
> string *and* starts the replacement string.  The programming modes seem
> to treat this by distributing the roles on two neighbouring characters,
> which comes with some ... inaccuracies if the characters nearby have
> roles of their own.

Yes, this doesn't occur very often, but when it does there's no
really satisfactory solution currently.  I also encountered such
situations in a few other modes (sorry, can't remember where, offhand).
I've often thought about trying to add some way to "cram several syntax
elements on a single character" or have some way to place an arbitrary
"state change function" on a character which would take an PPSS and
return a new one).  But then I end up concluding that a completely new
system would be preferable (e.g. one based on a DFA so that we don't
need ad-hoc "multi-char comment markers" and so that fewer cases need
to resort to text-property crutches).

>>>  (defun perl-font-lock-syntactic-face-function (state)
>>>    (cond
>>> +   ((and (eq 2 (nth 7 state)) ; c-style comment
>>> +         (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
>>> +    'font-lock-string-face)
>>
>> I think some people won't like the string-face property for it.
>> How 'bout we (require 'sh-script) and use the `sh-heredoc` face?
>
> I'd hesitate to do that.  I think that in shell-script mode it is well
> justified to use a different face for HERE-docs, but in Perl it isn't.
> In Perl, a HERE-doc is just a string and can be used wherever a string
> can be used.  So, string-face seems quite appropriate.  In Shell, a
> HERE-doc is sort of an I/O redirection.  You can't, for example, assign
> a HERE-doc to a shell variable.  So, a different face seems appropriate.
>
> BTW: CPerl also mode uses string-face for HERE-docs.

I must admit I don't use Perl very much these days, but when I used it,
I used it as a "better shell", so I thought of Perl's here docs in
exactly the same way as sh's here docs.

So maybe a compromise is to add a new `perl-heredoc` face and make it
inherit from `font-lock-string-face` by default?


        Stefan






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23 19:04           ` Stefan Monnier
@ 2020-12-23 19:06             ` Stefan Monnier
  2020-12-24 15:29             ` Harald Jörg
  1 sibling, 0 replies; 14+ messages in thread
From: Stefan Monnier @ 2020-12-23 19:06 UTC (permalink / raw)
  To: Harald Jörg; +Cc: 23461

> I suspect that shifting the "here-doc start" to the next line (in the
> "#\n" case only) and placing a `syntax-multiline` property on all three
> chars (#, \n, and the first char of the next line) will be our least
> bad option.

BTW, the `syntax-multiline` property only "works" if/after you add
`syntax-propertize-multiline` to `syntax-propertize-extend-region-functions`.


        Stefan






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-23 19:04           ` Stefan Monnier
  2020-12-23 19:06             ` Stefan Monnier
@ 2020-12-24 15:29             ` Harald Jörg
  2020-12-24 17:32               ` Stefan Monnier
  1 sibling, 1 reply; 14+ messages in thread
From: Harald Jörg @ 2020-12-24 15:29 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: 23461

Stefan Monnier writes:

> I suspect that shifting the "here-doc start" to the next line (in the
> "#\n" case only) and placing a `syntax-multiline` property on all three
> chars (#, \n, and the first char of the next line) will be our least
> bad option.

I have some doubts here regarding the effort and the side effects....

The current regexp for here-doc detection ends with the newline, so
either it needs to capture that first char of the next line, or the
property needs to be explicitly set.  Also, the regexp in the following
rule ("\\(?:$\\)\\s<" would no longer fire to call
`perl-syntax-propertize-special-constructs`.  Finally, that first
character of the here-doc would fail to pass the "(eq 2 (nth 7 state))"
test which has been introduced to fix the "current" bug#23461 in
`perl-font-lock-syntactic-face-function`; the same test is used in some
other places.  That's a lot of special cases to handle, for an edge case
like a "#" starting a comment in the last column.

As for side effects, it would break word movement for the first word of
within the here-doc, like it has been observed for the replacement
string of a regexp in Bug#24101.  While that's not a big deal, I found
it pretty stunning.  I understand now why this is happening, but I've no
fix.

Compared to that, just ignoring that special case is easy.  It has no
non-local side effects.  The only downside is that in this special case
a comment starter "#" is converted to a comment terminator "#" - so left
unfontified.  I could easily live with that.

> [...]
> I must admit I don't use Perl very much these days, but when I used it,
> I used it as a "better shell", so I thought of Perl's here docs in
> exactly the same way as sh's here docs.
>
> So maybe a compromise is to add a new `perl-heredoc` face and make it
> inherit from `font-lock-string-face` by default?

That's a possibility, sure.
-- 
Cheers,
haj





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-24 15:29             ` Harald Jörg
@ 2020-12-24 17:32               ` Stefan Monnier
  2021-01-04 23:43                 ` Harald Jörg
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Monnier @ 2020-12-24 17:32 UTC (permalink / raw)
  To: Harald Jörg; +Cc: 23461

> I have some doubts here regarding the effort and the side effects....

No doubt.


        Stefan






^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2020-12-24 17:32               ` Stefan Monnier
@ 2021-01-04 23:43                 ` Harald Jörg
  2021-01-05  9:14                   ` Lars Ingebrigtsen
  0 siblings, 1 reply; 14+ messages in thread
From: Harald Jörg @ 2021-01-04 23:43 UTC (permalink / raw)
  To: 23461; +Cc: Stefan Monnier

[-- Attachment #1: Type: text/plain, Size: 948 bytes --]

Stefan Monnier writes:

>> I have some doubts here regarding the effort and the side effects....
>
> No doubt.

:)

In the meantime I found that just ignoring the edge case of a lone "#"
at the end of a line starting HERE-docs breaks indentation after the
HERE-doc, so I no longer consider this as a valid option.

I just failed to get all cases covered when the HERE-doc starter is
moved to the next line.

So here's another attempt to show HERE-docs with their own face, treat
"normal" line-end comments correctly, and work around lines which end
with the comment starter.  It is a kludge, but maybe for such a rare
edge case a kludge is acceptable.

The trick: The lone "#" is now syntaxified as "whitespace" and
font-lock-faced as comment.  This is ugly, but also well-contained in
the offending line, so should not have unwanted effects at a distance.

The tests are designed for, and pass, for both Perl mode and CPerl mode.
-- 
Cheers,
haj


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: perl-mode: Display HERE-docs with their own face --]
[-- Type: text/x-diff, Size: 9171 bytes --]

From 9a19e5991b33db7b8a71b653078f2a3c0a7afd08 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Harald=20J=C3=B6rg?= <haj@posteo.de>
Date: Tue, 5 Jan 2021 00:09:02 +0100
Subject: [PATCH] ; perl-mode: Display here-docs as strings instead of comments

* lisp/progmodes/perl-mode.el
(perl-syntax-propertize-function): Handle HERE doc starter
lines ending in a comment.
(perl-heredoc): New face for HERE docs, inheriting from
font-lock-string-face.
(perl-font-lock-syntactic-face-function): Apply the new face
to HERE docs (Bug#23461).

* test/lisp/progmodes/cperl-mode-tests.el
(cperl-test--run-bug-10483): Skip for Perl mode.  The test
explicitly calls a function of CPerl mode.

* test/lisp/progmodes/cperl-mode-resources/here-docs.pl: New
tests for HERE docs, common to Perl mode and CPerl mode.
---
 lisp/progmodes/perl-mode.el                   |  29 +++-
 .../cperl-mode-resources/here-docs.pl         | 143 ++++++++++++++++++
 test/lisp/progmodes/cperl-mode-tests.el       |  31 ++++
 3 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 test/lisp/progmodes/cperl-mode-resources/here-docs.pl

diff --git a/lisp/progmodes/perl-mode.el b/lisp/progmodes/perl-mode.el
index ec20b01a0f..2a2a4978c6 100644
--- a/lisp/progmodes/perl-mode.el
+++ b/lisp/progmodes/perl-mode.el
@@ -324,13 +324,33 @@ perl-syntax-propertize-function
         ;; disambiguate with the left-bitshift operator.
         "\\|" perl--syntax-exp-intro-regexp "<<\\(?2:\\sw+\\)\\)"
         ".*\\(\n\\)")
-       (4 (let* ((st (get-text-property (match-beginning 4) 'syntax-table))
+       (4 (let* ((eol (match-beginning 4))
+                 (st (get-text-property eol 'syntax-table))
                  (name (match-string 2))
                  (indented (match-beginning 1)))
             (goto-char (match-end 2))
             (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
+                ;; '<<' occurred in a string, or in a comment.
                 ;; Leave the property of the newline unchanged.
                 st
+              ;; Beware of `foo <<'BAR' #baz` because
+              ;; the newline needs to start the here-doc
+              ;; and can't be used to close the comment.
+              (let ((eol-state (save-excursion (syntax-ppss eol))))
+                (when (nth 4 eol-state)
+                  (if (/= (1- eol) (nth 8 eol-state))
+                      ;; make the last char of the comment closing it
+                      (put-text-property (1- eol) eol
+                                         'syntax-table (string-to-syntax ">"))
+                    ;; In `foo <<'BAR' #` the # is the last character
+                    ;; before eol and can't both open and close the
+                    ;; comment.  Workaround: disguise the "#" as
+                    ;; whitespace and fontify it as a comment.
+                    (put-text-property (1- eol) eol
+                                       'syntax-table (string-to-syntax "-"))
+                    (put-text-property (1- eol) eol
+                                       'font-lock-face
+                                       'font-lock-comment-face))))
               (cons (car (string-to-syntax "< c"))
                     ;; Remember the names of heredocs found on this line.
                     (cons (cons (pcase (aref name 0)
@@ -483,8 +503,15 @@ perl-syntax-propertize-special-constructs
 	      ;; as twoarg).
 	      (perl-syntax-propertize-special-constructs limit)))))))))
 
+(defface perl-heredoc
+  '((t (:inherit font-lock-string-face)))
+  "The face for here-documents.  Inherits from font-lock-string-face.")
+
 (defun perl-font-lock-syntactic-face-function (state)
   (cond
+   ((and (eq 2 (nth 7 state)) ; c-style comment
+         (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
+    'perl-heredoc)
    ((and (nth 3 state)
          (eq ?e (cdr-safe (get-text-property (nth 8 state) 'syntax-table)))
          ;; This is a second-arg of s{..}{...} form; let's check if this second
diff --git a/test/lisp/progmodes/cperl-mode-resources/here-docs.pl b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
new file mode 100644
index 0000000000..8af4625fff
--- /dev/null
+++ b/test/lisp/progmodes/cperl-mode-resources/here-docs.pl
@@ -0,0 +1,143 @@
+use 5.020;
+
+=head1 NAME
+
+here-docs.pl - resource file for cperl-test-here-docs
+
+=head1 DESCRIPTION
+
+This file holds a couple of HERE documents, with a variety of normal
+and edge cases.  For a formatted view of this description, run:
+
+   (cperl-perldoc "here-docs.pl")
+
+For each of the HERE documents, the following checks will done:
+
+=over 4
+
+=item *
+
+All occurrences of the string "look-here" are fontified correcty.
+Note that we deliberately test the face, not the syntax property:
+Users won't care for the syntax property, but they see the face.
+Different implementations with different syntax properties have been
+seen in the past.
+
+=item *
+
+Indentation of the line(s) containing "look-here" is 0, i.e. there are no
+leading spaces.
+
+=item *
+
+Indentation of the following perl statement containing "indent" should
+be 0 if the statement contains "noindent", and according to the mode's
+continued-statement-offset otherwise.
+
+=back
+
+=cut
+
+# Prologue to make the test file valid without warnings
+
+my $text;
+my $any;
+my $indentation;
+my $anywhere = 'back again';
+my $noindent;
+
+=head1 The Tests
+
+=head2 Test Case 1
+
+We have two HERE documents in one line with different quoting styles.
+
+=cut
+
+## test case
+
+$text = <<"HERE" . <<'THERE' . $any;
+#look-here and
+HERE
+$tlook-here and
+THERE
+
+$noindent = "This should be left-justified";
+
+=head2 Test case 2
+
+A HERE document followed by a continuation line
+
+=cut
+
+## test case
+
+$text = <<HERE
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+=head2 Test case 3
+
+A here document with a line-end comment in the starter line,
+after a complete statement
+
+=cut
+
+## test case
+
+$text = <<HERE; # start here
+look-here
+HERE
+
+$noindent = "New statement in this line";
+
+=head2 Test case 4
+
+A HERE document with a to-be-continued statement and a comment in the
+starter line.
+
+=cut
+
+## test case
+
+$text = <<HERE # start here
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+=head2 Test case 5
+
+A HERE document with a comment sign, but no comment to follow.
+
+
+=cut
+
+## test case
+
+$text = <<HERE; #
+look-here
+HERE
+
+$noindent = "New statement in this line";
+
+=head2 Test case 6
+
+A HERE document with a comment sign, but no comment to follow, with a
+statement to be continued.  Also, the character before the comment
+sign has a relevant syntax property (end of string in our case) which
+must be preserved.
+
+=cut
+
+## test case
+
+$text = <<"HERE"#
+look-here
+HERE
+
+. 'indent-level'; # Continuation, should be indented
+
+__END__
diff --git a/test/lisp/progmodes/cperl-mode-tests.el b/test/lisp/progmodes/cperl-mode-tests.el
index 46e687f14d..71fa4e5ea4 100644
--- a/test/lisp/progmodes/cperl-mode-tests.el
+++ b/test/lisp/progmodes/cperl-mode-tests.el
@@ -135,6 +135,36 @@ cperl-test-fontify-punct-vars
         (should (equal (nth 3 (syntax-ppss)) nil))
         (should (equal (nth 4 (syntax-ppss)) t))))))
 
+(ert-deftest cperl-test-heredocs ()
+  "Test that HERE-docs are fontified with the appropriate face."
+  (let ((file (ert-resource-file "here-docs.pl"))
+        (cperl-continued-statement-offset perl-continued-statement-offset)
+        (target-font (if (equal cperl-test-mode 'perl-mode) 'perl-heredoc
+                       'font-lock-string-face))
+        (case-fold-search nil))
+    (with-temp-buffer
+      (insert-file-contents file)
+      (goto-char (point-min))
+      (funcall cperl-test-mode)
+      (indent-region (point-min) (point-max))
+      (font-lock-ensure (point-min) (point-max))
+      (while (search-forward "## test case" nil t)
+        (save-excursion
+          (while (search-forward "look-here" nil t)
+            (should (equal
+                     (get-text-property (match-beginning 0) 'face)
+                     target-font))
+            (beginning-of-line)
+            (should (null (looking-at "[ \t]")))
+            (forward-line 1)))
+        (should (re-search-forward
+                 (concat "^\\([ \t]*\\)" ; the actual indentation amount
+                         "\\([^ \t\n].*?\\)\\(no\\)?indent")
+                 nil t))
+        (should (equal (- (match-end 1) (match-beginning 1))
+                       (if (match-beginning 3) 0
+                         perl-indent-level)))))))
+
 ;;; Tests for issues reported in the Bug Tracker
 
 (defun cperl-test--run-bug-10483 ()
@@ -164,6 +194,7 @@ cperl-test-bug-10483
   (interactive)
   (skip-unless (not (getenv "EMACS_HYDRA_CI"))) ; FIXME times out
   (skip-unless (not (< emacs-major-version 28))) ; times out in older Emacsen
+  (skip-unless (eq cperl-test-mode #'cperl-mode))
   (let* ((emacs (concat invocation-directory invocation-name))
          (test-function 'cperl-test--run-bug-10483)
          (test-function-name (symbol-name test-function))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2021-01-04 23:43                 ` Harald Jörg
@ 2021-01-05  9:14                   ` Lars Ingebrigtsen
  2021-01-05 12:30                     ` Harald Jörg
  0 siblings, 1 reply; 14+ messages in thread
From: Lars Ingebrigtsen @ 2021-01-05  9:14 UTC (permalink / raw)
  To: Harald Jörg; +Cc: 23461, Stefan Monnier

haj@posteo.de (Harald Jörg) writes:

> The trick: The lone "#" is now syntaxified as "whitespace" and
> font-lock-faced as comment.  This is ugly, but also well-contained in
> the offending line, so should not have unwanted effects at a distance.
>
> The tests are designed for, and pass, for both Perl mode and CPerl mode.

Thanks; applied to Emacs 28.  However, the cperl tests failed -- I stuck
a (require 'perl-mode) into the test, and that made it work.  I don't
know whether that's the proper solution here, though -- please have a look.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





^ permalink raw reply	[flat|nested] 14+ messages in thread

* bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
  2021-01-05  9:14                   ` Lars Ingebrigtsen
@ 2021-01-05 12:30                     ` Harald Jörg
  0 siblings, 0 replies; 14+ messages in thread
From: Harald Jörg @ 2021-01-05 12:30 UTC (permalink / raw)
  To: Lars Ingebrigtsen; +Cc: 23461, Stefan Monnier

Lars Ingebrigtsen <larsi@gnus.org> writes:

> haj@posteo.de (Harald Jörg) writes:
>
>> The trick: The lone "#" is now syntaxified as "whitespace" and
>> font-lock-faced as comment.  This is ugly, but also well-contained in
>> the offending line, so should not have unwanted effects at a distance.
>>
>> The tests are designed for, and pass, for both Perl mode and CPerl mode.
>
> Thanks; applied to Emacs 28.  However, the cperl tests failed -- I stuck
> a (require 'perl-mode) into the test, and that made it work.  I don't
> know whether that's the proper solution here, though -- please have a look.

Ouch - and thanks!

Adding (require 'perl-mode) is the correct thing to do.  In my batch
tests I loaded it explicitly for no good reason, I've eliminated that
now.

The tests need both modes to account for different names of custom
variables (here: cperl-continued-statement-offset
vs. perl-continued-statement-offset) which affect the expected outcome.
-- 
Thanks again,
haj






^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-01-05 12:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-05 21:03 bug#23461: perl mode uses same color for comments and here documents 積丹尼 Dan Jacobson
2019-10-09  5:52 ` Lars Ingebrigtsen
2020-12-23  2:19 ` bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH] Harald Jörg
2020-12-23  4:00   ` Stefan Monnier
2020-12-23 14:37     ` Harald Jörg
2020-12-23 16:34       ` Stefan Monnier
2020-12-23 18:46         ` Harald Jörg
2020-12-23 19:04           ` Stefan Monnier
2020-12-23 19:06             ` Stefan Monnier
2020-12-24 15:29             ` Harald Jörg
2020-12-24 17:32               ` Stefan Monnier
2021-01-04 23:43                 ` Harald Jörg
2021-01-05  9:14                   ` Lars Ingebrigtsen
2021-01-05 12:30                     ` Harald Jörg

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).