From: haj@posteo.de (Harald Jörg)
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: 23461@debbugs.gnu.org
Subject: bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
Date: Wed, 23 Dec 2020 19:46:19 +0100 [thread overview]
Message-ID: <87r1ng5pj8.fsf@hajtower> (raw)
In-Reply-To: <jwvy2hov77q.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Wed, 23 Dec 2020 11:34:12 -0500")
Stefan Monnier writes:
>> This looks like to be an improved variation of 2): HERE-docs remain
>> marked as c-style comments, and `font-lock-syntactic-face-function` is
>> used to display them as strings.
>>
>> A patch for this variation is attached.
>
> Looks good, thanks.
> See nitpicks below,
Many thanks for your review!
>> (if (save-excursion (nth 8 (syntax-ppss (match-beginning 0))))
>> + ;; '>>' occurred in a string, or in a comment.
>> ;; Leave the property of the newline unchanged.
>
> Is think this `>>` is a type for `<<`, or am I missing something?
Ouch. That is a typo, of course. *I* have missed something.
>> + ;; Before changing the syntax to c-style comment, let's
>> + ;; check whether we are in an end-of-line comment, and
>> + ;; if so, cheat by shifting the comment markers one char
>> + ;; to the left.
>
> I jump straight to reading the code before reading your email's text and
> it took me a bit of time to understand what this was about.
> I think part of the reason is the "we are in an end-of-line comment"
> since this is actually not about the case where the "<<" (which is
> where "we" are at this moment, in my mind) is inside a comment.
> So, I think the comment would be better if it just gave a straight
> example, like
>
> ;; Beware of `foo <<'BAR' #baz` because
> ;; the newline needs to close the comment
> ;; and can't be used to start the here-doc.
That's better, thank you!
> Also rather than "shifting the comment markers one char to the left"
> I'd just say that "we terminate the comment *just before* the newline".
>> + (when (nth 4 (save-excursion (syntax-ppss eol)))
>> + (when (equal (car (syntax-after (1- eol)))
>> + (car (string-to-syntax "<")))
>
> This is a pessimistic test, because it will misfire when you have
>
> foo <<'BAR' #baz#
>
> I think we should compare (1- eol) with (nth 8 (syntax-ppss eol))
> instead.
You are right. I'm still not very experienced with this syntax stuff,
as it turnes out. In hindsight it is clear that a "#" will still be a
comment starter, even if it is already in a comment.
>> + ;; yet another edge case: "#" is the last character
>> + ;; in that line, so there's actually no comment.
>> + (put-text-property (- eol 2) (1- eol)
>> + 'syntax-table (string-to-syntax "<")))
>
> Indeed, terminating the comment just before the newline is a problem if
> "just before the newline" is the comment starter. I see that in that
> case, you mark the char before the # but that can also be a problem with
> things like:
>
> foo <<'BAR' "baz"#
...or, as I've discovered in the meantime, also bad,
foo << BAR#
foo << BAR;#
(The latter breaks indentation after the HERE-doc because the ";"
became a comment)
So, back to the drawing board. If that edge case of an "empty comment"
is just ignored, then the single # loses its comment-face (that is, by
the way, the treatment of HERE-docs in sh-script.el), which is ugly but
as far as I can tell harmless. Maybe this can be covered by assigning a
font-lock-face property to these single comment starters... I'll check
that.
> An alternative is to leave the comment alone and start the heredoc just
> after the newline instead (that approach suffers from the fact that we
> need to be careful we don't accidentally throw away that `syntax-table`
> text property when the next line is edited. We can do that using the
> `syntax-multiline` text property).
This is similar to how CPerl mode does it, and the fact that text
properties can get lost when text is edited brought me here.
The bugs Bug#14343 and Bug#28962 in CPerl mode come from lost text
properties, and Bug#24101 is a consequence of starting a syntax property
with the first character. So, I wanted to check how Perl mode covers
that. I'll check whether `syntax-multiline` can add some more
robustness.
This points to the deeper problem: In Perl, we have some occasions where
one character has two different roles. A line end ends a comment *and*
starts a here-doc, and in s/foo/bar/ge the middle slash ends the search
string *and* starts the replacement string. The programming modes seem
to treat this by distributing the roles on two neighbouring characters,
which comes with some ... inaccuracies if the characters nearby have
roles of their own.
>> (defun perl-font-lock-syntactic-face-function (state)
>> (cond
>> + ((and (eq 2 (nth 7 state)) ; c-style comment
>> + (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
>> + 'font-lock-string-face)
>
> I think some people won't like the string-face property for it.
> How 'bout we (require 'sh-script) and use the `sh-heredoc` face?
I'd hesitate to do that. I think that in shell-script mode it is well
justified to use a different face for HERE-docs, but in Perl it isn't.
In Perl, a HERE-doc is just a string and can be used wherever a string
can be used. So, string-face seems quite appropriate. In Shell, a
HERE-doc is sort of an I/O redirection. You can't, for example, assign
a HERE-doc to a shell variable. So, a different face seems appropriate.
BTW: CPerl also mode uses string-face for HERE-docs.
--
Cheers,
haj
next prev parent reply other threads:[~2020-12-23 18:46 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-05 21:03 bug#23461: perl mode uses same color for comments and here documents 積丹尼 Dan Jacobson
2019-10-09 5:52 ` Lars Ingebrigtsen
2020-12-23 2:19 ` bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH] Harald Jörg
2020-12-23 4:00 ` Stefan Monnier
2020-12-23 14:37 ` Harald Jörg
2020-12-23 16:34 ` Stefan Monnier
2020-12-23 18:46 ` Harald Jörg [this message]
2020-12-23 19:04 ` Stefan Monnier
2020-12-23 19:06 ` Stefan Monnier
2020-12-24 15:29 ` Harald Jörg
2020-12-24 17:32 ` Stefan Monnier
2021-01-04 23:43 ` Harald Jörg
2021-01-05 9:14 ` Lars Ingebrigtsen
2021-01-05 12:30 ` Harald Jörg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/emacs/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87r1ng5pj8.fsf@hajtower \
--to=haj@posteo.de \
--cc=23461@debbugs.gnu.org \
--cc=monnier@iro.umontreal.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).