From: Stefan Monnier
Newsgroups: gmane.emacs.bugs
Subject: bug#23461: perl-mode: Displaying HERE-docs as strings instead of comments [PATCH]
Date: Wed, 23 Dec 2020 14:04:27 -0500
References: <87sg7xxo0s.fsf@hajtower> <87zh24611c.fsf@hajtower> <87r1ng5pj8.fsf@hajtower>
In-Reply-To: <87r1ng5pj8.fsf@hajtower> (Harald Jörg's message of "Wed, 23 Dec 2020 19:46:19 +0100")
To: haj@posteo.de (Harald Jörg)
Cc: 23461@debbugs.gnu.org

>> Indeed, terminating the comment just before the newline is a problem if
>> "just before the newline" is the comment starter.  I see that in that
>> case, you mark the char before the # but that can also be a problem with
>> things like:
>>
>>     foo <<'BAR' "baz"#
>
> ...or, as I've discovered in the meantime, also bad,
>
>     foo << BAR#
>
>     foo << BAR;#
>
> (The latter breaks indentation after the HERE-doc because the ";"
> became a comment)
>
> So, back to the drawing board.  If that edge case of an "empty comment"
> is just ignored, then the single # loses its comment-face (that is, by
> the way, the treatment of HERE-docs in sh-script.el), which is ugly but
> as far as I can tell harmless.  Maybe this can be covered by assigning a
> font-lock-face property to these single comment starters...  I'll check
> that.

I suspect that shifting the "here-doc start" to the next line (in the
"#\n" case only) and placing a `syntax-multiline` property on all three
chars (#, \n, and the first char of the next line) will be our least
bad option.

> This points to the deeper problem: In Perl, we have some occasions where
> one character has two different roles.  A line end ends a comment *and*
> starts a here-doc, and in s/foo/bar/ge the middle slash ends the search
> string *and* starts the replacement string.
> The programming modes seem to treat this by distributing the roles on
> two neighbouring characters, which comes with some ... inaccuracies if
> the characters nearby have roles of their own.

Yes, this doesn't occur very often, but when it does there's no really
satisfactory solution currently.  I also encountered such situations in
a few other modes (sorry, can't remember where, offhand).  I've often
thought about trying to add some way to "cram several syntax elements on
a single character" or have some way to place an arbitrary "state change
function" on a character (which would take a PPSS and return a new one).
But then I end up concluding that a completely new system would be
preferable (e.g. one based on a DFA so that we don't need ad-hoc
"multi-char comment markers" and so that fewer cases need to resort to
text-property crutches).

>>>  (defun perl-font-lock-syntactic-face-function (state)
>>>    (cond
>>> +   ((and (eq 2 (nth 7 state)) ; c-style comment
>>> +         (cdr-safe (get-text-property (nth 8 state) 'syntax-table))) ; HERE doc
>>> +    'font-lock-string-face)
>>
>> I think some people won't like the string-face property for it.
>> How 'bout we (require 'sh-script) and use the `sh-heredoc` face?
>
> I'd hesitate to do that.  I think that in shell-script mode it is well
> justified to use a different face for HERE-docs, but in Perl it isn't.
> In Perl, a HERE-doc is just a string and can be used wherever a string
> can be used.  So, string-face seems quite appropriate.  In Shell, a
> HERE-doc is sort of an I/O redirection.  You can't, for example, assign
> a HERE-doc to a shell variable.  So, a different face seems appropriate.
>
> BTW: CPerl also mode uses string-face for HERE-docs.

I must admit I don't use Perl very much these days, but when I used it,
I used it as a "better shell", so I thought of Perl's here docs in
exactly the same way as sh's here docs.
So maybe a compromise is to add a new `perl-heredoc` face and make it
inherit from `font-lock-string-face` by default?


        Stefan
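
For illustration only, the compromise above could look roughly like the
sketch below.  It assumes the face is called `perl-heredoc` as proposed,
and it reuses the here-doc test from the quoted patch unchanged.  The
function name `my-perl-syntactic-face` is hypothetical and is not the
actual perl-mode function; the fallback branches simply mimic the usual
string/comment choice.

```elisp
;; Sketch, not the committed implementation.
;; `perl-heredoc' is the face name proposed in this thread;
;; `my-perl-syntactic-face' is a hypothetical helper name.

(defface perl-heredoc
  '((t :inherit font-lock-string-face))
  "Face for Perl here-documents.
Inherits from `font-lock-string-face' so the default look is unchanged."
  :group 'perl)

(defun my-perl-syntactic-face (state)
  "Return the face for the syntactic element described by STATE.
STATE is a parser state such as the one returned by `syntax-ppss'."
  (cond
   ;; Test copied from the quoted patch: a style-2 comment whose start
   ;; position carries a cons-valued `syntax-table' text property.
   ((and (eq 2 (nth 7 state))
         (cdr-safe (get-text-property (nth 8 state) 'syntax-table)))
    'perl-heredoc)
   ;; Otherwise fall back to the ordinary string/comment distinction.
   ((nth 3 state) 'font-lock-string-face)
   (t 'font-lock-comment-face)))
```

With this arrangement nothing changes visually by default, and users who
think of here-docs the way sh-script does can customize `perl-heredoc`
on its own without touching the string face.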