From mboxrd@z Thu Jan 1 00:00:00 1970
From: fabrice nicol
Newsgroups: gmane.emacs.bugs
Subject: bug#47408: Etags support for Mercury [v0.3]
Date: Sun, 28 Mar 2021 17:49:20 +0200
Message-ID: <97f573da-ec63-7362-13c2-ca28a6634480@gmail.com>
References: <5ba2fec3-3f61-fb7e-35eb-7188fa6064a4@gmail.com> <834kgvo220.fsf@gnu.org>
In-Reply-To: <834kgvo220.fsf@gnu.org>
To: 47408@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="------------F072051EFF8E1E91D0356A3F"

This is a multi-part message in MIME format.
--------------F072051EFF8E1E91D0356A3F
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 8bit

Thanks for this review.

Changes will be implemented soon as indicated.

(1)  There is just one point that I would like to discuss before changing things around: the proposed -m/-M short option issue.


I kept this pair of short options (following Francesco Potortì's suggestion only for the long options --declarations/--no-defines) for two reasons:

1. The ambiguity between Objective C and Mercury

Since both languages use the same file extension .m, a heuristic test function had to be added for the case where the language is not explicitly identified on the command line.

Yet any heuristic may fail in rare cases.  Tests on the Mercury compiler source code show a fairly low failure rate: fewer than 0.5% of .m files are not identified as Mercury files by the test (this should have been documented somewhere).  The files that fail the test are some Mercury test files and documentary test files containing only (or almost only) comments and blank lines.

While this could be improved by tweaking the heuristic test, doing so would make it more complex, bug-prone and ultimately hard to maintain.

So -m/-M are useful for dealing with these rare files, as they do not rely on the heuristic test function at all but on their own semantics, which explicitly identify Mercury.
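
To make the failure mode concrete, here is a minimal sketch of a line-counting heuristic of this general kind.  It is an illustration only and is not the test_objc_is_mercury function from the patch; the names looks_like_mercury and starts_with and the list of Objective C markers are assumptions chosen for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if LINE, after leading blanks, starts
   with PREFIX.  */
static bool
starts_with (const char *line, const char *prefix)
{
  while (*line == ' ' || *line == '\t')
    line++;
  return strncmp (line, prefix, strlen (prefix)) == 0;
}

/* Illustrative heuristic only, NOT the function used in etags.c.
   TEXT is the whole file content as a NUL-terminated string.
   Returns true when Mercury-looking lines outnumber
   Objective C-looking ones.  */
bool
looks_like_mercury (const char *text)
{
  int mercury = 0, objc = 0;
  const char *line = text;

  while (*line != '\0')
    {
      /* Mercury declarations begin with ":-" (e.g. ":- module foo.").  */
      if (starts_with (line, ":-"))
        mercury++;
      /* Objective C compiler directives and preprocessor lines.  */
      else if (starts_with (line, "@interface")
               || starts_with (line, "@implementation")
               || starts_with (line, "@end")
               || starts_with (line, "#import")
               || starts_with (line, "#include"))
        objc++;

      /* Advance to the start of the next line, if any.  */
      const char *nl = strchr (line, '\n');
      if (nl == NULL)
        break;
      line = nl + 1;
    }
  return mercury > objc;
}
```

With a sketch like this, a file that contains only comments and blank lines scores zero on both sides and is therefore not recognized as Mercury, which is the kind of rare miss described above.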

The only alternative I see is to tell users explicitly (in etags.1 and possibly other docs) to add '-l mercury' to the command line when using the long options.

Whether this is less intrusive (or more) than -m/-M is not crystal-clear to me.  Both solutions look on a par with respect to this criterion, but -m/-M may be more user-friendly.

If two short options are one too many, I propose redesigning the short option pair as a single -m option with a binary argument (like: '-m defines / -m all', or '-m 0 / -m 1').
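
As a rough sketch of what the single-option variant could look like, the handler below maps the option argument onto the with_mercury_definitions flag that appears in the quoted patch.  The function name set_mercury_mode is hypothetical, and the mapping of argument values to behavior ("all"/"1" acting like the current -M, "defines"/"0" like the current -m) is an assumption made for the example; the proposal above does not pin it down.

```c
#include <stdbool.h>
#include <string.h>

/* Same name as the flag set by -M in the quoted patch.  */
static bool with_mercury_definitions = false;

/* Hypothetical handler for a single "-m ARG" option.
   Returns true if ARG is recognized, false otherwise.
   Assumed mapping (not stated in the proposal): "all" or "1" behaves
   like the current -M, "defines" or "0" like the current -m.  */
bool
set_mercury_mode (const char *arg)
{
  if (strcmp (arg, "all") == 0 || strcmp (arg, "1") == 0)
    {
      with_mercury_definitions = true;
      return true;
    }
  if (strcmp (arg, "defines") == 0 || strcmp (arg, "0") == 0)
    {
      with_mercury_definitions = false;
      return true;
    }
  /* Unknown argument: the caller should report a usage error.  */
  return false;
}
```

In an option-parsing loop, such a handler would be called with the option's argument under a single 'm' case, replacing the separate 'M' and 'm' cases.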


2. The social side of things

As indicated previously, I also consulted the Mercury review list, and the feedback was positive on -m/-M (see below):

> Accommodating different people's different preferences is a good idea
> if it can be done at acceptable cost.
>
>> Instead of -M, you should use --declarations
>>
>> Instead of -m, you should use --no-defines
> There is no need for "instead"; you can support both forms of both options.

So I opted for a compromise: renaming long options, following F. Potorti, and keeping -m/-M, following Z. Somogyi.


(2) Regarding your question:


> diff --git a/lisp/speedbar.el b/lisp/speedbar.el
> index 12e57b1108..63f3cd6ca1 100644
> --- a/lisp/speedbar.el
> +++ b/lisp/speedbar.el
> @@ -3534,6 +3534,8 @@ speedbar-fetch-etags-parse-list
>       speedbar-parse-c-or-c++tag)
>      ("^\\.emacs$\\|.\\(el\\|l\\|lsp\\)\\'" .
>       "def[^i]+\\s-+\\(\\(\\w\\|[-_]\\)+\\)\\s-*\C-?")
> +      ("^\\.m$\\'" .
> +     "\\(^:-\\)?\\s-*\\(\\(pred\\|func\\|type\\|instance\\|typeclass\\)+\\s+\\([a-z]+[a-zA-Z0-9_]*\\)+\\)\\s-*(?^?")
>  ;    ("\\.\\([fF]\\|for\\|FOR\\|77\\|90\\)\\'" .
>  ;      speedbar-parse-fortran77-tag)
>      ("\\.tex\\'" . speedbar-parse-tex-string)
> What about ObjC here? or are these keywords good for ObjC as well?

Objective C .m files are not parsed by speedbar.el in the current repository code, so the added feature does not break anything.  Issues will only arise if/when the Emacs maintainers for Objective C support decide to add this file format to the speedbar parser.  It would be premature (and out of place) for me to settle this on my own.  Should that happen, the heuristics used in etags.c (function test_objc_is_mercury) could then be ported to Elisp code.


>> +.TP
>> +.B \-M, \-\-no\-defines
>> +For the Mercury programming language, tag both declarations and
>> +definitions.  Declarations start a line with \fI:\-\fP optionally followed by a
>> +quantifier over a variable (\fIsome [T]\fP or \fIall [T]\fP), then by
>> +a builtin operator like \fIpred\fP or \fIfunc\fP.
>> +Definitions are first rules of clauses, as in Prolog.
>> +Implies \-\-language=mercury.
>> +.TP
>> +.B \-m, \-\-declarations
>> +For the Mercury programming language, tag declarations as with \fB\-M\fP, but do not
>> +tag definitions. Implies \-\-language=mercury.
> This is not what Francesco Potortì suggested to do.  He suggested that
> you use the existing options --no-defines and --declarations, but give
> them Mercury-specific meanings when processing Mercury source files.
> IOW, let's not introduce the new -m and -M shorthands for these options,
> and let's describe the Mercury-specific meaning of the existing
> options where they are currently described in etags.1.  OK?
>
> +** Etags support for the Mercury programming language (https://mercurylang.org).
> +** New etags command line options '-M/-m' or --declarations/--no-defines'.
> +Tags all Mercury declarations.  For compatibility with Prolog etags support,
> +predicates and functions appearing first in clauses will be tagged if etags is
> +run with the option '-M' or '--declarations'.  If run with '-m' or
> +'--no-defines', declarations will be tagged but definitions will not.
> +Both options imply --language=mercury.
> This should be amended for the changes in the options I described
> above.
> As mentioned, let's not introduce -m and -M.
>
>> +      case 'M':
>> +	with_mercury_definitions = true; FALLTHROUGH;
>> +      case 'm':
>> +	{
>> +	  language lang =
>> +	    { "mercury", Mercury_help, Mercury_functions, Mercury_suffixes };
>> +
>> +	  argbuffer[current_arg].lang = &lang;
>> +	  argbuffer[current_arg].arg_type = at_language;
>> +	}
>> +	break;
> Shouldn't be needed anymore.
>
>> diff --git a/lisp/speedbar.el b/lisp/speedbar.el
>> index 12e57b1108..63f3cd6ca1 100644
>> --- a/lisp/speedbar.el
>> +++ b/lisp/speedbar.el
>> @@ -3534,6 +3534,8 @@ speedbar-fetch-etags-parse-list
>>       speedbar-parse-c-or-c++tag)
>>      ("^\\.emacs$\\|.\\(el\\|l\\|lsp\\)\\'" .
>>       "def[^i]+\\s-+\\(\\(\\w\\|[-_]\\)+\\)\\s-*\C-?")
>> +      ("^\\.m$\\'" .
>> +     "\\(^:-\\)?\\s-*\\(\\(pred\\|func\\|type\\|instance\\|typeclass\\)+\\s+\\([a-z]+[a-zA-Z0-9_]*\\)+\\)\\s-*(?^?")
>>  ;    ("\\.\\([fF]\\|for\\|FOR\\|77\\|90\\)\\'" .
>>  ;      speedbar-parse-fortran77-tag)
>>      ("\\.tex\\'" . speedbar-parse-tex-string)
> What about ObjC here? or are these keywords good for ObjC as well?
>
> Last, but not least: if you can, please provide a test file for the
> etags test suite, see test/manual/etags/.
>
> Thanks again for working on this.
--------------F072051EFF8E1E91D0356A3F--