From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: fabrice nicol Newsgroups: gmane.emacs.bugs Subject: bug#47408: Etags support for Mercury [v0.3] Date: Sat, 27 Mar 2021 11:51:22 +0100 Message-ID: <5ba2fec3-3f61-fb7e-35eb-7188fa6064a4@gmail.com> References: Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------CAE36B3A8BD255920925A15F" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="6850"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 To: 47408@debbugs.gnu.org Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Sat Mar 27 16:40:14 2021 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1lQB2r-0001f2-D0 for geb-bug-gnu-emacs@m.gmane-mx.org; Sat, 27 Mar 2021 16:40:13 +0100 Original-Received: from localhost ([::1]:51416 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lQB2q-0006Qa-Dh for geb-bug-gnu-emacs@m.gmane-mx.org; Sat, 27 Mar 2021 11:40:12 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:45014) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lQB2g-0006QG-O9 for bug-gnu-emacs@gnu.org; Sat, 27 Mar 2021 11:40:02 -0400 Original-Received: from debbugs.gnu.org ([209.51.188.43]:33198) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lQB2g-0001Dw-FA for bug-gnu-emacs@gnu.org; Sat, 27 Mar 2021 11:40:02 -0400 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 1lQB2g-0005cq-Cm for bug-gnu-emacs@gnu.org; Sat, 27 Mar 2021 11:40:02 -0400 X-Loop: help-debbugs@gnu.org In-Reply-To: Resent-From: fabrice nicol Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 27 Mar 2021 15:40:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 47408 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch Original-Received: via spool by 47408-submit@debbugs.gnu.org id=B47408.161685958921593 (code B ref 47408); Sat, 27 Mar 2021 15:40:02 +0000 Original-Received: (at 47408) by debbugs.gnu.org; 27 Mar 2021 15:39:49 +0000 Original-Received: from localhost ([127.0.0.1]:44743 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lQB2S-0005cC-Fo for submit@debbugs.gnu.org; Sat, 27 Mar 2021 11:39:49 -0400 Original-Received: from mail-wm1-f47.google.com ([209.85.128.47]:40705) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1lQ6Ww-0002hG-Ma for 47408@debbugs.gnu.org; Sat, 27 Mar 2021 06:51:00 -0400 Original-Received: by mail-wm1-f47.google.com with SMTP id y124-20020a1c32820000b029010c93864955so6130070wmy.5 for <47408@debbugs.gnu.org>; Sat, 27 Mar 2021 03:50:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=to:from:subject:message-id:date:user-agent:mime-version :content-language; bh=1qXB6CLllaOx/Ifp4Aco1zGgbQWqYPrr4sSfXEli/GA=; b=fifhjBHTnrSWNlTak8Q9PYqkyqiDtaww9Dguxje7XXcdv38kAcwgZzMxVTAO6+QQhn KNIQeeuqaaZKERfnzXOoTgb89qwm59y8vImZ5CoLTTWMxwu0A6aE6gZBQ77aWWB5Qks1 OcXTWPo1gUGIwERLMjSsTBUfMUyrI4JBkXGcmdmvTzhWTiG7+kHM9jlH/9lDg5Zx+411 jxnY7CSZCiws2Me0KcRn32IJqkqM15oZ8wIUZOWXeevU4OTvru0f8ydrjxJTqfOW5fQQ ZEpXBattbIyE3kENH2qfQrhEHPHZ4TjEhXB9DLbZPeHPeLfis4KOP7L6Hj1F+zAhX6+C wLbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:from:subject:message-id:date:user-agent :mime-version:content-language; bh=1qXB6CLllaOx/Ifp4Aco1zGgbQWqYPrr4sSfXEli/GA=; b=avDN1PkR18mmYF/KQ1kV+TtB7VzSYPmOCcy1V4wmNSGFS5DFgsSl5bPzR2sGy4EXnW F5Vfrgg274GJ5ublWOhdZZDbJImZDVD+WPhO9HElmv7QX8MMk4pzebe6J7T2yFSeoxYI EYq+qX2FuzbqBMGrwdMK3yCV+iELVxvTbpEzPPSpDO2n5nGTQWc71skJPxckypWgY3TZ LrmUGNmkiorCensFGtV68J2/0GHQj9FPoqQv//iVXt+r4WQJsjYXZVn0OBehDcjTGyrc ONaJvbDSWUZrdvrErmN6CvoDooCRZ7D2fZ7jK4KU/P0LCp48wQtKn7vCY1xLpkf577az CZbQ== X-Gm-Message-State: AOAM532ZgerlHcCTijaqoDppr4Bnhi04Qubo0m9GnAPOCE9xdJnKnbb1 oxf1GMOT+nlyVJSxvV/p5BqQ1mzmHRs= X-Google-Smtp-Source: ABdhPJzCifPzjiPqzc0L2+jjjG7KkY0BeP3lGIVuizRXOtzkjd5EihsHJz2vkdqlBgN0qCh+mr8A3Q== X-Received: by 2002:a1c:c1:: with SMTP id 184mr9049991wma.143.1616842252567; Sat, 27 Mar 2021 03:50:52 -0700 (PDT) Original-Received: from ?IPv6:2a01:cb1d:88b9:5c00:7b73:7901:965e:8523? ([2a01:cb1d:88b9:5c00:7b73:7901:965e:8523]) by smtp.gmail.com with ESMTPSA id l1sm17363939wrv.87.2021.03.27.03.50.51 for <47408@debbugs.gnu.org> (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 27 Mar 2021 03:50:51 -0700 (PDT) Content-Language: en-US X-Mailman-Approved-At: Sat, 27 Mar 2021 11:39:47 -0400 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:203130 Archived-At: This is a multi-part message in MIME format. --------------CAE36B3A8BD255920925A15F Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Hi, I'm sending a new patch for this Mercury feature request. The attached patch fixes a previously unnoticed regression that affects Objective C parsing (Mercury and Objective C have same file extensions .m). I had to resort to an added heuristics (implemented in `test_objc_is_mercury') to disambiguate Mercury from Objective C source files with extension .m. The patch is cumulative and replaces the former one. Best, Fabrice Nicol --------------CAE36B3A8BD255920925A15F Content-Type: text/x-patch; charset=UTF-8; name="0001-Fixed-regressions-caused-by-Objc-Mercury-ambiguous-f.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0001-Fixed-regressions-caused-by-Objc-Mercury-ambiguous-f.pa"; filename*1="tch" >From 50f3f9a0d46d11d0ac096f79f0d5aa1bc17b7920 Mon Sep 17 00:00:00 2001 From: Fabrice Nicol Date: Sat, 27 Mar 2021 10:16:44 +0100 Subject: [PATCH] Fixed regressions caused by Objc/Mercury ambiguous file extension .m. --- doc/man/etags.1 | 25 ++- etc/NEWS | 9 + lib-src/etags.c | 444 ++++++++++++++++++++++++++++++++++++++++++++--- lisp/speedbar.el | 2 + 4 files changed, 455 insertions(+), 25 deletions(-) diff --git a/doc/man/etags.1 b/doc/man/etags.1 index c5c15fb182..903e38a145 100644 --- a/doc/man/etags.1 +++ b/doc/man/etags.1 @@ -1,5 +1,5 @@ .\" See section COPYING for copyright and redistribution information. -.TH ETAGS 1 "2019-06-24" "GNU Tools" "GNU" +.TH ETAGS 1 "2021-03-25" "GNU Tools" "GNU" .de BP .sp .ti -.2i @@ -50,9 +50,9 @@ format understood by .BR vi ( 1 )\c \&. Both forms of the program understand the syntax of C, Objective C, C++, Java, Fortran, Ada, Cobol, Erlang, -Forth, Go, HTML, LaTeX, Emacs Lisp/Common Lisp, Lua, Makefile, Pascal, Perl, -Ruby, PHP, PostScript, Python, Prolog, Scheme and -most assembler\-like syntaxes. +Forth, Go, HTML, LaTeX, Emacs Lisp/Common Lisp, Lua, Makefile, Mercury, Pascal, +Perl, Ruby, PHP, PostScript, Python, Prolog, Scheme and most assembler\-like +syntaxes. Both forms read the files specified on the command line, and write a tag table (defaults: \fBTAGS\fP for \fBetags\fP, \fBtags\fP for \fBctags\fP) in the current working directory. @@ -270,6 +270,23 @@ prints detailed information about how tags are created for LANG. .B \-V, \-\-version Print the current version of the program (same as the version of the emacs \fBetags\fP is shipped with). +.TP +.B \-, \-\-version +Print the current version of the program (same as the version of the +emacs \fBetags\fP is shipped with). +.TP +.B \-M, \-\-no\-defines +For the Mercury programming language, tag both declarations and +definitions. Declarations start a line with \fI:\-\fP optionally followed by a +quantifier over a variable (\fIsome [T]\fP or \fIall [T]\fP), then by +a builtin operator like \fIpred\fP or \fIfunc\fP. +Definitions are first rules of clauses, as in Prolog. +Implies \-\-language=mercury. +.TP +.B \-m, \-\-declarations +For the Mercury programming language, tag declarations as with \fB\-M\fP, but do not +tag definitions. Implies \-\-language=mercury. + .SH "SEE ALSO" "\|\fBemacs\fP\|" entry in \fBinfo\fP; \fIGNU Emacs Manual\fP, Richard diff --git a/etc/NEWS b/etc/NEWS index 68812c64cc..4af4e76371 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -93,6 +93,15 @@ useful on systems such as FreeBSD which ships only with "etc/termcap". * Changes in Emacs 28.1 +--- +** Etags support for the Mercury programming language (https://mercurylang.org). +** New etags command line options '-M/-m' or --declarations/--no-defines'. +Tags all Mercury declarations. For compatibility with Prolog etags support, +predicates and functions appearing first in clauses will be tagged if etags is +run with the option '-M' or '--declarations'. If run with '-m' or +'--no-defines', declarations will be tagged but definitions will not. +Both options imply --language=mercury. + +++ ** New command 'font-lock-update', bound to 'C-x x f'. This command updates the syntax highlighting in this buffer. diff --git a/lib-src/etags.c b/lib-src/etags.c index b5c18e0e01..53e04794dd 100644 --- a/lib-src/etags.c +++ b/lib-src/etags.c @@ -142,7 +142,13 @@ Copyright (C) 1984, 1987-1989, 1993-1995, 1998-2021 Free Software # define CTAGS false #endif -/* Copy to DEST from SRC (containing LEN bytes), and append a NUL byte. */ +/* Define MERCURY_HEURISTICS_RATIO as it was necessary to disambiguate + Mercury from Objective C, which have same file extensions .m */ +#ifndef MERCURY_HEURISTICS_RATIO +# define MERCURY_HEURISTICS_RATIO 0.02 +#endif + +/* COPY to DEST from SRC (containing LEN bytes), and append a NUL byte. */ static void memcpyz (void *dest, void const *src, ptrdiff_t len) { @@ -359,6 +365,7 @@ #define xrnew(op, n, m) ((op) = xnrealloc (op, n, (m) * sizeof *(op))) static void Lisp_functions (FILE *); static void Lua_functions (FILE *); static void Makefile_targets (FILE *); +static void Mercury_functions (FILE *); static void Pascal_functions (FILE *); static void Perl_functions (FILE *); static void PHP_functions (FILE *); @@ -378,6 +385,7 @@ #define xrnew(op, n, m) ((op) = xnrealloc (op, n, (m) * sizeof *(op))) static bool nocase_tail (const char *); static void get_tag (char *, char **); static void get_lispy_tag (char *); +static void test_objc_is_mercury(char *, language **); static void analyze_regex (char *); static void free_regexps (void); @@ -621,7 +629,6 @@ #define STDIN 0x1001 /* returned by getopt_long on --parse-stdin */ "In Java code, all the tags constructs of C and C++ code are\n\ tagged. (Use --help --lang=c --lang=c++ --lang=java for full help.)"; - static const char *Cobol_suffixes [] = { "COB", "cob", NULL }; static char Cobol_help [] = @@ -683,10 +690,22 @@ #define STDIN 0x1001 /* returned by getopt_long on --parse-stdin */ "In makefiles, targets are tags; additionally, variables are tags\n\ unless you specify '--no-globals'."; +/* Mercury and Objective C share the same .m file extensions. */ +static const char *Mercury_suffixes [] = + {"m", /* Use option -l mercury to switch from Objective C to Mercury. */ + NULL}; +static const char Mercury_help [] = + "In Mercury code, tags are all declarations beginning a line with :-\n\ +and optionally Prolog-like definitions (first rule for a predicate or \ +function).\n\ +To enable this behavior, run etags using -M or --declarations."; +static bool with_mercury_definitions = false; +double mercury_heuristics_ratio = MERCURY_HEURISTICS_RATIO; + static const char *Objc_suffixes [] = - { "lm", /* Objective lex file */ - "m", /* Objective C file */ - NULL }; + {"lm", + "m", /* By default, Objective C will be assumed. */ + NULL}; static const char Objc_help [] = "In Objective C code, tags include Objective C definitions for classes,\n\ class categories, methods and protocols. Tags for variables and\n\ @@ -773,7 +792,6 @@ #define STDIN 0x1001 /* returned by getopt_long on --parse-stdin */ 'TEXTAGS' to a colon-separated list like, for example,\n\ TEXTAGS=\"mycommand:myothercommand\"."; - static const char *Texinfo_suffixes [] = { "texi", "texinfo", "txi", NULL }; static const char Texinfo_help [] = @@ -824,7 +842,9 @@ #define STDIN 0x1001 /* returned by getopt_long on --parse-stdin */ { "lisp", Lisp_help, Lisp_functions, Lisp_suffixes }, { "lua", Lua_help,Lua_functions,Lua_suffixes,NULL,Lua_interpreters}, { "makefile", Makefile_help,Makefile_targets,NULL,Makefile_filenames}, + /* objc listed before mercury as it is a better default for .m extensions. */ { "objc", Objc_help, plain_C_entries, Objc_suffixes }, + { "mercury", Mercury_help, Mercury_functions, Mercury_suffixes }, { "pascal", Pascal_help, Pascal_functions, Pascal_suffixes }, { "perl",Perl_help,Perl_functions,Perl_suffixes,NULL,Perl_interpreters}, { "php", PHP_help, PHP_functions, PHP_suffixes }, @@ -1061,6 +1081,17 @@ print_help (argument *argbuffer) which you like."); } + puts ("-m, --declarations\n\ + For the Mercury programming language, only tag declarations.\n\ + Declarations start a line with :- \n\ + Implies --language=mercury."); + + puts ("-M, --no-defines\n\ + For the Mercury programming language, include both declarations and\n\ + definitions. Declarations start a line with :- while definitions\n\ + are first rules for a given item, as for Prolog.\n\ + Implies --language=mercury."); + puts ("-V, --version\n\ Print the version of the program.\n\ -h, --help\n\ @@ -1111,7 +1142,7 @@ main (int argc, char **argv) /* When the optstring begins with a '-' getopt_long does not rearrange the non-options arguments to be at the end, but leaves them alone. */ - optstring = concat ("-ac:Cf:Il:o:Qr:RSVhH", + optstring = concat ("-ac:Cf:Il:Mmo:Qr:RSVhHW", (CTAGS) ? "BxdtTuvw" : "Di:", ""); @@ -1202,9 +1233,20 @@ main (int argc, char **argv) case 'Q': class_qualify = 1; break; + case 'M': + with_mercury_definitions = true; FALLTHROUGH; + case 'm': + { + language lang = + { "mercury", Mercury_help, Mercury_functions, Mercury_suffixes }; + + argbuffer[current_arg].lang = ⟨ + argbuffer[current_arg].arg_type = at_language; + } + break; /* Etags options */ - case 'D': constantypedefs = false; break; + case 'D': constantypedefs = false; break; case 'i': included_files[nincluded_files++] = optarg; break; /* Ctags options. */ @@ -1298,19 +1340,19 @@ main (int argc, char **argv) analyze_regex (argbuffer[i].what); break; case at_filename: - this_file = argbuffer[i].what; - /* Input file named "-" means read file names from stdin - (one per line) and use them. */ - if (streq (this_file, "-")) - { - if (parsing_stdin) - fatal ("cannot parse standard input " - "AND read file names from it"); - while (readline_internal (&filename_lb, stdin, "-") > 0) - process_file_name (filename_lb.buffer, lang); - } - else - process_file_name (this_file, lang); + this_file = argbuffer[i].what; + /* Input file named "-" means read file names from stdin + (one per line) and use them. */ + if (streq (this_file, "-")) + { + if (parsing_stdin) + fatal ("cannot parse standard input " + "AND read file names from it"); + while (readline_internal (&filename_lb, stdin, "-") > 0) + process_file_name (filename_lb.buffer, lang); + } + else + process_file_name (this_file, lang); break; case at_stdin: this_file = argbuffer[i].what; @@ -1775,6 +1817,11 @@ find_entries (FILE *inf) if (parser == NULL) { lang = get_language_from_filename (curfdp->infname, true); + + /* Disambiguate file names between Objc and Mercury */ + if (lang != NULL && strcmp(lang->name, "objc") == 0) + test_objc_is_mercury(curfdp->infname, &lang); + if (lang != NULL && lang->function != NULL) { curfdp->lang = lang; @@ -6019,6 +6066,361 @@ prolog_atom (char *s, size_t pos) return 0; } + +/* + * Support for Mercury + * + * Assumes that the declarationa starts at column 0. + * Original code by Sunichirou Sugou (1989) for Prolog. + * Rewritten by Anders Lindgren (1996) for Prolog. + * Adapted by Fabrice Nicol (2021) for Mercury. + * Note: Prolog-support behavior is preserved if + * -M/--declarations is used, corresponding to + * with_mercury_definitions=true. + */ + +static ptrdiff_t mercury_pr (char *, char *, ptrdiff_t); +static void mercury_skip_comment (linebuffer *, FILE *); +static bool is_mercury_type = false; +static bool is_mercury_quantifier = false; +static bool is_mercury_declaration = false; + +/* To robustly disambiguate between Objective C and Mercury, parse file + with the following heuristics hook: + (number of occurrences of non-blank, non-fully-commented lines + comprising ':-' at the start of line)/ number of lines > mercury_heuristics_ratio */ + +static void test_objc_is_mercury(char *this_file, language **lang) +{ + if (this_file == NULL) return; + FILE* fp = fopen(this_file, "r"); + if (fp == NULL) return; + int c; + uint64_t lines = 1; + uint64_t mercury_decls = 0; + bool blank_line = false; + bool start_of_line = true; + + while ((c = fgetc(fp)) != EOF) + { + switch (c) + { + case '\n': + if (! blank_line) ++lines; + blank_line = true; + start_of_line = true; + break; + case '%': FALLTHROUGH; + case ' ': FALLTHROUGH; + case '\t': + start_of_line = false; + break; + case ':': + if (! start_of_line) break; + start_of_line = false; + c = fgetc(fp); + if (c == '-') ++mercury_decls; + break; + default: + start_of_line = false; + blank_line = false; + } + } + + double ratio = 0; + ratio = ((double) mercury_decls ) / lines; + if (ratio > mercury_heuristics_ratio) + { + /* Change the language from Objective C to Mercury */ + static language lang0 = { "mercury", Mercury_help, Mercury_functions, + Mercury_suffixes }; + *lang = &lang0; + } +} + +static void +Mercury_functions (FILE *inf) +{ + char *cp, *last = NULL; + ptrdiff_t lastlen = 0, allocated = 0; + if (declarations) with_mercury_definitions = true; + + LOOP_ON_INPUT_LINES (inf, lb, cp) + { + if (cp[0] == '\0') /* Empty line */ + continue; + else if (c_isspace (cp[0]) || cp[0] == '%') + /* a Prolog-type comment or anything other than a declaration */ + continue; + else if (cp[0] == '/' && cp[1] == '*') /* Mercury C-type comment. */ + mercury_skip_comment (&lb, inf); + else + { + is_mercury_declaration = (cp[0] == ':' && cp[1] == '-'); + + if (is_mercury_declaration + || with_mercury_definitions) + { + ptrdiff_t len = mercury_pr (cp, last, lastlen); + if (0 < len) + { + /* Store the declaration to avoid generating duplicate + tags later. */ + if (allocated <= len) + { + xrnew (last, len + 1, 1); + allocated = len + 1; + } + memcpyz (last, cp, len); + lastlen = len; + } + } + } + } + free (last); +} + +static void +mercury_skip_comment (linebuffer *plb, FILE *inf) +{ + char *cp; + + do + { + for (cp = plb->buffer; *cp != '\0'; ++cp) + if (cp[0] == '*' && cp[1] == '/') + return; + readline (plb, inf); + } + while (perhaps_more_input (inf)); +} + +/* + * A declaration is added if it matches: + * :-( + * If with_mercury_definitions == true, we also add: + * ( + * or :- + * As for Prolog support, different arities and types are not taken into + * consideration. + * Item is added to the tags database if it doesn't match the + * name of the previous declaration. + * + * Consume a Mercury declaration. + * Return the number of bytes consumed, or 0 if there was an error. + * + * A Mercury declaration must be one of: + * :- type + * :- solver type + * :- pred + * :- func + * :- inst + * :- mode + * :- typeclass + * :- instance + * :- pragma + * :- promise + * :- initialise + * :- finalise + * :- mutable + * :- module + * :- interface + * :- implementation + * :- import_module + * :- use_module + * :- include_module + * :- end_module + * followed on the same line by an alphanumeric sequence, starting with a lower + * case letter or by a single-quoted arbitrary string. + * Single quotes can escape themselves. Backslash quotes everything. + * + * Return the size of the name of the declaration or 0 if no header was found. + * As quantifiers may precede functions or predicates, we must list them too. + */ + +static const char *Mercury_decl_tags[] = {"type", "solver type", "pred", + "func", "inst", "mode", "typeclass", "instance", "pragma", "promise", + "initialise", "finalise", "mutable", "module", "interface", "implementation", + "import_module", "use_module", "include_module", "end_module", "some", "all"}; + +static size_t +mercury_decl (char *s, size_t pos) +{ + if (s == NULL) return 0; + + size_t origpos; + origpos = pos; + + while (s + pos != NULL && (c_isalnum (s[pos]) || s[pos] == '_')) ++pos; + + uint8_t decl_type_length = pos - origpos; + char buf[decl_type_length + 1]; + memset(buf, 0, decl_type_length + 1); + + /* Mercury declaration tags. Consume them, then check the declaration item + following :- is legitimate, then go on as in the prolog case. */ + + memcpy(buf, &s[origpos], decl_type_length); + + bool found_decl_tag = false; + + if (is_mercury_quantifier) + { + if (strcmp(buf, "pred") != 0 && strcmp(buf, "func") != 0) /* Bad syntax */ + return 0; + is_mercury_quantifier = false; /* Beset to base value. */ + found_decl_tag = true; + } + else + { + for (int j = 0; j < sizeof(Mercury_decl_tags)/sizeof(char*); ++j) + { + if (strcmp(buf, Mercury_decl_tags[j]) == 0) + { + found_decl_tag = true; + if (strcmp(buf, "type") == 0) + is_mercury_type = true; + + if (strcmp(buf, "some") == 0 + || strcmp(buf, "all") == 0) { + is_mercury_quantifier = true; + } + + break; /* Found declaration tag of rank j. */ + } + else + /* 'solver type' has a blank in the middle, + so this is the hard case */ + if (strcmp(buf, "solver") == 0) + { + ++pos; + while (s + pos != NULL && (c_isalnum (s[pos]) || s[pos] == '_')) + ++pos; + + decl_type_length = pos - origpos; + char buf2[decl_type_length + 1]; + memset(buf2, 0, decl_type_length + 1); + memcpy(buf2, &s[origpos], decl_type_length); + + if (strcmp(buf2, "solver type") == 0) + { + found_decl_tag = false; + break; /* Found declaration tag of rank j. */ + } + } + } + } + + /* If with_mercury_definitions == false + * this is a Mercury syntax error, ignoring... */ + + if (with_mercury_definitions) + { + if (found_decl_tag) + pos = skip_spaces (s + pos) - s; /* Skip len blanks again */ + else + /* Prolog-like behavior + * we have parsed the predicate once, yet inappropriately + * so restarting again the parsing step */ + pos = 0; + } + else + { + if (found_decl_tag) + pos = skip_spaces (s + pos) - s; /* Skip len blanks again */ + else + return 0; + } + + /* From now on it is the same as for Prolog except for module dots */ + + if (c_islower (s[pos]) || s[pos] == '_' ) + { + /* The name is unquoted. + Do not confuse module dots with end-of-declaration dots. */ + + while (c_isalnum (s[pos]) + || s[pos] == '_' + || (s[pos] == '.' /* A module dot */ + && s + pos + 1 != NULL + && (c_isalnum (s[pos + 1]) || s[pos + 1] == '_'))) + ++pos; + + return pos - origpos; + } + else if (s[pos] == '\'') + { + ++pos; + for (;;) + { + if (s[pos] == '\'') + { + ++pos; + if (s[pos] != '\'') + break; + ++pos; /* A double quote */ + } + else if (s[pos] == '\0') /* Multiline quoted atoms are ignored. */ + return 0; + else if (s[pos] == '\\') + { + if (s[pos+1] == '\0') + return 0; + pos += 2; + } + else + ++pos; + } + return pos - origpos; + } + else if (is_mercury_quantifier && s[pos] == '[') /* :- some [T] pred/func */ + { + for (++pos; s + pos != NULL && s[pos] != ']'; ++pos) {} + if (s + pos == NULL) return 0; + ++pos; + pos = skip_spaces (s + pos) - s; + return mercury_decl(s, pos) + pos - origpos; + } + else + return 0; +} + +static ptrdiff_t +mercury_pr (char *s, char *last, ptrdiff_t lastlen) +{ + size_t len0 = 0; + is_mercury_type = false; + is_mercury_quantifier = false; + + if (is_mercury_declaration) + { + /* Skip len0 blanks only for declarations. */ + len0 = skip_spaces (s + 2) - s; + } + + size_t len = mercury_decl (s , len0); + if (len == 0) return 0; + len += len0; + + if (( (s[len] == '.' /* This is a statement dot, not a module dot. */ + || (s[len] == '(' && (len += 1)) + || (s[len] == ':' /* Stopping in case of a rule. */ + && s[len + 1] == '-' + && (len += 2))) + && (lastlen != len || memcmp (s, last, len) != 0) + ) + /* Types are often declared on several lines so keeping just + the first line */ + || is_mercury_type + ) + { + make_tag (s, 0, true, s, len, lineno, linecharno); + return len; + } + + return 0; +} + /* * Support for Erlang diff --git a/lisp/speedbar.el b/lisp/speedbar.el index 12e57b1108..63f3cd6ca1 100644 --- a/lisp/speedbar.el +++ b/lisp/speedbar.el @@ -3534,6 +3534,8 @@ speedbar-fetch-etags-parse-list speedbar-parse-c-or-c++tag) ("^\\.emacs$\\|.\\(el\\|l\\|lsp\\)\\'" . "def[^i]+\\s-+\\(\\(\\w\\|[-_]\\)+\\)\\s-*\C-?") + ("^\\.m$\\'" . + "\\(^:-\\)?\\s-*\\(\\(pred\\|func\\|type\\|instance\\|typeclass\\)+\\s+\\([a-z]+[a-zA-Z0-9_]*\\)+\\)\\s-*(?^?") ; ("\\.\\([fF]\\|for\\|FOR\\|77\\|90\\)\\'" . ; speedbar-parse-fortran77-tag) ("\\.tex\\'" . speedbar-parse-tex-string) -- 2.26.3 --------------CAE36B3A8BD255920925A15F--