From: Stephen Leake <stephen_leake@stephe-leake.org>
To: emacs-devel <emacs-devel@gnu.org>
Subject: Re: Using incremental parsing in Emacs
Date: Fri, 03 Jan 2020 11:39:50 -0800
Message-ID: <86zhf4gwhl.fsf@stephe-leake.org>
In-Reply-To: <83blrkj1o1.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 03 Jan 2020 12:05:02 +0200")

Eli Zaretskii <eliz@gnu.org> writes:

> Would someone like to try to figure out how we could use the
> incremental parsing technology in Emacs for making our
> programming-language support more accurate and efficient?  

GNU ELPA ada-mode is an existing example; it has a full language parser
(error-correcting generalized LR) that supports some advanced
navigation. It could be extended to do some code completion.

Instead of "incremental parsing" (which updates an existing syntax tree
given source changes) it uses "partial parsing" (parsing only part of a
file) and very robust error handling. It works very well on very large
Ada files (it is in production use by Eurocontrol and others).

Error correction is critical, since buffers are normally not syntactically
correct during editing.

I've tried using the same parser generator on Java and Python; the
results are not as good as for Ada (apparently Ada lends itself to LR
parsing better than those languages). That might be improved by
massaging the grammar, but that risks implementing not-quite-Java,
not-quite-Python.

Others mentioned LSP (https://langserver.org/); that method supports
incremental parsing, since it is centered on sending source edits from
the editor to the language server (after sending the full text once). It
also supports algorithms that require more than one source file, since
all files involved in a project can be loaded into the same language
server instance (the ada-mode parser is strictly one file). That allows
providing completion on parameters for functions declared in other files,
for example.
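
To make the "full text once, then edits" flow concrete, here is a
minimal Python sketch. The method names, the message fields, and the
Content-Length framing are taken from the LSP specification; the helper
names (frame, did_open, did_change, apply_change) are hypothetical, and
apply_change assumes ASCII text (LSP counts characters in UTF-16 code
units by default). It is an illustration only, not how eglot, lsp-mode
or any particular server implements this.

```python
import json


def frame(payload: dict) -> bytes:
    """Wrap a JSON-RPC payload in the LSP base-protocol header."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def did_open(uri: str, language_id: str, text: str) -> dict:
    """Notification that sends the full buffer text, once."""
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didOpen",
        "params": {
            "textDocument": {
                "uri": uri,
                "languageId": language_id,
                "version": 1,
                "text": text,
            }
        },
    }


def did_change(uri: str, version: int, start, end, new_text: str) -> dict:
    """Notification that sends one edit; start/end are zero-based
    (line, character) pairs delimiting the replaced region."""
    return {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            "textDocument": {"uri": uri, "version": version},
            "contentChanges": [
                {
                    "range": {
                        "start": {"line": start[0], "character": start[1]},
                        "end": {"line": end[0], "character": end[1]},
                    },
                    "text": new_text,
                }
            ],
        },
    }


def apply_change(text: str, change: dict) -> str:
    """Server-side view: splice one ranged change into the stored text.
    Assumes "\n" line endings and ASCII content (hypothetical helper)."""
    lines = text.split("\n")

    def offset(pos: dict) -> int:
        # Characters in all preceding lines (plus their newlines),
        # then the column within the target line.
        return sum(len(l) + 1 for l in lines[: pos["line"]]) + pos["character"]

    a = offset(change["range"]["start"])
    b = offset(change["range"]["end"])
    return text[:a] + change["text"] + text[b:]
```

An editor would write frame(did_open(...)) to the server once, then
frame(did_change(...)) for each subsequent edit; the server keeps its
copy of the text current with something like apply_change and can
reparse from there.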

Many editors are moving to support LSP; that allows them to take
advantage of any parser technology developed independently.

ada-mode has its own protocol between elisp and the external parser,
provided by the GNU ELPA wisi package (the ada-mode parser was started
before LSP). The parser in ada-mode could be used in an LSP language
server.

So I think the short answer to your post is "GNU ELPA eglot", with
possibly some work importing some of that into core to make it more
efficient. eglot is currently listed as "incompat" in *Packages* (in
both emacs 27 and 26); I don't know why. I have not tried eglot; I don't
know how complete it is. There is also
https://github.com/emacs-lsp/lsp-mode.

The syntax used for expressing the grammar is usually fairly tightly
tied to the language and/or the parser generator; trying to generalize
that for all languages supported by Emacs is a huge task, not worth
doing. With LSP, building a grammar for a language is done once for each
language server.

Whether the language server is implemented as an external process, or as
a loadable module, is an implementation detail. ada-mode uses an
external process, mostly because it was started before modules were
stabilized. The communication between the language server and elisp
(whether ada-mode style or LSP) involves sending text, not binary data
(and _not_ pointers into the emacs buffer!). Doing that via the module
interface vs pipes to a process is a wash for speed. Using a process
fully isolates the server code from emacs, eliminating any possible
third-party library version conflicts.
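
As a rough illustration of the external-process approach, the Python
sketch below starts a child process and exchanges text with it over
pipes. Everything in it is hypothetical: the one-JSON-object-per-line
protocol, the ParserProcess class, and the stand-in "parser" that merely
counts whitespace-separated tokens. It is not the wisi protocol and not
LSP; it only shows that plain text goes in and plain text comes out,
with the child fully isolated from the editor's address space.

```python
import json
import subprocess
import sys

# Stand-in "parser" run in the child process: reads one JSON request per
# line and answers with one JSON line. A real parser would go here.
SERVER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    words = request["text"].split()
    reply = {"id": request["id"], "token_count": len(words)}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class ParserProcess:
    """Hypothetical client side of a text-over-pipes parser protocol."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", SERVER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self.next_id = 0

    def request(self, text: str) -> dict:
        """Send source text to the child and wait for its reply."""
        self.next_id += 1
        # json.dumps escapes newlines, so each message stays on one line.
        message = json.dumps({"id": self.next_id, "text": text})
        self.proc.stdin.write(message + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

    def close(self) -> None:
        """Closing stdin ends the child's read loop; then reap it."""
        self.proc.stdin.close()
        self.proc.wait()
```

Typical use would be to create one ParserProcess per session, call
request() whenever the buffer text needs parsing, and call close() on
exit. If the child crashes or links against a conflicting library
version, only the child is affected.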

It could be possible to implement an LSP language server in elisp, running
in a separate thread (or even the same thread; it can be used
synchronously). That might be an interesting exercise, and would
eliminate other language dependencies. ada-mode used to support an elisp
parser generated from the same grammar, but that never supported error
correction; implementing very complex algorithms is just easier in a
more advanced language (and certainly faster at run time; critical for
error correction).

-- 
-- Stephe


