From: Stephen Leake
Newsgroups: gmane.emacs.devel
Subject: Re: Using incremental parsing in Emacs
Date: Sat, 04 Jan 2020 11:26:38 -0800
Message-ID: <86r20fgh01.fsf@stephe-leake.org>
References: <83blrkj1o1.fsf@gnu.org> <86zhf4gwhl.fsf@stephe-leake.org> <83tv5cgvar.fsf@gnu.org> <86v9psgkqe.fsf@stephe-leake.org> <83mub3hao7.fsf@gnu.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (windows-nt)
To: emacs-devel
In-Reply-To: <83mub3hao7.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 04 Jan 2020 10:45:44 +0200")

Eli Zaretskii writes:

>> From: Stephen Leake
>> Date: Fri, 03 Jan 2020 15:53:45 -0800
>>
>> The interface should look like LSP; it aims to support everything an IDE
>> needs from a "language server" (ie parser), and allows for custom
>> extensions where it falls short.
>
> Maybe I'm the odd one out, but I don't think I have a clear idea of
> what the "LSP interface" entails.  Would you (or someone else) post a
> summary, or point to some place where this is described succinctly
> enough to not require a long study?

The full description is at
https://microsoft.github.io/language-server-protocol/specifications/specification-3-14/

However, that document apparently only describes commands sent to the
server, not the responses sent from the server.

My attempt at a summary, in the form of a description of how LSP is used
in a typical editing session:

User visits a file whose major mode supports LSP.

Emacs starts or connects to a language server for that language (this
can be customized in eglot to be per-project, and in other ways).

Emacs sends the entire file contents to the server.

For every edit the user makes after that, the edit is sent to the
server; the message contains deleted and inserted text. It is up to
Emacs how much insert/delete to include in each message to the server; I
assume it is not every character.
Sending that message from after-change-functions would be a natural
choice, but it might be better to cache the information in order to send
fewer messages.

When font-lock is triggered, Emacs sends a request for formatting a
range to the server (LSP command ‘textDocument/rangeFormatting’); the
server sends back new text for that range, with proper indentation and
capitalization. I assume it also supports faces via some markup in the
JSON, but I have not seen that in the docs.

Similarly, when the user requests indentation (via TAB or some other
command), a format request is sent.

When the user starts typing a function call (or otherwise requests
completion), a textDocument/completion request is sent to the server; it
responds with the possible completions of the function name, and then
the parameter list.

> We did learn one important thing from using LSP servers: that
> processing the JSON stuff back and forth adds non-trivial overhead and
> slows down the application enough to annoy users, even after we did
> all we can to speed up the translation.

Ok. I did not follow that in detail. Do we have any speed comparisons
with other editors?

> So I think it makes sense to take one more look at the issue and see
> if we can come up with better interfaces, which will suit Emacs
> applications better and allow faster processing.

There is always a tradeoff between speed and flexibility. The ada-mode
interface to the external process is highly optimized to do exactly what
ada-mode currently needs, and is very fast. But it is also brittle;
adding new features may require large changes, and causes version
incompatibility.

LSP is much more flexible, allowing expansion to new features easily,
and allowing feature negotiation.

Other editors seem to cope well with the JSON approach, so it should be
possible for Emacs as well.
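
To make the session summary above concrete, here is a minimal Python
sketch of what those messages might look like on the wire. The method
names and the Content-Length framing are taken from the LSP
specification; everything else is an assumption made for illustration:
the helper names (frame, unframe, notification, request, pos), the file
URI, the Ada text, and the formatting options. It is not how eglot or
lsp-mode build their messages.

```python
import json

def frame(payload: dict) -> bytes:
    """Serialize one JSON-RPC message and prepend the LSP Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

def unframe(data: bytes) -> dict:
    """Inverse of frame(): read the byte count, then parse that many bytes of JSON."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = int(header.split(b":", 1)[1].strip())
    return json.loads(rest[:length].decode("utf-8"))

def notification(method: str, params: dict) -> dict:
    # Notifications carry no "id"; the server sends no reply.
    return {"jsonrpc": "2.0", "method": method, "params": params}

def request(msg_id: int, method: str, params: dict) -> dict:
    # Requests carry an "id" that the server echoes in its response.
    return {"jsonrpc": "2.0", "id": msg_id, "method": method, "params": params}

def pos(line: int, character: int) -> dict:
    # LSP positions are zero-based line/character pairs.
    return {"line": line, "character": character}

URI = "file:///tmp/hello.adb"  # hypothetical file
TEXT = "procedure Hello is\nbegin\n   null;\nend Hello;\n"

# 1. On visiting the file, the whole buffer is sent once.
did_open = notification("textDocument/didOpen", {
    "textDocument": {"uri": URI, "languageId": "ada", "version": 1, "text": TEXT},
})

# 2. Each later edit is sent as a replaced range plus the inserted text.
#    Here "null" (line 2, columns 3 to 7) is replaced by a call.
did_change = notification("textDocument/didChange", {
    "textDocument": {"uri": URI, "version": 2},
    "contentChanges": [
        {"range": {"start": pos(2, 3), "end": pos(2, 7)},
         "text": 'Put_Line ("hi")'},
    ],
})

# 3. A formatting request for a range of the buffer.
range_formatting = request(1, "textDocument/rangeFormatting", {
    "textDocument": {"uri": URI},
    "range": {"start": pos(0, 0), "end": pos(4, 0)},
    "options": {"tabSize": 3, "insertSpaces": True},  # illustrative values
})

# 4. A completion request at the cursor position.
completion = request(2, "textDocument/completion", {
    "textDocument": {"uri": URI},
    "position": pos(2, 6),
})

for message in (did_open, did_change, range_formatting, completion):
    print(message["method"], len(frame(message)), "bytes")
```

Every edit becomes a complete JSON object that one side must serialize
and the other must parse, which is where the per-message cost discussed
in this thread comes from. Caching several edits and sending them as one
didChange message with multiple entries in contentChanges would reduce
the number of messages, at the price of the server seeing changes later.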

> Using a library that processes stuff locally would then allow us to
> implement such interfaces more easily, since we will be free from the
> restrictions imposed by the need to communicate with external
> processes.

I gather you are suggesting that the language server could be an Emacs
module (or even an elisp package), with function calls for the various
features. That is certainly possible, but loses the ability to use any
server developed external to the Emacs project. It might be possible to
refactor some servers to work that way (replacing the JSON interface
with a direct function call interface), but it would be a lot of work.

> we'll most probably want some combination of LSP-based and local
> parsers-based features.  E.g., it's quite possible that LSP servers
> could be better for some complex jobs, where speed matters less.
>
> My point is that we shouldn't lock up our minds, not yet anyway.  A
> fresh look at these issues, taking the incremental parsing into
> account, could benefit us in the long run.

Ok. I will work on adding LSP support for ada-mode (reusing eglot and/or
lsp-mode), and see what might be done about the speed issues. I need to
do that anyway to support a customer request.

I can also look at moving the current Ada parser into an Emacs module,
to see if that helps with speed.

>> LSP language servers are implemented in some compiled language, not
>> elisp; eglot/lsp-mode is just the elisp side of the protocol. The elisp
>> sends edits and info requests (ie, "insert/delete this text at this
>> point", "fontify/format this range") to the server, and handles the
>> responses.
>
> I'm saying we should look into this and see whether there are better
> ways than that.  Suppose such a server had direct access to buffer
> text: would that allow a more efficient interface than the above?

No; lexing the actual text is not where the time is spent.

> We should definitely support LSP.  We already do, albeit in
> third-party packages.
> We added native JSON support and jsonrpc for
> doing this better.  If there's anything else we can do in that
> direction, people should speak up.

Ok.

> But my point is that LSP is not necessarily the only game in town we
> should support.  For example, font-lock doesn't use LSP, and probably
> never will, due to performance issues;

Ada-mode uses the external process to compute faces for identifiers.
That works well, although I do

    (setq jit-lock-defer-time 1.0)

so it only fontifies when I pause typing; otherwise there can be an
annoying delay after each character.

However, doing correct font-lock for Ada without a parser is pretty much
impossible (on anything more than language keywords), and there is very
little that can be done to speed up the parsing. Migrating the parser
into a module might help, but only a little.

Adding a JSON interface would slow it down, of course.

> should we improve font-lock using infrastructure that's based on
> language parsing?

ada-mode builds on the current font-lock infrastructure; the font-lock
timer triggers a parse on a range, and the parse actions set
font-lock-face text properties.

> And there are other features that could benefit, I've mentioned them.
> If you are saying they all should just use LSP, then I don't think I
> agree.

I'm saying they all could use LSP in principle, but I have not had any
experience actually doing that, so it may not work very well in
practice.

I don't think you are objecting to LSP in principle, but do have a
problem with the speed penalty due to using JSON. Since other editors
are succeeding with that, perhaps there is more Emacs could do here.

-- 
-- Stephe