From: Dmitry Gutov
Newsgroups: gmane.emacs.devel
Subject: Re: Generation of tags for the current project on the fly
Date: Fri, 12 Jan 2018 16:52:21 +0300
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <83shbb30z1.fsf@gnu.org>

On 1/12/18 12:01 PM, Eli Zaretskii wrote:

> Why discard it after the first save? The tags table is probably still
> very much valid.

Indeed, it's a rough heuristic. I'm aiming for correctness here, not for performance.

On the other hand, code navigation and editing are often fairly distinct activities; you don't switch between the two too frequently. So waiting a second or two when going from the latter to the former shouldn't be too terrible.

> I'd not discard it until either of the following
> happens:
>
> . we fail to find a tag

Not sure about this one. We can make this customizable, of course (although the implementation might end up a bit convoluted), but IMO it's not good for the default behavior.

Failing to find a tag is a valid result (some identifiers can be absent, or defined somewhere else, e.g. in the libraries), and doing a rescan each time that happens might be more annoying.

Further, some users will call C-u xref-find-definitions, look for the new tag in the completion table, fail to find it there, and simply abort without trying the search.

> . the user visits a tags table explicitly

That, of course, works already.

> . the user switches to a different project(?)

It's an omission currently, but yes, I fully intend to add this.

> We could offer generating a tags table if we don't find one in the
> tree, instead of generating it automatically.

And then what? Visit it? And make the user rescan manually every time?
I'm fine with this as an optional behavior (and it will also be an improvement, of course, since generating tags is not exactly trivial for new users, and even many not-so-new ones), but I don't want this for the default.

> I think this would be a
> better UI and UX, especially given the time it could take to generate
> TAGS (see below).

Sublime Text, Atom and VS Code simply index the project code, AFAIK, without extra prompts. I think we should try to offer a similar experience, even if it's not great for big projects.

There are several directions we can improve on it, but showing the user that "yes, we can find-definition right away" is a good thing.

>> For reference, indexing the Emacs sources takes ~1.1sec here.
>
> Was that with cold cache or warm cache?

Warm, probably. But that's the relevant time, isn't it? We're mostly wondering how long *reindexing* will take (because we're discussing when to do it). The first indexing will take place anyway.

> "make TAGS" takes about 9 sec here with a warm cache, and this is an
> SSD disk.

'make tags' takes 1 second on my machine, with an NVMe disk.

> On fencepost.gnu.org, a (somewhat slow) GNU/Linux system,
> it took 12 sec with a cold cache and 4 sec with a warm cache. And
> Emacs is not a large project; I wonder what would happen in larger
> ones, like GCC or glibc.

We can try to somehow detect very large projects, and helpfully offer to visit a tags table instead. Anyway, M-x visit-tags-table still works.

> IOW, I don't think this is so fast that we could do that without user
> approval.

The argument here is that if the user called xref-find-definitions, it's better to do a (long-ish) scan and show something, instead of failing. They always have the option of C-g (we could also catch it and show helpful instructions if the process took too long).

> I don't understand why you didn't use the commonly used form:
>
>   find . -name "*.rb" -o -name "*.js" ... | etags -o- -

Because the project API doesn't make this easy.
Anyway, generating the full list of files is relatively fast in comparison. At most, it took about 30% of the whole time (and less in other cases). And we can speed it up further independently (e.g. using git ls-files).

> Doing things the way you did raises issues with encoding of file
> names, which could cause subtle problem in rare use cases.

Well, I haven't seen them yet, and don't really understand how they're going to happen. But we'll probably fix them, one way or another.

> I think
> using 'find' is also faster.

find is used under the covers. The difference is just that the invocations of etags are only happening later.

> More generally, I think doing this that way is not TRT, at least not
> by default. "make TAGS" in Emacs will produce a much richer tags
> table than your method, because our Makefiles use regexps to augment
> the automatic tagging in etags. So I think we should first try to
> invoke the TAGS target of a Makefile in the tree, if one exists, and
> only use the naïve command as fallback.

'make tags' is very much specific to Emacs. We can introduce some kind of protocol, of course, but my primary goal here is to improve the out-of-the-box behavior.

Further, the task will have to write tags to stdout: the current code saves the temporary tags file to /tmp, and there are reasons to do that. Anyway, that part shouldn't be too hard.

A possible avenue for improvement is to somehow derive a multi-TAGS-files structure (with their dependencies) from the project information. Still thinking about it.
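
For illustration only, the quoted "find ... | etags -o- -" form, with tags written to stdout as mentioned above, could be sketched in shell roughly like this. It is not the code under discussion: the function names list_project_files and generate_tags and the two extensions are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the extensions are illustrative only.

# Print the project's matching files, one per line, relative to the root.
# The escaped parentheses group the -o alternatives so that -type f
# applies to every pattern, not just the first one.
list_project_files() {
    (cd "$1" && find . -type f \( -name '*.rb' -o -name '*.js' \) | sed 's|^\./||')
}

# Feed the list to etags: the trailing "-" makes it read file names from
# stdin, and "-o-" sends the tags table to stdout, as in the quoted command.
generate_tags() {
    (cd "$1" && list_project_files . | etags -o- -)
}
```

Running generate_tags on a project root and redirecting its output to a temporary file would give something to pass to M-x visit-tags-table. File names containing newlines would not survive the line-based list, so this is a sketch rather than a robust solution.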