From: Tassilo Horn
Newsgroups: gmane.emacs.devel
Subject: Re: A project-files implementation for Git projects
Date: Wed, 18 Sep 2019 19:15:24 +0200
Message-ID: <87ef0dy18z.fsf@gnu.org>
To: Dmitry Gutov
Cc: emacs-devel@gnu.org
In-Reply-To: (Dmitry Gutov's message of "Tue, 17 Sep 2019 14:06:03 +0300")

Dmitry Gutov writes:

Hi Dmitry,

>> Ah, "hg status --all" lists all files including their state
>> (untracked, ignored, you-name-it), so that's the one we should use.
>> Performance seems to be the same as for "hg files".
>
> In my testing the performance difference is about 2x:
>
> $ bash -c "time hg status -c >/dev/null"
>
> real    0m12,015s
> user    0m1,899s
> sys     0m10,113s
>
> $ bash -c "time hg files >/dev/null"
>
> real    0m5,970s
> user    0m1,004s
> sys     0m4,965s
>
> (project-files (project-current)) takes ~7 seconds here on the same repo
> (Mozilla Firefox checkout).
>
> But if it's faster than 'find' anyway on some platforms, why not? As
> long as there's a solution that will handle the adjusted ignore rules
> in a similarly performant fashion.

Right.

>> I think we can come up with a VC list-files operation which
>> optionally includes untracked and ignored files (where the latter
>> implies the former, doesn't it?)
>
> Whether it implies or not, depends on which set of ignores we're
> talking about (Git's own or the modified one).
>
>> but I'd leave the filtering according to project-vc-ignores to
>> project.el.
>
> Have you tried benchmarking this approach? E.g. calling 'git ls-files
> -c -o -z' and then doing all the filtering indicated by .gitignore
> rules?
>
> Try it on the current Emacs repo.
>
> IME it's the ignore rules that take up 99% of the CPU time when using
> 'find'. Without them, 'find .' is instant (though that depends on the
> disk access speed). If we're going to implement that in Elisp, I'd
> wager it's going to be even slower.

Well, ok.  I've now played with an interface

  (vc-call-backend (vc-responsible-backend dir)
                   'list-files
                   dir
                   include-unregistered
                   extra-ignores)

where extra-ignores works in addition to the standard VC ignore rules
(.gitignore, .hgignore).  Or do you want to override the VC-internal
rules?

At least for Git and Hg, I came up with reasonable implementations:

--8<---------------cut here---------------start------------->8---
(defun vc-git-list-files (&optional dir include-unregistered extra-ignores)
  (let ((default-directory (or dir default-directory))
        ;; Build a fresh list; destructively extending a quoted
        ;; constant would modify it across calls.
        (args (list "-z")))
    (when include-unregistered
      (setq args (append args '("-c" "-o" "--exclude-standard"))))
    (when extra-ignores
      (setq args (append args
                         (mapcan (lambda (i) (list "--exclude" i))
                                 extra-ignores))))
    (mapcar #'expand-file-name
            ;; OMIT-NULLS drops the empty string after the last NUL.
            (split-string
             (apply #'vc-git--run-command-string nil "ls-files" args)
             "\0" t))))

(defun vc-hg-list-files (&optional dir include-unregistered extra-ignores)
  (let ((default-directory (or dir default-directory))
        ;; Plain "hg status" omits clean files, so ask for the tracked
        ;; states explicitly when untracked files are not wanted.
        (args (if include-unregistered
                  (list "--all")
                (list "--modified" "--added" "--clean")))
        files)
    (when extra-ignores
      (setq args (append args
                         (mapcan (lambda (i) (list "--exclude" i))
                                 extra-ignores))))
    (with-temp-buffer
      (apply #'vc-hg-command t 0 "." "status" args)
      (goto-char (point-min))
      ;; Keep modified (M), added (A), clean (C) and untracked (?)
      ;; entries; skip ignored, removed and missing ones.
      (while (re-search-forward "^[MAC?] +\\(.*\\)$" nil t)
        (push (expand-file-name (match-string 1)) files)))
    (nreverse files)))
--8<---------------cut here---------------end--------------->8---

There's a semantic difference between Git and Hg in the treatment of
extra-ignores.  With Git, the extra-ignores do not rule out committed
files (i.e., they are only effective for untracked files), while for
Hg they also rule out committed files.  I think the Hg semantics are
probably better, but I don't see how to change the Git version so that
it acts the same way (except by re-filtering in Lisp, of course).  Do
you?

I haven't looked at the other backends.  I guess bzr will probably be
doable, too.  However, for SVN, there's no way to list unregistered
files.  A correct (but horribly slow) default implementation should
also be doable.

Bye,
Tassilo