From: Spencer Baugh
Newsgroups: gmane.emacs.bugs
Subject: bug#69188: 30.0.50; project-files + project-find-file is slow in large repositories
Date: Thu, 15 Feb 2024 17:55:46 -0500
To: 69188@debbugs.gnu.org
Cc: Dmitry Gutov

(project-files (project-current)) takes around 1 second in Linux (80k
files) and 7 seconds in my larger (500k file) repository.

With this patch:

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index c7c07c3d34c..037beaa835a 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -667,12 +667,15 @@
                             (setq i (concat i "**"))))
                         i)))
                   extra-ignores)))))
-      (setq files
-            (mapcar
-             (lambda (file) (concat default-directory file))
-             (split-string
-              (apply #'vc-git--run-command-string nil "ls-files" args)
-              "\0" t)))
+      (with-temp-buffer
+        (let ((ok (apply #'vc-git--out-ok "ls-files" args))
+              (pt (point-min)))
+          (unless ok
+            (error "File listing failed: %s" (buffer-string)))
+          (goto-char pt)
+          (while (search-forward "\0" nil t)
+            (push (concat default-directory (buffer-substring-no-properties pt (1- (point)))) files)
+            (setq pt (point)))))
       (when (project--vc-merge-submodules-p default-directory)
         ;; Unfortunately, 'ls-files --recurse-submodules' conflicts with '-o'.
         (let* ((submodules (project--git-submodules))

project-files in Linux takes around 0.75 seconds. If I further remove
the (concat default-directory ...) around each file, it speeds up to
0.5 seconds.

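To make the effect of the per-file prefix easier to reproduce, here is a
minimal sketch that times the same parsing loop with and without a
directory prefix on synthetic data. The names my-parse-listing and
my-compare-parse-cost, the synthetic file names, and the prefix
"/home/user/project/" are made up for illustration and are not part of
project.el; real timings will depend on the build and on GC settings.

```elisp
(require 'benchmark)

;; Hypothetical helper: parse NUL-terminated names from the current buffer.
(defun my-parse-listing (prefix)
  "Return the NUL-terminated names in the current buffer as a list.
If PREFIX is non-nil, prepend it to every name."
  (let ((pt (point-min)) files)
    (goto-char pt)
    (while (search-forward "\0" nil t)
      (let ((name (buffer-substring-no-properties pt (1- (point)))))
        ;; The extra `concat' per file is the cost being measured.
        (push (if prefix (concat prefix name) name) files))
      (setq pt (point)))
    (nreverse files)))

;; Hypothetical helper: compare both variants on COUNT synthetic names.
(defun my-compare-parse-cost (count)
  "Return (ABSOLUTE-SECONDS . RELATIVE-SECONDS) for COUNT synthetic names."
  (with-temp-buffer
    (dotimes (i count)
      (insert (format "dir%d/file%d.c" (% i 100) i) "\0"))
    (cons (car (benchmark-run 1 (my-parse-listing "/home/user/project/")))
          (car (benchmark-run 1 (my-parse-listing nil))))))
```

Evaluating (my-compare-parse-cost 80000) returns the two elapsed times
as a cons cell, which gives a rough idea of how much of the total is
spent building absolute names.
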
(Note that git ls-files itself takes only around 20 milliseconds.)

My large repository (which uses Mercurial) has a custom project-files
which is basically:

(with-temp-buffer
  (unless (zerop (call-process "rhg" nil t nil "files"))
    (error "File listing failed: %s" (buffer-string)))
  (goto-char (point-min))
  (let ((pt (point)) res)
    (while (search-forward "\n" nil t)
      (push (file-name-concat default-directory
                              (buffer-substring-no-properties pt (1- (point))))
            res)
      (setq pt (point)))
    res))

Likewise, removing the (file-name-concat default-directory ...) speeds
my project-files up from 7 seconds to 4.5 seconds.

This is especially silly because project-find-file then just removes
this default-directory again from all the files, which has yet more
overhead.

My proposal: Could we find a way to make the default-directory not
necessary for the files returned from project-files? Perhaps
project-files could be allowed to return relative file paths which are
relative to the project root. Then in the common case where all the
files are within the project root, project-find-file would be way
faster. Happy to implement this, if it makes sense.

Another optimization I've considered: We could run the process
asynchronously so that the parsing in project-files runs in parallel
with the process. But the process is usually very fast anyway and is
not most of the overhead, so that won't be a big win by itself.

However, that would make it easy for project-files as a whole to be
asynchronous. That would allow project-find-file to start the listing
in the background, and then we'd write a completion table which
completes only over whatever files we've already read into Emacs. I
think this would be a lot nicer for most use-cases, and I'd again be
happy to implement this.

Also happy to implement any other optimizations you think might make
sense.

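To illustrate the background-listing idea, here is a rough sketch, under
my own assumptions, of a process filter that collects relative file
names as they arrive and a completion table that only sees the names
read so far. All my-listing-* names and the process properties
my-partial and my-files are hypothetical; nothing like this exists in
project.el, and a real implementation would also need a sentinel, error
handling and a way to refresh the completion UI.

```elisp
;; Hypothetical helper: split accumulated output into complete names.
(defun my-listing-split (partial chunk)
  "Split PARTIAL followed by CHUNK into newline-terminated names.
Return (NAMES . REST), where REST is the trailing incomplete name."
  (let* ((parts (split-string (concat partial chunk) "\n"))
         (rest (car (last parts))))
    (cons (butlast parts) rest)))

;; Process filter: store complete names on the process itself.
(defun my-listing-filter (proc chunk)
  "Add the complete names in CHUNK to PROC's `my-files' property."
  (let ((split (my-listing-split (process-get proc 'my-partial) chunk)))
    ;; Newest names go first, so each chunk costs time proportional
    ;; to its own size rather than to the whole list.
    (process-put proc 'my-files
                 (nconc (nreverse (car split))
                        (process-get proc 'my-files)))
    (process-put proc 'my-partial (cdr split))))

(defun my-listing-start (program &rest args)
  "Run PROGRAM with ARGS in `default-directory'; return the process."
  (make-process :name "my-file-listing"
                :command (cons program args)
                :connection-type 'pipe
                :noquery t
                :filter #'my-listing-filter))

(defun my-listing-candidates (proc _string)
  "Return the names PROC has produced so far."
  (process-get proc 'my-files))

(defun my-listing-table (proc)
  "Return a completion table over the names PROC has produced so far."
  (completion-table-dynamic
   (apply-partially #'my-listing-candidates proc)))
```

For example, (my-listing-table (my-listing-start "rhg" "files")) could be
passed to completing-read while the listing is still running. The names
stay relative to default-directory, so only the file that is finally
chosen needs to be expanded.
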
In GNU Emacs 30.0.50 (build 37, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.15.12, Xaw scroll bars) of 2024-02-13 built on
 igm-qws-u22796a
Repository revision: a24a2b1ceb12f11c9d345190fbf554f27c4ec186
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Rocky Linux 8.9 (Green Obsidian)

Configured using:
 'configure -C --with-x-toolkit=lucid 'CFLAGS=-O0 -g3'
 --without-native-compilation --without-gif'

Configured features:
CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG JSON
LIBSELINUX LIBSYSTEMD LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG
SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS X11 XDBE XIM
XINPUT2 XPM LUCID ZLIB

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  minibuffer-regexp-mode: t
  line-number-mode: t
  indent-tabs-mode: t
  transient-mark-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t

Load-path shadows:
None found.

Features: (shadow sort mail-extr emacsbug message mailcap yank-media puny dired dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068 epg-config gnus-util text-property-search time-date subr-x mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils rmc iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd touch-screen tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic indonesian philippine cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dbusbind inotify dynamic-setting system-font-setting font-render-setting cairo x-toolkit xinput2 x multi-tty move-toolbar make-network-process emacs) Memory information: ((conses 16 65052 9318) (symbols 48 9539 0) (strings 32 22452 1449) (string-bytes 1 659675) (vectors 16 9245) (vector-slots 8 111110 9295) (floats 8 40 17) (intervals 56 262 0) (buffers 976 10))