unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
@ 2024-09-17 16:55 Sean Allred
  2024-09-17 22:54 ` Dmitry Gutov
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Allred @ 2024-09-17 16:55 UTC (permalink / raw)
  To: 73320

[-- Attachment #1: Type: text/plain, Size: 1684 bytes --]

Tags: patch

Hello!

I noticed that `C-x p f` (M-x project-find-file) took an incredibly long
time to run on our monorepo -- even when using a sparse index -- and I
tracked the problem down to this function. Adding `--sparse` to the
`git-ls-files` invocation resolves the issue handily, albeit with the
quirk of still showing top-level directories that are excised from the
sparse index (which we may want to remove from the return value of this
function since it does say it returns /files/).

I'm expecting at least one more version of this patch before it's even
considered for merge. Given that this change removes many, many results
from the return value of `project--vc-list-files', I suspect this would
be a breaking change for some use case that I'm not considering. I'm
hoping the list can provide some feedback on the best way to make this
configurable. I've not really hacked in project.el before and am not
super familiar with its internals.

If you prefer, you may pull this patch from my fork on github at

    url: git@github.com:vermiculus/emacs.git
    branch: sa/sparse-index

Keep in mind I may be updating this branch with feedback from this list
as it comes in.

-Sean


In GNU Emacs 29.1 (build 1, aarch64-apple-darwin21.6.0, NS
 appkit-2113.60 Version 12.6.6 (Build 21G646)) of 2023-07-30 built on
 armbob.lan
Windowing system distributor 'Apple', version 10.3.2487
System Description:  macOS 14.6.1

Configured using:
 'configure --with-ns '--enable-locallisppath=/Library/Application
 Support/Emacs/${version}/site-lisp:/Library/Application
 Support/Emacs/site-lisp' --with-modules 'CFLAGS=-DFD_SETSIZE=10000
 -DDARWIN_UNLIMITED_SELECT' --with-x-toolkit=no'


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-project-vc-list-files-use-Git-s-sparse-index.patch --]
[-- Type: text/patch, Size: 1737 bytes --]

From e72c013afbafad3faf19f98acb8d0387fdf045ba Mon Sep 17 00:00:00 2001
From: Sean Allred <allred.sean@gmail.com>
Date: Tue, 17 Sep 2024 10:07:41 -0500
Subject: [PATCH] project--vc-list-files: use Git's sparse-index

When dealing with exceptionally large Git repositories, the performance
of `project-find-file` can suffer dramatically as the list of files is
collected for completion. This adds insult to injury when you consider
cases where the developer has configured the repository to use a sparse
checkout where the vast majority of these files are not even present on
disk and are not valid candidates for completion.

Unconditionally pass along the `--sparse` option to `git-ls-files` to
coerce `project--vc-list-files` only consider files present in the
sparse index (and presumably on disk).

No consideration is made to make this configurable as it is currently
unclear as to the best way to do this -- whether to follow the example
of the `include-untracked` symbol in this function or to adopt some
other mechanism.
---
 lisp/progmodes/project.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index b29d5ed5404..82cb79322dd 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -663,7 +663,7 @@ project--vc-list-files
   (pcase backend
     (`Git
      (let* ((default-directory (expand-file-name (file-name-as-directory dir)))
-            (args '("-z"))
+            (args '("-z", "--sparse"))
             (vc-git-use-literal-pathspecs nil)
             (include-untracked (project--value-in-dir
                                 'project-vc-include-untracked

base-commit: 7d365a2d72d8e656262205827cc5fdf423c3a41f
-- 
2.46.0


[-- Attachment #3: Type: text/plain, Size: 17 bytes --]


-- 
Sean Allred

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-17 16:55 bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index Sean Allred
@ 2024-09-17 22:54 ` Dmitry Gutov
  2024-09-18  0:36   ` Sean Allred
  0 siblings, 1 reply; 8+ messages in thread
From: Dmitry Gutov @ 2024-09-17 22:54 UTC (permalink / raw)
  To: Sean Allred, 73320

Hi!

On 17/09/2024 19:55, Sean Allred wrote:

> I noticed that `C-x p f` (M-x project-find-file) took an incredibly long
> time to run on our monorepo -- even when using a sparse index -- and I
> tracked the problem down to this function.

That sounds like a good problem to try to solve.

> Adding `--sparse` to the
> `git-ls-files` invocation resolves the issue handily, albeit with the
> quirk of still showing top-level directories that are excised from the
> sparse index (which we may want to remove from the return value of this
> function since it does say it returns /files/).

The submitted patch has a comma inside which results in an error

   vc-delistify: Wrong type argument: characterp, \,

Removing it fixes the error.

But let's start from the beginning. Could you help me set up a sparse 
repo that would help test out the change?

I took a large-ish checkout with shallow history and set up the sparse 
config like this:

   git sparse-checkout init --cone
   git sparse-checkout set gfx media

With that, both 'git status' and 'ls' behave as expected - no extra 
directories around (just the top-level files and two subdirs).

But 'git ls-files' and 'git ls-files --sparse' continue to show the same 
large output. Any idea what could be wrong? My version of Git, perhaps? 
Which is 2.40.1.

> I'm expecting at least one more version of this patch before it's even
> considered for merge. Given that this change removes many, many results
> from the return value of `project--vc-list-files', I suspect this would
> be a breaking change for some use case that I'm not considering.

Yeah, I expect project-find-regexp, project-search, 
project-query-replace-regexp might start misbehaving without additional 
filtering -- either throwing up errors or, best case, continuing to 
search through the "hidden" directories.






^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-17 22:54 ` Dmitry Gutov
@ 2024-09-18  0:36   ` Sean Allred
  2024-09-18 22:27     ` Dmitry Gutov
  0 siblings, 1 reply; 8+ messages in thread
From: Sean Allred @ 2024-09-18  0:36 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 73320

Dmitry Gutov <dmitry@gutov.dev> writes:
> The submitted patch has a comma inside which results in an error

Well that's embarrassing. I've resolved this in my branch. I'm not sure
why I wasn't seeing an issue locally; perhaps I made this error somehow
after I eval'd the form for testing. Thanks for catching!

> But let's start from the beginning. Could you help me set up a sparse
> repo that would help test out the change?
>
> I took a large-ish checkout with shallow history and set up the sparse
> config like this:
>
>   git sparse-checkout init --cone
>   git sparse-checkout set gfx media
>
> With that, both 'git status' and 'ls' behave as expected - no extra
> directories around (just the top-level files and two subdirs).
>
> But 'git ls-files' and 'git ls-files --sparse' continue to show the
> same large output. Any idea what could be wrong? My version of Git,
> perhaps? Which is 2.40.1.

Sure! You're very close. The key is here in git-ls-files(1):

    --sparse
        If the index is sparse, show the sparse directories without
        expanding to the contained files. Sparse directories will be shown
        with a trailing slash, such as "x/" for a sparse directory "x".

combined with this in git-sparse-checkout(1):

    Use the --[no-]sparse-index option to use a sparse index (the
    default is to not use it).

I fell into this too when setting up my worktree for testing; normally
our internal tooling takes care of this for me. To resolve this issue in
an existing checkout, you can simply

    git sparse-checkout init --sparse-index

and that will rewrite your index file using sparse-index rules.

> Yeah, I expect project-find-regexp, project-search,
> project-query-replace-regexp might start misbehaving without
> additional filtering -- either throwing up errors or, best case,
> continuing to search through the "hidden" directories.

Not sure how best to track that we should come back to this, but yeah.
It seems like the right place to add some sort of switch would be in the
`project-files` defmethod. From here, it looks like all the functions
you mention could choose the behavior right for them. (Based on the
function names alone -- it seems they would /also/ be interested in
operating on only those files which exist on disk.)

Incidentally looking at the version check within `project-files`, it's
worthwhile to point out that `--sparse` is likely /not/ compatible with
ancient versions of Git. Does vc have any sort of policy on requiring
recent versions of these tools? If the answer is 'not really', I'll
additionally want to add some sort of protection against using
`--sparse` when the Git version won't understand it. This should be easy
enough to do within the implementation of `project--vc-list-files`.

-- 
Sean Allred





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-18  0:36   ` Sean Allred
@ 2024-09-18 22:27     ` Dmitry Gutov
  2024-09-19  4:25       ` Sean Allred
  2024-09-19  5:41       ` Eli Zaretskii
  0 siblings, 2 replies; 8+ messages in thread
From: Dmitry Gutov @ 2024-09-18 22:27 UTC (permalink / raw)
  To: Sean Allred; +Cc: 73320

On 18/09/2024 03:36, Sean Allred wrote:
> Dmitry Gutov <dmitry@gutov.dev> writes:
>> The submitted patch has a comma inside which results in an error
> 
> Well that's embarrassing. I've resolved this in my branch. I'm not sure
> why I wasn't seeing an issue locally; perhaps I made this error somehow
> after I eval'd the form for testing. Thanks for catching!

No problem!

>> But let's start from the beginning. Could you help me set up a sparse
>> repo that would help test out the change?
>>
>> I took a large-ish checkout with shallow history and set up the sparse
>> config like this:
>>
>>    git sparse-checkout init --cone
>>    git sparse-checkout set gfx media
>>
>> With that, both 'git status' and 'ls' behave as expected - no extra
>> directories around (just the top-level files and two subdirs).
>>
>> But 'git ls-files' and 'git ls-files --sparse' continue to show the
>> same large output. Any idea what could be wrong? My version of Git,
>> perhaps? Which is 2.40.1.
> 
> Sure! You're very close. The key is here in git-ls-files(1):
> 
>      --sparse
>          If the index is sparse, show the sparse directories without
>          expanding to the contained files. Sparse directories will be shown
>          with a trailing slash, such as "x/" for a sparse directory "x".
> 
> combined with this in git-sparse-checkout(1):
> 
>      Use the --[no-]sparse-index option to use a sparse index (the
>      default is to not use it).
> 
> I fell into this too when setting up my worktree for testing; normally
> our internal tooling takes care of this for me. To resolve this issue in
> an existing checkout, you can simply
> 
>      git sparse-checkout init --sparse-index
> 
> and that will rewrite your index file using sparse-index rules.

Thanks! So the key here is using "sparse index", not just "sparse 
checkout", since these are two different features. Which have to be 
enabled separately for sort of "gradual rollout", I guess.

>> Yeah, I expect project-find-regexp, project-search,
>> project-query-replace-regexp might start misbehaving without
>> additional filtering -- either throwing up errors or, best case,
>> continuing to search through the "hidden" directories.
> 
> Not sure how best to track that we should come back to this, but yeah.
> It seems like the right place to add some sort of switch would be in the
> `project-files` defmethod. From here, it looks like all the functions
> you mention could choose the behavior right for them. (Based on the
> function names alone -- it seems they would /also/ be interested in
> operating on only those files which exist on disk.)

I think we can just remove the names ending with '/'. The built-in 
commands don't seem to error out on them right now - probably because 
there is some protection against nonexistent files - but those files are 
(were) still shown as completions for project-find-file. Try out this 
addition please. The performance here seems about the same even with a 
large list (something I was worried about):

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index b29d5ed5404..a2e3f3f52e6 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -663,7 +663,7 @@ project--vc-list-files
    (pcase backend
      (`Git
       (let* ((default-directory (expand-file-name 
(file-name-as-directory dir)))
-            (args '("-z"))
+            (args '("-z" "--sparse"))
              (vc-git-use-literal-pathspecs nil)
              (include-untracked (project--value-in-dir
                                  'project-vc-include-untracked
@@ -703,7 +703,8 @@ project--vc-list-files
               (delq nil
                     (mapcar
                      (lambda (file)
-                      (unless (member file submodules)
+                      (unless (or (member file submodules)
+                                  (eq ?/ (aref file (1- (length file)))))
                          (if project-files-relative-names
                              file
                            (concat default-directory file))))


> Incidentally looking at the version check within `project-files`, it's
> worthwhile to point out that `--sparse` is likely /not/ compatible with
> ancient versions of Git. Does vc have any sort of policy on requiring
> recent versions of these tools? If the answer is 'not really', I'll
> additionally want to add some sort of protection against using
> `--sparse` when the Git version won't understand it. This should be easy
> enough to do within the implementation of `project--vc-list-files`.

IIRC it was something like "should work on the CentOS stable", and maybe 
CentOS N-1 as well. But the release-based distro was discontinued since 
the last time this question came up ;-(

We can call vc-git--program-version the same way it's used in 
vc-git-state. Which version should we make the minimum?





^ permalink raw reply related	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-18 22:27     ` Dmitry Gutov
@ 2024-09-19  4:25       ` Sean Allred
  2024-09-19  9:44         ` Dmitry Gutov
  2024-09-19  5:41       ` Eli Zaretskii
  1 sibling, 1 reply; 8+ messages in thread
From: Sean Allred @ 2024-09-19  4:25 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: 73320

Dmitry Gutov <dmitry@gutov.dev> writes:
>>> Yeah, I expect project-find-regexp, project-search,
>>> project-query-replace-regexp might start misbehaving without
>>> additional filtering -- either throwing up errors or, best case,
>>> continuing to search through the "hidden" directories.
>> Not sure how best to track that we should come back to this, but
>> yeah.
>> It seems like the right place to add some sort of switch would be in the
>> `project-files` defmethod. From here, it looks like all the functions
>> you mention could choose the behavior right for them. (Based on the
>> function names alone -- it seems they would /also/ be interested in
>> operating on only those files which exist on disk.)
>
> I think we can just remove the names ending with '/'. The built-in
> commands don't seem to error out on them right now - probably because
> there is some protection against nonexistent files - but those files
> are (were) still shown as completions for project-find-file. Try out
> this addition please. The performance here seems about the same even
> with a large list (something I was worried about):
>
> diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
> index b29d5ed5404..a2e3f3f52e6 100644
> --- a/lisp/progmodes/project.el
> +++ b/lisp/progmodes/project.el
> @@ -663,7 +663,7 @@ project--vc-list-files
>    (pcase backend
>      (`Git
>       (let* ((default-directory (expand-file-name
>       (file-name-as-directory dir)))
> -            (args '("-z"))
> +            (args '("-z" "--sparse"))
>              (vc-git-use-literal-pathspecs nil)
>              (include-untracked (project--value-in-dir
>                                  'project-vc-include-untracked
> @@ -703,7 +703,8 @@ project--vc-list-files
>               (delq nil
>                     (mapcar
>                      (lambda (file)
> -                      (unless (member file submodules)
> +                      (unless (or (member file submodules)
> +                                  (eq ?/ (aref file (1- (length file)))))
>                          (if project-files-relative-names
>                              file
>                            (concat default-directory file))))

Works fine for me :-) Though I've added an additional version check
inlined below.

>> Incidentally looking at the version check within `project-files`, it's
>> worthwhile to point out that `--sparse` is likely /not/ compatible with
>> ancient versions of Git. [...]
>
> [...]
>
> We can call vc-git--program-version the same way it's used in
> vc-git-state. Which version should we make the minimum?

The `--sparse` option was introduced in 2.35. The following seems to
work well for me:

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index b29d5ed5404..873bc92729d 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -663,7 +663,8 @@ project--vc-list-files
   (pcase backend
     (`Git
      (let* ((default-directory (expand-file-name (file-name-as-directory dir)))
-            (args '("-z"))
+            (args `("-z" ,@(when (version<= "2.35" (vc-git--program-version))
+                             '("--sparse"))))
             (vc-git-use-literal-pathspecs nil)
             (include-untracked (project--value-in-dir
                                 'project-vc-include-untracked
@@ -703,7 +704,8 @@ project--vc-list-files
              (delq nil
                    (mapcar
                     (lambda (file)
-                      (unless (member file submodules)
+                      (unless (or (member file submodules)
+                                  (eq ?/ (aref file (1- (length file)))))
                         (if project-files-relative-names
                             file
                           (concat default-directory file))))

Since we're getting a bit busy with our conditions, though, it might be
better to start using `cond`:

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index 873bc92729d..b42415154e3 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -704,11 +704,11 @@ project--vc-list-files
              (delq nil
                    (mapcar
                     (lambda (file)
-                      (unless (or (member file submodules)
-                                  (eq ?/ (aref file (1- (length file)))))
-                        (if project-files-relative-names
-                            file
-                          (concat default-directory file))))
+                      (cond
+                       ((member file submodules) nil)
+                       ((eq ?/ (aref file (1- (length file)))) nil)
+                       (project-files-relative-names file)
+                       (t (concat default-directory file))))
                     (split-string
                      (with-output-to-string
                        (apply #'vc-git-command standard-output 0 nil "ls-files" args))

This seems to help readability -- at least to me. There's probably also
a nominal performance benefit since `cond` is a special form.

I've pushed this as branch `sa/sparse-index-2` to my repository. (This
is in addition to the `sa/sparse-index` branch, which contains the
`file-exists-p` check mentioned below plus what might be, I take it, an
ultimately unneeded opt-out parameter in `project-files`.)

It's worth noting that actually performing a `file-exists-p` check here
would have the added benefit of handling the awkward state between Git
2.25 (where sparse-checkout was introduced) and 2.35 (where git-ls-files
learned --sparse) where ls-files could still report things that _look_
like files but are not present. This would be fixed by just replacing
the (eq ..) form with (not (file-exists-p file)).

-- 
Sean Allred





^ permalink raw reply related	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-18 22:27     ` Dmitry Gutov
  2024-09-19  4:25       ` Sean Allred
@ 2024-09-19  5:41       ` Eli Zaretskii
  2024-09-19  9:34         ` Dmitry Gutov
  1 sibling, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2024-09-19  5:41 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: allred.sean, 73320

> Cc: 73320@debbugs.gnu.org
> Date: Thu, 19 Sep 2024 01:27:03 +0300
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> -                      (unless (member file submodules)
> +                      (unless (or (member file submodules)
> +                                  (eq ?/ (aref file (1- (length file)))))
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Why not use directory-name-p here?

> > Incidentally looking at the version check within `project-files`, it's
> > worthwhile to point out that `--sparse` is likely /not/ compatible with
> > ancient versions of Git. Does vc have any sort of policy on requiring
> > recent versions of these tools? If the answer is 'not really', I'll
> > additionally want to add some sort of protection against using
> > `--sparse` when the Git version won't understand it. This should be easy
> > enough to do within the implementation of `project--vc-list-files`.
> 
> IIRC it was something like "should work on the CentOS stable", and maybe 
> CentOS N-1 as well. But the release-based distro was discontinued since 
> the last time this question came up ;-(
> 
> We can call vc-git--program-version the same way it's used in 
> vc-git-state. Which version should we make the minimum?

In which version was --sparse support introduced, and what is
considered "ancient" for this purpose?  I would not like us to rely on
shining new features of Git if the price is to break people who have
older versions.  TRT in these cases is to probe the version and act
accordingly, using the new features only when they are supported.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-19  5:41       ` Eli Zaretskii
@ 2024-09-19  9:34         ` Dmitry Gutov
  0 siblings, 0 replies; 8+ messages in thread
From: Dmitry Gutov @ 2024-09-19  9:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: allred.sean, 73320

On 19/09/2024 08:41, Eli Zaretskii wrote:
>> Cc: 73320@debbugs.gnu.org
>> Date: Thu, 19 Sep 2024 01:27:03 +0300
>> From: Dmitry Gutov <dmitry@gutov.dev>
>>
>> -                      (unless (member file submodules)
>> +                      (unless (or (member file submodules)
>> +                                  (eq ?/ (aref file (1- (length file)))))
>                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Why not use directory-name-p here?

Thanks for the suggestion, I forgot that it's available and fast.

We might want to likewise update completion-emacs22-try-completion and 
vc-git--file-list-is-rootdir.





^ permalink raw reply	[flat|nested] 8+ messages in thread

* bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index
  2024-09-19  4:25       ` Sean Allred
@ 2024-09-19  9:44         ` Dmitry Gutov
  0 siblings, 0 replies; 8+ messages in thread
From: Dmitry Gutov @ 2024-09-19  9:44 UTC (permalink / raw)
  To: Sean Allred; +Cc: 73320

On 19/09/2024 07:25, Sean Allred wrote:

>> diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
>> index b29d5ed5404..a2e3f3f52e6 100644
>> --- a/lisp/progmodes/project.el
>> +++ b/lisp/progmodes/project.el
>> @@ -663,7 +663,7 @@ project--vc-list-files
>>     (pcase backend
>>       (`Git
>>        (let* ((default-directory (expand-file-name
>>        (file-name-as-directory dir)))
>> -            (args '("-z"))
>> +            (args '("-z" "--sparse"))
>>               (vc-git-use-literal-pathspecs nil)
>>               (include-untracked (project--value-in-dir
>>                                   'project-vc-include-untracked
>> @@ -703,7 +703,8 @@ project--vc-list-files
>>                (delq nil
>>                      (mapcar
>>                       (lambda (file)
>> -                      (unless (member file submodules)
>> +                      (unless (or (member file submodules)
>> +                                  (eq ?/ (aref file (1- (length file)))))
>>                           (if project-files-relative-names
>>                               file
>>                             (concat default-directory file))))
> 
> Works fine for me :-) Though I've added an additional version check
> inlined below.

Great, thanks!

>>> Incidentally looking at the version check within `project-files`, it's
>>> worthwhile to point out that `--sparse` is likely /not/ compatible with
>>> ancient versions of Git. [...]
>>
>> [...]
>>
>> We can call vc-git--program-version the same way it's used in
>> vc-git-state. Which version should we make the minimum?
> 
> The `--sparse` option was introduced in 2.35. The following seems to
> work well for me:
> 
> diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
> index b29d5ed5404..873bc92729d 100644
> --- a/lisp/progmodes/project.el
> +++ b/lisp/progmodes/project.el
> @@ -663,7 +663,8 @@ project--vc-list-files
>     (pcase backend
>       (`Git
>        (let* ((default-directory (expand-file-name (file-name-as-directory dir)))
> -            (args '("-z"))
> +            (args `("-z" ,@(when (version<= "2.35" (vc-git--program-version))
> +                             '("--sparse"))))
>               (vc-git-use-literal-pathspecs nil)
>               (include-untracked (project--value-in-dir
>                                   'project-vc-include-untracked
> @@ -703,7 +704,8 @@ project--vc-list-files
>                (delq nil
>                      (mapcar
>                       (lambda (file)
> -                      (unless (member file submodules)
> +                      (unless (or (member file submodules)
> +                                  (eq ?/ (aref file (1- (length file)))))
>                           (if project-files-relative-names
>                               file
>                             (concat default-directory file))))

Like Eli suggested, we can use directory-name-p instead of the second 
condition.

> Since we're getting a bit busy with our conditions, though, it might be
> better to start using `cond`:
> 
> diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
> index 873bc92729d..b42415154e3 100644
> --- a/lisp/progmodes/project.el
> +++ b/lisp/progmodes/project.el
> @@ -704,11 +704,11 @@ project--vc-list-files
>                (delq nil
>                      (mapcar
>                       (lambda (file)
> -                      (unless (or (member file submodules)
> -                                  (eq ?/ (aref file (1- (length file)))))
> -                        (if project-files-relative-names
> -                            file
> -                          (concat default-directory file))))
> +                      (cond
> +                       ((member file submodules) nil)
> +                       ((eq ?/ (aref file (1- (length file)))) nil)
> +                       (project-files-relative-names file)
> +                       (t (concat default-directory file))))
>                       (split-string
>                        (with-output-to-string
>                          (apply #'vc-git-command standard-output 0 nil "ls-files" args))
> 
> This seems to help readability -- at least to me. There's probably also
> a nominal performance benefit since `cond` is a special form.

I think I'd rather keep the 'unless' for now: it groups the first two 
cases a bit more clearly.

> I've pushed this as branch `sa/sparse-index-2` to my repository. (This
> is in addition to the `sa/sparse-index` branch, which contains the
> `file-exists-p` check mentioned below plus what might be, I take it, an
> ultimately unneeded opt-out parameter in `project-files`.)

Yeah, it doesn't seem to be useful to return files that don't exist in 
the current work directory - no idea what a caller would do with them.

> It's worth noting that actually performing a `file-exists-p` check here
> would have the added benefit of handling the awkward state between Git
> 2.25 (where sparse-checkout was introduced) and 2.35 (where git-ls-files
> learned --sparse) where ls-files could still report things that _look_
> like files but are not present. This would be fixed by just replacing
> the (eq ..) form with (not (file-exists-p file)).

'file-exists-p' does I/O, so it would make processing an order of a 
magnitude slower. Here's a benchmark you can try:

   (benchmark-run 1 (project-files (project-current)))





^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-09-19  9:44 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-17 16:55 bug#73320: [PATCH] project--vc-list-files: use Git's sparse-index Sean Allred
2024-09-17 22:54 ` Dmitry Gutov
2024-09-18  0:36   ` Sean Allred
2024-09-18 22:27     ` Dmitry Gutov
2024-09-19  4:25       ` Sean Allred
2024-09-19  9:44         ` Dmitry Gutov
2024-09-19  5:41       ` Eli Zaretskii
2024-09-19  9:34         ` Dmitry Gutov

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).