unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* A project-files implementation for Git projects
@ 2019-09-06  9:19 Tassilo Horn
  2019-09-06 12:52 ` Stefan Monnier
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-06  9:19 UTC (permalink / raw)
  To: emacs-devel

Hi all,

I'm struggling a bit with the performance of `project-files' for large
projects so I've tried coming up with a faster implementation for Git
projects which uses "git ls-files" instead of "find".

--8<---------------cut here---------------start------------->8---
(cl-defmethod project-files ((project (head vc)) &optional dirs)
  "Implementation of `project-files' for Git projects."
  (cl-mapcan
   (lambda (dir)
     (if-let ((git (and (file-exists-p
                         (expand-file-name ".git/config" dir))
                        (executable-find "git"))))
         (let ((default-directory dir))
           (sort (split-string
                  (shell-command-to-string
                   (concat git " ls-files -z"))
                  "\0" t)
                 #'string<))
       ;; No Git project, so go with the default.
       (cl-call-next-method)))
   (or dirs (project-roots project))))
--8<---------------cut here---------------end--------------->8---

Some benchmarking shows that it's about 8 times faster on my system.
Would something like this be worthwhile to have in project.el?

Some notes:

- Actually, that hasn't to be a cl-defmethod because the project type vc
  doesn't tell if it's a git project and we still need to test each dir
  separately.  So basically the default implementation could do that
  just as well.  Or maybe the project types could be transient or (vc
  . <backend>) where <backend> is some vc-backend?

- It changes the semantics a bit.  The default implementation finds all
  files (minus the ignored) from a project's directory while git
  ls-files just lists the tracked files.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-06  9:19 A project-files implementation for Git projects Tassilo Horn
@ 2019-09-06 12:52 ` Stefan Monnier
  2019-09-10  6:25   ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-06 12:52 UTC (permalink / raw)
  To: emacs-devel

> (cl-defmethod project-files ((project (head vc)) &optional dirs)
>   "Implementation of `project-files' for Git projects."
>   (cl-mapcan
>    (lambda (dir)
>      (if-let ((git (and (file-exists-p
>                          (expand-file-name ".git/config" dir))
>                         (executable-find "git"))))

Since it's a handler for `vc` it should handle all VC backends.  To do
that, it can simply use the `vc-call-backend` mechanism, so the
Git-specific code can be put inside vc-git and you can have a generic
implementation that does just the `cl-call-next-method`.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-06 12:52 ` Stefan Monnier
@ 2019-09-10  6:25   ` Tassilo Horn
  2019-09-10 12:56     ` Stefan Monnier
  2019-09-10 14:41     ` Eli Zaretskii
  0 siblings, 2 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-09-10  6:25 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> (cl-defmethod project-files ((project (head vc)) &optional dirs)
>>   "Implementation of `project-files' for Git projects."
>>   (cl-mapcan
>>    (lambda (dir)
>>      (if-let ((git (and (file-exists-p
>>                          (expand-file-name ".git/config" dir))
>>                         (executable-find "git"))))
>
> Since it's a handler for `vc` it should handle all VC backends.  To do
> that, it can simply use the `vc-call-backend` mechanism, so the
> Git-specific code can be put inside vc-git and you can have a generic
> implementation that does just the `cl-call-next-method`.

I didn't know that mechanism, so I have some more questions.

Would that mean that I would need to add functions

  vc-git-list-files (using: git ls-files)
  vc-hg-list-files (using: hg files)
  ...

for all backends which support listing tracked files?

And then project-files would call (vc-call-backend backend 'list-files)
and if that signals vc-not-supported call cl-call-next-method?  But how
do I know the right backend without explicit tests?  vc-backend wants a
file and all I have is the project's directory.

Also, I think most vc backends have a way to list tracked files but not
all those are faster than find is.  "git ls-files" is much faster than
find but short testing revealed that "hg files" is much slower.  I don't
think it would be a good idea to use it in project-files where speed is
important.  So name the vc function vc-<backend>-list-files-fast and
only provide an implementation for Git?

Thanks,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10  6:25   ` Tassilo Horn
@ 2019-09-10 12:56     ` Stefan Monnier
  2019-09-10 13:39       ` Tassilo Horn
  2019-09-10 14:41     ` Eli Zaretskii
  1 sibling, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-10 12:56 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: emacs-devel

> Would that mean that I would need to add functions
>
>   vc-git-list-files (using: git ls-files)
>   vc-hg-list-files (using: hg files)
>   ...
>
> for all backends which support listing tracked files?

For all backends for which you want to implement the feature yes.
For the rest, you define `vc-default-list-files` instead.

> And then project-files would call (vc-call-backend backend 'list-files)
> and if that signals vc-not-supported call cl-call-next-method?

No.  If there's no vc-<backend>-list-files, then it calls
vc-default-list-files.  No signal.

> But how do I know the right backend without explicit tests?
> vc-backend wants a file and all I have is the project's directory.

IIRC, vc.el calls vc-backend with the directory name in those cases.

> Also, I think most vc backends have a way to list tracked files but not
> all those are faster than find is.

If it's not faster, then don't bother implementing
vc-<backend>-list-files (unless the purpose is to get a different list
rather than to get the list faster).

> So name the vc function vc-<backend>-list-files-fast and only provide
> an implementation for Git?

Sure.  But please go through the vc-call-backend mechanism so as not to
break the abstraction.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10 12:56     ` Stefan Monnier
@ 2019-09-10 13:39       ` Tassilo Horn
  2019-09-10 13:56         ` Stefan Monnier
                           ` (2 more replies)
  0 siblings, 3 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-09-10 13:39 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

Hi Stefan,

>> Would that mean that I would need to add functions
>>
>>   vc-git-list-files (using: git ls-files)
>>   vc-hg-list-files (using: hg files)
>>   ...
>>
>> for all backends which support listing tracked files?
>
> For all backends for which you want to implement the feature yes.  For
> the rest, you define `vc-default-list-files` instead.

But what should this do?  From a vc list-files function I'd expect (and
document) that it lists all and only tracked files.  So should the
default implementation use find to locate all files and then check each
one if it is tracked using vc-state (or something alike)?

>> And then project-files would call (vc-call-backend backend 'list-files)
>> and if that signals vc-not-supported call cl-call-next-method?
>
> No.  If there's no vc-<backend>-list-files, then it calls
> vc-default-list-files.  No signal.
>
>> But how do I know the right backend without explicit tests?
>> vc-backend wants a file and all I have is the project's directory.
>
> IIRC, vc.el calls vc-backend with the directory name in those cases.

(vc-backend "~/Repos/el/emacs") => nil

But that's my emacs git checkout...

>> Also, I think most vc backends have a way to list tracked files but not
>> all those are faster than find is.
>
> If it's not faster, then don't bother implementing
> vc-<backend>-list-files (unless the purpose is to get a different list
> rather than to get the list faster).
>
>> So name the vc function vc-<backend>-list-files-fast and only provide
>> an implementation for Git?
>
> Sure.  But please go through the vc-call-backend mechanism so as not
> to break the abstraction.

Well, I think a vc list-files is generally useful no matter the
performance but for the usage in project-files from project.el the
performance matters a lot.

So IMO, I'd just go with a vc list-files-fast for the usage in
project.el and possibly another vc list-files where implementation are
also provided for the slower backends.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10 13:39       ` Tassilo Horn
@ 2019-09-10 13:56         ` Stefan Monnier
  2019-09-11 11:00           ` Tassilo Horn
  2019-09-10 13:57         ` Robert Pluim
  2019-09-10 14:24         ` Dmitry Gutov
  2 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-10 13:56 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: emacs-devel

>> For all backends for which you want to implement the feature yes.  For
>> the rest, you define `vc-default-list-files` instead.
>
> But what should this do?  From a vc list-files function I'd expect (and
> document) that it lists all and only tracked files.  So should the
> default implementation use find to locate all files and then check each
> one if it is tracked using vc-state (or something alike)?

I'd pass it a "fallback" function, so you'd have something like:

    (defun vc-default-list-files-fast (backend ... fallback)
      (funcall fallback))

and in the call you'd do

    (vc-call-backend (vc-responsible-backend dir) 'list-files-fast ... #'cl-call-next-method)

>> IIRC, vc.el calls vc-backend with the directory name in those cases.
> (vc-backend "~/Repos/el/emacs") => nil
> But that's my emacs git checkout...

Ah, indeed, it seems you want vc-responsible-backend instead.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10 13:39       ` Tassilo Horn
  2019-09-10 13:56         ` Stefan Monnier
@ 2019-09-10 13:57         ` Robert Pluim
  2019-09-10 14:24         ` Dmitry Gutov
  2 siblings, 0 replies; 94+ messages in thread
From: Robert Pluim @ 2019-09-10 13:57 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: Stefan Monnier, emacs-devel

>>>>> On Tue, 10 Sep 2019 15:39:01 +0200, Tassilo Horn <tsdh@gnu.org> said:
    >>> But how do I know the right backend without explicit tests?
    >>> vc-backend wants a file and all I have is the project's directory.
    >> 
    >> IIRC, vc.el calls vc-backend with the directory name in those cases.

    Tassilo> (vc-backend "~/Repos/el/emacs") => nil

    Tassilo> But that's my emacs git checkout...

If you can persuade it to do

(vc-responsible-backend "~/Repos/el/emacs")

instead, then things should work.


Robert



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10 13:39       ` Tassilo Horn
  2019-09-10 13:56         ` Stefan Monnier
  2019-09-10 13:57         ` Robert Pluim
@ 2019-09-10 14:24         ` Dmitry Gutov
  2 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-10 14:24 UTC (permalink / raw)
  To: Tassilo Horn, Stefan Monnier; +Cc: emacs-devel

On 10.09.2019 16:39, Tassilo Horn wrote:
> But what should this do?  From a vc list-files function I'd expect (and
> document) that it lists all and only tracked files.  So should the
> default implementation use find to locate all files and then check each
> one if it is tracked using vc-state (or something alike)?

I'd really like it to be more feature-rich. I.e. to accept arguments 
which files it will return, or blacklist/whitelist.

In my limited testing, Git can handle it and will still return the list 
of files much faster than the current find-based solution.

It's a bit more complex to implement, though. That's why it has been on 
my list for a while without much progress.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10  6:25   ` Tassilo Horn
  2019-09-10 12:56     ` Stefan Monnier
@ 2019-09-10 14:41     ` Eli Zaretskii
  1 sibling, 0 replies; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-10 14:41 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: monnier, emacs-devel

> From: Tassilo Horn <tsdh@gnu.org>
> Date: Tue, 10 Sep 2019 08:25:17 +0200
> Cc: emacs-devel@gnu.org
> 
> Also, I think most vc backends have a way to list tracked files but not
> all those are faster than find is.  "git ls-files" is much faster than
> find but short testing revealed that "hg files" is much slower.

"hg files"?  Did you mean "hg locate", perhaps?  Or are there new
commands in hg that were added lately?  (My installation of Mercurial
is quite old, so maybe I'm out of touch.)

Anyway, whether 'find' or the VC-specific way is faster might be OS
dependent.  I'm guessing you tested on GNU/Linux; on MS-Windows, I get
the opposite results for both hg and bzr (let alone Git), even though
my build of GNU Find is highly optimized and generally much faster
than other ports available on Windows.  Plus, 'find' might not be
available on Windows, whereas the VC backend for a repository must be
available, almost by definition.

So at the very least this should be customizable, and in general,
unless 'find' is orders of magnitude faster, I'd prefer to use VC in
all cases.

Thanks.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-10 13:56         ` Stefan Monnier
@ 2019-09-11 11:00           ` Tassilo Horn
  2019-09-11 20:01             ` Tassilo Horn
  2019-09-14  0:33             ` Dmitry Gutov
  0 siblings, 2 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-09-11 11:00 UTC (permalink / raw)
  To: emacs-devel

Stefan Monnier <monnier@iro.umontreal.ca> writes:

Hi Stefan, Dmitry & Eli,

>> But what should this do?  From a vc list-files function I'd expect
>> (and document) that it lists all and only tracked files.  So should
>> the default implementation use find to locate all files and then
>> check each one if it is tracked using vc-state (or something alike)?
>
> I'd pass it a "fallback" function, so you'd have something like:
>
>     (defun vc-default-list-files-fast (backend ... fallback)
>       (funcall fallback))
>
> and in the call you'd do
>
>     (vc-call-backend (vc-responsible-backend dir)
>      'list-files-fast ... #'cl-call-next-method)

Ah, right, that's a good idea!

>> (vc-backend "~/Repos/el/emacs") => nil
>> But that's my emacs git checkout...
>
> Ah, indeed, it seems you want vc-responsible-backend instead.

(vc-responsible-backend "~/Repos/el/emacs") => Git

Yup, that's it!

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 10.09.2019 16:39, Tassilo Horn wrote:
>> But what should this do?  From a vc list-files function I'd expect
>> (and document) that it lists all and only tracked files.  So should
>> the default implementation use find to locate all files and then
>> check each one if it is tracked using vc-state (or something alike)?
>
> I'd really like it to be more feature-rich. I.e. to accept arguments
> which files it will return, or blacklist/whitelist.
>
> In my limited testing, Git can handle it and will still return the
> list of files much faster than the current find-based solution.
>
> It's a bit more complex to implement, though. That's why it has been
> on my list for a while without much progress.

Yes, I guess ideally it would take a list of vc-states like up-to-date,
edited, needs-update (probably with the exclusion of unregistered) and
list the files in any of the given states.

I'll start simple with listing all tracked files...

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Tassilo Horn <tsdh@gnu.org>
>> Date: Tue, 10 Sep 2019 08:25:17 +0200
>> Cc: emacs-devel@gnu.org
>> 
>> Also, I think most vc backends have a way to list tracked files but
>> not all those are faster than find is.  "git ls-files" is much faster
>> than find but short testing revealed that "hg files" is much slower.
>
> "hg files"?  Did you mean "hg locate", perhaps?  Or are there new
> commands in hg that were added lately?  (My installation of Mercurial
> is quite old, so maybe I'm out of touch.)

Seems so.

--8<---------------cut here---------------start------------->8---
$ hg files --help
hg files [OPTION]... [FILE]...

list tracked files

    Print files under Mercurial control in the working directory or
    specified revision for given files (excluding removed files). Files
    can be specified as filenames or filesets.

    If no files are given to match, this command prints the names of all
    files under Mercurial control.
--8<---------------cut here---------------end--------------->8---

> Anyway, whether 'find' or the VC-specific way is faster might be OS
> dependent.  I'm guessing you tested on GNU/Linux;

Yes.

> on MS-Windows, I get the opposite results for both hg and bzr (let
> alone Git), even though my build of GNU Find is highly optimized and
> generally much faster than other ports available on Windows.  Plus,
> 'find' might not be available on Windows, whereas the VC backend for a
> repository must be available, almost by definition.
>
> So at the very least this should be customizable, and in general,
> unless 'find' is orders of magnitude faster, I'd prefer to use VC in
> all cases.

Ok, you are right.  I'll work on it and report back.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-11 11:00           ` Tassilo Horn
@ 2019-09-11 20:01             ` Tassilo Horn
  2019-09-13 20:38               ` Tassilo Horn
  2019-09-14  0:29               ` Dmitry Gutov
  2019-09-14  0:33             ` Dmitry Gutov
  1 sibling, 2 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-09-11 20:01 UTC (permalink / raw)
  To: emacs-devel

Hi again,

here is a working solution for a VC list-files function.  I've added
implementations for Git, Hg, Bzr, and SVN plus a default implementation
which probably does the right thing for all other handled VC backends.
I guess Monotone also has the ability to quickly list all tracked files
but I haven't been able to install it.

The default implementation is quite slow and so is the SVN version which
fetches the file listing from the remote SVN server.

I also added a vc `project-files' implementation which uses the VC
list-files feature for backends in a new list-valued defcustom
`project-vc-project-files-backends'.

Comments welcome!

--8<---------------cut here---------------start------------->8---
6 files changed, 102 insertions(+), 3 deletions(-)
lisp/progmodes/project.el | 33 ++++++++++++++++++++++++++++++++-
lisp/vc/vc-bzr.el         | 16 ++++++++++++++++
lisp/vc/vc-git.el         | 14 +++++++++++++-
lisp/vc/vc-hg.el          | 13 +++++++++++++
lisp/vc/vc-svn.el         | 18 +++++++++++++++++-
lisp/vc/vc.el             | 11 +++++++++++

modified   lisp/progmodes/project.el
@@ -225,6 +225,26 @@ project-vc-ignores
   :type '(repeat string)
   :safe 'listp)
 
+(defcustom project-vc-project-files-backends '(Bzr Git Hg)
+  "List of vc backends which should be used by `project-files'.
+
+For projects using a backend in this list, `project-files' will
+query the version control system for all tracked files instead of
+using the \"find\" command.
+
+Note that this imposes some differences in semantics:
+
+- The vc backends list tracked files whereas \"find\" lists
+  existing files.
+
+- The performance differs vastly.  The Git backend list files
+  very fast (and generally faster than \"find\") while the SVN
+  backend does so by querying the remote subversion server, i.e.,
+  it requires a network connection and is slow."
+  :type `(set ,@(mapcar (lambda (b) `(const :tag ,(format "%s" b) ,b))
+                        vc-handled-backends))
+  :version "27.1")
+
 ;; FIXME: Using the current approach, major modes are supposed to set
 ;; this variable to a buffer-local value.  So we don't have access to
 ;; the "external roots" of language A from buffers of language B, which
@@ -277,9 +297,20 @@ project-external-roots
      (funcall project-vc-external-roots-function)))
    (project-roots project)))
 
+(cl-defmethod project-files ((project (head vc)) &optional dirs)
+  "Implementation of `project-files' for version controlled projects."
+  (cl-mapcan
+   (lambda (dir)
+     (let ((backend (ignore-errors (vc-responsible-backend dir))))
+       (if (and backend
+                (memq backend project-vc-project-files-backends))
+           (vc-call-backend backend 'list-files dir)
+         (cl-call-next-method))))
+   (or dirs (project-roots project))))
+
 (cl-defmethod project-ignores ((project (head vc)) dir)
   (let* ((root (cdr project))
-          backend)
+         backend)
     (append
      (when (file-equal-p dir root)
        (setq backend (vc-responsible-backend root))
modified   lisp/vc/vc-bzr.el
@@ -45,6 +45,8 @@ vc-bzr-checkout-model
 
 ;;; Code:
 
+(require 'subr-x) ; for string-empty-p
+
 (eval-when-compile
   (require 'cl-lib)
   (require 'vc-dispatcher)
@@ -1315,6 +1317,20 @@ vc-bzr-revision-completion-table
                                              vc-bzr-revision-keywords))
                             string pred)))))
 
+(declare-function cl-remove-if "cl-seq")
+
+(defun vc-bzr-list-files (&optional dir _args)
+  (let ((default-directory (or dir default-directory)))
+    (mapcar
+     #'expand-file-name
+     (cl-remove-if #'string-empty-p
+                   (split-string
+                    (with-output-to-string
+                      (with-current-buffer standard-output
+                        (vc-bzr-command "ls" t 0 "."
+                                        "--null")))
+                    "\0")))))
+
 (provide 'vc-bzr)
 
 ;;; vc-bzr.el ends here
modified   lisp/vc/vc-git.el
@@ -102,9 +102,10 @@
 
 ;;; Code:
 
+(require 'subr-x) ; for string-trim-right, string-empty-p
+
 (eval-when-compile
   (require 'cl-lib)
-  (require 'subr-x) ; for string-trim-right
   (require 'vc)
   (require 'vc-dir))
 
@@ -1706,6 +1707,17 @@ vc-git-symbolic-commit
                                                       (1- (point-max)))))))
          (and name (not (string= name "undefined")) name))))
 
+(declare-function cl-remove-if "cl-seq")
+
+(defun vc-git-list-files (&optional dir _args)
+  (let ((default-directory (or dir default-directory)))
+    (mapcar
+     #'expand-file-name
+     (cl-remove-if #'string-empty-p
+                   (split-string
+                    (vc-git--run-command-string nil "ls-files" "-z")
+                    "\0")))))
+
 (provide 'vc-git)
 
 ;;; vc-git.el ends here
modified   lisp/vc/vc-hg.el
@@ -102,6 +102,7 @@
 ;;; Code:
 
 (require 'cl-lib)
+(require 'subr-x)
 
 (eval-when-compile
   (require 'vc)
@@ -1457,6 +1458,18 @@ vc-hg-command
 (defun vc-hg-root (file)
   (vc-find-root file ".hg"))
 
+(defun vc-hg-list-files (&optional dir _args)
+  (let ((default-directory (or dir default-directory)))
+    (mapcar
+     #'expand-file-name
+     (cl-remove-if #'string-empty-p
+                   (split-string
+                    (with-output-to-string
+                      (with-current-buffer standard-output
+                        (vc-hg-command t 0 "."
+                                       "files" "--print0")))
+                    "\0")))))
+
 (provide 'vc-hg)
 
 ;;; vc-hg.el ends here
modified   lisp/vc/vc-svn.el
@@ -28,7 +28,9 @@
 
 ;;; Code:
 
+(require 'subr-x)
 (eval-when-compile
+  (require 'cl-lib)
   (require 'vc))
 
 ;; Clear up the cache to force vc-call to check again and discover
@@ -807,7 +809,21 @@ vc-svn-revision-table
           (push (match-string 1 loglines) vc-svn-revisions)
           (setq start (+ start (match-end 0)))
           (setq loglines (buffer-substring-no-properties start (point-max)))))
-    vc-svn-revisions)))
+      vc-svn-revisions)))
+
+(declare-function cl-remove-if "cl-seq")
+
+(defun vc-svn-list-files (&optional dir _args)
+  (let ((default-directory (or dir default-directory)))
+    (mapcar
+     #'expand-file-name
+     (cl-remove-if #'string-empty-p
+                   (split-string
+                    (with-output-to-string
+                      (with-current-buffer standard-output
+                        (vc-svn-command t 0 "."
+                                        "list" "--recursive")))
+                    "\n")))))
 
 (provide 'vc-svn)
 
modified   lisp/vc/vc.el
@@ -3106,6 +3106,17 @@ vc-file-tree-walk-internal
                   (vc-file-tree-walk-internal dirf func args)))))
        (directory-files dir)))))
 
+\f
+
+(defun vc-default-list-files (_backend &optional dir _args)
+  (let* ((default-directory (or dir default-directory))
+         (inhibit-message t)
+         files)
+    (vc-file-tree-walk default-directory
+                       (lambda (f)
+                         (setq files (cons f files))))
+    files))
+
 (provide 'vc)
 
 ;;; vc.el ends here
--8<---------------cut here---------------end--------------->8---

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-11 20:01             ` Tassilo Horn
@ 2019-09-13 20:38               ` Tassilo Horn
  2019-09-14  0:29               ` Dmitry Gutov
  1 sibling, 0 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-09-13 20:38 UTC (permalink / raw)
  To: emacs-devel

Hi again,

someone told me a better way to query the tracked files from subversion
which is much faster and requires no network call off-list.

--8<---------------cut here---------------start------------->8---
(defun vc-svn-list-files (&optional dir)
  (let ((default-directory (or dir default-directory))
        files)
    (with-temp-buffer
      (vc-svn-command t 0 "."
                      "info" "--recursive"
                      "--show-item" "relative-url"
                      "--show-item" "kind")
      (goto-char (point-min))
      (while (re-search-forward "^file\s+\\(.*\\)$" nil t)
        (setq files (cons (expand-file-name (match-string 1))
                          files))))
    (nreverse files)))
--8<---------------cut here---------------end--------------->8---

I've also pushed my changes to the branch scratch/tsdh-vc-list-files for
review.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-11 20:01             ` Tassilo Horn
  2019-09-13 20:38               ` Tassilo Horn
@ 2019-09-14  0:29               ` Dmitry Gutov
  2019-09-14 16:26                 ` Tassilo Horn
  1 sibling, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-14  0:29 UTC (permalink / raw)
  To: emacs-devel

Hi Tassilo,

This general approach has been on my list for a while, and I'm thankful 
for help.

But we should get the details right, and IMO that means no "differences 
in semantics", and it implies certain requirements on the new VC backend 
action which we'll need to consider.

On 11.09.2019 23:01, Tassilo Horn wrote:
> here is a working solution for a VC list-files function.  I've added
> implementations for Git, Hg, Bzr, and SVN plus a default implementation
> which probably does the right thing for all other handled VC backends.
> I guess Monotone also has the ability to quickly list all tracked files
> but I haven't been able to install it.

I don't think there's a point in delegating to backends where the 
performance will end up being worse. Though I see you added a faster 
implementation for SVN later.

> I also added a vc `project-files' implementation which uses the VC
> list-files feature for backends in a new list-valued defcustom
> `project-vc-project-files-backends'.

That implementation should both include untracked files (since it's what 
we'd generally expect from it given the current Project API semantics) 
and honor project-vc-ignores (which is something a user can set via 
dir-locals, and I personally found quite useful).

Ideally it also should allow future support for whitelisting entries 
(that override ignores returned by Git), though that's not there yet.

All this makes creating new VC action more difficult, but I think we can 
do that. Maybe not for all backends, but Git for sure, which will help 
95% of our audience.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-11 11:00           ` Tassilo Horn
  2019-09-11 20:01             ` Tassilo Horn
@ 2019-09-14  0:33             ` Dmitry Gutov
  2019-09-14 16:43               ` Tassilo Horn
  1 sibling, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-14  0:33 UTC (permalink / raw)
  To: emacs-devel

On 11.09.2019 14:00, Tassilo Horn wrote:
> Yes, I guess ideally it would take a list of vc-states like up-to-date,
> edited, needs-update (probably with the exclusion of unregistered) and
> list the files in any of the given states.

Why exclude unregistered? I don't imagine they would slow anything down.

In addition to states, the command will likely need a second argument: 
the list of ignores. It can default to whatever Git ignores already, but 
we could also just pass the whole list of ignores anyway.

Speaking of ignores format, maybe they should just be whatever the 
backend in question understands, e.g. the contents of gitignore verbatim.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-14  0:29               ` Dmitry Gutov
@ 2019-09-14 16:26                 ` Tassilo Horn
  2019-09-15 18:56                   ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-14 16:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

> But we should get the details right, and IMO that means no
> "differences in semantics", and it implies certain requirements on the
> new VC backend action which we'll need to consider.

Sure.

> On 11.09.2019 23:01, Tassilo Horn wrote:
>> here is a working solution for a VC list-files function.  I've added
>> implementations for Git, Hg, Bzr, and SVN plus a default
>> implementation which probably does the right thing for all other
>> handled VC backends.  I guess Monotone also has the ability to
>> quickly list all tracked files but I haven't been able to install it.
>
> I don't think there's a point in delegating to backends where the
> performance will end up being worse. Though I see you added a faster
> implementation for SVN later.

As Eli said, just stat find is faster than, e.g., "bzr ls" on my
GNU/Linux system doesn't mean much.  One benefit is that if you have a
bzr project, you'll have bzr installed.  That doesn't need to be the
case for find at least on Windows systems.

>> I also added a vc `project-files' implementation which uses the VC
>> list-files feature for backends in a new list-valued defcustom
>> `project-vc-project-files-backends'.
>
> That implementation should both include untracked files (since it's
> what we'd generally expect from it given the current Project API
> semantics) and honor project-vc-ignores (which is something a user can
> set via dir-locals, and I personally found quite useful).

Hm, git can list untracked files, list ignored files, and also get a
custom ignore pattern.

"bzr ls" has an --unknown flag which should list unknown files and an
--ignored flag to list ignored files, but in my version it then just
lists nothing when I specify either one.

"hg files" doesn't seem to have a way to list untracked files.  Same for
subversion.

> Ideally it also should allow future support for whitelisting entries
> (that override ignores returned by Git), though that's not there yet.
>
> All this makes creating new VC action more difficult, but I think we
> can do that. Maybe not for all backends, but Git for sure, which will
> help 95% of our audience.

If you have an interface in mind (i.e., list-files all arguments and
their meaning), I can try and check how far we can get.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-14  0:33             ` Dmitry Gutov
@ 2019-09-14 16:43               ` Tassilo Horn
  2019-09-15  8:29                 ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-14 16:43 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

>> Yes, I guess ideally it would take a list of vc-states like
>> up-to-date, edited, needs-update (probably with the exclusion of
>> unregistered) and list the files in any of the given states.
>
> Why exclude unregistered? I don't imagine they would slow anything
> down.

Because IMHO they are not interesting except if they were just created
by myself and not yet added which seemed like a minor edge case (and at
least git ls-files lists files which are not yet committed but staged).
And because it seems to me that git is more an exception in being able
to list unregistered files.

> In addition to states, the command will likely need a second argument:
> the list of ignores.  It can default to whatever Git ignores already,
> but we could also just pass the whole list of ignores anyway.
>
> Speaking of ignores format, maybe they should just be whatever the
> backend in question understands, e.g. the contents of gitignore
> verbatim.

I don't see how that would work except by modifying (or moving back and
forth) the existing .{git,hg,bzr}ignore to include project-vc-ignores.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-14 16:43               ` Tassilo Horn
@ 2019-09-15  8:29                 ` Dmitry Gutov
  2019-09-15  9:06                   ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-15  8:29 UTC (permalink / raw)
  To: emacs-devel

On 14.09.2019 19:43, Tassilo Horn wrote:

>> Why exclude unregistered? I don't imagine they would slow anything
>> down.
> 
> Because IMHO they are not interesting except if they were just created
> by myself and not yet added which seemed like a minor edge case (and at
> least git ls-files lists files which are not yet committed but staged).

Why else would they be there if not created by yourself? 
Machine-generated files are usually in gitignore.

And every new file goes through this status. It would be an omission if 
project-find-file could find the file you just created (and probably 
working on).

> And because it seems to me that git is more an exception in being able
> to list unregistered files.

If some VC backends just can't provide the full functionality, we should 
disable using them by default (and only leave 'Git in the defcustom you 
added, with some explanation).

>> Speaking of ignores format, maybe they should just be whatever the
>> backend in question understands, e.g. the contents of gitignore
>> verbatim.
> 
> I don't see how that would work except by modifying (or moving back and
> forth) the existing .{git,hg,bzr}ignore to include project-vc-ignores.

It's an option, although not a very fail-proof one.

Git has helpful options like --exclude and --exclude-from, although it 
seems they don't affect the already tracked files. For which there are 
different things we can do:

* Use negative patterns (https://stackoverflow.com/a/53083343/615245), 
this approach seems to recommend we use a separate argument like 
extra-ignores for the backend command. --exclude-from should help with 
whitelisting entries, though.

* Grep the output, or process it in Emacs programmatically.

* Idea from https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=556584#10, 
though using it might not be the fastest option. Someone should 
benchmark it.

With the first option, the command line would look like:

   git ls-files -c -o --exclude=!whitelisted-dir1 
--exclude=!whitelisted-dir2 --exclude-standard ":!:extra-ignore1" 
":!:extra-ignore2"



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-15  8:29                 ` Dmitry Gutov
@ 2019-09-15  9:06                   ` Dmitry Gutov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-15  9:06 UTC (permalink / raw)
  To: emacs-devel

On 15.09.2019 11:29, Dmitry Gutov wrote:

> If some VC backends just can't provide the full functionality, we should 
> disable using them by default (and only leave 'Git in the defcustom you 
> added, with some explanation).

And speaking of other backends, Mercurial has 'hg status --all', for 
instance. Although it's not very fast.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-14 16:26                 ` Tassilo Horn
@ 2019-09-15 18:56                   ` Dmitry Gutov
  2019-09-16  2:27                     ` Eli Zaretskii
  2019-09-16 13:32                     ` Tassilo Horn
  0 siblings, 2 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-15 18:56 UTC (permalink / raw)
  To: emacs-devel

On 14.09.2019 19:26, Tassilo Horn wrote:

> As Eli said, just stat find is faster than, e.g., "bzr ls" on my
> GNU/Linux system doesn't mean much.  One benefit is that if you have a
> bzr project, you'll have bzr installed.  That doesn't need to be the
> case for find at least on Windows systems.

Makes sense. If 'find' is not available, defaulting to Bzr might be an 
option. Although I'm concerned that if it misses some of the files, it 
might be better to tell the user to install 'find'.

> "bzr ls" has an --unknown flag which should list unknown files and an
> --ignored flag to list ignored files, but in my version it then just
> lists nothing when I specify either one.

*shrug* I wouldn't spend too much time on this particular VCS.

> "hg files" doesn't seem to have a way to list untracked files.  Same for
> subversion.

I've only looked into Mercurial so far, and 'hg status -u' seems to do 
the trick. Depending on performance, you could call 'hg status -c -u', 
or augment the output of 'hg files' with a call to 'hg status -u'.

> If you have an interface in mind (i.e., list-files all arguments and
> their meaning), I can try and check how far we can get.

I think it's either

   (list-files include-untracked extra-ignores whitelist)

or

   (list-files pathspecs include-untracked all-ignores)

where pathspecs defaults to nil meaning "all files" and all-ignores 
defaults to the current contents of .gitignore. It could also be 
interpreted as a file name, but I'm guessing most backends would have a 
hard time supporting this usage.

In the latter case project-files's implementation would contain some 
backend-specific code, e.g. to translate additional ignores into 
pathspec values, so that they also apply to tracked files.

Overall, I'm not 100% sure that we should use VC backend action here 
because it seems like we'll be fighting an impedance mismatch between 
what Git thinks its files are and what we want to see in the list of 
project files. We probably want to use Git because it's fast and 
flexible, but other VCSes are going to be less helpful. So if I were 
writing this myself I'd create a "fast path" for Git repos, and delegate 
to 'find' otherwise.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-15 18:56                   ` Dmitry Gutov
@ 2019-09-16  2:27                     ` Eli Zaretskii
  2019-09-16  3:36                       ` Dmitry Gutov
  2019-09-16 13:32                     ` Tassilo Horn
  1 sibling, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-16  2:27 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Sun, 15 Sep 2019 21:56:11 +0300
> 
> Overall, I'm not 100% sure that we should use VC backend action here 
> because it seems like we'll be fighting an impedance mismatch between 
> what Git thinks its files are and what we want to see in the list of 
> project files. We probably want to use Git because it's fast and 
> flexible, but other VCSes are going to be less helpful. So if I were 
> writing this myself I'd create a "fast path" for Git repos, and delegate 
> to 'find' otherwise.

You seem to be ignoring what I wrote about 'find' being significantly
slower on non-Posix platforms (perhaps on all non-GNU/Linux
platforms).



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-16  2:27                     ` Eli Zaretskii
@ 2019-09-16  3:36                       ` Dmitry Gutov
  2019-09-16 15:25                         ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-16  3:36 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 16.09.2019 5:27, Eli Zaretskii wrote:

> You seem to be ignoring what I wrote about 'find' being significantly
> slower on non-Posix platforms (perhaps on all non-GNU/Linux
> platforms).

A little bit, but I think I've addressed that in another email.

Should we sacrifice some correctness for speed? If backends other than 
Git (and maybe Hg) can't return untracked files, any file that the user 
just created won't show up in project-find-file. I think it's a problem.

We could make up some performance using filenotify and caching, so that 
'find', though slower, would at least be called infrequently.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-15 18:56                   ` Dmitry Gutov
  2019-09-16  2:27                     ` Eli Zaretskii
@ 2019-09-16 13:32                     ` Tassilo Horn
  2019-09-17 11:06                       ` Dmitry Gutov
  1 sibling, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-16 13:32 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> "hg files" doesn't seem to have a way to list untracked files.  Same
>> for subversion.
>
> I've only looked into Mercurial so far, and 'hg status -u' seems to do
> the trick.  Depending on performance, you could call 'hg status -c -u',
> or augment the output of 'hg files' with a call to 'hg status -u'.

Ah, "hg status --all" lists all files including their state (untracked,
ignored, you-name-it), so that's the one we should use.  Performance
seems to be the same as for "hg files".

> Overall, I'm not 100% sure that we should use VC backend action here
> because it seems like we'll be fighting an impedance mismatch between
> what Git thinks its files are and what we want to see in the list of
> project files. We probably want to use Git because it's fast and
> flexible, but other VCSes are going to be less helpful. So if I were
> writing this myself I'd create a "fast path" for Git repos, and
> delegate to 'find' otherwise.

I think we can come up with a VC list-files operation which optionally
includes untracked and ignored files (where the latter implies the
former, doesn't it?) but I'd leave the filtering according to
project-vc-ignores to project.el.

How would project.el call such a VC list-files operation?  I guess you
would include untracked and also ignored files, right?  In my use-case,
the inclusion of ignored files would probably increase the size of the
list of files by a factor of at least 2 because for every *.java file in
our project, there's at least one ignored *.class file (but probably
more like 2-20 *.class files).  So a new defcustom or include ignored
files only if project-vc-ignores has a non-nil value?

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-16  3:36                       ` Dmitry Gutov
@ 2019-09-16 15:25                         ` Eli Zaretskii
  2019-09-17 10:46                           ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-16 15:25 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Mon, 16 Sep 2019 06:36:55 +0300
> 
> Should we sacrifice some correctness for speed?

No, but it was never my intent to say otherwise.  What is "correct"
here?  E.g., 'find' will find _all_ the files, including the ignored
one -- is that "correct"?

> If backends other than Git (and maybe Hg) can't return untracked
> files

Which backend cannot do that?  Bzr can (you just need to call it
twice), and the other two also can.  So I think we have no problem
here.

> We could make up some performance using filenotify and caching, so that 
> 'find', though slower, would at least be called infrequently.

That's over-engineering, I think.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-16 15:25                         ` Eli Zaretskii
@ 2019-09-17 10:46                           ` Dmitry Gutov
  2019-09-17 12:03                             ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-17 10:46 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 16.09.2019 18:25, Eli Zaretskii wrote:

>> Should we sacrifice some correctness for speed?
> 
> No, but it was never my intent to say otherwise.  What is "correct"
> here?  E.g., 'find' will find _all_ the files, including the ignored
> one -- is that "correct"?

'find' can handle an arbitrary list of ignores, including whitelisting 
entries (a feature which I've yet to add, but still).

Not every file that's registered in a VCS should be seen as part of the 
project, and some of the unregistered (and gitignore-d) still should be 
(e.g. config files that are created by every developer individually but 
still should be editable by them through Emacs and project-find-file).

>> If backends other than Git (and maybe Hg) can't return untracked
>> files
> 
> Which backend cannot do that?  Bzr can (you just need to call it
> twice), and the other two also can.  So I think we have no problem
> here.

Tassilo said that Bzr can't, I haven't checked. There was also talk of 
SVN, so "the other two" might mean "the other three".

>> We could make up some performance using filenotify and caching, so that
>> 'find', though slower, would at least be called infrequently.
> 
> That's over-engineering, I think.

Really? Consider: we have to maintain the find-based approach anyway 
since not everyone uses Git/Hg/Bzr.

And the filenotify-based cache would do one job. Conceptually, at least, 
it should be simpler than coercing several VC systems into doing the job 
of 'find' with ignore/whitelist entries not exactly matching VCS's view 
of the repository.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-16 13:32                     ` Tassilo Horn
@ 2019-09-17 11:06                       ` Dmitry Gutov
  2019-09-18 17:15                         ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-17 11:06 UTC (permalink / raw)
  To: Tassilo Horn; +Cc: emacs-devel

Hi Tassilo,

On 16.09.2019 16:32, Tassilo Horn wrote:

> Ah, "hg status --all" lists all files including their state (untracked,
> ignored, you-name-it), so that's the one we should use.  Performance
> seems to be the same as for "hg files".

In my testing the performance difference is about 2x:

$ bash -c "time hg status -c >/dev/null"

real	0m12,015s
user	0m1,899s
sys	0m10,113s

$ bash -c "time hg files >/dev/null"

real	0m5,970s
user	0m1,004s
sys	0m4,965s

(project-files (project-current)) takes ~7 seconds here on the same repo 
(Mozilla Firefox checkout).

But if it's faster than 'find' anyway on some platforms, why not? As 
long as there's a solution that will handle the adjusted ignore rules in 
a similarly performant fashion.

> I think we can come up with a VC list-files operation which optionally
> includes untracked and ignored files (where the latter implies the
> former, doesn't it?)

Whether it implies or not, depends on which set of ignores we're talking 
about (Git's own or the modified one).

> but I'd leave the filtering according to
> project-vc-ignores to project.el.

Have you tries benchmarking this approach? E.g. calling 'git ls-files -c 
-o -z' and then doing all the filtering indicated by .gitignore rules?

Try it on the current Emacs repo.

IME it's the ignore rules that take up 99% of the CPU time when using 
'find'. Without them, 'find .' is instant (though that depends on the 
disk access speed). If we're going to implement that in Elisp, I'd wager 
it's going to be even slower.

> How would project.el call such a VC list-files operation?  I guess you
> would include untracked and also ignored files, right?

I encourage you to try this approach even with Git only, without the VC 
facade, and see where we end up.

> In my use-case,
> the inclusion of ignored files would probably increase the size of the
> list of files by a factor of at least 2 because for every *.java file in
> our project, there's at least one ignored *.class file (but probably
> more like 2-20 *.class files).

I'm fairly sure the compilation artefacts aren't going to be the only 
problem of this approach.

> So a new defcustom or include ignored
> files only if project-vc-ignores has a non-nil value?

I suppose we could end up having different branches of logic for whether 
the user ends up using this variable, but I'd rather not if the "yes" 
branch is always going to be slower. It's like punishing them for using 
a reasonable feature.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-17 10:46                           ` Dmitry Gutov
@ 2019-09-17 12:03                             ` Eli Zaretskii
  2019-09-17 12:55                               ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-17 12:03 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 17 Sep 2019 13:46:19 +0300
> 
> On 16.09.2019 18:25, Eli Zaretskii wrote:
> 
> >> Should we sacrifice some correctness for speed?
> > 
> > No, but it was never my intent to say otherwise.  What is "correct"
> > here?  E.g., 'find' will find _all_ the files, including the ignored
> > one -- is that "correct"?
> 
> 'find' can handle an arbitrary list of ignores, including whitelisting 
> entries (a feature which I've yet to add, but still).

So you are going to convert the likes of .gitignore to a long list of
'find's -name arguments?  Instead of just using the tools that can do
this at a flip of a command-line argument?  Why?

> Not every file that's registered in a VCS should be seen as part of the 
> project, and some of the unregistered (and gitignore-d) still should be 
> (e.g. config files that are created by every developer individually but 
> still should be editable by them through Emacs and project-find-file).

That's a separate topic, IMO, but IME (which is admittedly thin) any
"project" kind of feature requires the user to register the files
somehow, before the file appears in the tree shown by the project's
UI.

> >> If backends other than Git (and maybe Hg) can't return untracked
> >> files
> > 
> > Which backend cannot do that?  Bzr can (you just need to call it
> > twice), and the other two also can.  So I think we have no problem
> > here.
> 
> Tassilo said that Bzr can't, I haven't checked.

That's not exactly what Tassilo said, and "bzr help ls" is just a few
keystrokes away of each one of us.

> >> We could make up some performance using filenotify and caching, so that
> >> 'find', though slower, would at least be called infrequently.
> > 
> > That's over-engineering, I think.
> 
> Really? Consider: we have to maintain the find-based approach anyway 
> since not everyone uses Git/Hg/Bzr.

Where there's no VCS back-end, 'find' will have to do, and if it's
slow, there's not much we can do about it.

> And the filenotify-based cache would do one job.

IME, filenotify doesn't scale well, so I won't recommend it for
non-toy projects in this context.

> Conceptually, at least, it should be simpler than coercing several
> VC systems into doing the job of 'find' with ignore/whitelist
> entries not exactly matching VCS's view of the repository.

I think you are wrong here.  It is 'find' that needs to be coerced to
do a job that is not really its prime, and a project's view is in most
cases almost exactly that of the VCS used for the project.  The
difference is just-now added files that were not yet registered, and
how many of these can there be?



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-17 12:03                             ` Eli Zaretskii
@ 2019-09-17 12:55                               ` Dmitry Gutov
  2019-09-17 13:14                                 ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-17 12:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 17.09.2019 15:03, Eli Zaretskii wrote:

>> 'find' can handle an arbitrary list of ignores, including whitelisting
>> entries (a feature which I've yet to add, but still).
> 
> So you are going to convert the likes of .gitignore to a long list of
> 'find's -name arguments?  Instead of just using the tools that can do
> this at a flip of a command-line argument?  Why?

First of all, it's already done and working.

Second, like I said, using Git for this purpose seems easy enough. 
Others, less so.

> That's a separate topic, IMO, but IME (which is admittedly thin) any
> "project" kind of feature requires the user to register the files
> somehow, before the file appears in the tree shown by the project's
> UI.

Not in my experience.

It's not intuitive that a user has to vc-register a file before it shows 
up in 'project-find-file'. At the very least, we'll have confused users, 
and I'm not sure explaining it all in the manual is the answer.

Second, I'd prefer it to keep working like it already does.

> That's not exactly what Tassilo said, and "bzr help ls" is just a few
> keystrokes away of each one of us.

I didn't have it installed, but OK.

 From what I see, it doesn't allow to "unignore" certain files 
selectively, and I'm not sure if there's a way to apply additional 
ignores except by doing it in Lisp.

>> And the filenotify-based cache would do one job.
> 
> IME, filenotify doesn't scale well, so I won't recommend it for
> non-toy projects in this context.

Even if we just listen to "create", "rename" and "delete" events?

Too bad. I imagine it has more to do with maturity of the package, then, 
so maybe someday.

> I think you are wrong here.  It is 'find' that needs to be coerced to
> do a job that is not really its prime, and a project's view is in most
> cases almost exactly that of the VCS used for the project.

It's a job we've been using 'find' for for decades, and I only had to 
adapt some existing code.

> The
> difference is just-now added files that were not yet registered, and
> how many of these can there be?

That's not the only problem. We want to change the ignores list 
arbitrarily, or maybe just use a user-defined one. It will usually 
correlate with the contents of gitignore/hgignore/bzrignore, but not 
exactly. And neither Hg or Bzr seem to offer easy ways to adjust their 
lists of ignores before output.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-17 12:55                               ` Dmitry Gutov
@ 2019-09-17 13:14                                 ` Eli Zaretskii
  2019-09-19 15:33                                   ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-17 13:14 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 17 Sep 2019 15:55:55 +0300
> 
> On 17.09.2019 15:03, Eli Zaretskii wrote:
> 
> >> 'find' can handle an arbitrary list of ignores, including whitelisting
> >> entries (a feature which I've yet to add, but still).
> > 
> > So you are going to convert the likes of .gitignore to a long list of
> > 'find's -name arguments?  Instead of just using the tools that can do
> > this at a flip of a command-line argument?  Why?
> 
> First of all, it's already done and working.

I don't see how it is relevant.  My point is that having to work with
long 'find' command lines is unwieldy and inelegant.  A given VCS
repository can have many dozens of ignored files.

> Second, like I said, using Git for this purpose seems easy enough. 
> Others, less so.

I don't see how you arrived to the last conclusion.  AFAIU, Tassilo
presented code to deal with others as well.

> It's not intuitive that a user has to vc-register a file before it shows 
> up in 'project-find-file'.

We already established that we will show also non-registered files, so
this is a non-issue.

>  From what I see, it doesn't allow to "unignore" certain files 
> selectively, and I'm not sure if there's a way to apply additional 
> ignores except by doing it in Lisp.

What do you mean by "unignore"?  Which VCS backend allows you to do
that, and how?

> >> And the filenotify-based cache would do one job.
> > 
> > IME, filenotify doesn't scale well, so I won't recommend it for
> > non-toy projects in this context.
> 
> Even if we just listen to "create", "rename" and "delete" events?
> 
> Too bad. I imagine it has more to do with maturity of the package, then, 
> so maybe someday.

It isn't the problem of filenotify the package, it's a problem with
the OS features it uses: inotify, knotify, etc.  They all lose events
when there are a lot of file operations in a tree.  E.g., try using
Auto-revert-tail mode on a log file produced by building the Boost
library -- in the default mode, where it uses file notifications, it
simply chokes.

> > I think you are wrong here.  It is 'find' that needs to be coerced to
> > do a job that is not really its prime, and a project's view is in most
> > cases almost exactly that of the VCS used for the project.
> 
> It's a job we've been using 'find' for for decades, and I only had to 
> adapt some existing code.

That something _can_ be done a certain way doesn't yet mean it
_should_ be done that way.

> > The
> > difference is just-now added files that were not yet registered, and
> > how many of these can there be?
> 
> That's not the only problem. We want to change the ignores list 
> arbitrarily, or maybe just use a user-defined one. It will usually 
> correlate with the contents of gitignore/hgignore/bzrignore, but not 
> exactly. And neither Hg or Bzr seem to offer easy ways to adjust their 
> lists of ignores before output.

Then those features will be not available for hg and bzr, or we will
tell their users that must have that to switch to 'find'.  But let's
not bypass VC where it can do the job, because that's TRT from where I
stand.  We should not design our implementations for the lowest
denominator, especially when we know that a large proportion of the
users will use Git anyway.

Thanks.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-17 11:06                       ` Dmitry Gutov
@ 2019-09-18 17:15                         ` Tassilo Horn
  2019-09-19 16:01                           ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-18 17:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> Ah, "hg status --all" lists all files including their state
>> (untracked, ignored, you-name-it), so that's the one we should use.
>> Performance seems to be the same as for "hg files".
>
> In my testing the performance difference is about 2x:
>
> $ bash -c "time hg status -c >/dev/null"
>
> real	0m12,015s
> user	0m1,899s
> sys	0m10,113s
>
> $ bash -c "time hg files >/dev/null"
>
> real	0m5,970s
> user	0m1,004s
> sys	0m4,965s
>
> (project-files (project-current)) takes ~7 seconds here on the same repo
> (Mozilla Firefox checkout).
>
> But if it's faster than 'find' anyway on some platforms, why not? As
> long as there's a solution that will handle the adjusted ignore rules
> in a similarly performant fashion.

Right.

>> I think we can come up with a VC list-files operation which
>> optionally includes untracked and ignored files (where the latter
>> implies the former, doesn't it?)
>
> Whether it implies or not, depends on which set of ignores we're
> talking about (Git's own or the modified one).
>
>> but I'd leave the filtering according to project-vc-ignores to
>> project.el.
>
> Have you tries benchmarking this approach? E.g. calling 'git ls-files
> -c -o -z' and then doing all the filtering indicated by .gitignore
> rules?
>
> Try it on the current Emacs repo.
>
> IME it's the ignore rules that take up 99% of the CPU time when using
> 'find'. Without them, 'find .' is instant (though that depends on the
> disk access speed). If we're going to implement that in Elisp, I'd
> wager it's going to be even slower.

Well, ok.  I've now played with an interface

  (vc-call-backend (vc-responsible-backend dir)
		   'list-files
		   dir
		   include-unregistered
		   extra-includes)

where extra-includes works in addition to the standard VC ignore rules
(.gitignore, .hgignore).  Or do you want to override the VC-internal
rules?

At least for Git and Hg, I came up with reasonable implementations:

--8<---------------cut here---------------start------------->8---
(defun vc-git-list-files (&optional dir
                                    include-unregistered
                                    extra-ignores)
  (let ((default-directory (or dir default-directory))
        (args '("-z")))
    (when include-unregistered
      (setq args (nconc args '("-c" "-o" "--exclude-standard"))))
    (when extra-ignores
      (setq args (nconc args
                        (mapcan
                         (lambda (i)
                           (list "--exclude" i))
                         (copy-list extra-ignores)))))
    (mapcar
     #'expand-file-name
     (cl-remove-if
      #'string-empty-p
      (split-string
       (apply #'vc-git--run-command-string nil "ls-files" args)
       "\0")))))

(defun vc-hg-list-files (&optional dir
                                   include-unregistered
                                   extra-ignores)
  (let ((default-directory (or dir default-directory))
        args
        files)
    (when include-unregistered
      (setq args (nconc args '("--all"))))
    (when extra-ignores
      (setq args (nconc args
                        (mapcan
                         (lambda (i)
                           (list "--exclude" i))
                         (copy-list extra-ignores)))))
    (with-temp-buffer
      (apply #'vc-hg-command t 0 "."
             "status" args)
      (goto-char (point-min))
      (while (re-search-forward "^[?C]\s+\\(.*\\)$" nil t)
        (setq files (cons (expand-file-name (match-string 1))
                          files))))
    (nreverse files)))
--8<---------------cut here---------------end--------------->8---

There's a semantic difference between Git and Hg in the treatment of
extra-ignores.  With Git, the extra-ignores do not rule out committed
files (i.e., they are only effective for untracked files) while for Hg,
they also rule out committed files.  I think the Hg semantics are
probably better but I don't see how to change the Git version so that it
acts the same way (except by re-filtering in lisp, of course), do you?

I haven't looked at the other backends.  I guess bzr will probably be
doable, too.  However, for SVN, there's no way to list unregistered
files.  A correct (but horribly slow) default implementation should also
be doable.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-17 13:14                                 ` Eli Zaretskii
@ 2019-09-19 15:33                                   ` Dmitry Gutov
  2019-09-19 17:29                                     ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-19 15:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 17.09.2019 16:14, Eli Zaretskii wrote:

> My point is that having to work with
> long 'find' command lines is unwieldy and inelegant.  A given VCS
> repository can have many dozens of ignored files.

I don't see how the length of the command line is an issue. The user 
won't see it.

And anyway, as we've established, the find-based solution will have to 
be there anyway, at least for some uses. So there's no use arguing about 
its awkwardness: the complexity coming with it will have to be there anyway.

>> Second, like I said, using Git for this purpose seems easy enough.
>> Others, less so.
> 
> I don't see how you arrived to the last conclusion.  AFAIU, Tassilo
> presented code to deal with others as well.

Sorry it's in the thread. Including the end of the message you were 
replying to.

>> It's not intuitive that a user has to vc-register a file before it shows
>> up in 'project-find-file'.
> 
> We already established that we will show also non-registered files, so
> this is a non-issue.

I was replying to a sentence from you where you seemed to disagree.

>>   From what I see, it doesn't allow to "unignore" certain files
>> selectively, and I'm not sure if there's a way to apply additional
>> ignores except by doing it in Lisp.
> 
> What do you mean by "unignore"?  Which VCS backend allows you to do
> that, and how?

Err, not yet.

But I'm thinking we should allow adding entries like '!/foo/bar' to 
project-vc-ignores, similar to Git's related syntax.

The find-based indexer is yet to learn to support that feature, but 
asking Git to add extra whitelist entries looks fairly easy.

> It isn't the problem of filenotify the package, it's a problem with
> the OS features it uses: inotify, knotify, etc.  They all lose events
> when there are a lot of file operations in a tree.  E.g., try using
> Auto-revert-tail mode on a log file produced by building the Boost
> library -- in the default mode, where it uses file notifications, it
> simply chokes.

Should there be a way to debounce the events, or handle this somehow? 
For instance, when too many events come, we could simply discard the 
files list cache, and force it to be rebuilt later. For this particular 
use, we don't really have to see every notification.

>> That's not the only problem. We want to change the ignores list
>> arbitrarily, or maybe just use a user-defined one. It will usually
>> correlate with the contents of gitignore/hgignore/bzrignore, but not
>> exactly. And neither Hg or Bzr seem to offer easy ways to adjust their
>> lists of ignores before output.
> 
> Then those features will be not available for hg and bzr, or we will
> tell their users that must have that to switch to 'find'.  But let's
> not bypass VC where it can do the job, because that's TRT from where I
> stand.  We should not design our implementations for the lowest
> denominator, especially when we know that a large proportion of the
> users will use Git anyway.

Yes. Like I said several messages back: "We probably want to use Git 
because it's fast and flexible, but other VCSes are going to be less 
helpful". Tassilo might prove me wrong in some instances, but it's 
definitely hard to support this feature across the range of VC backends.

So it *would* make sense to bypass VC (the API) and just use Git (and 
maybe Hg) directly when available.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-18 17:15                         ` Tassilo Horn
@ 2019-09-19 16:01                           ` Dmitry Gutov
  2019-09-22  8:56                             ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-19 16:01 UTC (permalink / raw)
  To: emacs-devel

Hi Tassilo,

On 18.09.2019 20:15, Tassilo Horn wrote:

> Well, ok.  I've now played with an interface
> 
>    (vc-call-backend (vc-responsible-backend dir)
> 		   'list-files
> 		   dir
> 		   include-unregistered
> 		   extra-includes)

Not sure we should use the vc-call-backend route when only 1 or 2 
backends will work well enough, but OK, it's not essential.

> where extra-includes works in addition to the standard VC ignore rules
> (.gitignore, .hgignore).  Or do you want to override the VC-internal
> rules?

I'm afraid it might not work as well if we try to treat all (modified) 
ignores the same. In my understanding, the speed with which Git lists 
files to a large extent stems from not having to apply the ignores to 
the already-registered files. Someone should benchmark this, but I think 
if we use the "negative pathspec" approach mentioned below for all 
ignores together, it might slow down file listing by an order of 
magnitude or several.

> At least for Git and Hg, I came up with reasonable implementations:
> 
> --8<---------------cut here---------------start------------->8---
> (defun vc-git-list-files (&optional dir
>                                      include-unregistered
>                                      extra-ignores)
>    (let ((default-directory (or dir default-directory))
>          (args '("-z")))
>      (when include-unregistered
>        (setq args (nconc args '("-c" "-o" "--exclude-standard"))))
>      (when extra-ignores
>        (setq args (nconc args
>                          (mapcan
>                           (lambda (i)
>                             (list "--exclude" i))
>                           (copy-list extra-ignores)))))
>      (mapcar
>       #'expand-file-name
>       (cl-remove-if
>        #'string-empty-p
>        (split-string
>         (apply #'vc-git--run-command-string nil "ls-files" args)
>         "\0")))))
> 
> (defun vc-hg-list-files (&optional dir
>                                     include-unregistered
>                                     extra-ignores)
>    (let ((default-directory (or dir default-directory))
>          args
>          files)
>      (when include-unregistered
>        (setq args (nconc args '("--all"))))
>      (when extra-ignores
>        (setq args (nconc args
>                          (mapcan
>                           (lambda (i)
>                             (list "--exclude" i))
>                           (copy-list extra-ignores)))))
>      (with-temp-buffer
>        (apply #'vc-hg-command t 0 "."
>               "status" args)
>        (goto-char (point-min))
>        (while (re-search-forward "^[?C]\s+\\(.*\\)$" nil t)
>          (setq files (cons (expand-file-name (match-string 1))
>                            files))))
>      (nreverse files)))

Terrific, thank you! How is Hg's performance with this approach? Does 
adding a few ignores (like 5 or 10) slow down the output measurably?

BTW, can Hg support extra whitelist entries as well?

> --8<---------------cut here---------------end--------------->8---
> 
> There's a semantic difference between Git and Hg in the treatment of
> extra-ignores.  With Git, the extra-ignores do not rule out committed
> files (i.e., they are only effective for untracked files) while for Hg,
> they also rule out committed files.  I think the Hg semantics are
> probably better

Better and important, IMO.

> but I don't see how to change the Git version so that it
> acts the same way (except by re-filtering in lisp, of course), do you?

Previously suggested:

https://stackoverflow.com/questions/36753573/how-do-i-exclude-files-from-git-ls-files/53083343#53083343

That means converting all extra-ignores into negative pathspec strings.

> I haven't looked at the other backends.  I guess bzr will probably be
> doable, too.  However, for SVN, there's no way to list unregistered
> files.  A correct (but horribly slow) default implementation should also
> be doable.

Yeah, I wonder if we should treat this as a VC operation. On the other 
hand, the fallback implementation could just as well use 'find'.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-19 15:33                                   ` Dmitry Gutov
@ 2019-09-19 17:29                                     ` Eli Zaretskii
  2019-09-20 11:25                                       ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-19 17:29 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Thu, 19 Sep 2019 18:33:23 +0300
> Cc: emacs-devel@gnu.org
> 
> > It isn't the problem of filenotify the package, it's a problem with
> > the OS features it uses: inotify, knotify, etc.  They all lose events
> > when there are a lot of file operations in a tree.  E.g., try using
> > Auto-revert-tail mode on a log file produced by building the Boost
> > library -- in the default mode, where it uses file notifications, it
> > simply chokes.
> 
> Should there be a way to debounce the events, or handle this somehow? 
> For instance, when too many events come, we could simply discard the 
> files list cache, and force it to be rebuilt later. For this particular 
> use, we don't really have to see every notification.

The problem is not in Emacs, it's in inotify etc.  Their documentation
explicitly says that events cab legitimately lost.

I don't really feel like continuing the rest of the discussion, as you
clearly don't care about my views on this.  Whatever.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-19 17:29                                     ` Eli Zaretskii
@ 2019-09-20 11:25                                       ` Dmitry Gutov
  2019-09-20 12:59                                         ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-20 11:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 19.09.2019 20:29, Eli Zaretskii wrote:

> The problem is not in Emacs, it's in inotify etc.  Their documentation
> explicitly says that events cab legitimately lost.

I *think* a handler can deal with that in a way, by storing time 
signatures of N last events, and if new ones arrive in some very short 
window of time, declare itself unable to cope, and flush the cache. So, 
uh, that might work.

> I don't really feel like continuing the rest of the discussion, as you
> clearly don't care about my views on this.  Whatever.

I'm sorry, we're probably speaking different languages. I think I've 
been explaining how some things I could guess you wanted me to do are 
not feasible.

If there's anything in particular (that you wanted done), please tell.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 11:25                                       ` Dmitry Gutov
@ 2019-09-20 12:59                                         ` Eli Zaretskii
  2019-09-20 13:28                                           ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-20 12:59 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Fri, 20 Sep 2019 14:25:03 +0300
> 
> On 19.09.2019 20:29, Eli Zaretskii wrote:
> 
> > The problem is not in Emacs, it's in inotify etc.  Their documentation
> > explicitly says that events cab legitimately lost.
> 
> I *think* a handler can deal with that in a way

No, it cannot, because the events are lost before they reach the
Emacs event queue.  Unless, that is, you mean by "handler" something
that is outside Emacs, in which case I see no way for Emacs to control
how such a "handler" behaves.

> > I don't really feel like continuing the rest of the discussion, as you
> > clearly don't care about my views on this.  Whatever.
> 
> I'm sorry, we're probably speaking different languages. I think I've 
> been explaining how some things I could guess you wanted me to do are 
> not feasible.

"Not feasible".  You are, in effect, saying that I have no real idea
what I'm talking about.  And that's exactly the issue at hand.

Sorry, I can no longer afford discussions in which replies are posted
after two, sometimes 3 or 4 days, and in which each of my arguments,
no matter how minor, always gets contradicted with counter-arguments,
with context stripped so thoroughly that I need to go back 2, 3,
sometimes even 4 messages back to even begin to understand how we
arrived to that.  You never even _try_ to compromise, let alone see
the issue from my POV.  It's as if anything I say that doesn't match
your opinion is in advance deemed invalid.  There's no hope of any
agreement.  That isn't a kind of argument I can afford spending my
time on, sorry.  So I will have to stop trying to do my job of being a
co-maintainer in these cases.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 12:59                                         ` Eli Zaretskii
@ 2019-09-20 13:28                                           ` Dmitry Gutov
  2019-09-20 13:45                                             ` Stefan Monnier
  2019-09-20 14:23                                             ` Eli Zaretskii
  0 siblings, 2 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-20 13:28 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 20.09.2019 15:59, Eli Zaretskii wrote:

> No, it cannot, because the events are lost before they reach the
> Emacs event queue.  Unless, that is, you mean by "handler" something
> that is outside Emacs, in which case I see no way for Emacs to control
> how such a "handler" behaves.

It could work by detecting a "congestion" before the event happens (by 
seeing that events fire up very quickly, so the queue must fill up 
soon). That wouldn't be 100% reliable, though.

> "Not feasible".  You are, in effect, saying that I have no real idea
> what I'm talking about.  And that's exactly the issue at hand.

I suppose that's possible. And if so, my apologies.

FWIW, before you wrote that "giving up" message, I figured we've reached 
the same page, more or less. Maybe I lack reading comprehension.

> Sorry, I can no longer afford discussions in which replies are posted
> after two, sometimes 3 or 4 days,

Sorry about that.

> and in which each of my arguments,
> no matter how minor, always gets contradicted with counter-arguments,

I've seen the same from your side. And if you see it as arguing, I see 
that as having to to explain stuff, sometimes multiple times, in 
different branches of this same thread.

Maybe you would prefer different levels of nesting, but they make 
reading things harder on my end, including when I also go back to 
re-read the messages in this discussion. I can try cutting out less 
context next time, but by how much?

> with context stripped so thoroughly that I need to go back 2, 3,
> sometimes even 4 messages back to even begin to understand how we
> arrived to that.  You never even _try_ to compromise, let alone see
> the issue from my POV.

I'm sorry, I don't know what you're talking about here.

I *cannot* compromise by removing reliance on 'find' because we'll need 
it anyway. You complained about it for some reason, but also yourself 
acknowledged that it will remain necessary in certain cases.

What else is there to compromise on? Implement the file listing using 
Bzr and SVN as well? People are free to work on that, as long as no 
functionality is lost in the end, meaning proper fallback to 'find' (I 
suppose) when the tool can't support the necessary feature (extra 
ignores or whitelist) that the user requested.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 13:28                                           ` Dmitry Gutov
@ 2019-09-20 13:45                                             ` Stefan Monnier
  2019-09-20 13:54                                               ` Dmitry Gutov
  2019-09-20 14:23                                             ` Eli Zaretskii
  1 sibling, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-20 13:45 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

> It could work by detecting a "congestion" before the event happens (by
> seeing that events fire up very quickly, so the queue must fill up 
> soon). That wouldn't be 100% reliable, though.

That's right: it wouldn't be 100% reliable.  I don't think we want to go
that way.  We could ask for inotify or other backends to send us
a special event when other events were dropped, but IIRC currently we
don't get that info and inferring it from outside is just too ugly
to consider.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 13:45                                             ` Stefan Monnier
@ 2019-09-20 13:54                                               ` Dmitry Gutov
  2019-09-20 14:12                                                 ` Michael Albinus
  2019-09-20 15:01                                                 ` Stefan Monnier
  0 siblings, 2 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-20 13:54 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: Eli Zaretskii, emacs-devel

On 20.09.2019 16:45, Stefan Monnier wrote:
> That's right: it wouldn't be 100% reliable.  I don't think we want to go
> that way.

That would be for one application, not for the general facility, but I'm 
not in a hurry to implement that, of course.

> We could ask for inotify or other backends to send us
> a special event when other events were dropped

If inotify and friends could do that, that would be best.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 13:54                                               ` Dmitry Gutov
@ 2019-09-20 14:12                                                 ` Michael Albinus
  2019-09-20 14:30                                                   ` Eli Zaretskii
  2019-09-20 15:01                                                 ` Stefan Monnier
  1 sibling, 1 reply; 94+ messages in thread
From: Michael Albinus @ 2019-09-20 14:12 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, Stefan Monnier, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> On 20.09.2019 16:45, Stefan Monnier wrote:
>> That's right: it wouldn't be 100% reliable.  I don't think we want to go
>> that way.
>
> That would be for one application, not for the general facility, but
> I'm not in a hurry to implement that, of course.
>
>> We could ask for inotify or other backends to send us
>> a special event when other events were dropped
>
> If inotify and friends could do that, that would be best.

inotify(7) documents the event

           IN_Q_OVERFLOW
                  Event queue overflowed (wd is -1 for this event).

It is signalled by our inotify backend as `q-overflow' event, but
filenotify.el doesn't care. And I haven't checked the other
backends. Maybe we shall expose it in filenotify.el (independent where
this discussion ends)?

Best regards, Michael.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 13:28                                           ` Dmitry Gutov
  2019-09-20 13:45                                             ` Stefan Monnier
@ 2019-09-20 14:23                                             ` Eli Zaretskii
  2019-09-20 14:48                                               ` Dmitry Gutov
  1 sibling, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-20 14:23 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Cc: emacs-devel@gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Fri, 20 Sep 2019 16:28:07 +0300
> 
> On 20.09.2019 15:59, Eli Zaretskii wrote:
> 
> > No, it cannot, because the events are lost before they reach the
> > Emacs event queue.  Unless, that is, you mean by "handler" something
> > that is outside Emacs, in which case I see no way for Emacs to control
> > how such a "handler" behaves.
> 
> It could work by detecting a "congestion" before the event happens (by 
> seeing that events fire up very quickly, so the queue must fill up 
> soon).

What is "it" here?  If "it" is part of Emacs, then no, it cannot see
any congestion coming, and cannot know anything about the events
firing up quickly, because these events are reported by inotify, a
service running outside of Emacs.

And if "it" is inotify itself, then you should be talking about this
with its developers, which are not here.  And the same with other
similar services filenotify uses.

> > and in which each of my arguments,
> > no matter how minor, always gets contradicted with counter-arguments,
> 
> I've seen the same from your side.

And that makes it okay?

> I *cannot* compromise by removing reliance on 'find' because we'll need 
> it anyway. You complained about it for some reason, but also yourself 
> acknowledged that it will remain necessary in certain cases.
> 
> What else is there to compromise on? Implement the file listing using 
> Bzr and SVN as well? People are free to work on that, as long as no 
> functionality is lost in the end, meaning proper fallback to 'find' (I 
> suppose) when the tool can't support the necessary feature (extra 
> ignores or whitelist) that the user requested.

This just reiterates the same arguments.  I see no reason for another
round of the same stuff and the same results.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 14:12                                                 ` Michael Albinus
@ 2019-09-20 14:30                                                   ` Eli Zaretskii
  2019-09-20 14:51                                                     ` Dmitry Gutov
  2019-09-20 14:55                                                     ` Michael Albinus
  0 siblings, 2 replies; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-20 14:30 UTC (permalink / raw)
  To: Michael Albinus; +Cc: emacs-devel, monnier, dgutov

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,  Eli Zaretskii
>  <eliz@gnu.org>,  emacs-devel@gnu.org
> Date: Fri, 20 Sep 2019 16:12:23 +0200
> 
> inotify(7) documents the event
> 
>            IN_Q_OVERFLOW
>                   Event queue overflowed (wd is -1 for this event).
> 
> It is signalled by our inotify backend as `q-overflow' event, but
> filenotify.el doesn't care. And I haven't checked the other
> backends. Maybe we shall expose it in filenotify.el (independent where
> this discussion ends)?

Expose it and do what?  About the only thing you can do is switch to
the alternative implementation (like turn off auto-revert-use-notify),
but (a) not every feature even has such an alternative, and (b) you
will still miss events, because there's no way to recover the lost
events.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 14:23                                             ` Eli Zaretskii
@ 2019-09-20 14:48                                               ` Dmitry Gutov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-20 14:48 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On 20.09.2019 17:23, Eli Zaretskii wrote:

> What is "it" here?  If "it" is part of Emacs, then no, it cannot see
> any congestion coming, and cannot know anything about the events
> firing up quickly, because these events are reported by inotify, a
> service running outside of Emacs.

A handler function could measure and save the current time, for each 
event. But never mind, nobody's liking this approach anyway.

>>> and in which each of my arguments,
>>> no matter how minor, always gets contradicted with counter-arguments,
>>
>> I've seen the same from your side.
> 
> And that makes it okay?

Well, yes? I don't mind it from either side, and if you mind, maybe 
start by giving a better example. Sorry to be blunt.

And again, I hadn't viewed this discussion as confrontational, up until now.

> This just reiterates the same arguments.  I see no reason for another
> round of the same stuff and the same results.

OK, guess we're definitely not on the same page.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 14:30                                                   ` Eli Zaretskii
@ 2019-09-20 14:51                                                     ` Dmitry Gutov
  2019-09-20 15:04                                                       ` Michael Albinus
  2019-09-20 14:55                                                     ` Michael Albinus
  1 sibling, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-20 14:51 UTC (permalink / raw)
  To: Eli Zaretskii, Michael Albinus; +Cc: monnier, emacs-devel

On 20.09.2019 17:30, Eli Zaretskii wrote:

> Expose it and do what?  About the only thing you can do is switch to
> the alternative implementation (like turn off auto-revert-use-notify),

That's the idea. As long as the switch can be made temporary.

For project-files, that would mean switching to whatever function 
actually enumerates all files in the project (we need it anyway).

I think this would require two events, though: queue-overflow-start and 
queue-overflow-stop.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 14:30                                                   ` Eli Zaretskii
  2019-09-20 14:51                                                     ` Dmitry Gutov
@ 2019-09-20 14:55                                                     ` Michael Albinus
  2019-09-20 15:55                                                       ` Eli Zaretskii
  1 sibling, 1 reply; 94+ messages in thread
From: Michael Albinus @ 2019-09-20 14:55 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, monnier, dgutov

Eli Zaretskii <eliz@gnu.org> writes:

>> It is signalled by our inotify backend as `q-overflow' event, but
>> filenotify.el doesn't care. And I haven't checked the other
>> backends. Maybe we shall expose it in filenotify.el (independent where
>> this discussion ends)?
>
> Expose it and do what?  About the only thing you can do is switch to
> the alternative implementation (like turn off auto-revert-use-notify),
> but (a) not every feature even has such an alternative, and (b) you
> will still miss events, because there's no way to recover the lost
> events.

In filenotify.el, we use already an internal `stopped' event, see (info
"(elisp) File Notifications")

It is raised when somebody calls `file-notify-rm-watch', or a watched
file is deleted or renamed. Since it is documented, packages using
filenotify could care about. And yes, autorevert falls back to polling.

If `q-overflow' arrives in filenotify, we could send a `stopped' event
for all valid watch descriptors which belong to inotify. I don't say
that every package can be profit from this, but it is better than doing
nothing.

Btw, I haven't seen ever `q-overflow'. So I wouldn't know at the
beginning, how to write a test in filenotify-tests.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 13:54                                               ` Dmitry Gutov
  2019-09-20 14:12                                                 ` Michael Albinus
@ 2019-09-20 15:01                                                 ` Stefan Monnier
  2019-09-20 15:59                                                   ` Eli Zaretskii
  1 sibling, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-20 15:01 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, emacs-devel

>> That's right: it wouldn't be 100% reliable.  I don't think we want to go
>> that way.
> That would be for one application, not for the general facility, but I'm not
> in a hurry to implement that, of course.

Good, because this kind of code is better kept away from Emacs.

The experience with file-notifications in auto-revert mode has been
pretty poor, AFAIC: the implementation is fairly complex/fiddly, and the
user-level behavior is only sometimes better (I often see the delay that
comes from polling).


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 14:51                                                     ` Dmitry Gutov
@ 2019-09-20 15:04                                                       ` Michael Albinus
  2019-09-22  9:23                                                         ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Michael Albinus @ 2019-09-20 15:04 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, monnier, emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> I think this would require two events, though: queue-overflow-start
> and queue-overflow-stop.

I'm not sure that inotify signals that a queue could be reused. It must
be arranged by the client. inotify(7) says

       Note that the event queue can overflow.  In this case, events are lost.
       Robust applications should handle the possibility of lost events grace‐
       fully.   For example, it may be necessary to rebuild part or all of the
       application cache.  (One simple, but possibly expensive, approach is to
       close  the  inotify file descriptor, empty the cache, create a new ino‐
       tify file descriptor, and then re-create watches and cache entries  for
       the objects to be monitored.)

This means, that a watch object returned by `file-notify-add-watch'
cannot be reused.

Let's first inform filenotify clients, that a watch is not valid
anymore. This gives them already valuable information.

Best regards, Michael.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 14:55                                                     ` Michael Albinus
@ 2019-09-20 15:55                                                       ` Eli Zaretskii
  0 siblings, 0 replies; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-20 15:55 UTC (permalink / raw)
  To: Michael Albinus; +Cc: emacs-devel, monnier, dgutov

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: dgutov@yandex.ru,  monnier@iro.umontreal.ca,  emacs-devel@gnu.org
> Date: Fri, 20 Sep 2019 16:55:01 +0200
> 
> Btw, I haven't seen ever `q-overflow'.

Try watching a log file of a Boost library build while the build runs.
If that doesn't trigger q-overflow, then probably nothing will.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 15:01                                                 ` Stefan Monnier
@ 2019-09-20 15:59                                                   ` Eli Zaretskii
  2019-09-20 17:32                                                     ` Stefan Monnier
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-20 15:59 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, dgutov

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>,  emacs-devel@gnu.org
> Date: Fri, 20 Sep 2019 11:01:54 -0400
> 
> The experience with file-notifications in auto-revert mode has been
> pretty poor

Did you see a better experience in any other application?  All I hear
is negative experience; e.g., LSP developers frequently advise their
users on Reddit to turn notifications off.

Which leads me to a personal conclusion that using these interfaces in
Emacs is a largely failed experiment: a feature in which we invested
(and still are investing) a lot of effort, but whose ROI is rather low
and disappointing.  Makes me wish we had the wisdom to say no thanks
when the feature was first proposed.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 15:59                                                   ` Eli Zaretskii
@ 2019-09-20 17:32                                                     ` Stefan Monnier
  2019-09-20 17:49                                                       ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-20 17:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, dgutov

>> The experience with file-notifications in auto-revert mode has been
>> pretty poor
> Did you see a better experience in any other application?

I have no other experience to compare it with, no: neither xterm nor
ctwm use them, I haven't noticed that firefox uses them, and well with
Emacs that's pretty much all the applications I'm using ;-)

Oh, wait, maybe `mpd` can use them on its music library, but I wouldn't
have noticed the difference (that library changes much too rarely and
I always explicitly ask for a refresh when I do change it).


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 17:32                                                     ` Stefan Monnier
@ 2019-09-20 17:49                                                       ` Eli Zaretskii
  2019-09-20 18:04                                                         ` Stefan Monnier
  0 siblings, 1 reply; 94+ messages in thread
From: Eli Zaretskii @ 2019-09-20 17:49 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: dgutov, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 20 Sep 2019 13:32:52 -0400
> Cc: emacs-devel@gnu.org, dgutov@yandex.ru
> 
> >> The experience with file-notifications in auto-revert mode has been
> >> pretty poor
> > Did you see a better experience in any other application?
> 
> I have no other experience to compare it with, no: neither xterm nor
> ctwm use them, I haven't noticed that firefox uses them, and well with
> Emacs that's pretty much all the applications I'm using ;-)
> 
> Oh, wait, maybe `mpd` can use them on its music library, but I wouldn't
> have noticed the difference (that library changes much too rarely and
> I always explicitly ask for a refresh when I do change it).

I meant in Emacs applications.  Sorry for being too vague.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 17:49                                                       ` Eli Zaretskii
@ 2019-09-20 18:04                                                         ` Stefan Monnier
  0 siblings, 0 replies; 94+ messages in thread
From: Stefan Monnier @ 2019-09-20 18:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: dgutov, emacs-devel

>> >> The experience with file-notifications in auto-revert mode has been
>> >> pretty poor
>> > Did you see a better experience in any other application?
> I meant in Emacs applications.  Sorry for being too vague.

Ah, right.  Well, no I don't think I use other Emacs packages which make
use of file notifications.

BTW, while I obviously agree with you that the overall experience is
rather bleak, I do enjoy the faster refresh of auto-revert when file
notifications do work.  I'm just disappointed that this is not the
overwhelming majority of the cases (it's probably the majority of the
cases, but if so, not by a wide enough margin that I wouldn't have to
doubt it).


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-19 16:01                           ` Dmitry Gutov
@ 2019-09-22  8:56                             ` Tassilo Horn
  2019-09-22  9:37                               ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-22  8:56 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> where extra-includes works in addition to the standard VC ignore
>> rules (.gitignore, .hgignore).  Or do you want to override the
>> VC-internal rules?
>
> I'm afraid it might not work as well if we try to treat all (modified)
> ignores the same. In my understanding, the speed with which Git lists
> files to a large extent stems from not having to apply the ignores to
> the already-registered files.  Someone should benchmark this, but I
> think if we use the "negative pathspec" approach mentioned below for
> all ignores together, it might slow down file listing by an order of
> magnitude or several.
>
>> At least for Git and Hg, I came up with reasonable implementations:
>> [...]
>
> Terrific, thank you! How is Hg's performance with this approach? Does
> adding a few ignores (like 5 or 10) slow down the output measurably?

No, it doesn't slow down the listing (in comparison to just hg status
--all).  However, my test hg repo is not extraordinarily large (~4000
files).

> BTW, can Hg support extra whitelist entries as well?

"hg status --all" prints everything including ignored files.  An
--exclude restricts the output and filters the output so that matching
files are not listed.  --include also restricts the output so than only
files matched by such an include pattern are listed.

>> --8<---------------cut here---------------end--------------->8---
>> There's a semantic difference between Git and Hg in the treatment of
>> extra-ignores.  With Git, the extra-ignores do not rule out committed
>> files (i.e., they are only effective for untracked files) while for
>> Hg, they also rule out committed files.  I think the Hg semantics are
>> probably better
>
> Better and important, IMO.
>
>> but I don't see how to change the Git version so that it acts the
>> same way (except by re-filtering in lisp, of course), do you?
>
> Previously suggested:
>
> https://stackoverflow.com/questions/36753573/how-do-i-exclude-files-from-git-ls-files/53083343#53083343
>
> That means converting all extra-ignores into negative pathspec strings.

Ok, I see.  So that would be this and it seems like now we have the same
semantics as with the hg version:

--8<---------------cut here---------------start------------->8---
(defun vc-git-list-files (&optional dir
                                    include-unregistered
                                    extra-ignores)
  (let ((default-directory (or dir default-directory))
        (args '("-z")))
    (when include-unregistered
      (setq args (append args '("-c" "-o" "--exclude-standard"))))
    (when extra-ignores
      (setq args (append args
                         (cons "--"
                               (mapcar
                                (lambda (i)
                                  (format ":!:%s" i))
                                extra-ignores)))))
    (mapcar
     #'expand-file-name
     (cl-remove-if
      #'string-empty-p
      (split-string
       (apply #'vc-git--run-command-string nil "ls-files" args)
       "\0")))))
--8<---------------cut here---------------end--------------->8---

So basically "git status ... -- '*.el'" corresponds to "hg status --all
--include '*.el'" whereas negative pathspecs correspond to hg's
--exclude.

A quick look at bzr suggests there's just a way restrict positively,
i.e., like --include with hg.

>> I haven't looked at the other backends.  I guess bzr will probably be
>> doable, too.  However, for SVN, there's no way to list unregistered
>> files.  A correct (but horribly slow) default implementation should
>> also be doable.
>
> Yeah, I wonder if we should treat this as a VC operation. On the other
> hand, the fallback implementation could just as well use 'find'.

Right now, it uses `vc-file-tree-walk'...

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-20 15:04                                                       ` Michael Albinus
@ 2019-09-22  9:23                                                         ` Dmitry Gutov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-22  9:23 UTC (permalink / raw)
  To: Michael Albinus; +Cc: Eli Zaretskii, monnier, emacs-devel

On 20.09.2019 18:04, Michael Albinus wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
> 
>> I think this would require two events, though: queue-overflow-start
>> and queue-overflow-stop.
> 
> I'm not sure that inotify signals that a queue could be reused. It must
> be arranged by the client. inotify(7) says
> 
>         Note that the event queue can overflow.  In this case, events are lost.
>         Robust applications should handle the possibility of lost events grace‐
>         fully.   For example, it may be necessary to rebuild part or all of the
>         application cache.  (One simple, but possibly expensive, approach is to
>         close  the  inotify file descriptor, empty the cache, create a new ino‐
>         tify file descriptor, and then re-create watches and cache entries  for
>         the objects to be monitored.)

Ah, okay. So I guess the application can enter the "degraded" mode and 
then add a timer to try and enable notification-based operation again.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-22  8:56                             ` Tassilo Horn
@ 2019-09-22  9:37                               ` Dmitry Gutov
  2019-09-23  7:42                                 ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-22  9:37 UTC (permalink / raw)
  To: emacs-devel

Hi Tassilo,

On 22.09.2019 11:56, Tassilo Horn wrote:

> No, it doesn't slow down the listing (in comparison to just hg status
> --all).  However, my test hg repo is not extraordinarily large (~4000
> files).

If performance is C*N where N is the number of files, then we can 
compare the times to complete on any medium-sized repo, as long as the 
list of ignores is significant (though they don't need to match anything).

Anyway, if the perf looks good to you we could push the improvement 
first and then deal with negative reports.

>> BTW, can Hg support extra whitelist entries as well?
> 
> "hg status --all" prints everything including ignored files.  An
> --exclude restricts the output and filters the output so that matching
> files are not listed.  --include also restricts the output so than only
> files matched by such an include pattern are listed.

What about the hgignore contents? Does EXTRA-IGNORES in the Hg 
implementation actually mean ALL-IGNORES, i.e. will we need to pass the 
whole ignores list there?

I've been toying with an implementation for Git which uses negative 
pathspecs to specify all ignores (including the whitelist), instead of 
modifying the ignores list. Performance-wise, it looks good enough, so 
it seems my intuition was wrong. We could hit maximum command line 
length this way, though this didn't happen with Emacs's gitignore, which 
is not small. I wonder how much of a concern that would be.

The actual implementation wasn't saved on disk and got eaten by a 
reboot, but I can show it later if you like.

> Ok, I see.  So that would be this and it seems like now we have the same
> semantics as with the hg version:

Very good. Support for rooted entries and whitelist can be easily added 
here.

There's a caveat, though: negative pathspecs have only been added AFAICT 
in Git 1.9. Whereas CentOS Stable is on Git 1.8.3 currently.

So we'll have to handle it somehow, e.g. use a fallback for that version.

> A quick look at bzr suggests there's just a way restrict positively,
> i.e., like --include with hg.

That makes me more inclined to just hardcode two implementations (one 
for Git and another for Hg) inside project.el. At least as the first 
version of this feature.

>> Yeah, I wonder if we should treat this as a VC operation. On the other
>> hand, the fallback implementation could just as well use 'find'.
> 
> Right now, it uses `vc-file-tree-walk'...

Shouldn't somebody reimplement it on top of 'find'?



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-22  9:37                               ` Dmitry Gutov
@ 2019-09-23  7:42                                 ` Tassilo Horn
  2019-09-23 12:22                                   ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-23  7:42 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> No, it doesn't slow down the listing (in comparison to just hg status
>> --all).  However, my test hg repo is not extraordinarily large (~4000
>> files).
>
> If performance is C*N where N is the number of files, then we can
> compare the times to complete on any medium-sized repo, as long as the
> list of ignores is significant (though they don't need to match
> anything).

I'll see if I can get some larger repo and report back.  With my test
repo with ~4000 files, it took around 0.25 seconds no matter if zero or
ten --exclude patterns were given.

> Anyway, if the perf looks good to you we could push the improvement
> first and then deal with negative reports.

Yes.

>>> BTW, can Hg support extra whitelist entries as well?
>> "hg status --all" prints everything including ignored files.  An
>> --exclude restricts the output and filters the output so that
>> matching files are not listed.  --include also restricts the output
>> so than only files matched by such an include pattern are listed.
>
> What about the hgignore contents? Does EXTRA-IGNORES in the Hg
> implementation actually mean ALL-IGNORES, i.e. will we need to pass
> the whole ignores list there?

No, "hg status --all" prints files with their status, e.g.,

$ hg status --all
? unregistered.txt
I ignored.o
C .hgignore
C committed.txt

Right now, we don't collect files marked as "I"gnored.  As soon as you
add extra ignores, files will actually be filtered:

$ hg status --all --exclude '*.o'
? unregistered.txt
C .hgignore
C committed.txt

> I've been toying with an implementation for Git which uses negative
> pathspecs to specify all ignores (including the whitelist), instead of
> modifying the ignores list. Performance-wise, it looks good enough, so
> it seems my intuition was wrong. We could hit maximum command line
> length this way, though this didn't happen with Emacs's gitignore,
> which is not small. I wonder how much of a concern that would be.
>
> The actual implementation wasn't saved on disk and got eaten by a
> reboot, but I can show it later if you like.

Sure, then I can check if that's doable with at least hg.

>> Ok, I see.  So that would be this and it seems like now we have the
>> same semantics as with the hg version:
>
> Very good. Support for rooted entries and whitelist can be easily
> added here.
>
> There's a caveat, though: negative pathspecs have only been added
> AFAICT in Git 1.9. Whereas CentOS Stable is on Git 1.8.3 currently.
>
> So we'll have to handle it somehow, e.g. use a fallback for that
> version.

IMHO, the fallback is just use the existing "find" version, no?

>> A quick look at bzr suggests there's just a way restrict positively,
>> i.e., like --include with hg.
>
> That makes me more inclined to just hardcode two implementations (one
> for Git and another for Hg) inside project.el. At least as the first
> version of this feature.

I have no clear preference but as my main concern is better performance
with our Git repo at work, I won't object.

>>> Yeah, I wonder if we should treat this as a VC operation. On the
>>> other hand, the fallback implementation could just as well use
>>> 'find'.
>> Right now, it uses `vc-file-tree-walk'...
>
> Shouldn't somebody reimplement it on top of 'find'?

I don't know.  It would surely be faster but there might be systems
without 'find'.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-23  7:42                                 ` Tassilo Horn
@ 2019-09-23 12:22                                   ` Dmitry Gutov
  2019-09-27 16:17                                     ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-23 12:22 UTC (permalink / raw)
  To: emacs-devel

On 23.09.2019 10:42, Tassilo Horn wrote:

> I'll see if I can get some larger repo and report back.  With my test
> repo with ~4000 files, it took around 0.25 seconds no matter if zero or
> ten --exclude patterns were given.

Thanks!

>> What about the hgignore contents? Does EXTRA-IGNORES in the Hg
>> implementation actually mean ALL-IGNORES, i.e. will we need to pass
>> the whole ignores list there?
> 
> No, "hg status --all" prints files with their status, e.g.,
> 
> $ hg status --all
> ? unregistered.txt
> I ignored.o
> C .hgignore
> C committed.txt

I think that was more of a "yes". :-)

> Right now, we don't collect files marked as "I"gnored.  As soon as you
> add extra ignores, files will actually be filtered:
> 
> $ hg status --all --exclude '*.o'
> ? unregistered.txt
> C .hgignore
> C committed.txt

What I wondered is whether running 'hg status --all --exclude 
'committed.*' would include 'ignored.o' in the output.

It does, and still with 'I' at bol. It's a good result, and it probably 
implies that ALL-IGNORES semantics might be a better choice, in case 
some people have *lots* of files ignores in their Hg repos (build 
artefacts, etc).

>> I've been toying with an implementation for Git which uses negative
>> pathspecs to specify all ignores (including the whitelist), instead of
>> modifying the ignores list. Performance-wise, it looks good enough, so
>> it seems my intuition was wrong. We could hit maximum command line
>> length this way, though this didn't happen with Emacs's gitignore,
>> which is not small. I wonder how much of a concern that would be.
>>
>> The actual implementation wasn't saved on disk and got eaten by a
>> reboot, but I can show it later if you like.
> 
> Sure, then I can check if that's doable with at least hg.

OK, a bit later. But it seems obviously doable with Hg: just add all 
includes with --exclude and avoid filtering by the status character.

>> There's a caveat, though: negative pathspecs have only been added
>> AFAICT in Git 1.9. Whereas CentOS Stable is on Git 1.8.3 currently.
>>
>> So we'll have to handle it somehow, e.g. use a fallback for that
>> version.
> 
> IMHO, the fallback is just use the existing "find" version, no?

Yes. Just worried about the number of cases where we would use the fallback.

>> That makes me more inclined to just hardcode two implementations (one
>> for Git and another for Hg) inside project.el. At least as the first
>> version of this feature.
> 
> I have no clear preference but as my main concern is better performance
> with our Git repo at work, I won't object.

Excellent.

>>> Right now, it uses `vc-file-tree-walk'...
>>
>> Shouldn't somebody reimplement it on top of 'find'?
> 
> I don't know.  It would surely be faster but there might be systems
> without 'find'.

Makes sense. So it would need a fallback as well. Maybe someday, then.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-23 12:22                                   ` Dmitry Gutov
@ 2019-09-27 16:17                                     ` Tassilo Horn
  2019-09-30  0:09                                       ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-09-27 16:17 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Hi Dmitry,

I have to admit that I've lost track over what we're actually trying to
do exactly.  My main motivation was just to get a faster project-files
implementation for the Git repo I have at work.  I've now fixed that on
my end using a simple caching approach which is good enough for me ATM.

So I think that your current idea is to:

- No general VC list-files operation since we can come up with good
  versions just for Git and Hg anyway.  Fine with me.

- I'm not sure if we're on the same board when it comes to ignores.

If you want me to implement something I fear you have to explain the
ignore story again.  I don't know what you imply by renaming the
parameter from EXTRA-IGNORES to ALL-IGNORES.  Do you mean that the
patterns in .{git,hg}ignore (and $XDG_CONFIG_HOME/git/ignore,
$GIT_DIR/info/exclude) should be part of the ALL-IGNORES list (in
addition to project-vc-ignores) and parsed from those files?  Processing
3 files in the case of Git and handling 3 ignore syntaxes (regexp, glob,
rootglob) in the case of Hg doesn't sound too appealing to me.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-27 16:17                                     ` Tassilo Horn
@ 2019-09-30  0:09                                       ` Dmitry Gutov
  2019-09-30  0:25                                         ` Stefan Monnier
                                                           ` (3 more replies)
  0 siblings, 4 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-30  0:09 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2296 bytes --]

Hi Tassilo,

On 27.09.2019 19:17, Tassilo Horn wrote:

> I have to admit that I've lost track over what we're actually trying to
> do exactly.  My main motivation was just to get a faster project-files
> implementation for the Git repo I have at work.  I've now fixed that on
> my end using a simple caching approach which is good enough for me ATM.

I'm attaching a working patch that will probably need tweaks later. I'm 
not sure how to handle attribution best. If you like it, you can simply 
commit it under your name (up to now, I've mostly done the rearranging 
and some minor tweaks).

> So I think that your current idea is to:
> 
> - No general VC list-files operation since we can come up with good
>    versions just for Git and Hg anyway.  Fine with me.

Great. I think we should start with that that either way. And when some 
other application arises that could use a VC 'ls-files' command we 
should see whether we can extract something that satisfies both. Maybe 
the minimum supported Git version will make it easier for us too then.

> - I'm not sure if we're on the same board when it comes to ignores.
> 
> If you want me to implement something I fear you have to explain the
> ignore story again.  I don't know what you imply by renaming the
> parameter from EXTRA-IGNORES to ALL-IGNORES.  Do you mean that the
> patterns in .{git,hg}ignore (and $XDG_CONFIG_HOME/git/ignore,
> $GIT_DIR/info/exclude) should be part of the ALL-IGNORES list (in
> addition to project-vc-ignores) and parsed from those files?

Good point, I forgot about those (and also about per-directory gitignore 
files). ALL-IGNORES would be a nice semantics for a VC 'ls-files' 
command, but if we're not doing that now, we don't have to try to fit 
that approach.

Still, there could be a performance problem with outputting all ignored 
files and then parsing out only a part of them. Could you try the new 
feature with a test repository from 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22481?

> Processing
> 3 files in the case of Git and handling 3 ignore syntaxes (regexp, glob,
> rootglob) in the case of Hg doesn't sound too appealing to me.

Speaking of syntaxes, I was curious which one is used by Mercurial's 
--exclude. But it seems to accept globs okay, including ones rooted with 
'./'.

[-- Attachment #2: vc-list-files.diff --]
[-- Type: text/x-patch, Size: 2709 bytes --]

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index 4693d07fa8..bbb6310323 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -277,6 +277,66 @@ project-try-vc
      (funcall project-vc-external-roots-function)))
    (project-roots project)))
 
+(cl-defmethod project-files ((project (head vc)) &optional dirs)
+  (cl-mapcan
+   (lambda (dir)
+     (let (backend)
+       (if (and (file-equal-p dir (cdr project))
+                (setq backend (vc-responsible-backend dir))
+                (cond
+                 ((eq backend 'Hg))
+                 ((and (eq backend 'Git)
+                       (or
+                        (not project-vc-ignores)
+                        (version<= "1.9" (vc-git--program-version)))))))
+           (project-vc-list-files dir backend project-vc-ignores)
+         (project--files-in-directory
+          dir
+          (project--dir-ignores project dir)))))
+   (or dirs (project-roots project))))
+
+(defun project-vc-list-files (dir backend extra-ignores)
+  (pcase backend
+    (`Git
+     (let ((default-directory dir)
+           (args '("-z")))
+       (when t ;include-unregistered
+         (setq args (append args '("-c" "-o" "--exclude-standard"))))
+       (when extra-ignores
+         (setq args (append args
+                            (cons "--"
+                                  (mapcar
+                                   (lambda (i)
+                                     (if (string-match "\\./" i)
+                                         (format ":!/:%s" (substring i 2))
+                                       (format ":!:%s" i)))
+                                   extra-ignores)))))
+       (mapcar
+        #'expand-file-name
+        (split-string
+         (apply #'vc-git--run-command-string nil "ls-files" args)
+         "\0" t))))
+    (`Hg
+     (let ((default-directory dir)
+           args
+           files)
+       (when t ;include-unregistered
+         (setq args (nconc args '("--all"))))
+       (when extra-ignores
+         (setq args (nconc args
+                           (mapcan
+                            (lambda (i)
+                              (list "--exclude" i))
+                            (copy-list extra-ignores)))))
+       (with-temp-buffer
+         (apply #'vc-hg-command t 0 "."
+                "status" args)
+         (goto-char (point-min))
+         (while (re-search-forward "^[?C]\s+\\(.*\\)$" nil t)
+           (setq files (cons (expand-file-name (match-string 1))
+                             files))))
+       (nreverse files)))))
+
 (cl-defmethod project-ignores ((project (head vc)) dir)
   (let* ((root (cdr project))
           backend)

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30  0:09                                       ` Dmitry Gutov
@ 2019-09-30  0:25                                         ` Stefan Monnier
  2019-09-30  6:50                                           ` Dmitry Gutov
  2019-10-01  8:11                                         ` Dmitry Gutov
                                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-30  0:25 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

>> So I think that your current idea is to:
>> - No general VC list-files operation since we can come up with good
>>    versions just for Git and Hg anyway.  Fine with me.
> Great. I think we should start with that that either way. And when some
> other application arises that could use a VC 'ls-files' command we 
> should see whether we can extract something that satisfies both. Maybe the
> minimum supported Git version will make it easier for us too then.

FWIW, I simply cannot see what could be the advantage of not going
through `vc-call-backend`.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30  0:25                                         ` Stefan Monnier
@ 2019-09-30  6:50                                           ` Dmitry Gutov
  2019-09-30 17:09                                             ` Stefan Monnier
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-09-30  6:50 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 30.09.2019 3:25, Stefan Monnier wrote:

> FWIW, I simply cannot see what could be the advantage of not going
> through `vc-call-backend`.

Two reasons. None of them insurmountable, but:

1. Not having to worry about changing the API to accommodate future needs.

2. Very uneven backend capabilities, and the necessity to set up 
fallbacks. As you can see, even now Git requires a modern-enough version 
to support EXTRA-IGNORES. And, well, project--files-in-directory is 
private, at least for now.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30  6:50                                           ` Dmitry Gutov
@ 2019-09-30 17:09                                             ` Stefan Monnier
  2019-10-01  8:19                                               ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-09-30 17:09 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

>> FWIW, I simply cannot see what could be the advantage of not going
>> through `vc-call-backend`.
> Two reasons. None of them insurmountable, but:
> 1. Not having to worry about changing the API to accommodate future needs.

I think we should feel free to change this "API" at will.

> 2. Very uneven backend capabilities, and the necessity to set up
> fallbacks.  As you can see, even now Git requires a modern-enough version to
> support EXTRA-IGNORES. And, well, project--files-in-directory is private, at
> least for now.

These seem to affect the code regardless of whether it goes through
vc-call-backend or not.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30  0:09                                       ` Dmitry Gutov
  2019-09-30  0:25                                         ` Stefan Monnier
@ 2019-10-01  8:11                                         ` Dmitry Gutov
  2019-10-03  8:33                                           ` Tassilo Horn
  2019-10-03  7:41                                         ` Tassilo Horn
  2019-10-03 23:02                                         ` A project-files implementation for Git projects Dmitry Gutov
  3 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-01  8:11 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 278 bytes --]

Here's an updated patch.

- project--vc-list-files now has '--' in its name.

- Turns out that sometimes Git repositories include symlinks, and even 
broken ones. Grep, subsequently, chokes on those with "file not found", 
so we now suppress all "not found" messages from Grep.

[-- Attachment #2: vc-list-files.diff --]
[-- Type: text/x-patch, Size: 3108 bytes --]

diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
index 4693d07fa8..4556c0bdb9 100644
--- a/lisp/progmodes/project.el
+++ b/lisp/progmodes/project.el
@@ -277,6 +277,67 @@ project-try-vc
      (funcall project-vc-external-roots-function)))
    (project-roots project)))
 
+(cl-defmethod project-files ((project (head vc)) &optional dirs)
+  (cl-mapcan
+   (lambda (dir)
+     (let (backend)
+       (if (and (file-equal-p dir (cdr project))
+                (setq backend (vc-responsible-backend dir))
+                nil
+                (cond
+                 ((eq backend 'Hg))
+                 ((and (eq backend 'Git)
+                       (or
+                        (not project-vc-ignores)
+                        (version<= "1.9" (vc-git--program-version)))))))
+           (project--vc-list-files dir backend project-vc-ignores)
+         (project--files-in-directory
+          dir
+          (project--dir-ignores project dir)))))
+   (or dirs (project-roots project))))
+
+(defun project--vc-list-files (dir backend extra-ignores)
+  (pcase backend
+    (`Git
+     (let ((default-directory dir)
+           (args '("-z")))
+       (when t ;include-unregistered
+         (setq args (append args '("-c" "-o" "--exclude-standard"))))
+       (when extra-ignores
+         (setq args (append args
+                            (cons "--"
+                                  (mapcar
+                                   (lambda (i)
+                                     (if (string-match "\\./" i)
+                                         (format ":!/:%s" (substring i 2))
+                                       (format ":!:%s" i)))
+                                   extra-ignores)))))
+       (mapcar
+        #'expand-file-name
+        (split-string
+         (apply #'vc-git--run-command-string nil "ls-files" args)
+         "\0" t))))
+    (`Hg
+     (let ((default-directory dir)
+           args
+           files)
+       (when t ;include-unregistered
+         (setq args (nconc args '("--all"))))
+       (when extra-ignores
+         (setq args (nconc args
+                           (mapcan
+                            (lambda (i)
+                              (list "--exclude" i))
+                            (copy-list extra-ignores)))))
+       (with-temp-buffer
+         (apply #'vc-hg-command t 0 "."
+                "status" args)
+         (goto-char (point-min))
+         (while (re-search-forward "^[?C]\s+\\(.*\\)$" nil t)
+           (setq files (cons (expand-file-name (match-string 1))
+                             files))))
+       (nreverse files)))))
+
 (cl-defmethod project-ignores ((project (head vc)) dir)
   (let* ((root (cdr project))
           backend)
@@ -391,7 +452,7 @@ project--find-regexp-in-files
        (status nil)
        (hits nil)
        (xrefs nil)
-       (command (format "xargs -0 grep %s -nHE -e %s"
+       (command (format "xargs -0 grep %s -snHE -e %s"
                         (if (and case-fold-search
                                  (isearch-no-upper-case-p regexp t))
                             "-i"

^ permalink raw reply related	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30 17:09                                             ` Stefan Monnier
@ 2019-10-01  8:19                                               ` Dmitry Gutov
  2019-10-01 12:31                                                 ` Stefan Monnier
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-01  8:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 30.09.2019 20:09, Stefan Monnier wrote:
>> 1. Not having to worry about changing the API to accommodate future needs.
> 
> I think we should feel free to change this "API" at will.

So no worry about external VC backends? Like these ones:

   vc-darcs
   vc-fossil
   vc-hgcmd
   vc-msg
   vc-osc

>> 2. Very uneven backend capabilities, and the necessity to set up
>> fallbacks.  As you can see, even now Git requires a modern-enough version to
>> support EXTRA-IGNORES. And, well, project--files-in-directory is private, at
>> least for now.
> 
> These seem to affect the code regardless of whether it goes through
> vc-call-backend or not.

Not exactly. vc-call-backend has its own fallback mechanism (using 
project--files-in-directory there would be kinda awkward), but if we do 
all the checks in one place, we don't have to use it.

Further, normally when somebody adds an implementation of a VC action 
for some backend, it's a good thing, even if the implementation is 
imperfect. Not so in this case.

Using a defcustom for "which backends are capable" is also out of line 
with general VC usage.

In any case, we should push the new feature first. The decision whether 
the functionality should move to a VC action can happen later.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-01  8:19                                               ` Dmitry Gutov
@ 2019-10-01 12:31                                                 ` Stefan Monnier
  2019-10-01 13:10                                                   ` Stefan Monnier
  0 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-10-01 12:31 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

>>> 1. Not having to worry about changing the API to accommodate future needs.
>> I think we should feel free to change this "API" at will.
>
> So no worry about external VC backends? Like these ones:
>
>   vc-darcs
>   vc-fossil
>   vc-hgcmd
>   vc-msg
>   vc-osc

Exactly! (note: I'm not talking about the whole API, I'm talking about
this "project files" part of the API).

> Further, normally when somebody adds an implementation of a VC action for
> some backend, it's a good thing, even if the implementation is
> imperfect. Not so in this case.

Good point,


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-01 12:31                                                 ` Stefan Monnier
@ 2019-10-01 13:10                                                   ` Stefan Monnier
  2019-10-01 23:38                                                     ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Stefan Monnier @ 2019-10-01 13:10 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Exactly! (note: I'm not talking about the whole API, I'm talking about
> this "project files" part of the API).

We could use a `project--files` name to hint at the "internal" status.


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-01 13:10                                                   ` Stefan Monnier
@ 2019-10-01 23:38                                                     ` Dmitry Gutov
  2019-10-03  9:25                                                       ` Felician Nemeth
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-01 23:38 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel

On 01.10.2019 16:10, Stefan Monnier wrote:
>> Exactly! (note: I'm not talking about the whole API, I'm talking about
>> this "project files" part of the API).
> 
> We could use a `project--files` name to hint at the "internal" status.

All right, that deals with the objection above.

In my view, vc-call-backend's main purpose is to facilitate 
extensibility and being able to abstract away the current backend. Which 
is not a goal in the present situation.

If you really prefer it, though, we can switch to it anyway, but better 
do it when we can avoid having to support Git < 1.9 (then the fallback 
can be easily done in the caller).



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30  0:09                                       ` Dmitry Gutov
  2019-09-30  0:25                                         ` Stefan Monnier
  2019-10-01  8:11                                         ` Dmitry Gutov
@ 2019-10-03  7:41                                         ` Tassilo Horn
  2019-10-03 12:33                                           ` Dmitry Gutov
  2019-10-03 23:02                                         ` A project-files implementation for Git projects Dmitry Gutov
  3 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-10-03  7:41 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> I have to admit that I've lost track over what we're actually trying
>> to do exactly.  My main motivation was just to get a faster
>> project-files implementation for the Git repo I have at work.  I've
>> now fixed that on my end using a simple caching approach which is
>> good enough for me ATM.
>
> I'm attaching a working patch that will probably need tweaks
> later. I'm not sure how to handle attribution best.  If you like it,
> you can simply commit it under your name (up to now, I've mostly done
> the rearranging and some minor tweaks).

Feel free to take my code and arrange and extend it how it suits you
best.  This is Free Software, isn't it?

As you see, currently I'm reading this list maybe one a week and have
very little time to work on this feature, so if you push things forward
I'm happy to have helped and be able to enjoy the fruits of our labor
anytime soon. :-)

Now I'll test you latest patch and report back...

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-01  8:11                                         ` Dmitry Gutov
@ 2019-10-03  8:33                                           ` Tassilo Horn
  2019-10-03 13:19                                             ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-10-03  8:33 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

> Here's an updated patch.
>
> - project--vc-list-files now has '--' in its name.
>
> - Turns out that sometimes Git repositories include symlinks, and even broken
>  ones. Grep, subsequently, chokes on those with "file not found", so we now
> suppress all "not found" messages from Grep.
>
> diff --git a/lisp/progmodes/project.el b/lisp/progmodes/project.el
> index 4693d07fa8..4556c0bdb9 100644
> --- a/lisp/progmodes/project.el
> +++ b/lisp/progmodes/project.el
> @@ -277,6 +277,67 @@ project-try-vc
>       (funcall project-vc-external-roots-function)))
>     (project-roots project)))
>  
> +(cl-defmethod project-files ((project (head vc)) &optional dirs)
> +  (cl-mapcan
> +   (lambda (dir)
> +     (let (backend)
> +       (if (and (file-equal-p dir (cdr project))
> +                (setq backend (vc-responsible-backend dir))
> +                nil
                   ^^^

So this disables the VC operation.  I've removed it, and the speed
improvement is good here.  This is my test case (the Emacs repository):

--8<---------------cut here---------------start------------->8---
(let* ((dir "~/Repos/el/emacs")
       (p (project-current nil dir))
       f1 f2)
  (let ((t1 (benchmark-run 10
	      (setq f1 (project-files p))))
	(t2 (benchmark-run 10
	      (setq f2 (project--files-in-directory
			dir (project--dir-ignores p dir))))))
    (message "Files: %d (VC) vs. %d (find)" (length f1) (length f2))
    (message "VC) Elapsed time: %fs (%fs in %d GCs)"
	     (car t1) (nth 2 t1) (nth 1 t1))
    (message "Find) Elapsed time: %fs (%fs in %d GCs)"
	     (car t2) (nth 2 t2) (nth 1 t2)))
  (let ((d1 (cl-set-difference f1 f2 :test #'string=))
	(d2 (cl-set-difference f2 f1 :test #'string=)))
    (message "Files found by VC but not by find:")
    (dolist (f d1)
      (message "  %s" f))
    (message "Files found by find but not by VC:")
    (dolist (f d2)
      (message "  %s" f))))
--8<---------------cut here---------------end--------------->8---

Here is the output:

--8<---------------cut here---------------start------------->8---
VC) Elapsed time: 1.379560s (0.308720s in 6 GCs)
Find) Elapsed time: 4.397054s (0.200695s in 4 GCs)
Files found by VC but not by find:
  /home/horn/Repos/el/emacs/doc/lispintro/cons-1.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/cons-2.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/cons-2a.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/cons-3.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/cons-4.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/cons-5.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/drawers.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/lambda-1.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/lambda-2.pdf
  /home/horn/Repos/el/emacs/doc/lispintro/lambda-3.pdf
  /home/horn/Repos/el/emacs/etc/refcards/Makefile
  /home/horn/Repos/el/emacs/etc/refcards/gnus-logo.pdf
  /home/horn/Repos/el/emacs/lib/_Noreturn.h
  /home/horn/Repos/el/emacs/lib/stdalign.in.h
  /home/horn/Repos/el/emacs/lib/stddef.in.h
  /home/horn/Repos/el/emacs/lib/stdint.in.h
  /home/horn/Repos/el/emacs/lib/stdio-impl.h
  /home/horn/Repos/el/emacs/lib/stdio.in.h
  /home/horn/Repos/el/emacs/lib/stdlib.in.h
  /home/horn/Repos/el/emacs/m4/__inline.m4
  /home/horn/Repos/el/emacs/test/data/xdg/mimeinfo.cache
  /home/horn/Repos/el/emacs/test/lisp/progmodes/flymake-resources/Makefile
  /home/horn/Repos/el/emacs/test/manual/etags/Makefile
  /home/horn/Repos/el/emacs/test/manual/etags/make-src/Makefile
  /home/horn/Repos/el/emacs/test/manual/indent/Makefile
Files found by find but not by VC:
  /home/horn/Repos/el/emacs/aclocal.m4
  /home/horn/Repos/el/emacs/config.status
  /home/horn/Repos/el/emacs/configure
  /home/horn/Repos/el/emacs/info/dir
--8<---------------cut here---------------end--------------->8---

Then I did it on a clean checkout of the gtk repository and got this
result:

--8<---------------cut here---------------start------------->8---
Files: 4774 (VC) vs. 4774 (find)
VC) Elapsed time: 1.721054s (0.461112s in 9 GCs)
Find) Elapsed time: 0.634624s (0.152549s in 3 GCs)
Files found by VC but not by find:
Files found by find but not by VC:
nil
--8<---------------cut here---------------end--------------->8---

So here, Git has been much slower that find!

And again with gnulib:

--8<---------------cut here---------------start------------->8---
Files: 9936 (VC) vs. 9936 (find)
VC) Elapsed time: 3.444869s (0.902124s in 16 GCs)
Find) Elapsed time: 1.380269s (0.285082s in 5 GCs)
Files found by VC but not by find:
Files found by find but not by VC:
--8<---------------cut here---------------end--------------->8---

Again Git was slower.  What my gtk and gnulib repositories have in
common is that they are clean, i.e., no build artifacts which would be
matched by the exclude args passed to find...

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-01 23:38                                                     ` Dmitry Gutov
@ 2019-10-03  9:25                                                       ` Felician Nemeth
  2019-10-03 10:32                                                         ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Felician Nemeth @ 2019-10-03  9:25 UTC (permalink / raw)
  To: emacs-devel; +Cc: Stefan Monnier, Dmitry Gutov

>> We could use a `project--files` name to hint at the "internal" status.

Can you, please, consider keeping project-files public? 

I have an extension to eglot (the language server protocol client in GNU
ELPA) that relies on this feature.  I think project-files is useful in
general even if it's slow.

Thank you.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03  9:25                                                       ` Felician Nemeth
@ 2019-10-03 10:32                                                         ` Dmitry Gutov
  2019-10-03 11:15                                                           ` Felician Nemeth
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 10:32 UTC (permalink / raw)
  To: Felician Nemeth, emacs-devel; +Cc: Stefan Monnier

On 03.10.2019 12:25, Felician Nemeth wrote:
>>> We could use a `project--files` name to hint at the "internal" status.
> 
> Can you, please, consider keeping project-files public?

No need to worry, it's about something else, not the project-files method.

> I have an extension to eglot (the language server protocol client in GNU
> ELPA) that relies on this feature

Could you tell us what that extension is/does?



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 10:32                                                         ` Dmitry Gutov
@ 2019-10-03 11:15                                                           ` Felician Nemeth
  2019-10-03 12:31                                                             ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Felician Nemeth @ 2019-10-03 11:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel

>> Can you, please, consider keeping project-files public?
>
>No need to worry, it's about something else, not the project-files method.

Thank you.

>> I have an extension to eglot (the language server protocol client in GNU
>> ELPA) that relies on this feature
>
> Could you tell us what that extension is/does?

It implements the "files extension to LSP".  It basically allows the
client and the server to have separate file systems.  For example, the
server can run inside a Docker container, or the source code can be on a
remote system accessed by Tramp.

The specification is here:
https://github.com/sourcegraph/language-server-protocol/blob/master/extension-files.md

And my unfinished implementation is here:
https://github.com/nemethf/eglot/blob/xfiles-v2/eglot-x.el



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 11:15                                                           ` Felician Nemeth
@ 2019-10-03 12:31                                                             ` Dmitry Gutov
  2019-10-03 14:39                                                               ` Felician Nemeth
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 12:31 UTC (permalink / raw)
  To: Felician Nemeth; +Cc: Stefan Monnier, emacs-devel

On 03.10.2019 14:15, Felician Nemeth wrote:

>>> I have an extension to eglot (the language server protocol client in GNU
>>> ELPA) that relies on this feature
>>
>> Could you tell us what that extension is/does?
> 
> It implements the "files extension to LSP".  It basically allows the
> client and the server to have separate file systems.  For example, the
> server can run inside a Docker container, or the source code can be on a
> remote system accessed by Tramp.
> 
> The specification is here:
> https://github.com/sourcegraph/language-server-protocol/blob/master/extension-files.md
> 
> And my unfinished implementation is here:
> https://github.com/nemethf/eglot/blob/xfiles-v2/eglot-x.el

Interesting, thank you. Does it use a particular backend, or the default 
one as well (VC project)?



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03  7:41                                         ` Tassilo Horn
@ 2019-10-03 12:33                                           ` Dmitry Gutov
  2019-10-03 12:51                                             ` Tassilo Horn
  2019-10-04  5:52                                             ` Co-authoring and attribution in commit message (was: A project-files implementation for Git projects) Kévin Le Gouguec
  0 siblings, 2 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 12:33 UTC (permalink / raw)
  To: emacs-devel

Hi Tassilo,

On 03.10.2019 10:41, Tassilo Horn wrote:

>> I'm attaching a working patch that will probably need tweaks
>> later. I'm not sure how to handle attribution best.  If you like it,
>> you can simply commit it under your name (up to now, I've mostly done
>> the rearranging and some minor tweaks).
> 
> Feel free to take my code and arrange and extend it how it suits you
> best.  This is Free Software, isn't it?

Just talking about attribution. So if it's fine with you, I can mention 
you in the commit message somehow.

> As you see, currently I'm reading this list maybe one a week and have
> very little time to work on this feature, so if you push things forward
> I'm happy to have helped and be able to enjoy the fruits of our labor
> anytime soon. :-)

All right, thank you.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 12:33                                           ` Dmitry Gutov
@ 2019-10-03 12:51                                             ` Tassilo Horn
  2019-10-04  5:52                                             ` Co-authoring and attribution in commit message (was: A project-files implementation for Git projects) Kévin Le Gouguec
  1 sibling, 0 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-10-03 12:51 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> Feel free to take my code and arrange and extend it how it suits you
>> best.  This is Free Software, isn't it?
>
> Just talking about attribution. So if it's fine with you, I can
> mention you in the commit message somehow.

Of course, that would be decent of you. :-)

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03  8:33                                           ` Tassilo Horn
@ 2019-10-03 13:19                                             ` Dmitry Gutov
  2019-10-03 17:15                                               ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 13:19 UTC (permalink / raw)
  To: emacs-devel

On 03.10.2019 11:33, Tassilo Horn wrote:

>> +(cl-defmethod project-files ((project (head vc)) &optional dirs)
>> +  (cl-mapcan
>> +   (lambda (dir)
>> +     (let (backend)
>> +       (if (and (file-equal-p dir (cdr project))
>> +                (setq backend (vc-responsible-backend dir))
>> +                nil
>                     ^^^
> 
> So this disables the VC operation.  I've removed it, and the speed
> improvement is good here.  This is my test case (the Emacs repository):

Yes, sorry. Used this for comparative testing and forgot to take it out.

The Emacs repository is the one I've mostly tested on as well.

> --8<---------------cut here---------------start------------->8---
> (let* ((dir "~/Repos/el/emacs")
>         (p (project-current nil dir))
>         f1 f2)
>    (let ((t1 (benchmark-run 10
> 	      (setq f1 (project-files p))))
> 	(t2 (benchmark-run 10
> 	      (setq f2 (project--files-in-directory
> 			dir (project--dir-ignores p dir))))))
>      (message "Files: %d (VC) vs. %d (find)" (length f1) (length f2))
>      (message "VC) Elapsed time: %fs (%fs in %d GCs)"
> 	     (car t1) (nth 2 t1) (nth 1 t1))
>      (message "Find) Elapsed time: %fs (%fs in %d GCs)"
> 	     (car t2) (nth 2 t2) (nth 1 t2)))
>    (let ((d1 (cl-set-difference f1 f2 :test #'string=))
> 	(d2 (cl-set-difference f2 f1 :test #'string=)))
>      (message "Files found by VC but not by find:")
>      (dolist (f d1)
>        (message "  %s" f))
>      (message "Files found by find but not by VC:")
>      (dolist (f d2)
>        (message "  %s" f))))
> --8<---------------cut here---------------end--------------->8---
> 
> Here is the output:
> 
> --8<---------------cut here---------------start------------->8---
> VC) Elapsed time: 1.379560s (0.308720s in 6 GCs)
> Find) Elapsed time: 4.397054s (0.200695s in 4 GCs)
> Files found by VC but not by find:
>    /home/horn/Repos/el/emacs/doc/lispintro/cons-1.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/cons-2.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/cons-2a.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/cons-3.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/cons-4.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/cons-5.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/drawers.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/lambda-1.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/lambda-2.pdf
>    /home/horn/Repos/el/emacs/doc/lispintro/lambda-3.pdf
>    /home/horn/Repos/el/emacs/etc/refcards/Makefile
>    /home/horn/Repos/el/emacs/etc/refcards/gnus-logo.pdf
>    /home/horn/Repos/el/emacs/lib/_Noreturn.h
>    /home/horn/Repos/el/emacs/lib/stdalign.in.h
>    /home/horn/Repos/el/emacs/lib/stddef.in.h
>    /home/horn/Repos/el/emacs/lib/stdint.in.h
>    /home/horn/Repos/el/emacs/lib/stdio-impl.h
>    /home/horn/Repos/el/emacs/lib/stdio.in.h
>    /home/horn/Repos/el/emacs/lib/stdlib.in.h
>    /home/horn/Repos/el/emacs/m4/__inline.m4
>    /home/horn/Repos/el/emacs/test/data/xdg/mimeinfo.cache
>    /home/horn/Repos/el/emacs/test/lisp/progmodes/flymake-resources/Makefile
>    /home/horn/Repos/el/emacs/test/manual/etags/Makefile
>    /home/horn/Repos/el/emacs/test/manual/etags/make-src/Makefile
>    /home/horn/Repos/el/emacs/test/manual/indent/Makefile

The difference is that the 'find' based method does not support 
whitelist entries yet.

When it does, that might make its performance slightly worse, but 
probably not in gtk or gnulib repos.

> Files found by find but not by VC:
>    /home/horn/Repos/el/emacs/aclocal.m4
>    /home/horn/Repos/el/emacs/config.status
>    /home/horn/Repos/el/emacs/configure
>    /home/horn/Repos/el/emacs/info/dir
> --8<---------------cut here---------------end--------------->8---
> 
> Then I did it on a clean checkout of the gtk repository and got this
> result:
> 
> --8<---------------cut here---------------start------------->8---
> Files: 4774 (VC) vs. 4774 (find)
> VC) Elapsed time: 1.721054s (0.461112s in 9 GCs)
> Find) Elapsed time: 0.634624s (0.152549s in 3 GCs)
> Files found by VC but not by find:
> Files found by find but not by VC:
> nil
> --8<---------------cut here---------------end--------------->8---
> 
> So here, Git has been much slower that find!

Interesting! I haven't seen that result before, but it sounds plausible. 
IME it's ignore rules that make 'find' work slower. Git optimizes that 
logic somehow. So on projects that have few ignore rules 'find' could be 
faster.

I've also tried the gtk repo, and the performance ratio over here is the 
same, although in my case 'git ls-files' here is faster than 'git 
ls-files' in Emacs's repo (and 'find' is twice faster still).

> And again with gnulib:
> 
> --8<---------------cut here---------------start------------->8---
> Files: 9936 (VC) vs. 9936 (find)
> VC) Elapsed time: 3.444869s (0.902124s in 16 GCs)
> Find) Elapsed time: 1.380269s (0.285082s in 5 GCs)
> Files found by VC but not by find:
> Files found by find but not by VC:
> --8<---------------cut here---------------end--------------->8---
> 
> Again Git was slower.  What my gtk and gnulib repositories have in
> common is that they are clean, i.e., no build artifacts which would be
> matched by the exclude args passed to find...

gtk has only one .gitignore entry, gnulib has 8, but fairly simple ones.

So, what should we do here? Maybe:

1. Implement whitelist rules support for 'find'.

2. Add a defcustom project-vc-list-files-method? With a value 'auto' 
which would check the backend and Git version. Maybe the presence of 
'find' as well. Other possible values would be 'find' and 'vc'.

If you have time, could you compare the performance of 'find' and 'git 
ls-files' in the command line? Because when simply redirecting to a file 
I'm seeing a different result:

$ bash -c "time git ls-files >test"

real	0m0,011s
user	0m0,005s
sys	0m0,006s

$ bash -c "time find . >test2"

real	0m0,026s
user	0m0,008s
sys	0m0,018s

That could indicate some inefficiency in processing the output in Emacs.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 12:31                                                             ` Dmitry Gutov
@ 2019-10-03 14:39                                                               ` Felician Nemeth
  2019-10-03 14:42                                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Felician Nemeth @ 2019-10-03 14:39 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel

>> It implements the "files extension to LSP".  It basically allows the
>> client and the server to have separate file systems.  For example, the
>> server can run inside a Docker container, or the source code can be on a
>> remote system accessed by Tramp.
>>
>> The specification is here:
>> https://github.com/sourcegraph/language-server-protocol/blob/master/extension-files.md
>>
>> And my unfinished implementation is here:
>> https://github.com/nemethf/eglot/blob/xfiles-v2/eglot-x.el
>
> Interesting, thank you. Does it use a particular backend, or the
> default one as well (VC project)?

It only calls project-files, project-roots, and project-external-roots.
So, it doesn't know about backends and other implementation details.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 14:39                                                               ` Felician Nemeth
@ 2019-10-03 14:42                                                                 ` Dmitry Gutov
  2019-10-03 15:10                                                                   ` Felician Nemeth
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 14:42 UTC (permalink / raw)
  To: Felician Nemeth; +Cc: Stefan Monnier, emacs-devel

On 03.10.2019 17:39, Felician Nemeth wrote:
> It only calls project-files, project-roots, and project-external-roots.
> So, it doesn't know about backends and other implementation details.

But doesn't it rely in the current backend corresponding to the current 
LSP project? For instance, so that LSP server and the project agree on 
the list of project files.

In any case, I'm curious which backend you end up normally using. You 
can evaluate (project-current) to see that.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 14:42                                                                 ` Dmitry Gutov
@ 2019-10-03 15:10                                                                   ` Felician Nemeth
  2019-10-03 15:15                                                                     ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Felician Nemeth @ 2019-10-03 15:10 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Stefan Monnier, emacs-devel

>> It only calls project-files, project-roots, and project-external-roots.
>> So, it doesn't know about backends and other implementation details.
>
> But doesn't it rely in the current backend corresponding to the
> current LSP project? For instance, so that LSP server and the project
> agree on the list of project files.

If I understand the question correctly, then it does.  Eglot relies on
project.el to determine which files should be managed together by an LSP
server.  My extension uses the same project "instace" that eglot itself
uses in order to send the list of files in the project (and later their
contents) to the server.

> In any case, I'm curious which backend you end up normally using. You
> can evaluate (project-current) to see that.

Nowdays there's only one backend that I use: the vc backend with git.  I
tried EDE, but abandoned it.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 15:10                                                                   ` Felician Nemeth
@ 2019-10-03 15:15                                                                     ` Dmitry Gutov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 15:15 UTC (permalink / raw)
  To: Felician Nemeth; +Cc: Stefan Monnier, emacs-devel

On 03.10.2019 18:10, Felician Nemeth wrote:
> If I understand the question correctly, then it does.  Eglot relies on
> project.el to determine which files should be managed together by an LSP
> server.  My extension uses the same project "instace" that eglot itself
> uses in order to send the list of files in the project (and later their
> contents) to the server.

 > Nowdays there's only one backend that I use: the vc backend with git.

Interesting. Thank you.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 13:19                                             ` Dmitry Gutov
@ 2019-10-03 17:15                                               ` Tassilo Horn
  2019-10-03 22:49                                                 ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-10-03 17:15 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> If you have time, could you compare the performance of 'find' and 'git
> ls-files' in the command line? Because when simply redirecting to a
> file I'm seeing a different result:
>
> $ bash -c "time git ls-files >test"
>
> real	0m0,011s
> user	0m0,005s
> sys	0m0,006s
>
> $ bash -c "time find . >test2"
>
> real	0m0,026s
> user	0m0,008s
> sys	0m0,018s
>
> That could indicate some inefficiency in processing the output in
> Emacs.

I just tried with the gcc repository with its about 100000 files.  Here
it was about equal with ~30secs for both git and find.  The git ls-files
invocation on the command line with output piped to /dev/null is done in
one tenth of a second.

Oh, when dropping the `expand-file-name' call we're doing on every file,
it's ten times faster (~3secs).

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 17:15                                               ` Tassilo Horn
@ 2019-10-03 22:49                                                 ` Dmitry Gutov
  2019-10-04  7:47                                                   ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 22:49 UTC (permalink / raw)
  To: emacs-devel

On 03.10.2019 20:15, Tassilo Horn wrote:

> I just tried with the gcc repository with its about 100000 files.  Here
> it was about equal with ~30secs for both git and find.  The git ls-files
> invocation on the command line with output piped to /dev/null is done in
> one tenth of a second.
> 
> Oh, when dropping the `expand-file-name' call we're doing on every file,
> it's ten times faster (~3secs).

Nice observation, thanks. It's a significant per-item cost, so no 
surprise projects with lots of files are (were) doing worse than 'find'.

I've changed it to 'concat' now which isn't free, but much faster.

I wish someone did some benchmarking with Hg projects. mozilla-central 
(~200000 files) is still slower on my machine with Hg than with 'find'.

We still don't support regexp rules with 'find', but that's probably 
doesn't affect the speed in this example, and it's a reason to prefer Hg 
for listing.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-09-30  0:09                                       ` Dmitry Gutov
                                                           ` (2 preceding siblings ...)
  2019-10-03  7:41                                         ` Tassilo Horn
@ 2019-10-03 23:02                                         ` Dmitry Gutov
  3 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-03 23:02 UTC (permalink / raw)
  To: emacs-devel

On 30.09.2019 3:09, Dmitry Gutov wrote:
> 
> Still, there could be a performance problem with outputting all ignored 
> files and then parsing out only a part of them. Could you try the new 
> feature with a test repository from 
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22481?

I'm pushing the code now, but it would be nice if someone looked into 
this probability.

I'd hate to make life more difficult for some Mercurial users with no 
good reason.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Co-authoring and attribution in commit message (was: A project-files implementation for Git projects)
  2019-10-03 12:33                                           ` Dmitry Gutov
  2019-10-03 12:51                                             ` Tassilo Horn
@ 2019-10-04  5:52                                             ` Kévin Le Gouguec
  2019-10-04  8:33                                               ` Co-authoring and attribution in commit message Dmitry Gutov
  1 sibling, 1 reply; 94+ messages in thread
From: Kévin Le Gouguec @ 2019-10-04  5:52 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

> Just talking about attribution. So if it's fine with you, I can
> mention you in the commit message somehow.

Out of curiosity (apologies for meddling in), would this be the kind of
situation where sticking a "Co-authored-by" makes sense?
(cf. CONTRIBUTE)

Terse and robotic as it may sound, I guess the convention allows anyone
(or any tool) interested in attribution to find it somewhat
deterministically…

(This prompted me to check whether any tool out there actually
recognizes this tag; apparently GitHub's log UI does[1].)

(Also, Magit just gained a new function to insert those easily[2]; no
default binding yet, unfortunately.)


[1]: https://github.blog/2018-01-29-commit-together-with-co-authors/
[2]: https://github.com/magit/magit/commit/3add5310ba98b74bb3e7fb82dd259713a0c6606c



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-03 22:49                                                 ` Dmitry Gutov
@ 2019-10-04  7:47                                                   ` Tassilo Horn
  2019-10-04  7:58                                                     ` Tassilo Horn
                                                                       ` (3 more replies)
  0 siblings, 4 replies; 94+ messages in thread
From: Tassilo Horn @ 2019-10-04  7:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

Hi Dmitry,

>> I just tried with the gcc repository with its about 100000 files.
>> Here it was about equal with ~30secs for both git and find.  The git
>> ls-files invocation on the command line with output piped to
>> /dev/null is done in one tenth of a second.
>> Oh, when dropping the `expand-file-name' call we're doing on every
>> file, it's ten times faster (~3secs).
>
> Nice observation, thanks. It's a significant per-item cost, so no
> surprise projects with lots of files are (were) doing worse than
> 'find'.
>
> I've changed it to 'concat' now which isn't free, but much faster.

Yes, that's much better/faster.  But shouldn't we use

  (let ((default-directory (file-name-as-directory dir))
        ...

in order to ensure that default-directory always has a trailing / and
concat DTRT?

By the way, in my major use-case for `project-files' (a source to raven)
I want the files to be relative to the given directory.  So I'm calling
`file-relative-name' on every file in the result.  It would be cool if
there was a way to return the file names relative to the given directory
(which wouldn't make much sense if multiple dirs are given, though).

> I wish someone did some benchmarking with Hg projects.
> mozilla-central (~200000 files) is still slower on my machine with Hg
> than with 'find'.

I'll do.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04  7:47                                                   ` Tassilo Horn
@ 2019-10-04  7:58                                                     ` Tassilo Horn
  2019-10-04 13:16                                                       ` Dmitry Gutov
  2019-10-04  8:49                                                     ` Tassilo Horn
                                                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-10-04  7:58 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Tassilo Horn <tsdh@gnu.org> writes:

> By the way, in my major use-case for `project-files' (a source to
> raven) I want the files to be relative to the given directory.  So I'm
> calling `file-relative-name' on every file in the result.  It would be
> cool if there was a way to return the file names relative to the given
> directory (which wouldn't make much sense if multiple dirs are given,
> though).

Oh wait, forget that.  What I really would like to have is a way to get
*all* project files but relative to the given directory.  Right now, if
I specify a project subdirectory, I only get the files in there which is
sensible, of course.  So I have to call `project-files' for the project
given that "git ls-files <subdir>" only returns the files in <subdir>
and then call `file-relative-name' on the results.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: Co-authoring and attribution in commit message
  2019-10-04  5:52                                             ` Co-authoring and attribution in commit message (was: A project-files implementation for Git projects) Kévin Le Gouguec
@ 2019-10-04  8:33                                               ` Dmitry Gutov
  2019-10-04 21:36                                                 ` Karl Fogel
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-04  8:33 UTC (permalink / raw)
  To: Kévin Le Gouguec; +Cc: emacs-devel

On 04.10.2019 8:52, Kévin Le Gouguec wrote:
> Dmitry Gutov <dgutov@yandex.ru> writes:
> 
>> Just talking about attribution. So if it's fine with you, I can
>> mention you in the commit message somehow.
> 
> Out of curiosity (apologies for meddling in), would this be the kind of
> situation where sticking a "Co-authored-by" makes sense?
> (cf. CONTRIBUTE)

It really would. If the commit was not already pushed, I'd have used 
your advice, thank you.

> Terse and robotic as it may sound, I guess the convention allows anyone
> (or any tool) interested in attribution to find it somewhat
> deterministically…
> 
> (This prompted me to check whether any tool out there actually
> recognizes this tag; apparently GitHub's log UI does[1].)

Cool. I didn't know that.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04  7:47                                                   ` Tassilo Horn
  2019-10-04  7:58                                                     ` Tassilo Horn
@ 2019-10-04  8:49                                                     ` Tassilo Horn
  2019-10-04 12:57                                                       ` Dmitry Gutov
  2019-10-04 12:16                                                     ` Stefan Monnier
  2019-10-04 13:08                                                     ` Dmitry Gutov
  3 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-10-04  8:49 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Tassilo Horn <tsdh@gnu.org> writes:

>> I wish someone did some benchmarking with Hg projects.
>> mozilla-central (~200000 files) is still slower on my machine with Hg
>> than with 'find'.
>
> I'll do.

On my machine, the Hg version is much faster than find:

Files: 283417 (VC) vs. 283497 (find)
VC) Elapsed time: 4.714386s (0.782360s in 12 GCs)
Find) Elapsed time: 18.569516s (0.393230s in 5 GCs)

I downloaded an archive of mozilla-central and performed "hg init && hg
add * && hg commit -m foo" myself.  The hg status --all call on a
terminal takes about 2.8 seconds itself.

My hg version is 5.1.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04  7:47                                                   ` Tassilo Horn
  2019-10-04  7:58                                                     ` Tassilo Horn
  2019-10-04  8:49                                                     ` Tassilo Horn
@ 2019-10-04 12:16                                                     ` Stefan Monnier
  2019-10-04 13:08                                                     ` Dmitry Gutov
  3 siblings, 0 replies; 94+ messages in thread
From: Stefan Monnier @ 2019-10-04 12:16 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

> Yes, that's much better/faster.  But shouldn't we use
>
>   (let ((default-directory (file-name-as-directory dir))
>         ...
>
> in order to ensure that default-directory always has a trailing / and

Yes, `default-directory` should *always* have a trailing slash (as
mentioned in its docstring).


        Stefan




^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04  8:49                                                     ` Tassilo Horn
@ 2019-10-04 12:57                                                       ` Dmitry Gutov
  2019-10-04 13:59                                                         ` Tassilo Horn
  0 siblings, 1 reply; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-04 12:57 UTC (permalink / raw)
  To: emacs-devel

On 04.10.2019 11:49, Tassilo Horn wrote:
> Tassilo Horn <tsdh@gnu.org> writes:
> 
>>> I wish someone did some benchmarking with Hg projects.
>>> mozilla-central (~200000 files) is still slower on my machine with Hg
>>> than with 'find'.
>>
>> I'll do.
> 
> On my machine, the Hg version is much faster than find:
> 
> Files: 283417 (VC) vs. 283497 (find)
> VC) Elapsed time: 4.714386s (0.782360s in 12 GCs)
> Find) Elapsed time: 18.569516s (0.393230s in 5 GCs)
> 
> I downloaded an archive of mozilla-central and performed "hg init && hg
> add * && hg commit -m foo" myself.  The hg status --all call on a
> terminal takes about 2.8 seconds itself.
> 
> My hg version is 5.1.

That's great. My Hg version is 4.8.2, but maybe they had done some 
performance optimization in the new release. I have a "proper" Hg 
checkout, though, with full history. Maybe that also has an effect.

In any case, I just pushed an optimized implementation for Hg. Please 
try it out as well when you have time. On my machine it's ~equal with 
'find' based implementation now (roughly 7 seconds).

(project--files-in-directory default-directory nil) completes in 2.6s, 
though.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04  7:47                                                   ` Tassilo Horn
                                                                       ` (2 preceding siblings ...)
  2019-10-04 12:16                                                     ` Stefan Monnier
@ 2019-10-04 13:08                                                     ` Dmitry Gutov
  3 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-04 13:08 UTC (permalink / raw)
  To: emacs-devel

On 04.10.2019 10:47, Tassilo Horn wrote:

> By the way, in my major use-case for `project-files' (a source to raven)
> I want the files to be relative to the given directory.  So I'm calling
> `file-relative-name' on every file in the result.  It would be cool if
> there was a way to return the file names relative to the given directory
> (which wouldn't make much sense if multiple dirs are given, though).

I considered it, even if just for performance optimization, but having 
multiple roots makes it a problem.

We should deprecate the "multiple roots" thing, actually, but external 
roots will stay with us, I think (they have their uses). Some larger 
rework of the API (and of external roots' representation) might make it 
viable, though.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04  7:58                                                     ` Tassilo Horn
@ 2019-10-04 13:16                                                       ` Dmitry Gutov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-04 13:16 UTC (permalink / raw)
  To: emacs-devel

On 04.10.2019 10:58, Tassilo Horn wrote:

> Oh wait, forget that.  What I really would like to have is a way to get
> *all* project files but relative to the given directory.  Right now, if
> I specify a project subdirectory, I only get the files in there which is
> sensible, of course.  So I have to call `project-files' for the project
> given that "git ls-files <subdir>" only returns the files in <subdir>
> and then call `file-relative-name' on the results.

That's, um... we don't actually guarantee that project-files will work 
when passed anything other than one of the project roots. But as long as 
it works for you, good.

In any case, your suggested return format is probably a no-go. The DIRS 
argument is to specify the set of files to be returned, we really can't 
interpret it in a conflicting way.

The good news is that you still can use file-relative-name even if the 
first argument is relative already (but to default-directory):

 > (file-relative-name "foo/bar" "/tmp")

=> "../home/dgutov/vc/emacs-master/lisp/progmodes/foo/bar"

But alas, file-relative-name is not free.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04 12:57                                                       ` Dmitry Gutov
@ 2019-10-04 13:59                                                         ` Tassilo Horn
  2019-10-04 15:24                                                           ` Dmitry Gutov
  0 siblings, 1 reply; 94+ messages in thread
From: Tassilo Horn @ 2019-10-04 13:59 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel

Dmitry Gutov <dgutov@yandex.ru> writes:

>> On my machine, the Hg version is much faster than find:
>> Files: 283417 (VC) vs. 283497 (find)
>> VC) Elapsed time: 4.714386s (0.782360s in 12 GCs)
>> Find) Elapsed time: 18.569516s (0.393230s in 5 GCs)
>> I downloaded an archive of mozilla-central and performed "hg init && hg
>> add * && hg commit -m foo" myself.  The hg status --all call on a
>> terminal takes about 2.8 seconds itself.
>> My hg version is 5.1.
>
> That's great. My Hg version is 4.8.2, but maybe they had done some
> performance optimization in the new release. I have a "proper" Hg
> checkout, though, with full history. Maybe that also has an effect.
>
> In any case, I just pushed an optimized implementation for Hg. Please
> try it out as well when you have time. On my machine it's ~equal with
> 'find' based implementation now (roughly 7 seconds).

I get about the same time as before.

Files: 283417 (VC) vs. 283497 (find)
VC) Elapsed time: 4.525872s (0.620848s in 11 GCs)
Find) Elapsed time: 18.480051s (0.268073s in 4 GCs)


> (project--files-in-directory default-directory nil) completes in 2.6s,
> though.

Hm, 13 seconds on the first run and then about 3.7 seconds on each
subsequent run.

Bye,
Tassilo



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: A project-files implementation for Git projects
  2019-10-04 13:59                                                         ` Tassilo Horn
@ 2019-10-04 15:24                                                           ` Dmitry Gutov
  0 siblings, 0 replies; 94+ messages in thread
From: Dmitry Gutov @ 2019-10-04 15:24 UTC (permalink / raw)
  To: emacs-devel

On 04.10.2019 16:59, Tassilo Horn wrote:

>> In any case, I just pushed an optimized implementation for Hg. Please
>> try it out as well when you have time. On my machine it's ~equal with
>> 'find' based implementation now (roughly 7 seconds).
> 
> I get about the same time as before.
> 
> Files: 283417 (VC) vs. 283497 (find)
> VC) Elapsed time: 4.525872s (0.620848s in 11 GCs)
> Find) Elapsed time: 18.480051s (0.268073s in 4 GCs)

Curious.

The performance of 'hg status --all' and 'hg status --all --no-status' 
is significantly different on my machine (the latter is almost 2x faster).

Maybe the difference is part of post-4.8.2 improvements.

>> (project--files-in-directory default-directory nil) completes in 2.6s,
>> though.
> 
> Hm, 13 seconds on the first run and then about 3.7 seconds on each
> subsequent run.

Right, I only mention the "warm" numbers. The result is par for the 
course, the laptop I'm typing this on is pretty new, so the disk 
performance should be better than most our users will see for a while.



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: Co-authoring and attribution in commit message
  2019-10-04  8:33                                               ` Co-authoring and attribution in commit message Dmitry Gutov
@ 2019-10-04 21:36                                                 ` Karl Fogel
  2019-10-05  6:55                                                   ` Eli Zaretskii
  0 siblings, 1 reply; 94+ messages in thread
From: Karl Fogel @ 2019-10-04 21:36 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: emacs-devel, Kévin Le Gouguec

On 04 Oct 2019, Dmitry Gutov wrote:
>On 04.10.2019 8:52, Kévin Le Gouguec wrote:
>> Dmitry Gutov <dgutov@yandex.ru> writes:
>> 
>>> Just talking about attribution. So if it's fine with you, I can
>>> mention you in the commit message somehow.
>> Out of curiosity (apologies for meddling in), would this be the kind
>> of
>> situation where sticking a "Co-authored-by" makes sense?
>> (cf. CONTRIBUTE)
>
>It really would. If the commit was not already pushed, I'd have used
>your advice, thank you.

By the way, there are starting to be some standards for including these kinds of parseable facts in commit messages.

One place to look (and possibly to contribute a suggestion to) is https://www.conventionalcommits.org/.  Another, older standard is at http://subversion.apache.org/docs/community-guide/conventions.html#crediting, though I think it's only used in that one project.  The latter one looks like it's more directly relevant to this use case than Conventional Commits is, as it currently supports these fields:

   Patch by:
   Suggested by:
   Found by:
   Review by:
   Tested by:

"Co-authored-by:" is probably similar to "Patch by:".  Now, I think it still makes sense for us to use "Co-authored-by:", given that some (admittedly non-free-software) tools already parse it [1], but I wanted to make sure that these parallel efforts are known, so that we can learn whatever is available to be learned from them as we're setting a convention.

Best regards,
-Karl

[1] https://github.blog/2018-01-29-commit-together-with-co-authors/, as Kévin pointed out earlier



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: Co-authoring and attribution in commit message
  2019-10-04 21:36                                                 ` Karl Fogel
@ 2019-10-05  6:55                                                   ` Eli Zaretskii
  0 siblings, 0 replies; 94+ messages in thread
From: Eli Zaretskii @ 2019-10-05  6:55 UTC (permalink / raw)
  To: Karl Fogel; +Cc: emacs-devel, kevin.legouguec, dgutov

> From: Karl Fogel <kfogel@red-bean.com>
> Date: Fri, 04 Oct 2019 16:36:21 -0500
> Cc: emacs-devel@gnu.org,
>  Kévin Le Gouguec <kevin.legouguec@gmail.com>
> 
> One place to look (and possibly to contribute a suggestion to) is https://www.conventionalcommits.org/.  Another, older standard is at http://subversion.apache.org/docs/community-guide/conventions.html#crediting, though I think it's only used in that one project.  The latter one looks like it's more directly relevant to this use case than Conventional Commits is, as it currently supports these fields:
> 
>    Patch by:
>    Suggested by:
>    Found by:
>    Review by:
>    Tested by:

Emacs ChangeLog-related modes recognize the attributions via the
following regexp:

 "\\(^\\( +\\|\t\\)\\|  \\)\\(Thanks to\\|Patch\\(es\\)? by\\|Report\\(ed by\\| from\\)\\|Suggest\\(ed by\\|ion from\\)\\)

> "Co-authored-by:" is probably similar to "Patch by:".  Now, I think it still makes sense for us to use "Co-authored-by:", given that some (admittedly non-free-software) tools already parse it [1], but I wanted to make sure that these parallel efforts are known, so that we can learn whatever is available to be learned from them as we're setting a convention.

I think this should be part of the GNU Coding Standards (GCS), and
thus I suggest to discuss it on the bug-standards mailing list, not
here.  IMO, it isn't right for a single project to adopt such
standards, this should be in GNU-common conventions and guidelines.



^ permalink raw reply	[flat|nested] 94+ messages in thread

end of thread, other threads:[~2019-10-05  6:55 UTC | newest]

Thread overview: 94+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-06  9:19 A project-files implementation for Git projects Tassilo Horn
2019-09-06 12:52 ` Stefan Monnier
2019-09-10  6:25   ` Tassilo Horn
2019-09-10 12:56     ` Stefan Monnier
2019-09-10 13:39       ` Tassilo Horn
2019-09-10 13:56         ` Stefan Monnier
2019-09-11 11:00           ` Tassilo Horn
2019-09-11 20:01             ` Tassilo Horn
2019-09-13 20:38               ` Tassilo Horn
2019-09-14  0:29               ` Dmitry Gutov
2019-09-14 16:26                 ` Tassilo Horn
2019-09-15 18:56                   ` Dmitry Gutov
2019-09-16  2:27                     ` Eli Zaretskii
2019-09-16  3:36                       ` Dmitry Gutov
2019-09-16 15:25                         ` Eli Zaretskii
2019-09-17 10:46                           ` Dmitry Gutov
2019-09-17 12:03                             ` Eli Zaretskii
2019-09-17 12:55                               ` Dmitry Gutov
2019-09-17 13:14                                 ` Eli Zaretskii
2019-09-19 15:33                                   ` Dmitry Gutov
2019-09-19 17:29                                     ` Eli Zaretskii
2019-09-20 11:25                                       ` Dmitry Gutov
2019-09-20 12:59                                         ` Eli Zaretskii
2019-09-20 13:28                                           ` Dmitry Gutov
2019-09-20 13:45                                             ` Stefan Monnier
2019-09-20 13:54                                               ` Dmitry Gutov
2019-09-20 14:12                                                 ` Michael Albinus
2019-09-20 14:30                                                   ` Eli Zaretskii
2019-09-20 14:51                                                     ` Dmitry Gutov
2019-09-20 15:04                                                       ` Michael Albinus
2019-09-22  9:23                                                         ` Dmitry Gutov
2019-09-20 14:55                                                     ` Michael Albinus
2019-09-20 15:55                                                       ` Eli Zaretskii
2019-09-20 15:01                                                 ` Stefan Monnier
2019-09-20 15:59                                                   ` Eli Zaretskii
2019-09-20 17:32                                                     ` Stefan Monnier
2019-09-20 17:49                                                       ` Eli Zaretskii
2019-09-20 18:04                                                         ` Stefan Monnier
2019-09-20 14:23                                             ` Eli Zaretskii
2019-09-20 14:48                                               ` Dmitry Gutov
2019-09-16 13:32                     ` Tassilo Horn
2019-09-17 11:06                       ` Dmitry Gutov
2019-09-18 17:15                         ` Tassilo Horn
2019-09-19 16:01                           ` Dmitry Gutov
2019-09-22  8:56                             ` Tassilo Horn
2019-09-22  9:37                               ` Dmitry Gutov
2019-09-23  7:42                                 ` Tassilo Horn
2019-09-23 12:22                                   ` Dmitry Gutov
2019-09-27 16:17                                     ` Tassilo Horn
2019-09-30  0:09                                       ` Dmitry Gutov
2019-09-30  0:25                                         ` Stefan Monnier
2019-09-30  6:50                                           ` Dmitry Gutov
2019-09-30 17:09                                             ` Stefan Monnier
2019-10-01  8:19                                               ` Dmitry Gutov
2019-10-01 12:31                                                 ` Stefan Monnier
2019-10-01 13:10                                                   ` Stefan Monnier
2019-10-01 23:38                                                     ` Dmitry Gutov
2019-10-03  9:25                                                       ` Felician Nemeth
2019-10-03 10:32                                                         ` Dmitry Gutov
2019-10-03 11:15                                                           ` Felician Nemeth
2019-10-03 12:31                                                             ` Dmitry Gutov
2019-10-03 14:39                                                               ` Felician Nemeth
2019-10-03 14:42                                                                 ` Dmitry Gutov
2019-10-03 15:10                                                                   ` Felician Nemeth
2019-10-03 15:15                                                                     ` Dmitry Gutov
2019-10-01  8:11                                         ` Dmitry Gutov
2019-10-03  8:33                                           ` Tassilo Horn
2019-10-03 13:19                                             ` Dmitry Gutov
2019-10-03 17:15                                               ` Tassilo Horn
2019-10-03 22:49                                                 ` Dmitry Gutov
2019-10-04  7:47                                                   ` Tassilo Horn
2019-10-04  7:58                                                     ` Tassilo Horn
2019-10-04 13:16                                                       ` Dmitry Gutov
2019-10-04  8:49                                                     ` Tassilo Horn
2019-10-04 12:57                                                       ` Dmitry Gutov
2019-10-04 13:59                                                         ` Tassilo Horn
2019-10-04 15:24                                                           ` Dmitry Gutov
2019-10-04 12:16                                                     ` Stefan Monnier
2019-10-04 13:08                                                     ` Dmitry Gutov
2019-10-03  7:41                                         ` Tassilo Horn
2019-10-03 12:33                                           ` Dmitry Gutov
2019-10-03 12:51                                             ` Tassilo Horn
2019-10-04  5:52                                             ` Co-authoring and attribution in commit message (was: A project-files implementation for Git projects) Kévin Le Gouguec
2019-10-04  8:33                                               ` Co-authoring and attribution in commit message Dmitry Gutov
2019-10-04 21:36                                                 ` Karl Fogel
2019-10-05  6:55                                                   ` Eli Zaretskii
2019-10-03 23:02                                         ` A project-files implementation for Git projects Dmitry Gutov
2019-09-14  0:33             ` Dmitry Gutov
2019-09-14 16:43               ` Tassilo Horn
2019-09-15  8:29                 ` Dmitry Gutov
2019-09-15  9:06                   ` Dmitry Gutov
2019-09-10 13:57         ` Robert Pluim
2019-09-10 14:24         ` Dmitry Gutov
2019-09-10 14:41     ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).