As some food for thought, we've used git attributes (read via "git check-attr") to record custom metadata at the repository level as hints to tooling. As project.el and other tools evolve, perhaps we can consider letting users supply functions that influence behavior, with git attributes as one possible input to such functions. The case at hand is whether a submodule should be considered external to a project.el project, such as the vendor/third-party submodule I'd mentioned. We use the default project-vc-merge-submodules t, and our exceptions could perhaps be encoded via a custom git attribute and reported by a predicate function. In the absence of formal guidelines (git does not seem to define "reserved words" for attribute names), we prefix custom attributes with an underscore.
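
To make the idea concrete, here is a rough sketch of what such a predicate might look like. The attribute name "_project-external" and all of the my/ function names are made up for illustration, and project.el has no hook today that would call a predicate like this; only "git check-attr" and its "path: attribute: value" output format are real.

```elisp
;; Sketch only: "_project-external" and the my/ names are hypothetical.

(defun my/git-attr-parse (line)
  "Return the value field from LINE of `git check-attr' output.
LINE looks like \"path: attribute: value\"; the value is the last field."
  (when (string-match ": \\([^ ]+\\)\\'" line)
    (match-string 1 line)))

(defun my/git-attr-value (path attr)
  "Ask git for the value of attribute ATTR on PATH.
Returns a string such as \"set\", \"unset\" or \"unspecified\".
Signals an error if git exits with a non-zero status."
  (let* ((path (directory-file-name (expand-file-name path)))
         ;; Run git from the parent directory so the path resolves
         ;; inside the enclosing repository.
         (default-directory (file-name-directory path))
         (line (car (process-lines "git" "check-attr" attr "--"
                                   (file-name-nondirectory path)))))
    (and line (my/git-attr-parse line))))

(defun my/submodule-external-p (dir)
  "Return non-nil if DIR carries the hypothetical _project-external attribute."
  (equal (my/git-attr-value dir "_project-external") "set"))
```

With a .gitattributes line such as "vendor/third-party _project-external" in the superproject, (my/submodule-external-p "vendor/third-party") would then return t, and nil for submodules without the attribute.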

On Tue, Aug 13, 2024 at 9:31 AM Ship Mints <shipmints@gmail.com> wrote:
Making users aware is a good step, and invalidating the cache via the project-forget methods is a good idea. I'd also offer a function that can be called directly to invalidate the cache programmatically.

vc caching, longer term, may need to consider a few more complex use cases such as git repos with both submodules that are considered extensions of the base project and submodules which are not. A concrete example I see often is a "mono repo" structure with core server and library code but with web and mobile front end code in submodules that are treated as part of the project proper BUT with submodules for vendor/third-party code that are not. A question here would be which parts of the tree belong to which cached vc root.

Another use case I see is working on many unrelated projects/repos across a variety of clients, all in the same Emacs session and with perhaps 100+ buffers/files open (as I pretty much have right now). In that situation, is a 17-element cache sufficient? Should the cache be keyed on parent directories rather than on file names? With files, it gets full fast. Mine is full right now with files, most of which share the same repo root and some of which don't. I have wondered whether this would be better implemented as directory variables. Cache invalidation, absent timestamps on .dir-locals.el files, remains the same problem, but directory-variable treatment might be more natural to Emacs users?
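
To illustrate the directory-keyed idea, here is a minimal sketch that stores one entry per known root and answers lookups by walking up the parent directories of a file. It is not how vc or project.el cache anything today; the variable and function names are hypothetical.

```elisp
;; Sketch only: a directory-keyed cache, one entry per known root.

(defvar my/project-root-cache (make-hash-table :test #'equal)
  "Hypothetical cache mapping directory names to project roots.")

(defun my/project-root-cache-put (root)
  "Record ROOT as a known project root and return its normalized name."
  (let ((root (file-name-as-directory (expand-file-name root))))
    (puthash root root my/project-root-cache)))

(defun my/project-root-cache-get (file)
  "Return the nearest cached root above FILE, or nil if none is cached."
  (let ((dir (file-name-directory (expand-file-name file)))
        found)
    (while (and dir (not found))
      (setq found (gethash dir my/project-root-cache))
      (unless found
        (let ((parent (file-name-directory (directory-file-name dir))))
          ;; Stop once we reach the filesystem root.
          (setq dir (unless (equal parent dir) parent)))))
    found))
```

With this layout, a hundred open files under the same repo cost one entry, and the nearest cached ancestor wins, which is where the question of submodule ownership would have to be decided.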

-Stephane

On Mon, Aug 12, 2024 at 9:43 PM Dmitry Gutov <dmitry@gutov.dev> wrote:
Hi!

On 05/08/2024 20:18, Ship Mints wrote:
> (vc-file-setprop dir 'project-vc project) in project-try-vc. There is no
> facility, public API or private, to clear the cache en-masse. One could
> reset the cache via clearing the vector vc-file-prop-obarray
> (setq vc-file-prop-obarray (make-vector 17 0)) in the absence of an API.
> You can observe what's in your vc-file-prop-obarray for yourself before
> taking this action.

That's right. One step toward that goal would be moving the cache to
some other data structure - possibly a tree-like one, to also be able to
short-circuit the upward directory searches.

Cache invalidation is a sore point, though: the directory tree can
change behind the scenes outside Emacs, so unless the caching is
disabled the other complete solutions would rely on something like
filenotify.

OT2H if we're okay with supporting only manual clears e.g. using 'M-x
project-forget-project' or 'M-x project-forget-projects-under', that
could be implemented easily enough. The current vc-file-prop-obarray
structure could be refreshed with a full scan.
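
As a rough sketch of what such a full scan might look like (the function name is hypothetical, and this leans on the internal detail that vc-file-setprop stores properties on symbols interned in vc-file-prop-obarray under the expanded file name, which could change between Emacs versions):

```elisp
;; Sketch only: relies on vc-hooks internals, not on a public API.
(require 'vc-hooks)

(defun my/project-vc-cache-clear-under (dir)
  "Drop cached `project-vc' entries for DIR and everything below it.
Return the number of entries cleared."
  (let ((root (file-name-as-directory (expand-file-name dir)))
        (count 0))
    (mapatoms
     (lambda (sym)
       ;; Each symbol's name is a file or directory name.
       (when (and (string-prefix-p
                   root (file-name-as-directory (symbol-name sym)))
                  (get sym 'project-vc))
         (put sym 'project-vc nil)
         (setq count (1+ count))))
     vc-file-prop-obarray)
    count))
```

Something like this could be called from the project-forget commands, leaving the other per-file vc properties alone.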