On 4/14/2023 10:45 PM, Jim Porter wrote: > Here are a couple small fixes after testing against the 2000 latest > commits (though I haven't examined *all* the errors in detail). Ok, I've now gone over the 5000 latest commits in some closer detail. Here are the main classes of error I see: 1. Commit message lines starting with "*" that aren't file entries. 2. Commit messages from Eglot's import (these aren't *really* a problem, and I think authors.el is smart enough to handle them; if something like this happened in future, we could just use "git push --no-verify" to ignore this hook). 3. Obviously-wrong typos of file names (including files listed in the commit message that exist, but aren't in the diff). I think erroring out on all three of these cases makes sense. Number 2 isn't truly an error, but it would be hard to detect that reliably in the hook (at least not without adding a list of exceptions), and it's rare enough that we can just use "--no-verify" as needed (or fix up the commit messages during the import). There are a few cases that are more interesting though: A. Wildcards in filenames ------------------------- 626f2f7441, 623db40dd1, 4b5414abbc, c32212bf96, f4bfe7834a, 23c1ee6989, 7e185bc9ba, 36474a1e49, 5fb8b20fa3, c45bfd3c4a, fcc7373b28, 8e9835c673, 952550258d, d1a7d16f8e, dc083ebc4e, 5f74397490, 0f85f2c0e5, 49bad2a0a6, 5a77517e7d, 24b9515da0, ae3904bb5d, 228d9d615d, 8461cfc8fc, e402887d5d, ab7dddea90, 9e4f11a163, 6342264ef7 These cases use wildcards in the file names. I think we should avoid doing this, for the same reason the GNU ChangeLog standards say to avoid doing it for function names: "If you mention the names of the modified functions or variables, it’s important to name them in full. ... Subsequent maintainers will often search for a function name to find all the change log entries that pertain to it; if you abbreviate the name, they won’t find it when they search."[1] Therefore, I didn't do anything about these in the latest revision of the hook: they're still errors. For committers that don't want to list every file explicitly, they could just add an entry for the directory, *without* any wildcards. B. File entry for a directory, using a trailing "/" --------------------------------------------------- 9b80fe55f9, 28d0654943, d9c94e93b7, 7d0dc31833, 16975078b4, 07235678a4, c55f4055dd, 9451ea0a05, f342b7c969 I think this is perfectly ok, so I've fixed the hook to allow this (see attached). C. Line-wrapped file entries ---------------------------- 8675f4136c, 0e25a39e69, 7c79eea51d, 335a5fd173, 31f8ae53be These are cases where file entries extend multiple lines. In the first example, a single file is split onto two lines, which I don't think is feasible to identify. In the other cases, the entries are formatted like so: * path/to/file.txt, path/to/other/file.txt: Change something. I've partially fixed this so that it can properly parse the first line, but handling the second line is tricky, so I didn't bother. It would be better if committers used this format anyway: * path/to/file.txt: * path/to/other/file.txt: Change something. D. Merge commit with file entries (and no diff) ----------------------------------------------- 289b457cac This is an odd case: it's a merge commit that mentions a bunch of file changes, but there's no diff in this commit. I'm not really sure what to do about this one, but maybe we shouldn't add file entries for cases like this under normal circumstances. If there's an exception to that rule, we can always push a commit like this with "--no-verify". [1] https://www.gnu.org/prep/standards/html_node/Style-of-Change-Logs.html#Style-of-Change-Logs