From: Jim Porter <jporterbugs@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: acm@muc.de, emacs-devel@gnu.org
Subject: Re: Mistakes in commit log messages
Date: Fri, 14 Apr 2023 20:41:49 -0700
Message-ID: <5adfbf6f-fbcb-f4e8-3662-48bd5eb6a269@gmail.com>
In-Reply-To: <838rew5lak.fsf@gnu.org>
Ok, I think this patch should work, though I'll continue testing it
locally before I merge it. We could also merge this patch sooner if we
temporarily made the pre-push hook advisory (i.e. it reports problems but
doesn't block a push). That way, others can try these hooks out without
them blocking anyone's work. Then, when we're happy with them, we can make
the pre-push hook error out on bad commit messages.
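To illustrate what "advisory" could mean here, the following is a minimal sketch, not part of the attached patch. The helper name run_advisory and the warning text are made up for this example; the idea is only that the check still runs and reports, but the hook's exit status is always zero so Git proceeds with the push.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): run a check command and
# report a failure, but always return success so a push is not blocked.
run_advisory () {
  if "$@"; then
    return 0
  fi
  echo "warning: commit message check failed (advisory only)" >&2
  return 0
}

# Demonstration with a command that always fails: the warning goes to
# stderr, yet the status seen by the caller is still 0.
run_advisory false
echo "exit status: $?"
```

In a real hook, the final checking pipeline would be wrapped the same way (for example via `run_advisory sh -c '...'`); removing the wrapper later would turn the hook back into a blocking one.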
On 4/12/2023 11:49 PM, Eli Zaretskii wrote:
> Please be sure to test the part that finds the file names on all the
> log messages in the repository. The things people do in logs will
> sometimes surprise you. In particular, a '*' after leading whitespace
> doesn't necessarily flag a file name, see, for example, the following
> commits:
>
> 92d75e5c53241ac76e8fdcb6fc66ade68354687c
This works without errors with my latest patch. (Though it's not smart
enough to recognize "src/comp.c" as a file to check, since there's
no leading "* ".) I think that's probably ok; it'd be hard to detect
cases like that reliably.
> 0bd96806ef1a0f0d2d3f48cdb1204b7e393ab036
This fails, correctly I think. The first line of the commit message is
"* Rename `comp--typeof-builtin-types'", which I don't think we should
allow (going forward, at least). Since it's in the first line, we
*could* treat that specially, but I'm not sure it's worth the
complexity. Committers can just avoid starting lines with "*" (for
example, by using " *" instead).
> eff42dc0af741cc56c52d7d9577d29fc16f9f665
This also works without errors. An indented "*" is ok, and not checked
by these hooks.
> b5f70c239e87e5f38fd70181ef75cd28a43a8b41
This fails for two reasons: first, there's a line like this: "*
buffer-match-p and match-buffers", which is recognized as a list of file
names. That looks like auto-fill-mode (or something similar) adding a "*
" where it shouldn't. That's happened to me before, so I'd be glad for
the hook to catch this.
It also fails because of this line: "* lisp/window.el
(display-buffer-assq-regexp): Mention what happens". That's correct too,
since "lisp/window.el" wasn't changed in this commit.
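The detection described above can be tried in isolation. The sketch below copies the two regular expressions and the split() separator from commit-msg-files.awk in the attached patch; the wrapper function list_files and the three sample lines are only for illustration. It prints what the hook would treat as file names, without consulting Git at all.

```shell
#!/bin/sh
# Illustration only: print the names that the patch's pattern would
# extract from commit message lines read on standard input.
list_files () {
  awk '/^(; )?\*[ \t]+[[:alnum:]]/ && match($0, "[[:alnum:]][^[(<{:]*") {
    # Same separator as in the patch: a blank or comma, then blanks.
    n = split(substr($0, RSTART, RLENGTH), names, "[[:blank:],][[:blank:]]*")
    for (i = 1; i <= n; i++)
      if (length(names[i]))
        print names[i]
  }'
}

# Two lines from the commits discussed above, plus an indented bullet.
printf '%s\n' \
  '* lisp/window.el (display-buffer-assq-regexp): Mention what happens' \
  '* buffer-match-p and match-buffers' \
  '  * an indented bullet' | list_files
```

With an awk that supports POSIX character classes (gawk, for instance), this should print lisp/window.el, buffer-match-p, and, and match-buffers, one per line, and nothing for the indented bullet. That matches the behavior described above: the stray "* " line yields three bogus "file names", while an indented "*" is ignored.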
If you apply my patch, you can also test out other commits via "echo
COMMIT-SHA | awk -f build-aux/git-hooks/commit-msg-files.awk". (I'll do
this locally as well.)
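For checking many commits at once, something along these lines might be convenient. This is a sketch under assumptions: the function name check_shas is invented here, the default script path is the one from the patch, and reason=pre-push is passed only because the script then prints a per-commit header.

```shell
#!/bin/sh
# Hypothetical convenience wrapper: feed commit SHAs (one per line on
# standard input) to the checking script.  The script path defaults to
# the location used in the patch; pass another path as the first argument.
check_shas () {
  script=${1:-build-aux/git-hooks/commit-msg-files.awk}
  # Prefer gawk if available, as the hooks in the patch do.
  if type gawk >/dev/null 2>&1; then
    awk=gawk
  else
    awk=awk
  fi
  "$awk" -v reason=pre-push -f "$script"
}

# Usage, from the top of an Emacs checkout with the patch applied:
#   git rev-list --reverse HEAD~100..HEAD | check_shas
```

The exit status of check_shas is that of the awk script, so it is nonzero if any commit in the range lists a file that is not in its diff.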
> Also, it looks like your script doesn't recognize file names in a line
> that starts with a semi-colon, as in this commit:
I fixed this case, though as far as I can tell, authors.el doesn't look
at lines like this. (I could be wrong, since I just read over that code
briefly.)
> What about the opposite: a file is mentioned in the diff, but not in
> the commit message? Or maybe that is allowed, and we shouldn't block
> it?
I think that's ok. Robert Pluim mentions a couple of cases, and we
probably also want to allow commit messages like: "; Fix last change".
[-- Attachment #2: 0001-Add-Git-hooks-to-check-filenames-listed-in-the-commi.patch --]
[-- Type: text/plain, Size: 8210 bytes --]
From f849fa082f0d7c8c3d472120e91e155b8700c65c Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Wed, 12 Apr 2023 23:03:31 -0700
Subject: [PATCH] Add Git hooks to check filenames listed in the commit message
* build-aux/git-hooks/commit-msg-files.awk:
* build-aux/git-hooks/post-commit:
* build-aux/git-hooks/pre-push: New files...
* autogen.sh: ... add them.
---
autogen.sh | 2 +-
build-aux/git-hooks/commit-msg-files.awk | 91 ++++++++++++++++++++++++
build-aux/git-hooks/post-commit | 29 ++++++++
build-aux/git-hooks/pre-push | 66 +++++++++++++++++
4 files changed, 187 insertions(+), 1 deletion(-)
create mode 100644 build-aux/git-hooks/commit-msg-files.awk
create mode 100755 build-aux/git-hooks/post-commit
create mode 100755 build-aux/git-hooks/pre-push
diff --git a/autogen.sh b/autogen.sh
index af4c2ad14df..71d7ac89abf 100755
--- a/autogen.sh
+++ b/autogen.sh
@@ -340,7 +340,7 @@ hooks=
 tailored_hooks=
 sample_hooks=
-for hook in commit-msg pre-commit prepare-commit-msg; do
+for hook in commit-msg pre-commit prepare-commit-msg post-commit pre-push commit-msg-files.awk; do
     cmp -- build-aux/git-hooks/$hook "$hooks/$hook" >/dev/null 2>&1 ||
         tailored_hooks="$tailored_hooks $hook"
 done
diff --git a/build-aux/git-hooks/commit-msg-files.awk b/build-aux/git-hooks/commit-msg-files.awk
new file mode 100644
index 00000000000..c1edccdf3e5
--- /dev/null
+++ b/build-aux/git-hooks/commit-msg-files.awk
@@ -0,0 +1,91 @@
+# Check the file list of GNU Emacs change log entries for each commit SHA.
+
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This file is part of GNU Emacs.
+
+# GNU Emacs is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# GNU Emacs is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+function get_commit_changes(commit_sha, changes,    i, j, len, bits, filename) {
+  # Collect all the files touched in the specified commit.
+  while ((("git log -1 --name-status --format= " commit_sha) | getline) > 0) {
+    for (i = 2; i <= NF; i++) {
+      len = split($i, bits, "/")
+      for (j = 1; j <= len; j++) {
+        if (j == 1)
+          filename = bits[j]
+        else
+          filename = filename "/" bits[j]
+        changes[filename] = 1
+      }
+    }
+  }
+}
+
+function check_commit_msg_files(commit_sha, verbose,    changes, good, msg, \
+                                filenames_str, filenames, i) {
+  get_commit_changes(commit_sha, changes)
+  good = 1
+
+  while ((("git log -1 --format=%B " commit_sha) | getline) > 0) {
+    if (verbose && ! msg)
+      msg = $0
+
+    # Find lines that reference files.  We look at any line starting
+    # with "*" (possibly prefixed by "; ") where the file part starts
+    # with an alphanumeric character.  The file part ends if we
+    # encounter any of the following characters: [ ( < { :
+    if (/^(; )?\*[ \t]+[[:alnum:]]/ && match($0, "[[:alnum:]][^[(<{:]*")) {
+      # There might be multiple files listed on this line, separated
+      # by a comma and/or space.  Iterate over each of them.
+      split(substr($0, RSTART, RLENGTH), filenames, "[[:blank:],][[:blank:]]*")
+      for (i in filenames) {
+        if (length(filenames[i]) && ! (filenames[i] in changes)) {
+          if (good) {
+            # Print a header describing the error.
+            if (verbose)
+              printf("In commit %s \"%s\"...\n", substr(commit_sha, 1, 10), msg)
+            printf("Files listed in commit message, but not in diff:\n")
+          }
+          printf("  %s\n", filenames[i])
+          good = 0
+        }
+      }
+    }
+  }
+
+  return good
+}
+
+BEGIN {
+  if (reason == "pre-push")
+    verbose = 1
+}
+
+/^[a-z0-9]{40}$/ {
+  if (! check_commit_msg_files($0, verbose)) {
+    status = 1
+  }
+}
+
+END {
+  if (status != 0) {
+    if (reason == "pre-push")
+      error_msg = "Push aborted"
+    else
+      error_msg = "Bad commit message"
+    printf("%s; please see the file 'CONTRIBUTE'\n", error_msg)
+  }
+  exit status
+}
diff --git a/build-aux/git-hooks/post-commit b/build-aux/git-hooks/post-commit
new file mode 100755
index 00000000000..4c30ec76e02
--- /dev/null
+++ b/build-aux/git-hooks/post-commit
@@ -0,0 +1,29 @@
+#!/bin/sh
+# Check the file list of GNU Emacs change log entries after committing.
+
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This file is part of GNU Emacs.
+
+# GNU Emacs is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# GNU Emacs is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+# Prefer gawk if available, as it handles NUL bytes properly.
+if type gawk >/dev/null 2>&1; then
+  awk="gawk"
+else
+  awk="awk"
+fi
+
+git rev-parse HEAD | $awk -v reason=post-commit \
+                          -f .git/hooks/commit-msg-files.awk
diff --git a/build-aux/git-hooks/pre-push b/build-aux/git-hooks/pre-push
new file mode 100755
index 00000000000..136f84e6691
--- /dev/null
+++ b/build-aux/git-hooks/pre-push
@@ -0,0 +1,66 @@
+#!/bin/sh
+# Check the file list of GNU Emacs change log entries before pushing.
+
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This file is part of GNU Emacs.
+
+# GNU Emacs is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# GNU Emacs is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+# Prefer gawk if available, as it handles NUL bytes properly.
+if type gawk >/dev/null 2>&1; then
+  awk="gawk"
+else
+  awk="awk"
+fi
+
+# Standard input receives lines of the form:
+#   <local ref> SP <local name> SP <remote ref> SP <remote name> LF
+$awk -v origin_name="$1" '
+  # If the local SHA is all zeroes, ignore it.
+  $2 ~ /^0{40}$/ {
+    next
+  }
+
+  $2 ~ /^[a-z0-9]{40}$/ {
+    newref = $2
+    # If the remote SHA is all zeroes, this is a new object to be
+    # pushed (likely a branch).  Go backwards until we find a SHA on
+    # an origin branch.
+    if ($4 ~ /^0{40}$/) {
+      back = 0
+      while ((("git branch -r -l '\''" origin_name "/*'\'' --contains " \
+               newref "~" back) | getline) == 0) {
+
+        # Only look back at most 1000 commits, just in case...
+        if (back++ > 1000)
+          break;
+      }
+
+      ("git rev-parse " newref "~" back) | getline oldref
+      if (!(oldref ~ /^[a-z0-9]{40}$/)) {
+        # The SHA is misformatted!  Skip this line.
+        next
+      }
+    } else if ($4 ~ /^[a-z0-9]{40}$/) {
+      oldref = $4
+    } else {
+      # The SHA is misformatted!  Skip this line.
+      next
+    }
+
+    # Print every SHA after oldref, up to (and including) newref.
+    system("git rev-list --reverse " oldref ".." newref)
+  }
+' | $awk -v reason=pre-push -f .git/hooks/commit-msg-files.awk
--
2.25.1