all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Jim Porter <jporterbugs@gmail.com>
To: Eli Zaretskii <eliz@gnu.org>
Cc: acm@muc.de, emacs-devel@gnu.org
Subject: Re: Mistakes in commit log messages
Date: Fri, 14 Apr 2023 22:45:44 -0700	[thread overview]
Message-ID: <9c8826af-a160-d36f-7577-2bc5ec0c2004@gmail.com> (raw)
In-Reply-To: <5adfbf6f-fbcb-f4e8-3662-48bd5eb6a269@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1870 bytes --]

On 4/14/2023 8:41 PM, Jim Porter wrote:
> Ok, I think this patch should work, though I'll continue testing it 
> locally before I merge it. We could also merge this patch sooner if we 
> temporarily made the pre-push hook advisory (i.e. doesn't block a push). 
> That way, others can try these hooks out without it blocking anyone's 
> work. Then when we're happy with it, we can make the pre-push hook error 
> out on bad commit messages.

Here are a couple small fixes after testing against the 2000 latest 
commits (though I haven't examined *all* the errors in detail). I fixed 
the hooks so that they actually close the pipes we use, and also made a 
regexp a bit more strict. Previously, it split file names by a comma 
and/or a space, but there are a few files with commas in their names, 
e.g. "test/lisp/net/eudc-resources/dc=gnu,dc=org.ldif". Now it handles 
those properly.

Testing this, there are a bunch of commits where the first line is of 
the form "* Do something". Maybe the hooks should ignore cases like 
that, or we could just disallow that going forward. I don't have a 
strong opinion.

There's also one more commit I'm not quite sure what to do about: 
0a95a81d8d36722ccf030a6194ecd953fc257a59. It has this in the commit message:

     This fixes bug #61144.  If the space around the * is "symmetric" we 
leave foo
     * bar unfontified, a multiplication operation.  If it is 
"asymmetric" we
     fontify it as a pointer declaration.

The second line of this excerpt is treated like a file entry. Maybe our 
hook could allow that if it were clever enough, or maybe this is a rare 
enough occurrence that we should just have committers reformat the 
message slightly.

Almost all the other errors are due to commits from packages that were 
imported into the Emacs repo and their files moved around during the 
import (e.g. Eglot, use-package).

[-- Attachment #2: 0001-Add-Git-hooks-to-check-filenames-listed-in-the-commi.patch --]
[-- Type: text/plain, Size: 8386 bytes --]

From 6e2a73dfba7c0b1a7b4000f9305b666911f4a171 Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Wed, 12 Apr 2023 23:03:31 -0700
Subject: [PATCH] Add Git hooks to check filenames listed in the commit message

* build-aux/git-hooks/commit-msg-files.awk:
* build-aux/git-hooks/post-commit:
* build-aux/git-hooks/pre-push: New files...
* autogen.sh: ... add them.
---
 autogen.sh                               |  3 +-
 build-aux/git-hooks/commit-msg-files.awk | 96 ++++++++++++++++++++++++
 build-aux/git-hooks/post-commit          | 29 +++++++
 build-aux/git-hooks/pre-push             | 70 +++++++++++++++++
 4 files changed, 197 insertions(+), 1 deletion(-)
 create mode 100644 build-aux/git-hooks/commit-msg-files.awk
 create mode 100755 build-aux/git-hooks/post-commit
 create mode 100755 build-aux/git-hooks/pre-push

diff --git a/autogen.sh b/autogen.sh
index af4c2ad14df..6127e7b24f4 100755
--- a/autogen.sh
+++ b/autogen.sh
@@ -340,7 +340,8 @@ hooks=
 tailored_hooks=
 sample_hooks=
 
-for hook in commit-msg pre-commit prepare-commit-msg; do
+for hook in commit-msg pre-commit prepare-commit-msg post-commit \
+            pre-push commit-msg-files.awk; do
     cmp -- build-aux/git-hooks/$hook "$hooks/$hook" >/dev/null 2>&1 ||
 	tailored_hooks="$tailored_hooks $hook"
 done
diff --git a/build-aux/git-hooks/commit-msg-files.awk b/build-aux/git-hooks/commit-msg-files.awk
new file mode 100644
index 00000000000..da066b83bdd
--- /dev/null
+++ b/build-aux/git-hooks/commit-msg-files.awk
@@ -0,0 +1,96 @@
+# Check the file list of GNU Emacs change log entries for each commit SHA.
+
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This file is part of GNU Emacs.
+
+# GNU Emacs is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# GNU Emacs is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+function get_commit_changes(commit_sha, changes,    cmd, i, j, len, \
+                            bits, filename) {
+  # Collect all the files touched in the specified commit.
+  cmd = ("git log -1 --name-status --format= " commit_sha)
+  while ((cmd | getline) > 0) {
+    for (i = 2; i <= NF; i++) {
+      len = split($i, bits, "/")
+      for (j = 1; j <= len; j++) {
+        if (j == 1)
+          filename = bits[j]
+        else
+          filename = filename "/" bits[j]
+        changes[filename] = 1
+      }
+    }
+  }
+  close(cmd)
+}
+
+function check_commit_msg_files(commit_sha, verbose,    changes, good, \
+                                cmd, msg, filenames_str, filenames, i) {
+  get_commit_changes(commit_sha, changes)
+  good = 1
+
+  cmd = ("git log -1 --format=%B " commit_sha)
+  while ((cmd | getline) > 0) {
+    if (verbose && ! msg)
+      msg = $0
+
+    # Find lines that reference files.  We look at any line starting
+    # with "*" (possibly prefixed by "; ") where the file part starts
+    # with an alphanumeric character.  The file part ends if we
+    # encounter any of the following characters: [ ( < { :
+    if (/^(; )?\*[ \t]+[[:alnum:]]/ && match($0, "[[:alnum:]][^[(<{:]*")) {
+      # There might be multiple files listed on this line, separated
+      # by spaces (and possibly a comma).  Iterate over each of them.
+      split(substr($0, RSTART, RLENGTH), filenames, ",?[[:blank:]]+")
+      for (i in filenames) {
+        if (length(filenames[i]) && ! (filenames[i] in changes)) {
+          if (good) {
+            # Print a header describing the error.
+            if (verbose)
+              printf("In commit %s \"%s\"...\n", substr(commit_sha, 1, 10), msg)
+            printf("Files listed in commit message, but not in diff:\n")
+          }
+          printf("  %s\n", filenames[i])
+          good = 0
+        }
+      }
+    }
+  }
+  close(cmd)
+
+  return good
+}
+
+BEGIN {
+  if (reason == "pre-push")
+    verbose = 1
+}
+
+/^[a-z0-9]{40}$/ {
+  if (! check_commit_msg_files($0, verbose)) {
+    status = 1
+  }
+}
+
+END {
+  if (status != 0) {
+    if (reason == "pre-push")
+      error_msg = "Push aborted"
+    else
+      error_msg = "Bad commit message"
+    printf("%s; please see the file 'CONTRIBUTE'\n", error_msg)
+  }
+  exit status
+}
diff --git a/build-aux/git-hooks/post-commit b/build-aux/git-hooks/post-commit
new file mode 100755
index 00000000000..4c30ec76e02
--- /dev/null
+++ b/build-aux/git-hooks/post-commit
@@ -0,0 +1,29 @@
+#!/bin/sh
+# Check the file list of GNU Emacs change log entries after committing.
+
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This file is part of GNU Emacs.
+
+# GNU Emacs is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# GNU Emacs is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+# Prefer gawk if available, as it handles NUL bytes properly.
+if type gawk >/dev/null 2>&1; then
+  awk="gawk"
+else
+  awk="awk"
+fi
+
+git rev-parse HEAD | $awk -v reason=post-commit \
+                          -f .git/hooks/commit-msg-files.awk
diff --git a/build-aux/git-hooks/pre-push b/build-aux/git-hooks/pre-push
new file mode 100755
index 00000000000..b0185a97b28
--- /dev/null
+++ b/build-aux/git-hooks/pre-push
@@ -0,0 +1,70 @@
+#!/bin/sh
+# Check the file list of GNU Emacs change log entries before pushing.
+
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This file is part of GNU Emacs.
+
+# GNU Emacs is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# GNU Emacs is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with GNU Emacs.  If not, see <https://www.gnu.org/licenses/>.
+
+# Prefer gawk if available, as it handles NUL bytes properly.
+if type gawk >/dev/null 2>&1; then
+  awk="gawk"
+else
+  awk="awk"
+fi
+
+# Standard input receives lines of the form:
+#   <local ref> SP <local name> SP <remote ref> SP <remote name> LF
+$awk -v origin_name="$1" '
+  # If the local SHA is all zeroes, ignore it.
+  $2 ~ /^0{40}$/ {
+    next
+  }
+
+  $2 ~ /^[a-z0-9]{40}$/ {
+    newref = $2
+    # If the remote SHA is all zeroes, this is a new object to be
+    # pushed (likely a branch).  Go backwards until we find a SHA on
+    # an origin branch.
+    if ($4 ~ /^0{40}$/) {
+      back = 0
+      cmd = ("git branch -r -l '\''" origin_name "/*'\'' --contains " \
+             newref "~" back)
+      while ((cmd | getline) == 0) {
+
+        # Only look back at most 1000 commits, just in case...
+        if (back++ > 1000)
+          break;
+      }
+      close(cmd)
+
+      cmd = ("git rev-parse " newref "~" back)
+      cmd | getline oldref
+      if (!(oldref ~ /^[a-z0-9]{40}$/)) {
+        # The SHA is misformatted!  Skip this line.
+        next
+      }
+      close(cmd)
+    } else if ($4 ~ /^[a-z0-9]{40}$/)  {
+      oldref = $4
+    } else {
+      # The SHA is misformatted!  Skip this line.
+      next
+    }
+
+    # Print every SHA after oldref, up to (and including) newref.
+    system("git rev-list --reverse " oldref ".." newref)
+  }
+' | $awk -v reason=pre-push -f .git/hooks/commit-msg-files.awk
-- 
2.25.1


  reply	other threads:[~2023-04-15  5:45 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <835ya5m4p0.fsf@gnu.org>
     [not found] ` <ZDPkykCsW3i30UR9@ACM>
     [not found]   ` <83v8i4arzt.fsf@gnu.org>
     [not found]     ` <CANh=_JF0CEPDsWZSuyy9ymByma2LxcypP90O3-LQ+KhoJ8cqvg@mail.gmail.com>
     [not found]       ` <CANh=_JEO4-E79dPCLc3cRLi7=ftAzc+H1FC46eck1vJN3TD3Sg@mail.gmail.com>
2023-04-11  6:02         ` Mistakes in commit log messages Eli Zaretskii
2023-04-11 14:01           ` Alan Mackenzie
2023-04-11 14:57             ` Eli Zaretskii
2023-04-11 17:20               ` Alan Mackenzie
2023-04-11 18:00                 ` Eli Zaretskii
2023-04-11 18:31             ` Jim Porter
2023-04-11 18:45               ` Eli Zaretskii
2023-04-11 19:27                 ` Jim Porter
2023-04-11 19:36                   ` Eli Zaretskii
2023-04-12  0:20                     ` Jim Porter
2023-04-13  6:18                       ` Jim Porter
2023-04-13  6:49                         ` Eli Zaretskii
2023-04-13  7:47                           ` Robert Pluim
2023-04-15  3:41                           ` Jim Porter
2023-04-15  5:45                             ` Jim Porter [this message]
2023-04-15  7:15                               ` Eli Zaretskii
2023-04-15 10:44                                 ` Alan Mackenzie
2023-04-15 11:00                                   ` Eli Zaretskii
2023-04-21 22:16                                   ` Filipp Gunbin
2023-04-15 20:54                               ` Jim Porter
2023-04-15 21:23                                 ` Jim Porter
2023-04-16  5:43                                   ` Eli Zaretskii
2023-04-16 20:06                                     ` Jim Porter
2023-04-16 20:19                                       ` Michael Albinus
2023-04-17  2:22                                       ` Eli Zaretskii
2023-04-17  7:28                                         ` Michael Albinus
2023-04-21  4:59                                 ` Jim Porter
2023-04-15  7:08                             ` Eli Zaretskii
2023-04-12  9:41                     ` Alan Mackenzie
2023-04-12 10:14                       ` Eli Zaretskii
2023-04-12  9:32               ` Alan Mackenzie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9c8826af-a160-d36f-7577-2bc5ec0c2004@gmail.com \
    --to=jporterbugs@gmail.com \
    --cc=acm@muc.de \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.