all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
* New Git hooks for checking file names in commit messages
@ 2023-04-21  4:55 Jim Porter
  2023-04-21 12:05 ` John Yates
  2023-04-21 13:57 ` Alan Mackenzie
  0 siblings, 2 replies; 34+ messages in thread
From: Jim Porter @ 2023-04-21  4:55 UTC (permalink / raw)
  To: emacs-devel

(For some background on this, see here: 
https://lists.gnu.org/archive/html/emacs-devel/2023-04/msg00274.html)

I've just merged some new Git hooks that should help to identify errors 
in the list of file names in commit messages (see commit 4416262f59f). 
These ensure that any file name listed in the commit message is actually 
in the diff, so that it's harder to make a typo when writing the commit 
message. The hooks recognize files on lines starting with "*" or "; *". 
This also means that if you want to use a "*" for a bulleted list of 
sentences, you should do something to make it not look like a file 
entry. For example, " *" is ok.

In addition to listing individual files, you can also list whole 
directories (e.g. if you added many files to a directory). However, you 
can't use wildcards in your file entries. (If people really want this, 
it's probably doable, but I think being explicit will help future 
maintainers if they ever need to grep the changelogs.)

You can install these hooks via the autogen.sh script, as usual. There's 
one part that might be surprising, at least initially though: while this 
is a hook to check commit messages, it actually runs in the post-commit 
and pre-push phases. This is because we want to examine the full diff of 
the commit in order to compare against the file entries in the 
message.[1] As a result, you'll just get an advisory message when you 
make the commit (it's too late to error out), and the pre-push hook will 
prevent you from pushing a badly-formatted commit message upstream.

I encourage everyone to test these hooks out and report back if you find 
any problems. I ran them locally on the last 5000 commits to master and 
manually verified that there are no false positives, but there could 
still be bugs. If you use a non-gawk awk, I especially encourage you to 
test them out. I did my best to only use POSIX awk features, but you 
never know...

[1] It's hard to do this during the commit-msg hook since there's no 
cross-platform way I know of to handle amending commits properly.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21  4:55 New Git hooks for checking file names in commit messages Jim Porter
@ 2023-04-21 12:05 ` John Yates
  2023-04-21 16:44   ` Jim Porter
  2023-04-21 13:57 ` Alan Mackenzie
  1 sibling, 1 reply; 34+ messages in thread
From: John Yates @ 2023-04-21 12:05 UTC (permalink / raw)
  To: Jim Porter; +Cc: emacs-devel

On Fri, Apr 21, 2023 at 12:55 AM Jim Porter <jporterbugs@gmail.com> wrote:
>
> This also means that if you want to use a "*" for a bulleted list of
> sentences, you should do something to make it not look like a file
> entry. For example, " *" is ok.

I am not wild about this change in commit message syntax.  It clashes
with all my other uses of git.

Do you allow more than one file on a bulleted line?  If not then could
you use a whitespace check to identify most bulleted lines which do
not actually contain a file name?

Another heuristic might be to accumulate whether among all of the
files in the diffs there are any without file extensions.  If all file names
do indeed include extensions then a bulleted line without a "." foolowed
immediately by a non-whitespace character could be skipped.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21  4:55 New Git hooks for checking file names in commit messages Jim Porter
  2023-04-21 12:05 ` John Yates
@ 2023-04-21 13:57 ` Alan Mackenzie
  2023-04-21 15:05   ` Eli Zaretskii
  2023-04-21 16:39   ` Jim Porter
  1 sibling, 2 replies; 34+ messages in thread
From: Alan Mackenzie @ 2023-04-21 13:57 UTC (permalink / raw)
  To: Jim Porter; +Cc: emacs-devel

Hello, Jim.

On Thu, Apr 20, 2023 at 21:55:15 -0700, Jim Porter wrote:

[ .... ]

> There's one part that might be surprising, at least initially though:
> while this is a hook to check commit messages, it actually runs in the
> post-commit and pre-push phases. This is because we want to examine
> the full diff of the commit in order to compare against the file
> entries in the message.[1] As a result, you'll just get an advisory
> message when you make the commit (it's too late to error out), and the
> pre-push hook will prevent you from pushing a badly-formatted commit
> message upstream.

How is this helpful?  Telling somebody they've made a mistake is only
helpful if they can correct it.  If "it's too late to error out", it
will just create annoyance.  Immediately after such a commit it is
possible to use $ git commit --amend, but not later.  And preventing
people from pushing these erroneous commits is, again, not helpful.  It
will likely take more work, in total, somehow to correct these errors
(how?) than it took Eli to correct all the commit message mistakes for
the Emacs-29.1 pretest release.

I don't understand why the checking of the commit message isn't done in
the commit-msg hook.

[ .... ]

> [1] It's hard to do this during the commit-msg hook since there's no 
> cross-platform way I know of to handle amending commits properly.

I don't understand this bit.  Why does the lack of a way of amending
commits make it difficult to error out from the commit-msg hook?  What
am I missing?

-- 
Alan Mackenzie (Nuremberg, Germany).



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 13:57 ` Alan Mackenzie
@ 2023-04-21 15:05   ` Eli Zaretskii
  2023-04-21 15:38     ` Arsen Arsenović
  2023-04-22 23:55     ` Dmitry Gutov
  2023-04-21 16:39   ` Jim Porter
  1 sibling, 2 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-21 15:05 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: jporterbugs, emacs-devel

> Date: Fri, 21 Apr 2023 13:57:22 +0000
> Cc: emacs-devel@gnu.org
> From: Alan Mackenzie <acm@muc.de>
> 
> > There's one part that might be surprising, at least initially though:
> > while this is a hook to check commit messages, it actually runs in the
> > post-commit and pre-push phases. This is because we want to examine
> > the full diff of the commit in order to compare against the file
> > entries in the message.[1] As a result, you'll just get an advisory
> > message when you make the commit (it's too late to error out), and the
> > pre-push hook will prevent you from pushing a badly-formatted commit
> > message upstream.
> 
> How is this helpful?  Telling somebody they've made a mistake is only
> helpful if they can correct it.  If "it's too late to error out", it
> will just create annoyance.  Immediately after such a commit it is
> possible to use $ git commit --amend, but not later.  And preventing
> people from pushing these erroneous commits is, again, not helpful.

It is still helpful, because as long as you didn't push, you can still
do

   $ git reset HEAD^

and then fix whatever needs fixing and commit again.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 15:05   ` Eli Zaretskii
@ 2023-04-21 15:38     ` Arsen Arsenović
  2023-04-21 15:50       ` Eli Zaretskii
  2023-04-22 23:55     ` Dmitry Gutov
  1 sibling, 1 reply; 34+ messages in thread
From: Arsen Arsenović @ 2023-04-21 15:38 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Alan Mackenzie, jporterbugs, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1551 bytes --]


Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Fri, 21 Apr 2023 13:57:22 +0000
>> Cc: emacs-devel@gnu.org
>> From: Alan Mackenzie <acm@muc.de>
>> 
>> > There's one part that might be surprising, at least initially though:
>> > while this is a hook to check commit messages, it actually runs in the
>> > post-commit and pre-push phases. This is because we want to examine
>> > the full diff of the commit in order to compare against the file
>> > entries in the message.[1] As a result, you'll just get an advisory
>> > message when you make the commit (it's too late to error out), and the
>> > pre-push hook will prevent you from pushing a badly-formatted commit
>> > message upstream.
>> 
>> How is this helpful?  Telling somebody they've made a mistake is only
>> helpful if they can correct it.  If "it's too late to error out", it
>> will just create annoyance.  Immediately after such a commit it is
>> possible to use $ git commit --amend, but not later.  And preventing
>> people from pushing these erroneous commits is, again, not helpful.
>
> It is still helpful, because as long as you didn't push, you can still
> do
>
>    $ git reset HEAD^

You can also reword during a rebase, which has the advantage of not
chucking any work away.  See [1].

Everything can be changed at any point in Git, until you push, and even
then if the server agrees, so there's never a time that is too late to
error out.

Have a great day.

[1] https://git-scm.com/docs/git-rebase#_interactive_mode
-- 
Arsen Arsenović

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 381 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 15:38     ` Arsen Arsenović
@ 2023-04-21 15:50       ` Eli Zaretskii
  2023-04-21 16:20         ` Arsen Arsenović
  2023-04-21 19:03         ` Björn Bidar
  0 siblings, 2 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-21 15:50 UTC (permalink / raw)
  To: Arsen Arsenović; +Cc: acm, jporterbugs, emacs-devel

> From: Arsen Arsenović <arsen@aarsen.me>
> Cc: Alan Mackenzie <acm@muc.de>, jporterbugs@gmail.com, emacs-devel@gnu.org
> Date: Fri, 21 Apr 2023 17:38:19 +0200
> 
> You can also reword during a rebase, which has the advantage of not
> chucking any work away.  See [1].

We advise contributors not to rebase on public branches, only on their
local ones.

> Everything can be changed at any point in Git, until you push, and even
> then if the server agrees, so there's never a time that is too late to
> error out.

Our server is set up not to allow non-FF pushes.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 15:50       ` Eli Zaretskii
@ 2023-04-21 16:20         ` Arsen Arsenović
  2023-04-21 17:44           ` Eli Zaretskii
  2023-04-21 19:03         ` Björn Bidar
  1 sibling, 1 reply; 34+ messages in thread
From: Arsen Arsenović @ 2023-04-21 16:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, jporterbugs, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 790 bytes --]


Eli Zaretskii <eliz@gnu.org> writes:

>> From: Arsen Arsenović <arsen@aarsen.me>
>> Cc: Alan Mackenzie <acm@muc.de>, jporterbugs@gmail.com, emacs-devel@gnu.org
>> Date: Fri, 21 Apr 2023 17:38:19 +0200
>> 
>> You can also reword during a rebase, which has the advantage of not
>> chucking any work away.  See [1].
>
> We advise contributors not to rebase on public branches, only on their
> local ones.

Sure, but using 'reset' implies that you can rebase, as otherwise you
couldn't reset to re-do the commits.

>> Everything can be changed at any point in Git, until you push, and even
>> then if the server agrees, so there's never a time that is too late to
>> error out.
>
> Our server is set up not to allow non-FF pushes.

For the best.
-- 
Arsen Arsenović

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 381 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 13:57 ` Alan Mackenzie
  2023-04-21 15:05   ` Eli Zaretskii
@ 2023-04-21 16:39   ` Jim Porter
  1 sibling, 0 replies; 34+ messages in thread
From: Jim Porter @ 2023-04-21 16:39 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: emacs-devel

On 4/21/2023 6:57 AM, Alan Mackenzie wrote:
> How is this helpful?  Telling somebody they've made a mistake is only
> helpful if they can correct it.  If "it's too late to error out", it
> will just create annoyance.  Immediately after such a commit it is
> possible to use $ git commit --amend, but not later.

This is why I added a post-commit hook that informs the committer of the 
problem immediately after committing. However, since this is a 
*post*-commit hook, it's too late to abort the commit. So there is a 
message, but it's purely advisory at that stage. Sorry if my wording was 
unclear originally.

>> [1] It's hard to do this during the commit-msg hook since there's no
>> cross-platform way I know of to handle amending commits properly.
> 
> I don't understand this bit.  Why does the lack of a way of amending
> commits make it difficult to error out from the commit-msg hook?  What
> am I missing?

It's impossible (as far as I'm aware) to reliably get a list of changed 
files from Git during the commit-msg hook. Since you can't tell whether 
the commit is being amended without platform-specific hacks, you have no 
way of knowing whether you should compute your diff against HEAD or HEAD^.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 12:05 ` John Yates
@ 2023-04-21 16:44   ` Jim Porter
  2023-04-21 17:44     ` Jim Porter
  0 siblings, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-21 16:44 UTC (permalink / raw)
  To: John Yates; +Cc: emacs-devel

On 4/21/2023 5:05 AM, John Yates wrote:
> On Fri, Apr 21, 2023 at 12:55 AM Jim Porter <jporterbugs@gmail.com> wrote:
>>
>> This also means that if you want to use a "*" for a bulleted list of
>> sentences, you should do something to make it not look like a file
>> entry. For example, " *" is ok.
> 
> I am not wild about this change in commit message syntax.  It clashes
> with all my other uses of git.

It's not a change in syntax (at least, it's not *intended* to change the 
syntax). Commit messages for Emacs should all use the GNU Changelog 
format so that they can be imported into the actual ChangeLog files. 
These then get used to update etc/AUTHORS via the admin/authors.el 
script, which looks at the file entries in the ChangeLog files.[1]

This ended up being a problem for Eli when preparing the Emacs 29 
prerelease tarball, since there were a bunch of incorrect commit 
messages that required several hours of manual intervention to fix. 
(Since I was one of the people Eli emailed off-list to inform us that 
we'd made these errors, I thought I should take some time to help 
prevent this issue in the future.)

I tried to be as lenient as possible with the Git hooks so that they 
only catch things that would be a problem for admin/authors.el, though
there are some small differences regarding multi-line lists of files.

> Do you allow more than one file on a bulleted line?

Yes. admin/authors.el allows this, so the Git hooks also allow it. (I 
don't personally like it though.)

In principle, we could copy over the admin/authors.el heuristic though: 
a line beginning with "*" starts a file entry *only if there's a ":" 
somewhere following it, up to the next whitespace-only line*. This would 
help avoid some false positives, but it also fails to catch errors where 
committers forgot to add the ":", which I saw on several occasions when 
looking over historical commits.

[1] That's a bit indirect (you could update etc/AUTHORS from the git 
logs directly), but it lets maintainers modify the ChangeLog if 
necessary before updating etc/AUTHORS, e.g. to fix authorship of a 
change or to remove entries if they were backed out for some reason.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 16:20         ` Arsen Arsenović
@ 2023-04-21 17:44           ` Eli Zaretskii
  2023-04-21 17:50             ` Arsen Arsenović
  2023-04-21 18:25             ` Andreas Schwab
  0 siblings, 2 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-21 17:44 UTC (permalink / raw)
  To: Arsen Arsenović; +Cc: acm, jporterbugs, emacs-devel

> From: Arsen Arsenović <arsen@aarsen.me>
> Cc: acm@muc.de, jporterbugs@gmail.com, emacs-devel@gnu.org
> Date: Fri, 21 Apr 2023 18:20:16 +0200
> 
> >> You can also reword during a rebase, which has the advantage of not
> >> chucking any work away.  See [1].
> >
> > We advise contributors not to rebase on public branches, only on their
> > local ones.
> 
> Sure, but using 'reset' implies that you can rebase, as otherwise you
> couldn't reset to re-do the commits.

No "git reset" doesn't rebase.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 16:44   ` Jim Porter
@ 2023-04-21 17:44     ` Jim Porter
  2023-04-22  6:11       ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-21 17:44 UTC (permalink / raw)
  To: John Yates; +Cc: emacs-devel, eliz

[-- Attachment #1: Type: text/plain, Size: 828 bytes --]

On 4/21/2023 9:44 AM, Jim Porter wrote:
> I tried to be as lenient as possible with the Git hooks so that they 
> only catch things that would be a problem for admin/authors.el, though
> there are some small differences regarding multi-line lists of files.

After thinking about this for a bit, I came up with a way to match 
admin/authors.el in the hooks (at least, to the best of my knowledge). 
This should mean that the hooks will report exactly the same errors as 
admin/authors.el.

With this patch, there's no restriction on putting a "*" in the first 
column of a commit message, so long as that paragraph doesn't also have 
a ":" in it (just like admin/authors.el).

Eli, what do you think? I've run this against the 5000 latest commits 
and the results look right (though I haven't looked as thoroughly as 
last time).

[-- Attachment #2: 0001-Improve-the-logic-of-the-file-entry-Git-hooks-to-mat.patch --]
[-- Type: text/plain, Size: 2629 bytes --]

From 2ed9e94bf10206f9fc706f347b1e877a6221e676 Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Fri, 21 Apr 2023 10:06:49 -0700
Subject: [PATCH] Improve the logic of the file entry Git hooks to match
 admin/authors.el

* build-aux/git-hooks/commit-msg-files.awk (check_commit_msg_files):
Accumulate file entries over multiple lines.
---
 build-aux/git-hooks/commit-msg-files.awk | 31 ++++++++++++++++++------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/build-aux/git-hooks/commit-msg-files.awk b/build-aux/git-hooks/commit-msg-files.awk
index 3856e474d3e..67256668cc2 100644
--- a/build-aux/git-hooks/commit-msg-files.awk
+++ b/build-aux/git-hooks/commit-msg-files.awk
@@ -59,15 +59,28 @@ function check_commit_msg_files(commit_sha, verbose,    changes, good, \
     if (verbose && ! msg)
       msg = $0
 
-    # Find lines that reference files.  We look at any line starting
-    # with "*" (possibly prefixed by "; ") where the file part starts
-    # with an alphanumeric character.  The file part ends if we
-    # encounter any of the following characters: [ ( < { :
-    if (/^(; )?\*[ \t]+[[:alnum:]]/ && match($0, /[[:alnum:]][^[(<{:]*/)) {
-      # There might be multiple files listed on this line, separated
+    # Find file entries in the commit message.  We look at any line
+    # starting with "*" (possibly prefixed by "; ") followed by a ":",
+    # possibly on a different line.  If we encounter a blank line
+    # without seeing a ":", then we don't treat that as a file entry.
+
+    # Accumulate the contents of a (possible) file entry.
+    if (/^[ \t]*$/)
+      filenames_str = ""
+    else if (/^(; )?\*[ \t]+[[:alnum:]]/)
+      filenames_str = $0
+    else if (filenames_str)
+      filenames_str = (filenames_str $0)
+
+    # We have a file entry; analyze it.
+    if (filenames_str && /:/) {
+      # Delete the leading "*" and any trailing information.
+      sub(/^(; )?\*[ \t]+/, "", filenames_str)
+      sub(/[[(<{:].*$/, "", filenames_str)
+
+      # There might be multiple files listed in this entry, separated
       # by spaces (and possibly a comma).  Iterate over each of them.
-      split(substr($0, RSTART, RLENGTH), filenames, ",?([[:blank:]]+|$)")
-
+      split(filenames_str, filenames, ",?([[:blank:]]+|$)")
       for (i in filenames) {
         # Remove trailing slashes from any directory entries.
         sub(/\/$/, "", filenames[i])
@@ -83,6 +96,8 @@ function check_commit_msg_files(commit_sha, verbose,    changes, good, \
           good = 0
         }
       }
+
+      filenames_str = ""
     }
   }
   close(cmd)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 17:44           ` Eli Zaretskii
@ 2023-04-21 17:50             ` Arsen Arsenović
  2023-04-22  6:16               ` Eli Zaretskii
  2023-04-21 18:25             ` Andreas Schwab
  1 sibling, 1 reply; 34+ messages in thread
From: Arsen Arsenović @ 2023-04-21 17:50 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: acm, jporterbugs, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]


Eli Zaretskii <eliz@gnu.org> writes:

>> From: Arsen Arsenović <arsen@aarsen.me>
>> Cc: acm@muc.de, jporterbugs@gmail.com, emacs-devel@gnu.org
>> Date: Fri, 21 Apr 2023 18:20:16 +0200
>> 
>> >> You can also reword during a rebase, which has the advantage of not
>> >> chucking any work away.  See [1].
>> >
>> > We advise contributors not to rebase on public branches, only on their
>> > local ones.
>> 
>> Sure, but using 'reset' implies that you can rebase, as otherwise you
>> couldn't reset to re-do the commits.
>
> No "git reset" doesn't rebase.

No, but it means that everything above the reset point is malleable, in
which case, you can use a rebase on it anyway (which can be reasoned
with as a hard reset with repeated patch application).

I'm trying to say that it's unnecessary to reset entire histories to
modify some words or formatting in commit messages.
-- 
Arsen Arsenović

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 381 bytes --]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 17:44           ` Eli Zaretskii
  2023-04-21 17:50             ` Arsen Arsenović
@ 2023-04-21 18:25             ` Andreas Schwab
  1 sibling, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2023-04-21 18:25 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Arsen Arsenović, acm, jporterbugs, emacs-devel

On Apr 21 2023, Eli Zaretskii wrote:

>> From: Arsen Arsenović <arsen@aarsen.me>
>> Cc: acm@muc.de, jporterbugs@gmail.com, emacs-devel@gnu.org
>> Date: Fri, 21 Apr 2023 18:20:16 +0200
>> 
>> >> You can also reword during a rebase, which has the advantage of not
>> >> chucking any work away.  See [1].
>> >
>> > We advise contributors not to rebase on public branches, only on their
>> > local ones.
>> 
>> Sure, but using 'reset' implies that you can rebase, as otherwise you
>> couldn't reset to re-do the commits.
>
> No "git reset" doesn't rebase.

It is effectively a rebase as well.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 15:50       ` Eli Zaretskii
  2023-04-21 16:20         ` Arsen Arsenović
@ 2023-04-21 19:03         ` Björn Bidar
  2023-04-21 19:53           ` Jim Porter
  2023-04-22  7:03           ` Eli Zaretskii
  1 sibling, 2 replies; 34+ messages in thread
From: Björn Bidar @ 2023-04-21 19:03 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Arsen Arsenović, acm, jporterbugs, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Arsen Arsenović <arsen@aarsen.me>
>> Cc: Alan Mackenzie <acm@muc.de>, jporterbugs@gmail.com,
>> emacs-devel@gnu.org
>> Date: Fri, 21 Apr 2023 17:38:19 +0200
>> 
>> You can also reword during a rebase, which has the advantage of not
>> chucking any work away.  See [1].
>
> We advise contributors not to rebase on public branches, only on their
> local ones.
>
>> Everything can be changed at any point in Git, until you push, and
>> even
>> then if the server agrees, so there's never a time that is too late
>> to
>> error out.
>
> Our server is set up not to allow non-FF pushes.

Should branches be checked by these hooks before they are merged into
master/version branches? Before it is possible to fix any commits with
git rebase --interactive.

Br,

Björn



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 19:03         ` Björn Bidar
@ 2023-04-21 19:53           ` Jim Porter
  2023-04-22  7:03           ` Eli Zaretskii
  1 sibling, 0 replies; 34+ messages in thread
From: Jim Porter @ 2023-04-21 19:53 UTC (permalink / raw)
  To: Björn Bidar, Eli Zaretskii; +Cc: Arsen Arsenović, acm, emacs-devel

On 4/21/2023 12:03 PM, Björn Bidar wrote:
> Should branches be checked by these hooks before they are merged into
> master/version branches? Before it is possible to fix any commits with
> git rebase --interactive.

We could ignore branches like scratch/FOO, though the easiest way to 
avoid errors from these hooks is just to avoid adding file entries to 
the commit message in the first place. The hooks only flag your commit 
if the message includes a file entry that's not part of the diff; it 
doesn't mandate that the message lists every file from the diff, however.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 17:44     ` Jim Porter
@ 2023-04-22  6:11       ` Eli Zaretskii
  2023-04-22  6:59         ` Jim Porter
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-22  6:11 UTC (permalink / raw)
  To: Jim Porter; +Cc: john, emacs-devel

> Date: Fri, 21 Apr 2023 10:44:10 -0700
> From: Jim Porter <jporterbugs@gmail.com>
> Cc: emacs-devel@gnu.org, eliz@gnu.org
> 
> On 4/21/2023 9:44 AM, Jim Porter wrote:
> > I tried to be as lenient as possible with the Git hooks so that they 
> > only catch things that would be a problem for admin/authors.el, though
> > there are some small differences regarding multi-line lists of files.
> 
> After thinking about this for a bit, I came up with a way to match 
> admin/authors.el in the hooks (at least, to the best of my knowledge). 
> This should mean that the hooks will report exactly the same errors as 
> admin/authors.el.
> 
> With this patch, there's no restriction on putting a "*" in the first 
> column of a commit message, so long as that paragraph doesn't also have 
> a ":" in it (just like admin/authors.el).
> 
> Eli, what do you think? I've run this against the 5000 latest commits 
> and the results look right (though I haven't looked as thoroughly as 
> last time).

I think the hooks should follow the GNU Standards (which are also our
conventions), not their particular expressions in algorithms used by
authors.el.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 17:50             ` Arsen Arsenović
@ 2023-04-22  6:16               ` Eli Zaretskii
  0 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-22  6:16 UTC (permalink / raw)
  To: Arsen Arsenović; +Cc: acm, jporterbugs, emacs-devel

> From: Arsen Arsenović <arsen@aarsen.me>
> Cc: acm@muc.de, jporterbugs@gmail.com, emacs-devel@gnu.org
> Date: Fri, 21 Apr 2023 19:50:09 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> >
> > No, "git reset" doesn't rebase.
> 
> No, but it means that everything above the reset point is malleable, in
> which case, you can use a rebase on it anyway (which can be reasoned
> with as a hard reset with repeated patch application).
> 
> I'm trying to say that it's unnecessary to reset entire histories to
> modify some words or formatting in commit messages.

No one was talking about "entire histories", we were talking about a
single commit that was just made, but not yet pushed.  "git reset" is
a way to undo the last commit, letting the user to make any changes
he/she wants before committing again.  So it is (MO) the easiest and
safest way of fixing such mistakes.  It is why we recommend users to
always do a "git show" before pushing, so that they could review what
is about to be pushed before actually doing so.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22  6:11       ` Eli Zaretskii
@ 2023-04-22  6:59         ` Jim Porter
  2023-04-22  7:51           ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-22  6:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: john, emacs-devel

On 4/21/2023 11:11 PM, Eli Zaretskii wrote:
>> Date: Fri, 21 Apr 2023 10:44:10 -0700
>> From: Jim Porter <jporterbugs@gmail.com>
>> Cc: emacs-devel@gnu.org, eliz@gnu.org
>>
>> Eli, what do you think? I've run this against the 5000 latest commits
>> and the results look right (though I haven't looked as thoroughly as
>> last time).
> 
> I think the hooks should follow the GNU Standards (which are also our
> conventions), not their particular expressions in algorithms used by
> authors.el.

I *think* in this case, the two are the same... or at least the relevant 
algorithms in authors.el are consistent with the GNU standards. I don't 
see a strict set of rules in the GNU standards, so I had to make some 
guesses based on the examples and the implementation in authors.el. This 
is what I used:

1. A file entry is introduced by a line starting with "*" or ";[ \t]+*" 
(this is slightly different from authors.el; indenting the "*" will make 
it not count as a file entry for the purposes of the hooks)

2. A file entry must also contain a ":" before the next blank line

3. A file entry's list of files starts after the leading "*" and 
continues up until the first ":", "(", "[", or "<"

4. Each file in a list of files is split by spaces, and possibly a comma 
(specifically ",?[ \t]+")[1]

5. You can continue long file names over multiple lines if necessary

I think the first 3 are all consistent with the spirit of the GNU 
standards as well as all the provided examples. The 4th rule is more 
questionable; maybe we should be stricter about this one, but I saw that 
style a dozen or so times in looking through the commit history, and 
authors.el is ok with it. The last rule only comes up rarely, but it's 
useful for deeply-nested files (mostly test resources).

[1] Note: I tweaked this since the patch I posted. That's the only 
change though.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 19:03         ` Björn Bidar
  2023-04-21 19:53           ` Jim Porter
@ 2023-04-22  7:03           ` Eli Zaretskii
  2023-04-22 19:52             ` Jim Porter
  1 sibling, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-22  7:03 UTC (permalink / raw)
  To: Björn Bidar; +Cc: arsen, acm, jporterbugs, emacs-devel

> From: Björn Bidar <bjorn.bidar@thaodan.de>
> Cc: Arsen Arsenović <arsen@aarsen.me>,  acm@muc.de,
>   jporterbugs@gmail.com,
>   emacs-devel@gnu.org
> Date: Fri, 21 Apr 2023 22:03:01 +0300
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > Our server is set up not to allow non-FF pushes.
> 
> Should branches be checked by these hooks before they are merged into
> master/version branches?

No.  We don't enforce these rules on branches from which Emacs
tarballs will not be produced.  The commit log message which matters
for those is the one of the merge-commit when the branch is landed on
master.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22  6:59         ` Jim Porter
@ 2023-04-22  7:51           ` Eli Zaretskii
  2023-04-22 19:44             ` Jim Porter
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-22  7:51 UTC (permalink / raw)
  To: Jim Porter; +Cc: john, emacs-devel

> Date: Fri, 21 Apr 2023 23:59:33 -0700
> Cc: john@yates-sheets.org, emacs-devel@gnu.org
> From: Jim Porter <jporterbugs@gmail.com>
> 
> On 4/21/2023 11:11 PM, Eli Zaretskii wrote:
> >> Date: Fri, 21 Apr 2023 10:44:10 -0700
> >> From: Jim Porter <jporterbugs@gmail.com>
> >> Cc: emacs-devel@gnu.org, eliz@gnu.org
> >>
> >> Eli, what do you think? I've run this against the 5000 latest commits
> >> and the results look right (though I haven't looked as thoroughly as
> >> last time).
> > 
> > I think the hooks should follow the GNU Standards (which are also our
> > conventions), not their particular expressions in algorithms used by
> > authors.el.
> 
> I *think* in this case, the two are the same... or at least the relevant 
> algorithms in authors.el are consistent with the GNU standards. I don't 
> see a strict set of rules in the GNU standards, so I had to make some 
> guesses based on the examples and the implementation in authors.el.

The programmatic expression of the standards is in add-log.el, I
think.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22  7:51           ` Eli Zaretskii
@ 2023-04-22 19:44             ` Jim Porter
  2023-04-22 23:21               ` John Yates
  0 siblings, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-22 19:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: john, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 967 bytes --]

On 4/22/2023 12:51 AM, Eli Zaretskii wrote:
> The programmatic expression of the standards is in add-log.el, I
> think.

Thanks, that was helpful. From that, I see that you can list multiple 
files separated by commas and a space like so:

   * file1.txt, file2.txt

add-log.el's font-lock doesn't allow lists of files like that to span 
multiple lines, but I think it's reasonable to support that in the 
hooks. Everything else about the entries can span multiple lines, after all.

With add-log.el as a reference, I tightened up the Git hooks a bit so 
that they're stricter. They're still reasonably flexible though, e.g. it 
won't treat something like this as a file entry:

   * This is a bullet point.

That will help reduce false positives so that committers can format the 
description of their commit fairly freely. Hopefully this is a good 
balance: the hooks are stricter than authors.el, but not so strict that 
reasonable (IMO) messages will be rejected.

[-- Attachment #2: 0001-Improve-the-logic-of-the-file-entry-Git-hooks-to-mat.patch --]
[-- Type: text/plain, Size: 2617 bytes --]

From f86ddf1ced484f673eada070f35da7d052e43b99 Mon Sep 17 00:00:00 2001
From: Jim Porter <jporterbugs@gmail.com>
Date: Fri, 21 Apr 2023 10:06:49 -0700
Subject: [PATCH] Improve the logic of the file entry Git hooks to match
 admin/authors.el

* build-aux/git-hooks/commit-msg-files.awk (check_commit_msg_files):
Accumulate file entries over multiple lines.
---
 build-aux/git-hooks/commit-msg-files.awk | 31 ++++++++++++++++++------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/build-aux/git-hooks/commit-msg-files.awk b/build-aux/git-hooks/commit-msg-files.awk
index 3856e474d3e..edffb7d339c 100644
--- a/build-aux/git-hooks/commit-msg-files.awk
+++ b/build-aux/git-hooks/commit-msg-files.awk
@@ -59,15 +59,28 @@ function check_commit_msg_files(commit_sha, verbose,    changes, good, \
     if (verbose && ! msg)
       msg = $0
 
-    # Find lines that reference files.  We look at any line starting
-    # with "*" (possibly prefixed by "; ") where the file part starts
-    # with an alphanumeric character.  The file part ends if we
-    # encounter any of the following characters: [ ( < { :
-    if (/^(; )?\*[ \t]+[[:alnum:]]/ && match($0, /[[:alnum:]][^[(<{:]*/)) {
-      # There might be multiple files listed on this line, separated
+    # Find file entries in the commit message.  We look at any line
+    # starting with "*" (possibly prefixed by "; ") followed by a ":",
+    # possibly on a different line.  If we encounter a blank line
+    # without seeing a ":", then we don't treat that as a file entry.
+
+    # Accumulate the contents of a (possible) file entry.
+    if (/^[ \t]*$/)
+      filenames_str = ""
+    else if (/^(; )?\*[ \t]+[[:alnum:]]/)
+      filenames_str = $0
+    else if (filenames_str)
+      filenames_str = (filenames_str $0)
+
+    # We have a file entry; analyze it.
+    if (filenames_str && /:/) {
+      # Delete the leading "*" and any trailing information.
+      sub(/^(; )?\*[ \t]+/, "", filenames_str)
+      sub(/[[(<:].*$/, "", filenames_str)
+
+      # There might be multiple files listed in this entry, separated
       # by spaces (and possibly a comma).  Iterate over each of them.
-      split(substr($0, RSTART, RLENGTH), filenames, ",?([[:blank:]]+|$)")
-
+      split(filenames_str, filenames, ",[ \t]+")
       for (i in filenames) {
         # Remove trailing slashes from any directory entries.
         sub(/\/$/, "", filenames[i])
@@ -83,6 +96,8 @@ function check_commit_msg_files(commit_sha, verbose,    changes, good, \
           good = 0
         }
       }
+
+      filenames_str = ""
     }
   }
   close(cmd)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22  7:03           ` Eli Zaretskii
@ 2023-04-22 19:52             ` Jim Porter
  2023-04-23  6:11               ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-22 19:52 UTC (permalink / raw)
  To: Eli Zaretskii, Björn Bidar; +Cc: arsen, acm, emacs-devel

On 4/22/2023 12:03 AM, Eli Zaretskii wrote:
>> From: Björn Bidar <bjorn.bidar@thaodan.de>
>> Should branches be checked by these hooks before they are merged into
>> master/version branches?
> 
> No.  We don't enforce these rules on branches from which Emacs
> tarballs will not be produced.  The commit log message which matters
> for those is the one of the merge-commit when the branch is landed on
> master.
I'll make sure this works. I do think it's useful to check commits on 
all branches, since we won't know until the merge how a committer plans 
to do the merge. For example, you could push a feature branch upstream, 
and when the time comes to merge it, rebase it onto master. In that case 
the history is linear, so we'd want to check every commit that was 
newly-added to master.

I'm not exactly sure how the Git log -> ChangeLog -> etc/AUTHORS process 
works for merge commits, but it shouldn't be too hard to make the Git 
hooks be compatible with that process. (For example, we could just 
detect if we're pushing a merge commit and then stop validating any 
commits-to-be-pushed prior to that.)

Of course, we could also say that it's still useful to validate all 
those commits anyway, since it keeps our commit history better-formatted.

If there's ever a case where these new hooks do the wrong thing, we can 
just run "git push --no-verify" to ignore the hooks. We're the humans, 
so we have the final word over the computer, after all. :)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22 19:44             ` Jim Porter
@ 2023-04-22 23:21               ` John Yates
  0 siblings, 0 replies; 34+ messages in thread
From: John Yates @ 2023-04-22 23:21 UTC (permalink / raw)
  To: Jim Porter; +Cc: Eli Zaretskii, emacs-devel

On Sat, Apr 22, 2023 at 3:44 PM Jim Porter <jporterbugs@gmail.com> wrote:
> They're still reasonably flexible though, e.g. it
> won't treat something like this as a file entry:
>
>    * This is a bullet point.

Thank you.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-21 15:05   ` Eli Zaretskii
  2023-04-21 15:38     ` Arsen Arsenović
@ 2023-04-22 23:55     ` Dmitry Gutov
  2023-04-23  5:36       ` Eli Zaretskii
  2023-04-23  9:47       ` Björn Bidar
  1 sibling, 2 replies; 34+ messages in thread
From: Dmitry Gutov @ 2023-04-22 23:55 UTC (permalink / raw)
  To: Eli Zaretskii, Alan Mackenzie; +Cc: jporterbugs, emacs-devel

On 21/04/2023 18:05, Eli Zaretskii wrote:
> It is still helpful, because as long as you didn't push, you can still
> do
> 
>     $ git reset HEAD^
> 
> and then fix whatever needs fixing and commit again.

When only the commit message is problematic,

   git commit --amend

is usually a more ergonomic choice.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22 23:55     ` Dmitry Gutov
@ 2023-04-23  5:36       ` Eli Zaretskii
  2023-04-23  9:47       ` Björn Bidar
  1 sibling, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-23  5:36 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: acm, jporterbugs, emacs-devel

> Date: Sun, 23 Apr 2023 02:55:14 +0300
> Cc: jporterbugs@gmail.com, emacs-devel@gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> On 21/04/2023 18:05, Eli Zaretskii wrote:
> > It is still helpful, because as long as you didn't push, you can still
> > do
> > 
> >     $ git reset HEAD^
> > 
> > and then fix whatever needs fixing and commit again.
> 
> When only the commit message is problematic,
> 
>    git commit --amend
> 
> is usually a more ergonomic choice.

This was already mentioned up-thread, so no need to reiterate.  Alan
asked what to do if that is not enough, and that's why I mentioned
"git reset".



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22 19:52             ` Jim Porter
@ 2023-04-23  6:11               ` Eli Zaretskii
  2023-04-23  7:07                 ` Jim Porter
  2023-04-23  7:19                 ` Jim Porter
  0 siblings, 2 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-23  6:11 UTC (permalink / raw)
  To: Jim Porter; +Cc: bjorn.bidar, arsen, acm, emacs-devel

> Date: Sat, 22 Apr 2023 12:52:58 -0700
> Cc: arsen@aarsen.me, acm@muc.de, emacs-devel@gnu.org
> From: Jim Porter <jporterbugs@gmail.com>
> 
> On 4/22/2023 12:03 AM, Eli Zaretskii wrote:
> >> From: Björn Bidar <bjorn.bidar@thaodan.de>
> >> Should branches be checked by these hooks before they are merged into
> >> master/version branches?
> > 
> > No.  We don't enforce these rules on branches from which Emacs
> > tarballs will not be produced.  The commit log message which matters
> > for those is the one of the merge-commit when the branch is landed on
> > master.
> I'll make sure this works. I do think it's useful to check commits on 
> all branches, since we won't know until the merge how a committer plans 
> to do the merge.

The risk exists, but the freedom we give developers with log messages
on feature branches is more important.  We should not make developing
features more troublesome by enforcing well-formatted log messages for
WIP changes, especially since frequently the commits are just
checkpoints, they don't signify any changes that have a useful
description.  So we should not block pushing of branch commits due to
log messages.

> For example, you could push a feature branch upstream, and when the
> time comes to merge it, rebase it onto master. In that case the
> history is linear, so we'd want to check every commit that was
> newly-added to master.

We prefer that people do not rebase onto master.  We hope they don't
do that, in practice.  They can rebase the branch before merging it,
of course, but that is not the situation you describe, and should not
cause problems with these hooks, because only the log message of the
final merge-commit matters.

> I'm not exactly sure how the Git log -> ChangeLog -> etc/AUTHORS process 
> works for merge commits

Merge commits usually don't have any meaningful log messages and don't
mention files.  You must explicitly add a log message mentioning files
and functions for a merge-commit to have a log message, and we request
that people who land feature branch create a log message describing
all the new/changed things in the merge for that single commit before
you push.  Most other merge-commits aren't important in the context of
this discussion.

> For example, we could just detect if we're pushing a merge commit
> and then stop validating any commits-to-be-pushed prior to that.

If this can be done easily and without mistakes, that would be okay, I
think.  But please consider all the frequent use cases where
merge-commits appear.  These include:

  . merge-commit by "git pull" when there are local commits (several
    people making changes to the same branch at the same time)
  . merge from the release branch to master
  . cherry-pick from master to the release branch
  . merge of a feature branch or a scratch branch

> Of course, we could also say that it's still useful to validate all 
> those commits anyway, since it keeps our commit history better-formatted.

But if so, what do you want the user who does the merge do?  The log
messages of past commits cannot be amended, so this would simply
preclude people from pushing merges and/or force them to use the
"--no-verify" switch.  For example, a frequent situation for merges is
a merge from the release branch to master -- if one or more of the
commits being merged has bad log messages, we don't want to block the
merge, because nothing can be done about those bad messages at the
time of the merge.

> If there's ever a case where these new hooks do the wrong thing, we can 
> just run "git push --no-verify" to ignore the hooks. We're the humans, 
> so we have the final word over the computer, after all. :)

Increasing the number of situations where we'd need --no-verify is not
a Good Thing.  In this case, it is better to leave the badly-formatted
log messages alone, and let them be dealt with at make-tarball time.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23  6:11               ` Eli Zaretskii
@ 2023-04-23  7:07                 ` Jim Porter
  2023-04-23  7:37                   ` Eli Zaretskii
  2023-04-23  7:19                 ` Jim Porter
  1 sibling, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-23  7:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bjorn.bidar, arsen, acm, emacs-devel

On 4/22/2023 11:11 PM, Eli Zaretskii wrote:
> The risk exists, but the freedom we give developers with log messages
> on feature branches is more important.  We should not make developing
> features more troublesome by enforcing well-formatted log messages for
> WIP changes, especially since frequently the commits are just
> checkpoints, they don't signify any changes that have a useful
> description.  So we should not block pushing of branch commits due to
> log messages.

Ok, it should be easy to restrict the pre-push hook to master/emacs-NN 
branches. However, just to make sure we're on the same page: the *only* 
errors that committers will ever see from these hooks is if they listed 
a file in GNU Change Log format that wasn't in the diff. Anything else 
is allowed, even a message as simple as, "Do something".

> Merge commits usually don't have any meaningful log messages and don't
> mention files.  You must explicitly add a log message mentioning files
> and functions for a merge-commit to have a log message, and we request
> that people who land feature branch create a log message describing
> all the new/changed things in the merge for that single commit before
> you push.  Most other merge-commits aren't important in the context of
> this discussion.

This part should be easy enough to handle: if a commit is a merge 
commit, we just won't validate any file names listed in the log message. 
(In theory, we could try, but it would be a lot of effort for something 
with not a lot of value.)

Going back to your previous message, I want to be sure I understand this 
part:

> The commit log message which matters for those is the one of the
> merge-commit when the branch is landed on master.

Suppose I make commits A, B, and C on a branch, and then merge to master 
with merge-commit D. Does your message mean that the only commit that 
ends up in the ChangeLog file is commit D? (Or that commit D is the only 
one that authors.el looks at?)

I was looking for examples of merge commits to test against and found 
289b457cac1, merging the 'abort-redisplay' branch. That's in ChangeLog.4 
on the emacs-29 branch, as expected. However, I also see commits that 
appear to be from that branch in the change log too, such as 
4b00bc47c7e. I don't see any logic in authors.el to ignore these commits 
either. Am I just misunderstanding something?



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23  6:11               ` Eli Zaretskii
  2023-04-23  7:07                 ` Jim Porter
@ 2023-04-23  7:19                 ` Jim Porter
  2023-04-23  7:39                   ` Eli Zaretskii
  1 sibling, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-23  7:19 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bjorn.bidar, arsen, acm, emacs-devel

(I forgot to reply to this part of the message previously...)

On 4/22/2023 11:11 PM, Eli Zaretskii wrote:
> But if so, what do you want the user who does the merge do?  The log
> messages of past commits cannot be amended, so this would simply
> preclude people from pushing merges and/or force them to use the
> "--no-verify" switch.  For example, a frequent situation for merges is
> a merge from the release branch to master -- if one or more of the
> commits being merged has bad log messages, we don't want to block the
> merge, because nothing can be done about those bad messages at the
> time of the merge.

I wonder if we should get rid of the pre-push hook entirely then. While 
a committer can theoretically amend old commits prior to the merge, this 
is messy, a lot of work, and could cause other problems too.

This would mean that a committer sees a warning about bad log messages 
(at post-commit time), but it doesn't actually prevent them from doing 
anything. A warning might be too easy to ignore, but maybe it's better 
than being silent about it. I'm not sure what to do here...



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23  7:07                 ` Jim Porter
@ 2023-04-23  7:37                   ` Eli Zaretskii
  2023-04-23 19:15                     ` Jim Porter
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-23  7:37 UTC (permalink / raw)
  To: Jim Porter; +Cc: bjorn.bidar, arsen, acm, emacs-devel

> Date: Sun, 23 Apr 2023 00:07:49 -0700
> Cc: bjorn.bidar@thaodan.de, arsen@aarsen.me, acm@muc.de, emacs-devel@gnu.org
> From: Jim Porter <jporterbugs@gmail.com>
> 
> On 4/22/2023 11:11 PM, Eli Zaretskii wrote:
> > The risk exists, but the freedom we give developers with log messages
> > on feature branches is more important.  We should not make developing
> > features more troublesome by enforcing well-formatted log messages for
> > WIP changes, especially since frequently the commits are just
> > checkpoints, they don't signify any changes that have a useful
> > description.  So we should not block pushing of branch commits due to
> > log messages.
> 
> Ok, it should be easy to restrict the pre-push hook to master/emacs-NN 
> branches. However, just to make sure we're on the same page: the *only* 
> errors that committers will ever see from these hooks is if they listed 
> a file in GNU Change Log format that wasn't in the diff. Anything else 
> is allowed, even a message as simple as, "Do something".

It's a compromise (ideally, other mistakes in file names should also
be detected), but I think it's the best we can do without too many
false positives.

> > Merge commits usually don't have any meaningful log messages and don't
> > mention files.  You must explicitly add a log message mentioning files
> > and functions for a merge-commit to have a log message, and we request
> > that people who land feature branch create a log message describing
> > all the new/changed things in the merge for that single commit before
> > you push.  Most other merge-commits aren't important in the context of
> > this discussion.
> 
> This part should be easy enough to handle: if a commit is a merge 
> commit, we just won't validate any file names listed in the log message. 

This is not what I meant, though.  If a merge-commit does have a
meaningful log message, its file names should be validated.  When a
feature branch is landed on master, we request that all the important
changes that the branch brings be described in the log message of the
merge-commit.  So it shouldn't be ignored.

> > The commit log message which matters for those is the one of the
> > merge-commit when the branch is landed on master.
> 
> Suppose I make commits A, B, and C on a branch, and then merge to master 
> with merge-commit D. Does your message mean that the only commit that 
> ends up in the ChangeLog file is commit D? (Or that commit D is the only 
> one that authors.el looks at?)

The only one that it is reasonable to validate is commit D, yes.  The
other commits should not be rejected due to their log messages, at
least not ideally.

> I was looking for examples of merge commits to test against and found 
> 289b457cac1, merging the 'abort-redisplay' branch. That's in ChangeLog.4 
> on the emacs-29 branch, as expected. However, I also see commits that 
> appear to be from that branch in the change log too, such as 
> 4b00bc47c7e. I don't see any logic in authors.el to ignore these commits 
> either. Am I just misunderstanding something?

No, there's no such logic.  But rejecting commits on a feature/scratch
branch due to this would be too much of an annoyance, so we shouldn't
do that.  Whoever runs authors.el will have to deal with this.  (My
recommendation to developers who work on branches is not to use
ChangeLog-style file name references in branch commits that aren't
important enough to appear in the generated ChangeLog files.)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23  7:19                 ` Jim Porter
@ 2023-04-23  7:39                   ` Eli Zaretskii
  2023-04-23  9:51                     ` Björn Bidar
  0 siblings, 1 reply; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-23  7:39 UTC (permalink / raw)
  To: Jim Porter; +Cc: bjorn.bidar, arsen, acm, emacs-devel

> Date: Sun, 23 Apr 2023 00:19:24 -0700
> Cc: bjorn.bidar@thaodan.de, arsen@aarsen.me, acm@muc.de, emacs-devel@gnu.org
> From: Jim Porter <jporterbugs@gmail.com>
> 
> (I forgot to reply to this part of the message previously...)
> 
> On 4/22/2023 11:11 PM, Eli Zaretskii wrote:
> > But if so, what do you want the user who does the merge do?  The log
> > messages of past commits cannot be amended, so this would simply
> > preclude people from pushing merges and/or force them to use the
> > "--no-verify" switch.  For example, a frequent situation for merges is
> > a merge from the release branch to master -- if one or more of the
> > commits being merged has bad log messages, we don't want to block the
> > merge, because nothing can be done about those bad messages at the
> > time of the merge.
> 
> I wonder if we should get rid of the pre-push hook entirely then. While 
> a committer can theoretically amend old commits prior to the merge, this 
> is messy, a lot of work, and could cause other problems too.

"git commit --amend" is not a lot of work.  I do it all the time when
applying patches by others, because they frequently have mistakes and
omissions.

> This would mean that a committer sees a warning about bad log messages 
> (at post-commit time), but it doesn't actually prevent them from doing 
> anything. A warning might be too easy to ignore, but maybe it's better 
> than being silent about it. I'm not sure what to do here...

A warning that doesn't preclude a commit from being pushed is much
less useful, IMO.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-22 23:55     ` Dmitry Gutov
  2023-04-23  5:36       ` Eli Zaretskii
@ 2023-04-23  9:47       ` Björn Bidar
  1 sibling, 0 replies; 34+ messages in thread
From: Björn Bidar @ 2023-04-23  9:47 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: Eli Zaretskii, Alan Mackenzie, jporterbugs, emacs-devel

Dmitry Gutov <dmitry@gutov.dev> writes:

> On 21/04/2023 18:05, Eli Zaretskii wrote:
>> It is still helpful, because as long as you didn't push, you can
>> still
>> do
>>     $ git reset HEAD^
>> and then fix whatever needs fixing and commit again.
>
> When only the commit message is problematic,
>
>   git commit --amend
>
> is usually a more ergonomic choice.

Also you can use  -n, --[no-]verify to disable hooks also whileusing git
commit, see man 1 git-commit.

           By default, the pre-commit and commit-msg hooks are run. When any
           of --no-verify or -n is given, these are bypassed. See also
           githooks(5).



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23  7:39                   ` Eli Zaretskii
@ 2023-04-23  9:51                     ` Björn Bidar
  0 siblings, 0 replies; 34+ messages in thread
From: Björn Bidar @ 2023-04-23  9:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Jim Porter, arsen, acm, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

>> This would mean that a committer sees a warning about bad log messages 
>> (at post-commit time), but it doesn't actually prevent them from doing 
>> anything. A warning might be too easy to ignore, but maybe it's better 
>> than being silent about it. I'm not sure what to do here...
>
> A warning that doesn't preclude a commit from being pushed is much
> less useful, IMO.

I agree, especially if using a git frontend such as magit where you
maybe not see the full message unless you press $.

There are always options to disable the hooks in exceptional cases.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23  7:37                   ` Eli Zaretskii
@ 2023-04-23 19:15                     ` Jim Porter
  2023-04-23 19:24                       ` Eli Zaretskii
  0 siblings, 1 reply; 34+ messages in thread
From: Jim Porter @ 2023-04-23 19:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: bjorn.bidar, arsen, acm, emacs-devel

I've pushed a couple of fixes for the hooks (described in more detail 
below). I *think* this should resolve all the issues, though I'll still 
do more testing locally, especially with merge commits.

More generally, if people find these hooks to be annoying or unhelpful, 
I truly won't be offended if they get backed out. I wrote them in the 
hopes that it will make it easier to maintain Emacs, but if they can't 
meet that goal, then they shouldn't stick around causing problems.

On 4/23/2023 12:37 AM, Eli Zaretskii wrote:
>> Date: Sun, 23 Apr 2023 00:07:49 -0700
>> Cc: bjorn.bidar@thaodan.de, arsen@aarsen.me, acm@muc.de, emacs-devel@gnu.org
>> From: Jim Porter <jporterbugs@gmail.com>
>>
>> Ok, it should be easy to restrict the pre-push hook to master/emacs-NN
>> branches. ...
> 
> It's a compromise (ideally, other mistakes in file names should also
> be detected), but I think it's the best we can do without too many
> false positives.

Ok, done.

>> This part should be easy enough to handle: if a commit is a merge
>> commit, we just won't validate any file names listed in the log message.
> 
> This is not what I meant, though.  If a merge-commit does have a
> meaningful log message, its file names should be validated.

It turns out this is actually very easy to do, now that I've learned 
about the "--first-parent" option in Git. Using that on a merge commit, 
I can get a list of all the changes that were merged into the target 
branch (i.e. everything from the feature branch that's now on, say, master).

With this fix, commit 289b457cac1 (the merge of 'abort-redisplay') now 
passes the tests, as it should.

>> Suppose I make commits A, B, and C on a branch, and then merge to master
>> with merge-commit D. Does your message mean that the only commit that
>> ends up in the ChangeLog file is commit D? (Or that commit D is the only
>> one that authors.el looks at?)
> 
> The only one that it is reasonable to validate is commit D, yes.  The
> other commits should not be rejected due to their log messages, at
> least not ideally.

I believe "--first-parent" also solves this case. In particular, by 
using this option, the hook correctly ignores the commits from the Eglot 
branch that were merged in. This means that any merge like that in the 
future won't run afoul of these hooks.

I'll have to do a bit more testing here to try out other edge cases, but 
this should only result in false negatives, not false positives. (False 
negatives would just require some fixup when making the tarball.)



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: New Git hooks for checking file names in commit messages
  2023-04-23 19:15                     ` Jim Porter
@ 2023-04-23 19:24                       ` Eli Zaretskii
  0 siblings, 0 replies; 34+ messages in thread
From: Eli Zaretskii @ 2023-04-23 19:24 UTC (permalink / raw)
  To: Jim Porter; +Cc: bjorn.bidar, arsen, acm, emacs-devel

> Date: Sun, 23 Apr 2023 12:15:37 -0700
> Cc: bjorn.bidar@thaodan.de, arsen@aarsen.me, acm@muc.de, emacs-devel@gnu.org
> From: Jim Porter <jporterbugs@gmail.com>
> 
> I've pushed a couple of fixes for the hooks (described in more detail 
> below). I *think* this should resolve all the issues, though I'll still 
> do more testing locally, especially with merge commits.
> 
> More generally, if people find these hooks to be annoying or unhelpful, 
> I truly won't be offended if they get backed out. I wrote them in the 
> hopes that it will make it easier to maintain Emacs, but if they can't 
> meet that goal, then they shouldn't stick around causing problems.

We'll see.  In any case, thank you for doing this.



^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2023-04-23 19:24 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-21  4:55 New Git hooks for checking file names in commit messages Jim Porter
2023-04-21 12:05 ` John Yates
2023-04-21 16:44   ` Jim Porter
2023-04-21 17:44     ` Jim Porter
2023-04-22  6:11       ` Eli Zaretskii
2023-04-22  6:59         ` Jim Porter
2023-04-22  7:51           ` Eli Zaretskii
2023-04-22 19:44             ` Jim Porter
2023-04-22 23:21               ` John Yates
2023-04-21 13:57 ` Alan Mackenzie
2023-04-21 15:05   ` Eli Zaretskii
2023-04-21 15:38     ` Arsen Arsenović
2023-04-21 15:50       ` Eli Zaretskii
2023-04-21 16:20         ` Arsen Arsenović
2023-04-21 17:44           ` Eli Zaretskii
2023-04-21 17:50             ` Arsen Arsenović
2023-04-22  6:16               ` Eli Zaretskii
2023-04-21 18:25             ` Andreas Schwab
2023-04-21 19:03         ` Björn Bidar
2023-04-21 19:53           ` Jim Porter
2023-04-22  7:03           ` Eli Zaretskii
2023-04-22 19:52             ` Jim Porter
2023-04-23  6:11               ` Eli Zaretskii
2023-04-23  7:07                 ` Jim Porter
2023-04-23  7:37                   ` Eli Zaretskii
2023-04-23 19:15                     ` Jim Porter
2023-04-23 19:24                       ` Eli Zaretskii
2023-04-23  7:19                 ` Jim Porter
2023-04-23  7:39                   ` Eli Zaretskii
2023-04-23  9:51                     ` Björn Bidar
2023-04-22 23:55     ` Dmitry Gutov
2023-04-23  5:36       ` Eli Zaretskii
2023-04-23  9:47       ` Björn Bidar
2023-04-21 16:39   ` Jim Porter

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.