From: Alan Mackenzie
Newsgroups: gmane.emacs.devel
Subject: Re: Mistakes in commit log messages
Date: Tue, 11 Apr 2023 14:01:48 +0000
References: <835ya5m4p0.fsf@gnu.org> <83v8i4arzt.fsf@gnu.org> <838rezardu.fsf@gnu.org>
To: Eli Zaretskii
Cc: Jim Porter, emacs-devel@gnu.org

Hello, Eli and Jim.

On Tue, Apr 11, 2023 at 09:02:05 +0300, Eli Zaretskii wrote:
> > From: Jim Porter
> > Date: Mon, 10 Apr 2023 14:52:15 -0700
> > Cc: Alan Mackenzie, philipk@posteo.net, luangruo@yahoo.com

> > On Mon, Apr 10, 2023 at 10:18 AM Jim Porter wrote:
> > > I looked into doing this, and I think it'd be possible to extend the
> > > existing commit-msg hook (in build-aux/git-hooks) to do this, at least
> > > using gawk. I don't really know awk though, so I'm sure my solution
> > > would be clumsy and probably gawk-specific. I wonder if we could make
> > > the hooks use Emacs Lisp...

> > If someone could figure out how to disable this code on non-gawk awks,
> > I think the attached diff should do the trick. Any thoughts?

> I think a solution that doesn't use Gawk-specific features would be
> preferable, since no one said the mistakes are private only to users
> of GNU/Linux and MS-Windows, where Gawk is basically the only Awk.

> For the other readers of emacs-devel: this came from a private email I
> wrote to several of our active contributors telling them that their
> commit log messages included a substantial number of mistakes in file
> names mentioned in the log message. The admin/authors.el program
> discovered those mistakes while trying to generate attributions for
> who did what in Emacs (the etc/AUTHORS file). Someone suggested to
> augment our commit hooks to avoid such mistakes, at least those of
> them that can be easily detected by a simple script.

> The script suggested by Jim is below:

> > diff --git a/build-aux/git-hooks/commit-msg b/build-aux/git-hooks/commit-msg
> > index d0578bcfb46..cdc99f4b399 100755
> > --- a/build-aux/git-hooks/commit-msg
> > +++ b/build-aux/git-hooks/commit-msg
> > @@ -45,6 +45,7 @@ at_sign=
> >
> >  # Check the log entry.
> >  exec $awk -v at_sign="$at_sign" -v cent_sign="$cent_sign" -v file="$1" '
> > +  @load "filefuncs"
> >    BEGIN {
> >      # These regular expressions assume traditional Unix unibyte behavior.
> >      # They are needed for old or broken versions of awk, e.g.,
> > @@ -129,6 +130,18 @@ at_sign=
> >      status = 1
> >    }
> >
> > +  /^* / {
> > +    # Check that any filenames mentioned in the commit message
> > +    # actually exist. Currently, this only prints a warning to
> > +    # prevent potential issues with false positives.
> > +    if(match($2, "[^:/][^:]*")) {
> > +      FILE = substr($2, RSTART, RLENGTH)
> > +      if(stat(FILE, type) < 0) {
> > +        printf("Warning: file '\''%s'\'' in commit message not found\n", FILE)
> > +      }
> > +    }
> > +  }
> > +
> >    $0 ~ unsafe_gnu_url {
> >      needs_rewriting = 1
> >    }

After having to ask on the help-gawk mailing list how to do it, I've got
a suggestion that uses only AWK, and checks for the existence of each
file in a "* foo..." line by attempting to read the first line from it.
It also reports an error if there are no such lines (it is possible the
contributor forgot to include the "* " in his file lines).
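
The getline test described above can also be tried on its own, outside
the hook. What follows is only a minimal sketch under stated
assumptions, not the hook itself: the function name
check_commit_msg_files is hypothetical, the check is assumed to run from
the top of the source tree so that relative file names resolve, and
stripping a trailing colon from the file name is an added assumption
that the patch does not make. ERRNO is left out because it is a gawk
extension.

```shell
# Hypothetical standalone helper, not part of build-aux/git-hooks.
# Usage: check_commit_msg_files MESSAGE-FILE
# Prints one line per problem and returns non-zero if any was found.
check_commit_msg_files () {
  awk '
    # A "file line" starts with "* " followed by a file name character.
    /^\* [a-zA-Z0-9_.~#-]/ {
      nfiles++
      name = $2
      # Assumption: "* foo.el: text" should test "foo.el", so drop
      # one trailing colon from the second field.
      sub(/:$/, "", name)
      # getline returns 1 if a line was read, 0 at end of file,
      # and -1 if the file could not be opened or read.
      if ((getline first_line < name) < 0) {
        print "File " name " cannot be read"
        status = 1
      }
      close(name)
    }
    END {
      if (!nfiles) {
        print "No file lines in commit message"
        status = 1
      }
      exit status + 0
    }
  ' "$1"
}
```

Since getline returns 0 rather than -1 for an empty file, an empty but
existing file passes this test; only a file that cannot be opened or
read is reported.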

--- commit-msg  2023-01-15 15:01:05.006074916 +0000
+++ commit-msg.acm  2023-04-11 13:59:18.517300896 +0000
@@ -138,11 +138,24 @@
     status = 1
   }
 
+  /^\* [a-zA-Z0-9_.~#-]/ {
+    nfiles++
+    if ((rc = (getline x < $2)) < 0) {
+      status = 1
+      print "File " $2 " cannot be read: [" ERRNO "]"
+    }
+    close($2)
+  }
+
   END {
     if (nlines == 0) {
       print "Empty commit message"
       status = 1
     }
+    if (!nfiles) {
+      print "No file lines in commit message"
+      status = 1
+    }
     if (status == 0 && needs_rewriting) {
       for (i = 1; i <= NR; i++) {
         line = input[i]

-- 
Alan Mackenzie (Nuremberg, Germany).