From: Tomi Ollila <tomi.ollila@iki.fi>
To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org
Subject: Re: [RFC] devel: script to calculate a list of authors.
Date: Wed, 03 Jun 2020 21:02:16 +0300
Message-ID: <m2tuzs6n9z.fsf@guru.guru-group.fi>
In-Reply-To: <20200603160743.1449796-1-david@tethera.net>
On Wed, Jun 03 2020, David Bremner wrote:
> As an initial heuristic, report anyone with at least 15 lines of code
> in the current source tree. Test corpora are excluded, although
> probably this doesn't change much about the list of authors
> produced.
> ---
>
> I realized both AUTHORS and debian/copyright are woefully out of
> date. I think it makes sense to keep something like this in the repo,
> both to ease updates and to document a policy. Presumably 'author '
> should be removed from the output, but I'm guessing Tomi will tear
> this apart anyway ;).
Hi David,
I started doing that before even reading your commit (used less(1) to
look at the email from the bottom... ;)
I got some ideas, but then decided there is no point in spending too
many minutes (what was that one 'time management' xkcd again... =D)
Anyway, I had some fun editing the nifty pipeline you wrote:
  git ls-files | grep -v -e "$FILE_EXCLUDE" | xargs -n 1 -d \\n \
      git blame -w --line-porcelain -- | \
      sed -n "/$AUTHOR_EXCLUDE/d; s/^[aA][uU][tT][hH][Oo][rR] //p" | \
      sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr
If there are more authors or files to exclude, then it is easiest
to just write those out in the pipeline, e.g.
  grep -v -e 'file-one' -e 'file 2 with spaces' ...
and
  sed -n '/author1_ex1/d; /author2_ex1/d; s/^[aA][uU][tT][hH][Oo][rR] //p'
(and, as usual, I always recommend 'set -euf' in shell scripts)
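As a rough sketch of how the suggestions above might fit together in one
script: the function names count_authors and scan_tree are made up for
illustration, the exclude patterns are the placeholder examples from this
mail, and -d is a GNU xargs extension, so this assumes GNU findutils.

```shell
#!/bin/sh
# Sketch only; function names are illustrative, not part of the patch.
set -euf

# Based on the FSF guideline, as in the original script.
THRESHOLD=15

# Filter: read `git blame --line-porcelain` output on stdin and print
# "count name" for every author with at least THRESHOLD lines,
# largest count first. Lines mentioning an excluded author are dropped.
count_authors () {
    sed -n '/uncrustify/d; s/^[aA][uU][tT][hH][oO][rR] //p' |
        sort -fd | uniq -ic |
        awk -v min="$THRESHOLD" '$1 >= min' | sort -nr
}

# Blame every tracked file except the excluded ones, one file per
# git invocation, and feed the result to the filter above.
scan_tree () {
    git ls-files | grep -v -e 'corpora' -e 'file 2 with spaces' |
        xargs -n 1 -d '\n' git blame -w --line-porcelain -- |
        count_authors
}
```

Running scan_tree at the top of a git work tree would then print the
list; count_authors can also be tried on its own with saved blame output.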
Tomi
> devel/author-scan.sh | 11 +++++++++++
> 1 file changed, 11 insertions(+)
> create mode 100644 devel/author-scan.sh
>
> diff --git a/devel/author-scan.sh b/devel/author-scan.sh
> new file mode 100644
> index 00000000..b7b46a33
> --- /dev/null
> +++ b/devel/author-scan.sh
> @@ -0,0 +1,11 @@
> +#!/bin/sh
> +
> +FILE_EXCLUDE='corpora'
> +AUTHOR_EXCLUDE='uncrustify'
> +# based on the FSF guideline, for want of a better idea.
> +THRESHOLD=15
> +
> +git ls-files | grep -v "$FILE_EXCLUDE" |
> + while read f; do
> + git blame -w --line-porcelain -- "$f" | grep -I '^author ' | grep -v "$AUTHOR_EXCLUDE"
> + done | sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr
> --
> 2.26.2