unofficial mirror of notmuch@notmuchmail.org
* [RFC] devel: script to calculate a list of authors.
@ 2020-06-03 16:07 David Bremner
  2020-06-03 18:02 ` Tomi Ollila
  0 siblings, 1 reply; 6+ messages in thread
From: David Bremner @ 2020-06-03 16:07 UTC (permalink / raw)
  To: notmuch

As an initial heuristic, report anyone with at least 15 lines of code
in the current source tree. Test corpora are excluded, although
probably this doesn't change much about the list of authors
produced.
---

I realized both AUTHORS and debian/copyright are woefully out of
date. I think it makes sense to keep something like this in the repo,
both to ease updates and to document a policy. Presumably 'author '
should be removed from the output, but I'm guessing Tomi will tear
this apart anyway ;).

 devel/author-scan.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 devel/author-scan.sh

diff --git a/devel/author-scan.sh b/devel/author-scan.sh
new file mode 100644
index 00000000..b7b46a33
--- /dev/null
+++ b/devel/author-scan.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+FILE_EXCLUDE='corpora'
+AUTHOR_EXCLUDE='uncrustify'
+# based on the FSF guideline, for want of a better idea.
+THRESHOLD=15
+
+git ls-files | grep -v "$FILE_EXCLUDE" |
+    while read f; do
+        git blame -w --line-porcelain -- "$f" | grep -I '^author ' | grep -v "$AUTHOR_EXCLUDE"
+    done | sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" |  sort -nr
-- 
2.26.2
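
The counting stage of the pipeline can be tried without a repository. The following is a minimal sketch, assuming made-up author names and a lowered threshold of 2 (the patch uses 15); the helper name `count_authors` is hypothetical and not part of the patch. It feeds hand-written "author" lines, of the kind `git blame --line-porcelain` emits, through the same sort/uniq/awk steps.

```sh
#!/bin/sh
# Lowered from 15 purely so that the small sample below produces output.
THRESHOLD=2

# Hypothetical helper: the tail of the pipeline from the patch.
count_authors() {
    grep '^author ' |
        sort -fd |                      # fold case so variants end up adjacent
        uniq -ic |                      # count adjacent lines, ignoring case
        awk "\$1 >= $THRESHOLD" |       # keep authors at or above the threshold
        sort -nr                        # most lines first
}

# Invented sample data standing in for real blame output.
result=$(printf '%s\n' \
    'author Alice Example' \
    'author Bob Example' \
    'author alice example' \
    'author Carol Example' \
    'author Bob Example' \
    'author Alice Example' | count_authors)

printf '%s\n' "$result"
```

With this sample, the three Alice lines (counted case-insensitively) and the two Bob lines pass the threshold, while the single Carol line is dropped.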

* Re: [RFC] devel: script to calculate a list of authors.
  2020-06-03 16:07 [RFC] devel: script to calculate a list of authors David Bremner
@ 2020-06-03 18:02 ` Tomi Ollila
  2020-06-05 10:07   ` [PATCH] " David Bremner
  0 siblings, 1 reply; 6+ messages in thread
From: Tomi Ollila @ 2020-06-03 18:02 UTC (permalink / raw)
  To: David Bremner, notmuch

On Wed, Jun 03 2020, David Bremner wrote:

> As an initial heuristic, report anyone with at least 15 lines of code
> in the current source tree. Test corpora are excluded, although
> probably this doesn't change much about the list of authors
> produced.
> ---
>
> I realized both AUTHORS and debian/copyright are woefully out of
> date. I think it makes sense to keep something like this in the repo,
> both to ease updates and to document a policy. Presumably 'author '
> should be removed from the output, but I'm guessing Tomi will tear
> this apart anyway ;).

Hi David,

I started doing that before even reading your commit (I used less(1) to
read the email from the bottom up... ;)

I got some ideas, but then decided there is no point in spending too
many minutes on it (what was that one 'time management' xkcd again... =D)

Anyway, I had some fun editing the nifty pipeline you wrote:

git ls-files | grep -v -e "$FILE_EXCLUDE" | xargs -n 1 -d \\n | \
    git blame -w --line-porcelain -- | \
    sed -n "/$AUTHOR_EXCLUDE/d; s/^[aA][uU][tT][hH][Oo][rR] //p" | \
    sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr

If there are more authors or files to exclude, then it is easiest
to just write those out in the pipeline, e.g.

grep -v -e 'file-one' -e 'file 2 with spaces' ...

and

sed -n '/author1_ex1/d; /author2_ex1/d; s/^[aA][uU][tT][hH][Oo][rR] //p'

(and, as usual, I always recommend 'set -euf' in shell scripts)

Tomi
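
The two suggestions above (several `-e` patterns for grep, several `/.../d` commands for sed) can be sketched together with `set -euf`. This is only an illustration under stated assumptions: the file names, the second author pattern `some-bot`, and the helper names `filter_files` and `filter_authors` are hypothetical, and sample input replaces `git ls-files` and `git blame` so that the block runs anywhere.

```sh
#!/bin/sh
# -e: exit on error, -u: error on unset variables, -f: disable globbing
set -euf

# Hypothetical file exclusions; each -e adds one pattern.
filter_files() {
    grep -v -e 'corpora' -e 'file 2 with spaces'
}

# Hypothetical author exclusions; each /.../d drops matching lines,
# and the final s///p prints only lines that began with "author ".
filter_authors() {
    sed -n '/uncrustify/d; /some-bot/d; s/^[aA][uU][tT][hH][oO][rR] //p'
}

# Invented input standing in for `git ls-files`.
files=$(printf '%s\n' \
    'lib/database.cc' \
    'test/corpora/default/cur/01' \
    'file 2 with spaces' | filter_files)

# Invented input standing in for `git blame --line-porcelain`.
authors=$(printf '%s\n' \
    'author Alice Example' \
    'author uncrustify' \
    'summary fix author handling' \
    'Author some-bot' \
    'author Bob Example' | filter_authors)

printf 'files:\n%s\nauthors:\n%s\n' "$files" "$authors"
```

Note that in plain POSIX sh, `set -e` only looks at the exit status of the last command in a pipeline, so a `grep -v` in the middle that selects nothing does not by itself abort the script.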

>  devel/author-scan.sh | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>  create mode 100644 devel/author-scan.sh
>
> diff --git a/devel/author-scan.sh b/devel/author-scan.sh
> new file mode 100644
> index 00000000..b7b46a33
> --- /dev/null
> +++ b/devel/author-scan.sh
> @@ -0,0 +1,11 @@
> +#!/bin/sh
> +
> +FILE_EXCLUDE='corpora'
> +AUTHOR_EXCLUDE='uncrustify'
> +# based on the FSF guideline, for want of a better idea.
> +THRESHOLD=15
> +
> +git ls-files | grep -v "$FILE_EXCLUDE" |
> +    while read f; do
> +        git blame -w --line-porcelain -- "$f" | grep -I '^author ' | grep -v "$AUTHOR_EXCLUDE"
> +    done | sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" |  sort -nr
> -- 
> 2.26.2

* [PATCH] devel: script to calculate a list of authors.
  2020-06-03 18:02 ` Tomi Ollila
@ 2020-06-05 10:07   ` David Bremner
  2020-06-05 12:10     ` Tomi Ollila
  2020-06-06 11:47     ` David Bremner
  0 siblings, 2 replies; 6+ messages in thread
From: David Bremner @ 2020-06-05 10:07 UTC (permalink / raw)
  To: Tomi Ollila, David Bremner, notmuch

As an initial heuristic, report anyone with at least 15 lines of code
in the current source tree. Test corpora are excluded, although
probably this doesn't change much about the list of authors
produced.
---
 devel/author-scan.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 devel/author-scan.sh

diff --git a/devel/author-scan.sh b/devel/author-scan.sh
new file mode 100644
index 00000000..2d9c4af8
--- /dev/null
+++ b/devel/author-scan.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+FILE_EXCLUDE='corpora'
+AUTHOR_EXCLUDE='uncrustify'
+# based on the FSF guideline, for want of a better idea.
+THRESHOLD=15
+
+git ls-files | grep -v -e "$FILE_EXCLUDE" | xargs -n 1 -d \\n \
+                                                  git blame -w --line-porcelain -- | \
+    sed -n "/$AUTHOR_EXCLUDE/d; s/^[aA][uU][tT][hH][Oo][rR] //p" | \
+    sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr
-- 
2.26.2

* Re: [PATCH] devel: script to calculate a list of authors.
  2020-06-05 10:07   ` [PATCH] " David Bremner
@ 2020-06-05 12:10     ` Tomi Ollila
  2020-06-05 13:03       ` David Bremner
  2020-06-06 11:47     ` David Bremner
  1 sibling, 1 reply; 6+ messages in thread
From: Tomi Ollila @ 2020-06-05 12:10 UTC (permalink / raw)
  To: David Bremner, notmuch

On Fri, Jun 05 2020, David Bremner wrote:

> As an initial heuristic, report anyone with at least 15 lines of code
> in the current source tree. Test corpora are excluded, although
> probably this doesn't change much about the list of authors
> produced.
> ---
>  devel/author-scan.sh | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>  create mode 100644 devel/author-scan.sh
>
> diff --git a/devel/author-scan.sh b/devel/author-scan.sh
> new file mode 100644
> index 00000000..2d9c4af8
> --- /dev/null
> +++ b/devel/author-scan.sh
> @@ -0,0 +1,11 @@
> +#!/bin/sh
> +
> +FILE_EXCLUDE='corpora'
> +AUTHOR_EXCLUDE='uncrustify'
> +# based on the FSF guideline, for want of a better idea.
> +THRESHOLD=15
> +
> +git ls-files | grep -v -e "$FILE_EXCLUDE" | xargs -n 1 -d \\n \
> +                                                  git blame -w --line-porcelain -- | \

It worked!? =D -- good -- the indentation in the line above is interesting...

Tomi

> +    sed -n "/$AUTHOR_EXCLUDE/d; s/^[aA][uU][tT][hH][Oo][rR] //p" | \
> +    sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr
> -- 
> 2.26.2

* Re: [PATCH] devel: script to calculate a list of authors.
  2020-06-05 12:10     ` Tomi Ollila
@ 2020-06-05 13:03       ` David Bremner
  0 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2020-06-05 13:03 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

Tomi Ollila <tomi.ollila@iki.fi> writes:

> On Fri, Jun 05 2020, David Bremner wrote:
>
>> As an initial heuristic, report anyone with at least 15 lines of code
>> in the current source tree. Test corpora are excluded, although
>> probably this doesn't change much about the list of authors
>> produced.
>> ---
>>  devel/author-scan.sh | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>  create mode 100644 devel/author-scan.sh
>>
>> diff --git a/devel/author-scan.sh b/devel/author-scan.sh
>> new file mode 100644
>> index 00000000..2d9c4af8
>> --- /dev/null
>> +++ b/devel/author-scan.sh
>> @@ -0,0 +1,11 @@
>> +#!/bin/sh
>> +
>> +FILE_EXCLUDE='corpora'
>> +AUTHOR_EXCLUDE='uncrustify'
>> +# based on the FSF guideline, for want of a better idea.
>> +THRESHOLD=15
>> +
>> +git ls-files | grep -v -e "$FILE_EXCLUDE" | xargs -n 1 -d \\n \
>> +                                                  git blame -w --line-porcelain -- | \
>
> It worked!? =D -- good -- the indentation in the line above is interesting...

I had to delete a | before git blame to get your version to work, so I
accepted Emacs' suggestion of how to indent, since it emphasizes that
git blame is an argument to xargs.
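
To illustrate why the command has to follow xargs directly rather than sit behind another `|`: xargs takes the command to run as its own trailing arguments and appends each input item to it. A minimal sketch, assuming GNU xargs (the `-d` option is a GNU extension and may be missing elsewhere); `printf '[%s]\n'` stands in for `git blame -w --line-porcelain --` and the file names are invented.

```sh
#!/bin/sh
# With -n 1, xargs runs the command once per input item; with -d '\n',
# items are split on newlines only, so a name containing spaces stays
# in one piece. printf is a stand-in for the git blame invocation.
out=$(printf '%s\n' 'notmuch.c' 'doc/file with spaces.rst' |
    xargs -n 1 -d '\n' printf '[%s]\n')

printf '%s\n' "$out"
```

Each input line reaches the command as a single argument, which is what lets one `git blame` run per file. With a `|` between xargs and the command, xargs would instead have no command of its own and the next program would receive the file names on standard input.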

* Re: [PATCH] devel: script to calculate a list of authors.
  2020-06-05 10:07   ` [PATCH] " David Bremner
  2020-06-05 12:10     ` Tomi Ollila
@ 2020-06-06 11:47     ` David Bremner
  1 sibling, 0 replies; 6+ messages in thread
From: David Bremner @ 2020-06-06 11:47 UTC (permalink / raw)
  To: Tomi Ollila, notmuch

David Bremner <david@tethera.net> writes:

> As an initial heuristic, report anyone with at least 15 lines of code
> in the current source tree. Test corpora are excluded, although
> probably this doesn't change much about the list of authors
> produced.

Second version pushed to master and release.

d
