Received: from guru.guru-group.fi (unknown [IPv6:2a02:2380:1:9:5054:ff:feb7:a4bc])
 (using TLSv1.2 with cipher AES256-SHA (256/256 bits))
 (No client certificate requested) (Authenticated sender: too)
 by lahtoruutu.iki.fi (Postfix) with ESMTPSA id 4117D1B00196;
 Wed, 3 Jun 2020 21:02:18 +0300 (EEST)
From: Tomi Ollila
To: David Bremner, notmuch@notmuchmail.org
Subject: Re: [RFC] devel: script to calculate a list of authors.
In-Reply-To: <20200603160743.1449796-1-david@tethera.net>
References: <20200603160743.1449796-1-david@tethera.net>
List-Id: "Use and development of the notmuch mail system."

On Wed, Jun 03 2020, David Bremner wrote:

> As an initial heuristic, report anyone with at least 15 lines of code
> in the current source tree. Test corpora are excluded, although
> probably this doesn't change much about the list of authors
> produced.
> ---
>
> I realized both AUTHORS and debian/copyright are woefully out of
> date. I think it makes sense to keep something like this in the repo,
> both to ease updates and to document a policy. Presumably 'author '
> should be removed from the output, but I'm guessing Tomi will tear
> this apart anyway ;).

Hi David,

I started doing that before even reading your commit message (used less(1)
to read the email from the bottom up... ;)

I got some ideas, but then decided there is no point spending too many
minutes on it (what was that one 'time management' xkcd again...
=D)

Anyway, here is some fun editing of the nifty pipeline you wrote:

  git ls-files | grep -v -e "$FILE_EXCLUDE" | xargs -n 1 -d \\n \
    git blame -w --line-porcelain -- | \
    sed -n "/$AUTHOR_EXCLUDE/d; s/^[aA][uU][tT][hH][Oo][rR] //p" | \
    sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr

If there are more authors or files to exclude, then it is easiest to
just write those out in the pipeline, e.g.

  grep -v -e 'file-one' -e 'file 2 with spaces' ...

and

  sed -n '/author1_ex1/d; /author2_ex1/d; s/^[aA][uU][tT][hH][Oo][rR] //p'

(and, as usual, I always recommend 'set -euf' in shell scripts)

Tomi

>  devel/author-scan.sh | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>  create mode 100644 devel/author-scan.sh
>
> diff --git a/devel/author-scan.sh b/devel/author-scan.sh
> new file mode 100644
> index 00000000..b7b46a33
> --- /dev/null
> +++ b/devel/author-scan.sh
> @@ -0,0 +1,11 @@
> +#!/bin/sh
> +
> +FILE_EXCLUDE='corpora'
> +AUTHOR_EXCLUDE='uncrustify'
> +# based on the FSF guideline, for want of a better idea.
> +THRESHOLD=15
> +
> +git ls-files | grep -v "$FILE_EXCLUDE" |
> +    while read f; do
> +        git blame -w --line-porcelain -- "$f" | grep -I '^author ' | grep -v "$AUTHOR_EXCLUDE"
> +    done | sort -fd | uniq -ic | awk "\$1 >= $THRESHOLD" | sort -nr
> --
> 2.26.2
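
For anyone who wants to try the counting stage without running blame over a whole tree, here is a minimal sketch. It is not part of the patch: the function names count_authors and sample_blame, and the author names in the sample data, are made up for illustration, and the sample only imitates the 'author ' lines that git blame --line-porcelain emits.

```shell
#!/bin/sh
# -e: exit on error, -u: error on unset variables, -f: no filename globbing
set -euf

# Hypothetical helper (not in the patch): reads blame porcelain output on
# stdin and prints "<count> <author>" lines, highest count first, for
# authors with at least $1 lines. $2 is a pattern for authors to drop.
count_authors () {
    threshold=$1
    author_exclude=$2
    sed -n "/$author_exclude/d; s/^[aA][uU][tT][hH][Oo][rR] //p" |
        sort -fd | uniq -ic | awk "\$1 >= $threshold" | sort -nr
}

# Made-up stand-in for `git blame --line-porcelain` output.
sample_blame () {
    printf '%s\n' \
        'author Alice' \
        'author-mail <alice@example.org>' \
        'author Bob' \
        'author Alice' \
        'author uncrustify' \
        'author Alice' \
        'author Bob' \
        'author Carol' \
        'summary fix something'
}

# Prints counts for Alice (3) and Bob (2); Carol (1) is below the
# threshold and the excluded author is dropped entirely.
sample_blame | count_authors 2 uncrustify
```

In a real checkout the input would come from the first half of the pipeline above (git ls-files, grep -v, xargs ... git blame). Two caveats worth keeping in mind: the -d option is a GNU xargs extension, and in plain sh 'set -e' only looks at the exit status of the last command in a pipeline, so a failing git blame in the middle would not by itself stop the script.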