From: Ricardo Wurmus
Subject: Help wanted for mumi (issues.guix.gnu.org)
Date: Sat, 19 Oct 2019 23:12:30 +0200
Message-ID: <87o8ycjv7l.fsf@elephly.net>
To: Guix Devel

Hello Guix,

our bug tracker web interface at issues.guix.gnu.org could really
benefit from a more reliable, faster search.

Currently, mumi (the application behind issues.guix.gnu.org) uses a slow
interface to Debbugs, the bug tracker service that runs at
debbugs.gnu.org.  The search isn’t great as it returns duplicates and is
paginated, which makes it unsuitable for processing.  Mumi may need to
further filter the search results by status or activity, or any other
metric that the Debbugs search API doesn’t let us do.

So I decided to switch away from using the Debbugs API and instead
operate on a *local* copy of all messages that reach Debbugs.  Debbugs
operates on email messages, and luckily it allows us to download these
original messages.
Whenever someone visits an issue page, all related messages are
downloaded by mumi, so it amasses a sizeable stash of emails over time.

Mumi is using a modified version of “mu”, the mail indexer and search
tool, to continuously index the contents of all messages.  (“mu” is
modified only so that the issue number is indexed alongside the message
contents.)

Unfortunately, that’s as far as I got before life intervened.  The next
step is really close, but getting there requires more contiguous
segments of time than I can free at the moment.

We really only need to do the following things next:

1) keep updating the mu database as new messages are stored

2) use the mu Guile bindings to search messages via mu instead of using
   the slow Debbugs API.

While working on 2 we may find that more properties should be stored in
the mu database, and that’s fine.  Our variant of mu is easily patched
to accommodate our needs.

Does anyone here have an interest in playing with and improving mumi?
It’s a very simple code base and it’s very easy to get started.

The code is here:

    https://git.elephly.net/software/mumi.git

--
Ricardo