From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.2 required=3.0 tests=ALL_TRUSTED,BAYES_00,
  DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF shortcircuit=no
  autolearn=ham autolearn_force=no version=3.4.6
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
  by dcvr.yhbt.net (Postfix) with ESMTP id A348B1F538
  for ; Tue, 21 Mar 2023 23:07:47 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=80x24.org;
  s=selector1; t=1679440067;
  bh=BwJsLCkuceWpGRk/2G45jRngwktJIjIuzdIuEkgKD4w=;
  h=From:To:Subject:Date:In-Reply-To:References:From;
  b=yNB0VUytLMCXWbDZKwWEzgRKOJK5IUlCO+tJaLs0ilpob4mDAKsSQpo9RyNqwiCIa
  iTsIy5p/sxdWQ1RxaNdY8+FVMLl2G0ilhpCRaCedoB8+jEHuZc8h2dq5Rv3v2pmhNL
  gI075iv2r1gXzn16vIY/MfFt+h+eRA4i7aWwGNk0=
From: Eric Wong
To: meta@public-inbox.org
Subject: [PATCH 20/28] cindex: attempt to give oldest commits lowest docids
Date: Tue, 21 Mar 2023 23:07:35 +0000
Message-Id: <20230321230743.3020032-20-e@80x24.org>
In-Reply-To: <20230321230743.3020032-1-e@80x24.org>
References: <20230321230701.3019936-1-e@80x24.org>
  <20230321230743.3020032-1-e@80x24.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id:

Monotonically increasing docids may help us avoid sorting output
for the web and CLI, since recent commits are generally the most
desired search results.

`git log --reverse' incurs no extra overhead in this case, since
`--stdin' will mean git buffers the commit list in memory before
attempting to emit anything.
---
 lib/PublicInbox/CodeSearchIdx.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/PublicInbox/CodeSearchIdx.pm b/lib/PublicInbox/CodeSearchIdx.pm
index 176422d0..f0b506da 100644
--- a/lib/PublicInbox/CodeSearchIdx.pm
+++ b/lib/PublicInbox/CodeSearchIdx.pm
@@ -52,8 +52,12 @@ our $SEEN_MAX = 100000;
 
 # TODO: do we care about committer name + email? or tree OID?
 my @FMT = qw(H P ct an ae at s b); # (b)ody must be last
+
+# git log --stdin buffers all commits before emitting, thus --reverse
+# doesn't incur extra overhead. We use --reverse to keep Xapian docids
+# increasing so we may be able to avoid sorting results in some cases
 my @LOG_STDIN = (qw(log --no-decorate --no-color --no-notes -p --stat -M
-	--stdin --no-walk=unsorted), '--pretty=format:%n%x00'.
+	--reverse --stdin --no-walk=unsorted), '--pretty=format:%n%x00'.
 	join('%n', map { "%$_" } @FMT));
 
 sub new {
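
The ordering idea from the commit message can be sketched roughly as
follows. This is only an illustration in Python, not the Perl in the
patch, and it does not touch Xapian or git. The commit list, the
assign_docids() and newest_results() helpers, and the assumption that
commits arrive newest-first are all hypothetical.

```python
# Hypothetical (commit_time, oid) pairs, assumed to arrive newest-first;
# not taken from any real repository.
newest_first = [(300, "c3"), (200, "b2"), (100, "a1")]


def assign_docids(commits):
    """Number commits 1..N in the order given, like an append-only index."""
    return {docid: oid for docid, (_ct, oid) in enumerate(commits, start=1)}


# Reversing the input stands in for what --reverse does to git's output:
# the oldest commit is indexed first and gets the lowest docid.
index = assign_docids(reversed(newest_first))


def newest_results(index, limit):
    """Return up to `limit` OIDs, newest first, by walking docids downward.

    No sort by commit time is needed because docid order already
    matches commit age in this sketch.
    """
    top = len(index)
    return [index[d] for d in range(top, max(top - limit, 0), -1)]


print(newest_results(index, 2))
```

The sketch only holds while docid order really does track commit age,
which is why the subject line says "attempt".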