From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Bremner
To: notmuch@notmuchmail.org
Subject: [PATCH 14/16] CLI/git: create CachedIndex class
Date: Sat, 23 Apr 2022 10:38:46 -0300
Message-Id: <20220423133848.3852688-15-david@tethera.net>
X-Mailer: git-send-email 2.35.2
In-Reply-To: <20220423133848.3852688-1-david@tethera.net>
References: <20220423133848.3852688-1-david@tethera.net>
MIME-Version: 1.0
List-Id: "Use and development of the notmuch mail system."
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

The "git-read-tree HEAD" is a bottleneck, but unfortunately sometimes
is needed. Cache the index checksum and hash to reduce the number of
times the operation is run. The overall design is a simplified version
of the PrivateIndex class, which is partially refactored to support
the new class.
---
 notmuch-git.in | 136 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 97 insertions(+), 39 deletions(-)

diff --git a/notmuch-git.in b/notmuch-git.in
index b3f71699..261b3f85 100755
--- a/notmuch-git.in
+++ b/notmuch-git.in
@@ -342,41 +342,98 @@ def _is_committed(status):
     return len(status['added']) + len(status['deleted']) == 0
 
 
+class CachedIndex:
+    def __init__(self, repo, treeish):
+        self.cache_path = _os.path.join(repo, 'notmuch', 'index_cache.json')
+        self.index_path = _os.path.join(repo, 'index')
+        self.current_treeish = treeish
+        # cached values
+        self.treeish = None
+        self.hash = None
+        self.index_checksum = None
+
+        self._load_cache_file()
+
+    def _load_cache_file(self):
+        try:
+            with open(self.cache_path) as f:
+                data = _json.load(f)
+                self.treeish = data['treeish']
+                self.hash = data['hash']
+                self.index_checksum = data['index_checksum']
+        except FileNotFoundError:
+            pass
+        except _json.JSONDecodeError:
+            _LOG.error("Error decoding cache")
+            _sys.exit(1)
+
+    def __enter__(self):
+        self.read_tree()
+        return self
+
+    def __exit__(self, type, value, traceback):
+        checksum = _read_index_checksum(self.index_path)
+        (_, hash, _) = _git(
+            args=['rev-parse', self.current_treeish],
+            stdout=_subprocess.PIPE,
+            wait=True)
+
+        with open(self.cache_path, "w") as f:
+            _json.dump({'treeish': self.current_treeish,
+                        'hash': hash.rstrip(), 'index_checksum': checksum }, f)
+
+    @timed
+    def read_tree(self):
+        current_checksum = _read_index_checksum(self.index_path)
+        (_, hash, _) = _git(
+            args=['rev-parse', self.current_treeish],
+            stdout=_subprocess.PIPE,
+            wait=True)
+        current_hash = hash.rstrip()
+
+        if self.current_treeish == self.treeish and \
+           self.index_checksum and self.index_checksum == current_checksum and \
+           self.hash and self.hash == current_hash:
+            return
+
+        _git(args=['read-tree', self.current_treeish], wait=True)
+
+
 def commit(treeish='HEAD', message=None):
     """
     Commit prefix-matching tags from the notmuch database to Git.
     """
+
     status = get_status()
 
     if _is_committed(status=status):
         _LOG.warning('Nothing to commit')
         return
 
-    _git(args=['read-tree', '--empty'], wait=True)
-    _git(args=['read-tree', treeish], wait=True)
-    try:
-        _update_index(status=status)
-        (_, tree, _) = _git(
-            args=['write-tree'],
-            stdout=_subprocess.PIPE,
-            wait=True)
-        (_, parent, _) = _git(
-            args=['rev-parse', treeish],
-            stdout=_subprocess.PIPE,
-            wait=True)
-        (_, commit, _) = _git(
-            args=['commit-tree', tree.strip(), '-p', parent.strip()],
-            input=message,
-            stdout=_subprocess.PIPE,
-            wait=True)
-        _git(
-            args=['update-ref', treeish, commit.strip()],
-            stdout=_subprocess.PIPE,
-            wait=True)
-    except Exception as e:
-        _git(args=['read-tree', '--empty'], wait=True)
-        _git(args=['read-tree', treeish], wait=True)
-        raise
+    with CachedIndex(NMBGIT, treeish) as index:
+        try:
+            _update_index(status=status)
+            (_, tree, _) = _git(
+                args=['write-tree'],
+                stdout=_subprocess.PIPE,
+                wait=True)
+            (_, parent, _) = _git(
+                args=['rev-parse', treeish],
+                stdout=_subprocess.PIPE,
+                wait=True)
+            (_, commit, _) = _git(
+                args=['commit-tree', tree.strip(), '-p', parent.strip()],
+                input=message,
+                stdout=_subprocess.PIPE,
+                wait=True)
+            _git(
+                args=['update-ref', treeish, commit.strip()],
+                stdout=_subprocess.PIPE,
+                wait=True)
+        except Exception as e:
+            _git(args=['read-tree', '--empty'], wait=True)
+            _git(args=['read-tree', treeish], wait=True)
+            raise
 
 @timed
 def _update_index(status):
@@ -664,7 +721,7 @@ class PrivateIndex:
         return self
 
     def __exit__(self, type, value, traceback):
-        checksum = self._read_index_checksum()
+        checksum = _read_index_checksum(self.index_path)
         (count, uuid, lastmod) = _read_database_lastmod()
         with open(self.cache_path, "w") as f:
             _json.dump({'prefix': self.current_prefix, 'uuid': uuid, 'lastmod': lastmod, 'checksum': checksum }, f)
@@ -683,23 +740,11 @@ class PrivateIndex:
             _LOG.error("Error decoding cache")
             _sys.exit(1)
 
-    def _read_index_checksum (self):
-        """Read the index checksum, as defined by index-format.txt in the git source
-        WARNING: assumes SHA1 repo"""
-        import binascii
-        try:
-            with open(self.index_path, 'rb') as f:
-                size=_os.path.getsize(self.index_path)
-                f.seek(size-20);
-                return binascii.hexlify(f.read(20)).decode('ascii')
-        except FileNotFoundError:
-            return None
-
     @timed
     def _index_tags(self):
         "Write notmuch tags to private git index."
         prefix = '+{0}'.format(_ENCODED_TAG_PREFIX)
-        current_checksum = self._read_index_checksum()
+        current_checksum = _read_index_checksum(self.index_path)
         if (self.prefix == None or self.prefix != self.current_prefix
             or self.checksum == None or self.checksum != current_checksum):
             _git(
@@ -755,6 +800,19 @@ class PrivateIndex:
                 s[id].add(tag)
         return s
 
+def _read_index_checksum (index_path):
+    """Read the index checksum, as defined by index-format.txt in the git source
+    WARNING: assumes SHA1 repo"""
+    import binascii
+    try:
+        with open(index_path, 'rb') as f:
+            size=_os.path.getsize(index_path)
+            f.seek(size-20);
+            return binascii.hexlify(f.read(20)).decode('ascii')
+    except FileNotFoundError:
+        return None
+
+
 def _clear_tags_for_message(index, id):
     """
     Clear any existing index entries for message 'id'
-- 
2.35.2
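
For readers following the patch, here is a minimal standalone sketch of
the freshness check that CachedIndex.read_tree relies on: the cached
(treeish, hash, index_checksum) triple is compared with the current
state, and only a mismatch would call for another "git read-tree". The
names cache_is_fresh and demo, the fake index file, and the all-zero
commit hash are illustrative assumptions and not part of notmuch-git;
the real code obtains the hash from "git rev-parse". Like the patch,
the sketch assumes a SHA-1 repository whose index ends in a 20-byte
checksum.

```python
import binascii
import hashlib
import json
import os
import tempfile


def read_index_checksum(index_path):
    """Return the last 20 bytes of the file as hex, or None if it is missing.

    Mirrors the patch's helper; assumes a SHA-1 repository.
    """
    try:
        with open(index_path, 'rb') as f:
            size = os.path.getsize(index_path)
            f.seek(size - 20)
            return binascii.hexlify(f.read(20)).decode('ascii')
    except FileNotFoundError:
        return None


def cache_is_fresh(cache_path, index_path, treeish, commit_hash):
    """Hypothetical helper: True when the cached triple matches the current state."""
    try:
        with open(cache_path) as f:
            cached = json.load(f)
    except FileNotFoundError:
        return False
    checksum = read_index_checksum(index_path)
    return (checksum is not None
            and cached.get('treeish') == treeish
            and cached.get('hash') == commit_hash
            and cached.get('index_checksum') == checksum)


def demo():
    """Exercise the check against a fake index file in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        index_path = os.path.join(tmp, 'index')
        cache_path = os.path.join(tmp, 'index_cache.json')
        fake_hash = '0' * 40  # stand-in for the output of "git rev-parse"

        # Fake index: arbitrary body followed by its 20-byte SHA-1 digest.
        body = b'fake index contents'
        with open(index_path, 'wb') as f:
            f.write(body + hashlib.sha1(body).digest())

        # No cache file yet, so the index would have to be re-read.
        before = cache_is_fresh(cache_path, index_path, 'HEAD', fake_hash)

        # Write the cache, as __exit__ does in the patch.
        with open(cache_path, 'w') as f:
            json.dump({'treeish': 'HEAD', 'hash': fake_hash,
                       'index_checksum': read_index_checksum(index_path)}, f)
        fresh = cache_is_fresh(cache_path, index_path, 'HEAD', fake_hash)

        # Any change to the index file changes its trailing bytes.
        with open(index_path, 'ab') as f:
            f.write(b'x')
        stale = cache_is_fresh(cache_path, index_path, 'HEAD', fake_hash)

        return before, fresh, stale


if __name__ == '__main__':
    print(demo())
```

Comparing all three values guards against both a moved ref (hash
mismatch) and an index modified behind the cache's back (checksum
mismatch); either case falls back to re-reading the tree.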