From: Michael J Gruber
Date: Sat, 18 Feb 2023 18:47:35 +0100
Subject: Re: Proof of concept for counting messages in thread
To: David Bremner
Cc: notmuch@notmuchmail.org
In-Reply-To: <87bklxow5v.fsf@tethera.net>

On Tue, 14 Feb 2023 at 02:47, David Bremner wrote:
>
> Michael J Gruber writes:
>
> > That is really weird:
> > ```
> > xapian-delve -t G0000000000021229 .
> > Posting List for term 'G0000000000021229' (termfreq 115, collfreq 0,
> > wdf_max 0): 146259 ...
> > ```
> > with 115 record numbers, all different.
> > Doing `xapian-delve -1r` for each of them and grepping for the G-lines
> > gives 115 times that correct thread id.
> > Grepping for the Q-lines and notmuch-searching for the message ids
> > gives only 5 results (the expected ones). Apparently, there are bogus
> > mail records which that thread points to.
>
> 1) Do those "bogus" records have a "Tghost" term? That would be for
> messages that are known via references, but not actually in the local
> database. This is a bug / feature of the current implementation, it
> counts all messages known, whether or not local copies exist.

Yes, the extra ones are all ghosts, and I slowly remember that they
scared me in the past already ...

These ghosts appear to be pretty common. It happens all the time that I
get added to an existing discussion thread for which I do not have all
the referenced messages. I'd go as far as to say that counting ghosts as
thread members makes this useless for me.

On the other hand, notmuch's own count gets this right. And getting
different counts is even more confusing.

> 2) Do they have more than one G term? That suggests a bug somewhere. We
> actually have a test in the test suite [1] for that, but of course that is
> with a simple artificial database.

No, they all have one.
But their sheer number looks suspicious: those 5 "real" e-mails have
maybe 20 reference header entries in total, and some of those refer to
others among the 5. Grepping the account store for those references
gives me around that number. So where do the 110 ghosts (90 more than
expected) that this thread points to come from?

Still scared by them ... we need ghost busters!

Michael
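
PS: To pin down what I would expect a per-thread count to do, here is a
minimal sketch. It is only an illustration, not notmuch code. It assumes
the terms of each record have already been dumped (for example with
`xapian-delve -1r`) into a plain dict, that thread terms are the ones
starting with "G", and that ghosts carry a "Tghost" term. The helper name,
the record numbers other than 146259 and the "Q" message ids are made up.

```python
from collections import defaultdict


def count_thread_members(records):
    """Count real and ghost records per thread term.

    `records` maps a record number to an iterable of its terms
    (hypothetical input format, e.g. parsed from a term dump).
    Returns {thread_term: (real_count, ghost_count)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for terms in records.values():
        terms = set(terms)
        # Assumption: a ghost record is marked by the term "Tghost".
        is_ghost = "Tghost" in terms
        for term in terms:
            # Assumption: thread terms are exactly those starting with "G".
            if term.startswith("G"):
                counts[term][1 if is_ghost else 0] += 1
    return {thread: tuple(pair) for thread, pair in counts.items()}


# Made-up sample: 2 real messages and 3 ghosts in the same thread.
THREAD = "G0000000000021229"
sample = {
    146259: ["Tmail", THREAD, "Qreal-1@example.org"],
    146260: ["Tmail", THREAD, "Qreal-2@example.org"],
    146261: ["Tghost", THREAD, "Qghost-1@example.org"],
    146262: ["Tghost", THREAD, "Qghost-2@example.org"],
    146263: ["Tghost", THREAD, "Qghost-3@example.org"],
}

real, ghosts = count_thread_members(sample)[THREAD]
print(f"{THREAD}: {real} real, {ghosts} ghosts, {real + ghosts} in posting list")
```

The posting list length corresponds to the sum of both numbers, while the
count I care about is only the first one.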