From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Bremner
To: Al Haji-Ali, notmuch@notmuchmail.org
Subject: Re: Correcting message references
In-Reply-To: <87r0s8141e.fsf@tethera.net>
References: <874jp85g26.fsf@minkowski.home> <87o7nf50ve.fsf@minkowski.home> <87r0s8141e.fsf@tethera.net>
Date: Tue, 25 Apr 2023 16:40:49 -0300
Message-ID: <87ildjn41q.fsf@tethera.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-MailFrom: david@tethera.net
Precedence: list
List-Id: "Use and development of the notmuch mail system."

--=-=-=
Content-Type: text/plain

David Bremner writes:

> Al Haji-Ali writes:
>
>> So it does seem to be a lingering ghost message, but I am sure that there are no messages in the database referring to this ID (except messages in this current thread which have the ID in the message body).
>> I don't know why this particular ID is associated to messages in another seemingly unrelated thread as you in the pdf.
>>
>> Is there a way to remove this ghost message record somehow to test it? Or is there a better way of figuring this out.
>
> It turns out notmuch does not remove ghost messages until all the other
> messages in the thread are deleted. I guess if you temporarily move
> the other messages in the thread out of the way and run notmuch new, the
> ghost message should be deleted.
>
> I don't know how often this lazy deletion is a problem. Deleting
> messages is already a bottleneck in notmuch-new so I am a bit hesitant
> to make it more complicated. It is possible to "garbage collect"
> unreferenced ghost messages. I'll have to think about how big a
> performance hit it would be to add this to notmuch new.
>
> d

Here is a prototype standalone program to find lingering unreferenced
ghosts. I find 33 (out of about 60k total ghost messages) in about 0.3s
on this laptop. Currently it does not modify the database, but the next
step would be to delete the documents rather than just printing them
out.
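To make the check and the proposed deletion step concrete, here is a rough sketch that is separate from the attached ggc.cc. It models the index as an in-memory map instead of a Xapian database, so ToyIndex, term_freq, message_id, unreferenced_ghosts and collect_ghosts are all hypothetical names for illustration, not notmuch or Xapian API. Only the term prefixes (Tghost, Q, XREFERENCE, XREPLYTO) are taken from the attached program.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the Xapian index (an assumption for illustration):
// each document ID maps to the sorted set of terms on that document.
using DocId = unsigned;
using ToyIndex = std::map<DocId, std::set<std::string>>;

// Number of documents containing `term`; plays the role of get_termfreq.
static std::size_t term_freq(const ToyIndex &index, const std::string &term) {
    std::size_t n = 0;
    for (const auto &entry : index)
        n += entry.second.count(term);
    return n;
}

// Message ID taken from the first "Q"-prefixed term, or "" if there is none.
static std::string message_id(const std::set<std::string> &terms) {
    auto it = terms.lower_bound("Q");
    if (it == terms.end() || (*it)[0] != 'Q')
        return "";
    return it->substr(1);
}

// Ghost documents whose message ID no document references or replies to.
std::vector<std::pair<DocId, std::string>>
unreferenced_ghosts(const ToyIndex &index) {
    std::vector<std::pair<DocId, std::string>> result;
    for (const auto &entry : index) {
        if (entry.second.count("Tghost") == 0)
            continue;
        const std::string mid = message_id(entry.second);
        if (mid.empty())
            continue;
        if (term_freq(index, "XREFERENCE" + mid) +
            term_freq(index, "XREPLYTO" + mid) == 0)
            result.emplace_back(entry.first, mid);
    }
    return result;
}

// The "next step": delete the unreferenced ghosts and report how many went.
std::size_t collect_ghosts(ToyIndex &index) {
    const auto victims = unreferenced_ghosts(index);
    for (const auto &victim : victims)
        index.erase(victim.first);
    return victims.size();
}
```

Against a real database the deletion would presumably go through Xapian::WritableDatabase::delete_document on each reported docid. I have not tested that, and it ignores whatever locking and thread bookkeeping notmuch itself would need.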
If you have libxapian-dev (or equivalent) installed, you can build the
attached ggc.cc with

$ c++ ggc.cc -o ggc -lxapian

and then run it with

$ ./ggc ~/.local/share/notmuch/default/xapian

I would be interested to know whether it finds your problematic ghost
message (and how long it takes).

--=-=-=
Content-Type: text/x-c++src
Content-Disposition: inline; filename=ggc.cc

#include <xapian.h>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf (stderr, "usage: ggc xapian-database\n");
        exit (1);
    }

    Xapian::Database db(argv[1]);
    Xapian::Enquire enquire(db);

    // Select every ghost message document.
    enquire.set_query(Xapian::Query("Tghost"));
    auto mset = enquire.get_mset (0, db.get_doccount ());

    for (auto iter = mset.begin (); iter != mset.end (); iter++) {
        auto doc = iter.get_document ();

        // The message ID is stored in the term with prefix "Q".
        auto term_iter = doc.termlist_begin ();
        term_iter.skip_to ("Q");
        if (term_iter == doc.termlist_end ())
            continue;   // no such term; nothing to check
        std::string mid = (*term_iter).substr (1);

        // Count the documents that reference or reply to this message ID.
        std::string ref_term = "XREFERENCE" + mid;
        auto ref_count = db.get_termfreq (ref_term);
        std::string reply_term = "XREPLYTO" + mid;
        auto reply_count = db.get_termfreq (reply_term);

        // Nothing points at this ghost: report it.
        if (ref_count + reply_count == 0) {
            std::cout << "docid=" << *iter;
            std::cout << " mid=" << mid;
            std::cout << std::endl;
        }
    }
}

--=-=-=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline


--=-=-=--