Date: Fri, 20 Sep 2024 20:23:40 -0700
From: Frederick Eaton
To: Pengji Zhang
CC: notmuch@notmuchmail.org
Subject: Re: searching for a message by path
Message-ID: <20240921032340.opozeclfbyqzw2yt@localhost>
References: <20240920175232.zryeqyl76nbydiab@localhost> <87zfo1dfa1.fsf@pengjiz.com>
In-Reply-To: <87zfo1dfa1.fsf@pengjiz.com>
Reply-To: frederik@ofb.net
List-Id: "Use and development of the notmuch mail system."

Thank you for your response, Pengji.

On Sat, Sep 21, 2024 at 08:25:10AM +0800, Pengji Zhang wrote:
>Hi Frederick,
>
>Frederick Eaton writes:
>
>>I am trying to figure out how to adapt a script I wrote for
>>filtering messages, to apply notmuch tags to each message. A
>>difficulty is that the messages are already in the Notmuch database,
>>because another tool has delivered them to a maildir and run
>>"notmuch new".
>>
>>Now, Notmuch can provide me with the paths of all the new
>>(unfiltered) messages, which I can give to my script. The question I
>>have is, once the filter is done, how can the script tell Notmuch
>>which message to apply the tags to?
>
>
>I am not sure if I understand you correctly. If the problem here is to
>distinguish existing messages and new messages, would the config
>option 'new.tags' work? For example, use
>
>    notmuch config set new.tags new
>
>to give all new messages a 'new' tag.

No, I already have that configuration. The first sentence described
what I already know how to do; the second sentence is what I'm trying
to do. Suppose the filter script reads a message from a particular
file and decides that it is spam. How does the filter tell Notmuch
that the message corresponding to that file is spam? You seem to be
saying below that the filter script should extract the Message-ID and
use it to identify the message to Notmuch, since file paths of the
messages are not indexed.

Probably what my script should be doing for each message is appending
a line to a batch file like this:

    +spam -new -- id:some_message_id@foo
    +inbox -new -- id:some_other@baz

and then passing the batch file to "notmuch tag"?

>>I've tentatively concluded that the best way to locate each message
>>in the Notmuch database is to extract the Message-ID and search for
>>it with "id:"? But the FAQ says that multiple messages can have the
>>same Message-ID (and some spam messages don't have one at all).
>
>IIRC, in the Notmuch database tags are associated with message IDs, so
>you probably do not need to worry about this.

This time, I'm not sure I understand.

>>If I could access the message using the filename that the script is
>>processing, it would seem slightly more reliable. It seems like
>>there should be some way to allow a Notmuch database entry to be
>>accessed directly by filename, without even creating a Notmuch-style
>>search query containing that filename, but rather by passing the
>>filename as a command-line argument to "notmuch". It would be nice
>>not to have to worry about quoting and unquoting.
>
>I am not sure if this is useful, given that (presumably) Notmuch uses
>message IDs as keys.
>Besides, those filenames are usually generated
>automatically and quite cryptic.

It might be useful for the reasons I stated, namely in case the
Message-ID does not exist or is not unique.

>>When I try to search for a message using "path:", nothing seems to
>>work.
>>
>>[...]
>>
>>There were no results for any of the "path:" searches, although the
>>"id:" search worked. I am using version 0.32.2 and can update if
>>this may be related to a bug that was fixed in the past few years.
>
>I have never used 0.32.2 so I am not sure if there are any
>differences, but for version 0.38.3, the prefix "path:" is used to
>search for messages in some *directory*, and the query should be
>*relative* to the maildir.
>
>I highly recommend the manual page 'notmuch-search-terms(7)' and also
>other pages if you have time. They are informative and well written,
>and very helpful for writing message processing scripts.

Thank you for interpreting that section for me. The manual pages may
be informative and well written, but if my opinion matters, then I
think that they could be made slightly clearer than they are. For
example, explaining directly to the user that there is no index of
path names would help clarify what can be done with the software.
Also, a short example of using Notmuch in a filter script would be
useful in one of the manual pages, particularly illustrating the case
where the programmer wants to re-tag a message that is provided as a
file or on stdin.

My copy of the notmuch-search-terms manual page says:

    path:<directory-path> or path:<directory-path>/** or path:/<regex>/
           The path: prefix searches for email messages that are in
           particular directories within the mail store. The directory
           must be specified relative to the top-level maildir (and
           without the leading slash). ...
I see now that this text is only suggesting that Notmuch supports
searches for directory names, but on first read it wasn't really
clear to me whether "directory-path" means a "path to a directory" or
a "file path consisting of directories followed by a filename",
particularly as there is no obvious reason for Notmuch not to index
filenames. I think "path:" would be clearer, as would saying "The
path: prefix matches email messages that are stored in a specified
directory on the filesystem, which must be specified relative to the
top-level maildir, and here is how to find out what the 'top-level
maildir' is when you have for example $HOME/mail/notmuch/ configured
as your database path in ~/.notmuch-config ...".

Even clearer would be to explain why the "path:" search prefix only
accepts directories, point out that it should be called "dir:"
instead of "path:", and warn the user that the search will be
inefficient because there is no index of filenames.

Thank you,

Frederick
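
For reference, the batch-file approach discussed above could be
sketched roughly as follows. This is only a minimal sketch under
stated assumptions, not a tested filter: the function names and the
"classify" callable are hypothetical, tag names are assumed to contain
no spaces or other characters that would need %NN hex-encoding, and
files without a Message-ID header are simply skipped. It relies on
"notmuch tag --batch" reading one tagging operation per line from
stdin, with the id: term quoted by doubling any embedded double
quotes.

```python
import subprocess
from email.parser import BytesHeaderParser


def message_id_of(path):
    """Return the Message-ID of the mail file at `path`, without the
    surrounding angle brackets, or None if the header is missing or empty."""
    with open(path, "rb") as f:
        # Parse only the header block; the body is not needed here.
        headers = BytesHeaderParser().parse(f)
    raw = headers.get("Message-ID")
    if raw is None:
        return None
    # Drop surrounding whitespace and the "<...>" delimiters.
    return str(raw).strip().strip("<>").strip() or None


def batch_line(tag_ops, message_id):
    """Format one line for "notmuch tag --batch".

    Assumption: tag names need no hex-encoding. The id: term is always
    double-quoted, with embedded double quotes doubled.
    """
    quoted = message_id.replace('"', '""')
    return '{} -- id:"{}"'.format(" ".join(tag_ops), quoted)


def build_batch(paths, classify):
    """Return batch lines for the mail files in `paths`.

    `classify` is a hypothetical stand-in for the filter: it takes a
    file path and returns tag operations such as ["+spam", "-new"].
    Files without a Message-ID header are skipped in this sketch.
    """
    lines = []
    for path in paths:
        message_id = message_id_of(path)
        if message_id is not None:
            lines.append(batch_line(classify(path), message_id))
    return lines


def apply_batch(lines):
    """Feed the batch lines to "notmuch tag --batch" on stdin."""
    subprocess.run(["notmuch", "tag", "--batch"],
                   input="\n".join(lines) + "\n", text=True, check=True)
```

A caller might collect the file paths with something like "notmuch
search --output=files tag:new", pass them to build_batch() together
with its own classify function, and hand the result to apply_batch().
Always quoting the id: term sidesteps the question of which
Message-IDs need quoting, at the cost of lines that look slightly
different from the unquoted example above.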