unofficial mirror of notmuch@notmuchmail.org
* Alternative to no longer supported folder:foo* wildcard matching ?
@ 2015-03-09 19:55 Jean-Marc Liotier
  2015-03-09 22:06 ` David Bremner
  0 siblings, 1 reply; 6+ messages in thread
From: Jean-Marc Liotier @ 2015-03-09 19:55 UTC (permalink / raw)
  To: notmuch

Hello ! I am a brand new Notmuch user (and Mairix emigrant - looks like 
I am not the only one here), very impressed with how everything works 
right out of the box - thanks to all who made Notmuch !

From http://notmuchmail.org/news/release-0.18/ I read: "Wildcard 
matching (folder:foo*) is no longer supported". Too bad... It is exactly 
what I thought I needed to reach mail search nirvana.

So nowadays, is there any other way to express "this folder and all its 
subfolders" ? The path: keyword does not seem useful for that with a 
maildir with a flat structure of dot.delimited.directories - or is there 
something like a dot.delimited.* wildcard ?

Or am I entirely missing some vernacular usage that makes a subfolder 
wildcard useless ?


* Re: Alternative to no longer supported folder:foo* wildcard matching ?
  2015-03-09 19:55 Alternative to no longer supported folder:foo* wildcard matching ? Jean-Marc Liotier
@ 2015-03-09 22:06 ` David Bremner
  2015-03-10  0:16   ` Jean-Marc Liotier
  0 siblings, 1 reply; 6+ messages in thread
From: David Bremner @ 2015-03-09 22:06 UTC (permalink / raw)
  To: Jean-Marc Liotier, notmuch

Jean-Marc Liotier <jm@liotier.org> writes:

>
> So nowadays, is there any other way to express "this folder and all its 
> subfolders" ? The path: keyword does not seem useful for that with a 
> maildir with a flat structure of dot.delimited.directories - or is there 
> something like a dot.delimited.* wildcard ?
>

One option is to create a symlink farm. Since it's only directories being
symlinked, it isn't that bad.  I don't know how well this scales, but it
seems to work for about 200k messages in 184 mailing lists. Roughly
speaking:

% mkdir list
% cd list
% ln -s ../.list.* .
% mmv .list.* *  # zsh specific, optional
% notmuch new

Notmuch new took about 10 minutes, but now I can search

'path:list/**'

to add a second level

% mkdir debian
% cd debian
% ln -s ../debian-* .
% notmuch new

Of course this could be scripted.
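
The scripting mentioned above could look roughly like the following. This is a minimal sketch, not anything shipped with notmuch: `build_farm` is a hypothetical helper name, and it assumes Maildir++-style folders named `.<group>.<rest>` sitting directly under the maildir root.

```shell
#!/bin/sh
# Sketch: group flat Maildir++ folders (.list.foo, .list.bar, ...) under one
# plain directory of symlinks, so that 'path:list/**' matches all of them.
# build_farm is a hypothetical helper name, not part of notmuch.
build_farm() {
    maildir=$1   # root of the mail store, e.g. ~/Maildir
    group=$2     # top-level name, e.g. "list"
    mkdir -p "$maildir/$group" || return 1
    for dir in "$maildir/.$group".*; do
        [ -d "$dir" ] || continue        # also skips an unmatched glob
        name=${dir##*/}                  # e.g. .list.foo
        short=${name#".$group."}         # e.g. foo
        link=$maildir/$group/$short
        # Relative target, so the farm survives moving the whole maildir.
        [ -L "$link" ] || ln -s "../$name" "$link"
    done
}
```

Something like `build_farm ~/Maildir list && notmuch new` would then make 'path:list/**' cover every .list.* folder. Re-running it only adds links for folders that have appeared since the last run.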


* Re: Alternative to no longer supported folder:foo* wildcard matching ?
  2015-03-09 22:06 ` David Bremner
@ 2015-03-10  0:16   ` Jean-Marc Liotier
  2015-03-10  7:10     ` David Bremner
  0 siblings, 1 reply; 6+ messages in thread
From: Jean-Marc Liotier @ 2015-03-10  0:16 UTC (permalink / raw)
  To: David Bremner, notmuch

On 09/03/2015 23:06, David Bremner wrote:
> Jean-Marc Liotier <jm@liotier.org> writes:
>> So nowadays, is there any other way to express "this folder and all its  subfolders" ? The path: keyword does not seem useful for that with a maildir with a flat structure of dot.delimited.directories - or is there something like a dot.delimited.* wildcard ?
> One option is to create a symlink farm. Since it's only directories being
> symlinked, it isn't that bad.  I don't know how well this scales, but it
> seems to work for about 200k messages in 184 mailing lists.

On the plus side: it works. Here is my interpretation of the idea:

% cd ~/Maildir
% mkdir .NM_myTopLevelFolder
% ln -rs .myTopLevelFolder* -t .NM_myTopLevelFolder
% rm -f .NM_myTopLevelFolder/.myTopLevelFolder
% notmuch new
% notmuch-mutt --remove-dups --output-dir ~/Maildir/.=Search \
         search keyword and "path:.NM_myTopLevelFolder/**"

So, thanks for this workaround suggestion.

On the downside:
- It doubles the number of messages to index (though even when 
multiplied by two, my 300k messages are Not Much Mail™ - but still...)
- myTopLevelFolder gets a NM_myTopLevelFolder twin and restricting the 
search scope to it requires using its twin's name
- The additional messages are duplicates, so --remove-dups becomes 
mandatory in any search query
- This method is good for restricting the search scope to a directory, 
but not for excluding a directory from the search scope... Which alas is 
what I desire most...


> Roughly
> speaking:
>
> % mkdir list
> % cd list
> % ln -s ../.list.* .
> % mmv .list.* *  # zsh specific, optional
> % notmuch new
>
> Notmuch new took about 10 minutes, but now I can search
>
> 'path:list/**'
>
>


* Re: Alternative to no longer supported folder:foo* wildcard matching ?
  2015-03-10  0:16   ` Jean-Marc Liotier
@ 2015-03-10  7:10     ` David Bremner
  2015-03-10 12:21       ` Jean-Marc Liotier
  0 siblings, 1 reply; 6+ messages in thread
From: David Bremner @ 2015-03-10  7:10 UTC (permalink / raw)
  To: Jean-Marc Liotier, notmuch

Jean-Marc Liotier <jm@liotier.org> writes:

>
> % cd ~/Maildir
> % mkdir .NM_myTopLevelFolder
> % ln -rs .myTopLevelFolder* -t .NM_myTopLevelFolder

This is doing one level of links. If you want more hierarchy
you'll have to create subdirectories with links to only some of your
folders. In fact the NM_myTopLevelFolder doesn't seem useful to me,
since you don't gain any new queries that way.

So if you have .foo.*  and .bar.*, I would

mkdir foo
ln -rs .foo.* -t foo
mkdir bar
ln -rs .bar.* -t bar

> On the downside:
> - It doubles the number of messages to index (though then even 
> multiplied by two, my 300k messages are Not Much Mail™ - but still...)

Conceivably this has to do with duplicating the top level folder, not
sure; I don't see an increase. In particular I don't see an increase in
the output of "notmuch count", so in notmuch jargon, the number of
_messages_ does not increase but rather the number of _files_. There
will obviously be some growth in the database size, but there was
nothing too shocking in my experiments (I didn't measure carefully, but
my database is still at around 30% of the raw mail size)
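
The files-versus-messages distinction can be illustrated without notmuch at all. The sketch below is only an approximation of the idea, not notmuch's implementation: `count_files_and_ids` is a hypothetical helper that counts the mail files reachable under a maildir (following symlinks) and the distinct Message-ID values found in them. It assumes one unfolded Message-ID header per file.

```shell
#!/bin/sh
# Sketch: compare the number of mail files reachable under a maildir
# (symlinked folders included) with the number of distinct Message-IDs.
# count_files_and_ids is a hypothetical helper, not a notmuch command.
count_files_and_ids() {
    root=$1
    # Every regular file inside a cur/ or new/ directory, following symlinks.
    files=$(find -L "$root" -type f \( -path '*/cur/*' -o -path '*/new/*' \) \
            | wc -l | tr -d ' ')
    # First Message-ID header of each file, de-duplicated.
    ids=$(find -L "$root" -type f \( -path '*/cur/*' -o -path '*/new/*' \) \
            -exec awk 'tolower($1) == "message-id:" { print $2; exit }' {} \; \
          | sort -u | wc -l | tr -d ' ')
    echo "files=$files messages=$ids"
}
```

On a store with a symlink farm, the files figure should grow with every linked folder while the messages figure stays flat. With notmuch itself, comparing `notmuch count '*'` with `notmuch count --output=files '*'` should show the same pattern.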

> - myTopLevelFolder gets a NM_myTopLevelFolder twin and restricting the 
> search scope to it requires using its twin's name

yes, I suppose that's true. But for "nice" symlink names this doesn't
seem so terrible. But *shrug* it's a matter of taste.

> - The additional messages are duplicates, so --remove-dups becomes 
> mandatory in any search query

Based on the name, I'd suspect "remove-dups" corresponds roughly to the
default behaviour of notmuch in reporting results.

> - This method is good for restricting the search scope to a directory,
> but not for excluding a directory from the search scope... Which alas
> is what I desire most...

Either I don't understand what you want, or this might again be
something to do with notmuch-mutt. For me, queries like

        notmuch count not 'path:list/**' 

and
        notmuch count not 'path:list/**' and from:bremner

work as expected.

Hope this helps,

d


* Re: Alternative to no longer supported folder:foo* wildcard matching ?
  2015-03-10  7:10     ` David Bremner
@ 2015-03-10 12:21       ` Jean-Marc Liotier
  2015-03-10 13:25         ` David Bremner
  0 siblings, 1 reply; 6+ messages in thread
From: Jean-Marc Liotier @ 2015-03-10 12:21 UTC (permalink / raw)
  To: David Bremner, notmuch

On 10/03/2015 08:10, David Bremner wrote:
> [..] In fact the NM_myTopLevelFolder doesn't seem useful to me,
> since you don't gain any new queries that way.
>
> So if you have .foo.*  and .bar.*, I would
>
> mkdir foo
> ln -rs .foo.* -t foo

I had forgotten that I could just as well make a directory without the 
initial dot, which removes the need for my silly prefix: with a dot it is a 
normal maildir, without one it is a top-level Notmuch symlink container.

> in notmuch jargon, the number of _messages_ does not increase but 
> rather the number of _files_.

That makes sense since, from what I read in notmuch-insert.c,  the 
messages are identified by message-id and therefore counted only once 
however many times they are encountered during notmuch new.

> so --remove-dups becomes
> mandatory in any search query
> Based on the name, I'd suspect "remove-dups" corresponds roughly to the
> default behaviour of notmuch in reporting results.

Yes, though --duplicate=N is only supported with --output=files and 
--output=messages, and it does not even appear to be the default there, 
judging from { NOTMUCH_OPT_INT, &ctx->dupe, "duplicate", 'D', 0 } in 
notmuch-search.c

Otherwise it does behave as if duplicates were implicitly removed, which 
confused me somewhat until I understood notmuch's conceptual 
distinction between files and messages.

>          notmuch count not 'path:list/**'
> and
>          notmuch count not 'path:list/**' and from:bremner
>
> work as expected.

Indeed they do - both with notmuch and notmuch-mutt... I just had to 
struggle a bit until I realized that my notmuch-mutt --output-dir is 
inside the maildir indexed by notmuch... So the number of results and 
duplicates varied according to what state the symlink results maildir 
was in when I last indexed the whole thing...

Yes, I do need the --output-dir to be inside the maildir because the 
IMAP server lets me have my search results in any MUA I happen to be 
using (Thunderbird, K-9 or Outlook for example). And by the way, maybe 
notmuch-mutt should be named notmuch-maildir or notmuch-symlinks : the 
mutt part is just about setting the macros in ~/.muttrc - everything 
else is generic to anything that can read a maildir.

So my indexing command is now :

notmuch tag +nmsearchresult 'path:.=Search/**' && notmuch new

With ~/.notmuch-config now containing:

[search]
exclude_tags=deleted;spam;nmsearchresult

And it works.

I would have preferred explicit folder inclusion/exclusion at query time 
and path inclusion/exclusion at indexing time... But I guess I'll get 
used to notmuch's logic of tagging everything and then using the tags.
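
For the indexing-time half of that wish, notmuch's `new.ignore` setting may already come close: according to notmuch-config(1), it lists file and directory names (without path) that `notmuch new` skips wherever they appear in the mail store. A minimal sketch for ~/.notmuch-config, assuming the results maildir is named `.=Search` as in the command above and that no other folder shares that name:

```ini
[new]
# Assumption: .=Search is the notmuch-mutt results maildir used above.
# Names listed here are matched at any depth, so every directory
# called .=Search would be skipped by "notmuch new".
ignore=.=Search
```

With such a setting the search results would never be indexed in the first place, so the nmsearchresult tag and its exclude_tags entry might become unnecessary. This has not been tried in this thread.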

Thank you for your kind guidance !


* Re: Alternative to no longer supported folder:foo* wildcard matching ?
  2015-03-10 12:21       ` Jean-Marc Liotier
@ 2015-03-10 13:25         ` David Bremner
  0 siblings, 0 replies; 6+ messages in thread
From: David Bremner @ 2015-03-10 13:25 UTC (permalink / raw)
  To: Jean-Marc Liotier, notmuch

Jean-Marc Liotier <jm@liotier.org> writes:

> I would have preferred explicit folder inclusion/exclusion at query time 
> and path inclusion/exclusion at indexing time... But I guess I'll get 
> used to notmuch's logic of tagging everything and then using the tags.

In general tags are helpful (or just full text searches). But it does
seem like maildir++ (.foo.bar folders) would be reasonable to support
directly. I have the impression from my experiments with symlinks that
the cost in database bloat is not too bad. A simple matter of
programming ;).

d
