* correct way to search for only PDF attachments
@ 2015-09-29 0:55 Xu Wang
2015-09-29 2:00 ` Carl Worth
0 siblings, 1 reply; 8+ messages in thread
From: Xu Wang @ 2015-09-29 0:55 UTC (permalink / raw)
To: notmuch
Hi,
I would like to look for all emails from a colleague, jongho. I tried:
from:jongho attachment:pdf
which seems to do as I wanted.
To understand more, what does the following search for?
from:jongho attachment:.*pdf
I know it is incorrect as the results tell me, but what exactly does it do?
Also, how does the first one above know that I want only PDF
attachments and not an attachment called "pdformula.txt" ?
Kind regards,
Xu
* Re: correct way to search for only PDF attachments
From: Carl Worth @ 2015-09-29 2:00 UTC
To: Xu Wang, notmuch
On Mon, Sep 28 2015, Xu Wang wrote:
> I would like to look for all emails from a colleague, jongho. I tried:
>
> from:jongho attachment:pdf
>
> which seems to do as I wanted.
Good. That should work.
> To understand more, what does the following search for?
>
> from:jongho attachment:.*pdf
Uhm, probably only strange things. There are some mechanisms for getting
notmuch to emit some debugging information on what the final search
terms end up being, (but I don't recall if they still require
recompilation or not).
I'm not testing now, but I wouldn't be surprised if that ended up doing
something like searching for a phrase like "attachment pdf" anywhere
within a message. (The Xapian parser can be somewhat unpredictable when
you give it unexpected input.)
> Also, how does the first one above know that I want only PDF
> attachments and not an attachment called "pdformula.txt" ?
It doesn't know that you want only PDF attachments. The key part is that
the indexing is performed by breaking text up into individual terms, (at
punctuation boundaries usually). So a search specification like
"attachment:pdf" is searching for things that were indexed with the
"pdf" term within the attachment prefix. So that won't match a filename
like pdformula.txt, (which would be indexed as two terms, "pdformula"
and "txt"), but it would match pdf.ormula.txt, (which would be indexed
as three terms, "pdf", "ormula" and "txt").
The Xapian documentation can be examined if you want more details.
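As a rough illustration of the splitting described above, the following
shell sketch breaks a filename at every non-alphanumeric character and
then checks for an exact term. It only approximates the idea; it is not
Xapian's actual term generator, and split_terms and has_term are made-up
helper names.

```shell
# Approximate term splitting: lowercase the name, then turn every run of
# non-alphanumeric characters into a newline (one term per line).
# Assumption: this mimics "split at punctuation", not Xapian's real rules.
split_terms() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alnum:]' '\n'
}

# Succeed only if the term ($2) equals one whole term of the name ($1).
has_term() {
    split_terms "$1" | grep -qxF -- "$2"
}

has_term "pdformula.txt" pdf && echo "match" || echo "no match"
has_term "pdf.ormula.txt" pdf && echo "match" || echo "no match"
```

Under these assumptions the first check reports no match, because
"pdformula" is a single term, and the second reports a match.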
-Carl
* Re: correct way to search for only PDF attachments
From: Xu Wang @ 2015-09-29 4:51 UTC
To: Carl Worth; +Cc: notmuch
On Mon, Sep 28, 2015 at 10:00 PM, Carl Worth <cworth@cworth.org> wrote:
> On Mon, Sep 28 2015, Xu Wang wrote:
>> I would like to look for all emails from a colleague, jongho. I tried:
>>
>> from:jongho attachment:pdf
>>
>> which seems to do as I wanted.
>
> Good. That should work.
>
>> To understand more, what does the following search for?
>>
>> from:jongho attachment:.*pdf
>
> Uhm, probably only strange things. There are some mechanisms for getting
> notmuch to emit some debugging information on what the final search
> terms end up being, (but I don't recall if they still require
> recompilation or not).
>
> I'm not testing now, but I wouldn't be surprised if that ended up doing
> something like searching for a phrase like "attachment pdf" anywhere
> within a message. (The Xapian parser can be somewhat unpredictable when
> you give it unexpected input.)
>
>> Also, how does the first one above know that I want only PDF
>> attachments and not an attachment called "pdformula.txt" ?
>
> It doesn't know that you want only PDF attachments. The key part is that
> the indexing is performed by breaking text up into individual terms, (at
> punctuation boundaries usually). So a search specification like
> "attachment:pdf" is searching for things that were indexed with the
> "pdf" term within the attachment prefix. So that won't match a filename
> like pdformula.txt, (which would be indexed as two terms, "pdformula"
> and "txt"), but it would match pdf.ormula.txt, (which would be indexed
> as three terms, "pdf", "ormula" and "txt").
>
> The Xapian documentation can be examined if you want more details.
This is highly useful. Thanks for such an explanation! Thank you, Carl.
Kind regards,
Xu
* Re: correct way to search for only PDF attachments
From: Suvayu Ali @ 2015-09-29 7:15 UTC
To: notmuch
On Mon, Sep 28, 2015 at 07:00:13PM -0700, Carl Worth wrote:
> On Mon, Sep 28 2015, Xu Wang wrote:
>
> > To understand more, what does the following search for?
> >
> > from:jongho attachment:.*pdf
>
> Uhm, probably only strange things. There are some mechanisms for getting
> notmuch to emit some debugging information on what the final search
> terms end up being, (but I don't recall if they still require
> recompilation or not).
This should work:
$ export NOTMUCH_DEBUG_QUERY=1
$ notmuch count -- from:suvayu attachment:*.pdf
Query string is:
from:suvayu attachment:*.pdf
Exclude query is:
Xapian::Query()
Final query is:
Xapian::Query((Tmail AND ZXFROMsuvayu:(pos=1) AND Zattach:(pos=2) AND Zpdf:(pos=3)))
217
$ notmuch count -- from:suvayu attachment:pdf
Query string is:
from:suvayu attachment:pdf
Exclude query is:
Xapian::Query()
Final query is:
Xapian::Query((Tmail AND ZXFROMsuvayu:(pos=1) AND ZXATTACHMENTpdf:(pos=2)))
151
I guess to answer the OP's question: the globbed form simply does a plain
text search for the terms attach and pdf. The attachment: prefix is not
recognised at all.
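One way to read the two debug dumps above: the prefix was applied only if
the final query contains an XATTACHMENT-prefixed term. The sketch below
checks for that in the two "Final query" strings copied from the output
above; prefix_applied is a hypothetical helper name, not part of notmuch.

```shell
# Hypothetical helper: succeed if a "Final query" string contains a term
# carrying the XATTACHMENT prefix, i.e. attachment: was recognised.
prefix_applied() {
    case "$1" in
        *XATTACHMENT*) return 0 ;;
        *) return 1 ;;
    esac
}

# Strings copied verbatim from the debug output shown earlier.
good='Xapian::Query((Tmail AND ZXFROMsuvayu:(pos=1) AND ZXATTACHMENTpdf:(pos=2)))'
bad='Xapian::Query((Tmail AND ZXFROMsuvayu:(pos=1) AND Zattach:(pos=2) AND Zpdf:(pos=3)))'

prefix_applied "$good" && echo "attachment:pdf: prefix applied"
prefix_applied "$bad" || echo "attachment:*.pdf: plain text terms only"
```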
Cheers,
--
Suvayu
Open source is the future. It sets us free.
* Re: correct way to search for only PDF attachments
From: David Bremner @ 2015-09-29 11:00 UTC
To: Carl Worth, Xu Wang, notmuch
Carl Worth <cworth@cworth.org> writes:
> On Mon, Sep 28 2015, Xu Wang wrote:
>> I would like to look for all emails from a colleague, jongho. I tried:
>>
>> from:jongho attachment:pdf
>>
>> which seems to do as I wanted.
>
> Good. That should work.
Another option is to use mimetype:pdf
The notmuch-search-terms man page is probably worth a look when facing
these kinds of puzzles. It covers both Carl's point about term-based
search and mine about the mimetype: prefix. Of course, it is getting
pretty big, and I don't know what to do about that.
d
* Re: correct way to search for only PDF attachments
From: Suvayu Ali @ 2015-09-29 11:56 UTC
To: notmuch
On Tue, Sep 29, 2015 at 08:00:18AM -0300, David Bremner wrote:
>
> Of course it is getting pretty big, I don't know what to do about
> that.
How about an overview in notmuch-search-terms with more detailed docs in
an info page? coreutils does this. I don't think this will add any new
build dependencies either, as sphinx supports info pages. I see
texinfo_documents is already defined in doc/conf.py. Maybe that is an
option?
--
Suvayu
Open source is the future. It sets us free.
* Re: correct way to search for only PDF attachments
From: David Bremner @ 2015-09-29 13:48 UTC
To: Suvayu Ali, notmuch
Suvayu Ali <fatkasuvayu+linux@gmail.com> writes:
> On Tue, Sep 29, 2015 at 08:00:18AM -0300, David Bremner wrote:
>>
>> Of course it is getting pretty big, I don't know what to do about
>> that.
>
> How about an overview in notmuch-search-terms with more detailed docs in
> an info page? coreutils does this. I don't think this will add any new
> build dependencies either, as sphinx supports info pages. I see
> texinfo_documents is already defined in doc/conf.py. Maybe that is an
> option?
>
I'm not really in favour of requiring anyone who is not already using
emacs to use info. Of course we could provide the same long form docs
in other formats (most likely html). I don't know if splitting into
shorter man pages plus a longer manual would really help, but it's
likely we could take better advantage of sphinx. I know that Patrick
Totzke started a rework of the docs
https://github.com/pazz/notmuch/tree/docs
I don't think that's really in a state to contemplate merging (for one
thing it hasn't kept up with doc changes in master); but maybe somebody
wants to pick up where Patrick left off.
d
* Re: correct way to search for only PDF attachments
From: Xu Wang @ 2015-09-30 15:16 UTC
To: David Bremner; +Cc: notmuch
On Tue, Sep 29, 2015 at 9:48 AM, David Bremner <david@tethera.net> wrote:
> Suvayu Ali <fatkasuvayu+linux@gmail.com> writes:
>
>> On Tue, Sep 29, 2015 at 08:00:18AM -0300, David Bremner wrote:
>>>
>>> Of course it is getting pretty big, I don't know what to do about
>>> that.
>>
>> How about an overview in notmuch-search-terms with more detailed docs in
>> an info page? coreutils does this. I don't think this will add any new
>> build dependencies either, as sphinx supports info pages. I see
>> texinfo_documents is already defined in doc/conf.py. Maybe that is an
>> option?
>>
>
> I'm not really in favour of requiring anyone who is not already using
> emacs to use info. Of course we could provide the same long form docs
> in other formats (most likely html). I don't know if splitting into
> shorter man pages plus a longer manual would really help, but it's
> likely we could take better advantage of sphinx. I know that Patrick
> Totzke started a rework of the docs
>
> https://github.com/pazz/notmuch/tree/docs
>
> I don't think that's really in a state to contemplate merging (for one
> thing it hasn't kept up with doc changes in master); but maybe somebody
> wants to pick up where Patrick left off.
>
> d
Thank you everyone for all of the information and for walking me
through the example!
I will study this in more depth and look at the detailed documentation.
Kind regards,
Xu