From: Xu Wang <xuwang762@gmail.com>
To: Carl Worth <cworth@cworth.org>
Cc: notmuch@notmuchmail.org
Subject: Re: correct way to search for only PDF attachments
Date: Tue, 29 Sep 2015 00:51:01 -0400
Message-ID: <CAJhTkNg0_j3R8zdpywmZkreFU2p+Wky8oxC7vvuQYzNK2U=-1Q@mail.gmail.com>
In-Reply-To: <87vbau9e8i.fsf@wondoo.home.cworth.org>
On Mon, Sep 28, 2015 at 10:00 PM, Carl Worth <cworth@cworth.org> wrote:
> On Mon, Sep 28 2015, Xu Wang wrote:
>> I would like to look for all emails from a colleague jongho. I tried:
>>
>> from:jongho attachment:pdf
>>
>> which seems to do as I wanted.
>
> Good. That should work.
>
>> To understand more, what does the following search for?
>>
>> from:jongho attachment:.*pdf
>
> Uhm, probably only strange things. There are some mechanisms for getting
> notmuch to emit some debugging information on what the final search
> terms end up being, (but I don't recall if they still require
> recompilation or not).
>
> I'm not testing now, but I wouldn't be surprised if that ended up doing
> something like searching for a phrase like "attachment pdf" anywhere
> within a message. (The Xapian parser can be somewhat unpredictable when
> you give it unexpected input.)
>
>> Also, how does the first one above know that I want only PDF
>> attachments and not an attachment called "pdformula.txt" ?
>
> It doesn't know that you want only PDF attachments. The key part is that
> the indexing is performed by breaking text up into individual terms, (at
> punctuation boundaries usually). So a search specification like
> "attachment:pdf" is searching for things that were indexed with the
> "pdf" term within the attachment prefix. So that won't match a filename
> like pdformula.txt, (which would be indexed as two terms, "pdformula"
> and "txt"), but it would match pdf.ormula.txt, (which would be indexed
> as three terms, "pdf", "ormula" and "txt").
>
> The Xapian documentation can be examined if you want more details.
This is highly useful. Thanks for such an explanation! Thank you, Carl.
Kind regards,
Xu
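
To make Carl's point about term splitting concrete, here is a rough Python sketch of the idea. It is only a simplified model, not Xapian's actual term generator: the helper names index_terms and matches_attachment are made up for illustration, and the rule "split at every run of non-alphanumeric ASCII characters" is an assumption that ignores Xapian's special cases.

```python
import re


def index_terms(filename: str) -> list[str]:
    """Split a filename into lowercase terms at punctuation boundaries.

    Simplified assumption: any run of non-alphanumeric ASCII characters
    separates terms. The real indexer has more rules than this.
    """
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", filename) if t]


def matches_attachment(filename: str, query_term: str) -> bool:
    """Model of attachment:<term>: true only if a whole term matches."""
    return query_term.lower() in index_terms(filename)


if __name__ == "__main__":
    for name in ("report.pdf", "pdformula.txt", "pdf.ormula.txt"):
        print(name, index_terms(name), matches_attachment(name, "pdf"))
```

Under this model, "pdformula.txt" yields the terms "pdformula" and "txt", so a search for the term "pdf" does not match it, while "pdf.ormula.txt" yields "pdf", "ormula" and "txt" and does match. Matching is on whole terms, not on substrings or file types.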