unofficial mirror of notmuch@notmuchmail.org
* extract attachments from multiple mails
@ 2012-06-25 11:44 David Belohrad
  2012-06-25 17:14 ` Jameson Graef Rollins
  0 siblings, 1 reply; 3+ messages in thread
From: David Belohrad @ 2012-06-25 11:44 UTC (permalink / raw)
  To: notmuch


Dear All,

Can someone give me some advice? I have many emails containing
attachments. These are typically the output of a copy machine, which splits
a scan into multiple attachments.

I'd like to extract those attached files in one batch into a specific
directory. Is there any way to fetch those files programmatically?

thanks
..d..


* Re: extract attachments from multiple mails
  2012-06-25 11:44 extract attachments from multiple mails David Belohrad
@ 2012-06-25 17:14 ` Jameson Graef Rollins
  2012-06-25 17:41   ` Jameson Graef Rollins
  0 siblings, 1 reply; 3+ messages in thread
From: Jameson Graef Rollins @ 2012-06-25 17:14 UTC (permalink / raw)
  To: David Belohrad, notmuch


On Mon, Jun 25 2012, David Belohrad <david@belohrad.ch> wrote:
> Can someone give me some advice? I have many emails containing
> attachments. These are typically the output of a copy machine, which splits
> a scan into multiple attachments.
>
> I'd like to extract those attached files in one batch into a specific
> directory. Is there any way to fetch those files programmatically?

notmuch show has a --part option for outputting a single part from a
MIME message.  Unfortunately there's currently no clean way to determine
the number of parts in a message.  But sort of hackily, you could do
something like:

# 10 is an arbitrary upper bound on the number of parts per message
for id in $(notmuch search --output=messages tag:files-to-extract); do
    for part in $(seq 1 10); do
        notmuch show --part="$part" --format=raw "$id" > "$id.$part"
    done
done

That will also save any multipart parts, which aren't really that
useful, so you'll have to sort through them.
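As a rough illustration of skipping the multipart containers, here is a minimal sketch that uses only the standard email module. The helper names leaf_parts and build_sample and the in-memory sample message are made up for this example, and the order of parts returned by walk() is not guaranteed to match the numbering that notmuch show --part uses.

```python
import email
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def leaf_parts(msg):
    """Return the non-multipart parts of a parsed message, in walk() order."""
    return [p for p in msg.walk() if p.get_content_maintype() != 'multipart']


def build_sample():
    # Hypothetical scanner mail with two attachments, built in memory
    # purely for illustration.
    outer = MIMEMultipart()
    outer.attach(MIMEText("scan attached"))
    for name in ("page1.pdf", "page2.pdf"):
        att = MIMEApplication(b"%PDF-fake", _subtype="pdf")
        att.add_header("Content-Disposition", "attachment", filename=name)
        outer.attach(att)
    # Round-trip through a string, as if the mail had been read from disk.
    return email.message_from_string(outer.as_string())


sample = build_sample()
names = [p.get_filename() for p in leaf_parts(sample)]
print(len(leaf_parts(sample)), names)
```

The text body has no filename, so its entry in names is None; only the two attachments carry one.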

You can make something much cleaner with python, using the notmuch and
email python bindings:

http://packages.python.org/notmuch/
http://docs.python.org/library/email-examples.html

I hacked up something simple below that will extract parts from messages
matching a search term into the current directory (tested).

hth.

jamie.


#!/usr/bin/env python

import subprocess
import sys
import notmuch
import email
import mimetypes

# Ask notmuch where the mail database lives.
dbpath = subprocess.check_output(['notmuch', 'config', 'get', 'database.path']).strip()
db = notmuch.Database(dbpath)
# The search term is taken from the first command-line argument.
query = notmuch.Query(db, sys.argv[1])
for nmsg in query.search_messages():
    # Parse the message file on disk with the standard email module.
    with open(nmsg.get_filename(), 'r') as f:
        msg = email.message_from_file(f)
    counter = 1
    for part in msg.walk():
        # Multipart containers hold no data of their own.
        if part.get_content_maintype() == 'multipart': continue
        filename = part.get_filename()
        if not filename:
            # No name supplied: build one from the counter and MIME type.
            ext = mimetypes.guess_extension(part.get_content_type())
            if not ext:
                ext = '.bin'
            filename = 'part-%03d%s' % (counter, ext)
        counter += 1
        print(filename)
        with open(filename, 'wb') as f:
            f.write(part.get_payload(decode=True))
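
The file-naming step in a script like the one above can be pulled out into a small function. This is only a sketch: choose_filename is a hypothetical helper, not part of notmuch or the email module, and stripping directory components with os.path.basename is an added precaution that the original script does not take.

```python
import mimetypes
import os
from email.message import Message


def choose_filename(part, counter):
    """Pick an output name for one MIME part (hypothetical helper)."""
    name = part.get_filename()
    if name:
        # Drop any directory components the sender put in the name.
        return os.path.basename(name)
    # No name supplied: fall back to a counter plus a guessed extension,
    # or '.bin' when the MIME type is unknown.
    ext = mimetypes.guess_extension(part.get_content_type()) or '.bin'
    return 'part-%03d%s' % (counter, ext)


# Illustrative parts built by hand; the MIME type below is deliberately bogus.
unnamed = Message()
unnamed['Content-Type'] = 'application/x-nonexistent-type'
named = Message()
named.add_header('Content-Disposition', 'attachment', filename='sub/scan.pdf')

print(choose_filename(unnamed, 1))
print(choose_filename(named, 2))
```

Keeping the fallback inside the "no filename" branch means a part that already has a name is never renamed.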



* Re: extract attachments from multiple mails
  2012-06-25 17:14 ` Jameson Graef Rollins
@ 2012-06-25 17:41   ` Jameson Graef Rollins
  0 siblings, 0 replies; 3+ messages in thread
From: Jameson Graef Rollins @ 2012-06-25 17:41 UTC (permalink / raw)
  To: David Belohrad, notmuch



On Mon, Jun 25 2012, Jameson Graef Rollins <jrollins@finestructure.net> wrote:
> I hacked up something simple below that will extract parts from messages
> matching a search term into the current directory (tested).

Improved/bug fixed version attached.

jamie.


[-- Attachment #1.2: jnotmuch-extract-parts --]
[-- Type: application/octet-stream, Size: 1047 bytes --]

#!/usr/bin/env python

import subprocess
import sys
import os
import notmuch
import email
import mimetypes

# Ask notmuch where the mail database lives.
dbpath = subprocess.check_output(['notmuch', 'config', 'get', 'database.path']).strip()
db = notmuch.Database(dbpath)
# The search term is taken from the first command-line argument.
query = notmuch.Query(db, sys.argv[1])
for nmsg in query.search_messages():
    # One output directory per message, named after its Message-ID.
    outdir = nmsg.get_message_id()
    with open(nmsg.get_filename(), 'r') as f:
        msg = email.message_from_file(f)
    counter = 1
    if not os.path.exists(outdir): os.makedirs(outdir)
    for part in msg.walk():
        # Multipart containers hold no data of their own.
        if part.get_content_maintype() == 'multipart': continue
        filename = part.get_filename()
        print(part.get_content_type())
        if not filename:
            # No name supplied: build one from the counter and MIME type.
            ext = mimetypes.guess_extension(part.get_content_type())
            if not ext:
                ext = '.bin'
            filename = 'part-%03d%s' % (counter, ext)
        outfile = os.path.join(outdir, filename)
        print(outfile)
        with open(outfile, 'wb') as f:
            f.write(part.get_payload(decode=True))
        counter += 1
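
One thing to watch with the per-message output directory: a Message-ID can contain characters such as '/', which would turn the directory name into a nested path. A possible workaround is sketched below. safe_dirname is a hypothetical helper, and the set of characters it keeps is an arbitrary, conservative choice rather than anything notmuch prescribes.

```python
import re


def safe_dirname(message_id):
    """Map a Message-ID to a single path component (hypothetical helper)."""
    # Replace everything outside a conservative character set, including '/'.
    cleaned = re.sub(r'[^A-Za-z0-9._@+-]', '_', message_id)
    # Avoid names that are empty or consist only of dots ('.', '..').
    return cleaned if cleaned.strip('.') else '_'


print(safe_dirname('a/b@host.example'))
```

In the script above, the result would stand in for the raw Message-ID when building outdir.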


