* Idea for storing tags @ 2010-01-11 22:19 martin f krafft 2010-01-12 3:44 ` Scott Robinson ` (3 more replies) 0 siblings, 4 replies; 42+ messages in thread From: martin f krafft @ 2010-01-11 22:19 UTC (permalink / raw) To: mailtags discussion list; +Cc: notmuch discussion list [-- Attachment #1: Type: text/plain, Size: 2725 bytes --] Folks, over in #notmuch, we just floated an idea that I'd like to get out to you. We've been debating storing tags for messages. Therefore I am cross-posting. Please forgive me. So far, there are two approaches: 1. External database, which has the downside of not being synchronisable with standard IMAP, like the rest of your mail (assuming you use IMAP). Also, it's possible for mailstore and database to get out of sync. 2. In-headers, which has the downside of leaking (e.g. when bouncing), and incurs the risks associated with message rewrites (which I think is pretty much ignorable, but it's still there). Also, there's a performance issue, but in the context of an indexer like notmuch, this is negligible. The leakage is real, though and I think it makes in-headers unusable. After all, I don't ever want anyone else to know that I tag e-mails from my boss as "from-idiots", and I forward and bounce mail on a regular basis. I could tell my MTA to remove those headers, but I might forget to do that on a new system. We also previously determined that IMAP keywords are pretty much useless as they are stored per mailbox, not per message, not standardised, and limited in their length anyway [0]. This also means that we don't really need to investigate sensibly storing tags in Maildir (e.g. with xattrs), because IMAP cannot transport them. 0. http://lists.madduck.net/pipermail/mailtags/2007-August/msg00016.html Seriously, who implemented IMAPv4rev1 and what sort of crack were they smoking?? I remember there was some KDE groupware contacts manager that used IMAP to synchronise contacts. 
At first, this sounds horrible, but when you detach IMAP from RFC822, it becomes a generic synchronising protocol. The next step is then straightforward, and I want to share this idea with you:

How about using pseudo-mails stored in Maildir and synchronised by IMAP? E.g. every folder could have a subfolder .TAGS, and if we find a way to smartly pair messages between parent and subfolder, we'd have a tag store alongside the mailstore it refers to, but without the danger of leakage, and without having to rewrite messages.

The major problem arises when clients don't understand this "protocol", for then they will display all .TAGS folders as regular IMAP folders and try to treat the messages therein as regular mails. Somewhere, sometime, this is bound to blow up, and I don't really know how to prevent that.

Anyway, the idea is out now. Thoughts?

-- 
martin | http://madduck.net/ | http://two.sentenc.es/

echo Prpv a\'rfg cnf har cvcr | tr Pacfghnrvp Cnpstuaeic

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags
  2010-01-11 22:19 Idea for storing tags martin f krafft
@ 2010-01-12 3:44 ` Scott Robinson
  2010-01-12 4:06 ` martin f krafft
  2010-01-12 4:51 ` Potential problem using Git for mail (was: Idea for storing tags) martin f krafft
  2010-01-12 4:11 ` Idea for storing tags Scott Morrison
  ` (2 subsequent siblings)
  3 siblings, 2 replies; 42+ messages in thread
From: Scott Robinson @ 2010-01-12 3:44 UTC (permalink / raw)
  To: notmuch

I wrote a script to store and sync my tags.

* One filename per message-ID.
* Line-feed separated tags in each file.

Then the whole structure is controlled via git. Conflict-resolution and sync come for free.

It isn't clear what use-case the earlier e-mail is aiming to satisfy. This is how I solved my tag sync issues, though.

^ permalink raw reply [flat|nested] 42+ messages in thread
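The layout described in this message (one file per Message-ID, one tag per line, the directory tracked in git) could be sketched roughly as follows. This is only an illustration, not the script mentioned above: the function names and the percent-encoding of Message-IDs into filenames are assumptions, and the git side (add, commit, pull, push) is left out.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote


def tag_path(store: Path, message_id: str) -> Path:
    # Assumption: percent-encode the Message-ID so that characters such as
    # "/" cannot escape the store directory. The original script may differ.
    return store / quote(message_id, safe="")


def read_tags(store: Path, message_id: str) -> set[str]:
    path = tag_path(store, message_id)
    if not path.exists():
        return set()  # no tag file: the message is simply untagged
    return {line for line in path.read_text().splitlines() if line}


def write_tags(store: Path, message_id: str, tags: set[str]) -> None:
    # One tag per line, sorted, so that the files stay stable and
    # line-based diffs and merges in git remain small.
    tag_path(store, message_id).write_text("".join(f"{t}\n" for t in sorted(tags)))


if __name__ == "__main__":
    store = Path(tempfile.mkdtemp())
    write_tags(store, "<20100112.abc@example.org>", {"notmuch", "inbox"})
    print(sorted(read_tags(store, "<20100112.abc@example.org>")))
```

Keeping one tag per line means two clients adding different tags to the same message touch different lines, which is the case a line-based merge handles best.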
* Re: Idea for storing tags 2010-01-12 3:44 ` Scott Robinson @ 2010-01-12 4:06 ` martin f krafft 2010-01-12 4:51 ` Potential problem using Git for mail (was: Idea for storing tags) martin f krafft 1 sibling, 0 replies; 42+ messages in thread From: martin f krafft @ 2010-01-12 4:06 UTC (permalink / raw) To: notmuch; +Cc: mailtags discussion list [-- Attachment #1: Type: text/plain, Size: 969 bytes --] also sprach Scott Robinson <scott@quadhome.com> [2010.01.12.1644 +1300]: > I wrote a script to store and sync my tags. > > * One filename per message-ID. > * Line-feed seperated tags in each file. > > Then the whole structure is controlled via git. > Conflict-resolution and sync comes for free. How do you ensure that the external tag store and your mail store do not go out of sync? I assume that mails without a tagfile are simply untagged, so that's hardly the issue. However, if you delete a mail, how do you ensure that the tag database is cleaned up? Also, do you attach tags automatically, e.g. with procmail on the server? If so, how do you initiate git-pull locally? Would you consider sharing your script? -- martin | http://madduck.net/ | http://two.sentenc.es/ "alle vorurteile kommen aus den eingeweiden." - friedrich nietzsche spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Potential problem using Git for mail (was: Idea for storing tags)
  2010-01-12 3:44 ` Scott Robinson
  2010-01-12 4:06 ` martin f krafft
@ 2010-01-12 4:51 ` martin f krafft
  2010-01-12 19:38 ` Jameson Rollins
  2010-01-14 8:12 ` Asheesh Laroia
  1 sibling, 2 replies; 42+ messages in thread
From: martin f krafft @ 2010-01-12 4:51 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 1438 bytes --]

also sprach Scott Robinson <scott@quadhome.com> [2010.01.12.1644 +1300]:
> Then the whole structure is controlled via git.
> Conflict-resolution and sync comes for free.

I've just had a good think about this, also because the idea of abandoning IMAP and using Git has been around for a while and I have not really wrapped my head around it.

If the MDA delivers to Git, then potentially, you might get into a situation where you cannot write your own changes back to the repo. This is also a DoS scenario: I'll just keep sending you e-mail, and if I manage to pass your mail filters, I'll basically commit to your mail repository at regular intervals. Say those are 5 seconds. In order for you to write updates to the repo, e.g. to update tags, you would need to pull, rebase, and push all within 5 seconds, for otherwise you'd try to push non-fast-forwards.

This is a bit unrealistic, surely, but there's a real annoyance in it: you'd have to pull/rebase/push until a push succeeds, i.e. until you found a time window between pull and push during which the MDA didn't write to the repo. This might take a long time. If this happens in the background by Cron, it's not a real concern, but if this becomes a UI issue, I wouldn't know how to handle it.

-- 
martin | http://madduck.net/ | http://two.sentenc.es/

don't hate yourself in the morning -- sleep till noon.
spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
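The pull/rebase/push loop described in this message might look something like the sketch below. It assumes the script runs inside a clone whose upstream is the repository the MDA commits to. The function names, the attempt limit and the growing delay are illustrative choices, not something proposed in the thread. The git runner is passed in as a parameter so the loop can be exercised without a real repository.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_git(args: Sequence[str]) -> bool:
    """Run one git command in the current directory; True on exit status 0."""
    return subprocess.run(["git", *args]).returncode == 0


def sync_with_retry(
    run: Callable[[Sequence[str]], bool] = run_git,
    attempts: int = 5,
    delay: float = 1.0,
) -> bool:
    """Pull, rebase and push until a push lands or the attempts run out."""
    for attempt in range(attempts):
        # Replay local commits (e.g. tag updates) on top of whatever was
        # committed upstream in the meantime, then try to push the result.
        if run(["pull", "--rebase"]) and run(["push"]):
            return True
        # The push was rejected (or the pull failed): wait a little longer
        # each time before trying again.
        time.sleep(delay * (attempt + 1))
    return False
```

Bounding the number of attempts does not remove the race; it only lets a front end report "could not sync yet" instead of blocking indefinitely.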
* Re: Potential problem using Git for mail (was: Idea for storing tags) 2010-01-12 4:51 ` Potential problem using Git for mail (was: Idea for storing tags) martin f krafft @ 2010-01-12 19:38 ` Jameson Rollins 2010-01-12 19:55 ` martin f krafft 2010-01-14 8:12 ` Asheesh Laroia 1 sibling, 1 reply; 42+ messages in thread From: Jameson Rollins @ 2010-01-12 19:38 UTC (permalink / raw) To: notmuch [-- Attachment #1: Type: text/plain, Size: 1243 bytes --] On Tue, Jan 12, 2010 at 05:51:53PM +1300, martin f krafft wrote: > If the MDA delivers to Git, then potentially, you might get into > a situation where you cannot write your own changes back to the > repo. This is also a DoS scenario: I'll just keep sending you > e-mail, and if I manage to pass your mail filters, I'll basically > commit to your mail repository at regular intervals. Say those are > 5 seconds. In order for you to write updates to the repo, e.g. to > update tags, then you would need to pull, rebase, and push all > within 5 seconds, for otherwise you'd try to push non-fast-forwards. > > This a bit unrealistic, surely, but there's a real annoyance in it: > you'd have to pull/rebase/push until a push succeeds — until you > found a time window between pull and push during which the MDA > didn't write to the repo. This might take a long time. If this > happens in the background by Cron, it's not a real concern, but if > this becomes a UI issue, I wouldn't know how to handle it. What about if just the tag information is stored in the repository, and not the mail itself? In that case only the user would be pushing into the repo and you wouldn't have to worry about the DoS scenario. jamie. [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Potential problem using Git for mail (was: Idea for storing tags) 2010-01-12 19:38 ` Jameson Rollins @ 2010-01-12 19:55 ` martin f krafft 0 siblings, 0 replies; 42+ messages in thread From: martin f krafft @ 2010-01-12 19:55 UTC (permalink / raw) To: Jameson Rollins; +Cc: notmuch [-- Attachment #1: Type: text/plain, Size: 561 bytes --] also sprach Jameson Rollins <jrollins@finestructure.net> [2010.01.13.0838 +1300]: > What about if just the tag information is stored in the > repository, and not the mail itself? In that case only the user > would be pushing into the repo and you wouldn't have to worry > about the DoS scenario. I certainly would like the ability to have messages automatically-tagged on delivery, by procmail. -- martin | http://madduck.net/ | http://two.sentenc.es/ may the bluebird of happiness twiddle your bits. spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Potential problem using Git for mail (was: Idea for storing tags) 2010-01-12 4:51 ` Potential problem using Git for mail (was: Idea for storing tags) martin f krafft 2010-01-12 19:38 ` Jameson Rollins @ 2010-01-14 8:12 ` Asheesh Laroia 2010-01-14 20:37 ` martin f krafft 1 sibling, 1 reply; 42+ messages in thread From: Asheesh Laroia @ 2010-01-14 8:12 UTC (permalink / raw) To: martin f krafft; +Cc: notmuch [-- Attachment #1: Type: TEXT/PLAIN, Size: 2836 bytes --] On Tue, 12 Jan 2010, martin f krafft wrote: > If the MDA delivers to Git, then potentially, you might get into a > situation where you cannot write your own changes back to the repo. This > is also a DoS scenario: I'll just keep sending you e-mail, and if I > manage to pass your mail filters, I'll basically commit to your mail > repository at regular intervals. Say those are 5 seconds. In order for > you to write updates to the repo, e.g. to update tags, then you would > need to pull, rebase, and push all within 5 seconds, for otherwise you'd > try to push non-fast-forwards. Sure. But the MDA doesn't need to do the commit immediately. Since (presumably) we're using Maildir, the MDA on the mail receiving server is going to generate filenames that won't cause conflicts. So it's okay to leave the files uncommitted. If that's too scary, then have the MDA deliver to its own git branch with its own checkout. Then, if you can force linearity with a lock (!), your client can have a special "lock the repo and push" command. Your remote MUA could even ask the MDA to lock the Maildir while it does a merge and then pushes that, and then the MDA can go back to dequeuing messages from the MTA into the Maildir. Not the beautiful lockless world the purists want, but I'm okay with that. > This a bit unrealistic, surely, but there's a real annoyance in it: > you'd have to pull/rebase/push until a push succeeds — until you found a > time window between pull and push during which the MDA didn't write to > the repo. 
This might take a long time. If this happens in the background
> by Cron, it's not a real concern, but if this becomes a UI issue, I
> wouldn't know how to handle it.

It's not entirely unreasonable. Cron caused issues like that for me when I tracked my Maildir in git.

I'm just learning about notmuchmail.org, but I'll keep listening here. Preferably CC: me on replies to this mail.

I will say, I'm interested in an email setup with working IMAP on at least one side.

There's one other bad race I ran into when using git to manage my Maildirs. I was using Dovecot to serve my Maildir to an IMAP client, Alpine. I separately did a "git merge" from origin/master, where the remote MTA had an MDA delivering messages and a layer on top of that committed them.

When I did the "git merge", git would create the Maildir files in ~/Maildir/cur/... non-atomically.

Dovecot would notice the file in ~/Maildir/cur/ and think, "This file must be ready!" So it would parse it even though git hadn't finished writing it. This caused me to only see partial headers in Alpine since Dovecot parsed it before it was a complete message.

That kind of sucked.

-- Asheesh.

-- 
Almost anything derogatory you could say about today's software design would be accurate.
		-- K. E. Iverson

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Potential problem using Git for mail (was: Idea for storing tags) 2010-01-14 8:12 ` Asheesh Laroia @ 2010-01-14 20:37 ` martin f krafft 2010-01-21 6:28 ` Asheesh Laroia 0 siblings, 1 reply; 42+ messages in thread From: martin f krafft @ 2010-01-14 20:37 UTC (permalink / raw) To: Asheesh Laroia; +Cc: notmuch [-- Attachment #1: Type: text/plain, Size: 1500 bytes --] also sprach Asheesh Laroia <asheesh@asheesh.org> [2010.01.14.2112 +1300]: > Sure. But the MDA doesn't need to do the commit immediately. Since > (presumably) we're using Maildir, the MDA on the mail receiving > server is going to generate filenames that won't cause conflicts. > So it's okay to leave the files uncommitted. So when does the commit happen? > When I did the "git merge", git would create the Maildir files in > ~/Maildir/cur/... non-atomically. This might be something that the Git people could address if it was brought up on the mailing list. Then again, it might not be possible without going via a temporary file, which I doubt will fly. I suppose that I never actually considered merges on the IMAP server side, but obviously the IMAP server has to work off a clone, and that means it needs to merge. > Dovecot would notice the file in ~/Maildir/cur/ and think, "This > file must be ready!" So it would parse it even though git hadn't > finished writing it. This caused me to only see partial headers in > Alpine since Dovecot parsed it before it was a complete message. I wonder if a custom merge driver could address this to properly use …/tmp/ to assemble the message and only then move it. -- martin | http://madduck.net/ | http://two.sentenc.es/ "this week dragged past me so slowly; the days fell on their knees..." -- david bowie spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Potential problem using Git for mail (was: Idea for storing tags)
  2010-01-14 20:37 ` martin f krafft
@ 2010-01-21 6:28 ` Asheesh Laroia
  2010-01-25 0:46 ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft
  0 siblings, 1 reply; 42+ messages in thread
From: Asheesh Laroia @ 2010-01-21 6:28 UTC (permalink / raw)
  To: martin f krafft; +Cc: notmuch

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2038 bytes --]

On Fri, 15 Jan 2010, martin f krafft wrote:

> also sprach Asheesh Laroia <asheesh@asheesh.org> [2010.01.14.2112 +1300]:
>> Sure. But the MDA doesn't need to do the commit immediately. Since
>> (presumably) we're using Maildir, the MDA on the mail receiving
>> server is going to generate filenames that won't cause conflicts.
>> So it's okay to leave the files uncommitted.
>
> So when does the commit happen?
>
>> When I did the "git merge", git would create the Maildir files in
>> ~/Maildir/cur/... non-atomically.
>
> This might be something that the Git people could address if it was
> brought up on the mailing list. Then again, it might not be possible
> without going via a temporary file, which I doubt will fly.

A temporary file + rename() is the only way, as far as I know.

> I suppose that I never actually considered merges on the IMAP server
> side, but obviously the IMAP server has to work off a clone, and that
> means it needs to merge.

It's not "merge" that's unsafe; that just builds a tree in the git index (assuming no conflicts). It's the ensuing process of git writing a tree to the filesystem that is problematic.

I could probably actually write a wrapper that locks the Maildir while git is operating. It would probably be specific to each IMAP server.

Note that this means git is fundamentally incompatible with Maildir, not just IMAP servers.

>> Dovecot would notice the file in ~/Maildir/cur/ and think, "This file
>> must be ready!" So it would parse it even though git hadn't finished
>> writing it.
This caused me to only see partial headers in Alpine since >> Dovecot parsed it before it was a complete message. > > I wonder if a custom merge driver could address this to properly use > …/tmp/ to assemble the message and only then move it. I don't think a merge driver can do it for the reason stated above. -- Asheesh. -- I always turn to the sports pages first, which record people's accomplishments. The front page has nothing but man's failures. -- Chief Justice Earl Warren ^ permalink raw reply [flat|nested] 42+ messages in thread
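For reference, the "temporary file + rename()" pattern mentioned in this message is the usual way to make a file appear in a Maildir all at once. A minimal sketch, assuming tmp/ and the target subdirectory live on the same filesystem; the function name and parameters are hypothetical, and this shows the pattern itself, not anything git does when it checks out a tree:

```python
import os
from pathlib import Path


def deliver_atomically(maildir: Path, name: str, data: bytes, subdir: str = "cur") -> Path:
    """Write a message under tmp/, then rename() it into its final place."""
    tmp_path = maildir / "tmp" / name
    final_path = maildir / subdir / name
    with open(tmp_path, "xb") as fh:  # "x": fail rather than clobber an existing file
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())  # push the content to disk before it becomes visible
    # rename() within one filesystem is atomic on POSIX: a reader scanning
    # the target directory sees either no file or the complete file.
    os.rename(tmp_path, final_path)
    return final_path
```

A reader such as an IMAP server that only looks at new/ and cur/ would then never observe a half-written message.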
* Git as notmuch object store (was: Potential problem using Git for mail) 2010-01-21 6:28 ` Asheesh Laroia @ 2010-01-25 0:46 ` martin f krafft 2010-01-25 5:19 ` Asheesh Laroia ` (2 more replies) 0 siblings, 3 replies; 42+ messages in thread From: martin f krafft @ 2010-01-25 0:46 UTC (permalink / raw) To: Asheesh Laroia; +Cc: notmuch [-- Attachment #1: Type: text/plain, Size: 3212 bytes --] also sprach Asheesh Laroia <asheesh@asheesh.org> [2010.01.21.1928 +1300]: > >I suppose that I never actually considered merges on the IMAP > >server side, but obviously the IMAP server has to work off a clone, > >and that means it needs to merge. > > It's not "merge" that's unsafe; that just builds a tree in the git > index (assuming no conflicts). It's the ensuing process of git > writing a tree to the filesystem that is problematic. There is no way to make that atomic, I am afraid. As you say. > I could probably actually write a wrapper that locks the Maildir > while git is operating. It would probably be specific to each IMAP > server. Ouch! I'd really rather not go there. > Note that this mean git is fundamentally incompatible with > Maildir, not just IMAP servers. We had an idea about using Git to replace IMAP altogether, along with making notmuch use a bare Git repository as object store. The idea is that notmuch uses low-level Git commands to access the .git repository (from which you can still checkout a tree tying the blobs into a Maildir). The benefit would be compression, lower inode count (due to packs), and backups using clones/merges. You could either have the MDA write to a Git repo on the server side and use git packs to download mail to a local clone, or one could have e.g. offlineimap grow a Git storage backend. The interface to notmuch would be the same. If we used this, all the rename and delete code would be refactored into Git and could be removed from notmuch. 
In addition, notmuch could actually use Git tree objects to represent the results of searches, and you could checkout these trees. However, deleting messages from search results would not have any effect on the message or its existence in other search results, much like what happens with mairix nowadays. I think we all kinda agreed that the Maildir flags should not be used by notmuch and that things like Sebastian's notmuchsync should be used if people wanted flags represented in Maildir filenames. Instead of a Maildir checkout, notmuch could provide an interface to browse the store contents in a way that could make it accessible to mutt. The argument is that with 'notmuch {ls,cat,rm,…}', a mutt backend could be trivially written. I am not sure about that, but it's worth a try. But there are still good reasons why you'd want to have IMAP capability too, e.g. Webmail. Given the atomicity problems that come from Git, maybe an IMAP server reading from the Git store would make sense. However, this all sounds like a lot of NIH and reinvention. It's a bit like the marriage between the hypothetical Maildir2 and Git, which is definitely worth pursuing. Before we embark on any of this, however, we'd need to define the way in which Git stores mail. Stewart, you've worked most on this so far. Would you like to share your thoughts? -- martin | http://madduck.net/ | http://two.sentenc.es/ "reife des mannes, das ist es, den ernst wiedergefunden zu haben, den man hatte als kind beim spiel." -- friedrich nietzsche spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Git as notmuch object store (was: Potential problem using Git for mail) 2010-01-25 0:46 ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft @ 2010-01-25 5:19 ` Asheesh Laroia 2010-01-25 7:43 ` martin f krafft 2010-01-25 13:49 ` Sebastian Spaeth 2010-02-15 0:51 ` Stewart Smith 2 siblings, 1 reply; 42+ messages in thread From: Asheesh Laroia @ 2010-01-25 5:19 UTC (permalink / raw) To: martin f krafft; +Cc: notmuch [-- Attachment #1: Type: TEXT/PLAIN, Size: 5131 bytes --] On Mon, 25 Jan 2010, martin f krafft wrote: > also sprach Asheesh Laroia <asheesh@asheesh.org> [2010.01.21.1928 > +1300]: >>> I suppose that I never actually considered merges on the IMAP server >>> side, but obviously the IMAP server has to work off a clone, and that >>> means it needs to merge. >> >> It's not "merge" that's unsafe; that just builds a tree in the git >> index (assuming no conflicts). It's the ensuing process of git writing >> a tree to the filesystem that is problematic. > > There is no way to make that atomic, I am afraid. As you say. > >> I could probably actually write a wrapper that locks the Maildir while >> git is operating. It would probably be specific to each IMAP server. > > Ouch! I'd really rather not go there. You say "Ouch" but you should know Dovecot *already* does this. I don't mind interoperating with that. See http://wiki.dovecot.org/MailboxFormat/Maildir, section "Issues with the specification", subsection "Locking". I term this the famous readdir() race. Without this lock, Maildir is fundamentally incompatible with IMAP -- one Maildir-using process modifying message flags could make a different Maildir-using process think said message is actually deleted. In the case of temporary disappearing mails in Mutt locally, that's not the end of the world. For IMAP, it will make the IMAP daemon (one of the Maildir-using processes) send a note to IMAP clients saying that the message has been deleted and expunged. 
>> Note that this mean git is fundamentally incompatible with Maildir, not >> just IMAP servers. > > We had an idea about using Git to replace IMAP altogether, along with > making notmuch use a bare Git repository as object store. The idea is > that notmuch uses low-level Git commands to access the .git repository > (from which you can still checkout a tree tying the blobs into a > Maildir). The benefit would be compression, lower inode count (due to > packs), and backups using clones/merges. Sure, that makes sense to me. > You could either have the MDA write to a Git repo on the server side and > use git packs to download mail to a local clone, or one could have e.g. > offlineimap grow a Git storage backend. The interface to notmuch would > be the same. Yeah, I generally like this. > If we used this, all the rename and delete code would be refactored into > Git and could be removed from notmuch. In addition, notmuch could > actually use Git tree objects to represent the results of searches, and > you could checkout these trees. However, deleting messages from search > results would not have any effect on the message or its existence in > other search results, much like what happens with mairix nowadays. That's okay with me. > I think we all kinda agreed that the Maildir flags should not be used by > notmuch and that things like Sebastian's notmuchsync should be used if > people wanted flags represented in Maildir filenames. Aww, I like Maildir flags, but if there's a sync tool, I'm fine with that. > Instead of a Maildir checkout, notmuch could provide an interface to > browse the store contents in a way that could make it accessible to > mutt. The argument is that with 'notmuch {ls,cat,rm,…}', a mutt backend > could be trivially written. I am not sure about that, but it's worth a > try. Sure. > But there are still good reasons why you'd want to have IMAP capability > too, e.g. Webmail. 
Given the atomicity problems that come from Git, > maybe an IMAP server reading from the Git store would make sense. It wouldn't be too hard to write a FUSE filesystem that presented an interface to a Git repository that didn't allow the contents of files to be modified. Then Dovecot could think it's interacting with the filesystem. > However, this all sounds like a lot of NIH and reinvention. It's > a bit like the marriage between the hypothetical Maildir2 and Git, > which is definitely worth pursuing. Before we embark on any of this, > however, we'd need to define the way in which Git stores mail. Sure. If it were me, I'd just say, "For phase 1 of notmuch, just have git store Maildir spools." When you need a filesystem interface for e.g. Dovecot, have a FUSE wrapper. See how far that can take you, and then see if version 2 is necessary. (-: > Stewart, you've worked most on this so far. Would you like to share your > thoughts? I'll listen, too. Just don't fall into the trap of thinking Maildir is compatible with IMAP. It's not, because as I understand things, the filesystem doesn't guarantee that you can actually iterate across a directory's files if another process is modifying the list of files. I'm not sure, but maybe it's safe if you refuse to ever modify a message's flags in the filename. Anyway, as I see it, further hacks that aren't much worse than Dovecot's should be considered okay, unless you have a more elegant design up your sleeve. If I'm slightly wrong about something, try to give me the benefit of doubt. It's past midnight. (-: -- Asheesh. -- There's no real need to do housework -- after four years it doesn't get any worse. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Git as notmuch object store (was: Potential problem using Git for mail)
  2010-01-25 5:19 ` Asheesh Laroia
@ 2010-01-25 7:43 ` martin f krafft
  0 siblings, 0 replies; 42+ messages in thread
From: martin f krafft @ 2010-01-25 7:43 UTC (permalink / raw)
  To: notmuch

[-- Attachment #1: Type: text/plain, Size: 3224 bytes --]

also sprach Asheesh Laroia <asheesh@asheesh.org> [2010.01.25.1819 +1300]:
> You say "Ouch" but you should know Dovecot *already* does this. I
> don't mind interoperating with that.
>
> See http://wiki.dovecot.org/MailboxFormat/Maildir, section "Issues
> with the specification", subsection "Locking". I term this the
> famous readdir() race.

Yikes. IMAP (including dovecot) just SUCKS.

> Without this lock, Maildir is fundamentally incompatible with IMAP
> -- one Maildir-using process modifying message flags could make
> a different Maildir-using process think said message is actually
> deleted. In the case of temporary disappearing mails in Mutt
> locally, that's not the end of the world. For IMAP, it will make
> the IMAP daemon (one of the Maildir-using processes) send a note
> to IMAP clients saying that the message has been deleted and
> expunged.
[…]
> Just don't fall into the trap of thinking Maildir is compatible
> with IMAP. It's not, because as I understand things, the
> filesystem doesn't guarantee that you can actually iterate across
> a directory's files if another process is modifying the list of
> files.

This is all good reason to concentrate even more on designing a store that could potentially make IMAP obsolete once and for all!

The current idea is to sync Git downstream only, and find a way to keep multiple copies of a tagstore in sync, by way of the "server instance" (where mail is received/delivered). Deleting messages would then be something like setting the notmuch::deleted tag, which clients would honour; on the server, a cleanup process would run regularly to actually delete the blobs associated with deleted messages.
This would then propagate the next time one pulls from Git.

Whether to store history (commit objects) or just collections (tree objects) needs to be investigated.

> >But there are still good reasons why you'd want to have IMAP
> >capability too, e.g. Webmail. Given the atomicity problems that
> >come from Git, maybe an IMAP server reading from the Git store
> >would make sense.
>
> It wouldn't be too hard to write a FUSE filesystem that presented
> an interface to a Git repository that didn't allow the contents of
> files to be modified. Then Dovecot could think it's interacting
> with the filesystem.

Yes, a FUSE layer (which adds a daemon), or a lightweight access API via libnotmuch. Probably the former using the latter. ;)

> Aww, I like Maildir flags, but if there's a sync tool, I'm fine
> with that.
[…]
> I'm not sure, but maybe it's safe if you refuse to ever modify
> a message's flags in the filename.

The main point is that there is nothing really in Maildir filenames that you couldn't equally (and possibly better) represent in the notmuch::* tag namespace, and then there is benefit in only having one used primarily (which means notmuchsync can do whatever it wants without affecting or messing with notmuch).

-- 
martin | http://madduck.net/ | http://two.sentenc.es/

"if I can't dance, i don't want to be part of your revolution."
                                                       - emma goldman

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Git as notmuch object store (was: Potential problem using Git for mail)
  2010-01-25 0:46 ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft
  2010-01-25 5:19 ` Asheesh Laroia
@ 2010-01-25 13:49 ` Sebastian Spaeth
  2010-01-25 16:22 ` Mike Kelly
  ` (2 more replies)
  2010-02-15 0:51 ` Stewart Smith
  2 siblings, 3 replies; 42+ messages in thread
From: Sebastian Spaeth @ 2010-01-25 13:49 UTC (permalink / raw)
  To: martin f krafft, Asheesh Laroia; +Cc: notmuch

On Mon, 25 Jan 2010 13:46:59 +1300, martin f krafft <madduck@madduck.net> wrote:
> I think we all kinda agreed that the Maildir flags should not be
> used by notmuch and that things like Sebastian's notmuchsync should
> be used if people wanted flags represented in Maildir filenames.

While notmuchsync fulfils my needs, it is a kludge. It needs to call "notmuch" for each mail where a Maildir flag has changed (which can be quite often on an initial run, where most mails are likely to be read), and this can take a long, long time. It would make sense IMHO to at least pick pioto's "don't set unread if 'S' flag is set" on notmuch new [1]. Or, at the very least, not to set the "unread" flag by default.

Sebastian

[1] pioto's noarg-count branch (http://git.pioto.org/gitweb/notmuch.git, announced in mail id:20100121204201.1C82764A0E@aether.pioto.org)

^ permalink raw reply [flat|nested] 42+ messages in thread
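The "don't set unread if 'S' flag is set" behaviour comes down to reading the flag letters that Maildir appends to a filename after ":2,". A rough sketch of that decision follows. The Maildir flag letters are the standard ones; the tag names, the default "inbox" tag and the function names are assumptions made for illustration and are not taken from the branch referenced above.

```python
# Hypothetical mapping from Maildir flag letters to tag names.
FLAG_TAGS = {"F": "flagged", "R": "replied", "D": "draft", "P": "passed"}


def maildir_flags(filename: str) -> set[str]:
    """Return the flag letters of a Maildir filename ('...:2,FS' -> {'F', 'S'})."""
    _, sep, info = filename.rpartition(":2,")
    return set(info) if sep else set()


def initial_tags(filename: str) -> set[str]:
    """Tags a newly indexed message could start with, given its filename."""
    flags = maildir_flags(filename)
    tags = {"inbox"}  # assumed default tag for new mail
    if "S" not in flags:
        tags.add("unread")  # only mark unread when the Seen flag is absent
    tags.update(tag for letter, tag in FLAG_TAGS.items() if letter in flags)
    return tags


if __name__ == "__main__":
    print(sorted(initial_tags("1264427340.M1P2.host:2,S")))
    print(sorted(initial_tags("1264427341.M1P3.host")))
```

Doing this once at indexing time avoids a separate "notmuch" invocation per message, which is the cost described for the initial run.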
* Re: Git as notmuch object store (was: Potential problem using Git for mail)
  2010-01-25 13:49 ` Sebastian Spaeth
@ 2010-01-25 16:22   ` Mike Kelly
  2010-01-25 21:46     ` tag dir proposal [was: Re: Git as notmuch object store] Jameson Rollins
  2010-01-25 19:49   ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft
  2010-01-27  9:00   ` Sebastian Spaeth
  2 siblings, 1 reply; 42+ messages in thread
From: Mike Kelly @ 2010-01-25 16:22 UTC (permalink / raw)
  To: Sebastian Spaeth, martin f krafft, Asheesh Laroia; +Cc: notmuch

On Mon, 25 Jan 2010 14:49:00 +0100, "Sebastian Spaeth" <Sebastian@SSpaeth.de> wrote:
>
> On Mon, 25 Jan 2010 13:46:59 +1300, martin f krafft <madduck@madduck.net> wrote:
> > I think we all kinda agreed that the Maildir flags should not be
> > used by notmuch and that things like Sebastian's notmuchsync should
> > be used if people wanted flags represented in Maildir filenames.
>
> While notmuchsync fulfils my needs, it is a kludge. It needs to call
> "notmuch" for each mail where a MailDir flag has changed (which can be
> quite often on an initial run, where most mails are likely to be read),
> this can take a long, long time. It would make sense IMHO to at least
> pick pioto's "don't set unread if 'S' flag is set" on notmuch new[1].

notmuchsync, as currently implemented, suffers from major performance
issues, in my opinion. It's a useful short term workaround, but not a
good long term solution.

But, I personally will always be using both notmuch and some other IMAP
client (my phone). I want the two to remain in sync easily enough.
notmuch is already much more robust with respect to that than sup, I
think (in terms of handling renames without barfing, etc).

At the very least, I want `notmuch new` to be able to:

If it sees a rename that involves changing maildir flags, alter the
related tags as necessary.
Similarly, provide a mechanism for correlating the folder name with some
set of tags, and change those tags as messages are moved around.

For example, I might have:

~/.notmuch-config:

[database]
path=/home/pioto/mail
...
[tags]
pioto@pioto.org/INBOX.ListMail.notmuch = notmuch

So, a 'tags' section, where each key is the folder name, relative to the
db path, and the value is one or more tag names.

This means that I could relabel a message in gmail, for example, and
have the changes apply to notmuch at my next offlineimap run. And, it
means that my existing procmail rules will still be useful both to
notmuch, and to my phone, for the purpose of categorizing things.

I agree that all this should be optional. But, since it is likely the
behavior most people would expect, I think it should be the default.

PS. You mean the 'new-unread' branch, not the 'noarg-count' branch, from
my repo.

--
Mike Kelly

^ permalink raw reply [flat|nested] 42+ messages in thread
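The first request above (adjusting tags when a rename only changes Maildir flags) can be sketched roughly as follows. This is an illustration only, not notmuch code: it relies on the Maildir convention that flags are the letters following ":2," in the filename, and the flag-to-tag names in FLAG_TAGS as well as both function names are assumptions made for the example.

```python
# Hypothetical mapping from Maildir flag letters to tag names (assumption).
FLAG_TAGS = {"R": "replied", "F": "flagged", "D": "draft", "T": "deleted"}


def maildir_flags(filename):
    """Return the set of flag letters after the ':2,' info marker, if any."""
    _, sep, info = filename.rpartition(":2,")
    return set(info) if sep else set()


def tags_after_rename(old_name, new_name, tags):
    """Return the tag set implied by the flags that changed in a rename."""
    old, new = maildir_flags(old_name), maildir_flags(new_name)
    result = set(tags)
    for flag in old ^ new:  # only flags that were added or removed
        if flag == "S":
            # 'S' (seen) is inverted: its presence means "not unread".
            if flag in new:
                result.discard("unread")
            else:
                result.add("unread")
        elif flag in FLAG_TAGS:
            if flag in new:
                result.add(FLAG_TAGS[flag])
            else:
                result.discard(FLAG_TAGS[flag])
    return result


# Another IMAP client marks the message as read, which adds the 'S' flag.
print(tags_after_rename("cur/1.h:2,", "cur/1.h:2,S", {"inbox", "unread"}))
```

Only the flags that differ between the old and new names are touched, so tags set by hand in notmuch are left alone.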
* tag dir proposal [was: Re: Git as notmuch object store]
  2010-01-25 16:22 ` Mike Kelly
@ 2010-01-25 21:46   ` Jameson Rollins
  2010-01-26 16:32     ` Scott Robinson
  2010-01-28  5:10     ` martin f krafft
  0 siblings, 2 replies; 42+ messages in thread
From: Jameson Rollins @ 2010-01-25 21:46 UTC (permalink / raw)
  To: Mike Kelly, notmuch

[-- Attachment #1: Type: text/plain, Size: 2722 bytes --]

On Mon, 25 Jan 2010 11:22:47 -0500 (EST), Mike Kelly <pioto@pioto.org> wrote:
> Similarly, provide a mechanism for correlating the folder name with
> some set of tags, and change those tags as messages are moved around.
>
> For example, I might have:
>
> ~/.notmuch-config:
>
> [database]
> path=/home/pioto/mail
> ...
> [tags]
> pioto@pioto.org/INBOX.ListMail.notmuch = notmuch
>
> So, a 'tags' section, where each key is the folder name, relative to the
> db path, and the value is one or more tag names

I think this idea is a really good one and I would like to pursue it as
a tangent thread here. I was going to propose something very similar to
this. I think it's a very flexible idea that would help in a lot of
ways.

For instance, notmuch emacs (and emacs message-mode) is currently not
good at handling sent mail. At the moment mail is just Bcc'd to
yourself. However, this means that these sent messages end up back in
your inbox with 'inbox' and 'unread' tags which then need to be removed
so that the sent message is archived. If one could configure notmuch
such that only new mail in an inbox directory would be tagged with
'inbox' and 'unread', and manage to coax emacs to fcc directly into an
archive, then these sent messages would not have the problematic 'inbox'
and 'unread' tags. Even better, then sent mail could be fcc'd to a sent
mail directory which could then be configured to automatically get a
'sent' tag.

Notmuch emacs also currently does not handle message drafts, which makes
it very difficult to resume messages that were postponed from a previous
session.
If notmuch could be configured to tag messages in the message-mode
"message-auto-save-directory" with a 'draft' tag, then it would greatly
facilitate finding draft messages.

It would also be sweet if this could remove tags as well (maybe by
prepending '-' or '+' to the tag specification). For example, I can
imagine implementing the above examples like this:

[database]
path=/home/jrollins/.mail
[tags]
inbox = +inbox,+unread
sent = +sent
drafts = +draft
archive = -inbox

I think we should definitely implement something like this. It would
make things a lot more flexible. Notmuch could be configured to not tag
any messages by default (which would make a lot of people using notmuch
for other backends happier) and then notmuch setup could provide an
example tags stanza that would tag new messages with 'inbox' and
'unread' (maybe with a wildcard that would replicate the current
behavior):

[tags]
* = +inbox,+unread

I would love to see this. Hopefully we can rally some more support for
this idea.

jamie.

[-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
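To make the proposed [tags] stanza concrete, here is a minimal sketch of how such rules might be parsed and applied. It is not part of notmuch: the function names are hypothetical, treating each key as a glob pattern (so that "*" acts as the wildcard) is an assumption, and so is reading a bare tag name as an addition.

```python
import configparser
import fnmatch

# The stanza from the proposal, as it might appear in ~/.notmuch-config.
EXAMPLE_CONFIG = """
[tags]
inbox = +inbox,+unread
sent = +sent
drafts = +draft
archive = -inbox
"""


def load_rules(text):
    """Parse the [tags] stanza into {folder_pattern: [(op, tag), ...]}."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep folder names case-sensitive
    parser.read_string(text)
    rules = {}
    for pattern, spec in parser["tags"].items():
        ops = []
        for item in (part.strip() for part in spec.split(",")):
            if not item:
                continue
            # Assumption: a bare tag name means "add this tag".
            op, tag = (item[0], item[1:]) if item[0] in "+-" else ("+", item)
            ops.append((op, tag))
        rules[pattern] = ops
    return rules


def apply_rules(rules, folder, tags):
    """Return a new tag set after applying every rule matching `folder`."""
    result = set(tags)
    for pattern, ops in rules.items():
        if fnmatch.fnmatchcase(folder, pattern):
            for op, tag in ops:
                if op == "+":
                    result.add(tag)
                else:
                    result.discard(tag)
    return result


rules = load_rules(EXAMPLE_CONFIG)
print(apply_rules(rules, "archive", {"inbox", "notmuch"}))
```

Because keys are matched as patterns, the wildcard stanza "* = +inbox,+unread" would reproduce the tag-everything default without any special casing.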
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-25 21:46 ` tag dir proposal [was: Re: Git as notmuch object store] Jameson Rollins @ 2010-01-26 16:32 ` Scott Robinson 2010-01-26 17:03 ` Jameson Rollins 2010-01-28 5:10 ` martin f krafft 1 sibling, 1 reply; 42+ messages in thread From: Scott Robinson @ 2010-01-26 16:32 UTC (permalink / raw) To: notmuch Excerpts from Jameson Rollins's message of Mon Jan 25 15:46:55 -0600 2010: > I think this idea is a really good one and I would like to pursue it as > a tangent thread here. I was going to propose something very similar to > this. I think it's a very flexible idea that would help in a lot of > ways. > > [...] This is getting involved. Maybe I'm missing something in this thread; but, why couldn't these complex and context-sensitive decisions be delegated to sub-processes? ala git hooks? ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-26 16:32 ` Scott Robinson @ 2010-01-26 17:03 ` Jameson Rollins 2010-01-28 5:12 ` martin f krafft 0 siblings, 1 reply; 42+ messages in thread From: Jameson Rollins @ 2010-01-26 17:03 UTC (permalink / raw) To: Scott Robinson, notmuch [-- Attachment #1: Type: text/plain, Size: 793 bytes --] On Tue, 26 Jan 2010 10:32:02 -0600, Scott Robinson <scott@quadhome.com> wrote: > Excerpts from Jameson Rollins's message of Mon Jan 25 15:46:55 -0600 2010: > > I think this idea is a really good one and I would like to pursue it as > > a tangent thread here. I was going to propose something very similar to > > this. I think it's a very flexible idea that would help in a lot of > > ways. > > > > [...] > > This is getting involved. > > Maybe I'm missing something in this thread; but, why couldn't these complex and > context-sensitive decisions be delegated to sub-processes? ala git hooks? I think this idea is completely independent of anything having to do with using git as a mail store. That's why I was trying to separate it into a separate thread. jamie. [-- Attachment #2: Type: application/pgp-signature, Size: 835 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-26 17:03 ` Jameson Rollins @ 2010-01-28 5:12 ` martin f krafft 2010-01-28 5:28 ` James Westby 0 siblings, 1 reply; 42+ messages in thread From: martin f krafft @ 2010-01-28 5:12 UTC (permalink / raw) To: Jameson Rollins; +Cc: notmuch [-- Attachment #1: Type: text/plain, Size: 862 bytes --] also sprach Jameson Rollins <jrollins@finestructure.net> [2010.01.27.0603 +1300]: > > This is getting involved. > > > > Maybe I'm missing something in this thread; but, why couldn't these complex and > > context-sensitive decisions be delegated to sub-processes? ala git hooks? > > I think this idea is completely independent of anything having to do > with using git as a mail store. That's why I was trying to separate it > into a separate thread. I think he meant "notmuch hooks like you have hooks for Git too", e.g. thread:755741d13573c7642761d2a175cb146d -- martin | http://madduck.net/ | http://two.sentenc.es/ "if i am occasionally a little overdressed, i make up for it by being always immensely over-educated." -- oscar wilde spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-28 5:12 ` martin f krafft @ 2010-01-28 5:28 ` James Westby 2010-01-28 5:34 ` martin f krafft 0 siblings, 1 reply; 42+ messages in thread From: James Westby @ 2010-01-28 5:28 UTC (permalink / raw) To: martin f krafft, Jameson Rollins; +Cc: notmuch On Thu, 28 Jan 2010 18:12:52 +1300, martin f krafft <madduck@madduck.net> wrote: > also sprach Jameson Rollins <jrollins@finestructure.net> [2010.01.27.0603 +1300]: > > > This is getting involved. > > > > > > Maybe I'm missing something in this thread; but, why couldn't these complex and > > > context-sensitive decisions be delegated to sub-processes? ala git hooks? > > > > I think this idea is completely independent of anything having to do > > with using git as a mail store. That's why I was trying to separate it > > into a separate thread. > > I think he meant "notmuch hooks like you have hooks for Git too", > e.g. thread:755741d13573c7642761d2a175cb146d Are you trying to use thread: such that it could be passed to notmuch show to see the conversation? That's not going to work so well if so. Thanks, James ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-28 5:28 ` James Westby @ 2010-01-28 5:34 ` martin f krafft 2010-01-28 6:22 ` James Westby 2010-01-28 9:55 ` martin f krafft 0 siblings, 2 replies; 42+ messages in thread From: martin f krafft @ 2010-01-28 5:34 UTC (permalink / raw) To: James Westby; +Cc: notmuch [-- Attachment #1: Type: text/plain, Size: 574 bytes --] also sprach James Westby <jw+debian@jameswestby.net> [2010.01.28.1828 +1300]: > Are you trying to use thread: such that it could be passed to > notmuch show to see the conversation? > > That's not going to work so well if so. Why not? Works fine for me with the vim plugin... -- martin | http://madduck.net/ | http://two.sentenc.es/ "perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." -- antoine de saint-exupéry spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store]
  2010-01-28  5:34 ` martin f krafft
@ 2010-01-28  6:22   ` James Westby
  2010-01-28  9:55   ` martin f krafft
  1 sibling, 0 replies; 42+ messages in thread
From: James Westby @ 2010-01-28 6:22 UTC (permalink / raw)
  To: martin f krafft; +Cc: notmuch

On Thu, 28 Jan 2010 18:34:21 +1300, martin f krafft <madduck@madduck.net> wrote:
> also sprach James Westby <jw+debian@jameswestby.net> [2010.01.28.1828 +1300]:
> > Are you trying to use thread: such that it could be passed to
> > notmuch show to see the conversation?
> >
> > That's not going to work so well if so.
>
> Why not? Works fine for me with the vim plugin...

lib/message.cc:560

    static void
    thread_id_generate (thread_id_t *thread_id)
    {
        static int seeded = 0;
        FILE *dev_random;
        uint32_t value;
        char *s;
        int i;

        if (! seeded) {
            dev_random = fopen ("/dev/random", "r");
            if (dev_random == NULL) {
                srand (time (NULL));
            } else {
                fread ((void *) &value, sizeof (value), 1, dev_random);
                srand (value);
                fclose (dev_random);
            }
            seeded = 1;
        }

        s = thread_id->str;
        for (i = 0; i < NOTMUCH_THREAD_ID_DIGITS; i += 8) {
            value = rand ();
            sprintf (s, "%08x", value);
            s += 8;
        }
    }

so it works fine for you, however I have no idea which thread you are
talking about.

Thanks,

James

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-28 5:34 ` martin f krafft 2010-01-28 6:22 ` James Westby @ 2010-01-28 9:55 ` martin f krafft 1 sibling, 0 replies; 42+ messages in thread From: martin f krafft @ 2010-01-28 9:55 UTC (permalink / raw) To: James Westby, Jameson Rollins, notmuch [-- Attachment #1: Type: text/plain, Size: 627 bytes --] also sprach martin f krafft <madduck@madduck.net> [2010.01.28.1834 +1300]: > > That's not going to work so well if so. > > Why not? Works fine for me with the vim plugin... Now I get it. I was talking about id:20100114084713.GA22273@harikalardiyari Sorry, I *am* new to notmuch ;) -- martin | http://madduck.net/ | http://two.sentenc.es/ "when zarathustra was alone... he said to his heart: 'could it be possible! this old saint in the forest hath not yet heard of it, that god is dead!'" - friedrich nietzsche spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
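The point of the exchange above is that thread IDs are random values local to each database, while a Message-ID travels with the mail. The sketch below imitates the quoted C routine in Python to show this. It is an illustration only: THREAD_ID_DIGITS = 32 is an assumption inferred from the 32-hex-digit thread ID quoted earlier, and the two seeded generators merely stand in for two users' separately seeded rand() state.

```python
import random

# Assumption: 32 hex digits, matching the thread: ID quoted in the thread.
THREAD_ID_DIGITS = 32


def thread_id_generate(rng):
    """Concatenate 8-hex-digit chunks of random values, like the C loop."""
    chunks = THREAD_ID_DIGITS // 8
    return "".join(f"{rng.getrandbits(32):08x}" for _ in range(chunks))


# Two people index the same conversation in their own databases.
alice_db = random.Random(1)
bob_db = random.Random(2)
alice_thread = thread_id_generate(alice_db)
bob_thread = thread_id_generate(bob_db)

# The Message-ID is part of the mail itself and is identical for everyone.
message_id = "20100114084713.GA22273@harikalardiyari"

print("alice: thread:" + alice_thread)
print("bob:   thread:" + bob_thread)
print("both:  id:" + message_id)
```

A thread: reference is therefore only meaningful inside the database that generated it, which is why id: is the form to use when pointing other people at a message.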
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-25 21:46 ` tag dir proposal [was: Re: Git as notmuch object store] Jameson Rollins 2010-01-26 16:32 ` Scott Robinson @ 2010-01-28 5:10 ` martin f krafft 2010-01-28 12:32 ` Servilio Afre Puentes 1 sibling, 1 reply; 42+ messages in thread From: martin f krafft @ 2010-01-28 5:10 UTC (permalink / raw) To: Jameson Rollins; +Cc: notmuch [-- Attachment #1: Type: text/plain, Size: 2529 bytes --] also sprach Jameson Rollins <jrollins@finestructure.net> [2010.01.26.1046 +1300]: > > For example, I might have: > > > > ~/.notmuch-config: > > > > [database] > > path=/home/pioto/mail > > ... > > [tags] > > pioto@pioto.org/INBOX.ListMail.notmuch = notmuch > > > > So, a 'tags' section, where each key is the folder name, relative to the > > db path, and the value is one or more tag names > > I think this idea is a really good one and I would like to pursue it as > a tangent thread here. I was going to propose something very similar to > this. I think it's a very flexible idea that would help in a lot of > ways. I think we need to carefully distinguish here. The above seems to suggest a mapping from folder to tag, but we don't actually need tags for folder locations, because those can (and should) be implicitly determined from the database and storing the tag in addition would just run the risk of getting out of sync: if I moved a message, I would also have to remember to delete old and add new tags, which is just asking for trouble. > [tags] > inbox = +inbox,+unread > sent = +sent > drafts = +draft > archive = -inbox This proposal, on the other hand, is an interesting one, but when is it supposed to happen? It just feels wrong to make this happen as part of 'notmuch new'. What I would like to see is a notmuch-aware MDA, e.g. a programme which reads an incoming mail on stdin and can do all this kind of stuff, e.g. 
assign tags based on such rules (or take tags as arguments, so that I could trivially set tags from procmail too), write the message to the message store, and update the database. This would allow us to get rid of 'notmuch new' altogether, at least conceptually. We'd still need it if mail is being delivered independently, e.g. with offlineimap. On the performance side, it might make sense to write to a journal instead of updating the database every time. SpamAssassin does this with its Bayesian database, and it only merges the journal every X updates (or when the user manually requests it). I am not sure whether this is possible with Xapian. On the other hand, I think notmuch needs to learn to journal anyway so that we can keep different instances in sync. -- martin | http://madduck.net/ | http://two.sentenc.es/ "the only way to get rid of a temptation is to yield to it." -- oscar wilde spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store]
  2010-01-28  5:10 ` martin f krafft
@ 2010-01-28 12:32   ` Servilio Afre Puentes
  2010-01-28 20:39     ` martin f krafft
  0 siblings, 1 reply; 42+ messages in thread
From: Servilio Afre Puentes @ 2010-01-28 12:32 UTC (permalink / raw)
  To: Jameson Rollins, Mike Kelly, notmuch

2010/1/28 martin f krafft <madduck@madduck.net>:
> also sprach Jameson Rollins <jrollins@finestructure.net> [2010.01.26.1046 +1300]:
>> > For example, I might have:
>> >
>> > ~/.notmuch-config:
>> >
>> > [database]
>> > path=/home/pioto/mail
>> > ...
>> > [tags]
>> > pioto@pioto.org/INBOX.ListMail.notmuch = notmuch
>> >
>> > So, a 'tags' section, where each key is the folder name, relative to the
>> > db path, and the value is one or more tag names
>>
>> I think this idea is a really good one and I would like to pursue it as
>> a tangent thread here. I was going to propose something very similar to
>> this. I think it's a very flexible idea that would help in a lot of
>> ways.
>
> I think we need to carefully distinguish here. The above seems to
> suggest a mapping from folder to tag, but we don't actually need
> tags for folder locations because those can (and should) be implicitly
> determined from the database

I think that the usefulness of this functionality is that we can have
a mapping from physical organization of the mail to a tagging scheme
of our choosing, and we can be relieved from having to remember the
location of the mail (which can differ between mail clients).

But even right now I can't find a documented way of searching by
location, so AFAIK the implementation of this proposal would allow
something that is not possible at the moment.

>> [tags]
>> inbox = +inbox,+unread
>> sent = +sent
>> drafts = +draft
>> archive = -inbox
>
> This proposal, on the other hand, is an interesting one, but when is
> it supposed to happen? It just feels wrong to make this happen as
> part of 'notmuch new'.

Why so?
> What I would like to see is a notmuch-aware MDA, e.g. a programme > which reads an incoming mail on stdin and can do all this kind of > stuff, e.g. assign tags based on such rules (or take tags as > arguments, so that I could trivially set tags from procmail too), > write the message to the message store, and update the database. Such an MDA wouldn't need to use "notmuch new", and thus won't be affected by this > This would allow us to get rid of 'notmuch new' altogether, at least > conceptually. We'd still need it if mail is being delivered > independently, e.g. with offlineimap. Then we'd still need it, why not make it better? Regards, Servilio ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-28 12:32 ` Servilio Afre Puentes @ 2010-01-28 20:39 ` martin f krafft 2010-01-28 20:49 ` Ben Gamari 0 siblings, 1 reply; 42+ messages in thread From: martin f krafft @ 2010-01-28 20:39 UTC (permalink / raw) To: notmuch [-- Attachment #1: Type: text/plain, Size: 807 bytes --] also sprach Servilio Afre Puentes <servilio@gmail.com> [2010.01.29.0132 +1300]: > >> [tags] > >> inbox = +inbox,+unread > >> sent = +sent > >> drafts = +draft > >> archive = -inbox > > > > This proposal, on the other hand, is an interesting one, but when is > > it supposed to happen? It just feels wrong to make this happen as > > part of 'notmuch new'. > > Why so? I guess I just dislike having to run notmuch new regularly, rather than integrating the database more closely with the mail flow. -- martin | http://madduck.net/ | http://two.sentenc.es/ "to get back my youth i would do anything in the world, except take exercise, get up early, or be respectable." -- oscar wilde spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-28 20:39 ` martin f krafft @ 2010-01-28 20:49 ` Ben Gamari 2010-01-28 21:11 ` martin f krafft 2010-01-28 21:16 ` Jed Brown 0 siblings, 2 replies; 42+ messages in thread From: Ben Gamari @ 2010-01-28 20:49 UTC (permalink / raw) To: notmuch Excerpts from martin f krafft's message of Thu Jan 28 15:39:10 -0500 2010: > also sprach Servilio Afre Puentes <servilio@gmail.com> [2010.01.29.0132 +1300]: > > >> [tags] > > >> inbox = +inbox,+unread > > >> sent = +sent > > >> drafts = +draft > > >> archive = -inbox > > > > > > This proposal, on the other hand, is an interesting one, but when is > > > it supposed to happen? It just feels wrong to make this happen as > > > part of 'notmuch new'. > > > > Why so? > > I guess I just dislike having to run notmuch new regularly, rather > than integrating the database more closely with the mail flow. > Sounds like you need to add a line to crontab. - Ben ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store]
  2010-01-28 20:49 ` Ben Gamari
@ 2010-01-28 21:11   ` martin f krafft
       [not found]     ` <1264713802-sup-620@ben-laptop>
  2010-01-28 21:16   ` Jed Brown
  1 sibling, 1 reply; 42+ messages in thread
From: martin f krafft @ 2010-01-28 21:11 UTC (permalink / raw)
  To: Ben Gamari; +Cc: notmuch

[-- Attachment #1: Type: text/plain, Size: 788 bytes --]

also sprach Ben Gamari <bgamari@gmail.com> [2010.01.29.0949 +1300]:
> > I guess I just dislike having to run notmuch new regularly,
> > rather than integrating the database more closely with the mail
> > flow.
> >
> Sounds like you need to add a line to crontab.

It still feels like a hack. It's a bit like making many changes to
a source code repository (new mails get delivered) and committing
only once every hour (notmuch new), rather than making and
committing transactional changes (delivering and cataloguing mails
individually).

--
martin | http://madduck.net/ | http://two.sentenc.es/

a Hooloovoo is a superintelligent shade of the color blue.
                -- douglas adams, "the hitchhiker's guide to the galaxy"

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
[parent not found: <1264713802-sup-620@ben-laptop>]
[parent not found: <20100128221735.GE8942@lapse.rw.madduck.net>]
* Re: tag dir proposal [was: Re: Git as notmuch object store]
       [not found] ` <20100128221735.GE8942@lapse.rw.madduck.net>
@ 2010-01-28 23:30   ` Ben Gamari
  0 siblings, 0 replies; 42+ messages in thread
From: Ben Gamari @ 2010-01-28 23:30 UTC (permalink / raw)
  To: martin f krafft, notmuch

Excerpts from martin f krafft's message of Thu Jan 28 17:17:35 -0500 2010:
> Cron-scheduling is a regular activity. I am talking about
> event-based scheduling. incron could do that and fire up a process
> every time a message is dropped into a directory, but notmuch
> doesn't provide me with an interface to say "you don't have to
> iterate the Maildir yourself since I know exactly what changed: just
> update your catalog with the new message in file foo/bar.msg".

Fair enough. After reading your arguments I think I might have initially
misunderstood you. I would actually tend to agree. Passing paths to notmuch
does seem to be a reasonable approach.

>
> To me, notmuch-new is not Unix-y. To me,
>
>   find $MAILDIR -type f -print0 | xargs -0 notmuch-update
>
> is Unix-y. ;)
>
I think it really depends upon what you are doing. I can certainly see when
you might want to simply have notmuch synchronize the index against the
mail store. However, it seems the majority of the time one simply desires
to add a message to the index (i.e. after delivery). Therefore, it seems
like there is a place for both commands.

> > In my configuration, I simply have a bash script in ~/.bin that simply
> > runs offlineimap followed by notmuch new. This works quite nicely.
>
> This is essentially the same situation as with slocate, which has to
> be run from cron currently, and hence gets outdated regularly.
> Compare this to a hypothetical filesystem that exposed an index of
> filenames (or content even!) to user-space, which could be used to
> quickly search for files in real-time without the need to run
> regular updates. I know other operating systems that have this
> functionality already.
> > Anyway, this is going off on a tangent, I feel. > That might be true but that certainly won't stop. ;) One would think that it wouldn't be difficult to teach slocate about inotify. I briefly looked into this and found rlocate but quickly realized that it requires its own kernel module. Apparently this has been investigated[1] and the inotify watch count limit becomes an issue very quickly. I seem to recall, however, that there were some whispers on the LKML about adding an interface that would be more capable of supporting such a system. I can't seem to recall the details, however, and homework beckons. Cheers, - Ben ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: tag dir proposal [was: Re: Git as notmuch object store] 2010-01-28 20:49 ` Ben Gamari 2010-01-28 21:11 ` martin f krafft @ 2010-01-28 21:16 ` Jed Brown 1 sibling, 0 replies; 42+ messages in thread From: Jed Brown @ 2010-01-28 21:16 UTC (permalink / raw) To: Ben Gamari, notmuch On Thu, 28 Jan 2010 15:49:34 -0500, Ben Gamari <bgamari@gmail.com> wrote: > Sounds like you need to add a line to crontab. I haven't been following this thread closely so I hope this isn't too out of context. I agree that certain things like notmuch-new should go in the crontab, but I think that notmuch-new should need to be run exactly once to process a new batch of messages into the desired state. Having notmuch-new apply one set of tags and then relying on another process run afterwards to change the tags according to a filter is undesirable in my opinion, both for the mild performance reason of making two passes, but more importantly because of lock contention between the two processes and the ease of viewing the database in the inconsistent state. As far as I understand the situation, my favorite solution is to have notmuch-new run a hook on each message as it is indexed. Jed ^ permalink raw reply [flat|nested] 42+ messages in thread
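The single-pass design favoured in the message above can be sketched as an indexing loop that calls a hook once per message before anything is written. This is a toy model rather than notmuch's implementation: the dictionary standing in for the database, the default tags, and both function names are assumptions made for the illustration.

```python
def index_new_messages(paths, hook, database, default_tags=("inbox", "unread")):
    """Index each new message once, letting a hook settle its final tags."""
    for path in paths:
        # The hook sees the default tags and returns the final set, so the
        # message is written once and never visible in an in-between state.
        tags = hook(path, set(default_tags))
        database[path] = tags


def example_hook(path, tags):
    """Hypothetical filter: mail found in a sent folder is archived as 'sent'."""
    if "/sent/" in path:
        tags -= {"inbox", "unread"}
        tags.add("sent")
    return tags


database = {}
index_new_messages(
    ["mail/inbox/new/1", "mail/sent/cur/2"], example_hook, database
)
print(database)
```

Compared with applying default tags and then running a second filtering process, each message is written exactly once here, which sidesteps both the lock contention and the window in which the database can be observed with the wrong tags.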
* Re: Git as notmuch object store (was: Potential problem using Git for mail)
  2010-01-25 13:49 ` Sebastian Spaeth
  2010-01-25 16:22   ` Mike Kelly
@ 2010-01-25 19:49   ` martin f krafft
  2010-01-27  9:00   ` Sebastian Spaeth
  2 siblings, 0 replies; 42+ messages in thread
From: martin f krafft @ 2010-01-25 19:49 UTC (permalink / raw)
  To: Sebastian Spaeth; +Cc: notmuch, Asheesh Laroia

[-- Attachment #1: Type: text/plain, Size: 827 bytes --]

also sprach Sebastian Spaeth <Sebastian@SSpaeth.de> [2010.01.26.0249 +1300]:
> While notmuchsync fulfils my needs, it is a kludge. It needs to
> call "notmuch" for each mail where a MailDir flag has changed
> (which can be quite often on an initial run, where most mails are
> likely to be read), this can take a long, long time. It would
> make sense IMHO to at least pick pioto's "don't set unread if 'S'
> flag is set" on notmuch new[1].

I am sure this could be implemented with libnotmuch if it proves to
be useful.

--
martin | http://madduck.net/ | http://two.sentenc.es/

"it isn't pollution that's harming the environment.
 it's the impurities in our air and water that are doing it."
                                                      - dan quayle

spamtraps: madduck.bogus@madduck.net

[-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Git as notmuch object store (was: Potential problem using Git for mail)
  2010-01-25 13:49 ` Sebastian Spaeth
  2010-01-25 16:22   ` Mike Kelly
  2010-01-25 19:49   ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft
@ 2010-01-27  9:00   ` Sebastian Spaeth
  2 siblings, 0 replies; 42+ messages in thread
From: Sebastian Spaeth @ 2010-01-27 9:00 UTC (permalink / raw)
  To: martin f krafft, Asheesh Laroia; +Cc: notmuch

On Mon, 25 Jan 2010 14:49:00 +0100, "Sebastian Spaeth" <Sebastian@SSpaeth.de> wrote:
> While notmuchsync fulfils my needs, it is a kludge. It needs to call
> "notmuch" for each mail where a MailDir flag has changed (which can be
> quite often on an initial run, where most mails are likely to be read),
> this can take a long, long time. It would make sense IMHO to at least
> pick pioto's "don't set unread if 'S' flag is set" on notmuch new[1].

Once python bindings exist for the notmuch shared library, I am sure we
can speed notmuchsync up a lot by keeping the connection open and
tagging all mails in one go rather than executing a separate binary for
each mail. So, this approach might still be feasible.

Sebastian

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Git as notmuch object store (was: Potential problem using Git for mail) 2010-01-25 0:46 ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft 2010-01-25 5:19 ` Asheesh Laroia 2010-01-25 13:49 ` Sebastian Spaeth @ 2010-02-15 0:51 ` Stewart Smith 2 siblings, 0 replies; 42+ messages in thread From: Stewart Smith @ 2010-02-15 0:51 UTC (permalink / raw) To: Asheesh Laroia, notmuch On Mon, Jan 25, 2010 at 01:46:59PM +1300, martin f krafft wrote: > Stewart, you've worked most on this so far. Would you like to share > your thoughts? Just posted a new thread with my latest experiments. Things look rather good from a storage size point of view. Still a few things to work out though. -- Stewart Smith ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags
  2010-01-11 22:19 Idea for storing tags martin f krafft
  2010-01-12  3:44 ` Scott Robinson
@ 2010-01-12  4:11 ` Scott Morrison
  2010-01-13  1:24   ` martin f krafft
  2010-01-12 21:39 ` David A. Harding
  2010-01-14  1:32 ` Carl Worth
  3 siblings, 1 reply; 42+ messages in thread
From: Scott Morrison @ 2010-01-12 4:11 UTC (permalink / raw)
  To: mailtags discussion list; +Cc: notmuch discussion list

Thought you would be interested in my experiences and thoughts from
actually doing this kind of stuff.

With my software MailTags (www.indev.ca/MailTags.html), I have looked
at all these options and decided to go with storing tags in headers
(as JSON-formatted data in the X-MailTags header).

I have thought seriously about using pseudo emails stored in a
specially named directory but feel there are a couple of issues with
this.

1. synchronization of tag data with emails -- if they are in a
   subfolder then it presents the issue of maintaining this subfolder
   when managing emails (moving, deleting, duplicating etc) and any
   .tag folder unaware clients are likely to cause a breakage in the
   tagdata/message association. One way around this is to have a
   global .tag folder.

2. what happens if that message is archived or moved to an exclusively
   local cache -- e.g. Mail.app on OS X can easily move IMAP messages
   to a folder resident on the computer itself?

3. what happens with duplicates of emails -- I would assume that the
   message id would be the key to match the tag data to the message.
   In this system a duplicate of a message could not have a different
   set of tags from the original (not that this would necessarily be
   desirable.)

As I mentioned, I went with tags in headers -- though this has its own
drawbacks. Your mention of potential leakage (aka inadvertent
disclosure of tag data) is real -- but only if the client used to
bounce/forward is not the one to tag the message (one would assume
that if a client can tag, it can know to exclude the tags in a
bounce.)
Mail.app -- which I am pluging into does not forward headers -- though it will include all headers in a bounce -- but chance are you aren't tagging messages you are bouncing.:) The performance issue is very real -- because it means that somehow messages have to rewritten to the IMAP server -- IMAP doesn't have a mechanism AFAIK for updates. Additionally, IMAP doesn't have a mechanism for simply replacing one message data with another -- a new message must be written and the old message must be deleted and the message IMAP UID will change, and the client will have to deal with this especially if it is cache the messages. Also GMAIL IMAP is an issue- gmail IMAP is not IMAP -- it simply doesn't work like a true imap server -- writes to folders in gmail IMAP are translated to database updates where it is attributing a single record of the message with the folder it was "written" to. Changing headers on a gmail IMAP message simply will not work because it will will reject the message as update of the single record (and not actually write the new data). Still tags in headers meant that I didn't have to worry about making sure that the .tags folder is maintained appropriate (throughout moves and deletions) and that the data is stored much closer to the message for data recovery if it is ever needed and for archiving tags. -- in anycase -- this is what I have working -- though I am open to considering new approaches. Scott ps. also see my post to the mailtags-list from a few years back http://lists.madduck.net/pipermail/mailtags/2007-August/msg00017.html On 2010-01-11, at 5:19 PM, martin f krafft wrote: > Folks, over in #notmuch, we just floated an idea that I'd like to > get out to you. We've been debating storing tags for messages. > Therefore I am cross-posting. Please forgive me. > > So far, there are two approaches: > > 1. External database, which has the downside of not being > synchronisable with standard IMAP, like the rest of your mail > (assuming you use IMAP). 
Also, it's possible for mailstore and > database to get out of sync. > > 2. In-headers, which has the downside of leaking (e.g. when > bouncing), and incurs the risks associated with message rewrites > (which I think is pretty much ignorable, but it's still there). > Also, there's a performance issue, but in the context of an > indexer like notmuch, this is negligible. > > The leakage is real, though and I think it makes in-headers > unusable. After all, I don't ever want anyone else to know that > I tag e-mails from my boss as "from-idiots", and I forward and > bounce mail on a regular basis. I could tell my MTA to remove > those headers, but I might forget to do that on a new system. > > We also previously determined that IMAP keywords are pretty much > useless as they are stored per mailbox, not per message, not > standardised, and limited in their length anyway [0]. This also > means that we don't really need to investigate sensibly storing tags > in Maildir (e.g. with xattrs), because IMAP cannot transport them. > > 0. http://lists.madduck.net/pipermail/mailtags/2007-August/msg00016.html > > Seriously, who implemented IMAPv4rev1 and what sort of crack were > they smoking?? > > I remember there was some KDE groupware contacts manager that used > IMAP to synchronise contacts. At first, this sounds horrible, but > when you detach IMAP from RFC822, it becomes a generic synchronising > protocol. The next step is then straight forward, and I want to > share this idea with you: > > How about using pseudo-mails stored in Maildir and synchronised by > IMAP? E.g. every folder could have a subfolder .TAGS and if we find > a way to smartly pair messages between parent and subfolder, we'd > have a tag store alongside the mailstore it refers to, but without > the danger of leakage, and without having to rewrite messages. 
> > The major problem with this is when clients don't understand this > "protocol", for then they will display all .TAGS folders as regular > IMAP folders, and try to treat the messages therein as regular > mails. Somewhere sometime this is bound to blow up and I don't > really know how to prevent that. > > Anyway, the idea is out now. Thoughts? > > -- > martin | http://madduck.net/ | http://two.sentenc.es/ > > echo Prpv a\'rfg cnf har cvcr | tr Pacfghnrvp Cnpstuaeic > > spamtraps: madduck.bogus@madduck.net > _______________________________________________ > mailtags mailing list > mailtags@lists.madduck.net > http://lists.madduck.net/listinfo/mailtags ^ permalink raw reply [flat|nested] 42+ messages in thread
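The in-header approach described in the message above (tags kept as JSON in an X-MailTags header, stripped by the tagging client before a bounce) might look roughly like the following minimal sketch. Only the header name comes from the message; the JSON layout and the helper names set_tags, get_tags and strip_tags are assumptions made for illustration and are not MailTags' actual format or API.

```python
import json
from email import message_from_string

TAG_HEADER = "X-MailTags"  # header name taken from the message above


def set_tags(msg, tags):
    """Store tags as JSON in the tag header (hypothetical layout)."""
    del msg[TAG_HEADER]  # removes any existing copy; no error if absent
    msg[TAG_HEADER] = json.dumps({"tags": sorted(tags)})


def get_tags(msg):
    """Read the tags back, or an empty set if the header is missing."""
    raw = msg.get(TAG_HEADER)
    return set(json.loads(raw)["tags"]) if raw else set()


def strip_tags(msg):
    """Drop the tag header, e.g. before bouncing or forwarding."""
    del msg[TAG_HEADER]
    return msg


msg = message_from_string("From: boss@example.org\nSubject: status\n\nbody\n")
set_tags(msg, {"todo", "from-boss"})
tags_before = get_tags(msg)
outgoing = strip_tags(msg).as_string()
```

The leakage concern raised in the thread applies whenever a message leaves through a path that never calls something like strip_tags, for example a second client or a direct sendmail invocation.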
* Re: Idea for storing tags 2010-01-12 4:11 ` Idea for storing tags Scott Morrison @ 2010-01-13 1:24 ` martin f krafft 2010-01-13 5:39 ` Scott Morrison 0 siblings, 1 reply; 42+ messages in thread From: martin f krafft @ 2010-01-13 1:24 UTC (permalink / raw) To: mailtags discussion list, notmuch discussion list [-- Attachment #1: Type: text/plain, Size: 4873 bytes --] also sprach Scott Morrison <smorr@indev.ca> [2010.01.12.1711 +1300]: > 1. synchronization of tag data with emails -- if they are in > a subfolder then it presents the issue of maintaining this > subfolder when managing emails (moving, deleting, duplicating etc) > and any .tag folder unaware clients are likely cause an breakage > in tagdata/message association. One way of doing this is to have > a global .tag folder. A global .tag folder indexed by e.g. message ID, as you state later, would probably allow for this. Or a file-per-tag design. We'd have to think carefully about pros and cons for each. When thinking about this, I always have to remind myself that we are targetting this at a design that has indexed search. If that weren't the case, searches would be incredibly expensive. Maybe a better approach would be content addressing (see below). > 2. what happens if that message is archived or moved to an > exclusively local cache -- eg. Mail.app on OS X can easily move > IMAP messages to a folder resident on the computers computers? Well, if the target can store tags, then ideally the MUA should know how to transfer them along. Maybe the right thing to do would be to use extended attributes (which are stored in the inode!), even if they may not be universally supported yet. If our solution scales, then this might lead to a significant increase in xattr adoption. > 3. what happens with duplicates of emails -- I would assume that > the message id would be the key to match the tag data to the > message. 
In this system a duplicate of a message could not have > a different set of tags from the original (not that this would > necessarily be desirable.) Duplicates need folders, and tags and folders are somewhat at odds with each other. I mean, you can represent a folder hierarchy with tags (and more), and if you have tags and folders, you are potentially introducing a level of confusion/ambiguity that we don't want in the first place. Maybe the ideal solution doesn't need folders anymore (and IMAP-compatible (Maildir) subfolders have always been a hack anyway). There are also two types of duplicates: copies and links. The former can diverge, the latter can't. I don't really see a reason for either. It's not like you need to copy a mail before you edit it, and I don't see a real reason for linking, assuming that the primary means of browsing will be tag-searches anyway. Duplicates always make me think of content addressing, like Git's object cache. We could store the content hash of a message in its filename, and also use the hash to index into the tag database. I think that would be much cleaner than message IDs, and would make handling true duplicates (links) much easier, while copies (diverged ex-duplicates) would also be taken care of automatically. > Your mention of potential leakage (aka inadvertent disclosure of > tag data) is real -- but only if the client used to bounce/forward > is not the one to tag the message (one would assume that if > a client can tag, it can know to exclude the tags in a bounce.) True, and it's probably the minority of people using multiple clients. But those who do might also manipulate mail with sed and use sendmail directly. I don't think we can successfully enhance RFC 5351 to make MTAs always ditch the Tags:-header. > Mail.app -- which I am pluging into does not forward headers -- ew! 
;) (I think one should be able to forward pristine mails) > though it will include all headers in a bounce -- but chance are > you aren't tagging messages you are bouncing.:) That chance might well be very low. I bounce/forward-as-attachment a lot of mail from the past to make it easier for others to establish context. > The performance issue is very real -- because it means that > somehow messages have to rewritten to the IMAP server -- IMAP > doesn't have a mechanism AFAIK for updates. Not even UIDPLUS? http://wiki.dovecot.org/FeatUIDPLUS > Additionally, IMAP doesn't have a mechanism for simply replacing > one message data with another -- a new message must be written and > the old message must be deleted and the message IMAP UID will > change, and the client will have to deal with this especially if > it is cache the messages. Yes, I am experiencing this pain regularly, since I currently use a lot of message rewriting as part of my workflow — one of the reasons why I'd like to find an alternative. > Also GMAIL IMAP is an issue- Yeah, I bet. Is there anyone who doesn't think that that's Google's problem, not ours, though? -- martin | http://madduck.net/ | http://two.sentenc.es/ "there's someone in my head but it's not me." -- pink floyd, the dark side of the moon, 1972 spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags 2010-01-13 1:24 ` martin f krafft @ 2010-01-13 5:39 ` Scott Morrison 2010-01-13 5:52 ` martin f krafft 2010-01-14 1:37 ` Carl Worth 0 siblings, 2 replies; 42+ messages in thread From: Scott Morrison @ 2010-01-13 5:39 UTC (permalink / raw) To: mailtags discussion list; +Cc: notmuch discussion list On 2010-01-12, at 8:24 PM, martin f krafft wrote: > also sprach Scott Morrison <smorr@indev.ca> [2010.01.12.1711 +1300]: >> 1. synchronization of tag data with emails -- if they are in >> a subfolder then it presents the issue of maintaining this >> subfolder when managing emails (moving, deleting, duplicating etc) >> and any .tag folder unaware clients are likely cause an breakage >> in tagdata/message association. One way of doing this is to have >> a global .tag folder. > > A global .tag folder indexed by e.g. message ID, as you state later, > would probably allow for this. Or a file-per-tag design. We'd have > to think carefully about pros and cons for each. > > When thinking about this, I always have to remind myself that we are > targetting this at a design that has indexed search. If that weren't > the case, searches would be incredibly expensive. > > Maybe a better approach would be content addressing (see below). Content hashing -- good idea (and not something that had occurred to me before) -- better than Message-ID, as I believe there are still some MUAs/MTAs that allow messages without message IDs. The only potential issue is that it then becomes critical to preserve the message source against encoding changes, though that shouldn't be too hard to ensure. > >> 2. what happens if that message is archived or moved to an >> exclusively local cache -- eg. Mail.app on OS X can easily move >> IMAP messages to a folder resident on the computers computers? > > Well, if the target can store tags, then ideally the MUA should know > how to transfer them along. 
> > Maybe the right thing to do would be to use extended attributes > (which are stored in the inode!), even if they may not be > universally supported yet. If our solution scales, then this might > lead to a significant increase in xattr adoption. The problem with anything that is not universally supported is that, for a package meant to appeal to a wide user base, most users don't know and don't care about the particulars of this IMAP server vs. that IMAP server. All they know is that for some reason it doesn't work with account X -- which leads to support headaches. > >> 3. what happens with duplicates of emails -- I would assume that >> the message id would be the key to match the tag data to the >> message. In this system a duplicate of a message could not have >> a different set of tags from the original (not that this would >> necessarily be desirable.) > > Duplicates need folders, and tags and folders are somewhat at odds > with each other. I mean, you can represent a folder hierarchy with > tags (and more), and if you have tags and folders, you are > potentially introducing a level of confusion/ambiguity that we don't > want in the first place. Maybe the ideal solution doesn't need > folders anymore (and IMAP-compatible (Maildir) subfolders have > always been a hack anyway). > > There are also two types of duplicates: copies and links. The former > can diverge, the latter can't. I don't really see a reason for > either. It's not like you need to copy a mail before you edit it, > and I don't see a real reason for linking, assuming that the primary > means of browsing will be tag-searches anyway. > > Duplicates always make me think of content addressing, like Git's > object cache. We could store the content hash of a message in its > filename, and also use the hash to index into the tag database. 
> I think that would be much cleaner than message IDs, and would make > handling true duplicates (links) much easier, while copies (diverged > ex-duplicates) would also be taken care of automatically. I agree that conceptually duplicates should be buried, but end users do have "peculiar" organization systems. > > -snip- >> The performance issue is very real -- because it means that >> somehow messages have to rewritten to the IMAP server -- IMAP >> doesn't have a mechanism AFAIK for updates. > > Not even UIDPLUS? > http://wiki.dovecot.org/FeatUIDPLUS From my reading, UIDPLUS doesn't allow a delta modification of a message on a server (that is, writing back just a portion of a message) -- you still have to write the whole thing back, and that can mean real bandwidth issues for some messages. > >> Additionally, IMAP doesn't have a mechanism for simply replacing >> one message data with another -- a new message must be written and >> the old message must be deleted and the message IMAP UID will >> change, and the client will have to deal with this especially if >> it is cache the messages. > > Yes, I am experiencing this pain regularly, since I currently use > a lot of message rewriting as part of my workflow -- one of the > reasons why I'd like to find an alternative. > >> Also GMAIL IMAP is an issue- > > Yeah, I bet. Is there anyone who doesn't think that that's Google's > problem, not ours, though? > Call it Google's problem if you like -- but when I have a product that doesn't work with Gmail IMAP, there are a lot of potential users who don't care about server peculiarities and would rather just have it work. ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags 2010-01-13 5:39 ` Scott Morrison @ 2010-01-13 5:52 ` martin f krafft 2010-01-14 1:37 ` Carl Worth 1 sibling, 0 replies; 42+ messages in thread From: martin f krafft @ 2010-01-13 5:52 UTC (permalink / raw) To: mailtags discussion list, notmuch discussion list [-- Attachment #1: Type: text/plain, Size: 2605 bytes --] also sprach Scott Morrison <smorr@indev.ca> [2010.01.13.1752 +1300]: > The problem with anything that is not universally supported is > that for a package that is to appeal to a wide userbase, most > don't know and don't care about the particulars of this IMAP > server vs that IMAP server. all they know it that for some reason > it doesn't work with account X -- which leads to support head > aches. [...] > Call it Googles problem as you like -- but when I have a product > that doesn't work with GMAIL IMAP there are a lot of potential > users that don't care about server peculiarities and rather just > have it work. Well, the way I see it: you cannot change all IMAP servers at once, and you certainly cannot change Google. If it's possible to implement tagging for email (dare say semantic e-mail) with standard means (where standard means sub-standard, as exemplified by your previous GMail IMAP example), then that's the best way, but if that can't happen then we ought to try a better way. Should we find a solution then, by the rate of standardisation on the 'Net, maybe my grandchildren will finally be able to do proper e-mail. ;) > I agree that conceptually duplicates should be buried but end > users do have "peculiar" organization systems. I think tags should help abstract e-mail away from underlying storage and I'd love that to be a goal. > From my reading, uidplus doesn't allow a delta modification of > a message on a server -- just to write a portion of a message back > -- you still have to write the whole thing back and that can mean > real bandwidth issues for some messages. Absolutely. 
It would indeed be better if you could just send changes. I just sent a blank mail to imap-protocol-subscribe@mailman.u.washington.edu and have started browsing the archives. So far, there's not really anything relevant. Anyway, looking back at the RFC on keywords, it's not exactly encouraging: A keyword is defined by the server implementation. Keywords do not begin with "\". Servers MAY permit the client to define new keywords in the mailbox (see the description of the PERMANENTFLAGS response code for more information). Anyway, I'll try to untangle the various issues re:IMAP we've been seeing, write mails for each, and hopefully get to the point where I can enquire about IMAPv5. ;) -- martin | http://madduck.net/ | http://two.sentenc.es/ the unix philosophy basically involves giving you enough rope to hang yourself. and then some more, just to be sure. spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags 2010-01-13 5:39 ` Scott Morrison 2010-01-13 5:52 ` martin f krafft @ 2010-01-14 1:37 ` Carl Worth 1 sibling, 0 replies; 42+ messages in thread From: Carl Worth @ 2010-01-14 1:37 UTC (permalink / raw) To: Scott Morrison, mailtags discussion list; +Cc: notmuch discussion list [-- Attachment #1: Type: text/plain, Size: 1004 bytes --] On Wed, 13 Jan 2010 00:39:14 -0500, Scott Morrison <smorr@indev.ca> wrote: > > Maybe a better approach would be content addressing (see below). > > Content hashing -- good Idea (& not something that has hit me before) > -- better than Message-Id as I believe there are still some MUA /MTAs > that allow messages without message ids. The only potential issue > with this is that it is critical then to preserve the message source > against encoding changes though that shouldn't be too hard to avoid. Another problem with content-based naming for messages is that most of the messages in my mail store that I consider duplicates don't actually have identical content. (One is sent directly to me via CC and the other is sent by the mailing-list software *after* appending a footer to the message.) That said, notmuch already does use a sha-1 sum as the message identifier for any message that does not have a valid Message-ID header. So there's definitely a place for this. -Carl [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
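The fallback described in the message above (use the Message-ID when there is one, otherwise a SHA-1 of the content) can be sketched as follows. This is an illustration only, not notmuch's actual code; the function name message_key and the "id:" / "sha1:" prefixes are assumptions.

```python
import hashlib
from email import message_from_bytes


def message_key(raw: bytes) -> str:
    """Return a lookup key for a raw message (hypothetical helper)."""
    msg = message_from_bytes(raw)
    # Prefer the Message-ID header, without its angle brackets.
    msg_id = str(msg.get("Message-ID") or "").strip().strip("<>")
    if msg_id:
        return "id:" + msg_id
    # No usable Message-ID: fall back to a hash of the raw bytes.
    return "sha1:" + hashlib.sha1(raw).hexdigest()


direct = b"Message-ID: <abc@example.org>\r\nSubject: hi\r\n\r\nbody\r\n"
via_list = direct + b"-- \r\nlist footer\r\n"
same_key = message_key(direct) == message_key(via_list)
```

With this scheme the CC copy and the list copy of a mail share a key despite the appended footer, which a pure content hash would not give; two copies of a mail without a Message-ID only match if they are byte-identical.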
* Re: Idea for storing tags 2010-01-11 22:19 Idea for storing tags martin f krafft 2010-01-12 3:44 ` Scott Robinson 2010-01-12 4:11 ` Idea for storing tags Scott Morrison @ 2010-01-12 21:39 ` David A. Harding 2010-01-14 1:32 ` Carl Worth 3 siblings, 0 replies; 42+ messages in thread From: David A. Harding @ 2010-01-12 21:39 UTC (permalink / raw) To: martin f krafft On Tue, Jan 12, 2010 at 11:19:09AM +1300, martin f krafft wrote: > I think [tag leakage] it makes in-headers unusable. After all, I don't > ever want anyone else to know that I tag e-mails from my boss as > "from-idiots", You can cryptographically hash tags so that third-parties can't read the contents of the in-headers. For security, a salt should be appended to the tag name to make dictionary attacks on the tags more difficult. For their owners' convenience, mail clients will want a mapping of hash to tag name. > [...] pseudo-mails stored in Maildir and synchronised by IMAP A single RFC2822 message can store the salt and hash-to-tag database. It could contain a clear subject and directions to the end user not to move or delete it. This would not, I think, terribly confuse existing mail clients or their users. -Dave -- David A. Harding Website: http://dtrt.org/ 1 (609) 997-0765 Email: dave@dtrt.org Jabber/XMPP: dharding@jabber.org ^ permalink raw reply [flat|nested] 42+ messages in thread
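The salted-hash idea in the message above could be sketched like this, assuming SHA-256 and a random 16-byte salt (neither is specified in the message). The names hash_tag and build_tag_map are hypothetical; the mapping they produce is the hash-to-tag database that the message suggests keeping in a single stored message.

```python
import hashlib
import os


def hash_tag(tag: str, salt: bytes) -> str:
    """Hash a tag name with the salt appended, as the message suggests."""
    return hashlib.sha256(tag.encode("utf-8") + salt).hexdigest()


def build_tag_map(tags, salt):
    """Map each opaque hash back to its readable tag name."""
    return {hash_tag(t, salt): t for t in tags}


salt = os.urandom(16)  # kept private by the owner, alongside the map
tag_map = build_tag_map(["from-idiots", "todo"], salt)

# What would go into the message header: opaque hashes only.
header_value = " ".join(sorted(tag_map))

# What the owner's client shows: readable names via the map.
readable = sorted(tag_map[h] for h in header_value.split())
```

Python's standard hmac module would be the more conventional keyed construction; plain concatenation is used here only to stay close to the description in the message.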
* Re: Idea for storing tags 2010-01-11 22:19 Idea for storing tags martin f krafft ` (2 preceding siblings ...) 2010-01-12 21:39 ` David A. Harding @ 2010-01-14 1:32 ` Carl Worth 2010-01-14 8:04 ` martin f krafft 3 siblings, 1 reply; 42+ messages in thread From: Carl Worth @ 2010-01-14 1:32 UTC (permalink / raw) To: martin f krafft, mailtags discussion list; +Cc: notmuch discussion list [-- Attachment #1: Type: text/plain, Size: 2334 bytes --] On Tue, 12 Jan 2010 11:19:09 +1300, martin f krafft <madduck@madduck.net> wrote: > 1. External database, which has the downside of not being > synchronisable with standard IMAP, like the rest of your mail > (assuming you use IMAP). Also, it's possible for mailstore and > database to get out of sync. Yes. This approach requires some external means of synchronizing the tags from one system to another. I don't understand what it would mean to have the mailstore and the database out of synch here. This approach doesn't have the tags in the mailstore by definition, right? > How about using pseudo-mails stored in Maildir and synchronised by > IMAP? E.g. every folder could have a subfolder .TAGS and if we find > a way to smartly pair messages between parent and subfolder, we'd > have a tag store alongside the mailstore it refers to, but without > the danger of leakage, and without having to rewrite messages. ... > Anyway, the idea is out now. Thoughts? There are a couple of problems that I don't see addressed at all with this approach. The first is that there's not a one-to-one mapping between messages and files in the mail store. (I'm CCed on a lot of list mail meaning that I have multiple files in my mail store for a single message.) Second, the only reason I would be interested in synchronizing mail between two systems is so that I could manipulate the tag data in multiple places, (that is, remove the "unread" tag whether on my network-disconnected laptop or via web-mail when away from my laptop). 
Using imap for synchronizing a file of tags within the mail store gives you no mechanism for doing any sort of conflict resolution, right? (Which I think in almost all cases is going to be quite trivial if there's a chance for a program to resolve it.) So it sounds to me like we're going to need *something* custom for doing the synchronization, (to handle modifications on both ends). At which point there's only disadvantages to keeping the data inside the mailstore, and there's also no disadvantage left to keeping the data inside a database. [*] [*] Though, I think a plain-text file with tags managed with something like git (and perhaps a custom merger) could save a lot of work. Or perhaps a plain-text journal of tag manipulations on either end that could be replayed on the other. -Carl [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
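The journal idea from the footnote above (record tag manipulations on each end and replay them on the other) might be sketched as below. The journal entry format (operation, message key, tag) and the function name replay are assumptions for illustration; nothing like this is defined in the thread.

```python
from collections import defaultdict


def replay(tags, journal):
    """Apply a list of (op, message_key, tag) entries to a tag store."""
    for op, key, tag in journal:
        if op == "+":
            tags[key].add(tag)
        elif op == "-":
            tags[key].discard(tag)  # no error if the tag is already gone
        else:
            raise ValueError("unknown journal operation: %r" % (op,))
    return tags


def fresh_store():
    return defaultdict(set, {"id:1": {"unread", "inbox"}})


# Changes made independently on two machines.
laptop_journal = [("-", "id:1", "unread")]
webmail_journal = [("+", "id:1", "done")]

# Each side replays its own journal and then the other's.
on_laptop = replay(replay(fresh_store(), laptop_journal), webmail_journal)
on_server = replay(replay(fresh_store(), webmail_journal), laptop_journal)
converged = on_laptop == on_server
```

Changes to different tags commute, so both ends converge without any merge logic. The remaining case, one end adding a tag that the other end removes, depends on replay order and would need an explicit rule such as timestamps or a fixed precedence.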
* Re: Idea for storing tags 2010-01-14 1:32 ` Carl Worth @ 2010-01-14 8:04 ` martin f krafft 2010-01-14 22:24 ` Carl Worth 0 siblings, 1 reply; 42+ messages in thread From: martin f krafft @ 2010-01-14 8:04 UTC (permalink / raw) To: Carl Worth; +Cc: notmuch discussion list, mailtags discussion list [-- Attachment #1: Type: text/plain, Size: 2974 bytes --] also sprach Carl Worth <cworth@cworth.org> [2010.01.14.1432 +1300]: > Yes. This approach requires some external means of synchronizing the > tags from one system to another. > > I don't understand what it would mean to have the mailstore and the > database out of synch here. This approach doesn't have the tags in the > mailstore by definition, right? You might have marked a message 'read' on one machine and if the two get out of sync on another machine, you might have the same message unread there. > > How about using pseudo-mails stored in Maildir and synchronised by > > IMAP? E.g. every folder could have a subfolder .TAGS and if we find > > a way to smartly pair messages between parent and subfolder, we'd > > have a tag store alongside the mailstore it refers to, but without > > the danger of leakage, and without having to rewrite messages. > ... > > Anyway, the idea is out now. Thoughts? > > There are a couple of problems that I don't see addressed at all with > this approach. The first is that there's not a one-to-one mapping > between messages and files in the mail store. (I'm CCed on a lot of list > mail meaning that I have multiple files in my mail store for a single > message.) Shouldn't this just be solved? I've had formail+procmail delete my duplicates for 10+ years, and while I don't like the fact that I usually get the CC before the list mail, and thus cannot filter on Delivered-To, I have never looked back. 
> Second, the only reason I would be interested in synchronizing mail > between two systems is so that I could manipulate the tag data in > multiple places, (that is, remove the "unread" tag whether on my > network-disconnected laptop or via web-mail when away from my > laptop). Using imap for synchronizing a file of tags within the mail > store gives you no mechanism for doing any sort of conflict resolution, > right? (Which I think in almost all cases is going to be quite trivial > if there's a chance for a program to resolve it.) I have not thought about this, but you are right. IMAP does not really allow for conflict resolution, which may well be *the* reason why you cannot update existing messages. > [*] Though, I think a plain-text file with tags managed with > something like git (and perhaps a custom merger) could save a lot > of work. Or perhaps a plain-text journal of tag manipulations on > either end that could be replayed on the other. Git is good at conflict resolution if run interactively, but [0] still makes me question whether it can ever take the place of IMAP. However, Asheesh Laroia, who has floated the idea of Git-for-mail at DebConf8 already, has some ideas and hopefully will soon reply to my mail [0], which I just bounced. 0. http://notmuchmail.org/pipermail/notmuch/2010/001114.html -- martin | http://madduck.net/ | http://two.sentenc.es/ apt-get source --compile gentoo spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags 2010-01-14 8:04 ` martin f krafft @ 2010-01-14 22:24 ` Carl Worth 2010-01-14 22:32 ` martin f krafft 0 siblings, 1 reply; 42+ messages in thread From: Carl Worth @ 2010-01-14 22:24 UTC (permalink / raw) To: martin f krafft; +Cc: notmuch discussion list, mailtags discussion list [-- Attachment #1: Type: text/plain, Size: 2250 bytes --] On Thu, 14 Jan 2010 21:04:21 +1300, martin f krafft <madduck@madduck.net> wrote: > You might have marked a message 'read' on one machine and if the two > get out of sync on another machine, you might have the same message > unread there. That's a different issue though. With two databases there's clearly the opportunity for the two databases to be out of synch. But you talked about the database being out of synch with respect to the mailstore. And that's something I just don't understand, (given the assumption that all tags are stored in the database---which was the explicit description of the case of interest). > Shouldn't this just be solved? I've had formail+procmail delete my > duplicates for 10+ years, and while I don't like the fact that > I usually get the CC before the list mail, and thus cannot filter on > Delivered-To, I have never looked back. Notmuch has access to all the information it needs to allow you to delete the CC version once the list mail arrives. So you could do notmuch-based deletion now and avoid losing the Delivered-To header if you want. > > [*] Though, I think a plain-text file with tags managed with > > something like git (and perhaps a custom merger) could save a lot > > of work. Or perhaps a plain-text journal of tag manipulations on > > either end that could be replayed on the other. > > Git is good at conflict resolution if run interactively, but [0] > still makes me question whether it can ever take the place of IMAP. 
> However, Asheesh Laroia, who has floated the idea of Git-for-mail at > DebConf8 already, has some ideas and hopefully will soon reply to my > mail [0], which I just bounced. > > 0. http://notmuchmail.org/pipermail/notmuch/2010/001114.html Using git for mail is an interesting idea, but not what I was actually proposing here. I think that synchronizing the mail store and synchronizing the tags information are tasks that have different requirements, and for which we may well want different tools. So I was talking about using imap (or rsync, or what have you) for copying the mailstore, and then having something with a bit more domain-specific awareness for doing the synchronization of the tags data. -Carl [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Idea for storing tags 2010-01-14 22:24 ` Carl Worth @ 2010-01-14 22:32 ` martin f krafft 0 siblings, 0 replies; 42+ messages in thread From: martin f krafft @ 2010-01-14 22:32 UTC (permalink / raw) To: mailtags discussion list, notmuch discussion list [-- Attachment #1: Type: text/plain, Size: 2593 bytes --] also sprach Carl Worth <cworth@cworth.org> [2010.01.15.1124 +1300]: > > You might have marked a message 'read' on one machine and if the two > > get out of sync on another machine, you might have the same message > > unread there. > > That's a different issue though. With two databases there's clearly the > opportunity for the two databases to be out of synch. > > But you talked about the database being out of synch with respect to the > mailstore. And that's something I just don't understand, (given the > assumption that all tags are stored in the database---which was the > explicit description of the case of interest). Yes, we are talking about the situation where the tagstore is separate from the mailstore, and that they are both synchronised with a server, or between machines, separately. If for some reason you only synchronise the mailstore -- say because the connection drops before the sync of the tagstore completes -- then you end up with an out-of-sync situation, because the mailstore-sync will have pulled in a new message, but not the associated tags. So if you had already read this message on another machine and tagged it 'done', then it would show up on this machine as 'new' without the 'done' tag, because the tags were not synchronised. The only way to really solve this is by transferring a message and its tags in a transactional way. > > Shouldn't this just be solved? I've had formail+procmail delete my > > duplicates for 10+ years, and while I don't like the fact that > > I usually get the CC before the list mail, and thus cannot filter on > > Delivered-To, I have never looked back. 
> > Notmuch has access to all the information it needs to allow you to > delete the CC version once the list mail arrives. So you could do > notmuch-based deletion now and avoid losing the Delivered-To header if > you want. Of course. I hadn't thought that far. However, there are still benefits to formail, namely avoiding having to run duplicates through potentially expensive spamfilters. > I think that synchronizing the mail store and synchronizing the > tags information are tasks that have different requirements, and > for which we may well want different tools. Fair enough. Maybe I am just paranoid about the stores getting out of sync (see above). -- martin | http://madduck.net/ | http://two.sentenc.es/ "we all know linux is great... it does infinite loops in 5 seconds." -- linus torvalds spamtraps: madduck.bogus@madduck.net [-- Attachment #2: Digital signature (see http://martin-krafft.net/gpg/) --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
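To make the "transactional" point in the message above concrete, here is a minimal sketch of what receiving a message together with its tags as one unit could look like. It is not how notmuch, IMAP, or any existing sync tool works: the SQLite store, the table layout, and the names `open_store`, `receive`, `flaky_tags`, and `visible_ids` are all assumptions made up for illustration. The idea is only that a dropped connection during the tag transfer should leave no half-imported message behind.

```python
import sqlite3


def open_store():
    """Create a hypothetical local store holding messages and their tags."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (msg_id TEXT PRIMARY KEY, raw TEXT)")
    conn.execute("CREATE TABLE tags (msg_id TEXT, tag TEXT)")
    conn.commit()
    return conn


def receive(conn, msg_id, raw, tags):
    """Store a message and its tags as a single unit.

    `tags` may be any iterable. If it raises part-way through (for
    example because the link dropped), the message insert is rolled
    back too, so the message never shows up without its tags.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO messages VALUES (?, ?)", (msg_id, raw))
        for tag in tags:
            conn.execute("INSERT INTO tags VALUES (?, ?)", (msg_id, tag))


def flaky_tags():
    """Simulate a tag transfer that is interrupted after the first tag."""
    yield "done"
    raise ConnectionError("link dropped before the tag sync completed")


def visible_ids(conn):
    """Return the ids of all messages currently visible in the store."""
    rows = conn.execute("SELECT msg_id FROM messages ORDER BY msg_id")
    return [row[0] for row in rows]
```

With this sketch, calling `receive(conn, "b@example", raw, flaky_tags())` raises `ConnectionError` and leaves the store unchanged, so a later retry can import the message and its tags together. The trade-off is that message and tags must travel over the same channel, which is exactly what separate tools for the mailstore and the tagstore would give up.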
end of thread (newest message: 2010-02-15 0:51 UTC)

Thread overview: 42+ messages
2010-01-11 22:19 Idea for storing tags martin f krafft
2010-01-12 3:44 ` Scott Robinson
2010-01-12 4:06 ` martin f krafft
2010-01-12 4:51 ` Potential problem using Git for mail (was: Idea for storing tags) martin f krafft
2010-01-12 19:38 ` Jameson Rollins
2010-01-12 19:55 ` martin f krafft
2010-01-14 8:12 ` Asheesh Laroia
2010-01-14 20:37 ` martin f krafft
2010-01-21 6:28 ` Asheesh Laroia
2010-01-25 0:46 ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft
2010-01-25 5:19 ` Asheesh Laroia
2010-01-25 7:43 ` martin f krafft
2010-01-25 13:49 ` Sebastian Spaeth
2010-01-25 16:22 ` Mike Kelly
2010-01-25 21:46 ` tag dir proposal [was: Re: Git as notmuch object store] Jameson Rollins
2010-01-26 16:32 ` Scott Robinson
2010-01-26 17:03 ` Jameson Rollins
2010-01-28 5:12 ` martin f krafft
2010-01-28 5:28 ` James Westby
2010-01-28 5:34 ` martin f krafft
2010-01-28 6:22 ` James Westby
2010-01-28 9:55 ` martin f krafft
2010-01-28 5:10 ` martin f krafft
2010-01-28 12:32 ` Servilio Afre Puentes
2010-01-28 20:39 ` martin f krafft
2010-01-28 20:49 ` Ben Gamari
2010-01-28 21:11 ` martin f krafft
[not found] ` <1264713802-sup-620@ben-laptop>
[not found] ` <20100128221735.GE8942@lapse.rw.madduck.net>
2010-01-28 23:30 ` Ben Gamari
2010-01-28 21:16 ` Jed Brown
2010-01-25 19:49 ` Git as notmuch object store (was: Potential problem using Git for mail) martin f krafft
2010-01-27 9:00 ` Sebastian Spaeth
2010-02-15 0:51 ` Stewart Smith
2010-01-12 4:11 ` Idea for storing tags Scott Morrison
2010-01-13 1:24 ` martin f krafft
2010-01-13 5:39 ` Scott Morrison
2010-01-13 5:52 ` martin f krafft
2010-01-14 1:37 ` Carl Worth
2010-01-12 21:39 ` David A. Harding
2010-01-14 1:32 ` Carl Worth
2010-01-14 8:04 ` martin f krafft
2010-01-14 22:24 ` Carl Worth
2010-01-14 22:32 ` martin f krafft