Date: Fri, 10 Jun 2011 17:11:03 -0400
From: Austin Clements
To: Carl Worth
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH 00/10] Fix 'notmuch new' atomicity issues
Message-ID: <20110610211103.GC16025@mit.edu>
References: <1298015940-31986-1-git-send-email-amdragon@mit.edu> <87ei34rnc5.fsf@yoom.home.cworth.org>
In-Reply-To: <87ei34rnc5.fsf@yoom.home.cworth.org>

Quoth Carl Worth on Jun 08 at 3:05 pm:
> On Sat, 28 May 2011 22:51:10 -0400, Austin Clements wrote:
> > Rebased to current master (cb8418) as atomic-new-v4 (aka
> > for-review/atomic-new-v4).
>
> Hi Austin,
>
> Thanks so much for sending this series (and 4 times, even!).
>
> I *really* like the new robustness provided by this series, and I
> especially like the exhaustive testing here. Thanks so much!
>
> Having just gone through the for-review/atomic-new-v4 series, I have a
> few comments. Some are very minor and I'll be glad to implement them
> myself:
>
>   1. Two commits have "lose" misspelled as "loose".
>      These are "new: don't loose messages on SIGINT" and "new: Wrap
>      adding a message in an atomic section".

Ooops.

>   2. The commit with summary of "lib: Make _notmuch_message_sync
>      capable of deleting a message." is missing the rest of its
>      commit message with a complete explanation. For example, this
>      commit message should describe that a message document is
>      deleted from the database (if the deleted field is set when
>      _sync is called). And the commit message should also mention
>      that this functionality is not currently used, but prepares
>      for a subsequent use.

Fixed.

>   3. While reviewing the commit "lib: Indicate if there are more
>      filenames after removal" the "if (status ==
>      NOTMUCH_STATUS_SUCCESS)" looked out of place to me. Indeed,
>      if status is any other value at this point in the code, then
>      the function should have returned earlier. I intend to follow
>      up with a commit that adds the missing early return and
>      removes this condition.

Okay. I suspect I was just retaining the error semantics in this
function (which were probably a holdover from before the folder search
patch). I've slipped a patch in at the beginning that adds the
missing check and removes the condition.

>   4. I really don't like that the final state of the code has two
>      different functions named notmuch_message_remove_filename and
>      _notmuch_message_remove_filename. If the semantics of these
>      functions are identical, then there should be only one
>      function. If the semantics are different, then they need to
>      have noticeably distinct names, (and a single underscore
>      doesn't count).

Too much Linux kernel hacking, I suppose.

I've left this alone for the moment because it's likely to change with
the below discussion.
(Two solutions are either to rename _notmuch_message_remove_filename
to something even more ridiculously long, like
_notmuch_message_remove_filename_no_delete, or to make
notmuch_message_tags_to_maildir_flags first add the new file name and
then remove the old one, so that a message can never transiently have
no file names, and then merge the two filename removal functions into
one.)

>   5. The final code has a function inside of notmuch-new.c named
>      "remove_file", but this function isn't removing a
>      file---instead it's removing a message document from the
>      database. So it needs a more accurate name.

Mm. It's now remove_filename (could be remove_message_filename?). It
*might* remove a message document from the database, but its primary
purpose is to remove a filename from a message.

I've pushed the easy changes as atomic-new-v5, mostly to get them on
the record.

> Like I said, those are all pretty minor and I would just implement all
> of those and push the series myself, but for one remaining issue that is
> a bit more significant.
>
> The last issue has to do with the addition of the
> notmuch_database_find_message_by_filename and
> notmuch_message_remove_filename functions. In the series as it stands,
> notmuch-new.c is updated to call these two functions instead of calling
> the existing notmuch_database_remove_message function (which itself also
> calls the same functions).
>
> That sets off a red flag in my mind. If our program is avoiding a
> library function and substituting its own implementation, how are other
> users of the library going to get things right? Should we deprecate
> notmuch_database_remove_message? Should we add more documentation to it
> describing the situation in which a user might prefer not to call it? It
> seems the library is harder to use than it should be in this area.

The intent was to deprecate notmuch_database_remove_message, yes.

> Meanwhile, I'm not very satisfied by the existence of
> notmuch_message_remove_filename in the public API.
> It would have a natural pairing with notmuch_message_add_filename,
> but the series isn't exporting that functionality. So things feel
> more asymmetric than they should be as well.

Part of why atomicity was a mess is that the API blurs the
distinction between a message as a concrete, single file and a message
as a message-id that may have many file names.
find_message_by_filename and remove_filename were attempting to
sharpen this distinction. But maybe they sharpen it in the wrong
direction.

An alternate way to look at this is that a message is a single file
that can also tell you the file names that contain equivalent messages.
This might be more of a mindset (or documentation) change than an
actual API change; I'm not sure. It certainly fits better with the
existing {add,remove}_message, but it's not clear if that's
intentional or historical.

Thoughts?

> Now, why is notmuch-new going through all this effort to reimplement an
> existing library function (and requiring two new library functions in
> the process)? What it wants to do is to wrap the functionality of
> database_remove_message in freeze/thaw and while the message is frozen
> call notmuch_message_maildir_flags_to_tags.
>
> So, how to fix my complaints above?
>
>   * Do we want to allow database_remove_message to optionally call
>     maildir_flags_to_tags?
>
>     This seems a little messy in requiring some additional information
>     to the library so it can know whether to do the maildir
>     synchronization here. And it's also asymmetric unless we would also
>     support similar synchronization support in the library for similar
>     operations.
>
>   * Do we want to expose notmuch_message_add_filename as well as
>     remove_filename for better symmetry?
>
>     I'm not sure I like that. It still feels like we're exposing too
>     many internals and not making it obvious to the user how to do
>     things. Having just the existing add_message/remove_message
>     functions definitely makes the interface easier.
>
>   * Can we fix the remove case without this new library API by simply
>     adding calls to begin_atomic and end_atomic?
>
>     I think this is probably the solution I would prefer to see.
>
> What do you think, Austin?

Of these three, I would definitely go with the last. In fact, I tried
the first two when I was originally designing this patch and can
assure you they'll get us into trouble. ]:--8)

I recall a few reasons why I designed this the way I did. One was
what I mentioned above about sharpening the distinction between
messages and filenames. Another was that I wanted to reuse the
freeze/thaw mechanism; in fact, I introduced atomic sections only when
I realized that using freeze/thaw was going to be impossible for add.
Perhaps it makes more sense to lean the other direction. Finally, I
felt that it was important that the API be easy to use correctly from
an atomicity standpoint; hence the introduction of more operations
that were meaningful on frozen messages (I also tried to make it so
you couldn't overlook atomicity issues when using the library, but I
don't think I succeeded).

That last reason is also compatible with your last suggestion. If we
move to atomic sections, I think we have to make sure the library
never internally violates atomicity and that the library user only
needs to use atomic sections directly if they need atomicity across
multiple library calls. This shouldn't be hard, especially with
nested atomics.

I'll give this a try and see where it leads.

> -Carl
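
To make the nested-atomics idea concrete, here is a minimal toy sketch
in C. It is not the notmuch API: toy_db_t, toy_begin_atomic,
toy_end_atomic, toy_remove_filename and toy_demo are all hypothetical
names invented for illustration, and the "database" is just a set of
counters. The assumption it models is that atomic sections nest by
depth counting, so a library call can always protect itself with its
own section, while a caller who wants atomicity across several calls
wraps them in an outer section and gets a single commit.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a database handle. Hypothetical; NOT the notmuch API. */
typedef struct {
    int atomic_depth; /* number of atomic sections currently open */
    int pending;      /* changes made but not yet committed */
    int committed;    /* changes made durable by a commit */
    int commits;      /* number of commits performed so far */
} toy_db_t;

/* Open an atomic section; sections nest by incrementing the depth. */
static void toy_begin_atomic(toy_db_t *db)
{
    db->atomic_depth++;
}

/* Close an atomic section. Pending changes are committed only when the
 * outermost section closes. Returns -1 if no section is open. */
static int toy_end_atomic(toy_db_t *db)
{
    if (db->atomic_depth == 0)
        return -1;
    db->atomic_depth--;
    if (db->atomic_depth == 0 && db->pending > 0) {
        db->committed += db->pending;
        db->pending = 0;
        db->commits++;
    }
    return 0;
}

/* Hypothetical library call that makes two related changes. It wraps
 * them in its own atomic section, so it never exposes a half-done
 * state, whether or not the caller opened a section as well. */
static void toy_remove_filename(toy_db_t *db)
{
    int ret;

    toy_begin_atomic(db);
    db->pending++; /* change 1: drop the filename from the message */
    db->pending++; /* change 2: update (or delete) the message document */
    ret = toy_end_atomic(db);
    assert(ret == 0);
    (void) ret; /* silence unused-variable warnings under NDEBUG */
}

/* One standalone call, then two calls grouped by the caller. */
static int toy_demo(void)
{
    toy_db_t db = {0, 0, 0, 0};

    toy_remove_filename(&db); /* commits on its own */

    toy_begin_atomic(&db);    /* caller wants the next two together */
    toy_remove_filename(&db);
    toy_remove_filename(&db);
    toy_end_atomic(&db);      /* one commit covers both calls */

    printf("commits=%d committed=%d\n", db.commits, db.committed);
    return db.commits;
}
```

In this model the standalone call produces one commit and the grouped
pair produces one more, which is the property argued for above: the
library is atomic internally, and the user only reaches for atomic
sections when spanning multiple calls.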