From: Austin Clements
To: David Edmondson, Pieter Praet
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH] Output unmodified Content-Type header value for JSON format.
Date: Sun, 15 Jan 2012 12:58:40 -0500
Message-ID: <87boq4q23z.fsf@awakening.csail.mit.edu>
References: <1321659905-24367-1-git-send-email-dmitry.kurochkin@gmail.com>
 <87fwhkyisj.fsf@servo.finestructure.net> <87wrawq1dz.fsf@gmail.com>
 <87d3coxu7s.fsf@servo.finestructure.net> <87r512pru2.fsf@gmail.com>
 <87ipmewo4z.fsf@servo.finestructure.net> <20111123034021.GL9351@mit.edu>
 <87ipkglui4.fsf@praet.org> <20120112172840.GC18625@mit.edu>
 <87ehv2proa.fsf@praet.org>
User-Agent: Notmuch/0.10.2+133~g9e35ff5 (http://notmuchmail.org) Emacs/23.3.1 (i486-pc-linux-gnu)
List-Id: "Use and development of the notmuch mail system."

On Sun, 15 Jan 2012 11:52:40 +0000, David Edmondson wrote:
> > Technically the IRC discussion was about not including *any* part
> > content in the JSON output, and always using show --format=raw or
> > similar to retrieve desired parts. Currently, notmuch includes part
> > content in the JSON only for text/*, *except* when it's text/html. I
> > assume non-text parts are omitted because binary data is hard to
> > represent in JSON and text/html is omitted because some people don't
> > need it. However, this leads to some peculiar asymmetry in the Emacs
> > code where sometimes it pulls part content out of the JSON and
> > sometimes it retrieves it using show --format=raw. This in turn leads
> > to asymmetry in content encoding handling, since notmuch handles
> > content encoding for parts included in the JSON (and there's no good
> > way around that since JSON is Unicode), but not for parts retrieved as
> > raw.
>
> Including the text output in the JSON results in significantly fewer
> calls to 'notmuch' during the building of a typical `notmuch-show-mode'
> buffer. Someone with one of those older, crankier computers could easily
> test how much effect this has by changing
> `notmuch-show-get-bodypart-content' slightly.

Yes.  I was mostly reiterating the IRC discussion for Pieter.  Since
this discussion, I've stabilized on the pre-fetching notion I described
in id:"20120115003617.GH1801@mit.edu", though I do think we should make
one thing clear in the code: the rule for whether the JSON includes a
"content" key for a leaf part is internal to the CLI, and consumers
should be prepared to use the content if it's there and to retrieve it
separately if it's not.  This is exactly how the Emacs code happens to
work; it just hasn't been codified anywhere.
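
A minimal sketch of that consumer-side rule, written in Python rather
than the Emacs Lisp the real client uses.  Only the optional "content"
key comes from the discussion above; the "id" key, the fetch_raw
callback (standing in for something like show --format=raw), and the
charset parameter are assumptions made for illustration.

```python
from typing import Callable


def part_content(part: dict,
                 fetch_raw: Callable[[dict], bytes],
                 charset: str = "utf-8") -> str:
    """Return the text of a leaf part.

    Uses the inline "content" value when the producer chose to include
    it, and otherwise falls back to fetching the raw bytes separately.
    """
    if "content" in part:
        # Inline content arrives as JSON text, so it is already Unicode.
        return part["content"]
    # Hypothetical fallback: the caller supplies the retrieval function,
    # and decoding the raw bytes becomes the consumer's responsibility.
    raw = fetch_raw(part)
    return raw.decode(charset, errors="replace")
```

The consumer never needs to know why the content was omitted (type,
size, or anything else), which is what leaves the producer free to
change its rule later.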
Looking at it this way gives us more flexibility than the current code
takes advantage of; for example we could omit content from the JSON if
it's over some size threshold since the cost of sending that to a
client that doesn't need it is high while the cost of having the client
retrieve it for itself is relatively low.

> > The idea discussed on IRC was to remove all part content from the JSON
> > output and to always use show to retrieve it, possibly beefing up
> > show's support for content decoding (and possibly introducing a way to
> > retrieve multiple raw parts at once to avoid re-parsing). This would
> > get the JSON format out of the business of guessing what consumers
> > need, simplify the Emacs code, and normalize content encoding
> > handling.
>
> Is there a real problem being solved here? Having a clean structure is
> nice, except when it's not.

The "real" problem is the asymmetry in encoding handling that started
this discussion.  Content included in the JSON is re-encoded by the
CLI, while content retrieved via raw needs to be re-encoded by the
client.  OTOH, I don't understand the encoding story for HTML, since
the encoding can come from either a header or from the body of the
HTML.  Does this make it strictly necessary for the client to handle
the encoding?
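
To make the HTML ambiguity concrete, here is a rough Python sketch of
the two places a charset can come from.  It is not what notmuch does:
the function name, the 1024-byte scan window, the header-first
precedence, and the us-ascii default are all assumptions chosen for
illustration.

```python
import re
from email.message import Message

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def guess_html_charset(content_type: str, body: bytes,
                       default: str = "us-ascii") -> str:
    """Pick a charset for an HTML part: MIME header first, then the body."""
    msg = Message()
    msg["Content-Type"] = content_type
    header_charset = msg.get_content_charset()
    if header_charset:
        return header_charset
    # No charset in the header: look for a declaration inside the HTML.
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return default
```

Whoever decodes the part has to look inside the body whenever the
header is silent, so the decoding step needs access to the raw bytes
regardless of which side performs it.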