From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id B07B3429E40 for ; Sat, 21 Jan 2012 16:23:28 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jlBoRChJsfXw for ; Sat, 21 Jan 2012 16:23:28 -0800 (PST) Received: from dmz-mailsec-scanner-5.mit.edu (DMZ-MAILSEC-SCANNER-5.MIT.EDU [18.7.68.34]) by olra.theworths.org (Postfix) with ESMTP id DE411431FAF for ; Sat, 21 Jan 2012 16:23:27 -0800 (PST) X-AuditID: 12074422-b7fd66d0000008f9-8c-4f1b56ff71e5 Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) by dmz-mailsec-scanner-5.mit.edu (Symantec Messaging Gateway) with SMTP id A5.B0.02297.FF65B1F4; Sat, 21 Jan 2012 19:23:27 -0500 (EST) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id q0M0NQnM018500; Sat, 21 Jan 2012 19:23:27 -0500 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q0M0NPiv024726 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Sat, 21 Jan 2012 19:23:26 -0500 (EST) Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77) (envelope-from ) id 1RolD3-00007o-Nk; Sat, 21 Jan 2012 19:23:01 -0500 Date: Sat, 21 Jan 2012 19:23:01 -0500 From: Austin Clements To: Peter Feigl Subject: Re: [PATCH] rewriting notmuch-search for structured output to make other output formats easier Message-ID: <20120122002301.GN16740@mit.edu> References: <1327180568-30385-1-git-send-email-craven@gmx.net> <20120121220407.GK16740@mit.edu> <87obtwlk76.fsf@nexoid.at> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87obtwlk76.fsf@nexoid.at> User-Agent: Mutt/1.5.21 (2010-09-15) X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFupileLIzCtJLcpLzFFi42IR4hRV1v0fJu1vMGWavMXehnZGi+s3ZzI7 MHks3rSfzePZqlvMAUxRXDYpqTmZZalF+nYJXBlNV/+yFDToVTy6m9XAeEili5GDQ0LARGLN PL4uRk4gU0ziwr31bF2MXBxCAvsYJSbv2coE4WxglGg6doUdwjnJJPF44Sc2kG4hgSWMEj/K QUwWAVWJ0w3ZIIPYBDQktu1fzghiiwgoSDxb1wRmMwtIS3z73cwEUi4skClxsT0IJMwroCPR 0HQTrERIoE5iad8CJoi4oMTJmU9YIFq1JG78ewnWCjJm+T8OkDCngLrE6+1zWUFsUQEViSkn t7FNYBSahaR7FpLuWQjdCxiZVzHKpuRW6eYmZuYUpybrFicn5uWlFuma6uVmluilppRuYgQF M7uL0g7GnweVDjEKcDAq8fAm7JP0F2JNLCuuzD3EKMnBpCTKOzdQ2l+ILyk/pTIjsTgjvqg0 J7X4EKMEB7OSCG9ZF1A5b0piZVVqUT5MSpqDRUmcV13rnZ+QQHpiSWp2ampBahFMVoaDQ0mC 1x0YtUKCRanpqRVpmTklCGkmDk6Q4TxAw21BaniLCxJzizPTIfKnGHU5vvxuO88oxJKXn5cq Jc4bDVIkAFKUUZoHNweWhF4xigO9JcxrD1LFA0xgcJNeAS1hAlrCkScFsqQkESEl1cC4itc0 R/L6345kg63nLky032ju3mMcHqpdYiK05cqdA3sLohY9qr0Vk/Njq8V66c43Ar9rHZgYDZQu LhEMLGwIXHTpic0V8U8cmr8XhL/arDdpG9OU6wwGocUtO+LX2NwvOLWpJlPIPSlAeqFFtLHU mo/y4ekpNjpBhs2cc2zyvHP1DpRd91diKc5INNRiLipOBADAa5ZJHQMAAA== Cc: notmuch@notmuchmail.org X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Jan 2012 00:23:28 -0000 Quoth Peter Feigl on Jan 22 at 12:17 am: > On Sat, 21 Jan 2012 17:04:07 -0500, Austin Clements wrote: > > Quoth Peter Feigl on Jan 21 at 10:16 pm: > > I think this is a great idea and I'm a fan of having an S-expression > > format, but I also think there's a much easier and more general way to > > structure this. > > > > In particular, I don't think you should hijack search_format, since > > you'll wind up repeating most of this work for anything else that > > needs to output structured data (notmuch show, at least). Rather, I'd > > suggest creating a general "structure printer" struct that isn't tied > > to search. You've essentially already done this, you just called it > > search_format. Then, leave the existing format callbacks in place, > > just use this new API instead of lots of printf calls. > > I'm sorry I haven't been more clear about this, the intention was all > along to check whether this would be ok in notmuch-search, and if it got > accepted there, to factor it out into a separate file and then use it in > notmuch-show and notmuch-reply. There's nothing in the printer (except > for the name of the struct) that ties it to search. Great! However, it seems counterproductive to put all of these structures and functions into search only to then move them somewhere else. Wouldn't it make more sense to introduce the generic structure printer in a first patch, then refactor the existing code to use it in a second (or maybe more) patch in a series? > > What about all of those annoying {tag,item}_{start,sep,end} fields? I > > think you can simultaneously get rid of those *and* simplify the > > structure printer API. If the structure printer is allowed to keep a > > little state, it can automatically insert separators. With a little > > nesting state, it could even insert terminators by just saying "pop me > > to this level". This could probably be better, but I'm imagining > > something like > > I agree, however this is complicated by the fact that there are > additional restrictions on the actually printed code: newlines should be > placed at strategic locations to improve parsability, which could be > hard to decide in the printer alone without support from the logic that > drives it. True, but that support is easy to add and, I would argue, the exact implementation of framing *should* be up to the printer (though obviously where to place the framing should be up to the caller). Following the general structure of the API I was proposing, you could add a function like /* Print a framing break that is easy to detect in a parser. */ void sprinter_break (struct sprinter *sp); that for JSON and S-expressions could just print a newline. Or, for JSON, if we want separators to appear before the newline (which they don't have to; we can choose our framing however we want), it could simply record that a newline should be printed after the next separator or terminator. > > struct sprinter * > > new_json_sprinter (const void *ctx, FILE *stream); > > struct sprinter * > > new_sexp_sprinter (const void *ctx, FILE *stream); > > > > /* Start a map (a JSON object or a S-expression alist/plist/whatever) > > * and return the nesting level of the map. */ > > int > > sprinter_map (struct sprinter *sp); > > /* Start a list (aka array) and return the nesting level of the list. */ > > int > > sprinter_list (struct sprinter *sp); > > > > /* Close maps and lists until reaching level. */ > > void > > sprinter_pop (struct sprinter *sp, int level); > > > > void > > sprinter_map_key (struct sprinter *sp, const char *key); > > void > > sprinter_number (struct sprinter *sp, int val); > > void > > sprinter_string (struct sprinter *sp, const char *val); > > void > > sprinter_bool (struct sprinter *sp, notmuch_bool_t val); > > > > and that's it. This would also subsume your format_attribute_* > > helpers. > > > > Unfortunately, it's a pain to pass things like a structure printer > > object around formatters (too bad notmuch isn't written in C++, eh?). > > I think it's better to address this than to structure around it. > > Probably the simplest thing to do is to make a struct for formatter > > state and pass that in to the callbacks. You could also more > > completely emulate classes, but that would probably be overkill for > > this. > > I believe this approach is similar to the one I've implemented (though > probably higher level, not so many details explicitly written into the > formatting code). I would suggest trying to get any sort of structured *nods* It was definitely inspired by yours. The complexity has to go somewhere, and in the long run it seems it would be better to isolate it in the printer than to deal with it in every caller. My ulterior motive was to maintain roughly the existing and extensible callback structure while eliminating the need for the various start/sep/end fields, which would have to differ between the formats and would otherwise duplicate knowledge that should be isolated to a structure printer. > formatters (whether more like your suggestions or like the thing I > implemented doesn't matter so much) into the main codebase, and then > refactoring the other parts to use it. I've thought about providing a > single patch to all of notmuch that accomplishes this, but I've deemed > it too large and complicated to be accepted, I thought limiting it to > notmuch-search would be a way to get it set up, so that it could be > expanded to the other parts later. I've found that smaller patches are much more likely to get reviewed on the notmuch list, even if they're part of a series that adds up to a big change. > Thanks for the comments, I'll keep thinking about your design, a very > interesting idea! > > Peter