Quoth myself on Jul 13 at 2:57 pm:
> Quoth Pieter Praet on Jul 13 at 4:16 pm:
> > Jamie Zawinski once said/wrote [1]:
> >   'Some people, when confronted with a problem, think "I know,
> >   I'll use regular expressions." Now they have two problems.'
> >
> > With this in mind, I set out to get rid of this whole regex mess altogether,
> > by populating the search buffer using Notmuch's JSON output instead of doing
> > brittle text matching tricks.
> >
> > Looking for some documentation, I stumbled upon a long-forgotten gem [2].
> >
> > David's already done pretty much all of the work for us!
>
> Yes, similar thoughts were running through my head as I futzed with
> the formatting for this. My concern with moving to JSON for search
> buffers is that parsing it is about *30 times slower* than the current
> regexp-based approach (0.6 seconds versus 0.02 seconds for a mere 1413
> result search buffer). I think JSON makes a lot of sense for show
> buffers because there's generally less data and it has a lot of
> complicated structure. Search results, on the other hand, have a very
> simple, regular, and constrained structure, so JSON doesn't buy us
> nearly as much.
>
> JSON is hard to parse because, like the text search output, it's
> designed for human consumption (of course, unlike the text search
> output, it's also designed for computer consumption). There's
> something to be said for the debuggability and generality of this and
> JSON is very good for exchanging small objects, but it's a remarkably
> inefficient way to exchange large amounts of data between two
> programs.
>
> I guess what I'm getting at, though it pains me to say it, is perhaps
> search needs a fast, computer-readable interchange format. The
> structure of the data is so simple and constrained that this could be
> altogether trivial.
>
> Or maybe I need a faster computer.

Or maybe I need to un-lame my benchmark.

TL;DR: We should use JSON for search results, but possibly not the
json.el shipped with Emacs.

I realized that my text benchmark didn't capture the cost of
extracting the match strings. re-search-forward records matches as
buffer positions, which don't get realized into strings until you call
match-string. Hence, match-string is quite expensive.

Also, Emacs' json.el is slow, so I perked it up. My modified json.el
is ~3X faster, particularly for string-heavy output like notmuch's.
Though now I'm well into the realm of "eq is faster than =" and "M-x
disassemble", so unless I missed something big, this is as fast as it
gets.

While I was still thinking about new IPC formats, I realized that the
text format and the Emacs UI are already tightly coupled, so why not
go all the way and use S-expressions for IPC? I now think JSON is
fast enough to use, but S-expressions still have a certain appeal.
They share most of the benefits of JSON; structure and extensibility
in particular. Further, while the content of some ad-hoc format could
easily diverge from both the text and JSON formats, S-expressions
could exactly parallel the JSON content (with a little more
abstraction, they could even share the same format code). For kicks,
I included an S-expression benchmark. It beats out the text parser by
a factor of two and the optimized JSON parser by a factor of three.

Here are the results for my 1,413 result search buffer and timeworn
computer:

                    Time    Normalized
  --format=text    0.148s     1.00x
  --format=json    0.598s     4.04x
  custom json.el   0.209s     1.41x
  + string keys    0.195s     1.32x
  S-expressions    0.066s     0.45x

I don't have time right now, but next week I might be able to look
through and update dme's JSON-based search code.

The benchmark and modified json.el are attached. The benchmark is
written so you can open it and eval-buffer, then C-x C-e the various
calls in the comments. You can either make-text/make-json, or run
notmuch manually, pipe the results into files "text" and "json", and
open them in Emacs. Please excuse the modified json.el code; it's
gone through zero cleanup.
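
For anyone who wants to double-check the Normalized column and the
"factor of two" / "factor of three" claims, the arithmetic can be
re-derived from the raw times. The following is only a rough sketch in
Python, not part of the attached benchmark; the names `timings`,
`normalized`, and `speedup` are made up for illustration, and the
numbers are simply copied from the results table.

```python
# Raw timings in seconds, copied from the results table in this message
# (1,413-result search buffer). Names here are illustrative only.
timings = {
    "--format=text": 0.148,
    "--format=json": 0.598,
    "custom json.el": 0.209,
    "+ string keys": 0.195,
    "S-expressions": 0.066,
}


def normalized(times, baseline="--format=text"):
    """Divide every time by the baseline time, rounded to two places."""
    base = times[baseline]
    return {name: round(t / base, 2) for name, t in times.items()}


def speedup(times, slow, fast):
    """Return how many times faster `fast` is than `slow`."""
    return times[slow] / times[fast]


if __name__ == "__main__":
    # Reprint the table with the recomputed Normalized column.
    for name, ratio in normalized(timings).items():
        print(f"{name:<16}{timings[name]:.3f}s  {ratio:.2f}x")

    # Ratios behind the prose claims about S-expressions and json.el.
    pairs = [
        ("--format=text", "S-expressions"),
        ("+ string keys", "S-expressions"),
        ("--format=json", "custom json.el"),
    ]
    for slow, fast in pairs:
        print(f"{fast} vs {slow}: {speedup(timings, slow, fast):.2f}x faster")
```

The baseline is the text format, so a Normalized value below 1.00x
means "faster than the current regexp-based parser".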