unofficial mirror of notmuch@notmuchmail.org
* About the json output and the number of results shown.
@ 2011-01-12 18:37 Christophe-Marie Duquesne
  2011-01-12 22:39 ` Mike Kelly
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Christophe-Marie Duquesne @ 2011-01-12 18:37 UTC (permalink / raw)
  To: notmuch

Hi,

The notmuch command line tool has an option that seems very
interesting to me: --format=json

In several languages, and especially in python, json is as easy to parse as:
>>> res = json.load(stream)
If your stream contains valid json, you then get all your data in res
and you can immediately use it.

With notmuch, some commands can bring a lot of results, and can take
some time to return. That is why, when I began to write a curses
interface to notmuch, I added a mechanism to spawn these commands in
the background and gather the results asynchronously. Sadly, this makes
me unable to use the built-in python json parser: as long as the output
has not finished, the data on the stream is not valid json, since it
lacks at least the closing bracket ']'. As a consequence, I find it
easier not to use json and to parse the plain-text output as it arrives.

So I am wondering: what is the point of having a tool that is able to
output json if one ends up not using it? Is there a way to make
the json output more usable? One solution I've been thinking about
would be to add an option giving the range of results to show (something
like --range=25:50). Is that easily doable? It would obviously be an
issue if results are not guaranteed to come in a given order, or
if finding results 25:50 takes exactly as long as finding
results 1:50. Otherwise, if it is doable, I guess this mail is a
feature request. In any case, do you have any proposal for making
sense of this json output without modifications to the notmuch CLI?
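
One way to make sense of the current output without changing the CLI
might be to feed the stream to a small incremental parser that hands
back each element of the outer array as soon as it is complete. The
following is only a sketch in Python: the ArrayStreamParser class and
its feed() method are made up for illustration and are not part of
notmuch. It assumes the top-level value is an array whose elements are
JSON objects or arrays (a bare number split across two chunks could be
misread).

```python
import json


class ArrayStreamParser:
    """Yield the elements of one top-level JSON array as chunks arrive."""

    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._buf = ""
        self._started = False   # True once the opening '[' has been seen
        self.finished = False   # True once the closing ']' has been seen

    def feed(self, chunk):
        """Add a chunk of text and return the elements completed by it."""
        self._buf += chunk
        items = []
        while not self.finished:
            # Skip whitespace and the commas that separate elements.
            self._buf = self._buf.lstrip(" \t\r\n,")
            if not self._buf:
                break
            if not self._started:
                if self._buf[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                self._started = True
                self._buf = self._buf[1:]
                continue
            if self._buf[0] == "]":
                self.finished = True
                break
            try:
                item, end = self._decoder.raw_decode(self._buf)
            except ValueError:
                break  # element still incomplete; wait for more input
            items.append(item)
            self._buf = self._buf[end:]
        return items
```

Each call to feed() returns whatever results have become complete, so
an interface loop could draw them immediately. The buffer is re-scanned
on every call, which is simple but not the fastest option for very
large elements.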

Cheers,
Christophe-Marie Duquesne

* Re: About the json output and the number of results shown.
  2011-01-12 18:37 About the json output and the number of results shown Christophe-Marie Duquesne
@ 2011-01-12 22:39 ` Mike Kelly
  2011-01-28 20:44   ` Carl Worth
  2011-01-13 10:34 ` Sebastian Spaeth
  2011-01-28 20:40 ` Carl Worth
  2 siblings, 1 reply; 10+ messages in thread
From: Mike Kelly @ 2011-01-12 22:39 UTC (permalink / raw)
  To: notmuch

I've had other problems attempting to use the JSON interface recently.
For starters, if I'm simply trying to retrieve a single message, the
interface is rather awkward. I seem to need to do something like:

    use JSON;  # exports decode_json

    my $json = `notmuch show --format=json id:$message_id`;
    my $parsed_json = decode_json($json);
    # The message sits inside three levels of single-entry arrays.
    my $message = $parsed_json->[0][0][0];

And, when I'm doing my search earlier to even find those message ids, I
need to run `notmuch count` first to see if I'll even get any
results, because the 0 result case is not valid JSON.

Of course, any feedback, like "you're doing it wrong" would be helpful.
My script is available at:

    http://git.pioto.org/gitweb?p=pioto-scripts.git;a=blob;f=mail/notmuch-poll.pl;hb=master

-- 
Mike Kelly

* Re: About the json output and the number of results shown.
  2011-01-12 18:37 About the json output and the number of results shown Christophe-Marie Duquesne
  2011-01-12 22:39 ` Mike Kelly
@ 2011-01-13 10:34 ` Sebastian Spaeth
  2011-01-13 18:46   ` Christophe-Marie Duquesne
  2011-01-28 20:40 ` Carl Worth
  2 siblings, 1 reply; 10+ messages in thread
From: Sebastian Spaeth @ 2011-01-13 10:34 UTC (permalink / raw)
  To: Christophe-Marie Duquesne, notmuch

On Wed, 12 Jan 2011 19:37:21 +0100, Christophe-Marie Duquesne <chm.duquesne@gmail.com> wrote:
> With notmuch, some commands can bring a lot of results, and can take
> some time to return. That is why when I began to write a curses
> interface to notmuch, I added a mechanism to spawn these commands in
> background and gather the results asynchronously.

Alternatively, you could use the python bindings to libnotmuch and call
the functions directly. You could do that in an unthreaded or threaded
fashion...

Somewhat obsolete but mostly correct documentation of the included
python API is here: http://packages.python.org/cnotmuch/ (I really need
to update that).

Sebastian

* Re: About the json output and the number of results shown.
  2011-01-13 10:34 ` Sebastian Spaeth
@ 2011-01-13 18:46   ` Christophe-Marie Duquesne
  2011-01-14 13:48     ` Sebastian Spaeth
  2011-01-28 20:27     ` Carl Worth
  0 siblings, 2 replies; 10+ messages in thread
From: Christophe-Marie Duquesne @ 2011-01-13 18:46 UTC (permalink / raw)
  To: Sebastian Spaeth; +Cc: notmuch

> Alternatively, you could use the python bindings to libnotmuch and call
> the functions directly. You could do that in an unthreaded or threaded
> fashion...

I've had a look at the python libnotmuch documentation. My problem
with this API is that, unless I misread it, if I use
one of its functions in a threaded fashion, I still have to wait for
this function to finish before I get any results. When using the command
line tool, I can process the text as it gets printed on stdout, and I
have data to show to the user even though notmuch has not finished
outputting it...
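
For reference, the command-line approach described above could look
roughly like this in Python. It is a sketch only: stream_lines is a
hypothetical helper, and the invocation mentioned in the note below
simply assumes a notmuch binary on the PATH.

```python
import subprocess


def stream_lines(argv):
    """Run a command and yield each line of its stdout as soon as it is read."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # Runs when the output ends or the caller stops iterating early.
        proc.stdout.close()
        proc.wait()
```

Iterating over stream_lines(["notmuch", "search", "tag:inbox"]) would
hand each summary line to the interface as it is printed. In a curses
program this loop would probably run in a background thread so that the
display stays responsive.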

* Re: About the json output and the number of results shown.
  2011-01-13 18:46   ` Christophe-Marie Duquesne
@ 2011-01-14 13:48     ` Sebastian Spaeth
  2011-01-28 20:27     ` Carl Worth
  1 sibling, 0 replies; 10+ messages in thread
From: Sebastian Spaeth @ 2011-01-14 13:48 UTC (permalink / raw)
  To: Christophe-Marie Duquesne; +Cc: notmuch

On Thu, 13 Jan 2011 19:46:29 +0100, Christophe-Marie Duquesne <chm.duquesne@gmail.com> wrote:
> > Alternatively, you could use the python bindings to libnotmuch and call
> > the functions directly. You could do that in an unthreaded or threaded
> > fashion...
> 
> I've had a look at the python libnotmuch documentation. My problem
> with this API is that, unless I did not read it correctly, if I use
> one of its functions in a threaded fashion, I still have to wait for
> this function to finish until I get results. When using the command
> line tool, I can process the text as it gets printed on stdout, and I
> have data to show to the user even though notmuch has not finished
> outputting it...

Well, you have to wait until the "search_messages()" function returns,
and once it does you can output text as you go, by iterating through the
threads/mails. I am not sure that the command line client is any faster
than that with its interactive output.

But I haven't investigated the performance closely.

Sebastian

* Re: About the json output and the number of results shown.
  2011-01-13 18:46   ` Christophe-Marie Duquesne
  2011-01-14 13:48     ` Sebastian Spaeth
@ 2011-01-28 20:27     ` Carl Worth
  1 sibling, 0 replies; 10+ messages in thread
From: Carl Worth @ 2011-01-28 20:27 UTC (permalink / raw)
  To: Christophe-Marie Duquesne, Sebastian Spaeth; +Cc: notmuch

On Thu, 13 Jan 2011 19:46:29 +0100, Christophe-Marie Duquesne <chm.duquesne@gmail.com> wrote:
> I've had a look at the python libnotmuch documentation. My problem
> with this API is that, unless I did not read it correctly, if I use
> one of its functions in a threaded fashion, I still have to wait for
> this function to finish until I get results.

The search function should return very quickly (or pretty close to that
anyway). It's only when you start iterating through the results that
there's a lot of time being spent in the library functions.

> When using the command
> line tool, I can process the text as it gets printed on stdout, and I
> have data to show to the user even though notmuch has not finished
> outputting it...

This functionality of the command-line tool is implemented with the same
library functions you would call. So you should get exactly the same
behavior that you want by calling the library directly.

Please let us know if that's not the case.

-Carl

-- 
carl.d.worth@intel.com

* Re: About the json output and the number of results shown.
  2011-01-12 18:37 About the json output and the number of results shown Christophe-Marie Duquesne
  2011-01-12 22:39 ` Mike Kelly
  2011-01-13 10:34 ` Sebastian Spaeth
@ 2011-01-28 20:40 ` Carl Worth
  2011-02-13  6:29   ` Jeff Waugh
  2 siblings, 1 reply; 10+ messages in thread
From: Carl Worth @ 2011-01-28 20:40 UTC (permalink / raw)
  To: Christophe-Marie Duquesne, notmuch

On Wed, 12 Jan 2011 19:37:21 +0100, Christophe-Marie Duquesne <chm.duquesne@gmail.com> wrote:
> So I am wondering: what is the point of having a tool that is able to
> output json and ending up not using it? Is there a solution to make
> the json output more usable? One solution I've been thinking about
> would be to add an option: the range of results to show (something
> like --range=25:50). Is it doable easily?

This is fairly easy to do, yes. We even had functionality like this
once, and I'll probably add it back soon (since a client like the
vim interface isn't able to do the kind of asynchronous processing that
you would really want).

One problem with the ranged output (for "notmuch search" at least) is
that small ranges with large initial offsets will take longer than
expected. This is because in this case notmuch can't directly use
Xapian's range offset support. The user is asking for an offset as a
number of threads, but within Xapian we only have messages stored. So
notmuch will have to search for messages from the beginning, construct a
bunch of useless threads, and then throw those threads away after doing
no more than counting them.
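
The cost described above can be illustrated with a small model. This is
not notmuch's or Xapian's actual code: iter_threads, thread_range and
the build_thread callback are hypothetical names, and messages are
reduced to (message_id, thread_id) pairs. The point is only that an
offset counted in threads forces every earlier thread to be built
before it can be skipped.

```python
from itertools import islice


def iter_threads(messages, build_thread):
    """Walk messages in order and build each thread the first time it is seen."""
    seen = set()
    for _message_id, thread_id in messages:
        if thread_id in seen:
            continue
        seen.add(thread_id)
        # Building is the expensive step, and it cannot be skipped:
        # a thread only counts towards the offset once it exists.
        yield build_thread(thread_id)


def thread_range(messages, build_thread, first, last):
    """Return threads first..last-1, discarding the ones before `first`."""
    return list(islice(iter_threads(messages, build_thread), first, last))
```

Asking for threads 25 to 50 this way calls build_thread 50 times
although only 25 results are returned.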

This inefficiency in the API was one of the reasons I dropped this
functionality before. It's pretty ugly. But I don't see a really good
answer for that.

> feature request. In any case, do you have any proposal for making
> sense of this json output without modifications in the notmuch CLI?

We've run into basically the same issue with the emacs interface. We've
been avoiding using the json output precisely because the emacs JSON
parsing would need to see all the output before it could start
parsing. And that wouldn't give us the responsive user interface that we
want.

One idea I've had for this is to change the output (perhaps with a
command-line option) to avoid emitting the outer array. That is, the
results would instead be a series of independent JSON objects rather
than a single JSON array. That should let the application handle results
promptly by simply calling the JSON parser for each complete
object. (Though, here, the application would likely want a cheap way to
know when the input represented a complete object.)

If anyone wants to help improve our JSON output here, then that would be
great.

For any change to the structure of the JSON output, I'd also like to see
some documentation added to specify that structure clearly.

-Carl

-- 
carl.d.worth@intel.com

* Re: About the json output and the number of results shown.
  2011-01-12 22:39 ` Mike Kelly
@ 2011-01-28 20:44   ` Carl Worth
  2011-01-29 15:59     ` Mike Kelly
  0 siblings, 1 reply; 10+ messages in thread
From: Carl Worth @ 2011-01-28 20:44 UTC (permalink / raw)
  To: Mike Kelly, notmuch

On Wed, 12 Jan 2011 22:39:45 +0000, Mike Kelly <pioto@pioto.org> wrote:
> For starters, if I'm simply trying to retrieve a single message, the
> interface is rather awkward. I seem to need to do something like:
> 
>     my $json = `notmuch show --format=json id:$message_id`;
>     my $parsed_json = decode_json($json);
>     my $message = $parsed_json->[0][0][0];

That does seem fairly awkward, yes. Do you have a suggestion for how
you'd like the output to be structured instead?

> And, when I'm doing my search earlier to even find those message ids, I
> need to do a check to `notmuch count` first to see if I'll even get any
> results, because the 0 result case is not valid JSON.

Yikes! That's a bug in notmuch that we should get fixed rather than you
just working around it. I just started adding a test for this
case. Currently:

	notmuch search --format=json "string that matches nothing"

returns nothing. Presumably, this should return just an empty json array
instead, (that is, "[]")?
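
Until that is fixed, a caller could treat empty output as an empty
array rather than running notmuch count first. A minimal sketch in
Python, where parse_search_output is a hypothetical helper name:

```python
import json


def parse_search_output(text):
    """Parse `notmuch search --format=json` output, treating no output as []."""
    if not text.strip():
        # Workaround: the zero-result case currently prints nothing at all.
        return []
    return json.loads(text)
```

The helper keeps working unchanged once the empty case prints "[]".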

-Carl

-- 
carl.d.worth@intel.com

* Re: About the json output and the number of results shown.
  2011-01-28 20:44   ` Carl Worth
@ 2011-01-29 15:59     ` Mike Kelly
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Kelly @ 2011-01-29 15:59 UTC (permalink / raw)
  To: notmuch

On Sat, 29 Jan 2011 06:44:40 +1000
Carl Worth <cworth@cworth.org> wrote:

> On Wed, 12 Jan 2011 22:39:45 +0000, Mike Kelly <pioto@pioto.org>
> wrote:
> > For starters, if I'm simply trying to retrieve a single message, the
> > interface is rather awkward. I seem to need to do something like:
> > 
> >     my $json = `notmuch show --format=json id:$message_id`;
> >     my $parsed_json = decode_json($json);
> >     my $message = $parsed_json->[0][0][0];
> 
> That does seem fairly awkward, yes. Do you have a suggestion for how
> you'd like the output to be structured instead?

Well, if I ask for a single message, I'd expect to just get a single
message. So, $message = $parsed_json, without the extra single-entry
arrays.
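
Until the output changes, the unwrapping can at least be hidden in one
place. A small Python sketch, assuming only the nesting implied by
$parsed_json->[0][0][0] above (a list of threads, each holding entries
whose first item is the message); single_message is a hypothetical
helper name:

```python
def single_message(parsed):
    """Return the one message from a parsed `notmuch show` result.

    Assumes `parsed` is a list of threads, and each thread is a list of
    entries whose first item is the message itself.
    """
    if len(parsed) != 1 or len(parsed[0]) != 1:
        raise ValueError("expected exactly one thread with one top-level message")
    return parsed[0][0][0]
```

The length check makes a query that unexpectedly matches several
messages fail loudly instead of silently returning the first one.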

> > And, when I'm doing my search earlier to even find those message
> > ids, I need to do a check to `notmuch count` first to see if I'll
> > even get any results, because the 0 result case is not valid JSON.
> 
> Yikes! That's a bug in notmuch that we should get fixed rather than
> you just working around it. I just started adding a test for this
> case. Currently:
> 
> 	notmuch search --format=json "string that matches nothing"
> 
> returns nothing. Presumably, this should return just an empty json
> array instead, (that is, "[]")?

Yeah, should be "[]".

Thanks.

-- 
Mike Kelly

* Re: About the json output and the number of results shown.
  2011-01-28 20:40 ` Carl Worth
@ 2011-02-13  6:29   ` Jeff Waugh
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Waugh @ 2011-02-13  6:29 UTC (permalink / raw)
  To: notmuch

On Sat, Jan 29, 2011 at 07:40, Carl Worth wrote:

> One idea I've had for this is to change the output (perhaps with a
> command-line option) to avoid emitting the outer array. That is, the
> results would instead be a series of independent JSON objects rather
> than a single JSON object. That should let the application treat things
> quickly by simply calling the JSON parser for each complete
> object.


It might be useful to model this on the Twitter streaming API, which just
delivers a series of JSON objects, each followed by '\r\n' (large objects
straddle HTTP chunks).


> (Though, here, the application would likely want a cheap way to
> know when the input represented a complete object.)
>

Is that necessary? You're definitely going to get a \r\n or an EOF at some
point. :-)
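
With one JSON value per line, the reading side becomes very small. A
sketch in Python, assuming a hypothetical newline-delimited output mode
that notmuch does not currently provide; iter_json_lines is an
illustrative name. It relies on the fact that a raw newline cannot
appear inside a JSON string, so a line break always marks the end of a
value (provided the producer does not pretty-print values across
lines).

```python
import json


def iter_json_lines(lines):
    """Yield one parsed JSON value per non-empty line."""
    for line in lines:
        line = line.strip()  # drops the trailing '\r\n' or '\n'
        if line:
            yield json.loads(line)
```

Any iterable of lines works as input, including the stdout of a
subprocess, so results can be shown as each line arrives.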

- Jeff
