unofficial mirror of notmuch@notmuchmail.org
 help / color / mirror / code / Atom feed
From: Mark Walters <markwalters1009@gmail.com>
To: Austin Clements <amdragon@MIT.EDU>
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH v2 1/7] cli: allow query to come from stdin
Date: Mon, 26 Nov 2012 10:15:06 +0000	[thread overview]
Message-ID: <87mwy4smad.fsf@qmul.ac.uk> (raw)
In-Reply-To: <20121124174134.GH4562@mit.edu>


Hi

Many thanks for all the reviews: I have incorporated most of your and
Tomi's suggestions in my latest version. However, for this patch I
wonder whether just using David's batch tagging would be sufficient. It
does mean that I can't construct the list of possible tag removals
correctly for a large query but I can just return all tags in this
case. I think this is probably an acceptable trade off: you don't get
the correct list of possible completions if you are tagging more than
5000 messages at once.

This patch is not very complicated but it does add another
feature/option to the command line so if it is not needed I am inclined
not to add it. If people think that being able to do searches for
queries in excess of ARGMAX (possible 2MB or so) is useful then we could
add it.

Incidentally for the tag completions: my view is the correct thing is to
offer completions (for tag removal) based on what tags the buffer shows
(ie what was there when the query was run) rather than what is actually
tags are present now: this would be easy to add if anyone cared
sufficiently.

Any thoughts?

Best wishes

Mark

Austin Clements <amdragon@MIT.EDU> writes:

> Quoth markwalters1009 on Nov 24 at  1:20 pm:
>> From: Mark Walters <markwalters1009@gmail.com>
>> 
>> After this series there will be times when a caller will want to pass
>> a very large query string to notmuch (eg a list of 10,000 message-ids)
>> and this can exceed the size of ARG_MAX. Hence allow notmuch to take
>> the query from stdin (if the query is -).
>> ---
>>  query-string.c |   41 +++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 41 insertions(+), 0 deletions(-)
>> 
>> diff --git a/query-string.c b/query-string.c
>> index 6536512..b1fbdeb 100644
>> --- a/query-string.c
>> +++ b/query-string.c
>> @@ -20,6 +20,44 @@
>>  
>>  #include "notmuch-client.h"
>>  
>> +/* Read a single query string from STDIN, using
>> + * 'ctx' as the talloc owner for all allocations.
>> + *
>> + * This function returns NULL in case of insufficient memory or read
>> + * errors.
>> + */
>> +static char *
>> +query_string_from_stdin (void *ctx)
>> +{
>> +    char *query_string;
>> +    char buf[4096];
>> +    ssize_t remain;
>> +
>> +    query_string = talloc_strdup (ctx, "");
>> +    if (query_string == NULL)
>> +	return NULL;
>> +
>> +    for (;;) {
>> +	remain = read (STDIN_FILENO, buf, sizeof(buf) - 1);
>> +	if (remain == 0)
>> +	    break;
>> +	if (remain < 0) {
>> +	    if (errno == EINTR)
>> +		continue;
>> +	    fprintf (stderr, "Error: reading from standard input: %s\n",
>> +		     strerror (errno));
>
> talloc_free (query_string) ?
>
>> +	    return NULL;
>> +	}
>> +
>> +	buf[remain] = '\0';
>> +	query_string = talloc_strdup_append (query_string, buf);
>
> Eliminate the NUL in buf and instead
>  talloc_strndup_append (query_string, buf, remain) ?
>
> Should there be some (large) bound on the size of the query string to
> prevent runaway?
>
>> +	if (query_string == NULL)
>
> Technically it would be good to talloc_free the old pointer here, too.
>
>> +	    return NULL;
>> +    }
>> +
>> +    return query_string;
>> +}
>> +
>
> This whole approach is O(n^2), which might actually matter for large
> query strings.  How about (tested, but only a little):
>
> #define MAX_QUERY_STRING_LENGTH (16 * 1024 * 1024)
>
> /* Read a single query string from STDIN, using 'ctx' as the talloc
>  * owner for all allocations.
>  *
>  * This function returns NULL in case of insufficient memory or read
>  * errors.
>  */
> static char *
> query_string_from_stdin (void *ctx)
> {
>     char *query_string = NULL, *new_qs;
>     size_t pos = 0, end = 0;
>     ssize_t got;
>
>     for (;;) {
> 	if (end - pos < 512) {
> 	    end = MAX(end * 2, 1024);
> 	    if (end >= MAX_QUERY_STRING_LENGTH) {
> 		fprintf (stderr, "Error: query too long\n");
> 		goto FAIL;
> 	    }
> 	    new_qs = talloc_realloc (ctx, query_string, char, end);
> 	    if (new_qs == NULL)
> 		goto FAIL;
> 	    query_string = new_qs;
> 	}
>
> 	got = read (STDIN_FILENO, query_string + pos, end - pos - 1);
> 	if (got == 0)
> 	    break;
> 	if (got < 0) {
> 	   if (errno == EINTR)
> 	       continue;
> 	   fprintf (stderr, "Error: reading from standard input: %s\n",
> 		    strerror (errno));
> 	   goto FAIL;
> 	}
> 	pos += got;
>     }
>
>     query_string[pos] = '\0';
>     return query_string;
>
>  FAIL:
>     talloc_free (query_string);
>     return NULL;
> }
>
>>  /* Construct a single query string from the passed arguments, using
>>   * 'ctx' as the talloc owner for all allocations.
>>   *
>> @@ -35,6 +73,9 @@ query_string_from_args (void *ctx, int argc, char *argv[])
>>      char *query_string;
>>      int i;
>>  
>> +    if ((argc == 1) && (strcmp ("-", argv[0]) == 0))
>> +	return query_string_from_stdin (ctx);
>> +
>>      query_string = talloc_strdup (ctx, "");
>>      if (query_string == NULL)
>>  	return NULL;

  reply	other threads:[~2012-11-26 10:15 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-24 13:20 [PATCH v2 0/7] Fix emacs tagging race markwalters1009
2012-11-24 13:20 ` [PATCH v2 1/7] cli: allow query to come from stdin markwalters1009
2012-11-24 13:24   ` Mark Walters
2012-11-24 17:41   ` Austin Clements
2012-11-26 10:15     ` Mark Walters [this message]
2012-11-24 22:34   ` Tomi Ollila
2012-11-24 13:20 ` [PATCH v2 2/7] test: for the new query from stdin functionality markwalters1009
2012-11-24 13:20 ` [PATCH v2 3/7] emacs: notmuch.el split call-process into call-process-region markwalters1009
2012-11-24 13:20 ` [PATCH v2 4/7] emacs: make emacs tagging use the stdin query functionality markwalters1009
2012-11-24 22:09   ` Austin Clements
2012-11-24 13:20 ` [PATCH v2 5/7] test: test for race when tagging from emacs search markwalters1009
2012-11-24 13:20 ` [PATCH v2 6/7] cli: allow search mode to include msg-ids with JSON output markwalters1009
2012-11-24 22:30   ` Tomi Ollila
2012-11-25  0:23   ` Austin Clements
2012-11-24 13:20 ` [PATCH v2 7/7] emacs: make emacs use message-ids for tagging markwalters1009
2012-11-25  0:38   ` Austin Clements

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://notmuchmail.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87mwy4smad.fsf@qmul.ac.uk \
    --to=markwalters1009@gmail.com \
    --cc=amdragon@MIT.EDU \
    --cc=notmuch@notmuchmail.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhetil.org/notmuch.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).