From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
 by olra.theworths.org (Postfix) with ESMTP id F1B29431FDA
 for ; Sun, 23 Dec 2012 17:40:06 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: 0
X-Spam-Level:
X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none]
 autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id M0Pd0owaUDqa for ;
 Sun, 23 Dec 2012 17:40:06 -0800 (PST)
Received: from tesseract.cs.unb.ca (tesseract.cs.unb.ca [131.202.240.238])
 (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (No client certificate requested)
 by olra.theworths.org (Postfix) with ESMTPS id 730B9431FBC
 for ; Sun, 23 Dec 2012 17:40:02 -0800 (PST)
Received: from fctnnbsc30w-156034082078.dhcp-dynamic.fibreop.nb.bellaliant.net
 ([156.34.82.78] helo=zancas.localnet)
 by tesseract.cs.unb.ca with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16)
 (Exim 4.72) (envelope-from )
 id 1Tmx1L-0008Ko-Rl; Sun, 23 Dec 2012 21:40:00 -0400
Received: from bremner by zancas.localnet with local (Exim 4.80)
 (envelope-from )
 id 1Tmx1G-0002nb-AV; Sun, 23 Dec 2012 21:39:54 -0400
From: david@tethera.net
To: notmuch@notmuchmail.org
Subject: [Patch v9 05/17] util/string-util: add a new string tokenized function
Date: Sun, 23 Dec 2012 21:39:31 -0400
Message-Id: <1356313183-9266-6-git-send-email-david@tethera.net>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1356313183-9266-1-git-send-email-david@tethera.net>
References: <1356313183-9266-1-git-send-email-david@tethera.net>
X-Spam_bar: -
Cc: David Bremner
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 24 Dec 2012 01:40:07 -0000

From: David Bremner

The initial target use is quoting queries for Xapian. We want to
split into tokens, but preserve the delimiters between the tokens
verbatim.
---
 util/string-util.c |   12 ++++++++++++
 util/string-util.h |   19 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/util/string-util.c b/util/string-util.c
index b9039f4..1586483 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -34,6 +34,18 @@ strtok_len (char *s, const char *delim, size_t *len)
     return *len ? s : NULL;
 }
 
+char *
+strtok_len2 (char *s, const char *delim, size_t *len, size_t *delim_len)
+{
+    /* length of token */
+    *len = strcspn (s, delim);
+
+    /* length of following delimiter */
+    *delim_len = strspn (s + *len, delim);
+
+    return *len || *delim_len ? s : NULL;
+}
+
 int
 double_quote_str (void *ctx, const char *str,
diff --git a/util/string-util.h b/util/string-util.h
index 4fc7942..12398a5 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -19,6 +19,25 @@
 
 char *strtok_len (char *s, const char *delim, size_t *len);
 
+/* Like strtok_len, but return length of delimiters as well. Return
+ * value is indicated by pointer and length, not null terminator.
+ * Does _not_ skip initial delimiters.
+ *
+ * Usage pattern:
+ *
+ * const char *tok = input;
+ * const char *delim = " :.,";
+ * size_t tok_len = 0;
+ * size_t delim_len = 0;
+ *
+ * while ((tok = strtok_len2 (tok + tok_len + delim_len, delim,
+ *                            &tok_len, &delim_len)) != NULL) {
+ *     // do stuff with token and following delimiters.
+ * }
+ */
+
+char *strtok_len2 (char *s, const char *delim, size_t *len, size_t *delim_len);
+
 /* Copy str to dest, surrounding with double quotes.
  * Any internal double-quotes are doubled, i.e. a"b -> "a""b"
  *
-- 
1.7.10.4
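
For illustration only, here is a minimal sketch of how the usage pattern
from the header comment might be driven. strtok_len2 is copied from the
patch; bracket_tokens, its output format, and its buffer handling are
hypothetical and not part of notmuch. It wraps each non-empty token in
brackets and copies the delimiter runs through verbatim, which is
roughly the shape a query-quoting caller would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same body as the function added by the patch. */
static char *
strtok_len2 (char *s, const char *delim, size_t *len, size_t *delim_len)
{
    /* length of token */
    *len = strcspn (s, delim);

    /* length of following delimiter */
    *delim_len = strspn (s + *len, delim);

    return *len || *delim_len ? s : NULL;
}

/* Hypothetical helper (not in the patch): write INPUT to OUT with every
 * non-empty token wrapped in [] and every delimiter run copied as-is.
 * Returns 0 on success, -1 if OUT is too small. */
static int
bracket_tokens (char *input, const char *delim, char *out, size_t out_size)
{
    char *tok = input;
    size_t tok_len = 0;
    size_t delim_len = 0;
    size_t used = 0;

    assert (out != NULL && out_size > 0);
    out[0] = '\0';

    while ((tok = strtok_len2 (tok + tok_len + delim_len, delim,
			       &tok_len, &delim_len)) != NULL) {
	int n;

	if (tok_len > 0)
	    /* token, then the delimiters that follow it */
	    n = snprintf (out + used, out_size - used, "[%.*s]%.*s",
			  (int) tok_len, tok,
			  (int) delim_len, tok + tok_len);
	else
	    /* leading delimiters: empty token, nothing to bracket */
	    n = snprintf (out + used, out_size - used, "%.*s",
			  (int) delim_len, tok);

	if (n < 0 || (size_t) n >= out_size - used)
	    return -1;
	used += (size_t) n;
    }

    return 0;
}
```

With the delimiter set " :.," from the header comment, the input
"to:alice, bob" comes out as "[to]:[alice], [bob]". Because initial
delimiters are not skipped, the first call on " a" returns an empty
token followed by a one-character delimiter run, so a caller has to
handle tok_len == 0 explicitly, as the sketch does.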