From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
 by arlo.cworth.org (Postfix) with ESMTP id 342976DE0C71
 for ; Sun, 5 Feb 2017 12:16:36 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at cworth.org
X-Spam-Flag: NO
X-Spam-Score: 0.507
X-Spam-Level:
X-Spam-Status: No, score=0.507 tagged_above=-999 required=5
 tests=[AWL=-0.145, SPF_NEUTRAL=0.652] autolearn=disabled
Received: from arlo.cworth.org ([127.0.0.1])
 by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id aIXVaIWcYlqi for ;
 Sun, 5 Feb 2017 12:16:35 -0800 (PST)
Received: from guru.guru-group.fi (guru.guru-group.fi [46.183.73.34])
 by arlo.cworth.org (Postfix) with ESMTP id CF53F6DE0A6C
 for ; Sun, 5 Feb 2017 12:16:34 -0800 (PST)
Received: from guru.guru-group.fi (localhost [IPv6:::1])
 by guru.guru-group.fi (Postfix) with ESMTP id 6CD6A1001A4;
 Sun, 5 Feb 2017 22:16:06 +0200 (EET)
From: Tomi Ollila
To: David Bremner , notmuch@notmuchmail.org
Subject: Re: [Patch v4] lib: regexp matching in 'subject' and 'from'
In-Reply-To: <87ziia2jpj.fsf@nikula.org>
References: <20170121032752.6788-1-david@tethera.net>
 <20170121135917.22062-1-david@tethera.net>
 <87efzqef2r.fsf@tethera.net> <87ziia2jpj.fsf@nikula.org>
User-Agent: Notmuch/0.23.3+85~g2b85e66 (https://notmuchmail.org) Emacs/24.5.1
 (x86_64-unknown-linux-gnu)
X-Face: HhBM'cA~
MIME-Version: 1.0
Content-Type: text/plain
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 05 Feb 2017 20:16:36 -0000

On Sun, Jan 29 2017, Jani Nikula wrote:

> On Wed, 25 Jan 2017, David Bremner wrote:
>> Tomi Ollila writes:
>>
>>>
>>> Why would message_id not be useful to regex match? I can come up with
>>> quite a few use cases... but if there are technical difficulties... then
>>> that should be mentioned instead.
>>
>> I'll have a look. Since the first version of this patch (when that
>> message was written), people have actually asked for some kind of
>> wildcard matching of message-ids.
>
> Theoretically "/" is an acceptable character in message-ids [1]. Rare,
> unlikely, but acceptable. Searching for message-ids beginning with "/"
> would have to use regexps, which would break in all sorts of ways
> throughout the stack. I don't think there are handy alternatives to
> "//", given the characters that are acceptable in message-ids,
> but this is something to think about.
>
> For example, could the regexp matcher for message-ids first check if the
> "regexp" is a strict match with "/" and all, and accept those? This
> might be a reasonable workaround if it can be made to work.
>
> [1] https://tools.ietf.org/html/rfc2822#section-3.2.4
>
>>> maybe this commit message should inform that xapian with field processors
>>> (1.4.x) is required for this feature -- and emphasize it a bit better in
>>> the manual page?
>>>
>>> Probably '//' is used to escape '/' -- should such a character ever be
>>> needed in regex search.
>>>
>>
>> Currently no escaping is needed because it only looks at the first and
>> last characters of the string (the usual xapian/shell rules mean that ""
>> might be needed).
>>
>> The following seem to work as hoped
>>
>> # match a / with a space before it
>>
>>     % notmuch search 'subject:"/ //"'
>>
>> # just a slash
>>
>>     % notmuch search subject:///
>>
>> # anchored slash
>>
>>     % notmuch search subject:/^//
>>
>> The trailing slash is actually decorative, we could drop it. Actually
>> *blush* I just noticed the current code is missing something from this line
>>
>>     if (str.at (0) == '/' && str.at (str.size () - 1)){
>>
>> _if_ that line is fixed, then it will have the slightly odd behaviour of
>>
>>     subject:/blah
>>
>> doing a non-regex search
>>
>> We could also throw an error for that case, maybe that's the best option.
>
> I'd go with an error.
> It's easy to loosen the rules later on if we
> decide that's a good idea. Much harder to accept loose rules now, let
> users get used to it, and try to tighten the rules if we realize we'd
> need that for some reason.

I agree -- should we allow trailing slash ('/') without first char also
being '/' (e.g. subject:blah/)?

Tomi

>
> BR,
> Jani.
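
For illustration only, here is a minimal sketch of the strict rule discussed
above: a term counts as a regexp only when it both starts and ends with '/',
and a term with just one of the two delimiters is an error. The names
TermKind, classify_term and regex_body are hypothetical and are not taken
from the patch. Rejecting a trailing-only slash (subject:blah/) is an
assumption made for the sketch; the thread leaves that question open.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical names throughout; this is not the code from the patch.
enum class TermKind { Plain, Regex };

// Classify a field value such as the part after "subject:".
//   "/.../" with both delimiters present  -> Regex
//   exactly one delimiter ("/blah", "blah/", or a lone "/") -> error
//   anything else                          -> Plain
TermKind classify_term (const std::string &str)
{
    if (str.empty ())
        return TermKind::Plain;

    const bool leading = str.front () == '/';
    const bool trailing = str.back () == '/';

    // size >= 2 ensures the two delimiters are distinct characters.
    if (leading && trailing && str.size () >= 2)
        return TermKind::Regex;

    if (leading || trailing)
        throw std::invalid_argument ("unbalanced '/' in term: " + str);

    return TermKind::Plain;
}

// Strip the delimiters from a term already classified as Regex.
// Only the first and last characters are removed, so no escaping of
// inner '/' characters is needed.
std::string regex_body (const std::string &str)
{
    return str.substr (1, str.size () - 2);
}
```

Under this rule the examples quoted above keep their meaning: "///" yields
the regexp "/", and "/^//" yields "^/", while "/blah" is rejected instead of
silently falling back to a non-regex search.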