From: Tomi Ollila
To: David Bremner, notmuch@notmuchmail.org
Subject: Re: [Patch v4] lib: regexp matching in 'subject' and 'from'
In-Reply-To: <87efz8vz0w.fsf@rocinante.cs.unb.ca>
References: <20170121032752.6788-1-david@tethera.net> <20170121135917.22062-1-david@tethera.net> <87efzqef2r.fsf@tethera.net> <87ziia2jpj.fsf@nikula.org> <87efz8vz0w.fsf@rocinante.cs.unb.ca>
List-Id: "Use and development of the notmuch mail system."
X-List-Received-Date: Thu, 09 Feb 2017 16:16:49 -0000

On Thu, Feb 09 2017, David Bremner wrote:

> Jani Nikula writes:
>
>> Theoretically "/" is an acceptable character in message-ids [1]. Rare,
>> unlikely, but acceptable.
>> Searching for message-id's beginning with "/"
>> would have to use regexps, which would break in all sorts of ways
>> throughout the stack. I don't think there are handy alternatives to
>> "//", given the characters that are acceptable in message-ids,
>> but this is something to think about.
>
> Would telling the user to \ escape ( or double /) the initial / be good

A while ago I thought about this double // but dismissed it quickly
(re-searching just for single quote can be useful...)

In the rare cases where anyone needs to disable regex processing, imo this \
is the best idea I've (not) come up with.

Some command line testing with (and without) quoting:

$ printf %s\\n id:\/some/crazy/message-id
id:/some/crazy/message-id

$ printf %s\\n "id:\/some/crazy/message-id"
id:\/some/crazy/message-id

$ printf %s\\n 'id:\/some/crazy/message-id'
id:\/some/crazy/message-id

$ printf %s\\n id:\\/some/crazy/message-id
id:\/some/crazy/message-id

$ printf %s\\n "id:\\/some/crazy/message-id"
id:\/some/crazy/message-id

$ printf %s\\n 'id:\\/some/crazy/message-id'
id:\\/some/crazy/message-id

so:

$ printf %s\\n 'id:"\/some/crazy/message-id with spaces"'
id:"\/some/crazy/message-id with spaces"

> enough there? This would disable regex processing. I guess this goes
> back to someone's earlier suggestion. A third option would be to use
> single quotes there ("id:'/foo'"), but that isn't really consistent with
> either Xapian or usual regex conventions.

$ printf %s\\n 'id:"'\''/foo with spaces ;D'\''"'
id:"'/foo with spaces ;D'"

or, perhaps this is clearer >;)

$ printf %s\\n 'id:"'"'"'/foo with spaces ;D'"'"'"'
id:"'/foo with spaces ;D'"

>
> So I guess my favourite idea ATM is to use id:\/some/crazy/message-id
> FWIW, I don't have any such message ids.
>
>> For example, could the regexp matcher for message-ids first check if the
>> "regexp" is a strict match with "/" and all, and accept those? This
>> might be a reasonable workaround if it can be made to work.
>
> We're building a query, so I think the equivalent is to make an OR, with
> the exact match and the regex posting source. That could be done,
> although I'm a bit uneasy about how this makes the syntax for id:
> different, so id:/foo would be legit, but from:/foo would be an error.
> Maybe the dwim-factor is worth it.
>
> d
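
For illustration only, here is a minimal sketch of the escaping convention discussed in this thread: a value starting with \/ is taken as a literal message-id beginning with "/", a value delimited by slashes is taken as a regex, and a leading slash with no closing slash is rejected. The function name classify_id_value and the "literal:", "regex:" and "error:" output prefixes are hypothetical, made up for this example; this is not how notmuch parses queries, only a shell model of the proposed rule.

```shell
#!/bin/sh
# Hypothetical helper (not part of notmuch): classify the value of an id:
# term under the "\/ disables regex processing" convention from this thread.
classify_id_value() {
    case $1 in
        '\/'*)
            # Leading backslash escapes the slash: drop the first character
            # (the backslash) and treat the rest as a literal message-id.
            printf 'literal:%s\n' "${1#?}"
            ;;
        /?*/)
            # Delimited by slashes on both ends: treat the inside as a regex.
            v=${1#/}
            v=${v%/}
            printf 'regex:%s\n' "$v"
            ;;
        /*)
            # Leading slash without a closing slash: report an error.
            printf 'error:%s\n' "$1"
            ;;
        *)
            # Anything else is an ordinary literal message-id.
            printf 'literal:%s\n' "$1"
            ;;
    esac
}

classify_id_value '\/some/crazy/message-id'
classify_id_value '/foo.*bar/'
classify_id_value '/foo'
```

The single quotes in the calls matter for the same reason as in the printf tests above: unquoted, the shell would consume the backslash before the function ever saw it.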