From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 11 Mar 2019 17:05:08 -0700
From: Matt Armstrong
To: Carl Worth, Gregor Zattler, notmuch@notmuchmail.org
Subject: Re: how to search for hyphenated words? (was: how to search for Morse code?)
In-Reply-To: <87a7i4c3t5.fsf@wondoo.home.cworth.org>
References: <87muui87om.fsf@len.workgroup> <87ef7hyxqs.fsf@len.workgroup> <87a7i4c3t5.fsf@wondoo.home.cworth.org>
List-Id: "Use and development of the notmuch mail system."

Carl Worth writes:

> Hi Gregor,
>
> The trick here is that when notmuch is indexing body text it feeds it
> into a Xapian function that parses the text by finding "terms" in the
> text. And this parser considers both punctuation and whitespace as
> separators between terms.

I notice that Xapian supports something called "phrase searches",
documented as:

    "A phrase surrounded with double quotes ("") matches documents
    containing that exact phrase. Hyphenated words are also treated as
    phrases, as are cases such as filenames and email addresses (e.g.
    /etc/passwd or president@whitehouse.gov)."
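
As a rough illustration of the two behaviours described above (terms separated by punctuation and whitespace, and a phrase matching only when its terms appear next to each other in order), here is a toy model in Python. This is not Xapian's code or API; the function names `terms` and `phrase_match` are made up for this sketch, and the ASCII-only term pattern is an assumption.

```python
import re


def terms(text):
    """Split text into lowercase terms.

    Toy rule: anything that is not an ASCII letter or digit
    (punctuation, whitespace) separates terms.
    """
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def phrase_match(doc, phrase):
    """Return True if the terms of `phrase` occur adjacently, in order, in `doc`."""
    d, p = terms(doc), terms(phrase)
    if not p:
        return False
    # Slide a window of len(p) terms over the document's term list.
    return any(d[i:i + len(p)] == p for i in range(len(d) - len(p) + 1))
```

In this toy model "well-known" and "/etc/passwd" each become two adjacent terms, so a phrase match requires them in that order; it also means "well known" and "well-known" are indistinguishable, since the hyphen itself is never stored.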
I assume that this particular Xapian feature is unavailable in notmuch?
If so, I wonder whether enabling it has ever been considered.

Being able to "drop down" to do things like exact phrase matches is one
reason why I use notmuch, because the precision sometimes matters. I
currently do this by fetching the mail message itself and using
old-school mail processing tools on the message file.
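
A minimal sketch of that kind of post-filtering, assuming Python's standard `email` module stands in for the "old-school" tools: it extracts the text/plain parts of a raw message and checks for a literal phrase, hyphens included. The helper names `body_text` and `contains_exact` are hypothetical, and collapsing whitespace before comparing is a design assumption so that a phrase wrapped across lines still matches.

```python
import email
from email import policy


def body_text(raw: bytes) -> str:
    """Return the concatenated text/plain parts of a raw RFC 5322 message."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        # Multipart containers have a multipart/* type and are skipped here.
        if part.get_content_type() == "text/plain":
            parts.append(part.get_content())
    return "\n".join(parts)


def contains_exact(raw: bytes, phrase: str) -> bool:
    """Literal, case-sensitive substring test on the whitespace-normalized body."""
    text = " ".join(body_text(raw).split())
    needle = " ".join(phrase.split())
    return bool(needle) and needle in text
```

The candidate message files could come from a broader query such as `notmuch search --output=files <query>`, with each file's bytes then passed to `contains_exact` to keep only the precise matches.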