From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from localhost (localhost [127.0.0.1])
 by arlo.cworth.org (Postfix) with ESMTP id DC0176DE0BA5
 for ; Mon, 13 Nov 2017 09:47:10 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at cworth.org
X-Spam-Flag: NO
X-Spam-Score: -0.001
X-Spam-Level:
X-Spam-Status: No, score=-0.001 tagged_above=-999 required=5
 tests=[AWL=0.010, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01]
 autolearn=disabled
Received: from arlo.cworth.org ([127.0.0.1])
 by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id VacFa29WCH6W for ;
 Mon, 13 Nov 2017 09:47:09 -0800 (PST)
Received: from fethera.tethera.net (fethera.tethera.net [198.245.60.197])
 by arlo.cworth.org (Postfix) with ESMTPS id 9C5AF6DE0B64
 for ; Mon, 13 Nov 2017 09:47:09 -0800 (PST)
Received: from remotemail by fethera.tethera.net with local (Exim 4.89)
 (envelope-from ) id 1eEIp6-0008Uz-Av; Mon, 13 Nov 2017 12:47:04 -0500
Received: (nullmailer pid 2066 invoked by uid 1000);
 Mon, 13 Nov 2017 17:47:01 -0000
From: David Bremner
To: Stefano Zacchiroli
Cc: Bruno Deremble , notmuch@notmuchmail.org
Subject: Re: accented characters
In-Reply-To: <20171113143515.5hbnsma72r24qutf@upsilon.cc>
References: <87h8tz8b2v.fsf@ens.fr> <87efp2b9er.fsf@tethera.net>
 <20171113143515.5hbnsma72r24qutf@upsilon.cc>
Date: Mon, 13 Nov 2017 13:47:01 -0400
Message-ID: <8760ae6pgq.fsf@tesseract.cs.unb.ca>
MIME-Version: 1.0
Content-Type: text/plain
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Nov 2017 17:47:11 -0000

Stefano Zacchiroli writes:

>
> Unicode has a notion of canonical form that rearrange accented
> characters in a sequence of non-accented characters + modifiers
> https://en.wikipedia.org/wiki/Unicode_equivalence . A bunch of libraries
> use that stuff to normalize-away accents in unicode strings. I'm aware
> of a few in Python for instance, but not in C++ (which I believe is what
> you'd be interested in).
>

Apropos, Rob Browning started looking at canonicalization using glib in

id:1440951676-17286-1-git-send-email-rlb@defaultvalue.org
http://article.gmane.org/gmane.mail.notmuch.general/21004
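
For illustration only, here is a minimal sketch of the decomposition
approach described in the quoted text, in Python since the quote mentions
Python libraries. It uses the standard unicodedata module; the name
strip_accents is a hypothetical helper, not something taken from notmuch
or from the patch referenced above.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: drop accents via canonical decomposition.

    NFD splits each accented character into a base character followed
    by combining marks; characters with a non-zero combining class are
    then filtered out.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(strip_accents("Zürich"))
```

Note that this only removes marks that decompose canonically; letters
such as "ø" or "ß" have no such decomposition and pass through unchanged.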