From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Sat, 13 Feb 2016 01:57:33 +0200
Organization: LINKOV.NET
Message-ID: <87oablfpn3.fsf@mail.linkov.net>
To: Eli Zaretskii
Cc: Óscar Fuentes, emacs-devel@gnu.org
In-Reply-To: <83d1s17npz.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 12 Feb 2016 21:06:16 +0200")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.90 (x86_64-pc-linux-gnu)

> I was trying to develop a dialogue which will help me and you
> understand where your resistance begins and where it ends.  I think
> it's important to do that to better understand the issues, but if you
> don't want that, we can stop any moment.

Can't we somehow use the same char-folding as is implemented in
ICU String Search Service (this is also used for search in Chromium):

  http://userguide.icu-project.org/collation/icu-string-search-service

that supports matching of accented letters, conjoined letters,
and ignorable punctuation.

As is described in

  http://userguide.icu-project.org/collation/concepts

there are several levels of character matching:

1. Primary Level: differences between base characters
2. Secondary Level: Accents in the characters
3. Tertiary Level: Upper and lower case differences in characters
4. Quaternary Level: Punctuation is ignored
   (where e.g. snake-cased “black_bird” matches camel-cased “blackBird”)
5. Identical Level

Maybe our customization could provide options to choose between
all these levels?
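
To make the idea of selectable levels concrete, here is a rough sketch in Python.  It is not ICU and not Emacs code: it only approximates the first three levels with Unicode normalization from the standard `unicodedata` module, and the names `fold_key`, `matches` and the `ignore_punctuation` flag are made up for this illustration.  Real collation handles many cases (contractions, language tailoring) that this ignores.

```python
import unicodedata

# Hypothetical level constants, named after the ICU collation levels.
PRIMARY, SECONDARY, TERTIARY = 1, 2, 3


def fold_key(text, strength, ignore_punctuation=False):
    """Return a key under which strings equal at `strength` compare equal.

    Assumption: accents are modeled as combining marks after NFD
    decomposition, and case is modeled with str.casefold().  This is
    only an approximation of what a real collator does.
    """
    key = unicodedata.normalize("NFD", text)
    if strength <= SECONDARY:
        # Below the tertiary level, case differences are ignored.
        key = unicodedata.normalize("NFD", key.casefold())
    if strength == PRIMARY:
        # At the primary level, accents (combining marks) are ignored too.
        key = "".join(c for c in key if not unicodedata.combining(c))
    if ignore_punctuation:
        # Drop every character in a Unicode punctuation category (P*).
        key = "".join(c for c in key
                      if not unicodedata.category(c).startswith("P"))
    return key


def matches(a, b, strength, ignore_punctuation=False):
    """Compare two strings at the chosen level."""
    return (fold_key(a, strength, ignore_punctuation)
            == fold_key(b, strength, ignore_punctuation))
```

With this sketch, "résumé" and "Resume" match only at the primary level, "résumé" and "RÉSUMÉ" match at the primary and secondary levels, and "black_bird" matches "blackBird" once punctuation is ignored and case is folded.  A user option could then simply hold the chosen strength.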