From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Drew Adams"
Newsgroups: gmane.emacs.bugs
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Thu, 6 Dec 2012 07:59:59 -0800
To: "'martin rudalics'"
Cc: perin@panix.com, 13041@debbugs.gnu.org, perin@acm.org
In-reply-to: <50C07410.8060705@gmx.at>

> > We are using compatibility normalization, not canonical
> > normalization.  So a search (or a string comparison test)
> > for `f' will match the ligature `ffi'
> > (whereas it would not match wrt canonical normalization).
>
> If it can be done, searching for "f" should match ligatures like "ff"
> and "fi".

That's what I thought you were planning/preparing to do.

On the other hand, as the Unicode spec points out (for level 2),
sometimes someone wants to distinguish searching for f from searching
for the ligature.  Ideally (we might never get there), that would be
possible as an alternative (choice).

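To make the compatibility-vs-canonical point concrete, here is a minimal
sketch in Python (not Emacs Lisp, and not a proposal for how Emacs should
implement it).  The helper name `contains` is made up for illustration; it
just normalizes both strings with the standard `unicodedata` module before
doing a substring test.

```python
import unicodedata

# U+FB03 LATIN SMALL LIGATURE FFI: it has a compatibility decomposition
# to "ffi" but no canonical decomposition.
LIGATURE_FFI = "\ufb03"


def contains(needle, haystack, form):
    """Hypothetical helper: substring test after normalizing both sides.

    `form` is a Unicode normalization form name such as "NFD" (canonical)
    or "NFKD" (compatibility).
    """
    norm_needle = unicodedata.normalize(form, needle)
    norm_haystack = unicodedata.normalize(form, haystack)
    return norm_needle in norm_haystack


text = "o" + LIGATURE_FFI + "ce"  # "office" spelled with the ligature

# Compatibility normalization expands the ligature, so "f" is found.
print(contains("f", text, "NFKD"))

# Canonical normalization leaves the ligature alone, so "f" is not found.
print(contains("f", text, "NFD"))
```

Offering the normalization form as a parameter is one way to picture the
"alternative (choice)" above: the same search either sees through the
ligature or treats it as a distinct character.
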
The spec also points to hybrid situations regarding case conversion (see
sect RL2.4) where, e.g., you might want to do full case matching on ß in
a literal name such as Strauß but simple case folding on ß when used in
a character class, such as [ß].  Dunno whether we would ever get there
either.

There seems to be a lot in the Unicode regexp spec
(http://www.unicode.org/reports/tr18/) that could be food for thought for
Emacs.  I imagine that some Emacs Dev folks have already taken a close
look and given it some thought.
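
For the ß case mentioned above, a rough Python sketch of the difference
between full and simple case matching.  Both function names are invented
for illustration.  Python has `str.casefold()` for full case folding but
no built-in simple case folding, so `str.lower()` is used here as a
stand-in; that is an assumption that holds for ß (which it leaves as a
single character) but is not exact for every character.

```python
def full_case_match(a, b):
    """Hypothetical helper: compare using full case folding.

    str.casefold() implements full case folding, which maps "ß" to "ss".
    """
    return a.casefold() == b.casefold()


def simple_case_match(a, b):
    """Hypothetical helper: approximate simple (one-to-one) case folding.

    Assumption: str.lower() stands in for simple case folding.  It keeps
    "ß" as "ß", which matches the simple-folding behavior for that
    character, but it is only an approximation in general.
    """
    return a.lower() == b.lower()


# Full case matching: the literal name matches its all-caps spelling.
print(full_case_match("Strauß", "STRAUSS"))

# Simple case folding: "ß" stays one character, so the two differ.
print(simple_case_match("Strauß", "STRAUSS"))
```
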