From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#22038: 25.1.50; Character folding issues with isearch
Date: Sat, 28 Nov 2015 19:40:26 +0200
Message-ID: <83bnaeowdx.fsf@gnu.org>
References: <87fuzqkszp.fsf@gmx.net> <83egfaoz7b.fsf@gnu.org> <877fl2kq1u.fsf@gmx.net>
In-reply-to: <877fl2kq1u.fsf@gmx.net>
To: Stephen Berman
Cc: 22038@debbugs.gnu.org

> From: Stephen Berman
> Cc: 22038@debbugs.gnu.org
> Date: Sat, 28 Nov 2015 18:10:53 +0100
>
> > (That's the only way I could parse "multiple characters matching a
> > single string".) We will have that, but it won't allow "ss" to match
> > "ß", unless you customize character-fold-table to include that. The
> > reason is that "ß" doesn't have any decompositions in the Unicode
> > database, so the default character-fold-table doesn't include any
> > expansions for it.
>
> This suggests to me that basing character folding solely on character
> decomposition is insufficient. From a user's point of view I see no
> reason why the search string "a" under character-folding matches "ä" but
> not e.g. "æ". Requiring a customization to get the latter strikes me as
> a user-unfriendly crutch to work around a deficient implementation. (I
> don't know if it's easy to improve, I'm just giving my impression as a
> user.)

Ease of implementation is not the most important issue here: there's a
more basic problem involved. Both "ß" vs "ss" and "æ" vs "a" (or "ae")
are language-specific: they are only valid matches in the context of
specific languages. AFAIU, that is why they are not in the Unicode
database. And we don't yet have language-specific text processing
capabilities and infrastructure (well, string-collate-lessp and
string-collate-equalp are a beginning, but only that). So allowing
those by default risks running afoul of what users want.

There are more language-specific foldings possible, outside of the
European languages. For example, folding of Arabic positional forms of
the same letter. These are at times much more important than the above
ligatures, and yet we don't support them yet, either.

In this initial release of such functionality I think it is prudent to
go by the standard, because we don't yet have any real-life experience
to build upon.
That doesn't cover every possible use case where a more radical
folding would be useful, but we had nothing in Emacs 24, so this is
still a large step in the right direction, IMO.

Let's not bite off more than we can chew.
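
To make the point about decompositions concrete, here is a minimal sketch in Python. It is not how Emacs builds character-fold-table; it only queries the Unicode Character Database through the standard `unicodedata` module. The helper name `base_letter` is hypothetical and introduced purely for illustration.

```python
import unicodedata


def base_letter(ch):
    """Return the first character of ch's Unicode decomposition,
    or None if the Unicode database lists no decomposition for ch.

    Hypothetical helper for illustration only; not part of Emacs.
    """
    decomp = unicodedata.decomposition(ch)
    if not decomp:
        return None
    # Drop compatibility tags such as "<compat>" and keep the code points.
    code_points = [p for p in decomp.split() if not p.startswith("<")]
    return chr(int(code_points[0], 16))


for ch in ("\u00e4", "\u00df", "\u00e6"):  # ä, ß, æ
    print(ch, repr(unicodedata.decomposition(ch)), base_letter(ch))
```

With this sketch, "ä" reports the decomposition "0061 0308" (the letter "a" followed by a combining diaeresis), whereas "ß" and "æ" report an empty decomposition. A fold table derived only from decompositions would therefore have an entry relating "a" to "ä" but nothing for the other two.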