From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Questions about isearch
Date: Wed, 25 Nov 2015 22:10:53 +0200
Message-ID: <83d1uxx2k2.fsf@gnu.org>
References: <83lh9lx6oi.fsf@gnu.org> <87egfdant7.fsf@gmx.us>
In-reply-to: <87egfdant7.fsf@gmx.us>
To: Rasmus
Cc: emacs-devel@gnu.org

> From: Rasmus
> Date: Wed, 25 Nov 2015 20:20:20 +0100
>
> > 1. Character folding doesn't catch ligatures, such as æ (should it match
> >    the two characters "ae")?
>
> In Danish I would not consider this a ligature, but a separate letter.  It
> can be written as ae, however.  Thus, it would probably be nice to match
> it via ’ae’.  But where to stop?  How about ’å’ (matched by ’a’)?  Should
> it be captured by "aa"?  Ø by ’oe’?  There’s also ’œ’...
>
> Probably there’s lots of these weird cases.

Please read the node "Lax Search" in the Emacs manual.  That ship
sailed several months ago, and Emacs already supports "character
folding", and thus yes, 'a' matches 'å' (and also 'ä' and 'á' and
'ǎ' and many others).  We don't make these matches language
dependent, because Emacs is a multi-lingual environment, and most text
is not tagged with a particular language.  So we use
language-independent folding, and AFAIU "ae" should have matched 'æ'
under the rules we use.  But it doesn't.  (Similarly "ff" and 'ﬀ' and
others.)

> > 2. It also doesn't match ä (a single character) with ä (2 characters,
> >    which Emacs correctly composes into 1 grapheme cluster).  Should it?
>
> This reminds me: UTF-8 "stroked through a" (a̶) is also displayed as ä
> rather than the stroke through a Emacs on my system.  But this is probably
> a different issue.

Display is a different issue, indeed.
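
For illustration only, here is a minimal Python sketch of the Unicode facts behind the two questions in this message. It is not Emacs code and makes no claim about how Emacs implements character folding. It assumes a folding scheme built on Unicode decomposition data, and the helper name `fold` is hypothetical. It uses only the standard `unicodedata` module.

```python
import unicodedata


def fold(text):
    """Hypothetical helper: NFKD-decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


precomposed = "\u00e4"  # a-diaeresis as a single code point
combining = "a\u0308"   # 'a' followed by COMBINING DIAERESIS

# Question 2: the two spellings are canonically equivalent, so
# normalizing either one yields the other.
assert unicodedata.normalize("NFC", combining) == precomposed
assert unicodedata.normalize("NFD", precomposed) == combining

# Accented letters decompose into a base letter plus combining marks.
assert fold("\u00e5") == "a"   # a with ring above
assert fold("\u01ce") == "a"   # a with caron

# Question 1: the ff ligature (U+FB00) has a compatibility decomposition...
assert fold("\ufb00") == "ff"

# ...but U+00E6 (ae) has no decomposition at all in the Unicode data,
# so a purely decomposition-based scheme leaves it unchanged.
assert unicodedata.decomposition("\u00e6") == ""
assert fold("\u00e6") == "\u00e6"
```

Under that assumption, "ae" and 'æ' would need an explicit extra mapping, whereas the precomposed and combining spellings of ä could be unified by normalization alone.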