From: Artur Malabarba
Reply-To: bruce.connor.am@gmail.com
Newsgroups: gmane.emacs.devel
Subject: Re: Questions about isearch
Date: Fri, 27 Nov 2015 16:55:45 +0000
To: Eli Zaretskii
Cc: emacs-devel
In-Reply-To: <83bnafse4f.fsf@gnu.org>

On 27 Nov 2015 2:36 pm, "Eli Zaretskii" <eliz@gnu.org> wrote:
> > It does for me. In this very buffer, if I isearch for 'f' I can get to
> > the ligature above.
>
> Right, it does.  I think I tried "ff", not "f".  Is that supposed to
> work?

No. We don't support having multiple characters in the search string match a single character in the buffer.

This is a design limitation. We can (and should) discuss improving this. But for now I think it should be documented as not supported.

> > >> > 2. It also doesn't match ä (a single character) with ä (2 characters,
> > >> > which Emacs correctly composes into 1 grapheme cluster). Should it?
> >
> > Done now.
>
> Thanks.
>
> But if this now works, why doesn't "ff" find ﬀ or vice versa?  Isn't
> that the same case?

No. Each one is a different scenario here.

- "ff" not finding ﬀ (U+FB00) is a case of multiple chars in the search string that can't be collapsed into a single thing (see above). It's the same reason why 'ä' (a followed by U+0308) still doesn't match ä (U+00E4).
- ä now finds 'ä', because that is exactly its decomposition.
- ﬀ doesn't find "ff", because the decomposition of ﬀ is not exactly (f f); it's actually (compat f f). This was a decision, not a limitation.
I figured that a character should only match its decomposition if the decomposition is strictly made of chars. Otherwise you get things like ¹ matching 1 (which I thought we didn't want).
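
To make the three cases above concrete, here is a rough sketch of the rule in Python rather than Emacs Lisp. It is not the isearch code: it uses Python's `unicodedata` module to read the same Unicode decomposition data, and the helper names `canonical_decomposition` and `search_char_matches` are made up for this illustration. A decomposition that starts with a tag such as `<compat>` or `<super>` is treated as "not strictly made of chars".

```python
import unicodedata


def canonical_decomposition(ch):
    """Return the decomposition of ch as a string, or None.

    None means ch has no decomposition, or only a tagged one
    (for example <compat> or <super>), which this sketch rejects.
    """
    fields = unicodedata.decomposition(ch).split()
    if not fields or fields[0].startswith("<"):
        return None
    return "".join(chr(int(code, 16)) for code in fields)


def search_char_matches(ch, text):
    """Hypothetical predicate: does the single search char ch match text?

    It matches itself, or its decomposition when that decomposition
    is untagged. Multi-char search strings are not handled at all,
    mirroring the design limitation described earlier.
    """
    return text == ch or text == canonical_decomposition(ch)


# U+00E4 (a with diaeresis), U+FB00 (ff ligature), U+00B9 (superscript one)
for ch in ("\u00e4", "\ufb00", "\u00b9"):
    print(f"U+{ord(ch):04X}",
          repr(unicodedata.decomposition(ch)),
          canonical_decomposition(ch) is not None)
```

Under this rule U+00E4 matches a followed by U+0308, while U+FB00 does not match "ff" and U+00B9 does not match "1", because both of the latter carry a tag in their decomposition.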
