From: Juri Linkov
Newsgroups: gmane.emacs.devel
Subject: Re: search-default-mode char-fold-to-regexp and Greek Extended block characters
Date: Fri, 26 Jul 2019 21:38:30 +0300
Organization: LINKOV.NET
Message-ID: <878sskybpi.fsf@mail.linkov.net>
References: <87r26gv6k2.fsf@mail.linkov.net> <87blxj3u4e.fsf@mail.linkov.net> <87ef2f0xx3.fsf@tcd.ie> <834l3ium3f.fsf@gnu.org> <83wogduc41.fsf@gnu.org> <83h87cpzml.fsf@gnu.org> <87a7d2asu3.fsf@mail.linkov.net> <87v9vp23g0.fsf@mail.linkov.net>
In-Reply-To: (Robert Pluim's message of "Fri, 26 Jul 2019 13:09:27 +0200")
To: emacs-devel@gnu.org

> Juri> If there are many such cases, then better to handle them automatically indeed
> Juri> (if this doesn't cause slowdown too much) instead of adding them one by one
> Juri> to the default values.  Does this handle ß as well?
>
> There are 74, and I donʼt want to maintain such a list by hand :-).

Yes, 74 is too tedious to maintain by hand, so better to install
your previous patch (if it doesn't have the problem mentioned below),
since there are only 3 such complex characters (handled by your newer patch),
which are easy to add by hand:

  '((?ß "ss")
    (?ΐ "ΐ")
    (?ΰ "ΰ"))

> ß is not a complex character, so is never looked at here. But if we
> hoist the checking out of the loop over complex characters, we can
> make that work as well (this supersedes my previous patch).
>
> I have no idea of the performance impact of all this.
> [...]
> + (aset equiv (aref roundtrip 0)
> +       (cons str (aref equiv (aref roundtrip 0))))))

It seems this adds a symmetric decomposition from the first character of "ss",
i.e. from ?s to "ß".  Shouldn't this rather update 'equiv-multi' instead?

OTOH, I see no reason to add symmetric decompositions by default
since they are handled by the option 'char-fold-symmetric'.
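
For anyone who wants to double-check the three hand-written entries listed above without starting Emacs, here is a rough sketch in Python. It is not the char-fold.el code and says nothing about how 'equiv' or 'equiv-multi' are filled; it only consults the Unicode data through the standard unicodedata module. The helper name full_decomposition and the dict name candidates are made up for illustration. It should show that the two Greek strings are the full canonical decompositions of U+1FD3 and U+1FE3, while ß has no canonical decomposition, so its "ss" entry cannot come from decomposition data.

```python
import unicodedata


def full_decomposition(ch: str) -> str:
    """Return the full canonical (NFD) decomposition of one character."""
    return unicodedata.normalize("NFD", ch)


# The three characters from the list above, written as escapes so that
# precomposed and decomposed forms cannot be confused in an editor.
candidates = {
    "\u00df": "ss",                  # LATIN SMALL LETTER SHARP S
    "\u1fd3": "\u03b9\u0308\u0301",  # iota + combining diaeresis + combining acute
    "\u1fe3": "\u03c5\u0308\u0301",  # upsilon + combining diaeresis + combining acute
}

for ch, target in candidates.items():
    nfd = full_decomposition(ch)
    code_points = " ".join(f"U+{ord(c):04X}" for c in nfd)
    # True only when the listed string is what the Unicode data yields.
    print(f"U+{ord(ch):04X}: NFD = {code_points}; matches entry: {nfd == target}")
```

The Greek entries match their NFD forms, whereas ß is returned unchanged, so only the Greek ones could in principle be derived automatically.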