From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: RE: On language-dependent defaults for character-folding
Date: Sat, 13 Feb 2016 10:26:34 -0800 (PST)
References: <87mvr9wxqz.fsf@wanadoo.es> <87io1xwq1e.fsf@wanadoo.es>
 <87vb5wvzfz.fsf@mail.linkov.net> <87io1wt4cc.fsf@wanadoo.es>
 <8737syoima.fsf@mail.linkov.net> <871t8iu277.fsf@wanadoo.es>
 <83d1s28kvh.fsf@gnu.org> <87r3gis7sm.fsf@wanadoo.es>
 <83twle71xy.fsf@gnu.org> <87io1us0te.fsf@wanadoo.es>
 <83pow26svf.fsf@gnu.org> <87a8n5srbp.fsf@wanadoo.es>
 <83d1s17npz.fsf@gnu.org> <87oablfpn3.fsf@mail.linkov.net>
 <834mdd6llx.fsf@gnu.org> <7fbb8bc7-9a97-4bad-a103-a6690a35241d@default>
Cc: Óscar Fuentes, Eli Zaretskii, Juri Linkov, emacs-devel
To: bruce.connor.am@gmail.com
Xref: news.gmane.org gmane.emacs.devel:199896

> > The implementation should really be on the C level, like the
> > case-folding support. The current implementation isn't, and
> > therefore has several disadvantages some of which were already
> > pointed out...
>
> I would like to see a list of the disadvantages laid out clearly.

See a thread here called “Char-folding: how can we implement matching multiple characters as a single "thing"?”.

In summary, char folding was generating regexps that were too long for Emacs to handle.

The best solution we reached was to make char folding dumber, so that the resulting regexps wouldn't grow exponentially.

The C-level implementations of char folding that have been discussed wouldn't have this problem because they wouldn't need regexps.

Even with the current solution, char folding can still produce too long regexps if the input string is very long (which it handles by falling back on regular search).

A second disadvantage is that you can't do char folding for regexp searches (though I can't tell how common that would be).

Yes, I read that part of the thread. But thanks for the reminder.
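
For readers who have not followed that thread, here is a rough Python sketch of the regexp-length problem and the fallback described above. It is not the Emacs implementation: FOLD_TABLE, MAX_REGEXP_LENGTH, fold_to_regexp and folding_search are all hypothetical names invented for illustration, and the table is a tiny stand-in for the Unicode-derived one. It models only the "dumber" per-character folding, where each input character becomes one character class, so the regexp grows linearly with the input but can still exceed a size limit.

```python
import re

# Hypothetical folding table: each base character maps to the characters
# that should also match it. This tiny table is for illustration only.
FOLD_TABLE = {
    "a": "aàáâãä",
    "e": "eèéêë",
    "o": "oòóôõö",
}

# Hypothetical cap standing in for the regexp size limit discussed above.
MAX_REGEXP_LENGTH = 200


def fold_to_regexp(text):
    """Turn each character into a character class of its equivalents."""
    parts = []
    for ch in text:
        equivalents = FOLD_TABLE.get(ch)
        if equivalents:
            parts.append("[" + equivalents + "]")
        else:
            # Characters without a table entry must match literally.
            parts.append(re.escape(ch))
    return "".join(parts)


def folding_search(pattern_text, haystack):
    """Return the index of the first match, or -1 if there is none.

    Falls back to a plain literal search (no folding) when the
    generated regexp would be longer than MAX_REGEXP_LENGTH.
    """
    regexp = fold_to_regexp(pattern_text)
    if len(regexp) > MAX_REGEXP_LENGTH:
        return haystack.find(pattern_text)
    match = re.search(regexp, haystack)
    return match.start() if match else -1
```

In this sketch every folded character costs several regexp characters, so a long enough search string crosses the cap and silently loses folding, which is the behaviour the fallback paragraph above describes. A matcher that folds characters directly while comparing, as in the C-level proposals, would not need to build a regexp at all.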