From: Artur Malabarba
Newsgroups: gmane.emacs.devel
Subject: RE: On language-dependent defaults for character-folding
Date: Sat, 13 Feb 2016 16:15:25 -0200
Reply-To: bruce.connor.am@gmail.com
To: Drew Adams
Cc: Óscar Fuentes, Eli Zaretskii, Juri Linkov, emacs-devel
In-Reply-To: <7fbb8bc7-9a97-4bad-a103-a6690a35241d@default>


On 13 Feb 2016 3:20 pm, "Drew Adams" <drew.adams@oracle.com> wrote:
>
> > The implementation should really be on the C level, like the
> > case-folding support.  The current implementation isn't, and
> > therefore has several disadvantages some of which were already
> > pointed out...
>
> I would like to see a list of the disadvantages laid out clearly.

See a thread here called “Char-folding: how can we implement matching multiple characters as a single "thing"?”.

In summary, char folding was generating regexps that were too long for Emacs to handle.

The best solution we reached was to make char folding dumber, so that the resulting regexps wouldn't grow exponentially.
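
As a rough illustration of how such growth can happen (a Python sketch, not the actual Emacs code): assume a hypothetical fold table in which some two-character sequences are equivalent to a single character, such as "ff" and the ligature "ﬀ". If the generated pattern lists every way of splitting the input into such pieces, the number of branches grows like the Fibonacci numbers, while a "dumber" pattern with one character class per input character grows linearly. The tables `SINGLE` and `MULTI` and all function names below are made up for the example.

```python
import re

# Hypothetical fold tables, for illustration only (not Emacs's real data).
SINGLE = {"a": "aáàâä", "e": "eéèêë"}   # character -> equivalent characters
MULTI = {"ff": "ﬀ", "fi": "ﬁ"}          # two-char sequence -> one-char equivalent


def simple_fold_regex(text):
    """'Dumb' folding: one character class per character, linear size."""
    parts = []
    for ch in text:
        alts = SINGLE.get(ch, ch)
        if len(alts) > 1:
            parts.append("[" + "".join(re.escape(c) for c in alts) + "]")
        else:
            parts.append(re.escape(alts))
    return "".join(parts)


def segmentations(text):
    """Yield every way to split text into single characters and MULTI keys."""
    if not text:
        yield []
        return
    for rest in segmentations(text[1:]):
        yield [text[0]] + rest
    for key in MULTI:
        if text.startswith(key):
            for rest in segmentations(text[len(key):]):
                yield [key] + rest


def naive_multi_fold_regex(text):
    """Naive multi-character folding: one alternation branch per split."""
    branches = []
    for seg in segmentations(text):
        branches.append("".join(
            re.escape(MULTI[piece]) if piece in MULTI else simple_fold_regex(piece)
            for piece in seg
        ))
    return "(?:" + "|".join(branches) + ")"


if __name__ == "__main__":
    # Compare pattern sizes as the input gets longer.
    for n in (4, 8, 16):
        text = "f" * n
        print(n, len(simple_fold_regex(text)), len(naive_multi_fold_regex(text)))
```

Running the script prints the two pattern lengths side by side for inputs of 4, 8 and 16 characters, which makes the difference in growth rate visible.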

The C-level implementations of char folding that have been discussed wouldn't have this problem because they wouldn't need regexps.
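
To make the idea of folding without regexps concrete, here is a minimal Python sketch. It is only an assumption about the general shape of such an approach, not any of the C-level designs that were discussed: each character is translated to a base character before comparison, so no pattern is ever built and nothing can grow too long. The helper names `fold_char` and `fold_search` are hypothetical, and the fold itself is crude (it keeps the first code point of the NFD decomposition, which is only sensible for accented Latin letters).

```python
import unicodedata


def fold_char(ch):
    """Crude per-character fold: first code point of the NFD decomposition."""
    return unicodedata.normalize("NFD", ch)[0]


def fold_search(needle, haystack):
    """Return the index of the first folded match of needle, or -1.

    Both strings are translated character by character and then compared
    directly. Limitation: text that is already decomposed (a base letter
    followed by a separate combining mark) is not handled by this sketch.
    """
    n = [fold_char(c) for c in needle]
    h = [fold_char(c) for c in haystack]
    for i in range(len(h) - len(n) + 1):
        if h[i:i + len(n)] == n:
            return i
    return -1


if __name__ == "__main__":
    print(fold_search("cafe", "un café noir"))
```

The cost here depends only on the lengths of the two strings, not on how many equivalents each character has.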

Even with the current solution, char folding can still produce regexps that are too long if the input string is very long (which it handles by falling back on regular, non-folding search).
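
The fallback can be sketched as follows, in Python rather than Emacs Lisp and under stated assumptions: the fold table `FOLD`, the limit `MAX_PATTERN_LEN` and the function names are all hypothetical, chosen only to show the control flow. If the folded pattern would exceed the limit, the search silently uses the literal string instead.

```python
import re

# Hypothetical fold table and length limit, for illustration only.
FOLD = {"a": "aáàâä", "e": "eéèêë", "o": "oóòôö"}
MAX_PATTERN_LEN = 200


def fold_pattern(text):
    """Build a folding pattern: one character class per foldable character."""
    return "".join(
        "[" + FOLD[ch] + "]" if ch in FOLD else re.escape(ch)
        for ch in text
    )


def search(needle, haystack):
    """Search with folding, falling back to a literal search when too long.

    Returns (match_or_None, folded), where folded tells the caller whether
    the folding pattern was actually used.
    """
    pattern = fold_pattern(needle)
    folded = len(pattern) <= MAX_PATTERN_LEN
    if not folded:
        pattern = re.escape(needle)  # plain literal search, no folding
    return re.search(pattern, haystack), folded


if __name__ == "__main__":
    match, folded = search("cafe", "un café noir")
    print(match.group(), folded)
```

Note the user-visible consequence: a long enough search string stops matching accented variants without any error, which is part of why this counts as a disadvantage.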

A second disadvantage is that you can't do char folding for regexp searches (though I can't tell how common that would be).