From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: Char-folding: how can we implement matching multiple characters as a single "thing"?
Date: Mon, 30 Nov 2015 19:55:21 +0200
Message-ID: <837fkzmkxi.fsf@gnu.org>
References: <565C7E1F.10204@gmail.com>
In-reply-to: <565C7E1F.10204@gmail.com>
To: Clément Pit--Claudel
Cc: emacs-devel@gnu.org

> From: Clément Pit--Claudel
> Date: Mon, 30 Nov 2015 17:49:35 +0100
>
> * Extend the C-level implementation of regular expressions to make character-folding a capability of the C engine, allowing for succinct regexps + a flag.
> * Add a new backslash to regular expressions for matching char-folded characters (custom character categories almost already allow for this, so the implementation would probably be similar).

The correct implementation for character folding is the same as for
case-folding: on the C level of the search.c routines.  This was
discussed back when this feature was in the design phase.  However,
the undertaking of coding this in search.c was too complex, so we
decided to go with the current implementation in Lisp and live with
its limitations for the time being.

Volunteers are welcome to work on the ultimate solution, which should
indeed include normalization of both the search string and the
buffer/string text that is searched.

Thanks.
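
The idea of normalizing both the search string and the searched text can
be sketched roughly as follows.  This is only an illustrative Python
sketch, not the Emacs implementation and not a proposal for search.c.
The helper names fold and folded_search are hypothetical, and the sketch
assumes that "folding" means NFKD decomposition, dropping combining
marks, and case folding.  It keeps a map from folded positions back to
the original text, so that one original character (such as a ligature)
may stand for several folded characters.

```python
import unicodedata


def fold(text):
    """Fold TEXT (hypothetical helper).

    Returns (folded, index_map), where index_map[k] is the index in TEXT
    of the original character that produced folded[k].
    """
    folded = []
    index_map = []
    for i, ch in enumerate(text):
        # Compatibility decomposition: U+00E9 -> "e" + U+0301, U+FB01 -> "fi".
        for d in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(d):
                continue  # drop combining marks such as accents
            for c in d.casefold():  # case folding may also expand
                folded.append(c)
                index_map.append(i)
    return "".join(folded), index_map


def folded_search(needle, haystack):
    """Return (start, end) of the first folded match in HAYSTACK, or None.

    START and END are indices into the original, unfolded HAYSTACK.
    """
    folded_needle, _ = fold(needle)
    folded_hay, index_map = fold(haystack)
    if not folded_needle:
        return None
    pos = folded_hay.find(folded_needle)
    if pos < 0:
        return None
    start = index_map[pos]
    end = index_map[pos + len(folded_needle) - 1] + 1
    # Include combining marks that follow the last matched base character.
    while end < len(haystack) and unicodedata.combining(haystack[end]):
        end += 1
    return start, end
```

With this sketch, searching for "fi" in text that begins with the single
ligature character U+FB01 reports a match covering just that one
character, and searching for "cafe" matches "Café" whether the accent is
precomposed or a separate combining character.  Doing the same thing
efficiently inside the C search routines, without building folded copies
of the buffer text, is the hard part that the sketch does not address.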