From: Drew Adams
Newsgroups: gmane.emacs.devel
Subject: character folding future [was: Questions about isearch]
Date: Sat, 28 Nov 2015 08:48:57 -0800 (PST)
Message-ID: <893eaaa4-6867-4e3f-a926-85f650367d6f@default>
References: <83lh9lx6oi.fsf@gnu.org> <83a8q1x1cn.fsf@gnu.org>
 <87h9k74pkw.fsf@gmail.com> <83bnafse4f.fsf@gnu.org>
 <878u5jrvih.fsf@rub.de> <87mvtyqzyx.fsf@mbork.pl>
 <831tbaqwwv.fsf@gnu.org>
To: bruce.connor.am@gmail.com, Eli Zaretskii
Cc: Stephen Berman, Richard Stallman, emacs-devel

> Ok. I'm going to work on the char-folding a little bit more today to
> implement support for multi-char matches and to combine it with
> case-folding. Hopefully that will iron out the final inconsistencies.

Thanks for working on this, Artur.

I invite you to also take a look at some code I wrote for this,
which I've put in `character-fold+.el'. It follows a previous
discussion. Any of that, or similar, that gets added to vanilla
Emacs will mean one less thing for me to bother with. ;-)

A description is here:
http://www.emacswiki.org/emacs/CharacterFoldPlus.

The code is here:
http://www.emacswiki.org/emacs/download/character-fold%2b.el

The additions are essentially these:

1. An option, `char-fold-ad-hoc', for the ad hoc char foldings.
   Default value: the same ad hoc foldings as vanilla Emacs
   (quotation marks).

2. A Boolean option, `char-fold-symmetric', which when non-nil
   means that all members of a folding equivalence class are
   treated equivalently, whether base char, compositions, or
   other strings of chars. This lets you search for e' or é and
   find e and any of the other members of its class (including
   composition strings). The default value is nil (off).

3. A general workhorse function, `update-char-fold-table', that
   updates the value of variable `character-fold-table' (from
   whose definition it was derived). It is used when option
   `char-fold-symmetric' is toggled, and it makes use of options
   `char-fold-ad-hoc' and `char-fold-symmetric'.

4. `character-fold-to-regexp' is advised, to reflect whether
   char folding is currently symmetric.

Library Isearch+ provides a toggle for `char-fold-symmetric',
bound by default to `M-s =' during Isearch.

Another Isearch toggle can be useful when char folding is
symmetric: `M-s h L', which toggles lazy highlighting. Lazy
highlighting can slow things down when you use symmetric char
folding.

The code for `isearch+.el' is here:
http://www.emacswiki.org/emacs/download/isearch%2b.el

Earlier, I invited a discussion about future customization of
character folding (and folding in general). That hasn't happened,
so far. But `char-fold-ad-hoc' could be a start.

One possibility is for an alist option, whose entries would each
be a list (MODES CLASSES), where CLASSES is a list of char-folding
classes such as that of `char-fold-ad-hoc'. When any of the MODES
is current, those CLASSES would be used by `update-char-fold-table'.

Users could thus:

1. Add their own equivalence classes.
2. Associate any number of such classes with particular modes.
3. Customize the ad hoc classes used by default.
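
To make the (MODES CLASSES) idea concrete, here is a minimal
sketch. It is not taken from `character-fold+.el': the names
`my-char-fold-mode-classes' and `my-char-fold-classes-for-mode'
are hypothetical, the class format (BASE-CHAR STRING...) is an
assumption, and treating derived modes as matching is just one
possible reading of "when any of the MODES is current".

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

;; Hypothetical option (not part of Emacs or `character-fold+.el').
;; Each entry is (MODES CLASSES).  MODES is a list of major-mode
;; symbols.  CLASSES is a list of folding classes, each assumed here
;; to have the form (BASE-CHAR STRING...).
(defvar my-char-fold-mode-classes
  '(((text-mode) ((?\" "“" "”")))
    ((prog-mode) ((?- "‐" "–"))))
  "Alist-like list associating major modes with char-folding classes.")

(defun my-char-fold-classes-for-mode (mode)
  "Return the folding classes that apply to major mode MODE.
An entry applies if MODE is, or derives from, one of its MODES.
The classes of all applicable entries are appended in order."
  (cl-loop for (modes classes) in my-char-fold-mode-classes
           when (cl-some (lambda (m) (provided-mode-derived-p mode m))
                         modes)
           append classes))

;; Possible use: collect the classes for the current buffer's mode,
;; for example before rebuilding the folding table.
;; (my-char-fold-classes-for-mode major-mode)
```

A function such as `update-char-fold-table' could then consume the
returned list in place of, or in addition to, the default ad hoc
classes.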

In addition, we could provide the class that abstracts from
diacriticals explicitly, as another, non-customizable (?) class,
so that users could include or exclude it too wrt specific modes.
(Currently it is implicit in char folding, i.e., hard-coded.)

Letting users exclude the broad diacritical class and include
their own classes would accommodate wanting some diacritical
foldings but not others. With symmetric folding it should offer
considerable flexibility.

Utility functions that do some of the work currently done by
`update-char-fold-table' could be created, so that users can
easily create their own diacritical classes. Currently, that
part is still hard-coded (only ad hoc foldings are open to user
customization, so far).
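
As a rough illustration of such utilities, the sketch below builds
a diacritical class for one base char from Unicode decomposition
data and then derives a symmetric regexp from a list of classes.
It is only a sketch under stated assumptions: the function names
are hypothetical and do not come from `character-fold+.el', the
class format (BASE-CHAR STRING...) is assumed, the default code
point range is arbitrary, and only single-char class members are
handled (no multi-char input such as e').

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)

(defun my-char-fold-diacritic-class (base &optional start end)
  "Return a class (BASE STRING...) of chars that fold to BASE.
Scan code points START..END (default #x00C0..#x024F, an arbitrary
choice) and keep each char whose Unicode decomposition is BASE
followed only by nonspacing marks (general category Mn)."
  (let ((members '()))
    (cl-loop for ch from (or start #x00C0) to (or end #x024F)
             for dec = (get-char-code-property ch 'decomposition)
             when (and (eq (car dec) base)
                       (cdr dec)
                       (cl-every
                        (lambda (c)
                          (eq (get-char-code-property c 'general-category)
                              'Mn))
                        (cdr dec)))
             do (push (string ch) members))
    (cons base (nreverse members))))

(defun my-char-fold-symmetric-regexp (string classes)
  "Return a regexp matching STRING with symmetric folding over CLASSES.
Each char of STRING that is the base or a member of a class is
replaced by a regexp matching every member of that class, base
included.  Other chars are matched literally."
  (mapconcat
   (lambda (ch)
     (let* ((s (string ch))
            (class (cl-find-if (lambda (c)
                                 (or (eq (car c) ch)
                                     (member s (cdr c))))
                               classes)))
       (if class
           (regexp-opt (cons (string (car class)) (cdr class)))
         (regexp-quote s))))
   string ""))
```

With a class built by `(my-char-fold-diacritic-class ?e)', the
regexp produced for é also matches plain e, and the regexp for e
also matches é, which is the symmetric behavior described for
`char-fold-symmetric'. A user who wants only some diacritical
foldings could build classes for just the base chars of interest.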