From: Ulrich Mueller
Newsgroups: gmane.emacs.devel
Subject: Re: Upcoming loss of usability of Emacs source files and Emacs.
Date: Thu, 18 Jun 2015 07:27:37 +0200
Message-ID: <21890.22217.610318.184683@a1i15.kph.uni-mainz.de>
In-Reply-To: <87twu62nnt.fsf@mbork.pl>
References: <20150615142237.GA3517@acm.fritz.box> <87y4jkhqh5.fsf@uwakimon.sk.tsukuba.ac.jp> <557F3C22.4060909@cs.ucla.edu> <5580D356.4050708@cs.ucla.edu> <87si9qonxb.fsf@gnu.org> <87ioamz8if.fsf@petton.fr> <32013464-2300-46c6-ba46-4a3c36bfee5d@default> <87twu62nnt.fsf@mbork.pl>
To: Marcin Borkowski
Cc: eggert@cs.ucla.edu, rms@gnu.org, Nicolas Petton, emacs-devel@gnu.org, Tassilo Horn, acm@muc.de, stephen@xemacs.org

>>>>> On Wed, 17 Jun 2015, Marcin Borkowski wrote:

> On the other hand, it would be great if we had an "ascii-folding"
> option, making (some reasonable subset of) Unicode "equivalent" to
> ASCII, so that we could easily search for e.g. the Polish word ‘żółw’
> (meaning "turtle") by typing `zolw'.  (I have to say that lack of this
> is one of my main gripes with A****n's K****e e-book reader - this
> renders the "search" option unusable for non-English texts...)

I have the following code in my .emacs which does exactly that:

;; Ignore accent and umlaut marks when searching.
;; Works for Emacs 19.30 and later.
(let ((eqv-list '("aAàÀáÁâÂãÃäÄåÅ"
                  "cCçÇ"
                  "eEèÈéÉêÊëË"
                  "iIìÌíÍîÎïÏ"
                  "nNñÑ"
                  "oOòÒóÓôÔõÕöÖøØ"
                  "uUùÙúÚûÛüÜ"
                  "yYýÝÿ"))
      (table (standard-case-table))
      canon)
  (setq canon (copy-sequence table))
  (mapcar (lambda (s)
            (mapcar (lambda (c) (aset canon c (aref s 0))) s))
          eqv-list)
  (set-char-table-extra-slot table 1 canon)
  (set-char-table-extra-slot table 2 nil)
  (set-standard-case-table table))

Maybe it could be used as the basis for a minor mode?

Downside is that the above will significantly slow down search in
multibyte buffers.  This is because equivalent characters have a
different number of bytes in UTF-8, therefore the Boyer-Moore
algorithm cannot be used any more.

Ulrich
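
The byte-length point can be checked with a small sketch.  The helper
name `my-utf8-byte-lengths' is hypothetical and not part of the code
above; it only uses `encode-coding-string' to count the UTF-8 bytes of
each character, assuming the file is read as multibyte text:

```elisp
;; Sketch only: `my-utf8-byte-lengths' is a made-up helper for
;; illustration.  It returns an alist pairing each character of STR
;; (as a one-character string) with its length in bytes when encoded
;; as UTF-8.
(defun my-utf8-byte-lengths (str)
  (mapcar (lambda (c)
            (let ((s (string c)))
              (cons s (length (encode-coding-string s 'utf-8)))))
          str))

;; Characters folded together by the canonicalization table:
(my-utf8-byte-lengths "aàeé")
;; => (("a" . 1) ("à" . 2) ("e" . 1) ("é" . 2))
```

So "a" and "à" compare as equal under the modified case table, but
occupy one and two bytes respectively, which is the mismatch mentioned
above.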