From: Ted Zlatanov
Newsgroups: gmane.emacs.devel
Subject: extending case-fold-search to remove nonspacing marks (diacritics etc.)
Date: Thu, 05 Feb 2015 17:16:04 -0500
Organization: Теодор Златанов @ Cienfuegos
Message-ID: <87fvakvwbf.fsf@lifelogs.com>
To: emacs-devel@gnu.org

https://emacs.stackexchange.com/questions/7992/how-to-search-an-arabic-word-in-text-without-its-diacritics-accents
suggested it would be useful if diacritics were ignored when searching
for text in various situations. This is similar to `case-fold-search'
but more general.

Here's what I suggested as the answer at the ELisp level:

#+begin_src emacs-lisp
(require 'cl-lib)        ; for `cl-loop'
(require 'ucs-normalize) ; for `ucs-normalize-NFKD-string'

(defun kill-marks (string)
  "Return STRING without nonspacing marks (general category Mn)."
  (concat (cl-loop for c across string
                   when (not (eq 'Mn (get-char-code-property c 'general-category)))
                   collect c)))

;; Decompose both strings (NFKD), drop the marks, then compare.
(let* ((original1 "your Arabic string here")
       (normalized1 (ucs-normalize-NFKD-string original1))
       (original2 "your other Arabic string here")
       (normalized2 (ucs-normalize-NFKD-string original2)))
  (equal
   (replace-regexp-in-string "." 'kill-marks normalized1)
   (replace-regexp-in-string "." 'kill-marks normalized2)))
#+end_src

This would probably be useful for other languages, not just Arabic.
But implementing it for users so that it works like `case-fold-search'
(you just set something in Customize and all search commands DWYM)
seems much harder.

Does anyone have suggestions? Maybe some defadvice magic? Or is it
not possible?

Thanks
Ted