From: "B. T. Raven"
Newsgroups: gmane.emacs.help
Subject: Re: diacritic-fold-search?
Date: Thu, 29 Nov 2012 15:59:57 -0600
To: help-gnu-emacs@gnu.org

Here are some accent-folding data in a .js file that could probably be
put into some kind of data structure Emacs supports:

http://hex-machina.com/scripts/yui/3.3.0pr1/api/unicode-data-accentfold.js.html

See especially the link to the Unicode utilities at the last header
comment.

Ed

> "Drew Adams" writes:
>
>>> Is there a way to search ignoring diacritics, e.g. capturing "apres"
>>> both with and without an accent grave over the "e"?
>>
>> Great question. I don't think so, but I'm guessing that lots of users could
>> make good use of such a feature!
>>
>> Unless someone points out here that this is already possible, why don't
>> you submit an enhancement request for this feature (`M-x
>> report-emacs-bug' is also for enhancement requests): be able to toggle
>> Isearch distinguishing certain sets of similar chars (diacritics).
>>
>> There could be predefined sets of equivalence classes of chars (e.g.,
>> the same letter, modulo diacritical marks). And users could be able to
>> customize these classes.
>>
>> Likewise, for punctuation chars that are very similar (in
>> purpose/visually), such as straight quotes and curly quotes, and
>> no-break hyphen, hyphen, and the various dashes.
>>
>> Likewise, for whitespace chars other than the standard SPC, TAB, etc.
>> For whitespace, I believe there might be some handling of additional
>> chars such as no-break space, but what's needed, here too, is a simple
>> way to toggle distinguishing them on/off.
>>
>> But your use case is the best one: be able to optionally ignore diacritical
>> marks when searching.
>
> It may not be totally irrelevant to note that search engines make
> diacritic-agnostic search the default. And some Web browsers (Chrome
> but not Firefox) do this for searches of a page they’re displaying.
>
> /Lew
> ---
> Lew Perin / perin@acm.org
> http://babelcarp.org
>
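
For illustration only, the idea of folding "the same letter, modulo diacritical marks" can be sketched outside Emacs with Unicode canonical decomposition. This is a minimal Python sketch, not how Emacs or the linked .js table does it; the function names `fold_diacritics` and `fold_contains` are made up for this example. It assumes that stripping combining marks after NFD normalization is an acceptable notion of equivalence.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Strip combining marks after canonical decomposition (NFD).

    Hypothetical helper for illustration: "e" + COMBINING GRAVE ACCENT
    and the precomposed U+00E8 both fold to a plain "e".
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters and a
    # non-zero combining class for marks such as accents.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_contains(needle: str, haystack: str) -> bool:
    """Report whether needle occurs in haystack, ignoring diacritics."""
    return fold_diacritics(needle) in fold_diacritics(haystack)


if __name__ == "__main__":
    print(fold_diacritics("apr\u00e8s"))                        # apres
    print(fold_contains("apres", "apr\u00e8s la pluie"))        # True
    print(fold_contains("apr\u00e8s", "apres la pluie"))        # True
```

Decomposition alone does not cover every case raised in the thread: characters with no canonical decomposition (for example ø), and the curly-quote, dash, and no-break-space equivalences, would still need an explicit, customizable table of equivalence classes.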