From: "B. T. Raven" <btraven@nihilo.net>
To: help-gnu-emacs@gnu.org
Subject: Re: diacritic-fold-search?
Date: Thu, 29 Nov 2012 15:59:57 -0600
Message-ID: <k98lsu01fs@news6.newsguy.com>
In-Reply-To: <pc7zk20xmkl.fsf@panix1.panix.com>
Here is some accent-folding data in a .js file that could probably be
converted into a data structure that Emacs supports:
http://hex-machina.com/scripts/yui/3.3.0pr1/api/unicode-data-accentfold.js.html
See especially the link to the Unicode utilities in the last header comment.
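For comparison, much of this folding can be derived from Unicode
decomposition data rather than from a hand-made table. A minimal sketch
in Python (not Emacs Lisp), assuming only the standard `unicodedata`
module; the names `fold_diacritics` and `fold_search` are hypothetical
and used here for illustration only:

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed after canonical decomposition."""
    # NFD splits a precomposed letter such as "è" into "e" + U+0300.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks, so drop those.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_search(needle: str, haystack: str) -> bool:
    """Report whether needle occurs in haystack, ignoring diacritics."""
    return fold_diacritics(needle) in fold_diacritics(haystack)
```

With this sketch, searching for "apres" finds "après" and vice versa.
Letters that have no canonical decomposition (for example "ø") pass
through unchanged, so an explicit mapping table may still be needed for
those.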
Ed
> "Drew Adams" <drew.adams@oracle.com> writes:
>
>>> Is there a way to search ignoring diacritics, e.g. capturing "apres"
>>> both with and without an accent grave over the "e"?
>>
>> Great question. I don't think so, but I'm guessing that lots of users could
>> make good use of such a feature!
>>
>> Unless someone points out here that this is already possible, why don't
>> you submit an enhancement request for this feature (`M-x
>> report-emacs-bug' is also for enhancement requests): be able to toggle
>> Isearch distinguishing certain sets of similar chars (diacritics).
>>
>> There could be predefined sets of equivalence classes of chars (e.g.,
>> the same letter, modulo diacritical marks). And users could
>> customize these classes.
>>
>> Likewise, for punctuation chars that are very similar (in
>> purpose/visually), such as straight quotes and curly quotes, and
>> no-break hyphen, hyphen, and the various dashes.
>>
>> Likewise, for whitespace chars other than the standard SPC, TAB, etc.
>> For whitespace, I believe there might be some handling of additional
>> chars such as no-break space, but what's needed, here too, is a simple
>> way to toggle distinguishing them on/off.
>>
>> But your use case is the best one: be able to optionally ignore diacritical
>> marks when searching.
>
> It may not be totally irrelevant to note that search engines make
> diacritic-agnostic search the default. And some Web browsers (Chrome
> but not Firefox) do this for searches of a page they’re displaying.
>
> /Lew
> ---
> Lew Perin / perin@acm.org
> http://babelcarp.org
>
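The equivalence-class idea quoted above (predefined, customizable sets
of similar characters, with a toggle) can be sketched roughly as
follows. This is an illustration in Python, not how Isearch works; the
class list and the name `build_pattern` are assumptions made up for the
example, and the text is assumed to use precomposed (NFC) characters:

```python
import re

# Hypothetical, user-customizable equivalence classes: each string lists
# characters that a search should treat as interchangeable.
EQUIVALENCE_CLASSES = [
    "e\u00e8\u00e9\u00ea\u00eb",      # e with grave, acute, circumflex, diaeresis
    "a\u00e0\u00e1\u00e2\u00e4",      # a with grave, acute, circumflex, diaeresis
    "'\u2018\u2019",                  # straight and curly single quotes
    "-\u2010\u2011\u2013\u2014",      # hyphen-minus, hyphen, no-break hyphen, en dash, em dash
    " \u00a0",                        # space and no-break space
]


def build_pattern(query: str, fold: bool = True) -> re.Pattern:
    """Compile query into a regex; with fold on, each char matches its whole class."""
    lookup = {ch: cls for cls in EQUIVALENCE_CLASSES for ch in cls}
    parts = []
    for ch in query:
        cls = lookup.get(ch) if fold else None
        if cls is None:
            # No class (or folding toggled off): match the character literally.
            parts.append(re.escape(ch))
        else:
            # Expand the character into a bracket expression over its class.
            parts.append("[" + "".join(re.escape(c) for c in cls) + "]")
    return re.compile("".join(parts))
```

Passing `fold=False` plays the role of the toggle: the same query then
matches only the exact characters typed. Customizing the behavior
amounts to editing the list of classes.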
Thread overview: 13+ messages
2012-11-29 17:20 diacritic-fold-search? Lewis Perin
2012-11-29 17:39 ` diacritic-fold-search? Drew Adams
2012-11-30 14:13 ` diacritic-fold-search? Doug Lewan
[not found] ` <mailman.14059.1354210783.855.help-gnu-emacs@gnu.org>
2012-11-29 18:59 ` diacritic-fold-search? Lewis Perin
2012-11-29 19:10 ` diacritic-fold-search? Drew Adams
2012-11-29 19:31 ` diacritic-fold-search? Dani Moncayo
2012-11-29 21:59 ` B. T. Raven [this message]
2012-11-30 15:29 ` diacritic-fold-search? Lewis Perin
2012-11-30 18:31 ` diacritic-fold-search? Lewis Perin
-- strict thread matches above, loose matches on Subject: below --
2012-11-29 17:12 diacritic-fold-search? Lewis Perin
2012-11-29 18:19 ` diacritic-fold-search? Peter Dyballa
2012-11-29 18:29 ` diacritic-fold-search? Drew Adams
[not found] ` <mailman.14069.1354213153.855.help-gnu-emacs@gnu.org>
2012-11-29 18:37 ` diacritic-fold-search? Lewis Perin