From: martin rudalics <rudalics@gmx.at>
To: Juri Linkov <juri@jurta.org>
Cc: 13041@debbugs.gnu.org, perin@panix.com, perin@acm.org
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Fri, 07 Dec 2012 11:37:00 +0100
Message-ID: <50C1C6CC.9020103@gmx.at>
In-Reply-To: <871uf2647i.fsf@mail.jurta.org>
> This is usable to sort and compare strings, but I don't see
> how ucs-normalize.el could help in the search. I suppose the
> searched buffer can't be normalized before starting a search.
You can temporarily do either of the following:

- Leave the text alone but give each string that should be handled
  specially a text property with the normalized form. In this case
  searching has to pay attention to these properties, if present.

- Normalize the text and give each normalized string a text property
  with the original text. In this case searching will proceed as usual
  but you have to restore the original text when done.

I don't know how feasible these are for searching. But I used the
second approach for sorting without problems.
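
As a rough illustration of the second approach applied to sorting, here is a minimal sketch in Python rather than Emacs Lisp. It is not the code referred to above: `fold` and `annotate` are hypothetical helper names, and a list of (folded, original) pairs stands in for the text property that would carry the original text.

```python
import unicodedata

def fold(s):
    """Decompose s (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def annotate(words):
    """Pair each word with its folded form; the original rides along
    the way a text property would (an assumption of this sketch)."""
    return [(fold(w), w) for w in words]

words = ["r\u00e9sum\u00e9", "zebra", "\u00c5ngstr\u00f6m", "apple"]
pairs = annotate(words)

# Work on the folded forms only, then restore the originals.
restored = [orig for _, orig in sorted(pairs, key=lambda p: p[0].lower())]
print(restored)
```

Sorting sees only the folded strings, and the originals come back unchanged once the work is done, which is the "restore the original text" step.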

Also I don't know how to handle the return value and/or highlighting
when, for example, finding a match for "suf" within "suﬀer" (written
with the "ﬀ" ligature). Replacing each occurrence of "suf" with the
empty string should leave us with "fer" here, so in this case we have
to deal with the normalized string anyway. OTOH replacing a match for
"res" in "résumé" with the empty string should probably leave us with
"umé".

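
The two cases can be told apart mechanically. The sketch below is in Python, not Emacs Lisp, and assumes that the "suffer" in the example is spelled with the "ﬀ" ligature (U+FB00). `fold_with_map` and `find_folded` are hypothetical names. Each folded character remembers which original character produced it, so a match can be mapped back to original offsets and flagged when it cuts an original character in two.

```python
import unicodedata

def fold_with_map(text):
    """Return (folded, owners), where owners[i] is the index in text
    of the original character that produced folded[i]."""
    folded, owners = [], []
    for i, ch in enumerate(text):
        for c in unicodedata.normalize("NFKD", ch):
            if not unicodedata.combining(c):
                folded.append(c)
                owners.append(i)
    return "".join(folded), owners

def find_folded(text, needle):
    """Search the folded text. Return (start, end, splits) in original
    offsets, or None. splits is True when the match begins or ends
    inside a single original character (e.g. a ligature)."""
    folded, owners = fold_with_map(text)
    pos = folded.find(needle) if needle else -1
    if pos < 0:
        return None
    last = pos + len(needle) - 1
    start, end = owners[pos], owners[last] + 1
    # Keep trailing combining marks attached to the matched base character.
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    splits = (pos > 0 and owners[pos - 1] == start) or (
        last + 1 < len(owners) and owners[last + 1] == owners[last])
    return start, end, splits

resume = "r\u00e9sum\u00e9"   # précomposed é
suffer = "su\ufb00er"         # "ff" as one ligature character

print(find_folded(resume, "res"))
print(find_folded(suffer, "suf"))
```

For "résumé" the match maps cleanly onto the first three original characters, so deleting that span leaves "umé". For the ligature spelling the match ends halfway through one original character: deleting the original span would give "er", and getting "fer" requires editing the normalized string instead.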
> So the search function somehow should be able to skip combining
> characters in the buffer. But to do this, the translation table needs
> to contain additional information about certain characters to ignore.
> Also the translation table should be able to map a sequence of
> characters like "ss" to "ß".
I have no idea how many mappings like "ß" -> "ss" exist. The problem is
that we don't get them from UnicodeData.txt IIUC.
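
A quick way to check this, sketched in Python (whose `unicodedata` module is built from UnicodeData.txt): "é" has a decomposition entry there, while "ß" has none. The "ss" form only shows up through case folding, which as far as I know comes from a different Unicode data file (CaseFolding.txt), so treat that attribution as an assumption.

```python
import unicodedata

sharp_s = "\u00df"  # ß
e_acute = "\u00e9"  # é

# Decomposition field as recorded in UnicodeData.txt.
print(repr(unicodedata.decomposition(e_acute)))  # '0065 0301'
print(repr(unicodedata.decomposition(sharp_s)))  # ''

# Normalization therefore leaves ß untouched ...
unchanged = unicodedata.normalize("NFKD", sharp_s) == sharp_s
print(unchanged)  # True

# ... while full case folding does produce "ss".
print(sharp_s.casefold())  # ss
```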
martin