From: martin rudalics <rudalics@gmx.at>
To: Juri Linkov <juri@jurta.org>
Cc: 13041@debbugs.gnu.org, perin@panix.com, perin@acm.org
Subject: bug#13041: 24.2; diacritic-fold-search
Date: Fri, 07 Dec 2012 11:37:00 +0100
Message-ID: <50C1C6CC.9020103@gmx.at>
In-Reply-To: <871uf2647i.fsf@mail.jurta.org>

 > This is usable to sort and compare strings, but I don't see
 > how ucs-normalize.el could help in the search.  I suppose the
 > searched buffer can't be normalized before starting a search.

You can temporarily do either of the following:

- leave the text alone but give each string that should be handled
   specially a text property with the normalized form.  In this case
   searching has to pay attention to these properties, if present.

- normalize the text and give each normalized string a text property
   with the original text.  In this case searching will proceed as usual
   but you have to restore the original text when done.

I don't know how feasible these are for searching.  But I used the
second approach for sorting without problems.
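
As a rough illustration of the bookkeeping both options need, here is a
minimal Python sketch (not Emacs Lisp, and not code from this thread).
It leaves the text alone and builds a folded copy plus an offset table,
which plays the role of the text property.  The names fold_with_offsets
and fold_search are hypothetical, and "folding" is assumed to mean NFKD
decomposition with combining marks dropped.

```python
import unicodedata


def fold_with_offsets(text):
    """Return (folded, offsets); offsets[i] is the index in `text` of
    the original character that produced folded[i]."""
    folded, offsets = [], []
    for index, char in enumerate(text):
        for piece in unicodedata.normalize("NFKD", char):
            if unicodedata.combining(piece):
                continue  # drop combining marks such as U+0301
            folded.append(piece)
            offsets.append(index)
    return "".join(folded), offsets


def fold_search(needle, text):
    """Search the folded text, but report (start, end) in the ORIGINAL
    text.  Returns None when there is no match."""
    folded_text, offsets = fold_with_offsets(text)
    folded_needle, _ = fold_with_offsets(needle)
    if not folded_needle:
        return None
    pos = folded_text.find(folded_needle)
    if pos < 0:
        return None
    start = offsets[pos]
    end = offsets[pos + len(folded_needle) - 1] + 1
    # Swallow combining marks that follow the last matched character,
    # so a decomposed accent is not left dangling after the match.
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    return start, end
```

With this, fold_search("res", "r\u00e9sum\u00e9") reports the span of
"rés" in the original string, which is what highlighting would need.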

Also I don't know how to handle the return value and/or highlighting
when, for example, finding a match for "suf" within "suffer".  Replacing
each occurrence of "suf" with the empty string should leave us with
"fer" here, so in this case we have to deal with the normalized string
anyway.  OTOH replacing a match for "res" in "résumé" with the empty
string should probably leave us with "umé".
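
The two replacement cases can be sketched in Python as follows.  This is
only one possible policy, under stated assumptions: the "ff" in "suffer"
is taken to be the single ligature character U+FB00 (the message does
not say so explicitly), folding means NFKD with combining marks dropped,
and only the first match is replaced.  The names fold and fold_replace
are hypothetical.  When a match ends inside the expansion of one
original character, the leftover part of that expansion is written back
in normalized form.

```python
import unicodedata


def fold(text):
    """Return (folded, offsets); offsets[i] is the index in `text` of
    the original character that produced folded[i]."""
    folded, offsets = [], []
    for index, char in enumerate(text):
        for piece in unicodedata.normalize("NFKD", char):
            if not unicodedata.combining(piece):
                folded.append(piece)
                offsets.append(index)
    return "".join(folded), offsets


def fold_replace(needle, replacement, text):
    """Replace the first folded match of `needle` in `text`."""
    folded, offsets = fold(text)
    target, _ = fold(needle)
    pos = folded.find(target) if target else -1
    if pos < 0:
        return text
    stop = pos + len(target)
    start, end = offsets[pos], offsets[stop - 1] + 1
    # Take trailing combining marks along with the last matched char.
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    # Folded pieces of the first/last original character that lie
    # outside the match (e.g. the second "f" of a split ligature).
    j = pos
    while j > 0 and offsets[j - 1] == start:
        j -= 1
    k = stop
    while k < len(folded) and offsets[k] == offsets[stop - 1]:
        k += 1
    return text[:start] + folded[j:pos] + replacement + folded[stop:k] + text[end:]
```

Under these assumptions, removing "suf" from "su\ufb00er" yields "fer",
and removing "res" from "r\u00e9sum\u00e9" yields "um\u00e9", so the
untouched part of the word keeps its accent.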

 > So the search function somehow should be able to skip combining
 > characters in the buffer.  But to do this, the translation table needs
 > to contain additional information about certain characters to ignore.
 > Also the translation table should be able to map a sequence of
 > characters like "ss" to "ß".

I have no idea how many mappings like "ß" -> "ss" exist.  The problem is
that we don't get them from UnicodeData.txt IIUC.
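
One way to get a feel for this is to ask Python's Unicode tables, which
is only a stand-in for reading the data files directly.  The sketch
below assumes that unicodedata.decomposition reflects the decomposition
field of UnicodeData.txt and that str.casefold applies full case folding
(data that lives in CaseFolding.txt).  The function name
folds_without_decomposition is hypothetical, and the result depends on
the Unicode version shipped with the interpreter.

```python
import sys
import unicodedata


def folds_without_decomposition():
    """Map each character whose case folding is longer than one
    character, and which has no decomposition entry, to its folding."""
    found = {}
    for codepoint in range(sys.maxunicode + 1):
        char = chr(codepoint)
        folded = char.casefold()
        if len(folded) > 1 and not unicodedata.decomposition(char):
            found[char] = folded
    return found


if __name__ == "__main__":
    for char, folded in sorted(folds_without_decomposition().items()):
        print(f"U+{ord(char):04X} {char!r} -> {folded!r}")
```

"ß" shows up in this list because it has no decomposition, whereas a
ligature such as U+FB00 does not, since its compatibility decomposition
already gives "ff".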

martin






