From: Robert Pluim <rpluim@gmail.com>
To: Juri Linkov <juri@linkov.net>
Cc: "Basil L. Contovounesios" <contovob@tcd.ie>, emacs-devel@gnu.org
Subject: Re: search-default-mode char-fold-to-regexp and Greek Extended block characters
Date: Thu, 25 Jul 2019 22:44:29 +0200
Message-ID: <m2sgqtkfo2.fsf@gmail.com>
In-Reply-To: <87a7d2asu3.fsf@mail.linkov.net> (Juri Linkov's message of "Thu, 25 Jul 2019 21:40:12 +0300")

>>>>> On Thu, 25 Jul 2019 21:40:12 +0300, Juri Linkov <juri@linkov.net> said:

    >> Can you please explain why iota with dialytika and tonos needs to be
    >> special-cased in these places?

    Juri> Here is the test case that demonstrates the need to add it
    Juri> to char-fold-include:

    Juri> 0. emacs -Q
    Juri> 1. Paste this text to *scratch*: "ΐΐ"
    Juri> 2. Search for two IOTAs with char-fold, e.g.: C-s M-s ' ιι

    Juri> The char-fold search doesn't match the characters with combining accents
    Juri> with their base char GREEK SMALL LETTER IOTA.

    Juri> However, after adding (?ι "ΐ") to char-fold-include it can match the
    Juri> base character IOTA.

Yes, I see the problem now. Maybe this can be solved by adding that
mapping when building char-fold-table. Or 'those mappings' I should
say, since there are going to be many cases like this.

How about the following? It passes your tests with the FIXMEs
uncommented (and isearch for multiple iotas matches multiple iotas +
combining diacriticals).

I deliberately restricted it to lower case characters, since the
roundtripping fails for İ and a large number of titlecase characters.

diff --git i/lisp/char-fold.el w/lisp/char-fold.el
index f379229e6c..91fd7ddc28 100644
--- i/lisp/char-fold.el
+++ w/lisp/char-fold.el
@@ -108,6 +108,17 @@
                                     (car next-decomp)))
                            (funcall make-decomp-match-char (list (car next-decomp)) char)))
                      (setq dec next-decomp)))
+               ;; If there is no precomposed uppercase version of a
+               ;; character with diacriticals, we also add a mapping
+               ;; from the base character to the base character with
+               ;; combining diacriticals.
+               (when (eq (get-char-code-property char 'general-category) 'Ll)
+                 (let* ((str (char-to-string char))
+                        (upper (upcase str))
+                        (roundtrip (downcase upper)))
+                   (when (> (length roundtrip) 1)
+                     (aset equiv (aref roundtrip 0)
+                           (cons roundtrip (aref equiv (aref roundtrip 0)))))))
                ;; Do it again, without the non-spacing characters.
                ;; This allows 'a' to match 'ä'.
                (let ((simpler-decomp nil)
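
The roundtrip test in the patch can be tried on its own in *scratch*.
What follows is only a sketch: `my-char-fold-roundtrip' is a
hypothetical helper name, not something in char-fold.el, and it
assumes an Emacs whose `upcase' applies the Unicode special-casing
rules to strings, which is what the patch relies on.

```elisp
;; Hypothetical helper, not part of char-fold.el.  It repeats the
;; check from the patch for a single character.
(defun my-char-fold-roundtrip (char)
  "Return the upcase/downcase roundtrip of CHAR when it is multi-char.
Return nil if CHAR is not a lowercase letter (general category Ll)
or if the roundtrip yields a single character."
  (when (eq (get-char-code-property char 'general-category) 'Ll)
    (let ((roundtrip (downcase (upcase (char-to-string char)))))
      (when (> (length roundtrip) 1)
        roundtrip))))

;; ?a has a precomposed uppercase form, so the roundtrip is one
;; character long and no extra mapping would be added.
(my-char-fold-roundtrip ?a)

;; U+0390 has no precomposed uppercase form.  With special casing
;; applied, the result should be a string starting with the base
;; iota, followed by combining diacriticals.
(my-char-fold-roundtrip ?ΐ)
```

When the helper returns a string, its first character (aref roundtrip
0) is the key under which the patch stores the string in `equiv'.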
