From: Eli Zaretskii <eliz@gnu.org>
To: Damien Cassou <damien@cassou.me>
Cc: 56747@debbugs.gnu.org
Subject: bug#56747: 28.1.90; Char fold search doesn't work
Date: Mon, 25 Jul 2022 15:01:31 +0300
Message-ID: <83zggxe6zo.fsf@gnu.org>
In-Reply-To: <87v8rm70vl.fsf@cassou.me> (message from Damien Cassou on Sun, 24 Jul 2022 21:43:10 +0200)

> From: Damien Cassou <damien@cassou.me>
> Cc: 56747@debbugs.gnu.org
> Date: Sun, 24 Jul 2022 21:43:10 +0200
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> > Which part of the manual led you to expect the above behavior?
> 
> This page (info "(emacs) Lax Search") says:
> 
>   In addition, ‘a’ matches other characters that resemble it, or have it
>   as part of their graphical representation, such as U+249C
>   PARENTHESIZED LATIN SMALL LETTER A and U+2100 ACCOUNT OF (which looks
>   like a small ‘a’ over ‘c’)
> 
> Those 2 characters are the ones I tried so I was expecting to make it
> work.

Ah, you are right.  I wasn't reading the text closely enough.

> > By default, Emacs only folds "canonically-equivalent" characters, and
> > those two aren't equivalent to 'a'.
> 
> Then I don't understand what the manual is saying. Can you please
> explain?

It's a documentation bug: these 2 pairs are by default not handled as
equivalent.  The reasons are to some extent heuristics: since the
table of the equivalent character sequences is produced mechanically,
allowing such "too lax" equivalences would lead to surprising false
matches; see bug#20975 for one example.  (These surprising results are
in part due to our simplistic implementation, whereby we convert
the set of equivalent sequences to a regexp.)  So we decided to play
it safe, and not allow 'a' to match a character whose Unicode
decomposition is "(a)", because 'a' is not the first character of the
decomposition.  We do allow the sequence "(a)" to match ⒜ (but not
vice versa!), and we do allow 'a' to match 'ⓐ' (because 'a' is the
only character in the decomposition of the latter).
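
To make the cases above concrete, here is a rough sketch in Python (not
Emacs code) that looks up the Unicode decompositions involved.  The
helper names `decomposition_chars` and `folds_to` are made up for this
illustration, and the rule they implement is a deliberate
simplification: a query matches a character only when the query equals
that character's whole decomposition.  It ignores accents and combining
marks, and it is not how char-fold.el builds its table.

```python
import unicodedata


def decomposition_chars(ch):
    """Return the characters of CH's Unicode decomposition as a string.

    The optional formatting tag (e.g. "<compat>", "<circle>") is dropped.
    Returns "" when CH has no decomposition.
    """
    fields = unicodedata.decomposition(ch).split()
    return "".join(chr(int(f, 16)) for f in fields if not f.startswith("<"))


def folds_to(query, ch):
    """Simplified, hypothetical rule: QUERY matches CH only when QUERY
    is exactly the whole decomposition of CH (one direction only)."""
    return decomposition_chars(ch) == query


# U+249C PARENTHESIZED LATIN SMALL LETTER A decomposes to "(a)".
print(repr(decomposition_chars("\u249c")))
# U+24D0 CIRCLED LATIN SMALL LETTER A decomposes to "a" alone.
print(repr(decomposition_chars("\u24d0")))
# U+2100 ACCOUNT OF decomposes to "a/c".
print(repr(decomposition_chars("\u2100")))

print(folds_to("a", "\u24d0"))    # 'a' is the only character in the decomposition
print(folds_to("(a)", "\u249c"))  # the whole sequence "(a)" matches
print(folds_to("a", "\u249c"))    # a lone 'a' does not
print(folds_to("a", "\u2100"))    # nor does it match U+2100
```

Under this simplified rule, the two characters from the old manual text
are not matched by a lone 'a', while 'ⓐ' is, which agrees with the
default behavior described above.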

The result of these heuristics is somewhat inconsistent from the user's
point of view, which is why we have a facility to customize it.

So I've now updated the manual to quote only examples that really
work.

> By the way, you are doing an amazing job with Emacs! Thank you so much
> Eli.

Thanks, but please don't forget Lars and others involved in the
development.





Thread overview: 5+ messages
2022-07-24 17:27 bug#56747: 28.1.90; Char fold search doesn't work Damien Cassou
2022-07-24 17:45 ` Eli Zaretskii
2022-07-24 19:43   ` Damien Cassou
2022-07-25 12:01     ` Eli Zaretskii [this message]
     [not found]       ` <87lesh74he.fsf@cassou.me>
2022-07-25 13:38         ` Eli Zaretskii
