From: Chong Yidong <cyd@stupidchicken.com>
To: Kenichi Handa <handa@m17n.org>
Cc: 540@emacsbugs.donarmstrong.com
Subject: bug#540: 23.0.60; Unicode search bug
Date: Wed, 27 Aug 2008 00:15:57 -0400
Message-ID: <87wsi3qeiq.fsf@cyd.mit.edu>

Hi Handa-san,

Could you take a look at this bug report? Thanks.

Juri Linkov <juri@jurta.org> wrote:

> There is a weird bug in searching Unicode text. The search function
> fails on Cyrillic letters between codepoints #x0400 and #x041f, but
> successfully finds a Cyrillic letter between #x0420 and #x042f.
>
> I tried to debug this and see that in case of failure it calls
> `boyer_moore', and in case of successful search it calls
> `simple_search'. I checked the Unicode properties, but everything
> seems correct.
>
> This bug didn't exist before the Unicode merge.
>
> The easiest way to reproduce it: run `emacs -Q', put in the *scratch*
> buffer the following 4 lines (note the leading space):
>
> (search-forward " П" nil t)
> (search-forward " Р" nil t)
> П
> Р
>
> and type `C-x C-e' after each of first two lines.

Here, the failing case is:

         П = 1055 = 10000011111
inverse(П) = 1087 = 10000111111
                         ^^^^^^

whereas the case that works (by setting boyer_moore_ok to 0) is

         Р = 1056 = 10000100000
inverse(Р) = 1088 = 10001000000
                         ^^^^^^

I've indicated the last 6 bits, according to the logic in search_buffer
(which I don't fully understand).
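
For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It only reproduces the numbers. It is not the logic in search_buffer, and it assumes that Python's str.lower() gives the same counterpart as the "inverse" from the Emacs case table for these two letters. The helper name describe and the same_high_bits comparison are my own illustration of what "the last 6 bits" separates.

```python
def describe(ch):
    """Return codepoint details for ch and its lowercase counterpart.

    Assumption: str.lower() stands in for the Emacs case-table "inverse".
    """
    cp = ord(ch)
    inv = ord(ch.lower())
    return {
        "char": cp,
        "inverse": inv,
        "char_bits": format(cp, "b"),
        "inverse_bits": format(inv, "b"),
        # The low 6 bits are the part marked with carets in the message.
        "char_low6": cp & 0x3F,
        "inverse_low6": inv & 0x3F,
        # True when the two codepoints differ only within the low 6 bits.
        "same_high_bits": (cp >> 6) == (inv >> 6),
    }


if __name__ == "__main__":
    for ch in ("П", "Р"):
        print(ch, describe(ch))
```

With these inputs, П and its counterpart agree in every bit above the low 6, while Р and its counterpart do not. Whether that difference is what decides between boyer_moore and simple_search is exactly the open question here.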