From: Artur Malabarba <bruce.connor.am@gmail.com>
To: Alan Mackenzie <acm@muc.de>
Cc: 22090@debbugs.gnu.org
Subject: bug#22090: Isearch is sluggish and eventually refuses further service with "[Too many words]".
Date: Sun, 6 Dec 2015 12:50:24 +0000
Message-ID: <CAAdUY-JHBy3Bzax8dwdO+FO263o15sCfev3KLrwUjmqij_y9FA@mail.gmail.com>
In-Reply-To: <20151205185220.GF2698@acm.fritz.box>

2015-12-05 18:52 GMT+00:00 Alan Mackenzie <acm@muc.de>:
> But it seems the complexity (and it can't honestly be that much,
> surely?) is intrinsic to the task being carried out.  Sticking a "\\|"
> between the upper case and lower case versions clearly doesn't work.
>
> Seriously, how difficult can it be to generate
>
>     "\\([Aa][´`]?\\|[áà𝑎ÁÀ]\\)"
>
> , which is a blameless regexp, given where you've already got to?

Oh. I see. I thought you were talking about mutually exclusive
regexps. Indeed a regexp like that would be trivial to generate. But
is it really blameless? I mean, if "\\(A\\|a\\)" can lead to extremely
slow searches, doesn't the same happen with "[Aa]"?
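
One way to examine this question empirically is to time both forms on the
same text. The following is only a sketch: the helper name
`my-time-regexp-search` is hypothetical, and the numbers it returns depend
on the buffer contents and the Emacs build, so no particular outcome is
implied.

```elisp
(require 'benchmark)

(defun my-time-regexp-search (regexp text &optional repetitions)
  "Return the seconds spent finding every match of REGEXP in TEXT.
The whole scan is repeated REPETITIONS times (default 100).
REGEXP must not match the empty string, or the scan would not advance."
  (let ((n (or repetitions 100))      ; benchmark-run wants a number or symbol here
        (case-fold-search nil))       ; compare the regexps exactly as written
    (with-temp-buffer
      (insert text)
      (car (benchmark-run n
             (goto-char (point-min))
             (while (re-search-forward regexp nil t)))))))
```

Calling it once with "\\(A\\|a\\)" and once with "[Aa]" on the same large
string gives two timings that can be compared directly.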

Anyway, at this point I'm just asking for future knowledge/reference.
According to Eli, the current implementation is in accordance with the
Unicode Standard. So it's probably best to keep it this way at least
for the first release of the feature.

> Once you've generated the long regexp, if it's too long, you can split
> it up into, say, 3 pieces A, B, C, such that (equal re (concat A B C)).
>
> Then you can do something like:
>
>     (and (search-forward-regexp A bound noerror)
>          (search-forward-regexp (concat "\\=" B) bound noerror)
>          (search-forward-regexp (concat "\\=" C) bound noerror))
>
> .  Though, thinking about it, it might be less painful to enhance the
> regexp engine to take longer regexps.

Besides, char-folding is supposed to turn strings into regexps usable
anywhere, and a multi-step search like this wouldn't work with that.

I've added a clause to the function so that it won't do any
char-folding if the resulting regexp would be longer than 5k chars
(instead it will just regexp-quote the string). That will at least
prevent the "[Too many words]" error in isearch. (I already had this
clause in there before, but it was using 10k, which apparently was
still too high a limit.)
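
A minimal sketch of such a length guard follows. It is not the code that
went into Emacs: the names `my-fold-limit`, `my-fold-char` and
`my-fold-to-regexp` are hypothetical, and the tiny equivalence table stands
in for the real Unicode-derived one.

```elisp
;; Hypothetical names throughout; this only illustrates the fallback idea.
(defconst my-fold-limit 5000
  "Maximum length of a folded regexp before falling back to a literal match.")

(defun my-fold-char (char)
  "Return a regexp matching CHAR and a few hand-picked equivalents.
A toy stand-in for a real character equivalence table."
  (pcase char
    (?a "[aáà]")
    (?e "[eéè]")
    (_ (regexp-quote (string char)))))

(defun my-fold-to-regexp (string &optional limit)
  "Return a folding regexp for STRING.
If the result would be longer than LIMIT (default `my-fold-limit'),
return a plain `regexp-quote' of STRING instead."
  (let ((folded (mapconcat #'my-fold-char string "")))
    (if (> (length folded) (or limit my-fold-limit))
        (regexp-quote string)
      folded)))
```

With the default limit, (my-fold-to-regexp "ae") yields the folded form,
while passing a small LIMIT forces the literal fallback, so the search
still works but no longer matches the accented variants.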
