unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Le Wang <l26wang@gmail.com>
Cc: "Óscar Fuentes" <ofv@wanadoo.es>, emacs-devel@gnu.org
Subject: Re: Emacs needs truely useful flex matching
Date: Mon, 15 Apr 2013 09:50:50 -0400	[thread overview]
Message-ID: <jwv61zodjf3.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <CAM=K+irSz6aDT8TN+JDNz-goZr4cE6MeUExePJ_gUNGZrk9Z_Q@mail.gmail.com> (Le Wang's message of "Mon, 15 Apr 2013 08:14:44 +0800")

> Consider "fafbfcfd-bcd", your regexp will match "afbfcfd" from the first
> word, but we want to prefer "a" from the first word, "bcd" from the second
> word.

Duh!  Thanks for spelling it out.
So indeed you'd need regexps like

(re "ab...") = (concat "\\<a\\([^b]*\\)" (re "b...")
                       "\\|a\\([^b]*\\)" (re "b..."))

which blows up very quickly both in size and in matching time.

You can probably make it work with regexps, but it'll take more work:
- Match using the kind of regexp I suggested.  Use it to filter out
  uninteresting candidates.
- For those elements that match, try to match with stronger regexps
  to see if the match's value can be improved.  I.e. for "abcd", you'd
  first try "\\<a\\([^b]*\\)b\\([^c]*\\)c\\([^d]*\\)d", then
  either try "a\\([^b]*\\)\\<b\\([^c]*\\)c\\([^d]*\\)d" or
  try "\\<a\\([^b]*\\)\\<b\\([^c]*\\)c\\([^d]*\\)d" depending on the
  first test, etc...
  For an N-char key, that's O(N) regexp-matches, tho it may also turn out
  to require O(N) regexp-compilations, so it's probably OK but might
  require some extra work to optimize it further.

> Actually can I skip the GC somehow since I know I'm about to allocate a
> bunch of memory that never gets freed.

You can let-bound gc-cons-threshold, but I think it's not worth
the risk (the let-binding may apply to more code than expected in
situations such as debugging).


        Stefan



  reply	other threads:[~2013-04-15 13:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-21 15:02 Emacs needs truely useful flex matching Le Wang
2013-03-21 17:49 ` Óscar Fuentes
2013-03-21 23:34   ` Le Wang
2013-03-21 23:58   ` Stefan Monnier
2013-03-22  1:00     ` Le Wang
2013-03-22  8:24       ` Eli Zaretskii
2013-03-22 11:18         ` Dmitry Gutov
2013-04-14 16:48       ` Le Wang
2013-04-14 18:18         ` Stefan Monnier
2013-04-15  0:14           ` Le Wang
2013-04-15 13:50             ` Stefan Monnier [this message]
2013-03-22  2:36 ` Richard Stallman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwv61zodjf3.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=emacs-devel@gnu.org \
    --cc=l26wang@gmail.com \
    --cc=ofv@wanadoo.es \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).