From: Stefan Monnier <monnier@iro.umontreal.ca>
To: help-gnu-emacs@gnu.org
Subject: Re: will we ever have zero width assertions in regexps?
Date: Mon, 07 Feb 2011 15:30:25 -0500
Message-ID: <jwvd3n3twxv.fsf-monnier+gnu.emacs.help@gnu.org>
In-Reply-To: slrnikist1.vk8.nospam-abuse@powdermilk.math.berkeley.edu

>>> So you have a REx which is matched against a line, but you want (in
>>> addition to the usual effects of matching) to know whether it "wanted"
>>> the match to overflow into the following line?
>> 
>>> If so, it looks like "reusing the continuation state" would not be a
>>> serious optimization - it would add just a small multiplicative
>>> constant to the "use only the hypothetical bit" scenario...
>> 
>> We could probably make it work with just that extra bit, indeed.
>> But with the full intermediate state, we get to just "start the search
>> with last line's state" instead of having to "start the search from the
>> previous N lines since they all ended with the <wantmore> bit set", so
>> it will happily work with many-lines cases without having to reparse
>> those many lines N times.

> Hmm, I thought about a different scenario: if the bit is set, then one
> switches to a DIFFERENT REx designed for a multi-line case.  Otherwise
> why not just run it against the rest of the buffer, instead of
> one-line?

Because we don't want to match those regexps against the whole buffer
every time the buffer is modified (the buffer may be large).
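
To illustrate the "start the search with last line's state" idea quoted
above, here is a rough sketch in Python (not Emacs code). Everything in it
is an assumption made for illustration: a hypothetical four-state automaton
for C-style /* ... */ comments, and the made-up helper names scan_line,
scan_all and rescan_from. The end-of-line state is cached for each line, so
after an edit only the changed line and those lines whose end state actually
changes are rescanned.

```python
# Hypothetical four-state automaton recognising C-style /* ... */ comments.
NORMAL, SLASH, COMMENT, STAR = range(4)

def step(state, ch):
    """Advance the automaton by one character (no backtracking)."""
    if state == NORMAL:
        return SLASH if ch == "/" else NORMAL
    if state == SLASH:
        if ch == "*":
            return COMMENT
        return SLASH if ch == "/" else NORMAL
    if state == COMMENT:
        return STAR if ch == "*" else COMMENT
    # state == STAR: inside a comment, the previous character was "*"
    if ch == "/":
        return NORMAL
    return STAR if ch == "*" else COMMENT

def scan_line(line, state):
    """Scan one line plus its newline; return the state at end of line."""
    for ch in line + "\n":
        state = step(state, ch)
    return state

def scan_all(lines):
    """Return the end-of-line state of every line, starting from NORMAL."""
    states, state = [], NORMAL
    for line in lines:
        state = scan_line(line, state)
        states.append(state)
    return states

def rescan_from(lines, states, k):
    """After line k changed, rescan starting with line k-1's cached state.

    Stops as soon as a line's new end state equals its cached one, and
    returns the number of lines that had to be rescanned.
    """
    state = states[k - 1] if k > 0 else NORMAL
    count = 0
    for i in range(k, len(lines)):
        state = scan_line(lines[i], state)
        count += 1
        if state == states[i]:
            break
        states[i] = state
    return count
```

An edit in the middle of a long comment costs one line of rescanning here,
whereas with only a <wantmore> bit the scan would have to restart from the
line where the comment began.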

Also, it can be tricky to match only some of the font-lock regexps
against the whole buffer, since font-lock-keywords is normally defined
with the assumption that the regexps are applied in turn, that earlier
ones prevent subsequent ones from being applied, and that we start with
a fresh, unhighlighted buffer.  So in general, if we want to apply one
of the font-lock-keywords to the whole buffer, we have to do it for all
of them.
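
The ordering dependence can be seen in a toy model, sketched in Python
rather than Emacs Lisp. The function fontify and the two example rules are
hypothetical and only approximate font-lock's default behaviour: rules are
applied in turn, and a match is skipped if any part of it has already been
given a face.

```python
import re

def fontify(text, keywords):
    """Apply (pattern, face) rules in order to a fresh, unhighlighted text.

    A match is skipped when any character in it already has a face, so
    earlier rules block later ones.
    """
    faces = [None] * len(text)
    for pattern, face in keywords:
        for m in re.finditer(pattern, text):
            span = range(m.start(), m.end())
            if any(faces[i] is not None for i in span):
                continue  # already highlighted by an earlier rule
            for i in span:
                faces[i] = face
    return faces

text = 'x = "if y"'
string_rule = (r'"[^"]*"', "string")
keyword_rule = (r"\bif\b", "keyword")

in_order = fontify(text, [string_rule, keyword_rule])
reversed_order = fontify(text, [keyword_rule, string_rule])
```

With the string rule first, the "if" inside the quotes stays part of the
string.  With the rules swapped, "if" is highlighted as a keyword and the
string rule's match is then skipped entirely, which is why running one rule
on its own, or out of order, can give a different result.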

BTW, another reason to want a non-backtracking matcher can be seen in
the recent thread "Stack overflow in regexp matcher".


        Stefan