From: Luc Teirlinck <teirllm@dms.auburn.edu>
Cc: emacs-devel@gnu.org
Subject: Re: Matches for multiline regexps
Date: Fri, 17 Jun 2005 21:48:32 -0500 (CDT)
Message-ID: <200506180248.j5I2mW504853@raven.dms.auburn.edu>
In-Reply-To: <E1DjIIl-0001qe-7Z@fencepost.gnu.org> (message from Richard Stallman on Fri, 17 Jun 2005 10:58:35 -0400)

Richard Stallman wrote:

       Additional remark: from simpler examples, it appears that they are
       _intended_ to be line numbers.  If so, this is a bug.

   Yes, it seems to be a bug in counting the line numbers.  Could you fix
   that too?

I will take a look at it, but first a decision has to be made on how
we treat overlapping matches.  (I am talking about matches that
themselves overlap.  I have no problem handling a match that starts on
the same line on which a previous match ended, but later on the line,
so that the matches themselves do not overlap, only one of their lines.)
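
To make the distinction concrete, here is a minimal sketch in Python
rather than Emacs Lisp.  Python's `re` module only stands in for the
Emacs regexp engine, and the helper name `match_line_spans` and the
sample text are made up for illustration.  It shows two matches whose
character ranges do not overlap, although the second one starts on
the line where the first one ended:

```python
import re

def match_line_spans(text, pattern):
    """Return ((start, end), (first_line, last_line)) for each match.

    Matches are found left to right without overlap, which is how
    re.finditer searches.  Line numbers are 1-based.
    """
    spans = []
    for m in re.finditer(pattern, text):
        first_line = text.count("\n", 0, m.start()) + 1
        last_line = text.count("\n", 0, m.end()) + 1
        spans.append(((m.start(), m.end()), (first_line, last_line)))
    return spans

# Hypothetical three-line buffer: the second match begins later on line 2.
sample = "ab\ncd ab\ncd\n"
for chars, lines in match_line_spans(sample, r"ab\ncd"):
    print(chars, lines)
```

The character ranges are disjoint, so the matches themselves do not
overlap.  Only line 2 is shared between the two entries.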

The current occur implementation for multiline regexps has _several_
problems.  Apart from getting the line numbers wrong, the matches do
not get correctly displayed: only their first line is shown.  The
current implementation _tries_ to "correctly" (in one of the two
possible interpretations of what is "correct") find all matches in
case there are overlapping matches.  But it does not come close to
succeeding in that.  Worse, it has to pay for its attempt to do so by
failing to find all matches in more natural cases where there are no
overlapping matches and only one possible interpretation of "correct".
The present occur implementation differs radically in philosophy from
all other word or regexp search functions in Emacs and is backward
incompatible with Emacs 21.

I propose to have occur treat overlapping matches the same as the
other Emacs search functions do, which is also the way occur behaved
before Emacs 22.  That is, given a buffer with the following five lines:

11
11
11
11
11

`M-x occur RET 11 C-q C-j 11 RET' will find two matches, one on line 1
and one on line 3.  Those are the only matches that
`C-M-s 11 C-q C-j 11 RET C-s C-s C-s...' at beginning of buffer is
going to find.  It is what occur does in Emacs 21.  Implementing this
correctly seems relatively easy and does not require paying a price in
efficiency.  If this interpretation is good enough for C-M-s, then why
not for occur?
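
The five-line example can be checked with a short sketch.  This is
Python, not the occur code, and it assumes that resuming the search at
the end of the previous match (which is what `re.finditer` does)
models the behavior of repeated C-M-s.  The function name
`non_overlapping_match_lines` is made up for illustration:

```python
import re

def non_overlapping_match_lines(text, pattern):
    """Return the 1-based line number on which each match starts.

    Each search resumes where the previous match ended, so matches
    never overlap.
    """
    lines = []
    for m in re.finditer(pattern, text):
        lines.append(text.count("\n", 0, m.start()) + 1)
    return lines

# The five-line buffer from the message; the regexp is "11", newline, "11".
buffer_text = "11\n11\n11\n11\n11\n"
print(non_overlapping_match_lines(buffer_text, r"11\n11"))
```

Under that assumption the matches start on lines 1 and 3.  Line 5 has
no partner line left, so there is no third match.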

Trying to fix occur to handle the other interpretation of "correct"
(matches at lines 1, 2, 3 and 4) is possible but more difficult.  (The
current occur version can do that correctly in this example, but fails
for many other examples.)  Even a completely correct implementation
would still present problems.  It could make the handling of more
natural regexps less efficient, it clashes with all other search
functions in its philosophy, and it would not be clear how to display
all multiline matches in a way that is clear and avoids excessive
redundancy, because there could be a _lot_ of overlapping lines
between matches.  With my proposal only _consecutive_ entries in the
*Occur* buffer could overlap and the overlap would be at most one
line.  With a correct implementation of the other interpretation,
there is no limit on the amount of overlap.
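
For comparison, here is a rough sketch of the other interpretation,
again in Python and not taken from any occur implementation.  It
wraps the regexp in a lookahead so that a match is attempted at every
buffer position.  This is only one way to enumerate overlapping
matches, and it reports at most one match per starting position.  The
name `overlapping_match_spans` is made up for illustration:

```python
import re

def overlapping_match_spans(text, pattern):
    """Return (first_line, last_line) for every match, overlaps included.

    The zero-width lookahead consumes no text, so the scan advances one
    character at a time and a match may start inside a previous one.
    """
    lookahead = re.compile("(?=(" + pattern + "))")
    spans = []
    for m in lookahead.finditer(text):
        first_line = text.count("\n", 0, m.start(1)) + 1
        last_line = text.count("\n", 0, m.end(1)) + 1
        spans.append((first_line, last_line))
    return spans

buffer_text = "11\n11\n11\n11\n11\n"
# Two-line regexp: matches start on lines 1, 2, 3 and 4.
print(overlapping_match_spans(buffer_text, r"11\n11"))
# Three-line regexp: line 3 belongs to all three matches.
print(overlapping_match_spans(buffer_text, r"11\n11\n11"))
```

With the three-line regexp the spans are lines 1-3, 2-4 and 3-5, so an
*Occur* buffer that showed every match in full would print line 3
three times.  Longer regexps make the repetition grow accordingly.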

Sincerely,

Luc.
