* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Štěpán Němec @ 2011-10-06 9:13 UTC
To: 9681
Quoting from <http://permalink.gmane.org/gmane.emacs.gnus.user/15052>:
===========
> What am I doing wrong?
Nothing, I think :-). I personally don't use fancy splitting,
but a deeper look at (at least Gnus 5.13's) code seems to
locate the culprit in Emacs' *backward* regular expression
"non-greedity": Position point at the end of "bugzilla.gdm",
C-u C-r "\w+" - et voilà, only one character is matched.
============
If this curious inconsistency of `re-search-backward' with
`re-search-forward' is intentional (which I hope it is not), it should
be documented, but I couldn't find anything in the manuals or
docstrings.
--
Štěpán
* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Stefan Monnier @ 2011-10-06 12:57 UTC
To: Štěpán Němec; +Cc: 9681

> If this curious inconsistency of `re-search-backward' with
> `re-search-forward' is intentional (which I hope it is not), it should
> be documented, but I couldn't find anything in the manuals or
> docstrings.

re-search-* stops at the first character position that has a match.
And then it chooses the longest match at that position.

        Stefan

* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Štěpán Němec @ 2011-10-06 18:48 UTC
To: Stefan Monnier; +Cc: 9681

[Stefan: sorry for two replies, I forgot to cc the bug list in my first
reply, also, I've changed my mind on some of the points since then, see
below.]

On Thu, Oct 06, 2011 at 08:57:09AM -0400, Stefan Monnier wrote:
> > If this curious inconsistency of `re-search-backward' with
> > `re-search-forward' is intentional (which I hope it is not), it should
> > be documented, but I couldn't find anything in the manuals or
> > docstrings.
>
> re-search-* stops at the first character position that has a match.
> And then it chooses the longest match at that position.

Thanks, but I'm not sure I understand what you mean here. Naturally, the
longest match for `re-search-backward' should be backward, not forward,
i.e. using your wording above, when searching _backward_ for \w+ in
"foobar|" where "|" is point, the "first character position that has a
match" might be "r", but it's hardly the longest match.

If I'm the only one who considers this behaviour broken (by design?[1]),
which I very much doubt, it definitely needs to at least be documented,
as I'm certainly not the only one who is very surprised by this
behaviour. In my opinion it should be fixed, though.

[1] Cf. e.g. ?\w\+ in Vim, which does the right thing.

--
Štěpán

* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Stefan Monnier @ 2011-10-07 13:02 UTC
To: Štěpán Němec; +Cc: 9681

>> re-search-* stops at the first character position that has a match.
>> And then it chooses the longest match at that position.
> Thanks, but I'm not sure I understand what you mean here. Naturally, the
> longest match for `re-search-backward' should be backward, not forward,

Ah, yes, sorry for being unclear: the search for a match goes backward,
but the matching itself goes forward.

The docstring of re-search-backward is more clear about that:

   The match found is the one starting last in the buffer
   and yet ending before the origin of the search.

> If I'm the only one who considers this behaviour broken (by design?[1]),

It's not the ideal behavior, admittedly. It's even more obvious in
`looking-back'. But fixing it would require the implementation of
a backward regexp matcher.

        Stefan

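The "search goes backward, matching goes forward" behaviour described
above can be reproduced in a temporary buffer. The following is only a
sketch, not code from this thread: `my-demo-backward-match' is a
hypothetical helper name, and the second variant assumes the optional
GREEDY argument of `looking-back', which extends a match backward as
far as it can but, unlike `re-search-backward', only finds matches that
end exactly at point.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-demo-backward-match (text regexp &optional greedy)
  "Return what REGEXP matches when searching backward from the end of TEXT.
With GREEDY non-nil, use `looking-back' with its GREEDY argument,
which extends the match backward as far as possible."
  (with-temp-buffer
    (insert text)                       ; point is now at the end of TEXT
    (when (if greedy
              ;; Anchored at point: the match must end exactly at point.
              (looking-back regexp (point-min) t)
            ;; Tries start positions going backward, matches forward.
            (re-search-backward regexp nil t))
      (match-string 0))))

(my-demo-backward-match "foobar" "\\w+")          ; => "r"
(my-demo-backward-match "foobar" "\\w+" t)        ; => "foobar"
(my-demo-backward-match "bugzilla.gdm" "\\w+")    ; => "m"
(my-demo-backward-match "bugzilla.gdm" "\\w+" t)  ; => "gdm"
```

The plain search stops at the last position where a match can start
("r" or "m"), which is the single-character result from the original
report. The greedy variant is a workaround only for the case where the
match is expected to end at point.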
* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Štěpán Němec @ 2011-10-07 13:19 UTC
To: Stefan Monnier; +Cc: 9681

On Fri, Oct 07, 2011 at 09:02:18AM -0400, Stefan Monnier wrote:
> >> re-search-* stops at the first character position that has a match.
> >> And then it chooses the longest match at that position.
> > Thanks, but I'm not sure I understand what you mean here. Naturally, the
> > longest match for `re-search-backward' should be backward, not forward,
>
> Ah, yes, sorry for being unclear: the search for a match goes backward,
> but the matching itself goes forward.
>
> The docstring of re-search-backward is more clear about that:
>
>    The match found is the one starting last in the buffer
>    and yet ending before the origin of the search.

I suppose that is more clear if you already know the behaviour, but I
didn't understand it that way, either. I think it should at least add
that the match is still forward, not backward, and that it might not
behave as expected for regexps containing constructs like * and +.

> > If I'm the only one who considers this behaviour broken (by design?[1]),
>
> It's not the ideal behavior, admittedly. It's even more obvious in
> `looking-back'. But fixing it would require the implementation of
> a backward regexp matcher.

Yeah, as I said above (and as is obvious in the message quoted in the
bug report), the set of regexps usable with `re-search-backward' seems
to be quite limited, and one has to be very careful when using it (and
even some developers apparently fail at that).

So, again: it definitely needs better documentation, and IMO it also
needs fixing.

--
Štěpán

* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Kenichi Handa @ 2011-10-11 2:03 UTC
To: Stefan Monnier; +Cc: 9681, stepnem

In article <jwvr52pj706.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> It's not the ideal behavior, admittedly. It's even more obvious in
> `looking-back'. But fixing it would require the implementation of
> a backward regexp matcher.

FYI, in Mule (the version before integrating into Emacs), we
implemented such a feature by doing these:

o Regular expression compiler written in Elisp which
  generates both forward matching and backward matching
  compiled patterns.

o Modify regex.c to accept the above patterns and do
  backward matching if necessary.

I vaguely remember that I discussed this feature with RMS
when we were going to integrate Mule's multilingual feature
into Emacs, and it was rejected because it's not related to
multilingual feature. And actually, that feature had been
used very rarely.

---
Kenichi Handa
handa@m17n.org

* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Stefan Monnier @ 2011-10-11 3:56 UTC
To: Kenichi Handa; +Cc: 9681, stepnem

>> It's not the ideal behavior, admittedly. It's even more obvious in
>> `looking-back'. But fixing it would require the implementation of
>> a backward regexp matcher.
> FYI, in Mule (the version before integrating into Emacs), we
> implemented such a feature by doing these:

I actually think it would be nice to have such a thing, but I also think
it'd be more important to have a non-backtracking regexp matcher.

> multilingual feature. And actually, that feature had been
> used very rarely.

Indeed, it's not often needed, but those few cases can be significant.

        Stefan

* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Johan Bockgård @ 2011-10-06 19:46 UTC
To: Štěpán Němec; +Cc: 9681

Štěpán Němec <stepnem@gmail.com> writes:

> If this curious inconsistency of `re-search-backward' with
> `re-search-forward' is intentional (which I hope it is not), it should
> be documented, but I couldn't find anything in the manuals or
> docstrings.

Then you must not have looked very hard.

* bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Jack Duthen @ 2012-03-16 15:49 UTC
To: 9681

On Thu, Oct 06, 2011 at 08:57:09AM -0400, Stefan Monnier wrote:
> re-search-* stops at the first character position that has a match.
> And then it chooses the longest match at that position.

Stepan wrote:
> So, again: it definitely needs better documentation,
> and IMO it also needs fixing.

Hi!

For my own imenu-prev-index-position-function, I needed a backward
regexp search which would match something like ".+" the way one (like
Stepan) can expect rather than the way it actually does (as described
by Stefan).

So, I just wrote a function to do that.

The way it handles the COUNT variable is not as good as one could want
but, as I almost never use it, I don't care.

It's not very efficient but, since I can't notice the time it takes
when used in the "*rescan" menu and since I can't imagine a better
algorithm, it's ok for me.

(defun jd-re-search-backward (regexp &optional bound noerror count)
  (let ((orig-point (point))
        bom)
    (when (re-search-backward regexp bound noerror count)
      (setq bom (point)) ; should not be useful
      (goto-char (point-min))
      (while (re-search-forward regexp orig-point 'noerror)
        ;; remember the last beginning of match
        (setq bom (match-beginning 0)))
      (goto-char bom)
      ;; set match data (erased by the last failing search) and return T
      (looking-at regexp))))

HTH
)jack(

* bug#11025: Fwd: bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
From: Jack Duthen @ 2012-03-16 16:29 UTC
To: 11025

Oops! It's a feature, not a bug...
Sorry for the noise.

I just found the same problem addressed in bug#9681.
I understood the current behavior of re-search-backward.
see my reply.

sincerely
)jack(

---------- Forwarded message ----------
From: Jack Duthen <duthen.mac.01@gmail.com>
Date: 2012/3/16
Subject: Re: bug#9681: Broken behaviour of re-search-backward (.+ matching only a single character)
To: bug-gnu-emacs@gnu.org