From: Sam Halliday
Newsgroups: gmane.emacs.bugs
Subject: bug#35119: 26.1; narrow-to-region loses word-start/symbol-start information at end
Date: Wed, 3 Apr 2019 14:05:25 +0100
References: <87y34r1glv.fsf@gmail.com> <83a7h7e3ex.fsf@gnu.org>
To: Eli Zaretskii
Cc: 35119@debbugs.gnu.org

To be clear, I still think this is a bug... but I'm now thinking that
the bug is in re-search-forward.

This alternative looking-back works for the example but is broken in
other ways.

(defun haskell-tng-lexer:greedy-looking-back (regexp lower)
  (let ((upper (+ (point) 1)) ;; must be +1 to include zero-lengths
        (start lower))
    (save-excursion
      (catch 'hit
        (while (< start upper)
          (goto-char start)
          (re-search-forward regexp upper 't)
          (when (= (point) (- upper 1))
            (throw 'hit 't))
          (setq start (+ 1 start)))
        nil))))

On Wed, 3 Apr 2019 at 14:01, Sam Halliday wrote:
>
> Hmm, on further investigation I think this may just be regexp behaviour.
>
> I came up with this as an alternative to `looking-back'
>
> (defun my-looking-back (regexp lower)
>   (let ((upper (point))
>         (start lower))
>     (save-excursion
>       (catch 'hit
>         (while (< start upper)
>           (goto-char start)
>           (re-search-forward regexp upper 't)
>           (when (= (point) upper)
>             (throw 'hit 't))
>           (setq start (+ 1 start)))
>         nil))))
>
> and it also fails to match the : in the example. So perhaps limit is
> also excluding the zero-length implied by the subsequent character.
>
> On Wed, 3 Apr 2019 at 13:30, Sam Halliday wrote:
> >
> > Hi Eli,
> >
> > Sorry that was a terrible bug report.
> >
> > This impacts me in `looking-back'. Here's an interactive snippet to
> > demonstrate the problem (not minimised to `narrow-to-region'):
> >
> > (defun look-for-35119 ()
> >   (interactive)
> >   (if (looking-back
> >        (rx (: word-end ":" word-start))
> >        ;;(rx (: word-end ":"))
> >        (- (point) 1) 't)
> >       (message "hit")
> >     (message "miss")))
> >
> > in emacs-lisp-mode, which defines : as non-word, interactively
> > evaluate look-for-35119 when the point is just after the colon in this
> > example text
> >
> > wibble:wobble
> >
> > I would expect to see "hit", but we get "miss". To demonstrate that
> > the word-start is the cause of the problem, try the commented regexp
> > and try again, you'll get "hit" but of course this regexp is not what
> > is intended. For example, it would also match in between :: in the
> > following:
> >
> > wibble::wobble
> >
> > The cause is that the `narrow-to-region' call inside `looking-back' is
> > dropping the word-start zero length match at the beginning of wobble.
> > This may or may not be a bug in narrow-to-region, but I'm quite sure
> > it's a bug in `looking-back'. There is most likely a similar example
> > demonstrating that the zero lengths are missing at the start as well
> > as the end.
> >
> > I've tried playing around with multiple alternative implementations of
> > `looking-back' but none are working for me.
> > Probably the best workaround I can think of is to extend the
> > `narrow-to-region' call by one more character at the start and the
> > end. Dealing with the start is easy, we just goto-char limit+1, but
> > dealing with the end is difficult as we need to put an anychar \\.
> > matcher in the doctored regexp and then the match-end is off-by-one
> > from what the user expects, so then we have to doctor that, and then
> > all hell breaks loose.
> >
> > Does that make sense?
> >
> >
> > On Wed, 3 Apr 2019 at 12:25, Eli Zaretskii wrote:
> > >
> > > > From: Sam Halliday
> > > > Date: Wed, 03 Apr 2019 12:19:08 +0100
> > > >
> > > > If the function `narrow-to-region' (as it is in `looking-back') is used
> > > > to restrict the region prior to an invocation of re-search-forward or
> > > > looking-at, then zero length regexp patterns are lost at the boundaries.
> > >
> > > Could you please provide a recipe to reproduce the issue? I'm not
> > > sure I understand what is the problem you are describing.
> > >
> > > Thanks.