From: Dmitry Gutov <dgutov@yandex.ru>
To: lee <lee@yagibdah.de>
Cc: 20703@debbugs.gnu.org
Subject: bug#20703: 24.4; Stack overflow in regexp matcher
Date: Wed, 3 Jun 2015 03:58:10 +0300
Message-ID: <556E5122.3070907@yandex.ru>
In-Reply-To: <871tht24vq.fsf@heimdali.yagibdah.de>

On 06/03/2015 12:10 AM, lee wrote:

> Processing goes to 42% before the debugger comes up:

Good. That means this error is not limited to Projectile.

> I tried to find the line and only got to the point where so much of the
> file was cut out that I didn't manage to go back to a step at which I'm
> getting the error.  If I have some time this weekend, I can try again.

If you can get it down to even 1000 lines that you're comfortable 
sending, that will be good enough.

I'd like to improve our regexp to avoid the overflow problem if 
possible; however, the line in question is likely simply too long. If 
there are a lot of these lines in TAGS (and there probably are, since 
it's 1.8GB), you'll need to improve the way it is generated anyway.

With Projectile, that would likely mean adding some directories to the 
ignored list (see projectile-tags-exclude-patterns).
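
As a rough sketch of what that could look like: the directory names 
below are placeholders, and this assumes your Projectile version derives 
its tag exclusions from projectile-globally-ignored-directories, which 
is worth checking against the version you have installed.

```elisp
;; Assumption: "node_modules" and "build" stand in for whatever large
;; generated or vendored directories exist in your project.
(with-eval-after-load 'projectile
  (dolist (dir '("node_modules" "build"))
    (add-to-list 'projectile-globally-ignored-directories dir)))
```

After changing the list, regenerate TAGS and compare its size.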

> Isn't there a way to get a better hint than the pretty vague "42%"?

You can open the file and isearch-forward-regexp for .\{200,\} (or some 
bigger value). That will find abnormally long lines.
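
To list the offending lines instead of stepping through them, a small 
helper along these lines may also work. It is only a sketch: 
my-find-long-lines and its default limit of 200 are hypothetical, not 
part of Emacs or etags.el.

```elisp
;; Hypothetical helper, not part of Emacs: collect lines longer than LIMIT.
(defun my-find-long-lines (&optional limit)
  "Return a list of (LINE-NUMBER . LENGTH) for lines longer than LIMIT.
LIMIT defaults to 200 characters.  Scans the accessible portion of
the current buffer and leaves point where it was."
  (let ((limit (or limit 200))
        (line 1)
        (result nil))
    (save-excursion
      (goto-char (point-min))
      (while (not (eobp))
        ;; Length of the current line in characters, excluding the newline.
        (let ((len (- (line-end-position) (line-beginning-position))))
          (when (> len limit)
            (push (cons line len) result)))
        (forward-line 1)
        (setq line (1+ line))))
    (nreverse result)))
```

In the TAGS buffer, evaluating (my-find-long-lines 1000) with M-: 
returns the line numbers and lengths of every line over 1000 characters.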

Or try this patch:

diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index bf57770..4e6a844 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -1267,18 +1267,22 @@ buffer-local values of tags table format variables."
       ;;   \5 is the explicitly-specified tag name.
       ;;   \6 is the line to start searching at;
       ;;   \7 is the char to start searching at.
-      (while (re-search-forward
-	      "^\\(\\([^\177]+[^-a-zA-Z0-9_+*$:\177]+\\)?\
+      (condition-case err
+          (while (re-search-forward
+                  "^\\(\\([^\177]+[^-a-zA-Z0-9_+*$:\177]+\\)?\
 \\([-a-zA-Z0-9_+*$?:]+\\)[^-a-zA-Z0-9_+*$?:\177]*\\)\177\
 \\(\\([^\n\001]+\\)\001\\)?\\([0-9]+\\)?,\\([0-9]+\\)?\n"
-	      nil t)
-	(push	(prog1 (if (match-beginning 5)
-			   ;; There is an explicit tag name.
-			   (buffer-substring (match-beginning 5) (match-end 5))
-			 ;; No explicit tag name.  Best guess.
-			 (buffer-substring (match-beginning 3) (match-end 3)))
-		  (progress-reporter-update progress-reporter (point)))
-		table)))
+                  nil t)
+            (push	(prog1 (if (match-beginning 5)
+                                   ;; There is an explicit tag name.
+                                   (buffer-substring (match-beginning 5) (match-end 5))
+                                 ;; No explicit tag name.  Best guess.
+                                 (buffer-substring (match-beginning 3) (match-end 3)))
+                          (progress-reporter-update progress-reporter (point)))
+                        table))
+        (error
+         (message "error happened near %d" (point))
+         (error "%s" (error-message-string err)))))
     table))

 (defun etags-snarf-tag (&optional use-explicit) ; Doc string?
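
The pattern the patch relies on can also be tried in isolation. The 
sketch below is an illustration with a hypothetical name 
(my-scan-reporting-position), not code from etags.el: it runs a search 
loop under condition-case, reports where point was when an error was 
signaled, and then re-signals the original error. It uses signal for 
the re-raise, which keeps the original error symbol, whereas the patch 
above calls error.

```elisp
;; Hypothetical illustration, not from etags.el.
(defun my-scan-reporting-position (regexp)
  "Count matches of REGEXP from point to the end of the buffer.
REGEXP must not match the empty string, or the loop will not advance.
If the search signals an error, report point and re-signal the error."
  (let ((count 0))
    (condition-case err
        (while (re-search-forward regexp nil t)
          (setq count (1+ count)))
      (error
       ;; Point is wherever the search had reached when the error occurred.
       (message "error happened near %d" (point))
       ;; Re-signal with the original error symbol and data.
       (signal (car err) (cdr err))))
    count))
```

Once the message gives a buffer position, M-x goto-char with that 
number should land near the problematic line.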







