From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Gutov
Newsgroups: gmane.emacs.bugs
Subject: bug#20703: 24.4; Stack overflow in regexp matcher
Date: Wed, 3 Jun 2015 03:58:10 +0300
Message-ID: <556E5122.3070907@yandex.ru>
References: <87siac1yqq.fsf@heimdali.yagibdah.de> <556B8A7E.8080705@yandex.ru> <87lhg31f38.fsf@heimdali.yagibdah.de> <556CA839.1090402@yandex.ru> <871tht24vq.fsf@heimdali.yagibdah.de>
In-Reply-To: <871tht24vq.fsf@heimdali.yagibdah.de>
To: lee
Cc: 20703@debbugs.gnu.org

On 06/03/2015 12:10 AM, lee wrote:

> Processing goes to 42% before the debugger comes up:

Good. That means this error is not limited to Projectile.

> I tried to find the line and only got to the point where so much of the
> file was cut out that I didn't manage to go back to a step at which I'm
> getting the error. If I have some time this weekend, I can try again.

If you can get it down to even 1000 lines that you're comfortable
sending, that will be good enough.

I'd like to improve our regexp to avoid the overflow problem if
possible; however, the line in question is likely simply too long. If
there are a lot of these lines in TAGS (and there probably are, since
it's 1.8GB), you'll need to improve the method of its generation anyway.
With Projectile, that would likely mean adding some directories to the
ignored list; see projectile-tags-exclude-patterns.

> Isn't there a way to get a better hint than the pretty vague "42%"?

You can open the file and isearch-forward-regexp for .\{200,\} (or some
bigger value). That will find abnormally long lines.
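
If searching interactively gets tedious on a file that size, the same
check can be done in one pass from Lisp. Below is a minimal sketch, not
anything that exists in etags.el: the function name my-find-long-lines
and its return format are made up for illustration. It walks the
current buffer and collects every line of at least THRESHOLD
characters, mirroring the .\{200,\} search.

```elisp
;; Hypothetical helper for illustration; not part of etags.el or Emacs.
(defun my-find-long-lines (&optional threshold)
  "Return a list of (LINE POSITION LENGTH) for abnormally long lines.
A line qualifies when it has at least THRESHOLD characters (default 200).
Only the accessible portion of the current buffer is scanned."
  (let ((threshold (or threshold 200))
        (line 1)
        (hits nil))
    (save-excursion
      (goto-char (point-min))
      (while (not (eobp))
        (let ((len (- (line-end-position) (line-beginning-position))))
          (when (>= len threshold)
            ;; Record where the long line starts so it is easy to jump to.
            (push (list line (line-beginning-position) len) hits)))
        (forward-line 1)
        (setq line (1+ line))))
    ;; Hits were pushed in reverse; return them in buffer order.
    (nreverse hits)))
```

With the TAGS file visited, evaluating (my-find-long-lines 1000) via
M-: should list the offending lines; goto-char on the POSITION element
takes you to the start of one.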

Or try this patch:

diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
index bf57770..4e6a844 100644
--- a/lisp/progmodes/etags.el
+++ b/lisp/progmodes/etags.el
@@ -1267,18 +1267,22 @@ buffer-local values of tags table format variables."
         ;; \5 is the explicitly-specified tag name.
         ;; \6 is the line to start searching at;
         ;; \7 is the char to start searching at.
-        (while (re-search-forward
-                "^\\(\\([^\177]+[^-a-zA-Z0-9_+*$:\177]+\\)?\
+        (condition-case err
+            (while (re-search-forward
+                    "^\\(\\([^\177]+[^-a-zA-Z0-9_+*$:\177]+\\)?\
 \\([-a-zA-Z0-9_+*$?:]+\\)[^-a-zA-Z0-9_+*$?:\177]*\\)\177\
 \\(\\([^\n\001]+\\)\001\\)?\\([0-9]+\\)?,\\([0-9]+\\)?\n"
-                nil t)
-          (push (prog1 (if (match-beginning 5)
-                           ;; There is an explicit tag name.
-                           (buffer-substring (match-beginning 5) (match-end 5))
-                         ;; No explicit tag name. Best guess.
-                         (buffer-substring (match-beginning 3) (match-end 3)))
-                  (progress-reporter-update progress-reporter (point)))
-                table)))
+                    nil t)
+              (push (prog1 (if (match-beginning 5)
+                               ;; There is an explicit tag name.
+                               (buffer-substring (match-beginning 5) (match-end 5))
+                             ;; No explicit tag name. Best guess.
+                             (buffer-substring (match-beginning 3) (match-end 3)))
+                      (progress-reporter-update progress-reporter (point)))
+                    table))
+          (error
+           (message "error happened near %d" (point))
+           (error (error-message-string err)))))
       table))
 
 (defun etags-snarf-tag (&optional use-explicit) ; Doc string?