From: Sam Halliday
Newsgroups: gmane.emacs.help
Subject: Re: BUG 20703 further evidence
Date: Wed, 13 Jan 2016 13:36:21 -0800 (PST)
Message-ID: <5a73d376-ec6e-4ad3-8575-667629306d55@googlegroups.com>
References: <5ab4af6b-5b7d-40f9-b49f-2d8cc6926e9f@googlegroups.com>
To: help-gnu-emacs@gnu.org

Thanks Dmitry,

For Emacs 25 we have the option to be smarter, but since I'm on Emacs 24 I am currently in the market for an evil hack :-) (although copying the faster emacs-25 implementation might not be a bad idea as well, since this is a particularly slow part of using Emacs).

The most sensible approach for my use case seems to be simply excluding that one file from indexing, because I can do that from my .ctags. I actually hadn't thought of it until you mentioned it! I was thinking along the lines of a function that deletes all the long lines from a TAGS file, as part of a validation / cleanup phase. If you have a recipe in mind for that, it would be pretty useful.

Could you please copy out your proposed changes in full? I won't be applying them against their sources, I'll just put them in my scratch buffer and execute them in the running instance.

Best regards,
Sam

On Wednesday, 13 January 2016 21:25:43 UTC, Dmitry Gutov wrote:
> Hi Sam,
>
> On 01/13/2016 08:54 PM, Sam Halliday wrote:
>
> > I have been seeing a problem that is described in this bug report
> >
> > https://debbugs.gnu.org/db/20/20703.html
> >
> > I have applied the suggested patch to etags-tags-completion-table (copied below in completeness for your convenience) and trapped an error case.
>
> You should try the current version in emacs-25, it's smaller and faster
> than previously, although it also probably fails at long-enough lines.
>
> > I'm triggering the error in an extremely long line of code (46,000 characters!). I presume somebody programmatically generated the line and pasted it into the source. A workaround could be to simply filter such lines at the ctag building or loading stage, just something that deletes "long" lines, whatever that may mean. Probably 500 characters is long enough!
> >
> > I could also look at adding maximum sizes to my regexes in ctags, but that really isn't a general solution because many ctags patterns do not have such limits.
>
> I can think of some other possible solutions:
>
> - External pre-processor that removes lines that are too long.
>
> - Extra step, together with a custom variable, in visit-tags-table, that
> goes through the opened files and does the same.
>
> - re-search-forward with limit, as implemented in the patch below
> (against emacs-25), that might work against problematic files like that
> (I haven't tested it).
>
> I don't really know if we should install it, though, because it adds a
> performance overhead of ~10%. And I don't know if this problem is common
> enough.
>
> Because another way to combat it is at the source: through judicious
> application of --exclude argument. As a bonus, the generation phase will
> become faster as well (sometimes dramatically).
>
> Should we add a validation phase to visit-tags-table instead? Like, one
> that would say "your TAGS files contains obviously malformed entries
> from file XXX.min.js, go back and ignore it"?
>
> diff --git a/lisp/progmodes/etags.el b/lisp/progmodes/etags.el
> index 2db7220..9a663d4 100644
> --- a/lisp/progmodes/etags.el
> +++ b/lisp/progmodes/etags.el
> @@ -1252,8 +1252,9 @@ etags-file-of-tag
>            str
>          (expand-file-name str (file-truename default-directory))))))
>
> +(defvar etags--table-line-limit 500)
>
> -(defun etags-tags-completion-table () ; Doc string?
> +(defun etags-tags-completion-table () ; Doc string?
>    (let (table
>          (progress-reporter
>           (make-progress-reporter
> @@ -1263,10 +1264,13 @@ etags-tags-completion-table
>        (goto-char (point-min))
>        ;; This regexp matches an explicit tag name or the place where
>        ;; it would start.
> -      (while (re-search-forward
> -              "[\f\t\n\r()=,; ]?\177\\\(?:\\([^\n\001]+\\)\001\\)?"
> -              nil t)
> -        (push (prog1 (if (match-beginning 1)
> +      (while (not (eobp))
> +        (if (not (re-search-forward
> +                  "[\f\t\n\r()=,; ]?\177\\\(?:\\([^\n\001]+\\)\001\\)?"
> +                  ;; Avoid lines that are too long (bug#20703).
> +                  (+ (point) etags--table-line-limit) t))
> +            (forward-line 1)
> +          (push (prog1 (if (match-beginning 1)
>                         ;; There is an explicit tag name.
>                         (buffer-substring (match-beginning 1) (match-end 1))
>                       ;; No explicit tag name. Backtrack a little,
> @@ -1277,7 +1281,7 @@ etags-tags-completion-table
>                         (buffer-substring (point) (match-beginning 0))
>                       (goto-char (match-end 0))))
>                  (progress-reporter-update progress-reporter (point)))
> -              table)))
> +              table))))
>      table))
>
>  (defun etags-snarf-tag (&optional use-explicit) ; Doc string?
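
For the cleanup idea discussed above (deleting over-long lines from a TAGS file), a minimal Emacs Lisp sketch might look like the following. The names `my-tags-max-line-length` and `my-tags-delete-long-lines` are hypothetical and not part of etags.el, and the 500-character default is just the figure floated in this thread. It assumes that a tag entry is any line containing the DEL character (\177), and it leaves the size recorded in each section header stale; whether every consumer of the file tolerates that has not been verified here.

```elisp
;; Sketch only: these names are hypothetical, not part of etags.el.
(defvar my-tags-max-line-length 500
  "Tag lines longer than this many characters are treated as malformed.")

(defun my-tags-delete-long-lines (&optional limit)
  "Delete over-long tag lines from the current buffer.
A tag line is any line containing the DEL character.  Lines longer
than LIMIT characters (default `my-tags-max-line-length') are removed.
Section header lines are never touched, so the size recorded in each
header is left stale.  Return the number of lines deleted."
  (interactive)
  (let ((limit (or limit my-tags-max-line-length))
        (inhibit-read-only t)
        (deleted 0))
    (save-excursion
      (goto-char (point-min))
      (while (not (eobp))
        (let ((bol (line-beginning-position))
              (eol (line-end-position)))
          (if (and (> (- eol bol) limit)
                   (save-excursion (search-forward "\177" eol t)))
              (progn
                ;; Delete the line and its newline; point is then
                ;; already at the start of the following line.
                (delete-region bol (line-beginning-position 2))
                (setq deleted (1+ deleted)))
            (forward-line 1)))))
    (when (called-interactively-p 'interactive)
      (message "Deleted %d long tag line(s)" deleted))
    deleted))
```

One way to try it is to evaluate the definitions in the scratch buffer, open the TAGS file as an ordinary buffer, run `M-x my-tags-delete-long-lines`, and save before visiting the table again. Unlike the bounded `re-search-forward` in the patch, this changes the file itself, so excluding the offending source at generation time remains the cleaner fix where that is possible.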