From mboxrd@z Thu Jan 1 00:00:00 1970
From: Agustin Martin
Newsgroups: gmane.emacs.bugs
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Fri, 28 Feb 2014 12:45:45 +0100
Message-ID: <20140228114545.GA8669@agmartin.aq.upm.es>
References: <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com>
 <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com>
 <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com>
 <20140223230251.GA30257@openwall.com> <20140224160317.GA2475@openwall.com>
 <20140226203202.GA23749@agmartin.aq.upm.es>
In-Reply-To: <20140226203202.GA23749@agmartin.aq.upm.es>
To: Aleksey Cherepanov, 16800@debbugs.gnu.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="TB36FDmn/VVEgNH/"

--TB36FDmn/VVEgNH/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Feb 26, 2014 at 09:32:02PM +0100, Agustin Martin wrote:
> On Mon, Feb 24, 2014 at 08:03:17PM +0400, Aleksey Cherepanov wrote:
> > I played with different (maybe wrong) implementations of
> > flyspell-word-search-backward and measured time against t.txt
> > (produced by the one-liner). All implementations are attached.
>
> [ ... Tons of extensive and impressive debugging ... ]
>
> > We could avoid capturing at all. And it works faster as shown by 4
> > last functions.
>
> Hi,
>
> Thanks a lot for the extensive debugging and for all the suggestions. I
> have been playing with something based in your last function, but trying
> to get something more compact, see below current status

[ ... ]

> I did some efficiency test and it seemed similar to those of your efficient
> functions. Need to check further for corner cases, bugs, etc ...

Hi, Aleksey

Please find attached my first candidate for commit. It is similar to what I
sent before, but I needed to add an explicit check for a word at eob in
`flyspell-word-search-forward'.

I will try to do more testing before committing. It seems to work well with
the file generated by your one-liner, even with corner cases like new
misspellings added at bob or eob, but the wider the testing the better. I hope
no one will generate files with words containing something in OTHERCHARS.

Thanks for all your help

--
Agustin

--TB36FDmn/VVEgNH/
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="flyspell.el_flyspell-word-search.3.diff"

--- flyspell.el.orig	2014-02-26 19:05:29.651986038 +0100
+++ flyspell.el	2014-02-28 12:01:03.930010553 +0100
@@ -1048,10 +1048,21 @@
 ;;*---------------------------------------------------------------------*/
 (defun flyspell-word-search-backward (word bound &optional ignore-case)
   (save-excursion
-    (let ((r '())
-          (inhibit-point-motion-hooks t)
-          p)
-      (while (and (not r) (setq p (search-backward word bound t)))
+    (let* ((r '())
+           (inhibit-point-motion-hooks t)
+           (flyspell-not-casechars (flyspell-get-not-casechars))
+           (word-re (concat flyspell-not-casechars
+                            (regexp-quote word)
+                            flyspell-not-casechars))
+           p)
+      (while
+          (and (not r)
+               (setq p (if (re-search-backward word-re bound t)
+                           ;; word-re match begins one char before word
+                           (progn (forward-char) (point))
+                         ;; Check above does not match similar word at b-o-b
+                         (goto-char (point-min))
+                         (search-forward word (length word) t))))
         (let ((lw (flyspell-get-word)))
           (if (and (consp lw)
                    (if ignore-case
@@ -1066,10 +1077,25 @@
 ;;*---------------------------------------------------------------------*/
 (defun flyspell-word-search-forward (word bound)
   (save-excursion
-    (let ((r '())
-          (inhibit-point-motion-hooks t)
-          p)
-      (while (and (not r) (setq p (search-forward word bound t)))
+    (let* ((r '())
+           (inhibit-point-motion-hooks t)
+           (word-end (nth 2 (flyspell-get-word)))
+           (flyspell-not-casechars (flyspell-get-not-casechars))
+           (word-re (concat flyspell-not-casechars
+                            (regexp-quote word)
+                            flyspell-not-casechars))
+           p)
+      (while
+          (and (not r)
+               (setq p (if (= word-end (point-max))
+                           nil ;; Current word is at e-o-b. No forward search
+                         (if (re-search-forward word-re bound t)
+                             ;; word-re match ends one char after word
+                             (progn (backward-char) (point))
+                           ;; Check above does not match similar word at e-o-b
+                           (goto-char (point-max))
+                           (search-backward word (- (point-max)
+                                                    (length word)) t)))))
         (let ((lw (flyspell-get-word)))
           (if (and (consp lw) (string-equal (car lw) word))
               (setq r p)

--TB36FDmn/VVEgNH/--
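
Illustration (not part of the original message): the patch wraps the quoted
word in a not-casechars pattern on both sides, so a match needs one delimiter
character before and after the word. That is why a copy of the word sitting
directly at the beginning or end of the buffer is not found by the regexp and
gets a separate check. The following is only a minimal sketch of that effect.
All `sketch-' names are hypothetical, and the character class is an assumed
stand-in for whatever `flyspell-get-not-casechars' returns for the active
dictionary.

```elisp
;; Assumed stand-in for the value of (flyspell-get-not-casechars);
;; the real value depends on the dictionary in use.
(defvar sketch-not-casechars "[^[:alpha:]]")

(defun sketch-delimited-re (word)
  "Return a regexp matching WORD with one delimiter character on each side."
  (concat sketch-not-casechars (regexp-quote word) sketch-not-casechars))

(defun sketch-count-delimited (text word)
  "Count the occurrences of WORD in TEXT that the delimited regexp finds."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((count 0)
          (re (sketch-delimited-re word)))
      (while (re-search-forward re nil t)
        (setq count (1+ count))
        ;; The match ends one char after the word; step back so that
        ;; this delimiter can also open the next match.
        (backward-char))
      count)))

;; "foo" appears three times here, but the copies at the very start and
;; the very end of the buffer lack a delimiter on one side.
(sketch-count-delimited "foo bar foo baz foo" "foo")
```

Under these assumptions only the middle occurrence is counted, which matches
the situation the b-o-b and e-o-b branches in the patch are meant to cover.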