From mboxrd@z Thu Jan 1 00:00:00 1970
From: Aleksey Cherepanov
Newsgroups: gmane.emacs.bugs
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Mon, 24 Feb 2014 20:03:17 +0400
Message-ID: <20140224160317.GA2475@openwall.com>
References: <83k3co4hzd.fsf@gnu.org> <20140222124413.GA4971@openwall.com>
 <83vbw72t05.fsf@gnu.org> <20140222160217.GA15616@openwall.com>
 <83ios72j8b.fsf@gnu.org> <20140222185511.GA23643@openwall.com>
 <838ut23lo9.fsf@gnu.org> <20140223195659.GA23581@openwall.com>
 <20140223230251.GA30257@openwall.com>
In-Reply-To: <20140223230251.GA30257@openwall.com>
To: Agustin Martin
Cc: 16800@debbugs.gnu.org
User-Agent: Mutt/1.5.21 (2010-09-15)
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:86134
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="SUOF0GtieIMvvwua"
Content-Disposition: inline

--SUOF0GtieIMvvwua
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I tried several (possibly wrong) implementations of
flyspell-word-search-backward and timed them against t.txt, which is
produced by the one-liner below. All implementations are attached.

perl -e 'print(((join " ", ("met and") x 10) . "\n") x 30000)' > t.txt

Each implementation builds on the previous one:

my-test-agustin
  - implementation from Agustin Martin, with regexp-quote added
my-test-concat-up
  - and concat moved up, out of the loop
my-test-concat-up-goto
  - and goto-char moved into setq
my-test-concat-up-goto-notcap
  - and ?: added to the first group
my-test-concat-up-goto-notcap-bob
  - and \b replaced by \`
my-test-concat-up-goto-notcap-bob-bobp
  - and goto-char replaced with a conditional forward-char (on bobp)
my-test-concat-up-goto-notcap-nobob-bobp
  - and the first group is removed; the beginning-of-buffer case is
    handled separately
my-test-concat-up-goto-notcap-nobob-nobobp
  - and the bobp check is replaced by progn, thanks to the separate
    handling
my-test-goto-notcap-nobob-nobobp
  - and concat moved back down into the loop
my-test-concat-up-goto-notcap-nobob-bobp-fixed
  - fixed to handle the beginning of the buffer correctly

 # | String | Time                | Result | Function name
 1 | nd     | (0 0 192227 640000) | nil    | my-test-agustin
 2 | nd     | (0 0 192569 63000)  | nil    | my-test-concat-up
 3 | nd     | (0 0 193895 468000) | nil    | my-test-concat-up-goto
 4 | nd     | (0 0 194372 743000) | nil    | my-test-concat-up-goto-notcap
 5 | nd     | (0 0 151535 868000) | nil    | my-test-concat-up-goto-notcap-bob
 6 | nd     | (0 0 131831 49000)  | nil    | my-test-concat-up-goto-notcap-bob-bobp
 7 | nd     | (0 0 92012 191000)  | nil    | my-test-concat-up-goto-notcap-nobob-bobp
 8 | nd     | (0 0 93928 281000)  | nil    | my-test-concat-up-goto-notcap-nobob-nobobp
 9 | nd     | (0 0 93796 52000)   | nil    | my-test-goto-notcap-nobob-nobobp
10 | nd     | (0 0 94061 645000)  | nil    | my-test-concat-up-goto-notcap-nobob-bobp-fixed

This is taken from the *Messages* output of (my-try "nd") run in t.txt.

The last four functions are very close, and their relative order often
changes between runs due to fluctuations. Strictly speaking, they cannot
be compared properly on this file because, I think, re-search-backward
should always return nil here.

Functions 7, 8 and 9 are not correct: they report a match when we search
for the word at the beginning of the buffer while point is in the middle
of that word. Function 10 has logic to handle this case. Other corner
cases should be thought through and tested too.

The timings may differ for other files and other words.
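
For reading the Time column: each entry is an Emacs time list of the
form (HIGH LOW USEC PSEC), so row 1 is about 0.192 s and row 7 about
0.092 s. A minimal sketch of the conversion follows. The names
my-demo-time-agustin, my-demo-time-nobob-bobp and my-demo-seconds are
made up for this illustration; only float-time is a real Emacs function.

```elisp
;; Time lists copied from rows 1 and 7 of the table.
;; The variable and function names here are hypothetical.
(defconst my-demo-time-agustin '(0 0 192227 640000))
(defconst my-demo-time-nobob-bobp '(0 0 92012 191000))

(defun my-demo-seconds (time)
  "Convert the (HIGH LOW USEC PSEC) list TIME to floating-point seconds."
  (float-time time))

;; Print both timings in seconds and the ratio between them.
(message "agustin: %.3f s, nobob-bobp: %.3f s, ratio: %.2f"
         (my-demo-seconds my-demo-time-agustin)
         (my-demo-seconds my-demo-time-nobob-bobp)
         (/ (my-demo-seconds my-demo-time-agustin)
            (my-demo-seconds my-demo-time-nobob-bobp)))
```

This only makes the numbers easier to compare; it says nothing about
how stable they are from run to run.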

On Mon, Feb 24, 2014 at 03:02:51AM +0400, Aleksey Cherepanov wrote:
> I've performed some tests against my .org file (not in emacs -Q):

> On Sun, Feb 23, 2014 at 11:56:59PM +0400, Aleksey Cherepanov wrote:
> > Maybe it would be faster to not capture word but capture one char or
> > void but I doubt the difference would be noticable.
>
> 307899: (0 3 174172 939000) :: \(?:[^[:alpha:]]\|\`\)\([[:alpha:]]+\)
> 307899: (0 3 250515 907000) :: \([^[:alpha:]]\|\`\)\(?:[[:alpha:]]+\)
> 307899: (0 3 218270 136000) :: \([^[:alpha:]]\|\`\)[[:alpha:]]+

> Unexpectedly capturing of word works a bit faster. Maybe it is not a
> word but the second group and it would work differently for search
> forward. Or alpha+ instead of fixed word caused it. Anyway the
> difference is very small.

We could avoid capturing altogether, and that is faster, as the last
four functions show.

Thanks!

-- 
Regards,
Aleksey Cherepanov

--SUOF0GtieIMvvwua
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="t.el"

;; Implementation from Agustin Martin with additional regexp-quote
(defun my-test-agustin (word bound &optional ignore-case)
  (save-excursion
    (let ((r '())
          (inhibit-point-motion-hooks t)
          (flyspell-not-casechars (flyspell-get-not-casechars))
          p)
      (while
          (and (not r)
               (setq p (re-search-backward
                        (concat "\\(" flyspell-not-casechars "\\|\\b\\)"
                                "\\(" (regexp-quote word) "\\)"
                                flyspell-not-casechars)
                        bound t)))
        (goto-char (match-beginning 2))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat "\\(" flyspell-not-casechars "\\|\\b\\)"
                            "\\(" (regexp-quote word) "\\)"
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (re-search-backward word-re bound t)))
        (goto-char (match-beginning 2))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat "\\(" flyspell-not-casechars "\\|\\b\\)"
                            "\\(" (regexp-quote word) "\\)"
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (goto-char (match-beginning 2)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto-notcap (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat "\\(?:" flyspell-not-casechars "\\|\\b\\)"
                            "\\(" (regexp-quote word) "\\)"
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            ;; The word is group 1 here because the
                            ;; first group is non-capturing.
                            (goto-char (match-beginning 1)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto-notcap-bob (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat "\\(?:" flyspell-not-casechars "\\|\\`\\)"
                            "\\(" (regexp-quote word) "\\)"
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            ;; The word is group 1 here because the
                            ;; first group is non-capturing.
                            (goto-char (match-beginning 1)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

(defun my-test-concat-up-goto-notcap-bob-bobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat "\\(?:" flyspell-not-casechars "\\|\\`\\)"
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (unless (bobp)
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))

;; Wrong
(defun my-test-concat-up-goto-notcap-nobob-bobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (unless (bobp)
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (setq p (goto-char (point-min)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p))))
      r)))

;; Wrong
(defun my-test-concat-up-goto-notcap-nobob-nobobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (progn
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (setq p (goto-char (point-min)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p))))
      r)))

;; Wrong
(defun my-test-goto-notcap-nobob-nobobp (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward
                             (concat flyspell-not-casechars
                                     (regexp-quote word)
                                     flyspell-not-casechars)
                             bound t)
                            (progn
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (setq p (goto-char (point-min)))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p))))
      r)))

(defun my-test-concat-up-goto-notcap-nobob-bobp-fixed (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (and (re-search-backward word-re bound t)
                            (unless (bobp)
                              (forward-char)
                              (point)))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      (unless r
        (let ((pos (point)))
          (setq p (goto-char (point-min)))
          (and (search-forward word (length word) t)
               (let ((lw (flyspell-get-word)))
                 (if (and (consp lw)
                          (if ignore-case
                              (string-equal (downcase (car lw)) (downcase word))
                            (string-equal (car lw) word)))
                     (setq r p))))))
      r)))

;; Time every implementation, searching backward for WORD from the end
;; of the current buffer, and print one line per run.
(defun my-try (word)
  (message "%s"
           (mapconcat
            (lambda (func)
              (goto-char (point-max))
              (let* ((time (current-time))
                     ;; Pass WORD through instead of a hard-coded "nd".
                     (res (funcall func word nil)))
                (format ":>: %s %S =%S %S" word
                        (time-subtract (current-time) time)
                        res func)))
            (let ((lst '(my-test-agustin
                         my-test-concat-up
                         my-test-concat-up-goto
                         my-test-concat-up-goto-notcap
                         my-test-concat-up-goto-notcap-bob
                         my-test-concat-up-goto-notcap-bob-bobp
                         my-test-concat-up-goto-notcap-nobob-bobp
                         my-test-concat-up-goto-notcap-nobob-nobobp
                         my-test-goto-notcap-nobob-nobobp
                         my-test-concat-up-goto-notcap-nobob-bobp-fixed)))
              ;; Run the whole list twice; append needs no cl library.
              (append lst lst))
            "\n")))

--SUOF0GtieIMvvwua--
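
The beginning-of-buffer case that functions 7 to 10 handle separately
can be reproduced without flyspell. The following is only a minimal
sketch under stated assumptions: "[^[:alpha:]]" stands in for the value
of (flyspell-get-not-casechars), and my-demo-search-backward and
my-demo-not-alpha are hypothetical names that are neither part of
flyspell nor of the attached t.el.

```elisp
;; Hypothetical helper for illustration only.  TEXT is inserted into a
;; temporary buffer and REGEXP is searched backward from its end.
(defun my-demo-search-backward (regexp text)
  "Return the start of the last match of REGEXP in TEXT, or nil."
  (with-temp-buffer
    (insert text)
    (goto-char (point-max))
    (re-search-backward regexp nil t)))

;; Assumption: stands in for (flyspell-get-not-casechars).
(defvar my-demo-not-alpha "[^[:alpha:]]")

;; No anchor: the word at the very start of the buffer has no character
;; before it, so the leading [^[:alpha:]] cannot match there.
(my-demo-search-backward
 (concat my-demo-not-alpha (regexp-quote "nd") my-demo-not-alpha)
 "nd met and\n")
;; => nil

;; With the \` alternative the same search finds the word at position 1.
(my-demo-search-backward
 (concat "\\(?:" my-demo-not-alpha "\\|\\`\\)"
         (regexp-quote "nd") my-demo-not-alpha)
 "nd met and\n")
;; => 1
```

Dropping the \` alternative is what makes the regexp cheaper, and it is
also why those variants need an explicit check at point-min after the
loop.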