all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Aleksey Cherepanov <aleksey.4erepanov@gmail.com>
To: Agustin Martin <agustin.martin@hispalinux.es>
Cc: 16800@debbugs.gnu.org
Subject: bug#16800: 24.3; flyspell works slow on very short words at the end of big file
Date: Sat, 1 Mar 2014 14:33:05 +0400	[thread overview]
Message-ID: <20140301103305.GA600@openwall.com> (raw)
In-Reply-To: <20140228231141.GA20782@openwall.com>

[-- Attachment #1: Type: text/plain, Size: 1685 bytes --]

On Sat, Mar 01, 2014 at 03:11:41AM +0400, Aleksey Cherepanov wrote:
> Wow! Your 'if' in 'while's condition is very elegant. Nice!
> 
> On Fri, Feb 28, 2014 at 12:45:45PM +0100, Agustin Martin wrote:
> > Please find attached my first candidate for commit. Is similar to what I
> > sent before, but needed to add an explicit check for word at eob in
> > `flyspell-word-search-forward'. 
> > 
> > Will try to have more testing before committing. Seems to work well with the
> > file generated by your one-liner, even with corner cases like new
> > misspellings added at bob or eob, but the wider the testing the better.
> 
> I've wrote a small fuzzer. It is in attach. To run it:
> $ LANG=C emacs -Q --eval '(load-file "t2.el")'
> Then C-j to start. It modifies buffer you are in.

There is a mistake in my-try-mixed-pairs, fixed version is attached.

> (kbd "nd SPC and SPC nd C-b") fails to highlight the second "nd" as
> duplicate. It is a problem with bound equal to (length word) in
> -backward function. I did not check it when I wrote it.
> > +			 (search-forward word (length word) t))))
>                          (search-forward word (1+ (length word)) t))))

(1+ ...) is wrong, it should be similar to -forward: (+ (point-min)
...) because (point-min) is not always 1 (narrowing could change
this).

BTW flyspell does not escape restrictions/narrowing when it searches
for duplicate. Would not it be more convenient to widen before search?

Like
(save-restriction
  (widen)
  ... search ...

> One "nd" is colored as duplicate due to -backward function after that
> fix. I did not touch it yet because it is a time for a break for me.

Thanks!

-- 
Regards,
Aleksey Cherepanov

[-- Attachment #2: t2.el --]
[-- Type: text/plain, Size: 11071 bytes --]

(require 'cl)
(require 'flyspell)

(setq my-fuzzer-buffer-name "*temp for fuzzer*")
(switch-to-buffer my-fuzzer-buffer-name)
(unless (= (point-min) (point-max))
  (error "Could not operate on non-empty buffer"))

(flyspell-mode 1)

(random t)

;; Orig
(defun my-test-backward-orig (word bound &optional ignore-case)
  (save-excursion
    (let ((r '())
	  (inhibit-point-motion-hooks t)
	  p)
      (while (and (not r) (setq p (search-backward word bound t)))
	(let ((lw (flyspell-get-word)))
	  (if (and (consp lw)
		   (if ignore-case
		       (string-equal (downcase (car lw)) (downcase word))
		     (string-equal (car lw) word)))
	      (setq r p)
	    (goto-char p))))
      r)))
(defun my-test-forward-orig (word bound)
  (save-excursion
    (let ((r '())
	  (inhibit-point-motion-hooks t)
	  p)
      (while (and (not r) (setq p (search-forward word bound t)))
	(let ((lw (flyspell-get-word)))
	  (if (and (consp lw) (string-equal (car lw) word))
	      (setq r p)
	    (goto-char (1+ p)))))
      r)))

;; Agustin Martin
(defun my-test-backward-agustin (word bound &optional ignore-case)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     flyspell-not-casechars
                     (regexp-quote word)
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (if (re-search-backward word-re bound t)
                           (progn (forward-char) (point))
                         ;; Check if word is at bob
                         (goto-char (point-min))
                         (search-forward word (length word) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))
(defun my-test-forward-agustin (word bound)
  (save-excursion
    (let* ((r '())
          (inhibit-point-motion-hooks t)
          (word-end (nth 2 (flyspell-get-word)))
          (flyspell-not-casechars (flyspell-get-not-casechars))
          (word-re (concat flyspell-not-casechars
                           (regexp-quote word)
                           flyspell-not-casechars))
          p)
      (while
         (and (not r)
              (setq p (if (= word-end (point-max))
                          nil ;; Current word is at e-o-b. No forward search
                        (if (re-search-forward word-re bound t)
                            ;; word-re match ends one char after word
                            (progn (backward-char) (point))
                          ;; Check above does not match similar word at e-o-b
                          (goto-char (point-max))
                          (search-backward word (- (point-max)
                                                   (length word)) t)))))
	(let ((lw (flyspell-get-word)))
	  (if (and (consp lw) (string-equal (car lw) word))
	      (setq r p)
	    (goto-char (1+ p)))))
      r)))

;; Fixed
(defun my-test-backward-agustin-fixed (word bound &optional ignore-case)
  ;; (my-test-backward-agustin word bound ignore-case))
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat
                     flyspell-not-casechars
                     (regexp-quote word)
                     flyspell-not-casechars))
           p)
      (while
          (and (not r)
               (setq p (if (re-search-backward word-re bound t)
                           (progn (forward-char) (point))
                         ;; Check if word is at bob
                         (goto-char (point-min))
                         (search-forward word (+ (point-min)
                                                 (length word)) t))))
        (let ((lw (flyspell-get-word)))
          (if (and (consp lw)
                   (if ignore-case
                       (string-equal (downcase (car lw)) (downcase word))
                     (string-equal (car lw) word)))
              (setq r p)
            (goto-char p))))
      r)))
(defun my-test-forward-agustin-fixed (word bound)
  (save-excursion
    (let* ((r '())
           (inhibit-point-motion-hooks t)
           (flyspell-not-casechars (flyspell-get-not-casechars))
           (word-re (concat flyspell-not-casechars
                            (regexp-quote word)
                            flyspell-not-casechars))
           p)
      (flyspell-get-word)
      (while
          (and (not r)
               (setq p (if (eobp)
                           nil ;; Current word is at e-o-b. No forward search
                         (if (re-search-forward word-re bound t)
                             ;; word-re match ends one char after word
                             (progn (backward-char) (point))
                           ;; Check above does not match similar word at e-o-b
                           (goto-char (point-max))
                           (and (search-backward word (- (point-max)
                                                         (length word)) t)
                                (goto-char (point-max)))))))
	(let ((lw (flyspell-get-word)))
	  (if (and (consp lw) (string-equal (car lw) word))
	      (setq r p)
	    (goto-char (1+ p)))))
      r)))

(defun my-make-test-macro ()
  (let* ((good "met")
         (sep "SPC")
         (bad "nd")
         (oc "'")
         (bol "C-a")
         ;; not really eol but enough
         (eol "C-e")
         (parts (list good sep bad oc bol eol))
         (len (length parts)))
    (eval `(kbd ,(mapconcat (lambda (a)
                              (nth (random len) parts))
                            (make-list (1+ (random 100)) 0)
                            " ")))))

;; nil if everythings is equal,
;; 'badtext if text is not equal,
;; position is the first position with different properties.
(defun my-compare-strings-with-properties (a b)
  (if (string= (car a) (car b))
      (let ((len (length (car a)))
            (pos 0)
            (badpos nil)
            (faces1 (cadr a))
            (faces2 (cadr b)))
        (while (and (not badpos)
                    (< pos len))
          (unless (equal (nth pos faces1)
                         (nth pos faces2))
              (setq badpos pos))
          ;; (message ">> %d" pos)
          (setq pos (1+ pos)))
        ;; (if badpos
        ;;     (progn
        ;;       (message ":>> faces1 %S" faces1)
        ;;       (message ":>> faces2 %S" faces2)))
        badpos)
    'badtext))

(defun my-try-macro (macro)
  (let ((strings
         ;; (message ">> count = %d, macro = %S" count macro)
         (mapcar
          (lambda (name)
            (delete-region (point-min) (point-max))
            (letf (((symbol-function 'flyspell-word-search-forward)
                    (intern (concat "my-test-forward-" (symbol-name name))))
                   ((symbol-function 'flyspell-word-search-backward)
                    (intern (concat "my-test-backward-" (symbol-name name)))))
              ;; (message ">> pre %S %d" name count)
              (execute-kbd-macro macro)
              ;; (message ">> post %S %d" name count)
              )
            (list (buffer-string)
                  (mapcar (lambda (pos)
                            (get-char-property pos 'face))
                          (number-sequence (point-min) (point-max)))))
          '(orig new))))
    (my-compare-strings-with-properties (car strings) (cadr strings))))

;; It may not reduce to the minimun in one run. It fails at reductions
;; if 2 or more chars should be removed at the same time.
(defun my-reduce (macro)
  (let ((bad (my-try-macro macro))
        (fails 0)
        newmacro)
    (if bad
      (while (< fails 100)
      (let ((pos (random (length macro))))
        (setq newmacro (concat (substring macro 0 pos) (substring macro (1+ pos))))
        ;; (message ">> %S" macro)
        ;; (message ">> %S" newmacro)
        (if (my-try-macro newmacro)
            (progn
              (setq fails 0)
              (setq macro newmacro))
          (setq fails (1+ fails)))))
      (message ":>>   We reduce only faulty macros"))
    macro))

;; Change this to use other functions instead of -agustin-fixed
(defun my-reset-new ()
  (defun my-test-backw+ard-new (word bound &optional ignore-case)
    (my-test-backward-agustin-fixed word bound ignore-case))
  (defun my-test-forward-new (word bound)
    (my-test-forward-agustin-fixed word bound)))
(my-reset-new)

(defun my-try-mixed-pairs (macro)
  (unwind-protect
      (if (my-try-macro macro)
          (progn
            (my-reset-new)
            (defun my-test-backward-new (word bound &optional ignore-case)
              (my-test-backward-orig word bound ignore-case))
            (if (my-try-macro macro)
                (message ":>> Difference is from -forward function"))
            (my-reset-new)
            (defun my-test-forward-new (word bound)
              (my-test-forward-orig word bound))
            (if (my-try-macro macro)
                (message ":>> Difference is from -backward function")))
        (message ":>>   We mix pairs only for faulty macros"))
    (my-reset-new)))

(defun my-fuzz ()
  (interactive)
  (unless (string= (ispell-get-otherchars) "[']")
    (error "Unexpected not-casechars value"))
  (buffer-disable-undo)
  (unwind-protect
      (let ((more t)
            (count 0)
            (time (current-time)))
        (while (and more
                    (< count (if my-macro 1 15)))
          (let* ((macro (or my-macro (my-make-test-macro)))
                 (bad (my-try-macro macro)))
            (setq more (not bad))
            (unless more
              (message ":>> Bad at %S running %S" bad macro)
              (my-try-mixed-pairs macro)
              (message ":>> Reduced macro: %S" (my-reduce macro))))
          (setq count (1+ count)))
        (message ":>> Fuzzing: %d macros are finished in %S"
                 count
                 (subtract-time (current-time) time))
        (message ":>> %s" (if more "Without differences" "There are differences")))
    (buffer-enable-undo))
  nil)
(global-set-key (kbd "C-j") 'my-fuzz)

(split-window-right)
(other-window 1)
(view-echo-area-messages)
(other-window 1)

;; For manual debug

;; (defun flyspell-word-search-backward (word bound &optional ignore-case)
;;   (my-test-backward-agustin-fixed word bound ignore-case))
;; (defun flyspell-word-search-forward (word bound)
;;   (my-test-forward-agustin-fixed word bound))

;; Define non-nil to run only one test with this macro not randomly
(setq my-macro nil)

;; (setq my-macro (kbd "nd SPC and SPC nd C-a"))
;; (setq my-macro (kbd "nd SPC nd C-a"))

;; (setq my-macro (kbd "nd SPC and C-a"))

(setq my-macro (kbd "n SPC n C-a"))

;; (setq my-macro (kbd "nd C-e"))


  reply	other threads:[~2014-03-01 10:33 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-18 20:56 bug#16800: 24.3; flyspell works slow on very short words at the end of big file Aleksey Cherepanov
2014-02-21 10:15 ` Eli Zaretskii
2014-02-21 14:38   ` Agustin Martin
2014-02-21 15:12     ` Eli Zaretskii
2014-02-21 15:21       ` Eli Zaretskii
2014-02-22 12:44       ` Aleksey Cherepanov
2014-02-22 13:10         ` Eli Zaretskii
2014-02-22 16:02           ` Aleksey Cherepanov
2014-02-22 16:41             ` Eli Zaretskii
2014-02-22 18:55               ` Aleksey Cherepanov
2014-02-22 20:16                 ` Aleksey Cherepanov
2014-02-22 21:03                 ` Eli Zaretskii
2014-02-23  1:26                   ` Agustin Martin
2014-02-23 18:36                     ` Eli Zaretskii
2014-02-23 19:56                     ` Aleksey Cherepanov
2014-02-23 23:02                       ` Aleksey Cherepanov
2014-02-24 16:03                         ` Aleksey Cherepanov
2014-02-26 20:32                           ` Agustin Martin
2014-02-28 11:45                             ` Agustin Martin
2014-02-28 11:51                               ` Eli Zaretskii
2014-03-01 21:44                                 ` Aleksey Cherepanov
2014-03-02  3:56                                   ` Eli Zaretskii
2014-03-09 17:36                                     ` Agustin Martin
2014-03-09 18:02                                       ` Aleksey Cherepanov
2014-03-09 18:24                                         ` Eli Zaretskii
2014-02-28 23:11                               ` Aleksey Cherepanov
2014-03-01 10:33                                 ` Aleksey Cherepanov [this message]
2014-03-01 15:50                                   ` Aleksey Cherepanov
2014-03-01 21:39                                 ` Aleksey Cherepanov
2014-03-09 17:25                                 ` Agustin Martin
2015-03-06 21:46                                   ` Agustin Martin
2015-03-07  8:09                                     ` Eli Zaretskii
2014-02-23 20:39                     ` Aleksey Cherepanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140301103305.GA600@openwall.com \
    --to=aleksey.4erepanov@gmail.com \
    --cc=16800@debbugs.gnu.org \
    --cc=agustin.martin@hispalinux.es \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.