unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
@ 2018-07-26  9:44 Artem Boldarev
  2018-07-27 12:45 ` Eli Zaretskii
  2018-07-27 16:00 ` Agustin Martin
  0 siblings, 2 replies; 19+ messages in thread
From: Artem Boldarev @ 2018-07-26  9:44 UTC
  To: 32280


Checking a large enough buffer with Flyspell leads to unexpected
results (at least when spell-checking Russian, but I believe the bug
can be reproduced for other languages, at least Ukrainian).

For example, when checking a large enough buffer (large enough to trigger
flyspell-large-region), I got the following messages in the *Messages*
buffer:

Local Ispell dictionary set to ru_RU
Starting new Ispell process hunspell with ru_RU dictionary...
Checking region...
Spell Checking...100% [посимвольно]
Spell Checking completed.
  -> смом - 346: word not found
  -> стостояния - 319: word not found
  -> рекрсивного - 308: word not found
  -> универсальнее - 266: word not found
  -> генериует - 222: word not found

Flyspell was not able to find the misspelt words in the buffer I was
checking, so it could not highlight them. On the other hand, some
correctly spelt words were highlighted (именно, бесконечный,
усложняет). Under other circumstances these words are not highlighted
as misspelt (which is as it should be).

The problem turned out to be in flyspell-external-point-words: it
makes some heuristic checks before calling (flyspell-word nil t).
These checks seem to be OK for English, as I have never encountered any
problems when spell-checking English texts.

Here is a version of the function which seems to be correct:

(defun flyspell-external-point-words ()
  "Mark words from a buffer listing incorrect words in order of appearance.
The list of incorrect words should be in `flyspell-external-ispell-buffer'.
\(We finish by killing that buffer and setting the variable to nil.)
The buffer to mark them in is `flyspell-large-region-buffer'."
  (let (words-not-found
        (ispell-otherchars (ispell-get-otherchars))
        (buffer-scan-pos flyspell-large-region-beg)
        case-fold-search)
    (with-current-buffer flyspell-external-ispell-buffer
      (goto-char (point-min))
      ;; Loop over incorrect words, in the order they were reported,
      ;; which is also the order they appear in the buffer being checked.
      (while (re-search-forward "\\([^\n]+\\)\n" nil t)
        ;; Bind WORD to the next one.
        (let ((word (match-string 1)) (wordpos (point)))
          ;; Here there used to be code to see if WORD is the same
          ;; as the previous iteration, and count the number of consecutive
          ;; identical words, and the loop below would search for that many.
          ;; That code seemed to be incorrect, and on principle, should
          ;; be unnecessary too. -- rms.
          (if flyspell-issue-message-flag
              (message "Spell Checking...%d%% [%s]"
                       (floor (* 100.0 (point)) (point-max))
                       word))
          (with-current-buffer flyspell-large-region-buffer
            (goto-char buffer-scan-pos)
            (let ((keep t))
              ;; Iterate on string search until string is found as word,
              ;; not as substring.
              (while keep
                (if (search-forward word
                                    flyspell-large-region-end t)
                    (let* ((found-list
                            (save-excursion
                              ;; Move back into the match
                              ;; so flyspell-get-word will find it.
                              (forward-char -1)
                              (flyspell-get-word)))
                           (found (car found-list))
                           (found-length (length found))
                           (misspell-length (length word)))
                      (when (or
                             ;; Size and content matches, we really found it.
                             (and (= found-length misspell-length)
                                  (string= found word))
                             ;; Matches as part of a boundary-char separated
                             ;; word.
                             (member word
                                     (split-string found ispell-otherchars))
                             ;; ispell treats beginning of some TeX
                             ;; commands as nroff control sequences
                             ;; and strips them in the list of
                             ;; misspelled words thus giving a
                             ;; non-existent word.  Skip if ispell
                             ;; is used, string is a TeX command
                             ;; (char before beginning of word is
                             ;; backslash) and none of the previous
                             ;; conditions match.
                             (and (not ispell-really-aspell)
                                  (save-excursion
                                    (goto-char (- (nth 1 found-list) 1))
                                    (if (looking-at "[\\]")
                                        t
                                      nil))))
                        (setq keep nil)
                        (flyspell-word nil t)
                        ;; Search for next misspelled word will begin from
                        ;; end of last validated match.
                        (setq buffer-scan-pos (point))))
                  ;; Record if misspelling is not found and try new one
                  (cl-pushnew (concat " -> " word " - "
                                      (int-to-string wordpos))
                              words-not-found :test #'equal)
                  (setq keep nil)))))))
      ;; we are done
      (if flyspell-issue-message-flag (message "Spell Checking completed.")))
    ;; Warn about not found misspellings
    (dolist (word words-not-found)
      (message "%s: word not found" word))
    ;; Kill and forget the buffer with the list of incorrect words.
    (kill-buffer flyspell-external-ispell-buffer)
    (setq flyspell-external-ispell-buffer nil)))

The important lines are the following:

                  ;; Size and content matches, we really found it.
                  (and (= found-length misspell-length)
                       (string= found word))
                  ;; Matches as part of a boundary-char separated
                  ;; word.
                  (member word
                          (split-string found ispell-otherchars))
                  ;; ispell treats beginning of some TeX
                  ;; commands as nroff control sequences
                  ;; and strips them in the list of
                  ;; misspelled words thus giving a
                  ;; non-existent word.  Skip if ispell
                  ;; is used, string is a TeX command
                  ;; (char before beginning of word is
                  ;; backslash) and none of the previous
                  ;; conditions match.
                  (and (not ispell-really-aspell)
                       (save-excursion
                         (goto-char (- (nth 1 found-list) 1))
                         (if (looking-at "[\\]")
                             t
                           nil))))


The important parts of my configuration:

     (setq ispell-program-name "hunspell")
     ;; set dictionaries
     (setq ispell-dictionary-alist
           '(("en_GB"
              "[A-Za-z]" "[^A-Za-z]"
              "[']" nil ("-d en_GB") nil iso-8859-1)

             ("ru_RU"
  "[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
  "[^АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
              "[-]" nil ("-d ru_RU") nil utf-8)

             ("uk_UA"
  "[АБВГДЕЄЖЗИІЇЙКЛМНОПРСТУФХЦЧШЩЬЮЯАабвгдеєжзиіїйклмнопрстуфхцчшщьюя]"
  "[^АБВГДЕЄЖЗИІЇЙКЛМНОПРСТУФХЦЧШЩЬЮЯАабвгдеєжзиіїйклмнопрстуфхцчшщьюя]"
              "[-']" nil ("-d uk_UA") nil utf-8)
             ))


I hope you will investigate the problem.


In GNU Emacs 26.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.22.30)
  of 2018-07-05 built on juergen
Windowing system distributor 'HC-Consult', version 11.0.12000000
Recent messages:
Omitted 3 lines.
Omitting...
(Nothing to omit)
Wrote /home/artem/.emacs.data/desktop/emacs.desktop-lock
Desktop: 1 frame, 19 buffers restored, 1 failed to restore.
Turning on magit-auto-revert-mode...done (0.427s, 27 buffers checked)
For information about GNU Emacs and the GNU system, type C-h C-a.
Making completion list...
user-error: Beginning of history; no preceding item
user-error: End of history; no default available

Configured using:
  'configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib
  --localstatedir=/var --with-x-toolkit=gtk3 --with-xft --with-modules
  'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong
  -fno-plt' CPPFLAGS=-D_FORTIFY_SOURCE=2
  LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS NOTIFY
ACL GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS
GTK3 X11 MODULES THREADS LIBSYSTEMD LCMS2

Important settings:
   value of $LANG: en_US.UTF-8
   locale-coding-system: utf-8-unix

Major mode: Text

Minor modes in effect:
   display-line-numbers-mode: t
   desktop-save-mode: t
   global-magit-file-mode: t
   diff-auto-refine-mode: t
   magit-auto-revert-mode: t
   global-git-commit-mode: t
   async-bytecomp-package-mode: t
   gud-tooltip-mode: t
   flyspell-mode: t
   shell-dirtrack-mode: t
   winner-mode: t
   global-auto-complete-mode: t
   ido-everywhere: t
   show-paren-mode: t
   global-auto-revert-mode: t
   cl-old-struct-compat-mode: t
   tooltip-mode: t
   global-eldoc-mode: t
   electric-indent-mode: t
   mouse-wheel-mode: t
   menu-bar-mode: t
   file-name-shadow-mode: t
   global-font-lock-mode: t
   font-lock-mode: t
   blink-cursor-mode: t
   auto-composition-mode: t
   auto-encryption-mode: t
   auto-compression-mode: t
   column-number-mode: t
   line-number-mode: t
   transient-mark-mode: t

Load-path shadows:
/home/artem/.emacs.data/elpa/xcscope-20180426.12/xcscope hides 
/usr/share/emacs/site-lisp/xcscope
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox hides 
/usr/share/emacs/26.1/lisp/org/ox
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-texinfo hides 
/usr/share/emacs/26.1/lisp/org/ox-texinfo
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-publish hides 
/usr/share/emacs/26.1/lisp/org/ox-publish
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-org hides 
/usr/share/emacs/26.1/lisp/org/ox-org
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-odt hides 
/usr/share/emacs/26.1/lisp/org/ox-odt
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-md hides 
/usr/share/emacs/26.1/lisp/org/ox-md
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-man hides 
/usr/share/emacs/26.1/lisp/org/ox-man
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-latex hides 
/usr/share/emacs/26.1/lisp/org/ox-latex
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-icalendar 
hides /usr/share/emacs/26.1/lisp/org/ox-icalendar
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-html hides 
/usr/share/emacs/26.1/lisp/org/ox-html
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-beamer hides 
/usr/share/emacs/26.1/lisp/org/ox-beamer
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ox-ascii hides 
/usr/share/emacs/26.1/lisp/org/ox-ascii
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org hides 
/usr/share/emacs/26.1/lisp/org/org
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-w3m hides 
/usr/share/emacs/26.1/lisp/org/org-w3m
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-version hides 
/usr/share/emacs/26.1/lisp/org/org-version
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-timer hides 
/usr/share/emacs/26.1/lisp/org/org-timer
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-table hides 
/usr/share/emacs/26.1/lisp/org/org-table
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-src hides 
/usr/share/emacs/26.1/lisp/org/org-src
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-rmail hides 
/usr/share/emacs/26.1/lisp/org/org-rmail
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-protocol 
hides /usr/share/emacs/26.1/lisp/org/org-protocol
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-plot hides 
/usr/share/emacs/26.1/lisp/org/org-plot
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-pcomplete 
hides /usr/share/emacs/26.1/lisp/org/org-pcomplete
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-mouse hides 
/usr/share/emacs/26.1/lisp/org/org-mouse
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-mobile hides 
/usr/share/emacs/26.1/lisp/org/org-mobile
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-mhe hides 
/usr/share/emacs/26.1/lisp/org/org-mhe
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-macs hides 
/usr/share/emacs/26.1/lisp/org/org-macs
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-macro hides 
/usr/share/emacs/26.1/lisp/org/org-macro
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-loaddefs 
hides /usr/share/emacs/26.1/lisp/org/org-loaddefs
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-list hides 
/usr/share/emacs/26.1/lisp/org/org-list
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-lint hides 
/usr/share/emacs/26.1/lisp/org/org-lint
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-irc hides 
/usr/share/emacs/26.1/lisp/org/org-irc
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-install hides 
/usr/share/emacs/26.1/lisp/org/org-install
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-inlinetask 
hides /usr/share/emacs/26.1/lisp/org/org-inlinetask
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-info hides 
/usr/share/emacs/26.1/lisp/org/org-info
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-indent hides 
/usr/share/emacs/26.1/lisp/org/org-indent
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-id hides 
/usr/share/emacs/26.1/lisp/org/org-id
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-habit hides 
/usr/share/emacs/26.1/lisp/org/org-habit
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-gnus hides 
/usr/share/emacs/26.1/lisp/org/org-gnus
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-footnote 
hides /usr/share/emacs/26.1/lisp/org/org-footnote
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-feed hides 
/usr/share/emacs/26.1/lisp/org/org-feed
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-faces hides 
/usr/share/emacs/26.1/lisp/org/org-faces
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-eww hides 
/usr/share/emacs/26.1/lisp/org/org-eww
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-eshell hides 
/usr/share/emacs/26.1/lisp/org/org-eshell
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-entities 
hides /usr/share/emacs/26.1/lisp/org/org-entities
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-element hides 
/usr/share/emacs/26.1/lisp/org/org-element
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-duration 
hides /usr/share/emacs/26.1/lisp/org/org-duration
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-docview hides 
/usr/share/emacs/26.1/lisp/org/org-docview
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-datetree 
hides /usr/share/emacs/26.1/lisp/org/org-datetree
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-ctags hides 
/usr/share/emacs/26.1/lisp/org/org-ctags
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-crypt hides 
/usr/share/emacs/26.1/lisp/org/org-crypt
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-compat hides 
/usr/share/emacs/26.1/lisp/org/org-compat
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-colview hides 
/usr/share/emacs/26.1/lisp/org/org-colview
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-clock hides 
/usr/share/emacs/26.1/lisp/org/org-clock
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-capture hides 
/usr/share/emacs/26.1/lisp/org/org-capture
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-bibtex hides 
/usr/share/emacs/26.1/lisp/org/org-bibtex
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-bbdb hides 
/usr/share/emacs/26.1/lisp/org/org-bbdb
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-attach hides 
/usr/share/emacs/26.1/lisp/org/org-attach
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-archive hides 
/usr/share/emacs/26.1/lisp/org/org-archive
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/org-agenda hides 
/usr/share/emacs/26.1/lisp/org/org-agenda
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob hides 
/usr/share/emacs/26.1/lisp/org/ob
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-vala hides 
/usr/share/emacs/26.1/lisp/org/ob-vala
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-tangle hides 
/usr/share/emacs/26.1/lisp/org/ob-tangle
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-table hides 
/usr/share/emacs/26.1/lisp/org/ob-table
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-stan hides 
/usr/share/emacs/26.1/lisp/org/ob-stan
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-sqlite hides 
/usr/share/emacs/26.1/lisp/org/ob-sqlite
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-sql hides 
/usr/share/emacs/26.1/lisp/org/ob-sql
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-shen hides 
/usr/share/emacs/26.1/lisp/org/ob-shen
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-shell hides 
/usr/share/emacs/26.1/lisp/org/ob-shell
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-sed hides 
/usr/share/emacs/26.1/lisp/org/ob-sed
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-screen hides 
/usr/share/emacs/26.1/lisp/org/ob-screen
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-scheme hides 
/usr/share/emacs/26.1/lisp/org/ob-scheme
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-sass hides 
/usr/share/emacs/26.1/lisp/org/ob-sass
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-ruby hides 
/usr/share/emacs/26.1/lisp/org/ob-ruby
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-ref hides 
/usr/share/emacs/26.1/lisp/org/ob-ref
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-python hides 
/usr/share/emacs/26.1/lisp/org/ob-python
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-processing 
hides /usr/share/emacs/26.1/lisp/org/ob-processing
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-plantuml hides 
/usr/share/emacs/26.1/lisp/org/ob-plantuml
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-picolisp hides 
/usr/share/emacs/26.1/lisp/org/ob-picolisp
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-perl hides 
/usr/share/emacs/26.1/lisp/org/ob-perl
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-org hides 
/usr/share/emacs/26.1/lisp/org/ob-org
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-octave hides 
/usr/share/emacs/26.1/lisp/org/ob-octave
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-ocaml hides 
/usr/share/emacs/26.1/lisp/org/ob-ocaml
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-mscgen hides 
/usr/share/emacs/26.1/lisp/org/ob-mscgen
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-maxima hides 
/usr/share/emacs/26.1/lisp/org/ob-maxima
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-matlab hides 
/usr/share/emacs/26.1/lisp/org/ob-matlab
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-makefile hides 
/usr/share/emacs/26.1/lisp/org/ob-makefile
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-lua hides 
/usr/share/emacs/26.1/lisp/org/ob-lua
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-lob hides 
/usr/share/emacs/26.1/lisp/org/ob-lob
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-lisp hides 
/usr/share/emacs/26.1/lisp/org/ob-lisp
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-lilypond hides 
/usr/share/emacs/26.1/lisp/org/ob-lilypond
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-ledger hides 
/usr/share/emacs/26.1/lisp/org/ob-ledger
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-latex hides 
/usr/share/emacs/26.1/lisp/org/ob-latex
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-keys hides 
/usr/share/emacs/26.1/lisp/org/ob-keys
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-js hides 
/usr/share/emacs/26.1/lisp/org/ob-js
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-java hides 
/usr/share/emacs/26.1/lisp/org/ob-java
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-io hides 
/usr/share/emacs/26.1/lisp/org/ob-io
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-hledger hides 
/usr/share/emacs/26.1/lisp/org/ob-hledger
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-haskell hides 
/usr/share/emacs/26.1/lisp/org/ob-haskell
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-groovy hides 
/usr/share/emacs/26.1/lisp/org/ob-groovy
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-gnuplot hides 
/usr/share/emacs/26.1/lisp/org/ob-gnuplot
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-fortran hides 
/usr/share/emacs/26.1/lisp/org/ob-fortran
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-forth hides 
/usr/share/emacs/26.1/lisp/org/ob-forth
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-exp hides 
/usr/share/emacs/26.1/lisp/org/ob-exp
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-eval hides 
/usr/share/emacs/26.1/lisp/org/ob-eval
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-emacs-lisp 
hides /usr/share/emacs/26.1/lisp/org/ob-emacs-lisp
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-ebnf hides 
/usr/share/emacs/26.1/lisp/org/ob-ebnf
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-dot hides 
/usr/share/emacs/26.1/lisp/org/ob-dot
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-ditaa hides 
/usr/share/emacs/26.1/lisp/org/ob-ditaa
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-css hides 
/usr/share/emacs/26.1/lisp/org/ob-css
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-core hides 
/usr/share/emacs/26.1/lisp/org/ob-core
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-coq hides 
/usr/share/emacs/26.1/lisp/org/ob-coq
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-comint hides 
/usr/share/emacs/26.1/lisp/org/ob-comint
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-clojure hides 
/usr/share/emacs/26.1/lisp/org/ob-clojure
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-calc hides 
/usr/share/emacs/26.1/lisp/org/ob-calc
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-awk hides 
/usr/share/emacs/26.1/lisp/org/ob-awk
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-asymptote 
hides /usr/share/emacs/26.1/lisp/org/ob-asymptote
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-abc hides 
/usr/share/emacs/26.1/lisp/org/ob-abc
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-R hides 
/usr/share/emacs/26.1/lisp/org/ob-R
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-J hides 
/usr/share/emacs/26.1/lisp/org/ob-J
/home/artem/.emacs.data/elpa/org-plus-contrib-20180709/ob-C hides 
/usr/share/emacs/26.1/lisp/org/ob-C

Features:
(shadow mail-extr emacsbug sendmail linum lisp-mnt macrostep-c cmacexp
macrostep irony-cdb-libclang irony-cdb-json irony-cdb-clang-complete
irony-cdb company-oddmuse company-keywords company-etags company-gtags
company-dabbrev-code company-dabbrev company-files company-capf
company-cmake company-xcode company-clang company-semantic company-eclim
company-bbdb company-irony company-template irony-eldoc flycheck-irony
irony-diagnostics irony-completion irony-snippet ac-slime sort dired-aux
jka-compr display-line-numbers hl-line plan9-theme basic-theme desktop
frameset magit-bookmark magit-obsolete magit-blame magit-stash
magit-bisect magit-remote magit-commit magit-sequence magit-notes
magit-worktree magit-tag magit-merge magit-branch magit-reset
magit-collab ghub let-alist magit-files magit-refs magit-status magit
magit-repos magit-apply magit-wip magit-log which-func imenu magit-diff
smerge-mode diff-mode magit-core magit-autorevert magit-process
magit-margin magit-mode git-commit magit-git magit-section magit-utils
crm magit-popup log-edit message rfc822 mml mml-sec epa derived epg
mm-decode mm-bodies mm-encode mailabbrev gmm-utils mailheader pcvs-util
add-log with-editor async-bytecomp async irony irony-iotask ggtags ewoc
xcscope cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align
cc-engine cc-vars cc-defs deft cl ox-bibtex ox-odt rng-loc rng-uri
rng-parse rng-match rng-dt rng-util rng-pttrn nxml-parse nxml-ns
nxml-enc xmltok nxml-util ox-latex ox-icalendar ox-html table ox-ascii
ox-publish ox org-element avl-tree generator org org-macro org-footnote
org-pcomplete org-list org-faces org-entities org-version ob-emacs-lisp
ob ob-tangle org-src ob-ref ob-lob ob-table ob-keys ob-exp ob-comint
ob-core ob-eval org-compat org-macs org-loaddefs cal-menu calendar
cal-loaddefs robe url-http tls gnutls url-auth mail-parse rfc2231 url-gw
nsm rmc inf-ruby ruby-mode smie flymake-lua company-lua lua-mode rcirc
cargo cargo-process markdown-mode color company-racer deferred company
pcase racer pos-tip f s rust-mode gud flyspell ispell eww puny mm-url
gnus nnheader gnus-util rmail rmail-loaddefs rfc2047 rfc2045 ietf-drums
mail-utils url-queue shr svg xml dom slime-trace-dialog
slime-xref-browser tree-widget wid-edit slime-fancy-inspector
slime-fuzzy slime-c-p-c slime-editing-commands slime-asdf grep
slime-references slime-compiler-notes-tree slime-autodoc slime-repl
slime-parse slime arc-mode archive-mode noutline outline easy-mmode
hyperspec browse-url elec-pair dired-x dired dired-loaddefs esh-var
esh-io esh-cmd esh-opt esh-ext esh-proc esh-arg esh-groups eshell
esh-module esh-mode esh-util bookmark pp url url-proxy url-privacy
url-expand url-methods url-history url-cookie url-domsuf url-util
mailcap tramp tramp-compat tramp-loaddefs trampver ucs-normalize shell
pcomplete parse-time format-spec elisp-slime-nav etags xref project
winner flycheck json map find-func subr-x dash advice flymake-proc
flymake warnings thingatpt auto-complete-config auto-complete edmacro
kmacro popup ido windmove paren autorevert filenotify mm-util mail-prsvr
cl-extra help-mode finder-inf tex-site rx slime-autoloads info package
easymenu epg-config url-handlers url-parse auth-source cl-seq eieio
eieio-core cl-macs eieio-loaddefs password-cache url-vars seq byte-opt
gv compile comint ansi-color ring bytecomp byte-compile cconv server
cl-loaddefs cl-lib time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win
term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote dbusbind inotify lcms2 dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 589646 60784)
  (symbols 48 55348 4)
  (miscs 40 871 2775)
  (strings 32 161266 15744)
  (string-bytes 1 4916243)
  (vectors 16 85794)
  (vector-slots 8 1289279 103516)
  (floats 8 429 500)
  (intervals 56 2676 0)
  (buffers 992 34))







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-26  9:44 bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer Artem Boldarev
@ 2018-07-27 12:45 ` Eli Zaretskii
  2018-07-28  0:00   ` Artem Boldarev
  2018-07-27 16:00 ` Agustin Martin
  1 sibling, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2018-07-27 12:45 UTC
  To: Artem Boldarev; +Cc: 32280

> From: Artem Boldarev <artem.boldarev@gmail.com>
> Date: Thu, 26 Jul 2018 12:44:26 +0300
> 
> Checking large enough buffer with FlySpell leads to the unexpected
> results (at least, when spell checking Russian, but I believe that it is
> possible to reproduce the bug for other languages, at least Ukrainian).
> 
> For example, when checking large enough buffer (large enough to trigger
> flyspell-large-region) I got the following messages in the *Messages*
> buffer:
> 
> Local Ispell dictionary set to ru_RU
> Starting new Ispell process hunspell with ru_RU dictionary...
> Checking region...
> Spell Checking...100% [посимвольно]
> Spell Checking completed.
>   -> смом - 346: word not found
>   -> стостояния - 319: word not found
>   -> рекрсивного - 308: word not found
>   -> универсальнее - 266: word not found
>   -> генериует - 222: word not found
> 
> It was not able to find the misspelt words to highlight them in the 
> buffer which I tried
> to spell check. On the other hand, some not misspelt words were 
> highlighted (именно,
> бесконечный, усложняет).  Under other circumstances, these words are not 
> highlighted as misspelt (which is as it should be).

Can you post the text where this happens?

> The problem turned out to be in the flyspell-external-point-words: It
> makes some heuristic checks before calling (flyspell-word nil t). It
> seems that these checks are OK for English, as I never encountered any
> problems when spell-checking English texts.
> 
> Here is the version of the function which seems to be correct:

AFAICT, you have removed a single line:

			     (< found-length misspell-length)

Can you take me through your reasoning why this line is incorrect, and
what assumptions it made that are correct for English, but not for
Russian?

Thanks.






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-26  9:44 bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer Artem Boldarev
  2018-07-27 12:45 ` Eli Zaretskii
@ 2018-07-27 16:00 ` Agustin Martin
  2018-07-28  0:00   ` Artem Boldarev
  2018-07-28  0:23   ` Artem Boldarev
  1 sibling, 2 replies; 19+ messages in thread
From: Agustin Martin @ 2018-07-27 16:00 UTC
  To: 32280, Artem Boldarev

On Thu, Jul 26, 2018 at 12:44:26PM +0300, Artem Boldarev wrote:
> 
> Checking large enough buffer with FlySpell leads to the unexpected
> results (at least, when spell checking Russian, but I believe that it is
> possible to reproduce the bug for other languages, at least Ukrainian).
> 
> For example, when checking large enough buffer (large enough to trigger
> flyspell-large-region) I got the following messages in the *Messages*
> buffer:
> 
> Local Ispell dictionary set to ru_RU
> Starting new Ispell process hunspell with ru_RU dictionary...
> Checking region...
> Spell Checking...100% [посимвольно]
> Spell Checking completed.
>  -> смом - 346: word not found
>  -> стостояния - 319: word not found
>  -> рекрсивного - 308: word not found
>  -> универсальнее - 266: word not found
>  -> генериует - 222: word not found
[...]

Hi,

>     (setq ispell-dictionary-alist

You should not set `ispell-dictionary-alist' yourself. If you really need an
entry with special features, add it to `ispell-local-dictionary-alist',
better with an ad-hoc name.
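
A minimal sketch of that suggestion, assuming hunspell and a Russian
dictionary that really is encoded in UTF-8. The entry name
"ru_RU-custom" is made up for illustration, and the character classes
are only an assumption about what a suitable entry might look like:

```elisp
;; Sketch only: "ru_RU-custom" is a hypothetical ad-hoc name.
;; Entry layout: (NAME CASECHARS NOT-CASECHARS OTHERCHARS MANY-OTHERCHARS-P
;;                ISPELL-ARGS EXTENDED-CHARACTER-MODE CHARACTER-SET)
(require 'ispell)
(add-to-list 'ispell-local-dictionary-alist
             '("ru_RU-custom"
               "[[:alpha:]]" "[^[:alpha:]]"
               "[-]" nil ("-d" "ru_RU") nil utf-8))
```

Using a separate name keeps the stock ru_RU entry that Emacs detects
on its own untouched, so the two can be compared side by side.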

>           '(("en_GB"
>              "[A-Za-z]" "[^A-Za-z]"
>              "[']" nil ("-d en_GB") nil iso-8859-1)
> 
>             ("ru_RU"
>  "[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
>  "[^АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
>              "[-]" nil ("-d ru_RU") nil utf-8)

As far as I know hunspell-ru is encoded in koi8-r (at least in Debian
lo-dicts), but you declare it as utf-8. Unless your dict is indeed in utf-8
and declared as such, this may be the problem.

What happens if you comment out all your "(setq ispell-dictionary-alist ... )"
stuff and just trust the list of available dictionaries provided by Emacs
(Tools/Spellchecking/Change dictionary), selecting ru_RU from it?

Regards,

-- 
Agustin






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-27 12:45 ` Eli Zaretskii
@ 2018-07-28  0:00   ` Artem Boldarev
  2018-07-29 14:09     ` Artem Boldarev
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Boldarev @ 2018-07-28  0:00 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32280

Hello Eli,
> Can you post the text where this happens?
The text where I encountered the problem is a personal e-mail, so I
cannot share it as it is. I will try to craft a sample text and describe
the steps for reproducing the bug using emacs -Q.
> AFAICT, you have removed a single line:
>
> 			     (< found-length misspell-length)

I also replaced:
;; Size matches, we really found it.
(= found-length misspell-length)

with

;; Size and content matches, we really found it.
  (and (= found-length misspell-length)
           (string= found word))

I believe that, in this case, there is no longer any need for
(< found-length misspell-length).
> Can you take me through your reasoning why this line is incorrect, and
> what assumptions it made that are correct for English, but not for
> Russian?
As for my reasoning behind the changes: I felt that it is not right to
mark the word as misspelt without actually checking the content.
Moreover, look at the original comment right after the (< found-length
misspell-length) line:
                  ;; Misspelling has higher length than
                  ;; what flyspell considers the word.
                              ;; Caused by boundary-chars mismatch.
                              ;; Validating seems safe.
I am not sure that comparing length of found word and misspelt word is 
enough to make an assumption that validating is safe (even considering 
the preceding checks). The keyword here, I think, is 'seems'. For some 
reason, it really works most of the time.
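
To illustrate the point, here is a minimal sketch of the stricter
check. The helper name demo-really-found-p is made up for this
example and is not part of flyspell.el; it only mirrors the
length-plus-content condition proposed above.

```elisp
(defun demo-really-found-p (found word)
  "Return non-nil if FOUND, the word at point, is the misspelling WORD.
Hypothetical helper: compares the length first, then the content."
  (and (= (length found) (length word))
       (string= found word)))

;; Two different six-letter words: a length-only test accepts the
;; pair, while the content test rejects it.
(= (length "abcdef") (length "uvwxyz"))   ; non-nil
(demo-really-found-p "abcdef" "uvwxyz")   ; nil
(demo-really-found-p "abcdef" "abcdef")   ; non-nil
```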

I believe that it should be possible to reproduce the bug for texts in
English too, although I have not encountered this problem while
spell checking English. I should note that flyspell-buffer works fine
for *most* of the texts in Russian and Ukrainian which I have checked,
and the issue under discussion is rarely encountered. I did not know
that it existed until I started using flyspell-buffer regularly.

Kind regards,
Artem







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-27 16:00 ` Agustin Martin
@ 2018-07-28  0:00   ` Artem Boldarev
  2018-07-30 13:20     ` Agustin Martin
  2018-07-28  0:23   ` Artem Boldarev
  1 sibling, 1 reply; 19+ messages in thread
From: Artem Boldarev @ 2018-07-28  0:00 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 32280

Hello Agustin,

Thanks for your suggestion!

Unfortunately, it does not work on my system with 'emacs -Q'. So,
somehow I need to manually configure my dictionaries anyway. I will
consider replacing 'ispell-dictionary-alist' with
'ispell-local-dictionary-alist' in my configuration. Thank you for
pointing this out.

The codepage I specified in the configuration, as it seems, is not the 
problem as spell checking works fine *most* of the time. I could 
spellcheck large amounts of text without any issues. It seems that 
hunspell always uses utf-8 internally, but I am not sure: I will try to 
investigate this.

By the way, I was able to reproduce the problem on the official Windows
build of Emacs with a different version of Hunspell and dictionaries
from LibreOffice.

> You should not set `ispell-dictionary-alist' yourself. If you really need an
> entry with special features, add it to `ispell-local-dictionary-alist',
> better with an ad-hoc name.
>
>>            '(("en_GB"
>>               "[A-Za-z]" "[^A-Za-z]"
>>               "[']" nil ("-d en_GB") nil iso-8859-1)
>>
>>              ("ru_RU"
>>   "[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
>>   "[^АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
>>               "[-]" nil ("-d ru_RU") nil utf-8)
> As far as I know hunspell-ru is encoded in koi8-r (at least in Debian
> lo-dicts), but you declare it as utf-8. Unless your dict is indeed in utf-8
> and declared as such, this may be the problem.
>
> ¿What happens if you comment all your "(setq ispell-dictionary-alist ... )"
> stuff and just trust the list of available dictionaries provided by Emacs
> (Tools/Spellchecking/Change dictionary), selecting ru_RU from it?
>
> Regards,
>







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-27 16:00 ` Agustin Martin
  2018-07-28  0:00   ` Artem Boldarev
@ 2018-07-28  0:23   ` Artem Boldarev
  2018-07-28  7:02     ` Eli Zaretskii
  1 sibling, 1 reply; 19+ messages in thread
From: Artem Boldarev @ 2018-07-28  0:23 UTC (permalink / raw)
  To: Agustin Martin, 32280

Just a quick addition. When doing as you suggested I am getting the 
following message:

Wrong type argument: stringp, nil

It is probably because both 'ispell-dictionary-alist' and 
'ispell-local-dictionary-alist' are NIL, but I have not investigated it.

> ¿What happens if you comment all your "(setq ispell-dictionary-alist ... )"
> stuff and just trust the list of available dictionaries provided by Emacs
> (Tools/Spellchecking/Change dictionary), selecting ru_RU from it?
>
> Regards,
>







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-28  0:23   ` Artem Boldarev
@ 2018-07-28  7:02     ` Eli Zaretskii
  2018-07-29 14:15       ` Artem Boldarev
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2018-07-28  7:02 UTC (permalink / raw)
  To: Artem Boldarev; +Cc: 32280, agustin6martin

> From: Artem Boldarev <artem.boldarev@gmail.com>
> Date: Sat, 28 Jul 2018 03:23:40 +0300
> 
> Just a quick addition. When doing as you suggested I am getting the 
> following message:
> 
> Wrong type argument: stringp, nil
> 
> It is probably because both 'ispell-dictionary-alist' and 
> 'ispell-local-dictionary-alist' are NIL, but I have not investigated it.

What is your version of hunspell?  And what does it produce if you
invoke "hunspell -D" from the shell prompt?






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-28  0:00   ` Artem Boldarev
@ 2018-07-29 14:09     ` Artem Boldarev
  2018-07-29 17:33       ` Eli Zaretskii
  2018-07-30  6:22       ` martin rudalics
  0 siblings, 2 replies; 19+ messages in thread
From: Artem Boldarev @ 2018-07-29 14:09 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32280

Hello,

I have crafted some sample data and written instructions on how to
reproduce the bug.

The sample file can be found by following this link:

https://chaoticlab.io/pub/flyspell-bug/flyspell-sample.txt

The instructions, along with the required code, can be downloaded from
here:

https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-reproduction.el

I made some screenshots which demonstrate the bug:

https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-linux.png
https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-windows.png

I haven't been able to demonstrate the case when a misspelt word is not 
highlighted though. I will send an update should I craft the required data.

I hope this is helpful.

Regards,
Artem

> Hello Eli,
>> Can you post the text where this happens?
> The text where I encountered the problem is a personal e-mail, so I 
> can not share it as it is. I will try to craft a sample text and 
> describe the steps for bug reproduction using emacs -Q.
>> AFAICT, you have removed a single line:
>>
>>                  (< found-length misspell-length)
>
> I am also replaced:
> ;; Size matches, we really found it.
> (= found-length misspell-length)
>
> with
>
> ;; Size and content matches, we really found it.
>  (and (= found-length misspell-length)
>           (string= found word))
>
> I believe, in this case there is no need in  (< found-length 
> misspell-length) anymore.
>> Can you take me through your reasoning why this line is incorrect, and
>> what assumptions it made that are correct for English, but not for
>> Russian?
> As about my reasoning behind the changes: I felt that it is not right 
> to mark the word as misspelt without actually checking the content. 
> Moreover, look at the original comment right behind the (< 
> found-length misspell-length) line:
>                  ;; Misspelling has higher length than
>                  ;; what flyspell considers the word.
>                              ;; Caused by boundary-chars mismatch.
>                              ;; Validating seems safe.
> I am not sure that comparing length of found word and misspelt word is 
> enough to make an assumption that validating is safe (even considering 
> the preceding checks). The keyword here, I think, is 'seems'. For some 
> reason, it really works most of the time.
>
> I believe that the bug should be possible to reproduce for texts in 
> English too. For some reason, I have not encountered this problem 
> while spell checking English. I should note that flyspell-buffer works 
> fine for *most* of the texts in Russian and Ukrainian which I have 
> checked and the discussed issue is rarely encountered. I did not know 
> that It exists until  I started using flyspell-buffer regularly.
>
> Kind regards,
> Artem
>







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-28  7:02     ` Eli Zaretskii
@ 2018-07-29 14:15       ` Artem Boldarev
  0 siblings, 0 replies; 19+ messages in thread
From: Artem Boldarev @ 2018-07-29 14:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32280, agustin6martin

Just a quick addition.

As for the hunspell version:

*On Linux:*

$ hunspell --version

@(#) International Ispell Version 3.2.06 (but really Hunspell 1.6.2)

$ hunspell -D

SEARCH PATH:
.::/usr/share/hunspell:/usr/share/myspell:/usr/share/myspell/dicts:/Library/Spelling:/home/artem//.openoffice.org/3/user/wordbook:/home/artem//.openoffice.org2/user/wordbook:/home/artem//.openoffice.org2.0/user/wordbook:/home/artem//Library/Spelling:/opt/openoffice.org/basis3.0/share/dict/ooo:/usr/lib/openoffice.org/basis3.0/share/dict/ooo:/opt/openoffice.org2.4/share/dict/ooo:/usr/lib/openoffice.org2.4/share/dict/ooo:/opt/openoffice.org2.3/share/dict/ooo:/usr/lib/openoffice.org2.3/share/dict/ooo:/opt/openoffice.org2.2/share/dict/ooo:/usr/lib/openoffice.org2.2/share/dict/ooo:/opt/openoffice.org2.1/share/dict/ooo:/usr/lib/openoffice.org2.1/share/dict/ooo:/opt/openoffice.org2.0/share/dict/ooo:/usr/lib/openoffice.org2.0/share/dict/ooo
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
/usr/share/hunspell/en_GB-large
/usr/share/hunspell/en_GB
/usr/share/hunspell/en_US-large
/usr/share/hunspell/ru_RU
/usr/share/hunspell/uk_UA
/usr/share/myspell/dicts/en_GB-large
/usr/share/myspell/dicts/en_US-large
Can't open affix or dictionary files for dictionary named "en_US".

*On Windows:*


 > hunspell --version

@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)

  > hunspell -D

SEARCH PATH:
.;;C:\Hunspell\;D:\Users\Artem\Application Data\OpenOffice.org 
2\user\wordbook;C:\Tools\Hunspell\bin\..\share\hunspell;C:\Program 
files\OpenOffice.org 2.4\share\dict\ooo\;C:\Program files\OpenOffice.org 
2.3\share\dict\ooo\;C:\Program files\OpenOffice.org 
2.2\share\dict\ooo\;C:\Program files\OpenOffice.org 
2.1\share\dict\ooo\;C:\Program files\OpenOffice.org 2.0\share\dict\ooo\
AVAILABLE DICTIONARIES (path is not mandatory for -d option):
C:\Tools\Hunspell\bin\..\share\hunspell\default
C:\Tools\Hunspell\bin\..\share\hunspell\de_AT_frami
C:\Tools\Hunspell\bin\..\share\hunspell\de_CH_frami
C:\Tools\Hunspell\bin\..\share\hunspell\de_DE_frami
C:\Tools\Hunspell\bin\..\share\hunspell\en_AU
C:\Tools\Hunspell\bin\..\share\hunspell\en_CA
C:\Tools\Hunspell\bin\..\share\hunspell\en_GB
C:\Tools\Hunspell\bin\..\share\hunspell\en_US
C:\Tools\Hunspell\bin\..\share\hunspell\en_ZA
C:\Tools\Hunspell\bin\..\share\hunspell\nb_NO
C:\Tools\Hunspell\bin\..\share\hunspell\nn_NO
C:\Tools\Hunspell\bin\..\share\hunspell\ru_RU
C:\Tools\Hunspell\bin\..\share\hunspell\sv_FI
C:\Tools\Hunspell\bin\..\share\hunspell\sv_SE
C:\Tools\Hunspell\bin\..\share\hunspell\uk_UA
Can't open affix or dictionary files for dictionary named "RU".

>> From: Artem Boldarev <artem.boldarev@gmail.com>
>> Date: Sat, 28 Jul 2018 03:23:40 +0300
>>
>> Just a quick addition. When doing as you suggested I am getting the
>> following message:
>>
>> Wrong type argument: stringp, nil
>>
>> It is probably because both 'ispell-dictionary-alist' and
>> 'ispell-local-dictionary-alist' are NIL, but I have not investigated it.
> What is your version of hunspell?  And what does it produce if you
> invoke "hunspell -D" from the shell prompt?








* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-29 14:09     ` Artem Boldarev
@ 2018-07-29 17:33       ` Eli Zaretskii
  2018-07-30  6:22       ` martin rudalics
  1 sibling, 0 replies; 19+ messages in thread
From: Eli Zaretskii @ 2018-07-29 17:33 UTC (permalink / raw)
  To: Artem Boldarev; +Cc: 32280

> From: Artem Boldarev <artem.boldarev@gmail.com>
> Cc: 32280@debbugs.gnu.org
> Date: Sun, 29 Jul 2018 17:09:54 +0300
> 
> I have crafted some sample data as well as wrote instruction how to 
> reproduce the bug.

Thanks, I will look into this within the next few days, if no one
beats me to it.






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-29 14:09     ` Artem Boldarev
  2018-07-29 17:33       ` Eli Zaretskii
@ 2018-07-30  6:22       ` martin rudalics
  2018-07-30 10:00         ` Artem Boldarev
  1 sibling, 1 reply; 19+ messages in thread
From: martin rudalics @ 2018-07-30  6:22 UTC (permalink / raw)
  To: Artem Boldarev, Eli Zaretskii; +Cc: 32280

 > https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-linux.png
 > https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-windows.png

From these images it seems immediately evident that flyspell wrongly
marks и, именно, бесконечный and усложняет as misspelled only when a
non-Cyrillic word follows it.  However, as paragraph 4 in these
examples also demonstrates, such a condition is not sufficient, since
there the words preceding 'HTML' and 'Lorem Ipsum' are not marked.

Could you try to play around with the sequencing of words in that
example?  Maybe a clearer pattern emerges.

Thanks, martin







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-30  6:22       ` martin rudalics
@ 2018-07-30 10:00         ` Artem Boldarev
  0 siblings, 0 replies; 19+ messages in thread
From: Artem Boldarev @ 2018-07-30 10:00 UTC (permalink / raw)
  To: martin rudalics, Eli Zaretskii; +Cc: 32280

Hi,

Yes, there is a clear pattern. The wrong behaviour appears only when the
Latin word which follows the Cyrillic one is at least as long as the
preceding word.

Here is the sample text:

https://chaoticlab.io/pub/flyspell-bug/sample2/flyspell-sample2.txt

Instructions for the bug reproduction are the same:

https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-reproduction.el

Screenshot without the provided fix:

https://chaoticlab.io/pub/flyspell-bug/sample2/flyspell-bug-sample2.png

Screenshot with the provided fix:

https://chaoticlab.io/pub/flyspell-bug/sample2/flyspell-bug-sample2-fixed.png

It seems logical to me that the provided fix is sufficient for this case 
considering what was changed in the problematic function.

Regards,
Artem

> > https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-linux.png
> > https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-windows.png
>
> From these images it seems immediately evident that flyspell wrongly
> marks и, именно, бесконечный and усложняет as misspelled only when a
> non-cyrillic word follows it.  However, as paragraph 4 in these
> examples also demonstrates, such condition is not sufficient since
> there the words preceding 'HTML' and 'Lorem Ipsum' are not marked.
>
> Could you try to play around with the seqeuencing of words in that
> example?  Maybe a clearer pattern emerges.
>
> Thanks, martin
>







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-28  0:00   ` Artem Boldarev
@ 2018-07-30 13:20     ` Agustin Martin
  2018-07-30 16:29       ` Artem Boldarev
  0 siblings, 1 reply; 19+ messages in thread
From: Agustin Martin @ 2018-07-30 13:20 UTC (permalink / raw)
  To: 32280, Artem Boldarev

On Sat, Jul 28, 2018 at 03:00:40AM +0300, Artem Boldarev wrote:
> Hello Agustin,
> 
> Thanks for your suggestion!
> 
> Unfortunately, it does not work on my system with 'emacs -Q'. So, somehow I
> need to manually configure my dictionaries anyway. I will consider replacing
> 'ispell-dictionary-alist'with 'ispell-local-dictionary-alist' in my
> configuration. Thank you for pointing out.

Hi,

I am using your example file with emacs -Q (Emacs 25) and had no problems;
flyspell-buffer works as expected after setting ispell-program-name to
hunspell and setting ru_RU as the dictionary (hunspell 1.6.2 here).

Local Ispell dictionary set to ru_RU
Starting new Ispell process hunspell with ru_RU dictionary...
Checking region...
Spell Checking...100% [laborum]
Spell Checking completed.

The auto-detected values for hunspell ru_RU are

(ru_RU [[:alpha:]] [^[:alpha:]]  t (-d ru_RU) nil utf-8)

Wonder why otherchars is not shown.

> The codepage I specified in the configuration, as it seems, is not the
> problem as spell checking works fine *most* of the time. I could spellcheck
> large amounts of text without any issues. It seems that hunspell always uses
> utf-8 internally, but I am not sure: I will try to investigate this.

It seems I was wrong; it has been a long time since I dug into this.

What happens if you use "[[:alpha:]]" and "[^[:alpha:]]" instead of the
explicit character strings "[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
and "[^АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"?
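
The practical difference between the two settings can be seen with a
small sketch. The helper demo-all-casechars-p is made up for
illustration, and the Cyrillic class is written as ranges for brevity:
"[[:alpha:]]" accepts letters of any script, while the explicit class
accepts Cyrillic letters only.

```elisp
(defun demo-all-casechars-p (casechars word)
  "Return non-nil if every character of WORD matches regexp CASECHARS.
Hypothetical helper, not part of ispell.el or flyspell.el."
  (string-match-p (concat "\\`\\(?:" casechars "\\)+\\'") word))

(demo-all-casechars-p "[[:alpha:]]" "именно")   ; non-nil
(demo-all-casechars-p "[[:alpha:]]" "Lorem")    ; non-nil
(demo-all-casechars-p "[А-Яа-яЁё]" "именно")    ; non-nil
(demo-all-casechars-p "[А-Яа-яЁё]" "Lorem")     ; nil
```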

Regards,

-- 
Agustin






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-30 13:20     ` Agustin Martin
@ 2018-07-30 16:29       ` Artem Boldarev
  2018-07-30 16:43         ` Agustin Martin
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Boldarev @ 2018-07-30 16:29 UTC (permalink / raw)
  To: Agustin Martin, 32280

Hi,

Thanks, Agustin, this is an interesting find! I have altered my 
configuration as you suggested, and indeed I wasn't able to trigger the 
bug any more.

https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-no-explicit-chars.png

Anyway, I am pretty confident that altering the configuration does not 
resolve the bug, but rather hides it. I think so because of the 
following reasons:

1. I do not see why my previous configuration, which uses explicitly 
specified characters, is wrong. It works fine when spell checking as you 
type and for smaller buffers and regions (when flyspell-large-region 
does not get called).

2. Without the fix, the above-discussed inconsistency exists between how 
flyspell works when you use it for:

a) spell checking as you type and checking smaller regions of text (when 
flyspell-small-region gets called).
b) spell checking large regions of text.

It should be noted that putting the cursor on a wrongly highlighted
word while flyspell mode is active gets the word rechecked and
un-highlighted. This behaviour demonstrates the inconsistency once
more.

In fact, this inconsistency is the bug which I reported.

Regards,
Artem


> Hi,
>
> I am using your example file with emacs -Q (Emacs 25) and had no problems,
> flyspell-buffer works as expected after setting ispell-program-name to
> hunspell and set ru_RU as dict (hunspell 1.6.2 here).
>
> Local Ispell dictionary set to ru_RU
> Starting new Ispell process hunspell with ru_RU dictionary...
> Checking region...
> Spell Checking...100% [laborum]
> Spell Checking completed.
>
> The auto-detected values for hunspell ru_RU are
>
> (ru_RU [[:alpha:]] [^[:alpha:]]  t (-d ru_RU) nil utf-8)
>
> Wonder why otherchars is not shown.
>
>> The codepage I specified in the configuration, as it seems, is not the
>> problem as spell checking works fine *most* of the time. I could spellcheck
>> large amounts of text without any issues. It seems that hunspell always uses
>> utf-8 internally, but I am not sure: I will try to investigate this.
> Seems I was wrong, it is a long time since I digged there.
>
> ¿What happens if you use "[[:alpha:]]" and "[^[:alpha:]]" instead if the
> explicit character strings "[АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"
> and "[^АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]"?
>
> Regards,
>







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-30 16:29       ` Artem Boldarev
@ 2018-07-30 16:43         ` Agustin Martin
  2018-07-30 18:12           ` Artem Boldarev
  0 siblings, 1 reply; 19+ messages in thread
From: Agustin Martin @ 2018-07-30 16:43 UTC (permalink / raw)
  To: 32280, Artem Boldarev

On Mon, Jul 30, 2018 at 07:29:06PM +0300, Artem Boldarev wrote:
> Hi,
> 
> Thanks, Agustin, this is an interesting find! I have altered my
> configuration as you suggested, and indeed I wasn't able to trigger the bug
> any more.
> 
> https://chaoticlab.io/pub/flyspell-bug/flyspell-bug-no-explicit-chars.png
> 
> Anyway, I am pretty confident that altering the configuration does not
> resolve the bug, but rather hides it. I think so because of the following
> reasons:
> 
> 1. I do not see why my previous configuration, which uses explicitly
> specified characters, is wrong. It works fine when spell checking as you
> type and for smaller buffers and regions (when flyspell-large-region does
> not get called).

Hi,

I'd suggest you try the lines below

[A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
[^A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]

with the Latin chars A-Za-z added. Does it work?
 
> 2. Without the fix, the above-discussed inconsistency exists between how
> flyspell works when you use it for:
> 
> a) spell checking as you type and checking smaller regions of text (when
> flyspell-small-region gets called).
> b) spell checking large regions of text.

AFAIK `flyspell-small-region' is very inefficient in terms of time for large
buffers, so `flyspell-large-region' uses a completely different approach for
those large buffers. It first looks for a list of possible misspellings and
then searches for them sequentially in the text, running flyspell-word on
each one.
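
The second half of that approach can be sketched roughly as follows.
This is a simplified illustration, not the code in flyspell.el: the
function name is made up, the list of misspellings is assumed to have
been produced already by the external spell checker, and the sketch
only records positions instead of calling flyspell-word. Unlike the
real code, it also does not check that a match is a whole word rather
than a substring.

```elisp
(defun demo-locate-misspellings (text misspellings)
  "Return an alist (WORD . POSITION) for MISSPELLINGS found in TEXT.
MISSPELLINGS must be listed in order of appearance, as a spell
checker run over the whole region would report them."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search nil)
          positions)
      (dolist (word misspellings)
        ;; Each search resumes where the previous match ended, so a
        ;; repeated misspelling is located once per report.
        (when (search-forward word nil t)
          (push (cons word (match-beginning 0)) positions)))
      (nreverse positions))))

(demo-locate-misspellings "teh cat saw teh dgo" '("teh" "teh" "dgo"))
;; => (("teh" . 1) ("teh" . 13) ("dgo" . 17))
```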

Regards,

-- 
Agustin






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-30 16:43         ` Agustin Martin
@ 2018-07-30 18:12           ` Artem Boldarev
  2018-08-04 10:43             ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Boldarev @ 2018-07-30 18:12 UTC (permalink / raw)
  To: Agustin Martin, 32280

I have tried to do as you suggested. The result is the same as in my 
previous letter.
> Hi,
>
> I'd suggest you to try lines below
>
> [A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
> [^A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
>
> with the latin chars A-Za-z added. ¿Does it work?
>   
>> 2. Without the fix, the above-discussed inconsistency exists between how
>> flyspell works when you use it for:
>>
>> a) spell checking as you type and checking smaller regions of text (when
>> flyspell-small-region gets called).
>> b) spell checking large regions of text.

Yes indeed, it is so inefficient that checking even a small region of
text is more efficient with flyspell-large-region most of the time (you
can alter this behaviour by changing the flyspell-large-region
variable). It checks spelling word by word, and this is the source of
its inefficiency.
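
For reference, a minimal sketch of adjusting that threshold. The value
500 is an arbitrary example; as far as I know the default is 1000
characters, and a value of nil makes every region count as small.

```elisp
(require 'flyspell)

;; Regions shorter than this many characters are checked word by word;
;; longer ones are handed to the external-process code path.
;; 500 is an arbitrary illustrative value, not a recommendation.
(setq flyspell-large-region 500)
```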

> AFAIK `flyspell-small-region' is very inefficient in terms of time for large
> buffers, so `flyspell-large-region' uses a completely different approach for
> those large buffers. It first looks for a list of possible misspellings and
> then searches for them sequentially in the text, running flyspell-word on
> each one.
>
> Regards,
>







* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-07-30 18:12           ` Artem Boldarev
@ 2018-08-04 10:43             ` Eli Zaretskii
  2018-08-07 10:56               ` Artem Boldarev
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2018-08-04 10:43 UTC (permalink / raw)
  To: Artem Boldarev; +Cc: 32280, agustin6martin

> From: Artem Boldarev <artem.boldarev@gmail.com>
> Date: Mon, 30 Jul 2018 21:12:36 +0300
> 
> > I'd suggest you to try lines below
> >
> > [A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
> > [^A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
> >
> > with the latin chars A-Za-z added. ¿Does it work?
> 
> I have tried to do as you suggested. The result is the same as in my 
> previous letter.

And now I understand why.  The problem is not with comparing the
length of the misspelled word, the problem is with this part of
flyspell-external-point-words:

	      ;; Iterate on string search until string is found as word,
	      ;; not as substring.
	      (while keep
		(if (search-forward word
				    flyspell-large-region-end t)
		    (let* ((found-list
			    (save-excursion
			      ;; Move back into the match
			      ;; so flyspell-get-word will find it.
			      (forward-char -1)
			      (flyspell-get-word)))  <<<<<<<<<<<<<<<<<<<<<<
			   (found (car found-list))
			   (found-length (length found))
			   (misspell-length (length word)))

When the misspelled word doesn't match CASECHARS, the call to
flyspell-get-word will find an entirely different word than the one
which was originally found as misspelled: it will find the first word
before point that matches CASECHARS.  In your case, since the
misspelled words were in English, flyspell-get-word will find the
first Cyrillic word before point.  From there on, the logic of the
code in flyspell-external-point-words completely breaks down, and
yields results that are more-or-less random.

IOW, the assumption of the current logic in
flyspell-external-point-words is that the misspelled word is from the
same language that is supported by the current dictionary, and in your
case this assumption is false.  This is why the problem disappeared as
soon as you added Latin alphabetic characters to CASECHARS.
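
The effect can be reproduced outside flyspell with a small sketch.
Both functions below are made up for illustration;
demo-word-near-point is only a simplified stand-in for the behaviour
of flyspell-get-word described above (drift backward to the nearest
run of CASECHARS), not its actual implementation, and it takes a
`skip-chars-forward' style character set rather than a regexp.

```elisp
(defun demo-word-near-point (charset)
  "Return the run of CHARSET characters at or before point, or nil.
CHARSET is a `skip-chars-forward' style set such as \"a-z\"."
  (save-excursion
    ;; If point is not inside a CHARSET run, drift back to the
    ;; nearest one, as happens when CASECHARS does not match.
    (skip-chars-backward (concat "^" charset))
    (let ((end (progn (skip-chars-forward charset) (point))))
      (skip-chars-backward charset)
      (unless (= (point) end)
        (buffer-substring-no-properties (point) end)))))

(defun demo-word-found-for (text misspelling charset)
  "Search TEXT for MISSPELLING, then return the word CHARSET yields."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (search-forward misspelling nil t)
      (forward-char -1)                 ; move back into the match
      (demo-word-near-point charset))))

;; Cyrillic-only set: the Latin match is skipped and the preceding
;; Cyrillic word comes back instead.
(demo-word-found-for "и Lorem" "Lorem" "А-Яа-яЁё")        ; => "и"
;; With Latin letters added, the matched word itself is found.
(demo-word-found-for "и Lorem" "Lorem" "A-Za-zА-Яа-яЁё")  ; => "Lorem"
```

In the first call the word found is shorter than the reported
misspelling, which is the situation the length comparison then
accepts.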

So please try this patch for flyspell.el, it should fix your problem
with the original setup of ru_RU (it also fixes an unrelated wrong
assumption which goes back to the days when the spell-checking program
could only be either Ispell or Aspell):

diff --git a/lisp/textmodes/flyspell.el b/lisp/textmodes/flyspell.el
index 5726bd8..4d7a189 100644
--- a/lisp/textmodes/flyspell.el
+++ b/lisp/textmodes/flyspell.el
@@ -1420,10 +1420,20 @@ flyspell-external-point-words
 The list of incorrect words should be in `flyspell-external-ispell-buffer'.
 \(We finish by killing that buffer and setting the variable to nil.)
 The buffer to mark them in is `flyspell-large-region-buffer'."
-  (let (words-not-found
-	(ispell-otherchars (ispell-get-otherchars))
-	(buffer-scan-pos flyspell-large-region-beg)
-	case-fold-search)
+  (let* (words-not-found
+         (flyspell-casechars (flyspell-get-casechars))
+         (ispell-otherchars (ispell-get-otherchars))
+         (ispell-many-otherchars-p (ispell-get-many-otherchars-p))
+         (word-chars (concat flyspell-casechars
+                             "+\\("
+                             (if (not (string= "" ispell-otherchars))
+                                 (concat ispell-otherchars "?"))
+                             flyspell-casechars
+                             "+\\)"
+                             (if ispell-many-otherchars-p
+                                 "*" "?")))
+         (buffer-scan-pos flyspell-large-region-beg)
+         case-fold-search)
     (with-current-buffer flyspell-external-ispell-buffer
       (goto-char (point-min))
       ;; Loop over incorrect words, in the order they were reported,
@@ -1453,11 +1463,18 @@ flyspell-external-point-words
 			      ;; Move back into the match
 			      ;; so flyspell-get-word will find it.
 			      (forward-char -1)
-			      (flyspell-get-word)))
+                              ;; Is this a word that matches the
+                              ;; current dictionary?
+                              (if (looking-at word-chars)
+			          (flyspell-get-word))))
 			   (found (car found-list))
 			   (found-length (length found))
 			   (misspell-length (length word)))
 		      (when (or
+                             ;; Misspelled word is not from the
+                             ;; language supported by the current
+                             ;; dictionary.
+                             (null found)
 			     ;; Size matches, we really found it.
 			     (= found-length misspell-length)
 			     ;; Matches as part of a boundary-char separated
@@ -1479,13 +1496,21 @@ flyspell-external-point-words
 			     ;; backslash) and none of the previous
 			     ;; conditions match.
 			     (and (not ispell-really-aspell)
+                                  (not ispell-really-hunspell)
+                                  (not ispell-really-enchant)
 				  (save-excursion
 				    (goto-char (- (nth 1 found-list) 1))
 				    (if (looking-at "[\\]" )
 					t
 				      nil))))
 			(setq keep nil)
-			(flyspell-word nil t)
+                        ;; Don't try spell-checking words whose
+                        ;; characters don't match CASECHARS, because
+                        ;; flyspell-word will then consider as
+                        ;; misspelling the preceding word that matches
+                        ;; CASECHARS.
+                        (or (null found)
+			    (flyspell-word nil t))
 			;; Search for next misspelled word will begin from
 			;; end of last validated match.
 			(setq buffer-scan-pos (point))))
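
For readers who want to see what the word-chars regexp in the patch
expands to, here is a standalone sketch of the same concatenation. The
function name and the sample arguments are made up for illustration;
in the patch the pieces come from flyspell-get-casechars,
ispell-get-otherchars and ispell-get-many-otherchars-p.

```elisp
(defun demo-word-chars-regexp (casechars otherchars many-otherchars-p)
  "Build a regexp for one word, mirroring the patch's `word-chars'.
CASECHARS and OTHERCHARS are bracket expressions; OTHERCHARS may be
the empty string.  Hypothetical helper for illustration only."
  (concat casechars
          "+\\("
          (if (not (string= "" otherchars))
              (concat otherchars "?"))
          casechars
          "+\\)"
          (if many-otherchars-p "*" "?")))

(demo-word-chars-regexp "[a-z]" "[']" nil)
;; => "[a-z]+\\([']?[a-z]+\\)?"
(demo-word-chars-regexp "[a-z]" "" t)
;; => "[a-z]+\\([a-z]+\\)*"
```

With a Cyrillic-only CASECHARS the resulting regexp cannot match at a
Latin word, so the `looking-at' test in the patch fails there and
found-list stays nil.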






* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-08-04 10:43             ` Eli Zaretskii
@ 2018-08-07 10:56               ` Artem Boldarev
  2018-08-07 15:37                 ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Boldarev @ 2018-08-07 10:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 32280, agustin6martin

Hi,

Thanks, Eli! I have tried the proposed patch, and it seems to solve the 
problem.

Thanks to everyone who helped track down the problem.

By the way, is there any chance these changes will be incorporated into
Emacs 26.2?

Kind regards,
Artem


>> I did as you suggested. The result is the same as in my previous
>> message.
> And now I understand why.  The problem is not with comparing the
> length of the misspelled word, the problem is with this part of
> flyspell-external-point-words:
>
> 	      ;; Iterate on string search until string is found as word,
> 	      ;; not as substring.
> 	      (while keep
> 		(if (search-forward word
> 				    flyspell-large-region-end t)
> 		    (let* ((found-list
> 			    (save-excursion
> 			      ;; Move back into the match
> 			      ;; so flyspell-get-word will find it.
> 			      (forward-char -1)
> 			      (flyspell-get-word)))  <<<<<<<<<<<<<<<<<<<<<<
> 			   (found (car found-list))
> 			   (found-length (length found))
> 			   (misspell-length (length word)))
>
> When the misspelled word doesn't match CASECHARS, the call to
> flyspell-get-word will find an entirely different word than the one
> which was originally found as misspelled: it will find the first word
> before point that matches CASECHARS.  In your case, since the
> misspelled words were in English, flyspell-get-word will find the
> first Cyrillic word before point.  From there on, the logic of the
> code in flyspell-external-point-words completely breaks down, and
> yields results that are more-or-less random.
>
> IOW, the assumption of the current logic in
> flyspell-external-point-words is that the misspelled word is from the
> same language that is supported by the current dictionary, and in your
> case this assumption is false.  This is why the problem disappeared as
> soon as you added Latin alphabetic characters to CASECHARS.
>
> So please try this patch for flyspell.el, it should fix your problem
> with the original setup of ru_RU (it also fixes an unrelated wrong
> assumption which goes back to the days when the spell-checking program
> could only be either Ispell or Aspell):
>
> diff --git a/lisp/textmodes/flyspell.el b/lisp/textmodes/flyspell.el
> index 5726bd8..4d7a189 100644
> --- a/lisp/textmodes/flyspell.el
> +++ b/lisp/textmodes/flyspell.el
> @@ -1420,10 +1420,20 @@ flyspell-external-point-words
>   The list of incorrect words should be in `flyspell-external-ispell-buffer'.
>   \(We finish by killing that buffer and setting the variable to nil.)
>   The buffer to mark them in is `flyspell-large-region-buffer'."
> -  (let (words-not-found
> -	(ispell-otherchars (ispell-get-otherchars))
> -	(buffer-scan-pos flyspell-large-region-beg)
> -	case-fold-search)
> +  (let* (words-not-found
> +         (flyspell-casechars (flyspell-get-casechars))
> +         (ispell-otherchars (ispell-get-otherchars))
> +         (ispell-many-otherchars-p (ispell-get-many-otherchars-p))
> +         (word-chars (concat flyspell-casechars
> +                             "+\\("
> +                             (if (not (string= "" ispell-otherchars))
> +                                 (concat ispell-otherchars "?"))
> +                             flyspell-casechars
> +                             "+\\)"
> +                             (if ispell-many-otherchars-p
> +                                 "*" "?")))
> +         (buffer-scan-pos flyspell-large-region-beg)
> +         case-fold-search)
>       (with-current-buffer flyspell-external-ispell-buffer
>         (goto-char (point-min))
>         ;; Loop over incorrect words, in the order they were reported,
> @@ -1453,11 +1463,18 @@ flyspell-external-point-words
>   			      ;; Move back into the match
>   			      ;; so flyspell-get-word will find it.
>   			      (forward-char -1)
> -			      (flyspell-get-word)))
> +                              ;; Is this a word that matches the
> +                              ;; current dictionary?
> +                              (if (looking-at word-chars)
> +			          (flyspell-get-word))))
>   			   (found (car found-list))
>   			   (found-length (length found))
>   			   (misspell-length (length word)))
>   		      (when (or
> +                             ;; Misspelled word is not from the
> +                             ;; language supported by the current
> +                             ;; dictionary.
> +                             (null found)
>   			     ;; Size matches, we really found it.
>   			     (= found-length misspell-length)
>   			     ;; Matches as part of a boundary-char separated
> @@ -1479,13 +1496,21 @@ flyspell-external-point-words
>   			     ;; backslash) and none of the previous
>   			     ;; conditions match.
>   			     (and (not ispell-really-aspell)
> +                                  (not ispell-really-hunspell)
> +                                  (not ispell-really-enchant)
>   				  (save-excursion
>   				    (goto-char (- (nth 1 found-list) 1))
>   				    (if (looking-at "[\\]" )
>   					t
>   				      nil))))
>   			(setq keep nil)
> -			(flyspell-word nil t)
> +                        ;; Don't try spell-checking words whose
> +                        ;; characters don't match CASECHARS, because
> +                        ;; flyspell-word will then consider as
> +                        ;; misspelling the preceding word that matches
> +                        ;; CASECHARS.
> +                        (or (null found)
> +			    (flyspell-word nil t))
>   			;; Search for next misspelled word will begin from
>   			;; end of last validated match.
>   			(setq buffer-scan-pos (point))))
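
The failure mode described in the quoted explanation above can be
imitated without flyspell loaded.  This is a rough sketch under stated
assumptions: demo-nearest-casechars-word is a hypothetical stand-in and
is NOT flyspell-get-word, and the Cyrillic-only CASECHARS is an assumed
value.  It only shows how a lookup restricted to CASECHARS, started
inside a Latin word, lands on the preceding Cyrillic word.

```elisp
;; Illustrative stand-in only; this is NOT flyspell-get-word.
(defun demo-nearest-casechars-word (casechars)
  "Return the run of CASECHARS characters nearest before point, or nil.
CASECHARS must be a regexp matching a single character."
  (save-excursion
    (when (re-search-backward casechars nil t)
      ;; Point is now at the last matching character before the start.
      (let ((end (1+ (point))))
        ;; Extend left while the previous character also matches.
        (while (and (> (point) (point-min))
                    (save-excursion
                      (forward-char -1)
                      (looking-at casechars)))
          (forward-char -1))
        (buffer-substring-no-properties (point) end)))))

(with-temp-buffer
  (insert "именно word")
  (search-backward "d")          ; point is now inside "word"
  ;; Assumed Cyrillic-only CASECHARS: the Latin word is skipped.
  (demo-nearest-casechars-word "[А-Яа-яЁё]"))
```

With point inside "word", the stand-in returns the Cyrillic word before
it, which is how a correctly spelled word such as именно can end up
highlighted in place of the reported one.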








* bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
  2018-08-07 10:56               ` Artem Boldarev
@ 2018-08-07 15:37                 ` Eli Zaretskii
  0 siblings, 0 replies; 19+ messages in thread
From: Eli Zaretskii @ 2018-08-07 15:37 UTC (permalink / raw)
  To: Artem Boldarev; +Cc: 32280-done, agustin6martin

> Cc: agustin6martin@gmail.com, 32280@debbugs.gnu.org
> From: Artem Boldarev <artem.boldarev@gmail.com>
> Date: Tue, 7 Aug 2018 13:56:58 +0300
> 
> Thanks, Eli! I have tried the proposed patch, and it seems to solve the 
> problem.
> 
> Thanks to everyone who helped track down the problem.
> 
> By the way, is there any chance these changes will be incorporated into
> Emacs 26.2?

Thanks for testing.  I pushed the fix to the emacs-26 branch, and I'm
closing the bug.







Thread overview: 19+ messages
2018-07-26  9:44 bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer Artem Boldarev
2018-07-27 12:45 ` Eli Zaretskii
2018-07-28  0:00   ` Artem Boldarev
2018-07-29 14:09     ` Artem Boldarev
2018-07-29 17:33       ` Eli Zaretskii
2018-07-30  6:22       ` martin rudalics
2018-07-30 10:00         ` Artem Boldarev
2018-07-27 16:00 ` Agustin Martin
2018-07-28  0:00   ` Artem Boldarev
2018-07-30 13:20     ` Agustin Martin
2018-07-30 16:29       ` Artem Boldarev
2018-07-30 16:43         ` Agustin Martin
2018-07-30 18:12           ` Artem Boldarev
2018-08-04 10:43             ` Eli Zaretskii
2018-08-07 10:56               ` Artem Boldarev
2018-08-07 15:37                 ` Eli Zaretskii
2018-07-28  0:23   ` Artem Boldarev
2018-07-28  7:02     ` Eli Zaretskii
2018-07-29 14:15       ` Artem Boldarev

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git
