From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eli Zaretskii
Subject: bug#32280: 26.1; FLYSPELL-BUFFER sometimes misbehaves for some input in a large enough buffer
Date: Sat, 04 Aug 2018 13:43:00 +0300
Message-ID: <83h8katnt7.fsf@gnu.org>
References: <992503e5-5f88-30c7-e9b9-fe0a884d2e52@gmail.com> <20180727160048.GA30487@agmartin.aq.upm.es> <20180730132033.GA1182@agmartin.aq.upm.es> <3d036b32-01df-6595-a023-3fc243613813@gmail.com> <20180730164303.GA12241@agmartin.aq.upm.es> <41d68d6b-a687-30c1-818f-57659d7ec6c5@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 32280@debbugs.gnu.org, agustin6martin@gmail.com
To: Artem Boldarev
In-reply-to: <41d68d6b-a687-30c1-818f-57659d7ec6c5@gmail.com> (message from Artem Boldarev on Mon, 30 Jul 2018 21:12:36 +0300)

> From: Artem Boldarev
> Date: Mon, 30 Jul 2018 21:12:36 +0300
>
> > I'd suggest you to try lines below
> >
> > [A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
> > [^A-Za-zАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯабвгдеёжзийклмнопрстуфхцчшщьыъэюя]
> >
> > with the latin chars A-Za-z added. ¿Does it work?
>
> I have tried to do as you suggested. The result is the same as in my
> previous letter.

And now I understand why.  The problem is not with comparing the
length of the misspelled word, the problem is with this part of
flyspell-external-point-words:

  ;; Iterate on string search until string is found as word,
  ;; not as substring.
  (while keep
    (if (search-forward word flyspell-large-region-end t)
        (let* ((found-list (save-excursion
                             ;; Move back into the match
                             ;; so flyspell-get-word will find it.
                             (forward-char -1)
                             (flyspell-get-word)))  <<<<<<<<<<<<<<<<<<<<<<
               (found (car found-list))
               (found-length (length found))
               (misspell-length (length word)))

When the misspelled word doesn't match CASECHARS, the call to
flyspell-get-word will find an entirely different word than the one
which was originally found as misspelled: it will find the first word
before point that matches CASECHARS.  In your case, since the
misspelled words were in English, flyspell-get-word will find the
first Cyrillic word before point.
From there on, the logic of the code in
flyspell-external-point-words completely breaks down, and yields
results that are more-or-less random.

IOW, the assumption of the current logic in
flyspell-external-point-words is that the misspelled word is from the
same language that is supported by the current dictionary, and in your
case this assumption is false.  This is why the problem disappeared as
soon as you added Latin alphabetic characters to CASECHARS.

So please try this patch for flyspell.el, it should fix your problem
with the original setup of ru_RU (it also fixes an unrelated wrong
assumption which goes back to the days when the spell-checking program
could only be either Ispell or Aspell):

diff --git a/lisp/textmodes/flyspell.el b/lisp/textmodes/flyspell.el
index 5726bd8..4d7a189 100644
--- a/lisp/textmodes/flyspell.el
+++ b/lisp/textmodes/flyspell.el
@@ -1420,10 +1420,20 @@ flyspell-external-point-words
 The list of incorrect words should be in `flyspell-external-ispell-buffer'.
 \(We finish by killing that buffer and setting the variable to nil.)
 The buffer to mark them in is `flyspell-large-region-buffer'."
-  (let (words-not-found
-        (ispell-otherchars (ispell-get-otherchars))
-        (buffer-scan-pos flyspell-large-region-beg)
-        case-fold-search)
+  (let* (words-not-found
+         (flyspell-casechars (flyspell-get-casechars))
+         (ispell-otherchars (ispell-get-otherchars))
+         (ispell-many-otherchars-p (ispell-get-many-otherchars-p))
+         (word-chars (concat flyspell-casechars
+                             "+\\("
+                             (if (not (string= "" ispell-otherchars))
+                                 (concat ispell-otherchars "?"))
+                             flyspell-casechars
+                             "+\\)"
+                             (if ispell-many-otherchars-p
+                                 "*" "?")))
+         (buffer-scan-pos flyspell-large-region-beg)
+         case-fold-search)
     (with-current-buffer flyspell-external-ispell-buffer
       (goto-char (point-min))
       ;; Loop over incorrect words, in the order they were reported,
@@ -1453,11 +1463,18 @@ flyspell-external-point-words
                                          ;; Move back into the match
                                          ;; so flyspell-get-word will find it.
                                          (forward-char -1)
-                                         (flyspell-get-word)))
+                                         ;; Is this a word that matches the
+                                         ;; current dictionary?
+                                         (if (looking-at word-chars)
+                                             (flyspell-get-word))))
                            (found (car found-list))
                            (found-length (length found))
                            (misspell-length (length word)))
                       (when (or
+                             ;; Misspelled word is not from the
+                             ;; language supported by the current
+                             ;; dictionary.
+                             (null found)
                              ;; Size matches, we really found it.
                              (= found-length misspell-length)
                              ;; Matches as part of a boundary-char separated
@@ -1479,13 +1496,21 @@ flyspell-external-point-words
                              ;; backslash) and none of the previous
                              ;; conditions match.
                              (and (not ispell-really-aspell)
+                                  (not ispell-really-hunspell)
+                                  (not ispell-really-enchant)
                                   (save-excursion
                                     (goto-char (- (nth 1 found-list) 1))
                                     (if (looking-at "[\\]" )
                                         t
                                       nil))))
                         (setq keep nil)
-                        (flyspell-word nil t)
+                        ;; Don't try spell-checking words whose
+                        ;; characters don't match CASECHARS, because
+                        ;; flyspell-word will then consider as
+                        ;; misspelling the preceding word that matches
+                        ;; CASECHARS.
+                        (or (null found)
+                            (flyspell-word nil t))
                         ;; Search for next misspelled word will begin from
                         ;; end of last validated match.
                         (setq buffer-scan-pos (point))))
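
To illustrate the idea behind the word-chars test in the patch, here
is a minimal standalone sketch that can be evaluated in *scratch*.
The names demo-word-chars, demo-dictionary-word-p and
demo-ru-casechars are made up for this illustration and are not part
of flyspell.el; the CASECHARS value is likewise an assumption, since
the real values come from flyspell-get-casechars,
ispell-get-otherchars and ispell-get-many-otherchars-p.

```elisp
;; Illustration only: hypothetical helpers, not part of flyspell.el.

(defun demo-word-chars (casechars otherchars many-otherchars-p)
  "Build a word regexp the same way the patch builds `word-chars'."
  (concat casechars
          "+\\("
          (if (not (string= "" otherchars))
              (concat otherchars "?"))
          casechars
          "+\\)"
          (if many-otherchars-p "*" "?")))

(defun demo-dictionary-word-p (text word word-chars)
  "Return non-nil if WORD, searched for in TEXT, matches WORD-CHARS.
Mimics the patched code: search forward, step back one character
into the match, then test with `looking-at'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (search-forward word nil t)
      (forward-char -1)
      (let ((case-fold-search nil))
        (looking-at word-chars)))))

;; Assumed CASECHARS for a Russian-only dictionary (an assumption,
;; not copied from any real dictionary definition).
(defvar demo-ru-casechars "[А-Яа-яЁё]")

;; A Cyrillic word matches the dictionary's word syntax ...
(demo-dictionary-word-p "привет wrold" "привет"
                        (demo-word-chars demo-ru-casechars "" nil))
;; => t

;; ... while a Latin word does not, so with the patch applied
;; flyspell-get-word would not be called for it at all.
(demo-dictionary-word-p "привет wrold" "wrold"
                        (demo-word-chars demo-ru-casechars "" nil))
;; => nil
```

In the second case the patched code leaves found as nil, which is
what the new (null found) tests in the patch rely on.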