From mboxrd@z Thu Jan 1 00:00:00 1970
From: Agustin Martin
Newsgroups: gmane.emacs.devel
Subject: Re: flyspell.el [1.90->1.91] flyspell-large-region-beg should be moved after good match
Date: Thu, 22 Dec 2005 14:02:46 +0100
Message-ID: <20051222130246.GA3382@agmartin.aq.upm.es>
References: <20051216124235.GA3357@agmartin.aq.upm.es>
 <20051219004115.GA24197@agmartin.aq.upm.es>
 <20051219012817.GA26385@agmartin.aq.upm.es>
In-Reply-To: <20051219012817.GA26385@agmartin.aq.upm.es>
Original-To: emacs-devel@gnu.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="LQksG6bCIzRHxTLp"
Content-Disposition: inline
User-Agent: Mutt/1.5.11
List-Id: "Emacs development discussions."
Xref: news.gmane.org gmane.emacs.devel:48213

--LQksG6bCIzRHxTLp
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Dec 19, 2005 at 02:28:17AM +0100, Agustin Martin wrote:
> On Mon, Dec 19, 2005 at 01:41:15AM +0100, Agustin Martin wrote:
> > On Fri, Dec 16, 2005 at 08:03:25PM -0500, Richard M. Stallman wrote:
> > > I think this fix is a cleaner one. Does it work right?
> > I will try testing this in a current emacs when I can. I still do not
> > understand where and why is the jump produced in that file, but as mentioned
> > above seems related to words tagged as not found because contain chars not
> > in casechars or boundary-chars, but actually found in the search loop thus
> > moving point.
>
> Tested a bit more, and indeed related to boundary chars mismatch. I am
> considering a better approach for this kind of mismatches, but is still
> untested.

I am now testing with emacs-snapshot, so the results are really up to
date.

The question is whether the search for the next misspelling should start
where the searches for the last misspelling ended, or only after the last
validated match. I am strongly in favour of the second option, and that
is what my patch proposed.

The reason is that boundary-char mismatches between ispell and ispell.el
are a potential source of problems that are not easy to debug. Having
up-to-date ispell-dictionary-alist entries is not enough, because the
result still depends heavily on user settings in the ~/.emacs file and on
changes in the ispell aff file (the latter should not be a problem for
aspell, where ispell.el detects the boundary chars, but user settings
still might be). Since this is hard for a normal user to debug, I am
strongly in favour of the conservative option: moving the search start
only after a validated match.

I was testing the previous flyspell.el version with your last
non-installed patch on a test system where dot is declared as a boundary
char in the ispell francais dict but not in ispell.el (this is fixed in
current ispell.el, but some ~/.emacs might still get it wrong), with the
contents

    francais.aff anothermisspell francais.aff francais.aff francais.aff

flyspell does not think francais.aff is a word, so the match is not
validated; the string is searched for again and finally marked as not
found. Point is then at the last francais.aff occurrence, so the next
misspelling, "anothermisspell", is also not found, nor is any other
misspelling in between. A bad forward unsync.

Something else in the validation code can be improved, based on Piet van
Oostrum's suggestion in the old "flyspell bug" thread: validate the match
if the misspelling is longer than what flyspell considers the word. In
the example above, "francais.aff" is the misspelling but, given where
point is, flyspell says the current word is "aff". Since
length("francais.aff") > length("aff"), the match would be validated and
the example would work even with a per-search point move.

Unfortunately this might not work with more elaborate mismatches, so I
think we should combine both things: validate as above as well, and stay
on the safe side by starting searches from the last validated match.

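To make the two behaviours concrete, here is a rough simulation of the
example above in a temporary buffer. It is only a sketch and not part of
the patch: every demo- name is a hypothetical stand-in, the word lookup
is a crude replacement for flyspell-get-word (letters only, so dot is
never part of a word), and the TeX/nroff condition of the real code is
left out.

```elisp
;; Hypothetical stand-in for `flyspell-get-word': the run of letters
;; ending at point.  Dot is not a letter, so "francais.aff" yields "aff".
(defun demo-word-before-point ()
  (save-excursion
    (let ((end (point)))
      (skip-chars-backward "[:alpha:]")
      (buffer-substring-no-properties (point) end))))

;; Old check: validate only when the sizes match.
(defun demo-strict-p (flyword word)
  (= (length flyword) (length word)))

;; Checks as in the patch (TeX/nroff case omitted).  "[']" plays the
;; role of `ispell-otherchars' here and is an assumption.
(defun demo-patched-p (flyword word)
  (let ((flyword-length (length flyword))
        (misspell-length (length word)))
    (or (= flyword-length misspell-length)
        (member word (split-string flyword "[']"))
        (< flyword-length misspell-length))))

;; Walk the ispell output WORDS over TEXT and return the words that get
;; validated.  With RESET non-nil every search restarts after the last
;; validated match; otherwise it continues from wherever point was left.
(defun demo-simulate (text words validate reset)
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((beg (point-min))
          (found '()))
      (dolist (word words)
        (when reset (goto-char beg))
        (let ((keep t))
          (while (and keep (search-forward word nil t))
            (when (funcall validate (demo-word-before-point) word)
              (setq keep nil
                    beg (point))
              (push word found)))))
      (nreverse found))))

(defconst demo-text
  "francais.aff anothermisspell francais.aff francais.aff francais.aff")
(defconst demo-words (split-string demo-text))

;; Size check only, no reset: point ends at the last francais.aff.
(demo-simulate demo-text demo-words #'demo-strict-p nil)
;; => nil
;; Size check only, restart from last validated match.
(demo-simulate demo-text demo-words #'demo-strict-p t)
;; => ("anothermisspell")
;; Patched checks, even with a per-search point move.
(length (demo-simulate demo-text demo-words #'demo-patched-p nil))
;; => 5
```

The first call shows the forward unsync, the second shows that restarting
from the last validated match recovers "anothermisspell", and the third
shows that the length check alone rescues this particular example.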
I am attaching a more elaborate patch for consideration.

--
Agustin

--LQksG6bCIzRHxTLp
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="flyspell.el.flyspell-external-point-words.diff"

--- flyspell.el.orig 2005-12-22 12:23:55.000000000 +0100
+++ flyspell.el 2005-12-22 12:49:28.000000000 +0100
@@ -1325,6 +1325,9 @@
                          (* 100 (/ (float (point)) (point-max)))
                          word))
               (with-current-buffer flyspell-large-region-buffer
+                ;; Make sure search starts from last validated match (or from
+                ;; beginning of region if the first time). This intends to avoid
+                ;; erroneous forward jumps that might cause fatal forward unsyncs.
                 (goto-char flyspell-large-region-beg)
                 (let ((keep t))
                   ;; Iterate on string search until string is found as word,
@@ -1334,15 +1337,21 @@
                                          flyspell-large-region-end t)
                       (save-excursion
                         (goto-char (- (point) 1))
-                        (let* ((flyword-prev-l (flyspell-get-word nil))
+                        (let* ((match-point (+ (point) 1))
+                               (flyword-prev-l (flyspell-get-word nil))
                                (flyword-prev (car flyword-prev-l))
-                               (size-match (= (length flyword-prev) (length word))))
+                               (flyword-length (length flyword-prev))
+                               (misspell-length (length word)))
                           (when (or
-                                 ;; size matches, we are done
-                                 size-match
+                                 ;; Size matches, we are done
+                                 (= flyword-length misspell-length)
                                  ;; Matches as part of a boundary-char separated word
                                  (member word
                                          (split-string flyword-prev ispell-otherchars))
+                                 ;; Misspelling has higher length than what flyspell
+                                 ;; considers the word. Caused by boundary-chars mismatch.
+                                 ;; Validating seems safe.
+                                 (< flyword-length misspell-length)
                                  ;; ispell treats beginning of some TeX
                                  ;; commands as nroff control sequences
                                  ;; and strips them in the list of
@@ -1360,8 +1369,9 @@
                                       nil))))
                             (setq keep nil)
                             (flyspell-word)
-                            ;; Next search will begin from end of last match
-                            )))
+                            ;; Search for next misspelled word will begin from
+                            ;; end of last validated match.
+                            (setq flyspell-large-region-beg match-point))))
                     ;; Record if misspelling is not found and try new one
                     (add-to-list 'words-not-found
                                  (concat " -> " word " - "

--LQksG6bCIzRHxTLp
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel

--LQksG6bCIzRHxTLp--