unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* flyspell.el and non-word characters in CASECHARS
@ 2012-04-16 19:55 Eli Zaretskii
  2012-04-17 17:26 ` Agustin Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2012-04-16 19:55 UTC (permalink / raw)
  To: emacs-devel

In flyspell.el:flyspell-check-pre-word-p we have this snippet:

   ((or (and (= flyspell-pre-point (- (point) 1))
	     (eq (char-syntax (char-after flyspell-pre-point)) ?w))
	(= flyspell-pre-point (point))
	(= flyspell-pre-point (+ (point) 1)))
    nil)
   ((and (symbolp this-command)
	 (not executing-kbd-macro)
	 (or (get this-command 'flyspell-delayed)
	     (and (get this-command 'flyspell-deplacement)
		  (eq flyspell-previous-command this-command)))
	 (or (= (current-column) 0)
	     (= (current-column) flyspell-pre-column)
	     ;; If other post-command-hooks change the buffer,
	     ;; flyspell-pre-point can lie past eob (bug#468).
	     (null (char-after flyspell-pre-point))
	     (eq (char-syntax (char-after flyspell-pre-point)) ?w)))
    nil)

I think it's wrong to test for word syntax here; we should test for a
match against CASECHARS, or maybe even CASECHARS and OTHERCHARS.
These are what defines a "word" in this context, because flyspell must
be consistent with what the speller does.

I bumped into this spell-checking Hebrew text with Hunspell: the he_IL
dictionary considers " and ' be WORDCHARS (they are indeed used as
part of words of foreign origin and in acronyms), but typing these
characters under flyspell-mode immediately marks the preceding word as
a typo, although self-insert-command is in
flyspell-default-delayed-commands, and so should have triggered a
3-sec delay in spell-checking, letting me to continue typing.  If I
use a match against CASECHARS instead of the word syntax, the problem
goes away.

Am I missing something?



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: flyspell.el and non-word characters in CASECHARS
  2012-04-16 19:55 flyspell.el and non-word characters in CASECHARS Eli Zaretskii
@ 2012-04-17 17:26 ` Agustin Martin
  2012-04-17 17:51   ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Agustin Martin @ 2012-04-17 17:26 UTC (permalink / raw)
  To: emacs-devel

On Mon, Apr 16, 2012 at 10:55:37PM +0300, Eli Zaretskii wrote:
> In flyspell.el:flyspell-check-pre-word-p we have this snippet:
> 
>    ((or (and (= flyspell-pre-point (- (point) 1))
> 	     (eq (char-syntax (char-after flyspell-pre-point)) ?w))
> 	(= flyspell-pre-point (point))
> 	(= flyspell-pre-point (+ (point) 1)))
>     nil)
>    ((and (symbolp this-command)
> 	 (not executing-kbd-macro)
> 	 (or (get this-command 'flyspell-delayed)
> 	     (and (get this-command 'flyspell-deplacement)
> 		  (eq flyspell-previous-command this-command)))
> 	 (or (= (current-column) 0)
> 	     (= (current-column) flyspell-pre-column)
> 	     ;; If other post-command-hooks change the buffer,
> 	     ;; flyspell-pre-point can lie past eob (bug#468).
> 	     (null (char-after flyspell-pre-point))
> 	     (eq (char-syntax (char-after flyspell-pre-point)) ?w)))
>     nil)
> 
> I think it's wrong to test for word syntax here; we should test for a
> match against CASECHARS, or maybe even CASECHARS and OTHERCHARS.
> These are what defines a "word" in this context, because flyspell must
> be consistent with what the speller does.
> 
> I bumped into this spell-checking Hebrew text with Hunspell: the he_IL
> dictionary considers " and ' be WORDCHARS (they are indeed used as
> part of words of foreign origin and in acronyms), but typing these
> characters under flyspell-mode immediately marks the preceding word as
> a typo, although self-insert-command is in
> flyspell-default-delayed-commands, and so should have triggered a
> 3-sec delay in spell-checking, letting me to continue typing.  If I
> use a match against CASECHARS instead of the word syntax, the problem
> goes away.
> 
> Am I missing something?

The only reason I can think is that at that time there is no way to know if
that wordchar is going to be in the middle of a word or not. If it appears
at a word boundary, is not what ispell.el seems to consider a wordchar.
Did your test work only with CASECHARS instead of CASECHARS+OTHERCHARS?

For the records, I see the same problem in other languages for valid words
including "otherchars". Since I was recently playing with Catalan I tried
with "intel·ligent", just after typing the middledot "intel" is marked as
typo and only when I finish correctly typing word the mark disappear.

-- 
Agustin



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: flyspell.el and non-word characters in CASECHARS
  2012-04-17 17:26 ` Agustin Martin
@ 2012-04-17 17:51   ` Eli Zaretskii
  2012-04-18 16:26     ` Agustin Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2012-04-17 17:51 UTC (permalink / raw)
  To: Agustin Martin; +Cc: emacs-devel

> Date: Tue, 17 Apr 2012 19:26:36 +0200
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> The only reason I can think is that at that time there is no way to know if
> that wordchar is going to be in the middle of a word or not. If it appears
> at a word boundary, is not what ispell.el seems to consider a wordchar.

But in that case, the following non-word character (blank or
punctuation) will trigger the spell-check of the word.  So we lose
nothing, right?

> Did your test work only with CASECHARS instead of CASECHARS+OTHERCHARS?

I actually _added_ to the word-syntax test the test against CASECHARS,
like this:

   ((or (and (= flyspell-pre-point (- (point) 1))
	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
		 (string-match-p (flyspell-get-casechars)
				 (buffer-substring-no-properties
				  flyspell-pre-point (1+ flyspell-pre-point)))))
	(= flyspell-pre-point (point))
	(= flyspell-pre-point (+ (point) 1)))
    nil)



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: flyspell.el and non-word characters in CASECHARS
  2012-04-17 17:51   ` Eli Zaretskii
@ 2012-04-18 16:26     ` Agustin Martin
  2012-04-18 18:45       ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Agustin Martin @ 2012-04-18 16:26 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 2027 bytes --]

On Tue, Apr 17, 2012 at 08:51:24PM +0300, Eli Zaretskii wrote:
> > Date: Tue, 17 Apr 2012 19:26:36 +0200
> > From: Agustin Martin <agustin.martin@hispalinux.es>
> > 
> > The only reason I can think is that at that time there is no way to know if
> > that wordchar is going to be in the middle of a word or not. If it appears
> > at a word boundary, is not what ispell.el seems to consider a wordchar.
> 
> But in that case, the following non-word character (blank or
> punctuation) will trigger the spell-check of the word.  So we lose
> nothing, right?
> 
> > Did your test work only with CASECHARS instead of CASECHARS+OTHERCHARS?
> 
> I actually _added_ to the word-syntax test the test against CASECHARS,
> like this:
> 
>    ((or (and (= flyspell-pre-point (- (point) 1))
> 	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
> 		 (string-match-p (flyspell-get-casechars)
> 				 (buffer-substring-no-properties
> 				  flyspell-pre-point (1+ flyspell-pre-point)))))
> 	(= flyspell-pre-point (point))
> 	(= flyspell-pre-point (+ (point) 1)))
>     nil)

I tested with your changes and they do not seem to help here. I put some
(message "") to check when the casechars test is reached and in a small text
showing this behavior I found no match (not previously matched by word
syntax). I put an otherchars test and also did not help, but at least there
is a proper match in otherchars. 

Anyway, adding otherchars test did not help directly but did indirectly. 
When testing otherchars I noticed that flyspell.el seems to honour delays
for dashes, but not for otherchars. Words are checked inmediately after
apostrophe, but check is properly delayed for dashes.

I have been playing with enabling delays also for otherchars together with
adding an otherchars test. I think I tried this morning these changes
and had problems with things like 

sdasd'ss 

and friends, but now they seem to work well. I am a bit confused, I
probably tested something different.

Does attached diff help at your site? 

-- 
Agustin

[-- Attachment #2: flyspell.el_use-otherchars.diff --]
[-- Type: text/x-diff, Size: 1520 bytes --]

--- flyspell.el.orig	2012-04-12 15:06:12.780784001 +0200
+++ flyspell.el	2012-04-18 10:12:57.749272942 +0200
@@ -739,7 +739,11 @@
 	 (eq flyspell-pre-pre-buffer flyspell-pre-buffer))
     nil)
    ((or (and (= flyspell-pre-point (- (point) 1))
-	     (eq (char-syntax (char-after flyspell-pre-point)) ?w))
+	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
+		 (string-match-p (ispell-get-otherchars)
+				 (buffer-substring-no-properties
+				  flyspell-pre-point (1+ flyspell-pre-point)))
+		 ))
 	(= flyspell-pre-point (point))
 	(= flyspell-pre-point (+ (point) 1)))
     nil)
@@ -815,6 +819,7 @@
 	 (save-excursion
 	   (backward-char 1)
 	   (and (looking-at (flyspell-get-not-casechars))
+		(not (looking-at (ispell-get-otherchars)))
 		(or flyspell-consider-dash-as-word-delimiter-flag
 		    (not (looking-at "-"))))))
     ;; yes because we have reached or typed a word delimiter.
@@ -880,6 +885,7 @@
 				     (save-excursion
 				       (backward-char 1)
 				       (and (and (looking-at (flyspell-get-not-casechars)) 1)
+					    (not (looking-at (ispell-get-otherchars)))
 					    (and (or flyspell-consider-dash-as-word-delimiter-flag
 						     (not (looking-at "\\-"))) 2))))))
 			  c))))
@@ -895,6 +901,7 @@
 				       (save-excursion
 					 (backward-char 1)
 					 (and (looking-at (flyspell-get-not-casechars))
+					      (not (looking-at (ispell-get-otherchars)))
 					      (or flyspell-consider-dash-as-word-delimiter-flag
 						  (not (looking-at "\\-"))))))))
 			    c))

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: flyspell.el and non-word characters in CASECHARS
  2012-04-18 16:26     ` Agustin Martin
@ 2012-04-18 18:45       ` Eli Zaretskii
  2012-04-20 15:26         ` Agustin Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2012-04-18 18:45 UTC (permalink / raw)
  To: Agustin Martin; +Cc: emacs-devel

> Date: Wed, 18 Apr 2012 18:26:19 +0200
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> > I actually _added_ to the word-syntax test the test against CASECHARS,
> > like this:
> > 
> >    ((or (and (= flyspell-pre-point (- (point) 1))
> > 	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
> > 		 (string-match-p (flyspell-get-casechars)
> > 				 (buffer-substring-no-properties
> > 				  flyspell-pre-point (1+ flyspell-pre-point)))))
> > 	(= flyspell-pre-point (point))
> > 	(= flyspell-pre-point (+ (point) 1)))
> >     nil)
> 
> I tested with your changes and they do not seem to help here.

Those were not all of the changes, I just showed them for an
illustration.  The full change involved a similar change a few lines
below the above snippet, where again flyspell checks the word syntax.

> Does attached diff help at your site? 

I will try that when I have time, thanks.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: flyspell.el and non-word characters in CASECHARS
  2012-04-18 18:45       ` Eli Zaretskii
@ 2012-04-20 15:26         ` Agustin Martin
  0 siblings, 0 replies; 6+ messages in thread
From: Agustin Martin @ 2012-04-20 15:26 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]

On Wed, Apr 18, 2012 at 09:45:41PM +0300, Eli Zaretskii wrote:
> > Date: Wed, 18 Apr 2012 18:26:19 +0200
> > From: Agustin Martin <agustin.martin@hispalinux.es>
> > 
> > > I actually _added_ to the word-syntax test the test against CASECHARS,
> > > like this:
> > > 
> > >    ((or (and (= flyspell-pre-point (- (point) 1))
> > > 	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
> > > 		 (string-match-p (flyspell-get-casechars)
> > > 				 (buffer-substring-no-properties
> > > 				  flyspell-pre-point (1+ flyspell-pre-point)))))
> > > 	(= flyspell-pre-point (point))
> > > 	(= flyspell-pre-point (+ (point) 1)))
> > >     nil)
> > 
> > I tested with your changes and they do not seem to help here.
> 
> Those were not all of the changes, I just showed them for an
> illustration.  The full change involved a similar change a few lines
> below the above snippet, where again flyspell checks the word syntax.

I noticed that part when first looking and later forgot about it. Thanks for
reminding.

Updated diff attached for wider testing.

Regards,

-- 
Agustin

[-- Attachment #2: flyspell.el_use-otherchars.2.diff --]
[-- Type: text/x-diff, Size: 2046 bytes --]

--- flyspell.el.orig	2012-04-12 15:06:12.780784001 +0200
+++ flyspell.el	2012-04-19 11:05:07.936947902 +0200
@@ -739,7 +739,11 @@
 	 (eq flyspell-pre-pre-buffer flyspell-pre-buffer))
     nil)
    ((or (and (= flyspell-pre-point (- (point) 1))
-	     (eq (char-syntax (char-after flyspell-pre-point)) ?w))
+	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
+		 (string-match-p (ispell-get-otherchars)
+		  		 (buffer-substring-no-properties
+		  		  flyspell-pre-point (1+ flyspell-pre-point)))
+		 ))
 	(= flyspell-pre-point (point))
 	(= flyspell-pre-point (+ (point) 1)))
     nil)
@@ -753,7 +757,11 @@
 	     ;; If other post-command-hooks change the buffer,
 	     ;; flyspell-pre-point can lie past eob (bug#468).
 	     (null (char-after flyspell-pre-point))
-	     (eq (char-syntax (char-after flyspell-pre-point)) ?w)))
+	     (or (eq (char-syntax (char-after flyspell-pre-point)) ?w)
+		 (string-match-p (ispell-get-otherchars)
+				 (buffer-substring-no-properties
+				  flyspell-pre-point (1+ flyspell-pre-point)))
+		 )))
     nil)
    ((not (eq (current-buffer) flyspell-pre-buffer))
     t)
@@ -815,6 +823,7 @@
 	 (save-excursion
 	   (backward-char 1)
 	   (and (looking-at (flyspell-get-not-casechars))
+		(not (looking-at (ispell-get-otherchars)))
 		(or flyspell-consider-dash-as-word-delimiter-flag
 		    (not (looking-at "-"))))))
     ;; yes because we have reached or typed a word delimiter.
@@ -880,6 +889,7 @@
 				     (save-excursion
 				       (backward-char 1)
 				       (and (and (looking-at (flyspell-get-not-casechars)) 1)
+					    (not (looking-at (ispell-get-otherchars)))
 					    (and (or flyspell-consider-dash-as-word-delimiter-flag
 						     (not (looking-at "\\-"))) 2))))))
 			  c))))
@@ -895,6 +905,7 @@
 				       (save-excursion
 					 (backward-char 1)
 					 (and (looking-at (flyspell-get-not-casechars))
+					      (not (looking-at (ispell-get-otherchars)))
 					      (or flyspell-consider-dash-as-word-delimiter-flag
 						  (not (looking-at "\\-"))))))))
 			    c))

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-04-20 15:26 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-04-16 19:55 flyspell.el and non-word characters in CASECHARS Eli Zaretskii
2012-04-17 17:26 ` Agustin Martin
2012-04-17 17:51   ` Eli Zaretskii
2012-04-18 16:26     ` Agustin Martin
2012-04-18 18:45       ` Eli Zaretskii
2012-04-20 15:26         ` Agustin Martin

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).