unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#13460: Issue to change dictionary when using hunspell on emacs
@ 2013-01-16 12:25 Jochen Schmitt
  2013-01-16 18:01 ` Eli Zaretskii
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Jochen Schmitt @ 2013-01-16 12:25 UTC (permalink / raw)
  To: 13460

Hello,

I'm using emacs-24.1 on Fedora 17 (x86_64) and have the following
issue. When I want to write a text in English, I have to change
the dictionary in use with M-x ispell-change-dictionary english.

Unfortunately, I found out that hunspell doesn't work
properly after I have changed the dictionary to english.

My examination shows that Emacs calls hunspell in the way
quoted below:

[s4504kr@omega ~]$ hunspell -a  -d english -B -i iso-8859-1
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
Can't open affix or dictionary files for dictionary named "english".
[s4504kr@omega ~]$ exit
exit

So it would be nice if you had a solution for this issue.

Best Regards:

Jochen Schmitt





^ permalink raw reply	[flat|nested] 33+ messages in thread

* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-16 12:25 bug#13460: Issue to change dictionary when using hunspell on emacs Jochen Schmitt
@ 2013-01-16 18:01 ` Eli Zaretskii
  2013-01-16 23:23   ` Glenn Morris
  2013-02-20 17:50 ` bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs Agustin Martin
  2013-04-04 14:41 ` bug#13639: " Jacek Chrząszcz
  2 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-16 18:01 UTC (permalink / raw)
  To: Jochen Schmitt; +Cc: 13460

> Date: Wed, 16 Jan 2013 13:25:10 +0100
> From: Jochen Schmitt <Jochen@herr-schmitt.de>
> 
> I'm using emacs-24.1 on Fedora 17 (x86_64) and have the following
> issue. When I want to write a text in English, I have to change
> the dictionary in use with M-x ispell-change-dictionary english.
> 
> Unfortunately, I found out that hunspell doesn't work
> properly after I have changed the dictionary to english.
> 
> My examination shows that Emacs calls hunspell in the way
> quoted below:
> 
> [s4504kr@omega ~]$ hunspell -a  -d english -B -i iso-8859-1
> @(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
> Can't open affix or dictionary files for dictionary named "english".
> [s4504kr@omega ~]$ exit
> exit
> 
> So it would be nice if you had a solution for this issue.

You need to install the English dictionary for Hunspell.  I suspect
that its name will be en_US (or maybe en_GB), not "english".






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-16 18:01 ` Eli Zaretskii
@ 2013-01-16 23:23   ` Glenn Morris
  2013-01-17  3:51     ` Eli Zaretskii
       [not found]     ` <20130117131733.GA20519@omega.in.herr-schmitt.de>
  0 siblings, 2 replies; 33+ messages in thread
From: Glenn Morris @ 2013-01-16 23:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 13460, Jochen Schmitt

Eli Zaretskii wrote:

> You need to install the English dictionary for Hunspell.  I suspect
> that its name will be en_US (or maybe en_GB), not "english".

M-x ispell-change-dictionary doesn't accept "en_US" as input.
It wants something like "english" (coming from
ispell-dictionary-base-alist), which as you say is wrong.
Tested with:

hunspell -D
[...]
LOADED DICTIONARY:
/usr/share/myspell/en_US.aff
/usr/share/myspell/en_US.dic
Hunspell 1.2.8

emacs -Q --eval '(setq ispell-program-name "/usr/bin/hunspell")'







* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-16 23:23   ` Glenn Morris
@ 2013-01-17  3:51     ` Eli Zaretskii
  2013-01-17  6:37       ` Glenn Morris
       [not found]     ` <20130117131733.GA20519@omega.in.herr-schmitt.de>
  1 sibling, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-17  3:51 UTC (permalink / raw)
  To: Glenn Morris; +Cc: 13460, Jochen

> From: Glenn Morris <rgm@gnu.org>
> Cc: Jochen Schmitt <Jochen@herr-schmitt.de>,  13460@debbugs.gnu.org
> Date: Wed, 16 Jan 2013 18:23:23 -0500
> 
> Eli Zaretskii wrote:
> 
> > You need to install the English dictionary for Hunspell.  I suspect
> > that its name will be en_US (or maybe en_GB), not "english".
> 
> M-x ispell-change-dictionary doesn't accept "en_US" as input.
> It wants something like "english" (coming from
> ispell-dictionary-base-alist), which as you say is wrong.

Then one needs to customize ispell-local-dictionary-alist to include
the setting for en_US.  Here's what I have there:

     '("en_US"
	"[[:alpha:]]"
	"[^[:alpha:]]"
	"[']" nil ("-r") nil utf-8)

The OP may wish to omit the -r switch, it's not a necessity.

Also, be sure to look at en_US.aff and match the character set it
mentions there with the "utf-8" part above.
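
As a rough sketch of the customization described above, the entry could be placed in an init file along these lines. Using plain `setq`, leaving the argument list empty instead of ("-r"), and making en_US the default dictionary are assumptions made for illustration; the character set still has to match the SET line of en_US.aff.

```elisp
;; Illustrative init-file fragment; adjust names to the local installation.
(setq ispell-program-name "hunspell")   ; assumes hunspell is on PATH

;; Entry layout: (NAME CASECHARS NOT-CASECHARS OTHERCHARS MANY-OTHERCHARS-P
;;                ISPELL-ARGS EXTENDED-CHARACTER-MODE CHARACTER-SET)
(setq ispell-local-dictionary-alist
      '(("en_US" "[[:alpha:]]" "[^[:alpha:]]" "[']" nil nil nil utf-8)))

;; Assumption: select en_US by default so that "english" is never
;; handed to hunspell as a dictionary name.
(setq ispell-dictionary "en_US")
```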






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17  3:51     ` Eli Zaretskii
@ 2013-01-17  6:37       ` Glenn Morris
  2013-01-17 12:26         ` Agustin Martin
  0 siblings, 1 reply; 33+ messages in thread
From: Glenn Morris @ 2013-01-17  6:37 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 13460, Jochen

Eli Zaretskii wrote:

>> From: Glenn Morris <rgm@gnu.org>
>> Cc: Jochen Schmitt <Jochen@herr-schmitt.de>,  13460@debbugs.gnu.org
>> Date: Wed, 16 Jan 2013 18:23:23 -0500
>> 
>> Eli Zaretskii wrote:
>> 
>> > You need to install the English dictionary for Hunspell.  I suspect
>> > that its name will be en_US (or maybe en_GB), not "english".
>> 
>> M-x ispell-change-dictionary doesn't accept "en_US" as input.
>> It wants something like "english" (coming from
>> ispell-dictionary-base-alist), which as you say is wrong.
>
> Then one needs to customize ispell-local-dictionary-alist to include
> the setting for en_US.  Here's what I have there:
>
>      '("en_US"
> 	"[[:alpha:]]"
> 	"[^[:alpha:]]"
> 	"[']" nil ("-r") nil utf-8)
>
> The OP may wish to omit the -r switch, it's not a necessity.
>
> Also, be sure to look at en_US.aff and match the character set it
> mentions there with the "utf-8" part above.

IMO it should work out of the box.
Ie ispell-set-spellchecker-params should handle hunspell as it currently
does aspell, which has its own ispell-find-aspell-dictionaries func.







* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17  6:37       ` Glenn Morris
@ 2013-01-17 12:26         ` Agustin Martin
  2013-01-17 15:24           ` Agustin Martin
  2013-01-17 16:10           ` Eli Zaretskii
  0 siblings, 2 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 12:26 UTC (permalink / raw)
  To: 13460

On Thu, Jan 17, 2013 at 01:37:30AM -0500, Glenn Morris wrote:
> Eli Zaretskii wrote:
> 
> >> From: Glenn Morris <rgm@gnu.org>
> >> Cc: Jochen Schmitt <Jochen@herr-schmitt.de>,  13460@debbugs.gnu.org
> >> Date: Wed, 16 Jan 2013 18:23:23 -0500
> >> 
> >> Eli Zaretskii wrote:
> >> 
> >> > You need to install the English dictionary for Hunspell.  I suspect
> >> > that its name will be en_US (or maybe en_GB), not "english".
> >> 
> >> M-x ispell-change-dictionary doesn't accept "en_US" as input.
> >> It wants something like "english" (coming from
> >> ispell-dictionary-base-alist), which as you say is wrong.
> >
> > Then one needs to customize ispell-local-dictionary-alist to include
> > the setting for en_US.  Here's what I have there:
> >
> >      '("en_US"
> > 	"[[:alpha:]]"
> > 	"[^[:alpha:]]"
> > 	"[']" nil ("-r") nil utf-8)
> >
> > The OP may wish to omit the -r switch, it's not a necessity.
> >
> > Also, be sure to look at en_US.aff and match the character set it
> > mentions there with the "utf-8" part above.
> 
> IMO it should work out of the box.
> Ie ispell-set-spellchecker-params should handle hunspell as it currently
> does aspell, which has its own ispell-find-aspell-dictionaries func.

The problem is that hunspell -D does not return control. A bug has been
opened for this, together with patch suggested by Eli Zaretskii,

http://sourceforge.net/tracker/?func=detail&aid=3522524&group_id=143754&atid=756395

A workaround was proposed (redirecting from /dev/null), but it seems too
UNIX biassed.

There is also an associated problem when hunspell does not find the requested
dictionary under Emacs, because it does not trigger an explicit error. This
leaves Emacs waiting for a reply in an infinite loop,

http://bugs.debian.org/690318

The reason is that in pipe mode hunspell sends its init string in a way that
is not ispell/aspell compliant. Reported, with a proposed patch, as

http://sourceforge.net/tracker/?func=detail&aid=3577183&group_id=143754&atid=756395

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 12:26         ` Agustin Martin
@ 2013-01-17 15:24           ` Agustin Martin
  2013-01-17 16:31             ` Stefan Monnier
                               ` (3 more replies)
  2013-01-17 16:10           ` Eli Zaretskii
  1 sibling, 4 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 15:24 UTC (permalink / raw)
  To: 13460

On Thu, Jan 17, 2013 at 01:26:31PM +0100, Agustin Martin wrote:
> On Thu, Jan 17, 2013 at 01:37:30AM -0500, Glenn Morris wrote:
> > Eli Zaretskii wrote:
> > 
> > >> From: Glenn Morris <rgm@gnu.org>
> > >> Cc: Jochen Schmitt <Jochen@herr-schmitt.de>,  13460@debbugs.gnu.org
> > >> Date: Wed, 16 Jan 2013 18:23:23 -0500
> > >> 
> > >> Eli Zaretskii wrote:
> > >> 
> > >> > You need to install the English dictionary for Hunspell.  I suspect
> > >> > that its name will be en_US (or maybe en_GB), not "english".
> > >> 
> > >> M-x ispell-change-dictionary doesn't accept "en_US" as input.
> > >> It wants something like "english" (coming from
> > >> ispell-dictionary-base-alist), which as you say is wrong.
> > >
> > > Then one needs to customize ispell-local-dictionary-alist to include
> > > the setting for en_US.  Here's what I have there:
> > >
> > >      '("en_US"
> > > 	"[[:alpha:]]"
> > > 	"[^[:alpha:]]"
> > > 	"[']" nil ("-r") nil utf-8)
> > >
> > > The OP may wish to omit the -r switch, it's not a necessity.
> > >
> > > Also, be sure to look at en_US.aff and match the character set it
> > > mentions there with the "utf-8" part above.
> > 
> > IMO it should work out of the box.
> > Ie ispell-set-spellchecker-params should handle hunspell as it currently
> > does aspell, which has its own ispell-find-aspell-dictionaries func.
> 
> The problem is that hunspell -D does not return control. A bug has been
> opened for this, together with patch suggested by Eli Zaretskii,
> 
> http://sourceforge.net/tracker/?func=detail&aid=3522524&group_id=143754&atid=756395
> 
> A workaround was proposed (redirecting from /dev/null), but it seems too
> UNIX biassed.

There is a second issue I forgot: one needs to get info from the installed
.aff files, so all of them must be completely opened to look for that info
(OTHERCHARS and friends), and I'd expect that to slow Emacs init a bit. Since
I have not tried to write preliminary code for that parsing, I cannot evaluate
that delay. Fortunately, aspell uses small .dat files for that purpose.

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 12:26         ` Agustin Martin
  2013-01-17 15:24           ` Agustin Martin
@ 2013-01-17 16:10           ` Eli Zaretskii
  1 sibling, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-17 16:10 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13460

> Date: Thu, 17 Jan 2013 13:26:31 +0100
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> > Ie ispell-set-spellchecker-params should handle hunspell as it currently
> > does aspell, which has its own ispell-find-aspell-dictionaries func.
> 
> The problem is that hunspell -D does not return control.

Right.  But perhaps ispell.el could kill hunspell once it has read the
list of dictionaries.

> A workaround was proposed (redirecting from /dev/null), but it seems too
> UNIX biassed.

We could use 'null-device', which is portable.

> There is also an associated problem when hunspell does not find the requested
> dictionary under Emacs, because it does not trigger an explicit error.

Right, but we should always find the dictionary ;-)
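
A minimal sketch of the portable variant suggested here, assuming that `hunspell -D` exits once its standard input reaches end-of-file (the /dev/null workaround mentioned above). The function name is hypothetical; `call-process` and `null-device` are standard Emacs, and parsing of the returned text is deliberately left out.

```elisp
(defun my-hunspell-dictionary-report ()
  "Return the text printed by `hunspell -D' as a string.
Standard input is connected to `null-device', so hunspell sees
end-of-file immediately instead of waiting for words to check."
  (with-temp-buffer
    ;; INFILE is `null-device'; DESTINATION t collects both stdout and
    ;; stderr in the temporary buffer.
    (call-process "hunspell" null-device t nil "-D")
    (buffer-string)))
```

Because `null-device` expands to the right name on each platform, the same call should need no UNIX-specific redirection.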






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 15:24           ` Agustin Martin
@ 2013-01-17 16:31             ` Stefan Monnier
  2013-01-17 18:15               ` Agustin Martin
  2013-01-17 16:41             ` Eli Zaretskii
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: Stefan Monnier @ 2013-01-17 16:31 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13460

> There is a second issue I forgot, one needs to get info from the installed
> .aff files, so all of them must be completely opened to look for that info
> (OTHERCHARS and friends) and I'd expect that to slow Emacs init a bit.

We don't have to do that just to get the list of languages: it can be
delayed to the moment a particular language is selected.


        Stefan






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 15:24           ` Agustin Martin
  2013-01-17 16:31             ` Stefan Monnier
@ 2013-01-17 16:41             ` Eli Zaretskii
  2013-01-17 18:12               ` Agustin Martin
  2013-01-17 18:08             ` Glenn Morris
       [not found]             ` <7076415.12428.1358446115519.JavaMail.root@mx1-new.spamfiltro.es>
  3 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-17 16:41 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13460

> Date: Thu, 17 Jan 2013 16:24:16 +0100
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> There is a second issue I forgot, one needs to get info from the installed
> .aff files, so all of them must be completely opened to look for that info
> (OTHERCHARS and friends) and I'd expect that to slow Emacs init a bit.

You don't need OTHERCHARS, only the SET, to figure out the encoding in
which to talk to hunspell for each dictionary.  (OTHERCHARS cannot be
gleaned from the hunspell .aff files anyway, AFAIU.)

The other problem is with CASECHARS, but that is unavailable with
aspell as well, we are just guessing there.  We could guess the same
for hunspell.
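
To illustrate how little needs to be read if only the SET line matters, here is a hedged sketch that extracts the encoding from a single .aff file on demand. The function name is hypothetical and the regexp assumes the usual "SET <encoding>" layout at the start of a line.

```elisp
(defun my-hunspell-aff-charset (aff-file)
  "Return the encoding named on the SET line of AFF-FILE, or nil if none."
  (with-temp-buffer
    (insert-file-contents aff-file)
    (goto-char (point-min))
    ;; Match the keyword case-sensitively, as it is written in .aff files.
    (let ((case-fold-search nil))
      (when (re-search-forward "^SET[ \t]+\\([^ \t\r\n]+\\)" nil t)
        (match-string 1)))))
```

It could be called as (my-hunspell-aff-charset "/usr/share/myspell/en_US.aff") only at the moment a dictionary is selected, so start-up does not have to touch every installed file.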






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 15:24           ` Agustin Martin
  2013-01-17 16:31             ` Stefan Monnier
  2013-01-17 16:41             ` Eli Zaretskii
@ 2013-01-17 18:08             ` Glenn Morris
       [not found]             ` <7076415.12428.1358446115519.JavaMail.root@mx1-new.spamfiltro.es>
  3 siblings, 0 replies; 33+ messages in thread
From: Glenn Morris @ 2013-01-17 18:08 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13460

Agustin Martin wrote:

> The problem is that hunspell -D does not return control.
[...]
> A workaround was proposed (redirecting from /dev/null), but it seems too
> UNIX biassed.

"UNIX biased" -> "does not work on MS Windows" ?

If the equivalent of "hunspell < null-device" works on MS Windows,
that's one problem easily solved, no?






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 16:41             ` Eli Zaretskii
@ 2013-01-17 18:12               ` Agustin Martin
  2013-01-17 18:42                 ` Eli Zaretskii
       [not found]                 ` <11624660.12538.1358448223517.JavaMail.root@mx1-new.spamfiltro.es>
  0 siblings, 2 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 18:12 UTC (permalink / raw)
  To: 13460

On Thu, Jan 17, 2013 at 06:41:20PM +0200, Eli Zaretskii wrote:
> > Date: Thu, 17 Jan 2013 16:24:16 +0100
> > From: Agustin Martin <agustin.martin@hispalinux.es>
> > 
> > There is a second issue I forgot, one needs to get info from the installed
> > .aff files, so all of them must be completely opened to look for that info
> > (OTHERCHARS and friends) and I'd expect that to slow Emacs init a bit.
> 
> You don't need OTHERCHARS, only the SET, to figure out the encoding in
> which to talk to hunspell for each dictionary.  (OTHERCHARS cannot be
> gleaned from the hunspell .aff files anyway, AFAIU.)

Sorry, I should have written WORDCHARS.

> The other problem is with CASECHARS, but that is unavailable with
> aspell as well, we are just guessing there.  We could guess the same
> for hunspell.

Agreed.

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 16:31             ` Stefan Monnier
@ 2013-01-17 18:15               ` Agustin Martin
  0 siblings, 0 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 18:15 UTC (permalink / raw)
  To: 13460

On Thu, Jan 17, 2013 at 11:31:00AM -0500, Stefan Monnier wrote:
> > There is a second issue I forgot, one needs to get info from the installed
> > .aff files, so all of them must be completely opened to look for that info
> > (OTHERCHARS and friends) and I'd expect that to slow Emacs init a bit.
> 
> We don't have to do that just to get the list of languages: it can be
> delayed to the moment a particular language is selected.

Good point, thanks. And that info can even be cached. I like this.

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
       [not found]     ` <20130117131733.GA20519@omega.in.herr-schmitt.de>
@ 2013-01-17 18:19       ` Glenn Morris
  2013-01-17 19:30         ` Agustin Martin
  0 siblings, 1 reply; 33+ messages in thread
From: Glenn Morris @ 2013-01-17 18:19 UTC (permalink / raw)
  To: Jochen Schmitt; +Cc: 13460


(Please keep the debbugs address cc'd.
Resending your comments so that they are more visible.)

Date: Thu, 17 Jan 2013 14:17:34 +0100
From: Jochen Schmitt <Jochen@herr-schmitt.de>

I have tried to create a suggestion for a general solution for
this issue in the next release of Emacs.

I have attached a patch to this mail which introduces an alist to
translate dictionary names like 'english' into the form which will
be accepted by hunspell.

I have done a first short test to check that this is a
working solution.
 
Of course, ispell-hunspell-dictionary-alist needs extension, because 
I have put in only two entries to be able to check my solution.

Best Regards:

Jochen Schmitt

diff -up emacs-24.2/lisp/textmodes/ispell.el.hunspell emacs-24.2/lisp/textmodes/ispell.el
--- emacs-24.2/lisp/textmodes/ispell.el.hunspell	2013-01-17 13:17:45.389785784 +0100
+++ emacs-24.2/lisp/textmodes/ispell.el	2013-01-17 13:19:43.388797273 +0100
@@ -572,6 +572,13 @@ re-start Emacs."
 		       (coding-system :tag "Coding System")))
   :group 'ispell)
 
+(defvar ispell-hunspell-dictionary-alist
+  '((nil "en_GB")
+    ("english" "en_GB")
+    ("american" "en_US")
+   )
+  "Associating list between apell and hunspell dictionaries names"
+)
 
 (defvar ispell-dictionary-base-alist
   '((nil
@@ -2610,7 +2617,9 @@ Keeps argument list for future ispell in
           (append
            (if (and ispell-current-dictionary      ; Not for default dict (nil)
                     (not (member "-d" orig-args))) ; Only define if not overridden.
-               (list "-d" ispell-current-dictionary))
+               (list "-d" (if ispell-really-hunspell
+			      (cadr (assoc ispell-current-dictionary ispell-hunspell-dictionary-alist))
+			   ispell-current-dictionary)))
            orig-args
            (if ispell-current-personal-dictionary ; Use specified pers dict.
                (list "-p"






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 18:12               ` Agustin Martin
@ 2013-01-17 18:42                 ` Eli Zaretskii
       [not found]                 ` <11624660.12538.1358448223517.JavaMail.root@mx1-new.spamfiltro.es>
  1 sibling, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-17 18:42 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13460

> Date: Thu, 17 Jan 2013 19:12:34 +0100
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> Sorry, I should have written WORDCHARS.

Why do we need that?






* bug#13460: Issue to change dictionary when using hunspell on emacs
       [not found]             ` <7076415.12428.1358446115519.JavaMail.root@mx1-new.spamfiltro.es>
@ 2013-01-17 18:44               ` Agustin Martin
  0 siblings, 0 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 18:44 UTC (permalink / raw)
  To: 13460

On Thu, Jan 17, 2013 at 01:08:30PM -0500, Glenn Morris wrote:
> Agustin Martin wrote:
> 
> > The problem is that hunspell -D does not return control.
> [...]
> > A workaround was proposed (redirecting from /dev/null), but it seems too
> > UNIX biassed.
> 
> "UNIX biased" -> "does not work on MS Windows" ?
> 
> If the equivalent of "hunspell < null-device" works on MS Windows,
> that's one problem easily solved, no?

I do not use MS Windows myself, so I am just guessing at possible problems.
I was thinking about starting to play with /dev/null on my Debian
box, but Eli proposed 'null-device', which seems better.

Anyway, I'd like to have a look at this, but I have little spare time
now. I will try to find some time soon, but cannot promise. So if someone
does some work on this, it is welcome.

Regards,

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
       [not found]                 ` <11624660.12538.1358448223517.JavaMail.root@mx1-new.spamfiltro.es>
@ 2013-01-17 19:06                   ` Agustin Martin
  2013-01-17 19:36                     ` Eli Zaretskii
  0 siblings, 1 reply; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 19:06 UTC (permalink / raw)
  To: 13460

On Thu, Jan 17, 2013 at 08:42:58PM +0200, Eli Zaretskii wrote:
> > Date: Thu, 17 Jan 2013 19:12:34 +0100
> > From: Agustin Martin <agustin.martin@hispalinux.es>
> > 
> > Sorry, I should have written WORDCHARS.
> 
> Why do we need that?

This is what ispell.el calls otherchars. Parsing WORDCHARS ensures that both
hunspell and ispell.el think about the same characters in that category.

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 18:19       ` Glenn Morris
@ 2013-01-17 19:30         ` Agustin Martin
  2013-01-18 17:05           ` Agustin Martin
  0 siblings, 1 reply; 33+ messages in thread
From: Agustin Martin @ 2013-01-17 19:30 UTC (permalink / raw)
  To: 13460, Jochen Schmitt

From: Jochen Schmitt
> 
> I have tried to create a suggestion for a general solution for
> this issue in the next release of Emacs.
> 
> I have attached a patch to this mail which introduces an alist to
> translate dictionary names like 'english' into the form which will
> be accepted by hunspell.
> 
> I have done a first short test to check that this is a
> working solution.
>  
> Of course, ispell-hunspell-dictionary-alist needs extension, because 
> I have put in only two entries to be able to check my solution.

Hi Jochen. Thanks a lot for your feedback (and to Glenn for forwarding it),
you can send your followups to the bug address.

I remember having done some initial work with an alias file for hunspell,
but I cannot find it now.

I'd keep the name `ispell-hunspell-dictionary-alist' for the alist of
actually found dicts, once implemented in one way or another. I vaguely
remember using something like `ispell-hunspell-dictionary-equivs-alist' for
the purpose of having a list of equivalences. I'd also not hardcode nil to
"en_GB".

For the rest I think it can be useful as a temporary workaround, but I'd
prefer to see these changes in a sanitized dictionary alist for hunspell,
something similar to what is done in `ispell-set-spellchecker-params' to use
[:alpha:] when possible, but for this purpose and limited to the original
base alist. Otherwise it will only work for entries not using explicit "-d"
and may hide personal choices in ~/.emacs pointing to non-standard locations
(e.g., "american" using "~/personal/en_US"). (I looked quickly; maybe I missed
something.)

Let me try to find where I have my previous work and, what is harder, try to
find the time. This should not be that time consuming, so I expect to look
at this shortly.

Thanks again. 

Regards.

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 19:06                   ` Agustin Martin
@ 2013-01-17 19:36                     ` Eli Zaretskii
  0 siblings, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-17 19:36 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13460

> Date: Thu, 17 Jan 2013 20:06:31 +0100
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> On Thu, Jan 17, 2013 at 08:42:58PM +0200, Eli Zaretskii wrote:
> > > Date: Thu, 17 Jan 2013 19:12:34 +0100
> > > From: Agustin Martin <agustin.martin@hispalinux.es>
> > > 
> > > Sorry, I should have written WORDCHARS.
> > 
> > Why do we need that?
> 
> This is what ispell.el calls otherchars. Parsing WORDCHARS ensures that both
> hunspell and ispell.el think about the same characters in that category.

I think you are mistaken, that's not my reading of hunspell(4).






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-17 19:30         ` Agustin Martin
@ 2013-01-18 17:05           ` Agustin Martin
  2013-01-18 18:03             ` Jochen Schmitt
  2013-01-21  9:43             ` Jochen Schmitt
  0 siblings, 2 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-18 17:05 UTC (permalink / raw)
  To: 13460; +Cc: Jochen Schmitt

[-- Attachment #1: Type: text/plain, Size: 2782 bytes --]

On Thu, Jan 17, 2013 at 08:30:29PM +0100, Agustin Martin wrote:
> From: Jochen Schmitt
> > 
> > I have tried to create a suggestion for a general solution for
> > this issue in the next release of Emacs.
> > 
> > I have attached a patch to this mail which introduces an alist to
> > translate dictionary names like 'english' into the form which will
> > be accepted by hunspell.
> > 
> > I have done a first short test to check that this is a
> > working solution.
> >  
> > Of course, ispell-hunspell-dictionary-alist needs extension, because 
> > I have put in only two entries to be able to check my solution.
> 
> Hi Jochen. Thanks a lot for your feedback (and to Glenn for forwarding it),
> you can send your followups to the bug address.
> 
> I remember having done some initial work with an alias file for hunspell,
> but I cannot find it now.
> 
> I'd keep the name `ispell-hunspell-dictionary-alist' for the alist of
> actually found dicts, once implemented in one way or another. I vaguely
> remember using something like `ispell-hunspell-dictionary-equivs-alist' for
> the purpose of having a list of equivalences. I'd also not hardcode nil to
> "en_GB".
> 
> For the rest I think it can be useful as a temporary workaround, but I'd
> prefer to see these changes in a sanitized dictionary alist for hunspell,
> something similar to what is done in `ispell-set-spellchecker-params' to use
> [:alpha:] when possible, but for this purpose and limited to the original
> base alist. Otherwise it will only work for entries not using explicit "-d"
> and may hide personal choices in ~/.emacs pointing to non-standard locations
> (e.g., "american" using "~/personal/en_US"). (I looked quickly; maybe I missed
> something.)
> 
> Let me try to find where I have my previous work and, what is harder, try to
> find the time. This should not be that time consuming, so I expect to look
> at this shortly.

I have been playing with this. Please see attached patch for current status.
There are a couple of minor things I would like to think about first.

Current changes explicitly set "english" to one of the two main choices
("en_GB"). This is not something I like very much, and I am aware that
people are sensitive about this. I'd prefer to associate it with plain "en",
but hunspell has some pending issues regarding fallback values.

Since there should be mappings for all (but nil) default dict definitions,
and this is only done for those dicts, I am also considering showing an error
if an expected mapping is not found, but this is a really minor internal
issue just to help find missing mappings early.

I will test these changes a bit more and, if no problems appear, will commit
early next week. Feedback is welcome.

Thanks again for your suggestions.

-- 
Agustin

[-- Attachment #2: ispell.el_hunspell-default-dict-names-mapping.diff --]
[-- Type: text/x-diff, Size: 3457 bytes --]

--- ispell.el.orig	2013-01-18 15:35:17.804804007 +0100
+++ ispell.el	2013-01-18 18:03:01.017847901 +0100
@@ -773,6 +773,41 @@
 (make-obsolete-variable 'ispell-aspell-supports-utf8
                         'ispell-encoding8-command "23.1")
 
+(defvar ispell-hunspell-dictionary-equivs-alist
+  '(("american"      "en_US")
+    ("brasileiro"    "pt_BR")
+    ("british"       "en_GB")
+    ("castellano"    "es_ES")
+    ("castellano8"   "es_ES")
+    ("czech"         "cs_CZ")
+    ("dansk"         "da_DK")
+    ("deutsch"       "de_DE")
+    ("deutsch8"      "de_DE")
+    ("english"       "en_GB")
+    ("esperanto"     "eo")
+    ("esperanto-tex" "eo")
+    ("finnish"       "fi_FI")
+    ("francais7"     "fr_FR")
+    ("francais"      "fr_FR")
+    ("francais-tex"  "fr_FR")
+    ("german"        "de_DE")
+    ("german8"       "de_DE")
+    ("italiano"      "it_IT")
+    ("nederlands"    "nl_NL")
+    ("nederlands8"   "nl_NL")
+    ("norsk"         "nn_NO")
+    ("norsk7-tex"    "nn_NO")
+    ("polish"        "pl_PL")
+    ("portugues"     "pt_PT")
+    ("russian"       "ru_RU")
+    ("russianw"      "ru_RU")
+    ("slovak"        "sk_SK")
+    ("slovenian"     "sl_SI")
+    ("svenska"       "sv_SE")
+    ("hebrew"        "he_IL"))
+  "Alist with matching hunspell dict names for standard dict names in
+  `ispell-dictionary-base-alist'.")
+
 (defvar ispell-emacs-alpha-regexp
   (if (string-match "^[[:alpha:]]+$" "abcde")
       "[[:alpha:]]"
@@ -1134,9 +1169,52 @@
 		    ispell-encoding8-command)
 	       ispell-aspell-dictionary-alist
 	     nil))
+	  (ispell-dictionary-base-alist ispell-dictionary-base-alist)
 	  ispell-base-dicts-override-alist ; Override only base-dicts-alist
 	  all-dicts-alist)
 
+      ;; While ispell and aspell (through aliases) use the traditional
+      ;; dict naming originally expected by ispell.el, hunspell
+      ;; uses locale based names with no alias.  We need to map
+      ;; standard names to locale based names to make default dict
+      ;; definitions available for hunspell.
+      (if ispell-really-hunspell
+	  (let (tmp-dicts-alist)
+	    (dolist (adict ispell-dictionary-base-alist)
+	      (let* ((dict-name (nth 0 adict))
+		     (ispell-args (nth 5 adict))
+		     (ispell-args-has-d (member "-d" ispell-args)))
+		;; Remove "-d" option from `ispell-args' if present
+		(if ispell-args-has-d
+		    (let ((ispell-args-after-d
+			   (cdr (cdr ispell-args-has-d)))
+			  (ispell-args-before-d
+			   (butlast ispell-args (length ispell-args-has-d))))
+		      (setq ispell-args
+			    (nconc ispell-args-before-d
+				   ispell-args-after-d))))
+		;; Unless default dict, re-add "-d" option with the mapped value
+		(if dict-name
+		    (nconc ispell-args
+			   (list "-d"
+				 (or (cadr (assoc
+					    dict-name
+					    ispell-hunspell-dictionary-equivs-alist))
+				     dict-name))))
+
+		(add-to-list 'tmp-dicts-alist
+			     (list
+			      dict-name      ; dict name
+			      (nth 1 adict)  ; casechars
+			      (nth 2 adict)  ; not-casechars
+			      (nth 3 adict)  ; otherchars
+			      (nth 4 adict)  ; many-otherchars-p
+			      ispell-args    ; ispell-args
+			      (nth 6 adict)  ; extended-character-mode
+			      (nth 7 adict)  ; dict encoding
+			      )))
+	      (setq ispell-dictionary-base-alist tmp-dicts-alist))))
+
       (run-hooks 'ispell-initialize-spellchecker-hook)
 
       ;; Add dicts to ``ispell-dictionary-alist'' unless already present.


* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-18 17:05           ` Agustin Martin
@ 2013-01-18 18:03             ` Jochen Schmitt
  2013-01-18 19:03               ` Eli Zaretskii
  2013-01-18 19:05               ` Agustin Martin
  2013-01-21  9:43             ` Jochen Schmitt
  1 sibling, 2 replies; 33+ messages in thread
From: Jochen Schmitt @ 2013-01-18 18:03 UTC (permalink / raw)
  To: 13460

On Fri, Jan 18, 2013 at 06:05:01PM +0100, Agustin Martin wrote:
> On Thu, Jan 17, 2013 at 08:30:29PM +0100, Agustin Martin wrote:
> There are a couple of minor things I would like to think about first.
> 
> Current changes explicitly set "english" to one of the two main choices
> ("en_GB"). This  is not something I like very much and I am aware that
> people is sensitive about this. I'd prefer to associate it with plain "en",
>

I have found out that hunspell will accept -d en_GB,en_US as
a parameter, so this issue may be fixed.
 
> I will test these changes a bit more and if no problems appear will commit
> early next week. Feedback is welcome.
>
I would be happy if you could notify me about the state of your work,
because I would like this patch to be integrated into the official
emacs package of Fedora Linux. This is important from my point of
view, because hunspell is the default spell checking application
in Fedora.

Best Regards:

Jochen Schmitt






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-18 18:03             ` Jochen Schmitt
@ 2013-01-18 19:03               ` Eli Zaretskii
  2013-01-18 19:23                 ` Agustin Martin
  2013-01-18 19:05               ` Agustin Martin
  1 sibling, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2013-01-18 19:03 UTC (permalink / raw)
  To: Jochen Schmitt; +Cc: 13460

> Date: Fri, 18 Jan 2013 19:03:34 +0100
> From: Jochen Schmitt <Jochen@herr-schmitt.de>
> 
> I have found out that hunspell will accept -d en_GB,en_US as
> a parameter, so this issue may be fixed.

Beware: when you invoke hunspell like that, it uses the .aff file from
the first dictionary only, and ignores any .aff files of the other
dictionaries.  This could bite you where US and GB English differ.

In general, this option is meant to support _additional_ dictionaries
in the same language, like if you want to use a specialized dictionary
for medicine or some other discipline, together with a general-purpose
dictionary for the same language.
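
To make the point above concrete, here is a minimal Emacs Lisp sketch of
how a comma-separated -d value breaks down. The helper name
`my-hunspell-split-dicts' is hypothetical and not part of ispell.el; it
only illustrates that the first name is the base dictionary (the one whose
.aff file is used) and the remaining names contribute word lists only.

```elisp
(defun my-hunspell-split-dicts (dict-arg)
  "Split DICT-ARG, a comma-separated hunspell -d value.
Return a cons (BASE . EXTRAS).  Hypothetical helper, for illustration only."
  (let ((names (split-string dict-arg "," t)))
    (cons (car names) (cdr names))))

;; BASE is "en_GB": its .aff file is the one hunspell uses.
;; EXTRAS is ("en_US"): it is treated as an additional dictionary.
(my-hunspell-split-dicts "en_GB,en_US")
```

With "en_GB,en_US" the affix rules therefore come from en_GB alone, which
is why mixing the two variants this way can give surprising results.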






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-18 18:03             ` Jochen Schmitt
  2013-01-18 19:03               ` Eli Zaretskii
@ 2013-01-18 19:05               ` Agustin Martin
  2013-01-21 16:52                 ` Agustin Martin
  1 sibling, 1 reply; 33+ messages in thread
From: Agustin Martin @ 2013-01-18 19:05 UTC (permalink / raw)
  To: 13460; +Cc: Jochen Schmitt

On Fri, Jan 18, 2013 at 07:03:34PM +0100, Jochen Schmitt wrote:
> On Fri, Jan 18, 2013 at 06:05:01PM +0100, Agustin Martin wrote:
> > On Thu, Jan 17, 2013 at 08:30:29PM +0100, Agustin Martin wrote:
> > There are a couple of minor things I would like to think about first.
> > 
> > Current changes explicitly set "english" to one of the two main choices
> > ("en_GB"). This  is not something I like very much and I am aware that
> > people is sensitive about this. I'd prefer to associate it with plain "en",
> >
> 
> I have found out that hunspell will accept -d en_GB,en_US as
> a parameter, so this issue may be fixed.

Thanks for the info.

I am not a native English speaker, so I am a bit unsure whether this is
the desired behavior; people may get puzzled by "english" simultaneously
accepting "center/centre", "colour/color" and friends. What do native
English speakers think about this?

There is also the fact that the first dict in that list must always be
installed, otherwise we get the dict-not-found error. In most setups both
dicts are installed, so this should not be a big problem, but I'd put the
most popular one, en_US, first.

> > I will test these changes a bit more and if no problems appear will commit
> > early next week. Feedback is welcome.
> >
> I would be happy if you could notify me about the state of your work,
> because I would like this patch to be integrated into the official
> emacs package of Fedora Linux. This is important from my point of
> view, because hunspell is the default spell checking application
> in Fedora.

Once I commit changes I will close the bug report and you will receive a
message about it. If you want the final diff I can attach it to the closing
message.

Regards,

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-18 19:03               ` Eli Zaretskii
@ 2013-01-18 19:23                 ` Agustin Martin
  0 siblings, 0 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-18 19:23 UTC (permalink / raw)
  To: 13460; +Cc: Jochen Schmitt

On Fri, Jan 18, 2013 at 09:03:17PM +0200, Eli Zaretskii wrote:
> > Date: Fri, 18 Jan 2013 19:03:34 +0100
> > From: Jochen Schmitt <Jochen@herr-schmitt.de>
> > 
> > I have found out that hunspell will accept -d en_GB,en_US as
> > a parameter, so this issue may be fixed.
> 
> Beware: when you invoke hunspell like that, it uses the .aff file from
> the first dictionary only, and ignores any .aff files of the other
> dictionaries.  This could bite you where US and GB English differ.
> 
> In general, this option is meant to support _additional_ dictionaries
> in the same language, like if you want to use a specialized dictionary
> for medicine or some other discipline, together with a general-purpose
> dictionary for the same language.

I replied to Jochen's message just before receiving yours. I had forgotten
that this option is indeed meant for additional dictionaries and that only
the first .aff file is used; thanks for the reminder.

I can now think of another possible problem with this. hunspell accepts
both hunspell-only and old myspell dicts. If e.g. en_GB has a myspell .aff
file (and a .dic file for that .aff file) and en_US a hunspell-only one (and
an associated .dic file for it), then with "-d en_GB,en_US" the interaction
may become at least strange, if not unpredictable.

-- 
Agustin






* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-18 17:05           ` Agustin Martin
  2013-01-18 18:03             ` Jochen Schmitt
@ 2013-01-21  9:43             ` Jochen Schmitt
  1 sibling, 0 replies; 33+ messages in thread
From: Jochen Schmitt @ 2013-01-21  9:43 UTC (permalink / raw)
  To: 13460

[-- Attachment #1: Type: text/plain, Size: 1209 bytes --]

On Fri, Jan 18, 2013 at 06:05:01PM +0100, Agustin Martin wrote:

> I have been playing with this. Please see attached patch for current status.
> There are a couple of minor things I would like to think about first.
> 
> Current changes explicitly set "english" to one of the two main choices
> ("en_GB"). This  is not something I like very much and I am aware that
> people is sensitive about this. I'd prefer to associate it with plain "en",
> but hunspell has some pending issues regarding fallback values.
> 
> Since there should be mapppings for all (but nil) default dict definitions, 
> and this is only done for those dicts I am also considering to show an error
> if an expected mappping is not found, but this is a really minor internal
> issue just to help finding missing mappings early.
> 
> I will test these changes a bit more and if no problems appear will commit
> early next week. Feedback is welcome.

I have added a minor change to your suggested patch to generate
an error message if a language doesn't exist in
ispell-hunspell-dictionary-equivs-alist.

My tests show that this patch works as expected.

I have attached the modified version of the patch to this mail.

Best Regards:

Jochen Schmitt

[-- Attachment #2: emacs-24.2-hunspell.patch --]
[-- Type: text/plain, Size: 4375 bytes --]

diff -up emacs-24.2/lisp/textmodes/ispell.el.hunspell emacs-24.2/lisp/textmodes/ispell.el
--- emacs-24.2/lisp/textmodes/ispell.el.hunspell	2013-01-19 12:38:21.365802034 +0100
+++ emacs-24.2/lisp/textmodes/ispell.el	2013-01-19 14:32:10.527026717 +0100
@@ -572,6 +572,40 @@ re-start Emacs."
 		       (coding-system :tag "Coding System")))
   :group 'ispell)
 
+(defvar ispell-hunspell-dictionary-equivs-alist
+  '(("american"      "en_US")
+    ("brasileiro"    "pt_BR")
+    ("british"       "en_GB")
+    ("castellano"    "es_ES")
+    ("castellano8"   "es_ES")
+    ("czech"         "cs_CZ")
+    ("dansk"         "da_DK")
+    ("deutsch"       "de_DE")
+    ("deutsch8"      "de_DE")
+    ("english"       "en_GB,en_US")
+    ("esperanto"     "eo")
+    ("esperanto-tex" "eo")
+    ("finnish"       "fi_FI")
+    ("francais7"     "fr_FR")
+    ("francais"      "fr_FR")
+    ("francais-tex"  "fr_FR")
+    ("german"        "de_DE")
+    ("german8"       "de_DE")
+    ("italiano"      "it_IT")
+    ("nederlands"    "nl_NL")
+    ("nederlands8"   "nl_NL")
+    ("norsk"         "nn_NO")
+    ("norsk7-tex"    "nn_NO")
+    ("polish"        "pl_PL")
+    ("portugues"     "pt_PT")
+    ("russian"       "ru_RU")
+    ("russianw"      "ru_RU")
+    ("slovak"        "sk_SK")
+    ("slovenian"     "sl_SI")
+    ("svenska"       "sv_SE")
+    ("hebrew"        "he_IL"))
+  "Alist with matching hunspell dict names for standard dict names in
+  `ispell-dictionary-base-alist'.")
 
 (defvar ispell-dictionary-base-alist
   '((nil
@@ -1077,9 +1111,15 @@ time, before `ispell-dictionary-alist' i
 sysadmins to override entries in `ispell-dictionary-base-alist'
 by putting those overrides in `ispell-base-dicts-override-alist', which is
 a dynamically scoped var with same format as `ispell-dictionary-alist'.
 This alist will not override the auto-detected values (e.g. if a recent
 aspell is used along with Emacs).")
 
+(defun ispell-hunspell-dictionary-option (dict)
+  (let ((ret (cadr (assoc dict ispell-hunspell-dictionary-equivs-alist))))
+       (list "-d" (if (null ret) 
+		      (error "Hunspell doesn't support dictionary '%s'" dict)
+		      ret))))
+
 (defun ispell-set-spellchecker-params ()
   "Initialize some spellchecker parameters when changed or first used."
   (unless (eq ispell-last-program-name ispell-program-name)
@@ -1106,9 +1146,47 @@ aspell is used along with Emacs).")
 		    ispell-encoding8-command)
 	       ispell-aspell-dictionary-alist
 	     nil))
+	  (ispell-dictionary-base-alist ispell-dictionary-base-alist)
 	  ispell-base-dicts-override-alist ; Override only base-dicts-alist
 	  all-dicts-alist)
 
+      ;; While ispell and aspell (through aliases) use the traditional
+      ;; dict naming originally expected by ispell.el, hunspell
+      ;; uses locale based names with no alias.  We need to map
+      ;; standard names to locale based names to make default dict
+      ;; definitions available for hunspell.
+      (if ispell-really-hunspell
+	  (let (tmp-dicts-alist)
+	    (dolist (adict ispell-dictionary-base-alist)
+	      (let* ((dict-name (nth 0 adict))
+		     (ispell-args (nth 5 adict))
+		     (ispell-args-has-d (member "-d" ispell-args)))
+		;; Remove "-d" option from `ispell-args' if present
+		(if ispell-args-has-d
+		    (let ((ispell-args-after-d
+			   (cdr (cdr ispell-args-has-d)))
+			  (ispell-args-before-d
+			   (butlast ispell-args (length ispell-args-has-d))))
+		      (setq ispell-args
+			    (nconc ispell-args-before-d
+				   ispell-args-after-d))))
+		;; Unless default dict, re-add "-d" option with the mapped value
+		(if dict-name 
+		    (nconc ispell-args
+			   (ispell-hunspell-dictionary-option dict-name)))
+		(add-to-list 'tmp-dicts-alist
+			     (list
+			      dict-name      ; dict name
+			      (nth 1 adict)  ; casechars
+			      (nth 2 adict)  ; not-casechars
+			      (nth 3 adict)  ; otherchars
+			      (nth 4 adict)  ; many-otherchars-p
+			      ispell-args    ; ispell-args
+			      (nth 6 adict)  ; extended-character-mode
+			      (nth 7 adict)  ; dict encoding
+			      )))
+	      (setq ispell-dictionary-base-alist tmp-dicts-alist))))
+
       (run-hooks 'ispell-initialize-spellchecker-hook)
 
       ;; Add dicts to ``ispell-dictionary-alist'' unless already present.


* bug#13460: Issue to change dictionary when using hunspell on emacs
  2013-01-18 19:05               ` Agustin Martin
@ 2013-01-21 16:52                 ` Agustin Martin
  0 siblings, 0 replies; 33+ messages in thread
From: Agustin Martin @ 2013-01-21 16:52 UTC (permalink / raw)
  To: 13460-done

On Fri, Jan 18, 2013 at 08:05:41PM +0100, Agustin Martin wrote:
> On Fri, Jan 18, 2013 at 07:03:34PM +0100, Jochen Schmitt wrote:
> > On Fri, Jan 18, 2013 at 06:05:01PM +0100, Agustin Martin wrote:
> > > I will test these changes a bit more and if no problems appear will commit
> > > early next week. Feedback is welcome.
> > >
> > I will be happy, if you can notifiy me about the state of your work, 
> > because I want that this patch may be integrated in the official
> > emacs package of Fedora Linux. This is important from my point of
> > view, because hunspell is the default spell checking application
> > in Fedora.
> 
> Once I commit changes I will close the bug report and you will receive a
> message about it. If you want the final diff I can attach it to the closing
> message.

Fix committed, closing bug report.

For "english" I left en_US alone. For the reasons given in the bug thread I
think it is not a good idea to mix them.

I have relaxed the check a bit: if a standard dict does not have an
associated hunspell mapping, it is ignored for hunspell with a warning,
so an error message will only appear when trying to use that dict, instead
of every time hunspell is used as the spellchecker.
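
The relaxed check described above can be sketched roughly as follows. This
is a minimal illustration, not the committed code: the alist
`my-hunspell-equivs' and the function `my-hunspell-map-dicts' are
hypothetical names, and the alist is a shortened stand-in for the real
equivalents table.

```elisp
(defvar my-hunspell-equivs
  '(("american" "en_US")
    ("english"  "en_US")
    ("deutsch"  "de_DE"))
  "Hypothetical, shortened stand-in for the real equivalents alist.")

(defun my-hunspell-map-dicts (dict-names)
  "Return (NAME . HUNSPELL-NAME) pairs for DICT-NAMES.
Names without a mapping are skipped with a message instead of
signalling an error."
  (let (result)
    (dolist (name dict-names)
      (let ((equiv (cadr (assoc name my-hunspell-equivs))))
        (if equiv
            (push (cons name equiv) result)
          ;; Warn and carry on; the unmapped dict is simply not offered.
          (message "Missing hunspell equiv for \"%s\".  Skipping." name))))
    (nreverse result)))

;; "klingon" has no mapping, so it is dropped with a message.
(my-hunspell-map-dicts '("american" "klingon" "deutsch"))
```

Because an unmapped name never reaches the resulting list, any error is
deferred until someone actually asks for that dictionary.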

Regards,

-- 
Agustin






* bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
  2013-01-16 12:25 bug#13460: Issue to change dictionary when using hunspell on emacs Jochen Schmitt
  2013-01-16 18:01 ` Eli Zaretskii
@ 2013-02-20 17:50 ` Agustin Martin
  2013-02-20 19:00   ` Eli Zaretskii
  2013-04-04 14:41 ` bug#13639: " Jacek Chrząszcz
  2 siblings, 1 reply; 33+ messages in thread
From: Agustin Martin @ 2013-02-20 17:50 UTC (permalink / raw)
  To: 13639

[-- Attachment #1: Type: text/plain, Size: 1410 bytes --]

On Thu, Jan 17, 2013 at 09:36:09PM +0200, Eli Zaretskii wrote:
> > On Thu, Jan 17, 2013 at 08:42:58PM +0200, Eli Zaretskii wrote:
> > > > Date: Thu, 17 Jan 2013 19:12:34 +0100
> > > > From: Agustin Martin <agustin.martin@hispalinux.es>
> > > > 
> > > > Sorry, I should have written WORDCHARS.
> > > 
> > > Why do we need that?
> > 
> > This is what ispell.el calls otherchars. Parsing WORDCHARS ensures that
> > both
> > hunspell and ispell.el think about the same characters in that category.
> 
> I think you are mistaken, that's not my reading of hunspell(4).

Sorry for the late reply,

(Opening a new thread specifically about hunspell dicts autodetection,
using the newly cloned bug report #13639 for it)

Although the WORDCHARS description in hunspell(4)

WORDCHARS characters
   WORDCHARS extends tokenizer of Hunspell command line interface
   with additional word character. For example, dot, dash, n-dash, numbers,
   percent sign are word character in Hungarian.

is too Hungarian-biased and does not mention the usual apostrophe, AFAIK it
mostly refers to the same thing as 'otherchars', although hunspell may
accept those characters in locations other than the middle of a word.

The good news is that I have started working on hunspell dicts
autodetection. For those curious, I am attaching my initial test suite. I
am currently integrating this into ispell.el (unfortunately slowly, due to
time constraints).

-- 
Agustin

[-- Attachment #2: hunspell-autodetect.el --]
[-- Type: text/plain, Size: 4540 bytes --]

(require 'ispell)

(setq ispell-debug t)
(setq ispell-program-name "hunspell")

(setq ispell-hunspell-dict-paths-alist nil)
(setq ispell-hunspell-dictionary-alist nil)

(defun ispell-print-if-debug (string)
  "Print STRING as a message if `ispell-debug' is non-nil."
  (if ispell-debug
      (message "%s" string)))

(defun ispell-replace-dictionary-entry (dicts-alist new-entry)
  "Replace old entry in `DICTS-ALIST' with `NEW-ENTRY'.
Mostly intended to play with `ispell-dictionary-alist' and friends."
  (let (newlist)
    (dolist (entry dicts-alist)
      (if (string= (car new-entry) (car entry))
	  (add-to-list 'newlist new-entry)
	(add-to-list 'newlist entry)))
    newlist))

(defun ispell-parse-hunspell-affix-file (dict-name)
  "Parse hunspell affix file for `dict-name'.
Return a list in `ispell-dictionary-alist' format."
  (let* ((path (cadr (assoc dict-name ispell-hunspell-dict-paths-alist)))
	 (affix-file (concat path dict-name ".aff")))
    (unless path
      (error "No matching entry for %s" dict-name))
    (if (file-exists-p affix-file)
	(with-temp-buffer
	  (insert-file-contents affix-file)
	  (let (otherchars-string otherchars-list)
	    (setq otherchars-string
		  (save-excursion
		    (goto-char (point-min))
		    (if (search-forward-regexp "^WORDCHARS +" nil t )
			(buffer-substring (point)
					  (progn (end-of-line) (point))))))
	    ;; Remove trailing whitespace and extra stuff. Make list if non-nil.
	    (setq otherchars-list
		  (if otherchars-string
		      (split-string
		       (if (string-match " +.*$" otherchars-string)
			   (replace-match "" nil nil otherchars-string)
			 otherchars-string)
		       "" t)))

	    ;; Fill dict entry
	    (list dict-name
		  "[[:alpha:]]"
		  "[^[:alpha:]]"
		  (if otherchars-list
		      (regexp-opt otherchars-list)
		    "")
		  t                      ;; many-otherchars-p: We can't tell, set to t
		  (list "-d" dict-name)
		  nil                    ;; extended-char-mode: not supported by hunspell
		  'utf-8)))
      (error "File \"%s\" not found" affix-file))))

(defun ispell-find-hunspell-dictionaries ()
  "Parse installed hunspell dictionaries."
  (let ((hunspell-found-dicts
	 (split-string
	  (with-temp-buffer
	    (ispell-call-process ispell-program-name
				 null-device
				 t
				 nil
				 "-D")
	    (buffer-string))
	  "[\n\r]+"
	  t))
	hunspell-default-dict
	hunspell-default-dict-entry)
    (dolist (dict hunspell-found-dicts)
      (let* ((full-name (file-name-nondirectory dict))
	     (path      (file-name-directory dict))
	     (basename  (file-name-sans-extension full-name)))
	(if (string-match "\\.aff$" dict)
	    ;; Found default dictionary
	    (if hunspell-default-dict
		(error "Default dict already defined as %s. Not using %s."
		       hunspell-default-dict dict)
	      (setq hunspell-default-dict (list basename path)))
	  (if (and (not (assoc basename ispell-hunspell-dict-paths-alist))
		   (file-exists-p (concat dict ".aff")))
	      ;; Entry has an associated .aff file and no previous value.
	      (progn
		(ispell-print-if-debug
		 (format "++ dict-entry:%s name:%s basename:%s path:%s aff:%s"
			 dict full-name basename path (concat dict ".aff")))
		(add-to-list 'ispell-hunspell-dict-paths-alist
			     (list basename path)))
	    (ispell-print-if-debug
	     (format "-- Skipping %s" dict))))))
    ;; Parse values for default dictionary.
    (setq hunspell-default-dict (car hunspell-default-dict))
    (setq hunspell-default-dict-entry
	  (ispell-parse-hunspell-affix-file hunspell-default-dict))
    ;; Create an alist of found dicts with only names, except for default dict.
    (setq ispell-hunspell-dictionary-alist
	  (list (append (list nil) (cdr hunspell-default-dict-entry))))
    (dolist (dict (mapcar 'car ispell-hunspell-dict-paths-alist))
      (if (string= dict hunspell-default-dict)
	  (add-to-list 'ispell-hunspell-dictionary-alist
		       hunspell-default-dict-entry)
	(add-to-list 'ispell-hunspell-dictionary-alist
		     (list dict))))))

(ispell-find-hunspell-dictionaries)

(setq mylang "en_US")

(message "-- For selected language \"%s\" before: %s"
	 mylang
	 (assoc mylang ispell-hunspell-dictionary-alist))

(or (cadr (assoc mylang ispell-hunspell-dictionary-alist))
    (let ((dict-entry (ispell-parse-hunspell-affix-file mylang)))
      (setq ispell-hunspell-dictionary-alist
            (ispell-replace-dictionary-entry ispell-hunspell-dictionary-alist
                                             dict-entry))))

(message "-- For selected language \"%s\" after: %s"
	 mylang
	 (assoc mylang ispell-hunspell-dictionary-alist))



* bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
  2013-02-20 17:50 ` bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs Agustin Martin
@ 2013-02-20 19:00   ` Eli Zaretskii
  2013-02-28 19:23     ` Agustin Martin
  0 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2013-02-20 19:00 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13639

> Date: Wed, 20 Feb 2013 18:50:45 +0100
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> > > > > Sorry, I should have written WORDCHARS.
> > > > 
> > > > Why do we need that?
> > > 
> > > This is what ispell.el calls otherchars. Parsing WORDCHARS ensures that
> > > both
> > > hunspell and ispell.el think about the same characters in that category.
> > 
> > I think you are mistaken, that's not my reading of hunspell(4).
> 
> Sorry for the late reply,
> 
> (Opening a new thread specifically about hunspell dicts autodetection and
> using new cloned bugreport #13639 specific about this)
> 
> Although WORDCHARS description in hunspell(4)
> 
> WORDCHARS characters
>    WORDCHARS extends tokenizer of Hunspell command line interface
>    with additional word character. For example, dot, dash, n-dash, numbers,
>    percent sign are word character in Hungarian.
> 
> is too hungarian biassed and does not mention usual apostrophe AFAIK it
> mostly refers to the same as 'otherchars', although hunspell may accept
> that in locations not in the middle of a word.

I didn't just read the man page, I also looked into several *.aff
files that install with Hunspell dictionaries.  It is clear to me that
WORDCHARS is at least unreliable, even if your interpretation is
correct (of which I'm still unconvinced): some *.aff files don't have
that entry at all (e.g., en_GB.aff, whose OTHERCHARS should include
the ' character, and also ru_RU.aff); others, like he_IL.aff, have
that entry mention all the CASECHARS, in addition to OTHERCHARS.  I
wouldn't bet my money on what that entry gives us.
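
Whatever interpretation of WORDCHARS turns out to be right, code reading it
has to cope with the entry being absent. A minimal sketch, assuming the
.aff contents are already available as a string (the function name
`my-aff-wordchars' is hypothetical):

```elisp
(defun my-aff-wordchars (aff-text)
  "Return the WORDCHARS value found in AFF-TEXT, or nil if there is none.
AFF-TEXT is the contents of a .aff file as a string."
  (when (string-match "^WORDCHARS[ \t]+\\([^ \t\r\n]+\\)" aff-text)
    (match-string 1 aff-text)))

;; An .aff file with the entry yields its characters ...
(my-aff-wordchars "SET UTF-8\nWORDCHARS '-\nTRY esian\n")
;; ... and one without it yields nil, leaving the fallback to the caller.
(my-aff-wordchars "SET UTF-8\nTRY esian\n")
```

Returning nil rather than an empty string keeps "no entry" distinguishable
from "entry present", so the caller can decide what to do in each case.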

> The good news are that I started working on hunspell dicts autodetection.

Good news, indeed!  Thanks!






* bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
  2013-02-20 19:00   ` Eli Zaretskii
@ 2013-02-28 19:23     ` Agustin Martin
  2013-02-28 20:26       ` Eli Zaretskii
  2013-04-15 10:18       ` Agustin Martin
  0 siblings, 2 replies; 33+ messages in thread
From: Agustin Martin @ 2013-02-28 19:23 UTC (permalink / raw)
  To: 13639

On Wed, Feb 20, 2013 at 09:00:41PM +0200, Eli Zaretskii wrote:
> I didn't just read the man page, I also looked into several *.aff
> files that install with Hunspell dictionaries.  It is clear to me that
> WORDCHARS is at least unreliable, even if your interpretation is
> correct (of which I'm still unconvinced): some *.aff files don't have
> that entry at all (e.g., en_GB.aff, whose OTHERCHARS should include
> the ' character, and also ru_RU.aff); others, like he_IL.aff, have
> that entry mention all the CASECHARS, in addition to OTHERCHARS.  I
> wouldn't bet my money on what that entry gives us.

IMHO those dictionaries are buggy (this may include some of the dicts I
package for Debian; I have to look).

As an example, I tried Debian's en_AU, which has no WORDCHARS ' entry:

$ echo "ber's" | hunspell -a -d /usr/share/hunspell/en_AU
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2) 
& ber 15 0: bee, bet, be, beer, bier, bear, berg, berm, bar, bed, bur, beg, per, her, be r
*

while if I add the WORDCHARS ' entry I get, as expected,

$ echo "ber's" | hunspell -a -d ./en_AU
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
& ber's 15 0: bee's, bet's, beer's, bier's, berg's, berm's, bar's, bed's, bur's, bergs, berms, mer's, Ser's, Berber's, Berger's

with ' properly handled.

> > The good news are that I started working on hunspell dicts autodetection.
> 
> Good news, indeed!  Thanks!

Just committed a first cut of hunspell dicts autodetection. I have tested
it only on my Debian GNU/Linux box and it seems to work well, so the time
has come for a real-life check to find out how many things went unnoticed.

-- 
Agustin






* bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
  2013-02-28 19:23     ` Agustin Martin
@ 2013-02-28 20:26       ` Eli Zaretskii
  2013-04-15 10:18       ` Agustin Martin
  1 sibling, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2013-02-28 20:26 UTC (permalink / raw)
  To: Agustin Martin; +Cc: 13639

> Date: Thu, 28 Feb 2013 20:23:45 +0100
> From: Agustin Martin <agustin.martin@hispalinux.es>
> 
> On Wed, Feb 20, 2013 at 09:00:41PM +0200, Eli Zaretskii wrote:
> > I didn't just read the man page, I also looked into several *.aff
> > files that install with Hunspell dictionaries.  It is clear to me that
> > WORDCHARS is at least unreliable, even if your interpretation is
> > correct (of which I'm still unconvinced): some *.aff files don't have
> > that entry at all (e.g., en_GB.aff, whose OTHERCHARS should include
> > the ' character, and also ru_RU.aff); others, like he_IL.aff, have
> > that entry mention all the CASECHARS, in addition to OTHERCHARS.  I
> > wouldn't bet my money on what that entry gives us.
> 
> IMHO those dictionaries are buggy

Maybe so, but that's what is out there.  My point was that we cannot
rely on those entries, and that point stands even if that is because
of bugs in even the most popular dictionaries.

> > > The good news are that I started working on hunspell dicts autodetection.
> > 
> > Good news, indeed!  Thanks!
> 
> Just commited a first cut for hunspell dicts autodetection. I have tested it
> only in my GNU/Debian box and seems to work well, so time is come for real
> life check to notice how many things went unnoticed.

Thanks.






* bug#13639: ispell.el: hunspell dicts autodetection under Emacs.
  2013-01-16 12:25 bug#13460: Issue to change dictionary when using hunspell on emacs Jochen Schmitt
  2013-01-16 18:01 ` Eli Zaretskii
  2013-02-20 17:50 ` bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs Agustin Martin
@ 2013-04-04 14:41 ` Jacek Chrząszcz
  2013-04-05 15:57   ` Agustin Martin
  2 siblings, 1 reply; 33+ messages in thread
From: Jacek Chrząszcz @ 2013-04-04 14:41 UTC (permalink / raw)
  To: 13639

[-- Attachment #1: Type: text/plain, Size: 667 bytes --]

Hi,

I'd like to post a small correction to ispell.el (hunspell dictionary decoding).
If the initial ispell-args element of an ispell-dictionary-alist
entry is set to nil, the hunspell equiv lookup does not work. The
attached patch corrects this.
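
The underlying Lisp behaviour can be shown in isolation. In this minimal
sketch the two function names are hypothetical and "pl_PL" is just an
example value; the point is that `nconc' cannot modify an empty list in
place, so its return value has to be stored.

```elisp
(defun my-add-dict-arg-lossy (args)
  "Append \"-d pl_PL\" to ARGS but ignore the value returned by `nconc'."
  ;; With ARGS nil there is no cons cell to modify, so ARGS stays nil.
  (nconc args (list "-d" "pl_PL"))
  args)

(defun my-add-dict-arg (args)
  "Append \"-d pl_PL\" to ARGS, storing the value returned by `nconc'."
  ;; Works whether ARGS was nil or a non-empty list.
  (setq args (nconc args (list "-d" "pl_PL")))
  args)

(my-add-dict-arg-lossy nil)          ; the "-d" option is lost
(my-add-dict-arg nil)                ; the "-d" option is kept
(my-add-dict-arg-lossy (list "-B"))  ; non-empty lists work either way
```

This matches the symptom: entries that already carried some arguments were
fine, while entries whose argument list started out as nil lost their "-d"
option.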

Unfortunately I have not been able to set up emacs spellchecking to
work with hunspell for the Polish language due to some encoding
problems.

"if: Ispell and its process have different character maps"

Even though flyspell works to some extent, I am going back to aspell, which
works much better.

FYI, I am testing Fedora 18 with packaged GNU Emacs 24.2.1
(x86_64-redhat-linux-gnu, GTK+ Version 3.6.4).

Best,

Jacek

[-- Attachment #2: emacs.24.2.hunspell.2.patch --]
[-- Type: application/octet-stream, Size: 475 bytes --]

--- ispell.el.orig	2013-04-04 16:06:20.292114823 +0200
+++ ispell.el	2013-04-04 15:49:34.819740110 +0200
@@ -1171,7 +1170,7 @@
 		;; Unless default dict, re-add "-d" option with the mapped value
 		(if dict-name
 		    (if dict-equiv
-			(nconc ispell-args (list "-d" dict-equiv))
+			(setq ispell-args (nconc ispell-args (list "-d" dict-equiv)))
 		      (message
 		       "ispell-set-spellchecker-params: Missing hunspell equiv for \"%s\". Skipping."
 		       dict-name)


* bug#13639: ispell.el: hunspell dicts autodetection under Emacs.
  2013-04-04 14:41 ` bug#13639: " Jacek Chrząszcz
@ 2013-04-05 15:57   ` Agustin Martin
  0 siblings, 0 replies; 33+ messages in thread
From: Agustin Martin @ 2013-04-05 15:57 UTC (permalink / raw)
  To: Jacek Chrzaszcz, 13639

On Thu, Apr 04, 2013 at 04:41:44PM +0200, Jacek Chrząszcz wrote:
> Hi,
> 
> I'd like to post a small correction to ispell.el (hunspell dictionary decoding).
> In case the initial ispell-args coordinate of ispell-dictionary-alist
> entry is set to nil, the hunspell equiv lookup does not work. The
> attached patch corrects this.

Change committed, thanks for the info. It seems I removed more than needed
when cleaning up debugging stuff.

This did not become evident in trunk since hunspell dicts autodetection was
added later; it also takes care of aliases in a different way and takes
precedence over the old method.

> Unfortunately I have not been able to set up emacs spellchecking to
> work with hunspell for the Polish language due to some encoding
> problems.
> 
> "if: Ispell and its process have different character maps"
> 
> Even though flyspell works to some extent, I go back to aspell, which
> works much better.
> 
> FYI, I am testing Fedora 18 with packaged GNU Emacs 24.2.1
> (x86_64-redhat-linux-gnu, GTK+ Version 3.6.4).

I am testing with the myspell pl dictionary and hunspell with Emacs trunk
and it seems to work. If the problem is still present in trunk, I would
appreciate a minimal test file showing the problem and more info about the
dictionary used for hunspell. Note that this may have been fixed by
auto-detection.

Regards,

-- 
Agustin






* bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs.
  2013-02-28 19:23     ` Agustin Martin
  2013-02-28 20:26       ` Eli Zaretskii
@ 2013-04-15 10:18       ` Agustin Martin
  1 sibling, 0 replies; 33+ messages in thread
From: Agustin Martin @ 2013-04-15 10:18 UTC (permalink / raw)
  To: 13639-done

On Thu, Feb 28, 2013 at 08:23:45PM +0100, Agustin Martin wrote:
> Just commited a first cut for hunspell dicts autodetection. I have tested it
> only in my GNU/Debian box and seems to work well, so time is come for real
> life check to notice how many things went unnoticed.

Some time passed without apparent problems, so I am closing this bug report.

-- 
Agustin







Thread overview: 33+ messages
2013-01-16 12:25 bug#13460: Issue to change dictionary when using hunspell on emacs Jochen Schmitt
2013-01-16 18:01 ` Eli Zaretskii
2013-01-16 23:23   ` Glenn Morris
2013-01-17  3:51     ` Eli Zaretskii
2013-01-17  6:37       ` Glenn Morris
2013-01-17 12:26         ` Agustin Martin
2013-01-17 15:24           ` Agustin Martin
2013-01-17 16:31             ` Stefan Monnier
2013-01-17 18:15               ` Agustin Martin
2013-01-17 16:41             ` Eli Zaretskii
2013-01-17 18:12               ` Agustin Martin
2013-01-17 18:42                 ` Eli Zaretskii
     [not found]                 ` <11624660.12538.1358448223517.JavaMail.root@mx1-new.spamfiltro.es>
2013-01-17 19:06                   ` Agustin Martin
2013-01-17 19:36                     ` Eli Zaretskii
2013-01-17 18:08             ` Glenn Morris
     [not found]             ` <7076415.12428.1358446115519.JavaMail.root@mx1-new.spamfiltro.es>
2013-01-17 18:44               ` Agustin Martin
2013-01-17 16:10           ` Eli Zaretskii
     [not found]     ` <20130117131733.GA20519@omega.in.herr-schmitt.de>
2013-01-17 18:19       ` Glenn Morris
2013-01-17 19:30         ` Agustin Martin
2013-01-18 17:05           ` Agustin Martin
2013-01-18 18:03             ` Jochen Schmitt
2013-01-18 19:03               ` Eli Zaretskii
2013-01-18 19:23                 ` Agustin Martin
2013-01-18 19:05               ` Agustin Martin
2013-01-21 16:52                 ` Agustin Martin
2013-01-21  9:43             ` Jochen Schmitt
2013-02-20 17:50 ` bug#13639: [emacs] ispell.el: hunspell dicts autodetection under Emacs Agustin Martin
2013-02-20 19:00   ` Eli Zaretskii
2013-02-28 19:23     ` Agustin Martin
2013-02-28 20:26       ` Eli Zaretskii
2013-04-15 10:18       ` Agustin Martin
2013-04-04 14:41 ` bug#13639: " Jacek Chrząszcz
2013-04-05 15:57   ` Agustin Martin
