unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
@ 2021-08-10 15:12 Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-08-10 16:03 ` Eli Zaretskii
  2022-08-22 12:57 ` Lars Ingebrigtsen
  0 siblings, 2 replies; 7+ messages in thread
From: Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-08-10 15:12 UTC (permalink / raw)
  To: 49982

This configuration should be everything that's needed for ispell.el to
work with Hunspell, regardless of system locale:

     (setq ispell-program-name (executable-find "hunspell")
           ispell-dictionary "en_US")

However, when the system locale (the LANG environment variable) does not
have a corresponding Hunspell dictionary,
`ispell-find-hunspell-dictionaries` signals the error "Can't find
Hunspell dictionary with a .aff affix file", despite ispell-dictionary
being set.

ispell.el relies on Hunspell to load a default and report it, but
Hunspell just errors out if it can't find a dictionary for the system
locale. And because ispell.el is trying to get Hunspell's default
dictionary, it doesn't pass `ispell-dictionary` on to Hunspell.

This behavior is surprising. If `ispell-dictionary` is non-nil, that
means the user has already specified their preferred dictionary, and it
should not matter that Hunspell cannot find the dictionary it would use
when a preferred dictionary isn't specified.

It's ispell.el that needs to be fixed here because the user specifies
their preference in Emacs, and it is its job to communicate that
preference to Hunspell.

`ispell-find-hunspell-dictionaries` should pass "-d
${ispell-dictionary}" to Hunspell if `ispell-dictionary` is set. This 
invocation:

     hunspell -d "en_US" -D /dev/null

works as expected regardless of the system locale.
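
As a rough sketch of that proposal (this is not ispell.el's actual code,
and `my/hunspell-discovery-args` is a made-up helper name), the argument
list for the dictionary-discovery call could be built like this, adding
"-d" only when a dictionary has been chosen:

```elisp
;; Hypothetical helper, not part of ispell.el: build the argument list
;; for the Hunspell call that lists dictionaries.
(defun my/hunspell-discovery-args (dictionary)
  "Return the Hunspell arguments for listing dictionaries.
DICTIONARY is the user's preferred dictionary name, or nil."
  (append (when dictionary (list "-d" dictionary))
          (list "-D" null-device)))

;; On GNU/Linux, where `null-device' is "/dev/null":
(my/hunspell-discovery-args "en_US") ; => ("-d" "en_US" "-D" "/dev/null")
(my/hunspell-discovery-args nil)     ; => ("-D" "/dev/null")
```

With a nil dictionary the call stays exactly as it is today, so only
users who set `ispell-dictionary` would see a difference.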

* System info

In GNU Emacs 27.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.27,
  cairo version 1.17.4) of 2021-03-27 built on juergen
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Arch Linux

Hunspell 1.7.0; the output of `hunspell -D` is:

     SEARCH PATH:
     .::/usr/share/hunspell:/usr/share/myspell:/usr/share/myspell/dicts:/Library/Spelling:/home/kisaragi-hiu/.openoffice.org/3/user/wordbook:/home/kisaragi-hiu/.openoffice.org2/user/wordbook:/home/kisaragi-hiu/.openoffice.org2.0/user/wordbook:/home/kisaragi-hiu/Library/Spelling:/opt/openoffice.org/basis3.0/share/dict/ooo:/usr/lib/openoffice.org/basis3.0/share/dict/ooo:/opt/openoffice.org2.4/share/dict/ooo:/usr/lib/openoffice.org2.4/share/dict/ooo:/opt/openoffice.org2.3/share/dict/ooo:/usr/lib/openoffice.org2.3/share/dict/ooo:/opt/openoffice.org2.2/share/dict/ooo:/usr/lib/openoffice.org2.2/share/dict/ooo:/opt/openoffice.org2.1/share/dict/ooo:/usr/lib/openoffice.org2.1/share/dict/ooo:/opt/openoffice.org2.0/share/dict/ooo:/usr/lib/openoffice.org2.0/share/dict/ooo
     AVAILABLE DICTIONARIES (path is not mandatory for -d option):
     ... [truncated]
     /usr/share/hunspell/en_US-large
     ... [truncated]

* Reproduction

- Notice that Hunspell does not report a LOADED DICTIONARY when LANG is,
for example, ja_JP:

     export LANG=ja_JP
     hunspell -D /dev/null
     # Output:
     # ... [truncated]
     # Can't open affix or dictionary files for dictionary named "ja_JP".

- Now, in Emacs with LANG set to ja_JP, set ispell up with Hunspell as 
usual.

     (setq ispell-program-name (executable-find "hunspell")
           ispell-dictionary "en_US")

- Observe the error.

     (ispell-start-process)
     ;; -> ispell-find-hunspell-dictionaries: Can't find Hunspell dictionary with a .aff affix file






* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
  2021-08-10 15:12 bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-08-10 16:03 ` Eli Zaretskii
  2021-08-10 18:51   ` Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2022-08-22 12:57 ` Lars Ingebrigtsen
  1 sibling, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2021-08-10 16:03 UTC (permalink / raw)
  To: Kisaragi Hiu; +Cc: 49982

> Date: Wed, 11 Aug 2021 00:12:06 +0900
> From:  Kisaragi Hiu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> This configuration should be everything that's needed for ispell.el to
> work with Hunspell, regardless of system locale:
> 
>      (setq ispell-program-name (executable-find "hunspell")
>            ispell-dictionary "en_US")
> 
> However, when system locale (the LANG environment variable) does not 
> have a corresponding Hunspell dictionary, 
> `ispell-find-hunspell-dictionaries` returns the error "Can't find 
> Hunspell dictionary with a .aff affix file", despite ispell-dictionary 
> being set.
> 
> ispell.el relies on Hunspell to load a default and report it, but
> Hunspell just errors out if it can't find a dictionary for the system
> locale. And because ispell.el is trying to get Hunspell's default
> dictionary, it doesn't pass `ispell-dictionary' onto Hunspell.
> 
> This behavior is surprising. If `ispell-dictionary` is non-nil, that
> means the user has already specified their preferred dictionary, and it
> should not matter that Hunspell cannot find the dictionary it would use
> when a preferred dictionary isn't specified.
> 
> It's ispell.el that needs to be fixed here because the user specifies
> their preference in Emacs, and it is its job to communicate that
> preference to Hunspell.
> 
> `ispell-find-hunspell-dictionaries` should pass "-d
> ${ispell-dictionary}" to Hunspell if `ispell-dictionary` is set. This 
> invocation:
> 
>      hunspell -d "en_US" -D /dev/null
> 
> works as expected regardless of the system locale.

Thanks for the report and the analysis.

Frankly, I'm a bit wary of making the proposed change unconditionally.
First, yours is an unusual use case, I think: when Hunspell is
installed, the dictionary that corresponds to the locale is always
installed, because otherwise Hunspell will not work reliably from the
shell command line.  And second, relying on the non-nil value of
ispell-dictionary is fragile: the value could be a remnant from some
previous invocation or from an unsuccessful customization that has
nothing to do with the user's choice or his/her current intent.

Moreover, if you manually set ispell-dictionary, then what would be
the purpose of calling ispell-find-hunspell-dictionaries at all?

So maybe we should add a new user option that would force using the
value of ispell-dictionary right from the start.  That would at least
avoid the risk of breaking somebody else's use case.

I wonder if anyone else has an opinion about this.






* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
  2021-08-10 16:03 ` Eli Zaretskii
@ 2021-08-10 18:51   ` Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-08-10 19:29     ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-08-10 18:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 49982

Thank you for the response! Let me try to add some clarifications (that 
hopefully don't sound too harsh):

 > First, yours is an unusual use case, I think: when Hunspell is
 > installed, the dictionary that corresponds to the locale is always
 > installed, because otherwise Hunspell will not work reliably from the
 > shell command line.

I'm fairly certain my use case isn't unusual.

There are no easily installable Hunspell dictionaries for, among other 
languages:

- Any variant of Chinese (Mandarin)
- Japanese
- Kazakh
- Khmer
- Malay

Every user of any of these languages who tries to set up Hunspell
along with ispell.el and Flyspell has to find or invent a poorly
documented workaround.

- [[https://texwiki.texjp.org/?Hunspell][TeXJP (Japanese) mentions]] 
"add[ing] the DICTIONARY or WORDLIST environment variables if needed" 
(「また、必要に応じて環境変数DICTIONARYやWORDLISTを指定しておきます。」)
- [[https://home.hirosaki-u.ac.jp/heroic-2020/1575/][Hirosaki University 
Information Technology Center PC lab's tutorial to spellchecking in 
Emacs]] sets DICTIONARY to en_US
- 200ok.ch (developer of Organice)'s 
[[https://200ok.ch/posts/2020-08-22_setting_up_spell_checking_with_multiple_dictionaries.html][tutorial 
for using multiple dictionaries for Hunspell + ispell.el]] mentions

     ;; Configure `LANG`, otherwise ispell.el cannot find a 'default
     ;; dictionary' even though multiple dictionaries will be configured
     ;; in next line.
     (setenv "LANG" "en_US.UTF-8")

- [[http://blog.binchen.org/posts/what-s-the-best-spell-check-set-up-in-emacs/][Chen Bin's blog post on setting up spell check]]
uses this block:

     ;; find aspell and hunspell automatically
     (cond
      ;; try hunspell at first
      ;; if hunspell does NOT exist, use aspell
      ((executable-find "hunspell")
       (setq ispell-program-name "hunspell")
       (setq ispell-local-dictionary "en_US")
       (setq ispell-local-dictionary-alist
             ;; Please note the list `("-d" "en_US")` contains ACTUAL
             ;; parameters passed to hunspell
             ;; You could use `("-d" "en_US,en_US-med")` to check with
             ;; multiple dictionaries
             '(("en_US" "[[:alpha:]]" "[^[:alpha:]]" "[']" nil
                ("-d" "en_US") nil utf-8)))

       ;; new variable `ispell-hunspell-dictionary-alist' is defined in
       ;; Emacs
       ;; If it's nil, Emacs tries to automatically set up the dictionaries.
       (when (boundp 'ispell-hunspell-dictionary-alist)
         (setq ispell-hunspell-dictionary-alist
               ispell-local-dictionary-alist)))
      ;; (the aspell branch is not shown in this excerpt)
      )

   "Emacs tries to automatically set up the dictionaries" refers to
   ispell-set-spellchecker-params running
   ispell-find-hunspell-dictionaries after seeing that
   ispell-hunspell-dictionary-alist is nil.

My use case is not unusual. Fixing this bug would eliminate the need
for these workarounds.

(From the command line you just pass in -d yourself. Setting environment
variables is also a native way of configuring programs in the CLI; in
Emacs, wrapper packages like ispell.el generally define user options
instead of asking users to call `setenv` themselves.)

 > And second, relying on the non-nil value of
 > ispell-dictionary is fragile: the value could be a remnant from some
 > previous invocation or from an unsuccessful customization that has
 > nothing to do with the user's choice or his/her current intent.

ispell-dictionary is a user option, not an internal variable. Nothing
in ispell.el changes ispell-dictionary besides the command to help the
user change the preferred dictionary, `ispell-change-dictionary`, so
the value cannot be a remnant from a previous invocation.

By default, ispell-dictionary is nil, which tells ispell.el to use the
spell checker's default, as is evident from its Custom type:

     (defcustom ispell-dictionary nil
       "Default dictionary to use if `ispell-local-dictionary' is nil."
       :type '(choice string
                      (const :tag "default" nil))
       :group 'ispell)

In fact, the user can set ispell-dictionary in their init.el when 
they're using aspell and have it work as expected. That's why I consider 
this a bug.

 > Moreover, if you manually set ispell-dictionary, then what would be
 > the purpose of calling ispell-find-hunspell-dictionaries at all?

I don't call ispell-find-hunspell-dictionaries myself --- turning on 
flyspell eventually calls it.

The error actually occurs when flyspell-mode-on calls
ispell-set-spellchecker-params, which in turn calls
ispell-find-hunspell-dictionaries to set up internal variables.

This is how Chen Bin's workaround works: it sets
ispell-local-dictionary-alist first, then sets
ispell-hunspell-dictionary-alist to it, preventing
ispell-set-spellchecker-params from triggering the error.

ispell-find-hunspell-dictionaries in fact always returns nil, and is
only used for its side effects: setting up
- ispell-hunspell-dictionary-alist,
- ispell-hunspell-dict-paths-alist,
- and ispell-dicts-name2locale-equivs-alist.
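
One way to watch those side effects, assuming only that ispell.el is
loaded, is to print the three variables before and after turning on
flyspell-mode. The helper name below is made up for illustration:

```elisp
(require 'ispell)

;; Hypothetical helper: collect the variables that
;; ispell-find-hunspell-dictionaries is said to set up.
(defun my/ispell-hunspell-state ()
  "Return an alist of the Hunspell-related variables and their values."
  (mapcar (lambda (var)
            (cons var (and (boundp var) (symbol-value var))))
          '(ispell-hunspell-dictionary-alist
            ispell-hunspell-dict-paths-alist
            ispell-dicts-name2locale-equivs-alist)))

;; Evaluate this before and after enabling flyspell-mode and compare.
(pp (my/ispell-hunspell-state))
```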

I'd like to hear more perspectives on this as well.






* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
  2021-08-10 18:51   ` Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-08-10 19:29     ` Eli Zaretskii
  2021-08-11 11:17       ` Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  0 siblings, 1 reply; 7+ messages in thread
From: Eli Zaretskii @ 2021-08-10 19:29 UTC (permalink / raw)
  To: Kisaragi Hiu; +Cc: 49982

> From: Kisaragi Hiu <mail@kisaragi-hiu.com>
> Cc: 49982@debbugs.gnu.org
> Date: Wed, 11 Aug 2021 03:51:22 +0900
> 
> Thank you for the response! Let me try to add some clarifications (that 
> hopefully don't sound too harsh):
> 
>  > First, yours is an unusual use case, I think: when Hunspell is
>  > installed, the dictionary that corresponds to the locale is always
>  > installed, because otherwise Hunspell will not work reliably from the
>  > shell command line.
> 
> I'm fairly certain my use case isn't unusual.
> 
> There are no easily installable Hunspell dictionaries for, among other 
> languages:
> 
> - Any variant of Chinese (Mandarin)
> - Japanese
> - Kazakh
> - Khmer
> - Malay
> 
> Every user of any of these languages who tries to set up Hunspell
> along with ispell.el and Flyspell has to find or invent a poorly
> documented workaround.
> 
> - [[https://texwiki.texjp.org/?Hunspell][TeXJP (Japanese) mentions]] 
> "add[ing] the DICTIONARY or WORDLIST environment variables if needed" 
> (「また、必要に応じて環境変数DICTIONARYやWORDLISTを指定しておきます。」)
> - [[https://home.hirosaki-u.ac.jp/heroic-2020/1575/][Hirosaki University 
> Information Technology Center PC lab's tutorial to spellchecking in 
> Emacs]] sets DICTIONARY to en_US
> - 200ok.ch (developer of Organice)'s 
> [[https://200ok.ch/posts/2020-08-22_setting_up_spell_checking_with_multiple_dictionaries.html][tutorial 
> for using multiple dictionaries for Hunspell + ispell.el]] mentions

Indeed, defining DICTIONARY in the environment is the way to control
the default dictionary.  It is documented in the Hunspell's man page.
Why cannot it be the solution for when no Hunspell dictionary could be
found that matches the locale?  Using $DICTIONARY should solve your
problem both inside Emacs and outside it.






* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
  2021-08-10 19:29     ` Eli Zaretskii
@ 2021-08-11 11:17       ` Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-08-11 12:12         ` Eli Zaretskii
  0 siblings, 1 reply; 7+ messages in thread
From: Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-08-11 11:17 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 49982

 > Indeed, defining DICTIONARY in the environment is the way to control
 > the default dictionary.  It is documented in the Hunspell's man page.
 > Why cannot it be the solution for when no Hunspell dictionary could be
 > found that matches the locale?  Using $DICTIONARY should solve your
 > problem both inside Emacs and outside it.

I don't know, maybe I'm biased here. Hunspell has its quirks, but isn't 
it ispell.el's job to work around quirks in spellcheckers, and not the 
end user's? ispell.el worked around Hunspell 1.7's new output quirk. Why 
can't it work around this quirk?

*My* problem is already solved by using the workaround. The bug is that 
nobody should have to use the workaround.

Using environment variables to configure subprocesses is always 
something that a user can do, but, as you know, there's a reason why 
ispell.el exposes spellchecker options through Emacs user options.

Besides, which dictionary one specifies in `DICTIONARY` doesn't actually
matter; it just needs to be one that exists, as it will be overridden by
ispell-dictionary when ispell.el actually starts spellchecking. You can
do (in emacs -Q):

     (setenv "LANG" "ja_JP") ; trigger the quirk
     (setenv "DICTIONARY" "en_US") ; tame ispell-find-hunspell-dictionaries
     (setq ispell-program-name (executable-find "hunspell")
           ispell-dictionary "en_GB")
     (flyspell-mode)

and see that it corrects "color" to "colour". (Try typing "color",
then running M-x flyspell-auto-correct-previous-word.)

---

ispell-dictionary is ispell.el's way of specifying the main dictionary. 
The manual:

 > Spell-checkers look up spelling in two dictionaries: the standard
 > dictionary and your personal dictionary.  The standard dictionary is
 > specified by the variable ‘ispell-local-dictionary’ or, if that is
 > ‘nil’, by the variable ‘ispell-dictionary’.  If both are ‘nil’, the
 > spelling program’s default dictionary is used.

The spelling program's default should only ever have an effect when both
ispell-local-dictionary and ispell-dictionary are nil.
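
The precedence the manual describes amounts to a plain `or` over the two
variables. A minimal sketch, where `my/effective-dictionary` is a
made-up name used only for illustration:

```elisp
(require 'ispell)

;; Hypothetical helper that mirrors the manual's wording.
(defun my/effective-dictionary ()
  "Return the dictionary the manual says should be used.
A nil result means the spelling program's own default applies."
  (or ispell-local-dictionary ispell-dictionary))

(let ((ispell-local-dictionary nil)
      (ispell-dictionary "en_GB"))
  (my/effective-dictionary))   ; => "en_GB"
```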






* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
  2021-08-11 11:17       ` Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-08-11 12:12         ` Eli Zaretskii
  0 siblings, 0 replies; 7+ messages in thread
From: Eli Zaretskii @ 2021-08-11 12:12 UTC (permalink / raw)
  To: Kisaragi Hiu; +Cc: 49982

> From: Kisaragi Hiu <mail@kisaragi-hiu.com>
> Cc: 49982@debbugs.gnu.org
> Date: Wed, 11 Aug 2021 20:17:20 +0900
> 
>  > Indeed, defining DICTIONARY in the environment is the way to control
>  > the default dictionary.  It is documented in the Hunspell's man page.
>  > Why cannot it be the solution for when no Hunspell dictionary could be
>  > found that matches the locale?  Using $DICTIONARY should solve your
>  > problem both inside Emacs and outside it.
> 
> I don't know, maybe I'm biased here. Hunspell has its quirks, but isn't 
> it ispell.el's job to work around quirks in spellcheckers, and not the 
> end user's?

Not when the spell-checker is basically not configured correctly.

> ispell.el worked around Hunspell 1.7's new output quirk.

That was something users could do nothing on their end to solve.

> Using environment variables to configure subprocesses is always 
> something that a user can do, but, as you know, there's a reason why 
> ispell.el exposes spellchecker options through Emacs user options.

That's not what I meant.  I meant to suggest that you set DICTIONARY
in the init files of your interactive shell, so that it would allow
you to use Hunspell both inside Emacs (because Emacs inherits the
environment variables of its parent shell) and outside Emacs.  I
didn't mean to suggest that you (or others) should inject DICTIONARY
into the environment of the Hunspell sub-process by doing something in
Emacs, like setenv etc.

> Besides, which dictionary one specifies in `DICTIONARY` doesn't actually 
> matter, it just needs to be one that exists, as it will be overridden by 
> ispell-dictionary when ispell.el actually starts spellchecking.

It should be the dictionary you want to use by default.  In your case,
I assume it's en_US.






* bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set
  2021-08-10 15:12 bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set Kisaragi Hiu via Bug reports for GNU Emacs, the Swiss army knife of text editors
  2021-08-10 16:03 ` Eli Zaretskii
@ 2022-08-22 12:57 ` Lars Ingebrigtsen
  1 sibling, 0 replies; 7+ messages in thread
From: Lars Ingebrigtsen @ 2022-08-22 12:57 UTC (permalink / raw)
  To: Kisaragi Hiu; +Cc: 49982

Kisaragi Hiu <mail@kisaragi-hiu.com> writes:

> This configuration should be everything that's needed for ispell.el to
> work with Hunspell, regardless of system locale:
>
>     (setq ispell-program-name (executable-find "hunspell")
>           ispell-dictionary "en_US")
>
> However, when system locale (the LANG environment variable) does not
> have a corresponding Hunspell dictionary,
> `ispell-find-hunspell-dictionaries` returns the error "Can't find
> Hunspell dictionary with a .aff affix file", despite ispell-dictionary
> being set.

I've now fixed this in Emacs 29 (by first using our current rules, and
then trying again with -d ispell-dictionary).
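
The general shape of "current rules first, then retry with -d" can be
sketched as follows. This is only an illustration of the idea, not the
code that was committed; `my/probe-with-fallback` and its PROBE argument
are made-up names, and PROBE stands in for whatever runs Hunspell and
returns non-nil when a usable dictionary is reported:

```elisp
;; Hypothetical sketch of "default first, then retry with -d".
(defun my/probe-with-fallback (probe dictionary)
  "Call PROBE with Hunspell argument lists until one succeeds.
First try without -d; if that fails and DICTIONARY is non-nil,
try again with \"-d\" DICTIONARY.  Return PROBE's first non-nil value."
  (or (funcall probe (list "-D" null-device))
      (and dictionary
           (funcall probe (list "-d" dictionary "-D" null-device)))))

;; Example with a fake probe that only succeeds when -d is given,
;; imitating a locale that has no matching dictionary:
(my/probe-with-fallback (lambda (args) (and (member "-d" args) args))
                        "en_US")
```

Trying the locale-based lookup first keeps existing setups unchanged;
the -d retry only matters when that first attempt finds nothing.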




