From: Eli Zaretskii <eliz@gnu.org>
To: help-gnu-emacs@gnu.org
Subject: Re: Hunspell for Japanese
Date: Sat, 17 Feb 2018 17:18:23 +0200
Message-ID: <83tvufbq68.fsf@gnu.org>
In-Reply-To: <2D3482A4-0F9A-4D09-86C2-958AB03FA388@misasa.okayama-u.ac.jp> (message from Tak Kunihiro on Sat, 17 Feb 2018 22:53:50 +0900)

> From: Tak Kunihiro <tkk@misasa.okayama-u.ac.jp>
> Date: Sat, 17 Feb 2018 22:53:50 +0900
> Cc: 国広卓也 <tkk@misasa.okayama-u.ac.jp>
> 
> I want to use `hunspell' to spellcheck English phrases that are mixed
> into Japanese text.  When I call M-x ispell-word, the responses from
> `aspell' and `hunspell' differ.  The difference affects how underlines
> are drawn in flyspell-mode: `hunspell' draws many unnecessary
> underlines on Japanese phrases.

If your dictionary is for English, why do you expect flyspell-mode to
work correctly with words in another language?  It can't do anything
sensible with such foreign words.  The underlines flyspell-mode shows
in Japanese words when the dictionary is for English could be
anything; you should simply disregard any such underlines in
non-English words.

Can you tell why you pay attention to underlines in non-English words
in this situation?

> Is it possible to make `hunspell' behave like `aspell'?

They are very different programs, so they cannot behave the same.

> $ which hunspell
> /opt/local/bin/hunspell
> $ hunspell -D
> ...
> /opt/local/share/hunspell/en_US
> LOADED DICTIONARY:
> /opt/local/share/hunspell/en_US.aff
> /opt/local/share/hunspell/en_US.dic
> Hunspell 1.6.2
> $ Emacs -Q
> M-: (insert "Emacsは日本ではイーマックスと呼ばれる")
> C-a
> M-: (setq ispell-program-name "hunspell")
> M-x ispell-word
> C-x b *Messages*
> 
> > Starting new Ispell process hunspell with default dictionary...
> > Checking spelling of EMACSは日本語ではイーマックスと呼ばれる...
> > ispell-word: Ispell and its process have different character maps

I see the same message.  It is caused by Hunspell somehow considering
the string "は日本語ではイーマックスと呼ばれる" as more than one word,
and it therefore returns 3 misspellings, which then trigger the above
cryptic error message.
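
To see how Hunspell itself splits such a line, outside Emacs, one
could feed it the text in pipe mode.  A minimal sketch, assuming a
UTF-8 terminal and the en_US dictionary shown above; the function name
hunspell_pipe is made up for illustration, and the "-i UTF-8" setting
is my assumption, not necessarily what Emacs passes:

```shell
# Hypothetical helper: send one line of text to Hunspell in
# Ispell-compatible pipe mode (-a) and print its raw answer.
hunspell_pipe() {
    # The leading ^ tells the checker to treat the rest of the line
    # as text to check rather than as a pipe-mode command.
    printf '^%s\n' "$1" | hunspell -a -d en_US -i UTF-8
}

# Usage (not run here):
#   hunspell_pipe 'Emacsは日本語ではイーマックスと呼ばれる'
```

In the pipe-mode output, each word Hunspell flags appears on its own
line starting with & or #, so counting those lines shows how many
pieces it split the text into.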

But once again, you've set up flyspell-mode to work in English, so you
shouldn't pay attention to what it does with Japanese.  For starters,
I believe the encoding Emacs uses is incorrect in that case, because
the en_US.aff file probably states that it wants a Latin-1 encoding,
not UTF-8.  But even using UTF-8 will not help here, AFAIU.
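
Whether the affix file really asks for Latin-1 can be checked
directly, since Hunspell affix files declare their character set on a
SET line.  A small sketch; the helper name aff_encoding is
hypothetical, and the path in the usage comment is the one from the
"hunspell -D" output quoted above:

```shell
# Hypothetical helper: print the character set declared by a
# Hunspell affix (.aff) file.
aff_encoding() {
    # Print the second field of the first SET line, then stop.
    awk '$1 == "SET" { print $2; exit }' "$1"
}

# Usage (not run here):
#   aff_encoding /opt/local/share/hunspell/en_US.aff
```

If this prints nothing, the file has no SET line that awk could
match (a byte-order mark at the start of the file would also hide it).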


