unofficial mirror of help-guix@gnu.org 
* Guix's enchant misreports numerals on Debian---both buster and bullseye
From: Jorge P. de Morais Neto @ 2020-08-20 18:35 UTC
  To: help-guix

Hello.  I reported Emacs bug#42248, and the Emacs developers and I
realized that at least part of the problem lies with Guix's version of
Enchant.

On an updated Debian bullseye, enchant 2.2.8 from Guix misreports
numerals.  The same enchant upstream version, when installed from APT,
does not have this problem.  Guix enchant 2.2.8 on Debian buster also
misreports numerals.

See (on Debian bullseye):

$ enchant-2 -v
@(#) International Ispell Version 3.1.20 (but really Enchant 2.2.8)
$ which enchant-2
/usr/bin/enchant-2
$ enchant-2 -l -d en_US /tmp/enchant-test.txt
Doesn
Amarelou

$ guix install enchant; hash enchant-2
[...]
$ enchant-2 -v
@(#) International Ispell Version 3.1.20 (but really Enchant 2.2.8)
$ which enchant-2
/home/jorge/.guix-profile/bin/enchant-2
$ enchant-2 -l -d en_US /tmp/enchant-test.txt
2015
Casa
42
Amarelou
2018

The file /tmp/enchant-test.txt:
--8<---------------cut here---------------start------------->8---
Doesn't 2015
Casa 42
Amarelou 2018
--8<---------------cut here---------------end--------------->8---

So enchant 2.2.8 (either from APT or from Guix) does not understand
"doesn't"; and, what's worse, enchant-2.2.8 from Guix reports every
numeral as a misspelling.
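
For anyone who wants to repeat the comparison, here is a minimal sketch that recreates the test file and runs whichever enchant-2 comes first on PATH.  It uses only the file contents and the command-line flags shown above; the guard around the call is an addition, so that the script simply reports when enchant-2 is absent.

```shell
#!/bin/sh
# Recreate the three-line test file from the report.
testfile=/tmp/enchant-test.txt
printf '%s\n' "Doesn't 2015" "Casa 42" "Amarelou 2018" > "$testfile"

# Run whichever enchant-2 is first on PATH and show where it comes from,
# so the APT and Guix builds can be told apart.
if command -v enchant-2 >/dev/null 2>&1; then
    echo "using: $(command -v enchant-2)"
    enchant-2 -l -d en_US "$testfile"
else
    echo "enchant-2 not found on PATH" >&2
fi
```

Running it once with the APT build first on PATH and once with the Guix profile first should reproduce the two listings above.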

Regards

-- 
- <https://jorgemorais.gitlab.io/justice-for-rms/>
- I am Brazilian.  I hope my English is correct and I welcome feedback.
- Free Software Supporter: <https://www.fsf.org/free-software-supporter>
- If an email of mine arrives at your spam box, please notify me.



* Re: Guix's enchant misreports numerals on Debian---both buster and bullseye
From: Jorge P. de Morais Neto @ 2020-08-20 18:45 UTC
  To: help-guix

On [2020-08-20 Thu 15:35:00-0300], Jorge P. de Morais Neto wrote:

> So enchant 2.2.8 (either from APT or from Guix) does not understand
> "doesn't"; and, what's worse, enchant-2.2.8 from Guix reports every
> numeral as a misspelling.

I now reread my experiment and realized enchant from Guix does
understand "doesn't".  So enchant 2.2.8 from Guix gets "doesn't"
correctly, but not numerals, and enchant 2.2.8 from APT gets numerals
correctly, but not "doesn't".  Could enchant get both numerals and
"doesn't" correctly?  That would be ideal.  Failing that, the behavior
of APT's enchant is much preferable to that of Guix's enchant.

Regards

-- 
- <https://jorgemorais.gitlab.io/justice-for-rms/>
- If an email of mine arrives at your spam box, please notify me.
- Please adopt free/libre formats like PDF, ODF, Org, LaTeX, Opus, WebM and 7z.
- Free/libre software for Replicant, LineageOS and Android: https://f-droid.org
- [[https://www.gnu.org/philosophy/free-sw.html][What is free software?]]



* Re: Guix's enchant misreports numerals on Debian---both buster and bullseye
From: Julien Lepiller @ 2020-08-20 19:15 UTC
  To: help-guix, Jorge P. de Morais Neto

I took a quick look. It seems our enchant is built only with aspell, whereas Debian's is built with hunspell. In fact, our hunspell detects the misspellings and flags neither numbers nor "doesn't". Maybe you could use hunspell directly in place of enchant? I'm not sure whether that works; I'm not an Emacs user.

I tried building enchant with hunspell. The build worked, but enchant still doesn't use the hunspell dictionary. According to strace, it ignores $DICPATH, which hunspell uses, and looks in various other directories. Symlinking one of them to $DICPATH didn't work either. Enchant was able to find hunspell's en_US.dic file, but then failed when looking for en_US.aff in the same directory. What is this aff file?
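
A hunspell dictionary normally consists of two files: the .dic word list and an .aff affix file holding the rules for prefixes and suffixes. The sketch below checks that both are present for a language in each DICPATH directory. The helper name check_hunspell_pair is hypothetical, and treating DICPATH as a colon-separated list is an assumption.

```shell
#!/bin/sh
# check_hunspell_pair DIR LANG (hypothetical helper)
# Succeeds only if DIR holds both LANG.dic and LANG.aff.
check_hunspell_pair() {
    dir=$1
    lang=$2
    status=0
    for ext in dic aff; do
        if [ ! -f "$dir/$lang.$ext" ]; then
            echo "missing: $dir/$lang.$ext" >&2
            status=1
        fi
    done
    return "$status"
}

# Check every directory listed in DICPATH (assumed colon-separated).
# If DICPATH is unset or empty, the loop body never runs.
old_ifs=$IFS
IFS=:
for d in ${DICPATH:-}; do
    check_hunspell_pair "$d" en_US && echo "ok: $d"
done
IFS=$old_ifs
```

A directory that has en_US.dic but no en_US.aff would be reported as missing the .aff file, which matches the failure seen in the strace output.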



* Re: Guix's enchant misreports numerals on Debian---both buster and bullseye
From: Julien Lepiller @ 2020-08-21  0:14 UTC
  To: help-guix, Jorge P. de Morais Neto

Hi,

I ended up pushing two patches: the first one installs the required .aff files along with the .dic files in hunspell-dict-en. The second adds hunspell as an input to enchant.

With these, on Guix System, I was able to reproduce the behavior of Debian's enchant. Numerals are no longer marked as incorrect.

Two issues remain. First, enchant ignores DICPATH and loads dictionaries from global directories, which is a problem on Guix System (less so on foreign distros, which probably have dictionaries installed at these locations).

Second, hunspell itself doesn't flag "doesn" as incorrect, whereas enchant does, despite using the same dictionary. If this is also the case on Debian, we might have found a bug in enchant.

So: after you run guix pull and update enchant, you should see the same behavior from Guix's and Debian's enchant.
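
To check the second issue on another machine, one could compare what the two tools flag for the same file. This is only a sketch: the helper only_in_first is hypothetical, the default file path is the test file from earlier in the thread, and it assumes hunspell accepts the same -l and -d en_US options that enchant-2 was given above.

```shell
#!/bin/sh
# only_in_first A B (hypothetical helper):
# print the lines of file A that do not appear in file B.
only_in_first() {
    LC_ALL=C sort -u "$1" > "$1.sorted"
    LC_ALL=C sort -u "$2" > "$2.sorted"
    LC_ALL=C comm -23 "$1.sorted" "$2.sorted"
    rm -f "$1.sorted" "$2.sorted"
}

# Compare the two checkers only when both are installed.
file=${1:-/tmp/enchant-test.txt}
if command -v enchant-2 >/dev/null 2>&1 && command -v hunspell >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    enchant-2 -l -d en_US "$file" > "$workdir/enchant.txt"
    hunspell -l -d en_US "$file" > "$workdir/hunspell.txt"
    echo "flagged by enchant-2 only:"
    only_in_first "$workdir/enchant.txt" "$workdir/hunspell.txt"
    rm -rf "$workdir"
fi
```

If "Doesn" shows up in that list on Debian as well, that would support the suspicion of a bug in enchant rather than in the Guix package.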



* Re: Guix's enchant misreports numerals on Debian---both buster and bullseye
From: Jorge P. de Morais Neto @ 2020-08-22 20:31 UTC
  To: Julien Lepiller, help-guix

> Maybe you could use hunspell directly as your enchant?  Not sure if
> that works, I'm not an emacs user.

Emacs does support several other spell checkers, but I want to use
enchant in order to share the user dictionary with other applications
that use enchant.

> With these, on Guix System, I was able to reproduce the behavior of
> debian's enchant.  Numerals are not marked as incorrect anymore.

Yes, same here.  Thank you!

> Hunspell itself doesn't flag "doesn" as incorrect, whereas enchant
> does, despite using the same dictionary.  If this is also the case on
> Debian, we might have found a bug in enchant.

The command-line tool enchant-2 misreports "doesn't", but Gedit, which
very probably uses Enchant, correctly accepts "doesn't" and four other
contractions I tested.  It seems Gedit calls the Enchant library
differently from Enchant's own command-line tool.
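
One possible explanation, not confirmed anywhere in this thread, is that the two callers split the text into words differently.  The enchant-2 output earlier in the thread lists "Doesn", which is what one would get if the apostrophe were treated as a word separator.  The sketch below only illustrates that difference with tr; both function names are hypothetical, and it does not reproduce the actual tokenizer of enchant-2 or Gedit.

```shell
#!/bin/sh
# Split input into words, treating the apostrophe as a separator.
split_at_apostrophe() {
    tr -cs '[:alnum:]' '\n'
}

# Split input into words, keeping the apostrophe inside words.
split_keeping_apostrophe() {
    tr -cs "[:alnum:]'" '\n'
}

echo "apostrophe as separator:"
printf '%s\n' "Doesn't 2015" | split_at_apostrophe
echo "apostrophe kept in words:"
printf '%s\n' "Doesn't 2015" | split_keeping_apostrophe
```

With the first splitter the checker would be handed "Doesn" and "t" separately; with the second it would see the whole contraction.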

The reasons to believe that Gedit uses Enchant are:
1. Wikipedia says so
2. Gedit's spell checker correctly accepts every word I have in the
   Enchant user dictionary.

Best regards

-- 
- <https://jorgemorais.gitlab.io/justice-for-rms/>
- I am Brazilian.  I hope my English is correct and I welcome feedback.
- <https://www.defectivebydesign.org/>
- <https://www.gnu.org/>

