From: Agustin Martin <agustin6martin@gmail.com>
To: 17742@debbugs.gnu.org
Cc: Reuben Thomas <rrt@sc3d.org>
Subject: bug#17742: Acknowledgement (Support for enchant?)
Date: Mon, 19 Dec 2016 18:37:19 +0100
Message-ID: <20161219173719.5lt4u562tf4mcwcy@agmartin.aq.upm.es>
In-Reply-To: <838trb6h7s.fsf@gnu.org>
On Mon, Dec 19, 2016 at 06:01:27PM +0200, Eli Zaretskii wrote:
> > From: Reuben Thomas
>
> > Basic tests using [[:alpha:]] for casechars and [^[:alpha:]] for not-casechars seem to work OK.
>
> For which language and dictionary? This will definitely do the wrong
> thing for Hunspell he_IL dictionary I have here, which says:
>
> WORDCHARS אבגדהוזחטיכלמנסעפצקרשתםןךףץ'"
>
> That is, it wants ' and " to be treated as word-constituent
> characters. As another example, I can envision a dictionary of
> acronyms and abbreviations, which might want to treat the period as a
> word-constituent character, to support the likes of "a.k.a.".
> Etc. etc. -- this is up to the dictionary to decide, and Emacs must
> follow suit.
>
> Also, please note that [:alpha:] in Emacs 25 means a much larger set
> of characters than in previous versions, see NEWS. It will in general
> catch strings of characters that cannot possibly be TRT for a
> single-language dictionary. E.g.,
>
> (string-match "[[:alpha:]]+" "aβגд") => 0
>
> > I meant [[:graph:]] and [^[:graph:]].
>
> This will match an even larger set in Emacs 25, I don't think we will
> ever want that for spell-checking.
Hi,
I am not following this very closely, but ispell.el still uses [:alpha:] for
aspell and hunspell. If I remember correctly, the old meaning was
something like "as for the current locale", while it now has a much wider
meaning.
For the vast majority of systems this should not be a problem, but I wonder
whether it can have side effects for ispell.el in corner cases.
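To make the concern concrete, here is a minimal Emacs Lisp sketch comparing a dictionary-specific character class with the generic [:alpha:] one. The helper `my-first-word-span' and the constant `my-he-casechars' are hypothetical names used only for illustration; they are not part of ispell.el. The Hebrew class is copied from the WORDCHARS line quoted above.

```elisp
;; Hypothetical helper for illustration only; it is not part of ispell.el.
(defun my-first-word-span (casechars text)
  "Return (START . END) of the first run of CASECHARS in TEXT, or nil.
CASECHARS is a regexp matching a single word-constituent character."
  (let ((case-fold-search nil))
    (when (string-match (concat "\\(?:" casechars "\\)+") text)
      (cons (match-beginning 0) (match-end 0)))))

;; Character class built from the WORDCHARS line of the he_IL dictionary.
(defconst my-he-casechars "[אבגדהוזחטיכלמנסעפצקרשתםןךףץ'\"]")

;; A Hebrew acronym written with a double quote before its last letter.
(my-first-word-span my-he-casechars "צה\"ל")   ; => (0 . 4), the whole word
(my-first-word-span "[[:alpha:]]" "צה\"ל")     ; => (0 . 2), stops at the quote

;; A mixed-script string: a narrow class stops after the Latin letter.
(my-first-word-span "[a-z]" "aβגд")            ; => (0 . 1)
;; On Emacs 25 or later, [[:alpha:]] should instead span all four letters.
(my-first-word-span "[[:alpha:]]" "aβגд")
```

With [:alpha:] the acronym is cut in two at the quote, while the mixed-script string is taken as a single word, which is roughly the kind of corner case I have in mind.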
> > Also, as I realised while preparing the patch for bug#25230, it is only hunspell that has special information
> > about character classes. All the others just use [:alpha:]. So if it's good enough for ispell and aspell, can't it be
> > good enough for enchant? (It just means that for now "direct Hunspell" is arguably better than "Hunspell via
> > Enchant".)
>
> Hunspell is the most modern and sophisticated speller, we certainly
> don't want to degrade it. Also, Aspell uses the dictionaries at least
> for some of this info, see the function I pointed to above.
>
> Once again, if Enchant uses a back-end for which we know how to find
> this information, we should do so.
About Enchant: the last time I looked at it, it was mostly intended for use
through libenchant, not through the standalone enchant binary, which was
more of a testing tool. As a matter of fact, its list of
options is quite short and it seems to lack support for personal
dictionaries. Since Emacs does its spellchecking through a pipe to an
external program, I do not think we should worry too much about the
enchant binary.
Things may have changed recently in enchant, but I would not expect much
change: its man page still mentions myspell and does not mention hunspell
at all (so it may be a bit outdated), although enchant seems to be able to
use libhunspell.
Also, there is no easy way to know which particular spellchecking engine is
being used. Enchant uses the $(datadir)/enchant and ~/.enchant config files to
define preferences, but I see no way to make enchant report which engine is
actually in use. So it is not easy to parse dictionary info.
Sorry if I have missed some things. Gmail tags some of Reuben's mails as spam
and puts them out of my usual workflow.
--
Agustin