From mboxrd@z Thu Jan 1 00:00:00 1970
From: Agustin Martin
Newsgroups: gmane.emacs.bugs
Subject: bug#17742: Acknowledgement (Support for enchant?)
Date: Mon, 19 Dec 2016 18:37:19 +0100
Message-ID: <20161219173719.5lt4u562tf4mcwcy@agmartin.aq.upm.es>
References: <834m2hjbmr.fsf@gnu.org> <83bmwfbxaf.fsf@gnu.org> <837f73bqwv.fsf@gnu.org> <838trb6h7s.fsf@gnu.org>
In-Reply-To: <838trb6h7s.fsf@gnu.org>
To: 17742@debbugs.gnu.org
Cc: Reuben Thomas

On Mon, Dec 19, 2016 at 06:01:27PM +0200, Eli Zaretskii wrote:
> > From: Reuben Thomas
> >
> > Basic tests using [[:alpha:]] for casechars and [^[:alpha:]] for not-casechars seem to work OK.
>
> For which language and dictionary? This will definitely do the wrong
> thing for Hunspell he_IL dictionary I have here, which says:
>
>   WORDCHARS אבגדהוזחטיכלמנסעפצקרשתםןךףץ'"
>
> That is, it wants ' and " to be treated as word-constituent
> characters. As another example, I can envision a dictionary of
> acronyms and abbreviations, which might want to treat the period as a
> word-constituent character, to support the likes of "a.k.a.".
> Etc. etc. -- this is up to the dictionary to decide, and Emacs must
> follow suit.
>
> Also, please note that [:alpha:] in Emacs 25 means a much larger set
> of characters than in previous versions, see NEWS. It will in general
> catch strings of characters that cannot possibly be TRT for a
> single-language dictionary. E.g.,
>
>   (string-match "[[:alpha:]]+" "aβגд") => 0
>
> > I meant [[:graph:]] and [^[:graph:]].
>
> This will match an even larger set in Emacs 25, I don't think we will
> ever want that for spell-checking.

Hi,

Not following this very closely, but ispell.el still uses [:alpha:] for
aspell and hunspell. If I remember correctly, the old meaning was
something like "as for the current locale", while it now has a much
wider meaning. For the vast majority of systems this should not be a
problem, but I wonder whether it can have side effects for ispell.el in
corner cases.

> > Also, as I realised while preparing the patch for bug#25230, it is only hunspell that has special information
> > about character classes. All the others just use [:alpha:]. So if it's good enough for ispell and aspell, can't it be
> > good enough for enchant? (It just means that for now "direct Hunspell" is arguably better than "Hunspell via
> > Enchant".)
>
> Hunspell is the most modern and sophisticated speller, we certainly
> don't want to degrade it. Also, Aspell uses the dictionaries at least
> for some of this info, see the function I pointed to above.
>
> Once again, if Enchant uses a back-end for which we know how to find
> this information, we should do so.

About Enchant: the last time I looked at it, it was mostly intended for
use through libenchant, not through the standalone enchant binary, which
was more like some kind of testing tool. As a matter of fact, its list
of options is quite short and it seems to lack support for personal
dictionaries. Since Emacs uses a pipe for spellchecking, I do not think
we should worry too much about the enchant binary.

Things may have changed recently in enchant, but I would not expect
much change: its man page still mentions myspell and does not mention
hunspell at all (so it may be a bit outdated), although it seems to be
able to use libhunspell.

Also, there is no easy way to know which particular spellchecking engine
is being used. Enchant uses $(datadir)/enchant and ~/.enchant config
files to define preferences, but I see no way to make enchant tell which
one is being used. So it is not easy to parse dictionary info.

Sorry if I have missed some things. Gmail tags some of Reuben's mails as
spam and puts them out of my usual workflow.

-- 
Agustin
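
To make the WORDCHARS point quoted above concrete, here is a minimal sketch in Python (chosen only for illustration; it is not Emacs Lisp and not how ispell.el does it). It assumes, as a simplification, that the WORDCHARS line of a Hunspell affix file by itself defines the word-constituent characters. The names parse_wordchars, word_regex, SAMPLE_AFF and UNICODE_ALPHA are hypothetical, and the pattern [^\W\d_]+ is only a rough stand-in for a Unicode-wide [[:alpha:]]+.

```python
import re

# Sample affix-file text; the WORDCHARS line is the one quoted above.
SAMPLE_AFF = "SET UTF-8\nWORDCHARS אבגדהוזחטיכלמנסעפצקרשתםןךףץ'\"\n"

# Rough stand-in for a Unicode-wide [[:alpha:]]+: a run of letters in any script.
UNICODE_ALPHA = re.compile(r"[^\W\d_]+")


def parse_wordchars(aff_text):
    """Return the value of the first WORDCHARS line, or "" if there is none."""
    for line in aff_text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] == "WORDCHARS":
            return fields[1].strip()
    return ""


def word_regex(wordchars):
    """Build a regex matching runs made only of the given characters.

    Assumption for this sketch: WORDCHARS alone is the word-constituent set.
    """
    if not wordchars:
        raise ValueError("no WORDCHARS given")
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    return re.compile("[" + re.escape(wordchars) + "]+")
```

With these definitions, UNICODE_ALPHA.findall("aβגд") returns the whole mixed-script string as a single word, whereas word_regex(parse_wordchars(SAMPLE_AFF)).findall("aβגд") returns only the Hebrew letter. The dictionary-derived class also keeps a word containing " together, which the letter-only class would split.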