From: Agustin Martin
Newsgroups: gmane.emacs.bugs
Subject: bug#7668: ispell and dictionary encodings
Date: Tue, 21 Dec 2010 12:30:08 +0100
Message-ID: <20101221113008.GB3440@agmartin.aq.upm.es>
References: <20101220113148.GA12469@agmartin.aq.upm.es>
To: 7668@debbugs.gnu.org, Reuben Thomas

On Mon, Dec 20, 2010 at 03:40:18PM +0000, Reuben Thomas wrote:
> On 20 December 2010 11:31, Agustin Martin wrote:
>
> [a very helpful reply; thanks]
>
> > On Fri, Dec 17, 2010 at 06:30:14PM +0000, Reuben Thomas wrote:
> > If you are not going to use XEmacs, but only FSF Emacs, just use [:alpha:]
> > for the case-character and non-case-character strings along with utf-8. That
> > is already done automatically for aspell dictionaries, where is easy to get
> > a list of installed dictionaries and additional info.
>
> So, the built-in entries of ispell-dictionary-base-alist are
> specifically for ispell?

Or, more generally, for versions of the spellcheckers that do not properly
support different encodings: old aspells and hunspells. There are still
some of them around.

> In that case, it seems a bit odd that they
> are used for hunspell, but perhaps the problem is that you can't get
> hunspell to give you that information about its dictionaries?

That is indeed part of the problem. Otherwise something like
(ispell-aspell-find-dictionaries) and friends could be used. 'hunspell -D'
does not provide all the info, and does not return control until ^C.

> But is
> there in any case a reason not to default to using [:alpha:] for
> case-chars and ^[:alpha:] for non-case-chars with hunspell?

Besides old aspells and hunspells, I am trying to improve XEmacs
compatibility for ispell.el and flyspell.el. I keep patched versions for
Debian, so that all Emacs flavours use the same ispell.el and flyspell.el.
In their current incarnation, the Debian-patched files support even
Emacs >= 21.3. I am currently removing all that compatibility, leaving only
Emacs23 and XEmacs, and would like to keep FSF Emacs ispell.el and
flyspell.el reasonably close to the ones I use, so that I need fewer
changes. And XEmacs does not support [:alpha:].

An intermediate possibility could be to use a hunspell-specific default
dictionary list, built on the fly from the base-alist, with the encoding set
to utf-8 and casechars/not-casechars changed to [:alpha:], for FSF Emacs and
a recent enough hunspell.
Since this would only be done the first time ispell.el invokes hunspell
for spellchecking, it seems reasonable. But I have to think about this.

> In case I'm getting too confused, I'll just restate the basic
> objective I have: I want to be able to spell-check (in my case,
> British, but I don't think it matters for this purpose) English with
> a) accents and b) fancy quotes. In these days of utf-8 being widely
> used for English, it seems it should be possible to do at least b) out
> of the box, which currently it isn't, as far as I can see.

Putting those fancy quotes in the 'otherchars' section of the dictionary
definition for ispell.el should make ispell.el consider them part of the
word, but IIRC that will not affect hunspell unless they are defined in
the TRY section of the .aff file.

-- 
Agustin
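
For illustration only, the [:alpha:] and 'otherchars' suggestions discussed above could be sketched as a user-level dictionary entry along the following lines. This is a rough sketch, not something taken from the thread: it assumes FSF Emacs (XEmacs does not support [:alpha:]), a hunspell recent enough to handle utf-8, and an installed dictionary called "en_GB". The dictionary name and the "-d" argument are assumptions, and whether hunspell itself treats the fancy quote as part of a word still depends on its .aff file, as noted above.

```elisp
;; Sketch only: assumes FSF Emacs, a utf-8 capable hunspell and an
;; installed "en_GB" dictionary (name and arguments are assumptions).
(setq ispell-program-name "hunspell")
(setq ispell-local-dictionary-alist
      '(("en_GB"            ; dictionary name (assumed)
         "[[:alpha:]]"      ; casechars
         "[^[:alpha:]]"     ; not-casechars
         "['’]"             ; otherchars: ASCII apostrophe and U+2019
         t                  ; many-otherchars-p
         ("-d" "en_GB")     ; arguments passed to the spellchecker (assumed)
         nil                ; extended-character-mode
         utf-8)))           ; character set used to talk to the process
```

The fields follow the order documented for ispell-dictionary-alist; only the 'otherchars' field changes how ispell.el itself splits words.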