From: Agustin Martin
Newsgroups: gmane.emacs.devel
Subject: Re: Bug 130397 (Was: Emacs - Ispell problem with i[no]german dictionary)
Date: Wed, 22 Dec 2004 18:13:06 +0100
Message-ID: <20041222171306.GA4462@agmartin.aq.upm.es>
References: <20040517120658.GA6919@agmartin.aq.upm.es> <20041217121515.GA2270@agmartin.aq.upm.es> <200412221237.VAA07262@etlken.m17n.org>
In-Reply-To: <200412221237.VAA07262@etlken.m17n.org>
To: Kenichi Handa, 130397@bugs.debian.org
Cc: lionel@mamane.lu, emacs-devel@gnu.org

On Wed, Dec 22, 2004 at 09:37:32PM +0900, Kenichi Handa wrote:
> Please try the same thing with the latest CVS code. With
> that, when you type e-grave in fr_FR@euro locale, e-grave of
> latin-iso8859-15 should be inserted in a buffer. So, as far
> as you are using a dictionary that uses iso-8859-15 encoding
> (or in general, using a dictionary that uses the same
> encoding as your locale), you should not face the above
> problem.

Thanks for the tip. I do not maintain emacs itself, but a package for the
common dictionaries setup (dictionaries-common) that provides a recent and
patched ispell.el for all the different emacsen flavours ({x}emacs), so that
the different dicts and spellchecking engines are integrated in some way. I
will be happy to test this once it is included in sid emacs.

> > I am playing with redefining ispell-get-coding-system function in ispell.el
> > so dict coding-system is changed to iso-8859-15 if was originally
> > iso-8859-1 and emacs has iso-8859-15 as buffer-file-coding-system, something
> > like
>
> At least you should check if buffer-file-coding-system is
> nil or not before calling coding-system-get.

Thanks for pointing this out, change added.

> But, anyway,
> I think the above function is too ad-hoc. As iso-8859-1 and
> iso-8859-15 contains different set of characters (even if
> they are few), it's not good to treat them as the same
> thing.
>
> For instance, if a dictionary uses iso-8859-1 encoding, it
> doesn't contain "\264" in CASECHARS entry. But, if a
> dictionary uses iso-8859-15 encoding, it should contain
> "\264" (Z-WITH-CARON) in CASECHARS entry.
>
> So, if you are going to check the spell of some word
> containing Z-WITH-CARON by iso-8859-1 dictionary, something
> goes wrong.

I was aware of this, but thanks anyway for the reminder. The code is probably
too ad-hoc, but the latin{0,1} situation is itself a somewhat ad-hoc scenario:
latin0 should really have been named something like iso-8859-1v2, that is, a
revision. I cannot imagine somebody using an iso-8859-2 dict while trying to
write in an iso-8859-1 buffer, but with iso-8859-1 and iso-8859-15 that is
happening too frequently.

We have a lot of people who blindly select the @euro locale variant without
realizing its implications, namely that iso-8859-1 and iso-8859-15 are
different, although very close, encodings. From a practical point of view they
are fully equivalent for most languages except, IIRC, French (oe, "Y) and
Finnish ({sSzZ}^, where ^ stands for caron); the euro symbol does not seem
significant for spellchecking.

Furthermore (this is probably fixed by the CVS code you mentioned above), in
current sid emacs utf-8 files can be checked with a latin1 dict (provided, of
course, that they do not use chars outside latin1) through the ispell.el
internal reencodings, but this fails for a dict declared as iso-8859-15.

The current state of the ispell dicts in Debian is that ifrench uses
iso-8859-15 by default (although it also has a real latin1 entry), while
finnish does not set the {s,z}-caron chars at all, so it is a fully latin1
entry. aspell-fr and aspell-fi are set to plain latin1.

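To make the size of the difference concrete, here is a small sketch in Python
(not taken from ispell.el; the names differing_bytes and safe_in_both are only
illustrative) that lists the byte values where the two encodings disagree. It
relies on Python's standard iso-8859-1 and iso-8859-15 codecs:

```python
import unicodedata


def differing_bytes():
    """Return {byte: (latin1_char, latin9_char)} for bytes that decode differently."""
    diffs = {}
    for b in range(256):
        raw = bytes([b])
        c1 = raw.decode("iso-8859-1")
        c15 = raw.decode("iso-8859-15")
        if c1 != c15:
            diffs[b] = (c1, c15)
    return diffs


def safe_in_both(word):
    """True if `word` encodes to identical bytes in both charsets.

    Illustrative helper: such a word can be sent unchanged to a dictionary
    declared with either encoding.
    """
    try:
        return word.encode("iso-8859-1") == word.encode("iso-8859-15")
    except UnicodeEncodeError:
        # The word uses a character missing from at least one of the charsets.
        return False


if __name__ == "__main__":
    # Print each differing byte in the octal notation used for CASECHARS.
    for b, (c1, c15) in sorted(differing_bytes().items()):
        print(f"\\{b:03o}  latin1={unicodedata.name(c1)}  latin9={unicodedata.name(c15)}")
```

With these codecs the differing positions are the eight bytes \244, \246,
\250, \264, \270, \274, \275 and \276, which cover the euro sign, the
{sSzZ}-caron letters, the oe ligatures and "Y. A word that avoids them, such
as "élève", encodes identically in both charsets.
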
So the only language that might currently require extra work is French, and
for it I find it reasonable to use the iso-8859-15 entry as the emacs default
(tagged as iso-8859-1 for the above system to work). On this I would like to
hear Lionel's point of view, since he has put a lot of effort into making
iso-8859-15 available for spellchecking (Hi, Lionel).

I personally do not like having separate iso-8859-15 entries unless they are
really required. For the above dicts, that would only be for French, and I am
not at all sure that it is really required.

Thanks a lot for your feedback, Handa.

Cheers,

-- 
Agustin