From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Agustin Martin <agustin.martin@hispalinux.es>
Newsgroups: gmane.emacs.devel
Subject: Re: Bug 130397 (Was: Emacs - Ispell problem with
	i[no]german	dictionary)
Date: Wed, 22 Dec 2004 18:13:06 +0100
Message-ID: <20041222171306.GA4462@agmartin.aq.upm.es>
References: <Pine.LNX.4.43.0305140821370.30166-100000@wr-linux02.rki.ivbb.bund.de>
	<m3addpd2ur.fsf@dionysos.nib> <E19HNCh-0000tv-00@fencepost.gnu.org>
	<20040517120658.GA6919@agmartin.aq.upm.es>
	<20041217121515.GA2270@agmartin.aq.upm.es>
	<200412221237.VAA07262@etlken.m17n.org>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1103735748 24808 80.91.229.6 (22 Dec 2004 17:15:48 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Wed, 22 Dec 2004 17:15:48 +0000 (UTC)
Cc: lionel@mamane.lu, emacs-devel@gnu.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Dec 22 18:15:41 2004
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Original-Received: from lists.gnu.org ([199.232.76.165])
	by deer.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 1ChA5N-0001oH-00
	for <ged-emacs-devel@m.gmane.org>; Wed, 22 Dec 2004 18:15:41 +0100
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.33)
	id 1ChAFv-0001jG-L4
	for ged-emacs-devel@m.gmane.org; Wed, 22 Dec 2004 12:26:35 -0500
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.33)
	id 1ChAFo-0001j9-RA
	for emacs-devel@gnu.org; Wed, 22 Dec 2004 12:26:28 -0500
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.33)
	id 1ChAFo-0001im-08
	for emacs-devel@gnu.org; Wed, 22 Dec 2004 12:26:28 -0500
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.33) id 1ChAFn-0001ij-SC
	for emacs-devel@gnu.org; Wed, 22 Dec 2004 12:26:27 -0500
Original-Received: from [138.100.4.49] (helo=edison.ccupm.upm.es)
	by monty-python.gnu.org with esmtp (Exim 4.34) id 1ChA59-000231-TN
	for emacs-devel@gnu.org; Wed, 22 Dec 2004 12:15:28 -0500
Original-Received: from mala.aq.upm.es (Agmartin.aq.upm.es [138.100.41.131])
	by edison.ccupm.upm.es (8.12.10/8.12.10) with ESMTP id iBMHD6LN021808; 
	Wed, 22 Dec 2004 18:13:06 +0100
Original-Received: by mala.aq.upm.es (Postfix, from userid 1000)
	id EE5F927302; Wed, 22 Dec 2004 18:13:06 +0100 (CET)
Original-To: Kenichi Handa <handa@m17n.org>, 130397@bugs.debian.org
Content-Disposition: inline
In-Reply-To: <200412221237.VAA07262@etlken.m17n.org>
User-Agent: Mutt/1.5.6+20040907i
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: main.gmane.org gmane.emacs.devel:31332
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:31332

On Wed, Dec 22, 2004 at 09:37:32PM +0900, Kenichi Handa wrote:

> Please try the same thing with the latest CVS code.  With
> that, when you type e-grave in fr_FR@euro locale, e-grave of
> latin-iso8859-15 should be inserted in a buffer.  So, as far
> as you are using a dictionary that uses iso-8859-15 encoding
> (or in general, using a dictionary that uses the same
> encoding as your locale), you should not face the above
> problem.
> 

Thanks for the tip. I do not maintain emacs itself, but a package for the
common dictionaries setup (dictionaries-common) that provides a recent and
patched ispell.el for all the different emacsen flavours ({x}emacs), so that
the different dicts and spellchecking engines are integrated in some way. I
will be happy to test this once it is included in sid emacs.

> > I am playing with redefining ispell-get-coding-system function in ispell.el
> > so dict coding-system is changed to iso-8859-15 if was originally
> > iso-8859-1 and emacs has iso-8859-15 as buffer-file-coding-system, something
> > like
> 
> At least you should check if buffer-file-coding-system is
> nil or not before callding coding-system-get.  

Thanks for pointing this out, change added.
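
A minimal sketch of that kind of check, written in Python for illustration
only. The function and argument names are hypothetical, and this is not the
actual ispell.el code (which would be Emacs Lisp inside
ispell-get-coding-system); it only shows the nil guard and the single
latin1 -> latin9 special case discussed above.

```python
from typing import Optional


def effective_dict_coding(dict_coding: str, buffer_coding: Optional[str]) -> str:
    """Pick the coding system used to talk to the spellchecker (hypothetical helper)."""
    # Guard first: the buffer may have no coding system set at all.
    if buffer_coding is None:
        return dict_coding
    # Only the iso-8859-1 dict / iso-8859-15 buffer case is special-cased.
    if dict_coding == "iso-8859-1" and buffer_coding == "iso-8859-15":
        return "iso-8859-15"
    # Every other combination keeps the coding declared by the dictionary.
    return dict_coding
```

With this, an unset buffer coding system or an unrelated pair such as
iso-8859-2 / iso-8859-1 leaves the dictionary coding untouched.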

> But, anyway,
> I think the above function is too ad-hoc.  As iso-8859-1 and
> iso-8859-15 contains different set of characters (even if
> they are few), it's not good to treat them as the same
> thing.
> 
> For instance, if a dictionary uses iso-8859-1 encoding, it
> doesn't contain "\264" in CASECHARS entry.  But, if a
> dictionary uses iso-8859-15 encoding, it should contain
> "\264" (Z-WITH-CARON) in CASECHARS entry.
> 
> So, if you are going to check the spell of some word
> containing Z-WITH-CARON by iso-8859-1 dictionary, something
> goes wrong.
> 

I was aware of this, but thanks anyway for the reminder. The code is probably
too ad-hoc, but the latin{0,1} thing is also a somewhat ad-hoc scenario, where
latin0 should really have been named something like iso-8859-1v2, that is,
a revision. I cannot imagine somebody using an iso-8859-2 dict and trying to
write in an iso-8859-1 buffer, but with iso-8859-1 and iso-8859-15 that is
happening too frequently.

So we have a lot of people who blindly select the @euro locale variant
without realizing its implications, namely that iso-8859-1 and iso-8859-15
are different, although very close, encodings (from a practical point of view,
they are fully equivalent for most languages except, IIRC, French (oe,"Y) and
Finnish ({sSzZ}^, where ^ stands for caron); the euro symbol seems not
significant to spellchecking).
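
To make the size of the difference concrete, here is a small Python sketch
(not part of ispell.el; the function name is hypothetical) that lists the
byte values decoded differently by the two charsets, relying only on the
codecs shipped with Python. Octal values are printed as well, since
CASECHARS entries are usually written that way.

```python
import unicodedata


def differing_bytes(a="iso-8859-1", b="iso-8859-15"):
    """Return (byte value, char in a, char in b) for every byte that differs."""
    diffs = []
    for value in range(0xA0, 0x100):  # the two charsets agree below 0xA0
        raw = bytes([value])
        char_a, char_b = raw.decode(a), raw.decode(b)
        if char_a != char_b:
            diffs.append((value, char_a, char_b))
    return diffs


for value, latin1_char, latin9_char in differing_bytes():
    # Print character names rather than the characters, so output stays ASCII.
    print(f"0x{value:02X} (octal {value:o}): "
          f"{unicodedata.name(latin1_char)} -> {unicodedata.name(latin9_char)}")
```

Only eight positions differ: the euro sign, upper and lower case s and z
with caron, the OE ligature in both cases, and Y with diaeresis.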

Furthermore (this is probably fixed by the CVS code you mentioned above),
in current sid emacs utf-8 files can be checked with a latin1 dict (of
course, only if they do not use chars outside latin1) using the ispell.el
internal reencodings, but this fails for a dict declared as iso-8859-15.
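
The condition "do not use chars outside latin1" can be illustrated with a
short Python sketch. This is an assumption-laden illustration, not how
ispell.el does its reencoding: the helper name is hypothetical, and it
simply tests whether a piece of text can be encoded in the charset a
dictionary declares.

```python
def can_check_with(text: str, dict_coding: str) -> bool:
    """True if every character of text exists in the dictionary's charset."""
    try:
        text.encode(dict_coding)
    except UnicodeEncodeError:
        # At least one character has no representation in dict_coding.
        return False
    return True


# "cafe" with e-acute fits both charsets; "coeur" with the oe ligature
# (U+0153) only fits iso-8859-15.
print(can_check_with("caf\u00e9", "iso-8859-1"))
print(can_check_with("c\u0153ur", "iso-8859-1"))
print(can_check_with("c\u0153ur", "iso-8859-15"))
```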

The current state of ispell dicts in Debian is that ifrench is iso-8859-15
by default (although it has a real latin1 entry), while finnish does not set
the {s,z}-caron chars at all, so it is a fully latin1 entry. aspell-fr and
aspell-fi are set to plain latin1.

So the only language that might currently require extra work is French, and
for it I find it reasonable to use the iso-8859-15 entry as the emacs default
(tagged as iso-8859-1 for the above system to work). On this I would like
to hear Lionel's point of view, since he has put a lot of effort into making
iso-8859-15 available for spellchecking (Hi, Lionel).

I personally do not like having separate iso-8859-15 entries unless they are
really required. For the above dicts, that would only be French, and I am not
at all sure that it is really required.

Thanks a lot for your feedback, Handa.

Cheers,

-- 
Agustin