From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Stefan Monnier <monnier@iro.umontreal.ca>
Newsgroups: gmane.emacs.devel
Subject: Re: Bug 130397
Date: Tue, 04 Jan 2005 23:42:15 -0500
Message-ID: <873bxgjxrp.fsf-monnier+emacs@gnu.org>
References: <Pine.LNX.4.43.0305140821370.30166-100000@wr-linux02.rki.ivbb.bund.de>
	<m3addpd2ur.fsf@dionysos.nib> <E19HNCh-0000tv-00@fencepost.gnu.org>
	<20040517120658.GA6919@agmartin.aq.upm.es>
	<20041217121515.GA2270@agmartin.aq.upm.es>
	<200412221237.VAA07262@etlken.m17n.org>
	<20041222171306.GA4462@agmartin.aq.upm.es>
	<200501041250.VAA10883@etlken.m17n.org>
	<m1llb9p887.fsf-monnier+emacs@gnu.org>
	<200501050200.LAA12589@etlken.m17n.org>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-Trace: sea.gmane.org 1104900402 1412 80.91.229.6 (5 Jan 2005 04:46:42 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Wed, 5 Jan 2005 04:46:42 +0000 (UTC)
Cc: agustin.martin@hispalinux.es, lionel@mamane.lu, emacs-devel@gnu.org,
	k.stevens@ieee.org, 130397@bugs.debian.org
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Jan 05 05:46:32 2005
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Original-Received: from lists.gnu.org ([199.232.76.165])
	by deer.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 1Cm344-0002xp-00
	for <ged-emacs-devel@m.gmane.org>; Wed, 05 Jan 2005 05:46:32 +0100
Original-Received: from localhost ([127.0.0.1] helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.33)
	id 1Cm3FI-0001Vf-T8
	for ged-emacs-devel@m.gmane.org; Tue, 04 Jan 2005 23:58:08 -0500
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.33)
	id 1Cm3ER-0000Yj-0Y
	for emacs-devel@gnu.org; Tue, 04 Jan 2005 23:57:15 -0500
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.33)
	id 1Cm3EN-0000Wq-Hn
	for emacs-devel@gnu.org; Tue, 04 Jan 2005 23:57:12 -0500
Original-Received: from [199.232.76.173] (helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.33) id 1Cm3EM-0000Vn-Ks
	for emacs-devel@gnu.org; Tue, 04 Jan 2005 23:57:10 -0500
Original-Received: from [209.226.175.188] (helo=tomts25-srv.bellnexxia.net)
	by monty-python.gnu.org with esmtp (Exim 4.34) id 1Cm2zy-0000uV-HZ
	for emacs-devel@gnu.org; Tue, 04 Jan 2005 23:42:18 -0500
Original-Received: from alfajor ([67.71.119.166]) by tomts25-srv.bellnexxia.net
	(InterMail vM.5.01.06.10 201-253-122-130-110-20040306) with ESMTP
	id <20050105044217.BKPL25979.tomts25-srv.bellnexxia.net@alfajor>;
	Tue, 4 Jan 2005 23:42:17 -0500
Original-Received: by alfajor (Postfix, from userid 1000)
	id 6E8472FD13; Tue,  4 Jan 2005 23:42:15 -0500 (EST)
Original-To: Kenichi Handa <handa@m17n.org>
In-Reply-To: <200501050200.LAA12589@etlken.m17n.org> (Kenichi Handa's
	message of "Wed, 5 Jan 2005 11:00:11 +0900 (JST)")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: main.gmane.org gmane.emacs.devel:31875
X-Report-Spam: http://spam.gmane.org/gmane.emacs.devel:31875

>> Why can't ispell.el just use the `w' syntax to decide what is a word and
>> then rely on the decoding/encoding to do the rest of the work?

>> That would fix the problem where a word like "expérience" is checked as two
>> words if the dictionary is "american".

> That will cause another problem.  For instance, when we have
> "español" in a buffer and the ispell dictionary is czech
> (latin-2), as "español" is encoded into "espa?ol" by
> latin-2, it causes the error "Ispell and its process have
> different character maps" because ispell returns the result
> of two words "espa" and "ol".

But ispell.el should be able to check automatically whether the chars can be
safely encoded with the coding-system.  If they cannot (as in your example),
ispell.el will know that the word can't be checked by ispell and should
just be skipped (and maybe marked as "uncheckable").
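
A rough sketch of such a check, assuming an Emacs where
`unencodable-char-position' is available.  The name
`my-ispell-word-encodable-p' is made up for illustration and is not part of
ispell.el:

```elisp
;; Hypothetical helper, not part of ispell.el.
(defun my-ispell-word-encodable-p (word coding-system)
  "Return non-nil if every char of WORD can be encoded by CODING-SYSTEM."
  ;; With the STRING argument, START and END are indexes into WORD.
  ;; The function returns nil when no unencodable char is found.
  (null (unencodable-char-position 0 (length word) coding-system nil word)))

;; "español" against a latin-2 dictionary: the ñ has no latin-2 encoding,
;; so a caller could skip the word instead of sending "espa?ol" to ispell.
(my-ispell-word-encodable-p "espa\u00f1ol" 'iso-8859-2)
```

A caller would run this test on each candidate word before handing it to the
ispell process, and skip (or mark as "uncheckable") any word that fails it.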

>>> + 		(string-as-multibyte
>>> + 		 (mapconcat
>>> + 		  #'(lambda (c)
>>> + 		      (let ((unichar (aref ucs-mule-8859-to-mule-unicode c)))
>>> + 			(if unichar
>>> + 			    (aref ispell-unified-chars-table unichar)
>>> + 			  (string c))))
>>> + 		  str ""))))

>> Do you expect the output of mapconcat to be unibyte and to contain
>> emacs-mule encoding of multibyte chars?

> No.  STR may be an ASCII-only string, in which case, the
> result of mapconcat is a unibyte ASCII-only string.  I'd
> like to change it to a multibyte ASCII-only string to avoid
> converting STR again and again in such a case.

Then string-to-multibyte sounds like a safer choice.
`string-as-multibyte' has very strange semantics; I recommend we avoid it as
much as possible.
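
To illustrate for the ASCII-only case you describe (a sketch only;
`my-ensure-multibyte' is a hypothetical name): `string-to-multibyte' changes
only the representation and keeps each byte as one character, whereas
`string-as-multibyte' reinterprets the bytes as Emacs's internal multibyte
encoding.

```elisp
;; Hypothetical helper, not part of ispell.el.
(defun my-ensure-multibyte (str)
  "Return STR as a multibyte string, converting only if it is unibyte."
  (if (multibyte-string-p str)
      str
    ;; Each byte becomes exactly one character; bytes >= 0x80 stay
    ;; raw bytes and are never merged into a multibyte sequence.
    (string-to-multibyte str)))

;; An ASCII-only literal is unibyte; the result is the same text, multibyte.
(multibyte-string-p (my-ensure-multibyte "word"))

;; Two non-ASCII bytes remain two characters after the conversion.
(length (my-ensure-multibyte "\303\251"))
```

For ASCII-only input the two functions agree, so nothing is lost by using the
one with the more predictable behaviour.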


        Stefan