From: Stefan Monnier
To: Kenichi Handa
Cc: agustin.martin@hispalinux.es, lionel@mamane.lu, emacs-devel@gnu.org, k.stevens@ieee.org, 130397@bugs.debian.org
Newsgroups: gmane.emacs.devel
Subject: Re: Bug 130397
Date: Tue, 04 Jan 2005 23:42:15 -0500
Message-ID: <873bxgjxrp.fsf-monnier+emacs@gnu.org>
In-Reply-To: <200501050200.LAA12589@etlken.m17n.org> (Kenichi Handa's message of "Wed, 5 Jan 2005 11:00:11 +0900 (JST)")

>> Why can't ispell.el just use the `w' syntax to decide what is a word and
>> then rely on the decoding/encoding to do the rest of the work?
>> That would fix the problem where a word like "expérience" is checked as two
>> words if the dictionary is "american".

> That will cause another problem.  For instance, when we have
> "español" in a buffer and the ispell dictionary is czech
> (latin-2), as "español" is encoded into "espa?ol" by
> latin-2, it causes the error "Ispell and its process have
> different character maps" because ispell returns the result
> of two words "espa" and "ol".

But ispell.el should be able to automatically check whether the chars can be
safely encoded with the coding-system.  If they cannot (as in your example),
ispell.el will know that the word can't be checked by ispell and should just
skip it (and maybe mark it as "uncheckable").
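To make the idea concrete, here is a minimal sketch of such a check.  It is
not taken from ispell.el: the names `my-ispell-word-encodable-p' and
`my-ispell-classify-word' are hypothetical, and it assumes an Emacs that
provides `unencodable-char-position' with its optional STRING argument.

```elisp
;; Hypothetical helpers for illustration only; not part of ispell.el.

(defun my-ispell-word-encodable-p (word coding-system)
  "Return non-nil if every character of WORD can be encoded by CODING-SYSTEM."
  ;; With the fifth argument, START and END are indexes into WORD.
  ;; The function returns nil when no unencodable character is found.
  (null (unencodable-char-position 0 (length word) coding-system nil word)))

(defun my-ispell-classify-word (word coding-system)
  "Return `checkable' if WORD can be sent to ispell, else `uncheckable'."
  (if (my-ispell-word-encodable-p word coding-system)
      'checkable
    'uncheckable))

;; Example calls (ASCII escapes stand for e-acute and n-tilde):
;;   (my-ispell-classify-word "exp\u00e9rience" 'iso-8859-1)
;;   (my-ispell-classify-word "espa\u00f1ol" 'iso-8859-2)
```

Since latin-2 has no n-tilde, the second call should classify the word as
uncheckable instead of sending "espa?ol" to the ispell process.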

>>> +      (string-as-multibyte
>>> +       (mapconcat
>>> +        #'(lambda (c)
>>> +            (let ((unichar (aref ucs-mule-8859-to-mule-unicode c)))
>>> +              (if unichar
>>> +                  (aref ispell-unified-chars-table unichar)
>>> +                (string c))))
>>> +        str ""))))

>> Do you expect the output of mapconcat to be unibyte and to contain
>> emacs-mule encoding of multibyte chars?

> No.  STR may be an ASCII-only string, in which case, the
> result of mapconcat is a unibyte ASCII-only string.  I'd
> like to change it to a multibyte ASCII-only string to avoid
> converting STR again and again in such a case.

Then string-to-multibyte sounds like a safer choice.
`string-as-multibyte' has very strange semantics; I recommend we avoid it as
much as possible.


        Stefan
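
A small sketch of the difference in practice, assuming a current Emacs.  The
helper name `my-ensure-multibyte' is hypothetical and is not part of the
patch under discussion; it only shows `string-to-multibyte' doing the
ASCII-only conversion described above.

```elisp
;; Hypothetical helper for illustration only.
(defun my-ensure-multibyte (str)
  "Return STR as a multibyte string, converting each byte to one character."
  ;; `string-to-multibyte' returns STR itself if it is already multibyte.
  ;; Otherwise each byte becomes exactly one character; bytes are never
  ;; reinterpreted as pieces of a multibyte sequence.
  (string-to-multibyte str))

(let* ((ascii "word")                          ; ASCII-only literal, unibyte
       (converted (my-ensure-multibyte ascii))
       (raw (my-ensure-multibyte "\351")))     ; a single non-ASCII byte
  (list (multibyte-string-p ascii)             ; unibyte before conversion
        (multibyte-string-p converted)         ; multibyte afterwards
        (string= ascii converted)              ; same characters
        (length raw)))                         ; still one character
```

The point of the design choice is that the conversion is byte-for-character,
so an ASCII-only STR keeps its contents and only its representation changes.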