From: Dave Love
Newsgroups: gmane.emacs.devel
Subject: Re: iso-8859-1 and non-latin-1 chars
Date: 06 Jan 2003 19:28:21 +0000
References: <4207.1041354888@ichips.intel.com>
To: Ken Stevens
Cc: Kenichi Handa, rms@gnu.org, g.kuenning@ieee.org, emacs-devel@gnu.org

Ken Stevens writes:

> Ispell _does_ support multibyte characters.  This was one of the
> historical reasons ispell.el did not use emacs syntax tables to
> determine word boundaries.  (It supported latex words that included
> escape sequences such as \'{o}, etc.)

I didn't think that's really the same thing, but it's a long time
since I hacked on ispell.  Also, I don't see why Emacs couldn't match
such words the same as ispell.

Anyhow, as I don't know, what does one have to do to create and use a
dictionary for utf-8 text?  I could probably add support for that.

> I am not sure what it would take to support all the internal emacs
> encodings, or if this would be the best approach.

It is only _external_ encodings that are relevant, and perhaps only
utf-8.  [I don't know if spell-checking actually makes sense in the
Oriental languages which typically use the multibyte iso-2022
encodings.]

Note that Emacs could cope now with checking utf-8-encoded text
against a dictionary for an 8-bit character set as long as the text
concerned can be encoded in that set.  The text will be appropriately
encoded when it is sent to the subprocess.  I think that really
requires using Emacs's (multibyte) syntax tables, though.
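
A minimal sketch of the last point, written in Python rather than
Emacs Lisp and not taken from ispell.el or Emacs itself.  It assumes
the dictionary uses iso-8859-1; the function name
encode_for_dictionary is hypothetical.  A word read from utf-8 text
can be handed to the spelling subprocess only if it can be re-encoded
in the dictionary's 8-bit set:

```python
def encode_for_dictionary(word, charset="iso-8859-1"):
    """Return WORD as bytes in the dictionary's 8-bit charset.

    Returns None when the word contains a character that the charset
    cannot represent, i.e. when it cannot be checked against that
    dictionary at all.  (Hypothetical helper, for illustration only.)
    """
    try:
        return word.encode(charset)
    except UnicodeEncodeError:
        return None


# Text as it might arrive from a utf-8-encoded file.
utf8_bytes = "café naïve Łódź".encode("utf-8")

for word in utf8_bytes.decode("utf-8").split():
    encoded = encode_for_dictionary(word)
    if encoded is None:
        print(word, "-> not encodable in the dictionary's charset, skipped")
    else:
        print(word, "-> would be sent to the subprocess as", encoded)
```

Finding the word boundaries in the first place is the part that would
need the multibyte syntax tables; the split on whitespace above merely
stands in for that step.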