From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Bug 130397
Date: Sat, 08 Jan 2005 13:47:09 +0100
Cc: Kenichi Handa, 130397@bugs.debian.org, agustin.martin@hispalinux.es, lionel@mamane.lu, emacs-devel@gnu.org, juri@jurta.org, Ken Stevens, Stefan Monnier
Original-To: Geoff Kuenning
In-Reply-To: (Geoff Kuenning's message of "08 Jan 2005 13:31:11 +0100")

Geoff Kuenning writes:

> Ken writes:
>
>> Geoff has a much better understanding of the underlying spell search
>> engine.  Perhaps he can shed additional light on this topic.
>
> I just looked at the code to be sure my memory is correct.  Here's
> the short rundown: in the '-a' interface, ispell interfaces with the
> outside world purely in a byte-indexed mode.  It is perfectly
> capable of handling UTF-8 and similar multi-byte encodings, but when
> it reports the offsets of incorrect words, it does so as a byte
> offset, not a character offset.
>
> Does emacs provide an underlying byte-indexed interface to the
> buffer?  If so, life should be easy: just have ispell.el use that
> interface.

You are wrongly assuming that the buffer is maintained in UTF-8.  It
isn't.  Byte indexing is not going to be fun with regard to
efficiency, unless we get some interface that will, while writing out
a file in UTF-8, store an array of byte/character correspondences for
the UTF-8 (or whatever other) character conversion somewhere.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
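
To make the "array of byte/character correspondences" idea concrete, here is a minimal sketch.  It is written in Python purely for illustration and is not Emacs or ispell.el code: the function names are hypothetical, and it assumes zero-based byte offsets into the UTF-8 text exactly as it was written out.  The table is built once during encoding, and each lookup is then a binary search instead of a rescan of the text.

```python
from bisect import bisect_right


def build_byte_offsets(text):
    """Return a list where entry i is the UTF-8 byte offset at which
    character i starts.  A final entry holds the total byte length."""
    offsets = [0]
    for ch in text:
        # Each character contributes 1 to 4 bytes in UTF-8.
        offsets.append(offsets[-1] + len(ch.encode("utf-8")))
    return offsets


def byte_to_char(offsets, byte_offset):
    """Map a byte offset to the index of the character containing it.

    The total byte length maps to len(text), the end-of-text position.
    """
    if not 0 <= byte_offset <= offsets[-1]:
        raise ValueError("byte offset out of range")
    # Find the last character whose starting byte is <= byte_offset.
    return bisect_right(offsets, byte_offset) - 1


if __name__ == "__main__":
    line = "naïve café wrod"
    table = build_byte_offsets(line)
    # Suppose a checker flags a word starting at byte 13 (hypothetical).
    start = byte_to_char(table, 13)
    print(line[start:start + 4])
```

The table costs one integer per character, which is the efficiency trade-off raised above: without it, every reported byte offset would require walking the text from the start.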