From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Kastrup
Newsgroups: gmane.emacs.devel
Subject: Re: Bug 130397
Date: Mon, 10 Jan 2005 10:09:41 +0100
References: <28878.1105029010@ichips.intel.com> <01c4f6cf$Blat.v2.2.2$5c4e1220@zahav.net.il>
In-Reply-To: <01c4f6cf$Blat.v2.2.2$5c4e1220@zahav.net.il> (Eli Zaretskii's message of "Mon, 10 Jan 2005 06:45:31 +0200")
Original-To: Eli Zaretskii
Cc: geoff@cs.hmc.edu, 130397@bugs.debian.org, agustin.martin@hispalinux.es, lionel@mamane.lu, emacs-devel@gnu.org, kstevens@ichips.intel.com, snogglethorpe@gmail.com, miles@gnu.org
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)
List-Id: "Emacs development discussions."
Xref: main.gmane.org gmane.emacs.devel:32090

"Eli Zaretskii" writes:

>> Date: Sat, 8 Jan 2005 22:29:21 +0900
>> From: Miles Bader
>> Cc: Geoff Kuenning, 130397@bugs.debian.org,
>>     agustin.martin@hispalinux.es, lionel@mamane.lu,
>>     Kenichi Handa, emacs-devel@gnu.org,
>>     juri@jurta.org, Ken Stevens,
>>     Stefan Monnier
>>
>> If ispell wants utf-8, it's easy enough to convert each input line to
>> utf-8 and deal with offsets into that in the event of a misspelling;
>
> Or account for byte offsets by the (variable) multibyte length of each
> character, which Emacs knows.  I don't remember for the moment whether
> the multibyte length of the UTF-8 encoding can be gotten at by a Lisp
> program, but if not, we could add some primitive to do that.

Just encode the line to utf-8, find the correct point in the byte
string, cut off the line there, convert back and check the length of
the string.  This works unless you are in the middle of a character.

But it would be much saner if our conversion facilities preserved
markers (which they don't do right now): encode to utf-8, place a
marker at the right byte offset, undo the conversion.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
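
To make the encode/cut/decode idea concrete, here is a minimal sketch.
It is written in Python rather than Emacs Lisp, purely for
illustration, and the function name byte_offset_to_char_offset is
made up for this example; it is not an existing Emacs or ispell
interface.  It assumes the byte offset reported by the spell checker
counts bytes of the UTF-8 encoded line.

```python
def byte_offset_to_char_offset(line: str, byte_offset: int) -> int:
    """Map a byte offset in the UTF-8 encoding of LINE to a character offset.

    Hypothetical helper for illustration only: encode the line, cut the
    byte string at the offset, decode the prefix and take its length.
    """
    encoded = line.encode("utf-8")
    if not 0 <= byte_offset <= len(encoded):
        raise ValueError("byte offset outside the encoded line")

    prefix = encoded[:byte_offset]
    try:
        # Strict decoding fails if the cut landed inside a character.
        return len(prefix.decode("utf-8"))
    except UnicodeDecodeError:
        raise ValueError("byte offset falls in the middle of a character") from None


if __name__ == "__main__":
    text = "na\u00efve caf\u00e9"  # two characters here take two bytes each
    print(byte_offset_to_char_offset(text, 7))
```

The ValueError branch corresponds to the "middle of a character"
case: the truncated prefix is not valid UTF-8, so it cannot be
converted back.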