From mboxrd@z Thu Jan  1 00:00:00 1970
Path: main.gmane.org!not-for-mail
From: Stefan Monnier <monnier@iro.umontreal.ca>
Newsgroups: gmane.emacs.devel
Subject: Re: Mule problem.
Date: 20 Aug 2004 16:31:08 -0400
Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Message-ID: <jwvllg9a7fp.fsf-monnier+emacs@gnu.org>
References: <x5hdqxh8y2.fsf@lola.goethe.zz>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: emacs-devel@gnu.org
Original-To: David Kastrup <dak@gnu.org>
In-Reply-To: <x5hdqxh8y2.fsf@lola.goethe.zz>
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

> So I basically have to take the buffer line, convert it into a
> canonical form based on the byte sequence, take all the error
> messages, convert them into canonical form, too, correlate the errors
> in the messages with the canonical form, and then convert everything
> back.

> It would be most efficient if I could just place markers at the points
> of all errors, and then call undo repeatedly until I arrive back at the
> original buffer line, then take a look at where the markers wound
> up.  Unfortunately, this does not work.

I don't think there's a good generic answer.  But in your case, IIUC you're
working on a single line, so maybe you can use something like:
- encode the buffer line to a sequence of bytes.
- figure out the error location(s).
- insert newlines at each error location.
- decode the sequence of bytes back to the "original buffer line"
  plus newlines.
Obviously, you can't use `undo' here, so this only works if encode+decode is
a no-op, i.e. it round-trips the line exactly (which is sadly not always the
case with Emacs coding-systems).
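
The steps above can be sketched outside Emacs to show the idea.  This is
only an illustration in Python, not Emacs Lisp: the function name
`map_byte_offsets_to_chars' is hypothetical, and it assumes an
ASCII-compatible encoding such as UTF-8, a line that contains no newline,
and error offsets that fall on character boundaries.

```python
def map_byte_offsets_to_chars(line, byte_offsets, encoding="utf-8"):
    """Return the character index in `line` for each byte offset.

    Assumptions: `line` has no newline, the encoding is ASCII-compatible,
    and every offset falls on a character boundary.
    """
    # Step 1: encode the line to a sequence of bytes.
    raw = line.encode(encoding)
    # The trick only works if encode+decode round-trips the line.
    if raw.decode(encoding) != line:
        raise ValueError("encode+decode is not a no-op for this line")

    # Steps 2-3: insert a newline at each (already known) error location.
    pieces, prev = [], 0
    for off in sorted(byte_offsets):
        pieces.append(raw[prev:off])
        prev = off
    pieces.append(raw[prev:])

    # Step 4: decode back to the original line plus newlines.
    marked = b"\n".join(pieces).decode(encoding)

    # Each newline marks an error; subtract the newlines already seen
    # to get an index into the original (newline-free) line.
    positions = []
    for i, ch in enumerate(marked):
        if ch == "\n":
            positions.append(i - len(positions))
    return positions


if __name__ == "__main__":
    print(map_byte_offsets_to_chars("naïve café", [4, 10]))
```

If an offset lands in the middle of a multibyte character, the decode step
raises an error in this sketch, which is the same failure mode as a
coding-system for which encode+decode is not a no-op.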

> If the conversion functions could be given a (sorted) array of string
> positions, and would record where those positions moved to upon
> conversion, this would help quite a bit.

Note that you won't be able to use `undo' in any case, because `undo' will
just replace the canonical string with the unencoded string that was there
before.  That has the same effect as decoding, but it is done with a single
insert+delete and no knowledge of coding-systems, so it cannot properly
track markers that are in the middle of the changed text: those will end up
either at the beginning or at the end of the changed text.
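
To make the marker problem concrete, here is a toy model in Python of what
a single delete+insert does to positions inside the replaced text.  It is
not how Emacs implements markers; `replace_region' and its `advance' flag
are hypothetical names, with `advance' loosely standing in for a marker's
insertion type.

```python
def replace_region(markers, start, end, new_len, advance=False):
    """Toy model: replace the text in [start, end) with new_len characters
    using one delete followed by one insert, and relocate the markers."""
    removed = end - start
    result = []
    for m in markers:
        # Delete [start, end): markers inside collapse to `start`,
        # markers after the region shift left.
        if m > end:
            m -= removed
        elif m >= start:
            m = start
        # Insert new_len characters at `start`: markers after it shift
        # right; markers exactly at `start` move only if `advance`.
        if m > start or (m == start and advance):
            m += new_len
        result.append(m)
    return result


if __name__ == "__main__":
    # Markers at 4 and 6 sit inside the replaced region [2, 8).
    print(replace_region([1, 4, 6, 12], 2, 8, 5))
    print(replace_region([1, 4, 6, 12], 2, 8, 5, advance=True))
```

In both calls the two markers that were inside the region end up at the
same position, either the beginning or the end of the new text, so their
relative offsets are lost.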


        Stefan