unofficial mirror of emacs-devel@gnu.org 
* Mule problem.
From: David Kastrup @ 2004-08-20 20:05 UTC



What is the most efficient way to encode or decode regions in a
buffer while keeping markers intact?

Or, put somewhat differently: assuming that I have a conceptual
marker at every character, what happens to those markers?  Effectively
I need to correlate error messages from a TeX compiler run with text
in the source buffer, and TeX has the bad habit of looking at bytes
rather than characters in its source, and also of transcribing some
input characters into hexadecimal instead of keeping them.

So I basically have to take the buffer line, convert it into a
canonical form based on the byte sequence, take all the error
messages, convert them into canonical form, too, correlate the errors
in the messages with the canonical form, and then convert everything
back.
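
A minimal sketch of what such a canonical form could look like, in
Python rather than Emacs Lisp purely for illustration.  It assumes that
TeX shows every byte outside printable ASCII as `^^xx` with two
lowercase hex digits; the exact notation depends on the TeX setup, and
the names `to_canonical` and `canonical_line` are hypothetical.

```python
def to_canonical(data: bytes) -> str:
    """Render DATA as printable ASCII, writing every other byte as ^^xx.

    Assumption: ^^xx with lowercase hex is the notation TeX uses in its
    messages; a literal "^^" in the source would be ambiguous here.
    """
    parts = []
    for b in data:
        if 0x20 <= b < 0x7F:
            parts.append(chr(b))        # printable ASCII is kept as-is
        else:
            parts.append("^^%02x" % b)  # everything else becomes ^^xx
    return "".join(parts)


def canonical_line(line: str, encoding: str = "utf-8") -> str:
    """Canonical form of a buffer line: encode to bytes, then transcribe."""
    return to_canonical(line.encode(encoding))
```

Passing the raw bytes of an error message through `to_canonical` leaves
any `^^xx` that TeX already wrote untouched and transcribes the bytes
TeX kept, so both sides end up in the same form and can be compared.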

It would be most efficient if I could just place markers at the
positions of all errors, and then call undo repeatedly until I arrive
back at the original buffer line, then take a look at where the markers
wound up.  Unfortunately, this does not work.

If the conversion functions could be given a (sorted) array of string
positions, and would record where those positions moved to upon
conversion, this would help quite a bit.
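
A rough sketch of the kind of interface meant here, in Python rather
than Emacs Lisp purely for illustration.  The function name
`encode_with_positions` is hypothetical, and the approach assumes a
stateless encoding such as UTF-8 or Latin-1, where encoding a string
piece by piece gives the same bytes as encoding it in one go.

```python
def encode_with_positions(text, positions, encoding="utf-8"):
    """Encode TEXT and report where each character position moved to.

    POSITIONS must be sorted in ascending order.  Returns the encoded
    bytes and a list of byte offsets, one per input position.
    Assumption: ENCODING is stateless and emits no byte order mark.
    """
    out = bytearray()
    moved = []
    prev = 0
    for p in positions:
        # Encode only the stretch since the previous position, so the
        # whole string is encoded exactly once.
        out += text[prev:p].encode(encoding)
        moved.append(len(out))
        prev = p
    out += text[prev:].encode(encoding)
    return bytes(out), moved
```

For example, `encode_with_positions("aéb", [0, 1, 2, 3])` returns the
four bytes of the UTF-8 encoding together with `[0, 1, 3, 4]`: the
positions after the two-byte character shift by one.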

Any idea how to do this at the moment?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


* Re: Mule problem.
From: Stefan Monnier @ 2004-08-20 20:31 UTC
  Cc: emacs-devel

> So I basically have to take the buffer line, convert it into a
> canonical form based on the byte sequence, take all the error
> messages, convert them into canonical form, too, correlate the errors
> in the messages with the canonical form, and then convert everything
> back.

> It would be most efficient if I could just place markers at the
> positions of all errors, and then call undo repeatedly until I arrive
> back at the original buffer line, then take a look at where the markers
> wound up.  Unfortunately, this does not work.

I don't think there's a good generic answer.  But in your case, IIUC you're
working on a single line, so maybe you can use something like:
- encode the buffer line to a sequence of bytes.
- figure out the error location(s).
- insert newlines at each error location.
- decode the sequence of bytes back to the "original buffer line"
  plus newlines.
Obviously, you can't use `undo' here, so encode followed by decode must be
a no-op (which is sadly not always the case with Emacs coding-systems).
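
The steps above can be sketched as follows, in Python rather than
Emacs Lisp purely for illustration.  The function name
`char_positions_of_errors` is hypothetical; the sketch assumes the line
itself contains no newline and that every byte offset falls on a
character boundary (a newline dropped into the middle of a multibyte
sequence would make the decode fail).

```python
def char_positions_of_errors(line, byte_offsets, encoding="utf-8"):
    """Map byte offsets in the encoded LINE back to character positions.

    Uses newline bytes as stand-ins for markers: insert one at each
    error offset, decode, and read the positions off the result.
    """
    if "\n" in line:
        raise ValueError("line must not contain a newline")
    marked = bytearray(line.encode(encoding))
    # Insert from the right so that earlier offsets stay valid.
    for off in sorted(byte_offsets, reverse=True):
        marked[off:off] = b"\n"
    decoded = marked.decode(encoding)
    # Encode followed by decode has to be a no-op for this to be valid.
    if decoded.replace("\n", "") != line:
        raise ValueError("encode/decode did not round-trip")
    positions = []
    for i, ch in enumerate(decoded):
        if ch == "\n":
            # Subtract the newlines already seen to get the index in LINE.
            positions.append(i - len(positions))
    return positions
```

The returned positions are in ascending order, one per byte offset.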

> If the conversion functions could be given a (sorted) array of string
> positions, and would record where those positions moved to upon
> conversion, this would help quite a bit.

Note that you won't be able to use `undo' in any case, because `undo' will
just replace the canonical string with the unencoded string that was there
before.  This has the same effect as decoding, but is done with a single
insert+delete and no knowledge of coding-systems, so it cannot properly
track markers that are in the middle of the changed text: those will end up
either at the beginning or at the end of the changed text.


        Stefan
