unofficial mirror of emacs-devel@gnu.org 
* How to create a derived encoding?
@ 2004-10-12  0:10 David Kastrup
  2004-10-12 15:09 ` Stefan Monnier
From: David Kastrup @ 2004-10-12  0:10 UTC (permalink / raw)



After considerable thought about the problem, I have arrived at the
conclusion that, for efficiency's sake, I'd like to have an encoding
such as tex-utf-8 that is derived from the normal utf-8, except that
sequences like ^^8a are converted into the corresponding byte before
the bytes are combined into Unicode characters.  It would be a bonus if
such sequences stayed unchanged whenever this composition does not
yield a valid Unicode character, but that is not essential.

The problem is that TeX has no clue about _characters_, but works on
byte streams, and it has the habit of transliterating some byte codes
in the above manner.  Treating the output of TeX sensibly means
converting those transliterations back into bytes _before_ assembling
Unicode characters.
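
To make the required order of operations concrete, here is a minimal
sketch in Python (for illustration only; it is not Emacs code and not
the preview-latex hack mentioned in this message).  The names
undo_caret_notation, decode_tex_utf8 and the "tex-caret" error handler
are hypothetical.  The sketch assumes that only the form ^^ plus two
lowercase hex digits with the high bit set (^^80 to ^^ff) needs to be
undone.  Bytes that do not form valid UTF-8 are written back in caret
notation, which approximates the "bonus" behaviour; a raw invalid byte
that TeX happened to emit unescaped would also end up in that notation.

```python
import codecs
import re

# Assumption: only ^^80..^^ff (two lowercase hex digits) are undone here.
_CARET_HEX = re.compile(rb"\^\^([89a-f][0-9a-f])")


def undo_caret_notation(raw: bytes) -> bytes:
    """Replace each ^^xx sequence with the single byte it stands for."""
    return _CARET_HEX.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def _restore_caret(exc):
    """Decoding error handler: write invalid bytes back as ^^xx."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join("^^%02x" % b for b in bad), exc.end


# "tex-caret" is a made-up handler name used only in this sketch.
codecs.register_error("tex-caret", _restore_caret)


def decode_tex_utf8(raw: bytes) -> str:
    """Undo the caret notation first, then assemble UTF-8 characters."""
    return undo_caret_notation(raw).decode("utf-8", errors="tex-caret")
```

For example, decode_tex_utf8(b"na^^c3^^afve") yields "naïve", while
decode_tex_utf8(b"x^^8ay") leaves the lone ^^8a untouched because 0x8a
on its own is not valid UTF-8.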

The same problem occurs with unibyte non-ASCII encodings like Latin-1.
I already have one (rather inefficient) hack to deal with that in
preview-latex, but it does not extend easily to multibyte encodings.

So if there were a tolerably workable way to derive a special encoding
(to be used as a process output encoding) that converts control
sequences like the above back into bytes before composing Unicode
characters from the resulting utf-8 stream, this would appear to be by
far the fastest and most convenient way to go about this problem.

Any hints on how to derive a suitably augmented encoding from an
existing one?

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum


Thread overview: 6+ messages
2004-10-12  0:10 How to create a derived encoding? David Kastrup
2004-10-12 15:09 ` Stefan Monnier
2004-10-12 15:27   ` David Kastrup
2004-10-12 16:23     ` Stefan Monnier
2004-10-12 21:02       ` David Kastrup
2004-10-14 11:12         ` Oliver Scholz
