coding tags and utf-16

* coding tags and utf-16
@ 2005-12-21  8:00 Werner LEMBERG
  2005-12-23 23:43 ` Werner LEMBERG
  2006-01-04  6:42 ` Kenichi Handa
From: Werner LEMBERG @ 2005-12-21  8:00 UTC

There is a serious problem with coding tags and utf-16 encodings of
any flavour: Emacs simply can't recognize the tag.  This is a
non-trivial problem.  Right now I'm working on a groff preprocessor
which tries to handle this.  I'm doing the following to find the tag
in an encoding-independent way:

  . Check whether the file starts with a BOM (Byte Order Mark), that
    is, with one of the following byte sequences:

      UTF-8:  0xEF 0xBB 0xBF
      UTF-16: 0xFE 0xFF (big-endian) or 0xFF 0xFE (little-endian)

    If so, skip it.

  . Ignore zero bytes while looking for the -*- coding: ... -*-
    tag.  In UTF-16, each ASCII character of the tag occupies two
    bytes, one of which is zero.

This heuristic algorithm might not give correct results in all cases,
but it should be sufficiently reliable for normal use.
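
For illustration only, the two steps above can be sketched in Python
roughly as follows.  The function name find_coding_tag, the 1024-byte
search limit, and the regular expression for the tag are assumptions
made for this sketch; they are not taken from the groff preprocessor
or from Emacs, whose tag syntax is more permissive.

```python
import re

# The BOM byte sequences listed above.
BOMS = (b"\xef\xbb\xbf", b"\xfe\xff", b"\xff\xfe")

# Hypothetical, simplified pattern for a "-*- coding: NAME -*-" tag.
# It only matches a tag that sits on a single line.
TAG_RE = re.compile(rb"-\*-.*?coding:\s*([A-Za-z0-9_.-]+).*?-\*-")


def find_coding_tag(data: bytes, limit: int = 1024):
    """Return the coding name found near the start of data, or None."""
    head = data[:limit]

    # Step 1: skip a leading BOM, if there is one.
    for bom in BOMS:
        if head.startswith(bom):
            head = head[len(bom):]
            break

    # Step 2: ignore zero bytes, so that a tag stored as UTF-16
    # collapses to the plain ASCII byte sequence.
    head = head.replace(b"\x00", b"")

    match = TAG_RE.search(head)
    return match.group(1).decode("ascii") if match else None


if __name__ == "__main__":
    line = '.\\" -*- coding: utf-16 -*-\n'
    for codec in ("utf-8", "utf-16-le", "utf-16-be"):
        print(codec, find_coding_tag(line.encode(codec)))
```

Dropping every zero byte is what makes the search independent of the
byte order, and it is also why this remains a heuristic: zero bytes
that belong to non-ASCII characters are dropped as well.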

    Werner
