unofficial mirror of help-gnu-emacs@gnu.org
* chinese encoded in UTF-8 and XML
@ 2003-09-25 20:05 Knackeback
  2003-09-25 20:32 ` Andreas Prilop
  2003-09-26  2:52 ` Micah Cowan
  0 siblings, 2 replies; 7+ messages in thread
From: Knackeback @ 2003-09-25 20:05 UTC


Hi, I wrote an XML file with GNU Emacs 21.2.2 whose content includes
Chinese characters encoded in UTF-8.
I wrote something like:

<?xml version="1.0" encoding="UTF-8"?>
<test>
<chinese>撒</chinese>
<chinese>鰓</chinese>
</test>

and then I used "C-x RET f" and chose utf-8.
Then I typed "C-x C-s" to save my file.
I hope this is the right way in Emacs to store the content
as UTF-8 encoded text?
I then tried to parse the file with xmllint, a small
XML parser program that comes with libxml2.
The parser complains that the second "Chinese line" is not proper
UTF-8.

==>

uhu:4: error: Input is not proper UTF-8, indicate encoding !
<chinese>鰓</chinese>
         ^
uhu:4: error: Bytes: 0xC4 0xCE 0x3C 0x2F
<chinese>鰓</chinese>

Interestingly, the parser only complains about the second
Chinese line.
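
The parser's complaint can be double-checked by running the four bytes from
the error message through a strict UTF-8 decoder. What follows is only a
minimal Python sketch and not part of my setup; the helper name
first_invalid_utf8_offset is made up for illustration. In UTF-8, a lead byte
such as 0xC4 must be followed by a continuation byte in the range 0x80-0xBF,
and 0xCE is not one, so the check should fail on the very first byte.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first invalid UTF-8 sequence, or None if valid.

    Hypothetical helper: it relies on Python's strict UTF-8 decoder.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# The four bytes quoted in the xmllint error message.
reported = bytes([0xC4, 0xCE, 0x3C, 0x2F])
print(first_invalid_utf8_offset(reported))  # 0: the sequence is invalid from its first byte

# A line that really is UTF-8 encoded passes the same check.
good_line = "<chinese>撒</chinese>".encode("utf-8")
print(first_invalid_utf8_offset(good_line))  # None
```

The same helper could be given the contents of the saved file, read in binary
mode, to find the offset at which the file stops being valid UTF-8.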

I'm anxious to see an explanation!
