From: Stephen Compall <s11@member.fsf.org>
Cc: guile-devel@gnu.org
Subject: Which Encoding? (was Re: Unicode and Guile)
Date: 26 Oct 2003 12:34:47 +0000
Message-ID: <xfyd6ckb2jc.fsf_-_@csserver.evansville.edu>
In-Reply-To: <200310260003.RAA10375@morrowfield.regexps.com>
Tom Lord <lord@emf.net> writes:
> It's culturally discriminatory to regard utf-16 as worse than utf-8
> in those regards.
>
> Or, put differently, for many potential users, utf-16 is the best of
> both worlds: it optimizes the size of the most common characters
> (for some users), and it can also handle any Unicode character.
That's the thing -- it can't, at least not thinking in fixed-width
terms, which was my goal in suggesting UCS-4. A single 16-bit unit
covers the Basic Multilingual Plane, but Unicode defines code points
beyond 16 bits (up to U+10FFFF), and characters assigned there need
more than one unit.
I say it's the worst of both worlds (from the C API user's point of
view), because you lose ASCII compatibility for 7-bit code points,
*and* you still need surrogate pairs (i.e. variable width) for code
points above 65535 (which is the difference between UTF-16 and
UCS-2).
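
To make the variable-width point concrete, here is a minimal C sketch
of how a code point is turned into UTF-16 units. The function name
utf16_encode is hypothetical (it is not part of Guile or any library I
am describing); it only illustrates that anything above 65535 takes
two 16-bit units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only.
   Writes the UTF-16 encoding of code point cp into out and returns the
   number of 16-bit units used (1 or 2), or 0 if cp cannot be encoded. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    /* U+D800..U+DFFF are reserved for surrogates; nothing lies above U+10FFFF. */
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    if (cp < 0x10000) {
        /* Basic Multilingual Plane: one unit, same as UCS-2. */
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Supplementary planes: split the 20-bit offset into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high (lead) surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low (trail) surrogate */
    return 2;
}
```

So 'A' (U+0041) occupies one unit, while U+1D11E (a musical symbol
outside the 16-bit range) occupies two.
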
UTF-16 suffers the same problem as UTF-8: programmers may be tempted
to treat the data as an array of fixed-width 16-bit characters (8-bit
for UTF-8), which breaks on surrogate pairs (or on multibyte
sequences, for UTF-8).
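
As a rough illustration of that trap, the following C sketch counts
code points in a UTF-16 buffer. The name utf16_count_code_points is
made up for this example, and counting an unpaired surrogate as one
code point is my assumption, not a rule from any standard API. The
point is only that the number of 16-bit units and the number of
characters diverge as soon as a surrogate pair appears.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only.
   Returns the number of code points in units[0..n_units-1]. */
static size_t utf16_count_code_points(const uint16_t *units, size_t n_units)
{
    size_t count = 0;
    size_t i = 0;

    assert(units != NULL || n_units == 0);

    while (i < n_units) {
        uint16_t u = units[i];

        /* A high surrogate followed by a low surrogate is one code point. */
        if (u >= 0xD800 && u <= 0xDBFF
            && i + 1 < n_units
            && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF)
            i += 2;
        else
            i += 1; /* BMP unit, or an unpaired surrogate (assumption: counts as one) */

        count++;
    }
    return count;
}
```

For the buffer { 0x0041, 0xD834, 0xDD1E, 0x0042 } a naive "length in
units" gives 4, while the function above gives 3.
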
If you want to assume that Unicode will never grow out of the 16-bit
set, then UCS-2 would be a much better choice than UTF-16, IMHO. That
way, it is clear that C programs need only deal with fixed-width,
16-bit characters.
--
Stephen Compall or s11 or sirian
Since a politician never believes what he says, he is surprised
when others believe him.
-- Charles DeGaulle
Ft. Meade Lexis-Nexis smuggle virus BROMURE JSOFC3IP emc plutonium
electronic surveillance quarter number key offensive information
warfare fraud Albania Khaddafi