Thanks Thien-Thi, Daniel and Eli.

Eli gave a good example; I'll add another one. In countries whose
characters are encoded as multibyte sequences, like China, Japan and
Korea (i.e., the CJK countries), converting between codesets is a
common issue. In Korea, one web page may be written in the EUC-KR
codeset and another in UTF-8. In Japan: Shift-JIS, EUC-JP, ISO-2022-JP
and UTF-8. In China: GBK, GB18030, Big5, Big5-HKSCS and UTF-8. I mean
that on the net Koreans use 2 different codesets, Japanese 4 and
Chinese 5.

It is probably rare for one program to compare a Chinese web page with
a Korean one, but... Suppose you want to write a program that monitors
web pages; it would need a codeset converter.

Is this just a CJK issue? Greeks use 3 codesets, Vietnamese 2, Arabs 3,
and so on. It looks like Russians use many codesets, like the Chinese.

2012/4/29 Eli Zaretskii

> > Date: Sat, 28 Apr 2012 20:29:22 +0200
> > From: Daniel Krueger
> > Cc: guile-user@gnu.org, Sunjoong Lee
> >
> > i think there shouldn't be any transcoding of guile's strings, as
> > strings are internal representation of characters, no matter how they
> > are encoded. So the only time when encoding matters is when it passes
> > it's `internal boundarys', i mean if you write the string to a port or
> > read from a port or pass it as a string to a foreign library. For the
> > ports all transcoding is available, and as said, the real
> > representation of guile strings internally is as utf8, which can't be
> > changed. The only additional thing i forgot about are bytevectors, if
> > you convert a string to an explicit representation, but afaik there
> > you also can give the encoding to use.
> >
> > Am I wrong?
>
> You are mostly right, but only "mostly". Experience teaches that
> sometimes you need to change encoding even inside "the boundaries".
> One notable example is when the original encoding was determined
> incorrectly, and the application wants to "re-decode" the string, when
> its external origin is no longer available.
> Another example is an application that wants to convert an encoded
> string into base-64 (or similar) form -- you'll need to encode the
> string internally first.
>
> These kinds of rare, but still important, use cases are the reason why
> Emacs Lisp has primitives to do encoding and decoding of in-memory
> strings; as much as Emacs maintainers want to get rid of the related
> need to support "unibyte strings", they are not going to go away any
> time soon.
>
> IOW, Guile needs a way to represent a string encoded in something
> other than UTF-8, and convert between UTF-8 and other encodings.
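
To make the two cases Eli describes a bit more concrete, here is a
minimal sketch. It is in Python rather than Guile, only because its
codec calls are easy to check; the helper names `redecode` and
`to_base64` are hypothetical, made up for illustration, and are not a
proposed Guile API. The re-decode step assumes the wrong guess was a
lossless single-byte codec such as ISO-8859-1; with a lossy guess the
original bytes may not be recoverable.

```python
import base64


def redecode(text, wrong_encoding, right_encoding):
    """Undo a wrong decode when only the in-memory string is left.

    Assumption: wrong_encoding maps every byte to a character (as
    ISO-8859-1 does), so encoding the string again restores the
    original bytes exactly.
    """
    raw = text.encode(wrong_encoding)
    return raw.decode(right_encoding)


def to_base64(text, encoding):
    """Base-64 works on bytes, so the string is encoded first."""
    return base64.b64encode(text.encode(encoding)).decode("ascii")


# Bytes as they might arrive from a Korean page served in EUC-KR.
original = "안녕하세요"
raw = original.encode("euc_kr")

# Case 1: the codeset was guessed wrong, giving mojibake in memory.
garbled = raw.decode("latin-1")
repaired = redecode(garbled, "latin-1", "euc_kr")

# Case 2: the same text gives different base-64 per codeset.
b64_euckr = to_base64(original, "euc_kr")
b64_utf8 = to_base64(original, "utf-8")
```

Both helpers need to go from characters back to bytes in a chosen
codeset without touching a port, which is the kind of in-memory
conversion being asked for here.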