From: Eli Zaretskii <eliz@gnu.org>
To: Daniel Krueger <keenbug@googlemail.com>
Cc: guile-user@gnu.org, ttn@gnuvola.org, sunjoong@gmail.com
Subject: Re: I'm looking for a method of converting a string's character encoding
Date: Sat, 28 Apr 2012 23:55:32 +0300
Message-ID: <834ns37f0b.fsf@gnu.org>
In-Reply-To: <CAAh5vOP=ZDsPMSxV49r6uxomr6D3D2-8wP-9OcMQrCzWTE5Sew@mail.gmail.com>
> Date: Sat, 28 Apr 2012 20:29:22 +0200
> From: Daniel Krueger <keenbug@googlemail.com>
> Cc: guile-user@gnu.org, Sunjoong Lee <sunjoong@gmail.com>
>
> I think there shouldn't be any transcoding of Guile's strings, as
> strings are an internal representation of characters, no matter how they
> are encoded. So the only time encoding matters is when a string crosses
> its `internal boundaries', i.e. when you write the string to a port,
> read it from a port, or pass it as a string to a foreign library. For
> ports all transcoding is available and, as said, the real
> representation of Guile strings internally is UTF-8, which can't be
> changed. The only additional thing I forgot about is bytevectors: if
> you convert a string to an explicit representation, afaik you can
> also give the encoding to use there.
>
> Am I wrong?
You are mostly right, but only "mostly". Experience teaches that
sometimes you need to change encoding even inside "the boundaries".
One notable example is when the original encoding was determined
incorrectly, and the application wants to "re-decode" the string, when
its external origin is no longer available. Another example is an
application that wants to convert an encoded string into base-64 (or
similar) form -- you'll need to encode the string internally first.
These kinds of rare, but still important, use cases are the reason why
Emacs Lisp has primitives to do encoding and decoding of in-memory
strings; as much as Emacs maintainers want to get rid of the related
need to support "unibyte strings", they are not going to go away any
time soon.
IOW, Guile needs a way to represent a string encoded in something
other than UTF-8, and convert between UTF-8 and other encodings.
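
The two use cases mentioned above (re-decoding a string whose original
encoding was guessed wrong, and encoding a string in memory before
turning it into base-64) can be sketched as follows. This is only an
illustration in Python, not Guile or Emacs Lisp code; the helper names
`redecode` and `to_base64` are hypothetical, and the re-decoding trick
assumes the wrong decoding was lossless (as it is with Latin-1, which
maps every byte to a character).

```python
import base64


def redecode(text: str, wrong: str, right: str) -> str:
    """Undo a wrong decode: recover the original bytes by encoding with
    the charset that was wrongly assumed, then decode them correctly.
    Only works if the wrong decode lost no information."""
    return text.encode(wrong).decode(right)


def to_base64(text: str, encoding: str) -> str:
    """Encode the string in memory first, then base-64 the resulting
    bytes. The output depends on which encoding is chosen."""
    return base64.b64encode(text.encode(encoding)).decode("ascii")


# UTF-8 bytes that were mistakenly decoded as Latin-1 on input.
garbled = "café".encode("utf-8").decode("latin-1")

# The external origin is gone; only the in-memory string is available.
fixed = redecode(garbled, "latin-1", "utf-8")

# The same character yields different base-64 text per encoding.
b64_utf8 = to_base64("é", "utf-8")
b64_latin1 = to_base64("é", "latin-1")
```

Both helpers operate purely on in-memory data, with no port or file
involved, which is the situation described above.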