From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Sunjoong Lee Newsgroups: gmane.lisp.guile.user Subject: I'm looking for a method of converting a string's character encoding Date: Sat, 28 Apr 2012 06:13:47 +0900 Message-ID: NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=f46d04428c9ce6f9ff04beaf9540 X-Trace: dough.gmane.org 1335561263 4840 80.91.229.3 (27 Apr 2012 21:14:23 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Fri, 27 Apr 2012 21:14:23 +0000 (UTC) To: guile-user@gnu.org Original-X-From: guile-user-bounces+guile-user=m.gmane.org@gnu.org Fri Apr 27 23:14:20 2012 Return-path: Envelope-to: guile-user@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1SNsUe-0004Sf-05 for guile-user@m.gmane.org; Fri, 27 Apr 2012 23:14:20 +0200 Original-Received: from localhost ([::1]:37997 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SNsUd-0007Hr-A6 for guile-user@m.gmane.org; Fri, 27 Apr 2012 17:14:19 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:55733) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SNsUY-0007H8-CF for guile-user@gnu.org; Fri, 27 Apr 2012 17:14:15 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SNsUW-0000lT-Jh for guile-user@gnu.org; Fri, 27 Apr 2012 17:14:13 -0400 Original-Received: from mail-wg0-f49.google.com ([74.125.82.49]:32893) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SNsUW-0000gT-AN for guile-user@gnu.org; Fri, 27 Apr 2012 17:14:12 -0400 Original-Received: by wgbds1 with SMTP id ds1so822427wgb.30 for ; Fri, 27 Apr 2012 14:14:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=G9jcQyEPNMZKrTeXWDRXQrH3f82eCEFkdfEUaO8NaDQ=; 
b=x2Wz+5nsBcOiqrvDxbnn681hIyUqrFf0WvAW1reSWivZtrhId8Lnuegdww16mjGnVl eMBpYw7BSKpoykOGLE/Kl+VuA6wPZrR8NilhKnPgpx4Uf9ibcWBSKG8pTz3FlpKqXMvR Y674nzsb7pMTaADj9QdH9NQoguC1Kyu4onz0w4rrWqmiBLwpJKF8XpnXK1tg6JDDRGXg oV+8PanJBOKNjNo5sx3ApGxWPjmFe/qUELe0GzYLEvsEY5ZcdJtCgQ5WVflGeqzYzvMY 0U10wuPWLcsjSuVz+ffRA4iwMsAktsYLY1zsHw4E6WC0Ax5sgm5Wa1WkkSvwsLUpenoR 4wHg== Original-Received: by 10.180.82.136 with SMTP id i8mr1079373wiy.19.1335561250339; Fri, 27 Apr 2012 14:14:10 -0700 (PDT) Original-Received: by 10.223.93.206 with HTTP; Fri, 27 Apr 2012 14:13:47 -0700 (PDT) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 74.125.82.49 X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane.org@gnu.org Original-Sender: guile-user-bounces+guile-user=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.user:9411 Archived-At:
--f46d04428c9ce6f9ff04beaf9540
Content-Type: text/plain; charset=UTF-8

Hello,

I'm looking for a way to convert a string's character encoding from some other codeset to UTF-8. As far as I know, Guile strings use UTF-8, and (read (open-bytevector-input-port (string->utf8 "hello"))) returns "hello". But what if the string "hello" is encoded in something other than UTF-8 and you want a UTF-8 converted string? What I want is something like iconv.

Background:

The #:decode-body? keyword of http-get does not seem to work properly; I have to set #:decode-body? to a false value and decode the body manually. If the web page's charset is UTF-8, there is no problem. If it is not, a problem occurs: decode-response-body in (web client) calls decode-string with the web page's charset, but the real charset of the bytevector is iso-8859-1, not the web page's charset. In that case you should not let http-get use decode-response-body.
After getting the response body as a bytevector, you should decode it as "iso-8859-1", the way decode-string does. You then get the web page's body as a string whose charset is the one you see in the response header.

Now I need to convert this body string to UTF-8, but I don't know how. I think it would be done with port I/O.

Thanks.
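
One possible way to do the conversion described above, as a minimal sketch only: it assumes a Guile version that ships the (ice-9 iconv) module (2.0.9 or later, so newer than the release this post was written against) and uses its bytevector->string and string->bytevector procedures. The helper name redecode-string and the sample data are hypothetical, chosen for illustration. The idea is that a string obtained by decoding bytes as ISO-8859-1 can be turned back into the original bytes without loss and then decoded again with the charset named in the response header.

```scheme
(use-modules (ice-9 iconv)        ; bytevector->string, string->bytevector
             (rnrs bytevectors))  ; string->utf8

;; Hypothetical helper: STR was produced by decoding raw bytes as
;; ISO-8859-1.  Recover those bytes, then decode them as REAL-CHARSET.
(define (redecode-string str real-charset)
  (bytevector->string (string->bytevector str "ISO-8859-1")
                      real-charset))

;; Sample text containing one non-ASCII character (U+00E9), built
;; without relying on the encoding of this source file.
(define original
  (string #\h (integer->char #xe9) #\l #\l #\o))

;; Simulate the situation from the post: the bytes are really UTF-8,
;; but they were decoded as ISO-8859-1.
(define raw-bytes (string->utf8 original))
(define mis-decoded (bytevector->string raw-bytes "ISO-8859-1"))

;; Re-decode with the real charset and compare with the original text.
(define fixed (redecode-string mis-decoded "UTF-8"))
(display (string=? fixed original))
(newline)
```

If the body is still available as a bytevector (for example when http-get is called with #:decode-body? set to a false value), calling bytevector->string on it with the real charset would skip the intermediate ISO-8859-1 string altogether.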
--f46d04428c9ce6f9ff04beaf9540--