From: Daniel Krueger
Newsgroups: gmane.lisp.guile.user
Subject: Re: I'm looking for a method of converting a string's character encoding
Date: Mon, 30 Apr 2012 12:18:59 +0200
To: Eli Zaretskii
Cc: guile-user@gnu.org, ttn@gnuvola.org, sunjoong@gmail.com

On Sat, Apr 28, 2012 at 10:55 PM, Eli Zaretskii wrote:
> One notable example is when the original encoding was determined
> incorrectly, and the application wants to "re-decode" the string, when
> its external origin is no longer available.

Okay, but then I would suggest one of two things. If you know that you
are probably not getting the right encoding but can determine it later,
store the input only as a bytevector and decode it correctly later. Or,
if you already have the string, you could encode it back to a
bytevector with the wrongly guessed encoding (which should reproduce
the original input, I think) and then re-decode it with the right
encoding.
Wouldn't that be the same solution as adding a primitive which does the
same thing, only at a lower level?

> Another example is an
> application that wants to convert an encoded string into base-64 (or
> similar) form -- you'll need to encode the string internally first.

Here I don't have enough experience, but wouldn't you then just
transform the string into a bytevector again and work with that from
there?

> IOW, Guile needs a way to represent a string encoded in something
> other than UTF-8, and convert between UTF-8 and other encodings.

I think strings should be encoding `independent', so that you don't
have to care about encodings unless you need to. If you're working with
a specific encoding, you're working on a representation of the `text'
as a sequence of characters encoded as numbers, so you use a
bytevector. The only thing I'm not sure about is whether Guile supports
encoding a string (into a bytevector) in some format other than UTF-8.
If no such procedures exist, I would suggest adding a
string-to-bytevector procedure which takes an encoder, along with the
encoders themselves (or just procedures which convert the string
directly into a bytevector in a specific encoding).

WDYT?
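To make the "encode back, then re-decode" idea concrete, here is a
minimal sketch. It assumes a Guile release that ships the (ice-9 iconv)
module with `string->bytevector` and `bytevector->string`; that module
is not mentioned in this thread, so treat it as an assumption. The
sample bytes and the variable names are made up for illustration.

```scheme
(use-modules (ice-9 iconv)          ; string->bytevector, bytevector->string
             (rnrs bytevectors))    ; u8-list->bytevector, bytevector=?

;; Hypothetical input: the UTF-8 bytes for "cafe" with an e-acute
;; (c, a, f, then the two bytes #xC3 #xA9).
(define original-bytes
  (u8-list->bytevector '(#x63 #x61 #x66 #xC3 #xA9)))

;; Step 1: the input gets decoded with a wrongly guessed encoding.
;; ISO-8859-1 maps every byte to one character, so this yields a
;; five-character string instead of the intended four.
(define misdecoded
  (bytevector->string original-bytes "ISO-8859-1"))

;; Step 2: encode the string back with the same wrong guess to get
;; the original bytes again.
(define recovered-bytes
  (string->bytevector misdecoded "ISO-8859-1"))

;; Step 3: decode those bytes with the right encoding.
(define corrected
  (bytevector->string recovered-bytes "UTF-8"))
```

The round trip in step 2 is only guaranteed to give back the original
bytes when the wrongly guessed encoding maps every byte to a character,
as ISO-8859-1 does. If the wrong guess was something like UTF-8,
decoding may already have failed or lost information, and then the
bytes cannot be recovered from the string alone.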