From: Mike Gran
Newsgroups: gmane.lisp.guile.devel
Subject: Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
Date: Mon, 6 Sep 2010 09:28:03 -0700 (PDT)
Message-ID: <998452.84210.qm@web37903.mail.mud.yahoo.com>
To: Andy Wingo, guile-devel

> From: Andy Wingo
> [...]
> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them now.
> Specifically:
>
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
>
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword
>
> We probably also need the "n" variants.
> [...]
> So then we need, I think:
>
>   scm_to_utf8_string
>   scm_to_utf16_string
>   scm_to_utf32_string
>
> We need the "n" variants here too (perhaps more).

Some of this is already in the bytevectors module, but perhaps not in an
easy form for C source code.

It would be easy enough to do, but there is a failure case to consider for
scm_from_utf8_string. The C utf8 string could contain incorrectly
encoded data. You could throw the encoding error, or you could replace
the bad utf8 with U+FFFD or the question mark.

The bytevector's utf8->string always throws encoding-error. Maybe
that's good enough. Otherwise, perhaps something like

  scm_from_utf8_stringn (str, len, error_or_replace_strategy)

If you didn't mind the overhead of calling the somewhat heavyweight
scm_{to,from}_stringn, these could be macros or inline functions that
wrap that.

-Mike
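
To make the failure case concrete, here is a minimal standalone sketch of the two strategies discussed above (fail on bad input, or substitute U+FFFD). It does not use libguile at all, and every name in it (utf8_error_strategy, utf8_decode_one, utf8_to_utf32) is hypothetical and chosen only for illustration. It is not a proposed implementation of scm_from_utf8_stringn.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strategy flag; not part of any Guile API. */
typedef enum {
  UTF8_ON_ERROR_FAIL,     /* stop and report failure */
  UTF8_ON_ERROR_REPLACE   /* substitute U+FFFD for each bad byte */
} utf8_error_strategy;

/* Decode one code point from s[0..len), with len >= 1.
   Returns the number of bytes consumed, or 0 if the sequence is malformed. */
static size_t
utf8_decode_one (const unsigned char *s, size_t len, uint32_t *out)
{
  uint32_t cp, min;
  size_t need, i;

  if (s[0] < 0x80) { *out = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; need = 2; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; need = 3; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; need = 4; min = 0x10000; }
  else return 0;                      /* stray continuation or invalid lead byte */

  if (len < need) return 0;           /* truncated sequence */
  for (i = 1; i < need; i++)
    {
      if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
      cp = (cp << 6) | (s[i] & 0x3F);
    }
  /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;

  *out = cp;
  return need;
}

/* Decode str[0..len) into out[0..cap).  Returns the number of code points
   written, or -1 if the input is malformed under UTF8_ON_ERROR_FAIL or if
   out is too small. */
long
utf8_to_utf32 (const char *str, size_t len, uint32_t *out, size_t cap,
               utf8_error_strategy strategy)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t i = 0, n = 0;

  while (i < len)
    {
      uint32_t cp;
      size_t used = utf8_decode_one (s + i, len - i, &cp);

      if (used == 0)
        {
          if (strategy == UTF8_ON_ERROR_FAIL) return -1;
          cp = 0xFFFD;                /* replacement character */
          used = 1;                   /* skip one bad byte and resynchronize */
        }
      if (n == cap) return -1;
      out[n++] = cp;
      i += used;
    }
  return (long) n;
}
```

In a real Guile function the -1 return would presumably become a thrown encoding error, and the replacement branch could just as well emit '?' instead of U+FFFD. Skipping a single byte after a bad sequence is one of several possible resynchronization policies; it is assumed here only because it is the simplest to write down.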