From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Wingo
Newsgroups: gmane.lisp.guile.devel
Subject: Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
Date: Mon, 06 Sep 2010 18:58:04 +0200
References: <998452.84210.qm@web37903.mail.mud.yahoo.com>
In-Reply-To: <998452.84210.qm@web37903.mail.mud.yahoo.com> (Mike Gran's message of "Mon, 6 Sep 2010 09:28:03 -0700 (PDT)")
To: Mike Gran
Cc: guile-devel
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Greetings,

On Mon 06 Sep 2010 18:28, Mike Gran writes:

> there is a failure case to consider for scm_from_utf8_string. The C
> utf8 string could contain incorrectly encoded data.

There is the analogous case of scm_to_locale_string, if the string is
not encodable in the current locale.

> You could throw the encoding error, or you could replace the
> bad utf8 with U+FFFD or the question mark.
>
> The bytevector's utf8->string always throws encoding-error.
> Maybe that's good enough.

Yeah, maybe so.

> Otherwise, perhaps something like
>
> scm_from_utf8_stringn (str, len, error_or_replace_strategy)
>
> If you didn't mind the overhead of calling the somewhat
> heavyweight scm_{to,from}_stringn, these could be macros
> or inline functions that wrap that.

Ah, I did not see scm_{to,from}_stringn. Cool!

I think scm_from_utf8_stringn et al should be proper functions, and
probably their initial implementations just call
scm_{to,from}_stringn. But we should at least do the straightforward
optimization for the latin1 case.

Cheers,

Andy
-- 
http://wingolog.org/
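
To make the two points in this thread concrete, here is a minimal standalone sketch in plain C. It does not use libguile, and every name in it (decode_strategy, decode_latin1, decode_utf8, decode_one) is hypothetical. It assumes the output is an array of code points and that, under the "replace" strategy, each offending byte becomes one U+FFFD; a real implementation might resynchronize differently. It shows why the latin1 case has no failure mode, while the utf8 case needs the error-or-replace choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is libguile API. */
enum decode_strategy { DECODE_ERROR, DECODE_REPLACE };

#define REPLACEMENT_CHAR 0xFFFDu

/* Latin-1 fast path: every byte already is a code point, so there is
   nothing to validate and the conversion cannot fail.  */
static size_t
decode_latin1 (const unsigned char *in, size_t len, uint32_t *out)
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < len; i++)
    out[i] = in[i];
  return len;
}

/* Decode one UTF-8 sequence at IN.  Store the code point in *CP and
   return the number of bytes consumed (1..4), or 0 if malformed.  */
static size_t
decode_one (const unsigned char *in, size_t len, uint32_t *cp)
{
  unsigned char b = in[0];
  size_t need;
  uint32_t c, min;

  if (b < 0x80) { *cp = b; return 1; }
  else if ((b & 0xE0) == 0xC0) { need = 1; c = b & 0x1F; min = 0x80; }
  else if ((b & 0xF0) == 0xE0) { need = 2; c = b & 0x0F; min = 0x800; }
  else if ((b & 0xF8) == 0xF0) { need = 3; c = b & 0x07; min = 0x10000; }
  else return 0;                 /* stray continuation or invalid lead */

  if (len < need + 1)
    return 0;                    /* truncated sequence */
  for (size_t i = 1; i <= need; i++)
    {
      if ((in[i] & 0xC0) != 0x80)
        return 0;                /* not a continuation byte */
      c = (c << 6) | (in[i] & 0x3F);
    }
  /* Reject overlong forms, out-of-range values, and surrogates.  */
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;
  *cp = c;
  return need + 1;
}

/* Decode LEN bytes of UTF-8 into OUT, which must have room for LEN
   code points.  Return the number of code points written, or -1 if
   the input is malformed and STRATEGY is DECODE_ERROR.  */
static long
decode_utf8 (const unsigned char *in, size_t len, uint32_t *out,
             enum decode_strategy strategy)
{
  size_t i = 0, n = 0;

  assert (in != NULL && out != NULL);
  while (i < len)
    {
      uint32_t cp;
      size_t used = decode_one (in + i, len - i, &cp);
      if (used == 0)
        {
          if (strategy == DECODE_ERROR)
            return -1;
          cp = REPLACEMENT_CHAR; /* assumed policy: one U+FFFD per bad byte */
          used = 1;
        }
      out[n++] = cp;
      i += used;
    }
  return (long) n;
}
```

With the bytes 'a', 0xFF, 'b' as input, decode_utf8 returns -1 under DECODE_ERROR and writes three code points (the middle one U+FFFD) under DECODE_REPLACE, whereas decode_latin1 accepts any byte sequence unchanged.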