From: Neil Jerram
Newsgroups: gmane.lisp.guile.devel
Subject: Re: SRFI-14 and locale settings
Date: Wed, 13 Sep 2006 19:07:23 +0100
Message-ID: <87y7snwsh0.fsf@ossau.uklinux.net>
References: <87y7t03ngn.fsf@laas.fr> <87slj89lrk.fsf@ossau.uklinux.net>
 <87wt8krocj.fsf@laas.fr> <87odtvkxl1.fsf@zip.com.au>
 <87r6yodtv3.fsf@laas.fr> <87ejun5kj7.fsf@zip.com.au>
 <877j095t91.fsf@laas.fr> <871wqhymo4.fsf@ossau.uklinux.net>
 <87lkooqie6.fsf@laas.fr>
In-Reply-To: <87lkooqie6.fsf@laas.fr> (Ludovic Courtès's message of
 "Wed, 13 Sep 2006 10:29:21 +0200")

ludovic.courtes@laas.fr (Ludovic Courtès) writes:

> Hi,
>
> Neil Jerram writes:
>
>> Yes.  So it seems to me, therefore, that we should not be using
>> isalpha() etc. to construct char-set:letter, but should instead hard
>> code it as the intersection of (char-set:letter as specified by SRFI
>> 14) with (the set of characters that Guile can represent).
>
> In practice, I can think of two ways to determine the set of _letters_
> available in the current encoding (which is what `char-set:letter'
> expects).
>
>   1. Since SRFI-14 lists all the characters that have to be added to the
>      ASCII `char-set:letter' to get the Latin-1 `char-set:letter', we
>      could somehow hard-code them.  But this is ugly.

I don't see why you think it's ugly.  If it's the right solution, it's
the right solution.

>   2. Or, we can use a predicate that uses the `is' functions which we
>      expect to be language-independent (i.e., those functions that only
>      depend on the locale's charset), such as:
>
>        (!isblank (c)) && (!ispunct (c)) && (!isdigit (c)) && (!iscntrl (c))

Now this is ugly, IMO!

> This is certainly not perfect, but it should work for Latin-1, and
> hopefully for other 8-bit charsets as well.
>
> As Kevin mentioned earlier, all the char sets could be re-computed in
> `scm_setlocale ()'.

This sounds even trickier, and wrong, given that the intention of SRFI
14 is for char-set:letter to be locale-independent.

Regards,
     Neil
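
To make the hard-coded option concrete, here is a minimal sketch in C of
what a locale-independent letter test for Latin-1 could look like.  It is
not Guile's actual code: the names `latin1_letter_ranges',
`latin1_letter_p' and `latin1_letter_count' are hypothetical, and the
ranges are my reading of the Latin-1 additions that SRFI 14 lists for
char-set:letter, so they should be checked against the SRFI text before
being relied on.

```c
#include <stddef.h>

/* Inclusive range of Latin-1 code points.  */
struct range { unsigned int lo, hi; };

/* Assumption: char-set:letter restricted to Latin-1, as read from
   SRFI 14.  ASCII letters first, then the Latin-1 additions.  */
static const struct range latin1_letter_ranges[] = {
  { 0x41, 0x5A },   /* A-Z */
  { 0x61, 0x7A },   /* a-z */
  { 0xAA, 0xAA },   /* feminine ordinal indicator */
  { 0xB5, 0xB5 },   /* micro sign */
  { 0xBA, 0xBA },   /* masculine ordinal indicator */
  { 0xC0, 0xD6 },   /* A-grave .. O-diaeresis */
  { 0xD8, 0xF6 },   /* O-stroke .. o-diaeresis (skips 0xD7, multiplication) */
  { 0xF8, 0xFF }    /* o-stroke .. y-diaeresis (skips 0xF7, division) */
};

/* Return nonzero if code point C is a letter according to the table
   above.  Never consults the C locale, so setlocale () cannot change
   the answer.  */
int
latin1_letter_p (unsigned int c)
{
  size_t n = sizeof latin1_letter_ranges / sizeof latin1_letter_ranges[0];
  size_t i;

  for (i = 0; i < n; i++)
    if (c >= latin1_letter_ranges[i].lo && c <= latin1_letter_ranges[i].hi)
      return 1;
  return 0;
}

/* Count the letters among the 256 Latin-1 code points.  */
int
latin1_letter_count (void)
{
  unsigned int c;
  int count = 0;

  for (c = 0; c < 256; c++)
    if (latin1_letter_p (c))
      count++;
  return count;
}
```

With this table, latin1_letter_p (0xE8) (e-grave) is true in any locale,
whereas isalpha (0xE8) depends on the locale in effect; the
multiplication and division signs (0xD7, 0xF7) are excluded, which the
"not blank, not punct, not digit, not cntrl" predicate can only get
right if the locale happens to classify them as punctuation.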