From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludovic.courtes@laas.fr (Ludovic Courtès)
Newsgroups: gmane.lisp.guile.devel
Subject: Re: SRFI-14 and locale settings
Date: Tue, 19 Sep 2006 14:28:34 +0200
Organization: LAAS-CNRS
Message-ID: <87eju86nwt.fsf@laas.fr>
References: <87y7t03ngn.fsf@laas.fr> <87slj89lrk.fsf@ossau.uklinux.net>
 <87wt8krocj.fsf@laas.fr> <87odtvkxl1.fsf@zip.com.au>
 <87r6yodtv3.fsf@laas.fr> <87ejun5kj7.fsf@zip.com.au>
 <877j095t91.fsf@laas.fr> <87fyevs42r.fsf@zip.com.au>
 <87ac52d1lj.fsf@laas.fr> <87u03aosqt.fsf@zip.com.au>
 <877j05wkb5.fsf@ossau.uklinux.net> <87y7sjj554.fsf@laas.fr>
 <87d59sg2j3.fsf@zip.com.au>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Cc: guile-devel@gnu.org
Original-To: Neil Jerram
X-URL: http://www.laas.fr/~lcourtes/
X-Revolutionary-Date: Jour du Travail de l'Année 214 de la Révolution
X-PGP-Key-ID: 0xEB1F5364
X-PGP-Key: http://www.laas.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 821D 815D 902A 7EAB 5CEE D120 7FBA 3D4F EB1F 5364
X-OS: powerpc-unknown-linux-gnu
Mail-Followup-To: Neil Jerram, guile-devel@gnu.org
In-Reply-To: <87d59sg2j3.fsf@zip.com.au> (Kevin Ryde's message of "Tue, 19 Sep 2006 09:48:00 +1000")
User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)
List-Id: "Developers list for Guile, the GNU extensibility library"
Xref: news.gmane.org gmane.lisp.guile.devel:6102

--=-=-=

Hi,

Kevin Ryde writes:

>> but `punctuation', for instance, is a superset of what
>> SRFI-14 expects while `symbol' is (correspondingly) a subset of what it
>> should be,
>
> Does the srfi specified relation to graphic still hold?  Ie.
>
>     graphic = letter + digit + punctuation + symbol

Yes.  I added tests for `char-set:graphic' in both Latin-1 and ASCII.
While I was at it, I modified `srfi-14.c' so that, for all char sets
defined by the SRFI as a union of other char sets, it explicitly uses a
predicate that reflects this (see, e.g., `CSET_GRAPHIC_PRED',
`CSET_PRINTING_PRED'), so that the property is verified "by
construction".

>> -#define SCM_CHARSET_SET(cs, idx) \
>> -  (((long *) SCM_SMOB_DATA (cs))[(idx) / SCM_BITS_PER_LONG] |= \
>> +#define SCM_CHARSET_SET(cs, idx)                                \
>> +  (((long *) SCM_SMOB_DATA (cs))[(idx) / SCM_BITS_PER_LONG] |=  \
>>      (1L << ((idx) % SCM_BITS_PER_LONG)))
>
> Is that a change?

No, sorry, just re-formatting (applying `c-backslash-region').

>> +      if (pred (c))                             \
>> +        SCM_CHARSET_SET ((cset), (c));          \
>> +      else                                      \
>> +        SCM_CHARSET_UNSET ((cset), (c));        \
>
> It may be possible to do a "set to a value" rather than separate
> set/unset macros.

In theory, but since we either set a bit by or'ing it or clear it by
and'ing its one's complement, it's not easily doable.  ;-)

>> -(use-modules (srfi srfi-14))
>> +(use-modules (srfi srfi-14)
>> +             (srfi srfi-1)  ;; `every'
>> +             (test-suite lib))
>
> A "define-module" there can prevent srfi-1 leaking out to subsequent
> tests.

Agreed.  I changed this too.

>> +(define (find-latin1-locale)
>> +  ;; Try to find and install an ISO-8859-1 locale.  Return `#f' on failure.
>> +  (if (defined? 'setlocale)
>> +      (let loop ((locales (map (lambda (lang)
>> +                                 (string-append lang ".iso88591"))
>> +                               '("de_DE" "en_GB" "en_US" "es_ES"
>> +                                 "fr_FR" "it_IT"))))
>
> The posix "locale -a" program can print all available locales, if you
> wanted to ask nl_langinfo(CODESET) or "locale -k charmap" what the
> charset is for each of them, or just try the undotted ones with
> 8859-1, or whatever.

Yeah, I know, but then we'd have to rely on, say, `(ice-9 popen)' to run
`locale' and parse its output, and `locale' would have to be present and
standard-conforming, etc.  So I thought that hardcoding locales this way
would not be less reliable and at least simpler than running `locale'.
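
For illustration only, the hard-coded probing strategy can also be
sketched in plain ISO C.  Nothing below is part of the patch or of
libguile: `try_ctype_locales' and `find_latin1_locale' are hypothetical
helpers, and the `.iso88591' spelling is an assumption carried over from
the candidate list in the Scheme test.  Note that a successful
`setlocale ()' only shows that the C library accepted the name.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: install the first locale from NAMES that
   setlocale () accepts for LC_CTYPE and return its name, or NULL if
   none of the N candidates is accepted.  */
static const char *
try_ctype_locales (const char *const *names, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (setlocale (LC_CTYPE, names[i]) != NULL)
      return names[i];

  return NULL;
}

/* Candidate ISO-8859-1 locale names, mirroring the list in the test.
   Assumption: other systems may spell these names differently.  */
static const char *const latin1_candidates[] = {
  "de_DE.iso88591", "en_GB.iso88591", "en_US.iso88591",
  "es_ES.iso88591", "fr_FR.iso88591", "it_IT.iso88591"
};

/* Return the name of the Latin-1 locale that was installed, or NULL.  */
static const char *
find_latin1_locale (void)
{
  return try_ctype_locales (latin1_candidates,
                            sizeof latin1_candidates
                            / sizeof latin1_candidates[0]);
}
```

A caller would treat a NULL result the way the test treats `#f': skip
the Latin-1 checks rather than fail them.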

`nl_langinfo ()' would be great, but we'd need to provide bindings for
it first, and it's an X/Open API, not ISO C, so it may not be available
everywhere (unfortunately).

The updated patch is attached below (only `srfi-14.test' was changed).
Let me know if it's ok to commit.

Thanks,
Ludovic.

--=-=-=
Content-Type: text/x-patch; charset=iso-8859-1
Content-Disposition: inline; filename*=us-ascii''%2c%2cthe-diff.diff
Content-Description: The updated patch

--- orig/configure.in
+++ mod/configure.in
@@ -598,9 +598,10 @@
 #   readdir_r - recent posix, not on old systems
 #   stat64 - SuS largefile stuff, not on old systems
 #   sysconf - not on old systems
+#   isblank - available as a GNU extension or in C99
 #   _NSGetEnviron - Darwin specific
 #
-AC_CHECK_FUNCS([DINFINITY DQNAN ctermid fesetround ftime fchown getcwd geteuid gettimeofday gmtime_r ioctl lstat mkdir mknod nice readdir_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex unsetenv _NSGetEnviron])
+AC_CHECK_FUNCS([DINFINITY DQNAN ctermid fesetround ftime fchown getcwd geteuid gettimeofday gmtime_r ioctl lstat mkdir mknod nice readdir_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex unsetenv isblank _NSGetEnviron])

 # Reasons for testing:
 #   netdb.h - not in mingw
--- orig/libguile/posix.c
+++ mod/libguile/posix.c
@@ -34,6 +34,7 @@
 #include "libguile/feature.h"
 #include "libguile/strings.h"
 #include "libguile/srfi-13.h"
+#include "libguile/srfi-14.h"
 #include "libguile/vectors.h"
 #include "libguile/lang.h"

@@ -1392,6 +1393,10 @@
       SCM_SYSERROR;
     }

+  /* Recompute the standard SRFI-14 character sets in a locale-dependent
+     (actually charset-dependent) way.  */
+  scm_srfi_14_compute_char_sets ();
+
   scm_dynwind_end ();
   return scm_from_locale_string (rv);
 }
--- orig/libguile/srfi-14.c
+++ mod/libguile/srfi-14.c
@@ -17,18 +17,27 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */

+#define _GNU_SOURCE  /* Ask for `isblank ()'.  */

 #include <string.h>
 #include <ctype.h>

+#ifdef HAVE_CONFIG_H
+# include <config.h>
+#endif
+
 #include "libguile.h"
 #include "libguile/srfi-14.h"


-#define SCM_CHARSET_SET(cs, idx) \
-  (((long *) SCM_SMOB_DATA (cs))[(idx) / SCM_BITS_PER_LONG] |= \
+#define SCM_CHARSET_SET(cs, idx)                                \
+  (((long *) SCM_SMOB_DATA (cs))[(idx) / SCM_BITS_PER_LONG] |=  \
     (1L << ((idx) % SCM_BITS_PER_LONG)))

+#define SCM_CHARSET_UNSET(cs, idx)                              \
+  (((long *) SCM_SMOB_DATA (cs))[(idx) / SCM_BITS_PER_LONG] &=  \
+   (~(1L << ((idx) % SCM_BITS_PER_LONG))))
+
 #define BYTES_PER_CHARSET (SCM_CHARSET_SIZE / 8)
 #define LONGS_PER_CHARSET (SCM_CHARSET_SIZE / SCM_BITS_PER_LONG)

@@ -1393,6 +1402,9 @@
 }
 #undef FUNC_NAME

+^L
+/* Standard character sets.  */
+
 SCM scm_char_set_lower_case;
 SCM scm_char_set_upper_case;
 SCM scm_char_set_title_case;
@@ -1411,48 +1423,123 @@
 SCM scm_char_set_empty;
 SCM scm_char_set_full;

-static SCM
-make_predset (int (*pred) (int))
-{
-  int ch;
-  SCM cs = make_char_set (NULL);
-  for (ch = 0; ch < 256; ch++)
-    if (pred (ch))
-      SCM_CHARSET_SET (cs, ch);
-  return cs;
-}

-static SCM
-define_predset (const char *name, int (*pred) (int))
+/* Create an empty character set and return it after binding it to NAME.  */
+static inline SCM
+define_charset (const char *name)
 {
-  SCM cs = make_predset (pred);
+  SCM cs = make_char_set (NULL);
   scm_c_define (name, cs);
   return scm_permanent_object (cs);
 }

-static SCM
-make_strset (const char *str)
+/* Membership predicates for the various char sets.
+
+   XXX: The `punctuation' and `symbol' char sets have no direct equivalent in
+   <ctype.h>.  Thus, the predicates below yield correct results for ASCII,
+   but they do not provide the result described by the SRFI for Latin-1.  The
+   correct Latin-1 result could only be obtained by hard-coding the
+   characters listed by the SRFI, but the problem would remain for other
+   8-bit charsets.
+
+   Similarly, character 0xA0 in Latin-1 (unbreakable space, `#\0240') should
+   be part of `char-set:blank'.  However, glibc's current (2006/09) Latin-1
+   locales (which use the ISO 14652 "i18n" FDCC-set) do not consider it
+   `blank' so it ends up in `char-set:punctuation'.  */
+#ifdef HAVE_ISBLANK
+# define CSET_BLANK_PRED(c)  (isblank (c))
+#else
+# define CSET_BLANK_PRED(c)                     \
+   (((c) == ' ') || ((c) == '\t'))
+#endif
+
+#define CSET_SYMBOL_PRED(c)                                     \
+  (((c) != '\0') && (strchr ("$+<=>^`|~", (c)) != NULL))
+#define CSET_PUNCT_PRED(c)                                      \
+  ((ispunct (c)) && (!CSET_SYMBOL_PRED (c)))
+
+#define CSET_LOWER_PRED(c)       (islower (c))
+#define CSET_UPPER_PRED(c)       (isupper (c))
+#define CSET_LETTER_PRED(c)      (isalpha (c))
+#define CSET_DIGIT_PRED(c)       (isdigit (c))
+#define CSET_WHITESPACE_PRED(c)  (isspace (c))
+#define CSET_CONTROL_PRED(c)     (iscntrl (c))
+#define CSET_HEX_DIGIT_PRED(c)   (isxdigit (c))
+#define CSET_ASCII_PRED(c)       (isascii (c))
+
+/* Some char sets are explicitly defined by the SRFI as a union of other char
+   sets so we try to follow this closely.  */
+
+#define CSET_LETTER_AND_DIGIT_PRED(c)           \
+  (CSET_LETTER_PRED (c) || CSET_DIGIT_PRED (c))
+
+#define CSET_GRAPHIC_PRED(c)                            \
+  (CSET_LETTER_PRED (c) || CSET_DIGIT_PRED (c)          \
+   || CSET_PUNCT_PRED (c) || CSET_SYMBOL_PRED (c))
+
+#define CSET_PRINTING_PRED(c)                           \
+  (CSET_GRAPHIC_PRED (c) || CSET_WHITESPACE_PRED (c))
+
+/* False and true predicates.  */
+#define CSET_TRUE_PRED(c)    (1)
+#define CSET_FALSE_PRED(c)   (0)
+
+
+/* Compute the contents of all the standard character sets.  Computation may
+   need to be re-done at `setlocale'-time because some char sets (e.g.,
+   `char-set:letter') need to reflect the character set supported by Guile.
+
+   For instance, at startup time, the "C" locale is used, thus Guile supports
+   only ASCII; therefore, `char-set:letter' only contains English letters.
+   The user can change this by invoking `setlocale' and specifying a locale
+   with an 8-bit charset, thereby augmenting some of the SRFI-14 standard
+   character sets.
+
+   This works because some of the predicates used below to construct
+   character sets (e.g., `isalpha(3)') are locale-dependent (so
+   charset-dependent, though generally not language-dependent).  For details,
+   please see the `guile-devel' mailing list archive of September 2006.  */
+void
+scm_srfi_14_compute_char_sets (void)
 {
-  SCM cs = make_char_set (NULL);
-  while (*str)
+#define UPDATE_CSET(c, cset, pred)              \
+  do                                            \
+    {                                           \
+      if (pred (c))                             \
+        SCM_CHARSET_SET ((cset), (c));          \
+      else                                      \
+        SCM_CHARSET_UNSET ((cset), (c));        \
+    }                                           \
+  while (0)
+
+  register int ch;
+
+  for (ch = 0; ch < 256; ch++)
     {
-      SCM_CHARSET_SET (cs, *str);
-      str++;
+      UPDATE_CSET (ch, scm_char_set_upper_case, CSET_UPPER_PRED);
+      UPDATE_CSET (ch, scm_char_set_lower_case, CSET_LOWER_PRED);
+      UPDATE_CSET (ch, scm_char_set_title_case, CSET_FALSE_PRED);
+      UPDATE_CSET (ch, scm_char_set_letter, CSET_LETTER_PRED);
+      UPDATE_CSET (ch, scm_char_set_digit, CSET_DIGIT_PRED);
+      UPDATE_CSET (ch, scm_char_set_letter_and_digit,
+                   CSET_LETTER_AND_DIGIT_PRED);
+      UPDATE_CSET (ch, scm_char_set_graphic, CSET_GRAPHIC_PRED);
+      UPDATE_CSET (ch, scm_char_set_printing, CSET_PRINTING_PRED);
+      UPDATE_CSET (ch, scm_char_set_whitespace, CSET_WHITESPACE_PRED);
+      UPDATE_CSET (ch, scm_char_set_iso_control, CSET_CONTROL_PRED);
+      UPDATE_CSET (ch, scm_char_set_punctuation, CSET_PUNCT_PRED);
+      UPDATE_CSET (ch, scm_char_set_symbol, CSET_SYMBOL_PRED);
+      UPDATE_CSET (ch, scm_char_set_hex_digit, CSET_HEX_DIGIT_PRED);
+      UPDATE_CSET (ch, scm_char_set_blank, CSET_BLANK_PRED);
+      UPDATE_CSET (ch, scm_char_set_ascii, CSET_ASCII_PRED);
+      UPDATE_CSET (ch, scm_char_set_empty, CSET_FALSE_PRED);
+      UPDATE_CSET (ch, scm_char_set_full, CSET_TRUE_PRED);
     }
-  return cs;
-}

-static SCM
-define_strset (const char *name, const char *str)
-{
-  SCM cs = make_strset (str);
-  scm_c_define (name, cs);
-  return scm_permanent_object (cs);
+#undef UPDATE_CSET
 }

-static int false (int ch) { return 0; }
-static int true (int ch) { return 1; }
-
+^L
 void
 scm_init_srfi_14 (void)
 {
@@ -1461,24 +1548,25 @@
   scm_set_smob_free (scm_tc16_charset, charset_free);
   scm_set_smob_print (scm_tc16_charset, charset_print);

-  scm_char_set_upper_case = define_predset ("char-set:upper-case", isupper);
-  scm_char_set_lower_case = define_predset ("char-set:lower-case", islower);
-  scm_char_set_title_case = define_predset ("char-set:title-case", false);
-  scm_char_set_letter = define_predset ("char-set:letter", isalpha);
-  scm_char_set_digit = define_predset ("char-set:digit", isdigit);
-  scm_char_set_letter_and_digit = define_predset ("char-set:letter+digit",
-                                                  isalnum);
-  scm_char_set_graphic = define_predset ("char-set:graphic", isgraph);
-  scm_char_set_printing = define_predset ("char-set:printing", isprint);
-  scm_char_set_whitespace = define_predset ("char-set:whitespace", isspace);
-  scm_char_set_iso_control = define_predset ("char-set:iso-control", iscntrl);
-  scm_char_set_punctuation = define_predset ("char-set:punctuation", ispunct);
-  scm_char_set_symbol = define_strset ("char-set:symbol", "$+<=>^`|~");
-  scm_char_set_hex_digit = define_predset ("char-set:hex-digit", isxdigit);
-  scm_char_set_blank = define_strset ("char-set:blank", " \t");
-  scm_char_set_ascii = define_predset ("char-set:ascii", isascii);
-  scm_char_set_empty = define_predset ("char-set:empty", false);
-  scm_char_set_full = define_predset ("char-set:full", true);
+  scm_char_set_upper_case = define_charset ("char-set:upper-case");
+  scm_char_set_lower_case = define_charset ("char-set:lower-case");
+  scm_char_set_title_case = define_charset ("char-set:title-case");
+  scm_char_set_letter = define_charset ("char-set:letter");
+  scm_char_set_digit = define_charset ("char-set:digit");
+  scm_char_set_letter_and_digit = define_charset ("char-set:letter+digit");
+  scm_char_set_graphic = define_charset ("char-set:graphic");
+  scm_char_set_printing = define_charset ("char-set:printing");
+  scm_char_set_whitespace = define_charset ("char-set:whitespace");
+  scm_char_set_iso_control = define_charset ("char-set:iso-control");
+  scm_char_set_punctuation = define_charset ("char-set:punctuation");
+  scm_char_set_symbol = define_charset ("char-set:symbol");
+  scm_char_set_hex_digit = define_charset ("char-set:hex-digit");
+  scm_char_set_blank = define_charset ("char-set:blank");
+  scm_char_set_ascii = define_charset ("char-set:ascii");
+  scm_char_set_empty = define_charset ("char-set:empty");
+  scm_char_set_full = define_charset ("char-set:full");
+
+  scm_srfi_14_compute_char_sets ();

 #include "libguile/srfi-14.x"
 }
--- orig/libguile/srfi-14.h
+++ mod/libguile/srfi-14.h
@@ -106,7 +106,7 @@
 SCM_API SCM scm_char_set_empty;
 SCM_API SCM scm_char_set_full;

-SCM_API void scm_c_init_srfi_14 (void);
+SCM_API void scm_srfi_14_compute_char_sets (void);
 SCM_API void scm_init_srfi_14 (void);

 #endif /* SCM_SRFI_14_H */
--- orig/test-suite/tests/srfi-14.test
+++ mod/test-suite/tests/srfi-14.test
@@ -1,4 +1,4 @@
-;;;; srfi-14.test --- Test suite for Guile's SRFI-14 functions. -*- scheme -*-
+;;;; srfi-14.test --- Test suite for Guile's SRFI-14 functions.
 ;;;; Martin Grabmueller, 2001-07-16
 ;;;;
 ;;;; Copyright (C) 2001, 2006 Free Software Foundation, Inc.
@@ -18,7 +18,11 @@
 ;;;; the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
 ;;;; Boston, MA 02110-1301 USA

-(use-modules (srfi srfi-14))
+(define-module (test-suite test-srfi-14)
+  :use-module (srfi srfi-14)
+  :use-module (srfi srfi-1)  ;; `every'
+  :use-module (test-suite lib))
+

 (define exception:invalid-char-set-cursor
   (cons 'misc-error "^invalid character set cursor"))
@@ -186,3 +190,128 @@
   (pass-if "upper case char set"
     (char-set= (char-set-map char-upcase char-set:lower-case)
                char-set:upper-case)))
+
+(with-test-prefix "string->char-set"
+
+  (pass-if "some char set"
+    (let ((chars '(#\g #\u #\i #\l #\e)))
+      (char-set= (list->char-set chars)
+                 (string->char-set (apply string chars))))))
+
+;; Make sure we get an ASCII charset and character classification.
+(if (defined? 'setlocale) (setlocale LC_CTYPE "C"))
+
+(with-test-prefix "standard char sets (ASCII)"
+
+  (pass-if "char-set:letter"
+    (char-set= (string->char-set
+                (string-append "abcdefghijklmnopqrstuvwxyz"
+                               "ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
+               char-set:letter))
+
+  (pass-if "char-set:punctuation"
+    (char-set= (string->char-set "!\"#%&'()*,-./:;?@[\\]_{}")
+               char-set:punctuation))
+
+  (pass-if "char-set:symbol"
+    (char-set= (string->char-set "$+<=>^`|~")
+               char-set:symbol))
+
+  (pass-if "char-set:letter+digit"
+    (char-set= char-set:letter+digit
+               (char-set-union char-set:letter char-set:digit)))
+
+  (pass-if "char-set:graphic"
+    (char-set= char-set:graphic
+               (char-set-union char-set:letter char-set:digit
+                               char-set:punctuation char-set:symbol)))
+
+  (pass-if "char-set:printing"
+    (char-set= char-set:printing
+               (char-set-union char-set:whitespace char-set:graphic))))
+
+
+^L
+;;;
+;;; 8-bit charsets.
+;;;
+;;; Here, we only test ISO-8859-1 (Latin-1), notably because behavior of
+;;; SRFI-14 for implementations supporting this charset is well-defined.
+;;;
+
+(define (every? pred lst)
+  (not (not (every pred lst))))
+
+(define (find-latin1-locale)
+  ;; Try to find and install an ISO-8859-1 locale.  Return `#f' on failure.
+  (if (defined? 'setlocale)
+      (let loop ((locales (map (lambda (lang)
+                                 (string-append lang ".iso88591"))
+                               '("de_DE" "en_GB" "en_US" "es_ES"
+                                 "fr_FR" "it_IT"))))
+        (if (null? locales)
+            #f
+            (if (false-if-exception (setlocale LC_CTYPE (car locales)))
+                (car locales)
+                (loop (cdr locales)))))
+      #f))
+
+
+(define %latin1 (find-latin1-locale))
+
+(with-test-prefix "Latin-1 (8-bit charset)"
+
+  ;; Note: the membership tests below are not exhaustive.
+
+  (pass-if "char-set:letter (membership)"
+    (if (not %latin1)
+        (throw 'unresolved)
+        (let ((letters (char-set->list char-set:letter)))
+          (every? (lambda (8-bit-char)
+                    (memq 8-bit-char letters))
+                  (append '(#\a #\b #\c)                  ;; ASCII
+                          (string->list "çéèâùÉÀÈÊ")      ;; French
+                          (string->list "øñÑíßåæðþ"))))))
+
+  (pass-if "char-set:letter (size)"
+    (if (not %latin1)
+        (throw 'unresolved)
+        (= (char-set-size char-set:letter) 117)))
+
+  (pass-if "char-set:lower-case (size)"
+    (if (not %latin1)
+        (throw 'unresolved)
+        (= (char-set-size char-set:lower-case) (+ 26 33))))
+
+  (pass-if "char-set:upper-case (size)"
+    (if (not %latin1)
+        (throw 'unresolved)
+        (= (char-set-size char-set:upper-case) (+ 26 30))))
+
+  (pass-if "char-set:punctuation (membership)"
+    (if (not %latin1)
+        (throw 'unresolved)
+        (let ((punctuation (char-set->list char-set:punctuation)))
+          (every? (lambda (8-bit-char)
+                    (memq 8-bit-char punctuation))
+                  (append '(#\! #\. #\?)            ;; ASCII
+                          (string->list "¡¿")       ;; Castellano
+                          (string->list "«»"))))))  ;; French
+
+  (pass-if "char-set:letter+digit"
+    (char-set= char-set:letter+digit
+               (char-set-union char-set:letter char-set:digit)))
+
+  (pass-if "char-set:graphic"
+    (char-set= char-set:graphic
+               (char-set-union char-set:letter char-set:digit
+                               char-set:punctuation char-set:symbol)))
+
+  (pass-if "char-set:printing"
+    (char-set= char-set:printing
+               (char-set-union char-set:whitespace char-set:graphic))))
+
+;; Local Variables:
+;; mode: scheme
+;; coding: latin-1
+;; End:

--=-=-=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel

--=-=-=--
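
The core technique of the patch, recomputing every bit of each char set
from a predicate so that unions hold "by construction", can be sketched
as a standalone C fragment.  This is a simplified illustration under
stated assumptions, not libguile code: the `charset' struct, the
`cs_*' globals and the helpers `charset_put', `charset_has' and
`compute_char_sets' are hypothetical stand-ins for the SMOB-based
versions, and only five of the standard sets are modelled.

```c
#include <ctype.h>
#include <limits.h>
#include <string.h>

#define CHARSET_SIZE   256
#define BITS_PER_LONG  (sizeof (unsigned long) * CHAR_BIT)
#define CHARSET_LONGS  ((CHARSET_SIZE + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for Guile's char set object: a 256-bit bitmap.  */
typedef struct { unsigned long bits[CHARSET_LONGS]; } charset;

/* Set or clear bit IDX depending on MEMBER, so that recomputation can
   both add and remove characters.  */
static void
charset_put (charset *cs, int idx, int member)
{
  unsigned long mask = 1UL << (idx % BITS_PER_LONG);

  if (member)
    cs->bits[idx / BITS_PER_LONG] |= mask;
  else
    cs->bits[idx / BITS_PER_LONG] &= ~mask;
}

static int
charset_has (const charset *cs, int idx)
{
  return (int) ((cs->bits[idx / BITS_PER_LONG] >> (idx % BITS_PER_LONG)) & 1UL);
}

/* Predicates mirroring CSET_SYMBOL_PRED, CSET_PUNCT_PRED and
   CSET_GRAPHIC_PRED from the patch.  */
static int
symbol_p (int c)
{
  return c != '\0' && strchr ("$+<=>^`|~", c) != NULL;
}

static int
punct_p (int c)
{
  return ispunct (c) && !symbol_p (c);
}

static int
graphic_p (int c)
{
  return isalpha (c) || isdigit (c) || punct_p (c) || symbol_p (c);
}

static charset cs_letter, cs_digit, cs_punct, cs_symbol, cs_graphic;

/* Recompute all sets; meant to be called again after setlocale (),
   since isalpha () and friends depend on LC_CTYPE.  */
static void
compute_char_sets (void)
{
  int ch;

  for (ch = 0; ch < CHARSET_SIZE; ch++)
    {
      charset_put (&cs_letter, ch, isalpha (ch));
      charset_put (&cs_digit, ch, isdigit (ch));
      charset_put (&cs_punct, ch, punct_p (ch));
      charset_put (&cs_symbol, ch, symbol_p (ch));
      charset_put (&cs_graphic, ch, graphic_p (ch));
    }
}
```

Here `charset_put' plays the role of a single "set to a value" entry
point, but internally it still chooses between or'ing the mask and
and'ing its complement, as the set/unset macros in the patch do.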