* More i18n
@ 2006-12-10 15:04 Ludovic Courtès
2006-12-11 19:42 ` Neil Jerram
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Ludovic Courtès @ 2006-12-10 15:04 UTC (permalink / raw)
[-- Attachment #1: Type: text/plain, Size: 3199 bytes --]
Hi,
The patch below provides further i18n support. In particular, it
internationalizes SRFI-19. Since SRFI-19 uses internally low-level
locale information (in order to answer questions such as "what's the
name of the first week day in this locale?"), `nl_langinfo ()' seemed
more appropriate than `strftime ()' to achieve this. Thus, the patch
also provides a wrapper for `nl_langinfo ()', called
`language-information'.
In order to be consistent with the rest of `(ice-9 i18n)',
`language-information' accepts an optional argument which should be a
locale object. Consequently, `language-information' has to perform
appropriate charset conversion. Consider the following examples:
guile> (setlocale LC_ALL "C")
guile> (language-information DAY_1)
"Sunday"
;; No conversion was needed here, easy.
guile> (define eo (make-locale LC_ALL_MASK "eo_EO.UTF-8"))
guile> (setlocale LC_ALL "eo_EO.ISO-8859-3")
guile> (language-information DAY_1 eo)
"dimanĉo"
;; The UTF-8 string returned by `nl_langinfo ()' was successfully
;; converted to ISO-8859-3, the current "internal representation" of
;; Guile.
guile> (setlocale LC_ALL "C")
guile> (language-information DAY_1 eo)
standard input:7:1: In procedure make_stringbuf_from_c_string in expression (language-information DAY_1 eo):
standard input:7:1: Invalid or incomplete multibyte or wide character
ABORT: (system-error)
;; Charset conversion failed (could not convert UTF-8 string to
;; ASCII).
This is achieved by the introduction of `scm_from_string ()', a
generalization of `scm_from_locale_string ()':
-- C Function: SCM scm_from_string (const char *str, const char
*charset)
-- C Function: SCM scm_from_stringn (const char *str, size_t len,
const char *charset)
These are generalized versions of the above functions.
Create a Scheme string that has the same contents as STR when
interpreted in the character encoding specified by CHARSET (the
interpretation of CHARSET is platform-dependent).
The current implementation of this function is very 8-bit-oriented: it
assumes that the internal representation of strings in Guile is 8-bit
and is defined by the current locale's charset. For instance,
`(setlocale LC_ALL "eo_EO.ISO-8859-3")' led `scm_from_string' to assume
Latin-3 as the internal string representation. Of course, in the
(hopefully not so distant) future, the internal string representation
will certainly be locale-independent (UTF-8 or some such), but I believe
this is a reasonable starting point until we switch to Unicode.
Note that `nl_langinfo ()' is specified by SuSv2 but is not available,
for instance, on Windows. Likewise, `iconv ()' may not be available on
all platforms but fortunately, `libiconv' [0] can be used on platforms
where it is not natively available.
The patch also addresses most of the issues raised by Kevin in the other
thread, and includes the `cond-feature' macro discussed in yet another
thread. SRFI-19 currently uses it but could easily be changed to use
`provided?' if we decide not to include `cond-feature'.
Comments welcome!
Thanks,
Ludovic.
[0] http://www.gnu.org/software/libiconv/
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: The new i18n patch --]
[-- Type: text/x-patch, Size: 50666 bytes --]
--- orig/configure.in
+++ mod/configure.in
@@ -525,12 +525,14 @@
# Reasons for testing:
# complex.h - new in C99
# fenv.h - available in C99, but not older systems
+# langinfo.h, nl_types.h - SuS v2
+# iconv.h - SuS v2
#
AC_CHECK_HEADERS([complex.h fenv.h io.h libc.h limits.h malloc.h memory.h string.h \
regex.h rxposix.h rx/rxposix.h sys/dir.h sys/ioctl.h sys/select.h \
sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
sys/utime.h time.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
-direct.h])
+direct.h langinfo.h nl_types.h iconv.h])
# "complex double" is new in C99, and "complex" is only a keyword if
# <complex.h> is included
@@ -613,17 +615,20 @@
# truncate - not in mingw
# isblank - available as a GNU extension or in C99
# _NSGetEnviron - Darwin specific
-# strcoll_l, newlocale - GNU extensions (glibc)
+# strcoll_l, newlocale - GNU extensions (glibc), also available on Darwin
+# nl_langinfo - X/Open, not available on Windows.
+# iconv - X/Open, SuS v2
#
-AC_CHECK_FUNCS([DINFINITY DQNAN chsize clog10 ctermid fesetround ftime ftruncate fchown getcwd geteuid gettimeofday gmtime_r ioctl lstat mkdir mknod nice readdir_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l newlocale])
+AC_CHECK_FUNCS([DINFINITY DQNAN chsize clog10 ctermid fesetround ftime ftruncate fchown getcwd geteuid gettimeofday gmtime_r ioctl lstat mkdir mknod nice readdir_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l newlocale nl_langinfo iconv])
# Reasons for testing:
# netdb.h - not in mingw
# sys/param.h - not in mingw
# sethostname - the function itself check because it's not in mingw,
# the DECL is checked because Solaris 10 doens't have in any header
+# xlocale.h - needed on Darwin for the `locale_t' API
#
-AC_CHECK_HEADERS(crypt.h netdb.h sys/param.h sys/resource.h sys/file.h)
+AC_CHECK_HEADERS(crypt.h netdb.h sys/param.h sys/resource.h sys/file.h xlocale.h)
AC_CHECK_FUNCS(chroot flock getlogin cuserid getpriority setpriority getpass sethostname gethostname)
AC_CHECK_DECLS([sethostname])
--- orig/doc/ref/api-data.texi
+++ mod/doc/ref/api-data.texi
@@ -3653,6 +3653,18 @@
null-terminated and the real length will be found with @code{strlen}.
@end deftypefn
+@deftypefn {C Function} SCM scm_from_string (const char *str, const char *charset)
+@deftypefnx {C Function} SCM scm_from_stringn (const char *str, size_t len, const char *charset)
+These are generalized versions of the above functions.
+
+Create a Scheme string that has the same contents as @var{str} when
+interpreted in the character encoding specified by @var{charset} (the
+interpretation of charset is platform-dependent).
+
+The @var{len} and @var{str} must follow the same rules as for the
+above functions.
+@end deftypefn
+
@deftypefn {C Function} SCM scm_take_locale_string (char *str)
@deftypefnx {C Function} SCM scm_take_locale_stringn (char *str, size_t len)
Like @code{scm_from_locale_string} and @code{scm_from_locale_stringn},
--- orig/doc/ref/api-i18n.texi
+++ mod/doc/ref/api-i18n.texi
@@ -159,34 +159,17 @@
@deffn {Scheme Procedure} string-locale<? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_lt (s1, s2, locale)
-Compare strings @var{s1} and @var{s2} in a locale-dependent way. If
-@var{locale} is provided, it should be locale object (as returned by
-@code{make-locale}) and will be used to perform the comparison;
-otherwise, the current system locale is used.
-@end deffn
-
-@deffn {Scheme Procedure} string-locale>? s1 s2 [locale]
+@deffnx {Scheme Procedure} string-locale>? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_gt (s1, s2, locale)
+@deffnx {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
+@deffnx {C Function} scm_string_locale_ci_lt (s1, s2, locale)
+@deffnx {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
+@deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a locale-dependent way. If
@var{locale} is provided, it should be locale object (as returned by
@code{make-locale}) and will be used to perform the comparison;
-otherwise, the current system locale is used.
-@end deffn
-
-@deffn {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
-@deffnx {C Function} scm_string_locale_ci_lt (s1, s2, locale)
-Compare strings @var{s1} and @var{s2} in a case-insensitive, and
-locale-dependent way. If @var{locale} is provided, it should be
-locale object (as returned by @code{make-locale}) and will be used to
-perform the comparison; otherwise, the current system locale is used.
-@end deffn
-
-@deffn {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
-@deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
-Compare strings @var{s1} and @var{s2} in a case-insensitive, and
-locale-dependent way. If @var{locale} is provided, it should be
-locale object (as returned by @code{make-locale}) and will be used to
-perform the comparison; otherwise, the current system locale is used.
+otherwise, the current system locale is used. For the @code{-ci}
+variants, the comparison is made in a case-insensitive way.
@end deffn
@deffn {Scheme Procedure} string-locale-ci=? s1 s2 [locale]
@@ -199,26 +182,16 @@
@deffn {Scheme Procedure} char-locale<? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_lt (c1, c2, locale)
-Return true if character @var{c1} is lower than @var{c2} according to
-@var{locale} or to the current locale.
-@end deffn
-
-@deffn {Scheme Procedure} char-locale>? c1 c2 [locale]
+@deffnx {Scheme Procedure} char-locale>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_gt (c1, c2, locale)
-Return true if character @var{c1} is greater than @var{c2} according
-to @var{locale} or to the current locale.
-@end deffn
-
-@deffn {Scheme Procedure} char-locale-ci<? c1 c2 [locale]
+@deffnx {Scheme Procedure} char-locale-ci<? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_lt (c1, c2, locale)
-Return true if character @var{c1} is lower than @var{c2}, in a case
-insensitive way according to @var{locale} or to the current locale.
-@end deffn
-
-@deffn {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
+@deffnx {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_gt (c1, c2, locale)
-Return true if character @var{c1} is greater than @var{c2}, in a case
-insensitive way according to @var{locale} or to the current locale.
+Compare characters @var{c1} and @var{c2} according to either
+@var{locale} (a locale object as returned by @code{make-locale}) or
+the current locale. For the @code{-ci} variants, the comparison is
+made in a case-insensitive way.
@end deffn
@deffn {Scheme Procedure} char-locale-ci=? c1 c2 [locale]
@@ -236,8 +209,8 @@
or region of the world. For instance, while most languages using the
Latin alphabet map lower-case letter ``i'' to upper-case letter ``I'',
Turkish maps lower-case ``i'' to ``Latin capital letter I with dot
-above''. The following procedures allow to provide idiomatic
-character mapping.
+above''. The following procedures allow programmers to provide
+idiomatic character mapping.
@deffn {Scheme Procedure} char-locale-downcase chr [locale]
@deffnx {C Function} scm_char_locale_upcase (chr, locale)
@@ -263,12 +236,17 @@
according to either @var{locale} or the current locale.
@end deffn
-Finally, the following procedures allow programs to read numbers
+Note that in the current implementation Guile has no notion of
+multibyte characters and in a multibyte locale characters may not be
+converted correctly.
+
+The following procedures allow programs to read numbers
written according to a particular locale. As an example, in English,
``ten thousand and a half'' is usually written @code{10,000.5} while
in French it is written @code{10000,5}. These procedures allow to
account for these differences.
+@findex strtod
@deffn {Scheme Procedure} locale-string->integer str [base [locale]]
@deffnx {C Function} scm_locale_string_to_integer (str, base, locale)
Convert string @var{str} into an integer according to either
@@ -276,20 +254,62 @@
the current process locale. If @var{base} is specified, then it
determines the base of the integer being read (e.g., @code{16} for an
hexadecimal number, @code{10} for a decimal number); by default,
-decimal numbers are read. Return two values: an integer (on success)
-or @code{#f}, and the number of characters read from @var{str}
-(@code{0} on failure).
+decimal numbers are read. Return two values (@pxref{Multiple
+Values}): an integer (on success) or @code{#f}, and the number of
+characters read from @var{str} (@code{0} on failure).
+
+This function is based on the C library's @code{strtol} function
+(@pxref{Parsing of Integers, @code{strtol},, libc, The GNU C Library
+Reference Manual}).
@end deffn
+@findex strtod
@deffn {Scheme Procedure} locale-string->inexact str [locale]
@deffnx {C Function} scm_locale_string_to_inexact (str, locale)
Convert string @var{str} into an inexact number according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
-the current process locale. Return two values: an inexact number (on
-success) or @code{#f}, and the number of characters read from
-@var{str} (@code{0} on failure).
+the current process locale. Return two values (@pxref{Multiple
+Values}): an inexact number (on success) or @code{#f}, and the number
+of characters read from @var{str} (@code{0} on failure).
+
+This function is based on the C library's @code{strtod} function
+(@pxref{Parsing of Floats, @code{strtod},, libc, The GNU C Library
+Reference Manual}).
+@end deffn
+
+@findex nl_langinfo
+@cindex low-level locale information
+It is sometimes useful to obtain very specific information about a
+locale such as the name it uses for days or months, its format for
+representing floating-point figures, etc. The @code{(ice-9 i18n)}
+module provides support for this with the @code{language-information}
+procedure. Note that this procedure is only available on platforms
+that provide @code{nl_langinfo ()} (@pxref{The Elegant and Fast Way,
+@code{nl_langinfo},, libc, The GNU C Library Reference Manual}). When
+@code{language-information}, the @code{language-information} feature
+is provided (@pxref{Feature Tracking}).
+
+@deffn {Scheme Procedure} language-information item [locale]
+@deffnx {C Function} scm_language_information (item, locale)
+Return a string denoting locale information for @var{item} in the
+current locale or that specified by @var{locale}. The semantics and
+arguments are the same as those of the X/Open @code{nl_langinfo}
+function (@pxref{The Elegant and Fast Way, @code{nl_langinfo},, libc,
+The GNU C Library Reference Manual}).
@end deffn
+It can be used as follows:
+
+@example
+(language-information DAY_1 (make-locale LC_ALL_MASK "C"))
+@result{} "Sunday"
+
+(language-information ABMON_2 (make-locale LC_ALL_MASK "C"))
+@result{} "Feb"
+
+(language-information RADIXCHAR (make-locale LC_ALL_MASK "fr_FR"))
+@result{} ","
+@end example
@node Gettext Support
@subsection Gettext Support
--- orig/doc/ref/srfi-modules.texi
+++ mod/doc/ref/srfi-modules.texi
@@ -2095,10 +2095,10 @@
described here, since the specification and reference implementation
differ.
-Currently Guile doesn't implement any localizations for the above, all
-outputs are in English, and the @samp{~c} conversion is POSIX
-@code{ctime} style @samp{~a ~b ~d ~H:~M:~S~z ~Y}. This may change in
-the future.
+Conversion is locale-dependent on systems that support
+@code{language-information} (@pxref{The ice-9 i18n Module,
+@code{language-information}}). @xref{Locales, @code{setlocale}}, for
+information on how to change the current locale.
@node SRFI-19 String to date
@@ -2219,9 +2219,10 @@
returned, instead the weekday will be derived from the day, month and
year.
-Currently Guile doesn't implement any localizations for the above,
-month and weekday names are always expected in English. This may
-change in the future.
+Conversion is locale-dependent on systems that support
+@code{language-information} (@pxref{The ice-9 i18n Module,
+@code{language-information}}). @xref{Locales, @code{setlocale}}, for
+information on how to change the current locale.
@end defun
--- orig/ice-9/boot-9.scm
+++ mod/ice-9/boot-9.scm
@@ -3245,25 +3245,17 @@
(append (hashq-ref %cond-expand-table mod '())
features)))))
-(define cond-expand
+(define (make-cond-expand-macro feature-available?
+ syntax-error unfulfilled-error)
(procedure->memoizing-macro
(lambda (exp env)
- (let ((clauses (cdr exp))
- (syntax-error (lambda (cl)
- (error "invalid clause in `cond-expand'" cl))))
+ (let ((clauses (cdr exp)))
(letrec
((test-clause
(lambda (clause)
(cond
((symbol? clause)
- (or (memq clause %cond-expand-features)
- (let lp ((uses (module-uses (env-module env))))
- (if (pair? uses)
- (or (memq clause
- (hashq-ref %cond-expand-table
- (car uses) '()))
- (lp (cdr uses)))
- #f))))
+ (feature-available? clause env))
((pair? clause)
(cond
((eq? 'and (car clause))
@@ -3295,7 +3287,7 @@
(let lp ((c clauses))
(cond
((null? c)
- (error "Unfulfilled `cond-expand'"))
+ (unfulfilled-error))
((not (pair? c))
(syntax-error c))
((not (pair? (car c)))
@@ -3309,6 +3301,21 @@
(else
(lp (cdr c))))))))))
+(define cond-expand
+ (make-cond-expand-macro (lambda (clause env)
+ (or (memq clause %cond-expand-features)
+ (let lp ((uses (module-uses (env-module env))))
+ (if (pair? uses)
+ (or (memq clause
+ (hashq-ref %cond-expand-table
+ (car uses) '()))
+ (lp (cdr uses)))
+ #f))))
+ (lambda (clause)
+ (error "invalid clause in `cond-expand'" clause))
+ (lambda ()
+ (error "unfulfilled `cond-expand'"))))
+
;; This procedure gets called from the startup code with a list of
;; numbers, which are the numbers of the SRFIs to be loaded on startup.
;;
@@ -3323,6 +3330,22 @@
\f
+;;; `cond-feature'
+;;;
+
+(define cond-feature
+ ;; Provide a mechanism similar to `cond-expand' for Guile's built-in
+ ;; features.
+ (make-cond-expand-macro (lambda (clause env)
+ (provided? clause))
+ (lambda (clause)
+ (error "invalid clause in `cond-feature'"
+ clause))
+ (lambda ()
+ (error "unfulfilled `cond-feature'"))))
+
+\f
+
;;; srfi-55: require-extension
;;;
--- orig/ice-9/i18n.scm
+++ mod/ice-9/i18n.scm
@@ -54,7 +54,33 @@
string-locale-downcase string-locale-upcase
;; reading numbers
- locale-string->integer locale-string->inexact))
+ locale-string->integer locale-string->inexact
+
+ ;; detailed language information
+ language-information
+ CODESET
+ ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7
+ DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7
+ ABMON_1 ABMON_2 ABMON_3 ABMON_4 ABMON_5 ABMON_6 ABMON_7
+ ABMON_8 ABMON_9 ABMON_10 ABMON_11 ABMON_12
+ MON_1 MON_2 MON_3 MON_4 MON_5 MON_6 MON_7 MON_8
+ MON_9 MON_10 MON_11 MON_12
+ AM_STR PM_STR
+ D_T_FMT D_FMT T_FMT T_FMT_AMPM
+ ERA ERA_D_FMT ERA_D_T_FMT ERA_T_FMT
+ ALT_DIGITS RADIXCHAR THOUSEP
+ CRNCYSTR
+ YESEXPR NOEXPR YESSTR NOSTR
+
+ ;; GNU extensions (may be unavailable on non-GNU systems)
+ ERA_YEAR DECIMAL_POINT
+ INT_CURR_SYMBOL CURRENCY_SYMBOL
+ MON_DECIMAL_POINT MON_THOUSANDS_SEP MON_GROUPING
+ POSITIVE_SIGN NEGATIVE_SIGN INT_FRAC_DIGITS FRAC_DIGITS
+ P_CS_PRECEDES P_SEP_BY_SPACE N_CS_PRECEDES N_SEP_BY_SPACE
+ P_SIGN_POSN N_SIGN_POSN INT_P_CS_PRECEDES INT_P_SEP_BY_SPACE
+ INT_N_CS_PRECEDES INT_N_SEP_BY_SPACE INT_P_SIGN_POSN
+ INT_N_SIGN_POSN))
(load-extension "libguile-i18n-v-0" "scm_init_i18n")
--- orig/libguile/i18n.c
+++ mod/libguile/i18n.c
@@ -46,6 +46,7 @@
#include "libguile/dynwind.h"
#include "libguile/validate.h"
#include "libguile/values.h"
+#include "libguile/threads.h"
#include <locale.h>
#include <string.h> /* `strcoll ()' */
@@ -56,8 +57,18 @@
# define USE_GNU_LOCALE_API
#endif
-#ifndef USE_GNU_LOCALE_API
-# include "libguile/posix.h" /* for `scm_i_locale_mutex' */
+#if (defined USE_GNU_LOCALE_API) && (defined HAVE_XLOCALE_H)
+/* Darwin now supports the "GNU" thread-safe locale API but one has to
+ include <xlocale.h> to get it. See:
+ http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html */
+# include <xlocale.h>
+#endif
+
+#include "libguile/posix.h" /* for `scm_i_locale_mutex' */
+
+#if (defined HAVE_LANGINFO_H) && (defined HAVE_NL_TYPES_H)
+# include <langinfo.h>
+# include <nl_types.h>
#endif
#ifndef HAVE_SETLOCALE
@@ -69,6 +80,9 @@
}
#endif
+/* Helper stringification macro. */
+#define SCM_I18N_STRINGIFY(_name) # _name
+
\f
/* Locale objects, string and character collation, and other locale-dependent
@@ -99,44 +113,43 @@
#ifndef USE_GNU_LOCALE_API
-/* Provide the locale category masks as found in glibc (copied from
- <locale.h> as found in glibc 2.3.6). This must be kept in sync with
- `locale-categories.h'. */
-
-# define LC_CTYPE_MASK (1 << LC_CTYPE)
-# define LC_COLLATE_MASK (1 << LC_COLLATE)
-# define LC_MESSAGES_MASK (1 << LC_MESSAGES)
-# define LC_MONETARY_MASK (1 << LC_MONETARY)
-# define LC_NUMERIC_MASK (1 << LC_NUMERIC)
-# define LC_TIME_MASK (1 << LC_TIME)
+/* Provide the locale category masks as found in glibc. This must be kept in
+ sync with `locale-categories.h'. */
+
+# define LC_CTYPE_MASK 1
+# define LC_COLLATE_MASK 2
+# define LC_MESSAGES_MASK 4
+# define LC_MONETARY_MASK 8
+# define LC_NUMERIC_MASK 16
+# define LC_TIME_MASK 32
# ifdef LC_PAPER
-# define LC_PAPER_MASK (1 << LC_PAPER)
+# define LC_PAPER_MASK 64
# else
# define LC_PAPER_MASK 0
# endif
# ifdef LC_NAME
-# define LC_NAME_MASK (1 << LC_NAME)
+# define LC_NAME_MASK 128
# else
# define LC_NAME_MASK 0
# endif
# ifdef LC_ADDRESS
-# define LC_ADDRESS_MASK (1 << LC_ADDRESS)
+# define LC_ADDRESS_MASK 256
# else
# define LC_ADDRESS_MASK 0
# endif
# ifdef LC_TELEPHONE
-# define LC_TELEPHONE_MASK (1 << LC_TELEPHONE)
+# define LC_TELEPHONE_MASK 512
# else
# define LC_TELEPHONE_MASK 0
# endif
# ifdef LC_MEASUREMENT
-# define LC_MEASUREMENT_MASK (1 << LC_MEASUREMENT)
+# define LC_MEASUREMENT_MASK 1024
# else
# define LC_MEASUREMENT_MASK 0
# endif
# ifdef LC_IDENTIFICATION
-# define LC_IDENTIFICATION_MASK (1 << LC_IDENTIFICATION)
+# define LC_IDENTIFICATION_MASK 2048
# else
# define LC_IDENTIFICATION_MASK 0
# endif
@@ -224,7 +237,21 @@
}
#endif
+/* Throw an exception corresponding to error ERR. */
+static void inline
+scm_locale_error (const char *func_name, int err)
+{
+ SCM s_err;
+
+ s_err = scm_from_int (err);
+ scm_error (scm_system_error_key, func_name,
+ "Failed to install locale",
+ scm_cons (scm_strerror (s_err), SCM_EOL),
+ scm_cons (s_err, SCM_EOL));
+}
+
+\f
SCM_DEFINE (scm_make_locale, "make-locale", 2, 1, 0,
(SCM category_mask, SCM locale_name, SCM base_locale),
"Return a reference to a data structure representing a set of "
@@ -249,13 +276,13 @@
c_locale = newlocale (c_category_mask, c_locale_name, c_base_locale);
+ free (c_locale_name);
+
if (!c_locale)
- locale = SCM_BOOL_F;
+ scm_locale_error (FUNC_NAME, errno);
else
SCM_NEWSMOB (locale, scm_tc16_locale_smob_type, c_locale);
- free (c_locale_name);
-
#else
c_locale = scm_gc_malloc (sizeof (* c_locale), "locale");
@@ -277,10 +304,7 @@
"Return true if @var{obj} is a locale object.")
#define FUNC_NAME s_scm_locale_p
{
- if (SCM_SMOB_PREDICATE (scm_tc16_locale_smob_type, obj))
- return SCM_BOOL_T;
-
- return SCM_BOOL_F;
+ return scm_from_bool (SCM_SMOB_PREDICATE (scm_tc16_locale_smob_type, obj));
}
#undef FUNC_NAME
@@ -481,19 +505,6 @@
return err;
}
-/* Throw an exception corresponding to error ERR. */
-static void inline
-scm_locale_error (const char *func_name, int err)
-{
- SCM s_err;
-
- s_err = scm_from_int (err);
- scm_error (scm_system_error_key, func_name,
- "Failed to install locale",
- scm_cons (scm_strerror (s_err), SCM_EOL),
- scm_cons (s_err, SCM_EOL));
-}
-
/* Convenient macro to run STATEMENT in the locale context of C_LOCALE. */
#define RUN_IN_LOCALE_SECTION(_c_locale, _statement) \
do \
@@ -1124,25 +1135,309 @@
}
#undef FUNC_NAME
+\f
+/* Language information, aka. `nl_langinfo ()'. */
+
+/* FIXME: Issues related to `language-information'.
+
+ 1. The `CODESET' value if not normalized. This is a secondary issue, but
+ still a practical issue. See
+ http://www.cl.cam.ac.uk/~mgk25/ucs/norm_charmap.c for codeset
+ normalization.
+
+ 2. `nl_langinfo ()' is not available on Windows (but do we care?). */
+
+
+SCM_DEFINE (scm_language_information, "language-information", 1, 1, 0,
+ (SCM item, SCM locale),
+ "Return a string denoting locale information for @var{item} "
+ "in the current locale or that specified by @var{locale}. "
+ "The semantics and arguments are the same as those of the "
+ "X/Open @code{nl_langinfo} function (@pxref{The Elegant and "
+ "Fast Way, @code{nl_langinfo},, libc, The GNU C Library "
+ "Reference Manual}).")
+#define FUNC_NAME s_scm_language_information
+{
+#ifdef HAVE_NL_LANGINFO
+ SCM result;
+ nl_item c_item;
+ char *c_result, *codeset;
+ const char *tmp_codeset;
+ scm_t_locale c_locale;
+
+ SCM_VALIDATE_INT_COPY (2, item, c_item);
+ SCM_VALIDATE_OPTIONAL_LOCALE_COPY (2, locale, c_locale);
+
+ /* Sadly, `nl_langinfo ()' returns a pointer to a static string. According
+ to SuS v2, that static string may be modified by subsequent calls to
+ `nl_langinfo ()' as well as by calls to `setlocale ()'. Thus, we must
+ acquire the locale mutex before doing invoking `nl_langinfo ()'. See
+ http://opengroup.org/onlinepubs/007908799/xsh/nl_langinfo.html for
+ details. */
+
+ scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
+ if (c_locale != NULL)
+ {
+#ifdef USE_GNU_LOCALE_API
+ c_result = nl_langinfo_l (c_item, c_locale);
+ tmp_codeset = nl_langinfo_l (CODESET, c_locale);
+#else
+ /* We can't use `RUN_IN_LOCALE_SECTION ()' here because the locale
+ mutex is already taken. */
+ int lsec_err;
+ scm_t_locale_settings lsec_prev_locale;
+
+ lsec_err = get_current_locale_settings (&lsec_prev_locale);
+ if (lsec_err)
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+ else
+ {
+ lsec_err = install_locale (c_locale);
+ if (lsec_err)
+ {
+ leave_locale_section (&prev_locale);
+ free_locale_settings (&prev_locale);
+ }
+ }
+
+ if (lsec_err)
+ scm_locale_error (FUNC_NAME, lsec_err);
+ else
+ {
+ /* Get the result under C_LOCALE. */
+ c_result = nl_langinfo (c_item);
+ tmp_codeset = nl_langinfo (CODESET);
+
+ leave_locale_section (&lsec_prev_locale);
+ free_locale_settings (&lsec_prev_locale);
+ }
+#endif
+ }
+ else
+ {
+ c_result = nl_langinfo (c_item);
+ tmp_codeset = nl_langinfo (CODESET);
+ }
+
+ c_result = strdup (c_result);
+
+ codeset = (char *) alloca (strlen (tmp_codeset) + 1);
+ strcpy (codeset, tmp_codeset);
+
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+
+ if (c_result == NULL)
+ {
+ result = SCM_BOOL_F;
+ /* FIXME: Raise an error. */
+ }
+ else
+ {
+ /* Interpret C_RESULT using the right encoding. */
+ result = scm_from_stringn (c_result, strlen (c_result), codeset);
+ free (c_result);
+ }
+
+ return result;
+#else
+ scm_syserror_msg (FUNC_NAME,
+ "`language-information' not supported on your system",
+ SCM_EOL, ENOSYS);
+
+ return SCM_BOOL_F;
+#endif
+}
+#undef FUNC_NAME
+
+/* Define the `nl_item' constants. */
+static inline void
+define_langinfo_items (void)
+{
+#if (defined HAVE_NL_TYPES_H) && (defined HAVE_LANGINFO_H)
+
+#define DEFINE_NLITEM_CONSTANT(_item) \
+ scm_c_define (# _item, SCM_I_MAKINUM (_item))
+
+ DEFINE_NLITEM_CONSTANT (CODESET);
+
+ /* Abbreviated days of the week. */
+ DEFINE_NLITEM_CONSTANT (ABDAY_1);
+ DEFINE_NLITEM_CONSTANT (ABDAY_2);
+ DEFINE_NLITEM_CONSTANT (ABDAY_3);
+ DEFINE_NLITEM_CONSTANT (ABDAY_4);
+ DEFINE_NLITEM_CONSTANT (ABDAY_5);
+ DEFINE_NLITEM_CONSTANT (ABDAY_6);
+ DEFINE_NLITEM_CONSTANT (ABDAY_7);
+
+ /* Long-named days of the week. */
+ DEFINE_NLITEM_CONSTANT (DAY_1); /* Sunday */
+ DEFINE_NLITEM_CONSTANT (DAY_2); /* Monday */
+ DEFINE_NLITEM_CONSTANT (DAY_3); /* Tuesday */
+ DEFINE_NLITEM_CONSTANT (DAY_4); /* Wednesday */
+ DEFINE_NLITEM_CONSTANT (DAY_5); /* Thursday */
+ DEFINE_NLITEM_CONSTANT (DAY_6); /* Friday */
+ DEFINE_NLITEM_CONSTANT (DAY_7); /* Saturday */
+
+ /* Abbreviated month names. */
+ DEFINE_NLITEM_CONSTANT (ABMON_1); /* Jan */
+ DEFINE_NLITEM_CONSTANT (ABMON_2);
+ DEFINE_NLITEM_CONSTANT (ABMON_3);
+ DEFINE_NLITEM_CONSTANT (ABMON_4);
+ DEFINE_NLITEM_CONSTANT (ABMON_5);
+ DEFINE_NLITEM_CONSTANT (ABMON_6);
+ DEFINE_NLITEM_CONSTANT (ABMON_7);
+ DEFINE_NLITEM_CONSTANT (ABMON_8);
+ DEFINE_NLITEM_CONSTANT (ABMON_9);
+ DEFINE_NLITEM_CONSTANT (ABMON_10);
+ DEFINE_NLITEM_CONSTANT (ABMON_11);
+ DEFINE_NLITEM_CONSTANT (ABMON_12);
+
+ /* Long month names. */
+ DEFINE_NLITEM_CONSTANT (MON_1); /* January */
+ DEFINE_NLITEM_CONSTANT (MON_2);
+ DEFINE_NLITEM_CONSTANT (MON_3);
+ DEFINE_NLITEM_CONSTANT (MON_4);
+ DEFINE_NLITEM_CONSTANT (MON_5);
+ DEFINE_NLITEM_CONSTANT (MON_6);
+ DEFINE_NLITEM_CONSTANT (MON_7);
+ DEFINE_NLITEM_CONSTANT (MON_8);
+ DEFINE_NLITEM_CONSTANT (MON_9);
+ DEFINE_NLITEM_CONSTANT (MON_10);
+ DEFINE_NLITEM_CONSTANT (MON_11);
+ DEFINE_NLITEM_CONSTANT (MON_12);
+
+ DEFINE_NLITEM_CONSTANT (AM_STR); /* Ante meridiem string. */
+ DEFINE_NLITEM_CONSTANT (PM_STR); /* Post meridiem string. */
+
+ DEFINE_NLITEM_CONSTANT (D_T_FMT); /* Date and time format for strftime. */
+ DEFINE_NLITEM_CONSTANT (D_FMT); /* Date format for strftime. */
+ DEFINE_NLITEM_CONSTANT (T_FMT); /* Time format for strftime. */
+ DEFINE_NLITEM_CONSTANT (T_FMT_AMPM);/* 12-hour time format for strftime. */
+
+ DEFINE_NLITEM_CONSTANT (ERA); /* Alternate era. */
+ DEFINE_NLITEM_CONSTANT (ERA_D_FMT); /* Date in alternate era format. */
+ DEFINE_NLITEM_CONSTANT (ERA_D_T_FMT); /* Date and time in alternate era
+ format. */
+ DEFINE_NLITEM_CONSTANT (ERA_T_FMT); /* Time in alternate era format. */
+
+ DEFINE_NLITEM_CONSTANT (ALT_DIGITS); /* Alternate symbols for digits. */
+ DEFINE_NLITEM_CONSTANT (RADIXCHAR);
+ DEFINE_NLITEM_CONSTANT (THOUSEP);
+
+#ifdef YESEXPR
+ DEFINE_NLITEM_CONSTANT (YESEXPR);
+#endif
+#ifdef NOEXPR
+ DEFINE_NLITEM_CONSTANT (NOEXPR);
+#endif
+#ifdef YESSTR
+ DEFINE_NLITEM_CONSTANT (YESSTR);
+#endif
+#ifdef NOSTR
+ DEFINE_NLITEM_CONSTANT (NOSTR);
+#endif
+
+
+ /* GNU extensions. */
+
+#ifdef ERA_YEAR
+ DEFINE_NLITEM_CONSTANT (ERA_YEAR); /* Year in alternate era format. */
+#endif
+#ifdef DECIMAL_POINT
+ DEFINE_NLITEM_CONSTANT (DECIMAL_POINT); /* Equivalent to `RADIXCHAR'. */
+#endif
+
+ /* LC_MONETARY category: formatting of monetary quantities.
+ These items each correspond to a member of `struct lconv',
+ defined in <locale.h>. */
+#ifdef INT_CURR_SYMBOL
+ DEFINE_NLITEM_CONSTANT (INT_CURR_SYMBOL);
+#endif
+#ifdef CURRENCY_SYMBOL
+ DEFINE_NLITEM_CONSTANT (CURRENCY_SYMBOL);
+#endif
+#ifdef CRNCYSTR /* legacy symbol */
+ DEFINE_NLITEM_CONSTANT (CRNCYSTR);
+#endif
+#ifdef MON_DECIMAL_POINT
+ DEFINE_NLITEM_CONSTANT (MON_DECIMAL_POINT);
+#endif
+#ifdef MON_THOUSANDS_SEP
+ DEFINE_NLITEM_CONSTANT (MON_THOUSANDS_SEP);
+#endif
+#ifdef MON_GROUPING
+ DEFINE_NLITEM_CONSTANT (MON_GROUPING);
+#endif
+#ifdef POSITIVE_SIGN
+ DEFINE_NLITEM_CONSTANT (POSITIVE_SIGN);
+#endif
+#ifdef NEGATIVE_SIGN
+ DEFINE_NLITEM_CONSTANT (NEGATIVE_SIGN);
+#endif
+#ifdef INT_FRAC_DIGITS
+ DEFINE_NLITEM_CONSTANT (INT_FRAC_DIGITS);
+#endif
+#ifdef FRAC_DIGITS
+ DEFINE_NLITEM_CONSTANT (FRAC_DIGITS);
+#endif
+#ifdef P_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (P_CS_PRECEDES);
+#endif
+#ifdef P_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (P_SEP_BY_SPACE);
+#endif
+#ifdef N_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (N_CS_PRECEDES);
+#endif
+#ifdef N_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (N_SEP_BY_SPACE);
+#endif
+#ifdef P_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (P_SIGN_POSN);
+#endif
+#ifdef N_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (N_SIGN_POSN);
+#endif
+#ifdef INT_P_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (INT_P_CS_PRECEDES);
+#endif
+#ifdef INT_P_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (INT_P_SEP_BY_SPACE);
+#endif
+#ifdef INT_N_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (INT_N_CS_PRECEDES);
+#endif
+#ifdef INT_N_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (INT_N_SEP_BY_SPACE);
+#endif
+#ifdef INT_P_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (INT_P_SIGN_POSN);
+#endif
+#ifdef INT_N_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (INT_N_SIGN_POSN);
+#endif
+
+#undef DEFINE_NLITEM_CONSTANT
+
+#endif /* HAVE_NL_TYPES_H */
+}
\f
void
scm_init_i18n ()
{
- scm_add_feature ("ice-9-i18n");
-
-#define _SCM_STRINGIFY_LC(_name) # _name
-#define SCM_STRINGIFY_LC(_name) _SCM_STRINGIFY_LC (_name)
+#ifdef HAVE_NL_LANGINFO
+ scm_add_feature ("language-information");
+ define_langinfo_items ();
+#endif
/* Define all the relevant `_MASK' variables. */
-#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
- scm_c_define ("LC_" SCM_STRINGIFY_LC (_name) "_MASK", \
+#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
+ scm_c_define ("LC_" SCM_I18N_STRINGIFY (_name) "_MASK", \
SCM_I_MAKINUM (LC_ ## _name ## _MASK));
#include "locale-categories.h"
#undef SCM_DEFINE_LOCALE_CATEGORY
-#undef SCM_STRINGIFY_LC
-#undef _SCM_STRINGIFY_LC
scm_c_define ("LC_ALL_MASK", SCM_I_MAKINUM (LC_ALL_MASK));
--- orig/libguile/i18n.h
+++ mod/libguile/i18n.h
@@ -40,6 +40,7 @@
SCM_API SCM scm_string_locale_downcase (SCM chr, SCM locale);
SCM_API SCM scm_locale_string_to_integer (SCM str, SCM base, SCM locale);
SCM_API SCM scm_locale_string_to_inexact (SCM str, SCM locale);
+SCM_API SCM scm_language_information (SCM item, SCM locale);
SCM_API void scm_init_i18n (void);
--- orig/libguile/posix.c
+++ mod/libguile/posix.c
@@ -119,6 +119,13 @@
# define USE_GNU_LOCALE_API
#endif
+#if (defined USE_GNU_LOCALE_API) && (defined HAVE_XLOCALE_H)
+/* Darwin now supports the "GNU" thread-safe locale API but one has to
+ include <xlocale.h> to get it. See:
+ http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html */
+# include <xlocale.h>
+#endif
+
#if HAVE_CRYPT_H
# include <crypt.h>
#endif
@@ -1384,12 +1391,11 @@
}
#undef FUNC_NAME
-#ifndef USE_GNU_LOCALE_API
/* This mutex is used to serialize invocations of `setlocale ()' on non-GNU
- systems (i.e., systems where a reentrant locale API is not available).
- See `i18n.c' for details. */
-scm_i_pthread_mutex_t scm_i_locale_mutex;
-#endif
+ systems (i.e., systems where a reentrant locale API is not available). It
+ is also acquired before calls to `nl_langinfo ()'. See `i18n.c' for
+ details. */
+scm_i_pthread_mutex_t scm_i_locale_mutex = SCM_I_PTHREAD_MUTEX_INITIALIZER;
#ifdef HAVE_SETLOCALE
@@ -1406,6 +1412,7 @@
"the locale will be set using environment variables.")
#define FUNC_NAME s_scm_setlocale
{
+ int c_category;
char *clocale;
char *rv;
@@ -1421,13 +1428,11 @@
scm_dynwind_free (clocale);
}
-#ifndef USE_GNU_LOCALE_API
+ c_category = scm_i_to_lc_category (category, 1);
+
scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
-#endif
- rv = setlocale (scm_i_to_lc_category (category, 1), clocale);
-#ifndef USE_GNU_LOCALE_API
+ rv = setlocale (c_category, clocale);
scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
-#endif
if (rv == NULL)
{
@@ -1965,10 +1970,6 @@
void
scm_init_posix ()
{
-#ifndef USE_GNU_LOCALE_API
- scm_i_pthread_mutex_init (&scm_i_locale_mutex, NULL);
-#endif
-
scm_add_feature ("posix");
#ifdef HAVE_GETEUID
scm_add_feature ("EIDs");
--- orig/libguile/strings.c
+++ mod/libguile/strings.c
@@ -17,9 +17,19 @@
\f
+#if HAVE_CONFIG_H
+# include <config.h>
+#endif
#include <string.h>
#include <stdio.h>
+#ifdef HAVE_ICONV_H
+# include <iconv.h>
+#endif
+#if (defined HAVE_LANGINFO_H) && (defined HAVE_NL_TYPES_H)
+# include <langinfo.h>
+# include <nl_types.h>
+#endif
#include "libguile/_scm.h"
#include "libguile/chars.h"
@@ -29,6 +39,10 @@
#include "libguile/validate.h"
#include "libguile/dynwind.h"
+#include "libguile/threads.h"
+#include "libguile/posix.h" /* `scm_i_locale_mutex' */
+
+
\f
/* {Strings}
@@ -115,13 +129,176 @@
}
else
{
- char *mem = scm_gc_malloc (len+1, "string");
+ char *mem = scm_gc_malloc (len+1, "stringbuf");
mem[len] = '\0';
return scm_double_cell (STRINGBUF_TAG, (scm_t_bits) mem,
(scm_t_bits) len, (scm_t_bits) 0);
}
}
+static void
+unlock_mutex (void *mutex)
+{
+ scm_i_pthread_mutex_unlock ((scm_i_pthread_mutex_t *) mutex);
+}
+
+typedef struct
+{
+ char *start;
+ size_t max_len;
+} scm_t_buffer;
+
+static void
+free_buf (void *bufp)
+{
+ scm_t_buffer *buf;
+
+ buf = (scm_t_buffer *)bufp;
+ if (buf->start != NULL)
+ {
+ scm_gc_free (buf->start, buf->max_len, "stringbuf");
+ buf->start = NULL;
+ buf->max_len = 0;
+ }
+}
+
+
+/* Return a new stringbuf from string STR of size LEN which is encoded in
+ CHARSET. */
+static SCM
+make_stringbuf_from_c_string (const char *str, size_t len,
+ const char *charset)
+{
+ SCM res;
+ char *dst;
+
+ if (len == (size_t) -1)
+ len = strlen (str);
+
+#if (!defined HAVE_NL_LANGINFO) || (!defined HAVE_ICONV)
+ res = make_stringbuf (len);
+ dst = STRINGBUF_CHARS (res);
+ memcpy (dst, str, len);
+#else
+ {
+ /* We make the assumption that the current locale's charset is used as
+ our internal encoding for strings. */
+ const char *current_charset;
+
+ scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
+ current_charset = nl_langinfo (CODESET);
+ if ((charset == NULL)
+ || (!strcmp (charset, current_charset)))
+ {
+ /* No conversion needed. */
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+
+ res = make_stringbuf (len);
+ dst = STRINGBUF_CHARS (res);
+ memcpy (dst, str, len);
+ }
+ else
+ {
+ /* Need to convert from CHARSET to CURRENT_CHARSET. */
+ iconv_t cc;
+ size_t nconv, output_avail, output_len;
+ char *output;
+ scm_t_buffer output_buf = { NULL, 0 };
+
+ /* The memory allocation routines may raise an exception so we must
+ make sure we don't leave the locale mutex locked if that
+ happens. */
+ scm_dynwind_begin (0);
+ scm_dynwind_unwind_handler (unlock_mutex, &scm_i_locale_mutex, 0);
+ scm_dynwind_unwind_handler (free_buf, &output_buf, 0);
+
+ cc = iconv_open (current_charset, charset);
+ if (cc == (iconv_t) -1)
+ goto conv_open_error;
+
+ output_buf.max_len = output_avail = len * 2;
+ output_buf.start = output =
+ (char *) scm_gc_malloc (output_buf.max_len, "stringbuf");
+
+ nconv = iconv (cc, NULL, NULL, &output, &output_avail);
+ if (nconv == (size_t) -1)
+ goto conv_error;
+
+ /* Convert from STR into OUTPUT_START. */
+ while ((len > 0) && (output_avail > 0))
+ {
+ nconv = iconv (cc, (char **)&str, &len,
+ &output, &output_avail);
+
+ if (nconv == (size_t) -1)
+ {
+ switch (errno)
+ {
+ case E2BIG:
+ /* We ran output of space in OUTPUT: grow it. */
+ output_len = output_buf.max_len - output_avail;
+ output_buf.start =
+ scm_gc_realloc (output_buf.start, output_buf.max_len,
+ output_buf.max_len * 2, "stringbuf");
+
+ output = output_buf.start + output_len;
+ output_buf.max_len *= 2;
+ output_avail = output_buf.max_len - output_len;
+ break;
+
+ default:
+ /* Incomplete byte sequence in STR, or actual conversion
+ error. */
+ goto conv_error;
+ }
+ }
+ }
+
+ output_len = output_buf.max_len - output_avail;
+
+ *output = '\0';
+ iconv_close (cc);
+ goto done;
+
+ conv_error:
+ iconv_close (cc);
+ conv_open_error:
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+ scm_syserror (__FUNCTION__);
+
+ done:
+ scm_dynwind_end ();
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+
+ /* Note: The stringbuf' length will be OUTPUT_LEN, that is, the
+ length _in octets_ of the converted string. This will be so until
+ we switch to Unicode internally. */
+ if (output_len <= STRINGBUF_MAX_INLINE_LEN - 1)
+ {
+ res = scm_double_cell (STRINGBUF_TAG | STRINGBUF_F_INLINE
+ | (output_len << 16),
+ 0, 0, 0);
+ memcpy (STRINGBUF_CHARS (res), output_buf.start, output_len);
+ }
+ else
+ {
+ if (output_buf.max_len > output_len + 1)
+ /* Shrink the output buffer. */
+ output_buf.start = scm_gc_realloc (output_buf.start,
+ output_buf.max_len,
+ output_len + 1, "stringbuf");
+
+ res = scm_double_cell (STRINGBUF_TAG,
+ (scm_t_bits) output_buf.start,
+ (scm_t_bits) output_len, (scm_t_bits) 0);
+ }
+ }
+ }
+#endif
+
+ return res;
+}
+
/* Return a new stringbuf whose underlying storage consists of the LEN+1
octets pointed to by STR (the last octet is zero). */
SCM_C_INLINE_KEYWORD SCM
@@ -149,6 +326,7 @@
scm_i_pthread_mutex_t stringbuf_write_mutex = SCM_I_PTHREAD_MUTEX_INITIALIZER;
+\f
/* Copy-on-write strings.
*/
@@ -179,16 +357,23 @@
#define IS_SH_STRING(str) (SCM_CELL_TYPE(str)==SH_STRING_TAG)
+static inline SCM
+make_string_from_stringbuf (SCM stringbuf)
+{
+ return (scm_double_cell (STRING_TAG, SCM_UNPACK (stringbuf),
+ (scm_t_bits) 0,
+ (scm_t_bits) STRINGBUF_LENGTH (stringbuf)));
+}
+
SCM
scm_i_make_string (size_t len, char **charsp)
{
SCM buf = make_stringbuf (len);
- SCM res;
+
if (charsp)
*charsp = STRINGBUF_CHARS (buf);
- res = scm_double_cell (STRING_TAG, SCM_UNPACK(buf),
- (scm_t_bits)0, (scm_t_bits) len);
- return res;
+
+ return (make_string_from_stringbuf (buf));
}
static void
@@ -842,6 +1027,25 @@
}
SCM
+scm_from_stringn (const char *str, size_t len,
+ const char *charset)
+{
+ SCM stringbuf;
+
+ stringbuf = make_stringbuf_from_c_string (str, len, charset);
+ return (make_string_from_stringbuf (stringbuf));
+}
+
+SCM
+scm_from_string (const char *str, const char *charset)
+{
+ SCM stringbuf;
+
+ stringbuf = make_stringbuf_from_c_string (str, -1, charset);
+ return (make_string_from_stringbuf (stringbuf));
+}
+
+SCM
scm_from_locale_stringn (const char *str, size_t len)
{
SCM res;
@@ -849,8 +1053,10 @@
if (len == (size_t)-1)
len = strlen (str);
+
res = scm_i_make_string (len, &dst);
memcpy (dst, str, len);
+
return res;
}
--- orig/libguile/strings.h
+++ mod/libguile/strings.h
@@ -100,6 +100,9 @@
SCM_API int scm_is_string (SCM x);
SCM_API SCM scm_from_locale_string (const char *str);
SCM_API SCM scm_from_locale_stringn (const char *str, size_t len);
+SCM_API SCM scm_from_string (const char *str, const char *charset);
+SCM_API SCM scm_from_stringn (const char *str, size_t len,
+ const char *charset);
SCM_API SCM scm_take_locale_string (char *str);
SCM_API SCM scm_take_locale_stringn (char *str, size_t len);
SCM_API char *scm_to_locale_string (SCM str);
--- orig/srfi/srfi-19.scm
+++ mod/srfi/srfi-19.scm
@@ -41,7 +41,8 @@
(define-module (srfi srfi-19)
:use-module (srfi srfi-6)
:use-module (srfi srfi-8)
- :use-module (srfi srfi-9))
+ :use-module (srfi srfi-9)
+ :use-module (ice-9 i18n))
(begin-deprecated
;; Prevent `export' from re-exporting core bindings. This behaviour
@@ -150,48 +151,70 @@
;;-- LOCALE dependent constants
-(define priv:locale-number-separator ".")
+(define priv:locale-number-separator
+ (cond-feature (language-information RADIXCHAR)
+ (else ".")))
(define priv:locale-abbr-weekday-vector
- (vector "Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
+ (cond-feature (language-information
+ (vector ABDAY_1 ABDAY_2 ABDAY_3
+ ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7))
+ (else
+ (vector "Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))))
(define priv:locale-long-weekday-vector
- (vector
- "Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"))
+ (cond-feature (language-information
+ (vector DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7))
+ (else
+ (vector "Sunday" "Monday" "Tuesday" "Wednesday" "Thursday"
+ "Friday" "Saturday"))))
;; note empty string in 0th place.
(define priv:locale-abbr-month-vector
- (vector ""
- "Jan"
- "Feb"
- "Mar"
- "Apr"
- "May"
- "Jun"
- "Jul"
- "Aug"
- "Sep"
- "Oct"
- "Nov"
- "Dec"))
+ (cond-feature (language-information
+ (vector #f ABMON_1 ABMON_2 ABMON_3 ABMON_4 ABMON_5 ABMON_6
+ ABMON_7 ABMON_8 ABMON_9 ABMON_10 ABMON_11 ABMON_12))
+ (else
+ (vector ""
+ "Jan"
+ "Feb"
+ "Mar"
+ "Apr"
+ "May"
+ "Jun"
+ "Jul"
+ "Aug"
+ "Sep"
+ "Oct"
+ "Nov"
+ "Dec"))))
(define priv:locale-long-month-vector
- (vector ""
- "January"
- "February"
- "March"
- "April"
- "May"
- "June"
- "July"
- "August"
- "September"
- "October"
- "November"
- "December"))
-
-(define priv:locale-pm "PM")
-(define priv:locale-am "AM")
+ (cond-feature (language-information
+ (vector #f MON_1 MON_2 MON_3 MON_4 MON_5 MON_6
+ MON_7 MON_8 MON_9 MON_10 MON_11 MON_12))
+ (else
+ (vector ""
+ "January"
+ "February"
+ "March"
+ "April"
+ "May"
+ "June"
+ "July"
+ "August"
+ "September"
+ "October"
+ "November"
+ "December"))))
+
+(define priv:locale-pm
+ (cond-feature (language-information PM_STR)
+ (else "PM")))
+
+(define priv:locale-am
+ (cond-feature (language-information AM_STR)
+ (else "AM")))
;; See date->string
(define priv:locale-date-time-format "~a ~b ~d ~H:~M:~S~z ~Y")
@@ -965,23 +988,48 @@
(abs (remainder i (expt 10 n))))
(define (priv:locale-abbr-weekday n)
- (vector-ref priv:locale-abbr-weekday-vector n))
+ (cond-feature (language-information
+ (language-information
+ (vector-ref priv:locale-abbr-weekday-vector n)))
+ (else
+ (vector-ref priv:locale-abbr-weekday-vector n))))
(define (priv:locale-long-weekday n)
- (vector-ref priv:locale-long-weekday-vector n))
+ (cond-feature (language-information
+ (language-information
+ (vector-ref priv:locale-long-weekday-vector n)))
+ (else
+ (vector-ref priv:locale-long-weekday-vector n))))
(define (priv:locale-abbr-month n)
- (vector-ref priv:locale-abbr-month-vector n))
+ (cond-feature (language-information
+ (language-information
+ (vector-ref priv:locale-abbr-month-vector n)))
+ (else
+ (vector-ref priv:locale-abbr-month-vector n))))
(define (priv:locale-long-month n)
- (vector-ref priv:locale-long-month-vector n))
+ (cond-feature (language-information
+ (language-information
+ (vector-ref priv:locale-long-month-vector n)))
+ (else
+ (vector-ref priv:locale-long-month-vector n))))
(define (priv:vector-find needle haystack comparator)
(let ((len (vector-length haystack)))
(define (priv:vector-find-int index)
(cond
((>= index len) #f)
- ((comparator needle (vector-ref haystack index)) index)
+ ((comparator needle
+ (cond-feature
+ (language-information
+ (let ((item (vector-ref haystack index)))
+ (if item
+ (language-information item)
+ "")))
+ (else
+ (vector-ref haystack index))))
+ index)
(else (priv:vector-find-int (+ index 1)))))
(priv:vector-find-int 0)))
@@ -1003,10 +1051,10 @@
(define (priv:locale-print-time-zone date port)
(priv:tz-printer (date-zone-offset date) port))
-;; FIXME: we should use strftime to determine this dynamically if possible.
-;; Again, locale specific.
(define (priv:locale-am/pm hr)
- (if (> hr 11) priv:locale-pm priv:locale-am))
+ (let ((am/pm (if (> hr 11) priv:locale-pm priv:locale-am)))
+ (cond-feature (language-information (language-information am/pm))
+ (else am/pm))))
(define (priv:tz-printer offset port)
(cond
--- orig/test-suite/tests/i18n.test
+++ mod/test-suite/tests/i18n.test
@@ -19,6 +19,7 @@
(define-module (test-suite i18n)
:use-module (ice-9 i18n)
+ :use-module (srfi srfi-1)
:use-module (test-suite lib))
;; Start from a pristine locale state.
@@ -65,7 +66,7 @@
\f
(define %french-locale
(false-if-exception
- (make-locale (logior LC_CTYPE_MASK LC_COLLATE_MASK)
+ (make-locale (logior LC_CTYPE_MASK LC_COLLATE_MASK LC_NUMERIC_MASK)
"fr_FR.ISO-8859-1")))
(define (under-french-locale-or-unresolved thunk)
@@ -134,7 +135,54 @@
(make-locale LC_NUMERIC_MASK "C")))
(lambda (result char-count)
(and (equal? result 123.456)
- (equal? char-count 7))))))
+ (equal? char-count 7)))))
+
+ (pass-if "locale-string->inexact (French)"
+ (under-french-locale-or-unresolved
+ (lambda ()
+ (call-with-values
+ (lambda ()
+ (locale-string->inexact "123,456" %french-locale))
+ (lambda (result char-count)
+ (and (equal? result 123.456)
+ (equal? char-count 7))))))))
+
+\f
+;;;
+;;; `language-information'
+;;;
+
+(define %have-language-information?
+ (provided? 'language-information))
+
+(setlocale LC_ALL "C")
+(define %c-locale (make-locale LC_ALL_MASK "C"))
+
+(define %english-days
+ '("Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"))
+
+(define (every? . args)
+ (not (not (apply every args))))
+
+
+(with-test-prefix "language-information"
+
+ (pass-if "language-information (1 arg)"
+ (if (not %have-language-information?)
+ (throw 'unresolved)
+ (every? equal?
+ %english-days
+ (map language-information
+ (list DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7)))))
+
+ (pass-if "language-information (2 args)"
+ (if (not %have-language-information?)
+ (throw 'unresolved)
+ (every? equal?
+ %english-days
+ (map (lambda (day)
+ (language-information day %c-locale))
+ (list DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7))))))
;;; Local Variables:
--- orig/test-suite/tests/srfi-19.test
+++ mod/test-suite/tests/srfi-19.test
@@ -27,6 +27,9 @@
:use-module (srfi srfi-19)
:use-module (ice-9 format))
+;; Make sure we use the default locale.
+(setlocale LC_ALL "C")
+
(define (with-tz* tz thunk)
"Temporarily set the TZ environment variable to the passed string
value and call THUNK."
@@ -142,6 +145,19 @@
(string->date "2001-06-01@08:00" "~Y-~m-~d@~H:~M")))
(date->time-utc
(make-date 0 0 0 12 1 6 2001 0))))
+ (pass-if "string->date understands days and months"
+ (time=? (let ((d (string->date "Saturday, December 9, 2006"
+ "~A, ~B ~d, ~Y")))
+ (date->time-utc (make-date (date-nanosecond d)
+ (date-second d)
+ (date-minute d)
+ (date-hour d)
+ (date-day d)
+ (date-month d)
+ (date-year d)
+ 0)))
+ (date->time-utc
+ (make-date 0 0 0 0 9 12 2006 0))))
;; check time comparison procedures
(let* ((time1 (make-time time-monotonic 0 0))
(time2 (make-time time-monotonic 0 0))
[-- Attachment #3: Type: text/plain, Size: 143 bytes --]
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-10 15:04 More i18n Ludovic Courtès
@ 2006-12-11 19:42 ` Neil Jerram
2006-12-12 9:09 ` Ludovic Courtès
2006-12-13 0:11 ` Kevin Ryde
2006-12-11 23:26 ` Kevin Ryde
2006-12-12 19:19 ` Kevin Ryde
2 siblings, 2 replies; 23+ messages in thread
From: Neil Jerram @ 2006-12-11 19:42 UTC (permalink / raw)
Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
> Hi,
>
> The patch below provides further i18n support.
Overall, this all looks fine to me (although I'm nowhere near an
expert in i18n). Detailed comments on about 40% are as follows.
> +# langinfo.h, nl_types.h - SuS v2
> +# iconv.h - SuS v2
"SuS v2" isn't quite a reason for testing, is it? I think you need a
few more words here.
> +@deftypefn {C Function} SCM scm_from_string (const char *str, const char *charset)
> +@deftypefnx {C Function} SCM scm_from_stringn (const char *str, size_t len, const char *charset)
> +These are generalized versions of the above functions.
> +
> +Create a Scheme string that has the same contents as @var{str} when
> +interpreted in the character encoding specified by @var{charset} (the
> +interpretation of charset is platform-dependent).
Could you:
(i) add a reference to where charsets are documented further
(ii) modify wording to make clear that it is the source C string that
is being "interpreted in the character encoding ..."
?
> +@findex nl_langinfo
> +@cindex low-level locale information
> +It is sometimes useful to obtain very specific information about a
> +locale such as the name it uses for days or months, its format for
> +representing floating-point figures, etc. The @code{(ice-9 i18n)}
> +module provides support for this with the @code{language-information}
> +procedure. Note that this procedure is only available on platforms
> +that provide @code{nl_langinfo ()} (@pxref{The Elegant and Fast Way,
> +@code{nl_langinfo},, libc, The GNU C Library Reference Manual}). When
> +@code{language-information}, the @code{language-information} feature
> +is provided (@pxref{Feature Tracking}).
Last sentence seems garbled.
> + ;; detailed language information
> + language-information
> + CODESET
> + ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7
> + DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7
> + ABMON_1 ABMON_2 ABMON_3 ABMON_4 ABMON_5 ABMON_6 ABMON_7
> + ABMON_8 ABMON_9 ABMON_10 ABMON_11 ABMON_12
> + MON_1 MON_2 MON_3 MON_4 MON_5 MON_6 MON_7 MON_8
> + MON_9 MON_10 MON_11 MON_12
> + AM_STR PM_STR
> + D_T_FMT D_FMT T_FMT T_FMT_AMPM
> + ERA ERA_D_FMT ERA_D_T_FMT ERA_T_FMT
> + ALT_DIGITS RADIXCHAR THOUSEP
> + CRNCYSTR
> + YESEXPR NOEXPR YESSTR NOSTR
> +
> + ;; GNU extensions (may be unavailable on non-GNU systems)
> + ERA_YEAR DECIMAL_POINT
> + INT_CURR_SYMBOL CURRENCY_SYMBOL
> + MON_DECIMAL_POINT MON_THOUSANDS_SEP MON_GROUPING
> + POSITIVE_SIGN NEGATIVE_SIGN INT_FRAC_DIGITS FRAC_DIGITS
> + P_CS_PRECEDES P_SEP_BY_SPACE N_CS_PRECEDES N_SEP_BY_SPACE
> + P_SIGN_POSN N_SIGN_POSN INT_P_CS_PRECEDES INT_P_SEP_BY_SPACE
> + INT_N_CS_PRECEDES INT_N_SEP_BY_SPACE INT_P_SIGN_POSN
> + INT_N_SIGN_POSN))
That's a lot of constant definitions! Did you consider the possibility of
any other approach (e.g. symbols)?
> + 1. The `CODESET' value if not normalized. This is a secondary
> + issue, but
"if" -> "is"?
> + 2. `nl_langinfo ()' is not available on Windows (but do we care?). */
I think we care enough for it to be worth removing the "(but do we
care?)".
> + c_result = strdup (c_result);
> +
> + codeset = (char *) alloca (strlen (tmp_codeset) + 1);
> + strcpy (codeset, tmp_codeset);
Can we avoid alloca here? We know from another current thread that
some systems don't seem to have it.
...
That's as far as I've got for now. Does anyone else have a cunning
way of reviewing a big diff like this? I'm finding it quite slow. It
would be nicer if epatch or something similar worked, but for me it
doesn't.
Regards,
Neil
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-10 15:04 More i18n Ludovic Courtès
2006-12-11 19:42 ` Neil Jerram
@ 2006-12-11 23:26 ` Kevin Ryde
2006-12-12 9:36 ` Ludovic Courtès
2006-12-12 19:19 ` Kevin Ryde
2 siblings, 1 reply; 23+ messages in thread
From: Kevin Ryde @ 2006-12-11 23:26 UTC (permalink / raw)
Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
>
> In order to be consistent with the rest of `(ice-9 i18n)',
> `language-information' accepts an optional argument which should be a
> locale object. Consequently, `language-information' has to perform
> appropriate charset conversion.
If you ask for something from a particular locale object, shouldn't
you get the charset of that object? That'd be what I'd expect.
(Just for the time being of course, while encodings in strings are
exposed.)
> +# iconv.h - SuS v2
> +# iconv - X/Open, SuS v2
AM_ICONV (from gettext, see its manual) is good for that detection, it
can cope with various oddities.
> +@example
> +(language-information DAY_1 (make-locale LC_ALL_MASK "C"))
> +@result{} "Sunday"
DAY_1 smells a lot like C, surely something more schemely is possible?
After all, guile isn't a C interpreter with parenthesized syntax! :-)
You should setup to use localeconv and strftime too, if nl_langinfo
isn't available, that'd cope with DOS systems. Or if you want to tie
only to nl_langinfo then calling the function nl-langinfo will make it
clearest what's being had.
> + scm_error (scm_system_error_key, func_name,
> + "Failed to install locale",
> + scm_cons (scm_strerror (s_err), SCM_EOL),
> + scm_cons (s_err, SCM_EOL));
SCM_SYSERROR ?
> + c_result = strdup (c_result);
> + if (c_result == NULL)
> + {
> + result = SCM_BOOL_F;
> + /* FIXME: Raise an error. */
The frames stuff is the easiest way to cope with errors like out of
memory from within a mutex lock.
Though I suspect there's quite a few places in existing code that may
leak resources of one type or another under out of memory errors. And
I doubt out of memory is even recoverable at all if genuinely almost
full.
> +#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
> + scm_c_define ("LC_" SCM_I18N_STRINGIFY (_name) "_MASK", \
> SCM_I_MAKINUM (LC_ ## _name ## _MASK));
scm_from_int (or long, as the case may be) can protect against the
unlikely case of a mask bit outside the size of a fixnum.
> + nconv = iconv (cc, (char **)&str, &len,
> + &output, &output_avail);
Incidentally, if you're thinking about iconv (which I suspect is not
needed yet), then I've found it useful to also have conversions that
put in dummies for untranslatable chars. With glibc the charset names
"foo//TRANSLIT" etc give that effect, but elsewhere you have to watch
for EILSEQ and skip that char. Good for output to at least display
something.
> (define priv:locale-abbr-weekday-vector
> - (vector "Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
> + (cond-feature (language-information
> + (vector ABDAY_1 ABDAY_2 ABDAY_3
> + ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7))
This raises a point about the interface. It should be possible to ask
for "day number N". If ABDAY and friends aren't consecutive then they
should be.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-11 19:42 ` Neil Jerram
@ 2006-12-12 9:09 ` Ludovic Courtès
2006-12-12 19:48 ` Kevin Ryde
2006-12-31 16:22 ` Neil Jerram
2006-12-13 0:11 ` Kevin Ryde
1 sibling, 2 replies; 23+ messages in thread
From: Ludovic Courtès @ 2006-12-12 9:09 UTC (permalink / raw)
Cc: guile-devel
Hi,
I agree with your comments and fixes and I'll address them soom. Just a
few notes...
Neil Jerram <neil@ossau.uklinux.net> writes:
> "SuS v2" isn't quite a reason for testing, is it? I think you need a
> few more words here.
I meant that they may not be available everywhere. I should certainly
make it clearer.
> That's a lot of constant definitions! Did you consider the possibility of
> any other approach (e.g. symbols)?
I'm no longer sure it's generally good to use symbol to that end. It
creates secondary name spaces where regular module-based access control
cannot be used.
There's nothing obliging us to use integers for those constants. For
instance, `DAY_1' could be defined as:
(define DAY_1 (cons 'a 'b))
Then, if a piece of code doesn't know (or cannot access) the value of
`DAY_1', it cannot forge it and consequently cannot get the effect of
`(language-information DAY_1)'. This is the same as what happens with
other bindings.
Now, as Kevin said, the names are very C-ish...
>> + c_result = strdup (c_result);
>> +
>> + codeset = (char *) alloca (strlen (tmp_codeset) + 1);
>> + strcpy (codeset, tmp_codeset);
>
> Can we avoid alloca here? We know from another current thread that
> some systems don't seem to have it.
Sure, but `alloca ()' may be faster on most platforms and it avoids
leaks. Isn't it just that we should update the `alloca' blob from the
Autoconf manual?
> That's as far as I've got for now. Does anyone else have a cunning
> way of reviewing a big diff like this? I'm finding it quite slow. It
> would be nicer if epatch or something similar worked, but for me it
> doesn't.
Unfortunately, no.
Thanks!
Ludovic.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-11 23:26 ` Kevin Ryde
@ 2006-12-12 9:36 ` Ludovic Courtès
2006-12-12 19:43 ` Kevin Ryde
0 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2006-12-12 9:36 UTC (permalink / raw)
Hi,
Kevin Ryde <user42@zip.com.au> writes:
> ludo@chbouib.org (Ludovic Courtès) writes:
>>
>> In order to be consistent with the rest of `(ice-9 i18n)',
>> `language-information' accepts an optional argument which should be a
>> locale object. Consequently, `language-information' has to perform
>> appropriate charset conversion.
>
> If you ask for something from a particular locale object, shouldn't
> you get the charset of that object? That'd be what I'd expect.
My understanding is that, implicitly, Guile currently uses the current
locale's encoding as its "internal representation". For instance,
`scm_from_locale_string ()' stores the input string "as is", i.e., in
the current locale's encoding. Thus, I think it makes sense to remain
consistent with this view.
When we have a non-8-bit internal representation, conversion will
probably be needed in any case.
> AM_ICONV (from gettext, see its manual) is good for that detection, it
> can cope with various oddities.
Ok.
> DAY_1 smells a lot like C, surely something more schemely is possible?
> After all, guile isn't a C interpreter with parenthesized syntax! :-)
Agreed, but changing the name makes it harder to automate the definition
with macros...
> You should setup to use localeconv and strftime too, if nl_langinfo
> isn't available, that'd cope with DOS systems. Or if you want to tie
> only to nl_langinfo then calling the function nl-langinfo will make it
> clearest what's being had.
Are there many systems that provide `localeconv' but not `nl_langinfo'?
I guess Windows has the former but not the latter, in which case it
might make sense to support it. That would require a bit of work,
though.
> Though I suspect there's quite a few places in existing code that may
> leak resources of one type or another under out of memory errors. And
> I doubt out of memory is even recoverable at all if genuinely almost
> full.
Yes, out-of-memory is currently mostly handled by `abort ()' (e.g.,
`scm_memory_error ()').
> Incidentally, if you're thinking about iconv (which I suspect is not
> needed yet), then I've found it useful to also have conversions that
> put in dummies for untranslatable chars. With glibc the charset names
> "foo//TRANSLIT" etc give that effect, but elsewhere you have to watch
> for EILSEQ and skip that char. Good for output to at least display
> something.
The issue is that (i) it is glibc-specific and (ii) it is not a desired
behavior in many cases.
>> (define priv:locale-abbr-weekday-vector
>> - (vector "Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
>> + (cond-feature (language-information
>> + (vector ABDAY_1 ABDAY_2 ABDAY_3
>> + ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7))
>
> This raises a point about the interface. It should be possible to ask
> for "day number N". If ABDAY and friends aren't consecutive then they
> should be.
Right. I guess we could provide a higher-level API, along the lines of:
(language-information AB_DAY 1)
... that would use the same kind of vector internally to map day numbers
to `nl_langinfo' constants. This could be implemented in Scheme.
So, maybe we should rename `language-information' to `nl-langinfo' and
keep the C code as simple as possible, while leaving room for a
higher-level implementation that would possibly deserve the name of
`language-information'?
Thanks,
Ludovic.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-10 15:04 More i18n Ludovic Courtès
2006-12-11 19:42 ` Neil Jerram
2006-12-11 23:26 ` Kevin Ryde
@ 2006-12-12 19:19 ` Kevin Ryde
2006-12-31 16:15 ` Neil Jerram
2007-01-16 21:46 ` Ludovic Courtès
2 siblings, 2 replies; 23+ messages in thread
From: Kevin Ryde @ 2006-12-12 19:19 UTC (permalink / raw)
ludo@chbouib.org (Ludovic Courtès) writes:
>
> + ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7
> + DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7
By way of concrete suggestion, something like
(locale-weekday N)
(locale-weekday-short N)
(locale-month N)
(locale-month-short N)
or similar names would be much nicer than messing about with
(... (+ ABDAY_1 N))
(... (+ MON_1 N -1)) ;; if N is 1 to 12
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-12 9:36 ` Ludovic Courtès
@ 2006-12-12 19:43 ` Kevin Ryde
0 siblings, 0 replies; 23+ messages in thread
From: Kevin Ryde @ 2006-12-12 19:43 UTC (permalink / raw)
ludovic.courtes@laas.fr (Ludovic Courtès) writes:
>
> My understanding is that, implicitly, Guile currently uses the current
> locale's encoding as its "internal representation".
Not really, I believe it's a case of encoding ignorance, or at best
agnosticism :-). A few functions care, but on the whole there's
nothing uniformly applied.
No doubt this is set to change, but for now you can have different
strings with different encodings (as I've had the misfortune to be
doing lately). If there were a single encoding enforced for strings
then you'd for instance like existing strings converted upon doing a
setlocale.
All changing in the perhaps near future, until then I'd suggest doing
as little as possible. (Make sure it's 8-bit clean and leave it at
that.)
> Are there many systems that provide `localeconv' but not `nl_langinfo'?
> I guess Windows has the former but not the latter, in which case it
> might make sense to support it.
Supposedly there's a GetLocaleInfo or something on DOS, if anyone
cared.
> The issue is that (i) it is glibc-specific and (ii) it is not a desired
> behavior in many cases.
Yes, //translit is glibc, but putting question marks for bad chars is
possible everywhere, and would be a nice feature, eventually.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-12 9:09 ` Ludovic Courtès
@ 2006-12-12 19:48 ` Kevin Ryde
2006-12-31 16:23 ` Neil Jerram
2006-12-31 16:22 ` Neil Jerram
1 sibling, 1 reply; 23+ messages in thread
From: Kevin Ryde @ 2006-12-12 19:48 UTC (permalink / raw)
Cc: guile-devel
ludovic.courtes@laas.fr (Ludovic Courtès) writes:
>
> Sure, but `alloca ()' may be faster on most platforms and it avoids
> leaks. Isn't it just that we should update the `alloca' blob from the
> Autoconf manual?
Yep. I'm not aware of any problem with alloca availability, but it
seems the blob has slightly regressed for fleebsd systems at the
moment, that's all.
> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> That's as far as I've got for now. Does anyone else have a cunning
>> way of reviewing a big diff like this?
There's a story of a buddhist master who held a book up to his nose
and said "it has a troublesome smell". Try holding your nose up to
the screen to see if you have a similar power for patches! :-)
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-11 19:42 ` Neil Jerram
2006-12-12 9:09 ` Ludovic Courtès
@ 2006-12-13 0:11 ` Kevin Ryde
2006-12-31 16:54 ` Neil Jerram
1 sibling, 1 reply; 23+ messages in thread
From: Kevin Ryde @ 2006-12-13 0:11 UTC (permalink / raw)
Cc: guile-devel, Ludovic Courtès
Neil Jerram <neil@ossau.uklinux.net> writes:
>
> (i) add a reference to where charsets are documented further
If only there was a decent central bit in the gnu manuals :-(.
Running "iconv -l" is the only way I know to even see a full list.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-12 19:19 ` Kevin Ryde
@ 2006-12-31 16:15 ` Neil Jerram
2007-01-01 22:32 ` Kevin Ryde
2007-01-16 21:46 ` Ludovic Courtès
1 sibling, 1 reply; 23+ messages in thread
From: Neil Jerram @ 2006-12-31 16:15 UTC (permalink / raw)
Kevin Ryde <user42@zip.com.au> writes:
> ludo@chbouib.org (Ludovic Courtès) writes:
>>
>> + ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7
>> + DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7
>
> By way of concrete suggestion, something like
>
> (locale-weekday N)
> (locale-weekday-short N)
> (locale-month N)
> (locale-month-short N)
Yes, that looks nice.
> or similar names would be much nicer than messing about with
>
> (... (+ ABDAY_1 N))
> (... (+ MON_1 N -1)) ;; if N is 1 to 12
Not sure about 1..12 though; 0-based indices are a pretty well
established convention in Scheme, aren't they?
Regards,
Neil
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-12 9:09 ` Ludovic Courtès
2006-12-12 19:48 ` Kevin Ryde
@ 2006-12-31 16:22 ` Neil Jerram
1 sibling, 0 replies; 23+ messages in thread
From: Neil Jerram @ 2006-12-31 16:22 UTC (permalink / raw)
ludovic.courtes@laas.fr (Ludovic Courtès) writes:
> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> "SuS v2" isn't quite a reason for testing, is it? I think you need a
>> few more words here.
>
> I meant that they may not be available everywhere. I should certainly
> make it clearer.
Thanks.
>> That's a lot of constant definitions! Did you consider the possibility of
>> any other approach (e.g. symbols)?
>
> I'm no longer sure it's generally good to use symbol to that end. It
> creates secondary name spaces where regular module-based access control
> cannot be used.
Good point; agreed.
> There's nothing obliging us to use integers for those constants. For
> instance, `DAY_1' could be defined as:
>
> (define DAY_1 (cons 'a 'b))
>
> Then, if a piece of code doesn't know (or cannot access) the value of
> `DAY_1', it cannot forge it and consequently cannot get the effect of
> `(language-information DAY_1)'. This is the same as what happens with
> other bindings.
>
> Now, as Kevin said, the names are very C-ish...
I think your higher-level interface idea, including Kevin's
locale-weekday etc., would do a good job of covering that up.
>>> + c_result = strdup (c_result);
>>> +
>>> + codeset = (char *) alloca (strlen (tmp_codeset) + 1);
>>> + strcpy (codeset, tmp_codeset);
>>
>> Can we avoid alloca here? We know from another current thread that
>> some systems don't seem to have it.
>
> Sure, but `alloca ()' may be faster on most platforms and it avoids
> leaks. Isn't it just that we should update the `alloca' blob from the
> Autoconf manual?
Yes, quite right; I've been inappropriately concerned about alloca
recently!
Regards,
Neil
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-12 19:48 ` Kevin Ryde
@ 2006-12-31 16:23 ` Neil Jerram
0 siblings, 0 replies; 23+ messages in thread
From: Neil Jerram @ 2006-12-31 16:23 UTC (permalink / raw)
Kevin Ryde <user42@zip.com.au> writes:
>> Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>>> That's as far as I've got for now. Does anyone else have a cunning
>>> way of reviewing a big diff like this?
>
> There's a story of a buddhist master who held a book up to his nose
> and said "it has a troublesome smell". Try holding your nose up to
> the screen to see if you have a similar power for patches! :-)
Thanks, that sounds good - except I'm not sure Ludovic would
appreciate that kind of feedback!
Regards,
Neil
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-13 0:11 ` Kevin Ryde
@ 2006-12-31 16:54 ` Neil Jerram
0 siblings, 0 replies; 23+ messages in thread
From: Neil Jerram @ 2006-12-31 16:54 UTC (permalink / raw)
Cc: guile-devel
Kevin Ryde <user42@zip.com.au> writes:
> Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>> (i) add a reference to where charsets are documented further
>
> If only there was a decent central bit in the gnu manuals :-(.
> Running "iconv -l" is the only way I know to even see a full list.
It wasn't a list that I was looking for, just some idea of what a
charset is.
I've read just enough about this area to know that there is a lot of
terminology describing similar but subtly different concepts, like
locales, charsets, code pages; chars, code points, glyphs; encodings,
etc.
If it's a case of Guile's "charset" mapping exactly to some
well-known external definition of "charset", all I'm asking for is a
reference to that external definition.
Regards,
Neil
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-31 16:15 ` Neil Jerram
@ 2007-01-01 22:32 ` Kevin Ryde
0 siblings, 0 replies; 23+ messages in thread
From: Kevin Ryde @ 2007-01-01 22:32 UTC (permalink / raw)
Cc: guile-devel
Neil Jerram <neil@ossau.uklinux.net> writes:
>
> Not sure about 1..12 though; 0-based indices are a pretty well
> established convention in Scheme, aren't they?
0 to 6 on weekday names would be ok. But on the months I think 1 to
12 would be more helpful, and probably less error-prone if you're
mostly using 1 to 12 on input/output.
I know tm:mon / tm_mon is 0 to 11, but I've always found that pretty
irritating.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2006-12-12 19:19 ` Kevin Ryde
2006-12-31 16:15 ` Neil Jerram
@ 2007-01-16 21:46 ` Ludovic Courtès
2007-01-16 22:08 ` Ludovic Courtès
` (2 more replies)
1 sibling, 3 replies; 23+ messages in thread
From: Ludovic Courtès @ 2007-01-16 21:46 UTC (permalink / raw)
[-- Attachment #1: Type: text/plain, Size: 2046 bytes --]
Hi,
Kevin Ryde <user42@zip.com.au> writes:
> By way of concrete suggestion, something like
>
> (locale-weekday N)
> (locale-weekday-short N)
> (locale-month N)
> (locale-month-short N)
>
> or similar names would be much nicer than messing about with
>
> (... (+ ABDAY_1 N))
> (... (+ MON_1 N -1)) ;; if N is 1 to 12
The patch below (against HEAD) addresses some of the concerns that were
raised earlier, notably:
1. The awful, C-ish `nl_langinfo'-based API is now replaced by a shiny
Schemey API, following Kevin's suggestion. `nl-langinfo' itself is
only used internally and not exported. As a result, most of the
`nl-langinfo'-based procedures are written in Scheme.
2. Since `nl-langinfo' is hidden from Scheme programmers, all the
`locale-' procedures are guaranteed to work, no matter whether
`nl_langinfo ()' is actually available (see the feature-conditional
macros in `i18n.scm').
As a consequence, SRFI-19 internationalization is now
straightforward as does not require feature-conditional code.
3. Previously uninterpreted `nl_langinfo ()' special return values are
now correctly interpreted. For instance, the return value for
`GROUPING' should not be interpreted as a string, as was the case
with my earlier patch.
4. No attempt is made to convert to a different encoding the strings
returned by `nl_langinfo ()'.
5. High-level procedures for locale-dependent number output are
provided, namely `number->locale-string' and
`monetary-amount->locale-string'. Both of them are implemented in
Scheme (which is relieving, really ;-)).
6. Various bugs in the C code base were fixed, a few improvements were
made (notably slight optimizations for systems without `locale_t').
7. The test suite and documentation were augmented.
I'm well aware that the patch is hard to read. Perhaps the most
important things to start with are the API in `i18n.scm' and the
corresponding documentation and test cases.
Thanks,
Ludovic.
[-- Attachment #2: The new i18n patch! --]
[-- Type: text/x-diff, Size: 90056 bytes --]
--- orig/configure.in
+++ mod/configure.in
@@ -525,12 +525,13 @@
# Reasons for testing:
# complex.h - new in C99
# fenv.h - available in C99, but not older systems
+# langinfo.h, nl_types.h - SuS v2
#
AC_CHECK_HEADERS([complex.h fenv.h io.h libc.h limits.h malloc.h memory.h string.h \
regex.h rxposix.h rx/rxposix.h sys/dir.h sys/ioctl.h sys/select.h \
sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
sys/utime.h time.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
-direct.h])
+direct.h langinfo.h nl_types.h])
# "complex double" is new in C99, and "complex" is only a keyword if
# <complex.h> is included
@@ -613,17 +614,19 @@
# truncate - not in mingw
# isblank - available as a GNU extension or in C99
# _NSGetEnviron - Darwin specific
-# strcoll_l, newlocale - GNU extensions (glibc)
+# strcoll_l, newlocale - GNU extensions (glibc), also available on Darwin
+# nl_langinfo - X/Open, not available on Windows.
#
-AC_CHECK_FUNCS([DINFINITY DQNAN chsize clog10 ctermid fesetround ftime ftruncate fchown getcwd geteuid gettimeofday gmtime_r ioctl lstat mkdir mknod nice readdir_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l newlocale])
+AC_CHECK_FUNCS([DINFINITY DQNAN chsize clog10 ctermid fesetround ftime ftruncate fchown getcwd geteuid gettimeofday gmtime_r ioctl lstat mkdir mknod nice readdir_r readlink rename rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l newlocale nl_langinfo])
# Reasons for testing:
# netdb.h - not in mingw
# sys/param.h - not in mingw
# sethostname - the function itself check because it's not in mingw,
# the DECL is checked because Solaris 10 doens't have in any header
+# xlocale.h - needed on Darwin for the `locale_t' API
#
-AC_CHECK_HEADERS(crypt.h netdb.h sys/param.h sys/resource.h sys/file.h)
+AC_CHECK_HEADERS(crypt.h netdb.h sys/param.h sys/resource.h sys/file.h xlocale.h)
AC_CHECK_FUNCS(chroot flock getlogin cuserid getpriority setpriority getpass sethostname gethostname)
AC_CHECK_DECLS([sethostname])
--- orig/doc/ref/api-data.texi
+++ mod/doc/ref/api-data.texi
@@ -1,6 +1,6 @@
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@@ -1015,7 +1015,7 @@
The following procedures read and write numbers according to their
external representation as defined by R5RS (@pxref{Lexical structure,
R5RS Lexical Structure,, r5rs, The Revised^5 Report on the Algorithmic
-Language Scheme}). @xref{The ice-9 i18n Module, the @code{(ice-9
+Language Scheme}). @xref{Number Input and Output, the @code{(ice-9
i18n)} module}, for locale-dependent number parsing.
@deffn {Scheme Procedure} number->string n [radix]
@@ -2949,7 +2949,7 @@
The first set is specified in R5RS and has names that end in @code{?}.
The second set is specified in SRFI-13 and the names have no ending
@code{?}. The predicates ending in @code{-ci} ignore the character case
-when comparing strings. @xref{The ice-9 i18n Module, the @code{(ice-9
+when comparing strings. @xref{Text Collation, the @code{(ice-9
i18n)} module}, for locale-dependent string comparison.
@rnindex string=?
--- orig/doc/ref/api-i18n.texi
+++ mod/doc/ref/api-i18n.texi
@@ -1,6 +1,6 @@
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@@ -19,16 +19,20 @@
program message strings.
@menu
-* The ice-9 i18n Module:: Honoring cultural conventions.
-* Gettext Support:: Translating message strings.
+* i18n Introduction:: Introduction to Guile's i18n support.
+* Text Collation:: Sorting strings and characters.
+* Character Case Mapping:: Case mapping.
+* Number Input and Output:: Parsing and printing numbers.
+* Accessing Locale Information:: Detailed locale information.
+* Gettext Support:: Translating message strings.
@end menu
-@node The ice-9 i18n Module
-@subsection The @code{(ice-9 i18n)} Module
+@node i18n Introduction, Text Collation, Internationalization, Internationalization
+@subsection Internationalization with Guile
-In order to make use of the following functions, one must import the
-@code{(ice-9 i18n)} module in the usual way:
+In order to make use of the functions described thereafter, the
+@code{(ice-9 i18n)} module must be imported in the usual way:
@example
(use-modules (ice-9 i18n))
@@ -64,83 +68,41 @@
The procedures provided by this module allow the development of
programs that adapt automatically to any locale setting. As we will
-see later, many of the locale-dependent procedures provided by this
-module can optionally take a @dfn{locale object} argument. This
-additional argument defines the locale settings that must be followed
-by the invoked procedure. When it is omitted, then the current locale
-settings of the process are followed (@pxref{Locales,
-@code{setlocale}}).
+see later, many of these procedures can optionally take a @dfn{locale
+object} argument. This additional argument defines the locale
+settings that must be followed by the invoked procedure. When it is
+omitted, then the current locale settings of the process are followed
+(@pxref{Locales, @code{setlocale}}).
The following procedures allow the manipulation of such locale
objects.
-@deffn {Scheme Procedure} make-locale category-mask locale-name [base-locale]
-@deffnx {C Function} scm_make_locale (category_mask, locale_name, base_locale)
+@deffn {Scheme Procedure} make-locale category-list locale-name [base-locale]
+@deffnx {C Function} scm_make_locale (category_list, locale_name, base_locale)
Return a reference to a data structure representing a set of locale
datasets. @var{locale-name} should be a string denoting a particular
-locale, e.g., @code{"aa_DJ"}. Unlike for the @var{category} parameter
-for @code{setlocale}, the @var{category-mask} parameter here uses a
-single bit for each category, made by OR'ing together @code{LC_*_MASK}
-bits. The optional @var{base-locale} argument can be used to specify
-a locale object whose settings are to be used as a basis for the
-locale object being returned.
-
-The available locale category masks are the following:
-
-@defvar LC_COLLATE_MASK
-Represents the collation locale category.
-@end defvar
-@defvar LC_CTYPE_MASK
-Represents the character classification locale category.
-@end defvar
-@defvar LC_MESSAGES_MASK
-Represents the messages locale category.
-@end defvar
-@defvar LC_MONETARY_MASK
-Represents the monetary locale category.
-@end defvar
-@defvar LC_NUMERIC_MASK
-Represents the way numbers are displayed.
-@end defvar
-@defvar LC_TIME_MASK
-Represents the way date and time are displayed
-@end defvar
-
-The following category masks are also available but will not have any
-effect on systems that do not support them:
-
-@defvar LC_PAPER_MASK
-@defvarx LC_NAME_MASK
-@defvarx LC_ADDRESS_MASK
-@defvarx LC_TELEPHONE_MASK
-@defvarx LC_MEASUREMENT_MASK
-@defvarx LC_IDENTIFICATION_MASK
-@end defvar
-
-Finally, there is also:
-
-@defvar LC_ALL_MASK
-This represents all the locale categories supported by the system.
-@end defvar
-
-The @code{LC_*_MASK} variables are bound to integers which may be OR'd
-together using @code{logior} (@pxref{Primitive Numerics,
-@code{logior}}). For instance, the following invocation creates a
-locale object that combines the use of Esperanto for messages and
-character classification with the default settings for the other
-categories (i.e., the settings of the default @code{C} locale which
-usually represents conventions in use in the USA):
-
-@example
-(make-locale (logior LC_MESSAGE_MASK LC_CTYPE_MASK) "eo_EO")
+locale (e.g., @code{"aa_DJ"}) and @var{category-list} should be a list
+of locale categories as used with @code{setlocale} (@pxref{Locales,
+@code{setlocale}}). Optionally, if @code{base-locale} is passed, it
+should be a locale object denoting settings for categories not listed
+in @var{category-list}.
+
+The following invocation creates a locale object that combines the use
+of Swedish for messages and character classification with the
+default settings for the other categories (i.e., the settings of the
+default @code{C} locale which usually represents conventions in use in
+the USA):
+
+@example
+(make-locale (list LC_MESSAGE LC_CTYPE) "sv_SE")
@end example
-The following example combines the use of Swedish conventions with
-monetary conventions from Croatia:
+The following example combines the use of Esperanto messages and
+conventions with monetary conventions from Croatia:
@example
-(make-locale LC_MONETARY_MASK "hr_HR"
- (make-locale LC_ALL_MASK "sv_SE"))
+(make-locale (list LC_MONETARY) "hr_HR"
+ (make-locale (list LC_ALL) "eo_EO"))
@end example
A @code{system-error} exception (@pxref{Handling Errors}) is raised by
@@ -155,70 +117,56 @@
Return true if @var{obj} is a locale object.
@end deffn
-The following procedures provide support for text collation.
+@defvr {Scheme Variable} %global-locale
+@defvrx {C Variable} scm_global_locale
+This variable is bound to a locale object denoting the current process
+locale as installed using @code{setlocale ()} (@pxref{Locales}). It
+may be used like any other locale object, including as a third
+argument to @code{make-locale}, for instance.
+@end defvr
+
+
+@node Text Collation, Character Case Mapping, i18n Introduction, Internationalization
+@subsection Text Collation
+
+The following procedures provide support for text collation, i.e.,
+locale-dependent string and character sorting.
@deffn {Scheme Procedure} string-locale<? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_lt (s1, s2, locale)
-Compare strings @var{s1} and @var{s2} in a locale-dependent way. If
-@var{locale} is provided, it should be locale object (as returned by
-@code{make-locale}) and will be used to perform the comparison;
-otherwise, the current system locale is used.
-@end deffn
-
-@deffn {Scheme Procedure} string-locale>? s1 s2 [locale]
+@deffnx {Scheme Procedure} string-locale>? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_gt (s1, s2, locale)
+@deffnx {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
+@deffnx {C Function} scm_string_locale_ci_lt (s1, s2, locale)
+@deffnx {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
+@deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a locale-dependent way. If
@var{locale} is provided, it should be locale object (as returned by
@code{make-locale}) and will be used to perform the comparison;
-otherwise, the current system locale is used.
-@end deffn
-
-@deffn {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
-@deffnx {C Function} scm_string_locale_ci_lt (s1, s2, locale)
-Compare strings @var{s1} and @var{s2} in a case-insensitive, and
-locale-dependent way. If @var{locale} is provided, it should be
-locale object (as returned by @code{make-locale}) and will be used to
-perform the comparison; otherwise, the current system locale is used.
-@end deffn
-
-@deffn {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
-@deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
-Compare strings @var{s1} and @var{s2} in a case-insensitive, and
-locale-dependent way. If @var{locale} is provided, it should be
-locale object (as returned by @code{make-locale}) and will be used to
-perform the comparison; otherwise, the current system locale is used.
+otherwise, the current system locale is used. For the @code{-ci}
+variants, the comparison is made in a case-insensitive way.
@end deffn
@deffn {Scheme Procedure} string-locale-ci=? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_ci_eq (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a case-insensitive, and
locale-dependent way. If @var{locale} is provided, it should be
-locale object (as returned by @code{make-locale}) and will be used to
+a locale object (as returned by @code{make-locale}) and will be used to
perform the comparison; otherwise, the current system locale is used.
@end deffn
@deffn {Scheme Procedure} char-locale<? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_lt (c1, c2, locale)
-Return true if character @var{c1} is lower than @var{c2} according to
-@var{locale} or to the current locale.
-@end deffn
-
-@deffn {Scheme Procedure} char-locale>? c1 c2 [locale]
+@deffnx {Scheme Procedure} char-locale>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_gt (c1, c2, locale)
-Return true if character @var{c1} is greater than @var{c2} according
-to @var{locale} or to the current locale.
-@end deffn
-
-@deffn {Scheme Procedure} char-locale-ci<? c1 c2 [locale]
+@deffnx {Scheme Procedure} char-locale-ci<? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_lt (c1, c2, locale)
-Return true if character @var{c1} is lower than @var{c2}, in a case
-insensitive way according to @var{locale} or to the current locale.
-@end deffn
-
-@deffn {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
+@deffnx {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_gt (c1, c2, locale)
-Return true if character @var{c1} is greater than @var{c2}, in a case
-insensitive way according to @var{locale} or to the current locale.
+Compare characters @var{c1} and @var{c2} according to either
+@var{locale} (a locale object as returned by @code{make-locale}) or
+the current locale. For the @code{-ci} variants, the comparison is
+made in a case-insensitive way.
@end deffn
@deffn {Scheme Procedure} char-locale-ci=? c1 c2 [locale]
@@ -227,6 +175,9 @@
insensitive way according to @var{locale} or to the current locale.
@end deffn
+@node Character Case Mapping, Number Input and Output, Text Collation, Internationalization
+@subsection Character Case Mapping
+
The procedures below provide support for ``character case mapping'',
i.e., to convert characters or strings to their upper-case or
lower-case equivalent. Note that SRFI-13 provides procedures that
@@ -236,8 +187,8 @@
or region of the world. For instance, while most languages using the
Latin alphabet map lower-case letter ``i'' to upper-case letter ``I'',
Turkish maps lower-case ``i'' to ``Latin capital letter I with dot
-above''. The following procedures allow to provide idiomatic
-character mapping.
+above''. The following procedures allow programmers to provide
+idiomatic character mapping.
@deffn {Scheme Procedure} char-locale-downcase chr [locale]
@deffnx {C Function} scm_char_locale_upcase (chr, locale)
@@ -263,12 +214,20 @@
according to either @var{locale} or the current locale.
@end deffn
-Finally, the following procedures allow programs to read numbers
+Note that in the current implementation Guile has no notion of
+multibyte characters and in a multibyte locale characters may not be
+converted correctly.
+
+@node Number Input and Output, Accessing Locale Information, Character Case Mapping, Internationalization
+@subsection Number Input and Output
+
+The following procedures allow programs to read and write numbers
written according to a particular locale. As an example, in English,
``ten thousand and a half'' is usually written @code{10,000.5} while
-in French it is written @code{10000,5}. These procedures allow to
+in French it is written @code{10 000,5}. These procedures allow to
account for these differences.
+@findex strtod
@deffn {Scheme Procedure} locale-string->integer str [base [locale]]
@deffnx {C Function} scm_locale_string_to_integer (str, base, locale)
Convert string @var{str} into an integer according to either
@@ -276,22 +235,232 @@
the current process locale. If @var{base} is specified, then it
determines the base of the integer being read (e.g., @code{16} for an
hexadecimal number, @code{10} for a decimal number); by default,
-decimal numbers are read. Return two values: an integer (on success)
-or @code{#f}, and the number of characters read from @var{str}
-(@code{0} on failure).
+decimal numbers are read. Return two values (@pxref{Multiple
+Values}): an integer (on success) or @code{#f}, and the number of
+characters read from @var{str} (@code{0} on failure).
+
+This function is based on the C library's @code{strtol} function
+(@pxref{Parsing of Integers, @code{strtol},, libc, The GNU C Library
+Reference Manual}).
@end deffn
+@findex strtod
@deffn {Scheme Procedure} locale-string->inexact str [locale]
@deffnx {C Function} scm_locale_string_to_inexact (str, locale)
Convert string @var{str} into an inexact number according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
-the current process locale. Return two values: an inexact number (on
-success) or @code{#f}, and the number of characters read from
-@var{str} (@code{0} on failure).
+the current process locale. Return two values (@pxref{Multiple
+Values}): an inexact number (on success) or @code{#f}, and the number
+of characters read from @var{str} (@code{0} on failure).
+
+This function is based on the C library's @code{strtod} function
+(@pxref{Parsing of Floats, @code{strtod},, libc, The GNU C Library
+Reference Manual}).
+@end deffn
+
+@deffn {Scheme Procedure} number->locale-string number [fraction-digits [locale]]
+Convert @var{number} (an inexact) into a string according to the
+cultural conventions of either @var{locale} (a locale object) or the
+current locale. Optionally, @var{fraction-digits} may be bound to an
+integer specifying the number of fractional digits to be displayed.
+@end deffn
+
+@deffn {Scheme Procedure} monetary-amount->locale-string amount intl? [locale]
+Convert @var{amount} (an inexact denoting a monetary amount) into a
+string according to the cultural conventions of either @var{locale} (a
+locale object) or the current locale. If @var{intl?} is true, then
+the international monetary format for the given locale is used
+(@pxref{Currency Symbol, international and locale monetary formats,,
+libc, The GNU C Library Reference Manual}).
+@end deffn
+
+
+@node Accessing Locale Information, Gettext Support, Number Input and Output, Internationalization
+@subsection Accessing Locale Information
+
+@findex nl_langinfo
+@cindex low-level locale information
+It is sometimes useful to obtain very specific information about a
+locale such as the word it uses for days or months, its format for
+representing floating-point figures, etc. The @code{(ice-9 i18n)}
+module provides support for this in a way that is similar to the libc
+functions @code{nl_langinfo ()} and @code{localeconv ()}
+(@pxref{Locale Information, accessing locale information from C,,
+libc, The GNU C Library Reference Manual}). The available functions
+are listed below.
+
+@deffn {Scheme Procedure} locale-encoding [locale]
+Return the name of the encoding (a string whose interpretation is
+system-dependent) of either @var{locale} or the current locale.
+@end deffn
+
+The following functions deal with dates and times.
+
+@deffn {Scheme Procedure} locale-day day [locale]
+@deffnx {Scheme Procedure} locale-day-short day [locale]
+@deffnx {Scheme Procedure} locale-month month [locale]
+@deffnx {Scheme Procedure} locale-month-short month [locale]
+Return the word (a string) used in either @var{locale} or the current
+locale to name the day (or month) denoted by @var{day} (or
+@var{month}), an integer between 1 and 7 (or 1 and 12). The
+@code{-short} variants provide an abbreviation instead of a full name.
+@end deffn
+
+@deffn {Scheme Procedure} locale-am-string [locale]
+@deffnx {Scheme Procedure} locale-pm-string [locale]
+Return a (potentially empty) string that is used to denote @i{ante
+meridiem} (or @i{post meridiem}) hours in 12-hour format.
+@end deffn
+
+@deffn {Scheme Procedure} locale-date+time-format [locale]
+@deffnx {Scheme Procedure} locale-date-format [locale]
+@deffnx {Scheme Procedure} locale-time-format [locale]
+@deffnx {Scheme Procedure} locale-time+am/pm-format [locale]
+@deffnx {Scheme Procedure} locale-era-date-format [locale]
+@deffnx {Scheme Procedure} locale-era-date+time-format [locale]
+@deffnx {Scheme Procedure} locale-era-time-format [locale]
+These procedures return format strings suitable to @code{strftime}
+(@pxref{Time}) that may be used to display (part of) a date/time
+according to certain constraints and to the conventions of either
+@var{locale} or the current locale (@pxref{The Elegant and Fast Way,
+the @code{nl_langinfo ()} items,, libc, The GNU C Library Reference
+Manual}).
+@end deffn
+
+@deffn {Scheme Procedure} locale-era [locale]
+@deffnx {Scheme Procedure} locale-era-year [locale]
+These functions return, respectively, the era and the year of the
+relevant era used in @var{locale} or the current locale. Most locales
+do not define this value. An example of a locale that does define
+this value is the Japanese one.
+@end deffn
+
+The following procedures give information about number representation.
+
+@deffn {Scheme Procedure} locale-decimal-point [locale]
+@deffnx {Scheme Procedure} locale-thousands-separator [locale]
+These functions return a string denoting the representation of the
+decimal point or that of the thousand separator (respectively) for
+either @var{locale} or the current locale.
+@end deffn
+
+@deffn {Scheme Procedure} locale-digit-grouping [locale]
+Return a (potentially circular) list denoting how digits of the
+integer part of a number are to be grouped, starting at the decimal
+point and going to the left. The list contains integers indicating
+the size of the successive groups, from right to left. If the list is
+non-circular, then no grouping occurs for digits beyond the last
+group.
+
+For instance, if the returned list is a circular list that contains
+only @code{3} and the thousand separator is @code{","} (as is the case
+with English locales), then the number @code{12345678} should be
+printed @code{12,345,678}.
+@end deffn
+
+The following procedures deal with the representation of monetary
+amounts. Some of them take an additional @var{intl?} argument (a
+boolean) that tells whether the international or local monetary
+conventions for the given locale are to be used.
+
+@deffn {Scheme Procedure} locale-monetary-decimal-point [locale]
+@deffnx {Scheme Procedure} locale-monetary-thousands-separator [locale]
+@deffnx {Scheme Procedure} locale-monetary-grouping [locale]
+These are the monetary counterparts of the above procedures. These
+procedures apply to monetary amounts.
+@end deffn
+
+@deffn {Scheme Procedure} locale-currency-symbol intl? [locale]
+Return the currency symbol (a string) of either @var{locale} or the
+current locale.
+
+The following example illustrates the difference between the local and
+international monetary formats:
+
+@example
+(define us (make-locale (list LC_MONETARY) "en_US"))
+(locale-currency-symbol #f us)
+@result{} "-$"
+(locale-currency-symbol #t us)
+@result{} "USD "
+@end example
+@end deffn
+
+@deffn {Scheme Procedure} locale-monetary-fractional-digits intl? [locale]
+Return the number of fractional digits to be used when printing
+monetary amounts according to either @var{locale} or the current
+locale. If the locale does not specify it, then @code{#f} is
+returned.
+@end deffn
+
+@deffn {Scheme Procedure} locale-currency-symbol-precedes-positive? intl? [locale]
+@deffnx {Scheme Procedure} locale-currency-symbol-precedes-negative? intl? [locale]
+@deffnx {Scheme Procedure} locale-positive-separated-by-space? intl? [locale]
+@deffnx {Scheme Procedure} locale-negative-separated-by-space? intl? [locale]
+These procedures return a boolean indicating whether the currency
+symbol should precede a positive/negative number, and whether a
+whitespace should be inserted between the currency symbol and a
+positive/negative amount.
+@end deffn
+
+@deffn {Scheme Procedure} locale-monetary-positive-sign [locale]
+@deffnx {Scheme Procedure} locale-monetary-negative-sign [locale]
+Return a string denoting the positive (respectively negative) sign
+that should be used when printing a monetary amount.
+@end deffn
+
+@deffn {Scheme Procedure} locale-positive-sign-position
+@deffnx {Scheme Procedure} locale-negative-sign-position
+These functions return a symbol telling where a sign of a
+positive/negative monetary amount is to appear when printing it. The
+possible values are:
+
+@table @code
+@item parenthesize
+The currency symbol and quantity should be surrounded by parentheses.
+@item sign-before
+Print the sign string before the quantity and currency symbol.
+@item sign-after
+Print the sign string after the quantity and currency symbol.
+@item sign-before-currency-symbol
+Print the sign string right before the currency symbol.
+@item sign-after-currency-symbol
+Print the sign string right after the currency symbol.
+@item unspecified
+Unspecified. We recommend you print the sign after the currency
+symbol.
+@end table
+
@end deffn
+Finally, the two following procedures may be helpful when programming
+user interfaces:
+
+@deffn {Scheme Procedure} locale-yes-regexp [locale]
+@deffnx {Scheme Procedure} locale-no-regexp [locale]
+Return a string that can be used as a regular expression to recognize
+a positive (respectively, negative) response to a yes/no question.
+
+Here is an example:
+@example
+(format #t "Does Guile rock?~%")
+(let ((answer (read-line)))
+ (cond ((string-match (locale-yes-regexp) answer)
+ "Yes it does.")
+ ((string-match (locale-no-regexp) answer)
+ "No it doesn't.")
+ (else
+ "What do you mean?")))
+@end example
+@end deffn
+
+Example uses of some of these functions are the implementation of the
+@code{number->locale-string} and @code{monetary-amount->locale-string}
+procedures (@pxref{Number Input and Output}), as well as that the
+SRFI-19 date and time convertion to/from strings (@pxref{SRFI-19}).
+
-@node Gettext Support
+@node Gettext Support, , Accessing Locale Information, Internationalization
@subsection Gettext Support
Guile provides an interface to GNU @code{gettext} for translating
--- orig/doc/ref/posix.texi
+++ mod/doc/ref/posix.texi
@@ -1,6 +1,6 @@
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@@ -3162,7 +3162,7 @@
Manual}.
Note that @code{setlocale} affects locale settings for the whole
-process. @xref{The ice-9 i18n Module, locale objects and
+process. @xref{i18n Introduction, locale objects and
@code{make-locale}}, for a thread-safe alternative.
@end deffn
--- orig/doc/ref/srfi-modules.texi
+++ mod/doc/ref/srfi-modules.texi
@@ -1,6 +1,6 @@
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2007
@c Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@@ -2095,10 +2095,10 @@
described here, since the specification and reference implementation
differ.
-Currently Guile doesn't implement any localizations for the above, all
-outputs are in English, and the @samp{~c} conversion is POSIX
-@code{ctime} style @samp{~a ~b ~d ~H:~M:~S~z ~Y}. This may change in
-the future.
+Conversion is locale-dependent on systems that support it
+(@pxref{Accessing Locale Information}). @xref{Locales,
+@code{setlocale}}, for information on how to change the current
+locale.
@node SRFI-19 String to date
@@ -2219,9 +2219,10 @@
returned, instead the weekday will be derived from the day, month and
year.
-Currently Guile doesn't implement any localizations for the above,
-month and weekday names are always expected in English. This may
-change in the future.
+Conversion is locale-dependent on systems that support it
+(@pxref{Accessing Locale Information}). @xref{Locales,
+@code{setlocale}}, for information on how to change the current
+locale.
@end defun
--- orig/ice-9/i18n.scm
+++ mod/ice-9/i18n.scm
@@ -1,6 +1,6 @@
;;;; i18n.scm --- internationalization support
-;;;; Copyright (C) 2006 Free Software Foundation, Inc.
+;;;; Copyright (C) 2006, 2007 Free Software Foundation, Inc.
;;;;
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
@@ -29,18 +29,10 @@
;;; Code:
(define-module (ice-9 i18n)
+ :use-module (ice-9 optargs)
:export (;; `locale' type
make-locale locale?
-
- ;; locale category masks (standard)
- LC_ALL_MASK
- LC_COLLATE_MASK LC_CTYPE_MASK LC_MESSAGES_MASK
- LC_MONETARY_MASK LC_NUMERIC_MASK LC_TIME_MASK
-
- ;; locale category masks (non-standard)
- LC_PAPER_MASK LC_NAME_MASK LC_ADDRESS_MASK
- LC_TELEPHONE_MASK LC_MEASUREMENT_MASK
- LC_IDENTIFICATION_MASK
+ %global-locale
;; text collation
string-locale<? string-locale>?
@@ -54,11 +46,372 @@
string-locale-downcase string-locale-upcase
;; reading numbers
- locale-string->integer locale-string->inexact))
+ locale-string->integer locale-string->inexact
+
+ ;; charset/encoding
+ locale-encoding
+
+ ;; days and months
+ locale-day-short locale-day locale-month-short locale-month
+
+ ;; date and time
+ locale-am-string locale-pm-string
+ locale-date+time-format locale-date-format locale-time-format
+ locale-time+am/pm-format
+ locale-era locale-era-year
+ locale-era-date-format locale-era-date+time-format
+ locale-era-time-format
+
+ ;; monetary
+ locale-currency-symbol
+ locale-monetary-decimal-point locale-monetary-thousands-separator
+ locale-monetary-grouping locale-monetary-fractional-digits
+ locale-currency-symbol-precedes-positive?
+ locale-currency-symbol-precedes-negative?
+ locale-positive-separated-by-space?
+ locale-negative-separated-by-space?
+ locale-monetary-positive-sign locale-monetary-negative-sign
+ locale-positive-sign-position locale-negative-sign-position
+ monetary-amount->locale-string
+
+ ;; number formatting
+ locale-digit-grouping locale-decimal-point
+ locale-thousands-separator
+ number->locale-string
+
+ ;; miscellaneous
+ locale-yes-regexp locale-no-regexp))
(load-extension "libguile-i18n-v-0" "scm_init_i18n")
+\f
+;;;
+;;; Charset/encoding.
+;;;
+
+(define (locale-encoding . locale)
+ (apply nl-langinfo CODESET locale))
+
+\f
+;;;
+;;; Months and days.
+;;;
+
+;; Helper macro: Define a procedure named NAME that maps its argument to
+;; NL-ITEMS (when `nl-langinfo' is provided) or DEFAULTS (when `nl-langinfo'
+;; is not provided).
+(define-macro (define-vector-langinfo-mapping name nl-items defaults)
+ (let* ((item-count (length nl-items))
+ (defines (if (provided? 'nl-langinfo)
+ `(define %nl-items (vector #f ,@nl-items))
+ `(define %defaults (vector #f ,@defaults))))
+ (make-body (lambda (result)
+ `(if (and (integer? item) (exact? item))
+ (if (and (>= item 1) (<= item ,item-count))
+ ,result
+ (throw 'out-of-range "out of range" item))
+ (throw 'wrong-type-arg "wrong argument type" item)))))
+ `(define (,name item . locale)
+ ,defines
+ ,(make-body (if (provided? 'nl-langinfo)
+ '(apply nl-langinfo (vector-ref %nl-items item) locale)
+ '(vector-ref %defaults item))))))
+
+
+(define-vector-langinfo-mapping locale-day-short
+ (ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7)
+ ("Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
+
+(define-vector-langinfo-mapping locale-day
+ (DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7)
+ ("Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"))
+
+(define-vector-langinfo-mapping locale-month-short
+ (ABMON_1 ABMON_2 ABMON_3 ABMON_4 ABMON_5 ABMON_6
+ ABMON_7 ABMON_8 ABMON_9 ABMON_10 ABMON_11 ABMON_12)
+ ("Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"))
+
+(define-vector-langinfo-mapping locale-month
+ (MON_1 MON_2 MON_3 MON_4 MON_5 MON_6 MON_7 MON_8 MON_9 MON_10 MON_11 MON_12)
+ ("January" "February" "March" "April" "May" "June" "July" "August"
+ "September" "October" "November" "December"))
+
+
+\f
+;;;
+;;; Date and time.
+;;;
+
+;; Helper macro: Define a procedure NAME that gets langinfo item ITEM.
+(define-macro (define-simple-langinfo-mapping name item default)
+ (let ((body (if (provided? 'nl-langinfo)
+ `(apply nl-langinfo ,item locale)
+ default)))
+ `(define (,name . locale)
+ ,body)))
+
+(define-simple-langinfo-mapping locale-am-string
+ AM_STR "AM")
+(define-simple-langinfo-mapping locale-pm-string
+ PM_STR "PM")
+(define-simple-langinfo-mapping locale-date+time-format
+ D_T_FMT "%a %b %e %H:%M:%S %Y")
+(define-simple-langinfo-mapping locale-date-format
+ D_FMT "%m/%d/%y")
+(define-simple-langinfo-mapping locale-time-format
+ T_FMT "%H:%M:%S")
+(define-simple-langinfo-mapping locale-time+am/pm-format
+ T_FMT_AMPM "%I:%M:%S %p")
+(define-simple-langinfo-mapping locale-era
+ ERA "")
+(define-simple-langinfo-mapping locale-era-year
+ ERA_YEAR "")
+(define-simple-langinfo-mapping locale-era-date+time-format
+ ERA_D_T_FMT "")
+(define-simple-langinfo-mapping locale-era-date-format
+ ERA_D_FMT "")
+(define-simple-langinfo-mapping locale-era-time-format
+ ERA_T_FMT "")
+
+
+\f
+;;;
+;;; Monetary information.
+;;;
+
+(define-macro (define-monetary-langinfo-mapping name local-item intl-item
+ default/local default/intl)
+ (let ((body (if (provided? 'nl-langinfo)
+ `(if intl?
+ (apply nl-langinfo ,intl-item locale)
+ (apply nl-langinfo ,local-item locale))
+ `(if intl? ,default/intl ,default/local))))
+ `(define (,name intl? . locale)
+ ,body)))
+
+;; FIXME: How can we use ALT_DIGITS?
+(define-monetary-langinfo-mapping locale-currency-symbol
+ CRNCYSTR INT_CURR_SYMBOL
+ "-" "")
+(define-monetary-langinfo-mapping locale-monetary-fractional-digits
+ FRAC_DIGITS INT_FRAC_DIGITS
+ 2 2)
+
+(define-simple-langinfo-mapping locale-monetary-positive-sign
+ POSITIVE_SIGN "+")
+(define-simple-langinfo-mapping locale-monetary-negative-sign
+ NEGATIVE_SIGN "-")
+(define-simple-langinfo-mapping locale-monetary-decimal-point
+ MON_DECIMAL_POINT "")
+(define-simple-langinfo-mapping locale-monetary-thousands-separator
+ MON_THOUSANDS_SEP "")
+(define-simple-langinfo-mapping locale-monetary-digit-grouping
+ MON_GROUPING '())
+
+(define-monetary-langinfo-mapping locale-currency-symbol-precedes-positive?
+ P_CS_PRECEDES INT_P_CS_PRECEDES
+ #t #t)
+(define-monetary-langinfo-mapping locale-currency-symbol-precedes-negative?
+ N_CS_PRECEDES INT_N_CS_PRECEDES
+ #t #t)
+
+
+(define-monetary-langinfo-mapping locale-positive-separated-by-space?
+ ;; Whether a space should be inserted between a positive amount and the
+ ;; currency symbol.
+ P_SEP_BY_SPACE INT_P_SEP_BY_SPACE
+ #t #t)
+(define-monetary-langinfo-mapping locale-negative-separated-by-space?
+ ;; Whether a space should be inserted between a negative amount and the
+ ;; currency symbol.
+ N_SEP_BY_SPACE INT_N_SEP_BY_SPACE
+ #t #t)
+
+(define-monetary-langinfo-mapping locale-positive-sign-position
+ ;; Position of the positive sign wrt. currency symbol and quantity in a
+ ;; monetary amount.
+ P_SIGN_POSN INT_P_SIGN_POSN
+ 'unspecified 'unspecified)
+(define-monetary-langinfo-mapping locale-negative-sign-position
+ ;; Position of the negative sign wrt. currency symbol and quantity in a
+ ;; monetary amount.
+ N_SIGN_POSN INT_N_SIGN_POSN
+ 'unspecified 'unspecified)
+
+
+(define (%number-integer-part int grouping separator)
+ ;; Process INT (a string denoting a number's integer part) and return a new
+ ;; string with digit grouping and separators according to GROUPING (a list,
+ ;; potentially circular) and SEPARATOR (a string).
+ (let ((strlen (string-length int)))
+
+ ;; Process INT from right to left.
+ (let loop ((index (- strlen 1))
+ (grouping grouping)
+ (since-separator 0)
+ (result ""))
+ (if (< index 0)
+ result
+ (let ((insert-separator? (and (not (null? grouping))
+ (> index 0)
+ (= (car grouping)
+ (+ since-separator 1)))))
+ (loop (- index 1)
+ (if insert-separator? (cdr grouping) grouping)
+ (if insert-separator? 0 (+ since-separator 1))
+ (string-append (if insert-separator? separator "")
+ (string (string-ref int index))
+ result)))))))
+
+(define (add-monetary-sign+currency amount figure intl? locale)
+ ;; Add a sign and currency symbol around FIGURE. FIGURE should be a
+ ;; formatted unsigned amount (a string) representing AMOUNT.
+ (let* ((positive? (> amount 0))
+ (sign
+ (cond ((> amount 0) (locale-monetary-positive-sign locale))
+ ((< amount 0) (locale-monetary-negative-sign locale))
+ (else "")))
+ (currency (locale-currency-symbol intl? locale))
+ (currency-precedes?
+ (if positive?
+ locale-currency-symbol-precedes-positive?
+ locale-currency-symbol-precedes-negative?))
+ (separated?
+ (if positive?
+ locale-positive-separated-by-space?
+ locale-negative-separated-by-space?))
+ (sign-position
+ (if positive?
+ locale-positive-sign-position
+ locale-negative-sign-position))
+ (currency-space
+ (if (separated? intl? locale) " " ""))
+ (append-currency
+ (lambda (amt)
+ (if (currency-precedes? intl? locale)
+ (string-append currency currency-space amt)
+ (string-append amt currency-space currency)))))
+
+ (case (sign-position intl? locale)
+ ((parenthesize)
+ (string-append "(" (append-currency figure) ")"))
+ ((sign-before)
+ (string-append sign (append-currency figure)))
+ ((sign-after unspecified)
+ ;; following glibc's recommendation for `unspecified'.
+ (if (currency-precedes? intl? locale)
+ (string-append currency currency-space sign figure)
+ (string-append figure currency-space currency sign)))
+ ((sign-before-currency-symbol)
+ (if (currency-precedes? intl? locale)
+ (string-append sign currency currency-space figure)
+ (string-append figure currency-space sign currency))) ;; unlikely
+ ((sign-after-currency-symbol)
+ (if (currency-precedes? intl? locale)
+ (string-append currency sign currency-space figure)
+ (string-append figure currency-space currency sign)))
+ (else
+ (error "unsupported sign position" (sign-position intl? locale))))))
+
+
+(define* (monetary-amount->locale-string amount intl?
+ #:optional (locale %global-locale))
+ "Convert @var{amount} (an inexact) into a string according to the cultural
+conventions of either @var{locale} (a locale object) or the current locale.
+If @var{intl?} is true, then the international monetary format for the given
+locale is used."
+
+ (let* ((fraction-digits
+ (or (locale-monetary-fractional-digits intl? locale) 2))
+ (decimal-part
+ (lambda (dec)
+ (if (or (string=? dec "") (eq? 0 fraction-digits))
+ ""
+ (string-append (locale-monetary-decimal-point locale)
+ (if (< fraction-digits (string-length dec))
+ (substring dec 0 fraction-digits)
+ dec)))))
+
+ (external-repr (number->string (if (> amount 0) amount (- amount))))
+ (int+dec (string-split external-repr #\.))
+ (int (car int+dec))
+ (dec (decimal-part (if (null? (cdr int+dec))
+ ""
+ (cadr int+dec))))
+ (grouping (locale-monetary-digit-grouping locale))
+ (separator (locale-monetary-thousands-separator locale)))
+
+ (add-monetary-sign+currency amount
+ (string-append
+ (%number-integer-part int grouping
+ separator)
+ dec)
+ intl? locale)))
+
+
+\f
+;;;
+;;; Number formatting.
+;;;
+
+(define-simple-langinfo-mapping locale-digit-grouping
+ GROUPING '())
+(define-simple-langinfo-mapping locale-decimal-point
+ RADIXCHAR ".")
+(define-simple-langinfo-mapping locale-thousands-separator
+ THOUSEP "")
+
+(define* (number->locale-string number
+ #:optional (fraction-digits #t)
+ (locale %global-locale))
+ "Convert @var{number} (an inexact) into a string according to the cultural
+conventions of either @var{locale} (a locale object) or the current locale.
+Optionally, @var{fraction-digits} may be bound to an integer specifying the
+number of fractional digits to be displayed."
+
+ (let* ((sign
+ (cond ((> number 0) "")
+ ((< number 0) "-")
+ (else "")))
+ (decimal-part
+ (lambda (dec)
+ (if (or (string=? dec "") (eq? 0 fraction-digits))
+ ""
+ (string-append (locale-decimal-point locale)
+ (if (and (integer? fraction-digits)
+ (< fraction-digits
+ (string-length dec)))
+ (substring dec 0 fraction-digits)
+ dec))))))
+
+ (let* ((external-repr (number->string (if (> number 0)
+ number
+ (- number))))
+ (int+dec (string-split external-repr #\.))
+ (int (car int+dec))
+ (dec (decimal-part (if (null? (cdr int+dec))
+ ""
+ (cadr int+dec))))
+ (grouping (locale-digit-grouping locale))
+ (separator (locale-thousands-separator locale)))
+
+ (string-append sign
+ (%number-integer-part int grouping separator)
+ dec))))
+
+\f
+;;;
+;;; Miscellaneous.
+;;;
+
+(define-simple-langinfo-mapping locale-yes-regexp
+ YESEXPR "^[yY]")
+(define-simple-langinfo-mapping locale-no-regexp
+ NOEXPR "^[nN]")
+
+;; `YESSTR' and `NOSTR' are considered deprecated so we don't provide them.
+
;;; Local Variables:
;;; coding: latin-1
--- orig/libguile/i18n.c
+++ mod/libguile/i18n.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 2006 Free Software Foundation, Inc.
+/* Copyright (C) 2006, 2007 Free Software Foundation, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
@@ -46,6 +46,7 @@
#include "libguile/dynwind.h"
#include "libguile/validate.h"
#include "libguile/values.h"
+#include "libguile/threads.h"
#include <locale.h>
#include <string.h> /* `strcoll ()' */
@@ -53,11 +54,29 @@
#include <errno.h>
#if (defined HAVE_NEWLOCALE) && (defined HAVE_STRCOLL_L)
+/* The GNU thread-aware locale API is documented in ``Thread-Aware Locale
+ Model, a Proposal'', by Ulrich Drepper:
+
+ http://people.redhat.com/drepper/tllocale.ps.gz
+
+ It is now also implemented by Darwin:
+
+ http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html
+
+ The whole API is being standardized by the X/Open Group (as of Jan. 2007)
+ following Drepper's proposal. */
# define USE_GNU_LOCALE_API
#endif
-#ifndef USE_GNU_LOCALE_API
-# include "libguile/posix.h" /* for `scm_i_locale_mutex' */
+#if (defined USE_GNU_LOCALE_API) && (defined HAVE_XLOCALE_H)
+# include <xlocale.h>
+#endif
+
+#include "libguile/posix.h" /* for `scm_i_locale_mutex' */
+
+#if (defined HAVE_LANGINFO_H) && (defined HAVE_NL_TYPES_H)
+# include <langinfo.h>
+# include <nl_types.h>
#endif
#ifndef HAVE_SETLOCALE
@@ -69,6 +88,9 @@
}
#endif
+/* Helper stringification macro. */
+#define SCM_I18N_STRINGIFY(_name) # _name
+
\f
/* Locale objects, string and character collation, and other locale-dependent
@@ -78,14 +100,14 @@
locale API on non-GNU systems. The emulation is a bit "brute-force":
Whenever a `-locale<?' procedure is passed a locale object, then:
- 1. The `scm_t_locale_mutex' is locked.
+ 1. The `scm_i_locale_mutex' is locked.
2. A series of `setlocale ()' call is performed to store the current
- locale for each category in an `scm_t_locale_settings' object.
+ locale for each category in an `scm_t_locale' object.
3. A series of `setlocale ()' call is made to install each of the locale
categories of each of the base locales of each locale object,
recursively, starting from the last locale object of the chain.
4. The settings captured in step (2) are restored.
- 5. The `scm_t_locale_mutex' is released.
+ 5. The `scm_i_locale_mutex' is released.
Hopefully, some smart standard will make that hack useless someday...
A similar API can be found in MzScheme starting from version 200:
@@ -97,62 +119,65 @@
re-installing this locale upon locale-dependent calls. */
+/* Return the category mask corresponding to CAT. */
+#define SCM_LOCALE_CATEGORY_MASK(_cat) LC_ ## _cat ## _MASK
+
+
#ifndef USE_GNU_LOCALE_API
-/* Provide the locale category masks as found in glibc (copied from
- <locale.h> as found in glibc 2.3.6). This must be kept in sync with
- `locale-categories.h'. */
-
-# define LC_CTYPE_MASK (1 << LC_CTYPE)
-# define LC_COLLATE_MASK (1 << LC_COLLATE)
-# define LC_MESSAGES_MASK (1 << LC_MESSAGES)
-# define LC_MONETARY_MASK (1 << LC_MONETARY)
-# define LC_NUMERIC_MASK (1 << LC_NUMERIC)
-# define LC_TIME_MASK (1 << LC_TIME)
+/* Provide the locale category masks as found in glibc. This must be kept in
+ sync with `locale-categories.h'. */
+
+# define LC_CTYPE_MASK 1
+# define LC_COLLATE_MASK 2
+# define LC_MESSAGES_MASK 4
+# define LC_MONETARY_MASK 8
+# define LC_NUMERIC_MASK 16
+# define LC_TIME_MASK 32
# ifdef LC_PAPER
-# define LC_PAPER_MASK (1 << LC_PAPER)
+# define LC_PAPER_MASK 64
# else
# define LC_PAPER_MASK 0
# endif
# ifdef LC_NAME
-# define LC_NAME_MASK (1 << LC_NAME)
+# define LC_NAME_MASK 128
# else
# define LC_NAME_MASK 0
# endif
# ifdef LC_ADDRESS
-# define LC_ADDRESS_MASK (1 << LC_ADDRESS)
+# define LC_ADDRESS_MASK 256
# else
# define LC_ADDRESS_MASK 0
# endif
# ifdef LC_TELEPHONE
-# define LC_TELEPHONE_MASK (1 << LC_TELEPHONE)
+# define LC_TELEPHONE_MASK 512
# else
# define LC_TELEPHONE_MASK 0
# endif
# ifdef LC_MEASUREMENT
-# define LC_MEASUREMENT_MASK (1 << LC_MEASUREMENT)
+# define LC_MEASUREMENT_MASK 1024
# else
# define LC_MEASUREMENT_MASK 0
# endif
# ifdef LC_IDENTIFICATION
-# define LC_IDENTIFICATION_MASK (1 << LC_IDENTIFICATION)
+# define LC_IDENTIFICATION_MASK 2048
# else
# define LC_IDENTIFICATION_MASK 0
# endif
-# define LC_ALL_MASK (LC_CTYPE_MASK \
- | LC_NUMERIC_MASK \
- | LC_TIME_MASK \
- | LC_COLLATE_MASK \
- | LC_MONETARY_MASK \
- | LC_MESSAGES_MASK \
- | LC_PAPER_MASK \
- | LC_NAME_MASK \
- | LC_ADDRESS_MASK \
- | LC_TELEPHONE_MASK \
- | LC_MEASUREMENT_MASK \
- | LC_IDENTIFICATION_MASK \
+# define LC_ALL_MASK (LC_CTYPE_MASK \
+ | LC_NUMERIC_MASK \
+ | LC_TIME_MASK \
+ | LC_COLLATE_MASK \
+ | LC_MONETARY_MASK \
+ | LC_MESSAGES_MASK \
+ | LC_PAPER_MASK \
+ | LC_NAME_MASK \
+ | LC_ADDRESS_MASK \
+ | LC_TELEPHONE_MASK \
+ | LC_MEASUREMENT_MASK \
+ | LC_IDENTIFICATION_MASK \
)
/* Locale objects as returned by `make-locale' on non-GNU systems. */
@@ -163,12 +188,28 @@
int category_mask;
} *scm_t_locale;
-#else
+
+/* Free the resources used by LOCALE. */
+static inline void
+scm_i_locale_free (scm_t_locale locale)
+{
+ free (locale->locale_name);
+ locale->locale_name = NULL;
+}
+
+#else /* USE_GNU_LOCALE_API */
/* Alias for glibc's locale type. */
typedef locale_t scm_t_locale;
-#endif
+#define scm_i_locale_free freelocale
+
+#endif /* USE_GNU_LOCALE_API */
+
+
+/* A locale object denoting the global locale. */
+SCM_GLOBAL_VARIABLE (scm_global_locale, "%global-locale");
+
/* Validate parameter ARG as a locale object and set C_LOCALE to the
corresponding C locale object. */
@@ -199,16 +240,8 @@
{
scm_t_locale c_locale;
- c_locale = (scm_t_locale)SCM_SMOB_DATA (locale);
-
-#ifdef USE_GNU_LOCALE_API
- freelocale ((locale_t)c_locale);
-#else
- c_locale->base_locale = SCM_UNDEFINED;
- free (c_locale->locale_name);
-
- scm_gc_free (c_locale, sizeof (* c_locale), "locale");
-#endif
+ c_locale = (scm_t_locale) SCM_SMOB_DATA (locale);
+ scm_i_locale_free (c_locale);
return 0;
}
@@ -217,76 +250,38 @@
static SCM
smob_locale_mark (SCM locale)
{
- scm_t_locale c_locale;
-
- c_locale = (scm_t_locale)SCM_SMOB_DATA (locale);
- return (c_locale->base_locale);
-}
-#endif
-
+ register SCM dependency;
-SCM_DEFINE (scm_make_locale, "make-locale", 2, 1, 0,
- (SCM category_mask, SCM locale_name, SCM base_locale),
- "Return a reference to a data structure representing a set of "
- "locale datasets. Unlike for the @var{category} parameter for "
- "@code{setlocale}, the @var{category_mask} parameter here uses "
- "a single bit for each category, made by OR'ing together "
- "@code{LC_*_MASK} bits.")
-#define FUNC_NAME s_scm_make_locale
-{
- SCM locale = SCM_BOOL_F;
- int c_category_mask;
- char *c_locale_name;
- scm_t_locale c_base_locale, c_locale;
-
- SCM_VALIDATE_INT_COPY (1, category_mask, c_category_mask);
- SCM_VALIDATE_STRING (2, locale_name);
- SCM_VALIDATE_OPTIONAL_LOCALE_COPY (3, base_locale, c_base_locale);
-
- c_locale_name = scm_to_locale_string (locale_name);
-
-#ifdef USE_GNU_LOCALE_API
-
- c_locale = newlocale (c_category_mask, c_locale_name, c_base_locale);
+ if (!scm_is_eq (locale, SCM_VARIABLE_REF (scm_global_locale)))
+ {
+ scm_t_locale c_locale;
- if (!c_locale)
- locale = SCM_BOOL_F;
+ c_locale = (scm_t_locale) SCM_SMOB_DATA (locale);
+ dependency = (c_locale->base_locale);
+ }
else
- SCM_NEWSMOB (locale, scm_tc16_locale_smob_type, c_locale);
-
- free (c_locale_name);
-
-#else
-
- c_locale = scm_gc_malloc (sizeof (* c_locale), "locale");
- c_locale->base_locale = base_locale;
-
- c_locale->category_mask = c_category_mask;
- c_locale->locale_name = c_locale_name;
-
- SCM_NEWSMOB (locale, scm_tc16_locale_smob_type, c_locale);
+ dependency = SCM_BOOL_F;
+ return dependency;
+}
#endif
- return locale;
-}
-#undef FUNC_NAME
-SCM_DEFINE (scm_locale_p, "locale?", 1, 0, 0,
- (SCM obj),
- "Return true if @var{obj} is a locale object.")
-#define FUNC_NAME s_scm_locale_p
-{
- if (SCM_SMOB_PREDICATE (scm_tc16_locale_smob_type, obj))
- return SCM_BOOL_T;
+static void inline scm_locale_error (const char *, int) SCM_NORETURN;
- return SCM_BOOL_F;
+/* Throw an exception corresponding to error ERR. */
+static void inline
+scm_locale_error (const char *func_name, int err)
+{
+ scm_syserror_msg (func_name,
+ "Failed to install locale",
+ SCM_EOL, err);
}
-#undef FUNC_NAME
\f
-#ifndef USE_GNU_LOCALE_API /* Emulate GNU's reentrant locale API. */
+/* Emulating GNU's reentrant locale API. */
+#ifndef USE_GNU_LOCALE_API
/* Maximum number of chained locales (via `base_locale'). */
@@ -309,7 +304,7 @@
#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
{ \
SCM_SYSCALL (locale_name = setlocale (LC_ ## _name, NULL)); \
- if (!locale_name) \
+ if (locale_name == NULL) \
goto handle_error; \
\
settings-> _name = strdup (locale_name); \
@@ -323,7 +318,7 @@
return 0;
handle_error:
- return errno;
+ return EINVAL;
handle_oom:
return ENOMEM;
@@ -346,7 +341,7 @@
return 0;
handle_error:
- return errno;
+ return EINVAL;
}
/* Free memory associated with SETTINGS. */
@@ -376,7 +371,7 @@
else
{
#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
- if (category_mask & LC_ ## _name ## _MASK) \
+ if (category_mask & SCM_LOCALE_CATEGORY_MASK (_name)) \
{ \
SCM_SYSCALL (result = setlocale (LC_ ## _name, locale_name)); \
if (result == NULL) \
@@ -389,7 +384,7 @@
return 0;
handle_error:
- return errno;
+ return EINVAL;
}
/* Install LOCALE, recursively installing its base locales first. On
@@ -398,6 +393,7 @@
install_locale (scm_t_locale locale)
{
scm_t_locale stack[LOCALE_STACK_SIZE_MAX];
+ int category_mask = 0;
size_t stack_size = 0;
int stack_offset = 0;
const char *result = NULL;
@@ -412,12 +408,16 @@
stack[stack_size++] = locale;
+ /* Keep track of which categories have already been taken into
+ account. */
+ category_mask |= locale->category_mask;
+
if (locale->base_locale != SCM_UNDEFINED)
- locale = (scm_t_locale)SCM_SMOB_DATA (locale->base_locale);
+ locale = (scm_t_locale) SCM_SMOB_DATA (locale->base_locale);
else
locale = NULL;
}
- while (locale != NULL);
+ while ((locale != NULL) && (category_mask != LC_ALL_MASK));
/* Install the C locale to start from a pristine state. */
SCM_SYSCALL (result = setlocale (LC_ALL, "C"));
@@ -442,7 +442,7 @@
return 0;
handle_error:
- return errno;
+ return EINVAL;
}
/* Leave the locked locale section. */
@@ -481,19 +481,6 @@
return err;
}
-/* Throw an exception corresponding to error ERR. */
-static void inline
-scm_locale_error (const char *func_name, int err)
-{
- SCM s_err;
-
- s_err = scm_from_int (err);
- scm_error (scm_system_error_key, func_name,
- "Failed to install locale",
- scm_cons (scm_strerror (s_err), SCM_EOL),
- scm_cons (s_err, SCM_EOL));
-}
-
/* Convenient macro to run STATEMENT in the locale context of C_LOCALE. */
#define RUN_IN_LOCALE_SECTION(_c_locale, _statement) \
do \
@@ -514,8 +501,218 @@
} \
while (0)
+/* Convert the current locale settings into a locale SMOB. On success, zero
+ is returned and RESULT points to the new SMOB. Otherwise, an error is
+ returned. */
+static int
+get_current_locale (SCM *result)
+{
+ int err = 0;
+ scm_t_locale c_locale;
+ const char *current_locale;
+
+ c_locale = scm_gc_malloc (sizeof (* c_locale), "locale");
+
+
+ scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
+
+ c_locale->category_mask = LC_ALL_MASK;
+ c_locale->base_locale = SCM_UNDEFINED;
+
+ current_locale = setlocale (LC_ALL, NULL);
+ if (current_locale != NULL)
+ {
+ c_locale->locale_name = strdup (current_locale);
+ if (c_locale->locale_name == NULL)
+ err = ENOMEM;
+ }
+ else
+ err = EINVAL;
+
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+
+ if (err)
+ scm_gc_free (c_locale, sizeof (* c_locale), "locale");
+ else
+ SCM_NEWSMOB (*result, scm_tc16_locale_smob_type, c_locale);
+
+ return err;
+}
+
+
#endif /* !USE_GNU_LOCALE_API */
+
+\f
+/* Convert CATEGORIES, a list of locale categories (integers), into a
+ category mask. */
+static int
+category_list_to_category_mask (SCM categories,
+ const char *func_name, int pos)
+{
+ int c_category_mask = 0;
+
+ for (; !scm_is_null (categories); categories = SCM_CDR (categories))
+ {
+ int c_category;
+ SCM category = SCM_CAR (categories);
+
+ c_category = scm_to_int (category);
+
+#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
+ case LC_ ## _name: \
+ c_category_mask |= SCM_LOCALE_CATEGORY_MASK (_name); \
+ break;
+
+ switch (c_category)
+ {
+#include "locale-categories.h"
+
+ case LC_ALL:
+ c_category_mask |= LC_ALL_MASK;
+ break;
+
+ default:
+ scm_wrong_type_arg_msg (func_name, pos, category,
+ "locale category");
+ }
+
+#undef SCM_DEFINE_LOCALE_CATEGORY
+ }
+
+ return c_category_mask;
+}
+
+
+SCM_DEFINE (scm_make_locale, "make-locale", 2, 1, 0,
+ (SCM category_list, SCM locale_name, SCM base_locale),
+ "Return a reference to a data structure representing a set of "
+ "locale datasets. @var{category_list} should be a list "
+ "of locale categories as used with @code{setlocale} "
+ "(@pxref{Locales, @code{setlocale}}) and @var{locale_name} "
+ "should be the name of the locale considered (e.g., "
+ "@code{\"sl_SI\"}). Optionally, if @code{base_locale} is "
+ "passed, it should be a locale object denoting settings for "
+ "categories not listed in @var{category_list}.")
+#define FUNC_NAME s_scm_make_locale
+{
+ SCM locale = SCM_BOOL_F;
+ int err = 0;
+ int c_category_mask;
+ char *c_locale_name;
+ scm_t_locale c_base_locale, c_locale;
+
+ SCM_VALIDATE_LIST (1, category_list);
+ SCM_VALIDATE_STRING (2, locale_name);
+ SCM_VALIDATE_OPTIONAL_LOCALE_COPY (3, base_locale, c_base_locale);
+
+ c_category_mask = category_list_to_category_mask (category_list,
+ FUNC_NAME, 1);
+ c_locale_name = scm_to_locale_string (locale_name);
+
+#ifdef USE_GNU_LOCALE_API
+
+ if (scm_is_eq (base_locale, SCM_VARIABLE_REF (scm_global_locale)))
+ {
+ /* Fetch the current locale and turn in into a `locale_t'. Don't
+ duplicate the resulting `locale_t' because we want it to be consumed
+ by `newlocale ()'. */
+ char *current_locale;
+
+ scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
+
+ current_locale = setlocale (LC_ALL, NULL);
+ c_base_locale = newlocale (LC_ALL_MASK, current_locale, NULL);
+
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+
+ if (c_base_locale == (locale_t) 0)
+ scm_locale_error (FUNC_NAME, errno);
+ }
+ else if (c_base_locale != (locale_t) 0)
+ {
+ /* C_BASE_LOCALE is to be consumed by `newlocale ()' so it needs to be
+ duplicated before. */
+ c_base_locale = duplocale (c_base_locale);
+ if (c_base_locale == (locale_t) 0)
+ {
+ err = errno;
+ goto fail;
+ }
+ }
+
+ c_locale = newlocale (c_category_mask, c_locale_name, c_base_locale);
+
+ free (c_locale_name);
+
+ if (c_locale == (locale_t) 0)
+ {
+ if (scm_is_eq (base_locale, SCM_VARIABLE_REF (scm_global_locale)))
+ /* The base locale object was created lazily and must be freed. */
+ freelocale (c_base_locale);
+
+ scm_locale_error (FUNC_NAME, errno);
+ }
+ else
+ SCM_NEWSMOB (locale, scm_tc16_locale_smob_type, c_locale);
+
+#else
+
+ c_locale = scm_gc_malloc (sizeof (* c_locale), "locale");
+
+ c_locale->category_mask = c_category_mask;
+ c_locale->locale_name = c_locale_name;
+
+ if (scm_is_eq (base_locale, SCM_VARIABLE_REF (scm_global_locale)))
+ {
+ /* Get the current locale settings and turn them into a locale
+ object. */
+ err = get_current_locale (&base_locale);
+ if (err)
+ goto fail;
+ }
+
+ c_locale->base_locale = base_locale;
+
+ {
+ /* Try out the new locale and raise an exception if it doesn't work. */
+ int err;
+ scm_t_locale_settings prev_locale;
+
+ err = enter_locale_section (c_locale, &prev_locale);
+ leave_locale_section (&prev_locale);
+
+ if (err)
+ goto fail;
+ else
+ SCM_NEWSMOB (locale, scm_tc16_locale_smob_type, c_locale);
+ }
+
+#endif
+
+ return locale;
+
+ fail:
+#ifndef USE_GNU_LOCALE_API
+ scm_gc_free (c_locale, sizeof (* c_locale), "locale");
+#endif
+ free (c_locale_name);
+ scm_locale_error (FUNC_NAME, err);
+
+ return SCM_BOOL_F;
+}
+#undef FUNC_NAME
+
+SCM_DEFINE (scm_locale_p, "locale?", 1, 0, 0,
+ (SCM obj),
+ "Return true if @var{obj} is a locale object.")
+#define FUNC_NAME s_scm_locale_p
+{
+ return scm_from_bool (SCM_SMOB_PREDICATE (scm_tc16_locale_smob_type, obj));
+}
+#undef FUNC_NAME
+
+
\f
/* Locale-dependent string comparison. */
@@ -1124,25 +1321,399 @@
}
#undef FUNC_NAME
+\f
+/* Language information, aka. `nl_langinfo ()'. */
+
+/* FIXME: Issues related to `language-information'.
+
+ 1. The `CODESET' value is not normalized. This is a secondary issue, but
+ still a practical issue. See
+ http://www.cl.cam.ac.uk/~mgk25/ucs/norm_charmap.c for codeset
+ normalization.
+
+ 2. `nl_langinfo ()' is not available on Windows.
+
+ 3. `nl_langinfo ()' may return strings encoded in a locale different from
+ the current one, thereby defeating `scm_from_locale_string ()'.
+ Example: support the current locale is "Latin-1" and one asks:
+
+ (language-information DAY_1 (make-locale "eo_EO.UTF-8"))
+
+ The result will be a UTF-8 string. However, `scm_from_locale_string',
+ which expects a Latin-1 string, won't be able to make much sense of the
+ returned string. Thus, we'd need an `scm_from_string ()' variant where
+ the locale (or charset) is explicitly passed. */
+
+
+SCM_DEFINE (scm_nl_langinfo, "nl-langinfo", 1, 1, 0,
+ (SCM item, SCM locale),
+ "Return a string denoting locale information for @var{item} "
+ "in the current locale or that specified by @var{locale}. "
+ "The semantics and arguments are the same as those of the "
+ "X/Open @code{nl_langinfo} function (@pxref{The Elegant and "
+ "Fast Way, @code{nl_langinfo},, libc, The GNU C Library "
+ "Reference Manual}).")
+#define FUNC_NAME s_scm_nl_langinfo
+{
+#ifdef HAVE_NL_LANGINFO
+ SCM result;
+ nl_item c_item;
+ char *c_result;
+ scm_t_locale c_locale;
+
+ SCM_VALIDATE_INT_COPY (2, item, c_item);
+ SCM_VALIDATE_OPTIONAL_LOCALE_COPY (2, locale, c_locale);
+
+ /* Sadly, `nl_langinfo ()' returns a pointer to a static string. According
+ to SuS v2, that static string may be modified by subsequent calls to
+ `nl_langinfo ()' as well as by calls to `setlocale ()'. Thus, we must
+ acquire the locale mutex before doing invoking `nl_langinfo ()'. See
+ http://opengroup.org/onlinepubs/007908799/xsh/nl_langinfo.html for
+ details. */
+
+ scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
+ if (c_locale != NULL)
+ {
+#ifdef USE_GNU_LOCALE_API
+ c_result = nl_langinfo_l (c_item, c_locale);
+#else
+ /* We can't use `RUN_IN_LOCALE_SECTION ()' here because the locale
+ mutex is already taken. */
+ int lsec_err;
+ scm_t_locale_settings lsec_prev_locale;
+
+ lsec_err = get_current_locale_settings (&lsec_prev_locale);
+ if (lsec_err)
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+ else
+ {
+ lsec_err = install_locale (c_locale);
+ if (lsec_err)
+ {
+ leave_locale_section (&lsec_prev_locale);
+ free_locale_settings (&lsec_prev_locale);
+ }
+ }
+
+ if (lsec_err)
+ scm_locale_error (FUNC_NAME, lsec_err);
+ else
+ {
+ c_result = nl_langinfo (c_item);
+
+ leave_locale_section (&lsec_prev_locale);
+ free_locale_settings (&lsec_prev_locale);
+ }
+#endif
+ }
+ else
+ c_result = nl_langinfo (c_item);
+
+ c_result = strdup (c_result);
+ scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
+
+ if (c_result == NULL)
+ result = SCM_BOOL_F;
+ else
+ {
+ char *p;
+
+ switch (c_item)
+ {
+ case GROUPING:
+ case MON_GROUPING:
+ /* In this cases, the result is to be interpreted as a list of
+ numbers. If the last item is `CHARS_MAX', it has the special
+ meaning "no more grouping". */
+ result = SCM_EOL;
+ for (p = c_result; (*p != '\0') && (*p != CHAR_MAX); p++)
+ result = scm_cons (SCM_I_MAKINUM ((int) *p), result);
+
+ {
+ SCM last_pair = result;
+
+ result = scm_reverse_x (result, SCM_EOL);
+
+ if (*p != CHAR_MAX)
+ {
+ /* Cyclic grouping information. */
+ if (last_pair != SCM_EOL)
+ SCM_SETCDR (last_pair, result);
+ }
+ }
+
+ free (c_result);
+ break;
+
+ case FRAC_DIGITS:
+ case INT_FRAC_DIGITS:
+ /* This is to be interpreted as a single integer. */
+ if (*c_result == CHAR_MAX)
+ /* Unspecified. */
+ result = SCM_BOOL_F;
+ else
+ result = SCM_I_MAKINUM (*c_result);
+
+ free (c_result);
+ break;
+
+ case P_CS_PRECEDES:
+ case N_CS_PRECEDES:
+ case INT_P_CS_PRECEDES:
+ case INT_N_CS_PRECEDES:
+ case P_SEP_BY_SPACE:
+ case N_SEP_BY_SPACE:
+ /* This is to be interpreted as a boolean. */
+ result = scm_from_bool (*c_result);
+
+ free (c_result);
+ break;
+
+ case P_SIGN_POSN:
+ case N_SIGN_POSN:
+ case INT_P_SIGN_POSN:
+ case INT_N_SIGN_POSN:
+ /* See `(libc) Sign of Money Amount' for the interpretation of the
+ return value here. */
+ switch (*c_result)
+ {
+ case 0:
+ result = scm_from_locale_symbol ("parenthesize");
+ break;
+
+ case 1:
+ result = scm_from_locale_symbol ("sign-before");
+ break;
+
+ case 2:
+ result = scm_from_locale_symbol ("sign-after");
+ break;
+
+ case 3:
+ result = scm_from_locale_symbol ("sign-before-currency-symbol");
+ break;
+
+ case 4:
+ result = scm_from_locale_symbol ("sign-after-currency-symbol");
+ break;
+
+ default:
+ result = scm_from_locale_symbol ("unspecified");
+ }
+ break;
+
+ default:
+ /* FIXME: `locale_string ()' is not appropriate here because of
+ encoding issues (see comment above). */
+ result = scm_take_locale_string (c_result);
+ }
+ }
+
+ return result;
+#else
+ scm_syserror_msg (FUNC_NAME,
+ "`language-information' not supported on your system",
+ SCM_EOL, ENOSYS);
+
+ return SCM_BOOL_F;
+#endif
+}
+#undef FUNC_NAME
+
+/* Define the `nl_item' constants. */
+static inline void
+define_langinfo_items (void)
+{
+#if (defined HAVE_NL_TYPES_H) && (defined HAVE_LANGINFO_H)
+
+#define DEFINE_NLITEM_CONSTANT(_item) \
+ scm_c_define (# _item, SCM_I_MAKINUM (_item))
+
+ DEFINE_NLITEM_CONSTANT (CODESET);
+
+ /* Abbreviated days of the week. */
+ DEFINE_NLITEM_CONSTANT (ABDAY_1);
+ DEFINE_NLITEM_CONSTANT (ABDAY_2);
+ DEFINE_NLITEM_CONSTANT (ABDAY_3);
+ DEFINE_NLITEM_CONSTANT (ABDAY_4);
+ DEFINE_NLITEM_CONSTANT (ABDAY_5);
+ DEFINE_NLITEM_CONSTANT (ABDAY_6);
+ DEFINE_NLITEM_CONSTANT (ABDAY_7);
+
+ /* Long-named days of the week. */
+ DEFINE_NLITEM_CONSTANT (DAY_1); /* Sunday */
+ DEFINE_NLITEM_CONSTANT (DAY_2); /* Monday */
+ DEFINE_NLITEM_CONSTANT (DAY_3); /* Tuesday */
+ DEFINE_NLITEM_CONSTANT (DAY_4); /* Wednesday */
+ DEFINE_NLITEM_CONSTANT (DAY_5); /* Thursday */
+ DEFINE_NLITEM_CONSTANT (DAY_6); /* Friday */
+ DEFINE_NLITEM_CONSTANT (DAY_7); /* Saturday */
+
+ /* Abbreviated month names. */
+ DEFINE_NLITEM_CONSTANT (ABMON_1); /* Jan */
+ DEFINE_NLITEM_CONSTANT (ABMON_2);
+ DEFINE_NLITEM_CONSTANT (ABMON_3);
+ DEFINE_NLITEM_CONSTANT (ABMON_4);
+ DEFINE_NLITEM_CONSTANT (ABMON_5);
+ DEFINE_NLITEM_CONSTANT (ABMON_6);
+ DEFINE_NLITEM_CONSTANT (ABMON_7);
+ DEFINE_NLITEM_CONSTANT (ABMON_8);
+ DEFINE_NLITEM_CONSTANT (ABMON_9);
+ DEFINE_NLITEM_CONSTANT (ABMON_10);
+ DEFINE_NLITEM_CONSTANT (ABMON_11);
+ DEFINE_NLITEM_CONSTANT (ABMON_12);
+
+ /* Long month names. */
+ DEFINE_NLITEM_CONSTANT (MON_1); /* January */
+ DEFINE_NLITEM_CONSTANT (MON_2);
+ DEFINE_NLITEM_CONSTANT (MON_3);
+ DEFINE_NLITEM_CONSTANT (MON_4);
+ DEFINE_NLITEM_CONSTANT (MON_5);
+ DEFINE_NLITEM_CONSTANT (MON_6);
+ DEFINE_NLITEM_CONSTANT (MON_7);
+ DEFINE_NLITEM_CONSTANT (MON_8);
+ DEFINE_NLITEM_CONSTANT (MON_9);
+ DEFINE_NLITEM_CONSTANT (MON_10);
+ DEFINE_NLITEM_CONSTANT (MON_11);
+ DEFINE_NLITEM_CONSTANT (MON_12);
+
+ DEFINE_NLITEM_CONSTANT (AM_STR); /* Ante meridiem string. */
+ DEFINE_NLITEM_CONSTANT (PM_STR); /* Post meridiem string. */
+
+ DEFINE_NLITEM_CONSTANT (D_T_FMT); /* Date and time format for strftime. */
+ DEFINE_NLITEM_CONSTANT (D_FMT); /* Date format for strftime. */
+ DEFINE_NLITEM_CONSTANT (T_FMT); /* Time format for strftime. */
+ DEFINE_NLITEM_CONSTANT (T_FMT_AMPM);/* 12-hour time format for strftime. */
+
+ DEFINE_NLITEM_CONSTANT (ERA); /* Alternate era. */
+ DEFINE_NLITEM_CONSTANT (ERA_D_FMT); /* Date in alternate era format. */
+ DEFINE_NLITEM_CONSTANT (ERA_D_T_FMT); /* Date and time in alternate era
+ format. */
+ DEFINE_NLITEM_CONSTANT (ERA_T_FMT); /* Time in alternate era format. */
+
+ DEFINE_NLITEM_CONSTANT (ALT_DIGITS); /* Alternate symbols for digits. */
+ DEFINE_NLITEM_CONSTANT (RADIXCHAR);
+ DEFINE_NLITEM_CONSTANT (THOUSEP);
+
+#ifdef YESEXPR
+ DEFINE_NLITEM_CONSTANT (YESEXPR);
+#endif
+#ifdef NOEXPR
+ DEFINE_NLITEM_CONSTANT (NOEXPR);
+#endif
+#ifdef YESSTR
+ DEFINE_NLITEM_CONSTANT (YESSTR);
+#endif
+#ifdef NOSTR
+ DEFINE_NLITEM_CONSTANT (NOSTR);
+#endif
+
+
+ /* GNU extensions. */
+
+#ifdef ERA_YEAR
+ DEFINE_NLITEM_CONSTANT (ERA_YEAR); /* Year in alternate era format. */
+#endif
+#ifdef DECIMAL_POINT
+ DEFINE_NLITEM_CONSTANT (DECIMAL_POINT); /* Equivalent to `RADIXCHAR'. */
+#endif
+
+ /* LC_MONETARY category: formatting of monetary quantities.
+ These items each correspond to a member of `struct lconv',
+ defined in <locale.h>. */
+#ifdef INT_CURR_SYMBOL
+ DEFINE_NLITEM_CONSTANT (INT_CURR_SYMBOL);
+#endif
+#ifdef CURRENCY_SYMBOL
+ DEFINE_NLITEM_CONSTANT (CURRENCY_SYMBOL);
+#endif
+#ifdef CRNCYSTR /* legacy symbol */
+ DEFINE_NLITEM_CONSTANT (CRNCYSTR);
+#endif
+#ifdef MON_DECIMAL_POINT
+ DEFINE_NLITEM_CONSTANT (MON_DECIMAL_POINT);
+#endif
+#ifdef MON_THOUSANDS_SEP
+ DEFINE_NLITEM_CONSTANT (MON_THOUSANDS_SEP);
+#endif
+#ifdef MON_GROUPING
+ DEFINE_NLITEM_CONSTANT (MON_GROUPING);
+#endif
+#ifdef POSITIVE_SIGN
+ DEFINE_NLITEM_CONSTANT (POSITIVE_SIGN);
+#endif
+#ifdef NEGATIVE_SIGN
+ DEFINE_NLITEM_CONSTANT (NEGATIVE_SIGN);
+#endif
+#ifdef GROUPING
+ DEFINE_NLITEM_CONSTANT (GROUPING);
+#endif
+#ifdef INT_FRAC_DIGITS
+ DEFINE_NLITEM_CONSTANT (INT_FRAC_DIGITS);
+#endif
+#ifdef FRAC_DIGITS
+ DEFINE_NLITEM_CONSTANT (FRAC_DIGITS);
+#endif
+#ifdef P_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (P_CS_PRECEDES);
+#endif
+#ifdef P_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (P_SEP_BY_SPACE);
+#endif
+#ifdef N_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (N_CS_PRECEDES);
+#endif
+#ifdef N_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (N_SEP_BY_SPACE);
+#endif
+#ifdef P_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (P_SIGN_POSN);
+#endif
+#ifdef N_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (N_SIGN_POSN);
+#endif
+#ifdef INT_P_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (INT_P_CS_PRECEDES);
+#endif
+#ifdef INT_P_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (INT_P_SEP_BY_SPACE);
+#endif
+#ifdef INT_N_CS_PRECEDES
+ DEFINE_NLITEM_CONSTANT (INT_N_CS_PRECEDES);
+#endif
+#ifdef INT_N_SEP_BY_SPACE
+ DEFINE_NLITEM_CONSTANT (INT_N_SEP_BY_SPACE);
+#endif
+#ifdef INT_P_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (INT_P_SIGN_POSN);
+#endif
+#ifdef INT_N_SIGN_POSN
+ DEFINE_NLITEM_CONSTANT (INT_N_SIGN_POSN);
+#endif
+
+#undef DEFINE_NLITEM_CONSTANT
+
+#endif /* HAVE_NL_TYPES_H */
+}
\f
void
scm_init_i18n ()
{
- scm_add_feature ("ice-9-i18n");
+ SCM global_locale_smob;
-#define _SCM_STRINGIFY_LC(_name) # _name
-#define SCM_STRINGIFY_LC(_name) _SCM_STRINGIFY_LC (_name)
+#ifdef HAVE_NL_LANGINFO
+ scm_add_feature ("nl-langinfo");
+ define_langinfo_items ();
+#endif
/* Define all the relevant `_MASK' variables. */
-#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
- scm_c_define ("LC_" SCM_STRINGIFY_LC (_name) "_MASK", \
- SCM_I_MAKINUM (LC_ ## _name ## _MASK));
+#define SCM_DEFINE_LOCALE_CATEGORY(_name) \
+ scm_c_define ("LC_" SCM_I18N_STRINGIFY (_name) "_MASK", \
+ scm_from_int (LC_ ## _name ## _MASK));
#include "locale-categories.h"
#undef SCM_DEFINE_LOCALE_CATEGORY
-#undef SCM_STRINGIFY_LC
-#undef _SCM_STRINGIFY_LC
scm_c_define ("LC_ALL_MASK", SCM_I_MAKINUM (LC_ALL_MASK));
@@ -1151,6 +1722,10 @@
#ifndef USE_GNU_LOCALE_API
scm_set_smob_mark (scm_tc16_locale_smob_type, smob_locale_mark);
#endif
+
+ /* Initialize the global locale object with a special `locale' SMOB. */
+ SCM_NEWSMOB (global_locale_smob, scm_tc16_locale_smob_type, NULL);
+ SCM_VARIABLE_SET (scm_global_locale, global_locale_smob);
}
--- orig/libguile/i18n.h
+++ mod/libguile/i18n.h
@@ -22,6 +22,7 @@
#include "libguile/__scm.h"
+SCM_API SCM scm_global_locale;
SCM_API SCM scm_make_locale (SCM category_mask, SCM locale_name, SCM base_locale);
SCM_API SCM scm_locale_p (SCM obj);
SCM_API SCM scm_string_locale_lt (SCM s1, SCM s2, SCM locale);
@@ -40,6 +41,7 @@
SCM_API SCM scm_string_locale_downcase (SCM chr, SCM locale);
SCM_API SCM scm_locale_string_to_integer (SCM str, SCM base, SCM locale);
SCM_API SCM scm_locale_string_to_inexact (SCM str, SCM locale);
+SCM_API SCM scm_nl_langinfo (SCM item, SCM locale);
SCM_API void scm_init_i18n (void);
--- orig/libguile/posix.c
+++ mod/libguile/posix.c
@@ -119,6 +119,13 @@
# define USE_GNU_LOCALE_API
#endif
+#if (defined USE_GNU_LOCALE_API) && (defined HAVE_XLOCALE_H)
+/* Darwin now supports the "GNU" thread-safe locale API but one has to
+ include <xlocale.h> to get it. See:
+ http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html */
+# include <xlocale.h>
+#endif
+
#if HAVE_CRYPT_H
# include <crypt.h>
#endif
@@ -1384,12 +1391,11 @@
}
#undef FUNC_NAME
-#ifndef USE_GNU_LOCALE_API
/* This mutex is used to serialize invocations of `setlocale ()' on non-GNU
- systems (i.e., systems where a reentrant locale API is not available).
- See `i18n.c' for details. */
-scm_i_pthread_mutex_t scm_i_locale_mutex;
-#endif
+ systems (i.e., systems where a reentrant locale API is not available). It
+ is also acquired before calls to `nl_langinfo ()'. See `i18n.c' for
+ details. */
+scm_i_pthread_mutex_t scm_i_locale_mutex = SCM_I_PTHREAD_MUTEX_INITIALIZER;
#ifdef HAVE_SETLOCALE
@@ -1406,6 +1412,7 @@
"the locale will be set using environment variables.")
#define FUNC_NAME s_scm_setlocale
{
+ int c_category;
char *clocale;
char *rv;
@@ -1421,13 +1428,11 @@
scm_dynwind_free (clocale);
}
-#ifndef USE_GNU_LOCALE_API
+ c_category = scm_i_to_lc_category (category, 1);
+
scm_i_pthread_mutex_lock (&scm_i_locale_mutex);
-#endif
- rv = setlocale (scm_i_to_lc_category (category, 1), clocale);
-#ifndef USE_GNU_LOCALE_API
+ rv = setlocale (c_category, clocale);
scm_i_pthread_mutex_unlock (&scm_i_locale_mutex);
-#endif
if (rv == NULL)
{
@@ -1965,10 +1970,6 @@
void
scm_init_posix ()
{
-#ifndef USE_GNU_LOCALE_API
- scm_i_pthread_mutex_init (&scm_i_locale_mutex, NULL);
-#endif
-
scm_add_feature ("posix");
#ifdef HAVE_GETEUID
scm_add_feature ("EIDs");
--- orig/srfi/srfi-19.scm
+++ mod/srfi/srfi-19.scm
@@ -41,7 +41,8 @@
(define-module (srfi srfi-19)
:use-module (srfi srfi-6)
:use-module (srfi srfi-8)
- :use-module (srfi srfi-9))
+ :use-module (srfi srfi-9)
+ :use-module (ice-9 i18n))
(begin-deprecated
;; Prevent `export' from re-exporting core bindings. This behaviour
@@ -150,48 +151,9 @@
;;-- LOCALE dependent constants
-(define priv:locale-number-separator ".")
-
-(define priv:locale-abbr-weekday-vector
- (vector "Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
-
-(define priv:locale-long-weekday-vector
- (vector
- "Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"))
-
-;; note empty string in 0th place.
-(define priv:locale-abbr-month-vector
- (vector ""
- "Jan"
- "Feb"
- "Mar"
- "Apr"
- "May"
- "Jun"
- "Jul"
- "Aug"
- "Sep"
- "Oct"
- "Nov"
- "Dec"))
-
-(define priv:locale-long-month-vector
- (vector ""
- "January"
- "February"
- "March"
- "April"
- "May"
- "June"
- "July"
- "August"
- "September"
- "October"
- "November"
- "December"))
-
-(define priv:locale-pm "PM")
-(define priv:locale-am "AM")
+(define priv:locale-number-separator locale-decimal-point)
+(define priv:locale-pm locale-pm-string)
+(define priv:locale-am locale-am-string)
;; See date->string
(define priv:locale-date-time-format "~a ~b ~d ~H:~M:~S~z ~Y")
@@ -964,38 +926,33 @@
(define (priv:last-n-digits i n)
(abs (remainder i (expt 10 n))))
-(define (priv:locale-abbr-weekday n)
- (vector-ref priv:locale-abbr-weekday-vector n))
-
-(define (priv:locale-long-weekday n)
- (vector-ref priv:locale-long-weekday-vector n))
-
-(define (priv:locale-abbr-month n)
- (vector-ref priv:locale-abbr-month-vector n))
-
-(define (priv:locale-long-month n)
- (vector-ref priv:locale-long-month-vector n))
-
-(define (priv:vector-find needle haystack comparator)
- (let ((len (vector-length haystack)))
- (define (priv:vector-find-int index)
+(define priv:locale-abbr-weekday locale-day-short)
+(define priv:locale-long-weekday locale-day)
+(define priv:locale-abbr-month locale-month-short)
+(define priv:locale-long-month locale-month)
+
+(define (priv:vector-find needle haystack-ref haystack-len comparator)
+ (define (priv:vector-find-int index)
+ (let loop ((index index))
(cond
- ((>= index len) #f)
- ((comparator needle (vector-ref haystack index)) index)
- (else (priv:vector-find-int (+ index 1)))))
- (priv:vector-find-int 0)))
+ ((> index haystack-len) #f)
+ ((comparator needle (haystack-ref index))
+ index)
+ (else (priv:vector-find-int (+ index 1))))))
+
+ (priv:vector-find-int 1))
(define (priv:locale-abbr-weekday->index string)
- (priv:vector-find string priv:locale-abbr-weekday-vector string=?))
+ (priv:vector-find string priv:locale-abbr-weekday 7 string=?))
(define (priv:locale-long-weekday->index string)
- (priv:vector-find string priv:locale-long-weekday-vector string=?))
+ (priv:vector-find string priv:locale-long-weekday 7 string=?))
(define (priv:locale-abbr-month->index string)
- (priv:vector-find string priv:locale-abbr-month-vector string=?))
+ (priv:vector-find string priv:locale-abbr-month 12 string=?))
(define (priv:locale-long-month->index string)
- (priv:vector-find string priv:locale-long-month-vector string=?))
+ (priv:vector-find string priv:locale-long-month 12 string=?))
;; FIXME: mkoeppe: Put a symbolic time zone in the date structs.
@@ -1003,10 +960,8 @@
(define (priv:locale-print-time-zone date port)
(priv:tz-printer (date-zone-offset date) port))
-;; FIXME: we should use strftime to determine this dynamically if possible.
-;; Again, locale specific.
(define (priv:locale-am/pm hr)
- (if (> hr 11) priv:locale-pm priv:locale-am))
+ (if (> hr 11) (priv:locale-pm) (priv:locale-am)))
(define (priv:tz-printer offset port)
(cond
@@ -1069,7 +1024,7 @@
(le (string-length ns)))
(if (> le 2)
(begin
- (display priv:locale-number-separator port)
+ (display (priv:locale-number-separator) port)
(display (substring ns 2 le) port)))))))
(cons #\h (lambda (date pad-with port)
(display (date->string date "~b") port)))
--- orig/test-suite/tests/i18n.test
+++ mod/test-suite/tests/i18n.test
@@ -1,6 +1,6 @@
;;;; i18n.test --- Exercise the i18n API.
;;;;
-;;;; Copyright (C) 2006 Free Software Foundation, Inc.
+;;;; Copyright (C) 2006, 2007 Free Software Foundation, Inc.
;;;; Ludovic Courtès
;;;;
;;;; This library is free software; you can redistribute it and/or
@@ -19,25 +19,38 @@
(define-module (test-suite i18n)
:use-module (ice-9 i18n)
+ :use-module (srfi srfi-1)
:use-module (test-suite lib))
;; Start from a pristine locale state.
(setlocale LC_ALL "C")
+(define exception:locale-error
+ (cons 'system-error "Failed to install locale"))
+
+
\f
(with-test-prefix "locale objects"
(pass-if "make-locale (2 args)"
- (not (not (make-locale LC_ALL_MASK "C"))))
+ (not (not (make-locale (list LC_ALL) "C"))))
(pass-if "make-locale (3 args)"
- (not (not (make-locale LC_COLLATE_MASK "C"
- (make-locale LC_MESSAGES_MASK "C")))))
+ (not (not (make-locale (list LC_COLLATE) "C"
+ (make-locale (list LC_MESSAGES) "C")))))
+
+ (pass-if-exception "make-locale with unknown locale" exception:locale-error
+ (make-locale (list LC_ALL) "does-not-exist"))
(pass-if "locale?"
- (and (locale? (make-locale LC_ALL_MASK "C"))
- (locale? (make-locale (logior LC_MESSAGES_MASK LC_NUMERIC_MASK) "C"
- (make-locale LC_CTYPE_MASK "C"))))))
+ (and (locale? (make-locale (list LC_ALL) "C"))
+ (locale? (make-locale (list LC_MESSAGES LC_NUMERIC) "C"
+ (make-locale (list LC_CTYPE) "C")))))
+
+ (pass-if "%global-locale"
+ (and (locale? %global-locale))
+ (locale? (make-locale (list LC_MONETARY) "C"
+ %global-locale))))
\f
@@ -46,27 +59,30 @@
(pass-if "string-locale<?"
(and (string-locale<? "hello" "world")
(string-locale<? "hello" "world"
- (make-locale LC_COLLATE_MASK "C"))))
+ (make-locale (list LC_COLLATE) "C"))))
(pass-if "char-locale<?"
(and (char-locale<? #\a #\b)
- (char-locale<? #\a #\b (make-locale LC_COLLATE_MASK "C"))))
+ (char-locale<? #\a #\b (make-locale (list LC_COLLATE) "C"))))
(pass-if "string-locale-ci=?"
(and (string-locale-ci=? "Hello" "HELLO")
(string-locale-ci=? "Hello" "HELLO"
- (make-locale LC_COLLATE_MASK "C"))))
+ (make-locale (list LC_COLLATE) "C"))))
(pass-if "string-locale-ci<?"
(and (string-locale-ci<? "hello" "WORLD")
(string-locale-ci<? "hello" "WORLD"
- (make-locale LC_COLLATE_MASK "C")))))
+ (make-locale (list LC_COLLATE) "C")))))
\f
+(define %french-locale-name
+ "fr_FR.ISO-8859-1")
+
(define %french-locale
(false-if-exception
- (make-locale (logior LC_CTYPE_MASK LC_COLLATE_MASK)
- "fr_FR.ISO-8859-1")))
+ (make-locale (list LC_CTYPE LC_COLLATE LC_NUMERIC LC_TIME)
+ %french-locale-name)))
(define (under-french-locale-or-unresolved thunk)
;; On non-GNU systems, an exception may be raised only when the locale is
@@ -112,11 +128,11 @@
(pass-if "char-locale-downcase"
(and (eq? #\a (char-locale-downcase #\A))
- (eq? #\a (char-locale-downcase #\A (make-locale LC_ALL_MASK "C")))))
+ (eq? #\a (char-locale-downcase #\A (make-locale (list LC_ALL) "C")))))
(pass-if "char-locale-upcase"
(and (eq? #\Z (char-locale-upcase #\z))
- (eq? #\Z (char-locale-upcase #\z (make-locale LC_ALL_MASK "C"))))))
+ (eq? #\Z (char-locale-upcase #\z (make-locale (list LC_ALL) "C"))))))
\f
(with-test-prefix "number parsing"
@@ -131,10 +147,98 @@
(call-with-values
(lambda ()
(locale-string->inexact "123.456"
- (make-locale LC_NUMERIC_MASK "C")))
+ (make-locale (list LC_NUMERIC) "C")))
(lambda (result char-count)
(and (equal? result 123.456)
- (equal? char-count 7))))))
+ (equal? char-count 7)))))
+
+ (pass-if "locale-string->inexact (French)"
+ (under-french-locale-or-unresolved
+ (lambda ()
+ (call-with-values
+ (lambda ()
+ (locale-string->inexact "123,456" %french-locale))
+ (lambda (result char-count)
+ (and (equal? result 123.456)
+ (equal? char-count 7))))))))
+
+\f
+;;;
+;;; `nl-langinfo'
+;;;
+
+(setlocale LC_ALL "C")
+(define %c-locale (make-locale (list LC_ALL) "C"))
+
+(define %english-days
+ '("Sunday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Saturday"))
+
+(define (every? . args)
+ (not (not (apply every args))))
+
+
+(with-test-prefix "nl-langinfo et al."
+
+ (pass-if "locale-day (1 arg)"
+ (every? equal?
+ %english-days
+ (map locale-day (map 1+ (iota 7)))))
+
+ (pass-if "locale-day (2 args)"
+ (every? equal?
+ %english-days
+ (map (lambda (day)
+ (locale-day day %c-locale))
+ (map 1+ (iota 7)))))
+
+ (pass-if "locale-day (2 args, using `%global-locale')"
+ (every? equal?
+ %english-days
+ (map (lambda (day)
+ (locale-day day %global-locale))
+ (map 1+ (iota 7)))))
+
+ (pass-if "locale-day (French)"
+ (under-french-locale-or-unresolved
+ (lambda ()
+ (let ((result (locale-day 3 %french-locale)))
+ (and (string? result)
+ (string-ci=? result "mardi"))))))
+
+ (pass-if "locale-day (French, using `%global-locale')"
+ ;; Make sure `%global-locale' captures the current locale settings as
+ ;; installed using `setlocale'.
+ (under-french-locale-or-unresolved
+ (lambda ()
+ (dynamic-wind
+ (lambda ()
+ (setlocale LC_TIME %french-locale-name))
+ (lambda ()
+ (let* ((fr (make-locale (list LC_MONETARY) "C" %global-locale))
+ (result (locale-day 3 fr)))
+ (setlocale LC_ALL "C")
+ (and (string? result)
+ (string-ci=? result "mardi"))))
+ (lambda ()
+ (setlocale LC_ALL "C"))))))
+
+ (pass-if "default locale"
+ ;; Make sure the default locale does not capture the current locale
+ ;; settings as installed using `setlocale'. The default locale should be
+ ;; "C".
+ (under-french-locale-or-unresolved
+ (lambda ()
+ (dynamic-wind
+ (lambda ()
+ (setlocale LC_ALL %french-locale-name))
+ (lambda ()
+ (let* ((locale (make-locale (list LC_MONETARY) "C"))
+ (result (locale-day 3 locale)))
+ (setlocale LC_ALL "C")
+ (and (string? result)
+ (string-ci=? result "Tuesday"))))
+ (lambda ()
+ (setlocale LC_ALL "C")))))))
;;; Local Variables:
--- orig/test-suite/tests/srfi-19.test
+++ mod/test-suite/tests/srfi-19.test
@@ -27,6 +27,9 @@
:use-module (srfi srfi-19)
:use-module (ice-9 format))
+;; Make sure we use the default locale.
+(setlocale LC_ALL "C")
+
(define (with-tz* tz thunk)
"Temporarily set the TZ environment variable to the passed string
value and call THUNK."
@@ -142,6 +145,19 @@
(string->date "2001-06-01@08:00" "~Y-~m-~d@~H:~M")))
(date->time-utc
(make-date 0 0 0 12 1 6 2001 0))))
+ (pass-if "string->date understands days and months"
+ (time=? (let ((d (string->date "Saturday, December 9, 2006"
+ "~A, ~B ~d, ~Y")))
+ (date->time-utc (make-date (date-nanosecond d)
+ (date-second d)
+ (date-minute d)
+ (date-hour d)
+ (date-day d)
+ (date-month d)
+ (date-year d)
+ 0)))
+ (date->time-utc
+ (make-date 0 0 0 0 9 12 2006 0))))
;; check time comparison procedures
(let* ((time1 (make-time time-monotonic 0 0))
(time2 (make-time time-monotonic 0 0))
[-- Attachment #3: Type: text/plain, Size: 143 bytes --]
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-16 21:46 ` Ludovic Courtès
@ 2007-01-16 22:08 ` Ludovic Courtès
2007-01-23 0:49 ` Kevin Ryde
2007-01-23 0:46 ` Kevin Ryde
2007-01-25 1:05 ` Kevin Ryde
2 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2007-01-16 22:08 UTC (permalink / raw)
Oops, I forgot to mention a few things...
ludo@chbouib.org (Ludovic Courtès) writes:
> 2. Since `nl-langinfo' is hidden from Scheme programmers, all the
> `locale-' procedures are guaranteed to work, no matter whether
> `nl_langinfo ()' is actually available (see the feature-conditional
> macros in `i18n.scm').
The idea is that eventually we could plug in `localeconv' support
transparently, i.e., the user-visible API would remain the same but it
would become capable of using `localeconv' when `nl-langinfo' is not
available (that would work only for monetary and numeric procedures,
though).
-1. Introduced `%global-locale' (which is quite handy).
-2. Changed `make-locale' to use a category list rather than a
category mask:
(make-locale (list LC_NUMERIC LC_MESSAGES) "eo_EO")
And I think that's it.
Ludo'.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-16 21:46 ` Ludovic Courtès
2007-01-16 22:08 ` Ludovic Courtès
@ 2007-01-23 0:46 ` Kevin Ryde
2007-01-23 23:44 ` Ludovic Courtès
2007-01-25 1:05 ` Kevin Ryde
2 siblings, 1 reply; 23+ messages in thread
From: Kevin Ryde @ 2007-01-23 0:46 UTC (permalink / raw)
Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
>
> 4. No attempt is made to convert to a different encoding the strings
> returned by `nl_langinfo ()'.
Good.
> 5. High-level procedures for locale-dependent number output are
> provided, namely `number->locale-string' and
> `monetary-amount->locale-string'. Both of them are implemented in
> Scheme (which is relieving, really ;-)).
See how much nicer it is :-).
> +* i18n Introduction:: Introduction to Guile's i18n support.
"i18n" is a horrible geek-ish abbreviation, much better written in
full in the manual.
> +@deffn {Scheme Procedure} make-locale category-list locale-name [base-locale]
Just a single category too will be good there. Just LC_ALL is
probably be almost always what's wanted.
> +@defvr {Scheme Variable} %global-locale
I'm worried if that should be a variable, rather than a function.
> These procedures allow to account for these differences.
Slightly ungrammatical there (I'm sure it reads better in french :-).
> +This function is based on the C library's @code{strtod} function
Not any more? Or just in a broadly similar style?
> +@deffn {Scheme Procedure} locale-era [locale]
> +@deffnx {Scheme Procedure} locale-era-year [locale]
> +These functions return, respectively, the era and the year of the
> +relevant era used in @var{locale} or the current locale. Most locales
> +do not define this value.
What do the functions do in that case (ie. no era)? Will my computer
explode?
> +@deffn {Scheme Procedure} locale-digit-grouping [locale]
> +Return a (potentially circular) list denoting how digits
You could rearrange the words to get "list of integers" somewhere near
the start, for maximum clarity.
> +(define us (make-locale (list LC_MONETARY) "en_US"))
That's an example of where I think having to say "list" is a bit
pesky. Just (make-locale LC_MONETARY ...) will be good.
> +(locale-currency-symbol #f us)
> +@result{} "-$"
Is it actually "-$" there, or just "$"?
> +@deffn {Scheme Procedure} locale-yes-regexp [locale]
> +@deffnx {Scheme Procedure} locale-no-regexp [locale]
These aren't gnu extensions are they?
> +Example uses of some of these functions are the implementation of the
> +@code{number->locale-string} and @code{monetary-amount->locale-string}
> +procedures (@pxref{Number Input and Output}), as well as that the
> +SRFI-19 date and time convertion to/from strings (@pxref{SRFI-19}).
I wouldn't worry about that bit. A nice broad introduction at the
start of the node should be ok.
> +(define (locale-encoding . locale)
> + (apply nl-langinfo CODESET locale))
I think you should implement almost all these basic bits in C, and not
have a big nl-langinfo. It'll be easier for a future localeconv
conditionals, or some of the DOS apis. It'll also be smaller and
faster to load and run.
> +(define-vector-langinfo-mapping locale-day-short
> + (ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7)
> + ("Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
These in C too I think.
> number formatting ...
>
> + (let loop ((index (- strlen 1))
> + (grouping grouping)
> + (since-separator 0)
> + (result ""))
Consider breaking out of the loop when grouping is null, and in any
case going group by group instead of character by character.
The result could be built as a list of strings to be concatenated at
the end too (string-concatenate/reverse or whatever it is).
> ;;; Local Variables:
> ;;; coding: latin-1
The fallbacks are all ascii aren't they?
> nl-langinfo ...
> + However, `scm_from_locale_string',
> + which expects a Latin-1 string, won't be able to make much sense of the
> + returned string.
No need to be too worried about that. scm_from_locale_string is
little more than a placeholder until there's a notion of coding within
strings. Almost every call to it will have to be reviewed for whether
it's actually the locale charset, another charset, or binary bytes.
> nl-langinfo ...
> + switch (c_item)
> + {
> + case GROUPING:
> + case MON_GROUPING:
Special cases like that will be cleaner in their own function I think.
> + scm_syserror_msg (FUNC_NAME,
> + "`language-information' not supported on your system",
Still `language-information'?
I expect if each `locale-foo' is a separate func then they should be
able to return sensible defaults, avoiding such an error.
> +(define (priv:vector-find needle haystack-ref haystack-len comparator)
Does that still search a vector? Could rename it if not. Whatever it
does now, a func call int range searcher or whatever.
> +;;;
> +;;; `nl-langinfo'
> +;;;
Could make a separate section for each func exercised there
(irrespective what they do internally).
Apart from that it all looks good.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-16 22:08 ` Ludovic Courtès
@ 2007-01-23 0:49 ` Kevin Ryde
0 siblings, 0 replies; 23+ messages in thread
From: Kevin Ryde @ 2007-01-23 0:49 UTC (permalink / raw)
Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
>
> The idea is that eventually we could plug in `localeconv' support
> transparently, i.e., the user-visible API would remain the same but it
> would become capable of using `localeconv' when `nl-langinfo' is not
> available (that would work only for monetary and numeric procedures,
> though).
Yep. And strftime might be able to cover the time bits - though
perhaps no sensible system has localized strftime without other locale
bits. (And like I say I think conditionals in C will be easier than
scheme.)
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-23 0:46 ` Kevin Ryde
@ 2007-01-23 23:44 ` Ludovic Courtès
2007-01-24 20:58 ` Kevin Ryde
0 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2007-01-23 23:44 UTC (permalink / raw)
Hi Kevin,
I basically agree with most of your remarks. Here are some I don't
quite agree with or that deserve discussion.
Kevin Ryde <user42@zip.com.au> writes:
>> +* i18n Introduction:: Introduction to Guile's i18n support.
>
> "i18n" is a horrible geek-ish abbreviation, much better written in
> full in the manual.
Agreed, but that's only the name of the Info node. Changing it here
would make the line too long for Info.
>> +@defvr {Scheme Variable} %global-locale
>
> I'm worried if that should be a variable, rather than a function.
I wondered too. My conclusion is that there's nothing to be afraid of:
after all, it's a special object and we have full control over it
(pretty much like the EOF object, for instance[*]).
>> +This function is based on the C library's @code{strtod} function
>
> Not any more? Or just in a broadly similar style?
It's based on `strtod', really---I think you're mixing this and
`number->locale-string'.
>> +@deffn {Scheme Procedure} locale-yes-regexp [locale]
>> +@deffnx {Scheme Procedure} locale-no-regexp [locale]
>
> These aren't gnu extensions are they?
No, they're not:
http://www.opengroup.org/onlinepubs/009695399/basedefs/langinfo.h.html
>> +(define (locale-encoding . locale)
>> + (apply nl-langinfo CODESET locale))
>
> I think you should implement almost all these basic bits in C, and not
> have a big nl-langinfo. It'll be easier for a future localeconv
> conditionals, or some of the DOS apis. It'll also be smaller and
> faster to load and run.
>
>> +(define-vector-langinfo-mapping locale-day-short
>> + (ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7)
>> + ("Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
>
> These in C too I think.
I'm not sure about it. Writing all those bits in C would have clear
short-term advantages, mainly faster loading. That said, beside the
fact that I'm a bit lazy and don't really feel like rewriting the thing
in C ATM ;-), I think the above Scheme constructs are easier to work
with (to maintain, to add `localeconv' support, etc.) than a C
equivalent. So I'd be tempted to keep them as is.
(Besides, I don't think it's very high priority, especially since it
doesn't have any impact on the API itself.)
>> number formatting ...
>>
>> + (let loop ((index (- strlen 1))
>> + (grouping grouping)
>> + (since-separator 0)
>> + (result ""))
>
> Consider breaking out of the loop when grouping is null, and in any
> case going group by group instead of character by character.
Yes, indeed!
> The result could be built as a list of strings to be concatenated at
> the end too (string-concatenate/reverse or whatever it is).
What does it buy us to apply `string-append' at the end rather than
`string-append' at each step?
>> ;;; Local Variables:
>> ;;; coding: latin-1
>
> The fallbacks are all ascii aren't they?
I prefer to explicitly specify the encoding of non-ASCII files.
Thanks for the detailed review!
Ludo'.
[*] Actually, the EOF object _is_ available in R6RS, but it is returned
by the `eof-object' procedure. The following post sheds some light
on this decision:
http://www.r6rs.org/r6rs-editors/2006-July/001536.html . I'm not
sure this reasoning would apply well here: We already have two
implementations of `%global-locale' and both can happily deal with
it.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-23 23:44 ` Ludovic Courtès
@ 2007-01-24 20:58 ` Kevin Ryde
2007-01-31 21:13 ` Ludovic Courtès
0 siblings, 1 reply; 23+ messages in thread
From: Kevin Ryde @ 2007-01-24 20:58 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
>
> Agreed, but that's only the name of the Info node. Changing it here
> would make the line too long for Info.
Call the node just "Internationalization" and then call the sub nodes
something else, that should fit fine. There's a good chance people
coming to such matters for the first time won't even know what "i18n"
stands for.
>%global-locale
> I wondered too. My conclusion is that there's nothing to be afraid of:
> after all, it's a special object and we have full control over it
> (pretty much like the EOF object, for instance[*]).
If someone sets the locale with setlocale from C, does it still work,
or does %global-locale get out of sync. Gtk likes to do that for
instance.
> I think the above Scheme constructs are easier to work
> with (to maintain, to add `localeconv' support, etc.) than a C
> equivalent.
I think you'll find it's easier in C, especially if you have to worry
about different combinations of fallbacks in different funcs.
> (Besides, I don't think it's very high priority, especially since it
> doesn't have any impact on the API itself.)
If you can bring it to a high state of polish then you can put it all
in 1.8.
> What does it buy us to apply `string-append' at the end rather than
> `string-append' at each step?
string-append at each step means O(n^2) worth of copying (in the
string length). That afflicted output string ports until recently.
Of course whether anyone should be using a big number there is
debatable, but at least we can ensure performance doesn't prevent one
from doing so.
> I prefer to explicitly specify the encoding of non-ASCII files.
But is it non-ascii?
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-16 21:46 ` Ludovic Courtès
2007-01-16 22:08 ` Ludovic Courtès
2007-01-23 0:46 ` Kevin Ryde
@ 2007-01-25 1:05 ` Kevin Ryde
2 siblings, 0 replies; 23+ messages in thread
From: Kevin Ryde @ 2007-01-25 1:05 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
>
> +@deffn {Scheme Procedure} locale-yes-regexp [locale]
> +@deffnx {Scheme Procedure} locale-no-regexp [locale]
> +Return a string that can be used as a regular expression to recognize
> +a positive (respectively, negative) response to a yes/no question.
You could show what it is in the C locale there, as a sample. Just
reading it made me wonder if it might match "y" or "yes" or both, or
whatever. (Too much detail is obviously outside our scope, but
something to show what's typical is always helpful.)
> +;; `YESSTR' and `NOSTR' are considered deprecated so we don't provide them.
You could put with locale-yes-regexp in the manual that gettext is
recommended for a yes/no string for output. The same as the glibc
manual says.
> +#ifdef YESSTR
> + DEFINE_NLITEM_CONSTANT (YESSTR);
> +#endif
> +#ifdef NOSTR
> + DEFINE_NLITEM_CONSTANT (NOSTR);
> +#endif
If unused then I expect that can be dropped.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-24 20:58 ` Kevin Ryde
@ 2007-01-31 21:13 ` Ludovic Courtès
2007-01-31 22:06 ` Kevin Ryde
0 siblings, 1 reply; 23+ messages in thread
From: Ludovic Courtès @ 2007-01-31 21:13 UTC (permalink / raw)
To: guile-devel
Hi,
I finally went ahead and merged the new i18n module into HEAD for you to
test. Hopefully nothing will break and the i18n API won't need to
changed (much). ;-)
A few more notes...
Kevin Ryde <user42@zip.com.au> writes:
> Call the node just "Internationalization" and then call the sub nodes
> something else, that should fit fine. There's a good chance people
> coming to such matters for the first time won't even know what "i18n"
> stands for.
Not possible, because "Internationalization" already names the parent
node, and "Introduction" alone probably isn't a good idea.
>>%global-locale
>> I wondered too. My conclusion is that there's nothing to be afraid of:
>> after all, it's a special object and we have full control over it
>> (pretty much like the EOF object, for instance[*]).
>
> If someone sets the locale with setlocale from C, does it still work,
> or does %global-locale get out of sync. Gtk likes to do that for
> instance.
Yes, it works. `%global-locale' is just a locale SMOB whose value is
NULL instead of pointing to an actual `scm_locale_t' object.
>> I think the above Scheme constructs are easier to work
>> with (to maintain, to add `localeconv' support, etc.) than a C
>> equivalent.
>
> I think you'll find it's easier in C, especially if you have to worry
> about different combinations of fallbacks in different funcs.
At this point I'm not convinced about C being easier to deal with but
we'll see when we add `strftime' or `localeconv' support.
> If you can bring it to a high state of polish then you can put it all
> in 1.8.
Back in November or December, there was quite a consensus on this list
about having a strict policy for stable releases, where no new feature
would be added. Back then, I advocated your approach, but a strict
policy has clear advantages, so we should probably stick to it.
>> I prefer to explicitly specify the encoding of non-ASCII files.
>
> But is it non-ascii?
The very presence of my family name in the file makes it non-ASCII. :-)
Thanks,
Ludovic.
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: More i18n
2007-01-31 21:13 ` Ludovic Courtès
@ 2007-01-31 22:06 ` Kevin Ryde
0 siblings, 0 replies; 23+ messages in thread
From: Kevin Ryde @ 2007-01-31 22:06 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guile-devel
ludo@chbouib.org (Ludovic Courtès) writes:
>
> Not possible, because "Internationalization" already names the parent
> node, and "Introduction" alone probably isn't a good idea.
"Internationalization Introduction" will be a goer then. It's
self-explanatory so there's no need for extra words on the info menu
line. (Having node names self-explanatory is of course much better
than adding extra explanation in a particular spot.)
> Back in November or December, there was quite a consensus on this list
> about having a strict policy for stable releases,
It died for lack of interest, not agreement :). There's really not
enough active work to need to be strict about "stable". If something
works and has a good definition it can be released ...
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2007-01-31 22:06 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-10 15:04 More i18n Ludovic Courtès
2006-12-11 19:42 ` Neil Jerram
2006-12-12 9:09 ` Ludovic Courtès
2006-12-12 19:48 ` Kevin Ryde
2006-12-31 16:23 ` Neil Jerram
2006-12-31 16:22 ` Neil Jerram
2006-12-13 0:11 ` Kevin Ryde
2006-12-31 16:54 ` Neil Jerram
2006-12-11 23:26 ` Kevin Ryde
2006-12-12 9:36 ` Ludovic Courtès
2006-12-12 19:43 ` Kevin Ryde
2006-12-12 19:19 ` Kevin Ryde
2006-12-31 16:15 ` Neil Jerram
2007-01-01 22:32 ` Kevin Ryde
2007-01-16 21:46 ` Ludovic Courtès
2007-01-16 22:08 ` Ludovic Courtès
2007-01-23 0:49 ` Kevin Ryde
2007-01-23 0:46 ` Kevin Ryde
2007-01-23 23:44 ` Ludovic Courtès
2007-01-24 20:58 ` Kevin Ryde
2007-01-31 21:13 ` Ludovic Courtès
2007-01-31 22:06 ` Kevin Ryde
2007-01-25 1:05 ` Kevin Ryde
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).