From: ludo@chbouib.org (Ludovic Courtès)
Subject: Re: Text collation
Date: Sun, 10 Dec 2006 13:30:00 +0100
Message-ID: <87odqcszlj.fsf@chbouib.org>
In-Reply-To: 87fyc1df70.fsf@zip.com.au
Hi,
Just a few notes about your remarks regarding `(ice-9 i18n)'. A patch
will follow soon.
Kevin Ryde <user42@zip.com.au> writes:
>> +@node The ice-9 i18n Module
>
> See if you can think of a better section name.
Actually, since we're not going to include this module in 1.8, I think
I'd be in favor of moving the `gettext'-related functions into `(ice-9
i18n)'. The doc would then have to be rearranged accordingly.
>> +@deffn {Scheme Procedure} string-locale<? s1 s2 [locale]
>> +@deffn {Scheme Procedure} string-locale>? s1 s2 [locale]
>> +@deffn {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
>> +@deffn {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
>> +@deffn {Scheme Procedure} string-locale-ci=? s1 s2 [locale]
>
> These could be described in one block I think, to avoid five very
> similar descriptions. Likewise the char ones.
Yes, done.
>> +... Note that SRFI-13 provides procedures that
>> +look similar (@pxref{Alphabetic Case Mapping}). However, the SRFI-13
>> +procedures are locale-independent.
>
> That's the intention of the srfi I guess, but it's not true currently
> is it? Don't they use toupper() and therefore get whatever nonsense
> the current setlocale() gives. Perhaps better leave the description
> of srfi-13 to that section.
Perhaps, but this is undocumented behavior. :-)
> Do you need a caveat about multibyte characters there, for now? Like
> "Note that in the current implementation Guile has no notion of
> multibyte characters and in a multibyte locale characters may not be
> converted correctly."
Yes.
>> +@deffn {Scheme Procedure} locale-string->integer str [base [locale]]
>> +@deffn {Scheme Procedure} locale-string->inexact str [locale]
>
> I think you should cross-reference strtol and strtod here, since their
> parsing is rather idiosyncratic. I'd even be a bit tempted to name
> them strtol and strtod in guile, to make it clear they're only one
> possible way of parsing. Except those names aren't very nice ...
I added a cross-ref to glibc's `strto{dl}', but I'm not willing to
change the names to the C library names (I'm not sure that's what you
were suggesting though).
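To illustrate the parsing behavior that the cross-reference is meant to warn about, here is a minimal C sketch, not the code from the patch. The helper names `sketch_parse_integer' and `sketch_parse_inexact' are hypothetical; only `strtol' and `strtod' are real C library functions. Both skip leading whitespace, stop silently at the first character they cannot use, and report how far they got through the end pointer:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse an integer prefix of STR with strtol ().
   Store the value in *VALUE and return the number of characters
   consumed (0 when nothing could be parsed).  */
size_t
sketch_parse_integer (const char *str, int base, long *value)
{
  char *end;

  /* strtol () skips leading whitespace and stops at the first
     character that is not valid in BASE.  */
  *value = strtol (str, &end, base);
  return (size_t) (end - str);
}

/* Hypothetical helper: same idea for floating point, using strtod ().
   The decimal separator depends on the current LC_NUMERIC setting.  */
size_t
sketch_parse_inexact (const char *str, double *value)
{
  char *end;

  *value = strtod (str, &end);
  return (size_t) (end - str);
}
```

In the default "C" locale, parsing "  42abc" in base 10 yields 42 with four characters consumed, and base 0 makes `strtol' accept a "0x" prefix. Trailing garbage is not an error as far as these functions are concerned, which is why the caller needs the consumed count.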
>> +... Return two values:
>
> Consider @pxref{Multiple Values}, since multi-values are (thankfully)
> fairly rare.
Yes, done.
>> - scmconfig.h.top gettext.h
>> + scmconfig.h.top libgettext.h
>
> I don't think that's good. Best leave gettext.h the gettext one, and
> use another name for guile. Gettext got there first, and it doesn't
> really matter which guile header has which prototypes.
The former `i18n.c' (which contained only `gettext'-related code) was
renamed to `gettext.c', which seems more appropriate. For consistency,
the corresponding header file had to be renamed from `i18n.h' to
`gettext.h'. Since that name was already taken by the header that comes
from Gettext, the Gettext header had to be renamed in turn.
`libgettext.h' doesn't seem such a bad name to me.
>> +/* This mutex is used to serialize invocations of `setlocale ()' on non-GNU
>> + systems (i.e., systems where a reentrant locale API is not available).
>> + See `i18n.c' for details. */
>> +scm_i_pthread_mutex_t scm_i_locale_mutex;
>
> There's an scm_i_misc_mutex for use when protection is (or should be)
> rarely needed.
It seems more robust to use a dedicated mutex.
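For readers who have not seen the non-GNU code path, the following is a rough C sketch of what serializing `setlocale ()' with a dedicated mutex can look like. It is an assumption-laden illustration, not the code from `i18n.c': the names `sketch_locale_mutex' and `sketch_strcoll_in_locale' are made up, and a plain pthread mutex stands in for `scm_i_pthread_mutex_t'.

```c
#include <locale.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Dedicated mutex (hypothetical name) guarding every setlocale () call
   made through this helper.  */
static pthread_mutex_t sketch_locale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Compare A and B with strcoll () under LOCALE_NAME.  On success, store
   the strcoll () result in *RESULT and return 0.  Return -1 if the
   locale cannot be installed.  The previous LC_COLLATE setting is
   restored before the mutex is released.  */
int
sketch_strcoll_in_locale (const char *locale_name,
                          const char *a, const char *b, int *result)
{
  int status = -1;
  char *saved = NULL;
  const char *current;

  pthread_mutex_lock (&sketch_locale_mutex);

  /* A later setlocale () call may overwrite the returned string, so
     keep a private copy of the current setting.  */
  current = setlocale (LC_COLLATE, NULL);
  if (current != NULL)
    {
      saved = malloc (strlen (current) + 1);
      if (saved != NULL)
        strcpy (saved, current);
    }

  if (saved != NULL && setlocale (LC_COLLATE, locale_name) != NULL)
    {
      *result = strcoll (a, b);
      setlocale (LC_COLLATE, saved);   /* restore the global locale */
      status = 0;
    }

  free (saved);
  pthread_mutex_unlock (&sketch_locale_mutex);
  return status;
}
```

The mutex only protects callers that go through it; any other thread calling `setlocale ()' directly can still interfere. Holding the lock for the whole save, switch, use, restore sequence is also why a mutex shared with unrelated code would be held for longer than that code expects.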
>> +/* Provide the locale category masks as found in glibc (copied from
>> + <locale.h> as found in glibc 2.3.6). This must be kept in sync with
>> + `locale-categories.h'. */
>> +# define LC_CTYPE_MASK (1 << LC_CTYPE)
>> +# define LC_COLLATE_MASK (1 << LC_COLLATE)
>> +# define LC_MESSAGES_MASK (1 << LC_MESSAGES)
>> +# define LC_MONETARY_MASK (1 << LC_MONETARY)
>> +# define LC_NUMERIC_MASK (1 << LC_NUMERIC)
>> +# define LC_TIME_MASK (1 << LC_TIME)
>
> I think you should put some privately selected bits there, not depend
> on LC_CTYPE etc being in range 0 to 31.
Good point, done.
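Roughly, the idea is the following. This is only a sketch with hypothetical `SKETCH_' names and bit assignments, not the definitions from the patch: the mask bits are chosen privately, and a small function maps a `setlocale ()' category to its mask, so nothing assumes that `LC_CTYPE' and friends are small non-negative integers.

```c
#include <locale.h>

/* Privately selected bits (hypothetical names and values).  They do
   not depend on the numeric values of LC_CTYPE, LC_COLLATE, etc.  */
#define SKETCH_LC_CTYPE_MASK    (1 << 0)
#define SKETCH_LC_COLLATE_MASK  (1 << 1)
#define SKETCH_LC_MONETARY_MASK (1 << 2)
#define SKETCH_LC_NUMERIC_MASK  (1 << 3)
#define SKETCH_LC_TIME_MASK     (1 << 4)
#define SKETCH_LC_MESSAGES_MASK (1 << 5)

#define SKETCH_LC_ALL_MASK                                \
  (SKETCH_LC_CTYPE_MASK | SKETCH_LC_COLLATE_MASK          \
   | SKETCH_LC_MONETARY_MASK | SKETCH_LC_NUMERIC_MASK     \
   | SKETCH_LC_TIME_MASK | SKETCH_LC_MESSAGES_MASK)

/* Map a setlocale () category to its private mask bit.  Return 0 for
   a category this sketch does not know about.  */
int
sketch_category_mask (int category)
{
  if (category == LC_CTYPE)
    return SKETCH_LC_CTYPE_MASK;
  if (category == LC_COLLATE)
    return SKETCH_LC_COLLATE_MASK;
  if (category == LC_MONETARY)
    return SKETCH_LC_MONETARY_MASK;
  if (category == LC_NUMERIC)
    return SKETCH_LC_NUMERIC_MASK;
  if (category == LC_TIME)
    return SKETCH_LC_TIME_MASK;
#ifdef LC_MESSAGES   /* POSIX, not ISO C */
  if (category == LC_MESSAGES)
    return SKETCH_LC_MESSAGES_MASK;
#endif
  if (category == LC_ALL)
    return SKETCH_LC_ALL_MASK;
  return 0;
}
```

With the `(1 << LC_CTYPE)' form, a platform that defines a category as a negative or large value would yield an undefined or overlapping shift; an explicit mapping avoids that at the cost of one extra function.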
>> +/* Alias for glibc's locale type. */
>> +typedef locale_t scm_t_locale;
>
> I suppose the emulation could provide locale_t. Might make it hard to
> exercise on an actual gnu system. A #define locale_t would likely be
> ok.
It seems safer to make changes only in the `scm_' name space. As a
matter of fact, I just discovered that Darwin now implements the
`locale_t' "GNU" API (I suppose that change is quite recent):
http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/newlocale.3.html
Thus, defining `locale_t', `newlocale', et al. internally would have
been a potential source of problems when building on that platform.
>> +#ifdef USE_GNU_LOCALE_API
>> + freelocale ((locale_t)c_locale);
>> +#else
>> + c_locale->base_locale = SCM_UNDEFINED;
>> + free (c_locale->locale_name);
>> + scm_gc_free (c_locale, sizeof (* c_locale), "locale");
>> +#endif
>
> A possibility there, and with other funcs, would be to implement a
> compatible freelocale(), instead of sticking conditionals in each
> usage.
(See above).
>> +#ifdef USE_GNU_LOCALE_API
>> +
>> + c_locale = newlocale (c_category_mask, c_locale_name, c_base_locale);
>> + if (!c_locale)
>> + locale = SCM_BOOL_F;
>
> Your docs call for an exception on unknown locale don't they?
Indeed, fixed.
> And should you tell the gc something about the size of a locale_t, and
> perhaps extra for its underlying data? To approximate memory used,
> for the gc triggers.
Yes, but `locale_t' is typically a pointer type, and the size of the
struct pointed to by `locale_t' could be opaque (although that is
currently not the case with glibc). So we could provide a guess for the
underlying object size, but maybe we can also just safely ignore it?
>> +void
>> +scm_init_i18n ()
>> +{
>> + scm_add_feature ("ice-9-i18n");
>
> Is there any point adding a feature after the module is loaded? :)
Indeed, removed. :-)
>> +(define (under-french-locale-or-unresolved thunk)
>> + ;; On non-GNU systems, an exception may be raised only when the locale is
>> + ;; actually used rather than at `make-locale'-time. Thus, we must guard
>> + ;; against both.
>> + (if %french-locale
>> + (catch 'system-error thunk
>> + (lambda (key . args)
>> + (throw 'unresolved)))
>> + (throw 'unresolved)))
>
> Do you mean 'unsupported rather than 'unresolved, when fr_FR isn't
> available from the system?
I really meant "unresolved", in the sense that the test cannot be run
when `fr_FR' isn't available.
>> +(with-test-prefix "number parsing"
>
> Some french number parsing too? Just to show there's a point to
> locale dependent parsing :).
Done.
Thanks for your detailed review!
Ludovic.