* uc_tolower (uc_toupper (x))
@ 2011-03-10 23:39 Mark H Weaver
From: Mark H Weaver @ 2011-03-10 23:39 UTC
To: guile-devel
I've noticed that srfi-13.c very frequently does:
uc_tolower (uc_toupper (x))
Is there a good reason to do this instead of:
uc_tolower (x)
?
Mark
* Re: uc_tolower (uc_toupper (x))
@ 2011-03-11 0:54 Mike Gran
From: Mike Gran @ 2011-03-11 0:54 UTC
To: Mark H Weaver, guile-devel@gnu.org
> From:Mark H Weaver <mhw@netris.org>
> To:guile-devel@gnu.org
> Cc:
> Sent:Thursday, March 10, 2011 3:39 PM
> Subject:uc_tolower (uc_toupper (x))
>
> I've noticed that srfi-13.c very frequently does:
>
> uc_tolower (uc_toupper (x))
>
> Is there a good reason to do this instead of:
>
> uc_tolower (x)
Unicode defines a case folding algorithm, as well as
a data table, for case-insensitive comparison. Converting
to lowercase is a decent approximation of case folding,
but the upper->lower round trip picks up a few more of
the corner cases. One is U+03C2 GREEK SMALL LETTER FINAL
SIGMA and U+03C3 GREEK SMALL LETTER SIGMA, which are the
same letter with different representations: both uppercase
to U+03A3, which lowercases to U+03C3. Another is
U+00B5 MICRO SIGN and U+03BC GREEK SMALL LETTER MU,
which are supposed to compare equal when case is ignored.
Now that we've pulled in all of libunistring, it might
be a good idea to see whether it has a complete implementation
of Unicode case folding, because upper->lower is not
completely correct either.
-Mike