unofficial mirror of guile-devel@gnu.org 
* Minor queries about Unicode char docs
@ 2009-08-31 10:21 Neil Jerram
  2009-09-01  1:40 ` Mike Gran
  0 siblings, 1 reply; 4+ messages in thread
From: Neil Jerram @ 2009-08-31 10:21 UTC (permalink / raw)
  To: Mike Gran; +Cc: Guile Development

First of all, thanks for making these docs (specifically, commit
3f12aed) so clear.  They seem so much clearer and simpler to me than
the months of back-and-forth discussion on r6rs-discuss.  I know those
things are not really comparable, but I hope you can see what I mean.

Then, a couple of queries.

 SCM_DEFINE1 (scm_char_less_p, "char<?", scm_tc7_rpsubr, 
              (SCM x, SCM y),
-            "Return @code{#t} iff @var{x} is less than @var{y} in the Unicode sequence,\n"
-            "else @code{#f}.")
+             "Return @code{#t} iff the code point of @var{x} is less than the code\n"
+             "point of @var{y}, else @code{#f}.")

I think there's a case here for making the docstring not identical to
the corresponding manual text.  In the manual context, the section
begins with talking about Unicode, so "Unicode" can be assumed for
everything that follows.  But in the docstring, when someone types
(help char<?), they'll just see

  Return `#t' iff the code point of `x' is less than the code
  point of `y', else `#f'.

For this context I think it would be clearer to say

  Return `#t' iff the Unicode code point of `x' is less than the
  code point of `y', else `#f'.
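
To make the code point ordering concrete, here is a minimal sketch in
Python (not Guile code).  The helper name `char_less` is hypothetical;
the only assumption is that the comparison is made on Unicode code
points, as the docstring says.

```python
def char_less(x: str, y: str) -> bool:
    """Compare two single characters by their Unicode code points."""
    if len(x) != 1 or len(y) != 1:
        raise ValueError("expected single characters")
    return ord(x) < ord(y)


# 'Z' is U+005A and 'a' is U+0061, so uppercase sorts before lowercase.
print(char_less("Z", "a"))
# 'z' is U+007A and e-acute is U+00E9, so accented letters sort after ASCII.
print(char_less("z", "\u00e9"))
```

Both calls print True, which shows the ordering is by code point and
not alphabetical in any language-specific sense.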

+Case-insensitive character comparisons of characters use @emph{Unicode
+case folding}.  In case folding comparisons, if a character is
+lowercase and has an uppercase form that can be expressed as a single
+character, it is converted to uppercase before comparison.  Unicode
+case folding is language independent: it uses rules that are generally
+true, but, it cannot cover all cases for all languages.

That's very clear, but what if a character doesn't have an uppercase
form that can be expressed as a single character?  Does Guile then
throw an exception, or does it perform the comparison with the
lowercase code point?

Thanks!

     Neil





* Re: Minor queries about Unicode char docs
  2009-08-31 10:21 Minor queries about Unicode char docs Neil Jerram
@ 2009-09-01  1:40 ` Mike Gran
  2009-09-01  1:43   ` Mike Gran
  2009-09-01 21:46   ` Neil Jerram
  0 siblings, 2 replies; 4+ messages in thread
From: Mike Gran @ 2009-09-01  1:40 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

On Mon, 2009-08-31 at 11:21 +0100, Neil Jerram wrote:
> I think there's a case here for making the docstring not identical to
> the corresponding manual text.  In the manual context, the section
> begins with talking about Unicode, so "Unicode" can be assumed for
> everything that follows.  But in the docstring, when someone types
> (help char<?), they'll just see
> 
>   Return `#t' iff the code point of `x' is less than the code
>   point of `y', else `#f'.
> 
> For this context I think it would be clearer to say
> 
>   Return `#t' iff the Unicode code point of `x' is less than the
>   code point of `y', else `#f'.

Sounds good.

> 
> +Case-insensitive character comparisons of characters use @emph{Unicode
> +case folding}.  In case folding comparisons, if a character is
> +lowercase and has an uppercase form that can be expressed as a single
> +character, it is converted to uppercase before comparison.  Unicode
> +case folding is language independent: it uses rules that are generally
> +true, but, it cannot cover all cases for all languages.
> 
> That's very clear, but what if a character doesn't have an uppercase
> form that can be expressed as a single character?  Does Guile then
> throw an exception, or does it perform the comparison with the
> lowercase code point?

I see what you mean.  The text should have something like...

"In case folding comparisons, if a character is lowercase and has an
uppercase form that can be expressed as a single character, its
uppercase form is used in the comparison.  All other characters are not
modified for the comparison.  Note that the German letter Sharp S
(Eszett) is not uppercased before the comparison since its plural has
two characters instead of one."

> 
> Thanks!
> 
>      Neil

Thanks,

Mike






* Re: Minor queries about Unicode char docs
  2009-09-01  1:40 ` Mike Gran
@ 2009-09-01  1:43   ` Mike Gran
  2009-09-01 21:46   ` Neil Jerram
  1 sibling, 0 replies; 4+ messages in thread
From: Mike Gran @ 2009-09-01  1:43 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

On Mon, 2009-08-31 at 18:40 -0700, Mike Gran wrote:
> Note that the German letter Sharp S
> (Eszett) is not uppercased before the comparison since its plural has
> two characters instead of one."

I meant to say 'its _uppercase form_ has two characters instead of one'.
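
With that wording, the rule could be sketched roughly as follows.  This
is Python, not Guile's implementation: `fold_char` and `char_ci_eq` are
hypothetical names, and Python's `str.upper()` is assumed here as a
stand-in for the Unicode uppercase mapping.

```python
def fold_char(ch: str) -> str:
    """Return the character that would be used in a case-insensitive comparison."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    upper = ch.upper()  # full Unicode mapping; may be longer than one character
    if ch.islower() and len(upper) == 1:
        return upper  # lowercase with a single-character uppercase form
    return ch  # everything else is left unmodified


def char_ci_eq(x: str, y: str) -> bool:
    """Case-insensitive equality under the rule sketched above."""
    return fold_char(x) == fold_char(y)


print(fold_char("a"))                          # uppercased to "A"
print(fold_char("\u00df") == "\u00df")         # sharp S stays as it is
print(char_ci_eq("a", "A"))
```

Under this sketch, sharp S is compared using its own code point, because
its uppercase form is the two-character sequence "SS".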







* Re: Minor queries about Unicode char docs
  2009-09-01  1:40 ` Mike Gran
  2009-09-01  1:43   ` Mike Gran
@ 2009-09-01 21:46   ` Neil Jerram
  1 sibling, 0 replies; 4+ messages in thread
From: Neil Jerram @ 2009-09-01 21:46 UTC (permalink / raw)
  To: Mike Gran; +Cc: Guile Development

Mike Gran <spk121@yahoo.com> writes:

>> For this context I think it would be clearer to say
>> 
>>   Return `#t' iff the Unicode code point of `x' is less than the
>>   code point of `y', else `#f'.
>
> Sounds good.

[..]
> I see what you mean.  The text should have something like...
>
> "In case folding comparisons, if a character is lowercase and has an
> uppercase form that can be expressed as a single character, its
> uppercase form is used in the comparison.  All other characters are not
> modified for the comparison.  Note that the German letter Sharp S
> (Eszett) is not uppercased before the comparison since its plural has
> two characters instead of one."

> I meant to say 'its _uppercase form_ has two characters instead of one'.

Thanks, those changes sound great.  Are you happy to commit them
sometime?

     Neil




