unofficial mirror of guile-devel@gnu.org 
* [PATCH 0/3] Documentation improvements
@ 2024-06-25 11:20 Andrew Tropin
  2024-06-25 11:20 ` [PATCH 1/3] Make string-length documentation more correct Andrew Tropin
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Andrew Tropin @ 2024-06-25 11:20 UTC (permalink / raw)
  To: guile-devel; +Cc: Andrew Tropin

Fix spelling, mentions of removed code and factual inaccuracies.

Andrew Tropin (3):
  Make string-length documentation more correct
  Change make-dynamic-state mentions to current-dynamic-state
  Fix spelling

 doc/r5rs/r5rs.texi          | 2 +-
 doc/ref/api-scheduling.texi | 2 +-
 doc/ref/srfi-modules.texi   | 2 +-
 doc/ref/vm.texi             | 2 +-
 libguile/fluids.c           | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

-- 
2.45.1





* [PATCH 1/3] Make string-length documentation more correct
  2024-06-25 11:20 [PATCH 0/3] Documentation improvements Andrew Tropin
@ 2024-06-25 11:20 ` Andrew Tropin
  2024-06-25 11:27   ` Maxime Devos
  2024-06-25 11:20 ` [PATCH 2/3] Change make-dynamic-state mentions to current-dynamic-state Andrew Tropin
  2024-06-25 11:20 ` [PATCH 3/3] Fix spelling Andrew Tropin
  2 siblings, 1 reply; 14+ messages in thread
From: Andrew Tropin @ 2024-06-25 11:20 UTC (permalink / raw)
  To: guile-devel; +Cc: Andrew Tropin

* doc/r5rs/r5rs.texi:
---
 doc/r5rs/r5rs.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/r5rs/r5rs.texi b/doc/r5rs/r5rs.texi
index 775c93094..f2e9dda19 100644
--- a/doc/r5rs/r5rs.texi
+++ b/doc/r5rs/r5rs.texi
@@ -5846,7 +5846,7 @@ Returns a newly allocated string composed of the arguments.
 
 @deffn {procedure} string-length  string
 
-Returns the number of characters in the given @var{string}.
+Returns the number of bytes in the given @var{string}.
 @end deffn
 
 
-- 
2.45.1





* [PATCH 2/3] Change make-dynamic-state mentions to current-dynamic-state
  2024-06-25 11:20 [PATCH 0/3] Documentation improvements Andrew Tropin
  2024-06-25 11:20 ` [PATCH 1/3] Make string-length documentation more correct Andrew Tropin
@ 2024-06-25 11:20 ` Andrew Tropin
  2024-06-25 11:20 ` [PATCH 3/3] Fix spelling Andrew Tropin
  2 siblings, 0 replies; 14+ messages in thread
From: Andrew Tropin @ 2024-06-25 11:20 UTC (permalink / raw)
  To: guile-devel; +Cc: Andrew Tropin

* doc/ref/api-scheduling.texi:
* libguile/fluids.c:
---
 doc/ref/api-scheduling.texi | 2 +-
 libguile/fluids.c           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/ref/api-scheduling.texi b/doc/ref/api-scheduling.texi
index d79808049..f6cc942a1 100644
--- a/doc/ref/api-scheduling.texi
+++ b/doc/ref/api-scheduling.texi
@@ -199,7 +199,7 @@ A fluid created with @code{make-thread-local-fluid} won't be captured by
 Return a newly created fluid, whose initial value is @var{dflt}, or
 @code{#f} if @var{dflt} is not given.  Unlike fluids made with
 @code{make-fluid}, thread local fluids are not captured by
-@code{make-dynamic-state}.  Similarly, a newly spawned child thread does
+@code{current-dynamic-state}.  Similarly, a newly spawned child thread does
 not inherit thread-local fluid values from the parent thread.
 @end deffn
 
diff --git a/libguile/fluids.c b/libguile/fluids.c
index ebdb48fbc..4632f32ae 100644
--- a/libguile/fluids.c
+++ b/libguile/fluids.c
@@ -264,7 +264,7 @@ SCM_DEFINE (scm_make_thread_local_fluid, "make-thread-local-fluid", 0, 1, 0,
 	    "Return a newly created fluid, whose initial value is @var{dflt},\n"
             "or @code{#f} if @var{dflt} is not given.  Unlike fluids made\n"
 	    "with @code{make-fluid}, thread local fluids are not captured\n"
-            "by @code{make-dynamic-state}.  Similarly, a newly spawned\n"
+            "by @code{current-dynamic-state}.  Similarly, a newly spawned\n"
             "child thread does not inherit thread-local fluid values from\n"
             "the parent thread.")
 #define FUNC_NAME s_scm_make_thread_local_fluid
-- 
2.45.1





* [PATCH 3/3] Fix spelling
  2024-06-25 11:20 [PATCH 0/3] Documentation improvements Andrew Tropin
  2024-06-25 11:20 ` [PATCH 1/3] Make string-length documentation more correct Andrew Tropin
  2024-06-25 11:20 ` [PATCH 2/3] Change make-dynamic-state mentions to current-dynamic-state Andrew Tropin
@ 2024-06-25 11:20 ` Andrew Tropin
  2 siblings, 0 replies; 14+ messages in thread
From: Andrew Tropin @ 2024-06-25 11:20 UTC (permalink / raw)
  To: guile-devel; +Cc: Andrew Tropin

* doc/ref/srfi-modules.texi:
* doc/ref/vm.texi:
---
 doc/ref/srfi-modules.texi | 2 +-
 doc/ref/vm.texi           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi
index 02da3e2f2..7e2295acd 100644
--- a/doc/ref/srfi-modules.texi
+++ b/doc/ref/srfi-modules.texi
@@ -5093,7 +5093,7 @@ wrap procedure bodies with @code{(lazy ...)}.
 @end itemize
 
 @node SRFI-46
-@subsection SRFI-46 Basic syntax-rules Extensions
+@subsection SRFI-46 - Basic syntax-rules Extensions
 @cindex SRFI-46
 
 Guile's core @code{syntax-rules} supports the extensions specified by
diff --git a/doc/ref/vm.texi b/doc/ref/vm.texi
index b0669f0d4..c0c4aa3c4 100644
--- a/doc/ref/vm.texi
+++ b/doc/ref/vm.texi
@@ -533,7 +533,7 @@ Side tables of procedure properties, arities, and docstrings.
 @item .guile.docstrs.strtab
 Side table of frame maps, describing the set of live slots for ever
 return point in the program text, and whether those slots are pointers
-are not.  Used by the garbage collector.
+or not.  Used by the garbage collector.
 @item .debug_info
 @itemx .debug_abbrev
 @itemx .debug_str
-- 
2.45.1





* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-25 11:20 ` [PATCH 1/3] Make string-length documentation more correct Andrew Tropin
@ 2024-06-25 11:27   ` Maxime Devos
  2024-06-26 11:18     ` Andrew Tropin
  0 siblings, 1 reply; 14+ messages in thread
From: Maxime Devos @ 2024-06-25 11:27 UTC (permalink / raw)
  To: Andrew Tropin, guile-devel@gnu.org; +Cc: Andrew Tropin

[-- Attachment #1: Type: text/plain, Size: 516 bytes --]

 >-Returns the number of characters in the given @var{string}.
+Returns the number of bytes in the given @var{string}.
 
This is false. For example, (string-length "😀") is 1, whereas in all encodings I know of it is more than one byte. Also, R5RS says:

>procedure: string-length string
>Returns the number of characters in the given string.

, not “return the number of bytes”. Without mentioning the encoding, the “number of bytes” would be ill-defined anyways.
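
A minimal sketch of the character/byte distinction, assuming a Guile REPL with the (rnrs bytevectors) module loaded. The strings are built from code point numbers (an illustrative choice, not something from the patch) so the result does not depend on how this message was encoded in transit:

```scheme
(use-modules (rnrs bytevectors))

;; Build the strings from code points to sidestep port-encoding issues.
(define grin (string #\x1f600))   ; U+1F600 GRINNING FACE
(define e-acute (string #\xe9))   ; U+00E9 LATIN SMALL LETTER E WITH ACUTE

;; string-length counts characters (code points) ...
(string-length grin)                          ; => 1
(string-length e-acute)                       ; => 1

;; ... while the byte count depends on the chosen encoding.
(bytevector-length (string->utf8 grin))       ; => 4
(bytevector-length (string->utf8 e-acute))    ; => 2
(bytevector-length (string->utf16 grin))      ; => 4 (one surrogate pair)
(bytevector-length (string->utf16 e-acute))   ; => 2
```

The same string yields different byte counts under different encodings, which is why "number of bytes" needs an encoding to be well-defined.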

Best regards,
Maxime Devos.

[-- Attachment #2: Type: text/html, Size: 1934 bytes --]


* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-25 11:27   ` Maxime Devos
@ 2024-06-26 11:18     ` Andrew Tropin
  2024-06-26 11:46       ` Maxime Devos
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Tropin @ 2024-06-26 11:18 UTC (permalink / raw)
  To: Maxime Devos, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 758 bytes --]

On 2024-06-25 13:27, Maxime Devos wrote:

>  >-Returns the number of characters in the given @var{string}.
> +Returns the number of bytes in the given @var{string}.
>  
> This is false. For example, (string-length "😀") is 1, whereas in all encodings I know of it is more than one byte. Also, R5RS says:

Maybe `the number of codepoints` will work here.

(string-length "👨‍🏭") ;; => 3
(string-length "é") ;; => 2

The number of characters here is 1 in both cases.

>
>>procedure: string-length string
>>Returns the number of characters in the given string.
>
> , not “return the number of bytes”. Without mentioning the encoding, the “number of bytes” would be ill-defined anyways.

-- 
Best regards,
Andrew Tropin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]


* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 11:18     ` Andrew Tropin
@ 2024-06-26 11:46       ` Maxime Devos
  2024-06-26 12:07         ` tomas
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Maxime Devos @ 2024-06-26 11:46 UTC (permalink / raw)
  To: Andrew Tropin, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 1049 bytes --]


>>  >-Returns the number of characters in the given @var{string}.
>> +Returns the number of bytes in the given @var{string}.
>>  
>> This is false. For example, (string-length "😀") is 1, whereas in all encodings I know of it is more than one byte. Also, R5RS says: [...]
>
>Maybe `the number of codepoints` will work here.
>
>(string-length "👨‍🏭") ;; => 3
>(string-length "é") ;; => 2
>
>The number of characters here is 1 in both cases.

No, in Unicode (and Guile equates character=Unicode character) all characters correspond to a single codepoint.

You need to fix your setup, that’s not what Guile does. Are you sure you have set the encoding of current-input-port correctly? (Probably by setting LC_ALL or the like to a UTF-8 locale.) Otherwise the 3 bytes in the UTF-8 encoding might be interpreted in terms of some 8-bit encoding.

Here’s a test: if you can input #\👨‍🏭 without errors and it evaluates to #\👨‍🏭, then the encoding should be set up correctly.

Best regards,
Maxime Devos

[-- Attachment #2: Type: text/html, Size: 3043 bytes --]


* Re: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 11:46       ` Maxime Devos
@ 2024-06-26 12:07         ` tomas
  2024-06-26 12:09           ` Maxime Devos
  2024-06-26 12:18         ` Jean Abou Samra
  2024-06-28 13:38         ` Andrew Tropin
  2 siblings, 1 reply; 14+ messages in thread
From: tomas @ 2024-06-26 12:07 UTC (permalink / raw)
  To: Maxime Devos; +Cc: Andrew Tropin, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]

On Wed, Jun 26, 2024 at 01:46:28PM +0200, Maxime Devos wrote:
> 
> >>  >-Returns the number of characters in the given @var{string}.
> >> +Returns the number of bytes in the given @var{string}.
> >>  
> >> This is false. For example, (string-length "😀") is 1, whereas in all encodings I know of it is more than one byte. Also, R5RS says: [...]
> >
> >Maybe `the number of codepoints` will work here.
> >
> >(string-length "👨‍🏭") ;; => 3
> >(string-length "é") ;; => 2
> >
> >The number of characters here is 1 in both cases.
> 
> No, in Unicode (and Guile equates character=Unicode character) all characters correspond to a single codepoint.

It's more subtle than that: Unicode knows about "combining characters",
so it's quite possible that Andrew's "é" consists of two code points
(FWIW, it arrives to me as just one, but perhaps there was some
canonicalization [1] step in between).

ISTR that "Unicode character" is actually synonymous with "Unicode
code point" -- but the common meaning of "character" is more fuzzy. Perhaps
it's wise to avoid that word when trying to be precise.

Cheers

[1] https://en.wikipedia.org/wiki/Unicode_normalization

-- 
t

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]


* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 12:07         ` tomas
@ 2024-06-26 12:09           ` Maxime Devos
  0 siblings, 0 replies; 14+ messages in thread
From: Maxime Devos @ 2024-06-26 12:09 UTC (permalink / raw)
  To: tomas@tuxteam.de; +Cc: Andrew Tropin, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

>ISTR that "Unicode character" is actually synonymous with "Unicode
>code point" -- but the common meaning of "character" is more fuzzy. Perhaps
>it's wise to avoid that word when trying to be precise.

My second point was that it is too late for that, unless you intend to rename procedures like char? etc.


[-- Attachment #2: Type: text/html, Size: 1490 bytes --]


* Re: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 11:46       ` Maxime Devos
  2024-06-26 12:07         ` tomas
@ 2024-06-26 12:18         ` Jean Abou Samra
  2024-06-26 12:26           ` Maxime Devos
  2024-06-28 13:38         ` Andrew Tropin
  2 siblings, 1 reply; 14+ messages in thread
From: Jean Abou Samra @ 2024-06-26 12:18 UTC (permalink / raw)
  To: Maxime Devos, Andrew Tropin, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 1478 bytes --]

Le mercredi 26 juin 2024 à 13:46 +0200, Maxime Devos a écrit :
> > 
> > Maybe `the number of codepoints` will work here.
> > (string-length "👨‍🏭") ;; => 3
> > (string-length "é") ;; => 2
> >
> > The number of characters here is 1 in both cases.
> 
> No, in Unicode (and Guile equates character=Unicode character) all
> characters correspond to a single codepoint.


Agreed. "The number of code points" would be correct, but "the number
of characters" (i.e., the current wording) is correct too. In the
Scheme terminology, a character is just a Unicode code point,
as can be seen from the name of the procedure char? and related
APIs.


> You need to fix your setup, that’s not what Guile does.


No; he wrote é, U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT,
which is two characters unlike é, LATIN SMALL LETTER E WITH ACUTE.

Likewise 👨‍🏭 is U+1F468 MAN + U+200D ZERO WIDTH JOINER + U+1F3ED FACTORY.

The "visual characters" are called grapheme clusters, and AFAIK Guile
doesn't provide any API that relates to grapheme clusters. (Note that
the number of grapheme clusters in a given string depends on the Unicode
database and therefore on the Unicode version.)
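
A small sketch of the two spellings of "é", assuming a stock Guile where string-normalize-nfc is available in the core. The strings are built from code point numbers so that no mail client can silently recompose them; the variable names are illustrative only:

```scheme
;; "é" as two code points: U+0065 followed by U+0301 COMBINING ACUTE ACCENT.
(define decomposed (string #\e #\x301))
;; "é" as the single precomposed code point U+00E9.
(define precomposed (string #\xe9))

(string-length decomposed)                                 ; => 2
(string-length precomposed)                                ; => 1

;; The two strings differ code point by code point ...
(string=? decomposed precomposed)                          ; => #f
;; ... but NFC normalization composes the pair into U+00E9.
(string=? (string-normalize-nfc decomposed) precomposed)   ; => #t
(string-length (string-normalize-nfc decomposed))          ; => 1
```

Normalization happens to give 1 here, but it is not a grapheme-cluster count in general: sequences with no precomposed form, such as the ZWJ emoji above, keep their full code point length.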

There are programming languages where the data type called "character"
corresponds to grapheme clusters, but I don't think this is common.
Swift is the only example I know.

Obligatory reading: https://hsivonen.fi/string-length/



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 228 bytes --]


* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 12:18         ` Jean Abou Samra
@ 2024-06-26 12:26           ` Maxime Devos
  2024-06-26 14:47             ` Damien Mattei
  2024-06-28 13:42             ` Andrew Tropin
  0 siblings, 2 replies; 14+ messages in thread
From: Maxime Devos @ 2024-06-26 12:26 UTC (permalink / raw)
  To: Jean Abou Samra, Andrew Tropin, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 353 bytes --]

>No; he wrote é, U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT,
>which is two characters unlike é, LATIN SMALL LETTER E WITH ACUTE.
>
>Likewise 👨‍🏭 is U+1F468 MAN + U+200D ZERO WIDTH JOINER + U+1F3ED FACTORY.

Right, I should have tested that instead of assuming it’s the pre-combined é and a single-codepoint emoji.


[-- Attachment #2: Type: text/html, Size: 2000 bytes --]


* Re: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 12:26           ` Maxime Devos
@ 2024-06-26 14:47             ` Damien Mattei
  2024-06-28 13:42             ` Andrew Tropin
  1 sibling, 0 replies; 14+ messages in thread
From: Damien Mattei @ 2024-06-26 14:47 UTC (permalink / raw)
  To: guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 71 bytes --]

and how long for this one? 𓄿𓎢𓆑𓇋𓅃𓉔𓇌𓃀𓆓
😂

[-- Attachment #2: Type: text/html, Size: 229 bytes --]


* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 11:46       ` Maxime Devos
  2024-06-26 12:07         ` tomas
  2024-06-26 12:18         ` Jean Abou Samra
@ 2024-06-28 13:38         ` Andrew Tropin
  2 siblings, 0 replies; 14+ messages in thread
From: Andrew Tropin @ 2024-06-28 13:38 UTC (permalink / raw)
  To: Maxime Devos, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 1347 bytes --]

On 2024-06-26 13:46, Maxime Devos wrote:

>>>  >-Returns the number of characters in the given @var{string}.
>>> +Returns the number of bytes in the given @var{string}.
>>>  
>>> This is false. For example, (string-length "😀") is 1, whereas in all encodings I know of it is more than one byte. Also, R5RS says: [...]
>>
>>Maybe `the number of codepoints` will work here.
>>
>>(string-length "👨‍🏭") ;; => 3
>>(string-length "é") ;; => 2
>>
>>The number of characters here is 1 in both cases.
>
> No, in Unicode (and Guile equates character=Unicode character) all characters correspond to a single codepoint.
>
> You need to fix your setup, that’s not what Guile does. Are you sure you have set the encoding of current-input-port correctly? (Probably by setting LC_ALL or the like to a UTF-8 locale.) Otherwise the 3 bytes in the UTF-8 encoding might be interpreted in terms of some 8-bit encoding.
>
> Here’s a test: if you can input #\👨‍🏭 without errors and it evaluates to #\👨‍🏭, then the encoding should be set up correctly.

(setlocale LC_ALL) ;; => "en_US.utf8"
(display #\👨‍🏭) ;; => /home/bob/guile-ares-rs/dev/guile/tmp.scm:84:15: unknown character name 👨‍🏭

The same happens if I do it in the usual REPL:
LC_ALL=en_US.utf8 guile
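
A plausible reading of that error, sketched under the assumption that the emoji is the three-code-point sequence spelled out elsewhere in this thread: the #\ syntax names a single character, so a ZWJ sequence cannot be written as one character literal even in a UTF-8 locale. Building the string from code point numbers shows the parts (the name worker is illustrative only):

```scheme
;; U+1F468 MAN + U+200D ZERO WIDTH JOINER + U+1F3ED FACTORY
(define worker (string #\x1f468 #\x200d #\x1f3ed))

(string-length worker)                       ; => 3

;; List the code points in hexadecimal, one per character.
(map (lambda (c) (number->string (char->integer c) 16))
     (string->list worker))
;; => ("1f468" "200d" "1f3ed")
```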

-- 
Best regards,
Andrew Tropin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]


* RE: [PATCH 1/3] Make string-length documentation more correct
  2024-06-26 12:26           ` Maxime Devos
  2024-06-26 14:47             ` Damien Mattei
@ 2024-06-28 13:42             ` Andrew Tropin
  1 sibling, 0 replies; 14+ messages in thread
From: Andrew Tropin @ 2024-06-28 13:42 UTC (permalink / raw)
  To: Maxime Devos, Jean Abou Samra, guile-devel@gnu.org

[-- Attachment #1: Type: text/plain, Size: 557 bytes --]

On 2024-06-26 14:26, Maxime Devos wrote:

>>No; he wrote é, U+0065 LATIN SMALL LETTER E + U+0301 COMBINING ACUTE ACCENT,
>>which is two characters unlike é, LATIN SMALL LETTER E WITH ACUTE.
>>
>>Likewise 👨‍🏭 is U+1F468 MAN + U+200D ZERO WIDTH JOINER + U+1F3ED FACTORY.
>
> Right, I should have tested that instead of assuming it’s the
> pre-combined é and a single-codepoint emoji.
>

Let's keep string-length documentation intact :)

It would be cool if somebody applied the other two patches.

-- 
Best regards,
Andrew Tropin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

