From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail
From: Maxime Devos <maximedevos@telenet.be>
Newsgroups: gmane.lisp.guile.devel
Subject: RE: Improving the handling of system data (env, users, paths, ...)
Date: Sun, 7 Jul 2024 16:59:10 +0200
Message-ID: <20240707165910.kez92C00H4hwdlW01ezAlE@andre.telenet-ops.be>
References: <878qyeqn1q.fsf@trouble.defaultvalue.org> <86jzhx3gxe.fsf@gnu.org>
 <9985c529ffbbabaa259ee62226ced1feec8c7810.camel@abou-samra.fr>
 <865xth31kq.fsf@gnu.org>
 <20240707133527.kbbT2C0064hwdlW01bbTq5@baptiste.telenet-ops.be>
 <8634ol2sal.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/alternative;
 boundary="_D92A67A6-BA24-4E64-8561-CBC64E4D06EE_"
Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214";
	logging-data="35820"; mail-complaints-to="usenet@ciao.gmane.io"
Cc: "jean@abou-samra.fr" <jean@abou-samra.fr>, 
 "rlb@defaultvalue.org" <rlb@defaultvalue.org>, 
 "guile-devel@gnu.org" <guile-devel@gnu.org>
To: Eli Zaretskii <eliz@gnu.org>
Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Sun Jul 07 16:59:45 2024
Return-path: <guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org>
Envelope-to: guile-devel@m.gmane-mx.org
Original-Received: from lists.gnu.org ([209.51.188.17])
	by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(Exim 4.92)
	(envelope-from <guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org>)
	id 1sQTMd-0008yE-5B
	for guile-devel@m.gmane-mx.org; Sun, 07 Jul 2024 16:59:43 +0200
Original-Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <guile-devel-bounces@gnu.org>)
	id 1sQTMG-0001kc-5T; Sun, 07 Jul 2024 10:59:20 -0400
Original-Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <maximedevos@telenet.be>)
 id 1sQTMC-0001iP-M3
 for guile-devel@gnu.org; Sun, 07 Jul 2024 10:59:16 -0400
Original-Received: from andre.telenet-ops.be ([2a02:1800:120:4::f00:15])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <maximedevos@telenet.be>)
 id 1sQTM9-000313-Vr
 for guile-devel@gnu.org; Sun, 07 Jul 2024 10:59:16 -0400
Original-Received: from [IPv6:2a02:1811:8c0e:ef00:95f6:12f6:aa85:7dcc]
 ([IPv6:2a02:1811:8c0e:ef00:95f6:12f6:aa85:7dcc])
 by andre.telenet-ops.be with bizsmtp
 id kez92C00H4hwdlW01ezAlE; Sun, 07 Jul 2024 16:59:10 +0200
Importance: normal
X-Priority: 3
In-Reply-To: <8634ol2sal.fsf@gnu.org>
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=telenet.be; s=r24;
 t=1720364350; bh=RikUqzO0lbCsmRISdyX9zXJCcl2w7oR3UOMtEO1bF3M=;
 h=To:Cc:From:Subject:Date:In-Reply-To:References;
 b=My/KLNQOyVEhWfIFMA3gz0ahPVQie/Ep5g98syRllcwftA3MjlOVwsLi0JnZng1+s
 3BKcWAs8BeZOGbhHNEGsUeNtOzkZaxHlP+fR5IW0WLlv+dUd07v+yOpaPLYpFSc4Sv
 q8rl8QdVZroqpcw/ieMOPXBLOhE3g9jj8CNzAFARxoTSDoTPg6QGTXRuGtkSIhRDFA
 rd+AMax8TSV6kDE+gDX8ye2Qws4wgKoxBXQmleZ0shEdpcoHUfPR/l9Z2gj/dfVi9m
 3kZEdaRPDKFSgxLEULlHGjrTTVkjm47I44Z8vGkQjb1u6bDTYt+QBh9VS6AJZkBgnc
 Fv3cARP+Cgl6Q==
Received-SPF: pass client-ip=2a02:1800:120:4::f00:15;
 envelope-from=maximedevos@telenet.be; helo=andre.telenet-ops.be
X-Spam_score_int: -27
X-Spam_score: -2.8
X-Spam_bar: --
X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001,
 HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: guile-devel@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Developers list for Guile,
 the GNU extensibility library" <guile-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guile-devel>,
 <mailto:guile-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/guile-devel>
List-Post: <mailto:guile-devel@gnu.org>
List-Help: <mailto:guile-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guile-devel>,
 <mailto:guile-devel-request@gnu.org?subject=subscribe>
Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org
Original-Sender: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org
Xref: news.gmane.io gmane.lisp.guile.devel:22557
Archived-At: <http://permalink.gmane.org/gmane.lisp.guile.devel/22557>

--_D92A67A6-BA24-4E64-8561-CBC64E4D06EE_
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

>> >> Guile is a Scheme implementation, bound by Scheme standards and compa=
tibility
>> >> with other Scheme implementations (and backwards compatibility too).
>> >
>> >Yes, I understand that.
>>=20
>> Going by what you are saying below, I think you don=E2=80=99t.
>
>Thank you for your vote of confidence.

That was not a vote of confidence; if anything, it=E2=80=99s the contrary.

> I=E2=80=99m pretty sure that they weren=E2=80=99t intending to get the 0x=
b5 byte. Rather, they were using the equivalent of =E2=80=98string-ref=E2=
=80=99 (i.e., =E2=80=98aref=E2=80=99) and demonstrating that the result is =
bogus in Scheme.  In Scheme, =E2=80=98(string-ref ...)=E2=80=99 needs to re=
turn a character, and there exists no (Unicode) character with codepoint 41=
94229, so what Emacs returns here would be bogus for (Guile) Scheme.

>aref in Emacs and string-ref in Guile are not the same, and if Guile
>needs to produce a raw byte in this scenario, it can be easily
>arranged.  In Emacs we have other goals.

It is the opposite. In Guile, string-ref needs to produce characters, not=
 bytes, just like aref (modulo differences in how Scheme and Emacs define=
 =E2=80=98byte=E2=80=99).

>IOW, I think this argument is pointless, since it is easy to adapt the
>mechanism to what Guile needs.

No =E2=80=93 the argument is about how it is impossible to adapt the mechan=
ism to Guile, since bytes aren=E2=80=99t characters in Unicode.
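
To make that concrete, here is a rough sketch (assuming Guile 3.0; the
procedure name unicode-scalar-value? is made up for this example and is
not part of Guile) of the range a Scheme character can cover:

```scheme
;; Hypothetical helper, not part of Guile: is N a Unicode scalar value,
;; i.e. a number that a Scheme character can stand for?
(define (unicode-scalar-value? n)
  (and (integer? n)
       (exact? n)
       (or (< -1 n #xD800)            ; below the surrogate range
           (< #xDFFF n #x110000))))   ; above it, up to U+10FFFF

(display (unicode-scalar-value? #xB5))     ; U+00B5, an ordinary character
(newline)
(display (unicode-scalar-value? 4194229))  ; the value from the example
(newline)
```

The first call prints #t and the second prints #f: 4194229 is #x3FFFB5,
far above U+10FFFF, so there is no character for string-ref to return.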

> >From the Emacs manual:
>=20
> >For example, you can access individual characters in a string using the =
function=C2=A0aref=C2=A0(see=C2=A0Functions that Operate on Arrays).
>=20
> Thus, (aref the-string index) is the equivalent of (string-ref the-string=
 index).

>No, because a raw byte is not a character.

Yes, because characters are characters. Both string-ref and aref return cha=
racters. This is documented in both the Emacs and Guile manuals:

Again, from the Emacs manual:

> A string is a fixed sequence of characters. [...] Since strings are array=
s, and therefore sequences as well, you can operate on them with the genera=
l array and sequence functions documented in=C2=A0Sequences, Arrays, and Ve=
ctors. For example, you can access individual characters in a string using =
the function=C2=A0aref=C2=A0(see=C2=A0Functions that Operate on Arrays).

Hence, (aref the-string index) returns (Emacs) characters.

Likewise, from the Guile manual:

> Scheme Procedure:=C2=A0string-ref=C2=A0str k
> C Function:=C2=A0scm_string_ref=C2=A0(str, k)
> Return character=C2=A0k=C2=A0of=C2=A0str=C2=A0using zero-origin index=
ing.=C2=A0k=C2=A0must be a valid index of=C2=A0str.

Clearly, these are equivalent (modulo difference in the meaning of =E2=80=
=98characters=E2=80=99).

>If Guile restricts itself to Unicode characters and only them, it will
>lack important features.  So my suggestion is not to have this
>restriction.

Guile restricting strings to Unicode _is_ an important feature (simplicity,=
 and compatibility).

Guile extending strings beyond Unicode is a _limitation_ (compatibility and=
 other trickiness for applications).

I could imagine that in the far future there might be too few codepoints=
 left in Unicode, in which case the range of what Guile (and more=
 generally, Scheme and Unicode) considers characters would need to be=
 extended (even if that has some compatibility implications), but that=
 time hasn=E2=80=99t arrived yet.

The important feature in this thread is supporting file names (and getenv=
 stuff, etc.) that don=E2=80=99t fit properly in the=
 =E2=80=98string=E2=80=99 model. As mentioned earlier (in the initial=
 message, even), there are solutions to that which do not impose the=
 =E2=80=98let characters go beyond Unicode=E2=80=99 limitation.

>I think the fact that this discussion is held, and that Rob suggested
>to use Latin-1 for the purpose of supporting raw bytes is a clear
>indication that Guile, too, needs to deal with "character-like" data
>that does not fit the Unicode framework.

True, and I never claimed otherwise.

> So I think saying that strings in Guile can only hold Unicode characters =
will not give you what this discussion attempts to give.

Sure, and I wasn=E2=80=99t trying to. What I (and IIUC, the other person=
 as well) was doing was pointing out that Emacs=E2=80=99s thing is not a=
 solution either. (Whether because of backwards compatibility, or=
 because of not _wanting_ to conflate bytes with characters (and not=
 wanting to go beyond Unicode) with all the consequences this conflation=
 would imply for applications.)

> In particular, how will you
> handle the situations described by Rob where a file has a name that is
> not a valid UTF-8 sequence (thus not "characters" as long as you
> interpret text as UTF-8)?

Scheme does not interpret text as UTF-8; that=E2=80=99s an internal=
 implementation detail and a matter of things like locales. Instead, to=
 Scheme, text is (Unicode) characters.

I have outlined a solution (that does not conflate characters with bytes) i=
n another response. IIRC, it was in a response to Rob. I would propose actu=
ally, you know, reading it. I=E2=80=99m not sure, but IIRC Rob also mention=
ed another solution (i.e., just accept bytevectors in some locations, or do=
 Latin-1).
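
As a rough sketch of the Latin-1 idea (assuming Guile 3.0 with the
(ice-9 iconv) and (rnrs bytevectors) modules; the variable names and the
sample bytes are made up, and this is not necessarily what Rob proposed):

```scheme
(use-modules (ice-9 iconv)        ; bytevector->string, string->bytevector
             (rnrs bytevectors))  ; u8-list->bytevector

;; Made-up bytes of a file name that is not valid UTF-8:
;; "foo" followed by a lone #xB5, which is not a complete UTF-8 sequence.
(define raw-name (u8-list->bytevector '(#x66 #x6F #x6F #xB5)))

;; Latin-1 maps every byte 0..255 to the code point with the same number,
;; so decoding never fails and the result holds only Unicode characters.
(define as-string (bytevector->string raw-name "ISO-8859-1"))

;; Encoding back to Latin-1 gives the original bytes again.
(define round-tripped (string->bytevector as-string "ISO-8859-1"))

(display (equal? raw-name round-tripped))
(newline)
```

This should print #t. Each character in the string merely stands for one
byte, so nothing outside Unicode is needed to carry the name around.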

Also, this structure makes no sense. Even if I did not provide an alternati=
ve solution of my own, that wouldn=E2=80=99t mean Emacs=E2=80=99s thing is =
the answer. (Negative) criticism can be valid without providing alternative=
s.

Best regards,
Maxime Devos.


--_D92A67A6-BA24-4E64-8561-CBC64E4D06EE_--