From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maxime Devos
Newsgroups: gmane.lisp.guile.devel
Subject: Re: [PATCH] Enable utf8->string to take a range
Date: Wed, 09 Mar 2022 14:24:14 +0100
Message-ID: <9e8b7145e02b643dea86a0ddfd415d950558b882.camel@telenet.be>
References: <87h79x6abc.fsf@vijaymarupudi.com> <0f4ce6f8ddbdd9456dcc0063b206bf8c76d71da6.camel@telenet.be> <87bl046dss.fsf@vijaymarupudi.com> <87pmokmuon.fsf@vijaymarupudi.com>
In-Reply-To: <87pmokmuon.fsf@vijaymarupudi.com>
To: Vijay Marupudi, guile-devel@gnu.org
List-Id: "Developers list for Guile, the GNU extensibility library"
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

Vijay Marupudi wrote on Fri 21-01-2022 at 20:21 [-0500]:
> +SCM_DEFINE (scm_utf8_range_to_string, "utf8->string",
> +            1, 2, 0,
> +            (SCM utf, SCM start, SCM end),
> +            "Return a newly allocate string that contains from the UTF-8-"
> +            "encoded contents of bytevector @var{utf}.")

This is incorrect, since the nul character is encoded even though UTF-8
proper does not allow encoding the nul character. UTF-8 with an
encoding of the nul character is sometimes called ‘modified UTF-8’.

The distinction is sometimes relevant. For example, the GNS specification
asks for labels to be encoded in UTF-8, and according to the spec
writers, that implies that nul characters are forbidden. As such, I
cannot rely on 'utf8->string' to verify that there aren't any nul
characters.

Greetings,
Maxime.
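
The separate check that the last paragraph calls for could look roughly like the sketch below. It is plain C over a raw byte buffer, not code from the patch: the helper name utf8_range_contains_nul and its (start, end) parameters are assumptions made for illustration, chosen to mirror the range arguments in the quoted SCM_DEFINE. It relies only on the fact that in UTF-8 the byte 0x00 appears solely as the encoding of U+0000, so a byte scan suffices and no decoding is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libguile or of the patch under
   review: report whether the half-open byte range [start, end) of
   BUF contains a nul byte (0x00).  The caller is assumed to have
   checked that START and END lie within the buffer.  */
static bool
utf8_range_contains_nul (const unsigned char *buf, size_t start, size_t end)
{
  /* An empty or inverted range contains no bytes at all.  */
  if (end <= start)
    return false;

  /* memchr returns NULL when the byte value does not occur.  */
  return memchr (buf + start, 0, end - start) != NULL;
}
```

A caller that must reject nul characters (as in the GNS label case) would run such a check on the bytevector contents first and only then decode, rather than expecting the decoder to signal an error.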