From: Ludovic Courtès
Subject: bug#35785: ‘string->uri’ is locale-dependent and breaks in ‘sv_SE’
Date: Tue, 28 May 2019 13:17:15 +0200
Message-ID: <8736ky3k1w.fsf@gnu.org>
In-Reply-To: <875zpw6mq0.fsf@ngyro.com> (Timothy Sample's message of "Mon, 27 May 2019 09:39:03 -0400")
List-Id: Bug reports for GNU Guix
To: Timothy Sample
Cc: 35785@debbugs.gnu.org, Einar Largenius

Hi Timothy,

Timothy Sample wrote:

> A quick reading of RFC 3986 suggests that the host part of a URI can be
> an IP address (version 4 or 6) or a registered name.
> It gives the following rules for registered names:
>
>   reg-name    = *( unreserved / pct-encoded / sub-delims )
>   unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
>   pct-encoded = "%" HEXDIG HEXDIG
>   sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
>               / "*" / "+" / "," / ";" / "="
>
> Here, “ALPHA”, “DIGIT”, and “HEXDIG” are specified in RFC 2234, and are
> just the ASCII ranges you might expect (except for that “HEXDIG” only
> allows uppercase letters).

Do you think you could turn that into a patch for Guile?  I’d happily
apply it.  :-)

It looks like both [[:alnum:]] & co. and ranges would be locale-dependent,
so my understanding is that we’ll have to list all the characters
explicitly, right?

Thanks,
Ludo’.
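
For illustration only, here is a minimal sketch of the "list all the characters explicitly" idea, written in Python rather than as the Guile patch itself. The names `is_reg_name`, `UNRESERVED`, `SUB_DELIMS` and `HEXDIG` are hypothetical, and accepting lowercase hex digits in `pct-encoded` is an assumption of this sketch, not something settled in the thread.

```python
# Every allowed character is spelled out; no [[:alnum:]]-style classes and
# no a-z style ranges, so the check cannot vary with the current locale.
ALPHA = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGIT = "0123456789"
HEXDIG = frozenset(DIGIT + "abcdefABCDEF")  # assumption: both cases accepted
UNRESERVED = frozenset(ALPHA + DIGIT + "-._~")
SUB_DELIMS = frozenset("!$&'()*+,;=")


def is_reg_name(host: str) -> bool:
    """Return True if HOST matches the reg-name rule quoted above."""
    i = 0
    while i < len(host):
        c = host[i]
        if c == "%":
            # pct-encoded = "%" HEXDIG HEXDIG
            pair = host[i + 1:i + 3]
            if len(pair) != 2 or any(h not in HEXDIG for h in pair):
                return False
            i += 3
        elif c in UNRESERVED or c in SUB_DELIMS:
            i += 1
        else:
            return False
    # reg-name = *( ... ), so the empty string is also accepted.
    return True
```

With this sketch, `is_reg_name("www.gnu.org")` is true, while a host containing a non-ASCII letter such as `"å.se"` is rejected whatever locale is in effect.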