From: Ludovic Courtès
Subject: bug#35785: ‘string->uri’ is locale-dependent and breaks in ‘sv_SE’
Date: Tue, 04 Jun 2019 09:42:55 +0200
Message-ID: <87imtlhk3k.fsf@gnu.org>
In-Reply-To: <87ef4asq53.fsf@ngyro.com> (Timothy Sample's message of "Mon, 03 Jun 2019 10:24:40 -0400")
To: Timothy Sample
Cc: 35785@debbugs.gnu.org, Einar Largenius

Hello,

Timothy Sample writes:

>>> From 7b02be4c050c7b17a0e2685e8e453295f798c360 Mon Sep 17 00:00:00 2001
>>> From: Timothy Sample
>>> Date: Sun, 2 Jun 2019 14:41:20 -0400
>>> Subject: [PATCH] Make URI handling locale independent.
>>>
>>> Fixes .
>>>
>>> * module/web/uri.scm (digits, hex-digits, letters): New variables.
>>> (ipv4-regexp, ipv6-regexp, domain-label-regexp, top-label-regexp,
>>> userinfo-pat, host-pat, ipv6-host-pat, port-pat, scheme-pat): Explicitly
>>> list each character instead of using character ranges.
>>> * test-suite/tests/web-uri.test: Add corresponding tests.
>>
>> [...]
>>
>>> +    (pass-if "http://www.example.com (sv_SE)"
>>> +      (dynamic-wind
>>> +        (lambda () #t)
>>> +        (lambda ()
>>> +          (with-locale "sv_SE.utf8"
>>> +            (reload-module (resolve-module '(web uri)))
>>> +            (uri=? (string->uri "http://www.example.com")
>>> +                   #:scheme 'http #:host "www.example.com" #:path "")))
>>
>> Aren’t ‘reload-module’ calls a leftover that can now be removed (also in
>> the other test)?
>
> I needed to reload the modules like that to make the tests fail without
> the patch and pass with it.  My understanding is that the bug happens
> at regex compile time, which happens when the module is loaded.  If I
> don’t reload the module, the old URI code passes the tests, since the
> regexes were compiled with a locale that does not trigger the bug.  It’s
> a little wacky, sure, but it was the best idea I could come up with.

Oooh, I see.  Could you add a comment to explain this?  Then we’re done.

>> For the sv_SE test, what about taking a host name with a ‘w’, since
>> that’s the use case that allowed us to uncover this bug?
>
> I thought I was being clever by using a “www” hostname, but apparently
> it’s so normalized as to be invisible!  Feel free to change it to
> something more obvious like “w.com” or whatever.

Silly me, I guess I need new glasses.  :-)

Thanks!

Ludo’.
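
For readers following the thread, here is a minimal sketch of the approach the commit log describes: listing each character explicitly instead of writing a range such as "[a-z]". It is not the code from the patch. The contents of ‘digits’ and ‘letters’ and the names ‘label-rx’ and ‘label?’ are assumptions made for illustration; only the variable names ‘digits’ and ‘letters’ appear in the commit log.

```scheme
;; Sketch only: these definitions are assumed, not copied from the patch.
(define digits "0123456789")
(define letters
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

;; Every accepted character is spelled out, so the bracket expression
;; contains no range whose meaning could depend on the locale.
;; `label-rx' is a hypothetical name.  The regexp is compiled when this
;; definition is evaluated, that is, when the enclosing module is loaded.
(define label-rx
  (make-regexp (string-append "^[" letters digits "]+$")))

;; Return #t if STR consists only of the characters listed above.
(define (label? str)
  (and (regexp-exec label-rx str) #t))
```

Because the regexp is compiled when the definition is evaluated, a test that wants to exercise a particular locale has to re-evaluate it under that locale, which is presumably why the quoted test calls ‘reload-module’ inside ‘with-locale’.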