From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#26948: ‘write-file’ output should not be locale-dependent
Date: Mon, 29 May 2017 11:12:54 +0200
Message-ID: <87mv9wc9gp.fsf_-_@gnu.org>
In-Reply-To: <87o9ucu1t3.fsf@gmail.com> (Maxim Cournoyer's message of "Sun, 28 May 2017 14:00:59 -0700")
List-Id: Bug reports for GNU Guix
To: Maxim Cournoyer
Cc: 26948@debbugs.gnu.org

Maxim Cournoyer wrote:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> Strangely that file name has question marks instead of the non-ASCII
>> characters on my GuixSD system:
>>
>> $ ls -l /etc/ssl/certs/*Certi*mara*
>> lrwxrwxrwx 8 root root 162 Jan 1 1970 '/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem' -> '/gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem'
>
> Hmm. That is strange. It seems like you also have a locale problem, but
> that it is handled in a way that doesn't break nss-certs?

AFAICS the file is really called that way, with question marks:

--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (stat "/gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem")
$2 = #(64768 4719936 33060 8 0 0 0 2444 1496043280 1 1492867575 4096 8 regular 292 130744281 0 1492867575)
--8<---------------cut here---------------end--------------->8---

And:

--8<---------------cut here---------------start------------->8---
$ wget -O - https://mirror.hydra.gnu.org/guix/nar/gzip/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2 |gunzip -c | guix archive -x t
--2017-05-29 10:55:36-- https://mirror.hydra.gnu.org/guix/nar/gzip/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2
Ni solvigas mirror.hydra.gnu.org (mirror.hydra.gnu.org)... 131.159.14.26, 2001:4ca0:2001:10:225:90ff:fedb:c720
Konektado al mirror.hydra.gnu.org (mirror.hydra.gnu.org)|131.159.14.26|:443... konektita.
HTTP peto sendita, ni atendas respondon... 200 OK
Grando: 171969 (168K) [application/octet-stream]
Ni konservas al: 'STDOUT'

- 100%[==============================================>] 167.94K --.-KB/s en 0.08s

2017-05-29 10:55:37 (2.02 MB/s) - skribita al ĉefeligujo [171969/171969]

$ find t -name AC_Ra\*
t/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
$ locale
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER=fr_FR.utf8
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
--8<---------------cut here---------------end--------------->8---

But wait! “guix build nss-certs --check -K” fails, and the diff is:

--8<---------------cut here---------------start------------->8---
$ LANGUAGE= diff -ur /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2{,-check}
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: AC_Raíz_Certicámara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
diff -ur /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/ae8153b9.0 /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs/ae8153b9.0
--- /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/ae8153b9.0 1970-01-01 01:00:01.000000000 +0100
+++ /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs/ae8153b9.0 1970-01-01 01:00:01.000000000 +0100
@@ -3,10 +3,10 @@
 # distrust=
 # openssl-trust=codeSigning emailProtection serverAuth
 -----BEGIN CERTIFICATE-----
-MIIHyTCCBbGgAwIBAgIBATANBgkqhkiG9w0BAQUFADB9MQswCQYDVQQGEwJJTDEW
+MIIHhzCCBW+gAwIBAgIBLTANBgkqhkiG9w0BAQsFADB9MQswCQYDVQQGEwJJTDEW
 MBQGA1UEChMNU3RhcnRDb20gTHRkLjErMCkGA1UECxMiU2VjdXJlIERpZ2l0YWwg
 Q2VydGlmaWNhdGUgU2lnbmluZzEpMCcGA1UEAxMgU3RhcnRDb20gQ2VydGlmaWNh
-dGlvbiBBdXRob3JpdHkwHhcNMDYwOTE3MTk0NjM2WhcNMzYwOTE3MTk0NjM2WjB9
+dGlvbiBBdXRob3JpdHkwHhcNMDYwOTE3MTk0NjM3WhcNMzYwOTE3MTk0NjM2WjB9
 MQswCQYDVQQGEwJJTDEWMBQGA1UEChMNU3RhcnRDb20gTHRkLjErMCkGA1UECxMi

[...]

+O3NJo2pXh5Tl1njFmUNj403gdy3hZZlyaQQaRwnmDwFWJPsfvw55qVguucQJAX6V
+um0ABj6y6koQOdjQK/W/7HW/lwLFCRsI3FU34oH7N4RDYiDK51ZLZer+bMEkkySh
+NOsF/5oirpt9P/FlUQqmMGqz9IgcgA38corog14=
 -----END CERTIFICATE-----
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: Certinomis_-_Autorité_Racine:2.1.1.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: Certinomis_-_Autorit?_Racine:2.1.1.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: NetLock_Arany_=Class_Gold=_Főtanúsítvány:2.6.73.65.44.228.0.16.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: NetLock_Arany_=Class_Gold=_F?tan?s?tv?ny:2.6.73.65.44.228.0.16.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: T?B?TAK_UEKAE_K?k_Sertifika_Hizmet_Sa?lay?c?s?_-_S?r?m_3:2.1.17.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: T?RKTRUST_Elektronik_Sertifika_Hizmet_Sa?lay?c?s?_H5:2.7.0.142.23.254.36.32.129.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3:2.1.17.pem
Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: TÜRKTRUST_Elektronik_Sertifika_Hizmet_Sağlayıcısı_H5:2.7.0.142.23.254.36.32.129.pem
--8<---------------cut here---------------end--------------->8---

See? (The difference in the first certificate is weird too…)

There are two ways to create nars. One is via the ‘export-paths’ RPC
(implemented in the daemon in C++), which does not interpret file names
and thus leaves them untouched. The other one is via ‘write-file’ from
(guix serialization), which is written in Scheme and thus converts file
names from locale encoding (specifically, ‘scandir’ does that).

‘guix publish’ uses the latter, so ‘guix publish’ is sensitive to locale
settings, which is pretty bad.

Guile currently does not allow us to specify whether/how file names
should be decoded, but possible solutions have been discussed for 2.2.

In the meantime, solutions are:

  1. Run ‘guix publish’ in a UTF-8 locale, which apparently was not
     the case.

  2. Add to (guix build syscalls) a separate locale-independent
     ‘scandir’ implementation and use that.

Thoughts?

Ludo’.
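
As a loose illustration of the distinction drawn above between leaving file names untouched and converting them through the locale encoding, here is a minimal sketch. It is written in Python rather than Guile, and it is not the code of ‘write-file’, ‘scandir’, or the daemon. The helper names `lossy_encode` and `list_raw` are hypothetical, and using ASCII with ‘?’ substitution as the stand-in for a non-UTF-8 locale is an assumption chosen only because it reproduces the kind of question marks seen in the listings.

```python
import os
import tempfile
from typing import List


def lossy_encode(name: str, encoding: str = "ascii") -> bytes:
    """Encode an already-decoded file name in a given encoding.

    Assumption: characters the encoding cannot represent are replaced
    with '?', which is what errors="replace" does when encoding.
    """
    return name.encode(encoding, errors="replace")


def list_raw(directory: str) -> List[bytes]:
    """List directory entries as raw bytes, without decoding them.

    Passing a bytes path to os.listdir makes it return bytes names,
    so no locale-dependent conversion takes place.
    """
    return sorted(os.listdir(os.fsencode(directory)))


if __name__ == "__main__":
    name = "Certinomis_-_Autorité_Racine:2.1.1.pem"

    # Going through a non-UTF-8 encoding loses the accented character.
    print(lossy_encode(name))

    # Treating names as opaque bytes preserves them exactly.
    with tempfile.TemporaryDirectory() as tmp:
        raw = name.encode("utf-8")
        open(os.path.join(os.fsencode(tmp), raw), "wb").close()
        print(list_raw(tmp) == [raw])
```

The byte-oriented listing is closer in spirit to option 2: a directory scan that never decodes names cannot depend on the locale.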