From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 26948@debbugs.gnu.org
Subject: bug#26948: ‘write-file’ output should not be locale-dependent
Date: Mon, 29 May 2017 13:15:04 -0700 [thread overview]
Message-ID: <87h903s9mf.fsf@gmail.com> (raw)
In-Reply-To: <87mv9wc9gp.fsf_-_@gnu.org> ("Ludovic \=\?utf-8\?Q\?Court\=C3\=A8s\?\= \=\?utf-8\?Q\?\=22's\?\= message of "Mon, 29 May 2017 11:12:54 +0200")
ludo@gnu.org (Ludovic Courtès) writes:
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>> Strangely that file name has question marks instead of the non-ASCII
>>> characters on my GuixSD system:
>>>
>>> $ ls -l /etc/ssl/certs/*Certi*mara*
>>> lrwxrwxrwx 8 root root 162 Jan 1 1970
>>> '/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem'
>>> ->
>>> '/gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem'
>>
>> Hmm. That is strange. It seems like you also have a locale problem, but
>> that it is handled in a way that doesn't break nss-certs?
>
> AFAICS the file is really called that way, with question marks:
>
> scheme@(guile-user)> (stat "/gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem")
> $2 = #(64768 4719936 33060 8 0 0 0 2444 1496043280 1 1492867575 4096 8 regular 292 130744281 0 1492867575)
>
>
> And:
>
> $ wget -O - https://mirror.hydra.gnu.org/guix/nar/gzip/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2 |gunzip -c | guix archive -x t
> --2017-05-29 10:55:36-- https://mirror.hydra.gnu.org/guix/nar/gzip/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2
> Ni solvigas mirror.hydra.gnu.org (mirror.hydra.gnu.org)... 131.159.14.26, 2001:4ca0:2001:10:225:90ff:fedb:c720
> Konektado al mirror.hydra.gnu.org (mirror.hydra.gnu.org)|131.159.14.26|:443... konektita.
> HTTP peto sendita, ni atendas respondon... 200 OK
> Grando: 171969 (168K) [application/octet-stream]
> Ni konservas al: 'STDOUT'
>
> - 100%[==============================================>] 167.94K --.-KB/s en 0.08s
>
> 2017-05-29 10:55:37 (2.02 MB/s) - skribita al ĉefeligujo [171969/171969]
>
> $ find t -name AC_Ra\*
> t/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
> $ locale
> LANG=en_US.utf8
> LC_CTYPE="en_US.utf8"
> LC_NUMERIC="en_US.utf8"
> LC_TIME="en_US.utf8"
> LC_COLLATE="en_US.utf8"
> LC_MONETARY="en_US.utf8"
> LC_MESSAGES="en_US.utf8"
> LC_PAPER=fr_FR.utf8
> LC_NAME="en_US.utf8"
> LC_ADDRESS="en_US.utf8"
> LC_TELEPHONE="en_US.utf8"
> LC_MEASUREMENT="en_US.utf8"
> LC_IDENTIFICATION="en_US.utf8"
> LC_ALL=
>
--8<---------------cut here---------------start------------->8---
$ find /etc/ssl/certs -name AC_Ra\*
/etc/ssl/certs/AC_Raíz_Certicámara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
--8<---------------cut here---------------end--------------->8---
The file name appears normally here (in xterm). I'm not sure why it's
different on your side, since we are both using UTF-8 locales. It does
still look strange when seen from strace though, but I guess this is
peculiarity of strace:
open("/etc/ssl/certs/AC_Ra\303\255z_Certic\303\241mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.p", O_RDONLY) = -1 ENOENT (No such file or directory)
> But wait! “guix build nss-certs --check -K” fails, and the diff is:
>
> $ LANGUAGE= diff -ur /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2{,-check}
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: AC_Raíz_Certicámara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
> diff -ur /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/ae8153b9.0 /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs/ae8153b9.0
> --- /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/ae8153b9.0 1970-01-01 01:00:01.000000000 +0100
> +++ /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs/ae8153b9.0 1970-01-01 01:00:01.000000000 +0100
> @@ -3,10 +3,10 @@
> # distrust=
> # openssl-trust=codeSigning emailProtection serverAuth
> -----BEGIN CERTIFICATE-----
> -MIIHyTCCBbGgAwIBAgIBATANBgkqhkiG9w0BAQUFADB9MQswCQYDVQQGEwJJTDEW
> +MIIHhzCCBW+gAwIBAgIBLTANBgkqhkiG9w0BAQsFADB9MQswCQYDVQQGEwJJTDEW
Can this be explained by locale alone? That is troubling.
> MBQGA1UEChMNU3RhcnRDb20gTHRkLjErMCkGA1UECxMiU2VjdXJlIERpZ2l0YWwg
> Q2VydGlmaWNhdGUgU2lnbmluZzEpMCcGA1UEAxMgU3RhcnRDb20gQ2VydGlmaWNh
> -dGlvbiBBdXRob3JpdHkwHhcNMDYwOTE3MTk0NjM2WhcNMzYwOTE3MTk0NjM2WjB9
> +dGlvbiBBdXRob3JpdHkwHhcNMDYwOTE3MTk0NjM3WhcNMzYwOTE3MTk0NjM2WjB9
^ ???
> MQswCQYDVQQGEwJJTDEWMBQGA1UEChMNU3RhcnRDb20gTHRkLjErMCkGA1UECxMi
>
> [...]
>
> +O3NJo2pXh5Tl1njFmUNj403gdy3hZZlyaQQaRwnmDwFWJPsfvw55qVguucQJAX6V
> +um0ABj6y6koQOdjQK/W/7HW/lwLFCRsI3FU34oH7N4RDYiDK51ZLZer+bMEkkySh
> +NOsF/5oirpt9P/FlUQqmMGqz9IgcgA38corog14=
> -----END CERTIFICATE-----
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: Certinomis_-_Autorité_Racine:2.1.1.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: Certinomis_-_Autorit?_Racine:2.1.1.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: NetLock_Arany_=Class_Gold=_Főtanúsítvány:2.6.73.65.44.228.0.16.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: NetLock_Arany_=Class_Gold=_F?tan?s?tv?ny:2.6.73.65.44.228.0.16.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: T?B?TAK_UEKAE_K?k_Sertifika_Hizmet_Sa?lay?c?s?_-_S?r?m_3:2.1.17.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: T?RKTRUST_Elektronik_Sertifika_Hizmet_Sa?lay?c?s?_H5:2.7.0.142.23.254.36.32.129.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3:2.1.17.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: TÜRKTRUST_Elektronik_Sertifika_Hizmet_Sağlayıcısı_H5:2.7.0.142.23.254.36.32.129.pem
>
> See? (The difference in the first certificate is weird too…)
>
> There are two ways to create nars. One is via the ‘export-paths’ RPC
> (implemented in the daemon in C++), which does not interpret file names
> and thus leaves them untouched. The other one is via ‘write-file’ from
> (guix serialization), which is written in Scheme and thus converts file
> names from locale encoding (specifically, ‘scandir’ does that.)
>
> ‘guix publish’ uses the latter, so ‘guix publish’ is sensitive to locale
> settings, which is pretty bad.
>
> Guile currently does not allow us to specify whether/how file names
> should be decoded, but possible solutions have been discussed for 2.2.
>
> In the meantime, solutions are:
>
> 1. To run ‘guix publish’ in a UTF-8 locale, which apparently was not
> the case.
I'm surprised by that. Wouldn't a utf8 locale be the default?
>
> 2. Add to (guix build syscalls) a separate locale-independent
> ‘scandir’ implementation and use that.
If the general solution is to fix it in Guile, the workaround proposed
in 1. seems preferable.
Maxim
next prev parent reply other threads:[~2017-05-29 20:16 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-05-16 5:19 bug#26948: gnutls errors on multiple guix commands Maxim Cournoyer
2017-05-17 12:56 ` Ludovic Courtès
2017-05-25 7:26 ` Maxim Cournoyer
2017-05-26 8:56 ` Ludovic Courtès
2017-05-28 18:38 ` Mark H Weaver
2017-05-29 4:36 ` Maxim Cournoyer
2017-05-29 9:31 ` Ludovic Courtès
2017-05-29 21:26 ` Mark H Weaver
2017-05-30 11:25 ` Ludovic Courtès
2017-05-28 21:00 ` Maxim Cournoyer
2017-05-29 9:12 ` bug#26948: ‘write-file’ output should not be locale-dependent Ludovic Courtès
2017-05-29 20:15 ` Maxim Cournoyer [this message]
2017-05-30 11:57 ` Ludovic Courtès
2017-06-16 15:09 ` Ludovic Courtès
2017-07-27 12:55 ` Ludovic Courtès
2021-01-08 22:04 ` bug#26948: 'guix publish' file name decoding is locale-dependent Maxim Cournoyer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://guix.gnu.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87h903s9mf.fsf@gmail.com \
--to=maxim.cournoyer@gmail.com \
--cc=26948@debbugs.gnu.org \
--cc=ludo@gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/guix.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).