unofficial mirror of bug-guix@gnu.org 
From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: 26948@debbugs.gnu.org
Subject: bug#26948: ‘write-file’ output should not be locale-dependent
Date: Mon, 29 May 2017 13:15:04 -0700
Message-ID: <87h903s9mf.fsf@gmail.com>
In-Reply-To: <87mv9wc9gp.fsf_-_@gnu.org> ("Ludovic Courtès"'s message of "Mon, 29 May 2017 11:12:54 +0200")

ludo@gnu.org (Ludovic Courtès) writes:

> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>
> [...]
>
>>> Strangely that file name has question marks instead of the non-ASCII
>>> characters on my GuixSD system:
>>>
>>> $ ls -l /etc/ssl/certs/*Certi*mara*
>>> lrwxrwxrwx 8 root root 162 Jan 1 1970
>>> '/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem'
>>> ->
>>> '/gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem'
>>
>> Hmm. That is strange. It seems like you also have a locale problem, but
>> that it is handled in a way that doesn't break nss-certs?
>
> AFAICS the file is really called that way, with question marks:
>
> scheme@(guile-user)> (stat "/gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem")
> $2 = #(64768 4719936 33060 8 0 0 0 2444 1496043280 1 1492867575 4096 8 regular 292 130744281 0 1492867575)
>
>
> And:
>
> $ wget -O - https://mirror.hydra.gnu.org/guix/nar/gzip/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2 |gunzip -c | guix archive -x t
> --2017-05-29 10:55:36--  https://mirror.hydra.gnu.org/guix/nar/gzip/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2
> Ni solvigas mirror.hydra.gnu.org (mirror.hydra.gnu.org)... 131.159.14.26, 2001:4ca0:2001:10:225:90ff:fedb:c720
> Konektado al mirror.hydra.gnu.org (mirror.hydra.gnu.org)|131.159.14.26|:443... konektita.
> HTTP peto sendita, ni atendas respondon... 200 OK
> Grando: 171969 (168K) [application/octet-stream]
> Ni konservas al: 'STDOUT'
>
> -                            100%[==============================================>] 167.94K  --.-KB/s    en 0.08s   
>
> 2017-05-29 10:55:37 (2.02 MB/s) - skribita al ĉefeligujo [171969/171969]
>
> $ find t -name AC_Ra\*
> t/etc/ssl/certs/AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
> $ locale
> LANG=en_US.utf8
> LC_CTYPE="en_US.utf8"
> LC_NUMERIC="en_US.utf8"
> LC_TIME="en_US.utf8"
> LC_COLLATE="en_US.utf8"
> LC_MONETARY="en_US.utf8"
> LC_MESSAGES="en_US.utf8"
> LC_PAPER=fr_FR.utf8
> LC_NAME="en_US.utf8"
> LC_ADDRESS="en_US.utf8"
> LC_TELEPHONE="en_US.utf8"
> LC_MEASUREMENT="en_US.utf8"
> LC_IDENTIFICATION="en_US.utf8"
> LC_ALL=
>

--8<---------------cut here---------------start------------->8---
$ find /etc/ssl/certs -name AC_Ra\*
/etc/ssl/certs/AC_Raíz_Certicámara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
--8<---------------cut here---------------end--------------->8---

The file name appears normally here (in xterm). I'm not sure why it's
different on your side, since we are both using UTF-8 locales. It does
still look strange when seen through strace, but I guess that is a
peculiarity of strace, which prints non-ASCII bytes as octal escapes:

open("/etc/ssl/certs/AC_Ra\303\255z_Certic\303\241mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.p", O_RDONLY) = -1 ENOENT (No such file or directory)
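
For what it's worth, those octal escapes look like ordinary UTF-8. Here
is a minimal Python sketch (an illustration only, nothing from Guix or
strace itself; the variable names are made up) that decodes the escaped
bytes back into text:

```python
# Bytes as strace printed them, written with the same octal escapes.
escaped = b"AC_Ra\303\255z_Certic\303\241mara_S.A."

# \303\255 is 0xC3 0xAD, the UTF-8 encoding of U+00ED (í);
# \303\241 is 0xC3 0xA1, the UTF-8 encoding of U+00E1 (á).
decoded = escaped.decode("utf-8")
print(decoded)  # AC_Raíz_Certicámara_S.A.
```

So the name that strace shows is consistent with the one find prints in
a UTF-8 terminal.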

> But wait!  “guix build nss-certs --check -K” fails, and the diff is:
>
> $ LANGUAGE= diff -ur /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2{,-check}
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: AC_Raíz_Certicámara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: AC_Ra?z_Certic?mara_S.A.:2.15.7.126.82.147.123.224.21.227.87.240.105.140.203.236.12.pem
> diff -ur /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/ae8153b9.0 /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs/ae8153b9.0
> --- /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs/ae8153b9.0	1970-01-01 01:00:01.000000000 +0100
> +++ /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs/ae8153b9.0	1970-01-01 01:00:01.000000000 +0100
> @@ -3,10 +3,10 @@
>  # distrust=
>  # openssl-trust=codeSigning emailProtection serverAuth
>  -----BEGIN CERTIFICATE-----
> -MIIHyTCCBbGgAwIBAgIBATANBgkqhkiG9w0BAQUFADB9MQswCQYDVQQGEwJJTDEW
> +MIIHhzCCBW+gAwIBAgIBLTANBgkqhkiG9w0BAQsFADB9MQswCQYDVQQGEwJJTDEW

Can this be explained by locale alone? That is troubling.

>  MBQGA1UEChMNU3RhcnRDb20gTHRkLjErMCkGA1UECxMiU2VjdXJlIERpZ2l0YWwg
>  Q2VydGlmaWNhdGUgU2lnbmluZzEpMCcGA1UEAxMgU3RhcnRDb20gQ2VydGlmaWNh
> -dGlvbiBBdXRob3JpdHkwHhcNMDYwOTE3MTk0NjM2WhcNMzYwOTE3MTk0NjM2WjB9
> +dGlvbiBBdXRob3JpdHkwHhcNMDYwOTE3MTk0NjM3WhcNMzYwOTE3MTk0NjM2WjB9
                                          ^ ???

>  MQswCQYDVQQGEwJJTDEWMBQGA1UEChMNU3RhcnRDb20gTHRkLjErMCkGA1UECxMi
>
> [...]
>
> +O3NJo2pXh5Tl1njFmUNj403gdy3hZZlyaQQaRwnmDwFWJPsfvw55qVguucQJAX6V
> +um0ABj6y6koQOdjQK/W/7HW/lwLFCRsI3FU34oH7N4RDYiDK51ZLZer+bMEkkySh
> +NOsF/5oirpt9P/FlUQqmMGqz9IgcgA38corog14=
>  -----END CERTIFICATE-----
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: Certinomis_-_Autorité_Racine:2.1.1.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: Certinomis_-_Autorit?_Racine:2.1.1.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: NetLock_Arany_=Class_Gold=_Főtanúsítvány:2.6.73.65.44.228.0.16.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: NetLock_Arany_=Class_Gold=_F?tan?s?tv?ny:2.6.73.65.44.228.0.16.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: T?B?TAK_UEKAE_K?k_Sertifika_Hizmet_Sa?lay?c?s?_-_S?r?m_3:2.1.17.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2/etc/ssl/certs: T?RKTRUST_Elektronik_Sertifika_Hizmet_Sa?lay?c?s?_H5:2.7.0.142.23.254.36.32.129.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3:2.1.17.pem
> Only in /gnu/store/3ql0vilc0zv6ra42ghi04787vrg6bb71-nss-certs-3.30.2-check/etc/ssl/certs: TÜRKTRUST_Elektronik_Sertifika_Hizmet_Sağlayıcısı_H5:2.7.0.142.23.254.36.32.129.pem
>
> See?  (The difference in the first certificate is weird too…)
>
> There are two ways to create nars.  One is via the ‘export-paths’ RPC
> (implemented in the daemon in C++), which does not interpret file names
> and thus leaves them untouched.  The other one is via ‘write-file’ from
> (guix serialization), which is written in Scheme and thus converts file
> names from locale encoding (specifically, ‘scandir’ does that.)
>
> ‘guix publish’ uses the latter, so ‘guix publish’ is sensitive to locale
> settings, which is pretty bad.
>
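
The question marks in the diff above come one per accented character,
not one per byte. As a rough analogy in Python (I have not checked what
Guile's conversion does internally, so treat this as an assumption),
that is the pattern a lossy *encode* of already-decoded text produces;
`lossy_ascii` is a hypothetical helper:

```python
# Illustration only: two names taken from the diff above.
names = ["AC_Raíz_Certicámara_S.A.", "Főtanúsítvány"]

def lossy_ascii(name: str) -> str:
    """Encode text to ASCII, substituting '?' for each unencodable character."""
    return name.encode("ascii", errors="replace").decode("ascii")

results = [lossy_ascii(name) for name in names]
for result in results:
    print(result)

# Decoding raw UTF-8 bytes as ASCII instead substitutes once per *byte*,
# so a two-byte character such as í yields two replacement characters.
per_byte = "í".encode("utf-8").decode("ascii", errors="replace")
print(len(per_byte))  # 2
```

The first form gives AC_Ra?z_Certic?mara_S.A. and F?tan?s?tv?ny, which
matches the names in the substitute.
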
> Guile currently does not allow us to specify whether/how file names
> should be decoded, but possible solutions have been discussed for 2.2.
>
> In the meantime, solutions are:
>
>   1. To run ‘guix publish’ in a UTF-8 locale, which apparently was not
>      the case.

I'm surprised by that. Wouldn't a UTF-8 locale be the default?

>
>   2. Add to (guix build syscalls) a separate locale-independent
>      ‘scandir’ implementation and use that.

If the general solution is to fix this in Guile, the workaround proposed
in option 1 seems preferable in the meantime.
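
That said, option 2 amounts to listing directory entries without ever
decoding their names. A sketch of the idea in Python, as an analogy
only: `scandir_bytes` is a hypothetical name, not the procedure proposed
for (guix build syscalls).

```python
import os
import tempfile

def scandir_bytes(directory: bytes) -> list:
    """List entries as raw bytes, sorted, without decoding file names."""
    # Passing a bytes path makes os.listdir return bytes names untouched.
    return sorted(os.listdir(directory))

# Usage: create a file whose name contains UTF-8 bytes and list it back.
with tempfile.TemporaryDirectory() as tmp:
    tmp_bytes = os.fsencode(tmp)
    name = b"AC_Ra\xc3\xadz.pem"
    with open(os.path.join(tmp_bytes, name), "wb"):
        pass
    entries = scandir_bytes(tmp_bytes)
    print(entries)
```

Because the names stay as bytes from start to finish, the result should
not depend on the locale the process runs in.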

Maxim

