unofficial mirror of bug-guix@gnu.org 
From: Attila Lendvai <attila@lendvai.name>
To: Maxime Devos <maximedevos@telenet.be>
Cc: 54893@debbugs.gnu.org
Subject: bug#54893: guix-daemon, locale, LANG, and unicode in git tag names
Date: Wed, 13 Apr 2022 07:51:08 +0000
Message-ID: <4sSjKaCcadx8brYQC5HZuP-SyMku3BlXRTZwaCUH13qSv01N33lk9vyUWzzE6R889ZuQRpI_6Pl4Q_51v8jMhUhwh6f9rly5h0EhlUqHG80=@lendvai.name>
In-Reply-To: <be1fe91166c5a9e95975084d9e8dc7f80222cf4d.camel@telenet.be>

> * LANG should be set, because it is in #:leaked-env-vars (see
> guix/git-download.scm). I don't know whose LANG it is though
> -- the user's, or the daemon's?


if i add this to the gexp:

(simple-format (current-error-port)
               "LANG is '~A'~%"
               (getenv "LANG"))
(setenv "LANG" "en_US.utf8")
(setenv "GUIX_LOCPATH" "/run/current-system/locale")
(setlocale LC_ALL (getenv "LANG"))

i see:

LANG is ''
Backtrace:
           2 (primitive-load "/gnu/store/z4bis94jg0s0y0xj1xbmliv7xs8?")
In ice-9/eval.scm:
    619:8  1 (_ #f)
In unknown file:
           0 (setlocale 6 "en_US.utf8")

ERROR: In procedure setlocale:
In procedure setlocale: Invalid argument


> * GUIX_LOCPATH is not leaked.


it's the same if i add GUIX_LOCPATH to the #:leaked-env-vars and don't setenv it explicitly.


> * Even if it was, I don't think that /gnu/store/...glibc-locales
> would be accessible from the build container (though you could give
> it a try?).


i didn't check this specifically, but i'm afraid you are right, and this is why my kludge doesn't work.


> * So perhaps GUIX_LOCPATH needs to be set in the gexp in
> guix/git-download.scm, + some setlocale as done by
> gnu-build-system.


i don't understand why the setlocale call in gnu-build-system's install-locale works, but my setlocale kludge in git-download doesn't.

i even tried adding glibc-locale to the native-inputs of the package in question, but it didn't help.


> * Long-term, it could be interesting to remove the
> ‘file name = string encoded in current locale's encoding’
> assumption from Guile.


i'm not sure why the wrong locale breaks file-system walking and deleting, though.

i assume that if every function in guile uses/assumes the same locale (character encoding), then a round trip through the guile FFI, in either direction, should be lossless, no? and i think both ASCII and UTF-8 round-trip cleanly wrt C bytes <-> scheme string conversions. IOW, only the displaying of the chars should be broken, not file operations.

or am i wrong to assume this?

or maybe the character encoding algo used in guile's FFI silently emits actual question marks in place of bytes that are outside the valid range of the encoding used? if so, that's not a very defensive way of coding, and it's eating up hours of my life...
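
to make that hypothesis concrete, here is a minimal sketch in Python (not Guile, and not a claim about what guile's FFI actually does). it decodes a file name's bytes into a string and encodes it back, under a chosen encoding and error policy. the names `round_trip` and `TAG_NAME` are made up for the illustration. with a substituting policy the ASCII round trip is lossy, which is the kind of behaviour that would break file operations and not only display:

```python
def round_trip(raw: bytes, encoding: str, errors: str = "replace") -> bytes:
    """Decode bytes to a string and encode back, as a bytes <-> string FFI layer might."""
    text = raw.decode(encoding, errors=errors)
    return text.encode(encoding, errors=errors)


# Hypothetical tag/file name containing one non-ASCII character (2 bytes in UTF-8).
TAG_NAME = "v1.0-é".encode("utf-8")

if __name__ == "__main__":
    # UTF-8 on both sides: the original bytes come back unchanged.
    print(round_trip(TAG_NAME, "utf-8"))
    # ASCII with substitution: each invalid byte ends up as a literal '?'.
    print(round_trip(TAG_NAME, "ascii"))
    # ASCII with an escaping policy: lossless even though the encoding is "wrong".
    print(round_trip(TAG_NAME, "ascii", errors="surrogateescape"))
```

so whether the round trip is safe depends on the error policy as much as on the encoding: substitution loses the original bytes, while an escaping policy keeps them.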

hrm... this is not relevant here, only a related thought: things can go wrong in the gexp serialization, too, if the writing side and the reading side don't use the same character encoding. the locale should be set explicitly at the relevant entry points.

i'd appreciate it if someone could help me come up with at least a kludge, so that i can make progress until it's fixed properly.

thanks for your insights Maxime,

--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
If you never heal from what hurt you, you'll bleed on people who didn't cut you.
