From: Attila Lendvai
To: Maxime Devos
Cc: 54893@debbugs.gnu.org
Subject: bug#54893: guix-daemon, locale, LANG, and unicode in git tag names
Date: Wed, 13 Apr 2022 07:51:08 +0000
Message-ID: <4sSjKaCcadx8brYQC5HZuP-SyMku3BlXRTZwaCUH13qSv01N33lk9vyUWzzE6R889ZuQRpI_6Pl4Q_51v8jMhUhwh6f9rly5h0EhlUqHG80=@lendvai.name>
List-Id: Bug reports for GNU Guix

> * LANG should be set, because it is in #:leaked-env-vars (see
>   guix/git-download.scm). I don't know whose LANG it is though
>   -- the user's, or the daemon's?

if i add this to the gexp:

(simple-format (current-error-port) "LANG is '~A'~%" (getenv "LANG"))
(setenv "LANG" "en_US.utf8")
(setenv "GUIX_LOCPATH" "/run/current-system/locale")
(setlocale LC_ALL (getenv "LANG"))

i see:

LANG is ''
Backtrace:
           2 (primitive-load "/gnu/store/z4bis94jg0s0y0xj1xbmliv7xs8?")
In ice-9/eval.scm:
   619:8  1 (_ #f)
In unknown file:
           0 (setlocale 6 "en_US.utf8")

ERROR: In procedure setlocale:
In procedure setlocale: Invalid argument

> * GUIX_LOCPATH is not leaked.

it's the same if i add GUIX_LOCPATH to the #:leaked-env-vars and don't
setenv it explicitly.

> * Even if it was, I don't think that /gnu/store/...glibc-locales
>   would be accessible from the build container (though you could give
>   it a try?).

i didn't check this specifically, but i'm afraid you are right, and this is
why my kludge doesn't work.

> * So perhaps GUIX_LOCPATH needs to be set in the gexp in
>   guix/git-download.scm, + some setlocale as done by
>   gnu-build-system.

i don't understand why the setlocale call in gnu-build-system's
install-locale works, but my setlocale kludge in git-download doesn't.

i even tried to add glibc-locales as native-inputs to the package in
question, but it didn't help.

> * Long-term, it could be interesting to remove the
>   ‘file name = string encoded in current locale's encoding’
>   assumption from Guile.

i'm not sure why the wrong locale breaks file-system walking and deleting,
though.

i assume if every function in guile uses/assumes the same locale (character
encoding), then both directions through the guile FFI should be idempotent,
no?
and i think both ASCII and UTF-8 are idempotent wrt C bytes <-> scheme
string conversions. IOW, it's only the displaying of the chars that should
be broken, not file operations. or am i wrong to assume this?

or maybe the character encoding algo used in guile's FFI silently emits
actual question marks in place of bytes that are outside the valid range of
the encoding used? if so, that's not a very defensive way of coding, and
it's eating up hours of my life...

hrm... this is not relevant here, only a related thought: things can go
wrong in the GEXP serialization, too, if the writing side and the reading
side don't use the same character encoding. locale should be set
explicitly at the relevant entry points.

i'd appreciate it if someone could help me come up with at least a kludge,
so that i could make progress until it's fixed properly.

thanks for your insights Maxime,

-- 
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
If you never heal from what hurt you, you'll bleed on people who didn't cut
you.
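
A minimal sketch for checking the round-trip question raised above (whether bytes -> scheme string -> bytes preserves a file name), assuming Guile's (ice-9 iconv) module with its `bytevector->string` and `string->bytevector` procedures. The tag name bytes and the helper `round-trips?` are hypothetical and made up for illustration. This only exercises explicit conversions; whether Guile's file-system procedures use the same conversion strategy inside the build container is an assumption that this sketch does not verify.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 iconv))

;; UTF-8 bytes of a hypothetical tag name "v1.0-é" (é is #xC3 #xA9).
(define name-bytes
  (u8-list->bytevector '(#x76 #x31 #x2E #x30 #x2D #xC3 #xA9)))

;; Decode BV as ENCODING using STRATEGY, encode the resulting string
;; back with the same settings, and report whether the original bytes
;; survived the round trip unchanged.
(define (round-trips? bv encoding strategy)
  (let* ((str  (bytevector->string bv encoding strategy))
         (back (string->bytevector str encoding strategy)))
    (bytevector=? bv back)))

;; Same encoding on both sides: the bytes should come back intact.
(define utf8-ok? (round-trips? name-bytes "UTF-8" 'error))

;; ASCII with the 'substitute strategy: the two non-ASCII bytes cannot
;; be represented, so they are replaced and the original name is lost.
(define ascii-ok? (round-trips? name-bytes "ASCII" 'substitute))

(simple-format #t "UTF-8 round trip preserved bytes: ~A~%" utf8-ok?)
(simple-format #t "ASCII/substitute round trip preserved bytes: ~A~%" ascii-ok?)
```

If the second result is false, that would be consistent with the "silent question marks" guess: a name decoded with substitution no longer maps back to the bytes on disk, so later file operations on it would target a file that does not exist. Passing 'error instead of 'substitute in the ASCII case should raise an exception instead, which is the more defensive behaviour asked for above.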