From: Rob Browning
Newsgroups: gmane.lisp.guile.devel
Subject: RE: Improving the handling of system data (env, users, paths, ...)
Date: Sun, 07 Jul 2024 14:40:43 -0500
Message-ID: <87plrp0z44.fsf@trouble.defaultvalue.org>
References: <878qyeqn1q.fsf@trouble.defaultvalue.org>
 <20240707122425.kaQQ2C00E4hwdlW06aQRe0@michel.telenet-ops.be>
In-Reply-To: <20240707122425.kaQQ2C00E4hwdlW06aQRe0@michel.telenet-ops.be>
To: Maxime Devos, "guile-devel@gnu.org"
List-Id: "Developers list for Guile, the GNU extensibility library"

Maxime Devos writes:

> I’d rather not. It’s rather stateful and hence non-trivial to compose.
> Also, locale is not only about the encoding of text [file name/env
> encodings/xattr/...], but also about language. Also setting the
> language is excessive in this case.

The proposal would be that you'd only change the "CTYPE" to Latin-1;
it's strictly for the purpose of getting *bytes*, since Latin-1 will do
that with no possibility of crashing on unencodable data.

And of course there's no way of knowing what the *real* encoding is
without out-of-band information. That's true for getenv, and also true
for, say, every call to get a user or group name from the system. Each
user name *could* (but won't, outside generative testing, you'd hope)
have a different encoding.

> This, OTOH, seems a bit better – ‘with-locale’ is like ‘parameterize’
> and hence pretty composable. However, it still suffers from the
> problem that it sets too much (also, there is no such thing as the
> “iso-8859-1” locale?).

Oh, I was just writing pseudo-code, and right, you'd only want to change
the CTYPE for the current purposes, and that's what I'd expect whatever
we end up with to make easy/efficient/safe to do.

> IIRC, in ISO-8859-1 there is a direct correspondence between bytes and
> characters (and Guile recognises this), so there is no cost beyond
> mere copying.

While it may change, I believe the current plan is to switch Guile to
UTF-8 internally, which is why I've been including that in
considerations.

> Here is an alternative solution:

Right, there are a lot of options if we're in the market for a "broader"
solution, but my impression was that we aren't right now (see my other
followup message).

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
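
As an illustration of the Latin-1 point discussed in this message (every byte maps to exactly one character, so decoding cannot fail and the original bytes are always recoverable), here is a minimal sketch. It uses Python's "latin-1" codec as a stand-in for a Latin-1 CTYPE; it is not Guile code, and the helper names (bytes_to_text, text_to_bytes, decodes_as_utf8) are hypothetical, chosen only for this example.

```python
# Illustrative sketch only: Python's "latin-1" codec stands in for a
# Latin-1 LC_CTYPE. The helper names below are hypothetical.

def bytes_to_text(raw: bytes) -> str:
    """Decode bytes as Latin-1; byte N becomes code point N, so this never fails."""
    return raw.decode("latin-1")


def text_to_bytes(text: str) -> bytes:
    """Inverse of bytes_to_text for any string made only of code points 0..255."""
    return text.encode("latin-1")


def decodes_as_utf8(raw: bytes) -> bool:
    """Report whether raw happens to be valid UTF-8 (a stricter encoding)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Every possible byte value survives the round trip unchanged.
all_bytes = bytes(range(256))
round_tripped = text_to_bytes(bytes_to_text(all_bytes))

# A value such as an environment variable may hold bytes that are not
# valid UTF-8; Latin-1 still carries them through without error.
awkward = b"\xff\x00A"
awkward_text = bytes_to_text(awkward)

# If out-of-band information later says the data was really UTF-8,
# the real text can be recovered from the Latin-1 string.
utf8_name = "café".encode("utf-8")
recovered = text_to_bytes(bytes_to_text(utf8_name)).decode("utf-8")
```

The Latin-1 string here is only a carrier for the bytes: it says nothing about the real encoding, which still has to come from out-of-band information.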