Date: Fri, 25 Aug 2023 21:14:03 +0300
From: Saku Laesvuori
To: Kaelyn
Cc: Eidvilas
 Markevičius, Nathan Dehnel, guix-devel@gnu.org
Subject: Re: Relaxing the restrictions for store item names
Message-ID: <20230825181403.w3xzklltma73bbei@X-kone>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="rfcleef5jejjmil7"
Content-Disposition: inline
List-Id: "Development of GNU Guix and the GNU System distribution."

--rfcleef5jejjmil7
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

> > Although now, just a few hours later, I'm having second thoughts on
> > this. When you really think about it, it's very unlinkely that some
> > user would prefer typing something like
> >
> > guix install %D0%B8%D0%BC%D0%B0%D0%B3%D0%B8%D0%BD%D0%B0%D1%80%D0%B8-%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC
> >
> > over
> >
> > guix install имагинари-програм
>
> I imagine that, for usability, the percent encoding (or other encoding
> or transliteration) of non-ASCII characters could be handled
> transparently, i.e.
> for "guix install имагинари-програм", guix would
> translate "имагинари-програм" to the encoded form for operations. And
> if the escape character (e.g. the "%" in percent encoding) isn't also
> a valid character for store or package names then the values can be
> handled transparently. For example, both "guix install git" and "guix
> install %67%69%74" and "guix install g%69t" would all install git.
>
> > [...]
>
> > It would also make
> > store name unnecessarily long (they're already long as is), and
> > there's a 255 char limit for filenames that we have to keep in mind as
> > well. Searching the store using standard utilities such as find and
> > grep would too, as a consequence,
>
> I split out the quote above as a bit of reference. While I agree that
> we have to keep in mind the 255 char limit for filenames, with percent
> encoding causing a single byte in ASCII or UTF-8 to become ~3 bytes
> (with iirc most non-latin characters having multi-byte encodings in
> UTF-8) and the store hashes being a 33 byte prefix (counting the
> dash), 255 chars is still quite a bit. Specifically, the extracted
> quote above--without the "> " prefixes and with line breaks treated as
> single characters--is exactly 255 characters. (I find a bit of
> readable text to be helpful for wrapping my brain around a value like
> "255 characters".)
>
> > break... There's just too many problems with this.

The encoding could also be transparent in the other direction, so the
percent-encoded form would be usable on the command line (in addition to
the UTF-8 one, of course), but guix would translate it to UTF-8 for
operations. This would allow typing all package names with only ASCII
characters but still keep the store readable and grepable.
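
As a rough illustration of the two directions discussed above, here is a
minimal Python sketch. It is not Guix code: the function names to_utf8,
to_ascii and fits_in_filename are made up for this example, and the
numbers 255 and 33 are simply the figures quoted in this thread. It also
assumes, as noted above, that "%" is not itself a valid character in
package or store names.

```python
from urllib.parse import quote, unquote

# Figures taken from the discussion above; treat them as assumptions.
NAME_MAX = 255      # filename length limit mentioned in the thread
HASH_PREFIX = 33    # store hash plus the dash, as counted in the thread


def to_utf8(arg: str) -> str:
    """Decode any %XX escapes in a command-line argument to UTF-8 text."""
    # errors="strict" raises UnicodeDecodeError on bytes that are not valid UTF-8.
    return unquote(arg, encoding="utf-8", errors="strict")


def to_ascii(name: str) -> str:
    """Percent-encode a name so that it contains only ASCII characters."""
    # quote() leaves ASCII letters, digits and "_.-~" untouched and escapes the rest.
    return quote(name, safe="")


def fits_in_filename(name: str, percent_encoded: bool = False) -> bool:
    """Check the hash prefix plus the name against the 255-byte limit."""
    stored = to_ascii(name) if percent_encoded else name
    return HASH_PREFIX + len(stored.encode("utf-8")) <= NAME_MAX


if __name__ == "__main__":
    name = "имагинари-програм"
    print(to_utf8("g%69t"))                  # git
    print(to_utf8(to_ascii(name)) == name)   # True
    print(len(name.encode("utf-8")), len(to_ascii(name)))  # 33 97
```

With this approach the store would keep the UTF-8 name (33 bytes for the
example), and the 97-character percent-encoded form would only ever
appear on the command line.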
There are most likely simple utility programs that can decode percent
encoding, so the store is also grepable with only ASCII characters.

There is no reason (that I can see) not to allow UTF-8 in the store
paths, other than it being hard to type with a keyboard for a different
locale. But how often do people actually want to type store paths by
hand? I at least avoid it whenever possible by using $(guix build ...),
$(herd configuration ...), $(realpath /var/guix/profiles/...) etc. Even
when recovering a broken system the only store path you really need to
type is that of a working guix (and /var/guix/profiles/... probably also
works in a broken system).

> > even if they don't have the russian (or whatever other language)
> > keyboard layout set up on their system, so just for accessability
> > purposes, the solution wouldn't be all that great.

I agree. It is really annoying and hard to write percent encoding by
hand, so this doesn't really solve the issue of UTF-8 being hard to
write with an ASCII keyboard. Maybe some sort of fuzzy character
matching could be used in guix search instead of percent encoding. That
way people could find the packages even if they can't type the entire
name and then use the name from guix search (by copy-pasting or shell
piping) to install it (or do whatever operation they want to it).
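
To make the fuzzy matching idea a bit more concrete, here is a small
Python sketch of one way it could work. It is only an illustration and
not how guix search behaves today: TRANSLIT, ascii_key and fuzzy_search
are hypothetical names, and the transliteration table covers just the
letters of the example name used in this thread.

```python
import difflib
import unicodedata

# Tiny illustrative Cyrillic-to-Latin table (an assumption for this example);
# a real table would have to cover whole scripts.
TRANSLIT = {"и": "i", "м": "m", "а": "a", "г": "g",
            "н": "n", "р": "r", "п": "p", "о": "o"}


def ascii_key(name: str) -> str:
    """Fold a package name to a rough ASCII search key."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name.lower())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(TRANSLIT.get(c, c) for c in stripped)


def fuzzy_search(query: str, names: list[str], cutoff: float = 0.6) -> list[str]:
    """Return the names whose ASCII key is close to the query's key."""
    by_key: dict[str, list[str]] = {}
    for name in names:
        by_key.setdefault(ascii_key(name), []).append(name)
    hits = difflib.get_close_matches(ascii_key(query), list(by_key),
                                     n=5, cutoff=cutoff)
    return [name for key in hits for name in by_key[key]]


if __name__ == "__main__":
    packages = ["git", "имагинари-програм", "café-tools"]
    for found in fuzzy_search("imaginari", packages):
        print(found)
```

The output is the real UTF-8 package name, so it can be copy-pasted or
piped into the install command as described above.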

--rfcleef5jejjmil7
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEEoMkZR3NPB29fCOn/JX0oSiodOjIFAmTo72YACgkQJX0oSiod
OjJ7Fg/+J+aVY/qtlgc0VVBMH3jAM/WkOjzwg4u/uJv/ejx9QNA3aZXHhvOM9ZVs
8OIDVXKQFTZlDVobHZwRscGmPy5cq/abtTQKof+Irpd0xS+G6ON/c1Sv63kuJfdQ
sQ8eIil+/mmijxOkVs7NyQ6frudMB2sBSMNtjLXHUBP7UYAmwHp2wuzqc6b/Nchp
lNxWtUphJ9GvZSEty9x6+O+FLM7snHZ0XFTfHCf0WBtMaE6g3S513qIJtlYegHYp
0REcw9udVlK8OYcPIIvCtg8mUf4VFmm8n96JMRP9IGpE33MnBnJfI5GS6x7K3DGI
gLooNJfldXZHqs2yysFimMo4PW7TDa+OIQpgAfD1h433pizxsUbrdmXTEnrW7XHz
NIhmrYExJzPK+b55LeFET6uqwg+x6R8jaqSb+aBC5TglkoW8mgosPV64mgivBSrB
L9IxgWa/srLoodumT9OeHJ1kHAYrOvUWvZsse2KQ22vXnM2EdJ6KMYM7HQ7D6itu
fTNxPZOJcm1OhdHpNksoP0v6Bl2WVFRiKUumhFtjQy28J/4LH4Yf5WSnOvAmFRoV
bTmUXCb0XQi8yo9gNuHSZyH+EGAjp7dvci974KNGdCn+QlLZSIMrU8j26pSL8UiN
S9+h9CNg3J3NiGANwANzqKyaHVmYJUBQHtXBfgbcquJUh6Gi5bs=
=SuAr
-----END PGP SIGNATURE-----

--rfcleef5jejjmil7--