Date: Fri, 25 Aug 2023 16:32:41 +0000
To: Eidvilas Markevičius
From: Kaelyn
Cc: Nathan Dehnel, guix-devel@gnu.org
Subject: Re: Relaxing the restrictions for store item names

Hi,

A couple of small early-morning (for me) comments below... not for or against the idea of percent encoding, but as a little bit of food for thought while pondering how to handle Unicode in package names and/or store paths.

On Friday, August 25th, 2023 at 2:01 PM, Eidvilas Markevičius wrote:

> Although now, just a few hours later, I'm having second thoughts on
> this. When you really think about it, it's very unlikely that some
> user would prefer typing something like
>
> guix install %D0%B8%D0%BC%D0%B0%D0%B3%D0%B8%D0%BD%D0%B0%D1%80%D0%B8-%D0%BF%D1%80%D0%BE%D0%B3%D1%80%D0%B0%D0%BC
>
> over
>
> guix install имагинари-програм

I imagine that, for usability, the percent encoding (or other encoding or transliteration) of non-ASCII characters could be handled transparently, i.e. for "guix install имагинари-програм", guix would translate "имагинари-програм" to the encoded form for operations. And if the escape character (e.g. the "%" in percent encoding) isn't also a valid character for store or package names, then the values can be handled transparently. For example, "guix install git", "guix install %67%69%74", and "guix install g%69t" would all install git.

> even if they don't have the Russian (or whatever other language)
> keyboard layout set up on their system, so just for accessibility
> purposes, the solution wouldn't be all that great.

> It would also make
> store name unnecessarily long (they're already long as is), and
> there's a 255 char limit for filenames that we have to keep in mind as
> well. Searching the store using standard utilities such as find and
> grep would too, as a consequence,

I split out the quote above as a bit of reference. While I agree that we have to keep in mind the 255 char limit for filenames, with percent encoding causing a single byte in ASCII or UTF-8 to become ~3 bytes (with, iirc, most non-Latin characters having multi-byte encodings in UTF-8) and the store hashes being a 33 byte prefix (counting the dash), 255 chars is still quite a bit.
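
To make the two points above a bit more concrete, here is a minimal Python sketch. It is not Guix code (Guix is written in Guile Scheme and has no such feature today); the helper names `normalize` and `store_name_length` are hypothetical, and it assumes "%" is not otherwise a valid character in package or store names. It only uses `quote` and `unquote` from Python's standard `urllib.parse` module.

```python
from urllib.parse import quote, unquote

NAME_MAX = 255        # filename limit mentioned in the thread
HASH_PREFIX_LEN = 33  # store hash plus the dash, as counted in this message


def normalize(name: str) -> str:
    """Hypothetical helper: map any spelling of a name to one encoded form.

    Percent escapes are decoded first, then every character outside the
    unreserved ASCII set (letters, digits, "-", ".", "_", "~") is
    percent-encoded from its UTF-8 bytes.
    """
    return quote(unquote(name), safe="")


def store_name_length(name: str) -> int:
    """Hypothetical helper: length of hash prefix plus the encoded name."""
    return HASH_PREFIX_LEN + len(normalize(name))


if __name__ == "__main__":
    # All three spellings collapse to the same plain name.
    for spelling in ("git", "%67%69%74", "g%69t"):
        print(spelling, "->", normalize(spelling))  # each prints "... -> git"

    cyrillic = "имагинари-програм"
    encoded = normalize(cyrillic)
    # 16 Cyrillic letters at 2 UTF-8 bytes each become 6 characters each,
    # plus the unencoded dash: 16 * 6 + 1 = 97.
    print(len(encoded))                 # 97
    print(store_name_length(cyrillic))  # 130, well under NAME_MAX
```

Decoding before re-encoding is what makes the handling transparent: the user can type either form, and only the encoded one would ever reach the store.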
Specifically, the extracted quote above--without the "> " prefixes and with line breaks treated as single characters--is exactly 255 characters. (I find a bit of readable text to be helpful for wrapping my brain around a value like "255 characters".)

Cheers,
Kaelyn

> break... There's just too many
> problems with this.
>
> I believe what Julien proposed is the most reasonable solution:
> unrestrict Unicode characters in the store and (maybe) make it a
> project policy to not put Unicode characters inside package names
> (however, personally I wouldn't be against that either).
>
> Now ensuring that URIs don't break, especially for substitute
> provision, should also be taken into consideration, but this can be
> handled separately.
>
> On Fri, Aug 25, 2023 at 12:14 PM Eidvilas Markevičius
> markeviciuseidvilas@gmail.com wrote:
>
> > On Fri, Aug 25, 2023 at 11:37 AM Nathan Dehnel ncdehnel@gmail.com wrote:
> >
> > > What you could do is implement percent encoding:
> > > https://en.wikipedia.org/wiki/Percent-encoding
> > > -Allows you to store package titles in any language in an encoded form
> > > -Allows the titles to be typed on Latin keyboards
> > > -Allows the packages to be accessed through URIs in the future without
> > > causing problems
> >
> > Now that's an idea. I didn't really think of that. Although it'd
> > probably be trickier to implement in order to make all the tooling
> > compatible. I think that might be a good solution nonetheless.