From: Mark H Weaver
Subject: bug#30820: Chunked store references in compiled code break grafting (again)
Date: Wed, 21 Mar 2018 00:17:53 -0400
Message-ID: <87woy62ham.fsf@netris.org>
References: <87o9jq7j7r.fsf@gnu.org> <87muz3dgy1.fsf@netris.org> <87370vg0e9.fsf@gnu.org> <87in9rbmay.fsf@netris.org> <87in9rw2en.fsf@gnu.org>
In-Reply-To: <87in9rw2en.fsf@gnu.org> (Ludovic Courtès's message of "Tue, 20 Mar 2018 09:56:48 +0100")
To: Ludovic Courtès
Cc: 30820@debbugs.gnu.org

Hi Ludovic,

ludo@gnu.org (Ludovic Courtès) writes:

> Mark H Weaver skribis:
>
>> We would also need to find a solution to the problem described in the
>> thread "broken references in jar manifests" on guix-devel started by
>> Ricardo, which still has not found a satisfactory solution.
>>
>> https://lists.gnu.org/archive/html/guix-devel/2018-03/msg00006.html

Okay, do you have a proposed fix for the issue of jar manifests?
There's a specification for that file format which mandates that "No
line may be longer than 72 bytes (not characters), in its UTF8-encoded
form. If a value would make the initial line longer than this, it
should be continued on extra lines (each starting with a single
SPACE)."

>> My opinion is that I consider Guix's current expectations for how
>> software must store its data on disk to be far too onerous, in cases
>> where that data might include a store reference. I don't see sufficient
>> justification for imposing such an onerous requirement on the software
>> in Guix.
>
> In practice Guix and Nix have been living fine under these constraints,
> and with almost no modifications to upstream software, so it’s not that
> bad. Nix doesn’t have grafts though, which is why this problem was less
> visible there.
>
>> Ultimately, I would prefer to see the scanning and grafting operations
>> completely generalized, so that in general each package can specify how
>> to scan and graft that particular package, making use of libraries in
>> (guix build ...) to cover the usual cases. In most cases, that code
>> would be within build-systems.
>
> That would be precise GC instead of conservative GC in a way, right?
> So in essence we’d have, say, a scanner for ELF files (like ‘dh_shdep’
> in Debian or whatever it’s called), a scanner for jars, and so on?

No, I wasn't thinking along those lines. While I'd very much prefer
precise GC, it seems wholly infeasible for us to write precise scanners
and grafters for every file format of every package in Guix.

My thought was that supporting scanning and grafting of
8-byte-or-longer substrings of hashes would cover both GCC's inlined
strings and jar manifests, the two issues that we currently know about,
and that it would be nice if we could add further methods in the
future. For example, some software might store its data in UTF-16, or
compressed.
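
A minimal Python sketch of the chunk idea, for illustration only. It is
not code from Guix, and the names fold_manifest_line, find_hash_chunks
and graft_chunks are hypothetical. It assumes the old and new hashes
have the same length, and it simplifies the manifest rule by counting
the 72 bytes without the line terminator. The first function shows how
folding can split a store hash across lines; the other two find and
rewrite every fragment of the hash that is at least 8 bytes long.

```python
def fold_manifest_line(line: bytes, limit: int = 72) -> bytes:
    """Fold one logical manifest line so no physical line exceeds `limit`.

    Simplification (assumption): the limit excludes the line terminator,
    and "\n" is used as the terminator.
    """
    parts = [line[:limit]]
    rest = line[limit:]
    while rest:
        # Continuation lines start with a single space, so they carry
        # at most limit - 1 bytes of payload.
        parts.append(b" " + rest[:limit - 1])
        rest = rest[limit - 1:]
    return b"\n".join(parts) + b"\n"


def find_hash_chunks(data: bytes, hash_: bytes, min_len: int = 8):
    """Find maximal runs in `data` that equal a substring of `hash_`.

    Returns a sorted list of (data_offset, hash_offset, length) tuples,
    keeping only runs of at least `min_len` bytes.
    """
    found = set()
    for i in range(len(hash_) - min_len + 1):
        window = hash_[i:i + min_len]
        pos = data.find(window)
        while pos != -1:
            # Extend the match to the left and right as far as it
            # keeps agreeing with the hash.
            start, h_start = pos, i
            while start > 0 and h_start > 0 and data[start - 1] == hash_[h_start - 1]:
                start -= 1
                h_start -= 1
            end, h_end = pos + min_len, i + min_len
            while end < len(data) and h_end < len(hash_) and data[end] == hash_[h_end]:
                end += 1
                h_end += 1
            found.add((start, h_start, end - start))
            pos = data.find(window, pos + 1)
    return sorted(found)


def graft_chunks(data: bytes, old: bytes, new: bytes, min_len: int = 8) -> bytes:
    """Replace each fragment of `old` found in `data` with the matching
    fragment of `new`, leaving all other bytes and the length unchanged."""
    if len(old) != len(new):
        raise ValueError("old and new hashes must have the same length")
    out = bytearray(data)
    for offset, hash_offset, length in find_hash_chunks(data, old, min_len):
        out[offset:offset + length] = new[hash_offset:hash_offset + length]
    return bytes(out)
```

Note that a fragment shorter than 8 bytes is left alone by this sketch,
so a hash split into a very short piece and a long piece would be only
partly rewritten.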
> Still, how would we deal with strings embedded in the middle of
> binaries, as in this case? It seems to remain an open issue, no?

I believe that I addressed that case in my original proposal, no?

> I’m interested in experiments in that direction. I think that’s a
> longer-term goal, though, and there are open questions: we have no idea
> how well that would work in practice.

Thanks for discussing it. I'm willing to drop it and go with your
decision for now, but the "jar manifest" issue still needs a solution.

Regards,
Mark