Subject: [bug#34223] Fixing timestamps in archives.
References: <87imxjfjjt.fsf@gnu.org>
From: Tim Gesthuizen
In-reply-to: <87imxjfjjt.fsf@gnu.org>
Date: Mon, 18 Feb 2019 21:07:03 +0100
Message-ID: <87mumshndk.fsf@yahoo.de>
To: Ludovic Courtès
Cc: 34223@debbugs.gnu.org

Hi Ludo,

> Sorry for the delay!

No problem! I have very little time anyway.

> Nice work!  It’s great that libarchive doesn’t need to actually extract
> the zip file to operate on it.
>
> Overall I think the approach of factorizing archive-timestamp-resetting
> in one place and using it everywhere (‘ant-build-system’ and all) is the
> right thing to do.
>
> However, I’m not sure whether we should introduce a new program for this
> purpose.
> I believe ‘strip-nondeterminism’¹ (in Perl) by fellow
> Reproducible Builds hackers also addresses this problem, so it may be
> wiser to use it.

I also think so. If there is already another program that does the job,
we should probably use it.

> But really, since (guix build utils) already implements a significant
> subset of ‘strip-nondeterminism’, it would be even better if could avoid
> to shell out to a C or Perl program.
>
> I played a bit with this idea and, as an example, the attached file
> allows you to traverse the list of entries in a zip file (it uses
> ‘guile-bytestructures’).  Specifically, you can get the list of file
> names in a zip file by running:
>
>   (call-with-input-file "something.zip"
>     (lambda (port)
>       (fold-entries cons '() port)))
>
> Resetting timestamps should be just as simple.
>
> How about taking this route?

I also thought about taking this route. There are some problems with it,
though:

- As Julien pointed out, the archive contents need to be uncompressed.
  This makes the problem much more complex and keeps us from writing a
  partial ZIP parser that replaces the timestamps in place.
- While it would be quite elegant to just implement the parser in Scheme,
  it would be redundant. After all, we are developing a package manager,
  so we should use it. This approach would be more attractive if there
  were a Guile library for this. If we go with Scheme, the best solution
  would be to create a proper library for handling archives.
- Maintaining a ZIP parser in Guix is a burden we should not take on.
- We would need to care about a lot of details (ZIP64, probably more
  exotic extensions).

I would be fine with writing our own parser in Scheme, but I would like to
point out that everywhere else in Guix we use external tools for handling
archives (AFAIK).

I am not quite sure which version would be the best, so I am open to
other opinions on this.
Maybe you could restate your position, taking the compression problem
into consideration.

Tim.
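
For illustration only, here is a minimal sketch of what resetting
timestamps with an existing archive library (rather than a hand-written
parser) could look like. It uses Python's standard zipfile module purely
as an example and is not part of the patch; the function name
reset_zip_timestamps and the fixed date of 1980-01-01 are assumptions
made for this sketch. It rewrites the archive instead of patching it in
place, and it copies only each entry's name, compression method and
attributes, so comments and extra fields are dropped.

```python
import io
import zipfile

# Assumed fixed timestamp: the earliest date the ZIP format can represent.
FIXED_STAMP = (1980, 1, 1, 0, 0, 0)


def reset_zip_timestamps(data: bytes, stamp=FIXED_STAMP) -> bytes:
    """Return a copy of the ZIP archive in `data` with every entry's
    modification time set to `stamp` (hypothetical helper)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for info in src.infolist():
            # The library decompresses each entry for us.
            payload = src.read(info)
            # Build a fresh header that carries only the fixed timestamp.
            clean = zipfile.ZipInfo(info.filename, date_time=stamp)
            clean.compress_type = info.compress_type
            clean.create_system = info.create_system
            clean.external_attr = info.external_attr
            # Recompress with the same method as the original entry.
            dst.writestr(clean, payload)
    return out.getvalue()
```

Since every entry gets the same date, two archives that differ only in
their timestamps should come out identical after this pass.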