From: Julien Lepiller
Date: Wed, 23 Dec 2020 10:40:00 -0500
Subject: Re: Identical files across subsequent package revisions
To: guix-devel@gnu.org, zimoun, Miguel Ángel Arruga Vivas, Ludovic Courtès
Message-ID: <077ECD6C-AB0D-4FEA-ABBA-82550834265E@lepiller.eu>
In-Reply-To: <86wnx8r4ys.fsf@gmail.com>
References: <87wnx9wlea.fsf@gnu.org> <878s9oy8f7.fsf@gmail.com> <86wnx8r4ys.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."

On 23 December 2020 09:07:23 GMT-05:00, zimoun wrote:
> Hi,
>
> On Wed, 23 Dec 2020 at 14:10, Miguel Ángel Arruga Vivas wrote:
>
>> Probably you're already aware of it, but I want to mention that
>> Tridgell's thesis[1] contains a very neat approach to this problem.
>
> This thesis is a must-read! :-)
>
>> A naive prototype would be copying the latest available nar of the
>> package on the client side and using it as the destination for a copy
>> using rsync. Either the protocol used by the rsync application, or a
>> protocol based on those ideas, could be implemented over the HTTP
>> layer; client and server implementation and cooperation would be
>> needed though.
>
> I may be misunderstanding or missing something, but one part of the
> problem is how to detect “latest”; in other words, how to know it is
> different. From memory, and I have drunk a couple of beers since I
> read the thesis :-), the ‘rsync’ approach uses timestamp and size.
> And if you switch to checksums instead, performance is poor because
> of I/O. Well, it depends on the number of files and their size,
> whether the checksums are computed ahead of time, etc.
>
>> Another idea that might fit well into that kind of protocol---with
>> harder impact on the design, and probably with a high cost on the
>> runtime---would be the "upgrade" of the deduplication process towards
>> a content-based file system as git does[2]. This way a description of
>> the nar contents (size, hash) could trigger the retrieval only of the
>> needed files not found in the current store.
>
> Is it not related to the Content-Addressed Store, i.e., the
> «intensional model»?
>
> Chap. 6:
> Nix RFC:

I think this is different, because we're talking about sub-element
content-addressing. The intensional model is about content-addressing
whole store elements. I think the idea would be to save individual
files in, say, /gnu/store/.links, and let nar or narinfo files describe
the files to retrieve. If we are missing some, we'd download them, then
create hardlinks. This could even help our deduplication, I think :)

> Cheers,
> simon
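
To make the per-file idea above a bit more concrete, here is a rough
sketch in Python. It is only an illustration under several assumptions:
the manifest (a mapping from relative path to SHA-256 hex digest) is a
hypothetical stand-in for whatever a nar or narinfo file might list,
`fetch` is a hypothetical download callback, and the links directory is
named by plain content hash, which is not meant to mirror how the daemon
actually names its entries.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def file_hash(data: bytes) -> str:
    """Content address of a single file (assumption: plain SHA-256 hex)."""
    return hashlib.sha256(data).hexdigest()


def install_item(manifest, links_dir, item_dir, fetch):
    """Materialize one store item from a per-file manifest.

    manifest  -- dict: relative path -> content hash (hypothetical format)
    links_dir -- directory holding one file per content hash
    item_dir  -- directory of the store item to create
    fetch     -- hypothetical callback: content hash -> bytes
    Returns the list of hashes that had to be downloaded.
    """
    links_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for rel_path, digest in manifest.items():
        blob = links_dir / digest
        if not blob.exists():
            # Only files we do not already have are fetched.
            data = fetch(digest)
            if file_hash(data) != digest:
                raise ValueError(f"hash mismatch for {rel_path}")
            blob.write_bytes(data)
            downloaded.append(digest)
        target = item_dir / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        # Hard link, so identical files are stored only once.
        os.link(blob, target)
    return downloaded


if __name__ == "__main__":
    # Two revisions of a made-up package that share one file.
    files = [b"binary v1", b"binary v2", b"documentation"]
    server = {file_hash(f): f for f in files}
    rev1 = {"bin/hello": file_hash(files[0]), "share/doc": file_hash(files[2])}
    rev2 = {"bin/hello": file_hash(files[1]), "share/doc": file_hash(files[2])}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        links = root / ".links"
        first = install_item(rev1, links, root / "hello-1", server.__getitem__)
        second = install_item(rev2, links, root / "hello-2", server.__getitem__)
        print(len(first), "files fetched for revision 1")
        print(len(second), "files fetched for revision 2")
```

In this toy setup the second revision only needs the file that changed;
the shared file is hard-linked from the links directory. How the
per-file hashes would be published alongside substitutes is left open
here.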