Subject: Re: Identical files across subsequent package revisions
From: Taylan Kammer
To: Ludovic Courtès, Guix Devel
Date: Wed, 23 Dec 2020 10:08:27 +0100
Message-ID: <9851ca1f-ae56-66f3-912c-55db3f053c80@gmail.com>
In-Reply-To: <87wnx9wlea.fsf@gnu.org>
References: <87wnx9wlea.fsf@gnu.org>

On 22.12.2020 23:01, Ludovic Courtès wrote:
>
> Thoughts? :-)
>

My first thought: Neat, would love to see this implemented! :D

My second thought: it's surprising that IceCat supposedly changes so much between releases. I suppose the reason is that this analysis is done on a per-file basis, and IceCat is mostly one massive binary.

That leads me to wonder: what about binary diffs for large files? Perhaps for all files that are bigger than N bytes, we could check whether the binary diff is at most X% of the full file's size, and if so, the build servers could host the diff file alongside the full file. (Substitute sensible values for N and X, like maybe 5 MB and 50%.)

But that could be a second, separate step I suppose.

- Taylan
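
P.S. To make the size check concrete, here is a minimal sketch in Python. Everything named in it is hypothetical: the function `should_host_diff`, its parameter names, and the defaults of 5 MB and 50% (which are just the guesses from above). It also assumes the diff size has already been measured by whatever binary diff tool would be used. It only illustrates the idea and is not existing Guix code.

```python
# Hypothetical thresholds; "N" and "X" from the message, with guessed values.
MIN_SIZE_BYTES = 5 * 1000 * 1000  # N: only consider files bigger than 5 MB
MAX_DIFF_RATIO = 0.50             # X: diff may be at most 50% of the full file


def should_host_diff(full_size, diff_size,
                     min_size=MIN_SIZE_BYTES, max_ratio=MAX_DIFF_RATIO):
    """Decide whether a build server should publish a binary diff
    alongside the full file.

    full_size -- size in bytes of the new version of the file
    diff_size -- size in bytes of the binary diff against the old version
    """
    # Small files are not worth the bookkeeping: always serve them whole.
    if full_size <= min_size:
        return False
    # Host the diff only if it is at most max_ratio of the full file's size.
    return diff_size <= max_ratio * full_size


if __name__ == "__main__":
    # A 60 MB binary whose diff is 12 MB: worth hosting the diff.
    print(should_host_diff(60_000_000, 12_000_000))
    # Same binary, but the diff is 45 MB: just serve the full file.
    print(should_host_diff(60_000_000, 45_000_000))
```

Keeping the decision separate from the diffing itself means the thresholds can be tuned without caring which diff format ends up being used.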