Date: Mon, 14 Dec 2020 17:29:39 -0500
From: Julien Lepiller <julien@lepiller.eu>
To: guix-devel@gnu.org, Ludovic Courtès
Subject: Re: When substitute download + decompression is CPU-bound
In-Reply-To: <87im94qbby.fsf@gnu.org>
User-Agent: K-9 Mail for Android
List-Id: "Development of GNU Guix and the GNU System distribution."
My proposed changes to allow for parallel download assume downloads are
network-bound, so they can be separated from other jobs. If downloads are
actually CPU-bound, then it indeed has no merit at all :)

On 14 December 2020 at 17:20:17 GMT-05:00, "Ludovic Courtès" <ludo@gnu.org> wrote:

>Hi Guix!
>
>Consider these two files:
>
>https://ci.guix.gnu.org/nar/gzip/kfcrrl6p6f6v51jg5rirmq3q067zxih6-ungoogled-chromium-87.0.4280.88-0.b78cb92
>https://ci.guix.gnu.org/nar/lzip/kfcrrl6p6f6v51jg5rirmq3q067zxih6-ungoogled-chromium-87.0.4280.88-0.b78cb92
>
>Quick decompression bench:
>
>--8<---------------cut here---------------start------------->8---
>$ du -h /tmp/uc.nar.[gl]z
>103M    /tmp/uc.nar.gz
>71M     /tmp/uc.nar.lz
>$ gunzip -c < /tmp/uc.nar.gz | wc -c
>350491552
>$ time lzip -d < /tmp/uc.nar.lz > /dev/null
>
>real    0m6.040s
>user    0m5.950s
>sys     0m0.036s
>$ time gunzip -c < /tmp/uc.nar.gz > /dev/null
>
>real    0m2.009s
>user    0m1.977s
>sys     0m0.032s
>--8<---------------cut here---------------end--------------->8---
>
>The decompression throughput (uncompressed bytes written in the first
>column, compressed bytes read in the second column) is:
>
>          decompressed out | compressed in
>  gzip:   167 MiB/s        | 52 MB/s
>  lzip:    56 MiB/s        | 11 MB/s
>
>Indeed, if you run this from a computer on your LAN:
>
>  wget -O - … | gunzip > /dev/null
>
>you’ll find that wget caps at 50 MB/s with gunzip, whereas with lunzip
>it caps at 11 MB/s.
>
>From my place I get a peak download bandwidth of 30+ MB/s from
>ci.guix.gnu.org, thus substitute downloads are CPU-bound (I can’t go
>beyond 11 MB/s due to decompression). I must say it never occurred to
>me it could be the case when we introduced lzip substitutes.
>
>I’d get faster substitute downloads with gzip (I would download more
>but the time-to-disk would be smaller.) Specifically, download +
>decompression of ungoogled-chromium from the LAN completes in 2.4s for
>gzip vs. 7.1s for lzip. On a low-end ARMv7 device, also on the LAN, I
>get 32s (gzip) vs. 53s (lzip).
>
>Where to go from here?
>Several options:
>
>  0. Lzip decompression speed increases with compression ratio, but
>     we’re already using ‘--best’ on ci. The only way we could gain is
>     by using “multi-member archives” and then parallel decompression
>     as done in plzip, but that’s probably not supported in lzlib. So
>     we’re probably stuck here.
>
>  1. Since ci.guix.gnu.org still provides both gzip and lzip archives,
>     ‘guix substitute’ could automatically pick one or the other
>     depending on the CPU and bandwidth. Perhaps a simple trick would
>     be to check the user/wall-clock time ratio and switch to gzip for
>     subsequent downloads if that ratio is close to one. How well
>     would that work?
>
>  2. Use Zstd like all the cool kids since it seems to have a much
>     higher decompression speed: <https://facebook.github.io/zstd/>.
>     630 MB/s on ungoogled-chromium on my laptop. Woow.
>
>  3. Allow for parallel downloads (really: parallel decompression) as
>     Julien did in <https://issues.guix.gnu.org/39728>.
>
>My preference would be #2, #1, and #3, in this order. #2 is great but
>it’s quite a bit of work, whereas #1 could be deployed quickly. I’m
>not fond of #3 because it just papers over the underlying issue and
>could be counterproductive if the number of jobs is wrong.
>
>Thoughts?
>
>Ludo’.
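For reference, the throughput figures in the quoted bench can be rederived from the sizes and wall-clock times shown there. This sketch treats du's 103M and 71M as MiB, so the results differ from the quoted table by a MB/s or two of rounding:

```shell
# Rederive the decompression throughput from the bench numbers:
# 350491552 uncompressed bytes; 103 MiB (gzip) / 71 MiB (lzip)
# compressed; wall-clock times 2.009 s and 6.040 s.
awk 'BEGIN {
  out = 350491552                              # uncompressed nar size in bytes
  printf "gzip: %.0f MiB/s out | %.0f MB/s in\n", out/2.009/2^20, 103*2^20/2.009/1e6
  printf "lzip: %.0f MiB/s out | %.0f MB/s in\n", out/6.040/2^20, 71*2^20/6.040/1e6
}'
# prints: gzip: 166 MiB/s out | 54 MB/s in
#         lzip: 55 MiB/s out | 12 MB/s in
```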
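The user/wall-clock heuristic from option 1 above can be sketched in a few lines of shell. The 0.9 threshold and the function name are illustrative assumptions, not anything ‘guix substitute’ implements:

```shell
# Sketch of option 1: after a download, compare user CPU time to
# wall-clock time; a ratio near 1 means decompression was CPU-bound,
# so switch to cheaper-to-decompress gzip for the next substitute.
# Threshold and function name are hypothetical.
choose_compression() {
  # $1 = user seconds, $2 = wall-clock seconds of the previous download
  if awk -v u="$1" -v w="$2" 'BEGIN { exit !(u / w > 0.9) }'; then
    echo gzip    # CPU-bound: favor fast decompression
  else
    echo lzip    # network-bound: favor the smaller archive
  fi
}

choose_compression 5.95 6.04   # ratio ~0.99 -> gzip
choose_compression 1.20 6.04   # ratio ~0.20 -> lzip
```

The two example calls reuse the lzip timings from the bench: with user time nearly equal to wall time the downloader would fall back to gzip; with plenty of idle CPU it would keep the smaller lzip archives.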