From: Ludovic Courtès
To: Maxim Cournoyer
Subject: Re: Profiling of man-db database generation with zlib vs zstd
Date: Tue, 29 Mar 2022 12:30:14 +0200
Message-ID: <87czi5126h.fsf@gnu.org>
In-Reply-To: <87o81qviqg.fsf@gmail.com> (Maxim Cournoyer's message of "Sun, 27 Mar 2022 23:49:59 -0400")
References: <875yo53iuq.fsf@gmail.com> <87ee2r9gms.fsf@gnu.org> <87o81qviqg.fsf@gmail.com>
List-Id: "Development of GNU Guix and the GNU System distribution."
Cc: guix-devel

Hi!

Maxim Cournoyer wrote:

> You'll need to generate the tar.zst and tar.gz yourself, but the script
> that was used is:
>
> ;; decompress-zstd.scm
> (use-modules (ice-9 binary-ports)
>              (ice-9 match)
>              (statprof)
>              (zstd))
>
> (define MiB (expt 2 20))
> (define input-file "/tmp/chromium-98.0.4758.102.tar.zst")
> (define output-file "/dev/null")
>
> (define (decompression-test)
>   (call-with-input-file input-file
>     (lambda (port)
>       (call-with-zstd-input-port port
>         (lambda (input)
>           (call-with-output-file output-file
>             (lambda (output)
>               (let loop ((bv (get-bytevector-n input (* 4 MiB))))
>                 (match bv
>                   ((? eof-object?)
>                    #t)
>                   (bv
>                    (put-bytevector output bv)
>                    (loop (get-bytevector-n input (* 4 MiB)))))))))))))

To isolate the problem, you could allocate the 4 MiB buffer outside of
the loop and use ‘get-bytevector-n!’, and also remove code that writes
to ‘output’.

> This confirms that guile-zstd is not noticeably faster than guile-zlib,
> which is unexpected.

Uh, surprising.

Note that ‘statprof’ incurs overhead, so in general if you want timings,
get them without ‘statprof’.

> Compare to the command line tools:
>
> $ time+ zstd -cdk /tmp/chromium-98.0.4758.102.tar.zst > /dev/null
> cpu: 99%, mem: 10548 KiB, wall: 0:09.37, sys: 0.30, usr: 9.05
>
> $ time+ gunzip -ck /tmp/chromium-98.0.4758.102.tar.gz > /dev/null
> cpu: 99%, mem: 2908 KiB, wall: 0:22.29, sys: 0.31, usr: 21.98
>
> where zstd is about 2.3x faster.
>
> It's unfortunate that the bulk of the time is spent in "anon" (anonymous
> proc?), which doesn't say much.

It’s likely one of the lambdas.
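
For reference, the isolated test suggested earlier (one buffer allocated
up front, ‘get-bytevector-n!’, no writes to ‘output’, no ‘statprof’)
might look roughly like the sketch below.  This is untested; the file
name and the timing code based on ‘get-internal-real-time’ are
assumptions for illustration, not part of the original script.

```scheme
;; decompress-zstd-isolated.scm  (hypothetical file name)
(use-modules (ice-9 binary-ports)
             (ice-9 format)
             (rnrs bytevectors)
             (zstd))

(define MiB (expt 2 20))
(define input-file "/tmp/chromium-98.0.4758.102.tar.zst")

(define (decompression-test)
  ;; Allocate the 4 MiB buffer once, outside of the loop.
  (define buffer (make-bytevector (* 4 MiB)))
  (call-with-input-file input-file
    (lambda (port)
      (call-with-zstd-input-port port
        (lambda (input)
          ;; Read into the same buffer until EOF and discard the data;
          ;; return the total number of decompressed bytes.
          (let loop ((total 0))
            (let ((n (get-bytevector-n! input buffer 0
                                        (bytevector-length buffer))))
              (if (eof-object? n)
                  total
                  (loop (+ total n))))))))))

;; Wall-clock timing without statprof.
(let* ((start (get-internal-real-time))
       (total (decompression-test))
       (end   (get-internal-real-time)))
  (format #t "~a bytes in ~,2f s~%"
          total
          (exact->inexact (/ (- end start)
                             internal-time-units-per-second))))
```

Running this against the same file as the CLI would leave decompression
and port buffering as the only things being measured.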

> Perhaps I should open an issue with the guile-zstd project.

Yes, or we can continue here.  :-)

From there I think we should first fully isolate the thing we’re
measuring, as discussed above, to gain confidence.

If the code using guile-zstd is slower than the CLI, then it could be
that guile-zstd doesn’t initialize the library properly, or that it gets
buffering wrong or something.

I’ll see if I can give it a try too.

Thanks for investigating!

Ludo’.