From: Nala Ginrut
Newsgroups: gmane.lisp.guile.devel
Subject: Re: SHA256 performance with Guile 2.2 vs. Guile 3.0
Date: Sat, 4 Jan 2020 23:51:11 +0800
To: Göran Weinholt
Cc: Andy Wingo, Ludovic Courtès, Guile Devel
References: <874kxcnlh8.fsf@inria.fr> <87sgkwm4uv.fsf@gnu.org> <87h81b7bym.fsf@teapot.weinholt.se>
In-Reply-To: <87h81b7bym.fsf@teapot.weinholt.se>

Congrats!
I just replaced Weinholt's hmac implementation with an NSS binding for product consideration, but it's nice to know this great result! And thanks to Weinholt's work, Artanis had been relying on it for many years.

Best regards.

On Sat, Jan 4, 2020, 18:36 Göran Weinholt wrote:

> Ludovic Courtès writes:
>
> > Ludovic Courtès skribis:
> >
> >> ludo@ribbon ~/src/guix$ ./pre-inst-env guix environment --pure --ad-hoc guile-next guile3.0-hashing -- guile ~/tmp/sha256.scm
> >>
> >> ;;; (hash "b33576331465a60b003573541bf3b1c205936a16c407bc69f8419a527bf5c988")
> >> clock utime stime cutime cstime gctime
> >> 65.17 89.75  0.45   0.00   0.00  35.63
> >
> > The patch below gives us:
> >
> > ludo@ribbon /tmp/hashing$ guile --r6rs -L .. ~/tmp/sha256.scm
> >
> > ;;; (hash "b33576331465a60b003573541bf3b1c205936a16c407bc69f8419a527bf5c988")
> > clock utime stime cutime cstime gctime
> > 59.31 80.65  0.39   0.00   0.00  30.73
> >
> > It’s a disappointingly small improvement.  The reason is that (hashing
> > fixnums) adds another layer of opacity, where it ends up doing
> > essentially:
> >
> >   (define fx32xor fxxor)
> >   …
> >
> > Thus, no inlining, and no easy trick to solve that.  :-/
>
> I've pushed a Guile-specific version of (hashing fixnums) that inlines
> the generic arithmetic procedures. This and some other small changes
> improved the runtime:
>
>            clock utime stime cutime cstime gctime
>  before:
>     2.2.6  31.06 32.61  0.03   0.00   0.00   1.38
>     2.9.8  15.55 16.23  0.01   0.00   0.00   1.19
>  after:
>     2.2.6   2.95  3.01  0.00   0.00   0.00   0.10
>     2.9.8   1.98  1.99  0.00   0.00   0.00   0.08
>
> That's about 100 times slower than sha256sum from coreutils. You might
> get some more performance out of it by unrolling the loop in
> sha-256-transform!.
>
> Regards,
>
> --
> Göran Weinholt   | https://weinholt.se/
> Debian developer | 73 de SA6CJK
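
For anyone reading along who wonders what "inlining the generic arithmetic procedures" might look like, here is a minimal sketch. It is only a guess at the general idea, not the code that was actually pushed: the names fx32xor/alias and fx32xor/inline are made up for illustration, and I'm assuming that rewriting the wrapper as a macro over Guile's generic logxor is an acceptable stand-in for whatever (hashing fixnums) really does.

```scheme
;; Sketch only. fx32xor/alias and fx32xor/inline are hypothetical names,
;; not identifiers from (hashing fixnums).
(use-modules (rnrs arithmetic fixnums))   ; Guile's R6RS library, exports fxxor

;; The pattern quoted above: the wrapper is an ordinary top-level binding
;; to a procedure value, so each use is a call through that binding.
(define fx32xor/alias fxxor)

;; One possible alternative (an assumption, not the pushed change):
;; a macro that expands at every use site into the generic logxor,
;; which the compiler can then treat as a primitive operation.
(define-syntax-rule (fx32xor/inline a b)
  (logxor a b))

;; Both forms compute the same value; only how the call is compiled differs.
(display (fx32xor/alias  #xff00 #x0ff0)) (newline)   ; prints 61680
(display (fx32xor/inline #xff00 #x0ff0)) (newline)   ; prints 61680
```

The trade-off in a sketch like this is that the macro version cannot be passed around as a first-class procedure, and it skips the fixnum range check that fxxor performs, so it only makes sense where the inputs are already known to be 32-bit values.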