From: Linus Björnstam
To: Ludovic Courtès, guile-user@gnu.org
Newsgroups: gmane.lisp.guile.user
Subject: Re: string-for-each vs. for-each+string->list performance
Date: Sat, 13 Jun 2020 08:41:17 +0200
References: <7377875a-507d-471e-866b-0f505517ff82@www.fastmail.com> <87k10c2gah.fsf@gnu.org>

Thanks for clearing that up.

I have an old implementation of large parts of srfi-13 if that would be of interest. I don't know how much you want to change. The licence situation of the reference implementation is weird iirc.

A beginning could be to replace all higher-order functions, since that would minimize the kind of performance problems discussed here.

-- 
Linus Björnstam

On Fri, 12 Jun 2020, at 22:13, Ludovic Courtès wrote:
> Hi,
>
> Linus Björnstam writes:
>
> > You can cut another 15-ish % from that loop by making an inline loop, btw
> >
> > (let loop ((pos 0))
> >   (when (< pos (string-length str))
> >     ...
> >     (loop (1+ pos))))
> >
> > I have been looking at the disassembly, even for simpler cases, but I haven't been able to understand enough of it.
> >
> > BTW: string-for-each is in the default environment, and is probably the same as the srfi-13 C implementation.
>
> ‘string-for-each’ in C (the default) is slower than its Scheme counterpart:
>
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> (define (sfe proc str)
>                        (define len (string-length str))
>                        (let loop ((i 0))
>                          (unless (= i len)
>                            (proc (string-ref str i))
>                            (loop (+ 1 i)))))
> scheme@(guile-user)> (define str (make-string 15000000))
> scheme@(guile-user)> ,t (sfe identity str)
> ;; 0.263725s real time, 0.263722s run time.  0.000000s spent in GC.
> scheme@(guile-user)> ,t (sfe identity str)
> ;; 0.259538s real time, 0.259529s run time.  0.000000s spent in GC.
> scheme@(guile-user)> ,t (string-for-each identity str)
> ;; 0.841632s real time, 0.841624s run time.  0.000000s spent in GC.
> scheme@(guile-user)> (version)
> $2 = "3.0.2"
> --8<---------------cut here---------------end--------------->8---
>
> In general we seem to pay a high price for leaving (calling a subr) and
> re-entering (via ‘scm_call_n’) the VM.  This is especially acute here
> because there’s almost nothing happening in C, so we keep bouncing
> between Scheme and C.
>
> That’s another reason to start rewriting such primitives in Scheme and
> have the C functions just call out to Scheme.
>
> If we do:
>
>   perf record guile -c '(string-for-each identity (make-string 15000000))'
>
> we get this profile:
>
> --8<---------------cut here---------------start------------->8---
> Overhead  Command  Shared Object          Symbol
>   31.10%  guile    libguile-3.0.so.1.1.1  [.] vm_regular_engine
>   27.48%  guile    libguile-3.0.so.1.1.1  [.] scm_call_n
>   14.34%  guile    libguile-3.0.so.1.1.1  [.] scm_jit_enter_mcode
>    3.55%  guile    libguile-3.0.so.1.1.1  [.] scm_i_string_ref
>    3.37%  guile    libguile-3.0.so.1.1.1  [.] get_callee_vcode
>    2.34%  guile    libguile-3.0.so.1.1.1  [.] scm_call_1
>    2.31%  guile    libguile-3.0.so.1.1.1  [.] scm_string_for_each
> --8<---------------cut here---------------end--------------->8---
>
> Indeed, we get better performance when turning off JIT:
>
> --8<---------------cut here---------------start------------->8---
> $ GUILE_JIT_THRESHOLD=-1 time guile -c '(string-for-each identity (make-string 15000000))'
> 0.47user 0.00system 0:00.47elapsed 100%CPU (0avgtext+0avgdata 26396maxresident)k
> 0inputs+0outputs (0major+1583minor)pagefaults 0swaps
> $ GUILE_JIT_THRESHOLD=100 time guile -c '(string-for-each identity (make-string 15000000))'
> 0.83user 0.00system 0:00.83elapsed 100%CPU (0avgtext+0avgdata 26948maxresident)k
> 0inputs+0outputs (0major+1748minor)pagefaults 0swaps
> $ GUILE_JIT_THRESHOLD=0 time guile -c '(string-for-each identity (make-string 15000000))'
> 0.84user 0.00system 0:00.85elapsed 100%CPU (0avgtext+0avgdata 27324maxresident)k
> 0inputs+0outputs (0major+2548minor)pagefaults 0swaps
> --8<---------------cut here---------------end--------------->8---
>
> So it seems that we just keep firing the JIT machinery on every
> ‘scm_call_n’ for no benefit.
>
> That’s probably also the reason why ‘%after-gc-thunk’, ‘reap-pipes’, &
> co. always show high in statprof:
>
>   https://lists.gnu.org/archive/html/guile-devel/2020-05/msg00019.html
>
> Thanks,
> Ludo’.
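
To make the "rewrite such primitives in Scheme" idea concrete, here is a rough sketch modelled on the `sfe` loop quoted above. The name `string-for-each/scheme` and the optional `start`/`end` arguments are hypothetical, chosen only for illustration; this is not Guile's actual implementation, and it skips the type and range checks and the multi-string form that the real srfi-13 procedure supports.

```scheme
;; Hypothetical Scheme-only variant of string-for-each.
;; The loop stays inside the VM, so PROC is called directly rather
;; than from C through scm_call_n for every character.
(define* (string-for-each/scheme proc str
                                 #:optional
                                 (start 0)
                                 (end (string-length str)))
  (let loop ((i start))
    (when (< i end)
      (proc (string-ref str i))
      (loop (+ i 1)))))

;; Usage: collect the characters of a string, most recent first.
(define acc '())
(string-for-each/scheme (lambda (c) (set! acc (cons c acc))) "abc")
```

Whether a version like this is actually faster would need to be measured with `,t` in the same way as the timings quoted above.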