From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Stefan Israelsson Tampe Newsgroups: gmane.lisp.guile.devel Subject: Re: Hatables are slow Date: Wed, 23 Feb 2022 09:02:53 +0100 Message-ID: References: <4dccd80b-18f2-40e3-b6b2-c1d97bd91224@www.fastmail.com> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="00000000000032690d05d8aae5c7" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="22438"; mail-complaints-to="usenet@ciao.gmane.io" Cc: guile-devel To: =?UTF-8?Q?Linus_Bj=C3=B6rnstam?= Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Wed Feb 23 09:05:00 2022 Return-path: Envelope-to: guile-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1nMmdt-0005Tg-DH for guile-devel@m.gmane-mx.org; Wed, 23 Feb 2022 09:04:57 +0100 Original-Received: from localhost ([::1]:52738 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nMmdr-0002fW-AC for guile-devel@m.gmane-mx.org; Wed, 23 Feb 2022 03:04:55 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:34308) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nMmc9-0002e5-Mp for guile-devel@gnu.org; Wed, 23 Feb 2022 03:03:09 -0500 Original-Received: from [2607:f8b0:4864:20::42f] (port=34365 helo=mail-pf1-x42f.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nMmc7-0008Nr-NI for guile-devel@gnu.org; Wed, 23 Feb 2022 03:03:09 -0500 Original-Received: by mail-pf1-x42f.google.com with SMTP id g1so14614793pfv.1 for ; Wed, 23 Feb 2022 00:03:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=tVzarXAi88lA5f+Ifv3vj2Vjyjqk6tRFapViipzxqHg=; b=X8xg6k0TMCPfBKZ9BpyhnJmV0KaytXOZNYA3Srz3tTTFEmTQuMa0jrfhelWOWvB8w7 LC0mcwM2ZZ09f9YAWVsA8YQEok4pBBBDPu6wQDKw+U+KIbQuCaTcye0wuL0rQXuo0TM+ BNUSBp3jn3BMdKsKk6QHkMVXND7pJ33XnWU7yi39q/yy5BoSlgqGM1VGv7U+O/bzNypn /nOi3aS3fGkt7HFN9P/9vs2agcYsMmovqxD0FAh+ZrYFRVrFBcnbgnYMeXCclZcbrvnG wXeFHpx+9hpsSOAbqlQruNoJwbTAsC1sOvojngwPqf1O9s7XljlZQTovZvbETwO66iRg ipGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=tVzarXAi88lA5f+Ifv3vj2Vjyjqk6tRFapViipzxqHg=; b=2HdzgrQCSu0naIMIbvKhCPkH9vGhq5bwuIaswZRxoI+DV9730hWmWMAqR5GDxl4V0q XxZQmjOuUt4fZvGsvEhO9lwTKlb8fOcX0BTcqvx8YQRadABTR6b47hkI/6jLGHNFouIG 2gsfDHMsF6BZ4kLAhleqCrxNA9JhEQdnVvRAl4xPttzdUc4XtuUWLYv7vwObt+1+9B6n +X6rR1+tiuyaSBGMgDy9sIYz4/uU3ui5iEtjqHfXYCM9OIZejG756jbigtmm2UYlKmK/ oHvjS6uwSvnY5WgKFwZKbd65+6b+4wSzy6nIJLZZJl3xqAa1Ks9zuleWf2/AtFeslfPX qusA== X-Gm-Message-State: AOAM533saDpUBpMDuLVq/NdhfcqHdGTiLLJhzbVthwHi8msEuR8lhsvu inwSs9wZDGX7DR4tAIEtn/T/oyum22Zpnc921Tg= X-Google-Smtp-Source: ABdhPJzpfrY/H7Mx3JWAFtUw0enH8wckS9dUmiuYKFsU3IsHcMVSnP9ePXrNNgyAtIJAYLOgdT2yXDrY2wQ9UU6JaGo= X-Received: by 2002:a62:7e06:0:b0:4e0:f0f8:9b86 with SMTP id z6-20020a627e06000000b004e0f0f89b86mr28880539pfc.26.1645603384878; Wed, 23 Feb 2022 00:03:04 -0800 (PST) In-Reply-To: <4dccd80b-18f2-40e3-b6b2-c1d97bd91224@www.fastmail.com> X-Host-Lookup-Failed: Reverse DNS lookup failed for 2607:f8b0:4864:20::42f (failed) Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=stefan.itampe@gmail.com; helo=mail-pf1-x42f.google.com X-Spam_score_int: -6 X-Spam_score: -0.7 X-Spam_bar: / X-Spam_report: (-0.7 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, PDS_HP_HELO_NORDNS=0.659, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Original-Sender: "guile-devel" Xref: news.gmane.io gmane.lisp.guile.devel:21146 Archived-At: --00000000000032690d05d8aae5c7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable yep that's a working solution, currently I make a proper goops class and the methods will dispatch to the right version. On Wed, Feb 23, 2022 at 8:41 AM Linus Bj=C3=B6rnstam < linus.bjornstam@veryfast.biz> wrote: > Hej! > > I would also propose a hash table based on a more sane interface. The > equality and hash procedures should be associated with the hash table at > creation rather than every time the hash table is used. Like in R6RS, > srfi-69, or srfi-12X (intermediate hash tables). > > Maybe the current HT could be relegated to some kind of compat or > deprecated library to be removed in 3.4... I am no maintainer, but I thin= k > we can all agree that the current API, while fine in the context of guile > 1.6, is somewhat clunky by today's standards. It is also commonplace enou= gh > that regular deprecation might become rough. > > Just the simple fact that hash-set! and hashq-set! can be used > interchangeably while you at the same time NEVER EVER should mix them is > somewhat unnerving. > > I would say a hash table that specifies everything at creation time (with > maybe an opportunity to use something like the hashx-* functions for > daredevils and for future srfi needs) is the way to go. > > Best regards > Linus Bj=C3=B6rnstam > > On Mon, 21 Feb 2022, at 14:18, Stefan Israelsson Tampe wrote: > > A datastructure I fancy is hash tables. But I found out that hashtables > > in guile are really slow, How? First of all we make a hash table > > > > (define h (make-hash-table)) > > > > Then add values > > (for-each (lambda (i) (hash-set! h i i)) (iota 20000000)) > > > > Then the following operation cost say 5s > > (hash-fold (lambda (k v s) (+ k v s)) 0 h) > > > > It is possible with the foreign interface to speedt this up to 2s using > > guiles internal interface. But this is slow for such a simple > > application. Now let's change focus. Assume the in stead an assoc, > > > > (define l (map (lambda (i) (cons i i)) (iota 20000000))) > > > > Then > > ime (let lp ((l ll) (s 0)) (if (pair? l) (lp (cdr l) (+ s (caar l))) s)= ) > > $5 =3D 199999990000000 > > ;; 0.114530s real time, 0.114391s run time. 0.000000s spent in GC. > > > > That's 20X faster. What have happened?, Well hashmaps has terrible > > memory layout for scanning. So essentially keeping a list of the > > created values consed on a list not only get you an ordered hashmap, > > you also have 20X increase in speed, you sacrifice memory, say about > > 25-50% extra. The problem actually more that when you remove elements > > updating the ordered list is very expensive. In python-on-guile I have > > solved this by moving to a doubly linked list when people start's to > > delete single elements. For small hashmap things are different. > > > > I suggest that guile should have a proper faster standard hashmap > > implemention of such kind in scheme. > > > > Stefan > --00000000000032690d05d8aae5c7 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
yep that's a working=C2=A0solution, currently I make a= proper goops class and the methods will dispatch to the right=C2=A0version= .

On Wed, Feb 23, 2022 at 8:41 AM Linus Bj=C3=B6rnstam <linus.bjornstam@veryfast.biz> wrote:<= br>
Hej!

I would also propose a hash table based on a more sane interface. The equal= ity and hash procedures should be associated with the hash table at creatio= n rather than every time the hash table is used. Like in R6RS, srfi-69, or = srfi-12X (intermediate hash tables).

Maybe the current HT could be relegated to some kind of compat or deprecate= d library to be removed in 3.4... I am no maintainer, but I think we can al= l agree that the current API, while fine in the context of guile 1.6, is so= mewhat clunky by today's standards. It is also commonplace enough that = regular deprecation might become rough.

Just the simple fact that hash-set! and hashq-set! can be used interchangea= bly while you at the same time NEVER EVER should mix them is somewhat unner= ving.

I would say a hash table that specifies everything at creation time (with m= aybe an opportunity to use something like the hashx-* functions for daredev= ils and for future srfi needs) is the way to go.

Best regards
=C2=A0 Linus Bj=C3=B6rnstam

On Mon, 21 Feb 2022, at 14:18, Stefan Israelsson Tampe wrote:
> A datastructure I fancy is hash tables. But I found out that hashtable= s
> in guile are really slow, How? First of all we make a hash table
>
> (define h (make-hash-table))
>
> Then add values
> (for-each (lambda (i) (hash-set! h i i)) (iota 20000000))
>
> Then the following operation cost say 5s
> (hash-fold (lambda (k v s) (+ k v s)) 0 h)
>
> It is possible with the foreign interface to speedt this up to 2s usin= g
> guiles internal interface. But this is slow for such a simple
> application. Now let's change focus. Assume the in stead an assoc,=
>
> (define l (map (lambda (i) (cons i i)) (iota 20000000)))
>
> Then
> ime (let lp ((l ll) (s 0)) (if (pair? l) (lp (cdr l) (+ s (caar l))) s= ))
> $5 =3D 199999990000000
> ;; 0.114530s real time, 0.114391s run time.=C2=A0 0.000000s spent in G= C.
>
> That's 20X faster. What have happened?, Well hashmaps has terrible=
> memory layout for scanning. So essentially keeping a list of the
> created values consed on a list not only get you an ordered hashmap, <= br> > you also have 20X increase in speed, you sacrifice memory, say about <= br> > 25-50% extra. The problem actually more that when you remove elements =
> updating the ordered list is very expensive. In python-on-guile I have=
> solved this by moving to a doubly linked list when people start's = to
> delete single elements. For small hashmap things are different.
>
> I suggest that guile should have a proper faster standard hashmap
> implemention of such kind in scheme.
>
> Stefan
--00000000000032690d05d8aae5c7--