From: Stefan Israelsson Tampe
Newsgroups: gmane.lisp.guile.devel
Subject: Re: more advanced bytevector => supervectors
Date: Thu, 2 Sep 2021 17:54:43 +0200
To: guile-devel

Oh, I just created the project; you can follow it here:

https://gitlab.com/tampe/stis-supervectors

On Thu, Sep 2, 2021 at 5:45 PM Stefan Israelsson Tampe
<stefan.itampe@gmail.com> wrote:

> Hi guilers!
>
> My next project is to explore a more advanced bytevector structure
> than today's bytevectors. I am thinking of having this infrastructure
> in Guile, with core code taking advantage of it, e.g. having strings
> on top of it and allowing our string library to use them (the
> difficult case is probably getting regexps to work properly).
>
> Anyhow, this is my initial comment in the code:
>
> #|
> The idea of this data structure is to note that when we perform full
> scans of large data structures, we can afford a much richer and more
> featureful data structure than a simple bytevector. The reason is
> that we can divide the data into byte chunks and spend most of the
> time scanning, copying and mapping those areas with the usual
> methods; even optimized C code that takes advantage of advanced CPU
> opcodes is possible here. And by dividing the data into chunks we get
> a lot of new features at essentially no cost other than complexity,
> which we manage mostly in Scheme. We gain many things:
>
> 1. Many new features that can vary between pages
>
> 2. Less memory hogging, as
>    a. it can use copy-on-write semantics
>    b. it does not need to find 1 GB contiguous blocks
>
> 3. Potentially faster operations, as we can fast-forward over the
>    zero-on-write pages, compared to pure bytevectors
>
> 4. We can allow for swapping and referential data structures to speed
>    up copying even further
>
> 5. We can get better fiber behaviour for C programs that spend
>    seconds or minutes on performing an operation, because each call
>    will spend just a microsecond or so in C-land and then swap back
>    to Scheme. CTRL-C will also work nicely.
>
> 6. We could potentially build a string library on top of these data
>    structures, and we could also add features to pages that lead to
>    a much richer interface.
>
> 7. Resizing is much faster and more efficient
>
> 8. Reversing is super quick
>
> 9. Queues and stacks of byte data can have very efficient
>    implementations
>
> Drawbacks:
> 1. More complex, as we need to consider boundaries
> 2. Slower one-off operations like bytevector-u8-ref, as Guile
>    compiles the core operations to quite effective JIT CPU encodings.
>    But maybe we can design some caching to make those operations much
>    faster and even have supporting JIT operations.
>
> |#
>
> WDYT?
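
The paging idea in points 2 and 3 of the quoted comment can be sketched roughly as follows. This is an illustration only, written in Python rather than Guile Scheme; the class name PagedBytes, the 4096-byte page size and all method names are assumptions made for this example and are not the stis-supervectors API. Pages that were never written are stored as None and read as zeros, and copy shares pages between the two objects until one side writes to them.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PagedBytes:
    """Byte sequence split into fixed-size pages (hypothetical design)."""

    def __init__(self, length):
        self.length = length
        n_pages = (length + PAGE_SIZE - 1) // PAGE_SIZE
        # None marks a page that has never been written: it reads as
        # zeros and costs no memory until the first write.
        self.pages = [None] * n_pages
        # owned[i] is True when this object may mutate pages[i] in place.
        self.owned = [True] * n_pages

    def get(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        page = self.pages[index // PAGE_SIZE]
        return 0 if page is None else page[index % PAGE_SIZE]

    def set(self, index, value):
        if not 0 <= index < self.length:
            raise IndexError(index)
        p = index // PAGE_SIZE
        if self.pages[p] is None:
            # Zero-on-write: materialize the page only now.
            self.pages[p] = bytearray(PAGE_SIZE)
            self.owned[p] = True
        elif not self.owned[p]:
            # Copy-on-write: the page is shared, so take a private copy.
            self.pages[p] = bytearray(self.pages[p])
            self.owned[p] = True
        self.pages[p][index % PAGE_SIZE] = value

    def copy(self):
        """Copy by sharing page references; no page bytes are copied."""
        other = PagedBytes(self.length)
        other.pages = list(self.pages)
        # After sharing, neither side may mutate a page in place.
        self.owned = [False] * len(self.pages)
        other.owned = [False] * len(self.pages)
        return other

    def count_nonzero(self):
        """Full scan that skips never-written pages entirely."""
        return sum(
            len(page) - page.count(0)
            for page in self.pages
            if page is not None
        )
```

In this sketch a copy costs time proportional to the number of pages rather than the number of bytes, and a scan such as count_nonzero never touches pages that are still implicitly zero. The price is the extra bookkeeping in get and set, which is the boundary-handling complexity listed under the drawbacks.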
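
Point 8 of the quoted comment (quick reversal) and drawback 2 (slower one-off access) can be illustrated together. The sketch below is again Python used purely for illustration; Chunk, ChunkedBytes, the tiny default chunk size and the per-chunk reversed flag are all assumptions for this example, not something taken from the stis-supervectors code. Reversing only reverses the list of chunk references and flips a flag on each chunk, so no bytes are moved.

```python
class Chunk:
    """A view of a shared byte buffer, optionally read back to front."""

    def __init__(self, data, is_reversed=False):
        self.data = data              # shared buffer, never mutated here
        self.is_reversed = is_reversed

    def __len__(self):
        return len(self.data)

    def get(self, i):
        if self.is_reversed:
            return self.data[len(self.data) - 1 - i]
        return self.data[i]


class ChunkedBytes:
    """Byte sequence stored as a list of chunks (hypothetical design)."""

    def __init__(self, chunks):
        self.chunks = list(chunks)

    @classmethod
    def from_bytes(cls, data, chunk_size=4):
        return cls(
            Chunk(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)
        )

    def reverse(self):
        # Work is proportional to the number of chunks, not bytes:
        # reverse the chunk order and flip each chunk's direction flag.
        return ChunkedBytes(
            Chunk(c.data, not c.is_reversed) for c in reversed(self.chunks)
        )

    def get(self, index):
        # A one-off access must first locate the right chunk, which is
        # the extra cost described in drawback 2.
        if index < 0:
            raise IndexError(index)
        for c in self.chunks:
            if index < len(c):
                return c.get(index)
            index -= len(c)
        raise IndexError(index)

    def to_bytes(self):
        return b"".join(
            c.data[::-1] if c.is_reversed else c.data for c in self.chunks
        )
```

For example, ChunkedBytes.from_bytes(b"abcdefghij").reverse().to_bytes() yields the bytes in reverse order while the reversed object still refers to the same underlying buffers as the original. A cache of the most recently used chunk and its starting offset is one possible way to soften the lookup cost in get, in the spirit of the caching mentioned in the drawbacks.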