From: Matt Wette
Newsgroups: gmane.lisp.guile.devel
Subject: Re: more advanced bytevector => supervectors
Date: Thu, 2 Sep 2021 12:11:12 -0700
To: guile-devel@gnu.org

Maybe Guile could consider regexps in Scheme; see
https://ds26gte.github.io/pregexp/index.html

On 9/2/21 8:45 AM, Stefan Israelsson Tampe wrote:
> Hi guilers!
>
> My next project is to explore a more advanced bytevector structure
> than today's bytevectors.
> I am thinking of having this infrastructure in Guile, with core code
> taking advantage of it, e.g. strings built on top of it that our string
> library can use (the difficult case is probably getting regexps
> to work properly).
>
> Anyhow, this is my initial comment in the code:
>
> #|
> The idea of this data structure is to note that when we employ full
> scans of large data structures, we can allow for a much richer and
> more featureful data structure than a simple bytevector.
> The reason is that we can divide
> the data into byte chunks and spend most of the time scanning, copying,
> and mapping those areas with the usual methods; even optimized C code
> taking advantage of advanced CPU opcodes is possible here. And by
> dividing it into chunks we get a lot of new features at essentially no
> cost other than complexity, which we manage mostly in Scheme. We gain
> many things:
>
> 1. Many new features that can vary between pages
>
> 2. Less memory hogging, as
>    a. it can use copy-on-write semantics
>    b. it does not need to find 1GB contiguous blocks
>
> 3. Potentially faster operations, as we can fast-forward over the
>    zero-on-write pages, compared to pure bytevectors
>
> 4. We can allow for swapping and referential data structures to speed
>    up copying even further
>
> 5. We can get better fiber behavior for C programs that spend seconds or
>    minutes performing an operation, because control will spend just a
>    microsecond or so in C-land and then swap back to Scheme. CTRL-C will
>    also work nicely.
>
> 6. We could potentially build a string library on top of these
>    data structures, and also add features to pages that lead to a much
>    richer interface.
>
> 7. Resizing is much faster and more efficient
>
> 8. Reversing is super quick
>
> 9. Queues and stacks of byte data can have very efficient
>    implementations
>
> Drawbacks:
> 1. More complex, as we need to consider boundaries
> 2. Slower one-off operations like bytevector-u8-get, as Guile compiles the
>    core operations to quite effective JIT CPU encodings. But maybe we can
>    design some caching to make those operations much faster and even have
>    supporting JIT operations.
>
> |#
>
> WDYT?
>
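
To make the chunking idea quoted above concrete, here is a minimal sketch
in Guile. It is an illustration under stated assumptions, not the
proposed implementation: every name in it (<chunked-bv>, make-chunked-bv,
chunked-bv-u8-ref, chunked-bv-u8-set!, chunked-bv-allocated-chunks) is
hypothetical, and it only covers lazily allocated zero pages (points 2b
and 3), not copy-on-write, swapping, or fibers.

```scheme
(use-modules (rnrs bytevectors)
             (srfi srfi-9))

;; Hypothetical record; not the API from the proposal.
;; Each slot of `chunks` holds a bytevector, or #f for an all-zero page.
(define-record-type <chunked-bv>
  (%make-chunked-bv chunk-size size chunks)
  chunked-bv?
  (chunk-size chunked-bv-chunk-size)
  (size       chunked-bv-size)
  (chunks     chunked-bv-chunks))

(define (make-chunked-bv size chunk-size)
  ;; No page is allocated up front, so no large contiguous block is needed.
  (let ((n (quotient (+ size chunk-size -1) chunk-size)))
    (%make-chunked-bv chunk-size size (make-vector n #f))))

(define (check-index cbv i)
  (unless (and (exact-integer? i) (<= 0 i) (< i (chunked-bv-size cbv)))
    (error "index out of range" i)))

(define (chunked-bv-u8-ref cbv i)
  (check-index cbv i)
  (let* ((cs (chunked-bv-chunk-size cbv))
         (chunk (vector-ref (chunked-bv-chunks cbv) (quotient i cs))))
    ;; An unallocated page reads as zero.
    (if chunk
        (bytevector-u8-ref chunk (remainder i cs))
        0)))

(define (chunked-bv-u8-set! cbv i byte)
  (check-index cbv i)
  (let* ((cs (chunked-bv-chunk-size cbv))
         (chunks (chunked-bv-chunks cbv))
         (k (quotient i cs)))
    ;; Allocate the page on first write only.
    (unless (vector-ref chunks k)
      (vector-set! chunks k (make-bytevector cs 0)))
    (bytevector-u8-set! (vector-ref chunks k) (remainder i cs) byte)))

(define (chunked-bv-allocated-chunks cbv)
  ;; Count the pages that have actually been written to.
  (let ((chunks (chunked-bv-chunks cbv)))
    (let loop ((k 0) (n 0))
      (if (= k (vector-length chunks))
          n
          (loop (+ k 1) (if (vector-ref chunks k) (+ n 1) n))))))

;; Usage: 10000 bytes in 4096-byte pages gives three page slots.
(define v (make-chunked-bv 10000 4096))
(chunked-bv-u8-set! v 5000 42)
(chunked-bv-u8-ref v 5000)        ; => 42
(chunked-bv-u8-ref v 0)           ; => 0, page 0 was never allocated
(chunked-bv-allocated-chunks v)   ; => 1
```

Note that every single-byte access here pays for a quotient, a remainder,
and a vector lookup on top of the underlying bytevector-u8-ref, which is
the one-off overhead listed as drawback 2 in the quoted comment.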