From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Taylan Kammer Newsgroups: gmane.lisp.guile.user Subject: Re: Question about data structures Date: Mon, 23 Nov 2020 04:42:37 +0100 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="29740"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0 To: Zelphir Kaltstahl , Guile User Original-X-From: guile-user-bounces+guile-user=m.gmane-mx.org@gnu.org Mon Nov 23 04:42:57 2020 Return-path: Envelope-to: guile-user@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kh2kh-0007d4-D5 for guile-user@m.gmane-mx.org; Mon, 23 Nov 2020 04:42:55 +0100 Original-Received: from localhost ([::1]:40604 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kh2kg-0007Sr-G1 for guile-user@m.gmane-mx.org; Sun, 22 Nov 2020 22:42:54 -0500 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:50916) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kh2kW-0007SG-B8 for guile-user@gnu.org; Sun, 22 Nov 2020 22:42:44 -0500 Original-Received: from mail-wm1-x32e.google.com ([2a00:1450:4864:20::32e]:53396) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kh2kU-0001uB-KS for guile-user@gnu.org; Sun, 22 Nov 2020 22:42:44 -0500 Original-Received: by mail-wm1-x32e.google.com with SMTP id p22so16612506wmg.3 for ; Sun, 22 Nov 2020 19:42:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=tOizB6tlAf7buz1j797Y0iYk2TNlIKd9VqYTyFc2ChY=; b=omBYQWNvxHdN/Nlf43TAI3ol9RKPF/T7HE7OQZOnGz7hF0gIS/xIP2EtdYrZ3xCWGI rWuq3iYb+++GRHvt4Rbh25tnUYqAVEMzKzot5n2hb4IpFt9kAFG0UZwapV8Eftv9gRw2 XkfUuZPytIC9gSIFV5UezbwM/XB55+N4gpy4/La0bsMhEJo7F/B5f4kujGUf7+bhC+md Y4WJaiBbWEvIxUvE4HIwNy5W85rG6UgecT7ntTXGA6XML6SEa45U//KyXnP/6Bt0L5Mk RVztGgRgcPBe850WJ7AiU1KlT7DhGD0xi9jJJQCs8a3QfaOCmR+OYQnyrOW7SubAf84/ cGKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=tOizB6tlAf7buz1j797Y0iYk2TNlIKd9VqYTyFc2ChY=; b=LVJkDmGJGLdUx5gwdpKlDS8Byys6bpEk5QgE2WvG3peBJdOCvp5FKBKqg2i6uODyh9 MlaaGS1ex20smCIoB0sgy7R2FwzT8uRLnDmxYDTd5jm/WvYDAN0DO+OaKYA26dsZ4EIB RYNDrB4AjNftIuvO9+GRJ3qqMnn2iBHQmAtNeoqYTUsXXYxG0HfTQGGnqBg5M1Qf2+Tn KeRz3/rha5K+vbdT3i/tvlNRocoEtIZEyhC4qrkJczEeXzcPETOiTg7dIrDir2FpFv86 grj0lBGjMswbTi5Zc+50B0MVFo4K6IpgcjQsT8JZi3lo3lbeAeSe/NJ5EeB0bwt4dGVR FSsA== X-Gm-Message-State: AOAM533u/1sKevb85Wju8tfSKx5y4OLxE+xZv8LpKvncso2LgHc86QFc Pmjy33kU4RtIF8TPjYNR+IUzdUlGPu0= X-Google-Smtp-Source: ABdhPJz0FZ/klUv/DDk8/CIKyy/S+lotNbrESIHmnQ5DKP867FwW7I0SohP9ApX64EgPPhEaXrYwAw== X-Received: by 2002:a1c:5f08:: with SMTP id t8mr21071198wmb.84.1606102959916; Sun, 22 Nov 2020 19:42:39 -0800 (PST) Original-Received: from [192.168.178.20] ([109.90.125.150]) by smtp.gmail.com with ESMTPSA id p11sm8605206wrj.14.2020.11.22.19.42.39 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 22 Nov 2020 19:42:39 -0800 (PST) In-Reply-To: Content-Language: en-US Received-SPF: pass client-ip=2a00:1450:4864:20::32e; envelope-from=taylan.kammer@gmail.com; helo=mail-wm1-x32e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, NICE_REPLY_A=-0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: guile-user@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: General Guile related discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-user-bounces+guile-user=m.gmane-mx.org@gnu.org Original-Sender: "guile-user" Xref: news.gmane.io gmane.lisp.guile.user:17045 Archived-At: On 22.11.2020 19:48, Zelphir Kaltstahl wrote: > Hello Guile Users! > > I have a question about data structures. > > [...] > > How do you approach this problem? Is it a problem at all? First of all, be cautious about premature optimization. In many cases it's best to just write the code the most straightforward way possible with the tools at hand, and not bother with optimization unless it actually proves to be an issue. Are you going to be processing files with millions of lines? Thousands of lines but on a very weak CPU? Does it matter if your program takes 0.1 seconds or 2 seconds to run? Now the actual answer, in case you need to optimize, or just want to learn more: All data structures that offer a sequential list of elements have to make some trade-offs between the performance of various operations, as well as the implementation complexity. Linked lists (i.e. "lists" in Scheme) are very simple, and a few operations are cheap as well, but they have the shortcomings you've described plus some more. Since your main concern seems to be appending, you could simply use a linked list where you keep a reference to the last cons pair (tail) of the list, so appending is simply a matter of a 'set-cdr!' operation on the tail. Python lists, JDK's ArrayList, and .NET ArrayList, among probably many other "list" or "array" data structures in popular languages nowadays use a relatively straightforward data structure that is backed by an actual array which can have empty slots (e.g. your Python list with 3 elements might be backed by an array of size 10), and is reallocated whenever there's no space left. This means that appending an element at the end is usually dirt cheap, until there's no space left, at which point the append operation is much heavier for one call, then the following calls are dirt cheap again, until it's full again... Inserting an element at the beginning or middle is also relatively expensive with those implementations, since all elements need to be shifted forward to make space for the new element. (Although this might be done with an operation like C's memcpy which is still actually very fast.) It's called a "dynamic array" by Wikipedia: https://en.wikipedia.org/wiki/Dynamic_array If you want to go on an adventure, you could implement a Scheme data structure called DVector that implements this strategy, using plain Scheme vectors for the backing array. The VList has also been mentioned in this thread, but from what I can tell it doesn't seem to offer a very efficient append operation. - Taylan