From: Taylan Kammer <taylan.kammer@gmail.com>
To: Zelphir Kaltstahl <zelphirkaltstahl@posteo.de>,
Guile User <guile-user@gnu.org>
Subject: Re: Question about data structures
Date: Mon, 23 Nov 2020 04:42:37 +0100 [thread overview]
Message-ID: <a5317e0c-9fe0-1fb9-c6b5-d3065d179285@gmail.com> (raw)
In-Reply-To: <d8df761e-0f1f-d57d-1180-ab23c928ee82@posteo.de>
On 22.11.2020 19:48, Zelphir Kaltstahl wrote:
> Hello Guile Users!
>
> I have a question about data structures.
>
> [...]
>
> How do you approach this problem? Is it a problem at all?
First of all, be cautious about premature optimization. In many cases
it's best to just write the code the most straightforward way possible
with the tools at hand, and not bother with optimization unless it
actually proves to be an issue. Are you going to be processing files
with millions of lines? Thousands of lines but on a very weak CPU?
Does it matter if your program takes 0.1 seconds or 2 seconds to run?
Now the actual answer, in case you need to optimize, or just want to
learn more:
All data structures that offer a sequential list of elements have to
make some trade-offs between the performance of various operations, as
well as the implementation complexity. Linked lists (i.e. "lists" in
Scheme) are very simple, and a few operations are cheap as well, but
they have the shortcomings you've described plus some more.
Since your main concern seems to be appending, you could simply use a
linked list where you keep a reference to the last cons pair (tail) of
the list, so appending is simply a matter of a 'set-cdr!' operation on
the tail.
Python lists, JDK's ArrayList, and .NET ArrayList, among probably many
other "list" or "array" data structures in popular languages nowadays
use a relatively straightforward data structure that is backed by an
actual array which can have empty slots (e.g. your Python list with 3
elements might be backed by an array of size 10), and is reallocated
whenever there's no space left. This means that appending an element at
the end is usually dirt cheap, until there's no space left, at which
point the append operation is much heavier for one call, then the
following calls are dirt cheap again, until it's full again...
Inserting an element at the beginning or middle is also relatively
expensive with those implementations, since all elements need to be
shifted forward to make space for the new element. (Although this might
be done with an operation like C's memcpy which is still actually very
fast.)
It's called a "dynamic array" by Wikipedia:
https://en.wikipedia.org/wiki/Dynamic_array
If you want to go on an adventure, you could implement a Scheme data
structure called DVector that implements this strategy, using plain
Scheme vectors for the backing array.
The VList has also been mentioned in this thread, but from what I can
tell it doesn't seem to offer a very efficient append operation.
- Taylan
next prev parent reply other threads:[~2020-11-23 3:42 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-22 18:48 Question about data structures Zelphir Kaltstahl
2020-11-22 19:45 ` divoplade
2020-11-22 20:24 ` Zelphir Kaltstahl
2020-11-22 21:00 ` divoplade
2020-11-28 2:47 ` Zelphir Kaltstahl
2020-11-22 20:58 ` kwright
2020-11-22 22:50 ` Tim Van den Langenbergh
2020-11-23 3:42 ` Taylan Kammer [this message]
2020-11-23 4:42 ` John Cowan
2020-11-23 14:54 ` [EXT] " Thompson, David
2020-11-23 9:52 ` Neil Jerram
2020-11-23 12:24 ` Dr. Arne Babenhauserheide
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.gnu.org/software/guile/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a5317e0c-9fe0-1fb9-c6b5-d3065d179285@gmail.com \
--to=taylan.kammer@gmail.com \
--cc=guile-user@gnu.org \
--cc=zelphirkaltstahl@posteo.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).