unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: Mark H Weaver <mhw@netris.org>
To: Arne Babenhauserheide <arne_bab@web.de>
Cc: guile-devel@gnu.org
Subject: Re: Immediate doubles (up to 2^256) and rationals coming to Guile 3
Date: Tue, 11 Jun 2019 20:03:26 -0400	[thread overview]
Message-ID: <87a7en7jpy.fsf@netris.org> (raw)
In-Reply-To: <87h89233pp.fsf@web.de> (Arne Babenhauserheide's message of "Thu,  06 Jun 2019 21:37:22 +0200")

Hi Arne,

Arne Babenhauserheide <arne_bab@web.de> writes:

> I don’t understand the implications of this, but I realized that I would
> love to have info about the sizes of different datastructures in Guile.

Sure.

> I recently started measuring how much memory is required for a record
> compared to a list, so I could give people concrete guidelines which
> datastructures to use for which algorithm, and for that such low-level
> details would be great to have! (or just to know where to read them up —
> or which keywords to look for)

For core data structures implemented in C, the place to look is in the
corresponding *.h and *.c files in libguile, for example struct.[ch],
strings.[ch], numbers.[ch], etc.  However, pairs receive special
handling, and only part of their implementation is in pairs.[ch].

I will summarize the costs of some important structures below:

In the following, a "word" is the size of a pointer (4 bytes on a 32-bit
system, or 8 bytes on a 64-bit system).

I will _not_ include the SCM word itself in the accounting of the size
of the object represented by that SCM.  That's because for
heap-allocated objects, the SCM is simply a pointer to the object, and
there can be many pointers to the same object.  The cost of SCM words
will instead be included in the cost of the aggregate objects that
contain them.

In this method of accounting, immediate objects cost nothing, because
they are entirely stored within the SCM itself.

Pairs cost 2 words.  Lists cost 2 words per element.  The empty list is
immediate, and therefore costs nothing.

In current 'master', records cost 1 word per field, plus another word,
rounded up to the nearest multiple of 2 words.  In 2.0 and 2.2, they
cost 1 word per field, plus another _2_ words, rounded up

Vectors cost 1 word per element, plus another word, rounded up to the
nearest multiple of 2 words.

Bytevectors cost 4 words plus the data itself, rounded up to the nearest
multiple of 2 words.  As a special case, empty bytevectors cost nothing.

Non-immediate inexact real numbers cost 16 bytes.  Non-immediate complex
numbers cost 24 bytes on 32-bit systems, or 32 bytes on 64-bit systems.

In the common case, strings currently cost 6 words plus the character
data itself, rounded up to the nearest multiple of 2 words.  The
character data costs either 1 byte per character (if all code points are
less than 256, i.e. the string is latin-1), or 4 bytes per character.
(However, I may propose a new string representation for 3.0).

As a special case, the empty string costs nothing.

Finally, I should mention that the ability to share substructure in
lists and pairs make them potentially more memory efficient than
vectors, depending on the use case.  For example, if you need a
structure with 4 fields, and a common operation in your application will
involve copying the structure except for one field which is changed,
then you might be able save memory by using lists, even though a
4-element list requires more memory than a 4-element vector or struct.
If you put the commonly-changed field in the CAR, then you can update
that field by allocating only a single pair (2 words) and sharing the
rest of the structure with the old version.  This is not possible with
vectors or structs, where you must allocate a fresh new object of 5 or 6
words to perform the same operation.

     Regards,
       Mark



  reply	other threads:[~2019-06-12  0:03 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06  9:40 Immediate doubles (up to 2^256) and rationals coming to Guile 3 Mark H Weaver
2019-06-06 12:56 ` Mark H Weaver
2019-06-06 19:37 ` Arne Babenhauserheide
2019-06-12  0:03   ` Mark H Weaver [this message]
2019-06-06 20:09 ` tomas
2019-06-07 19:46 ` Ludovic Courtès
2019-06-09 16:56   ` Mark H Weaver
2019-06-09 17:30     ` Mark H Weaver
2019-06-11  8:39     ` Ludovic Courtès
2019-06-11 10:58       ` Mark H Weaver
2019-06-11 12:21         ` Ludovic Courtès
2019-06-11 11:34       ` Mark H Weaver
2019-06-08  1:13 ` Mark H Weaver
2019-06-08  8:07   ` Arne Babenhauserheide
2019-06-08  9:08     ` Chris Vine
2019-06-08  9:46       ` Arne Babenhauserheide
2019-06-08 10:24         ` Chris Vine
2019-06-08 13:12       ` Hans Åberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a7en7jpy.fsf@netris.org \
    --to=mhw@netris.org \
    --cc=arne_bab@web.de \
    --cc=guile-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).