unofficial mirror of emacs-devel@gnu.org 
From: Dmitry Antipov <dmantipov@yandex.ru>
To: Paul Eggert <eggert@cs.ucla.edu>,
	 Stefan Monnier <monnier@IRO.UMontreal.CA>,
	Eli Zaretskii <eliz@gnu.org>
Cc: emacs-devel@gnu.org
Subject: Re: Objects layout and tagging scheme
Date: Fri, 03 Aug 2012 12:17:57 +0400
Message-ID: <501B8935.9030602@yandex.ru>
In-Reply-To: <501AC251.8070203@cs.ucla.edu>

On 08/02/2012 10:09 PM, Paul Eggert wrote:

> For strings, 3 bits are free in the pointers to intervals,
> if we can assume intervals are aligned like other lisp
> objects, which should be possible to arrange.
>
> For vectors the same trick could be played, with next.buffer
> and next.vector.   Presumably we can think of a similar way
> to do it with next.nbytes, since nbytes is limited.

The more I work on the C part of Emacs, the more I hate such
bit tricks. IMHO they're much more obfuscating than all of the xVAR stuff.
Even worse, packing every possible unused bit turns further extensions into
a nightmare. For example, I can follow your suggestions and hack 2 bits
into the free bits of pointers (and add more ugly stuff to Lisp_Cons); next,
someone will ask for 1 more bit (for tricolor marking - why not?), and
the next round of obfuscation will start.

That's why I'm thinking about per-object unified headers. Consider the
following layout: if the LSB (or MSB) of a Lisp_Object is non-zero, the
remaining bits represent a signed integer; otherwise, the remaining bits
represent a pointer to a heap object. Each object has a 4-byte header. In the
header, the mark bit, extra GC information and type information occupy the
same bits for all objects; the rest of the header is object-specific or unused.
For example, a cons header may be

struct cons_header {
   unsigned type : 6;     /* Lisp_Cons */
   unsigned gcmark : 1;
   unsigned gcinfo : 2;
   unsigned unused : 23;
};
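
A minimal sketch of the word encoding described above, assuming the LSB
variant is chosen. The names lisp_word, make_int, int_value, make_ptr and
ptr_value are hypothetical and are not taken from the Emacs sources; the
decoder also assumes that right shift of a negative signed value is
arithmetic, which the C standard leaves implementation-defined:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical word type: one machine word per Lisp value. */
typedef uintptr_t lisp_word;

/* LSB set: the remaining bits hold a signed integer. */
static int word_is_int(lisp_word w)
{
    return (w & 1) != 0;
}

static lisp_word make_int(intptr_t n)
{
    /* Shift in the unsigned domain so negative n is well defined,
       then set the tag bit. */
    return ((uintptr_t)n << 1) | 1;
}

static intptr_t int_value(lisp_word w)
{
    /* Assumption: signed right shift is arithmetic on this compiler. */
    return (intptr_t)w >> 1;
}

/* LSB clear: the word is the address of a heap object, so every
   object only needs to be aligned to at least 2 bytes. */
static lisp_word make_ptr(void *p)
{
    assert(((uintptr_t)p & 1) == 0);
    return (uintptr_t)p;
}

static void *ptr_value(lisp_word w)
{
    return (void *)w;
}
```

With this encoding no type bits live in the word itself; everything beyond
"integer or pointer" has to be read from the object's header.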

Symbol header may be:

struct symbol_header {
   unsigned type : 6;    /* Lisp_Symbol */
   unsigned gcmark : 1;
   unsigned gcinfo : 2;
   unsigned redirect : 3;
   unsigned constant : 2;
   unsigned interned : 2;
   unsigned declared_special : 1;
   unsigned unused : 15;
};

etc. The only disadvantage is increased memory consumption (Lisp_Cons is
the big loser here, plus pure objects, which don't need the gcXXX bits). But,
at this cost, we get at least:

- No USE_LSB_TAG hacks - it's enough to ensure that all heap objects
   are aligned to a word boundary;
- No address space limitation, welcome mmap;
- Native limits for vector and string lengths (size is, really, size,
   without ARRAY_MARK_FLAG, PSEUDOVECTOR_FLAG and so on);
- No separate bitmaps for conses and floats, so no alignment limitations
   for cons and float blocks - say goodbye to lisp_align_malloc;
- Faster marking and faster checks for whether an object is already marked -
   no more switch (XTYPE (obj)), because the type bits are placed identically
   for all objects;
- A simple type system without second-class citizens like the current misc
   family.
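
To illustrate the marking point in the list above, here is a rough sketch
of a mark routine that never dispatches on the object type. It is only an
illustration under stated assumptions: common_header, struct cons,
mark_object and the TYPE_* values are hypothetical names, and the LSB is
assumed to be the integer tag:

```c
#include <assert.h>
#include <stdint.h>

enum obj_type { TYPE_CONS = 1, TYPE_SYMBOL = 2 };  /* hypothetical values */

/* The bits every header shares, in the same position for all types. */
struct common_header {
    unsigned type : 6;
    unsigned gcmark : 1;
    unsigned gcinfo : 2;
    unsigned specific : 23;   /* object-specific or unused */
};

/* A cons with the header as its first member, so a pointer to the
   cons can be read as a pointer to the common header. */
struct cons {
    struct common_header h;
    uintptr_t car, cdr;
};

/* Mark the object unless it is an immediate integer or is already
   marked.  Returns 1 if the object was newly marked, 0 otherwise. */
static int mark_object(uintptr_t w)
{
    if (w & 1)
        return 0;                      /* immediate integer, nothing to mark */
    struct common_header *h = (struct common_header *)w;
    if (h->gcmark)
        return 0;                      /* already visited */
    h->gcmark = 1;
    return 1;
}
```

A real collector would still need h->type to decide how to trace the
object's children; the point here is only that testing and setting the mark
bit is the same code for every object.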

I'm not sure that this layout can coexist with the current one, so it's
a subject for development on a branch; once that is done, we will
have a solid base for further GC improvements.

Dmitry




