unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
* malloc and alignment
@ 2003-06-16 14:38 Stefan Monnier
  2003-06-16 15:15 ` David Kastrup
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Stefan Monnier @ 2003-06-16 14:38 UTC (permalink / raw



I'd like to try and get rid of the markbit on Lisp_Object values.
The idea is to scrape off a few more bits (I'd like to bump
maxint to 512MB instead of 128MB).  And I'd also like to represent
floats in 8bytes rather than 12.

So the idea is to use a separate bitmap, but in order to quickly
find the markbit corresponding to an object, we need to be able
to quickly find the bitmap for an arbitrary pointer.

The typical solution for this is to allocate things in chunks of
<power-of-2 here> such that you can get the base of the bitmap
by doing `ptrvalue & bitmask'.

But for this, I need to make sure things are properly aligned.
Currently alloc.c calls malloc to allocate blocks of about 8KB
which are then split into 1000 cons cells or floats or whatever.
So all we need is to make sure that those 8KB are properly aligned.
But how can I do that ?
It seems the only reliable way is to malloc 16KB and then waste
the 8KB that happen to be before/after the aligned part.

Should I just wrap malloc in my own malloc_aligned such that
malloc_aligned calls malloc to allocate chunks of say 1MB
and then gives them out in aligned packets of 8KB ?

Has anybody played with things like that and could share his/her
experience ?


	Stefan


PS: Of course, an alternative is to split my 8KB blocks into aligned
    subblocks of 256B: 31 cons cells (or floats) plus one 32bit integer
    for the bitmask.  This makes the bitmask access a tiny bit simpler
    since the bitmap is made of a single integer rather than an array
    of ints.  On another hand, it will waste a lot of cache space
    unless your cache lines are 256B long.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 14:38 malloc and alignment Stefan Monnier
@ 2003-06-16 15:15 ` David Kastrup
  2003-06-16 15:39   ` Stefan Monnier
  2003-06-16 15:59   ` Stefan Monnier
  2003-06-16 22:35 ` Miles Bader
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 15+ messages in thread
From: David Kastrup @ 2003-06-16 15:15 UTC (permalink / raw
  Cc: emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:

> I'd like to try and get rid of the markbit on Lisp_Object values.
> The idea is to scrape off a few more bits (I'd like to bump maxint
> to 512MB instead of 128MB).  And I'd also like to represent floats
> in 8bytes rather than 12.

If you are scraping off bits, why not instead _add_ one, but reserve
_all_ the additional tag values gained in that way for representing
integers?

Something like "tag byte values of 0 to 127 all represent integers,
tag byte values from 128 to 256 represent all other data types".

That way integers would go up to 1GB without pain.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 15:15 ` David Kastrup
@ 2003-06-16 15:39   ` Stefan Monnier
  2003-06-16 15:59   ` Stefan Monnier
  1 sibling, 0 replies; 15+ messages in thread
From: Stefan Monnier @ 2003-06-16 15:39 UTC (permalink / raw
  Cc: Stefan Monnier

> > I'd like to try and get rid of the markbit on Lisp_Object values.
> > The idea is to scrape off a few more bits (I'd like to bump maxint
> > to 512MB instead of 128MB).  And I'd also like to represent floats
> > in 8bytes rather than 12.
> 
> If you are scraping off bits, why not instead _add_ one, but reserve
> _all_ the additional tag values gained in that way for representing
> integers?
> 
> Something like "tag byte values of 0 to 127 all represent integers,
> tag byte values from 128 to 256 represent all other data types".
> 
> That way integers would go up to 1GB without pain.

We already eat 4bits of each pointer (and what's worse, we eat the topmost
ones, and leave the unused 2 (or 3) lower bits untouched).  If we eat more
we'll bump even more quickly into the problem where Emacs can't allocate
any more data because it has eaten up its address space (which is
currently only 256MB large and would be reduce accordingly).

We try to minimize the problem by allocating buffer-text separately,
so buffer-text shouldn't eat into this address space, but it is still
an important limitation (buffers eat up memory not only in terms
of buffer-text but also things like text-properties, ...).

Instead, I'd like to reduce those 4bits down to 3 (it's currently 3bits
of tag plus one bit of marking, so removing the markbit will bring it
down to 3) and maybe also make use of the 4-byte alignment of objects
so we don't waste the lower 2bits (we can just shift the pointer by
2bits before adding our 3bits tag, so we only eat up the top bit.
Or we can even impose an 8byte alignment, don't eat any pointer
space at all, and just place the tag on the lower bits).
Anyway, experimentation is needed first.


	Stefan "who always wished that 64bit architectures would use
                bit-addressing rather than byte-addressing, such that
                a natural 64bit alignment gives you 6 bits of tag
                space to play with"


PS: The only reason why the markbit exists is because cons cells
    are made of 2 words only (not extra header), so when we mark them,
    we use one of the bits of the `car', but since any value can be
    in the `car', that means that this bit of a Lisp_Object value
    cannot be used for anything else.  The markbit is not used
    to mark the Lisp_Object referenced, but to mark the object in which
    the Lisp_Object value is placed.
PPS: I.e. in order to get rid of this mark-bit, we only need markbit bitmaps
    for cons-cells.  But I also want to use them for floats in order to have
    a more efficient representation of floats (they are currently represented
    as a 64bit double plus a one-word header of which only one bit is used).

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 15:15 ` David Kastrup
  2003-06-16 15:39   ` Stefan Monnier
@ 2003-06-16 15:59   ` Stefan Monnier
  2003-06-17  4:57     ` Stephen J. Turnbull
  1 sibling, 1 reply; 15+ messages in thread
From: Stefan Monnier @ 2003-06-16 15:59 UTC (permalink / raw
  Cc: emacs-devel

> > I'd like to try and get rid of the markbit on Lisp_Object values.
> > The idea is to scrape off a few more bits (I'd like to bump maxint
> > to 512MB instead of 128MB).  And I'd also like to represent floats
> > in 8bytes rather than 12.
> 
> If you are scraping off bits, why not instead _add_ one, but reserve
> _all_ the additional tag values gained in that way for representing
> integers?

Oh, I also forgot to mention that integers currently have 3bit
of tag and 1bit of markbit, just like the rest and without getting
rid of the markbit (what I'm trying to do), you can't go below a
total of 2bits eaten (1bit of markbit and 1bit of tag).

By getting rid of the markbit, we get the current count on integers
down to 3bits eaten.  We can trivially reduce it further because
only 7 of the 8 tag values are currently used, so we can allocate
the 8th one to int as well, so we are down to 2bit of tags
for ints with only trivial changes to the code.  That's where
my 512MB number comes from.

To bump it up to 1GB, we'd need to indeed extend the tag size on
non-integers somewhat, which might not be that difficult, by
the way, but we'll see about that later.

BTW, someone on the XEmacs list recently complained that he had trouble
opening a 755MB file.  XEmacs supposedly handles up to 1GB buffers,
but his Emacs process went up to 2.1GB (Mule begins by allocating
a destination area of 1510MB because latin-1 has a max-expansion factor
of 2), so I think that a 512MB limit due to integers might not be that bad,
since bumping it to 1GB will just replace this limit with others.


	Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 14:38 malloc and alignment Stefan Monnier
  2003-06-16 15:15 ` David Kastrup
@ 2003-06-16 22:35 ` Miles Bader
  2003-06-16 23:11   ` Stefan Monnier
  2003-06-17  4:48 ` Stephen J. Turnbull
       [not found] ` <E19SJCe-0008Bb-Vp@fencepost.gnu.org>
  3 siblings, 1 reply; 15+ messages in thread
From: Miles Bader @ 2003-06-16 22:35 UTC (permalink / raw
  Cc: emacs-devel

On Mon, Jun 16, 2003 at 10:38:50AM -0400, Stefan Monnier wrote:
> So all we need is to make sure that those 8KB are properly aligned.
> But how can I do that ?
> It seems the only reliable way is to malloc 16KB and then waste
> the 8KB that happen to be before/after the aligned part.
> 
> Should I just wrap malloc in my own malloc_aligned such that
> malloc_aligned calls malloc to allocate chunks of say 1MB
> and then gives them out in aligned packets of 8KB ?

You could use `memalign'....

-Miles
-- 
"Though they may have different meanings, the cries of 'Yeeeee-haw!' and
 'Allahu akbar!' are, in spirit, not actually all that different."

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 22:35 ` Miles Bader
@ 2003-06-16 23:11   ` Stefan Monnier
  2003-06-17  2:25     ` Miles Bader
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier @ 2003-06-16 23:11 UTC (permalink / raw
  Cc: Stefan Monnier

> > So all we need is to make sure that those 8KB are properly aligned.
> > But how can I do that ?
> > It seems the only reliable way is to malloc 16KB and then waste
> > the 8KB that happen to be before/after the aligned part.
> > 
> > Should I just wrap malloc in my own malloc_aligned such that
> > malloc_aligned calls malloc to allocate chunks of say 1MB
> > and then gives them out in aligned packets of 8KB ?
> 
> You could use `memalign'....

How standard is it ?


	Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 23:11   ` Stefan Monnier
@ 2003-06-17  2:25     ` Miles Bader
  0 siblings, 0 replies; 15+ messages in thread
From: Miles Bader @ 2003-06-17  2:25 UTC (permalink / raw
  Cc: emacs-devel

"Stefan Monnier" <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> > You could use `memalign'....
> 
> How standard is it ?

Here's the glibc info page about it.  The traditional `portable' version
is exactly the sort of over-allocate-and-use-the-middle type thing you
mentioned.  I guess the biggest problem is how to _free_ such blocks --
as noted below, glibc can do so, but BSD can't (who knows how old that
info is, though)...

-Miles


File: libc.info,  Node: Aligned Memory Blocks,  Next: Malloc Tunable Parameters,  Prev: Efficiency and Malloc,  Up: Unconstrained Allocation

Allocating Aligned Memory Blocks
................................

   The address of a block returned by `malloc' or `realloc' in the GNU
system is always a multiple of eight (or sixteen on 64-bit systems).
If you need a block whose address is a multiple of a higher power of
two than that, use `memalign', `posix_memalign', or `valloc'.
`memalign' is declared in `malloc.h' and `posix_memalign' is declared
in `stdlib.h'.

   With the GNU library, you can use `free' to free the blocks that
`memalign', `posix_memalign', and `valloc' return.  That does not work
in BSD, however--BSD does not provide any way to free such blocks.

 - Function: void * memalign (size_t BOUNDARY, size_t SIZE)
     The `memalign' function allocates a block of SIZE bytes whose
     address is a multiple of BOUNDARY.  The BOUNDARY must be a power
     of two!  The function `memalign' works by allocating a somewhat
     larger block, and then returning an address within the block that
     is on the specified boundary.

 - Function: int posix_memalign (void **MEMPTR, size_t ALIGNMENT,
          size_t SIZE)
     The `posix_memalign' function is similar to the `memalign'
     function in that it returns a buffer of SIZE bytes aligned to a
     multiple of ALIGNMENT.  But it adds one requirement to the
     parameter ALIGNMENT: the value must be a power of two multiple of
     `sizeof (void *)'.

     If the function succeeds in allocation memory a pointer to the
     allocated memory is returned in `*MEMPTR' and the return value is
     zero.  Otherwise the function returns an error value indicating
     the problem.

     This function was introduced in POSIX 1003.1d.

 - Function: void * valloc (size_t SIZE)
     Using `valloc' is like using `memalign' and passing the page size
     as the value of the second argument.  It is implemented like this:

          void *
          valloc (size_t size)
          {
            return memalign (getpagesize (), size);
          }

     *Note Query Memory Parameters:: for more information about the
     memory subsystem.


-- 
The secret to creativity is knowing how to hide your sources.
  --Albert Einstein

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 14:38 malloc and alignment Stefan Monnier
  2003-06-16 15:15 ` David Kastrup
  2003-06-16 22:35 ` Miles Bader
@ 2003-06-17  4:48 ` Stephen J. Turnbull
       [not found] ` <E19SJCe-0008Bb-Vp@fencepost.gnu.org>
  3 siblings, 0 replies; 15+ messages in thread
From: Stephen J. Turnbull @ 2003-06-17  4:48 UTC (permalink / raw
  Cc: emacs-devel

>>>>> "Stefan" == Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu> writes:

    Stefan> I'd like to try and get rid of the markbit on Lisp_Object
    Stefan> values.  The idea is to scrape off a few more bits (I'd
    Stefan> like to bump maxint to 512MB instead of 128MB).  And I'd
    Stefan> also like to represent floats in 8bytes rather than 12.

Er, have you talked to Kyle Jones <kyle_jones@wonderworks.com>, Martin
Buchholz <martin@xemacs.org>, or Olivier Galibert <galibert@xemacs.org>?
It sounds like your goals are different, but XEmacs went through this
process about 5 years ago (how time flies, make that 7!).  XEmacs has
31-bit integers, but at the cost of conses being indirect objects.  IIRC,
those guys were the ones who designed and implemented the new system.

Actually, if you are happy with character is-a int (up to now, you
have been, but Ken'ichi recently made some noises about a true
character type) you might be able to stuff the cons type in instead of
characters.  I'm not sure how you'd handle marking (maybe you could
use the tagbits in the cdr, or the integer bit?) but that might be a
way to go.  That doesn't help with the floats, unfortunately (or does
it? I guess you're not using platform floats directly?)

Even though the goals are different, those guys would certainly have a
lot to say about possible pitfalls, and maybe have some ideas about
how to extend this to the situation where you want more tagbits than
the "natural" alignment (of a pointer to Lisp_Object).

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-16 15:59   ` Stefan Monnier
@ 2003-06-17  4:57     ` Stephen J. Turnbull
  2003-06-17  7:35       ` David Kastrup
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen J. Turnbull @ 2003-06-17  4:57 UTC (permalink / raw
  Cc: emacs-devel

>>>>> "Stefan" == Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu> writes:

    Stefan> but his Emacs process went up to 2.1GB (Mule begins by
    Stefan> allocating a destination area of 1510MB because latin-1
    Stefan> has a max-expansion factor of 2),

In XEmacs, that's a hypothesis, IIRC, not yet proved; it could just be
a bug in the allocation routines.

    Stefan> so I think that a 512MB limit due to integers might not be
    Stefan> that bad, since bumping it to 1GB will just replace this
    Stefan> limit with others.

FWIW when I proposed swapping 31-bit integers and 30-bit characters so
that we could have a proper native representation of ISO 10646 UCS-4,
it was vetoed on the grounds that the vetoer's coworker edited a
particular (ASCII-only) > 512MB logfile every day.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-17  4:57     ` Stephen J. Turnbull
@ 2003-06-17  7:35       ` David Kastrup
  0 siblings, 0 replies; 15+ messages in thread
From: David Kastrup @ 2003-06-17  7:35 UTC (permalink / raw
  Cc: Stefan Monnier

"Stephen J. Turnbull" <stephen@xemacs.org> writes:

> >>>>> "Stefan" == Stefan Monnier <monnier+gnu/emacs@rum.cs.yale.edu> writes:
> 
>     Stefan> but his Emacs process went up to 2.1GB (Mule begins by
>     Stefan> allocating a destination area of 1510MB because latin-1
>     Stefan> has a max-expansion factor of 2),
> 
> In XEmacs, that's a hypothesis, IIRC, not yet proved; it could just be
> a bug in the allocation routines.
> 
>     Stefan> so I think that a 512MB limit due to integers might not be
>     Stefan> that bad, since bumping it to 1GB will just replace this
>     Stefan> limit with others.
> 
> FWIW when I proposed swapping 31-bit integers and 30-bit characters so
> that we could have a proper native representation of ISO 10646 UCS-4,
> it was vetoed on the grounds that the vetoer's coworker edited a
> particular (ASCII-only) > 512MB logfile every day.

Sounds like transparent big integers might be needed at some time.
There is always going to be someone to cry foul because one bit less
does not cut it for him: after all, it was necessary to remove this
limitation from file systems, too.


-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
       [not found] ` <E19SJCe-0008Bb-Vp@fencepost.gnu.org>
@ 2003-06-24 22:52   ` Stefan Monnier
       [not found]     ` <E19VGK5-0000CQ-F4@fencepost.gnu.org>
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Monnier @ 2003-06-24 22:52 UTC (permalink / raw
  Cc: Stefan Monnier

>     The typical solution for this is to allocate things in chunks of
>     <power-of-2 here> such that you can get the base of the bitmap
>     by doing `ptrvalue & bitmask'.
> 
> I don't know this technique--could you explain it?

I must have been unclear.  Rather than keep the markbit s part of the object,
keep a separate bitmap.  In order to find the bit in the bitmap for a given
object, you need to find (from the object's pointer) both the base
address of the bitmap and the index of the bit in the bitmap.

This is typically done by allocating an array of objects, of size 2^N bytes,
such that the base of the array can be found by clearing the low-order bits
of the object's pointer.

See my current working code below.

>     It seems the only reliable way is to malloc 16KB and then waste
>     the 8KB that happen to be before/after the aligned part.
> 
> Libraries can have calls to malloc that Emacs does not even know
> about.  Therefore, you would have to do that every time you allocate
> conses or floats, which seems very wasteful.

Indeed, which is why I'd like to use something more efficient.
Right now, my working code just uses twice as much float space and cons space
as needed, which is obviously silly (since the code is writen so as to
make sure cons and floats only need a strict minimum of extra space).
But a good implementation of memalign should work around this problem.

In the mean time,
I'd like to already install part of the patch below: the part that
introduces a new `mark' field in every Lisp_Misc object (the field
is 1-bit wide and does not increase the size of the objects since
it is taken from explicit padding).  Any objection ?


	Stefan


PS: Here is my current code (more or less, it might not compile
    because I have hand-edited it and it might rely on a few minor
    changes in some other file(s)).

What it does:
- put the mark bit of buffers in their `size' field rather than in
  the buffer-name field.
- put the mark bit of Lisp_Misc objects into a new `mark' field.
- put the mark bit of symbols into a new `mark' field.
- put the mark bit of cons and floats into a separate bitmap.
I think the "mark field in Lisp_Misc" should be installed in any case
because it nicely simplifies the code without any space or time impact.
I also think the "use bitmap for floats" should be installed since it
significantly improves the memory efficiency of floats, but it still
requires improvements in the implementation of lisp_align_malloc.


Index: alloc.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/alloc.c,v
retrieving revision 1.303
diff -u -u -b -r1.303 alloc.c
--- alloc.c	15 Jun 2003 21:48:14 -0000	1.303
+++ alloc.c	24 Jun 2003 22:39:17 -0000
@@ -21,6 +21,7 @@
 
 #include <config.h>
 #include <stdio.h>
+#include <limits.h>
 
 #ifdef ALLOC_DEBUG
 #undef INLINE
@@ -568,6 +569,50 @@
   return p;
 }
 
+/* Allocate an aligned block of memory to store Lisp data.
+   It's just like lisp_malloc, except the return value is aligned
+   on a multiple of `alignment'.
+   This works by allocating a larger block and wasting some space
+   by returning not the real base but an "aligned" base address which
+   is internal to the block.
+   In order to be able to free this block, we keep a pointer to its real
+   base in the word just preceding the aligned base.
+
+   FIXME: The current implementation wastes a lot of memory.  We should use
+   `memalign' if the implementation of `memalign' is efficient, or we should
+   allocate larger blocks and do our own special-purpose malloc on them.  */
+static POINTER_TYPE *
+lisp_align_malloc (nbytes, type, alignment)
+     size_t nbytes;
+     enum mem_type type;
+     unsigned int alignment;
+{
+  /* FIXME: This should do all that lisp_malloc does.  */
+  void *base = malloc (nbytes + sizeof (void*) + alignment);
+  /* Find the aligned base.  */
+  void *val = (void*) ALIGN ((EMACS_UINT)base + sizeof (void*), alignment);
+  /* Store the real base in the word just preceding the aligned base.  */
+  ((void**)val)[-1] = base;
+  
+#if GC_MARK_STACK && !defined GC_MALLOC_CHECK
+  if (val && type != MEM_TYPE_NON_LISP)
+    mem_insert (val, (char *) val + nbytes, type);
+#endif
+  return val;
+}
+
+static void
+lisp_align_free (block)
+     POINTER_TYPE *block;
+{
+  BLOCK_INPUT;
+#if GC_MARK_STACK && !defined GC_MALLOC_CHECK
+  mem_delete (mem_find (block));
+#endif
+  /* The real base is stored in the word just before the aligned base.  */
+  free (((void**)block)[-1]);
+  UNBLOCK_INPUT;
+}
 
 /* Like malloc but used for allocating Lisp data.  NBYTES is the
    number of bytes to allocate, TYPE describes the intended use of the
@@ -947,6 +992,7 @@
   consing_since_gc += sizeof (struct interval);
   intervals_consed++;
   RESET_INTERVAL (val);
+  val->mark = 0;
   return val;
 }
 
@@ -958,10 +1004,9 @@
      register INTERVAL i;
      Lisp_Object dummy;
 {
-  if (XMARKBIT (i->plist))
-    abort ();
+  eassert (!i->mark);		/* Intervals are never shared.  */
+  i->mark = 1;
   mark_object (&i->plist);
-  XMARK (i->plist);
 }
 
 
@@ -976,10 +1021,6 @@
      function is always called through the MARK_INTERVAL_TREE macro,
      which takes care of that.  */
 
-  /* XMARK expands to an assignment; the LHS of an assignment can't be
-     a cast.  */
-  XMARK (tree->up.obj);
-
   traverse_intervals_noorder (tree, mark_interval, Qnil);
 }
 
@@ -988,23 +1029,15 @@
 
 #define MARK_INTERVAL_TREE(i)				\
   do {							\
-    if (!NULL_INTERVAL_P (i)				\
-	&& ! XMARKBIT (i->up.obj))			\
+    if (!NULL_INTERVAL_P (i) && !i->mark)		\
       mark_interval_tree (i);				\
   } while (0)
 
 
-/* The oddity in the call to XUNMARK is necessary because XUNMARK
-   expands to an assignment to its argument, and most C compilers
-   don't support casts on the left operand of `='.  */
-
 #define UNMARK_BALANCE_INTERVALS(i)			\
   do {							\
    if (! NULL_INTERVAL_P (i))				\
-     {							\
-       XUNMARK ((i)->up.obj);				\
        (i) = balance_intervals (i);			\
-     }							\
   } while (0)
 
 \f
@@ -1912,17 +1945,70 @@
 
    Each float_block is just under 1020 bytes long, since malloc really
    allocates in units of powers of two and uses 4 bytes for its own
-   overhead. */
+   overhead.
+   Which `malloc' does that, exactly ?  -stef  */
+
+#define FLOAT_BLOCK_ALIGN 1024
+#define FLOAT_BLOCK_BYTES FLOAT_BLOCK_ALIGN - 4
 
 #define FLOAT_BLOCK_SIZE \
-  ((1020 - sizeof (struct float_block *)) / sizeof (struct Lisp_Float))
+  (((FLOAT_BLOCK_BYTES - sizeof (struct float_block *)) * CHAR_BIT) \
+   / (sizeof (struct Lisp_Float) * CHAR_BIT + 1))
+
+#define GETMARKBIT(block,n)					\
+  (((block)->markbits[(n) / (sizeof(int) * CHAR_BIT)]	\
+    >> ((n) % (sizeof(int) * CHAR_BIT)))			\
+   & 1)
+
+#define SETMARKBIT(block,n)					\
+  (block)->markbits[(n) / (sizeof(int) * CHAR_BIT)]	\
+  |= 1 << ((n) % (sizeof(int) * CHAR_BIT))
+
+#define UNSETMARKBIT(block,n)					\
+  (block)->markbits[(n) / (sizeof(int) * CHAR_BIT)]	\
+  &= ~(1 << ((n) % (sizeof(int) * CHAR_BIT)))
+
+#define FLOAT_BLOCK(fptr) \
+  ((struct float_block *)(((EMACS_UINT)(fptr)) & ~(FLOAT_BLOCK_ALIGN - 1)))
+
+#define FLOAT_INDEX(fptr) \
+  ((((EMACS_UINT)(fptr)) & (FLOAT_BLOCK_ALIGN - 1)) / sizeof (struct Lisp_Float))
 
 struct float_block
 {
-  struct float_block *next;
+  /* Place `floats' at the beginning, to ease up FLOAT_INDEX's job.  */
   struct Lisp_Float floats[FLOAT_BLOCK_SIZE];
+  int markbits[1 + FLOAT_BLOCK_SIZE / (sizeof(int) * CHAR_BIT)];
+  struct float_block *next;
 };
 
+#ifndef ENABLE_CHECKING
+
+# define FLOAT_MARKED_P(fptr) \
+  GETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr)))
+
+# define FLOAT_MARK(fptr) \
+  SETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr)))
+
+# define FLOAT_UNMARK(fptr) \
+  UNSETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr)))
+
+#else
+
+static int float_marked_p (struct Lisp_Float *fptr)
+{ return GETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr))); }
+# define FLOAT_MARKED_P(fptr) float_marked_p (fptr)
+
+static void float_mark (struct Lisp_Float *fptr)
+{ SETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr))); }
+# define FLOAT_MARK(fptr) float_mark (fptr)
+
+static void float_unmark (struct Lisp_Float *fptr)
+{ UNSETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr))); }
+# define FLOAT_UNMARK(fptr) float_unmark (fptr)
+
+#endif
+
 /* Current float_block.  */
 
 struct float_block *float_block;
@@ -1945,10 +2031,12 @@
 void
 init_float ()
 {
-  float_block = (struct float_block *) lisp_malloc (sizeof *float_block,
-						    MEM_TYPE_FLOAT);
+  float_block = (struct float_block *) lisp_align_malloc (sizeof *float_block,
+							  MEM_TYPE_FLOAT,
+							  FLOAT_BLOCK_ALIGN);
   float_block->next = 0;
   bzero ((char *) float_block->floats, sizeof float_block->floats);
+  bzero ((char *) float_block->markbits, sizeof float_block->markbits);
   float_block_index = 0;
   float_free_list = 0;
   n_float_blocks = 1;
@@ -1962,9 +2050,6 @@
      struct Lisp_Float *ptr;
 {
   *(struct Lisp_Float **)&ptr->data = float_free_list;
-#if GC_MARK_STACK
-  ptr->type = Vdead;
-#endif
   float_free_list = ptr;
 }
 
@@ -1990,8 +2075,9 @@
 	{
 	  register struct float_block *new;
 
-	  new = (struct float_block *) lisp_malloc (sizeof *new,
-						    MEM_TYPE_FLOAT);
+	  new = (struct float_block *) lisp_align_malloc (sizeof *new,
+							  MEM_TYPE_FLOAT,
+							  FLOAT_BLOCK_ALIGN);
 	  new->next = float_block;
 	  float_block = new;
 	  float_block_index = 0;
@@ -2001,7 +2087,7 @@
     }
 
   XFLOAT_DATA (val) = float_value;
-  XSETFASTINT (XFLOAT (val)->type, 0);	/* bug chasing -wsr */
+  FLOAT_UNMARK (XFLOAT (val));
   consing_since_gc += sizeof (struct Lisp_Float);
   floats_consed++;
   return val;
@@ -2022,15 +2108,54 @@
    since malloc really allocates in units of powers of two
    and uses 4 bytes for its own overhead. */
 
+#define CONS_BLOCK_ALIGN 1024
+#define CONS_BLOCK_BYTES CONS_BLOCK_ALIGN - 4
+
 #define CONS_BLOCK_SIZE \
-  ((1020 - sizeof (struct cons_block *)) / sizeof (struct Lisp_Cons))
+  (((CONS_BLOCK_BYTES - sizeof (struct cons_block *)) * CHAR_BIT) \
+   / (sizeof (struct Lisp_Cons) * CHAR_BIT + 1))
+
+#define CONS_BLOCK(fptr) \
+  ((struct cons_block *)(((EMACS_UINT)(fptr)) & ~(CONS_BLOCK_ALIGN - 1)))
+
+#define CONS_INDEX(fptr) \
+  ((((EMACS_UINT)(fptr)) & (CONS_BLOCK_ALIGN - 1)) / sizeof (struct Lisp_Cons))
 
 struct cons_block
 {
-  struct cons_block *next;
+  /* Place `conses' at the beginning, to ease up CONS_INDEX's job.  */
   struct Lisp_Cons conses[CONS_BLOCK_SIZE];
+  int markbits[1 + CONS_BLOCK_SIZE / (sizeof(int) * CHAR_BIT)];
+  struct cons_block *next;
 };
 
+#ifndef ENABLE_CHECKING
+
+# define CONS_MARKED_P(fptr) \
+  GETMARKBIT (CONS_BLOCK (fptr), CONS_INDEX ((fptr)))
+
+# define CONS_MARK(fptr) \
+  SETMARKBIT (CONS_BLOCK (fptr), CONS_INDEX ((fptr)))
+
+# define CONS_UNMARK(fptr) \
+  UNSETMARKBIT (CONS_BLOCK (fptr), CONS_INDEX ((fptr)))
+
+#else
+
+static int cons_marked_p (struct Lisp_Cons *fptr)
+{ return GETMARKBIT (CONS_BLOCK (fptr), CONS_INDEX ((fptr))); }
+# define CONS_MARKED_P(fptr) cons_marked_p (fptr)
+
+static void cons_mark (struct Lisp_Cons *fptr)
+{ SETMARKBIT (CONS_BLOCK (fptr), CONS_INDEX ((fptr))); }
+# define CONS_MARK(fptr) cons_mark (fptr)
+
+static void cons_unmark (struct Lisp_Cons *fptr)
+{ UNSETMARKBIT (CONS_BLOCK (fptr), CONS_INDEX ((fptr))); }
+# define CONS_UNMARK(fptr) cons_unmark (fptr)
+
+#endif
+
 /* Current cons_block.  */
 
 struct cons_block *cons_block;
@@ -2053,10 +2178,12 @@
 void
 init_cons ()
 {
-  cons_block = (struct cons_block *) lisp_malloc (sizeof *cons_block,
-						  MEM_TYPE_CONS);
+  cons_block = (struct cons_block *) lisp_align_malloc (sizeof *cons_block,
+							MEM_TYPE_CONS,
+							CONS_BLOCK_ALIGN);
   cons_block->next = 0;
   bzero ((char *) cons_block->conses, sizeof cons_block->conses);
+  bzero ((char *) cons_block->markbits, sizeof cons_block->markbits);
   cons_block_index = 0;
   cons_free_list = 0;
   n_cons_blocks = 1;
@@ -2096,8 +2223,9 @@
       if (cons_block_index == CONS_BLOCK_SIZE)
 	{
 	  register struct cons_block *new;
-	  new = (struct cons_block *) lisp_malloc (sizeof *new,
-						   MEM_TYPE_CONS);
+	  new = (struct cons_block *) lisp_align_malloc (sizeof *new,
+							 MEM_TYPE_CONS,
+							 CONS_BLOCK_ALIGN);
 	  new->next = cons_block;
 	  cons_block = new;
 	  cons_block_index = 0;
@@ -2108,6 +2236,7 @@
 
   XSETCAR (val, car);
   XSETCDR (val, cdr);
+  CONS_UNMARK (XCONS (val));
   consing_since_gc += sizeof (struct Lisp_Cons);
   cons_cells_consed++;
   return val;
@@ -2563,6 +2692,7 @@
     }
 
   p = XSYMBOL (val);
+  p->mark = 0;
   p->xname = name;
   p->plist = Qnil;
   p->value = Qunbound;
@@ -2644,6 +2774,7 @@
 
   consing_since_gc += sizeof (union Lisp_Misc);
   misc_objects_consed++;
+  XMARKER (val)->mark = 0;
   return val;
 }
 
@@ -3247,14 +3378,12 @@
       struct float_block *b = (struct float_block *) m->start;
       int offset = (char *) p - (char *) &b->floats[0];
 
-      /* P must point to the start of a Lisp_Float, not be
-	 one of the unused cells in the current float block,
-	 and not be on the free-list.  */
+      /* P must point to the start of a Lisp_Float and not be
+	 one of the unused cells in the current float block.  */
       return (offset >= 0
 	      && offset % sizeof b->floats[0] == 0
 	      && (b != float_block
-		  || offset / sizeof b->floats[0] < float_block_index)
-	      && !EQ (((struct Lisp_Float *) p)->type, Vdead));
+		  || offset / sizeof b->floats[0] < float_block_index));
     }
   else
     return 0;
@@ -3392,18 +3521,15 @@
 	  break;
 
 	case Lisp_Cons:
-	  mark_p = (live_cons_p (m, po)
-		    && !XMARKBIT (XCONS (obj)->car));
+	  mark_p = (live_cons_p (m, po) && !CONS_MARKED_P (XCONS (obj)));
 	  break;
 
 	case Lisp_Symbol:
-	  mark_p = (live_symbol_p (m, po)
-		    && !XMARKBIT (XSYMBOL (obj)->plist));
+	  mark_p = (live_symbol_p (m, po) && !XSYMBOL (obj)->mark);
 	  break;
 
 	case Lisp_Float:
-	  mark_p = (live_float_p (m, po)
-		    && !XMARKBIT (XFLOAT (obj)->type));
+	  mark_p = (live_float_p (m, po) && !FLOAT_MARKED_P (XFLOAT (obj)));
 	  break;
 
 	case Lisp_Vectorlike:
@@ -3414,28 +3540,12 @@
 	    mark_p = (!GC_SUBRP (obj)
 		      && !(XVECTOR (obj)->size & ARRAY_MARK_FLAG));
 	  else if (live_buffer_p (m, po))
-	    mark_p = GC_BUFFERP (obj) && !XMARKBIT (XBUFFER (obj)->name);
+	    mark_p = GC_BUFFERP (obj)
+	      && !(XBUFFER (obj)->size & ARRAY_MARK_FLAG);
 	  break;
 
 	case Lisp_Misc:
-	  if (live_misc_p (m, po))
-	    {
-	      switch (XMISCTYPE (obj))
-		{
-		case Lisp_Misc_Marker:
-		  mark_p = !XMARKBIT (XMARKER (obj)->chain);
-		  break;
-
-		case Lisp_Misc_Buffer_Local_Value:
-		case Lisp_Misc_Some_Buffer_Local_Value:
-		  mark_p = !XMARKBIT (XBUFFER_LOCAL_VALUE (obj)->realvalue);
-		  break;
-
-		case Lisp_Misc_Overlay:
-		  mark_p = !XMARKBIT (XOVERLAY (obj)->plist);
-		  break;
-		}
-	    }
+	  mark_p = (live_misc_p (m, po) && !XMARKER (obj)->mark);
 	  break;
 
 	case Lisp_Int:
@@ -3483,13 +3593,12 @@
 
 	case MEM_TYPE_BUFFER:
 	  if (live_buffer_p (m, p)
-	      && !XMARKBIT (((struct buffer *) p)->name))
+	      && !(((struct buffer *)p)->size & ARRAY_MARK_FLAG))
 	    XSETVECTOR (obj, p);
 	  break;
 
 	case MEM_TYPE_CONS:
-	  if (live_cons_p (m, p)
-	      && !XMARKBIT (((struct Lisp_Cons *) p)->car))
+	  if (live_cons_p (m, p) && !CONS_MARKED_P ((struct Lisp_Cons *) p))
 	    XSETCONS (obj, p);
 	  break;
 
@@ -3500,41 +3609,17 @@
 	  break;
 
 	case MEM_TYPE_MISC:
-	  if (live_misc_p (m, p))
-	    {
-	      Lisp_Object tem;
-	      XSETMISC (tem, p);
-
-	      switch (XMISCTYPE (tem))
-		{
-		case Lisp_Misc_Marker:
-		  if (!XMARKBIT (XMARKER (tem)->chain))
-		    obj = tem;
-		  break;
-
-		case Lisp_Misc_Buffer_Local_Value:
-		case Lisp_Misc_Some_Buffer_Local_Value:
-		  if (!XMARKBIT (XBUFFER_LOCAL_VALUE (tem)->realvalue))
-		    obj = tem;
-		  break;
-
-		case Lisp_Misc_Overlay:
-		  if (!XMARKBIT (XOVERLAY (tem)->plist))
-		    obj = tem;
-		  break;
-		}
-	    }
+	  if (live_misc_p (m, p) && !((struct Lisp_Free *) p)->mark)
+	    XSETMISC (obj, p);
 	  break;
 
 	case MEM_TYPE_SYMBOL:
-	  if (live_symbol_p (m, p)
-	      && !XMARKBIT (((struct Lisp_Symbol *) p)->plist))
+	  if (live_symbol_p (m, p) && !((struct Lisp_Symbol *) p)->mark)
 	    XSETSYMBOL (obj, p);
 	  break;
 
 	case MEM_TYPE_FLOAT:
-	  if (live_float_p (m, p)
-	      && !XMARKBIT (((struct Lisp_Float *) p)->type))
+	  if (live_float_p (m, p) && !FLOAT_MARKED_P (p))
 	    XSETFLOAT (obj, p);
 	  break;
 
@@ -3863,16 +3948,6 @@
   POINTER_TYPE *result;
   size_t alignment = sizeof (EMACS_INT);
 
-  /* Give Lisp_Floats an extra alignment.  */
-  if (type == Lisp_Float)
-    {
-#if defined __GNUC__ && __GNUC__ >= 2
-      alignment = __alignof (struct Lisp_Float);
-#else
-      alignment = sizeof (struct Lisp_Float);
-#endif
-    }
-
  again:
   result = (POINTER_TYPE *) ALIGN ((EMACS_UINT)purebeg + pure_bytes_used, alignment);
   pure_bytes_used = ((char *)result - (char *)purebeg) + size;
@@ -4210,13 +4285,9 @@
     register struct gcpro *tail;
     for (tail = gcprolist; tail; tail = tail->next)
       for (i = 0; i < tail->nvars; i++)
-	if (!XMARKBIT (tail->var[i]))
-	  {
 	    /* Explicit casting prevents compiler warning about
 	       discarding the `volatile' qualifier.  */
 	    mark_object ((Lisp_Object *)&tail->var[i]);
-	    XMARK (tail->var[i]);
-	  }
   }
 #endif
 
@@ -4239,21 +4310,14 @@
     }
   for (backlist = backtrace_list; backlist; backlist = backlist->next)
     {
-      if (!XMARKBIT (*backlist->function))
-	{
 	  mark_object (backlist->function);
-	  XMARK (*backlist->function);
-	}
+
       if (backlist->nargs == UNEVALLED || backlist->nargs == MANY)
 	i = 0;
       else
 	i = backlist->nargs - 1;
       for (; i >= 0; i--)
-	if (!XMARKBIT (backlist->args[i]))
-	  {
 	    mark_object (&backlist->args[i]);
-	    XMARK (backlist->args[i]);
-	  }
     }
   mark_kboards ();
 
@@ -4278,7 +4342,7 @@
 	      {
 		if (GC_CONSP (XCAR (tail))
 		    && GC_MARKERP (XCAR (XCAR (tail)))
-		    && ! XMARKBIT (XMARKER (XCAR (XCAR (tail)))->chain))
+		    && !XMARKER (XCAR (XCAR (tail)))->mark)
 		  {
 		    if (NILP (prev))
 		      nextb->undo_list = tail = XCDR (tail);
@@ -4319,26 +4383,12 @@
      || GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES)
   {
     register struct gcpro *tail;
-
-    for (tail = gcprolist; tail; tail = tail->next)
-      for (i = 0; i < tail->nvars; i++)
-	XUNMARK (tail->var[i]);
   }
 #endif
 
   unmark_byte_stack ();
-  for (backlist = backtrace_list; backlist; backlist = backlist->next)
-    {
-      XUNMARK (*backlist->function);
-      if (backlist->nargs == UNEVALLED || backlist->nargs == MANY)
-	i = 0;
-      else
-	i = backlist->nargs - 1;
-      for (; i >= 0; i--)
-	XUNMARK (backlist->args[i]);
-    }
-  XUNMARK (buffer_defaults.name);
-  XUNMARK (buffer_local_symbols.name);
+  buffer_defaults.size &= ~ARRAY_MARK_FLAG;
+  buffer_local_symbols.size &= ~ARRAY_MARK_FLAG;
 
 #if GC_MARK_STACK == GC_USE_GCPROS_CHECK_ZOMBIES && 0
   dump_zombies ();
@@ -4524,7 +4574,6 @@
  loop:
   obj = *objptr;
  loop2:
-  XUNMARK (obj);
 
   if (PURE_POINTER_P (XPNTR (obj)))
     return;
@@ -4599,7 +4648,7 @@
 
       if (GC_BUFFERP (obj))
 	{
-	  if (!XMARKBIT (XBUFFER (obj)->name))
+	  if (!(XBUFFER (obj)->size & ARRAY_MARK_FLAG))
 	    {
 #ifdef GC_CHECK_MARKED_OBJECTS
 	      if (po != &buffer_defaults && po != &buffer_local_symbols)
@@ -4774,9 +4823,9 @@
 	register struct Lisp_Symbol *ptr = XSYMBOL (obj);
 	struct Lisp_Symbol *ptrx;
 
-	if (XMARKBIT (ptr->plist)) break;
+	if (ptr->mark) break;
 	CHECK_ALLOCATED_AND_LIVE (live_symbol_p);
-	XMARK (ptr->plist);
+	ptr->mark = 1;
 	mark_object ((Lisp_Object *) &ptr->value);
 	mark_object (&ptr->function);
 	mark_object (&ptr->plist);
@@ -4804,22 +4853,16 @@
 
     case Lisp_Misc:
       CHECK_ALLOCATED_AND_LIVE (live_misc_p);
+      if (XMARKER (obj)->mark)
+	break;
+      XMARKER (obj)->mark = 1;
       switch (XMISCTYPE (obj))
 	{
-	case Lisp_Misc_Marker:
-	  XMARK (XMARKER (obj)->chain);
-	  /* DO NOT mark thru the marker's chain.
-	     The buffer's markers chain does not preserve markers from gc;
-	     instead, markers are removed from the chain when freed by gc.  */
-	  break;
-
 	case Lisp_Misc_Buffer_Local_Value:
 	case Lisp_Misc_Some_Buffer_Local_Value:
 	  {
 	    register struct Lisp_Buffer_Local_Value *ptr
 	      = XBUFFER_LOCAL_VALUE (obj);
-	    if (XMARKBIT (ptr->realvalue)) break;
-	    XMARK (ptr->realvalue);
 	    /* If the cdr is nil, avoid recursion for the car.  */
 	    if (EQ (ptr->cdr, Qnil))
 	      {
@@ -4833,6 +4876,10 @@
 	    goto loop;
 	  }
 
+	case Lisp_Misc_Marker:
+	  /* DO NOT mark thru the marker's chain.
+	     The buffer's markers chain does not preserve markers from gc;
+	     instead, markers are removed from the chain when freed by gc.  */
 	case Lisp_Misc_Intfwd:
 	case Lisp_Misc_Boolfwd:
 	case Lisp_Misc_Objfwd:
@@ -4847,15 +4894,11 @@
 	case Lisp_Misc_Overlay:
 	  {
 	    struct Lisp_Overlay *ptr = XOVERLAY (obj);
-	    if (!XMARKBIT (ptr->plist))
-	      {
-		XMARK (ptr->plist);
 		mark_object (&ptr->start);
 		mark_object (&ptr->end);
 		objptr = &ptr->plist;
 		goto loop;
 	      }
-	  }
 	  break;
 
 	default:
@@ -4866,9 +4909,9 @@
     case Lisp_Cons:
       {
 	register struct Lisp_Cons *ptr = XCONS (obj);
-	if (XMARKBIT (ptr->car)) break;
+	if (CONS_MARKED_P (ptr)) break;
 	CHECK_ALLOCATED_AND_LIVE (live_cons_p);
-	XMARK (ptr->car);
+	CONS_MARK (ptr);
 	/* If the cdr is nil, avoid recursion for the car.  */
 	if (EQ (ptr->cdr, Qnil))
 	  {
@@ -4886,7 +4929,7 @@
 
     case Lisp_Float:
       CHECK_ALLOCATED_AND_LIVE (live_float_p);
-      XMARK (XFLOAT (obj)->type);
+      FLOAT_MARK (XFLOAT (obj));
       break;
 
     case Lisp_Int:
@@ -4912,8 +4955,7 @@
   Lisp_Object base_buffer;
 
   /* This is the buffer's markbit */
-  mark_object (&buffer->name);
-  XMARK (buffer->name);
+  buffer->size |= ARRAY_MARK_FLAG;
 
   MARK_INTERVAL_TREE (BUF_INTERVALS (buffer));
 
@@ -4922,18 +4964,21 @@
       Lisp_Object tail;
       tail = buffer->undo_list;
 
+      /* We mark the undo list specially because
+	 its pointers to markers should be weak.  */
+
       while (CONSP (tail))
 	{
 	  register struct Lisp_Cons *ptr = XCONS (tail);
 
-	  if (XMARKBIT (ptr->car))
+	  if (CONS_MARKED_P (ptr))
 	    break;
-	  XMARK (ptr->car);
+	  CONS_MARK (ptr);
 	  if (GC_CONSP (ptr->car)
-	      && ! XMARKBIT (XCAR (ptr->car))
+	      && !CONS_MARKED_P (XCONS (ptr->car))
 	      && GC_MARKERP (XCAR (ptr->car)))
 	    {
-	      XMARK (XCAR_AS_LVALUE (ptr->car));
+	      CONS_MARK (XCONS (ptr->car));
 	      mark_object (&XCDR_AS_LVALUE (ptr->car));
 	    }
 	  else
@@ -4950,13 +4995,13 @@
   else
     mark_object (&buffer->undo_list);
 
-  for (ptr = &buffer->name + 1;
+  for (ptr = &buffer->name;
        (char *)ptr < (char *)buffer + sizeof (struct buffer);
        ptr++)
     mark_object (ptr);
 
   /* If this is an indirect buffer, mark its base buffer.  */
-  if (buffer->base_buffer && !XMARKBIT (buffer->base_buffer->name))
+  if (buffer->base_buffer && !(buffer->base_buffer->size & ARRAY_MARK_FLAG))
     {
       XSETBUFFER (base_buffer, buffer->base_buffer);
       mark_buffer (base_buffer);
@@ -4980,19 +5025,20 @@
       break;
 
     case Lisp_Symbol:
-      survives_p = XMARKBIT (XSYMBOL (obj)->plist);
+      survives_p = XSYMBOL (obj)->mark;
       break;
 
     case Lisp_Misc:
+      /* FIXME: Maybe we should just use obj->mark for all?  */
       switch (XMISCTYPE (obj))
 	{
 	case Lisp_Misc_Marker:
-	  survives_p = XMARKBIT (obj);
+	  survives_p = XMARKER (obj)->mark;
 	  break;
 
 	case Lisp_Misc_Buffer_Local_Value:
 	case Lisp_Misc_Some_Buffer_Local_Value:
-	  survives_p = XMARKBIT (XBUFFER_LOCAL_VALUE (obj)->realvalue);
+	  survives_p = XBUFFER_LOCAL_VALUE (obj)->mark;
 	  break;
 
 	case Lisp_Misc_Intfwd:
@@ -5004,7 +5050,7 @@
 	  break;
 
 	case Lisp_Misc_Overlay:
-	  survives_p = XMARKBIT (XOVERLAY (obj)->plist);
+	  survives_p = XOVERLAY (obj)->mark;
 	  break;
 
 	default:
@@ -5021,7 +5067,7 @@
 
     case Lisp_Vectorlike:
       if (GC_BUFFERP (obj))
-	survives_p = XMARKBIT (XBUFFER (obj)->name);
+	survives_p = XBUFFER (obj)->size & ARRAY_MARK_FLAG;
       else if (GC_SUBRP (obj))
 	survives_p = 1;
       else
@@ -5029,11 +5075,11 @@
       break;
 
     case Lisp_Cons:
-      survives_p = XMARKBIT (XCAR (obj));
+      survives_p = CONS_MARKED_P (XCONS (obj));
       break;
 
     case Lisp_Float:
-      survives_p = XMARKBIT (XFLOAT (obj)->type);
+      survives_p = FLOAT_MARKED_P (XFLOAT (obj));
       break;
 
     default:
@@ -5074,7 +5120,7 @@
 	register int i;
 	int this_free = 0;
 	for (i = 0; i < lim; i++)
-	  if (!XMARKBIT (cblk->conses[i].car))
+	  if (!CONS_MARKED_P (&cblk->conses[i]))
 	    {
 	      this_free++;
 	      *(struct Lisp_Cons **)&cblk->conses[i].cdr = cons_free_list;
@@ -5086,7 +5132,7 @@
 	  else
 	    {
 	      num_used++;
-	      XUNMARK (cblk->conses[i].car);
+	      CONS_UNMARK (&cblk->conses[i]);
 	    }
 	lim = CONS_BLOCK_SIZE;
 	/* If this block contains only free conses and we have already
@@ -5097,7 +5143,7 @@
 	    *cprev = cblk->next;
 	    /* Unhook from the free list.  */
 	    cons_free_list = *(struct Lisp_Cons **) &cblk->conses[0].cdr;
-	    lisp_free (cblk);
+	    lisp_align_free (cblk);
 	    n_cons_blocks--;
 	  }
 	else
@@ -5124,19 +5170,16 @@
 	register int i;
 	int this_free = 0;
 	for (i = 0; i < lim; i++)
-	  if (!XMARKBIT (fblk->floats[i].type))
+	  if (!FLOAT_MARKED_P (&fblk->floats[i]))
 	    {
 	      this_free++;
 	      *(struct Lisp_Float **)&fblk->floats[i].data = float_free_list;
 	      float_free_list = &fblk->floats[i];
-#if GC_MARK_STACK
-	      float_free_list->type = Vdead;
-#endif
 	    }
 	  else
 	    {
 	      num_used++;
-	      XUNMARK (fblk->floats[i].type);
+	      FLOAT_UNMARK (&fblk->floats[i]);
 	    }
 	lim = FLOAT_BLOCK_SIZE;
 	/* If this block contains only free floats and we have already
@@ -5147,7 +5190,7 @@
 	    *fprev = fblk->next;
 	    /* Unhook from the free list.  */
 	    float_free_list = *(struct Lisp_Float **) &fblk->floats[0].data;
-	    lisp_free (fblk);
+	    lisp_align_free (fblk);
 	    n_float_blocks--;
 	  }
 	else
@@ -5176,7 +5219,7 @@
 
 	for (i = 0; i < lim; i++)
 	  {
-	    if (! XMARKBIT (iblk->intervals[i].plist))
+	    if (!iblk->intervals[i].mark)
 	      {
 		SET_INTERVAL_PARENT (&iblk->intervals[i], interval_free_list);
 		interval_free_list = &iblk->intervals[i];
@@ -5185,7 +5228,7 @@
 	    else
 	      {
 		num_used++;
-		XUNMARK (iblk->intervals[i].plist);
+		iblk->intervals[i].mark = 0;
 	      }
 	  }
 	lim = INTERVAL_BLOCK_SIZE;
@@ -5232,7 +5275,7 @@
 	       so we conservatively assume that it is live.  */
 	    int pure_p = PURE_POINTER_P (XSTRING (sym->xname));
 
-	    if (!XMARKBIT (sym->plist) && !pure_p)
+	    if (!sym->mark && !pure_p)
 	      {
 		*(struct Lisp_Symbol **) &sym->value = symbol_free_list;
 		symbol_free_list = sym;
@@ -5246,7 +5289,7 @@
 		++num_used;
 		if (!pure_p)
 		  UNMARK_STRING (XSTRING (sym->xname));
-		XUNMARK (sym->plist);
+		sym->mark = 0;
 	      }
 	  }
 
@@ -5290,29 +5333,7 @@
 
 	for (i = 0; i < lim; i++)
 	  {
-	    Lisp_Object *markword;
-	    switch (mblk->markers[i].u_marker.type)
-	      {
-	      case Lisp_Misc_Marker:
-		markword = &mblk->markers[i].u_marker.chain;
-		break;
-	      case Lisp_Misc_Buffer_Local_Value:
-	      case Lisp_Misc_Some_Buffer_Local_Value:
-		markword = &mblk->markers[i].u_buffer_local_value.realvalue;
-		break;
-	      case Lisp_Misc_Overlay:
-		markword = &mblk->markers[i].u_overlay.plist;
-		break;
-	      case Lisp_Misc_Free:
-		/* If the object was already free, keep it
-		   on the free list.  */
-		markword = (Lisp_Object *) &already_free;
-		break;
-	      default:
-		markword = 0;
-		break;
-	      }
-	    if (markword && !XMARKBIT (*markword))
+	    if (!mblk->markers[i].u_marker.mark)
 	      {
 		Lisp_Object tem;
 		if (mblk->markers[i].u_marker.type == Lisp_Misc_Marker)
@@ -5333,8 +5354,7 @@
 	    else
 	      {
 		num_used++;
-		if (markword)
-		  XUNMARK (*markword);
+		mblk->markers[i].u_marker.mark = 0;
 	      }
 	  }
 	lim = MARKER_BLOCK_SIZE;
@@ -5365,7 +5385,7 @@
     register struct buffer *buffer = all_buffers, *prev = 0, *next;
 
     while (buffer)
-      if (!XMARKBIT (buffer->name))
+      if (!(buffer->size & ARRAY_MARK_FLAG))
 	{
 	  if (prev)
 	    prev->next = buffer->next;
@@ -5377,7 +5397,7 @@
 	}
       else
 	{
-	  XUNMARK (buffer->name);
+	  buffer->size &= ~ARRAY_MARK_FLAG;
 	  UNMARK_BALANCE_INTERVALS (BUF_INTERVALS (buffer));
 	  prev = buffer, buffer = buffer->next;
 	}

Index: lisp.h
===================================================================
RCS file: /cvsroot/emacs/emacs/src/lisp.h,v
retrieving revision 1.456
diff -u -u -b -r1.456 lisp.h
--- lisp.h	23 May 2003 18:14:36 -0000	1.456
+++ lisp.h	24 Jun 2003 22:17:54 -0000
@@ -879,6 +816,8 @@
 
 struct Lisp_Symbol
 {
+  unsigned mark : 1;
+
   /* Non-zero means symbol serves as a variable alias.  The symbol
      holding the real value is found in the value slot.  */
   unsigned indirect_variable : 1;
@@ -1088,7 +1028,8 @@
 struct Lisp_Free
   {
     int type : 16;	/* = Lisp_Misc_Free */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     union Lisp_Misc *chain;
   };
 
@@ -1096,7 +1037,8 @@
 struct Lisp_Marker
 {
   int type : 16;		/* = Lisp_Misc_Marker */
-  int spacer : 15;
+  unsigned mark : 1;
+  int spacer : 14;
   /* 1 means normal insertion at the marker's position
      leaves the marker after the inserted text.  */
   unsigned int insertion_type : 1;
@@ -1123,7 +1065,8 @@
 struct Lisp_Intfwd
   {
     int type : 16;	/* = Lisp_Misc_Intfwd */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     EMACS_INT *intvar;
   };
 
@@ -1134,7 +1077,8 @@
 struct Lisp_Boolfwd
   {
     int type : 16;	/* = Lisp_Misc_Boolfwd */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     int *boolvar;
   };
 
@@ -1145,7 +1089,8 @@
 struct Lisp_Objfwd
   {
     int type : 16;	/* = Lisp_Misc_Objfwd */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     Lisp_Object *objvar;
   };
 
@@ -1154,7 +1099,8 @@
 struct Lisp_Buffer_Objfwd
   {
     int type : 16;	/* = Lisp_Misc_Buffer_Objfwd */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     int offset;
   };
 
@@ -1188,7 +1134,8 @@
   {
     int type : 16;      /* = Lisp_Misc_Buffer_Local_Value
 			   or Lisp_Misc_Some_Buffer_Local_Value */
-    int spacer : 13;
+    unsigned mark : 1;
+    int spacer : 12;
 
     /* 1 means this variable is allowed to have frame-local bindings,
        so check for them when looking for the proper binding.  */
@@ -1224,7 +1171,8 @@
 struct Lisp_Overlay
   {
     int type : 16;	/* = Lisp_Misc_Overlay */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     Lisp_Object start, end, plist;
   };
 
@@ -1233,7 +1181,8 @@
 struct Lisp_Kboard_Objfwd
   {
     int type : 16;	/* = Lisp_Misc_Kboard_Objfwd */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     int offset;
   };
 
@@ -1242,7 +1191,8 @@
 struct Lisp_Save_Value
   {
     int type : 16;	/* = Lisp_Misc_Save_Value */
-    int spacer : 16;
+    unsigned mark : 1;
+    int spacer : 15;
     void *pointer;
     int integer;
   };
@@ -1268,8 +1218,6 @@
 /* Lisp floating point type */
 struct Lisp_Float
   {
-    Lisp_Object type;		/* essentially used for mark-bit
-				   and chaining when on free-list */
 #ifdef HIDE_LISP_IMPLEMENTATION
     double data_;
 #else

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
       [not found]     ` <E19VGK5-0000CQ-F4@fencepost.gnu.org>
@ 2003-06-27 23:17       ` Stefan Monnier
  2003-06-27 23:47         ` Miles Bader
  2003-06-29  2:30         ` Richard Stallman
  0 siblings, 2 replies; 15+ messages in thread
From: Stefan Monnier @ 2003-06-27 23:17 UTC (permalink / raw
  Cc: emacs-devel

>     I must have been unclear.  Rather than keep the markbit s part of the object,
>     keep a separate bitmap.  In order to find the bit in the bitmap for a given
>     object, you need to find (from the object's pointer) both the base
>     address of the bitmap and the index of the bit in the bitmap.
> 
> I understand the problem--what I don't understand is this solution:
> 
>     This is typically done by allocating an array of objects, of size 2^N bytes,
>     such that the base of the array can be found by clearing the low-order bits
>     of the object's pointer.
> 
> I will try to understand it.
> 
>     But a good implementation of memalign should work around this problem.
> 
> I am not convinced the problem can be solved this way.  Anyway,
> memalign is a C library function, and typically needs to be tied
> in with malloc.  We can't expect Emacs to always use our memalign.

Indeed.  And `memalign' is not a standard function anyway, so we
need to provide a fallback implementation anyway.

> The idea of allocating a much larger superblock big enough for 20 of
> these blocks, then distributing the blocks individually, could be a
> good solution.  That only wastes 5% of the space at most.  And if the
> superblock is treated by malloc as a large allocation, it will
> probably always have good enough alignment.  So you could detect that
> case, and use all 20 of the parts instead of just 19 of them.  That way,
> there is usually no waste.

That's what I implemented.  See the patch below.
If memalign is available, then the code can use it.  It seems that glibc
has a pretty good implementation of memalign, so it's worth the effort.

Here is the memory use breakdown for floats on x86 with glibc:

- with current code:
  we malloc (1020) and get 84 Lisp_Floats (of size 12b)
- with new code, malloc:
  we malloc (16384) and get 15 * 125 = 1875 Lisp_Floats (of size 8b)
- with new code, memalign:
  we memalign (16380) and get 16 * 124 = 1984 Lisp_Floats (of size 8b)

In all three cases, malloc (or memalign) itself does not use up much
extra memory (in the case of memalign, it uses up a lot of memory
if we memalign (16384) which is why I only memalign (16380).

The actual number of bytes per float used up in the three cases is:
- 12.14
- 8.74
- 8.26

I'd like to use the same scheme for cons cells (so as to get rid of the
markbit in Lisp_Object), so it's important to stay as close to 8bytes
as possible since that's what cons cells currently use.

>     I'd like to already install part of the patch below: the part that
>     introduces a new `mark' field in every Lisp_Misc object (the field
>     is 1-bit wide and does not increase the size of the objects since
>     it is taken from explicit padding).  Any objection ?
> Ok with me.

Installed, together with a patch that makes buffers use the `size'
field (like other vectorlike objects) rather than the `name' field.


	Stefan


PS: This patch has not been tested.  It's extracted from my code where it
    seems to work fine, but I might have missed a dependency with some other
    local changes.


Index: lisp.h
===================================================================
RCS file: /cvsroot/emacs/emacs/src/lisp.h,v
retrieving revision 1.458
diff -u -u -b -r1.458 lisp.h
--- lisp.h	26 Jun 2003 23:15:08 -0000	1.458
+++ lisp.h	27 Jun 2003 23:12:26 -0000
@@ -1282,8 +1216,6 @@
 /* Lisp floating point type */
 struct Lisp_Float
   {
-    Lisp_Object type;		/* essentially used for mark-bit
-				   and chaining when on free-list */
 #ifdef HIDE_LISP_IMPLEMENTATION
     double data_;
 #else

Index: alloc.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/alloc.c,v
retrieving revision 1.307
diff -u -u -b -r1.307 alloc.c
--- alloc.c	27 Jun 2003 22:54:26 -0000	1.307
+++ alloc.c	27 Jun 2003 23:06:03 -0000
@@ -19,6 +19,7 @@
 
 #include <config.h>
 #include <stdio.h>
+#include <limits.h>
 
 #ifdef ALLOC_DEBUG
 #undef INLINE
@@ -418,8 +424,8 @@
 /* Value is SZ rounded up to the next multiple of ALIGNMENT.
    ALIGNMENT must be a power of 2.  */
 
-#define ALIGN(SZ, ALIGNMENT) \
-  (((SZ) + (ALIGNMENT) - 1) & ~((ALIGNMENT) - 1))
+#define ALIGN(ptr, ALIGNMENT) \
+  ((void*) ((((EMACS_UINT)(ptr)) + (ALIGNMENT) - 1) & ~((ALIGNMENT) - 1)))
 
 
 \f
@@ -635,6 +641,211 @@
   UNBLOCK_INPUT;
 }
 
+/* Allocation of aligned blocks of memory to store Lisp data.              */
+/* The entry point is lisp_align_malloc which returns blocks of at most    */
+/* BLOCK_BYTES and guarantees they are aligned on a BLOCK_ALIGN boundary.  */
+/* Define USE_MEMALIGN if `memalign' can be used (i.e. if it can free).    */
+
+#define BLOCK_ALIGN 1024
+#define BLOCK_BYTES \
+  (BLOCK_ALIGN - sizeof (struct aligned_block *) - ABLOCKS_PADDING)
+
+/* Internal data structures and constants.  */
+
+#ifdef USE_MEMALIGN
+#define IF_MEMALIGN(a,b) a
+#else
+#define IF_MEMALIGN(a,b) b
+#endif
+
+/* Padding to leave at the end of a malloc'd block.  This is to give
+   malloc a chance to minimize the amount of memory wasted to alignment.
+   It should be tuned to the particular malloc library used.
+   The current setting is based on glibc-2.3.2.  */
+#define ABLOCKS_PADDING IF_MEMALIGN (sizeof (void*), 0)
+#define ABLOCKS_SIZE 16
+
+/* An aligned block of memory.  */
+struct ablock
+{
+  union
+  {
+    char payload[BLOCK_BYTES];
+    struct ablock *next_free;
+  } x;
+  /* `abase' is the aligned base of the ablocks.  */
+  /* It is overloaded to hold the virtual `busy' field that counts
+     the number of used ablock in the parent ablocks.
+     The first ablock has the `busy' field, the others have the `abase'
+     field.  To tell the difference, we assume that pointers will have
+     integer values larger than 2 * ABLOCKS_SIZE.  The lowest bit of `busy'
+     is used to tell whether the real base of the parent ablocks is `abase'
+     (if not, the word before the first ablock holds a pointer to the
+     real base).  */
+  struct ablocks *abase;
+  /* The padding of all but the last ablock is unused.  The padding of
+     the last ablock in an ablocks is not allocated.  */
+  char padding[ABLOCKS_PADDING];
+};
+
+/* A bunch of consecutive aligned blocks.  */
+struct ablocks
+{
+  struct ablock blocks[ABLOCKS_SIZE];
+};
+
+/* Size of the block requested with malloc or memalign.  */
+#define ABLOCKS_BYTES (sizeof (struct ablocks) - ABLOCKS_PADDING)
+
+#define ABLOCK_ABASE(block) \
+  (((unsigned long) (block)->abase) <= (1 + 2 * ABLOCKS_SIZE)   \
+   ? (struct ablocks *)(block)					\
+   : (block)->abase)
+
+/* Virtual `busy' field.  */
+#define ABLOCKS_BUSY(abase) ((abase)->blocks[0].abase)
+
+/* Pointer to the (not necessarily aligned) malloc block.  */
+#define ABLOCKS_BASE(abase) \
+  IF_MEMALIGN ((abase),     \
+	       (1 & (int) ABLOCKS_BUSY (abase) ? abase : ((void**)abase)[-1]))
+
+static struct ablock *free_ablock;
+
+/* Allocate an aligned block of nbytes.
+   Alignment is on a multiple of BLOCK_ALIGN and `nbytes' has to be
+   smaller or equal to BLOCK_BYTES.  */
+static POINTER_TYPE *
+lisp_align_malloc (nbytes, type)
+     size_t nbytes;
+     enum mem_type type;
+{
+  void *base, *val;
+  struct ablocks *abase;
+
+  eassert (nbytes <= BLOCK_BYTES);
+
+  BLOCK_INPUT;
+
+#ifdef GC_MALLOC_CHECK
+  allocated_mem_type = type;
+#endif
+
+  if (!free_ablock)
+    {
+      int i, aligned;
+
+#ifdef DOUG_LEA_MALLOC
+      /* Prevent mmap'ing the chunk.  Lisp data may not be mmap'ed
+	 because mapped region contents are not preserved in
+	 a dumped Emacs.  */
+      mallopt (M_MMAP_MAX, 0);
+#endif
+
+      if (IF_MEMALIGN (1,0))
+	abase = base = memalign (BLOCK_ALIGN, ABLOCKS_BYTES);
+      else
+	{
+	  base = malloc (ABLOCKS_BYTES);
+	  abase = ALIGN (base, BLOCK_ALIGN);
+	}
+      aligned = (base == abase);
+      if (!aligned)
+	((void**)abase)[-1] = base;
+
+#ifdef DOUG_LEA_MALLOC
+      /* Back to a reasonable maximum of mmap'ed areas.  */
+      mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
+#endif
+
+      /* Initialize the blocks and put them on the free list.
+	 Is `base' was not properly aligned, we can't use the last block.  */
+      for (i = 0; i < (aligned ? ABLOCKS_SIZE : ABLOCKS_SIZE - 1); i++)
+	{
+	  abase->blocks[i].abase = abase;
+	  abase->blocks[i].x.next_free = free_ablock;
+	  free_ablock = &abase->blocks[i];
+	}
+      ABLOCKS_BUSY (abase) = (struct ablocks *) aligned;
+
+      eassert (ABLOCK_ABASE (&abase->blocks[3]) == abase);
+      eassert (ABLOCK_ABASE (&abase->blocks[0]) == abase);
+      eassert (ABLOCKS_BASE (abase) == base);
+      eassert (aligned == (int)ABLOCKS_BUSY (abase));
+    }
+
+  abase = ABLOCK_ABASE (free_ablock);
+  ABLOCKS_BUSY (abase) = (struct ablocks *) (2 + (int) ABLOCKS_BUSY (abase));
+  val = free_ablock;
+  free_ablock = free_ablock->x.next_free;
+
+  /* If the memory just allocated cannot be addressed thru a Lisp
+     object's pointer, and it needs to be,
+     that's equivalent to running out of memory.  */
+  if (val && type != MEM_TYPE_NON_LISP)
+    {
+      Lisp_Object tem;
+      XSETCONS (tem, (char *) val + nbytes - 1);
+      if ((char *) XCONS (tem) != (char *) val + nbytes - 1)
+	{
+	  lisp_malloc_loser = val;
+	  free (val);
+	  val = 0;
+	}
+    }
+
+#if GC_MARK_STACK && !defined GC_MALLOC_CHECK
+  if (val && type != MEM_TYPE_NON_LISP)
+    mem_insert (val, (char *) val + nbytes, type);
+#endif
+
+  UNBLOCK_INPUT;
+  if (!val && nbytes)
+    memory_full ();
+
+  eassert (0 == ((EMACS_UINT)val) % BLOCK_ALIGN);
+  return val;
+}
+
+static void
+lisp_align_free (block)
+     POINTER_TYPE *block;
+{
+  struct ablock *ablock = block;
+  struct ablocks *abase = ABLOCK_ABASE (ablock);
+
+#if GC_MARK_STACK && !defined GC_MALLOC_CHECK
+  mem_delete (mem_find (block));
+#endif
+  /* Put on free list.  */
+  ablock->x.next_free = free_ablock;
+  free_ablock = ablock;
+  /* Update busy count.  */
+  ABLOCKS_BUSY (abase) = (struct ablocks *) (-2 + (int) ABLOCKS_BUSY (abase));
+  
+  if (2 > (int) ABLOCKS_BUSY (abase))
+    { /* All the blocks are free.  */
+      int i = 0, aligned = (int) ABLOCKS_BUSY (abase);
+      struct ablock **tem = &free_ablock;
+      struct ablock *atop = &abase->blocks[aligned ? ABLOCKS_SIZE : ABLOCKS_SIZE - 1];
+
+      while (*tem)
+	{
+	  if (*tem >= (struct ablock *) abase && *tem < atop)
+	    {
+	      i++;
+	      *tem = (*tem)->x.next_free;
+	    }
+	  else
+	    tem = &(*tem)->x.next_free;
+	}
+      eassert ((aligned & 1) == aligned);
+      eassert (i == (aligned ? ABLOCKS_SIZE : ABLOCKS_SIZE - 1));
+      BLOCK_INPUT;
+      free (ABLOCKS_BASE (abase));
+      UNBLOCK_INPUT;
+    }
+}
 
 /* Return a new buffer structure allocated from the heap with
    a call to lisp_malloc.  */
@@ -1899,21 +2110,48 @@
 /* We store float cells inside of float_blocks, allocating a new
    float_block with malloc whenever necessary.  Float cells reclaimed
    by GC are put on a free list to be reallocated before allocating
-   any new float cells from the latest float_block.
-
-   Each float_block is just under 1020 bytes long, since malloc really
-   allocates in units of powers of two and uses 4 bytes for its own
-   overhead. */
+   any new float cells from the latest float_block.  */
 
 #define FLOAT_BLOCK_SIZE \
-  ((1020 - sizeof (struct float_block *)) / sizeof (struct Lisp_Float))
+  (((BLOCK_BYTES - sizeof (struct float_block *)) * CHAR_BIT) \
+   / (sizeof (struct Lisp_Float) * CHAR_BIT + 1))
+
+#define GETMARKBIT(block,n)					\
+  (((block)->markbits[(n) / (sizeof(int) * CHAR_BIT)]	\
+    >> ((n) % (sizeof(int) * CHAR_BIT)))			\
+   & 1)
+
+#define SETMARKBIT(block,n)					\
+  (block)->markbits[(n) / (sizeof(int) * CHAR_BIT)]	\
+  |= 1 << ((n) % (sizeof(int) * CHAR_BIT))
+
+#define UNSETMARKBIT(block,n)					\
+  (block)->markbits[(n) / (sizeof(int) * CHAR_BIT)]	\
+  &= ~(1 << ((n) % (sizeof(int) * CHAR_BIT)))
+
+#define FLOAT_BLOCK(fptr) \
+  ((struct float_block *)(((EMACS_UINT)(fptr)) & ~(BLOCK_ALIGN - 1)))
+
+#define FLOAT_INDEX(fptr) \
+  ((((EMACS_UINT)(fptr)) & (BLOCK_ALIGN - 1)) / sizeof (struct Lisp_Float))
 
 struct float_block
 {
-  struct float_block *next;
+  /* Place `floats' at the beginning, to ease up FLOAT_INDEX's job.  */
   struct Lisp_Float floats[FLOAT_BLOCK_SIZE];
+  int markbits[1 + FLOAT_BLOCK_SIZE / (sizeof(int) * CHAR_BIT)];
+  struct float_block *next;
 };
 
+#define FLOAT_MARKED_P(fptr) \
+  GETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr)))
+
+#define FLOAT_MARK(fptr) \
+  SETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr)))
+
+#define FLOAT_UNMARK(fptr) \
+  UNSETMARKBIT (FLOAT_BLOCK (fptr), FLOAT_INDEX ((fptr)))
+
 /* Current float_block.  */
 
 struct float_block *float_block;
@@ -1936,10 +2174,11 @@
 void
 init_float ()
 {
-  float_block = (struct float_block *) lisp_malloc (sizeof *float_block,
+  float_block = (struct float_block *) lisp_align_malloc (sizeof *float_block,
 						    MEM_TYPE_FLOAT);
   float_block->next = 0;
   bzero ((char *) float_block->floats, sizeof float_block->floats);
+  bzero ((char *) float_block->markbits, sizeof float_block->markbits);
   float_block_index = 0;
   float_free_list = 0;
   n_float_blocks = 1;
@@ -1953,9 +2192,6 @@
      struct Lisp_Float *ptr;
 {
   *(struct Lisp_Float **)&ptr->data = float_free_list;
-#if GC_MARK_STACK
-  ptr->type = Vdead;
-#endif
   float_free_list = ptr;
 }
 
@@ -1981,7 +2218,7 @@
 	{
 	  register struct float_block *new;
 
-	  new = (struct float_block *) lisp_malloc (sizeof *new,
+	  new = (struct float_block *) lisp_align_malloc (sizeof *new,
 						    MEM_TYPE_FLOAT);
 	  new->next = float_block;
 	  float_block = new;
@@ -1992,7 +2229,7 @@
     }
 
   XFLOAT_DATA (val) = float_value;
-  XSETFASTINT (XFLOAT (val)->type, 0);	/* bug chasing -wsr */
+  FLOAT_UNMARK (XFLOAT (val));
   consing_since_gc += sizeof (struct Lisp_Float);
   floats_consed++;
   return val;
@@ -3240,14 +3495,12 @@
       struct float_block *b = (struct float_block *) m->start;
       int offset = (char *) p - (char *) &b->floats[0];
 
-      /* P must point to the start of a Lisp_Float, not be
-	 one of the unused cells in the current float block,
-	 and not be on the free-list.  */
+      /* P must point to the start of a Lisp_Float and not be
+	 one of the unused cells in the current float block.  */
       return (offset >= 0
 	      && offset % sizeof b->floats[0] == 0
 	      && (b != float_block
-		  || offset / sizeof b->floats[0] < float_block_index)
-	      && !EQ (((struct Lisp_Float *) p)->type, Vdead));
+		  || offset / sizeof b->floats[0] < float_block_index));
     }
   else
     return 0;
@@ -3394,8 +3646,7 @@
 	  break;
 
 	case Lisp_Float:
-	  mark_p = (live_float_p (m, po)
-		    && !XMARKBIT (XFLOAT (obj)->type));
+	  mark_p = (live_float_p (m, po) && !FLOAT_MARKED_P (XFLOAT (obj)));
 	  break;
 
 	case Lisp_Vectorlike:
@@ -3483,8 +3734,7 @@
 	  break;
 
 	case MEM_TYPE_FLOAT:
-	  if (live_float_p (m, p)
-	      && !XMARKBIT (((struct Lisp_Float *) p)->type))
+	  if (live_float_p (m, p) && !FLOAT_MARKED_P (p))
 	    XSETFLOAT (obj, p);
 	  break;
 
@@ -3812,7 +4062,7 @@
     }
 
  again:
-  result = (POINTER_TYPE *) ALIGN ((EMACS_UINT)purebeg + pure_bytes_used, alignment);
+  result = ALIGN (purebeg + pure_bytes_used, alignment);
   pure_bytes_used = ((char *)result - (char *)purebeg) + size;
 
   if (pure_bytes_used <= pure_size)
@@ -4825,7 +5032,7 @@
 
     case Lisp_Float:
       CHECK_ALLOCATED_AND_LIVE (live_float_p);
-      XMARK (XFLOAT (obj)->type);
+      FLOAT_MARK (XFLOAT (obj));
       break;
 
     case Lisp_Int:
@@ -4935,7 +5144,7 @@
       break;
 
     case Lisp_Float:
-      survives_p = XMARKBIT (XFLOAT (obj)->type);
+      survives_p = FLOAT_MARKED_P (XFLOAT (obj));
       break;
 
     default:
@@ -5039,19 +5243,16 @@
 	register int i;
 	int this_free = 0;
 	for (i = 0; i < lim; i++)
-	  if (!XMARKBIT (fblk->floats[i].type))
+	  if (!FLOAT_MARKED_P (&fblk->floats[i]))
 	    {
 	      this_free++;
 	      *(struct Lisp_Float **)&fblk->floats[i].data = float_free_list;
 	      float_free_list = &fblk->floats[i];
-#if GC_MARK_STACK
-	      float_free_list->type = Vdead;
-#endif
 	    }
 	  else
 	    {
 	      num_used++;
-	      XUNMARK (fblk->floats[i].type);
+	      FLOAT_UNMARK (&fblk->floats[i]);
 	    }
 	lim = FLOAT_BLOCK_SIZE;
 	/* If this block contains only free floats and we have already
@@ -5062,7 +5263,7 @@
 	    *fprev = fblk->next;
 	    /* Unhook from the free list.  */
 	    float_free_list = *(struct Lisp_Float **) &fblk->floats[0].data;
-	    lisp_free (fblk);
+	    lisp_align_free (fblk);
 	    n_float_blocks--;
 	  }
 	else
@@ -5372,6 +5573,8 @@
   pure_size = PURESIZE;
   pure_bytes_used = 0;
   pure_bytes_used_before_overflow = 0;
+
+  free_ablock = NULL;
 
 #if GC_MARK_STACK || defined GC_MALLOC_CHECK
   mem_init ();

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-27 23:17       ` Stefan Monnier
@ 2003-06-27 23:47         ` Miles Bader
  2003-06-29  2:30         ` Richard Stallman
  1 sibling, 0 replies; 15+ messages in thread
From: Miles Bader @ 2003-06-27 23:47 UTC (permalink / raw
  Cc: emacs-devel

On Fri, Jun 27, 2003 at 07:17:38PM -0400, Stefan Monnier wrote:
> Indeed.  And `memalign' is not a standard function anyway, so we
> need to provide a fallback implementation anyway.

It's `kinda' standard, in the sense that more than glibc implements it; posix
("1003.1d" no clue what that is) apparently specifies a version with
identical functionality but a wackier interface, called `posix_memalign',
which glibc also implements.

-Miles
-- 
[|nurgle|]  ddt- demonic? so quake will have an evil kinda setting? one that 
            will  make every christian in the world foamm at the mouth? 
[iddt]      nurg, that's the goal 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-27 23:17       ` Stefan Monnier
  2003-06-27 23:47         ` Miles Bader
@ 2003-06-29  2:30         ` Richard Stallman
  2003-07-04 20:42           ` Stefan Monnier
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Stallman @ 2003-06-29  2:30 UTC (permalink / raw
  Cc: emacs-devel

    In all three cases, malloc (or memalign) itself does not use up much
    extra memory (in the case of memalign, it uses up a lot of memory
    if we memalign (16384) which is why I only memalign (16380).

    The actual number of bytes per float used up in the three cases is:
    - 12.14
    - 8.74
    - 8.26

With those ratios, I'd say it is efficient enough, for cons cells too.
Please install your changes.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: malloc and alignment
  2003-06-29  2:30         ` Richard Stallman
@ 2003-07-04 20:42           ` Stefan Monnier
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Monnier @ 2003-07-04 20:42 UTC (permalink / raw
  Cc: emacs-devel

>     In all three cases, malloc (or memalign) itself does not use up much
>     extra memory (in the case of memalign, it uses up a lot of memory
>     if we memalign (16384) which is why I only memalign (16380).
> 
>     The actual number of bytes per float used up in the three cases is:
>     - 12.14
>     - 8.74
>     - 8.26
> 
> With those ratios, I'd say it is efficient enough, for cons cells too.
> Please install your changes.

I've installed the change for floats (without using memalign).
I'll install the rest in small installments.


	Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2003-07-04 20:42 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-06-16 14:38 malloc and alignment Stefan Monnier
2003-06-16 15:15 ` David Kastrup
2003-06-16 15:39   ` Stefan Monnier
2003-06-16 15:59   ` Stefan Monnier
2003-06-17  4:57     ` Stephen J. Turnbull
2003-06-17  7:35       ` David Kastrup
2003-06-16 22:35 ` Miles Bader
2003-06-16 23:11   ` Stefan Monnier
2003-06-17  2:25     ` Miles Bader
2003-06-17  4:48 ` Stephen J. Turnbull
     [not found] ` <E19SJCe-0008Bb-Vp@fencepost.gnu.org>
2003-06-24 22:52   ` Stefan Monnier
     [not found]     ` <E19VGK5-0000CQ-F4@fencepost.gnu.org>
2003-06-27 23:17       ` Stefan Monnier
2003-06-27 23:47         ` Miles Bader
2003-06-29  2:30         ` Richard Stallman
2003-07-04 20:42           ` Stefan Monnier

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).