Re: Proposal: block-based vector allocator

unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed

From: Dmitry Antipov <dmantipov@yandex.ru>
To: emacs-devel@gnu.org
Subject: Re: Proposal: block-based vector allocator
Date: Fri, 09 Dec 2011 20:15:30 +0400	[thread overview]
Message-ID: <4EE23422.3050008@yandex.ru> (raw)
In-Reply-To: <jwvwra5n8yi.fsf-monnier+emacs@gnu.org>

[-- Attachment #1: Type: text/plain, Size: 2319 bytes --]

On 12/09/2011 06:05 PM, Stefan Monnier wrote:

> Your test with the *compilation* seems to indicate otherwise, or am
> I missing something?

"Not too much" for byte-compile test. Since *compilation* buffer test
allocates mostly an intervals, mmap() helps it regardless of the method
used for vector allocation.

> Exactly: no need to work hard.

Sure, but throwing away the almost-done work makes no sense, especially
when an alternative work is not even started (if I miss something done,
I would be glad to see it).

> The apparently tightest fit is not necessarily tighter: what happens if
> you use an "order624" to allocate a 600B object, then use the remaining
> 24B for another object, and then free the 600B object?

There will be an attempt to coalesce this 600-bytes vector with adjacent
vectors, and resulting vector will be set up on a free list.

> I also really want to know the answer to this question (third try):
> Why have different classes for 632B and 600B sizes?  You can't fit more
> than 6 vectors of either size in a 4KB block, so you might as well
> divide your "order600" block into 6 chunks of 680B (plus 16B of padding)
> rather than 6 chunks of 600B plus an unused 496B of padding.

I don't understand your question. It's _your_ proposal to divide objects
on classes by rounding the size up to nearest magic value ('standard malloc
algorithm'). My proposal assumes that vector block can contains objects
of any size, and I'm using segregated free lists for ..., 600, 608, 616, 624,
632, ... bytes free vectors.  I believe it was well explained why I rejects
both 'exact-fit vector blocks for each possible vector size' and 'good-fit
vector block for each size class' ('standard malloc') algorithms.

To be concrete, I'm attaching the current state of my proposal again.

Did you ever tried to obtain a histogram of vector sizes? Of course, it
highly depends on the benchmark; but I'm pretty sure that 2..6 - Lisp_Objects
vectors are very popular for the wide range of real workloads. That's
another reason why throwing away even 32 bytes is not so reasonable.

I'm scheduling the global benchmark for the night (32 vs. 64-bit
binaries, GCPROs vs.stack marking, block-based vector allocation vs.
current code, in all possible variants, 8 executables in total,
16 runs for each :-)).

Dmitry

[-- Attachment #2: vector_alloc_final.patch --]
[-- Type: text/plain, Size: 13111 bytes --]

=== modified file 'src/alloc.c'
--- src/alloc.c	2011-11-20 03:07:02 +0000
+++ src/alloc.c	2011-12-09 10:08:32 +0000
@@ -21,7 +21,7 @@
 #include <stdio.h>
 #include <limits.h>		/* For CHAR_BIT.  */
 #include <setjmp.h>
-
+#include <sys/param.h>		/* For roundup.  */
 #include <signal.h>
 
 #ifdef HAVE_PTHREAD
@@ -2896,9 +2896,12 @@
 			   Vector Allocation
  ***********************************************************************/
 
-/* Singly-linked list of all vectors.  */
+#ifndef PAGE_SIZE
+#define PAGE_SIZE 4096
+#endif
 
-static struct Lisp_Vector *all_vectors;
+#define VECTOR_BLOCK_SIZE (PAGE_SIZE - sizeof (void *))
+#define VECTOR_BLOCK_BYTES (VECTOR_BLOCK_SIZE - 2 * sizeof (void *))
 
 /* Handy constants for vectorlike objects.  */
 enum
@@ -2907,6 +2910,296 @@
     word_size = sizeof (Lisp_Object)
   };
 
+/* Free lists for all possible block-allocated vectors.  */
+
+#define VECTOR_FREE_LISTS (VECTOR_BLOCK_BYTES / word_size)
+
+/* Vector blocks are aligned to page size, so it's easy to
+   get vector block pointer from the vector pointer.  */
+
+#define VECTOR_BLOCK(v) ((struct vector_block *) \
+			 ((unsigned long)(v) & ~(PAGE_SIZE - 1)))
+
+/* When the vector is on a free list, vectorlike_header.SIZE is set to
+   this special value ORed with vector's memory footprint size.  */
+
+#define VECTOR_FREE_LIST_SIZE \
+  (((EMACS_INT) ~0) & ~(ARRAY_MARK_FLAG | PSEUDOVECTOR_FLAG | (PAGE_SIZE - 1)))
+
+/* Common shortcut to advance vector pointer over a block data.  */
+
+#define ADVANCE(v,nbytes) \
+  (struct Lisp_Vector *) ((unsigned char *) (v) + (nbytes))
+
+/* Common shortcut to setup vector on a free list.  */
+
+#define SETUP_ON_FREE_LIST(v,nbytes,index) do { \
+  (v)->header.size = VECTOR_FREE_LIST_SIZE | (nbytes); \
+  (index) = (nbytes) / word_size; \
+  eassert ((index) < VECTOR_FREE_LISTS); \
+  (v)->header.next.vector = vector_free_lists[(index)]; \
+  vector_free_lists[(index)] = (v); } while (0)
+
+struct vector_block
+{
+  unsigned char data[VECTOR_BLOCK_BYTES];
+  unsigned char *bump;
+  struct vector_block *next;
+};
+
+/* Current vector block.  */
+
+static struct vector_block *vector_block;
+
+/* Vector free lists, where NTH item points to a chain
+   of free vectors of NTH * WORD_SIZE bytes.  */
+
+static struct Lisp_Vector *vector_free_lists[VECTOR_FREE_LISTS];
+
+/* Singly-linked list of large vectors.  */
+
+static struct Lisp_Vector *large_vectors;
+
+/* Get a new vector block.  */
+
+static struct vector_block *
+allocate_vector_block (void)
+{
+  struct vector_block *block;
+
+#ifdef DOUG_LEA_MALLOC
+  mallopt (M_MMAP_MAX, 0);
+#endif
+
+  /* Although lisp_align_malloc should be used, the default
+     1K granularity looks too small for vector allocation.  */
+  block = memalign (PAGE_SIZE, VECTOR_BLOCK_SIZE);
+  if (!block)
+    memory_full (VECTOR_BLOCK_SIZE);
+
+#ifdef DOUG_LEA_MALLOC
+  mallopt (M_MMAP_MAX, MMAP_MAX_AREAS);
+#endif
+
+#if GC_MARK_STACK && !defined GC_MALLOC_CHECK
+  mem_insert (block->data, block->data + VECTOR_BLOCK_BYTES,
+	      MEM_TYPE_VECTORLIKE);
+#endif
+
+  block->bump = block->data;
+  block->next = vector_block;
+  vector_block = block;
+  return block;
+}
+
+/* Called once to initialize vector allocation.  */
+
+static void
+init_vectors (void)
+{
+  int i;
+
+  large_vectors = NULL;
+  vector_block = allocate_vector_block ();
+
+  for (i = 0; i < VECTOR_FREE_LISTS; i++)
+    vector_free_lists[i] = NULL;
+}
+
+/* Allocate vector from the vector block.  */
+
+static struct Lisp_Vector *
+allocate_vector_from_block (size_t nbytes)
+{
+  struct Lisp_Vector *vector;
+  struct vector_block *block;
+  size_t index, restbytes;
+
+  /* Vector with 0 slots for Lisp_Objects is allowed.  */
+  eassert (nbytes >= sizeof (struct vectorlike_header) &&
+	   nbytes <= VECTOR_BLOCK_BYTES);
+  eassert (nbytes % word_size == 0);
+  eassert (vector_block != NULL);
+
+  /* First, try to allocate from a free list
+     contains vectors of the requested size.  */
+  index = nbytes / word_size;
+  if (vector_free_lists[index])
+    {
+      vector = vector_free_lists[index];
+      vector_free_lists[index] = vector->header.next.vector;
+      vector->header.next.nbytes = nbytes;
+      return vector;
+    }
+
+  /* Next, check free lists contains larger vectors.  */
+  for (index = (nbytes + sizeof (struct Lisp_Vector)) / word_size;
+       index < VECTOR_FREE_LISTS; index++)
+    if (vector_free_lists[index])
+      {
+	struct Lisp_Vector *rest;
+
+	/* This vector is larger than it was requested.  */
+	vector = vector_free_lists[index];
+	vector_free_lists[index] = vector->header.next.vector;
+	vector->header.next.nbytes = nbytes;
+
+	/* Excessive bytes are used for the smaller vector,
+	   which should be set on an appropriate free list.  */
+	restbytes = index * word_size - nbytes;
+	eassert (restbytes % word_size == 0);
+	rest = ADVANCE (vector, nbytes);
+	SETUP_ON_FREE_LIST (rest, restbytes, index);
+	return vector;
+      }
+
+  /* Next, try to allocate from current vector block.  */
+  restbytes = vector_block->data + VECTOR_BLOCK_BYTES - vector_block->bump;
+  if (nbytes > restbytes)
+    {
+      /* Current block's free space isn't enough...  */
+      if (restbytes >= sizeof (struct Lisp_Vector))
+	{
+	  /* ...but it's enough to allocate smaller vector,
+	     which should be set on an appropriate free list.  */
+	  eassert (restbytes % word_size == 0);
+	  vector = (struct Lisp_Vector *) vector_block->bump;
+	  vector_block->bump += restbytes;
+	  SETUP_ON_FREE_LIST (vector, restbytes, index);
+	}
+      /* Anyway, need a new vector block.  */
+      block = allocate_vector_block ();
+    }
+  else
+    /* Allocate from current vector block.  */
+    block = vector_block;
+
+  /* Bump pointer allocation from current vector block.  */
+  vector = (struct Lisp_Vector *) block->bump;
+  block->bump += nbytes;
+  vector->header.next.nbytes = nbytes;
+  return vector;
+}
+
+/* Return amount of Lisp_Objects can be stored in that vector.  */
+
+#define VECTOR_SIZE(v) ((v)->header.size & PSEUDOVECTOR_FLAG ? \
+  (PSEUDOVECTOR_SIZE_MASK & (v)->header.size) : (v)->header.size)
+
+/* Reclaim space used by unmarked vectors.  */
+
+static void
+sweep_vectors (void)
+{
+  struct vector_block *block = vector_block, *bprev = NULL, *bnext;
+  struct Lisp_Vector *vector, *prev, *next;
+  int i;
+
+  total_vector_size = 0;
+  for (i = 0; i < VECTOR_FREE_LISTS; i++)
+    vector_free_lists[i] = NULL;
+
+  /* Looking through vector blocks.  */
+
+  while (block)
+    {
+      struct Lisp_Vector *
+	free_vectors[VECTOR_BLOCK_BYTES / sizeof (struct Lisp_Vector)];
+      int index, used;
+
+      for (index = 0, used = 0, vector = (struct Lisp_Vector *) block->data;
+	   (unsigned char *) vector < block->bump; vector = next)
+	{
+	  if (VECTOR_MARKED_P (vector))
+	    {
+	      VECTOR_UNMARK (vector), used = 1;
+	      total_vector_size += VECTOR_SIZE (vector);
+	      next = ADVANCE (vector, vector->header.next.nbytes);
+	    }
+	  else
+	    {
+	      EMACS_INT nbytes;
+
+	      if ((vector->header.size & VECTOR_FREE_LIST_SIZE) == 
+		  VECTOR_FREE_LIST_SIZE)
+		vector->header.next.nbytes =
+		  vector->header.size & (PAGE_SIZE - 1);
+	      
+	      next = ADVANCE (vector, vector->header.next.nbytes);
+
+	      /* While NEXT is not marked, try to coalesce with VECTOR,
+		 thus making VECTOR of the largest possible size.  */
+
+	      while ((unsigned char *) next < block->bump)
+		{
+		  if (VECTOR_MARKED_P (next))
+		    break;
+		  if ((next->header.size & VECTOR_FREE_LIST_SIZE) == 
+		      VECTOR_FREE_LIST_SIZE)
+		    nbytes = next->header.size & (PAGE_SIZE - 1);
+		  else
+		    nbytes = next->header.next.nbytes;
+		  vector->header.next.nbytes += nbytes;
+		  next = ADVANCE (next, nbytes);
+		}
+	      
+	      /* Collect a pointer to resulting vector.  */
+	      eassert (index < (VECTOR_BLOCK_BYTES /
+				sizeof (struct Lisp_Vector)));
+	      free_vectors[index++] = vector;
+	    }
+	}
+
+      if (used)
+	{
+	  /* Setup collected vectors on a free lists.  */
+	  while (--index >= 0)
+	    {
+	      vector = free_vectors[index];
+	      eassert (vector->header.next.nbytes % word_size == 0);
+	      SETUP_ON_FREE_LIST (vector, vector->header.next.nbytes, i);
+	    }
+	  bprev = block, block = block->next;
+	}
+      else
+	{
+	  /* Nothing used in this block.  */
+	  if (bprev)
+	    bprev->next = block->next;
+	  else
+	    vector_block = block->next;
+	  bnext = block->next;
+#if GC_MARK_STACK && !defined GC_MALLOC_CHECK
+	  mem_delete (mem_find (block->data));
+#endif
+	  free (block);
+	  block = bnext;
+	}
+    }
+
+  /* Sweep large vectors.  */
+
+  vector = large_vectors, prev = NULL;
+
+  while (vector)
+    if (VECTOR_MARKED_P (vector))
+      {
+	VECTOR_UNMARK (vector);
+	total_vector_size += VECTOR_SIZE (vector);
+	prev = vector, vector = vector->header.next.vector;
+      }
+    else
+      {
+	if (prev)
+	  prev->header.next = vector->header.next;
+	else
+	  large_vectors = vector->header.next.vector;
+	next = vector->header.next.vector;
+	lisp_free (vector);
+	vector = next;
+      }
+}
+
 /* Value is a pointer to a newly allocated Lisp_Vector structure
    with room for LEN Lisp_Objects.  */
 
@@ -2929,7 +3222,15 @@
   /* eassert (!handling_signal); */
 
   nbytes = header_size + len * word_size;
-  p = (struct Lisp_Vector *) lisp_malloc (nbytes, MEM_TYPE_VECTORLIKE);
+  if (nbytes > VECTOR_BLOCK_BYTES)
+    {
+      p = (struct Lisp_Vector *) lisp_malloc (nbytes, MEM_TYPE_VECTORLIKE);
+      p->header.next.vector = large_vectors;
+      large_vectors = p;
+    }
+  else
+    /* Rounding is needed for 32-bit code to preserve 8-byte alignment.  */
+    p = allocate_vector_from_block (roundup (nbytes, 8));
 
 #ifdef DOUG_LEA_MALLOC
   /* Back to a reasonable maximum of mmap'ed areas.  */
@@ -2939,9 +3240,6 @@
   consing_since_gc += nbytes;
   vector_cells_consed += len;
 
-  p->header.next.vector = all_vectors;
-  all_vectors = p;
-
   MALLOC_UNBLOCK_INPUT;
 
   return p;
@@ -4016,7 +4314,36 @@
 static inline int
 live_vector_p (struct mem_node *m, void *p)
 {
-  return (p == m->start && m->type == MEM_TYPE_VECTORLIKE);
+  if (m->type == MEM_TYPE_VECTORLIKE)
+    {
+      if (m->end - m->start == VECTOR_BLOCK_BYTES)
+	{
+	  /* This memory node corresponds to a vector block.  */
+	  struct vector_block *block = VECTOR_BLOCK (p);
+	  struct Lisp_Vector *vector = (struct Lisp_Vector *) block->data;
+
+	  /* Scan the whole vector block and see whether P points to
+	     the start of some vector which is not on a free list.  */
+	  while ((unsigned char *) vector < block->bump)
+	    {
+	      if ((vector->header.size & VECTOR_FREE_LIST_SIZE)
+		  == VECTOR_FREE_LIST_SIZE)
+		vector = ADVANCE
+		  (vector, (vector->header.size & (PAGE_SIZE - 1)));
+	      else if (vector == p)
+		return 1;
+	      else
+		vector = ADVANCE (vector, vector->header.next.nbytes);
+	    }
+	}
+      else
+	{
+	  if (p == m->start)
+	    /* This memory node corresponds to a large vector.  */
+	    return 1;
+	}
+    }
+  return 0;
 }
 
 
@@ -6169,33 +6496,7 @@
 	}
   }
 
-  /* Free all unmarked vectors */
-  {
-    register struct Lisp_Vector *vector = all_vectors, *prev = 0, *next;
-    total_vector_size = 0;
-
-    while (vector)
-      if (!VECTOR_MARKED_P (vector))
-	{
-	  if (prev)
-	    prev->header.next = vector->header.next;
-	  else
-	    all_vectors = vector->header.next.vector;
-	  next = vector->header.next.vector;
-	  lisp_free (vector);
-	  vector = next;
-
-	}
-      else
-	{
-	  VECTOR_UNMARK (vector);
-	  if (vector->header.size & PSEUDOVECTOR_FLAG)
-	    total_vector_size += PSEUDOVECTOR_SIZE_MASK & vector->header.size;
-	  else
-	    total_vector_size += vector->header.size;
-	  prev = vector, vector = vector->header.next.vector;
-	}
-  }
+  sweep_vectors ();
 
 #ifdef GC_CHECK_STRING_BYTES
   if (!noninteractive)
@@ -6331,7 +6632,6 @@
   Vdead = make_pure_string ("DEAD", 4, 4, 0);
 #endif
 
-  all_vectors = 0;
   ignore_warnings = 1;
 #ifdef DOUG_LEA_MALLOC
   mallopt (M_TRIM_THRESHOLD, 128*1024); /* trim threshold */
@@ -6344,6 +6644,7 @@
   init_marker ();
   init_float ();
   init_intervals ();
+  init_vectors ();
   init_weak_hash_tables ();
 
 #ifdef REL_ALLOC

=== modified file 'src/lisp.h'
--- src/lisp.h	2011-12-05 00:15:15 +0000
+++ src/lisp.h	2011-12-09 07:19:16 +0000
@@ -868,12 +868,15 @@
 struct vectorlike_header
   {
     EMACS_INT size;
-
-    /* Pointer to the next vector-like object.  It is generally a buffer or a
+    /* When the vector is allocated from a vector block, NBYTES is used
+       if the vector is not on a free list, and VECTOR is used otherwise.
+       For large vector-like objects, BUFFER or VECTOR is used as a pointer
+       to the next vector-like object.  It is generally a buffer or a 
        Lisp_Vector alias, so for convenience it is a union instead of a
        pointer: this way, one can write P->next.vector instead of ((struct
        Lisp_Vector *) P->next).  */
     union {
+      EMACS_INT nbytes;
       struct buffer *buffer;
       struct Lisp_Vector *vector;
     } next;

next prev parent reply	other threads:[~2011-12-09 16:15 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-12-06  5:22 Proposal: block-based vector allocator Dmitry Antipov
2011-12-06 13:35 ` Stefan Monnier
2011-12-06 15:14   ` Dmitry Antipov
2011-12-06 19:39     ` Stefan Monnier
2011-12-07  5:05       ` Dmitry Antipov
2011-12-07 12:27         ` Carsten Mattner
2011-12-07 13:52         ` Stefan Monnier
2011-12-07 16:08           ` Dmitry Antipov
2011-12-07 16:30             ` Stefan Monnier
2011-12-08  8:50               ` Dmitry Antipov
2011-12-08 13:52                 ` Stefan Monnier
2011-12-08  1:53             ` Stephen J. Turnbull
2011-12-08  4:41               ` Dmitry Antipov
2011-12-08 14:10                 ` Stefan Monnier
2011-12-08 16:48                   ` Dmitry Antipov
2011-12-08 19:58                     ` Stefan Monnier
2011-12-09  7:32                       ` Eli Zaretskii
2011-12-09  9:04                       ` Dmitry Antipov
2011-12-09 14:05                         ` Stefan Monnier
2011-12-09 16:15                           ` Dmitry Antipov [this message]
2011-12-09 21:04                             ` Stefan Monnier
2011-12-11 13:18                               ` Dmitry Antipov
2011-12-12  3:07                               ` Dmitry Antipov
2011-12-12 16:24                                 ` Stefan Monnier
2011-12-09  4:44                 ` Stephen J. Turnbull
     [not found] ` <jwvaa1yjs21.fsf-monnier+emacs@gnu.org>
2012-05-17  7:58   ` Dmitry Antipov
2012-05-18 17:40     ` Stefan Monnier
2012-05-21 12:19       ` Dmitry Antipov
2012-05-21 13:02         ` Andreas Schwab
2012-05-21 13:48           ` Dmitry Antipov
2012-05-21 15:07             ` Andreas Schwab
2012-05-22  5:23             ` Ken Raeburn
2012-05-21 20:12         ` Stefan Monnier
2012-05-22  8:24           ` Dmitry Antipov
2012-05-31 13:44           ` Dmitry Antipov
2012-05-31 15:43             ` Paul Eggert
2012-06-01  5:15               ` Dmitry Antipov
2012-06-01  5:44                 ` Paul Eggert
2012-06-01  9:06                   ` Dmitry Antipov
2012-06-01 17:36                     ` Stefan Monnier
2012-06-02  0:32                       ` Paul Eggert
2012-06-02  7:41                         ` Eli Zaretskii
2012-06-03  6:49                           ` Paul Eggert
2012-06-03 14:26                             ` Eli Zaretskii
2012-05-31 21:16             ` Stefan Monnier
2012-06-01  7:34               ` Dmitry Antipov
2012-06-01 17:40                 ` Stefan Monnier
2012-06-01 17:43                 ` Stefan Monnier
2012-06-06  7:02                   ` Dmitry Antipov
2012-06-06 13:13                     ` Stefan Monnier
2012-06-06 14:58                       ` Dmitry Antipov
2012-06-06 19:18                         ` Stefan Monnier
2012-06-07 10:03                           ` Dmitry Antipov
2012-06-07 14:07                             ` Stefan Monnier
2012-06-08  5:50                               ` Dmitry Antipov
2012-06-08  6:17                                 ` Stefan Monnier
2012-06-08  8:49                                   ` Dmitry Antipov
2012-06-08  8:53                                     ` Eli Zaretskii
2012-06-08  9:41                                       ` Eli Zaretskii
2012-06-08 10:00                                         ` Eli Zaretskii
2012-06-08  6:57                                 ` Eli Zaretskii
2012-06-08  6:38                             ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4EE23422.3050008@yandex.ru \
    --to=dmantipov@yandex.ru \
    --cc=emacs-devel@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).