From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Stefan Monnier Newsgroups: gmane.emacs.devel Subject: Re: Proposal: block-based vector allocator Date: Fri, 18 May 2012 13:40:24 -0400 Message-ID: References: <4EDDA68B.5050601@yandex.ru> <4FB4AFA4.7020601@yandex.ru> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain X-Trace: dough.gmane.org 1337362844 19021 80.91.229.3 (18 May 2012 17:40:44 GMT) X-Complaints-To: usenet@dough.gmane.org NNTP-Posting-Date: Fri, 18 May 2012 17:40:44 +0000 (UTC) Cc: emacs-devel@gnu.org To: Dmitry Antipov Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Fri May 18 19:40:42 2012 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1SVRAJ-0001Zr-EB for ged-emacs-devel@m.gmane.org; Fri, 18 May 2012 19:40:35 +0200 Original-Received: from localhost ([::1]:56809 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SVRAI-0001J1-MY for ged-emacs-devel@m.gmane.org; Fri, 18 May 2012 13:40:34 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:55118) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SVRAE-0001IV-Ku for emacs-devel@gnu.org; Fri, 18 May 2012 13:40:32 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SVRAC-0002sf-2h for emacs-devel@gnu.org; Fri, 18 May 2012 13:40:30 -0400 Original-Received: from ironport-out.teksavvy.com ([206.248.143.162]:23206) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SVRAB-0002sL-Ss for emacs-devel@gnu.org; Fri, 18 May 2012 13:40:27 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av0EAOMAh09FpZld/2dsb2JhbAA3o0KBCIF1AQEEAScvIwULCw4mEhQYDSSIE6IRi3UBDAIIAQsGAwIHOoNNA4NwBKNjhFg X-IronPort-AV: E=Sophos;i="4.73,1,1325480400"; d="scan'208";a="181014151" Original-Received: from 69-165-153-93.dsl.teksavvy.com (HELO pastel.home) ([69.165.153.93]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 18 May 2012 13:40:25 -0400 Original-Received: by pastel.home (Postfix, from userid 20848) id C2BC759577; Fri, 18 May 2012 13:40:24 -0400 (EDT) In-Reply-To: <4FB4AFA4.7020601@yandex.ru> (Dmitry Antipov's message of "Thu, 17 May 2012 11:58:28 +0400") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 206.248.143.162 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:150558 Archived-At: > IIRC from the previous discussions, you're suggesting rounding-up allocator > for the sake of simplicity and better speed; Actually, the main reason is because it has proved over the years to be a good choice, for example in terms of fragmentation. > I'm still voting for exact-fit allocator with splitting/coalescing, > for the sake of best possible memory utilization. It's not at all clear that the memory utilization will be better in practice, for example because of fragmentation or because of the need to keep the "next" field in vectors. > It would be helpful to get a broader discussion on this. At this point, I'm OK with using your approach, so I only want to make sure the code is as simple and clean as possible. Here are some questions and comments about your code: > +#define VECTOR_BLOCK_SIZE (4096) No need to put parens, right? > + roundup_size = 8 This deserves a comment explaining the choice of 8 and describing what this constant is used for. > +/* One pointer is malloc overhead, others are BUMP and NEXT fields of struct > + vector_block. Rounding helps to maintain alignment constraints. */ > +#define VECTOR_BLOCK_BYTES \ > + (VECTOR_BLOCK_SIZE - roundup (3 * sizeof (void *), roundup_size)) Why do we care about malloc overhead? > +/* Free lists for all possible block-allocated vectors. */ > + > +#define VECTOR_FREE_LISTS ((VECTOR_BLOCK_BYTES / roundup_size) + 1) Some comment here or earlier should make it clear that we have one free list per vector size. Maybe just put this declaration together with the one for "vector_free_lists", after all they go together. And this macro should be called something like VECTOR_MAX_FREE_LIST_INDEX; its current names makes it sound like it returns the free lists (or a pointer to them). > +/* When the vector is on a free list, vectorlike_header.SIZE is set to > + this special value ORed with vector's memory footprint size. */ > + > +#define VECTOR_FREE_LIST_SIZE \ > + (((EMACS_INT) ~0) & ~(ARRAY_MARK_FLAG | PSEUDOVECTOR_FLAG | \ > + (VECTOR_BLOCK_SIZE - 1))) The name doesn't sound right, it's not a real size, so it should be VECTOR_FREE_LIST_FLAG. Also, do we really have to put this info in the `size' field? E.g. for cons-cells we put Vdead in the `car' slot. Of course having it in `size' is not terrible, but `size' is pretty complicated already. BTW, if needed we could add a special case so there is only 1 vector of size 0 and (make-vector 0 ?) always returns that same vector (so it doesn't have to come from a vector_block). > +/* Current vector block. */ > +static struct vector_block *vector_block; Do we actually want a "current vector block"? Here's what I mean by that: how often are we going to allocate from this "current vector block" as opposed to allocating from one of the free lists? It would be simpler, when we allocate a new vector block, to just carve out the vector we need from it and put all the rest directly on the appropriate free list, so there's never such a thing as a "current vector block". That also should lets us get rid of the "bump" field. > +/* Allocate vector from the vector block. */ ^^^ a > + /* Next, check free lists contains larger vectors. */ > + for (index = (nbytes + sizeof (struct Lisp_Vector)) / roundup_size; Add a comment explaining that the "+ sizeof (struct Lisp_Vector)" eliminates the case where the "excess bytes" aren't sufficient to make them into a valid vector. > +/* Return amount of Lisp_Objects can be stored in that vector. */ ^^^ which > + while (block) > + { > + struct Lisp_Vector *free_vectors[VECTORS_PER_BLOCK_MAX]; > + int index, used; You should be able to avoid this two-step "put them in free_vectors, then loop over free_vectors to put them on the free lists": just at the end of the coalescing loop, if we bump into the end of the vector_block check if we're also the first vector in the block, and if so free the whole (otherwise push the vector on the free list). > + if (nbytes > VECTOR_BLOCK_BYTES) > + { > + p = (struct Lisp_Vector *) lisp_malloc (nbytes, MEM_TYPE_VECTORLIKE); > + p->header.next.vector = large_vectors; > + large_vectors = p; > + } > + else > + /* Rounding is for 32-bit code to preserve alignment. */ > + p = allocate_vector_from_block (roundup (nbytes, roundup_size)); Maybe we should only allocate from vector blocks vectors that are smaller than (say) VECTOR_BLOCK_BYTES/2. The reason is that allocating from a vector block incurs an overhead (memory overhead of "roundup (3 * sizeof (void *), roundup_size)", and CPU overhead of setting up the block, allocating the vector out of it, scanning the block, ...). This is shared among all the vectors in the block, so for smallish vectors it's worth the trouble (since we save other overheads) but if the vector is large, there won't be many vectors with which to share that overhead. > + /* P is in the block's allocation range. Scan the block > + up to P and see whether P points to the start of some > + vector which is not on a free list. */ > + while (vector <= (struct Lisp_Vector *) p) > + { > + if ((vector->header.size & VECTOR_FREE_LIST_SIZE) > + == VECTOR_FREE_LIST_SIZE) > + vector = ADVANCE (vector, (vector->header.size & > + (VECTOR_BLOCK_SIZE - 1))); > + else if (vector == p) > + return 1; > + else > + vector = ADVANCE (vector, vector->header.next.nbytes); > + } > + } This deserves a FIXME mentioning that this could be a significant overhead. I don't see how we can get rid of this overhead without using a different allocation technique, but it's always good to make this choice explicit. Stefan