From: Tom Lord
Newsgroups: gmane.lisp.guile.devel
Subject: for example
Date: Sat, 3 Aug 2002 19:20:13 -0700 (PDT)
Message-ID: <200208040220.TAA14265@morrowfield.regexps.com>
References: <200208040205.TAA14205@morrowfield.regexps.com>
Original-To: guile-devel@gnu.org

ok -- these are input to simple awk scripts (couple k lines) -- though
I'm probably sending you versions that don't currently compile:

And yes --- I'm being cryptic (a side effect of trying to compress a
lot of info down to a short message) --- but the basic message is
"bye", so don't waste
time complaining.

-t


! Register Types

This file is part of the source code for the Hackerlab C library.
It's translated to C by the awk program "./register-tags.awk".

VM registers have small _external tags_ so they can hold a limited
selection of unboxed values.  This file declares the register tags and
the union type for (various kinds of) register.

*> core-registers register-type scm_register 2
**> holds scm s
**> holds scm_u u
**> holds scm_i i
**> holds scm_f f

--------------

! The Bit Tag Spec File

A popular misconception is that a tagging system simply maps values to
a set of densely-packed, small integer tags, each tag representing a
type.  You'll see people write:

<<<
        struct generic_object_header
        {
          t_uint tag;
          ...
        };
>>>

but that's really _oversimplified._  A good tagging system does much
more than that.

For example, here is an outline that is processed automatically to
produce `enum' declarations for a system of "staggered tags" (see SCM,
for example), predicate functions, and case labels.

Since Scheme is important enough that we should (at least casually)
worry about the length and complexity of the bootstrap path from
front-panel toggle-switches to full hosting environment, this file is
designed to be processed by a small `awk' script (i.e., a script for a
little language with hash tables, loops, conditionals, and regexps but
not much more).

CLS note: I'm not sure how clear this will be to people who aren't
looking at the rest of its context.  I hope the notation is clear
enough to be puzzled out....

*> scm-tags tags scm
**: decodes-to scm_u

We're going to define tags with a basename "scm" for values of type
"scm".

**> split-tag val 2

The smallest in-line tag will be two bits.
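
As a rough illustration of what "staggered" means here, the C sketch below checks the only two bit patterns this outline actually spells out: `bibop_object' (00, under a two-bit mask) and, further down, `immediate_signed' (111, under a three-bit mask).  Everything else is an assumption made for illustration: a 32-bit word, two's-complement integers, and the `demo_' names, which are hypothetical and are not what the awk script emits.  Tags are taken to sit in the low-order bits, which is what the character `decodes-exp' later in this file suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 32-bit tagged word; the real `scm' type is not shown. */
typedef uint32_t demo_scm;

/* bibop_object (00): the low two bits are both clear. */
static int demo_is_bibop_object(demo_scm val)
{
  return (val & 0x3u) == 0x0u;
}

/* immediate_signed (111): a wider, three-bit tag in the same word. */
static int demo_is_immediate_signed(demo_scm val)
{
  return (val & 0x7u) == 0x7u;
}

/* Pack a small signed integer above the three tag bits (hypothetical helper). */
static demo_scm demo_make_immediate_signed(int32_t n)
{
  /* Only 29 payload bits are available once the tag is in place. */
  assert(n >= -(INT32_C(1) << 28) && n < (INT32_C(1) << 28));
  return ((uint32_t)n << 3) | 0x7u;
}

/* Recover the signed payload (hypothetical helper). */
static int32_t demo_immediate_signed_value(demo_scm val)
{
  assert(demo_is_immediate_signed(val));
  /* Clear the tag, reinterpret as signed (assumes two's complement),
     then divide: the value is a multiple of 8, so this is exact and
     avoids right-shifting a negative number. */
  return (int32_t)(val & ~(uint32_t)0x7u) / 8;
}
```

The two predicates use masks of different widths on the same word, which is the point of a staggered scheme: tags that need a large payload (pointers) get short tags, and the remaining patterns are subdivided further.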

***> tag bibop_object (00)
****: tags-by-mask
****: decodes-to t_scm_sptr;

Bibop objects are the lightest weight in terms of meta-data overhead
(e.g., they don't necessarily have reference counts) and (if a direct
pointer representation is used) alignment requirements (they are
4-byte aligned).

Bibop objects share storage with page objects (see below).

***> tag cow_bibop_object (..)
****: tags-by-mask
****: decodes-to t_scm_cow_sptr;

Lazy-linear bibop objects.

There is a one-bit reference count for each bibop object.  When the
first cow reference to an object is formed, that reference count is 0.
If the cow reference is copied (to produce a second cow reference),
the count is 1.  If yet another cow-copy is made, the new copy is in
fact a shallow copy of the object with reference count 0 (references
in the shallow copy are cow references).

When fetching a possibly-cow field, programs can request a non-cow
reference to a stable object which the field will continue to hold
with a cow reference: if, before the fetch, the field held a cow
reference to an object with (possibly) more than one cow reference,
then a shallow copy is made and the field is updated before the fetch
returns.

***> split-tag heavy_pointer 2 (..)

Bibop pointers have the nice property of being small (if implemented
as direct pointers) but the drawback that reclamation of objects
weakly held by bibop pointers cannot occur until a scan has updated
all pointers to the object.

At the opposite extreme are object-table pointers and fat pointers:
more or less interchangeable ways to obtain cheap (in time) weak
references and even cheaply destroyable objects.

****> tag vm_object (....)
****: tags-by-mask
****: decodes-to t_scm_obj

The heap format of an object is quite complicated and is documented in
other files.

****> tag cow_vm_object (....)
****: tags-by-mask
****: decodes-to t_scm_cow_obj

Lazy-linear vm objects.  Similar to cow bibop objects, except that the
reference count is larger.
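
The one-bit reference count described for `cow_bibop_object' above can be sketched roughly as follows.  This is only a toy model under stated assumptions: the struct layout, the `demo_' names, and the use of malloc are all hypothetical, the payload is plain integers (so the rule that references inside a shallow copy are themselves cow references is not modelled), and nothing is ever reclaimed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with a one-bit cow reference count. */
struct demo_obj {
  unsigned shared : 1;   /* 0: one cow reference; 1: possibly more */
  int fields[4];         /* placeholder payload */
};

/* Shallow copy whose own count starts again at 0. */
static struct demo_obj *demo_shallow_copy(const struct demo_obj *o)
{
  struct demo_obj *c = malloc(sizeof *c);
  assert(c != NULL);
  *c = *o;
  c->shared = 0;
  return c;
}

/* Form the first cow reference to a new object: count is 0. */
static struct demo_obj *demo_cow_new(void)
{
  struct demo_obj *o = calloc(1, sizeof *o);
  assert(o != NULL);
  return o;
}

/* Copy a cow reference.  The second reference shares the object and
   sets the bit; any later copy gets a fresh shallow copy instead. */
static struct demo_obj *demo_cow_copy(struct demo_obj *o)
{
  if (!o->shared) {
    o->shared = 1;
    return o;
  }
  return demo_shallow_copy(o);
}

/* Fetch a stable, non-cow reference through a field that holds a cow
   reference.  If the object may be shared, the field is first updated
   to point at a private shallow copy. */
static struct demo_obj *demo_fetch_stable(struct demo_obj **field)
{
  if ((*field)->shared)
    *field = demo_shallow_copy(*field);
  return *field;
}
```

Because the count saturates at one bit, the scheme never knows how many references exist beyond "one" or "maybe more"; the price is an occasional shallow copy that an exact count would have avoided.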

****> tag vm_object_promise (....)
****: tags-by-mask
****: decodes-to t_scm_promise_obj

Lazy, memoized, and referencer-memoized vm objects.

****> split-tag vm_page 1 (....)
*****: tags-by-mask
*****: decodes-to t_scm_page

A modest pool of very-large-alignment (256 bytes) types.

*****> split-tag vm_direct_page 2 (....)
******> tag vm_page16 (......)
******> tag vm_page128 (......)
******> tag vm_page512 (......)
******> tag vm_page1024 (......)

*****> split-tag vm_cow_page 2 (....)
******> tag vm_cow_page16 (......)
******> tag vm_cow_page128 (......)
******> tag vm_cow_page512 (......)
******> tag vm_cow_page1024 (......)

***> split-tag immediate 1 (..)

Characters want to be "unicode+bucky bits" which adds up to _at least_
24 bits and more comfortably to 29.

Numbers are weird.  Do we want one or two big-as-possible immediate
integer types?  Or do we want to cram in lots of little types for tiny
immediate rationals and complex numbers?  How much of it should make
sense in 16-bit environments?

Atomic values: I don't care much about them.  `nil' is the 0
non-immediate value.  I wouldn't horribly miss `#t' or seeing it
become a non-immediate -- almost nothing low-level ever dispatches on
#t specifically.  Indeed -- it's easy for an allocator to create
disjoint, immutable, non-referencing objects that can be re-used
across all VM instances and have well-known fixed addresses
per-process.  Use values of that sort for atomics: one extra memory
fetch for an eq? test (to look up the well-known address) but
otherwise just as good.

So it's a two-way battle: numbers v. characters.

Characters have the stricter data-size demands: let's give them half
the remaining values:

****> tag character (...)
*****: decodes-to t_unicode
*****: decodes-exp (t_unicode)(  ((val >> scm_tag_width_character) & (((scm)1 << scm_char_code_bits) - 1)) \
                               | (val & ((((scm)1 << scm_bucky_bits) - 1) << (scm_bits - scm_bucky_bits))))

In a 32-bit or larger environment, we get 29 bits for immediate
characters -- enough for 21 bits of Unicode plus bucky bits
{left,right}x{shift,ctl,meta,alt}.  Sweet.

In the expanded (at least 32-bit) form, we keep the bucky bits in the
high-order 8 bits.

****> tag immediate_signed (111) signed!
*****: decodes-to scm_i


*> scm-fast-dispatchers scm

**> dispatcher is_bibop
**> return 0 for bibop_object cow_bibop_object
**> return 1 otherwise

**> dispatcher is_vm_object
**> return 1 for vm_object cow_vm_object vm_object_promise
**> return 0 otherwise

**> dispatcher vm_obj_discipline ?
***> return cow for vm_cow_page16 vm_cow_page128 vm_cow_page512 vm_cow_page1024
***> return cow for cow_vm_object cow_bibop_object
***> return promise for vm_object_promise
***> return immediate for immediate
***> return regular otherwise

This should generate

        [extern]enum scm_vm_obj_discipline
        scm_vm_obj_discipline(scm? obj)
        {
          switch (scm_tag(..))
          ...
        }

and

        inline t_uint8
        scm_vm_obj_discipline_switch(scm? obj)
        {
          return scm_tag (...);
        }

        #define SCM_VM_OBJ_DISCIPLINE_COW_CASE ...
        #define SCM_VM_OBJ_DISCIPLINE_PROMISE_CASE ...
        #define SCM_VM_OBJ_DISCIPLINE_IMMEDIATE_CASE ...
        #define SCM_VM_OBJ_DISCIPLINE_REGULAR_CASE ...

To do: add a way to make pointer types for tags (e.g. vm_object) and
for any binary dispatcher, plus a conversion function that returns nil
on "wrong type" (`scm_as_vm_obj(scm val)' => `t_scm_vm_obj').

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel