From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marius Vollmer
To: Dirk Herrmann
Cc: Paul Jarc, guile-devel@gnu.org
Newsgroups: gmane.lisp.guile.devel
Subject: Re: The relationship between SCM and scm_t_bits.
Date: Sat, 21 Aug 2004 18:16:45 +0200
Message-ID: <87wtzsihzm.fsf@zagadka.ping.de>
In-Reply-To: <41264E64.5050008@dirk-herrmanns-seiten.de> (Dirk Herrmann's message of "Fri, 20 Aug 2004 21:17:56 +0200")
References: <40A6307C.9050809@dirk-herrmanns-seiten.de> <871xliq2p0.fsf@zagadka.ping.de> <40AE5A80.9050809@dirk-herrmanns-seiten.de> <40AEF7B3.2020707@dirk-herrmanns-seiten.de> <87zn54yq6q.fsf@zagadka.ping.de> <41264E64.5050008@dirk-herrmanns-seiten.de>

Dirk Herrmann writes:

>> The reason is that there exists code that does essentially this:
>>
>>     scm_t_bits heap_field;
>>
>>     SCM value = whatever ();
>>     SCM *ptr = (SCM *)&heap_field;
>>     *ptr = value;
>
> I assume that you mean that heap_field is actually an element of the heap.

Yes.

> We already had the discussion that I suggest to discourage this
> style of coding since it violates a potential write barrier and will
> lead to problems if we ever switch to a generational garbage
> collection.

Yes, that is the bigger issue.  What we are discussing here are quite
minor points, I'd say.

There might be a time when we do want to have a write-barrier and then
we can revisit whether to provide the *LOC accessors or not.  Right
now, removing them is not necessary.  We should only remove them when
there is an immediate benefit.

> In particular, I have a problem with the following lines of code.
>
> In gc.h:
>
> #define SCM_GC_CELL_WORD(x, n) (SCM_UNPACK (SCM_GC_CELL_OBJECT ((x), (n))))
>
> This expression has a SCM value as an intermediate result, which
> is definitely unclean, since the SCM value might (in contrast to the
> definition of SCM) not represent a valid scheme object.

Yes, that troubles me also a bit.  But I get over it by realizing that
we only really have one type, the type 'machine word', and SCM and
scm_t_bits are essentially this same type, used to provide markup for
different uses of the basic type 'machine word'.

(In my view, it is essential that Scheme values are represented as a
machine word.  Using some other type that doesn't fit into a machine
register, for example, would not be good enough.)

As far as the ordinary user is concerned, we only have one type to
represent a Scheme value, SCM.  We don't say what a SCM is (whether it
is a pointer, an integer, a struct, etc), only that you can assign it
with '='.

The internals of Guile, and unfortunately also a user that works with
smobs, need to know more about SCM: that it really is a machine word
and can be treated as an integral type.  To treat it as such, a SCM is
reinterpreted as a scm_t_bits.

I think we need to make the following guarantees:

- a SCM and a scm_t_bits have the same size in the sense that they can
  store exactly the same things.  We always have

      SCM scm;
      scm_is_eq (SCM_PACK (SCM_UNPACK (scm)), scm)

  and

      scm_t_bits bits;
      SCM_UNPACK (SCM_PACK (bits)) == bits        (*)

- a size_t can be cast to scm_t_bits and back without losing
  information.  (This is for storing integers in heap words.)

- a void* can be cast to scm_t_bits and back without losing
  information.  (This is for storing pointers in heap words.)

- a scm_t_bits can be cast to void* and back without losing
  information.  (This is for storing SCMs in void* locations provided
  by external code.)
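
To make the guarantees above concrete, here is a minimal sketch in C.
It does not use Guile's headers: toy_scm, toy_bits, TOY_PACK,
TOY_UNPACK and toy_check_guarantees are hypothetical stand-ins invented
for this illustration, and the choice of uintptr_t and a struct pointer
is an assumption, not a statement about how Guile defines SCM and
scm_t_bits.  The round trip from integer to void* and back is not
promised by the C standard in general; the sketch assumes a mainstream
platform with a flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; NOT Guile's definitions. */
typedef uintptr_t toy_bits;                 /* plays the role of scm_t_bits */
typedef struct toy_unused_struct *toy_scm;  /* plays the role of SCM */

#define TOY_UNPACK(x) ((toy_bits) (x))      /* toy_scm  -> toy_bits */
#define TOY_PACK(x)   ((toy_scm) (x))       /* toy_bits -> toy_scm  */

/* Size relations assumed by the guarantees, checked at compile time (C11). */
_Static_assert (sizeof (toy_scm) == sizeof (toy_bits),
                "toy_scm and toy_bits must have the same size");
_Static_assert (sizeof (size_t) <= sizeof (toy_bits),
                "a size_t must fit into a toy_bits");
_Static_assert (sizeof (void *) <= sizeof (toy_bits),
                "a void* must fit into a toy_bits");

/* Returns 1 if every round trip preserves its value, 0 otherwise. */
int
toy_check_guarantees (void)
{
  int object = 42;
  toy_bits bits = (toy_bits) 0x1234;
  toy_scm scm = TOY_PACK (bits);
  size_t n = (size_t) -1;       /* largest size_t value */
  void *p = &object;

  /* Guarantee (*): pack/unpack round trips in both directions. */
  if (TOY_UNPACK (TOY_PACK (bits)) != bits)
    return 0;
  if (TOY_PACK (TOY_UNPACK (scm)) != scm)
    return 0;

  /* size_t -> toy_bits -> size_t (integers in heap words). */
  if ((size_t) (toy_bits) n != n)
    return 0;

  /* void* -> toy_bits -> void* (pointers in heap words). */
  if ((void *) (toy_bits) p != p)
    return 0;

  /* toy_bits -> void* -> toy_bits (values in external void* slots). */
  if ((toy_bits) (void *) bits != bits)
    return 0;

  return 1;
}
```

The distinct pointer-to-struct type is what gives the type checking
help from the compiler: a toy_scm cannot be used in arithmetic or
assigned from an integer without going through TOY_PACK, while the bits
themselves are unchanged by the conversion.
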

This is not as elegant and clean as dropping the guarantee (*), but it
allows heap words to be declared as type SCM, which is desirable since
local variables and function arguments are also declared to be of type
SCM.

The reason that SCM is distinct from scm_t_bits at all is to get some
help from the C compiler in type checking.

> In numbers.h:
>
> #define SCM_I_BIG_MPZ(x) (*((mpz_t *) (SCM_CELL_OBJECT_LOC((x),1))))
>
> This expression has a SCM* as an intermediate result, although in
> this case we _know_ that we are actually pointing to a scm_t_bits
> value.

No, we point at an array of three SCMs... ;)

This is actually a separate issue: the memory used by SCM_I_BIG_MPZ is
always used as only one type, as an mpz_t.  The reason that I changed
all heap words to be declared as SCM was that previously some heap
words would be written as a SCM and then read as a scm_t_bits.

This is also the reason why I think that a union does not help at all:
with such a union, we would write into one member and then read from
the other.  This is just as unclean as casting a pointer to scm_t_bits
to a pointer to SCM.

> Thus, I would just go ahead and apply it within the next couple of
> days.

Please do not apply it.  We are not completely clean, true, but I
doubt that we can attain perfect cleanliness anyway.  Using a union
would just complicate the issue without giving any benefit (that I
could see).

Things started out simple, and got more complicated with the
introduction of scm_t_bits as an alias of SCM.  Let's not continue
this trend by pretending that SCM and scm_t_bits are actually separate
types.  They are not; they are essentially the same type, but one
allows certain low-level operations that the other prevents.

-- 
GPG: D5D4E405 - 2F9B BCCC 8527 692A 04E3  331E FAF8 226A D5D4 E405