unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Attempting to unbox struct fields
@ 2016-02-28 15:03 David Thompson
  2016-02-28 21:30 ` Thompson, David
  0 siblings, 1 reply; 6+ messages in thread
From: David Thompson @ 2016-02-28 15:03 UTC (permalink / raw)
  To: guile-devel; +Cc: wingo

Hello wingo and list,

A couple days ago on #guile, I started a conversation about optimizing
some record types I use for linear algebra to take advantage of unboxed
arithmetic in the upcoming Guile 2.2.  Andy informed me of a temporary
hack I could try, but then said that The Right Thing is for Guile to
unbox struct fields.  So, I thought since Andy wrote such a nice post on
his blog about about Guile compiler tasks [0] that maybe I would give it
a try!  I have gone about as far as I can on my own (not far), and seek
the guiding light of the Guile maintainers to help unblock me.

The task is as follows, quoting from the above mentioned blog post:

    Guile's "structs", on which records are implemented, support unboxed
    values, but these values are untyped, not really integrated with the
    record layer, and always boxed in the VM. Your task would be to
    design a language facility that allows us to declare records with
    typed fields, and to store unboxed values in those fields, and to
    cause access to their values to emit boxing/unboxing instructions
    around them. The optimizer will get rid of those boxing/unboxing
    instructions if it can. Good luck!

I took an exploratory romp around the codebase and here's what I've
learned:

- Guile indeed supports unboxed fields in the struct implementation.
  Currently it only supports unboxed unsigned integers, but there's some
  preprocessor magic that can be removed to enable unboxed signed
  integers and doubles:

      switch (field_type)
      {
      case 'u':
        data[p] = SCM_NUM2ULONG (3, val);
        break;
      
      #if 0
      case 'i':
        data[p] = SCM_NUM2LONG (3, val);
        break;
      
      case 'd':
        *((double *)&(data[p])) = scm_num2dbl (val, (char *)SCM_ARG3);
        break;
      #endif

      ...

- It's easy enough to create a vtable with unboxed fields:

    (define vtable (make-vtable "uwuw"))
    (define s (make-struct vtable 123 456))
    (struct-ref s 0) ;; => 123

- struct-ref/immediate and struct-set!/immediate are the VM operations
  for reading from/writing to structs

- Roughly speaking, in the case of unboxed unsigned integers, one would
  want to insert a scm->u64 instruction before setting the value of an
  unboxed field, and one would want to insert a u64->scm instructor
  after getting the value of an unboxed field.

- In TreeIL, struct refs and sets are primcalls, and when compiling to
  CPS they receive some special treatment to unbox the index component
  of the respective operations.  This might be the procedure that will
  insert the boxing/unboxing instructions for the struct fields, but I'm
  not sure.

Now that I've learned a little bit, I have a bunch of questions that I
cannot yet answer:

- Is it possible to know the layout of a vtable at compile time?

- If so, where is that information stored?

- If not, does this mean that TreeIL needs to be changed to be able to
store typed struct field data in order to generate the correct CPS?

- Can the TreeIL format even be reasonably changed since its a public
interface that people target when writing their own language
implementations?

Basically, how could I possibly get my hands on the vtable information
at compile time?

Help would be very much appreciated!

Thanks,

-- 
David Thompson
GPG Key: 0FF1D807

[0] http://wingolog.org/archives/2016/02/04/guile-compiler-tasks



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Attempting to unbox struct fields
  2016-02-28 15:03 Attempting to unbox struct fields David Thompson
@ 2016-02-28 21:30 ` Thompson, David
  2016-02-29  3:56   ` Mark H Weaver
  0 siblings, 1 reply; 6+ messages in thread
From: Thompson, David @ 2016-02-28 21:30 UTC (permalink / raw)
  To: guile-devel; +Cc: Andy Wingo

[-- Attachment #1: Type: text/plain, Size: 171 bytes --]

Here's a patch I came up with to enable (and fix where necessary) the
support for signed integer and double struct fields.  Am I on the
right track here?

Thanks,

- Dave

[-- Attachment #2: 0001-struct-Add-support-for-unboxed-signed-integers-and-d.patch --]
[-- Type: text/x-diff, Size: 2751 bytes --]

From 8bde5c7018fde91cc7140777107bacfb3febb170 Mon Sep 17 00:00:00 2001
From: David Thompson <dthompson2@worcester.edu>
Date: Sun, 28 Feb 2016 16:11:35 -0500
Subject: [PATCH] struct: Add support for unboxed signed integers and doubles.

* libguile/struct.c (scm_make_struct_layout): Enable 'i' and 'd' cases.
(scm_is_valid_vtable_layout): Add 'i' and 'd' cases.
(scm_struct_init): Initialize signed integer and double fields.
(scm_struct_ref): Enable 'i' and 'd' cases.  Update 'd' case to use
scm_from_double instead of obsolete scm_make_real.
(scm_struct_set_x): Enable 'i' and 'd' cases.  Update 'd' case to use
SCM_NUM2DOUBLE instead of obsolete scm_num2dbl.
---
 libguile/struct.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/libguile/struct.c b/libguile/struct.c
index 8bfbcf4..c50019c 100644
--- a/libguile/struct.c
+++ b/libguile/struct.c
@@ -100,10 +100,8 @@ SCM_DEFINE (scm_make_struct_layout, "make-struct-layout", 1, 0, 0,
 	  {
 	  case 'u':
 	  case 'p':
-#if 0
 	  case 'i':
 	  case 'd':
-#endif
 	  case 's':
 	    break;
 	  default:
@@ -219,6 +217,8 @@ scm_is_valid_vtable_layout (SCM layout)
     switch (c_layout[n])
       {
       case 'u':
+      case 'i':
+      case 'd':
       case 'p':
       case 's':
         switch (c_layout[n+1])
@@ -360,6 +360,26 @@ scm_struct_init (SCM handle, SCM layout, size_t n_tail,
 		}
 	      break;
 
+            case 'i':
+	      if ((prot != 'r' && prot != 'w') || inits_idx == n_inits)
+		*mem = 0;
+	      else
+		{
+		  *mem = scm_to_long (SCM_PACK (inits[inits_idx]));
+		  inits_idx++;
+		}
+	      break;
+
+            case 'd':
+	      if ((prot != 'r' && prot != 'w') || inits_idx == n_inits)
+		*mem = 0.0;
+	      else
+		{
+		  *((double *) mem) = scm_to_double (SCM_PACK (inits[inits_idx]));
+		  inits_idx++;
+		}
+	      break;
+
 	    case 'p':
 	      if ((prot != 'r' && prot != 'w') || inits_idx == n_inits)
 		*mem = SCM_UNPACK (SCM_BOOL_F);
@@ -761,15 +781,13 @@ SCM_DEFINE (scm_struct_ref, "struct-ref", 2, 0, 0,
 	  answer = scm_from_ulong (data[p]);
 	  break;
 
-#if 0
 	case 'i':
 	  answer = scm_from_long (data[p]);
 	  break;
 
 	case 'd':
-	  answer = scm_make_real (*((double *)&(data[p])));
+	  answer = scm_from_double (*((double *)&(data[p])));
 	  break;
-#endif
 
 	case 's':
 	case 'p':
@@ -844,15 +862,13 @@ SCM_DEFINE (scm_struct_set_x, "struct-set!", 3, 0, 0,
 	  data[p] = SCM_NUM2ULONG (3, val);
 	  break;
 
-#if 0
 	case 'i':
 	  data[p] = SCM_NUM2LONG (3, val);
 	  break;
 
 	case 'd':
-	  *((double *)&(data[p])) = scm_num2dbl (val, (char *)SCM_ARG3);
+	  *((double *)&(data[p])) = SCM_NUM2DOUBLE (3, val);
 	  break;
-#endif
 
 	case 'p':
 	  data[p] = SCM_UNPACK (val);
-- 
2.6.3


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: Attempting to unbox struct fields
  2016-02-28 21:30 ` Thompson, David
@ 2016-02-29  3:56   ` Mark H Weaver
  2016-02-29 14:26     ` Thompson, David
  0 siblings, 1 reply; 6+ messages in thread
From: Mark H Weaver @ 2016-02-29  3:56 UTC (permalink / raw)
  To: Thompson, David; +Cc: Andy Wingo, guile-devel

Hi David,

"Thompson, David" <dthompson2@worcester.edu> writes:
> Here's a patch I came up with to enable (and fix where necessary) the
> support for signed integer and double struct fields.

Great, thanks for working on it!  This is not a proper review, just a
couple of questions and comments.

> Am I on the right track here?

The first thing I noticed is that the patch assumes that doubles are the
same size as pointers.  Obviously this is not the case on 32-bit
systems.  What's the plan for those systems?

The patch also currently assumes that longs and pointers are the same
size, and this turns out to be false on LLP64 systems such as 64-bit
Windows.  See <https://debbugs.gnu.org/22406>.  However, it should be
straightforward to fix this issue.

> +	      if ((prot != 'r' && prot != 'w') || inits_idx == n_inits)
> +		*mem = 0.0;

Note that 'mem' is of type scm_t_bits*, so the 0.0 will be converted to
the integer 0 and then stored as an integer, which I guess is not what
you meant.  Note that in practice this works on IEEE 754 systems, where
0.0 is represented as all zero bits, but the code is somewhat misleading
and less portable than it could be.

     Thanks,
       Mark



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Attempting to unbox struct fields
  2016-02-29  3:56   ` Mark H Weaver
@ 2016-02-29 14:26     ` Thompson, David
  2016-02-29 17:43       ` Mark H Weaver
  0 siblings, 1 reply; 6+ messages in thread
From: Thompson, David @ 2016-02-29 14:26 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

Hi Mark,

On Sun, Feb 28, 2016 at 10:56 PM, Mark H Weaver <mhw@netris.org> wrote:
> Hi David,
>
> "Thompson, David" <dthompson2@worcester.edu> writes:
>> Here's a patch I came up with to enable (and fix where necessary) the
>> support for signed integer and double struct fields.
>
> Great, thanks for working on it!  This is not a proper review, just a
> couple of questions and comments.

I'm not very knowledgeable in the finer points of writing portable C
code or the Guile C API, so this feedback is very valuable.  Thanks!

>> Am I on the right track here?
>
> The first thing I noticed is that the patch assumes that doubles are the
> same size as pointers.  Obviously this is not the case on 32-bit
> systems.  What's the plan for those systems?

Yeah, I just hacked this together on my x86_64 system and paid no mind
to portability.  I was hoping that you or Andy or Ludovic would have
an idea for how to address portability issues. :)

> The patch also currently assumes that longs and pointers are the same
> size, and this turns out to be false on LLP64 systems such as 64-bit
> Windows.  See <https://debbugs.gnu.org/22406>.  However, it should be
> straightforward to fix this issue.
>
>> +           if ((prot != 'r' && prot != 'w') || inits_idx == n_inits)
>> +             *mem = 0.0;
>
> Note that 'mem' is of type scm_t_bits*, so the 0.0 will be converted to
> the integer 0 and then stored as an integer, which I guess is not what
> you meant.  Note that in practice this works on IEEE 754 systems, where
> 0.0 is represented as all zero bits, but the code is somewhat misleading
> and less portable than it could be.

D'oh!  I forgot to cast here.  Thanks for the explanation.  It
explains why I didn't notice the issue when testing.

My current plan is to keep pressing onward and produce a
proof-of-concept, hacky patch set that allows unboxed struct fields on
x86_64.  Then, with some help and guidance, I can sort out the
portability issues, code style issues, unit tests, and documentation
after confirmation that the prototype is on the right track.

Thanks,

- Dave



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Attempting to unbox struct fields
  2016-02-29 14:26     ` Thompson, David
@ 2016-02-29 17:43       ` Mark H Weaver
  2016-02-29 21:09         ` Thompson, David
  0 siblings, 1 reply; 6+ messages in thread
From: Mark H Weaver @ 2016-02-29 17:43 UTC (permalink / raw)
  To: Thompson, David; +Cc: guile-devel

"Thompson, David" <dthompson2@worcester.edu> writes:

>> The first thing I noticed is that the patch assumes that doubles are the
>> same size as pointers.  Obviously this is not the case on 32-bit
>> systems.  What's the plan for those systems?
>
> Yeah, I just hacked this together on my x86_64 system and paid no mind
> to portability.  I was hoping that you or Andy or Ludovic would have
> an idea for how to address portability issues. :)

I think the approach we need to take is that for 32-bit systems, doubles
will need to use two consecutive slots.  Furthermore, those slots will
either need to be aligned (i.e. the first slot must have an even index)
or else the code that accesses 'double' struct fields will need to
perform the access carefully, perhaps by copying each half separately
into a local 'union' and then copying the double from there.

> My current plan is to keep pressing onward and produce a
> proof-of-concept, hacky patch set that allows unboxed struct fields on
> x86_64.  Then, with some help and guidance, I can sort out the
> portability issues, code style issues, unit tests, and documentation
> after confirmation that the prototype is on the right track.

I think the issue above is fairly fundamental, and might not be easy to
address as an after-thought.  It would be good to test your preliminary
work on 32-bit systems throughout the development process.  Fortunately,
I know that you run GuixSD which makes this easy :)

What do you think?

    Thanks,
      Mark



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Attempting to unbox struct fields
  2016-02-29 17:43       ` Mark H Weaver
@ 2016-02-29 21:09         ` Thompson, David
  0 siblings, 0 replies; 6+ messages in thread
From: Thompson, David @ 2016-02-29 21:09 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On Mon, Feb 29, 2016 at 12:43 PM, Mark H Weaver <mhw@netris.org> wrote:
> "Thompson, David" <dthompson2@worcester.edu> writes:
>
>>> The first thing I noticed is that the patch assumes that doubles are the
>>> same size as pointers.  Obviously this is not the case on 32-bit
>>> systems.  What's the plan for those systems?
>>
>> Yeah, I just hacked this together on my x86_64 system and paid no mind
>> to portability.  I was hoping that you or Andy or Ludovic would have
>> an idea for how to address portability issues. :)
>
> I think the approach we need to take is that for 32-bit systems, doubles
> will need to use two consecutive slots.  Furthermore, those slots will
> either need to be aligned (i.e. the first slot must have an even index)
> or else the code that accesses 'double' struct fields will need to
> perform the access carefully, perhaps by copying each half separately
> into a local 'union' and then copying the double from there.

After talking with Andy on #guile for a bit, it has become apparent
that unboxing these fields will only be of limited utility.  For
example, it would be nice to eventually handle arrays of unboxed data
like in C with an array of structs, but unboxing struct fields doesn't
get us any closer to such a goal.  Thus, I am going to drop this work
and find another optimization to focus on.

Thanks anyway.  It's been an educational experience.

- Dave



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-02-29 21:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-28 15:03 Attempting to unbox struct fields David Thompson
2016-02-28 21:30 ` Thompson, David
2016-02-29  3:56   ` Mark H Weaver
2016-02-29 14:26     ` Thompson, David
2016-02-29 17:43       ` Mark H Weaver
2016-02-29 21:09         ` Thompson, David

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).