truth of %nil

unofficial mirror of guile-devel@gnu.org 

* truth of %nil
@ 2009-06-29 21:12 Andy Wingo
  2009-06-29 21:44 ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Andy Wingo @ 2009-06-29 21:12 UTC (permalink / raw)
  To: guile-devel

Hi all,

Daniel came up with an interesting test case:

    scheme@(guile-user)> (if %nil 1 2)
    1

We could fix this transparently by changing scm_is_false in boolean.h
from:

    #define scm_is_false(x) scm_is_eq ((x), SCM_BOOL_F)

to

    #define scm_is_false(x) (scm_is_eq ((x), SCM_BOOL_F) || SCM_NILP (x))

I'm not really sure if this is the right place for this to go, though.
It seems that it is. (Ideally the two values would differ by one bit
only, and we could mask that bit away and just have the one test.) What
do people think?
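
As a rough illustration of the one-bit idea in the parenthetical above
(a sketch only: the encodings are made up for the example and are not
Guile's actual tag values, and `value', `VAL_FALSE', `VAL_NIL' and
`is_false_masked' are hypothetical names), clearing the differing bit
lets a single comparison accept both false values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encodings, chosen only so that VAL_FALSE and VAL_NIL
   differ in exactly one bit.  These are NOT Guile's real values. */
typedef uintptr_t value;

#define VAL_FALSE ((value) 0x04)   /* ...0100 */
#define VAL_NIL   ((value) 0x05)   /* ...0101, differs in bit 0 only */
#define VAL_TRUE  ((value) 0x0c)   /* ...1100 */

/* The single bit that distinguishes the two false values. */
#define FALSE_NIL_DIFF (VAL_FALSE ^ VAL_NIL)

/* Straightforward version: two comparisons. */
static int
is_false_two_tests (value x)
{
  return x == VAL_FALSE || x == VAL_NIL;
}

/* Masked version: clear the differing bit, then compare once. */
static int
is_false_masked (value x)
{
  return (x & ~FALSE_NIL_DIFF) == (VAL_FALSE & ~FALSE_NIL_DIFF);
}
```

With these encodings the masked test accepts exactly the same inputs as
the two-comparison test.  The trick depends on the two constants
differing in exactly one bit.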

Andy
-- 
http://wingolog.org/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-06-29 21:12 truth of %nil Andy Wingo
@ 2009-06-29 21:44 ` Neil Jerram
  2009-06-29 22:11   ` Andy Wingo
  2009-07-02 14:28   ` Mark H Weaver
  0 siblings, 2 replies; 38+ messages in thread
From: Neil Jerram @ 2009-06-29 21:44 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Hi all,
>
> Daniel came up with an interesting test case:
>
>     scheme@(guile-user)> (if %nil 1 2)
>     1
>
> We could fix this transparently by changing scm_is_false in boolean.h
> from:
>
>     #define scm_is_false(x) scm_is_eq ((x), SCM_BOOL_F)
>
> to
>
>     #define scm_is_false(x) (scm_is_eq ((x), SCM_BOOL_F) || SCM_NILP (x))
>
> I'm not really sure if this is the right place for this to go, though.
> It seems that it is. (Ideally the two values would differ by one bit
> only, and we could mask that bit away and just have the one test.) What
> do people think?
>
> Andy
> -- 
> http://wingolog.org/

Seems wrong to me.  In Scheme #f should be the only false value.
What's the argument for %nil being false in Scheme code?

     Neil





* Re: truth of %nil
  2009-06-29 21:44 ` Neil Jerram
@ 2009-06-29 22:11   ` Andy Wingo
  2009-06-30 22:22     ` Neil Jerram
  2009-07-02 14:28   ` Mark H Weaver
  1 sibling, 1 reply; 38+ messages in thread
From: Andy Wingo @ 2009-06-29 22:11 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

On Mon 29 Jun 2009 23:44, Neil Jerram <neil@ossau.uklinux.net> writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>>     scheme@(guile-user)> (if %nil 1 2)
>>     1
>>
>>     #define scm_is_false(x) (scm_is_eq ((x), SCM_BOOL_F) || SCM_NILP (x))

> Seems wrong to me.  In Scheme #f should be the only false value.
> What's the argument for %nil being false in Scheme code?

I thought the original plan regarding %nil and #f and '() was that %nil
wasn't supposed to be seen normally from Scheme, and for that reason
(and (null? %nil) (not %nil)) would not be a problem.

Guile has treated %nil as false for quite some time:

    scheme@(guile-user)> ,o interp #t
    scheme@(guile-user)> (if %nil 1 2)
    $1 = 2

Andy
-- 
http://wingolog.org/





* Re: truth of %nil
  2009-06-29 22:11   ` Andy Wingo
@ 2009-06-30 22:22     ` Neil Jerram
  2009-07-01  6:45       ` Daniel Kraft
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-06-30 22:22 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> On Mon 29 Jun 2009 23:44, Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> Andy Wingo <wingo@pobox.com> writes:
>>
>>>     scheme@(guile-user)> (if %nil 1 2)
>>>     1
>>>
>>>     #define scm_is_false(x) (scm_is_eq ((x), SCM_BOOL_F) || SCM_NILP (x))
>
>> Seems wrong to me.  In Scheme #f should be the only false value.
>> What's the argument for %nil being false in Scheme code?
>
> I thought the original plan regarding %nil and #f and '() was that %nil
> wasn't supposed to be seen normally from Scheme, and for that reason
> (and (null? %nil) (not %nil)) would not be a problem.
>
> Guile has treated %nil as false for quite some time:
>
>     scheme@(guile-user)> ,o interp #t
>     scheme@(guile-user)> (if %nil 1 2)
>     $1 = 2

I'm sorry... you're completely right.  Brain storm on my part.

But then I don't understand the cause of your suggestion.  Is it that
master has somehow regressed, so as to cause (if %nil 1 2) => 1 ?

     Neil





* Re: truth of %nil
  2009-06-30 22:22     ` Neil Jerram
@ 2009-07-01  6:45       ` Daniel Kraft
  2009-07-01 21:54         ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Daniel Kraft @ 2009-07-01  6:45 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, guile-devel

Hi Neil,

Neil Jerram wrote:
> Andy Wingo <wingo@pobox.com> writes:
>> Guile has treated %nil as false for quite some time:
>>
>>     scheme@(guile-user)> ,o interp #t
>>     scheme@(guile-user)> (if %nil 1 2)
>>     $1 = 2
> 
> I'm sorry... you're completely right.  Brain storm on my part.
> 
> But then I don't understand the cause of your suggestion.  Is it that
> master has somehow regressed, so as to cause (if %nil 1 2) => 1 ?

it seems so.  Doing just a

    scheme@(guile-user)> (if %nil 1 2)
    1

with a recent build (of at least my elisp branch, but that did not 
change anything in this respect of course) gives that answer.

Doing ,o interp #t as Andy did however also gives the right answer for 
me.  BTW, I've just changed my elisp compiler to use real nil instead of 
#f for nil, but now it doesn't have the right semantics of course 
(that's the motivation here).

Yours,
Daniel


* Re: truth of %nil
  2009-07-01  6:45       ` Daniel Kraft
@ 2009-07-01 21:54         ` Neil Jerram
  2009-07-05 13:07           ` Mark H Weaver
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-07-01 21:54 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Andy Wingo, guile-devel

Daniel Kraft <d@domob.eu> writes:

> it seems so.  Doing just a
>
> scheme@(guile-user)> (if %nil 1 2)
> 1
>
> with a recent build (of at least my elisp branch, but that did not
> change anything in this respect of course) gives that answer.
>
> Doing ,o interp #t as Andy did however also gives the right answer for
> me.  BTW, I've just changed my elisp compiler to use real nil instead
> of #f for nil, but now it doesn't have the right semantics of course
> (that's the motivation here).

OK, I see.  The point is that VM ops like br-if use SCM_FALSEP (which
is equivalent to scm_is_false), and hence you're wondering if it would
be easier to change the definition of scm_is_false, than to modify
those ops to say (SCM_FALSEP (x) || SCM_NILP (x)).

I think the balance of arguments is clearly against doing that:

- There are lots of places that use scm_is_false where there is no
  need to allow for the value being tested being %nil.  Changing
  scm_is_false would be a performance hit for those places.

- There are only a handful of places (I think) that you need to change
  to get %nil-falseness in the VM.

- There is a similar number of places which already implement
  %nil-falseness in the interpreter by using (scm_is_false (x) ||
  SCM_NILP (x)), and these would logically have to be changed if you
  made your proposed change.

- It would be an incompatible API change.

So please just change the relevant places in the VM to say
(scm_is_false (x) || SCM_NILP (x)) instead.
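
A minimal sketch of that narrower change, under stated assumptions: the
`value' enum and the `is_false', `is_nil' and `br_if' helpers below are
hypothetical stand-ins, not libguile's types or the VM's real branch
code.  The strict predicate keeps its single comparison, and only the
branch op adds the extra nil test:

```c
#include <assert.h>

/* Hypothetical value model; not libguile's SCM representation. */
typedef enum { V_FALSE, V_TRUE, V_NIL, V_EOL, V_OTHER } value;

/* Strict test: only #f is false.  Stays a single comparison. */
static int
is_false (value x)
{
  return x == V_FALSE;
}

/* Stand-in for a nil check such as SCM_NILP. */
static int
is_nil (value x)
{
  return x == V_NIL;
}

/* Hypothetical branch op: jump by OFFSET when TOP counts as true,
   otherwise fall through to the next instruction.  Both #f and nil
   count as false here; '() counts as true, as in Scheme. */
static int
br_if (value top, int ip, int offset)
{
  int truthy = !(is_false (top) || is_nil (top));
  return truthy ? ip + offset : ip + 1;
}
```

Callers of `is_false' elsewhere are unaffected in this sketch, which is
the performance and compatibility point made in the list above.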

Regards,
        Neil





* Re: truth of %nil
  2009-07-01 21:54         ` Neil Jerram
@ 2009-07-05 13:07           ` Mark H Weaver
  2009-08-30 11:07             ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Mark H Weaver @ 2009-07-05 13:07 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Daniel Kraft, guile-devel

I would like to argue that the definitions of scm_is_false,
scm_is_true, and scm_is_null should indeed be changed to test for
%nil.

Do a grep-find in the tree for uses of these macros.  I think you'll
find that the majority of places where they are used should also be
checking for %nil, but they are not.

The only time we can safely avoid testing for %nil is when we
know *statically* that the value being tested was not created by elisp
code.  Of course, in scheme, it is extremely rare that we can know
this statically.  Even bindings like `and', `or', and `not' could in
principle be bound to elisp functions.

More importantly, scm_is_false, scm_is_true, and scm_is_null in code
outside of guile's source tree should almost always be checking for
%nil, unless they know statically that their own code created the
value in question (because they shouldn't make assumptions about what
libguile's code will do in the future), which again is very rare.

Right now, there are scores of bugs in guile's tree that will only
show up sporadically for those doing heavy mixing of elisp and scheme,
because most code written in C (almost everything except for the
evaluator itself) is failing to check for %nil even though it should.

Do a quick grep for uses of scm_is_null in the C code for srfi-1, for
just one example.

The default should be to test for %nil.  If, in a particular use, it
can be proved statically that the value was not created by an elisp
function (which we can almost never prove), then that is a case where
we can use some faster test.  But someone will have to think about
each of these cases individually anyway, so it makes sense that these
faster tests should be named something different than the old names,
and preferably with a longer name, calling attention to the fact that
it is a potential source of bugs -- because even if at some point a
tested value can be proved to never be %nil, this might very well
change later, thus creating a new rarely-triggered bug in old code.

Maybe names something like this:

scm_is_false_xxx_assume_never_nil
scm_is_true_xxx_assume_never_nil
scm_is_null_xxx_assume_never_nil

One category of place where these could be used is code dealing with
data structures created internally by the evaluator -- though I'm not
very familiar with guile's internals, so I don't know how common these
data structures are, if indeed they exist at all.
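
To make the naming proposal concrete, here is a small sketch.  It is an
assumption-laden illustration: `demo_value' and every `demo_' function
are hypothetical names invented for this example, and the enum is not
how libguile represents values.  The short names are nil-aware by
default, and the faster tests carry the long warning suffix:

```c
#include <assert.h>

/* Hypothetical value model; not libguile's SCM representation. */
typedef enum { DEMO_FALSE, DEMO_TRUE, DEMO_NIL, DEMO_EOL, DEMO_OTHER }
  demo_value;

/* Default predicates: safe for values that may come from elisp. */
static int
demo_is_false (demo_value x)
{
  return x == DEMO_FALSE || x == DEMO_NIL;
}

static int
demo_is_true (demo_value x)
{
  return !demo_is_false (x);
}

static int
demo_is_null (demo_value x)
{
  return x == DEMO_EOL || x == DEMO_NIL;
}

/* Fast variants: one comparison each, correct only when the caller
   can show the value is never nil.  The long name is the warning. */
static int
demo_is_false_assume_never_nil (demo_value x)
{
  return x == DEMO_FALSE;
}

static int
demo_is_null_assume_never_nil (demo_value x)
{
  return x == DEMO_EOL;
}
```

Passing nil to one of the `_assume_never_nil' variants gives the wrong
answer silently, which is the kind of rarely-triggered bug the longer
names are meant to flag.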

Best regards,

   Mark

On Wed, Jul 01, 2009 at 10:54:50PM +0100, Neil Jerram wrote:
> OK, I see.  The point is that VM ops like br-if use SCM_FALSEP (which
> is equivalent to scm_is_false), and hence you're wondering if it would
> be easier to change the definition of scm_is_false, than to modify
> those ops to say (SCM_FALSEP (x) || SCM_NILP (x)).
> 
> I think the balance of arguments is clearly against doing that:
> 
> - There are lots of places that use scm_is_false where there is no
>   need to allow for the value being tested being %nil.  Changing
>   scm_is_false would be a performance hit for those places.
> 
> - There are only a handful of places (I think) that you need to change
>   to get %nil-falseness in the VM.
> 
> - There is a similar number of places which already implement
>   %nil-falseness in the interpreter by using (scm_is_false (x) ||
>   SCM_NILP (x)), and these would logically have to be changed if you
>   made your proposed change.
> 
> - It would be an incompatible API change.
> 
> So please just change the relevant places in the VM to say
> (scm_is_false (x) || SCM_NILP (x)) instead.
> 
> Regards,
>         Neil
> 
> 
> 


* Re: truth of %nil
  2009-07-05 13:07           ` Mark H Weaver
@ 2009-08-30 11:07             ` Neil Jerram
  2009-08-30 14:11               ` Mark H Weaver
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-08-30 11:07 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, Daniel Kraft, guile-devel

Hi Mark!

Mark H Weaver <mhw@netris.org> writes:

> I would like to argue that the definitions of scm_is_false,
> scm_is_true, and scm_is_null should indeed be changed to test for
> %nil.

OK, thanks to your arguments, I now agree with this.

> Do a grep-find in the tree for uses of these macros.  I think you'll
> find that the majority of places where they are used should also be
> checking for %nil, but they are not.

I started doing this.  Actually some of the first ones are in
backtrace.c (testing for the file and line source properties), and I
think those may be counter-examples...  but I agree that there are
many many cases that should be allowing for %nil.

But then I thought that this argument isn't really about the numbers.
It's that we have effectively taken a decision to treat Elisp as a
special case, among the set of languages that Guile may eventually
support - a fact which I now realize more clearly thanks to your and
others' querying of the treatment of nil, and thanks to the developing
language support - specifically in the sense of being able to pass
data between Scheme and Elisp without requiring any translation.  And
therefore it makes sense that libguile's most immediately available
APIs - i.e. scm_is_false/true/bool/null - should allow for their args
coming from either Scheme or Elisp.

So, thanks for persisting with the argument.

> If, in a particular use, it
> can be proved statically that the value was not created by an elisp
> function (which we can almost never prove), then that is a case where
> we can use some faster test.  But someone will have to think about
> each of these cases individually anyway, so it makes sense that these
> faster tests should be named something different than the old names,

This is of course something that you've included in your patch, and
I'm happy with that.

This is also something that could (potentially!) be optimized at
runtime or by the compiler (reminds me of the class of calls where
type checking could be eliminated if the compiler can prove that
objects will always be of the required type).

So, if you would be happy to do so, can I suggest that you rework your
patches so that they also make (and then assume, obviously) the
scm_is_false/true/bool/null change, and incorporate my other comments?

It would also be more convenient - and better for giving you your
deserved attribution - if you could submit them as Git patches.  Would
that be possible?  Alternatively, if you have your own Git repository,
we could pull from that.

Also, we will need documentation of the new APIs, and to explain the
overall concept; and a NEWS entry; and maybe a couple of tests to
check where data that includes %nil is passed to some of the fixed
functions.  Would you be willing to prepare all those too?

> One category of place where these could be used is code dealing with
> data structures created internally by the evaluator -- though I'm not
> very familiar with guile's internals, so I don't know how common these
> data structures are, if indeed they exist at all.

Yes, indeed.  In the updated patch(es), I suggest that you mark the
cases that you are not sure about, then I and other developers can
help work out what kind of check they should be doing.

Many thanks!

     Neil


* Re: truth of %nil
  2009-08-30 11:07             ` Neil Jerram
@ 2009-08-30 14:11               ` Mark H Weaver
  2009-09-01 22:00                 ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Mark H Weaver @ 2009-08-30 14:11 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Daniel Kraft, guile-devel

Neil wrote:
> > I would like to argue that the definitions of scm_is_false,
> > scm_is_true, and scm_is_null should indeed be changed to test for
> > %nil.
> 
> OK, thanks to your arguments, I now agree with this.

Excellent!

What about scm_is_bool?  I'm tempted to suggest that it should work
the same way as "boolean?" within scheme, whatever that may be.  I
tend to think they ought to treat %nil as boolean, though I'm less
sure of this than about scm_is_true/false/null.  It's the right thing
for type-checking an argument that is expected to be boolean, which
seems to be fairly common in guile.  More complex code that is
dispatching on type (such as the aforementioned GOOPS code) will in
general have to be fixed to take into account that %nil is both a
boolean and a list.

One more thing: scheme code can reasonably expect to "write" a list of
simple values and then "read" it back in.  But now, lists might be
terminated by %nil instead of '().  Therefore, I think "read" needs to
be able to read SCM_LISP_NIL in whatever form we "write" it in.  I'll
let someone more knowledgeable about guile reader issues decide what
that form should be.  Currently we write it as "#nil".

> > If, in a particular use, it
> > can be proved statically that the value was not created by an elisp
> > function (which we can almost never prove), then that is a case where
> > we can use some faster test.  [...]
>
> [...]
> 
> This is also something that could (potentially!) be optimized at
> runtime or by the compiler (reminds me of the class of calls where
> type checking could be eliminated if the compiler can prove that
> objects will always be of the required type).

Yes, I've also given this some thought.  If we were using C++ (I'm
very glad we're not, btw!) then I'm pretty sure we could use the type
system to mark certain functions as never returning %nil, and then
arrange to optimize away the %nil checks in those cases, but I can't
think of a way to do it with C, even with GCC's extensions.  Maybe, if
we can develop a reasonable proposal, we can get sufficient
functionality added to GCC.

> So, if you would be happy to do so, can I suggest that you rework your
> patches so that they also make (and then assume, obviously) the
> scm_is_false/true/bool/null change, and incorporate my other comments?

I will gladly do so.

> It would also be more convenient - and better for giving you your
> deserved attribution - if you could submit them as Git patches.  Would
> that be possible?

Will do.

> Also, we will need documentation of the new APIs, and to explain the
> overall concept; and a NEWS entry; and maybe a couple of tests to
> check where data that includes %nil is passed to some of the fixed
> functions.  Would you be willing to prepare all those too?

Yes, certainly.

> > One category of place where these could be used is code dealing with
> > data structures created internally by the evaluator -- though I'm not
> > very familiar with guile's internals, so I don't know how common these
> > data structures are, if indeed they exist at all.
> 
> Yes, indeed.  In the updated patch(es), I suggest that you mark the
> cases that you are not sure about, then I and other developers can
> help work out what kind of check they should be doing.

Indeed, some of the usage cases are difficult for me to evaluate, so
your help will be much appreciated!

Also, I signed my copyright assignment papers a while ago, and the
relevant file on fencepost has been updated accordingly.

   Best,
    Mark


* Re: truth of %nil
  2009-08-30 14:11               ` Mark H Weaver
@ 2009-09-01 22:00                 ` Neil Jerram
  2009-09-02 15:57                   ` Mark H Weaver
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-09-01 22:00 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, Daniel Kraft, guile-devel

Mark H Weaver <mhw@netris.org> writes:

> What about scm_is_bool?  I'm tempted to suggest that it should work
> the same way as "boolean?" within scheme, whatever that may be.  I
> tend to think they ought to treat %nil as boolean, though I'm less
> sure of this than about scm_is_true/false/null.  It's the right thing
> for type-checking an argument that is expected to be boolean, which
> seems to be fairly common in guile.  More complex code that is
> dispatching on type (such as the aforementioned GOOPS code) will in
> general have to be fixed to take into account that %nil is both a
> boolean and a list.

I agree (i.e. I think scm_is_bool (SCM_LISP_NIL) should be 1).

> One more thing: scheme code can reasonably expect to "write" a list of
> simple values and then "read" it back in.  But now, lists might be
> terminated by %nil instead of '().  Therefore, I think "read" needs to
> be able to read SCM_LISP_NIL in whatever form we "write" it in.  I'll
> let someone more knowledgable about guile reader issues decide what
> that form should be.  Currently we write it as "#nil".

Interesting point, but seems like one that could be left until it
crops up for real somewhere.

I assume the mainline case of writing a proper list will be fine,
because a list like (a b c . #nil) will be written out as "(a b c)" -
right?  Then, when read in again, it would become (a b c . ()) - I
think we may have to wait for real cases to know if that's actually a
problem at all.

> Yes, I've also given this some thought.  If we were using C++ (I'm
> very glad we're not, btw!) then I'm pretty sure we could use the type
> system to mark certain functions as never returning %nil, and then
> arrange to optimize away the %nil checks in those cases, but I can't
> think of a way to do it with C, even with GCC's extensions.  Maybe, if
> we can develop a reasonable proposal, we can get sufficient
> functionality added to GCC.

I was actually meaning the VM compiler...  but yes, maybe there are
also C things we could do.

>> So, if you would be happy to do so, can I suggest that you rework your
>> patches so that they also make (and then assume, obviously) the
>> scm_is_false/true/bool/null change, and incorporate my other comments?
>
> I will gladly do so.

Fantastic, thanks (and also for your 'Yes's to the other add-on
pieces)!

> Also, I signed my copyright assignment papers a while ago, and the
> relevant file on fencepost has been updated accordingly.

Yes, indeed; we (maintainers) got notified about that at the time;
apologies for not closing the loop with you then.

Regards,
        Neil





* Re: truth of %nil
  2009-09-01 22:00                 ` Neil Jerram
@ 2009-09-02 15:57                   ` Mark H Weaver
  2009-09-17 21:21                     ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Mark H Weaver @ 2009-09-02 15:57 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Daniel Kraft, guile-devel

Neil Jerram wrote:
> > One more thing: scheme code can reasonably expect to "write" a list of
> > simple values and then "read" it back in.  But now, lists might be
> > terminated by %nil instead of '().  Therefore, I think "read" needs to
> > be able to read SCM_LISP_NIL in whatever form we "write" it in.  I'll
> > let someone more knowledgable about guile reader issues decide what
> > that form should be.  Currently we write it as "#nil".
> 
> Interesting point, but seems like one that could be left until it
> crops up for real somewhere.
> 
> I assume the mainline case of writing a proper list will be fine,
> because a list like (a b c . #nil) will be written out as "(a b c)" -
> right?  Then, when read in again, it would become (a b c . ()) - I
> think we may have to wait for real cases to know if that's actually a
> problem at all.

Certainly writing (a b c . #nil) as (a b c) would be most natural and
convenient, and maybe it's the best compromise, but I'm not entirely
sure it's safe.

What if we have an association list mapping symbols to booleans that
came from elisp?  Such an alist might look something like
((a . #t) (b . #nil)), and can reasonably be assumed to be written
and then read back in, but doing so would then result in
((a . #t) (b . ())), magically changing the false to a true.
This also violates the idea that CARs and CDRs should be treated the
same way.

I'm tempted to suggest that "write" should write (a . #nil) as
"(a . #nil)", and "display" should write it as "(a)".

   Best,
    Mark


* Re: truth of %nil
  2009-09-02 15:57                   ` Mark H Weaver
@ 2009-09-17 21:21                     ` Neil Jerram
  0 siblings, 0 replies; 38+ messages in thread
From: Neil Jerram @ 2009-09-17 21:21 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, Daniel Kraft, guile-devel

Mark H Weaver <mhw@netris.org> writes:

> Certainly writing (a b c . #nil) as (a b c) would be most natural and
> convenient, and maybe it's the best compromise, but I'm not entirely
> sure it's safe.
>
> What if we have an association list mapping symbols to booleans that
> came from elisp?  Such a alist might look something like
> ((a . #t) (b . #nil)), and can reasonably be assumed to be written
> and then read back in, but doing so would then result in
> ((a . #t) (b . ())), magically changing the false to a true.

Hmmm...  From the elisp point of view it's still false, of course.  From
the scheme point of view your point stands.

> This also violates the idea the CARs and CDRs should be treated the
> same way.

Also a good point.

> I'm tempted to suggest that "write" should write (a . #nil) as
> "(a . #nil)", and "display" should write it as "(a)".

For now I'm happy with any reasonable position (such as this), because I
don't think we've got any data to help decide between the options.
Hopefully it won't be too long before we have some real non-trivial
Guile/Scheme/Elisp interactions.
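
For reference, the "write" versus "display" suggestion quoted above can
be sketched as a toy printer.  Everything here is hypothetical (the
`obj' type, `print_obj', and the `readable' flag are invented for the
example and are not Guile's printer); the sketch also assumes that nil
outside the tail position prints as "#nil" in both modes, which the
thread does not settle:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical object model; not libguile's representation. */
typedef enum { K_SYMBOL, K_NIL, K_EOL, K_PAIR } obj_kind;

typedef struct obj
{
  obj_kind kind;
  const char *name;             /* K_SYMBOL only */
  const struct obj *car, *cdr;  /* K_PAIR only */
} obj;

/* Append S to the NUL-terminated BUF of capacity CAP, truncating. */
static void
emit (char *buf, size_t cap, const char *s)
{
  size_t used = strlen (buf);
  if (used < cap - 1)
    strncat (buf, s, cap - 1 - used);
}

/* READABLE != 0 models "write": a nil tail is kept as " . #nil".
   READABLE == 0 models "display": a nil tail ends the list like '(). */
static void
print_obj (const obj *o, int readable, char *buf, size_t cap)
{
  switch (o->kind)
    {
    case K_SYMBOL: emit (buf, cap, o->name); break;
    case K_NIL:    emit (buf, cap, "#nil");  break;
    case K_EOL:    emit (buf, cap, "()");    break;
    case K_PAIR:
      emit (buf, cap, "(");
      for (;;)
        {
          const obj *tail = o->cdr;
          print_obj (o->car, readable, buf, cap);
          if (tail->kind == K_PAIR)
            {
              emit (buf, cap, " ");
              o = tail;
              continue;
            }
          if (tail->kind == K_EOL || (tail->kind == K_NIL && !readable))
            break;              /* list terminator: print nothing */
          emit (buf, cap, " . ");
          print_obj (tail, readable, buf, cap);
          break;
        }
      emit (buf, cap, ")");
      break;
    }
}
```

In this sketch the readable form of (a . #nil) keeps the nil, so a
reader that understands "#nil" could reconstruct the same value, while
the display form is the shorter "(a)".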

      Neil





* Re: truth of %nil
  2009-06-29 21:44 ` Neil Jerram
  2009-06-29 22:11   ` Andy Wingo
@ 2009-07-02 14:28   ` Mark H Weaver
  2009-07-02 14:50     ` Ludovic Courtès
  2009-07-02 22:50     ` Neil Jerram
  1 sibling, 2 replies; 38+ messages in thread
From: Mark H Weaver @ 2009-07-02 14:28 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, guile-devel

I've been considering writing a python compiler for guile.  For python
(and others) there are several values considered to be false, such as
0 and various empty collections, and so a different approach will have
to be taken to this problem.

If we want guile to handle many different languages, should we not try
to find an approach to "false-ness" that handles many languages, and
not just a few?

It seems to me that some code might misbehave in the presence of two
values which are both null? but not eq? to each other.  Also, it seems
more consistent to use the same strategy for handling various
languages' notions of false-ness.

To my mind, we should not be changing the data (which only works for
lisp), but rather the constructs that decide whether a given value is
false.

So how about having elisp `if' and `cond' compile not to scheme `if'
and `cond', but rather to scheme `elisp-if' and `elisp-cond'?  Or
perhaps compile `(if c a b)' to `(if (elisp-true? c) a b)'.

This approach, unlike the %nil approach, will work for other languages
too.
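
A rough sketch of what per-language truth predicates could look like,
with loud assumptions: the `val' type and the `scheme_true',
`elisp_true' and `python_true' functions are hypothetical, elisp nil is
assumed to be represented by the ordinary '() value rather than a
separate %nil, and '() stands in for Python's empty collections:

```c
#include <assert.h>

/* Hypothetical value model shared by all front ends in this sketch. */
typedef enum { K_FALSE, K_TRUE, K_EOL, K_INT, K_OTHER } val_kind;

typedef struct
{
  val_kind kind;
  long n;                       /* K_INT only */
} val;

/* Scheme: #f is the only false value. */
static int
scheme_true (val v)
{
  return v.kind != K_FALSE;
}

/* Elisp (assumption: nil is represented as '()): both #f and '()
   are false. */
static int
elisp_true (val v)
{
  return v.kind != K_FALSE && v.kind != K_EOL;
}

/* Python-like: #f, the empty list and the integer 0 are false. */
static int
python_true (val v)
{
  if (v.kind == K_FALSE || v.kind == K_EOL)
    return 0;
  if (v.kind == K_INT && v.n == 0)
    return 0;
  return 1;
}
```

Each front end would then compile its own `if' to a branch on its own
predicate, leaving the Scheme test a single comparison.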

It also means that Guile's normal `if' and `cond' won't be slowed down
by having to check for two values instead of one.  That overhead may
be insignificant now, but when we have a native code compiler, it will
be quite significant in code size at least, even if the
representations of %nil and #f differ by only one bit.

What do you think?

    Mark

On Mon, Jun 29, 2009 at 10:44:54PM +0100, Neil Jerram wrote:
> Seems wrong to me.  In Scheme #f should be the only false value.
> What's the argument for %nil being false in Scheme code?


* Re: truth of %nil
  2009-07-02 14:28   ` Mark H Weaver
@ 2009-07-02 14:50     ` Ludovic Courtès
  2009-07-02 22:50     ` Neil Jerram
  1 sibling, 0 replies; 38+ messages in thread
From: Ludovic Courtès @ 2009-07-02 14:50 UTC (permalink / raw)
  To: guile-devel

Hi,

Mark H Weaver <mhw@netris.org> writes:

> I've been considering writing a python compiler for guile.  For python
> (and others) there are several values considered to be false, such as
> 0 and various empty collections, and so a different approach will have
> to be taken to this problem.

[...]

> So how about having elisp `if' and `cond' compile not to scheme `if'
> and `cond', but rather to scheme `elisp-if' and `elisp-cond'?  Or
> perhaps compile `(if c a b)' to `(if (elisp-true? c) a b)'.

I concur (but I haven't followed the elisp discussion closely).

Regardless of which approach the elisp front-end takes, this is
something other languages can already do.  This is something the
ECMAScript front-end does: see how `if' is handled in
`language/ecmascript/compile-ghil.scm' and the definition of `->boolean'
in `language/ecmascript/base.scm'.

Thanks,
Ludo'.


* Re: truth of %nil
  2009-07-02 14:28   ` Mark H Weaver
  2009-07-02 14:50     ` Ludovic Courtès
@ 2009-07-02 22:50     ` Neil Jerram
  2009-07-03 15:32       ` Mark H Weaver
  1 sibling, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-07-02 22:50 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, guile-devel

Mark H Weaver <mhw@netris.org> writes:

> I've been considering writing a python compiler for guile.

Great!

>  For python
> (and others) there are several values considered to be false, such as
> 0 and various empty collections, and so a different approach will have
> to be taken to this problem.
>
> If we want guile to handle many different languages, should we not try
> to find an approach to "false-ness" that handles many languages, and
> not just a few?

There's been loads of prior discussion on this subject.  Here are
pointers to some of that.

http://sourceware.org/ml/guile/1999-07/msg00251.html
http://sourceware.org/ml/guile/1998-07/msg00187.html
http://lists.gnu.org/archive/html/guile-devel/2001-09/msg00140.html
http://lists.gnu.org/archive/html/guile-devel/2001-11/msg00016.html

> It seems to me that some code might misbehave in the presence of two
> values which are both null? but not eq? to each other.

Example?  (This seems quite unlikely to me.)

> Also, it seems more consistent to use the same strategy for handling
> various languages' notions of false-ness.
>
> To my mind, we should not be changing the data (which only works for
> lisp), but rather the constructs that decide whether a given value is
> false.
>
> So how about having elisp `if' and `cond' compile not to scheme `if'
> and `cond', but rather to scheme `elisp-if' and `elisp-cond'?  Or
> perhaps compile `(if c a b)' to `(if (elisp-true? c) a b)'.
>
> This approach, unlike the %nil approach, will work for other languages
> too.

Certainly this is a possible approach.  In what's been done so far,
and what we do in future, I don't think there are any arguments that
trump all the other considerations.  It's just a matter of balancing
performance, robustness, and so on.  If more non-Lisp-like languages
are added, your consideration of cross-language consistency would gain
more weight.

On a matter of detail, I don't understand your statement that the
current %nil approach won't work for other languages.  As the query
that started this thread shows, it is perfectly possible to code a new
language (VM-Scheme, in this case) in which %nil is true.

If I understand it correctly, a key point of the thinking up till now
is that Elisp is a special case because it is so `tantalizingly
similar' (as Jim put it) to Scheme.  This similarity creates the
possibility of passing data directly between Elisp and Scheme, and the
fact that Guile Scheme treats %nil as both #f and '() follows from
that; otherwise it would be necessary to convert data as it passes
from one language to the other.  In other words - a performance point.

Now we have Brainfuck and ECMAScript too, but I don't know if they are
complex enough to cast significant doubt on the existing balance.  (To
be honest, I'm not sure if that's true for ECMAScript, I need to look
at Andy's code.)

Python on the other hand would be plenty complex enough, and I assume
it has arbitrarily complex data structures.  How do you envisage data
transfer working between Python and other languages?

> It also means that Guile's normal `if' and `cond' won't be slowed down
> by having to check for two values instead of one.  That overhead may
> be insignificant now, but when we have a native code compiler, it will
> be quite significant in code size at least, even if the
> representations of %nil and #f differ by only one bit.

Do you really think so?  Just because of two compare operations
instead of one?  Perhaps I'm misunderstanding you.

> On Mon, Jun 29, 2009 at 10:44:54PM +0100, Neil Jerram wrote:
>> Seems wrong to me.  In Scheme #f should be the only false value.
>> What's the argument for %nil being false in Scheme code?

(Just for the record, my statement here was wrong, and I've since
corrected it.)

Regards,
        Neil

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-02 22:50     ` Neil Jerram
@ 2009-07-03 15:32       ` Mark H Weaver
  2009-07-05  2:41         ` Mark H Weaver
  2009-07-23 21:12         ` Andy Wingo
  0 siblings, 2 replies; 38+ messages in thread
From: Mark H Weaver @ 2009-07-03 15:32 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

Thank you, Neil, for the pointers to earlier discussions of this
subject.  Having read them, I've been convinced that the %nil approach
is reasonable and probably the best way to deal with elisp<->scheme
interoperability.  Though, like you, I was not willing to easily
accept two false values and two end-of-list values :)

> Python on the other hand would be plenty complex enough, and I assume
> it has arbitrarily complex data structures.  How do you envisage data
> transfer working between Python and other languages?

I'll answer this in a later email.

> > It also means that Guile's normal `if' and `cond' won't be slowed down
> > by having to check for two values instead of one.  That overhead may
> > be insignificant now, but when we have a native code compiler, it will
> > be quite significant in code size at least, even if the
> > representations of %nil and #f differ by only one bit.
> 
> Do you really think so?  Just because of two compare operations
> instead of one?  Perhaps I'm misunderstanding you.

A single compare and branch is a very short instruction sequence,
especially (on some architectures at least) if the constant being
compared is a small number.

When there are two values to compare against, that means either two
compares and two conditional branches, or, if the two values differ by
only one bit, ANDing with a mask before the compare.  Either way, the
code size increases quite a bit.  This price will be paid for every
boolean test, and every end-of-list test, which are obviously very
common.

It might be worth considering a build-time option to disable %nil, so
that it's possible to build a version of guile which doesn't pay this
price.

   Best,
    Mark

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-03 15:32       ` Mark H Weaver
@ 2009-07-05  2:41         ` Mark H Weaver
  2009-07-05  9:19           ` Andy Wingo
  2009-07-06 21:46           ` truth of %nil Neil Jerram
  2009-07-23 21:12         ` Andy Wingo
  1 sibling, 2 replies; 38+ messages in thread
From: Mark H Weaver @ 2009-07-05  2:41 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

Below is a proposal for how to make boolean tests and end-of-list
tests faster and more compact, by renumbering the representations for
SCM_ELISP_NIL, SCM_EOL, SCM_UNDEFINED, and SCM_EOF_VAL.

But first, I decided to quantify the increase in code size testing
against two constants instead of one, by compiling the following short
test program with gcc -Os on three architectures: x86-32, arm, and
sparc:

(specifically, I did "gcc -Os -S foo.c" and then "as -a foo.s")

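	/* Scan until the word equals 0x004 (one compare per element). */
	void loop1(int *p)
	{
	  while (*p != 0x004)
	    p++;
	}

	/* Same scan, but mask off bit 0x200 first so 0x004 and 0x204 both match. */
	void loop2(int *p)
	{
	  while ((*p & ~0x200) != 0x004)
	    p++;
	}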

The sizes of the resulting loop bodies, in bytes, are as follows:

	arch   loop1   loop2
	--------------------
	x86-32   8      13
	arm     12      16
	sparc   16      20
	--------------------

I guess this is not too bad.

The constants chosen above are based on the following proposal on how
best to make SCM_BOOL_F and SCM_ELISP_NIL differ by only one bit, and
the same for SCM_EOL and SCM_ELISP_NIL.

This can be accomplished by making:

SCM_ELISP_NIL equal to SCM_MAKIFLAG (2) i.e. 0x204, and
SCM_EOL       equal to SCM_MAKIFLAG (3) i.e. 0x304.

These values are currently used by SCM_UNDEFINED and SCM_EOF_VAL, but
I'm hoping it won't be too disruptive to renumber those, especially if
it's done before the release of 2.0.

Then, testing for boolean truth becomes:

	if ((x & ~0x200) != 0x004)

and testing for end-of-list becomes:

	if ((x & ~0x100) == 0x204)

These should of course be written differently, perhaps as follows:

	if ((x & ~(SCM_ELISP_NIL ^ SCM_BOOL_F)) != (SCM_ELISP_NIL & SCM_BOOL_F))

and for end-of-list:

	if ((x & ~(SCM_ELISP_NIL ^ SCM_EOL)) == (SCM_ELISP_NIL & SCM_EOL))

along with a regression test somewhere to complain unless both of the
XOR subexpressions above are powers of two:

	#define IS_POWER_OF_TWO(x)             (((x) & ((x)-1)) == 0)
	#define DIFFER_BY_ONLY_ONE_BIT(x, y)   IS_POWER_OF_TWO((x)^(y))

	if ( ! DIFFER_BY_ONLY_ONE_BIT(SCM_ELISP_NIL, SCM_BOOL_F) )
	  complain();
	if ( ! DIFFER_BY_ONLY_ONE_BIT(SCM_ELISP_NIL, SCM_EOL) )
	  complain();
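
For illustration, here is a minimal self-contained sketch of the two
tests and the one-bit check, using plain unsigned constants in place of
the real SCM values.  Everything with a DEMO_/demo_ prefix is a
hypothetical stand-in, the values are the ones proposed above (not what
is in the tree today), and #t as SCM_MAKIFLAG (1) is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-ins for the proposed immediate values. */
#define DEMO_BOOL_F     0x004u  /* SCM_MAKIFLAG (0) */
#define DEMO_BOOL_T     0x104u  /* SCM_MAKIFLAG (1), assumed */
#define DEMO_ELISP_NIL  0x204u  /* SCM_MAKIFLAG (2), proposed */
#define DEMO_EOL        0x304u  /* SCM_MAKIFLAG (3), proposed */

/* The & subexpression needs its own parentheses, because == binds
   tighter than & in C.  Zero also passes, so this is only meaningful
   for two values already known to be distinct. */
#define DEMO_IS_POWER_OF_TWO(x)            (((x) & ((x) - 1)) == 0)
#define DEMO_DIFFER_BY_ONLY_ONE_BIT(x, y)  DEMO_IS_POWER_OF_TWO ((x) ^ (y))

/* True for #f and %nil: the one bit in which they differ is masked away. */
static inline int demo_is_false (unsigned x)
{
  return (x & ~(DEMO_ELISP_NIL ^ DEMO_BOOL_F))
         == (DEMO_ELISP_NIL & DEMO_BOOL_F);
}

/* True for '() and %nil, by the same trick on a different bit. */
static inline int demo_is_null (unsigned x)
{
  return (x & ~(DEMO_ELISP_NIL ^ DEMO_EOL))
         == (DEMO_ELISP_NIL & DEMO_EOL);
}

/* Run-time version of the layout check: aborts if the numbering ever
   stops satisfying the one-bit property the two tests rely on. */
static inline void demo_check_layout (void)
{
  assert (DEMO_DIFFER_BY_ONLY_ONE_BIT (DEMO_ELISP_NIL, DEMO_BOOL_F));
  assert (DEMO_DIFFER_BY_ONLY_ONE_BIT (DEMO_ELISP_NIL, DEMO_EOL));
}
```

With these values demo_is_false accepts only 0x004 and 0x204, and
demo_is_null accepts only 0x204 and 0x304.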

What do you think?

    Mark

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-05  2:41         ` Mark H Weaver
@ 2009-07-05  9:19           ` Andy Wingo
  2009-07-07 11:14             ` Mark H Weaver
  2009-07-06 21:46           ` truth of %nil Neil Jerram
  1 sibling, 1 reply; 38+ messages in thread
From: Andy Wingo @ 2009-07-05  9:19 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel, Neil Jerram

On Sun 05 Jul 2009 03:41, Mark H Weaver <mhw@netris.org> writes:

> Below is a proposal for how to make boolean tests and end-of-list
> tests faster and more compact, by renumbering the representations for
> SCM_ELISP_NIL, SCM_EOL, SCM_UNDEFINED, and SCM_EOF_VAL.

That looks like great work, Mark!!

I don't think it's a problem to renumber these constants, no. A couple
of questions though:

> 	loop1(int *p)
> 	{
> 	  while (*p != 0x004)
> 	    p++;
> 	}

Did you mean while (p != 0x004) ?

Also, can you make a third test, equivalent to p == SCM_EOL || p ==
SCM_ELISP_NIL ?
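
Such a third loop might look roughly like the sketch below, next to the
masked form it would be compared against.  The demo_ names are
hypothetical, the constants are the 0x204/0x304 values from the
proposal, and the functions return an index only so that the result can
be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, taken from the renumbering proposal. */
#define DEMO_ELISP_NIL  0x204
#define DEMO_EOL        0x304

/* Two compares per element: stop at the first '() or %nil. */
static inline size_t demo_loop3 (const int *p)
{
  size_t i = 0;
  while (!(p[i] == DEMO_EOL || p[i] == DEMO_ELISP_NIL))
    i++;
  return i;
}

/* One mask and one compare per element, same stopping condition. */
static inline size_t demo_loop3_masked (const int *p)
{
  size_t i = 0;
  while ((p[i] & ~(DEMO_ELISP_NIL ^ DEMO_EOL))
         != (DEMO_ELISP_NIL & DEMO_EOL))
    i++;
  return i;
}

/* Both forms must stop at the same element. */
static inline void demo_loop3_example (void)
{
  const int words[] = { 0x004, 0x104, 0x304 };
  assert (demo_loop3 (words) == demo_loop3_masked (words));
}
```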

> The sizes of the resulting loop bodies, in bytes, are as follows:
>
> 	arch   loop1   loop2
> 	--------------------
> 	x86-32   8      13
> 	arm     12      16
> 	sparc   16      20
> 	--------------------
>
> I guess this is not too bad.

I realize this is a bit of a silly benchmark, but can you time these?
Actually, can you time Guile? The changes to Guile should be minimal,
after all.

> What do you think?

Excellence, good sir, excellence!

Andy
-- 
http://wingolog.org/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-05  9:19           ` Andy Wingo
@ 2009-07-07 11:14             ` Mark H Weaver
  2009-07-08 13:17               ` Mark H. Weaver
  2009-08-30 11:13               ` Neil Jerram
  0 siblings, 2 replies; 38+ messages in thread
From: Mark H Weaver @ 2009-07-07 11:14 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel, Neil Jerram

Having thought more about optimizing %nil handling, it occurs to me
that we will also want boolean tests from within lisp to be optimized.

From lisp, three values are considered to be false: #f, '(), and %nil.
We can use the same bit-masking trick to do these tests quickly if we
make sure that these three values differ in only two bit positions.

Therefore, I suggest that the first four SCM_MAKIFLAG values should be
#f, %nil, '(), and another never-to-be-used value which would also be
considered false by lisp code as a side effect of the masking trick.

In my previous proposal, I made sure to keep SCM_BOOL_F and SCM_BOOL_T
in IFLAG numbers 0 and 1.  If this is important, it could still be
arranged by making the aforementioned two bit positions something
other than the lowest two bits of the IFLAG number, but that would
mean our three lisp-false values would be spread out (e.g. 0/2/4/6).

Therefore, unless someone tells me otherwise, I'm going to assume it's
okay to put SCM_BOOL_F and SCM_BOOL_T in IFLAG numbers 0 and 4.  These
still have the property that SCM_BOOL_F and SCM_BOOL_T differ by only
one bit, and that SCM_BOOL_F is IFLAG number 0, both of which seem
potentially useful.

So, in my new proposal, the first five IFLAGS are as follows:

#define SCM_BOOL_F		SCM_MAKIFLAG (0)
#define SCM_ELISP_NIL		SCM_MAKIFLAG (1)
/* SCM_MAKIFLAG (2) would also be considered "false" by lisp code
 * and therefore should remain unassigned */
#define SCM_EOL			SCM_MAKIFLAG (3)
#define SCM_BOOL_T 		SCM_MAKIFLAG (4)

This numbering has the nice properties that 0 is #f, the first two are
considered false by scheme, and the first four are considered false by
lisp.

[An alternative numbering for which the following macros would also
 work is: ((#f 0) (#t 1) (%nil 2) (unused 4) (() 6)), if it's important
 to keep #f and #t together]

The testing macros would be as follows (these can of course be renamed
if my other pending proposal is rejected):

#define scm_is_false(x)  \
  (((x) & ~(SCM_ELISP_NIL ^ SCM_BOOL_F)) == (SCM_ELISP_NIL & SCM_BOOL_F))
#define scm_is_true(x)  \
  (((x) & ~(SCM_ELISP_NIL ^ SCM_BOOL_F)) != (SCM_ELISP_NIL & SCM_BOOL_F))
#define scm_is_null(x)   \
  (((x) & ~(SCM_ELISP_NIL ^ SCM_EOL))    == (SCM_ELISP_NIL & SCM_EOL))

#define scm_is_false_xxx_assume_not_lisp_nil(x)  ((x) == SCM_BOOL_F)
#define scm_is_true_xxx_assume_not_lisp_nil(x)   ((x) != SCM_BOOL_F)
#define scm_is_null_xxx_assume_not_lisp_nil(x)   ((x) == SCM_EOL)

And the lisp boolean tests would be something like this:

/*
 * Since we know SCM_ELISP_NIL and SCM_BOOL_F differ by exactly one
 * bit, and that SCM_ELISP_NIL and SCM_EOL differ by exactly one bit,
 * and that they of course can't be the same bit (or else SCM_BOOL_F
 * and SCM_EOL would be equal), it follows that SCM_BOOL_F and SCM_EOL
 * differ by exactly two bits, and those are the ones we need to
 * mask out to collapse all three values together.
 */
#define scm_is_lisp_false(x)  \
  (((x) & ~(SCM_BOOL_F ^ SCM_EOL)) == (SCM_BOOL_F & SCM_EOL))
#define scm_is_lisp_true(x)  \
  (((x) & ~(SCM_BOOL_F ^ SCM_EOL)) != (SCM_BOOL_F & SCM_EOL))
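
To make the arithmetic concrete, here is a small stand-alone sketch of
this numbering.  It assumes SCM_MAKIFLAG (n) behaves like
(n << 8) + 0x04, which is consistent with the 0x204/0x304 values quoted
earlier, and it works on plain unsigned integers rather than real SCM
values.  All DEMO_/demo_ names are hypothetical.

```c
#include <assert.h>

/* Assumed shape of SCM_MAKIFLAG: shift the IFLAG number, add the tag. */
#define DEMO_MAKIFLAG(n)  (((n) << 8) + 0x04u)

#define DEMO_BOOL_F    DEMO_MAKIFLAG (0u)   /* 0x004 */
#define DEMO_LISP_NIL  DEMO_MAKIFLAG (1u)   /* 0x104 */
#define DEMO_UNUSED    DEMO_MAKIFLAG (2u)   /* 0x204, never handed out */
#define DEMO_EOL       DEMO_MAKIFLAG (3u)   /* 0x304 */
#define DEMO_BOOL_T    DEMO_MAKIFLAG (4u)   /* 0x404 */

/* Scheme view: #f and %nil are false. */
#define demo_is_false(x) \
  (((x) & ~(DEMO_LISP_NIL ^ DEMO_BOOL_F)) == (DEMO_LISP_NIL & DEMO_BOOL_F))
/* Scheme view: '() and %nil are end-of-list. */
#define demo_is_null(x) \
  (((x) & ~(DEMO_LISP_NIL ^ DEMO_EOL)) == (DEMO_LISP_NIL & DEMO_EOL))
/* Lisp view: #f, %nil and '() are false, and so is the unused slot. */
#define demo_is_lisp_false(x) \
  (((x) & ~(DEMO_BOOL_F ^ DEMO_EOL)) == (DEMO_BOOL_F & DEMO_EOL))

/* Checks the properties claimed for the numbering. */
static inline void demo_check_numbering (void)
{
  assert (demo_is_false (DEMO_BOOL_F) && demo_is_false (DEMO_LISP_NIL));
  assert (!demo_is_false (DEMO_EOL) && !demo_is_false (DEMO_BOOL_T));
  assert (demo_is_null (DEMO_EOL) && demo_is_null (DEMO_LISP_NIL));
  assert (!demo_is_null (DEMO_BOOL_F) && !demo_is_null (DEMO_BOOL_T));
  assert (demo_is_lisp_false (DEMO_BOOL_F));
  assert (demo_is_lisp_false (DEMO_LISP_NIL));
  assert (demo_is_lisp_false (DEMO_EOL));
  assert (!demo_is_lisp_false (DEMO_BOOL_T));
}
```

The unused IFLAG 2 is the one side effect: demo_is_lisp_false accepts
it, which is why it should never be assigned.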

What do you think?

Also, you may have noticed that I've been using the term "lisp"
instead of "elisp".  This is because guile may support other lisps in
the future, and they will also need the same %nil handling.  (For that
matter, we could even use %nil to implement an "old scheme" language
which treats '() as false.)  With this in mind, should SCM_ELISP_NIL
be renamed to SCM_LISP_NIL?

Andy: thanks for the warm reception, and I'll answer your questions in
a later email.

  Best regards,
     Mark

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-07 11:14             ` Mark H Weaver
@ 2009-07-08 13:17               ` Mark H. Weaver
  2009-08-30 11:20                 ` Neil Jerram
  2009-08-30 11:13               ` Neil Jerram
  1 sibling, 1 reply; 38+ messages in thread
From: Mark H. Weaver @ 2009-07-08 13:17 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, Neil Jerram, guile-devel

I've discovered two more tests that can be optimized using the same
bit masking tricks: scm_is_bool and scm_is_bool_or_lisp_nil (newly
created).  Since SCM_BOOL_F and SCM_BOOL_T differ by only one bit,
that one is easy.  The other one can be implemented the same way
as scm_is_lisp_false in my last proposal, by making IFLAG 5 another
never-to-be-used value.  That way, IFLAGS 0/1/4/5 (#f %nil #t
dont-use-2) are all the same except for two bit positions.

So I was thinking that scm_is_bool and scm_is_bool_or_lisp_nil could
be implemented as macros, which are as fast and as compact as testing
for boolean truth.  What do you think?
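
A rough sketch of what I have in mind, on plain unsigned integers and
with SCM_MAKIFLAG (n) assumed to be (n << 8) + 0x04.  The DEMO_/demo_
names are hypothetical, and the real macros would go through
SCM_UNPACK.

```c
#include <assert.h>

/* Assumed shape of SCM_MAKIFLAG. */
#define DEMO_MAKIFLAG(n)  (((n) << 8) + 0x04u)

#define DEMO_BOOL_F      DEMO_MAKIFLAG (0u)
#define DEMO_LISP_NIL    DEMO_MAKIFLAG (1u)
#define DEMO_EOL         DEMO_MAKIFLAG (3u)
#define DEMO_BOOL_T      DEMO_MAKIFLAG (4u)
#define DEMO_DONT_USE_2  DEMO_MAKIFLAG (5u)  /* reserved, never handed out */

/* #f and #t differ in one bit, so a single mask and compare suffices. */
#define demo_is_bool(x) \
  (((x) & ~(DEMO_BOOL_F ^ DEMO_BOOL_T)) == (DEMO_BOOL_F & DEMO_BOOL_T))

/* IFLAGs 0/1/4/5 differ only in two bit positions; %nil and #t between
   them cover both bits, so their XOR is the mask. */
#define demo_is_bool_or_lisp_nil(x) \
  (((x) & ~(DEMO_LISP_NIL ^ DEMO_BOOL_T)) == (DEMO_LISP_NIL & DEMO_BOOL_T))

/* Checks which values each predicate accepts. */
static inline void demo_check_bool_tests (void)
{
  assert (demo_is_bool (DEMO_BOOL_F) && demo_is_bool (DEMO_BOOL_T));
  assert (!demo_is_bool (DEMO_LISP_NIL) && !demo_is_bool (DEMO_EOL));
  assert (demo_is_bool_or_lisp_nil (DEMO_BOOL_F));
  assert (demo_is_bool_or_lisp_nil (DEMO_LISP_NIL));
  assert (demo_is_bool_or_lisp_nil (DEMO_BOOL_T));
  assert (!demo_is_bool_or_lisp_nil (DEMO_EOL));
}
```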

Also, since writing my last email, I've realized that my testing
macros need to use SCM_UNPACK.

I'm currently in the process of preparing a patch.  As part of that
process, I'm reviewing all uses of the affected macros, and evaluating
for each use case how %nil should be handled.  In order to remain
flexible with regard to my other pending proposal, I'm forking macros
into two variants: one which checks for %nil when appropriate, and one
which doesn't.  We can decide what their names should be later.

I found one thorny use of scm_is_bool and scm_is_null, and request
your collective wisdom:

scm_class_of() in goops.c tries to determine the class of a scheme
value.  If scm_is_bool returns true, it's classified as
scm_class_boolean, and if scm_is_null returns true, it's classified as
scm_class_null.  Right now, that code doesn't consider %nil at all.

How do you all think %nil should be handled by goops?

It seems to me that it would be nice to try to support %nil
transparently when possible.

    Mark

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-08 13:17               ` Mark H. Weaver
@ 2009-08-30 11:20                 ` Neil Jerram
  0 siblings, 0 replies; 38+ messages in thread
From: Neil Jerram @ 2009-08-30 11:20 UTC (permalink / raw)
  To: Mark H. Weaver; +Cc: Andy Wingo, guile-devel

"Mark H. Weaver" <mhw@netris.org> writes:

> I found one thorny use of scm_is_bool and scm_is_null, and request
> your collective wisdom:
>
> scm_class_of() in goops.c tries to determine the class of a scheme
> value.  If scm_is_bool returns true, it's classified as
> scm_class_boolean, and if scm_is_null returns true, it's classified as
> scm_class_null.  Right now, that code doesn't consider %nil at all.

So currently the code (using your clarified APIs) gives
scm_class_unknown, right?

> How do you all think %nil should be handled by goops?

Given the %nil concept, I suppose what matters is that a %nil value
should be able to match both

(define-method f (arg <null>))

and

(define-method f (arg <boolean>))

That suggests to me that there should be a class <nil> that inherits
from both <null> and <boolean>, and that (class-of %nil) should be
<nil>.

What do you think?

     Neil




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-07-07 11:14             ` Mark H Weaver
  2009-07-08 13:17               ` Mark H. Weaver
@ 2009-08-30 11:13               ` Neil Jerram
  2009-08-30 14:15                 ` Mark H Weaver
                                   ` (2 more replies)
  1 sibling, 3 replies; 38+ messages in thread
From: Neil Jerram @ 2009-08-30 11:13 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, guile-devel

Mark H Weaver <mhw@netris.org> writes:

> This numbering has the nice properties that 0 is #f.

Just to be clear: will this mean that (SCM_BOOL_F == 0) ?  As things
stand I don't think it will, because SCM_MAKIFLAG shifts and adds
0x04.

Just checking this because Ludovic said recently that (SCM_BOOL_F ==
0) would have nice properties for BDW-GC.

> Also, you may have noticed that I've been using the term "lisp"
> instead of "elisp".  This is because guile may support other lisps in
> the future, and they will also need the same %nil handling.  (For that
> matter, we could even use %nil to implement an "old scheme" language
> which treats '() as false.)  With this in mind, should SCM_ELISP_NIL
> be renamed to SCM_LISP_NIL?

Yes, that sounds like a good argument to me - i.e. I can't see any
reason why the special-case-ness of Elisp shouldn't apply equally to
other Lisps - so please do rename "ELISP" things to "LISP", where this
argument supports that.

Thanks,
        Neil

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-08-30 11:13               ` Neil Jerram
@ 2009-08-30 14:15                 ` Mark H Weaver
  2009-09-01 21:50                   ` Neil Jerram
  2009-08-30 22:01                 ` Ken Raeburn
  2009-08-31 21:55                 ` SCM_BOOL_F == 0 and BDW-GC Ludovic Courtès
  2 siblings, 1 reply; 38+ messages in thread
From: Mark H Weaver @ 2009-08-30 14:15 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, guile-devel

On Sun, Aug 30, 2009 at 12:13:59PM +0100, Neil Jerram wrote:
> Mark H Weaver <mhw@netris.org> writes:
> 
> > This numbering has the nice properties that 0 is #f.
> 
> Just to be clear: will this mean that (SCM_BOOL_F == 0) ?  As things
> stand I don't think it will, because SCM_MAKIFLAG shifts and adds
> 0x04.

Yes, that's correct.  SCM_BOOL_F is 4.  What I should have said above
is that #f is IFLAG number 0.

> > Also, you may have noticed that I've been using the term "lisp"
> > instead of "elisp".  This is because guile may support other lisps in
> > the future, and they will also need the same %nil handling.  (For that
> > matter, we could even use %nil to implement an "old scheme" language
> > which treats '() as false.)  With this in mind, should SCM_ELISP_NIL
> > be renamed to SCM_LISP_NIL?
> 
> Yes, that sounds like a good argument to me - i.e. I can't see any
> reason why the special-case-ness of Elisp shouldn't apply equally to
> other Lisps - so please do rename "ELISP" things to "LISP", where this
> argument supports that.

Sounds good!

    Mark




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-08-30 14:15                 ` Mark H Weaver
@ 2009-09-01 21:50                   ` Neil Jerram
  0 siblings, 0 replies; 38+ messages in thread
From: Neil Jerram @ 2009-09-01 21:50 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Andy Wingo, guile-devel

Mark H Weaver <mhw@netris.org> writes:

> On Sun, Aug 30, 2009 at 12:13:59PM +0100, Neil Jerram wrote:
>> Mark H Weaver <mhw@netris.org> writes:
>> 
>> > This numbering has the nice properties that 0 is #f.
>> 
>> Just to be clear: will this mean that (SCM_BOOL_F == 0) ?  As things
>> stand I don't think it will, because SCM_MAKIFLAG shifts and adds
>> 0x04.
>
> Yes, that's correct.  SCM_BOOL_F is 4.  What I should have said above
> is that #f is IFLAG number 0.

Thanks for clarifying that.  (And from other threads it seems clear
now that SCM_BOOL_F == 0 would actually be a problem!)

     Neil




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-08-30 11:13               ` Neil Jerram
  2009-08-30 14:15                 ` Mark H Weaver
@ 2009-08-30 22:01                 ` Ken Raeburn
  2009-08-31 21:59                   ` Ludovic Courtès
  2009-08-31 21:55                 ` SCM_BOOL_F == 0 and BDW-GC Ludovic Courtès
  2 siblings, 1 reply; 38+ messages in thread
From: Ken Raeburn @ 2009-08-30 22:01 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Mark H Weaver, guile-devel

On Aug 30, 2009, at 07:13, Neil Jerram wrote:
> Mark H Weaver <mhw@netris.org> writes:
>> This numbering has the nice properties that 0 is #f.
> Just to be clear: will this mean that (SCM_BOOL_F == 0) ?  As things
> stand I don't think it will, because SCM_MAKIFLAG shifts and adds
> 0x04.
>
> Just checking this because Ludovic said recently that (SCM_BOOL_F ==
> 0) would have nice properties for BDW-GC.

Was that in list email?  Maybe I overlooked it.  Having all-bits-zero  
be a valid object would make some things easier in my guile-emacs work  
too, but could cause other problems as well.  In Emacs all-bits-zero  
is now integer-zero, and in some places Lisp_Object variables are used  
or made visible to GC before being explicitly set, so I have to set  
them.  In guile-emacs, I can check in key places (like 'cons', or the  
'EQ' macro) for all-bits-zero and flag an error, or in certain cases  
patch over the problem temporarily.  While the integration is still  
minimal, I suppose SCM_BOOL_F shouldn't be showing up in elisp  
processing, so that still works, but if it gets moved further along as  
I'm hoping, that could change.  Having the default C initializer  
change from one valid value to another between Emacs and Guile-Emacs  
could make the bugs much more subtle.

I kind of assumed that making all-bits-zero an invalid value was a  
conscious choice by the Guile (or SCM?) designers which wasn't likely  
to be revisited.  It is, after all, a fairly easy way of highlighting  
a certain class of uninitialized-value problems -- choosing strict  
checking and debugging over letting the programmer be lazy.

I think I'm mildly in favor of keeping all-bits-zero as an invalid  
representation.  But, if it's a huge win for BDW-GC, maybe it's worth  
it.

Ken

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-08-30 22:01                 ` Ken Raeburn
@ 2009-08-31 21:59                   ` Ludovic Courtès
  2009-08-31 23:39                     ` Ken Raeburn
  0 siblings, 1 reply; 38+ messages in thread
From: Ludovic Courtès @ 2009-08-31 21:59 UTC (permalink / raw)
  To: guile-devel

Hi,

Ken Raeburn <raeburn@raeburn.org> writes:

> I kind of assumed that making all-bits-zero an invalid value was a
> conscious choice by the Guile (or SCM?) designers which wasn't likely
> to be revisited.  It is, after all, a fairly easy way of highlighting
> a certain class of uninitialized-value problems -- choosing strict
> checking and debugging over letting the programmer be lazy.

Indeed, that could have been one reason.  We could ask Aubrey Jaffer
about this.

> I think I'm mildly in favor of keeping all-bits-zero as an invalid
> representation.  But, if it's a huge win for BDW-GC, maybe it's worth
> it.

As discussed in my other message, it would actually be harmful.

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: truth of %nil
  2009-08-31 21:59                   ` Ludovic Courtès
@ 2009-08-31 23:39                     ` Ken Raeburn
  0 siblings, 0 replies; 38+ messages in thread
From: Ken Raeburn @ 2009-08-31 23:39 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Aug 31, 2009, at 17:59, Ludovic Courtès wrote:
>> I think I'm mildly in favor of keeping all-bits-zero as an invalid
>> representation.  But, if it's a huge win for BDW-GC, maybe it's worth
>> it.
>
> As discussed in my other message, it would actually be harmful.

Then I'm definitely in favor of keeping it as invalid! :-)

Ken



^ permalink raw reply	[flat|nested] 38+ messages in thread

* SCM_BOOL_F == 0 and BDW-GC
  2009-08-30 11:13               ` Neil Jerram
  2009-08-30 14:15                 ` Mark H Weaver
  2009-08-30 22:01                 ` Ken Raeburn
@ 2009-08-31 21:55                 ` Ludovic Courtès
  2009-09-17 22:00                   ` Neil Jerram
  2 siblings, 1 reply; 38+ messages in thread
From: Ludovic Courtès @ 2009-08-31 21:55 UTC (permalink / raw)
  To: guile-devel

Hello!

Neil Jerram <neil@ossau.uklinux.net> writes:

> Just checking this because Ludovic said recently that (SCM_BOOL_F ==
> 0) would have nice properties for BDW-GC.

Actually he wasn't quite right when he said that.  :-)

The issue with BDW-GC is that "disappearing links" (weak pointers in
libgc parlance) replace pointers to objects that have been reclaimed by
NULL, and there's no way to tell it to use some other value.

That leads to insanities in the weak hash table implementation [0, 1],
which I thought could somehow vanish if SCM_BOOL_F == 0.

Unfortunately that's not true; it would even make things worse because
NULL would now be a valid Scheme value.

Instead what's really needed is a special pointer-to-reclaimed-object
value that can be distinguished from valid Scheme values since that
value ends up in the car or cdr of weak pairs in hash table buckets.  As
such, SCM_PACK (NULL) was a good choice until now.

SCM_UNDEFINED == 0 would work fine because SCM_UNDEFINED is not a valid
Scheme value, but it wouldn't change the implementation.
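
To illustrate why the zero word has to stay distinguishable, here is a
toy model of a weak bucket.  It does not call libgc and is not the code
in weaks.c: the types and the demo_ names are hypothetical, and a
cleared link is simulated by storing 0 by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t demo_scm;               /* stand-in for SCM */

#define DEMO_RECLAIMED  ((demo_scm) 0)    /* what a cleared link holds */
#define DEMO_BOOL_F     ((demo_scm) 0x004)

/* One entry of a weak-key bucket: the key is the weakly held word. */
struct demo_weak_cell
{
  demo_scm key;
  demo_scm value;
};

/* Compacts the bucket in place, dropping entries whose key was
   cleared.  Returns the number of live entries left at the front. */
static inline size_t
demo_fixup_weak_cells (struct demo_weak_cell *cells, size_t n)
{
  size_t live = 0;
  for (size_t i = 0; i < n; i++)
    if (cells[i].key != DEMO_RECLAIMED)
      cells[live++] = cells[i];
  return live;
}

/* A #f key survives the fixup only because #f is not the zero word. */
static inline void demo_weak_cells_example (void)
{
  struct demo_weak_cell bucket[2] = {
    { DEMO_BOOL_F, 10 },
    { DEMO_RECLAIMED, 20 },
  };
  assert (demo_fixup_weak_cells (bucket, 2) == 1);
  assert (bucket[0].key == DEMO_BOOL_F);
}
```

If #f were 0, the first entry would be indistinguishable from a
reclaimed one and would be dropped as well.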

Thoughts?

Thanks,
Ludo'.

[0] http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/weaks.c?h=boehm-demers-weiser-gc#n40
[1] http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/hashtab.c?h=boehm-demers-weiser-gc#n97

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: SCM_BOOL_F == 0 and BDW-GC
  2009-08-31 21:55                 ` SCM_BOOL_F == 0 and BDW-GC Ludovic Courtès
@ 2009-09-17 22:00                   ` Neil Jerram
  2009-09-17 22:28                     ` Ludovic Courtès
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-09-17 22:00 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello!
>
> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> Just checking this because Ludovic said recently that (SCM_BOOL_F ==
>> 0) would have nice properties for BDW-GC.
>
> Actually he wasn't quite right when he said that.  :-)
>
> The issue with BDW-GC is that "disappearing links" (weak pointers in
> libgc parlance) replace pointers to objects that have been reclaimed by
> NULL, and there's no way to tell it to use some other value.
>
> That leads to insanities in the weak hash table implementation [0, 1],
> which I thought could somehow vanish if SCM_BOOL_F == 0.

They're not so bad.  But I agree that it would be nicer not to have to
use SCM_WEAK_PAIR_CAR, and just use SCM_CAR instead.

> Unfortunately that's not true; it would even make things worse because
> NULL would now be a valid Scheme value.

Yes, I see.

> Instead what's really needed is a special pointer-to-reclaimed-object
> value that can be distinguished from valid Scheme values since that
> value ends up in the car or cdr of weak pairs in hash table buckets.  As
> such, SCM_PACK (NULL) was a good choice until now.

Here I'm confused again.  I thought we now had no choice about the
pointer-to-reclaimed-object value, because BDW-GC always uses NULL.

> SCM_UNDEFINED == 0 would work fine because SCM_UNDEFINED is not a valid
> Scheme value, but it wouldn't change the implementation.

I'm afraid I don't understand "but it wouldn't change the
implementation".

0 is also used for procedures that haven't been extended to
primitive-generic - see SCM_SUBR_GENERIC - but I think making
SCM_UNDEFINED == 0 would be fine there too.

> Thoughts?

SCM_UNDEFINED == 0 is sounding promising...

     Neil




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: SCM_BOOL_F == 0 and BDW-GC
  2009-09-17 22:00                   ` Neil Jerram
@ 2009-09-17 22:28                     ` Ludovic Courtès
  2009-09-18 20:51                       ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Ludovic Courtès @ 2009-09-17 22:28 UTC (permalink / raw)
  To: guile-devel

Hello,

Neil Jerram <neil@ossau.uklinux.net> writes:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> Instead what's really needed is a special pointer-to-reclaimed-object
>> value that can be distinguished from valid Scheme values since that
>> value ends up in the car or cdr of weak pairs in hash table buckets.  As
>> such, SCM_PACK (NULL) was a good choice until now.
>
> Here I'm confused again.  I thought we now had no choice about the
> pointer-to-reclaimed-object value, because BDW-GC always uses NULL.

True.  So, what I meant is that ((SCM) NULL) must be distinguishable
from valid Scheme values.

>> SCM_UNDEFINED == 0 would work fine because SCM_UNDEFINED is not a valid
>> Scheme value, but it wouldn't change the implementation.
>
> I'm afraid I don't understand "but it wouldn't change the
> implementation".

Ugly stuff like ‘scm_fixup_weak_alist ()’ would still be needed.

> SCM_UNDEFINED == 0 is sounding promising...

Yeah.  Sorry for the false hope about SCM_BOOL_F == 0.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: SCM_BOOL_F == 0 and BDW-GC
  2009-09-17 22:28                     ` Ludovic Courtès
@ 2009-09-18 20:51                       ` Neil Jerram
  2009-09-20 17:21                         ` Ludovic Courtès
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-09-18 20:51 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello,
>
> Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>> Here I'm confused again.  I thought we now had no choice about the
>> pointer-to-reclaimed-object value, because BDW-GC always uses NULL.
>
> True.  So, what I meant is that ((SCM) NULL) must be distinguishable
> from valid Scheme values.

OK.

>>> SCM_UNDEFINED == 0 would work fine because SCM_UNDEFINED is not a valid
>>> Scheme value, but it wouldn't change the implementation.
>>
>> I'm afraid I don't understand "but it wouldn't change the
>> implementation".
>
> Ugly stuff like ‘scm_fixup_weak_alist ()’ would still be needed.

Thanks, I understand now.  scm_fixup_weak_alist looks OK to me.  Surely
we must have had something like that with Guile GC too?  (Except that it
was probably mixed up with the GC'ing code, and so was even uglier!)

>> SCM_UNDEFINED == 0 is sounding promising...
>
> Yeah.  Sorry for the false hope about SCM_BOOL_F == 0.

So are you going to try out SCM_UNDEFINED == 0 ?

    Neil




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: SCM_BOOL_F == 0 and BDW-GC
  2009-09-18 20:51                       ` Neil Jerram
@ 2009-09-20 17:21                         ` Ludovic Courtès
  2009-09-20 21:03                           ` Neil Jerram
  0 siblings, 1 reply; 38+ messages in thread
From: Ludovic Courtès @ 2009-09-20 17:21 UTC (permalink / raw)
  To: guile-devel

Hello,

Neil Jerram <neil@ossau.uklinux.net> writes:

> ludo@gnu.org (Ludovic Courtès) writes:

[...]

>> Ugly stuff like ‘scm_fixup_weak_alist ()’ would still be needed.
>
> Thanks, I understand now.  scm_fixup_weak_alist looks OK to me.  Surely
> we must have had something like that with Guile GC too?  (Except that it
> was probably mixed up with the GC'ing code, and so was even uglier!)

Well, indeed.

>>> SCM_UNDEFINED == 0 is sounding promising...
>>
>> Yeah.  Sorry for the false hope about SCM_BOOL_F == 0.
>
> So are you going to try out SCM_UNDEFINED == 0 ?

Who, me?  :-)

I’ll add it to my to-do list and report back, then.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: SCM_BOOL_F == 0 and BDW-GC
  2009-09-20 17:21                         ` Ludovic Courtès
@ 2009-09-20 21:03                           ` Neil Jerram
  2009-09-20 21:36                             ` Ludovic Courtès
  0 siblings, 1 reply; 38+ messages in thread
From: Neil Jerram @ 2009-09-20 21:03 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello,

Hi Ludo,

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> So are you going to try out SCM_UNDEFINED == 0 ?
>
> Who, me?  :-)
>
> I’ll add it to my to-do list and report back, then.

I'm sorry, I didn't exactly mean that you _should_ add it to your list -
especially since I'm sure your list is already very long.  What I should
have said is that it seems like something that someone could try out,
and that perhaps I could do that, but that I don't want to duplicate if
you're already planning to do it very soon.

Regards,
        Neil

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: SCM_BOOL_F == 0 and BDW-GC
  2009-09-20 21:03                           ` Neil Jerram
@ 2009-09-20 21:36                             ` Ludovic Courtès
  0 siblings, 0 replies; 38+ messages in thread
From: Ludovic Courtès @ 2009-09-20 21:36 UTC (permalink / raw)
  To: guile-devel

Hi!

Neil Jerram <neil@ossau.uklinux.net> writes:

> ludo@gnu.org (Ludovic Courtès) writes:

>> Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>>> So are you going to try out SCM_UNDEFINED == 0 ?
>>
>> Who, me?  :-)
>>
>> I’ll add it to my to-do list and report back, then.
>
> I'm sorry, I didn't exactly mean that you _should_ add it to your list -
> especially since I'm sure your list is already very long.  What I should
> have said is that it seems like something that someone could try out,
> and that perhaps I could do that, but that I don't want to duplicate if
> you're already planning to do it very soon.

I’m not planning to do it soon, so I’m happy if someone else gives it a
try.

Thanks,
Ludo’.






* Re: truth of %nil
  2009-07-05  2:41         ` Mark H Weaver
  2009-07-05  9:19           ` Andy Wingo
@ 2009-07-06 21:46           ` Neil Jerram
  2009-07-06 23:54             ` Mark H Weaver
  2009-07-08  8:08             ` Ludovic Courtès
  1 sibling, 2 replies; 38+ messages in thread
From: Neil Jerram @ 2009-07-06 21:46 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

Mark H Weaver <mhw@netris.org> writes:

> Below is a proposal for how to make boolean tests and end-of-list
> tests faster and more compact, by renumbering the representations for
> SCM_ELISP_NIL, SCM_EOL, SCM_UNDEFINED, and SCM_EOF_VAL.
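
A rough sketch of the idea behind this renumbering, for readers skimming the thread: if two encodings differ in exactly one bit, one mask-and-compare can test for both at once. The constants and function names below are hypothetical and chosen only for illustration; they are not Guile's real immediate values, nor necessarily the numbering Mark proposed.

```c
/* Hypothetical encodings (NOT Guile's real values), chosen so that
   #f/%nil and %nil/'() each differ in exactly one bit. */
#define EX_BOOL_F     0x004UL
#define EX_ELISP_NIL  0x104UL   /* EX_BOOL_F with bit 8 set */
#define EX_EOL        0x304UL   /* EX_ELISP_NIL with bit 9 set */
#define EX_BOOL_T     0x404UL

/* True for #f and %nil: clear the one bit in which they differ, then
   compare against the bits they share. */
static int ex_is_false_or_nil (unsigned long x)
{
  return (x & ~(EX_BOOL_F ^ EX_ELISP_NIL)) == (EX_BOOL_F & EX_ELISP_NIL);
}

/* True for '() and %nil, using the same trick on the other pair. */
static int ex_is_null_or_nil (unsigned long x)
{
  return (x & ~(EX_EOL ^ EX_ELISP_NIL)) == (EX_EOL & EX_ELISP_NIL);
}
```

With these assumed values, each predicate costs one AND and one comparison, rather than two comparisons joined by ||.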

Interesting.  I haven't looked at every detail but I'm happy to go
along with Andy's impression.

Assuming you are planning to work on the code changes for this, we
will need copyright assignment papers from you.  Will that be OK?

> along with a regression test somewhere to complain unless both of the
> XOR subexpressions above are powers of two:
>
> 	#define IS_POWER_OF_TWO(x)             (((x) & ((x)-1)) == 0)
> 	#define DIFFER_BY_ONLY_ONE_BIT(x, y)   IS_POWER_OF_TWO((x)^(y))
> 	
> 	if ( ! DIFFER_BY_ONLY_ONE_BIT(SCM_ELISP_NIL, SCM_BOOL_F) )
> 	  complain();
> 	if ( ! DIFFER_BY_ONLY_ONE_BIT(SCM_ELISP_NIL, SCM_EOL) )
> 	  complain();

There are ways of writing compile time asserts; see
http://www.jaggersoft.com/pubs/CVu11_3.html for some.  I don't know
how portable these all are, but at work we use the case label one, and
that seems to be good on common platforms.
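
For illustration, a minimal sketch of the case label technique applied to the check above. The macro names and constants are hypothetical (they are not taken from the linked article or from Guile), and the encodings are made-up values that happen to satisfy the one-bit property.

```c
/* Hypothetical encodings (NOT Guile's real values). */
#define EX_BOOL_F     0x004UL
#define EX_ELISP_NIL  0x104UL
#define EX_EOL        0x304UL

/* Nonzero and exactly one bit set.  The mask test is parenthesized
   because == binds more tightly than & in C. */
#define EX_IS_POWER_OF_TWO(x)           ((x) != 0 && ((x) & ((x) - 1)) == 0)
#define EX_DIFFER_BY_ONLY_ONE_BIT(x, y) EX_IS_POWER_OF_TWO ((x) ^ (y))

/* Case label trick: when cond is false, both labels are `case 0:' and
   the compiler rejects the duplicate.  cond must be an integer
   constant expression. */
#define EX_COMPILE_TIME_ASSERT(cond) \
  do { switch (0) { case 0: case !!(cond): ; } } while (0)

/* Never needs to be called for the checks to fire; returns 1 so that
   callers have something to test. */
static int ex_check_encodings (void)
{
  EX_COMPILE_TIME_ASSERT (EX_DIFFER_BY_ONLY_ONE_BIT (EX_ELISP_NIL, EX_BOOL_F));
  EX_COMPILE_TIME_ASSERT (EX_DIFFER_BY_ONLY_ONE_BIT (EX_ELISP_NIL, EX_EOL));
  return 1;
}
```

Changing one of the constants so that a pair differs in two bits should turn the corresponding line into a "duplicate case value" compile error.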

Regards,
        Neil


* Re: truth of %nil
  2009-07-06 21:46           ` truth of %nil Neil Jerram
@ 2009-07-06 23:54             ` Mark H Weaver
  2009-07-08  8:08             ` Ludovic Courtès
  1 sibling, 0 replies; 38+ messages in thread
From: Mark H Weaver @ 2009-07-06 23:54 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

On Mon, Jul 06, 2009 at 10:46:11PM +0100, Neil Jerram wrote:
> Assuming you are planning to work on the code changes for this, we
> will need copyright assignment papers from you.  Will that be OK?

Yes, certainly.  I live in the Boston area, so I'll stop by the FSF
office and take care of that soon.

> There are ways of writing compile time asserts; see
> http://www.jaggersoft.com/pubs/CVu11_3.html for some.

Thanks for the pointer!  I confess I'd been thinking of using #if
and #error, which works on GNU CPP but is apparently not portable.
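
For comparison, a sketch of what the #if and #error variant might look like. It assumes the constants are plain integer literals, because the preprocessor cannot evaluate casts; if the real definitions wrap their values in a cast, this form will not work on them. All names and values here are hypothetical.

```c
/* Hypothetical encodings as plain integer literals (NOT Guile's real
   values), so that the preprocessor can evaluate them in #if. */
#define EX_BOOL_F     0x004UL
#define EX_ELISP_NIL  0x104UL
#define EX_EOL        0x304UL

#define EX_FALSE_NIL_DIFF (EX_ELISP_NIL ^ EX_BOOL_F)
#define EX_NULL_NIL_DIFF  (EX_ELISP_NIL ^ EX_EOL)

/* Each difference must be nonzero and have exactly one bit set. */
#if EX_FALSE_NIL_DIFF == 0 || (EX_FALSE_NIL_DIFF & (EX_FALSE_NIL_DIFF - 1)) != 0
#error "EX_ELISP_NIL and EX_BOOL_F must differ in exactly one bit"
#endif

#if EX_NULL_NIL_DIFF == 0 || (EX_NULL_NIL_DIFF & (EX_NULL_NIL_DIFF - 1)) != 0
#error "EX_ELISP_NIL and EX_EOL must differ in exactly one bit"
#endif

/* The differing bits, exposed as constants for use in masks. */
enum
{
  EX_FALSE_NIL_BIT = EX_FALSE_NIL_DIFF,
  EX_NULL_NIL_BIT = EX_NULL_NIL_DIFF
};
```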

   Best,
    Mark





* Re: truth of %nil
  2009-07-06 21:46           ` truth of %nil Neil Jerram
  2009-07-06 23:54             ` Mark H Weaver
@ 2009-07-08  8:08             ` Ludovic Courtès
  1 sibling, 0 replies; 38+ messages in thread
From: Ludovic Courtès @ 2009-07-08  8:08 UTC (permalink / raw)
  To: guile-devel

Hi,

Neil Jerram <neil@ossau.uklinux.net> writes:

> There are ways of writing compile time asserts; see
> http://www.jaggersoft.com/pubs/CVu11_3.html for some.  I don't know
> how portable these all are, but at work we use the case label one, and
> that seems to be good on common platforms.

Gnulib's `verify' module provides macros for compile-time assertions.

Thanks,
Ludo'.






* Re: truth of %nil
  2009-07-03 15:32       ` Mark H Weaver
  2009-07-05  2:41         ` Mark H Weaver
@ 2009-07-23 21:12         ` Andy Wingo
  1 sibling, 0 replies; 38+ messages in thread
From: Andy Wingo @ 2009-07-23 21:12 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel, Neil Jerram

On Fri 03 Jul 2009 17:32, Mark H Weaver <mhw@netris.org> writes:

> It might be worth considering a build-time option to disable %nil, so
> that it's possible to build a version of guile which doesn't pay this
> price.

You probably found it, but Guile does have such an option.

(Jeez, I didn't think I'd ever find myself in the position of struggling
to keep up with mail to guile-devel ;-)

Andy

-- 
http://wingolog.org/





end of thread, other threads:[~2009-09-20 21:36 UTC | newest]

Thread overview: 38+ messages
2009-06-29 21:12 truth of %nil Andy Wingo
2009-06-29 21:44 ` Neil Jerram
2009-06-29 22:11   ` Andy Wingo
2009-06-30 22:22     ` Neil Jerram
2009-07-01  6:45       ` Daniel Kraft
2009-07-01 21:54         ` Neil Jerram
2009-07-05 13:07           ` Mark H Weaver
2009-08-30 11:07             ` Neil Jerram
2009-08-30 14:11               ` Mark H Weaver
2009-09-01 22:00                 ` Neil Jerram
2009-09-02 15:57                   ` Mark H Weaver
2009-09-17 21:21                     ` Neil Jerram
2009-07-02 14:28   ` Mark H Weaver
2009-07-02 14:50     ` Ludovic Courtès
2009-07-02 22:50     ` Neil Jerram
2009-07-03 15:32       ` Mark H Weaver
2009-07-05  2:41         ` Mark H Weaver
2009-07-05  9:19           ` Andy Wingo
2009-07-07 11:14             ` Mark H Weaver
2009-07-08 13:17               ` Mark H. Weaver
2009-08-30 11:20                 ` Neil Jerram
2009-08-30 11:13               ` Neil Jerram
2009-08-30 14:15                 ` Mark H Weaver
2009-09-01 21:50                   ` Neil Jerram
2009-08-30 22:01                 ` Ken Raeburn
2009-08-31 21:59                   ` Ludovic Courtès
2009-08-31 23:39                     ` Ken Raeburn
2009-08-31 21:55                 ` SCM_BOOL_F == 0 and BDW-GC Ludovic Courtès
2009-09-17 22:00                   ` Neil Jerram
2009-09-17 22:28                     ` Ludovic Courtès
2009-09-18 20:51                       ` Neil Jerram
2009-09-20 17:21                         ` Ludovic Courtès
2009-09-20 21:03                           ` Neil Jerram
2009-09-20 21:36                             ` Ludovic Courtès
2009-07-06 21:46           ` truth of %nil Neil Jerram
2009-07-06 23:54             ` Mark H Weaver
2009-07-08  8:08             ` Ludovic Courtès
2009-07-23 21:12         ` Andy Wingo
