unofficial mirror of guile-devel@gnu.org 
* proposal: scm_c_public_ref et al
@ 2011-03-06 11:24 Andy Wingo
  2011-03-06 16:24 ` Mark H Weaver
  2011-03-06 22:22 ` Ludovic Courtès
  0 siblings, 2 replies; 7+ messages in thread
From: Andy Wingo @ 2011-03-06 11:24 UTC
  To: guile-devel

Hey all,

As we move more and more to writing code in Scheme and not in C, it
becomes apparent that it is more cumbersome to reference Scheme
values than it should be.

I propose that we add helper C APIs like these:

    SCM scm_public_lookup (SCM module_name, SCM sym);
    SCM scm_private_lookup (SCM module_name, SCM sym);

    Look up a variable bound to SYM in the module named MODULE_NAME.  If
    the module does not exist or the symbol is unbound, signal an
    error.  The "public" variant looks in the public interface of the
    module, while scm_private_lookup looks into the module itself.

Then also:

    SCM scm_public_ref (SCM module_name, SCM sym);

    Equivalent to scm_variable_ref (scm_public_lookup (module_name, sym)).

    SCM scm_private_ref (SCM module_name, SCM sym);

    Equivalent to scm_variable_ref (scm_private_lookup (module_name, sym)).

And then:

    SCM scm_c_public_lookup (const char *module_name, const char *name);
    SCM scm_c_private_lookup (const char *module_name, const char *name);
    SCM scm_c_public_ref (const char *module_name, const char *name);
    SCM scm_c_private_ref (const char *module_name, const char *name);

    Like the above, but with locale-encoded C strings, for convenience.
    Module names are encoded as for `scm_c_resolve_module'.
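
    To make the module-name convention concrete, here is a minimal
    sketch in plain C of splitting a space-separated name such as
    "ice-9 eval-string" into its components.  It is only a toy model
    of the convention, not Guile's code: split_module_name, MAX_PARTS
    and MAX_PART_LEN are hypothetical names, and a real implementation
    would turn each component into a symbol rather than copy it into a
    fixed buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARTS 8       /* hypothetical limit on components */
#define MAX_PART_LEN 64   /* hypothetical limit on component length */

/* Split a space-separated module name such as "ice-9 eval-string"
   into its components.  Returns the number of components written to
   PARTS, or -1 if there are too many components or one is too long. */
int
split_module_name (const char *name, char parts[MAX_PARTS][MAX_PART_LEN])
{
  int n = 0;

  assert (name != NULL);
  while (*name != '\0')
    {
      size_t len;

      /* Skip the spaces that separate components. */
      while (*name == ' ')
        name++;
      if (*name == '\0')
        break;

      len = strcspn (name, " ");
      if (n == MAX_PARTS || len >= MAX_PART_LEN)
        return -1;

      memcpy (parts[n], name, len);
      parts[n][len] = '\0';
      n++;
      name += len;
    }
  return n;
}
```

    Under this sketch, "ice-9 eval-string" yields the two components
    "ice-9" and "eval-string", i.e. the module name (ice-9 eval-string).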

With all these, we can happily implement Bruce Korb's
`eval-string-from-file-line' using the new `(ice-9 eval-string)', and
our code becomes:

    SCM scm_c_eval_string_from_file_line (const char *str, const char *file, int line)
    {
      return scm_call_5 (scm_c_public_ref ("ice-9 eval-string", "eval-string"),
                         scm_from_locale_string (str),
                         scm_from_locale_keyword ("file"),
                         scm_from_locale_string (file),
                         scm_from_locale_keyword ("line"),
                         scm_from_int (line));
    }

scm_call_N doesn't go up to 5 yet, but it probably should.  We can cache
the var if we want:

    SCM scm_c_eval_string_from_file_line (const char *str, const char *file, int line)
    {
      /* Look the variable up once; later calls only dereference it. */
      static SCM eval_string_var = SCM_BOOL_F;

      if (scm_is_false (eval_string_var))
        eval_string_var = scm_c_public_lookup ("ice-9 eval-string", "eval-string");

      return scm_call_5 (scm_variable_ref (eval_string_var),
                         scm_from_locale_string (str),
                         scm_from_locale_keyword ("file"),
                         scm_from_locale_string (file),
                         scm_from_locale_keyword ("line"),
                         scm_from_int (line));
    }

And we can use this strategy for easily moving over more C code to
Scheme modules as time goes on.

(If we make this convenient enough, we might be able to avoid defining
scm_c_eval_string_from_file_line ourselves, and just say to call the
function from ice-9 eval-string; but that's another discussion.)

Any thoughts?

Andy
-- 
http://wingolog.org/




* Re: proposal: scm_c_public_ref et al
  2011-03-06 11:24 proposal: scm_c_public_ref et al Andy Wingo
@ 2011-03-06 16:24 ` Mark H Weaver
  2011-03-06 17:10   ` Thien-Thi Nguyen
  2011-03-06 21:10   ` Andy Wingo
  2011-03-06 22:22 ` Ludovic Courtès
  1 sibling, 2 replies; 7+ messages in thread
From: Mark H Weaver @ 2011-03-06 16:24 UTC
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:
>     SCM scm_c_public_lookup (const char *module_name, const char *name);
>     SCM scm_c_private_lookup (const char *module_name, const char *name);
>     SCM scm_c_public_ref (const char *module_name, const char *name);
>     SCM scm_c_private_ref (const char *module_name, const char *name);
>
>     Like the above, but with locale-encoded C strings, for convenience.
>     Module names are encoded as for `scm_c_resolve_module'.

Given that the C strings passed to these functions will more often than
not be embedded in the source code, it seems to me that it's a mistake
to assume that they are encoded in the current locale.  The current locale
is normally the locale of the user, which may be different from the
locale that the source code is written in.
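
To illustrate the mismatch, here is a small sketch in plain C (no
Guile calls; chars_as_latin1 and chars_as_utf8 are hypothetical helper
names).  It counts how many characters the same bytes represent under
a single-byte encoding and under UTF-8, assuming the UTF-8 input is
well-formed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of characters if every byte is one character, as in Latin-1
   or any other single-byte encoding. */
size_t
chars_as_latin1 (const char *s)
{
  assert (s != NULL);
  return strlen (s);
}

/* Number of characters if the bytes are UTF-8: count every byte that
   is not a continuation byte (10xxxxxx).  Assumes well-formed input. */
size_t
chars_as_utf8 (const char *s)
{
  size_t n = 0;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}
```

A source literal containing the two bytes 0xCE 0xBB is one character
(a Greek lambda) when read as UTF-8, but two characters when the
user's locale is a single-byte one.  Pure ASCII names are unaffected.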

Maybe utf8 is a better choice?

    Best,
     Mark




* Re: proposal: scm_c_public_ref et al
  2011-03-06 16:24 ` Mark H Weaver
@ 2011-03-06 17:10   ` Thien-Thi Nguyen
  2011-03-06 21:34     ` Andy Wingo
  2011-03-06 21:10   ` Andy Wingo
  1 sibling, 1 reply; 7+ messages in thread
From: Thien-Thi Nguyen @ 2011-03-06 17:10 UTC
  To: guile-devel

() Mark H Weaver <mhw@netris.org>
() Sun, 06 Mar 2011 11:24:33 -0500

   Maybe utf8 is a better choice?

A module name is a list of symbols, so why not use that from the beginning?
If the process of converting "ice-9 common-list" into (ice-9 common-list)
must happen somewhere, it would be nice if it could happen earlier, to
perhaps amortize over similarly-prefixed (ice-9 foo) names.

This suggests the interface should be at a higher abstraction level,
specifying the module name prefix (possibly empty list of symbols)
and the list of module name leaf symbols.  This kind of interaction
maps well to real life use, where client code knows a priori which
of the many modules in a family are required.

The closer to declarative the interface, the better.

Ideally, in C i would be able to specify:

 - prefix
   "(ice-9)"

 - list of requested elements
   { "q", /* leaf => (ice-9 q) */
     "make-q", "enq!", "deq!", NULL,
     "common-list", /* leaf => (ice-9 common-list) */
     "uniq", NULL
   };

 - vector to write them to
   SCM actual_scheme_objects[6];

Of course, i would need to add some sugar:
  #define MOD_Q   actual_scheme_objects[0]
  #define MAKE_Q  actual_scheme_objects[1]
  #define ENQ_X   actual_scheme_objects[2]
  #define DEQ_X   actual_scheme_objects[3]
  #define MOD_CL  actual_scheme_objects[4]
  #define UNIQ    actual_scheme_objects[5]

to make things easier on the eyes.  The MOD_foo objects might be useful
later to pass to "individual referencing" funcs (i.e., Andy's proposal).
If there are other prefixes, e.g., (database a) through (database z),
these could use ‘actual_scheme_objects’ or another object vector.

Generally, mass-referencing is more efficient and easier to maintain
than individual referencing primitives.
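
As a rough sketch of how such a request table could be walked, here is
some plain C.  Everything in it is hypothetical (struct request_slot,
plan_requests, and the explicit table length are assumptions made for
illustration), and it only plans which slot holds what; it does not
call into Guile.

```c
#include <assert.h>
#include <stddef.h>

/* One slot of the output vector: the module leaf it belongs to and,
   for a binding, its name (NULL when the slot is the module itself). */
struct request_slot
{
  const char *module_leaf;
  const char *binding;
};

/* Walk a table of NULL-terminated groups, where the first string of
   each group names a module leaf and the rest name bindings in it.
   Fills SLOTS (capacity MAX_SLOTS) and returns the number of slots
   used, or -1 if the table does not fit. */
int
plan_requests (const char *const *table, size_t table_len,
               struct request_slot *slots, size_t max_slots)
{
  size_t used = 0;
  const char *leaf = NULL;      /* NULL means "expecting a new group" */

  assert (table != NULL && slots != NULL);
  for (size_t i = 0; i < table_len; i++)
    {
      if (table[i] == NULL)
        {
          leaf = NULL;          /* end of the current group */
          continue;
        }
      if (used == max_slots)
        return -1;

      if (leaf == NULL)
        {
          /* First entry of a group: the module itself. */
          leaf = table[i];
          slots[used].module_leaf = leaf;
          slots[used].binding = NULL;
        }
      else
        {
          slots[used].module_leaf = leaf;
          slots[used].binding = table[i];
        }
      used++;
    }
  return (int) used;
}
```

With the eight-entry table above this fills six slots, in the same
order as the MOD_Q ... UNIQ macros.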

BTW, i agree that all C strings should be explicitly specified as UTF-8.




* Re: proposal: scm_c_public_ref et al
  2011-03-06 16:24 ` Mark H Weaver
  2011-03-06 17:10   ` Thien-Thi Nguyen
@ 2011-03-06 21:10   ` Andy Wingo
  1 sibling, 0 replies; 7+ messages in thread
From: Andy Wingo @ 2011-03-06 21:10 UTC
  To: Mark H Weaver; +Cc: guile-devel

On Sun 06 Mar 2011 17:24, Mark H Weaver <mhw@netris.org> writes:

> Andy Wingo <wingo@pobox.com> writes:
>>     SCM scm_c_public_lookup (const char *module_name, const char *name);
>>     SCM scm_c_private_lookup (const char *module_name, const char *name);
>>     SCM scm_c_public_ref (const char *module_name, const char *name);
>>     SCM scm_c_private_ref (const char *module_name, const char *name);
>>
>>     Like the above, but with locale-encoded C strings, for convenience.
>>     Module names are encoded as for `scm_c_resolve_module'.
>
> Given that the C strings passed to these functions will more often than
> not be embedded in the source code, it seems to me that it's a mistake
> to assume that they are encoded in the current locale.  The current locale
> is normally the locale of the user, which may be different from the
> locale that the source code is written in.
>
> Maybe utf8 is a better choice?

I thought about this, yes, and you're probably right.  However the other
scm_c_ functions in modules.h (and other files) use locale encoding.
That's probably not something we can change in 2.0.x, and it seemed at
the time that consistency was best.  (Or can we change them?  Does it
matter?)

If we were to choose a particular encoding to make convenient, I would
prefer latin1, because it is common, particularly efficient for Guile
(e.g. scm_from_latin1_symbol), and does not raise runtime errors.  That
does not sacrifice generality, as one can always use
scm_from_utf8_symbol or others, and use the SCM-accepting interface.
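
The "does not raise runtime errors" point can be sketched in plain C.
This is an illustration rather than Guile's decoder, and
utf8_is_valid and latin1_is_valid are hypothetical names: some byte
sequences are malformed UTF-8, whereas every byte sequence is
well-formed Latin-1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the LEN bytes at S form well-formed UTF-8.  Checks
   lead and continuation bytes, overlong forms, surrogates and the
   U+10FFFF limit. */
bool
utf8_is_valid (const unsigned char *s, size_t len)
{
  size_t i = 0;

  assert (s != NULL);
  while (i < len)
    {
      unsigned char c = s[i];
      size_t extra;             /* continuation bytes expected */
      unsigned long cp, min;    /* code point, smallest legal value */

      if (c < 0x80)
        { extra = 0; cp = c; min = 0; }
      else if ((c & 0xE0) == 0xC0)
        { extra = 1; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { extra = 2; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { extra = 3; cp = c & 0x07; min = 0x10000; }
      else
        return false;           /* stray continuation or bad lead byte */

      if (extra > len - i - 1)
        return false;           /* truncated sequence */

      for (size_t k = 1; k <= extra; k++)
        {
          if ((s[i + k] & 0xC0) != 0x80)
            return false;
          cp = (cp << 6) | (s[i + k] & 0x3F);
        }

      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return false;

      i += extra + 1;
    }
  return true;
}

/* Every byte sequence is well-formed Latin-1: each byte 0x00..0xFF
   maps to the code point with the same value. */
bool
latin1_is_valid (const unsigned char *s, size_t len)
{
  (void) s;
  (void) len;
  return true;
}
```

The single byte 0xE9 (e-acute in Latin-1) is rejected by the UTF-8
check, while the two bytes 0xC3 0xA9 pass both checks but name
different characters under each encoding.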

Humm.  What do you think about having these procedures take latin1?  We
can probably cast it as a compatibility-preserving change, with regards
to 1.8, which would always treat incoming strings as one byte per char.

Andy
-- 
http://wingolog.org/




* Re: proposal: scm_c_public_ref et al
  2011-03-06 17:10   ` Thien-Thi Nguyen
@ 2011-03-06 21:34     ` Andy Wingo
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Wingo @ 2011-03-06 21:34 UTC
  To: Thien-Thi Nguyen; +Cc: guile-devel

Hello,

On Sun 06 Mar 2011 18:10, Thien-Thi Nguyen <ttn@gnuvola.org> writes:

> () Mark H Weaver <mhw@netris.org>
> () Sun, 06 Mar 2011 11:24:33 -0500
>
>    Maybe utf8 is a better choice?
>
> A module name is a list of symbols, so why not use that from the beginning?

The variants without _c_ do just that, so no problem there.  The
module_name like "ice-9 eval-string" already exists in
scm_c_resolve_module, so I'm not making that bit up.

My interest is to make invoking Scheme code in modules from C easier,
with less fussiness on the C side.  Your ideas are interesting, but
I don't think they are appropriate for the particular problem I am
interested in.

> BTW, i agree that all C strings should be explicitly specified as UTF-8.

ACK.

Andy
-- 
http://wingolog.org/




* Re: proposal: scm_c_public_ref et al
  2011-03-06 11:24 proposal: scm_c_public_ref et al Andy Wingo
  2011-03-06 16:24 ` Mark H Weaver
@ 2011-03-06 22:22 ` Ludovic Courtès
  2011-03-07 10:45   ` Andy Wingo
  1 sibling, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2011-03-06 22:22 UTC
  To: guile-devel

Hi!

Andy Wingo <wingo@pobox.com> writes:

>     SCM scm_public_lookup (SCM module_name, SCM sym);
>     SCM scm_private_lookup (SCM module_name, SCM sym);
>
>     Look up a variable bound to SYM in the module named MODULE_NAME.  If
>     the module does not exist or the symbol is unbound, signal an
>     error.  The "public" variant looks in the public interface of the
>     module, while scm_private_lookup looks into the module itself.

So this would be equivalent to:

  scm_module_variable (scm_resolve_module (module_name), sym)

?

I’m skeptical about adding 8 convenience C functions “as we move more
and more to writing code in Scheme and not in C” ;-), but if that’s what
people really want, then I won’t object.

Thanks,
Ludo’.





* Re: proposal: scm_c_public_ref et al
  2011-03-06 22:22 ` Ludovic Courtès
@ 2011-03-07 10:45   ` Andy Wingo
  0 siblings, 0 replies; 7+ messages in thread
From: Andy Wingo @ 2011-03-07 10:45 UTC
  To: Ludovic Courtès; +Cc: guile-devel

On Sun 06 Mar 2011 23:22, ludo@gnu.org (Ludovic Courtès) writes:

> Hi!
>
> Andy Wingo <wingo@pobox.com> writes:
>
>>     SCM scm_public_lookup (SCM module_name, SCM sym);
>>     SCM scm_private_lookup (SCM module_name, SCM sym);
>>
>>     Look up a variable bound to SYM in the module named MODULE_NAME.  If
>>     the module does not exist or the symbol is unbound, signal an
>>     error.  The "public" variant looks in the public interface of the
>>     module, while scm_private_lookup looks into the module itself.
>
> So this would be equivalent to:
>
>   scm_module_variable (scm_resolve_module (module_name), sym)

Using scm_module_variable is fine for accessing variables from your own
module, but it is not apparent to the user that this actually accesses
private bindings and not exported bindings.  So strictly speaking, for
public variables, you need to do scm_module_public_interface () on the
module.  But the extra typing penalizes doing the right thing.

Even given that, though, scm_public_lookup is not the same, because the
error cases are different.  You want to get the messages, "(foo bar) is
not a module", and "baz is unbound in module foo".  But you can't get
that easily with the existing interfaces.  scm_c_resolve_module will
always return a module, creating one if none exists.  Then
scm_module_public_interface returns #f, and then the module_variable
would give you an answer like "#f is not a module", which is
nonsensical.

Furthermore, scm_module_variable returns #f when there is no binding --
a useful behavior in many cases, but as often not useful -- better to
get an error that the symbol is unbound, like scm_lookup does.
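
The distinction between these error cases can be modelled in plain C.
This is a toy sketch only: toy_module, toy_binding, toy_lookup and the
status names are all hypothetical and stand in for Guile's real module
objects.  The point is that "no such module" and "symbol not visible"
are reported separately, and that a public lookup does not see
unexported bindings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_binding
{
  const char *name;
  int value;
  int exported;                 /* nonzero if in the public interface */
};

struct toy_module
{
  const char *name;             /* e.g. "foo bar" */
  const struct toy_binding *bindings;
  size_t n_bindings;
};

enum lookup_status { LOOKUP_OK, LOOKUP_NO_MODULE, LOOKUP_UNBOUND };

/* Find NAME in the module called MODULE_NAME.  When PUBLIC_ONLY is
   nonzero, only exported bindings are visible.  On success, store the
   binding in *OUT. */
enum lookup_status
toy_lookup (const struct toy_module *modules, size_t n_modules,
            const char *module_name, const char *name,
            int public_only, const struct toy_binding **out)
{
  assert (modules != NULL && module_name != NULL);
  assert (name != NULL && out != NULL);

  for (size_t i = 0; i < n_modules; i++)
    {
      const struct toy_module *m = &modules[i];

      if (strcmp (m->name, module_name) != 0)
        continue;

      for (size_t j = 0; j < m->n_bindings; j++)
        {
          const struct toy_binding *b = &m->bindings[j];

          if (strcmp (b->name, name) == 0
              && (!public_only || b->exported))
            {
              *out = b;
              return LOOKUP_OK;
            }
        }
      return LOOKUP_UNBOUND;    /* module exists, symbol not visible */
    }
  return LOOKUP_NO_MODULE;      /* "(foo bar) is not a module" */
}
```

A caller can then turn LOOKUP_NO_MODULE and LOOKUP_UNBOUND into two
different error messages instead of passing a false value along.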

Finally, when you add to this the need to get SCM values for the module
and symbol, it's just too much sometimes, and you end up just coding the
thing in C.  Better to make it easy to use Scheme :)

> I’m skeptical about adding 8 convenience C functions “as we move more
> and more to writing code in Scheme and not in C” ;-), but if that’s what
> people really want, then I won’t object.

Understood!  I would like to avoid adding new C API, but this one seems
to allow us to write less C in the future.

For example, with the eval-string-from-file-line case, one can imagine
other parameters -- the current language, whether to compile or not,
etc.  Already there is eval-string which takes a module argument -- do
we add an eval-string-from-file-line variant with and without a module?

This sort of thing is well-supported in Scheme with keyword arguments,
and a pain to do from C.  So we need to make it easy for us to say,
"just call eval-string from (ice-9 eval-string), and add on keyword
arguments as appropriate", instead of encoding all of the permutations
into our C API.

MHO, anyway!  :)

Cheers,

Andy
-- 
http://wingolog.org/


