unofficial mirror of guile-devel@gnu.org 
* Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
       [not found] <E1MkqDD-0003c6-C4@cvs.savannah.gnu.org>
@ 2009-09-08 23:00 ` Ludovic Courtès
  2009-09-09  4:16   ` Mike Gran
  0 siblings, 1 reply; 5+ messages in thread
From: Ludovic Courtès @ 2009-09-08 23:00 UTC (permalink / raw)
  To: Michael Gran; +Cc: guile-devel

Hello!

"Michael Gran" <spk121@yahoo.com> writes:

> http://git.savannah.gnu.org/cgit/guile.git/commit/?id=0d05ae7c4b1eddf6257f99f44eaf5cb7b11191be

[...]

> -  return scm_getc (input_port);
> +  return scm_get_byte_or_eof (input_port);

This is actually an earlier change, but the prototype of scm_getc is now
different from that in 1.8.  Presumably, this means that it’s not
source-compatible with 1.8, e.g., on platforms where
sizeof (int) < sizeof (scm_t_wchar), right?

> --- a/libguile/strings.h
> +++ b/libguile/strings.h
> @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
>  SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
>  SCM_API SCM scm_string_append (SCM args);
>  
> -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len, 
> +SCM_API SCM scm_i_from_stringn (const char *str, size_t len, 
>                                       const char *encoding,
>                                       scm_t_string_failed_conversion_handler 
>                                       handler);
> @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
>  SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
>  SCM_INTERNAL void scm_i_string_stop_writing (void);
>  SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
> -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);

Were these changes intended?

> +  (with-locale "en_US.iso88591"
> +    (pass-if-exception "no args" exception:wrong-num-args
> +      (regexp-quote))

Is the locale part of the API?  That is, should programs that use
regexps explicitly ask for a locale with 8-bit encoding?

Thanks,
Ludo’.





* Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
  2009-09-08 23:00 ` [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7 Ludovic Courtès
@ 2009-09-09  4:16   ` Mike Gran
  2009-09-09  7:42     ` Ludovic Courtès
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Gran @ 2009-09-09  4:16 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Wed, 2009-09-09 at 01:00 +0200, Ludovic Courtès wrote:
> Hello!
> 
> "Michael Gran" <spk121@yahoo.com> writes:
> 
> > http://git.savannah.gnu.org/cgit/guile.git/commit/?id=0d05ae7c4b1eddf6257f99f44eaf5cb7b11191be
> 
> [...]
> 
> > -  return scm_getc (input_port);
> > +  return scm_get_byte_or_eof (input_port);
> 
> This is actually an earlier change, but the prototype of scm_getc is now
> different from that in 1.8.  Presumably, this means that it’s not
> source-compatible with 1.8, e.g., on platforms where
> sizeof (int) < sizeof (scm_t_wchar), right?

The readline library can't handle UCS-4 codepoints, but it is capable
of dealing with locale-encoded text.  So it needs the raw bytes
of the locale-encoded characters, and scm_get_byte_or_eof returns the
raw bytes instead of doing the processing necessary to make codepoints.
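
To make the bytes-versus-codepoints distinction concrete, here is a
minimal sketch in plain C.  It assumes the locale encoding is UTF-8, and
the helper name utf8_decode_one is made up for this illustration; it is
not part of Guile or readline.  A byte-oriented consumer such as readline
would see the individual bytes, while a codepoint-oriented reader would
see the single decoded value.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at S (at most LEN bytes).
   Store the code point in *OUT and return the number of bytes consumed,
   or 0 if the sequence is truncated or malformed.  Overlong forms and
   surrogates are not rejected: this is a sketch, not a validator.  */
static size_t
utf8_decode_one (const unsigned char *s, size_t len, int32_t *out)
{
  size_t n, i;
  int32_t cp;

  if (len == 0)
    return 0;

  /* The lead byte determines the sequence length and its payload bits.  */
  if (s[0] < 0x80)
    { n = 1; cp = s[0]; }
  else if ((s[0] & 0xE0) == 0xC0)
    { n = 2; cp = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0)
    { n = 3; cp = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0)
    { n = 4; cp = s[0] & 0x07; }
  else
    return 0;

  if (n > len)
    return 0;

  /* Each continuation byte contributes six more bits.  */
  for (i = 1; i < n; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;
      cp = (cp << 6) | (s[i] & 0x3F);
    }

  *out = cp;
  return n;
}
```

For example, the two raw bytes 0xC3 0xA9 decode to the single codepoint
U+00E9; passing only the first byte yields 0 because the sequence is
incomplete.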

> 
> > --- a/libguile/strings.h
> > +++ b/libguile/strings.h
> > @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
> >  SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
> >  SCM_API SCM scm_string_append (SCM args);
> >  
> > -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len, 
> > +SCM_API SCM scm_i_from_stringn (const char *str, size_t len, 
> >                                       const char *encoding,
> >                                       scm_t_string_failed_conversion_handler 
> >                                       handler);
> > @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
> >  SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
> >  SCM_INTERNAL void scm_i_string_stop_writing (void);
> >  SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
> > -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> > +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> 
> Were these changes intended?

Well, one of the two of them was intended.  :)

> 
> > +  (with-locale "en_US.iso88591"
> > +    (pass-if-exception "no args" exception:wrong-num-args
> > +      (regexp-quote))
> 
> Is the locale part of the API?  That is, should programs that use
> regexps explicitly ask for a locale with 8-bit encoding?

Basically yes. The libc regex is 8-bit, and it uses
scm_to/from_locale_string to convert regex's input and output.

Until libunistring comes with Unicode regex, I think this is the best we
can do.
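
As a rough illustration of what "the libc regex is 8-bit" means in
practice, the sketch below calls the POSIX <regex.h> interface directly
on locale-encoded byte strings.  The function name locale_regex_match is
hypothetical and this is not Guile's actual implementation; it only shows
that the matcher receives plain char buffers, so the way bytes group into
characters is decided by the current locale rather than by the caller.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT matches the POSIX extended regex PATTERN, 0 if it
   does not, and -1 if PATTERN fails to compile.  Both arguments are
   locale-encoded byte strings; character boundaries are interpreted
   according to the current locale (LC_CTYPE).  */
static int
locale_regex_match (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);

  return rc == 0 ? 1 : 0;
}
```

A caller holding wide or Unicode strings would first have to convert
them to the locale encoding, which is the role the message above
attributes to scm_to/from_locale_string.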

Thanks,

Mike






* Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
  2009-09-09  4:16   ` Mike Gran
@ 2009-09-09  7:42     ` Ludovic Courtès
  2009-09-09 10:38       ` Mike Gran
  0 siblings, 1 reply; 5+ messages in thread
From: Ludovic Courtès @ 2009-09-09  7:42 UTC (permalink / raw)
  To: guile-devel

Hi,

Mike Gran <spk121@yahoo.com> writes:

> On Wed, 2009-09-09 at 01:00 +0200, Ludovic Courtès wrote:

[...]

>> > -  return scm_getc (input_port);
>> > +  return scm_get_byte_or_eof (input_port);
>> 
>> This is actually an earlier change, but the prototype of scm_getc is now
>> different from that in 1.8.  Presumably, this means that it’s not
>> source-compatible with 1.8, e.g., on platforms where
>> sizeof (int) < sizeof (scm_t_wchar), right?

I was actually referring to the fact that 1.8 has:

  SCM_API int scm_getc (SCM port);

whereas 1.9 has:

  SCM_API scm_t_wchar scm_getc (SCM port);

What do you think?

>> > --- a/libguile/strings.h
>> > +++ b/libguile/strings.h
>> > @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
>> >  SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
>> >  SCM_API SCM scm_string_append (SCM args);
>> >  
>> > -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len, 
>> > +SCM_API SCM scm_i_from_stringn (const char *str, size_t len, 
>> >                                       const char *encoding,
>> >                                       scm_t_string_failed_conversion_handler 
>> >                                       handler);
>> > @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
>> >  SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
>> >  SCM_INTERNAL void scm_i_string_stop_writing (void);
>> >  SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
>> > -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
>> > +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);
>> 
>> Were these changes intended?
>
> Well, one of the two of them was intended.  :)

Shouldn’t both of them remain internal given that they have an ‘_i_’ in
their name?

>> > +  (with-locale "en_US.iso88591"
>> > +    (pass-if-exception "no args" exception:wrong-num-args
>> > +      (regexp-quote))
>> 
>> Is the locale part of the API?  That is, should programs that use
>> regexps explicitly ask for a locale with 8-bit encoding?
>
> Basically yes. The libc regex is 8-bit, and it uses
> scm_to/from_locale_string to convert regex's input and output.

That’s unfortunate but OTOH it’s the same as in 1.8, so I guess it’s OK.

> Until libunistring comes with Unicode regex, I think this is the best we
> can do.

Yes, that would be neat!

Thanks,
Ludo’.






* Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
  2009-09-09  7:42     ` Ludovic Courtès
@ 2009-09-09 10:38       ` Mike Gran
  2009-09-09 13:03         ` Ludovic Courtès
  0 siblings, 1 reply; 5+ messages in thread
From: Mike Gran @ 2009-09-09 10:38 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Wed, 2009-09-09 at 09:42 +0200, Ludovic Courtès wrote:
> Hi,
> >> > -  return scm_getc (input_port);
> >> > +  return scm_get_byte_or_eof (input_port);
> >> 
> >> This is actually an earlier change, but the prototype of scm_getc is now
> >> different from that in 1.8.  Presumably, this means that it’s not
> >> source-compatible with 1.8, e.g., on platforms where
> >> sizeof (int) < sizeof (scm_t_wchar), right?
> 
> I was actually referring to the fact that 1.8 has:
> 
>   SCM_API int scm_getc (SCM port);
> 
> whereas 1.9 has:
> 
>   SCM_API scm_t_wchar scm_getc (SCM port);
> 
> What do you think?

Sorry, I misunderstood.  It is, as you say, incompatible.
scm_t_wchar is scm_t_int32, not int, so 16-bit int platforms
and 64-bit int platforms would notice the change.  I'm fairly
sure Guile doesn't run in 16-bit int platforms, but, 64-bit 
platforms would notice the change.

I'd like to leave it scm_t_wchar == scm_t_int32.  Do you think that's a
problem?

> >> > --- a/libguile/strings.h
> >> > +++ b/libguile/strings.h
> >> > @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
> >> >  SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
> >> >  SCM_API SCM scm_string_append (SCM args);
> >> >  
> >> > -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len, 
> >> > +SCM_API SCM scm_i_from_stringn (const char *str, size_t len, 
> >> >                                       const char *encoding,
> >> >                                       scm_t_string_failed_conversion_handler 
> >> >                                       handler);
> >> > @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
> >> >  SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
> >> >  SCM_INTERNAL void scm_i_string_stop_writing (void);
> >> >  SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
> >> > -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> >> > +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> >> 
> >> Were these changes intended?
> >
> > Well, one of the two of them was intended.  :)
> 
> Shouldn’t both of them remain internal given that they have an ‘_i_’ in
> their name?

I seemed to need to make scm_i_from_stringn into SCM_API so that I could
use it in libguilereadline.  Pragmatically, it is now functioning as
'SCM_API scm_from_stringn'.  The gray area is if libguilereadline is
philosophically 'internal' or 'external'.  If libguilereadline is
philosophically 'internal' it could keep the name scm_i_from_stringn,
but, if that is just confusing, it should probably become
scm_from_stringn.


> > Until libunistring comes with Unicode regex, I think this is the best we
> > can do.
> 
> Yes, that would be neat!

It is on their todo.  They have header files preallocated for it.  It's a
big job, though.

Thanks,
Mike






* Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
  2009-09-09 10:38       ` Mike Gran
@ 2009-09-09 13:03         ` Ludovic Courtès
  0 siblings, 0 replies; 5+ messages in thread
From: Ludovic Courtès @ 2009-09-09 13:03 UTC (permalink / raw)
  To: guile-devel

Mike Gran <spk121@yahoo.com> writes:

> On Wed, 2009-09-09 at 09:42 +0200, Ludovic Courtès wrote:

>> I was actually referring to the fact that 1.8 has:
>> 
>>   SCM_API int scm_getc (SCM port);
>> 
>> whereas 1.9 has:
>> 
>>   SCM_API scm_t_wchar scm_getc (SCM port);
>> 
>> What do you think?
>
> Sorry, I misunderstood.  It is, as you say, incompatible.
> scm_t_wchar is scm_t_int32, not int, so 16-bit int platforms
> and 64-bit int platforms would notice the change.  I'm fairly
> sure Guile doesn't run in 16-bit int platforms, but, 64-bit 
> platforms would notice the change.
>
> I'd like to leave it scm_t_wchar == scm_t_int32.  Do you think that's a
> problem?

I checked on {powerpc64,sparc64,mips64el,ia64}-linux-gnu:

  * sizeof (int) == 4 on all of them;

  * sizeof (long) == 4 on all of them,
    except on ia64 where sizeof (long) == 8.

So presumably we shouldn't worry?
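
A check along these lines could be expressed at compile time.  The
following is only a sketch: sketch_wchar stands in for scm_t_wchar (which,
per the discussion above, is scm_t_int32), and the helper name is made up
for illustration.  It tests whether every 32-bit value fits in an int,
which is what 1.8-style code storing the result of scm_getc in an int
relies on.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-in for scm_t_wchar; assumed here to be a 32-bit signed integer.  */
typedef int32_t sketch_wchar;

/* Refuse to compile where int cannot hold every sketch_wchar value,
   e.g. on a platform with 16-bit int.  Requires a C11 compiler.  */
_Static_assert (INT_MAX >= INT32_MAX && INT_MIN <= INT32_MIN,
                "int is too narrow to hold a 32-bit character value");

/* Run-time variant of the same test: return 1 if int can represent
   every sketch_wchar value, 0 otherwise.  */
static int
wchar_fits_in_int (void)
{
  return INT_MAX >= INT32_MAX && INT_MIN <= INT32_MIN;
}
```

On platforms where sizeof (int) == 4, such as the ones listed above, the
assertion holds and the two return types are interchangeable in practice.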

>> >> > --- a/libguile/strings.h
>> >> > +++ b/libguile/strings.h
>> >> > @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
>> >> >  SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
>> >> >  SCM_API SCM scm_string_append (SCM args);
>> >> >  
>> >> > -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len, 
>> >> > +SCM_API SCM scm_i_from_stringn (const char *str, size_t len, 
>> >> >                                       const char *encoding,
>> >> >                                       scm_t_string_failed_conversion_handler 
>> >> >                                       handler);
>> >> > @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
>> >> >  SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
>> >> >  SCM_INTERNAL void scm_i_string_stop_writing (void);
>> >> >  SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
>> >> > -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
>> >> > +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);
>> >> 
>> >> Were these changes intended?
>> >
>> > Well, one of the two of them was intended.  :)
>> 
>> Shouldn’t both of them remain internal given that they have an ‘_i_’ in
>> their name?
>
> I seemed to need to make scm_i_from_stringn into SCM_API so that I could
> use it in libguilereadline.  Pragmatically, it is now functioning as
> 'SCM_API scm_from_stringn'.

Cool.

> The gray area is if libguilereadline is philosophically 'internal' or
> 'external'.  If libguilereadline is philosophically 'internal' it
> could keep the name scm_i_from_stringn, but, if that is just
> confusing, it should probably become scm_from_stringn.

It's external.  If it needs something like `scm_from_stringn' then
potentially other users will need it as well, so we should have a public
API.

Thanks,
Ludo'.




