unofficial mirror of bug-guile@gnu.org 
* Re: [Bug-autogen] test failure with guile-2.0.2
       [not found] <20110710.001131.2114114687665811411.pipping@lavabit.com>
@ 2011-07-11 13:28 ` Bruce Korb
       [not found] ` <20110711.160906.808950167011716776.pipping@lavabit.com>
       [not found] ` <4E1AFA72.2050803@gnu.org>
  2 siblings, 0 replies; 13+ messages in thread
From: Bruce Korb @ 2011-07-11 13:28 UTC (permalink / raw)
  To: Elias Pipping, bug-guile; +Cc: bug-autogen

Hi Elias, et al.,

On 07/09/11 15:11, Elias Pipping wrote:
> with autogen 5.12 and guile [top of tree] I get a test failure that
> I do not get with the same version of autogen and guile 2.0.2.
>
> The failing test is string.test.
>
> It fails because instead of ending with \001\002\003\377 as expected,
> a generated string ends with \001\002\003?. I'm attaching the relevant
> output. Here's a snippet passed through `od -c`:
>
> 0002060   h   a   s   s   l   e   "   . 001 002 003 377  \r  \n   '<
> 0002260   h   a   s   s   l   e   "   . 001 002 003   ?  \r  \n   '<
>
> where the first line is expected and the second line is what's
> actually returned.
>
> I'm also attaching a script that reproduces the problem and can be run
> from inside a checkout of the guile git repository. It builds guile,
> installs it to a temporary location, then builds autogen 5.12 and
> makes it use that version of guile. In conjunction with git-bisect,
> this revealed that the following commit is to blame:
>
> commit 95f5e303bc7f6174255b12fd1113d69364863762
> Author: Andy Wingo <wingo@pobox.com>
> Date:   Thu Mar 17 18:29:08 2011 +0100
>
>      scm_{to,from}_locale_string use current locale, not current ports
>
>      * libguile/strings.c (scm_to_locale_stringn, scm_from_locale_stringn):
>        Use the encoding of the current locale, not of the current i/o ports.
>        Also use the current conversion strategy.
>
>      * doc/ref/api-data.texi (Conversion to/from C): Update docs.
>
> See also [1].

The intent is that I have several functions:  raw-shell-str, shell-str,
c-string and kr-string each of which produces precisely the same byte
sequence as their argument for the intended target environment.

The first two produce text that, when processed by the shell and passed
through as arguments to a program, will be seen by the program as
identical to the original string handed off to the guile function.
Similarly, c-string and kr-string will produce C variable initialization
text that, after compilation, the compiled program will see the exact
same sequence of bytes given to guile to hand off to the function I wrote.
If this is no longer the case, then guile has changed its interface again.
PLEASE DO NOT DO THAT.  If you think you made a mistake with the interface
I was told to use, please change the interface name and deprecate the
old one rather than cut me off at the knees.  Thank you.  Then I can
transition to another interface.  But please tell me which one.
I just want byte arrays.  I don't want Guile sticking its nose in and
"improving" the sequence of bytes for me (deleting DEL characters, for
example).
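
As an illustration of the byte-for-byte guarantee described above, here is a minimal sketch of the kind of escaping a function like c-string has to perform. It is not autogen's code: the helper name c_literal_escape, its signature and its error convention are all hypothetical. Every non-printable byte is written as a three-digit octal escape, so a byte such as \377 survives compilation unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of autogen or Guile): write `len` bytes of
 * `src` into `dst` as the body of a C string literal.  Printable ASCII is
 * copied as is, `"` and `\` get a leading backslash, and every other byte
 * becomes a three-digit octal escape, so a digit that follows can never be
 * absorbed into the escape.  Returns the number of characters written, not
 * counting the terminating NUL, or (size_t)-1 if `dst` is too small.
 */
static size_t
c_literal_escape(char *dst, size_t dst_size,
                 const unsigned char *src, size_t len)
{
    size_t out = 0;

    assert(dst != NULL && src != NULL && dst_size > 0);

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = src[i];
        char piece[5];          /* longest piece is "\ooo" plus NUL */
        int  n;

        if (ch == '"' || ch == '\\')
            n = snprintf(piece, sizeof piece, "\\%c", ch);
        else if (ch >= 0x20 && ch < 0x7F)
            n = snprintf(piece, sizeof piece, "%c", ch);
        else
            n = snprintf(piece, sizeof piece, "\\%03o", (unsigned)ch);

        /* Leave room for this piece and the final NUL. */
        if (out + (size_t)n + 1 > dst_size)
            return (size_t)-1;

        memcpy(dst + out, piece, (size_t)n);
        out += (size_t)n;
    }

    dst[out] = '\0';
    return out;
}
```

Fed the four bytes 001 002 003 377 from the od dump, this sketch emits the text \001\002\003\377, which a C compiler turns back into the same four bytes.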

Thank you.  Regards, Bruce




* Re: [Bug-autogen] test failure with guile-2.0.2
       [not found] ` <20110711.160906.808950167011716776.pipping@lavabit.com>
@ 2011-07-11 17:32   ` Bruce Korb
  0 siblings, 0 replies; 13+ messages in thread
From: Bruce Korb @ 2011-07-11 17:32 UTC (permalink / raw)
  To: Elias Pipping, bug-guile

On 07/11/11 07:09, Elias Pipping wrote:
> I meant to write:
>
> with autogen 5.12 and guile 2.0.2 I get a test failure that I do not
> get with the same version of autogen and guile 2.0.0.

Ah, right.  I took a guess at intentions.  But, it doesn't matter.

The fact is I was told by folks on the guile list I ought to be
using those scm to/from locale string thingies.  I did.  Now the
code has changed so they don't work the way they used to.  That is
an interface change.  If the interface changes, the interface name
ought to change, too.  Someone please name the correct function
that will not add the eighth bit when the character value is 0x7F
(the DEL character).  I will have to version the interface so that
I only use the locale string stuff prior to 2.0.2.  So much for
getting rid of the versioned interface glue.  :(




* Re: [Bug-autogen] test failure with guile-2.0.2
       [not found] ` <4E1AFA72.2050803@gnu.org>
@ 2011-07-13  8:48   ` Andy Wingo
  2011-07-13 12:54     ` Bruce Korb
       [not found]     ` <4E1D95A0.60504@gnu.org>
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Wingo @ 2011-07-13  8:48 UTC (permalink / raw)
  To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping

Hi Bruce,

I'm not sure what the bug report here is.  I'm getting a lot of angst
though :-)

On Mon 11 Jul 2011 15:28, Bruce Korb <bkorb@gnu.org> writes:

> The intent is that I have several functions:  raw-shell-str, shell-str,
> c-string and kr-string each of which produces precisely the same byte
> sequence as their argument for the intended target environment.

But if I understand you correctly, here you would like to manipulate
*byte sequences* as strings.  Strings are logically character sequences,
so you need to choose a mapping that preserves the identity of bytes
with characters.  That mapping is latin-1.
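
To make the byte/character identity concrete, here is a small self-contained sketch in plain C. It does not use libguile, and all three function names are made up for the illustration. Latin-1 maps byte value N to code point N, so decoding and re-encoding returns the original bytes. The lossy ASCII decoder is only an assumed model of what a 7-bit locale with a substituting conversion strategy does. It is not Guile's implementation, but it reproduces the trailing `?` seen in the od output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Latin-1 decode: each byte value is the code point of the same value. */
static void
latin1_decode(uint32_t *cp, const unsigned char *bytes, size_t len)
{
    assert(cp != NULL && bytes != NULL);
    for (size_t i = 0; i < len; i++)
        cp[i] = bytes[i];
}

/* Latin-1 encode: only defined for code points 0..255. */
static void
latin1_encode(unsigned char *bytes, const uint32_t *cp, size_t len)
{
    assert(bytes != NULL && cp != NULL);
    for (size_t i = 0; i < len; i++) {
        assert(cp[i] <= 0xFF);
        bytes[i] = (unsigned char)cp[i];
    }
}

/*
 * Assumed model of a lossy conversion: bytes outside 7-bit ASCII cannot
 * be decoded, so they are replaced by '?'.
 */
static void
ascii_decode_lossy(uint32_t *cp, const unsigned char *bytes, size_t len)
{
    assert(cp != NULL && bytes != NULL);
    for (size_t i = 0; i < len; i++)
        cp[i] = (bytes[i] < 0x80) ? bytes[i] : (uint32_t)'?';
}
```

With the input bytes 001 002 003 377, the Latin-1 pair returns the same four bytes, while the lossy decoder followed by latin1_encode yields 001 002 003 `?`.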

In the NEWS for 2.0.0:

    ** New procedures: `scm_to_stringn', `scm_from_stringn'
    ** New procedures: scm_{to,from}_{utf8,latin1}_symbol{n,}
    ** New procedures: scm_{to,from}_{utf8,utf32,latin1}_string{n,}
        
    These new procedures convert to and from string representations in
    particular encodings.

    Users should continue to use locale encoding for user input, user
    output, or interacting with the C library.

    Use the Latin-1 functions for ASCII, and for literals in source code.

    Use UTF-8 functions for interaction with modern libraries which deal in
    UTF-8, and UTF-32 for interaction with utf32-using libraries.

    Otherwise, use scm_to_stringn or scm_from_stringn with a specific
    encoding.

See also
http://www.gnu.org/software/guile/manual/html_node/Conversion-to_002ffrom-C.html.

It sounds like you want scm_{to,from}_latin1_string.  On Guile 1.8 and
before, you can #define this to scm_{to,from}_locale_string, as earlier
versions of Guile did not have the needed string encoding support.

Regards,

Andy
-- 
http://wingolog.org/




* Re: [Bug-autogen] test failure with guile-2.0.2
  2011-07-13  8:48   ` Andy Wingo
@ 2011-07-13 12:54     ` Bruce Korb
       [not found]     ` <4E1D95A0.60504@gnu.org>
  1 sibling, 0 replies; 13+ messages in thread
From: Bruce Korb @ 2011-07-13 12:54 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bug-guile, bug-autogen, Elias Pipping

Hi Andy,

On 07/13/11 01:48, Andy Wingo wrote:
> I'm not sure what the bug report here is.  I'm getting a lot of angst
> though :-)

:)  Not angst so much as mild irritation, some of it residue stirred up
from the 1.4 -> 1.6 -> 1.8 transitions.  I was only just recently able
to drop Guile 1.4 support and it will be a couple more before 1.6 goes
away.  My faint hopes of getting rid of the versioned Guile glue layer
are now dashed:

> On Mon 11 Jul 2011 15:28, Bruce Korb <bkorb@gnu.org> writes:
>
>> The intent is that I have several functions:  raw-shell-str, shell-str,
>> c-string and kr-string each of which produces precisely the same byte
>> sequence as their argument for the intended target environment.
>
> But if I understand you correctly, here you would like to manipulate
> *byte sequences* as strings.  Strings are logically character sequences,
> so you need to choose a mapping that preserves the identity of bytes
> with characters.  That mapping is latin-1.

"latin1" is an alias for "ascii byte strings"?  Anyway:

> In the NEWS for 2.0.0:
>
>      ** New procedures: `scm_to_stringn', `scm_from_stringn'
>      ** New procedures: scm_{to,from}_{utf8,latin1}_symbol{n,}
>      ** New procedures: scm_{to,from}_{utf8,utf32,latin1}_string{n,}
>
>      These new procedures convert to and from string representations in
>      particular encodings.
>
>      Users should continue to use locale encoding for user input, user
>      output, or interacting with the C library.

This means that *THE SEMANTICS HAVE CHANGED* for these functions.
New semantics should always imply a new interface name.
This is a new interface.

>      Use the Latin-1 functions for ASCII, and for literals in source code.

The latin-1 functions should be the preferred spelling for the old "locale"
functions.  The new locale functions need a new spelling.  It is confusing
to have old functions performing new tricks.  The old name needs a
compatibility grace period.

> It sounds like you want scm_{to,from}_latin1_string.  On Guile 1.8 and
> before, you can #define this to scm_{to,from}_locale_string, as earlier
> versions of Guile did not have the needed string encoding support.

Actually, I said what I meant.  I want byte array functions where Guile
isn't thinking that it knows better than I do what bit values ought to
be in each byte.  It is an array of byte values, each of which is in the
range of 1 through 255, with the last value always zero (0).

Thank you.  Regards, Bruce




* Re: [Bug-autogen] test failure with guile-2.0.2
       [not found]     ` <4E1D95A0.60504@gnu.org>
@ 2011-07-13 13:41       ` Andy Wingo
  2011-07-13 15:01         ` Thien-Thi Nguyen
                           ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Andy Wingo @ 2011-07-13 13:41 UTC (permalink / raw)
  To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping

Hi,

On Wed 13 Jul 2011 14:54, Bruce Korb <bkorb@gnu.org> writes:

> This means that *THE SEMANTICS HAVE CHANGED* for these functions.
> New semantics should always imply a new interface name.
> This is a new interface.

No need to shout, thank you.  I agree with you.  However in this case we
are covered: the new interface name is libguile-2.0.so.

You happened to run into this issue because of a change in how
scm_{from,to}_locale_string determines what the locale encoding is.  But
you could have run into it in one of many other ways.

In fact, for Guile 2.0, the interface changed to *match* the name.

> I want byte array functions where Guile isn't thinking that it knows
> better than I do what bit values ought to be in each byte.

Use a bytevector, then.  A string is logically an array of characters,
not bytes.

I realize that this is irritating to you, but it is the right thing,
improves the situation for loads of users, and is largely compatible.
But if what you really want is to continue using strings as bytevectors,
you will have to make a small #define, and then be on your way.

Regards,

Andy
-- 
http://wingolog.org/




* Re: [Bug-autogen] test failure with guile-2.0.2
  2011-07-13 13:41       ` Andy Wingo
@ 2011-07-13 15:01         ` Thien-Thi Nguyen
  2011-07-15 16:25           ` Thien-Thi Nguyen
  2011-07-13 15:11         ` Bruce Korb
       [not found]         ` <4E1DB59B.9070700@gnu.org>
  2 siblings, 1 reply; 13+ messages in thread
From: Thien-Thi Nguyen @ 2011-07-13 15:01 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bug-guile, bug-autogen, Bruce Korb, Elias Pipping

() Andy Wingo <wingo@pobox.com>
() Wed, 13 Jul 2011 15:41:42 +0200

   I realize that this is irritating to you, but it is the right thing,
   improves the situation for loads of users, and is largely compatible.

I think when you say "it is the right thing", you are missing the point.
Try to jump up and see things from a perspective that includes the motion
of the people who follow the leader.  That will make you a better leader.




* Re: [Bug-autogen] test failure with guile-2.0.2
  2011-07-13 13:41       ` Andy Wingo
  2011-07-13 15:01         ` Thien-Thi Nguyen
@ 2011-07-13 15:11         ` Bruce Korb
       [not found]         ` <4E1DB59B.9070700@gnu.org>
  2 siblings, 0 replies; 13+ messages in thread
From: Bruce Korb @ 2011-07-13 15:11 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bug-guile, bug-autogen, Elias Pipping

On 07/13/11 06:41, Andy Wingo wrote:
> No need to shout, thank you.  I agree with you.  However in this case we
> are covered: the new interface name is libguile-2.0.so.

Sorry.  It's hard to get nuanced vocal inflections encoded into text.
Too much emphasis, I guess.

> You happened to run into this issue because of a change in how
> scm_{from,to}_locale_string determines what the locale encoding is.  But
> you could have run into it in one of many other ways.
>
> In fact, for Guile 2.0, the interface changed to *match* the name.

Well, yes, but for source distributions, it gets compiled and linked
against whatever the current platform's default Guile happens to be.
You are covered only for built binary executables, not stuff that gets
recompiled expecting the old behavior.  That is why the API name
ought to change.  I wrote code based on being told that the old char
interface is now obsolete (the SCM_CHARS stuff).  The new spelling is
scm_*_locale_string().  I didn't particularly care for "locale" being
in the name, but there was no alternative.  And now these new functions
have changed their semantics to match their names better.  One can
argue that it is better, but the truth is that if the semantics change,
then it is a new interface and a changed interface ought to have a
changed name, emphasizing that it is not the same function anymore.

>> I want byte array functions where Guile isn't thinking that it knows
>> better than I do what bit values ought to be in each byte.
>
> Use a bytevector, then.  A string is logically an array of characters,
> not bytes.
>
> I realize that this is irritating to you, but it is the right thing,

I beg to differ.  It is not the right thing.  It is certainly more
consistent (behavior and name), but, repeating myself yet again,
different behavior is a different function and needs a different name.
Even though you really like the name.  New name, please.

> improves the situation for loads of users, and is largely compatible.

"almost" and "largely" and "nearly" are some of my favorite adjectives
to use about software development.

> But if what you really want is to continue using strings as bytevectors,
> you will have to make a small #define, and then be on your way.

This is shorter now that I am not fretting over Guile 1.4.x:

#if   GUILE_VERSION < 107000
# define AG_SCM_BOOL_P(_b)            SCM_BOOLP(_b)
# define AG_SCM_CHAR(_c)              gh_scm2char(_c)
# define AG_SCM_CHARS(_s)             SCM_CHARS(_s)
# define AG_SCM_FALSEP(_r)            SCM_FALSEP(_r)
# define AG_SCM_FROM_LONG(_l)         gh_long2scm(_l)
# define AG_SCM_INT2SCM(_i)           gh_int2scm(_i)
# define AG_SCM_IS_PROC(_p)           SCM_NFALSEP( scm_procedure_p(_p))
# define AG_SCM_LIST_P(_l)            SCM_NFALSEP( scm_list_p(_l))
# define AG_SCM_LISTOFNULL()          scm_listofnull
# define AG_SCM_LONG2SCM(_i)          gh_long2scm(_i)
# define AG_SCM_NFALSEP(_r)           SCM_NFALSEP(_r)
# define AG_SCM_NULLP(_m)             SCM_NULLP(_m)
# define AG_SCM_NUM_P(_n)             SCM_NUMBERP(_n)
# define AG_SCM_PAIR_P(_p)            SCM_NFALSEP( scm_pair_p(_p))
# define AG_SCM_STR02SCM(_s)          scm_makfrom0str(_s)
# define AG_SCM_STR2SCM(_st,_sz)      scm_mem2string(_st,_sz)
# define AG_SCM_STRING_P(_s)          SCM_STRINGP(_s)
# define AG_SCM_STRLEN(_s)            SCM_STRING_LENGTH(_s)
# define AG_SCM_SYM_P(_s)             SCM_SYMBOLP(_s)
# define AG_SCM_TO_INT(_i)            gh_scm2int(_i)
# define AG_SCM_TO_LONG(_v)           gh_scm2long(_v)
# define AG_SCM_TO_NEWSTR(_s)         gh_scm2newstr(_s, NULL)
# define AG_SCM_TO_ULONG(_v)          gh_scm2ulong(_v)
# define AG_SCM_VEC_P(_v)             SCM_VECTORP(_v)

#elif GUILE_VERSION < 201000
# define AG_SCM_BOOL_P(_b)            scm_is_bool(_b)
# define AG_SCM_CHAR(_c)              SCM_CHAR(_c)
# define AG_SCM_CHARS(_s)             scm_i_string_chars(_s)
# define AG_SCM_FALSEP(_r)            scm_is_false(_r)
# define AG_SCM_FROM_LONG(_l)         scm_from_long(_l)
# define AG_SCM_INT2SCM(_i)           scm_from_int(_i)
# define AG_SCM_IS_PROC(_p)           scm_is_true( scm_procedure_p(_p))
# define AG_SCM_LIST_P(_l)            scm_is_true( scm_list_p(_l))
# define AG_SCM_LISTOFNULL()          scm_list_1(SCM_EOL)
# define AG_SCM_LONG2SCM(_i)          scm_from_long(_i)
# define AG_SCM_NFALSEP(_r)           scm_is_true(_r)
# define AG_SCM_NULLP(_m)             scm_is_null(_m)
# define AG_SCM_NUM_P(_n)             scm_is_number(_n)
# define AG_SCM_PAIR_P(_p)            scm_is_true( scm_pair_p(_p))
# define AG_SCM_STR02SCM(_s)          scm_from_locale_string(_s)
# define AG_SCM_STR2SCM(_st,_sz)      scm_from_locale_stringn(_st,_sz)
# define AG_SCM_STRING_P(_s)          scm_is_string(_s)
# define AG_SCM_STRLEN(_s)            scm_c_string_length(_s)
# define AG_SCM_SYM_P(_s)             scm_is_symbol(_s)
# define AG_SCM_TO_INT(_i)            scm_to_int(_i)
# define AG_SCM_TO_LONG(_v)           scm_to_long(_v)
# define AG_SCM_TO_NEWSTR(_s)         scm_to_locale_string(_s)
# define AG_SCM_TO_ULONG(_v)          scm_to_ulong(_v)
# define AG_SCM_VEC_P(_v)             scm_is_vector(_v)
#else
#error unknown GUILE_VERSION
#endif

Regards, Bruce




* Re: [Bug-autogen] test failure with guile-2.0.2
       [not found]         ` <4E1DB59B.9070700@gnu.org>
@ 2011-07-14  9:01           ` Andy Wingo
  2011-07-17 20:47             ` ‘scm_from_locale_string’ and locale character encoding Ludovic Courtès
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Wingo @ 2011-07-14  9:01 UTC (permalink / raw)
  To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping

Hi Bruce,

On Wed 13 Jul 2011 17:11, Bruce Korb <bkorb@gnu.org> writes:

> On 07/13/11 06:41, Andy Wingo wrote:
>> the new interface name is libguile-2.0.so.
>
> Well, yes, but for source distributions, it gets compiled and linked
> against whatever the current platform's default Guile happens to be.

I would like to avoid this circumstance in the future, while still
preserving Guile's ability to change.  Hopefully whenever you decide
that you can stop supporting Guile 1.6, as we have, then you can switch
to use pkg-config.  This will allow you to specify the versions of Guile
that you support, at build-time, *and choose them* from among a number
of installed Guile versions.

This way, you only deal with changes in Guile 2.2 *when you choose* to
upgrade to Guile 2.2.  Presumably at that point you read the NEWS as
well :-)

But, with the current guile-config situation, that's not the case.  You
end up dealing with changes in Guile when you're trying to do something
else, as now.  But it's bugs versus bugs, right?  What did you expect us
to do, deprecate scm_from_locale_string because in 1.8 it could be
treated as a byte array, after introducing it in 1.8?

> New name, please.

We'll try harder in the future.  But we cannot change the fact that
scm_from_locale_string does decode its argument.

One thing we might be able to do is Mark Weaver's "permissive" string
trick using the reserved unicode codepoints.  That code doesn't exist
yet though.

> #if   GUILE_VERSION < 107000
> # define AG_SCM_BOOL_P(_b)            SCM_BOOLP(_b)
> # define AG_SCM_CHAR(_c)              gh_scm2char(_c)

IMO, this is not the way to do deprecation.  The way to go is to use the
new names, and #define implementations for older Guile.  That way your
source is cleaner, and you get deprecation messages when current
functions are deprecated.

So here you should use scm_is_bool and SCM_CHAR.
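
The pattern being recommended can be sketched without libguile as follows. Every name in it (DEMO_LIB_VERSION, DEMO_OLD_BOOLP, demo_is_bool, count_bools) is a hypothetical stand-in, not a Guile or autogen identifier. The application is written once against the new name, and only builds against an old release supply that name on top of the old spelling.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a made-up version number playing the role of GUILE_VERSION. */
#ifndef DEMO_LIB_VERSION
# define DEMO_LIB_VERSION 106000
#endif

#if DEMO_LIB_VERSION < 107000
/* Stand-in for the old spelling that an old library release exports. */
static int
DEMO_OLD_BOOLP(int tag)
{
    return tag == 0 || tag == 1;
}
/* Old release: provide the new name in terms of the old one. */
# define demo_is_bool(x)  DEMO_OLD_BOOLP(x)
#else
/* Stand-in for a new release, which exports the new name itself. */
static int
demo_is_bool(int tag)
{
    return tag == 0 || tag == 1;
}
#endif

/* Application code uses only the new name, whatever the version. */
static int
count_bools(const int *tags, size_t n)
{
    int count = 0;

    assert(tags != NULL);
    for (size_t i = 0; i < n; i++)
        if (demo_is_bool(tags[i]))
            count++;
    return count;
}
```

Applied to this thread, the same shape would mean calling scm_{from,to}_latin1_string everywhere and defining those names in terms of scm_{from,to}_locale_string only when building against Guile 1.8 or earlier.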

> # define AG_SCM_CHARS(_s)             SCM_CHARS(_s)
[...]
> #elif GUILE_VERSION < 201000
> # define AG_SCM_CHARS(_s)             scm_i_string_chars(_s)

This is totally incorrect, and a bit dangerous.  It won't work if you
have a wide string.  The "_i_" in the name is for "internal".  The
NEWS for 1.8.0, from February 2006, notes:

    ** The macros SCM_STRINGP, SCM_STRING_CHARS, SCM_STRING_LENGTH,
       SCM_SYMBOL_CHARS, and SCM_SYMBOL_LENGTH have been deprecated.

    They export too many assumptions about the implementation of strings
    and symbols that are no longer true in the presence of
    mutation-sharing substrings and when Guile switches to some form of
    Unicode.

    When working with strings, it is often best to use the normal string
    functions provided by Guile, such as scm_c_string_ref,
    scm_c_string_set_x, scm_string_append, etc.  Be sure to look in the
    manual since many more such functions are now provided than
    previously.

    When you want to convert a SCM string to a C string, use the
    scm_to_locale_string function or similar instead.  For symbols, use
    scm_symbol_to_string and then work with that string.  Because of the
    new string representation, scm_symbol_to_string does not need to copy
    and is thus quite efficient.

You'll be much less surprised at things if you read the NEWS when new
major versions are released :-)

Finally, as I think we have discussed already all of the relevant
aspects of this situation, we need to move on.  The easiest thing to do
is for you to put in a couple of #defines for
scm_{from,to}_latin1_string.  Then we can go back to building GNU!

Regards,

Andy
-- 
http://wingolog.org/




* Re: [Bug-autogen] test failure with guile-2.0.2
  2011-07-13 15:01         ` Thien-Thi Nguyen
@ 2011-07-15 16:25           ` Thien-Thi Nguyen
  2011-07-18  9:30             ` Andy Wingo
  0 siblings, 1 reply; 13+ messages in thread
From: Thien-Thi Nguyen @ 2011-07-15 16:25 UTC (permalink / raw)
  To: bug-guile

() Thien-Thi Nguyen <ttn@gnuvola.org>
() Wed, 13 Jul 2011 17:01:17 +0200

   Try to jump up and see things from a perspective that includes the motion
   of the people who follow the leader.  That will make you a better leader.

A little bird had two things to say to me:
- that was a pretty wise-ass way to say things (w/ emphasis on ASS);
- what makes you an expert at being a leader?

To the first, i guess everyone is entitled to their preferred analogy.
Perhaps a better blend might have been to take the recent post by Andy wrt
lambda tribalism and draw the analogy between the benefits of functional style
programming and functional style interface design, or rather the latent woe
associated with their non-functional practice.  In this case, "semantics
changed while name unchanged" is basically a big fat ‘set!’ to the libguile
API.  Reasoning and optimizations are out the window because the trust is
broken.  We revert to coping behaviors and ugly gnashing (e.g., Guile-BAUX).

But i didn't have the energy to say that then, and this brings me to the
second answer: i am no expert at being a leader, but i know regret when i feel
it.  I feel regret now for not finding the energy then, but i also felt regret
then, as a feeling i would not wish upon Andy in the future, with the ‘set!’
fanout growing ever larger, with long-time Guile users dying a little inside
at every thought of balancing new-and-shiny cool w/ new-and-shiny pain, with
that soreness manifesting in mostly-offtopic threads, etc, etc.  That way lies
dissipation.

It doesn't need to be that way with functional style interface design, but
of course, w/ more bindings, the burden of unfettered generation can indeed
weigh heavy.  Hopefully, this serves to push back onto the designer the need
to really think things through, to account for continuity and compatibility in
the quest for new functionality.  I suppose the trick is to use the regret you
know you will feel to fuel the intensity of the design process, kind of like
consciously structuring code to avoid ‘set!’.

To get back on-topic, since the change was relatively recent, one way forward
would be to revert it and redesign.  We could simply say "ok, that was a
mistake, please avoid Guile 2.0.[01] (or whatever), sorry we will be more
careful in the future".  The other way is to do as Bruce suggests, add another
interface element.  To me, the former takes some chutzpah but is preferable
anyway -- it shows a balance of strength and gentleness, not to mention regret.




* ‘scm_from_locale_string’ and locale character encoding
  2011-07-14  9:01           ` Andy Wingo
@ 2011-07-17 20:47             ` Ludovic Courtès
  2011-07-18 16:18               ` Bruce Korb
       [not found]               ` <4E245CC5.4040505@gnu.org>
  0 siblings, 2 replies; 13+ messages in thread
From: Ludovic Courtès @ 2011-07-17 20:47 UTC (permalink / raw)
  To: Andy Wingo; +Cc: bug-guile, bug-autogen, Bruce Korb, Elias Pipping

Hello!

Andy Wingo <wingo@pobox.com> skribis:

> On Wed 13 Jul 2011 17:11, Bruce Korb <bkorb@gnu.org> writes:

[...]

>> New name, please.
>
> We'll try harder in the future.  But we cannot change the fact that
> scm_from_locale_string does decode its argument.

FWIW, I do understand the inconvenience and frustration reported here.

But as Andy suggests, I think it should come as no surprise that
‘scm_from_locale_string’ returns a string from a locale-encoded one.
Guile 1.8 already documented things this way [0].

As for (ab)using strings as byte sequences, it was bound to break with
the introduction of Unicode support.  To transition away from that, 1.8
had SRFI-4 and related I/O support (‘uniform-vector-read!’, etc.),
though it wasn’t as convenient as R6RS bytevectors.

Thanks,
Ludo’.

[0] http://www.gnu.org/software/guile/docs/docs-1.8/guile-ref/Conversion-to_002ffrom-C.html#index-scm_005ffrom_005flocale_005fstring-846




* Re: [Bug-autogen] test failure with guile-2.0.2
  2011-07-15 16:25           ` Thien-Thi Nguyen
@ 2011-07-18  9:30             ` Andy Wingo
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Wingo @ 2011-07-18  9:30 UTC (permalink / raw)
  To: Thien-Thi Nguyen; +Cc: bug-guile

Hi Thien-Thi,

On Fri 15 Jul 2011 18:25, Thien-Thi Nguyen <ttn@gnuvola.org> writes:

> "semantics changed while name unchanged" is basically a big fat ‘set!’
> to the libguile API.

Very true.  It seems that maintaining a library is an exercise in
managing mutation -- mutation of your software from state A to B, and
the corresponding mutation of your library's users.

There is an analog to the functional / imperative thing here regarding
our community: we can add names until the cows come home, but that
increases our heap size---both real, on machines, and virtual in terms
of size of API and mental load---until GC runs, in which case users of
old definitions have to migrate anyway.

But we don't always have to avoid mutation; we can use it where it is
sensible.  For example, strictly following "semantics changes -> name
changes" means introducing new names whenever you fix a bug, because
hey, someone might have been relying on it.  This obviously increases GC
costs.  I think we probably agree that bugs can be fixed without names
changing, so there is some middle point that is a good compromise
between heap size and GC rate.  Heap compaction is possible at major
GC's, when new major versions are released.

One cost of name allocation is that it turns part of our community into
garbage, because they use the old names that will be collected at some
point.  They will then have to change their code to point to live names.
If they do it sooner, we have a more unified, dynamic community, ready
to face changes.  If they do it later, we have a more brittle,
fragmented community.

That's not to say that we should treat our users like garbage, of
course! ;-)

Anyway, I continue to disagree regarding this particular point, but I do
appreciate the general concerns.  We're building something here --
specifically, GNU -- and it's irritating to have to take off your roof
because the walls aren't right.  But sometimes it is necessary, so we
need to manage this mutation, of ourselves & of our code.

Regards,

Andy
-- 
http://wingolog.org/




* Re: ‘scm_from_locale_string’ and locale character encoding
  2011-07-17 20:47             ` ‘scm_from_locale_string’ and locale character encoding Ludovic Courtès
@ 2011-07-18 16:18               ` Bruce Korb
       [not found]               ` <4E245CC5.4040505@gnu.org>
  1 sibling, 0 replies; 13+ messages in thread
From: Bruce Korb @ 2011-07-18 16:18 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: bug-guile, bug-autogen, Elias Pipping

On 07/17/11 13:47, Ludovic Courtès wrote:
> FWIW, I do understand the inconvenience and frustration reported here.
>
> But as Andy suggests, I think it should come as no surprise that
> ‘scm_from_locale_string’ returns a string from a locale-encoded one.
> Guile 1.8 already documented things this way [0].

The issue is less about scm_from_locale_string than:

* This appears to me to be habitual.  If it were not, my glue layer
   would not be so large.

* This is a behavioral change in a bug fix release.  The kind of
   change one might more readily anticipate in a 2.0.x to 2.1 transition.

* There was no warning mechanism put in place.  Release docs don't
   count because code I ship now will be built against Guile releases
   that have not been released yet.  I've not seen 2.0.2 even yet.

WRT the misuse of this particular function, what happened from
my perspective is that I was happily using the SCM_CHARS stuff with:
   export GUILE_WARN_DEPRECATED=detailed
in my testing environment and not getting any testing errors.
Then my clients report that they cannot build it any more.
(Why didn't the deprecation warning fire?  How about adding
GUILE_ERROR_DEPRECATED??  Then my testing should fail immediately!
In any event, I scan the test logs for "warn" and saw nothing.)
Somebody somewhere said, "use scm_from_locale_string".  I looked
it up and saw no better alternative, used it and it worked as I
needed it to.  Now it doesn't.

My biggest desire is to not have to read release documents and
figure out if somewhere in there is something that affects
my usage of the guile library interface.  I want to be thumped
with GUILE_WARN_DEPRECATED and have an obvious replacement.
Thank you.


/**
  *  Get the NUL terminated string from an SCM.
  *  As of Guile 1.7.x, access to the NUL terminated string referenced by
  *  an SCM is no longer guaranteed.  Therefore, we must extract the string
  *  into one of our "scribble" buffers.
  *
  * @param  s     the string to convert
  * @param  type  a string describing the string
  * @return a NUL terminated string, or it aborts.
  */
LOCAL char *
ag_scm2zchars(SCM s, const char * type)
{
#if GUILE_VERSION < 107000  /* pre-Guile 1.7.x */

     if (! AG_SCM_STRING_P(s))
         AG_ABEND(aprf(zNotStr, type));

     if (SCM_SUBSTRP(s))
         s = scm_makfromstr(SCM_CHARS(s), SCM_LENGTH(s), 0);
     return SCM_CHARS(s);

#else
     static char const bad_val[] =
         "scm_string_length returned wrong value: %d != %d\n";
     size_t len;
     char * buf;

     if (! AG_SCM_STRING_P(s))
         AG_ABEND(aprf("%s is not a string", type));

     len = scm_c_string_length(s);
     if (len == 0) {
         static char z = NUL;
         return &z;
     }

     buf = ag_scribble(len+1);

     {
         size_t buflen = scm_to_locale_stringbuf(s, buf, len);
         if (buflen != len)
             AG_ABEND(aprf(bad_val, (int)buflen, (int)len));
     }

     buf[len] = NUL;
     return buf;
#endif
}
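
One hazard in the length check above, sketched under the assumption of a multibyte locale such as UTF-8: scm_c_string_length counts characters, while scm_to_locale_stringbuf reports a byte count in the locale encoding, so the two can differ for any string with a non-ASCII character. The helpers below are plain C with hypothetical names and do not call libguile. They only show how the two counts diverge.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count characters (code points) in a well-formed, NUL-terminated UTF-8
 * string: every byte that is not a continuation byte (10xxxxxx) starts
 * a new character.
 */
static size_t
utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Count bytes, which is what a buffer for the encoded text must hold. */
static size_t
utf8_byte_count(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```

For the UTF-8 bytes of "café" (c, a, f, 0xC3, 0xA9) the character count is 4 and the byte count is 5, so a buffer sized from the character count would be one byte short.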




* Re: ‘scm_from_locale_string’ and locale character encoding
       [not found]               ` <4E245CC5.4040505@gnu.org>
@ 2011-07-19  9:35                 ` Ludovic Courtès
  0 siblings, 0 replies; 13+ messages in thread
From: Ludovic Courtès @ 2011-07-19  9:35 UTC (permalink / raw)
  To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping

Hi Bruce,

Bruce Korb <bkorb@gnu.org> skribis:

> On 07/17/11 13:47, Ludovic Courtès wrote:
>> FWIW, I do understand the inconvenience and frustration reported here.
>>
>> But as Andy suggests, I think it should come as no surprise that
>> ‘scm_from_locale_string’ returns a string from a locale-encoded one.
>> Guile 1.8 already documented things this way [0].
>
> The issue is less about scm_from_locale_string than:

[...]

> * This is a behavioral change in a bug fix release.  The kind of
>   change one might more readily anticipate in a 2.0.x to 2.1 transition.

Wait, I was referring to the fact that scm_from_locale_string actually
honors the current locale, which was introduced in the 1.8 → 2.0
transition along with Unicode support.

Did I misunderstand the change you were referring to?

Thanks,
Ludo’.



