* Re: [Bug-autogen] test failure with guile-2.0.2 [not found] <20110710.001131.2114114687665811411.pipping@lavabit.com> @ 2011-07-11 13:28 ` Bruce Korb [not found] ` <20110711.160906.808950167011716776.pipping@lavabit.com> [not found] ` <4E1AFA72.2050803@gnu.org> 2 siblings, 0 replies; 13+ messages in thread From: Bruce Korb @ 2011-07-11 13:28 UTC (permalink / raw) To: Elias Pipping, bug-guile; +Cc: bug-autogen Hi Elias, et al., On 07/09/11 15:11, Elias Pipping wrote: > with autogen 5.12 and guile [top of tree] I get a test failure that > I do not get with the same version of autogen and guile 2.0.2. > > The failing test is string.test. > > It fails because instead of ending with \001\002\003\377 as expected, > a generated string ends with \001\002\003?. I'm attaching the relevant > output. Here's a snippet passed through `od -c`: > > 0002060 h a s s l e " . 001 002 003 377 \r \n '< > 0002260 h a s s l e " . 001 002 003 ? \r \n '< > > where the first line is expected and the second line is what's > actually returned. > > I'm also attaching a script that reproduces the problem and can be run > from inside a checkout of the guile git repository. It builds guile, > installs it to a temporary location, then builds autogen 5.12 and > makes it use that version of guile; In conjunction with git-bisect, > this revealed that the following commit is to blame: > > commit 95f5e303bc7f6174255b12fd1113d69364863762 > Author: Andy Wingo<wingo@pobox.com> > Date: Thu Mar 17 18:29:08 2011 +0100 > > scm_{to,from}_locale_string use current locale, not current ports > > * libguile/strings.c (scm_to_locale_stringn, scm_from_locale_stringn): > Use the encoding of the current locale, not of the current i/o ports. > Also use the current conversion strategy. > > * doc/ref/api-data.texi (Conversion to/from C): Update docs. > > See also [1]. 
The intent is that I have several functions: raw-shell-str, shell-str, c-string and kr-string each of which produces precisely the same byte sequence as their argument for the intended target environment. The first two produce text that, when processed by the shell and passed through as arguments to a program, will be seen by the program as identical to the original string handed off to the guile function. Similarly, c-string and kr-string will produce C variable initialization text that, after compilation, the compiled program will see the exact same sequence of bytes given to guile to hand off to the function I wrote. If this is no longer the case, then guile has changed its interface again. PLEASE DO NOT DO THAT. If you think you made a mistake with the interface I was told to use, please change the interface name and deprecate the old one rather than cut me off at the knees. Thank you. Then I can transition to another interface. But please tell me which one. I just want byte arrays. I don't want Guile sticking its nose in and "improving" the sequence of bytes for me (deleting DEL characters, for example). Thank you. Regards, Bruce ^ permalink raw reply [flat|nested] 13+ messages in thread
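As a rough illustration of the property described above for c-string (the emitted C initializer must compile back to exactly the bytes that went in), here is a minimal sketch in C. The helper name `emit_c_literal` and its buffer-size convention are hypothetical and are not taken from autogen; the only thing borrowed from the thread is the expectation that a byte such as 0xFF shows up as `\377` in the generated text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (NOT autogen's actual c-string code): write a C
 * string literal for the `len` bytes at `src` into `dst`.
 * Returns the number of characters written, excluding the trailing NUL,
 * or 0 if `dst` is too small for the worst case.
 */
static size_t
emit_c_literal(const unsigned char *src, size_t len,
               char *dst, size_t dst_size)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    /* Worst case: 4 characters per byte, two quotes, one NUL. */
    if (dst_size < len * 4 + 3)
        return 0;

    dst[out++] = '"';
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];

        if (c == '"' || c == '\\') {
            /* Quote and backslash need a backslash escape. */
            dst[out++] = '\\';
            dst[out++] = (char)c;
        } else if (c >= 0x20 && c < 0x7F) {
            /* Printable ASCII is copied through unchanged. */
            dst[out++] = (char)c;
        } else {
            /* Everything else becomes a three-digit octal escape.
             * Three digits is the maximum an octal escape consumes,
             * so a following digit character cannot be absorbed. */
            char esc[5];
            snprintf(esc, sizeof esc, "\\%03o", (unsigned)c);
            memcpy(dst + out, esc, 4);
            out += 4;
        }
    }
    dst[out++] = '"';
    dst[out] = '\0';
    return out;
}
```

Called on the bytes `h i 001 002 003 377`, this yields the literal `"hi\001\002\003\377"`. The escaping step itself never looks at character encodings, so any substitution such as `377` turning into `?` has to happen before the bytes reach a function like this one.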
* Re: [Bug-autogen] test failure with guile-2.0.2 [not found] ` <20110711.160906.808950167011716776.pipping@lavabit.com> @ 2011-07-11 17:32 ` Bruce Korb 0 siblings, 0 replies; 13+ messages in thread From: Bruce Korb @ 2011-07-11 17:32 UTC (permalink / raw) To: Elias Pipping, bug-guile On 07/11/11 07:09, Elias Pipping wrote: > I meant write: > > with autogen 5.12 and guile 2.0.2 I get a test failure that I do not > get with the same version of autogen and guile 2.0.0. Ah, right. I took a guess at intentions. But, it doesn't matter. The fact is I was told by folks on the guile list I ought to be using those scm to/from locale string thingies. I did. Now the code has changed so they don't work the way they used to. That is an interface change. If the interface changes, the interface name ought to change, too. Someone please name the correct function that will not add the eighth bit when the character value is 0x7F (the DEL character). I will have to version the interface so that I only use the locale string stuff prior to 2.0.2. So much for getting rid of the versioned interface glue. :( ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bug-autogen] test failure with guile-2.0.2 [not found] ` <4E1AFA72.2050803@gnu.org> @ 2011-07-13 8:48 ` Andy Wingo 2011-07-13 12:54 ` Bruce Korb [not found] ` <4E1D95A0.60504@gnu.org> 0 siblings, 2 replies; 13+ messages in thread From: Andy Wingo @ 2011-07-13 8:48 UTC (permalink / raw) To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping Hi Bruce, I'm not sure what the bug report here is. I'm getting a lot of angst though :-) On Mon 11 Jul 2011 15:28, Bruce Korb <bkorb@gnu.org> writes: > The intent is that I have several functions: raw-shell-str, shell-str, > c-string and kr-string each of which produces precisely the same byte > sequence as their argument for the intended target environment. But if I understand you correctly, here you would like to manipulate *byte sequences* as strings. Strings are logically character sequences, so you need to choose a mapping that preserves the identity of bytes with characters. That mapping is latin-1. In the NEWS for 2.0.0: ** New procedures: `scm_to_stringn', `scm_from_stringn' ** New procedures: scm_{to,from}_{utf8,latin1}_symbol{n,} ** New procedures: scm_{to,from}_{utf8,utf32,latin1}_string{n,} These new procedures convert to and from string representations in particular encodings. Users should continue to use locale encoding for user input, user output, or interacting with the C library. Use the Latin-1 functions for ASCII, and for literals in source code. Use UTF-8 functions for interaction with modern libraries which deal in UTF-8, and UTF-32 for interaction with utf32-using libraries. Otherwise, use scm_to_stringn or scm_from_stringn with a specific encoding. See also http://www.gnu.org/software/guile/manual/html_node/Conversion-to_002ffrom-C.html. It sounds like you want scm_{to,from}_latin1_string. On Guile 1.8 and before, you can #define this to scm_{to,from}_locale_string, as earlier versions of Guile did not have the needed string encoding support. 
Regards,

Andy
--
http://wingolog.org/
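The point about Latin-1 being the mapping that "preserves the identity of bytes with characters" can be sketched without libguile at all. The following C fragment is only a model: the function names are hypothetical, and the `?` substitution is an assumption about how a substituting conversion in an ASCII-only locale behaves, chosen because it matches the `377` to `?` change in the reported `od` output. It is not Guile's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Latin-1 decoding: byte value N becomes code point N, for all 0..255. */
static void
decode_latin1(const unsigned char *src, size_t len, uint32_t *cp)
{
    for (size_t i = 0; i < len; i++)
        cp[i] = src[i];
}

/* Model of an ASCII-only decoder with substitution (an assumption):
 * any byte outside 0..127 is replaced by '?'. */
static void
decode_ascii_subst(const unsigned char *src, size_t len, uint32_t *cp)
{
    for (size_t i = 0; i < len; i++)
        cp[i] = (src[i] < 0x80) ? src[i] : (uint32_t)'?';
}

/* Latin-1 encoding: code points above 0xFF cannot be represented
 * and are replaced by '?'. */
static void
encode_latin1(const uint32_t *cp, size_t len, unsigned char *dst)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (cp[i] <= 0xFF) ? (unsigned char)cp[i] : (unsigned char)'?';
}

/* Returns 1 if `in` survives a decode/encode round trip unchanged.
 * `use_latin1` selects which decoder is modelled.  Handles up to 256
 * bytes; longer inputs are rejected by the assert. */
static int
bytes_survive(const unsigned char *in, size_t len, int use_latin1)
{
    uint32_t cp[256];
    unsigned char out[256];

    assert(len <= 256);
    if (use_latin1)
        decode_latin1(in, len, cp);
    else
        decode_ascii_subst(in, len, cp);
    encode_latin1(cp, len, out);
    return memcmp(in, out, len) == 0;
}
```

With the input `001 002 003 377`, `bytes_survive(in, 4, 1)` reports that the bytes are intact, while the ASCII-with-substitution model loses the final byte. That is the distinction between the latin1 and locale conversion functions being drawn in the message above.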
* Re: [Bug-autogen] test failure with guile-2.0.2
From: Bruce Korb @ 2011-07-13 12:54 UTC
To: Andy Wingo; +Cc: bug-guile, bug-autogen, Elias Pipping

Hi Andy,

On 07/13/11 01:48, Andy Wingo wrote:
> I'm not sure what the bug report here is. I'm getting a lot of angst
> though :-)

:) Not angst so much as mild irritation, some of it stirred-up residue
from the 1.4 -> 1.6 -> 1.8 transitions. I was only just recently able to
drop Guile 1.4 support and it will be a couple more before 1.6 goes away.
My faint hopes of getting rid of the versioned Guile glue layer are now
dashed:

> On Mon 11 Jul 2011 15:28, Bruce Korb<bkorb@gnu.org> writes:
>
>> The intent is that I have several functions: raw-shell-str, shell-str,
>> c-string and kr-string each of which produces precisely the same byte
>> sequence as their argument for the intended target environment.
>
> But if I understand you correctly, here you would like to manipulate
> *byte sequences* as strings. Strings are logically character sequences,
> so you need to choose a mapping that preserves the identity of bytes
> with characters. That mapping is latin-1.

"latin1" is an alias for "ascii byte strings"? Anyway:

> In the NEWS for 2.0.0:
>
> ** New procedures: `scm_to_stringn', `scm_from_stringn'
> ** New procedures: scm_{to,from}_{utf8,latin1}_symbol{n,}
> ** New procedures: scm_{to,from}_{utf8,utf32,latin1}_string{n,}
>
> These new procedures convert to and from string representations in
> particular encodings.
>
> Users should continue to use locale encoding for user input, user
> output, or interacting with the C library.

This means that *THE SEMANTICS HAVE CHANGED* for these functions.
New semantics should always imply a new interface name.
This is a new interface.

> Use the Latin-1 functions for ASCII, and for literals in source code.

The latin-1 functions should be the preferred spelling for the old
"locale" functions. The new locale functions need a new spelling.
It is confusing to have old functions performing new tricks.
The old name needs a compatibility grace period.

> It sounds like you want scm_{to,from}_latin1_string. On Guile 1.8 and
> before, you can #define this to scm_{to,from}_locale_string, as earlier
> versions of Guile did not have the needed string encoding support.

Actually, I said what I meant. I want byte array functions where
Guile isn't thinking that it knows better than I do what bit values
ought to be in each byte. It is an array of byte values, each of which
is in the range of 1 through 255, with the last value always zero (0).

Thank you. Regards, Bruce
* Re: [Bug-autogen] test failure with guile-2.0.2 [not found] ` <4E1D95A0.60504@gnu.org> @ 2011-07-13 13:41 ` Andy Wingo 2011-07-13 15:01 ` Thien-Thi Nguyen ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Andy Wingo @ 2011-07-13 13:41 UTC (permalink / raw) To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping Hi, On Wed 13 Jul 2011 14:54, Bruce Korb <bkorb@gnu.org> writes: > This means that *THE SEMANTICS HAVE CHANGED* for these functions. > New semantics should always imply a new interface name. > This is a new interface. No need to shout, thank you. I agree with you. However in this case we are covered: the new interface name is libguile-2.0.so. You happened to run into this issue because of a change in how scm_{from,to}_locale_string determines what the locale encoding is. But you could have run into it in one of many other ways. In fact, for Guile 2.0, the interface changed to *match* the name. > I want byte array functions where Guile isn't thinking that it knows > better than I do what bit values ought to be in each byte. Use a bytevector, then. A string is logically an array of characters, not bytes. I realize that this is irritating to you, but it is the right thing, improves the situation for loads of users, and is largely compatible. But if what you really want is to continue using strings as bytevectors, you will have to make a small #define, and then be on your way. Regards, Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bug-autogen] test failure with guile-2.0.2 2011-07-13 13:41 ` Andy Wingo @ 2011-07-13 15:01 ` Thien-Thi Nguyen 2011-07-15 16:25 ` Thien-Thi Nguyen 2011-07-13 15:11 ` Bruce Korb [not found] ` <4E1DB59B.9070700@gnu.org> 2 siblings, 1 reply; 13+ messages in thread From: Thien-Thi Nguyen @ 2011-07-13 15:01 UTC (permalink / raw) To: Andy Wingo; +Cc: bug-guile, bug-autogen, Bruce Korb, Elias Pipping () Andy Wingo <wingo@pobox.com> () Wed, 13 Jul 2011 15:41:42 +0200 I realize that this is irritating to you, but it is the right thing, improves the situation for loads of users, and is largely compatible. I think when you say "it is the right thing", you are missing the point. Try to jump up and see things from a perspective that includes the motion of the people who follow the leader. That will make you a better leader. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Bug-autogen] test failure with guile-2.0.2 2011-07-13 15:01 ` Thien-Thi Nguyen @ 2011-07-15 16:25 ` Thien-Thi Nguyen 2011-07-18 9:30 ` Andy Wingo 0 siblings, 1 reply; 13+ messages in thread From: Thien-Thi Nguyen @ 2011-07-15 16:25 UTC (permalink / raw) To: bug-guile () Thien-Thi Nguyen <ttn@gnuvola.org> () Wed, 13 Jul 2011 17:01:17 +0200 Try to jump up and see things from a perspective that includes the motion of the people who follow the leader. That will make you a better leader. A little bird had two things to say to me: - that was a pretty wise-ass way to say things (w/ emphasis on ASS); - what makes you an expert at being a leader? To the first, i guess everyone is entitled to their preferred analogy. Perhaps a better blend might have been to take the recent post by Andy wrt lambda tribalism and draw the analogy between the benefits of functional style programming and functional style interface design, or rather the latent woe associated with their non-functional practice. In this case, "semantics changed while name unchanged" is basically a big fat ‘set!’ to the libguile API. Reasoning and optimizations are out the window because the trust is broken. We revert to coping behaviors and ugly gnashing (e.g., Guile-BAUX). But i didn't have the energy to say that then, and this brings me to the second answer: i am no expert at being a leader, but i know regret when i feel it. I feel regret now for not finding the energy then, but i also felt regret then, as a feeling i would not wish upon Andy in the future, with the ‘set!’ fanout growing ever larger, with long-time Guile users dying a little inside at every thought of balancing new-and-shiny cool w/ new-and-shiny pain, with that soreness manifesting in mostly-offtopic threads, etc, etc. That way lies dissipation. It doesn't need to be that way if with functional style interface design, but of course, w/ more bindings, the burden of unfettered generation can indeed weigh heavy. 
Hopefully, this serves to push back onto the designer the need to really think things through, to account for continuity and compatibility in the quest for new functionality. I suppose the trick is to use the regret you know you will feel to fuel the intensity of the design process, kind of like consciously structuring code to avoid ‘set!’.

To get back on-topic, since the change was relatively recent, one way forward would be to revert it and redesign. We could simply say "ok, that was a mistake, please avoid Guile 2.0.[01] (or whatever), sorry we will be more careful in the future". The other way is to do as Bruce suggests, add another interface element. To me, the former takes some chutzpah but is preferable anyway -- it shows a balance of strength and gentleness, not to mention regret.
* Re: [Bug-autogen] test failure with guile-2.0.2 2011-07-15 16:25 ` Thien-Thi Nguyen @ 2011-07-18 9:30 ` Andy Wingo 0 siblings, 0 replies; 13+ messages in thread From: Andy Wingo @ 2011-07-18 9:30 UTC (permalink / raw) To: Thien-Thi Nguyen; +Cc: bug-guile Hi Thien-Thi, On Fri 15 Jul 2011 18:25, Thien-Thi Nguyen <ttn@gnuvola.org> writes: > "semantics changed while name unchanged" is basically a big fat ‘set!’ > to the libguile API. Very true. It seems that maintaining a library is an exercise in managing mutation -- mutation of your software from state A to B, and the corresponding mutation of your library's users. There is an analog to the functional / imperative thing here regarding our community: we can add names until the cows come home, but that increases our heap size---both real, on machines, and virtual in terms of size of API and mental load---until GC runs, in which case users of old definitions have to migrate anyway. But we don't always have to avoid mutation; we can use it where it is sensible. For example, strictly following "semantics changes -> name changes" means introducing new names whenever you fix a bug, because hey, someone might have been relying on it. This obviously increases GC costs. I think we probably agree that bugs can be fixed without names changing, so there is some middle point that is a good compromise between heap size and GC rate. Heap compaction is possible at major GC's, when new major versions are released. One cost of name allocation is that it turns part of our community into garbage, because they use the old names that will be collected at some point. They will then have to change their code to point to live names. If they do it sooner, we have a more unified, dynamic community, ready to face changes. If they do it later, we have a more brittle, fragmented community. That's not to say that we should treat our users like garbage, of course! 
;-)

Anyway, I continue to disagree regarding this particular point, but I do appreciate the general concerns. We're building something here -- specifically, GNU -- and it's irritating to have to take off your roof because the walls aren't right. But sometimes it is necessary, so we need to manage this mutation, of ourselves & of our code.

Regards,

Andy
--
http://wingolog.org/
* Re: [Bug-autogen] test failure with guile-2.0.2 2011-07-13 13:41 ` Andy Wingo 2011-07-13 15:01 ` Thien-Thi Nguyen @ 2011-07-13 15:11 ` Bruce Korb [not found] ` <4E1DB59B.9070700@gnu.org> 2 siblings, 0 replies; 13+ messages in thread From: Bruce Korb @ 2011-07-13 15:11 UTC (permalink / raw) To: Andy Wingo; +Cc: bug-guile, bug-autogen, Elias Pipping On 07/13/11 06:41, Andy Wingo wrote: > No need to shout, thank you. I agree with you. However in this case we > are covered: the new interface name is libguile-2.0.so. Sorry. It's hard to get nuanced vocal inflections encoded into text. Too much emphasis, I guess. > You happened to run into this issue because of a change in how > scm_{from,to}_locale_string determines what the locale encoding is. But > you could have run into it in one of many other ways. > > In fact, for Guile 2.0, the interface changed to *match* the name. Well, yes, but for source distributions, it gets compiled and linked against whatever the current platform's default Guile happens to be. You are covered only for built binary executables, not stuff that gets recompiled expecting the old behavior. That is why the API name ought to change. I wrote code based on being told that the old char interface is now obsolete (the SCM_CHARS stuff). The new spelling is scm_*_locale_string(). I didn't particularly care for "locale" being in the name, but there was no alternative. And now these new functions have changed their semantics to match their names better. One can argue that it is better, but the truth is that if the semantics change, then it is a new interface and a changed interface ought to have a changed name, emphasizing that it is not the same function anymore. >> I want byte array functions where Guile isn't thinking that it knows >> better than I do what bit values ought to be in each byte. > > Use a bytevector, then. A string is logically an array of characters, > not bytes. 

>
> I realize that this is irritating to you, but it is the right thing,

I beg to differ. It is not the right thing. It is certainly more
consistent (behavior and name), but, repeating myself yet again,
different behavior is a different function and needs a different name.
Even though you really like the name. New name, please.

> improves the situation for loads of users, and is largely compatible.

"almost" and "largely" and "nearly" are some of my favorite adjectives
to use about software development.

> But if what you really want is to continue using strings as bytevectors,
> you will have to make a small #define, and then be on your way.

This is shorter now that I am not fretting over Guile 1.4.x:

#if GUILE_VERSION < 107000
# define AG_SCM_BOOL_P(_b)        SCM_BOOLP(_b)
# define AG_SCM_CHAR(_c)          gh_scm2char(_c)
# define AG_SCM_CHARS(_s)         SCM_CHARS(_s)
# define AG_SCM_FALSEP(_r)        SCM_FALSEP(_r)
# define AG_SCM_FROM_LONG(_l)     gh_long2scm(_l)
# define AG_SCM_INT2SCM(_i)       gh_int2scm(_i)
# define AG_SCM_IS_PROC(_p)       SCM_NFALSEP( scm_procedure_p(_p))
# define AG_SCM_LIST_P(_l)        SCM_NFALSEP( scm_list_p(_l))
# define AG_SCM_LISTOFNULL()      scm_listofnull
# define AG_SCM_LONG2SCM(_i)      gh_long2scm(_i)
# define AG_SCM_NFALSEP(_r)       SCM_NFALSEP(_r)
# define AG_SCM_NULLP(_m)         SCM_NULLP(_m)
# define AG_SCM_NUM_P(_n)         SCM_NUMBERP(_n)
# define AG_SCM_PAIR_P(_p)        SCM_NFALSEP( scm_pair_p(_p))
# define AG_SCM_STR02SCM(_s)      scm_makfrom0str(_s)
# define AG_SCM_STR2SCM(_st,_sz)  scm_mem2string(_st,_sz)
# define AG_SCM_STRING_P(_s)      SCM_STRINGP(_s)
# define AG_SCM_STRLEN(_s)        SCM_STRING_LENGTH(_s)
# define AG_SCM_SYM_P(_s)         SCM_SYMBOLP(_s)
# define AG_SCM_TO_INT(_i)        gh_scm2int(_i)
# define AG_SCM_TO_LONG(_v)       gh_scm2long(_v)
# define AG_SCM_TO_NEWSTR(_s)     gh_scm2newstr(_s, NULL)
# define AG_SCM_TO_ULONG(_v)      gh_scm2ulong(_v)
# define AG_SCM_VEC_P(_v)         SCM_VECTORP(_v)

#elif GUILE_VERSION < 201000
# define AG_SCM_BOOL_P(_b)        scm_is_bool(_b)
# define AG_SCM_CHAR(_c)          SCM_CHAR(_c)
# define AG_SCM_CHARS(_s)         scm_i_string_chars(_s)
# define AG_SCM_FALSEP(_r)        scm_is_false(_r)
# define AG_SCM_FROM_LONG(_l)     scm_from_long(_l)
# define AG_SCM_INT2SCM(_i)       scm_from_int(_i)
# define AG_SCM_IS_PROC(_p)       scm_is_true( scm_procedure_p(_p))
# define AG_SCM_LIST_P(_l)        scm_is_true( scm_list_p(_l))
# define AG_SCM_LISTOFNULL()      scm_list_1(SCM_EOL)
# define AG_SCM_LONG2SCM(_i)      scm_from_long(_i)
# define AG_SCM_NFALSEP(_r)       scm_is_true(_r)
# define AG_SCM_NULLP(_m)         scm_is_null(_m)
# define AG_SCM_NUM_P(_n)         scm_is_number(_n)
# define AG_SCM_PAIR_P(_p)        scm_is_true( scm_pair_p(_p))
# define AG_SCM_STR02SCM(_s)      scm_from_locale_string(_s)
# define AG_SCM_STR2SCM(_st,_sz)  scm_from_locale_stringn(_st,_sz)
# define AG_SCM_STRING_P(_s)      scm_is_string(_s)
# define AG_SCM_STRLEN(_s)        scm_c_string_length(_s)
# define AG_SCM_SYM_P(_s)         scm_is_symbol(_s)
# define AG_SCM_TO_INT(_i)        scm_to_int(_i)
# define AG_SCM_TO_LONG(_v)       scm_to_long(_v)
# define AG_SCM_TO_NEWSTR(_s)     scm_to_locale_string(_s)
# define AG_SCM_TO_ULONG(_v)      scm_to_ulong(_v)
# define AG_SCM_VEC_P(_v)         scm_is_vector(_v)

#else
#error unknown GUILE_VERSION
#endif

Regards, Bruce
* Re: [Bug-autogen] test failure with guile-2.0.2 [not found] ` <4E1DB59B.9070700@gnu.org> @ 2011-07-14 9:01 ` Andy Wingo 2011-07-17 20:47 ` ‘scm_from_locale_string’ and locale character encoding Ludovic Courtès 0 siblings, 1 reply; 13+ messages in thread From: Andy Wingo @ 2011-07-14 9:01 UTC (permalink / raw) To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping Hi Bruce, On Wed 13 Jul 2011 17:11, Bruce Korb <bkorb@gnu.org> writes: > On 07/13/11 06:41, Andy Wingo wrote: >> the new interface name is libguile-2.0.so. > > Well, yes, but for source distributions, it gets compiled and linked > against whatever the current platform's default Guile happens to be. I would like to avoid this circumstance in the future, while still preserving Guile's ability to change. Hopefully whenever you decide that you can stop supporting Guile 1.6, as we have, then you can switch to use pkg-config. This will allow you to specify the versions of Guile that you support, at build-time, *and choose them* from among a number of installed Guile versions. This way, you only deal with changes in Guile 2.2 *when you choose* to upgrade to Guile 2.2. Presumably at that point you read the NEWS as well :-) But, with the current guile-config situation, that's not the case. You end up dealing with changes in Guile when you're trying to do something else, as now. But it's bugs versus bugs, right? What did you expect us to do, deprecate scm_from_locale_string because in 1.8 it could be treated as a byte array, after introducing it in 1.8? > New name, please. We'll try harder in the future. But we cannot change the fact that scm_from_locale_string does decode its argument. One thing we might be able to do is Mark Weaver's "permissive" string trick using the reserved unicode codepoints. That code doesn't exist yet though. > #if GUILE_VERSION < 107000 > # define AG_SCM_BOOL_P(_b) SCM_BOOLP(_b) > # define AG_SCM_CHAR(_c) gh_scm2char(_c) IMO, this is not the way to do deprecation. 
The way to go is to use the new names, and #define implementations for older Guile. That way your source is cleaner, and you get deprecation messages when current functions are deprecated. So here you should use scm_is_bool and SCM_CHAR. > # define AG_SCM_CHARS(_s) SCM_CHARS(_s) [...] > #elif GUILE_VERSION < 201000 > # define AG_SCM_CHARS(_s) scm_i_string_chars(_s) This is totally incorrect, and a bit dangerous. It won't work if you have a wide string. The "_i_" in the name is for "internal". From the NEWS from 1.8.0, from February 2006, notes: ** The macros SCM_STRINGP, SCM_STRING_CHARS, SCM_STRING_LENGTH, SCM_SYMBOL_CHARS, and SCM_SYMBOL_LENGTH have been deprecated. They export too many assumptions about the implementation of strings and symbols that are no longer true in the presence of mutation-sharing substrings and when Guile switches to some form of Unicode. When working with strings, it is often best to use the normal string functions provided by Guile, such as scm_c_string_ref, scm_c_string_set_x, scm_string_append, etc. Be sure to look in the manual since many more such functions are now provided than previously. When you want to convert a SCM string to a C string, use the scm_to_locale_string function or similar instead. For symbols, use scm_symbol_to_string and then work with that string. Because of the new string representation, scm_symbol_to_string does not need to copy and is thus quite efficient. You'll be much less surprised at things if you read the NEWS when new major versions are released :-) Finally, as I think we have discussed already all of the relevant aspects of this situation, we need to move on. The easiest thing to do is for you to put in a couple of #defines for scm_{from,to}_latin1_string. Then we can go back to building GNU! Regards, Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 13+ messages in thread
* ‘scm_from_locale_string’ and locale character encoding 2011-07-14 9:01 ` Andy Wingo @ 2011-07-17 20:47 ` Ludovic Courtès 2011-07-18 16:18 ` Bruce Korb [not found] ` <4E245CC5.4040505@gnu.org> 0 siblings, 2 replies; 13+ messages in thread From: Ludovic Courtès @ 2011-07-17 20:47 UTC (permalink / raw) To: Andy Wingo; +Cc: bug-guile, bug-autogen, Bruce Korb, Elias Pipping Hello! Andy Wingo <wingo@pobox.com> skribis: > On Wed 13 Jul 2011 17:11, Bruce Korb <bkorb@gnu.org> writes: [...] >> New name, please. > > We'll try harder in the future. But we cannot change the fact that > scm_from_locale_string does decode its argument. FWIW, I do understand the inconvenience and frustration reported here. But as Andy suggests, I think it should come as no surprise that ‘scm_from_locale_string’ returns a string from a locale-encoded one. Guile 1.8 already documented things this way [0]. As for (ab)using strings as byte sequences, it was bound to break with the introduction of Unicode support. To transition away from that, 1.8 had SRFI-4 and related I/O support (‘uniform-vector-read!’, etc.), though it wasn’t as convenient as R6RS bytevectors. Thanks, Ludo’. [0] http://www.gnu.org/software/guile/docs/docs-1.8/guile-ref/Conversion-to_002ffrom-C.html#index-scm_005ffrom_005flocale_005fstring-846 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ‘scm_from_locale_string’ and locale character encoding 2011-07-17 20:47 ` ‘scm_from_locale_string’ and locale character encoding Ludovic Courtès @ 2011-07-18 16:18 ` Bruce Korb [not found] ` <4E245CC5.4040505@gnu.org> 1 sibling, 0 replies; 13+ messages in thread From: Bruce Korb @ 2011-07-18 16:18 UTC (permalink / raw) To: Ludovic Courtès; +Cc: bug-guile, bug-autogen, Elias Pipping On 07/17/11 13:47, Ludovic Courtès wrote: > FWIW, I do understand the inconvenience and frustration reported here. > > But as Andy suggests, I think it should come as no surprise that > ‘scm_from_locale_string’ returns a string from a locale-encoded one. > Guile 1.8 already documented things this way [0]. The issue is less about scm_from_locale_string than: * This appears to me to be habitual. If it were not, my glue layer would not be so large. * This is a behavioral change in a bug fix release. The kind of change one might more readily anticipate in a 2.0.x to 2.1 transition. * There was no warning mechanism put in place. Release docs don't count because code I ship now will be built against Guile releases that have not been released yet. I've not seen 2.0.2 even yet. WRT the misuse of this particular function, what happened from my perspective is that I was happily using the SCM_CHARS stuff with: export GUILE_WARN_DEPRECATED=detailed in my testing environment and not getting any testing errors. Then my clients report that they cannot build it any more. (Why didn't the deprecation warning fire? How about adding GUILE_ERROR_DEPRECATED?? Then my testing should fail immediately! In any event, I scan the test logs for "warn" and saw nothing.) Somebody somewhere said, "use scm_from_locale_string". I looked it up and saw no better alternative, used it and it worked as I needed it to. Now it doesn't. My biggest desire is to not have to read release documents and figure out if somewhere in there is something that affects my usage of the guile library interface. 
I want to be thumped with GUILE_WARN_DEPRECATED and have an obvious replacement. Thank you.

/**
 * Get the NUL terminated string from an SCM.
 * As of Guile 1.7.x, access to the NUL terminated string referenced by
 * an SCM is no longer guaranteed. Therefore, we must extract the string
 * into one of our "scribble" buffers.
 *
 * @param s     the string to convert
 * @param type  a string describing the string
 * @return a NUL terminated string, or it aborts.
 */
LOCAL char *
ag_scm2zchars(SCM s, const char * type)
{
#if GUILE_VERSION < 107000  /* pre-Guile 1.7.x */

    if (! AG_SCM_STRING_P(s))
        AG_ABEND(aprf(zNotStr, type));

    if (SCM_SUBSTRP(s))
        s = scm_makfromstr(SCM_CHARS(s), SCM_LENGTH(s), 0);
    return SCM_CHARS(s);

#else
    static char const bad_val[] =
        "scm_string_length returned wrong value: %d != %d\n";
    size_t len;
    char * buf;

    if (! AG_SCM_STRING_P(s))
        AG_ABEND(aprf("%s is not a string", type));

    len = scm_c_string_length(s);
    if (len == 0) {
        static char z = NUL;
        return &z;
    }

    buf = ag_scribble(len+1);

    {
        size_t buflen = scm_to_locale_stringbuf(s, buf, len);
        if (buflen != len)
            AG_ABEND(aprf(bad_val, buflen, len));
    }

    buf[len] = NUL;
    return buf;
#endif
}
* Re: ‘scm_from_locale_string’ and locale character encoding [not found] ` <4E245CC5.4040505@gnu.org> @ 2011-07-19 9:35 ` Ludovic Courtès 0 siblings, 0 replies; 13+ messages in thread From: Ludovic Courtès @ 2011-07-19 9:35 UTC (permalink / raw) To: Bruce Korb; +Cc: bug-guile, bug-autogen, Elias Pipping Hi Bruce, Bruce Korb <bkorb@gnu.org> skribis: > On 07/17/11 13:47, Ludovic Courtès wrote: >> FWIW, I do understand the inconvenience and frustration reported here. >> >> But as Andy suggests, I think it should come as no surprise that >> ‘scm_from_locale_string’ returns a string from a locale-encoded one. >> Guile 1.8 already documented things this way [0]. > > The issue is less about scm_from_locale_string than: [...] > * This is a behavioral change in a bug fix release. The kind of > change one might more readily anticipate in a 2.0.x to 2.1 transition. Wait, I was referring to the fact that scm_from_locale_string actually honors the current locale, which was introduced in the 1.8 → 2.0 transition along with Unicode support. Did I misunderstand the change you were referring to? Thanks, Ludo’. ^ permalink raw reply [flat|nested] 13+ messages in thread