* What's wrong with this?
@ 2012-01-03  4:08 Bruce Korb
  2012-01-03 15:03 ` Mike Gran
  0 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-03  4:08 UTC (permalink / raw)
  To: guile-devel Development

My "(get ...)" function always returns a string.
This result was assigned to "tmp-text" and the
"(string-upcase ...)" is complaining that the input is
read only.  Well, it isn't, so the real complaint
is being hidden by the "string is read-only" message.

It worked until I "upgraded" to openSuSE 12.1.

> $ guile --version
> guile (GNU Guile) 2.0.2
> .....

What is really wrong, please?

> ERROR: In procedure string-upcase:
> ERROR: string is read-only: ""
> Scheme evaluation error.  AutoGen ABEND-ing in template
>       confmacs.tlib on line 209
> Failing Guile command:  = = = = =
>
> (set! tmp-text (get "act-text"))
> (set! TMP-text (string-upcase tmp-text))
> (string-append
>   (if (exist? "no") "no-" "yes-")
>   (get "act-type"))

^ permalink raw reply	[flat|nested] 117+ messages in thread
* Re: What's wrong with this?
From: Mike Gran @ 2012-01-03 15:03 UTC
  To: Bruce Korb, guile-devel Development

> From: Bruce Korb <bruce.korb@gmail.com>
> My "(get ...)" function always returns a string.
> This result was assigned to "tmp-text" and the
> "(string-upcase ...)" is complaining that the input is
> read only.  Well, it isn't, so the real complaint
> is being hidden by the "string is read-only" message.
>
> It worked until I "upgraded" to openSuSE 12.1.
>
>> $ guile --version
>> guile (GNU Guile) 2.0.2
>> .....
>
> What is really wrong, please?
>
>> (set! tmp-text (get "act-text"))
>> (set! TMP-text (string-upcase tmp-text))
>> (string-append
>>   (if (exist? "no") "no-" "yes-")
>>   (get "act-type"))

There does seem to be some strangeness w.r.t. read-only
strings going on.

On Guile 1.8.8 if you create a string this way, it is
not read-only.

guile> (define y "hello")
guile> (string-set! y 0 #\x)
guile> y
"xello"

On Guile 2.0.3, if you create a string the same way, it
is read-only for some reason.

scheme@(guile-user)> (define y "hello")
scheme@(guile-user)> (string-set! y 0 #\x)
ERROR: In procedure string-set!:
ERROR: string is read-only: "hello"

%string-dump can be used to confirm this

scheme@(guile-user)> (%string-dump y)
$4 = ((string . "hello") (start . 0) (length . 5) (shared . #f)
(read-only . #t) (stringbuf-chars . "hello") (stringbuf-length . 5)
(stringbuf-shared . #f) (stringbuf-wide . #f))

But if you create a string with 'string' it isn't read only

scheme@(guile-user)> (define y (string #\h #\e #\l #\l #\o))
scheme@(guile-user)> (string-set! y 0 #\x)
scheme@(guile-user)> y
$7 = "xello"

-Mike
* Re: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-03 16:26 UTC
  To: Mike Gran; +Cc: gnu-prog-discuss, guile-devel Development

Hi Mike,

Thank you for the explanation.  However:

On 01/03/12 07:03, Mike Gran wrote:
>> It worked until I "upgraded" to openSuSE 12.1.
>>
>>> $ guile --version
>>> guile (GNU Guile) 2.0.2
>>> .....
>>> (set! tmp-text (get "act-text"))
>>> (set! TMP-text (string-upcase tmp-text))
>>> ERROR: In procedure string-upcase:
>>> ERROR: string is read-only: ""
>
> There does seem to be some strangeness w.r.t. read-only
> strings going on.
>
> On Guile 1.8.8 if you create a string this way, it is
> not read-only.
>
> guile> (define y "hello")
> guile> (string-set! y 0 #\x)
> guile> y
> "xello"
>
> On Guile 2.0.3, if you create a string the same way, it
> is read-only for some reason.
>
> scheme@(guile-user)> (define y "hello")
> scheme@(guile-user)> (string-set! y 0 #\x)
> ERROR: In procedure string-set!:
> ERROR: string is read-only: "hello"
>
> %string-dump can be used to confirm this

There are a couple of issues:

1.  "string-upcase" should only read the string
    (as opposed to "string-upcase!", which rewrites it).

2.  it is completely, utterly wrong to mutilate the
    Guile library into such a contortion that it
    interprets this:
        (define y "hello")
    to be a request to create an immutable string anyway.
    It very, very plainly says, "make 'y' and fill it with
    the string "hello".  Making it read only is crazy.

Furthermore, I do not even have an obvious way to deal
with the problem, short of a massive rewrite.
I define variables this way all over the place.
rewriting the code to
    (define y (string-append "hell" "o"))
everywhere is stupid, laborious, time consuming for me,
and time consuming at execution time.

Guile 2.0.1, 2.0.2 and 2.0.3 need some rethinking.  Dang!!!!!
* Re: Guile: What's wrong with this?
From: Mike Gran @ 2012-01-03 16:30 UTC
  To: Bruce Korb; +Cc: gnu-prog-discuss@gnu.org, guile-devel Development

> From: Bruce Korb <bruce.korb@gmail.com>
> 2.  it is completely, utterly wrong to mutilate the
>     Guile library into such a contortion that it
>     interprets this:
>         (define y "hello")
>     to be a request to create an immutable string anyway.
>     It very, very plainly says, "make 'y' and fill it with
>     the string "hello".  Making it read only is crazy.

Agreed.

-Mike
* Re: Guile: What's wrong with this?
From: Ludovic Courtès @ 2012-01-03 22:24 UTC
  To: guile-devel

Hi Bruce,

And happy new year!

Bruce Korb <bruce.korb@gmail.com> skribis:

> Thank you for the explanation.  However:
>
> On 01/03/12 07:03, Mike Gran wrote:
>>> It worked until I "upgraded" to openSuSE 12.1.
>>>
>>>> $ guile --version
>>>> guile (GNU Guile) 2.0.2
>>>> .....
>
>>>> (set! tmp-text (get "act-text"))
>>>> (set! TMP-text (string-upcase tmp-text))
>
>>>> ERROR: In procedure string-upcase:
>>>> ERROR: string is read-only: ""

[...]

>> On Guile 2.0.3, if you create a string the same way, it
>> is read-only for some reason.
>>
>> scheme@(guile-user)> (define y "hello")
>> scheme@(guile-user)> (string-set! y 0 #\x)
>> ERROR: In procedure string-set!:
>> ERROR: string is read-only: "hello"
>>
>> %string-dump can be used to confirm this
>
> There are a couple of issues:
>
> 1.  "string-upcase" should only read the string
>     (as opposed to "string-upcase!", which rewrites it).

Yes, that’s weird.  I can’t get string-upcase to raise a read-only
exception with 2.0.3, though.

Could you try with 2.0.3, or come up with a reduced case?

> 2.  it is completely, utterly wrong to mutilate the
>     Guile library into such a contortion that it
>     interprets this:
>         (define y "hello")
>     to be a request to create an immutable string anyway.
>     It very, very plainly says, "make 'y' and fill it with
>     the string "hello".  Making it read only is crazy.

It stems from the fact that string literals are read-only, per R5RS
(info "(r5rs) Storage model"):

     In many systems it is desirable for constants (i.e. the values of
  literal expressions) to reside in read-only-memory.  To express this,
  it is convenient to imagine that every object that denotes locations
  is associated with a flag telling whether that object is mutable or
  immutable.  In such systems literal constants and the strings returned
  by `symbol->string' are immutable objects, while all objects created
  by the other procedures listed in this report are mutable.  It is an
  error to attempt to store a new value into a location that is denoted
  by an immutable object.

In Guile this has been the case since commit
190d4b0d93599e5b58e773dc6375054c3a6e3dbf.

The reason for this is that Guile’s compiler tries hard to avoid
duplicating constants in the output bytecode.  Thus, modifying a
constant would actually change all other occurrences of that constant in
the code, making it a non-constant.  ;-)

> Furthermore, I do not even have an obvious way to deal
> with the problem,

You can use:

  (define y (string-copy "hello"))

> short of a massive rewrite.
> I define variables this way all over the place.
> rewriting the code to
>     (define y (string-append "hell" "o"))
> everywhere is stupid, laborious, time consuming for me,
> and time consuming at execution time.

I agree that this is laborious, and I’m sorry about that.

I can only say that Guile < 2.0 being more permissive than the standard
turns out to be a mistake, in hindsight.

Thanks,
Ludo’.
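
A minimal sketch of the string-copy workaround suggested in the message above, assuming it is run as a Guile script or typed at the REPL. The variable name greeting is made up for illustration and does not come from the thread. The literal itself is never modified, only the fresh copy:

```scheme
;; string-copy returns a newly allocated string.  Per the R5RS storage
;; model quoted above, such a string is mutable, unlike a literal.
(define greeting (string-copy "hello"))

;; Mutating the copy does not touch the "hello" literal.
(string-set! greeting 0 #\x)

(display greeting)   ; prints xello
(newline)
```

Because string-copy is plain R5RS, the same lines should behave identically on Guile 1.8 and 2.0.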
* Re: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-03 23:15 UTC
  To: Ludovic Courtès; +Cc: guile-devel

On 01/03/12 14:24, Ludovic Courtès wrote:
>> 2.  it is completely, utterly wrong to mutilate the
>>     Guile library into such a contortion that it
>>     interprets this:
>>         (define y "hello")
>>     to be a request to create an immutable string anyway.
>>     It very, very plainly says, "make 'y' and fill it with
>>     the string "hello".  Making it read only is crazy.
>
> It stems from the fact that string literals are read-only, per R5RS
> (info "(r5rs) Storage model"):
>
>    [[blah, blah, blah]]
>
> In Guile this has been the case since commit
> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>
> The reason for this is that Guile’s compiler tries hard to avoid
> duplicating constants in the output bytecode.  Thus, modifying a

You have changed the interface without deprecation or any other
multi-year process.  Please change it back.  Please fix the problem by
adding (define-strict y "hello") to have this new semantic.  Thank you.
* Re: Guile: What's wrong with this?
From: Ludovic Courtès @ 2012-01-03 23:33 UTC
  To: Bruce Korb; +Cc: guile-devel

Bruce,

Bruce Korb <bkorb@gnu.org> skribis:

> On 01/03/12 14:24, Ludovic Courtès wrote:
>>> 2.  it is completely, utterly wrong to mutilate the
>>>     Guile library into such a contortion that it
>>>     interprets this:
>>>         (define y "hello")
>>>     to be a request to create an immutable string anyway.
>>>     It very, very plainly says, "make 'y' and fill it with
>>>     the string "hello".  Making it read only is crazy.
>>
>> It stems from the fact that string literals are read-only, per R5RS
>> (info "(r5rs) Storage model"):
>>
>>    [[blah, blah, blah]]
>>
>> In Guile this has been the case since commit
>> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>>
>> The reason for this is that Guile’s compiler tries hard to avoid
>> duplicating constants in the output bytecode.  Thus, modifying a
>
> You have changed the interface without deprecation or any other
> multi-year process.

I could be just as offensive by suggesting that R5RS is 14 years old,
etc., but I’d rather work towards an acceptable solution with you.

Could you point me to the affected code?  What would you think of using
string-copy as I suggested?  The disadvantage is that you need to modify
your code, but hopefully that can be automated with a sed script or so;
the advantage is that it would work with all versions of Guile.

Thanks,
Ludo’.
* Re: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-04  0:55 UTC
  To: Ludovic Courtès; +Cc: guile-devel

On 01/03/12 15:33, Ludovic Courtès wrote:
> Could you point me to the affected code?  What would you think of using
> string-copy as I suggested?  The disadvantage is that you need to modify
> your code, but hopefully that can be automated with a sed script or so;
> the advantage is that it would work with all versions of Guile.

The disadvantage is that I know I have "clients" that have rolled their
own templates, presumably by copy-and-edit processes that will invariably
include (define var "string") syntax.  Likely a better approach is to
re-define the "define" function to my own C code and call the proper
scm_whathaveyou functions under the covers.

I'm sorry about being irritable.  This is the third problem with 2.x.
First a pre-defined value disappeared.  A very minor nuisance.
Then it turned out that the string functions would now clear the
high order bit on strings, so they are no longer byte arrays and
there is no replacement but to roll my own.  I stopped supporting
byte arrays.  A noticeable nuisance.  Now it turns out that the
conventional, ordinary way of creating a string variable yields
a read-only string.  Ouch.  So I am cranky and sorry about being so.

So I guess that's my fix.  Write another function dependent upon
Guile internals, much like scm_c_eval_string_from_file_line(),
by copying scm_define() code, checking for a string value and copying
that string -- if it is read-only?  Should I check for that?
What about "set!"?  Should I check for a read-only value there, too?
I do confess it feels a little bit like unraveling something.....It is scary.
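
One way to picture the "copy string values at definition time" idea from the message above, sketched in Scheme rather than C. This is only an illustration under stated assumptions: define-mutable is a hypothetical name that is not part of Guile or AutoGen, it covers only the simple (define name value) form, and it says nothing about how "set!" should behave.

```scheme
;; Hypothetical helper: like (define name value), except that a string
;; value is replaced by a fresh, mutable copy of itself.
(define-syntax define-mutable
  (syntax-rules ()
    ((_ name value)
     (define name
       (let ((v value))               ; evaluate the value exactly once
         (if (string? v)
             (string-copy v)          ; fresh copy, safe to mutate
             v))))))                  ; non-strings pass through untouched

(define-mutable tmp "hello")
(string-set! tmp 0 #\x)
(display tmp)                         ; prints xello
(newline)
```

Templates would still have to call the new form explicitly, so this does not help with existing (define var "string") code written by others.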
* Re: Guile: What's wrong with this?
From: Noah Lavine @ 2012-01-04  3:12 UTC
  To: Bruce Korb; +Cc: Ludovic Courtès, guile-devel

Hello,

> Then it turned out that the string functions would now clear the
> high order bit on strings, so they are no longer byte arrays and
> there is no replacement but to roll my own.  I stopped supporting
> byte arrays.  A noticeable nuisance.

This is just a side note to the main discussion, but there is now a
'bytevector' datatype you can use.  Does that work for you?  If not,
what functionality is missing?

Thanks,
Noah
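
A short sketch of the bytevector datatype mentioned above, assuming Guile 2.0 and its (rnrs bytevectors) module. The variable name raw and the byte values are invented for illustration:

```scheme
(use-modules (rnrs bytevectors))

;; Four raw octets, including values above 127 (high-order bit set).
(define raw (u8-list->bytevector '(72 233 255 0)))

;; Bytevectors are mutable and store each octet unchanged.
(bytevector-u8-set! raw 3 128)

(display (bytevector->u8-list raw))   ; prints (72 233 255 128)
(newline)
```

Unlike strings in Guile 2.0, which hold characters, a bytevector holds plain octets, so no character-encoding step is involved.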
* Re: bytevector -- was: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-04 17:37 UTC
  To: Noah Lavine; +Cc: Ludovic Courtès, guile-devel

Hi,

On 01/03/12 19:12, Noah Lavine wrote:
>> Then it turned out that the string functions would now clear the
>> high order bit on strings, so they are no longer byte arrays and
>> there is no replacement but to roll my own.  I stopped supporting
>> byte arrays.  A noticeable nuisance.
>
> This is just a side note to the main discussion, but there is now a
> 'bytevector' datatype you can use.  Does that work for you?  If not,
> what functionality is missing?
>
> Thanks,

Oh, no, thank _you_!  That is likely what I need.
I don't track Guile development closely.  I have GUILE_WARN_DEPRECATED
set to "detailed" and expect that to warn me when issues are coming up.
It has actually yet to do so, however.  Imagine my surprise.

Cheers - Bruce
* Re: Guile: What's wrong with this?
From: Ludovic Courtès @ 2012-01-04 21:17 UTC
  To: Bruce Korb; +Cc: guile-devel

Hi Bruce,

Bruce Korb <bkorb@gnu.org> skribis:

> On 01/03/12 15:33, Ludovic Courtès wrote:
>> Could you point me to the affected code?  What would you think of using
>> string-copy as I suggested?  The disadvantage is that you need to modify
>> your code, but hopefully that can be automated with a sed script or so;
>> the advantage is that it would work with all versions of Guile.
>
> The disadvantage is that I know I have "clients" that have rolled their
> own templates, presumably by copy-and-edit processes that will invariably
> include (define var "string") syntax.

If the users' files are evaluated rather than compiled/loaded, this is
not a problem:

scheme@(guile-user)> (eval (call-with-input-string "(define foo \"sdf\")" read) (interaction-environment))
$9 = #<variable 32a8580 value: "sdf">
scheme@(guile-user)> (string-set! (variable-ref $9) 1 #\x)
scheme@(guile-user)> (variable-ref $9)
$10 = "sxf"

Could you check whether this is the case?

In case it’s not, I have another possible solution in mind.  ;-)

> I'm sorry about being irritable.  This is the third problem with 2.x.

Yeah, I understand it can be really annoying and frustrating.

Believe me, despite the breadth and depth of changes between 1.8 and
2.0, we did our best to avoid such nuisances.  Hopefully we can help
solve them with you, so you can really benefit from 2.0 (it’s a
significantly nicer piece of software!)

Thanks,
Ludo’.
* Re: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-04 22:36 UTC
  To: Ludovic Courtès; +Cc: guile-devel

Hi Ludo,

On 01/04/12 13:17, Ludovic Courtès wrote:
> If the users' files are evaluated rather than compiled/loaded, this is
> not a problem:

I do *all* guile processing via the ag_scm_c_eval_string_from_file_line
function.  I suck up a string from my input file, determine that it
needs guile processing and invoke that function.  It has this profile:

> SCM
> ag_scm_c_eval_string_from_file_line(
>     char const * pzExpr, char const * pzFile, int line);
>
> #define SCM_EVAL_CONST(_s)                                          \
>     do { static file_line_t const fl = { __LINE__ - 1, __FILE__, _s }; \
>         pzLastScheme = fl.text;                                     \
>         ag_scm_c_eval_string_from_file_line(fl.text, fl.file, fl.line); \
>     } while (0)

and I *can* redefine define because I start Guile with my own
initialization:

> #define SCHEME_INIT_FILE "directive.h"
> static const int  schemeLine = __LINE__+2;
> static char const zSchemeInit[3846] = // this is generated code...
> "(use-modules (ice-9 common-list))\n\
> ..................................";
>
>     pzLastScheme = zSchemeInit;
>     ag_scm_c_eval_string_from_file_line(
>         zSchemeInit, SCHEME_INIT_FILE, schemeLine);
>
>     SCM_EVAL_CONST("(add-hook! before-error-hook error-source-line)\n"
>                    "(use-modules (ice-9 stack-catch))");

> Could you check whether this is the case?

So it is the case.  My processing consists of slicing up the input
into a bunch of slivers based on markers.  I look at each sliver to see
how to process it.  Some are emitted directly, others trigger internal
mechanisms, a few are handed off to a separate server shell process and
finally, if the text starts with an open parenthesis or a semi-colon
(Guile comment marker), then Guile gets it via that call.

Thanks -Bruce
* Re: Guile: What's wrong with this?
From: Ludovic Courtès @ 2012-01-05  0:01 UTC
  To: Bruce Korb; +Cc: guile-devel

Hi,

Bruce Korb <bkorb@gnu.org> skribis:

> On 01/04/12 13:17, Ludovic Courtès wrote:
>> If the users' files are evaluated rather than compiled/loaded, this is
>> not a problem:
>
> I do *all* guile processing via the ag_scm_c_eval_string_from_file_line
> function.

[...]

>> Could you check whether this is the case?
>
> So it is the case.

So this is good news: it means you only have to modify your own code
without worrying about your users’ code (modulo the fact that modifying
literals is still a bad idea, as others pointed out.)

BTW, were you able to find a stripped-down example that reproduces the
‘string-upcase’ problem?

Thanks,
Ludo’.
* Re: non-reproduction of initial issue -- was: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-05 18:36 UTC
  To: Ludovic Courtès; +Cc: guile-devel

On 01/04/12 16:01, Ludovic Courtès wrote:
> BTW, were you able to find a stripped-down example that reproduces the
> ‘string-upcase’ problem?

Here's the stripped down example, but it does not reproduce the problem. :(
I didn't copy into it my variation on scm_c_eval_string, but I hope
that isn't the issue.  Must be some subtle interaction somewhere....

#include <stdio.h>
#include <libguile.h>

static SCM
my_get(void)
{
    static char const zNil[] = "";
    SCM res = scm_from_locale_string(zNil);
    printf("zNil at %p yields SCM 0x%llX\n", zNil, (unsigned long long)res);
    return res;
}

static void
inner_main(void * closure, int argc, char ** argv)
{
    scm_c_define_gsubr("my-get", 0, 0, 0, (scm_t_subr)(void*)my_get);
    scm_c_eval_string("(define a (my-get))"
                      "(define b (string-upcase a))"
                      "(exit 0)");
}

int
main(int argc, char ** argv)
{
    scm_boot_guile(argc, argv, inner_main, 0);
    return 0; /* NOTREACHED */
}
* Re: non-reproduction of initial issue -- was: Guile: What's wrong with this?
From: Mark H Weaver @ 2012-01-05 18:50 UTC
  To: Bruce Korb; +Cc: Ludovic Courtès, guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 16:01, Ludovic Courtès wrote:
>> BTW, were you able to find a stripped-down example that reproduces the
>> ‘string-upcase’ problem?
>
> Here's the stripped down example, but it does not reproduce the problem. :(
> I didn't copy into it my variation on scm_c_eval_string, but I hope
> that isn't the issue.  Must be some subtle interaction somewhere....

The bugs I found could have corrupted the internal representations of
strings in such a way as to cause what you're seeing.  Please do try
that patch I sent.

    Mark
* Re: Guile: What's wrong with this?
From: Ian Price @ 2012-01-04 12:19 UTC
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> You have changed the interface without deprecation or any other
> multi-year process.  Please change it back.  Please fix the problem by
> adding (define-strict y "hello") to have this new semantic.  Thank you.

Fixing it with define-strict is ridiculous, as y is still mutable, it is
the string "hello" which is not.  As for mutable strings, I consider them
a mistake to begin with, but if people expect them to be mutable, and
historically they are mutable (in guile), it is a mistake to change this
without prior warning.

--
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"
* Re: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-04 17:16 UTC
  To: Ian Price, Andy Wingo; +Cc: guile-devel

On 01/04/12 04:19, Ian Price wrote:
> ... As for mutable strings, I consider them
> a mistake to begin with,...

Let's step back and consider the whole point of Guile in the first place.

My understanding is that one primary purpose is to be a facilitation
language so that application developers have less to worry about and
futz over.  An extension language, if you like that phrase.  As such,
it would seem to me that a primary design goal would be to make the
pathway as smooth as possible, rather than trying to emulate C and/or
official Scheme language specs as closely as possible.  To me, my primary
concern is doing my little thing with the least total hassle.  Having
to study up on and thoroughly understand the Scheme language seems
a lot harder than just using Perl (or what-have-you).  Most scripting
languages don't cut you off at the knees (change interfaces).

So my main question is:

    Which is the higher priority, language purity or ease of use?

The answer to that question answers several other things, like whether
or not strings should be "allowed" to have high order bits set (not be
pure ASCII) and whether or not to make read only strings be copy-on-write
vs. fault-on-write.

> We could add a compiler option to turn string literals into (string-copy
> FOO).  Perhaps that's the thing to do.

No, because your clients have no control over how Guile gets built.
We _do_ have control over startup code, however:

   (if (defined? 'set-copy-on-write-strings)
       (set-copy-on-write-strings #t))

Or, better, keep historical behavior and add:

   (if (defined? 'set-no-copy-on-write-strings)
       (set-no-copy-on-write-strings #t))

and fix the 1.9 bug (scribbling on shared strings) by making them
copy-on-write thingys.  Thank you.
* Re: Guile: What's wrong with this?
From: Andy Wingo @ 2012-01-04 17:21 UTC
  To: Bruce Korb; +Cc: Ian Price, guile-devel

On Wed 04 Jan 2012 12:16, Bruce Korb <bkorb@gnu.org> writes:

>> We could add a compiler option to turn string literals into (string-copy
>> FOO).  Perhaps that's the thing to do.
>
> No, because your clients have no control over how Guile gets built.
> We _do_ have control over startup code, however:

I meant the Scheme compiler, Bruce -- the one that is in Guile.  Not the
C compiler used to compile Guile.

Andy
--
http://wingolog.org/
* Re: Guile: What's wrong with this?
From: David Kastrup @ 2012-01-04 17:39 UTC
  To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 04:19, Ian Price wrote:
>> ... As for mutable strings, I consider them
>> a mistake to begin with,...
>
> Let's step back and consider the whole point of Guile in the first place.
>
> My understanding is that one primary purpose is to be a facilitation
> language so that application developers have less to worry about and
> futz over.  An extension language, if you like that phrase.  As such,
> it would seem to me that a primary design goal would be to make the
> pathway as smooth as possible, rather than trying to emulate C and/or
> official Scheme language specs as closely as possible.  To me, my primary
> concern is doing my little thing with the least total hassle.  Having
> to study up on and thoroughly understand the Scheme language seems
> a lot harder than just using Perl (or what-have-you).  Most scripting
> languages don't cut you off at the knees (change interfaces).
>
> So my main question is:
>
>     Which is the higher priority, language purity or ease of use?

Encouraging language abuse like making _literals_ not eq? to themselves
makes a language unpredictable.  That is not a road to ease of use.  It
is a dead end.

> and fix the 1.9 bug (scribbling on shared strings) by making them
> copy-on-write thingys.

So you want to give eq? unpredictable semantics as well.  What else has
made your black list of things to sacrifice in order to keep undefined
code working in a particular undefined way?

--
David Kastrup
* Re: Guile: What's wrong with this?
From: Ian Price @ 2012-01-04 21:52 UTC
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 04:19, Ian Price wrote:
>> ... As for mutable strings, I consider them
>> a mistake to begin with,...
>
> Let's step back and consider the whole point of Guile in the first place.

This was not intended as an answer to this question, nor to be
representative of the guile developers / users / what-have-you, but a
personal opinion.

> So my main question is:
>
>     Which is the higher priority, language purity or ease of use?

That is a loaded question, as it presupposes ease of use is always the
same thing as impurity e.g. A zipper is just as usable IMO as a gap
buffer, and doesn't require mutability.

My opinion of mutable strings is that they have little practical use to
me in my day to day programming, frankly I can count the number of times
I've done it in any high level language (so not C etc) over the past 4
or so years on one hand, and I consider most of those uses mistaken in
hindsight.  It isn't just functional programming types who care about
this, Python is a great example of a language which has not been
hindered by immutable strings.

The most common string operations in practice (for me) are
concatenation, substrings, comparison/searching, and iteration, and I
would think a better foundation for strings could be found by starting
there rather than with the premise that strings are basically a specific
type of vector.

And again, just to be clear, I'm not making a proposal, just stating an
opinion.

--
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"
* Re: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-04 22:18 UTC
  To: guile-devel

On 01/04/12 13:52, Ian Price wrote:
>> So my main question is:
>>
>>     Which is the higher priority, language purity or ease of use?
> That is a loaded question, as it presupposes ease of use is always the
> same thing as impurity e.g. ...

Absolutely not.  Making decisions is always about trade-offs,
otherwise it is not really a decision.  Should you give preference
to language aesthetics, or preference to ease of use *when*
there is a divergence?  More often than not, language purity
(consistency) *improves* ease of use.  Here we are looking at
something that does not appear to me to improve ease of use.
You have to go to some extra trouble to be certain that a string
value that you have assigned to an SCM is not read only.
That is not convenience.  If Guile were to implement copy on write,
then the user would not have to care whether a string were
shared read only or not.  It would be easier to use.  The only code
that would care at all would be the Guile internals.  (Where it
belongs -- my completely unhumble opinion :)
* Re: Guile: What's wrong with this? 2012-01-04 22:18 ` Bruce Korb @ 2012-01-04 23:22 ` Mike Gran 2012-01-04 23:59 ` Mark H Weaver 2012-01-05 7:22 ` David Kastrup 2 siblings, 0 replies; 117+ messages in thread From: Mike Gran @ 2012-01-04 23:22 UTC (permalink / raw) To: Bruce Korb, guile-devel@gnu.org > From: Bruce Korb <bkorb@gnu.org> >>> Which is the higher priority, language purity or ease of use? >> That is a loaded question, as it presupposes ease of use is always the >> same thing as impurity e.g. ... > Absolutely not. Making decisions is always about trade-offs, > otherwise it is not really a decision. Should you give preference > to language aesthetics, or preference to ease of use *when* > there is a divergence? More often than not, language purity > (consistency) *improves* ease of use. Here we are looking at > something that does not appear to me to improve ease of use. > You have to go to some extra trouble to be certain that a string > value that you have assigned to an SCM is not read only. > That is not convenience. If Guile were to implement copy on write, > then the user would not have to care whether a string were > shared read only or not. It would be easier to use. The only code > that would care at all would be the Guile internals. (Where it > belongs -- my completely unhumble opinion :) Well, I've read all the posts in this thread, and I was pretty aware of the arguments about read-only strings before this. So since I have little left to contribute, I'll sign off with one final statement about it... I agree completely with Bruce's statement above. The mutability of strings in Guile 1.8 was a feature, not a weakness. Even though it wasn't properly implemented, as Mark pointed out, it did what I meant every time I used it. I believe that mutability should be the default in all data types. 
Creating an immutable compound data type -- be it a string, pair, vector or whatever -- should never be the default, and should always be the case that requires extra syntax. R{5,6,7}RS disagrees with me on that, of course. I think R{5,6,7}RS is wrong. I understand the efficiency argument for immutable strings (and pairs). I don't care, because Guile has never been slow for anything I've asked it to do. That, I guess, is my completely unhumble opinion. :) Regards, Mike ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 22:18 ` Bruce Korb 2012-01-04 23:22 ` Mike Gran @ 2012-01-04 23:59 ` Mark H Weaver 2012-01-05 17:22 ` Bruce Korb 2012-01-05 7:22 ` David Kastrup 2 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-04 23:59 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > You have to go to some extra trouble to be certain that a string > value that you have assigned to an SCM is not read only. If you're going to mutate a string, you'd better be safe and make a copy before mutating it, unless you know very clearly where it came from. Otherwise, you might be mutating a string that some other data structure still references, and it might not take kindly to having its string mutated behind its back. The fact that some string (whose origin you don't know about) might be read-only is the least of your problems. At least that problem will now be flagged immediately, which is far better than the subtle and hard-to-debug damage that might be caused by mutating a string that other data structures may reference. All mutable values in Scheme are pointers. In the case of strings, that means that they're like "char *", not "char []". A great deal of code freely makes copies of these pointers instead of copying the underlying string itself. That's a very old tradition, because it is rare to mutate strings in Scheme. > If Guile were to implement copy on write, then the user would not have > to care whether a string were shared read only or not. It would be > easier to use. Guile already implements copy-on-write strings, but only in the sense of postponing the copy done by `string-copy', `substring', etc. Implementing copy-on-write transparently without the user explicitly making a copy (that is postponed) is _impossible_. The problem is that although we could make a new copy of the string, we have no way to know which pointers to the old object should be changed to point to the new one.
We cannot read the user's mind. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
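The `char *` analogy in the message above can be made concrete in plain C. What follows is only a minimal sketch of the aliasing hazard and of the "copy before mutating" advice, not Guile code; the helper names `upcase_in_place` and `upcase_copy` are hypothetical and were invented for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: upcase the buffer that s points to, in place.
   Every other pointer to the same buffer observes the change. */
void upcase_in_place(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/* Hypothetical helper: copy first, then mutate only the copy. The
   original buffer is left alone. Returns NULL if allocation fails;
   the caller owns the result and must free() it. */
char *upcase_copy(const char *s)
{
    assert(s != NULL);
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return NULL;
    strcpy(copy, s);
    upcase_in_place(copy);
    return copy;
}
```

Given `char buf[] = "hello"; char *a = buf; char *b = a;`, calling `upcase_in_place(a)` also changes what `b` sees, because both names hold the same pointer. Calling `upcase_copy(a)` instead leaves `buf` untouched and hands back a separate buffer.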
* Re: Guile: What's wrong with this? 2012-01-04 23:59 ` Mark H Weaver @ 2012-01-05 17:22 ` Bruce Korb 2012-01-05 18:13 ` Mark H Weaver ` (2 more replies) 0 siblings, 3 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-05 17:22 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/04/12 15:59, Mark H Weaver wrote: > Implementing copy-on-write transparently without the user explicitly > making a copy (that is postponed) is _impossible_. The problem is that > although we could make a new copy of the string, we have no way to know > which pointers to the old object should be changed to point to the new > one. We cannot read the user's mind. So because it might be the case that one reference might want to see changes made via another reference then the whole concept is trashed? "all or nothing"? Anyway, such a concept should be kept very simple: functions that modify their argument make copies of any input argument that is read only. Any other SCM's lying about that refer to the unmodified object continue referring to that same unmodified object. No mind reading required. (define a "hello") (define b a) (string-upcase! a) b yields "hello", not "HELLO". Simple, comprehensible and, of course, not the problem I was having. :) "it goes without saying (but I'll say it anyway)": (define a (string-copy "hello")) (define b a) (string-upcase! a) b *does* yield "HELLO" and not "hello". Why the inconsistency? Because it is better to do what is almost certainly expected rather than throw errors. It is an ease of use over language purity thing. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-05 17:22 ` Bruce Korb @ 2012-01-05 18:13 ` Mark H Weaver 2012-01-05 19:02 ` Mark H Weaver 2012-01-05 20:24 ` David Kastrup 2012-01-05 22:42 ` Mark H Weaver 2 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-05 18:13 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > So because it might be the case that one reference might want to > see changes made via another reference then the whole concept is > trashed? "all or nothing"? Anyway, such a concept should be kept > very simple: functions that modify their argument make copies of > any input argument that is read only. Any other SCM's lying about > that refer to the unmodified object continue referring to that > same unmodified object. No mind reading required. > > (define a "hello") > (define b a) > (string-upcase! a) > b In order to do as you suggest, we'd have to change `string-upcase!' from procedure to syntax. That's because `string-upcase!' gets a _copy_ of the pointer contained in `a', and is unable to change the pointer in `a'. This is fundamental to the semantics of Scheme. We cannot change it without breaking a _lot_ of code. If we changed every string mutation procedure to syntax, then you wouldn't be able to do things like this: (string-upcase! (proc arg ...)) (map string-upcase! list-of-strings) Also, if you wrote a procedure like this: (define (downcase-all-but-first! s) (string-downcase! s) (string-set! s 0 (char-upcase (string-ref s 0)))) it would work properly for mutable strings, but if you passed a read-only string, it would do nothing at all from the caller's point of view, because it would change the pointer in the local parameter s, but not the caller's pointer. These proposed semantics are bad because they don't compose well. > "it goes without saying (but I'll say it anyway)": > > (define a (string-copy "hello")) > (define b a) > (string-upcase! 
a) > b > > *does* yield "HELLO" and not "hello". Why the inconsistency? You are proceeding from the assumption that each variable contains its own string buffer, when in fact they contain pointers, and (define b a) copies only the pointer. In other words, the code above is like: char *a = string_copy ("hello"); char *b = a; string_upcase_x (a); return b; What you are asking for cannot be done without changing the fundamental semantics of Scheme at a very deep level. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
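The point about procedures receiving a copy of the pointer can be sketched by extending the C analogy used in the message above. This is an illustration under stated assumptions, not Guile's implementation: `dup_upcased`, `upcase_cow_procedure` and `upcase_cow_syntax` are hypothetical names, and the "read-only" string is modelled as a `const char *`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, upcased copy of s,
   or NULL if allocation fails. */
char *dup_upcased(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        copy[i] = (char) toupper((unsigned char) s[i]);
    copy[len] = '\0';
    return copy;
}

/* Models a procedure: s is a copy of the caller's pointer. Pointing it
   at a new buffer is invisible to the caller, so the work is wasted. */
void upcase_cow_procedure(const char *s)
{
    char *copy = dup_upcased(s);
    s = copy;      /* rebinds the local parameter only */
    (void) s;
    free(copy);    /* nobody else can reach the copy, so release it */
}

/* Models syntax: the callee is given the address of the caller's
   variable and can rebind it. Returns 0 on success, -1 on failure.
   On success the caller owns the new buffer. */
int upcase_cow_syntax(const char **sp)
{
    assert(sp != NULL);
    char *copy = dup_upcased(*sp);
    if (copy == NULL)
        return -1;
    *sp = copy;
    return 0;
}
```

Only the second form gives the behaviour Bruce asked for, where `a` ends up referring to "HELLO" while `b` still refers to "hello". It needs the address of the caller's variable, which is what a Scheme procedure never receives.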
* Re: Guile: What's wrong with this? 2012-01-05 18:13 ` Mark H Weaver @ 2012-01-05 19:02 ` Mark H Weaver 0 siblings, 0 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-05 19:02 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Replying to myself... >> "it goes without saying (but I'll say it anyway)": >> >> (define a (string-copy "hello")) >> (define b a) >> (string-upcase! a) >> b >> >> *does* yield "HELLO" and not "hello". Why the inconsistency? > > You are proceeding from the assumption that each variable contains its > own string buffer, when in fact they contain pointers, and (define b a) > copies only the pointer. In other words, the code above is like: > > char *a = string_copy ("hello"); > char *b = a; > string_upcase_x (a); > return b; Of course, in Scheme (and C) it is possible to do what you want by changing string-upcase! (string_upcase_x) from a procedure to a macro, but as you know, macros in C have significant disadvantages. Scheme macros are vastly more powerful and robust, but they also have significant disadvantages compared with procedures. Here's how you could do what you want with Scheme macros: (define-syntax-rule (string-upcase!! x) (set! x (string-upcase x))) Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
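For comparison, a rough C counterpart of the `string-upcase!!` macro in the message above might look like the sketch below. The names `dup_upcased` and `STRING_UPCASE_XX` are hypothetical, chosen only for this illustration, and the sketch ignores ownership of the old buffer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated, upcased copy of s,
   or NULL if allocation fails. */
char *dup_upcased(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        copy[i] = (char) toupper((unsigned char) s[i]);
    copy[len] = '\0';
    return copy;
}

/* Rebinds the variable x to an upcased copy of its current string.
   x must be an assignable variable of type const char *. The previous
   buffer is neither modified nor freed, and x becomes NULL if the
   allocation fails. */
#define STRING_UPCASE_XX(x) ((x) = dup_upcased(x))
```

The disadvantages mentioned in the message show up here too. The macro cannot be passed around as a function pointer, and something like `STRING_UPCASE_XX(get_name())` does not compile, because the result of a call is not an assignable variable.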
* Re: Guile: What's wrong with this? 2012-01-05 17:22 ` Bruce Korb 2012-01-05 18:13 ` Mark H Weaver @ 2012-01-05 20:24 ` David Kastrup 2012-01-05 22:42 ` Mark H Weaver 2 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-05 20:24 UTC (permalink / raw) To: guile-devel Bruce Korb <bkorb@gnu.org> writes: > On 01/04/12 15:59, Mark H Weaver wrote: >> Implementing copy-on-write transparently without the user explicitly >> making a copy (that is postponed) is _impossible_. The problem is that >> although we could make a new copy of the string, we have no way to know >> which pointers to the old object should be changed to point to the new >> one. We cannot read the user's mind. > > So because it might be the case that one reference might want to > see changes made via another reference then the whole concept is > trashed? Yes. Because different references can't be distinguished, it would mean that you'd not actually have a reference to the modified copy after modifying it. Which renders the modification useless. > "all or nothing"? Anyway, such a concept should be kept very simple: > functions that modify their argument make copies of any input argument > that is read only. Any other SCM's lying about that refer to the > unmodified object continue referring to that same unmodified object. > No mind reading required. > (define a "hello") > (define b a) > (string-upcase! a) > b > > yields "hello", not "HELLO". Simple, comprehensible and, of course, > not the problem I was having. :) It is neither simple, nor comprehensible. > "it goes without saying (but I'll say it anyway)": > > (define a (string-copy "hello")) > (define b a) > (string-upcase! a) > b > > *does* yield "HELLO" and not "hello". Why the inconsistency? > > Because it is better to do what is almost certainly expected > rather than throw errors. > > It is an ease of use over language purity thing. You probably don't realize how ironic that is. 
-- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-05 17:22 ` Bruce Korb 2012-01-05 18:13 ` Mark H Weaver 2012-01-05 20:24 ` David Kastrup @ 2012-01-05 22:42 ` Mark H Weaver 2012-01-06 1:02 ` Mike Gran 2 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-05 22:42 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > Anyway, such a concept should be kept > very simple: functions that modify their argument make copies of > any input argument that is read only. Any other SCM's lying about > that refer to the unmodified object continue referring to that > same unmodified object. No mind reading required. > > (define a "hello") > (define b a) > (string-upcase! a) > b I suspect that what you really want is for `define' (and maybe some other things) to automatically do a deep copy instead of merely making a new reference to an existing object. For example, you seem to want (define a "hello") to make a fresh copy of the string literal, and for (define b a) to make another copy so that changes to the string referenced by `b' do not affect the string referenced by `a'. You seem to not want to think about aliasing issues. Indeed, it would be more intuitive if we always copied everything deeply, but that would be strictly less powerful, not to mention far less efficient, especially when handling large structures. `define' merely makes a new reference to an existing object. If you want a copy, you must explicitly ask for one (though this could be hidden by custom syntax). It would not be desirable for the language to make copies automatically as part of the core `define' syntax. For one thing, sometimes you don't want a copy. Sometimes you want shared mutable objects. Even if you do want to copy, there are different kinds of copies. How deeply do you want to copy? If it's a hierarchical list, do you want to copy only the first level of the list, or do you want to recurse? Suppose this hierarchical list contains strings. 
Do you want to copy the strings too, or just the list structure? I could go on and on. There's no good universal copier; it depends on your purposes. If you want an abbreviated way to both `define' and `copy', then you'll need to make new syntax to do that. Guile provides all the power you need to do this. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
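The question of how deep a copy should go can be illustrated with an array of strings in C. This is a minimal sketch under assumed names (`copy_shallow` and `copy_deep` are hypothetical), showing just two of the many possible copy depths.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shallow copy: a new array whose slots point at the SAME strings.
   Returns NULL if allocation fails; free() only the array. */
char **copy_shallow(char *const *items, size_t n)
{
    assert(items != NULL && n > 0);
    char **out = malloc(n * sizeof *out);
    if (out == NULL)
        return NULL;
    memcpy(out, items, n * sizeof *out);
    return out;
}

/* Deep copy: a new array AND a new buffer for every string.
   Returns NULL if any allocation fails; otherwise the caller must
   free() each element and then the array. */
char **copy_deep(char *const *items, size_t n)
{
    assert(items != NULL && n > 0);
    char **out = malloc(n * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        out[i] = malloc(strlen(items[i]) + 1);
        if (out[i] == NULL) {
            while (i > 0)          /* undo the partial copy */
                free(out[--i]);
            free(out);
            return NULL;
        }
        strcpy(out[i], items[i]);
    }
    return out;
}
```

After a shallow copy, mutating one of the original strings is visible through the copy; after a deep copy it is not. Neither is the "right" default, which is the argument made above against building a copy into `define`.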
* Re: Guile: What's wrong with this? 2012-01-05 22:42 ` Mark H Weaver @ 2012-01-06 1:02 ` Mike Gran 2012-01-06 1:41 ` Mark H Weaver 2012-01-06 9:23 ` David Kastrup 0 siblings, 2 replies; 117+ messages in thread From: Mike Gran @ 2012-01-06 1:02 UTC (permalink / raw) To: Mark H Weaver, Bruce Korb; +Cc: guile-devel@gnu.org > `define' merely makes a new reference to an existing object. If you > want a copy, you must explicitly ask for one (though this could be > hidden by custom syntax). It would not be desirable for the language to > make copies automatically as part of the core `define' syntax. For one > thing, sometimes you don't want a copy. Sometimes you want shared > mutable objects. It is curious that the action of 'copy' really means the action of 'create a copy with different properties'. Shouldn't (string-copy "a") create another immutable string? Likewise, shouldn't (substring "abc" 1) return an immutable substring? ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-06 1:02 ` Mike Gran @ 2012-01-06 1:41 ` Mark H Weaver 2012-01-06 2:38 ` Noah Lavine ` (2 more replies) 2012-01-06 9:23 ` David Kastrup 1 sibling, 3 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-06 1:41 UTC (permalink / raw) To: Mike Gran; +Cc: Bruce Korb, guile-devel Mike Gran <spk121@yahoo.com> writes: > It is curious that action of 'copy' really means the > action of 'create a copy with different properties'. > > Shouldn't (string-copy "a") create another immutable string? Why would you want to copy an immutable string? > Likewise, shouldn't (substring "abc" 1) return an immutable substring? As I understand it, in the Scheme standards (at least before R6RS's immutable pairs) the rationale behind marking literal constants as immutable is solely to avoid needlessly making copies of those literals, while flagging accidental attempts to modify them, since that is almost certainly a mistake. If that is the only rationale for marking things read-only, then there's no reason to mark copies read-only. The philosophy of Scheme (at least before R6RS) was clearly to make almost all data structures mutable. Following that philosophy, in Guile, even though (substring "abc" 1) postpones copying the string buffer, it must create a new heap object. Once you've done that, it is feasible to implement copy-on-write. Now, the immutable pairs of R6RS and Racket have an entirely different rationale, namely that they enable vastly more effective optimization in a compiler. In this case, presumably you'd want copies to retain the immutability. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-06 1:41 ` Mark H Weaver @ 2012-01-06 2:38 ` Noah Lavine 2012-01-06 13:37 ` Mike Gran 2012-01-07 20:57 ` Guile: " Ian Price 2 siblings, 0 replies; 117+ messages in thread From: Noah Lavine @ 2012-01-06 2:38 UTC (permalink / raw) To: Mark H Weaver; +Cc: Bruce Korb, guile-devel Hello all, I must admit that I do not know much about why R5RS says that literals are constant, but I think there is a misunderstanding. Bruce does not want `define' to always copy its result. I think what he wants is for literals embedded in source code to be mutable. This would, of course, imply that each literal in the source code would be a new copy, even if they were identical. Weirdly enough, that is how my intuition works too. After all, if I made a string object in Scheme without going to any trouble, I would get a mutable object. If I write down a string, I expect to get the same sort of object. Bruce is also right that this enables quick and easy programming that munges strings. And I think the argument about putting strings in constant memory is bad - constant memory is an implementation detail. If it happens that we can store literals more efficiently when they are not mutated, then perhaps we should just detect that case and switch representations. Of course there is a trade-off here between ease of implementation and ease of use. This change seems pretty unimportant to me, especially if Python does all right with immutable strings, so I do not think it's important for us to support it. I just don't buy the arguments against supporting it. Noah On Thu, Jan 5, 2012 at 8:41 PM, Mark H Weaver <mhw@netris.org> wrote: > Mike Gran <spk121@yahoo.com> writes: >> It is curious that action of 'copy' really means the >> action of 'create a copy with different properties'. >> >> Shouldn't (string-copy "a") create another immutable string? > > Why would you want to copy an immutable string? 
> >> Likewise, shouldn't (substring "abc" 1) return an immutable substring? > > As I understand it, in the Scheme standards (at least before R6RS's > immutable pairs) the rationale behind marking literal constants as > immutable is solely to avoid needlessly making copies of those literals, > while flagging accidental attempts to modify them, since that is almost > certainly a mistake. > > If that is the only rationale for marking things read-only, then there's > no reason to mark copies read-only. The philosophy of Scheme (at least > before R6RS) was clearly to make almost all data structures mutable. > > Following that philosophy, in Guile, even though (substring "abc" 1) > postpones copying the string buffer, it must create a new heap object. > Once you've done that, it is feasible to implement copy-on-write. > > Now, the immutable pairs of R6RS and Racket have an entirely different > rationale, namely that they enable vastly more effective optimization in > a compiler. In this case, presumably you'd want copies to retain the > immutability. > > Mark > ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-06 1:41 ` Mark H Weaver 2012-01-06 2:38 ` Noah Lavine @ 2012-01-06 13:37 ` Mike Gran 2012-01-06 14:11 ` David Kastrup 2012-01-06 18:13 ` Mark H Weaver 2012-01-07 20:57 ` Guile: " Ian Price 2 siblings, 2 replies; 117+ messages in thread From: Mike Gran @ 2012-01-06 13:37 UTC (permalink / raw) To: Mark H Weaver; +Cc: Bruce Korb, guile-devel@gnu.org > From: Mark H Weaver <mhw@netris.org> >> It is curious that action of 'copy' really means the >> action of 'create a copy with different properties'. >> >> Shouldn't (string-copy "a") create another immutable string? > > Why would you want to copy an immutable string? > >> Likewise, shouldn't (substring "abc" 1) return an immutable > substring? I was being too snarky and rhetorical. Gotta stop writing e-mail before getting coffee. To say something possibly semi-constructive... The word 'string' in Scheme is overloaded to mean both string immutables and string mutables. Since a string immutable can't be modified to be a mutable, they really are different object types. String mutables appear to still exist in the latest draft of R7RS. Many of the procedures that operate on strings are overloaded to take both immutables and mutables, but some, like string-set!, take only mutables. There is an obvious syntax to construct a string immutable object: namely to have it appear as a literal in the source code. There thus isn't a need for a constructor function. There is a need for a constructor function to create string mutables, because a literal string in the source code indicates a string immutable. There are such constructors: (string <char> ...) and (make-string k <char>) which is fine. But there is no constructor for a string mutable that initializes it with a string in Guile 2.0. There was in Guile 1.8, where you could do (define <var-name> <string-literal>).
So instead, syntactically, we now have to use 'string-copy' or 'substring' for its *side-effects*, namely that it doesn't mark the copy immutable. Those are rather poor and confusing names for constructors. If making such a suggestion weren't pointless, I'd pitch the idea of overloading 'string' or 'make-string' so they can be used as a constructor of a string mutable. Something like (string <string-literal>) or (make-string <string-literal>). This would be clearer than using string-copy, I think. Thanks, Mike ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-06 13:37 ` Mike Gran @ 2012-01-06 14:11 ` David Kastrup 2012-01-06 18:13 ` Mark H Weaver 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-06 14:11 UTC (permalink / raw) To: guile-devel Mike Gran <spk121@yahoo.com> writes: > There is an obvious syntax to construct a string immutable > object: namely to have it appear as a literal in the source code. > There thus isn't a need for a constructor function. Huh? There are _lots_ of strings which are better computed than spelled out. > But there is no constructor for a string mutable that initializes > it with a string in Guile 2.0. (string-copy "xxxxx") > There was in Guile 1.8, where > you could do (define <var-name> <string-literal>). No, it wasn't. guile> (define (x) "xxxxx") guile> (x) "xxxxx" guile> (string-upcase! (x)) "XXXXX" guile> (x) "XXXXX" guile> As you can see, reevaluating the definition suddenly delivers a changed result, because we are not talking about modifying a mutable string initialized with a literal, but about modifying the literal itself. Whether or not you replace the function body with (define y "xxxxx") y instead of just "xxxxx" does not change the result and does not change what happens. y does not refer to a string initialized from the literal, it refers to the literal. And changing the literal is a really bad idea. Just because you do not understand what the code did previously does not mean that the behavior was well-defined. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-06 13:37 ` Mike Gran 2012-01-06 14:11 ` David Kastrup @ 2012-01-06 18:13 ` Mark H Weaver 2012-01-06 19:06 ` Bruce Korb 2012-01-06 22:23 ` Guile BUG: " Bruce Korb 1 sibling, 2 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-06 18:13 UTC (permalink / raw) To: Mike Gran; +Cc: Bruce Korb, guile-devel Mike Gran <spk121@yahoo.com> writes: > The word 'string' in Scheme is overloaded to mean both string > immutables and string mutables. Since a string immutable > can't be modified to be a mutable, they really are different > object types. String mutables appear to still exist in the > latest draft of R7RS. > > Many of the procedures that operate on strings will are overloaded > to take both immutables and mutables, but some, like string-set! > take only mutables. This is the wrong way to think about it. In Scheme, mutable and immutable strings are _not_ different types. The way to think about it is that in Scheme, the program text itself is immutable, including any literals contained in it. This is true of _all_ literals, including '(literal lists), '#(literal vectors), "literal strings", #'(literal syntax) and any other types that might be added in the future that would otherwise be mutable. Imagine that you were evaluating Scheme by hand on paper. You have your program written on one page, and you have another scratch page used for the data structures that your program creates during evaluation. Suppose your program contains a very large lookup table, written as a literal list. This lookup table is on your program page. Now, suppose you are asked to evaluate (lookup key big-lookup-table). The way Scheme works is that `big-lookup-table' is _not_ copied. As `lookup' traverses the table, it contains pointers within the program page itself. However, Scheme prohibits you from modifying _anything_ that happens to be on the program page. It's not a question of type. 
It's a question of which page the data happens to be on. Now, we _could_ force you to copy big-lookup-table from the program page onto the scratch page before doing `lookup', just in case `lookup' might try to mutate its structure. But that would be a lot of wasted effort. Alternatively, we could allow you to modify the program itself. This is what Guile 1.8 did. You _could_ make an argument that this is desirable, on the grounds that we should trust that the programmer knows what he's doing. However, it's clear that Bruce did _not_ understand what he was doing. I don't think that he (or you) realized that the following procedure was buggy in Guile 1.8: (define (ten-spaces-with-one-star-at i) (define s "          ") (string-set! s i #\*) s) Guile 1.8's permissivity allowed Bruce to unwittingly create a large body of code that was inherently buggy. IMHO, it would have been much better to nip that in the bud and alert him to the fact that he was doing something that was almost certainly unwise. > There is a need for a constructor function to create string mutables, > because a literal string in the source code indicates a string immutable. > > There are such constructors: (string <char> ...) and (make-string k <char>) > which is fine. > > But there is no constructor for a string mutable that initializes > it with a string in Guile 2.0. Yes there is: (string-copy "string-literal") If you don't like the name, then rename it: (define mutable-string string-copy) Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
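Why `ten-spaces-with-one-star-at' misbehaves when literals are mutable can be mimicked in C. The sketch below is an analogy only: modifying a real C string literal is undefined behaviour, so the "literal on the program page" is modelled with a function-local static buffer. The names `star_at_shared`, `star_at_fresh` and `WIDTH` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define WIDTH 10

/* Buggy analogue: the buffer belongs to the function itself, much as
   a mutable literal belongs to the program text. Stars written by
   earlier calls are still there on later calls. */
const char *star_at_shared(size_t i)
{
    static char s[WIDTH + 1];
    static int initialised = 0;

    assert(i < WIDTH);
    if (!initialised) {            /* ten spaces, set up exactly once */
        memset(s, ' ', WIDTH);
        s[WIDTH] = '\0';
        initialised = 1;
    }
    s[i] = '*';
    return s;
}

/* Fixed analogue: a fresh buffer per call, comparable to calling
   string-copy or make-string inside the procedure body. Returns NULL
   if allocation fails; the caller must free() the result. */
char *star_at_fresh(size_t i)
{
    assert(i < WIDTH);
    char *s = malloc(WIDTH + 1);
    if (s == NULL)
        return NULL;
    memset(s, ' ', WIDTH);
    s[WIDTH] = '\0';
    s[i] = '*';
    return s;
}
```

Calling `star_at_shared(0)` and then `star_at_shared(1)` returns the same buffer twice, now holding two stars. Each call to `star_at_fresh` returns a separate buffer with exactly one star.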
* Re: Guile: What's wrong with this? 2012-01-06 18:13 ` Mark H Weaver @ 2012-01-06 19:06 ` Bruce Korb 2012-01-06 19:19 ` David Kastrup 2012-01-07 16:13 ` Mark H Weaver 2012-01-06 22:23 ` Guile BUG: " Bruce Korb 1 sibling, 2 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-06 19:06 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/06/12 10:13, Mark H Weaver wrote: > Imagine that you were evaluating Scheme by hand on paper. You have your > program written on one page, and you have another scratch page used for > the data structures that your program creates during evaluation. > Suppose your program contains a very large lookup table, written as a > literal list. This lookup table is on your program page. > > Now, suppose.... That is where my mental model diverges!! > sprintf(buf, "(define %s \"%s\")", "foo", my_str); > scm_eval_string(buf); > sprintf(buf, "(string-upcase! %s)", "foo"); > // the string from my_str in "buf" is now scribbled over and completely gone > scm_eval_string(buf); Since I know the program I initially wrote (the define) is now gone, the string must have been copied off somewhere. I think one's first guess is that it was copied to someplace modifiable. However, that would be incorrect. It is copied off to writable memory, but marked as read-only for the purposes of Guile. Not intuitively obvious. > Guile 1.8's permissivity allowed Bruce to unwittingly create a large > body of code that was inherently buggy. IMHO, it would have been much > better to nip that in the bud and alert him to the fact that he was > doing something that was almost certainly unwise. Fail early and fail hard. Yes. But after all these discussions, I now doubt I have too many places where I am expecting to change a static value. Most of the strings that I wind up altering are created with a scm_from_locale_string() C function call.
Very few strings are ever actually initialized with (define foo "something"), other than when creating placeholders because you cannot define within a nested collection of functions. e.g. (if (whatever) (define foo (get "this")) (define foo (get "that")) ) (string-upcase! foo) ==== Anyway, I did compile and build my toy and guile with CFLAGS='-g -O0'. The error message did not show. Instead it seg faulted while trying to make this call: scm_from_locale_string(""); There must be a corruption somewhere. It is either asymptomatic with Guile 1.8 (viz. my fault) or it is introduced with Guile 2.0 (meaning a Guile code issue). More in a few days. Thank you. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this?
From: David Kastrup @ 2012-01-06 19:19 UTC
To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/06/12 10:13, Mark H Weaver wrote:
>> Imagine that you were evaluating Scheme by hand on paper. You have your
>> program written on one page, and you have another scratch page used for
>> the data structures that your program creates during evaluation.
>> Suppose your program contains a very large lookup table, written as a
>> literal list. This lookup table is on your program page.
>>
>> Now, suppose....
>
> That is where my mental model diverges!!

The mental model of the computer is what counts.

>> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
>> scm_eval_string(buf);
>> sprintf(buf, "(string-upcase! %s)", "foo")
>> // the string from my_str in "buf" is now scribbled over and completely gone
>> scm_eval_string(buf);
>
> Since I know the program I initially wrote (the define) is now gone,

Why would a define be gone?

> the string must have been copied off somewhere.

I don't think you understand the concept of garbage collection.
_Everything_ in Scheme exists permanently regarding all observable
semantics (well, weak hash tables are a somewhat weird exception).
Definitions, variables, continuations. There is no concept like a stack
of local values that would get erased. Thanks to call/cc, there is not
even a return stack that would get erased.

Every object carries its own lifetime with it. It dies when nobody
remembers it, not because of being in some scope or whatever else.

> I think one's first guess is that it was copied to someplace
> modifiable. However, that would be incorrect. It is copied off to
> writable memory, but marked as read-only for the purposes of Guile.
> Not intuitively obvious.

Also wrong.

--
David Kastrup
* Re: Guile: What's wrong with this?
From: Mark H Weaver @ 2012-01-06 20:03 UTC
To: David Kastrup; +Cc: guile-devel

David Kastrup <dak@gnu.org> writes:

> Bruce Korb <bkorb@gnu.org> writes:
>
>>> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
>>> scm_eval_string(buf);
>>> sprintf(buf, "(string-upcase! %s)", "foo")
>>> // the string from my_str in "buf" is now scribbled over and completely gone
>>> scm_eval_string(buf);
>>
>> Since I know the program I initially wrote (the define) is now gone,
>
> Why would a define be gone?

I think what Bruce means here is that, in theory, the string object
created in the above `define' might have held a reference to part of his
buffer `buf'. And indeed, we do make a copy of that buffer. So why not
make a mutable copy?

The reason is that, even though we make a copy of the program as we read
it (converting from the string representation of `buf' into our internal
representation), we'd like to be able to use the program multiple times.

When I speak of the "program text", I'm not referring to the string
representation of the program, but rather the internal representation.
If we allow the user to unwittingly modify the program, it might work
once but fail thereafter, as in:

  (define ten-spaces-with-one-star-at
    (lambda (i)
      (define s "          ")
      (string-set! s i #\*)
      s))

Now, some reasonable people might say "Why arbitrarily limit the user?
He might know what he's doing, and he might really want to do this!"

Scheme provides a nice way to do this too:

  (define ten-spaces-with-new-star-at
    (let ((s (make-string 10 #\space)))
      (lambda (i)
        (string-set! s i #\*)
        s)))

I normally lean toward assuming that the user knows what he's doing, but
in this case I think Scheme got it right. Accidentally modifying
literals is a very common mistake, and is almost never a good idea.

If you want to make a program with internal mutable state, Scheme
provides free variables, as used in the example above.

Mark
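
The "works once but fails thereafter" hazard described above has a rough
C analogy, sketched here under stated assumptions. A function-local
static buffer stands in for a literal that belongs to the program text;
a caller-supplied buffer stands in for freshly allocated storage. The
names star_in_shared_buffer and star_in_fresh_copy are hypothetical and
are not part of Guile or AutoGen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WIDTH 10

/* Shared-buffer version: the static buffer survives between calls, so
   every star ever written is still there on the next call. */
const char *star_in_shared_buffer(size_t i)
{
    static char s[WIDTH + 1];      /* zero-initialised, so s[WIDTH] is NUL */
    static int initialised = 0;

    assert(i < WIDTH);
    if (!initialised) {
        memset(s, ' ', WIDTH);     /* fill with spaces on the first call only */
        initialised = 1;
    }
    s[i] = '*';
    return s;
}

/* Fresh-copy version: each call rebuilds the blank template in storage
   the caller owns, so earlier calls leave no trace. */
void star_in_fresh_copy(size_t i, char out[WIDTH + 1])
{
    assert(i < WIDTH);
    memset(out, ' ', WIDTH);
    out[WIDTH] = '\0';
    out[i] = '*';
}
```

Calling star_in_shared_buffer(2) and then star_in_shared_buffer(5)
leaves stars at both positions, while star_in_fresh_copy(5, buf) only
ever shows the one star requested.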
* Re: Guile: What's wrong with this? 2012-01-06 19:06 ` Bruce Korb 2012-01-06 19:19 ` David Kastrup @ 2012-01-07 16:13 ` Mark H Weaver 2012-01-07 17:35 ` mutable interfaces - was: " Bruce Korb 1 sibling, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-07 16:13 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > Fail early and fail hard. Yes. But after all these discussions, I > now doubt I have too many places where I am expecting to change a > static value. That's good news! :) > Most of the strings that I wind up altering are created with a > scm_from_locale_string() C function call. BTW, beware that scm_from_locale_string() is only appropriate for strings that came from the user (e.g. command-line arguments, reading from a port, etc). When converting string literals from your own source code, you should use scm_from_latin1_string() or scm_from_utf8_string(). Similarly, to make symbols from C string literals, use scm_from_latin1_symbol() or scm_from_utf8_symbol(). Caveat: these functions did not exist in Guile 1.8. If your C string literals are ASCII-only, I guess it won't matter in practice which function you use, although it would be good to spread the understanding that C string literals should not be interpreted according to the user's locale. Best, Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: mutable interfaces - was: Guile: What's wrong with this?
From: Bruce Korb @ 2012-01-07 17:35 UTC
To: Mark H Weaver; +Cc: guile-devel

On 01/07/12 08:13, Mark H Weaver wrote:
>> Most of the strings that I wind up altering are created with a
>> scm_from_locale_string() C function call.
>
> BTW, beware that scm_from_locale_string() is only appropriate for
> strings that came from the user (e.g. command-line arguments, reading
> from a port, etc). When converting string literals from your own source
> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>
> Similarly, to make symbols from C string literals, use
> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>
> Caveat: these functions did not exist in Guile 1.8. If your C string
> literals are ASCII-only, I guess it won't matter in practice which
> function you use, although it would be good to spread the understanding
> that C string literals should not be interpreted according to the user's
> locale.

I go back to my argument that a facilitation language needs to focus
on being as helpful as possible. That means doing what is likely
wanted instead of throwing errors at every possibility. It also means
not changing interfaces. It is certainly much more stable now than
it was in the 1.4 to 1.6 transition era, but still.

Anyway, this then? (abbreviated)

#if GUILE_VERSION < 107000
# define AG_SCM_STR02SCM(_s) scm_makfrom0str(_s)
# define AG_SCM_STR2SCM(_st,_sz) scm_mem2string(_st,_sz)

#elif GUILE_VERSION < 200000
# define AG_SCM_STR02SCM(_s) scm_from_locale_string(_s)
# define AG_SCM_STR2SCM(_st,_sz) scm_from_locale_stringn(_st,_sz)

#elif GUILE_VERSION < 200004
#error "autogen does not work with this version of guile"
  choke me.

#else
# define AG_SCM_STR02SCM(_s) scm_from_utf8_string(_s)
# define AG_SCM_STR2SCM(_st,_sz) scm_from_utf8_stringn(_st,_sz)
#endif
* Re: mutable interfaces - was: Guile: What's wrong with this? 2012-01-07 17:35 ` mutable interfaces - was: " Bruce Korb @ 2012-01-07 17:47 ` David Kastrup 2012-01-07 18:30 ` Mark H Weaver 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-07 17:47 UTC (permalink / raw) To: guile-devel Bruce Korb <bkorb@gnu.org> writes: > On 01/07/12 08:13, Mark H Weaver wrote: >>> Most of the strings that I wind up altering are created with a >>> scm_from_locale_string() C function call. >> >> BTW, beware that scm_from_locale_string() is only appropriate for >> strings that came from the user (e.g. command-line arguments, reading >> from a port, etc). When converting string literals from your own source >> code, you should use scm_from_latin1_string() or scm_from_utf8_string(). >> >> Similarly, to make symbols from C string literals, use >> scm_from_latin1_symbol() or scm_from_utf8_symbol(). >> >> Caveat: these functions did not exist in Guile 1.8. If your C string >> literals are ASCII-only, I guess it won't matter in practice which >> function you use, although it would be good to spread the understanding >> that C string literals should not be interpreted according to the user's >> locale. > > I go back to my argument that a facilitation language needs to focus > on being as helpful as possible. That means doing what is likely > wanted instead of throwing errors at every possibility. It also means > not changing interfaces. Undefined behavior is not an interface. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: mutable interfaces - was: Guile: What's wrong with this?
From: Mark H Weaver @ 2012-01-07 18:30 UTC
To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/07/12 08:13, Mark H Weaver wrote:
>>> Most of the strings that I wind up altering are created with a
>>> scm_from_locale_string() C function call.
>>
>> BTW, beware that scm_from_locale_string() is only appropriate for
>> strings that came from the user (e.g. command-line arguments, reading
>> from a port, etc). When converting string literals from your own source
>> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>>
>> Similarly, to make symbols from C string literals, use
>> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>>
>> Caveat: these functions did not exist in Guile 1.8. If your C string
>> literals are ASCII-only, I guess it won't matter in practice which
>> function you use, although it would be good to spread the understanding
>> that C string literals should not be interpreted according to the user's
>> locale.
>
> I go back to my argument that a facilitation language needs to focus
> on being as helpful as possible. That means doing what is likely
> wanted instead of throwing errors at every possibility. It also means
> not changing interfaces.

Sorry, but there's no way to maintain backward compatibility here. I
know it's a pain, but there's no getting around the fact that in order
to write proper internationalized code, we now need to think carefully
about what encoding a particular string is in. There's no automatic way
to handle this, not even in principle.

Fortunately, most modern GNU/Linux systems default to a UTF-8 locale, in
which case scm_from_locale_string and scm_from_utf8_string will be the
same anyway. However, there are still some systems that use a non-UTF-8
locale, and we must strive to support them properly.

> Anyway, this then? (abbreviated)
>
> #if GUILE_VERSION < 107000
> # define AG_SCM_STR02SCM(_s) scm_makfrom0str(_s)
> # define AG_SCM_STR2SCM(_st,_sz) scm_mem2string(_st,_sz)
>
> #elif GUILE_VERSION < 200000
> # define AG_SCM_STR02SCM(_s) scm_from_locale_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz) scm_from_locale_stringn(_st,_sz)
>
> #elif GUILE_VERSION < 200004
> #error "autogen does not work with this version of guile"
> choke me.

This last clause is wrong. scm_from_utf8_string and
scm_from_utf8_stringn were in Guile 2.0.0.

> #else
> # define AG_SCM_STR02SCM(_s) scm_from_utf8_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz) scm_from_utf8_stringn(_st,_sz)
> #endif

Just remember that this change implies that these macros should only be
used for C string literals, and must _not_ be used for strings supplied
by the user (e.g. command-line arguments and I/O).

It could very well be that you're currently overloading these functions
for both purposes, in which case you should split this pair of macros
into two distinct pairs: one pair of macros for user strings (keep using
scm_from_locale_string{,n} for these), and one pair for C string
literals (use scm_from_utf8_string{,n} for Guile 2.0.0 or newer).

Then look at each use of these old overloaded macros in your code, and
figure out whether it's operating on a string that came from the user or
a string that came from your own source code.

Again, I stress that this has nothing to do with Guile. All software,
if it wishes to be properly internationalized, needs to think about
where a string came from. In general, your program's source code (and
thus the C string literals it contains) will have a different encoding
than C strings that come from the user. C strings of different
encodings are essentially of different types (even though C's type
system is too crude to distinguish them), and you must treat them as
such.

Mark
* Re: mutable interfaces - was: Guile: What's wrong with this? 2012-01-07 18:30 ` Mark H Weaver @ 2012-01-07 18:55 ` Mark H Weaver 0 siblings, 0 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-07 18:55 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Replying to myself... > Again, I stress that this has nothing to do with Guile. All software, > if it wishes to be properly internationalized, needs to think about > where a string came from. In general, your program's source code (and > thus the C string literals it contains) will have a different encoding > than C strings that come from the user. C strings of different > encodings are essentially of different types (even though C's type > system is too crude to distinguish them), and you must treat them as > such. In case it wasn't clear: Scheme strings don't have any encoding; they are a sequence of Unicode characters. Therefore, you never have to think about where a Scheme string came from. What you need to think about is where a raw sequence of bytes came from, whether it be a C string (C chars are not characters but merely bytes), a Scheme bytevector, or the bytes in a command-line argument, environment variable, or the bytes read from a file descriptor. Ideally, our code would make these distinctions very clear. However, if you're not motivated (or don't have time) to fix that properly right now, there's one fact that can save you a lot of time: on GNU/Linux and POSIX systems, every locale encoding is compatible with ASCII. Therefore, if you know that a string contains only ASCII characters, then you don't need to think about whether to use scm_from_locale_string or scm_from_utf8_string, because they'll both be equivalent. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
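
The point that a C string is only a run of bytes until an encoding is
chosen can be illustrated without Guile at all. The sketch below is a
minimal, hypothetical example: count_chars_latin1, count_chars_utf8 and
is_ascii are illustrative helpers, not libguile functions, and the UTF-8
counter assumes well-formed input.

```c
#include <assert.h>
#include <stddef.h>

/* Read as Latin-1, every byte is exactly one character. */
size_t count_chars_latin1(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    return len;
}

/* Read as UTF-8, a character starts at every byte that is not a
   continuation byte (bit pattern 10xxxxxx). */
size_t count_chars_utf8(const unsigned char *bytes, size_t len)
{
    size_t n = 0;

    assert(bytes != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if ((bytes[i] & 0xC0) != 0x80)
            n++;
    return n;
}

/* Returns 1 when every byte is 7-bit ASCII.  For such input the two
   interpretations above agree. */
int is_ascii(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if (bytes[i] > 0x7F)
            return 0;
    return 1;
}
```

The five bytes 63 61 66 C3 A9 are four characters when read as UTF-8
but five when read as Latin-1, whereas the ASCII bytes 63 61 66 65 are
four characters either way. That is the sense in which ASCII-only
literals are safe under either conversion.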
* Re: Guile BUG: What's wrong with this? 2012-01-06 18:13 ` Mark H Weaver 2012-01-06 19:06 ` Bruce Korb @ 2012-01-06 22:23 ` Bruce Korb 2012-01-06 23:11 ` Mark H Weaver 2012-01-06 23:28 ` Guile BUG: What's wrong with this? Bruce Korb 1 sibling, 2 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-06 22:23 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel scm_from_locale_stringn() makes an optimization when the length is zero. It returns an immutable string of length zero. For reasons I no longer remember, I had my own ag_scm_string_upcase that called scm_string_upcase_x, presuming that scm_from_locale_stringn had returned a writable string. Two possible fixes: 1. remove the "optimization" 2. check the length in scm_string_upcase_x before choking. The reason for the seg fault is that scm_backtrace() faulted. I called it in the on_exit path and it couldn't cope. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-06 22:23 ` Guile BUG: " Bruce Korb @ 2012-01-06 23:11 ` Mark H Weaver 2012-01-06 23:35 ` Andy Wingo ` (2 more replies) 2012-01-06 23:28 ` Guile BUG: What's wrong with this? Bruce Korb 1 sibling, 3 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-06 23:11 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > scm_from_locale_stringn() makes an optimization when the length is zero. > It returns an immutable string of length zero. Good catch! > Two possible fixes: > > 1. remove the "optimization" > 2. check the length in scm_string_upcase_x before choking. I see a third possible fix, which I think I like best: 3. Make scm_nullstr into a mutable string. After all, it can't be changed anyway, and the _only_ reference to it is from scm_from_stringn, so the result should always be mutable. What do other people think? Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-06 23:11 ` Mark H Weaver @ 2012-01-06 23:35 ` Andy Wingo 2012-01-06 23:41 ` Bruce Korb 2012-01-07 14:35 ` Mark H Weaver 2 siblings, 0 replies; 117+ messages in thread From: Andy Wingo @ 2012-01-06 23:35 UTC (permalink / raw) To: Mark H Weaver; +Cc: Bruce Korb, guile-devel On Sat 07 Jan 2012 00:11, Mark H Weaver <mhw@netris.org> writes: > 3. Make scm_nullstr into a mutable string. After all, it can't be > changed anyway, and the _only_ reference to it is from > scm_from_stringn, so the result should always be mutable. > > What do other people think? Makes sense to me. Good catch, Bruce. Are you down for fixing this one, Mark? :-) Cheers, Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-06 23:11 ` Mark H Weaver 2012-01-06 23:35 ` Andy Wingo @ 2012-01-06 23:41 ` Bruce Korb 2012-01-07 15:00 ` Mark H Weaver 2012-01-07 14:35 ` Mark H Weaver 2 siblings, 1 reply; 117+ messages in thread From: Bruce Korb @ 2012-01-06 23:41 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/06/12 15:11, Mark H Weaver wrote: > Bruce Korb<bkorb@gnu.org> writes: >> scm_from_locale_stringn() makes an optimization when the length is zero. >> It returns an immutable string of length zero. > > Good catch! > >> Two possible fixes: >> >> 1. remove the "optimization" >> 2. check the length in scm_string_upcase_x before choking. > > I see a third possible fix, which I think I like best: > > 3. Make scm_nullstr into a mutable string. After all, it can't be > changed anyway, and the _only_ reference to it is from > scm_from_stringn, so the result should always be mutable. > > What do other people think? > > Mark > > > I think you are presuming that that is the only source of zero length immutable strings. Are you completely certain? Anyway: Running socket.test ERROR: socket.test: AF_INET6/SOCK_STREAM: bind - arguments: ((system-error "bind" "~A" ("Cannot assign requested address") (99))) ERROR: socket.test: AF_INET6/SOCK_STREAM: bind/sockaddr - arguments: ((system-error "bind" "~A" ("Cannot assign requested address") (99))) I'm going to assume that whatever that is, it isn't related to my change. Tho perhaps yours. I am not sure I understand all the ramifications of your change. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-06 23:41 ` Bruce Korb @ 2012-01-07 15:00 ` Mark H Weaver 2012-01-07 15:27 ` Bruce Korb 2012-01-07 15:47 ` David Kastrup 0 siblings, 2 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-07 15:00 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > On 01/06/12 15:11, Mark H Weaver wrote: >> Bruce Korb<bkorb@gnu.org> writes: >>> scm_from_locale_stringn() makes an optimization when the length is zero. >>> It returns an immutable string of length zero. >> >> Good catch! >> >>> Two possible fixes: >>> >>> 1. remove the "optimization" >>> 2. check the length in scm_string_upcase_x before choking. >> >> I see a third possible fix, which I think I like best: >> >> 3. Make scm_nullstr into a mutable string. After all, it can't be >> changed anyway, and the _only_ reference to it is from >> scm_from_stringn, so the result should always be mutable. >> >> What do other people think? >> >> Mark >> >> >> > > I think you are presuming that that is the only source of zero length > immutable strings. Are you completely certain? Empty string literals ("") in the program text are still immutable, so (string-upcase! "") still throws an error. I admit that this is an arguable point. Section 3.4 (Storage model) of the R5RS (and the R7RS draft) says "It is an error to attempt to store a new value into a location that is denoted by an immutable object." An empty string denotes no locations, so perhaps this should not be an error after all. The right place to fix this would probably be in `scm_i_string_start_writing' (strings.c). What do other people think? Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 15:00 ` Mark H Weaver @ 2012-01-07 15:27 ` Bruce Korb 2012-01-07 16:38 ` Mark H Weaver 2012-01-07 15:47 ` David Kastrup 1 sibling, 1 reply; 117+ messages in thread From: Bruce Korb @ 2012-01-07 15:27 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/07/12 07:00, Mark H Weaver wrote: >> I think you are presuming that that is the only source of zero length >> immutable strings. Are you completely certain? > > Empty string literals ("") in the program text are still immutable, so > (string-upcase! "") still throws an error. > > I admit that this is an arguable point. Section 3.4 (Storage model) of > the R5RS (and the R7RS draft) says "It is an error to attempt to store a > new value into a location that is denoted by an immutable object." An > empty string denotes no locations, so perhaps this should not be an > error after all. > > The right place to fix this would probably be in > `scm_i_string_start_writing' (strings.c). > > What do other people think? I think it too much effort for that function. I looked at it. The problem is that you'd have to pass it the start and end points, that is, change its interface. Not worth it. Either do as you've done and have a shared writable zero length string, or exit the functions that use scm_i_string_start_writing before calling that function, in the event that these string transformation functions detect a zero length string (my patch). Either way. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 15:27 ` Bruce Korb @ 2012-01-07 16:38 ` Mark H Weaver 2012-01-07 17:39 ` Bruce Korb 0 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-07 16:38 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > On 01/07/12 07:00, Mark H Weaver wrote: >> The right place to fix this would probably be in >> `scm_i_string_start_writing' (strings.c). > > I think it too much effort for that function. I looked at it. > The problem is that you'd have to pass it the start and end points, > that is, change its interface. Ah yes, excellent point! Indeed, it would not be enough for `scm_i_string_start_writing' to check for an empty string. Even for non-empty immutable strings, if the range of character indices passed to a string mutator is empty, then no characters will be changed, and therefore the operation should succeed. `scm_i_string_start_writing' doesn't have enough information to detect this case. I see two choices: * Modify the interface to `scm_i_string_start_writing' to give it the `start' and `end' indices. * Add checks to all string mutation functions: if the range is empty, then avoid calling `scm_i_string_start_writing'. The advantage to the first approach is that authors of future string mutators won't have to remember to handle this case specially, and I have very little confidence that they would. I'll look into this. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
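
The second choice above (check for an empty range in each mutator,
before asking for write access) can be sketched with a toy model. This
is not Guile's string representation or its real code: struct toy_string
and toy_upcase_range are hypothetical names, and the read_only flag only
stands in for whatever check scm_i_string_start_writing performs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy string with a read-only flag. */
struct toy_string {
    char  *chars;
    size_t length;
    int    read_only;
};

/* Upcase the characters in [start, end).
   Returns 0 on success, -1 for "string is read-only". */
int toy_upcase_range(struct toy_string *s, size_t start, size_t end)
{
    assert(s != NULL && start <= end && end <= s->length);

    /* Empty range: no character would change, so succeed without ever
       consulting the read-only flag. */
    if (start == end)
        return 0;

    /* Only a real write needs a writable string. */
    if (s->read_only)
        return -1;

    for (size_t i = start; i < end; i++)
        s->chars[i] = (char)toupper((unsigned char)s->chars[i]);
    return 0;
}
```

With this ordering, a read-only empty string and an empty range inside a
read-only non-empty string both succeed, while any non-empty range on a
read-only string still fails. The cost, as noted above, is that every
mutator has to remember to perform the check itself.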
* Re: Guile BUG: What's wrong with this? 2012-01-07 16:38 ` Mark H Weaver @ 2012-01-07 17:39 ` Bruce Korb 2012-01-09 15:41 ` Mark H Weaver 0 siblings, 1 reply; 117+ messages in thread From: Bruce Korb @ 2012-01-07 17:39 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/07/12 08:38, Mark H Weaver wrote: > * Modify the interface to `scm_i_string_start_writing' to give it the > `start' and `end' indices. > > * Add checks to all string mutation functions: if the range is empty, > then avoid calling `scm_i_string_start_writing'. Yes. All of them. All four. > The advantage to the first approach is that authors of future string > mutators won't have to remember to handle this case specially, and I > have very little confidence that they would. > > I'll look into this. Either way. The advantage of quitting a string transformation function early when the length to modify is zero is you save more overhead than just calling scm_i_string_start_writing. But it's your call. Whatever. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 17:39 ` Bruce Korb @ 2012-01-09 15:41 ` Mark H Weaver 2012-01-09 17:27 ` Bruce Korb 0 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-09 15:41 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bkorb@gnu.org> writes: > On 01/07/12 08:38, Mark H Weaver wrote: >> * Add checks to all string mutation functions: if the range is empty, >> then avoid calling `scm_i_string_start_writing'. > > Yes. All of them. All four. For the record, there were 7 string mutation functions to fix :-P Anyway, I did as you suggested, and left `scm_i_string_start_writing' alone. Thanks, Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-09 15:41 ` Mark H Weaver @ 2012-01-09 17:27 ` Bruce Korb 2012-01-09 18:32 ` Andy Wingo 0 siblings, 1 reply; 117+ messages in thread From: Bruce Korb @ 2012-01-09 17:27 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/09/12 07:41, Mark H Weaver wrote: >> Yes. All of them. All four. > > For the record, there were 7 string mutation functions to fix :-P I guess my cscope search was not exhaustive enough. Thanks! We are talking 2.0.4, yes? When might that be? :) Regards, Bruce ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-09 17:27 ` Bruce Korb @ 2012-01-09 18:32 ` Andy Wingo 2012-01-09 19:48 ` Bruce Korb 0 siblings, 1 reply; 117+ messages in thread From: Andy Wingo @ 2012-01-09 18:32 UTC (permalink / raw) To: Bruce Korb; +Cc: Mark H Weaver, guile-devel On Mon 09 Jan 2012 18:27, Bruce Korb <bkorb@gnu.org> writes: > We are talking 2.0.4, yes? When might that be? :) A week? :) Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-09 18:32 ` Andy Wingo @ 2012-01-09 19:48 ` Bruce Korb 0 siblings, 0 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-09 19:48 UTC (permalink / raw) To: Andy Wingo; +Cc: Mark H Weaver, guile-devel On 01/09/12 10:32, Andy Wingo wrote: > On Mon 09 Jan 2012 18:27, Bruce Korb<bkorb@gnu.org> writes: > >> We are talking 2.0.4, yes? When might that be? :) > > A week? :) Wonderful! I just wanted reassurance it wasn't months. My own round tuits for hobby time things are often measured in months. :( ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 15:00 ` Mark H Weaver 2012-01-07 15:27 ` Bruce Korb @ 2012-01-07 15:47 ` David Kastrup 2012-01-07 17:07 ` Mark H Weaver 1 sibling, 1 reply; 117+ messages in thread From: David Kastrup @ 2012-01-07 15:47 UTC (permalink / raw) To: guile-devel Mark H Weaver <mhw@netris.org> writes: > Empty string literals ("") in the program text are still immutable, so > (string-upcase! "") still throws an error. > > I admit that this is an arguable point. Section 3.4 (Storage model) of > the R5RS (and the R7RS draft) says "It is an error to attempt to store a > new value into a location that is denoted by an immutable object." An > empty string denotes no locations, so perhaps this should not be an > error after all. > > The right place to fix this would probably be in > `scm_i_string_start_writing' (strings.c). > > What do other people think? Mutating list operations are allowed on '() (and do not change it). '(), the empty list structure, is eq? to itself regardless how you arrived at it. I think it would give some logical symmetry if the same held for "" and #(). "" is obviously a valid substring of either mutable or immutable strings. The result of (string-append! x "") should leave the immutability state of x alone. One rationale behind that is more or less that the immutability is a property of the characters of the string, and "" has no characters of its own and does not contribute to the characters. If there are predicates "immutable-string?" and "mutable-string?" (I don't have Guilev2 installed), then "" would be the only string satisfying both predicates. Efficiency of implementation might make other choices preferable, but that's what I would consider logically satisfying. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 15:47 ` David Kastrup @ 2012-01-07 17:07 ` Mark H Weaver 0 siblings, 0 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-07 17:07 UTC (permalink / raw) To: David Kastrup; +Cc: guile-devel David Kastrup <dak@gnu.org> writes: > Mutating list operations are allowed on '() (and do not change it). > '(), the empty list structure, is eq? to itself regardless how you > arrived at it. Excellent point. The R5RS says that `list' returns "a newly allocated list", but that's obviously not true for (list). So I guess we can take this as a precedent that the "newly allocated" language does not necessarily apply in the 0-element case. I wonder if the R7RS should make this point explicit. It's obvious for lists, but not for vectors or strings. > The result of (string-append! x "") should leave the immutability > state of x alone. There's no `string-append!' nor anything like it, because in Scheme the length of strings is fixed. Only the characters themselves can be changed, not the length. > If there are predicates "immutable-string?" and "mutable-string?" (I > don't have Guilev2 installed), then "" would be the only string > satisfying both predicates. There are no such predicates, and I don't see any good use for them. If you need to check whether a string is mutable, then you shouldn't be mutating it anyway. Anyway, mutability is not a property of strings in particular, but of all objects. Or at least it should be. Right now, we don't enforce immutability of literal lists or vectors, but we should. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-06 23:11 ` Mark H Weaver 2012-01-06 23:35 ` Andy Wingo 2012-01-06 23:41 ` Bruce Korb @ 2012-01-07 14:35 ` Mark H Weaver 2012-01-07 15:20 ` Mike Gran ` (2 more replies) 2 siblings, 3 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-07 14:35 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel I wrote: > 3. Make scm_nullstr into a mutable string. After all, it can't be > changed anyway, and the _only_ reference to it is from > scm_from_stringn, so the result should always be mutable. For the record: my statement above was in error; scm_nullstr is actually used in several files. However, I looked at each use, and in all cases a mutable string is appropriate. Also, it is SCM_INTERNAL. So I committed the change. However, I wonder if we should also remove this optimization from scm_from_stringn, as Bruce suggested. The R5RS says that `string' and `make-string' should return "a newly allocated string", which implies that the new string should not be `eq?' to any existing object. Although our docs for scm_from_stringn et al do not explicitly specify that the string is newly allocated, an argument could be made that we should follow the behavior of `string'. What do other people think? Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 14:35 ` Mark H Weaver @ 2012-01-07 15:20 ` Mike Gran 2012-01-07 22:25 ` Ludovic Courtès 2012-01-10 9:13 ` The empty string and other empty strings Ludovic Courtès 2 siblings, 0 replies; 117+ messages in thread From: Mike Gran @ 2012-01-07 15:20 UTC (permalink / raw) To: Mark H Weaver, Bruce Korb; +Cc: guile-devel@gnu.org > From: Mark H Weaver <mhw@netris.org> > > I wrote: >> 3. Make scm_nullstr into a mutable string. After all, it can't be >> changed anyway, and the _only_ reference to it is from >> scm_from_stringn, so the result should always be mutable. > > For the record: my statement above was in error; scm_nullstr is actually > used in several files. However, I looked at each use, and in all cases > a mutable string is appropriate. Also, it is SCM_INTERNAL. So I > committed the change. > > However, I wonder if we should also remove this optimization from > scm_from_stringn, as Bruce suggested. The R5RS says that `string' and > `make-string' should return "a newly allocated string", which > implies > that the new string should not be `eq?' to any existing object. I threw in the optimization a couple of years ago into scm_from_stringn only because I saw it used elsewhere in the code. This was well before Guile-2.0's switch of the immutable flag. So there wasn't much thought behind it. -Mike > > Although our docs for scm_from_stringn et al do not explicitly specify > that the string is newly allocated, an argument could be made that we > should follow the behavior of `string'. > > What do other people think? > > Mark > ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-07 14:35 ` Mark H Weaver 2012-01-07 15:20 ` Mike Gran @ 2012-01-07 22:25 ` Ludovic Courtès 2012-01-10 9:13 ` The empty string and other empty strings Ludovic Courtès 2 siblings, 0 replies; 117+ messages in thread From: Ludovic Courtès @ 2012-01-07 22:25 UTC (permalink / raw) To: guile-devel Hi, Mark H Weaver <mhw@netris.org> skribis: > I wrote: >> 3. Make scm_nullstr into a mutable string. After all, it can't be >> changed anyway, and the _only_ reference to it is from >> scm_from_stringn, so the result should always be mutable. > > For the record: my statement above was in error; scm_nullstr is actually > used in several files. However, I looked at each use, and in all cases > a mutable string is appropriate. Also, it is SCM_INTERNAL. So I > committed the change. Good! > However, I wonder if we should also remove this optimization from > scm_from_stringn, as Bruce suggested. The R5RS says that `string' and > `make-string' should return "a newly allocated string", which implies > that the new string should not be `eq?' to any existing object. > > Although our docs for scm_from_stringn et al do not explicitly specify > that the string is newly allocated, an argument could be made that we > should follow the behavior of `string'. > > What do other people think? Makes sense to return a new empty string, yes. Ludo’, who is hoping for the day where strings are immutable, period. :-) ^ permalink raw reply [flat|nested] 117+ messages in thread
* The empty string and other empty strings 2012-01-07 14:35 ` Mark H Weaver 2012-01-07 15:20 ` Mike Gran 2012-01-07 22:25 ` Ludovic Courtès @ 2012-01-10 9:13 ` Ludovic Courtès 2012-01-10 11:28 ` Mike Gran ` (2 more replies) 2 siblings, 3 replies; 117+ messages in thread From: Ludovic Courtès @ 2012-01-10 9:13 UTC (permalink / raw) To: guile-devel Hello Mark! Mark H Weaver <mhw@netris.org> skribis: > I wrote: >> 3. Make scm_nullstr into a mutable string. After all, it can't be >> changed anyway, and the _only_ reference to it is from >> scm_from_stringn, so the result should always be mutable. > > For the record: my statement above was in error; scm_nullstr is actually > used in several files. However, I looked at each use, and in all cases > a mutable string is appropriate. Also, it is SCM_INTERNAL. So I > committed the change. I just noticed that there are i18n.test failures on Hydra, which point at this commit: http://hydra.nixos.org/build/1790097 I think this is under the C locale, but I haven’t been able to reproduce it yet. Anyway, it seems that before, you couldn’t get any encoding error for scm_from_stringn ("", "SOME-ENCODING"), whereas now you can. A related question: can we have both narrow and wide empty strings? Thanks, Ludo’. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 9:13 ` The empty string and other empty strings Ludovic Courtès @ 2012-01-10 11:28 ` Mike Gran 2012-01-10 13:03 ` Mark H Weaver 2012-01-10 14:10 ` David Kastrup 2012-01-10 12:21 ` Mike Gran 2012-01-10 12:27 ` Mark H Weaver 2 siblings, 2 replies; 117+ messages in thread From: Mike Gran @ 2012-01-10 11:28 UTC (permalink / raw) To: Ludovic Courtès, guile-devel@gnu.org > From: Ludovic Courtès <ludo@gnu.org> > A related question: can we have both narrow and wide empty strings? The intention is that a string is encoded as wide only if it can't be encoded as narrow. So _newly created_ empty strings should only be narrow. Right now it seems that zero-length shared substring of a wide string is wide. A zero-length substring still shares the stringbuf of the original string. (%string-dump (substring (apply string (map integer->char (list 2001 2002 2003))) 3)) So I guess the answer is that you can have both wide and narrow empty strings if you believe that zero-length substrings need to point to a zero-length part of the stringbuf of the parent string from which they were generated. This is a little pedantic, but I think it might be the right answer. What do you think about that? Do zero-length substrings need to still share stringbufs with their parent strings? In any case, a string-copy of a narrow substring of an otherwise wide string should be a new narrow string. This should apply to zero-length substrings as well. This isn't happening, because we're missing a scm_i_try_narrow_string in string-copy, which is a bug. Thanks, Mike ^ permalink raw reply [flat|nested] 117+ messages in thread
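The narrowing rule described above ("wide only if it can't be encoded as narrow") can be sketched in plain C. This is only an illustrative model, not Guile's code: `try_narrow` and the buffer representation are hypothetical, and it assumes "narrow" means every code point fits in a single byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model, not Guile's real stringbuf layout.

   Try to store LEN code points in one byte each.  Returns a malloc'd
   narrow buffer, or NULL when some code point is above 0xFF and the
   string therefore has to stay wide.  An empty input always narrows,
   because it contains nothing that could require wide storage.  */
static unsigned char *
try_narrow (const uint32_t *wide, size_t len)
{
  unsigned char *narrow;
  size_t i;

  assert (wide != NULL || len == 0);

  for (i = 0; i < len; i++)
    if (wide[i] > 0xFF)
      return NULL;              /* must stay wide */

  /* +1 so that a zero-length result is still a valid, non-NULL pointer.  */
  narrow = malloc (len + 1);
  if (narrow == NULL)
    abort ();
  for (i = 0; i < len; i++)
    narrow[i] = (unsigned char) wide[i];
  narrow[len] = 0;
  return narrow;
}
```

Under this model, copying the zero-length tail of a three-character wide string yields a narrow result, which is the behavior the message above argues `string-copy` should have.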
* Re: The empty string and other empty strings 2012-01-10 11:28 ` Mike Gran @ 2012-01-10 13:03 ` Mark H Weaver 2012-01-10 13:09 ` Mike Gran 2012-01-10 15:41 ` Mark H Weaver 2012-01-10 14:10 ` David Kastrup 1 sibling, 2 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-10 13:03 UTC (permalink / raw) To: Mike Gran; +Cc: Ludovic Courtès, guile-devel Mike Gran <spk121@yahoo.com> writes: > Right now it seems that zero-length shared substring of a wide string is > wide. A zero-length substring still shares the stringbuf of the > original string. [...] > What do you think about that? Do zero-length substrings need to > still share stringbufs with their parent strings? I think the answer is: no they don't, and avoiding that might be a worthwhile optimization, mainly to avoid needlessly holding a reference to a potentially large stringbuf. > In any case, a string-copy of a narrow substring of an otherwise wide string > should be a new narrow string. This should apply to zero-length > substrings as well. This isn't happening, because we're missing > a scm_i_try_narrow_string in string-copy, which is a bug. I just fixed this. > Looks like for zero-length input strings, u32_conv_from_encoding can > return NULL. Interesting! Anyway, we now avoid calling `u32_conv_from_encoding' for empty strings. Thanks, Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 13:03 ` Mark H Weaver @ 2012-01-10 13:09 ` Mike Gran 2012-01-10 15:41 ` Mark H Weaver 1 sibling, 0 replies; 117+ messages in thread From: Mike Gran @ 2012-01-10 13:09 UTC (permalink / raw) To: Mark H Weaver; +Cc: Ludovic Courtès, guile-devel@gnu.org > From: Mark H Weaver <mhw@netris.org> >> What do you think about that? Do zero-length substrings need to >> still share stringbufs with their parent strings? > > I think the answer is: no they don't, and avoiding that might be a > worthwhile optimization, mainly to avoid needlessly holding a reference > to a potentially large stringbuf. That's a good point. Thanks, Mike ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 13:03 ` Mark H Weaver 2012-01-10 13:09 ` Mike Gran @ 2012-01-10 15:41 ` Mark H Weaver 2012-01-10 15:48 ` David Kastrup 1 sibling, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-10 15:41 UTC (permalink / raw) To: Mike Gran; +Cc: Ludovic Courtès, guile-devel Mike Gran <spk121@yahoo.com> wrote: >> Right now it seems that zero-length shared substring of a wide string is >> wide. A zero-length substring still shares the stringbuf of the >> original string. > [...] >> What do you think about that? Do zero-length substrings need to >> still share stringbufs with their parent strings? I wrote: > I think the answer is: no they don't, and avoiding that might be a > worthwhile optimization, mainly to avoid needlessly holding a reference > to a potentially large stringbuf. I went ahead and committed this optimization. Empty substrings are now always freshly allocated, and never hold a reference to the original stringbuf. I also added another optimization: `scm_i_make_string' now uses a common null_stringbuf when creating empty strings. The string object itself is still freshly allocated. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
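The distinction drawn above, a shared buffer for empty strings but a freshly allocated string object each time, can be sketched in plain C. The struct layout and the names `make_string` and `substring_shared` are hypothetical and only model the idea; they are not Guile's actual definitions. Pointer identity of the `struct string` header stands in for `eq?`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; names and layout are illustrative, not Guile's.  */
struct stringbuf { size_t len; char *chars; };
struct string    { struct stringbuf *buf; size_t start; size_t len; };

/* One buffer shared by every empty string.  */
static struct stringbuf null_stringbuf = { 0, NULL };

static void *
xmalloc (size_t n)
{
  void *p = malloc (n);
  if (p == NULL)
    abort ();
  return p;
}

/* The string header is always fresh, so two empty strings are distinct
   objects even though they point at the same null_stringbuf.  */
static struct string *
make_string (size_t len)
{
  struct string *s = xmalloc (sizeof *s);

  if (len == 0)
    s->buf = &null_stringbuf;           /* no buffer allocation needed */
  else
    {
      s->buf = xmalloc (sizeof *s->buf);
      s->buf->len = len;
      s->buf->chars = calloc (len, 1);
      if (s->buf->chars == NULL)
        abort ();
    }
  s->start = 0;
  s->len = len;
  return s;
}

/* A substring shares its parent's buffer, except when it is empty:
   an empty substring never keeps the parent's buffer alive.  */
static struct string *
substring_shared (const struct string *parent, size_t start, size_t end)
{
  struct string *s;

  assert (start <= end && end <= parent->len);
  if (start == end)
    return make_string (0);

  s = xmalloc (sizeof *s);
  s->buf = parent->buf;
  s->start = parent->start + start;
  s->len = end - start;
  return s;
}
```

In this model the only cost of keeping empty strings distinct is one small header allocation per string; no character storage is allocated and no reference to a large parent buffer is retained.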
* Re: The empty string and other empty strings 2012-01-10 15:41 ` Mark H Weaver @ 2012-01-10 15:48 ` David Kastrup 2012-01-10 16:15 ` Mark H Weaver 0 siblings, 1 reply; 117+ messages in thread From: David Kastrup @ 2012-01-10 15:48 UTC (permalink / raw) To: guile-devel Mark H Weaver <mhw@netris.org> writes: > Mike Gran <spk121@yahoo.com> wrote: >>> Right now it seems that zero-length shared substring of a wide string is >>> wide. A zero-length substring still shares the stringbuf of the >>> original string. >> [...] >>> What do you think about that? Do zero-length substrings need to >>> still share stringbufs with their parent strings? > > I wrote: >> I think the answer is: no they don't, and avoiding that might be a >> worthwhile optimization, mainly to avoid needlessly holding a reference >> to a potentially large stringbuf. > > I went ahead and committed this optimization. Empty substrings are now > always freshly allocated, and never hold a reference to the original > stringbuf. Why would they need an allocation at all? They don't contain characters. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 15:48 ` David Kastrup @ 2012-01-10 16:15 ` Mark H Weaver 2012-01-12 22:33 ` Ludovic Courtès 0 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-10 16:15 UTC (permalink / raw) To: David Kastrup; +Cc: guile-devel David Kastrup <dak@gnu.org> writes: > Mark H Weaver <mhw@netris.org> writes: > >> I went ahead and committed this optimization. Empty substrings are now >> always freshly allocated, and never hold a reference to the original >> stringbuf. > > Why would they need an allocation at all? They don't contain > characters. It is an arguable point, but although they don't contain characters, they can still be compared with other objects using `eq?'. The R5RS says that `string', `make-string', `substring', `string-append', `list->string', and `string-copy' return a newly allocated string, which implies that the returned string is not `eq?' to any other existing object. Admittedly, the R5RS also says that `list' returns a newly allocated list, which obviously cannot be true for the empty list. Nonetheless, it still seems safer to strictly follow the standard here. I expect that most implementations produce newly allocated empty strings (since that's what naturally happens unless you handle empty strings specially) and some programs might depend on this behavior, especially since the standard seems to mandate it. On the other hand, I don't expect that enough empty strings are created to make the optimization very significant, though perhaps I'm mistaken. Empty string literals ("") will still be shared, for what it's worth. What do other people think? Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 16:15 ` Mark H Weaver @ 2012-01-12 22:33 ` Ludovic Courtès 2012-01-13 9:27 ` David Kastrup 0 siblings, 1 reply; 117+ messages in thread From: Ludovic Courtès @ 2012-01-12 22:33 UTC (permalink / raw) To: guile-devel Hi Mark, Mark H Weaver <mhw@netris.org> skribis: > What do other people think? As you said, R5RS makes it clear that there can be several (in the sense of eq?) empty strings, so I think what you did is the right thing. Thanks! Ludo’. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-12 22:33 ` Ludovic Courtès @ 2012-01-13 9:27 ` David Kastrup 2012-01-13 16:39 ` Mark H Weaver 0 siblings, 1 reply; 117+ messages in thread From: David Kastrup @ 2012-01-13 9:27 UTC (permalink / raw) To: guile-devel ludo@gnu.org (Ludovic Courtès) writes: > Hi Mark, > > Mark H Weaver <mhw@netris.org> skribis: > >> What do other people think? > > As you said, R5RS makes it clear that there can be several (in the sense > of eq?) empty strings, so I think what you did is the right thing. Since it uses the same verbiage with regard to '(), could you please point out _where_ R5RS states that "freshly allocated" means "not eq?"? For me it means "does not contain any component in common with previously allocated material". The fixed constant '() or (list) (the neutral element with regard to list concatenation) not containing any allocated pairs meets that description, and the fixed constant "" or (string) (the neutral element with regard to string concatenation) not containing any allocated characters meets that description. So why treat them differently? What does it buy us except trouble? -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-13 9:27 ` David Kastrup @ 2012-01-13 16:39 ` Mark H Weaver 2012-01-13 17:36 ` David Kastrup ` (2 more replies) 0 siblings, 3 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-13 16:39 UTC (permalink / raw) To: David Kastrup; +Cc: guile-devel David Kastrup <dak@gnu.org> writes: > ludo@gnu.org (Ludovic Courtès) writes: > >> Hi Mark, >> >> Mark H Weaver <mhw@netris.org> skribis: >> >>> What do other people think? >> >> As you said, R5RS makes it clear that there can be several (in the sense >> of eq?) empty strings, so I think what you did is the right thing. > > Since it uses the same verbiage with regard to '(), could you please > point out _where_ R5RS states that "freshly allocated" means "not > eq?"? Section 3.4 (Storage model) of the R5RS states: Whenever this report speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them. > For me it means "does not contain any component in common with > previously allocated material". The fixed constant '() or (list) (the > neutral element with regard to list concatenation) not containing any > allocated pairs meets that description, and the fixed constant "" or > (string) (the neutral element with regard to string concatenation) not > containing any allocated characters meets that description. I think this is a very reasonable interpretation, but this is not in accordance with the standard. > So why treat them differently? What does it buy us except trouble? I don't see how our current behavior buys us _any_ trouble. We've voluntarily opted-out of a (marginal) optimization opportunity, and that's all. In your proposed behavior: in _almost_ all cases, `scm_from_stringn' (et al) would return an object that is not `eq?' 
to any other existing object. However, in a single edge case, you'd have it return something that _is_ `eq?' to other existing objects. This is the kind of behavior that could easily buy us trouble. To my mind, if the optimization is insignificant (and I suspect that it is), then it is safer to treat the edge cases the same as the common case, for the sake of simplifying the semantics. However, my mind is not set in stone on this. Does anyone else here agree with David? Should we defend the legitimacy of this optimization, and ask the R7RS working group to include explicit language specifying that empty strings/vectors need not be freshly allocated? Thanks, Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-13 16:39 ` Mark H Weaver @ 2012-01-13 17:36 ` David Kastrup 2012-01-16 8:26 ` Marijn 2012-01-20 21:31 ` Andy Wingo 2 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-13 17:36 UTC (permalink / raw) To: guile-devel Mark H Weaver <mhw@netris.org> writes: > David Kastrup <dak@gnu.org> writes: > >> ludo@gnu.org (Ludovic Courtès) writes: >> >>> Hi Mark, >>> >>> Mark H Weaver <mhw@netris.org> skribis: >>> >>>> What do other people think? >>> >>> As you said, R5RS makes it clear that there can be several (in the sense >>> of eq?) empty strings, so I think what you did is the right thing. >> >> Since it uses the same verbiage with regard to '(), could you please >> point out _where_ R5RS states that "freshly allocated" means "not >> eq?"? > > Section 3.4 (Storage model) of the R5RS states: > > Whenever this report speaks of storage being allocated for a variable > or object, what is meant is that an appropriate number of locations > are chosen from the set of locations that are not in use, and the > chosen locations are marked to indicate that they are now in use > before the variable or object is made to denote them. And that's perfectly fine for the characters of a string. However, (string) has no characters. Like (list) has no list members. (list) does not need _any_ allocation, and neither would (string). For me it makes sense to make the fundamental building block of a type a self-contained value. For multi-value non-composite types (like numerical types) that is not necessarily feasible. For composite types with a single elementary non-composite value, it makes sense for me to make this value a basic cell value. Since empty strings are valid substrings of both mutable and non-mutable strings, I don't see that it makes sense to apply either property to them since it is impossible to change any character through them. 
So there are a number of operations which should for consistency's sake be able to check for this special value efficiently. Reserving a cell value for it seems like the straightforward thing to do, and that is what is done with lists also. >> For me it means "does not contain any component in common with >> previously allocated material". The fixed constant '() or (list) >> (the neutral element with regard to list concatenation) not >> containing any allocated pairs meets that description, and the fixed >> constant "" or (string) (the neutral element with regard to string >> concatenation) not containing any allocated characters meets that >> description. > > I think this is a very reasonable interpretation, but this is not in > accordance with the standard. Are you saying that (eq? (list) (list)) is not in accordance with the standard since the standard specifies that a freshly allocated list is to be returned? >> So why treat them differently? What does it buy us except trouble? > > I don't see how our current behavior buys us _any_ trouble. We've > voluntarily opted-out of a (marginal) optimization opportunity, and > that's all. > > In your proposed behavior: in _almost_ all cases, `scm_from_stringn' > (et al) would return an object that is not `eq?' to any other existing > object. However, in a single edge case, you'd have it return > something that _is_ `eq?' to other existing objects. This is the kind > of behavior that could easily buy us trouble. Why? You can't change any other value _through_ it. Do you want to use (string) as a not-eq-to-anything sentinel like Lisp people do with (list nil) sometimes? It is known that (list) will not do for that purpose (in spite of the standard saying that list will return a freshly allocated list), so do you really think people will expect (string) to do? 
> To my mind, if the optimization is insignificant (and I suspect that > it is), then it is safer to treat the edge cases the same as the > common case, for the sake of simplifying the semantics. You'll find yourself to be checking for "" more often in connection with strings than for 0 in connection with numbers because "" is special in that it contains no characters or other members. So for me "" is a prime candidate for a single-cell constant. We can live with other objects like 0 not being eq to equal values, so we certainly can with this one. > However, my mind is not set in stone on this. Does anyone else here > agree with David? Should we defend the legitimacy of this > optimization, and ask the R7RS working group to include explicit > language specifying that empty strings/vectors need not be freshly > allocated? They don't specify that empty lists need not be freshly allocated, either, so it would be strange to make a difference here. I think it makes more sense to define "freshly allocated" instead, as "no pre-existing object can be modified through any operation on it". That means that any single-cell constant is by definition "freshly allocated". And indeed, its _cell_ is freshly allocated even though that cell _value_ may be eq? to that of other cells. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-13 16:39 ` Mark H Weaver 2012-01-13 17:36 ` David Kastrup @ 2012-01-16 8:26 ` Marijn 2012-01-16 8:47 ` David Kastrup 2012-01-20 21:31 ` Andy Wingo 2 siblings, 1 reply; 117+ messages in thread From: Marijn @ 2012-01-16 8:26 UTC (permalink / raw) To: Mark H Weaver; +Cc: David Kastrup, guile-devel On 13-01-12 17:39, Mark H Weaver wrote: > David Kastrup <dak@gnu.org> writes: > > However, my mind is not set in stone on this. Does anyone else > here agree with David? Should we defend the legitimacy of this > optimization, and ask the R7RS working group to include explicit > language specifying that empty strings/vectors need not be freshly > allocated? It seems to me that it can't hurt to ask for clarification of this issue on scheme-reports. Personally I think the intent of the standard is to say that you cannot expect (string) to be un-eq? nor eq? to (string), but let's get a broader perspective. Marijn ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-16 8:26 ` Marijn @ 2012-01-16 8:47 ` David Kastrup 0 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-16 8:47 UTC (permalink / raw) To: guile-devel Marijn <hkBst@gentoo.org> writes: > On 13-01-12 17:39, Mark H Weaver wrote: >> David Kastrup <dak@gnu.org> writes: >> >> However, my mind is not set in stone on this. Does anyone else >> here agree with David? Should we defend the legitimacy of this >> optimization, and ask the R7RS working group to include explicit >> language specifying that empty strings/vectors need not be freshly >> allocated? > > It seems to me that it can't hurt to ask for clarification of this > issue on scheme-reports. Personally I think the intent of the standard > is to say that you cannot expect (string) to be un-eq? nor eq? to > (string), but let's get a broader perspective. It might be worth pointing out the similarity to (list) and (list) and '(). I think that eq-ness of memberless structures of type list and string (which also could allow mutable and immutable variants to be identical) is worth giving separate mention as it is a special case that has semantics with regard to eq-ness and mutability and "freshly allocated" that are nowhere near as obvious as with content-carrying variants. Even if the statement comes down to "can be implemented as", it would avoid choosing inferior implementation options because of trying to split hairs on what amounts to a bald head. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-13 16:39 ` Mark H Weaver 2012-01-13 17:36 ` David Kastrup 2012-01-16 8:26 ` Marijn @ 2012-01-20 21:31 ` Andy Wingo 2 siblings, 0 replies; 117+ messages in thread From: Andy Wingo @ 2012-01-20 21:31 UTC (permalink / raw) To: Mark H Weaver; +Cc: David Kastrup, guile-devel On Fri 13 Jan 2012 17:39, Mark H Weaver <mhw@netris.org> writes: > Should we defend the legitimacy of this optimization, and ask the R7RS > working group to include explicit language specifying that empty > strings/vectors need not be freshly allocated? It's a worthwhile question IMO. I'll mail scheme-reports. Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 11:28 ` Mike Gran 2012-01-10 13:03 ` Mark H Weaver @ 2012-01-10 14:10 ` David Kastrup 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-10 14:10 UTC (permalink / raw) To: guile-devel Mike Gran <spk121@yahoo.com> writes: >> From: Ludovic Courtès <ludo@gnu.org> >> A related question: can we have both narrow and wide empty strings? > > The intention is that a string is encoded as wide only if it can't > be encoded as narrow. So _newly created_ empty strings should only be narrow. > > Right now it seems that zero-length shared substring of a wide string is > wide. A zero-length substring still shares the stringbuf of the > original string. That sounds nonsensical to me. If it does not share any characters with the original string, there is no point in having a buffer (or a wide width) at all. Zero-length substrings should not be abused as pointers carrying any meaning. And they should not keep the original string from being collected. > What do you think about that? Do zero-length substrings need to still > share stringbufs with their parent strings? I consider it more a bug than a feature if they do. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 9:13 ` The empty string and other empty strings Ludovic Courtès 2012-01-10 11:28 ` Mike Gran @ 2012-01-10 12:21 ` Mike Gran 2012-01-10 12:27 ` Mark H Weaver 2 siblings, 0 replies; 117+ messages in thread From: Mike Gran @ 2012-01-10 12:21 UTC (permalink / raw) To: Ludovic Courtès, guile-devel@gnu.org > From: Ludovic Courtès <ludo@gnu.org> > I just noticed that there are i18n.test failures on Hydra, which point > at this commit: > > http://hydra.nixos.org/build/1790097 > > I think this is under the C locale, but I haven’t been able to reproduce > it yet. > > Anyway, it seems that before, you couldn’t get any encoding error for > scm_from_stringn ("", "SOME-ENCODING"), whereas now you can. Looks like for zero-length input strings, u32_conv_from_encoding can return NULL. -Mike ^ permalink raw reply [flat|nested] 117+ messages in thread
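The hazard noted above is a conversion routine whose NULL return can mean either "error" or "empty result". One way to sidestep it is to handle the empty case before calling the converter at all. The following plain-C sketch uses a hypothetical stand-in, `convert_stub`, and a hypothetical wrapper, `from_stringn_model`; neither calls libunistring or reproduces Guile's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum status { CONVERT_OK, CONVERT_ERROR };

/* Hypothetical stand-in for a converter whose NULL return is ambiguous:
   it signals failure for non-empty input, but may also be returned for
   an empty input.  It simply widens each byte to 32 bits.  */
static uint32_t *
convert_stub (const char *src, size_t len, size_t *out_len)
{
  uint32_t *out;
  size_t i;

  *out_len = len;
  if (len == 0)
    return NULL;                /* empty result, not a failure */
  out = malloc (len * sizeof *out);
  if (out == NULL)
    return NULL;                /* real failure */
  for (i = 0; i < len; i++)
    out[i] = (unsigned char) src[i];
  return out;
}

/* Wrapper that never hands an empty input to the converter, so a NULL
   coming back from it can only mean an error.  */
static enum status
from_stringn_model (const char *src, size_t len,
                    uint32_t **out, size_t *out_len)
{
  assert (out != NULL && out_len != NULL);

  if (len == 0)
    {
      *out = NULL;
      *out_len = 0;
      return CONVERT_OK;
    }
  *out = convert_stub (src, len, out_len);
  return *out == NULL ? CONVERT_ERROR : CONVERT_OK;
}
```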
* Re: The empty string and other empty strings 2012-01-10 9:13 ` The empty string and other empty strings Ludovic Courtès 2012-01-10 11:28 ` Mike Gran 2012-01-10 12:21 ` Mike Gran @ 2012-01-10 12:27 ` Mark H Weaver 2012-01-10 16:34 ` Ludovic Courtès 2 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-10 12:27 UTC (permalink / raw) To: Ludovic Courtès; +Cc: guile-devel ludo@gnu.org (Ludovic Courtès) writes: > Anyway, it seems that before, you couldn’t get any encoding error for > scm_from_stringn ("", "SOME-ENCODING"), whereas now you can. Good point. I just committed a change to avoid this. > A related question: can we have both narrow and wide empty strings? I see one place where a wide null string could be created: vm-i-loader.c line 115, within the "load-wide-string" vm instruction, calls `scm_i_make_wide_string' but never calls `scm_i_try_narrow_string' as is usually done. I guess this might be because "load-wide-string" is normally never used for strings that contain only Latin 1 characters. Other than that, I don't see how a wide null string could exist outside of temporaries, although it's hard to entirely rule out the possibility. All of the code paths try to avoid using wide strings unless a wide character is actually present. If a wide null string did exist, I don't see what harm it would cause, besides causing failure if `scm_i_string_chars' were applied to it. Thanks, Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 12:27 ` Mark H Weaver @ 2012-01-10 16:34 ` Ludovic Courtès 2012-01-10 17:04 ` David Kastrup 0 siblings, 1 reply; 117+ messages in thread From: Ludovic Courtès @ 2012-01-10 16:34 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel Hello Mark, Mark H Weaver <mhw@netris.org> skribis: > ludo@gnu.org (Ludovic Courtès) writes: >> Anyway, it seems that before, you couldn’t get any encoding error for >> scm_from_stringn ("", "SOME-ENCODING"), whereas now you can. > > Good point. I just committed a change to avoid this. Cool, thanks for the instant reply and fix! And thanks to Mike and you for the remainder of the discussion and optimizations. BTW, I just noticed that R5RS uses the phrase “empty strings” (plural) in the description of ‘eq?’, which means we’re indeed on the right track. Thanks, Ludo’. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: The empty string and other empty strings 2012-01-10 16:34 ` Ludovic Courtès @ 2012-01-10 17:04 ` David Kastrup 0 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-10 17:04 UTC (permalink / raw) To: guile-devel ludo@gnu.org (Ludovic Courtès) writes: > Hello Mark, > > Mark H Weaver <mhw@netris.org> skribis: > >> ludo@gnu.org (Ludovic Courtès) writes: >>> Anyway, it seems that before, you couldn’t get any encoding error for >>> scm_from_stringn ("", "SOME-ENCODING"), whereas now you can. >> >> Good point. I just committed a change to avoid this. > > Cool, thanks for the instant reply and fix! > > And thanks to Mike and you for the remainder of the discussion and > optimizations. > > BTW, I just noticed that R5RS uses the phrase “empty strings” (plural) > in the description of ‘eq?’, which means we’re indeed on the right track. R5RS is supposed to be a standard, not a guessing game. When there is nothing more definitive than splitting hairs in the grammar of the text, I would prefer sane semantics over probably not even intended contortions. "Freshly allocated" for me means that _no_ string operation on pre-existing objects can make this string different from what it is. And since there is no way to share the empty contents of an empty string with other strings, this is true even if every empty string is eq? to every other one. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile BUG: What's wrong with this? 2012-01-06 22:23 ` Guile BUG: " Bruce Korb 2012-01-06 23:11 ` Mark H Weaver @ 2012-01-06 23:28 ` Bruce Korb 1 sibling, 0 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-06 23:28 UTC (permalink / raw) To: guile-devel On 01/06/12 14:23, Bruce Korb wrote: Since I'm dead in the water, I've patched the 2.0.3 source:

$ diff -u srfi-13.c~ srfi-13.c
--- srfi-13.c~ 2011-07-06 15:50:00.000000000 -0700
+++ srfi-13.c 2012-01-06 15:26:44.963324773 -0800
@@ -2088,6 +2088,8 @@
 string_upcase_x (SCM v, size_t start, size_t end)
 {
   size_t k;
+  if (start == end)
+    return v;
 
   v = scm_i_string_start_writing (v);
   for (k = start; k < end; ++k)
@@ -2151,6 +2153,8 @@
 string_downcase_x (SCM v, size_t start, size_t end)
 {
   size_t k;
+  if (start == end)
+    return v;
 
   v = scm_i_string_start_writing (v);
   for (k = start; k < end; ++k)
@@ -2218,6 +2222,8 @@
   SCM ch;
   size_t i;
   int in_word = 0;
+  if (start == end)
+    return str;
 
   str = scm_i_string_start_writing (str);
   for(i = start; i < end; i++)
@@ -2310,6 +2316,8 @@
 string_reverse_x (SCM str, size_t cstart, size_t cend)
 {
   SCM tmp;
+  if (cstart == cend)
+    return;
 
   str = scm_i_string_start_writing (str);
   if (cend > 0)

^ permalink raw reply [flat|nested] 117+ messages in thread
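The idea behind the patch above is that an empty range requires no write, so the writability check can be skipped entirely. A minimal self-contained model in plain C follows; `struct string`, its `read_only` flag, and `string_upcase_in_place` are hypothetical illustrations, not libguile types or functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a string carrying a read-only flag.  */
struct string { char *chars; size_t len; bool read_only; };

/* Upcase the range [start, end) in place.  Returns false when a write
   would be needed but the string is read-only.  An empty range needs no
   write, so it succeeds even on a read-only string.  */
static bool
string_upcase_in_place (struct string *s, size_t start, size_t end)
{
  size_t k;

  assert (start <= end && end <= s->len);

  if (start == end)
    return true;                /* nothing to modify */
  if (s->read_only)
    return false;               /* models the "string is read-only" error */

  for (k = start; k < end; ++k)
    s->chars[k] = (char) toupper ((unsigned char) s->chars[k]);
  return true;
}
```

With this guard, upcasing a read-only empty string is a no-op rather than an error, while a non-empty read-only string is still rejected.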
* Re: Guile: What's wrong with this? 2012-01-06 1:41 ` Mark H Weaver 2012-01-06 2:38 ` Noah Lavine 2012-01-06 13:37 ` Mike Gran @ 2012-01-07 20:57 ` Ian Price 2012-01-08 5:05 ` Mark H Weaver 2 siblings, 1 reply; 117+ messages in thread From: Ian Price @ 2012-01-07 20:57 UTC (permalink / raw) To: Mark H Weaver; +Cc: Bruce Korb, guile-devel Mark H Weaver <mhw@netris.org> writes: > As I understand it, in the Scheme standards (at least before R6RS's > immutable pairs) the rationale behind marking literal constants as > immutable is solely to avoid needlessly making copies of those literals, > while flagging accidental attempts to modify them, since that is almost > certainly a mistake. Erm, if you don't count literals, which were already immutable, then R6RS doesn't have immutable pairs. It does move the mutators to a separate module, but that is not really equivalent, because even if you don't import (rnrs mutable-pairs), another module may mutate pairs returned by your library. Ditto for strings, etc. To quote section 5.10 "Literal constants, the strings returned by symbol->string, records with no mutable fields, and other values explicitly designated as immutable are immutable objects, while all objects created by the other procedures listed in this report are mutable." -- Ian Price "Programming is like pinball. The reward for doing it well is the opportunity to do it again" - from "The Wizardry Compiled" ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-07 20:57 ` Guile: " Ian Price @ 2012-01-08 5:05 ` Mark H Weaver 0 siblings, 0 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-08 5:05 UTC (permalink / raw) To: Ian Price; +Cc: Bruce Korb, guile-devel Hi Ian! Ian Price <ianprice90@googlemail.com> writes: > Mark H Weaver <mhw@netris.org> writes: > >> As I understand it, in the Scheme standards (at least before R6RS's >> immutable pairs) the rationale behind marking literal constants as >> immutable is solely to avoid needlessly making copies of those literals, >> while flagging accidental attempts to modify them, since that is almost >> certainly a mistake. > Erm, if you don't count literals, which were already immutable, then > R6RS doesn't have immutable pairs. It does move the mutators to a > separate module, but that is a not really equivalent, because even if > you don't import (rnrs mutable-pairs), another module may mutate pairs > returned by your library. Ditto for strings,etc. > > To quote section 5.10 > "Literal constants, the strings returned by symbol->string, records with > no mutable fields, and other values explicitly designated as immutable > are immutable objects, while all objects created by the other procedures > listed in this report are mutable." Ah, I guess you're right. I never studied the R6RS carefully outside of its handling of numerics. I wrote "at least before R6RS" to indicate that I was only knowledgeable about earlier versions. Racket's immutable pairs represent a break from the older tradition. Last I looked anyway, Racket's mutable pairs cannot even be accessed with the standard `car' and `cdr'. Therefore, they really are a different (and incompatible) type from mutable pairs. I still suspect that the rationale behind immutable pairs in the R6RS is to discourage mutation of pairs, to give compiler implementations such as Racket the freedom to make pairs truly immutable and thus benefit from a better optimizer. 
However, I mistakenly implied that immutable pairs were a distinct type in the R6RS itself, and for that I apologize. Thanks, Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-06 1:02 ` Mike Gran 2012-01-06 1:41 ` Mark H Weaver @ 2012-01-06 9:23 ` David Kastrup 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-06 9:23 UTC (permalink / raw) To: guile-devel Mike Gran <spk121@yahoo.com> writes: >> `define' merely makes a new reference to an existing object. If you >> want a copy, you must explicitly ask for one (though this could be >> hidden by custom syntax). It would not be desirable for the language to >> make copies automatically as part of the core `define' syntax. For one >> thing, sometimes you don't want a copy. Sometimes you want shared >> mutable objects. > > It is curious that action of 'copy' really means the > action of 'create a copy with different properties'. > > Shouldn't (string-copy "a") create another immutable string? That would be rather pointless. You could just use the original string. > Likewise, shouldn't (substring "abc" 1) return an immutable substring? Why wouldn't you be using substring/shared if you are not going to modify either string? -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 22:18 ` Bruce Korb 2012-01-04 23:22 ` Mike Gran 2012-01-04 23:59 ` Mark H Weaver @ 2012-01-05 7:22 ` David Kastrup 2 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-05 7:22 UTC (permalink / raw) To: guile-devel Bruce Korb <bkorb@gnu.org> writes: > On 01/04/12 13:52, Ian Price wrote: >>> So my main question is: >>> >>> Which is the higher priority, language purity or ease of use? >> That is a loaded question, as it presupposes ease of use is always the >> same thing as impurity e.g. ... > > Absolutely not. Making decisions is always about trade-offs, > otherwise it is not really a decision. That does not apparently preclude the option of marketing it as one. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 17:16 ` Bruce Korb ` (2 preceding siblings ...) 2012-01-04 21:52 ` Ian Price @ 2012-01-04 22:46 ` Ludovic Courtès 3 siblings, 0 replies; 117+ messages in thread From: Ludovic Courtès @ 2012-01-04 22:46 UTC (permalink / raw) To: guile-devel Hello, Bruce Korb <bkorb@gnu.org> skribis: > So my main question is: > > Which is the higher priority, language purity or ease of use? FWIW I think “language purity” is one way to achieve “ease of use” (FSVO “language purity” at least.) Ludo’. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-03 22:24 ` Ludovic Courtès 2012-01-03 23:15 ` Bruce Korb @ 2012-01-04 3:04 ` Mike Gran 2012-01-04 9:35 ` nalaginrut ` (2 more replies) 1 sibling, 3 replies; 117+ messages in thread From: Mike Gran @ 2012-01-04 3:04 UTC (permalink / raw) To: Ludovic Courtès, guile-devel@gnu.org > In many systems it is desirable for constants (i.e. the values of literal > expressions) to reside in read-only-memory. To express this, it is > convenient to imagine that every object that denotes locations is > associated with a flag telling whether that object is mutable or immutable. > In such systems literal constants and the strings returned by > `symbol->string' are immutable objects, while all objects created by > the other procedures listed in this report are mutable. It is an error > to attempt to store a new value into a location that is denoted by an > immutable object. > > In Guile this has been the case since commit > 190d4b0d93599e5b58e773dc6375054c3a6e3dbf. > > The reason for this is that Guile’s compiler tries hard to avoid > duplicating constants in the output bytecode. Thus, modifying a > constant would actually change all other occurrences of that constant in > the code, making it a non-constant. ;-) This is a terrible example of the RnRS promoting some strange idea of mathematical purity over being useful. The idea that the correct way to initialize a string is (define x (string-copy "string")) is awkward. "string" is read-only but copying it makes it modifiable? Copying implies mutability? Copying doesn't imply modifying mutability in any other data type. Why not change the behavior of 'define' to be (define y (substring str 0)) when STR is a read-only string? This would preserve the shared memory if the variable is never modified but still make the string copy-on-write. Regards, Mike ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 3:04 ` Mike Gran @ 2012-01-04 9:35 ` nalaginrut 2012-01-04 9:41 ` David Kastrup 2012-01-04 21:07 ` Ludovic Courtès 2 siblings, 0 replies; 117+ messages in thread From: nalaginrut @ 2012-01-04 9:35 UTC (permalink / raw) To: Mike Gran; +Cc: Ludovic Courtès, guile-devel@gnu.org > > In many systems it is desirable for constants (i.e. the values of literal > > expressions) to reside in read-only-memory. To express this, it is > > convenient to imagine that every object that denotes locations is > > associated with a flag telling whether that object is mutable or immutable. > > In such systems literal constants and the strings returned by > > `symbol->string' are immutable objects, while all objects created by > > the other procedures listed in this report are mutable. It is an error > > to attempt to store a new value into a location that is denoted by an > > immutable object. > > > > In Guile this has been the case since commit > > 190d4b0d93599e5b58e773dc6375054c3a6e3dbf. > > > > The reason for this is that Guile’s compiler tries hard to avoid > > duplicating constants in the output bytecode. Thus, modifying a > > constant would actually change all other occurrences of that constant in > > the code, making it a non-constant. ;-) > > This is a terrible example of the RnRS promoting some strange idea of > mathematical purity over being useful. > > The idea that the correct way to initialize a string is > (define x (string-copy "string")) is awkward. "string" is a read-only > but copying it makes it modifyiable? Copying implies mutability? > > Copying doesn't imply modifying mutability in any other data type. > > Why not change the behavior 'define' to be (define y (substring str 0)) when STR > is a read-only string? This would preserve the shared memory if the variable is never > modified but still make the string copy-on-write. > > Regards, > > Mike > Hi guys! I just pass by and see your dispute. 
I was confused by the new immutable string design too, so I wrote a macro, "make-mutable-string", that hides the string-copy behind an abstraction. If efficiency becomes an issue, one could implement "make-mutable-string" on top of a bytevector instead, and the call sites are easy to substitute with sed. BTW, couldn't we provide an efficient "mutable-string" module as an alternative, behaving like the old versions? I mean it could be a Guile-specific feature. -- GNU Powered it GPL Protected it GOD Blessed it HFG - NalaGinrut --hacker key-- v4sw7CUSMhw6ln6pr8OSFck4ma9u8MLSOFw3WDXGm7g/l8Li6e7t4TNGSb8AGORTDLMen6g6RASZOGCHPa28s1MIr4p-x hackerkey.com ---end key--- ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 3:04 ` Mike Gran 2012-01-04 9:35 ` nalaginrut @ 2012-01-04 9:41 ` David Kastrup 2012-01-04 21:07 ` Ludovic Courtès 2 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-04 9:41 UTC (permalink / raw) To: guile-devel Mike Gran <spk121@yahoo.com> writes: >> In many systems it is desirable for constants (i.e. the values of literal >> expressions) to reside in read-only-memory. To express this, it is >> convenient to imagine that every object that denotes locations is >> associated with a flag telling whether that object is mutable or immutable. >> In such systems literal constants and the strings returned by >> `symbol->string' are immutable objects, while all objects created by >> the other procedures listed in this report are mutable. It is an error >> to attempt to store a new value into a location that is denoted by an >> immutable object. >> >> In Guile this has been the case since commit >> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf. >> >> The reason for this is that Guile’s compiler tries hard to avoid >> duplicating constants in the output bytecode. Thus, modifying a >> constant would actually change all other occurrences of that constant in >> the code, making it a non-constant. ;-) > > This is a terrible example of the RnRS promoting some strange idea of > mathematical purity over being useful. > > The idea that the correct way to initialize a string is > (define x (string-copy "string")) is awkward. "string" is a read-only > but copying it makes it modifyiable? Copying implies mutability? > > Copying doesn't imply modifying mutability in any other data type. Huh? (set-car! '(4 5) 3) => bad (set-car! (list-copy '(4 5)) 3) => ok Similar with literal vectors. Why should strings be different here? -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 3:04 ` Mike Gran 2012-01-04 9:35 ` nalaginrut 2012-01-04 9:41 ` David Kastrup @ 2012-01-04 21:07 ` Ludovic Courtès 2 siblings, 0 replies; 117+ messages in thread From: Ludovic Courtès @ 2012-01-04 21:07 UTC (permalink / raw) To: Mike Gran; +Cc: guile-devel@gnu.org Hi! Mike Gran <spk121@yahoo.com> skribis: >> In many systems it is desirable for constants (i.e. the values of literal >> expressions) to reside in read-only-memory. To express this, it is >> convenient to imagine that every object that denotes locations is >> associated with a flag telling whether that object is mutable or immutable. >> In such systems literal constants and the strings returned by >> `symbol->string' are immutable objects, while all objects created by >> the other procedures listed in this report are mutable. It is an error >> to attempt to store a new value into a location that is denoted by an >> immutable object. [...] > The idea that the correct way to initialize a string is > (define x (string-copy "string")) is awkward. "string" is a read-only > but copying it makes it modifyiable? Copying implies mutability? Sort-of: -- library procedure: string-copy string Returns a newly allocated copy of the given STRING. And a “new allocated copy” is mutable. > Copying doesn't imply modifying mutability in any other data type. It’s not about modifying mutability of an object (this can’t be done), but about fresh vs. constant storage. > Why not change the behavior 'define' to be (define y (substring str 0)) when STR > is a read-only string? This would preserve the shared memory if the variable is never > modified but still make the string copy-on-write. I think all sorts of literal strings would have to be treated the same. FTR, all these evaluate to #t: (apply eq? "hello" '("hello")) (apply eq? '(1 2 3) '((1 2 3))) (apply eq? '#(1 2 3) '(#(1 2 3))) This is fine per R5RS (info "(r5rs) Equivalence predicates"), but different from Guile <= 1.8. 
(I use ‘apply’ here to fool peval, which otherwise evaluates the expressions to #f at compile-time. Andy: should peval be hacked to give the same answer?) Thanks, Ludo’. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-03 16:26 ` Guile: " Bruce Korb 2012-01-03 16:30 ` Mike Gran 2012-01-03 22:24 ` Ludovic Courtès @ 2012-01-04 10:03 ` Mark H Weaver 2012-01-04 14:29 ` Mike Gran 2012-01-04 22:37 ` Ludovic Courtès 2 siblings, 2 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-04 10:03 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel Bruce Korb <bruce.korb@gmail.com> writes: > 2. it is completely, utterly wrong to mutilate the > Guile library into such a contortion that it > interprets this: > (define y "hello") > to be a request to create an immutable string anyway. > It very, very plainly says, "make 'y' and fill it with > the string "hello". Making it read only is crazy. No, `define' does not copy an object, it merely makes a new reference to an existing object. This is also true in C for that matter, so this behavior is quite mainstream. For example, the following program dies with SIGSEGV on most modern systems, including GNU/Linux: int main() { char *y = "hello"; y[0] = 'a'; return 0; } Scheme and Guile are the same as C in this respect. Earlier versions of Guile didn't make a copy of the string in this case either, but they lacked the mechanism to detect this error, and allowed you to modify the string literal in the program text itself, which is a _very_ bad idea. For example, look at what Guile 1.8 does: guile> (let loop ((i 0)) (define y "hello") (display y) (newline) (string-set! y i #\a) (loop (1+ i))) hello aello aallo aaalo aaaao aaaaa <then an error> So you see, even in Guile 1.8, (define y "hello") didn't do what you thought it did. It didn't fill y with the string "hello". You were actually changing the program text itself, and that was a serious mistake. I'm sincerely sorry that you got yourself into this mess, but I don't see any good way out of it. 
To fix it as you suggest would be like suggesting that C should change the semantics of char *y = "hello" to automatically do a strcpy because some existing programs were in the habit of modifying the string constants of the program text. That way lies madness. If you want to make a copy of a string constant from the program text as a starting point for mutating the string, then you need to explicitly copy it, just like in C. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
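The explicit copy mentioned at the end of the message above might look like the following in C. This is a hedged sketch, not code from the thread: the helper name demo_mutable_copy is made up for illustration, and it plays roughly the role that string-copy plays on the Scheme side.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated, writable copy of a string literal (or of
   any NUL-terminated string).  The caller owns the result and must
   free() it.  Returns NULL if allocation fails.
   demo_mutable_copy is a hypothetical name used only in this sketch. */
char *demo_mutable_copy(const char *literal)
{
    assert(literal != NULL);

    size_t n = strlen(literal) + 1;      /* include the terminating NUL */
    char *copy = (char *) malloc(n);
    if (copy != NULL)
        memcpy(copy, literal, n);
    return copy;
}
```

Writing through the returned pointer is well defined, and the literal in the program text is left untouched.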
* Re: Guile: What's wrong with this? 2012-01-04 10:03 ` Mark H Weaver @ 2012-01-04 14:29 ` Mike Gran 2012-01-04 14:45 ` David Kastrup ` (2 more replies) 2012-01-04 22:37 ` Ludovic Courtès 1 sibling, 3 replies; 117+ messages in thread From: Mike Gran @ 2012-01-04 14:29 UTC (permalink / raw) To: Mark H Weaver, Bruce Korb; +Cc: guile-devel@gnu.org > From: Mark H Weaver <mhw@netris.org> > No, `define' does not copy an object, it merely makes a new reference to > an existing object. This is also true in C for that matter, so this > behavior is quite mainstream. For example, the following program dies > with SIGSEGV on most modern systems, including GNU/Linux: > > int > main() > { > char *y = "hello"; > y[0] = 'a'; > return 0; > } True, but the following also is quite mainstream int main() { char y[6] = "hello"; y[0] = 'a'; return 0; } C provides a way to create and initialize a mutable string. > Scheme and Guile are the same as C in this respect. Earlier versions of > Guile didn't make a copy of the string in this case either, but it > lacked the mechanism to detect this error, and allowed you to modify the > string literal in the program text itself, which is a _very_ bad idea. It all depends on your mental model. You're saying that (define y "hello") attaches "hello" to y, and since "hello" is immutable, the string y contains must be immutable. This is an argument based on purity, not utility. If you follow that logic, then Guile is left without any shorthand to create and initialize a mutable string other than (define y (substring "hello" 0)) or (define y (string-copy "hello")) Someone coming from any other language would be surprised to find that the above is what you need to do to create and initialize a mutable string, I think. But 'define' just as easily can be considered a generic constructor that is overloaded in a C++ sense, and when "hello" is a string, y is assigned a copy-on-write version of the immutable string. 
It was wrong to change this without deprecating it first. Thanks, Mike Gran ^ permalink raw reply [flat|nested] 117+ messages in thread
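The copy-on-write behavior proposed in the message above can be illustrated with a small C sketch. Everything here is an assumption made for illustration: the cow_string struct and the cow_* functions are hypothetical, and nothing in the thread says Guile implements 'define' this way. The sketch only shows the mechanism: share the literal's storage until the first write, then switch to a private copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical copy-on-write string; not Guile's representation. */
struct cow_string {
    const char *shared;  /* borrowed literal storage, never written */
    char *owned;         /* private copy, NULL until the first write */
};

/* Wrap a literal without copying it. */
struct cow_string cow_from_literal(const char *literal)
{
    struct cow_string s = { literal, NULL };
    return s;
}

/* Current contents: the private copy if one exists, else the literal. */
const char *cow_chars(const struct cow_string *s)
{
    return s->owned != NULL ? s->owned : s->shared;
}

/* Set character i to c, copying the literal on the first write.
   Returns 0 on success, -1 if allocation fails. */
int cow_set(struct cow_string *s, size_t i, char c)
{
    assert(i < strlen(cow_chars(s)));

    if (s->owned == NULL) {
        size_t n = strlen(s->shared) + 1;
        s->owned = (char *) malloc(n);
        if (s->owned == NULL)
            return -1;
        memcpy(s->owned, s->shared, n);
    }
    s->owned[i] = c;
    return 0;
}

/* Release the private copy, if any. */
void cow_free(struct cow_string *s)
{
    free(s->owned);
    s->owned = NULL;
}
```

A string that is never modified keeps pointing at the literal, which is the memory-sharing property the proposal wants to preserve; the cost is an extra check on every write.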
* Re: Guile: What's wrong with this? 2012-01-04 14:29 ` Mike Gran @ 2012-01-04 14:45 ` David Kastrup 2012-01-04 16:47 ` Andy Wingo 2012-01-04 17:19 ` Mark H Weaver 2 siblings, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-04 14:45 UTC (permalink / raw) To: guile-devel Mike Gran <spk121@yahoo.com> writes: > If you follow that logic, then Guile is left without any shorthand > to create and initialize a mutable string other than > > (define y (substring "hello" 0)) > or > (define y (string-copy "hello")) Sure. Guile does not have shorthands for _mutable_ literals for lists or vectors either. One of the most significant points of a literal is that you can rely on it staying the same. > Someone coming from any other language would be surpised to find that > the above is what you need to do to create an initialize a mutable > string, I think. I don't know any language that permits the modification of literals. > But 'define' just as easily can be considered a generic constructor > that is overloaded in a C++ sense, It can be considered a lot of things that don't make sense. > and when "hello" is a string, y is assigned a copy-on-write version of > the immutable string. It was wrong to change this without > deprecating it first. Modifying literals _never_ _ever_ was guaranteed to lead to predictable results. Undefined behavior before, undefined behavior afterwards. There is no point in _deprecating_ something that _always_ was undefined behavior. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 14:29 ` Mike Gran 2012-01-04 14:45 ` David Kastrup @ 2012-01-04 16:47 ` Andy Wingo 2012-01-04 17:14 ` David Kastrup ` (2 more replies) 2012-01-04 17:19 ` Mark H Weaver 2 siblings, 3 replies; 117+ messages in thread From: Andy Wingo @ 2012-01-04 16:47 UTC (permalink / raw) To: Mike Gran; +Cc: Mark H Weaver, Bruce Korb, guile-devel@gnu.org On Wed 04 Jan 2012 09:29, Mike Gran <spk121@yahoo.com> writes: > char y[6] = "hello"; > > C provides a way to create and initialize a mutable string. This one is more like (define y (string #\h #\e #\l #\l #\o)) just like (define y (list #\h #\e #\l #\l #\o)) (define y (vector #\h #\e #\l #\l #\o)) etc. > It all depends on your mental model. You're saying that (define y "hello") > attaches "hello" to y, and since "hello" is immutable, the string y > contains must be immutable. This is what the Scheme standard says, yes. > This is an argument based on purity, not utility. You don't think optimizations are of any use, then? :-) Immutable literals allow literals to be coalesced, leading to the impressive 2x speed improvements in Dorodango startup time, some months back. > It was wrong to change this without deprecating it first. I am not certain that is the case. Mutating string literals has always been an error in Scheme. It did "work" with Guile 1.8 and before; but since 1.9.0 when the compiler was introduced and started coalescing literals, it has had the possibility to cause bugs. The changes in 2.0.1 prevented those bugs by marking those strings as immutable. I was going to propose a workaround with an option to change vm-i-loader.c:43 and vm-i-loader.c:115 to use a scm_i_mutable_string_literals_p instead of 1, but that really seems like the path to perdition: previously compiled modules would start creating mutable strings where they really shouldn't. We could add a compiler option to turn string literals into (string-copy FOO). Perhaps that's the thing to do. 
Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 16:47 ` Andy Wingo @ 2012-01-04 17:14 ` David Kastrup 2012-01-04 17:32 ` Andy Wingo 2012-01-04 17:30 ` Bruce Korb 2012-01-04 18:31 ` Guile: What's wrong with this? Mark H Weaver 2 siblings, 1 reply; 117+ messages in thread From: David Kastrup @ 2012-01-04 17:14 UTC (permalink / raw) To: guile-devel Andy Wingo <wingo@pobox.com> writes: > We could add a compiler option to turn string literals into > (string-copy FOO). Perhaps that's the thing to do. What for? It would mean that a literal would not be eq? to itself, a nightmare for memoization purposes. And for what? For making code with explicitly undefined behavior exhibit a particular behavior that is undesirable in general. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 17:14 ` David Kastrup @ 2012-01-04 17:32 ` Andy Wingo 2012-01-04 17:49 ` David Kastrup 0 siblings, 1 reply; 117+ messages in thread From: Andy Wingo @ 2012-01-04 17:32 UTC (permalink / raw) To: David Kastrup; +Cc: guile-devel On Wed 04 Jan 2012 12:14, David Kastrup <dak@gnu.org> writes: > Andy Wingo <wingo@pobox.com> writes: > >> We could add a compiler option to turn string literals into >> (string-copy FOO). Perhaps that's the thing to do. > > What for? It would mean that a literal would not be eq? to itself, a > nightmare for memoization purposes. (eq? "hello" "hello") This expression may be true or false. It will be true in some circumstances and false in others, in all versions of Guile. > And for what? For making code with explicitly undefined behavior > exhibit a particular behavior that is undesirable in general. The Scheme reports and the Guile manual are both positive and negative specification: they require the implementation to do certain things, and they allow it to do certain others. Eq? on literals is one of the liberties afforded to the implementation, and with good reason. Correct programs don't assume anything about the identities (in the sense of eq?) of literals. Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 17:32 ` Andy Wingo @ 2012-01-04 17:49 ` David Kastrup 2012-01-04 18:09 ` Andy Wingo 0 siblings, 1 reply; 117+ messages in thread From: David Kastrup @ 2012-01-04 17:49 UTC (permalink / raw) To: guile-devel Andy Wingo <wingo@pobox.com> writes: > On Wed 04 Jan 2012 12:14, David Kastrup <dak@gnu.org> writes: > >> Andy Wingo <wingo@pobox.com> writes: >> >>> We could add a compiler option to turn string literals into >>> (string-copy FOO). Perhaps that's the thing to do. >> >> What for? It would mean that a literal would not be eq? to itself, a >> nightmare for memoization purposes. > > (eq? "hello" "hello") > > This expression may be true or false. It will be true in some > circumstances and false in others, in all versions of Guile. To itself. Not to a literal written in the same manner. (define (zap) "hello") (eq? (zap) (zap)) This expression may not choose to be true or false. >> And for what? For making code with explicitly undefined behavior >> exhibit a particular behavior that is undesirable in general. > > The Scheme reports and the Guile manual are both positive and negative > specification: they require the implementation to do certain things, > and they allow it to do certain others. Eq? on literals is one of the > liberties afforded to the implementation, and with good reason. > Correct programs don't assume anything about the identities (in the > sense of eq?) of literals. Of _different_ literals spelled in the same way. But one and the same literal has to be eq? to itself. It can't just replace itself with a non-eq? copy on a whim. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 17:49 ` David Kastrup @ 2012-01-04 18:09 ` Andy Wingo 0 siblings, 0 replies; 117+ messages in thread From: Andy Wingo @ 2012-01-04 18:09 UTC (permalink / raw) To: David Kastrup; +Cc: guile-devel On Wed 04 Jan 2012 12:49, David Kastrup <dak@gnu.org> writes: > (define (zap) "hello") > (eq? (zap) (zap)) > > This expression may not choose to be true or false. Indeed, good point. Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
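The distinction drawn in the exchange above, between one literal compared with itself and two literals that merely look alike, has a close analogue in C pointer identity. The sketch below is offered only as an analogy and is not a statement about Guile internals; the function check_zap is a made-up helper name. In C, the literal inside zap is a single object with static storage, so every call returns the same address, whereas whether two separate "hello" literals share storage is left unspecified by the C standard.

```c
#include <assert.h>

/* Each call returns a pointer to the same string literal object. */
const char *zap(void)
{
    return "hello";
}

/* check_zap is a hypothetical helper for this sketch.  The comparison
   below always holds: it is one literal compared with itself.  By
   contrast, comparing the addresses of two separately written "hello"
   literals may give either answer, because an implementation is free
   to merge them or keep them distinct. */
void check_zap(void)
{
    assert(zap() == zap());
}
```

This mirrors the (eq? (zap) (zap)) example: a compiler option that copied the literal on every evaluation would break exactly this identity.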
* Re: Guile: What's wrong with this? 2012-01-04 16:47 ` Andy Wingo 2012-01-04 17:14 ` David Kastrup @ 2012-01-04 17:30 ` Bruce Korb 2012-01-04 17:44 ` David Kastrup 2012-01-04 18:26 ` Ian Price 2012-01-04 18:31 ` Guile: What's wrong with this? Mark H Weaver 2 siblings, 2 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-04 17:30 UTC (permalink / raw) To: Andy Wingo; +Cc: Mark H Weaver, guile-devel@gnu.org On 01/04/12 08:47, Andy Wingo wrote: > I was going to propose a workaround with an option to change > vm-i-loader.c:43 and vm-i-loader.c:115 to use a > scm_i_mutable_string_literals_p instead of 1, but that really seems like > the path to perdition: previously compiled modules would start creating > mutable strings where they really shouldn't. Instead, long-standing, previously written code was invalidated with 1.9, even if we were not smacked down until 2.0.1. Just because an obscure-to-those-not-living-and-breathing-Scheme-daily document said it was okay doesn't make it okay to those whacked by it. I would think recompiling should not be a great burden, *ESPECIALLY* given that it is a recent invention and therefore likely to have some initial issues that need dealing with. Like this, for example. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 17:30 ` Bruce Korb @ 2012-01-04 17:44 ` David Kastrup 2012-01-04 18:26 ` Ian Price 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-04 17:44 UTC (permalink / raw) To: guile-devel Bruce Korb <bruce.korb@gmail.com> writes: > On 01/04/12 08:47, Andy Wingo wrote: >> I was going to propose a workaround with an option to change >> vm-i-loader.c:43 and vm-i-loader.c:115 to use a >> scm_i_mutable_string_literals_p instead of 1, but that really seems like >> the path to perdition: previously compiled modules would start creating >> mutable strings where they really shouldn't. > > Instead, long-standing, previously written code was invalidated with > 1.9, even if we were not smacked down until 2.0.1. Yes, that is an inherent problem of writing code with undefined behavior. The only way to keep it working in the exact same manner is to use the exact same interpreter. And in the age of allocation randomization and multi-threading, not even that is reliable. > Just because an obscure-to-those-not-living-and-breathing-Scheme-daily > document said it was okay doesn't make it okay to those whacked by it. There was _never_ _any_ document that stated writing to literals was ok. You did so entirely on your own initiative and just were lucky that it happened to work under certain circumstances for a while. If people like to whack themselves, there is little one can do to keep them from doing so. They'll always find a way. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 17:30 ` Bruce Korb 2012-01-04 17:44 ` David Kastrup @ 2012-01-04 18:26 ` Ian Price 2012-01-04 18:48 ` Mark H Weaver 2012-01-04 19:29 ` Bruce Korb 1 sibling, 2 replies; 117+ messages in thread From: Ian Price @ 2012-01-04 18:26 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel@gnu.org Bruce Korb <bruce.korb@gmail.com> writes: > On 01/04/12 08:47, Andy Wingo wrote: >> I was going to propose a workaround with an option to change >> vm-i-loader.c:43 and vm-i-loader.c:115 to use a >> scm_i_mutable_string_literals_p instead of 1, but that really seems like >> the path to perdition: previously compiled modules would start creating >> mutable strings where they really shouldn't. > > Instead, long-standing, previously written code was invalidated with 1.9, long-standing, previously written _buggy_ code. > even if we were not smacked down until 2.0.1. > > Just because an obscure-to-those-not-living-and-breathing-Scheme-daily > document said it was okay doesn't make it okay to those whacked by it. There's an old saying, "Ignorance of the law is no excuse". If I wrote C code that doesn't conform to the C standard and depended on implementation specific behaviour, I have no recourse if it breaks on a different compiler. Guile explicitly claims to conform to the r5rs (and partially to the r6rs), both of which make this behaviour undefined, and srfi 13 explicitly makes this an error. (And FWIW I would not consider the R5RS obscure to people who have used scheme for even a short while, nor is it a terrific burden to read at 50 pages) Now, if you want to argue your position, it'd be better to argue that guile goes beyond r[56]rs in making these promises with regards to strings. For instance, substring-fill! as found at https://www.gnu.org/software/guile/manual/html_node/String-Modification.html implies that string literals are mutable — Scheme Procedure: substring-fill! 
str start end fill — C Function: scm_substring_fill_x (str, start, end, fill) Change every character in str between start and end to fill. (define y "abcdefg") (substring-fill! y 1 3 #\r) y ⇒ "arrdefg" So too does string-upcase! (https://www.gnu.org/software/guile/manual/html_node/Alphabetic-Case-Mapping.html), if we assume y is the same binding in both functions — Scheme Procedure: string-upcase! str [start [end]] — C Function: scm_substring_upcase_x (str, start, end) — C Function: scm_string_upcase_x (str) Destructively upcase every character in str. (string-upcase! y) ⇒ "ARRDEFG" y ⇒ "ARRDEFG" The same goes for string-downcase! and string-capitalize! I think it would be fair to say that someone could surmise that literal strings are meant to be mutable from these examples, and, if we do go down the immutable string literal route these examples would need to be addressed. On the other hand, you can argue that string literal immutability is implied by — Scheme Procedure: string-for-each-index proc s [start [end]] — C Function: scm_string_for_each_index (proc, s, start, end) Call (proc i) for each index i in s, from left to right. For example, to change characters to alternately upper and lower case, p (define str (string-copy "studly")) (string-for-each-index (lambda (i) (string-set! str i ((if (even? i) char-upcase char-downcase) (string-ref str i)))) str) str ⇒ "StUdLy" but on a purely numerical basis, mutability 4 - 0 immutability > I would think recompiling should not be a great burden, *ESPECIALLY* At this stage, I think that argument is fair enough, other people's mileage may vary. -- Ian Price "Programming is like pinball. The reward for doing it well is the opportunity to do it again" - from "The Wizardy Compiled" ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 18:26 ` Ian Price @ 2012-01-04 18:48 ` Mark H Weaver 2012-01-04 19:29 ` Bruce Korb 1 sibling, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 18:48 UTC (permalink / raw)
To: Ian Price; +Cc: guile-devel

Ian Price <ianprice90@googlemail.com> writes:

> — Scheme Procedure: substring-fill! str start end fill
> — C Function: scm_substring_fill_x (str, start, end, fill)
>
> Change every character in str between start and end to fill.
>
> (define y "abcdefg")
> (substring-fill! y 1 3 #\r)
> y
> ⇒ "arrdefg"
>
> So too does string-upcase!

Ugh, thanks for pointing this out! Fixed. Any others?

Mark

^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 18:26 ` Ian Price 2012-01-04 18:48 ` Mark H Weaver @ 2012-01-04 19:29 ` Bruce Korb 2012-01-04 20:20 ` David Kastrup 2012-01-04 23:19 ` Mark H Weaver 1 sibling, 2 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-04 19:29 UTC (permalink / raw) To: Ian Price; +Cc: guile-devel@gnu.org On 01/04/12 10:26, Ian Price wrote: >> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily >> document said it was okay doesn't make it okay to those whacked by it. > There's an old saying, "Ignorance of the law is no excuse". If I wrote C > code that doesn't conform to the C standard I did. The standard changed. My code broke. The fix for read-only string literals was obvious and straight forward. The fix for pointer aliasing is virtually impossible, except to -fno-strict-aliasing for GCC. Yes, new code, fine, but the millions of lines of old code I deal with? No way. I think I've seen a reasonable way to go forward: an option to always copy newly defined strings. I am also a little curious: since this fault occurred on a string brought in via my C function named ag_scm_get() and it created the value with a call to scm_str02scm, shouldn't that function have created a mutable string copy? > Now, if you want to argue your position, it'd be better to argue that > guile goes beyond r[56]rs in making these promises with regards to strings. My number 1 argument may not be the strongest argument. My number 1 argument is that Guile, being an extension language, needs to be as forgiving and easy to use as it can possibly be because its client programmers (programmers using it) want to know as absolutely little as possible about it. No, I do *not* want to read, understand and remember 50 pages of stuff so that I can use Guile as an extension language. The memory barrier is much, *MUCH* lower for other scripting languages. > For instance, substring-fill! 
as found at > https://www.gnu.org/software/guile/manual/html_node/String-Modification.html > implies that string literals are mutable > > — Scheme Procedure: substring-fill! str start end fill > — C Function: scm_substring_fill_x (str, start, end, fill) > > Change every character in str between start and end to fill. > > (define y "abcdefg") > (substring-fill! y 1 3 #\r) > y > ⇒ "arrdefg" Who knows where I learned the idiom. I learned the minimal amount of Guile needed for my purposes a dozen years ago. My actual problem stems from this: > Backtrace: > In ice-9/boot-9.scm: > 170: 3 [catch #t #<catch-closure 8b75a0> ...] > In unknown file: > ?: 2 [catch-closure] > In ice-9/eval.scm: > 420: 1 [eval # ()] > In unknown file: > ?: 0 [string-upcase ""] > > ERROR: In procedure string-upcase: > ERROR: string is read-only: "" > Scheme evaluation error. AutoGen ABEND-ing in template > confmacs.tlib on line 209 > Failing Guile command: = = = = = > > (set! tmp-text (get "act-text")) > (set! TMP-text (string-upcase tmp-text)) What in heck is string-upcase doing trying to write to its input string? Why was the string returned by ag_scm_get() (the function bound to "get") an immutable string anyway? > SCM > ag_scm_get(SCM agName, SCM altVal) > { > tDefEntry* pE; > ag_bool x; > > pE = (! AG_SCM_STRING_P(agName)) ? NULL : > findDefEntry(ag_scm2zchars(agName, "ag value"), &x); > > if ((pE == NULL) || (pE->valType != VALTYP_TEXT)) { > if (AG_SCM_STRING_P(altVal)) > return altVal; > return AG_SCM_STR02SCM(zNil); > } > > return AG_SCM_STR02SCM(pE->val.pzText); > } "AG_SCM_STR02SCM" is either scm_makfrom0str or scm_from_locale_string, depending on the age of the Guile library. "zNil" is a pointer to a NUL byte that is, indeed, in read only memory, but surely scm_from_locale_string would not have been written in a way to detect that and add that attribute because of doing a memory probe. 
Further, it cannot be implemented in a way that does not copy it because I will most certainly call scm_from_locale_string using a pointer to memory that is immediately deallocated. It *MUST* copy the string. So what is this really about anyway? > I think it would be fair to say that someone could surmise that literal > strings are meant to be mutable from these examples, and, if we do go > down the immutable string literal route these examples would need to be > addressed. :) I think so. Meanwhile, I think the solution to be allowing Guile clients to say, with some initialization code of some sort, "copy my input strings" so the immutability flag is not set. (I do think it correct to not scribble on shared strings....) Thank you for your help! Regards, Bruce ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 19:29 ` Bruce Korb @ 2012-01-04 20:20 ` David Kastrup 2012-01-04 23:19 ` Mark H Weaver 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-04 20:20 UTC (permalink / raw) To: guile-devel Bruce Korb <bruce.korb@gmail.com> writes: > Who knows where I learned the idiom. I learned the minimal amount of > Guile needed for my purposes a dozen years ago. My actual problem > stems from this: > >> Backtrace: >> In ice-9/boot-9.scm: >> 170: 3 [catch #t #<catch-closure 8b75a0> ...] >> In unknown file: >> ?: 2 [catch-closure] >> In ice-9/eval.scm: >> 420: 1 [eval # ()] >> In unknown file: >> ?: 0 [string-upcase ""] >> >> ERROR: In procedure string-upcase: >> ERROR: string is read-only: "" >> Scheme evaluation error. AutoGen ABEND-ing in template >> confmacs.tlib on line 209 >> Failing Guile command: = = = = = >> >> (set! tmp-text (get "act-text")) >> (set! TMP-text (string-upcase tmp-text)) > > What in heck is string-upcase doing trying to write to its input > string? This looks like it might be just a bug. Could be that string-upcase creates its own copy of the string incorrectly including the immutable bit and then tries changing the string. No reason to play helter-skelter with the language. Instead the bug should be fixed. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
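To illustrate the expectation stated above, here is a small sketch, assuming a Guile where the bug under discussion is fixed. The variable names mirror the failing template; the string value is made up for the example. string-upcase should hand back a new string and leave its argument alone, whether or not that argument is mutable:

```scheme
;; A literal (read-only) string and a mutable copy of it.
(define literal-text "act-text")
(define tmp-text (string-copy literal-text))

;; string-upcase is the non-destructive variant: it returns a new string.
(define TMP-text (string-upcase tmp-text))
(display TMP-text) (newline)   ; prints ACT-TEXT
(display tmp-text) (newline)   ; prints act-text (unchanged)

;; It must also accept read-only input, including the empty string.
(display (string-upcase literal-text)) (newline)   ; prints ACT-TEXT
(display (string-length (string-upcase ""))) (newline)   ; prints 0
```

Only string-upcase! (with the exclamation mark) is allowed to require a mutable argument.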
* Re: Guile: What's wrong with this? 2012-01-04 19:29 ` Bruce Korb 2012-01-04 20:20 ` David Kastrup @ 2012-01-04 23:19 ` Mark H Weaver 2012-01-04 23:28 ` Bruce Korb 2012-01-07 15:43 ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver 1 sibling, 2 replies; 117+ messages in thread From: Mark H Weaver @ 2012-01-04 23:19 UTC (permalink / raw) To: Bruce Korb; +Cc: guile-devel [-- Attachment #1: Type: text/plain, Size: 944 bytes --] Bruce Korb <bruce.korb@gmail.com> writes: >> ERROR: In procedure string-upcase: >> ERROR: string is read-only: "" >> Scheme evaluation error. AutoGen ABEND-ing in template >> confmacs.tlib on line 209 >> Failing Guile command: = = = = = >> >> (set! tmp-text (get "act-text")) >> (set! TMP-text (string-upcase tmp-text)) > > What in heck is string-upcase doing trying to write to its input string? > Why was the string returned by ag_scm_get() (the function bound to "get") > an immutable string anyway? Good questions indeed. I spent a bunch of time investigating this, and found some bugs that might have caused this problem, although I'm not certain. Bruce: Can you please see if the patch below fixes this problem? Mike: Would you be willing to review this (very small) patch to see if it makes sense to you? I'd like a second opinion from someone familiar with that subsystem before I commit it. Thanks, Mark [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: [PATCH] Fix bugs related to mutation-sharing substrings --] [-- Type: text/x-patch, Size: 1925 bytes --] From a8da72937ff4d04e8d39531773cc05e676b2be1c Mon Sep 17 00:00:00 2001 From: Mark H Weaver <mhw@netris.org> Date: Wed, 4 Jan 2012 17:59:27 -0500 Subject: [PATCH] Fix bugs related to mutation-sharing substrings * libguile/strings.c (scm_i_is_narrow_string, scm_i_try_narrow_string, scm_i_string_set_x): Check to see if the provided string is a mutation-sharing substring, and do the right thing in that case. 
Previously, if such a string was passed to these functions, they would behave very badly: while trying to fetch and/or mutate the cell containing the stringbuf, they were actually fetching or mutating the cell containing original shared string. That's because mutation-sharing substring store the original string in CELL_1, whereas all other strings store the stringbuf there. --- libguile/strings.c | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/libguile/strings.c b/libguile/strings.c index 666a951..1628aee 100644 --- a/libguile/strings.c +++ b/libguile/strings.c @@ -436,6 +436,9 @@ scm_i_string_length (SCM str) int scm_i_is_narrow_string (SCM str) { + if (IS_SH_STRING (str)) + str = SH_STRING_STRING (str); + return !STRINGBUF_WIDE (STRING_STRINGBUF (str)); } @@ -446,6 +449,9 @@ scm_i_is_narrow_string (SCM str) int scm_i_try_narrow_string (SCM str) { + if (IS_SH_STRING (str)) + str = SH_STRING_STRING (str); + SET_STRING_STRINGBUF (str, narrow_stringbuf (STRING_STRINGBUF (str))); return scm_i_is_narrow_string (str); @@ -664,6 +670,12 @@ scm_i_string_strcmp (SCM sstr, size_t start_x, const char *cstr) void scm_i_string_set_x (SCM str, size_t p, scm_t_wchar chr) { + if (IS_SH_STRING (str)) + { + p += STRING_START (str); + str = SH_STRING_STRING (str); + } + if (chr > 0xFF && scm_i_is_narrow_string (str)) SET_STRING_STRINGBUF (str, wide_stringbuf (STRING_STRINGBUF (str))); -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 23:19 ` Mark H Weaver @ 2012-01-04 23:28 ` Bruce Korb 2012-01-07 15:43 ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver 1 sibling, 0 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-04 23:28 UTC (permalink / raw) To: Mark H Weaver; +Cc: guile-devel On 01/04/12 15:19, Mark H Weaver wrote: > Bruce Korb<bruce.korb@gmail.com> writes: > >>> ERROR: In procedure string-upcase: >>> ERROR: string is read-only: "" >>> Scheme evaluation error. AutoGen ABEND-ing in template >>> confmacs.tlib on line 209 >>> Failing Guile command: = = = = = >>> >>> (set! tmp-text (get "act-text")) >>> (set! TMP-text (string-upcase tmp-text)) >> >> What in heck is string-upcase doing trying to write to its input string? >> Why was the string returned by ag_scm_get() (the function bound to "get") >> an immutable string anyway? > > Good questions indeed. I spent a bunch of time investigating this, and > found some bugs that might have caused this problem, although I'm not > certain. > > Bruce: Can you please see if the patch below fixes this problem? OK. I'll have to play this weekend. I have to download and install Guile sources and, unfortunately, this thread notwithstanding, I do have a day job.... Thank you so much!! Regards, Bruce ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Fixed string corruption bugs (was Guile: What's wrong with this?) 2012-01-04 23:19 ` Mark H Weaver 2012-01-04 23:28 ` Bruce Korb @ 2012-01-07 15:43 ` Mark H Weaver 2012-01-07 16:19 ` Fixed string corruption bugs Andy Wingo 1 sibling, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 15:43 UTC (permalink / raw)
To: guile-devel

> From a8da72937ff4d04e8d39531773cc05e676b2be1c Mon Sep 17 00:00:00 2001
> From: Mark H Weaver <mhw@netris.org>
> Date: Wed, 4 Jan 2012 17:59:27 -0500
> Subject: [PATCH] Fix bugs related to mutation-sharing substrings
>
> * libguile/strings.c (scm_i_is_narrow_string, scm_i_try_narrow_string,
>   scm_i_string_set_x): Check to see if the provided string is a
>   mutation-sharing substring, and do the right thing in that case.
>   Previously, if such a string was passed to these functions, they would
>   behave very badly: while trying to fetch and/or mutate the cell
>   containing the stringbuf, they were actually fetching or mutating the
>   cell containing original shared string. That's because
>   mutation-sharing substring store the original string in CELL_1,
>   whereas all other strings store the stringbuf there.

I committed this. Here's an example that segfaulted before these fixes:

    scheme@(guile-user)> (define s (string-copy "hello"))
    scheme@(guile-user)> (define ss (substring/shared s 1 4))
    scheme@(guile-user)> (string-set! ss 0 #\λ)
    Segmentation fault

Mark

^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Fixed string corruption bugs 2012-01-07 15:43 ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver @ 2012-01-07 16:19 ` Andy Wingo 0 siblings, 0 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-07 16:19 UTC (permalink / raw)
To: Mark H Weaver; +Cc: guile-devel

On Sat 07 Jan 2012 16:43, Mark H Weaver <mhw@netris.org> writes:

>> Subject: [PATCH] Fix bugs related to mutation-sharing substrings

Cool!

> I committed this. Here's an example that segfaulted before these fixes:
>
> scheme@(guile-user)> (define s (string-copy "hello"))
> scheme@(guile-user)> (define ss (substring/shared s 1 4))
> scheme@(guile-user)> (string-set! ss 0 #\λ)
> Segmentation fault

Probably a good idea to add it (or something like it) to the test suite :)

Andy
--
http://wingolog.org/

^ permalink raw reply [flat|nested] 117+ messages in thread
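A rough sketch of what such a regression check might look like, written as plain Scheme rather than in the form used by Guile's own test suite (which is an assumption left out here on purpose). It assumes a Guile that includes the fix; integer->char is used for the lambda character only to keep the snippet independent of source-file encoding:

```scheme
;; U+03BB GREEK SMALL LETTER LAMDA, a character that does not fit in a
;; narrow (one byte per character) string buffer.
(define lam (integer->char #x3bb))

(define s  (string-copy "hello"))
(define ss (substring/shared s 1 4))   ; shares storage with s: "ell"

;; Writing a wide character through the shared substring used to crash.
(string-set! ss 0 lam)

;; The write must be visible through both strings.
(display (string=? ss (string lam #\l #\l))) (newline)          ; prints #t
(display (string=? s  (string #\h lam #\l #\l #\o))) (newline)  ; prints #t
```

The second check matters because substring/shared promises that modifications to either string show up in the other.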
* Re: Guile: What's wrong with this? 2012-01-04 16:47 ` Andy Wingo 2012-01-04 17:14 ` David Kastrup 2012-01-04 17:30 ` Bruce Korb @ 2012-01-04 18:31 ` Mark H Weaver 2012-01-04 18:43 ` Andy Wingo 2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 18:31 UTC (permalink / raw)
To: Andy Wingo; +Cc: Bruce Korb, guile-devel

Andy Wingo <wingo@pobox.com> writes:

> We could add a compiler option to turn string literals into (string-copy
> FOO). Perhaps that's the thing to do.

I think this would be fine, as long as the default is _not_ to copy string literals. This would help Bruce a great deal with very little effort on our part, without mucking up the semantics for anyone else.

David Kastrup <dak@gnu.org> writes:

> What for? It would mean that a literal would not be eq? to itself, a
> nightmare for memoization purposes.

I agree that it should not be the default behavior, but I don't see the harm in allowing users to compile their own code this way. The memoization argument is a bit thin. How often is it useful to memoize against string arguments using eq? as the equality predicate? Remember, this would only be for code that explicitly changed this compilation option.

Best,
Mark

^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 18:31 ` Guile: What's wrong with this? Mark H Weaver @ 2012-01-04 18:43 ` Andy Wingo 2012-01-04 19:29 ` Mark H Weaver 0 siblings, 1 reply; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 18:43 UTC (permalink / raw)
To: Mark H Weaver; +Cc: Bruce Korb, guile-devel

On Wed 04 Jan 2012 13:31, Mark H Weaver <mhw@netris.org> writes:

> Andy Wingo <wingo@pobox.com> writes:
>> We could add a compiler option to turn string literals into (string-copy
>> FOO). Perhaps that's the thing to do.
>
> I think this would be fine, as long as the default is _not_ to copy
> string literals. This would help Bruce a great deal with very little
> effort on our part, without mucking up the semantics for anyone else.

Yes, this was what I was thinking.

> David Kastrup <dak@gnu.org> writes:
>> What for? It would mean that a literal would not be eq? to itself, a
>> nightmare for memoization purposes.
>
> I agree that it should not be the default behavior, but I don't see the
> harm in allowing users to compile their own code this way.

Well, we can fix this too: we can make

    "foo"

transform to

    (copy-once UNIQUE-GENSYM str)

with

    (define (copy-once key str)
      (or (hashq-ref mutable-string-literals key)
          (let ((value (string-copy str)))
            (hashq-set! mutable-string-literals key value)
            value)))

Andy
--
http://wingolog.org/

^ permalink raw reply [flat|nested] 117+ messages in thread
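The copy-once idea above can be tried out by hand. This is only a sketch: the table mutable-string-literals is assumed to be an ordinary hash table, the quoted symbol literal-1 stands in for the UNIQUE-GENSYM the compiler would generate, and greeting is a hypothetical procedure added for the demonstration:

```scheme
;; Assumed backing store: one mutable copy per literal, keyed by eq?.
(define mutable-string-literals (make-hash-table))

(define (copy-once key str)
  (or (hashq-ref mutable-string-literals key)
      (let ((value (string-copy str)))
        (hashq-set! mutable-string-literals key value)
        value)))

;; What the compiler might emit for a procedure whose body is just "hello".
(define (greeting)
  (copy-once 'literal-1 "hello"))

;; Every evaluation yields the very same object, so eq? still holds ...
(display (eq? (greeting) (greeting))) (newline)   ; prints #t

;; ... and a mutation persists across evaluations, as in Guile 1.8.
(string-set! (greeting) 0 #\j)
(display (greeting)) (newline)                    ; prints jello
```

This keeps a literal eq? to itself, at the price of preserving the old behaviour where a mutation is seen by every later evaluation of the same literal.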
* Re: Guile: What's wrong with this? 2012-01-04 18:43 ` Andy Wingo @ 2012-01-04 19:29 ` Mark H Weaver 2012-01-04 19:43 ` Andy Wingo 0 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 19:29 UTC (permalink / raw)
To: Andy Wingo; +Cc: Bruce Korb, guile-devel

Andy Wingo <wingo@pobox.com> writes:

>> David Kastrup <dak@gnu.org> writes:
>>> What for? It would mean that a literal would not be eq? to itself, a
>>> nightmare for memoization purposes.
>>
>> I agree that it should not be the default behavior, but I don't see the
>> harm in allowing users to compile their own code this way.
>
> Well, we can fix this too: we can make
>
>   "foo"
>
> transform to
>
>   (copy-once UNIQUE-GENSYM str)
>
> with
>
>   (define (copy-once key str)
>     (or (hashq-ref mutable-string-literals key)
>         (let ((value (string-copy str)))
>           (hashq-set! mutable-string-literals key value)
>           value)))

Although this is a closer emulation of the previous (broken) behavior, IMHO this would be less desirable than simply doing (string-copy "foo") on every evaluation of "foo", which seems to be what Bruce (and probably others) expected "foo" to do.

For example, based on the mental model that Bruce apparently had when he wrote his code, he might have written something like this:

    (define (hello-world-with-one-char-changed i c)
      (define str "Hello world")
      (string-set! str i c)
      str)

Your UNIQUE-GENSYM hack emulates the previous behavior that makes the above procedure buggy. Simply changing "Hello world" to (string-copy "Hello world") would make the procedure work, and I believe conforms better to what Bruce expects.

Of course, I'm only talking about what I think should be done when the compiler option is changed to non-default behavior. I strongly believe that the _default_ behavior should stay as it is now.

Mark

^ permalink raw reply [flat|nested] 117+ messages in thread
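For comparison, a sketch of the procedure above with the literal copied on every evaluation, which is what the proposed non-default option would amount to for this code. It only uses string-copy and string-set!, so it should behave the same on Guile 1.8 and 2.0:

```scheme
(define (hello-world-with-one-char-changed i c)
  ;; A fresh mutable copy is made on each call, so earlier calls
  ;; cannot leak their changes into later ones.
  (define str (string-copy "Hello world"))
  (string-set! str i c)
  str)

(display (hello-world-with-one-char-changed 0 #\J)) (newline)   ; prints Jello world
(display (hello-world-with-one-char-changed 4 #\a)) (newline)   ; prints Hella world
```

With the copy-once variant, the second call would instead start from the string already changed by the first call.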
* Re: Guile: What's wrong with this? 2012-01-04 19:29 ` Mark H Weaver @ 2012-01-04 19:43 ` Andy Wingo 2012-01-04 20:08 ` Bruce Korb 0 siblings, 1 reply; 117+ messages in thread From: Andy Wingo @ 2012-01-04 19:43 UTC (permalink / raw) To: Mark H Weaver; +Cc: Bruce Korb, guile-devel On Wed 04 Jan 2012 14:29, Mark H Weaver <mhw@netris.org> writes: > Although this is a closer emulation of the previous (broken) behavior, > IMHO this would be less desirable than simply doing (string-copy "foo") > on every evaluation of "foo", which seems to be what Bruce (and probably > others) expected "foo" to do. Thing is, why are we doing this? We know what the correct behavior is, as you say: > Of course, I'm only talking about what I think should be done when the > compiler option is changed to non-default behavior. I strongly believe > that the _default_ behavior should stay as it is now. The correct behavior is the status quo. We are considering adding a hack to produce different behavior for compatibility purposes. We don't have to worry about correctness in that case, only compatibility. IMO anyway :) Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 19:43 ` Andy Wingo @ 2012-01-04 20:08 ` Bruce Korb 2012-01-04 20:14 ` David Kastrup 2012-01-04 20:56 ` Andy Wingo 0 siblings, 2 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-04 20:08 UTC (permalink / raw) To: Andy Wingo; +Cc: Mark H Weaver, guile-devel On 01/04/12 11:43, Andy Wingo wrote: > The correct behavior is the status quo. We are considering adding a > hack to produce different behavior for compatibility purposes. We don't > have to worry about correctness in that case, only compatibility. IMO > anyway :) It would be a nice added benefit if it worked as one would expect. viz., you make actual, writable copies of strings you pull in so that if the string-upcase function were to modify its input, then it would not affect other SCMs with values that happen to be the same sequence of bytes. ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 20:08 ` Bruce Korb @ 2012-01-04 20:14 ` David Kastrup 2012-01-04 20:56 ` Andy Wingo 1 sibling, 0 replies; 117+ messages in thread From: David Kastrup @ 2012-01-04 20:14 UTC (permalink / raw) To: guile-devel Bruce Korb <bruce.korb@gmail.com> writes: > On 01/04/12 11:43, Andy Wingo wrote: >> The correct behavior is the status quo. We are considering adding a >> hack to produce different behavior for compatibility purposes. We don't >> have to worry about correctness in that case, only compatibility. IMO >> anyway :) > > It would be a nice added benefit if it worked as one would expect. > viz., you make actual, writable copies of strings you pull in so that > if the string-upcase function were to modify its input, then it > would not affect other SCMs with values that happen to be the same > sequence of bytes. If string-upcase modifies its input (or needs a mutable string to start with), this is a bug, in contrast to what string-upcase! may do. -- David Kastrup ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 20:08 ` Bruce Korb 2012-01-04 20:14 ` David Kastrup @ 2012-01-04 20:56 ` Andy Wingo 2012-01-04 21:30 ` Bruce Korb 1 sibling, 1 reply; 117+ messages in thread From: Andy Wingo @ 2012-01-04 20:56 UTC (permalink / raw) To: Bruce Korb; +Cc: Mark H Weaver, guile-devel On Wed 04 Jan 2012 15:08, Bruce Korb <bruce.korb@gmail.com> writes: > On 01/04/12 11:43, Andy Wingo wrote: >> The correct behavior is the status quo. We are considering adding a >> hack to produce different behavior for compatibility purposes. We don't >> have to worry about correctness in that case, only compatibility. IMO >> anyway :) > > It would be a nice added benefit if it worked as one would expect. I think that in this case, your expectations are just incorrect. I don't mean this rudely. I think you will be happier and more productive if you change your expectations in this regard to better match "reality" (the state of things, common practice, conventional Scheme wisdom, etc). Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 20:56 ` Andy Wingo @ 2012-01-04 21:30 ` Bruce Korb 0 siblings, 0 replies; 117+ messages in thread From: Bruce Korb @ 2012-01-04 21:30 UTC (permalink / raw) To: Andy Wingo; +Cc: Mark H Weaver, guile-devel On 01/04/12 12:56, Andy Wingo wrote: > On Wed 04 Jan 2012 15:08, Bruce Korb<bruce.korb@gmail.com> writes: > >> On 01/04/12 11:43, Andy Wingo wrote: >>> The correct behavior is the status quo. We are considering adding a >>> hack to produce different behavior for compatibility purposes. We don't >>> have to worry about correctness in that case, only compatibility. IMO >>> anyway :) >> >> It would be a nice added benefit if it worked as one would expect. > > I think that in this case, your expectations are just incorrect. I > don't mean this rudely. I think you will be happier and more productive > if you change your expectations in this regard to better match "reality" > (the state of things, common practice, conventional Scheme wisdom, etc). Going forward, yes, sure, like the pointer aliasing thing. It was just never an issue with the original C model and it became such later. In this case, expectations were built upon perl and shell scripting models, and it seemed to work that way. In any case, the specific problem that actually triggered this whole thread was that scm_from_locale_string seems to be returning a reference to an immutable string (unexpected) *AND* the string-upcase function is objecting to it (also unexpected). Otherwise, I'd have gone on oblivious to any sort of issue. :) Cheers - Bruce ^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 14:29 ` Mike Gran 2012-01-04 14:45 ` David Kastrup 2012-01-04 16:47 ` Andy Wingo @ 2012-01-04 17:19 ` Mark H Weaver 2012-01-05 4:24 ` Mark H Weaver 2 siblings, 1 reply; 117+ messages in thread From: Mark H Weaver @ 2012-01-04 17:19 UTC (permalink / raw) To: Mike Gran; +Cc: Bruce Korb, guile-devel Mike Gran <spk121@yahoo.com> writes: >> From: Mark H Weaver <mhw@netris.org> >> No, `define' does not copy an object, it merely makes a new reference to >> an existing object. This is also true in C for that matter, so this is >> behavior is quite mainstream. For example, the following program dies >> with SIGSEGV on most modern systems, including GNU/Linux: >> >> int >> main() >> { >> char *y = "hello"; >> y[0] = 'a'; >> return 0; >> } > > > True, but the following also is quite mainstream > int main() > { > char y[6] = "hello"; > y[0] = 'a'; > return 0; > } > > C provides a way to create and initialize a mutable string. Scheme and Guile provide ways to do that too, but that's _never_ what `define' has done. >> Scheme and Guile are the same as C in this respect. Earlier versions of >> Guile didn't make a copy of the string in this case either, but it >> lacked the mechanism to detect this error, and allowed you to modify the >> string literal in the program text itself, which is a _very_ bad idea. > > It all depends on your mental model. Your saying that (define y "hello") > attaches "hello" to y, and since "hello" is a immutable, the string y > contains must be immutable. This is an argument based on purity, not > utility. If we were designing a new language, then it would at least be pertinent to argue this point. However, this is the way `define' has _always_ worked in every variant of Scheme, and the same is true of the analogous `set' in Lisp from the very beginning. 
> If you follow that logic, then Guile is left without any shorthand > to create and initialize a mutable string other than > > (define y (substring "hello" 0)) > or > (define y (string-copy "hello")) Guile provides all the machinery you need to define shorthand syntax if you like, e.g: (define-syntax-rule (define-string v s) (define v (string-copy s))) For that matter, you could also do something like this: (define-syntax define (lambda (x) (with-syntax ((orig-define #'(@ (guile) define))) (syntax-case x () ((_ (proc arg ...) e0 e1 ...) #'(orig-define proc (lambda (arg ...) e0 e1 ...))) ((_ v e) (identifier? #'v) (if (string? (syntax->datum #'e)) #'(orig-define v (string-copy e)) #'(orig-define v e))))))) This will change `define' (in the module where it's defined) to automatically copy a bare string literal on the right side. Note that this check is done at compile-time, so it can't look at the dynamic type of an expression. If that's not good enough and you're willing to take the efficiency hit at runtime for _every_ use of `define', you could change `define' to wrap the right-hand expression within a procedure call to check for read-only strings: (define (copy-if-string x) (if (string? x) (string-copy x) x)) (define-syntax define (lambda (x) (with-syntax ((orig-define #'(@ (guile) define))) (syntax-case x () ((_ (proc arg ...) e0 e1 ...) #'(orig-define proc (lambda (arg ...) e0 e1 ...))) ((_ v e) #'(orig-define v (copy-if-string e))))))) Scheme's nice handling of hygiene should make redefining `define' within your own modules (including (guile-user)) harmless. If it doesn't, that's a bug and we'd like to hear about it. > It was wrong to change this without deprecating it first. The only change here was to add the machinery to detect an error that was _always_ an error. It _never_ did what you say that it should do. What it did before was fail to detect that you were changing the string constant in the program text itself. 
The Guile 1.8 example I gave in my last email in this thread demonstrates that. To make that point even clearer, I'll post the full copy of the error message Guile 1.8 gave when my loop ran past the end of the string: guile> (let loop ((i 0)) (define y "hello") (display y) (newline) (string-set! y i #\a) (loop (1+ i))) hello aello aallo aaalo aaaao aaaaa Backtrace: In standard input: 2: 0* [loop 0] In unknown file: ?: 1 (letrec ((y "aaaaa")) (display y) ...) ... ?: 2 (letrec ((y "aaaaa")) (display y) ...) In standard input: 2: 3* [string-set! "aaaaa" {5} #\a] standard input:2:60: In procedure string-set! in expression (string-set! y i ...): standard input:2:60: Value out of range 0 to 4: 5 ABORT: (out-of-range) guile> Take a look at the backtrace, where it helpfully shows you an excerpt of the source code (admittedly after some transformation). See how the source code itself has been modified? This is what Bruce's code does. It was _always_ a serious error in the code, even if it went undetected in earlier versions of Guile. Mark ^ permalink raw reply [flat|nested] 117+ messages in thread
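As a usage sketch for the define-string shorthand suggested earlier in this message, assuming Guile 2.0 or later where define-syntax-rule is available:

```scheme
;; Shorthand: bind v to a fresh, mutable copy of the string s.
(define-syntax-rule (define-string v s)
  (define v (string-copy s)))

(define-string y "hello")
(string-set! y 0 #\x)
(display y) (newline)   ; prints xello
```

Unlike redefining define itself, this leaves ordinary definitions untouched and makes the copy visible at the point of use.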
* Re: Guile: What's wrong with this? 2012-01-04 17:19 ` Mark H Weaver @ 2012-01-05 4:24 ` Mark H Weaver 0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-05 4:24 UTC (permalink / raw)
To: Mike Gran; +Cc: Bruce Korb, guile-devel

I wrote:

> (define-syntax define
>   (lambda (x)
>     (with-syntax ((orig-define #'(@ (guile) define)))
>       (syntax-case x ()
>         ((_ (proc arg ...) e0 e1 ...)
>          #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
>         ((_ v e)
>          (identifier? #'v)
>          (if (string? (syntax->datum #'e))
>              #'(orig-define v (string-copy e))
>              #'(orig-define v e)))))))

In case you're planning to use this, I just realized that this syntax definition has a flaw: it won't handle cases like this:

    (define (map f . xs) ...)

To fix this flaw, change the two lines after syntax-case to:

>         ((_ (proc . args) e0 e1 ...)
>          #'(orig-define proc (lambda args e0 e1 ...)))

The other macro I provided has the same flaw, and the same fix applies.

Mark

^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: Guile: What's wrong with this? 2012-01-04 10:03 ` Mark H Weaver 2012-01-04 14:29 ` Mike Gran @ 2012-01-04 22:37 ` Ludovic Courtès 1 sibling, 0 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-04 22:37 UTC (permalink / raw)
To: guile-devel

Hi!

Mark H Weaver <mhw@netris.org> skribis:

> For example, look at what Guile 1.8 does:
>
> guile> (let loop ((i 0))
>          (define y "hello")
>          (display y)
>          (newline)
>          (string-set! y i #\a)
>          (loop (1+ i)))
> hello
> aello
> aallo
> aaalo
> aaaao
> aaaaa
> <then an error>
>
> So you see, even in Guile 1.8, (define y "hello") didn't do what you
> thought it did. It didn't fill y with the string "hello". You were
> actually changing the program text itself, and that was a serious
> mistake.

Indeed, funny example!

Ludo’.

^ permalink raw reply [flat|nested] 117+ messages in thread