unofficial mirror of guile-devel@gnu.org 
* What's wrong with this?
@ 2012-01-03  4:08 Bruce Korb
  2012-01-03 15:03 ` Mike Gran
  0 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-03  4:08 UTC (permalink / raw)
  To: guile-devel Development


My "(get ...)" function always returns a string.
This result was assigned to "tmp-text" and the
"(string-upcase ...)" is complaining that the input is
read only.  Well, it isn't, so the real complaint
is being hidden by the "string is read-only" message.

It worked until I "upgraded" to openSuSE 12.1.

> $ guile --version
> guile (GNU Guile) 2.0.2
> .....

What is really wrong, please?

> ERROR: In procedure string-upcase:
> ERROR: string is read-only: ""
> Scheme evaluation error.  AutoGen ABEND-ing in template
> 	confmacs.tlib on line 209
> Failing Guile command:  = = = = =
>
> (set! tmp-text (get "act-text"))
>        (set! TMP-text (string-upcase tmp-text))
>        (string-append
>          (if (exist? "no") "no-" "yes-")
>          (get "act-type"))



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: What's wrong with this?
  2012-01-03  4:08 What's wrong with this? Bruce Korb
@ 2012-01-03 15:03 ` Mike Gran
  2012-01-03 16:26   ` Guile: " Bruce Korb
  0 siblings, 1 reply; 117+ messages in thread
From: Mike Gran @ 2012-01-03 15:03 UTC (permalink / raw)
  To: Bruce Korb, guile-devel Development

> From: Bruce Korb <bruce.korb@gmail.com>


> My "(get ...)" function always returns a string.
> This result was assigned to "tmp-text" and the
> "(string-upcase ...)" is complaining that the input is
> read only.  Well, it isn't, so the real complaint
> is being hidden by the "string is read-only" message.
> 
> It worked until I "upgraded" to openSuSE 12.1.
> 
>>  $ guile --version
>>  guile (GNU Guile) 2.0.2
>>  .....
> 
> What is really wrong, please?
> 
>> 
>>  (set! tmp-text (get "act-text"))
>>         (set! TMP-text (string-upcase tmp-text))
>>         (string-append
>>           (if (exist? "no") "no-" "yes-")
>>           (get "act-type"))
>

There does seem to be some strangeness w.r.t. read-only
strings going on.

On Guile 1.8.8 if you create a string this way, it is
not read-only.

guile> (define y "hello")
guile> (string-set! y 0 #\x)
guile> y
"xello"

On Guile 2.0.3, if you create a string the same way, it
is read-only for some reason.

scheme@(guile-user)> (define y "hello")
scheme@(guile-user)> (string-set! y 0 #\x)
ERROR: In procedure string-set!:
ERROR: string is read-only: "hello"

%string-dump can be used to confirm this

scheme@(guile-user)> (%string-dump y)
$4 = ((string . "hello") (start . 0) (length . 5)
  (shared . #f) (read-only . #t) (stringbuf-chars . "hello")
  (stringbuf-length . 5) (stringbuf-shared . #f) (stringbuf-wide . #f))

But if you create a string with 'string' it isn't read only

scheme@(guile-user)> (define y (string #\h #\e #\l #\l #\o))
scheme@(guile-user)> (string-set! y 0 #\x)
scheme@(guile-user)> y
$7 = "xello"

-Mike




* Re: Guile: What's wrong with this?
  2012-01-03 15:03 ` Mike Gran
@ 2012-01-03 16:26   ` Bruce Korb
  2012-01-03 16:30     ` Mike Gran
                       ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-03 16:26 UTC (permalink / raw)
  To: Mike Gran; +Cc: gnu-prog-discuss, guile-devel Development

Hi Mike,

Thank you for the explanation.  However:

On 01/03/12 07:03, Mike Gran wrote:
>> It worked until I "upgraded" to openSuSE 12.1.
>>
>>>   $ guile --version
>>>   guile (GNU Guile) 2.0.2
>>>   .....

>>>   (set! tmp-text (get "act-text"))
>>>          (set! TMP-text (string-upcase tmp-text))

>>> ERROR: In procedure string-upcase:
>>> ERROR: string is read-only: ""
>>
>
> There does seem to be some strangeness w.r.t. read-only
> strings going on.
>
> On Guile 1.8.8 if you create a string this way, it is
> not read-only.
>
> guile>  (define y "hello")
> guile>  (string-set! y 0 #\x)
> guile>  y
> "xello"
>
> On Guile 2.0.3, if you create a string the same way, it
> is read-only for some reason.
>
> scheme@(guile-user)>  (define y "hello")
> scheme@(guile-user)>  (string-set! y 0 #\x)
> ERROR: In procedure string-set!:
> ERROR: string is read-only: "hello"
>
> %string-dump can be used to confirm this

There are a couple of issues:

1.  "string-upcase" should only read the string
     (as opposed to "string-upcase!", which rewrites it).
2.  it is completely, utterly wrong to mutilate the
     Guile library into such a contortion that it
     interprets this:
         (define y "hello")
     to be a request to create an immutable string anyway.
     It very, very plainly says, "make 'y' and fill it with
     the string "hello".  Making it read only is crazy.

Furthermore, I do not even have an obvious way to deal
with the problem, short of a massive rewrite.
I define variables this way all over the place.
Rewriting the code to
    (define y (string-append "hell" "o"))
everywhere is stupid, laborious, time consuming for me,
and time consuming at execution time.

Guile 2.0.1, 2.0.2 and 2.0.3 need some rethinking.  Dang!!!!!




* Re: Guile: What's wrong with this?
  2012-01-03 16:26   ` Guile: " Bruce Korb
@ 2012-01-03 16:30     ` Mike Gran
  2012-01-03 22:24     ` Ludovic Courtès
  2012-01-04 10:03     ` Mark H Weaver
  2 siblings, 0 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-03 16:30 UTC (permalink / raw)
  To: Bruce Korb; +Cc: gnu-prog-discuss@gnu.org, guile-devel Development

> From: Bruce Korb <bruce.korb@gmail.com>

> 2.  it is completely, utterly wrong to mutilate the
>     Guile library into such a contortion that it
>     interprets this:
>         (define y "hello")
>     to be a request to create an immutable string anyway.
>     It very, very plainly says, "make 'y' and fill it with
>     the string "hello".  Making it read only is crazy.

Agreed.

-Mike





* Re: Guile: What's wrong with this?
  2012-01-03 16:26   ` Guile: " Bruce Korb
  2012-01-03 16:30     ` Mike Gran
@ 2012-01-03 22:24     ` Ludovic Courtès
  2012-01-03 23:15       ` Bruce Korb
  2012-01-04  3:04       ` Mike Gran
  2012-01-04 10:03     ` Mark H Weaver
  2 siblings, 2 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-03 22:24 UTC (permalink / raw)
  To: guile-devel

Hi Bruce,

And happy new year!

Bruce Korb <bruce.korb@gmail.com> skribis:

> Thank you for the explanation.  However:
>
> On 01/03/12 07:03, Mike Gran wrote:
>>> It worked until I "upgraded" to openSuSE 12.1.
>>>
>>>>   $ guile --version
>>>>   guile (GNU Guile) 2.0.2
>>>>   .....
>
>>>>   (set! tmp-text (get "act-text"))
>>>>          (set! TMP-text (string-upcase tmp-text))
>
>>>> ERROR: In procedure string-upcase:
>>>> ERROR: string is read-only: ""

[...]

>> On Guile 2.0.3, if you create a string the same way, it
>> is read-only for some reason.
>>
>> scheme@(guile-user)>  (define y "hello")
>> scheme@(guile-user)>  (string-set! y 0 #\x)
>> ERROR: In procedure string-set!:
>> ERROR: string is read-only: "hello"
>>
>> %string-dump can be used to confirm this
>
> There are a couple of issues:
>
> 1.  "string-upcase" should only read the string
>     (as opposed to "string-upcase!", which rewrites it).

Yes, that’s weird.  I can’t get string-upcase to raise a read-only
exception with 2.0.3, though.  Could you try with 2.0.3, or come up with
a reduced case?

> 2.  it is completely, utterly wrong to mutilate the
>     Guile library into such a contortion that it
>     interprets this:
>         (define y "hello")
>     to be a request to create an immutable string anyway.
>     It very, very plainly says, "make 'y' and fill it with
>     the string "hello".  Making it read only is crazy.

It stems from the fact that string literals are read-only, per R5RS
(info "(r5rs) Storage model"):

  In many systems it is desirable for constants (i.e. the values of literal
  expressions) to reside in read-only-memory.  To express this, it is
  convenient to imagine that every object that denotes locations is
  associated with a flag telling whether that object is mutable or immutable.
  In such systems literal constants and the strings returned by
  `symbol->string' are immutable objects, while all objects created by
  the other procedures listed in this report are mutable.  It is an error
  to attempt to store a new value into a location that is denoted by an
  immutable object.

In Guile this has been the case since commit
190d4b0d93599e5b58e773dc6375054c3a6e3dbf.

The reason for this is that Guile’s compiler tries hard to avoid
duplicating constants in the output bytecode.  Thus, modifying a
constant would actually change all other occurrences of that constant in
the code, making it a non-constant.  ;-)

> Furthermore, I do not even have an obvious way to deal
> with the problem,

You can use:

  (define y (string-copy "hello"))
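
As a minimal sketch of that workaround (the variable name is only
illustrative), the copy can be mutated while the literal is left alone:

```scheme
;; string-copy returns a freshly allocated, mutable string,
;; so string-set! on the copy is allowed.
(define y (string-copy "hello"))
(string-set! y 0 #\x)
(display y)   ; prints xello
(newline)
```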

> short of a massive rewrite.
> I define variables this way all over the place.
> Rewriting the code to
>    (define y (string-append "hell" "o"))
> everywhere is stupid, laborious, time consuming for me,
> and time consuming at execution time.

I agree that this is laborious, and I’m sorry about that.  I can only
say that Guile < 2.0 being more permissive than the standard turns out
to be a mistake, in hindsight.

Thanks,
Ludo’.





* Re: Guile: What's wrong with this?
  2012-01-03 22:24     ` Ludovic Courtès
@ 2012-01-03 23:15       ` Bruce Korb
  2012-01-03 23:33         ` Ludovic Courtès
  2012-01-04 12:19         ` Ian Price
  2012-01-04  3:04       ` Mike Gran
  1 sibling, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-03 23:15 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On 01/03/12 14:24, Ludovic Courtès wrote:
>> 2.  it is completely, utterly wrong to mutilate the
>>      Guile library into such a contortion that it
>>      interprets this:
>>          (define y "hello")
>>      to be a request to create an immutable string anyway.
>>      It very, very plainly says, "make 'y' and fill it with
>>      the string "hello".  Making it read only is crazy.
>
> It stems from the fact that string literals are read-only, per R5RS
> (info "(r5rs) Storage model"):
>
>    [[blah, blah, blah]]
>
> In Guile this has been the case since commit
> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>
> The reason for this is that Guile’s compiler tries hard to avoid
> duplicating constants in the output bytecode.  Thus, modifying a

You have changed the interface without deprecation or any other multi-year process.
Please change it back.  Please fix the problem by adding (define-strict y "hello")
to have this new semantic.  Thank you.




* Re: Guile: What's wrong with this?
  2012-01-03 23:15       ` Bruce Korb
@ 2012-01-03 23:33         ` Ludovic Courtès
  2012-01-04  0:55           ` Bruce Korb
  2012-01-04 12:19         ` Ian Price
  1 sibling, 1 reply; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-03 23:33 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce,

Bruce Korb <bkorb@gnu.org> skribis:

> On 01/03/12 14:24, Ludovic Courtès wrote:
>>> 2.  it is completely, utterly wrong to mutilate the
>>>      Guile library into such a contortion that it
>>>      interprets this:
>>>          (define y "hello")
>>>      to be a request to create an immutable string anyway.
>>>      It very, very plainly says, "make 'y' and fill it with
>>>      the string "hello".  Making it read only is crazy.
>>
>> It stems from the fact that string literals are read-only, per R5RS
>> (info "(r5rs) Storage model"):
>>
>>    [[blah, blah, blah]]
>>
>> In Guile this has been the case since commit
>> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>>
>> The reason for this is that Guile’s compiler tries hard to avoid
>> duplicating constants in the output bytecode.  Thus, modifying a
>
> You have changed the interface without deprecation or any other multi-year process.

I could be just as offensive by suggesting that R5RS is 14 years old,
etc., but I’d rather work towards an acceptable solution with you.

Could you point me to the affected code?  What would you think of using
string-copy as I suggested?  The disadvantage is that you need to modify
your code, but hopefully that can be automated with a sed script or so;
the advantage is that it would work with all versions of Guile.

Thanks,
Ludo’.




* Re: Guile: What's wrong with this?
  2012-01-03 23:33         ` Ludovic Courtès
@ 2012-01-04  0:55           ` Bruce Korb
  2012-01-04  3:12             ` Noah Lavine
  2012-01-04 21:17             ` Ludovic Courtès
  0 siblings, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04  0:55 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On 01/03/12 15:33, Ludovic Courtès wrote:
> Could you point me to the affected code?  What would you think of using
> string-copy as I suggested?  The disadvantage is that you need to modify
> your code, but hopefully that can be automated with a sed script or so;
> the advantage is that it would work with all versions of Guile.

The disadvantage is that I know I have "clients" that have rolled their
own templates, presumably by copy-and-edit processes that will invariably
include (define var "string") syntax.  Likely a better approach is to
re-define the "define" function to my own C code and call the proper
scm_whathaveyou functions under the covers.

I'm sorry about being irritable.  This is the third problem with 2.x.
First a pre-defined value disappeared.  A very minor nuisance.
Then it turned out that the string functions would now clear the
high order bit on strings, so they are no longer byte arrays and
there is no replacement but to roll my own.  I stopped supporting
byte arrays.  A noticeable nuisance.

Now it turns out that the conventional, ordinary way of creating
a string variable yields a read-only string.  Ouch.  So I am cranky
and sorry about being so.

So I guess that's my fix.  Write another function dependent
upon Guile internals, much like scm_c_eval_string_from_file_line(),
by copying scm_define() code, checking for a string value and copying
that string -- if it is read-only?  Should I check for that?

What about "set!"?  Should I check for a read-only value there, too?
I do confess it feels a little bit like unraveling something...  It is scary.
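
A Scheme-level alternative to copying scm_define() might look like the
sketch below.  The macro name define-mutable is hypothetical (it is not
part of Guile), and it only helps where templates can be changed to use
it.  It binds a fresh copy when the value is a string and passes any
other value through unchanged.

```scheme
;; Hypothetical helper, not part of Guile: bind NAME to VAL, but if VAL
;; is a string, bind a fresh mutable copy instead of the literal itself.
(define-syntax define-mutable
  (syntax-rules ()
    ((_ name val)
     (define name
       (let ((v val))
         (if (string? v) (string-copy v) v))))))

(define-mutable tmp-text "act-text")
(string-set! tmp-text 0 #\A)      ; allowed: tmp-text is a copy
(define-mutable count 42)         ; non-strings are bound unchanged
```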




* Re: Guile: What's wrong with this?
  2012-01-03 22:24     ` Ludovic Courtès
  2012-01-03 23:15       ` Bruce Korb
@ 2012-01-04  3:04       ` Mike Gran
  2012-01-04  9:35         ` nalaginrut
                           ` (2 more replies)
  1 sibling, 3 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-04  3:04 UTC (permalink / raw)
  To: Ludovic Courtès, guile-devel@gnu.org

>   In many systems it is desirable for constants (i.e. the values of literal
>   expressions) to reside in read-only-memory.  To express this, it is
>   convenient to imagine that every object that denotes locations is
>   associated with a flag telling whether that object is mutable or immutable.
>   In such systems literal constants and the strings returned by
>   `symbol->string' are immutable objects, while all objects created by
>   the other procedures listed in this report are mutable.  It is an error
>   to attempt to store a new value into a location that is denoted by an
>   immutable object.
> 
> In Guile this has been the case since commit
> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
> 
> The reason for this is that Guile’s compiler tries hard to avoid
> duplicating constants in the output bytecode.  Thus, modifying a
> constant would actually change all other occurrences of that constant in
> the code, making it a non-constant.  ;-)

This is a terrible example of the RnRS promoting some strange idea of
mathematical purity over being useful.
 
The idea that the correct way to initialize a string is
(define x (string-copy "string")) is awkward.  "string" is read-only
but copying it makes it modifiable?  Copying implies mutability?
 
Copying doesn't imply modifying mutability in any other data type.
 
Why not change the behavior of 'define' to be (define y (substring str 0)) when STR
is a read-only string?  This would preserve the shared memory if the variable is never
modified but still make the string copy-on-write.
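
A sketch of what that would mean at the call site, assuming that Guile's
substring returns a new string that shares storage with its source until
one of the two is modified.  The source here is itself a copy, so the
example does not depend on how literals are handled; the names are only
illustrative.

```scheme
;; Assumption: substring is copy-on-write in Guile, so the result shares
;; storage with SRC until the first mutation and is itself mutable.
(define src (string-copy "hello"))
(define y (substring src 0))
(string-set! y 0 #\x)   ; only y changes; src is not affected
(display (list src y))  ; prints (hello xello)
(newline)
```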
 
Regards,
 
Mike




* Re: Guile: What's wrong with this?
  2012-01-04  0:55           ` Bruce Korb
@ 2012-01-04  3:12             ` Noah Lavine
  2012-01-04 17:37               ` bytevector -- was: " Bruce Korb
  2012-01-04 21:17             ` Ludovic Courtès
  1 sibling, 1 reply; 117+ messages in thread
From: Noah Lavine @ 2012-01-04  3:12 UTC (permalink / raw)
  To: Bruce Korb; +Cc: Ludovic Courtès, guile-devel

Hello,

> Then it turned out that the string functions would now clear the
> high order bit on strings, so they are no longer byte arrays and
> there is no replacement but to roll my own.  I stopped supporting
> byte arrays.  A noticeable nuisance.

This is just a side note to the main discussion, but there is now a
'bytevector' datatype you can use. Does that work for you? If not,
what functionality is missing?
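
For reference, a minimal sketch of the bytevector API from the
(rnrs bytevectors) module; the variable name and values are only
illustrative:

```scheme
(use-modules (rnrs bytevectors))

;; A bytevector holds raw octets (0-255), so the high bit is preserved.
(define bv (make-bytevector 4 0))
(bytevector-u8-set! bv 0 #xE9)
(display (bytevector-u8-ref bv 0))   ; prints 233
(newline)
(display (bytevector-length bv))     ; prints 4
(newline)
```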

Thanks,
Noah




* Re: Guile: What's wrong with this?
  2012-01-04  3:04       ` Mike Gran
@ 2012-01-04  9:35         ` nalaginrut
  2012-01-04  9:41         ` David Kastrup
  2012-01-04 21:07         ` Ludovic Courtès
  2 siblings, 0 replies; 117+ messages in thread
From: nalaginrut @ 2012-01-04  9:35 UTC (permalink / raw)
  To: Mike Gran; +Cc: Ludovic Courtès, guile-devel@gnu.org

> >   In many systems it is desirable for constants (i.e. the values of literal
> >   expressions) to reside in read-only-memory.  To express this, it is
> >   convenient to imagine that every object that denotes locations is
> >   associated with a flag telling whether that object is mutable or immutable.
> >   In such systems literal constants and the strings returned by
> >   `symbol->string' are immutable objects, while all objects created by
> >   the other procedures listed in this report are mutable.  It is an error
> >   to attempt to store a new value into a location that is denoted by an
> >   immutable object.
> > 
> > In Guile this has been the case since commit
> > 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
> > 
> > The reason for this is that Guile’s compiler tries hard to avoid
> > duplicating constants in the output bytecode.  Thus, modifying a
> > constant would actually change all other occurrences of that constant in
> > the code, making it a non-constant.  ;-)
> 
> This is a terrible example of the RnRS promoting some strange idea of
> mathematical purity over being useful.
>  
> The idea that the correct way to initialize a string is
> (define x (string-copy "string")) is awkward.  "string" is read-only
> but copying it makes it modifiable?  Copying implies mutability?
>  
> Copying doesn't imply modifying mutability in any other data type.
>  
> Why not change the behavior of 'define' to be (define y (substring str 0)) when STR
> is a read-only string?  This would preserve the shared memory if the variable is never
> modified but still make the string copy-on-write.
>  
> Regards,
>  
> Mike
> 

Hi guys! I was just passing by and saw your dispute.
I was confused by the new immutable string design, too. But I used a
macro "make-mutable-string" which hides string-copy behind an abstraction.
Anyway, if efficiency becomes an issue, one may choose bytevectors
to implement "make-mutable-string". And it's easy to substitute with
sed.
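
Such a wrapper might look like the sketch below.  Only the name
make-mutable-string comes from the text above; the body is an assumption
about what it does.

```scheme
;; Assumed definition: hide string-copy behind a name that states intent.
(define-syntax make-mutable-string
  (syntax-rules ()
    ((_ str) (string-copy str))))

(define s (make-mutable-string "hello"))
(string-set! s 0 #\j)
(display s)   ; prints jello
(newline)
```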

BTW, can't we make an efficient "mutable-string" module as an
alternative? Just like the old version. I mean it could be a Guile-specific
feature.

-- 
GNU Powered it
GPL Protected it
GOD Blessed it

HFG - NalaGinrut

--hacker key--
v4sw7CUSMhw6ln6pr8OSFck4ma9u8MLSOFw3WDXGm7g/l8Li6e7t4TNGSb8AGORTDLMen6g6RASZOGCHPa28s1MIr4p-x hackerkey.com
---end key---





* Re: Guile: What's wrong with this?
  2012-01-04  3:04       ` Mike Gran
  2012-01-04  9:35         ` nalaginrut
@ 2012-01-04  9:41         ` David Kastrup
  2012-01-04 21:07         ` Ludovic Courtès
  2 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-04  9:41 UTC (permalink / raw)
  To: guile-devel

Mike Gran <spk121@yahoo.com> writes:

>>   In many systems it is desirable for constants (i.e. the values of literal
>>   expressions) to reside in read-only-memory.  To express this, it is
>>   convenient to imagine that every object that denotes locations is
>>   associated with a flag telling whether that object is mutable or immutable.
>>   In such systems literal constants and the strings returned by
>>   `symbol->string' are immutable objects, while all objects created by
>>   the other procedures listed in this report are mutable.  It is an error
>>   to attempt to store a new value into a location that is denoted by an
>>   immutable object.
>> 
>> In Guile this has been the case since commit
>> 190d4b0d93599e5b58e773dc6375054c3a6e3dbf.
>> 
>> The reason for this is that Guile’s compiler tries hard to avoid
>> duplicating constants in the output bytecode.  Thus, modifying a
>> constant would actually change all other occurrences of that constant in
>> the code, making it a non-constant.  ;-)
>
> This is a terrible example of the RnRS promoting some strange idea of
> mathematical purity over being useful.
>  
> The idea that the correct way to initialize a string is
> (define x (string-copy "string")) is awkward.  "string" is read-only
> but copying it makes it modifiable?  Copying implies mutability?
>  
> Copying doesn't imply modifying mutability in any other data type.

Huh?

(set-car! '(4 5) 3) => bad
(set-car! (list-copy '(4 5)) 3) => ok

Similar with literal vectors.

Why should strings be different here?
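
A short sketch of the same pattern across the three types, using only
the copy side (mutating the literals themselves is an error and is left
out); the variable names are illustrative:

```scheme
;; Copies of literal data are fresh objects and may be mutated.
(define l (list-copy '(4 5)))
(set-car! l 3)
(define v (vector-copy '#(4 5)))
(vector-set! v 0 3)
(define s (string-copy "45"))
(string-set! s 0 #\3)
(display (list l v s))   ; prints ((3 5) #(3 5) 35)
(newline)
```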

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-03 16:26   ` Guile: " Bruce Korb
  2012-01-03 16:30     ` Mike Gran
  2012-01-03 22:24     ` Ludovic Courtès
@ 2012-01-04 10:03     ` Mark H Weaver
  2012-01-04 14:29       ` Mike Gran
  2012-01-04 22:37       ` Ludovic Courtès
  2 siblings, 2 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 10:03 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bruce.korb@gmail.com> writes:
> 2.  it is completely, utterly wrong to mutilate the
>     Guile library into such a contortion that it
>     interprets this:
>         (define y "hello")
>     to be a request to create an immutable string anyway.
>     It very, very plainly says, "make 'y' and fill it with
>     the string "hello".  Making it read only is crazy.

No, `define' does not copy an object, it merely makes a new reference to
an existing object.  This is also true in C for that matter, so this
behavior is quite mainstream.  For example, the following program dies
with SIGSEGV on most modern systems, including GNU/Linux:

  int
  main()
  {
    char *y = "hello";
    y[0] = 'a';
    return 0;
  }

Scheme and Guile are the same as C in this respect.  Earlier versions of
Guile didn't make a copy of the string in this case either, but it
lacked the mechanism to detect this error, and allowed you to modify the
string literal in the program text itself, which is a _very_ bad idea.

For example, look at what Guile 1.8 does:

  guile> (let loop ((i 0))
           (define y "hello")
           (display y)
           (newline)
           (string-set! y i #\a)
           (loop (1+ i)))
  hello
  aello
  aallo
  aaalo
  aaaao
  aaaaa
  <then an error>

So you see, even in Guile 1.8, (define y "hello") didn't do what you
thought it did.  It didn't fill y with the string "hello".  You were
actually changing the program text itself, and that was a serious
mistake.

I'm sincerely sorry that you got yourself into this mess, but I don't
see any good way out of it.  To fix it as you suggest would be like
suggesting that C should change the semantics of char *y = "hello" to
automatically do a strcpy because some existing programs were in the
habit of modifying the string constants of the program text.  That way
lies madness.

If you want to make a copy of a string constant from the program text as
a starting point for mutating the string, then you need to explicitly
copy it, just like in C.
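
A minimal sketch of that explicit copy, using a hypothetical procedure
name (mangle) and a bounded variant of the loop above:

```scheme
;; Each call copies the literal, so earlier mutations never leak into
;; later calls and the program text itself is never modified.
(define (mangle i)
  (let ((y (string-copy "hello")))
    (string-set! y i #\a)
    y))

(display (mangle 0))   ; prints aello
(newline)
(display (mangle 1))   ; prints hallo
(newline)
```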

      Mark




* Re: Guile: What's wrong with this?
  2012-01-03 23:15       ` Bruce Korb
  2012-01-03 23:33         ` Ludovic Courtès
@ 2012-01-04 12:19         ` Ian Price
  2012-01-04 17:16           ` Bruce Korb
  1 sibling, 1 reply; 117+ messages in thread
From: Ian Price @ 2012-01-04 12:19 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> You have changed the interface without deprecation or any other multi-year process.
> Please change it back.  Please fix the problem by adding (define-strict y "hello")
> to have this new semantic.  Thank you.

Fixing it with define-strict is ridiculous, as y is still mutable, it is
the string "hello" which is not. As for mutable strings, I consider them
a mistake to begin with, but if people expect them to be mutable, and
historically they are mutable (in guile), it is a mistake to change this
without prior warning.

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"




* Re: Guile: What's wrong with this?
  2012-01-04 10:03     ` Mark H Weaver
@ 2012-01-04 14:29       ` Mike Gran
  2012-01-04 14:45         ` David Kastrup
                           ` (2 more replies)
  2012-01-04 22:37       ` Ludovic Courtès
  1 sibling, 3 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-04 14:29 UTC (permalink / raw)
  To: Mark H Weaver, Bruce Korb; +Cc: guile-devel@gnu.org

> From: Mark H Weaver <mhw@netris.org>
> No, `define' does not copy an object, it merely makes a new reference to
> an existing object.  This is also true in C for that matter, so this
> behavior is quite mainstream.  For example, the following program dies
> with SIGSEGV on most modern systems, including GNU/Linux:
> 
>   int
>   main()
>   {
>     char *y = "hello";
>     y[0] = 'a';
>     return 0;
>   }

 
True, but the following also is quite mainstream
int main()
{
  char y[6] = "hello";
  y[0] = 'a';
  return 0;
}
 
C provides a way to create and initialize a mutable string.
 
> Scheme and Guile are the same as C in this respect.  Earlier versions of
> Guile didn't make a copy of the string in this case either, but it
> lacked the mechanism to detect this error, and allowed you to modify the
> string literal in the program text itself, which is a _very_ bad idea.

It all depends on your mental model.  You're saying that (define y "hello")
attaches "hello" to y, and since "hello" is immutable, the string y
contains must be immutable.  This is an argument based on purity, not
utility.
 
If you follow that logic, then Guile is left without any shorthand
to create and initialize a mutable string other than
 
(define y (substring "hello" 0))
or 
(define y (string-copy "hello"))
 
Someone coming from any other language would be surprised to find that
the above is what you need to do to create and initialize a mutable string,
I think.
 
But 'define' just as easily can be considered a generic constructor
that is overloaded in a C++ sense, and when "hello" is a string, y is
assigned a copy-on-write version of the immutable string.
 
It was wrong to change this without deprecating it first.
 
Thanks,
 
Mike Gran




* Re: Guile: What's wrong with this?
  2012-01-04 14:29       ` Mike Gran
@ 2012-01-04 14:45         ` David Kastrup
  2012-01-04 16:47         ` Andy Wingo
  2012-01-04 17:19         ` Mark H Weaver
  2 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-04 14:45 UTC (permalink / raw)
  To: guile-devel

Mike Gran <spk121@yahoo.com> writes:

> If you follow that logic, then Guile is left without any shorthand
> to create and initialize a mutable string other than
>  
> (define y (substring "hello" 0))
> or 
> (define y (string-copy "hello"))

Sure.  Guile does not have shorthands for _mutable_ literals for lists
or vectors either.  One of the most significant points of a literal is
that you can rely on it staying the same.

> Someone coming from any other language would be surprised to find that
> the above is what you need to do to create and initialize a mutable
> string, I think.

I don't know any language that permits the modification of literals.

> But 'define' just as easily can be considered a generic constructor
> that is overloaded in a C++ sense,

It can be considered a lot of things that don't make sense.

> and when "hello" is a string, y is assigned a copy-on-write version of
> the immutable string.    It was wrong to change this without
> deprecating it first.

Modifying literals _never_ _ever_ was guaranteed to lead to predictable
results.  Undefined behavior before, undefined behavior afterwards.
There is no point in _deprecating_ something that _always_ was undefined
behavior.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 14:29       ` Mike Gran
  2012-01-04 14:45         ` David Kastrup
@ 2012-01-04 16:47         ` Andy Wingo
  2012-01-04 17:14           ` David Kastrup
                             ` (2 more replies)
  2012-01-04 17:19         ` Mark H Weaver
  2 siblings, 3 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 16:47 UTC (permalink / raw)
  To: Mike Gran; +Cc: Mark H Weaver, Bruce Korb, guile-devel@gnu.org

On Wed 04 Jan 2012 09:29, Mike Gran <spk121@yahoo.com> writes:

>   char y[6] = "hello";
>  
> C provides a way to create and initialize a mutable string.

This one is more like

  (define y (string #\h #\e #\l #\l #\o))

just like

  (define y (list #\h #\e #\l #\l #\o))
  (define y (vector #\h #\e #\l #\l #\o))

etc.

> It all depends on your mental model.  You're saying that (define y "hello")
> attaches "hello" to y, and since "hello" is immutable, the string y
> contains must be immutable.

This is what the Scheme standard says, yes.

> This is an argument based on purity, not utility.

You don't think optimizations are of any use, then?  :-)  Immutable
literals allow literals to be coalesced, leading to the impressive 2x
speed improvements in Dorodango startup time, some months back.

> It was wrong to change this without deprecating it first.

I am not certain that is the case.  Mutating string literals has always
been an error in Scheme.  It did "work" with Guile 1.8 and before; but
since 1.9.0 when the compiler was introduced and started coalescing
literals, it has had the possibility to cause bugs.  The changes in
2.0.1 prevented those bugs by marking those strings as immutable.

I was going to propose a workaround with an option to change
vm-i-loader.c:43 and vm-i-loader.c:115 to use a
scm_i_mutable_string_literals_p instead of 1, but that really seems like
the path to perdition: previously compiled modules would start creating
mutable strings where they really shouldn't.

We could add a compiler option to turn string literals into (string-copy
FOO).  Perhaps that's the thing to do.

Andy
-- 
http://wingolog.org/




* Re: Guile: What's wrong with this?
  2012-01-04 16:47         ` Andy Wingo
@ 2012-01-04 17:14           ` David Kastrup
  2012-01-04 17:32             ` Andy Wingo
  2012-01-04 17:30           ` Bruce Korb
  2012-01-04 18:31           ` Guile: What's wrong with this? Mark H Weaver
  2 siblings, 1 reply; 117+ messages in thread
From: David Kastrup @ 2012-01-04 17:14 UTC (permalink / raw)
  To: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> We could add a compiler option to turn string literals into
> (string-copy FOO).  Perhaps that's the thing to do.

What for?  It would mean that a literal would not be eq? to itself, a
nightmare for memoization purposes.

And for what?  For making code with explicitly undefined behavior
exhibit a particular behavior that is undesirable in general.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 12:19         ` Ian Price
@ 2012-01-04 17:16           ` Bruce Korb
  2012-01-04 17:21             ` Andy Wingo
                               ` (3 more replies)
  0 siblings, 4 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 17:16 UTC (permalink / raw)
  To: Ian Price, Andy Wingo; +Cc: guile-devel

On 01/04/12 04:19, Ian Price wrote:
> ...  As for mutable strings, I consider them
> a mistake to begin with,...

Let's step back and consider the whole point of Guile in the first place.

My understanding is that one primary purpose is to be a facilitation
language so that application developers have less to worry about and
futz over.  An extension language, if you like that phrase.  As such,
it would seem to me that a primary design goal would be to make the
pathway as smooth as possible, rather than trying to emulate C and/or
official Scheme language specs as closely as possible.  To me, my primary
concern is doing my little thing with the least total hassle.  Having
to study up on and thoroughly understand the Scheme language seems
a lot harder than just using Perl (or what-have-you).  Most scripting
languages don't cut you off at the knees (change interfaces).

So my main question is:

   Which is the higher priority, language purity or ease of use?

The answer to that question answers several other things, like
whether or not strings should be "allowed" to have high order bits
set (not be pure ASCII) and whether or not to make read only strings
be copy-on-write vs. fault-on-write.

> We could add a compiler option to turn string literals into (string-copy
> FOO).  Perhaps that's the thing to do.

No, because your clients have no control over how Guile gets built.
We _do_ have control over startup code, however:

   (if (defined? 'set-copy-on-write-strings)
       (set-copy-on-write-strings #t))

Or, better, keep historical behavior and add:

   (if (defined? 'set-no-copy-on-write-strings)
       (set-no-copy-on-write-strings #t))

and fix the 1.9 bug (scribbling on shared strings) by making them
copy-on-write thingys.

Thank you.




* Re: Guile: What's wrong with this?
  2012-01-04 14:29       ` Mike Gran
  2012-01-04 14:45         ` David Kastrup
  2012-01-04 16:47         ` Andy Wingo
@ 2012-01-04 17:19         ` Mark H Weaver
  2012-01-05  4:24           ` Mark H Weaver
  2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 17:19 UTC (permalink / raw)
  To: Mike Gran; +Cc: Bruce Korb, guile-devel

Mike Gran <spk121@yahoo.com> writes:

>> From: Mark H Weaver <mhw@netris.org>
>> No, `define' does not copy an object, it merely makes a new reference to
>> an existing object.  This is also true in C for that matter, so this
>> behavior is quite mainstream.  For example, the following program dies
>> with SIGSEGV on most modern systems, including GNU/Linux:
>> 
>>   int
>>   main()
>>   {
>>     char *y = "hello";
>>     y[0] = 'a';
>>     return 0;
>>   }
>
>  
> True, but the following also is quite mainstream
> int main()
> {
>   char y[6] = "hello";
>   y[0] = 'a';
>   return 0;
> }
>  
> C provides a way to create and initialize a mutable string.

Scheme and Guile provide ways to do that too, but that's _never_ what
`define' has done.

>> Scheme and Guile are the same as C in this respect.  Earlier versions of
>> Guile didn't make a copy of the string in this case either, but it
>> lacked the mechanism to detect this error, and allowed you to modify the
>> string literal in the program text itself, which is a _very_ bad idea.
>
> It all depends on your mental model.  You're saying that (define y "hello")
> attaches "hello" to y, and since "hello" is immutable, the string y
> contains must be immutable.  This is an argument based on purity, not
> utility.

If we were designing a new language, then it would at least be pertinent
to argue this point.  However, this is the way `define' has _always_
worked in every variant of Scheme, and the same is true of the analogous
`set' in Lisp from the very beginning.

> If you follow that logic, then Guile is left without any shorthand
> to create and initialize a mutable string other than
>  
> (define y (substring "hello" 0))
> or 
> (define y (string-copy "hello"))

Guile provides all the machinery you need to define shorthand syntax if
you like, e.g.:

  (define-syntax-rule (define-string v s) (define v (string-copy s)))

For that matter, you could also do something like this:

  (define-syntax define
    (lambda (x)
      (with-syntax ((orig-define #'(@ (guile) define)))
        (syntax-case x ()
          ((_ (proc arg ...) e0 e1 ...)
           #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
          ((_ v e)
           (identifier? #'v)
           (if (string? (syntax->datum #'e))
               #'(orig-define v (string-copy e))
               #'(orig-define v e)))))))

This will change `define' (in the module where it's defined) to
automatically copy a bare string literal on the right side.  Note that
this check is done at compile-time, so it can't look at the dynamic type
of an expression.

If that's not good enough and you're willing to take the efficiency hit
at runtime for _every_ use of `define', you could change `define' to
wrap the right-hand expression within a procedure call to check for
read-only strings:

  (define (copy-if-string x)
    (if (string? x)
        (string-copy x)
        x))
  
  (define-syntax define
    (lambda (x)
      (with-syntax ((orig-define #'(@ (guile) define)))
        (syntax-case x ()
          ((_ (proc arg ...) e0 e1 ...)
           #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
          ((_ v e)
           #'(orig-define v (copy-if-string e)))))))

Scheme's nice handling of hygiene should make redefining `define' within
your own modules (including (guile-user)) harmless.  If it doesn't,
that's a bug and we'd like to hear about it.
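
A minimal usage sketch of the `define-string' shorthand above, assuming a
Guile recent enough to provide `define-syntax-rule'.  The name `greeting'
is illustrative only and does not come from anyone's code in this thread:

```scheme
;; Shorthand from the message above: bind V to a fresh, mutable copy of S.
(define-syntax-rule (define-string v s)
  (define v (string-copy s)))

(define-string greeting "hello")   ; `greeting' is a hypothetical name
(string-set! greeting 0 #\j)       ; allowed: the copy is mutable
(display greeting)                 ; prints jello
(newline)
```

The literal "hello" in the program text is never touched; only the copy
bound to `greeting' changes.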

> It was wrong to change this without deprecating it first.

The only change here was to add the machinery to detect an error that
was _always_ an error.  It _never_ did what you say that it should do.

What it did before was fail to detect that you were changing the string
constant in the program text itself.  The Guile 1.8 example I gave in my
last email in this thread demonstrates that.

To make that point even clearer, I'll post the full copy of the error
message Guile 1.8 gave when my loop ran past the end of the string:

  guile> (let loop ((i 0))
           (define y "hello")
           (display y)
           (newline)
           (string-set! y i #\a)
           (loop (1+ i)))
  hello
  aello
  aallo
  aaalo
  aaaao
  aaaaa
  
  Backtrace:
  In standard input:
     2: 0* [loop 0]
  In unknown file:
     ?: 1  (letrec ((y "aaaaa")) (display y) ...)
     ...
     ?: 2  (letrec ((y "aaaaa")) (display y) ...)
  In standard input:
     2: 3* [string-set! "aaaaa" {5} #\a]
  
  standard input:2:60: In procedure string-set! in expression (string-set! y i ...):
  standard input:2:60: Value out of range 0 to 4: 5
  ABORT: (out-of-range)
  guile> 

Take a look at the backtrace, where it helpfully shows you an excerpt of
the source code (admittedly after some transformation).  See how the
source code itself has been modified?  This is what Bruce's code does.
It was _always_ a serious error in the code, even if it went undetected
in earlier versions of Guile.

     Mark




* Re: Guile: What's wrong with this?
  2012-01-04 17:16           ` Bruce Korb
@ 2012-01-04 17:21             ` Andy Wingo
  2012-01-04 17:39             ` David Kastrup
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 17:21 UTC (permalink / raw)
  To: Bruce Korb; +Cc: Ian Price, guile-devel

On Wed 04 Jan 2012 12:16, Bruce Korb <bkorb@gnu.org> writes:

>> We could add a compiler option to turn string literals into (string-copy
>> FOO).  Perhaps that's the thing to do.
>
> No, because your clients have no control over how Guile gets built.
> We _do_ have control over startup code, however:

I meant the Scheme compiler, Bruce -- the one that is in Guile.  Not the
C compiler used to compile Guile.

Andy
-- 
http://wingolog.org/




* Re: Guile: What's wrong with this?
  2012-01-04 16:47         ` Andy Wingo
  2012-01-04 17:14           ` David Kastrup
@ 2012-01-04 17:30           ` Bruce Korb
  2012-01-04 17:44             ` David Kastrup
  2012-01-04 18:26             ` Ian Price
  2012-01-04 18:31           ` Guile: What's wrong with this? Mark H Weaver
  2 siblings, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 17:30 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Mark H Weaver, guile-devel@gnu.org

On 01/04/12 08:47, Andy Wingo wrote:
> I was going to propose a workaround with an option to change
> vm-i-loader.c:43 and vm-i-loader.c:115 to use a
> scm_i_mutable_string_literals_p instead of 1, but that really seems like
> the path to perdition: previously compiled modules would start creating
> mutable strings where they really shouldn't.

Instead, long-standing, previously written code was invalidated with 1.9,
even if we were not smacked down until 2.0.1.

Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
document said it was okay doesn't make it okay to those whacked by it.
I would think recompiling should not be a great burden, *ESPECIALLY*
given that it is a recent invention and therefore likely to have some
initial issues that need dealing with.  Like this, for example.




* Re: Guile: What's wrong with this?
  2012-01-04 17:14           ` David Kastrup
@ 2012-01-04 17:32             ` Andy Wingo
  2012-01-04 17:49               ` David Kastrup
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 17:32 UTC (permalink / raw)
  To: David Kastrup; +Cc: guile-devel

On Wed 04 Jan 2012 12:14, David Kastrup <dak@gnu.org> writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> We could add a compiler option to turn string literals into
>> (string-copy FOO).  Perhaps that's the thing to do.
>
> What for?  It would mean that a literal would not be eq? to itself, a
> nightmare for memoization purposes.

  (eq? "hello" "hello")

This expression may be true or false.  It will be true in some
circumstances and false in others, in all versions of Guile.

> And for what?  For making code with explicitly undefined behavior
> exhibit a particular behavior that is undesirable in general.

The Scheme reports and the Guile manual are both positive and negative
specification: they require the implementation to do certain things, and
they allow it to do certain others.  Eq? on literals is one of the
liberties afforded to the implementation, and with good reason.  Correct
programs don't assume anything about the identities (in the sense of
eq?) of literals.

Andy
-- 
http://wingolog.org/




* Re: bytevector -- was: Guile: What's wrong with this?
  2012-01-04  3:12             ` Noah Lavine
@ 2012-01-04 17:37               ` Bruce Korb
  0 siblings, 0 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 17:37 UTC (permalink / raw)
  To: Noah Lavine; +Cc: Ludovic Courtès, guile-devel

Hi,

On 01/03/12 19:12, Noah Lavine wrote:
>> Then it turned out that the string functions would now clear the
>> high order bit on strings, so they are no longer byte arrays and
>> there is no replacement but to roll my own.  I stopped supporting
>> byte arrays.  A noticeable nuisance.
>
> This is just a side note to the main discussion, but there is now a
> 'bytevector' datatype you can use. Does that work for you? If not,
> what functionality is missing?
>
> Thanks,

Oh, no, thank _you_!  That is likely what I need.  I don't track
Guile development closely.  I have GUILE_WARN_DEPRECATED set to
"detailed" and expect that to warn me when issues are coming up.
It has actually yet to do so, however.  Imagine my surprise.

Cheers - Bruce




* Re: Guile: What's wrong with this?
  2012-01-04 17:16           ` Bruce Korb
  2012-01-04 17:21             ` Andy Wingo
@ 2012-01-04 17:39             ` David Kastrup
  2012-01-04 21:52             ` Ian Price
  2012-01-04 22:46             ` Ludovic Courtès
  3 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-04 17:39 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 04:19, Ian Price wrote:
>> ...  As for mutable strings, I consider them
>> a mistake to begin with,...
>
> Let's step back and consider the whole point of Guile in the first place.
>
> My understanding is that one primary purpose is to be a facilitation
> language so that application developers have less to worry about and
> futz over.  An extension language, if you like that phrase.  As such,
> it would seem to me that a primary design goal would be to make the
> pathway as smooth as possible, rather than trying to emulate C and/or
> official Scheme language specs as closely as possible.  To me, my primary
> concern is doing my little thing with the least total hassle.  Having
> to study up on and thoroughly understand the Scheme language seems
> a lot harder than just using Perl (or what-have-you).  Most scripting
> languages don't cut you off at the knees (change interfaces).
>
> So my main question is:
>
>   Which is the higher priority, language purity or ease of use?

Encouraging language abuse like making _literals_ not eq? to themselves
makes a language unpredictable.  That is not a road to ease of use.  It
is a dead end.

> and fix the 1.9 bug (scribbling on shared strings) by making them
> copy-on-write thingys.

So you want to give eq? unpredictable semantics as well.  What else has
made your black list of things to sacrifice in order to keep undefined
code working in a particular undefined way?

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 17:30           ` Bruce Korb
@ 2012-01-04 17:44             ` David Kastrup
  2012-01-04 18:26             ` Ian Price
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-04 17:44 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bruce.korb@gmail.com> writes:

> On 01/04/12 08:47, Andy Wingo wrote:
>> I was going to propose a workaround with an option to change
>> vm-i-loader.c:43 and vm-i-loader.c:115 to use a
>> scm_i_mutable_string_literals_p instead of 1, but that really seems like
>> the path to perdition: previously compiled modules would start creating
>> mutable strings where they really shouldn't.
>
> Instead, long-standing, previously written code was invalidated with
> 1.9, even if we were not smacked down until 2.0.1.

Yes, that is an inherent problem of writing code with undefined
behavior.  The only way to keep it working in the exact same manner is
to use the exact same interpreter.  And in the age of allocation
randomization and multi-threading, not even that is reliable.

> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
> document said it was okay doesn't make it okay to those whacked by it.

There was _never_ _any_ document that stated writing to literals was ok.
You did so entirely on your own initiative and just were lucky that it
happened to work under certain circumstances for a while.  If people
like to whack themselves, there is little one can do to keep them from
doing so.  They'll always find a way.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 17:32             ` Andy Wingo
@ 2012-01-04 17:49               ` David Kastrup
  2012-01-04 18:09                 ` Andy Wingo
  0 siblings, 1 reply; 117+ messages in thread
From: David Kastrup @ 2012-01-04 17:49 UTC (permalink / raw)
  To: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> On Wed 04 Jan 2012 12:14, David Kastrup <dak@gnu.org> writes:
>
>> Andy Wingo <wingo@pobox.com> writes:
>>
>>> We could add a compiler option to turn string literals into
>>> (string-copy FOO).  Perhaps that's the thing to do.
>>
>> What for?  It would mean that a literal would not be eq? to itself, a
>> nightmare for memoization purposes.
>
>   (eq? "hello" "hello")
>
> This expression may be true or false.  It will be true in some
> circumstances and false in others, in all versions of Guile.

To itself.  Not to a literal written in the same manner.

(define (zap) "hello")
(eq? (zap) (zap))

This expression may not choose to be true or false.

>> And for what?  For making code with explicitly undefined behavior
>> exhibit a particular behavior that is undesirable in general.
>
> The Scheme reports and the Guile manual are both positive and negative
> specification: they require the implementation to do certain things,
> and they allow it to do certain others.  Eq? on literals is one of the
> liberties afforded to the implementation, and with good reason.
> Correct programs don't assume anything about the identities (in the
> sense of eq?) of literals.

Of _different_ literals spelled in the same way.  But one and the same
literal has to be eq? to itself.  It can't just replace itself with a
non-eq? copy on a whim.
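
A small sketch of the distinction being drawn here, assuming plain Guile;
`zap-copy' is a hypothetical helper added for contrast and is not part of
any proposal in this thread:

```scheme
(define (zap) "hello")                      ; returns the literal itself
(define (zap-copy) (string-copy "hello"))   ; returns a fresh string per call

(display (eq? (zap-copy) (zap-copy)))       ; #f: two distinct objects
(newline)
(display (equal? (zap-copy) (zap-copy)))    ; #t: same contents
(newline)
;; Per the argument above, one and the same literal should be eq? to itself:
(display (eq? (zap) (zap)))
(newline)
```

Code that compares strings with `equal?' or `string=?' is unaffected by
whether a literal is copied; only `eq?'-based identity checks are.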

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 17:49               ` David Kastrup
@ 2012-01-04 18:09                 ` Andy Wingo
  0 siblings, 0 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 18:09 UTC (permalink / raw)
  To: David Kastrup; +Cc: guile-devel

On Wed 04 Jan 2012 12:49, David Kastrup <dak@gnu.org> writes:

> (define (zap) "hello")
> (eq? (zap) (zap))
>
> This expression may not choose to be true or false.

Indeed, good point.

Andy
-- 
http://wingolog.org/




* Re: Guile: What's wrong with this?
  2012-01-04 17:30           ` Bruce Korb
  2012-01-04 17:44             ` David Kastrup
@ 2012-01-04 18:26             ` Ian Price
  2012-01-04 18:48               ` Mark H Weaver
  2012-01-04 19:29               ` Bruce Korb
  1 sibling, 2 replies; 117+ messages in thread
From: Ian Price @ 2012-01-04 18:26 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel@gnu.org

Bruce Korb <bruce.korb@gmail.com> writes:

> On 01/04/12 08:47, Andy Wingo wrote:
>> I was going to propose a workaround with an option to change
>> vm-i-loader.c:43 and vm-i-loader.c:115 to use a
>> scm_i_mutable_string_literals_p instead of 1, but that really seems like
>> the path to perdition: previously compiled modules would start creating
>> mutable strings where they really shouldn't.
>
> Instead, long-standing, previously written code was invalidated with 1.9,
long-standing, previously written _buggy_ code.

> even if we were not smacked down until 2.0.1.
>
> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
> document said it was okay doesn't make it okay to those whacked by it.
There's an old saying, "Ignorance of the law is no excuse". If I wrote C
code that doesn't conform to the C standard and depended on
implementation specific behaviour, I have no recourse if it breaks on a
different compiler. Guile explicitly claims to conform to the r5rs (and
partially to the r6rs), both of which make this behaviour undefined, and
srfi 13 explicitly makes this an error. (And FWIW I would not consider
the R5RS obscure to people who have used scheme for even a short while,
nor is it a terrific burden to read at 50 pages)

Now, if you want to argue your position, it'd be better to argue that
guile goes beyond r[56]rs in making these promises with regards to strings.

For instance, substring-fill! as found at
https://www.gnu.org/software/guile/manual/html_node/String-Modification.html
implies that string literals are mutable

— Scheme Procedure: substring-fill! str start end fill
— C Function: scm_substring_fill_x (str, start, end, fill)

    Change every character in str between start and end to fill.

              (define y "abcdefg")
              (substring-fill! y 1 3 #\r)
              y
              ⇒ "arrdefg"

So too does string-upcase!
(https://www.gnu.org/software/guile/manual/html_node/Alphabetic-Case-Mapping.html),
if we assume y is the same binding in both functions

— Scheme Procedure: string-upcase! str [start [end]]
— C Function: scm_substring_upcase_x (str, start, end)
— C Function: scm_string_upcase_x (str)

    Destructively upcase every character in str.

              (string-upcase! y)
              ⇒ "ARRDEFG"
              y
              ⇒ "ARRDEFG"

The same goes for string-downcase! and string-capitalize!

I think it would be fair to say that someone could surmise that literal
strings are meant to be mutable from these examples, and, if we do go
down the immutable string literal route these examples would need to be
addressed.

On the other hand, you can argue that string literal immutability is
implied by

— Scheme Procedure: string-for-each-index proc s [start [end]]
— C Function: scm_string_for_each_index (proc, s, start, end)

    Call (proc i) for each index i in s, from left to right.

    For example, to change characters to alternately upper and lower case,
              (define str (string-copy "studly"))
              (string-for-each-index
                  (lambda (i)
                    (string-set! str i
                      ((if (even? i) char-upcase char-downcase)
                       (string-ref str i))))
                  str)
              str ⇒ "StUdLy"

but on a purely numerical basis, mutability 4 - 0 immutability

> I would think recompiling should not be a great burden, *ESPECIALLY*
At this stage, I think that argument is fair enough, other people's
mileage may vary.

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"




* Re: Guile: What's wrong with this?
  2012-01-04 16:47         ` Andy Wingo
  2012-01-04 17:14           ` David Kastrup
  2012-01-04 17:30           ` Bruce Korb
@ 2012-01-04 18:31           ` Mark H Weaver
  2012-01-04 18:43             ` Andy Wingo
  2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 18:31 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Bruce Korb, guile-devel

Andy Wingo <wingo@pobox.com> writes:
> We could add a compiler option to turn string literals into (string-copy
> FOO).  Perhaps that's the thing to do.

I think this would be fine, as long as the default is _not_ to copy
string literals.  This would help Bruce a great deal with very little
effort on our part, without mucking up the semantics for anyone else.

David Kastrup <dak@gnu.org> writes:
> What for?  It would mean that a literal would not be eq? to itself, a
> nightmare for memoization purposes.

I agree that it should not be the default behavior, but I don't see the
harm in allowing users to compile their own code this way.  The
memoization argument is a bit thin.  How often is it useful to memoize
against string arguments using eq? as the equality predicate?  Remember,
this would only be for code that explicitly changed this compilation
option.

     Best,
      Mark




* Re: Guile: What's wrong with this?
  2012-01-04 18:31           ` Guile: What's wrong with this? Mark H Weaver
@ 2012-01-04 18:43             ` Andy Wingo
  2012-01-04 19:29               ` Mark H Weaver
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 18:43 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Bruce Korb, guile-devel

On Wed 04 Jan 2012 13:31, Mark H Weaver <mhw@netris.org> writes:

> Andy Wingo <wingo@pobox.com> writes:
>> We could add a compiler option to turn string literals into (string-copy
>> FOO).  Perhaps that's the thing to do.
>
> I think this would be fine, as long as the default is _not_ to copy
> string literals.  This would help Bruce a great deal with very little
> effort on our part, without mucking up the semantics for anyone else.

Yes, this was what I was thinking.

> David Kastrup <dak@gnu.org> writes:
>> What for?  It would mean that a literal would not be eq? to itself, a
>> nightmare for memoization purposes.
>
> I agree that it should not be the default behavior, but I don't see the
> harm in allowing users to compile their own code this way.

Well, we can fix this too: we can make

  "foo"

transform to

  (copy-once UNIQUE-GENSYM str)

with

(define (copy-once key str)
  (or (hashq-ref mutable-string-literals key)
      (let ((value (string-copy str)))
        (hashq-set! mutable-string-literals key value)
        value)))
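
To make the sketch above self-contained, one could assume a global
eq?-keyed table and use a quoted symbol in place of the
compiler-generated UNIQUE-GENSYM.  Both the table definition and the key
`site-1' are assumptions for illustration, not existing Guile internals:

```scheme
;; Assumed: one global table, keyed with eq?, one entry per literal site.
(define mutable-string-literals (make-hash-table))

(define (copy-once key str)
  (or (hashq-ref mutable-string-literals key)
      (let ((value (string-copy str)))
        (hashq-set! mutable-string-literals key value)
        value)))

;; 'site-1 stands in for the key the compiler would generate.
(define (zap) (copy-once 'site-1 "hello"))

(display (eq? (zap) (zap)))   ; #t: the same copy is returned every time
(newline)
(string-set! (zap) 0 #\j)     ; allowed: the stored copy is mutable
(display (zap))               ; prints jello
(newline)
```

The literal stays eq? to itself across evaluations, and a mutation is
visible to every later evaluation of the same site, as it was in 1.8.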

Andy
-- 
http://wingolog.org/




* Re: Guile: What's wrong with this?
  2012-01-04 18:26             ` Ian Price
@ 2012-01-04 18:48               ` Mark H Weaver
  2012-01-04 19:29               ` Bruce Korb
  1 sibling, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 18:48 UTC (permalink / raw)
  To: Ian Price; +Cc: guile-devel

Ian Price <ianprice90@googlemail.com> writes:

> — Scheme Procedure: substring-fill! str start end fill
> — C Function: scm_substring_fill_x (str, start, end, fill)
>
>     Change every character in str between start and end to fill.
>
>               (define y "abcdefg")
>               (substring-fill! y 1 3 #\r)
>               y
>               ⇒ "arrdefg"
>
> So too does string-upcase!

Ugh, thanks for pointing this out!  Fixed.  Any others?

    Mark




* Re: Guile: What's wrong with this?
  2012-01-04 18:43             ` Andy Wingo
@ 2012-01-04 19:29               ` Mark H Weaver
  2012-01-04 19:43                 ` Andy Wingo
  0 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 19:29 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Bruce Korb, guile-devel

Andy Wingo <wingo@pobox.com> writes:

>> David Kastrup <dak@gnu.org> writes:
>>> What for?  It would mean that a literal would not be eq? to itself, a
>>> nightmare for memoization purposes.
>>
>> I agree that it should not be the default behavior, but I don't see the
>> harm in allowing users to compile their own code this way.
>
> Well, we can fix this too: we can make
>
>   "foo"
>
> transform to
>
>   (copy-once UNIQUE-GENSYM str)
>
> with
>
> (define (copy-once key str)
>   (or (hashq-ref mutable-string-literals key)
>       (let ((value (string-copy str)))
>         (hashq-set! mutable-string-literals key value)
>         value)))

Although this is a closer emulation of the previous (broken) behavior,
IMHO this would be less desirable than simply doing (string-copy "foo")
on every evaluation of "foo", which seems to be what Bruce (and probably
others) expected "foo" to do.

For example, based on the mental model that Bruce apparently had when he
wrote his code, he might have written something like this:

  (define (hello-world-with-one-char-changed i c)
    (define str "Hello world")
    (string-set! str i c)
    str)

Your UNIQUE-GENSYM hack emulates the previous behavior that makes the
above procedure buggy.  Simply changing "Hello world" to
(string-copy "Hello world") would make the procedure work, and I believe
conforms better to what Bruce expects.
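
As a sketch of the copy-on-every-evaluation variant, written out by hand
rather than via any compiler option (no such option exists at the time
of writing):

```scheme
(define (hello-world-with-one-char-changed i c)
  (define str (string-copy "Hello world"))   ; fresh copy on every call
  (string-set! str i c)
  str)

(display (hello-world-with-one-char-changed 0 #\J))   ; prints Jello world
(newline)
;; The second call starts again from the untouched literal:
(display (hello-world-with-one-char-changed 4 #\!))   ; prints Hell! world
(newline)
```

With a copy-once scheme the second call would instead see the string
already modified by the first.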

Of course, I'm only talking about what I think should be done when the
compiler option is changed to non-default behavior.  I strongly believe
that the _default_ behavior should stay as it is now.

      Mark




* Re: Guile: What's wrong with this?
  2012-01-04 18:26             ` Ian Price
  2012-01-04 18:48               ` Mark H Weaver
@ 2012-01-04 19:29               ` Bruce Korb
  2012-01-04 20:20                 ` David Kastrup
  2012-01-04 23:19                 ` Mark H Weaver
  1 sibling, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 19:29 UTC (permalink / raw)
  To: Ian Price; +Cc: guile-devel@gnu.org

On 01/04/12 10:26, Ian Price wrote:
>> Just because an obscure-to-those-not-living-and-breathing-Scheme-daily
>> document said it was okay doesn't make it okay to those whacked by it.
> There's an old saying, "Ignorance of the law is no excuse". If I wrote C
> code that doesn't conform to the C standard

I did.  The standard changed.  My code broke.  The fix for
read-only string literals was obvious and straightforward.
The fix for pointer aliasing is virtually impossible, except
to -fno-strict-aliasing for GCC.  Yes, new code, fine, but
the millions of lines of old code I deal with? No way.

I think I've seen a reasonable way to go forward:  an option
to always copy newly defined strings.  I am also a little curious:
since this fault occurred on a string brought in via my C function
named ag_scm_get() and it created the value with a call to
scm_str02scm, shouldn't that function have created a mutable
string copy?

> Now, if you want to argue your position, it'd be better to argue that
> guile goes beyond r[56]rs in making these promises with regards to strings.

My number 1 argument may not be the strongest argument.
My number 1 argument is that Guile, being an extension language,
needs to be as forgiving and easy to use as it can possibly be
because its client programmers (programmers using it) want to
know as absolutely little as possible about it.  No, I do *not*
want to read, understand and remember 50 pages of stuff so that
I can use Guile as an extension language.  The memory barrier is
much, *MUCH* lower for other scripting languages.

> For instance, substring-fill! as found at
> https://www.gnu.org/software/guile/manual/html_node/String-Modification.html
> implies that string literals are mutable
>
> — Scheme Procedure: substring-fill! str start end fill
> — C Function: scm_substring_fill_x (str, start, end, fill)
>
>      Change every character in str between start and end to fill.
>
>                (define y "abcdefg")
>                (substring-fill! y 1 3 #\r)
>                y
>                ⇒ "arrdefg"

Who knows where I learned the idiom.  I learned the minimal amount of Guile
needed for my purposes a dozen years ago.  My actual problem stems from this:

> Backtrace:
> In ice-9/boot-9.scm:
>  170: 3 [catch #t #<catch-closure 8b75a0> ...]
> In unknown file:
>    ?: 2 [catch-closure]
> In ice-9/eval.scm:
>  420: 1 [eval # ()]
> In unknown file:
>    ?: 0 [string-upcase ""]
>
> ERROR: In procedure string-upcase:
> ERROR: string is read-only: ""
> Scheme evaluation error.  AutoGen ABEND-ing in template
> 	confmacs.tlib on line 209
> Failing Guile command:  = = = = =
>
> (set! tmp-text (get "act-text"))
>        (set! TMP-text (string-upcase tmp-text))

What in heck is string-upcase doing trying to write to its input string?
Why was the string returned by ag_scm_get() (the function bound to "get")
an immutable string anyway?

> SCM
> ag_scm_get(SCM agName, SCM altVal)
> {
>     tDefEntry*  pE;
>     ag_bool     x;
>
>     pE = (! AG_SCM_STRING_P(agName)) ? NULL :
>         findDefEntry(ag_scm2zchars(agName, "ag value"), &x);
>
>     if ((pE == NULL) || (pE->valType != VALTYP_TEXT)) {
>         if (AG_SCM_STRING_P(altVal))
>             return altVal;
>         return AG_SCM_STR02SCM(zNil);
>     }
>
>     return AG_SCM_STR02SCM(pE->val.pzText);
> }

"AG_SCM_STR02SCM" is either scm_makfrom0str or scm_from_locale_string,
depending on the age of the Guile library.  "zNil" is a pointer to a NUL
byte that is, indeed, in read only memory, but surely scm_from_locale_string
would not have been written in a way to detect that and add that attribute
because of doing a memory probe.  Further, it cannot be implemented in a
way that does not copy it because I will most certainly call
scm_from_locale_string using a pointer to memory that is immediately
deallocated.  It *MUST* copy the string.  So what is this really about anyway?
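
The copying argument above can be sketched in a few lines of C. This is only a toy model: `toy_string`, `toy_from_c_string` and `toy_free` are hypothetical names invented for illustration, not Guile's API or its internal layout. It shows why a constructor that takes a caller-owned `char *` has to duplicate the bytes if the caller is free to release the buffer right away.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Scheme string object (not Guile's layout). */
typedef struct {
    char  *chars;   /* privately owned, heap-allocated copy */
    size_t len;
} toy_string;

/* Copy the caller's bytes so the caller may free or reuse its buffer.
   Returns NULL on allocation failure. */
toy_string *toy_from_c_string(const char *src)
{
    toy_string *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->len = strlen(src);
    s->chars = malloc(s->len + 1);
    if (s->chars == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->chars, src, s->len + 1);   /* includes the trailing NUL */
    return s;
}

/* Release a toy_string and its private storage. */
void toy_free(toy_string *s)
{
    if (s != NULL) {
        free(s->chars);
        free(s);
    }
}
```

Because the object owns its own storage, where the source bytes lived (read-only memory, the stack, a heap block about to be freed) does not matter once the constructor has returned.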

> I think it would be fair to say that someone could surmise that literal
> strings are meant to be mutable from these examples, and, if we do go
> down the immutable string literal route these examples would need to be
> addressed.

:)  I think so.  Meanwhile, I think the solution is to allow
Guile clients to say, with some initialization code of some sort,
"copy my input strings" so the immutability flag is not set.
(I do think it correct to not scribble on shared strings....)

Thank you for your help!  Regards, Bruce




* Re: Guile: What's wrong with this?
  2012-01-04 19:29               ` Mark H Weaver
@ 2012-01-04 19:43                 ` Andy Wingo
  2012-01-04 20:08                   ` Bruce Korb
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 19:43 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Bruce Korb, guile-devel

On Wed 04 Jan 2012 14:29, Mark H Weaver <mhw@netris.org> writes:

> Although this is a closer emulation of the previous (broken) behavior,
> IMHO this would be less desirable than simply doing (string-copy "foo")
> on every evaluation of "foo", which seems to be what Bruce (and probably
> others) expected "foo" to do.

Thing is, why are we doing this?  We know what the correct behavior is,
as you say:

> Of course, I'm only talking about what I think should be done when the
> compiler option is changed to non-default behavior.  I strongly believe
> that the _default_ behavior should stay as it is now.

The correct behavior is the status quo.  We are considering adding a
hack to produce different behavior for compatibility purposes.  We don't
have to worry about correctness in that case, only compatibility.  IMO
anyway :)

Andy
-- 
http://wingolog.org/




* Re: Guile: What's wrong with this?
  2012-01-04 19:43                 ` Andy Wingo
@ 2012-01-04 20:08                   ` Bruce Korb
  2012-01-04 20:14                     ` David Kastrup
  2012-01-04 20:56                     ` Andy Wingo
  0 siblings, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 20:08 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Mark H Weaver, guile-devel

On 01/04/12 11:43, Andy Wingo wrote:
> The correct behavior is the status quo.  We are considering adding a
> hack to produce different behavior for compatibility purposes.  We don't
> have to worry about correctness in that case, only compatibility.  IMO
> anyway :)

It would be a nice added benefit if it worked as one would expect.
viz., you make actual, writable copies of strings you pull in so that
if the string-upcase function were to modify its input, then it
would not affect other SCMs with values that happen to be the same
sequence of bytes.




* Re: Guile: What's wrong with this?
  2012-01-04 20:08                   ` Bruce Korb
@ 2012-01-04 20:14                     ` David Kastrup
  2012-01-04 20:56                     ` Andy Wingo
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-04 20:14 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bruce.korb@gmail.com> writes:

> On 01/04/12 11:43, Andy Wingo wrote:
>> The correct behavior is the status quo.  We are considering adding a
>> hack to produce different behavior for compatibility purposes.  We don't
>> have to worry about correctness in that case, only compatibility.  IMO
>> anyway :)
>
> It would be a nice added benefit if it worked as one would expect.
> viz., you make actual, writable copies of strings you pull in so that
> if the string-upcase function were to modify its input, then it
> would not affect other SCMs with values that happen to be the same
> sequence of bytes.

If string-upcase modifies its input (or needs a mutable string to start
with), this is a bug, in contrast to what string-upcase! may do.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 19:29               ` Bruce Korb
@ 2012-01-04 20:20                 ` David Kastrup
  2012-01-04 23:19                 ` Mark H Weaver
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-04 20:20 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bruce.korb@gmail.com> writes:

> Who knows where I learned the idiom.  I learned the minimal amount of
> Guile needed for my purposes a dozen years ago.  My actual problem
> stems from this:
>
>> Backtrace:
>> In ice-9/boot-9.scm:
>>  170: 3 [catch #t #<catch-closure 8b75a0> ...]
>> In unknown file:
>>    ?: 2 [catch-closure]
>> In ice-9/eval.scm:
>>  420: 1 [eval # ()]
>> In unknown file:
>>    ?: 0 [string-upcase ""]
>>
>> ERROR: In procedure string-upcase:
>> ERROR: string is read-only: ""
>> Scheme evaluation error.  AutoGen ABEND-ing in template
>> 	confmacs.tlib on line 209
>> Failing Guile command:  = = = = =
>>
>> (set! tmp-text (get "act-text"))
>>        (set! TMP-text (string-upcase tmp-text))
>
> What in heck is string-upcase doing trying to write to its input
> string?

This looks like it might be just a bug.  Could be that string-upcase
creates its own copy of the string incorrectly including the immutable
bit and then tries changing the string.

No reason to play helter-skelter with the language.  Instead the bug
should be fixed.
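
The distinction drawn here between `string-upcase` and `string-upcase!` can be illustrated with a small C analogy. The function names below are made up for the sketch and say nothing about how Guile implements either procedure; the point is only that the non-destructive variant reads its input and writes to storage it allocated itself, so it should never need a mutable argument.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Destructive variant, analogous in spirit to string-upcase!:
   requires a writable buffer and modifies it. */
void upcase_in_place(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/* Non-destructive variant, analogous in spirit to string-upcase:
   only reads the input and returns a freshly allocated result that
   the caller must free.  Returns NULL on allocation failure. */
char *upcase_copy(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s, n + 1);
    upcase_in_place(out);   /* mutating our own fresh copy is safe */
    return out;
}
```

Under this model, passing a read-only empty string to `upcase_copy` is harmless, which is why an error from the non-destructive procedure looks like a bug rather than a language rule.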

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 20:08                   ` Bruce Korb
  2012-01-04 20:14                     ` David Kastrup
@ 2012-01-04 20:56                     ` Andy Wingo
  2012-01-04 21:30                       ` Bruce Korb
  1 sibling, 1 reply; 117+ messages in thread
From: Andy Wingo @ 2012-01-04 20:56 UTC (permalink / raw)
  To: Bruce Korb; +Cc: Mark H Weaver, guile-devel

On Wed 04 Jan 2012 15:08, Bruce Korb <bruce.korb@gmail.com> writes:

> On 01/04/12 11:43, Andy Wingo wrote:
>> The correct behavior is the status quo.  We are considering adding a
>> hack to produce different behavior for compatibility purposes.  We don't
>> have to worry about correctness in that case, only compatibility.  IMO
>> anyway :)
>
> It would be a nice added benefit if it worked as one would expect.

I think that in this case, your expectations are just incorrect.  I
don't mean this rudely.  I think you will be happier and more productive
if you change your expectations in this regard to better match "reality"
(the state of things, common practice, conventional Scheme wisdom, etc).

Andy
-- 
http://wingolog.org/




* Re: Guile: What's wrong with this?
  2012-01-04  3:04       ` Mike Gran
  2012-01-04  9:35         ` nalaginrut
  2012-01-04  9:41         ` David Kastrup
@ 2012-01-04 21:07         ` Ludovic Courtès
  2 siblings, 0 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-04 21:07 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel@gnu.org

Hi!

Mike Gran <spk121@yahoo.com> skribis:

>>   In many systems it is desirable for constants (i.e. the values of literal
>>   expressions) to reside in read-only-memory.  To express this, it is
>>   convenient to imagine that every object that denotes locations is
>>   associated with a flag telling whether that object is mutable or immutable.
>>   In such systems literal constants and the strings returned by
>>   `symbol->string' are immutable objects, while all objects created by
>>   the other procedures listed in this report are mutable.  It is an error
>>   to attempt to store a new value into a location that is denoted by an
>>   immutable object.

[...]

> The idea that the correct way to initialize a string is
> (define x (string-copy "string")) is awkward.  "string" is read-only
> but copying it makes it modifiable?  Copying implies mutability?

Sort-of:

  -- library procedure: string-copy string
      Returns a newly allocated copy of the given STRING.

And a “newly allocated copy” is mutable.

> Copying doesn't imply modifying mutability in any other data type.

It’s not about modifying mutability of an object (this can’t be done),
but about fresh vs. constant storage.
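
The "flag" picture from the quoted R5RS text, and the fresh vs. constant storage point, can be modeled roughly as follows. This is a hedged sketch: `flagged_string`, `literal_string`, `copy_string` and `string_set` are hypothetical names, and the layout is not Guile's. The flag is fixed when the object is created; copying never flips it on the original, it just produces a different object backed by fresh storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *chars;
    size_t len;
    bool   read_only;   /* fixed at creation, never changed afterwards */
} flagged_string;

/* Wrap constant storage without copying; the result is immutable. */
flagged_string literal_string(const char *text)
{
    flagged_string s = { (char *) text, strlen(text), true };
    return s;
}

/* Return a fresh, mutable copy (chars is NULL on allocation failure). */
flagged_string copy_string(const flagged_string *src)
{
    flagged_string s = { malloc(src->len + 1), src->len, false };
    if (s.chars != NULL) {
        memcpy(s.chars, src->chars, src->len);
        s.chars[src->len] = '\0';
    }
    return s;
}

/* Store c at index i.  Returns false and leaves the storage untouched
   if the string is immutable or i is out of range. */
bool string_set(flagged_string *s, size_t i, char c)
{
    if (s->read_only || i >= s->len)
        return false;
    s->chars[i] = c;
    return true;
}
```

In this model `string_set` on a literal is rejected up front, while the same call on the result of `copy_string` succeeds and leaves the literal's bytes as they were.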

> Why not change the behavior of 'define' to be (define y (substring str 0)) when STR
> is a read-only string?  This would preserve the shared memory if the variable is never
> modified but still make the string copy-on-write.

I think all sorts of literal strings would have to be treated the same.

FTR, all these evaluate to #t:

  (apply eq? "hello" '("hello"))
  (apply eq? '(1 2 3) '((1 2 3)))
  (apply eq? '#(1 2 3) '(#(1 2 3)))

This is fine per R5RS (info "(r5rs) Equivalence predicates"), but
different from Guile <= 1.8.

(I use ‘apply’ here to fool peval, which otherwise evaluates the
expressions to #f at compile-time.  Andy: should peval be hacked to give
the same answer?)

Thanks,
Ludo’.




* Re: Guile: What's wrong with this?
  2012-01-04  0:55           ` Bruce Korb
  2012-01-04  3:12             ` Noah Lavine
@ 2012-01-04 21:17             ` Ludovic Courtès
  2012-01-04 22:36               ` Bruce Korb
  1 sibling, 1 reply; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-04 21:17 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Hi Bruce,

Bruce Korb <bkorb@gnu.org> skribis:

> On 01/03/12 15:33, Ludovic Courtès wrote:
>> Could you point me to the affected code?  What would you think of using
>> string-copy as I suggested?  The disadvantage is that you need to modify
>> your code, but hopefully that can be automated with a sed script or so;
>> the advantage is that it would work with all versions of Guile.
>
> The disadvantage is that I know I have "clients" that have rolled their
> own templates, presumably by copy-and-edit processes that will invariably
> include (define var "string") syntax.

If the users' files are evaluated rather than compiled/loaded, this is
not a problem:

  scheme@(guile-user)> (eval (call-with-input-string "(define foo \"sdf\")" read) (interaction-environment))
  $9 = #<variable 32a8580 value: "sdf">
  scheme@(guile-user)> (string-set! (variable-ref $9) 1 #\x)
  scheme@(guile-user)> (variable-ref $9)
  $10 = "sxf"

Could you check whether this is the case?

In case it’s not, I have another possible solution in mind.  ;-)

> I'm sorry about being irritable.  This is the third problem with 2.x.

Yeah, I understand it can be really annoying and frustrating.  Believe
me, despite the breadth and depth of changes between 1.8 and 2.0, we did
our best to avoid such nuisances.  Hopefully we can help solve them with
you, so you can really benefit from 2.0 (it’s a significantly nicer
piece of software!)

Thanks,
Ludo’.




* Re: Guile: What's wrong with this?
  2012-01-04 20:56                     ` Andy Wingo
@ 2012-01-04 21:30                       ` Bruce Korb
  0 siblings, 0 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 21:30 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Mark H Weaver, guile-devel

On 01/04/12 12:56, Andy Wingo wrote:
> On Wed 04 Jan 2012 15:08, Bruce Korb <bruce.korb@gmail.com> writes:
>
>> On 01/04/12 11:43, Andy Wingo wrote:
>>> The correct behavior is the status quo.  We are considering adding a
>>> hack to produce different behavior for compatibility purposes.  We don't
>>> have to worry about correctness in that case, only compatibility.  IMO
>>> anyway :)
>>
>> It would be a nice added benefit if it worked as one would expect.
>
> I think that in this case, your expectations are just incorrect.  I
> don't mean this rudely.  I think you will be happier and more productive
> if you change your expectations in this regard to better match "reality"
> (the state of things, common practice, conventional Scheme wisdom, etc).

Going forward, yes, sure, like the pointer aliasing thing.
It was just never an issue with the original C model and it
became such later.  In this case, expectations were built
upon perl and shell scripting models, and it seemed to work
that way.  In any case, the specific problem that actually
triggered this whole thread was that scm_from_locale_string
seems to be returning a reference to an immutable string
(unexpected) *AND* the string-upcase function is objecting
to it (also unexpected).  Otherwise, I'd have gone on oblivious
to any sort of issue.  :)

Cheers - Bruce




* Re: Guile: What's wrong with this?
  2012-01-04 17:16           ` Bruce Korb
  2012-01-04 17:21             ` Andy Wingo
  2012-01-04 17:39             ` David Kastrup
@ 2012-01-04 21:52             ` Ian Price
  2012-01-04 22:18               ` Bruce Korb
  2012-01-04 22:46             ` Ludovic Courtès
  3 siblings, 1 reply; 117+ messages in thread
From: Ian Price @ 2012-01-04 21:52 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 04:19, Ian Price wrote:
>> ...  As for mutable strings, I consider them
>> a mistake to begin with,...
>
> Let's step back and consider the whole point of Guile in the first place.
This was not intended as an answer to this question, nor to be
representative of the guile developers / users / what-have-you, but a
personal opinion.

> So my main question is:
>
>   Which is the higher priority, language purity or ease of use?
That is a loaded question, as it presupposes ease of use is always the
same thing as impurity.  E.g., a zipper is just as usable IMO as a gap
buffer, and doesn't require mutability.

My opinion of mutable strings is that they have little practical use to
me in my day-to-day programming.  Frankly, I can count the number of
times I've mutated a string in any high-level language (so not C etc.)
over the past 4 or so years on one hand, and I consider most of those
uses mistaken in hindsight.  It isn't just functional programming types
who care about this; Python is a great example of a language which has
not been hindered by immutable strings.

The most common string operations in practice (for me) are
concatenation, substrings, comparison/searching, and iteration, and I
would think a better foundation for strings could be found by starting
there rather than with the premise that strings are basically a specific
type of vector.

And again, just to be clear, I'm not making a proposal, just stating an
opinion.

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"




* Re: Guile: What's wrong with this?
  2012-01-04 21:52             ` Ian Price
@ 2012-01-04 22:18               ` Bruce Korb
  2012-01-04 23:22                 ` Mike Gran
                                   ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 22:18 UTC (permalink / raw)
  To: guile-devel

On 01/04/12 13:52, Ian Price wrote:
>> So my main question is:
>>
>>    Which is the higher priority, language purity or ease of use?
> That is a loaded question, as it presupposes ease of use is always the
> same thing as impurity e.g. ...

Absolutely not.  Making decisions is always about trade-offs,
otherwise it is not really a decision.  Should you give preference
to language aesthetics, or preference to ease of use *when*
there is a divergence?  More often than not, language purity
(consistency) *improves* ease of use.  Here we are looking at
something that does not appear to me to improve ease of use.
You have to go to some extra trouble to be certain that a string
value that you have assigned to an SCM is not read only.
That is not convenience.  If Guile were to implement copy on write,
then the user would not have to care whether a string were
shared read only or not.  It would be easier to use.  The only code
that would care at all would be the Guile internals.  (Where it
belongs -- my completely unhumble opinion :)




* Re: Guile: What's wrong with this?
  2012-01-04 21:17             ` Ludovic Courtès
@ 2012-01-04 22:36               ` Bruce Korb
  2012-01-05  0:01                 ` Ludovic Courtès
  0 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 22:36 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi Ludo,

On 01/04/12 13:17, Ludovic Courtès wrote:
> If the users' files are evaluated rather than compiled/loaded, this is
> not a problem:

I do *all* guile processing via the ag_scm_c_eval_string_from_file_line
function.  I suck up a string from my input file, determine that it
needs guile processing and invoke that function.  It has this profile:

> SCM
> ag_scm_c_eval_string_from_file_line(
>     char const * pzExpr, char const * pzFile, int line);
>
> #define SCM_EVAL_CONST(_s) \
>     do { static file_line_t const fl = { __LINE__ - 1, __FILE__, _s }; \
>         pzLastScheme = fl.text; \
>         ag_scm_c_eval_string_from_file_line(fl.text, fl.file, fl.line); \
>     } while (0)

and I *can* redefine define because I start Guile with my own
initialization:

> #define SCHEME_INIT_FILE "directive.h"
> static const int  schemeLine = __LINE__+2;
> static char const zSchemeInit[3846] = // this is generated code...
> "(use-modules (ice-9 common-list))\n\
> ..................................";
>
>     pzLastScheme = zSchemeInit;
>     ag_scm_c_eval_string_from_file_line(
>         zSchemeInit, SCHEME_INIT_FILE, schemeLine);
>
>     SCM_EVAL_CONST("(add-hook! before-error-hook error-source-line)\n"
>                    "(use-modules (ice-9 stack-catch))");

> Could you check whether this is the case?

So it is the case.  My processing consists of slicing up the input
into a bunch of slivers based on markers.  I look at each sliver to
see how to process it.  Some are emitted directly, others trigger
internal mechanisms, a few are handed off to a separate server shell
process and finally, if the text starts with an open parenthesis
or a semi-colon (Guile comment marker), then Guile gets it via that call.

Thanks -Bruce




* Re: Guile: What's wrong with this?
  2012-01-04 10:03     ` Mark H Weaver
  2012-01-04 14:29       ` Mike Gran
@ 2012-01-04 22:37       ` Ludovic Courtès
  1 sibling, 0 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-04 22:37 UTC (permalink / raw)
  To: guile-devel

Hi!

Mark H Weaver <mhw@netris.org> skribis:

> For example, look at what Guile 1.8 does:
>
>   guile> (let loop ((i 0))
>            (define y "hello")
>            (display y)
>            (newline)
>            (string-set! y i #\a)
>            (loop (1+ i)))
>   hello
>   aello
>   aallo
>   aaalo
>   aaaao
>   aaaaa
>   <then an error>
>
> So you see, even in Guile 1.8, (define y "hello") didn't do what you
> thought it did.  It didn't fill y with the string "hello".  You were
> actually changing the program text itself, and that was a serious
> mistake.

Indeed, funny example!

Ludo’.





* Re: Guile: What's wrong with this?
  2012-01-04 17:16           ` Bruce Korb
                               ` (2 preceding siblings ...)
  2012-01-04 21:52             ` Ian Price
@ 2012-01-04 22:46             ` Ludovic Courtès
  3 siblings, 0 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-04 22:46 UTC (permalink / raw)
  To: guile-devel

Hello,

Bruce Korb <bkorb@gnu.org> skribis:

> So my main question is:
>
>   Which is the higher priority, language purity or ease of use?

FWIW I think “language purity” is one way to achieve “ease of use” (FSVO
“language purity” at least.)

Ludo’.





* Re: Guile: What's wrong with this?
  2012-01-04 19:29               ` Bruce Korb
  2012-01-04 20:20                 ` David Kastrup
@ 2012-01-04 23:19                 ` Mark H Weaver
  2012-01-04 23:28                   ` Bruce Korb
  2012-01-07 15:43                   ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver
  1 sibling, 2 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 23:19 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 944 bytes --]

Bruce Korb <bruce.korb@gmail.com> writes:

>> ERROR: In procedure string-upcase:
>> ERROR: string is read-only: ""
>> Scheme evaluation error.  AutoGen ABEND-ing in template
>> 	confmacs.tlib on line 209
>> Failing Guile command:  = = = = =
>>
>> (set! tmp-text (get "act-text"))
>>        (set! TMP-text (string-upcase tmp-text))
>
> What in heck is string-upcase doing trying to write to its input string?
> Why was the string returned by ag_scm_get() (the function bound to "get")
> an immutable string anyway?

Good questions indeed.  I spent a bunch of time investigating this, and
found some bugs that might have caused this problem, although I'm not
certain.

Bruce: Can you please see if the patch below fixes this problem?

Mike: Would you be willing to review this (very small) patch to see if
it makes sense to you?  I'd like a second opinion from someone familiar
with that subsystem before I commit it.

     Thanks,
       Mark



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: [PATCH] Fix bugs related to mutation-sharing substrings --]
[-- Type: text/x-patch, Size: 1925 bytes --]

From a8da72937ff4d04e8d39531773cc05e676b2be1c Mon Sep 17 00:00:00 2001
From: Mark H Weaver <mhw@netris.org>
Date: Wed, 4 Jan 2012 17:59:27 -0500
Subject: [PATCH] Fix bugs related to mutation-sharing substrings

* libguile/strings.c (scm_i_is_narrow_string, scm_i_try_narrow_string,
  scm_i_string_set_x): Check to see if the provided string is a
  mutation-sharing substring, and do the right thing in that case.
  Previously, if such a string was passed to these functions, they would
  behave very badly: while trying to fetch and/or mutate the cell
  containing the stringbuf, they were actually fetching or mutating the
  cell containing the original shared string.  That's because
  mutation-sharing substrings store the original string in CELL_1,
  whereas all other strings store the stringbuf there.
---
 libguile/strings.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/libguile/strings.c b/libguile/strings.c
index 666a951..1628aee 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -436,6 +436,9 @@ scm_i_string_length (SCM str)
 int
 scm_i_is_narrow_string (SCM str)
 {
+  if (IS_SH_STRING (str))
+    str = SH_STRING_STRING (str);
+
   return !STRINGBUF_WIDE (STRING_STRINGBUF (str));
 }
 
@@ -446,6 +449,9 @@ scm_i_is_narrow_string (SCM str)
 int
 scm_i_try_narrow_string (SCM str)
 {
+  if (IS_SH_STRING (str))
+    str = SH_STRING_STRING (str);
+
   SET_STRING_STRINGBUF (str, narrow_stringbuf (STRING_STRINGBUF (str)));
 
   return scm_i_is_narrow_string (str);
@@ -664,6 +670,12 @@ scm_i_string_strcmp (SCM sstr, size_t start_x, const char *cstr)
 void
 scm_i_string_set_x (SCM str, size_t p, scm_t_wchar chr)
 {
+  if (IS_SH_STRING (str))
+    {
+      p += STRING_START (str);
+      str = SH_STRING_STRING (str);
+    }
+
   if (chr > 0xFF && scm_i_is_narrow_string (str))
     SET_STRING_STRINGBUF (str, wide_stringbuf (STRING_STRINGBUF (str)));
 
-- 
1.7.5.4



* Re: Guile: What's wrong with this?
  2012-01-04 22:18               ` Bruce Korb
@ 2012-01-04 23:22                 ` Mike Gran
  2012-01-04 23:59                 ` Mark H Weaver
  2012-01-05  7:22                 ` David Kastrup
  2 siblings, 0 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-04 23:22 UTC (permalink / raw)
  To: Bruce Korb, guile-devel@gnu.org

> From: Bruce Korb <bkorb@gnu.org>
>>>     Which is the higher priority, language purity or ease of use?

>>  That is a loaded question, as it presupposes ease of use is always the
>>  same thing as impurity e.g. ...

> Absolutely not.  Making decisions is always about trade-offs,
> otherwise it is not really a decision.  Should you give preference
> to language aesthetics, or preference to ease of use *when*
> there is a divergence?  More often than not, language purity
> (consistency) *improves* ease of use.  Here we are looking at
> something that does not appear to me to improve ease of use.
> You have to go to some extra trouble to be certain that a string
> value that you have assigned to an SCM is not read only.
> That is not convenience.  If Guile were to implement copy on write,
> then the user would not have to care whether a string were
> shared read only or not.  It would be easier to use.  The only code
> that would care at all would be the Guile internals.  (Where it
> belongs -- my completely unhumble opinion :)

Well, I've read all the posts in this thread, and I was pretty aware
of the arguments about read-only strings before this.  So since I
have little left to contribute, I'll sign off with one final
statement about it...
 
I agree completely with Bruce's statement above.
 
The mutability of strings in Guile 1.8 was a feature, not a weakness.
Even though it wasn't properly implemented, as Mark pointed out, it
did what I meant every time I used it.
 
I believe that mutability should be the default in all data types.
Creating an immutable compound data type -- be it a string, pair,
vector or whatever -- should never be the default, and should always
be the case that requires extra syntax.
 
R{5,6,7}RS disagrees with me on that, of course.  I think R{5,6,7}RS
is wrong. 
 
I understand the efficiency argument for immutable strings (and pairs).
I don't care, because Guile has never been slow for anything I've asked
it to do.
 
That, I guess, is my completely unhumble opinion. :)
 
Regards,
 
Mike




* Re: Guile: What's wrong with this?
  2012-01-04 23:19                 ` Mark H Weaver
@ 2012-01-04 23:28                   ` Bruce Korb
  2012-01-07 15:43                   ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver
  1 sibling, 0 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-04 23:28 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/04/12 15:19, Mark H Weaver wrote:
> Bruce Korb <bruce.korb@gmail.com> writes:
>
>>> ERROR: In procedure string-upcase:
>>> ERROR: string is read-only: ""
>>> Scheme evaluation error.  AutoGen ABEND-ing in template
>>> 	confmacs.tlib on line 209
>>> Failing Guile command:  = = = = =
>>>
>>> (set! tmp-text (get "act-text"))
>>>         (set! TMP-text (string-upcase tmp-text))
>>
>> What in heck is string-upcase doing trying to write to its input string?
>> Why was the string returned by ag_scm_get() (the function bound to "get")
>> an immutable string anyway?
>
> Good questions indeed.  I spent a bunch of time investigating this, and
> found some bugs that might have caused this problem, although I'm not
> certain.
>
> Bruce: Can you please see if the patch below fixes this problem?

OK.  I'll have to play this weekend.  I have to download and install
Guile sources and, unfortunately, this thread notwithstanding, I do
have a day job....

Thank you so much!!  Regards, Bruce




* Re: Guile: What's wrong with this?
  2012-01-04 22:18               ` Bruce Korb
  2012-01-04 23:22                 ` Mike Gran
@ 2012-01-04 23:59                 ` Mark H Weaver
  2012-01-05 17:22                   ` Bruce Korb
  2012-01-05  7:22                 ` David Kastrup
  2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-04 23:59 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> You have to go to some extra trouble to be certain that a string
> value that you have assigned to an SCM is not read only.

If you're going to mutate a string, you'd better be safe and make a copy
before mutating it, unless you know very clearly where it came from.
Otherwise, you might be mutating a string that some other data structure
still references, and it might not take kindly to having its string
mutated behind its back.

The fact that some string (whose origin you don't know about) might be
read-only is the least of your problems.  At least that problem will now
be flagged immediately, which is far better than the subtle and
hard-to-debug damage that might be caused by mutating a string that other
data structures may reference.

All mutable values in Scheme are pointers.  In the case of strings, that
means that they're like "char *", not "char []".  A great deal of code
freely makes copies of these pointers instead of copying the underlying
string itself.  That's a very old tradition, because it is rare to
mutate strings in Scheme.

> If Guile were to implement copy on write, then the user would not have
> to care whether a string were shared read only or not.  It would be
> easier to use.

Guile already implements copy-on-write strings, but only in the sense of
postponing the copy done by `string-copy', `substring', etc.

Implementing copy-on-write transparently without the user explicitly
making a copy (that is postponed) is _impossible_.  The problem is that
although we could make a new copy of the string, we have no way to know
which pointers to the old object should be changed to point to the new
one.  We cannot read the user's mind.
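
The `char *` analogy and the aliasing problem can be made concrete in C. This is an illustrative sketch only; `record`, `copy_chars` and `alias_demo` are invented names. Two holders share one buffer, so a mutation through either is visible through both. A runtime that silently copied on the first write would have to guess which holders should follow the new copy, whereas an explicit copy states that choice in the program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A hypothetical data structure that holds a pointer to a name. */
typedef struct { char *name; } record;

/* Explicit copy: the caller decides which holder gets fresh storage.
   Returns NULL on allocation failure. */
char *copy_chars(const char *s)
{
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL)
        memcpy(out, s, n);
    return out;
}

void alias_demo(void)
{
    char shared[] = "foo";
    record a = { shared };
    record b = { shared };   /* copies the pointer, not the characters */

    a.name[0] = 'g';                        /* mutate through a ... */
    assert(strcmp(b.name, "goo") == 0);     /* ... and b sees it too */

    b.name = copy_chars(shared);            /* b opts out of sharing */
    if (b.name == NULL)
        return;
    b.name[0] = 'z';
    assert(strcmp(a.name, "goo") == 0);     /* a is unaffected */
    assert(strcmp(b.name, "zoo") == 0);
    free(b.name);
}
```

Only the line that calls `copy_chars` knows that `b`, and not `a`, should stop sharing; nothing in the shared buffer itself records that intent.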

    Mark




* Re: Guile: What's wrong with this?
  2012-01-04 22:36               ` Bruce Korb
@ 2012-01-05  0:01                 ` Ludovic Courtès
  2012-01-05 18:36                   ` non-reproduction of initial issue -- was: " Bruce Korb
  0 siblings, 1 reply; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-05  0:01 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Hi,

Bruce Korb <bkorb@gnu.org> skribis:

> On 01/04/12 13:17, Ludovic Courtès wrote:
>> If the users' files are evaluated rather than compiled/loaded, this is
>> not a problem:
>
> I do *all* guile processing via the ag_scm_c_eval_string_from_file_line
> function.

[...]

>> Could you check whether this is the case?
>
> So it is the case.

So this is good news: it means you only have to modify your own code
without worrying about your users’ code (modulo the fact that modifying
literals is still a bad idea, as others pointed out.)

BTW, were you able to find a stripped-down example that reproduces the
‘string-upcase’ problem?

Thanks,
Ludo’.




* Re: Guile: What's wrong with this?
  2012-01-04 17:19         ` Mark H Weaver
@ 2012-01-05  4:24           ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-05  4:24 UTC (permalink / raw)
  To: Mike Gran; +Cc: Bruce Korb, guile-devel

I wrote:
>   (define-syntax define
>     (lambda (x)
>       (with-syntax ((orig-define #'(@ (guile) define)))
>         (syntax-case x ()
>           ((_ (proc arg ...) e0 e1 ...)
>            #'(orig-define proc (lambda (arg ...) e0 e1 ...)))
>           ((_ v e)
>            (identifier? #'v)
>            (if (string? (syntax->datum #'e))
>                #'(orig-define v (string-copy e))
>                #'(orig-define v e)))))))

In case you're planning to use this, I just realized that this syntax
definition has a flaw: it won't handle cases like this:

  (define (map f . xs) ...)

To fix this flaw, change the two lines after syntax-case to:

>           ((_ (proc . args) e0 e1 ...)
>            #'(orig-define proc (lambda args e0 e1 ...)))

The other macro I provided has the same flaw, and the same fix applies.

      Mark




* Re: Guile: What's wrong with this?
  2012-01-04 22:18               ` Bruce Korb
  2012-01-04 23:22                 ` Mike Gran
  2012-01-04 23:59                 ` Mark H Weaver
@ 2012-01-05  7:22                 ` David Kastrup
  2 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-05  7:22 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 13:52, Ian Price wrote:
>>> So my main question is:
>>>
>>>    Which is the higher priority, language purity or ease of use?
>> That is a loaded question, as it presupposes ease of use is always the
>> same thing as impurity e.g. ...
>
> Absolutely not.  Making decisions is always about trade-offs,
> otherwise it is not really a decision.

That does not apparently preclude the option of marketing it as one.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-04 23:59                 ` Mark H Weaver
@ 2012-01-05 17:22                   ` Bruce Korb
  2012-01-05 18:13                     ` Mark H Weaver
                                       ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-05 17:22 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/04/12 15:59, Mark H Weaver wrote:
> Implementing copy-on-write transparently without the user explicitly
> making a copy (that is postponed) is _impossible_.  The problem is that
> although we could make a new copy of the string, we have no way to know
> which pointers to the old object should be changed to point to the new
> one.  We cannot read the user's mind.

So because it might be the case that one reference might want to
see changes made via another reference then the whole concept is
trashed?  "all or nothing"?  Anyway, such a concept should be kept
very simple:  functions that modify their argument make copies of
any input argument that is read only.  Any other SCM's lying about
that refer to the unmodified object continue referring to that
same unmodified object.  No mind reading required.

    (define a "hello")
    (define b a)
    (string-upcase! a)
    b

yields "hello", not "HELLO".  Simple, comprehensible and, of course,
not the problem I was having.  :)

"it goes without saying (but I'll say it anyway)":

    (define a (string-copy "hello"))
    (define b a)
    (string-upcase! a)
    b

*does* yield "HELLO" and not "hello".  Why the inconsistency?

   Because it is better to do what is almost certainly expected
   rather than throw errors.

It is an ease of use over language purity thing.




* Re: Guile: What's wrong with this?
  2012-01-05 17:22                   ` Bruce Korb
@ 2012-01-05 18:13                     ` Mark H Weaver
  2012-01-05 19:02                       ` Mark H Weaver
  2012-01-05 20:24                     ` David Kastrup
  2012-01-05 22:42                     ` Mark H Weaver
  2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-05 18:13 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> So because it might be the case that one reference might want to
> see changes made via another reference then the whole concept is
> trashed?  "all or nothing"?  Anyway, such a concept should be kept
> very simple:  functions that modify their argument make copies of
> any input argument that is read only.  Any other SCM's lying about
> that refer to the unmodified object continue referring to that
> same unmodified object.  No mind reading required.
>
>    (define a "hello")
>    (define b a)
>    (string-upcase! a)
>    b

In order to do as you suggest, we'd have to change `string-upcase!' from
procedure to syntax.  That's because `string-upcase!' gets a _copy_ of
the pointer contained in `a', and is unable to change the pointer in
`a'.  This is fundamental to the semantics of Scheme.  We cannot change
it without breaking a _lot_ of code.

If we changed every string mutation procedure to syntax, then you
wouldn't be able to do things like this:

  (string-upcase! (proc arg ...))
  (map string-upcase! list-of-strings)

Also, if you wrote a procedure like this:

  (define (downcase-all-but-first! s)
    (string-downcase! s)
    (string-set! s 0 (char-upcase (string-ref s 0))))

it would work properly for mutable strings, but if you passed a
read-only string, it would do nothing at all from the caller's point of
view, because it would change the pointer in the local parameter s, but
not the caller's pointer.

These proposed semantics are bad because they don't compose well.
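
For readers who think in C, the composition problem can be sketched
roughly as follows.  This is only an illustration under stated
assumptions, not Guile's implementation: `struct str' and
`upcase_in_place_or_copy' are made-up names, and the read-only flag is
a stand-in for however the runtime marks literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object: a buffer plus a read-only flag. */
struct str {
    char *buf;
    int read_only;
};

/* Hypothetical "copy if read-only" mutator.  The parameter s is a copy
   of the caller's pointer, so rebinding it affects only this call. */
static void upcase_in_place_or_copy(struct str *s)
{
    struct str copy = { NULL, 0 };

    assert(s != NULL && s->buf != NULL);
    if (s->read_only) {
        copy.buf = malloc(strlen(s->buf) + 1);
        if (copy.buf == NULL)
            return;
        strcpy(copy.buf, s->buf);
        s = &copy;              /* rebinds the local parameter only */
    }
    for (char *p = s->buf; *p != '\0'; p++)
        *p = (char) toupper((unsigned char) *p);

    /* The caller cannot reach the copy, so the upcased text is simply
       discarded here.  free(NULL) is a no-op for the mutable case. */
    free(copy.buf);
}
```

Called on a mutable object, the caller sees the upcased text; called on
a read-only one, the caller sees no change at all, which is the silent
"does nothing" behaviour described above.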

> "it goes without saying (but I'll say it anyway)":
>
>    (define a (string-copy "hello"))
>    (define b a)
>    (string-upcase! a)
>    b
>
> *does* yield "HELLO" and not "hello".  Why the inconsistency?

You are proceeding from the assumption that each variable contains its
own string buffer, when in fact they contain pointers, and (define b a)
copies only the pointer.  In other words, the code above is like:

  char *a = string_copy ("hello");
  char *b = a;
  string_upcase_x (a);
  return b;

What you are asking for cannot be done without changing the fundamental
semantics of Scheme at a very deep level.

     Mark




* Re: non-reproduction of initial issue -- was: Guile: What's wrong with this?
  2012-01-05  0:01                 ` Ludovic Courtès
@ 2012-01-05 18:36                   ` Bruce Korb
  2012-01-05 18:50                     ` Mark H Weaver
  0 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-05 18:36 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On 01/04/12 16:01, Ludovic Courtès wrote:
> BTW, were you able to find a stripped-down example that reproduces the
> ‘string-upcase’ problem?

Here's the stripped down example, but it does not reproduce the problem.  :(
I didn't copy into it my variation on scm_c_eval_string, but I hope
that isn't the issue.  Must be some subtle interaction somewhere....

#include <stdio.h>
#include <libguile.h>

static SCM
my_get(void)
{
     static char const zNil[] = "";
     SCM res = scm_from_locale_string(zNil);
     printf("zNil at %p yields SCM 0x%llX\n", zNil, (unsigned long long)res);
     return res;
}

static void
inner_main(void * closure, int argc, char ** argv)
{
     scm_c_define_gsubr("my-get", 0, 0, 0, (scm_t_subr)(void*)my_get);
     scm_c_eval_string("(define a (my-get))"
                       "(define b (string-upcase a))"
                       "(exit 0)");
}

int
main(int argc, char ** argv)
{
     scm_boot_guile(argc, argv, inner_main, 0);
     return 0; /* NOTREACHED */
}




* Re: non-reproduction of initial issue -- was: Guile: What's wrong with this?
  2012-01-05 18:36                   ` non-reproduction of initial issue -- was: " Bruce Korb
@ 2012-01-05 18:50                     ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-05 18:50 UTC (permalink / raw)
  To: Bruce Korb; +Cc: Ludovic Courtès, guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 16:01, Ludovic Courtès wrote:
>> BTW, were you able to find a stripped-down example that reproduces the
>> ‘string-upcase’ problem?
>
> Here's the stripped down example, but it does not reproduce the problem.  :(
> I didn't copy into it my variation on scm_c_eval_string, but I hope
> that isn't the issue.  Must be some subtle interaction somewhere....

The bugs I found could have corrupted the internal representations of
strings in such a way as to cause what you're seeing.  Please do try that
patch I sent.

    Mark




* Re: Guile: What's wrong with this?
  2012-01-05 18:13                     ` Mark H Weaver
@ 2012-01-05 19:02                       ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-05 19:02 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Replying to myself...

>> "it goes without saying (but I'll say it anyway)":
>>
>>    (define a (string-copy "hello"))
>>    (define b a)
>>    (string-upcase! a)
>>    b
>>
>> *does* yield "HELLO" and not "hello".  Why the inconsistency?
>
> You are proceeding from the assumption that each variable contains its
> own string buffer, when in fact they contain pointers, and (define b a)
> copies only the pointer.  In other words, the code above is like:
>
>   char *a = string_copy ("hello");
>   char *b = a;
>   string_upcase_x (a);
>   return b;

Of course, in Scheme (and C) it is possible to do what you want by
changing string-upcase! (string_upcase_x) from a procedure to a macro,
but as you know, macros in C have significant disadvantages.  Scheme
macros are vastly more powerful and robust, but they also have
significant disadvantages compared with procedures.

Here's how you could do what you want with Scheme macros:

  (define-syntax-rule
    (string-upcase!! x)
    (set! x (string-upcase x)))
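
The C counterpart of that macro might look like the sketch below.  It
is an illustration only: `upcase_copy' and `UPCASE_REBIND' are
hypothetical names, not part of libguile or any other library.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated upper-case copy of
   s, or NULL if allocation fails.  The input is never modified. */
static char *upcase_copy(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)     /* i == n copies the '\0' */
        out[i] = (char) toupper((unsigned char) s[i]);
    return out;
}

/* Macro analogue of string-upcase!!.  It expands at the call site, so
   it can rebind the caller's variable.  x must be an lvalue. */
#define UPCASE_REBIND(x) ((x) = upcase_copy(x))
```

After `UPCASE_REBIND(a)', `a' points at the new upper-case copy while
any other pointer to the old string still sees the old text.  The cost
is the one already mentioned: the macro needs a variable to assign to,
so it cannot be applied to the result of a call or passed around as a
function pointer.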

      Mark




* Re: Guile: What's wrong with this?
  2012-01-05 17:22                   ` Bruce Korb
  2012-01-05 18:13                     ` Mark H Weaver
@ 2012-01-05 20:24                     ` David Kastrup
  2012-01-05 22:42                     ` Mark H Weaver
  2 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-05 20:24 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/04/12 15:59, Mark H Weaver wrote:
>> Implementing copy-on-write transparently without the user explicitly
>> making a copy (that is postponed) is _impossible_.  The problem is that
>> although we could make a new copy of the string, we have no way to know
>> which pointers to the old object should be changed to point to the new
>> one.  We cannot read the user's mind.
>
> So because it might be the case that one reference might want to
> see changes made via another reference then the whole concept is
> trashed?

Yes.  Because different references can't be distinguished, it would mean
that you'd not actually have a reference to the modified copy after
modifying it.  Which renders the modification useless.

> "all or nothing"?  Anyway, such a concept should be kept very simple:
> functions that modify their argument make copies of any input argument
> that is read only.  Any other SCM's lying about that refer to the
> unmodified object continue referring to that same unmodified object.
> No mind reading required.

>    (define a "hello")
>    (define b a)
>    (string-upcase! a)
>    b
>
> yields "hello", not "HELLO".  Simple, comprehensible and, of course,
> not the problem I was having.  :)

It is neither simple, nor comprehensible.

> "it goes without saying (but I'll say it anyway)":
>
>    (define a (string-copy "hello"))
>    (define b a)
>    (string-upcase! a)
>    b
>
> *does* yield "HELLO" and not "hello".  Why the inconsistency?
>
>   Because it is better to do what is almost certainly expected
>   rather than throw errors.
>
> It is an ease of use over language purity thing.

You probably don't realize how ironic that is.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-05 17:22                   ` Bruce Korb
  2012-01-05 18:13                     ` Mark H Weaver
  2012-01-05 20:24                     ` David Kastrup
@ 2012-01-05 22:42                     ` Mark H Weaver
  2012-01-06  1:02                       ` Mike Gran
  2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-05 22:42 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> Anyway, such a concept should be kept
> very simple:  functions that modify their argument make copies of
> any input argument that is read only.  Any other SCM's lying about
> that refer to the unmodified object continue referring to that
> same unmodified object.  No mind reading required.
>
>    (define a "hello")
>    (define b a)
>    (string-upcase! a)
>    b

I suspect that what you really want is for `define' (and maybe some
other things) to automatically do a deep copy instead of merely making a
new reference to an existing object.

For example, you seem to want (define a "hello") to make a fresh copy of
the string literal, and for (define b a) to make another copy so that
changes to the string referenced by `b' do not affect the string
referenced by `a'.

You seem to not want to think about aliasing issues.  Indeed, it would
be more intuitive if we always copied everything deeply, but that would
be strictly less powerful, not to mention far less efficient, especially
when handling large structures.

`define' merely makes a new reference to an existing object.  If you
want a copy, you must explicitly ask for one (though this could be
hidden by custom syntax).  It would not be desirable for the language to
make copies automatically as part of the core `define' syntax.  For one
thing, sometimes you don't want a copy.  Sometimes you want shared
mutable objects.

Even if you do want to copy, there are different kinds of copies.  How
deeply do you want to copy?  If it's a hierarchical list, do you want to
copy only the first level of the list, or do you want to recurse?
Suppose this hierarchical list contains strings.  Do you want to copy
the strings too, or just the list structure?  I could go on and on.
There's no good universal copier; it depends on your purposes.

If you want an abbreviated way to both `define' and `copy', then you'll
need to make new syntax to do that.  Guile provides all the power you
need to do this.
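
To make the "how deep?" question concrete, here is a rough C sketch of
two of the many possible copiers for a list of strings.  The function
names `copy_shallow' and `copy_deep' are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shallow copy: a new array of pointers, but the strings themselves
   are still shared with the original. */
static char **copy_shallow(char **items, size_t n)
{
    assert(items != NULL);
    char **out = malloc(n * sizeof *out);
    if (out == NULL)
        return NULL;
    memcpy(out, items, n * sizeof *out);
    return out;
}

/* Deep copy: a new array and a new buffer for every string. */
static char **copy_deep(char **items, size_t n)
{
    assert(items != NULL);
    char **out = malloc(n * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        out[i] = malloc(strlen(items[i]) + 1);
        if (out[i] == NULL) {
            while (i > 0)           /* undo the partial copy */
                free(out[--i]);
            free(out);
            return NULL;
        }
        strcpy(out[i], items[i]);
    }
    return out;
}
```

A later change to one of the original strings shows up through the
shallow copy but not through the deep one.  Neither is "the" right
copy; which one you want depends on what you intend to mutate.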

      Mark




* Re: Guile: What's wrong with this?
  2012-01-05 22:42                     ` Mark H Weaver
@ 2012-01-06  1:02                       ` Mike Gran
  2012-01-06  1:41                         ` Mark H Weaver
  2012-01-06  9:23                         ` David Kastrup
  0 siblings, 2 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-06  1:02 UTC (permalink / raw)
  To: Mark H Weaver, Bruce Korb; +Cc: guile-devel@gnu.org

> `define' merely makes a new reference to an existing object.  If you
> want a copy, you must explicitly ask for one (though this could be
> hidden by custom syntax).  It would not be desirable for the language to
> make copies automatically as part of the core `define' syntax.  For one
> thing, sometimes you don't want a copy.  Sometimes you want shared
> mutable objects.

It is curious that the action of 'copy' really means the
action of 'create a copy with different properties'.
 
Shouldn't (string-copy "a") create another immutable string?
 
Likewise, shouldn't (substring "abc" 1) return an immutable substring?




* Re: Guile: What's wrong with this?
  2012-01-06  1:02                       ` Mike Gran
@ 2012-01-06  1:41                         ` Mark H Weaver
  2012-01-06  2:38                           ` Noah Lavine
                                             ` (2 more replies)
  2012-01-06  9:23                         ` David Kastrup
  1 sibling, 3 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-06  1:41 UTC (permalink / raw)
  To: Mike Gran; +Cc: Bruce Korb, guile-devel

Mike Gran <spk121@yahoo.com> writes:
> It is curious that the action of 'copy' really means the
> action of 'create a copy with different properties'.
>  
> Shouldn't (string-copy "a") create another immutable string?

Why would you want to copy an immutable string?

> Likewise, shouldn't (substring "abc" 1) return an immutable substring?

As I understand it, in the Scheme standards (at least before R6RS's
immutable pairs) the rationale behind marking literal constants as
immutable is solely to avoid needlessly making copies of those literals,
while flagging accidental attempts to modify them, since that is almost
certainly a mistake.

If that is the only rationale for marking things read-only, then there's
no reason to mark copies read-only.  The philosophy of Scheme (at least
before R6RS) was clearly to make almost all data structures mutable.

Following that philosophy, in Guile, even though (substring "abc" 1)
postpones copying the string buffer, it must create a new heap object.
Once you've done that, it is feasible to implement copy-on-write.

Now, the immutable pairs of R6RS and Racket have an entirely different
rationale, namely that they enable vastly more effective optimization in
a compiler.  In this case, presumably you'd want copies to retain the
immutability.

     Mark




* Re: Guile: What's wrong with this?
  2012-01-06  1:41                         ` Mark H Weaver
@ 2012-01-06  2:38                           ` Noah Lavine
  2012-01-06 13:37                           ` Mike Gran
  2012-01-07 20:57                           ` Guile: " Ian Price
  2 siblings, 0 replies; 117+ messages in thread
From: Noah Lavine @ 2012-01-06  2:38 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Bruce Korb, guile-devel

Hello all,

I must admit that I do not know much about why R5RS says that literals
are constant, but I think there is a misunderstanding.

Bruce does not want `define' to always copy its result. I think what
he wants is for literals embedded in source code to be mutable. This
would, of course, imply that each literal in the source code would be
a new copy, even if they were identical.

Weirdly enough, that is how my intuition works too. After all, if I
made a string object in Scheme without going to any trouble, I would
get a mutable object. If I write down a string, I expect to get the
same sort of object. Bruce is also right that this enables quick and
easy programming that munges strings.

And I think the argument about putting strings in constant memory is
bad - constant memory is an implementation detail. If it happens that
we can store literals more efficiently when they are not mutated, then
perhaps we should just detect that case and switch representations.

Of course there is a trade-off here between ease of implementation and
ease of use. This change seems pretty unimportant to me, especially if
Python does all right with immutable strings, so I do not think it's
important for us to support it. I just don't buy the arguments against
supporting it.

Noah

On Thu, Jan 5, 2012 at 8:41 PM, Mark H Weaver <mhw@netris.org> wrote:
> Mike Gran <spk121@yahoo.com> writes:
>> It is curious that the action of 'copy' really means the
>> action of 'create a copy with different properties'.
>>
>> Shouldn't (string-copy "a") create another immutable string?
>
> Why would you want to copy an immutable string?
>
>> Likewise, shouldn't (substring "abc" 1) return an immutable substring?
>
> As I understand it, in the Scheme standards (at least before R6RS's
> immutable pairs) the rationale behind marking literal constants as
> immutable is solely to avoid needlessly making copies of those literals,
> while flagging accidental attempts to modify them, since that is almost
> certainly a mistake.
>
> If that is the only rationale for marking things read-only, then there's
> no reason to mark copies read-only.  The philosophy of Scheme (at least
> before R6RS) was clearly to make almost all data structures mutable.
>
> Following that philosophy, in Guile, even though (substring "abc" 1)
> postpones copying the string buffer, it must create a new heap object.
> Once you've done that, it is feasible to implement copy-on-write.
>
> Now, the immutable pairs of R6RS and Racket have an entirely different
> rationale, namely that they enable vastly more effective optimization in
> a compiler.  In this case, presumably you'd want copies to retain the
> immutability.
>
>     Mark
>




* Re: Guile: What's wrong with this?
  2012-01-06  1:02                       ` Mike Gran
  2012-01-06  1:41                         ` Mark H Weaver
@ 2012-01-06  9:23                         ` David Kastrup
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-06  9:23 UTC (permalink / raw)
  To: guile-devel

Mike Gran <spk121@yahoo.com> writes:

>> `define' merely makes a new reference to an existing object.  If you
>> want a copy, you must explicitly ask for one (though this could be
>> hidden by custom syntax).  It would not be desirable for the language to
>> make copies automatically as part of the core `define' syntax.  For one
>> thing, sometimes you don't want a copy.  Sometimes you want shared
>> mutable objects.
>
> It is curious that the action of 'copy' really means the
> action of 'create a copy with different properties'.
>  
> Shouldn't (string-copy "a") create another immutable string?

That would be rather pointless.  You could just use the original string.

> Likewise, shouldn't (substring "abc" 1) return an immutable substring?

Why wouldn't you be using substring/shared if you are not going to
modify either string?

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-06  1:41                         ` Mark H Weaver
  2012-01-06  2:38                           ` Noah Lavine
@ 2012-01-06 13:37                           ` Mike Gran
  2012-01-06 14:11                             ` David Kastrup
  2012-01-06 18:13                             ` Mark H Weaver
  2012-01-07 20:57                           ` Guile: " Ian Price
  2 siblings, 2 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-06 13:37 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Bruce Korb, guile-devel@gnu.org

> From: Mark H Weaver <mhw@netris.org>
>>  It is curious that the action of 'copy' really means the
>>  action of 'create a copy with different properties'.
>>   
>>  Shouldn't (string-copy "a") create another immutable string?
> 
> Why would you want to copy an immutable string?
> 
>>  Likewise, shouldn't (substring "abc" 1) return an immutable substring?

I was being too snarky and rhetorical.  Gotta stop writing e-mail
before getting coffee.
 
To say something possibly semi-constructive...
 
The word 'string' in Scheme is overloaded to mean both string
immutables and string mutables.   Since a string immutable
can't be modified to be a mutable, they really are different
object types.  String mutables appear to still exist in the 
latest draft of R7RS. 
 
Many of the procedures that operate on strings are overloaded
to take both immutables and mutables, but some, like string-set!
take only mutables.
 
There is an obvious syntax to construct a string immutable
object: namely to have it appear as a literal in the source code.
There thus isn't a need for a constructor function.
 
There is a need for a constructor function to create string mutables,
because a literal string in the source code indicates a string immutable.
 
There are such constructors: (string <char> ...) and (make-string k <char>)
which is fine.
 
But there is no constructor for a string mutable that initializes
it with a string in Guile 2.0.  There was in Guile 1.8, where
you could do (define <var-name> <string-literal>). 
 
So instead, syntactically, we now have to use 'string-copy' or 'substring'
for its *side-effects*, namely that it doesn't mark the copy immutable.
Those are rather poor and confusing names for constructors.
 
If making such a suggestion weren't pointless, I'd pitch the idea
of overloading 'string' or 'make-string' so they can be used as
a constructor of a string mutable.  Something like
(string <string-literal>) or (make-string <string-literal>).  This
would be clearer than using string-copy, I think.
 
Thanks,
 
Mike




* Re: Guile: What's wrong with this?
  2012-01-06 13:37                           ` Mike Gran
@ 2012-01-06 14:11                             ` David Kastrup
  2012-01-06 18:13                             ` Mark H Weaver
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-06 14:11 UTC (permalink / raw)
  To: guile-devel

Mike Gran <spk121@yahoo.com> writes:

> There is an obvious syntax to construct a string immutable
> object: namely to have it appear as a literal in the source code.
> There thus isn't a need for a constructor function.

Huh?  There are _lots_ of strings which are better computed than spelled
out.

> But there is no constructor for a string mutable that initializes
> it with a string in Guile 2.0.

(string-copy "xxxxx")

> There was in Guile 1.8, where
> you could do (define <var-name> <string-literal>).

No, there wasn't.

guile> (define (x) "xxxxx")
guile> (x)
"xxxxx"
guile> (string-upcase! (x))
"XXXXX"
guile> (x)
"XXXXX"
guile>

As you can see, reevaluating the definition suddenly delivers a changed
result, because we are not talking about modifying a mutable string
initialized with a literal, but about modifying the literal itself.

Whether or not you replace the function body with
(define y "xxxxx") y
instead of just "xxxxx" does not change the result and does not change
what happens.  y does not refer to a string initialized from the
literal, it refers to the literal.  And changing the literal is a really
bad idea.

Just because you do not understand what the code did previously does not
mean that the behavior was well-defined.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-06 13:37                           ` Mike Gran
  2012-01-06 14:11                             ` David Kastrup
@ 2012-01-06 18:13                             ` Mark H Weaver
  2012-01-06 19:06                               ` Bruce Korb
  2012-01-06 22:23                               ` Guile BUG: " Bruce Korb
  1 sibling, 2 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-06 18:13 UTC (permalink / raw)
  To: Mike Gran; +Cc: Bruce Korb, guile-devel

Mike Gran <spk121@yahoo.com> writes:
> The word 'string' in Scheme is overloaded to mean both string
> immutables and string mutables.   Since a string immutable
> can't be modified to be a mutable, they really are different
> object types.  String mutables appear to still exist in the 
> latest draft of R7RS. 
>  
> Many of the procedures that operate on strings are overloaded
> to take both immutables and mutables, but some, like string-set!
> take only mutables.

This is the wrong way to think about it.  In Scheme, mutable and
immutable strings are _not_ different types.

The way to think about it is that in Scheme, the program text itself is
immutable, including any literals contained in it.  This is true of
_all_ literals, including '(literal lists), '#(literal vectors),
"literal strings", #'(literal syntax) and any other types that might be
added in the future that would otherwise be mutable.

Imagine that you were evaluating Scheme by hand on paper.  You have your
program written on one page, and you have another scratch page used for
the data structures that your program creates during evaluation.
Suppose your program contains a very large lookup table, written as a
literal list.  This lookup table is on your program page.

Now, suppose you are asked to evaluate (lookup key big-lookup-table).

The way Scheme works is that `big-lookup-table' is _not_ copied.  As
`lookup' traverses the table, it follows pointers within the program
page itself.  However, Scheme prohibits you from modifying _anything_
that happens to be on the program page.  It's not a question of type.
It's a question of which page the data happens to be on.

Now, we _could_ force you to copy big-lookup-table from the program page
onto the scratch page before doing `lookup', just in case `lookup' might
try to mutate its structure.  But that would be a lot of wasted effort.

Alternatively, we could allow you to modify the program itself.  This
is what Guile 1.8 did.  You _could_ make an argument that this is
desirable, on the grounds that we should trust that the programmer knows
what he's doing.

However, it's clear that Bruce did _not_ understand what he was doing.
I don't think that he (or you) realized that the following procedure was
buggy in Guile 1.8:

  (define (ten-spaces-with-one-star-at i)
    (define s "          ")
    (string-set! s i #\*)
    s)

Guile 1.8's permissivity allowed Bruce to unwittingly create a large
body of code that was inherently buggy.  IMHO, it would have been much
better to nip that in the bud and alert him to the fact that he was
doing something that was almost certainly unwise.
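
The same kind of bug can be sketched in C, where the shared literal
becomes a shared static buffer.  This is an analogy under stated
assumptions, not what Guile does internally; both function names and
the `WIDTH' constant are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WIDTH 10

/* Buggy variant: one buffer shared by every call, like a literal that
   Guile 1.8 let you modify.  Stars from earlier calls never go away. */
static const char *ten_spaces_with_one_star_at_buggy(size_t i)
{
    static char s[WIDTH + 1];       /* zero-initialized, so s[WIDTH] is '\0' */
    static int initialized = 0;

    assert(i < WIDTH);
    if (!initialized) {             /* "the literal" is created only once */
        memset(s, ' ', WIDTH);
        initialized = 1;
    }
    s[i] = '*';
    return s;
}

/* Fixed variant: every call starts from fresh spaces in storage the
   caller owns, which is the role string-copy plays in Scheme. */
static char *ten_spaces_with_one_star_at(char out[WIDTH + 1], size_t i)
{
    assert(out != NULL && i < WIDTH);
    memset(out, ' ', WIDTH);
    out[WIDTH] = '\0';
    out[i] = '*';
    return out;
}
```

Calling the buggy variant with 2 and then with 5 yields a string with
stars at both positions, while the fixed variant yields a star only at
position 5.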

> There is a need for a constructor function to create string mutables,
> because a literal string in the source code indicates a string immutable.
>  
> There are such constructors: (string <char> ...) and (make-string k <char>)
> which is fine.
>  
> But there is no constructor for a string mutable that initializes
> it with a string in Guile 2.0.

Yes there is: (string-copy "string-literal")

If you don't like the name, then rename it:

  (define mutable-string string-copy)

     Mark




* Re: Guile: What's wrong with this?
  2012-01-06 18:13                             ` Mark H Weaver
@ 2012-01-06 19:06                               ` Bruce Korb
  2012-01-06 19:19                                 ` David Kastrup
  2012-01-07 16:13                                 ` Mark H Weaver
  2012-01-06 22:23                               ` Guile BUG: " Bruce Korb
  1 sibling, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-06 19:06 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/06/12 10:13, Mark H Weaver wrote:
> Imagine that you were evaluating Scheme by hand on paper.  You have your
> program written on one page, and you have another scratch page used for
> the data structures that your program creates during evaluation.
> Suppose your program contains a very large lookup table, written as a
> literal list.  This lookup table is on your program page.
>
> Now, suppose....

That is where my mental model diverges!!

> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
> scm_eval_string(buf);
> sprintf(buf, "(string-upcase! %s)", "foo");
> // the string from my_str in "buf" is now scribbled over and completely gone
> scm_eval_string(buf);

Since I know the program I initially wrote (the define) is now gone,
the string must have been copied off somewhere.  I think one's first
guess is that it was copied to someplace modifiable.  However, that
would be incorrect.  It is copied off to writable memory, but marked
as read-only for the purposes of Guile.  Not intuitively obvious.

> Guile 1.8's permissivity allowed Bruce to unwittingly create a large
> body of code that was inherently buggy.  IMHO, it would have been much
> better to nip that in the bud and alert him to the fact that he was
> doing something that was almost certainly unwise.

Fail early and fail hard.  Yes.  But after all these discussions, I
now doubt I have too many places where I am expecting to change a
static value.  Most of the strings that I wind up altering are
created with a scm_from_locale_string() C function call.  Very few
strings are ever actually initialized with (define foo "something"),
other than when creating placeholders because you cannot define
within a nested collection of functions.  e.g.
   (if (whatever)
       (define foo (get "this"))
       (define foo (get "that"))  )
   (string-upcase! foo)

====

Anyway, I did compile and build my toy and guile with CFLAGS='-g -O0'.
The error message did not show.  Instead it seg faulted while trying
to make this call:  scm_from_locale_string("");

There must be a corruption somewhere.  It is either asymptomatic with
Guile 1.8 (viz. my fault) or it is introduced with Guile 2.0 (meaning
a Guile code issue).  More in a few days.

Thank you.




* Re: Guile: What's wrong with this?
  2012-01-06 19:06                               ` Bruce Korb
@ 2012-01-06 19:19                                 ` David Kastrup
  2012-01-06 20:03                                   ` Mark H Weaver
  2012-01-07 16:13                                 ` Mark H Weaver
  1 sibling, 1 reply; 117+ messages in thread
From: David Kastrup @ 2012-01-06 19:19 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/06/12 10:13, Mark H Weaver wrote:
>> Imagine that you were evaluating Scheme by hand on paper.  You have your
>> program written on one page, and you have another scratch page used for
>> the data structures that your program creates during evaluation.
>> Suppose your program contains a very large lookup table, written as a
>> literal list.  This lookup table is on your program page.
>>
>> Now, suppose....
>
> That is where my mental model diverges!!

The mental model of the computer is what counts.

>> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
>> scm_eval_string(buf);
>> sprintf(buf, "(string-upcase! %s)", "foo");
>> // the string from my_str in "buf" is now scribbled over and completely gone
>> scm_eval_string(buf);
>
> Since I know the program I initially wrote (the define) is now gone,

Why would a define be gone?

> the string must have been copied off somewhere.

I don't think you understand the concept of garbage collection.
_Everything_ in Scheme exists permanently regarding all observable
semantics (well, weak hash tables are a somewhat weird exception).
Definitions, variables, continuations.  There is no concept like a stack
of local values that would get erased.  Thanks to call/cc, there is not
even a return stack that would get erased.  Every object carries its own
lifetime with it.  It dies when nobody remembers it, not because of
being in some scope or whatever else.

> I think one's first guess is that it was copied to someplace
> modifiable.  However, that would be incorrect.  It is copied off to
> writable memory, but marked as read-only for the purposes of Guile.
> Not intuitively obvious.

Also wrong.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-06 19:19                                 ` David Kastrup
@ 2012-01-06 20:03                                   ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-06 20:03 UTC (permalink / raw)
  To: David Kastrup; +Cc: guile-devel

David Kastrup <dak@gnu.org> writes:

> Bruce Korb <bkorb@gnu.org> writes:
>
>>> sprintf(buf, "(define %s \"%s\")", "foo", my_str);
>>> scm_eval_string(buf);
>>> sprintf(buf, "(string-upcase! %s)", "foo");
>>> // the string from my_str in "buf" is now scribbled over and completely gone
>>> scm_eval_string(buf);
>>
>> Since I know the program I initially wrote (the define) is now gone,
>
> Why would a define be gone?

I think what Bruce means here is that, in theory, the string object
created in the above `define' might have held a reference to part of his
buffer `buf'.  And indeed, we do make a copy of that buffer.  So why not
make a mutable copy?

The reason is that, even though we make a copy of the program as we read
it (converting from the string representation of `buf' into our internal
representation), we'd like to be able to use the program multiple times.

When I speak of the "program text", I'm not referring to the string
representation of the program, but rather the internal representation.

If we allow the user to unwittingly modify the program, it might work
once but fail thereafter, as in:

  (define ten-spaces-with-one-star-at
    (lambda (i)
      (define s "          ")
      (string-set! s i #\*)
      s))

Now, some reasonable people might say "Why arbitrarily limit the user?
He might know what he's doing, and he might really want to do this!"

Scheme provides a nice way to do this too:

  (define ten-spaces-with-new-star-at
    (let ((s (make-string 10 #\space)))
      (lambda (i)
        (string-set! s i #\*)
        s)))

I normally lean toward assuming that the user knows what he's doing, but
in this case I think Scheme got it right.  Accidentally modifying
literals is a very common mistake, and is almost never a good idea.

If you want to make a program with internal mutable state, Scheme
provides free variables, as used in the example above.
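
For readers who think in C, here is a rough analogue of the first
example above.  It is only a sketch: the names program_literal,
star_in_literal and star_in_fresh_copy are made up for illustration and
have nothing to do with Guile's internals.  The first function models
what would happen if mutating the literal were permitted (one shared
copy, so stars pile up across calls); the second models what the author
presumably wanted, a fresh string on every call.

```c
#include <stddef.h>
#include <string.h>

/* Stands in for the literal held in the program's internal
   representation: a single copy, reused by every call. */
static char program_literal[] = "          ";

/* Analogue of ten-spaces-with-one-star-at if the literal were writable:
   each call scribbles on the shared copy, so earlier stars remain.
   The index i is assumed to be in the range 0..9. */
const char *star_in_literal(size_t i)
{
    program_literal[i] = '*';
    return program_literal;
}

/* Analogue of starting from a fresh string on every call: the caller's
   buffer is refilled with ten spaces before the single write.
   The index i is assumed to be in the range 0..9. */
char *star_in_fresh_copy(char out[11], size_t i)
{
    memset(out, ' ', 10);
    out[10] = '\0';
    out[i] = '*';
    return out;
}
```

Calling star_in_literal(2) and then star_in_literal(5) leaves two stars
in the shared buffer, which is the "works once but fails thereafter"
behaviour; star_in_fresh_copy never shows more than one.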

     Mark




* Re: Guile BUG: What's wrong with this?
  2012-01-06 18:13                             ` Mark H Weaver
  2012-01-06 19:06                               ` Bruce Korb
@ 2012-01-06 22:23                               ` Bruce Korb
  2012-01-06 23:11                                 ` Mark H Weaver
  2012-01-06 23:28                                 ` Guile BUG: What's wrong with this? Bruce Korb
  1 sibling, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-06 22:23 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel


scm_from_locale_stringn() makes an optimization when the length is zero.
It returns an immutable string of length zero.  For reasons I no longer
remember, I had my own ag_scm_string_upcase that called
scm_string_upcase_x, presuming that scm_from_locale_stringn had returned
a writable string.

Two possible fixes:

1. remove the "optimization"
2. check the length in scm_string_upcase_x before choking.

The reason for the seg fault is that scm_backtrace() faulted.
I called it in the on_exit path and it couldn't cope.
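
To make the failure concrete, here is a toy C model of it.  This is a
sketch under assumptions, not Guile's code: toy_string, toy_nullstr,
toy_from_stringn and toy_upcase_x are hypothetical names, and the
read_only flag merely stands in for whatever Guile uses internally.
The last parameter of toy_upcase_x switches fix 2 on or off.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy string object; NOT Guile's real representation. */
typedef struct {
    char  *chars;
    size_t len;
    bool   read_only;
} toy_string;

/* One shared empty string, marked read-only as described above. */
static toy_string toy_nullstr = { "", 0, true };

/* Models the zero-length shortcut: empty input yields the shared
   object, anything else gets a fresh writable copy. */
toy_string *toy_from_stringn(const char *src, size_t len)
{
    if (len == 0)
        return &toy_nullstr;

    toy_string *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->chars = malloc(len + 1);
    if (s->chars == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->chars, src, len);
    s->chars[len] = '\0';
    s->len = len;
    s->read_only = false;
    return s;
}

/* Returns 0 on success, -1 for "string is read-only".
   With check_empty_range set, an empty range returns before the
   write check (the idea behind fix 2). */
int toy_upcase_x(toy_string *s, size_t start, size_t end,
                 bool check_empty_range)
{
    if (check_empty_range && start == end)
        return 0;
    if (s->read_only)
        return -1;
    for (size_t k = start; k < end; ++k)
        s->chars[k] = (char) toupper((unsigned char) s->chars[k]);
    return 0;
}
```

In this model, fix 1 amounts to deleting the len == 0 branch in
toy_from_stringn, and fix 2 to always passing true for
check_empty_range.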




* Re: Guile BUG: What's wrong with this?
  2012-01-06 22:23                               ` Guile BUG: " Bruce Korb
@ 2012-01-06 23:11                                 ` Mark H Weaver
  2012-01-06 23:35                                   ` Andy Wingo
                                                     ` (2 more replies)
  2012-01-06 23:28                                 ` Guile BUG: What's wrong with this? Bruce Korb
  1 sibling, 3 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-06 23:11 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> scm_from_locale_stringn() makes an optimization when the length is zero.
> It returns an immutable string of length zero.

Good catch!

> Two possible fixes:
>
> 1. remove the "optimization"
> 2. check the length in scm_string_upcase_x before choking.

I see a third possible fix, which I think I like best:

3. Make scm_nullstr into a mutable string.  After all, it can't be
   changed anyway, and the _only_ reference to it is from
   scm_from_stringn, so the result should always be mutable.

What do other people think?

    Mark




* Re: Guile BUG: What's wrong with this?
  2012-01-06 22:23                               ` Guile BUG: " Bruce Korb
  2012-01-06 23:11                                 ` Mark H Weaver
@ 2012-01-06 23:28                                 ` Bruce Korb
  1 sibling, 0 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-06 23:28 UTC (permalink / raw)
  To: guile-devel

On 01/06/12 14:23, Bruce Korb wrote:
Since I'm dead in the water, I've patched the 2.0.3 source:

$ diff -u srfi-13.c~ srfi-13.c
--- srfi-13.c~  2011-07-06 15:50:00.000000000 -0700
+++ srfi-13.c   2012-01-06 15:26:44.963324773 -0800
@@ -2088,6 +2088,8 @@
  string_upcase_x (SCM v, size_t start, size_t end)
  {
    size_t k;
+  if (start == end)
+    return v;

    v = scm_i_string_start_writing (v);
    for (k = start; k < end; ++k)
@@ -2151,6 +2153,8 @@
  string_downcase_x (SCM v, size_t start, size_t end)
  {
    size_t k;
+  if (start == end)
+    return v;

    v = scm_i_string_start_writing (v);
    for (k = start; k < end; ++k)
@@ -2218,6 +2222,8 @@
    SCM ch;
    size_t i;
    int in_word = 0;
+  if (start == end)
+    return str;

    str = scm_i_string_start_writing (str);
    for(i = start; i < end;  i++)
@@ -2310,6 +2316,8 @@
  string_reverse_x (SCM str, size_t cstart, size_t cend)
  {
    SCM tmp;
+  if (cstart == cend)
+    return;

    str = scm_i_string_start_writing (str);
    if (cend > 0)




* Re: Guile BUG: What's wrong with this?
  2012-01-06 23:11                                 ` Mark H Weaver
@ 2012-01-06 23:35                                   ` Andy Wingo
  2012-01-06 23:41                                   ` Bruce Korb
  2012-01-07 14:35                                   ` Mark H Weaver
  2 siblings, 0 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-06 23:35 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Bruce Korb, guile-devel

On Sat 07 Jan 2012 00:11, Mark H Weaver <mhw@netris.org> writes:

> 3. Make scm_nullstr into a mutable string.  After all, it can't be
>    changed anyway, and the _only_ reference to it is from
>    scm_from_stringn, so the result should always be mutable.
>
> What do other people think?

Makes sense to me.  Good catch, Bruce.  Are you down for fixing this
one, Mark? :-)

Cheers,

Andy
-- 
http://wingolog.org/




* Re: Guile BUG: What's wrong with this?
  2012-01-06 23:11                                 ` Mark H Weaver
  2012-01-06 23:35                                   ` Andy Wingo
@ 2012-01-06 23:41                                   ` Bruce Korb
  2012-01-07 15:00                                     ` Mark H Weaver
  2012-01-07 14:35                                   ` Mark H Weaver
  2 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-06 23:41 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/06/12 15:11, Mark H Weaver wrote:
> Bruce Korb <bkorb@gnu.org> writes:
>> scm_from_locale_stringn() makes an optimization when the length is zero.
>> It returns an immutable string of length zero.
>
> Good catch!
>
>> Two possible fixes:
>>
>> 1. remove the "optimization"
>> 2. check the length in scm_string_upcase_x before choking.
>
> I see a third possible fix, which I think I like best:
>
> 3. Make scm_nullstr into a mutable string.  After all, it can't be
>     changed anyway, and the _only_ reference to it is from
>     scm_from_stringn, so the result should always be mutable.
>
> What do other people think?
>
>      Mark
>
>
>

I think you are presuming that that is the only source of zero length
immutable strings.  Are you completely certain?

Anyway:


Running socket.test
ERROR: socket.test: AF_INET6/SOCK_STREAM: bind - arguments: ((system-error "bind" "~A" ("Cannot assign requested address") (99)))
ERROR: socket.test: AF_INET6/SOCK_STREAM: bind/sockaddr - arguments: ((system-error "bind" "~A" ("Cannot assign requested address") (99)))


I'm going to assume that whatever that is, it isn't related to my change.
Though perhaps it is related to yours.  I am not sure I understand all the
ramifications of your change.




* Re: Guile BUG: What's wrong with this?
  2012-01-06 23:11                                 ` Mark H Weaver
  2012-01-06 23:35                                   ` Andy Wingo
  2012-01-06 23:41                                   ` Bruce Korb
@ 2012-01-07 14:35                                   ` Mark H Weaver
  2012-01-07 15:20                                     ` Mike Gran
                                                       ` (2 more replies)
  2 siblings, 3 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 14:35 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

I wrote:
> 3. Make scm_nullstr into a mutable string.  After all, it can't be
>    changed anyway, and the _only_ reference to it is from
>    scm_from_stringn, so the result should always be mutable.

For the record: my statement above was in error; scm_nullstr is actually
used in several files.  However, I looked at each use, and in all cases
a mutable string is appropriate.  Also, it is SCM_INTERNAL.  So I
committed the change.

However, I wonder if we should also remove this optimization from
scm_from_stringn, as Bruce suggested.  The R5RS says that `string' and
`make-string' should return "a newly allocated string", which implies
that the new string should not be `eq?' to any existing object.

Although our docs for scm_from_stringn et al do not explicitly specify
that the string is newly allocated, an argument could be made that we
should follow the behavior of `string'.

What do other people think?

      Mark




* Re: Guile BUG: What's wrong with this?
  2012-01-06 23:41                                   ` Bruce Korb
@ 2012-01-07 15:00                                     ` Mark H Weaver
  2012-01-07 15:27                                       ` Bruce Korb
  2012-01-07 15:47                                       ` David Kastrup
  0 siblings, 2 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 15:00 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/06/12 15:11, Mark H Weaver wrote:
>> Bruce Korb <bkorb@gnu.org> writes:
>>> scm_from_locale_stringn() makes an optimization when the length is zero.
>>> It returns an immutable string of length zero.
>>
>> Good catch!
>>
>>> Two possible fixes:
>>>
>>> 1. remove the "optimization"
>>> 2. check the length in scm_string_upcase_x before choking.
>>
>> I see a third possible fix, which I think I like best:
>>
>> 3. Make scm_nullstr into a mutable string.  After all, it can't be
>>     changed anyway, and the _only_ reference to it is from
>>     scm_from_stringn, so the result should always be mutable.
>>
>> What do other people think?
>>
>>      Mark
>>
>>
>>
>
> I think you are presuming that that is the only source of zero length
> immutable strings.  Are you completely certain?

Empty string literals ("") in the program text are still immutable, so
(string-upcase! "") still throws an error.

I admit that this is an arguable point.  Section 3.4 (Storage model) of
the R5RS (and the R7RS draft) says "It is an error to attempt to store a
new value into a location that is denoted by an immutable object."  An
empty string denotes no locations, so perhaps this should not be an
error after all.

The right place to fix this would probably be in
`scm_i_string_start_writing' (strings.c).

What do other people think?

      Mark




* Re: Guile BUG: What's wrong with this?
  2012-01-07 14:35                                   ` Mark H Weaver
@ 2012-01-07 15:20                                     ` Mike Gran
  2012-01-07 22:25                                     ` Ludovic Courtès
  2012-01-10  9:13                                     ` The empty string and other empty strings Ludovic Courtès
  2 siblings, 0 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-07 15:20 UTC (permalink / raw)
  To: Mark H Weaver, Bruce Korb; +Cc: guile-devel@gnu.org

> From: Mark H Weaver <mhw@netris.org>
> 
> I wrote:
>>  3. Make scm_nullstr into a mutable string.  After all, it can't be
>>     changed anyway, and the _only_ reference to it is from
>>     scm_from_stringn, so the result should always be mutable.
> 
> For the record: my statement above was in error; scm_nullstr is actually
> used in several files.  However, I looked at each use, and in all cases
> a mutable string is appropriate.  Also, it is SCM_INTERNAL.  So I
> committed the change.
> 
> However, I wonder if we should also remove this optimization from
> scm_from_stringn, as Bruce suggested.  The R5RS says that `string' and
> `make-string' should return "a newly allocated string", which implies
> that the new string should not be `eq?' to any existing object.

I threw in the optimization a couple of years ago into scm_from_stringn
only because I saw it used elsewhere in the code.  This was well before
Guile-2.0's switch of the immutable flag.  So there wasn't much thought
behind it.
 
-Mike

> 
> Although our docs for scm_from_stringn et al do not explicitly specify
> that the string is newly allocated, an argument could be made that we
> should follow the behavior of `string'.
> 
> What do other people think?
> 
>       Mark
>  




* Re: Guile BUG: What's wrong with this?
  2012-01-07 15:00                                     ` Mark H Weaver
@ 2012-01-07 15:27                                       ` Bruce Korb
  2012-01-07 16:38                                         ` Mark H Weaver
  2012-01-07 15:47                                       ` David Kastrup
  1 sibling, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-07 15:27 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/07/12 07:00, Mark H Weaver wrote:
>> I think you are presuming that that is the only source of zero length
>> immutable strings.  Are you completely certain?
>
> Empty string literals ("") in the program text are still immutable, so
> (string-upcase! "") still throws an error.
>
> I admit that this is an arguable point.  Section 3.4 (Storage model) of
> the R5RS (and the R7RS draft) says "It is an error to attempt to store a
> new value into a location that is denoted by an immutable object."  An
> empty string denotes no locations, so perhaps this should not be an
> error after all.
>
> The right place to fix this would probably be in
> `scm_i_string_start_writing' (strings.c).
>
> What do other people think?

I think it is too much effort for that function.  I looked at it.
The problem is that you'd have to pass it the start and end points,
that is, change its interface.  Not worth it.  Either do as you've
done and have a shared writable zero length string, or exit the
functions that use scm_i_string_start_writing before calling that
function, in the event that these string transformation functions
detect a zero length string (my patch).  Either way.




* Re: Fixed string corruption bugs (was Guile: What's wrong with this?)
  2012-01-04 23:19                 ` Mark H Weaver
  2012-01-04 23:28                   ` Bruce Korb
@ 2012-01-07 15:43                   ` Mark H Weaver
  2012-01-07 16:19                     ` Fixed string corruption bugs Andy Wingo
  1 sibling, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 15:43 UTC (permalink / raw)
  To: guile-devel

> From a8da72937ff4d04e8d39531773cc05e676b2be1c Mon Sep 17 00:00:00 2001
> From: Mark H Weaver <mhw@netris.org>
> Date: Wed, 4 Jan 2012 17:59:27 -0500
> Subject: [PATCH] Fix bugs related to mutation-sharing substrings
>
> * libguile/strings.c (scm_i_is_narrow_string, scm_i_try_narrow_string,
>   scm_i_string_set_x): Check to see if the provided string is a
>   mutation-sharing substring, and do the right thing in that case.
>   Previously, if such a string was passed to these functions, they would
>   behave very badly: while trying to fetch and/or mutate the cell
>   containing the stringbuf, they were actually fetching or mutating the
>   cell containing the original shared string.  That's because
>   mutation-sharing substrings store the original string in CELL_1,
>   whereas all other strings store the stringbuf there.

I committed this.  Here's an example that segfaulted before these fixes:

  scheme@(guile-user)> (define s (string-copy "hello"))
  scheme@(guile-user)> (define ss (substring/shared s 1 4))
  scheme@(guile-user)> (string-set! ss 0 #\λ)
  Segmentation fault
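
The class of bug can be sketched in plain C.  Everything below is a
toy model with made-up names (toy_buf, toy_string, slot1, toy_resolve,
toy_string_set_x); it does not reproduce Guile's actual cell layout.
The point is only that the same slot means two different things
depending on the kind of string, so code must check the kind before
treating the slot as a buffer.

```c
#include <stddef.h>

/* Toy character buffer. */
typedef struct toy_buf {
    char  *chars;
    size_t len;
} toy_buf;

typedef enum { TOY_PLAIN, TOY_SHARED } toy_kind;

/* Toy string: slot1 holds a buffer for plain strings, but the parent
   string for mutation-sharing substrings. */
typedef struct toy_string {
    toy_kind kind;
    union {
        toy_buf           *buf;     /* valid when kind == TOY_PLAIN  */
        struct toy_string *parent;  /* valid when kind == TOY_SHARED */
    } slot1;
    size_t start;  /* offset into the parent (or into the buffer) */
    size_t len;
} toy_string;

/* Follow parent links until a plain string is reached, accumulating
   offsets.  Reading slot1.buf on a TOY_SHARED string without this
   step would misinterpret a string object as a buffer. */
static void toy_resolve(const toy_string *s, toy_buf **buf, size_t *offset)
{
    size_t off = s->start;
    while (s->kind == TOY_SHARED) {
        s = s->slot1.parent;
        off += s->start;
    }
    *buf = s->slot1.buf;
    *offset = off;
}

/* Returns 0 on success, -1 if the index is out of range. */
int toy_string_set_x(toy_string *s, size_t i, char c)
{
    toy_buf *buf;
    size_t off;

    if (i >= s->len)
        return -1;
    toy_resolve(s, &buf, &off);
    buf->chars[off + i] = c;
    return 0;
}
```

With a plain string over "hello" and a shared substring covering
indices 1 to 4, setting index 0 of the substring changes index 1 of
the parent's buffer, which is the behaviour the REPL session above
expects after the fix.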

     Mark




* Re: Guile BUG: What's wrong with this?
  2012-01-07 15:00                                     ` Mark H Weaver
  2012-01-07 15:27                                       ` Bruce Korb
@ 2012-01-07 15:47                                       ` David Kastrup
  2012-01-07 17:07                                         ` Mark H Weaver
  1 sibling, 1 reply; 117+ messages in thread
From: David Kastrup @ 2012-01-07 15:47 UTC (permalink / raw)
  To: guile-devel

Mark H Weaver <mhw@netris.org> writes:

> Empty string literals ("") in the program text are still immutable, so
> (string-upcase! "") still throws an error.
>
> I admit that this is an arguable point.  Section 3.4 (Storage model) of
> the R5RS (and the R7RS draft) says "It is an error to attempt to store a
> new value into a location that is denoted by an immutable object."  An
> empty string denotes no locations, so perhaps this should not be an
> error after all.
>
> The right place to fix this would probably be in
> `scm_i_string_start_writing' (strings.c).
>
> What do other people think?

Mutating list operations are allowed on '() (and do not change it).
'(), the empty list structure, is eq? to itself regardless how you
arrived at it.  I think it would give some logical symmetry if the same
held for "" and #().  "" is obviously a valid substring of either
mutable or immutable strings.  The result of (string-append! x "")
should leave the immutability state of x alone.  One rationale behind
that is more or less that the immutability is a property of the
characters of the string, and "" has no characters of its own and does
not contribute to the characters.

If there are predicates "immutable-string?" and "mutable-string?" (I
don't have Guile 2 installed), then "" would be the only string
satisfying both predicates.

Efficiency of implementation might make other choices preferable, but
that's what I would consider logically satisfying.

-- 
David Kastrup





* Re: Guile: What's wrong with this?
  2012-01-06 19:06                               ` Bruce Korb
  2012-01-06 19:19                                 ` David Kastrup
@ 2012-01-07 16:13                                 ` Mark H Weaver
  2012-01-07 17:35                                   ` mutable interfaces - was: " Bruce Korb
  1 sibling, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 16:13 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> Fail early and fail hard.  Yes.  But after all these discussions, I
> now doubt I have too many places where I am expecting to change a
> static value.

That's good news! :)

> Most of the strings that I wind up altering are created with a
> scm_from_locale_string() C function call.

BTW, beware that scm_from_locale_string() is only appropriate for
strings that came from the user (e.g. command-line arguments, reading
from a port, etc).  When converting string literals from your own source
code, you should use scm_from_latin1_string() or scm_from_utf8_string().

Similarly, to make symbols from C string literals, use
scm_from_latin1_symbol() or scm_from_utf8_symbol().

Caveat: these functions did not exist in Guile 1.8.  If your C string
literals are ASCII-only, I guess it won't matter in practice which
function you use, although it would be good to spread the understanding
that C string literals should not be interpreted according to the user's
locale.

    Best,
     Mark




* Re: Fixed string corruption bugs
  2012-01-07 15:43                   ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver
@ 2012-01-07 16:19                     ` Andy Wingo
  0 siblings, 0 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-07 16:19 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On Sat 07 Jan 2012 16:43, Mark H Weaver <mhw@netris.org> writes:

>> Subject: [PATCH] Fix bugs related to mutation-sharing substrings

Cool!

> I committed this.  Here's an example that segfaulted before these fixes:
>
>   scheme@(guile-user)> (define s (string-copy "hello"))
>   scheme@(guile-user)> (define ss (substring/shared s 1 4))
>   scheme@(guile-user)> (string-set! ss 0 #\λ)
>   Segmentation fault

Probably a good idea to add it (or something like it) to the test suite
:)

Andy
-- 
http://wingolog.org/




* Re: Guile BUG: What's wrong with this?
  2012-01-07 15:27                                       ` Bruce Korb
@ 2012-01-07 16:38                                         ` Mark H Weaver
  2012-01-07 17:39                                           ` Bruce Korb
  0 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 16:38 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> On 01/07/12 07:00, Mark H Weaver wrote:
>> The right place to fix this would probably be in
>> `scm_i_string_start_writing' (strings.c).
>
> I think it too much effort for that function.  I looked at it.
> The problem is that you'd have to pass it the start and end points,
> that is, change its interface.

Ah yes, excellent point!

Indeed, it would not be enough for `scm_i_string_start_writing' to check
for an empty string.  Even for non-empty immutable strings, if the range
of character indices passed to a string mutator is empty, then no
characters will be changed, and therefore the operation should succeed.
`scm_i_string_start_writing' doesn't have enough information to detect
this case.

I see two choices:

* Modify the interface to `scm_i_string_start_writing' to give it the
  `start' and `end' indices.

* Add checks to all string mutation functions: if the range is empty,
  then avoid calling `scm_i_string_start_writing'.

The advantage to the first approach is that authors of future string
mutators won't have to remember to handle this case specially, and I
have very little confidence that they would.

I'll look into this.
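
A sketch of the first choice, as a toy C model rather than a patch:
toy_string, toy_start_writing_range and toy_fill_x are hypothetical
names, and the return codes are invented for the example.  The write
check receives the range, so an empty range succeeds even on a
read-only string and each mutator gets that behaviour for free.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy string object; NOT Guile's real representation. */
typedef struct {
    char  *chars;
    size_t len;
    bool   read_only;
} toy_string;

/* Range-aware write check.
   Returns 0 if writing [start, end) is allowed,
          -1 if the string is read-only and the range is non-empty,
          -2 if the range is invalid. */
int toy_start_writing_range(const toy_string *s, size_t start, size_t end)
{
    if (start > end || end > s->len)
        return -2;
    if (start == end)
        return 0;  /* no character would change, so nothing to forbid */
    return s->read_only ? -1 : 0;
}

/* Example mutator: it only calls the check and loops.  It needs no
   special case for empty ranges because the check handles them. */
int toy_fill_x(toy_string *s, size_t start, size_t end, char c)
{
    int rc = toy_start_writing_range(s, start, end);
    if (rc != 0)
        return rc;
    for (size_t k = start; k < end; ++k)
        s->chars[k] = c;
    return 0;
}
```

The second choice would instead put the start == end test at the top
of every mutator such as toy_fill_x, which is what each future mutator
author would have to remember.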

    Mark




* Re: Guile BUG: What's wrong with this?
  2012-01-07 15:47                                       ` David Kastrup
@ 2012-01-07 17:07                                         ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 17:07 UTC (permalink / raw)
  To: David Kastrup; +Cc: guile-devel

David Kastrup <dak@gnu.org> writes:
> Mutating list operations are allowed on '() (and do not change it).
> '(), the empty list structure, is eq? to itself regardless how you
> arrived at it.

Excellent point.  The R5RS says that `list' returns "a newly allocated
list", but that's obviously not true for (list).  So I guess we can take
this as a precedent that the "newly allocated" language does not
necessarily apply in the 0-element case.

I wonder if the R7RS should make this point explicit.  It's obvious for
lists, but not for vectors or strings.

> The result of (string-append! x "") should leave the immutability
> state of x alone.

There's no `string-append!' nor anything like it, because in Scheme the
length of strings is fixed.  Only the characters themselves can be
changed, not the length.

> If there are predicates "immutable-string?" and "mutable-string?" (I
> don't have Guilev2 installed), then "" would be the only string
> satisfying both predicates.

There are no such predicates, and I don't see any good use for them.  If
you need to check whether a string is mutable, then you shouldn't be
mutating it anyway.

Anyway, mutability is not a property of strings in particular, but of
all objects.  Or at least it should be.  Right now, we don't enforce
immutability of literal lists or vectors, but we should.

     Mark




* Re: mutable interfaces - was: Guile: What's wrong with this?
  2012-01-07 16:13                                 ` Mark H Weaver
@ 2012-01-07 17:35                                   ` Bruce Korb
  2012-01-07 17:47                                     ` David Kastrup
  2012-01-07 18:30                                     ` Mark H Weaver
  0 siblings, 2 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-07 17:35 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/07/12 08:13, Mark H Weaver wrote:
>> Most of the strings that I wind up altering are created with a
>> scm_from_locale_string() C function call.
>
> BTW, beware that scm_from_locale_string() is only appropriate for
> strings that came from the user (e.g. command-line arguments, reading
> from a port, etc).  When converting string literals from your own source
> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>
> Similarly, to make symbols from C string literals, use
> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>
> Caveat: these functions did not exist in Guile 1.8.  If your C string
> literals are ASCII-only, I guess it won't matter in practice which
> function you use, although it would be good to spread the understanding
> that C string literals should not be interpreted according to the user's
> locale.

I go back to my argument that a facilitation language needs to focus
on being as helpful as possible.  That means doing what is likely
wanted instead of throwing errors at every possibility.  It also means
not changing interfaces.  It is certainly much more stable now than
it was in the 1.4 to 1.6 transition era, but still.

Anyway, this then?  (abbreviated)

#if   GUILE_VERSION < 107000
# define AG_SCM_STR02SCM(_s)          scm_makfrom0str(_s)
# define AG_SCM_STR2SCM(_st,_sz)      scm_mem2string(_st,_sz)

#elif   GUILE_VERSION < 200000
# define AG_SCM_STR02SCM(_s)          scm_from_locale_string(_s)
# define AG_SCM_STR2SCM(_st,_sz)      scm_from_locale_stringn(_st,_sz)

#elif   GUILE_VERSION < 200004
#error "autogen does not work with this version of guile"
   choke me.

#else
# define AG_SCM_STR02SCM(_s)          scm_from_utf8_string(_s)
# define AG_SCM_STR2SCM(_st,_sz)      scm_from_utf8_stringn(_st,_sz)
#endif




* Re: Guile BUG: What's wrong with this?
  2012-01-07 16:38                                         ` Mark H Weaver
@ 2012-01-07 17:39                                           ` Bruce Korb
  2012-01-09 15:41                                             ` Mark H Weaver
  0 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-07 17:39 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/07/12 08:38, Mark H Weaver wrote:
> * Modify the interface to `scm_i_string_start_writing' to give it the
>    `start' and `end' indices.
>
> * Add checks to all string mutation functions: if the range is empty,
>    then avoid calling `scm_i_string_start_writing'.

Yes.  All of them.  All four.

> The advantage to the first approach is that authors of future string
> mutators won't have to remember to handle this case specially, and I
> have very little confidence that they would.
>
> I'll look into this.

Either way.  The advantage of quitting a string transformation function
early when the length to modify is zero is you save more overhead than
just calling scm_i_string_start_writing.  But it's your call.  Whatever.
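
A minimal sketch of the early-exit idea, for illustration only.  The
`toy_string' struct and `toy_upcase_range' are hypothetical stand-ins and
not Guile's internals; the point is just that an empty range returns
before the read-only check is ever reached.

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical stand-in for a Scheme string; not Guile's real layout. */
struct toy_string {
    char  *chars;
    size_t len;
    int    read_only;
};

/* Upcase the characters in [start, end).
   Returns 0 on success, -1 for a bad range, and -2 if a non-empty
   range of a read-only string would have to be written. */
int toy_upcase_range(struct toy_string *s, size_t start, size_t end)
{
    if (start > end || end > s->len)
        return -1;
    if (start == end)
        return 0;               /* nothing to write: skip the read-only check */
    if (s->read_only)
        return -2;
    for (size_t i = start; i < end; i++)
        s->chars[i] = (char) toupper((unsigned char) s->chars[i]);
    return 0;
}
```

With this shape, upcasing an empty read-only string succeeds as a no-op
instead of raising "string is read-only".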



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: mutable interfaces - was: Guile: What's wrong with this?
  2012-01-07 17:35                                   ` mutable interfaces - was: " Bruce Korb
@ 2012-01-07 17:47                                     ` David Kastrup
  2012-01-07 18:30                                     ` Mark H Weaver
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-07 17:47 UTC (permalink / raw)
  To: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/07/12 08:13, Mark H Weaver wrote:
>>> Most of the strings that I wind up altering are created with a
>>> scm_from_locale_string() C function call.
>>
>> BTW, beware that scm_from_locale_string() is only appropriate for
>> strings that came from the user (e.g. command-line arguments, reading
>> from a port, etc).  When converting string literals from your own source
>> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>>
>> Similarly, to make symbols from C string literals, use
>> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>>
>> Caveat: these functions did not exist in Guile 1.8.  If your C string
>> literals are ASCII-only, I guess it won't matter in practice which
>> function you use, although it would be good to spread the understanding
>> that C string literals should not be interpreted according to the user's
>> locale.
>
> I go back to my argument that a facilitation language needs to focus
> on being as helpful as possible.  That means doing what is likely
> wanted instead of throwing errors at every possibility.  It also means
> not changing interfaces.

Undefined behavior is not an interface.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: mutable interfaces - was: Guile: What's wrong with this?
  2012-01-07 17:35                                   ` mutable interfaces - was: " Bruce Korb
  2012-01-07 17:47                                     ` David Kastrup
@ 2012-01-07 18:30                                     ` Mark H Weaver
  2012-01-07 18:55                                       ` Mark H Weaver
  1 sibling, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 18:30 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:

> On 01/07/12 08:13, Mark H Weaver wrote:
>>> Most of the strings that I wind up altering are created with a
>>> scm_from_locale_string() C function call.
>>
>> BTW, beware that scm_from_locale_string() is only appropriate for
>> strings that came from the user (e.g. command-line arguments, reading
>> from a port, etc).  When converting string literals from your own source
>> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>>
>> Similarly, to make symbols from C string literals, use
>> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>>
>> Caveat: these functions did not exist in Guile 1.8.  If your C string
>> literals are ASCII-only, I guess it won't matter in practice which
>> function you use, although it would be good to spread the understanding
>> that C string literals should not be interpreted according to the user's
>> locale.
>
> I go back to my argument that a facilitation language needs to focus
> on being as helpful as possible.  That means doing what is likely
> wanted instead of throwing errors at every possibility.  It also means
> not changing interfaces.

Sorry, but there's no way to maintain backward compatibility here.  I
know it's a pain, but there's no getting around the fact that in order
to write proper internationalized code, we now need to think carefully
about what encoding a particular string is in.  There's no automatic way
to handle this, not even in principle.

Fortunately, most modern GNU/Linux systems default to a UTF-8 locale, in
which case scm_from_locale_string and scm_from_utf8_string will be the
same anyway.  However, there are still some systems that use a non-UTF-8
locale, and we must strive to support them properly.

> Anyway, this then?  (abbreviated)
>
> #if   GUILE_VERSION < 107000
> # define AG_SCM_STR02SCM(_s)          scm_makfrom0str(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_mem2string(_st,_sz)
>
> #elif   GUILE_VERSION < 200000
> # define AG_SCM_STR02SCM(_s)          scm_from_locale_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_locale_stringn(_st,_sz)
>
> #elif   GUILE_VERSION < 200004
> #error "autogen does not work with this version of guile"
>   choke me.

This last clause is wrong.  scm_from_utf8_string and
scm_from_utf8_stringn were in Guile 2.0.0.

> #else
> # define AG_SCM_STR02SCM(_s)          scm_from_utf8_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_utf8_stringn(_st,_sz)
> #endif

Just remember that this change implies that these macros should only be
used for C string literals, and must _not_ be used for strings supplied
by the user (e.g. command-line arguments and I/O).

It could very well be that you're currently overloading these functions
for both purposes, in which case you should split this pair of macros
into two distinct pairs: one pair of macros for user strings (keep using
scm_from_locale_string{,n} for these), and one pair for C string
literals (use scm_from_utf8_string{,n} for Guile 2.0.0 or newer).

Then look at each use of these old overloaded macros in your code, and
figure out whether it's operating on a string that came from the user or
a string that came from your own source code.
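
A rough sketch of what that split might look like.  The macro names
(AG_SCM_USERSTR2SCM, AG_SCM_LITSTR2SCM, and the helper names) are made
up for illustration, and the default GUILE_VERSION value is an assumption
so the fragment stands alone; only the scm_* names come from this thread.
The two small functions just report which constructor each macro would
select, so the fragment can be checked without linking against Guile.

```c
/* Assumed for illustration only; a real build would take this value
   from its configure step. */
#ifndef GUILE_VERSION
# define GUILE_VERSION 200004
#endif

#if   GUILE_VERSION < 107000
# define AG_USER_CTOR     scm_makfrom0str
# define AG_LITERAL_CTOR  scm_makfrom0str
#elif GUILE_VERSION < 200000
# define AG_USER_CTOR     scm_from_locale_string
# define AG_LITERAL_CTOR  scm_from_locale_string
#else
/* Guile 2.0.0 or newer: user bytes stay in the locale encoding,
   C string literals are treated as UTF-8. */
# define AG_USER_CTOR     scm_from_locale_string
# define AG_LITERAL_CTOR  scm_from_utf8_string
#endif

/* Hypothetical macro names: one for strings that came from the user,
   one for literals in the program's own source. */
#define AG_SCM_USERSTR2SCM(_s)  AG_USER_CTOR(_s)
#define AG_SCM_LITSTR2SCM(_s)   AG_LITERAL_CTOR(_s)

/* Two-level stringification so the selected name is expanded first. */
#define AG_STR_(x) #x
#define AG_STR(x)  AG_STR_(x)

const char *ag_user_ctor_name(void)    { return AG_STR(AG_USER_CTOR); }
const char *ag_literal_ctor_name(void) { return AG_STR(AG_LITERAL_CTOR); }
```

The counted variants (scm_from_locale_stringn and scm_from_utf8_stringn)
would follow the same pattern.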

Again, I stress that this has nothing to do with Guile.  All software,
if it wishes to be properly internationalized, needs to think about
where a string came from.  In general, your program's source code (and
thus the C string literals it contains) will have a different encoding
than C strings that come from the user.  C strings of different
encodings are essentially of different types (even though C's type
system is too crude to distinguish them), and you must treat them as
such.

      Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: mutable interfaces - was: Guile: What's wrong with this?
  2012-01-07 18:30                                     ` Mark H Weaver
@ 2012-01-07 18:55                                       ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-07 18:55 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Replying to myself...

> Again, I stress that this has nothing to do with Guile.  All software,
> if it wishes to be properly internationalized, needs to think about
> where a string came from.  In general, your program's source code (and
> thus the C string literals it contains) will have a different encoding
> than C strings that come from the user.  C strings of different
> encodings are essentially of different types (even though C's type
> system is too crude to distinguish them), and you must treat them as
> such.

In case it wasn't clear: Scheme strings don't have any encoding; they
are a sequence of Unicode characters.  Therefore, you never have to
think about where a Scheme string came from.  What you need to think
about is where a raw sequence of bytes came from, whether it be a C
string (C chars are not characters but merely bytes), a Scheme
bytevector, or the bytes in a command-line argument, environment
variable, or the bytes read from a file descriptor.
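
To make that concrete, here is a small sketch (plain C, no Guile
involved; the function names are made up for illustration) showing that
the same two bytes are two characters when read as Latin-1 but only one
when read as UTF-8.  The UTF-8 counter assumes well-formed input and does
no validation.

```c
#include <stddef.h>

/* Latin-1: every byte is exactly one character. */
size_t count_chars_latin1(const unsigned char *bytes, size_t n)
{
    (void) bytes;               /* the content does not matter, only the length */
    return n;
}

/* UTF-8: count the bytes that are not continuation bytes (10xxxxxx).
   Assumes well-formed UTF-8; no validation is performed. */
size_t count_chars_utf8(const unsigned char *bytes, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if ((bytes[i] & 0xC0) != 0x80)
            count++;
    return count;
}
```

For the bytes 0xC3 0xA9 the first function reports 2 characters and the
second reports 1, which is why the origin of the bytes has to be known
before they are turned into a Scheme string.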

Ideally, our code would make these distinctions very clear.  However, if
you're not motivated (or don't have time) to fix that properly right
now, there's one fact that can save you a lot of time: on GNU/Linux and
POSIX systems, every locale encoding is compatible with ASCII.
Therefore, if you know that a string contains only ASCII characters,
then you don't need to think about whether to use scm_from_locale_string
or scm_from_utf8_string, because they'll both be equivalent.
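
A quick way to check that precondition, as a sketch only (the helper
name is hypothetical and it is plain C, not a Guile API):

```c
#include <stddef.h>

/* Returns 1 if every byte of the NUL-terminated string is 7-bit ASCII,
   0 otherwise.  The empty string counts as ASCII-only. */
int is_ascii_only(const char *s)
{
    for (; *s != '\0'; s++)
        if ((unsigned char) *s > 0x7F)
            return 0;
    return 1;
}
```

If this returns 1 for a given C string, the choice between the locale
and UTF-8 conversion functions makes no difference for that string.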

     Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile: What's wrong with this?
  2012-01-06  1:41                         ` Mark H Weaver
  2012-01-06  2:38                           ` Noah Lavine
  2012-01-06 13:37                           ` Mike Gran
@ 2012-01-07 20:57                           ` Ian Price
  2012-01-08  5:05                             ` Mark H Weaver
  2 siblings, 1 reply; 117+ messages in thread
From: Ian Price @ 2012-01-07 20:57 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Bruce Korb, guile-devel

Mark H Weaver <mhw@netris.org> writes:

> As I understand it, in the Scheme standards (at least before R6RS's
> immutable pairs) the rationale behind marking literal constants as
> immutable is solely to avoid needlessly making copies of those literals,
> while flagging accidental attempts to modify them, since that is almost
> certainly a mistake.
Erm, if you don't count literals, which were already immutable, then
R6RS doesn't have immutable pairs. It does move the mutators to a
separate module, but that is not really equivalent, because even if
you don't import (rnrs mutable-pairs), another module may mutate pairs
returned by your library. Ditto for strings, etc.

To quote section 5.10
"Literal constants, the strings returned by symbol->string, records with
no mutable fields, and other values explicitly designated as immutable
are immutable objects, while all objects created by the other procedures
listed in this report are mutable."

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile BUG: What's wrong with this?
  2012-01-07 14:35                                   ` Mark H Weaver
  2012-01-07 15:20                                     ` Mike Gran
@ 2012-01-07 22:25                                     ` Ludovic Courtès
  2012-01-10  9:13                                     ` The empty string and other empty strings Ludovic Courtès
  2 siblings, 0 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-07 22:25 UTC (permalink / raw)
  To: guile-devel

Hi,

Mark H Weaver <mhw@netris.org> skribis:

> I wrote:
>> 3. Make scm_nullstr into a mutable string.  After all, it can't be
>>    changed anyway, and the _only_ reference to it is from
>>    scm_from_stringn, so the result should always be mutable.
>
> For the record: my statement above was in error; scm_nullstr is actually
> used in several files.  However, I looked at each use, and in all cases
> a mutable string is appropriate.  Also, it is SCM_INTERNAL.  So I
> committed the change.

Good!

> However, I wonder if we should also remove this optimization from
> scm_from_stringn, as Bruce suggested.  The R5RS says that `string' and
> `make-string' should return "a newly allocated string", which implies
> that the new string should not be `eq?' to any existing object.
>
> Although our docs for scm_from_stringn et al do not explicitly specify
> that the string is newly allocated, an argument could be made that we
> should follow the behavior of `string'.
>
> What do other people think?

Makes sense to return a new empty string, yes.

Ludo’, who is hoping for the day where strings are immutable, period.  :-)




^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile: What's wrong with this?
  2012-01-07 20:57                           ` Guile: " Ian Price
@ 2012-01-08  5:05                             ` Mark H Weaver
  0 siblings, 0 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-08  5:05 UTC (permalink / raw)
  To: Ian Price; +Cc: Bruce Korb, guile-devel

Hi Ian!

Ian Price <ianprice90@googlemail.com> writes:

> Mark H Weaver <mhw@netris.org> writes:
>
>> As I understand it, in the Scheme standards (at least before R6RS's
>> immutable pairs) the rationale behind marking literal constants as
>> immutable is solely to avoid needlessly making copies of those literals,
>> while flagging accidental attempts to modify them, since that is almost
>> certainly a mistake.
> Erm, if you don't count literals, which were already immutable, then
> R6RS doesn't have immutable pairs. It does move the mutators to a
> separate module, but that is not really equivalent, because even if
> you don't import (rnrs mutable-pairs), another module may mutate pairs
> returned by your library. Ditto for strings, etc.
>
> To quote section 5.10
> "Literal constants, the strings returned by symbol->string, records with
> no mutable fields, and other values explicitly designated as immutable
> are immutable objects, while all objects created by the other procedures
> listed in this report are mutable."

Ah, I guess you're right.  I never studied the R6RS carefully outside of
its handling of numerics.  I wrote "at least before R6RS" to indicate
that I was only knowledgeable about earlier versions.

Racket's immutable pairs represent a break in the older tradition.  Last
I looked anyway, Racket's mutable pairs cannot even be accessed with the
standard `car' and `cdr'.  Therefore, they really are a different (and
incompatible) type from immutable pairs.

I still suspect that the rationale behind immutable pairs in the R6RS is
to discourage mutation of pairs, to give compiler implementations such
as Racket the freedom to make pairs truly immutable and thus benefit from
better optimization.  However, I mistakenly implied that immutable pairs
were a distinct type in the R6RS itself, and for that I apologize.

    Thanks,
      Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile BUG: What's wrong with this?
  2012-01-07 17:39                                           ` Bruce Korb
@ 2012-01-09 15:41                                             ` Mark H Weaver
  2012-01-09 17:27                                               ` Bruce Korb
  0 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-09 15:41 UTC (permalink / raw)
  To: Bruce Korb; +Cc: guile-devel

Bruce Korb <bkorb@gnu.org> writes:
> On 01/07/12 08:38, Mark H Weaver wrote:
>> * Add checks to all string mutation functions: if the range is empty,
>>    then avoid calling `scm_i_string_start_writing'.
>
> Yes.  All of them.  All four.

For the record, there were 7 string mutation functions to fix :-P

Anyway, I did as you suggested, and left `scm_i_string_start_writing'
alone.

   Thanks,
     Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile BUG: What's wrong with this?
  2012-01-09 15:41                                             ` Mark H Weaver
@ 2012-01-09 17:27                                               ` Bruce Korb
  2012-01-09 18:32                                                 ` Andy Wingo
  0 siblings, 1 reply; 117+ messages in thread
From: Bruce Korb @ 2012-01-09 17:27 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

On 01/09/12 07:41, Mark H Weaver wrote:
>> Yes.  All of them.  All four.
>
> For the record, there were 7 string mutation functions to fix :-P

I guess my cscope search was not exhaustive enough.
Thanks!  We are talking 2.0.4, yes?  When might that be?  :)

Regards, Bruce



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile BUG: What's wrong with this?
  2012-01-09 17:27                                               ` Bruce Korb
@ 2012-01-09 18:32                                                 ` Andy Wingo
  2012-01-09 19:48                                                   ` Bruce Korb
  0 siblings, 1 reply; 117+ messages in thread
From: Andy Wingo @ 2012-01-09 18:32 UTC (permalink / raw)
  To: Bruce Korb; +Cc: Mark H Weaver, guile-devel

On Mon 09 Jan 2012 18:27, Bruce Korb <bkorb@gnu.org> writes:

> We are talking 2.0.4, yes?  When might that be?  :)

A week? :)

Andy
-- 
http://wingolog.org/



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Guile BUG: What's wrong with this?
  2012-01-09 18:32                                                 ` Andy Wingo
@ 2012-01-09 19:48                                                   ` Bruce Korb
  0 siblings, 0 replies; 117+ messages in thread
From: Bruce Korb @ 2012-01-09 19:48 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Mark H Weaver, guile-devel

On 01/09/12 10:32, Andy Wingo wrote:
> On Mon 09 Jan 2012 18:27, Bruce Korb <bkorb@gnu.org> writes:
>
>> We are talking 2.0.4, yes?  When might that be?  :)
>
> A week? :)

Wonderful!  I just wanted reassurance it wasn't months.
My own round tuits for hobby time things are often measured in months.  :(



^ permalink raw reply	[flat|nested] 117+ messages in thread

* The empty string and other empty strings
  2012-01-07 14:35                                   ` Mark H Weaver
  2012-01-07 15:20                                     ` Mike Gran
  2012-01-07 22:25                                     ` Ludovic Courtès
@ 2012-01-10  9:13                                     ` Ludovic Courtès
  2012-01-10 11:28                                       ` Mike Gran
                                                         ` (2 more replies)
  2 siblings, 3 replies; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-10  9:13 UTC (permalink / raw)
  To: guile-devel

Hello Mark!

Mark H Weaver <mhw@netris.org> skribis:

> I wrote:
>> 3. Make scm_nullstr into a mutable string.  After all, it can't be
>>    changed anyway, and the _only_ reference to it is from
>>    scm_from_stringn, so the result should always be mutable.
>
> For the record: my statement above was in error; scm_nullstr is actually
> used in several files.  However, I looked at each use, and in all cases
> a mutable string is appropriate.  Also, it is SCM_INTERNAL.  So I
> committed the change.

I just noticed that there are i18n.test failures on Hydra, which point
at this commit:

  http://hydra.nixos.org/build/1790097

I think this is under the C locale, but I haven’t been able to reproduce
it yet.

Anyway, it seems that before, you couldn’t get any encoding error for
scm_from_stringn ("", "SOME-ENCODING"), whereas now you can.

A related question: can we have both narrow and wide empty strings?

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10  9:13                                     ` The empty string and other empty strings Ludovic Courtès
@ 2012-01-10 11:28                                       ` Mike Gran
  2012-01-10 13:03                                         ` Mark H Weaver
  2012-01-10 14:10                                         ` David Kastrup
  2012-01-10 12:21                                       ` Mike Gran
  2012-01-10 12:27                                       ` Mark H Weaver
  2 siblings, 2 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-10 11:28 UTC (permalink / raw)
  To: Ludovic Courtès, guile-devel@gnu.org

> From: Ludovic Courtès <ludo@gnu.org>
> A related question: can we have both narrow and wide empty strings?

The intention is that a string is encoded as wide only if it can't
be encoded as narrow.  So _newly created_ empty strings should only be narrow.
 
Right now it seems that a zero-length shared substring of a wide string
is wide.  A zero-length substring still shares the stringbuf of the
original string.
 
(%string-dump
  (substring
    (apply string (map integer->char (list 2001 2002 2003)))
   3))
 
So I guess the answer is that you can have both wide and narrow empty
strings if you believe that zero-length substrings need to point to a
zero-length part of the stringbuf of the parent string from which
they were generated.  This is a little pedantic, but I think it might
be the right answer.
 
What do you think about that?  Do zero-length substrings need to
still share stringbufs with their parent strings?
 
In any case, a string-copy of a narrow substring of an otherwise wide string
should be a new narrow string.  This should apply to zero-length
substrings as well.  This isn't happening, because we're missing
a scm_i_try_narrow_string in string-copy, which is a bug.
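
As a sketch of the rule I have in mind (not the actual
scm_i_try_narrow_string code; the function name is made up, and I am
assuming "narrow" means one byte per character, i.e. code points up to
0xFF): a sequence can be stored narrow when no code point needs more
than one byte, and an empty sequence trivially qualifies.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every code point fits in one byte, so a narrow
   representation would suffice; 0 otherwise.  An empty sequence
   always fits. */
int fits_narrow(const uint32_t *cps, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (cps[i] > 0xFF)
            return 0;
    return 1;
}
```

Under that rule the three characters 2001, 2002, 2003 from the example
above need a wide string, but any zero-length slice of them does not.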
 
Thanks,
 
Mike



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10  9:13                                     ` The empty string and other empty strings Ludovic Courtès
  2012-01-10 11:28                                       ` Mike Gran
@ 2012-01-10 12:21                                       ` Mike Gran
  2012-01-10 12:27                                       ` Mark H Weaver
  2 siblings, 0 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-10 12:21 UTC (permalink / raw)
  To: Ludovic Courtès, guile-devel@gnu.org

> From: Ludovic Courtès <ludo@gnu.org>
> I just noticed that there are i18n.test failures on Hydra, which point
> at this commit:
> 
>   http://hydra.nixos.org/build/1790097
> 
> I think this is under the C locale, but I haven’t been able to reproduce
> it yet.
> 
> Anyway, it seems that before, you couldn’t get any encoding error for
> scm_from_stringn ("", "SOME-ENCODING"), whereas now you can.
 
Looks like for zero-length input strings, u32_conv_from_encoding can return NULL.
 
-Mike 



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10  9:13                                     ` The empty string and other empty strings Ludovic Courtès
  2012-01-10 11:28                                       ` Mike Gran
  2012-01-10 12:21                                       ` Mike Gran
@ 2012-01-10 12:27                                       ` Mark H Weaver
  2012-01-10 16:34                                         ` Ludovic Courtès
  2 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-10 12:27 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:
> Anyway, it seems that before, you couldn’t get any encoding error for
> scm_from_stringn ("", "SOME-ENCODING"), whereas now you can.

Good point.  I just committed a change to avoid this.

> A related question: can we have both narrow and wide empty strings?

I see one place where a wide null string could be created: vm-i-loader.c
line 115, within the "load-wide-string" vm instruction, calls
`scm_i_make_wide_string' but never calls `scm_i_try_narrow_string' as is
usually done.  I guess this might be because "load-wide-string" is
normally never used for strings that contain only Latin 1 characters.

Other than that, I don't see how a wide null string could exist outside
of temporaries, although it's hard to entirely rule out the possibility.
All of the code paths try to avoid using wide strings unless a wide
character is actually present.

If a wide null string did exist, I don't see what harm it would cause,
besides causing failure if `scm_i_string_chars' were applied to it.

    Thanks,
      Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 11:28                                       ` Mike Gran
@ 2012-01-10 13:03                                         ` Mark H Weaver
  2012-01-10 13:09                                           ` Mike Gran
  2012-01-10 15:41                                           ` Mark H Weaver
  2012-01-10 14:10                                         ` David Kastrup
  1 sibling, 2 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-10 13:03 UTC (permalink / raw)
  To: Mike Gran; +Cc: Ludovic Courtès, guile-devel

Mike Gran <spk121@yahoo.com> writes:
> Right now it seems that a zero-length shared substring of a wide string
> is wide.  A zero-length substring still shares the stringbuf of the
> original string.
[...]
> What do you think about that?  Do zero-length substrings need to
> still share stringbufs with their parent strings?

I think the answer is: no they don't, and avoiding that might be a
worthwhile optimization, mainly to avoid needlessly holding a reference
to a potentially large stringbuf.

> In any case, a string-copy of a narrow substring of an otherwise wide string
> should be a new narrow string.  This should apply to zero-length
> substrings as well.  This isn't happening, because we're missing
> a scm_i_try_narrow_string in string-copy, which is a bug.

I just fixed this.

> Looks like for zero-length input strings, u32_conv_from_encoding can
> return NULL.

Interesting!  Anyway, we now avoid calling `u32_conv_from_encoding' for
empty strings.

    Thanks,
      Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 13:03                                         ` Mark H Weaver
@ 2012-01-10 13:09                                           ` Mike Gran
  2012-01-10 15:41                                           ` Mark H Weaver
  1 sibling, 0 replies; 117+ messages in thread
From: Mike Gran @ 2012-01-10 13:09 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Ludovic Courtès, guile-devel@gnu.org

> From: Mark H Weaver <mhw@netris.org>
>>  What do you think about that?  Do zero-length substrings need to
>>  still share stringbufs with their parent strings?
> 
> I think the answer is: no they don't, and avoiding that might be a
> worthwhile optimization, mainly to avoid needlessly holding a reference
> to a potentially large stringbuf.

That's a good point.
 
Thanks,
 
Mike



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 11:28                                       ` Mike Gran
  2012-01-10 13:03                                         ` Mark H Weaver
@ 2012-01-10 14:10                                         ` David Kastrup
  1 sibling, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-10 14:10 UTC (permalink / raw)
  To: guile-devel

Mike Gran <spk121@yahoo.com> writes:

>> From: Ludovic Courtès <ludo@gnu.org>
>> A related question: can we have both narrow and wide empty strings?
>
> The intention is that a string is encoded as wide only if it can't
> be encoded as narrow.  So _newly created_ empty strings should only be narrow.
>  
> Right now it seems that a zero-length shared substring of a wide string
> is wide.  A zero-length substring still shares the stringbuf of the
> original string.

That sounds nonsensical to me.  If it does not share any characters
with the original string, there is no point in having a buffer (or a
wide width) at all.

Zero-length substrings should not be abused as pointers carrying any
meaning.  And they should not keep the original string from being
collected.

> What do you think about that?  Do zero-length substrings need to still
> share stringbufs with their parent strings?

I consider it more a bug than a feature if they do.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 13:03                                         ` Mark H Weaver
  2012-01-10 13:09                                           ` Mike Gran
@ 2012-01-10 15:41                                           ` Mark H Weaver
  2012-01-10 15:48                                             ` David Kastrup
  1 sibling, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-10 15:41 UTC (permalink / raw)
  To: Mike Gran; +Cc: Ludovic Courtès, guile-devel

Mike Gran <spk121@yahoo.com> wrote:
>> Right now it seems that a zero-length shared substring of a wide string
>> is wide.  A zero-length substring still shares the stringbuf of the
>> original string.
> [...]
>> What do you think about that?  Do zero-length substrings need to
>> still share stringbufs with their parent strings?

I wrote:
> I think the answer is: no they don't, and avoiding that might be a
> worthwhile optimization, mainly to avoid needlessly holding a reference
> to a potentially large stringbuf.

I went ahead and committed this optimization.  Empty substrings are now
always freshly allocated, and never hold a reference to the original
stringbuf.

I also added another optimization: `scm_i_make_string' now uses a common
null_stringbuf when creating empty strings.  The string object itself is
still freshly allocated.
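
In rough outline, and only as a toy model (the struct and function names
below are hypothetical, not the real stringbuf code), the arrangement is:
one shared empty buffer, but a distinct string object per call, so two
empty strings are never the same object.

```c
#include <stdlib.h>

/* Toy stand-ins; Guile's real string and stringbuf layouts differ. */
struct toy_stringbuf { size_t len; };
struct toy_str_obj   { struct toy_stringbuf *buf; };

/* The single shared buffer used by every empty string. */
static struct toy_stringbuf toy_null_stringbuf = { 0 };

/* Each call allocates a new string object, so pointer identity (our
   stand-in for `eq?') differs between results, while the empty buffer
   is reused.  Returns NULL if the allocation fails. */
struct toy_str_obj *toy_make_empty_string(void)
{
    struct toy_str_obj *s = malloc(sizeof *s);
    if (s != NULL)
        s->buf = &toy_null_stringbuf;
    return s;
}
```

Two calls give two different object pointers that both point at the
same buffer.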

    Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 15:41                                           ` Mark H Weaver
@ 2012-01-10 15:48                                             ` David Kastrup
  2012-01-10 16:15                                               ` Mark H Weaver
  0 siblings, 1 reply; 117+ messages in thread
From: David Kastrup @ 2012-01-10 15:48 UTC (permalink / raw)
  To: guile-devel

Mark H Weaver <mhw@netris.org> writes:

> Mike Gran <spk121@yahoo.com> wrote:
>>> Right now it seems that a zero-length shared substring of a wide string
>>> is wide.  A zero-length substring still shares the stringbuf of the
>>> original string.
>> [...]
>>> What do you think about that?  Do zero-length substrings need to
>>> still share stringbufs with their parent strings?
>
> I wrote:
>> I think the answer is: no they don't, and avoiding that might be a
>> worthwhile optimization, mainly to avoid needlessly holding a reference
>> to a potentially large stringbuf.
>
> I went ahead and committed this optimization.  Empty substrings are now
> always freshly allocated, and never hold a reference to the original
> stringbuf.

Why would they need an allocation at all?  They don't contain
characters.

-- 
David Kastrup




^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 15:48                                             ` David Kastrup
@ 2012-01-10 16:15                                               ` Mark H Weaver
  2012-01-12 22:33                                                 ` Ludovic Courtès
  0 siblings, 1 reply; 117+ messages in thread
From: Mark H Weaver @ 2012-01-10 16:15 UTC (permalink / raw)
  To: David Kastrup; +Cc: guile-devel

David Kastrup <dak@gnu.org> writes:

> Mark H Weaver <mhw@netris.org> writes:
>
>> I went ahead and committed this optimization.  Empty substrings are now
>> always freshly allocated, and never hold a reference to the original
>> stringbuf.
>
> Why would they need an allocation at all?  They don't contain
> characters.

It is an arguable point, but although they don't contain characters,
they can still be compared with other objects using `eq?'.

The R5RS says that `string', `make-string', `substring',
`string-append', `list->string', and `string-copy' return a newly
allocated string, which implies that the returned string is not `eq?' to
any other existing object.

Admittedly, the R5RS also says that `list' returns a newly allocated
list, which obviously cannot be true for the empty list.

Nonetheless, it still seems safer to strictly follow the standard here.
I expect that most implementations produce newly allocated empty strings
(since that's what naturally happens unless you handle empty strings
specially) and some programs might depend on this behavior, especially
since the standard seems to mandate it.

On the other hand, I don't expect that enough empty strings are created
to make the optimization very significant, though perhaps I'm mistaken.
Empty string literals ("") will still be shared, for what it's worth.

What do other people think?

    Mark



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: The empty string and other empty strings
  2012-01-10 12:27                                       ` Mark H Weaver
@ 2012-01-10 16:34                                         ` Ludovic Courtès
  2012-01-10 17:04                                           ` David Kastrup
  0 siblings, 1 reply; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-10 16:34 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guile-devel

Hello Mark,

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>> Anyway, it seems that before, you couldn’t get any encoding error for
>> scm_from_stringn ("", "SOME-ENCODING"), whereas now you can.
>
> Good point.  I just committed a change to avoid this.

Cool, thanks for the instant reply and fix!

And thanks to Mike and you for the remainder of the discussion and
optimizations.

BTW, I just noticed that R5RS uses the phrase “empty strings” (plural)
in the description of ‘eq?’, which means we’re indeed on the right track.

Thanks,
Ludo’.




* Re: The empty string and other empty strings
  2012-01-10 16:34                                         ` Ludovic Courtès
@ 2012-01-10 17:04                                           ` David Kastrup
  0 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-10 17:04 UTC (permalink / raw)
  To: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hello Mark,
>
> Mark H Weaver <mhw@netris.org> skribis:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>>> Anyway, it seems that before, you couldn’t get any encoding error for
>>> scm_from_stringn ("", "SOME-ENCODING"), whereas now you can.
>>
>> Good point.  I just committed a change to avoid this.
>
> Cool, thanks for the instant reply and fix!
>
> And thanks to Mike and you for the remainder of the discussion and
> optimizations.
>
> BTW, I just noticed that R5RS uses the phrase “empty strings” (plural)
> in the description of ‘eq?’, which means we’re indeed on the right track.

R5RS is supposed to be a standard, not a guessing game.  When there is
nothing more definitive than splitting hairs in the grammar of the text,
I would prefer sane semantics over contortions that were probably not
even intended.

"Freshly allocated" for me means that _no_ string operation on
pre-existing objects can make this string different from what it is.
And since there is no way to share the empty contents of an empty string
with other strings, this is true even if every empty string is eq? to
every other one.

-- 
David Kastrup





* Re: The empty string and other empty strings
  2012-01-10 16:15                                               ` Mark H Weaver
@ 2012-01-12 22:33                                                 ` Ludovic Courtès
  2012-01-13  9:27                                                   ` David Kastrup
  0 siblings, 1 reply; 117+ messages in thread
From: Ludovic Courtès @ 2012-01-12 22:33 UTC (permalink / raw)
  To: guile-devel

Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> What do other people think?

As you said, R5RS makes it clear that there can be several (in the sense
of eq?) empty strings, so I think what you did is the right thing.

Thanks!

Ludo’.





* Re: The empty string and other empty strings
  2012-01-12 22:33                                                 ` Ludovic Courtès
@ 2012-01-13  9:27                                                   ` David Kastrup
  2012-01-13 16:39                                                     ` Mark H Weaver
  0 siblings, 1 reply; 117+ messages in thread
From: David Kastrup @ 2012-01-13  9:27 UTC (permalink / raw)
  To: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hi Mark,
>
> Mark H Weaver <mhw@netris.org> skribis:
>
>> What do other people think?
>
> As you said, R5RS makes it clear that there can be several (in the sense
> of eq?) empty strings, so I think what you did is the right thing.

Since it uses the same verbiage with regard to '(), could you please
point out _where_ R5RS states that "freshly allocated" means "not eq?"?
For me it means "does not contain any component in common with
previously allocated material".  The fixed constant '() or (list) (the
neutral element with regard to list concatenation) not containing any
allocated pairs meets that description, and the fixed constant "" or
(string) (the neutral element with regard to string concatenation) not
containing any allocated characters meets that description.

So why treat them differently?  What does it buy us except trouble?

-- 
David Kastrup





* Re: The empty string and other empty strings
  2012-01-13  9:27                                                   ` David Kastrup
@ 2012-01-13 16:39                                                     ` Mark H Weaver
  2012-01-13 17:36                                                       ` David Kastrup
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Mark H Weaver @ 2012-01-13 16:39 UTC (permalink / raw)
  To: David Kastrup; +Cc: guile-devel

David Kastrup <dak@gnu.org> writes:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Hi Mark,
>>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> What do other people think?
>>
>> As you said, R5RS makes it clear that there can be several (in the sense
>> of eq?) empty strings, so I think what you did is the right thing.
>
> Since it uses the same verbiage with regard to '(), could you please
> point out _where_ R5RS states that "freshly allocated" means "not
> eq?"?

Section 3.4 (Storage model) of the R5RS states:

  Whenever this report speaks of storage being allocated for a variable
  or object, what is meant is that an appropriate number of locations
  are chosen from the set of locations that are not in use, and the
  chosen locations are marked to indicate that they are now in use
  before the variable or object is made to denote them.

> For me it means "does not contain any component in common with
> previously allocated material".  The fixed constant '() or (list) (the
> neutral element with regard to list concatenation) not containing any
> allocated pairs meets that description, and the fixed constant "" or
> (string) (the neutral element with regard to string concatenation) not
> containing any allocated characters meets that description.

I think this is a very reasonable interpretation, but this is not in
accordance with the standard.

> So why treat them differently?  What does it buy us except trouble?

I don't see how our current behavior buys us _any_ trouble.  We've
voluntarily opted-out of a (marginal) optimization opportunity, and
that's all.

In your proposed behavior: in _almost_ all cases, `scm_from_stringn' (et
al) would return an object that is not `eq?' to any other existing
object.  However, in a single edge case, you'd have it return something
that _is_ `eq?' to other existing objects.  This is the kind of behavior
that could easily buy us trouble.

To my mind, if the optimization is insignificant (and I suspect that it
is), then it is safer to treat the edge cases the same as the common
case, for the sake of simplifying the semantics.

However, my mind is not set in stone on this.  Does anyone else here
agree with David?  Should we defend the legitimacy of this optimization,
and ask the R7RS working group to include explicit language specifying
that empty strings/vectors need not be freshly allocated?

   Thanks,
     Mark




* Re: The empty string and other empty strings
  2012-01-13 16:39                                                     ` Mark H Weaver
@ 2012-01-13 17:36                                                       ` David Kastrup
  2012-01-16  8:26                                                       ` Marijn
  2012-01-20 21:31                                                       ` Andy Wingo
  2 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-13 17:36 UTC (permalink / raw)
  To: guile-devel

Mark H Weaver <mhw@netris.org> writes:

> David Kastrup <dak@gnu.org> writes:
>
>> ludo@gnu.org (Ludovic Courtès) writes:
>>
>>> Hi Mark,
>>>
>>> Mark H Weaver <mhw@netris.org> skribis:
>>>
>>>> What do other people think?
>>>
>>> As you said, R5RS makes it clear that there can be several (in the sense
>>> of eq?) empty strings, so I think what you did is the right thing.
>>
>> Since it uses the same verbiage with regard to '(), could you please
>> point out _where_ R5RS states that "freshly allocated" means "not
>> eq?"?
>
> Section 3.4 (Storage model) of the R5RS states:
>
>   Whenever this report speaks of storage being allocated for a variable
>   or object, what is meant is that an appropriate number of locations
>   are chosen from the set of locations that are not in use, and the
>   chosen locations are marked to indicate that they are now in use
>   before the variable or object is made to denote them.

And that's perfectly fine for the characters of a string.  However,
(string) has no characters.  Like (list) has no list members.  (list)
does not need _any_ allocation, and neither would (string).  For me it
makes sense to make the fundamental building block of a type a
self-contained value.  For multi-value non-composite types (like
numerical types) that is not necessarily feasible.  For composite types
with a single elementary non-composite value, it makes sense for me to
make this value a basic cell value.

Since empty strings are valid substrings of both mutable and non-mutable
strings, I don't see that it makes sense to apply either property to
them since it is impossible to change any character through them.  So
there are a number of operations which should for consistency's sake be
able to check for this special value efficiently.  Reserving a cell
value for it seems like the straightforward thing to do, and that is
what is done with lists also.

>> For me it means "does not contain any component in common with
>> previously allocated material".  The fixed constant '() or (list)
>> (the neutral element with regard to list concatenation) not
>> containing any allocated pairs meets that description, and the fixed
>> constant "" or (string) (the neutral element with regard to string
>> concatenation) not containing any allocated characters meets that
>> description.
>
> I think this is a very reasonable interpretation, but this is not in
> accordance with the standard.

Are you saying that (eq? (list) (list)) is not in accordance with the
standard since the standard specifies that a freshly allocated list is
to be returned?

>> So why treat them differently?  What does it buy us except trouble?
>
> I don't see how our current behavior buys us _any_ trouble.  We've
> voluntarily opted-out of a (marginal) optimization opportunity, and
> that's all.
>
> In your proposed behavior: in _almost_ all cases, `scm_from_stringn'
> (et al) would return an object that is not `eq?' to any other existing
> object.  However, in a single edge case, you'd have it return
> something that _is_ `eq?' to other existing objects.  This is the kind
> of behavior that could easily buy us trouble.

Why?  You can't change any other value _through_ it.  Do you want to use
(string) as a not-eq-to-anything sentinel like Lisp people do with (list
nil) sometimes?  It is known that (list) will not do for that purpose
(in spite of the standard saying that list will return a freshly
allocated list), so do you really think people will expect (string) to
do?
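
The sentinel idiom mentioned above can be sketched as follows.  This is
only an illustration: `missing', `lookup', `found?' and `table' are
hypothetical names, and the point is merely that a fresh non-empty list
works as a marker while (list) does not:

```scheme
;; A fresh pair is eq? only to itself, so it can serve as a marker
;; that no stored value will ever be mistaken for.
(define missing (list 'missing))

;; Return the value stored under KEY, or the marker if there is none.
(define (lookup key alist)
  (let ((entry (assq key alist)))
    (if entry (cdr entry) missing)))

(define (found? result)
  (not (eq? result missing)))

;; Stored values may themselves be empty.
(define table (list (cons 'a '()) (cons 'b "")))

(display (found? (lookup 'a table))) (newline)   ; prints #t
(display (found? (lookup 'b table))) (newline)   ; prints #t
(display (found? (lookup 'c table))) (newline)   ; prints #f

;; (list) could not serve as the marker: it is eq? to any stored '().
(display (eq? (list) (lookup 'a table))) (newline)   ; prints #t
```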

> To my mind, if the optimization is insignificant (and I suspect that
> it is), then it is safer to treat the edge cases the same as the
> common case, for the sake of simplifying the semantics.

You'll find yourself checking for "" more often in connection with
strings than for 0 in connection with numbers because "" is special in
that it contains no characters or other members.

So for me "" is a prime candidate for a single-cell constant.  We can
live with other objects like 0 not being eq to equal values, so we
certainly can with this one.

> However, my mind is not set in stone on this.  Does anyone else here
> agree with David?  Should we defend the legitimacy of this
> optimization, and ask the R7RS working group to include explicit
> language specifying that empty strings/vectors need not be freshly
> allocated?

They don't specify that empty lists need not be freshly allocated,
either, so it would be strange to make a difference here.

I think it makes more sense to define "freshly allocated" instead, as
"no pre-existing object can be modified through any operation on it".
That means that any single-cell constant is by definition "freshly
allocated".  And indeed, its _cell_ is freshly allocated even though
that cell _value_ may be eq? to that of other cells.
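
A small sketch of that reading of "freshly allocated", using only
standard string procedures (the variable names are made up for
illustration).  Mutating a copy or a substring leaves the pre-existing
string untouched, and an empty string offers no index through which
anything could be mutated at all:

```scheme
(define original (string-copy "abc"))
(define copy (string-copy original))
(define part (substring original 0 2))

;; Mutate the derived strings only.
(string-set! copy 0 #\X)
(string-set! part 0 #\Y)

(display original) (newline)   ; prints abc
(display copy) (newline)       ; prints Xbc
(display part) (newline)       ; prints Yb

;; An empty substring has length 0, so there is no valid index to pass
;; to string-set!, whichever string it was taken from.
(define empty (substring original 1 1))
(display (string-length empty)) (newline)   ; prints 0
```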

-- 
David Kastrup





* Re: The empty string and other empty strings
  2012-01-13 16:39                                                     ` Mark H Weaver
  2012-01-13 17:36                                                       ` David Kastrup
@ 2012-01-16  8:26                                                       ` Marijn
  2012-01-16  8:47                                                         ` David Kastrup
  2012-01-20 21:31                                                       ` Andy Wingo
  2 siblings, 1 reply; 117+ messages in thread
From: Marijn @ 2012-01-16  8:26 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: David Kastrup, guile-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 13-01-12 17:39, Mark H Weaver wrote:
> David Kastrup <dak@gnu.org> writes:
> 
> However, my mind is not set in stone on this.  Does anyone else
> here agree with David?  Should we defend the legitimacy of this
> optimization, and ask the R7RS working group to include explicit
> language specifying that empty strings/vectors need not be freshly
> allocated?

It seems to me that it can't hurt to ask for clarification of this
issue on scheme-reports. Personally I think the intent of the standard
is to say that you cannot expect (string) to be un-eq? nor eq? to
(string), but let's get a broader perspective.

Marijn
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8T3yoACgkQp/VmCx0OL2wG4QCeJkTP7qhm/ll6g/szLrz21uUB
0PwAoKLWlLOIIgcEC8EJKnR+6fYaV0he
=8SBJ
-----END PGP SIGNATURE-----




* Re: The empty string and other empty strings
  2012-01-16  8:26                                                       ` Marijn
@ 2012-01-16  8:47                                                         ` David Kastrup
  0 siblings, 0 replies; 117+ messages in thread
From: David Kastrup @ 2012-01-16  8:47 UTC (permalink / raw)
  To: guile-devel

Marijn <hkBst@gentoo.org> writes:

> On 13-01-12 17:39, Mark H Weaver wrote:
>> David Kastrup <dak@gnu.org> writes:
>> 
>> However, my mind is not set in stone on this.  Does anyone else
>> here agree with David?  Should we defend the legitimacy of this
>> optimization, and ask the R7RS working group to include explicit
>> language specifying that empty strings/vectors need not be freshly
>> allocated?
>
> It seems to me that it can't hurt to ask for clarification of this
> issue on scheme-reports. Personally I think the intent of the standard
> is to say that you cannot expect (string) to be un-eq? nor eq? to
> (string), but let's get a broader perspective.

It might be worth pointing out the similarity to (list) and (list) and
'().  I think that eq-ness of memberless structures of type list and
string (which also could allow mutable and immutable variants to be
identical) is worth giving separate mention as it is a special case that
has semantics with regard to eq-ness and mutability and "freshly
allocated" that are nowhere near as obvious as with content-carrying
variants.

Even if the statement boils down to "can be implemented as", it would
avoid choosing inferior implementation options because of trying to
split hairs on what amounts to a bald head.

-- 
David Kastrup





* Re: The empty string and other empty strings
  2012-01-13 16:39                                                     ` Mark H Weaver
  2012-01-13 17:36                                                       ` David Kastrup
  2012-01-16  8:26                                                       ` Marijn
@ 2012-01-20 21:31                                                       ` Andy Wingo
  2 siblings, 0 replies; 117+ messages in thread
From: Andy Wingo @ 2012-01-20 21:31 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: David Kastrup, guile-devel

On Fri 13 Jan 2012 17:39, Mark H Weaver <mhw@netris.org> writes:

> Should we defend the legitimacy of this optimization, and ask the R7RS
> working group to include explicit language specifying that empty
> strings/vectors need not be freshly allocated?

It's a worthwhile question IMO.  I'll mail scheme-reports.

Andy
-- 
http://wingolog.org/




end of thread (last message 2012-01-20 21:31 UTC)

Thread overview: 117+ messages
2012-01-03  4:08 What's wrong with this? Bruce Korb
2012-01-03 15:03 ` Mike Gran
2012-01-03 16:26   ` Guile: " Bruce Korb
2012-01-03 16:30     ` Mike Gran
2012-01-03 22:24     ` Ludovic Courtès
2012-01-03 23:15       ` Bruce Korb
2012-01-03 23:33         ` Ludovic Courtès
2012-01-04  0:55           ` Bruce Korb
2012-01-04  3:12             ` Noah Lavine
2012-01-04 17:37               ` bytevector -- was: " Bruce Korb
2012-01-04 21:17             ` Ludovic Courtès
2012-01-04 22:36               ` Bruce Korb
2012-01-05  0:01                 ` Ludovic Courtès
2012-01-05 18:36                   ` non-reproduction of initial issue -- was: " Bruce Korb
2012-01-05 18:50                     ` Mark H Weaver
2012-01-04 12:19         ` Ian Price
2012-01-04 17:16           ` Bruce Korb
2012-01-04 17:21             ` Andy Wingo
2012-01-04 17:39             ` David Kastrup
2012-01-04 21:52             ` Ian Price
2012-01-04 22:18               ` Bruce Korb
2012-01-04 23:22                 ` Mike Gran
2012-01-04 23:59                 ` Mark H Weaver
2012-01-05 17:22                   ` Bruce Korb
2012-01-05 18:13                     ` Mark H Weaver
2012-01-05 19:02                       ` Mark H Weaver
2012-01-05 20:24                     ` David Kastrup
2012-01-05 22:42                     ` Mark H Weaver
2012-01-06  1:02                       ` Mike Gran
2012-01-06  1:41                         ` Mark H Weaver
2012-01-06  2:38                           ` Noah Lavine
2012-01-06 13:37                           ` Mike Gran
2012-01-06 14:11                             ` David Kastrup
2012-01-06 18:13                             ` Mark H Weaver
2012-01-06 19:06                               ` Bruce Korb
2012-01-06 19:19                                 ` David Kastrup
2012-01-06 20:03                                   ` Mark H Weaver
2012-01-07 16:13                                 ` Mark H Weaver
2012-01-07 17:35                                   ` mutable interfaces - was: " Bruce Korb
2012-01-07 17:47                                     ` David Kastrup
2012-01-07 18:30                                     ` Mark H Weaver
2012-01-07 18:55                                       ` Mark H Weaver
2012-01-06 22:23                               ` Guile BUG: " Bruce Korb
2012-01-06 23:11                                 ` Mark H Weaver
2012-01-06 23:35                                   ` Andy Wingo
2012-01-06 23:41                                   ` Bruce Korb
2012-01-07 15:00                                     ` Mark H Weaver
2012-01-07 15:27                                       ` Bruce Korb
2012-01-07 16:38                                         ` Mark H Weaver
2012-01-07 17:39                                           ` Bruce Korb
2012-01-09 15:41                                             ` Mark H Weaver
2012-01-09 17:27                                               ` Bruce Korb
2012-01-09 18:32                                                 ` Andy Wingo
2012-01-09 19:48                                                   ` Bruce Korb
2012-01-07 15:47                                       ` David Kastrup
2012-01-07 17:07                                         ` Mark H Weaver
2012-01-07 14:35                                   ` Mark H Weaver
2012-01-07 15:20                                     ` Mike Gran
2012-01-07 22:25                                     ` Ludovic Courtès
2012-01-10  9:13                                     ` The empty string and other empty strings Ludovic Courtès
2012-01-10 11:28                                       ` Mike Gran
2012-01-10 13:03                                         ` Mark H Weaver
2012-01-10 13:09                                           ` Mike Gran
2012-01-10 15:41                                           ` Mark H Weaver
2012-01-10 15:48                                             ` David Kastrup
2012-01-10 16:15                                               ` Mark H Weaver
2012-01-12 22:33                                                 ` Ludovic Courtès
2012-01-13  9:27                                                   ` David Kastrup
2012-01-13 16:39                                                     ` Mark H Weaver
2012-01-13 17:36                                                       ` David Kastrup
2012-01-16  8:26                                                       ` Marijn
2012-01-16  8:47                                                         ` David Kastrup
2012-01-20 21:31                                                       ` Andy Wingo
2012-01-10 14:10                                         ` David Kastrup
2012-01-10 12:21                                       ` Mike Gran
2012-01-10 12:27                                       ` Mark H Weaver
2012-01-10 16:34                                         ` Ludovic Courtès
2012-01-10 17:04                                           ` David Kastrup
2012-01-06 23:28                                 ` Guile BUG: What's wrong with this? Bruce Korb
2012-01-07 20:57                           ` Guile: " Ian Price
2012-01-08  5:05                             ` Mark H Weaver
2012-01-06  9:23                         ` David Kastrup
2012-01-05  7:22                 ` David Kastrup
2012-01-04 22:46             ` Ludovic Courtès
2012-01-04  3:04       ` Mike Gran
2012-01-04  9:35         ` nalaginrut
2012-01-04  9:41         ` David Kastrup
2012-01-04 21:07         ` Ludovic Courtès
2012-01-04 10:03     ` Mark H Weaver
2012-01-04 14:29       ` Mike Gran
2012-01-04 14:45         ` David Kastrup
2012-01-04 16:47         ` Andy Wingo
2012-01-04 17:14           ` David Kastrup
2012-01-04 17:32             ` Andy Wingo
2012-01-04 17:49               ` David Kastrup
2012-01-04 18:09                 ` Andy Wingo
2012-01-04 17:30           ` Bruce Korb
2012-01-04 17:44             ` David Kastrup
2012-01-04 18:26             ` Ian Price
2012-01-04 18:48               ` Mark H Weaver
2012-01-04 19:29               ` Bruce Korb
2012-01-04 20:20                 ` David Kastrup
2012-01-04 23:19                 ` Mark H Weaver
2012-01-04 23:28                   ` Bruce Korb
2012-01-07 15:43                   ` Fixed string corruption bugs (was Guile: What's wrong with this?) Mark H Weaver
2012-01-07 16:19                     ` Fixed string corruption bugs Andy Wingo
2012-01-04 18:31           ` Guile: What's wrong with this? Mark H Weaver
2012-01-04 18:43             ` Andy Wingo
2012-01-04 19:29               ` Mark H Weaver
2012-01-04 19:43                 ` Andy Wingo
2012-01-04 20:08                   ` Bruce Korb
2012-01-04 20:14                     ` David Kastrup
2012-01-04 20:56                     ` Andy Wingo
2012-01-04 21:30                       ` Bruce Korb
2012-01-04 17:19         ` Mark H Weaver
2012-01-05  4:24           ` Mark H Weaver
2012-01-04 22:37       ` Ludovic Courtès
