* [bug #31680] R6RS string literal intraline whitespace removal
From: Göran Weinholt @ 2010-11-17 12:15 UTC
To: Göran Weinholt, bug-guile
URL:
<http://savannah.gnu.org/bugs/?31680>
Summary: R6RS string literal intraline whitespace removal
Project: Guile
Submitted by: weinholt
Submitted on: Wed Nov 17 13:15:37 2010
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
In R6RS subsection 4.2.7 there is this escape sequence:
  \<intraline whitespace><line ending>
     <intraline whitespace>       : nothing
This escape sequence isn't working. Both the examples below should return
"foobar":
GNU Guile 1.9.13.58-b98d5a
Copyright (C) 1995-2010 Free Software Foundation, Inc.
Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.
Enter `,help' for help.
scheme@(guile-user)> (import (rnrs))
scheme@(guile-user)> (read (open-string-input-port "#!r6rs \"foo\\\n bar\""))
$1 = "foo bar"
scheme@(guile-user)> (read (open-string-input-port "#!r6rs \"foo\\ \n bar\""))
ERROR: In procedure scm_lreadr:
ERROR: #<unknown port>:1:10: illegal character in escape sequence: #\space
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?31680>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/
From: Mike Gran @ 2010-11-18 15:05 UTC
To: Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #1, bug #31680 (project guile):
Yeah it is true that this isn't implemented and should be.
Ludo, Andy, I did in fact write an implementation of this in the r6rs-strings
tree, but I never committed it to master.
The commit was cff6adf899dbc0336c7c017d52504f8138c89b3d, and the part of the
commit specific to this bug is all in read.c, in scm_read_string and
SCM_READ_SPACE_LINE_SPACE.
I didn't commit it because, at the time, I took exception to the R6RS spec
itself.
It states that \<intraline-whitespace><line-ending><intraline-whitespace>
expands to nothing. And then it calls out that whitespace and line-ending are
any characters in the Unicode whitespace and line ending families!
I guarantee that no one will ever put something like <U+005C "backslash">
<U+205F "medium mathematical space"> <U+2028 "line separator"> <U+2006 "6 per
em space"> in the middle of a source code file. It will never, ever happen.
But writing the code to allow for all those cases led to an ugly patch.
_______________________________________________________
From: Andy Wingo @ 2010-11-18 18:56 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #2, bug #31680 (project guile):
Interesting issue. I have no idea what that particular escape sequence is
for. Göran, do you use it? What for?
_______________________________________________________
From: Göran Weinholt @ 2010-11-18 19:45 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #3, bug #31680 (project guile):
I've used it for multiline strings, like so:
(define modp-group1-p
  (string->number
   "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1\
    29024E088A67CC74020BBEA63B139B22514A08798E3404DD\
    EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245\
    E485B576625E7EC6F44C42E9A63A3620FFFFFFFFFFFFFFFF"
   16))
or like so:
"My name is Ozymandias, king of kings:\n\
 Look on my works, ye Mighty, and despair!"
I think any implementation of multiline strings should be prepared to handle
line endings from other operating systems. But surely you should not need to
check for U+205F and so on explicitly; isn't it enough to see that they belong
to the right char-general-category?
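As a rough illustration of that category-based check, here is a minimal C sketch (C being the language of Guile's read.c). The function names are hypothetical and are not taken from Guile. The hard-coded code points stand in for a real general-category lookup, and since Zs membership has changed between Unicode versions the table should be read as illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not Guile's code: true if cp is R6RS
   <intraline whitespace>, i.e. a character tabulation or a character
   whose general category is Zs.  The Zs code points are hard-coded
   here as a stand-in for a real category lookup. */
bool is_intraline_whitespace(uint32_t cp)
{
    return cp == 0x0009                      /* character tabulation */
        || cp == 0x0020                      /* space */
        || cp == 0x00A0                      /* no-break space */
        || cp == 0x1680                      /* ogham space mark */
        || (cp >= 0x2000 && cp <= 0x200A)    /* en quad .. hair space */
        || cp == 0x202F                      /* narrow no-break space */
        || cp == 0x205F                      /* medium mathematical space */
        || cp == 0x3000;                     /* ideographic space */
}

/* Hypothetical helper: true if cp can start an R6RS <line ending>.
   The two-character endings (CR LF and CR NEL) begin with CR, so a
   caller would have to look one character ahead after seeing CR. */
bool is_line_ending_char(uint32_t cp)
{
    return cp == 0x000A      /* linefeed */
        || cp == 0x000D      /* carriage return */
        || cp == 0x0085      /* next line */
        || cp == 0x2028;     /* line separator */
}
```

With predicates like these, the reader only needs one test per character instead of one branch per code point.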
_______________________________________________________
From: Andy Wingo @ 2010-11-18 20:24 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #4, bug #31680 (project guile):
On Thu 18 Nov 2010 20:45, Göran Weinholt <INVALID.NOREPLY@gnu.org> writes:
> (define modp-group1-p
>   (string->number
>    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1\
>     29024E088A67CC74020BBEA63B139B22514A08798E3404DD\
>     EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245\
>     E485B576625E7EC6F44C42E9A63A3620FFFFFFFFFFFFFFFF"
>    16))
Sounds legit.
> surely you should not need to check for U+205F and so on explicitly,
> isn't it enough to see that they belong to the right
> char-general-category?
I think the deal is that all (?) of the other escapes can be dealt with
via the equivalent of a `case' expression. This one requires a property
lookup. It's not as nice.
Also note the thread at
http://lists.r6rs.org/pipermail/r6rs-discuss/2010-November/006146.html.
Is there a use case for allowing intraline spaces before the newline?
Disallowing that would eliminate a state in the parser.
Andy
_______________________________________________________
From: Mike Gran @ 2010-11-18 21:51 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #5, bug #31680 (project guile):
> I think the deal is that all (?) of the other escapes can be
> dealt with via the equivalent of a `case' expression. This
> one requires a property lookup. It's not as nice.
> Also note the thread at
> http://lists.r6rs.org/pipermail/r6rs-discuss/2010-November/006146.html.
> Is there a use case for allowing intraline spaces before
> the newline? Disallowing that would eliminate a state
> in the parser.
I don't think there is a valid use case for allowing that
initial intraline space. If (according to the discussion
on the R6RS list) the intention was to work around broken
editors, those bugs should be fixed in the editors, and
not become workarounds enshrined in a language spec.
_______________________________________________________
From: Andy Wingo @ 2010-11-19 15:24 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #6, bug #31680 (project guile):
OK, why don't we implement the escape sequence \<line ending><intraline
whitespace>* -> nothing, then? Does that sound right to you, Mike?
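A minimal C sketch of that reduced rule follows, assuming byte strings and recognising only LF, CR LF, space and tab. The function name strip_eol_escapes is hypothetical, and this is not the code in read.c. Every other character, including any other escape, is copied through unchanged.

```c
#include <stddef.h>

/* Hypothetical sketch, not Guile's read.c: copy the body of a string
   literal from src to dst, dropping every occurrence of
   backslash + line ending + following run of intraline whitespace.
   Only LF and CR LF count as line endings here, and only space and
   tab count as intraline whitespace.  dst must have room for
   strlen(src) + 1 bytes.  Returns the length of the result. */
size_t strip_eol_escapes(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        if (src[0] == '\\'
            && (src[1] == '\n' || (src[1] == '\r' && src[2] == '\n'))) {
            /* Skip the backslash and the line ending ... */
            src += (src[1] == '\r') ? 3 : 2;
            /* ... then the whitespace that indents the next line. */
            while (*src == ' ' || *src == '\t')
                src++;
        } else if (src[0] == '\\' && src[1] != '\0') {
            /* Some other escape: copy both characters verbatim so an
               escaped backslash is never taken for a line continuation. */
            dst[n++] = *src++;
            dst[n++] = *src++;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Given the input `foo\` followed by a newline and an indented `bar`, this produces `foobar`. Whitespace between the backslash and the line ending is deliberately not accepted, which is what removes the extra parser state.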
_______________________________________________________
From: Mike Gran @ 2010-11-19 17:57 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #7, bug #31680 (project guile):
> OK, why don't we implement the escape sequence,
> \<line ending><intraline whitespace>* -> nothing,
> then. Does that sound right to you, Mike?
Sounds great. I'll commit something this weekend if you
don't get to it first.
_______________________________________________________
From: Andy Wingo @ 2011-01-21 7:35 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Follow-up Comment #8, bug #31680 (project guile):
I looked at implementing this, and it seems that Guile already handles the
sequence \<LF> as being nothing; e.g.
"foo\
 bar"
=> "foo bar"
This appears to have been the case since time immemorial, or 1996 at least.
So we will probably have to add a reader option for this, unfortunately...
_______________________________________________________
From: Andy Wingo @ 2011-01-21 8:22 UTC
To: Andy Wingo, Göran Weinholt, Mike Gran, bug-guile
Update of bug #31680 (project guile):
Status: None => Fixed
Open/Closed: Open => Closed
_______________________________________________________
Follow-up Comment #9:
Fixed in git. You have to enable a reader option:
(read-enable 'hungry-eol-escapes)