* survey: string external representation
@ 2012-01-26 8:00 Thien-Thi Nguyen
2012-01-26 8:38 ` Andy Wingo
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2012-01-26 8:00 UTC
To: guile-user
[-- Attachment #1: Type: text/plain, Size: 165 bytes --]
I am looking to make ‘(database postgres-qcons) sql-quote’
more robust in the face of diverse Guile behaviors.
Here is string-xrep.scm in its entirety:
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: string-xrep.scm --]
[-- Type: text/x-scheme, Size: 250 bytes --]
(display (version))
(newline)
(for-each (lambda (n)
            (simple-format #t "~S\t~S\t~S~%"
                           n
                           (integer->char n)
                           (string (integer->char n))))
          (iota 256))
[-- Attachment #3: Type: text/plain, Size: 333 bytes --]
Attached below are the outputs of runs w/ Guile 1.4.1.124
and 1.8.7, respectively, made by the command:
    guile -s string-xrep.scm > string-xrep-VERSION.out
in a ‘LANG=it_IT.UTF-8’ environment. Could people who run
other Guile versions and/or other environments please run
the program and post the output, too? Thanks!
[-- Attachment #4: string-xrep-1.4.1.124.out --]
[-- Type: application/octet-stream, Size: 3290 bytes --]
[-- Attachment #5: string-xrep-1.8.7.out --]
[-- Type: application/octet-stream, Size: 3444 bytes --]
* Re: survey: string external representation
2012-01-26 8:00 survey: string external representation Thien-Thi Nguyen
@ 2012-01-26 8:38 ` Andy Wingo
2012-01-26 14:11 ` Mike Gran
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Andy Wingo @ 2012-01-26 8:38 UTC
To: Thien-Thi Nguyen; +Cc: guile-user
[-- Attachment #1: Type: text/plain, Size: 40 bytes --]
In en_US.UTF-8, guile from stable-2.0:
[-- Attachment #2: string-xrep-v2.0.3-187-g63fa6b1.out --]
[-- Type: text/plain, Size: 3764 bytes --]
2.0.3.164-7d02e2
0 #\nul "\x00"
1 #\soh "\x01"
2 #\stx "\x02"
3 #\etx "\x03"
4 #\eot "\x04"
5 #\enq "\x05"
6 #\ack "\x06"
7 #\alarm "\a"
8 #\backspace "\b"
9 #\tab "\t"
10 #\newline "\n"
11 #\vtab "\v"
12 #\page "\f"
13 #\return "\r"
14 #\so "\x0e"
15 #\si "\x0f"
16 #\dle "\x10"
17 #\dc1 "\x11"
18 #\dc2 "\x12"
19 #\dc3 "\x13"
20 #\dc4 "\x14"
21 #\nak "\x15"
22 #\syn "\x16"
23 #\etb "\x17"
24 #\can "\x18"
25 #\em "\x19"
26 #\sub "\x1a"
27 #\esc "\x1b"
28 #\fs "\x1c"
29 #\gs "\x1d"
30 #\rs "\x1e"
31 #\us "\x1f"
32 #\space " "
33 #\! "!"
34 #\" "\""
35 #\# "#"
36 #\$ "$"
37 #\% "%"
38 #\& "&"
39 #\' "'"
40 #\( "("
41 #\) ")"
42 #\* "*"
43 #\+ "+"
44 #\, ","
45 #\- "-"
46 #\. "."
47 #\/ "/"
48 #\0 "0"
49 #\1 "1"
50 #\2 "2"
51 #\3 "3"
52 #\4 "4"
53 #\5 "5"
54 #\6 "6"
55 #\7 "7"
56 #\8 "8"
57 #\9 "9"
58 #\: ":"
59 #\; ";"
60 #\< "<"
61 #\= "="
62 #\> ">"
63 #\? "?"
64 #\@ "@"
65 #\A "A"
66 #\B "B"
67 #\C "C"
68 #\D "D"
69 #\E "E"
70 #\F "F"
71 #\G "G"
72 #\H "H"
73 #\I "I"
74 #\J "J"
75 #\K "K"
76 #\L "L"
77 #\M "M"
78 #\N "N"
79 #\O "O"
80 #\P "P"
81 #\Q "Q"
82 #\R "R"
83 #\S "S"
84 #\T "T"
85 #\U "U"
86 #\V "V"
87 #\W "W"
88 #\X "X"
89 #\Y "Y"
90 #\Z "Z"
91 #\[ "["
92 #\\ "\\"
93 #\] "]"
94 #\^ "^"
95 #\_ "_"
96 #\` "`"
97 #\a "a"
98 #\b "b"
99 #\c "c"
100 #\d "d"
101 #\e "e"
102 #\f "f"
103 #\g "g"
104 #\h "h"
105 #\i "i"
106 #\j "j"
107 #\k "k"
108 #\l "l"
109 #\m "m"
110 #\n "n"
111 #\o "o"
112 #\p "p"
113 #\q "q"
114 #\r "r"
115 #\s "s"
116 #\t "t"
117 #\u "u"
118 #\v "v"
119 #\w "w"
120 #\x "x"
121 #\y "y"
122 #\z "z"
123 #\{ "{"
124 #\| "|"
125 #\} "}"
126 #\~ "~"
127 #\delete "\x7f"
128 #\200 "\x80"
129 #\201 "\x81"
130 #\202 "\x82"
131 #\203 "\x83"
132 #\204 "\x84"
133 #\205 "\x85"
134 #\206 "\x86"
135 #\207 "\x87"
136 #\210 "\x88"
137 #\211 "\x89"
138 #\212 "\x8a"
139 #\213 "\x8b"
140 #\214 "\x8c"
141 #\215 "\x8d"
142 #\216 "\x8e"
143 #\217 "\x8f"
144 #\220 "\x90"
145 #\221 "\x91"
146 #\222 "\x92"
147 #\223 "\x93"
148 #\224 "\x94"
149 #\225 "\x95"
150 #\226 "\x96"
151 #\227 "\x97"
152 #\230 "\x98"
153 #\231 "\x99"
154 #\232 "\x9a"
155 #\233 "\x9b"
156 #\234 "\x9c"
157 #\235 "\x9d"
158 #\236 "\x9e"
159 #\237 "\x9f"
160 #\240 "\xa0"
161 #\¡ "¡"
162 #\¢ "¢"
163 #\£ "£"
164 #\¤ "¤"
165 #\¥ "¥"
166 #\¦ "¦"
167 #\§ "§"
168 #\¨ "¨"
169 #\© "©"
170 #\ª "ª"
171 #\« "«"
172 #\¬ "¬"
173 #\255 "\xad"
174 #\® "®"
175 #\¯ "¯"
176 #\° "°"
177 #\± "±"
178 #\² "²"
179 #\³ "³"
180 #\´ "´"
181 #\µ "µ"
182 #\¶ "¶"
183 #\· "·"
184 #\¸ "¸"
185 #\¹ "¹"
186 #\º "º"
187 #\» "»"
188 #\¼ "¼"
189 #\½ "½"
190 #\¾ "¾"
191 #\¿ "¿"
192 #\À "À"
193 #\Á "Á"
194 #\Â "Â"
195 #\Ã "Ã"
196 #\Ä "Ä"
197 #\Å "Å"
198 #\Æ "Æ"
199 #\Ç "Ç"
200 #\È "È"
201 #\É "É"
202 #\Ê "Ê"
203 #\Ë "Ë"
204 #\Ì "Ì"
205 #\Í "Í"
206 #\Î "Î"
207 #\Ï "Ï"
208 #\Ð "Ð"
209 #\Ñ "Ñ"
210 #\Ò "Ò"
211 #\Ó "Ó"
212 #\Ô "Ô"
213 #\Õ "Õ"
214 #\Ö "Ö"
215 #\× "×"
216 #\Ø "Ø"
217 #\Ù "Ù"
218 #\Ú "Ú"
219 #\Û "Û"
220 #\Ü "Ü"
221 #\Ý "Ý"
222 #\Þ "Þ"
223 #\ß "ß"
224 #\à "à"
225 #\á "á"
226 #\â "â"
227 #\ã "ã"
228 #\ä "ä"
229 #\å "å"
230 #\æ "æ"
231 #\ç "ç"
232 #\è "è"
233 #\é "é"
234 #\ê "ê"
235 #\ë "ë"
236 #\ì "ì"
237 #\í "í"
238 #\î "î"
239 #\ï "ï"
240 #\ð "ð"
241 #\ñ "ñ"
242 #\ò "ò"
243 #\ó "ó"
244 #\ô "ô"
245 #\õ "õ"
246 #\ö "ö"
247 #\÷ "÷"
248 #\ø "ø"
249 #\ù "ù"
250 #\ú "ú"
251 #\û "û"
252 #\ü "ü"
253 #\ý "ý"
254 #\þ "þ"
255 #\ÿ "ÿ"
[-- Attachment #3: Type: text/plain, Size: 26 bytes --]
--
http://wingolog.org/
* Re: survey: string external representation
2012-01-26 8:00 survey: string external representation Thien-Thi Nguyen
2012-01-26 8:38 ` Andy Wingo
@ 2012-01-26 14:11 ` Mike Gran
2012-01-27 10:27 ` Thien-Thi Nguyen
2012-01-27 15:32 ` David Pirotte
3 siblings, 0 replies; 8+ messages in thread
From: Mike Gran @ 2012-01-26 14:11 UTC
To: Thien-Thi Nguyen, guile-user@gnu.org
[-- Attachment #1: Type: text/plain, Size: 294 bytes --]
Hi Thi-
On the box I'm at right now, the locale is C.
Guile 1.8 gives me the result in guile-1.8.out.
Guile 2.0 gives me the result in guile-2.0.out.
If I add a (setlocale LC_ALL "") to the top of the script,
guile-2.0 gives me guile-2.0.out2.
guile-1.8 is unchanged.
-Mike
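One plausible reading of the guile-2.0 difference is that ‘setlocale’ changes the character encoding attached to the current output port, and ‘write’ escapes whatever that encoding cannot represent. The following is a minimal sketch for checking this, assuming Guile 2.0 or later (where ‘port-encoding’ exists); the ‘report’ helper is a hypothetical name, not part of the survey script.

```scheme
;; Sketch only: show the output port's encoding before and after
;; installing the environment's locale.  `report' is a made-up helper.
(define (report label)
  (simple-format #t "~A: ~S~%" label
                 (port-encoding (current-output-port))))

(report "before setlocale")
(setlocale LC_ALL "")
(report "after setlocale")
```

If the two lines differ, the same string can legitimately get a different external representation from one run to the next, which would be consistent with guile-2.0.out vs guile-2.0.out2.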
[-- Attachment #2: guile-1.8.out --]
[-- Type: application/octet-stream, Size: 3444 bytes --]
[-- Attachment #3: guile-2.0.out --]
[-- Type: application/octet-stream, Size: 3323 bytes --]
2.0.3.93-84843-dirty
0 #\nul "\x00"
1 #\soh "\x01"
2 #\stx "\x02"
3 #\etx "\x03"
4 #\eot "\x04"
5 #\enq "\x05"
6 #\ack "\x06"
7 #\alarm "\a"
8 #\backspace "\b"
9 #\tab "\t"
10 #\newline "\n"
11 #\vtab "\v"
12 #\page "\f"
13 #\return "\r"
14 #\so "\x0e"
15 #\si "\x0f"
16 #\dle "\x10"
17 #\dc1 "\x11"
18 #\dc2 "\x12"
19 #\dc3 "\x13"
20 #\dc4 "\x14"
21 #\nak "\x15"
22 #\syn "\x16"
23 #\etb "\x17"
24 #\can "\x18"
25 #\em "\x19"
26 #\sub "\x1a"
27 #\esc "\x1b"
28 #\fs "\x1c"
29 #\gs "\x1d"
30 #\rs "\x1e"
31 #\us "\x1f"
32 #\space " "
33 #\! "!"
34 #\" "\""
35 #\# "#"
36 #\$ "$"
37 #\% "%"
38 #\& "&"
39 #\' "'"
40 #\( "("
41 #\) ")"
42 #\* "*"
43 #\+ "+"
44 #\, ","
45 #\- "-"
46 #\. "."
47 #\/ "/"
48 #\0 "0"
49 #\1 "1"
50 #\2 "2"
51 #\3 "3"
52 #\4 "4"
53 #\5 "5"
54 #\6 "6"
55 #\7 "7"
56 #\8 "8"
57 #\9 "9"
58 #\: ":"
59 #\; ";"
60 #\< "<"
61 #\= "="
62 #\> ">"
63 #\? "?"
64 #\@ "@"
65 #\A "A"
66 #\B "B"
67 #\C "C"
68 #\D "D"
69 #\E "E"
70 #\F "F"
71 #\G "G"
72 #\H "H"
73 #\I "I"
74 #\J "J"
75 #\K "K"
76 #\L "L"
77 #\M "M"
78 #\N "N"
79 #\O "O"
80 #\P "P"
81 #\Q "Q"
82 #\R "R"
83 #\S "S"
84 #\T "T"
85 #\U "U"
86 #\V "V"
87 #\W "W"
88 #\X "X"
89 #\Y "Y"
90 #\Z "Z"
91 #\[ "["
92 #\\ "\\"
93 #\] "]"
94 #\^ "^"
95 #\_ "_"
96 #\` "`"
97 #\a "a"
98 #\b "b"
99 #\c "c"
100 #\d "d"
101 #\e "e"
102 #\f "f"
103 #\g "g"
104 #\h "h"
105 #\i "i"
106 #\j "j"
107 #\k "k"
108 #\l "l"
109 #\m "m"
110 #\n "n"
111 #\o "o"
112 #\p "p"
113 #\q "q"
114 #\r "r"
115 #\s "s"
116 #\t "t"
117 #\u "u"
118 #\v "v"
119 #\w "w"
120 #\x "x"
121 #\y "y"
122 #\z "z"
123 #\{ "{"
124 #\| "|"
125 #\} "}"
126 #\~ "~"
127 #\delete "\x7f"
128 #\200 "\x80"
129 #\201 "\x81"
130 #\202 "\x82"
131 #\203 "\x83"
132 #\204 "\x84"
133 #\205 "\x85"
134 #\206 "\x86"
135 #\207 "\x87"
136 #\210 "\x88"
137 #\211 "\x89"
138 #\212 "\x8a"
139 #\213 "\x8b"
140 #\214 "\x8c"
141 #\215 "\x8d"
142 #\216 "\x8e"
143 #\217 "\x8f"
144 #\220 "\x90"
145 #\221 "\x91"
146 #\222 "\x92"
147 #\223 "\x93"
148 #\224 "\x94"
149 #\225 "\x95"
150 #\226 "\x96"
151 #\227 "\x97"
152 #\230 "\x98"
153 #\231 "\x99"
154 #\232 "\x9a"
155 #\233 "\x9b"
156 #\234 "\x9c"
157 #\235 "\x9d"
158 #\236 "\x9e"
159 #\237 "\x9f"
160 #\240 "\xa0"
161 #\¡ "¡"
162 #\¢ "¢"
163 #\£ "£"
164 #\¤ "¤"
165 #\¥ "¥"
166 #\¦ "¦"
167 #\§ "§"
168 #\¨ "¨"
169 #\© "©"
170 #\ª "ª"
171 #\« "«"
172 #\¬ "¬"
173 #\255 "\xad"
174 #\® "®"
175 #\¯ "¯"
176 #\° "°"
177 #\± "±"
178 #\² "²"
179 #\³ "³"
180 #\´ "´"
181 #\µ "µ"
182 #\¶ "¶"
183 #\· "·"
184 #\¸ "¸"
185 #\¹ "¹"
186 #\º "º"
187 #\» "»"
188 #\¼ "¼"
189 #\½ "½"
190 #\¾ "¾"
191 #\¿ "¿"
192 #\À "À"
193 #\Á "Á"
194 #\Â "Â"
195 #\Ã "Ã"
196 #\Ä "Ä"
197 #\Å "Å"
198 #\Æ "Æ"
199 #\Ç "Ç"
200 #\È "È"
201 #\É "É"
202 #\Ê "Ê"
203 #\Ë "Ë"
204 #\Ì "Ì"
205 #\Í "Í"
206 #\Î "Î"
207 #\Ï "Ï"
208 #\Ð "Ð"
209 #\Ñ "Ñ"
210 #\Ò "Ò"
211 #\Ó "Ó"
212 #\Ô "Ô"
213 #\Õ "Õ"
214 #\Ö "Ö"
215 #\× "×"
216 #\Ø "Ø"
217 #\Ù "Ù"
218 #\Ú "Ú"
219 #\Û "Û"
220 #\Ü "Ü"
221 #\Ý "Ý"
222 #\Þ "Þ"
223 #\ß "ß"
224 #\à "à"
225 #\á "á"
226 #\â "â"
227 #\ã "ã"
228 #\ä "ä"
229 #\å "å"
230 #\æ "æ"
231 #\ç "ç"
232 #\è "è"
233 #\é "é"
234 #\ê "ê"
235 #\ë "ë"
236 #\ì "ì"
237 #\í "í"
238 #\î "î"
239 #\ï "ï"
240 #\ð "ð"
241 #\ñ "ñ"
242 #\ò "ò"
243 #\ó "ó"
244 #\ô "ô"
245 #\õ "õ"
246 #\ö "ö"
247 #\÷ "÷"
248 #\ø "ø"
249 #\ù "ù"
250 #\ú "ú"
251 #\û "û"
252 #\ü "ü"
253 #\ý "ý"
254 #\þ "þ"
255 #\ÿ "ÿ"
[-- Attachment #4: guile-2.0.out2 --]
[-- Type: application/octet-stream, Size: 3793 bytes --]
2.0.3.93-84843-dirty
0 #\nul "\x00"
1 #\soh "\x01"
2 #\stx "\x02"
3 #\etx "\x03"
4 #\eot "\x04"
5 #\enq "\x05"
6 #\ack "\x06"
7 #\alarm "\a"
8 #\backspace "\b"
9 #\tab "\t"
10 #\newline "\n"
11 #\vtab "\v"
12 #\page "\f"
13 #\return "\r"
14 #\so "\x0e"
15 #\si "\x0f"
16 #\dle "\x10"
17 #\dc1 "\x11"
18 #\dc2 "\x12"
19 #\dc3 "\x13"
20 #\dc4 "\x14"
21 #\nak "\x15"
22 #\syn "\x16"
23 #\etb "\x17"
24 #\can "\x18"
25 #\em "\x19"
26 #\sub "\x1a"
27 #\esc "\x1b"
28 #\fs "\x1c"
29 #\gs "\x1d"
30 #\rs "\x1e"
31 #\us "\x1f"
32 #\space " "
33 #\! "!"
34 #\" "\""
35 #\# "#"
36 #\$ "$"
37 #\% "%"
38 #\& "&"
39 #\' "'"
40 #\( "("
41 #\) ")"
42 #\* "*"
43 #\+ "+"
44 #\, ","
45 #\- "-"
46 #\. "."
47 #\/ "/"
48 #\0 "0"
49 #\1 "1"
50 #\2 "2"
51 #\3 "3"
52 #\4 "4"
53 #\5 "5"
54 #\6 "6"
55 #\7 "7"
56 #\8 "8"
57 #\9 "9"
58 #\: ":"
59 #\; ";"
60 #\< "<"
61 #\= "="
62 #\> ">"
63 #\? "?"
64 #\@ "@"
65 #\A "A"
66 #\B "B"
67 #\C "C"
68 #\D "D"
69 #\E "E"
70 #\F "F"
71 #\G "G"
72 #\H "H"
73 #\I "I"
74 #\J "J"
75 #\K "K"
76 #\L "L"
77 #\M "M"
78 #\N "N"
79 #\O "O"
80 #\P "P"
81 #\Q "Q"
82 #\R "R"
83 #\S "S"
84 #\T "T"
85 #\U "U"
86 #\V "V"
87 #\W "W"
88 #\X "X"
89 #\Y "Y"
90 #\Z "Z"
91 #\[ "["
92 #\\ "\\"
93 #\] "]"
94 #\^ "^"
95 #\_ "_"
96 #\` "`"
97 #\a "a"
98 #\b "b"
99 #\c "c"
100 #\d "d"
101 #\e "e"
102 #\f "f"
103 #\g "g"
104 #\h "h"
105 #\i "i"
106 #\j "j"
107 #\k "k"
108 #\l "l"
109 #\m "m"
110 #\n "n"
111 #\o "o"
112 #\p "p"
113 #\q "q"
114 #\r "r"
115 #\s "s"
116 #\t "t"
117 #\u "u"
118 #\v "v"
119 #\w "w"
120 #\x "x"
121 #\y "y"
122 #\z "z"
123 #\{ "{"
124 #\| "|"
125 #\} "}"
126 #\~ "~"
127 #\delete "\x7f"
128 #\200 "\x80"
129 #\201 "\x81"
130 #\202 "\x82"
131 #\203 "\x83"
132 #\204 "\x84"
133 #\205 "\x85"
134 #\206 "\x86"
135 #\207 "\x87"
136 #\210 "\x88"
137 #\211 "\x89"
138 #\212 "\x8a"
139 #\213 "\x8b"
140 #\214 "\x8c"
141 #\215 "\x8d"
142 #\216 "\x8e"
143 #\217 "\x8f"
144 #\220 "\x90"
145 #\221 "\x91"
146 #\222 "\x92"
147 #\223 "\x93"
148 #\224 "\x94"
149 #\225 "\x95"
150 #\226 "\x96"
151 #\227 "\x97"
152 #\230 "\x98"
153 #\231 "\x99"
154 #\232 "\x9a"
155 #\233 "\x9b"
156 #\234 "\x9c"
157 #\235 "\x9d"
158 #\236 "\x9e"
159 #\237 "\x9f"
160 #\240 "\xa0"
161 #\241 "\xa1"
162 #\242 "\xa2"
163 #\243 "\xa3"
164 #\244 "\xa4"
165 #\245 "\xa5"
166 #\246 "\xa6"
167 #\247 "\xa7"
168 #\250 "\xa8"
169 #\251 "\xa9"
170 #\252 "\xaa"
171 #\253 "\xab"
172 #\254 "\xac"
173 #\255 "\xad"
174 #\256 "\xae"
175 #\257 "\xaf"
176 #\260 "\xb0"
177 #\261 "\xb1"
178 #\262 "\xb2"
179 #\263 "\xb3"
180 #\264 "\xb4"
181 #\265 "\xb5"
182 #\266 "\xb6"
183 #\267 "\xb7"
184 #\270 "\xb8"
185 #\271 "\xb9"
186 #\272 "\xba"
187 #\273 "\xbb"
188 #\274 "\xbc"
189 #\275 "\xbd"
190 #\276 "\xbe"
191 #\277 "\xbf"
192 #\300 "\xc0"
193 #\301 "\xc1"
194 #\302 "\xc2"
195 #\303 "\xc3"
196 #\304 "\xc4"
197 #\305 "\xc5"
198 #\306 "\xc6"
199 #\307 "\xc7"
200 #\310 "\xc8"
201 #\311 "\xc9"
202 #\312 "\xca"
203 #\313 "\xcb"
204 #\314 "\xcc"
205 #\315 "\xcd"
206 #\316 "\xce"
207 #\317 "\xcf"
208 #\320 "\xd0"
209 #\321 "\xd1"
210 #\322 "\xd2"
211 #\323 "\xd3"
212 #\324 "\xd4"
213 #\325 "\xd5"
214 #\326 "\xd6"
215 #\327 "\xd7"
216 #\330 "\xd8"
217 #\331 "\xd9"
218 #\332 "\xda"
219 #\333 "\xdb"
220 #\334 "\xdc"
221 #\335 "\xdd"
222 #\336 "\xde"
223 #\337 "\xdf"
224 #\340 "\xe0"
225 #\341 "\xe1"
226 #\342 "\xe2"
227 #\343 "\xe3"
228 #\344 "\xe4"
229 #\345 "\xe5"
230 #\346 "\xe6"
231 #\347 "\xe7"
232 #\350 "\xe8"
233 #\351 "\xe9"
234 #\352 "\xea"
235 #\353 "\xeb"
236 #\354 "\xec"
237 #\355 "\xed"
238 #\356 "\xee"
239 #\357 "\xef"
240 #\360 "\xf0"
241 #\361 "\xf1"
242 #\362 "\xf2"
243 #\363 "\xf3"
244 #\364 "\xf4"
245 #\365 "\xf5"
246 #\366 "\xf6"
247 #\367 "\xf7"
248 #\370 "\xf8"
249 #\371 "\xf9"
250 #\372 "\xfa"
251 #\373 "\xfb"
252 #\374 "\xfc"
253 #\375 "\xfd"
254 #\376 "\xfe"
255 #\377 "\xff"
* Re: survey: string external representation
2012-01-26 8:00 survey: string external representation Thien-Thi Nguyen
2012-01-26 8:38 ` Andy Wingo
2012-01-26 14:11 ` Mike Gran
@ 2012-01-27 10:27 ` Thien-Thi Nguyen
2012-02-05 9:32 ` Thien-Thi Nguyen
2012-01-27 15:32 ` David Pirotte
3 siblings, 1 reply; 8+ messages in thread
From: Thien-Thi Nguyen @ 2012-01-27 10:27 UTC
To: guile-user
[-- Attachment #1: Type: text/plain, Size: 627 bytes --]
Thanks to everyone who responded. Based on the collected
information, I've cobbled together a runtime check for
‘sql-quote’. It and some tests are in the attached program.
To play:
    guile -s normalize.scm
    guile -s normalize.scm stupid
The code assumes Guile 2 DTRT, but if you have doubts, you can
    sed -i 's/guile-2/&-not-really/' normalize.scm
to disable that assumption. In any case, the program should exit
successfully, indicating smooth ‘write’ / ‘read’ round-tripping.
This is so (both w/ and w/o "stupid") for Guile 1.4.1.124 and 1.8.7.
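The runtime check boils down to asking the running Guile whether a one-character string survives ‘object->string’ unchanged. A minimal standalone sketch of that probe, assuming Guile 2.0 or later; the names ‘escaped?’ and ‘escaped-octets’ are hypothetical and do not appear in the attached program:

```scheme
;; Sketch only: list the code points in 128..255 that this Guile
;; escapes when writing a string.  Helper names are made up.
(define (escaped? ch)
  ;; Position 0 of the written form is the opening double quote, so
  ;; position 1 is CH itself unless the writer escaped it.
  (not (char=? ch (string-ref (object->string (string ch)) 1))))

(define escaped-octets
  (filter (lambda (n) (escaped? (integer->char n)))
          (iota 128 128)))              ; 128, 129, ..., 255

(write escaped-octets)
(newline)
```

An empty list would mean no high octet is escaped; a partial list is the case that can lead to raw and \x-escaped octets sitting next to each other.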
___________________________________________
[-- Attachment #2: normalize.scm --]
[-- Type: text/x-scheme, Size: 4055 bytes --]
;; -*- mode: scheme; coding: utf-8 -*-

(define EXIT-VALUE #t)                  ; optimism
(define STUPID? (false-if-exception (string=? "stupid" (cadr (command-line)))))

;; PostgreSQL groks ‘\xXX’ as an octet w/ hex value XX.
;; It also groks raw octets. This is all fine and good.
;; The problem arises when there is a mix of contiguous
;; raw and \x representations, intended to represent a
;; UTF-8 (say) encoded character.
;;
;; It seems Guile
;; - 1.4 DTRT by doing nothing;
;; - 1.6 ???;
;; - 1.8 fails by \x-escaping inconsistently;
;; - 2.0 doesn't have this problem.

(cond-expand
 (guile-2
  (define normalize identity))
 (else
  (use-modules
   (srfi srfi-13)
   (srfi srfi-14))
  (define normalize
    (or (let* ((ego (char-set
                     ;; These are not strictly necessary for
                     ;; PostgreSQL, but we include them for
                     ;; (Scheme-only) round-trip testing.
                     ;; Doubtlessly, what doubtful ego!
                     #\" #\\))
               (ugh (ucs-range->char-set #o177 #o400 #t ego)))
          (and (not (char-set-every
                     (lambda (ch)
                       ;; Does the octet xrep unmolested?
                       (char=? ch (string-ref (object->string (string ch)) 1)))
                     (char-set-difference ugh ego)))
               (or (not STUPID?)
                   (begin (set! ugh ego)
                          #t))
               ;; Lame.
               (lambda (s)
                 (define backslash-x
                   (let ((v (make-vector 256)))
                     (char-set-for-each
                      (lambda (ch)
                        (let ((i (char->integer ch)))
                          (vector-set!
                           v i (string-append
                                "\\x" (number->string i 16)))))
                      ugh)
                     ;; backslash-x
                     (lambda (ch)
                       (vector-ref v (char->integer ch)))))
                 (let loop ((start 0) (acc '()))
                   (cond ((string-index s ugh start)
                          => (lambda (idx)
                               (loop (1+ idx)
                                     (cons* (backslash-x (string-ref s idx))
                                            (substring/shared s start idx)
                                            acc))))
                         ((zero? start)
                          s)
                         (else
                          (string-concatenate-reverse
                           acc (substring/shared s start))))))))
        ;; Cool.
        identity))))

(define (try s)
  (simple-format
   #t "ORIG:\t~S~%NORM:\t~S~%=>\t~A~%~%"
   s (normalize s)
   (let ((round (with-input-from-string
                    (with-output-to-string
                      (lambda ()
                        (if (eq? identity normalize)
                            (write s)
                            (begin (display #\")
                                   (display (normalize s))
                                   (display #\")))))
                  read)))
     (cond ((equal? s round) 'SAME)
           (else
            (set! EXIT-VALUE #f)        ;-O
            (string-append
             "DIFF: [" (number->string (string-length round))
             "]|" round "|"))))))

(simple-format #t "Guile ~A~% LANG: ~S~% normalize: ~S~A~%~%"
               (version) (getenv "LANG") (procedure-name normalize)
               (if (and STUPID? (not (eq? normalize identity)))
                   " (but we stupidly revert to degeneracy)"
                   ""))

(try "")
(try (list->string (map integer->char (iota 256))))
(try "U+2002: | | (utf-8: E2 80 82)")
(try "U+232C: |⌬| (utf-8: E2 8C AC)")
(try "U+1D7FF: |𝟿| (utf-8: F0 9D 9F BF)")
(try "U+2F9B2: |䕫| (utf-8: F0 AF A6 B2)")
(try "U+2F9BC: |蜨| (utf-8: F0 AF A6 BC)")
(exit EXIT-VALUE)
* Re: survey: string external representation
2012-01-26 8:00 survey: string external representation Thien-Thi Nguyen
` (2 preceding siblings ...)
2012-01-27 10:27 ` Thien-Thi Nguyen
@ 2012-01-27 15:32 ` David Pirotte
3 siblings, 0 replies; 8+ messages in thread
From: David Pirotte @ 2012-01-27 15:32 UTC
To: Thien-Thi Nguyen; +Cc: guile-user
[-- Attachment #1: Type: text/plain, Size: 55 bytes --]
Hello,
In en_US.UTF-8, guile-1.6.8 ...
Cheers,
David
[-- Attachment #2: string-xrep.out --]
[-- Type: application/octet-stream, Size: 3288 bytes --]
* Re: survey: string external representation
2012-01-27 10:27 ` Thien-Thi Nguyen
@ 2012-02-05 9:32 ` Thien-Thi Nguyen
2012-02-07 8:58 ` Andy Wingo
2012-02-07 9:52 ` David Pirotte
0 siblings, 2 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2012-02-05 9:32 UTC
To: guile-user
[-- Attachment #1: Type: text/plain, Size: 343 bytes --]
() Thien-Thi Nguyen <ttn@gnuvola.org>
() Fri, 27 Jan 2012 11:27:30 +0100
The code assumes Guile 2 DTRT [...]
Well, further investigation raises new doubts. The real issue
is contiguous runs that mix raw and \x-escaped octets, not just
the external representation of single bytes, so here is a
followup experiment that addresses that directly:
[-- Attachment #2: xrep2.scm --]
[-- Type: text/x-scheme, Size: 558 bytes --]
(setlocale LC_ALL "")
(define (hmm symbol)
  (define (show x)
    (display x) (display "\t") (write x) (newline))
  (newline)
  (show symbol)
  (let ((string (symbol->string symbol)))
    (show string)
    (show (object->string string))))
(display "LANG: ") (write (getenv "LANG")) (newline)
(hmm 'foo)
(hmm '#{f\"o b\\r}#)
(hmm '⌬) ; U+232C (utf-8: E2 8C AC)
(hmm '䕫) ; U+2F9B2 (utf-8: F0 AF A6 B2)
(hmm '蜨) ; U+2F9BC (utf-8: F0 AF A6 BC)
[-- Attachment #3: Type: text/plain, Size: 150 bytes --]
Below are the outputs of two runs:
    guile -s xrep2.scm \
    | tee xrep2-$(guile --version | sed 's/.* //;q')-$LANG.out
What do other people see?
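To make "mixed" concrete when reading the attached outputs, here is a small sketch that renders the three UTF-8 octets of U+232C (E2 8C AC) under two escaping rules. Both predicates are assumptions chosen for illustration: ‘c1-control?’ is merely one rule that reproduces the raw, \x8c, raw pattern in the 1.8.7 run attached to this message, not a statement of what Guile 1.8 actually tests, and ‘high-octet?’ mirrors the #o177..#o377 range that normalize.scm escapes. Raw octets are shown as <XX> so the result stays pure ASCII.

```scheme
;; Sketch only: helper names and both predicates are hypothetical.
(define (hex2 n)
  ;; Two-digit lowercase hex, zero-padded on the left.
  (string-pad (number->string n 16) 2 #\0))

(define (xrep octets escape?)
  ;; Render each octet as \xXX when ESCAPE? holds, else as <XX>
  ;; (standing in for the raw byte).
  (string-concatenate
   (map (lambda (o)
          (if (escape? o)
              (string-append "\\x" (hex2 o))
              (string-append "<" (hex2 o) ">")))
        octets)))

(define (c1-control? o) (and (>= o #x80) (<= o #x9f)))
(define (high-octet? o) (>= o #x7f))

(define u+232c '(#xe2 #x8c #xac))

(display (xrep u+232c c1-control?)) (newline) ; <e2>\x8c<ac>  (mixed)
(display (xrep u+232c high-octet?)) (newline) ; \xe2\x8c\xac  (uniform)
```

The mixed form is the troublesome one: a consumer that decodes the raw octets as UTF-8 before interpreting the \x escape sees an incomplete sequence. Escaping every high octet, as the second rule does, avoids that.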
[-- Attachment #4: xrep2-1.4.1.124-it_IT.UTF-8.out --]
[-- Type: text/plain, Size: 276 bytes --]
LANG: "it_IT.UTF-8"
foo foo
foo "foo"
"foo" "\"foo\""
#{f\"o\ b\\r}# #{f\"o\ b\\r}#
f"o b\r "f\"o b\\r"
"f\"o b\\r" "\"f\\\"o b\\\\r\""
⌬ ⌬
⌬ "⌬"
"⌬" "\"⌬\""
䕫 䕫
䕫 "䕫"
"䕫" "\"䕫\""
蜨 蜨
蜨 "蜨"
"蜨" "\"蜨\""
[-- Attachment #5: xrep2-1.8.7-it_IT.UTF-8.out --]
[-- Type: text/plain, Size: 316 bytes --]
LANG: "it_IT.UTF-8"
foo foo
foo "foo"
"foo" "\"foo\""
#{\f\\\"o\ b\\\\r}# #{\f\\\"o\ b\\\\r}#
f\"o b\\r "f\\\"o b\\\\r"
"f\\\"o b\\\\r" "\"f\\\\\\\"o b\\\\\\\\r\""
âÐ âÐ
âÐ "â\x8c¬"
"â\x8c¬" "\"â\\x8c¬\""
䕫 䕫
䕫 "䕫"
"䕫" "\"䕫\""
蜨 蜨
蜨 "蜨"
"蜨" "\"蜨\""
* Re: survey: string external representation
2012-02-05 9:32 ` Thien-Thi Nguyen
@ 2012-02-07 8:58 ` Andy Wingo
2012-02-07 9:52 ` David Pirotte
1 sibling, 0 replies; 8+ messages in thread
From: Andy Wingo @ 2012-02-07 8:58 UTC
To: Thien-Thi Nguyen; +Cc: guile-user
[-- Attachment #1: Type: text/plain, Size: 191 bytes --]
On Sun 05 Feb 2012 10:32, Thien-Thi Nguyen <ttn@gnuvola.org> writes:
> guile -s xrep2.scm \
> | tee xrep2-$(guile --version | sed 's/.* //;q')-$LANG.out
>
> What do other people see?
[-- Attachment #2: xrep2--en_US.UTF-8.out --]
[-- Type: application/octet-stream, Size: 249 bytes --]
LANG: "en_US.UTF-8"
foo foo
foo "foo"
"foo" "\"foo\""
#{f"o b\r}# #{f"o b\r}#
f"o b\r "f\"o b\\r"
"f\"o b\\r" "\"f\\\"o b\\\\r\""
⌬ ⌬
⌬ "⌬"
"⌬" "\"⌬\""
䕫 䕫
䕫 "䕫"
"䕫" "\"䕫\""
蜨 蜨
蜨 "蜨"
"蜨" "\"蜨\""
[-- Attachment #3: Type: text/plain, Size: 26 bytes --]
--
http://wingolog.org/
* Re: survey: string external representation
2012-02-05 9:32 ` Thien-Thi Nguyen
2012-02-07 8:58 ` Andy Wingo
@ 2012-02-07 9:52 ` David Pirotte
1 sibling, 0 replies; 8+ messages in thread
From: David Pirotte @ 2012-02-07 9:52 UTC
To: Thien-Thi Nguyen; +Cc: guile-user
[-- Attachment #1: Type: text/plain, Size: 193 bytes --]
On Sun 05 Feb 2012 10:32, Thien-Thi Nguyen <ttn@gnuvola.org> writes:
> guile -s xrep2.scm \
> | tee xrep2-$(guile --version | sed 's/.* //;q')-$LANG.out
>
> What do other people see?
[-- Attachment #2: xrep2-1.6.8-en_US.UTF-8.out --]
[-- Type: application/octet-stream, Size: 255 bytes --]
LANG: "en_US.UTF-8"
foo foo
foo "foo"
"foo" "\"foo\""
#{f\"o\ b\\r}# #{f\"o\ b\\r}#
f"o b\r "f\"o b\\r"
"f\"o b\\r" "\"f\\\"o b\\\\r\""
⌬ ⌬
⌬ "⌬"
"⌬" "\"⌬\""
䕫 䕫
䕫 "䕫"
"䕫" "\"䕫\""
蜨 蜨
蜨 "蜨"
"蜨" "\"蜨\""
[-- Attachment #3: xrep2-1.6.8-fr_BE.UTF-8.out --]
[-- Type: application/octet-stream, Size: 255 bytes --]
LANG: "fr_BE.UTF-8"
foo foo
foo "foo"
"foo" "\"foo\""
#{f\"o\ b\\r}# #{f\"o\ b\\r}#
f"o b\r "f\"o b\\r"
"f\"o b\\r" "\"f\\\"o b\\\\r\""
⌬ ⌬
⌬ "⌬"
"⌬" "\"⌬\""
䕫 䕫
䕫 "䕫"
"䕫" "\"䕫\""
蜨 蜨
蜨 "蜨"
"蜨" "\"蜨\""