unofficial mirror of guile-user@gnu.org 
* survey: string external representation
@ 2012-01-26  8:00 Thien-Thi Nguyen
  2012-01-26  8:38 ` Andy Wingo
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2012-01-26  8:00 UTC
  To: guile-user

[-- Attachment #1: Type: text/plain, Size: 165 bytes --]

I am looking to improve the robustness of
‘(database postgres-qcons) sql-quote’ in the face of
diverse Guile behaviors.

Here is string-xrep.scm in its entirety:


[-- Attachment #2: string-xrep.scm --]
[-- Type: text/x-scheme, Size: 250 bytes --]


(display (version))
(newline)
(for-each (lambda (n)
            (simple-format #t "~S\t~S\t~S~%"
                           n
                           (integer->char n)
                           (string (integer->char n))))
          (iota 256))

[-- Attachment #3: Type: text/plain, Size: 333 bytes --]


Attached below is the output of runs w/ Guile 1.4.1.124
and 1.8.7, respectively, made by the command:

 guile -s string-xrep.scm > string-xrep-VERSION.out

in a ‘LANG=it_IT.UTF-8’ environment.  Could people who run
other Guile versions and/or other environments please run
the program and post the output, too?  Thanks!


[-- Attachment #4: string-xrep-1.4.1.124.out --]
[-- Type: application/octet-stream, Size: 3290 bytes --]

[-- Attachment #5: string-xrep-1.8.7.out --]
[-- Type: application/octet-stream, Size: 3444 bytes --]


* Re: survey: string external representation
  2012-01-26  8:00 survey: string external representation Thien-Thi Nguyen
@ 2012-01-26  8:38 ` Andy Wingo
  2012-01-26 14:11 ` Mike Gran
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Andy Wingo @ 2012-01-26  8:38 UTC
  To: Thien-Thi Nguyen; +Cc: guile-user

[-- Attachment #1: Type: text/plain, Size: 40 bytes --]

In en_US.UTF-8, guile from stable-2.0:


[-- Attachment #2: string-xrep-v2.0.3-187-g63fa6b1.out --]
[-- Type: text/plain, Size: 3764 bytes --]

2.0.3.164-7d02e2
0	#\nul	"\x00"
1	#\soh	"\x01"
2	#\stx	"\x02"
3	#\etx	"\x03"
4	#\eot	"\x04"
5	#\enq	"\x05"
6	#\ack	"\x06"
7	#\alarm	"\a"
8	#\backspace	"\b"
9	#\tab	"\t"
10	#\newline	"\n"
11	#\vtab	"\v"
12	#\page	"\f"
13	#\return	"\r"
14	#\so	"\x0e"
15	#\si	"\x0f"
16	#\dle	"\x10"
17	#\dc1	"\x11"
18	#\dc2	"\x12"
19	#\dc3	"\x13"
20	#\dc4	"\x14"
21	#\nak	"\x15"
22	#\syn	"\x16"
23	#\etb	"\x17"
24	#\can	"\x18"
25	#\em	"\x19"
26	#\sub	"\x1a"
27	#\esc	"\x1b"
28	#\fs	"\x1c"
29	#\gs	"\x1d"
30	#\rs	"\x1e"
31	#\us	"\x1f"
32	#\space	" "
33	#\!	"!"
34	#\"	"\""
35	#\#	"#"
36	#\$	"$"
37	#\%	"%"
38	#\&	"&"
39	#\'	"'"
40	#\(	"("
41	#\)	")"
42	#\*	"*"
43	#\+	"+"
44	#\,	","
45	#\-	"-"
46	#\.	"."
47	#\/	"/"
48	#\0	"0"
49	#\1	"1"
50	#\2	"2"
51	#\3	"3"
52	#\4	"4"
53	#\5	"5"
54	#\6	"6"
55	#\7	"7"
56	#\8	"8"
57	#\9	"9"
58	#\:	":"
59	#\;	";"
60	#\<	"<"
61	#\=	"="
62	#\>	">"
63	#\?	"?"
64	#\@	"@"
65	#\A	"A"
66	#\B	"B"
67	#\C	"C"
68	#\D	"D"
69	#\E	"E"
70	#\F	"F"
71	#\G	"G"
72	#\H	"H"
73	#\I	"I"
74	#\J	"J"
75	#\K	"K"
76	#\L	"L"
77	#\M	"M"
78	#\N	"N"
79	#\O	"O"
80	#\P	"P"
81	#\Q	"Q"
82	#\R	"R"
83	#\S	"S"
84	#\T	"T"
85	#\U	"U"
86	#\V	"V"
87	#\W	"W"
88	#\X	"X"
89	#\Y	"Y"
90	#\Z	"Z"
91	#\[	"["
92	#\\	"\\"
93	#\]	"]"
94	#\^	"^"
95	#\_	"_"
96	#\`	"`"
97	#\a	"a"
98	#\b	"b"
99	#\c	"c"
100	#\d	"d"
101	#\e	"e"
102	#\f	"f"
103	#\g	"g"
104	#\h	"h"
105	#\i	"i"
106	#\j	"j"
107	#\k	"k"
108	#\l	"l"
109	#\m	"m"
110	#\n	"n"
111	#\o	"o"
112	#\p	"p"
113	#\q	"q"
114	#\r	"r"
115	#\s	"s"
116	#\t	"t"
117	#\u	"u"
118	#\v	"v"
119	#\w	"w"
120	#\x	"x"
121	#\y	"y"
122	#\z	"z"
123	#\{	"{"
124	#\|	"|"
125	#\}	"}"
126	#\~	"~"
127	#\delete	"\x7f"
128	#\200	"\x80"
129	#\201	"\x81"
130	#\202	"\x82"
131	#\203	"\x83"
132	#\204	"\x84"
133	#\205	"\x85"
134	#\206	"\x86"
135	#\207	"\x87"
136	#\210	"\x88"
137	#\211	"\x89"
138	#\212	"\x8a"
139	#\213	"\x8b"
140	#\214	"\x8c"
141	#\215	"\x8d"
142	#\216	"\x8e"
143	#\217	"\x8f"
144	#\220	"\x90"
145	#\221	"\x91"
146	#\222	"\x92"
147	#\223	"\x93"
148	#\224	"\x94"
149	#\225	"\x95"
150	#\226	"\x96"
151	#\227	"\x97"
152	#\230	"\x98"
153	#\231	"\x99"
154	#\232	"\x9a"
155	#\233	"\x9b"
156	#\234	"\x9c"
157	#\235	"\x9d"
158	#\236	"\x9e"
159	#\237	"\x9f"
160	#\240	"\xa0"
161	#\¡	"¡"
162	#\¢	"¢"
163	#\£	"£"
164	#\¤	"¤"
165	#\¥	"¥"
166	#\¦	"¦"
167	#\§	"§"
168	#\¨	"¨"
169	#\©	"©"
170	#\ª	"ª"
171	#\«	"«"
172	#\¬	"¬"
173	#\255	"\xad"
174	#\®	"®"
175	#\¯	"¯"
176	#\°	"°"
177	#\±	"±"
178	#\²	"²"
179	#\³	"³"
180	#\´	"´"
181	#\µ	"µ"
182	#\¶	"¶"
183	#\·	"·"
184	#\¸	"¸"
185	#\¹	"¹"
186	#\º	"º"
187	#\»	"»"
188	#\¼	"¼"
189	#\½	"½"
190	#\¾	"¾"
191	#\¿	"¿"
192	#\À	"À"
193	#\Á	"Á"
194	#\Â	"Â"
195	#\Ã	"Ã"
196	#\Ä	"Ä"
197	#\Å	"Å"
198	#\Æ	"Æ"
199	#\Ç	"Ç"
200	#\È	"È"
201	#\É	"É"
202	#\Ê	"Ê"
203	#\Ë	"Ë"
204	#\Ì	"Ì"
205	#\Í	"Í"
206	#\Î	"Î"
207	#\Ï	"Ï"
208	#\Ð	"Ð"
209	#\Ñ	"Ñ"
210	#\Ò	"Ò"
211	#\Ó	"Ó"
212	#\Ô	"Ô"
213	#\Õ	"Õ"
214	#\Ö	"Ö"
215	#\×	"×"
216	#\Ø	"Ø"
217	#\Ù	"Ù"
218	#\Ú	"Ú"
219	#\Û	"Û"
220	#\Ü	"Ü"
221	#\Ý	"Ý"
222	#\Þ	"Þ"
223	#\ß	"ß"
224	#\à	"à"
225	#\á	"á"
226	#\â	"â"
227	#\ã	"ã"
228	#\ä	"ä"
229	#\å	"å"
230	#\æ	"æ"
231	#\ç	"ç"
232	#\è	"è"
233	#\é	"é"
234	#\ê	"ê"
235	#\ë	"ë"
236	#\ì	"ì"
237	#\í	"í"
238	#\î	"î"
239	#\ï	"ï"
240	#\ð	"ð"
241	#\ñ	"ñ"
242	#\ò	"ò"
243	#\ó	"ó"
244	#\ô	"ô"
245	#\õ	"õ"
246	#\ö	"ö"
247	#\÷	"÷"
248	#\ø	"ø"
249	#\ù	"ù"
250	#\ú	"ú"
251	#\û	"û"
252	#\ü	"ü"
253	#\ý	"ý"
254	#\þ	"þ"
255	#\ÿ	"ÿ"

[-- Attachment #3: Type: text/plain, Size: 26 bytes --]


-- 
http://wingolog.org/


* Re: survey: string external representation
  2012-01-26  8:00 survey: string external representation Thien-Thi Nguyen
  2012-01-26  8:38 ` Andy Wingo
@ 2012-01-26 14:11 ` Mike Gran
  2012-01-27 10:27 ` Thien-Thi Nguyen
  2012-01-27 15:32 ` David Pirotte
  3 siblings, 0 replies; 8+ messages in thread
From: Mike Gran @ 2012-01-26 14:11 UTC
  To: Thien-Thi Nguyen, guile-user@gnu.org

[-- Attachment #1: Type: text/plain, Size: 294 bytes --]

Hi Thi-

On the box I'm at right now, the locale is C.

Guile 1.8 gives me the result in guile-1.8.out.
Guile 2.0 gives me the result in guile-2.0.out.

If I add (setlocale LC_ALL "") to the top of the script,
guile-2.0 gives me guile-2.0.out2; guile-1.8 is unchanged.


-Mike

[-- Attachment #2: guile-1.8.out --]
[-- Type: application/octet-stream, Size: 3444 bytes --]

[-- Attachment #3: guile-2.0.out --]
[-- Type: application/octet-stream, Size: 3323 bytes --]

2.0.3.93-84843-dirty
0	#\nul	"\x00"
1	#\soh	"\x01"
2	#\stx	"\x02"
3	#\etx	"\x03"
4	#\eot	"\x04"
5	#\enq	"\x05"
6	#\ack	"\x06"
7	#\alarm	"\a"
8	#\backspace	"\b"
9	#\tab	"\t"
10	#\newline	"\n"
11	#\vtab	"\v"
12	#\page	"\f"
13	#\return	"\r"
14	#\so	"\x0e"
15	#\si	"\x0f"
16	#\dle	"\x10"
17	#\dc1	"\x11"
18	#\dc2	"\x12"
19	#\dc3	"\x13"
20	#\dc4	"\x14"
21	#\nak	"\x15"
22	#\syn	"\x16"
23	#\etb	"\x17"
24	#\can	"\x18"
25	#\em	"\x19"
26	#\sub	"\x1a"
27	#\esc	"\x1b"
28	#\fs	"\x1c"
29	#\gs	"\x1d"
30	#\rs	"\x1e"
31	#\us	"\x1f"
32	#\space	" "
33	#\!	"!"
34	#\"	"\""
35	#\#	"#"
36	#\$	"$"
37	#\%	"%"
38	#\&	"&"
39	#\'	"'"
40	#\(	"("
41	#\)	")"
42	#\*	"*"
43	#\+	"+"
44	#\,	","
45	#\-	"-"
46	#\.	"."
47	#\/	"/"
48	#\0	"0"
49	#\1	"1"
50	#\2	"2"
51	#\3	"3"
52	#\4	"4"
53	#\5	"5"
54	#\6	"6"
55	#\7	"7"
56	#\8	"8"
57	#\9	"9"
58	#\:	":"
59	#\;	";"
60	#\<	"<"
61	#\=	"="
62	#\>	">"
63	#\?	"?"
64	#\@	"@"
65	#\A	"A"
66	#\B	"B"
67	#\C	"C"
68	#\D	"D"
69	#\E	"E"
70	#\F	"F"
71	#\G	"G"
72	#\H	"H"
73	#\I	"I"
74	#\J	"J"
75	#\K	"K"
76	#\L	"L"
77	#\M	"M"
78	#\N	"N"
79	#\O	"O"
80	#\P	"P"
81	#\Q	"Q"
82	#\R	"R"
83	#\S	"S"
84	#\T	"T"
85	#\U	"U"
86	#\V	"V"
87	#\W	"W"
88	#\X	"X"
89	#\Y	"Y"
90	#\Z	"Z"
91	#\[	"["
92	#\\	"\\"
93	#\]	"]"
94	#\^	"^"
95	#\_	"_"
96	#\`	"`"
97	#\a	"a"
98	#\b	"b"
99	#\c	"c"
100	#\d	"d"
101	#\e	"e"
102	#\f	"f"
103	#\g	"g"
104	#\h	"h"
105	#\i	"i"
106	#\j	"j"
107	#\k	"k"
108	#\l	"l"
109	#\m	"m"
110	#\n	"n"
111	#\o	"o"
112	#\p	"p"
113	#\q	"q"
114	#\r	"r"
115	#\s	"s"
116	#\t	"t"
117	#\u	"u"
118	#\v	"v"
119	#\w	"w"
120	#\x	"x"
121	#\y	"y"
122	#\z	"z"
123	#\{	"{"
124	#\|	"|"
125	#\}	"}"
126	#\~	"~"
127	#\delete	"\x7f"
128	#\200	"\x80"
129	#\201	"\x81"
130	#\202	"\x82"
131	#\203	"\x83"
132	#\204	"\x84"
133	#\205	"\x85"
134	#\206	"\x86"
135	#\207	"\x87"
136	#\210	"\x88"
137	#\211	"\x89"
138	#\212	"\x8a"
139	#\213	"\x8b"
140	#\214	"\x8c"
141	#\215	"\x8d"
142	#\216	"\x8e"
143	#\217	"\x8f"
144	#\220	"\x90"
145	#\221	"\x91"
146	#\222	"\x92"
147	#\223	"\x93"
148	#\224	"\x94"
149	#\225	"\x95"
150	#\226	"\x96"
151	#\227	"\x97"
152	#\230	"\x98"
153	#\231	"\x99"
154	#\232	"\x9a"
155	#\233	"\x9b"
156	#\234	"\x9c"
157	#\235	"\x9d"
158	#\236	"\x9e"
159	#\237	"\x9f"
160	#\240	"\xa0"
161	#\¡	"¡"
162	#\¢	"¢"
163	#\£	"£"
164	#\¤	"¤"
165	#\¥	"¥"
166	#\¦	"¦"
167	#\§	"§"
168	#\¨	"¨"
169	#\©	"©"
170	#\ª	"ª"
171	#\«	"«"
172	#\¬	"¬"
173	#\255	"\xad"
174	#\®	"®"
175	#\¯	"¯"
176	#\°	"°"
177	#\±	"±"
178	#\²	"²"
179	#\³	"³"
180	#\´	"´"
181	#\µ	"µ"
182	#\¶	"¶"
183	#\·	"·"
184	#\¸	"¸"
185	#\¹	"¹"
186	#\º	"º"
187	#\»	"»"
188	#\¼	"¼"
189	#\½	"½"
190	#\¾	"¾"
191	#\¿	"¿"
192	#\À	"À"
193	#\Á	"Á"
194	#\Â	"Â"
195	#\Ã	"Ã"
196	#\Ä	"Ä"
197	#\Å	"Å"
198	#\Æ	"Æ"
199	#\Ç	"Ç"
200	#\È	"È"
201	#\É	"É"
202	#\Ê	"Ê"
203	#\Ë	"Ë"
204	#\Ì	"Ì"
205	#\Í	"Í"
206	#\Î	"Î"
207	#\Ï	"Ï"
208	#\Ð	"Ð"
209	#\Ñ	"Ñ"
210	#\Ò	"Ò"
211	#\Ó	"Ó"
212	#\Ô	"Ô"
213	#\Õ	"Õ"
214	#\Ö	"Ö"
215	#\×	"×"
216	#\Ø	"Ø"
217	#\Ù	"Ù"
218	#\Ú	"Ú"
219	#\Û	"Û"
220	#\Ü	"Ü"
221	#\Ý	"Ý"
222	#\Þ	"Þ"
223	#\ß	"ß"
224	#\à	"à"
225	#\á	"á"
226	#\â	"â"
227	#\ã	"ã"
228	#\ä	"ä"
229	#\å	"å"
230	#\æ	"æ"
231	#\ç	"ç"
232	#\è	"è"
233	#\é	"é"
234	#\ê	"ê"
235	#\ë	"ë"
236	#\ì	"ì"
237	#\í	"í"
238	#\î	"î"
239	#\ï	"ï"
240	#\ð	"ð"
241	#\ñ	"ñ"
242	#\ò	"ò"
243	#\ó	"ó"
244	#\ô	"ô"
245	#\õ	"õ"
246	#\ö	"ö"
247	#\÷	"÷"
248	#\ø	"ø"
249	#\ù	"ù"
250	#\ú	"ú"
251	#\û	"û"
252	#\ü	"ü"
253	#\ý	"ý"
254	#\þ	"þ"
255	#\ÿ	"ÿ"

[-- Attachment #4: guile-2.0.out2 --]
[-- Type: application/octet-stream, Size: 3793 bytes --]

2.0.3.93-84843-dirty
0	#\nul	"\x00"
1	#\soh	"\x01"
2	#\stx	"\x02"
3	#\etx	"\x03"
4	#\eot	"\x04"
5	#\enq	"\x05"
6	#\ack	"\x06"
7	#\alarm	"\a"
8	#\backspace	"\b"
9	#\tab	"\t"
10	#\newline	"\n"
11	#\vtab	"\v"
12	#\page	"\f"
13	#\return	"\r"
14	#\so	"\x0e"
15	#\si	"\x0f"
16	#\dle	"\x10"
17	#\dc1	"\x11"
18	#\dc2	"\x12"
19	#\dc3	"\x13"
20	#\dc4	"\x14"
21	#\nak	"\x15"
22	#\syn	"\x16"
23	#\etb	"\x17"
24	#\can	"\x18"
25	#\em	"\x19"
26	#\sub	"\x1a"
27	#\esc	"\x1b"
28	#\fs	"\x1c"
29	#\gs	"\x1d"
30	#\rs	"\x1e"
31	#\us	"\x1f"
32	#\space	" "
33	#\!	"!"
34	#\"	"\""
35	#\#	"#"
36	#\$	"$"
37	#\%	"%"
38	#\&	"&"
39	#\'	"'"
40	#\(	"("
41	#\)	")"
42	#\*	"*"
43	#\+	"+"
44	#\,	","
45	#\-	"-"
46	#\.	"."
47	#\/	"/"
48	#\0	"0"
49	#\1	"1"
50	#\2	"2"
51	#\3	"3"
52	#\4	"4"
53	#\5	"5"
54	#\6	"6"
55	#\7	"7"
56	#\8	"8"
57	#\9	"9"
58	#\:	":"
59	#\;	";"
60	#\<	"<"
61	#\=	"="
62	#\>	">"
63	#\?	"?"
64	#\@	"@"
65	#\A	"A"
66	#\B	"B"
67	#\C	"C"
68	#\D	"D"
69	#\E	"E"
70	#\F	"F"
71	#\G	"G"
72	#\H	"H"
73	#\I	"I"
74	#\J	"J"
75	#\K	"K"
76	#\L	"L"
77	#\M	"M"
78	#\N	"N"
79	#\O	"O"
80	#\P	"P"
81	#\Q	"Q"
82	#\R	"R"
83	#\S	"S"
84	#\T	"T"
85	#\U	"U"
86	#\V	"V"
87	#\W	"W"
88	#\X	"X"
89	#\Y	"Y"
90	#\Z	"Z"
91	#\[	"["
92	#\\	"\\"
93	#\]	"]"
94	#\^	"^"
95	#\_	"_"
96	#\`	"`"
97	#\a	"a"
98	#\b	"b"
99	#\c	"c"
100	#\d	"d"
101	#\e	"e"
102	#\f	"f"
103	#\g	"g"
104	#\h	"h"
105	#\i	"i"
106	#\j	"j"
107	#\k	"k"
108	#\l	"l"
109	#\m	"m"
110	#\n	"n"
111	#\o	"o"
112	#\p	"p"
113	#\q	"q"
114	#\r	"r"
115	#\s	"s"
116	#\t	"t"
117	#\u	"u"
118	#\v	"v"
119	#\w	"w"
120	#\x	"x"
121	#\y	"y"
122	#\z	"z"
123	#\{	"{"
124	#\|	"|"
125	#\}	"}"
126	#\~	"~"
127	#\delete	"\x7f"
128	#\200	"\x80"
129	#\201	"\x81"
130	#\202	"\x82"
131	#\203	"\x83"
132	#\204	"\x84"
133	#\205	"\x85"
134	#\206	"\x86"
135	#\207	"\x87"
136	#\210	"\x88"
137	#\211	"\x89"
138	#\212	"\x8a"
139	#\213	"\x8b"
140	#\214	"\x8c"
141	#\215	"\x8d"
142	#\216	"\x8e"
143	#\217	"\x8f"
144	#\220	"\x90"
145	#\221	"\x91"
146	#\222	"\x92"
147	#\223	"\x93"
148	#\224	"\x94"
149	#\225	"\x95"
150	#\226	"\x96"
151	#\227	"\x97"
152	#\230	"\x98"
153	#\231	"\x99"
154	#\232	"\x9a"
155	#\233	"\x9b"
156	#\234	"\x9c"
157	#\235	"\x9d"
158	#\236	"\x9e"
159	#\237	"\x9f"
160	#\240	"\xa0"
161	#\241	"\xa1"
162	#\242	"\xa2"
163	#\243	"\xa3"
164	#\244	"\xa4"
165	#\245	"\xa5"
166	#\246	"\xa6"
167	#\247	"\xa7"
168	#\250	"\xa8"
169	#\251	"\xa9"
170	#\252	"\xaa"
171	#\253	"\xab"
172	#\254	"\xac"
173	#\255	"\xad"
174	#\256	"\xae"
175	#\257	"\xaf"
176	#\260	"\xb0"
177	#\261	"\xb1"
178	#\262	"\xb2"
179	#\263	"\xb3"
180	#\264	"\xb4"
181	#\265	"\xb5"
182	#\266	"\xb6"
183	#\267	"\xb7"
184	#\270	"\xb8"
185	#\271	"\xb9"
186	#\272	"\xba"
187	#\273	"\xbb"
188	#\274	"\xbc"
189	#\275	"\xbd"
190	#\276	"\xbe"
191	#\277	"\xbf"
192	#\300	"\xc0"
193	#\301	"\xc1"
194	#\302	"\xc2"
195	#\303	"\xc3"
196	#\304	"\xc4"
197	#\305	"\xc5"
198	#\306	"\xc6"
199	#\307	"\xc7"
200	#\310	"\xc8"
201	#\311	"\xc9"
202	#\312	"\xca"
203	#\313	"\xcb"
204	#\314	"\xcc"
205	#\315	"\xcd"
206	#\316	"\xce"
207	#\317	"\xcf"
208	#\320	"\xd0"
209	#\321	"\xd1"
210	#\322	"\xd2"
211	#\323	"\xd3"
212	#\324	"\xd4"
213	#\325	"\xd5"
214	#\326	"\xd6"
215	#\327	"\xd7"
216	#\330	"\xd8"
217	#\331	"\xd9"
218	#\332	"\xda"
219	#\333	"\xdb"
220	#\334	"\xdc"
221	#\335	"\xdd"
222	#\336	"\xde"
223	#\337	"\xdf"
224	#\340	"\xe0"
225	#\341	"\xe1"
226	#\342	"\xe2"
227	#\343	"\xe3"
228	#\344	"\xe4"
229	#\345	"\xe5"
230	#\346	"\xe6"
231	#\347	"\xe7"
232	#\350	"\xe8"
233	#\351	"\xe9"
234	#\352	"\xea"
235	#\353	"\xeb"
236	#\354	"\xec"
237	#\355	"\xed"
238	#\356	"\xee"
239	#\357	"\xef"
240	#\360	"\xf0"
241	#\361	"\xf1"
242	#\362	"\xf2"
243	#\363	"\xf3"
244	#\364	"\xf4"
245	#\365	"\xf5"
246	#\366	"\xf6"
247	#\367	"\xf7"
248	#\370	"\xf8"
249	#\371	"\xf9"
250	#\372	"\xfa"
251	#\373	"\xfb"
252	#\374	"\xfc"
253	#\375	"\xfd"
254	#\376	"\xfe"
255	#\377	"\xff"
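
Comparing the two 2.0 runs above: without setlocale, code points
161..255 (except 173) come out raw; with (setlocale LC_ALL "") in
the C locale they are all \x-escaped.  One way to read this, offered
as an assumption and not as a statement about Guile's implementation,
is that a character is written raw only if it is a graphic character
(or a plain space) and the output encoding can represent it.  The
Python sketch below models that rule.  The name `xrep`, the encoding
names, and the guess that the run without setlocale behaves like
Latin-1 are all illustrative, not taken from Guile.

```python
import unicodedata

# Short escapes seen in the tables above (7..13, double quote, backslash).
NAMED = {7: "\\a", 8: "\\b", 9: "\\t", 10: "\\n", 11: "\\v",
         12: "\\f", 13: "\\r", 34: '\\"', 92: "\\\\"}

def xrep(ch, encoding):
    """Model how one character in 0..255 appears inside a written string."""
    n = ord(ch)
    if n in NAMED:
        return NAMED[n]
    # Letters, marks, numbers, punctuation and symbols count as graphic;
    # a plain space is allowed too.  Controls, format characters and
    # other separators (e.g. no-break space) do not.
    graphic = ch == " " or unicodedata.category(ch)[0] in "LMNPS"
    try:
        ch.encode(encoding)
        encodable = True
    except UnicodeEncodeError:
        encodable = False
    # Anything not graphic, or not representable, becomes \xNN.
    return ch if graphic and encodable else "\\x%02x" % n
```

With "latin-1" or "utf-8" as the encoding the model matches
guile-2.0.out (raw 161..255, but \xa0 and \xad escaped); with
"ascii" it matches guile-2.0.out2.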


* Re: survey: string external representation
  2012-01-26  8:00 survey: string external representation Thien-Thi Nguyen
  2012-01-26  8:38 ` Andy Wingo
  2012-01-26 14:11 ` Mike Gran
@ 2012-01-27 10:27 ` Thien-Thi Nguyen
  2012-02-05  9:32   ` Thien-Thi Nguyen
  2012-01-27 15:32 ` David Pirotte
  3 siblings, 1 reply; 8+ messages in thread
From: Thien-Thi Nguyen @ 2012-01-27 10:27 UTC
  To: guile-user

[-- Attachment #1: Type: text/plain, Size: 627 bytes --]

Thanks to everyone who responded.  Based on the collected
information, i've cobbled together a runtime check for
‘sql-quote’.  It and some tests are in the attached program.
To play:

 guile -s normalize.scm
 guile -s normalize.scm stupid

The code assumes Guile 2 DTRT, but if you have doubts, you can

 sed -i 's/guile-2/&-not-really/' normalize.scm

to disable that assumption.  In any case, the program should exit
successfully, indicating smooth ‘write’ / ‘read’ round-tripping.
This is so (both w/ and w/o "stupid") for Guile 1.4.1.124 and 1.8.7.
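
The exit status encodes one property: writing a string and reading
it back yields the same string.  As an illustration of that check
outside Guile, here is the same idea in Python, with `repr` and
`ast.literal_eval` standing in for ‘write’ and ‘read’.  This is an
analogy only: nothing here exercises Guile or PostgreSQL, and
`round_trips` is a made-up name.

```python
import ast

def round_trips(s):
    """True if the printed representation of s reads back as s."""
    return ast.literal_eval(repr(s)) == s

# A few inputs in the spirit of the tests in normalize.scm.
samples = [
    "",
    "".join(map(chr, range(256))),
    "U+232C: |\u232c| (utf-8: E2 8C AC)",
    "U+1D7FF: |\U0001d7ff| (utf-8: F0 9D 9F BF)",
]

# Analogous to EXIT-VALUE: stays True only if every sample survives.
ok = all(round_trips(s) for s in samples)
```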

___________________________________________

[-- Attachment #2: normalize.scm --]
[-- Type: text/x-scheme, Size: 4055 bytes --]

;; -*- mode: scheme; coding: utf-8 -*-

(define EXIT-VALUE #t)                  ; optimism

(define STUPID? (false-if-exception (string=? "stupid" (cadr (command-line)))))

;; PostgreSQL groks ‘\xXX’ as an octet w/ hex value XX.
;; It also groks raw octets.  This is all fine and good.
;; The problem arises when there is a mix of contiguous
;; raw and \x representations, intended to represent a
;; UTF-8 (say) encoded character.
;;
;; It seems Guile
;; - 1.4 DTRT by doing nothing;
;; - 1.6 ???;
;; - 1.8 fails by \x-escaping inconsistently;
;; - 2.0 doesn't have this problem.

(cond-expand
 (guile-2
  (define normalize identity))
 (else
  (use-modules
   (srfi srfi-13)
   (srfi srfi-14))
  (define normalize
    (or (let* ((ego (char-set
                     ;; These are not strictly necessary for
                     ;; PostgreSQL, but we include them for
                     ;; (Scheme-only) round-trip testing.
                     ;; Doubtlessly, what doubtful ego!
                     #\" #\\))
               (ugh (ucs-range->char-set #o177 #o400 #t ego)))
          (and (not (char-set-every
                     (lambda (ch)
                       ;; Does the octet xrep unmolested?
                       (char=? ch (string-ref (object->string (string ch)) 1)))
                     (char-set-difference ugh ego)))
               (or (not STUPID?)
                   (begin (set! ugh ego)
                          #t))
               ;; Lame.
               (lambda (s)
                 (define backslash-x
                   (let ((v (make-vector 256)))
                     (char-set-for-each
                      (lambda (ch)
                        (let ((i (char->integer ch)))
                          (vector-set!
                           v i (string-append
                                "\\x" (number->string i 16)))))
                      ugh)
                     ;; backslash-x
                     (lambda (ch)
                       (vector-ref v (char->integer ch)))))
                 (let loop ((start 0) (acc '()))
                   (cond ((string-index s ugh start)
                          => (lambda (idx)
                               (loop (1+ idx)
                                     (cons* (backslash-x (string-ref s idx))
                                            (substring/shared s start idx)
                                            acc))))
                         ((zero? start)
                          s)
                         (else
                          (string-concatenate-reverse
                           acc (substring/shared s start))))))))
        ;; Cool.
        identity))))

(define (try s)
  (simple-format
   #t "ORIG:\t~S~%NORM:\t~S~%=>\t~A~%~%"
   s (normalize s)
   (let ((round (with-input-from-string
                    (with-output-to-string
                      (lambda ()
                        (if (eq? identity normalize)
                            (write s)
                            (begin (display #\")
                                   (display (normalize s))
                                   (display #\")))))
                  read)))
     (cond ((equal? s round) 'SAME)
           (else
            (set! EXIT-VALUE #f)        ;-O
            (string-append
             "DIFF: [" (number->string (string-length round))
             "]|" round "|"))))))

(simple-format #t "Guile ~A~% LANG: ~S~% normalize: ~S~A~%~%"
               (version) (getenv "LANG") (procedure-name normalize)
               (if (and STUPID? (not (eq? normalize identity)))
                   " (but we stupidly revert to degeneracy)"
                   ""))

(try "")
(try (list->string (map integer->char (iota 256))))
(try "U+2002: | | (utf-8: E2 80 82)")
(try "U+232C: |⌬| (utf-8: E2 8C AC)")
(try "U+1D7FF: |𝟿| (utf-8: F0 9D 9F BF)")
(try "U+2F9B2: |䕫| (utf-8: F0 AF A6 B2)")
(try "U+2F9BC: |蜨| (utf-8: F0 AF A6 BC)")

(exit EXIT-VALUE)


* Re: survey: string external representation
  2012-01-26  8:00 survey: string external representation Thien-Thi Nguyen
                   ` (2 preceding siblings ...)
  2012-01-27 10:27 ` Thien-Thi Nguyen
@ 2012-01-27 15:32 ` David Pirotte
  3 siblings, 0 replies; 8+ messages in thread
From: David Pirotte @ 2012-01-27 15:32 UTC
  To: Thien-Thi Nguyen; +Cc: guile-user

[-- Attachment #1: Type: text/plain, Size: 55 bytes --]

Hello,

In en_US.UTF-8, guile-1.6.8 ...

Cheers,
David

[-- Attachment #2: string-xrep.out --]
[-- Type: application/octet-stream, Size: 3288 bytes --]


* Re: survey: string external representation
  2012-01-27 10:27 ` Thien-Thi Nguyen
@ 2012-02-05  9:32   ` Thien-Thi Nguyen
  2012-02-07  8:58     ` Andy Wingo
  2012-02-07  9:52     ` David Pirotte
  0 siblings, 2 replies; 8+ messages in thread
From: Thien-Thi Nguyen @ 2012-02-05  9:32 UTC
  To: guile-user

[-- Attachment #1: Type: text/plain, Size: 343 bytes --]

() Thien-Thi Nguyen <ttn@gnuvola.org>
() Fri, 27 Jan 2012 11:27:30 +0100

   The code assumes Guile 2 DTRT [...]

Well, further investigation raises new doubts.  The issue really
lies in contiguous runs of mixed raw and \x-escaped octets, not
just in the external representation of single bytes, so here is
a followup experiment that addresses that directly:


[-- Attachment #2: xrep2.scm --]
[-- Type: text/x-scheme, Size: 558 bytes --]

(setlocale LC_ALL "")

(define (hmm symbol)
  (define (show x)
    (display x) (display "\t") (write x) (newline))
  (newline)
  (show symbol)
  (let ((string (symbol->string symbol)))
    (show string)
    (show (object->string string))))

(display "LANG: ") (write (getenv "LANG")) (newline)
(hmm 'foo)
(hmm '#{f\"o b\\r}#)
(hmm '⌬)                                ; U+232C (utf-8: E2 8C AC)
(hmm '䕫)                               ; U+2F9B2 (utf-8: F0 AF A6 B2)
(hmm '蜨)                               ; U+2F9BC (utf-8: F0 AF A6 BC)

[-- Attachment #3: Type: text/plain, Size: 150 bytes --]


Below is the output of two runs:

  guile -s xrep2.scm \
    | tee xrep2-$(guile --version | sed 's/.* //;q')-$LANG.out

What do other people see?


[-- Attachment #4: xrep2-1.4.1.124-it_IT.UTF-8.out --]
[-- Type: text/plain, Size: 276 bytes --]

LANG: "it_IT.UTF-8"

foo	foo
foo	"foo"
"foo"	"\"foo\""

#{f\"o\ b\\r}#	#{f\"o\ b\\r}#
f"o b\r	"f\"o b\\r"
"f\"o b\\r"	"\"f\\\"o b\\\\r\""

⌬	⌬
⌬	"⌬"
"⌬"	"\"⌬\""

䕫	䕫
䕫	"䕫"
"䕫"	"\"䕫\""

蜨	蜨
蜨	"蜨"
"蜨"	"\"蜨\""

[-- Attachment #5: xrep2-1.8.7-it_IT.UTF-8.out --]
[-- Type: text/plain, Size: 316 bytes --]

LANG: "it_IT.UTF-8"

foo	foo
foo	"foo"
"foo"	"\"foo\""

#{\f\\\"o\ b\\\\r}#	#{\f\\\"o\ b\\\\r}#
f\"o b\\r	"f\\\"o b\\\\r"
"f\\\"o b\\\\r"	"\"f\\\\\\\"o b\\\\\\\\r\""

âÐŒ	âÐŒ
âÐŒ	"â\x8c¬"
"â\x8c¬"	"\"â\\x8c¬\""

䕫	䕫
䕫	"䕫"
"䕫"	"\"䕫\""

蜨	蜨
蜨	"蜨"
"蜨"	"\"蜨\""


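
The ⌬ rows in the 1.8.7 output above are the problem case: the
three UTF-8 octets E2 8C AC come back as raw E2, escaped \x8c, raw
AC.  The Python sketch below replays that at the byte level and
contrasts it with escaping every octet from 0x7F upward, which is
roughly what normalize.scm does for the high octets.  The rule in
`escape_some` (escape only 0x80..0x9F) is an assumption chosen to
reproduce this one row, not a description of the Guile 1.8 printer;
both function names are made up for illustration.

```python
def escape_some(octets):
    """Assumed 1.8-like behaviour: escape only octets 0x80..0x9F."""
    return "".join("\\x%02x" % b if 0x80 <= b <= 0x9F else chr(b)
                   for b in octets)

def escape_all(octets):
    """Consistent alternative: escape every octet in 0x7F..0xFF."""
    return "".join("\\x%02x" % b if b >= 0x7F else chr(b)
                   for b in octets)

octets = "\u232c".encode("utf-8")   # the three octets E2 8C AC
mixed = escape_some(octets)         # raw E2, escaped 8C, raw AC
uniform = escape_all(octets)        # all three octets escaped
```

A consumer that takes \xNN to mean one octet can rebuild the
original three octets from `uniform` without guessing how the raw
octets in `mixed` were encoded.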

* Re: survey: string external representation
  2012-02-05  9:32   ` Thien-Thi Nguyen
@ 2012-02-07  8:58     ` Andy Wingo
  2012-02-07  9:52     ` David Pirotte
  1 sibling, 0 replies; 8+ messages in thread
From: Andy Wingo @ 2012-02-07  8:58 UTC
  To: Thien-Thi Nguyen; +Cc: guile-user

[-- Attachment #1: Type: text/plain, Size: 191 bytes --]

On Sun 05 Feb 2012 10:32, Thien-Thi Nguyen <ttn@gnuvola.org> writes:

>   guile -s xrep2.scm \
>     | tee xrep2-$(guile --version | sed 's/.* //;q')-$LANG.out
>
> What do other people see?


[-- Attachment #2: xrep2--en_US.UTF-8.out --]
[-- Type: application/octet-stream, Size: 249 bytes --]

LANG: "en_US.UTF-8"

foo	foo
foo	"foo"
"foo"	"\"foo\""

#{f"o b\r}#	#{f"o b\r}#
f"o b\r	"f\"o b\\r"
"f\"o b\\r"	"\"f\\\"o b\\\\r\""

⌬	⌬
⌬	"⌬"
"⌬"	"\"⌬\""

䕫	䕫
䕫	"䕫"
"䕫"	"\"䕫\""

蜨	蜨
蜨	"蜨"
"蜨"	"\"蜨\""

[-- Attachment #3: Type: text/plain, Size: 26 bytes --]


-- 
http://wingolog.org/


* Re: survey: string external representation
  2012-02-05  9:32   ` Thien-Thi Nguyen
  2012-02-07  8:58     ` Andy Wingo
@ 2012-02-07  9:52     ` David Pirotte
  1 sibling, 0 replies; 8+ messages in thread
From: David Pirotte @ 2012-02-07  9:52 UTC
  To: Thien-Thi Nguyen; +Cc: guile-user

[-- Attachment #1: Type: text/plain, Size: 193 bytes --]

On Sun 05 Feb 2012 10:32, Thien-Thi Nguyen <ttn@gnuvola.org> writes:

>   guile -s xrep2.scm \
>     | tee xrep2-$(guile --version | sed 's/.* //;q')-$LANG.out
>
> What do other people see?  


[-- Attachment #2: xrep2-1.6.8-en_US.UTF-8.out --]
[-- Type: application/octet-stream, Size: 255 bytes --]

LANG: "en_US.UTF-8"

foo	foo
foo	"foo"
"foo"	"\"foo\""

#{f\"o\ b\\r}#	#{f\"o\ b\\r}#
f"o b\r	"f\"o b\\r"
"f\"o b\\r"	"\"f\\\"o b\\\\r\""

⌬	⌬
⌬	"⌬"
"⌬"	"\"⌬\""

䕫	䕫
䕫	"䕫"
"䕫"	"\"䕫\""

蜨	蜨
蜨	"蜨"
"蜨"	"\"蜨\""

[-- Attachment #3: xrep2-1.6.8-fr_BE.UTF-8.out --]
[-- Type: application/octet-stream, Size: 255 bytes --]

LANG: "fr_BE.UTF-8"

foo	foo
foo	"foo"
"foo"	"\"foo\""

#{f\"o\ b\\r}#	#{f\"o\ b\\r}#
f"o b\r	"f\"o b\\r"
"f\"o b\\r"	"\"f\\\"o b\\\\r\""

⌬	⌬
⌬	"⌬"
"⌬"	"\"⌬\""

䕫	䕫
䕫	"䕫"
"䕫"	"\"䕫\""

蜨	蜨
蜨	"蜨"
"蜨"	"\"蜨\""

