* argz SMOB
From: Brian S McQueen @ 2004-01-05 17:40 UTC

I have the beginnings of a SMOB library for GNU libc's argz. I will also
be adding the envz functionality.

Brian McQueen

_______________________________________________
Guile-user mailing list
Guile-user@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-user

* Re: argz SMOB
From: Daniel Skarda @ 2004-01-06 19:54 UTC
Cc: guile-user

Hello,

Brian S McQueen <bqueen@nas.nasa.gov> writes:
> I have the beginnings of a SMOB library for GNU libc's argz. I will also
> be adding the envz functionality.

I am sorry, I do not see what it is good for. Could you please explain
which itch you are trying to scratch with this SMOB library?

0.

* Re: argz SMOB
From: Brian S McQueen @ 2004-01-08 16:44 UTC
Cc: guile-user

I have several libraries in use here which make database queries,
returning all the results in an argz (actually as an envz). I wanted to
be able to tweak anything via guile. So I added guile to the project,
giving me the ability to tweak from a script file without recompiling.

Brian

On Tue, 6 Jan 2004, Daniel Skarda wrote:
> Hello,
>
> Brian S McQueen <bqueen@nas.nasa.gov> writes:
> > I have the beginnings of a SMOB library for GNU libc's argz. I will also
> > be adding the envz functionality.
>
> I am sorry I do not see what is it good for. Could you please explain what is
> your itch you are trying to scratch with this SMOB library?
>
> 0.

* Re: argz SMOB
From: Daniel Skarda @ 2004-01-09 14:08 UTC
Cc: guile-user

> I have several libraries in use here which make database queries,
> returning all the results in an argz (actually as an envz). I wanted to
> be able to tweak anything via guile. So I added guile to the project,
> giving me the ability to tweak from a script file without recompiling.

Before I sent you my message, I read the argz/envz description in the
libc reference manual and wondered why anybody would want to use such a
strange data structure (which I still do not understand :)

IMHO it would be better to convert the output of your database queries
to lists of strings and symbols (or alists in the case of envz). They are
a more "natural" and "convenient" way to represent data in Scheme/Lisp,
and for processing results you can use tools already present in Guile
(for-each, map, fold, ...) instead of reinventing your own for argz/envz
SMOBs.

0.

* Re: argz SMOB
From: Brian S McQueen @ 2004-01-12 16:08 UTC
Cc: guile-user

The db libraries are a fact I must live with here. But I made the SMOB
interface, started fiddling with it, and noticed that, since the null
character is allowed in a scheme string, the entire argz can be viewed
as a single string, and the guile string interface can be used. I will
try it today.

BTW - GNU C's argz and envz are quite simple to use. You can bust up a
query string into a hash-like structure with near-perl-like ease. There
is no memory allocation. Another very cool feature of GNU C is the
obstack.

I will post again when I test out the SCM string to C argz.

Brian

On Fri, 9 Jan 2004, Daniel Skarda wrote:
> > I have several libraries in use here which make database queries,
> > returning all the results in an argz (actually as an envz). I wanted to
> > be able to tweak anything via guile. So I added guile to the project,
> > giving me the ability to tweak from a script file without recompiling.
>
> Before I sent you my message, I read argz/envz description in libc reference
> manual and wondered why anybody would like to use such strange data structure.
> (which I still do not understand :)
>
> IMHO it would be better to convert output from your database queries to lists
> of strings and symbols (or alists in case of envz). They are more "natural" and
> "convenient" way for data representation in Scheme/Lisp and for processing
> results you can use tools already present in Guile (for-each, map, fold, ...)
> instead of reinventing your own for argz/envz SMOBS.
>
> 0.

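For readers who have not met argz/envz, here is a minimal sketch of the
"bust up a query string" idea from the message above. It assumes glibc
(the <argz.h> and <envz.h> headers are GNU extensions); the helper names
query_to_envz and query_field are made up for illustration and are not
part of any library discussed in this thread. Note that argz_create_sep
allocates the vector with malloc, so the caller frees it.

```c
#define _GNU_SOURCE
#include <argz.h>
#include <assert.h>
#include <envz.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "name=value&name=value" into an envz vector.
   Returns 0 on success. On success *envz is malloc'd; the caller frees it. */
static int query_to_envz(const char *query, char **envz, size_t *envz_len)
{
    assert(query != NULL);
    /* argz_create_sep copies the string and turns every '&' into '\0'. */
    return argz_create_sep(query, '&', envz, envz_len);
}

/* Hypothetical helper: look up one field by name. The result points into
   the envz buffer itself (no copy), or is NULL if the name is absent. */
static const char *query_field(const char *envz, size_t envz_len,
                               const char *name)
{
    assert(name != NULL);
    return envz_get(envz, envz_len, name);
}
```

The whole vector is one contiguous buffer described by a pointer and a
length, which is why it maps so directly onto a length-counted string.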
* Re: argz SMOB
From: Brian S McQueen @ 2004-01-15 18:43 UTC
Cc: guile-user

Since scheme strings can contain the null character, I noticed there was
no need for an argz smob. I removed it by using scheme strings instead
and it works excellently. The C functions that produce argzs are all
callable from scheme, since an argz is defined by a pointer and a
length, just as scheme strings are. The C functions are wrapped in
simple functions which use the guile API, and the trailing null char in
each argz is dropped.

I am sure that nobody is particularly interested in argzs, but some
readers may be interested in a real life example of the guile API. Below
are some examples of how I gained access to some C functions from guile.
If any of you veterans have any more constructive advice, I would be
glad to hear it, but don't ask me to get rid of the argz. They are here
to stay!

Particularly, I wonder about the best way to produce a null terminated C
string from a scheme string. I used scm_must_malloc, memcpy, memset. I
expected a ready made guile call for this purpose, but I did not find
any.

A simple call to a function which is expecting an argz and returns
nothing:

  static SCM printer_hostile_printer(SCM scm_out_buff) {

    struct argz_holder out_buff;

    SCM_ASSERT (SCM_STRINGP (scm_out_buff), scm_out_buff, SCM_ARG1,
                "printer_hostile_printer");

    out_buff.argz = SCM_STRING_CHARS(scm_out_buff);
    out_buff.argz_len = SCM_STRING_LENGTH(scm_out_buff);

    output_printer(&out_buff);

    return SCM_UNDEFINED;
  }

A call to a database query function which returns an argz full of query
results:

  static SCM get_from_db(SCM scm_login) {

    char *login_chrs;
    int login_len;
    char * login;
    struct db_parm_holder login_parm = { 0 };
    struct db_parm_holder argz_parm = { 0 };
    SCM ret_val;

    SCM_ASSERT (SCM_STRINGP (scm_login), scm_login, SCM_ARG1, "get_from_db");

    login_chrs = SCM_STRING_CHARS(scm_login);
    login_len = SCM_STRING_LENGTH(scm_login);

    login = (char *)scm_must_malloc(login_len + 1, "get_from_db");
    memcpy(login, login_chrs, login_len);
    memset(login + login_len, '\0', 1);

    set_db_in_parm_str(&login_parm, "@login", USER_ID_LEN, login);
    set_db_ret_parm_envz(&argz_parm, NULL);

    db_query_va("get_from_db", &login_parm, &argz_parm, NULL);

    //don't take the last null term on the argz
    ret_val = scm_mem2string(argz_parm.data, argz_parm.data_len - 1);

    free(argz_parm.data);
    scm_must_free(login);

    return ret_val;
  }

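The malloc/memcpy/terminate steps in get_from_db can be pulled into one
small helper. The following is only a sketch in plain C with no libguile
calls; the name copy_with_nul is hypothetical, and a real wrapper would
presumably use Guile's own allocator as the message does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd, NUL-terminated copy of the first
   len bytes of buf, or NULL if allocation fails. The caller frees it. */
static char *copy_with_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, buf, len);   /* copy exactly len bytes, embedded NULs included */
    out[len] = '\0';         /* add the terminator the C library expects */
    return out;
}
```

One caveat that applies to any such copy: if the source contains embedded
null characters (as an argz does), C string functions will only see the
part before the first one.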
* Re: argz SMOB
From: Paul Jarc @ 2004-01-16 0:21 UTC
Cc: guile-user, Daniel Skarda

Brian S McQueen <bqueen@nas.nasa.gov> wrote:
> Particularly, I wonder about the best way to produce a null
> terminated C string from a scheme string.

  SCM_STRING_COERCE_0TERMINATION_X(scheme_string);
  char* s=SCM_STRING_CHARS(scheme_string);

Or do you need a separate copy?

> static SCM printer_hostile_printer(SCM scm_out_buff) {
>
>   struct argz_holder out_buff;
>
>   SCM_ASSERT (SCM_STRINGP (scm_out_buff), scm_out_buff, SCM_ARG1,
>               "printer_hostile_printer");

You could also write that as:

  #define FUNC_NAME s_printer_hostile_printer
  SCM_DEFINE(printer_hostile_printer, "printer_hostile_printer", 1, 0, 0,
             (SCM scm_out_buff))
  {
    struct argz_holder out_buff;
    SCM_VALIDATE_STRING(SCM_ARG1, scm_out_buff);

paul

* null terminated strings (was: argz SMOB)
From: Andreas Voegele @ 2004-01-16 9:10 UTC

Paul Jarc writes:

> Brian S McQueen <bqueen@nas.nasa.gov> wrote:
>> Particularly, I wonder about the best way to produce a null
>> terminated C string from a scheme string.
>
> SCM_STRING_COERCE_0TERMINATION_X(scheme_string);
> char* s=SCM_STRING_CHARS(scheme_string);

Is it necessary or desirable to use the macro
SCM_STRING_COERCE_0TERMINATION_X if one uses Guile 1.6?

I've found the following statement in the file libguile/strings.c that
comes with Guile 1.6.4:

"[...] we promise that strings are null-terminated."

And in guile-readline/Changelog:

"Remove calls to SCM_STRING_COERCE_0TERMINATION_X. Since the
substring type is gone, all strings are 0-terminated anyway."

[parent not found: <1074245327.6733.9.camel@localhost>]
* Re: null terminated strings
From: Andreas Voegele @ 2004-01-16 10:17 UTC

Roland Orre writes:

> On Fri, 2004-01-16 at 10:10, Andreas Voegele wrote:
>
>> "Remove calls to SCM_STRING_COERCE_0TERMINATION_X. Since the
>> substring type is gone, all strings are 0-terminated anyway."
>
> Such a statement is very worrying. What happened to the promises
> that all substrings will be shared strings?

I'm new to Guile. I don't know why shared strings were removed from
Guile, but things were probably simplified greatly by this decision.
I think that it is very useful to zero terminate all strings since
embedding and extending Guile becomes easier.

It seems that you'll have to stick to Thien-Thi Nguyen's Guile
version, available at http://www.glug.org/, if you need shared
strings.

* Re: null terminated strings
From: Roland Orre @ 2004-01-16 11:02 UTC
Cc: guile-user

On Fri, 2004-01-16 at 11:17, Andreas Voegele wrote:
> Roland Orre writes:
>
> > On Fri, 2004-01-16 at 10:10, Andreas Voegele wrote:
> >
> >> "Remove calls to SCM_STRING_COERCE_0TERMINATION_X. Since the
> >> substring type is gone, all strings are 0-terminated anyway."
> >
> > Such a statement is very worrying. What happened to the promises
> > that all substrings will be shared strings?
>
> I'm new to Guile. I don't know why shared strings were removed from
> Guile, but things were probably simplified greatly by this decision.
> I think that it is very useful to zero terminate all strings since
> embedding and extending Guile becomes easier.

During all the years I've used scm and later guile (I think 14 years
now) I have very rarely had any need for null terminated strings. There
is a function in guile to convert a null terminated string to a guile
string; this I've used a few times. In C programming I often don't rely
on null terminated strings either, as there are many C library functions
that work with a length, which I consider more elegant.

The absence of shared substrings, on the other hand, means a lot of
special code to be able to do something similar without having to
copy and create strings all the time. For instance, in a conversion
routine for fixed data base tables I made some years ago I had first
used substrings. The program took 15 hours to run on a specific table.
When I changed to use shared substrings it took about 1 hour on the
same table.

I think it is quite bad to have the requirement that guile should rely
on null terminated strings, as from my point of view it is often quite
easy to get around this. In guile 1.6.4 it is also stated that all
substrings are intended to become internally shared, which I don't see
has happened yet.

> It seems that you'll have to stick to Thien-Thi Nguyen's Guile
> version, available at http://www.glug.org/, if you need shared
> strings.

OK, I haven't followed his development very carefully, but maybe I
should take a closer look. I don't know if he has caught up with the
functionality with goops and such, which I need for matrix
calculations.

	Best regards
	Roland Orre

* Re: null terminated strings
From: Andreas Voegele @ 2004-01-16 12:24 UTC

Roland Orre writes:

> On Fri, 2004-01-16 at 11:17, Andreas Voegele wrote:
>> Roland Orre writes:
>>
>> > On Fri, 2004-01-16 at 10:10, Andreas Voegele wrote:
>> >
>> >> "Remove calls to SCM_STRING_COERCE_0TERMINATION_X. Since the
>> >> substring type is gone, all strings are 0-terminated anyway."
>> >
>> > Such a statement is very worrying. What happened to the promises
>> > that all substrings will be shared strings?
>>
>> I'm new to Guile. I don't know why shared strings were removed from
>> Guile, but things were probably simplified greatly by this decision.
>> I think that it is very useful to zero terminate all strings since
>> embedding and extending Guile becomes easier.
>
> During all years I've used scm and later guile (I think 14 years now)
> I have very very rarely had any need for null terminated strings. There
> is a function in guile to convert a null terminated string to a guile
> string, this I've used a few times. In C programming I often don't
> rely on null terminated strings either as there are many C library
> functions that works with length, which I consider more elegant.

It probably depends on what you're doing with Guile. I intend to use
Guile to glue together code from a lot of existing C libraries. I'm
currently writing wrappers for several C libraries and a lot of
functions provided by these libraries require null terminated strings.

On the other hand, I agree that a stable API is important. Because of
this I also thought about using Guile 1.4 but I came to the conclusion
that Guile 1.6 is better when you need to wrap a lot of C libraries.
(I know SWIG but I don't like it very much.)

> The absence of shared substrings on the other hand means a lot of
> special code to be able to do something similar without having to
> copy and create strings all the time. For instance in a conversion
> routine for fixed data base tables I made some years ago I had first
> used substrings. The program took 15 hours to run on a specific
> table. When I changed to use shared substrings it took about 1 hour
> on the same table.

That's indeed a huge difference.

> I think it is quite bad to have the requirement that guile should
> rely on null terminated strings as it is often quite easy to come
> around this from my point of view. In guile 1.6.4 it is also
> expressed so that all substrings are intended to become internally
> shared, which I don't see have happened yet.

I'm wondering how substrings could be shared when there's also a
promise that all strings are null terminated.

* Re: null terminated strings
From: Brian S McQueen @ 2004-01-16 18:20 UTC

I know you folks are discussing an important topic regarding the design
and future of libguile, but I have a much more basic question on this
subject. I am wondering about the most practical way for a programmer
to get a C string from a scheme string. So what are some common
techniques? The approach I used was tedious, so I expect you folks have
a better way. I did this:

  login_chrs = SCM_STRING_CHARS(login_scm);
  login_len = SCM_STRING_LENGTH(login_scm);

  login = (char *)scm_must_malloc(login_len + 1, "func");
  memcpy(login, login_chrs, login_len);
  memset(login + login_len, '\0', 1);

If the scheme string is truly null terminated, this can be greatly
simplified.

Brian

* Re: null terminated strings
From: Paul Jarc @ 2004-01-16 20:36 UTC
Cc: guile-user

Brian S McQueen <bqueen@nas.nasa.gov> wrote:
> If the scheme string is truly null terminated, this can be greatly
> simplified.

Apparently, whether it is guaranteed to be terminated depends on what
version of Guile you're using. With 1.6.4, it might not be terminated
(judging by the fact that make-shared-substring exists in that
version). But regardless of whether the guarantee is there in
general, you can always make it so for a particular string using
SCM_STRING_COERCE_0TERMINATION_X, as I showed before. Versions of
Guile that guarantee termination can define that macro as a no-op to
support existing code.

FWIW, I think that (preferably copy-on-write) shared substrings are
valuable enough for performance (not to mention backward
compatibility) that Guile should not remove them for the sake of
guaranteeing termination. With shared substrings, we can still get
termination when we need it with SCM_STRING_COERCE_0TERMINATION_X, but
without shared substrings, we cannot get performance when we need it.

paul

* Re: null terminated strings
From: Tom Lord @ 2004-01-16 21:06 UTC
Cc: guile-user

    > From: prj@po.cwru.edu (Paul Jarc)

    > FWIW, I think that (preferably copy-on-write) shared substrings are
    > valuable enough for performance

mutation-effects-both shared substrings are valuable as a feature of
(extended) Scheme.

A common idiom in string-manipulating programs is to manipulate and
pass around triples (STRING START END).

It's well worthwhile to make such triples a first-class type -- the
alternative is to have to write lots of string manipulation functions
in a style where they take three actual parameters (ideally with two
of those being optional) to represent a single conceptual parameter.

And if you have that abstract data type, since it is compatible with
the RnRS requirements for the STRING? type, it may as well be a subset
of the STRING? type.

copy-on-write shared substrings are a performance feature --
mutation-effects-both shared substrings are an improvement to Scheme.

-t

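To make the (STRING START END) triple concrete, here is a rough sketch in
plain C. It does not reflect how Guile represents strings internally;
struct str_slice and the slice_* functions are hypothetical names invented
for this illustration. It shows both points raised in the thread: a write
through the slice is visible in the base string, and a slice in the middle
of a buffer cannot be null terminated in place, so a C caller needs a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A (STRING START END) triple: a view of base[start..end) that shares
   storage with the base string instead of copying it. */
struct str_slice {
    char  *base;
    size_t start;
    size_t end;
};

static struct str_slice slice_of(char *base, size_t start, size_t end)
{
    assert(base != NULL && start <= end);
    struct str_slice s = { base, start, end };
    return s;
}

static size_t slice_length(struct str_slice s)
{
    return s.end - s.start;
}

/* Writing through the slice also changes the base string. */
static void slice_set(struct str_slice s, size_t i, char c)
{
    assert(i < slice_length(s));
    s.base[s.start + i] = c;
}

/* A slice is generally not NUL-terminated in place, so hand C code a
   malloc'd, terminated copy. Returns NULL on allocation failure. */
static char *slice_to_cstring(struct str_slice s)
{
    size_t n = slice_length(s);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s.base + s.start, n);
    out[n] = '\0';
    return out;
}
```

Passing one struct instead of three separate arguments is the "single
conceptual parameter" the message argues for; the copy in
slice_to_cstring is the price paid only at the boundary with C code that
insists on a terminator.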
* Re: null terminated strings
From: Paul Jarc @ 2004-01-16 21:02 UTC
Cc: guile-user

Tom Lord <lord@emf.net> wrote:
> From: prj@po.cwru.edu (Paul Jarc)
>
> > FWIW, I think that (preferably copy-on-write) shared substrings are
> > valuable enough for performance
>
> mutation-effects-both shared substrings are valuable as a feature of
> (extended) Scheme.

I agree... now. :)

> mutation-effects-both shared substrings are an improvement to Scheme.

Unless you're a fan of pure functional languages, of course.

paul

* Re: null terminated strings
From: Roland Orre @ 2004-01-16 21:27 UTC
Cc: guile-user

On Fri, 2004-01-16 at 22:02, Paul Jarc wrote:
> Tom Lord <lord@emf.net> wrote:
>> mutation-effects-both shared substrings are an improvement to Scheme.
>
> Unless you're a fan of pure functional languages, of course.

From my point of view, fans of pure functional languages will not
usually see much of the reality. At least not as long as we are
speaking of (semi-)interpreted languages such as guile, even though I
myself, being a quite imperative scheme programmer, often lose in
contests with a clever functional programmer...

	Roland Orre

* Re: null terminated strings
From: Ken Anderson @ 2004-01-19 17:28 UTC
Cc: guile-user, prj

At 01:06 PM 1/16/2004 -0800, Tom Lord wrote:

> > From: prj@po.cwru.edu (Paul Jarc)
>
> > FWIW, I think that (preferably copy-on-write) shared substrings are
> > valuable enough for performance
>
>mutation-effects-both shared substrings are valuable as a feature of
>(extended) Scheme.
>
>A common idiom in string-manipulating programs is to manipulate and
>pass around triples (STRING START END).
>
>It's well worthwhile to make such triples a first-class type -- the
>alternative is to have to write lots of string manipulation functions
>in a style where they take three actual parameters (ideally with two
>of those being optional) to represent a single conceptual parameter.
>
>And if you have that abstract data type, since it is compatible with
>the RnRS requirements for the STRING? type, it may as well be a subset
>of the STRING? type.
>
>copy-on-write shared substrings are a performance feature --
>mutation-effects-both shared substrings are an improvement to Scheme.

Can you describe how this is an improvement to Scheme? I don't think
i've ever wanted to do this.

In Java, which does copy-on-write, i often find myself carefully copying
the substrings so they don't share structure. This is because of things
like:
- i don't know how long the underlying string (char array actually) is.
  It can be much longer than the last line i've read.
- many fields are a relatively small set of values so interning them
  helps.
- some fields become something besides strings, such as numbers.

Java only has one kind of string, which is fairly heavy weight. For
example, the string "" takes 36 bytes:

  > (describe "")
   is an instance of java.lang.String

    // from java.lang.String
    value: [C@d42d08
    offset: 0
    count: 0
    hash: 0

So, it might help to have several representations for strings. In
Guile, of course, you can play even more games because you have C
underneath.

From the applications i've seen, interning strings has been more useful.

* Re: null terminated strings
From: Per Bothner @ 2004-01-19 18:46 UTC
Cc: guile-user

Ken Anderson wrote:

> In Java, which does copy-on-write

Strings (including substrings) are immutable, so they cannot be written.
The implementation of the StringBuffer class does do copy-on-write, but
that doesn't affect substrings.

> i often find myself carefully copying the substrings so they don't
> share structure.

Why? The only reason I can think of is garbage collection: A shared
substring prevents the base from being collected.

> This is because of things like:
> - i don't know how long the underlying string (char array actuall) is.

So?

> Java only has one kind of string, which is fairly heavy weight. For
> example, the string "" takes 36 bytes:
>
>>(describe "")
> is an instance of java.lang.String
>
>   // from java.lang.String
>   value: [C@d42d08
>   offset: 0
>   count: 0
>   hash: 0

This depends on the implementation, and the version of the
implementation.

GCJ uses for "":
   object header (4 bytes on 32-bit systems)
   private Object data; /* points to itself in this case */
   private int boffset; /* offset of first char within data */
   int count; /* number of characters */
   private int cachedHashCode;
   /* chars follow if data==this */
(The data and boffset fields are only accessed by native C++ code.)

Total 20 bytes.
--
	--Per Bothner
per@bothner.com   http://per.bothner.com/

* Re: null terminated strings
From: Ken Anderson @ 2004-01-19 19:16 UTC
Cc: guile-user

At 10:46 AM 1/19/2004 -0800, Per Bothner wrote:
>Ken Anderson wrote:
>
>> In Java, which does copy-on-write
>
>String (including substrings) are immutable, so they cannot be written.
>The implementation of the StringBuffer class does do copy-on-write, but
>that doesn't affect substrings.
>
>>i often find myself carefully copying the substrings so they don't share structure.
>
>Why? The only reason I can think of is garbage collection: A shared
>substring prevents the base from being collected.

Yes. Say you do something like (this is JScheme):

  > (define text "foo bar")
  "foo bar"
  > (define r (StringReader. text))
  java.io.StringReader@9945ce
  > (define b (BufferedReader. r))
  java.io.BufferedReader@2d96f2
  > (define line (.readLine b))
  "foo bar"
  > (define a (.substring line 0 3))
  "foo"
  > (define b (.substring line 4))
  "bar"
  > (describe a)
  foo
   is an instance of java.lang.String

    // from java.lang.String
    value: [C@79e304
    offset: 0
    count: 3
    hash: 0
  ()
  > (describe b)
  bar
   is an instance of java.lang.String

    // from java.lang.String
    value: [C@79e304
    offset: 4
    count: 3
    hash: 0
  ()
  > (vector-length (.value$# a))
  80

a and b share the same char[] of size 80, which wastes a lot of space in
this case. (80 is the default string buffer size in BufferedReader).

>>This is because of things like:
>>- i don't know how long the underlying string (char array actuall) is.
>
>So?

So you don't know how much space your line is taking up.

>>Java only has one kind of string, which is fairly heavy weight. For example, the string "" takes 36 bytes:
>>
>>>(describe "")
>> is an instance of java.lang.String
>>  // from java.lang.String
>>  value: [C@d42d08
>>  offset: 0
>>  count: 0
>>  hash: 0
>
>This depends on the implementation, and the version of the
>implementation.
>
>GCJ uses for "":
>  object header (4 bytes on 32-but systems)
>  private Object data; /* points to itself in this case */
>  private int boffset; /* offset of first char within data */
>  int count; /* number of character */
>  private int cachedHashCode;
>  /* chars follow if data==this */
>(The data and boffset fields are only accessed by native C++ code.)
>
>Total 20 bytes.

Much better.
