* Pure (side-effect-free) calls into c/c++? @ 2020-01-10 22:36 Linas Vepstas 2020-01-11 14:13 ` Zelphir Kaltstahl ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Linas Vepstas @ 2020-01-10 22:36 UTC (permalink / raw) To: Guile User So, I've got lots of C code wrapped up in guile, and I'd like to declare many of these functions to be pure functions, side-effect-free, thus hopefully garnering some optimizations. Is this possible? How would I do it? A cursory google-search reveals no clues. To recap, I've got functions f and g that call into c++, but are pure (i.e. always return the same value for the same arguments). I've got user-written code that looks like this: (define (foo x) (g (f 42) (f x) (f 43)) and from what I can tell, `f` is getting called three times whenever the user calls `foo`. I could tell the user to re-write their code to cache, manually: viz: (define c42 (f 42)) (define c43 (f 43)) (define (foo x) (g c42 (f x) c43)) but asking the users to do this is .. cumbersome. And barely worth it: `f` takes under maybe 10 microseconds to run; so most simple-minded caching stunts don't pay off. But since `foo` is called millions/billions of times, I'm motivated to find something spiffy. Ideas? suggestions? -- Linas -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-10 22:36 Pure (side-effect-free) calls into c/c++? Linas Vepstas @ 2020-01-11 14:13 ` Zelphir Kaltstahl 2020-01-11 14:38 ` Matt Wette 2020-01-11 17:40 ` Linus Björnstam 2 siblings, 0 replies; 15+ messages in thread From: Zelphir Kaltstahl @ 2020-01-11 14:13 UTC (permalink / raw) To: linasvepstas; +Cc: guile-user Hello Linas, On 1/10/20 11:36 PM, Linas Vepstas wrote: > So, I've got lots of C code wrapped up in guile, and I'd like to declare > many of these functions to be pure functions, side-effect-free, thus > hopefully garnering some optimizations. Is this possible? How would I do > it? A cursory google-search reveals no clues. > > To recap, I've got functions f and g that call into c++, but are pure (i.e. > always return the same value for the same arguments). I've got > user-written code that looks like this: > > (define (foo x) > (g (f 42) (f x) (f 43)) > > and from what I can tell, `f` is getting called three times whenever the > user calls `foo`. I could tell the user to re-write their code to cache, > manually: viz: > > (define c42 (f 42)) > (define c43 (f 43)) > (define (foo x) (g c42 (f x) c43)) > > but asking the users to do this is .. cumbersome. And barely worth it: `f` > takes under maybe 10 microseconds to run; so most simple-minded caching > stunts don't pay off. But since `foo` is called millions/billions of times, > I'm motivated to find something spiffy. > > Ideas? suggestions? > > -- Linas I don't know exactly how to do it, but in theory, you could provide the user a macro, which looks for calls of `f` and makes it so, that these calls are only done once. My macro skills are not so great yet, so I don't know how to do that. I am just thinking, that in theory this should be possible, perhaps with a simplified case, that assumes, that the user does not redefine `f` inside the expression given to the macro. Just outputting the idea, not sure it is a good idea. Regards, Zelphir ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-10 22:36 Pure (side-effect-free) calls into c/c++? Linas Vepstas 2020-01-11 14:13 ` Zelphir Kaltstahl @ 2020-01-11 14:38 ` Matt Wette 2020-01-11 18:11 ` Linas Vepstas 2020-01-11 17:40 ` Linus Björnstam 2 siblings, 1 reply; 15+ messages in thread From: Matt Wette @ 2020-01-11 14:38 UTC (permalink / raw) To: guile-user On 1/10/20 2:36 PM, Linas Vepstas wrote: > So, I've got lots of C code wrapped up in guile, and I'd like to declare > many of these functions to be pure functions, side-effect-free, thus > hopefully garnering some optimizations. Is this possible? How would I do > it? A cursory google-search reveals no clues. > > To recap, I've got functions f and g that call into c++, but are pure (i.e. > always return the same value for the same arguments). I've got > user-written code that looks like this: > > (define (foo x) > (g (f 42) (f x) (f 43)) > > and from what I can tell, `f` is getting called three times whenever the > user calls `foo`. I could tell the user to re-write their code to cache, > manually: viz: > > (define c42 (f 42)) > (define c43 (f 43)) > (define (foo x) (g c42 (f x) c43)) > > but asking the users to do this is .. cumbersome. And barely worth it: `f` > takes under maybe 10 microseconds to run; so most simple-minded caching > stunts don't pay off. But since `foo` is called millions/billions of times, > I'm motivated to find something spiffy. > > Ideas? suggestions? > > -- Linas read this: http://community.schemewiki.org/?memoization and look at https://docs.racket-lang.org/memoize/index.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 14:38 ` Matt Wette @ 2020-01-11 18:11 ` Linas Vepstas 2020-01-11 18:52 ` Linas Vepstas 0 siblings, 1 reply; 15+ messages in thread From: Linas Vepstas @ 2020-01-11 18:11 UTC (permalink / raw) To: Matt Wette; +Cc: Guile User Hmm Thanks. Perhaps I should have been more clear. I'm talking about a handful of values that behave like constants, and NOT about memoization. So here's a bit more detail. The only thing the C++ code is doing is stuffing the values into a giant hash table... in C++. So its already very fast. The return value is a pointer into that hash table. Replicating that hash table a second time, in guile is ... well ... it has tens/hundreds of millions of entries, it would blow out RAM. (And the other functions, those that are actually CPU-intensive, already got the Grand Wazoo treatment for memization; some in guile, some in C++) For this particular issue, I'm thinking I need to understand macros, somehow. The point being that there is just a tiny handful of values -- some dozens, maybe hundreds, that some human has written into the code, and are being treated as if they were literal constants (because, in the C++ code, they *are* constants -- they're really just fixed entries in a symbol table) and so I want to automatically replace these by the actual constant value (the location in the c++ table). To recap: a macro that would convert (define (foo x) (g (f 42) (f x) (f 43)) into (define c42 (f 42)) (define c43 (f 43)) (define (foo x) (g c42 (f x) c43)) so that guild can treat c42 and c43 as constants (boxes, I guess). -- Linas On Sat, Jan 11, 2020 at 8:39 AM Matt Wette <matt.wette@gmail.com> wrote: > On 1/10/20 2:36 PM, Linas Vepstas wrote: > > So, I've got lots of C code wrapped up in guile, and I'd like to declare > > many of these functions to be pure functions, side-effect-free, thus > > hopefully garnering some optimizations. Is this possible? How would I do > > it? A cursory google-search reveals no clues. > > > > To recap, I've got functions f and g that call into c++, but are pure > (i.e. > > always return the same value for the same arguments). I've got > > user-written code that looks like this: > > > > (define (foo x) > > (g (f 42) (f x) (f 43)) > > > > and from what I can tell, `f` is getting called three times whenever the > > user calls `foo`. I could tell the user to re-write their code to cache, > > manually: viz: > > > > (define c42 (f 42)) > > (define c43 (f 43)) > > (define (foo x) (g c42 (f x) c43)) > > > > but asking the users to do this is .. cumbersome. And barely worth it: > `f` > > takes under maybe 10 microseconds to run; so most simple-minded caching > > stunts don't pay off. But since `foo` is called millions/billions of > times, > > I'm motivated to find something spiffy. > > > > Ideas? suggestions? > > > > -- Linas > > read this: http://community.schemewiki.org/?memoization > and look at https://docs.racket-lang.org/memoize/index.html > > > > -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 18:11 ` Linas Vepstas @ 2020-01-11 18:52 ` Linas Vepstas 2020-01-11 21:56 ` Taylan Kammer 2020-01-12 2:12 ` Linas Vepstas 0 siblings, 2 replies; 15+ messages in thread From: Linas Vepstas @ 2020-01-11 18:52 UTC (permalink / raw) To: Matt Wette; +Cc: Guile User Or, thinking aloud a bit: boxes and symbols.... So, for example, if I was able to tell apart calls (f 42) from calls (f x) where x is a "symbol" (or "variable", *see below*) referencing an integer, then, for the former case, I could create a new symbol (in guile) and attach it to a box, fill that box with whatever (f 42) would have returned. Thence-forward, any call site in the guile code that had (f 42) in it would get replaced by the symbol (or the unboxed value)... At the moment, I don't know how to tell apart 42, the literal number, from x, a symbol that references a number. (Somehow, I can't make symbol? work ...) I also don't know how to "edit" the call site (the cell, the box?) that has (f 42) as the call-target, and replace it by a constant (or a boxed constant). But my naive thinking fails: (define x 0) (symbol? x) => #f (variable? x) => #f So I guess that x is not a symbol, from the guile point of view!? This is .. confusing. What is x, then, if not a symbol? (define (foo x) (format #t "its ~A ~A\n" (symbol? x) (variable? x)) (+ x 1)) (foo 42) its #f #f (define y 66) (foo y) its #f #f How can I tell apart "42" from "y" ? -- Linas On Sat, Jan 11, 2020 at 12:11 PM Linas Vepstas <linasvepstas@gmail.com> wrote: > Hmm Thanks. Perhaps I should have been more clear. I'm talking about a > handful of values that behave like constants, and NOT about memoization. > > So here's a bit more detail. The only thing the C++ code is doing is > stuffing the values into a giant hash table... in C++. So its already very > fast. The return value is a pointer into that hash table. Replicating that > hash table a second time, in guile is ... well ... it has tens/hundreds of > millions of entries, it would blow out RAM. (And the other functions, > those that are actually CPU-intensive, already got the Grand Wazoo > treatment for memization; some in guile, some in C++) > > For this particular issue, I'm thinking I need to understand macros, > somehow. The point being that there is just a tiny handful of values -- > some dozens, maybe hundreds, that some human has written into the code, and > are being treated as if they were literal constants (because, in the C++ > code, they *are* constants -- they're really just fixed entries in a symbol > table) and so I want to automatically replace these by the actual constant > value (the location in the c++ table). To recap: a macro that would > convert > > (define (foo x) (g (f 42) (f x) (f 43)) > > into > > (define c42 (f 42)) > (define c43 (f 43)) > (define (foo x) (g c42 (f x) c43)) > > so that guild can treat c42 and c43 as constants (boxes, I guess). > > -- Linas > > > On Sat, Jan 11, 2020 at 8:39 AM Matt Wette <matt.wette@gmail.com> wrote: > >> On 1/10/20 2:36 PM, Linas Vepstas wrote: >> > So, I've got lots of C code wrapped up in guile, and I'd like to declare >> > many of these functions to be pure functions, side-effect-free, thus >> > hopefully garnering some optimizations. Is this possible? How would I >> do >> > it? A cursory google-search reveals no clues. >> > >> > To recap, I've got functions f and g that call into c++, but are pure >> (i.e. >> > always return the same value for the same arguments). I've got >> > user-written code that looks like this: >> > >> > (define (foo x) >> > (g (f 42) (f x) (f 43)) >> > >> > and from what I can tell, `f` is getting called three times whenever the >> > user calls `foo`. I could tell the user to re-write their code to cache, >> > manually: viz: >> > >> > (define c42 (f 42)) >> > (define c43 (f 43)) >> > (define (foo x) (g c42 (f x) c43)) >> > >> > but asking the users to do this is .. cumbersome. And barely worth it: >> `f` >> > takes under maybe 10 microseconds to run; so most simple-minded caching >> > stunts don't pay off. But since `foo` is called millions/billions of >> times, >> > I'm motivated to find something spiffy. >> > >> > Ideas? suggestions? >> > >> > -- Linas >> >> read this: http://community.schemewiki.org/?memoization >> and look at https://docs.racket-lang.org/memoize/index.html >> >> >> >> > > -- > cassette tapes - analog TV - film cameras - you > -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 18:52 ` Linas Vepstas @ 2020-01-11 21:56 ` Taylan Kammer 2020-01-12 2:15 ` Linas Vepstas 2020-01-12 4:12 ` Linas Vepstas 2020-01-12 2:12 ` Linas Vepstas 1 sibling, 2 replies; 15+ messages in thread From: Taylan Kammer @ 2020-01-11 21:56 UTC (permalink / raw) To: linasvepstas, Matt Wette; +Cc: Guile User On 11.01.2020 19:52, Linas Vepstas wrote: > Or, thinking aloud a bit: boxes and symbols.... > > So, for example, if I was able to tell apart calls (f 42) from calls (f x) > where x is a "symbol" (or "variable", *see below*) referencing an integer, > then, for the former case, I could create a new symbol (in guile) and > attach it to a box, fill that box with whatever (f 42) would have returned. > Thence-forward, any call site in the guile code that had (f 42) in it would > get replaced by the symbol (or the unboxed value)... > > At the moment, I don't know how to tell apart 42, the literal number, from > x, a symbol that references a number. (Somehow, I can't make symbol? work > ...) I also don't know how to "edit" the call site (the cell, the box?) > that has (f 42) as the call-target, and replace it by a constant (or a > boxed constant). > > But my naive thinking fails: > (define x 0) > (symbol? x) => #f > (variable? x) => #f > > So I guess that x is not a symbol, from the guile point of view!? This is > .. confusing. What is x, then, if not a symbol? The issue here is that the expression "(symbol? x)" first evaluates the expressions "symbol?" and "x" and then uses their values to go on. So it ends up being: (<value-of-symbol?> <value-of-x>) So by the time the procedure behind "symbol?" is called, it neither knows that it was called "symbol?" nor does it know that the value it received, i.e. the integer 0, came from a variable called "x". What you want to do is delve down to the macro layer so to say, by using "define-syntax" and ideally "syntax-case". Here's a demonstration: (define-syntax symbol-syntax? (lambda (stx) (syntax-case stx () ((_ x) (if (symbol? (syntax->datum #'x)) #'(display "yep\n") #'(display "nope\n")))))) scheme> (symbol-syntax? blah) yep scheme> (symbol-syntax? (blah)) nope What happens here is the following. Going through it part by part. (define-syntax symbol-syntax? (lambda (stx) ... You register a procedure (lambda (stx) ...) as a macro which you bind to "symbol-syntax?". Now the expression "(symbol-syntax? blah)" will not be evaluated in the regular fashion, because Guile sees that "symbol-syntax?" is registered as a macro. Instead of trying to get the values of "symbol-syntax?" and "blah" like it happened with "(symbol? x)", this time Guile calls the procedure registered with "symbol-syntax?" immediately, and passes it a "syntax object" that represents the whole expression "(symbol-syntax? blah)". (Not just the "blah" argument but the whole expression, don't ask why.) (syntax-case stx () ((_ x) ... Within the procedure, we use "syntax-case" to do pattern-matching on the expression "(symbol-syntax? blah)" that is stored in the variable "stx". We know that the first thing in the expression is "symbol-syntax?" because otherwise we wouldn't have been invoked in the first place, so we want to ignore that, hence we use the pattern "(_ x)" which ignores the first thing (via the underscore) and binds the second thing to "x". That second thing is "blah" so "x" contains that now. (if (symbol? (syntax->datum #'x)) ... Now "x" contains the element "blah" but to your probable surprise, it's not a symbol but rather a syntax object. Thankfully we can just call "syntax->datum" to get the corresponding data value, which is going to be the symbol "blah". You're probably wondering why we wrote #'x instead of just x in our call to "syntax->datum" and to be honest I'm not 100% clear on the reason to this day, but it's how syntax-case expects you to use pattern variables. You're never allowed to reference them "bare". #'(display "yep\n") Finally, we use #' to create a whole new syntax object, containing the expression (display "yep\n"). That syntax object is the value returned from our procedure, so Guile puts it where "(symbol-syntax? blah)" was. When you compile the code, all that actually ends up in the program is "(display ...)" with no trace of the original "(symbol-syntax? ...)" expression being left. That's what macros do; they replace code at compile time and leave no trace of themselves. In the second call, "(symbol-syntax? (blah))", the pattern variable "x" ends up containing a syntax object representing "(blah)", and calling "syntax->datum" on that yields a list and not a symbol, hence we end up returning '(display "nope\n")' from the macro. ---------- All that might be daunting at first but after a while it becomes natural, and the power it gives the programmer is really like nothing else. :-) What you intend to do, which is traverse through a whole lambda body and find instances of "(f 42)" to replace them, might be a bit tricky. Let's say you've written a macro "my-define" which does that, then consider the following definition: (my-define (foo x) (let ((f 42)) (+ f x))) Now you probably don't want to turn that "(f 42)" into anything else, because it's not really a call to your "f". I've never written a macro yet which does something like that, so I'm not sure I can help, but I'm happy to respond to questions about the general workings of the macro system. - Taylan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 21:56 ` Taylan Kammer @ 2020-01-12 2:15 ` Linas Vepstas 2020-01-12 4:12 ` Linas Vepstas 1 sibling, 0 replies; 15+ messages in thread From: Linas Vepstas @ 2020-01-12 2:15 UTC (permalink / raw) To: Taylan Kammer; +Cc: Guile User Thanks Taylan, gmail is on the fritz lately and doesn't show replies until after I post; let me read what you wrote and ponder. ~~~ On Sat, Jan 11, 2020 at 3:56 PM Taylan Kammer <taylan.kammer@gmail.com> wrote: > On 11.01.2020 19:52, Linas Vepstas wrote: > > Or, thinking aloud a bit: boxes and symbols.... > > > > So, for example, if I was able to tell apart calls (f 42) from calls (f > x) > > where x is a "symbol" (or "variable", *see below*) referencing an > integer, > > then, for the former case, I could create a new symbol (in guile) and > > attach it to a box, fill that box with whatever (f 42) would have > returned. > > Thence-forward, any call site in the guile code that had (f 42) in it > would > > get replaced by the symbol (or the unboxed value)... > > > > At the moment, I don't know how to tell apart 42, the literal number, > from > > x, a symbol that references a number. (Somehow, I can't make symbol? work > > ...) I also don't know how to "edit" the call site (the cell, the box?) > > that has (f 42) as the call-target, and replace it by a constant (or a > > boxed constant). > > > > But my naive thinking fails: > > (define x 0) > > (symbol? x) => #f > > (variable? x) => #f > > > > So I guess that x is not a symbol, from the guile point of view!? This > is > > .. confusing. What is x, then, if not a symbol? > > The issue here is that the expression "(symbol? x)" first evaluates the > expressions "symbol?" and "x" and then uses their values to go on. So > it ends up being: > > (<value-of-symbol?> <value-of-x>) > > So by the time the procedure behind "symbol?" is called, it neither > knows that it was called "symbol?" nor does it know that the value it > received, i.e. the integer 0, came from a variable called "x". > > What you want to do is delve down to the macro layer so to say, by using > "define-syntax" and ideally "syntax-case". > > Here's a demonstration: > > (define-syntax symbol-syntax? > (lambda (stx) > (syntax-case stx () > ((_ x) > (if (symbol? (syntax->datum #'x)) > #'(display "yep\n") > #'(display "nope\n")))))) > > scheme> (symbol-syntax? blah) > yep > scheme> (symbol-syntax? (blah)) > nope > > What happens here is the following. Going through it part by part. > > (define-syntax symbol-syntax? > (lambda (stx) > ... > > You register a procedure (lambda (stx) ...) as a macro which you bind to > "symbol-syntax?". > > Now the expression "(symbol-syntax? blah)" will not be evaluated in the > regular fashion, because Guile sees that "symbol-syntax?" is registered > as a macro. > > Instead of trying to get the values of "symbol-syntax?" and "blah" like > it happened with "(symbol? x)", this time Guile calls the procedure > registered with "symbol-syntax?" immediately, and passes it a "syntax > object" that represents the whole expression "(symbol-syntax? blah)". > (Not just the "blah" argument but the whole expression, don't ask why.) > > (syntax-case stx () > ((_ x) > ... > > Within the procedure, we use "syntax-case" to do pattern-matching on the > expression "(symbol-syntax? blah)" that is stored in the variable "stx". > > We know that the first thing in the expression is "symbol-syntax?" > because otherwise we wouldn't have been invoked in the first place, so > we want to ignore that, hence we use the pattern "(_ x)" which ignores > the first thing (via the underscore) and binds the second thing to "x". > That second thing is "blah" so "x" contains that now. > > (if (symbol? (syntax->datum #'x)) > ... > > Now "x" contains the element "blah" but to your probable surprise, it's > not a symbol but rather a syntax object. Thankfully we can just call > "syntax->datum" to get the corresponding data value, which is going to > be the symbol "blah". > > You're probably wondering why we wrote #'x instead of just x in our call > to "syntax->datum" and to be honest I'm not 100% clear on the reason to > this day, but it's how syntax-case expects you to use pattern variables. > You're never allowed to reference them "bare". > > #'(display "yep\n") > > Finally, we use #' to create a whole new syntax object, containing the > expression (display "yep\n"). That syntax object is the value returned > from our procedure, so Guile puts it where "(symbol-syntax? blah)" was. > > When you compile the code, all that actually ends up in the program is > "(display ...)" with no trace of the original "(symbol-syntax? ...)" > expression being left. That's what macros do; they replace code at > compile time and leave no trace of themselves. > > In the second call, "(symbol-syntax? (blah))", the pattern variable "x" > ends up containing a syntax object representing "(blah)", and calling > "syntax->datum" on that yields a list and not a symbol, hence we end up > returning '(display "nope\n")' from the macro. > > ---------- > > All that might be daunting at first but after a while it becomes > natural, and the power it gives the programmer is really like nothing > else. :-) > > What you intend to do, which is traverse through a whole lambda body and > find instances of "(f 42)" to replace them, might be a bit tricky. > Let's say you've written a macro "my-define" which does that, then > consider the following definition: > > (my-define (foo x) > (let ((f 42)) > (+ f x))) > > Now you probably don't want to turn that "(f 42)" into anything else, > because it's not really a call to your "f". > > I've never written a macro yet which does something like that, so I'm > not sure I can help, but I'm happy to respond to questions about the > general workings of the macro system. > > > - Taylan > -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 21:56 ` Taylan Kammer 2020-01-12 2:15 ` Linas Vepstas @ 2020-01-12 4:12 ` Linas Vepstas 1 sibling, 0 replies; 15+ messages in thread From: Linas Vepstas @ 2020-01-12 4:12 UTC (permalink / raw) To: Taylan Kammer; +Cc: Guile User Thank you Taylan! So let me seize on this statement: On Sat, Jan 11, 2020 at 3:56 PM Taylan Kammer <taylan.kammer@gmail.com> wrote: > On 11.01.2020 19:52, Linas Vepstas wrote: > > When you compile the code, all that actually ends up in the program is > "(display ...)" with no trace of the original "(symbol-syntax? ...)" > expression being left. That's what macros do; they replace code at > compile time and leave no trace of themselves. > OK! So .. now how do I combine that with compile-time memoization? That is, I want to have the memoization run in the macro itself, during macro expansion, and have the macro evaluate into that result (i.e. the macro expands into a runtime constant) My first attempt fails. I create a file with the following contents: (use-modules (srfi srfi-1)) ; The function in question. It needs to become a runtime ; constant, for constant arguments. (define (bar x) (format #t "Called bar with ~A\n" x) (+ x 1)) ; Memoization boilerplate (define cache (make-hash-table)) (define (int-hash INT SZ) (modulo INT SZ)) (define (int-assoc INT ILIST) (find (lambda (pr) (equal? INT (car pr))) ILIST)) : First (failed) attempt at compile-time memoization (define-syntax foob (syntax-rules () ((foob EXP) (if (symbol? (quote EXP)) (begin (display "Its a symbol\n") (bar EXP)) (let ((junk (format #t "A hash lookup is happening for ~A\n" EXP)) (const-val (hashx-ref int-hash int-assoc cache EXP))) (display "It's a constant!\n") (if (not const-val) (begin (set! const-val (bar EXP)) (hashx-set! int-hash int-assoc cache EXP const-val))) const-val))))) The I compile it: (implictly) scheme@(guile-user)> (load "/tmp/mem.scm") ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0 ;;; or pass the --no-auto-compile argument to disable. ;;; compiling /tmp/mem.scm ;;; compiled /home/ubuntu/.cache/guile/ccache/3.0-LE-8-4.1/tmp/mem.scm.go Hmm OK. Lets try it out: scheme@(guile-user)> (define x 68) scheme@(guile-user)> (+ (foob 42) (foob x) (foob 42)) A hash lookup is happening for 42 It's a constant! Called bar with 42 Its a symbol Called bar with 68 A hash lookup is happening for 42 It's a constant! $1 = 155 So, I don't mind the first four print statements. The fifth statement I object to: I was hoping that it would not print; that the macro expansion would cause the second occurrence of (foob 42) to simply be expanded into "43" (because the macro looks it up, and uses the value it found). Now, I understand why my code doesn't work. But I can't quite figure out how to make it run the way I'm describing it: to force the macro to expand the second (and subsequent) (foob 42) into the value 43. (Side issue: how do I tell apart `42` from `(* 33 x)` ? The earlier syntax-symbol? cannot tell them apart) -- Linas p.s. re pattern-matching: my c++ system is a giant-size pattern matcher; it can unify multiple clauses, defining a graph, and then run the pattern-match over huge datasets (e.g. genomic/proteomic/reactomic data) looking for specific pathways. -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 18:52 ` Linas Vepstas 2020-01-11 21:56 ` Taylan Kammer @ 2020-01-12 2:12 ` Linas Vepstas 2020-01-12 3:21 ` Taylan Kammer 1 sibling, 1 reply; 15+ messages in thread From: Linas Vepstas @ 2020-01-12 2:12 UTC (permalink / raw) To: Matt Wette; +Cc: Guile User To answer my own question, it appears doable with a very simply macro, then: (define (bar x) (format #t "Called bar with ~A\n" x) (+ x 1)) (define (memoize FUNC) " memoize a function FUNC which takes a single int as argument " (define cache (make-hash-table)) (define (int-hash INT SZ) (modulo INT SZ)) (define (int-assoc INT ILIST) (find (lambda (pr) (equal? INT (car pr))) ILIST)) (lambda (ITEM) (define val (hashx-ref int-hash int-assoc cache ITEM)) (if val val (let ((fv (FUNC ITEM))) (hashx-set! int-hash int-assoc cache ITEM fv) fv))) ) (define bar-memo (memoize bar)) (define-syntax foo (syntax-rules () ((foo exp) (if (symbol? (quote exp)) (begin (display "its a symb\n") (bar exp)) (begin (display "no its not\n") (bar-memo exp)))))) scheme@(guile-user)> (define x 66) scheme@(guile-user)> (foo x) its a symb Called bar with 66 $1 = 67 scheme@(guile-user)> (foo 42) no its not Called bar with 42 $2 = 43 scheme@(guile-user)> (foo 42) no its not $3 = 43 scheme@(guile-user)> (foo 42) no its not $4 = 43 scheme@(guile-user)> (foo 42) no its not $5 = 43 scheme@(guile-user)> (foo x) its a symb Called bar with 66 $6 = 67 scheme@(guile-user)> (foo 68) no its not Called bar with 68 $7 = 69 scheme@(guile-user)> (foo 68) no its not $8 = 69 So that works -- I can now memoize all constants, can pass all variables on to the c++ function. Now if I could just memoize-in-place, without the hash table ... --linas On Sat, Jan 11, 2020 at 12:52 PM Linas Vepstas <linasvepstas@gmail.com> wrote: > Or, thinking aloud a bit: boxes and symbols.... > > So, for example, if I was able to tell apart calls (f 42) from calls (f x) > where x is a "symbol" (or "variable", *see below*) referencing an integer, > then, for the former case, I could create a new symbol (in guile) and > attach it to a box, fill that box with whatever (f 42) would have returned. > Thence-forward, any call site in the guile code that had (f 42) in it would > get replaced by the symbol (or the unboxed value)... > > At the moment, I don't know how to tell apart 42, the literal number, from > x, a symbol that references a number. (Somehow, I can't make symbol? work > ...) I also don't know how to "edit" the call site (the cell, the box?) > that has (f 42) as the call-target, and replace it by a constant (or a > boxed constant). > > But my naive thinking fails: > (define x 0) > (symbol? x) => #f > (variable? x) => #f > > So I guess that x is not a symbol, from the guile point of view!? This is > .. confusing. What is x, then, if not a symbol? > > (define (foo x) (format #t "its ~A ~A\n" (symbol? x) (variable? x)) (+ x > 1)) > (foo 42) > its #f #f > (define y 66) > (foo y) > its #f #f > > How can I tell apart "42" from "y" ? > > -- Linas > > > > On Sat, Jan 11, 2020 at 12:11 PM Linas Vepstas <linasvepstas@gmail.com> > wrote: > >> Hmm Thanks. Perhaps I should have been more clear. I'm talking about a >> handful of values that behave like constants, and NOT about memoization. >> >> So here's a bit more detail. The only thing the C++ code is doing is >> stuffing the values into a giant hash table... in C++. So its already very >> fast. The return value is a pointer into that hash table. Replicating that >> hash table a second time, in guile is ... well ... it has tens/hundreds of >> millions of entries, it would blow out RAM. (And the other functions, >> those that are actually CPU-intensive, already got the Grand Wazoo >> treatment for memization; some in guile, some in C++) >> >> For this particular issue, I'm thinking I need to understand macros, >> somehow. The point being that there is just a tiny handful of values -- >> some dozens, maybe hundreds, that some human has written into the code, and >> are being treated as if they were literal constants (because, in the C++ >> code, they *are* constants -- they're really just fixed entries in a symbol >> table) and so I want to automatically replace these by the actual constant >> value (the location in the c++ table). To recap: a macro that would >> convert >> >> (define (foo x) (g (f 42) (f x) (f 43)) >> >> into >> >> (define c42 (f 42)) >> (define c43 (f 43)) >> (define (foo x) (g c42 (f x) c43)) >> >> so that guild can treat c42 and c43 as constants (boxes, I guess). >> >> -- Linas >> >> >> On Sat, Jan 11, 2020 at 8:39 AM Matt Wette <matt.wette@gmail.com> wrote: >> >>> On 1/10/20 2:36 PM, Linas Vepstas wrote: >>> > So, I've got lots of C code wrapped up in guile, and I'd like to >>> declare >>> > many of these functions to be pure functions, side-effect-free, thus >>> > hopefully garnering some optimizations. Is this possible? How would I >>> do >>> > it? A cursory google-search reveals no clues. >>> > >>> > To recap, I've got functions f and g that call into c++, but are pure >>> (i.e. >>> > always return the same value for the same arguments). I've got >>> > user-written code that looks like this: >>> > >>> > (define (foo x) >>> > (g (f 42) (f x) (f 43)) >>> > >>> > and from what I can tell, `f` is getting called three times whenever >>> the >>> > user calls `foo`. I could tell the user to re-write their code to >>> cache, >>> > manually: viz: >>> > >>> > (define c42 (f 42)) >>> > (define c43 (f 43)) >>> > (define (foo x) (g c42 (f x) c43)) >>> > >>> > but asking the users to do this is .. cumbersome. And barely worth >>> it: `f` >>> > takes under maybe 10 microseconds to run; so most simple-minded caching >>> > stunts don't pay off. But since `foo` is called millions/billions of >>> times, >>> > I'm motivated to find something spiffy. >>> > >>> > Ideas? suggestions? >>> > >>> > -- Linas >>> >>> read this: http://community.schemewiki.org/?memoization >>> and look at https://docs.racket-lang.org/memoize/index.html >>> >>> >>> >>> >> >> -- >> cassette tapes - analog TV - film cameras - you >> > > > -- > cassette tapes - analog TV - film cameras - you > -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-12 2:12 ` Linas Vepstas @ 2020-01-12 3:21 ` Taylan Kammer 2020-01-12 4:32 ` Linas Vepstas 0 siblings, 1 reply; 15+ messages in thread From: Taylan Kammer @ 2020-01-12 3:21 UTC (permalink / raw) To: linasvepstas, Matt Wette; +Cc: Guile User On 12.01.2020 03:12, Linas Vepstas wrote: > To answer my own question, it appears doable with a very simply macro, then: > > (define (bar x) > (format #t "Called bar with ~A\n" x) > (+ x 1)) > > (define (memoize FUNC) > " > memoize a function FUNC which takes a single int as argument > " > (define cache (make-hash-table)) > (define (int-hash INT SZ) (modulo INT SZ)) > (define (int-assoc INT ILIST) > (find (lambda (pr) (equal? INT (car pr))) ILIST)) > > (lambda (ITEM) > (define val (hashx-ref int-hash int-assoc cache ITEM)) > (if val val > (let ((fv (FUNC ITEM))) > (hashx-set! int-hash int-assoc cache ITEM fv) > fv))) > ) > > (define bar-memo (memoize bar)) > > (define-syntax foo > (syntax-rules () > ((foo exp) > (if (symbol? (quote exp)) > (begin (display "its a symb\n") (bar exp)) > (begin (display "no its not\n") (bar-memo exp)))))) Using (quote exp) in the macro output to determine if the input was a symbol is... smart I guess! There's a problem in your approach though: the memoization and the lookup of the memoized value will happen at run-time, because your macro is merely emitting code that calls the function with the memoization cache. This also means, of course, that checking whether the argument was a symbol is useless. In your (lambda (ITEM) ...) definition, insert some (display ...) lines to print "was already memoized" or "will now be memoized" and you will see what I mean. Here's how calls to the function might then look: scheme> (define x 66) scheme> (bar-memo x) will now be memoized $1 = 67 scheme> (bar-memo x) was already memoized $2 = 67 scheme> (set! x 42) scheme> (bar-memo x) will now be memoized $3 = 43 scheme> (bar-memo x) was already memoized $4 = 43 It's important to understand that Scheme uses "pass-by-value" semantics in procedure calls. That means that a procedure doesn't know how the values it receives were computed -- whether they were constants, or variable references, or the result of another function call. Consider: (foo 42) (let ((x 42)) (foo x)) (foo (+ 21 21)) In all three cases, foo receives the same value, 42, and doesn't know how it arrived. I'm not sure if memoization is really what you want, since it's a run-time caching mechanism. It might be possible to create a sort of "compile-time memoization" but it would be quite complicated. For each constant value the macro receives, it would emit a global variable definition that will be bound to the call to the actual function at run-time, and then each call with the same constant would emit a reference to that variable. So the following: (display (f-memo 42)) (display (f-memo 66)) (display (f-memo 42)) (display (f-memo 66)) would magically emit code like: (define _x1 (f 42)) (define _x2 (f 66)) (display _x1) (display _x2) (display _x1) (display _x2) But I'm not sure how I'd write that hypothetical f-memo macro. You can't really make a macro-call deep within some code emit a top-level variable definition. All I can imagine are some pretty dirty methods that probably aren't worth the complexity. - Taylan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-12 3:21 ` Taylan Kammer @ 2020-01-12 4:32 ` Linas Vepstas 2020-01-12 5:52 ` Linas Vepstas 0 siblings, 1 reply; 15+ messages in thread From: Linas Vepstas @ 2020-01-12 4:32 UTC (permalink / raw) To: Taylan Kammer; +Cc: Guile User Hi Taylan, Our emails are crossing in the ether... On Sat, Jan 11, 2020 at 9:21 PM Taylan Kammer <taylan.kammer@gmail.com> wrote: > > It might be possible to create a sort of "compile-time memoization" Yes, that's what I'm looking for... So the following: > > (display (f-memo 42)) > (display (f-memo 66)) > (display (f-memo 42)) > (display (f-memo 66)) > > would magically emit code like: > > (define _x1 (f 42)) > (define _x2 (f 66)) > (display _x1) > (display _x2) > (display _x1) > (display _x2) > > But I'm not sure how I'd write that hypothetical f-memo macro. > Well, how about a simpler case, then: an incrementing counter? (define-syntax incr (syntax-rules () ((incr) ... something??? ...))) so that (display (incr)) (display (incr)) (display (incr)) emits (define _x1 1) (display _x1) (define _x2 2) (display _x2) (define _x3 3) (display _x3) The hard part, from what I can tell, is making the macro stateful, (or continuable, or whatever you want to call it), to have it remember where it last left off with the counter. I presume that emitting the defines "shouldn't be that hard", just some kind of string pasting with the stateful macro state. -- Linas -- cassette tapes - analog TV - film cameras - you ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-12 4:32 ` Linas Vepstas @ 2020-01-12 5:52 ` Linas Vepstas 0 siblings, 0 replies; 15+ messages in thread From: Linas Vepstas @ 2020-01-12 5:52 UTC (permalink / raw) To: Taylan Kammer; +Cc: Guile User To answer my own question, I've now got a glimmer as to how to do most of this, using syntax-case and quasi-syntax, as described in https://www.gnu.org/software/guile/manual/html_node/Syntax-Case.html#Syntax-Case with the display-compile-timestamp example, and then cleverly reworking that. ...Basically, replacing the (current-time) in that example, with the memoization code, thus providing compile-time memoization. I'll post when I get something working-ish. Tomorrow-ish. Now all I need is a way to tell apart `42` from `(+ x y)` ... --linas ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-10 22:36 Pure (side-effect-free) calls into c/c++? Linas Vepstas 2020-01-11 14:13 ` Zelphir Kaltstahl 2020-01-11 14:38 ` Matt Wette @ 2020-01-11 17:40 ` Linus Björnstam 2020-01-12 3:03 ` Christopher Lam 2 siblings, 1 reply; 15+ messages in thread From: Linus Björnstam @ 2020-01-11 17:40 UTC (permalink / raw) To: guile-user I have a macro called lambda/memo and define/memo for these situations: https://hg.sr.ht/~bjoli/misc/browse/default/memoize.scm If the function gets called with a gazillion different arguments the memoizatiin hash gets large, and there are no mechanisms to stop that from happening. It also lacks a fast path for single argument functions. You can disregard the repo license. Use that function is you like, if you like to. -- Linus Björnstam On Fri, 10 Jan 2020, at 23:36, Linas Vepstas wrote: > So, I've got lots of C code wrapped up in guile, and I'd like to declare > many of these functions to be pure functions, side-effect-free, thus > hopefully garnering some optimizations. Is this possible? How would I do > it? A cursory google-search reveals no clues. > > To recap, I've got functions f and g that call into c++, but are pure (i.e. > always return the same value for the same arguments). I've got > user-written code that looks like this: > > (define (foo x) > (g (f 42) (f x) (f 43)) > > and from what I can tell, `f` is getting called three times whenever the > user calls `foo`. I could tell the user to re-write their code to cache, > manually: viz: > > (define c42 (f 42)) > (define c43 (f 43)) > (define (foo x) (g c42 (f x) c43)) > > but asking the users to do this is .. cumbersome. And barely worth it: `f` > takes under maybe 10 microseconds to run; so most simple-minded caching > stunts don't pay off. But since `foo` is called millions/billions of times, > I'm motivated to find something spiffy. > > Ideas? suggestions? > > -- Linas > -- > cassette tapes - analog TV - film cameras - you > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-11 17:40 ` Linus Björnstam @ 2020-01-12 3:03 ` Christopher Lam 2020-01-12 10:35 ` Linus Björnstam 0 siblings, 1 reply; 15+ messages in thread From: Christopher Lam @ 2020-01-12 3:03 UTC (permalink / raw) To: Linus Björnstam; +Cc: guile-user I can add a contribution! The good thing about memoize is it's simple to create. You forgot a catch however: if the memoized return-val is #f then your memoizer https://hg.sr.ht/~bjoli/misc/browse/default/memoize.scm will not recognise that #f is a valid cached return-val and will call the lambda again. (FWIW I shudder think what a *fast* memoizer would do). Here's how I did mine: (define (memoize f) (let ((h (make-hash-table))) (lambda args (cond ((hash-ref h args) => car) (else (let ((res (apply f args))) (hash-set! h args (list res)) res)))))) (define-syntax-rule (lambda/macro args body ...) (memoize (lambda args body ...))) (define-syntax-rule (define/macro (f . args) body ...) (define f lambda/macro args body ...)) On Sat, 11 Jan 2020 at 17:40, Linus Björnstam <linus.internet@fastmail.se> wrote: > I have a macro called lambda/memo and define/memo for these situations: > https://hg.sr.ht/~bjoli/misc/browse/default/memoize.scm > > If the function gets called with a gazillion different arguments the > memoizatiin hash gets large, and there are no mechanisms to stop that from > happening. It also lacks a fast path for single argument functions. > > You can disregard the repo license. Use that function is you like, if you > like to. > > > -- > Linus Björnstam > > On Fri, 10 Jan 2020, at 23:36, Linas Vepstas wrote: > > So, I've got lots of C code wrapped up in guile, and I'd like to declare > > many of these functions to be pure functions, side-effect-free, thus > > hopefully garnering some optimizations. Is this possible? How would I do > > it? A cursory google-search reveals no clues. > > > > To recap, I've got functions f and g that call into c++, but are pure > (i.e. > > always return the same value for the same arguments). I've got > > user-written code that looks like this: > > > > (define (foo x) > > (g (f 42) (f x) (f 43)) > > > > and from what I can tell, `f` is getting called three times whenever the > > user calls `foo`. I could tell the user to re-write their code to cache, > > manually: viz: > > > > (define c42 (f 42)) > > (define c43 (f 43)) > > (define (foo x) (g c42 (f x) c43)) > > > > but asking the users to do this is .. cumbersome. And barely worth it: > `f` > > takes under maybe 10 microseconds to run; so most simple-minded caching > > stunts don't pay off. But since `foo` is called millions/billions of > times, > > I'm motivated to find something spiffy. > > > > Ideas? suggestions? > > > > -- Linas > > -- > > cassette tapes - analog TV - film cameras - you > > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Pure (side-effect-free) calls into c/c++? 2020-01-12 3:03 ` Christopher Lam @ 2020-01-12 10:35 ` Linus Björnstam 0 siblings, 0 replies; 15+ messages in thread From: Linus Björnstam @ 2020-01-12 10:35 UTC (permalink / raw) To: Christopher Lam; +Cc: guile-user Thanks! I haven't used this macro in a billion years and wrote it as a comfort thingie for the first 100 project Euler problems. This thread got me to start thinking about how to memoize "smarter" and having a macro that allows you to trade a bit of speed for being able to specifically memoize the last n or the most common n arguments. -- Linus Björnstam On Sun, 12 Jan 2020, at 04:03, Christopher Lam wrote: > I can add a contribution! The good thing about memoize is it's simple > to create. You forgot a catch however: if the memoized return-val is #f > then your memoizer > https://hg.sr.ht/~bjoli/misc/browse/default/memoize.scm will not > recognise that #f is a valid cached return-val and will call the lambda > again. (FWIW I shudder think what a *fast* memoizer would do). > > Here's how I did mine: > > (define (memoize f) > (let ((h (make-hash-table))) > (lambda args > (cond > ((hash-ref h args) => car) > (else (let ((res (apply f args))) > (hash-set! h args (list res)) > res)))))) > > (define-syntax-rule (lambda/macro args body ...) > (memoize (lambda args body ...))) > > (define-syntax-rule (define/macro (f . args) body ...) > (define f lambda/macro args body ...)) > > > > On Sat, 11 Jan 2020 at 17:40, Linus Björnstam > <linus.internet@fastmail.se> wrote: > > I have a macro called lambda/memo and define/memo for these situations: https://hg.sr.ht/~bjoli/misc/browse/default/memoize.scm > > > > If the function gets called with a gazillion different arguments the memoizatiin hash gets large, and there are no mechanisms to stop that from happening. It also lacks a fast path for single argument functions. > > > > You can disregard the repo license. Use that function is you like, if you like to. > > > > > > -- > > Linus Björnstam > > > > On Fri, 10 Jan 2020, at 23:36, Linas Vepstas wrote: > > > So, I've got lots of C code wrapped up in guile, and I'd like to declare > > > many of these functions to be pure functions, side-effect-free, thus > > > hopefully garnering some optimizations. Is this possible? How would I do > > > it? A cursory google-search reveals no clues. > > > > > > To recap, I've got functions f and g that call into c++, but are pure (i.e. > > > always return the same value for the same arguments). I've got > > > user-written code that looks like this: > > > > > > (define (foo x) > > > (g (f 42) (f x) (f 43)) > > > > > > and from what I can tell, `f` is getting called three times whenever the > > > user calls `foo`. I could tell the user to re-write their code to cache, > > > manually: viz: > > > > > > (define c42 (f 42)) > > > (define c43 (f 43)) > > > (define (foo x) (g c42 (f x) c43)) > > > > > > but asking the users to do this is .. cumbersome. And barely worth it: `f` > > > takes under maybe 10 microseconds to run; so most simple-minded caching > > > stunts don't pay off. But since `foo` is called millions/billions of times, > > > I'm motivated to find something spiffy. > > > > > > Ideas? suggestions? > > > > > > -- Linas > > > -- > > > cassette tapes - analog TV - film cameras - you > > > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-01-12 10:35 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-01-10 22:36 Pure (side-effect-free) calls into c/c++? Linas Vepstas 2020-01-11 14:13 ` Zelphir Kaltstahl 2020-01-11 14:38 ` Matt Wette 2020-01-11 18:11 ` Linas Vepstas 2020-01-11 18:52 ` Linas Vepstas 2020-01-11 21:56 ` Taylan Kammer 2020-01-12 2:15 ` Linas Vepstas 2020-01-12 4:12 ` Linas Vepstas 2020-01-12 2:12 ` Linas Vepstas 2020-01-12 3:21 ` Taylan Kammer 2020-01-12 4:32 ` Linas Vepstas 2020-01-12 5:52 ` Linas Vepstas 2020-01-11 17:40 ` Linus Björnstam 2020-01-12 3:03 ` Christopher Lam 2020-01-12 10:35 ` Linus Björnstam
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).