* Some questions @ 2002-05-09 5:59 Lynn Winebarger 0 siblings, 0 replies; 4+ messages in thread From: Lynn Winebarger @ 2002-05-09 5:59 UTC (permalink / raw) So, I've been reading guile some more. I have some questions that I'm slowly answering myself by reading the source, but might be answered faster by people already in the know. I don't understand the last 3 lines of scm_ilookup: if (SCM_ICDRP (iloc)) return SCM_CDRLOC (er); return SCM_CARLOC (SCM_CDR (er)); I mean, I understand the operations, I just don't understand what the storage scheme behind the operations is. (What is it?) Memoized objects: what's the deal? The manual mentions them several times but never defines them. The evaluator (in particular lookupcar) seems to memoize everything for evaluation purposes but memoized objects are defined in debug.c which should mean they are for debugging purposes. I'm catting on a "roadmap" file I've jotted down that might be useful to others who haven't looked at the source or are scared off by it. Please correct anything I've got wrong. Yes, they're absolutely implementation-specific and would require keeping in sync with the sources. Maybe some others have something to add they might have found useful when starting to read the code. Lynn ---------------------------------- You are on long and twisty path through the forest. Some useful files: guile.c main and inner_main init.c scm_{boot,init}_guile1? and scm_load_startup_files script.c scm_shell tags.h Possible the most important documentation-wise, partially explains the convoluted typing scheme and how some immediates are actually considered "instructions" for the evaluator, and some notes on how the contents of cells are tagged and set up for garbage collection. gc_os_dep.c All kinds of hairy details, including the clever methods for finding the stack base when scm_boot_guile isn't used. eval.h home of the iloc macros. Supposing the i somehow stands for immediate, but not clear why it should be named that way. When closures are memoized (translated into the tree code using the goodies from tags.h), the variable references get changed into pairs of numbers denoting how many "stack" frames up to go, and the offset in the frame of the variable (this is set up on the first evaluation of the closure). Actually pair is not right, the frame and offset #'s are encoded into one immediate, with the lowest 8 bits being a type tag, then (currently) 11 bits for the frame and some bits for the offset. eval.c of course modules.c eval really depends on this, as everything depends on the (dynamically scoped) module to provide a top-level environment. dynamic-root.c Adds an odd spicy flavor to guile. For fluids, call/cc tricks, and threads. debug.[ch] home of the memoized type. Booting up a command line guile follows roughly this sequence (for our purposes there's no returning from the unindented functions). main (duh) scm_boot_guile (sets up local variable for stack base) scm_boot_guile1 => scm_init_guile1 inits all the smobs - this is where all those #include "something.x" statements will actually come into play. Does other initializations as well. Last act is to load startup files, particularly ice-9/boot-9.scm. boot-9 defines top-repl (using scm-style-repl and error-catching-repl) last act of boot-9 is to set the "current module" to (guile-user) inner_main scm_shell => scm_compile_shell_switches notable in that it constructs the code guile executes. In particular, it inserts a call to top-repl if guile's invoked for interactive use. returns the expression => scm_current_module (this was set when loading startup files, remember?). returns the module scm_eval_x scm_compile_shell_switches provides the expression, scm_current_module provides the current module (a smob!) scm_inner_eval_x this is wrapped by a dynamic wind that sets the current module to be the module handed to scm_eval_x on entrance and restores it on exit (redundant in this usage). scm_primitive_eval_x applies the module transformer (if any) on the expression, produces a top-level environment consisting of a list with the current module's "eval-closure" (a smob type created in module.c), and then passes the expression and environment off to the real eval (well, scm_i_eval_x, but that just calls the real eval) --------- eval.c and probably some other files include themselves in order to produce separate debugging versions of functions in the object code without having separate versions in the source. *.x files included in init functions are generated by guile-snarf based on SCM_DEFINE and other macros (which appear all over the code). lexical environments are just lists of lists of cons cells, but there's some slight weirdness with the value (may be the cdr or the cadr of the cons cell). ilocs: see above at eval.h some related functions are actually in debug.c memoized smobs: defined in debug.c Not clear what is going on here. All expressions should get "memoized" as part of expansion/evaluation, but the memoized objects appear geared toward preserving the source for debugging purposes ??? (otherwise, why is it in debug.c?) _______________________________________________ Guile-devel mailing list Guile-devel@gnu.org http://mail.gnu.org/mailman/listinfo/guile-devel ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <52c42c3e1003242253w40bd6447k48e5a42cc1d33a98@mail.gmail.com>]
[parent not found: <m3zl1ws738.fsf@pobox.com>]
[parent not found: <52c42c3e1003251242g1a39b967pfdf5437b953bf054@mail.gmail.com>]
[parent not found: <m3hbo3s3dl.fsf@pobox.com>]
* Re: Some Questions [not found] ` <m3hbo3s3dl.fsf@pobox.com> @ 2010-03-29 6:18 ` Michael Lucy 2010-03-29 9:40 ` Andy Wingo 0 siblings, 1 reply; 4+ messages in thread From: Michael Lucy @ 2010-03-29 6:18 UTC (permalink / raw) To: guile-devel, Andy Wingo On Fri, Mar 26, 2010 at 5:56 AM, Andy Wingo <wingo@pobox.com> wrote: > http://www.gnu.org/software/guile/mail/mail.html :) Guile-devel. Send > things there, and copy me on them if you really want it to go to my > attention. Ah, OK, I thought you meant the summer-of-code@gnu.org mailing list. > Well, I've already heard of someone who would like to do Lua, so if it's > the same to you, I'd have a look at Python. Here's a mail I sent to the > list recently regarding practicalities: > > http://lists.gnu.org/archive/html/guile-devel/2010-03/msg00076.html > > I'd also have a look at Thomas Bushnell's early work at supporting > Python, before we had the compiler: http://savannah.gnu.org/projects/gpc > > I am somewhat concerned that we'll end up with a number of half-finished > language implementations. I'm not sure what to do or think about that. I'll probably apply to work on Python then. Two quick questions: 1. How fast would you need it to be? (i.e. am I going to be writing C code or can I stick with straight Scheme?) 2. I was thinking of compiling Python into Scheme code, but you suggest in that link compiling to Tree-IL instead. Could you elaborate on the logic behind that more? Does it just result in faster programs? > Hm, another thought, project-wise: have you heard of parsing expression > grammars before? Here are a couple links: > > http://en.wikipedia.org/wiki/Parsing_expression_grammar > http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html > > Writing a good PEG parser library, together with docs and example > grammars, sounds like about a 1.5-2 month project -- just about right > for the SOC. > > We do have a LALR parser generator; I said something else about it in > the mail to guile-devel. A PEG parser seems like a shorter project than two months, but maybe writing one that's up to GNU standards would take a while. Would you prefer this over a python compiler? I'd be up for it; I mostly want to be writing something intellectually interesting in Scheme. > > Happy hacking, > > Andy > -- > http://wingolog.org/ > Anyway, sorry I haven't written for the last 3 days. You said you wanted to see some code so I wrote some, but I kinda got caught up in it. I hosted things up at http://people.cs.uchicago.edu/~mlucy/ if you want to take a look. peg.scm is a PEG parser. It works on the things I've tested it on, but I'd be amazed if there weren't any bugs lurking there. It's also in need of a serious refactoring--please don't think my finished code looks like parse-expression. It works but I wouldn't consider it finished by any standard, I just ran out of time (applications open tomorrow). rvector.scm is a resizable vector library--not terribly useful, but I wrote it to get familiar with guile before writing peg.scm (and as evidence that I do, in fact, produce reasonably clean code once the bugs are ironed out). peg-tst.scm and rvector-tst.scm do what they sound like. peg-tst.scm is two tests and an example of the library actually parsing a simple function grammar. rvector-tst.scm won't complete successfully without the change I suggested in http://lists.gnu.org/archive/html/bug-guile/2010-03/msg00011.html. (Nobody wrote back so I'm not entirely sure it's a bug. If it isn't then rvector-tst.scm is wrong.) ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Some Questions 2010-03-29 6:18 ` Some Questions Michael Lucy @ 2010-03-29 9:40 ` Andy Wingo 2010-03-30 5:08 ` Michael Lucy 0 siblings, 1 reply; 4+ messages in thread From: Andy Wingo @ 2010-03-29 9:40 UTC (permalink / raw) To: Michael Lucy; +Cc: guile-devel Hi Michael, On Mon 29 Mar 2010 08:18, Michael Lucy <MichaelGLucy@Gmail.com> writes: > I'll probably apply to work on Python then. Cool. > 1. How fast would you need it to be? (i.e. am I going to be writing C > code or can I stick with straight Scheme?) Scheme! Depending on how it's compiled, the resulting code probably won't run quite as fast as CPython (though it could in the future) -- due to the need for many calls to go through GOOPS dispatch -- but the speed of the compiler is ~unimportant, because the compiler's output will be cached. > 2. I was thinking of compiling Python into Scheme code, but you > suggest in that link compiling to Tree-IL instead. Could you > elaborate on the logic behind that more? Does it just result in > faster programs? Tree-IL is the right thing IMO, mostly because it allows you keep source location information, but it also allows you to express more precisely what you want to compile to. You don't want to run your compiler's output through the Scheme syntax expander, that's unnecessary. Also tree-il is the right place to hook into Guile's compiler infrastructure. >> Hm, another thought, project-wise: have you heard of parsing expression >> grammars before? Here are a couple links: >> >> http://en.wikipedia.org/wiki/Parsing_expression_grammar >> http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html >> >> Writing a good PEG parser library, together with docs and example >> grammars, sounds like about a 1.5-2 month project -- just about right >> for the SOC. >> >> We do have a LALR parser generator; I said something else about it in >> the mail to guile-devel. > > A PEG parser seems like a shorter project than two months, but maybe > writing one that's up to GNU standards would take a while. Would you > prefer this over a python compiler? I'd be up for it; I mostly want > to be writing something intellectually interesting in Scheme. Ah, I should have responded to you before I responded to "No Itisnt"; if you're down for this, let's do it then. I suspect that it's a project that would take 2 months to do *well*: * Compile PEG grammars at syntax time using a macro - into state-machine-like lexically-bound closures - see LPEG VM for example http://www.inf.puc-rio.br/~roberto/docs/peg.pdf - potentially augment Guile's VM with needed instructions * Procedural, LPEG-like interface? - run the compiler at runtime? * PEG grammars as text? - Guile PEG language, parsed by PEG itself? * Benchmarking - LPEG benchmarks would be a good first start * Test suite * Docs * Stream parsing? If you finished the first tasks, figuring out an efficient way to do stream parsing could well take all the rest of the time. (The LPEG paper works on strings.) So if you're down with it, perhaps you can do the PEG stuff, and No Itisnt can work on Lua. There's definitely enough work for everyone :) Hopefully the GOOG comes through with funding. > Anyway, sorry I haven't written for the last 3 days. NP! > You said you wanted to see some code so I wrote some, but I kinda got > caught up in it. I hosted things up at > http://people.cs.uchicago.edu/~mlucy/ if you want to take a look. Wow, that's great stuff! Working code is often a precursor to clean working code, so that's great. > peg.scm is a PEG parser. It works on the things I've tested it on, > but I'd be amazed if there weren't any bugs lurking there. It's also > in need of a serious refactoring--please don't think my finished code > looks like parse-expression. It works but I wouldn't consider it > finished by any standard, I just ran out of time (applications open > tomorrow). Ah right, the GOOG moves quickly. I think it's a great starting point, though we really want it to be a PEG compiler rather than (in addition to?) an interpreter, so I think the LPEG paper should be helpful here. > rvector.scm is a resizable vector library--not terribly useful, but I > wrote it to get familiar with guile before writing peg.scm (and as > evidence that I do, in fact, produce reasonably clean code once the > bugs are ironed out). Nice. I'm impressed that you managed to parse the doc sections on structs, they're quite obtuse. FWIW, it's usually clearer to use a higher-level abstraction, like records, as in srfi-9. And excellent that you found a bug, too. So, let me know what you think about PEG, if you think it's the right size for summer. I think it has great potential, and implementing it well (as a compiler and procedurally) will be of great use. Otherwise we can go ahead with Python, if that's to your liking. Cheers, Andy -- http://wingolog.org/ ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Some Questions 2010-03-29 9:40 ` Andy Wingo @ 2010-03-30 5:08 ` Michael Lucy 0 siblings, 0 replies; 4+ messages in thread From: Michael Lucy @ 2010-03-30 5:08 UTC (permalink / raw) To: Andy Wingo; +Cc: guile-devel Hi, On Mon, Mar 29, 2010 at 3:40 AM, Andy Wingo <wingo@pobox.com> wrote: > > Tree-IL is the right thing IMO, mostly because it allows you keep source > location information, but it also allows you to express more precisely > what you want to compile to. You don't want to run your compiler's > output through the Scheme syntax expander, that's unnecessary. > > Also tree-il is the right place to hook into Guile's compiler > infrastructure. > Ah, k. > > Ah, I should have responded to you before I responded to "No Itisnt"; if > you're down for this, let's do it then. I suspect that it's a project > that would take 2 months to do *well*: > > * Compile PEG grammars at syntax time using a macro > - into state-machine-like lexically-bound closures > - see LPEG VM for example > http://www.inf.puc-rio.br/~roberto/docs/peg.pdf > - potentially augment Guile's VM with needed instructions > * Procedural, LPEG-like interface? > - run the compiler at runtime? So, I'm assuming "syntax time" means the grammars will be compiled as soon as the source file is loaded and macros are parsed, while "runtime" means the grammars will be compiled when a function using them is called. Do you mean you'd like to see both of these supported? As for the interface, the lisp tradition I'm most comfortable in calls for transparent composing and decomposing of lists as the main way of dealing with data, so I was thinking of representing uncompiled grammars in a form similar to the example at the top of peg.scm and then compiling this to something efficient (a choice between this and more traditional string syntax is how CL-PPCRE approaches things: http://weitz.de/cl-ppcre/#scan ). For example, if I have a pattern stored in *var* and I want a new pattern that's one or more repetitions of that, I'd write `(,*var* *), or (list *var* '*). It looks like LPEG's approach is functions that return opaque data and other functions to combine these opaque data. For example, if I have a pattern stored in *var* and I want a new pattern that's one or more repetitions of that, I'd write (one-or-more *var*). The problem with this approach is that you have to choose between verbosity or confusion (e.g. LPEG uses * to concatenate things in the lpeg module, but when present in a string for the re module it means what it normally would in a PEG grammar). I haven't finished reading the paper, but I'm assuming the advantage to this approach is that the opaque representation is already compiled, so doing several hundred matches with one grammar won't require several hundred compilations. It seems like we could get the best of both worlds by having transparent list representations, functions to compose them (so (one-or-more *var*) yields the same thing as `(,*var* *)), and a compile-peg function that would take the list representation and produce a compiled PEG parser. All the peg functions would take either the uncompiled or the compiled form (they would check at runtime what they were dealing with), so in situations where speed is important people can store the compiled grammar and re-use it. > * PEG grammars as text? > - Guile PEG language, parsed by PEG itself? Once a parser's written, parsing strings into grammars sounds useful and not too difficult. > * Benchmarking > - LPEG benchmarks would be a good first start > * Test suite > * Docs > * Stream parsing? > > If you finished the first tasks, figuring out an efficient way to do > stream parsing could well take all the rest of the time. (The LPEG paper > works on strings.) > > So if you're down with it, perhaps you can do the PEG stuff, and No > Itisnt can work on Lua. There's definitely enough work for everyone :) > Hopefully the GOOG comes through with funding. Cool, PEG it is then. > > > So, let me know what you think about PEG, if you think it's the right > size for summer. I think it has great potential, and implementing it > well (as a compiler and procedurally) will be of great use. Otherwise we > can go ahead with Python, if that's to your liking. How does this look for a timeline (unless my math is very horrible, the "midterm" evaluations are 70% of the way through, so most of the real work will be done by then): The ~7 weeks before mid-term evaluations are due: 1 week defining the PEG syntax carefully and writing functions for dealing with it 3 weeks making the PEG syntax compile to VM code 1 week writing tests / fixing bugs that tests show 1 week documenting things that weren't well enough documented on the way 1 week for things to go wrong in The ~3 weeks after mid-term evaluations are in: 1 week writing useful add-ons (e.g. taking PEG grammar in strings) 1 week writing benchmarks and tuning the speed 1 week for cleaning things up and writing more documentation/examples The 1 week after suggested pencils-down date: More documentation/examples. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-03-30 5:08 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-05-09 5:59 Some questions Lynn Winebarger [not found] <52c42c3e1003242253w40bd6447k48e5a42cc1d33a98@mail.gmail.com> [not found] ` <m3zl1ws738.fsf@pobox.com> [not found] ` <52c42c3e1003251242g1a39b967pfdf5437b953bf054@mail.gmail.com> [not found] ` <m3hbo3s3dl.fsf@pobox.com> 2010-03-29 6:18 ` Some Questions Michael Lucy 2010-03-29 9:40 ` Andy Wingo 2010-03-30 5:08 ` Michael Lucy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).