* Some questions
@ 2002-05-09 5:59 Lynn Winebarger
0 siblings, 0 replies; 4+ messages in thread
From: Lynn Winebarger @ 2002-05-09 5:59 UTC (permalink / raw)
So, I've been reading guile some more. I have some questions that
I'm slowly answering myself by reading the source, but they might be
answered faster by people already in the know.
I don't understand the last 3 lines of scm_ilookup:
    if (SCM_ICDRP (iloc))
      return SCM_CDRLOC (er);
    return SCM_CARLOC (SCM_CDR (er));
I mean, I understand the operations, I just don't understand what the
storage scheme behind the operations is. (What is it?)
Memoized objects: what's the deal? The manual mentions them several
times but never defines them. The evaluator (in particular lookupcar) seems
to memoize everything for evaluation purposes, but memoized objects are
defined in debug.c, which would suggest they are for debugging purposes.
I'm appending a "roadmap" file I've jotted down that might be useful to
others who haven't looked at the source or have been scared off by it. Please
correct anything I've got wrong. Yes, the notes are absolutely
implementation-specific and would require keeping in sync with the sources.
Maybe some others have something to add they might have found useful
when starting to read the code.
Lynn
----------------------------------
You are on a long and twisty path through the forest.
Some useful files:
guile.c main and inner_main
init.c scm_{boot,init}_guile1? and scm_load_startup_files
script.c scm_shell
tags.h Possibly the most important file documentation-wise;
it partially explains the
convoluted typing scheme, how some immediates
are actually considered "instructions" for the
evaluator, and has some notes on how the contents of
cells are tagged and set up for garbage collection.
gc_os_dep.c All kinds of hairy details, including the clever
methods for finding the stack base when scm_boot_guile
isn't used.
eval.h home of the iloc macros. Presumably the i stands
for immediate, though it's not clear why it should be
named that way. When closures are memoized (translated
into the tree code using the goodies from tags.h), the
variable references get changed into pairs of numbers
denoting how many "stack" frames up to go, and the offset
in the frame of the variable (this is set up on the first
evaluation of the closure). Actually, "pair" is not quite
right: the frame and offset #'s are encoded into one
immediate, with the lowest 8 bits being a type tag,
then (currently) 11 bits for the frame and some bits
for the offset.
eval.c of course
modules.c eval really depends on this, as everything depends on
the (dynamically scoped) module to provide a top-level
environment.
dynamic-root.c Adds an odd spicy flavor to guile. For fluids,
call/cc tricks, and threads.
debug.[ch] home of the memoized type.
Booting up a command line guile follows roughly
this sequence (for our purposes there's no returning from the
unindented functions).
main (duh)
scm_boot_guile (sets up local variable for stack base)
scm_boot_guile1
=> scm_init_guile1 inits all the smobs - this is where all
those #include "something.x" statements
will actually come into play. Does other
initializations as well.
Last act is to load startup files, particularly
ice-9/boot-9.scm. boot-9 defines top-repl
(using scm-style-repl and error-catching-repl)
last act of boot-9 is to set the "current module"
to (guile-user)
inner_main
scm_shell
=> scm_compile_shell_switches notable in that it constructs the
code guile executes. In particular, it
inserts a call to top-repl if guile's invoked
for interactive use. returns the expression
=> scm_current_module (this was set when loading startup files,
remember?). returns the module
scm_eval_x scm_compile_shell_switches provides the expression,
scm_current_module provides the current module (a smob!)
scm_inner_eval_x this is wrapped by a dynamic wind that sets the
current module to be the module handed to scm_eval_x
on entrance and restores it on exit (redundant in this
usage).
scm_primitive_eval_x applies the module transformer (if any) on the
expression, produces a top-level environment
consisting of a list with the current module's
"eval-closure" (a smob type created in modules.c),
and then passes the expression and environment off
to the real eval (well, scm_i_eval_x, but that just
calls the real eval)
---------
eval.c and probably some other files include themselves in
order to produce separate debugging versions of functions
in the object code without having separate versions in the
source.
*.x files included in init functions are generated by guile-snarf
based on SCM_DEFINE and other macros (which appear all over the code).
lexical environments are just lists of lists of cons cells, but there's
some slight weirdness with the value (may be the cdr or the cadr of the
cons cell).
ilocs: see above at eval.h; some related functions are actually in
debug.c.
memoized smobs: defined in debug.c. Not clear what is going on here.
All expressions should get "memoized" as part of expansion/evaluation,
but the memoized objects appear geared toward preserving the source
for debugging purposes??? (otherwise, why is it in debug.c?)
_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://mail.gnu.org/mailman/listinfo/guile-devel
* Re: Some Questions
[not found] ` <m3hbo3s3dl.fsf@pobox.com>
@ 2010-03-29 6:18 ` Michael Lucy
2010-03-29 9:40 ` Andy Wingo
0 siblings, 1 reply; 4+ messages in thread
From: Michael Lucy @ 2010-03-29 6:18 UTC (permalink / raw)
To: guile-devel, Andy Wingo
On Fri, Mar 26, 2010 at 5:56 AM, Andy Wingo <wingo@pobox.com> wrote:
> http://www.gnu.org/software/guile/mail/mail.html :) Guile-devel. Send
> things there, and copy me on them if you really want it to go to my
> attention.
Ah, OK, I thought you meant the summer-of-code@gnu.org mailing list.
> Well, I've already heard of someone who would like to do Lua, so if it's
> the same to you, I'd have a look at Python. Here's a mail I sent to the
> list recently regarding practicalities:
>
> http://lists.gnu.org/archive/html/guile-devel/2010-03/msg00076.html
>
> I'd also have a look at Thomas Bushnell's early work on supporting
> Python, before we had the compiler: http://savannah.gnu.org/projects/gpc
>
> I am somewhat concerned that we'll end up with a number of half-finished
> language implementations. I'm not sure what to do or think about that.
I'll probably apply to work on Python then. Two quick questions:
1. How fast would you need it to be? (i.e. am I going to be writing C
code or can I stick with straight Scheme?)
2. I was thinking of compiling Python into Scheme code, but you
suggest in that link compiling to Tree-IL instead. Could you
elaborate on the logic behind that more? Does it just result in
faster programs?
> Hm, another thought, project-wise: have you heard of parsing expression
> grammars before? Here are a couple links:
>
> http://en.wikipedia.org/wiki/Parsing_expression_grammar
> http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
>
> Writing a good PEG parser library, together with docs and example
> grammars, sounds like about a 1.5-2 month project -- just about right
> for the SOC.
>
> We do have a LALR parser generator; I said something else about it in
> the mail to guile-devel.
A PEG parser seems like a shorter project than two months, but maybe
writing one that's up to GNU standards would take a while. Would you
prefer this over a python compiler? I'd be up for it; I mostly want
to be writing something intellectually interesting in Scheme.
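(For concreteness, the PEG semantics under discussion -- ordered choice, greedy repetition, unlimited lookahead -- can be sketched with a few tiny combinators. Python stands in for Scheme purely for illustration; this is not the project code.)

```python
# Minimal sketch of PEG semantics.  Each combinator returns a function
# taking (text, pos) and returning the new position on success, or
# None on failure.

def lit(s):                      # match a literal string
    return lambda t, i: i + len(s) if t.startswith(s, i) else None

def seq(*ps):                    # match parsers one after another
    def p(t, i):
        for q in ps:
            i = q(t, i)
            if i is None:
                return None
        return i
    return p

def choice(*ps):                 # ordered choice: first success wins
    def p(t, i):
        for q in ps:
            j = q(t, i)
            if j is not None:
                return j
        return None
    return p

def star(q):                     # zero-or-more: greedy, no backtracking
    def p(t, i):
        while True:
            j = q(t, i)
            if j is None:
                return i
            i = j
    return p

# "a b* a" as a PEG: matches "aa", "aba", "abba", ...
g = seq(lit("a"), star(lit("b")), lit("a"))
```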
>
> Happy hacking,
>
> Andy
> --
> http://wingolog.org/
>
Anyway, sorry I haven't written for the last 3 days. You said you
wanted to see some code so I wrote some, but I kinda got caught up in
it.
I hosted things up at http://people.cs.uchicago.edu/~mlucy/ if you
want to take a look.
peg.scm is a PEG parser. It works on the things I've tested it on,
but I'd be amazed if there weren't any bugs lurking there. It's also
in need of a serious refactoring--please don't think my finished code
looks like parse-expression. It works, but I wouldn't consider it
finished by any standard; I just ran out of time (applications open
tomorrow).
rvector.scm is a resizable vector library--not terribly useful, but I
wrote it to get familiar with guile before writing peg.scm (and as
evidence that I do, in fact, produce reasonably clean code once the
bugs are ironed out).
peg-tst.scm and rvector-tst.scm do what they sound like. peg-tst.scm
is two tests and an example of the library actually parsing a simple
function grammar.
rvector-tst.scm won't complete successfully without the change I
suggested in http://lists.gnu.org/archive/html/bug-guile/2010-03/msg00011.html.
(Nobody wrote back so I'm not entirely sure it's a bug. If it isn't
then rvector-tst.scm is wrong.)
* Re: Some Questions
2010-03-29 6:18 ` Some Questions Michael Lucy
@ 2010-03-29 9:40 ` Andy Wingo
2010-03-30 5:08 ` Michael Lucy
0 siblings, 1 reply; 4+ messages in thread
From: Andy Wingo @ 2010-03-29 9:40 UTC (permalink / raw)
To: Michael Lucy; +Cc: guile-devel
Hi Michael,
On Mon 29 Mar 2010 08:18, Michael Lucy <MichaelGLucy@Gmail.com> writes:
> I'll probably apply to work on Python then.
Cool.
> 1. How fast would you need it to be? (i.e. am I going to be writing C
> code or can I stick with straight Scheme?)
Scheme! Depending on how it's compiled, the resulting code probably
won't run quite as fast as CPython (though it could in the future) --
due to the need for many calls to go through GOOPS dispatch -- but the
speed of the compiler is ~unimportant, because the compiler's output
will be cached.
> 2. I was thinking of compiling Python into Scheme code, but you
> suggest in that link compiling to Tree-IL instead. Could you
> elaborate on the logic behind that more? Does it just result in
> faster programs?
Tree-IL is the right thing IMO, mostly because it allows you to keep
source location information, but it also allows you to express more
precisely what you want to compile to. You don't want to run your
compiler's output through the Scheme syntax expander; that's unnecessary.
Also tree-il is the right place to hook into Guile's compiler
infrastructure.
>> Hm, another thought, project-wise: have you heard of parsing expression
>> grammars before? Here are a couple links:
>>
>> http://en.wikipedia.org/wiki/Parsing_expression_grammar
>> http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
>>
>> Writing a good PEG parser library, together with docs and example
>> grammars, sounds like about a 1.5-2 month project -- just about right
>> for the SOC.
>>
>> We do have a LALR parser generator; I said something else about it in
>> the mail to guile-devel.
>
> A PEG parser seems like a shorter project than two months, but maybe
> writing one that's up to GNU standards would take a while. Would you
> prefer this over a python compiler? I'd be up for it; I mostly want
> to be writing something intellectually interesting in Scheme.
Ah, I should have responded to you before I responded to "No Itisnt"; if
you're down for this, let's do it then. I suspect that it's a project
that would take 2 months to do *well*:
* Compile PEG grammars at syntax time using a macro
- into state-machine-like lexically-bound closures
- see LPEG VM for example
http://www.inf.puc-rio.br/~roberto/docs/peg.pdf
- potentially augment Guile's VM with needed instructions
* Procedural, LPEG-like interface?
- run the compiler at runtime?
* PEG grammars as text?
- Guile PEG language, parsed by PEG itself?
* Benchmarking
- LPEG benchmarks would be a good first start
* Test suite
* Docs
* Stream parsing?
If you finished the first tasks, figuring out an efficient way to do
stream parsing could well take all the rest of the time. (The LPEG paper
works on strings.)
So if you're down with it, perhaps you can do the PEG stuff, and No
Itisnt can work on Lua. There's definitely enough work for everyone :)
Hopefully the GOOG comes through with funding.
> Anyway, sorry I haven't written for the last 3 days.
NP!
> You said you wanted to see some code so I wrote some, but I kinda got
> caught up in it. I hosted things up at
> http://people.cs.uchicago.edu/~mlucy/ if you want to take a look.
Wow, that's great stuff! Working code is often a precursor to clean
working code, so that's great.
> peg.scm is a PEG parser. It works on the things I've tested it on,
> but I'd be amazed if there weren't any bugs lurking there. It's also
> in need of a serious refactoring--please don't think my finished code
> looks like parse-expression. It works but I wouldn't consider it
> finished by any standard, I just ran out of time (applications open
> tomorrow).
Ah right, the GOOG moves quickly. I think it's a great starting point,
though we really want it to be a PEG compiler rather than (in addition
to?) an interpreter, so I think the LPEG paper should be helpful here.
> rvector.scm is a resizable vector library--not terribly useful, but I
> wrote it to get familiar with guile before writing peg.scm (and as
> evidence that I do, in fact, produce reasonably clean code once the
> bugs are ironed out).
Nice. I'm impressed that you managed to parse the doc sections on
structs; they're quite obtuse. FWIW, it's usually clearer to use a
higher-level abstraction, like records, as in srfi-9. And excellent that
you found a bug, too.
So, let me know what you think about PEG, if you think it's the right
size for summer. I think it has great potential, and implementing it
well (as a compiler and procedurally) will be of great use. Otherwise we
can go ahead with Python, if that's to your liking.
Cheers,
Andy
--
http://wingolog.org/
* Re: Some Questions
2010-03-29 9:40 ` Andy Wingo
@ 2010-03-30 5:08 ` Michael Lucy
0 siblings, 0 replies; 4+ messages in thread
From: Michael Lucy @ 2010-03-30 5:08 UTC (permalink / raw)
To: Andy Wingo; +Cc: guile-devel
Hi,
On Mon, Mar 29, 2010 at 3:40 AM, Andy Wingo <wingo@pobox.com> wrote:
>
> Tree-IL is the right thing IMO, mostly because it allows you to keep
> source location information, but it also allows you to express more
> precisely what you want to compile to. You don't want to run your
> compiler's output through the Scheme syntax expander; that's unnecessary.
>
> Also tree-il is the right place to hook into Guile's compiler
> infrastructure.
>
Ah, k.
>
> Ah, I should have responded to you before I responded to "No Itisnt"; if
> you're down for this, let's do it then. I suspect that it's a project
> that would take 2 months to do *well*:
>
> * Compile PEG grammars at syntax time using a macro
> - into state-machine-like lexically-bound closures
> - see LPEG VM for example
> http://www.inf.puc-rio.br/~roberto/docs/peg.pdf
> - potentially augment Guile's VM with needed instructions
> * Procedural, LPEG-like interface?
> - run the compiler at runtime?
So, I'm assuming "syntax time" means the grammars will be compiled as
soon as the source file is loaded and macros are parsed, while
"runtime" means the grammars will be compiled when a function using
them is called. Do you mean you'd like to see both of these
supported?
As for the interface, the lisp tradition I'm most comfortable in calls
for transparent composing and decomposing of lists as the main way of
dealing with data, so I was thinking of representing uncompiled
grammars in a form similar to the example at the top of peg.scm and
then compiling this to something efficient (a choice between this and
more traditional string syntax is how CL-PPCRE approaches things:
http://weitz.de/cl-ppcre/#scan ). For example, if I have a pattern
stored in *var* and I want a new pattern that's one or more
repetitions of that, I'd write `(,*var* *), or (list *var* '*).
It looks like LPEG's approach is functions that return opaque data and
other functions to combine these opaque data. For example, if I have
a pattern stored in *var* and I want a new pattern that's one or more
repetitions of that, I'd write (one-or-more *var*). The problem with
this approach is that you have to choose between verbosity or
confusion (e.g. LPEG uses * to concatenate things in the lpeg module,
but when present in a string for the re module it means what it
normally would in a PEG grammar). I haven't finished reading the
paper, but I'm assuming the advantage to this approach is that the
opaque representation is already compiled, so doing several hundred
matches with one grammar won't require several hundred compilations.
It seems like we could get the best of both worlds by having
transparent list representations, functions to compose them (so
(one-or-more *var*) yields the same thing as `(,*var* *)), and a
compile-peg function that would take the list representation and
produce a compiled PEG parser. All the peg functions would take
either the uncompiled or the compiled form (they would check at
runtime what they were dealing with), so in situations where speed is
important people can store the compiled grammar and re-use it.
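A sketch of that dual interface might look like the following. Python stands in for Scheme, every name here (compile-peg, one-or-more, and so on) is hypothetical, and "+" is used for one-or-more to avoid overloading "*":

```python
# Sketch of the dual interface described above: grammars are plain
# lists, combinator functions just build those lists, and a compile
# step turns a list into an opaque matcher.  Every entry point accepts
# either form, checking at runtime which one it got.

class CompiledPeg:
    """Opaque compiled form; wraps a matcher function."""
    def __init__(self, matcher):
        self.match = matcher

def one_or_more(pat):
    # transparent composition: returns the list form, nothing opaque
    return [pat, "+"]

def interp(pat, text, i):
    """Naive interpreter for two pattern forms: a literal string, or
    [p, "+"] for one-or-more.  Returns the end position or None."""
    if isinstance(pat, str):
        return i + len(pat) if text.startswith(pat, i) else None
    if isinstance(pat, list) and pat[-1] == "+":
        i = interp(pat[0], text, i)
        if i is None:
            return None
        while True:
            j = interp(pat[0], text, i)
            if j is None:
                return i
            i = j
    raise ValueError("unknown pattern: %r" % (pat,))

def compile_peg(pat):
    # a real implementation would emit efficient (VM) code; this one
    # just closes over the naive interpreter
    if isinstance(pat, CompiledPeg):
        return pat                    # already compiled
    return CompiledPeg(lambda text: interp(pat, text, 0))

def peg_match(pat, text):
    # accepts either the uncompiled list form or the compiled form
    if not isinstance(pat, CompiledPeg):
        pat = compile_peg(pat)
    return pat.match(text)
```

So callers who don't care about speed pass lists around and decompose them freely, while anyone doing hundreds of matches compiles once and reuses the opaque object.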
> * PEG grammars as text?
> - Guile PEG language, parsed by PEG itself?
Once a parser's written, parsing strings into grammars sounds useful
and not too difficult.
> * Benchmarking
> - LPEG benchmarks would be a good first start
> * Test suite
> * Docs
> * Stream parsing?
>
> If you finished the first tasks, figuring out an efficient way to do
> stream parsing could well take all the rest of the time. (The LPEG paper
> works on strings.)
>
> So if you're down with it, perhaps you can do the PEG stuff, and No
> Itisnt can work on Lua. There's definitely enough work for everyone :)
> Hopefully the GOOG comes through with funding.
Cool, PEG it is then.
>
>
> So, let me know what you think about PEG, if you think it's the right
> size for summer. I think it has great potential, and implementing it
> well (as a compiler and procedurally) will be of great use. Otherwise we
> can go ahead with Python, if that's to your liking.
How does this look for a timeline (unless my math is very horrible,
the "midterm" evaluations are 70% of the way through, so most of the
real work will be done by then):
The ~7 weeks before mid-term evaluations are due:
1 week defining the PEG syntax carefully and writing functions for
dealing with it
3 weeks making the PEG syntax compile to VM code
1 week writing tests / fixing bugs that tests show
1 week documenting things that weren't well enough documented on the way
1 week for things to go wrong in
The ~3 weeks after mid-term evaluations are in:
1 week writing useful add-ons (e.g. taking PEG grammar in strings)
1 week writing benchmarks and tuning the speed
1 week for cleaning things up and writing more documentation/examples
The 1 week after suggested pencils-down date:
More documentation/examples.