unofficial mirror of guile-devel@gnu.org 
* vm status update
@ 2008-12-26 17:24 Andy Wingo
  2008-12-28 22:50 ` Neil Jerram
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Wingo @ 2008-12-26 17:24 UTC
  To: guile-devel

Happy St. Stephen's Day, hackers of the good hack!

I just landed a few patches on the vm branch that integrate backtrace
handling between the interpreter and the VM. `save-stack' saves the VM
stack and the interpreter stack, properly interleaved, as your
computation bounces back and forth between the interpreter and the VM.

This also means that we now get VM backtraces from scm_backtrace() and
friends.

There are still some hiccups, but this was one of the last things I
needed to get done before merging VM to master. I need to finish
updating the documentation, fix the tracing infrastructure to be like
the traps infrastructure (it already is, mostly), and we'll be done.

Finally, as part of the documentation work, I wrote up a history of
Guile. I'm including it in this mail for comment, but here's the link if
you want to read it that way:

  http://git.savannah.gnu.org/gitweb/?p=guile.git;a=blob;f=doc/ref/history.texi;hb=vm

I still need to fold in some feedback from Ludovic, so consider the
document a draft at this point.

Cheers,

Andy

* begin history.texi:

@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C)  2008
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node History
@section A Brief History of Guile

Guile is an artifact of historical processes, both as code and as a
community of hackers. It is sometimes useful to know this history when
hacking the source code, to know about past decisions and future
directions.

Of course, the real history of Guile is written by the hackers hacking
and not the writers writing, so we round up the section with a note on
current status and future directions.

@menu
* The Emacs Thesis::  
* Early Days::                  
* A Scheme of Many Maintainers::  
* A Timeline of Selected Guile Releases::  
* Status::
@end menu

@node The Emacs Thesis
@subsection The Emacs Thesis

The story of Guile is the story of bringing the development experience
of Emacs to the mass of programs on a GNU system.

Emacs, when it was first created in its GNU form in 1984, was a new
take on the problem of ``how to make a program''. The Emacs thesis is
that it is delightful to create composite programs based on an
orthogonal kernel written in a low-level language together with a
powerful, high-level extension language.

Extension languages foster extensible programs, programs which adapt
readily to different users and to changing times. Proof of this can be
seen in Emacs' current and continued existence, spanning more than a
quarter-century.

Besides providing for modification of a program by others, extension
languages are good for @emph{intension} as well. Programs built in ``the
Emacs way'' are pleasurable and easy for their authors to flesh out
with the features that they need.

After the Emacs experience was appreciated more widely, a number of
hackers started to consider how to spread this experience to the rest
of the GNU system. It was clear that the easiest way to Emacsify a
program would be to embed a shared language implementation into it.

@node Early Days
@subsection Early Days

Tom Lord was the first to fully concentrate his efforts on an
embeddable language runtime, which he named ``GEL'', the GNU Extension
Language.

GEL was the product of converting SCM, Aubrey Jaffer's implementation
of Scheme, into something more appropriate to embedding as a library.
(SCM was itself based on an implementation by George Carrette, SIOD).

Lord managed to convince Richard Stallman to dub GEL the official
extension language for the GNU project. It was a natural fit, given
that Scheme was a cleaner, more modern Lisp than Emacs Lisp. Part of
the argument was that eventually when GEL became more capable, it
could gain the ability to execute other languages, especially Emacs
Lisp.

Due to a naming conflict with another programming language, Jim Blandy
suggested a new name for GEL: ``Guile''. Besides being a recursive
acronym, ``Guile'' craftily follows the naming of its ancestors,
``Planner'', ``Conniver'', and ``Schemer''. (The latter was truncated
to ``Scheme'' due to a 6-character file name limit on an old operating
system.) Finally, ``Guile'' suggests ``guy-ell'', or ``Guy L.
Steele'', who, together with Gerald Sussman, originally discovered
Scheme.

Around the same time that Guile (then GEL) was readying itself for
public release, another extension language was gaining in popularity,
Tcl. Many developers found advantages in Tcl because of its shell-like
syntax and its well-developed graphical widgets library, Tk. Also, at
the time there was a large marketing push promoting Tcl as a
``universal extension language''.

Richard Stallman, as the primary author of GNU Emacs, had a particular
vision of what extension languages should be, and Tcl did not seem to
him to be as capable as Emacs Lisp. He posted a criticism to the
comp.lang.tcl newsgroup, sparking one of the internet's legendary
flamewars. As part of these discussions, retrospectively dubbed the
``Tcl Wars'', he announced the Free Software Foundation's intent to
promote Guile as the extension language for the GNU project.

It is a common misconception that Guile was created as a reaction to
Tcl. While it is true that the public announcement of Guile happened
at the same time as the ``Tcl wars'', Guile was created out of a
condition that existed outside the polemic. Indeed, the need for a
powerful language to bridge the gap between extension of existing
applications and a more fully dynamic programming environment is still
with us today.

@node A Scheme of Many Maintainers
@subsection A Scheme of Many Maintainers

Surveying the field, it seems that Scheme implementations correspond
with their maintainers in an N-to-1 relationship. That is to say,
people who implement Schemes might do so on a number of occasions,
but the lifetime of a given Scheme is tied to the maintainership of
one individual.

Guile is atypical in this regard.

Tom Lord maintained Guile for its first year and a half or so,
corresponding to the end of 1994 through the middle of 1996. The
releases made in this time constitute an arc from SCM as a standalone
program to Guile as a reusable, embeddable library, but passing
through an explosion of features: embedded Tcl and Tk, a toolchain for
compiling and disassembling Java, addition of a C-like syntax,
creation of a module system, and a start at a rich POSIX interface.

Only some of those features remain in Guile. There were ongoing
tensions between providing a small, embeddable language, and one which
had all of the features (e.g. a graphical toolkit) that a modern Emacs
might need. In the end, as Guile gained in uptake, the development
team decided to focus on depth, documentation and orthogonality rather
than on breadth. This has been the focus of Guile ever since, although
there is a wide range of third-party libraries for Guile.

Jim Blandy presided over that period of stabilization, in the three
years until the end of 1999, when he too moved on to other projects.
Since then, Guile has had a group maintainership. The first group was
Maciej Stachowiak, Mikael Djurfeldt, and Marius Vollmer, with Vollmer
staying on the longest. By late 2007, Vollmer had mostly moved on to
other things, so Neil Jerram and Ludovic Courtès stepped up to take on
the primary maintenance responsibility.

Of course, a large part of the actual work on Guile has come from
other contributors too numerous to mention, but without whom the world
would be a poorer place.

@node A Timeline of Selected Guile Releases
@subsection A Timeline of Selected Guile Releases

@table @asis
@item guile-i --- 4 February 1995
SCM, turned into a library.

@item guile-ii --- 6 April 1995
A low-level module system was added. Tcl/Tk support was added,
allowing extension of Scheme by Tcl or vice versa. POSIX support was
improved, and there was an experimental stab at Java integration.

@item guile-iii --- 18 August 1995
The C-like syntax, ctax, was improved, but mostly this release
featured a start at the task of breaking Guile into pieces.

@item 1.0 --- 5 January 1997
@code{#f} was distinguished from @code{'()}. Green threads were added.
Source-level debugging became more useful, and programmer's and user's
manuals were begun. The module system gained a high-level interface,
which is still used today in more or less the same form.

@item 1.1 --- 16 May 1997
@itemx 1.2 --- 24 June 1997
Support for Tcl/Tk and ctax were split off as separate packages, and
have remained there since. Guile became more compatible with SCSH, and
more useful as a UNIX scripting language. Libguile could now be built
as a shared library, and third-party extensions written in C became
loadable via dynamic linking.

@item 1.3.0 --- 19 October 1998
Command-line editing became much more pleasant through the use of the
readline library. The initial support for internationalization via
multi-byte strings was removed, and has yet to be added back, though
UTF-8 hacks are common. Modules gained the ability to have custom
expanders, which is still used for syntax-case macros. Ports gained
better support for file descriptors, and fluids were added.

@item 1.3.2 --- 20 August 1999
@itemx 1.3.4 --- 25 September 1999
@itemx 1.4 --- 21 June 2000
A long list of lispy features was added: hooks, Common Lisp's
@code{format}, optional and keyword procedure arguments,
@code{getopt-long}, sorting, random numbers, and many other fixes and
enhancements. Guile gained an interactive debugger and interactive
help, and began to give better backtraces.

@item 1.6 --- 6 September 2002
Guile gained support for the R5RS standard, and added a number of SRFI
modules. The module system was expanded with programmatic support for
identifier selection and renaming. The GOOPS object system was merged
into Guile core.

@item 1.8 --- 20 February 2006
Guile's arbitrary-precision arithmetic switched to use the GMP
library, and added support for exact rationals. Green threads were
removed in favor of POSIX threads, providing true multiprocessing.
Gettext support was added, and Guile's C API was cleaned up and
orthogonalized in a massive way.

@item 2.0 --- thus far, only unstable snapshots available
A virtual machine was added to Guile, along with the associated
compiler and toolchain. Support for locales was added. Running Guile
instances became controllable and debuggable from within Emacs, via
GDS. GDS was backported to 1.8.5. An SRFI-compatible interface to
multithreading was added, including thread cancellation.
@end table

@node Status
@subsection Status, or: Your Help Needed

Guile has achieved much of what it set out to achieve, but there is
much remaining to do.

There is still the old problem of bringing existing applications into
a more Emacs-like experience. Guile has had some successes in this
respect, but still most applications in the GNU system are without
Guile integration.

Getting Guile to those applications takes an investment, the
``hacktivation energy'' needed to wire Guile into a program that only
pays off once it is good enough to enable new kinds of behavior. This
would be a great way for new hackers to contribute: take an
application that you use and that you know well, think of something
that it can't yet do, and figure out a way to integrate Guile and
implement that task in Guile.

With time, perhaps this exposure can reverse itself, whereby programs
can run under Guile instead of vice versa, eventually resulting in the
Emacsification of the entire GNU system. Indeed, this is the reason
for the naming of the many Guile modules that live in the @code{ice-9}
namespace, a nod to the fictional substance in Kurt Vonnegut's
novel, @cite{Cat's Cradle}, capable of acting as a seed crystal to
crystallize the mass of software.

Implicit in this whole discussion is the idea that dynamic languages
are somehow better than languages like C. While languages like C have
their place, Guile's take on this question is that yes, Scheme is more
expressive than C, and more fun to write. This realization carries an
imperative with it to write as much code in Scheme as possible rather
than in other languages.

These days it is possible to write extensible applications almost
entirely from high-level languages, through byte-code and native
compilation, speed gains in the underlying hardware, and foreign call
interfaces in the high-level language. Smalltalk systems are like
this, as are Common Lisp-based systems. While there already are a
number of pure-Guile applications out there, users still need to drop
down to C for some tasks: interfacing to system libraries that don't
have prebuilt Guile interfaces, and for some tasks requiring high
performance.

The addition of the virtual machine in Guile 2.0, together with the
compiler infrastructure, should go a long way to addressing the speed
issues. But there is much optimization to be done. Interested
contributors will find lots of delightful low-hanging fruit, from
simple profile-driven optimization to hacking a just-in-time compiler
from VM bytecode to native code.

Still, even with an all-Guile application, sometimes you want to
provide an opportunity for users to extend your program from a
language with a syntax that is closer to C, or to Python. Another
interesting idea to consider is compiling e.g. Python to Guile. It's
not that far-fetched of an idea: see for example IronPython or JRuby.

And then there's Emacs itself. Though there is a somewhat-working
Emacs Lisp translator for Guile, it cannot yet execute all of Emacs
Lisp. A serious integration of Guile with Emacs would replace the
Elisp virtual machine with Guile, and provide the necessary C shims so
that Guile could emulate Emacs' C API. This would give lots of
exciting things to Emacs: native threads, a real object system, more
sophisticated types, cleaner syntax, and access to all of the Guile
extensions.

Finally, there is another axis of crystallization, the axis between
different Scheme implementations. Guile does not yet support the
latest Scheme standard, R6RS, and should do so. Like all standards,
R6RS is imperfect, but supporting it will allow more code to run on
Guile without modification, and will allow Guile hackers to produce
code compatible with other Schemes. Help in this regard would be much
appreciated.




* vm status update
@ 2009-03-06 19:52 Andy Wingo
  2009-03-06 22:31 ` Ludovic Courtès
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Wingo @ 2009-03-06 19:52 UTC
  To: guile-devel

Gentlemen, ladies: so long the hack, and so short the time. But the
Creator in her wisdom or absence has given us this moment in which to
ponder the novelties of the VM branch.

Since we last rapped together, let's see:

  * One Sunday, I decided that we couldn't honestly claim to have a
    multilingual environment without actually implementing other
    languages. So I wrote a JavaScript tokenizer, a parser, a compiler
    to GHIL, and a runtime -- a week later, it was working! I wrote more
    about it here:

    http://wingolog.org/archives/2009/02/22/ecmascript-for-guile

  * Ludovic fixed loading of large unsigned integers, and added a -o
    option to the compiler, and coalesced the Makefiles into just one
    in module/. My -j8 machine at work compiles much faster now ;)

  * I've started to think about optimization, and what's clear is that
    GHIL as it stands is too much of a pain in the ass -- you can't turn
    a ((lambda ...) ...) into a (let ... ...) without like 30 lines of
    code. I decided that having alpha-renamed variables would eliminate
    the need for <ghil-env>, and make GHIL actually readable and
    writable without loss of information.

    So I started looking at separating expansion + renaming from
    compilation, as the Scheme lords decree, but I'm not quite there
    yet. I have an expander, but we really want source information -- so
    I just fixed syncase expansion to give us source information
    corresponding to its output variables, but haven't yet figured out how
    to recover the source lexical names. But I'll get it.
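
To make the ((lambda ...) ...) to (let ... ...) rewrite concrete, here is a
minimal sketch that works on plain S-expressions rather than on GHIL. The
name `lambda-app->let' is hypothetical and not part of Guile. It only handles
fixed-arity formals, and it assumes `lambda' has not been shadowed, which is
exactly the kind of question alpha-renaming is meant to settle.

```scheme
;; Hypothetical helper, not part of Guile: rewrite
;;   ((lambda (v ...) body ...) arg ...)
;; as
;;   (let ((v arg) ...) body ...)
;; Anything that does not match that shape is returned unchanged.
(define (lambda-app->let expr)
  (if (and (pair? expr)
           (pair? (car expr))
           (eq? (car (car expr)) 'lambda)   ; assumes lambda is not shadowed
           (pair? (cdr (car expr)))
           (list? (cadr (car expr)))        ; fixed-arity formals only
           (list? (cdr expr))
           (= (length (cadr (car expr))) (length (cdr expr))))
      (let ((formals (cadr (car expr)))
            (body    (cddr (car expr)))
            (args    (cdr expr)))
        `(let ,(map list formals args) ,@body))
      expr))
```

For example, (lambda-app->let '((lambda (x y) (+ x y)) 1 2)) returns
(let ((x 1) (y 2)) (+ x y)).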

Having now looked much more at syncase, I think it's pretty great. Also
given that it finally loads quickly, and gives us source information, I
want to include it at the heart of Guile -- early on in boot-9.scm. It
goes against lazy memoization, but given that expansion is fast (and
linear), that shouldn't be a big problem. We'll see how that goes.

Syncase + GHIL without <ghil-env> also gives us the opportunity to
simplify GHIL itself, removing e.g. quasiquote in favor of syncase's
expansion. That can let us simplify the evaluator too. The interpreter
could even become threadsafe, eventually.

Anyway, that's where I am. Bug-wise we still have a bug in backtraces,
which I need to pin down at some point, and I need to update the docs --
but generally speaking we're mergeable. What do people think, should I
be working on master at some point?

Cheers,

Andy
-- 
http://wingolog.org/




* vm status update
@ 2009-02-14 22:32 Andy Wingo
  2009-02-16 11:47 ` Marijn Schouten (hkBst)
  2009-03-08 11:49 ` Neil Jerram
  0 siblings, 2 replies; 21+ messages in thread
From: Andy Wingo @ 2009-02-14 22:32 UTC
  To: guile-devel

Greets!

So, yes, it's Saturday night: but I do love Guile hacking so. (Also: my
partner is away.) So a VM status update it is!

  * The parts of the instruction stream that are mapped directly to
    "struct scm_objcode" are now aligned to 8-byte boundaries, and
    written in native endianness.

  * Much more source information propagates through the compiler and
    into the metadata now. In short, whereas before it was "expressions
    are only marked as coming from a source location if they are eq? to
    an expression read in by guile", now it is "expressions are marked
    with the source location of their containing expression, unless they
    are eq? to an expression read by guile".

    The upshot is that original source information is preserved to a
    much broader extent than before, as macro-expanded or transformed
    expressions all have some kind of anchor to the original source.

    Another ramification of this is that procedures have source
    information corresponding to where they were really defined, in
    addition to locations of their subexpressions. (program-source foo
    0) will give you that.

  * The in-bytecode metadata representation has been compressed. Now we
    associate bytecode offsets with line-column pairs, and only record
    that information when it changes. The idea is, byte N in the
    instruction stream corresponds to source info for byte M, where M <=
    N. Also, we only record the filename when it changes.

    This means that we can have more source information, as mentioned
    above, but still have objcode files of similar size.

  * The VM dispatches to signal handlers (asyncs) more often,
    specifically: on return from a call, just before a call, and on a
    tail call.

  * Stack captures are much more reliable. Before there were some bugs.
    This allows statprof to work properly, capturing the whole stack up
    to a common root.

  * I set out to optimize GOOPS, and ended up writing a new call tree
    visualizer:

    http://wingolog.org/archives/2009/02/09/visualizing-statistical-profiles-with-chartprof

    It turns out that most of the time loading GOOPS is in the compiler,
    which comes from those dynamic recompilation bits I mentioned in the
    past. So I focused on optimizing the compiler -- it is much faster
    now.

    But still, for the uses that GOOPS has, a closure is better than a
    compiler. I changed things in GOOPS so that it doesn't compile at
    runtime any more, and now on this machine GOOPS loads in something
    like 40ms. That's pretty good! Though improvements are possible, of
    course.

  * The VM now has support for separate engines. Currently the engines
    are just "regular" and "debug", defaulting to "debug". There are no
    interfaces to change this at runtime yet. But it turns out there's
    not much difference. See vm-engine.c for more details. It seems that
    native compilation would be much better than a "reckless" engine.
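
The compressed source metadata described above (record a line-column pair
only when it changes, then look up byte N via the nearest entry at some
M <= N) can be modelled as follows. This is only a sketch: the table layout
and the names `sources' and `source-at' are assumptions for illustration,
not the actual in-bytecode encoding.

```scheme
;; Hypothetical table of (offset line column) entries, sorted by offset,
;; with one entry per change in source position.
(define sources '((0 10 2) (8 11 4) (20 14 0)))

;; Return the (line column) pair in effect at bytecode offset N, that is,
;; the entry with the largest offset M such that M <= N, or #f if none.
(define (source-at table n)
  (let loop ((entries table) (best #f))
    (cond ((null? entries) best)
          ((<= (car (car entries)) n)
           (loop (cdr entries) (cdr (car entries))))
          (else best))))
```

With this table, (source-at sources 12) returns (11 4), since offset 8 is
the last recorded change at or before offset 12.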

Well, that's about it as far as changes go. And as far as status? I'm
going to update the docs for changes in the last month, then talk
seriously about a merge to master. I think it's ready.

Happy hacking,

Andy

ps. Guile finally loads faster than Python now. It's about time...
-- 
http://wingolog.org/




* vm status update
@ 2009-01-11 17:35 Andy Wingo
  2009-01-13  8:05 ` Ludovic Courtès
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Wingo @ 2009-01-11 17:35 UTC
  To: guile-devel

Hey hackers,

I just finished up a lot of typing at the manual, and I hope I'm done
with that. The net result is that the VM is documented quite thoroughly,
and the compiler as well. I'll send those documents to the list in
separate mails for inline comments.

Otherwise, in the course of documentation, I've made a few minor
cleanups, some internal name changes and such. No sense polishing a
turd, they say.

I had an idea regarding unit tests recently: since GHIL and GLIL now
have (documented!) S-expression representations, we should be able to
easily and expressively test individual compiler passes. Looking forward
to that.
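
To illustrate the shape such a test could take, here is a minimal sketch.
It does not use GHIL or GLIL: `fold-constants' is a toy stand-in for a
compiler pass and `check-pass' is a hypothetical helper, both invented for
this example. The point is only that a pass from S-expressions to
S-expressions can be checked with `equal?'.

```scheme
;; Toy stand-in for a compiler pass (not a real Guile pass):
;; fold (+ <number> <number>) into a constant, recursively.
(define (fold-constants expr)
  (if (not (list? expr))
      expr
      (let ((e (map fold-constants expr)))
        (if (and (= (length e) 3)
                 (eq? (car e) '+)
                 (number? (cadr e))
                 (number? (caddr e)))
            (+ (cadr e) (caddr e))
            e))))

;; Hypothetical test helper: run PASS on INPUT and compare the result
;; against EXPECTED, signalling an error on mismatch.
(define (check-pass pass input expected)
  (let ((actual (pass input)))
    (if (equal? actual expected)
        #t
        (error "pass produced unexpected output:" actual))))
```

For example, (check-pass fold-constants '(* (+ 1 2) x) '(* 3 x)) returns #t.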

I also had another realization: now that VM frames go into stack
structures, statprof should work with the VM. I have yet to check,
though.

Anyway, just some babblings. I'll probably switch to benchmarking and
profiling sometime soon. I also need to merge in master to vm, it's been
a while and there are probably some conflicts.

So that's my status. Happy hacking!

Andy
-- 
http://wingolog.org/




* vm status update
@ 2008-09-13 15:59 Andy Wingo
  0 siblings, 0 replies; 21+ messages in thread
From: Andy Wingo @ 2008-09-13 15:59 UTC
  To: guile-devel

Hello,

A small update. Since I wrote last, the compiler now puts program
names in with program metadata, and programs print in a much more
human-readable fashion:

    scheme@(guile-user)> module-ref
    $2 = #<program module-ref (module name . rest)>

Some bugs were fixed in disassembly, allowing the addition of the
following two sections:

    scheme@(guile-user)> ,x module-ref
    [...]
    Arguments:

       0    local[0]: module
       1    local[1]: name
       2    local[2]: rest

    Bindings:

    8-58    local[3]: variable

    [...]

The arguments show how they are allocated. All local variables in a
frame are on the stack, within the frame structure, and are accessed by
index. The other possibility is that a variable is "external", that is,
lexically bound by some enclosed lambda -- these are allocated on the
heap.

The range on the left side of a bindings listing shows the range of
instructions in which that particular local variable is bound, and what
its name is.

Currently, local variables are not reused even if their dynamic extents
are non-contiguous -- an optimization to maybe make later. For example:

    scheme@(guile-user)> ,x (lambda () (let ((x 1)) x) (let ((y 2)) y))
    Disassembly of #<program #(0 14 #f) (x)>:

    nargs = 0  nrest = 0  nlocs = 2  nexts = 0

    Bytecode:

       0    (make-int8 1)                   ;; 1
       2    (local-set 0)
       4    (make-int8 2)                   ;; 2
       6    (local-set 1)
       8    (local-ref 1)
      10    (return)

    Bindings:

     2-4    local[0]: x
    6-11    local[1]: y                     ;; could reuse local 0

    Sources:

       2    #(0 14 #f)
       6    #(0 30 #f)

It seems that the argument printing code has a bug there -- nargs is 0,
but it still prints the program as having args (x).
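
The "could reuse local 0" remark can be checked mechanically from the
Bindings listing: two locals may share a slot when their instruction ranges
do not overlap. The sketch below transcribes the two ranges from the
listing. The names `bindings' and `could-share-slot?' are hypothetical, and
treating the ranges as closed intervals is an assumption made to stay on
the conservative side.

```scheme
;; Each binding is (start end slot name), transcribed from the listing.
(define bindings '((2 4 0 x) (6 11 1 y)))

;; Two bindings could share a slot if their instruction ranges,
;; treated here as closed intervals, do not overlap.
(define (could-share-slot? a b)
  (or (< (cadr a) (car b))
      (< (cadr b) (car a))))
```

Here (could-share-slot? (car bindings) (cadr bindings)) returns #t, because
the range of x ends at 4 and the range of y starts at 6.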

An important bug was fixed when compiling `or' forms when the value
would be discarded, as in `(begin (or #t (error "what")) 4)' -- an extra
value would be left on the stack. You should recompile all your .go
files when you pull.
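
A small sanity check for this case, as a sketch one might run after
recompiling. The procedure name `or-value-discarded' is hypothetical. The
`or' short-circuits on #t, so `error' is never called, and the value of the
`begin' form is that of its last expression.

```scheme
;; The form from the bug report, wrapped in a procedure. The value of
;; the `or' is discarded; the result should simply be 4.
(define (or-value-discarded)
  (begin (or #t (error "what")) 4))
```

Calling (or-value-discarded) should return 4.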

Currently I'm working on implementing multiple-values support, mostly as
in Ashley and Dybvig's paper,
http://repository.readscheme.org/ftp/papers/jmashley/lfp94.pdf. Instead
of having the multiple-value return address being a fixed offset behind
the normal return address in the instruction stream, however, I'm just
going to push the MV return address on the stack, behind the normal
return address.
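
For readers unfamiliar with multiple values at the Scheme level, here is a
minimal sketch of what this support has to make work. It uses only the
standard `values' and `call-with-values'. The procedure names are made up
for the example, and nothing here reflects how the VM lays out return
addresses.

```scheme
;; Return two values: the quotient and the remainder of N divided by D.
(define (div-and-mod n d)
  (values (quotient n d) (remainder n d)))

;; Receive both values in a consumer. The consumer here is `list',
;; which plays the role of a multiple-value continuation.
(define (q-and-r n d)
  (call-with-values
    (lambda () (div-and-mod n d))
    list))
```

For example, (q-and-r 17 5) returns (3 2).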

If you are interested in helping with guile-vm, just download it and
give it a whirl, see if it works for you. If your program doesn't do
call/cc it should work fine. I'm interested in any bugs!

Cheers,

Andy
-- 
http://wingolog.org/





Thread overview: 21+ messages
2008-12-26 17:24 vm status update Andy Wingo
2008-12-28 22:50 ` Neil Jerram
2009-01-05 16:06   ` Ludovic Courtès
2009-01-05 16:45     ` Neil Jerram
2009-01-05 19:53       ` Ludovic Courtès
2009-01-05 17:57     ` Andy Wingo
2009-01-05 21:03       ` Ludovic Courtès
2009-01-06  9:52         ` Andy Wingo
2009-01-06 14:54           ` Ludovic Courtès
  -- strict thread matches above, loose matches on Subject: below --
2009-03-06 19:52 Andy Wingo
2009-03-06 22:31 ` Ludovic Courtès
2009-03-08 22:40   ` Neil Jerram
2009-03-10 21:04     ` Andy Wingo
2009-02-14 22:32 Andy Wingo
2009-02-16 11:47 ` Marijn Schouten (hkBst)
2009-03-08 11:49 ` Neil Jerram
2009-03-10 21:36   ` Andy Wingo
2009-03-12 20:49     ` Neil Jerram
2009-01-11 17:35 Andy Wingo
2009-01-13  8:05 ` Ludovic Courtès
2008-09-13 15:59 Andy Wingo
