unofficial mirror of guile-user@gnu.org 
From: Linas Vepstas <linasvepstas@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: Guile User <guile-user@gnu.org>
Subject: Re: Guile bugs
Date: Tue, 19 Sep 2017 19:04:21 +0800
Message-ID: <CAHrUA36G9d6jsZuKbBh_vpp+mVbfTLARxFv7Hifazc_9G5gRoA@mail.gmail.com>
In-Reply-To: <871sn875tl.fsf@gnu.org>

Hi Ludo,

On Fri, Sep 15, 2017 at 3:56 PM, Ludovic Courtès <ludo@gnu.org> wrote:

> Linas Vepstas <linasvepstas@gmail.com> skribis:
>
> > On Mon, Sep 11, 2017 at 2:26 AM, Ludovic Courtès <ludo@gnu.org> wrote:
> >
> >> Hello,
> >>
> >> Linas Vepstas <linasvepstas@gmail.com> skribis:
> >>
> >> > The stuff coming over the network sockets are bytes, not s-exps. Since
> >> > none of the bytes are ever zero, they are effectively C/C++ strings,
> >> > and are handled as such. These C strings are sent to scm_eval_string()
> >> > wrapped by scm_c_catch().
> >>
> >> I don’t know to what extent that is applicable to your software, but my
> >> recommendation would be to treat that network socket as a Scheme port,
> >> pass it to ‘read’, and pass the result to ‘eval’ (as opposed to reading
> >> the whole string from C++ and passing it to ‘scm_eval_string’.)
> >>
> >
> > Why?  What advantage does this offer?
>
> It avoids copies and conversions, which is a big deal if you deal with
> very big strings.
>
> > It's not clear that guile eval is smart enough to manage a network socket
> > -- if the user starts a long-running process with intermittent prints,
> > will it send that to the socket?  What if the user hits ctrl-C in the
> > middle of it all? What if the code that came over the socket happened to
> > throw an exception?
>
> These are important considerations, but it’s not eval’s business IMO.
> Instead, I suggest building your own protocol around it, and having a
> way in that protocol to report both exceptions and normal returns.
>

Well, yes, this is exactly what I've done.
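
To make "report both exceptions and normal returns" concrete, here is a
minimal sketch in C++ of one way such a reply could be framed. It is an
illustration only, not the protocol my server actually speaks: the Reply
struct, the OK/ERR status words, the length-prefixed layout, and the
function names are all assumptions made up for this example, and the
evaluator is passed in as an ordinary callable rather than being a real
call into guile.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical reply frame: a status word, the payload length, a newline,
// then the payload bytes. None of these names come from Guile.
struct Reply {
    bool ok;
    std::string payload;
};

// Run one evaluation and capture either its result or its error text.
Reply run_guarded(const std::function<std::string()>& eval) {
    try {
        return {true, eval()};
    } catch (const std::exception& e) {
        return {false, e.what()};
    }
}

// Serialize a reply so the client can tell a value from an exception.
std::string encode(const Reply& r) {
    return std::string(r.ok ? "OK " : "ERR ") +
           std::to_string(r.payload.size()) + "\n" + r.payload;
}

// Parse a frame produced by encode(); throws on malformed input.
Reply decode(const std::string& frame) {
    const auto sp = frame.find(' ');
    const auto nl = frame.find('\n');
    if (sp == std::string::npos || nl == std::string::npos || sp > nl)
        throw std::invalid_argument("malformed frame");
    const std::string status = frame.substr(0, sp);
    if (status != "OK" && status != "ERR")
        throw std::invalid_argument("unknown status");
    const std::size_t len = std::stoul(frame.substr(sp + 1, nl - sp - 1));
    if (frame.size() - nl - 1 != len)
        throw std::invalid_argument("length mismatch");
    return {status == "OK", frame.substr(nl + 1)};
}
```

The length prefix lets the payload contain newlines or spaces without
confusing the reader on the other end of the socket.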

This conversation is frustrating: either piping read to eval is the right
thing to do, in which case eval must handle network connections correctly,
or else piping read to eval is the wrong thing to do.  You can't have it
both ways.


> > I've had to deal with all of these issues in the past, and have a stable
> > code base; but if I had to start all over again, it's not clear that these
> > issues have gone away.  I mean, eval was designed to eval -- it was not
> > designed to support multi-threaded, concurrent network operations, right?
>
> Right.
>
> > To support my point: the default guile network REPL server is painfully
> > slow, and frequently crashes/hangs. It works well enough to do some demos
> > but is not stable enough for production use ... if it's just read+eval,
> > that might explain why it's unstable.
>
> I’ve never noticed slowness of the REPL server, nor crashes.
>

You are probably using it only very lightly, and not in a high-load systems
environment. It runs maybe 5x slower than my current guile shell server,
and it is very definitely unstable and crashy.

In my environment, I am sending it roughly one to twenty
scheme expressions every second, with a new socket opened for each scheme
expression. This goes on for days or weeks. I am using a custom guile
server written in C++, which accepts network connections, reads bytes from
the network, and sends them to scm_eval_string(). It mostly works fine,
with a couple of problems: there seems to be a pointless utf8-utf32
conversion, which started this email chain.

There also seems to be some sort of very rare race condition in the
compiler that leads to corruption inside of guile. I believe that this can
be triggered by starting twenty threads (for example) and then compiling
and running fairly short programs in each thread. By "fairly short" I mean
"less than 5-10 lines of code", and which compute and return answers in
less than a tenth of a second. Doing this for a few hours eventually causes
guile to hang in a spinloop, trying to read some guile-internal structure
that has invalid data in it. I opened a bug report for this a month or two
ago, but did not supply an easy-to-trigger test case.

I tried replacing my guile network server with the REPL shell, and
discovered that the REPL server is much much slower; I don't recall exactly
how I measured the 5x number, but that was from an actual measurement.  I
don't think the REPL server can handle 20 network connections per second.

I remember hypothesizing that guile was being re-initialized for every
network connection. Obviously, this is wasteful and slow.

Entering guile is a large bottleneck.  I once measured this, and I think it
takes approximately 200 microseconds to enter guile, which implies a
maximum limit of about 5K guile evaluations per second, when using the
simple-minded design of having the C code enter guile each time before
evaluating an expression.  By contrast, python (cython) can be entered in
10 or 20 microseconds.

The test case here is how many times per second can one eval some simple
expression, e.g. (+ 2 2) or the equivalent of that in python.
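
For concreteness, the arithmetic and the kind of measurement I mean can be
sketched in C++ as follows. This is only a sketch: the two function names
are made up for this example, and the callable handed to the timing
function is a stand-in for "enter the interpreter and eval (+ 2 2)", not a
real guile call.

```cpp
#include <chrono>
#include <cstddef>

// Upper bound on calls per second when each call pays a fixed overhead,
// given in microseconds: 200 us per entry gives 5000 calls per second.
constexpr double max_calls_per_second(double overhead_us) {
    return 1e6 / overhead_us;
}

// Time `iterations` calls of `f` and return the observed calls per second.
template <typename F>
double measure_calls_per_second(F&& f, std::size_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) f();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return static_cast<double>(iterations) / elapsed.count();
}
```

Passing a lambda that performs one complete enter-and-eval round trip to
measure_calls_per_second() gives the number to compare against the bound.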

The solution for the heavy cost of entering guile is to create a pool of a
few dozen threads, enter guile in each, and then never exit -- just return
each thread to the pool when its eval is completed and the thread is
no longer needed.  This cuts the 200 microsecond overhead to zero, and
what one is then left with is the cost of calling scm_eval_string().  I did
measure that too, but I don't recall the numbers.
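
The shape of that pool, stripped of anything guile-specific, is roughly the
following C++ sketch. It is not my actual server code: EvalPool,
enter_runtime() and runtime_entries are names invented for this example,
and enter_runtime() is only a counter standing in for the expensive "enter
guile" step. The point it illustrates is that the entry cost is paid once
per worker thread rather than once per job.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Counts how often the (hypothetical) interpreter entry cost is paid.
std::atomic<int> runtime_entries{0};

// Stand-in for the expensive one-time "enter guile" step; not a Guile API.
void enter_runtime() { ++runtime_entries; }

class EvalPool {
public:
    explicit EvalPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }

    // Lets the workers drain any queued jobs, then joins every worker.
    ~EvalPool() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void worker_loop() {
        enter_runtime();  // paid once per thread, not once per job
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

With four workers and a hundred submitted jobs, enter_runtime() runs four
times in total, no matter how many jobs go through the pool.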


> That said, if you run a REPL server in a separate thread and mutate the
> global state of the program, you could possibly crash it—no wonders
> here.
>

Yes, well, I would call that a bug! It feels like you are trying to blame
me for a guile bug -- it's not my fault that it crashes!

I did not look very carefully, and don't recall what the stack traces
looked like, but I got the impression that there were race conditions in
guile init, and how it interacted with the sockets.

> Likewise, the REPL server is meant to be used for debugging on
> localhost.  If you talk to a REPL server over the network with high
> latency, it’s going to be slow, not surprisingly.
>

The performance problem was not the latency, it was the number of
connections it could accept.

I'll say it again: I have a different network server that is 5x faster than
the REPL server, and it works, it is stable.

For reasons completely unrelated to guile, I would like to declare my
network server deprecated and obsolete.  However, I cannot do this, because
the guile REPL server is not yet good enough to be an adequate replacement.

--linas

>
> So yes, I find the REPL server to be a really pleasant tool when
> debugging an application locally, but that’s all it is—it’s not a remote
> procedure call framework or anything like that.
>

> Thanks,
> Ludo’.
>



-- 
*"The problem is not that artificial intelligence will get too smart and
take over the world," computer scientist Pedro Domingos writes, "the
problem is that it's too stupid and already has." *

