unofficial mirror of bug-guile@gnu.org 
* bug#26711: Multithreading segfaults
@ 2017-04-29 11:16 Jacek Swiergocki
  2017-04-30 21:59 ` bug#26711: Example1 is buggy Linas Vepstas
From: Jacek Swiergocki @ 2017-04-29 11:16 UTC
  To: 26711


Hi all,

I have two examples of multithreaded code that crash with a segmentation
fault. If there is a bug in guile, please fix it. If the problem is only
in my code, please help me work around it.

I am using Ubuntu 14.04 and guile compiled from the repository. The
examples are compiled with:
g++ demo.cc -Wall -std=c++11 -pthread -I/usr/local/test/include/guile/2.2 -L/usr/local/test/lib -lguile-2.2 -lgc

The first example started to segfault with guile tagged v2.1.7. I have
not encountered problems with v2.1.6 or older versions. However, the recent
version v2.2.2 requires many more iterations to fail than v2.1.7, which
fails instantly.

////////////////////////////////////////////////////////////////////////////
// Example 1.

#include <libguile.h>

#include <cstdio>
#include <thread>
#include <vector>
#include <atomic>
#include <mutex>

static volatile bool hold = true;
static std::atomic_int start_cnt(0);

static std::mutex init_once_mtx;
static bool start_inited_once = false;
static bool is_inited_once = false;

static std::mutex gc_mtx;

class Eval
{
public:
    Eval();
    ~Eval();
};

void* c_wrap_init(void*)
{
    return nullptr;
}

Eval::Eval()
{
    scm_with_guile(c_wrap_init, this);
}

Eval::~Eval()
{
    std::lock_guard<std::mutex> lck(gc_mtx);
    scm_gc();
}

void* c_wrap_init_only_once(void*)
{
    is_inited_once = true;
    return nullptr;
}

void init_only_once()
{
    std::lock_guard<std::mutex> lck(init_once_mtx);
    if (!start_inited_once)
    {
        start_inited_once = true;
        scm_with_guile(c_wrap_init_only_once, nullptr);
    }
    while (!is_inited_once); // spin;
}

void threadedInit(int thread_id)
{
    start_cnt ++;
    while (hold) {} // spin

    init_only_once();
    Eval* ev = new Eval();

    delete ev;
}

void test_init_race()
{
    int n_threads = 120;
    start_cnt = 0;
    hold = true;

    std::vector<std::thread> thread_pool;
    for (int i = 0; i < n_threads; i++)
        thread_pool.push_back(std::thread(&threadedInit, i));

    while (start_cnt != n_threads) {}  // spin
    printf("Done creating %d threads\n", n_threads);
    hold = false;

    for (std::thread& t : thread_pool) t.join();
    printf("Done joining %d threads\n", n_threads);
}

int main()
{
    for (int k = 0; k < 10000; k++)
    {
        test_init_race();
        printf("------------------ done iteration %d\n", k);
    }
}

The second example requires many more iterations to crash with a segfault:
sometimes hundreds, sometimes thousands; it seems to be random. You may need to
wait more than a dozen minutes, and sometimes you need to restart and try
again. I have found this problem in old versions, e.g. v2.0.11, as well as in
the recent version v2.2.2, so it seems to be an old problem.

////////////////////////////////////////////////////////////////////////////
// Example 2.

#include <libguile.h>

#include <cstdio>
#include <thread>
#include <vector>
#include <atomic>
#include <mutex>

static volatile bool hold = true;
static std::atomic_int start_cnt(0);

static std::mutex init_once_mtx;
static bool start_inited_once = false;
static bool is_inited_once = false;

void* c_wrap_init_only_once(void*)
{
    is_inited_once = true;
    return nullptr;
}

void* c_wrap_eval(void*)
{
    return nullptr;
}

void init_only_once()
{
    std::lock_guard<std::mutex> lck(init_once_mtx);
    if (!start_inited_once)
    {
        start_inited_once = true;
        scm_with_guile(c_wrap_init_only_once, nullptr);
    }
    while (!is_inited_once); // spin;
}

void threadedInit(int thread_id)
{
    start_cnt ++;
    while (hold) {} // spin

    init_only_once();
    for (int i = 0; i < 100; ++i)
    {
        scm_with_guile(c_wrap_eval, nullptr);
    }
}

void test_init_race()
{
    int n_threads = 120;
    start_cnt = 0;
    hold = true;

    std::vector<std::thread> thread_pool;
    for (int i = 0; i < n_threads; i++)
        thread_pool.push_back(std::thread(&threadedInit, i));

    while (start_cnt != n_threads) {}  // spin
    printf("Done creating %d threads\n", n_threads);
    hold = false;

    for (std::thread& t : thread_pool) t.join();
    printf("Done joining %d threads\n", n_threads);
}

int main()
{
    for (int k = 0; k < 10000; k++)
    {
        test_init_race();
        printf("------------------ done iteration %d\n", k);
    }
}

--
Jacek


* bug#26711: Example1 is buggy
  2017-04-29 11:16 bug#26711: Multithreading segfaults Jacek Swiergocki
@ 2017-04-30 21:59 ` Linas Vepstas
From: Linas Vepstas @ 2017-04-30 21:59 UTC
  To: 26711


Example1.cc has a work-around: main() needs to call scm_init_guile()
or scm_with_guile(). If this is done, the problem goes away.

The problem with example1 is that the first thread to initialize guile is
eventually destroyed. However, the first thread to call guile never ever
sets "needs_unregister" in libguile/threads.c and thus, bdwgc never finds
out that this thread no longer exists. Sooner or later, bdwgc touches this
non-existent thread, and crashes.

If it's OK to initialize guile for the first time ever in a transient
thread, then there's a bug in guile; otherwise there's a bug in the example.

I'm now looking into example2.

--linas
