unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Locks and threads
@ 2009-02-11 22:31 Neil Jerram
  2009-02-11 23:05 ` Neil Jerram
                   ` (4 more replies)
  0 siblings, 5 replies; 51+ messages in thread
From: Neil Jerram @ 2009-02-11 22:31 UTC (permalink / raw)
  To: Guile Development; +Cc: Linas Vepstas

[-- Attachment #1: Type: text/plain, Size: 727 bytes --]

I've started working through the lock ordering and threading issues
that we have.  My plan is (with 1.8.x)

- first to address problems reported by helgrind (since I think we
  should take advantage of external tools before adding debug code to
  Guile internally)

- then to run Linas's define-race program, and address the problems
  that that shows up (including thread-safe define, hopefully)

- then (or maybe as part of the previous step) to look again at
  Linas's lock order debugging patch, to see if it would make sense to
  apply some of that.

Does that sound sensible; have I missed anything?

A patch for the first helgrind problem is attached; any comments would
be appreciated, as ever.

Regards,
        Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Lock-ordering-don-t-lock-heap_mutex-and-then-thread.patch --]
[-- Type: text/x-diff, Size: 2882 bytes --]

From 3dcf473e9e535314fc6dedc99ff8d9132e7a3e3e Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 11 Feb 2009 21:22:57 +0000
Subject: [PATCH] Lock ordering: don't lock heap_mutex and then thread_admin_mutex

This fixes the following helgrind report.

Thread #1: lock order "0x4325084 before 0x4105328" violated
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40D01EA: scm_i_thread_put_to_sleep (threads.c:1622)
   by 0x4078958: scm_make_fluid (fluids.c:114)
   by 0x40778D6: scm_init_feature (feature.c:101)
   by 0x408C62E: scm_i_init_guile (init.c:464)
   by 0x40D1873: scm_i_init_thread_for_guile (threads.c:583)
   by 0x40D18B4: scm_i_with_guile_and_parent (threads.c:725)
   by 0x40D19BD: scm_with_guile (threads.c:714)
   by 0x408C42E: scm_boot_guile (init.c:350)
   by 0x8048710: main (guile.c:69)
  Required order was established by acquisition of lock at 0x4325084
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40CFFA4: guilify_self_1 (threads.c:468)
   by 0x408C58B: scm_i_init_guile (init.c:421)
   by 0x40D1873: scm_i_init_thread_for_guile (threads.c:583)
   by 0x40D18B4: scm_i_with_guile_and_parent (threads.c:725)
   by 0x40D19BD: scm_with_guile (threads.c:714)
   by 0x408C42E: scm_boot_guile (init.c:350)
   by 0x8048710: main (guile.c:69)
  followed by a later acquisition of lock at 0x4105328
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40CFFAC: guilify_self_1 (threads.c:470)
   by 0x408C58B: scm_i_init_guile (init.c:421)
   by 0x40D1873: scm_i_init_thread_for_guile (threads.c:583)
   by 0x40D18B4: scm_i_with_guile_and_parent (threads.c:725)
   by 0x40D19BD: scm_with_guile (threads.c:714)
   by 0x408C42E: scm_boot_guile (init.c:350)
   by 0x8048710: main (guile.c:69)

* threads.c (guilify_self_1): Add self to global thread list _before_
  entering Guile mode.
---
 libguile/threads.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/libguile/threads.c b/libguile/threads.c
index ba17746..05e01e3 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -465,13 +465,19 @@ guilify_self_1 (SCM_STACKITEM *base)
 
   scm_i_pthread_setspecific (scm_i_thread_key, t);
 
-  scm_i_pthread_mutex_lock (&t->heap_mutex);
-
+  /* As soon as this thread adds itself to the global thread list, the
+     GC may think that it has a stack that needs marking.  Therefore
+     initialize t->top to be the same as t->base, just in case GC runs
+     before the thread can lock its heap_mutex for the first time. */
+  t->top = t->base;
   scm_i_pthread_mutex_lock (&thread_admin_mutex);
   t->next_thread = all_threads;
   all_threads = t;
   thread_count++;
   scm_i_pthread_mutex_unlock (&thread_admin_mutex);
+
+  /* Enter Guile mode. */
+  scm_enter_guile (t);
 }
 
 /* Perform second stage of thread initialisation, in guile mode.
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 22:31 Locks and threads Neil Jerram
@ 2009-02-11 23:05 ` Neil Jerram
  2009-02-11 23:32   ` Ludovic Courtès
  2009-02-11 23:30 ` Linas Vepstas
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-02-11 23:05 UTC (permalink / raw)
  To: Guile Development

[-- Attachment #1: Type: text/plain, Size: 241 bytes --]

Neil Jerram <neil@ossau.uklinux.net> writes:

> - first to address problems reported by helgrind (since I think we
>   should take advantage of external tools before adding debug code to
>   Guile internally)

Here's the next one.

   Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Don-t-leave-Guile-mode-to-lock-async_mutex.patch --]
[-- Type: text/x-diff, Size: 3403 bytes --]

From 76f55c5796f1fc7aca6c36bc57f06bab72300a94 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 11 Feb 2009 23:03:16 +0000
Subject: [PATCH] Don't leave Guile mode to lock async_mutex

This fixes lots of helgrind-reported problems, such as:

Thread #1: lock order "0x4325084 before 0x4108928" violated
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40CFD37: scm_enter_guile (threads.c:377)
   by 0x40D0284: scm_pthread_mutex_lock (threads.c:1485)
   by 0x405A736: scm_async_click (async.c:155)
   by 0x406F9EE: deval (eval.c:4080)
   by 0x40761D9: scm_primitive_eval_x (eval.c:5921)
   by 0x40AD20E: install_handler (scmsigs.c:113)
   by 0x40AD402: scm_sigaction_for_thread (scmsigs.c:394)
   by 0x4087D1F: scm_gsubr_apply (gsubr.c:223)
   by 0x406DF55: scm_dapply (eval.c:4930)
   by 0x407147C: deval (eval.c:4378)
   by 0x406E1BD: scm_dapply (eval.c:5012)
  Required order was established by acquisition of lock at 0x4325084
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40CFD37: scm_enter_guile (threads.c:377)
   by 0x408C58B: scm_i_init_guile (init.c:421)
   by 0x40D1873: scm_i_init_thread_for_guile (threads.c:589)
   by 0x40D18B4: scm_i_with_guile_and_parent (threads.c:731)
   by 0x40D19BD: scm_with_guile (threads.c:720)
   by 0x408C42E: scm_boot_guile (init.c:350)
   by 0x8048710: main (guile.c:69)
  followed by a later acquisition of lock at 0x4108928
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40AFD61: scm_make_smob_type (smob.c:294)
   by 0x40AFF19: scm_smob_prehistory (smob.c:512)
   by 0x408C595: scm_i_init_guile (init.c:423)
   by 0x40D1873: scm_i_init_thread_for_guile (threads.c:589)
   by 0x40D18B4: scm_i_with_guile_and_parent (threads.c:731)
   by 0x40D19BD: scm_with_guile (threads.c:720)
   by 0x408C42E: scm_boot_guile (init.c:350)
   by 0x8048710: main (guile.c:69)

* libguile/async.c (scm_async_click): Don't leave Guile mode when
  locking async_mutex.  We don't need to, because none of the code
  that has async_mutex locked can block, and doing so may lead to lock
  ordering problems between async_mutex and a thread's heap_mutex.
---
 libguile/async.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/libguile/async.c b/libguile/async.c
index a9763da..56634d5 100644
--- a/libguile/async.c
+++ b/libguile/async.c
@@ -152,7 +152,7 @@ scm_async_click ()
      invoked even when pending_asyncs is zero.
   */
 
-  scm_i_scm_pthread_mutex_lock (&async_mutex);
+  scm_i_pthread_mutex_lock (&async_mutex);
   t->pending_asyncs = 0;
   if (t->block_asyncs == 0)
     {
@@ -197,7 +197,7 @@ scm_i_queue_async_cell (SCM c, scm_i_thread *t)
   int sleep_fd;
   SCM p;
   
-  scm_i_scm_pthread_mutex_lock (&async_mutex);
+  scm_i_pthread_mutex_lock (&async_mutex);
   p = t->active_asyncs;
   SCM_SETCDR (c, SCM_EOL);
   if (!scm_is_pair (p))
@@ -263,7 +263,7 @@ scm_i_setup_sleep (scm_i_thread *t,
 {
   int pending;
 
-  scm_i_scm_pthread_mutex_lock (&async_mutex);
+  scm_i_pthread_mutex_lock (&async_mutex);
   pending = t->pending_asyncs;
   if (!pending)
     {
@@ -278,7 +278,7 @@ scm_i_setup_sleep (scm_i_thread *t,
 void
 scm_i_reset_sleep (scm_i_thread *t)
 {
-  scm_i_scm_pthread_mutex_lock (&async_mutex);
+  scm_i_pthread_mutex_lock (&async_mutex);
   t->sleep_object = SCM_BOOL_F;
   t->sleep_mutex = NULL;
   t->sleep_fd = -1;
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 22:31 Locks and threads Neil Jerram
  2009-02-11 23:05 ` Neil Jerram
@ 2009-02-11 23:30 ` Linas Vepstas
  2009-02-11 23:53   ` Neil Jerram
  2009-02-11 23:30 ` Ludovic Courtès
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 51+ messages in thread
From: Linas Vepstas @ 2009-02-11 23:30 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/2/11 Neil Jerram <neil@ossau.uklinux.net>:
> I've started working through the lock ordering and threading issues
> that we have.  My plan is (with 1.8.x)
>
> - first to address problems reported by helgrind (since I think we
>  should take advantage of external tools before adding debug code to
>  Guile internally)
>
> - then to run Linas's define-race program, and address the problems
>  that that shows up (including thread-safe define, hopefully)
>
> - then (or maybe as part of the previous step) to look again at
>  Linas's lock order debugging patch, to see if it would make sense to
>  apply some of that.
>
> Does that sound sensible; have I missed anything?

Err, sort of, yes, unless I misunderstand.  Guile 1.8 makes
a certain basic assumption that is splattered throughout
the code; it rather intentionally re-orders the order in which
one of the locks is taken.  If I remember correctly, its the
"in guile mode" lock.  So if you just go looking for locks
that are released out-of-order, you'll find lots of these.

At the time, I had decided that
1) it would be a lot of work to get these in order, and the
patch would likely be rejected, and
2) the reordering is essentially harmless (since its
consistently done).
3) there might have even been a performance hit (I don't
remember) by trying to get these into order.

This made using valgrind impossible, and that's why I created
the custom patch -- it made a point of ignoring this one
reversed-order, while checking for badness in everything else.

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 22:31 Locks and threads Neil Jerram
  2009-02-11 23:05 ` Neil Jerram
  2009-02-11 23:30 ` Linas Vepstas
@ 2009-02-11 23:30 ` Ludovic Courtès
  2009-02-12 12:55 ` Greg Troxel
  2009-03-04 23:49 ` Neil Jerram
  4 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-02-11 23:30 UTC (permalink / raw)
  To: guile-devel

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> I've started working through the lock ordering and threading issues
> that we have.  My plan is (with 1.8.x)
>
> - first to address problems reported by helgrind (since I think we
>   should take advantage of external tools before adding debug code to
>   Guile internally)
>
> - then to run Linas's define-race program, and address the problems
>   that that shows up (including thread-safe define, hopefully)
>
> - then (or maybe as part of the previous step) to look again at
>   Linas's lock order debugging patch, to see if it would make sense to
>   apply some of that.
>
> Does that sound sensible; have I missed anything?

Yes.  I think Helgrind can be very helpful.

> diff --git a/libguile/threads.c b/libguile/threads.c
> index ba17746..05e01e3 100644
> --- a/libguile/threads.c
> +++ b/libguile/threads.c
> @@ -465,13 +465,19 @@ guilify_self_1 (SCM_STACKITEM *base)
>  
>    scm_i_pthread_setspecific (scm_i_thread_key, t);
>  
> -  scm_i_pthread_mutex_lock (&t->heap_mutex);
> -
> +  /* As soon as this thread adds itself to the global thread list, the
> +     GC may think that it has a stack that needs marking.  Therefore
> +     initialize t->top to be the same as t->base, just in case GC runs
> +     before the thread can lock its heap_mutex for the first time. */
> +  t->top = t->base;
>    scm_i_pthread_mutex_lock (&thread_admin_mutex);
>    t->next_thread = all_threads;
>    all_threads = t;
>    thread_count++;
>    scm_i_pthread_mutex_unlock (&thread_admin_mutex);
> +
> +  /* Enter Guile mode. */
> +  scm_enter_guile (t);

Looks good.  `scm_enter_guile ()' does a bit more than the
`scm_i_pthread_mutex_lock ()' call that was removed, but it appears to
be OK.

Thanks for looking into this!

Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 23:05 ` Neil Jerram
@ 2009-02-11 23:32   ` Ludovic Courtès
  0 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-02-11 23:32 UTC (permalink / raw)
  To: guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> * libguile/async.c (scm_async_click): Don't leave Guile mode when
>   locking async_mutex.  We don't need to, because none of the code
>   that has async_mutex locked can block, and doing so may lead to lock
>   ordering problems between async_mutex and a thread's heap_mutex.

[...]

> -  scm_i_scm_pthread_mutex_lock (&async_mutex);
> +  scm_i_pthread_mutex_lock (&async_mutex);

This one was tricky.  I took me a fair amount of M-. to grasp the
difference between these two variants.

Thanks!

Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 23:30 ` Linas Vepstas
@ 2009-02-11 23:53   ` Neil Jerram
  2009-02-12  0:18     ` Linas Vepstas
  2009-02-12 20:51     ` Ludovic Courtès
  0 siblings, 2 replies; 51+ messages in thread
From: Neil Jerram @ 2009-02-11 23:53 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> Err, sort of, yes, unless I misunderstand.  Guile 1.8 makes
> a certain basic assumption that is splattered throughout
> the code; it rather intentionally re-orders the order in which
> one of the locks is taken.  If I remember correctly, its the
> "in guile mode" lock.  So if you just go looking for locks
> that are released out-of-order, you'll find lots of these.

Yes, I think I understand this now (having seen it myself).  The
pattern is

- thread holding its heap_mutex - which is the normal state in guile
  mode

- thread calls scm_i_scm_pthread_mutex_lock to lock some other mutex

- scm_i_scm_pthread_mutex_lock:

  - unlocks the heap_mutex

  - locks the other mutex

  - locks the heap mutex again.

That in itself doesn't actually cause an ordering problem, but then
the thread releases the other mutex without releasing the heap mutex
first - which is perceived (by helgrind at least) as a problem.

(Is something like this actually _ever_ a problem?  If locks are
always _acquired_ in the right order, how can the order of _releasing_
ever cause a problem?)

The async_mutex handling (that I've posted a patch for) is one example
of this.

> At the time, I had decided that
> 1) it would be a lot of work to get these in order, and the
> patch would likely be rejected, and
> 2) the reordering is essentially harmless (since its
> consistently done).
> 3) there might have even been a performance hit (I don't
> remember) by trying to get these into order.

The other thing to bear in mind is that 99% of this will just
evaporate if we move to BDW-GC for 2.0.x; so - assuming we do end up
doing that - it makes sense to take a slightly more pragmatic approach
than normal for 1.8.x.

> This made using valgrind impossible, and that's why I created
> the custom patch -- it made a point of ignoring this one
> reversed-order, while checking for badness in everything else.

Thanks.  I understand this much better now!  On the other hand, after
the async_mutex patch, my helgrind output [1] is only reporting a
couple of problems now, so it looks like helgrind-cleanliness might be
achievable.

[1] I am only running a basic startup test, though: "valgrind
--tool=helgrind guile -q <<EOF".  Were you running something a lot
more complex than that?

Regards,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 23:53   ` Neil Jerram
@ 2009-02-12  0:18     ` Linas Vepstas
  2009-02-12 20:51     ` Ludovic Courtès
  1 sibling, 0 replies; 51+ messages in thread
From: Linas Vepstas @ 2009-02-12  0:18 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

[-- Attachment #1: Type: text/plain, Size: 2619 bytes --]

2009/2/11 Neil Jerram <neil@ossau.uklinux.net>:
> Linas Vepstas <linasvepstas@gmail.com> writes:
>
>> Err, sort of, yes, unless I misunderstand.  Guile 1.8 makes
>> a certain basic assumption that is splattered throughout
>> the code; it rather intentionally re-orders the order in which
>> one of the locks is taken.  If I remember correctly, its the
>> "in guile mode" lock.  So if you just go looking for locks
>> that are released out-of-order, you'll find lots of these.
>
> Yes, I think I understand this now (having seen it myself).  The
> pattern is
>
> - thread holding its heap_mutex - which is the normal state in guile

Yes, that's the one.


> That in itself doesn't actually cause an ordering problem, but then
> the thread releases the other mutex without releasing the heap mutex
> first - which is perceived (by helgrind at least) as a problem.
>
> (Is something like this actually _ever_ a problem?  If locks are
> always _acquired_ in the right order, how can the order of _releasing_
> ever cause a problem?)

Yes, it can be a problem; I don't want to dream up
some particular scenario (this stuff destroys brain
cells) ... but I do vaguely remember skimming some
wikipedia article on locking that discussed this.
I think the scenario involves three locks, though.

This is why helgrind checks lock order -- its one of
the locking problems it can actually detect.  However,
for the case of guile, the heap mutex is not visible to
anything that isn't guile, and thus, its safe in this
particular case. If there were outside users, things
would be different.


> The other thing to bear in mind is that 99% of this will just
> evaporate if we move to BDW-GC for 2.0.x; so - assuming we do end up
> doing that - it makes sense to take a slightly more pragmatic approach
> than normal for 1.8.x.

Sure. As a reminder ... the only real remaining problem
that remained with a race to update some hash table,
when define was being used from several threads.
*thats* the bug that needs attention (but is hard to fix).

> [1] I am only running a basic startup test, though: "valgrind
> --tool=helgrind guile -q <<EOF".  Were you running something a lot
> more complex than that?

I had written some simple test case, which I think sprouted
a bunch of threads, and then did simple scheme things
in each .. e.g. just adding numbers, or whatever.   I'm
attaching some kind of simple test case to this email
however, it is very hacked, so I don't know if it actually
will find bugs, and  its probably doesn't do what it
claims to do.   I provide it only as a short-cut for creating
a new test case. ...

--linas

[-- Attachment #2: thread-exp.c --]
[-- Type: text/x-csrc, Size: 4552 bytes --]

/**
 * Guile threading bug/unexpected behaviour.
 *
 * Two posix threads are created.  The first thread runs, and defines
 * something. The second thread tries to access the thing defined by 
 * the first. But, due to some "unexpected" threading behaviour, the
 * things defined in the first thread are not visible to the first.
 * I'm pretty convinced this is a bug, and am hunting it down now.
 *
 * The printed output of this program, for me, under guile1.8.5, is
 *
 * HEllo, thread one is running
 * Goodbye, thread one is exiting
 * HEllo, thread two is running
 * ERROR: Unbound variable: x
 * Goodbye, thread two is exiting
 * Main is exiting
 *
 * To get a better idea of what's going on, it seems to have something
 * to do with the current module. By turning on print debugging, the 
 * output, for me, becomes:
 * 
 * HEllo, thread one is running
 * thread one had output:
 * the root module is #<module (guile) f73f9e80>
 * the current module is #<directory (guile-user) f73fc600>
 * 
 * Goodbye, thread one is exiting
 * HEllo, thread two is running
 * thread two had output:
 * the root module is #<module (guile) f73f9e80>
 * the current module is #<module (guile) f73f9e80>
 * 
 * ERROR: Unbound variable: x
 * Goodbye, thread two is exiting
 * Main is exiting
 */


#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <libguile.h>

SCM outport;

void prtdbg(const char * id)
{
#define SHOW_STUFF 1
#ifdef SHOW_STUFF

	char buff[100];
	sprintf(buff, "(display \"duuude id %s\\n\")\n", id);
	scm_c_eval_string (buff);
	scm_c_eval_string ("(display \"the root module is \")\n");
	scm_c_eval_string ("(display the-root-module)\n");
	scm_c_eval_string ("(newline)\n");
	scm_c_eval_string ("(display \"the current module is \")\n");
	scm_c_eval_string ("(display (current-module))\n");
	scm_c_eval_string ("(newline)\n");

   SCM out = scm_get_output_string(outport);
   char * str = scm_to_locale_string(out);
   scm_truncate_file(outport, scm_from_uint16(0));
 
   printf("%s had output:\n%s\n", id, str);
#endif
}

void * scm_one (void *p)
{
	outport = scm_open_output_string();
	scm_set_current_output_port(outport);
	prtdbg("thread one");
	// scm_c_eval_string ("(set-current-module the-root-module)\n");
	// prtdbg("thread one again");
	scm_c_eval_string ("(define x \"asddf\")\n");
}

void * scm_two (void *p)
{
	scm_set_current_output_port(outport);
	prtdbg("thread two");
	// scm_c_eval_string ("(display x)\n");
	scm_c_eval_string ("(display \"duuuuuuuuuuuuauuuuuuuuuuuude\")\n");
	scm_c_eval_string ("(define (fact n) (if (= n 1) 1 (* n (fact (- n 1)))))\n"
		"(display \"wahazzzuppp\\n\")\n"
		"(display (fact 369))\n"
		"(display \"yeah\\n\")\n");
}

void * scm_three (void *p)
{
	scm_set_current_output_port(outport);
	prtdbg("thread three");
	scm_c_eval_string ("(display x)\n");
}

void * scm_four (void *p)
{
	prtdbg("thread four");
	scm_c_eval_string ("(display x)\n");
}

static void * thread_zero (void * arg)
{
	printf("Thread zeroooooo \n");
	scm_with_guile(scm_one, NULL);
	printf("Thread exit \n");
	return NULL;
}

static void * thread_one (void * arg)
{
	printf("HEllo, thread one is running\n");

	pthread_attr_t attr;
	pthread_t t0;
	pthread_attr_init(&attr);

	// pthread_create(&t0, &attr, thread_zero, NULL);
	scm_with_guile(scm_one, NULL);

	printf("Goodbye, thread one is exiting\n");
	return NULL;
}

static void * thread_two (void * arg)
{
	printf("HEllo, thread two is running\n");
	scm_with_guile(scm_two, NULL);
	printf("Goodbye, thread two is exiting\n");
	return NULL;
}

static void * thread_three (void * arg)
{
	printf("HEllo, thread three is running\n");
	scm_with_guile(scm_three, NULL);
	printf("Goodbye, thread three is exiting\n");
	return NULL;
}

static void * thread_four (void * arg)
{
	printf("HEllo, thread four is running\n");
	scm_with_guile(scm_four, NULL);
	printf("Goodbye, thread four is exiting\n");
	return NULL;
}

main()
{
	int rc;
	pthread_attr_t attr;
	pthread_t t1, t2, t3, t4;

	pthread_attr_init(&attr);

	rc = pthread_create(&t1, &attr, thread_one, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread one\n");
		exit(1);
	}
	sleep(1);
	rc = pthread_create(&t2, &attr, thread_two, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread two\n");
		exit(1);
	}

	sleep(1);
	rc = pthread_create(&t3, &attr, thread_three, NULL);
	if(rc)
	{
		fprintf(stderr, "Fatal error: can't create thread three\n");
		exit(1);
	}
	sleep(1);
	rc = pthread_create(&t4, &attr, thread_four, NULL);

	sleep(5);
	printf("Main is exiting\n");
}

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 22:31 Locks and threads Neil Jerram
                   ` (2 preceding siblings ...)
  2009-02-11 23:30 ` Ludovic Courtès
@ 2009-02-12 12:55 ` Greg Troxel
  2009-02-12 18:00   ` Ken Raeburn
  2009-03-05 20:41   ` Neil Jerram
  2009-03-04 23:49 ` Neil Jerram
  4 siblings, 2 replies; 51+ messages in thread
From: Greg Troxel @ 2009-02-12 12:55 UTC (permalink / raw)
  To: Neil Jerram; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 557 bytes --]


  Does that sound sensible; have I missed anything?

Also run tests on other than Linux, with as many different OS threading
implementations as possible.  For all systems, set the threading
libraries to the most restrictive settings, specifically aborting on any
behavior the standard says is undefined.

If the define-race program gets added as a check that make check runs,
my autobuild machine will run it daily.

I am not familiar with helgrind; not sure if it's useful to run that on
other platforms once it's ok on one, or what platforms it runs on.

[-- Attachment #2: Type: application/pgp-signature, Size: 193 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-12 12:55 ` Greg Troxel
@ 2009-02-12 18:00   ` Ken Raeburn
  2009-02-12 21:14     ` Ludovic Courtès
  2009-03-05 20:41   ` Neil Jerram
  1 sibling, 1 reply; 51+ messages in thread
From: Ken Raeburn @ 2009-02-12 18:00 UTC (permalink / raw)
  To: Greg Troxel; +Cc: guile-devel, Neil Jerram

On Feb 12, 2009, at 07:55, Greg Troxel wrote:
>  Does that sound sensible; have I missed anything?
>
> Also run tests on other than Linux, with as many different OS  
> threading
> implementations as possible.  For all systems, set the threading
> libraries to the most restrictive settings, specifically aborting on  
> any
> behavior the standard says is undefined.

Yeah, definitely NetBSD is a good one for this.

> If the define-race program gets added as a check that make check runs,
> my autobuild machine will run it daily.
>
> I am not familiar with helgrind; not sure if it's useful to run that  
> on
> other platforms once it's ok on one, or what platforms it runs on.

It's part of the valgrind suite, which currently only runs on Linux,  
though there is porting work in progress.  The helgrind tool checks  
for possible synchronization issues.

As helgrind and NetBSD's pthread checking code work by instrumenting  
the executable and watching progress, I'd suggest also trying out  
static analysis tools if anyone has them handy.  I see guile is  
already listed with Coverity's open-source scanner project; who's  
reviewing that data, if anyone?

Ken




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 23:53   ` Neil Jerram
  2009-02-12  0:18     ` Linas Vepstas
@ 2009-02-12 20:51     ` Ludovic Courtès
  1 sibling, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-02-12 20:51 UTC (permalink / raw)
  To: guile-devel

Hello,

Neil Jerram <neil@ossau.uklinux.net> writes:

> (Is something like this actually _ever_ a problem?  If locks are
> always _acquired_ in the right order, how can the order of _releasing_
> ever cause a problem?)

It can't be a problem, AFAIUI.  I don't think Helgrind check the
releasing order, though; the manual doesn't mention it
(http://valgrind.org/docs/manual/hg-manual.html#hg-manual.lock-orders).

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-12 18:00   ` Ken Raeburn
@ 2009-02-12 21:14     ` Ludovic Courtès
  2009-02-14  1:25       ` Ken Raeburn
  0 siblings, 1 reply; 51+ messages in thread
From: Ludovic Courtès @ 2009-02-12 21:14 UTC (permalink / raw)
  To: guile-devel

Hello,

Ken Raeburn <raeburn@raeburn.org> writes:

> As helgrind and NetBSD's pthread checking code work by instrumenting
> the executable and watching progress, I'd suggest also trying out
> static analysis tools if anyone has them handy.  I see guile is
> already listed with Coverity's open-source scanner project; who's
> reviewing that data, if anyone?

I don't feel like supporting that tool.  I noticed that other free
software supporters are doubtful, too:
http://blog.josefsson.org/2007/04/02/boycott-scancoveritycom/ .

We could use tools like Splint, but it seems to be quite intrusive and
difficult to use when not used from the beginning of the project (to get
an idea it gives 97 warnings on `alist.c'...).

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-12 21:14     ` Ludovic Courtès
@ 2009-02-14  1:25       ` Ken Raeburn
  2009-02-14 16:09         ` Ludovic Courtès
  0 siblings, 1 reply; 51+ messages in thread
From: Ken Raeburn @ 2009-02-14  1:25 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Feb 12, 2009, at 16:14, Ludovic Courtès wrote:
> Ken Raeburn <raeburn@raeburn.org> writes:
>> [... Coverity ...]
> I don't feel like supporting that tool.  I noticed that other free
> software supporters are doubtful, too:
> http://blog.josefsson.org/2007/04/02/boycott-scancoveritycom/ .

The license situation might be a problem for some, sure; that's up to  
the individuals to decide.  It looks okay by me, but I don't have time  
to deal.

Quite a number of the projects at scan.coverity.com are GNU projects,  
including emacs and gcc; any idea if anyone from the GNU Project has  
talked to Coverity in an official capacity about getting the terms  
changed?

> We could use tools like Splint, but it seems to be quite intrusive and
> difficult to use when not used from the beginning of the project (to  
> get
> an idea it gives 97 warnings on `alist.c'...).

I used to use Splint, and it did find a few problems for me; between  
the intrusiveness and the relative lack of work on it these days, and  
the difficulty in expressing some constructs to it, I don't bother any  
more.  I'm not convinced it was worth the time and annoyance to  
annotate the code.

Ken



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-14  1:25       ` Ken Raeburn
@ 2009-02-14 16:09         ` Ludovic Courtès
  0 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-02-14 16:09 UTC (permalink / raw)
  To: guile-devel

Hi,

Ken Raeburn <raeburn@raeburn.org> writes:

> Quite a number of the projects at scan.coverity.com are GNU projects,
> including emacs and gcc; any idea if anyone from the GNU Project has
> talked to Coverity in an official capacity about getting the terms
> changed?

No idea.

> I used to use Splint, and it did find a few problems for me; between
> the intrusiveness and the relative lack of work on it these days, and
> the difficulty in expressing some constructs to it, I don't bother any
> more.  I'm not convinced it was worth the time and annoyance to
> annotate the code.

Yes, that's exactly what I feared.  Then there's "cilly" [0], part of
the CIL framework, but I've never used it and it's unclear to me what
it's good at.

We could also use more GCC warnings.

Thanks,
Ludo'.

[0] http://manju.cs.berkeley.edu/cil/cil007.html#sec-driver





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-11 22:31 Locks and threads Neil Jerram
                   ` (3 preceding siblings ...)
  2009-02-12 12:55 ` Greg Troxel
@ 2009-03-04 23:49 ` Neil Jerram
  2009-03-05  3:54   ` Linas Vepstas
                     ` (2 more replies)
  4 siblings, 3 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-04 23:49 UTC (permalink / raw)
  To: Guile Development; +Cc: Linas Vepstas

[-- Attachment #1: Type: text/plain, Size: 474 bytes --]

Neil Jerram <neil@ossau.uklinux.net> writes:

> - first to address problems reported by helgrind (since I think we
>   should take advantage of external tools before adding debug code to
>   Guile internally)

Here's another lock ordering fix.  (For 1.8.x at least, I haven't
checked if it's relevant to master yet.)

1.8.x is then helgrind-clean (apart from one unimportant-looking
termination issue), so next it's on to the real problem, threadsafe
define.

        Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Lock-ordering-don-t-allocate-when-in-critical-secti.patch --]
[-- Type: text/x-diff, Size: 6292 bytes --]

From 588edc4cd1bfea7d51c9aed463e8119482e7a3f0 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 4 Mar 2009 23:45:11 +0000
Subject: [PATCH] Lock ordering: don't allocate when in critical section (scm_sigaction_for_thread)

This fixes the following helgrind report.

Thread #1: lock order "0x4114748 before 0x4331084" violated
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x407A55F: deval (eval.c:4077)
   by 0x407A071: scm_dapply (eval.c:5012)
   by 0x407888A: scm_apply (eval.c:4811)
   by 0x407E20C: scm_call_0 (eval.c:4666)
   by 0x406BDF7: scm_dynamic_wind (dynwind.c:111)
   by 0x407620C: ceval (eval.c:4571)
   by 0x407E8D9: scm_primitive_eval_x (eval.c:5921)
   by 0x407E934: scm_eval_x (eval.c:5956)
   by 0x40B9140: scm_shell (script.c:737)
   by 0x4095AC5: invoke_main_func (init.c:367)
   by 0x4065A81: c_body (continuations.c:349)
  Required order was established by acquisition of lock at 0x4114748
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40B7BE1: scm_sigaction_for_thread (scmsigs.c:339)
   by 0x40911DE: scm_gsubr_apply (gsubr.c:223)
   by 0x4079E36: scm_dapply (eval.c:4930)
   by 0x407AB68: deval (eval.c:4378)
   by 0x407A071: scm_dapply (eval.c:5012)
   by 0x407888A: scm_apply (eval.c:4811)
   by 0x407E1D0: scm_call_1 (eval.c:4672)
   by 0x407FEEE: scm_map (eval.c:5489)
   by 0x407BDCC: deval (eval.c:4367)
   by 0x407D197: deval (eval.c:3698)
   by 0x407A071: scm_dapply (eval.c:5012)
  followed by a later acquisition of lock at 0x4331084
   at 0x40234F7: pthread_mutex_lock (hg_intercepts.c:408)
   by 0x40DC397: scm_enter_guile (threads.c:377)
   by 0x40DC8F4: scm_pthread_mutex_lock (threads.c:1485)
   by 0x4083FAA: scm_gc_for_newcell (gc.c:484)
   by 0x40A9781: scm_cons (inline.h:115)
   by 0x4079867: scm_dapply (eval.c:4850)
   by 0x407888A: scm_apply (eval.c:4811)
   by 0x40796D6: scm_call_3 (eval.c:4684)
   by 0x409A7C2: module_variable (modules.c:302)
   by 0x409A7EE: module_variable (modules.c:312)
   by 0x409A962: scm_sym2var (modules.c:466)
   by 0x40738F4: scm_lookupcar1 (eval.c:2874)

* libguile/scmsigs.c (close_1): Renamed `handler_to_async'; also
  handle #f case and wrapping the async in a cons, if necessary.

  (install_handler): Pass in async instead of constructing it; combine
  two branches into one.

  (scm_sigaction_for_thread): Allocate async upfront instead of inside
  the critical section, and pass it to install_handler calls.  Leave
  critical section before signaling out-of-range error.
---
 libguile/scmsigs.c |   53 ++++++++++++++++++++++++++++-----------------------
 1 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/libguile/scmsigs.c b/libguile/scmsigs.c
index 9433273..e15bbf3 100644
--- a/libguile/scmsigs.c
+++ b/libguile/scmsigs.c
@@ -108,10 +108,20 @@ static SIGRETTYPE (*orig_handlers[NSIG])(int);
 #endif
 
 static SCM
-close_1 (SCM proc, SCM arg)
+handler_to_async (SCM handler, int signum)
 {
-  return scm_primitive_eval_x (scm_list_3 (scm_sym_lambda, SCM_EOL,
-					   scm_list_2 (proc, arg)));
+  if (scm_is_false (handler))
+    return SCM_BOOL_F;
+  else
+    {
+      SCM async = scm_primitive_eval_x (scm_list_3 (scm_sym_lambda, SCM_EOL,
+						    scm_list_2 (handler,
+								scm_from_int (signum))));
+#if !SCM_USE_PTHREAD_THREADS
+      async = scm_cons (async, SCM_BOOL_F);
+#endif
+      return async;
+    }
 }
 
 #if SCM_USE_PTHREAD_THREADS
@@ -239,23 +249,10 @@ ensure_signal_delivery_thread ()
 #endif /* !SCM_USE_PTHREAD_THREADS */
 
 static void
-install_handler (int signum, SCM thread, SCM handler)
+install_handler (int signum, SCM thread, SCM handler, SCM async)
 {
-  if (scm_is_false (handler))
-    {
-      SCM_SIMPLE_VECTOR_SET (*signal_handlers, signum, SCM_BOOL_F);
-      SCM_SIMPLE_VECTOR_SET (signal_handler_asyncs, signum, SCM_BOOL_F);
-    }
-  else
-    {
-      SCM async = close_1 (handler, scm_from_int (signum));
-#if !SCM_USE_PTHREAD_THREADS
-      async = scm_cons (async, SCM_BOOL_F);
-#endif
-      SCM_SIMPLE_VECTOR_SET (*signal_handlers, signum, handler);
-      SCM_SIMPLE_VECTOR_SET (signal_handler_asyncs, signum, async);
-    }
-
+  SCM_SIMPLE_VECTOR_SET (*signal_handlers, signum, handler);
+  SCM_SIMPLE_VECTOR_SET (signal_handler_asyncs, signum, async);
   SCM_SIMPLE_VECTOR_SET (signal_handler_threads, signum, thread);
 }
 
@@ -308,6 +305,7 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
   int save_handler = 0;
       
   SCM old_handler;
+  SCM async;
 
   csig = scm_to_signed_integer (signum, 0, NSIG-1);
 
@@ -334,6 +332,10 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
 	SCM_MISC_ERROR ("thread has already exited", SCM_EOL);
     }
 
+  /* Allocate upfront, as we can't do it inside the critical
+     section. */
+  async = handler_to_async (handler, csig);
+
   ensure_signal_delivery_thread ();
 
   SCM_CRITICAL_SECTION_START;
@@ -351,10 +353,13 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
 #else
 	  chandler = (SIGRETTYPE (*) (int)) handler_int;
 #endif
-	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F);
+	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F, async);
 	}
       else
-	SCM_OUT_OF_RANGE (2, handler);
+	{
+	  SCM_CRITICAL_SECTION_END;
+	  SCM_OUT_OF_RANGE (2, handler);
+	}
     }
   else if (scm_is_false (handler))
     {
@@ -366,7 +371,7 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
 	{
 	  action = orig_handlers[csig];
 	  orig_handlers[csig].sa_handler = SIG_ERR;
-	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F);
+	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F, async);
 	}
 #else
       if (orig_handlers[csig] == SIG_ERR)
@@ -375,7 +380,7 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
 	{
 	  chandler = orig_handlers[csig];
 	  orig_handlers[csig] = SIG_ERR;
-	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F);
+	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F, async);
 	}
 #endif
     }
@@ -391,7 +396,7 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
       if (orig_handlers[csig] == SIG_ERR)
 	save_handler = 1;
 #endif
-      install_handler (csig, thread, handler);
+      install_handler (csig, thread, handler, async);
     }
 
   /* XXX - Silently ignore setting handlers for `program error signals'
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-04 23:49 ` Neil Jerram
@ 2009-03-05  3:54   ` Linas Vepstas
  2009-03-05 19:46     ` Neil Jerram
  2009-03-05 21:35   ` Ludovic Courtès
  2009-03-10 23:57   ` Neil Jerram
  2 siblings, 1 reply; 51+ messages in thread
From: Linas Vepstas @ 2009-03-05  3:54 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/4 Neil Jerram <neil@ossau.uklinux.net>:
> Here's another lock ordering fix.  (For 1.8.x at least, I haven't
> checked if it's relevant to master yet.)

I skimmed it quickly, looks reasonable, except for this:

       else
-       SCM_OUT_OF_RANGE (2, handler);
+       {
+         SCM_CRITICAL_SECTION_END;
+         SCM_OUT_OF_RANGE (2, handler);
+       }

The matching SCM_CRITICAL_SECTION_START;
is not in an if block, ... before your change, where
was the matching END?  Why is there now an END
when one did not used to be needed? Was this a
bug you fixed, unrelated to the rest of the patch?

In the past, without this END, there should have been
a deadlock, so I guess this demonstrates that the else
branch is never taken?  All this is rather confusing.

Now, if this was the linux kernel, people would ask you
to split this patch into two patches: one to fix this
unbalanced start/end problem, and another to pull the
alloc out of the critical section.  That makes it easier
to review the correctness of the changes ...

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05  3:54   ` Linas Vepstas
@ 2009-03-05 19:46     ` Neil Jerram
  2009-03-05 20:05       ` Neil Jerram
  0 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-05 19:46 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> I skimmed it quickly, looks reasonable,

Thanks for reviewing!

> except for this:
>
>        else
> -       SCM_OUT_OF_RANGE (2, handler);
> +       {
> +         SCM_CRITICAL_SECTION_END;
> +         SCM_OUT_OF_RANGE (2, handler);
> +       }
>
> The matching SCM_CRITICAL_SECTION_START;
> is not in an if block, ... before your change, where
> was the matching END?  Why is there now an END
> when one did not used to be needed? Was this a
> bug you fixed, unrelated to the rest of the patch?

Yes, it's an unrelated bug.  All of the places that raise errors (and
so exit non-locally) should exit the critical section first.

> In the past, without this END, there should have been
> a deadlock, so I guess this demonstrates that the else
> branch is never taken?

Yes.  But that's expected; the else branch is only used if `sigaction'
is called with incorrect parameters.

> Now, if this was the linux kernel, people would ask you
> to split this patch into two patches: one to fix this
> unbalanced start/end problem, and another to pull the
> alloc out of the critical section.  That makes it easier
> to review the correctness of the changes ...

You're absolutely right.  I'll leave this part out, and generate a
separate patch for it.

Thanks again,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 19:46     ` Neil Jerram
@ 2009-03-05 20:05       ` Neil Jerram
  2009-03-05 20:40         ` Linas Vepstas
                           ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-05 20:05 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

[-- Attachment #1: Type: text/plain, Size: 320 bytes --]

Neil Jerram <neil@ossau.uklinux.net> writes:

> Yes, it's an unrelated bug.  All of the places that raise errors (and
> so exit non-locally) should exit the critical section first.

> You're absolutely right.  I'll leave this part out, and generate a
> separate patch for it.

Here's the separate patch...

       Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Avoid-throw-from-critical-section-given-invalid-sig.patch --]
[-- Type: text/x-diff, Size: 2861 bytes --]

From 4d78f2447a5a1e3cb1acf596929b1bfe4fbe3884 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Thu, 5 Mar 2009 20:03:33 +0000
Subject: [PATCH] Avoid throw from critical section, given invalid sigaction call

* libguile/scmsigs.c (scm_sigaction_for_thread): Exit critical section
  before raising out-of-range error.

* test-suite/Makefile.am (SCM_TESTS): Add signals.test.

* test-suite/tests/signals.test: New file.
---
 libguile/scmsigs.c            |    5 ++++-
 test-suite/Makefile.am        |    1 +
 test-suite/tests/signals.test |   29 +++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 1 deletions(-)
 create mode 100644 test-suite/tests/signals.test

diff --git a/libguile/scmsigs.c b/libguile/scmsigs.c
index c406007..e15bbf3 100644
--- a/libguile/scmsigs.c
+++ b/libguile/scmsigs.c
@@ -356,7 +356,10 @@ SCM_DEFINE (scm_sigaction_for_thread, "sigaction", 1, 3, 0,
 	  install_handler (csig, SCM_BOOL_F, SCM_BOOL_F, async);
 	}
       else
-	SCM_OUT_OF_RANGE (2, handler);
+	{
+	  SCM_CRITICAL_SECTION_END;
+	  SCM_OUT_OF_RANGE (2, handler);
+	}
     }
   else if (scm_is_false (handler))
     {
diff --git a/test-suite/Makefile.am b/test-suite/Makefile.am
index df8c7bc..5492137 100644
--- a/test-suite/Makefile.am
+++ b/test-suite/Makefile.am
@@ -63,6 +63,7 @@ SCM_TESTS = tests/alist.test			\
 	    tests/reader.test			\
 	    tests/receive.test			\
 	    tests/regexp.test			\
+	    tests/signals.test			\
 	    tests/socket.test			\
 	    tests/srcprop.test			\
 	    tests/srfi-1.test			\
diff --git a/test-suite/tests/signals.test b/test-suite/tests/signals.test
new file mode 100644
index 0000000..7cab85e
--- /dev/null
+++ b/test-suite/tests/signals.test
@@ -0,0 +1,29 @@
+;;;; signals.test --- test suite for Guile's signal functions       -*- scheme -*-
+;;;; 
+;;;; Copyright (C) 2009 Free Software Foundation, Inc.
+;;;; 
+;;;; This program is free software; you can redistribute it and/or modify
+;;;; it under the terms of the GNU General Public License as published by
+;;;; the Free Software Foundation; either version 2, or (at your option)
+;;;; any later version.
+;;;; 
+;;;; This program is distributed in the hope that it will be useful,
+;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;;; GNU General Public License for more details.
+;;;; 
+;;;; You should have received a copy of the GNU General Public License
+;;;; along with this software; see the file COPYING.  If not, write to
+;;;; the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
+;;;; Boston, MA 02110-1301 USA
+
+
+(use-modules (test-suite lib))
+
+(with-test-prefix "sigaction"
+
+  (pass-if-exception "handler arg is an invalid integer"
+    exception:out-of-range
+    (sigaction SIGINT 51))
+
+  )
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 20:05       ` Neil Jerram
@ 2009-03-05 20:40         ` Linas Vepstas
  2009-03-05 20:49           ` Neil Jerram
  2009-03-05 20:57         ` Linas Vepstas
  2009-03-25 19:00         ` Neil Jerram
  2 siblings, 1 reply; 51+ messages in thread
From: Linas Vepstas @ 2009-03-05 20:40 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/5 Neil Jerram <neil@ossau.uklinux.net>:
>> Yes, it's an unrelated bug.  All of the places that raise errors (and
>> so exit non-locally) should exit the critical section first.
>
>> You're absolutely right.  I'll leave this part out, and generate a
>> separate patch for it.
>
> Here's the separate patch...

Looks good, except for the test case:

+  (pass-if-exception "handler arg is an invalid integer"
+    exception:out-of-range
+    (sigaction SIGINT 51))

whereas my bits/signum.h shows:

#define SIGUNUSED 31

#define  _NSIG    65 /* Biggest signal number + 1
               (including real-time signals).  */

I presume that in the era of 64-bit CPU's we might
start seeing new sigs in the 32-64 range.  Or maybe
the RT kernels already do ... ?

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-02-12 12:55 ` Greg Troxel
  2009-02-12 18:00   ` Ken Raeburn
@ 2009-03-05 20:41   ` Neil Jerram
  1 sibling, 0 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-05 20:41 UTC (permalink / raw)
  To: Greg Troxel; +Cc: guile-devel

Greg Troxel <gdt@ir.bbn.com> writes:

>   Does that sound sensible; have I missed anything?
>
> Also run tests on other than Linux, with as many different OS threading
> implementations as possible.  For all systems, set the threading
> libraries to the most restrictive settings, specifically aborting on any
> behavior the standard says is undefined.

I can't (or don't want to) do that myself.  We already have some
coverage here from Simon Josefsson's autobuilder, the Debian builds,
and Guile users running on interesting platforms.  If anyone sees
opportunities for feeding Guile into more build systems, please go
ahead!

> If the define-race program gets added as a check that make check runs,
> my autobuild machine will run it daily.

Yes, I will try to incorporate define-race in this way.

Thanks for your comments!

        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 20:40         ` Linas Vepstas
@ 2009-03-05 20:49           ` Neil Jerram
  0 siblings, 0 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-05 20:49 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> 2009/3/5 Neil Jerram <neil@ossau.uklinux.net>:
>>> Yes, it's an unrelated bug.  All of the places that raise errors (and
>>> so exit non-locally) should exit the critical section first.
>>
>>> You're absolutely right.  I'll leave this part out, and generate a
>>> separate patch for it.
>>
>> Here's the separate patch...
>
> Looks good, except for the test case:
>
> +  (pass-if-exception "handler arg is an invalid integer"
> +    exception:out-of-range
> +    (sigaction SIGINT 51))
>
> whereas my bits/signum.h shows:
>
> #define SIGUNUSED 31
>
> #define  _NSIG    65 /* Biggest signal number + 1
>                (including real-time signals).  */
>
> I presume that in the era of 64-bit CPU's we might
> start seeing new sigs in the 32-64 range.  Or maybe
> the RT kernels already do ... ?

Right, but the 51 is the handler arg, for which the only valid integer
values are SIG_DFL (0) and SIG_IGN (1).  I'll add a comment to explain
this, and code to make the test fail in case SIG_DFL or SIG_IGN is (on
some platform) 51.

      Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 20:05       ` Neil Jerram
  2009-03-05 20:40         ` Linas Vepstas
@ 2009-03-05 20:57         ` Linas Vepstas
  2009-03-05 21:25           ` Neil Jerram
  2009-03-25 19:00         ` Neil Jerram
  2 siblings, 1 reply; 51+ messages in thread
From: Linas Vepstas @ 2009-03-05 20:57 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/5 Neil Jerram <neil@ossau.uklinux.net>:
>> Yes, it's an unrelated bug.  All of the places that raise errors (and
>> so exit non-locally) should exit the critical section first.
>
>> You're absolutely right.  I'll leave this part out, and generate a
>> separate patch for it.
>
> Here's the separate patch...

Err, OK, so I thought I'd look at the code more carefully.
I don't understand the patch.

libguile/scmsigs.c has a SCM_CRITICAL_SECTION_START
at line 339, which seems to be balanced by
SCM_CRITICAL_SECTION_END;
at lines 442 and 461, right before the return from
the subroutine.

So why insert this seemingly un-needed SCM_CRITICAL_SECTION_END,
and worse, nest it in an else block?

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 20:57         ` Linas Vepstas
@ 2009-03-05 21:25           ` Neil Jerram
  2009-03-05 21:56             ` Linas Vepstas
  0 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-05 21:25 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> I don't understand the patch.
>
> libguile/scmsigs.c has a SCM_CRITICAL_SECTION_START
> at line 339, which seems to be balanced by
> SCM_CRITICAL_SECTION_END;
> at lines 442 and 461, right before the return from
> the subroutine.
>
> So why insert this seemingly un-needed SCM_CRITICAL_SECTION_END,
> and worse, nest it in an else block?

Because when scm_sigaction_for_thread decides to throw an out-of-range
error (by the `SCM_OUT_OF_RANGE' call), it will exit non-locally at
that point (by calling `longjmp').  So, in this case, it won't pass
through (either of) the SCM_CRITICAL_SECTION_ENDs just before the
return statement.

Does that make sense now?

      Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-04 23:49 ` Neil Jerram
  2009-03-05  3:54   ` Linas Vepstas
@ 2009-03-05 21:35   ` Ludovic Courtès
  2009-03-10 23:57   ` Neil Jerram
  2 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-03-05 21:35 UTC (permalink / raw)
  To: guile-devel

Hello Neil,

Thanks for the patches and tests!

Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 21:25           ` Neil Jerram
@ 2009-03-05 21:56             ` Linas Vepstas
  2009-03-06 11:01               ` Andy Wingo
  2009-03-08 22:04               ` Neil Jerram
  0 siblings, 2 replies; 51+ messages in thread
From: Linas Vepstas @ 2009-03-05 21:56 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/5 Neil Jerram <neil@ossau.uklinux.net>:
> Linas Vepstas <linasvepstas@gmail.com> writes:
>
>> I don't understand the patch.
>>
>> libguile/scmsigs.c has a SCM_CRITICAL_SECTION_START
>> at line 339, which seems to be balanced by
>> SCM_CRITICAL_SECTION_END;
>> at lines 442 and 461, right before the return from
>> the subroutine.
>>
>> So why insert this seemingly un-needed SCM_CRITICAL_SECTION_END,
>> and worse, nest it in an else block?
>
> Because when scm_sigaction_for_thread decides to throw an out-of-range
> error (by the `SCM_OUT_OF_RANGE' call), it will exit non-locally at
> that point (by calling `longjmp').  So, in this case, it won't pass
> through (either of) the SCM_CRITICAL_SECTION_ENDs just before the
> return statement.
>
> Does that make sense now?

Ah. I suspected as much, and started reading the code,
but clearly I didn't go deep enough. Just now I double
checked and eventually found the longjump in throw.c
after digging fairly deep.

Perhaps I'm naive, perhaps some naming convention
could be used to  indicate that SCM_OUT_OF_RANGE
will never return?  None of the functions in the call stack
gave any real hint that they might now return; they mostly
looked liked ordinary functions.

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 21:56             ` Linas Vepstas
@ 2009-03-06 11:01               ` Andy Wingo
  2009-03-06 12:36                 ` Linas Vepstas
  2009-03-08 22:04               ` Neil Jerram
  1 sibling, 1 reply; 51+ messages in thread
From: Andy Wingo @ 2009-03-06 11:01 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development, Neil Jerram

Hi Linas,

On Thu 05 Mar 2009 22:56, Linas Vepstas <linasvepstas@gmail.com> writes:

> Perhaps I'm naive, perhaps some naming convention
> could be used to  indicate that SCM_OUT_OF_RANGE
> will never return?  None of the functions in the call stack
> gave any real hint that they might now return; they mostly
> looked liked ordinary functions.

Changing the names would be quite an undertaking. Perhaps we should be
more clear about this point in the docs, though, as it is fundamental to
Guile programming in C.

Cheers,

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-06 11:01               ` Andy Wingo
@ 2009-03-06 12:36                 ` Linas Vepstas
  2009-03-06 22:05                   ` Ludovic Courtès
  0 siblings, 1 reply; 51+ messages in thread
From: Linas Vepstas @ 2009-03-06 12:36 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Guile Development, Neil Jerram

2009/3/6 Andy Wingo <wingo@pobox.com>:
> Hi Linas,
>
> On Thu 05 Mar 2009 22:56, Linas Vepstas <linasvepstas@gmail.com> writes:
>
>> Perhaps I'm naive, perhaps some naming convention
>> could be used to  indicate that SCM_OUT_OF_RANGE
>> will never return?  None of the functions in the call stack
>> gave any real hint that they might now return; they mostly
>> looked liked ordinary functions.
>
> Changing the names would be quite an undertaking. Perhaps we should be
> more clear about this point in the docs, though, as it is fundamental to
> Guile programming in C.

The Linux kernel uses the gcc markup
__attribute__((noreturn))

to indicate a function that does not return; so e.g.

void blahfunc(SCM args) __attribute__((noreturn));

It used to be common to see
#define NORET __attribute__((noreturn))
and
void NORET blafunc(...);
put that style passed on ...

I have no clue what the compiler does with this.
perhaps it handles the stack differently, knowing
its never unwound, or maybe it doesn't generate
the function epilogue.

Even if the compiler does nothing, it would still be
a handy clue to the programmer reading the code!

--linas

/usr/src/linux-2.6.28.4# grep -r noreturn * |wc
     92     439    7488




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-06 12:36                 ` Linas Vepstas
@ 2009-03-06 22:05                   ` Ludovic Courtès
  0 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-03-06 22:05 UTC (permalink / raw)
  To: guile-devel

Hi,

Linas Vepstas <linasvepstas@gmail.com> writes:

> /usr/src/linux-2.6.28.4# grep -r noreturn * |wc
>      92     439    7488

$ grep SCM_NORETURN libguile/*.[ch] | wc
     25     155    1758

Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 21:56             ` Linas Vepstas
  2009-03-06 11:01               ` Andy Wingo
@ 2009-03-08 22:04               ` Neil Jerram
  1 sibling, 0 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-08 22:04 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> Perhaps I'm naive, perhaps some naming convention
> could be used to  indicate that SCM_OUT_OF_RANGE
> will never return?  None of the functions in the call stack
> gave any real hint that they might now return; they mostly
> looked liked ordinary functions.

As Andy has said, this kind of thing is quite pervasive in Guile, and
so I think it's something that developers become used to very quickly.

(Incidentally, it happens in your define-race code too.  The first
time that the program hits an undefined variable, the error (from
within scm_c_eval_string) causes a jump back out to scm_with_guile.
For the thread concerned, therefore, s->count is never incremented
anymore, and the thread just keeps hitting the same undefined variable
over and over again...)

In this code (scm_sigaction_for_thread) it would probably be better to
use the scm_dynwind_* API, and in particular
scm_dynwind_critical_section, as that would cover the mainline case
and all the error cases in one go.  But I'm surprised to see
scm_dynwind_critical_section using a fat mutex; need to look into that
a bit more...

Regards,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-04 23:49 ` Neil Jerram
  2009-03-05  3:54   ` Linas Vepstas
  2009-03-05 21:35   ` Ludovic Courtès
@ 2009-03-10 23:57   ` Neil Jerram
  2009-03-12  0:07     ` Neil Jerram
  2009-03-25 18:57     ` Neil Jerram
  2 siblings, 2 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-10 23:57 UTC (permalink / raw)
  To: Guile Development; +Cc: Linas Vepstas

[-- Attachment #1: Type: text/plain, Size: 617 bytes --]

Neil Jerram <neil@ossau.uklinux.net> writes:

> next it's on to the real problem, threadsafe define.

I've been running Linas's define-race test program, and hitting the
`throw from within critical section' quite a lot.

We've already discussed this a couple of times:
http://www.mail-archive.com/bug-guile@gnu.org/msg04613.html
http://www.mail-archive.com/guile-devel@gnu.org/msg03016.html

Andy proposed just removing the check for 1.8.x, but I would prefer to
keep it if possible, as I think there probably still are genuine
problems where the check could help.  What about the patch below?

Thanks,
        Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Fix-spurious-throw-from-within-critical-section-er.patch --]
[-- Type: text/x-diff, Size: 4178 bytes --]

From 2daf18f36c6428cbc883c0eb1a381ffbba59614b Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Tue, 10 Mar 2009 23:55:31 +0000
Subject: [PATCH] Fix spurious `throw from within critical section' errors

The crux of this problem was that the thread doing a throw, and so
checking scm_i_critical_section_level, was different from the thread
that was in a critical section.

* libguile/async.h (scm_i_critical_section_level): Removed, replaced
  by per-thread critical_section_level.
  (SCM_CRITICAL_SECTION_START, SCM_CRITICAL_SECTION_END): Use
  per-thread critical_section_level.

* libguile/continuations.c (scm_dynthrow): Check per-thread
  critical_section_level.

* libguile/threads.c (guilify_self_1): Init per-thread
  critical_section_level.
  (scm_i_critical_section_level): Removed.

* libguile/threads.h (scm_i_thread): New critical_section_level field.

* libguile/throw.c (scm_ithrow): Check per-thread critical_section_level.
---
 libguile/async.h         |    8 +++-----
 libguile/continuations.c |    2 +-
 libguile/threads.c       |    2 +-
 libguile/threads.h       |    3 +++
 libguile/throw.c         |    2 +-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/libguile/async.h b/libguile/async.h
index a81a98d..3c1725f 100644
--- a/libguile/async.h
+++ b/libguile/async.h
@@ -58,20 +58,18 @@ void scm_dynwind_unblock_asyncs (void);
    the manual.
 */
 
-/* Defined in threads.c.  scm_i_critical_section_level is only used
-   for error checking and will go away eventually. */
+/* Defined in threads.c. */
 extern scm_i_pthread_mutex_t scm_i_critical_section_mutex;
-extern int scm_i_critical_section_level;
 
 #define SCM_CRITICAL_SECTION_START \
   do { \
     scm_i_pthread_mutex_lock (&scm_i_critical_section_mutex);\
     SCM_I_CURRENT_THREAD->block_asyncs++; \
-    scm_i_critical_section_level++; \
+    SCM_I_CURRENT_THREAD->critical_section_level++; \
   } while (0)
 #define SCM_CRITICAL_SECTION_END \
   do { \
-    scm_i_critical_section_level--; \
+    SCM_I_CURRENT_THREAD->critical_section_level--; \
     SCM_I_CURRENT_THREAD->block_asyncs--; \
     scm_i_pthread_mutex_unlock (&scm_i_critical_section_mutex); \
     scm_async_click ();	\
diff --git a/libguile/continuations.c b/libguile/continuations.c
index 74bb911..69d2569 100644
--- a/libguile/continuations.c
+++ b/libguile/continuations.c
@@ -256,7 +256,7 @@ scm_dynthrow (SCM cont, SCM val)
   SCM_STACKITEM *dst = thread->continuation_base;
   SCM_STACKITEM stack_top_element;
 
-  if (scm_i_critical_section_level)
+  if (thread->critical_section_level)
     {
       fprintf (stderr, "continuation invoked from within critical section.\n");
       abort ();
diff --git a/libguile/threads.c b/libguile/threads.c
index 05e01e3..fc3e607 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -422,6 +422,7 @@ guilify_self_1 (SCM_STACKITEM *base)
   t->active_asyncs = SCM_EOL;
   t->block_asyncs = 1;
   t->pending_asyncs = 1;
+  t->critical_section_level = 0;
   t->last_debug_frame = NULL;
   t->base = base;
 #ifdef __ia64__
@@ -1667,7 +1668,6 @@ scm_i_thread_sleep_for_gc ()
 /* This mutex is used by SCM_CRITICAL_SECTION_START/END.
  */
 scm_i_pthread_mutex_t scm_i_critical_section_mutex;
-int scm_i_critical_section_level = 0;
 
 static SCM dynwind_critical_section_mutex;
 
diff --git a/libguile/threads.h b/libguile/threads.h
index 9417b88..2b0e067 100644
--- a/libguile/threads.h
+++ b/libguile/threads.h
@@ -113,6 +113,9 @@ typedef struct scm_i_thread {
   scm_t_contregs *pending_rbs_continuation;
 #endif
 
+  /* Whether this thread is in a critical section. */
+  int critical_section_level;
+
 } scm_i_thread;
 
 #define SCM_I_IS_THREAD(x)    SCM_SMOB_PREDICATE (scm_tc16_thread, x)
diff --git a/libguile/throw.c b/libguile/throw.c
index 7e2803f..92c5a1a 100644
--- a/libguile/throw.c
+++ b/libguile/throw.c
@@ -692,7 +692,7 @@ scm_ithrow (SCM key, SCM args, int noreturn SCM_UNUSED)
   SCM dynpair = SCM_UNDEFINED;
   SCM winds;
 
-  if (scm_i_critical_section_level)
+  if (SCM_I_CURRENT_THREAD->critical_section_level)
     {
       fprintf (stderr, "throw from within critical section.\n");
       abort ();
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-10 23:57   ` Neil Jerram
@ 2009-03-12  0:07     ` Neil Jerram
  2009-03-12  0:53       ` Neil Jerram
  2009-03-25 18:57     ` Neil Jerram
  1 sibling, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-12  0:07 UTC (permalink / raw)
  To: Guile Development; +Cc: Linas Vepstas

[-- Attachment #1: Type: text/plain, Size: 1001 bytes --]

Neil Jerram <neil@ossau.uklinux.net> writes:

>> next it's on to the real problem, threadsafe define.
>
> I've been running Linas's define-race test program, and hitting the
> `throw from within critical section' quite a lot. [...]

After the patch that I posted yesterday, we finally get to the real
problem.  I've tried to address it by allowing a hashtable to have a
mutex associated with it - resulting in the patch below.  But
unfortunately it's still not good; a sample run of test-define-race
gives:

#<module (guile) 40376f00>
#<directory (guile-user) 40379680>
ERROR: ERROR: Unbound variable: x1-100499
Unbound variable: define
ERROR: Unbound variable: x4-100596
ERROR: Unbound variable: define
ERROR: Unbound variable: define
ERROR: Unbound variable: define
guile-define test case: good-bye!
test-define-race: 2 error(s)
FAIL: test-define-race

I'm off to sleep now, so I thought I'd post what I've done in case
others have thoughts on it or can see something wrong.

Regards,
        Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Work-on-define-race.patch --]
[-- Type: text/x-diff, Size: 16184 bytes --]

From fd6931d5694f336aac9b9c5179dc0a4fc6f2c4c5 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 11 Mar 2009 23:54:38 +0000
Subject: [PATCH] Work on define race

---
 libguile/hashtab.c                       |  130 +++++++++++++++++++++---
 libguile/hashtab.h                       |    4 +
 libguile/modules.c                       |    1 -
 libguile/throw.c                         |    5 +-
 test-suite/standalone/Makefile.am        |    5 +
 test-suite/standalone/test-define-race.c |  163 ++++++++++++++++++++++++++++++
 6 files changed, 291 insertions(+), 17 deletions(-)
 create mode 100644 test-suite/standalone/test-define-race.c

diff --git a/libguile/hashtab.c b/libguile/hashtab.c
index ea7fc69..098b610 100644
--- a/libguile/hashtab.c
+++ b/libguile/hashtab.c
@@ -103,6 +103,7 @@ make_hash_table (int flags, unsigned long k, const char *func_name)
   t->upper = 9 * n / 10;
   t->flags = flags;
   t->hash_fn = NULL;
+  t->mutex = SCM_BOOL_F;
   if (flags)
     {
       SCM_NEWSMOB3 (table, scm_tc16_hashtable, vector, t, weak_hashtables);
@@ -199,6 +200,13 @@ scm_i_rehash (SCM table,
 }
 
 
+static SCM 
+hashtable_mark (SCM table)
+{
+  scm_gc_mark (SCM_HASHTABLE_MUTEX (table));
+  return SCM_CELL_OBJECT_1 (table);
+}
+
 static int
 hashtable_print (SCM exp, SCM port, scm_print_state *pstate SCM_UNUSED)
 {
@@ -417,18 +425,34 @@ scm_hash_fn_get_handle (SCM table, SCM obj, unsigned long (*hash_fn)(), SCM (*as
 #define FUNC_NAME "scm_hash_fn_get_handle"
 {
   unsigned long k;
-  SCM h;
+  SCM h, mutex = SCM_BOOL_F;
 
   if (SCM_HASHTABLE_P (table))
-    table = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      table = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     SCM_VALIDATE_VECTOR (1, table);
   if (SCM_SIMPLE_VECTOR_LENGTH (table) == 0)
-    return SCM_BOOL_F;
+    {
+      h = SCM_BOOL_F;
+      goto end;
+    }
   k = hash_fn (obj, SCM_SIMPLE_VECTOR_LENGTH (table), closure);
   if (k >= SCM_SIMPLE_VECTOR_LENGTH (table))
     scm_out_of_range ("hash_fn_get_handle", scm_from_ulong (k));
   h = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (table, k), closure);
+
+ end:
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
+
   return h;
 }
 #undef FUNC_NAME
@@ -440,10 +464,18 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
 #define FUNC_NAME "scm_hash_fn_create_handle_x"
 {
   unsigned long k;
-  SCM buckets, it;
+  SCM buckets, it, mutex = SCM_BOOL_F;
 
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     {
       SCM_ASSERT (scm_is_simple_vector (table),
@@ -458,7 +490,7 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
     scm_out_of_range ("hash_fn_create_handle_x", scm_from_ulong (k));
   it = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure);
   if (scm_is_pair (it))
-    return it;
+    ;
   else if (scm_is_true (it))
     scm_wrong_type_arg_msg (NULL, 0, it, "a pair");
   else
@@ -492,8 +524,13 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
 	      || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
 	    scm_i_rehash (table, hash_fn, closure, FUNC_NAME);
 	}
-      return SCM_CAR (new_bucket);
+      it = SCM_CAR (new_bucket);
     }
+
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
+
+  return it;
 }
 #undef FUNC_NAME
 
@@ -531,10 +568,18 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
                       void *closure)
 {
   unsigned long k;
-  SCM buckets, h;
+  SCM buckets, h, mutex = SCM_BOOL_F;
 
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     {
       SCM_ASSERT (scm_is_simple_vector (table), table,
@@ -542,7 +587,10 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
       buckets = table;
     }
   if (SCM_SIMPLE_VECTOR_LENGTH (table) == 0)
-    return SCM_EOL;
+    {
+      h = SCM_EOL;
+      goto end;
+    }
 
   k = hash_fn (obj, SCM_SIMPLE_VECTOR_LENGTH (buckets), closure);
   if (k >= SCM_SIMPLE_VECTOR_LENGTH (buckets))
@@ -559,6 +607,9 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
 	    scm_i_rehash (table, hash_fn, closure, "scm_hash_fn_remove_x");
 	}
     }
+ end:
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
   return h;
 }
 
@@ -569,8 +620,16 @@ SCM_DEFINE (scm_hash_clear_x, "hash-clear!", 1, 0, 0,
 {
   if (SCM_HASHTABLE_P (table))
     {
+      SCM mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
       scm_vector_fill_x (SCM_HASHTABLE_VECTOR (table), SCM_EOL);
       SCM_SET_HASHTABLE_N_ITEMS (table, 0);
+      if (!scm_is_false (mutex))
+	scm_dynwind_end ();
     }
   else
     scm_vector_fill_x (table, SCM_EOL);
@@ -917,10 +976,18 @@ SCM
 scm_internal_hash_fold (SCM (*fn) (), void *closure, SCM init, SCM table)
 {
   long i, n;
-  SCM buckets, result = init;
+  SCM buckets, result = init, mutex = SCM_BOOL_F;
   
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     buckets = table;
   
@@ -940,6 +1007,9 @@ scm_internal_hash_fold (SCM (*fn) (), void *closure, SCM init, SCM table)
 	}
     }
 
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
+
   return result;
 }
 
@@ -955,10 +1025,18 @@ void
 scm_internal_hash_for_each_handle (SCM (*fn) (), void *closure, SCM table)
 {
   long i, n;
-  SCM buckets;
+  SCM buckets, mutex = SCM_BOOL_F;
   
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     buckets = table;
   
@@ -977,6 +1055,9 @@ scm_internal_hash_for_each_handle (SCM (*fn) (), void *closure, SCM table)
 	  ls = SCM_CDR (ls);
 	}
     }
+
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
 }
 
 SCM_DEFINE (scm_hash_fold, "hash-fold", 3, 0, 0, 
@@ -1065,6 +1146,25 @@ SCM_DEFINE (scm_hash_map_to_list, "hash-map->list", 2, 0, 0,
 }
 #undef FUNC_NAME
 
+SCM_DEFINE (scm_hash_use_mutex_x, "hash-use-mutex!", 2, 0, 0, 
+            (SCM table, SCM mutex),
+	    "Use @var{mutex} to serialize operations on @var{table} from multiple threads.")
+#define FUNC_NAME s_scm_hash_use_mutex_x
+{
+  /* Must be a real (i.e. not a vector) and non-weak hash table. */
+  SCM_VALIDATE_HASHTABLE (1, table);
+  if (SCM_HASHTABLE_WEAK_P (table))
+    SCM_MISC_ERROR ("can't use mutex with a weak hash table", SCM_EOL);
+    
+  if (!scm_is_false (mutex))
+    SCM_VALIDATE_MUTEX (2, mutex);
+
+  SCM_SET_HASHTABLE_MUTEX (table, mutex);
+
+  return SCM_UNSPECIFIED;
+}
+#undef FUNC_NAME
+
 \f
 
 
@@ -1072,7 +1172,7 @@ void
 scm_hashtab_prehistory ()
 {
   scm_tc16_hashtable = scm_make_smob_type (s_hashtable, 0);
-  scm_set_smob_mark (scm_tc16_hashtable, scm_markcdr);
+  scm_set_smob_mark (scm_tc16_hashtable, hashtable_mark);
   scm_set_smob_print (scm_tc16_hashtable, hashtable_print);
   scm_set_smob_free (scm_tc16_hashtable, hashtable_free);
   scm_c_hook_add (&scm_after_gc_c_hook, rehash_after_gc, 0, 0);
diff --git a/libguile/hashtab.h b/libguile/hashtab.h
index 1017354..250f59a 100644
--- a/libguile/hashtab.h
+++ b/libguile/hashtab.h
@@ -58,6 +58,8 @@ SCM_API scm_t_bits scm_tc16_hashtable;
 #define SCM_HASHTABLE_DECREMENT(x) (SCM_HASHTABLE_N_ITEMS(x)--)
 #define SCM_HASHTABLE_UPPER(x)     (SCM_HASHTABLE (x)->upper)
 #define SCM_HASHTABLE_LOWER(x)     (SCM_HASHTABLE (x)->lower)
+#define SCM_HASHTABLE_MUTEX(x)     (SCM_HASHTABLE (x)->mutex)
+#define SCM_SET_HASHTABLE_MUTEX(x, m)     (SCM_HASHTABLE (x)->mutex = m)
 
 #define SCM_HASHTABLE_N_BUCKETS(h) \
  SCM_SIMPLE_VECTOR_LENGTH (SCM_HASHTABLE_VECTOR (h))
@@ -74,6 +76,7 @@ typedef struct scm_t_hashtable {
   int size_index;		/* index into hashtable_size */
   int min_size_index;		/* minimum size_index */
   unsigned long (*hash_fn) ();  /* for rehashing after a GC. */
+  SCM mutex;			/* mutex for thread safety */
 } scm_t_hashtable;
 
 \f
@@ -132,6 +135,7 @@ SCM_API SCM scm_hash_fold (SCM proc, SCM init, SCM hash);
 SCM_API SCM scm_hash_for_each (SCM proc, SCM hash);
 SCM_API SCM scm_hash_for_each_handle (SCM proc, SCM hash);
 SCM_API SCM scm_hash_map_to_list (SCM proc, SCM hash);
+SCM_API SCM scm_hash_use_mutex_x (SCM hash, SCM mutex);
 SCM_API void scm_hashtab_prehistory (void);
 SCM_API void scm_init_hashtab (void);
 
diff --git a/libguile/modules.c b/libguile/modules.c
index 444ece1..da397d8 100644
--- a/libguile/modules.c
+++ b/libguile/modules.c
@@ -279,7 +279,6 @@ SCM_DEFINE (scm_env_module, "env-module", 1, 0, 0,
  * C level implementation of the standard eval closure
  *
  * This increases loading speed substantially.
- * The code will be replaced by the low-level environments in next release.
  */
 
 static SCM module_make_local_var_x_var;
diff --git a/libguile/throw.c b/libguile/throw.c
index 92c5a1a..b2ec371 100644
--- a/libguile/throw.c
+++ b/libguile/throw.c
@@ -264,6 +264,7 @@ scm_c_with_throw_handler (SCM tag,
 {
   SCM pre_unwind, answer;
   struct pre_unwind_data c;
+  SCM new_dynwinds;
 
   c.handler = handler;
   c.handler_data = handler_data;
@@ -272,7 +273,9 @@ scm_c_with_throw_handler (SCM tag,
   pre_unwind = make_pre_unwind_data (&c);
 
   SCM_CRITICAL_SECTION_START;
-  scm_i_set_dynwinds (scm_acons (tag, pre_unwind, scm_i_dynwinds ()));
+  new_dynwinds = scm_acons (tag, pre_unwind, SCM_BOOL_F);
+  SCM_SETCDR (new_dynwinds, scm_i_dynwinds ());
+  scm_i_set_dynwinds (new_dynwinds);
   SCM_CRITICAL_SECTION_END;
 
   answer = (*body) (body_data);
diff --git a/test-suite/standalone/Makefile.am b/test-suite/standalone/Makefile.am
index 356573f..bb2f089 100644
--- a/test-suite/standalone/Makefile.am
+++ b/test-suite/standalone/Makefile.am
@@ -137,6 +137,11 @@ test_scm_with_guile_LDADD = ${top_builddir}/libguile/libguile.la
 check_PROGRAMS += test-scm-with-guile
 TESTS += test-scm-with-guile
 
+test_define_race_CFLAGS = ${test_cflags}
+test_define_race_LDADD = ${top_builddir}/libguile/libguile.la
+check_PROGRAMS += test-define-race
+TESTS += test-define-race
+
 else
 
 EXTRA_DIST += test-with-guile-module.c test-scm-with-guile.c
diff --git a/test-suite/standalone/test-define-race.c b/test-suite/standalone/test-define-race.c
new file mode 100644
index 0000000..a79dcee
--- /dev/null
+++ b/test-suite/standalone/test-define-race.c
@@ -0,0 +1,163 @@
+/* Copyright (C) 2009 Free Software Foundation, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/*
+ * test-define-race.c
+ *
+ * Program to exhibit a race in the guile-1.8.x define code.
+ * See https://savannah.gnu.org/bugs/index.php?24867 for general 
+ * status and description.
+ *
+ * Summary: variable definition and lookup is not thread-safe in guile;
+ * attempting to look up a variable while another thread is defining 
+ * a variable can sometimes lead to the first thread loosing, and not
+ * seeing an existing, defined variable. Alternately, difining two
+ * different variables at the same time can result in one of them
+ * failing to be defined; on rarer occasions, a seg-fault results.
+ *
+ * Compile as:
+ * cc test-define-race.c -lpthread -lguile
+ *
+ * May need to run several times to see the bug(s).
+ *
+ * Linas Vepstas <linasvepstas@gmail.com> December 2008
+ */
+
+#include <libguile.h>
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+typedef struct
+{
+  int id;
+  int count;
+  int do_exit;
+  int error_count;
+  int last_ok;
+
+} state;
+
+static void * guile_mode_definer(void * ud)
+{
+  SCM result;
+  int val;
+
+  char buff[1000];
+  state * s = (state *) ud;
+
+  /* Increment before evaluation, in case evaluation raises an
+     error. */
+  s->count ++;
+  
+  if (s->last_ok)
+    {
+      /* See if the previous definition holds the expected value.  If
+	 the expected value is undefined, scm_c_eval_string will raise
+	 an error. */
+      s->error_count ++;
+      s->last_ok = 0;
+      sprintf (buff, "x%d-%d\n", s->id, s->count - 1);
+      result = scm_c_eval_string (buff);
+      val = scm_to_int(result);
+
+      if (val != s->count - 1)
+	printf ("Define mismatch on thread %d\n", s->id);
+      else
+	s->error_count --;
+    }
+
+  /* Define a new variable with a new value. */
+  sprintf (buff, "(define x%d-%d %d)\n", s->id, s->count, s->count);
+  scm_c_eval_string (buff);
+
+  /* If we reach here, the definition was apparently successful, so we
+     can check it on the next iteration. */
+  s->last_ok = 1;
+
+  return NULL;
+}
+
+static void * definer (void *ud)
+{
+  int i;
+  state * s = (state *) ud;
+
+  while(!s->do_exit)
+    for (i=0; i<4000; i++)
+      {
+	scm_with_guile (guile_mode_definer, ud);
+	sched_yield();  /* try to get the threads to inter-leave a lot */
+      }
+  return NULL;
+}
+
+static void init_ctr(state *s, int val)
+{
+  s->id = val;
+  s->count = 0;
+  s->do_exit = 0;
+  s->error_count = 0;
+  s->last_ok = 0;
+}
+
+static void * be_threadsafe(void * ud)
+{
+  scm_c_eval_string ("(begin (write the-root-module) (newline))");
+  scm_c_eval_string ("(begin (write (module-obarray the-root-module)) (newline))");
+  scm_c_eval_string ("(hash-use-mutex! (module-obarray the-root-module) (make-mutex))");
+  scm_c_eval_string ("(begin (write (current-module)) (newline))");
+  scm_c_eval_string ("(begin (write (module-obarray (current-module))) (newline))");
+  scm_c_eval_string ("(hash-use-mutex! (module-obarray (current-module)) (make-mutex))");
+  return NULL;
+}
+
+int main(int argc, char ** argv)
+{
+  pthread_t th1, th2, th3, th4;
+  state counter1, counter2, counter3, counter4;
+  int error_total;
+
+  scm_with_guile (be_threadsafe, NULL);
+
+  init_ctr (&counter1, 1);
+  init_ctr (&counter2, 2);
+  init_ctr (&counter3, 3);
+  init_ctr (&counter4, 4);
+
+  pthread_create(&th1, NULL, definer, (void *) &counter1);
+  pthread_create(&th2, NULL, definer, (void *) &counter2);
+  pthread_create(&th3, NULL, definer, (void *) &counter3);
+  pthread_create(&th4, NULL, definer, (void *) &counter4);
+
+  sleep(200);
+  counter1.do_exit = 1;
+  counter2.do_exit = 1;
+  counter3.do_exit = 1;
+  counter4.do_exit = 1;
+  printf ("guile-define test case: good-bye!\n");
+
+  pthread_join(th1, NULL);
+  pthread_join(th2, NULL);
+  pthread_join(th3, NULL);
+  pthread_join(th4, NULL);
+
+  error_total = (counter1.error_count + counter2.error_count +
+		 counter3.error_count + counter4.error_count);
+  printf("test-define-race: %d error(s)\n", error_total);
+  exit (error_total);
+}
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12  0:07     ` Neil Jerram
@ 2009-03-12  0:53       ` Neil Jerram
  2009-03-12  1:29         ` Linas Vepstas
  2009-03-12 22:13         ` Andy Wingo
  0 siblings, 2 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-12  0:53 UTC (permalink / raw)
  To: Guile Development; +Cc: Linas Vepstas

Neil Jerram <neil@ossau.uklinux.net> writes:

> #<module (guile) 40376f00>
> #<directory (guile-user) 40379680>
> ERROR: ERROR: Unbound variable: x1-100499
> Unbound variable: define
> ERROR: Unbound variable: x4-100596
> ERROR: Unbound variable: define
> ERROR: Unbound variable: define
> ERROR: Unbound variable: define
> guile-define test case: good-bye!
> test-define-race: 2 error(s)
> FAIL: test-define-race
>
> I'm off to sleep now, so I thought I'd post what I've done in case
> others have thoughts on it or can see something wrong.

Thanks to a hint from helgrind, I think the problem might be that the
symbols obarray is not thread-safe.  But surely using a mutex for
every symbol that Guile reads would be terrible for performance... so
I hope there is an alternative.

          Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12  0:53       ` Neil Jerram
@ 2009-03-12  1:29         ` Linas Vepstas
  2009-03-12  3:09           ` Clinton Ebadi
  2009-03-25 22:13           ` Neil Jerram
  2009-03-12 22:13         ` Andy Wingo
  1 sibling, 2 replies; 51+ messages in thread
From: Linas Vepstas @ 2009-03-12  1:29 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/11 Neil Jerram <neil@ossau.uklinux.net>:
> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> #<module (guile) 40376f00>
>> #<directory (guile-user) 40379680>
>> ERROR: ERROR: Unbound variable: x1-100499
>> Unbound variable: define
>> ERROR: Unbound variable: x4-100596
>> ERROR: Unbound variable: define
>> ERROR: Unbound variable: define
>> ERROR: Unbound variable: define
>> guile-define test case: good-bye!
>> test-define-race: 2 error(s)
>> FAIL: test-define-race
>>
>> I'm off to sleep now, so I thought I'd post what I've done in case
>> others have thoughts on it or can see something wrong.
>
> Thanks to a hint from helgrind, I think the problem might be that the
> symbols obarray is not thread-safe.  But surely using a mutex for
> every symbol that Guile reads would be terrible for performance... so
> I hope there is an alternative.

Well, once you identify the section that needs locking,
you'll want to use an rwlock instead of a mutex.  The
rwlock (pthread_rwlock_rdlock) allows multiple
simultaneous readers.  The writers, however, get
exclusive access. (pthread_rwlock_wrlock)

I have no clue as to what the overhead is. I know that
the powerpc architecture implements locking primitives
that are extremely fast (a few cycles); I assume Intel does
as well.  These are used liberally within the kernel to
implement all sorts of fast atomic ops.

Every now and then, some newcomer reinvents a locking
system using these primitives, "for performance reasons",
and posts it to some mailing list. The standard, canonical
response is "you're an idiot", followed by pointers to 5
different bugs in the poster's code, and an explanation that
the pthreads implementation is faster than his, and that its
also debugged. (The response is less blunt when the
poster is a customer of yours, and the email was delivered
via some bushy-tailed executive).

An alternative idea is to try to apply some principles from
functional programming, e.g. "copy on write" (COW): when
the obarray needs to be updated, make a copy of it. Any
readers in other threads continue to safely use the
original version. When the new obarray is fully constructed,
new readers will use it.  The old obarray is garbage-collected
in due time.  I dunno if this could be applied for this case;
at least we have garbage collection, which removes one
major stumbling block for using COW.

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12  1:29         ` Linas Vepstas
@ 2009-03-12  3:09           ` Clinton Ebadi
  2009-03-25 22:13           ` Neil Jerram
  1 sibling, 0 replies; 51+ messages in thread
From: Clinton Ebadi @ 2009-03-12  3:09 UTC (permalink / raw)
  To: Guile Development; +Cc: linasvepstas

Linas Vepstas <linasvepstas@gmail.com> writes:

> An alternative idea is to try to apply some principles from
> functional programming, e.g. "copy on write" (COW): when
> the obarray needs to be updated, make a copy of it. Any
> readers in other threads continue to safely use the
> original version. When the new obarray is fully constructed,
> new readers will use it.  The old obarray is garbage-collected
> in due time.  I dunno if this could be applied for this case;
> at least we have garbage collection, which removes one
> major stumbling block for using COW.

If the obarray were implemented as a persistent tree instead of hash
table you wouldn't even need to copy it--the only operation that would
need to be atomic would be committing the change--swap the pointer to
the root, but only if the current root is the same as when you started
modifying the obarray; otherwise merge your changes into what is
presently the root if it changed in the meantime (alternatively, lock
around the entire insertion/replace block).

Lookup time would be increased from amortized O(1) to O(logN) and
generate a bit of garbage on every insert (depth(tree) nodes), but
minimizes potentially expensive locking (e.g. if multiple concurrent
readers are allowed while writing to the hash table you have to handle
locking everyone out when the table needs rehashing--with the persistent
tree approach readers *never* block). I suppose the best way to find out
would be to implement it, and I may do so as an experiment (it would
require modifying some C, however, which I am not so inclined to use for
a mere thought experiment).

-- 
Lindsay (Carlton): should i eat more post its




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12  0:53       ` Neil Jerram
  2009-03-12  1:29         ` Linas Vepstas
@ 2009-03-12 22:13         ` Andy Wingo
  2009-03-13 19:13           ` Neil Jerram
  2009-03-14 14:23           ` Ludovic Courtès
  1 sibling, 2 replies; 51+ messages in thread
From: Andy Wingo @ 2009-03-12 22:13 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Linas Vepstas, Guile Development

Hey Neil,

On Thu 12 Mar 2009 01:53, Neil Jerram <neil@ossau.uklinux.net> writes:

> Thanks to a hint from helgrind, I think the problem might be that the
> symbols obarray is not thread-safe.  But surely using a mutex for
> every symbol that Guile reads would be terrible for performance...

Dunno, in GStreamer we found that uncontended locks are cheap cheap
cheap. AFAIU they don't even cause context switches. And the reader will
be less important in terms of e.g. startup time once the VM lands.

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12 22:13         ` Andy Wingo
@ 2009-03-13 19:13           ` Neil Jerram
  2009-03-25 23:19             ` Neil Jerram
  2009-03-14 14:23           ` Ludovic Courtès
  1 sibling, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-13 19:13 UTC (permalink / raw)
  To: Andy Wingo, Linas Vepstas; +Cc: Clinton Ebadi, Guile Development

Andy Wingo <wingo@pobox.com> writes:

> Hey Neil,
>
> On Thu 12 Mar 2009 01:53, Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> Thanks to a hint from helgrind, I think the problem might be that the
>> symbols obarray is not thread-safe.  But surely using a mutex for
>> every symbol that Guile reads would be terrible for performance...
>
> Dunno, in GStreamer we found that uncontended locks are cheap cheap
> cheap. AFAIU they don't even cause context switches. And the reader will
> be less important in terms of e.g. startup time once the VM lands.

Hi Linas, Andy,

Many thanks for your input on this.  I'll go ahead with a mutex or
rwlock.  First thing is to see if it fixes the problem; if it does,
I'll try to measure the performance impact.

      Neil

PS. And thanks Clinton too - although I'd prefer not to have to do
such a big rewrite if we can manage without it.




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12 22:13         ` Andy Wingo
  2009-03-13 19:13           ` Neil Jerram
@ 2009-03-14 14:23           ` Ludovic Courtès
  2009-03-16 22:57             ` Andy Wingo
  1 sibling, 1 reply; 51+ messages in thread
From: Ludovic Courtès @ 2009-03-14 14:23 UTC (permalink / raw)
  To: guile-devel

Hello!

Andy Wingo <wingo@pobox.com> writes:

> Dunno, in GStreamer we found that uncontended locks are cheap cheap
> cheap. AFAIU they don't even cause context switches.

On GNU/Linux they are implemented using futexes and the kernel is not
invoked when there's no contention.

Still, I would measure the impact on `module-define!' and `module-ref'
because that's at least an additional function call.

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-14 14:23           ` Ludovic Courtès
@ 2009-03-16 22:57             ` Andy Wingo
  0 siblings, 0 replies; 51+ messages in thread
From: Andy Wingo @ 2009-03-16 22:57 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

On Sat 14 Mar 2009 15:23, ludo@gnu.org (Ludovic Courtès) writes:

> Still, I would measure the impact on `module-define!' and `module-ref'
> because that's at least an additional function call.

You probably know, but at least in the VM (and I thought in the
interpreter too), the locations are computed once, and the resulting
variable is cached. So module-ref is not called repeatedly for the same
binding.

Andy
-- 
http://wingolog.org/




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-10 23:57   ` Neil Jerram
  2009-03-12  0:07     ` Neil Jerram
@ 2009-03-25 18:57     ` Neil Jerram
  1 sibling, 0 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-25 18:57 UTC (permalink / raw)
  To: Guile Development; +Cc: Linas Vepstas

Neil Jerram <neil@ossau.uklinux.net> writes:

> I've been running Linas's define-race test program, and hitting the
> `throw from within critical section' quite a lot.
>
> We've already discussed this a couple of times:
> http://www.mail-archive.com/bug-guile@gnu.org/msg04613.html
> http://www.mail-archive.com/guile-devel@gnu.org/msg03016.html
>
> Andy proposed just removing the check for 1.8.x, but I would prefer to
> keep it if possible, as I think there probably still are genuine
> problems where the check could help.  What about the patch below?

I don't believe anyone commented on this, but I've committed it now to
1.8.x.  In case of any concerns, please let me know.

     Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-05 20:05       ` Neil Jerram
  2009-03-05 20:40         ` Linas Vepstas
  2009-03-05 20:57         ` Linas Vepstas
@ 2009-03-25 19:00         ` Neil Jerram
  2009-03-25 22:08           ` Ludovic Courtès
  2 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-25 19:00 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Neil Jerram <neil@ossau.uklinux.net> writes:

>> Yes, it's an unrelated bug.  All of the places that raise errors (and
>> so exit non-locally) should exit the critical section first.
>
>> You're absolutely right.  I'll leave this part out, and generate a
>> separate patch for it.
>
> Here's the separate patch...

> Subject: [PATCH] Avoid throw from critical section, given invalid sigaction call
>
> * libguile/scmsigs.c (scm_sigaction_for_thread): Exit critical section
>   before raising out-of-range error.
>
> * test-suite/Makefile.am (SCM_TESTS): Add signals.test.
>
> * test-suite/tests/signals.test: New file.

I've committed this now too.  (Also to 1.8.x.  I'll come back later to
look at which of all these lock fixes are needed in master.)

       Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-25 19:00         ` Neil Jerram
@ 2009-03-25 22:08           ` Ludovic Courtès
  0 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-03-25 22:08 UTC (permalink / raw)
  To: guile-devel

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:

>> Subject: [PATCH] Avoid throw from critical section, given invalid sigaction call
>>
>> * libguile/scmsigs.c (scm_sigaction_for_thread): Exit critical section
>>   before raising out-of-range error.
>>
>> * test-suite/Makefile.am (SCM_TESTS): Add signals.test.
>>
>> * test-suite/tests/signals.test: New file.
>
> I've committed this now too.  (Also to 1.8.x.  I'll come back later to
> look at which of all these lock fixes are needed in master.)

Perfect!

This one can go in `master' too, I think.

Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> I've been running Linas's define-race test program, and hitting the
>> `throw from within critical section' quite a lot.
>>
>> We've already discussed this a couple of times:
>> http://www.mail-archive.com/bug-guile@gnu.org/msg04613.html
>> http://www.mail-archive.com/guile-devel@gnu.org/msg03016.html
>>
>> Andy proposed just removing the check for 1.8.x, but I would prefer to
>> keep it if possible, as I think there probably still are genuine
>> problems where the check could help.  What about the patch below?
>
> I don't believe anyone commented on this, but I've committed it now to
> 1.8.x.  In case of any concerns, please let me know.

Sorry for not commenting earlier.  This looks good to me.  I suppose it
can't hurt in `master'; the future of critical sections is uncertain,
though.

Thanks!

Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-12  1:29         ` Linas Vepstas
  2009-03-12  3:09           ` Clinton Ebadi
@ 2009-03-25 22:13           ` Neil Jerram
  2009-03-25 22:34             ` Linas Vepstas
  1 sibling, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-25 22:13 UTC (permalink / raw)
  To: linasvepstas; +Cc: Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> Well, once you identify the section that needs locking,
> you'll want to use an rwlock instead of a mutex.  The
> rwlock (pthread_rwlock_rdlock) allows multiple
> simultaneous readers.  The writers, however, get
> exclusive access. (pthread_rwlock_wrlock)

I forgot to comment on this before...

In my view the important case to optimize is the one with only a
single thread - i.e. I don't want to add any significant performance
penalty in that case.  And for that case, IIUC, there is no difference
between a mutex and an rwlock.

Also I would guess that we would hit more portability issues in future
with a rwlock than with a mutex.  Of course, we could look at
configure time for rwlock support, and fall back to a mutex - but that
has a cost too, in code complexity.

On balance, therefore, I don't think using a rwlock rather than a
mutex is worthwhile.

Regards,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-25 22:13           ` Neil Jerram
@ 2009-03-25 22:34             ` Linas Vepstas
  0 siblings, 0 replies; 51+ messages in thread
From: Linas Vepstas @ 2009-03-25 22:34 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/25 Neil Jerram <neil@ossau.uklinux.net>:
> Linas Vepstas <linasvepstas@gmail.com> writes:
>
>> Well, once you identify the section that needs locking,
>> you'll want to use an rwlock instead of a mutex.  The
>> rwlock (pthread_rwlock_rdlock) allows multiple
>> simultaneous readers.  The writers, however, get
>> exclusive access. (pthread_rwlock_wrlock)
>
>  I don't want to add any significant performance
> penalty in that case.  And for that case, IIUC, there is no difference
> between a mutex and an rwlock.

If there's no perf difference, then isn't that an argument
*for* rwlock, instead of against it?

> Also I would guess that we would hit more portability issues in future
> with a rwlock than with a mutex.

Well, since rwlock is a part of posix threads, wouldn't it
be available universally (wherever posix threads are
available)?

Are you voicing some suspicion that its somehow
buggier, or less tested?

(I have no direct experience on these two perf/portability
issues myself).

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-13 19:13           ` Neil Jerram
@ 2009-03-25 23:19             ` Neil Jerram
  2009-03-26  3:40               ` Linas Vepstas
                                 ` (2 more replies)
  0 siblings, 3 replies; 51+ messages in thread
From: Neil Jerram @ 2009-03-25 23:19 UTC (permalink / raw)
  To: Andy Wingo; +Cc: Clinton Ebadi, Linas Vepstas, Guile Development

[-- Attachment #1: Type: text/plain, Size: 1774 bytes --]

Neil Jerram <neil@ossau.uklinux.net> writes:

> Many thanks for your input on this.  I'll go ahead with a mutex or
> rwlock.  First thing is to see if it fixes the problem; if it does,
> I'll try to measure the performance impact.

Attached are 3 patches relating to thread-safe define.

#1 is Linas's test-define-race test, modified by me.  This test
reliably fails (without the following fix) if run for 200 seconds, but
I don't think we can have a test in Guile's "make check" that takes
200s.  Therefore I've coded it so that

- it runs for a duration specified by environment variable
  GUILE_TEST_DEFINE_RACE_DURATION

- it defaults to just 10s if that variable isn't defined.

So people who want to run this test meaningfully on their platforms
can do "GUILE_TEST_DEFINE_RACE_DURATION=200 make check".

#2 makes the symbols hash thread-safe, and it appears that this
completely fixes the define-race problem.  I don't understand why we
apparently don't need patch #3 as well - but that's what my results
indicate.

#3 allows a non-weak hash table to be automatically thread-safe, by
associating a fat mutex with it.

#3 is what I tried first, applying it to the module-obarray of the
(guile) and (guile-user) modules.  But test-define-race still gave
errors.  Then I added #2, and that fixed the errors.  Then I checked
#2 without #3, and that fixes all the test-define-race errors too.  So
apparently #3 is not needed.

It looks like we should commit #1 and #2 (although maybe with rwlock
instead of mutex), so please send any comments you have on those.  I'd
also appreciate thoughts on #3 as a possible hash table enhancement
(for master), and on why we don't apparently need to make the module
obarray hash tables thread-safe in this way.

Regards,
        Neil


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-New-test-for-thread-safe-define.patch --]
[-- Type: text/x-diff, Size: 6167 bytes --]

From 1f2707cc9473548863b483fd12d97afe7d0c94c2 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 25 Mar 2009 20:55:37 +0000
Subject: [PATCH] New test for thread-safe define

Written by Linas Vepstas; modified by Neil Jerram.

* test-suite/standalone/Makefile.am: Add test-define-race test.

* test-suite/standalone/test-define-race.c: New test.
---
 test-suite/standalone/Makefile.am        |    5 +
 test-suite/standalone/test-define-race.c |  168 ++++++++++++++++++++++++++++++
 2 files changed, 173 insertions(+), 0 deletions(-)
 create mode 100644 test-suite/standalone/test-define-race.c

diff --git a/test-suite/standalone/Makefile.am b/test-suite/standalone/Makefile.am
index e7cfd82..766e447 100644
--- a/test-suite/standalone/Makefile.am
+++ b/test-suite/standalone/Makefile.am
@@ -145,6 +145,11 @@ test_scm_with_guile_LDADD = ${top_builddir}/libguile/libguile.la
 check_PROGRAMS += test-scm-with-guile
 TESTS += test-scm-with-guile
 
+test_define_race_CFLAGS = ${test_cflags}
+test_define_race_LDADD = ${top_builddir}/libguile/libguile.la
+check_PROGRAMS += test-define-race
+TESTS += test-define-race
+
 else
 
 EXTRA_DIST += test-with-guile-module.c test-scm-with-guile.c
diff --git a/test-suite/standalone/test-define-race.c b/test-suite/standalone/test-define-race.c
new file mode 100644
index 0000000..f375d55
--- /dev/null
+++ b/test-suite/standalone/test-define-race.c
@@ -0,0 +1,168 @@
+/* Copyright (C) 2009 Free Software Foundation, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/*
+ * test-define-race.c
+ *
+ * Program to exhibit a race in the guile-1.8.x define code.
+ * See https://savannah.gnu.org/bugs/index.php?24867 for general 
+ * status and description.
+ *
+ * Summary: variable definition and lookup is not thread-safe in guile;
+ * attempting to look up a variable while another thread is defining 
+ * a variable can sometimes lead to the first thread loosing, and not
+ * seeing an existing, defined variable. Alternately, difining two
+ * different variables at the same time can result in one of them
+ * failing to be defined; on rarer occasions, a seg-fault results.
+ *
+ * Compile as:
+ * cc test-define-race.c -lpthread -lguile
+ *
+ * May need to run several times to see the bug(s).
+ *
+ * Linas Vepstas <linasvepstas@gmail.com> December 2008
+ */
+
+#include <libguile.h>
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+typedef struct
+{
+  int id;
+  int count;
+  int do_exit;
+  int error_count;
+  int last_ok;
+
+} state;
+
+static void * guile_mode_definer(void * ud)
+{
+  SCM result;
+  int val;
+
+  char buff[1000];
+  state * s = (state *) ud;
+
+  /* Increment before evaluation, in case evaluation raises an
+     error. */
+  s->count ++;
+  
+  if (s->last_ok)
+    {
+      /* See if the previous definition holds the expected value.  If
+	 the expected value is undefined, scm_c_eval_string will raise
+	 an error. */
+      s->error_count ++;
+      s->last_ok = 0;
+      sprintf (buff, "x%d-%d\n", s->id, s->count - 1);
+      result = scm_c_eval_string (buff);
+      val = scm_to_int(result);
+
+      if (val != s->count - 1)
+	printf ("Define mismatch on thread %d\n", s->id);
+      else
+	s->error_count --;
+    }
+
+  /* Define a new variable with a new value. */
+  sprintf (buff, "(define x%d-%d %d)\n", s->id, s->count, s->count);
+  scm_c_eval_string (buff);
+
+  /* If we reach here, the definition was apparently successful, so we
+     can check it on the next iteration. */
+  s->last_ok = 1;
+
+  return NULL;
+}
+
+static void * definer (void *ud)
+{
+  int i;
+  state * s = (state *) ud;
+
+  while(!s->do_exit)
+    for (i=0; i<4000; i++)
+      {
+	scm_with_guile (guile_mode_definer, ud);
+	sched_yield();  /* try to get the threads to inter-leave a lot */
+      }
+  return NULL;
+}
+
+static void init_ctr(state *s, int val)
+{
+  s->id = val;
+  s->count = 0;
+  s->do_exit = 0;
+  s->error_count = 0;
+  s->last_ok = 0;
+}
+
+static void * setup(void * ud)
+{
+  int *duration = (int *)ud;
+
+  /* Query an environment variable to find out how long to run this
+     test for, defaulting to 10s. */
+  *duration = scm_to_int (scm_c_eval_string ("(catch #t "
+    "(lambda () "
+    "  (round (string->number (string-append \"#e\" "
+				    "(or (getenv \"GUILE_TEST_DEFINE_RACE_DURATION\") \"10\"))))) "
+    "(lambda _ "
+    "  (write _) (newline) 10))"));
+
+  return NULL;
+}
+
+int main(int argc, char ** argv)
+{
+  pthread_t th1, th2, th3, th4;
+  state counter1, counter2, counter3, counter4;
+  int error_total;
+  int duration;
+
+  scm_with_guile (setup, &duration);
+
+  init_ctr (&counter1, 1);
+  init_ctr (&counter2, 2);
+  init_ctr (&counter3, 3);
+  init_ctr (&counter4, 4);
+
+  pthread_create(&th1, NULL, definer, (void *) &counter1);
+  pthread_create(&th2, NULL, definer, (void *) &counter2);
+  pthread_create(&th3, NULL, definer, (void *) &counter3);
+  pthread_create(&th4, NULL, definer, (void *) &counter4);
+
+  sleep(duration);
+  counter1.do_exit = 1;
+  counter2.do_exit = 1;
+  counter3.do_exit = 1;
+  counter4.do_exit = 1;
+
+  pthread_join(th1, NULL);
+  pthread_join(th2, NULL);
+  pthread_join(th3, NULL);
+  pthread_join(th4, NULL);
+
+  error_total = (counter1.error_count + counter2.error_count +
+		 counter3.error_count + counter4.error_count);
+  printf("test-define-race: %d error(s) in %ds\n", error_total, duration);
+  exit (error_total);
+}
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: 0002-Make-the-interned-symbols-hash-thread-safe.patch --]
[-- Type: text/x-diff, Size: 13228 bytes --]

From 7b6b81cfef3705d20715b978b51004a8c4a6ddf6 Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 25 Mar 2009 22:34:23 +0000
Subject: [PATCH] Make the interned symbols hash thread-safe

These changes introduce a mutex that protects all accesses to and
modifications of the symbols hash.  They allow the test-define-race
test to run without errors for a long time.  (Well, for 200s anyway.)

* libguile/environments.c (obarray_enter, obarray_replace): Extra
  parameter on scm_i_rehash calls.

* libguile/hashtab.c: Include symbols.h.

  (scm_i_rehash): New mutex parameter.  When non-NULL, unlock and
  relock this mutex around allocations, and recheck new size
  calculation (in case another thread has changed it).

  (rehash_after_gc): Special treatment for the symbol hash.

  (scm_hash_fn_create_handle_x, scm_hash_fn_remove_x): Extra parameter on scm_i_rehash calls.

* libguile/hashtab.h (scm_i_rehash): New mutex parameter.

* libguile/symbols.c (symbols): Rename scm_i_symbols and make non-static.

  (symbols_mutex): New variable.

  (scm_sys_symbols): symbols -> scm_i_symbols.

  (lookup_interned_symbol): Lock symbols_mutex while performing lookup.

  (intern_symbol): Add name, len and raw_hash parameters.  Lock
  symbols_mutex while checking and modifying the symbols hash.  Once
  the mutex is locked, recheck in case another thread has already
  interned the symbol.  Return interned symbol, in order to tell
  callers about this last case.

  (scm_i_c_mem2symbol, scm_i_mem2symbol, scm_take_locale_symboln):
  Update intern_symbol calls.

  (scm_i_rehash_symbols_after_gc): New function.

  (scm_symbols_prehistory): symbols -> scm_i_symbols.  Initialize
  symbols_mutex.

* libguile/symbols.h (scm_i_symbols, scm_i_rehash_symbols_after_gc):
  New declarations.
---
 libguile/environments.c |    4 +-
 libguile/hashtab.c      |   53 +++++++++++++++++++--------
 libguile/hashtab.h      |    3 +-
 libguile/symbols.c      |   92 +++++++++++++++++++++++++++++++++++++----------
 libguile/symbols.h      |    2 +
 5 files changed, 117 insertions(+), 37 deletions(-)

diff --git a/libguile/environments.c b/libguile/environments.c
index 13d63c0..077a397 100644
--- a/libguile/environments.c
+++ b/libguile/environments.c
@@ -516,7 +516,7 @@ obarray_enter (SCM obarray, SCM symbol, SCM data)
   SCM_SET_HASHTABLE_BUCKET  (obarray, hash, slot);
   SCM_HASHTABLE_INCREMENT (obarray);
   if (SCM_HASHTABLE_N_ITEMS (obarray) > SCM_HASHTABLE_UPPER (obarray))
-    scm_i_rehash (obarray, scm_i_hash_symbol, 0, "obarray_enter");
+    scm_i_rehash (obarray, scm_i_hash_symbol, 0, "obarray_enter", NULL);
 
   return entry;
 }
@@ -550,7 +550,7 @@ obarray_replace (SCM obarray, SCM symbol, SCM data)
   SCM_SET_HASHTABLE_BUCKET (obarray, hash, slot);
   SCM_HASHTABLE_INCREMENT (obarray);
   if (SCM_HASHTABLE_N_ITEMS (obarray) > SCM_HASHTABLE_UPPER (obarray))
-    scm_i_rehash (obarray, scm_i_hash_symbol, 0, "obarray_replace");
+    scm_i_rehash (obarray, scm_i_hash_symbol, 0, "obarray_replace", NULL);
 
   return SCM_BOOL_F;
 }
diff --git a/libguile/hashtab.c b/libguile/hashtab.c
index ea7fc69..0292012 100644
--- a/libguile/hashtab.c
+++ b/libguile/hashtab.c
@@ -30,6 +30,7 @@
 #include "libguile/root.h"
 #include "libguile/vectors.h"
 #include "libguile/ports.h"
+#include "libguile/symbols.h"
 
 #include "libguile/validate.h"
 #include "libguile/hashtab.h"
@@ -117,13 +118,15 @@ void
 scm_i_rehash (SCM table,
 	      unsigned long (*hash_fn)(),
 	      void *closure,
-	      const char* func_name)
+	      const char* func_name,
+	      scm_i_pthread_mutex_t *mutex)
 {
-  SCM buckets, new_buckets;
-  int i;
+  SCM buckets, new_buckets = SCM_BOOL_F;
+  int i = 0;
   unsigned long old_size;
   unsigned long new_size;
 
+ restart:
   if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table))
     {
       /* rehashing is not triggered when i <= min_size */
@@ -133,7 +136,7 @@ scm_i_rehash (SCM table,
       while (i > SCM_HASHTABLE (table)->min_size_index
 	     && SCM_HASHTABLE_N_ITEMS (table) < hashtable_size[i] / 4);
     }
-  else
+  else if (SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
     {
       i = SCM_HASHTABLE (table)->size_index + 1;
       if (i >= HASHTABLE_SIZE_N)
@@ -157,12 +160,28 @@ scm_i_rehash (SCM table,
   SCM_HASHTABLE (table)->upper = 9 * new_size / 10;
   buckets = SCM_HASHTABLE_VECTOR (table);
   
-  if (SCM_HASHTABLE_WEAK_P (table))
-    new_buckets = scm_i_allocate_weak_vector (SCM_HASHTABLE_FLAGS (table),
-					      scm_from_ulong (new_size),
-					      SCM_EOL);
-  else
-    new_buckets = scm_c_make_vector (new_size, SCM_EOL);
+  if (scm_is_false (new_buckets) ||
+      (SCM_SIMPLE_VECTOR_LENGTH (new_buckets) != new_size))
+    {
+      /* Need to allocate or reallocate the new_buckets vector. */
+      if (mutex)
+	scm_i_pthread_mutex_unlock (mutex);
+
+      if (SCM_HASHTABLE_WEAK_P (table))
+	new_buckets = scm_i_allocate_weak_vector (SCM_HASHTABLE_FLAGS (table),
+						  scm_from_ulong (new_size),
+						  SCM_EOL);
+      else
+	new_buckets = scm_c_make_vector (new_size, SCM_EOL);
+
+      if (mutex)
+	scm_i_pthread_mutex_lock (mutex);
+
+      /* Another thread could have messed with the hashtable while the
+	 mutex was unlocked.  So we now have to recalculate the rehash
+	 size from scratch. */
+      goto restart;
+    }
 
   /* When this is a weak hashtable, running the GC might change it.
      We need to cope with this while rehashing its elements.  We do
@@ -271,11 +290,15 @@ rehash_after_gc (void *dummy1 SCM_UNUSED,
       h = first;
       do
 	{
-	  /* Rehash only when we have a hash_fn.
+	  /* Special treatment for the symbol hash, as it is protected
+	     by a mutex. */
+	  if (scm_is_eq (h, scm_i_symbols))
+	    scm_i_rehash_symbols_after_gc ();
+	  /* Otherwise, rehash only when we have a hash_fn.
 	   */
-	  if (SCM_HASHTABLE (h)->hash_fn)
+	  else if (SCM_HASHTABLE (h)->hash_fn)
 	    scm_i_rehash (h, SCM_HASHTABLE (h)->hash_fn, NULL,
-			  "rehash_after_gc");
+			  "rehash_after_gc", NULL);
 	  last = h;
 	  h = SCM_HASHTABLE_NEXT (h);
 	} while (!scm_is_null (h));
@@ -490,7 +513,7 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
 	  SCM_HASHTABLE_INCREMENT (table);
 	  if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)
 	      || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
-	    scm_i_rehash (table, hash_fn, closure, FUNC_NAME);
+	    scm_i_rehash (table, hash_fn, closure, FUNC_NAME, NULL);
 	}
       return SCM_CAR (new_bucket);
     }
@@ -556,7 +579,7 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
 	{
 	  SCM_HASHTABLE_DECREMENT (table);
 	  if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table))
-	    scm_i_rehash (table, hash_fn, closure, "scm_hash_fn_remove_x");
+	    scm_i_rehash (table, hash_fn, closure, "scm_hash_fn_remove_x", NULL);
 	}
     }
   return h;
diff --git a/libguile/hashtab.h b/libguile/hashtab.h
index 1017354..3eca04c 100644
--- a/libguile/hashtab.h
+++ b/libguile/hashtab.h
@@ -96,7 +96,8 @@ SCM_API SCM scm_weak_key_hash_table_p (SCM h);
 SCM_API SCM scm_weak_value_hash_table_p (SCM h);
 SCM_API SCM scm_doubly_weak_hash_table_p (SCM h);
 
-SCM_API void scm_i_rehash (SCM table, unsigned long (*hash_fn)(), void *closure, const char*func_name);
+SCM_API void scm_i_rehash (SCM table, unsigned long (*hash_fn)(),
+			   void *closure, const char *func_name, scm_i_pthread_mutex_t *mutex);
 SCM_API void scm_i_scan_weak_hashtables (void);
 
 SCM_API SCM scm_hash_fn_get_handle (SCM table, SCM obj, unsigned long (*hash_fn) (), SCM (*assoc_fn) (), void * closure);
diff --git a/libguile/symbols.c b/libguile/symbols.c
index b1329fa..a233fe3 100644
--- a/libguile/symbols.c
+++ b/libguile/symbols.c
@@ -46,7 +46,8 @@
 
 \f
 
-static SCM symbols;
+SCM scm_i_symbols;
+static scm_i_pthread_mutex_t symbols_mutex;
 
 #ifdef GUILE_DEBUG
 SCM_DEFINE (scm_sys_symbols, "%symbols", 0, 0, 0,
@@ -54,7 +55,7 @@ SCM_DEFINE (scm_sys_symbols, "%symbols", 0, 0, 0,
 	    "Return the system symbol obarray.")
 #define FUNC_NAME s_scm_sys_symbols
 {
-  return symbols;
+  return scm_i_symbols;
 }
 #undef FUNC_NAME
 #endif
@@ -90,9 +91,13 @@ lookup_interned_symbol (const char *name, size_t len,
 {
   /* Try to find the symbol in the symbols table */
   SCM l;
-  unsigned long hash = raw_hash % SCM_HASHTABLE_N_BUCKETS (symbols);
+  unsigned long hash;
+
+  scm_i_pthread_mutex_lock (&symbols_mutex);
 
-  for (l = SCM_HASHTABLE_BUCKET (symbols, hash);
+  hash = raw_hash % SCM_HASHTABLE_N_BUCKETS (scm_i_symbols);
+
+  for (l = SCM_HASHTABLE_BUCKET (scm_i_symbols, hash);
        !scm_is_null (l);
        l = SCM_CDR (l))
     {
@@ -110,31 +115,69 @@ lookup_interned_symbol (const char *name, size_t len,
 		goto next_symbol;
 	    }
 
+	  scm_i_pthread_mutex_unlock (&symbols_mutex);
 	  return sym;
 	}
     next_symbol:
       ;
     }
 
+  scm_i_pthread_mutex_unlock (&symbols_mutex);
   return SCM_BOOL_F;
 }
 
 /* Intern SYMBOL, an uninterned symbol.  */
-static void
-intern_symbol (SCM symbol)
+static SCM
+intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)
 {
-  SCM slot, cell;
+  SCM slot, new_bucket, l;
   unsigned long hash;
 
-  hash = scm_i_symbol_hash (symbol) % SCM_HASHTABLE_N_BUCKETS (symbols);
-  slot = SCM_HASHTABLE_BUCKET (symbols, hash);
-  cell = scm_cons (symbol, SCM_UNDEFINED);
+  /* Allocate new cell and bucket before locking the mutex. */
+  new_bucket = scm_acons (symbol, SCM_UNDEFINED, SCM_BOOL_F);
+
+  scm_i_pthread_mutex_lock (&symbols_mutex);
+
+  hash = raw_hash % SCM_HASHTABLE_N_BUCKETS (scm_i_symbols);
+  slot = SCM_HASHTABLE_BUCKET (scm_i_symbols, hash);
+
+  /* We have to check again here that the symbol isn't already hashed;
+     another thread could have interned it inbetween when we tried to
+     look up the symbol and the scm_i_pthread_mutex_lock call
+     above. */
+  for (l = slot; !scm_is_null (l); l = SCM_CDR (l))
+    {
+      SCM sym = SCM_CAAR (l);
+      if ((scm_i_symbol_hash (sym) == raw_hash) &&
+	  (scm_i_symbol_length (sym) == len))
+	{
+	  const char *chrs = scm_i_symbol_chars (sym);
+	  size_t i = len;
+
+	  while (i != 0)
+	    {
+	      --i;
+	      if (name[i] != chrs[i])
+		goto next_symbol;
+	    }
+
+	  scm_i_pthread_mutex_unlock (&symbols_mutex);
+	  return sym;
+	}
+    next_symbol:
+      ;
+    }
+
+  SCM_SETCDR (new_bucket, slot);
+  SCM_SET_HASHTABLE_BUCKET (scm_i_symbols, hash, new_bucket);
+  SCM_HASHTABLE_INCREMENT (scm_i_symbols);
+  if (SCM_HASHTABLE_N_ITEMS (scm_i_symbols) > SCM_HASHTABLE_UPPER (scm_i_symbols))
+    scm_i_rehash (scm_i_symbols, scm_i_hash_symbol, 0, "intern_symbol",
+		  &symbols_mutex);
 
-  SCM_SET_HASHTABLE_BUCKET (symbols, hash, scm_cons (cell, slot));
-  SCM_HASHTABLE_INCREMENT (symbols);
+  scm_i_pthread_mutex_unlock (&symbols_mutex);
 
-  if (SCM_HASHTABLE_N_ITEMS (symbols) > SCM_HASHTABLE_UPPER (symbols))
-    scm_i_rehash (symbols, scm_i_hash_symbol, 0, "intern_symbol");
+  return symbol;
 }
 
 static SCM
@@ -149,7 +192,7 @@ scm_i_c_mem2symbol (const char *name, size_t len)
       /* The symbol was not found, create it.  */
       symbol = scm_i_c_make_symbol (name, len, 0, raw_hash,
 				    scm_cons (SCM_BOOL_F, SCM_EOL));
-      intern_symbol (symbol);
+      symbol = intern_symbol (symbol, name, len, raw_hash);
     }
 
   return symbol;
@@ -169,12 +212,22 @@ scm_i_mem2symbol (SCM str)
       /* The symbol was not found, create it.  */
       symbol = scm_i_make_symbol (str, 0, raw_hash,
 				  scm_cons (SCM_BOOL_F, SCM_EOL));
-      intern_symbol (symbol);
+      symbol = intern_symbol (symbol, name, len, raw_hash);
     }
 
   return symbol;
 }
 
+void
+scm_i_rehash_symbols_after_gc ()
+{
+  scm_i_pthread_mutex_lock (&symbols_mutex);
+
+  scm_i_rehash (scm_i_symbols, scm_i_hash_symbol, 0, "rehash_after_gc",
+		&symbols_mutex);
+
+  scm_i_pthread_mutex_unlock (&symbols_mutex);
+}
 
 static SCM
 scm_i_mem2uninterned_symbol (SCM str)
@@ -417,7 +470,7 @@ scm_take_locale_symboln (char *sym, size_t len)
     {
       res = scm_i_c_take_symbol (sym, len, 0, raw_hash,
 				 scm_cons (SCM_BOOL_F, SCM_EOL));
-      intern_symbol (res);
+      res = intern_symbol (res, sym, len, raw_hash);
     }
   else
     free (sym);
@@ -434,8 +487,9 @@ scm_take_locale_symbol (char *sym)
 void
 scm_symbols_prehistory ()
 {
-  symbols = scm_make_weak_key_hash_table (scm_from_int (2139));
-  scm_permanent_object (symbols);
+  scm_i_symbols = scm_make_weak_key_hash_table (scm_from_int (2139));
+  scm_permanent_object (scm_i_symbols);
+  scm_i_pthread_mutex_init (&symbols_mutex, NULL);
 }
 
 
diff --git a/libguile/symbols.h b/libguile/symbols.h
index f70d655..30c3005 100644
--- a/libguile/symbols.h
+++ b/libguile/symbols.h
@@ -34,6 +34,7 @@
 #define SCM_I_F_SYMBOL_UNINTERNED   0x100
 
 \f
+extern SCM scm_i_symbols;
 
 #ifdef GUILE_DEBUG
 SCM_API SCM scm_sys_symbols (void);
@@ -63,6 +64,7 @@ SCM_API SCM scm_take_locale_symboln (char *sym, size_t len);
 
 SCM_API unsigned long scm_i_hash_symbol (SCM obj, unsigned long n,
 					 void *closure);
+void scm_i_rehash_symbols_after_gc (void);
 
 SCM_API void scm_symbols_prehistory (void);
 SCM_API void scm_init_symbols (void);
-- 
1.5.6.5


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: 0003-Allow-non-weak-hash-tables-to-be-automatically-threa.patch --]
[-- Type: text/x-diff, Size: 9989 bytes --]

From 7627abf6ecf1770367ea2c5f8d61e14fa4843c8a Mon Sep 17 00:00:00 2001
From: Neil Jerram <neil@ossau.uklinux.net>
Date: Wed, 25 Mar 2009 22:50:46 +0000
Subject: [PATCH] Allow non-weak hash tables to be automatically thread-safe

This patch allows a fat mutex to be associated with a hash table.
When a hash table has an associated mutex, that mutex will be
automatically locked and unlocked around all operations (include
lookups) on the hash table's internal data.

* libguile/hashtab.c (make_hash_table): Init mutex field to #f.

  (hashtable_mark): New function.

  (scm_hash_fn_get_handle, scm_hash_fn_create_handle_x,
  scm_hash_fn_remove_x, scm_hash_clear_x, scm_internal_hash_fold,
  scm_internal_hash_for_each_handle): If the hash table has a mutex,
  lock and unlock it around all accessing of the hash table's
  internals.

  (scm_hash_use_mutex_x): New function.

  (scm_hashtab_prehistory): Use hashtable_mark as hash table mark
  function.

* libguile/hashtab.h (SCM_HASHTABLE_MUTEX, SCM_SET_HASHTABLE_MUTEX,
  scm_hash_use_mutex_x): New declarations.

  (scm_t_hashtable): New mutex field.
---
 libguile/hashtab.c |  130 ++++++++++++++++++++++++++++++++++++++++++++++------
 libguile/hashtab.h |    4 ++
 2 files changed, 119 insertions(+), 15 deletions(-)

diff --git a/libguile/hashtab.c b/libguile/hashtab.c
index 0292012..7621314 100644
--- a/libguile/hashtab.c
+++ b/libguile/hashtab.c
@@ -104,6 +104,7 @@ make_hash_table (int flags, unsigned long k, const char *func_name)
   t->upper = 9 * n / 10;
   t->flags = flags;
   t->hash_fn = NULL;
+  t->mutex = SCM_BOOL_F;
   if (flags)
     {
       SCM_NEWSMOB3 (table, scm_tc16_hashtable, vector, t, weak_hashtables);
@@ -218,6 +219,13 @@ scm_i_rehash (SCM table,
 }
 
 
+static SCM 
+hashtable_mark (SCM table)
+{
+  scm_gc_mark (SCM_HASHTABLE_MUTEX (table));
+  return SCM_CELL_OBJECT_1 (table);
+}
+
 static int
 hashtable_print (SCM exp, SCM port, scm_print_state *pstate SCM_UNUSED)
 {
@@ -440,18 +448,34 @@ scm_hash_fn_get_handle (SCM table, SCM obj, unsigned long (*hash_fn)(), SCM (*as
 #define FUNC_NAME "scm_hash_fn_get_handle"
 {
   unsigned long k;
-  SCM h;
+  SCM h, mutex = SCM_BOOL_F;
 
   if (SCM_HASHTABLE_P (table))
-    table = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      table = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     SCM_VALIDATE_VECTOR (1, table);
   if (SCM_SIMPLE_VECTOR_LENGTH (table) == 0)
-    return SCM_BOOL_F;
+    {
+      h = SCM_BOOL_F;
+      goto end;
+    }
   k = hash_fn (obj, SCM_SIMPLE_VECTOR_LENGTH (table), closure);
   if (k >= SCM_SIMPLE_VECTOR_LENGTH (table))
     scm_out_of_range ("hash_fn_get_handle", scm_from_ulong (k));
   h = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (table, k), closure);
+
+ end:
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
+
   return h;
 }
 #undef FUNC_NAME
@@ -463,10 +487,18 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
 #define FUNC_NAME "scm_hash_fn_create_handle_x"
 {
   unsigned long k;
-  SCM buckets, it;
+  SCM buckets, it, mutex = SCM_BOOL_F;
 
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     {
       SCM_ASSERT (scm_is_simple_vector (table),
@@ -481,7 +513,7 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
     scm_out_of_range ("hash_fn_create_handle_x", scm_from_ulong (k));
   it = assoc_fn (obj, SCM_SIMPLE_VECTOR_REF (buckets, k), closure);
   if (scm_is_pair (it))
-    return it;
+    ;
   else if (scm_is_true (it))
     scm_wrong_type_arg_msg (NULL, 0, it, "a pair");
   else
@@ -515,8 +547,13 @@ scm_hash_fn_create_handle_x (SCM table, SCM obj, SCM init, unsigned long (*hash_
 	      || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
 	    scm_i_rehash (table, hash_fn, closure, FUNC_NAME, NULL);
 	}
-      return SCM_CAR (new_bucket);
+      it = SCM_CAR (new_bucket);
     }
+
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
+
+  return it;
 }
 #undef FUNC_NAME
 
@@ -554,10 +591,18 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
                       void *closure)
 {
   unsigned long k;
-  SCM buckets, h;
+  SCM buckets, h, mutex = SCM_BOOL_F;
 
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     {
       SCM_ASSERT (scm_is_simple_vector (table), table,
@@ -565,7 +610,10 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
       buckets = table;
     }
   if (SCM_SIMPLE_VECTOR_LENGTH (table) == 0)
-    return SCM_EOL;
+    {
+      h = SCM_EOL;
+      goto end;
+    }
 
   k = hash_fn (obj, SCM_SIMPLE_VECTOR_LENGTH (buckets), closure);
   if (k >= SCM_SIMPLE_VECTOR_LENGTH (buckets))
@@ -582,6 +630,9 @@ scm_hash_fn_remove_x (SCM table, SCM obj,
 	    scm_i_rehash (table, hash_fn, closure, "scm_hash_fn_remove_x", NULL);
 	}
     }
+ end:
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
   return h;
 }
 
@@ -592,8 +643,16 @@ SCM_DEFINE (scm_hash_clear_x, "hash-clear!", 1, 0, 0,
 {
   if (SCM_HASHTABLE_P (table))
     {
+      SCM mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
       scm_vector_fill_x (SCM_HASHTABLE_VECTOR (table), SCM_EOL);
       SCM_SET_HASHTABLE_N_ITEMS (table, 0);
+      if (!scm_is_false (mutex))
+	scm_dynwind_end ();
     }
   else
     scm_vector_fill_x (table, SCM_EOL);
@@ -940,10 +999,18 @@ SCM
 scm_internal_hash_fold (SCM (*fn) (), void *closure, SCM init, SCM table)
 {
   long i, n;
-  SCM buckets, result = init;
+  SCM buckets, result = init, mutex = SCM_BOOL_F;
   
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     buckets = table;
   
@@ -963,6 +1030,9 @@ scm_internal_hash_fold (SCM (*fn) (), void *closure, SCM init, SCM table)
 	}
     }
 
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
+
   return result;
 }
 
@@ -978,10 +1048,18 @@ void
 scm_internal_hash_for_each_handle (SCM (*fn) (), void *closure, SCM table)
 {
   long i, n;
-  SCM buckets;
+  SCM buckets, mutex = SCM_BOOL_F;
   
   if (SCM_HASHTABLE_P (table))
-    buckets = SCM_HASHTABLE_VECTOR (table);
+    {
+      mutex = SCM_HASHTABLE_MUTEX (table);
+      if (!scm_is_false (mutex))
+	{
+	  scm_dynwind_begin (SCM_F_DYNWIND_REWINDABLE);
+	  scm_dynwind_lock_mutex (mutex);
+	}
+      buckets = SCM_HASHTABLE_VECTOR (table);
+    }
   else
     buckets = table;
   
@@ -1000,6 +1078,9 @@ scm_internal_hash_for_each_handle (SCM (*fn) (), void *closure, SCM table)
 	  ls = SCM_CDR (ls);
 	}
     }
+
+  if (!scm_is_false (mutex))
+    scm_dynwind_end ();
 }
 
 SCM_DEFINE (scm_hash_fold, "hash-fold", 3, 0, 0, 
@@ -1088,6 +1169,25 @@ SCM_DEFINE (scm_hash_map_to_list, "hash-map->list", 2, 0, 0,
 }
 #undef FUNC_NAME
 
+SCM_DEFINE (scm_hash_use_mutex_x, "hash-use-mutex!", 2, 0, 0, 
+            (SCM table, SCM mutex),
+	    "Use @var{mutex} to serialize operations on @var{table} from multiple threads.")
+#define FUNC_NAME s_scm_hash_use_mutex_x
+{
+  /* Must be a real (i.e. not a vector) and non-weak hash table. */
+  SCM_VALIDATE_HASHTABLE (1, table);
+  if (SCM_HASHTABLE_WEAK_P (table))
+    SCM_MISC_ERROR ("can't use mutex with a weak hash table", SCM_EOL);
+    
+  if (!scm_is_false (mutex))
+    SCM_VALIDATE_MUTEX (2, mutex);
+
+  SCM_SET_HASHTABLE_MUTEX (table, mutex);
+
+  return SCM_UNSPECIFIED;
+}
+#undef FUNC_NAME
+
 \f
 
 
@@ -1095,7 +1195,7 @@ void
 scm_hashtab_prehistory ()
 {
   scm_tc16_hashtable = scm_make_smob_type (s_hashtable, 0);
-  scm_set_smob_mark (scm_tc16_hashtable, scm_markcdr);
+  scm_set_smob_mark (scm_tc16_hashtable, hashtable_mark);
   scm_set_smob_print (scm_tc16_hashtable, hashtable_print);
   scm_set_smob_free (scm_tc16_hashtable, hashtable_free);
   scm_c_hook_add (&scm_after_gc_c_hook, rehash_after_gc, 0, 0);
diff --git a/libguile/hashtab.h b/libguile/hashtab.h
index 3eca04c..79cb30c 100644
--- a/libguile/hashtab.h
+++ b/libguile/hashtab.h
@@ -58,6 +58,8 @@ SCM_API scm_t_bits scm_tc16_hashtable;
 #define SCM_HASHTABLE_DECREMENT(x) (SCM_HASHTABLE_N_ITEMS(x)--)
 #define SCM_HASHTABLE_UPPER(x)     (SCM_HASHTABLE (x)->upper)
 #define SCM_HASHTABLE_LOWER(x)     (SCM_HASHTABLE (x)->lower)
+#define SCM_HASHTABLE_MUTEX(x)     (SCM_HASHTABLE (x)->mutex)
+#define SCM_SET_HASHTABLE_MUTEX(x, m)     (SCM_HASHTABLE (x)->mutex = m)
 
 #define SCM_HASHTABLE_N_BUCKETS(h) \
  SCM_SIMPLE_VECTOR_LENGTH (SCM_HASHTABLE_VECTOR (h))
@@ -74,6 +76,7 @@ typedef struct scm_t_hashtable {
   int size_index;		/* index into hashtable_size */
   int min_size_index;		/* minimum size_index */
   unsigned long (*hash_fn) ();  /* for rehashing after a GC. */
+  SCM mutex;			/* mutex for thread safety */
 } scm_t_hashtable;
 
 \f
@@ -133,6 +136,7 @@ SCM_API SCM scm_hash_fold (SCM proc, SCM init, SCM hash);
 SCM_API SCM scm_hash_for_each (SCM proc, SCM hash);
 SCM_API SCM scm_hash_for_each_handle (SCM proc, SCM hash);
 SCM_API SCM scm_hash_map_to_list (SCM proc, SCM hash);
+SCM_API SCM scm_hash_use_mutex_x (SCM hash, SCM mutex);
 SCM_API void scm_hashtab_prehistory (void);
 SCM_API void scm_init_hashtab (void);
 
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-25 23:19             ` Neil Jerram
@ 2009-03-26  3:40               ` Linas Vepstas
  2009-03-26  8:02                 ` Neil Jerram
  2009-03-26  9:10               ` Ludovic Courtès
  2009-03-26 22:51               ` Neil Jerram
  2 siblings, 1 reply; 51+ messages in thread
From: Linas Vepstas @ 2009-03-26  3:40 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Clinton Ebadi, Guile Development

2009/3/25 Neil Jerram <neil@ossau.uklinux.net>:

> #2 makes the symbols hash thread-safe, and it appears that this
> completely fixes the define-race problem.

I reviewed patch2 as best I could, but I'm cross-eyed
cause its late at night, and I don't know guile internals
well.  So I'm not sure my review means much.  However,
isn't the following a mem leak:

+static SCM
+intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)

+  /* Allocate new cell and bucket before locking the mutex. */
+  new_bucket = scm_acons (symbol, SCM_UNDEFINED, SCM_BOOL_F);

+  for (l = slot; !scm_is_null (l); l = SCM_CDR (l))

...
+
+	  scm_i_pthread_mutex_unlock (&symbols_mutex);
+	  return sym;

it looks to me like new_bucket is never released.
I assume that the scm_acons did a malloc, but
maybe not

I'm to tired to say anything meaningful about the rest ...

--linas

p.s. many thanks for chasing this stuff down and fixing it !
It is all a very pleasant surprise!




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-26  3:40               ` Linas Vepstas
@ 2009-03-26  8:02                 ` Neil Jerram
  2009-03-26 18:39                   ` Linas Vepstas
  0 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-26  8:02 UTC (permalink / raw)
  To: linasvepstas; +Cc: Andy Wingo, Clinton Ebadi, Guile Development

Linas Vepstas <linasvepstas@gmail.com> writes:

> I reviewed patch2 as best I could,

Thanks!

> it looks to me like new_bucket is never released.
> I assume that the scm_acons did a malloc, but
> maybe not

Yes, but the garbage collection will find and free it, so it's OK.

Regards,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-25 23:19             ` Neil Jerram
  2009-03-26  3:40               ` Linas Vepstas
@ 2009-03-26  9:10               ` Ludovic Courtès
  2009-03-26 22:01                 ` Neil Jerram
  2009-03-26 22:51               ` Neil Jerram
  2 siblings, 1 reply; 51+ messages in thread
From: Ludovic Courtès @ 2009-03-26  9:10 UTC (permalink / raw)
  To: guile-devel

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> - it runs for a duration specified by environment variable
>   GUILE_TEST_DEFINE_RACE_DURATION
>
> - it defaults to just 10s if that variable isn't defined.

Even 10s is going to be slightly annoying.  Another option would be to
create tens of guile-definer threads instead of just 4; it might trigger
the problem earlier.

> #2 makes the symbols hash thread-safe, and it appears that this
> completely fixes the define-race problem.  I don't understand why we
> apparently don't need patch #3 as well - but that's what my results
> indicate.

Roughly #2 is a specialized version of #3 as it addresses only the case
of the symbol hash table.  So it should be enough for this very problem.

#3 is probably the way to go in the longer run, but it changes the
API/ABI so it can't go in 1.8.

I'll skip #3 for now.

> * test-suite/standalone/test-define-race.c: New test.

Could you run `indent' on this file?

> +TESTS += test-define-race

It needs to be conditionalized on `--with-threads'.

>  scm_i_rehash (SCM table,
>  	      unsigned long (*hash_fn)(),
>  	      void *closure,
> -	      const char* func_name)
> +	      const char* func_name,
> +	      scm_i_pthread_mutex_t *mutex)

This function assumes that MUTEX is locked.  I would either write it
prominently in a comment above or change the semantics so that MUTEX is
acquired in `scm_i_rehash ()', not in the caller.

The latter might be achieved by integrating tests about whether TABLE
needs to be rehashed into `scm_i_rehash ()' itself.  I.e., this:

  if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)
      || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
    scm_i_rehash (...);

would become:

  scm_i_rehash (...);

I haven't reviewed all uses to see whether it's actually possible and
whether it would lead to race conditions, though.

> -static SCM symbols;
> +SCM scm_i_symbols;

I don't think this is needed, is it?

> +static SCM
> +intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)

Since all users of `intern_symbol ()' perform a
`lookup_interned_symbol ()' right before, I'd rather change users so
that they lock before `lookup' and unlock after `intern':

  lock (&symbols_mutex);

  sym = lookup_interned_symbol (name);
  if (scm_is_false (sym))
    {
      sym = scm_i_c_make_symbol (...);
      intern_symbol (sym);
    }

  unlock (&symbols_mutex);

Is it doable?

Thanks,
Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-26  8:02                 ` Neil Jerram
@ 2009-03-26 18:39                   ` Linas Vepstas
  0 siblings, 0 replies; 51+ messages in thread
From: Linas Vepstas @ 2009-03-26 18:39 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Andy Wingo, Clinton Ebadi, Guile Development

2009/3/26 Neil Jerram <neil@ossau.uklinux.net>:
> Linas Vepstas <linasvepstas@gmail.com> writes:
>
>> I reviewed patch2 as best I could,
>
> Thanks!
>
>> it looks to me like new_bucket is never released.
>> I assume that the scm_acons did a malloc, but
>> maybe not
>
> Yes, but the garbage collection will find and free it, so it's OK.

Duhh. Clearly, my brain inhabits the non-GC world.

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-26  9:10               ` Ludovic Courtès
@ 2009-03-26 22:01                 ` Neil Jerram
  2009-03-26 23:12                   ` Ludovic Courtès
  0 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-26 22:01 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Hi Neil,

Hi Ludovic,

Many thanks for your detailed comments!

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> - it runs for a duration specified by environment variable
>>   GUILE_TEST_DEFINE_RACE_DURATION
>>
>> - it defaults to just 10s if that variable isn't defined.
>
> Even 10s is going to be slightly annoying.  Another option would be to
> create tens of guile-definer threads instead of just 4; it might trigger
> the problem earlier.

That's an interesting idea, but to be honest I'd prefer to finish this
work and move onto something else.  Would it be OK just to reduce the
default runtime to 2 seconds?  (It wouldn't make any practical
difference, because 10 seconds isn't normally enough to see a define
race anyway.  It'll just be enough to catch any obvious bitrotting,
and people who want to run a real test can use
GUILE_TEST_DEFINE_RACE_DURATION.)

>> #2 makes the symbols hash thread-safe, and it appears that this
>> completely fixes the define-race problem.  I don't understand why we
>> apparently don't need patch #3 as well - but that's what my results
>> indicate.
>
> Roughly #2 is a specialized version of #3 as it addresses only the case
> of the symbol hash table.  So it should be enough for this very problem.

Actually #2 and #3 are a bit different (and probably I should have
explained more about this in my previous email).

The key points about #2 are that it uses a straight pthreads mutex to
protect the symbol hash, and that the symbol hash is a weak hash
table.  Using a pthreads mutex means that we have to take care not to
do any allocations (or anything else that could block) while the mutex
is locked.  The weakness means that as well as the obvious lookup and
intern operations, we have to allow for the hash table being accessed
and resized from an after-GC hook.

#3, on the other hand, uses a fat mutex, and can be applied to any
non-weak hash table.  Using a fat mutex means that it's fine to
allocate or block while the mutex is locked, and means that the code
can use scm_dynwind_lock_mutex, which is quite neat.  On the other
hand, it would cause a problem if the hash table concerned could be
accessed during GC, because

- if the GC didn't try to lock the fat mutex itself, it would be
  modifying the hash table under the feet of the thread that has the
  mutex locked

- if the GC does try to lock the fat mutex, it won't be able to and so
  will block... deadlock!

Hence the restriction of #3 to non-weak hash tables.

> #3 is probably the way to go in the longer run, but it changes the
> API/ABI so it can't go in 1.8.

If #3 was needed, I wouldn't see any problem with its API/ABI changes.
The API change is addition of one new primitive, and the ABI change is
to a structure that isn't intended for user code to access.  Am I
missing something?

On the other hand, if it isn't needed - as appears to be true -
there's no case for adding it to 1.8.

>> * test-suite/standalone/test-define-race.c: New test.
>
> Could you run `indent' on this file?

Sure, will do.

>> +TESTS += test-define-race
>
> It needs to be conditionalized on `--with-threads'.

It's already inside an "if BUILD_PTHREAD_SUPPORT".  Do I need to do
more than that?

>>  scm_i_rehash (SCM table,
>>  	      unsigned long (*hash_fn)(),
>>  	      void *closure,
>> -	      const char* func_name)
>> +	      const char* func_name,
>> +	      scm_i_pthread_mutex_t *mutex)
>
> This function assumes that MUTEX is locked.  I would either write it
> prominently in a comment above or change the semantics so that MUTEX is
> acquired in `scm_i_rehash ()', not in the caller.

I think I prefer the prominent comment.  My first thought was for
scm_i_rehash () to acquire the mutex.  I changed to the current code
because then all of the lock..unlock pairs are in symbols.c - which
seemed more consistent to me.

> The latter might be achieved by integrating tests about whether TABLE
> needs to be rehashed into `scm_i_rehash ()' itself.  I.e., this:
>
>   if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)
>       || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
>     scm_i_rehash (...);
>
> would become:
>
>   scm_i_rehash (...);
>
> I haven't reviewed all uses to see whether it's actually possible and
> whether it would lead to race conditions, though.

I think that could work, but it would add another lock/unlock to the
mainline.  (We need to hold the lock even just to evaluate the
rehashing condition.)  The change doesn't feel compelling enough to me
to justify that.

>> -static SCM symbols;
>> +SCM scm_i_symbols;
>
> I don't think this is needed, is it?

Yes, so that rehash_after_gc () (in hashtab.c) can identify the symbol
hash and call scm_i_rehash_symbols_after_gc () for it.

>> +static SCM
>> +intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)
>
> Since all users of `intern_symbol ()' perform a
> `lookup_interned_symbol ()' right before, I'd rather change users so
> that they lock before `lookup' and unlock after `intern':
>
>   lock (&symbols_mutex);
>
>   sym = lookup_interned_symbol (name);
>   if (scm_is_false (sym))
>     {
>       sym = scm_i_c_make_symbol (...);
>       intern_symbol (sym);
>     }
>
>   unlock (&symbols_mutex);
>
> Is it doable?

The problem with this is allocation (scm_i_c_make_symbol) while
holding the mutex.  The overall arrangement of the symbol code is

  lookup - allocate - intern

and I assume that is so as to avoid allocation in the case where the
symbol is already interned; otherwise the arrangement could be

  allocate - lookup/intern

with the lookup/intern being implemented as a single operation.

I've made my changes so as to leave this overall pattern the same.
The code certainly could be simpler if we always allocated upfront,
but I think it is better (for performance) not to do that.

On the other hand, I suspect the code could still be simplified to
remove the duplication between lookup_interned_symbol and
intern_symbol.  I'll have a look at that.

Thanks again for the review!

Regards,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-25 23:19             ` Neil Jerram
  2009-03-26  3:40               ` Linas Vepstas
  2009-03-26  9:10               ` Ludovic Courtès
@ 2009-03-26 22:51               ` Neil Jerram
  2009-03-27  3:15                 ` Linas Vepstas
  2 siblings, 1 reply; 51+ messages in thread
From: Neil Jerram @ 2009-03-26 22:51 UTC (permalink / raw)
  To: Linas Vepstas, Guile Development

Neil Jerram <neil@ossau.uklinux.net> writes:

> #2 makes the symbols hash thread-safe, and it appears that this
> completely fixes the define-race problem.  I don't understand why we
> apparently don't need patch #3 as well - but that's what my results
> indicate.

Maybe it's because the fragments of code passed to scm_c_eval_string
in the test-define-race test are very simple - just

 (define x1-100 100)

and

 x1-100

- and that scm_c_eval_string consists of reading (=> symbol lookup)
followed by evaluation (=> module obarray lookup), and that each
scm_c_eval_string call is fast enough to be completed well within one
time slice.  Then the mutex locking at the start of each symbol lookup
would be enough to make sure that scm_c_eval_string calls on different
threads never overlap with each other.

Does that sound feasible?

(If so, a test like test-define-race, with patch #2 in place, should
still show errors on a multi-core system, or if the code that it
evaluates was made more complex; and use of patch #3 should then fix
those errors.)

Regards,
        Neil




^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-26 22:01                 ` Neil Jerram
@ 2009-03-26 23:12                   ` Ludovic Courtès
  0 siblings, 0 replies; 51+ messages in thread
From: Ludovic Courtès @ 2009-03-26 23:12 UTC (permalink / raw)
  To: guile-devel

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> That's an interesting idea, but to be honest I'd prefer to finish this
> work and move onto something else.  Would it be OK just to reduce the
> default runtime to 2 seconds?  (It wouldn't make any practical
> difference, because 10 seconds isn't normally enough to see a define
> race anyway.  It'll just be enough to catch any obvious bitrotting,
> and people who want to run a real test can use
> GUILE_TEST_DEFINE_RACE_DURATION.)

Yes, that'd be OK.

> The key points about #2 are that it uses a straight pthreads mutex to
> protect the symbol hash, and that the symbol hash is a weak hash
> table.  Using a pthreads mutex means that we have to take care not to
> do any allocations (or anything else that could block) while the mutex
> is locked.  The weakness means that as well as the obvious lookup and
> intern operations, we have to allow for the hash table being accessed
> and resized from an after-GC hook.

OK, got it. (A limitation that vanishes with BDW-GC.)

> #3, on the other hand, uses a fat mutex, and can be applied to any
> non-weak hash table.  Using a fat mutex means that it's fine to
> allocate or block while the mutex is locked, and means that the code
> can use scm_dynwind_lock_mutex, which is quite neat.  On the other
> hand, it would cause a problem if the hash table concerned could be
> accessed during GC, because
>
> - if the GC didn't try to lock the fat mutex itself, it would be
>   modifying the hash table under the feet of the thread that has the
>   mutex locked
>
> - if the GC does try to lock the fat mutex, it won't be able to and so
>   will block... deadlock!
>
> Hence the restriction of #3 to non-weak hash tables.

OK.

> If #3 was needed, I wouldn't see any problem with its API/ABI changes.
> The API change is addition of one new primitive, and the ABI change is
> to a structure that isn't intended for user code to access.  Am I
> missing something?

I was concerned about the `scm_t_hashtable' change, but it doesn't
appear to be accessed by public inlines/macros, so that should be OK.

> On the other hand, if it isn't needed - as appears to be true -
> there's no case for adding it to 1.8.

Agreed.

>> It needs to be conditionalized on `--with-threads'.
>
> It's already inside an "if BUILD_PTHREAD_SUPPORT".  Do I need to do
> more than that?

No, I had overlooked this.

>>>  scm_i_rehash (SCM table,
>>>  	      unsigned long (*hash_fn)(),
>>>  	      void *closure,
>>> -	      const char* func_name)
>>> +	      const char* func_name,
>>> +	      scm_i_pthread_mutex_t *mutex)
>>
>> This function assumes that MUTEX is locked.  I would either write it
>> prominently in a comment above or change the semantics so that MUTEX is
>> acquired in `scm_i_rehash ()', not in the caller.
>
> I think I prefer the prominent comment.  My first thought was for
> scm_i_rehash () to acquire the mutex.  I changed to the current code
> because then all of the lock..unlock pairs are in symbols.c - which
> seemed more consistent to me.

Fair enough.

>> The latter might be achieved by integrating tests about whether TABLE
>> needs to be rehashed into `scm_i_rehash ()' itself.  I.e., this:
>>
>>   if (SCM_HASHTABLE_N_ITEMS (table) < SCM_HASHTABLE_LOWER (table)
>>       || SCM_HASHTABLE_N_ITEMS (table) > SCM_HASHTABLE_UPPER (table))
>>     scm_i_rehash (...);
>>
>> would become:
>>
>>   scm_i_rehash (...);
>>
>> I haven't reviewed all uses to see whether it's actually possible and
>> whether it would lead to race conditions, though.
>
> I think that could work, but it would add another lock/unlock to the
> mainline.  (We need to hold the lock even just to evaluate the
> rehashing condition.)  The change doesn't feel compelling enough to me
> to justify that.

Contention-free locks are "cheap" in that there's no syscall involved
(when using futexes); OTOH there's at least one function call, which
we'd rather avoid.

>>> -static SCM symbols;
>>> +SCM scm_i_symbols;
>>
>> I don't think this is needed, is it?
>
> Yes, so that rehash_after_gc () (in hashtab.c) can identify the symbol
> hash and call scm_i_rehash_symbols_after_gc () for it.

Oops, right.

>>> +static SCM
>>> +intern_symbol (SCM symbol, const char *name, size_t len, size_t raw_hash)
>>
>> Since all users of `intern_symbol ()' perform a
>> `lookup_interned_symbol ()' right before, I'd rather change users so
>> that they lock before `lookup' and unlock after `intern':
>>
>>   lock (&symbols_mutex);
>>
>>   sym = lookup_interned_symbol (name);
>>   if (scm_is_false (sym))
>>     {
>>       sym = scm_i_c_make_symbol (...);
>>       intern_symbol (sym);
>>     }
>>
>>   unlock (&symbols_mutex);
>>
>> Is it doable?
>
> The problem with this is allocation (scm_i_c_make_symbol) while
> holding the mutex.  The overall arrangement of the symbol code is
>
>   lookup - allocate - intern
>
> and I assume that is so as to avoid allocation in the case where the
> symbol is already interned; otherwise the arrangement could be
>
>   allocate - lookup/intern
>
> with the lookup/intern being implemented as a single operation.
>
> I've made my changes so as to leave this overall pattern the same.
> The code certainly could be simpler if we always allocated upfront,
> but I think it is better (for performance) not to do that.

Yes, you're right.  I hadn't grasped all the implications here.

One nitpick: `intern_symbol ()' shouldn't need NAME, LEN and RAW_HASH as
it can get them "for free" from SYMBOL.

> On the other hand, I suspect the code could still be simplified to
> remove the duplication between lookup_interned_symbol and
> intern_symbol.  I'll have a look at that.

Possible.

Thanks for looking into this!

Ludo'.





^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Locks and threads
  2009-03-26 22:51               ` Neil Jerram
@ 2009-03-27  3:15                 ` Linas Vepstas
  0 siblings, 0 replies; 51+ messages in thread
From: Linas Vepstas @ 2009-03-27  3:15 UTC (permalink / raw)
  To: Neil Jerram; +Cc: Guile Development

2009/3/26 Neil Jerram <neil@ossau.uklinux.net>:
>
> - and that scm_c_eval_string consists of reading (=> symbol lookup)
> followed by evaluation (=> module obarray lookup), and that each
> scm_c_eval_string call is fast enough to be completed well within one
> time slice.

One way to stress test such situations is to call
pthread_yield right after the unlock. i.e. temporarily
change scm_i_pthread_mutex_unlock to also
called yield. This will trigger a thread switch even
if the time-slice is not expired, and can often
expose these kinds of bugs.

--linas




^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2009-03-27  3:15 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-11 22:31 Locks and threads Neil Jerram
2009-02-11 23:05 ` Neil Jerram
2009-02-11 23:32   ` Ludovic Courtès
2009-02-11 23:30 ` Linas Vepstas
2009-02-11 23:53   ` Neil Jerram
2009-02-12  0:18     ` Linas Vepstas
2009-02-12 20:51     ` Ludovic Courtès
2009-02-11 23:30 ` Ludovic Courtès
2009-02-12 12:55 ` Greg Troxel
2009-02-12 18:00   ` Ken Raeburn
2009-02-12 21:14     ` Ludovic Courtès
2009-02-14  1:25       ` Ken Raeburn
2009-02-14 16:09         ` Ludovic Courtès
2009-03-05 20:41   ` Neil Jerram
2009-03-04 23:49 ` Neil Jerram
2009-03-05  3:54   ` Linas Vepstas
2009-03-05 19:46     ` Neil Jerram
2009-03-05 20:05       ` Neil Jerram
2009-03-05 20:40         ` Linas Vepstas
2009-03-05 20:49           ` Neil Jerram
2009-03-05 20:57         ` Linas Vepstas
2009-03-05 21:25           ` Neil Jerram
2009-03-05 21:56             ` Linas Vepstas
2009-03-06 11:01               ` Andy Wingo
2009-03-06 12:36                 ` Linas Vepstas
2009-03-06 22:05                   ` Ludovic Courtès
2009-03-08 22:04               ` Neil Jerram
2009-03-25 19:00         ` Neil Jerram
2009-03-25 22:08           ` Ludovic Courtès
2009-03-05 21:35   ` Ludovic Courtès
2009-03-10 23:57   ` Neil Jerram
2009-03-12  0:07     ` Neil Jerram
2009-03-12  0:53       ` Neil Jerram
2009-03-12  1:29         ` Linas Vepstas
2009-03-12  3:09           ` Clinton Ebadi
2009-03-25 22:13           ` Neil Jerram
2009-03-25 22:34             ` Linas Vepstas
2009-03-12 22:13         ` Andy Wingo
2009-03-13 19:13           ` Neil Jerram
2009-03-25 23:19             ` Neil Jerram
2009-03-26  3:40               ` Linas Vepstas
2009-03-26  8:02                 ` Neil Jerram
2009-03-26 18:39                   ` Linas Vepstas
2009-03-26  9:10               ` Ludovic Courtès
2009-03-26 22:01                 ` Neil Jerram
2009-03-26 23:12                   ` Ludovic Courtès
2009-03-26 22:51               ` Neil Jerram
2009-03-27  3:15                 ` Linas Vepstas
2009-03-14 14:23           ` Ludovic Courtès
2009-03-16 22:57             ` Andy Wingo
2009-03-25 18:57     ` Neil Jerram

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).