unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* Valgrind fix
@ 2010-03-07 22:53 No Itisnt
  0 siblings, 0 replies; 7+ messages in thread
From: No Itisnt @ 2010-03-07 22:53 UTC (permalink / raw)
  To: guile-devel


[-- Attachment #1.1: Type: text/plain, Size: 566 bytes --]

Here is patch that allows valgrind to run Guile under Linux non-IA64
architectures. It also fixes a potential leak in threads.c where a pthread
attribute was not destroyed.

Note that libgc still generates lots of warnings. For now, I've just been
suppressing all warnings from libgc, and I'm happy to report that Guile
doesn't seem to cause any issues on my system (Linux/AMD64). But it's
certainly possible that the suppressions could mask a legitimate issue.

Attached are the patch and suppressions file (which also suppresses a false
positive in libunistring).

[-- Attachment #1.2: Type: text/html, Size: 605 bytes --]

[-- Attachment #2: guile.supp --]
[-- Type: application/octet-stream, Size: 212 bytes --]

{
  <gc1>
  Memcheck:Cond
  fun:GC_*
}
{
  <gc2>  
  Memcheck:Value8
  fun:GC_*
}
# False positive: freea knowingly accesses uninitialized memory
{
  libunistring_freea
  Memcheck:Cond
  fun:libunistring_freea
}

[-- Attachment #3: valgrind.patch --]
[-- Type: application/octet-stream, Size: 2023 bytes --]

diff --git a/libguile/gc.c b/libguile/gc.c
index fc405f3..b531b85 100644
--- a/libguile/gc.c
+++ b/libguile/gc.c
@@ -617,6 +617,28 @@ scm_getenv_int (const char *var, int def)
 void
 scm_storage_prehistory ()
 {
+  /* libgc seems to obtain the stack address from the kernel on
+  linux/freebsd. this gives it valgrind's stack address, which it
+  then scans, causing a segmentation fault. this sets GC_stackbottom
+  to the address that valgrind wants the program to use */
+#if HAVE_PTHREAD_ATTR_GETSTACK && HAVE_PTHREAD_GETATTR_NP
+  size_t size;
+  void* sstart;
+  pthread_attr_t attr;
+  pthread_getattr_np(pthread_self(), &attr);
+  pthread_attr_getstack(&attr, &sstart, &size);
+  pthread_attr_destroy(&attr);
+  if(sstart) {
+    GC_stackbottom = (char*) sstart + size;
+  } else {
+    int dummy;
+    size_t stack_bottom = (size_t) & dummy;
+    stack_bottom += 4095;
+    stack_bottom &= ~4095;
+    GC_stackbottom = (char*) stack_bottom;
+  }
+#endif
+
   GC_all_interior_pointers = 0;
   GC_set_free_space_divisor (scm_getenv_int ("GC_FREE_SPACE_DIVISOR", 3));
 
@@ -813,7 +835,6 @@ scm_i_tag_name (scm_t_bits tag)
 
 
 
-\f
 void
 scm_init_gc ()
 {
diff --git a/libguile/threads.c b/libguile/threads.c
index 7c377d7..e052f47 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -632,6 +632,7 @@ scm_i_init_thread_for_guile (SCM_STACKITEM *base, SCM parent)
 static SCM_STACKITEM *
 get_thread_stack_base ()
 {
+  SCM_STACKITEM* ret;
   pthread_attr_t attr;
   void *start, *end;
   size_t size;
@@ -647,16 +648,18 @@ get_thread_stack_base ()
 
 #ifndef PTHREAD_ATTR_GETSTACK_WORKS
   if ((void *)&attr < start || (void *)&attr >= end)
-    return (SCM_STACKITEM *) GC_stackbottom;
+    ret = (SCM_STACKITEM*) GC_stackbottom;
   else
 #endif
     {
 #if SCM_STACK_GROWS_UP
-      return start;
+      ret = (SCM_STACKITEM*) start;
 #else
-      return end;
+      ret = (SCM_STACKITEM*) end;
 #endif
     }
+  pthread_attr_destroy(&attr);
+  return ret;
 }
 
 #elif defined HAVE_PTHREAD_GET_STACKADDR_NP

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Valgrind fix
@ 2010-03-08  0:04 No Itisnt
  2010-03-08 18:49 ` No Itisnt
  2010-03-08 21:16 ` Ludovic Courtès
  0 siblings, 2 replies; 7+ messages in thread
From: No Itisnt @ 2010-03-08  0:04 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 560 bytes --]

Here is patch that allows valgrind to run Guile under Linux non-IA64
systems. It also fixes a potential leak in threads.c where a
pthread attribute was not destroyed.

Note that libgc still generates lots of warnings. For now, I've just
been suppressing all warnings from libgc, and I'm happy to report that
Guile doesn't seem to cause any issues on my system (Linux/AMD64). But
it's certainly possible that the suppressions could mask a legitimate
issue.

Attached are the patch and suppressions file (which also suppresses a
false positive in libunistring).

[-- Attachment #2: guile.supp --]
[-- Type: application/octet-stream, Size: 212 bytes --]

{
  <gc1>
  Memcheck:Cond
  fun:GC_*
}
{
  <gc2>  
  Memcheck:Value8
  fun:GC_*
}
# False positive: freea knowingly accesses uninitialized memory
{
  libunistring_freea
  Memcheck:Cond
  fun:libunistring_freea
}

[-- Attachment #3: valgrind.patch --]
[-- Type: application/octet-stream, Size: 2023 bytes --]

diff --git a/libguile/gc.c b/libguile/gc.c
index fc405f3..b531b85 100644
--- a/libguile/gc.c
+++ b/libguile/gc.c
@@ -617,6 +617,28 @@ scm_getenv_int (const char *var, int def)
 void
 scm_storage_prehistory ()
 {
+  /* libgc seems to obtain the stack address from the kernel on
+  linux/freebsd. this gives it valgrind's stack address, which it
+  then scans, causing a segmentation fault. this sets GC_stackbottom
+  to the address that valgrind wants the program to use */
+#if HAVE_PTHREAD_ATTR_GETSTACK && HAVE_PTHREAD_GETATTR_NP
+  size_t size;
+  void* sstart;
+  pthread_attr_t attr;
+  pthread_getattr_np(pthread_self(), &attr);
+  pthread_attr_getstack(&attr, &sstart, &size);
+  pthread_attr_destroy(&attr);
+  if(sstart) {
+    GC_stackbottom = (char*) sstart + size;
+  } else {
+    int dummy;
+    size_t stack_bottom = (size_t) & dummy;
+    stack_bottom += 4095;
+    stack_bottom &= ~4095;
+    GC_stackbottom = (char*) stack_bottom;
+  }
+#endif
+
   GC_all_interior_pointers = 0;
   GC_set_free_space_divisor (scm_getenv_int ("GC_FREE_SPACE_DIVISOR", 3));
 
@@ -813,7 +835,6 @@ scm_i_tag_name (scm_t_bits tag)
 
 
 
-\f
 void
 scm_init_gc ()
 {
diff --git a/libguile/threads.c b/libguile/threads.c
index 7c377d7..e052f47 100644
--- a/libguile/threads.c
+++ b/libguile/threads.c
@@ -632,6 +632,7 @@ scm_i_init_thread_for_guile (SCM_STACKITEM *base, SCM parent)
 static SCM_STACKITEM *
 get_thread_stack_base ()
 {
+  SCM_STACKITEM* ret;
   pthread_attr_t attr;
   void *start, *end;
   size_t size;
@@ -647,16 +648,18 @@ get_thread_stack_base ()
 
 #ifndef PTHREAD_ATTR_GETSTACK_WORKS
   if ((void *)&attr < start || (void *)&attr >= end)
-    return (SCM_STACKITEM *) GC_stackbottom;
+    ret = (SCM_STACKITEM*) GC_stackbottom;
   else
 #endif
     {
 #if SCM_STACK_GROWS_UP
-      return start;
+      ret = (SCM_STACKITEM*) start;
 #else
-      return end;
+      ret = (SCM_STACKITEM*) end;
 #endif
     }
+  pthread_attr_destroy(&attr);
+  return ret;
 }
 
 #elif defined HAVE_PTHREAD_GET_STACKADDR_NP

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* RE: Valgrind fix
  2010-03-08  0:04 Valgrind fix No Itisnt
@ 2010-03-08 18:49 ` No Itisnt
  2010-03-08 21:21   ` Ludovic Courtès
  2010-03-08 21:16 ` Ludovic Courtès
  1 sibling, 1 reply; 7+ messages in thread
From: No Itisnt @ 2010-03-08 18:49 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

Here's a program that demonstrates the actual issue at hand.
Basically, libgc finds the stack address from either glibc or the
kernel, which is not what it wants because valgrind messes with the
address space of the host program. This is only on Linux, but I
understand the issue is similar on other BSD systems.

This is the Mono file that contains some workarounds (specifically
mono/metadata/boehm-gc.c:mono_gc_base_init) . Contrary to the comments
in the file the issue does not appear to be thread-related, although
it is solved using pthread functionality.

http://anonsvn.mono-project.com/viewvc/trunk/mono/mono/metadata/boehm-gc.c?view=log

The attached program grabs the stack address from the kernel, glibc,
and by taking the address of a stack object, then prints the
difference. When run normally, the differences are minimal, but when
run under valgrind, there is a giant difference between the base of
the actual program and the base given by the kernel.

Ideally, this would be patched in libgc, but we'd need a way to detect
whether the program is running under valgrind and the current
workaround would only work for threaded versions. I'm going to take a
crack at libgc CVS later on.

[-- Attachment #2: demo.c --]
[-- Type: text/x-csrc, Size: 1168 bytes --]

/* demo valgrind program: cc -o demo demo.c */

#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define STAT_BUFFER_SIZE 4096
#define STAT_SKIP 27

extern ptrdiff_t __libc_stack_end;

int main(void) {
  int dummy;
  
  /* lifted from os_dep.c in libgc */
  char stat_buf[STAT_BUFFER_SIZE];
  int f = open("/proc/self/stat", O_RDONLY);
  if(f < 0 || read(f, stat_buf, STAT_BUFFER_SIZE) < 2 * STAT_SKIP) {
    perror("/proc/self/stat");
    return -1;
  }
  
  size_t buf_offset = 0;
  char c = stat_buf[buf_offset++];
  size_t i;
  for(i = 0; i < STAT_SKIP; ++i) {
    while(isspace(c)) c = stat_buf[buf_offset++];
    while(!isspace(c)) c = stat_buf[buf_offset++];
  }
  while(isspace(c)) c = stat_buf[buf_offset++];
  
  ptrdiff_t result = 0;
  while(isdigit(c)) {
    result *= 10;
    result += c - '0';
    c = stat_buf[buf_offset++];
  }
  close(f);
  
  printf("kernel stack base: %p\nlibc stack base: %p\ndumb stack base: %p\ndifference between kernel and dumb: %zd bytes\n",
         (void*) result, (void*) __libc_stack_end, (void*) &dummy, (void*) (result - (ptrdiff_t) &dummy));
}

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Valgrind fix
  2010-03-08  0:04 Valgrind fix No Itisnt
  2010-03-08 18:49 ` No Itisnt
@ 2010-03-08 21:16 ` Ludovic Courtès
  1 sibling, 0 replies; 7+ messages in thread
From: Ludovic Courtès @ 2010-03-08 21:16 UTC (permalink / raw)
  To: guile-devel

Hi,

No Itisnt <theseaisinhere@gmail.com> writes:

> Here is patch that allows valgrind to run Guile under Linux non-IA64
> systems. It also fixes a potential leak in threads.c where a
> pthread attribute was not destroyed.

Excellent, thanks!  (For the record, the inability to use Valgrind with
libgc has been bothering us for too long.)

The patch looks good to me.  I have one question:

+  /* libgc seems to obtain the stack address from the kernel on
+  linux/freebsd. this gives it valgrind's stack address, which it
+  then scans, causing a segmentation fault. this sets GC_stackbottom
+  to the address that valgrind wants the program to use */

Did you check that hypothesis?  It’d inspire me more confidence if we
could remove the “seems to” above.  :-)

Besides, would you be willing to assign the copyright of your patch to
the Free Software Foundation, so that we could install it?  If that’s
fine with you, we can discuss the details off-line.

Thank you!

Ludo’.





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Valgrind fix
  2010-03-08 18:49 ` No Itisnt
@ 2010-03-08 21:21   ` Ludovic Courtès
  2010-03-12 18:41     ` No Itisnt
  0 siblings, 1 reply; 7+ messages in thread
From: Ludovic Courtès @ 2010-03-08 21:21 UTC (permalink / raw)
  To: guile-devel

Hi again,

I should’ve read this message before.  Thanks for the explanation.

No Itisnt <theseaisinhere@gmail.com> writes:

> The attached program grabs the stack address from the kernel, glibc,
> and by taking the address of a stack object, then prints the
> difference. When run normally, the differences are minimal, but when
> run under valgrind, there is a giant difference between the base of
> the actual program and the base given by the kernel.
>
> Ideally, this would be patched in libgc,

Indeed.  Now it’s curious why libgc would need to mess up with /proc
since it’s Linux-specific, inefficient (open(2), read(2), parse, etc.),
and fragile (what if /proc isn’t mounted?  what if the format of ‘maps’
changes?).

libgc should probably just use ‘__libc_stack_end’ et al. on all GNU
variants, including GNU/Linux.

Thanks,
Ludo’.





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Valgrind fix
  2010-03-08 21:21   ` Ludovic Courtès
@ 2010-03-12 18:41     ` No Itisnt
  2010-04-13 12:15       ` Ludovic Courtès
  0 siblings, 1 reply; 7+ messages in thread
From: No Itisnt @ 2010-03-12 18:41 UTC (permalink / raw)
  Cc: guile-devel

Boehm GC CVS now has a patch that will fix this behavior. Compile with
threads and -DUSE_GET_STACKBASE_FOR_MAIN. See the following thread for
details.

http://comments.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/3788

If Hans Boehm agrees, the behavior will probably become default and no
flags will be necessary.

At this point it's not necessary to patch Guile, as long as
valgrinders are willing to use CVS for now. I think that's for the
best, since the solution is pretty hacky.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Valgrind fix
  2010-03-12 18:41     ` No Itisnt
@ 2010-04-13 12:15       ` Ludovic Courtès
  0 siblings, 0 replies; 7+ messages in thread
From: Ludovic Courtès @ 2010-04-13 12:15 UTC (permalink / raw)
  To: No Itisnt; +Cc: guile-devel

Hi!

No Itisnt <theseaisinhere@gmail.com> writes:

> Boehm GC CVS now has a patch that will fix this behavior. Compile with
> threads and -DUSE_GET_STACKBASE_FOR_MAIN. See the following thread for
> details.
>
> http://comments.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/3788
>
> If Hans Boehm agrees, the behavior will probably become default and no
> flags will be necessary.
>
> At this point it's not necessary to patch Guile, as long as
> valgrinders are willing to use CVS for now. I think that's for the
> best, since the solution is pretty hacky.

I tried libgc CVS recently and Valgrind still dumps a vgcore for Guile
(that’s x86_64-linux-gnu, Valgrind 3.5.0).  Am I missing something?

Thanks,
Ludo’.




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-04-13 12:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-08  0:04 Valgrind fix No Itisnt
2010-03-08 18:49 ` No Itisnt
2010-03-08 21:21   ` Ludovic Courtès
2010-03-12 18:41     ` No Itisnt
2010-04-13 12:15       ` Ludovic Courtès
2010-03-08 21:16 ` Ludovic Courtès
  -- strict thread matches above, loose matches on Subject: below --
2010-03-07 22:53 No Itisnt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).