unofficial mirror of guile-devel@gnu.org 
* [PATCH] Fix hanging of popen.test
@ 2010-06-10 22:54 Neil Jerram
  2010-06-11 19:48 ` Neil Jerram
  0 siblings, 1 reply; 18+ messages in thread
From: Neil Jerram @ 2010-06-10 22:54 UTC
  To: guile-devel; +Cc: Neil Jerram

The "open-output-pipe":"no duplicate" test has been hanging
intermittently for a few years.  It's now doing so fairly reliably
for me, and investigation shows that

- the child shell process is in a tight loop (99% CPU)

- the parent Guile process is stuck calling waitpid().

The problem is that the child hasn't got the SIGPIPE that the test
intends, and so is continuing to echo "closed" forever; and Guile is
waiting for it to terminate, forever.
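
A minimal sketch of the mechanism the test relies on, assuming a POSIX
shell and the `yes` command as a stand-in writer (neither is part of the
patch, and `writer_status` is a made-up helper name).  A writer whose
reader has gone away should be stopped by SIGPIPE, or fail with EPIPE if
SIGPIPE is being ignored, rather than loop forever:

```shell
# Print the exit status of a writer whose reader exits immediately.
writer_status() {
  # fd 3 is a copy of the caller's stdout, so the status can be
  # reported from inside the pipeline without going through the pipe.
  {
    { yes closed 2>/dev/null; echo "$?" >&3; } | true
  } 3>&1
}

writer_status
```

In most shells a status above 128 means the writer was killed by a
signal.  The hang described above is the case where the writer never
terminates at all.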

I haven't fully debugged the SIGPIPE problem, but it sounds very like
what Chet Ramey describes here:
http://old.nabble.com/Re%3A-SIGPIPE-not-properly-reset-with-%27trap---PIPE%27-p20985595.html.

(And my version of bash is 3.2.39.)

So, a fix should be to use something other than shell to implement the
child; and it appears that this works.

* check-guile.in (TEST_SUITE_DIR): Export.

* test-suite/tests/popen-child.scm: New script file.

* test-suite/tests/popen.test ("open-output-pipe", "no duplicate"):
  Use Guile for the child process, instead of shell.
---
 check-guile.in                   |    1 +
 test-suite/tests/popen-child.scm |    4 ++++
 test-suite/tests/popen.test      |    5 +++--
 3 files changed, 8 insertions(+), 2 deletions(-)
 create mode 100644 test-suite/tests/popen-child.scm

diff --git a/check-guile.in b/check-guile.in
index dde51b3..fc670e1 100644
--- a/check-guile.in
+++ b/check-guile.in
@@ -15,6 +15,7 @@ top_builddir=@top_builddir_absolute@
 top_srcdir=@top_srcdir_absolute@
 
 TEST_SUITE_DIR=${top_srcdir}/test-suite
+export TEST_SUITE_DIR
 
 if [ x"$1" = x-i ] ; then
     guile=$2
diff --git a/test-suite/tests/popen-child.scm b/test-suite/tests/popen-child.scm
new file mode 100644
index 0000000..4bfe6b7
--- /dev/null
+++ b/test-suite/tests/popen-child.scm
@@ -0,0 +1,4 @@
+(close-port (current-input-port))
+(let loop ()
+  (display "closed\n" (current-error-port))
+  (force-output  (current-error-port)))
diff --git a/test-suite/tests/popen.test b/test-suite/tests/popen.test
index 0a20cff..a408c9e 100644
--- a/test-suite/tests/popen.test
+++ b/test-suite/tests/popen.test
@@ -167,8 +167,9 @@
     (let* ((c2p (pipe))
 	   (port (with-error-to-port (cdr c2p)
 		   (lambda ()
-		     (open-output-pipe
-		      "exec 0</dev/null; while true; do echo closed 1>&2; done")))))
+		     (open-output-pipe (format #f
+                                               "guile -s ~a/tests/popen-child.scm"
+                                               (getenv "TEST_SUITE_DIR")))))))
       (close-port (cdr c2p))   ;; write side
       (with-epipe
        (lambda ()
-- 
1.7.1





* Re: [PATCH] Fix hanging of popen.test
  2010-06-10 22:54 [PATCH] Fix hanging of popen.test Neil Jerram
@ 2010-06-11 19:48 ` Neil Jerram
  2010-06-14 21:27   ` Andy Wingo
  0 siblings, 1 reply; 18+ messages in thread
From: Neil Jerram @ 2010-06-11 19:48 UTC
  To: guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> +(close-port (current-input-port))
> +(let loop ()
> +  (display "closed\n" (current-error-port))
> +  (force-output  (current-error-port)))

Oops, missing a `(loop)' at the end there.  I'll see if the test still
works after adding that in.
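
For reference, a sketch of what the child script would look like with
the missing `(loop)` call added, written out through a shell heredoc.
The helper name `write_popen_child` is hypothetical, and the content is
an assumption about the corrected file, not the committed version:

```shell
# Write the corrected child script to the path given as $1.
write_popen_child() {
  cat > "$1" <<'EOF'
(close-port (current-input-port))
(let loop ()
  (display "closed\n" (current-error-port))
  (force-output (current-error-port))
  (loop))
EOF
}
```

With the tail call in place the script keeps writing "closed" to its
error port until something stops it, which is what the test expects.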




* Re: [PATCH] Fix hanging of popen.test
  2010-06-11 19:48 ` Neil Jerram
@ 2010-06-14 21:27   ` Andy Wingo
  2010-06-28 21:48     ` Neil Jerram
  0 siblings, 1 reply; 18+ messages in thread
From: Andy Wingo @ 2010-06-14 21:27 UTC
  To: Neil Jerram; +Cc: guile-devel

Heya Neil,

On Fri 11 Jun 2010 21:48, Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> +(close-port (current-input-port))
>> +(let loop ()
>> +  (display "closed\n" (current-error-port))
>> +  (force-output  (current-error-port)))
>
> Oops, missing a `(loop)' at the end there.  I'll see if the test still
> works after adding that in.

It was an interesting analysis, though I admit I didn't follow all of
it. The SIGPIPE didn't hit the subprocess because bash was doing
something strange? How did your new test work out?

Regards,

Andy
-- 
http://wingolog.org/




* Re: [PATCH] Fix hanging of popen.test
  2010-06-14 21:27   ` Andy Wingo
@ 2010-06-28 21:48     ` Neil Jerram
  2010-06-29  9:31       ` Andy Wingo
  0 siblings, 1 reply; 18+ messages in thread
From: Neil Jerram @ 2010-06-28 21:48 UTC
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Heya Neil,

Hi again!  Apologies for the long hiatus... my home network imploded for
a while.

> It was an interesting analysis, though I admit I didn't follow all of
> it. The SIGPIPE didn't hit the subprocess because bash was doing
> something strange?

Yes, I think so.  My interpretation of that email trail is that bash v3
doesn't let SIGPIPE get to its builtins, such as echo.

> How did your new test work out?

It worked.  However, when I try to build Guile again now, to double
check, I get:

cat alist.doc arbiters.doc array-handle.doc array-map.doc arrays.doc async.doc backtrace.doc boolean.doc bitvectors.doc bytevectors.doc chars.doc control.doc continuations.doc debug.doc deprecated.doc deprecation.doc discouraged.doc dynl.doc dynwind.doc eq.doc error.doc eval.doc evalext.doc expand.doc extensions.doc feature.doc fluids.doc foreign.doc fports.doc gc-malloc.doc gc.doc gettext.doc generalized-arrays.doc generalized-vectors.doc goops.doc gsubr.doc guardians.doc hash.doc hashtab.doc hooks.doc i18n.doc init.doc ioext.doc keywords.doc list.doc load.doc macros.doc mallocs.doc memoize.doc modules.doc numbers.doc objprop.doc options.doc pairs.doc ports.doc print.doc procprop.doc procs.doc promises.doc properties.doc r6rs-ports.doc random.doc rdelim.doc read.doc root.doc rw.doc scmsigs.doc script.doc simpos.doc smob.doc sort.doc srcprop.doc srfi-13.doc srfi-14.doc srfi-4.doc stackchk.doc stacks.doc stime.doc strings.doc strorder.doc strports.doc struct.doc symbols.doc threads.doc throw.doc trees.doc uniform.doc values.doc variable.doc vectors.doc version.doc vports.doc weaks.doc dynl.doc filesys.doc posix.doc net_db.doc socket.doc regex-posix.doc | GUILE_AUTO_COMPILE=0 ../meta/uninstalled-env guile-tools snarf-check-and-output-texi          > guile-procedures.texi || { rm guile-procedures.texi; false; }
`scm_trampoline_1' is deprecated. Just use `scm_call_1' instead.
guile: uncaught throw to wrong-type-arg: (vm-debug-engine Wrong type to apply: ~S (#<with-fluids 404dc498>) (#<with-fluids 404dc498>))
Cannot exit gracefully when init is in progress; aborting.
cat: write error: Broken pipe
/bin/sh: line 1:  9771 Done(1)                 cat alist.doc arbiters.doc array-handle.doc array-map.doc arrays.doc async.doc backtrace.doc boolean.doc bitvectors.doc bytevectors.doc chars.doc control.doc continuations.doc debug.doc deprecated.doc deprecation.doc discouraged.doc dynl.doc dynwind.doc eq.doc error.doc eval.doc evalext.doc expand.doc extensions.doc feature.doc fluids.doc foreign.doc fports.doc gc-malloc.doc gc.doc gettext.doc generalized-arrays.doc generalized-vectors.doc goops.doc gsubr.doc guardians.doc hash.doc hashtab.doc hooks.doc i18n.doc init.doc ioext.doc keywords.doc list.doc load.doc macros.doc mallocs.doc memoize.doc modules.doc numbers.doc objprop.doc options.doc pairs.doc ports.doc print.doc procprop.doc procs.doc promises.doc properties.doc r6rs-ports.doc random.doc rdelim.doc read.doc root.doc rw.doc scmsigs.doc script.doc simpos.doc smob.doc sort.doc srcprop.doc srfi-13.doc srfi-14.doc srfi-4.doc stackchk.doc stacks.doc stime.doc strings.doc strorder.doc strports.doc struct.doc symbols.doc threads.doc throw.doc trees.doc uniform.doc values.doc variable.doc vectors.doc version.doc vports.doc weaks.doc dynl.doc filesys.doc posix.doc net_db.doc socket.doc regex-posix.doc
      9772 Aborted                 | GUILE_AUTO_COMPILE=0 ../meta/uninstalled-env guile-tools snarf-check-and-output-texi > guile-procedures.texi

Is that a known problem?  Has the correct build incantation perhaps
changed (from './autogen.sh && ./configure && make && make check') ?

If I then go to a shell:

neil@arudy:~/SW/Guile/master$ GUILE_AUTO_COMPILE=0 ./meta/uninstalled-env guile -c '(+ 3 3)'
`scm_trampoline_1' is deprecated. Just use `scm_call_1' instead.
guile: uncaught throw to wrong-type-arg: (vm-debug-engine Wrong type to apply: ~S (#<with-fluids b7adb498>) (#<with-fluids b7adb498>))
Cannot exit gracefully when init is in progress; aborting.
Aborted

So it's nothing specifically to do with 'guile-tools
snarf-check-and-output-texi'.

Also, in the autogen.sh step, in case it's relevant, I get some
warnings:

am/snarf:5: AM_V_SNARF_$(V: non-POSIX variable name
guile-readline/Makefile.am:22:   `am/snarf' included from here
am/snarf:6: AM_V_SNARF_$(AM_DEFAULT_VERBOSITY: non-POSIX variable name
guile-readline/Makefile.am:22:   `am/snarf' included from here
am/snarf:5: AM_V_SNARF_$(V: non-POSIX variable name
libguile/Makefile.am:22:   `am/snarf' included from here
am/snarf:6: AM_V_SNARF_$(AM_DEFAULT_VERBOSITY: non-POSIX variable name
libguile/Makefile.am:22:   `am/snarf' included from here
libguile/Makefile.am:664: AM_V_FILTER_$(V: non-POSIX variable name
libguile/Makefile.am:665: AM_V_FILTER_$(AM_DEFAULT_VERBOSITY: non-POSIX variable name
am/guilec:33: AM_V_GUILEC_$(V: non-POSIX variable name
module/Makefile.am:22:   `am/guilec' included from here
am/guilec:34: AM_V_GUILEC_$(AM_DEFAULT_VERBOSITY: non-POSIX variable name
module/Makefile.am:22:   `am/guilec' included from here
am/snarf:5: AM_V_SNARF_$(V: non-POSIX variable name
srfi/Makefile.am:22:   `am/snarf' included from here
am/snarf:6: AM_V_SNARF_$(AM_DEFAULT_VERBOSITY: non-POSIX variable name
srfi/Makefile.am:22:   `am/snarf' included from here
am/snarf:5: AM_V_SNARF_$(V: non-POSIX variable name
test-suite/standalone/Makefile.am:22:   `am/snarf' included from here
am/snarf:6: AM_V_SNARF_$(AM_DEFAULT_VERBOSITY: non-POSIX variable name
test-suite/standalone/Makefile.am:22:   `am/snarf' included from here

Does that just mean I need to upgrade autotools?

Regards,
        Neil




* Re: [PATCH] Fix hanging of popen.test
  2010-06-28 21:48     ` Neil Jerram
@ 2010-06-29  9:31       ` Andy Wingo
  2010-06-29 19:11         ` Neil Jerram
  0 siblings, 1 reply; 18+ messages in thread
From: Andy Wingo @ 2010-06-29  9:31 UTC
  To: Neil Jerram; +Cc: guile-devel

Hi Neil,

On Mon 28 Jun 2010 23:48, Neil Jerram <neil@ossau.uklinux.net> writes:

> When I try to build Guile again now, to double check, I get:
>
> cat alist.doc arbiters.doc array-handle.doc array-map.doc arrays.doc async.doc backtrace.doc boolean.doc bitvectors.doc bytevectors.doc chars.doc control.doc continuations.doc debug.doc deprecated.doc deprecation.doc discouraged.doc dynl.doc dynwind.doc eq.doc error.doc eval.doc evalext.doc expand.doc extensions.doc feature.doc fluids.doc foreign.doc fports.doc gc-malloc.doc gc.doc gettext.doc generalized-arrays.doc generalized-vectors.doc goops.doc gsubr.doc guardians.doc hash.doc hashtab.doc hooks.doc i18n.doc init.doc ioext.doc keywords.doc list.doc load.doc macros.doc mallocs.doc memoize.doc modules.doc numbers.doc objprop.doc options.doc pairs.doc ports.doc print.doc procprop.doc procs.doc promises.doc properties.doc r6rs-ports.doc random.doc rdelim.doc read.doc root.doc rw.doc scmsigs.doc script.doc simpos.doc smob.doc sort.doc srcprop.doc srfi-13.doc srfi-14.doc srfi-4.doc stackchk.doc stacks.doc stime.doc strings.doc strorder.doc strports.doc struct.doc symbols.doc threads.doc throw.doc trees.doc uniform.doc values.doc variable.doc vectors.doc version.doc vports.doc weaks.doc dynl.doc filesys.doc posix.doc net_db.doc socket.doc regex-posix.doc | GUILE_AUTO_COMPILE=0 ../meta/uninstalled-env guile-tools snarf-check-and-output-texi          > guile-procedures.texi || { rm guile-procedures.texi; false; }
> `scm_trampoline_1' is deprecated. Just use `scm_call_1' instead.
> guile: uncaught throw to wrong-type-arg: (vm-debug-engine Wrong type to apply: ~S (#<with-fluids 404dc498>) (#<with-fluids 404dc498>))
> Cannot exit gracefully when init is in progress; aborting.
> cat: write error: Broken pipe
> /bin/sh: line 1:  9771 Done(1)                 cat alist.doc arbiters.doc array-handle.doc array-map.doc arrays.doc async.doc backtrace.doc boolean.doc bitvectors.doc bytevectors.doc chars.doc control.doc continuations.doc debug.doc deprecated.doc deprecation.doc discouraged.doc dynl.doc dynwind.doc eq.doc error.doc eval.doc evalext.doc expand.doc extensions.doc feature.doc fluids.doc foreign.doc fports.doc gc-malloc.doc gc.doc gettext.doc generalized-arrays.doc generalized-vectors.doc goops.doc gsubr.doc guardians.doc hash.doc hashtab.doc hooks.doc i18n.doc init.doc ioext.doc keywords.doc list.doc load.doc macros.doc mallocs.doc memoize.doc modules.doc numbers.doc objprop.doc options.doc pairs.doc ports.doc print.doc procprop.doc procs.doc promises.doc properties.doc r6rs-ports.doc random.doc rdelim.doc read.doc root.doc rw.doc scmsigs.doc script.doc simpos.doc smob.doc sort.doc srcprop.doc srfi-13.doc srfi-14.doc srfi-4.doc stackchk.doc stacks.doc stime.doc strings.doc strorder.doc strports.doc struct.doc symbols.doc threads.doc throw.doc trees.doc uniform.doc values.doc variable.doc vectors.doc version.doc vports.doc weaks.doc dynl.doc filesys.doc posix.doc net_db.doc socket.doc regex-posix.doc
>       9772 Aborted                 | GUILE_AUTO_COMPILE=0 ../meta/uninstalled-env guile-tools snarf-check-and-output-texi > guile-procedures.texi
>
> Is that a known problem?  Has the correct build incantation perhaps
> changed (from './autogen.sh && ./configure && make && make check') ?

This is not a known problem to me, and the build has not changed;
however it seems you are working on an old revision. Some things changed
in the past that required a clean build. Can you try that?

Furthermore in the past the meta/uninstalled-env wasn't setting
GUILE_SYSTEM_PATH properly, so it would pick up installed .scm files.

> If I then go to a shell:
>
> neil@arudy:~/SW/Guile/master$ GUILE_AUTO_COMPILE=0 ./meta/uninstalled-env guile -c '(+ 3 3)'
> `scm_trampoline_1' is deprecated. Just use `scm_call_1' instead.

This is fishy; nothing in current code calls scm_trampoline_1.

> guile: uncaught throw to wrong-type-arg: (vm-debug-engine Wrong type to apply: ~S (#<with-fluids b7adb498>) (#<with-fluids b7adb498>))
> Cannot exit gracefully when init is in progress; aborting.
> Aborted

Indeed.

> Also, in the autogen.sh step, in case it's relevant, I get some
> warnings:
>
> am/snarf:5: AM_V_SNARF_$(V: non-POSIX variable name

Yes, they are harmless warnings. You don't get these warnings with
automake 1.11. 

FWIW I'm going to not be around very much until sometime next week, so
apologies in advance for delayed replies :)

Regards,

Andy
-- 
http://wingolog.org/




* Re: [PATCH] Fix hanging of popen.test
  2010-06-29  9:31       ` Andy Wingo
@ 2010-06-29 19:11         ` Neil Jerram
  2010-06-30 22:50           ` Neil Jerram
  0 siblings, 1 reply; 18+ messages in thread
From: Neil Jerram @ 2010-06-29 19:11 UTC
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> Hi Neil,

Hi, and thanks for the quick reply.

> This is not a known problem to me, and the build has not changed;
> however it seems you are working on an old revision. Some things changed
> in the past that required a clean build.

I'm sure the tree is up to date, and I've already done a complete
rebuild.

>> neil@arudy:~/SW/Guile/master$ GUILE_AUTO_COMPILE=0 ./meta/uninstalled-env guile -c '(+ 3 3)'
>> `scm_trampoline_1' is deprecated. Just use `scm_call_1' instead.
>
> This is fishy; nothing in current code calls scm_trampoline_1.

Aha...  Time for strace then, which includes:

open("/home/neil/SW/Guile/master/module/srfi/srfi-1.scm", O_RDONLY|O_LARGEFILE) = 10
open("/usr/local/lib/libguile-srfi-srfi-1-v-4.la", O_RDONLY) = 11
open("/usr/local/lib/libguile-srfi-srfi-1-v-4.so.4", O_RDONLY) = 11
open("/usr/local/lib/libguile.so.18", O_RDONLY) = 11

even though the real libguile was loaded well before then:

open("/home/neil/SW/Guile/master/libguile/.libs/libguile-2.0.so.18", O_RDONLY) = 3

So the problem appears to be srfi-1.scm picking up something old from
/usr/local/lib.  I'll dig deeper.
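
A small sketch of how such a trace could be filtered, assuming the
`open("...")` line format shown above (newer strace versions may print
`openat(...)` instead, which this does not match).  The function name
`stray_guile_libs` is made up for illustration:

```shell
# Read strace output on stdin and print every libguile* file that was
# opened from /usr/lib or /usr/local/lib rather than the build tree.
stray_guile_libs() {
  sed -n 's|^open("\(/usr/\(local/\)\{0,1\}lib/libguile[^"]*\)".*|\1|p'
}
```

It could be fed from a saved trace, for example
`stray_guile_libs < trace.log`, where `trace.log` is a hypothetical
file holding the strace output.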

>> am/snarf:5: AM_V_SNARF_$(V: non-POSIX variable name
>
> Yes, they are harmless warnings. You don't get these warnings with
> automake 1.11. 

Thanks, I'll upgrade.

> FWIW I'm going to not be around very much until sometime next week, so
> apologies in advance for delayed replies :)

No problem, I think you've already provided enough clue for me to make
progress!

   Neil




* Re: [PATCH] Fix hanging of popen.test
  2010-06-29 19:11         ` Neil Jerram
@ 2010-06-30 22:50           ` Neil Jerram
  2010-06-30 23:58             ` Neil Jerram
  2010-07-01 16:22             ` dsmich
  0 siblings, 2 replies; 18+ messages in thread
From: Neil Jerram @ 2010-06-30 22:50 UTC
  To: guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> Aha...  Time for strace then, which includes:
>
> open("/home/neil/SW/Guile/master/module/srfi/srfi-1.scm", O_RDONLY|O_LARGEFILE) = 10
> open("/usr/local/lib/libguile-srfi-srfi-1-v-4.la", O_RDONLY) = 11
> open("/usr/local/lib/libguile-srfi-srfi-1-v-4.so.4", O_RDONLY) = 11
> open("/usr/local/lib/libguile.so.18", O_RDONLY) = 11
>
> even though the real libguile was loaded well before then:
>
> open("/home/neil/SW/Guile/master/libguile/.libs/libguile-2.0.so.18", O_RDONLY) = 3
>
> So the problem appears to be srfi-1.scm picking up something old from
> /usr/local/lib.  I'll dig deeper.

Hmm.  I'm now suspecting a build order issue, which is masked if you
happen to have been building Guile regularly recently and so have a
similar enough libguile in /usr/local/lib.

If I hide everything I have installed in /usr/local/lib (by renaming lib
to libx), and then build again from scratch, then the error is:

  GEN    guile-procedures.texi
guile: uncaught throw to misc-error: (dynamic-link file: ~S, message: ~S (libguile-srfi-srfi-1-v-4 file not found) #f)

which makes sense because the build is still building everything in the
"libguile" directory and hasn't got to the "srfi" directory yet.

So my hypothesis now is:

- some change in the last few months has introduced a dependency of
  Guile script startup (specifically including the case where Guile is
  run to generate guile-procedures.texi) on srfi-1.scm

- this makes the build impossible!

- for regular developers, this may be masked by having a libguile.so and
  libguile-srfi-srfi-13-14-v-4.so in /usr/lib or /usr/local/lib that are
  recent enough to work.

Can anyone else build current git from scratch if they first hide or
delete any guile libraries in /usr/lib and /usr/local/lib?
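
A sketch of the hiding step, for anyone who wants to repeat the
experiment.  The helper names `hide_dir` and `unhide_dir` are
hypothetical; they just wrap the lib-to-libx rename described above:

```shell
# Rename DIR to DIRx so that nothing in it can be picked up.
# Pass the directory without a trailing slash.
hide_dir() {
  mv -- "$1" "${1}x"
}

# Undo hide_dir.
unhide_dir() {
  mv -- "${1}x" "$1"
}
```

The intended sequence would be `hide_dir /usr/local/lib`, a build from
scratch, then `unhide_dir /usr/local/lib`.  Renaming a system library
directory can break other installed programs while it is hidden, so
this is best tried on a machine where that is acceptable.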

      Neil




* Re: [PATCH] Fix hanging of popen.test
  2010-06-30 22:50           ` Neil Jerram
@ 2010-06-30 23:58             ` Neil Jerram
  2010-07-01 10:48               ` Andy Wingo
  2010-07-01 16:22             ` dsmich
  1 sibling, 1 reply; 18+ messages in thread
From: Neil Jerram @ 2010-06-30 23:58 UTC
  To: guile-devel

Neil Jerram <neil@ossau.uklinux.net> writes:

> So my hypothesis now is:
>
> - some change in the last few months has introduced a dependency of
>   Guile script startup (specifically including the case where Guile is
>   run to generate guile-procedures.texi) on srfi-1.scm
>
> - this makes the build impossible!
>
> - for regular developers, this may be masked by having a libguile.so and
>   libguile-srfi-srfi-13-14-v-4.so in /usr/lib or /usr/local/lib that are
>   recent enough to work.
>
> Can anyone else build current git from scratch if they first hide or
> delete any guile libraries in /usr/lib and /usr/local/lib?

It occurred to me that if this is right, the Hydra build should be
failing too.  And I think it is.  The "gnu:guile-master:tarball" build
last succeeded on 17th June, and has been failing since then:
http://hydra.nixos.org/job/gnu/guile-master/tarball/all?page=1




* Re: [PATCH] Fix hanging of popen.test
  2010-06-30 23:58             ` Neil Jerram
@ 2010-07-01 10:48               ` Andy Wingo
  2010-07-01 20:29                 ` Patrick McCarty
                                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Andy Wingo @ 2010-07-01 10:48 UTC
  To: Neil Jerram; +Cc: guile-devel

On Thu 01 Jul 2010 00:58, Neil Jerram <neil@ossau.uklinux.net> writes:

> Neil Jerram <neil@ossau.uklinux.net> writes:
>
>> So my hypothesis now is:
>>
>> - some change in the last few months has introduced a dependency of
>>   Guile script startup (specifically including the case where Guile is
>>   run to generate guile-procedures.texi) on srfi-1.scm
>>
>> - this makes the build impossible!
>>
>> - for regular developers, this may be masked by having a libguile.so and
>>   libguile-srfi-srfi-13-14-v-4.so in /usr/lib or /usr/local/lib that are
>>   recent enough to work.
>>
>> Can anyone else build current git from scratch if they first hide or
>> delete any guile libraries in /usr/lib and /usr/local/lib?
>
> It occurred to me that if this is right, the Hydra build should be
> failing too.  And I think it is.  The "gnu:guile-master:tarball" build
> last succeeded on 17th June, and has been failing since then:
> http://hydra.nixos.org/job/gnu/guile-master/tarball/all?page=1

This has been an on-and-off issue:
02fcbf78b27788c03563e5c3d297a4cd469ce562, and
04af4c4c5221c082905d52eb5ad3829ed681d097. Indeed running the repl before
srfi-1 is built should not work. As a corollary, the docstring texi
thing should not require srfi-1 to work, which obviously is failing for
you and hydra, though not for me and a number of other people. I don't
really get it. I thought this was fixed. There have been a couple
threads about this, even, if you search the archives in recent months.

Can you look in the output of `meta/uninstalled-env env' for any off
paths?

What about in %load-path / %load-compiled-path ?

At the same time there must be an issue there somehow, or the tarball
buildbot would not be failing (surely?).

Andy
-- 
http://wingolog.org/




* Re: [PATCH] Fix hanging of popen.test
  2010-06-30 22:50           ` Neil Jerram
  2010-06-30 23:58             ` Neil Jerram
@ 2010-07-01 16:22             ` dsmich
  2010-07-01 21:22               ` Neil Jerram
  1 sibling, 1 reply; 18+ messages in thread
From: dsmich @ 2010-07-01 16:22 UTC
  To: guile-devel, Neil Jerram


---- Neil Jerram <neil@ossau.uklinux.net> wrote: 
> So my hypothesis now is:
> 
> - some change in the last few months has introduced a dependency of
>   Guile script startup (specifically including the case where Guile is
>   run to generate guile-procedures.texi) on srfi-1.scm
> 
> - this makes the build impossible!
> 
> - for regular developers, this may be masked by having a libguile.so and
>   libguile-srfi-srfi-13-14-v-4.so in /usr/lib or /usr/local/lib that are
>   recent enough to work.
> 
> Can anyone else build current git from scratch if they first hide or
> delete any guile libraries in /usr/lib and /usr/local/lib?
> 
>       Neil

If you "make -k" the needed things are built, and then another make finishes up what's left.
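
A sketch of that two-pass workaround as a shell function.  The name
`two_pass_build` is made up, and running the second pass only when the
first one reports failure is an assumption about how one would script
it:

```shell
# First pass: keep going past failing targets so that everything that
# can be built is built.  Second pass: finish whatever failed earlier.
two_pass_build() {
  make -k "$@" || make "$@"
}
```

Any arguments, for example `-j2`, are passed through to both make
invocations.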

-Dale









* Re: [PATCH] Fix hanging of popen.test
  2010-07-01 10:48               ` Andy Wingo
@ 2010-07-01 20:29                 ` Patrick McCarty
  2010-07-03 22:17                 ` Ludovic Courtès
  2010-07-04 20:33                 ` Neil Jerram
  2 siblings, 0 replies; 18+ messages in thread
From: Patrick McCarty @ 2010-07-01 20:29 UTC
  To: Andy Wingo; +Cc: guile-devel, Neil Jerram

On Thu, Jul 1, 2010 at 3:48 AM, Andy Wingo <wingo@pobox.com> wrote:
> On Thu 01 Jul 2010 00:58, Neil Jerram <neil@ossau.uklinux.net> writes:
>>
>> It occurred to me that if this is right, the Hydra build should be
>> failing too.  And I think it is.  The "gnu:guile-master:tarball" build
>> last succeeded on 17th June, and has been failing since then:
>> http://hydra.nixos.org/job/gnu/guile-master/tarball/all?page=1
>
> At the same time there must be an issue there somehow, or the tarball
> buildbot would not be failing (surely?).

I have been seeing the same problem recently...

I just did a full `git bisect' from the release_1-9-11 commit to git
master, and it found the following commit as the culprit:

  commit 4f99a499197b592a9a3060de2205531852f4f94d
  Author: Andy Wingo <wingo@pobox.com>
  Date:   Fri Jun 18 11:33:16 2010 +0200

      deprecate set-repl-prompt!


Hope this helps,
-Patrick




* Re: [PATCH] Fix hanging of popen.test
  2010-07-01 16:22             ` dsmich
@ 2010-07-01 21:22               ` Neil Jerram
  0 siblings, 0 replies; 18+ messages in thread
From: Neil Jerram @ 2010-07-01 21:22 UTC
  To: dsmich; +Cc: guile-devel

<dsmich@roadrunner.com> writes:

> If you "make -k" the needed things are built, and then another make finished up what's left.

Ah yes, good idea, that works.  Thanks!

     Neil




* Re: [PATCH] Fix hanging of popen.test
  2010-07-01 10:48               ` Andy Wingo
  2010-07-01 20:29                 ` Patrick McCarty
@ 2010-07-03 22:17                 ` Ludovic Courtès
  2010-07-04  9:07                   ` Andy Wingo
  2010-07-04 20:33                 ` Neil Jerram
  2 siblings, 1 reply; 18+ messages in thread
From: Ludovic Courtès @ 2010-07-03 22:17 UTC
  To: guile-devel

Hello,

Andy Wingo <wingo@pobox.com> writes:

>> It occurred to me that if this is right, the Hydra build should be
>> failing too.  And I think it is.  The "gnu:guile-master:tarball" build
>> last succeeded on 17th June, and has been failing since then:
>> http://hydra.nixos.org/job/gnu/guile-master/tarball/all?page=1
>
> This has been an on-and-off issue:
> 02fcbf78b27788c03563e5c3d297a4cd469ce562, and
> 04af4c4c5221c082905d52eb5ad3829ed681d097. Indeed running the repl before
> srfi-1 is built should not work. As a corollary, the docstring texi
> thing should not require srfi-1 to work, which obviously is failing for
> you and hydra, though not for me and a number of other people. I don't
> really get it. I thought this was fixed. There have been a couple
> threads about this, even, if you search the archives in recent months.

Though that’s not the proper fix, I propose to rewrite the SRFI-1 bits
that are in C to Scheme.  I’ll look into it if nobody beats me at it.

Thanks,
Ludo’.





* Re: [PATCH] Fix hanging of popen.test
  2010-07-03 22:17                 ` Ludovic Courtès
@ 2010-07-04  9:07                   ` Andy Wingo
  0 siblings, 0 replies; 18+ messages in thread
From: Andy Wingo @ 2010-07-04  9:07 UTC
  To: Ludovic Courtès; +Cc: guile-devel

Hi,

On Sat 03 Jul 2010 23:17, ludo@gnu.org (Ludovic Courtès) writes:

> I propose to rewrite the SRFI-1 bits that are in C to Scheme.

There's always the original srfi-1 code :)

Would be nice to benchmark these changes. As you mention it's not the
right fix to this problem, but it is the right thing to do in the
end. Unfortunately until we get a native compiler, there will be a speed
penalty for large-sized lists.

Andy
-- 
http://wingolog.org/




* Re: [PATCH] Fix hanging of popen.test
  2010-07-01 10:48               ` Andy Wingo
  2010-07-01 20:29                 ` Patrick McCarty
  2010-07-03 22:17                 ` Ludovic Courtès
@ 2010-07-04 20:33                 ` Neil Jerram
  2010-07-06 21:35                   ` Ludovic Courtès
  2010-07-17 11:57                   ` Andy Wingo
  2 siblings, 2 replies; 18+ messages in thread
From: Neil Jerram @ 2010-07-04 20:33 UTC
  To: Andy Wingo; +Cc: guile-devel

Andy Wingo <wingo@pobox.com> writes:

> This has been an on-and-off issue:
> 02fcbf78b27788c03563e5c3d297a4cd469ce562, and
> 04af4c4c5221c082905d52eb5ad3829ed681d097.

Right.  The commit (4f99a499197b592a9a3060de2205531852f4f94d) that
Patrick McCarty identified (with git bisect) is another very similar
case.

Even if we fix this particular one, it seems likely that similar cases
will keep arising, so a more general fix would be better.

The general problem is that any #:use-module or (@ ...) that indirectly
pulls in srfi-1, in any code that is run as part of Guile startup, will
cause the build to fail when doing the snarf-check-and-output-texi,
because libguile-srfi-srfi-13-14-v-4 hasn't been built at that point.

But if the developer concerned happens to have a compatible
libguile-srfi-srfi-13-14-v-4 and libguile installed in /usr/lib or
/usr/local/lib, the build may pick those up and so mask the problem.

Is there a reason why we don't just move all the SRFI C code into the
core libguile?  I think that would be a general fix.

Alternatively we could try making snarf-check-and-output-texi happen
later in the build.  But I had a quick go at that and found it tricky,
because of the dependency on all the .doc files, and those being listed
and generated by libguile/Makefile.am; if we moved the
snarf-check-and-output-texi step to, say, doc/Makefile.am, we'd lose
that dependency.

> I don't really get it. I thought this was fixed. There have been a
> couple threads about this, even, if you search the archives in recent
> months.

I've checked the one starting at
http://lists.gnu.org/archive/html/guile-devel/2010-06/msg00029.html.
But I think it just covers one particular case, that you fixed.

Regards,
      Neil




* Re: [PATCH] Fix hanging of popen.test
  2010-07-04 20:33                 ` Neil Jerram
@ 2010-07-06 21:35                   ` Ludovic Courtès
  2010-07-17 11:57                   ` Andy Wingo
  1 sibling, 0 replies; 18+ messages in thread
From: Ludovic Courtès @ 2010-07-06 21:35 UTC
  To: guile-devel

Hi Neil,

Neil Jerram <neil@ossau.uklinux.net> writes:

> Is there a reason why we don't just move all the SRFI C code into the
> core libguile?  I think that would be a general fix.

While not officially documented, libguile-srfi-srfi-1 and its API are in
fact public and should remain available.

Rather than move C code from there to libguile, I was thinking about
moving C code to Scheme, which, as Andy pointed out, will often boil
down to reverting an old rewrite from Scheme to C.  :-)

Thanks,
Ludo’.





* Re: [PATCH] Fix hanging of popen.test
  2010-07-04 20:33                 ` Neil Jerram
  2010-07-06 21:35                   ` Ludovic Courtès
@ 2010-07-17 11:57                   ` Andy Wingo
  2010-07-17 18:08                     ` Patrick McCarty
  1 sibling, 1 reply; 18+ messages in thread
From: Andy Wingo @ 2010-07-17 11:57 UTC
  To: Neil Jerram; +Cc: guile-devel

Hi,

On Sun 04 Jul 2010 22:33, Neil Jerram <neil@ossau.uklinux.net> writes:

> Andy Wingo <wingo@pobox.com> writes:
>
>> This has been an on-and-off issue:
>> 02fcbf78b27788c03563e5c3d297a4cd469ce562, and
>> 04af4c4c5221c082905d52eb5ad3829ed681d097.
>
> Right.  The commit (4f99a499197b592a9a3060de2205531852f4f94) that
> Patrick McCarty identified (with git bisect) is another very similar
> case.

I think I fixed this one again.

> The general problem is that any #:use-module or (@ ...) that indirectly
> pulls in srfi-1, in any code that is run as part of Guile startup, will
> cause the build to fail when doing the snarf-check-and-output-texi,
> because libguile-srfi-srfi-13-14-v-4 hasn't been built at that point.
>
> But if the developer concerned happens to have a compatible
> libguile-srfi-srfi-13-14-v-4 and libguile installed in /usr/lib or
> /usr/local/lib, the build may pick those up and so mask the problem.
>
> Is there a reason why we don't just move all the SRFI C code into the
> core libguile?  I think that would be a general fix.

I agree. They should probably still be their own shlibs, but built at
the same time as libguile. Then we make snarf-check-and-output-texi
depend on the libs being built, as necessary.

Though I would hope that with time we can remove the need for these
shlibs, instead mostly implementing srfis in scheme, and using the
dynamic FFI as needed...

Andy
-- 
http://wingolog.org/




* Re: [PATCH] Fix hanging of popen.test
  2010-07-17 11:57                   ` Andy Wingo
@ 2010-07-17 18:08                     ` Patrick McCarty
  0 siblings, 0 replies; 18+ messages in thread
From: Patrick McCarty @ 2010-07-17 18:08 UTC
  To: Andy Wingo; +Cc: guile-devel, Neil Jerram

On Sat, Jul 17, 2010 at 4:57 AM, Andy Wingo <wingo@pobox.com> wrote:
> On Sun 04 Jul 2010 22:33, Neil Jerram <neil@ossau.uklinux.net> writes:
>> Andy Wingo <wingo@pobox.com> writes:
>>
>>> This has been an on-and-off issue:
>>> 02fcbf78b27788c03563e5c3d297a4cd469ce562, and
>>> 04af4c4c5221c082905d52eb5ad3829ed681d097.
>>
>> Right.  The commit (4f99a499197b592a9a3060de2205531852f4f94) that
>> Patrick McCarty identified (with git bisect) is another very similar
>> case.
>
> I think I fixed this one again.

Thanks!  Latest git compiles just fine for me.

Regards,
Patrick



