unofficial mirror of emacs-devel@gnu.org 
* Re: broken "fast" build - hung containers
From: Lars Ingebrigtsen @ 2021-09-15  8:09 UTC
  To: emacs-build-automation; +Cc: Ted Zlatanov, emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

> I was cleaning the EMBA machines last night. Lots of hanging containers.
> I think it was due to shm issues, but couldn't tell.
>
> The "fast" EMBA build seems to be hanging. The normal one works. Seems
> to be here:
>
> https://emba.gnu.org/emacs/emacs/-/jobs/27726
>
>   ELC      src/emacs-module-tests.elc
>   GEN      src/emacs-module-tests.log
> ERROR: Job failed: execution took longer than 3h0m0s second
>
> I see two of them hanging right now (the docker containers have not been
> removed). This probably is what started the cascade of failures that
> broke EMBA.
>
> I'll investigate when able, but if anyone has ideas, please let us know.

Hm...  that seems vaguely familiar.  I think I've seen that before once.

What's EMBA doing in the normal build here?  Is it just a "make check"?

(I've added emacs-devel to the CCs.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* emba.el (was: broken "fast" build - hung containers)
From: Michael Albinus @ 2021-09-15  9:57 UTC
  To: emacs-build-automation; +Cc: emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

Hi,

> The "fast" EMBA build seems to be hanging. The normal one works. Seems
> to be here:
>
> https://emba.gnu.org/emacs/emacs/-/jobs/27726

Btw, a while ago I wrote a private package, emba.el. It lets you view
EMBA job output in an Emacs buffer, like this:

--8<---------------cut here---------------start------------->8---
M-x emba-show-jobs
Number of jobs (default 5):

27760 * doc/misc/flymake.texi: Fix @include docstyle.texi
27759 * doc/misc/flymake.texi: Fix @include docstyle.texi
27758 * doc/misc/flymake.texi: Fix @include docstyle.texi
27757 Python shell: rearrange printing of newline before output
27756 Python shell: rearrange printing of newline before output
--8<---------------cut here---------------end--------------->8---

If you click on a line, you will see the job output. The job numbers are
also colorized to show whether a job has succeeded or failed, and each
has a tooltip that gives more information about the job. Inspired by
debbugs-gnu.el :-)

The downside is that it requires glab.el, which is part of the ghub
package on MELPA.

Would it make sense to polish emba.el a little bit and offer it as a
(Non)GNU ELPA package?

> Ted

Best regards, Michael.




* Re: emba.el
From: Ted Zlatanov @ 2021-09-16  8:31 UTC
  To: Michael Albinus; +Cc: emacs-build-automation, emacs-devel

On Wed, 15 Sep 2021 11:57:43 +0200 Michael Albinus <michael.albinus@gmx.de> wrote: 

MA> Btw, a while ago I wrote a private package emba.el. It allows you to see
MA> the EMBA jobs output in an Emacs buffer. Like this

MA> M-x emba-show-jobs
MA> Number of jobs (default 5):
...
MA> The downside is, that it requires glab.el, which is part of the ghub
MA> package on MELPA.

MA> Would it make sense to polish emba.el a little bit, and to offer this as
MA> (Non)GNU ELPA package?

Absolutely! I'd use it, the web UI for reviewing jobs is slow and
inconvenient. Let me know if I can help testing it.

Thank you!
Ted




* Re: broken "fast" build - hung containers
From: Ted Zlatanov @ 2021-09-16 12:44 UTC
  To: Michael Albinus; +Cc: Lars Ingebrigtsen, emacs-build-automation, emacs-devel

On Wed, 15 Sep 2021 11:37:22 +0200 Michael Albinus <michael.albinus@gmx.de> wrote: 

MA> Ted Zlatanov <tzz@lifelogs.com> writes:

>> The "fast" EMBA build seems to be hanging. The normal one works. Seems
>> to be here:
>> 
>> https://emba.gnu.org/emacs/emacs/-/jobs/27726
>> 
>> ELC      src/emacs-module-tests.elc
>> GEN      src/emacs-module-tests.log
>> ERROR: Job failed: execution took longer than 3h0m0s second
>> 
>> I see two of them hanging right now (the docker containers have not been
>> removed). This probably is what started the cascade of failures that
>> broke EMBA.
>> 
>> I'll investigate when able, but if anyone has ideas, please let us know.

MA> I believe we shall comment out the "fast" jobs ATM. A while ago, I've
MA> tried to reorganize the jobs, maybe I have broken some dependencies. And
MA> honestly, I don't believe that the "fast" jobs bring something we don't
MA> see in the "normal" jobs. But that's another discussion ...

The specific jobs are below. The second and third one work.

test-fast-inotify hangs the Docker container (it can't be force-removed
by `docker rm' and when I try to remove the files, after stopping the
Docker daemon, I get a shm lock error). I think we should fix it rather
than disabling it. I have those hanging containers right now - what can
I do to find the trouble point?

test-fast-inotify:
  stage: fast
  extends: [.job-template, .test-template]
  variables:
    target: emacs-inotify
    make_params: "-C test check"

test-lisp-inotify:
  stage: normal
  extends: [.job-template, .test-template]
  variables:
    target: emacs-inotify
    make_params: "-C test check-lisp"

test-lisp-net-inotify:
  stage: normal
  extends: [.job-template, .test-template]
  variables:
    target: emacs-inotify
    make_params: "-C test check-lisp-net"

On Wed, 15 Sep 2021 10:09:10 +0200 Lars Ingebrigtsen <larsi@gnus.org> wrote: 

LI> Hm...  that seems vaguely familiar.  I think I've seen that before once.

LI> What's EMBA doing in the normal build here?  Is it just a "make check"?

Yeah. We don't do much unusual here, and it definitely used to work when
we set EMBA up. The underlying Docker image is the same between all
three jobs.

Thanks!
Ted




* Re: broken "fast" build - hung containers
From: Michael Albinus @ 2021-09-16 16:16 UTC
  To: Lars Ingebrigtsen, emacs-build-automation, emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

Hi Ted,

> The specific jobs are below. The second and third one work.
>
> test-fast-inotify hangs the Docker container (it can't be force-removed
> by `docker rm' and when I try to remove the files, after stopping the
> Docker daemon, I get a shm lock error). I think we should fix it rather
> than disabling it. I have those hanging containers right now - what can
> I do to find the trouble point?
>
> test-fast-inotify:
>   stage: fast
>   extends: [.job-template, .test-template]
>   variables:
>     target: emacs-inotify
>     make_params: "-C test check"
>
> test-lisp-inotify:
>   stage: normal
>   extends: [.job-template, .test-template]
>   variables:
>     target: emacs-inotify
>     make_params: "-C test check-lisp"
>
> test-lisp-net-inotify:
>   stage: normal
>   extends: [.job-template, .test-template]
>   variables:
>     target: emacs-inotify
>     make_params: "-C test check-lisp-net"

The first job belongs to a different stage. I introduced that stage
months ago while trying to reorganize the pipeline flow. Chances are
good that I got it badly wrong.

> Thanks!
> Ted

Best regards, Michael.




* Re: emba.el
From: Michael Albinus @ 2021-09-16 16:26 UTC
  To: emacs-devel; +Cc: Jonas Bernoulli

Ted Zlatanov <tzz@lifelogs.com> writes:

Hi Ted,

> On Wed, 15 Sep 2021 11:57:43 +0200 Michael Albinus <michael.albinus@gmx.de> wrote:
>
> MA> Btw, a while ago I wrote a private package emba.el. It allows you to see
> MA> the EMBA jobs output in an Emacs buffer. Like this
>
> MA> M-x emba-show-jobs
> MA> Number of jobs (default 5):
> ...
> MA> The downside is, that it requires glab.el, which is part of the ghub
> MA> package on MELPA.
>
> MA> Would it make sense to polish emba.el a little bit, and to offer this as
> MA> (Non)GNU ELPA package?
>
> Absolutely! I'd use it, the web UI for reviewing jobs is slow and
> inconvenient. Let me know if I can help testing it.

I could add it, but I don't know whether requiring ghub from MELPA is
acceptable. Perhaps we could convince Jonas Bernoulli to move it to
(Non)GNU ELPA?

If you are interested in private testing, I could send it to you. No, I
have no GitHub account for sharing ...

> Thank you!
> Ted

Best regards, Michael.




* Re: broken "fast" build - hung containers
From: Ted Zlatanov @ 2021-09-18 20:01 UTC
  To: Michael Albinus; +Cc: Lars Ingebrigtsen, emacs-build-automation, emacs-devel

I was able to replicate the bug by going into one of the hung containers
and running `make check`. It hangs in the module tests and ends with a
kernel fault (the kernel is 4.4.0). It seems that
`module--help-function-arglist` is the last test to succeed, so
`module--test-assertions--load-non-live-object`, the next one, may be
the one that fails.

```
# docker exec -it [hung-container-ID] bash
# make -C lib all
make[1]: Entering directory '/checkout/lib'
...
GEN      src/editfns-tests.log
GEN      src/emacs-module-tests.log
2021 Sep 18 06:39:13 emba kernel BUG at /build/linux-Pv5wqf/linux-4.4.0/mm/memory.c:3214!
2021 Sep 18 06:39:13 emba RIP  [<ffffffff811cd1ae>] handle_mm_fault+0x13de/0x1b80

^C

Makefile:181: recipe for target 'src/emacs-module-tests.log' failed
make[3]: *** [src/emacs-module-tests.log] Interrupt
Makefile:335: recipe for target 'check-doit' failed
make[2]: [check-doit] Interrupt (ignored)
Makefile:305: recipe for target 'check' failed
make[1]: *** [check] Interrupt
Makefile:988: recipe for target 'check' failed
make: *** [check] Interrupt
```
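
A minimal sketch for checking whether other hung jobs hit the same
fault: it scans captured console or syslog text for "kernel BUG at"
lines. The regular expression is an assumption modelled only on the
line shown above, and `find_kernel_bugs` is a hypothetical helper, not
part of any EMBA tooling.

```python
import re

# Pattern modelled on the console line above (assumption: other kernel
# versions may format this message differently).
KERNEL_BUG = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")


def find_kernel_bugs(log_text):
    """Return (source file, line number) pairs for each 'kernel BUG at' hit."""
    return [
        (match.group("path"), int(match.group("line")))
        for match in KERNEL_BUG.finditer(log_text)
    ]


# The two console lines quoted in this message, used as sample input.
sample = (
    "2021 Sep 18 06:39:13 emba kernel BUG at "
    "/build/linux-Pv5wqf/linux-4.4.0/mm/memory.c:3214!\n"
    "2021 Sep 18 06:39:13 emba RIP  [<ffffffff811cd1ae>] "
    "handle_mm_fault+0x13de/0x1b80\n"
)

print(find_kernel_bugs(sample))
```

Feeding it the saved output of each hung job would show whether they
all stop at the same place in mm/memory.c.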

So. I removed emacs-module-tests.* and the test kept going until
process-tests.* where it locked up with the same kernel bug.

I removed process-tests.* as well, and all the tests completed.

Could we disable the emacs-module-tests.el and process-tests.el inside
EMBA Docker?

For the EMBA admins: this can be replicated with

docker exec -it $(docker ps -q|shuf|head -1) bash

(which randomly picks one of the dozens of hung containers) and then
running `make check` inside it. I can dig further if someone can give me
steps to follow.

Thanks
Ted




* Re: broken "fast" build - hung containers
From: Michael Albinus @ 2021-09-18 21:13 UTC
  To: Lars Ingebrigtsen, emacs-build-automation, emacs-devel

Ted Zlatanov <tzz@lifelogs.com> writes:

Hi Ted,

> Could we disable the emacs-module-tests.el and process-tests.el inside
> EMBA Docker?

Sure. Our "make check" call should be extended with (untested):

--8<---------------cut here---------------start------------->8---
EXCLUDE_TESTS = %emacs-module-tests.el %process-tests.el
--8<---------------cut here---------------end--------------->8---
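
As a rough illustration of what these patterns would exclude, here is a
Python sketch that mimics GNU make's `filter-out` with `%` patterns. It
assumes EXCLUDE_TESTS is applied that way to the list of test files; the
helper names and the sample file list are hypothetical, and a
backslash-escaped `%` is not handled.

```python
def make_pattern_matches(pattern: str, word: str) -> bool:
    """Return True if WORD matches a GNU make style PATTERN.

    Only the first '%' acts as a wildcard; it matches any string,
    including the empty one. Without '%', the match must be exact.
    """
    if "%" not in pattern:
        return pattern == word
    prefix, suffix = pattern.split("%", 1)
    return (
        len(word) >= len(prefix) + len(suffix)
        and word.startswith(prefix)
        and word.endswith(suffix)
    )


def filter_out(patterns, words):
    """Mimic $(filter-out PATTERNS,WORDS): drop words matching any pattern."""
    return [
        word
        for word in words
        if not any(make_pattern_matches(pattern, word) for pattern in patterns)
    ]


# Hypothetical sample of test files, relative to the test/ directory.
test_files = [
    "src/editfns-tests.el",
    "src/emacs-module-tests.el",
    "src/process-tests.el",
    "lisp/button-tests.el",
]

exclude_tests = "%emacs-module-tests.el %process-tests.el".split()
remaining = filter_out(exclude_tests, test_files)
print(remaining)
```

Note that a leading `%` matches any prefix, so `%process-tests.el` would
also drop any other file whose name merely ends in `process-tests.el`.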

If we do this, we should also file bug reports so that the maintainers
of both test files can analyze the problem.

> Thanks
> Ted

Best regards, Michael.




* Re: broken "fast" build - hung containers
From: Ted Zlatanov @ 2021-09-19  8:25 UTC
  To: Michael Albinus
  Cc: Lars Ingebrigtsen, emacs-build-automation, emacs-devel, Philipp Stephani

On Sat, 18 Sep 2021 23:13:51 +0200 Michael Albinus <michael.albinus@gmx.de> wrote: 

>> Could we disable the emacs-module-tests.el and process-tests.el inside
>> EMBA Docker?

MA> Sure. Our "make check" call shall be extended by (untested)

MA> EXCLUDE_TESTS = %emacs-module-tests.el %process-tests.el

I see you skipped the "fast" stage for now. That made the "normal"
stage, which we run on every commit, pass. But in the full builds, the
"platform images" (https://emba.gnu.org/emacs/emacs/-/jobs/28174) and
"build images" (https://emba.gnu.org/emacs/emacs/-/jobs/28144) stages
are still failing on the master branch.

MA> If we do this, we shall also write bug reports for the maintainers of
MA> both test files for analysis.

I would really appreciate some guidance on increasing the verbosity of
the tests so we can produce a bug report.

There is no maintainer for either of those files. Based on the commit
history it looks like Philipp Stephani is a good contact for the module
tests and maybe could help us look at the process tests, so I am CC-ing
Philipp.

I'd like to reboot the emba and emba-runner machines again to clear all
the hung containers, but I will wait in case they hold information that
is useful for debugging.

Thanks
Ted




* Re: broken "fast" build - hung containers
From: Lars Ingebrigtsen @ 2021-09-19 14:49 UTC
  To: Michael Albinus, emacs-build-automation, emacs-devel, Philipp Stephani

Ted Zlatanov <tzz@lifelogs.com> writes:

> I see you skipped the "fast" stage for now. That made the "normal" stage
> we do on every commit pass, but the full builds' "platform images"
> https://emba.gnu.org/emacs/emacs/-/jobs/28174 and "build images"
> https://emba.gnu.org/emacs/emacs/-/jobs/28144 stages are still failing
> on the master branch.

Looks like several of the jobs passed, but some are failing now with:

lisp/button-tests.log:
   FAILED  button--help-echo-form

Which turns out to be legitimate -- the test fails here, too.

So we're getting closer.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




* Re: broken "fast" build - hung containers
From: Michael Albinus @ 2021-09-20 11:53 UTC
  To: Lars Ingebrigtsen, emacs-build-automation, emacs-devel, Philipp Stephani

Ted Zlatanov <tzz@lifelogs.com> writes:

Hi Ted,

> On Sat, 18 Sep 2021 23:13:51 +0200 Michael Albinus <michael.albinus@gmx.de> wrote:
>
>>> Could we disable the emacs-module-tests.el and process-tests.el inside
>>> EMBA Docker?
>
> MA> Sure. Our "make check" call shall be extended by (untested)
>
> MA> EXCLUDE_TESTS = %emacs-module-tests.el %process-tests.el
>
> I see you skipped the "fast" stage for now. That made the "normal" stage
> we do on every commit pass, but the full builds' "platform images"
> https://emba.gnu.org/emacs/emacs/-/jobs/28174 and "build images"
> https://emba.gnu.org/emacs/emacs/-/jobs/28144 stages are still failing
> on the master branch.

I will test what can be done in my private emba project.

> MA> If we do this, we shall also write bug reports for the maintainers of
> MA> both test files for analysis.
>
> I would really appreciate some guidance on increasing the verbosity of
> the tests so we can produce a bug report.

I believe there's no general rule. Every test package has its own
settings for increasing verbosity.

> There is no maintainer for either of those files. Based on the commit
> history it looks like Philipp Stephani is a good contact for the module
> tests and maybe could help us look at the process tests, so I am CC-ing
> Philipp.

Yep.

> I'd like to reboot the emba and emba-runner machines again to clear all
> the hung containers, but will wait in case there's some useful
> information in them for the debugging.
>
> Thanks
> Ted

Best regards, Michael.



