unofficial mirror of bug-guix@gnu.org 
* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Simon South @ 2020-10-04 20:56 UTC
  To: 43802

Building Knot 3.0.0 using "guix build knot" consistently appears to hang
for me when it gets to this point during the linking stage:

    CCLD     knsec3hash
  ar: `u' modifier ignored since `D' is the default (see `U')
    CCLD     kdig
    CCLD     khost

While it sits here the compiler is tying up 100% of a single CPU
core. On my ROCK64 with 4 GB of RAM, it eventually crashes with an
internal error:

  gcc: internal compiler error: Killed (program cc1)
  Please submit a full bug report,
  with preprocessed source if appropriate.
  See <https://gcc.gnu.org/bugs/> for instructions.
  make[3]: *** [Makefile:5381: libzscanner/la-scanner.lo] Error 1
  make[3]: Leaving directory '/tmp/guix-build-knot-3.0.0.drv-0/knot-3.0.0/src'

dmesg shows the compiler was killed for running out of memory:

  cc1 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
  CPU: 2 PID: 22340 Comm: cc1 Not tainted 5.8.11-gnu #1
  (...)
  oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=cc1,pid=22340,uid=999
  Out of memory: Killed process 22340 (cc1) total-vm:2573780kB, anon-rss:2540708kB, file-rss:0kB, shmem-rss:0kB, UID:999 pgtables:5044kB oom_score_adj:0
  oom_reaper: reaped process 22340 (cc1), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
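
For a sense of scale, the anon-rss figure in the kill line can be
converted to GiB. This is a minimal sketch, assuming a POSIX shell with
sed and awk; the function name oom_rss_gib is made up for illustration
and the log line is shortened to the relevant fields:

```shell
#!/bin/sh
# Extract the anon-rss value (in kB) from oom-killer log lines read on
# standard input and print it in GiB (1 GiB = 1048576 kB).
oom_rss_gib() {
    sed -n 's/.*anon-rss:\([0-9][0-9]*\)kB.*/\1/p' |
        awk '{ printf "%.2f\n", $1 / 1048576 }'
}

# The "Out of memory" line from the dmesg output, shortened.
line='Out of memory: Killed process 22340 (cc1) total-vm:2573780kB, anon-rss:2540708kB, file-rss:0kB'
printf '%s\n' "$line" | oom_rss_gib
```

On this log it prints 2.42, so cc1 alone held about 2.4 GiB of resident
memory on the 4 GB board when it was killed.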

On my x86_64 machine the build eventually completes (that machine has
much more memory), but there is the same, weirdly long delay during
linking while the compiler runs.

I see no such delay however when I build the code "manually", using
"guix environment --pure knot" or even "guix environment --no-grafts
--container knot" as the manual suggests. The build then completes
quickly and successfully on either machine; the problem appears to
happen only when guix-daemon is involved.

Is there a known issue that can cause the linker to consume orders of
magnitude more resources when run by the Guix build process?

Apart from rebuilding gcc with debugging symbols (which seems to make
Guix want to rebuild every other package in the system as well) and
trying to understand what the compiler is doing, how might I go about
diagnosing this?

-- 
Simon South
simon@simonsouth.net





* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Simon South @ 2020-10-04 23:01 UTC
  To: 43802

So naturally, as soon as I submit the bug report something occurs to me
that gets me unstuck.

The delay and crash are occurring while libtool is using gcc to compile
src/libzscanner/scanner.c, which appears to be generated at build time
from the file scanner.c.t0 in the same directory.

When I build Knot on my own, scanner.c has a size of 272 KB. When guix
builds it, scanner.c somehow balloons out to 1.9 MB! So naturally gcc is
going to need some time and space to make its way through all that code.

In fact the build process actually points out

  NOTE: Compilation of scanner.c can take several minutes!

So perhaps all this is completely expected. Still... 1.9 MB. Of C
code. It's tempting to think something is going wrong here. (And anyway,
why the huge discrepancy in file size?)

I'm investigating.

-- 
Simon South
simon@simonsouth.net





* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Simon South @ 2020-10-05  0:09 UTC
  To: 43802

Turns out this is not a bug. Knot ships with two parser implementations:
A smaller, slower one (272 KB) and a larger, faster one (1.9 MB). The
larger one is a bit too big to build reliably on systems with 4 GB or
less of available memory.

To test Knot on these machines, you can run "configure" with
"--disable-fastparser" as an argument (or edit gnu/packages/dns.scm to
do so) to force it to use the smaller parser. This also allows the build
to complete more quickly on systems that can use either.

So how was I getting the smaller implementation in my own builds without
realizing it? The configure script has some magical behaviour: It will
automatically select the faster-building implementation if it finds a
".git" folder in the current directory. This is presumably meant to help
developers, but the confusion it caused me demonstrates why I think this
sort of magical programming is bad practice.
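
The effect of that check can be sketched in shell. This is a
hypothetical illustration of the behaviour described above, not the code
from Knot's configure script; the function name select_parser and its
"fast"/"slow" output strings are invented for the example:

```shell
#!/bin/sh
# Hypothetical sketch, not Knot's actual configure logic.
# Usage: select_parser SRCDIR [FLAG]
# Prints "fast" for the large parser or "slow" for the small one.
select_parser() {
    srcdir=$1
    flag=${2:-}
    # An explicit --disable-fastparser always selects the small parser.
    if [ "$flag" = "--disable-fastparser" ]; then
        echo slow
        return
    fi
    # No flag given: a ".git" directory suggests a developer checkout,
    # so pick the parser that compiles quickly.
    if [ -d "$srcdir/.git" ]; then
        echo slow
    else
        echo fast
    fi
}

# Demonstration in a throwaway directory.
tmp=$(mktemp -d)
select_parser "$tmp"                        # like a release tarball
select_parser "$tmp" --disable-fastparser   # explicit flag
mkdir "$tmp/.git"
select_parser "$tmp"                        # like a git checkout
rm -rf "$tmp"
```

The demonstration prints fast, slow, slow. The point is that the same
command line gives a different build depending on what happens to be in
the directory, which is why passing the flag explicitly is the more
predictable choice.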

At any rate, this bug report can be closed.

-- 
Simon South
simon@simonsouth.net





* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Ludovic Courtès @ 2020-10-05 14:15 UTC
  To: Simon South; +Cc: 43802

Hi,

Simon South <simon@simonsouth.net> writes:

> Building Knot 3.0.0 using "guix build knot" consistently appears to hang
> for me when it gets to this point during the linking stage:
>
>     CCLD     knsec3hash
>   ar: `u' modifier ignored since `D' is the default (see `U')
>     CCLD     kdig
>     CCLD     khost
>
> While it sits here the compiler is tying up 100% of a single CPU
> core. On my ROCK64 with 4 GB of RAM, it eventually crashes with an
> internal error:
>
>   gcc: internal compiler error: Killed (program cc1)
>   Please submit a full bug report,
>   with preprocessed source if appropriate.
>   See <https://gcc.gnu.org/bugs/> for instructions.
>   make[3]: *** [Makefile:5381: libzscanner/la-scanner.lo] Error 1
>   make[3]: Leaving directory '/tmp/guix-build-knot-3.0.0.drv-0/knot-3.0.0/src'
>
> dmesg shows the compiler was killed for running out of memory:
>
>   cc1 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
>   CPU: 2 PID: 22340 Comm: cc1 Not tainted 5.8.11-gnu #1
>   (...)
>   oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=cc1,pid=22340,uid=999
>   Out of memory: Killed process 22340 (cc1) total-vm:2573780kB, anon-rss:2540708kB, file-rss:0kB, shmem-rss:0kB, UID:999 pgtables:5044kB oom_score_adj:0
>   oom_reaper: reaped process 22340 (cc1), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>
> On my x86_64 machine the build eventually completes (that machine has
> much more memory), but there is the same, weirdly long delay during
> linking while the compiler runs.

Is this an LTO build (with ‘-flto’ in the compile and link flags)?  That
could explain the memory requirements.

Ludo’.





* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Tobias Geerinckx-Rice via Bug reports for GNU Guix @ 2020-10-05 15:26 UTC
  To: Simon South, Ludovic Courtès; +Cc: 43802


Simon,

Would it make sense to provide a faster-building slower-starting 
Knot variant alongside the main package?

Ludovic Courtès writes:
> Is this an LTO build (with ‘-flto’ in the compile and link 
> flags)?  That
> could explain the memory requirements.

No, but good guess.

Simon South writes:
> Turns out this is not a bug.

The fast parser is written in Ragel[0], which compiles down to 
almost 2 MiB of ‘C’, which is then thrown at GCC to sort out.  I 
know to put the kettle on before hacking on Knot locally.

What I didn't know was that these generated C files were included 
in the release tarball.  We have the Ragel, we can rebuild them, 
and we now do so in commit 
2b73e50c31a61b5dcef35a1e4b9484d9dbcb0fbc.  Thanks for bringing it 
to my attention.

Kind regards,

T G-R

[0]: http://www.colm.net/open-source/ragel/


* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Simon South @ 2020-10-05 15:44 UTC
  To: Tobias Geerinckx-Rice; +Cc: 43802

Tobias Geerinckx-Rice <me@tobias.gr> writes:
> Would it make sense to provide a faster-building slower-starting Knot
> variant alongside the main package?

I'm inclined to say "no", especially if we assume a substitute will
(nearly always) be available.

Unless someone is hacking on the scanner directly it ought to be safe to
add "--disable-fastparser" to dns.scm temporarily during testing, then
remove it before submitting a patch. If it isn't then probably _that_ is
the bug to be fixed.

> What I didn't know was that these generated C files were included in
> the release tarball.  We have the Ragel, we can rebuild them, and we
> now do so in commit 2b73e50c31a61b5dcef35a1e4b9484d9dbcb0fbc.

Neat!

-- 
Simon South
simon@simonsouth.net





* bug#43802: Knot: Linker runs very slowly and crashes during build
From: Ludovic Courtès @ 2020-10-07 22:06 UTC
  To: Simon South; +Cc: 43802

Simon South <simon@simonsouth.net> writes:

>> What I didn't know was that these generated C files were included in
>> the release tarball.  We have the Ragel, we can rebuild them, and we
>> now do so in commit 2b73e50c31a61b5dcef35a1e4b9484d9dbcb0fbc.
>
> Neat!

+1, yay for bootstrapping!

Ludo’.



