unofficial mirror of guix-devel@gnu.org 
* reproducible builds and debugging information
@ 2015-03-22 17:26 Tomáš Čech
  2015-03-24 21:09 ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Tomáš Čech @ 2015-03-22 17:26 UTC (permalink / raw)
  To: guix-devel

Hello Guix!

I have a question about reproducible builds and generating debugging
information.

As I was tracing curl code, I needed to rebuild the package with
"-ggdb" in CFLAGS and enable debug among outputs.

The latter doesn't change the hash (and the generated code), but the
former does. So I'd like to propose adding "-ggdb" to the CFLAGS applied
across the whole distribution. I don't know if it would cause any
performance penalty, but it would make it possible not only to have
reproducible builds but also to inspect and debug them...

Without it, it can be hard to analyze core dumps and traces.

WDYT?

S_W


* Re: reproducible builds and debugging information
  2015-03-22 17:26 reproducible builds and debugging information Tomáš Čech
@ 2015-03-24 21:09 ` Ludovic Courtès
  2015-03-25  0:33   ` Tomáš Čech
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2015-03-24 21:09 UTC (permalink / raw)
  To: guix-devel

Tomáš Čech <sleep_walker@gnu.org> skribis:

> As I was tracing curl code, I needed to rebuild the package with
> "-ggdb" in CFLAGS and enable debug among outputs.
>
> The latter doesn't change the hash (and the generated code), but the
> former does.

Both approaches change the output hash.  (As soon as a bit changes in
the build process, the output hash changes.)

Adding a “debug” output is nice because we have support to automatically
DTRT (info "(guix) Installing Debugging Files").

> So I'd like to propose to put "-ggdb" to generally applied CFLAGS for
> whole distribution.

Packages that have an autoconf-based build system, and I suppose most
others, are built with -g.  The binaries get stripped by default and
debugging info is lost unless the package has a “debug” output.

Currently a few key packages have that, but most don’t (I think Debian
does something similar, not sure about other distros.)

We could make it opt-out rather than opt-in, but the issue is disk usage
on build machine (including end-user machines.)  See
<http://lists.gnu.org/archive/html/bug-guix/2013-07/msg00015.html>.

Thoughts?

Ludo’.

* Re: reproducible builds and debugging information
  2015-03-24 21:09 ` Ludovic Courtès
@ 2015-03-25  0:33   ` Tomáš Čech
  2015-03-26 21:21     ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Tomáš Čech @ 2015-03-25  0:33 UTC (permalink / raw)
  To: guix-devel

On Tue, Mar 24, 2015 at 10:09:50PM +0100, Ludovic Courtès wrote:
>Tomáš Čech <sleep_walker@gnu.org> skribis:
>
>> As I was tracing curl code, I needed to rebuild the package with
>> "-ggdb" in CFLAGS and enable debug among outputs.
>>
>> The latter doesn't change the hash (and the generated code), but the
>> former does.
>
>Both approaches change the output hash.  (As soon as a bit changes in
>the build process, the output hash changes.)
>
>Adding a “debug” output is nice because we have support to automatically
>DTRT (info "(guix) Installing Debugging Files").

For once I have read that and understood that before sending an email.

Adding a "debug" output is not nice because it changes the hash.

>> So I'd like to propose to put "-ggdb" to generally applied CFLAGS for
>> whole distribution.
>
>Packages that have an autoconf-based build system, and I suppose most
>others, are built with -g.

I can't confirm this statement.

I added a "debug" output to the curl package:

diff --git a/gnu/packages/curl.scm b/gnu/packages/curl.scm
index 821a957..996342a 100644
--- a/gnu/packages/curl.scm
+++ b/gnu/packages/curl.scm
@@ -74,6 +74,7 @@
          ;; verbose.
          (zero? (system* "make" "-C" "tests" "test")))
        %standard-phases)))
+   (outputs '("out" "debug"))
    (synopsis "Command line tool for transferring data with URL syntax")
    (description
     "curl is a command line tool for transferring data with URL syntax,
				 
Rebuilt:
/gnu/store/c2x7r38zkzf60vz02j7az7r847vy2sng-curl-7.40.0
/gnu/store/9gh88mgh68kxl38vyd42yc8i9v8fa449-curl-7.40.0-debug

And GDB complains about the provided files:
Reading symbols from /gnu/store/c2x7r38zkzf60vz02j7az7r847vy2sng-curl-7.40.0/bin/curl...Reading symbols from /gnu/store/9gh88mgh68kxl38vyd42yc8i9v8fa449-curl-7.40.0-debug/lib/debug//gnu/store/c2x7r38zkzf60vz02j7az7r847vy2sng-curl-7.40.0/bin/curl.debug...(no debugging symbols found)...done.
(no debugging symbols found)...done.

We have (hopefully) reproducible builds, you can see for yourself.

After adding `-g' to CFLAGS:
diff --git a/gnu/packages/curl.scm b/gnu/packages/curl.scm
index 821a957..d2bf4d9 100644
--- a/gnu/packages/curl.scm
+++ b/gnu/packages/curl.scm
@@ -60,7 +60,8 @@
        ("pkg-config" ,pkg-config)
        ("python" ,python-2)))
    (arguments
-    `(#:configure-flags '("--with-gnutls" "--with-gssapi")
+    `(#:configure-flags '("--with-gnutls" "--with-gssapi" "CFLAGS=-g")
+      #:make-flags '("CFLAGS=-g")
       ;; Add a phase to patch '/bin/sh' occurances in tests/runtests.pl
       #:phases
       (alist-replace
@@ -74,6 +75,7 @@
          ;; verbose.
          (zero? (system* "make" "-C" "tests" "test")))
        %standard-phases)))
+   (outputs '("out" "debug"))
    (synopsis "Command line tool for transferring data with URL syntax")
    (description
     "curl is a command line tool for transferring data with URL syntax,

it worked as expected:
(gdb) file /gnu/store/lz28mjhnddb91by1mq4bili35fm1dfyk-curl-7.40.0/bin/curl
Reading symbols from /gnu/store/lz28mjhnddb91by1mq4bili35fm1dfyk-curl-7.40.0/bin/curl...Reading symbols from /gnu/store/ml1dlaqrc8kkrka6xbp7y85iwcjd3p6k-curl-7.40.0-debug/lib/debug///gnu/store/lz28mjhnddb91by1mq4bili35fm1dfyk-curl-7.40.0/bin/curl.debug...done.
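
The second path GDB prints above follows a fixed layout: the "debug"
output, then lib/debug, then the absolute file name of the binary with
".debug" appended.  A minimal Python sketch of that composition, assuming
the layout is exactly what the GDB output shows.  The function name
separate_debug_path is made up for illustration, and GDB's real lookup
also consults the .gnu_debuglink section and build IDs, which this
ignores:

```python
from pathlib import PurePosixPath


def separate_debug_path(debug_output: str, binary: str) -> str:
    """Return where the separate debug file for `binary` is expected to live.

    Hypothetical helper: it only mirrors the layout visible in the GDB
    output above, <debug output>/lib/debug/<absolute binary path>.debug.
    """
    binary_path = PurePosixPath(binary)
    if not binary_path.is_absolute():
        raise ValueError("binary must be an absolute file name")
    debug_root = PurePosixPath(debug_output) / "lib" / "debug"
    # Re-root the absolute binary path under the debug directory.
    return str(debug_root / binary_path.relative_to("/")) + ".debug"


if __name__ == "__main__":
    print(separate_debug_path(
        "/gnu/store/ml1dlaqrc8kkrka6xbp7y85iwcjd3p6k-curl-7.40.0-debug",
        "/gnu/store/lz28mjhnddb91by1mq4bili35fm1dfyk-curl-7.40.0/bin/curl",
    ))
```

The doubled slashes in GDB's message are harmless; the sketch simply
normalizes them away.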


>The binaries get stripped by default and
>debugging info is lost unless the package has a “debug” output.

OK, the difference between -g and -ggdb is slight, but there is a problem
with the "debug" output.

When a package always has a "debug" output, there is no problem.

When a package doesn't have a "debug" output and I need one, merely adding
a "debug" output to the package recipe changes the hash, so I get a
different package.

A better solution, from my POV, would be to keep the decision whether to
keep or drop the debug output outside the package recipe.

Then `guix package -i curl:debug' could mean a request to keep the
generated debug information on disk.

>Currently a few key packages have that, but most don’t (I think Debian
>does something similar, not sure about other distros.)

On openSUSE you have subpackages available that provide the stripped
debug information, and subpackages that provide the source code from the
moment of the build (so the DWARF information in the debug part can match
the source).

On Gentoo it's up to you. It's not in the default configuration; it's
mentioned as recommended for debugging and QA purposes, but that is
not exactly a good example, as every distribution is different so sharing
core dumps doesn't make sense.

Gentoo gathers all the stripped files and, IIRC, it was possible to keep
them compressed on disk and GDB could still use them...  (Very nice, but
yeah, it won't help hydra with its disk space problem.)

>We could make it opt-out rather than opt-in, but the issue is disk usage
>on build machine (including end-user machines.)  See
><http://lists.gnu.org/archive/html/bug-guix/2013-07/msg00015.html>.
>
>Thoughts?

If we have a distribution of reproducible packages, we can keep it
opt-in and generate debug information next time (by not dropping
it).

The only problematic packages will be the big ones like WebKit,
LibreOffice and similar, because generating debug information increases
memory usage during the build significantly and may not be suitable for
an average personal computer.


S_W


* Re: reproducible builds and debugging information
  2015-03-25  0:33   ` Tomáš Čech
@ 2015-03-26 21:21     ` Ludovic Courtès
  2015-03-26 21:51       ` Tomáš Čech
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2015-03-26 21:21 UTC (permalink / raw)
  To: guix-devel

Tomáš Čech <sleep_walker@gnu.org> skribis:

> On Tue, Mar 24, 2015 at 10:09:50PM +0100, Ludovic Courtès wrote:

[...]

>>Packages that have an autoconf-based build system, and I suppose most
>>others, are built with -g.
>
> I can't confirm this statement.
>
> I added "debug" output to curl package:

Indeed, I just checked and cURL overrides the default behavior.  Here it
has to be configured with --enable-debug, which also enables the test
suite (!).  Do you want to try that, and add the “debug” output?

>>The binaries get stripped by default and
>>debugging info is lost unless the package has a “debug” output.
>
> OK, the difference between -g and -ggdb is slight, but there is a problem
> with the "debug" output.
>
> When a package always has a "debug" output, there is no problem.
>
> When a package doesn't have a "debug" output and I need one, merely adding
> a "debug" output to the package recipe changes the hash, so I get a
> different package.

Right.

>>Currently a few key packages have that, but most don’t (I think Debian
>>does something similar, not sure about other distros.)
>
> On openSUSE you have subpackages available that provide the stripped
> debug information, and subpackages that provide the source code from the
> moment of the build (so the DWARF information in the debug part can match
> the source).

You mean there’s a ‘-debug’ package for every single package?

>>We could make it opt-out rather than opt-in, but the issue is disk usage
>>on build machine (including end-user machines.)  See
>><http://lists.gnu.org/archive/html/bug-guix/2013-07/msg00015.html>.
>>
>>Thoughts?
>
> If we have distribution of reproducible packages, we can keep it
> opt-in and generate debug information next time (by not dropping
> it).

That’s not how it works; generating the debug info requires redoing the
whole build process, but with a slight difference.

> The only problematic packages will be the big ones like Webkit, Libre
> Office and similar because generating debug increases memory usage
> during the build significantly and may not be suitable for average
> personal computer.

A problem for C++ code in general.

So again, we could make “debug” opt-out by default, but that’ll be some
work because of issues like this.

What do people think?

Thanks,
Ludo’.

* Re: reproducible builds and debugging information
  2015-03-26 21:21     ` Ludovic Courtès
@ 2015-03-26 21:51       ` Tomáš Čech
  2015-03-27 21:24         ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Tomáš Čech @ 2015-03-26 21:51 UTC (permalink / raw)
  To: guix-devel

On Thu, Mar 26, 2015 at 10:21:35PM +0100, Ludovic Courtès wrote:
>Tomáš Čech <sleep_walker@gnu.org> skribis:
>
>> On Tue, Mar 24, 2015 at 10:09:50PM +0100, Ludovic Courtès wrote:
>
>[...]
>
>>>Packages that have an autoconf-based build system, and I suppose most
>>>others, are built with -g.
>>
>> I can't confirm this statement.
>>
>> I added "debug" output to curl package:
>
>Indeed, I just checked and cURL overrides the default behavior.  Here it
>has to be configured with --enable-debug, which also enables the test
>suite (!).  Do you want to try that, and add the “debug” output?
>
>>>The binaries get stripped by default and
>>>debugging info is lost unless the package has a “debug” output.
>>
>> OK, the difference between -g and -ggdb is slight, but there is a problem
>> with the "debug" output.
>>
>> When a package always has a "debug" output, there is no problem.
>>
>> When a package doesn't have a "debug" output and I need one, merely adding
>> a "debug" output to the package recipe changes the hash, so I get a
>> different package.
>
>Right.
>
>>>Currently a few key packages have that, but most don’t (I think Debian
>>>does something similar, not sure about other distros.)
>>
>> On openSUSE you have subpackages available that provide the stripped
>> debug information, and subpackages that provide the source code from the
>> moment of the build (so the DWARF information in the debug part can match
>> the source).
>
>You mean there’s a ‘-debug’ package for every single package?

For every single binary package, yes. You can suppress it too. Why is it
so surprising?

>>>We could make it opt-out rather than opt-in, but the issue is disk usage
>>>on build machine (including end-user machines.)  See
>>><http://lists.gnu.org/archive/html/bug-guix/2013-07/msg00015.html>.
>>>
>>>Thoughts?
>>
>> If we have distribution of reproducible packages, we can keep it
>> opt-in and generate debug information next time (by not dropping
>> it).
>
>That’s not how it works; generating the debug info requires redoing the
>whole build process, but with a slight difference.

I know, I was ambiguous again. We both meant the same.

I would like to move the decision whether to keep or to drop debug
information outside of the build itself to keep the hash the same.

Imagine a situation where you added a "debug" output to every package and
after each build the newly generated store item with debug information is
deleted (carefully, so as not to corrupt the database, of course). Your
hash will still be the same.

Then someone reports a bug and delivers a core dump from some crash.
You need debug info for the analysis, so you prevent the automatic
deletion of the store item with debug information.


S_W




* Re: reproducible builds and debugging information
  2015-03-26 21:51       ` Tomáš Čech
@ 2015-03-27 21:24         ` Ludovic Courtès
  2015-03-27 21:55           ` Tomáš Čech
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2015-03-27 21:24 UTC (permalink / raw)
  To: guix-devel

Tomáš Čech <sleep_walker@gnu.org> skribis:

> On Thu, Mar 26, 2015 at 10:21:35PM +0100, Ludovic Courtès wrote:

[...]

>>> On openSUSE you have subpackages available that provide the stripped
>>> debug information, and subpackages that provide the source code from the
>>> moment of the build (so the DWARF information in the debug part can match
>>> the source).
>>
>>You mean there’s a ‘-debug’ package for every single package?
>
> For every single binary package, yes. You can suppress it too. Why is it
> so surprising?

It’s just that I didn’t know, and my recollection is that Debian doesn’t
have -dbg packages for every package.

> I would like to move the decision whether to keep or to drop debug
> information outside of the build itself to keep the hash the same.
>
> Imagine situation where you added "debug" output to every package and
> after each build the newly generated store with debug information is
> deleted (carefully, not to corrupt database, of course). Your hash
> still will be the same.

I see what you mean, but again, that’s not how it works, and I would
argue that it’s not desirable.

To move forward, a possible action would be to try to have ‘outputs’
default to '("out" "debug") and see (1) how much breaks, and (2) how
much space.

Would you like to give it a try?

Thanks,
Ludo’.

* Re: reproducible builds and debugging information
  2015-03-27 21:24         ` Ludovic Courtès
@ 2015-03-27 21:55           ` Tomáš Čech
  2015-03-28 17:41             ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Tomáš Čech @ 2015-03-27 21:55 UTC (permalink / raw)
  To: guix-devel

On Fri, Mar 27, 2015 at 10:24:22PM +0100, Ludovic Courtès wrote:
>Tomáš Čech <sleep_walker@gnu.org> skribis:
>
>> On Thu, Mar 26, 2015 at 10:21:35PM +0100, Ludovic Courtès wrote:
>
>[...]
>
>>>> On openSUSE you have subpackages available that provide the stripped
>>>> debug information, and subpackages that provide the source code from the
>>>> moment of the build (so the DWARF information in the debug part can match
>>>> the source).
>>>
>>>You mean there’s a ‘-debug’ package for every single package?
>>
>> For every single binary package, yes. You can suppress it too. Why is it
>> so surprising?
>
>It’s just that I didn’t know, and my recollection is that Debian doesn’t
>have -dbg packages for every package.
>
>> I would like to move the decision whether to keep or to drop debug
>> information outside of the build itself to keep the hash the same.
>>
>> Imagine situation where you added "debug" output to every package and
>> after each build the newly generated store with debug information is
>> deleted (carefully, not to corrupt database, of course). Your hash
>> still will be the same.
>
>I see what you mean, but again, that’s not how it works, and I would
>argue that it’s not desirable.

Yes, I know that it works differently now - that was the reason I
initiated this thread. If you considered this option and refused it,
I'm fine with that. Different distributions set different goals. I'd
like to hear the arguments against the general idea sometime, but let's
not waste more time on this topic.

>To move forward, a possible action would be to try to have ‘outputs’
>default to '("out" "debug") and see (1) how much breaks, and (2) how
>much space.
>
>Would you like to give it a try?

Good idea, yes. I'll do that.

Thanks,

S_W


* Re: reproducible builds and debugging information
  2015-03-27 21:55           ` Tomáš Čech
@ 2015-03-28 17:41             ` Ludovic Courtès
  2015-03-29 17:24               ` Mark H Weaver
  0 siblings, 1 reply; 10+ messages in thread
From: Ludovic Courtès @ 2015-03-28 17:41 UTC (permalink / raw)
  To: guix-devel

Tomáš Čech <sleep_walker@gnu.org> skribis:

> On Fri, Mar 27, 2015 at 10:24:22PM +0100, Ludovic Courtès wrote:
>>Tomáš Čech <sleep_walker@gnu.org> skribis:
>>
>>> On Thu, Mar 26, 2015 at 10:21:35PM +0100, Ludovic Courtès wrote:
>>
>>[...]
>>
>>>>> On openSUSE you have subpackages available that provide the stripped
>>>>> debug information, and subpackages that provide the source code from the
>>>>> moment of the build (so the DWARF information in the debug part can match
>>>>> the source).
>>>>
>>>>You mean there’s a ‘-debug’ package for every single package?
>>>
>>> For every single binary package, yes. You can suppress it too. Why is it
>>> so surprising?
>>
>>It’s just that I didn’t know, and my recollection is that Debian doesn’t
>>have -dbg packages for every package.
>>
>>> I would like to move the decision whether to keep or to drop debug
>>> information outside of the build itself to keep the hash the same.
>>>
>>> Imagine situation where you added "debug" output to every package and
>>> after each build the newly generated store with debug information is
>>> deleted (carefully, not to corrupt database, of course). Your hash
>>> still will be the same.
>>
>>I see what you mean, but again, that’s not how it works, and I would
>>argue that it’s not desirable.
>
> Yes, I know that it works differently now - that was the reason I
> initiated this thread. If you considered this option and refused it,
> I'm fine with that. Different distributions set different goals. I'd
> like to hear the arguments against the general idea sometime, but let's
> not waste more time on this topic.

To be clear: I’m not rejecting the idea of having debugging symbols for
everything.  What I’m saying here is that there are deep design choices
that make it impossible to just “make a debug package and keep the hash
unchanged.”

A key design idea of Nix and Guix is that the store file name contains a
hash of all the inputs of the build process that led to this store item
(info "(guix) Introduction").  “All the inputs” really means everything,
including build scripts and command-line options.  Obviously the
process that keeps debugging symbols does different things from the one
that discards them, so it has a different hash.
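
The idea can be illustrated with a toy model in Python.  This is not the
algorithm Guix or Nix actually use (the real one hashes the full
derivation and encodes the result differently); the function
toy_store_name and its parameters are invented for this sketch.  It only
shows that changing any recorded input, whether a flag or the list of
outputs, yields a different name:

```python
import hashlib
import json


def toy_store_name(name, version, configure_flags, outputs, output="out"):
    """Toy input-addressed name: hash all build inputs, then the output name."""
    if output not in outputs:
        raise ValueError(f"{output!r} is not a declared output")
    inputs = {
        "name": name,
        "version": version,
        "configure_flags": list(configure_flags),
        "outputs": list(outputs),
    }
    # Every recorded input contributes to the digest of the build.
    build_digest = hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode("utf-8")
    ).hexdigest()
    # Each output gets its own name derived from the build digest.
    digest = hashlib.sha256(
        f"{build_digest}:{output}".encode("utf-8")
    ).hexdigest()[:32]
    suffix = "" if output == "out" else f"-{output}"
    return f"/gnu/store/{digest}-{name}-{version}{suffix}"


flags = ["--with-gnutls", "--with-gssapi"]
plain = toy_store_name("curl", "7.40.0", flags, ["out"])
with_debug_output = toy_store_name("curl", "7.40.0", flags, ["out", "debug"])
with_g = toy_store_name("curl", "7.40.0", flags + ["CFLAGS=-g"], ["out"])

# Same inputs give the same name; any change gives a new one.
print(plain == toy_store_name("curl", "7.40.0", flags, ["out"]))
print(len({plain, with_debug_output, with_g}))
```

In this model, merely declaring a "debug" output changes the name of
"out" as well, which is the behaviour discussed earlier in the thread.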

This is really at the core of the design and I think it’s a strength.  I
don’t think this specific use case would justify changing it, esp. since
we can achieve basically the same goal differently.

I hope this clarifies the discussion.

>>To move forward, a possible action would be to try to have ‘outputs’
>>default to '("out" "debug") and see (1) how much breaks, and (2) how
>>much space.
>>
>>Would you like to give it a try?
>
> Good idea, yes. I'll do that.

Cool, thank you!

Ludo’.

* Re: reproducible builds and debugging information
  2015-03-28 17:41             ` Ludovic Courtès
@ 2015-03-29 17:24               ` Mark H Weaver
  2015-03-30 19:42                 ` Ludovic Courtès
  0 siblings, 1 reply; 10+ messages in thread
From: Mark H Weaver @ 2015-03-29 17:24 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guix-devel

ludo@gnu.org (Ludovic Courtès) writes:

> Tomáš Čech <sleep_walker@gnu.org> skribis:
>
>> On Fri, Mar 27, 2015 at 10:24:22PM +0100, Ludovic Courtès wrote:
>>>Tomáš Čech <sleep_walker@gnu.org> skribis:
>>>
>>>> Imagine situation where you added "debug" output to every package and
>>>> after each build the newly generated store with debug information is
>>>> deleted (carefully, not to corrupt database, of course). Your hash
>>>> still will be the same.
>>>
>>>I see what you mean, but again, that’s not how it works, and I would
>>>argue that it’s not desirable.

I think what Tomáš suggested above does not conflict with the design of
Nix and Guix.  As I understand it, he's suggesting that we have
'outputs' default to '("out" "debug") but then the debug outputs would
be immediately discarded for most packages.  This would save both disk
space and slave->hydra bandwidth.  Users could then get the debug
outputs by building the package locally.

I'm not necessarily advocating that we should do this, but wanted to
help facilitate communication.

       Mark

* Re: reproducible builds and debugging information
  2015-03-29 17:24               ` Mark H Weaver
@ 2015-03-30 19:42                 ` Ludovic Courtès
  0 siblings, 0 replies; 10+ messages in thread
From: Ludovic Courtès @ 2015-03-30 19:42 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

Mark H Weaver <mhw@netris.org> skribis:

> I think what Tomáš suggested above does not conflict with the design of
> Nix and Guix.  As I understand it, he's suggesting that we have
> 'outputs' default to '("out" "debug") but then the debug outputs would
> be immediately discarded for most packages.  This would save both disk
> space and slave->hydra bandwidth.  Users could then get the debug
> outputs by building the package locally.

OK, I see.  The thing is that “immediately discarded” really means
“after the build completed and they’ve been put in store.”  So the build
machine would still need to be able to cope with the additional storage
requirements.
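
A toy Python model of that point, with invented names (toy_output_names,
toy_build) and no relation to the daemon's real code: the names of the
outputs depend only on what the package declares, so discarding "debug"
afterwards leaves them unchanged, but the build still has to produce
every declared output before anything can be dropped:

```python
import hashlib


def toy_output_names(recipe, outputs):
    """Toy naming: depends only on the recipe text and the declared outputs."""
    build_digest = hashlib.sha256(
        (recipe + "|" + ",".join(outputs)).encode("utf-8")
    ).hexdigest()
    return {
        out: hashlib.sha256(
            f"{build_digest}:{out}".encode("utf-8")
        ).hexdigest()[:32]
        for out in outputs
    }


def toy_build(recipe, outputs, keep):
    """Return (outputs produced by the build, outputs retained afterwards)."""
    names = toy_output_names(recipe, outputs)
    # The build always writes every declared output first ...
    produced = dict(names)
    # ... and only then can a retention policy drop some of them.
    retained = {out: name for out, name in produced.items() if out in keep}
    return produced, retained


outputs = ["out", "debug"]
produced_a, retained_a = toy_build("curl-7.40.0", outputs, keep={"out"})
produced_b, retained_b = toy_build("curl-7.40.0", outputs, keep={"out", "debug"})

# The names do not depend on the retention policy.
print(produced_a == produced_b)
print(sorted(retained_a), sorted(retained_b))
```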

Also, currently I don’t see how we could avoid transferring the “debug”
output back to the master: when the daemon offloads a derivation build,
it really expects to be able to get all the outputs back.  We could hack
the daemon to special-case “debug” outputs but that doesn’t seem great.

Dunno, maybe I’m still too blinded by what’s possible now to think about
what could be made possible.  :-)

Thanks,
Ludo’.
