unofficial mirror of guix-devel@gnu.org 
From: Chris Marusich <cmmarusich@gmail.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guix-devel@gnu.org
Subject: Re: Overhauling the cargo-build-system
Date: Sun, 08 Dec 2019 20:45:07 -0800
Message-ID: <87h82ap0oc.fsf_-_@gmail.com>
In-Reply-To: <87imnjtsk3.fsf@gnu.org> ("Ludovic Courtès"'s message of "Sat, 16 Nov 2019 22:33:32 +0100, Sun, 17 Nov 2019 22:22:21 +0100, Sat, 16 Nov 2019 18:35:46 -0800")


Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> What I would have liked is to somehow replace the #:cargo-inputs
> argument (which is build-system-specific and thus “opaque”) with regular
> ‘native-inputs’ or ‘inputs’ field.

That would be nice.  However, it doesn't seem possible to express
Cargo's "dependencies" and "dev-dependencies" concepts using Guix's
current package DSL.

Consider the proc-macro2 and quote crates.  We added these two crates in
commit 2444abd9c124cc55f8f19a0462e06a2094f25a9d, in the same patch
series where we added #:cargo-inputs and #:cargo-development-inputs:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=35318

Here is the Cargo.toml file for proc-macro2:

  https://github.com/alexcrichton/proc-macro2/blob/master/Cargo.toml

  [dev-dependencies]
  quote = { version = "1.0", default_features = false }

And here is the Cargo.toml file for quote:

  https://github.com/dtolnay/quote/blob/master/Cargo.toml

  [dependencies]
  proc-macro2 = { version = "1.0", default-features = false }

Here is a diagram of their dependency relationship:

  +---------------+
  |     quote     | <+
  +---------------+  |
    |                |
    | dependencies   | dev-dependencies
    v                |
  +---------------+  |
  |  proc-macro2  | -+
  +---------------+

To Cargo, this cycle is not a problem, since "dev-dependencies" are
treated differently from "dependencies":

  https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html

  "Dev-dependencies are not used when compiling a package for building,
  but are used for compiling tests, examples, and benchmarks.

  These dependencies are not propagated to other packages which depend
  on this package."

proc-macro2 declares a "dev-dependency" on quote because it uses quote
in its doc tests:

  https://github.com/alexcrichton/proc-macro2/blob/e82e8571460a0a0e00f52f011a74a5e0359acf3e/src/lib.rs#L785

This relationship between proc-macro2 and quote cannot be readily
expressed using the current package DSL in Guix.  If you try to model
"dependencies" and "dev-dependencies" as "inputs" (or "native-inputs",
or some combination of the two), Guix will fail due to the cycle.
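
To make the cycle concrete, here is a minimal sketch in Python (not Guix
code).  The edges come from the Cargo.toml excerpts above, plus
proc-macro2's dependency on unicode-xid; the names DEPENDENCIES,
DEV_DEPENDENCIES, merge, and has_cycle are hypothetical and exist only
for this illustration.  It shows that the graph is acyclic as long as
"dev-dependencies" are kept apart, and becomes cyclic once both kinds of
edge are treated as ordinary inputs.

```python
# Edges come from the Cargo.toml excerpts quoted above; all variable and
# function names here are hypothetical and exist only for illustration.
DEPENDENCIES = {
    "quote": ["proc-macro2"],
    "proc-macro2": ["unicode-xid"],
    "unicode-xid": [],
}
DEV_DEPENDENCIES = {
    "proc-macro2": ["quote"],
}


def merge(*graphs):
    """Combine adjacency maps into one, as if every edge were an input."""
    merged = {}
    for graph in graphs:
        for node, targets in graph.items():
            merged.setdefault(node, []).extend(targets)
            for target in targets:
                merged.setdefault(target, [])
    return merged


def has_cycle(graph):
    """Depth-first search: a cycle exists if we reach a node still on the stack."""
    on_stack, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in on_stack:
            return True
        on_stack.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        on_stack.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


print(has_cycle(DEPENDENCIES))                           # dependencies only
print(has_cycle(merge(DEPENDENCIES, DEV_DEPENDENCIES)))  # both kinds as inputs
```

The first call prints False and the second prints True, which is the
situation any mapping onto "inputs" and "native-inputs" runs into.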

Presumably, proc-macro2 just needs the source of quote (and the source
of proc-macro2's other dependency, unicode-xid).  When Cargo builds
proc-macro2, it will take care of building quote and making it available
during proc-macro2's tests.  Guix "just" needs to provide proc-macro2
with the quote source.  You might think this poses a bootstrapping
problem for Cargo, but apparently it doesn't.  As long as Cargo has the
source for proc-macro2, quote, and unicode-xid, I guess it can build
the proc-macro2 and quote libraries first and proc-macro2's tests last.
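
One plausible reading of why no bootstrapping problem arises, sketched
in Python under stated assumptions: libraries are ordered using
"dependencies" edges only, and a crate's tests are compiled afterwards,
once the libraries named in its "dev-dependencies" exist.  This is not
a description of Cargo's actual scheduler; the plan function and the
step labels are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each crate maps to the crates that must be built before it.
# Hypothetical names; the edges come from the excerpts quoted above.
dependencies = {
    "quote": {"proc-macro2"},
    "proc-macro2": {"unicode-xid"},
    "unicode-xid": set(),
}
dev_dependencies = {
    "proc-macro2": {"quote"},
}


def plan(target):
    """Order library builds by "dependencies" only, then build the target's tests."""
    libraries = list(TopologicalSorter(dependencies).static_order())
    # Every dev-dependency is an ordinary library by the time tests are compiled.
    missing = dev_dependencies.get(target, set()) - set(libraries)
    if missing:
        raise ValueError(f"no source available for: {sorted(missing)}")
    steps = [("build-lib", crate) for crate in libraries]
    steps.append(("build-tests", target))
    return steps


for step in plan("proc-macro2"):
    print(step)
```

In this model the quote library is built after the proc-macro2 library
and before proc-macro2's tests, so the "dev-dependencies" edge never has
to be satisfied at the point where it would close the cycle.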

Unless we missed something in our discussion of patch 35318, there is no
easy way to express the relationship between proc-macro2 and quote
without changing (or mis-using) the existing package DSL.  In the same
way that the package DSL introduced "native-inputs" and "inputs" as
concepts to facilitate cross-compilation, one way to solve this problem
might be to introduce a new concept to the package DSL that would make
it possible for Guix to express this kind of relationship correctly.

However, in the discussion of patch 35318, everyone (myself included)
seemed opposed to changing the package DSL if we didn't have to.  For
example, in response to an earlier version of the patch series in which
we tried to map "dependencies" and "dev-dependencies" onto the "inputs"
and "native-inputs" concepts (which was probably an abuse of the package
DSL, since "native-inputs" is a cross-compilation concept), you said: "I
don't understand yet why you change the role of 'inputs' compared to how
it is in the rest of Guix."  Ultimately, we decided not to modify the
package DSL or the meaning of "inputs".  Instead, we decided to encode
the necessary information about dependencies in the cargo-build-system's
arguments.  That is how we arrived at #:cargo-inputs and
#:cargo-development-inputs.

By introducing #:cargo-inputs and #:cargo-development-inputs as package
arguments to the cargo-build-system, we were able to solve the cyclic
dependency problem in one specific way.  Perhaps there are better ways.
I agree it would be nice if it were integrated into the package DSL.  I
think that changing the package DSL to suit our needs might work, but
I'm not sure how to change it without making it too Cargo-specific.

> I know it’s not that easy with Rust and Cargo, I just never manage to
> fully grasp why :-), but at least that should be our horizon IMO.

Well, you're not alone!  I'm not (yet!) an expert in Rust, but I find
these problems difficult to understand, too.  Cyclic dependencies are
just one issue.  There are other problems, too.

One problem that Efraim has mentioned is that every crate wants the
sources of all of its dependencies to be available at build time.  It's
as if each crate wants the sources of all of its dependency crates (and
all of theirs, transitively) to be propagated into the build
environment, even if not all of them are used.

Another problem, which Efraim has also mentioned, is that it seems hard
to "cache" the result of building a crate.  So every crate wants to
build its transitive closure of dependencies from scratch, every time.
In a traditional GNU/Linux system, I guess cargo must cache the results
of these builds somehow, but I don't know how.  In the case of C
libraries, we can just produce .so files that other builds can receive
as inputs, and all is well...but in Rust, it seems hard to do something
like that.  I have to admit, I don't know a lot about this, but based on
what I've heard, it sounds like we would basically have to re-implement
a lot of what Cargo is already doing, in order to get the behavior we
want.  Maybe that's the right path; I don't know.

Ludovic Courtès <ludo@gnu.org> writes:

>> I suppose one way to work around some of the issues is to make it so
>> that the crates "build" by copying the source to %out/share/guix-vendor
>> or something.
>
> So the core issue is that there’s nothing like shared libraries, is that
> correct?  This, in turn, means that there’s nothing to actually build,
> and thus a crate doesn’t really map to a package in the usual sense of
> the word, right?
>
> In that case, what you suggest (copying the source in the package
> output) sounds like it could work.  It would be an improvement over what
> we have now: the package graph would correspond to the crate graph.

Yes, you could install the source into the "out" output, or to a
separate output such as "src".  You could also define a bunch of
"proc-macro2-src" and "quote-src" packages that only build the source.
These possibilities sound similar to the plan we were originally
considering.  We discussed that plan here:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=35155

In short, we hoped to build a crate's source and install it (the source)
into a specific output.  Then for any given crate we want to build, we
would add other crates to its inputs, and the cargo-build-system would
take care of populating the vendor directory with the transitive closure
of source found in the inputs.  However, we couldn't implement that plan
because Guix's current DSL doesn't work with the cycles introduced by
Cargo's "dependencies" and "dev-dependencies".

Perhaps if we defined all crates as "proc-macro2-src", "quote-src",
etc., we could then define a package that builds an actual artifact
(e.g., ripgrep) under the same name without the "-src" suffix.  So for
ripgrep, we'd have
"ripgrep-src" and also "ripgrep".  The former would just copy the source
into the output, and the latter ("ripgrep") would list the former as an
input in order to actually build the program.  In this model, I guess
every "*-src" package would have no inputs and just one output.  I guess
any package that produces an artifact, for example the "ripgrep"
package, would list a bunch of "*-src" packages as inputs: one for every
crate in the transitive closure of dependencies and dev-dependencies of
the ripgrep crate.  That might solve the problem of cyclic dependencies,
and it might reduce (but not eliminate) the amount of excessive building
performed by cargo-build-system.  However, it would make some package
definitions large, it would introduce duplication of inputs across
packages that need the same "*-src" inputs, and it would create a lot of
"*-src" packages.  On the plus side, tools like "guix graph" would work
as-is; currently, "guix graph" has not been taught to understand
#:cargo-inputs and #:cargo-development-inputs for cargo-build-system
packages.
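
As a rough sketch of what that model implies, the following Python
computes the flat list of "*-src" inputs for one artifact by walking the
transitive closure of both dependencies and dev-dependencies.  The
"my-app" crate is hypothetical (it stands in for something like ripgrep,
whose real dependency list is not reproduced here), as are the names
DEPENDENCIES, DEV_DEPENDENCIES, and src_inputs.  Because the walk
records visited crates, the proc-macro2/quote cycle is harmless.

```python
# "my-app" is a hypothetical artifact-producing crate; the other edges come
# from the proc-macro2 and quote excerpts quoted earlier in this message.
DEPENDENCIES = {
    "my-app": ["quote"],
    "quote": ["proc-macro2"],
    "proc-macro2": ["unicode-xid"],
    "unicode-xid": [],
}
DEV_DEPENDENCIES = {
    "proc-macro2": ["quote"],
}


def src_inputs(crate):
    """Return the "*-src" packages an artifact package would list as inputs.

    Walks dependencies and dev-dependencies transitively.  The `seen` set
    makes the walk terminate even when the crate graph contains a cycle.
    """
    seen = {crate}  # the artifact also needs its own "*-src" package
    pending = [crate]
    while pending:
        current = pending.pop()
        edges = DEPENDENCIES.get(current, []) + DEV_DEPENDENCIES.get(current, [])
        for dep in edges:
            if dep not in seen:
                seen.add(dep)
                pending.append(dep)
    return sorted(name + "-src" for name in seen)


print(src_inputs("my-app"))
```

Every artifact package would carry a list like this one, which is where
the duplication across packages mentioned above comes from.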

Maybe that really is a better way.  If I'm not mistaken, this seems to
be the direction Efraim has begun to take things with commit
86e443c71d4d19e6f80cad9ca15b9c3a301c738c.  Thank you for taking
initiative to try to improve things, Efraim!  I'm in favor of
experimenting with it to see if it works out.  If it proves to be useful
and we want to stick with it, then we should consider removing the logic
from cargo-build-system that implemented the #:cargo-inputs and
#:cargo-development-inputs arguments.

In the end, I just want to be able to use Rust on Guix!

-- 
Chris

